A COMPARATIVE STUDY OF DISTANCE METRICS AND NEIGHBOR SELECTION IN K-NEAREST NEIGHBOR FOR VOCATIONAL STUDENT PERFORMANCE CLASSIFICATION

Authors

  • Muhammad Aris Ganiardi, Politeknik Negeri Sriwijaya
  • Ida Wahyuningrum, Politeknik Negeri Sriwijaya
  • Nita Novita, Politeknik Negeri Sriwijaya
  • Denny Alfian, Politeknik Negeri Sriwijaya
(*) Corresponding Author

DOI:

https://doi.org/10.34288/jri.v8i3.520

Keywords:

KNN, Distance Metrics, Student Performance, Vocational Education, Classification

Abstract

This study evaluates the parameter sensitivity of the K-Nearest Neighbor (KNN) algorithm, specifically the choice of distance metric and k-value, when classifying academic performance in vocational education data that is heterogeneous and class-imbalanced. The dataset covers 750 first-year students in the Informatics Management program and includes academic attributes (GPA, attendance, and core course grades) and demographic attributes (age, gender, educational background, and economic status). Preprocessing consists of data cleaning, one-hot encoding, Z-score normalization, and class rebalancing with SMOTE. Models are evaluated with k-fold cross-validation using accuracy, precision, recall, and macro-averaged F1-score. The results show that KNN performance depends strongly on the combination of distance metric and k-value. All distance metrics reach accuracy above 84%, but they differ in how well they handle class imbalance. The Chebyshev metric (k = 10) gives the best balance, with an F1-score of 0.6468, while the Minkowski metric (p = 3) achieves the highest recall, 0.7334. The Euclidean metric attains the highest accuracy, 0.8504 (k = 11), but tends to be biased toward the majority class. These findings indicate that KNN parameter selection should not rely on accuracy alone but should also consider balanced performance across classes. The study provides a practical evaluation framework for selecting KNN parameters to support more robust and fair academic prediction systems for vocational education data.
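To make the abstract's comparison concrete, the following is a minimal pure-Python sketch of the distance metrics it names (Euclidean, Chebyshev, Minkowski with p = 3), a KNN majority vote, and the macro-averaged F1-score. It is not the authors' implementation: the function names and the toy values are assumptions chosen for illustration, and the SMOTE and k-fold steps are omitted.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def chebyshev(a, b):
    """Chebyshev distance: the largest single-feature difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, dist):
    """Majority vote among the k training points closest to `query`.

    Ties are resolved in favor of the label seen first among the neighbors.
    """
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so each class counts equally."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: 9 majority-class students, 1 minority-class student,
# and a classifier that always predicts the majority class.
y_true = ["pass"] * 9 + ["fail"]
y_pred = ["pass"] * 10
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

On the toy labels, the always-majority predictor reaches an accuracy of 0.9 but a macro F1 of about 0.47, which illustrates why the abstract argues against selecting KNN parameters on accuracy alone.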

Published

2026-06-16

How to Cite

Ganiardi, M. A., Wahyuningrum, I., Novita, N., & Alfian, D. (2026). A COMPARATIVE STUDY OF DISTANCE METRICS AND NEIGHBOR SELECTION IN K-NEAREST NEIGHBOR FOR VOCATIONAL STUDENT PERFORMANCE CLASSIFICATION. Jurnal Riset Informatika, 8(3), 358–368. https://doi.org/10.34288/jri.v8i3.520