A COMPARATIVE STUDY OF DISTANCE METRICS AND NEIGHBOR SELECTION IN K-NEAREST NEIGHBOR FOR VOCATIONAL STUDENT PERFORMANCE CLASSIFICATION
DOI:
https://doi.org/10.34288/jri.v8i3.520
Keywords:
KNN, Distance Metrics, Student Performance, Vocational Education, Classification
Abstract
This study aims to evaluate parameter sensitivity in the K-Nearest Neighbor (KNN) algorithm, particularly the selection of distance metrics and k-values, for classifying academic performance in vocational education with heterogeneous and imbalanced data characteristics. The dataset consists of 750 first-year students from the Informatics Management program, including academic attributes (GPA, attendance, and core course grades) and demographic attributes (age, gender, educational background, and economic status). Data preprocessing involves data cleaning, one-hot encoding, Z-score normalization, and handling class imbalance using SMOTE. Model evaluation is conducted using K-Fold Cross Validation with accuracy, precision, recall, and macro-average F1-score as performance metrics. The results show that KNN performance is highly influenced by the combination of distance metrics and k-values. All metrics achieve accuracy above 84%, but differ in handling class imbalance. The Chebyshev metric (k = 10) provides the best balance with an F1-score of 0.6468, while the Minkowski metric (p = 3) achieves the highest recall of 0.7334. The Euclidean metric attains the highest accuracy of 0.8504 (k = 11), but tends to be biased toward the majority class. These findings indicate that optimizing KNN parameters should not rely solely on accuracy, but also consider balanced performance across classes. This study provides a practical evaluation framework for selecting KNN parameters to support more robust and fair academic prediction systems in vocational education data.
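The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`minkowski`, `euclidean`, `chebyshev`, `knn_predict`, `macro_f1`) and the toy data are hypothetical, and the preprocessing steps (one-hot encoding, Z-score normalization, SMOTE) and K-Fold Cross Validation are left out for brevity. It shows only how the distance metric and k enter a KNN prediction, and why macro-average F1 can be much lower than accuracy when the minority class is missed.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p (p = 3 is one setting named in the abstract)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a, b):
    """Euclidean distance, i.e. Minkowski with p = 2."""
    return minkowski(a, b, 2)


def chebyshev(a, b):
    """Chebyshev distance: the largest absolute difference over all features."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, dist):
    """Majority vote among the k training points closest to the query.

    On a tied vote, the label seen first among the sorted neighbors wins.
    """
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so each class counts equally."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, already-scaled toy features (not data from the study).
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9)]
train_y = ["pass", "pass", "pass", "at_risk", "at_risk"]
query = (0.9, 1.0)

metrics = {
    "euclidean": euclidean,
    "chebyshev": chebyshev,
    "minkowski_p3": lambda a, b: minkowski(a, b, 3),
}
for name, dist in metrics.items():
    print(name, knn_predict(train_X, train_y, query, k=3, dist=dist))
```

In a sketch like this, predicting only the majority class on three samples with true labels `a, a, b` gives two of three correct, yet `macro_f1` drops to 0.4 because the minority class scores zero. This is the kind of gap between accuracy and balanced performance that the abstract reports.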
Copyright (c) 2026 Muhammad Aris Ganiardi, Ida Wahyuningrum, Nita Novita, Denny Alfian

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Jurnal Riset Informatika governs access to its digital articles under a Creative Commons Attribution-NonCommercial 4.0 International License. Articles published in Jurnal Riset Informatika are Open Access, for the purpose of scientific development, research, and libraries.










