ENHANCING SLEEP QUALITY PREDICTION THROUGH SMOTE-BASED DATA BALANCING AND HYBRID MACHINE LEARNING MODELS

Authors

  • Ami Rahmawati, Universitas Nusa Mandiri
  • Ita Yulianti, Universitas Bina Sarana Informatika
  • Ani Oktarini Sari, Universitas Nusa Mandiri
  • Siti Nurajizah, Universitas Bina Sarana Informatika
  • Hikmatulloh, Universitas Nusa Mandiri
(*) Corresponding Author

DOI:

https://doi.org/10.34288/jri.v8i1.456

Keywords:

Prediction, RFE, Sleep Quality, SMOTE, XGBoost

Abstract

Sleep is vital to maintaining physical and psychological balance. Poor sleep quality can impair physical and cognitive performance and increase the risk of various health problems. This study develops a predictive model of sleep quality from factors such as lifestyle, stress, daily activity, and caffeine consumption, using XGBoost combined with Recursive Feature Elimination (RFE). XGBoost was chosen for its ability to handle imbalanced datasets and heterogeneous features, while RFE simplifies the model without discarding important information. A class imbalance was found during data pre-processing, so the Synthetic Minority Over-sampling Technique (SMOTE) was applied to balance the proportion of the minority class. The dataset was split into 80% training data and 20% testing data, and the model was validated with cross-validation to assess generalization. The model achieved very high performance, with an accuracy of 99.79% on the training data, 99.63% under cross-validation, and 99.10% on the testing data. The model was then deployed as a web application for practical sleep quality prediction. The study's contributions are methodological, namely a SMOTE-based hybrid machine learning model, and practical, namely a ready-to-use application; it also opens opportunities for further testing on more diverse datasets and for evaluating biases introduced by synthetic data.
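
The balancing step named in the abstract can be illustrated with a minimal, dependency-free sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. This is not the authors' implementation; in practice a library such as imbalanced-learn would normally be used. The function names (smote_oversample, balance), the two-class setting, the two hypothetical features (sleep hours, stress level), and the toy values are all assumptions made for illustration only.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until it matches the other class in size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = max(0, (len(y) - len(minority)) - len(minority))
    new_points = smote_oversample(minority, n_new, k=k, seed=seed) if n_new else []
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (made up): columns are hypothetical [sleep_hours, stress_level];
# label 1 = good sleep quality (majority), label 0 = poor (minority).
X = [[7.5, 3.0], [8.0, 2.0], [7.0, 4.0], [6.5, 5.0], [8.5, 1.0], [7.2, 3.5],
     [4.0, 8.0], [4.5, 9.0], [5.0, 7.0]]
y = [1, 1, 1, 1, 1, 1, 0, 0, 0]

X_bal, y_bal = balance(X, y, minority_label=0, k=2)
print(y_bal.count(0), y_bal.count(1))
```

Because every synthetic point is an interpolation of two real minority samples, it always lies inside the region those samples already span. A common design choice, not stated in the abstract, is to oversample only the training portion of an 80/20 split so that synthetic points do not leak into the testing data.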

Published

2025-12-15

How to Cite

Rahmawati, A., Yulianti, I., Oktarini Sari, A., Nurajizah, S., & Hikmatulloh. (2025). ENHANCING SLEEP QUALITY PREDICTION THROUGH SMOTE-BASED DATA BALANCING AND HYBRID MACHINE LEARNING MODELS. Jurnal Riset Informatika, 8(1), 139–148. https://doi.org/10.34288/jri.v8i1.456