Enhancing Obesity Risk Classification: Tackling Data Imbalance with SMOTE and Deep Learning

Authors

  • Muhammad Syofian, Universitas Nusa Mandiri
  • Ilham Maulana
(*) Corresponding Author

DOI:

https://doi.org/10.34288/jri.v6i4.349

Keywords:

SMOTE, classification, data imbalance, confusion matrix, machine learning

Abstract

Data imbalance is a significant challenge in classification, often degrading model performance, particularly on minority classes. This study examines how effectively the Synthetic Minority Over-sampling Technique (SMOTE) improves classification performance by balancing the class distribution. Predictions for each class were evaluated using a confusion matrix. The results indicate that SMOTE improves minority class representation and yields more balanced predictions, although some misclassifications remain. Oversampling alone is therefore not sufficient, and complementary approaches such as class weighting or ensemble learning are needed to further improve model accuracy. The study offers insight into the role of SMOTE in addressing data imbalance and its impact on classification model performance.
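The two techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes numeric features only and uses the standard library rather than a dedicated package. The function names (smote, confusion_matrix, per_class_recall) and the toy data are hypothetical, chosen for illustration. SMOTE is shown in its basic form: each synthetic sample lies on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Assumes at least two minority samples with numeric features.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def confusion_matrix(y_true, y_pred, labels):
    """Count predictions; rows are true classes, columns are predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


# Toy data (hypothetical): 10 majority samples, 3 minority samples.
majority = [(float(i), float(i % 3)) for i in range(10)]
minority = [(20.0, 20.0), (21.0, 20.5), (20.5, 21.0)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points

# Per-class evaluation on a small hypothetical set of predictions.
cm = confusion_matrix(
    ["a", "a", "b", "b", "b"],
    ["a", "b", "b", "b", "a"],
    labels=["a", "b"],
)
recall = per_class_recall(cm)
```

Reading the confusion matrix row by row, rather than relying on overall accuracy, is what reveals whether the minority class is still being misclassified after oversampling.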

Published

2024-09-15

How to Cite

Syofian, M., & Maulana, I. (2024). Enhancing Obesity Risk Classification: Tackling Data Imbalance with SMOTE and Deep Learning. Jurnal Riset Informatika, 6(4), 231–236. https://doi.org/10.34288/jri.v6i4.349