THE EFFECTIVENESS ANALYSIS OF RANDOM FOREST ALGORITHMS WITH SMOTE TECHNIQUE IN PREDICTING LUNG CANCER RISK

Authors

  • Ita Yulianti, Universitas Bina Sarana Informatika
  • Ami Rahmawati, Universitas Nusa Mandiri
  • Tati Mardiana, Universitas Nusa Mandiri
(*) Corresponding Author

DOI:

https://doi.org/10.34288/jri.v4i2.159

Keywords:

Lung Cancer, Python, Random Forest, SMOTE

Abstract

Lung cancer causes more deaths than any other type of cancer. Detecting the disease requires screening tests such as X-rays, CT scans, and MRI. Before ordering these tests, however, a doctor ordinarily reviews the patient's medical history and performs a physical examination to assess symptoms and possible risk factors for lung cancer. The lung cancer data set has a class imbalance that degrades the performance of the random forest algorithm in predicting lung cancer risk. This study applies the SMOTE technique together with the random forest algorithm to increase accuracy in predicting lung cancer risk. Data processing and analysis are carried out in the Python programming language. The test results show an accuracy of 88% and an AUC of 0.93. These results indicate that the SMOTE technique is useful for handling class imbalance in the data set when the random forest method is used to predict lung cancer risk.
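The oversampling idea behind SMOTE can be illustrated with a minimal sketch. This is not the authors' code, which the abstract does not reproduce; it is a standard-library-only illustration of how SMOTE creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy data are all hypothetical. In practice, a library implementation such as `SMOTE` from imbalanced-learn would typically be combined with `RandomForestClassifier` from scikit-learn, and oversampling is usually applied to the training split only so that the test data stays untouched.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points built from the minority samples.

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Hypothetical helper for illustration; requires at least two samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 8 majority samples versus 3 minority samples.
majority = [[float(i), float(i % 3)] for i in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Generate just enough synthetic samples to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

After this step both classes have the same number of samples, and a classifier such as a random forest could then be trained on the balanced set.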




Published

2022-03-24

How to Cite

Yulianti, I., Rahmawati, A., & Mardiana, T. (2022). THE EFFECTIVENESS ANALYSIS OF RANDOM FOREST ALGORITHMS WITH SMOTE TECHNIQUE IN PREDICTING LUNG CANCER RISK. Jurnal Riset Informatika, 4(2), 207–212. https://doi.org/10.34288/jri.v4i2.159


Section

Articles
