COMPARATIVE MACHINE LEARNING ALGORITHMS FOR YOUTUBE SENTIMENT ANALYSIS ON DPR DEMONSTRATION 2025 USING LEXICON
DOI:
https://doi.org/10.34288/jri.v8i1.470
Keywords:
Sentiment Analysis, Machine Learning, Lexicon-Based, YouTube Comments, DPR Demonstration
Abstract
The high volume of public comments on YouTube about the August 2025 DPR demonstration, 43,910 raw comments in total, poses a significant challenge for efficient sentiment analysis. The time and cost of manually labeling large-scale datasets are a major obstacle to developing predictive models. This study addresses the problem with a hybrid approach that integrates Lexicon-Based auto-labeling with a comparative evaluation of five Machine Learning algorithms. The methodology comprised text preprocessing, which yielded 40,097 unique comments, feature extraction using TF-IDF, and an 80:20 train-test split. The performance of the Support Vector Machine (SVM) algorithm was compared with that of Random Forest, Decision Tree, K-Nearest Neighbors (KNN), and Naive Bayes. In the experiments, SVM achieved the best performance, with an accuracy of 96.5% and a weighted F1-Score of 0.966. It substantially outperformed the other algorithms: Random Forest ranked second with 89.2% accuracy, followed by Decision Tree at 85.6%, KNN at 84.6%, and Naive Bayes lowest at 84.0%. These findings support the integration of Lexicon-Based labeling with SVM classification as a highly accurate, robust, and efficient solution for sentiment analysis on large-scale social media data in Indonesia.
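The first stages of the pipeline described in the abstract (lexicon-based auto-labeling, an 80:20 split, and TF-IDF weighting) can be illustrated with a minimal sketch in plain Python. Everything specific here is an assumption made for illustration: the toy lexicon, the three-class labels, the whitespace tokenizer, and the function names `label_comment`, `train_test_split`, and `tfidf` are hypothetical and are not taken from the study.

```python
import math
import random
from collections import Counter

# Hypothetical toy lexicon (assumption); the study's actual lexicon
# is not given in the abstract.
LEXICON = {"bagus": 1, "hebat": 1, "setuju": 1,
           "buruk": -1, "kecewa": -1, "gagal": -1}


def label_comment(text):
    """Sum the polarity of known words and map the score to a label.

    The three-class scheme (positive/negative/neutral) is an assumption.
    """
    score = sum(LEXICON.get(tok, 0) for tok in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def train_test_split(items, test_ratio=0.2, seed=42):
    """Shuffle a copy of items and hold out test_ratio of them for testing."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - test_ratio)))
    return shuffled[:cut], shuffled[cut:]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Uses the unsmoothed textbook form: tf(t, d) * log(N / df(t)).
    """
    tokenized = [d.lower().split() for d in docs]
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors


comments = ["aksi ini bagus dan hebat", "saya kecewa", "demo di depan gedung"]
labels = [label_comment(c) for c in comments]
vectors = tfidf(comments)
```

In this sketch the labels produced by `label_comment` would serve as training targets and the TF-IDF vectors as features for the classifiers being compared. Library implementations may weight terms differently; scikit-learn's `TfidfVectorizer`, for example, applies a smoothed IDF by default, so its values would not match the unsmoothed form used here.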
License
Copyright (c) 2025 Syafri Samsudin, Ahmad Abdul Chamid, Ahmad Jazuli

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Jurnal Riset Informatika governs access to its digital articles under a Creative Commons Attribution-NonCommercial 4.0 International License. Articles published in Jurnal Riset Informatika are Open Access, for the purpose of scientific development, research, and libraries.










