Active Learning Query by Committee Labeling Method to Increase Accuracy and Efficiency of Sentiment Analysis Classification
DOI:
https://doi.org/10.34288/jri.v7i4.386
Keywords:
Sentiment Analysis, Labeling Method, Query by Committee, Active Learning
Abstract
This study proposes the Query by Committee (QBC) labeling method to improve the accuracy of classification models, specifically XLM-RoBERTa, and to make labeling more efficient than manual annotation for supervised learning, which generally requires more time and resources. The dataset consists of unannotated healthcare-industry application reviews scraped from Google Play. Six distinct labeling strategies were applied to produce the input for fine-tuning XLM-RoBERTa models under identical hyperparameter settings. The six labeling approaches evaluated were Rating-based labeling, Lexicon-based labeling, QBC for Rating-VADER labeling, QBC for Rating-Pseudo labeling, QBC for VADER-Pseudo labeling, and triplet QBC for Rating-Pseudo-VADER labeling. Each labeled dataset was split using stratified random sampling, and class weights were set to “auto” during training to address label imbalance. All models were subsequently tested on the IndoNLU SmSA test dataset, and their performance was compared in terms of accuracy, precision, recall, and F1-score. The results indicate that the triplet QBC approach (combining Rating, VADER, and Pseudo labeling) outperformed all other methods, achieving an accuracy of 91.4%, a precision of 91.28%, a recall of 91.4%, and an F1-score of 91.21%. These findings demonstrate that the QBC labeling method can serve as an effective and efficient alternative to manual annotation for similar classification tasks.
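The abstract does not spell out how the committee resolves disagreement between its members, so the following is only a minimal sketch of one common reading of QBC labeling: each labeling source (rating, VADER, pseudo-label) casts a vote, a label is accepted when a strict majority agrees, and vote entropy measures how much the committee disagrees. The star-rating thresholds, the function names, and the sample reviews are all hypothetical; in a real pipeline the VADER and pseudo labels would come from a lexicon scorer and a pre-trained classifier rather than being hard-coded.

```python
from collections import Counter
from math import log


def rating_to_label(stars):
    """Hypothetical rating rule: 1-2 stars negative, 3 neutral, 4-5 positive."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


def vote_entropy(votes):
    """Committee disagreement: 0.0 when all members agree, larger otherwise."""
    total = len(votes)
    counts = Counter(votes)
    return -sum((c / total) * log(c / total) for c in counts.values())


def committee_label(votes):
    """Return the label backed by a strict majority, or None if there is none."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Hypothetical reviews; "vader" and "pseudo" stand in for labels that a
# lexicon scorer and a pre-trained sentiment model would have produced.
reviews = [
    {"stars": 5, "vader": "positive", "pseudo": "positive"},
    {"stars": 1, "vader": "positive", "pseudo": "negative"},
    {"stars": 3, "vader": "negative", "pseudo": "positive"},
]

for review in reviews:
    votes = [rating_to_label(review["stars"]), review["vader"], review["pseudo"]]
    print(votes, committee_label(votes), round(vote_entropy(votes), 3))
```

Under this assumption, reviews on which the committee cannot reach a majority (a `None` result) are the ones that would be dropped or routed to a human annotator, while a two-member committee such as Rating-VADER accepts a label only when both members agree.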
License
Copyright (c) 2025 Dipa Anasta Iskandar, R. Mohamad Atok

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Jurnal Riset Informatika governs access to its digital electronic articles under a Creative Commons Attribution-NonCommercial 4.0 International License. Articles published in Jurnal Riset Informatika are Open Access, for the purposes of scientific development, research, and libraries.










