Sentiment Analysis of E-Grocery Application Reviews Using Lexicon-Based and Support Vector Machine
DOI:
https://doi.org/10.34288/jri.v6i3.301
Keywords:
Sentiment Analysis, E-Grocery, Lexicon Based, Support Vector Machine (SVM), Application Reviews
Abstract
This research conducts sentiment analysis of e-grocery application reviews using the Support Vector Machine (SVM) algorithm. Sentiment analysis distinguishes positive from negative user reviews so that the services offered can be evaluated. The review data were obtained by scraping and are limited to reviews of the Segari and Sayurbox applications. The datasets were collected with the Python library google-play-scraper, yielding 4235 reviews for the Sayurbox application and 5575 for the Segari application. The collected dataset is unlabeled, and labeling it manually, review by review, is impractical because it takes a long time and requires a language expert who can interpret the reviews and group them into positive and negative sentiments. Therefore, sentiment labeling applies a lexicon-based method that uses the InSet lexicon dictionary to calculate each review's polarity value. Classification uses the SVM algorithm because it has been reported to give consistent and accurate results in various classification tasks, including sentiment analysis. The results show that the lexicon-based method combined with SVM produces good accuracy in determining the sentiment of e-grocery reviews: 94% for the Sayurbox application and 97% for the Segari application.
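The lexicon-based labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon below is a hypothetical six-word stand-in for the InSet dictionary, the function names are invented for illustration, and treating a zero score as "neutral" is an assumption, since the abstract does not say how such reviews are handled.

```python
import re

# Hypothetical toy lexicon (word -> polarity weight).
# This is NOT the real InSet dictionary; words and weights are illustrative only.
TOY_LEXICON = {
    "bagus": 3,    # good
    "cepat": 2,    # fast
    "segar": 3,    # fresh
    "lambat": -3,  # slow
    "busuk": -4,   # rotten
    "kecewa": -4,  # disappointed
}


def polarity_score(review, lexicon):
    """Sum the lexicon weights of all words found in the review."""
    tokens = re.findall(r"[a-z]+", review.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def label_review(review, lexicon):
    """Map a polarity score to a sentiment label.

    Assumption: a score of exactly zero is labeled "neutral";
    the source does not state how such reviews are treated.
    """
    score = polarity_score(review, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


reviews = [
    "Sayurnya segar, pengiriman cepat!",
    "Sayur busuk dan kurir lambat",
]
for text in reviews:
    print(label_review(text, TOY_LEXICON), "|", text)
```

Labels produced this way would then serve as training targets for a classifier such as SVM, which is the role the lexicon step plays in the study.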
License
Copyright (c) 2024 Riska Aryanti, Eka Fitriani, Royadi, Dian Ardiansyah, Atang Saepudin
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
The Jurnal Riset Informatika has legal rules for accessing digital electronic articles under a Creative Commons Attribution-NonCommercial 4.0 International License. Articles published in Jurnal Riset Informatika are provided as Open Access for the purpose of scientific development, research, and libraries.