PRODUCT SALES PREDICTION USING XGBOOST WITH FEATURE IMPORTANCE ANALYSIS FOR ADVERTISING MEDIA EVALUATION

Authors

  • Wilsen Grivin Mokodaser, Universitas Klabat
  • Tonny Irianto Soewignyo, Universitas Klabat
  • Fanny Soewignyo, Universitas Klabat
(*) Corresponding Author

DOI:

https://doi.org/10.34288/jri.v8i3.537

Keywords:

Marketing Analytics, XGBoost, Linear Regression, Feature Importance

Abstract

Product sales prediction plays a crucial role in supporting data-driven marketing strategies and optimizing advertising expenditure. Although previous studies have demonstrated the effectiveness of machine learning techniques for sales forecasting, most focus primarily on prediction accuracy and offer limited insight into how individual advertising channels contribute to sales performance. This limitation reduces the interpretability and practical value of predictive models for business decision-making. This study therefore proposes a product sales prediction framework that uses Linear Regression as a baseline model and XGBoost Regression combined with Feature Importance Analysis for advertising media evaluation. The novelty of the study lies in integrating predictive modeling and interpretable analysis within a single framework, enabling both accurate sales prediction and the identification of influential advertising factors. Hyperparameter optimization and five-fold cross-validation were employed to improve model reliability and robustness. Experimental results show that Linear Regression outperformed XGBoost, achieving an R² score close to 1.0, while XGBoost achieved an R² score of 0.953 and a mean cross-validation R² score of 0.950, indicating stable predictive performance. Feature Importance Analysis revealed that Affiliate Marketing was the most influential factor, followed by Billboards and Social Media. These findings contribute to marketing analytics by providing interpretable insights that support advertising budget optimization and more effective data-driven business decision-making.
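
The abstract reports a single R² score and a mean R² over five cross-validation folds. As a rough illustration of how those two quantities relate, the sketch below computes both in plain Python. It is not the authors' pipeline: the data are synthetic, a one-feature least-squares line stands in for the actual models, and the names `r2_score`, `fit_line`, and `five_fold_r2` are hypothetical helpers introduced only for this example.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def five_fold_r2(x, y, k=5, seed=0):
    """Return the R² measured on each of k held-out folds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index subsets
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train],
                                    [y[i] for i in train])
        preds = [slope * x[i] + intercept for i in fold]
        scores.append(r2_score([y[i] for i in fold], preds))
    return scores


# Synthetic, hypothetical data: sales as a noisy linear function of ad spend.
rng = random.Random(42)
spend = [rng.uniform(0, 100) for _ in range(200)]
sales = [3.0 * s + 20.0 + rng.gauss(0, 5) for s in spend]

scores = five_fold_r2(spend, sales)
mean_r2 = sum(scores) / len(scores)
print("fold R2:", [round(s, 3) for s in scores])
print("mean CV R2:", round(mean_r2, 3))
```

A mean cross-validation score close to the single held-out score, as the abstract reports for XGBoost (0.950 against 0.953), suggests that performance does not depend heavily on one particular split of the data.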

Published

2026-06-16

How to Cite

Mokodaser, W. G., Soewignyo, T. I., & Soewignyo, F. (2026). PRODUCT SALES PREDICTION USING XGBOOST WITH FEATURE IMPORTANCE ANALYSIS FOR ADVERTISING MEDIA EVALUATION. Jurnal Riset Informatika, 8(3), 498–507. https://doi.org/10.34288/jri.v8i3.537

Section

Articles
