Analysis of Indonesian Language Dataset for Tax Court Cases: Multiclass Classification of Court Verdicts


  • Ade Putera Kemala, Bina Nusantara University
  • Hafizh Ash Shiddiqi, Bina Nusantara University
(*) Corresponding Author



Keywords: BERT, Classification, Deep learning, NLP, Tax


Tax is an obligation established by law that requires citizens to contribute a portion of their income to the state. The Tax Court is the judicial authority for taxpayers seeking justice in tax disputes, and it handles cases involving many types of taxes every day. This paper analyzes an Indonesian-language dataset of tax court cases and performs multiclass classification to predict court verdicts. The dataset is preprocessed, and class imbalance is addressed with data augmentation using oversampling and label weighting techniques. Two models, bi-LSTM and IndoBERT, are used for classification. The final model, based on IndoBERT, achieved a score of 75.83%. The results demonstrate the efficacy of both models in predicting court verdicts. The research has implications for predicting court conclusions from limited case details and offers insights for legal decision-making. The findings contribute to the field of legal data analysis by showing the potential of NLP techniques for understanding and predicting court outcomes, which could improve the efficiency of legal proceedings.
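
The two imbalance-handling ideas named in the abstract, oversampling and label weighting, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the inverse-frequency weighting formula, the choice of random duplication as the oversampling strategy, and the toy verdict labels are all assumptions made for illustration.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples at random until every class
    has as many examples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data; these labels are placeholders, not the paper's classes.
texts = ["case a", "case b", "case c", "case d", "case e", "case f"]
labels = ["granted", "granted", "granted", "rejected", "rejected", "partial"]

weights = inverse_frequency_weights(labels)
balanced_texts, balanced_labels = random_oversample(texts, labels)
```

In a setup like this, the weights would typically be passed to a weighted loss function, while oversampling would be applied to the training split only so that duplicated cases do not leak into evaluation data.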


Author Biographies

Ade Putera Kemala, Bina Nusantara University

School of Computer Science, Data Science

Hafizh Ash Shiddiqi, Bina Nusantara University

School of Computer Science, Computer Science


