Early Detection of Hepatitis Disease Using Machine Learning Algorithms
Abstract
Hepatitis is an inflammation of the liver caused by viral infections, autoimmune disorders, or exposure to toxic substances. Hepatitis B and C are major public health concerns because they may progress to cirrhosis or liver cancer. In Indonesia, the transmission rate remains high, primarily through blood contact, unsterile needles, transfusions, and mother-to-child transmission during delivery. Limited public awareness, coupled with the often asymptomatic nature of hepatitis, leads to delayed detection, which increases the risk of severe complications and mortality. Early detection is therefore crucial to minimizing the disease burden.
This study proposes a risk prediction model for hepatitis using non-laboratory clinical data and machine learning methods. Nine classification algorithms were compared: Naïve Bayes, K-Nearest Neighbor (K-NN), Random Forest, Support Vector Machine (SVM), Decision Tree, AdaBoost, XGBoost, CatBoost, and LightGBM. Model performance was evaluated through K-fold cross-validation using accuracy, precision, recall, F1-score, and AUC. The results show that the SVM with a linear kernel achieved the highest performance, with 87% accuracy and balanced F1-scores across all classes. The model classified cases into four categories: Acute Hepatitis, Chronic Hepatitis, Liver Abscess, and Parasitic/Viral Infections. These findings highlight the potential of machine learning to support effective and efficient early detection of hepatitis.
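The evaluation protocol named in the abstract (K-fold cross-validation scored with accuracy, precision, recall, and F1) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic two-feature points, the classifier is a 1-nearest-neighbor stand-in rather than the study's linear SVM, the metrics are macro-averaged, and the function names are hypothetical. AUC is omitted because it needs class scores rather than hard labels.

```python
import random

# The four target categories reported in the abstract.
CLASSES = ["Acute Hepatitis", "Chronic Hepatitis",
           "Liver Abscess", "Parasitic/Viral Infections"]


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def macro_scores(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(2 * precision * recall / denom if denom else 0.0)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def nearest_neighbor_predict(train_X, train_y, x):
    """Stand-in classifier: return the label of the closest training point."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]


def cross_validate(X, y, labels, k=5, seed=0):
    """Average the per-fold scores over k train/test splits."""
    fold_scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        y_true = [y[i] for i in test_idx]
        y_pred = [nearest_neighbor_predict(train_X, train_y, X[i])
                  for i in test_idx]
        fold_scores.append(macro_scores(y_true, y_pred, labels))
    return {m: sum(s[m] for s in fold_scores) / k for m in fold_scores[0]}


def make_synthetic_data(n_per_class=30, seed=1):
    """Synthetic, well-separated clusters; NOT the study's clinical data."""
    rng = random.Random(seed)
    X, y = [], []
    for c, label in enumerate(CLASSES):
        for _ in range(n_per_class):
            X.append([rng.gauss(3 * c, 1.0), rng.gauss(-2 * c, 1.0)])
            y.append(label)
    return X, y


X, y = make_synthetic_data()
scores = cross_validate(X, y, CLASSES, k=5)
print(scores)
```

Comparing several algorithms under this scheme amounts to swapping the prediction function while keeping the folds fixed, so that every model is scored on the same held-out samples.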
Keywords
Diagnosis; Early Detection; Hepatitis; Machine Learning; Support Vector Machine
DOI: http://dx.doi.org/10.24014/ijaidm.v8i3.38084