An Ensemble Voting Approach for Dropout Student Classification Using Decision Tree C4.5, K-Nearest Neighbor and Backpropagation

Daffa Nur Cholis, Nurissaidah Ulinnuha


Many factors cause students to drop out. This study classifies active and dropout students using 1,092 student records, comprising 557 active students and 535 dropout students. The independent variables are Semester, Semester Credit Units (SKS), Semester Grade Point Average (IPS), Grade Point Average (IPK), admission pathway, and Single Tuition Fee (UKT). Classification is carried out with the Ensemble Voting method, which combines the Decision Tree C4.5, K-Nearest Neighbor (KNN), and Backpropagation methods into a single classifier. In addition to classifying active and dropout students, this study aims to determine whether Ensemble Voting achieves better results than each single method. The models are built using a 90:10 split of training and testing data, and the classification results of the single methods are passed to the Ensemble Voting method. Decision Tree C4.5 achieves 95.45% accuracy, 98.03% precision, and 92.59% recall. KNN achieves 96.36% accuracy, 100% precision, and 92.59% recall. Backpropagation achieves 90.90% accuracy, 95.83% precision, and 95.18% recall. The voting rule used is Ensemble Soft Voting with weights of (2,1,1). With this rule, Ensemble Voting improves on the single methods, reaching 98.18% accuracy, 100% precision, and 96.29% recall.


Classification, Drop Out, Ensemble Learning, Ensemble Voting, Student
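
The soft voting rule named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the class encoding (0 = active, 1 = dropout), and the probability values are hypothetical, and it assumes the weights (2,1,1) follow the order Decision Tree C4.5, KNN, Backpropagation, which the abstract does not state explicitly. Each classifier supplies class probabilities, the ensemble takes their weighted mean, and the class with the highest mean is predicted. A second helper computes accuracy, precision, and recall, the three measures reported above.

```python
from typing import Sequence, Tuple


def soft_vote(probas: Sequence[Sequence[float]], weights: Sequence[float]) -> int:
    """Return the class index with the highest weighted mean probability.

    probas[i][c] is classifier i's predicted probability for class c.
    """
    if len(probas) != len(weights):
        raise ValueError("one weight per classifier is required")
    total = sum(weights)
    n_classes = len(probas[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=averaged.__getitem__)


def evaluate(y_true: Sequence[int], y_pred: Sequence[int],
             positive: int = 1) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical probabilities for one student, as [P(active), P(dropout)].
# Assumed order: Decision Tree C4.5, KNN, Backpropagation.
example = [
    [0.35, 0.65],  # decision tree leans towards dropout
    [0.60, 0.40],  # KNN leans towards active
    [0.60, 0.40],  # backpropagation network leans towards active
]

print(soft_vote(example, (1, 1, 1)))  # equal weights
print(soft_vote(example, (2, 1, 1)))  # first classifier counts double
```

In this made-up case the two weightings disagree: with equal weights the two classifiers favouring "active" win, while doubling the first classifier's weight tips the prediction to "dropout". This shows how the weights can change the ensemble's decision without changing any single model.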





