Optimizing Student Depression Prediction Using Particle Swarm Optimization and Random Forest

Mukhammad Khoirul Effendi, Sriyanto, Suhendro Yusuf Irianto, Chairani Fauzi, Yelfi Vitriani

Abstract


Student mental health is a growing concern as academic pressure, social demands, and economic factors increasingly affect student well-being. Depression, a common issue among students, significantly impairs academic performance and overall quality of life, so early detection and accurate prediction are essential for timely intervention. This study aims to improve the accuracy of depression prediction among university students by combining Particle Swarm Optimization (PSO) for feature selection with Random Forest (RF) as the classification model. The dataset is the Student Depression Dataset from Kaggle, comprising 27,900 respondents and 18 features covering demographic, academic, and psychological factors. Preprocessing includes handling missing values, normalization, categorical encoding, and feature selection using PSO. The model is trained and evaluated using 10-fold cross-validation. Experimental results show that the PSO-optimized Random Forest outperforms the standard Random Forest model, achieving an accuracy of 84.08%, precision of 82.79%, recall of 77.79%, and an AUC-ROC score of 0.912. These findings indicate that PSO-based feature selection improves classification accuracy. By optimizing feature selection, the approach also reduces computational complexity while maintaining high predictive performance, contributing to a more accurate and efficient machine learning model for detecting student depression. Future research can explore hybrid optimization techniques such as Genetic Algorithm (GA) or Differential Evolution (DE) to further improve model generalization across different datasets.
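The abstract does not state which PSO variant, swarm parameters, or fitness function were used, so the following is only a minimal sketch of one common formulation: binary PSO with a sigmoid transfer function, searching over 0/1 feature masks. The function names, the parameter values (w, c1, c2, swarm size), and the toy fitness are all assumptions made for illustration. In a setting like the one described, the fitness would presumably be the cross-validated Random Forest score on the selected columns; here a stand-in function is used so the sketch runs with the standard library alone.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Mask = List[int]


def binary_pso(
    n_features: int,
    fitness: Callable[[Sequence[int]], float],
    n_particles: int = 12,
    n_iters: int = 30,
    w: float = 0.7,    # inertia weight (assumed value)
    c1: float = 1.5,   # cognitive coefficient (assumed value)
    c2: float = 1.5,   # social coefficient (assumed value)
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Search for a 0/1 feature mask that maximizes `fitness`."""
    rng = random.Random(seed)
    # Each particle is a candidate mask: 1 keeps a feature, 0 drops it.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-6.0, min(6.0, v))  # clamp so the sigmoid never saturates fully
                vel[i][d] = v
                # Sigmoid transfer: velocity becomes the probability of bit = 1.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Hypothetical stand-in for "cross-validated classifier score on the
# selected features": reward three pretend-informative features and
# charge a small cost per selected feature.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask: Sequence[int]) -> float:
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


# 18 mirrors the feature count quoted in the abstract; nothing else
# about the dataset is modeled here.
best_mask, best_score = binary_pso(n_features=18, fitness=toy_fitness)
print(best_mask, round(best_score, 2))
```

To move from the toy to a real experiment, `toy_fitness` would be replaced by a function that trains and scores a classifier on only the columns where the mask is 1. The per-feature penalty is one simple way to favor smaller subsets, which relates to the reduced computational complexity the abstract mentions, but whether the study used such a penalty is not stated.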


Keywords


Machine Learning; Mental Health; Particle Swarm Optimization; Random Forest; Student Depression


DOI: http://dx.doi.org/10.24014/coreit.v11i1.35954

Jurnal CoreIT by http://ejournal.uin-suska.ac.id/index.php/coreit/ is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.