A Smart Architecture for Stunting Prediction: Implementing the SOM–Voting Classifier on Healthcare Big Data

Kelvin Kelvin, Sunaryo Winardi, Frans Mikael Sinaga, Hardy Hardy, Erwin Setiawan Panjaitan, Ng Poi Wong, Ferawaty Ferawaty, Justine Lim, Grace Putri Wijaya

Abstract


Childhood stunting is a persistent public health challenge in Indonesia. This study developed a predictive classification model from healthcare data collected at hospitals in Medan to enable early identification of at-risk children. The proposed framework integrated an unsupervised Self-Organizing Map (SOM) for feature engineering with a supervised Voting Classifier ensemble that combined a Support Vector Classifier (SVC), Random Forest (RF), and Gradient Boosting (GB). On the test set, the framework achieved 100% accuracy, compared with 91.67% for the baseline Voting Classifier without SOM. Although this result highlighted the model's predictive potential, it must be interpreted cautiously: validation on more diverse datasets is still needed to establish generalizability. The findings indicated that this hybrid machine learning approach can serve as a decision-support tool, enabling proactive clinical interventions and helping public health officials allocate nutritional resources strategically in support of Indonesia's national stunting reduction goals.
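The two stages named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the SOM output is passed to the ensemble, so this example assumes one common choice: each sample's best-matching-unit (BMU) grid coordinates and quantization error are appended to its raw features. The toy rows, the 3x3 grid, the decay schedules, and all function names are hypothetical and are not taken from the study. The code uses only the Python standard library; in practice the voting stage would more likely rely on an existing implementation such as scikit-learn's `VotingClassifier` wrapping `SVC`, `RandomForestClassifier`, and `GradientBoostingClassifier`.

```python
import math
import random
from collections import Counter


def best_matching_unit(weights, row):
    """Grid coordinates of the node whose weight vector is closest to row."""
    return min(
        weights,
        key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], row)),
    )


def train_som(rows, grid_w=3, grid_h=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Fit a small rectangular SOM; returns {(gx, gy): weight_vector}."""
    rng = random.Random(seed)
    dim = len(rows[0])
    weights = {
        (gx, gy): [rng.random() for _ in range(dim)]
        for gx in range(grid_w)
        for gy in range(grid_h)
    }
    total_steps = epochs * len(rows)
    step = 0
    for _ in range(epochs):
        for row in rng.sample(rows, len(rows)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # shrinking neighbourhood
            bx, by = best_matching_unit(weights, row)
            for (gx, gy), w in weights.items():
                grid_d2 = (gx - bx) ** 2 + (gy - by) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian kernel
                for i in range(dim):
                    w[i] += lr * h * (row[i] - w[i])
            step += 1
    return weights


def som_features(weights, row):
    """Raw features plus BMU coordinates and quantization error (assumed design)."""
    node = best_matching_unit(weights, row)
    q_err = math.dist(weights[node], row)
    return list(row) + [float(node[0]), float(node[1]), q_err]


def hard_vote(predictions):
    """Majority vote over the labels predicted by the base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical, already-normalised measurements (not data from the study).
rows = [
    [0.10, 0.20, 0.10], [0.15, 0.25, 0.12],
    [0.80, 0.90, 0.85], [0.82, 0.88, 0.90],
    [0.50, 0.50, 0.50], [0.45, 0.55, 0.50],
]
som = train_som(rows)
augmented = [som_features(som, r) for r in rows]

# Stand-in labels from three base classifiers (e.g. SVC, RF, GB) for one child.
label = hard_vote(["stunted", "normal", "stunted"])
```

With three base classifiers and two classes a hard vote cannot tie, which is one reason an odd number of voters is a convenient choice for this kind of ensemble.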

Keywords


Big Data; Medical; Stunting; Support Vector Classifier; Voting Classifier

DOI: http://dx.doi.org/10.24014/ijaidm.v8i3.38000
