Classification of Online Gambling Spam Comments on YouTube Using Support Vector Machine

Umbu Anaagung Pariamalinya, Josua Josen A. Limbong, Julius Panda Putra Naibaho

Abstract


While digital transformation has established YouTube as a major communication platform, the site has also become vulnerable to online gambling spam in Indonesia. This study investigates the effectiveness of the Support Vector Machine (SVM) algorithm for automated spam detection as an alternative to manual moderation. A total of 9,169 comments were collected from gaming, education, and entertainment channels using the YouTube Data API v3 and were used to train and evaluate the model with an 80:20 data split. The experimental results show that SVM achieved an accuracy of 99.62% and an F1-score of 0.996, demonstrating strong capability in identifying spam comments written in informal and modified promotional language. The main contribution of this study is the development of a highly accurate and practical spam detection approach for Indonesian YouTube comments, which can support more efficient moderation systems. However, the model still has limitations in detecting sarcastic content. Therefore, future research should explore deep learning models such as BERT to improve contextual understanding and strengthen automated moderation in digital environments.
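The evaluation setup described in the abstract can be illustrated with a minimal sketch. It shows only the 80:20 split arithmetic for 9,169 comments and how accuracy and F1 are computed from predictions. The helper names (`split_80_20`, `binary_metrics`), the random seed, and the toy labels are hypothetical and do not come from the paper. The abstract also does not state whether the reported F1-score is binary, macro, or weighted, so the binary form below is an assumption.

```python
import random


def split_80_20(items, seed=42):
    """Shuffle a copy of `items` and return (train, test) in an 80:20 ratio.

    The seed is an arbitrary choice for reproducibility, not a value from the paper.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class (spam)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Split sizes for a corpus of 9,169 comments (indices stand in for the comments).
train, test = split_80_20(range(9169))
print(len(train), len(test))

# Toy labels only (1 = spam, 0 = not spam); these are NOT results from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Under this split the hold-out set contains 1,834 comments. If the reported 99.62% accuracy was measured on a hold-out of exactly that size, it would correspond to about 7 misclassified comments. That figure is an inference from the abstract, not a number stated by the authors.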

Keywords


Content Moderation; Machine Learning; Online Gambling Spam; Support Vector Machine; YouTube






DOI: http://dx.doi.org/10.24014/ijaidm.v9i1.39193



