Evaluation of Support Vector Machine, Naive Bayes, Decision Tree, and Gradient Boosting Algorithms for Sentiment Analysis on ChatGPT Twitter Dataset

Salsabila Rabbani, Dea Safitri, Farida Try Puspa Siregar, Rahmaddeni Rahmaddeni, Lusiana Efrizoni

Abstract


ChatGPT is a language model that generates text and converses with users. It is designed to give relevant and useful responses based on the context of the ongoing conversation. As ChatGPT grows in popularity, it becomes difficult to classify user responses about its use. This study therefore performs sentiment classification of responses about ChatGPT. The dataset, sourced from Kaggle, contains 20,000 records. The classification methods compared are Support Vector Machine (SVM), Naïve Bayes, Decision Tree, and Gradient Boosting. With the data split in a 90:10 ratio, SVM achieved the highest accuracy of the four methods, at 80%. The results are expected to help developers and service providers improve ChatGPT and better understand user responses.
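The evaluation setup described in the abstract (a 90:10 split followed by an accuracy measurement) can be illustrated with a minimal sketch. This is not the authors' code: the toy sentences, the labels "good" and "bad", whitespace tokenization, and the helper names `split_90_10`, `TinyNaiveBayes`, and `accuracy` are all assumptions made for illustration, and only Naïve Bayes, one of the four compared methods, is shown.

```python
import math
import random
from collections import Counter, defaultdict


def split_90_10(rows, seed=0):
    """Shuffle rows and return (train, test) in a 90:10 ratio."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * 0.9)
    return rows[:cut], rows[cut:]


class TinyNaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens, add-one smoothing."""

    def fit(self, rows):
        # Count documents per label and word occurrences per label.
        self.label_counts = Counter(label for _, label in rows)
        self.word_counts = defaultdict(Counter)
        for text, label in rows:
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        self.total = len(rows)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in text.lower().split() if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n / self.total)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(model.predict(text) == label for text, label in rows)
    return hits / len(rows)


# Hypothetical toy data standing in for the 20,000-record dataset.
POSITIVE = ["chatgpt is great", "love this helpful tool", "great helpful answers"]
NEGATIVE = ["chatgpt is wrong", "hate this useless tool", "wrong useless answers"]
DATA = [(t, "good") for t in POSITIVE * 10] + [(t, "bad") for t in NEGATIVE * 10]

train, test = split_90_10(DATA)
model = TinyNaiveBayes().fit(train)
acc = accuracy(model, test)
print(len(train), len(test), acc)
```

The same `split_90_10` and `accuracy` helpers could be reused with any classifier that exposes `fit` and `predict`, which is how a comparison across several methods would typically be organized.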


Keywords


Sentiment Analysis, Support Vector Machine, Naïve Bayes, Decision Tree, Gradient Boosting


DOI: http://dx.doi.org/10.24014/ijaidm.v7i1.24662


Office and Secretariat:

Big Data Research Centre
Puzzle Research Data Technology (Predatech)
Laboratory Building 1st Floor of Faculty of Science and Technology
UIN Sultan Syarif Kasim Riau

Jl. HR. Soebrantas KM. 18.5 No. 155 Pekanbaru Riau – 28293
Website: http://predatech.uin-suska.ac.id/ijaidm
Email: ijaidm@uin-suska.ac.id
e-Journal: http://ejournal.uin-suska.ac.id/index.php/ijaidm
Phone: 085275359942
