Evaluation of the Latent Dirichlet Allocation for Modeling News Topics of Nusantara Capital City

Luh Gede Surya Kartika, Anggara Putu Dharma Putra, Komang Rinartha, Megawati Megawati

Abstract


Research on topic modeling of national mass-media coverage of the Nusantara Capital City (IKN) remains limited. This study models IKN-related topics and evaluates the Latent Dirichlet Allocation (LDA) model to assess its robustness for future use. The dataset comprises 1,498 news articles from two prominent Indonesian online media outlets: Detik (1,050 articles) and Kompas (448 articles). The experiments vary three LDA settings (document volume, maximum number of features, and number of topics) using the Scikit-learn library. The results indicate that larger data volumes and higher feature dimensions are associated with longer computation times and more epochs before convergence. Increasing the number of variables and the data volume also produced more negative log-likelihood values and higher perplexity, suggesting that greater model complexity reduces predictive precision. A convergence threshold of $10^{-2}$ was applied to determine when training stops. This study establishes a baseline for static topic modeling; future research should consider Dynamic Topic Modeling (DTM) to capture the temporal evolution of topics, which standard LDA does not address.
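
The evaluation described above can be illustrated with a minimal sketch, which is not the authors' code. It assumes that the convergence threshold maps to Scikit-learn's `perp_tol` parameter and that the epoch count is read from `n_iter_`; the abstract confirms neither. The toy corpus, the helper name `evaluate_lda`, and the parameter grid are hypothetical placeholders, not the study's data or settings.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Placeholder corpus for illustration only (NOT the 1,498 Detik/Kompas articles).
docs = [
    "capital city relocation budget government",
    "government budget capital city investment",
    "forest environment capital city construction",
    "construction investment government capital city",
    "environment forest relocation capital city",
    "investment budget construction capital city",
]


def evaluate_lda(texts, max_features, n_topics, tol=1e-2):
    """Fit one LDA configuration and report its evaluation metrics."""
    # Keep only the `max_features` most frequent terms in the vocabulary.
    vectorizer = CountVectorizer(max_features=max_features)
    X = vectorizer.fit_transform(texts)

    lda = LatentDirichletAllocation(
        n_components=n_topics,
        learning_method="batch",
        max_iter=50,
        evaluate_every=1,  # check perplexity after every pass over the data
        perp_tol=tol,      # assumed counterpart of the paper's 1e-2 threshold
        random_state=0,
    )
    lda.fit(X)

    return {
        "max_features": max_features,
        "n_topics": n_topics,
        "epochs": lda.n_iter_,           # passes completed before stopping
        "log_likelihood": lda.score(X),  # approximate log-likelihood bound
        "perplexity": lda.perplexity(X),
    }


# Hypothetical grid over feature count and topic count.
results = [evaluate_lda(docs, mf, k) for mf in (4, 8) for k in (2, 3)]
for row in results:
    print(row)
```

Varying the number of documents passed to `evaluate_lda` would cover the third experimental factor, document volume. The metrics here are computed on the training documents; scoring a held-out split instead is a common alternative.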


Keywords


IKN; LDA; log-likelihood; model evaluation; perplexity



DOI: http://dx.doi.org/10.24014/coreit.v11i2.33397

Jurnal CoreIT by http://ejournal.uin-suska.ac.id/index.php/coreit/ is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.