Implementation of the Term Frequency-Inverse Document Frequency (TF-IDF) and Maximum Marginal Relevance Methods for Monitoring Online Discussions

Okfalisa S.T., M.Sc

Abstract


The use of social media in teaching and learning, especially in online discussion forums, is gradually increasing. Nevertheless, the spread of out-of-scope discussion that triggers negative debates violates the code of communication ethics in online discussion. This places growing demands on admins and instructors to monitor and control discussion activity during a forum session. A software application applying the TF-IDF and Maximum Marginal Relevance (MMR) methods was developed to monitor online discussion activity. A sequence of text-processing steps, namely sentence splitting, case folding, tokenizing, filtering, and stemming, is applied to extract the instructor's posting as well as the members' comments. TF-IDF, query relevance, and similarity values are then calculated. Maximum Marginal Relevance is applied to extract a document summary with reduced sentence redundancy and a ranked output. A comment whose value is zero (0), based on comparing the summary of the posted document with the members' comments, is classified as “Unfeasible” and recommended for removal. In accuracy, black-box, and UAT testing on one lecture topic, the application succeeded in monitoring online discussion activity, with a compression value of 50% and an accuracy of 76.67%. Hence the discussion forum, as a tool for improving the quality of teaching and learning, can be optimized accordingly.
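The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The stop-word list, the smoothed IDF formula, the weight `lam`, and all function names are assumptions made for illustration. Stemming is omitted, and the "value" of a comment is assumed to be its cosine similarity to the summary.

```python
import math
import re
from collections import Counter

# Assumed toy stop-word list; a real system would use a full list for its language.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "it", "for"}

def split_sentences(text):
    """Sentence breakdown on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(text):
    """Case folding, tokenizing, and stop-word filtering (stemming omitted)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def build_idf(token_lists):
    """Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    idf = {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}
    return idf, math.log(1 + n) + 1  # second value: IDF for unseen terms

def vectorize(tokens, idf, default_idf):
    """Sparse TF-IDF vector: raw term count times IDF."""
    return {t: c * idf.get(t, default_idf) for t, c in Counter(tokens).items()}

def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def mmr_summary(sentences, query, k, lam=0.7):
    """Greedy MMR: trade query relevance against redundancy with picked sentences."""
    token_lists = [tokenize(s) for s in sentences]
    idf, default_idf = build_idf(token_lists)
    vecs = [vectorize(t, idf, default_idf) for t in token_lists]
    qvec = vectorize(tokenize(query), idf, default_idf)
    selected, candidates = [], list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * cosine(vecs[i], qvec) - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(selected)]  # keep original order

def classify_comment(comment, summary_sentences):
    """Label a comment 'Unfeasible' when its similarity to the summary is zero."""
    idf, default_idf = build_idf([tokenize(s) for s in summary_sentences])
    summary_vec = vectorize(tokenize(" ".join(summary_sentences)), idf, default_idf)
    comment_vec = vectorize(tokenize(comment), idf, default_idf)
    return "Unfeasible" if cosine(comment_vec, summary_vec) == 0 else "Feasible"
```

In this sketch a 50% compression would correspond to calling `mmr_summary` with `k` set to half the number of sentences in the posting. Each comment is then passed to `classify_comment` together with the resulting summary.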

Keywords


Cosine Similarity, Online Discussion, Comments Feasibility, Maximum Marginal Relevance, Text Processing, TF-IDF


DOI: http://dx.doi.org/10.24014/sitekin.v13i2.1399



Copyright (c) 2016 Jurnal Sains dan Teknologi Industri

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Editorial Address:
FAKULTAS SAINS DAN TEKNOLOGI
UIN SULTAN SYARIF KASIM RIAU

Kampus Raja Ali Haji
Gedung Fakultas Sains & Teknologi UIN Suska Riau
Jl.H.R.Soebrantas No.155 KM 18 Simpang Baru Panam, Pekanbaru 28293
Email: sitekin@uin-suska.ac.id
© 2023 SITEKIN, ISSN 2407-0939
