Application of the Okapi BM25 Model in an Information Retrieval System

Rizqa Raaiqa Bintana, Surya Agustian

Abstract


The rapid growth of information and digital documents has made retrieving desired documents manually increasingly difficult. An information retrieval system is therefore needed as a search engine that finds the documents relevant to a user's need. This research develops an information retrieval system based on the Okapi BM25 model. Building the system involves two main steps: preprocessing and applying the model. Preprocessing consists of collecting the documents (corpus), tokenization, linguistic preprocessing, and indexing by creating an inverted index. After preprocessing, the Okapi BM25 model is applied to compute a relevance score between each document and the user's query. The system returns a list of documents sorted in descending order of relevance score. System performance is measured by calculating precision and recall. Experiments on several queries show good performance, with precision of about 74%-92% and recall of 100%. The system is portable, so it can be deployed on a local server to help an organization find documents within its own collection.
Keywords: Information Retrieval, Okapi BM25, Precision, Recall
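
The pipeline summarized in the abstract (tokenization, inverted index, BM25 scoring, ranked output, and precision/recall evaluation) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the sample documents, the parameter values k1 = 1.2 and b = 0.75, and the non-negative IDF variant log(1 + (N - df + 0.5) / (df + 0.5)) are all assumptions chosen for illustration, and linguistic preprocessing (stemming, stop-word removal) is omitted.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (no stemming or stop words)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index (term -> {doc_id: term frequency}) and a doc-length table."""
    index = defaultdict(dict)
    lengths = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        lengths[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, lengths


def bm25_rank(query, index, lengths, k1=1.2, b=0.75):
    """Score documents against the query with BM25 and return them in descending order.

    k1 and b are commonly used defaults, assumed here; the abstract does not state
    which values the authors used.
    """
    n_docs = len(lengths)
    avg_len = sum(lengths.values()) / n_docs
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in the collection
        df = len(postings)
        # Non-negative IDF variant (an assumption, not taken from the paper).
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings.items():
            # Document-length normalization of the term frequency.
            norm = k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / (tf + norm)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical three-document corpus, for illustration only.
    docs = {
        "d1": "okapi bm25 ranking model",
        "d2": "boolean retrieval model",
        "d3": "cooking recipes",
    }
    index, lengths = build_index(docs)
    ranked = bm25_rank("bm25 model", index, lengths)
    print(ranked)
    print(precision_recall([doc_id for doc_id, _ in ranked], ["d1"]))
```

In this sketch only documents that contain at least one query term are retrieved. That is one way a system can reach full recall while precision stays below 100%: every relevant document shares a term with the query, but so do some non-relevant ones.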

FAKULTAS SAINS DAN TEKNOLOGI
UIN SUSKA RIAU

Kampus Raja Ali Haji
Gedung Fakultas Sains & Teknologi UIN Suska Riau
Jl. H.R. Soebrantas No. 155 KM 18 Simpang Baru Panam, Pekanbaru 28293
Email: sntiki@uin-suska.ac.id