Implementation and Analysis of the Optimal Flexible Frequency Discretization (OFFD) Method to Minimize Classification Error in Naïve Bayes Classification

Dita Martha Pratiwi, Warih Maharani, Intan Nurma Yunita

Abstract


Naïve Bayes is a data mining classification technique that applies Bayes' theorem and gives optimal results when the attributes in a dataset are independent of one another. In practice, a dataset usually contains both numeric and nominal attributes that depend on one another, so treating them as independent can cause classification errors. A method is therefore needed to reduce the error rate, and discretization is one such strategy. Discretization maps the numeric values (X) of an attribute onto intervals that act as nominal values (X*); the frequency set for each interval determines how many intervals are formed for that numeric attribute.
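As an illustration of the frequency-based mapping from X to X* described above, the following is a minimal sketch of equal-frequency discretization. The function names (`equal_frequency_cut_points`, `discretize`) and the sample values are hypothetical and are not taken from the paper; the sketch assumes a single fixed interval frequency, whereas OFFD searches for that frequency.

```python
from bisect import bisect_right


def equal_frequency_cut_points(values, interval_frequency):
    """Return cut points so that each interval holds about
    `interval_frequency` training values (the last interval may hold more)."""
    ordered = sorted(values)
    cuts = []
    # Step through the sorted values one interval at a time, stopping early
    # enough that the final interval still has at least `interval_frequency` values.
    for i in range(interval_frequency, len(ordered) - interval_frequency + 1,
                   interval_frequency):
        # Only cut between two distinct values, so equal values share an interval.
        if ordered[i - 1] != ordered[i]:
            cuts.append((ordered[i - 1] + ordered[i]) / 2)
    return cuts


def discretize(value, cuts):
    """Map a numeric value X to the index of its interval (the nominal X*)."""
    return bisect_right(cuts, value)


values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
cuts = equal_frequency_cut_points(values, interval_frequency=3)
print(cuts)                                  # [3.5, 6.5]
print([discretize(v, cuts) for v in values])  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

With nine values and an interval frequency of 3, the attribute is split into three intervals; a larger frequency yields fewer intervals, and a smaller one yields more.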

The discretization method adopted in this research is Optimal Flexible Frequency Discretization (OFFD), a supervised, wrapper-based method that uses sequential search and supports incremental learning. The method first applies wrapper feature selection to obtain the optimal attributes based on the F-measure (fMeasure). The resulting dataset is then discretized by sequentially searching for the minimum frequency of each interval. The test results show that OFFD is influenced by the Best First Search attribute selection step of the wrapper feature selection, which in turn affects the reduction in the error value.
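The idea of a wrapper-based sequential search over interval frequencies can be sketched as follows. This is an illustrative simplification under stated assumptions, not the authors' implementation: it scores each candidate frequency by the leave-one-out error of a small categorical Naïve Bayes classifier with Laplace smoothing, omits the feature selection step, and uses hypothetical names (`cut_points`, `nb_predict`, `loo_error`, `search_interval_frequencies`) and toy data.

```python
import math
from bisect import bisect_right
from collections import Counter


def cut_points(values, freq):
    """Equal-frequency cut points with about `freq` values per interval."""
    ordered = sorted(values)
    return [(ordered[i - 1] + ordered[i]) / 2
            for i in range(freq, len(ordered) - freq + 1, freq)
            if ordered[i - 1] != ordered[i]]


def nb_predict(train_rows, train_labels, row):
    """Categorical Naive Bayes prediction with Laplace smoothing."""
    class_counts = Counter(train_labels)
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        n_c = class_counts[label]
        score = math.log(n_c / len(train_labels))
        for j, v in enumerate(row):
            match = sum(1 for r, y in zip(train_rows, train_labels)
                        if y == label and r[j] == v)
            n_values = len({r[j] for r in train_rows} | {v})
            score += math.log((match + 1) / (n_c + n_values))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def loo_error(rows, labels, freqs):
    """Leave-one-out error when attribute j uses interval frequency freqs[j].
    Simplification: cut points are computed once on all rows."""
    cuts = [cut_points([r[j] for r in rows], f) for j, f in enumerate(freqs)]
    binned = [tuple(bisect_right(c, v) for c, v in zip(cuts, r)) for r in rows]
    wrong = sum(
        nb_predict(binned[:i] + binned[i + 1:], labels[:i] + labels[i + 1:],
                   binned[i]) != labels[i]
        for i in range(len(binned)))
    return wrong / len(rows)


def search_interval_frequencies(rows, labels, candidates):
    """Try the candidate frequencies in order for each attribute and keep
    a candidate only when it lowers the wrapper's error estimate."""
    freqs = [candidates[0]] * len(rows[0])
    best = loo_error(rows, labels, freqs)
    for j in range(len(freqs)):
        for f in candidates:
            trial = freqs[:j] + [f] + freqs[j + 1:]
            err = loo_error(rows, labels, trial)
            if err < best:
                freqs, best = trial, err
    return freqs, best


# Toy data: one numeric attribute; values 1-6 are class "a", 7-12 class "b".
rows = [(float(v),) for v in range(1, 13)]
labels = ["a"] * 6 + ["b"] * 6
print(search_interval_frequencies(rows, labels, candidates=[1, 6]))  # ([6], 0.0)
```

On this toy data, an interval frequency of 1 places every value in its own interval, so the classifier has no evidence for a held-out value and misclassifies it, while a frequency of 6 separates the two classes cleanly. The search therefore keeps 6.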

 

Keywords: wrapper-based, feature selection, discretization, sequential search, Naïve Bayes, Optimal Flexible Frequency Discretization, interval frequency







FAKULTAS SAINS DAN TEKNOLOGI
UIN SUSKA RIAU

Kampus Raja Ali Haji
Gedung Fakultas Sains & Teknologi UIN Suska Riau
Jl.H.R.Soebrantas No.155 KM 18 Simpang Baru Panam, Pekanbaru 28293
Email: sntiki@uin-suska.ac.id