Optimizing Performance Random Forest Algorithm Using Correlation-Based Feature Selection (CFS) Method to Improve Distributed Denial of Service (DDoS) Attack Detection Accuracy

Sopian Soim; Sholihin Sholihin; Cahyo Bayu Subianto

doi:10.24014/ijaidm.v7i2.24783

Abstract

Distributed Denial of Service (DDoS) attacks are a major threat to the security of networks and online services, so effective strategies for detecting and mitigating them are needed. This research aims to improve the performance of the Random Forest algorithm in detecting DDoS attacks by applying the Correlation-Based Feature Selection (CFS) method. CFS identifies and selects the most relevant features in the dataset, here CIC-DDoS2019, and the resulting model is evaluated using accuracy, precision, recall, and F1-score. The results show that applying CFS improves DDoS attack detection with Random Forest in a complex network context, raising detection accuracy to 99.89%. This is an improvement over the previous study, which achieved an accuracy of 99.7% using the feature importance method. These findings highlight the potential of combining Random Forest with CFS to detect DDoS attacks in complex network environments and to strengthen the security of networks and online services.
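The feature selection step described in the abstract can be illustrated with a minimal sketch of the CFS merit score from Hall (1999), which rewards subsets whose features correlate with the class but not with each other. This is not the authors' implementation. It assumes Pearson correlation as the correlation measure (the abstract does not say which measure was used), a simple greedy forward search, and a toy dataset whose column names (pkt_rate, pkt_rate_dup, noise) are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(vx * vy)

def cfs_merit(columns, labels, subset):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(columns[f], labels)) for f in subset) / k
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation over all pairs.
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward_select(columns, labels, tol=1e-9):
    """Greedily add the feature that raises the merit most; stop when none does."""
    selected, best = [], 0.0
    remaining = list(columns)
    while remaining:
        score, feat = max(
            (cfs_merit(columns, labels, selected + [f]), f) for f in remaining
        )
        if score <= best + tol:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best

# Toy data (hypothetical): one informative feature, an exact duplicate of it,
# and a weakly related noise column. Labels: 1 = attack, 0 = benign.
labels = [0, 0, 0, 1, 1, 1, 0, 1]
columns = {
    "pkt_rate": [1, 2, 1, 9, 8, 9, 2, 8],
    "pkt_rate_dup": [1, 2, 1, 9, 8, 9, 2, 8],
    "noise": [5, 1, 4, 2, 5, 1, 3, 3],
}
selected, merit = cfs_forward_select(columns, labels)
```

On this toy data the search keeps only one of the two identical columns and drops the noise column, because adding a redundant or weakly relevant feature does not raise the merit. The reduced feature set would then be passed to the Random Forest classifier.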

Keywords

Accuracy; Correlation; DDoS; Feature Selection; Machine Learning; Random Forest Algorithm
