Al-Bahir Journal for Engineering and Pure Sciences
Abstract
Web attacks pose a serious threat to cybersecurity, making robust classification models for early detection and prevention an urgent need. The security of networked systems is now a critical global concern for individuals, businesses, and governments. Attacks on networked systems are becoming far more frequent, and attackers' strategies are constantly evolving. Intrusion detection is one defense against these attacks, and machine learning is a popular and effective method for building intrusion detection systems (IDS). More representative and discriminative features greatly enhance an IDS's performance. In this work, we conduct a thorough review of feature selection and classification methods for classifying web attacks.
This work uses Decision Tree, Random Forest, and Logistic Regression classifiers and evaluates their accuracy on a phishing website dataset. To improve the performance of the selected classifiers, feature selection methods, including Recursive Feature Elimination (RFE) and the Chi-square test, are used to identify the most relevant features for classification. Feature selection is a key technique for dimensionality reduction and classification in high-dimensional datasets: during the feature selection procedure, only the most relevant attributes of the dataset are retained.
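The chi-square filtering idea described above can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions: the feature names and the eight toy samples are hypothetical, not taken from the phishing website dataset, and the helper names `chi_square_score` and `select_top_k` are introduced here for the example. It scores each categorical feature with the Pearson chi-square statistic against the class label and keeps the highest-scoring ones; RFE, which repeatedly refits a model and discards the weakest feature, is not shown.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic for one categorical feature vs. the class label."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # joint counts per (value, class)
    feature_totals = Counter(feature)         # how often each feature value occurs
    label_totals = Counter(labels)            # how often each class occurs
    score = 0.0
    for value, value_count in feature_totals.items():
        for cls, cls_count in label_totals.items():
            # Count expected if the feature and the class were independent.
            expected = value_count * cls_count / n
            score += (observed[(value, cls)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-square score."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data (assumption): 1 = phishing, 0 = legitimate.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "has_ip_in_url": [1, 1, 1, 0, 0, 0, 0, 0],
    "uses_https":    [0, 1, 0, 1, 0, 1, 0, 1],
    "long_url":      [1, 1, 0, 1, 0, 1, 0, 0],
}

selected = select_top_k(columns, labels, k=2)
print(selected)
```

In this toy case `uses_https` is distributed identically across both classes, so it scores zero and is dropped, while the two features that co-vary with the label are retained.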
We assess the performance of the Logistic Regression, Decision Tree, and Random Forest models using metrics such as accuracy and classification reports to determine their effectiveness in classifying web attacks. Our findings provide insights into the effectiveness of different feature selection and classification techniques for web attack classification, contributing to the advancement of cybersecurity research.
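For readers unfamiliar with the evaluation metrics named above, the following minimal sketch shows how accuracy and the per-class figures that make up a typical classification report (precision, recall, F1, support) can be computed. The label vectors are hypothetical toy values, not results from this study, and the function names `accuracy` and `per_class_report` are assumptions made for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_report(y_true, y_pred):
    """Precision, recall, F1 and support for every class, as in a classification report."""
    report = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[cls] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,  # number of true samples of this class
        }
    return report


# Hypothetical toy labels (assumption): 1 = phishing, 0 = legitimate.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

acc = accuracy(y_true, y_pred)
report = per_class_report(y_true, y_pred)
print(acc)
for cls, row in report.items():
    print(cls, row)
```

Reporting per-class precision and recall alongside accuracy matters for attack detection because a single accuracy figure can hide a classifier that misses most samples of a minority class.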
Recommended Citation
Ali, Zaman Ahmed Razak Abdul (2025) "A Comprehensive Analysis of Feature Selection and Classification Techniques for Web Attack Classification," Al-Bahir Journal for Engineering and Pure Sciences: Vol. 6: Iss. 2, Article 9.
Available at: https://doi.org/10.55810/2313-0083.1091