
A Novel Intrusion Detection System Using Deep Learning

2021, Advances in Intelligent Systems and Computing

The growing use of the Internet has brought a steady increase in threats and new attacks. To recognize anomalies in a network, the intrusion detection system has proven to be a significant component of secure networks. Machine learning models learn from the traffic they observe, and this property enables them to distinguish network patterns and decide whether they are normal or malicious. There is a growing demand for dependable and realistic datasets of network traffic. In this article, a comprehensive examination of the CSE-CIC-IDS2018 dataset is made. During the research, numerous issues and deficiencies in the dataset were found. The solutions to those issues led to a model different from the existing ones. The model consists of two components: principal component analysis and a deep neural network. After pre-processing the dataset, it achieved an F1-score of 0.99, making it more robust than other existing models.

Tanay Singhania, Vatsal Agarwal, Sunakshi, Prashant Shambharkar Giridhar, and Trasha Gupta

Keywords Intrusion detection system · CSE-CIC-IDS2018 · Deep neural networks · Principal component analysis · Random forest classifier · SMOTE

T. Singhania (B) · V. Agarwal · Sunakshi · T. Gupta
Department of Applied Mathematics, Delhi Technological University, New Delhi, India

P. S. Giridhar
Department of Computer Science and Engineering, Delhi Technological University, New Delhi, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
A. Khanna et al. (eds.), International Conference on Innovative Computing and Communications, Advances in Intelligent Systems and Computing 1394, https://doi.org/10.1007/978-981-16-3071-2_29
1 Introduction

Owing to the wide spectrum of applications and uses of the Internet, its structure has grown complex and undifferentiated. This poses a significant challenge for researchers: protecting the information of different institutions. Moreover, as intruders continually develop new attacks, a system is needed that detects intrusions on day zero. Although most systems rely on a firewall for detection, a single layer of defense is no longer sufficient to counter every attack. The intrusion detection system therefore increases data security by serving as a secondary layer of defense [1].

An intrusion detection system (IDS) attempts to secure a system when there is a threat of illegal access to its data. It blocks unauthorized requests and reports intrusions to the concerned authority [1]. Existing IDSs fall into two broad classes according to their technique: anomaly-based and signature-based. A signature-based IDS recognizes only previously captured attacks and cannot recognize new ones. An anomaly-based IDS is harder to implement, but because it can detect even new attacks it is widely deployed.

Gathering data for intrusion detection has been a challenge, since most datasets are internal and therefore not readily accessible. Also, as systems and traffic patterns evolve, a realistic dataset is hard to build.

In this paper, Sect. 2 discusses previous work on intrusion detection systems, and Sect. 3 provides a theoretical review of intrusion detection systems along with the need for them today. Our experimental details follow in Sect. 4. Section 5 contains the results of our experiments, followed by our conclusion in Sect. 6.
Section 7 outlines directions for future work, followed by the references.

2 Literature Review

Abdulraheem and Ibraheem [2] provided a detailed analysis of what was then the most recent dataset, CICIDS2017. The paper addressed the need for a comprehensive and reliable dataset that could serve as a basis for research. The thirty-six features that gave the best results in terms of ROC curves, loss, F1-score, and accuracy were extracted from the dataset. They were compared with 23 features extracted in the literature, and the results showed that the 36 features performed on par with the 23 [2].

Shah et al. [3] studied several intrusion detection methods that adopt an artificial neural network (ANN). They covered various simple and hybrid techniques, the latter combining more than one technique in a single model. It was observed that the available models each address one or another aspect of an algorithm's performance; hence, for future work, a complex model that could cover the majority of the efficiency criteria was proposed [3].

Kanimozhi et al. [4] proposed a model to identify botnet attacks using the CSE-CIC-IDS2018 dataset. The purpose of the research was to examine the efficacy of various classifiers in classifying botnet attacks. The results indicate that the artificial intelligence technique gives the most favorable outcomes [4].

Karatas et al. [1] surveyed various intrusion detection techniques that use deep learning, making a comparative study of several deep learning techniques from the literature.

Zhou et al. [5] discussed six machine learning techniques for zero-day identification of attacks.
The CIC-AWS-2018 dataset was employed, and time overhead, precision, F1 value, and recall were chosen as the criteria for analyzing the results [5].

3 Intrusion Detection System

A software tool that tries to secure a network, or to flag it when there is a threat to steal information, is termed an intrusion detection system. It blocks unauthorized requests and reports intrusions to the concerned authority [1]. Although many different tools have been available to prevent network intrusion, none of them could cover more than a handful of the many network intrusion techniques. An IDS based on machine learning is adopted to boost security and counter the latest techniques that hackers use to break into a system. The complexity of current network arrangements makes them prone to errors, which act as a stairway for hackers to breach security. To counter the latest defense mechanisms, new types of attacks are constantly being developed. In such an environment, a dynamically improving defense tool is needed, one that uses learning or update mechanisms [1].

4 Experiment

The experiment is conducted on the latest accessible dataset, CSE-CIC-IDS2018, as discussed below. It is performed on the AI Platform on Google Cloud Platform, which provides the computation power used to train the model. As in [6], the AI Platform on Google Cloud offers computation that is more than sufficient for this research.
Table 1 Types of attack and their count

Attack type                                        Count
Benign                                             13,484,708
Distributed Denial of Service attack-HOIC          686,012
Denial of Service attacks-Hulk                     461,912
Denial of Service attacks-SlowHTTPTest             139,890
Denial of Service attacks-GoldenEye                41,508
Denial of Service attacks-Slowloris                10,990
Distributed Denial of Service attack-LOIC-UDP      1,730
Distributed Denial of Service attacks-LOIC-HTTP    576,191
FTP-BruteForce                                     193,360
SSH-BruteForce                                     187,589
Infiltration                                       161,934
BruteForce Web                                     611
BruteForce XSS                                     230
Bot                                                286,191
Label                                              59
Structured Query Language Injection                87

4.1 Dataset

Collecting data for intrusion detection has been a challenge, since the majority of datasets are internal and hence not readily available, or are heavily anonymized and thus do not reflect current patterns. As networks and trends evolve, a realistic dataset is difficult to build. CSE-CIC-IDS2018, a dataset provided by the Canadian Institute for Cybersecurity [7], was generated by 420 machines and 30 servers, capturing the system logs and network traffic of each machine, which makes it a reliable and realistic dataset. This data was then used to extract 80 features via CICFlowMeter-V3 [8]. Table 1 lists all the attack types that the machines were subjected to and their record counts. The dataset consists of one CSV file for each day on which attacks were carried out, along with timestamps.

4.2 Architecture

The architecture comprises three major components: data pre-processing, feature reduction, and deep learning.

Data Pre-processing. This stage transforms the raw data from 80 features to 35 useful features [9]. The process is visualized in Fig. 2 and comprises the following steps.

Removing Null Values. The dataset contains null values, and since ML algorithms do not accept null values, they are removed.
Balancing of Dataset. As shown in Table 1, there is a wide gap in count between 'Benign' and 'SQL Injection', which is overcome by synthetically up-sampling the minority classes with SMOTE to obtain a balanced dataset. Moreover, all rows whose outcome label is 'Label' are removed, as they are meaningless.

Feature Selection. To obtain the relative importance of the features in the dataset, a random forest is applied, as it yields an importance score for each feature from its splitting criteria. All features whose importance falls below a threshold of 0.001 are removed, as they provide no useful information for prediction. This yields a dataset with 54 features. Furthermore, the correlation between features is taken into consideration, and highly correlated features are removed because they are redundant for prediction with machine learning algorithms. At the end of this step, the dataset is reduced to 35 useful features [10].

Feature Reduction. Principal component analysis is a traditional feature extraction technique that is widely adopted for dimensionality reduction. It reduces the dimensionality of the dataset by projecting correlated attributes onto a smaller set of uncorrelated components that retain most of the variance in the data [11]. Principal component analysis is applied to the pre-processed dataset and transforms the data into 16 mutually uncorrelated components. The transformed dataset is then fed to the deep neural network, where the data passes through 10 fully connected layers that output one of the 15 possible outcomes [12].

Deep Learning. The decision-making algorithm that a deep neural network uses is holistic: it depends on the aggregate of the entire input pattern [9].
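The correlation filtering and principal component analysis steps described above can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names, the correlation cutoff of 0.9, and the synthetic data are assumptions, and the random forest importance step is omitted. The 0.95 variance target is the value reported in Sect. 5.

```python
import numpy as np


def drop_correlated(X, threshold=0.9):
    """Return indices of columns to keep, dropping any column that is
    highly correlated with an earlier kept column.
    The 0.9 cutoff is an assumed value; the paper does not state one."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep


def pca_reduce(X, variance=0.95):
    """Project X onto the fewest principal components whose cumulative
    explained variance reaches `variance`."""
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0)
    std[std == 0] = 1.0            # guard against constant columns
    Xs = Xc / std                  # standardize each feature
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    k = int(np.searchsorted(np.cumsum(explained), variance)) + 1
    k = min(k, len(S))
    return Xs @ Vt[:k].T, explained[:k]


# Synthetic stand-in for the pre-processed flow features (hypothetical data).
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 4))
duplicate = 2.0 * base[:, :1] + 0.01 * rng.normal(size=(500, 1))
X = np.hstack([base, duplicate])   # column 4 nearly duplicates column 0

keep = drop_correlated(X)          # the redundant column is filtered out
Z, ratio = pca_reduce(X[:, keep])  # Z holds the uncorrelated components
print(keep, Z.shape, round(float(ratio.sum()), 3))
```

On the real dataset, `X` would be the 35 selected features, and the returned components would be the input to the neural network.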
The structure may have several 'layers' of neurons, and the overall structure may be either feedback or feedforward. A deep neural network has certain advantages over other machine learning techniques. The biggest is that it learns features by itself. Another major benefit is that even if a component of the neural network fails, the network can still produce useful results owing to its parallel structure. Each layer uses the rectified linear unit (ReLU) activation function, as it mitigates the vanishing gradient problem and allows the neural net to learn faster. Figure 1 shows a summary of the complete neural network, and Fig. 2 shows the flow of the proposed network.

The optimizer adjusts the weight of every edge of the network to reduce the value of the loss function toward an optimum; its goal is to find the global minimum of the loss function. The proposed model uses the Adaptive Moment Estimation (ADAM) optimizer, which computes a separate learning rate for every parameter. The main advantages of ADAM are that it does not need heavy computational power and has minimal memory requirements.

To compute the error of the model during optimization, a loss function must be chosen. This can be difficult, as the function must capture the characteristics of the problem and be driven by the concerns that matter. In the proposed model, multi-label margin loss is used. It allows a variable number of target classes per sample. The model itself is a multi-class classifier optimized with this loss.

Fig. 1 Summary of neural network

Fig. 2 Flow of the proposed network

5 Results

The proposed model achieves an accuracy of 0.9845776241574671, or 98.45%, on the above-mentioned dataset. Figure 3 shows the loss values of the deep neural network when multi-label margin loss was used as the loss function. ADAM was used as the optimizer, and the model converges after a few hundred iterations.
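To make the choice of loss concrete, the following sketch evaluates a multi-label margin loss for a single sample in plain Python. It assumes the definition used by PyTorch's `MultiLabelMarginLoss` (the paper does not name the framework it used), and the function name and example scores are illustrative only. Under this definition, every target class score is pushed at least one unit above every non-target score.

```python
def multilabel_margin_loss(scores, targets):
    """Multi-label margin loss for one sample.

    scores  : list of raw class scores, one per class
    targets : indices of the true class(es) for this sample
    Each (target, non-target) pair contributes
    max(0, 1 - (score[target] - score[other])),
    and the sum is divided by the number of classes.
    """
    target_set = set(targets)
    total = 0.0
    for t in target_set:
        for i, s in enumerate(scores):
            if i not in target_set:
                total += max(0.0, 1.0 - (scores[t] - s))
    return total / len(scores)


# Two target classes out of four (a variable-sized target set).
print(round(multilabel_margin_loss([0.1, 0.2, 0.4, 0.8], [3, 0]), 4))

# Single target class, as in a 15-way intrusion classifier:
# a confident, correct prediction incurs no loss.
confident = [0.0] * 15
confident[2] = 5.0
print(multilabel_margin_loss(confident, [2]))
```

The first call evaluates to 0.85 and the second to 0.0. With exactly one target per sample, as in this 15-class problem, the loss penalizes every wrong class whose score comes within one unit of the true class score.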
After training completed, the loss of our model was observed to be 0.030820682644844055. Table 2 reports precision, recall, and F1-score for each label of the dataset, and Table 3 summarizes Table 2 with the macro and weighted averages of precision, recall, and F1-score. Principal component analysis was used with variance 0.95, as it gave the best results among the values tried. Table 4 lists the accuracy achieved by various researchers on this dataset using various machine learning and deep learning techniques. From this table, one can see that the highest previously reported accuracy is 98.37%, by Hua [13].

Fig. 3 Model loss

Table 2 Classification report

Classes                    Precision  Recall  F1-score  Support
Benign                     1.00       1.00    1.00      820,405
Infiltration               0.00       0.00    0.00      1399
Bot                        1.00       1.00    1.00      75,864
DoS Attack-GoldenEye       0.99       1.00    1.00      13,079
DoS Attack-Slowloris       0.98       0.98    0.98      3314
DDoS Attack-LOIC-HTTP      1.00       1.00    1.00      185,926
DDoS Attack-LOIC-UDP       0.78       0.94    0.85      571
DDoS Attack-HOIC           1.00       1.00    1.00      226,829
FTP-BruteForce             0.72       0.89    0.79      62,391
SSH-BruteForce             1.00       1.00    1.00      60,475
DoS Attack-SlowHTTPTest    0.77       0.52    0.62      45,059
DoS Attack-Hulk            1.00       1.00    1.00      151,092
BruteForce-Web             0.76       0.17    0.27      132
BruteForce-XSS             1.00       0.54    0.70      5
SQL Injection              1.00       1.00    1.00      820,821

Table 3 Classification report summary

                 Precision  Recall  F1    Support
Accuracy                            0.99  2,467,411
Macro-avg.       0.87       0.80    0.81  2,467,411
Weighted avg.    0.99       0.99    0.99  2,467,411

Table 4 Performance achieved by different researchers on CSE-CIC-IDS2018

Researcher             Dataset            Accuracy (%)
D'hooge et al. [14]    CSE-CIC-IDS2018    96
Lin et al. [15]        CSE-CIC-IDS2018    96.20
Chadza et al. [16]     CSE-CIC-IDS2018    97
Ferrag et al. [17]     CSE-CIC-IDS2018    97.38
Zhao et al. [18]       CSE-CIC-IDS2018    97.90
Hua [13]               CSE-CIC-IDS2018    98.37
The proposed model     CSE-CIC-IDS2018    98.45
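The F1 averages in Table 3 can be recomputed from the per-class rows of Table 2, which also shows why the macro average is so much lower than the weighted one. The sketch below is a worked calculation rather than the authors' evaluation code; the variable names are illustrative, and the rows assume that the first two sets of values in Table 2 belong to Benign and Infiltration respectively.

```python
# (class, F1-score, support) transcribed from Table 2
rows = [
    ("Benign", 1.00, 820405),
    ("Infiltration", 0.00, 1399),
    ("Bot", 1.00, 75864),
    ("DoS Attack-GoldenEye", 1.00, 13079),
    ("DoS Attack-Slowloris", 0.98, 3314),
    ("DDoS Attack-LOIC-HTTP", 1.00, 185926),
    ("DDoS Attack-LOIC-UDP", 0.85, 571),
    ("DDoS Attack-HOIC", 1.00, 226829),
    ("FTP-BruteForce", 0.79, 62391),
    ("SSH-BruteForce", 1.00, 60475),
    ("DoS Attack-SlowHTTPTest", 0.62, 45059),
    ("DoS Attack-Hulk", 1.00, 151092),
    ("BruteForce-Web", 0.27, 132),
    ("BruteForce-XSS", 0.70, 5),
    ("SQL Injection", 1.00, 820821),
]

# Macro average: every class counts equally, regardless of its size.
macro_f1 = sum(f1 for _, f1, _ in rows) / len(rows)

# Weighted average: each class counts in proportion to its support.
total = sum(n for _, _, n in rows)
weighted_f1 = sum(f1 * n for _, f1, n in rows) / total

print(f"macro F1 = {macro_f1:.2f}, weighted F1 = {weighted_f1:.2f}")
```

The macro average treats all 15 classes equally, so poorly detected classes such as Infiltration (F1 of 0.00) and BruteForce-Web (0.27) pull it down to 0.81, while the weighted average is dominated by the large, well-classified classes and stays at 0.99.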
The proposed model in this research achieved 98.45% accuracy. Although the accuracies are similar, the proposed model is more robust, as it uses 16 components derived from 35 features rather than using only 10 features and neglecting the rest, and it achieves an F1-score of 0.99.

6 Conclusion

Previously, most research on intrusion detection systems focused on the CICIDS2017 dataset. The CSE-CIC-IDS2018 dataset is the updated version of the 2017 dataset. There is a remarkable difference in sample size between the two. The 2017 dataset contains very few samples of BOT, Inf, Web Attack, and BF. The number of samples in the 2018 dataset is substantially larger, especially for the above-mentioned attacks. To make up for these deficiencies in the CICIDS2017 dataset, extensive oversampling was performed, which led to a semi-balanced dataset with redundant data points [19]. As a result, an over-trained model was developed.

Apart from the difference in dataset, a more complex neural net was adopted. Earlier, a multi-layer perceptron (MLP) with only three layers, each with few nodes, was used; in comparison, the deep neural net here uses eight hidden layers with a substantially larger number of nodes in each layer for more thorough training of the model. Thus, a more structured and complex model was developed to handle noise and capture the characteristics of the dataset. Going a step further, principal component analysis was applied for feature extraction, reducing the dimensionality of the dataset. This led to a model with an F1-score of 0.99.

7 Future Work

In this research, the model is trained and tested on the same dataset. However, the efficiency and accuracy of the model will be better validated when we test it on a different dataset, which might be the earlier one.
Apart from this, some different feature selection or extraction techniques can be employed and examined for better results.

References

1. G. Karatas, O. Demir, O.K. Sahingoz, Deep learning in intrusion detection systems, in 2018 International Congress on Big Data, Deep Learning and Fighting Cyber Terrorism (IBIGDELFT) (IEEE, 2018)
2. M.H. Abdulraheem, N.B. Ibraheem, A detailed analysis of new intrusion detection dataset. J. Theor. Appl. Inf. Technol. 97(17) (2019)
3. B. Shah, B.H. Trivedi, Artificial neural network-based intrusion detection system: a survey. Int. J. Comput. Appl. 39(6), 13–18 (2012)
4. V. Kanimozhi, T. Prem Jacob, Calibration of various optimized machine learning classifiers in network intrusion detection system on the realistic cyber dataset CSE-CIC-IDS2018 using cloud computing. Int. J. Eng. Appl. Sci. Technol. 4(6), 2455–2143 (2019)
5. Q. Zhou, D. Pezaros, Evaluation of machine learning classifiers for zero-day intrusion detection: an analysis on CIC-AWS-2018 dataset. arXiv:1905.03685 (2019)
6. K. Sharma, et al., Hiding data in images using cryptography and deep neural network. J. Artif. Intell. Syst. 1, 143–162. https://doi.org/10.33969/AIS.2019.11009
7. I. Sharafaldin, A.H. Lashkari, A.A. Ghorbani, Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP (2018)
8. N. Chaabouni, et al., Network intrusion detection for IoT security based on learning techniques. IEEE Commun. Surv. Tutor. 21(3), 2671–2701 (2019)
9. U.R. Acharya, et al., Classification of heart rate data using artificial neural network and fuzzy equivalence relation. Pattern Recogn. 36(1), 61–68 (2003)
10. J. Arora, et al., Ensemble feature selection method based on recently developed nature-inspired algorithms, in International Conference on Innovative Computing and Communications (Springer, Singapore, 2020)
11. D. Gupta, et al., Usability feature extraction using modified crow search algorithm: a novel approach. Neural Comput. Appl. 1–11 (2018)
12. T. Chen, et al., Road marking detection and classification using machine learning algorithms, in 2015 IEEE Intelligent Vehicles Symposium (IV) (IEEE, 2015)
13. Y. Hua, An efficient traffic classification scheme using embedded feature selection and lightgbm, in 2020 Information Communication Technologies Conference (ICTC) (IEEE, 2020)
14. L. D'hooge, et al., Inter-dataset generalization strength of supervised machine learning methods for intrusion detection. J. Inf. Secur. Appl. 54, 102564 (2020)
15. P. Lin, K. Ye, C.-Z. Xu, Dynamic network anomaly detection system by using deep learning techniques, in International Conference on Cloud Computing (Springer, Cham, 2019)
16. T. Chadza, K.G. Kyriakopoulos, S. Lambotharan, Contemporary sequential network attacks prediction using hidden Markov model, in 2019 17th International Conference on Privacy, Security and Trust (PST) (IEEE, 2019)
17. M.A. Ferrag, et al., Deep learning for cyber security intrusion detection: approaches, datasets, and comparative study. J. Inf. Secur. Appl. 50, 102419 (2020)
18. F. Zhao, et al., A semi-self-taught network intrusion detection system. Neural Comput. Appl. (2020)
19. V. Agarwal, Sunakshi, R. Medhashree, T. Singh, R.M. Singari, Study and applications of fuzzy systems in domestic products, in Advances in Manufacturing and Industrial Engineering, ed. by R.M. Singari, K. Mathiyazhagan, H. Kumar. Lecture Notes in Mechanical Engineering (Springer, Singapore, 2021). https://doi.org/10.1007/978-981-15-8542-5_7