
Deep Learning Approach for Network Intrusion Detection in Software Defined Networking

(Invited Paper)
Tuan A. Tang∗, Lotfi Mhamdi∗, Des McLernon∗, Syed Ali Raza Zaidi∗ and Mounir Ghogho∗†
∗School of Electronic and Electrical Engineering, The University of Leeds, Leeds, UK.
†International University of Rabat, Morocco.

Email: [email protected], [email protected], [email protected], [email protected] and [email protected].

Abstract—Software Defined Networking (SDN) has recently emerged as one of the most promising solutions for the future Internet. With the logical centralization of controllers and a global network overview, SDN gives us a chance to strengthen our network security. However, SDN also brings a dangerous increase in potential threats. In this paper, we apply a deep learning approach to flow-based anomaly detection in an SDN environment. We build a Deep Neural Network (DNN) model for an intrusion detection system and train the model with the NSL-KDD Dataset. In this work, we use just six basic features (which can be easily obtained in an SDN environment) taken from the forty-one features of the NSL-KDD Dataset. Through experiments, we confirm that the deep learning approach shows strong potential to be used for flow-based anomaly detection in SDN environments.

Index Terms—software defined networking; SDN; intrusion detection; deep learning; network security
I. INTRODUCTION

Traditional network architecture has remained mostly unchanged over the past few decades and has proved to be cumbersome. Software Defined Networking (SDN) is an emerging architecture that is dynamic, manageable, cost-effective and adaptable, making it ideal for the high-bandwidth, dynamic nature of today's applications. This architecture decouples the network control and forwarding functions, enabling the network control to become directly programmable and the underlying infrastructure to be abstracted for applications and network services [1]. SDN and OpenFlow [2] are increasingly attracting researchers from both academia and industry. The advantages of SDN in various scenarios (e.g., the enterprise, the datacenter, etc.) and across various backbone networks have already been proven (e.g., Google B4 [3] and the Huawei carrier network [4]). Various SDN controllers have been proposed by open-source organizations and commercial corporations, such as NOX [5], Ryu [6] and Floodlight [7], and several commercial vendors support OpenFlow in their hardware switches (e.g., HP, Pronto, Cisco, Dell, Intel, NEC and Juniper). The OpenFlow protocol is the first standard communication interface defined between the control and forwarding layers of the SDN architecture. OpenFlow uses the concept of a flow to identify network traffic and records flow information in counters. A flow is a group of IP packets with some common properties passing a monitoring point in a specified time interval.

Although the capabilities of SDN (e.g., software-based traffic analysis, logically centralized control, a global view of the network, and dynamic updating of forwarding rules) make it easy to detect and react to network attacks, the separation of the control and data planes introduces new attack opportunities, and so SDN itself may be the target of some attacks. According to Kreutz et al. [8], there are seven threat vectors in SDN; three of them are specific to SDN and relate to the controller, the control-data interface and the control-application interface. A Network Intrusion Detection System (NIDS) protects a network from malicious attacks. Traditionally, there are two types of NIDS, according to the strategy used to detect network attacks. The first, signature-based detection, compares new data with a knowledge base of known intrusions. Although this approach cannot recognize new attacks, it remains the most popular approach in commercial intrusion detection systems. The second, anomaly-based detection, compares new data with a model of normal user behavior and marks any significant deviation from this model as an anomaly, usually by means of machine learning. As a result, this approach can detect zero-day attacks that have never been seen before. Anomaly-based detection is usually combined with flow-based traffic monitoring in NIDSs. Flow-based monitoring relies on information in packet headers, so flow-based NIDSs have to handle a considerably lower amount of data than payload-based NIDSs. Machine learning has been used successfully in many areas of computer science, such as face detection and speech recognition, but much less so in intrusion detection; in [9], Robin Sommer and Vern Paxson discuss many factors that limit the use of machine learning for network intrusion detection. Recently, deep learning has emerged and achieved real successes, and it has been used extensively for voice, face and image recognition. Deep learning is capable of automatically finding correlations in the data, so it is a promising method for the next generation of intrusion detection: it can be used to detect zero-day attacks efficiently and thereby achieve a high detection rate.

Based on the flow-based nature of SDN, we propose a flow-based anomaly detection system using deep learning. In this paper, we apply a Deep Neural Network (DNN) and use it as the NIDS model in an SDN context. We train and evaluate the model using the NSL-KDD Dataset [10].

Through experiments, we find an optimal hyper-parameter setting for the DNN and report the resulting detection rate and false alarm rate. The model achieves an accuracy of 75.75%, which is quite reasonable given that only six basic network features are used.

The rest of the paper is organized as follows. Related work is introduced in Section II. In Section III, we give a brief introduction to deep learning, our deep learning model and the NSL-KDD Dataset. The performance of the model is analyzed in Section IV. Finally, we draw conclusions and propose some future work in Section V.
II. PREVIOUS WORK

Flow-based intrusion detection is extensively researched nowadays. In [11], the authors propose a flow-based anomaly detection system based on a Multi-Layer Perceptron and a Gravitational Search Algorithm. The system can classify benign and malicious flows with a very high accuracy rate. In [12], the authors proposed an NIDS using a one-class support vector machine for their analysis and obtained a low false alarm rate; in contrast to other work, the system is trained with a malicious rather than a benign network dataset. Intrusion detection mechanisms in traditional networks have been widely studied and can be applied to SDN. To secure the OpenFlow network, many anomaly detection algorithms have also been implemented in the SDN environment. Using the programmability of SDN, the authors of [13] show that a programmable home network router can provide the ideal platform and location in the network for detecting security problems in a SOHO (Small Office/Home Office) network. Four prominent traffic anomaly detection algorithms (threshold random walk with credit-based rate limiting (TRW-CB), rate limiting, maximum entropy detector and NETAD) are implemented in an SDN context using OpenFlow-compliant switches and a NOX controller. Experiments indicate that these algorithms are significantly more accurate in identifying malicious activities in the SOHO network than in the ISP (Internet Service Provider) network, and that the anomaly detector can work at line rates without introducing any new performance overhead for the home network traffic.

The SDN architecture is a target for many kinds of attacks, and potential Distributed Denial of Service (DDoS) attack vulnerabilities exist across the SDN platform. DDoS attacks are an attempt to make a machine or network resource unavailable to its intended users; they are typically launched by two or more people or bots. For example, an attacker can take advantage of the characteristics of SDN to launch DDoS attacks against the control layer, the infrastructure layer and the application layer of SDN. DDoS attacks are easy to launch but difficult to guard against. With the development of the Internet, DDoS is growing substantially. One major reason is the emergence and development of botnets, networks formed by bots, i.e. machines compromised by malware. According to the Q1 2016 “State of the Internet / Security” report by Akamai [14], the number of DDoS attacks increased by 125.36% in Q1 2016 compared to the number of attacks in Q1 2015.

A lightweight method for DDoS attack detection based on traffic flow features is presented in [15], with extraction of a 6-tuple of features: Average of Packets per flow (APf), Average of Bytes per flow (ABf), Average of Duration per flow (ADf), Percentage of Pair-flows (PPf), Growth of Single-flows (GSf), and Growth of Different Ports (GDP). Self-Organizing Maps (SOMs) are used as the classification method. In order to improve the scalability of native OpenFlow, a method combining OpenFlow and sFlow has been proposed in [16] for an effective and scalable anomaly detection and mitigation mechanism in an SDN environment. Trung et al. [17] combine hard detection thresholds and a fuzzy inference system (FIS) to detect the risk of DDoS attacks based on real traffic characteristics under normal and attack states; three features are chosen for detecting the attack: distribution of inter-arrival time, distribution of packet quantity per flow, and flow quantity to a server. Other researchers also use a variety of feature selection algorithms to increase detection accuracy.

In this paper, we use a Deep Neural Network (DNN) for anomaly detection. Six basic features are chosen for detecting attacks: duration, protocol_type, src_bytes, dst_bytes, count and srv_count. The key difference between our work and other papers is that we use simple preprocessing and feature extraction in the SDN context.
III. METHODOLOGY

A. Deep Learning Approach

In classical machine learning, important features of an input are designed manually and the system automatically learns to map the features to an output. In deep learning, there are multiple levels of features: these features are discovered automatically and composed together across the levels to produce the output. Each level represents abstract features that are discovered from the features of the previous level.

In our experiment, we constructed a simple deep neural network with an input layer, three hidden layers and an output layer, as shown in Figure 1. The input dimension is six and the output dimension is two. The hidden layers contain twelve, six and three neurons respectively. The model is trained with a batch size of 10 for 100 epochs, and the learning rate is selected in the experiments below.

Fig. 1. Deep Learning Network Model
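As a concrete illustration of this architecture, the short sketch below builds the 6-12-6-3-2 network with the Keras API. The layer sizes, the batch size, the number of epochs and the candidate learning rates come from the text; the ReLU/softmax activations, the Adam optimizer and the cross-entropy loss are our assumptions, since the paper does not specify them.

    # Minimal sketch of the DNN of Section III-A (6 -> 12 -> 6 -> 3 -> 2).
    # Activations, optimizer and loss are assumptions; the paper only fixes the
    # layer sizes, the batch size (10), the epoch count (100) and the learning rate.
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense
    from tensorflow.keras.optimizers import Adam

    def build_dnn(learning_rate=0.001):
        model = Sequential([
            Dense(12, activation="relu", input_shape=(6,)),  # hidden layer 1
            Dense(6, activation="relu"),                     # hidden layer 2
            Dense(3, activation="relu"),                     # hidden layer 3
            Dense(2, activation="softmax"),                  # normal vs. anomaly
        ])
        model.compile(optimizer=Adam(learning_rate=learning_rate),
                      loss="sparse_categorical_crossentropy",  # integer labels: 0 = normal, 1 = anomaly
                      metrics=["accuracy"])
        return model

    # Training as described in the paper: batch size 10, 100 epochs.
    # model = build_dnn(0.001)
    # model.fit(x_train, y_train, batch_size=10, epochs=100,
    #           validation_data=(x_test, y_test))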

B. Dataset

The KDD Cup is the leading data mining competition in the world. The NSL-KDD Dataset was proposed to solve some inherent problems of the KDD Cup 1999 Dataset [18]. Although it is quite old and not a perfect representative of existing real networks, it is still a good reference for comparing NIDS models. It has been used by many researchers to evaluate the performance of NIDSs, so there are significant performance measurement results available for comparison. There are 125,973 network traffic samples in the KDDTrain+ set and 22,554 network traffic samples in the KDDTest+ set. Each traffic sample has forty-one features that are categorized into three types: basic features, content-based features and traffic-based features. Attacks in the dataset are categorized into four categories according to their characteristics. The details of each category are described in Table I. Some specific attack types in the testing set (marked with an asterisk in Table I) do not appear in the training set, which makes the detection task more realistic.
TABLE I
ATTACKS IN THE NSL-KDD DATASET (* = appears only in the testing set)

Category   Training Set                          Testing Set
DoS        back, land, neptune, pod,             back, land, neptune, pod, smurf, teardrop,
           smurf, teardrop                       mailbomb*, processtable*, udpstorm*,
                                                 apache2*, worm*
R2L        ftp-write, guess-passwd, imap,        ftp-write, guess-passwd, imap, multihop,
           multihop, phf, spy, warezclient,      phf, spy, warezmaster, xlock*, xsnoop*,
           warezmaster                           snmpguess*, snmpgetattack*, httptunnel*,
                                                 sendmail*, named*
U2R        buffer-overflow, loadmodule,          buffer-overflow, loadmodule, perl, rootkit,
           perl, rootkit                         sqlattack*, xterm*, ps*
Probe      ipsweep, nmap, portsweep, satan       ipsweep, nmap, portsweep, satan,
                                                 mscan*, saint*

In our experiment, a subset of six features is chosen from the forty-one features of the NSL-KDD Dataset for training and testing. These six features are duration, protocol_type, src_bytes, dst_bytes, count and srv_count; each is described in Table II. They are basic and traffic-based features that can easily be obtained in an SDN environment.

TABLE II
FEATURE DESCRIPTION

Feature Name    Description
duration        length (number of seconds) of the connection
protocol_type   type of the protocol, e.g. tcp, udp, etc.
src_bytes       number of data bytes from source to destination
dst_bytes       number of data bytes from destination to source
count           number of connections to the same host as the current connection in the past two seconds
srv_count       number of connections to the same service as the current connection in the past two seconds
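As an illustration of the preprocessing implied by Table II, the sketch below loads the NSL-KDD files, keeps only the six selected features and maps every attack label to a single anomaly class. The column positions follow the public NSL-KDD field list; the file names, the integer encoding of protocol_type and the overall pipeline are assumptions, since the paper does not describe its exact preprocessing.

    # Sketch: extract the six basic features of Table II from NSL-KDD and
    # binarize the labels (0 = normal, 1 = anomaly). File names are assumed.
    import pandas as pd

    FEATURES = ["duration", "protocol_type", "src_bytes", "dst_bytes", "count", "srv_count"]

    def load_subset(path):
        # KDDTrain+.txt / KDDTest+.txt have no header row; in the standard
        # 41-feature layout the six features sit at positions 0, 1, 4, 5, 22, 23
        # and the label at position 41.
        df = pd.read_csv(path, header=None)
        cols = {0: "duration", 1: "protocol_type", 4: "src_bytes",
                5: "dst_bytes", 22: "count", 23: "srv_count", 41: "label"}
        df = df[list(cols)].rename(columns=cols)
        # protocol_type is categorical (tcp/udp/icmp): encode it as integers.
        df["protocol_type"] = df["protocol_type"].astype("category").cat.codes
        # 2-class target: 0 for normal traffic, 1 for any attack type.
        y = (df.pop("label") != "normal").astype(int)
        return df[FEATURES], y

    # x_train, y_train = load_subset("KDDTrain+.txt")
    # x_test, y_test = load_subset("KDDTest+.txt")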

C. The SDN-based IDS Architecture

We propose an SDN-based NIDS architecture as depicted in Figure 2. In this architecture, the NIDS module is implemented in the controller. The SDN controller can monitor all the OpenFlow switches and request network statistics when needed, so the NIDS module can take advantage of this global network overview for detecting intrusions. An ofp_flow_stats_request message is sent from the controller to all the OpenFlow switches after a fixed time window to request the network statistics, and the OpenFlow switches send back to the controller an ofp_flow_stats_reply message with all the statistics. The centralized controller can take advantage of the complete network overview provided by SDN to analyze and correlate this feedback from the network. All network statistics are then sent to the NIDS module for analysing and detecting any real-time network intrusion. Once a network anomaly is detected and identified, the OpenFlow protocol can effectively mitigate it via flow table modification, and new security policies can be propagated to the switches in order to prevent attacks.

Fig. 2. Proposed SDN Security Architecture
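The statistics-collection loop described above can be sketched against the Ryu controller's OpenFlow 1.3 API. Ryu is only one of the controllers named in Section I, so using it here is an assumption, as are the 10-second polling window and the nids_module hook.

    # Sketch of periodic flow-statistics polling in a Ryu app (OpenFlow 1.3).
    from ryu.base import app_manager
    from ryu.controller import ofp_event
    from ryu.controller.handler import MAIN_DISPATCHER, DEAD_DISPATCHER, set_ev_cls
    from ryu.lib import hub
    from ryu.ofproto import ofproto_v1_3

    class FlowStatsCollector(app_manager.RyuApp):
        OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

        def __init__(self, *args, **kwargs):
            super(FlowStatsCollector, self).__init__(*args, **kwargs)
            self.datapaths = {}
            self.monitor_thread = hub.spawn(self._monitor)

        @set_ev_cls(ofp_event.EventOFPStateChange, [MAIN_DISPATCHER, DEAD_DISPATCHER])
        def _state_change_handler(self, ev):
            # Keep track of connected switches.
            dp = ev.datapath
            if ev.state == MAIN_DISPATCHER:
                self.datapaths[dp.id] = dp
            elif ev.state == DEAD_DISPATCHER:
                self.datapaths.pop(dp.id, None)

        def _monitor(self):
            # Fixed time window: send ofp_flow_stats_request to every switch.
            while True:
                for dp in self.datapaths.values():
                    parser = dp.ofproto_parser
                    dp.send_msg(parser.OFPFlowStatsRequest(dp))
                hub.sleep(10)

        @set_ev_cls(ofp_event.EventOFPFlowStatsReply, MAIN_DISPATCHER)
        def _flow_stats_reply_handler(self, ev):
            # ofp_flow_stats_reply carries per-flow counters (duration, packets, bytes).
            for stat in ev.msg.body:
                features = (stat.duration_sec, stat.packet_count, stat.byte_count)
                # Hand the flow record to the NIDS module for classification
                # (nids_module is a placeholder, not part of Ryu).
                # nids_module.classify(features)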
IV. PERFORMANCE ANALYSIS

A. Evaluation Metrics

In general, the performance of an NIDS is evaluated in terms of accuracy (AC), precision (P), recall (R) and F-measure (F). An NIDS requires high accuracy, a high detection rate and a low false alarm rate. A confusion matrix is used to calculate these metrics: True Positive (TP) is the number of attack records correctly classified, True Negative (TN) is the number of normal records correctly classified, False Positive (FP) is the number of normal records incorrectly classified as attacks, and False Negative (FN) is the number of attack records incorrectly classified as normal. Then we can say:

• Accuracy (AC): the percentage of correctly classified records over the total traffic trace:

    AC = (TP + TN) / (TP + TN + FP + FN)    (1)

• Precision (P): how many of the intrusions predicted by the NIDS are actual intrusions. The higher P is, the lower the false alarm rate:

    P = TP / (TP + FP)    (2)

• Recall (R): the percentage of predicted intrusions versus all intrusions present. We want a high R value:

    R = TP / (TP + FN)    (3)

• F-measure (F): a better measure of the accuracy of an NIDS, since it considers both the precision (P) and the recall (R); it is their harmonic mean, and we also aim for a high F value:

    F = 2 / (1/P + 1/R)    (4)
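Equations (1)-(4) reduce to a few lines of code. The helper below is a minimal sketch that computes all four metrics from the confusion-matrix counts, with "attack" as the positive class; the example counts in the comment are hypothetical, not results from the paper.

    # Accuracy, precision, recall and F-measure from the confusion-matrix counts,
    # exactly as defined in equations (1)-(4).
    def nids_metrics(tp, tn, fp, fn):
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f_measure = 2 / (1 / precision + 1 / recall)  # harmonic mean of P and R
        return accuracy, precision, recall, f_measure

    # Example with hypothetical counts (not results from the paper):
    # ac, p, r, f = nids_metrics(tp=9000, tn=8000, fp=1500, fn=2500)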
B. Experimental Results

Initially, we implemented the NIDS for 2-class classification (normal and anomaly). The performance of the model depends on the values of the hyper-parameters, so the model was first optimized by searching for the hyper-parameter values that lead to the best classification results. To this end, we varied the learning rate over the range {0.1, 0.01, 0.001, 0.0001}. When training a model, we try to minimize the loss and maximize the accuracy. Comparing the loss and accuracy of the training phase (see Table III), we can see that as the learning rate decreases, the loss decreases and the accuracy increases. However, in the testing phase, when we decreased the learning rate to 0.0001 the results are not as good as with a learning rate of 0.001. This is because, when the learning rate is very small, the NIDS model fits the training data too closely; that is why the best training-phase results are obtained at a learning rate of 0.0001. Because of this overfitting, the model cannot generalize the characteristics of the training samples well: while it easily catches the intrusion instances in the training set, it cannot catch the new intrusion instances in the test set. As a result, the NIDS performance decreases in the testing phase.
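A minimal sketch of this learning-rate sweep is shown below. It assumes a build_dnn(learning_rate) helper like the one sketched in Section III-A and simply records training and test loss/accuracy for each candidate rate, mirroring the bookkeeping behind Tables III and IV; it is illustrative and not the authors' actual script.

    # Sketch of the learning-rate sweep over {0.1, 0.01, 0.001, 0.0001}.
    # build_dnn is the hypothetical model-building helper sketched in Section III-A.
    def sweep_learning_rates(x_train, y_train, x_test, y_test, build_dnn):
        results = {}
        for lr in (0.1, 0.01, 0.001, 0.0001):
            model = build_dnn(lr)
            model.fit(x_train, y_train, batch_size=10, epochs=100, verbose=0)
            train_loss, train_acc = model.evaluate(x_train, y_train, verbose=0)
            test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
            results[lr] = {"train": (train_loss, train_acc),   # cf. Table III
                           "test": (test_loss, test_acc)}
        return results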

TABLE III
LOSS AND ACCURACY EVALUATION FOR DIFFERENT LEARNING RATES

                    Train Set                    Test Set
Learning Rate   Loss (%)   Accuracy (%)      Loss (%)   Accuracy (%)
0.1             11.49      88.04             31.26      72.05
0.01            8.41       90.9              20.15      73.03
0.001           8.26       91.62             19.51      75.75
0.0001          7.45       91.7              20.3       74.67

Secondly, we evaluated the precision, recall and F-measure of the model in more detail. The performance of the DNN algorithm was evaluated using the test data provided, and the performance with each learning rate is shown in Table IV. As we can see, the learning rate of 0.001 gives the best results among the four learning rates in all evaluation metrics. All evaluation metrics improve as we decrease the learning rate from 0.1 to 0.001, but they suddenly drop at a learning rate of 0.0001. Together with the loss and accuracy above, these metrics confirm that the performance of the NIDS model decreases when the learning rate is reduced to 0.0001.

TABLE IV
ACCURACY METRICS FOR DIFFERENT LEARNING RATES

Learning Rate   Precision (%)   Recall (%)   F1-score (%)
0.1             79              72           72
0.01            82              73           72
0.001           83              76           75
0.0001          83              75           74

In the following, Receiver Operating Characteristic (ROC) curves are presented, plotting the true positive rate against the false positive rate for each learning rate. The area under the ROC curve (AUC) is a standard measure for classifier comparison; it is calculated via the trapezoidal rule, where a trapezoid is constructed between each two consecutive points on the curve. The higher the area under the ROC curve, the better the system. Figure 3 shows the ROC curves for the four learning rates. As expected, the learning rate of 0.001 performs better than the others, with the highest AUC. The learning rate of 0.0001 has a slightly poorer overall performance than 0.001 but a lower false positive rate, because the model was fitted very closely to the training data at this low learning rate. For the overall evaluation, the learning rate of 0.001 has the best performance, so it is chosen for the further evaluations.

Fig. 3. ROC Curve Comparison for Different Learning Rates
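The ROC/AUC comparison can be reproduced with scikit-learn, whose auc function integrates the curve with the same trapezoidal rule mentioned above. The sketch below assumes the classifier outputs a softmax probability for the anomaly class, which is our assumption about how the curves in Figure 3 were generated.

    # ROC curve and AUC (trapezoidal rule) for the anomaly class.
    from sklearn.metrics import roc_curve, auc

    def roc_auc_for_model(model, x_test, y_test):
        # Probability assigned to the anomaly class (column 1 of the softmax output).
        scores = model.predict(x_test)[:, 1]
        fpr, tpr, _ = roc_curve(y_test, scores)   # y_test: 0 = normal, 1 = anomaly
        return fpr, tpr, auc(fpr, tpr)            # auc() integrates with the trapezoidal rule

    # fpr, tpr, area = roc_auc_for_model(model, x_test, y_test)
    # A curve closer to the top-left corner (larger area) indicates a better detector.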

We also evaluated our work by comparing our results with those of other works and other machine learning algorithms. In [15], the authors use the 6-tuple of features APf, ABf, ADf, PPf, GSf and GDP for DDoS detection. Their result is good, with a detection rate of 99.11% and a false alarm rate of 0.46%; our results are quite low in comparison. The main reason may be our feature selection for training and testing: their six features are specified mainly for detecting DDoS attacks, whereas we chose the six most basic features and did not focus on any particular kind of attack. In our future research, we will try to apply their 6-tuple of features to our model for further evaluation. We also compared our results with the results reported in [10] for different machine learning algorithms. In [10], the authors train and test the different algorithms with the full training and testing sets of forty-one features, which allows them to evaluate the performance of these algorithms on their dataset. From Table V, we can see that, with 75.75% accuracy, our DNN approach is quite low compared with the other algorithms. The most accurate machine learning algorithm is the NB Tree with 82.02%. That accuracy is obtained from the full-feature training set, so the NB Tree algorithm can generalize the characteristics of the normal and anomaly traffic very well. In our simulation, it must be noted that only the six basic features listed in Table II are used for training and testing. This sub-feature set cannot provide enough information for our DNN algorithm to generalize the characteristics of some sophisticated or new attacks. Nevertheless, it can be seen that the algorithm performs reasonably compared with the other algorithms.
TABLE V
ACCURACY COMPARISON OF DIFFERENT ALGORITHMS

Algorithm                        Accuracy (%)
J48                              81.05
Naive Bayes (NB)                 76.56
NB Tree                          82.02
Random Forest                    80.67
Random Tree                      81.59
Multi-layer Perceptron           77.41
Support Vector Machine (SVM)     69.52
Our DNN                          75.75
For further evaluation, we use just the sub-feature set for training and testing. In the next experiment, the proposed DNN algorithm was compared with other machine learning algorithms: NB, SVM and Decision Tree. Table VI gives an overview of the performance of each machine learning algorithm using the sub-feature dataset. The proposed DNN algorithm gives the best results amongst all the algorithms. The other machine learning algorithms cannot generalize the characteristics of the training samples well with just six features, so they achieve quite poor performance.
TABLE VI
ACCURACY COMPARISON FOR THE SUB-FEATURE DATASET

Algorithm        Accuracy (%)
Naive Bayes      45
SVM              70.9
Decision Tree    74
Our DNN          75.75

Finally, the ROC curves of these four algorithms are presented in Figure 4. As expected, the deep learning algorithm performs better than the others, with the highest AUC. Our DNN algorithm achieves a higher accuracy rate with a lower false positive rate compared to the others.

Fig. 4. ROC Curve Comparison for Different Algorithms

From all the above, we have demonstrated that our proposed DNN approach can generalize and abstract the characteristics of the normal and anomaly traffic with a small number of features and gives a promising accuracy.

V. CONCLUSIONS AND FUTURE WORK

In this paper, we have implemented a deep learning algorithm for detecting network intrusions and evaluated our NIDS model. Although our results are not yet good enough to be adopted in any commercial product or as an alternative to signature-based IDS, our approach still has significant potential and advantages for further development. By comparing the results with those of other classifiers, we have shown the potential of using deep learning for a flow-based anomaly detection system. In the context of the SDN environment, the deep learning approach also has potential. This is attributed to the centralized nature of the controller and the flexible structure of SDN: the basic information about network traffic can be extracted easily by the controller and evaluated by the deep learning intrusion detection module. To improve the accuracy, we will analyze the traffic and propose other types of features. With the flexibility of the SDN structure, we can extract many features that contain more valuable information, or focus on one specific type of attack, such as DDoS, to increase the accuracy of the NIDS. We will also try to adjust our DNN model for better performance (e.g., varying the number of hidden layers and hidden neurons). In the near future, we will also try to implement this approach in a real SDN environment with real network traffic and evaluate the performance of the whole network in terms of latency and throughput.

REFERENCES

[1] “Software Defined Networking Definition,” Available: https://www.opennetworking.org/sdn-resources/sdn-definition, [Accessed 04 Jul. 2016].
[2] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, “Openflow: enabling innovation in campus networks,” ACM SIGCOMM Computer Communication Review, vol. 38, no. 2, pp. 69–74, 2008.
[3] S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh,
S. Venkata, J. Wanderer, J. Zhou, M. Zhu et al., “B4: Experience with
a globally-deployed software defined wan,” ACM SIGCOMM Computer
Communication Review, vol. 43, no. 4, pp. 3–14, 2013.
[4] Huawei Press Centre, “China Telecom and Huawei unveil world’s first commercial deployment of SDN in carrier networks,” [Online]. Available: pr.huawei.com/en/news/hw-332209-sdn.htm.
[5] N. Gude, T. Koponen, J. Pettit, B. Pfaff, M. Casado, N. McKeown, and
S. Shenker, “Nox: towards an operating system for networks,” ACM
SIGCOMM Computer Communication Review, vol. 38, no. 3, pp. 105–
110, 2008.
[6] “Ryu,” Available: http://osrg.github.io/ryu/.
[7] “Floodlight,” Available: http://www.projectfloodlight.org/.
[8] D. Kreutz, F. Ramos, and P. Verissimo, “Towards secure and depend-
able software-defined networks,” in Proceedings of the second ACM
SIGCOMM workshop on Hot topics in software defined networking.
ACM, 2013, pp. 55–60.
[9] R. Sommer and V. Paxson, “Outside the closed world: On using machine
learning for network intrusion detection,” in 2010 IEEE symposium on
security and privacy. IEEE, 2010, pp. 305–316.
[10] M. Tavallaee, E. Bagheri, W. Lu, and A.-A. Ghorbani, “A detailed
analysis of the kdd cup 99 data set,” in Proceedings of the Second IEEE
Symposium on Computational Intelligence for Security and Defence
Applications, 2009.
[11] Z. Jadidi, V. Muthukkumarasamy, E. Sithirasenan, and M. Sheikhan,
“Flow-based anomaly detection using neural network optimized with gsa
algorithm,” in 2013 IEEE 33rd International Conference on Distributed
Computing Systems Workshops, 2013, pp. 76–81.
[12] P. Winter, E. Hermann, and M. Zeilinger, “Inductive intrusion detection
in flow-based network data using one-class support vector machines,”
in New Technologies, Mobility and Security (NTMS), 2011 4th IFIP
International Conference on. IEEE, 2011, pp. 1–5.
[13] S. A. Mehdi, J. Khalid, and S. A. Khayam, “Revisiting traffic anomaly
detection using software defined networking,” in International Workshop
on Recent Advances in Intrusion Detection. Springer, 2011, pp. 161–
180.
[14] “Q1 2016 State of the Internet / Security Report,” Available:
https://content.akamai.com/PG6301-SOTI-Security.html, [Accessed 07
Jul. 2016].
[15] R. Braga, E. Mota, and A. Passito, “Lightweight ddos flooding attack
detection using nox/openflow,” in Local Computer Networks (LCN),
2010 IEEE 35th Conference on. IEEE, 2010, pp. 408–415.
[16] K. Giotis, C. Argyropoulos, G. Androulidakis, D. Kalogeras, and
V. Maglaris, “Combining openflow and sflow for an effective and scal-
able anomaly detection and mitigation mechanism on sdn environments,”
Computer Networks, vol. 62, pp. 122–136, 2014.
[17] P. Van Trung, T. T. Huong, D. Van Tuyen, D. M. Duc, N. H. Thanh,
and A. Marshall, “A multi-criteria-based ddos-attack prevention solution
using software defined networking,” in Advanced Technologies for
Communications (ATC), 2015 International Conference on. IEEE,
2015, pp. 308–313.
[18] “KDD Cup 1999,” Available: http://kdd.ics.uci.edu/databases/kddcup99/,
[Accessed 04 Jul. 2016].
