Alamiedy, 2021


Ensemble Feature Selection Approach for Detecting Denial of Service Attacks in RPL Networks

Taief Alaa Alamiedy1,2, Mohammed F. R. Anbar1(B), Bahari Belaton3,
Arkan Hamoodi Kabla1, and Baidaa Hamza Khudayer1,4
1 National Advanced IPv6 Centre, Universiti Sains Malaysia (USM), 11800 Gelugor,
Penang, Malaysia
{taiefalamiedy,Arkan}@student.usm.my, [email protected],
[email protected]
2 ECE Department, Faculty of Engineering, University of Kufa, P.O. Box 21, Najaf, Iraq
3 School of Computer Sciences, Universiti Sains Malaysia (USM), 11800 Gelugor,
Penang, Malaysia
[email protected]
4 Information Technology Department, AlBuraimi University College, Buraimi, Oman

Abstract. The Internet of Things (IoT) is regarded as a future trend following
the Internet revolution. Many of us now use physical and electronic devices in
our daily lives to perform and deliver specific services, and in IoT networks all
of these devices are linked together. Some of these devices, known as constrained
devices, are battery-powered and operate in low-energy mode. Therefore, the
Routing Protocol for Low-Power and Lossy Networks (RPL) was proposed to allow
communication and packet forwarding between constrained devices. RPL, however,
is not an energy-aware protocol, which makes it vulnerable to a wide range of
security threats. Distributed Denial of Service (DDoS) flooding attacks are the
most significant attacks that target RPL. Hence, a reliable method for detecting
DDoS flooding-based RPL attacks is required. In this paper, an ensemble Feature
Selection (FS) approach for detecting DDoS attacks in RPL networks is presented.
The proposed approach employs three bio-inspired algorithms to select the optimal
subset of features that contribute to high detection accuracy. Furthermore, a
Support Vector Machine (SVM) is used as a classification algorithm to evaluate
the subset of features produced by the bio-inspired algorithms. Finally, the
proposed approach is expected to detect and identify DDoS flooding attack
patterns in RPL networks with high accuracy.

Keywords: Intrusion detection · Machine learning · Internet of things security ·
LLN · RPL · Routing attacks · 6LoWPAN · Hello-flooding attacks

1 Introduction
The rapid expansion of computer networks and Internet-based devices has drawn many
researchers to this new era. As part of this revolution, every appliance will be
connected to other appliances, and this group of connected devices will then form a

© Springer Nature Singapore Pte Ltd. 2021
N. Abdullah et al. (Eds.): ACeS 2021, CCIS 1487, pp. 340–360, 2021.
https://doi.org/10.1007/978-981-16-8059-5_21

network of connected devices. This type of network is known as the Internet of Things
(IoT) [1, 2]. The IoT is a network of numerous sensors and actuators that provide
various services such as sensing the environment, collecting information, analysing
the gathered data, performing actions, and so on. These devices can communicate and
share information using a variety of protocols and communication techniques [3].
According to Cisco’s recently renamed Annual Internet Report for the period 2018–2023
[4], the number of networked devices will grow significantly around the world,
reaching 29.3 billion in 2023. This expansion will be driven by machine-to-machine
technology and other forms of communication [4]. The IoT global growth statistics
taken from the Cisco report are depicted in Fig. 1.

Fig. 1. IoT global growth by Cisco annual report [4].

The coronavirus disease (Covid-19) emerged in some countries at the end of 2019,
specifically in December [5]. At the beginning of 2020, the virus quickly spread to
most countries, and many countries implemented stringent measures and rules to prevent
its spread. These actions had ramifications for both the public and private sectors.
One of the proposed solutions to the problems associated with these measures was
moving to digital techniques, which was in turn the starting point for many challenges
that countries and people will face.
Many studies [6–9] have been conducted to investigate the future impact of this crisis
on the use of technology in various daily life activities. One of the current hot issues is
the impact of Covid-19 on IoTs and their applications [10].
Furthermore, IoT plays an important role in limiting the spread of this crisis, and it has
the potential to be used to prevent future disasters. For example, Accent System, a
Spanish company, developed an IoT-based contact tracing wristband to track people and
determine whether they were in close contact with a Covid-19 patient [11]. However, the
security aspect of these techniques is critical in order to protect people’s private
information and prevent manipulation of the information before it reaches its destination.
This research aims to provide a new subset of features associated with the pattern of
DDoS flooding attacks. The following contributions are made to achieve this goal:

• Creating a solid benchmark dataset with a variety of network scenarios (normal and
attack traffic),
• Using three types of bio-inspired algorithms to select the optimal subset of features
that contribute to high detection accuracy,
• Providing an intersection stage that extracts the identical features obtained by the
bio-inspired algorithms, which results in an increase in detection accuracy,
• Using an SVM-based classification algorithm to classify the selected features.

The rest of the paper is organised as follows: Sect. 2 introduces the concept of Low
Power and Lossy Networks, Sect. 3 reveals attacks in RPL networks, Sect. 4 reviews
related studies to this research, Sect. 5 provides details about the methodology stages
for the proposed approach, and Sect. 6 presents the evaluation and validation criteria.
Finally, Sect. 7 provides a conclusion.

2 Low Power and Lossy Network (LLN) Concept

The IoT includes a variety of devices, such as smart meters, intelligent alarm systems,
and small sensors (gas, temperature, and humidity sensors). These sensors are linked
and communicate in an architecture known as an LLN. The LLN is a variant of the typical
Wireless Sensor Network (WSN). However, there are many differences between these
networks; for example, the density of sensor nodes in an LLN is higher, and the sensors
are considered low-power devices (low-power operation, limited bandwidth, small
memory size, and short radio range) [12, 13].
In addition, due to limited processing capability, there is a high likelihood of packet
loss in an LLN, and the delay and transferred data size are small compared to a WSN.
Despite such drawbacks, these networks communicate based on the Internet Protocol (IP),
which is still the best option for controlling and managing network sensors (nodes) [14].
Besides, as previously stated, the LLN operates in low-power mode. Therefore,
it is critical to conserve the energy of such devices while forwarding and exchanging
information between nodes. Many protocols have been proposed to facilitate
communication between the resource-constrained devices in an LLN, among them the
Routing Protocol for Low-Power and Lossy Networks (RPL). More information about the
RPL protocol is provided in the following section.

2.1 Routing Protocol for Low Power and Lossy Network (RPL)

RPL was selected as the standard routing protocol for LLNs by the Routing Over Low-power
and Lossy networks (ROLL) Working Group in 2012 [15]. RPL has been standardised to
function as an LLN network-layer protocol, and it is used in academic and industrial
fields to conduct research and improve LLN performance. The ability of RPL to work with
different constrained devices is the primary reason for its use in this type of network
architecture. Moreover, RPL provides efficient routing between the constrained nodes
and supports quality of service [16, 17]. The basic concept of the RPL network is
depicted in Fig. 2.

Fig. 2. The concept of RPL Network [18].

RPL is a distance-vector routing protocol designed for LLNs. A protocol of
this type calculates the distance and direction to any node in the network. Consequently,
RPL forwards a packet between two nodes along the least expensive route with the
shortest distance. The cost of reaching the destination is calculated using route metrics
[19].

Terminologies of RPL Protocol. This section provides a brief overview of the
components of the RPL network.

Elements of the RPL Network. RPL manages the topology of the LLN network’s nodes
by constructing a Destination-Oriented Directed Acyclic Graph (DODAG). The DODAG
is made up of nodes that are connected to one another and arranged in a tree-like structure
[20]. Meanwhile, the sensor nodes are linked to the main node, which is known as the
root or primary node. The root node collects and distributes information to all network
nodes at the same time, connecting the RPL network to the external network. Therefore,
such a node could be a border router or gateway, or it could be any other normal node
in the network [21].
Furthermore, the root node is identified by an IPv6 address called the DODAG ID, which
is used to differentiate the DODAG from the other DODAGs in an RPL Instance. In
addition, as mentioned in Sect. 2, the LLN is used in a variety of IoT applications.
As a consequence, in RPL, each DODAG is in charge of specific tasks or applications; for
example, if two or more applications are running concurrently or independently,
several DODAGs would be used. This group of DODAGs is controlled by an RPL
Instance, which may contain one or more DODAGs, and all DODAGs in this instance
share the same ID, known as the Instance ID [22, 23].
Aside from that, each DODAG node is assigned a rank, the value of which is determined
by a function known as the objective function, which is calculated from a set of
metrics. As a node gets further away from the root, its rank value increases (the root
node has the lowest rank in the network). The rank value thus identifies the node’s
position in the network in relation to the root node [24, 25].
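The relationship between rank and distance from the root can be illustrated with a minimal sketch. It assumes a simple hop-count objective function in which every hop adds a fixed step; real objective functions may combine other metrics. The constant 256 is the default MinHopRankIncrease of RFC 6550, while the topology and the function name `compute_ranks` are hypothetical and used only for illustration.

```python
from collections import deque

MIN_HOP_RANK_INCREASE = 256        # RFC 6550 default, assumed here
ROOT_RANK = MIN_HOP_RANK_INCREASE  # rank of the DODAG root


def compute_ranks(children, root):
    """Assign a rank to every node of a DODAG given as parent -> children."""
    ranks = {root: ROOT_RANK}
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        for child in children.get(parent, []):
            # Hop-count objective function: parent's rank plus a fixed step.
            ranks[child] = ranks[parent] + MIN_HOP_RANK_INCREASE
            queue.append(child)
    return ranks


# Hypothetical five-node DODAG rooted at "R".
topology = {"R": ["A", "B"], "A": ["C"], "C": ["D"]}
ranks = compute_ranks(topology, "R")
print(ranks)
```

In this sketch the root holds the lowest rank, and every additional hop away from it yields a strictly larger value.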

Traffic Types in RPL Network. The traffic direction in the RPL network is divided into
three types: point to point (P2P), which occurs when two nodes exchange information;
point to multi-point (P2MP), which occurs when traffic flows from the root node to other
nodes; and multi-point to point (MP2P), which occurs when traffic generated by child
nodes flows towards the parent node [22, 23].
Control Messages in RPL Network. The RPL protocol uses five control messages to
manage packet forwarding in the network. A brief explanation of these messages is
provided below.

• DODAG Information Object (DIO) messages are used to broadcast the information
required to build the RPL network. Such information includes the current RPL instance,
the current rank of the node, the IPv6 address of the root, etc. [26].
• Destination Advertisement Object (DAO) messages are used for advertising the
information required to construct the downward routes and build the routing tables at
the receiving nodes [24].
• DODAG Information Solicitation (DIS): this message is sent by a node that wants to
join the network and has not yet received a DIO message. The new node sends a DIS
message to its neighbour nodes to ask whether there is any DODAG available to
join [27].
• Destination Advertisement Object Acknowledgement (DAO-ACK): it is sent by the
DAO recipient as a response to a DAO message [28, 29].
• Consistency Check (CC) message: the RPL protocol uses this type of message to verify
the synchronicity of “security counter or timestamp between each pair of nodes” [30].

Figure 3 illustrates the RPL architecture.

Fig. 3. RPL Architecture [26].
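The five control messages can be told apart in captured traffic by their ICMPv6 type and code. The sketch below uses the values assigned in RFC 6550 (ICMPv6 type 155 and the listed codes); the class name `RplControlCode` and the helper `classify` are hypothetical names introduced only for this illustration.

```python
from enum import IntEnum

RPL_ICMPV6_TYPE = 155  # ICMPv6 type used by RPL control messages (RFC 6550)


class RplControlCode(IntEnum):
    DIS = 0x00      # DODAG Information Solicitation
    DIO = 0x01      # DODAG Information Object
    DAO = 0x02      # Destination Advertisement Object
    DAO_ACK = 0x03  # DAO Acknowledgement
    CC = 0x8A       # Consistency Check (a secure message)


def classify(icmp_type, icmp_code):
    """Return the RPL message name, or None if it is not one of the five."""
    if icmp_type != RPL_ICMPV6_TYPE:
        return None
    try:
        return RplControlCode(icmp_code).name
    except ValueError:
        return None


print(classify(155, 0x00))
print(classify(155, 0x01))
```

A classifier of this kind is one possible starting point for counting DIS and DIO messages per node when analysing flooding behaviour.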



3 RPL Attacks
Various IoT applications necessitate the use of a security mechanism to ensure that
information is delivered safely. Intrusion Detection Systems (IDS), authentication, and
cryptography techniques can be used as the first line of defence to meet such
requirements. However, due to the limited resources of IoT sensors, these techniques
are limited in their application. Therefore, adversaries can penetrate these devices
and connect to the network as regular nodes. Afterwards, they can modify the operation
of the compromised nodes to perform various types of attacks. As shown in Fig. 4,
attacks in the RPL network can be classified into three types [31].

Fig. 4. Taxonomy of attacks against RPL network [31].

Resource-based attacks target the resources of the network’s nodes by forcing legitimate
nodes to perform unnecessary actions in order to deplete their resources. These attacks
aim to consume node energy, memory, or processing power, reducing the network lifetime
by causing congestion in the available links and eventually resulting in network
failure [32, 33].
Topology-based attacks, on the other hand, aim to change the network’s topology by
isolating some nodes and changing the path of network traffic, resulting in collusion and
destruction of the entire network [34, 35].
Finally, the last category is traffic-based attacks. Such attacks target the network by
employing techniques such as sniffing, capturing, and analysing network data, with the
goal of infiltrating the network or stealing sensitive information [36]. As previously
stated, the goal of this research is to detect DDoS flooding attacks, which are
classified as resource-based attacks. More information about this attack is provided in
the subsection that follows.

3.1 Resource-Based RPL Attacks

There are two types of resource-based RPL attacks: direct and indirect. In direct
attacks, one or more malicious nodes generate excessive traffic themselves to degrade
network performance and drain the resources of the other nodes. In contrast, indirect
attacks seek to cause the other nodes in the network to generate a large volume of
traffic. For example, such an attack could be carried out by creating loops in the RPL
network that force the other nodes to generate traffic overhead. The section that
follows discusses DDoS flooding attacks.

DDoS Flooding Attacks. DDoS flooding attacks generate and launch a large volume
of traffic in a network, causing nodes and links to become unavailable. This attack
can be carried out by an external or internal attacker; internal DDoS flooding attacks
are the focus of this research. In these attacks, the attacker constantly sends DIS
messages to the neighbour nodes within its transmission range. For this reason, such
attacks are also known as DIS flooding attacks. After receiving a DIS message, the
victim node(s) reset their trickle timer, which is responsible for scheduling the
transmission of control packets in the network, and reply with DIO messages to allow
the malicious node to join the network. This process continues, eventually exhausting
the power resources of the victim node(s) and causing the network to fail.
Furthermore, during DDoS attacks, the victim node(s) are occupied with processing
all of the requests, which causes additional overhead and prevents the nodes from
performing their legitimate operations. Moreover, the attacker can send unicast DIS
messages to a single node or broadcast DIS messages to all of its neighbour nodes.
In both cases, this attack causes network congestion and RPL node saturation [31].
Figure 5 depicts a basic DDoS flooding attack scenario.

Fig. 5. DDoS flooding attack scenario.
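The effect of repeated trickle-timer resets on a victim node can be approximated with a small simulation. This is a simplified sketch, not the Trickle algorithm of RFC 6206 in full: it assumes one DIO is sent at the end of every interval, that the interval doubles up to a maximum, and that every received DIS resets the interval to its minimum. It ignores the random transmission point and the redundancy suppression of the real timer. The parameter values and the name `count_dios` are illustrative assumptions, not RPL defaults.

```python
def count_dios(horizon, dis_times, i_min=1.0, doublings=8):
    """Count DIOs sent in [0, horizon] seconds under a simplified Trickle timer."""
    i_max = i_min * 2 ** doublings
    resets = sorted(t for t in dis_times if 0 <= t <= horizon)
    now, interval, sent, idx = 0.0, i_min, 0, 0
    while True:
        fire = now + interval
        # A DIS arriving before the interval expires resets the timer.
        if idx < len(resets) and resets[idx] < fire:
            now, interval = resets[idx], i_min
            idx += 1
            continue
        if fire > horizon:
            break
        sent += 1  # one DIO transmitted when the interval expires
        now = fire
        interval = min(interval * 2, i_max)
    return sent


baseline = count_dios(600, [])               # no attacker present
flooded = count_dios(600, range(2, 600, 2))  # hypothetical DIS every 2 s
print(baseline, flooded)
```

Under these assumptions the victim sends far more DIO messages when it is flooded, which is the mechanism by which its energy is drained.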



4 Literature Review
This section discusses the studies that support the proposed approach. Researchers
have used various mechanisms to detect and prevent attacks in RPL networks. As
demonstrated in the following studies, such techniques include intrusion
detection-based, trust-based, and cryptography-based solutions.
Napiah et al. [37] proposed CHA-IDS, a centralised IDS for detecting Hello flood,
Wormhole, and Sinkhole attacks. The authors used compression header data to extract
the significant traffic features for both individual and combined attacks. In addition,
for the feature selection stage, a step-by-step search strategy with a
correlation-based approach was employed to select the important features. The authors
then evaluated the selected features using six machine learning algorithms: SVM,
Decision Trees (J48), Random Forest (RF), Logistic Regression (LR), Multi-layer
Perceptron (MLP), and Naive Bayes. However, the main drawbacks of the proposed model
are its high memory and energy consumption and its inability to effectively identify
the attacker.
Amin et al. [38] devised a new IDS known as RIDES for detecting DoS attacks in
IP-based WSNs. The proposed approach is a hybrid IDS that combines anomaly-based and
signature-based IDS. The anomaly-based IDS detects the attack pattern using Cumulative
Sum (CUSUM) control charts with a predefined threshold. The signature-based IDS, on
the other hand, uses a coding scheme to reduce the overhead associated with long
signature codes: in this scheme, the long signature codes are converted into short
attack signatures. Additionally, the authors used a distributed IDS to reduce
communication and memory consumption, as well as the computational overhead on
network nodes.
Kasinathan et al. [39] developed an IDS for detecting DoS attacks in the 6LoWPAN
network. In the proposed approach, the open-source IDS (Suricata) was used for attack
detection and pattern matching. In addition, the proposed approach used a probe node
to passively sniff the transmission of packets in the network and forward the gathered
information to the main IDS (Suricata IDS) for additional inspection and analysis. In
addition, to avoid communication overhead issues, the authors used a wired link to
connect the probe node directly to the Suricata IDS.
The authors of [40] proposed a trust-based IDS (T-IDS) scheme in their research. The
proposed approach aims to secure the RPL protocol, and this scheme used network nodes
as monitoring nodes to detect suspicious activity. Correspondingly, these nodes share
information with the other nodes in order to detect any abnormal activities. Furthermore,
each node was given a unique ID, and the network was linked to a backbone router, which
works with the monitoring node to detect attacks. The main limitation of this study is
that it cannot accurately identify malicious nodes.

Airehrour et al. [41] proposed an embedded trust-based mechanism that aims to reduce
blackhole attacks in IoT. The authors add the node’s trust value to the Objective
Function (OF). Therefore, in addition to the other metrics, the trust value plays a
role in determining the optimal route. The numbers of transmitted and dropped packets
are used to compute the trust value. So, the node’s Expected Transmission Count (ETX),
trust value, and rank value all play a role in route selection. If multiple packets
are dropped, the proposed method identifies the node as suspicious. However, because
only one metric is used to identify the attacks, the proposed mechanism cannot
detect them accurately.
The authors of [42] developed a Trust-Aware RPL routing protocol designed to detect
selective forwarding and blackhole attacks. The proposed mechanism identifies
malicious nodes based on packet drop rates: when the attacker executes blackhole or
selective forwarding attacks, the drop rate of packets at the malicious node is
considered to be higher than that of a normal node. Besides, the behaviour of any node
in the network can reveal its level of trust. The authors used the trust value to
evaluate the trustworthiness of nodes, which aids in the selection of the optimum path.
Alabsi et al. [43] proposed a new coordinative-balanced clustering (CBC) algorithm.
The proposed method is designed to detect Hello-flooding and Version Number attacks
and to extend the network’s lifetime during DDoS attacks. The CBC algorithm groups the
nodes into clusters, which reduces end-to-end delay, reduces energy consumption, and
extends node lifetime. In addition, the authors introduced an improved ant colony
algorithm that employs node features to detect DDoS attacks. A secure route is then
used for data transmission by selecting the best parent for data forwarding; the
parent is chosen by combining residual energy and a scoring factor. The proposed
approach achieves positive results in terms of packet delivery ratio, energy
consumption, end-to-end delay, and packet loss rate. Nonetheless, it deals with a
limited number of attacks on nodes and lacks high attack detection. Table 1
summarises the related studies.
To summarise, researchers employ various mechanisms and techniques for detecting and
mitigating RPL-based attacks. However, some of the proposed solutions are incapable of
accurately detecting attacks; additionally, some of the proposed mechanisms use the
nodes as monitoring agents, which consumes the nodes’ resources. Besides, some
solutions add extra messages to the network, which creates overhead and consumes the
available network bandwidth. These problems are addressed in this research, as
presented in the proposed methodology section.

Table 1. Summary of related studies

| Ref. & Year | Proposed Mechanism | Type of Attack | Performance Metrics | Limitations |
|---|---|---|---|---|
| Napiah et al. (2018) [37] | CHA-IDS | Hello flood, Wormhole, Sinkhole | – | High memory and energy consumption; unable to effectively identify the attacker |
| Amin et al. (2009) [38] | RIDES | DIS Flooding Attacks | TPR, FPR, ROC | Packet delay causes a low detection rate |
| Kasinathan et al. (2013) [39] | DoS detection-based IDS | DIS Flooding Attacks | TP | The centralised IDS increases the communication overhead and reduces detection of internal attacks |
| Medjek et al. (2017) [40] | Trust-based IDS | Sybil-Mobile Attack | TPR | Unable to accurately identify malicious nodes |
| Airehrour et al. (2016) [41] | Embedded trust-based mechanism | Blackhole | – | Uses only one metric to identify attacks, resulting in low detection accuracy |
| Airehrour et al. (2017) [42] | Trust-Aware RPL routing protocol | Selective Forwarding and Blackhole Attacks | Packet Drop Rate | Uses only the trust metric to evaluate the trustworthiness of nodes, which cannot detect malicious node(s) accurately |
| Alabsi et al. (2019) [43] | Coordinative-balanced clustering with enriched ant-colony algorithm | DIS Flooding and Version Number Attacks | Energy consumption, packet loss rate, packet delivery rate | Handles a limited number of attacks on nodes; lacks high attack detection |

5 Methodology of the Proposed Approach


This research proposes an ensemble feature selection method for detecting DDoS
flooding attacks in the RPL network. The proposed approach is divided into four stages:
(i) Data Collection, (ii) Feature Extraction and Preparation, (iii) Ensemble Feature
Selection-based Bio-Inspired Algorithms, and (iv) DDoS Attack Detection using
Support Vector Machines. The main architecture of the proposed approach is depicted
in Fig. 6.

Fig. 6. Overview of proposed methodology stages.

More information about these stages is provided in the subsections that follow.

5.1 Stage 1: Data Collection


The primary goal of this stage is to construct the network environment and capture
packets from traffic between network nodes. Following that, many processes would be
performed on these packets in order to extract valuable information, as illustrated in the
list of points below.

• Set up the network parameters and scenarios: the goal of this step is to initialise
the network parameters. Such parameters include the design of various network
architectures (star, mesh, and hybrid), the number of normal and malicious nodes, the
number of transmitted packets, and the simulation time/speed. As previously stated,
this research focuses on detecting DDoS flooding attacks. As a result, two network
scenarios are required. The first scenario gathers normal network traffic, which
is later used as a baseline for evaluation, while the second scenario combines normal
and DDoS flooding attack traffic.
• Network traffic generation: this step is carried out with the help of the network simula-
tion environment Cooja [44]. When the simulation begins, all of the network’s nodes
begin exchanging messages in order to build the RPL network. Moreover, the simu-
lation scenarios are run numerous times to ensure the reliability and validity of the
proposed approach. Consequently, any errors that may occur during the experiments
would be minimised.
• Capturing and labelling network traffic: Once the simulation begins, the network traf-
fic is captured and labelled for further analysis. Wireshark [45] is a programme that
sniffs and collects information from network traffic. Following that, the Wireshark pro-
gramme saves all network scenario data into separate files. Finally, each file associated
with the running scenario is labelled.

5.2 Stage 2: Feature Extraction and Preparation


Following the collection of network traffic, packets are extracted and analysed to identify
the pattern of normal and suspicious activities. The points that follow provide more
information about this stage.

• Extraction of dataset features: In this step, the Wireshark programme is used to
extract features from network packets, and an extraction tool is then used to extract
further features based on observations and examples from previous studies. Following
that, the features are aggregated and saved as CSV files. This procedure is carried
out for all network scenarios.
• Data cleaning: This is a critical step in any machine learning or deep learning
project. In this step, missing values are replaced with the average of the record
values, and irrelevant values are removed.
• Data encoding: The dataset features contain various types of information, such as
letters, numbers, and symbols. Analysing different types of information at the same
time takes ML techniques a long time, requires a lot of system resources, and may
degrade their performance. To avoid these problems, the data encoding process maps
the different data formats into a numerical format.
• Data normalisation: This step reduces the range of values in the features, which
aids the classification algorithm during the training and prediction phases. It also
helps to avoid dataset bias, which affects the performance of the classifier used to
build the model. In this research, the records of the dataset features are scaled so
that each value falls between 0 and 1, using the equation below.

$$ y = \frac{x - x_{\min}}{x_{\max} - x_{\min}} $$

where $y$ represents the normalised (new) value, $x$ represents the existing value in
the feature’s record, and $x_{\max}$ and $x_{\min}$ represent the maximum and minimum
values among the feature’s records, respectively.
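The cleaning, encoding, and normalisation steps can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: missing values are marked with `None`, categorical values are mapped to integer codes in sorted order, and scaling is applied per feature column. The function names and the sample columns are hypothetical and are not taken from the dataset described here.

```python
def impute_mean(column):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def label_encode(column):
    """Map each distinct categorical value to an integer code."""
    codes = {value: i for i, value in enumerate(sorted(set(column)))}
    return [codes[v] for v in column]


def min_max_scale(column):
    """Scale values into [0, 1] with y = (x - x_min) / (x_max - x_min)."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical feature columns.
packet_len = [60, None, 100, 80]
msg_type = ["DIO", "DIS", "DIS", "DAO"]

cleaned = impute_mean(packet_len)
scaled = min_max_scale(cleaned)
encoded = label_encode(msg_type)
print(cleaned, scaled, encoded)
```

Scaling after imputation keeps the imputed value inside the observed range, so it cannot shift the minimum or maximum of the feature.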
352 T. A. Alamiedy et al.

5.3 Stage 3: Ensemble Feature Selection-Based Bio-inspired Algorithms


The proposed ensemble FS model identifies an optimal subset of features that contribute
to high detection accuracy. Several researchers [46, 47] used bio-inspired algorithms
to solve a variety of real-world optimisation problems. In this research, we used three
bio-inspired algorithms that produced significant results in detecting network attacks
[48, 49]. The algorithms chosen are used to obtain an optimal number of traffic features
with high detection accuracy. Figure 7 depicts the architecture of ensemble FS-based
bio-inspired algorithms.

Fig. 7. The architecture of ensemble FS-based bio-inspired algorithms
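The intersection stage listed among the contributions, which keeps only the features selected by all of the bio-inspired algorithms, can be sketched as a set intersection. The feature names below are invented for illustration, and `intersect_subsets` is a hypothetical helper rather than part of the proposed implementation.

```python
def intersect_subsets(*subsets):
    """Return the features selected by every algorithm, in sorted order."""
    common = set(subsets[0])
    for subset in subsets[1:]:
        common &= set(subset)
    return sorted(common)


# Hypothetical outputs of the three feature selection algorithms.
pso_features = ["dis_count", "dio_count", "pkt_rate", "src_rank"]
gwo_features = ["dis_count", "pkt_rate", "ttl"]
ffa_features = ["pkt_rate", "dis_count", "dao_count"]

selected = intersect_subsets(pso_features, gwo_features, ffa_features)
print(selected)
```

Only features on which all three algorithms agree survive, which gives a smaller subset to pass to the classifier.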

Selecting a new subset of features is difficult to do efficiently, especially when the
data is complex in terms of a high dimension of features [50]. Bio-inspired
metaheuristic algorithms are well suited to dealing with this issue, because they
produce useful results in a reasonable amount of time and effort. In this research, we
use three different types of bio-inspired algorithms that have achieved significant
results in detecting network attacks [51, 52]. More information about these algorithms
can be found in the subsections below.

Particle Swarm Optimisation (PSO). Particle Swarm Optimisation (PSO) was
invented by Russell Eberhart and James Kennedy [53]. PSO is a computational method
based on the movement of fish schools and bird flocks, and it was developed from a
large number of computer simulations of this behaviour. The method employs a
collection of particles that form a swarm. The swarm then moves through the search
space to find the best solution.
Each particle in the search space adjusts its “travelling” according to its own
travelling experience as well as the travelling experience of the other particles. PSO
starts with the random generation of particles, each with a velocity that indicates
the speed of its search. The particles are then assessed in terms of fitness. Following
that evaluation, two major tests are performed. The first test, called personal best
(pbest), compares a particle’s current fitness with its own best experience. The second
test, known as global best (gbest), compares the fitness of the particles with the best
experience of the whole swarm.
Following that, the best particle is retained as a result of these two critical tests,
and the process is repeated until the termination criteria are met. The PSO algorithm
has been used to solve many different types of optimisation problems; additionally, PSO
has been used to detect several network attacks with impressive results.
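The pbest and gbest updates can be illustrated with a basic binary PSO for feature selection. This is a sketch under stated assumptions, not the configuration used in this research: velocities are turned into selection probabilities with a sigmoid transfer function, the inertia and acceleration coefficients are common textbook values, and `toy_fitness` is a hypothetical stand-in for the classifier accuracy that would normally score a feature subset.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=10, n_iter=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximise `fitness` over 0/1 feature masks with a basic binary PSO."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull towards pbest + pull towards gbest.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Sigmoid transfer gives the probability of selecting feature d.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:          # personal best test
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:         # global best test
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


RELEVANT = {0, 2, 5}  # hypothetical "informative" feature indices


def toy_fitness(mask):
    """Reward relevant features and penalise every extra one."""
    hits = sum(mask[i] for i in RELEVANT)
    return hits - 0.5 * (sum(mask) - hits)


best_mask, best_fit = binary_pso(toy_fitness, n_features=8)
print(best_mask, best_fit)
```

In a real pipeline the fitness function would train and score a classifier on the masked features, which makes each evaluation far more expensive than in this toy example.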

Grey Wolf Optimisation (GWO) Algorithm. Mirjalili et al. [54] proposed the GWO
algorithm, which is inspired by the leadership hierarchy and hunting behaviour of grey
wolves. The pack is divided into four types of wolves: Alpha, Beta, Delta, and
Omega.
As pack leader, Alpha makes the decisions. Alpha is not necessarily the strongest wolf
but is the best at managing the pack, since pack management matters more than
strength. Beta ranks directly below Alpha. Beta serves as Alpha’s advisor and takes over
Alpha’s position in the event that Alpha dies or is incapacitated in some other way.
Alpha’s decisions are backed by Beta and followed by the rest of the pack. Furthermore,
Beta provides Alpha with feedback on the pack’s members to help Alpha make decisions.
Omega, on the other hand, is the pack’s lowest-ranking wolf. Omega acts as the
scapegoat of the pack, and its presence is critical to the pack’s stability because it
indirectly helps to keep the other members satisfied.
Finally, the remaining wolves are classed as Delta. Delta is made up of scouts,
sentinels, elders, hunters, and caretakers. Within this hierarchy, the hunting process
consists of three major steps:

1. Tracking, chasing, and approaching the prey,
2. Stopping the prey’s movement through pursuing, encircling, and harassing it,
3. Attacking the prey.

The GWO algorithm models this hierarchy and these hunting steps mathematically and
uses them to solve large engineering optimisation problems. This algorithm has also
been used as a feature selection technique for detecting different types of attacks,
generating a reduced subset of features that can contribute to improving detection
accuracy [48].
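
A minimal sketch of how the hierarchy drives the search is given below. It assumes the commonly published continuous GWO update, in which the three best wolves (Alpha, Beta, Delta) guide every wolf and a coefficient `a` decreases linearly from 2 to 0 to shift from exploration to exploitation. It is not the binary variant used for feature selection in [48], and the function names, parameter values, and `sphere` objective are hypothetical.

```python
import random


def gwo_minimise(fitness, dim, n_wolves=20, iterations=100,
                 bounds=(-5.0, 5.0), seed=0):
    """Minimise `fitness` with a basic continuous Grey Wolf Optimiser."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    best, best_fit = None, float("inf")

    for t in range(iterations + 1):
        # Rank the pack: the three fittest wolves lead the hunt.
        ranked = sorted(wolves, key=fitness)
        alpha, beta, delta = (w[:] for w in ranked[:3])
        alpha_fit = fitness(alpha)
        if alpha_fit < best_fit:
            best, best_fit = alpha[:], alpha_fit
        if t == iterations:
            break
        # `a` decreases linearly from 2 towards 0 (encircling, then attacking).
        a = 2.0 * (1.0 - t / iterations)
        for wolf in wolves:
            for d in range(dim):
                total = 0.0
                for leader in (alpha, beta, delta):
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[d] - wolf[d])   # distance to the leader
                    total += leader[d] - A * D
                # New position is the average of the three leader-guided moves.
                wolf[d] = min(hi, max(lo, total / 3.0))
    return best, best_fit


def sphere(x):
    """Hypothetical test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)
```

The sketch keeps the best Alpha seen so far, so the returned fitness is never worse than that of the best wolf in the initial pack.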

Firefly Optimisation Algorithm. The Firefly Algorithm (FFA) was proposed by Xin-She
Yang [55]. It is a metaheuristic algorithm that can be applied to feature selection. The
main idea behind this algorithm was inspired by the flashing behaviour that tropical
fireflies use to communicate, and it is based on an idealised model of that flashing
pattern. The algorithm’s mathematical model was built using the following rules:

1. All of the fireflies are unisex, so any firefly can be attracted to any other.
2. The attractiveness of a firefly is proportional to its brightness.
3. The brightness of a firefly is affected or determined by the landscape of the
objective function.

For a maximisation problem, the brightness can be taken as proportional to the
objective function value. The standard firefly algorithm has two critical aspects: the
formulation of the light intensity and the variation of the attractiveness. Because the
brightness of a firefly is determined by the encoded objective function landscape, the
algorithm must define both how the light intensity varies with distance and how the
attractiveness changes accordingly.
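
The two aspects above can be sketched as follows. The paper does not state the formulas, so this assumes the commonly published form in which attractiveness decays with distance as `beta0 * exp(-gamma * r**2)` and a firefly moves towards a brighter one with an optional random step. The function names and default values of `beta0`, `gamma`, and `alpha` are hypothetical.

```python
import math
import random


def attractiveness(distance, beta0=1.0, gamma=1.0):
    """Attractiveness seen at `distance`; beta0 is the value at distance 0."""
    return beta0 * math.exp(-gamma * distance ** 2)


def move_towards(x_i, x_j, beta0=1.0, gamma=1.0, alpha=0.0, rng=None):
    """Move firefly x_i towards the brighter firefly x_j.

    alpha scales an optional random step; alpha=0 gives a deterministic move.
    """
    rng = rng or random.Random(0)
    r = math.dist(x_i, x_j)                    # Euclidean distance
    beta = attractiveness(r, beta0, gamma)
    return [a + beta * (b - a) + alpha * (rng.random() - 0.5)
            for a, b in zip(x_i, x_j)]
```

With the default parameters, a firefly at `[0.0, 0.0]` attracted by one at `[1.0, 0.0]` moves a fraction `exp(-1)` of the way along the first coordinate. In a full algorithm this move would be applied for every pair in which the second firefly is brighter.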

Intersection Stage. In this research, we used three bio-inspired algorithms (GWO,
PSO, and FFA) to generate three significant feature subsets (SF1, SF2, and SF3). Each
algorithm selects a subset of features and uses an SVM classifier to evaluate it.
Following that, each subset of the chosen features is passed to the intersection stage.
The intersection stage reduces the features selected by the three algorithms: only the
features common to a pair of subsets are kept, forming three new subsets known as SF4,
SF5, and SF6. The intersection variants used in this stage are shown in Table 2.

Table 2. Type of intersections used in this stage.

Intersection No   Intersection Type   Output
1                 SF1 ∩ SF2           SF4
2                 SF1 ∩ SF3           SF5
3                 SF2 ∩ SF3           SF6

Finally, SVM is again used to evaluate the resulting intersection subsets and to select
the optimum subset of features, that is, the one that yields the best detection accuracy,
as presented in Fig. 8.

Fig. 8. A summary of intersection stage
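
The pairwise intersections listed in Table 2 can be sketched with Python sets. The feature names below are hypothetical placeholders rather than features from the paper's dataset, and `intersect_pairs` is an illustrative helper, not part of the proposed implementation.

```python
def intersect_pairs(sf1, sf2, sf3):
    """Build SF4, SF5 and SF6 as the pairwise intersections in Table 2."""
    return {
        "SF4": sf1 & sf2,   # SF1 intersected with SF2
        "SF5": sf1 & sf3,   # SF1 intersected with SF3
        "SF6": sf2 & sf3,   # SF2 intersected with SF3
    }


# Hypothetical feature subsets produced by the three algorithms.
sf1 = {"dio_count", "dis_count", "dao_count", "pkt_rate"}
sf2 = {"dis_count", "pkt_rate", "rank"}
sf3 = {"dis_count", "dao_count", "rank"}

intersections = intersect_pairs(sf1, sf2, sf3)
```

Each resulting subset would then be scored by the classifier, and the subset with the highest detection accuracy retained.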

5.4 Stage 4: Support Vector Machine-Based DDoS Attack Detection

SVM is used in this stage to evaluate the features chosen in Stage 2. SVM is a ML
classification algorithm proposed by [53]. SVM performs well in classifying complex
and noisy datasets. SVM can also process input data without prior knowledge of it,
making it useful for dealing with a variety of datasets. Furthermore, many types of
classification algorithms fall into local minimum traps, which the SVM can avoid.
In addition, SVM supports two types of classification, single-class and multi-class, and
it can predict multiple behaviours at the same time. These advantages have led many
researchers to evaluate their approaches using an SVM classifier. SVM operations are
classified into two types: linear and nonlinear. The linear type is used to classify simple
datasets, while the nonlinear type is used to deal with complex datasets. The nonlinear
type relies on a kernel function. This function is classified into three types:
polynomial, gaussian, and gaussian radial basis function (GRBF). The kernel function
chosen is determined by the type of dataset that will be fed into the classifier. The
most popular is GRBF, which has a small number of control parameters and generally
produces good results. Finally, SVM has been used by various researchers to classify
network traffic into normal and abnormal classes, with promising results [56].
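
To make the role of the kernel function concrete, the sketch below evaluates two kernels on plain feature vectors. It assumes the usual textbook forms, `exp(-gamma * ||x - y||^2)` for the radial basis function kernel and `(x . y + coef0) ** degree` for the polynomial kernel. The parameter names and default values are illustrative assumptions and are not taken from the paper.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: similarity decays with squared distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel built on the dot product of the two vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree
```

The RBF kernel has a single control parameter (`gamma` here), which is consistent with the remark above that GRBF needs only a small number of control parameters.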

6 Evaluation and Validation of Proposed Approach


To evaluate the efficiency of the proposed approach in detecting DDoS flooding attacks
in the RPL network, we used the confusion matrix, which describes the performance of
a classification model. As shown in Table 3, the confusion matrix comprises four
counts: true positive (TP), true negative (TN), false positive (FP), and false negative
(FN). Other measures such as accuracy, sensitivity, precision, and F-measure can be
derived from these counts.

Table 3. Confusion Matrix

                  Predicted Normal   Predicted Attack
Actual Normal     TP                 FN
Actual Attack     FP                 TN

TPR is used to determine the amount of normal data that is observed to be normal
data. It is calculated as follows:
$$\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}$$
TNR is used to calculate the amount of attack data that is recognised as attack data.
It is calculated as follows:
$$\mathrm{TNR} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}$$
FPR is used to estimate the amount of attack data that is recognised as normal data.
It is calculated as follows:
$$\mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}$$
FNR is used to estimate the amount of the normal data that is classified as attack
data. It is calculated as follows:
$$\mathrm{FNR} = \frac{\mathrm{FN}}{\mathrm{FN} + \mathrm{TP}}$$
Accuracy is expressed as a percentage and refers to the proportion of records that are
correctly predicted. It is calculated as follows:
$$\mathrm{Accuracy} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}$$
Precision is defined as the ratio of correct positive decisions. It is calculated by
dividing the TP by the sum of the FP and TP, as follows:
$$\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}$$
Sensitivity is defined as the number of TP evaluations divided by the total number
of actual positive evaluations (TP + FN). It is calculated as follows:
$$\mathrm{Sensitivity} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}$$
The F-measure is a test for accuracy. It is the harmonic mean of precision and
sensitivity and reflects the balance between them. It is calculated as follows:
$$\mathrm{F\text{-}Measure} = \frac{2 \times (\mathrm{Precision} \times \mathrm{Sensitivity})}{\mathrm{Precision} + \mathrm{Sensitivity}}$$
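
As a worked illustration, the sketch below derives these measures from the four confusion-matrix counts. It uses the standard count-based definitions and follows Table 3 in treating normal traffic as the positive class. The function name and the example counts are hypothetical and are not results from this paper.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Derive the evaluation measures from the four confusion-matrix counts."""
    tpr = tp / (tp + fn)            # normal data recognised as normal
    tnr = tn / (tn + fp)            # attack data recognised as attack
    fpr = fp / (fp + tn)            # attack data recognised as normal
    fnr = fn / (fn + tp)            # normal data classified as attack
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    sensitivity = tpr               # sensitivity is the same ratio as TPR
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "TPR": tpr, "TNR": tnr, "FPR": fpr, "FNR": fnr,
        "accuracy": accuracy, "precision": precision,
        "sensitivity": sensitivity, "f_measure": f_measure,
    }


# Hypothetical counts for 100 normal and 100 attack records.
metrics = confusion_metrics(tp=90, tn=95, fp=5, fn=10)
```

For these illustrative counts the accuracy is 185/200 = 0.925, the precision is 90/95, and the F-measure is 180/195.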

7 Conclusion

In this paper, an ensemble feature selection approach for detecting DDoS flooding attacks
in the RPL network is proposed. The proposed method employs several bio-inspired
algorithms to select the optimal subset of features that contribute to high detection
accuracy. In addition, SVM, an ML classification algorithm, is used to evaluate the
selected features. Furthermore, an intersection stage is proposed to find the features
shared between the subsets produced by the bio-inspired algorithms in order to refine
the selected subsets of dataset features. New feature subsets are generated according to
the set of intersection types and are then passed to the SVM classifier to find the
feature set with the highest detection accuracy. Finally, the proposed approach is
expected to achieve high DDoS attack detection accuracy with an optimal subset of
features associated with the pattern of DDoS attacks.

Acknowledgment. This research was pursued under the Research University (RU) Grant,
Universiti Sains Malaysia (USM) No: 1001.PNAV.8011107.

References
1. Al-Hadhrami, Y., Hussain, F.K.: DDoS attacks in IoT networks: a comprehensive systematic
literature review (2021)
2. Alamiedy, T.A., Anbar, M., Al-Ani, A.K., Al-Tamimi, B.N., Faleh, N.: Review on feature
selection algorithms for anomaly-based intrusion detection system. In: Saeed, F., Gazem,
N., Mohammed, F., Busalim, A. (eds.) Recent Trends in Data Science and Soft Computing.
Advances in Intelligent Systems and Computing, pp. 605–619. Springer, Cham (2019).
https://doi.org/10.1007/978-3-319-99007-1_57
3. Mahmoud, R., Yousuf, T., Aloul, F., Zualkernan, I.: Internet of things (IoT) security: Cur-
rent status, challenges and prospective measures. In: 2015 10th International Conference for
Internet Technology and Secured Transactions, ICITST 2015, pp. 336–341. IEEE (2016)
4. Cisco: Cisco Annual Internet Report (2018–2023). Comput. Fraud Secur. 2020, 4 (2020)
5. Fields, B.K.K., Demirjian, N.L., Gholamrezanezhad, A.: Coronavirus Disease 2019
(COVID-19) diagnostic technologies: a country-based retrospective analysis of screening
and containment procedures during the first wave of the pandemic (2020).
https://doi.org/10.1016/j.clinimag.2020.08.014
6. Whitelaw, S., Mamas, M.A., Topol, E., Van Spall, H.G.C.: Applications of digital technology
in COVID-19 pandemic planning and response (2020)
7. Chick, R.C., et al.: Using technology to maintain the education of residents during the COVID-
19 pandemic. J. Surg. Educ. 77, 729–732 (2020). https://doi.org/10.1016/j.jsurg.2020.03.018
8. Kaharuddin, Ahmad, D., Mardiana, Rusni: Contributions of technology, culture, and attitude
to English learning motivation during COVID-19 outbreaks. Syst. Rev. Pharm. 11, 76–84
(2020). https://doi.org/10.31838/srp.2020.11.13
9. Alashhab, Z.R., Anbar, M., Singh, M.M., Leau, Y.B., Al-Sai, Z.A., Alhayja’a, S.A.: Impact of
coronavirus pandemic crisis on technologies and cloud computing applications. J. Electron.
Sci. Technol. 19, 25–40 (2021). https://doi.org/10.1016/j.jnlest.2020.100059
10. Lueth, K.L.: The impact of Covid-19 on the Internet of Things Part 2.
https://iot-analytics.com/the-impact-of-covid-19-on-the-internet-of-things-part-2/
11. Ligero, R.: Accent Systems developed a connected wristband to contain Covid-19.
https://accent-systems.com/blog/accent-systems-developed-connected-wristband-technology-contain-covid19/?v=75dfaed2dded
12. Chen, Y., Chanet, J.P., Hou, K.M., Zhou, P.: A context-aware tool-set for routing-targeted
mutual configuration and optimization of LLNs through bridging virtual and physical
worlds. In: New and smart Information Communication Science and Technology to support
Sustainable Development (NICST 2014) (2014). 5 p.
13. Ammar Rafea, S., Abdulrahman Kadhim, A.: Routing with energy threshold for WSN-IoT
based on RPL protocol. Iraqi J. Comput. Commun. Control Syst. Eng. 71–81 (2019).
https://doi.org/10.33103/uot.ijccce.19.1.9
14. Tennina, S., Gaddour, O., Koubâa, A., Royo, F., Alves, M., Abid, M.: Z-Monitor: A protocol
analyzer for IEEE 802.15.4-based low-power wireless networks. Comput. Netw. 95, 77–96
(2016). https://doi.org/10.1016/j.comnet.2015.12.002
15. Fallis, A.: RFC6550 RPL: IPv6 routing protocol for low-power and lossy networks. J. Chem.
Inf. Model. 53, 1689–1699 (2013)
16. Palattella, M.R., et al.: Standardized protocol stack for the internet of (important) things
(2013)
17. Mahmoud, C., Aouag, S.: Security for internet of things: a state of the art on existing protocols
and open research issues. In: ACM International Conference Proceedings Series (2019).
https://doi.org/10.1145/3361570.3361622
18. Kim, H.S., Cho, H., Kim, H., Bahk, S.: DT-RPL: diverse bidirectional traffic delivery through
RPL routing protocol in low power and lossy networks. Comput. Netw. 126, 150–161 (2017).
https://doi.org/10.1016/j.comnet.2017.07.001
19. Tian, H., Qian, Z., Wang, X., Liang, X.: QoI-Aware DODAG construction in RPL-based event
detection wireless sensor networks. J. Sens. 2017 (2017).
https://doi.org/10.1155/2017/1603713
20. Xiao, W., Liu, J., Jiang, N., Shi, H.: An optimization of the object function for routing protocol
of low-power and Lossy networks. In: 2014 2nd International Conference on Systems and
Informatics, ICSAI 2014, pp. 515–519 (2015). https://doi.org/10.1109/ICSAI.2014.7009341
21. Lamaazi, H., Benamar, N., Jara, A.J.: RPL-based networks in static and mobile environment: a
performance assessment analysis. J. King Saud Univ. - Comput. Inf. Sci. 30, 320–333 (2018).
https://doi.org/10.1016/j.jksuci.2017.04.001
22. Ma, G., Li, X., Pei, Q., Li, Z.: A security routing protocol for internet of things based on RPL.
In: Proceedings - 2017 International Conference on Networking and Network Applications,
NaNA 2017, pp. 209–213. Institute of Electrical and Electronics Engineers Inc. (2017)
23. Le, A., Loo, J., Lasebae, A., Vinel, A., Chen, Y., Chai, M.: The impact of rank attack on
network topology of routing protocol for low-power and lossy networks. IEEE Sens. J. 13,
3685–3692 (2013). https://doi.org/10.1109/JSEN.2013.2266399
24. Raoof, A., Matrawy, A., Lung, C.H.: Routing attacks and mitigation methods for RPL-based
internet of things. IEEE Commun. Surv. Tutor. 21, 1582–1606 (2019).
https://doi.org/10.1109/COMST.2018.2885894
25. Al-Fuqaha, A., Guizani, M., Mohammadi, M., Aledhari, M., Ayyash, M.: Internet of things:
a survey on enabling technologies, protocols, and applications. IEEE Commun. Surv. Tutor.
17, 2347–2376 (2015). https://doi.org/10.1109/COMST.2015.2444095
26. AlSawafi, Y., Touzene, A., Day, K., Alzeidi, N.: Hybrid RPL-based sensing and routing
protocol for smart city. Int. J. Pervasive Comput. Commun. 16, 279–306 (2020).
https://doi.org/10.1108/IJPCC-11-2019-0088
27. Winter, T., Thubert, P.: RPL: IPv6 routing protocol for low power and lossy networks, draft-
ietf-roll-rpl-04.txt. IETF, Internet Draft (work progress) (2009)
28. Fatima-Tuz-Zahra, Jhanjhi, N.Z., Brohi, S.N., Malik, N.A.: Proposing a rank and wormhole
attack detection framework using machine learning. In: MACS 2019 - 13th International
Conference on Mathematics, Actuarial Science, Computer Science and Statistics Proceedings
(2019). https://doi.org/10.1109/MACS48846.2019.9024821
29. Fatima-Tuz-Zahra, Jhanjhi, N.Z., Brohi, S.N., Malik, N.A., Humayun, M.: Proposing a hybrid
RPL protocol for rank and wormhole attack mitigation using machine learning. In: 2020 2nd
International Conference on Computer and Information Sciences, ICCIS 2020, pp. 1–6. IEEE
(2020)
30. Perazzo, P., Vallati, C., Arena, A., Anastasi, G., Dini, G.: An implementation and evaluation
of the security features of RPL. In: Puliafito, A., Bruneo, D., Distefano, S., Longo, F. (eds.)
ADHOC-NOW 2017. LNCS, vol. 10517, pp. 63–76. Springer, Cham (2017).
https://doi.org/10.1007/978-3-319-67910-5_6
31. Mayzaud, A., Badonnel, R., Chrisment, I.: A taxonomy of attacks in RPL-based internet of
things (2016)
32. Wallgren, L., Raza, S., Voigt, T.: Routing attacks and countermeasures in the RPL-based
internet of things. Int. J. Distrib. Sens. Netw. 2013, 11 (2013).
https://doi.org/10.1155/2013/794326
33. Alzubaidi, M., Anbar, M., Hanshi, S.M.: Neighbor-passive monitoring technique for detecting
sinkhole attacks in RPL networks. In: Proceedings of the 2017 International Conference on
Computer Science and Artificial Intelligence - CSAI 2017. ACM Press, New York (2017)
34. Alzubaidi, M., Anbar, M., Chong, Y.W., Al-Sarawi, S.: Hybrid monitoring technique for
detecting abnormal behaviour in RPL-based network. J. Commun. 13, 198–208 (2018).
https://doi.org/10.12720/jcm.13.5.198-208
35. Alzubaidi, M., Anbar, M., Al-Saleem, S., Al-Sarawi, S., Alieyan, K.: Review on mechanisms
for detecting sinkhole attacks on RPLs. In: ICIT 2017 - 8th International Conference on
Information Technology, Proceedings, pp. 369–374. Institute of Electrical and Electronics
Engineers Inc. (2017)
36. Pongle, P., Chavan, G.: A survey: attacks on RPL and 6LoWPAN in IoT. In: 2015 International
Conference on Pervasive Computing: Advance Communication Technology and Application
for Society, ICPC 2015 (2015)
37. Napiah, M.N., Bin Idris, M.Y.I., Ramli, R., Ahmedy, I.: Compression header analyzer intrusion
detection system (CHA - IDS) for 6LoWPAN communication protocol. IEEE Access 6,
16623–16638 (2018). https://doi.org/10.1109/ACCESS.2018.2798626
38. Amin, S.O., Siddiqui, M.S., Hong, C.S., Lee, S.: RIDES: Robust intrusion detection system
for IP-based Ubiquitous Sensor Networks. Sensors 9, 3447–3468 (2009).
https://doi.org/10.3390/s90503447
39. Kasinathan, P., Costamagna, G., Khaleel, H., Pastrone, C., Spirito, M.A.: Demo: an IDS
framework for internet of things empowered by 6LoWPAN. In: Proceedings of the ACM
Conference on Computer & Communications Security, pp. 1337–1339 (2013).
https://doi.org/10.1145/2508859.2512494
40. Medjek, F., Tandjaoui, D., Romdhani, I., Djedjig, N.: A trust-based intrusion detection system
for mobile RPL based networks. In: Proceedings - 2017 IEEE International Conference on
Internet of Things, IEEE Green Computing and Communications, IEEE Cyber, Physical
and Social Computing, IEEE Smart Data, iThings-GreenCom-CPSCom-SmartData 2017,
pp. 735–742. Institute of Electrical and Electronics Engineers Inc. (2018)
41. Airehrour, D., Gutierrez, J., Ray, S.K.: Securing RPL routing protocol from blackhole attacks
using a trust-based mechanism. In: 26th International Telecommunication Networks and
Applications Conference, ITNAC 2016, pp. 115–120. Institute of Electrical and Electronics
Engineers Inc. (2017)
42. Airehrour, D., Gutierrez, J., Ray, S.: A trust-aware RPL routing protocol to detect blackhole
and selective forwarding attacks. Aust. J. Telecommun. Digit. Econ. 5 (2017).
https://doi.org/10.18080/ajtde.v5n1.2
43. Alabsi, B.A., Anbar, M., Manickam, S., Elejla, O.E.: DDoS attack aware environment with
secure clustering and routing based on RPL protocol operation. IET Circuits Devices Syst.
13, 748–755 (2019). https://doi.org/10.1049/iet-cds.2018.5079
44. Autonomous Networks Research Group: Cooja Simulator – Contiki.
http://anrg.usc.edu/contiki/index.php/Cooja_Simulator
45. Wireshark Foundation: Wireshark  Go deep. https://www.wireshark.org/
46. Pazhaniraja, N., Paul, P., Roja, G., Shanmugapriya, K., Sonali, B.: A study on recent bio-
inspired optimization algorithms. ieeexplore.ieee.org (2017)
47. Rai, D., Garg, A.K., Tyagi, K.: Bio-inspired optimization techniques-a critical comparative
study 38, 1–7 (2013). https://doi.org/10.1145/2492248.2492271, dl.acm.org
48. Alzubi, Q.M., Anbar, M., Alqattan, Z.N.M., Al-Betar, M.A., Abdullah, R.: Intrusion detection
system based on a modified binary grey wolf optimisation. Neural Comput. Appl. 32(10),
6125–6137 (2019). https://doi.org/10.1007/s00521-019-04103-1
49. Alamiedy, T.A., Anbar, M., Alqattan, Z.N.M., Alzubi, Q.M.: Anomaly-based intrusion detec-
tion system using multi-objective grey wolf optimisation algorithm. J. Ambient Intell. Human.
Comput. 11(9), 3735–3756 (2019). https://doi.org/10.1007/s12652-019-01569-8
50. Altaher, A.: Malware detection based on evolving clustering method for classification. Sci.
Res. Essays 7, 2031–2036 (2012). https://doi.org/10.5897/sre12.001
51. Razak, M.F.A., Anuar, N.B., Othman, F., Firdaus, A., Afifi, F., Salleh, R.: Bio-inspired for
features optimization and malware detection. Arab. J. Sci. Eng. 43(12), 6963–6979 (2017).
https://doi.org/10.1007/s13369-017-2951-y
52. Soliman, O.S., Rassem, A.: A network intrusions detection system based on a quantum bio
inspired algorithm. Int. J. Eng. Trends Technol. 10, 370–379 (2014).
https://doi.org/10.14445/22315381/ijett-v10p271
53. Clerc, M.: Particle Swarm Optimization (2010). https://doi.org/10.1002/9780470612163
54. Safaldin, M., Otair, M., Abualigah, L.: Improved binary gray wolf optimizer and SVM for
intrusion detection system in wireless sensor networks. J. Ambient Intell. Human. Comput.
12(2), 1559–1576 (2020). https://doi.org/10.1007/s12652-020-02228-z
55. Yang: Firefly algorithm - Google Scholar.
https://scholar.google.com/scholar?cluster=3276324836150250709&hl=en&oi=scholarr
56. Mohammadi, M., et al.: A comprehensive survey and taxonomy of the SVM-based intrusion
detection systems (2021)
