(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 9, No. 10, October 2011 .
An Immune Inspired Multilayer IDS
Najlaa Badie Aldabagh
Computer Sciences
College of Computer Sciences and Mathematics
Iraq, Mosul, Mosul University
Mafaz Muhsin Khalil Alanezi
Computer Sciences
College of Computer Sciences and Mathematics
Iraq, Mosul, Mosul University
Abstract—The use of artificial immune systems in intrusion
detection is an appealing concept for two reasons. Firstly, the
human immune system provides the human body with a high
level of protection from invading pathogens, in a robust, self-organized and distributed manner. Secondly, current
techniques used in computer security are not able to cope with
the dynamic and increasingly complex nature of computer
systems and their security.
The objective of our system is to combine several
immunological metaphors in order to develop a forbidding
IDS. The inspiration comes from: (1) adaptive immunity,
which is characterized by learning, adaptability, and memory
and is broadly divided into two branches: humoral and cellular
immunity; and (2) the analogy of the human immune system's
multilevel defense, which can be extended to the intrusion
detection system itself. This matches the objective of intrusion
detection, which needs multiple detection mechanisms to obtain
a very high detection rate with a very low false alarm rate.
Dasgupta et al. [2, 3] describe the use of
several types of detectors, analogous to T helper cells, T
suppressor cells, B cells and antigen presenting cells, on two
types of data (binary and real) to detect anomalies in time series
data generated by the Mackey-Glass equation.
The NSL-KDD data sets provide a platform for testing
intrusion detection systems, generating both
background traffic and intrusions with provisions for multiple
interleaved streams of activity [4]. They provide a (more or
less) repeatable environment in which real-time tests of an
intrusion detection system can be performed. Each record in the
data set contains 41 features and is
labeled as either normal or an attack, with exactly one specific
attack type. The data set contains 24 attack types, which
fall into four main categories: DoS, U2R, R2L, and
Probing [24, 26]. The data sets are available at [25].
II. IMMUNITY IDS OVERVIEW
In computer security there is no single component or
application that can be employed to keep a computer system
completely secure. For this reason it is recommended that a
multilevel defense approach be taken to computer security.
The biological immune system employs a multilevel defense
against invaders through nonspecific (innate) and specific
(adaptive) immunity. Intrusion detection likewise
needs multiple detection mechanisms to obtain a very high
detection rate with a very low false alarm rate.
The objective of our system is to combine several
immunological metaphors in order to develop a forbidding
IDS. The inspiration comes from: (1) adaptive immunity,
which is characterized by learning, adaptability, and memory
and is broadly divided into two branches: humoral and cellular
immunity; and (2) the analogy of the human immune system's
multilevel defense, which can be extended to the intrusion
detection system itself.
The IDS is designed with three phases: the Initialization and
Preprocessing phase, the Training phase, and the Testing phase. The
Training phase has two defense layers. The first layer is
Cellular immunity (T & B cells reproduction), where the ALCs
attempt to identify the attack. If this layer is unable to
identify the attack, the second layer, Humoral immunity
(Complement System), which is a more complex level of
detection within the IDS, is enabled. The complement
system, which represents a chief component of innate immunity, not
Keywords: Artificial Immune System (AIS); Clonal Selection
Algorithm (CLONA); Immune Complement Algorithm (ICA);
Negative Selection (NS); Positive Selection (PS); NSL-KDD dataset.
I. INTRODUCTION
When designing an intrusion detection system it is desirable
to have an adaptive system. The system should be able to
recognize attacks it has not seen before and then respond
appropriately. This kind of adaptive approach is used in
anomaly detection, although where the adaptive immune
system is specific in its defense, anomaly detection is nonspecific. Anomaly detection identifies behavior that differs
from “normal” but is unable to identify the specific type of behavior,
or the specific attack. However, the adaptive nature of the
adaptive immune system and its memory capabilities make it a
useful inspiration for an intrusion detection system [1].
However on subsequent exposure to the same pathogen,
memory cells are already present and are ready to be activated
and defend the body. It is important for an intrusion detection
system to be adaptive. There are always new attacks being
generated and so an IDS should be able to recognize these
attacks. It should also then be able to use the information
gathered through the recognition process so that it can quickly
identify the attacks in the future [1].
http://sites.google.com/site/ijcsis/
ISSN 1947-5500
only participates in inflammation but also acts to enhance the
adaptive immune response [23]. All memory ALCs obtained
from the Training phase layers are used in the Testing phase to detect
attacks. This multilevel approach could provide more specific
levels of defense and response to attacks or intrusions.
The problem with anomaly detection systems is that often
normal activity is classified as intrusive activity and so the
system is continuously raising alarms. The co-operation and
co-stimulation between cells in the immune system ensures
that an immune response is not initiated unnecessarily, thus
providing some regulation to the immune response.
Implementing an error-checking process provided by cooperation between two levels of detectors could reduce the
level of false positive alerts in an intrusion detection system.
The algorithm works on similar principles, generating
detectors, and eliminating the ones that detect self, so that the
remaining detectors can detect any non-self.
The initial exposure to an Ag that stimulates an adaptive
immune response is handled by a small number of low-affinity
lymphocytes. This process is called the primary response, and it is
what happens in the Training phase. Memory cells with high
affinity for the encountered Ag, however, are produced as a result of
the response through the process of proliferation, somatic
hypermutation, and selection. A second encounter with the same
antigen therefore induces a heightened state of immune response due to
the presence of memory cells associated with the first
infection. This process is called the secondary response, and it is
what happens in the Testing phase. Compared with the
primary response, the secondary response is characterized by a
shorter lag phase and a lower dose of antigen required to
cause the response, which can be noticed in the run speed
of these two phases.
The overall diagram of the Immunity-Inspired IDS is shown in Figure (1).
Note that the terms ALCs and detectors have the same meaning in
this system.
will be close to 1. Since information gain is calculated for
discrete features, continuous features are discretized with the
emphasis of providing sufficient discrete values for detection
[20].
The 10 most significant features the system obtained are:
duration, src_bytes, dst_bytes, hot, num_compromised,
num_root, count, srv_count, dst_host_count, dst_host_srv_count.
a) Information Gain
Let S be a set of training samples with their
corresponding labels. Suppose there are m classes (here m = 2),
the training set contains s_i samples of class i, and s is the
total number of samples in the training set. The expected
information needed to classify a given sample is calculated by
[20, 21]:
$$I(s_1, \dots, s_m) = -\sum_{i=1}^{m} \frac{s_i}{s} \log_2 \frac{s_i}{s} \tag{1}$$
A feature F with values { f_1, f_2, …, f_v } can divide the
training set into v subsets { S_1, S_2, …, S_v }, where S_j is the subset
which has the value f_j for feature F. Furthermore, let S_j contain
s_{ij} samples of class i. The entropy of the feature F is
$$E(F) = \sum_{j=1}^{v} \frac{s_{1j} + \dots + s_{mj}}{s} \, I(s_{1j}, \dots, s_{mj}) \tag{2}$$
Information gain for F can be calculated as:
$$Gain(F) = I(s_1, \dots, s_m) - E(F) \tag{3}$$
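As an illustration of equations (1) to (3), the following Python sketch computes the information gain of one discrete feature. It is a minimal example and not the system's implementation; the function names and the toy records are assumptions introduced here.

```python
import math
from collections import Counter


def expected_information(labels):
    """I(s_1, ..., s_m): entropy of the class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Gain(F) = I(s_1, ..., s_m) - E(F) for one discrete feature F."""
    total = len(labels)
    # Group the class labels by the value f_j the feature takes (subsets S_j).
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    # E(F): entropy of each subset, weighted by the subset's share of S.
    entropy_f = sum(len(s) / total * expected_information(s)
                    for s in subsets.values())
    return expected_information(labels) - entropy_f


# Toy data (hypothetical): one feature separates the classes, one does not.
labels = ["normal", "normal", "anomalous", "anomalous"]
separating = ["low", "low", "high", "high"]
uninformative = ["a", "b", "a", "b"]
gain_separating = information_gain(separating, labels)
gain_uninformative = information_gain(uninformative, labels)
```

With two balanced classes the first feature reaches the maximum gain of 1 bit, while the second yields a gain of 0, which is the behavior the relevance criterion relies on.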
b) Univariate discretization process
Discrete values offer several advantages over continuous
ones, such as data reduction and simplification. Quality
discretization of continuous attributes is an important problem
that has effects on speed, accuracy, and understandability of
the classification models [22].
Discretization can be univariate or multivariate. Univariate
discretization quantifies one continuous feature at a time while
multivariate discretization simultaneously considers multiple
features. We mainly consider univariate (typical)
discretization in this paper. A typical discretization process
broadly consists of four steps [22]:
• Sort the values of the attribute to be discretized.
• Determine a cut-point for splitting or adjacent intervals
for merging.
• Split or merge intervals of continuous values, according to
some criterion.
• Stop at some point.
Since information gain is calculated for discrete features,
continuous features should be discretized [20, 22]. To this end,
continuous features are partitioned into equal-sized partitions
by utilizing equal frequency intervals. In the equal frequency
intervals method, the feature space is partitioned into an arbitrary
number of partitions where each partition contains the same
number of data points. That is to say, the range of each
partition is adjusted to contain N dataset instances. If a value
occurs more than N times in a feature space, it is assigned a
A. Initialization and Preprocessing phase
This phase comprises the following operations:
1) Preprocessing NSL dataset
The data are partitioned into two classes: normal and attack,
where attack is the collection of all 22 different attacks
belonging to the four classes described in Section I. The labels
of each data instance in the original data set are replaced by
either `normal' for normal connections or `anomalous' for
attacks. Because of the abundance of the 41 features, it is
necessary to reduce the dimensionality of the data set by
discarding the irrelevant attributes. Therefore, the information gain
of each attribute is calculated and the attributes with low
information gain are removed from the data set. The
information gain of an attribute indicates the statistical
relevance of this attribute regarding the classification [21].
Based on the entropy of a feature, information gain
measures the relevance of a given feature, in other words its
role in determining the class label. If the feature is relevant, in
other words highly useful for an accurate determination,
calculated entropies will be close to 0 and the information gain
partition of its own. In the “21% NSL” dataset, certain classes
such as denial of service attacks and normal connections occur
in the magnitude of thousands, whereas other classes such as
R2L and U2R attacks occur in the magnitude of tens or
hundreds. Therefore, to provide sufficient resolution for the
minor classes, N is set to 10 [20]. The result of this step is the
set of indexes of the highest-gain features, which are used later
when preprocessing the training and testing files.
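The equal frequency rule described above can be sketched in Python as follows. This is an illustrative reading of the text, not the authors' code: the function name is hypothetical, and the handling of partially filled partitions (a partition is closed as soon as adding the next value would exceed N instances) is an assumption.

```python
from collections import Counter


def equal_frequency_bins(values, n_per_bin=10):
    """Map each value to a partition index holding about n_per_bin instances.

    A value that occurs more than n_per_bin times gets a partition of its own.
    """
    counts = Counter(values)
    bin_of = {}
    bin_id = 0
    filled = 0  # instances already placed in the currently open partition
    for v in sorted(counts):
        c = counts[v]
        if c > n_per_bin:
            # Close the open partition (if any) and give v its own partition.
            if filled > 0:
                bin_id += 1
                filled = 0
            bin_of[v] = bin_id
            bin_id += 1
        else:
            # Start a new partition when v would overflow the open one.
            if filled > 0 and filled + c > n_per_bin:
                bin_id += 1
                filled = 0
            bin_of[v] = bin_id
            filled += c
    return [bin_of[v] for v in values]


# Small example with N = 3 (the paper uses N = 10).
sample = [1, 1, 1, 2, 3, 4, 5, 5, 5, 5, 5, 6]
bins = equal_frequency_bins(sample, n_per_bin=3)
```

In the example the value 5 occurs five times, more than N = 3, so it receives a partition of its own, while the values 2, 3 and 4 share one partition.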
ranges from outweighing attributes with initially smaller
ranges [9]. There are many methods for data normalization,
including min-max normalization, z-score normalization,
logarithmic normalization, and normalization by decimal
scaling [8, 9].
Min-max normalization: Min-max normalization
performs a linear transformation on the original data. Suppose
that min_A and max_A are the minimum and the maximum values
for feature A. Min-max normalization maps a value v of A to
v' in the range [new_min_A, new_max_A] by computing [9]:
$$v' = \frac{v - min_A}{max_A - min_A} \, (new\_max_A - new\_min_A) + new\_min_A \tag{4}$$
2) Self and NonSelf Antigens
As mentioned in Section I, each record of the NSL or KDD
dataset contains 41 features and is labeled as either normal or
an attack; here these labels correspond to Self and NonSelf
respectively.
The dataset used in the training phase of the system contains
about 200 records drawn from normal and attack records; the attack
records include records from all types of attack in the original
dataset. The same rule is applied to both the NSL and KDD datasets.
The entire “21% NSL” test dataset, however, is used when testing
the system in the testing phase.
In both the training and testing phases, each file is preprocessed
before it enters the system: the features with the highest gain
indexes are selected and each continuous feature is converted to a
discrete one.
When the target range is [0-1], the equation becomes:
$$v' = \frac{v - min_A}{max_A - min_A} \tag{5}$$
In order to generalize all the comparisons (NS & PS)
done in the IIDS, and to simplify the choice of threshold values,
the calculated affinities between each ALC and all Ags
are normalized into the range [1-100] for Th and B cells,
and into the range [0-1] for Ts cells and CDs.
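A short Python sketch of min-max normalization as given in equations (4) and (5) is shown below. The function name and the treatment of a constant feature (all values mapped to the lower bound) are assumptions made for this example.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly map values from [min_A, max_A] to [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no range, map it to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [(v - lo) * scale + new_min for v in values]


# Range [0-1], as used for Ts cells and CDs (equation 5).
unit_range = min_max_normalize([2, 4, 6])
# Range [1-100], as used for Th and B cell affinities (equation 4).
percent_range = min_max_normalize([0, 5, 10], new_min=1.0, new_max=100.0)
```

The same helper covers both target ranges mentioned in the text, since equation (5) is the special case of equation (4) with bounds 0 and 1.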
5) Detector Generation Mechanism
All NonSelf (attack) records in the training file are considered
the initial detectors (or ALCs); the training phase then
eliminates those that match Self samples.
There are three types of detectors (integer, string, real).
The output of this step is a specified number of detectors of
each type, whose length equals the length of the Self and NonSelf
patterns, which is the number of gain indexes.
3) Antigens Presentation
T cells and B cells are assumed to recognize antigens in
different ways. In the biological immune system, T cells can only
recognize internal features (peptides) processed from foreign
protein. In our system, T-cell recognition is defined as bit-level recognition (real, integer). This is a low-level recognition
scheme. In the immune system, however, B cells can only
recognize surface features of antigens. Because of the large
size and complexity of most antigens, only parts of the
antigen, discrete sites called epitopes, get bound to B cells. B-cell recognition is therefore proposed as a higher-level recognition
(string) at different non-contiguous (occasionally contiguous)
positions of antigen strings.
Different data types are thus used for each ALC in order to
compose several detection levels. In order to present the self
and nonself antigens to the ALCs, they are also converted to suit
the different data types of the ALCs: integer for T-helper cells,
string for B cells, and real [0-1] for T-suppressor cells.
Real values must be in the range [0-1], so normalization is
used for the conversion operation.
6) Affinity Measure by Matching Rules
In several of the following steps, the affinity needs to be calculated
between (ALCs & Self patterns) and (ALCs & NonSelf Ags),
so the matching rules are determined depending on the data type.
• The affinity between a Th ALC (integer) and NonSelf
Ags or Self patterns is measured by Landscape-affinity
matching (Physical matching rule) [11, 12, 10].
Physical matching gives an indication of the similarity
between two patterns, i.e. a higher affinity value between
an ALC and a NonSelf Ag implies a stronger affinity.
(6)
4) Normalization
Data transformation such as normalization may improve the
accuracy and efficiency of classification algorithms involving
neural networks, mining algorithms, or distance measurements
such as nearest neighbor classification and clustering. Such
methods provide better results if the data to be analyzed have been
normalized, that is, scaled to specific ranges such as (0-1) [8, 9].
If the neural network back propagation algorithm is used for
classification mining, normalizing the input values for each
attribute measured in the training samples will help speed up
the learning phase. For distance-based methods,
normalization helps prevent attributes with initially large
• The affinity between a Ts ALC (real) and NonSelf Ags
or Self patterns is measured by Euclidean distance [11,
13, 12]. The Euclidean distance gives an indication of the
difference between two patterns, i.e. a lower affinity value
between an ALC and a NonSelf Ag implies a stronger
affinity. For two patterns x and y of length L it is
$$D(x, y) = \sqrt{\sum_{i=1}^{L} (x_i - y_i)^2} \tag{7}$$
• The affinity between a B ALC (string) and NonSelf
Ags or Self patterns is measured by the R-Contiguous string
matching rule. If x and y are equal-length strings defined
over a finite alphabet, match(x, y) is true if x and y agree in
at least r contiguous locations [11, 14, 12, 15]. R-Contiguous string matching gives an indication of the
similarity between two patterns, i.e. a higher affinity value
between an ALC and a NonSelf Ag implies a stronger
affinity.
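Two of the matching rules above are simple enough to sketch directly. The Python example below shows the Euclidean distance used for real-valued Ts ALCs and the R-Contiguous rule used for string B ALCs; the Landscape-affinity (Physical) rule is not shown. Function names and sample patterns are assumptions introduced for illustration.

```python
import math


def euclidean_distance(x, y):
    """Difference between two real-valued patterns (lower means closer)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def r_contiguous_match(x, y, r):
    """True if x and y agree in at least r contiguous positions."""
    if len(x) != len(y):
        raise ValueError("patterns must have equal length")
    run = 0  # length of the current run of agreeing positions
    for a, b in zip(x, y):
        run = run + 1 if a == b else 0
        if run >= r:
            return True
    return False


distance = euclidean_distance([0.0, 0.0], [3.0, 4.0])
match_long_run = r_contiguous_match("abcdef", "xbcdxx", r=3)
match_short_runs = r_contiguous_match("abcdef", "xbcxef", r=3)
```

In the second string comparison the two patterns agree in four positions overall, but never in three consecutive ones, so the rule reports no match for r = 3.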
100%], and Maxgeneration is the maximum number of generations
used in the random generation of ALCs in the Initialization and
Generation phase.
• Sorting Affinity
The affinity is measured here between all cloned ALCs and the
NonSelf Ags, and all ALCs are then sorted in descending order
according to their affinity with the NonSelf Ags.
B. Training Phase
Here the system is trained by a series of recognition
operations between the previously generated detectors and the self
and nonself Ags. This constitutes multilevel recognition, makes the
recognition system more robust, and ensures efficient
detection.
1) First Layer-Cellular immunity (T & B cells reproduction)
Both B cells and T cells undergo proliferation and selection
and exhibit immunological memory once they have
recognized and responded to an Ag. All system's ALCs
progress in the following stages:
a) Clonal and Expansion
Clonal selection in AIS is the selection of a set of ALCs
with the highest calculated affinity with a NonSelf pattern.
The selected ALCs are then cloned and mutated in an attempt
to have a higher binding affinity with the presented NonSelf
pattern. The mutated clones compete with the existing set of
ALCs, based on the calculated affinity between the mutated
clones and the NonSelf pattern, for survival to be exposed to
the next NonSelf pattern.
• Clonal Operator
Now it is time to clone the previously selected ALCs in order
to expand the number of ALCs in the training phase; the ALC
that has the higher affinity with the NonSelf Ags will have the
higher clonal rate.
Here the clonal rate is calculated for each of the selected
ALCs:
$$TotalCloneALC = \sum_{i=1}^{n} ClonalRateALC_i \tag{9}$$
where
ClonalRateALC_i = Round (Kscale / i), or
ClonalRateALC_i = Round (Kscale × i) [16].
The choice between the two equations for ClonalRateALC_i
depends on how many clones are required. Kscale is the clonal
rate, Round() is the operator that rounds the value in
parentheses toward its closest integer value, and
TotalCloneALC is the total number of cloned cells.
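The clonal rate calculation can be sketched in Python as follows. This is an illustration under stated assumptions: i is taken to be the rank of an ALC in the sorted list (1 for the best), n is the number of selected ALCs, ties at .5 are rounded up, and the function name is hypothetical.

```python
import math


def clonal_rates(n_selected, k_scale, decreasing=True):
    """Clonal rate for each selected ALC, indexed by rank i = 1..n.

    decreasing=True uses Round(Kscale / i); False uses Round(Kscale * i).
    """
    rates = []
    for i in range(1, n_selected + 1):
        raw = k_scale / i if decreasing else k_scale * i
        # Round to the closest integer; halves are rounded up (assumption).
        rates.append(math.floor(raw + 0.5))
    return rates


rates = clonal_rates(n_selected=4, k_scale=10)
total_clones = sum(rates)  # TotalCloneALC in equation (9)
```

With Kscale = 10 and four selected ALCs, the division variant gives the best-ranked ALC ten clones and progressively fewer to the others, whereas the multiplication variant grows the number of clones with i.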
• Affinity Maturation (Somatic hypermutation)
After producing clones from the selected ALCs, these
clones are altered by a simple mutation operator to provide some
initial diversity over the ALC population.
The process of affinity maturation plays an important role in
the adaptive immune response. From the viewpoint of evolution, a
remarkable characteristic of the affinity maturation process is
its controlled nature. That is to say, the hypermutation rate
applied to every immune cell receptor depends on its
antigenic affinity. By computationally simulating this process,
one can produce powerful algorithms that perform a search
akin to local search around each candidate solution. The system takes
into account this important aspect of mutation in the immune system:
it is inversely proportional to the antigenic affinity [5].
Without mutation the system is only capable of manipulating
the ALC material that was present in the initial population [6].
For Th and B ALCs, the system calculates the mutation rate
for each ALC depending on its affinity with the NonSelf Ags, where
a higher affinity (similarity) gives a lower mutation rate.
For Ts ALCs, one can evaluate the relative affinity of each
candidate ALC by scaling (normalizing) the affinities. The
inverse of an exponential function can be used to establish a
relationship between the hypermutation rate α(.) and the
normalized affinity D*, as described in the next equation. In some
cases it might be interesting to re-scale α to an interval such as
[0 – 1] [5].
α(D*) = exp(-ρD*)
(10)
• Selection Mechanism
The selection of cells for cloning in the immune system is
proportional to their affinities with the selective antigens. Affinity-proportionate
selection can thus be implemented
probabilistically using algorithms like roulette
wheel selection, or other evolutionary selection mechanisms
can be used, such as elitist selection, rank-based selection, bi-classist selection, and tournament selection [5].
Here the system uses elitist selection because it needs to
remember good detectors and discard bad ones if it is to make
progress towards the optimum. A very simple selector would
be to select the top N detectors from each population for
progression to the next population. This would work up to a
point, but any detectors which have very high affinity will
always make it through to the next population. This concept is
known as elitism.
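The "select the top N detectors" idea can be illustrated with a few lines of Python. The sketch below is a generic elitist selector and not the system's code; the function name, the flag for distance-type affinities, and the sample values are assumptions.

```python
def select_elite(detectors, affinities, n_select, higher_is_better=True):
    """Return the n_select detectors with the best affinity values.

    higher_is_better=False is meant for difference measures such as
    Euclidean distance, where a lower value means a stronger affinity.
    """
    ranked = sorted(zip(affinities, range(len(detectors))),
                    reverse=higher_is_better)
    return [detectors[index] for _, index in ranked[:n_select]]


detectors = ["d1", "d2", "d3", "d4"]
affinities = [10, 80, 55, 30]
elite_similarity = select_elite(detectors, affinities, n_select=2)
elite_distance = select_elite(detectors, affinities, n_select=2,
                              higher_is_better=False)
```

Sorting on the affinity and slicing the first N entries keeps the selection deterministic, which is the property that lets high-affinity detectors always reach the next population.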
To apply this idea, four selection percentage values are specified,
which determine the percentage of each type of ALCs that will be
selected for the Clonal and Expansion operations:
$$SelectedALCNo = \frac{ALCsize \times selectALCpercent}{Maxgeneration} \tag{8}$$
where SelectedALCNo is the number of ALCs that will be selected
for cloning, ALCsize is the number of ALCs that survived NS
and PS in the Initialization and Generation phase, and
selectALCpercent is a selected percentage value in the range [10-
where ρ is a parameter that controls the smoothness of the
inverse exponential, and D* is the normalized affinity, which can
be determined by D* = D/Dmax. Inverse here means that a lower affinity
value (difference) gives a higher mutation rate.
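Equation (10) can be evaluated directly, as in the Python sketch below. The function name and the default value of ρ are assumptions chosen only for illustration; the source does not state which ρ the system uses.

```python
import math


def hypermutation_rate(d, d_max, rho=5.0):
    """alpha(D*) = exp(-rho * D*), with D* = D / Dmax."""
    if d_max <= 0:
        raise ValueError("d_max must be positive")
    d_star = d / d_max  # normalized affinity in [0, 1]
    return math.exp(-rho * d_star)


# The rate is 1 when D = 0 and decays smoothly as D approaches Dmax.
rate_at_zero = hypermutation_rate(0.0, 2.0)
rate_at_half = hypermutation_rate(1.0, 2.0)
rate_at_max = hypermutation_rate(2.0, 2.0)
```

A larger ρ makes the decay steeper, so only ALCs with very small D* keep a high mutation rate.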
Mutators are generally not as complicated: they tend to just
choose a random point on the ALC and perturb this allele
(part of a gene) either completely randomly or by some given
amount [6].
To control the mutation operator, the mutation rate is calculated
as described above; it determines the number of alleles of the
ALC that will be mutated. The hypermutation operator for each
type of shape-space is as follows:
– Integer shape-space (Th): when the mutation rate of the
current Th-ALC is high enough, randomly choose allele
positions from the ALC and replace them with random
integer values. Alternatively, use inversive mutation, which
may occur between one or more pairs of alleles.
– String shape-space (B): when the mutation rate of the current
B-ALC is high enough, randomly choose allele
positions from the ALC. Here an allele has a length equal to the R
string, so all of the characters of the allele, or only part
of them, may be changed to other characters.
– Real shape-space (Ts): randomly choose allele
positions from the ALC, and generate a random real number to be
added to or subtracted from the given allele:
$$m' = m + \alpha(D^*) \, N(0, \sigma) \tag{11}$$
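For the real shape-space, equation (11) adds Gaussian noise scaled by the hypermutation rate to a randomly chosen allele. The Python sketch below mutates a single allele; the function name, the value of σ, the clamping of the result to [0, 1], and the use of a seeded generator are all assumptions made for this example.

```python
import random


def mutate_real_allele(detector, alpha, sigma=0.1, rng=None):
    """Return a copy of detector with one allele perturbed: m' = m + alpha * N(0, sigma)."""
    if rng is None:
        rng = random.Random()
    mutated = list(detector)
    position = rng.randrange(len(mutated))  # randomly chosen allele position
    value = mutated[position] + alpha * rng.gauss(0.0, sigma)
    # Assumption: keep the allele inside the normalized range [0, 1].
    mutated[position] = min(1.0, max(0.0, value))
    return mutated


original = [0.2, 0.5, 0.8]
unchanged = mutate_real_allele(original, alpha=0.0, rng=random.Random(0))
mutated = mutate_real_allele(original, alpha=1.0, rng=random.Random(1))
```

With α = 0 the detector is returned unchanged, which matches the intent that the size of the perturbation is governed by α(D*).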
of detectors at one site provides no information of
detectors at different sites.
– The self set and the detector set are mutually protective:
detectors can monitor self data as well as themselves for
change.
The negative selection (NS) based AIS for detecting
intrusion or viruses was the first successful piece of work
using the immunity concept for detecting harmful autonomous
agents in the computing environment.
The steps of the NS algorithm are applied here:
– Generate three types of ALCs (Th, Ts, B), and present
them together with the set of Self (normal record) patterns
to the NS mechanism.
– For all the generated ALCs, compute the affinity between
each ALC and all Self patterns. The choice of
matching rule to measure the affinity depends on the ALC's
data type representation.
– If the ALC does not match any Self pattern under the
threshold comparison, it survives to enter the next step;
ALCs that match any Self pattern are
discarded. Each type of ALC has its own threshold value
specifically for NS.
– Go to the first step until the maximum number of
generations of ALCs is reached.
Here, however, NS is performed between the three types of mutated
ALCs and the Self patterns, because some ALCs may match a
Self pattern after mutation.
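The NS steps above can be sketched for real-valued (Ts) detectors as follows. This is a minimal illustration, assuming that a detector "matches" a Self pattern when their Euclidean distance does not exceed the NS threshold; the function names, the threshold value, and the sample patterns are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length real patterns."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def negative_selection(detectors, self_patterns, threshold):
    """Keep detectors whose distance to every Self pattern exceeds the threshold."""
    return [d for d in detectors
            if all(euclidean(d, s) > threshold for s in self_patterns)]


self_patterns = [(0.1, 0.1), (0.2, 0.3)]
candidates = [(0.15, 0.2), (0.9, 0.8), (0.5, 0.9)]
survivors = negative_selection(candidates, self_patterns, threshold=0.3)
```

The first candidate lies close to the Self patterns and is discarded, while the other two survive to the next step.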
where m is the allele, m' is its mutated version, and α(D*) is a
function that accounts for affinity-proportional mutation.
• Positive Selection
The mutated ALCs that survived the previous negative
selection are put here to face the NonSelf Ags (attack
records) in order to distinguish which detectors can detect
them, and because some ALCs may no longer match the NonSelf
Ags after mutation, so there is no need to keep them. The steps
of the PS algorithm are applied here:
– Present the three types of ALCs (Th, Ts, B) that survived
NS together with the set of NonSelf Ags to the PS
mechanism.
– For all the ALCs, compute the affinity between each
ALC and all NonSelf Ags. The choice of matching
rule to measure the affinity depends on the ALC's data type
representation.
– If the ALC matches all NonSelf Ags under the
threshold comparison, it survives to enter the Training
Phase; ALCs that do not match any
NonSelf Ag are discarded. Each type of ALC has its
own threshold value specifically for PS.
– Go to the first step until PS has been applied to all ALCs.
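A Python sketch of the PS steps for string (B) detectors is given below. The text leaves some room for interpretation about whether a detector must match all NonSelf Ags or at least one; this sketch follows the discard rule and keeps a detector that matches at least one NonSelf Ag, which is an assumption. The position-wise similarity measure, the threshold, and the sample strings are also hypothetical stand-ins for the system's matching rules.

```python
def similarity(a, b):
    """Hypothetical similarity: number of positions where two patterns agree."""
    return sum(1 for p, q in zip(a, b) if p == q)


def positive_selection(detectors, nonself_antigens, threshold):
    """Keep detectors that match at least one NonSelf Ag (assumed reading)."""
    survivors = []
    for detector in detectors:
        if any(similarity(detector, ag) >= threshold for ag in nonself_antigens):
            survivors.append(detector)
    return survivors


attacks = ["aabbcc", "ccddee"]
candidates = ["aabbxx", "zzzzzz", "xcddee"]
kept = positive_selection(candidates, attacks, threshold=4)
```

Replacing `any` with `all` in the selection test gives the stricter reading in which a detector must match every NonSelf Ag.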
• Negative Selection
A number of features distinguish the NS algorithm
from other intrusion detection approaches. They are as follows
[4]:
– No prior knowledge of intrusions is required: this permits
the NS algorithm to detect previously unknown
intrusions.
– Detection is probabilistic, but tunable: the NS algorithm
allows a user to tune an expected detection rate by setting
the number of generated detectors, which is appropriate in
terms of generation, storage and monitoring costs.
– Detection is inherently distributable: each detector can
detect an anomaly independently without communication
between detectors.
– Detection is local: each detector can detect any change on
small sections of data. This contrasts with the other
classical change detection approaches, such as checksum
methods, which need an entire data set for detection. In
addition, the detection of an individual detector can
pinpoint where a change arises.
– The detector set at each site can be unique: this increases
the robustness of IDS. When one host is compromised,
this does not offer an intruder an easier opportunity to
compromise the other hosts. This is because the disclosure
• Immune Memory
Save all ALCs that survived NS and PS in text files, one text
file for each type of ALC (Th, Ts, B). Here the system
produces memory cells to protect against the reoccurrence of
the same antigens. Memory cells enable the immune system’s
response to previously encountered antigens (known as the
secondary response), which is known to be more efficient and
faster than non-memory cells’ response to new antigens. In an
individual these cells are long-lived, often lasting for many
years or even for the individual's lifetime.
i.e. if the affinity between one CD and all NonSelf Ags
does not exceed a threshold, then the detector detects
successfully; otherwise it does not.
– Immune Memory: if there are successful CDs, store all
CDs that can detect NonSelf Ags in PS in a text file and go to
the stopping condition (CDsno optimal complement
detectors have been obtained); otherwise continue.
– Sorting CDs: according to the affinities calculated in the
previous PS step, sort all the successful CD individuals
in A0NS by ascending affinity (the higher affinity is
the lower value because this affinity is a difference value).
– Merge Population: first put A0NS in the population and
then append A0PS after it.
2) Second Layer-Humoral immunity (Complement System)
This layer is automatically activated when the first layer
terminates. It simulates the classical pathway of the
complement system, which is activated by a recognition
between antigen and antibody (here, detectors). The classical
pathway is composed of three phases: the Identify phase, the Activate
phase, and the Membrane attack phase. These phases and all their
steps, called the Immune Complement Algorithm (ICA), are described in
detail in [23].
In this system the complement detectors progress through the ICA steps
with several additional steps designed for its purpose. The
objective of ICA is to continue generating, cleaving, and
binding the CD individuals until the optimal CD individuals are found.
The system's ICA is summarized here in the following four phases:
– Divide the population into At1 and At2 using the Div active
variable. At1 is a Cleave Set, and At2 is a Bind Set.
– For each individual in At1, apply the Cleave operator OC to
produce two sub-individuals a1 and a2. Then take the
second sub-individual a2 of all CD individuals in At1 and
bind them into one remainder cleave set bt by the Positive bind
operator OPB.
• ICA: Initialization phase
– Take the NonSelf Ags as the initial population A0, which has a
fixed number of complement detectors (CDs) as individuals;
their data type is real in the range [0-1].
– Stopping conditions: if the current population
contains the desired number of optimal detectors (CDsn)
or the maximum generation has been reached, then stop; otherwise
continue.
– Define the following operators:
1. Cleave operator OC: a CD individual is cleaved,
according to a cleave probability Pc, into
two sub-individuals a1 and a2.
2. Bind operator OB: there are two ways of binding
individuals a and b:
– Positive bind operator OPB: a new individual
c = OPB (a,b)
– Reverse bind operator ORB: a new individual
c = ORB (b,a)
• ICA: Active phase
• ICA: Membrane attack process
– Using the Reverse bind operator ORB, bind bt and each CD
individual of At2 to get a membrane attack complex set Ct.
– Recode each CD individual of Ct to the code length
of the initial CD individual, which gives a new set C'.
– Create a random population of complement individuals D,
then join it with C' to form a new set E = C' ∪
D. For the next loop, A0 is replaced with E.
– If the iteration has not finished, go to the stopping condition.
C. Testing Phase
This phase tests the immune memory of ALCs
created in the training phase. Here the memory ALCs meet
all types of antigens, Self and NonSelf;
it is important to note that the memory ALCs have not
previously encountered these new Ags.
The Testing phase uses Positive Selection to decide whether an
Ag is Self or NonSelf (i.e. a normal or attack record) by
calculating the affinity between the ALCs and the new Ag and
comparing it with the testing thresholds, as in the Affinity Measure by
Matching Rules section. If an Ag matches any one of the ALCs
it is considered an anomaly, i.e. a NonSelf Ag (attack); otherwise it is
Self (normal).
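The decision rule of the Testing phase (an Ag is an attack if any memory ALC matches it) can be sketched in Python as follows. The matching function, its threshold, and the sample records are hypothetical; in the system the matching rule depends on the ALC type as described in the Affinity Measure by Matching Rules section.

```python
def is_attack(antigen, memory_detectors, matches):
    """An Ag is NonSelf (attack) if any memory detector matches it."""
    return any(matches(detector, antigen) for detector in memory_detectors)


def confusion_counts(antigens, labels, memory_detectors, matches):
    """Count true/false positives and negatives over labeled test records."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for antigen, label in zip(antigens, labels):
        flagged = is_attack(antigen, memory_detectors, matches)
        if label == "attack":
            counts["tp" if flagged else "fn"] += 1
        else:
            counts["fp" if flagged else "tn"] += 1
    return counts


def agrees_in_three(detector, antigen):
    """Hypothetical matching rule: at least 3 equal positions."""
    return sum(1 for p, q in zip(detector, antigen) if p == q) >= 3


memory = ["abcd"]
test_antigens = ["abcx", "zzzz", "abzz"]
test_labels = ["attack", "normal", "attack"]
counts = confusion_counts(test_antigens, test_labels, memory, agrees_in_three)
```

The resulting counts are the raw inputs for rates such as the true positive rate and the false alarm rate.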
• ICA: Identify Phase
– Negative Selection: for each complement detector in the
current population, apply NS with the Self patterns; a
complement detector that matches any Self pattern
is discarded. The Euclidean distance is used here, which
gives an indication of the difference between the two
patterns, i.e. if the affinity between one CD and all Self
patterns exceeds a threshold, then the detector survives; otherwise
it is discarded.
– Split Population: separate the CDs that survived NS
(A0NS) from the CDs that were discarded (A0PS).
– Positive Selection: for each complement detector in
A0NS, apply PS with the NonSelf Ags; a complement
detector that matches all NonSelf Ags
survives. The Euclidean distance is used here, which gives
an indication of the difference between the two patterns,
Performance Measurement
In learning from extremely imbalanced data, the overall
classification accuracy is often not an appropriate measure of
performance. Metrics such as true negative rate, true
positive rate, weighted accuracy, G-mean, precision, recall,
and F-measure are used to evaluate the performance of learning
algorithms on imbalanced data. These metrics have been
widely used for comparison and performance evaluation of
Figure (1): The overall diagram of Immunity IDS.
the 10 of the 41 features that are continuous and identified as the most
significant are: 1, 5, 6, 10, 13, 16, 23, 24, 32, 33.
– Save the indexes of these significant features in a text file for later
use in preprocessing the training and testing files.
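The feature-selection step (discretize each feature, compute its information gain, keep the MaxFeature best) can be sketched as below. The equations of section 4.3.1.1 are not reproduced here, so this sketch assumes the standard Shannon definition of information gain and a simple rank-based equal-frequency binning; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def equal_frequency_bins(values, intervals):
    """Assign each value a bin index so that bins hold equal counts.
    Simplification: ties are split by rank, not given a bin of their own."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * intervals // len(values), intervals - 1)
    return bins


def information_gain(feature_bins, labels):
    """Entropy of the labels minus the weighted entropy within each bin."""
    n = len(labels)
    groups = {}
    for b, y in zip(feature_bins, labels):
        groups.setdefault(b, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def top_features(rows, labels, intervals, k):
    """Return the indexes of the k features with the largest gain."""
    gains = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        bins = equal_frequency_bins(column, intervals)
        gains.append((information_gain(bins, labels), j))
    gains.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in gains[:k]]


# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [[0.1, 5], [0.2, 3], [0.9, 5], [0.8, 3]]
labels = ["normal", "normal", "attack", "attack"]
print(top_features(rows, labels, intervals=2, k=1))
```

In the paper's setting, `intervals` would correspond to Interval = 10 and `k` to MaxFeature = 10 over the 41 features.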
1.3. Antigens Presentation
– For both the training and testing files, apply the preprocessing
operations to their 10 significant features.
– Convert all input Self & NonSelf Ags to (integer, real, string).
– Apply Min-Max normalization only to the features that have real
values, to bring them into the range [0-1].
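As an illustration of the normalization step, the following sketch rescales a list of values to a target range. It assumes the usual min-max formula; the function name `min_max` and its handling of a constant column (mapping every value to the lower bound) are choices made for this example only.

```python
def min_max(values, lo=0.0, hi=1.0):
    """Rescale values linearly so the minimum maps to lo and the maximum to hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Assumption: a constant column carries no information; map it to lo.
        return [lo for _ in values]
    scale = (hi - lo) / (vmax - vmin)
    return [lo + (v - vmin) * scale for v in values]


print(min_max([2, 4, 6]))          # range [0-1], used for real-valued features
print(min_max([2, 4, 6], 1, 100))  # range [1-100], used for affinities
```

The same helper covers both ranges mentioned in the pseudocode, [0-1] for antigens and [1-100] for affinities.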
1.4. Detector Generation
– Take NonSelf Ags as the initial Th, Ts and B ALCs; their length is
ALClength = MaxFeature.
– Convert them to the 3 types of ALCs (integer, real, string).
2. Training Phase
Input: 200 NSL records (60 normal, 140 attacks of all types);
2.1. First Layer-Cellular immunity (T & B cells reproduction) - Clonal
and Expansion
For (all ALC types) do
  /* Calculate the selection percentage for the cloning operation */
  SelectThNo = (Th_size × SelectTh) / 100;
  SelectTsNo = (Ts_size × SelectTs) / 100;
  SelectBNo = (B_size × SelectB) / 100;
For (all ALC types) do /* Th is shown as an example */
  While (Th_size < MaxThsize) ∧ (generate < MaxgenerationALC)
    Calculate the affinity between each ALC and all NonSelf Ags;
    Sort the ALCs by affinity, in ascending or descending order
      (depending on whether the affinity measures similarity or
      difference);
    Select the SelectThNo ALCs with the highest affinity to all
      NonSelf Ags as subset A;
    Calculate the clonal rate of each ALC in A, according to its
      affinity;
    Create the set of clones C for each ALC in A;
    Normalize the affinities of the SelectThNo highest-affinity ALCs;
    Calculate the mutation rate of each ALC in C, according to its
      normalized affinity;
    Mutate each ALC in C according to its mutation rate, at a
      randomly selected allele no., giving the set of mutated clones C';
    /* Apply NS between the mutated ALCs C' and the Self patterns */
    For (all Self patterns) do NS
      Calculate the affinity by the Landscape-affinity rule between
        the current Th-ALC & all Self patterns;
      Normalize the affinities to the range [1-100];
      If (all affinity < ThNS)
        /* Apply PS between the mutated ALCs that survived NS and
           the NonSelf Ags */
        For (all NonSelf Ags) do PS
          Calculate the affinity by the Landscape-affinity rule between
            the current Th-ALC & all NonSelf Ags;
          Normalize the affinities to the range [1-100];
          If (all affinity >= ThPS)
            Th-ALC survives; save it in file "Thmem.txt";
            Th_size = Th_size + 1;
          Else
            Discard the current Th-ALC;
            Go to the next Th-ALC;
          End If
    Add the mutated ALCs that survived NS & PS to "Thmem.txt", as the
      Secondary response;
    generate++;
  End While
End For
Call the Complement System to activate it;
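The first-layer loop above can be sketched for real-valued ALCs as follows. This is only an illustration under stated assumptions: the Landscape-affinity rule is replaced by a similarity 1 / (1 + Euclidean distance), the [1-100] thresholds ThNS = 60 and ThPS = 80 are mapped to 0.6 and 0.8, the clone count is taken proportional to affinity, and the mutation rate is assumed to be exp(-ρ × normalized affinity). All function names and the toy data are hypothetical.

```python
import math
import random


def affinity(a, b):
    """Assumed similarity in (0, 1]; 1 means identical vectors."""
    return 1.0 / (1.0 + math.dist(a, b))


def mean_affinity(alc, antigens):
    """Average affinity of one ALC to a set of antigens."""
    return sum(affinity(alc, ag) for ag in antigens) / len(antigens)


def mutate(alc, rate, rng):
    """Perturb one randomly chosen allele by up to +/- rate, clipped to [0, 1]."""
    child = list(alc)
    i = rng.randrange(len(child))
    child[i] = min(1.0, max(0.0, child[i] + rng.uniform(-rate, rate)))
    return child


def train(alcs, self_set, nonself_set, select_pct=50, max_size=50,
          max_gen=500, th_ns=0.6, th_ps=0.8, rho=2.0, seed=0):
    """Clonal expansion, mutation, then NS and PS; returns memory ALCs."""
    rng = random.Random(seed)
    memory = []
    generation = 0
    while len(memory) < max_size and generation < max_gen:
        ranked = sorted(alcs, key=lambda a: mean_affinity(a, nonself_set),
                        reverse=True)
        selected = ranked[:max(1, len(ranked) * select_pct // 100)]
        scores = [mean_affinity(a, nonself_set) for a in selected]
        top = max(scores)
        for alc, score in zip(selected, scores):
            n_clones = max(1, round(5 * score))   # more clones for higher affinity
            rate = math.exp(-rho * score / top)   # less mutation for higher affinity
            for _ in range(n_clones):
                clone = mutate(alc, rate, rng)
                # NS: the clone must not resemble any Self pattern.
                passes_ns = all(affinity(clone, s) < th_ns for s in self_set)
                # PS: the clone must resemble all NonSelf antigens.
                passes_ps = all(affinity(clone, ag) >= th_ps for ag in nonself_set)
                if passes_ns and passes_ps and len(memory) < max_size:
                    memory.append(clone)
        generation += 1
    return memory


self_set = [[0.1, 0.1]]
nonself_set = [[0.9, 0.9], [0.85, 0.95]]
memory = train(list(nonself_set), self_set, nonself_set, max_size=10)
print(len(memory))
```

Because mutation is random, the memory set differs between seeds, which mirrors the paper's remark that test results differ after each training operation.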
2.2. Second Layer-Humoral immunity (Complement System)
2.2.A. ICA: Initialization phase
Take NonSelf Ags as the initial real-valued [0-1] population A0,
containing PopSize CDs.
Stop: if the current population contains CDsn optimal detectors
or MaxgenerationCDs generations have been reached.
classifications. All of them are based on the confusion
matrix shown in table (1) [7, 17, 18, 19].
Table (1): The Confusion matrix.
                  predicted positives   predicted negatives
real positives    TP                    FN
real negatives    FP                    TN
Where TP (true positive) is attack records identified as
attack; TN (true negative) is normal records identified as
normal; FP (false positive) is normal records identified as
attack; and FN (false negative) is attack records identified as
normal [3, 17, 18].
III. IMMUNITY-INSPIRED IDS PSEUDO CODE
Each phase or layer of the algorithm and its iterative
processes are given below:
1. Initialization and Preprocessing phase
1.1. Set all parameters that have constant values:
– Thresholds of NS: ThNS = 60, TsNS = 0.2, TbNS = 30, TcompNS = 0.25;
– Thresholds of PS: ThPS = 80, TsPS = 0.15, TbPS = 70, TcompPS = 0.15;
– Thresholds of Test PS: ThTest = 20, TsTest = 0.1, TbTest = 80,
TcompTest = 0.05;
– Generation: MaxgenerationALC = 500, MaxThsize = 50, MaxTssize
= 50, MaxBsize = 25;
– Clonal & Expansion: SelectTh = 50%, SelectTs = 50%, SelectB =
100%;
– Complement System: MaxgenerationCDs = 1000, PopSize =
NonSelfno., CDlength = 10, Div = 70%, CDno = 50;
– Others: MaxFeature = 10, Interval = 10, classes = 2, ALClength = 10,
R-contiguous R = 1, ρ = 2 (the parameter that controls the smoothness
of the exponential mutation function);
– Classes:
• Normalize class: contains all functions and operations that perform
min-max normalization in the ranges [0-1] and [1-100].
• Cleave-Bind class: contains the Cleave() function OC, the
PositiveBind() function OPB, and the ReverseBind() function ORB.
– Input files for Training phase: an NSL or KDD file containing 200
records (60 normal, 140 attack from all attack types).
– Input files for Testing phase: files containing 20% of the KDD or NSL
datasets.
1.2. Preprocessing and Information Gain
– Use the 21%NSL dataset file to calculate the following:
– Split the dataset into two classes, normal and attack.
– Convert alphabetic features to numeric.
– Convert all continuous features to discrete, for each class separately.
For each one of the 41 features Do
  Sort the feature's space values;
  Partition the feature space by the specified Interval number, so that
    each partition contains the same number of data;
  Find the minimum and maximum values;
  Find the initial assignment value
    V = (maximum - minimum) / Interval no.;
  Assign each interval i by Vi = Σi V;
  If a value occurs more than Interval size times in a feature space, it
    is assigned a partition of its own;
– Calculate the Information Gain of every feature in both classes by
applying the equations in section 4.3.1.1.
– By selecting the most significant features (MaxFeature = 10), i.e. those
with the largest information gain, the system obtained the same
features for both classes (normal and attack) but in a different order. So
  For (all Ag types) do PS
    Calculate the affinity by the Landscape-affinity rule between each
      Ag and the current Thmemory ALC;
    Normalize the affinities to the range [1-100];
    If (affinity > ThTest)
      Thmemory ALC detects a NonSelf Ag;
      Record the Ag name;
      If (Ag is an attack record)
        TP = TP + 1; /* no. of attack Ags detected */
      Else
        FP = FP + 1; /* no. of normal Ags flagged as attack */
/* Repeat the above for TsMemory, BMemory, and CDMemory. */
3.1. Performance Measurement
TN = normalAg - FP;
FN = attackAg - TP;
DetectionRate = TP / (TP + FN);
FalseAlarmRate = FP / (TN + FP);
ACY = (TP + TN) / (TP + TN + FP + FN);
Gmean = DetectionRate × (1 - FalseAlarmRate);
Precision = TP / (TP + FP);
Recall = TP / (TP + FN);
F-measure = (2 × Precision × Recall) / (Precision + Recall);
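The measurements above translate directly into code. The sketch below follows the formulas as listed, including Gmean as the product DetectionRate × (1 - FalseAlarmRate), and is fed the first row of Table (2) (TP = 8748, FP = 44, with 2152 normal and 9698 attack records). The function name `performance` and the dictionary keys are hypothetical, and the sketch assumes non-zero denominators.

```python
def performance(tp, fp, normal_ag, attack_ag):
    """Compute the performance measurements from the detection counts."""
    tn = normal_ag - fp
    fn = attack_ag - tp
    detection_rate = tp / (tp + fn)
    false_alarm_rate = fp / (tn + fp)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "TN": tn,
        "FN": fn,
        "DetectionRate": detection_rate,
        "FalseAlarmRate": false_alarm_rate,
        "ACY": (tp + tn) / (tp + tn + fp + fn),
        "Gmean": detection_rate * (1 - false_alarm_rate),
        "Precision": precision,
        "Recall": recall,
        "F-measure": 2 * precision * recall / (precision + recall),
    }


metrics = performance(tp=8748, fp=44, normal_ag=2152, attack_ag=9698)
for name, value in metrics.items():
    print(name, round(value, 4))
```

Rounded to two decimals, these values agree with the first row of Table (2) for ACY (0.92), g_m. (0.88) and the detection rate (0.9).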
Assign a random real value [0.5-1] as Cleave Probability Pc;
2.2.B. ICA: Identify Phase
While ((CD_size < CDsn) ∧ (generate <= MaxgenerationCDs))
  For (each CD in population A0) do
    For (all Self patterns) do NS
      Calculate the affinity by Euclidean distance between the current
        CD & all Self patterns;
      Normalize the affinities to the range [0-1];
      If (all affinity > TcompNS)
        Put the current CD in the A0NS sub-population;
      Else
        Put the current CD in the A0Rem sub-population;
  End For
  For (each CD in population A0NS) do
    For (all NonSelf Ags) do PS
      Calculate the affinity by Euclidean distance between the current
        CD & all NonSelf Ags;
      If (all affinity <= TcompPS)
        Save it in file "CDmem.txt";
        CD_size = CD_size + 1;
      Else
        Discard the current CD;
  End For
  Sort all CDs in A0NS in ascending order of their affinities with the
    NonSelf Ags, and put them in At;
  Append A0Rem to the end of At;
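One pass of the identify phase can be sketched as follows. The sketch assumes vectors already scaled to [0, 1], normalizes the Euclidean distance by dividing by the square root of the vector length, and sorts the NS survivors by their smallest distance to a NonSelf Ag; these three choices, together with the function names and the toy data, are assumptions made for this example.

```python
import math


def norm_dist(a, b):
    """Euclidean distance scaled to [0, 1] for vectors with entries in [0, 1]."""
    return math.dist(a, b) / math.sqrt(len(a))


def identify(population, self_set, nonself_set, t_ns, t_ps):
    """Return (memory detectors, ordered population At) for one pass."""
    survivors, removed, memory = [], [], []
    for cd in population:
        # NS: keep the CD only if it is far from every Self pattern.
        if all(norm_dist(cd, s) > t_ns for s in self_set):
            survivors.append(cd)
        else:
            removed.append(cd)
    for cd in survivors:
        # PS: memorize the CD only if it is close to all NonSelf Ags.
        if all(norm_dist(cd, ag) <= t_ps for ag in nonself_set):
            memory.append(cd)
    # At: NS survivors in ascending distance to NonSelf, then the rest.
    survivors.sort(key=lambda cd: min(norm_dist(cd, ag) for ag in nonself_set))
    return memory, survivors + removed


population = [[0.5, 0.5], [0.9, 0.9], [0.12, 0.1]]
self_set = [[0.1, 0.1]]
nonself_set = [[0.92, 0.88], [0.88, 0.93]]
memory, ordered = identify(population, self_set, nonself_set,
                           t_ns=0.25, t_ps=0.15)
print(memory)
print(ordered)
```

The thresholds 0.25 and 0.15 correspond to TcompNS and TcompPS in the parameter list.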
2.2.C. ICA: Active phase
  Divide At into At1 and At2 according to the Div active variable;
    /* At1 is the cleave set, At2 is the bind set */
  For (each CD individual in At1) do
    Apply the cleave operator to the CD with cleave probability Pc to
      produce two sub-individuals a1 and a2, OC (CD, Pc, a1, a2);
  For (all sub-individuals in a2) do
    Bind them into one remainder cleave set bt by the Positive bind
      operator OPB, bt = OPB (a2i,…,Λ, a2n);
2.2.D. ICA: Membrane attack process
  For (each CD individual ai in At2) do
    Bind bt with the current individual of At2 by the Reverse bind
      operator ORB, to obtain the Membrane Attack complex set
      Ct, Ct = ORB(bt, ai);
  For (each individual ci in Ct) do
    Recode it to the initial CDlength = 10 to get a new set C;
      /* different strategies may be used here for that purpose */
  Create a random population of CD individuals as a set D;
  Join C and D into one set E, and consider it the new population;
  E = C ∪ D;
  A0 = E;
  generate++;
End While
3. Testing Phase
Input: 21%NSL dataset;
Initialize: FP, FN, TP, TN, DetectionRate, FalseAlarmRate, ACY,
Gmean.
/* Count normalAg & attackAg; these are needed only to calculate the
   performance measurements */
For (each record in input file) do
  If (record type is normal)
    normalAg = normalAg + 1;
  Else
    attackAg = attackAg + 1;
/* Antigens Presentation */
Convert all input Self & NonSelf Ags to (integer, real, string).
Apply Min-Max normalization only to the features that have real values,
to bring them into the range [0-1].
Read ThMemory ALCs;
Read TsMemory ALCs;
Read BMemory ALCs;
Read CDMemory Detectors;
/* Apply PS between all input Ags (Self & NonSelf, i.e. normal &
   attack) and all memory ALCs */
For (all Thmemory ALCs) do /* Th is shown as an example */
IV. SYSTEM PROPERTIES
The special properties of the Immunity IDS are:
– The small size of the training data: about 200 NSL records (60
normal, 140 attack of different types).
– The speed of the system: training takes about 1 minute
because of the small size of the training data, and testing
takes a few minutes, depending on the number of memory
ALCs.
– The test results differ after each training operation,
because they depend on the random mutation of ALCs.
– The number of memory ALCs depends on the number of
retraining runs, or on what the system requires.
– The system allows all memory contents to be deleted to
start a new training; alternatively, the ALCs resulting from
each new training after the first are added to memory
alongside the previous ones.
– The detection rate is high even with the small number of
memory ALCs produced by one training.
– To apply the Immunity IDS in practice, the best result of
one or more trainings is chosen to obtain the optimal
outcome.
– The threshold values were determined through many
experiments until suitable values were found.
– The IIDS was implemented in the C# language.
V. EXPERIMENTAL RESULTS
1) Several series of experiments were performed with 175
detectors (memory ALCs). Table (2) shows the test results of
10 training operations, run in series on 200 records and
tested on the "NSLTest-21.txt" file, which contains 9698 attack
records and 2152 normal records.
2) Comparison of performance (ACY) between single-level
detection and multilevel detection. ACY is chosen because it
includes both TPR and TNR. Table (3) and figure (2) show the
test results of 5 training operations, also run in series, on
the "NSLTest%.txt" file. Notice that CDs have the highest
accuracy and B cells have the lowest accuracy. Although the
accuracy of IIDS is lower than that of CD, IIDS has the higher
detection rate; this is due to the effect of false alarms.
Table (2): Results of Test experiments.
TP     TN     FP    FN    TPR    FPR    ACY    g_m.   Prec.  F-m.
8748   2108   44    950   0.9    0.02   0.92   0.88   0.99   0.94
8893   1871   281   805   0.92   0.13   0.91   0.80   0.97   0.94
8748   2123   29    950   0.9    0.01   0.92   0.89   1      0.95
8730   2146   6     968   0.9    0      0.92   0.9    1      0.95
8800   1971   181   898   0.91   0.08   0.91   0.84   0.98   0.94
8788   2014   138   910   0.91   0.06   0.91   0.85   0.98   0.94
8802   2007   145   896   0.91   0.07   0.91   0.85   0.98   0.94
8817   2046   106   881   0.91   0.05   0.92   0.86   0.99   0.95
8833   2002   150   865   0.91   0.07   0.91   0.85   0.98   0.94
8869   1963   189   829   0.91   0.09   0.91   0.83   0.98   0.94
Table 3: Accuracy (ACY) of IIDS and each type of ALCs.
IIDS   Ts     Th     B      CD
0.91   0.73   0.84   0.22   0.92
0.91   0.78   0.84   0.21   0.92
0.91   0.74   0.84   0.22   0.92
0.91   0.74   0.84   0.25   0.92
0.91   0.77   0.84   0.30   0.92
[Plot: Accuracy (y-axis, 0 to 1) versus Train no. (x-axis, 1 to 5), with one curve each for IIDS, Th, Ts, B and CD.]
Figure 2 : Accuracy curve comparing the single-level
detection (Th, Ts, B, CD) and multilevel (IIDS).
REFERENCES
[1] M. Middlemiss, "Framework for Intrusion Detection Inspired by the Immune System", The Information Science Discussion Paper Series, July 2005.
[2] Dasgupta, D., Yu, S., Majumdar, N.S., "MILA - multilevel immune learning algorithm", In Cantu-Paz, E., et. al., eds.: Genetic and Evolutionary Computation Conference, Chicago, USA, Springer-Verlag (2003) 183–194.
[3] Dasgupta, D., Yu, S., Majumdar, N.S., "MILA – multilevel immune learning algorithm and its application to anomaly detection", DOI 10.1007/s00500-003-0342-7, Springer-Verlag 2003.
[4] Jungwon Kim, Peter J. Bentley, Uwe Aickelin, Julie Greensmith, Gianni Tedesco, Jamie Twycross, "Immune System Approaches to Intrusion Detection - A Review", Editorial Manager(tm) for Natural Computing, 2006.
[5] L. N. de Castro and J. Timmis, "Artificial Immune Systems: A New Computational Intelligence Approach", book, Springer, 2002.
[6] Edward Keedwell and Ajit Narayanan, "Intelligent Bioinformatics The application of artificial intelligence techniques to bioinformatics problems", book, John Wiley & Sons Ltd, 2005.
[7] Yanfang Ye, Dingding Wang, Tao Li, Dongyi Ye, "An intelligent PE-malware detection system based on association mining", J Comput Virol (2008) 4:323–334, Springer-Verlag France 2008.
[8] Adel Sabry Issa, "A Comparative Study among Several Modified Intrusion Detection System Techniques", Master Thesis, University of Duhok, 2009.
[9] Luai Al Shalabi, Zyad Shaaban and Basel Kasasbeh, "Data Mining: A Preprocessing Engine", Journal of Computer Science 2 (9): 735-739, 2006, ISSN 1549-3636, Science Publications, 2006.
[10] Paul K. Harmer, Paul D. Williams, Gregg H. Gunsch, and Gary B. Lamont, "An Artificial Immune System Architecture for Computer Security Applications", IEEE Transactions on Evolutionary Computation, vol. 6, no. 3, June 2002.
[11] Dipankar Dasgupta and Luis Fernando Niño, "Immunological Computation Theory and Applications", book, 2009.
[12] Zhou Ji and Dipankar Dasgupta, "Revisiting Negative Selection Algorithms", Massachusetts Institute of Technology, 2007.
[13] Thomas Stibor, "On the Appropriateness of Negative Selection for Anomaly Detection and Network Intrusion Detection", PhD thesis 2006.
[14] Rune Schmidt Jensen, "Immune System for Virus Detection and Elimination", IMM-thesis-2002.
[15] Fernando Esponda, Stephanie Forrest and Paul Helman, "A Formal Framework for Positive and Negative Detection Schemes", IEEE 2002.
[16] A. H. Momeni Azandaryani, M. R. Meybodi, "A Learning Automata Based Artificial Immune System for Data Classification", Proceedings of the 14th International CSI Computer Conference, IEEE 2009.
[17] Chao Chen, Andy Liaw, and Leo Breiman, "Using Random Forest to Learn Imbalanced Data", Department of Statistics, UC Berkeley, 2004.
[18] Yuchun Tang, Sven Krasser, Paul Judge, and Yan-Qing Zhang, "Fast and Effective Spam Sender Detection with Granular SVM on Highly Imbalanced Mail Server Behavior Data", (Invited Paper), Secure Computing Corporation, North Point Parkway, 2006.
[19] Jamie Twycross, Uwe Aickelin and Amanda Whitbrook, "Detecting Anomalous Process Behaviour using Second Generation Artificial Immune Systems", University of Nottingham, UK, 2010.
[20] H. Güneş Kayacık, A. Nur Zincir-Heywood, and Malcolm I. Heywood, "Selecting Features for Intrusion Detection: A Feature Relevance Analysis on KDD 99 Intrusion Detection Datasets", 6050 University Avenue, Halifax, Nova Scotia. B3H 1W5, 2006.
[21] Feng Gu, Julie Greensmith and Uwe Aickelin, "Further Exploration of the Dendritic Cell Algorithm: Antigen Multiplier and Time Windows", University of Nottingham, UK, 2007.
[22] Prachya Pongaksorn, Thanawin Rakthanmanon, and Kitsana Waiyamai, "DCR: Discretization using Class Information to Reduce Number of Intervals", Data Analysis and Knowledge Discovery Laboratory (DAKDL), P. Lenca and S. Lallich (Eds.): QIMIE/PAKDD 2009.
[23] Chen Guangzhu, Li Zhishu, Yuan Daohua, Nimazhaxi and Zhai Yusheng, "An Immune Algorithm based on the Complement Activation Pathway", IJCSNS International Journal of Computer Science and Network Security, VOL.6 No.1A, January 2006.
[24] J. McHugh, "Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory", ACM Transactions on Information and System Security, vol. 3, no. 4, pp. 262–294, 2000.
[25] The NSL-KDD Data Set, http://nsl.cs.unb.ca/NSL-KDD.
[26] M. Tavallaee, E. Bagheri, W. Lu, and A. Ghorbani, "A Detailed Analysis of the KDD CUP 99 Data Set", Submitted to Second IEEE Symposium on Computational Intelligence for Security and Defense Applications (CISDA), 2009.