Lv, Lu; Wang, Wenhai; Zhang, Zeyin; Liu, Xinggao (2020)
Knowledge-Based Systems
Article info

Article history: Received 23 August 2019; received in revised form 7 February 2020; accepted 10 February 2020; available online 13 February 2020.

Keywords: Intrusion detection system; Extreme learning machine; Gravitational search algorithm; Differential evolution; Kernel principal component analysis

Abstract

Intrusion detection is a challenging technology in the area of cyberspace security for protecting a system from malicious attacks. A novel accurate and effective misuse intrusion detection system that relies on specific attack signatures to distinguish between normal and malicious activities is therefore presented to detect various attacks based on an extreme learning machine with a hybrid kernel function (HKELM). First, the derivation and proof of the proposed hybrid kernel are given. A combination of the gravitational search algorithm (GSA) and differential evolution (DE) algorithm is employed to optimize the parameters of HKELM, which improves its global and local optimization abilities when predicting attacks. In addition, the kernel principal component analysis (KPCA) algorithm is introduced for dimensionality reduction and feature extraction of the intrusion detection data. Then, a novel intrusion detection approach, KPCA-DEGSA-HKELM, is obtained. The proposed approach is eventually applied to the classic benchmark KDD99 dataset, the real modern UNSW-NB15 dataset and the industrial intrusion detection dataset from the Tennessee Eastman process. The numerical results validate both the high accuracy and the time-saving benefit of the proposed approach.

© 2020 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.knosys.2020.105648
2 L. Lv, W. Wang, Z. Zhang et al. / Knowledge-Based Systems 195 (2020) 105648
fuzzy model with adaptive neuro-fuzzy inference system features, which can counter SOAP-related attacks with a high detection accuracy and low false positive rate. Furthermore, they also developed fuzzy associative rules to effectively counter SOAP-related and XML-related attacks in Web services and e-commerce applications [13,14]. Wathiq et al. [15] proposed a method called real-time multi-agent system for an adaptive intrusion detection system (RTMAS-AIDS) to allow the IDS to adapt to unknown attacks in real time; this method applied a hybrid support vector machine (SVM) and extreme learning machine (ELM) to classify normal behavior and known attacks. An effective SVM-based ID algorithm was presented by Tao et al. [16] to identify intrusions, which obtained great results. In addition, improved ELM and SVM algorithms have been widely used for ID applications [17–20]. These advances have achieved strong performance in detecting and reporting malicious attacks. Nevertheless, improving the accuracy and efficiency of the prediction model remains the primary goal of an IDS.

The objective of this paper is to provide an accurate and effective misuse intrusion detection system with machine learning techniques that rely on specific attack signatures to distinguish between normal and malicious activities with a high accuracy and fast learning speed. It is well known that the determination of a suitable configuration for a particular dataset is a demanding problem in machine learning. Therefore, numerous researchers have attempted to find the optimal parameters of machine learning models. Aburomman and Reaz [21] applied the particle swarm optimization (PSO) algorithm to the SVM-KNN ensemble method to create a classifier with a better accuracy for ID. The authors also proposed a novel weighted SVM multiclass classifier based on differential evolution (DE) for the IDS [22]. ELMs are a popular area of research for detecting possible intrusions and attacks [23]. Ku and Zheng [24] proposed an improved learning algorithm named self-adaptive differential evolution extreme learning machine with Gaussian kernel for classifying and detecting intrusions. Bostani and Sheikhan [25] introduced a hybrid binary gravitational search algorithm (GSA) for feature selection in IDSs; this method can find a good subset of features and achieve a high accuracy and detection rate. GSA is a heuristic optimization method with fewer parameters to be determined, which has the benefits of a high convergence rate and strong local optimization ability. Meanwhile, DE is a heuristic optimization method with a strong global optimization ability, which has the benefit of great adaptability [26,27]. Unfortunately, both the GSA and DE algorithms have their own disadvantages: the former is easily trapped in a local optimum, and the latter's local optimization ability is relatively weak.

An effective approach to deal with ID problems is therefore presented. First, considering the timeliness requirement of ID problems, an ELM method is selected as the basic model in the current work. Furthermore, to improve the accuracy of the ELM method, a hybrid kernel function combining the radial basis function (RBF) kernel with the polynomial kernel is derived and introduced to the ELM model. The proof that the proposed hybrid kernel function satisfies Mercer's theorem is also given. Second, taking advantage of both the GSA and DE algorithms, a hybrid differential evolution combined with gravitational search algorithm (DEGSA) is proposed to optimize the parameters of the proposed model, which improves both the local and global optimization abilities over those of the individual algorithms. Third, the kernel principal component analysis (KPCA) is introduced for the dimensionality reduction and feature extraction of the nonlinear ID data. The significance of this paper is summarized as follows.

• A new HKELM method with a hybrid kernel function is proposed that improves both the generalization and learning ability of the KELM method, aiming at providing accurate and efficient misuse intrusion detection methods. Furthermore, the proof of the proposed hybrid kernel function is provided in detail.
• In the context of DE and GSA, a hybrid algorithm, DEGSA, is proposed that combines the benefits of DE and GSA with the aim of improving both the local and global optimization abilities for detecting attacks.
• The KPCA algorithm is introduced for the dimensionality reduction and feature extraction of the intrusion detection data. Then, an effective intrusion detection approach, KPCA-DEGSA-HKELM, is obtained.
• The proposed approach is compared with other literature methods in an extensive testbed comprising three intrusion detection datasets, namely, the classic benchmark KDD99 dataset [28], the real modern UNSW-NB15 dataset [29] and the industrial intrusion detection dataset from the TE process [30]. These datasets include both host-based and network-based attacks from different platforms, which can demonstrate the effectiveness of the proposed method.
• The proposed approach is evaluated and compared with other literature methods using several classification evaluation metrics. The experimental results show that the proposed approach is superior to other methods in terms of accuracy (Acc), mean accuracy (MAcc), mean F-score (MF) and attack accuracy (AAcc) evaluation metrics. Furthermore, the proposed approach outperforms the CPSO-SVM method in terms of all the overall evaluation metrics while achieving a higher computational efficiency with less training and testing time.

The rest of this paper is organized as follows. In Section 2, the extreme learning machine with hybrid kernel function (HKELM) approach is proposed, DEGSA is presented to optimize the parameters of the HKELM model, and the KPCA algorithm is introduced for feature extraction. Section 3 outlines the implementation of the proposed algorithms. The experiment environment and model evaluation metrics are illustrated in Section 4. In Section 5, the experimental results are provided to validate the accuracy and efficiency of the proposed approach. Finally, Section 6 contains some concluding remarks.

2. Approach description

2.1. Extreme learning machine with hybrid kernel function (HKELM)

2.1.1. Extreme learning machine (ELM)

An ELM is an effective feedforward neural network with a single hidden layer, as presented by Huang et al. [31,32]. Traditional neural networks need to set a large number of parameters to train the network, and they easily become trapped in local optimal solutions. In contrast, the ELM only needs to set the number of hidden nodes in the network, without adjusting the weight of the input layer and the bias of the hidden layer, and it is easier to generate a global optimal solution [33]. Therefore, the ELM has a faster convergence rate and is more efficient in terms of the learning performance. The network structure of the ELM is shown in Fig. 1.

For the given training dataset $T_0 = \{(x_j, tar_j),\ j = 1, \ldots, N\}$, where $x_j = [x_{j1}, \ldots, x_{jn}] \in \mathbb{R}^n$ is the input feature vector and $tar_j = [tar_{j1}, \ldots, tar_{jm}] \in \mathbb{R}^m$ is the corresponding target vector, the goal is to obtain the optimal model for further testing tasks. In Fig. 1, $y_j = [y_{j1}, \ldots, y_{jm}] \in \mathbb{R}^m$ is the output vector obtained via the ELM network. Then, the ELM model can be expressed by the following formula:

$$y_j = \sum_{i=1}^{l} \beta_i g_i(x_j) = \sum_{i=1}^{l} \beta_i g(\alpha_i \cdot x_j + c_i), \quad j = 1, \ldots, N \tag{1}$$
layer, $c_i$ is the bias of the $i$th hidden node, and $g(\cdot)$ is the activation function of the hidden layer. The node parameters $\alpha_i$ and $c_i$ of the hidden layer are randomly assigned, and as a consequence, only the number of hidden layer nodes $l$ needs to be determined in the ELM model.

If the error between the output $y$ and the target $tar$ can be approximated to zero, then the following equation can be obtained:

$$\sum_{j=1}^{N} \| tar_j - y_j \| = 0 \tag{2}$$

Combining Eq. (1) with Eq. (2), there exist $\beta_i$, $\alpha_i$ and $c_i$ that satisfy:

$$\sum_{i=1}^{l} \beta_i g(\alpha_i \cdot x_j + c_i) = tar_j, \quad j = 1, \ldots, N \tag{3}$$

Eq. (3) can be converted into matrix form as follows,

$$\underbrace{\begin{bmatrix} g(\alpha_1 \cdot x_1 + c_1) & \cdots & g(\alpha_l \cdot x_1 + c_l) \\ \vdots & \ddots & \vdots \\ g(\alpha_1 \cdot x_N + c_1) & \cdots & g(\alpha_l \cdot x_N + c_l) \end{bmatrix}_{N \times l}}_{H_{N \times l} = [h(x_1), \ldots, h(x_N)]^T} \cdot \underbrace{\begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_l^T \end{bmatrix}_{l \times m}}_{\beta_{l \times m}} = \underbrace{\begin{bmatrix} tar_1^T \\ \vdots \\ tar_N^T \end{bmatrix}_{N \times m}}_{T_{N \times m}} \tag{4}$$

that is,

$$H \beta = T \tag{5}$$

where $T$ is the output matrix of the target and $H$ and $\beta$ are the output matrix and weight matrix of the hidden layer, respectively.

Therefore, the weight matrix of the hidden layer $\beta$ can be calculated by the following equation:

$$\beta = H^{+} T \tag{6}$$

$K(x, x_N)$

The selection of the kernel function can greatly influence the performance of the KELM model. Consequently, it is important to find an appropriate kernel function for the KELM model. The polynomial kernel function and radial basis function (RBF) kernel function are two common kernel functions, which are combined together as the hybrid function of the KELM in the current work.

2.1.2.1. Polynomial kernel function. The expression of the polynomial kernel function is stated as follows,

$$K_{poly}(x, x_i) = (x \cdot x_i + b)^p \tag{11}$$

where $b$ and $p$ are the constant and exponent parameters of the polynomial kernel function, respectively.

The polynomial kernel function is a typical global kernel function, which means that its corresponding KELM model possesses a strong generalization ability and weak learning ability [35,36]. Fig. 2 demonstrates the curves of the polynomial kernel function with different $b$ and $p$, where the test point is selected as $x_i = 0.2$. In Fig. 2(a), the value of parameter $p$ is set as $p = 2$, while the value of parameter $b$ changes from 0.2 to 1.0. In Fig. 2(b), the value of parameter $b$ is set as $b = 1$, while the value of $p$ changes from 1 to 5. As Fig. 2 indicates, the output of the polynomial kernel function increases with the input. Furthermore, the sample points both near and far away from the test point have an influence on the output of the kernel function, which verifies the strong generalization ability of the polynomial kernel function. However, the test point has no apparent learning capability, revealing the weak learning ability of the polynomial kernel function.

2.1.2.2. RBF kernel function. The expression of the RBF kernel function is indicated as follows,

$$K_{RBF}(x, x_i) = \exp\left(-\frac{\| x - x_i \|^2}{2\sigma^2}\right) = \exp\left(-\frac{\| x - x_i \|^2}{a}\right) \tag{12}$$

where $a = 2\sigma^2$ is the exponent parameter of the RBF kernel function.

The RBF kernel function is a typical local kernel function, which means that the corresponding KELM model has a strong learning ability and weak generalization ability [35,36]. Fig. 3
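that is,

$$K_{hybrid}(x, x_i) = w \cdot \exp\left(-\frac{\| x - x_i \|^2}{a}\right) + (1 - w) \cdot (x \cdot x_i + b)^p, \quad w \in [0, 1] \tag{14}$$

Proof. $K_{RBF}$ and $K_{poly}$ are kernel functions; thus, the kernel matrices $K_{RBF}$ and $K_{poly}$ are positive semidefinite, that is, for any vector $\lambda \in \mathbb{R}^N$, the following conditions must be satisfied:

$$\begin{cases} \lambda^T K_{RBF} \lambda \ge 0 \\ \lambda^T K_{poly} \lambda \ge 0 \end{cases} \tag{15}$$

Then, because $w \in [0, 1]$ implies $w \ge 0$ and $1 - w \ge 0$, the following expressions can be obtained:

$$\begin{cases} w \lambda^T K_{RBF} \lambda \ge 0 \\ (1 - w) \lambda^T K_{poly} \lambda \ge 0 \end{cases} \Rightarrow \begin{cases} \lambda^T (w K_{RBF}) \lambda \ge 0 \\ \lambda^T [(1 - w) K_{poly}] \lambda \ge 0 \end{cases} \tag{16}$$

Therefore, $w K_{RBF}$ and $(1 - w) K_{poly}$ are positive semidefinite matrices. According to Eq. (13), $\lambda^T K_{hybrid} \lambda$ can be rewritten as follows,

$$\lambda^T K_{hybrid} \lambda = \lambda^T [w K_{RBF} + (1 - w) K_{poly}] \lambda = \lambda^T (w K_{RBF}) \lambda + \lambda^T [(1 - w) K_{poly}] \lambda \tag{17}$$

Combining Eqs. (16) and (17), the following expression can be obtained:

$$\lambda^T K_{hybrid} \lambda \ge 0 \tag{18}$$

Fig. 2. Curves of the polynomial kernel function with different b and p.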
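As a minimal numerical sketch of Eq. (14) and of the positive semidefiniteness argued in the proof, the following Python snippet builds the hybrid kernel matrix for a small random sample and inspects its extreme eigenvalues. The function name `hybrid_kernel`, the random data and the tolerance are illustrative assumptions; the default values a = 100, b = 15, p = 4 and w = 0.9 are the settings listed with Table 5(c).

```python
import numpy as np

def hybrid_kernel(X, Y, a=100.0, b=15.0, p=4, w=0.9):
    """Hybrid kernel of Eq. (14): w * RBF + (1 - w) * polynomial."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    Y = np.atleast_2d(np.asarray(Y, dtype=float))
    # Pairwise squared Euclidean distances ||x - x_i||^2
    sq_dists = (np.sum(X ** 2, axis=1)[:, None]
                + np.sum(Y ** 2, axis=1)[None, :]
                - 2.0 * X @ Y.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    k_rbf = np.exp(-sq_dists / a)         # Eq. (12) with a = 2 * sigma^2
    k_poly = (X @ Y.T + b) ** p           # Eq. (11)
    return w * k_rbf + (1.0 - w) * k_poly

# Illustrative data only (not from the paper): 20 samples with 5 features
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
K = hybrid_kernel(X, X)

# A Mercer kernel yields a symmetric Gram matrix with no negative eigenvalues,
# up to floating-point round-off.
eigvals = np.linalg.eigvalsh(K)           # ascending order
min_eig, max_eig = eigvals[0], eigvals[-1]
print("PSD within tolerance:", min_eig >= -1e-8 * max_eig)
```

Setting w = 1 recovers the pure RBF kernel and w = 0 the pure polynomial kernel, which is a convenient way to compare the three variants with the same code path.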
where $rand_j$ is a random number between 0 and 1 and $Kbest$ is the set of the first $k$ objects with the best fitness values.

By the law of motion, the acceleration of the $i$th object is shown as follows,

$$ac_i^d(t) = \frac{F_i^d(t)}{M_i(t)} \tag{31}$$

where $M_i(t)$ is the inertial mass of the $i$th object.

The next velocity of an object is the sum of a fraction of its current velocity and its acceleration. Therefore, the updates in velocity and position of the $i$th object can be described as,

$$v_i^d(t + 1) = rand_i \times v_i^d(t) + ac_i^d(t) \tag{32}$$

where $fit_i(t)$ is a fitness value. For a minimum problem, $best(t)$ and $worst(t)$ are defined by the following equations:

$$best(t) = \min_{j \in \{1, \ldots, NP\}} fit_j(t) \tag{37}$$

$$worst(t) = \max_{j \in \{1, \ldots, NP\}} fit_j(t) \tag{38}$$

For a maximum problem, $best(t)$ and $worst(t)$ are defined as,

$$best(t) = \max_{j \in \{1, \ldots, NP\}} fit_j(t) \tag{39}$$

Based on the above optimization procedures of DEGSA, a detailed description of the proposed algorithm is summarized as follows,

(1) Initialize the number of objects $N$ and the position $x_i$ and velocity $v_i$ of the $i$th object. Determine the parameters of DEGSA, e.g., the descending coefficient $\gamma$, the initial gravitational constant $G_0$, etc. Set the maximal number of iterations $iter_{max}$ and let the initial iteration be $t = 1$.
(2) Calculate the fitness value according to Eq. (26).
(3) If the iteration number $t$ is odd, go to step (4); otherwise, go to step (5).
(4) Activate GSA.
(5) Activate DE.
    (a) Generate a mutant vector $V_i(t)$ according to Eq. (23).
    (b) Generate a trial vector $U_i(t)$ according to Eq. (24).
    (c) If $fit(U_i(t)) \le fit(X_i(t))$, go to step (5)-(d); otherwise, go back to step (5)-(a).
    (d) If $i \le N$, go back to step (5)-(a); else, go to step (6).
(6) Update the parameters according to the new fitness.
(7) If $t \le iter_{max}$, go back to step (3); otherwise, go to step (8).
(8) Output the updated solutions as the optimal parameters.

Then, the proposed DEGSA is finished.
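The alternation between the GSA and DE branches in steps (3) to (7) can be illustrated with a short Python sketch. It is not the authors' implementation: the GSA mass, force and `Kbest` rules follow the commonly used GSA formulation, the DE branch uses a rand/1 mutation with binomial crossover and a single greedy selection per object (instead of repeating until the trial vector improves), one shared population is used for both branches, and the sphere test function is a stand-in for the fitness of Eq. (26). All of these choices, as well as the function name `degsa_minimize`, are assumptions. The default parameter values are taken from the settings reported for the KDD99 experiments.

```python
import numpy as np

def degsa_minimize(fit, bounds, n_agents=20, iter_max=20, g0=300.0, gamma=20.0,
                   f_low=0.2, f_up=0.8, cr=0.2, eps=2.0 ** -52, seed=0):
    """Alternate GSA (odd t) and DE (even t) updates on one shared population."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    dim = lo.size
    x = rng.uniform(lo, hi, size=(n_agents, dim))   # positions x_i
    v = np.zeros_like(x)                            # velocities v_i
    f = np.array([fit(xi) for xi in x])             # fitness values
    best_x, best_f = x[f.argmin()].copy(), float(f.min())
    history = [best_f]
    for t in range(1, iter_max + 1):
        if t % 2 == 1:
            # GSA branch (assumed standard form for a minimum problem)
            g = g0 * np.exp(-gamma * t / iter_max)  # decaying gravitational constant
            best, worst = f.min(), f.max()
            m = np.ones(n_agents) if best == worst else (f - worst) / (best - worst)
            mass = m / m.sum()
            k = max(1, round(n_agents * (1.0 - t / iter_max)))
            kbest = np.argsort(f)[:k]               # the k objects with best fitness
            acc = np.zeros_like(x)
            for i in range(n_agents):
                for j in kbest:
                    if j != i:
                        dist = np.linalg.norm(x[i] - x[j])
                        # Force divided by M_i, so only M_j appears (cf. Eq. (31))
                        acc[i] += rng.random() * g * mass[j] * (x[j] - x[i]) / (dist + eps)
            v = rng.random((n_agents, 1)) * v + acc  # velocity update, Eq. (32)
            x = np.clip(x + v, lo, hi)
            f = np.array([fit(xi) for xi in x])
        else:
            # DE branch (assumed rand/1 mutation and binomial crossover)
            for i in range(n_agents):
                others = [j for j in range(n_agents) if j != i]
                r1, r2, r3 = rng.choice(others, size=3, replace=False)
                scale = rng.uniform(f_low, f_up)    # scaling factor in [F_L, F_U]
                mutant = x[r1] + scale * (x[r2] - x[r3])
                cross = rng.random(dim) < cr
                cross[rng.integers(dim)] = True     # keep at least one mutant gene
                trial = np.clip(np.where(cross, mutant, x[i]), lo, hi)
                f_trial = fit(trial)
                if f_trial <= f[i]:                 # greedy selection
                    x[i], f[i] = trial, f_trial
        if f.min() < best_f:
            best_x, best_f = x[f.argmin()].copy(), float(f.min())
        history.append(best_f)
    return best_x, best_f, history

def sphere(z):
    """Illustrative fitness: squared distance from the origin."""
    return float(np.sum(z ** 2))

best_x, best_f, history = degsa_minimize(sphere, [(-5.0, 5.0)] * 3)
print(best_x, best_f)
```

In the setting of the paper, `fit` would evaluate an HKELM model for a candidate parameter vector (a, b, p, w) rather than the sphere function used here.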
Table 1
Definitions of the evaluation metrics.

| Name | Meaning | Notation |
|---|---|---|
| Precision | The proportion of instances correctly classified as the ith class among all instances predicted as the ith class | $P_i$ $(i = 0, 1, \ldots, c - 1)$ |
| Recall | The proportion of instances correctly classified as the ith class among all instances actually belonging to the ith class | $R_i$ $(i = 0, 1, \ldots, c - 1)$ |
| F-score | The balance between the precision and the recall | $F_i$ $(i = 0, 1, \ldots, c - 1)$ |
| Accuracy | The frequency of correct decisions | Acc |
| Mean accuracy | The average recall among all the classes of the dataset | MAcc |
| Mean F-score | The average F-score among all the classes of the dataset | MF |
| Attack accuracy | The accuracy rate for the attack classes | AAcc |
| False attack rate | The frequency of falsely predicting a normal instance as an attack | FAR |
| False normal rate | The frequency of falsely predicting an attack instance as normal | FNR |

corresponding eigenvectors. $k(x_n, x_{new})$ is the kernel function of $x_n$ and a new column vector sample $x_{new}$, which calculates the inner products of $x_n$ and $x_{new}$ in the high-dimensional feature space $\Gamma$.

3. Algorithm outline

The ID process based on the KPCA-DEGSA-HKELM approach is demonstrated in Fig. 6, which is also described in detail as follows,

Algorithm. The proposed KPCA-DEGSA-HKELM approach
Step 1: Input: ID dataset, the initial parameters of the KPCA-DEGSA-HKELM model.
Step 2: Dimensionality reduction and feature extraction of the input dataset by using the KPCA method.
Step 3: Train the HKELM model with the preprocessed training dataset and optimize the model parameters of the HKELM with the hybrid algorithm DEGSA. When the number of iterations reaches the maximum value $iter_{max}$, the optimization process stops, and the optimal parameters are obtained.
Step 4: Evaluate the obtained optimal HKELM model with the testing dataset.
Step 5: Output: the evaluation metrics of the ID problem.
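Steps 1 to 5 can be sketched end to end in Python on synthetic data. The snippet below is only an illustration under several assumptions: KPCA uses an RBF kernel, the KELM output weights use the usual regularized closed form with an assumed regularization coefficient `C`, the kernel parameters are fixed by hand instead of being tuned by DEGSA, and the two-class Gaussian data stand in for an ID dataset. All function names are hypothetical.

```python
import numpy as np

def rbf(X, Y, a):
    d2 = np.sum(X ** 2, 1)[:, None] + np.sum(Y ** 2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(d2, 0.0) / a)

def hybrid(X, Y, a, b, p, w):
    # Hybrid kernel of Eq. (14)
    return w * rbf(X, Y, a) + (1.0 - w) * (X @ Y.T + b) ** p

def kpca_fit(X, n_components, a):
    K = rbf(X, X, a)
    n = len(X)
    J = np.full((n, n), 1.0 / n)
    Kc = K - J @ K - K @ J + J @ K @ J            # center in feature space
    vals, vecs = np.linalg.eigh(Kc)
    top = np.argsort(vals)[::-1][:n_components]   # largest eigenvalues first
    alphas = vecs[:, top] / np.sqrt(vals[top])    # normalized eigenvectors
    return X, K, alphas, a

def kpca_transform(model, Xnew):
    X, K, alphas, a = model
    Kn = rbf(Xnew, X, a)                          # k(x_n, x_new) for all pairs
    n = len(X)
    Jn = np.full((n, n), 1.0 / n)
    Jm = np.full((len(Xnew), n), 1.0 / n)
    Kc = Kn - Jm @ K - Kn @ Jn + Jm @ K @ Jn
    return Kc @ alphas

def kelm_fit(Z, labels, n_classes, C, **kern):
    T = np.eye(n_classes)[labels]                 # one-hot target matrix
    omega = hybrid(Z, Z, **kern)
    beta = np.linalg.solve(np.eye(len(Z)) / C + omega, T)  # assumed KELM solution
    return Z, beta, kern

def kelm_predict(model, Znew):
    Z, beta, kern = model
    return np.argmax(hybrid(Znew, Z, **kern) @ beta, axis=1)

# Step 1: synthetic two-class data (label 0 = normal, 1 = attack), illustration only
rng = np.random.default_rng(0)

def blobs(n):
    X = np.vstack([rng.normal(0.0, 1.0, (n, 4)), rng.normal(3.0, 1.0, (n, 4))])
    return X, np.repeat([0, 1], n)

X_train, y_train = blobs(40)
X_test, y_test = blobs(20)

# Step 2: KPCA feature extraction
kpca = kpca_fit(X_train, n_components=2, a=10.0)
Z_train, Z_test = kpca_transform(kpca, X_train), kpca_transform(kpca, X_test)

# Step 3: train the HKELM (fixed parameters here; DEGSA would search a, b, p, w)
model = kelm_fit(Z_train, y_train, 2, C=10.0, a=1.0, b=1.0, p=2, w=0.9)

# Steps 4 and 5: evaluate on the testing data and report a metric
accuracy = np.mean(kelm_predict(model, Z_test) == y_test)
print("accuracy:", accuracy)
```

To connect this with Step 3 as described, the call to `kelm_fit` and the evaluation on a validation split would be wrapped in a fitness function that an optimizer such as DEGSA minimizes over (a, b, p, w).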
4. Environment and model evaluation

All the experiments in the current work are executed on a PC constituted by a 2.6 GHz Intel Core i5 processor with 8.0 GB of RAM. To validate the effectiveness of the KPCA-DEGSA-HKELM classification model, several multiclass classification evaluation metrics are utilized to gauge the models. In current work, the precision, recall and F-score are used as the evaluation metrics of each class, moreover, the accuracy, mean accuracy, mean F-score, attack accuracy, false alarm rate and false normal rate are utilized as the overall evaluation metrics. The detailed definitions of the evaluation metrics are provided in Table 1. In Table 1, the different categories of the network connection in the experiments are denoted as normal (labeled as 0), attacks (labeled as $1, 2, \ldots, c - 1$), where $c$ denotes the number of categories for the network connection. Take the KDD99 dataset as an example, which contains four attack categories of DoS, PRB, U2R and R2L. A confusion matrix of a classification experiment is shown in Table 2, where $N_{ij}$ represents the number of the $i$th kind of network connection that is predicted as the $j$th kind of network connection.

Then, all the evaluation metrics can be calculated as follows,

$$P_i = \frac{TP_i}{TP_i + FP_i} = \frac{N_{ii}}{\sum_{j=0}^{c-1} N_{ji}} \quad (i = 0, 1, \ldots, c - 1) \tag{43}$$
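Table 2
An example of a confusion matrix.

| Actual class \ Classified class | Normal (0) | DoS (1) | PRB (2) | U2R (3) | R2L (4) |
|---|---|---|---|---|---|
| Normal (0) | $N_{00}$ | $N_{01}$ | $N_{02}$ | $N_{03}$ | $N_{04}$ |
| DoS (1) | $N_{10}$ | $N_{11}$ | $N_{12}$ | $N_{13}$ | $N_{14}$ |
| PRB (2) | $N_{20}$ | $N_{21}$ | $N_{22}$ | $N_{23}$ | $N_{24}$ |
| U2R (3) | $N_{30}$ | $N_{31}$ | $N_{32}$ | $N_{33}$ | $N_{34}$ |
| R2L (4) | $N_{40}$ | $N_{41}$ | $N_{42}$ | $N_{43}$ | $N_{44}$ |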
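Given a confusion matrix laid out as in Table 2 (rows are actual classes, columns are classified classes, class 0 is normal), the overall metrics defined in Table 1 can be computed in a few lines. The Python sketch below is illustrative; the function name `overall_metrics` is hypothetical, and FAR is taken as one minus the recall of the normal class, following the definition in Table 1. The input is the Poly_KELM confusion matrix reported in Table 5(a).

```python
import numpy as np

def overall_metrics(conf):
    """conf[i, j]: number of class-i instances predicted as class j (class 0 = normal)."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)
    recall = tp / conf.sum(axis=1)         # R_i: normalize by the actual-class row
    precision = tp / conf.sum(axis=0)      # P_i: normalize by the predicted-class column
    f_score = 2 * recall * precision / (recall + precision)
    missed = conf[1:, 0].sum()             # attack instances predicted as normal
    return {
        "Acc": tp.sum() / conf.sum(),
        "MAcc": recall.mean(),
        "MF": f_score.mean(),
        "AAcc": recall[1:].mean(),         # mean recall over the attack classes
        "FAR": 1.0 - recall[0],            # normal instances predicted as attacks
        "FNR": missed / (missed + tp[1:].sum()),
    }

# Poly_KELM confusion matrix on the KDD99 testing dataset T1, from Table 5(a)
poly_kelm = [
    [940, 0, 0, 31, 29],
    [1, 497, 2, 0, 0],
    [48, 0, 411, 16, 25],
    [6, 0, 0, 41, 5],
    [67, 1, 0, 17, 915],
]
metrics = overall_metrics(poly_kelm)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

With these definitions, the computed Acc, MAcc, MF, AAcc and FNR agree with the Poly_KELM row of Table 6 (91.87%, 89.19%, 85.15%, 87.99% and 6.14%).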
$$R_i = \frac{TP_i}{TP_i + FN_i} = \frac{N_{ii}}{\sum_{j=0}^{c-1} N_{ij}} \quad (i = 0, 1, \ldots, c - 1) \tag{44}$$

$$F_i = \frac{2 \cdot R_i \cdot P_i}{R_i + P_i} \quad (i = 0, 1, \ldots, c - 1) \tag{45}$$

$$Acc = \frac{\sum_{i=0}^{c-1} TP_i}{\sum_{i=0}^{c-1} (TP_i + FN_i)} = \frac{\sum_{i=0}^{c-1} N_{ii}}{\sum_{i=0}^{c-1} \sum_{j=0}^{c-1} N_{ij}} \tag{46}$$

$$MAcc = \frac{1}{c} \sum_{i=0}^{c-1} \frac{TP_i}{TP_i + FN_i} = \frac{1}{c} \sum_{i=0}^{c-1} R_i \tag{47}$$

$$MF = \frac{1}{c} \sum_{i=0}^{c-1} F_i \tag{48}$$

$$AAcc = \frac{1}{c - 1} \sum_{i=1}^{c-1} \frac{TP_i}{TP_i + FN_i} = \frac{1}{c - 1} \sum_{i=1}^{c-1} R_i \tag{49}$$

$$FAR = \frac{FN_0}{TP_0 + FN_0} = 1 - R_0 \tag{50}$$

$$FNR = \frac{FP_0}{FP_0 + \sum_{i=1}^{c-1} TP_i} = \frac{\sum_{j=1}^{c-1} N_{j0}}{\sum_{j=1}^{c-1} N_{j0} + \sum_{i=1}^{c-1} N_{ii}} \tag{51}$$

Fig. 7. F-scores and mean F-scores for the Poly_KELM, RBF_KELM and HKELM methods with the KDD99 dataset.

5. Case study

5.1. Case 1: Intrusion detection of the KDD99 dataset

5.1.1. Dataset description

The KDD99 dataset is selected as a standard benchmark database to evaluate the effectiveness of the proposed models. In the KDD99 dataset, each instance consists of 41 features and a label, where the label is either normal or a specific attack type. As Table 3 shows, there are 22 types of attacks in total, which can be divided into four major attack categories [28]: DoS (denial of service), PRB (probing), U2R (user to root) and R2L (remote to local).

As the full dataset of KDD99 (18 M; 743 M uncompressed) is cumbersome for training machine learning algorithms, the 10% subset of the dataset (2.1 M; 75 M uncompressed) is used by the majority of researchers. Thus, the subset that maintains the initial characteristics of the full dataset is chosen to be the experiment dataset in the current work. Table 4 provides the detailed information of the instances in the datasets, where the training and two testing datasets are denoted as $T_0$, $T_1$ and $T_2$, respectively. The training dataset $T_0$ together with the testing dataset $T_1$ are used to demonstrate the effectiveness of the proposed HKELM, DEGSA-HKELM and KPCA-DEGSA-HKELM models. Furthermore, to be consistent with other literature works such as the KDD99 winner [48], another testing dataset $T_2$ is applied to compare the performance of the proposed KPCA-DEGSA-HKELM model with the models of other research.
Table 3
Attack categories of the KDD99 dataset.
Class Meaning Attacks of KDD99
DoS Denial of service back, land, neptune, pod, smurf, teardrop
PRB Surveillance and other means of probing ipsweep, nmap, portsweep, satan
U2R Unauthorized access to local (root) privileges buffer_overflow, loadmodule, perl, rootkit
R2L Unauthorized access from a remote machine ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster
Table 4
The instances of training and testing datasets for the KDD99 dataset (number of instances per class).

| Class | Training dataset T0 | Testing dataset T1 | Testing dataset T2 | 10% KDD99 dataset |
|---|---|---|---|---|
| Normal | 200 | 1000 | 10000 | 97278 |
| DoS | 60 | 500 | 40000 | 391458 |
| PRB | 40 | 500 | 400 | 4107 |
| U2R | 30 | 52 | 52 | 52 |
| R2L | 60 | 1000 | 100 | 1126 |
Table 5
Confusion matrices of the KDD99 dataset T1.

(a) By the Poly_KELM method

| Actual class \ Classified class | Normal | DoS | PRB | U2R | R2L | Recall (%) |
|---|---|---|---|---|---|---|
| Normal | 940 | 0 | 0 | 31 | 29 | 94.00 |
| DoS | 1 | 497 | 2 | 0 | 0 | 99.40 |
| PRB | 48 | 0 | 411 | 16 | 25 | 82.20 |
| U2R | 6 | 0 | 0 | 41 | 5 | 78.85 |
| R2L | 67 | 1 | 0 | 17 | 915 | 91.50 |
| Precision (%) | 88.51 | 99.80 | 99.52 | 39.05 | 93.94 | |

(b) By the RBF_KELM method

| Actual class \ Classified class | Normal | DoS | PRB | U2R | R2L | Recall (%) |
|---|---|---|---|---|---|---|
| Normal | 978 | 0 | 2 | 4 | 16 | 97.80 |
| DoS | 3 | 463 | 34 | 0 | 0 | 92.60 |
| PRB | 55 | 15 | 427 | 0 | 3 | 85.40 |
| U2R | 10 | 0 | 0 | 33 | 9 | 63.46 |
| R2L | 64 | 7 | 1 | 7 | 921 | 92.10 |
| Precision (%) | 88.11 | 95.46 | 92.03 | 75.00 | 97.05 | |

(c) By the HKELM method

| Actual class \ Classified class | Normal | DoS | PRB | U2R | R2L | Recall (%) |
|---|---|---|---|---|---|---|
| Normal | 965 | 0 | 0 | 23 | 12 | 96.50 |
| DoS | 2 | 496 | 2 | 0 | 0 | 99.20 |
| PRB | 32 | 0 | 429 | 15 | 24 | 85.80 |
| U2R | 4 | 0 | 0 | 46 | 2 | 88.46 |
| R2L | 78 | 1 | 0 | 11 | 910 | 91.00 |
| Precision (%) | 89.27 | 99.80 | 99.54 | 48.42 | 95.99 | |

(a) The parameters are set as b = 15, and p = 4.
(b) The parameter is set as a = 100.
(c) The parameters are set as a = 100, b = 15, p = 4, and w = 0.9.

Table 6
Comparison of the overall evaluation metrics with the KDD99 dataset T1.

| Method | Acc (%) | MAcc (%) | MF (%) | AAcc (%) | FNR (%) |
|---|---|---|---|---|---|
| Poly_KELM | 91.87 | 89.19 | 85.15 | 87.99 | 6.14 |
| RBF_KELM | 92.46 | 86.27 | 87.71 | 83.39 | 6.68 |
| HKELM | 93.25 | 92.19 | 88.08 | 91.12 | 5.81 |

RBF_KELM and proposed HKELM methods are given in Table 5(a)–(c).

To demonstrate the performance of Poly_KELM, RBF_KELM and HKELM, a more visual illustration is shown in Fig. 7, which compares the three methods in terms of the F-score of each class and the mean F-score. In Fig. 7, the F-scores of the normal and PRB classes and the mean F-score obtained by the HKELM method are higher than those of the other two methods. In addition, the overall evaluation metrics of the three methods are shown in Table 6. It is clear that the HKELM method is better than the other two methods for all the overall evaluation metrics.

For the proposed HKELM method, the appropriate values of the hybrid kernel function parameters $a$, $b$, $p$ and $w$ in Eq. (14) need to be selected. By utilizing the HKELM method that is constructed upon the training dataset $T_0$ and testing dataset $T_1$, the testing results with different values of $a$, $b$, $p$ and $w$ are shown in Fig. 8(a)–(d).

As shown in Fig. 8, the values of these four parameters affect the accuracy of the classification; therefore, it is important to determine the appropriate set of the four parameter values.

(2) DEGSA-HKELM

It is difficult to quickly and precisely find the optimal parameter values via human experience. As a result, DEGSA integrated with a swarm intelligent optimization algorithm is introduced to adaptively obtain the optimal parameter values in the current work. Meanwhile, the DE-HKELM and GSA-HKELM methods are considered for comparison with the DEGSA-HKELM method proposed in the paper. In the current work, the internal parameters of the DE branch of the DEGSA-HKELM method are consistent with those of the DE-HKELM method, which are set as follows: the population size is $N_{DE} = 10$, the maximum number of iterations is $iter_{max} = 20$, the lower bound of the scaling parameter $F$ in Eq. (23) is $F_L = 0.2$, the upper bound of the scaling parameter $F$ in Eq. (23) is $F_U = 0.8$, and the crossover rate in Eq. (24) is $CR = 0.2$. In addition, the internal parameters of the GSA branch of the DEGSA-HKELM method are consistent with those of the GSA-HKELM method, which are set as follows: the number of agents is $N_{GSA} = 20$, the maximum number of iterations is $iter_{max} = 20$, the small constant in Eq. (28) is $\varepsilon = 2^{-52}$, the initial gravitational constant in Eq. (29) is $G_0 = 300$, and the descending coefficient in Eq. (29) is $\gamma = 20$.

Table 7 shows the detailed F-score of each class obtained by DE-HKELM, GSA-HKELM and DEGSA-HKELM. From Table 7, the F-score of each class obtained by DEGSA-HKELM is higher than that of the other two methods. In particular, the F-score of U2R obtained by DE-HKELM is 77.78%, GSA-HKELM is 81.48% and DEGSA-HKELM is 85.46%. DEGSA-HKELM improves the F-score of U2R by approximately 7.68 and 3.98 percentage points compared to that of DE-HKELM and GSA-HKELM, respectively. Furthermore, the mean F-score obtained by DE-HKELM is 92.77%, GSA-HKELM is 93.61% and DEGSA-HKELM is 94.86%. DEGSA-HKELM achieves increases of approximately 2.09 and 1.25 percentage points in the mean F-score when compared to that of DE-HKELM and GSA-HKELM, respectively.
Fig. 8. Testing results for the KDD99 dataset with different values of a, b, p, and w.
Table 9
Comparison of the overall evaluation metrics between DEGSA-HKELM and KPCA-DEGSA-HKELM with the KDD99 dataset T1.

| Method | Acc (%) | MAcc (%) | MF (%) | AAcc (%) | FAR (%) | FNR (%) | Testing time (s) |
|---|---|---|---|---|---|---|---|
| DEGSA-HKELM | 96.59 | 95.58 | 94.86 | 94.70 | 0.90 | 4.12 | 0.033581 |
| KPCA-DEGSA-HKELM | 96.69 | 95.66 | 94.38 | 94.85 | 1.10 | 3.73 | 0.012569 |
Table 10
Comparison of the solutions achieved by the proposed approach with those of other literature methods with the KDD99 dataset T2.

| Method | Precision, Normal (%) | Precision, DoS (%) | Precision, PRB (%) | Precision, U2R (%) | Precision, R2L (%) | Acc (%) | MAcc (%) | MF (%) | AAcc (%) | FAR (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| KDDwinner [48] | 99.45 | 97.12 | 83.32 | 13.16 | 8.40 | 92.71 | 81.91 | 58.87 | 83.74 | 25.39 |
| SVM [49] | 99.30 | 99.50 | 97.50 | 19.70 | 28.80 | 95.70 | N/A | N/A | N/A | 0.70 |
| CSVAC [5] | 99.91 | 99.72 | 65.74 | 42.59 | 20.47 | 94.86 | 75.26 | 66.20 | 69.83 | 3.04 |
| CPSO-SVM [18] | 96.87 | 99.98 | 63.61 | 11.08 | 50.27 | 98.05 | 93.45 | 71.28 | 92.62 | 3.26 |
| RTMAS-AIDS [15] | 97.89 | 99.79 | 91.86 | 24.68 | 35.90 | 95.86 | N/A | N/A | N/A | 2.13 |
| Dendron [6] | 99.36 | 99.12 | 82.83 | 52.63 | 79.54 | 98.85 | 89.85 | 85.77 | 87.50 | 0.75 |
| Current work | 96.35 | 99.99 | 89.57 | 54.55 | 68.15 | 99.00 | 95.38 | 87.21 | 94.47 | 0.94 |
Table 11
The results of the CPSO-SVM method and the current work for different KDD99 testing datasets.

| Testing dataset | Acc (%), CPSO-SVM | Acc (%), current work | Training time (s), CPSO-SVM | Training time (s), current work | Training time saved | Testing time (s), CPSO-SVM | Testing time (s), current work | Testing time saved |
|---|---|---|---|---|---|---|---|---|
| T1 | 94.35 | 96.69 | 19.915382 | 13.204581 | 33.70% | 0.041214 | 0.012569 | 69.50% |
| T2 | 98.05 | 99.00 | 20.047230 | 14.830142 | 26.02% | 0.426238 | 0.168058 | 60.57% |
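The "time saved" entries of Table 11 are consistent with the relative reduction in time with respect to CPSO-SVM. The formula below is inferred from the tabulated values rather than stated in the text, so it should be read as an assumption; the worked case is the training time for T1.

$$\text{Time saved} = \frac{t_{\text{CPSO-SVM}} - t_{\text{current}}}{t_{\text{CPSO-SVM}}} \times 100\% = \frac{19.915382 - 13.204581}{19.915382} \times 100\% \approx 33.70\%$$

The same expression applied to the T1 testing times, $(0.041214 - 0.012569) / 0.041214$, gives approximately 69.50%.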
Table 12
Attack categories of the UNSW-NB15 dataset.
Class Meaning
Generic A technique that works against all block-ciphers (with a given
block and key size), without consideration of the structure of the
block-cipher
Exploits The attacker knows of a security problem within an operating
system or a piece of software and leverages that knowledge by
exploiting the vulnerability
Fuzzers Attempting to cause a program or network to suspend by
feeding it randomly generated data
DoS A malicious attempt to make a server or a network resource
unavailable to users, usually by temporarily interrupting or
suspending the services of a host connected to the Internet
Reconnaissance Contains all strikes that can simulate attacks that gather
information
Analysis It contains different attacks, including port scan, spam and html
file penetrations
Backdoors A technique in which a system security mechanism is bypassed
stealthily to access a computer or its data
Shellcode A small piece of code used as the payload in the exploitation of
software vulnerability
Worms Attacker replicates itself to spread to other computers. Often, it
uses a computer network to spread itself, relying on the security
failures of the target computer to access it
Fig. 10. F-scores and mean F-scores for CPSO-SVM, Dendron and KPCA-DEGSA-HKELM with the UNSW-NB15 dataset.
The UNSW-NB15 dataset has been divided into two subsets, namely, the training set and testing set [29], which contain 175341 and 82332 records, respectively. It is noted that the partitioned dataset has only 43 features with the class label, removing 6 features from the original dataset. The training set and testing set are used by the majority of researchers; therefore, they are chosen to be the experiment datasets in the current work. Table 13 indicates the detailed information of the datasets,
Table 13
The instances of training and testing datasets for the UNSW-NB15 dataset (number of instances per class).

| Class | UNSW-NB15 training set | UNSW-NB15 testing set | Training dataset D0 | Testing dataset D1 |
|---|---|---|---|---|
| Normal | 56000 | 37000 | 3420 | 30786 |
| Generic | 40000 | 18871 | 366 | 3291 |
| Exploits | 33393 | 11132 | 760 | 6849 |
| Fuzzers | 18184 | 6062 | 485 | 4353 |
| DoS | 12264 | 4089 | 172 | 1546 |
| Reconnaissance | 10491 | 3496 | 270 | 2433 |
| Analysis | 2000 | 677 | 45 | 401 |
| Backdoors | 1746 | 583 | 40 | 306 |
| Shellcode | 1133 | 378 | 40 | 338 |
| Worms | 130 | 44 | 22 | 22 |
| Total | 175341 | 82332 | 5620 | 50325 |
Table 14
Confusion matrix of the UNSW-NB15 dataset D1.

| Actual class \ Classified class | Norm. | Gene. | Expl. | Fuzz. | DoS | Reco. | Anal. | Back. | Shell. | Worms | Recall (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Norm. | 30043 | 7 | 208 | 310 | 6 | 52 | 17 | 6 | 134 | 3 | 97.59 |
| Gene. | 44 | 3014 | 126 | 42 | 30 | 14 | 1 | 0 | 20 | 0 | 91.58 |
| Expl. | 363 | 15 | 5895 | 176 | 136 | 118 | 9 | 12 | 121 | 4 | 86.07 |
| Fuzz. | 958 | 8 | 82 | 3107 | 32 | 66 | 6 | 3 | 90 | 1 | 71.38 |
| DoS | 121 | 28 | 648 | 141 | 395 | 98 | 17 | 7 | 90 | 1 | 25.55 |
| Reco. | 102 | 6 | 172 | 38 | 37 | 1976 | 6 | 0 | 96 | 0 | 81.22 |
| Anal. | 54 | 0 | 69 | 62 | 70 | 31 | 101 | 0 | 13 | 1 | 25.19 |
| Back. | 30 | 0 | 47 | 68 | 56 | 34 | 15 | 41 | 15 | 0 | 13.40 |
| Shell. | 77 | 0 | 4 | 11 | 0 | 31 | 0 | 0 | 215 | 0 | 63.61 |
| Worms | 1 | 0 | 12 | 0 | 0 | 0 | 0 | 0 | 0 | 9 | 40.91 |
| Precision (%) | 94.50 | 97.92 | 81.16 | 78.56 | 51.84 | 81.65 | 58.72 | 59.42 | 27.08 | 47.37 | |

Norm.: Normal, Gene.: Generic, Expl.: Exploits, Fuzz.: Fuzzers, Reco.: Reconnaissance, Anal.: Analysis, Back.: Backdoors, Shell.: Shellcode.
where the training and testing datasets are denoted as D0 and D1, respectively. D0 and D1 are applied to compare the performance of the proposed KPCA-DEGSA-HKELM model with other literature methods.

5.2.2. Experimental results and discussion
The confusion matrix derived from the testing process of the proposed KPCA-DEGSA-HKELM approach is given in Table 14.
Table 15 shows the detailed F-score of each class obtained by CPSO-SVM [18], Dendron [6] and the proposed KPCA-DEGSA-HKELM approach. From Table 15, the F-scores for the classes Normal, Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Shellcode and Worms achieved by the KPCA-DEGSA-HKELM approach are higher than those of the other two methods, and the F-score of the Backdoors class obtained by the KPCA-DEGSA-HKELM approach is higher than that of the CPSO-SVM method. In particular, the F-scores for the DoS, Reconnaissance and Worms classes are improved by more than 10 percentage points compared with those of the other two methods. Moreover, the mean F-score is 60.37% for KPCA-DEGSA-HKELM, 46.49% for CPSO-SVM and 48.81% for Dendron. The KPCA-DEGSA-HKELM approach therefore improves the mean F-score by 13.88 and 11.56 percentage points over CPSO-SVM and Dendron, respectively.

Table 15
The detailed F-score of each class obtained by KPCA-DEGSA-HKELM and the other literature methods with the UNSW-NB15 dataset.
Class CPSO-SVM [18] (%) Dendron [6] (%) Current work (%)
Normal 92.81 95.58 96.02
Generic 87.45 88.96 94.65
Exploits 74.21 76.22 83.55
Fuzzers 44.94 68.84 74.80
DoS 20.23 16.76 34.23
Reconnaissance 56.15 53.42 81.43
Analysis 15.89 31.48 35.25
Backdoors 12.41 28.01 21.87
Shellcode 32.99 22.32 37.99
Worms 27.83 6.56 43.90
Mean 46.49 48.81 60.37

In addition, to indicate the performance of KPCA-DEGSA-HKELM more intuitively, the F-scores and mean F-scores of CPSO-SVM, Dendron and KPCA-DEGSA-HKELM are shown in Fig. 10. The column bars containing slashes, horizontal lines and pure black are the F-scores obtained by CPSO-SVM, Dendron and KPCA-DEGSA-HKELM, respectively. It is apparent that the F-scores for nine of the ten classes and the overall mean F-score obtained by KPCA-DEGSA-HKELM are higher than those of the other two methods.
Table 16 gives the comparison of the overall evaluation metrics for the proposed approach and other literature methods. The accuracy (Acc), mean accuracy (MAcc), mean F-score (MF) and attack accuracy (AAcc) of the current work are higher than those of other literature methods, while the false attack rate (FAR) of the current work is lower than that of other literature methods. In particular, the accuracy of the proposed KPCA-DEGSA-HKELM approach is 89.01%, which is 3.45 percentage points higher than the second-best result of 85.56% achieved by DT [50]. Therefore, the proposed approach improves detection performance on the UNSW-NB15 dataset.
In addition, to demonstrate the advantage in computational efficiency of the current work, the results (accuracy, training and testing time) of CPSO-SVM [18] and the proposed KPCA-DEGSA-HKELM with the same training dataset D0 are shown in Table 17.
As Table 17 indicates, the accuracy of the current work is 89.01%, which is 7.95 percentage points higher than the 81.06% of the CPSO-SVM method. Furthermore, the training and testing times of CPSO-SVM are 114.226274 s and 14.430140 s, while those of KPCA-DEGSA-HKELM are 43.306235 s and 2.567050 s, respectively. Specifically, KPCA-DEGSA-HKELM achieves savings of 62.09% in training time and 82.21% in testing time,
when compared with those of CPSO-SVM. The results of Table 17 indicate the time-saving benefits of KPCA-DEGSA-HKELM for the UNSW-NB15 dataset.

Table 16
Comparison of overall evaluation metrics for the proposed approach and other literature methods with the UNSW-NB15 dataset.
Method Acc (%) MAcc (%) MF (%) AAcc (%) FAR (%)
CPSO-SVM [18] 81.06 49.98 46.49 44.99 5.16
ANN [50] 81.34 N/A N/A N/A 21.13
NB [50] 82.07 N/A N/A N/A 18.56
DT [50] 85.56 N/A N/A N/A 15.78
GALR-DT [51] 81.42 N/A N/A N/A 6.39
MP [6] 73.89 27.91 26.61 20.57 4.56
C4.5 [6] 85.15 49.33 48.79 44.14 2.54
Dendron [6] 84.33 52.21 48.81 47.19 2.61
CAI [23] 82.74 N/A N/A N/A 36.46
Current work 89.01 59.65 60.37 55.43 2.41

Table 17
The results of the CPSO-SVM method and current work with the testing dataset D1.
Method Acc (%) Training time (s) Testing time (s)
CPSO-SVM [18] 81.06 114.226274 14.430140
Current work 89.01 43.306235 2.567050
Accuracy improved/time saved 7.95% 62.09% 82.21%
The results are obtained on the same computational platform.

5.3. Case 3: Intrusion detection with the industrial TE process

To further demonstrate the ID performance of the proposed KPCA-DEGSA-HKELM approach in a real and complex environment, an industrial simulation experiment platform with a nonlinear and complex multicomponent TE process is built. This platform simulates the continuous chemical system and attack activities in the real TE process to obtain the TE intrusion data.

Table 18
Attack categories of the TE intrusion dataset.
Attack Target
Step C header pressure loss - reduced availability (stream 4)
Random variation A, B, and C feed compositions (stream 4)
Slow drift Reaction kinetics
Sticking Reactor cooling water valve

5.3.1. Intrusion simulation experiment of the TE process
The revised TE process model was created by Bathelt et al. [52], which provides a realistic chemical simulation platform. Five major units, namely the reactor, product condenser, vapor–liquid separator, product stripper and recycle compressor, constitute the TE process [53]. The revised TE process contains four reactants named A, C, D and E as well as an inert component, B. The setup produces two liquid products named G and H and a byproduct named F through a reaction system composed of four irreversible chemical reactions, and the flowchart of the TE process is shown in Fig. 11 [26,52].
The intrusion experiment of the revised TE process was conducted in operating mode 1, which seems to be the most commonly used mode in the literature [30]. The process was first run for 40 h under normal operating conditions (Phase I), and an attack (from the step, random variation, slow drift or sticking attack categories) was then introduced to the process for 100 h (Phase II). Taking the slow drift attack as an example, the corresponding plots of the reactor pressure and G product quality are shown in Figs. 12 and 13, respectively. Figs. 12 and 13 illustrate that the process is in control in Phase I; however, once a slow drift attack is introduced at 40 h, the process tends to become out of control in Phase II. In a practical industrial process, fluctuations in process variables, such as reactor pressure, can cause a significant reduction in product quality and even severe damage to the equipment. As a result, it is important to develop an effective ID system for detecting malicious activities in industrial processes.
Finally, the values of the 41 measured variables and 12 manipulated variables were collected in sequence during the continuous
6. Conclusions
Table 20
Comparison of the solutions between the proposed approach and CPSO-SVM with the TE intrusion dataset.
Method F-score Normal (%) F-score Step (%) F-score Random (%) F-score Slow (%) F-score Sticking (%) Acc (%) MF (%) AAcc (%) FAR (%)
CPSO-SVM [18] 88.57 95.99 90.19 87.40 91.98 90.71 90.83 90.78 9.58
Current work 91.28 99.09 95.54 97.41 96.21 95.82 95.90 95.16 1.53
Table 21
Results of the CPSO-SVM method and current work with the testing dataset E1.
Method Acc (%) Training time (s) Testing time (s)
CPSO-SVM [18] 90.71 54.184571 0.252235
Current work 95.82 21.283896 0.128424
Accuracy improved/time saved 5.11% 60.72% 49.09%
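The "time saved" entries in Tables 17 and 21 appear to be relative reductions with respect to the CPSO-SVM time, while the "accuracy improved" entries are differences in percentage points. The short sketch below checks this reading against the reported numbers; the helper name `percent_saved` and the dictionary layout are hypothetical and only serve the illustration.

```python
def percent_saved(baseline_seconds, new_seconds):
    """Relative time reduction in percent, measured against the baseline."""
    return 100.0 * (baseline_seconds - new_seconds) / baseline_seconds


# (accuracy %, training time s, testing time s) as reported in Tables 17 and 21.
RESULTS = {
    "UNSW-NB15, Table 17": {
        "CPSO-SVM": (81.06, 114.226274, 14.430140),
        "Current work": (89.01, 43.306235, 2.567050),
    },
    "TE process, Table 21": {
        "CPSO-SVM": (90.71, 54.184571, 0.252235),
        "Current work": (95.82, 21.283896, 0.128424),
    },
}


def summarize(entry):
    """Return (accuracy gain in points, training time saved %, testing time saved %)."""
    base_acc, base_train, base_test = entry["CPSO-SVM"]
    new_acc, new_train, new_test = entry["Current work"]
    return (
        new_acc - base_acc,
        percent_saved(base_train, new_train),
        percent_saved(base_test, new_test),
    )


if __name__ == "__main__":
    for name, entry in RESULTS.items():
        gain, train_saved, test_saved = summarize(entry)
        print(f"{name}: +{gain:.2f} points, "
              f"training {train_saved:.2f}% saved, testing {test_saved:.2f}% saved")
```

With the reported timings, this reading reproduces the tabulated 62.09% and 82.21% (UNSW-NB15) and 60.72% and 49.09% (TE process) to two decimal places.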
Notation

a Exponent parameter of the RBF kernel function
b Constant parameter of the polynomial kernel function
B Euclidean distance
C Penalty parameter
fit Fitness value
G Gravitational constant
H Output matrix of the hidden layer
I Identity matrix
K Kernel function
M Gravitational mass
p Exponent parameter of the polynomial kernel function
tar Target vector
T Output matrix of the target
T0 Training dataset
T1, T2 Testing datasets
U Trial vector
V Mutant vector
w Weight coefficient of the hybrid kernel function
x Input feature vector
y Output vector
α Weight vector between the input layer and hidden layer
β Weight matrix of the hidden layer
γ Descending coefficient
Γ High-dimensional feature space
Φ Nonlinear mapping function

CRediT authorship contribution statement

Lu Lv: Methodology, Software, Writing - original draft. Wenhai Wang: Supervision, Project administration. Zeyin Zhang: Supervision, Project administration. Xinggao Liu: Supervision, Writing - review & editing, Funding acquisition.

Acknowledgments

This work is supported by the National Key R&D Program of China (grant number 2018YFB2004200), Zhejiang Provincial Natural Science Foundation, PR China (grant number LY18D060002), and National Natural Science Foundation of China (grant number 61590921), and their support is hereby acknowledged.

References

[1] H.W. Wang, J. Gu, S.S. Wang, An effective intrusion detection framework based on SVM with feature augmentation, Knowl.-Based Syst. 136 (2017) 130–139, http://dx.doi.org/10.1016/j.knosys.2017.09.014.
[2] A.L. Buczak, E. Guven, A survey of data mining and machine learning methods for cyber security intrusion detection, IEEE Commun. Surv. Tutor. 18 (2016) 1153–1176, http://dx.doi.org/10.1109/comst.2015.2494502.
[3] R. Yahalom, A. Steren, Y. Nameri, M. Roytman, A. Porgador, Y. Elovici, Improving the effectiveness of intrusion detection systems for hierarchical data, Knowl.-Based Syst. 168 (2019) 59–69, http://dx.doi.org/10.1016/j.knosys.2019.01.002.
[4] T. Aldwairi, D. Perera, M.A. Novotny, An evaluation of the performance of restricted Boltzmann machines as a model for anomaly network intrusion detection, Comput. Netw. 144 (2018) 111–119, http://dx.doi.org/10.1016/j.comnet.2018.07.025.
[5] W.Y. Feng, Q.L. Zhang, G.Z. Hu, J.X.J. Huang, Mining network data for intrusion detection through combining SVMs with ant colony networks, Future Gener. Comput. Syst. 37 (2014) 127–140, http://dx.doi.org/10.1016/j.future.2013.06.027.
[6] D. Papamartzivanos, F.G. Marmol, G. Kambourakis, Dendron: Genetic trees driven rule induction for network intrusion detection systems, Future Gener. Comput. Syst. 79 (2018) 558–574, http://dx.doi.org/10.1016/j.future.2017.09.056.
[7] C.H. Tsang, S. Kwong, H.L. Wang, Genetic-fuzzy rule mining approach and evaluation of feature selection techniques for anomaly intrusion detection, Pattern Recognit. 40 (2007) 2373–2391, http://dx.doi.org/10.1016/j.patcog.2006.12.009.
[8] F. Salo, A.B. Nassif, A. Essex, Dimensionality reduction with IG-PCA and ensemble classifier for network intrusion detection, Comput. Netw. 148 (2019) 164–175, http://dx.doi.org/10.1016/j.comnet.2018.11.010.
[9] C.F. Tsai, C.Y. Lin, A triangle area based nearest neighbors approach to intrusion detection, Pattern Recognit. 43 (2010) 222–229, http://dx.doi.org/10.1016/j.patcog.2009.05.017.
[10] Y. Li, L. Guo, An active learning based TCM-KNN algorithm for supervised network intrusion detection, Comput. Secur. 26 (2007) 459–467, http://dx.doi.org/10.1016/j.cose.2007.10.002.
[11] C. Xiang, P.C. Yong, L.S. Meng, Design of multiple-level hybrid classifier for intrusion detection system using Bayesian clustering and decision trees, Pattern Recognit. Lett. 29 (2008) 918–924, http://dx.doi.org/10.1016/j.patrec.2008.01.008.
[12] G.Y. Chan, C.S. Lee, S.H. Heng, Policy-enhanced ANFIS model to counter SOAP-related attacks, Knowl.-Based Syst. 35 (2012) 64–76, http://dx.doi.org/10.1016/j.knosys.2012.04.013.
[13] G.Y. Chan, C.S. Lee, S.H. Heng, Discovering fuzzy association rule patterns and increasing sensitivity analysis of XML-related attacks, J. Netw. Comput. Appl. 36 (2013) 829–842, http://dx.doi.org/10.1016/j.jnca.2012.11.006.
[14] G.Y. Chan, C.S. Lee, S.H. Heng, Defending against XML-related attacks in e-commerce applications with predictive fuzzy associative rules, Appl. Soft. Comput. 24 (2014) 142–157, http://dx.doi.org/10.1016/j.asoc.2014.06.053.
[15] W.L. Al-Yaseen, Z.A. Othman, M.Z.A. Nazri, Real-time multi-agent system for an adaptive intrusion detection system, Pattern Recognit. Lett. 85 (2017) 56–64, http://dx.doi.org/10.1016/j.patrec.2016.11.018.
[16] P.Y. Tao, Z. Sun, Z.X. Sun, An improved intrusion detection algorithm based on GA and SVM, IEEE Access 6 (2018) 13624–13631, http://dx.doi.org/10.1109/access.2018.2810198.
[17] J.P. Liu, J.Z. He, W.X. Zhang, T.Y. Ma, Z.H. Tang, J.P. Niyoyita, W.H. Gui, ANID-SEoKELM: Adaptive network intrusion detection based on selective ensemble of kernel ELMs with random features, Knowl.-Based Syst. 177 (2019) 104–116, http://dx.doi.org/10.1016/j.knosys.2019.04.008.
[18] F.J. Kuang, S.Y. Zhang, Z. Jin, W.H. Xu, A novel SVM by combining kernel principal component analysis and improved chaotic particle swarm optimization for intrusion detection, Soft Comput. 19 (2015) 1187–1199, http://dx.doi.org/10.1007/s00500-014-1332-7.
[19] A.I. Saleh, F.M. Talaat, L.M. Labib, A hybrid intrusion detection system (HIDS) based on prioritized k-nearest neighbors and optimized SVM classifiers, Artif. Intell. Rev. 51 (2019) 403–443, http://dx.doi.org/10.1007/s10462-017-9567-1.
[20] M.R.G. Raman, N. Somu, K. Kirthivasan, R. Liscano, V.S.S. Sriram, An efficient intrusion detection system based on hypergraph - Genetic algorithm for parameter optimization and feature selection in support vector machine, Knowl.-Based Syst. 134 (2017) 1–12, http://dx.doi.org/10.1016/j.knosys.2017.07.005.
[21] A.A. Aburomman, M.B.I. Reaz, A novel SVM-kNN-PSO ensemble method for intrusion detection system, Appl. Soft. Comput. 38 (2016) 360–372, http://dx.doi.org/10.1016/j.asoc.2015.10.011.
[22] A.A. Aburomman, M.B. Reaz, A novel weighted support vector machines multiclass classifier based on differential evolution for intrusion detection systems, Inf. Sci. 414 (2017) 225–246, http://dx.doi.org/10.1016/j.ins.2017.06.007.
[23] C.R. Wang, R.F. Xu, S.J. Lee, C.H. Lee, Network intrusion detection using equality constrained-optimization-based extreme learning machines, Knowl.-Based Syst. 147 (2018) 68–80, http://dx.doi.org/10.1016/j.knosys.2018.02.015.
[24] J.H. Ku, B. Zheng, Intrusion detection based on self-adaptive differential evolution extreme learning machine with Gaussian kernel, in: G. Chen, H. Shen, M. Chen (Eds.), Parallel Archit. Algorithm Program, Paap 2017, 2017, pp. 13–24, http://dx.doi.org/10.1007/978-981-10-6442-5_2.
[25] H. Bostani, M. Sheikhan, Hybrid of binary gravitational search algorithm and mutual information for feature selection in intrusion detection systems, Soft Comput. 21 (2017) 2307–2324, http://dx.doi.org/10.1007/s00500-015-1942-8.
[26] S.M. He, L. Xiao, Y.L. Wang, X.G. Liu, C.H. Yang, J.G. Lu, W.H. Gui, Y.X. Sun, A novel fault diagnosis method based on optimal relevance vector machine, Neurocomputing 267 (2017) 651–663, http://dx.doi.org/10.1016/j.neucom.2017.06.024.
[27] X. Qiu, K.C. Tan, J.X. Xu, Multiple exponential recombination for differential evolution, IEEE Trans. Cybern. 47 (2017) 995–1006, http://dx.doi.org/10.1109/tcyb.2016.2536167.
[28] W. Lee, S.J. Stolfo, A framework for constructing features and models for intrusion detection systems, ACM Trans. Inf. Syst. Secur. 3 (2000) 227–261, http://dx.doi.org/10.1145/382912.382914.
[29] N. Moustafa, J. Slay, UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set), in: IEEE Military Communications and Information Systems Conference, MilCIS, 2015, http://dx.doi.org/10.1109/MilCIS.2015.7348942.
[30] F. Capaci, E. Vanhatalo, M. Kulahci, The revised Tennessee Eastman process simulator as testbed for SPC and DoE methods, Qual. Eng. 31 (2019) 212–229, http://dx.doi.org/10.1080/08982112.2018.1461905.
[31] G.B. Huang, Q.Y. Zhu, C.K. Siew, Extreme learning machine: A new learning scheme of feedforward neural networks, in: 2004 IEEE Int. Jt. Conf. Neural Networks, Vols. 1–4, Proc, 2004, pp. 985–990, http://dx.doi.org/10.1109/IJCNN.2004.1380068.
[32] G.B. Huang, Q.Y. Zhu, C.K. Siew, Extreme learning machine: Theory and applications, Neurocomputing 70 (2006) 489–501, http://dx.doi.org/10.1016/j.neucom.2005.12.126.
[33] J. Wu, Y. Zhu, Z.C. Wang, Z.J. Song, X.G. Liu, W.H. Wang, Z.Y. Zhang, Y.S. Yu, Z.P. Xu, T.J. Zhang, J.H. Zhou, A novel ship classification approach for high resolution SAR images based on the BDA-KELM classification model, Int. J. Remote Sens. 38 (2017) 6457–6476, http://dx.doi.org/10.1080/01431161.2017.1356487.
[34] G.B. Huang, H.M. Zhou, X.J. Ding, R. Zhang, Extreme learning machine for regression and multiclass classification, IEEE Trans. Syst. Man Cybern. B 42 (2012) 513–529, http://dx.doi.org/10.1109/tsmcb.2011.2168604.
[35] G.F. Smits, E.M. Jordaan, Improved SVM regression using mixtures of kernels, in: Proceeding 2002 Int. Jt. Conf. Neural Networks, Vols. 1–3, 2002, pp. 2785–2790, http://dx.doi.org/10.1109/IJCNN.2002.1007589.
[36] Z.D. Tian, S.J. Li, Y.H. Wang, X.D. Wang, Wind power prediction method based on hybrid kernel function support vector machine, Wind Eng. 42 (2018) 252–264, http://dx.doi.org/10.1177/0309524x17737337.
[37] R. Storn, K. Price, Differential evolution - A simple and efficient heuristic for global optimization over continuous spaces, J. Global Optim. 11 (1997) 341–359, http://dx.doi.org/10.1023/a:1008202821328.
[38] S. Das, P.N. Suganthan, Differential evolution: A survey of the state-of-the-art, IEEE Trans. Evol. Comput. 15 (2011) 4–31, http://dx.doi.org/10.1109/tevc.2010.2059031.
[39] A.K. Qin, V.L. Huang, P.N. Suganthan, Differential evolution algorithm with strategy adaptation for global numerical optimization, IEEE Trans. Evol. Comput. 13 (2009) 398–417, http://dx.doi.org/10.1109/tevc.2008.927706.
[40] E. Rashedi, H. Nezamabadi-Pour, S. Saryazdi, GSA: A gravitational search algorithm, Inform. Sci. 179 (2009) 2232–2248, http://dx.doi.org/10.1016/j.ins.2009.03.004.
[41] E. Rashedi, H. Nezamabadi-pour, S. Saryazdi, BGSA: binary gravitational search algorithm, Nat. Comput. 9 (2010) 727–745, http://dx.doi.org/10.1007/s11047-009-9175-3.
[42] M. Zhang, X.G. Liu, Z.Y. Zhang, A soft sensor for industrial melt index prediction based on evolutionary extreme learning machine, Chin. J. Chem. Eng. 24 (2016) 1013–1019, http://dx.doi.org/10.1016/j.cjche.2016.05.030.
[43] M. Seyedmahmoudian, R. Rahmani, S. Mekhilef, A.M.T. Oo, A. Stojcevski, T.K. Soon, A.S. Ghandhari, Simulation and hardware implementation of new maximum power point tracking technique for partially shaded PV system using hybrid DEPSO method, IEEE Trans. Sustain. Energy 6 (2015) 850–862, http://dx.doi.org/10.1109/tste.2015.2413359.
[44] W.J. Zhang, X.F. Xie, DEPSO: Hybrid particle swarm with differential evolution operator, in: 2003 IEEE Int. Conf. Syst. Man Cybern. Vols 1–5, Conf. Proc, 2003, pp. 3816–3821, http://dx.doi.org/10.1109/ICSMC.2003.1244483.
[45] I.T. Jolliffe, Principal Component Analysis, 2006.
[46] B. Scholkopf, A. Smola, K.R. Muller, Nonlinear component analysis as a kernel eigenvalue problem, Neural Comput. 10 (1998) 1299–1319, http://dx.doi.org/10.1162/089976698300017467.
[47] L.L. Guo, P. Wu, J.F. Gao, S.W. Lou, Sparse kernel principal component analysis via sequential approach for nonlinear process monitoring, IEEE Access 7 (2019) 47550–47563, http://dx.doi.org/10.1109/access.2019.2909986.
[48] C. Elkan, Results of the KDD’99 classifier learning, ACM SIGKDD Explor. Newsl. 1 (2000) 63–64, http://dx.doi.org/10.1145/846183.846199.
[49] S.J. Horng, M.Y. Su, Y.H. Chen, T.W. Kao, R.J. Chen, J.L. Lai, C.D. Perkasa, A novel intrusion detection system based on hierarchical clustering and support vector machines, Expert Syst. Appl. 38 (2011) 306–313, http://dx.doi.org/10.1016/j.eswa.2010.06.066.
[50] N. Moustafa, J. Slay, The evaluation of network anomaly detection systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set, Int. J. Inf. Secur. 25 (2016) 18–31, http://dx.doi.org/10.1080/19393555.2015.1125974.
[51] C. Khammassi, S. Krichen, A GA-LR wrapper approach for feature selection in network intrusion detection, Comput. Secur. 70 (2017) 255–277, http://dx.doi.org/10.1016/j.cose.2017.06.005.
[52] A. Bathelt, N.L. Ricker, M. Jelali, Revision of the Tennessee Eastman process model, in: 2015 IFAC Symposium on Advanced Control of Chemical Processes ADCHEM, 48, 2015, pp. 309–314, http://dx.doi.org/10.1016/j.ifacol.2015.08.199.