Efficient Epileptic Seizure Prediction Based On Deep Learning
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBCAS.2019.2929053, IEEE Transactions on Biomedical Circuits and Systems
Abstract— Epilepsy is one of the world's most common neurological diseases. Early prediction of incoming seizures has a great influence on epileptic patients' lives. In this paper, a novel patient-specific seizure prediction technique based on deep learning and applied to long-term scalp EEG recordings is proposed. The goal is to accurately detect the preictal brain state and differentiate it from the prevailing interictal state as early as possible, and to make the method suitable for real-time use. The feature extraction and classification processes are combined into a single automated system. The raw EEG signal, without any preprocessing, is used as the input to the system, which further reduces the computations. Four deep learning models are proposed to extract the most discriminative features, which enhance the classification accuracy and prediction time. The proposed approach takes advantage of the convolutional neural network in extracting the significant spatial features from different scalp positions and of the recurrent neural network in anticipating the incidence of seizures earlier than the current methods. A semi-supervised approach based on the transfer learning technique is introduced to improve the optimization problem. A channel selection algorithm is proposed to select the most relevant EEG channels, which makes the proposed system a good candidate for real-time usage. An effective test method is utilized to ensure robustness. The achieved highest accuracy of 99.6% and lowest false alarm rate of 0.004 h⁻¹, along with a very early seizure prediction time of one hour, make the proposed method the most efficient among the state of the art.

Index Terms— classification, deep learning, epilepsy, EEG, interictal, preictal, seizure prediction

I. INTRODUCTION

EPILEPSY is defined, according to the International League Against Epilepsy (ILAE) report [1], as a neurological brain disorder identified by the frequent occurrence of symptoms called epileptic seizures due to abnormal brain activities. Seizure characteristics include loss of awareness or consciousness and disturbances of movement, sensation or other cognitive functions. The overall incidence of epilepsy is 23-100 per 100,000. People at extremes of age are the most affected age group, while the disease crests among young individuals between 10 and 20 years old [2].

Epilepsy has a high disease burden: 50 million people worldwide have epilepsy and about two million new patients are recorded every year. Up to 70% of epileptic patients could be controlled by Anti-Epileptic Drugs (AED), while the other 30% are uncontrollable [2].

The electroencephalogram (EEG) is the electrical recording of the brain activities and is considered the most powerful diagnostic and analytical tool for epilepsy. Physicians classify the brain activity of epileptic patients according to the EEG recordings into four states: the preictal state, which is defined by the time period just before the seizure; the ictal state, which is during the seizure occurrence; the postictal state, which is assigned to the period after the seizure took place; and finally the interictal state, which refers to the period between seizures other than the previously mentioned states [3]. These four states are illustrated in Fig. 1.

Due to unexpected seizure times, epilepsy has a strong psychological and social effect and can also be considered a life-threatening disease. Consequently, the prediction of epileptic seizures would greatly contribute to improving the quality of life of epileptic patients in many aspects, like raising an alarm before the occurrence of the seizure to provide enough time for taking proper action, developing new treatment methods and setting new strategies to better understand the nature of the disease. According to the above categorization of the epileptic patient's brain activities, the seizure prediction problem could be viewed as a classification task between the preictal and interictal brain states. An alarm is raised when the preictal state is detected among the predominant interictal states, indicating that a potential seizure is coming, as shown in Fig. 1. The prediction time is the time before the seizure onset when the preictal state is detected.

In the literature, various methods have been proposed to address the seizure prediction problem, trying to reach high classification accuracy with early prediction. Since EEG signals differ across patients due to the variations in seizure type and location [4], most seizure prediction methods are patient-specific. In these methods, supervised learning techniques are used through two main stages, which are feature extraction and classification between preictal and interictal states. In [5], the authors categorize the feature extraction schemes in terms of localization into univariate and bivariate and in terms of linearity into linear and nonlinear. Multiple features are sometimes combined to capture the brain

Manuscript submitted March 1, 2019.
Hisham Daoud is with the Center for Advanced Computer Studies, University of Louisiana at Lafayette, Lafayette, LA 70503 USA, (e-mail: [email protected]).
Magdy Bayoumi is with the Department of Electrical and Computer Engineering, University of Louisiana at Lafayette, Lafayette, LA 70503 USA, (e-mail: [email protected]).
1932-4545 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
[Figure: Multi-channel EEG → EEG Matrix (1280x23) → 1st Conv (32@1279x22) → 1st Pooling (32@639x11) → 2nd Conv (32@638x10) → 2nd Pooling (32@319x5) → 3rd Conv (32@318x4) → 3rd Pooling (32@159x2) → 4th Conv (32@158x1)]
Fig. 7. The architecture of the proposed DCNN front-end in DCNN based models
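The feature-map sizes listed for Fig. 7 can be checked with simple shape arithmetic. The sketch below assumes 2x2 "valid" convolutions with stride 1 and 2x2 non-overlapping pooling. These kernel and pooling sizes are an assumption, chosen only because they reproduce the sizes in the figure; they are not stated in this section, and the helper names are illustrative.

```python
def conv_out(shape, kernel=(2, 2)):
    """Output size of a stride-1 'valid' convolution (assumed 2x2 kernel)."""
    return (shape[0] - kernel[0] + 1, shape[1] - kernel[1] + 1)


def pool_out(shape, pool=(2, 2)):
    """Output size of non-overlapping pooling with floor division (assumed 2x2)."""
    return (shape[0] // pool[0], shape[1] // pool[1])


def dcnn_shapes(input_shape=(1280, 23)):
    """Trace feature-map sizes through 4 conv layers with 3 pooling layers in between."""
    shapes = [("input", input_shape)]
    shape = input_shape
    for block in range(1, 5):
        shape = conv_out(shape)
        shapes.append((f"conv{block}", shape))
        if block < 4:  # the fourth convolution is not followed by pooling in Fig. 7
            shape = pool_out(shape)
            shapes.append((f"pool{block}", shape))
    return shapes


if __name__ == "__main__":
    for name, (rows, cols) in dcnn_shapes():
        print(f"{name}: {rows}x{cols}")
```

Under these assumptions the trace ends at 158x1, matching the last feature map in Fig. 7.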
gates that could store or forget the previous state and use or discard the current state. Any LSTM cell computes two states at each time step: a cell state (c) that could be maintained for long time steps and a hidden state (h) that is the new output of the cell at each time step. The mathematical expressions governing the cell gates' operation are defined as follows:

$$f_t = \sigma(W_{fh} h_{t-1} + W_{fx} x_t + b_f) \qquad (5)$$
$$i_t = \sigma(W_{ih} h_{t-1} + W_{ix} x_t + b_i) \qquad (6)$$
$$o_t = \sigma(W_{oh} h_{t-1} + W_{ox} x_t + b_o) \qquad (7)$$
$$\tilde{c}_t = \tanh(W_{ch} h_{t-1} + W_{cx} x_t + b_c) \qquad (8)$$
$$c_t = f_t \circ c_{t-1} + i_t \circ \tilde{c}_t \qquad (9)$$
$$h_t = o_t \circ \tanh(c_t) \qquad (10)$$

where $x_t$ is the input at time $t$, and $c_t$ and $h_t$ are the cell state and the hidden state at time $t$, respectively. $W$ and $b$ denote the weight and bias parameters, respectively. $\sigma$ is the sigmoid function and $\circ$ is the Hadamard product operator. $\tilde{c}_t$ is a candidate for updating $c_t$ through the input gate.

The input gate $i_t$ decides whether to update the cell with a new cell state $\tilde{c}_t$, while the forget gate $f_t$ decides what to keep or forget from the previous cell state, and finally the output gate $o_t$ decides how much information is passed to the next cell.

Fig. 8. Basic LSTM cell

Instead of using LSTM as the classifier, we used a Bidirectional-LSTM (Bi-LSTM) network [28] in which each LSTM block is replaced by two blocks that process the temporal sequence simultaneously in two opposite directions, as depicted in Fig. 9. In the forward pass block, the feature vector generated from the DCNN is processed starting from its first time instance to the end, while the backward pass block processes the same segment in the reverse order. The network output at each time step is the combined outputs of the two blocks at this time step. In addition to the previous context processing in standard LSTM, Bi-LSTM processes the future context, which enhances the prediction results. Using Bi-LSTM as a classifier enhances the prediction accuracy by extracting the important temporal features in addition to the spatial features extracted by the DCNN.

Fig. 9. The unrolled Bidirectional LSTM Network

Bi-LSTM is used in two proposed models, in Figs. 4 and 5(b), as the back-end classifier that works on the feature vector generated by the DCNN. The proposed network consists of a single bidirectional layer that predicts the class label at the last time instance after processing all the EEG segments, as shown in Fig. 9. We chose the number of units (the dimensionality of the output space) to be 20. The dropout regularization technique is utilized to avoid overfitting. Dropout is applied to the input and the recurrent state with factors of 10% and 50%, respectively. The sigmoid activation function is used for prediction of the EEG segment's class and RMSprop is selected for optimization.

E. Deep Convolutional Autoencoder

Autoencoders (AEs) are unsupervised neural networks whose target is to find a lower dimensional representation of the input data. This technique has many applications like data compression [29], dimensionality reduction [30], visualizing high dimensional data [31] and removing noise from the input data. The AE network has two main parts, namely the encoder and the decoder. The encoder compresses the high dimensional input data into a lower dimensional representation called the latent space representation or bottleneck, and the decoder retrieves the data back to its original dimension. The simple AE uses fully connected layers for the encoder and decoder. The aim is to learn the parameters that minimize the cost function which expresses the difference between the original data and the
[Figure: Input EEG Signal → Encoder (alternating convolution and pooling layers: 1st C, 1st P, 2nd C, 2nd P, 3rd C, 3rd P, 4th C) → Decoder (deconvolution and upsampling layers, labeled 1st to 3rd D and U) → Reconstructed EEG Signal. Feature-map sizes labeled in the figure include 1278x22, 639x11, 637x10, 318x5, 316x4, 158x2 and 156x1.]
Fig. 10. The architecture of the proposed DCAE. C stands for convolution, P for pooling, D for deconvolution and U for upsampling layer
retrieved one. Deep Convolutional Autoencoder (DCAE) replaces the fully connected layers in the simple AE with convolution layers.

Due to the limited EEG dataset for each patient, we decided to extend our work to develop an unsupervised training algorithm using DCAE, as shown in Fig. 5(a). The proposed architecture of the DCAE model is depicted in Fig. 10. We used the same proposed DCNN model as an encoder and added the decoder network to build the DCAE. Unsupervised learning is deployed using the transfer learning technique by training the DCAE on all the selected patients' data (not patient-specific). Transfer learning helps to obtain better generalization and to enhance the optimization of our prediction model, and therefore reduces the training time.

In the DCAE, Fig. 10, the encoder part consists of alternating convolution and pooling layers, while in the decoder part, deconvolution and upsampling layers are used to reconstruct the original EEG segment. The encoder output is the latent space representation, a set of low dimensional features that best represent the EEG input segment. The decoder output is the reconstructed version of the original input. The learned encoder parameters are saved to be used later for training the prediction model in Fig. 5(b), allowing the training process to have a good starting point instead of a random initialization of the parameters, which reduces the training time drastically.

Training of the DCAE is done using unlabeled EEG segments (balanced data of preictal and interictal segments) of all the selected patients. The ReLU activation function is used across all the convolutional layers. Batch Normalization is used to improve the training speed and to reduce overfitting. The DCAE is optimized using the RMSprop optimizer. The mean square error is utilized as the cost function and is defined as

$$J(\Theta) = \frac{1}{2m} \sum_{i=1}^{m} \left(\acute{x}^{(i)} - x^{(i)}\right)^2 \qquad (11)$$

where $x^{(i)}$ is the input EEG signal and $\acute{x}^{(i)}$ is the reconstructed EEG signal. $m$ is the number of training examples and $\Theta$ denotes the parameters being learned.

After DCAE training, the pre-trained encoder is used as the front-end of the fourth proposed model, DCAE + Bi-LSTM, as shown in Fig. 5(b), while the back-end is the Bi-LSTM network. We used the same network architecture of the DCNN and Bi-LSTM that is used in the third model (DCNN + Bi-LSTM). Training of this model is done in a supervised manner to predict the patient-specific seizure onset. Since we used both unsupervised and supervised learning algorithms, this model is considered a semi-supervised learning model.

F. EEG Channel Selection

We introduce an EEG channel selection algorithm to select the most important and informative EEG channels related to our problem. Decreasing the number of channels helps with reducing the features' dimension, the computation load and the required memory, so that the model is suitable for real-time application. The proposed channel selection algorithm is explained in Table II. We provide the algorithm with the EEG preictal segments for each patient and the measured prediction accuracy obtained by running our fourth model, DCAE + Bi-LSTM, using all channels. The algorithm outputs the reduced set of channels that gives the same accuracy by omitting redundant or irrelevant channels. We start by computing the statistical variance defined by (12) and the entropy defined by (13) for all the available channels (23 channels) of the preictal segments. Then, we select the channels with the highest variance-entropy product that provide the same given prediction accuracy. This is done through an iterative process by training our model on the reduced channels over each iteration. The variance is estimated as

$$\sigma^2(X_c) = \frac{1}{N} \sum_{i=1}^{N} \left(x_c(i) - \mu_c\right)^2 \qquad (12)$$

where $X_c$, $\mu_c$ and $N$ are the EEG data after normalization, the mean and the number of samples of channel $c$, respectively. The entropy of channel $c$ is calculated as

$$H(X_c) = -\sum_{i=1}^{N} p(x_c(i)) \log_2 p(x_c(i)) \qquad (13)$$

where $p(x_c(i))$ is the probability mass function of the channel $c$ having $N$ samples.

In the channel selection algorithm, we chose the channels with the highest variance-entropy product because we want to maximize both. We want to select the channels that have a high variance during the preictal interval and also provide the largest amount of information.
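The ranking and iterative selection described in Section F can be sketched as follows. This is a minimal illustration rather than the authors' implementation. It assumes the probability mass function in (13) is estimated with a fixed-width histogram (the number of bins is a hypothetical parameter) and it takes a user-supplied `evaluate` callback that stands in for training and testing the model on a channel subset. All function names are illustrative. The upper bound on the number of channels in the loop is an added safeguard.

```python
import math
from collections import Counter


def variance(samples):
    """Population variance of one channel, as in Eq. (12)."""
    n = len(samples)
    mu = sum(samples) / n
    return sum((x - mu) ** 2 for x in samples) / n


def entropy(samples, bins=16):
    """Shannon entropy in bits, using a histogram estimate of the PMF (assumption)."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return 0.0  # a constant channel carries no information
    width = (hi - lo) / bins
    counts = Counter(min(int((x - lo) / width), bins - 1) for x in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def rank_channels(channels, bins=16):
    """Sort channel names by variance-entropy product, highest first."""
    scores = {name: variance(x) * entropy(x, bins) for name, x in channels.items()}
    return sorted(scores, key=scores.get, reverse=True)


def select_channels(ranked, baseline_acc, evaluate, m_init=8):
    """Grow the subset from m_init channels until evaluate() reaches baseline_acc."""
    m = m_init
    while m <= len(ranked):  # safeguard: stop once every channel is included
        subset = ranked[:m]
        if evaluate(subset) >= baseline_acc:
            return subset
        m += 1
    return list(ranked)
```

A call such as `select_channels(rank_channels(data), acc_all, evaluate)` would then return the smallest prefix of the ranked list that matches the all-channel accuracy, where `data`, `acc_all` and `evaluate` are supplied by the caller.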
TABLE II
THE PROPOSED EEG CHANNEL SELECTION ALGORITHM

Algorithm 1: EEG channel selection algorithm
Input: Eight patients' EEG preictal segments, the seizure prediction accuracy for each patient using all channels Acc[1 : 8]
Output: Chred[1 : 8] (array of reduced channels for each patient that give the same accuracy Acc)
Initialization: m = 8 (initial number of channels), done = 0
for patient ← 1 to 8 do
    m = 8, done = 0;
    for ch ← 1 to 23 do
        compute variance[ch];
        compute entropy[ch];
        compute variance[ch]*entropy[ch];
    end
    sort the 23 channels with highest variance entropy product first in Temp;
    while done ≠ 1 do
        select first m channels from Temp;
        train and test the model with m channels;
        compute the prediction accuracy Accnew;
        if Accnew ≥ Acc[patient] then
            done ← 1;
            Chred[patient] ← Temp[1 : m];
        else
            m ← m+1;
        end
    end
end

G. Training and Testing Method

In order to overcome the problem of the imbalanced dataset, we selected the number of interictal segments to be equal to the available number of preictal segments during the training process. The interictal segments were selected at random from the overall interictal samples. To ensure robustness and generality of the proposed models, we used the Leave-one-out cross validation (LOOCV) technique as the evaluation method for all of our proposed models. In LOOCV, the training is done N separate times, where N is the number of seizures for a specific patient. Each time, all seizures are involved in the training process except one seizure on which the testing is applied. The process is then repeated by changing the seizure under test. By using this method, we ensure that the testing covers all the seizures and the tested seizures are unseen during the training. The performance for one patient is the [...] patients. The prediction time of each model is recorded at the time of first preictal segment detection. The evaluation measures are defined as follows:

$$\text{Sensitivity} = \frac{TP}{TP + FN} \qquad (14)$$
$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (15)$$
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (16)$$

where TN, TP, FN and FP are the true negatives, true positives, false negatives and false positives, respectively.

III. RESULTS

A. Performance Evaluation and Analysis

We evaluated our proposed patient-specific models on the selected patients by calculating performance measures such as prediction accuracy, prediction time, sensitivity, specificity and false alarms per hour. The training time is also computed to evaluate our proposed channel selection algorithm. Table III shows the obtained values of these measures for the four proposed models, which are MLP, DCNN + MLP, DCNN + Bi-LSTM and DCAE + Bi-LSTM. The fifth model, DCAE + Bi-LSTM + CS, is the same as the fourth one but uses the channel selection algorithm.

As can be seen from Table III, MLP has the worst accuracy, sensitivity, specificity and false alarm rate among the proposed models. This is because the learning process in this model aims at updating the network parameters for the output to be close to the ground truth without extracting any features from the input data. The huge number of parameters in this model (around 9 million) is another drawback. The training time is moderate (7.3 min) due to network simplicity.

[Figure: Accuracy (%) versus Patient ID (1, 3, 7, 9, 10, 20, 21, 22) for MLP, DCNN+MLP and DCNN+BiLSTM]
Fig. 11. The measured accuracy among three different proposed algorithms.

[Figure: Sensitivity (%) versus Patient ID for the same three models]
Fig. 12. The measured sensitivity among three different proposed algorithms.
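The evaluation measures in (14)-(16) and the leave-one-out procedure of Section G can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the confusion-matrix counts in the usage lines are made-up numbers for demonstration, and the fold generator only enumerates which seizure is held out, without performing any training.

```python
def sensitivity(tp, fn):
    """Eq. (14): fraction of preictal segments that are detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Eq. (15): fraction of interictal segments correctly rejected."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Eq. (16): fraction of all segments classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def leave_one_seizure_out(seizure_ids):
    """Yield (training seizures, held-out seizure) pairs, one fold per seizure."""
    for held_out in seizure_ids:
        train = [s for s in seizure_ids if s != held_out]
        yield train, held_out


if __name__ == "__main__":
    # Hypothetical counts, not results from the paper.
    tp, tn, fp, fn = 90, 80, 20, 10
    print(sensitivity(tp, fn), specificity(tn, fp), accuracy(tp, tn, fp, fn))
    for train, test in leave_one_seizure_out([1, 2, 3]):
        print("train on", train, "test on", test)
```

With N seizures the generator produces N folds, so every seizure is tested exactly once and is never seen during the training of its own fold.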
[Figure: Specificity (%) versus Patient ID (1, 3, 7, 9, 10, 20, 21, 22) for MLP, DCNN+MLP and DCNN+BiLSTM]
Fig. 13. The measured specificity among three proposed algorithms.

[Figure: False Alarm (per hour) versus Patient ID for MLP, DCNN+MLP and DCNN+BiLSTM]
Fig. 14. The measured false alarm rate among three proposed algorithms.

[Figure: Training Time (min) versus Patient ID for MLP, DCNN+MLP, DCNN+BiLSTM, DCAE+BiLSTM and DCAE+BiLSTM+CS]
Fig. 15. The measured training time on the test set among five proposed algorithms: MLP, DCNN + MLP, DCNN + Bi-LSTM, DCAE + Bi-LSTM and DCAE + Bi-LSTM + CS

By introducing the DCNN as a front-end, we found around 10% enhancement in the accuracy, sensitivity and specificity, and the false alarm rate is improved by 60%. This improvement is due to the ability of the DCNN to extract the spatial features across different scalp positions and use them to discriminate between preictal and interictal brain states. On the other hand, the training time is increased by 5 min due to the added computational complexity of the DCNN. The number of network parameters is drastically decreased because of the parameter sharing and sparse connectivity properties of the DCNN. In our third model, we used Bi-LSTM as the back-end along with the DCNN, and this model increases the accuracy to 99.6%, the sensitivity to 99.72% and the specificity to 99.6%. The false alarm rate is greatly improved, reaching 0.004 false alarms per hour. This improvement is due to using Bi-LSTM as a classifier instead of MLP. Bi-LSTM extracts temporal features from the input sequence, which helps to predict seizures more accurately at the cost of training time, which reached 14.2 min. The number of parameters is decreased by 94% by getting rid of the MLP. In the fourth model, DCAE is used to train the front-end part of our model. This improves the network optimization by starting the training with an initial set of parameters that makes the convergence process faster. As a result, the training time decreased to 4.25 min on average with the same highest performance. Utilizing the transfer learning technique reduces overfitting and generalizes better.

The proposed channel selection algorithm reduces the number of channels to 10 on average among all the selected patients, instead of using all 23 channels. Therefore, the computational complexity is reduced, bringing the training time down to 2.2 min on average with the lowest number of parameters, around 18K, which makes this model suitable for real-time applications. All the obtained results are shown graphically for the different models across the selected patients in Figs. 11-15.

Regarding the prediction time, all the proposed models were able to accurately predict the tested seizures from the start of the preictal segments. Thus the prediction time is one hour before the seizure onset, or less in case of a shorter preictal segment.

B. Statistical Analysis

We performed the Kruskal-Wallis test [32] as a nonparametric test to compare the accuracy, sensitivity, specificity and false alarm rate of the three basic models, which are MLP, DCNN + MLP, and DCNN + Bi-LSTM. The Kruskal-Wallis test yielded p-value < 0.05 for all the performance measures, indicating a statistically significant difference between the results of these models: p-value = 0.01 for the accuracy, 0.006 for the sensitivity, 0.04 for the specificity, and 0.04 for the false alarm rate.

C. Comparison with Other Methods

For further evaluation of our proposed method, we compared our achieved experimental results with previous work that used the same dataset, as shown in Table IV. While the same criterion to select the patients from the dataset is applied in this paper and [10], the other compared work employed different criteria, which led to a different selection of patients. In the presented previous work, the extracted features include the Zero-Crossing (ZC) interval in the EEG signals as in [33], ZC of the Wavelet Transform (WT) coefficients of the EEG signals as in [10], WT of the EEG signals as in [13], spectral power as in [34], and a set of features in the time domain, the frequency domain and from graph theory as in [11]. These studies used machine learning based classifiers like SVM or Gaussian Mixture Model (GMM). The authors in [13] used CNN as a classifier. The proposed method achieved the highest accuracy, sensitivity and specificity among the compared methods. Our prediction time is the earliest and the false alarm rate is the lowest.

IV. CONCLUSION

In this paper, a novel deep learning based patient-specific epileptic seizure prediction method using long-term scalp EEG data has been proposed. This method achieves a prediction
TABLE III
PERFORMANCE EVALUATION OF THE PROPOSED MODELS

| Proposed Model | Sensitivity | Specificity | Accuracy | False Alarm (h⁻¹) | Training Time (min) | No. of Parameters |
|---|---|---|---|---|---|---|
| MLP | 84.67% | 82.60% | 83.63% | 0.174 | 7.3 | 8,870,291 |
| DCNN + MLP | 95.41% | 92.80% | 94.10% | 0.072 | 12.5 | 520,477 |
| DCNN + Bi-LSTM | 99.72% | 99.60% | 99.66% | 0.004 | 14.2 | 27,657 |
| DCAE + Bi-LSTM | 99.72% | 99.60% | 99.66% | 0.004 | 4.25 | 27,657 |
| DCAE + Bi-LSTM + CS | 99.72% | 99.60% | 99.66% | 0.004 | 2.2 | 18,345 |
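The relative changes quoted in the discussion of Table III can be recomputed directly from the table entries. The short sketch below does this arithmetic. The dictionary layout and the helper name `pct_reduction` are illustrative choices, and only values that appear in Table III are used.

```python
def pct_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Values copied from Table III.
params = {
    "MLP": 8_870_291,
    "DCNN + MLP": 520_477,
    "DCNN + Bi-LSTM": 27_657,
    "DCAE + Bi-LSTM + CS": 18_345,
}
false_alarm = {"MLP": 0.174, "DCNN + MLP": 0.072}
train_min = {"DCNN + Bi-LSTM": 14.2, "DCAE + Bi-LSTM": 4.25}

if __name__ == "__main__":
    # Parameter reduction when the MLP back-end is replaced by Bi-LSTM.
    print(round(pct_reduction(params["DCNN + MLP"], params["DCNN + Bi-LSTM"]), 1))
    # False alarm improvement when the DCNN front-end is added to the MLP.
    print(round(pct_reduction(false_alarm["MLP"], false_alarm["DCNN + MLP"]), 1))
    # Training time reduction from pre-training the front-end with the DCAE.
    print(round(pct_reduction(train_min["DCNN + Bi-LSTM"], train_min["DCAE + Bi-LSTM"]), 1))
```

The first two figures come out close to the roughly 94% and 60% quoted in the text, which suggests the prose rounds the table values.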
TABLE IV
COMPARISON WITH OTHER SEIZURE PREDICTION METHODS APPLIED TO CHB-MIT DATASET

| Ref. | Method (Feature Extraction / Classification) | Data Selection | Sensitivity | Specificity | Accuracy | False Alarm (h⁻¹) | Prediction Time |
|---|---|---|---|---|---|---|---|
| [33] | ZC Interval / GMM | LPOCV | 83.81% | N/A | N/A | 0.165 | 19.8 min |
| [10] | ZC in WT / SVM | LPOCV | 96% | 90% | 94% | N/A | N/A |
| [11] | Time, Freq., Graph / SVM | 10-fold CV | 85.75% | 85.75% | 85.75% | N/A | N/A |
| [13] | WT / CNN | 10-fold CV | 87.8% | N/A | N/A | 0.147 | 5.8 min |
| [34] | Spectral Power / SVM | LOOCV | 98.68% | N/A | N/A | 0.046 | 42.7 min |
| Ours | DCAE + Bi-LSTM | LOOCV | 99.72% | 99.60% | 99.66% | 0.004 | 1 hr |
accuracy of 99.6%, a sensitivity of 99.72%, a specificity of 99.60%, a false alarm rate of 0.004 per hour and a prediction time of one hour prior to the seizure onset. Important spatial and temporal features are learned from the raw data by the DCNN and Bi-LSTM networks, respectively. A DCAE based semi-supervised learning approach is investigated with the transfer learning technique, which led to reducing the training time. For the system to be suitable for real-time application, a channel selection algorithm is proposed which reduces the computational load and the training time. Using the Leave-One-Out exhaustive cross-validation technique to test the proposed models demonstrates the robustness and generality of our method against variation across various seizure types.

Our experimental results and the comparison with previous work demonstrate that the proposed method is efficient, reliable and suitable for real-time application of seizure prediction. It achieves accuracy higher than the state of the art with an earlier prediction time, which helps to mitigate potentially life-threatening incidents for epileptic patients.

REFERENCES

[1] R. S. Fisher et al., “ILAE Official Report: A practical clinical definition of epilepsy,” Epilepsia, vol. 55, no. 4, pp. 475–482, Apr. 2014.
[2] World Health Organization, Neurological Disorders: Public Health Challenges. World Health Organization, 2006.
[3] Cheng-Yi Chiang, Nai-Fu Chang, Tung-Chien Chen, Hong-Hui Chen, and Liang-Gee Chen, “Seizure prediction based on classification of EEG synchronization patterns with on-line retraining and post-processing scheme,” 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Boston, MA, pp. 7564–7569, 2011.
[4] “Epilepsy prevalence, incidence and other statistics,” Joint Epilepsy Council, Leeds, UK, 2005.
[5] E. Bou Assi, D. K. Nguyen, S. Rihana, and M. Sawan, “Towards accurate prediction of epileptic seizures: A review,” Biomed. Signal Process. and Control, vol. 34, pp. 144–157, Apr. 2017.
[6] A. Aarabi, R. Fazel-Rezai, and Y. Aghakhani, “EEG seizure prediction: Measures and challenges,” Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., pp. 1864–1867, Sep. 2009.
[7] M. Bandarabadi, C. A. Teixeira, J. Rasekhi, and A. Dourado, “Epileptic seizure prediction using relative spectral power features,” Clin. Neurophysiol., vol. 126, no. 2, pp. 237–248, Feb. 2015.
[8] L. D. Iasemidis, J. C. Sackellares, H. P. Zaveri, and W. J. Williams, “Phase space topography and the Lyapunov exponent of electrocorticograms in partial seizures,” Brain Topogr., vol. 2, no. 3, pp. 187–201, 1990.
[9] M. Le Van Quyen, J. Martinerie, M. Baulac, and F. Varela, “Anticipating epileptic seizures in real time by a non-linear analysis of similarity between EEG recordings,” Neuroreport, vol. 10, no. 10, pp. 2149–2155, Jul. 1999.
[10] S. Elgohary, S. Eldawlatly, and M. I. Khalil, “Epileptic seizure prediction using zero-crossings analysis of EEG wavelet detail coefficients,” IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), pp. 1–6, 2016.
[11] K. M. Tsiouris, V. C. Pezoulas, D. D. Koutsouris, M. Zervakis, and D. I. Fotiadis, “Discrimination of Preictal and Interictal Brain States from Long-Term EEG Data,” IEEE 30th International Symposium on Computer-Based Medical Systems (CBMS), pp. 318–323, 2017.
[12] C. Alexandre Teixeira et al., “Epileptic seizure predictors based on computational intelligence techniques: A comparative study with 278 patients,” Comput. Methods Programs Biomed., vol. 114, no. 3, pp. 324–336, May 2014.
[13] H. Khan, L. Marcuse, M. Fields, K. Swann, and B. Yener, “Focal onset seizure prediction using convolutional networks,” IEEE Trans. Biomed. Eng., vol. 65, no. 9, pp. 2109–2118, Sep. 2018.
[14] H. G. Daoud, A. M. Abdelhameed, and M. Bayoumi, “Automatic epileptic seizure detection based on empirical mode decomposition and deep neural network,” IEEE 14th International Colloquium on Signal Processing Its Applications (CSPA), pp. 182–186, 2018.
[15] “American Epilepsy Society Seizure Prediction Challenge,” Accessed on: Jun. 17, 2018, [Online] Available: https://www.kaggle.com/c/seizure-prediction.
[16] N. V. Chawla, N. Japkowicz, and A. Kotcz, “Editorial: special issue on learning from imbalanced data sets,” SIGKDD Explor., vol. 6, pp. 1–6, Jun. 2004.
[17] H. Daoud and M. Bayoumi, “Deep Learning based Reliable Early Epileptic Seizure Predictor,” IEEE Biomedical Circuits and Systems Conference (BioCAS), Cleveland, OH, pp. 1–4, 2018.
[18] A. H. Shoeb, “Application of machine learning to epileptic seizure onset detection and treatment,” Thesis, Massachusetts Institute of Technology, 2009.
[19] A. L. Goldberger et al., “PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals,” Circulation, vol. 101, no. 23, pp. e215–e220, Jun. 2000.
[20] “CHB-MIT Scalp EEG Database,” Accessed on: May 2, 2019, [Online] Available: https://physionet.org/pn6/chbmit/.
[21] N. Siddique and H. Adeli, Computational Intelligence: Synergies of Fuzzy Logic, Neural Networks and Evolutionary Computing. John Wiley & Sons, 2013.
[22] R. H. R. Hahnloser, R. Sarpeshkar, M. A. Mahowald, R. J. Douglas, and H. S. Seung, “Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit,” Nature, vol. 405, no. 6789, pp. 947–951, Jun. 2000.
[23] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” Advances in Neural Information Processing Systems, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates, Inc., pp. 1097–1105, 2012.
[24] Y. Bengio and Y. LeCun, Convolutional Networks for Images, Speech, and Time-Series, 1995.
[25] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, Massachusetts: The MIT Press, 2016.
[26] S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” ArXiv:1502.03167, Feb. 2015.
[27] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997.
[28] A. Graves and J. Schmidhuber, “Framewise phoneme classification with bidirectional LSTM networks,” Proc. IEEE International Joint Conference on Neural Networks, vol. 4, pp. 2047–2052, 2005.
[29] P. Baldi, “Autoencoders, Unsupervised Learning, and Deep Architectures,” Proc. of ICML Workshop on Unsupervised and Transfer Learning, pp. 37–49, 2012.
[30] D. P. Kingma and M. Welling, “Auto-Encoding Variational Bayes,” ArXiv:1312.6114, Dec. 2013.
[31] L. van der Maaten and G. Hinton, “Visualizing Data using t-SNE,” J. Mach. Learn. Res., vol. 9, pp. 2579–2605, 2008.
[32] W. H. Kruskal and W. A. Wallis, “Use of Ranks in One-Criterion Variance Analysis,” J. Am. Stat. Assoc., Dec. 1952.
[33] A. Shahidi Zandi, R. Tafreshi, M. Javidan, and G. A. Dumont, “Predicting Epileptic Seizures in Scalp EEG Based on a Variational Bayesian Gaussian Mixture Model of Zero-Crossing Intervals,” IEEE Trans. Biomed. Eng., vol. 60, no. 5, pp. 1401–1413, May 2013.
[34] Z. Zhang and K. K. Parhi, “Low-Complexity Seizure Prediction From iEEG/sEEG Using Spectral Power and Ratios of Spectral Power,” IEEE Trans. Biomed. Circuits Syst., vol. 10, no. 3, pp. 693–706, Jun. 2016.

Hisham Daoud received the B.Sc. and M.Sc. degrees from Cairo University, Egypt, and the Ph.D. degree from Ain Shams University, Egypt, in 2004, 2007, and 2014, respectively, all in electronics and communications engineering.
Since 2004, he has held multiple positions in both industry and academia. He is currently with the University of Louisiana at Lafayette, LA, USA. His research interests include biomedical signal processing, machine learning, deep learning, and neuromorphic computing.
Dr. Daoud’s awards include the IEEE CASS Student Travel Award and the Best Paper Award at the 14th IEEE Colloquium on Signal Processing and its Applications (CSPA’2018). He has served as a reviewer for several IEEE conferences and journals.

Magdy A. Bayoumi received the B.Sc. and M.Sc. degrees in electrical engineering from Cairo University, Egypt, in 1973 and 1977, respectively, the M.Sc. degree in computer engineering from Washington University, St. Louis, MO, in 1981, and the Ph.D. degree in electrical engineering from the University of Windsor, Canada, in 1984.
He is the Department Head of the W. H. Hall Department of Electrical & Computer Engineering. He is the Hall Endowed Chair in Computer Engineering. He was the Director of the Center for Advanced Computer Studies (CACS) and the Department Head of the Computer Science Department. He was also the Loflin Eminent Scholar Endowed Chair in Computer Science, all at the University of Louisiana at Lafayette, where he has been a faculty member since 1985.
Dr. Bayoumi has graduated about 100 Ph.D. and 150 M.Sc. students, and authored/co-authored about 600 research papers and more than 10 books. He was the guest/co-guest editor of more than 10 special journal issues, the latest of which was on Machine to Machine Interface. Dr. Bayoumi is an IEEE Fellow. He has served in many capacities in the IEEE Computer, Signal Processing, and Circuits & Systems (CAS) societies. Currently, he is the vice president of Technical Activities of the IEEE RFID council and he is on the IEEE RFID Distinguished Lecture Program (DLP). He is a member of the IEEE IoT Activity Board. Dr. Bayoumi has received many awards, among them the IEEE CAS Education award and the IEEE CAS Distinguished Service award. He was on the IEEE DLP programs for the CAS and Computer societies. He was on the IEEE Fellow Selection Committee. He has been an ABET evaluator and he was an ABET commissioner and team chair. He has given numerous keynote/invited lectures and talks nationally and internationally. Dr. Bayoumi was the general chair of IEEE ICASSP 2017 in New Orleans. He also chaired many conferences including ISCAS 2007, ICIP 2009, and ICECS 2015. Dr. Bayoumi was the chair of an international delegation to China, sponsored by People-to-People Ambassador, 2000. He received the French Government Fellowship, University of Paris Orsay, 2003-2005 and 2009. He was a Visiting Professor at King Saud University. He was a United Nations visiting scholar. He has been an advisor to many EE/CMPS departments in several countries. Dr. Bayoumi was on the State of Louisiana Comprehensive Energy Policy Committee. He was the vice president of the Acadiana Technology Council. He was on the Chamber of Commerce Tourism and Education committees. He was a member of several delegations representing Lafayette to international cities. He was on the Le Centre International Board. He was the general chair of the SEASME (an organization of French Speaking cities) conference in Lafayette. He is a member of the Lafayette Leadership Institute and was a founding member of its executive committee. He was a column editor for the Lafayette newspaper, the Daily Advertiser.