Epileptic EEG Detection Using Neural Networks and Post-Classification


Computer Methods and Programs in Biomedicine 91 (2008) 100–109

journal homepage: www.intl.elsevierhealth.com/journals/cmpb

Epileptic EEG detection using neural networks and post-classification

L.M. Patnaik a,∗, Ohil K. Manyam b

a Computational Neurobiology Group, Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore 560012, India
b Department of Electronics and Communication Engineering, National Institute of Technology Karnataka, Surathkal, Srinivasnagar 575025, India

Article info

Article history: Received 17 July 2007; received in revised form 24 November 2007; accepted 25 February 2008.

Keywords: Electroencephalogram (EEG); Artificial neural network (ANN); Genetic algorithm; Resilient backpropagation; Discrete wavelet transform (DWT)

Abstract

Electroencephalogram (EEG) has established itself as an important means of identifying and analyzing epileptic seizure activity in humans. In most cases, identification of the epileptic EEG signal is done manually by skilled professionals, who are small in number. In this paper, we try to automate the detection process. We use wavelet transform for feature extraction and obtain statistical parameters from the decomposed wavelet co-efficients. A feed-forward backpropagating artificial neural network (ANN) is used for the classification. We use genetic algorithm for choosing the training set and also implement a post-classification stage using harmonic weights to increase the accuracy. Average specificity of 99.19%, sensitivity of 91.29% and selectivity of 91.14% are obtained.

© 2008 Elsevier Ireland Ltd. All rights reserved.

∗ Corresponding author at: Microprocessor Applications Laboratory, CEDT Building, First Floor, Room No. 239, Indian Institute of Science, Bangalore 560012, Karnataka, India. Tel.: +91 80 23600451; fax: +91 80 23600683.
E-mail address: [email protected] (L.M. Patnaik).
0169-2607/$ – see front matter © 2008 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.cmpb.2008.02.005

1. Introduction

EEG involves recording and analysis of electrical signals generated by the brain. It is an important clinical tool for diagnosing and monitoring of neurological disorders related to epilepsy. Epilepsy is characterized by sudden recurrent and transient disturbances of mental functions and/or movement of the body that results from excessive discharging of groups of brain cells. Epileptic EEG from the scalp is characterized by high-amplitude and synchronized periodic waveforms [1]. In between seizures, spikes and sharp waves are typically observed. The detection and classification of these activities by visual screening of the recorded EEG is a complex and time-consuming operation and requires highly skilled doctors, who are in great demand. This translates to longer diagnosis time, increase in medical expenditure and consequent delay in necessary treatment. In many cases, epilepsy can be controlled purely by medication. In some other cases, surgical removal of the epileptic part of the brain may be carried out. Newer methods where parts of the brain are electrically stimulated to avoid the onset of seizure are being developed. Automatic detection of seizures forms an integral part of such methods. Therefore there exists a strong need to automate this process.

Most of the work in automatic EEG processing falls into two broad categories: seizure detection and seizure prediction. In 2005, Acir et al. [1] used artificial neural networks for the automatic detection of epileptiform events in EEG signals and compared backpropagation multi-layer perceptron, radial

basis function network trained by a hybrid method and a support vector method as candidate classifier tools. For training, 7 h 18 min of data from 19 patients were used; while in testing, 3 h 48 min of data from 10 patients were involved. They also correlated the classification outputs from 19 channels. Hostetler et al. [2] compared a commercial spike detecting computer program's performance to six electroencephalographers using six 19-channel EEG recordings of 20 min duration each. One hundred and eighty minutes of 16-channel EEG from 11 patients was used to train an expert system written in a descriptive artificial intelligence language by Dingle et al. [3]. The expert system was tested with the same data used for training. Adjouadi et al. [4] employed Walsh transform to detect epileptic spikes in 21 EEG records of 20–30 min duration. Processing parameters in their algorithm were set using 10 other EEG records. Tzallas et al. [5] used artificial neural networks for detection of epileptic spikes after feature extraction. Their data-set comprised 10–15 min records of 25 patients, half of which were used for training. Breakspear and Williams [6] showed that different techniques of resampling the data in the wavelet domain have great potential for testing non-linear hypothesis in complex non-linear and biophysical systems like isolated spike in background EEG data. A fractal dimension algorithm was used in EEG analysis by Petrosian [7]. An association rule approach has been used for the classification of EEG signals [8], and the auto-SLEX method has been used for preseizure detection of epilepsy in EEG [9]. Iasemidis [10] presented an overview of the application of signal processing methodologies based on the theory of non-linear dynamics and chaos theory to the problem of seizure prediction. Other works of Iasemidis et al. [11–13] provide insight into modern techniques applied to EEG for seizure detection. Kalitzin et al. [14] have used relative phase clustering index (rPCI) to predict epileptic seizure onsets. Litt and co-workers [15] have presented a scheme for quantifying seizure precursors and coupling these measures to brain stimulation for aborting seizures. Sensitivity as high as 90.47% [16] has also been achieved by them in predicting seizures. Reeves and Taylor [17] have used genetic algorithm to choose training sets for neural networks employing radial basis function, to obtain good generalization performance. Their networks were trained to solve the XOR problem. We use a similar approach in this paper, but with a backpropagating neural network.

This paper is based on the observation that the EEG spectrum contains some characteristic waveforms that fall primarily within four frequency bands: delta (δ) (< 4 Hz), theta (θ) (4–8 Hz), alpha (α) (8–13 Hz) and beta (β) (13–30 Hz). Many methods such as the Fourier transform (FT) and short-time Fourier transform (STFT) have already been proposed and tested [18,19] for analyzing the signal but they suffer from certain shortcomings.

The FT is incapable of efficiently handling non-stationary signals (EEG in our case). It provides no time resolution [20] and consequently, it is a laborious process to represent transient spikes, which are very common in EEG signals. Furthermore, there is the problem of large noise sensitivity, which demands further computation to resolve.

STFT uses a fixed time–frequency resolution. Increasing the resolution in time decreases the resolution in frequency, and vice versa. That is, once a time window has been fixed, a constant frequency resolution is obtained for the entire time–frequency plane. Since the EEG signal is not deterministic, the problem lies in choosing a computationally efficient time window since the starting and ending time of spectral components are not known beforehand. Testing for a particular window involves translating the window along the whole time scale, which involves a lot of calculation [21]. Consequently we use wavelet transform for feature extraction.

2. Theory of techniques

2.1. Wavelet transform

The discrete wavelet transform (DWT) [22] is a versatile signal processing tool that finds many engineering and scientific applications. It has also proven useful in EEG signal analysis [23,24]. DWT is a representation of a signal $x(t)$ using an orthonormal basis consisting of a countably infinite set of wavelets. DWT employs two functions, $\phi(t)$, the scaling function and $\psi(t)$, the wavelet function, which are associated with low- and high-pass filters, respectively. Both of these functions are shifted and scaled as shown below:

$$\forall k, n \in \mathbb{Z}: \quad \phi_{k,n}(t) = 2^{-k/2}\,\phi(2^{-k}t - n) \qquad (1)$$

$$\forall k, n \in \mathbb{Z}: \quad \psi_{k,n}(t) = 2^{-k/2}\,\psi(2^{-k}t - n) \qquad (2)$$

The wavelet representation of a signal $x(t)$ in terms of the scaling and wavelet functions is given by

$$x(t) = \sum_{n=-\infty}^{\infty} c_{k_0,n}\,\phi_{k_0,n}(t) + \sum_{k=k_0}^{\infty} \sum_{n=-\infty}^{\infty} d_{k,n}\,\psi_{k,n}(t) \qquad (3)$$

where $c_{k_0,n}$ is called the approximation co-efficient and $d_{k,n}$ is called the detailed co-efficient. The frequency up to which the approximation co-efficients are used for representation of the signal is determined by $k_0$.

The decomposition of the signal into the different frequency bands, as accomplished by the process detailed above, is simply high- and low-pass filtering of the time domain signal, yielding detailed and approximation co-efficients respectively. The low-pass filter's output is further subjected to the same process of high- and low-pass filtering. This is repeated until the number of levels of decomposition desired is reached. The outputs from both the filters are down-sampled at each stage. For this reason, it is to be ensured that the sampling frequency of the signal is at least two times that of the maximum frequency to be analyzed. Selection of a suitable wavelet and of the number of levels of decomposition is very important in the analysis of signals using DWT. The wavelet can be chosen depending on how smooth the signal is and also on the basis of the amount of computation involved. The number of levels of decomposition is chosen based on the dominant frequency components of the signal. The levels are chosen such that those parts of the signal that correlate well with the frequencies required for classification of the signal are retained in the wavelet co-efficients.
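The repeated filtering and down-sampling described above can be sketched in a few lines. This is a minimal illustration only: it assumes the Haar wavelet (the text does not name the wavelet used), the function names `haar_step`, `haar_wavedec` and `nominal_bands` are hypothetical, and the band edges assume ideal half-band filters with an illustrative sampling rate of 256 Hz.

```python
import math


def haar_step(signal):
    """One analysis level: low-pass and high-pass filter, then keep every second sample.

    Assumes the Haar wavelet. A trailing odd sample is dropped.
    """
    scale = 1.0 / math.sqrt(2.0)
    starts = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in starts]
    detail = [(signal[i] - signal[i + 1]) * scale for i in starts]
    return approx, detail


def haar_wavedec(signal, levels):
    """Multi-level decomposition: returns [A_levels, D_levels, ..., D2, D1]."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        # Only the low-pass (approximation) branch is decomposed further.
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


def nominal_bands(fs, levels):
    """Nominal frequency range in Hz of each co-efficient set for sampling rate fs."""
    bands = {}
    upper = fs / 2.0  # highest frequency representable at this sampling rate
    for level in range(1, levels + 1):
        bands[f"D{level}"] = (upper / 2.0, upper)
        upper /= 2.0
    bands[f"A{levels}"] = (0.0, upper)
    return bands


if __name__ == "__main__":
    fs = 256  # illustrative sampling rate in Hz
    x = [math.sin(2 * math.pi * 10 * t / fs) for t in range(4 * fs)]
    coeffs = haar_wavedec(x, 5)
    print([len(c) for c in coeffs])
    print(nominal_bands(fs, 5))
```

Each level halves the number of samples and the upper band edge, which is why the number of levels is tied to the sampling frequency and to the frequency bands of interest.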
2.2. Artificial neural network (ANN)

The basic units of an ANN are neurons, which are mathematical functions that manipulate input data using weights and biases to produce an output. These neurons can be organized in groups which may then be cascaded, thus forming multi-layered networks. A feed-forward backpropagating neural network involves supervised learning, in which the computed outputs from each neuron move forward to other layers until finally an output is formed. The backpropagation technique then adjusts the weights and biases repeatedly so that the computed output is close to the expected output, as determined by the mean-squared error (MSE) value. The manner in which the randomly initialized weights and biases change is determined by training algorithms, such as the Levenberg–Marquardt (LM), resilient backpropagation (RP) and the quasi-Newton algorithms. These algorithms vary in their convergence speed, memory requirement and total time to train. Although the Levenberg–Marquardt algorithm converges fast [25], it is memory-intensive, and hence we choose the resilient backpropagation algorithm for training. The training process stops when either the performance goal is met or the maximum number of epochs is reached.

2.3. Genetic algorithm

The genetic algorithm [26] is a search technique used in computing to find exact or approximate solutions to optimization and search problems. Genetic algorithms are a particular class of evolutionary algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover. These are commonly implemented as computer simulations where a population of abstract representations (called chromosomes) of candidate solutions to an optimization problem evolves towards better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and gives rise to new generations. In each generation, the fitness of every individual in the population is evaluated based on the solution provided by it. Subsequently, multiple individuals are stochastically selected from the current population (based on their fitness), and modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. The algorithm terminates when either a maximum number of generations has been produced, a satisfactory fitness level has been reached for the population or when no improvement is seen in the solution to the problem after a certain number of generations. We use a binary representation of the chromosome to indicate whether or not a particular EEG recording is used in the training set. The fitness of the chromosome is then updated based on the value of the seizure detection sensitivity. The algorithm terminates if there is no improvement in the sensitivity value after 200 successive generations.

3. Design considerations

Any attempt at automating epileptic EEG detection should in general satisfy the following requirements:

• The system should be able to train/learn from the small amount of epileptic data that is usually available. Clinical EEG recordings usually have long hours of non-epileptic data with relatively short durations of epileptic activity.
• The time taken to train the system is not of much concern, because the system would have to be trained only once. Nevertheless, it should be made as short as possible.
• If used for real-time detection, the time taken by the system to process EEG data and produce an output should be small.
• The system should have good accuracy. This can be in the form of specificity, sensitivity, selectivity or other performance ratios.
• It is an added advantage if the same system is capable of detecting different types of epilepsies in multiple patients.
• Finally, the whole system should be low on resource demand and simple enough for easy and large-scale implementation.

4. System description

4.1. Data set

Fig. 1 – Ictal EEG: 147 s of ictal EEG from six channels.

Our EEG database has been obtained from the website of the Albert-Ludwigs-Universität, Freiburg, Germany [27] and contains invasive EEG recordings of 21 patients suffering from medically intractable focal epilepsy. The data was recorded during an invasive pre-surgical epilepsy monitoring at the Epilepsy Center of the University Hospital of Freiburg, Germany. In 11 patients, the epileptic focus was located in neocortical brain structures, in eight patients in the hippocampus, and in two patients in both. In order to obtain a high signal-to-noise ratio, data with artifacts were removed. To record directly from focal areas, intracranial grid, strip, and depth electrodes were utilized. The EEG data was acquired using a Neurofile NT digital video EEG system with 128 channels, 256 Hz sampling rate, and a 16 bit analogue-to-digital converter. For each of the patients, we use an "ictal" dataset containing recordings with epileptic seizures, at least 50 min of recording before seizure onset and at least 50 min of recording after the seizure stops. These recordings made before and after the seizure are termed pre-ictal data. An average of 7.73 min of epileptic data is available per patient, with a single seizure duration varying from 4 s to 28 min. Figs. 1 and 2 show the ictal and pre-ictal recordings from a patient. The "interictal"
dataset (Fig. 3) contains approximately 24 h of EEG recordings without seizure activity. At least 24 h of continuous interictal recordings were available for 13 patients. For the remaining patients, interictal invasive EEG data consisting of less than 24 h were joined together, to end up with at least 24 h per patient. Thus, one recording corresponds to 1 h of data from one channel. The six contacts of all implanted grid, strip and depth electrodes were selected by visual inspection of the raw data by a certified epileptologist. Three contacts were chosen from the seizure onset zone, i.e. from areas involved early in ictal activity. The remaining three electrode contacts were selected such that they were not involved or involved last during the seizure spread. The seizure periods were determined based on identification of typical seizure patterns preceding clinically manifest seizures in intracranial recordings by visual inspection of experienced epileptologists.

Fig. 2 – Pre-ictal EEG: 57.54 min of pre-ictal EEG from six channels.

Fig. 3 – Interictal EEG: 1 h of interictal EEG from six channels.

4.2. Processing

4.2.1. Filtering

In order to restrict the EEG signal within the desired frequency band (encompassing the δ, θ, α and β waves), to remove line noise due to electric supply, and to remove stray spikes due to noise, we undertake filtering of the signal. We use a fourth-order Butterworth bandpass filter with lower and upper cut-off frequencies of 0.35 and 30.5 Hz, respectively. After filtering once, the signal is reversed and filtered again to nullify phase shifts.

4.2.2. Normalization

We first partially normalize the EEG signal $x$ by using a scaling factor $s$ defined by

$$s = |\bar{x}| + \sigma_x \qquad (4)$$

where $\sigma_x$ is the standard deviation of the EEG signal and $|\bar{x}|$ is the mean of the absolute value. Assuming that the EEG signal amplitude has a normal distribution [28], the scaling process results in a partial normalization. Statistically, $(\sigma_x + \bar{x})$ is greater than 68% of the signal. But since we use $(\sigma_x + |\bar{x}|)$, the percentage of signal lying in [−1, +1] would be considerably higher. To make the normalization more robust, we use a non-linear hyperbolic tangent sigmoidal function defined as

$$g(x) = \frac{2}{1 + e^{-2x}} - 1 \qquad (5)$$

The advantage of this two-step process is that the first step ensures that the algorithm works across different data sets where amplitudes may be scaled by different values. Consequently, our statistical parameters are not adversely affected. The second step completely fits the signal into [−1, +1] with the added advantage that it enhances the amplitudes of those parts of the signal which may have been drastically reduced because of sharp spikes, normal or abnormal, in the EEG. Although the hyperbolic tangent sigmoidal function is essentially non-linear, we observed an average increase of around 20% in specificity, sensitivity and selectivity after adding this to our processing algorithm.

4.2.3. Windowing and wavelet decomposition

Each EEG signal is divided into 4 s windows with a 3 s overlap between consecutive windows. This means that for processing 1 s of EEG data, we also use the previous 3 s as part of the window. These durations were arrived at after extensive testing with various window and overlap lengths.

As mentioned before, EEG signals show characteristic waveforms in the δ, θ, α and β ranges. Since our data is sampled at 256 Hz, we choose a level 5 wavelet decomposition so that each frequency range is almost completely represented by an individual co-efficient. Table 1 shows the frequency bands that the detailed (D1–D5) and approximation (A5) co-efficients represent.

Table 1 – Decomposed wavelet co-efficients and corresponding frequency bands

Decomposed signal    Frequency range (Hz)
D1                   64–128
D2                   32–64
D3                   16–32
D4                   8–16
D5                   4–8
A5                   0–4

From Table 1 it can be seen that A5 corresponds to the δ range, D5 to the θ range and so on. The co-efficients represent the amplitude of the combined signals in their frequency
bands. Once the wavelet decomposition is done, we compute the following simple statistical parameters for each time window in all the decomposed co-efficients:

(1) mean: corresponds to the constant or DC signals at various frequency ranges;
(2) absolute mean: for tracking both DC signals and alternating or AC signals;
(3) root mean squared (RMS) value: to measure power of the signal in the window;
(4) standard deviation: to represent the level of fluctuation of the signal.

Since we have six decomposed co-efficients and four parameters, each time window is represented by a vector with 24 elements. Fig. 4 gives a brief overview of the above steps.

Fig. 4 – Pre-processing stages and feature extraction.

4.3. Neural network training

Normally, EEG data are classified as 'epileptic' or 'normal'. In our case, as mentioned in Section 4.1, epileptologists have classified the data into three categories: ictal, pre-ictal and interictal. We exploit this grouping to achieve our primary objective of detecting purely epileptic signals. Thus, our ANN classifies EEG data into three classes. Consequently our training set consists of three categories. Since our processing scheme uses a 4 s window, we include 3 s of EEG data before and 1 s after epileptic seizures and use this resulting signal as training set for the ictal class. We train and test the ANN classifier for each patient individually, i.e. a network trained using data from one patient is not used for testing data from another patient.

The choice of the data used for training is a very important one. Sensitivity as low as 3% was observed with an inappropriate training set. Although we do not define what constitutes an appropriate training set, we experimentally resolve this issue using a genetic algorithm. We restrict the number of training signals to six each for interictal, pre-ictal and ictal activities. This translates to 6 h of normal data, a few minutes of ictal data (depending on the duration of the signals chosen) and roughly 6 h of pre-ictal data. Six signals were chosen in each case to reduce the training time taken by the neural network and to allow the possible inclusion of all the six channels of data from a particular hour of recording. The neural network training set is chosen by the genetic algorithm, which tries to maximize the sensitivity. For every iteration of the genetic algorithm, it is allowed to choose six signals per category from a set containing 80% of the available data per patient. The remaining 20% is 'hidden' from the genetic algorithm and from the neural network. During the execution of the genetic algorithm, neural network testing uses data from the reserved set alone, i.e. the 20% 'hidden' data is not tested. This is done so that the sensitivity can be maximized by the genetic algorithm within a limited set of possible training vectors. This also prevents over-fitting of the neural network. Thus, upon termination of the genetic algorithm, the neural network training set has six signals per category which are optimal for the set containing 80% of the available data. It may be noted at this stage that although an optimal training set has been found, this does not necessarily imply numerically high values of sensitivity. The genetic algorithm only maximizes the sensitivity while the exact numerical values are due to the efficiency of our main processing algorithm (including post-classification). The above steps are depicted in Fig. 5.

The ANN used is a two-layered feed-forward backpropagating network with 10 neurons in level 1 and 3 neurons in level 2 (Fig. 6). Both levels use a hyperbolic tangent sigmoidal squashing function. The resilient backpropagation algorithm is used for training. We set the maximum number of epochs as 1000 and the performance goal (MSE) as $10^{-5}$. Initial tests with the above architecture and parameters were both faster and more accurate when compared to other architectures having lower MSE values and using different numbers of neurons in level 1. Three neurons are used in level 2 following our requirement of having three categories for classification. Each neuron fires when it encounters signals corresponding to one of the three categories.

4.4. Neural network testing

Testing is carried out on the optimally trained neural network using all the data available per patient. By testing with
the 'hidden' data, an estimate of our algorithm's performance is obtained when it is presented with totally new data. At the same time, testing with the 80% of data reserved for the genetic algorithm is meaningful because in effect, the training set for the neural network comprises only the six optimal vectors from each category. Hence testing the algorithm over all the data available does not present an inaccurate account of the algorithm's performance. Each channel from a subject is individually fed to the classifier after processing and the results obtained are fed to the post-classification stage for integration of outputs from multiple channels. The outputs from the classifier are numerical values $p_1$, $p_2$ and $p_3$ corresponding to the three categories of the signal: interictal, pre-ictal and ictal, respectively. We choose $p_1 = +1$, $p_2 = -1$ and $p_3 = -5$. These are chosen based on the post-classifier algorithm which needs $p_1$ and $p_2$ to be of opposite signs, preferably with the same magnitude, and $p_3$ to be of the same sign as $p_2$ but with higher magnitude.

Fig. 5 – Brief flowchart of the overall algorithm.

Fig. 6 – Basic structure of neural network used.

4.5. Post-classification

For 1 h of EEG recording, the classifier uses six channels of data and outputs six classification result vectors. We present below the framework of our algorithm to correlate the six output vectors. Specific numerical values used by us are indicated, but these may be varied.

(1) Define weight-vector-alpha = $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ where the constant $n$ is the desired number of previous outputs that are used for processing a particular output at position $(n + 1)$ (we call this current-output). The $\alpha_i$ form an increasing harmonic series starting at a convenient value, which depends on the number of mis-classified signals from the ANN classifier that the post-classifier algorithm is allowed to correct before accepting the ANN classifier's output as the true classification result. We set $n = 10$, $\alpha_1 = 1/20$, $\alpha_2 = 1/19$, ..., $\alpha_{10} = 1/11$.
(2) Define weight-vector-beta = $(\beta_1, \beta_2, \ldots, \beta_n, \beta_{n+1})$ where $\beta_i$ are weights following a similar pattern as $\alpha_i$, but with lesser magnitude. Note that $\beta_{n+1}$ corresponding to current-output is also included in the weight vector. We set $\beta_i = \alpha_i^2$. Thus, $\beta_1 = 1/400$, $\beta_2 = 1/361$, ..., $\beta_{11} = 1/100$.
(3) Find the sum of the ANN classifier output for each channel and call this net-output. A lower weight $\gamma = 0.5$ is used for the output from channels that are not involved or involved last in seizure activity. Thus outputs from three channels in our case are weighted using $\gamma$. The other three channels are directly added.
(4) Using net-output, extract the array $(v_1, v_2, \ldots, v_n, v_{n+1})$ where current-output is $v_{n+1}$, and find its weighted sum. Use the corresponding weight from the associated weight-vector for each $v_i$. If a weight-vector is not yet associated with a particular $v_i$, use weight-vector-alpha and the corresponding $\alpha_i$ as weight. Always use $\beta_{n+1}$ as the weight for $v_{n+1}$.
(5) Multiply the weighted sum from the above step with $v_{n+1}$.
(6a) If the sign of the result is positive, apply thresholds to current-output to obtain the final classification. Using our values of $p_1$, $p_2$ and $p_3$, a value of current-output greater than zero is labeled interictal and more than $kp_3$ but less than zero is labeled pre-ictal. We use $k = 2 + \gamma$, thereby requiring at least two focal electrode channels and one other channel to have detected epileptic activity. Values of current-output lower than $kp_3$ represent epileptic activity and are labeled ictal. Finally, we associate weight-vector-alpha with current-output.
(6b) A negative sign to the result indicates that current-output is a misclassification by the ANN. If current-output is greater than zero, it is labeled pre-ictal, otherwise it is labeled interictal. Also, weight-vector-beta is associated with current-output. Thus, although a particular ANN classifier output may be wrong, we still consider it for post-classification processing, but with a lower weight.
(7) Advance the position of current-output to the next element in net-output. While this retains the associated weight vector for each element of net-output, the corresponding weight for that element will be determined by the element's position in the array $(v_1, v_2, \ldots, v_n, v_{n+1})$ which would be extracted in the next iteration. This process is continued until the end of net-output is encountered in the extracted array $(v_1, v_2, \ldots, v_{n+1})$, indicating that all outputs corresponding to 1 h of six channel EEG recordings have been processed.
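Steps (1) to (7) can be sketched as follows. This is an illustrative reading of the description, not the authors' implementation. The names `NON_FOCAL_WEIGHT`, `net_output`, `threshold_label` and `post_classify` are hypothetical. Two points the text leaves open are filled with labeled assumptions: the first n outputs, which have no full history, are labeled by thresholds alone, and a product of exactly zero in step (5) is treated as "not a misclassification".

```python
P1, P2, P3 = 1.0, -1.0, -5.0   # classifier outputs: interictal, pre-ictal, ictal
N = 10                         # number of previous outputs considered
NON_FOCAL_WEIGHT = 0.5         # reduced weight for the three non-focal channels
K = 2 + NON_FOCAL_WEIGHT       # ictal threshold factor, so K * P3 = -12.5
ALPHA = [1.0 / d for d in range(20, 10, -1)]      # 1/20, 1/19, ..., 1/11
BETA = [1.0 / d ** 2 for d in range(20, 9, -1)]   # 1/400, ..., 1/100 (n + 1 entries)


def net_output(focal, other):
    """Step 3: add focal channels directly, non-focal channels with reduced weight."""
    length = len(focal[0])
    return [
        sum(ch[t] for ch in focal) + NON_FOCAL_WEIGHT * sum(ch[t] for ch in other)
        for t in range(length)
    ]


def threshold_label(value):
    """Thresholds of step 6(a)."""
    if value > 0:
        return "interictal"
    if value > K * P3:
        return "pre-ictal"
    return "ictal"


def post_classify(net):
    """Steps 4 to 7: label each element of net-output using its N predecessors."""
    labels = [None] * len(net)
    uses_beta = [False] * len(net)   # weight vector associated with each output
    for t, current in enumerate(net):
        if t < N:
            # Assumption: without a full history, fall back to plain thresholds.
            labels[t] = threshold_label(current)
            continue
        weighted = BETA[N] * current             # current-output always uses beta_{n+1}
        for i in range(N):
            idx = t - N + i
            weight = BETA[i] if uses_beta[idx] else ALPHA[i]
            weighted += weight * net[idx]
        if weighted * current >= 0:              # step 6(a); zero handled by assumption
            labels[t] = threshold_label(current)
        else:                                    # step 6(b): treated as misclassified
            labels[t] = "pre-ictal" if current > 0 else "interictal"
            uses_beta[t] = True
    return labels


if __name__ == "__main__":
    # A single-sample glitch inside otherwise interictal data is corrected.
    channel = [P1] * 15 + [P2] + [P1] * 10
    net = net_output([channel] * 3, [channel] * 3)
    print(post_classify(net))
```

In this sketch an isolated disagreement with the recent history is overruled, while a sustained change eventually outweighs the history and is accepted, which mirrors the role the text assigns to the starting value of the harmonic weights.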
Fig. 7 – Flowchart of post-classification stage.

Fig. 7 depicts the post-classification algorithm detailed above. This algorithm uses current-output and the $n$ outputs before it to decide if current-output is a mis-classification by the ANN classifier. A harmonic series is used for the weights $\alpha_i$ so that the $n$ previous outputs are weighted in such a way that older outputs are given lesser priority; but at the same time, small groups of newer outputs that are possibly wrong do not adversely affect the post-classifier's output. It may be noted that weights which are part of a geometric series do not have the second property mentioned.

Fig. 8 – Flowchart of modified post-classification stage.

4.5.1. Further increasing sensitivity

The above algorithm uses a simple threshold $kp_3$ to distinguish between ictal and pre-ictal signals in step 6(a). Due to this, in some cases, parts of an epileptic EEG signal may be wrongly labeled as pre-ictal. To overcome this, we modify our post-classification algorithm using the following:

• Define future-weight = $(\alpha_n, \alpha_{n-1}, \ldots, \alpha_{n-m+1})$, where $m$ ($\le n$) is the number of classifier outputs occurring after current-output in the net-output vector that are used for processing. The $\alpha_i$ are as defined in weight-vector-alpha. We use $m = 10$ ($= n$).

• Compute $\lambda$, the weighted sum of $m$ outputs $(v_{n+2}, v_{n+3}, \ldots, v_{m+n+1})$ using the corresponding weights from the future-weight vector (i.e. compute $v_{n+2}\alpha_n + v_{n+3}\alpha_{n-1} + \cdots + v_{m+n+1}\alpha_{n-m+1}$). Along with the $kp_3$ threshold of step 6(a), $\lambda$ is compared to a threshold $\tau$ ($p_3\{1/11 + 1/12 + \cdots + 1/15\}$ in our case) and only if it is found to be higher, the output is labeled as ictal. Failing this, even if current-output is greater than $kp_3$, it is labeled pre-ictal. The value of $\tau$ needs to be set considering the minimum duration of epileptic seizures, which is 6 s in our case.

Fig. 8 depicts the final post-classification algorithm after modifications. The post-classification algorithm without the above modification increased the sensitivity of detection by 6% while retaining the algorithm's capability of processing real-time signals. The above modification is seen to increase the sensitivity by a further 2%. It is to be noted though, that the modification renders the algorithm incapable of processing real-time signals. Thus, the algorithm would then be restricted to detecting epileptic activity in pre-recorded EEG data.

We use the algorithm detailed above along with the modification in the post-classification algorithm to train and test our system. Since our main objective is to detect epileptic seizures, we do not distinguish between interictal and pre-ictal labels for performance measurement, i.e. we group both these classes into a 'non-epileptic' category. Nonetheless, classification into three categories greatly helps our post-classification algorithm, and is seen to give a better overall performance.

4.6. Implementation of algorithms

The algorithm was run on a desktop computer with a 3.0 GHz Intel microprocessor, 1 GB of 533 MHz DDR2 memory and an 80 GB SATA hard disk. This computer was used for the neural network training. In all, we use roughly 24 h of EEG data per channel per patient for testing. Processing these signals to generate the 24-element representative vector from the windowed wavelet co-efficients involves a lot of computation owing mainly to the long duration of recordings. But since the computation can be carried out in parallel upon multi-

Table 2 – Results of testing the algorithm using data from 21 patients

Patient no.    Specificity (%)    Sensitivity (%)    Selectivity (%)
1              98.91              89.67              90.12
2              99.67              91.65              93.10
3              99.46              92.34              93.86
4              98.97              92.31              91.54
5              99.12              93.16              91.84
6              98.52              89.68              90.33
7              99.06              91.28              89.56
8              99.18              93.80              88.67
9              98.87              88.38              91.17
10             99.82              90.91              92.95
11             98.61              91.60              93.12
12             99.13              90.74              91.38
13             99.63              92.82              90.70
14             99.45              87.73              89.63
15             99.28              93.21              87.58
16             99.56              92.14              90.27
17             99.79              90.88              92.34
18             98.56              89.34              91.61
19             99.63              91.72              89.81
20             99.47              91.49              90.65
21             98.32              92.18              93.70
Average        99.19              91.29              91.14

We present our results in terms of the following accuracy measuring ratios: sensitivity, specificity and selectivity. Mathematically,

$$\text{specificity} = \frac{TN}{TN + FP} \times 100\% \qquad (6)$$

$$\text{sensitivity} = \frac{TP}{TP + FN} \times 100\% \qquad (7)$$

$$\text{selectivity} = \frac{TP}{TP + FP} \times 100\% \qquad (8)$$

where TP is true positive, FN is false negative, TN is true negative and FP is false positive.

TP is the number of epileptic signals detected correctly by the algorithm, i.e. same result as obtained by a trained doctor in detecting seizure. TN is the number of normal signals
detected correctly. FP is the number of normal signals labeled
ple signals, we use a high-performance computational cluster
epileptic by the algorithm and FN is the number of epileptic
consisting of 15 nodes, all with the same specifications as the
signals labeled normal by the algorithm.
desktop computer, which we use as the master for control-
As we train and test our algorithm for each patient individ-
ling the cluster nodes. The master and nodes are connected
ually, we present the classification results of all the 21 patients
by gigabit ethernet Network Interface Cards. The master and
whose EEG data were analysed in Table 2. We also indicate the
all the nodes run on RedHat Enterprise Linux v4. Software
average specificity, sensitivity and selectivity obtained.
implementations are carried out in MATLAB R2007a, installed
Table 3 presents an overall comparison of our method with
on all systems. Wavelet decomposition and the neural net-
a few other detection methods. The results are presented in
work classifier were both implemented using the in-built
terms of specificity, selectivity and sensitivity, wherever avail-
toolboxes.
able. It can be seen that our algorithm performs better than
previous attempts at automation.
5. Results

Average time to process one EEG signal of 1 h duration was


6. Strengths and weaknesses of the
16.92 s; 4.7 ms to process 1 s of EEG signal. It may be noted that
proposed algorithm
the algorithm produces an output for every second of input
EEG data. The major strengths of our algorithm are
108 c o m p u t e r m e t h o d s a n d p r o g r a m s i n b i o m e d i c i n e 9 1 ( 2 0 0 8 ) 100–109

ing set takes a long time to terminate. This increases the


Table 3 – Comparison with other automatic seizure
detection attempts time for optimally training the system by many hours. Note
that we have not made any attempts to optimize the genetic
Author Sensitivity Selectivity Specificity
algorithm. More work in this direction may reduce the train-
Acir et al. [1] 89 86 – ing time.
Hostetler et al. [2] 59 89 – • Currently, after our system has been trained using data from
Dingle et al. [3] 53 100 – one patient, an average decrease of around 10% in specificity
Adjouadi et al. [4] 82 92 –
and 18% in sensitivity is observed when tests are carried
Tzallas et al. [5] 80–85 77–90 90–97
Exarchos et al. [8] 84 82 91
out using data from other patients, having different epilep-
Our method 91.29 91.14 99.19 tic foci. This indicates that upto a certain level, all types of
epileptic seizures show up similarly in the EEG recordings.
This possibility needs to be further explored.
• Having used wavelets for feature extraction, we compute
• A further increase in sensitivity and selectivity is a must for
simple statistical parameters from the wavelet co-efficients
clinical deployment of such a system.
and use these parameters to represent the EEG signals. This
is not computationally intensive, even for real-time signals.
• A genetic algorithm has been used for choosing the training
set, as opposed to the common method of randomly select- 7. Conclusions and discussion
ing data for training. We observe a consequent increase in
our accuracy ratios due to this. Detection of epileptic activity in EEG recordings is mostly done
• The post-classification algorithm correlates the ANN clas- by a small number of skilled professionals today. Automat-
sifier output over multiple channels (six in our case) and ing this process presents many advantages and among them
yields a single classification label. It is seen to have the are faster diagnosis, non-stop monitoring, and reduction in
following advantages: overall cost of medical treatment. Automatic intervention of
• It provides a simple yet efficient means of relating multi- systems by electrical stimulation of the brain to prevent onset
ple outputs for the same duration of the EEG signal. Data of seizures would also benefit from such work. We propose
from electrodes placed outside the epileptic focus usu- a wavelet-based feature extraction technique which conse-
ally show low epileptic activity after ANN classification. quently uses simple statistical parameters to detect epileptic
This has been taken into consideration by assigning an EEG signals using a backpropagating artificial neural network
appropriate weight. classifier. We also employ a post-classification stage to corre-
• It eliminates improper classification outputs over short late the outputs from different channels and also to increase
periods which may occur due to low-power noisy distur- the overall accuracy (specificity, sensitivity and selectivity).
bances. Thus it avoids short durations of pre-ictal activity Average detection sensitivity of 91.29%, specificity of 99.19%,
being detected in interictal regions of the signal and vice and selectivity of 91.14% are obtained. Considering the speed
versa. of our algorithm (4.7 ms for wavelet decomposition and sta-
• In many cases, epileptic activity lasts only for a short time. tistical computation and a classification time of 0.012 ms with
To be able to detect this, while still retaining the previ- ANN for one 4 s window with 3 s overlap), implementing this
ously mentioned advantage, we use a high magnitude for real-time epileptic EEG detection also seems feasible, pro-
for p3 . At the same time, the value of p3 is low enough vided the modification in the post-classification algorithm
for the algorithm to not allow labeling of signals as ictal is not included. As mentioned before, the post-classification
unless some duration of pre-ictal signal has already been algorithm’s modification uses outputs occurring in the future
encountered. to increase sensitivity.
• In other techniques such as moving average filter or
simple low pass filter, the resulting values need to be fur-
ther quantized which demands additional thresholding 8. Future work
rules. Also, short durations of mis-classified outputs may
adversely affect the neighboring classification results Automation of epileptic EEG signal detection is a daunting
depending on the filter order and the output value of task and although much work has been done in this field,
the algorithm for each class. These problems have been the search for an algorithm that performs well across multi-
eliminated to a certain extent in our algorithm. ple patients with different types of epileptic seizures is still
• The reasonably high specificity, sensitivity and selectivity on. Although we have used only a few basic techniques in
values obtained show that our processing algorithm for this paper, our accuracy seems higher than those of other
feature extraction and our post-classification stage both similar attempts. Apart from addressing the weaknesses that
perform well. we have mentioned, our algorithm’s performance can prob-
ably be improved by using better techniques of choosing
There are also certain weaknesses in our algorithm which training sets for the neural network, using more statisti-
need to be overcome. cal parameters for each time window, employing a different
type of neural network (supervised or unsupervised), and
• Although a definite increase in the accuracy ratios is by varying the parameters used in the post-classification
observed, the genetic algorithm used for choosing the train- stage.
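As a starting point for varying the post-classification parameters, the look-ahead check of the modified post-classification stage and the accuracy ratios of Eqs. (6)–(8) can be sketched in Python as follows. This is only an illustrative sketch and not the MATLAB implementation used in this work. The function names (`future_weights`, `lookahead_label`, `accuracy_ratios`), the parameter `lead` and the default values `p3 = 1.0` and `k = 0.5` are hypothetical. The harmonic weights 1/11, ..., 1/20 are an assumption inferred from the threshold p3 (1/11 + ... + 1/15) quoted in the text, since weight-vector-alpha is defined elsewhere.

```python
# Illustrative sketch only: names, defaults and the weight formula are assumptions.

def future_weights(m=10, n=10):
    """Weights (alpha_n, ..., alpha_{n-m+1}) for the m outputs after current-output.

    Assumption: alpha_i = 1 / (2n + 1 - i), i.e. 1/11, 1/12, ..., 1/20 for m = n = 10.
    """
    if m > n:
        raise ValueError("m must not exceed n")
    return [1.0 / (n + 1 + j) for j in range(m)]


def lookahead_label(current_output, future_outputs, p3=1.0, k=0.5, n=10, lead=5):
    """Return 'ictal', 'pre-ictal', or None when the earlier rules apply instead."""
    if current_output <= k * p3:
        # Not above the k*p3 threshold of step 6(a); other steps decide the label.
        return None
    weights = future_weights(len(future_outputs), n)
    weighted_sum = sum(v * w for v, w in zip(future_outputs, weights))
    # Second threshold: p3 * (1/11 + ... + 1/15) when n = 10 and lead = 5.
    threshold = p3 * sum(weights[:lead])
    return "ictal" if weighted_sum > threshold else "pre-ictal"


def accuracy_ratios(tp, tn, fp, fn):
    """Specificity, sensitivity and selectivity in percent, as in Eqs. (6)-(8)."""
    return {
        "specificity": 100.0 * tn / (tn + fp),
        "sensitivity": 100.0 * tp / (tp + fn),
        "selectivity": 100.0 * tp / (tp + fp),
    }
```

Under these assumptions, a current output above k·p3 is labeled ictal when all ten subsequent outputs equal p3, and pre-ictal when none of them is raised. Changing `lead`, `k` or the weight formula and recomputing the three ratios is one simple way to explore the effect of the post-classification parameters.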

Acknowledgments

The authors would like to acknowledge the Department of Biotechnology, Government of India, New Delhi, for providing the financial support to carry out this work, and the Albert-Ludwigs-Universität, Freiburg, Germany for permitting us to use the EEG data from their website. The authors would also like to thank the anonymous reviewers for their valuable comments and suggestions which have immensely helped in improving the presentation of this manuscript.

References

[1] N. Acir, I. Oztura, M. Kuntalp, B. Baklan, C. Guzelis, Automatic detection of epileptiform events in EEG by a three-stage procedure based on artificial neural networks, IEEE Trans. Biomed. Eng. 52 (1) (2005) 30–40.
[2] W. Hostetler, H. Doller, W. Homan, Assessment of a computer program to detect epileptiform spikes, Electroenceph. Clin. Neurophysiol. 82 (1992) 1–11.
[3] A.A. Dingle, R.D. Jones, G.J. Carroll, W.R. Fright, A multistage system to detect epileptiform activity in the EEG, IEEE Trans. Biomed. Eng. 40 (1993) 1260–1268.
[4] M. Adjouadi, M. Cabrerizo, M. Ayala, D. Sanchez, I. Yaylali, P. Jayakar, A. Barreto, Detection of interictal spikes and artefactual data through orthogonal transformations, Clin. Neurophysiol. 22 (2005) 53–64.
[5] A.T. Tzallas, C.D. Katsis, P.S. Karvelis, D.I. Fotiadis, S. Konitsiotis, S. Giannopoulos, Classification of transient events in EEG recordings, in: Proceedings of the 2nd International Conference on Advances in Biomedical Signal and Information Processing (MEDSIP), 6–8 September, Malta, 2004 (ISBN 0-86431-439-7).
[6] M. Breakspear, L.M. Williams, A novel method for the topographic analysis of neural activity reveals formation and dissolution of dynamic cell assemblies, Journal of Computational Neuroscience 16 (2004) 49–68.
[7] A. Petrosian, Kolmogorov complexity of finite sequences and recognition of different preictal EEG patterns, in: Proceedings of the IEEE Symposium on Computer-Based Medical Systems, 1995, pp. 212–217.
[8] T.P. Exarchos, A.T. Tzallas, D.I. Fotiadis, S. Konitsiotis, S. Giannopoulos, A data mining based approach for the EEG transient event detection and classification, in: Proceedings of the 18th IEEE Symposium on Computer-Based Medical Systems, 2005, pp. 35–40.
[9] S.D. Cranstoun, H.C. Ombao, R.V. Sachs, W. Guo, B. Litt, Time–frequency spectral estimation of multichannel EEG using the auto-SLEX, IEEE Trans. Biomed. Eng. 49 (9) (2002) 988–996.
[10] L.D. Iasemidis, Epileptic seizure prediction and control, IEEE Trans. Biomed. Eng. 50 (5) (2003) 549–558.
[11] L.D. Iasemidis, H.P. Zaveri, J.C. Sackellares, W.J. Williams, Phase space analysis of EEG in temporal lobe epilepsy, in: Proceedings of the IEEE Engineering, Medical & Biology Society 10th Annual International Conference, 3, 1988, pp. 1201–1203.
[12] L.D. Iasemidis, P.M. Pardalos, D.S. Shiau, W. Chaovalitwongse, K. Narayanan, A. Prasad, K. Tsakalis, R. Carney, J.C. Sackellares, Long term prospective on-line real-time seizure prediction, Clin. Neurophysiol. 116 (3) (2005) 532–544.
[13] L.D. Iasemidis, D.S. Shiau, W. Chaovalitwongse, J.C. Sackellares, P.M. Pardalos, J.C. Principe, P.R. Carney, A. Prasad, B. Veeramani, K. Tsakalis, Adaptive epileptic seizure prediction system, IEEE Trans. Biomed. Eng. 50 (5) (2003) 616–627.
[14] S. Kalitzin, D. Velis, P. Suffczynski, J. Parra, F. Lopes da Silva, Electrical brain-stimulation paradigm for estimating the seizure onset site and the time to ictal transition in temporal lobe epilepsy, Clin. Neurophysiol. 116 (2005) 718–728.
[15] J.J. Niederhauser, R. Esteller, J. Echauz, G. Vachtsevanos, B. Litt, Detection of seizure precursors from depth-EEG using a sign periodogram transform, IEEE Trans. Biomed. Eng. 51 (4) (2003) 449–458.
[16] M. D'Alessandro, R. Esteller, G. Vachtsevanos, A. Hinson, J. Echauz, B. Litt, Epileptic seizure prediction using hybrid feature selection over multiple intracranial EEG electrode contacts: a report of four patients, IEEE Trans. Biomed. Eng. 50 (5) (2003) 603–615.
[17] C.R. Reeves, S.J. Taylor, Selection of training data for neural networks by a genetic algorithm, Lecture Notes in Computer Science 1498 (1998) 633–642.
[18] C. Yamaguchi, Fourier and Wavelet Analyses of Normal and Epileptic Electroencephalogram (EEG), in: Proc. of the 1st Intl. IEEE EMBS Conf. on Neural Engg, 2003, pp. 406–409.
[19] M. Akin, Comparison of wavelet transform and FFT methods in the analysis of EEG signals, J. Med. Syst. 26 (3) (2002) 241–247.
[20] I. Daubechies, Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics, Philadelphia, 1992.
[21] A. Remond, EEG Informatics, in: A Didactic Review of Methods and Application of EEG Data Processing, Elsevier/North-Holland Biomedical Press, 1977.
[22] S.W. Walker, A Primer on Wavelet Transform and their Scientific Application, Chapman & Hall/CRC, 1999.
[23] F. Sartoretto, M. Ermani, Automatic detection of epileptiform activity by single-level wavelet analysis, Clin. Neurophysiol. 110 (1999) 239–249.
[24] S.J. Schiff, A. Aldroubi, M. Unser, S. Sato, Fast wavelet transformation of EEG, Electroenceph. Clin. Neurophysiol. 91 (1994) 442–455.
[25] M.T. Hagan, M.B. Menhaj, Training feedforward networks with the Marquardt algorithm, IEEE Trans. Neural Netw. 5 (1994) 989–993.
[26] D.E. Goldberg, Genetic Algorithms in Search, Optimization & Machine Learning, Addison-Wesley, 1989.
[27] Albert-Ludwigs-Universität, Freiburg, Germany, http://www.fdm.uni-freiburg.de/groups/timeseries/epi/EEGData/download.
[28] F. Heydenreich, G. Rabending, U. Runge, A test for the evaluation of the amplitude distribution of the EEG, Zentralbl Neurochir. 44 (4) (1983) 283–290, in German.
