Moeynoi 2017


2017 14th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications

and Information Technology (ECTI-CON)

Dimension Reduction based on Canonical Correlation Analysis Technique To Classify Sleep Stages of Sleep Apnea Disorder using EEG and ECG signals

Pimporn Moeynoi, Yuttana Kitjaidure
Faculty of Electrical Engineering,
King Mongkut’s Institute of Technology Ladkrabang
Bangkok, Thailand
[email protected]

Abstract- Sleep stage scoring is the first step in diagnosing sleep disorders. It is conventionally performed by visual scoring, which relies on a human expert. To assist sleep physicians in evaluating patients, an automatic sleep stage classification system is needed. This work aims to develop such a system for sleep apnea patients based on Electroencephalography (EEG) and Electrocardiography (ECG). The article makes two main contributions. First, new EEG features computed with a simple statistical technique are proposed, and the results show that they discriminate the sleep stages more clearly at significant levels (p<<0.05). Second, a dimension reduction based on Canonical Correlation Analysis (CCA) is proposed, which exploits correlations across multiple sources and improves sleep stage classification to 95.42% accuracy with a random forest classifier. The results show that the proposed method can serve as the basis for a new sleep stage classification aid.

Keywords- Sleep Stages Classification; Canonical Correlation Analysis; Electroencephalography; Electrocardiography; Sleep Apnea Disorder.

I. INTRODUCTION

Sleep analysis is very important in identifying sleep-related problems. The objective of sleep scoring is to identify the sleep stages used for diagnosing and treating sleep disorders. Scoring is commonly based on Polysomnography (PSG) recordings, which are obtained from patients during overnight sleep and comprise various bio-signals. The recordings are scored visually by experts following a manual standard. Clinically, sleep is categorized as Wake, NREM (Non-Rapid Eye Movement) sleep, and REM (Rapid Eye Movement) sleep. NREM is sub-divided into four stages, N1-N4, according to the R&K standard [1] and the more recent standard proposed by the American Academy of Sleep Medicine (AASM) in 2007. The AASM combined N3 and N4 into a single deep-sleep stage called N3, also known as Slow Wave Sleep (SWS) [2]. The EEG signal is one of the most important and most frequently used signals for sleep staging. Many studies have attempted to classify sleep stages from a single EEG channel, which helps reduce the sleep disturbance caused by PSG recording wires and other equipment [3]. Automatic sleep stage detection methods have used various EEG-based feature extraction techniques. Furthermore, machine learning methods have also been widely used for sleep stage classification [4].

Another signal, the ECG recording, is considered one of the most studied physiological signals. The RR interval time series is the inverse of the heart rate and is derived from the ECG signal. The ECG is related to the physiological dynamics of sleep stages, as demonstrated in [5], which reported that the high-frequency parameter of heart rate variability (HRV) increased and the low-frequency parameter decreased in NREM stages, with the opposite changes during REM sleep. HRV is an important indicator of cardiac activity such as heart rate and rhythm. HRV can be analyzed in the time domain, the frequency domain, and the nonlinear domain [6-7]. HRV techniques have been shown to be powerful tools for characterizing the cardiovascular system.

Clinical studies have reported that sleep apnea disorder causes fluctuations in the electrical activity of the brain and the heart. The dynamic interactions between both systems during sleep have been explored and reported using Fast Fourier Transform analysis [8]. Moreover, nonlinear analysis is commonly used in sleep-related investigations [9]. Because sleep staging is mainly conducted by a sleep technician, the diagnosis is typically subjective. Recently, automated classification of sleep stages has been proposed to help with screening. Previous studies [10-11] have utilized EEG and ECG signals for this purpose. However, the classification analyses have used either the ECG signal alone or the EEG signal alone, and rarely both simultaneously. Our research therefore attempts to find an algorithm that classifies sleep stages for diagnostic use by utilizing both brain and heart signals.

The sleep stage features based on EEG and ECG signals are obtained from many different sources. Because this yields a large number of features, dimensionality reduction with feature fusion becomes necessary. A similar study [12] used dimensionality reduction to find the most discriminative features and to keep the dimensionality of the features as low as possible. In [3], the authors proposed techniques to reduce the computational complexity of the classifier and to select a set of significant features. According to the literature, feature selection methods and Principal Component Analysis (PCA) are the most popular methods for sleep stage classification [3,13-14]. However, they are limited in that they do not use class information. In contrast, Canonical Correlation Analysis (CCA) is a dimensionality reduction technique that uses class information to find the correlation between feature vectors of data sets. An important property of

978-1-5386-0449-6/17/$31.00 ©2017 IEEE


CCA is invariant with respect to affine transformations of the feature vectors, which distinguishes it from other techniques. In the field of biomedical data analytics, Nicolle M. et al. [15] used CCA to combine features from multimodal images (fMRI and sMRI) of schizophrenia patients. This method discovered the association across two data sources. Because CCA is easy to implement, we applied it to the reduction of sleep stage features.

This study consists of two main parts. The first explains the new modified EEG features, adapted from a previous study [3]. The second proposes a method that fuses multiple sources to improve sleep stage classification. The rest of this paper is organized as follows. Section II describes the methodology, including the database, the feature extraction, the dimensionality reduction, and the classification method. Section III provides the results and gives our conclusion.

II. METHODOLOGY

A. Database

The signals and sleep stage annotations were extracted from PSG recordings. These are available for download in EDF format from the Physionet database and come from patients referred to the Sleep Disorders Clinic at St Vincent's University Hospital. Brain waves were recorded with the standard 10-20 electrode placement. The ECG signal was traced from the V2 lead. The sampling rate of the signals was 128 Hz. In this analysis, sleep stages were classified into 5 classes: wake, REM, S1, S2, and SWS.

B. Features Extraction

EEG extraction

We used temporal and spatial techniques to compute the EEG features. We first divided the EEG signal into epochs of 30 seconds, which equals the length of a sleep stage annotation provided by the sleep specialists. The annotated sleep stages included wake, REM, and NREM 1 – 4. Our implementation combined NREM3 and NREM4 into a single stage, SWS, according to the recent AASM standard. EEG features related to sleep stages can be categorized into 2 groups: time domain and frequency domain. The time-domain features included the mean, standard deviation, maximum, minimum, Hjorth parameters, and Itakura distance [11]. The new time-domain features are modifications of the Maximum–Minimum distance (MMD).

Modified features (avgEVslope and sumEVslope)

The features were modified based on the ratio of Δx and Δy, as $m = \Delta y^2 / \Delta x^2$, that is, the slope between the maximum and the minimum (the extreme values) in each sub-window.

The MMD proposed by [3] is interpreted as the speed of the EEG signal. Each epoch is divided into non-overlapping sliding sub-windows whose length equals the wavelength of the EEG waveform. The idea behind this feature is to find the distance between the maximum and minimum points in each sub-window. The feature is represented by the sum of the distances over all sub-windows in each epoch, defined as

$$\mathrm{MMD} = \sum_{i=1}^{w} d_i$$

where w is the number of sub-windows in the epoch and d is the distance derived from the Pythagorean theorem as $d = \sqrt{\Delta x^2 + \Delta y^2}$. Δx and Δy are the differences between the maximum and the minimum along the x-axis and the y-axis, respectively.

Two new features are proposed. One is the average over all sub-windows, called avgEVslope, and the other is the sum over all sub-windows, called sumEVslope. The features are calculated by the following equations:

$$\mathrm{avgEVslope} = \frac{1}{w}\sum_{i=1}^{w} m_i, \qquad \mathrm{sumEVslope} = \sum_{i=1}^{w} m_i$$

We divided the epochs into sub-windows of various lengths, including 1 sec, 5 sec, 10 sec, and 15 sec, and then computed the features with the formulas above. To analyze the features, we used ANOVA, a statistical hypothesis test that analyzes the differences between groups.

Frequency-domain features, which characterize the spectral structure of the signal, are the most commonly extracted features. The frequency bands include low delta (0.5-2 Hz), high delta (2-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), and beta (13-30 Hz). The PSD features are extracted from the EEG as in [11]. For the Discrete Wavelet Transform (DWT), this experiment uses Daubechies 8 (db8) with 5 levels, which cover the required EEG information in the range 0-30 Hz. The EEG sub-bands are represented by the detail coefficients (D3-D5) and the approximation (A5). The mean, standard deviation, and power spectral density are calculated from the extracted data. In total, there are 56 EEG features.

Heart Rate Variability features

The ECG signal is segmented in the same way as the EEG signal, and the R peaks are identified with the Pan-Tompkins algorithm to compute the RR intervals. We compute the HRV features in three domains, the time domain, the frequency domain, and the nonlinear domain, employing some of the features in [6]. For the DWT in this process, db4 is used with 8 levels. The frequency-domain features are the power spectral density of each frequency range (LF, HF, and the ratio of LF and HF), the mean frequency of each frequency range, the amplitude of the LF and HF ranges, and the spectrum peak of each frequency range. Nonlinear analysis requires an understanding of the dynamic system underlying the ECG signal. In this experiment, we employ some nonlinear features, which include the detrended fluctuation analysis (DFA) and the correlation dimension [6]. In total, there are 53 HRV features.

C. Dimension reduction

Dimension reduction methods seek a low-dimensional common subspace that compactly represents various data. In this work, EEG and ECG are obtained from the multi-signal recording of the PSG data. They reflect related functional activity and are easy to


use, so we selected them for this study. The dimension reduction methods compared are Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Canonical Correlation Analysis (CCA).

Canonical Correlation Analysis

The CCA method is introduced in this work to fuse multiple feature sets, with the aims of discovering the association across two data sources, reducing the computational complexity, and improving the classification performance. Two types of data are used: one dataset is the EEG features and the other is the HRV features. They are defined as X1 and X2 with dimensions m×n and m×p, respectively. The number of canonical dimensions is equal to the smaller of the two column counts, min(n, p). The data projected onto the direction vectors are defined as $\hat{x}_1 = X_1 w_{x_1}$ and $\hat{x}_2 = X_2 w_{x_2}$. The variables $\hat{x}_1$ and $\hat{x}_2$ are the canonical variables. The correlation between them is given by

$$\rho = \frac{E[\hat{x}_1 \hat{x}_2]}{\sqrt{E[\hat{x}_1^2]\,E[\hat{x}_2^2]}} = \frac{w_{x_1}^T C_{x_1 x_2} w_{x_2}}{\sqrt{w_{x_1}^T C_{x_1 x_1} w_{x_1}\; w_{x_2}^T C_{x_2 x_2} w_{x_2}}}$$

The maximum of ρ with respect to $w_{x_1}$ and $w_{x_2}$ is the maximum canonical correlation. Here $C_{x_1 x_1}$ and $C_{x_2 x_2}$ are the within-set covariance matrices of $x_1$ and $x_2$, while $C_{x_1 x_2}$ is the between-sets covariance matrix.

Fig. 1 shows the models of the dimensionality reduction methods used in this study. The CCA model tries to maximize the inter-subject covariance across two sets of features and generates two linked variables, one from each data set. The ICA model assumes that two or more feature sets share the same mixing coefficient matrix and maximizes the independence among the components. PCA seeks the directions of maximum variance of the features without using class information.

Figure 1. The models for CCA, ICA, and PCA.

D. Classification

For sleep stage classification, the Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Decision Tree (DT), and Random Forest (RF) algorithms are employed. These are widely used machine learning techniques. We split the data into separate sets: 80% was used for training and 20% was used as a completely independent test of the recognition method. For the first classifier, the multiclass SVM, a radial basis function kernel is used. For the second, the k-NN algorithm, whose prediction is based on the k nearest neighbors, k was set to 2-4. The third algorithm, the Decision Tree classifier, was built with ID3, which uses entropy and information gain. For the last algorithm, random forest, the number of trees was set to 10 based on experiment. We used classifier libraries in MATLAB for the training and testing procedures.

III. RESULTS AND CONCLUSION

A. The extracted feature results

Fig. 2 compares the means of the sumEVslope and MMD features for each sleep stage. In the figure, the sumEVslope features clearly discriminate among the sleep stages, with significance at p<<0.05, and avgEVslope gives similar results. The MMD features differ only slightly between stages, with p-value <0.05.

Figure 2. The relation between the mean of features and sleep stages in each sub-window.

B. The classification results

Classification performance is measured by three metrics, given in percent: sensitivity, specificity, and accuracy. Table I compares the performance of the feature reduction methods for sleep stage classification. The best results are obtained with the CCA method. Fig. 3 shows the accuracy of the CCA reduction method for various numbers of features. The graph shows that the RF, SVM, and DT classifiers reach accuracies higher than 90%. The highest accuracy, 95.42%, is obtained with the RF algorithm using six components of each dataset. The second highest is SVM at 94.81% accuracy, followed by DT at 90.47%, KNN (k=2) at 81.71%, KNN (k=3) at 81.34%, and KNN (k=4) at 78.83%.

C. Discussion and Conclusion

A sleep test requires observing the patient through a PSG recording, and the sleep stages are labeled by a sleep technician. This study introduces a method to help sleep specialists screen sleep stages using only simple signals, EEG and ECG. Two contributions are proposed. First, we proposed new features called avgEVslope and sumEVslope. The proposed features make it possible to distinguish clearly between the different sleep stages (Wake, REM, S1, S2, and SWS). They also show how the sleep stages relate as a person falls into deeper sleep. These results give a better understanding of the characteristics of the physiological system.
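The avgEVslope and sumEVslope features of Section II-B, together with the MMD feature of [3], can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the authors' MATLAB implementation: Δx is measured in samples, a sub-window whose maximum and minimum coincide (Δx = 0) contributes a slope of zero, m is taken as Δy²/Δx², and the function names `sub_windows`, `extreme_deltas`, and `ev_features` are hypothetical.

```python
import math


def sub_windows(epoch, win_len):
    """Split an epoch (sequence of samples) into non-overlapping sub-windows."""
    return [epoch[i:i + win_len]
            for i in range(0, len(epoch) - win_len + 1, win_len)]


def extreme_deltas(window):
    """Return (dx, dy) between the maximum and minimum samples of a window.

    dx is the index difference (in samples, an assumption) and dy is the
    amplitude difference.
    """
    i_max = max(range(len(window)), key=window.__getitem__)
    i_min = min(range(len(window)), key=window.__getitem__)
    return abs(i_max - i_min), abs(window[i_max] - window[i_min])


def ev_features(epoch, win_len):
    """Compute MMD, sumEVslope and avgEVslope for one epoch."""
    windows = sub_windows(epoch, win_len)
    if not windows:
        raise ValueError("epoch is shorter than one sub-window")
    dists, slopes = [], []
    for w in windows:
        dx, dy = extreme_deltas(w)
        dists.append(math.hypot(dx, dy))          # d = sqrt(dx^2 + dy^2)
        # Assumption: a flat window (dx == 0) contributes zero slope.
        slopes.append((dy ** 2) / (dx ** 2) if dx else 0.0)
    return {
        "MMD": sum(dists),
        "sumEVslope": sum(slopes),
        "avgEVslope": sum(slopes) / len(slopes),
    }
```

At the 128 Hz sampling rate of the database, a 1 sec sub-window would correspond to `win_len = 128` and a 30-second epoch to 3840 samples.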

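The CCA-based reduction of Section II-C can be sketched in Python with NumPy. This is a textbook formulation (whitening each set, then a singular value decomposition of the whitened between-sets covariance), not necessarily the authors' implementation. The small ridge term `reg`, the function names `inv_sqrt`, `cca_reduce`, and `fuse`, and the concatenation of the two projected sets into one fused feature vector are assumptions made for illustration.

```python
import numpy as np


def inv_sqrt(C):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T


def cca_reduce(X1, X2, k, reg=1e-8):
    """Project X1 (m x n) and X2 (m x p) onto their first k canonical directions.

    Returns the canonical variables Z1, Z2 (each m x k) and the k canonical
    correlations in descending order. k should not exceed min(n, p).
    """
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    m = X1.shape[0]
    # Within-set covariances (with a small ridge, an assumption for stability)
    C11 = X1.T @ X1 / (m - 1) + reg * np.eye(X1.shape[1])
    C22 = X2.T @ X2 / (m - 1) + reg * np.eye(X2.shape[1])
    # Between-sets covariance
    C12 = X1.T @ X2 / (m - 1)
    A, B = inv_sqrt(C11), inv_sqrt(C22)
    U, s, Vt = np.linalg.svd(A @ C12 @ B, full_matrices=False)
    W1 = A @ U[:, :k]        # direction vectors for the first set
    W2 = B @ Vt.T[:, :k]     # direction vectors for the second set
    return X1 @ W1, X2 @ W2, s[:k]


def fuse(Z1, Z2):
    """Concatenate the canonical variables of both sets (one possible fusion)."""
    return np.hstack([Z1, Z2])
```

With `k = 6`, the call `fuse(*cca_reduce(X_eeg, X_hrv, 6)[:2])` would mirror the six components per dataset reported for the best random forest result; `X_eeg` and `X_hrv` stand for the 56-column EEG and 53-column HRV feature matrices and are placeholder names.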

Second, we propose a dimensionality reduction based on the CCA technique, which combines multiple feature sets from the EEG and ECG signals, to classify the sleep stages of sleep apnea patients. CCA provides the best accuracy for all classifiers. The best results are obtained with CCA combined with the random forest and SVM classifiers. Because the RF algorithm is generally computationally inexpensive, so that model construction and classification are faster than with SVM, we prefer the RF algorithm. Moreover, compared with recent techniques for sleep stage classification using EEG and ECG [9-12], our proposed model offers advantages in terms of accuracy and complexity.

TABLE I. CLASSIFICATION PERFORMANCE OF SVM, KNN (K=2-4), DECISION TREE AND RANDOM FOREST WITH DIFFERENT FEATURE REDUCTION METHODS

Classification   Feature reduction   Sens. (%)   Spec. (%)   Acc. (%)
method           method
SVM              PCA                 91.98       91.43       91.87
SVM              ICA                 91.37       88.69       90.52
SVM              CCA                 94.90       94.35       94.81
k-NN, k=2        PCA                 77.98       62.39       69.46
k-NN, k=2        ICA                 77.06       68.08       72.15
k-NN, k=2        CCA                 85.28       77.45       81.71
k-NN, k=3        PCA                 74.56       63.26       65.38
k-NN, k=3        ICA                 76.97       67.22       69.13
k-NN, k=3        CCA                 84.45       80.38       81.34
k-NN, k=4        PCA                 60.01       62.81       62.24
k-NN, k=4        ICA                 64.08       66.14       65.98
k-NN, k=4        CCA                 82.70       78.07       78.83
Decision Tree    PCA                 86.32       87.01       86.89
Decision Tree    ICA                 85.24       85.95       85.33
Decision Tree    CCA                 90.28       89.91       90.47
Random Forest    PCA                 94.25       95.81       93.90
Random Forest    ICA                 94.47       95.85       94.15
Random Forest    CCA                 95.88       97.07       95.42

Figure 3. The relation between the number of features based on CCA method and the average of 10-fold accuracy.

REFERENCES
[1] A. Rechtschaffen and A. Kales, “A manual of standardized terminology, techniques and scoring system for sleep stages of human subjects,” NIH Publication No. 204, 1968.
[2] S. Charbonnier, L. Zoubek, S. Lesecq, and F. Chapotot, “Self-evaluated automatic classifier as a decision-support tool for sleep/wake staging,” Comput. Biol. Med., 2011, 41, pp. 380–389.
[3] A.I. Khald Aboalayon, M. Faezipour, Wafaa S. Almuhammadi, and Saeid Moslehpour, “Sleep Stage Classification Using EEG Signal Analysis: A Comprehensive Survey and New Investigation,” Entropy 272-18, 2016, pp. 1-31.
[4] M. Oswaldo Mendez, “Sleep staging from Heart Rate Variability: time-varying spectral features and Hidden Markov Models,” Int. J. Biomedical Engineering and Technology, Vol. 3, 2010, pp. 246-263.
[5] K.S. Phyllis and P. Yachuan, “Heart rate variability, sleep and sleep disorders,” Sleep Medicine Reviews 16, 2012, pp. 47-66.
[6] V. Steven, “Heart Rate Variability: linear and non linear analysis with applications in humans physiology,” Doctoral Thesis, Faculty of Electrical Engineering, Kasteelpark Arenberg, Belgium, 2010.
[7] S. Boettger, D. Hoyer, K. Falkenhahn, M. Kaatz, V.K. Yeragani, and K.J. Bar, “Altered diurnal autonomic variation and reduced vagal information flow in acute schizophrenia,” Clin Neurophysiol, 2006, 117, pp. 2715–22.
[8] M. Alejandro et al., “Non-linear analysis of EEG and HRV signal during sleep,” Conf Proc IEEE Eng Med Biol Soc., 2015, pp. 4174-7.
[9] J.R. Yeh, C.K. Peng, et al., “Investigating the interaction between heart rate variability and sleep EEG using nonlinear algorithms,” Journal of Neuroscience Methods 219, 2013, pp. 233-239.
[10] A. Mina, K. Tokuhiro, U. Sunao, M. Shinchi, N. Kyoko, M. Junko, et al., “Correlation between electroencephalography and heart rate variability during sleep,” Psychiatry Clin Neurosci, 2003, Vol. 57, pp. 59–65.
[11] E. Edson and H. Nazeran, “EEG and HRV signal features for automatic sleep staging and apnea detection,” IEEE, 2010, pp. 142-147.
[12] S. Khalighi, T. Sousa, G. Pires, and U. Nunes, “Automatic sleep staging: A computer assisted approach for optimal combination of features and polysomnographic channels,” Expert Syst. Appl., 2013, 40, pp. 7046–7059.
[13] B. Koley and D. Dey, “An ensemble system for automatic sleep stage classification using single channel EEG signal,” Comput. Biol. Med., 2012, 42, pp. 1186–1195.
[14] S. Baha, P. Musa, et al., “A comparative Study on Classification of Sleep stage Based on EEG signal Using Features and Classification Algorithm,” J. Med Syst, 2014, 38:18, pp. 1-18.
[15] M. Nicolle et al., “Canonical Correlation Analysis for Feature-Based Fusion of Biomedical Imaging Modalities and Its Application to Detection of Association Network in Schizophrenia,” IEEE Journal in signal processing, Vol. 2, No. 6, 2006, pp. 998-1007.
