Neural Networks: Arlex Oscar Marín García, Markus Franziskus Müller, Kaspar Schindler, Christian Rummel
Neural Networks: Arlex Oscar Marín García, Markus Franziskus Müller, Kaspar Schindler, Christian Rummel
Neural Networks: Arlex Oscar Marín García, Markus Franziskus Müller, Kaspar Schindler, Christian Rummel
Neural Networks
journal homepage: www.elsevier.com/locate/neunet
1. Introduction Bouchaud, & Potters, 1999; Müller, Baier, Galka, Stephani, & Muhle,
2005; Müller, Baier, Rummel, & Schindler, 2008; Müller et al., 2006;
Often, the precise mathematical definition of measures used Plerou et al., 2002; Plerou, Gopikrishnan, Rosenow, Nunes Amaral,
for data analysis contains integrals over infinite ranges (like & Stanley, 1999; Rummel, Müller, Baier, Amor, & Schindler, 2010).
the Fourier transform, Correlation coefficient, Hilbert transform, Due to the finite size of the data window, the estimate of
etc.) or limits to zero and/or infinity (e.g. Correlation dimension, the cross-correlation of two completely independent time series
Lyapunov exponent, etc.) (see for example Kantz & Schreiber, (e.g. independent Gaussian white noise) is generally non-zero. For
2004). In application to real world data, which are non-stationary the same reason, it is at first also not clear how close the numerical
and recorded with finite sampling rate, such requirements cannot estimate approximates the correct value in the case when genuine
be met. This lack of mathematical precision influences the quality correlations are present. Even worse, as the amount of random
of the numerical estimates. In the case of cross-correlations the correlations depends on the relation of the period of the slowest
sampling theorem proves that coarse graining of the data is not dominant frequency component of the signal and the length of the
relevant, provided that the highest frequency component of the data segment (Rummel et al., 2010), it may change drastically over
signal is smaller than the Nyquist frequency (a property which time (Müller et al., 2011, 2008). This undesired effect questions the
often is not checked for). However, replacing the integral over cross-correlation coefficient as an appropriate technique for the
infinite range with a sum over a finite data segment may cause analysis of real world data.
a serious side effect called ‘‘random correlations’’ (Laloux, Cizeau, At this place the question arises why not to simply use an-
other bivariate measure instead of the cross-correlation coeffi-
cient, which additionally to the above mentioned problem only
∗ Corresponding author at: Facultad de Ciencias, Universidad Autónoma del detects linear interrelationships between two signals. Therefore,
Estado de Morelos, 62209 Cuernavaca, Morelos, Mexico. Tel.: +52 777 3297020; fax: any nonlinear interrelation, which might be expected between sig-
+52 777 3297040. nals measured in real word complex systems, remains unobserved
E-mail address: [email protected] (M.F. Müller). by definition. However, for two reasons we concentrate on the
1 These authors have contributed equally to this work. numerically robust and computationally cheap cross-correlation
0893-6080/$ – see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.neunet.2013.05.009
A.O. Marín García et al. / Neural Networks 46 (2013) 154–164 155
coefficient. At first, also for other measures deficiencies appear of the analysis of the model data. We selected several examples of
once the precise mathematical definition is translated into an al- different correlation pattern that demonstrate the pros and cons of
gorithm of a numerical estimate. Usually integrals over infinite each of the measures. Finally, we present an analysis of intracranial
ranges are then converted to finite sums, or limits to infinity or zero electroencephalographic recordings of epilepsy patients in order to
cannot be respected due to the finite amount of coarse-grained illustrate how the pitfalls of some measures may lead to erroneous
data. Secondly, it turns out that the cross-correlation coefficient conclusions drawn from the numerical analysis of real world data
performs equally well, or even better than sophisticated non- with unknown correlation pattern.
linear measures for the detection of the (linear or nonlinear)
coupling between nonlinear or even chaotic dynamical units 2. Methods and concepts
(Ansari-Asl, Senhadji, Bellanger, & Wendling, 2006; Kreuz et al.,
2007). In these studies measures for phase synchronization based 2.1. Correlation measures
on the Hilbert transform or wavelets, information theoretical mea-
sures like mutual information or measures based on phase space The equal-time correlation matrix Ĉ is real, symmetric and pos-
reconstruction have been included. Finally, in Mormann et al. itive definite, i.e. its eigenvalues are positive real numbers: 0 ≤
(2005) 30 different measures are tested for their ability to discrimi- λ1 ≤ λ2 ≤ · · · ≤ λM ≤ M. The range of the eigenvalues is limited
nate a possible pre-seizure from an interictal state by analyzing in- due to the trace condition Tr(Ĉ ) = i=1 λi = M. Cru-
M M
i=1 Cii =
tracranial electroencephalographic recordings. The authors found cial for characterizing cross-correlations is how the corresponding
that ‘‘A remarkable aspect of our results is that both among the uni- eigenvalues of the cross-correlation matrix are distributed (Baier,
variate and the bivariate approaches linear measures performed Müller, Stephani, & Muhle, 2007; Müller et al., 2005, 2006; Plerou
equally good or even better than non-linear measures’’. The au- et al., 2002). Therefore, the eigenvalues are central for designing a
thors of Ansari-Asl et al. (2006) recommend explicitly the appli- certain class of correlation measures.
cation of linear bivariate measures as a ‘‘first attempt’’. For these We aim to compare the equal-time cross-correlation matrix de-
reasons we concentrate on linear cross-correlations in this study. rived from the real world data with the equivalent matrix derived
During the last decade numerous studies have been dedicated from surrogates, which shares the same univariate properties as
to solve the problem of random correlations and several methods the original data but does not contain any genuine zero-lag cross-
have been developed to extract the genuine correlation structure correlations between the data channels. Hence, such data repre-
of a multivariate data set. One class of methods is based on the sents the null hypothesis of zero genuine correlations with the
techniques known from Random Matrix Theory (RMT) (Laloux same amount of random correlations as the original data. For the
et al., 1999; Müller et al., 2005, 2006; Plerou et al., 2002, 1999). At present null hypothesis of zero genuine correlations the surrogates
the core of these methods lays the distribution of eigenvalues and are generated in an iterative process where Fourier phases are ran-
eigenvectors of the equal-time cross-correlation matrix, which is domized while conserving the power spectrum as well as the am-
compared to those of well-defined random matrix ensembles, as plitude distribution of the signals (Schreiber & Schmitz, 2000).
e.g. the Wishart ensemble or the Gaussian Orthogonal Ensemble. Alternatively this kind of surrogate data can be approximated to
That part of the spectrum, which coincides with the random a high degree by randomly shifting the data channels relatively to
matrix prediction, is considered as random; while it is believed each other and wrapping the extra values around to the beginning
that significant deviations from the random matrix prediction are of the data set (Netoff & Schiff, 2002). In the present paper we used
generated by genuine interrelations. Hence, ensembles of random the technique of Fourier phase randomization, i.e. Iterative Am-
interrelation matrices represent the null hypothesis of absence of plitude Adjusted Fourier Transform based surrogates (IAAFT), al-
genuine cross-correlations. These techniques have often been used though shift surrogates deliver qualitatively similar results as we
to analyze stock data (optimization of portfolios by using certain have tested for.
variants of Principal Component Analysis). We define data segments of length L data points, which are
For a second group of methods the null hypothesis is repre- divided in at least NT non-overlapping windows of size T sampling
sented by appropriate surrogate data (Carmeli, Knyazeva, Inno- points. Each of these windows is used to estimate the equal-time
centi, & de Feo, 2005; Cui, Liu, Wan, & Li, 2010; Müller et al., 2011, cross-correlation matrix. Then, for each of the data segments L
2008; Rummel et al., 2010). Some techniques are based on compar- we generate NL surrogates and proceed with them in the same
ing the eigenvalues of the equal-time cross-correlation matrix de- way. Hence, we get an ensemble of correlation matrices, which
rived from the original data and from surrogates. Other approaches uniquely reflect random correlations. We diagonalize each of these
assess the distribution of the matrix elements itself and permit matrices; the eigenvalues are then used to define two different
not only to estimate the amount of genuine correlations but they correlation measures, which are supposed to reflect the ‘‘amount
provide in addition detailed information about the topology of the of genuine cross-correlation’’ of an empirical, multivariate data set
spatial correlation pattern. No thresholds or external parameters (Cui et al., 2010; Müller et al., 2011, 2008; Rummel et al., 2010).
have to been fixed such that methods of this group are fully data We define the amount of genuine cross-correlation as the
driven. Originally they have been designed for analyzing EEG data, average over all zero-lag cross-correlation coefficients of a M-
although a clustering algorithm derived within the same frame- dimensional multivariate data set under the supposition of in-
work (Rummel, 2008; Rummel, Baier, & Müller, 2007; Rummel finitely long, stationary time series recorded with infinite temporal
et al., 2010; Rummel, Müller, & Schindler, 2008) might be a use- resolution. In this case, the integral over an infinite range (the cor-
ful tool also for the analysis of financial data (Münnix et al., 2012). relation integral) can be evaluated precisely for each pair of time
In the present study we will focus solely on this group. series and no spurious contributions occur. Hence, the amount of
The paper is organized as follows. In the next section, we first genuine cross-correlations GCS can be defined as:
review the definitions of the correlation measures and define the M
2 ⌢
term ‘‘amount of genuine cross-correlation’’. Then we discuss the GCS = C ij (1)
model used for the derivation of artificial recordings and present M (M − 1) i>j
as an analytical result a formula, which quantifies the amount of
where
genuine cross-correlations as a function of the coupling strength. ∞
We close the method section with a description of EEG-recordings ⌢
and patients involved in this study. Then we present the results C ij = X̃i (t )X̃j (t )dt (2)
−∞
156 A.O. Marín García et al. / Neural Networks 46 (2013) 154–164
instead of the estimate over a finite data segment of length T and statistically equivalent to those derived from the surrogates.
finite sampling rate: Hence, CCS = 0, because all Si are equal to zero. In the case
of maximal correlations (all signals are identical), all λ̃oi = 0
T
1 for i = 1, . . . , M − 1 and λ̃oM = M. In this case, numerator and
Cij = X̃i (tk )X̃j (tk ). (3)
T k=1 denominator of formula (3) have the same value and CCS = 1
(c) CCS-Matrix (Rummel et al., 2010):
The tilde in Eqs. (2) and (3) indicates that the time series X̃i and The previous concept has been generalized in order to define
X̃j are normalized to zero mean and unit variance (within the data the CCS-Matrix, of which each entry is a bivariate estimate of
window of T data points in case of Eq. (3)). the genuine cross-correlations between
two
data channels. Let
µoij = med Cijo and µsij = med Cijs the medians of the
The definition of the correlation measures considered in this
study is: absolute values of the entries of the correlation matrices de-
rived from theoriginal data and the surrogates respectively and
(a) Synchronization Index SI (Carmeli et al., 2005; Cui et al., 2010): νijo = med Cijo . Then the definition of the CCS-matrix reads:
In order to account for the amount of random correlations,
the average eigenvalues derived from the original data λ̄oi o µoij − µsij
are divided by the corresponding ones estimated from the CCSij = sign νij Sij . (7)
1 − µsij
surrogates λ̄si . Then normalized quantities may be defined as:
In the last formula, the denominator assures proper normal-
λ̄oi /λ̄si ization of the matrix elements, the pre-factor defines the sign
Λi = . (4)
M of the correlation coefficient and the scalar Sij represents the
λ̄oj /λ̄sj
result of a one-sided Mann–Whitney–Wilcoxon-rank test (in-
j =1
cluding Bonferroni correction for multiple testing), i.e. it can be
The Λi are then substituted in the definition of the ‘‘S-esti- equal to zero and one only.
One may apply the significance test
mator’’ introduced by Carmeli and coworkers (Carmeli et al., to the absolute values Cij (as proposed in the original article
2005): Rummel et al., 2010) or directly to Cij . As we found a slightly im-
proved sensitivity to detect small genuine correlations when
M
the Cij are object of the significance test, we proceed in the
Λi log (Λi )
i =1
present study in this way. Note, the CCS-matrix is no longer
SI = 1 + . (5) a quadratic form and hence its eigenvalues are not necessarily
log (M )
positive. However, as each of the entries of this matrix is sup-
The definition is motivated by the information theoretical con- posed to already give an estimate of the genuine bivariate cor-
cept of entropy, although the index SI is certainly not an en- relations, no diagonalization is necessary in order to separate
tropy, as the Λi are no probabilities. However, definition (5) random and genuine contributions. Instead, one may estimate
provides an estimator for the amount of genuine correlations. the whole amount of genuine correlation within a multivariate
If all M signals are uncorrelated, the eigenvalues derived from data set as the average over the matrix elements:
original data and surrogates coincide. Hence, all Λi are equal to M
1/M and therefore SI = 0. If on the other hand all M signals are 2
CCSij .
CMat = (8)
identical, Λi = 0 for all i = 1, . . . , M − 1 and ΛM = 1. Hence, M (M − 1) i>j
in this case it holds that SI = 1.
(b) Cross-Correlation Strength CCS (Müller et al., 2011, 2008): (d) Finally we introduce a fourth measure in the following manner:
Any particular (random or genuine) correlation pattern causes We define the elements of an interrelation matrix called ‘‘sig-
a specific repulsion scheme of the eigenvalues. This finding nificant average correlations’’ as the average correlation coef-
is basic for defining the CCS-coefficient. It consists of the ficient estimated over the segment L.
comparison between the eigenvalues of the equal-time cross- NT
Sij o,k
= cijo Sij .
correlation matrix derived from the original data and the sur- SACij = cij (9)
rogates respectively. As repulsions may occur at any location NT k=1
along the spectrum, each of the eigenvalues is compared in- o,k
dividually. The normalized sum of significant differences may In the last formula, cij denotes the equal-time cross-
correlation coefficient between data channel ‘i’ and ‘j’ esti-
serve as a measure for the amount of genuine correlations of a
mated at the k-th data window of the original data. NT is
multivariate data set:
the number of windows within the segment L of the original
M recording and Sij represents again the outcome of the Mann–
λ̃i − λ̃si Si
o
Whitney–Wilcoxon-rank test (including Bonferroni correction
i=1
CCS =
−1
M . (6) for multiple testing). The genuine correlation strength may
M − λ̃M +
s
λ̃i
s then be estimated in a similar way as for the CCS-matrix:
i=1 M
2
SACij .
In this formula the λ̃ and λ̃ are the medians of the i-th eigen-
o s SMat = (10)
i i M (M − 1) i>j
value derived from the original data and the surrogates. Be-
cause of the factor Si only significant differences according With the exception of the synchronization index, all measures
to the nonparametric Mann–Whitney–Wilcoxon-rank test are yield numerical estimates on a predefined significance level
taken into account. It is equal to zero if the null hypothesis of fixed to 1% (including Bonferroni correction) throughout the
zero genuine correlations cannot be rejected, i.e. the particular paper.
eigenvalues are statistically equivalent. Otherwise, it is equal
to one and the difference between the two eigenvalues enters 2.2. A simple model system
the sum.
CCS varies between zero and one. If no genuine correlations The performance of linear or nonlinear interrelation measures
are present, all eigenvalues derived from the original data are is usually tested by (a) analyzing the data derived from theoretical
A.O. Marín García et al. / Neural Networks 46 (2013) 154–164 157
test models (e.g. Ansari-Asl et al., 2006; Kreuz et al., 2007) or (b) to not sensitive to the sign of the correlation, this consideration is only
empirical data measured under well defined specific conditions relevant for the CCS- and SAC-matrix.
(Mormann et al., 2005). As theoretical test models nonlinear For the numerical studies we varied ρ from zero to one in steps
dynamical systems are frequently used, which are linearly or of 0.02 and for each value of ρ we generated 100 independent
nonlinearly coupled. Then one measures the sensitivity of the realizations. Then we calculated the median value of the 100
interrelation-estimators as a function of the coupling strength and estimates of the correlation measures and the 95% significance
its robustness as its behavior under additive white or colored noise. interval.
However, in most cases, the analytical value for genuine cross- For estimating the correlation measures we generated seg-
correlations (or other bivariate measures) is unknown and one ments of L = 2000 sampling points. The time unit was set to 200
only tests if an interrelation measure detects the variation of the sampling points. Then the segment L corresponds to a 10 s interval.
coupling strength. Here we want to compare the performance of The segment is then divided in NT = 10 non-overlapping windows
the measures not only between each other but also with regard to of length T = 200. Furthermore, we generated from each segment
the analytical results of genuine cross-correlations. For this reason, NL = 10 IAAFT surrogates, which are divided in the same way in
we employ an analytically treatable test model. The fact that the shorter data windows of length T . In this way we get an ensemble
following model is a linear one constitutes no restriction as we aim of Nsurr = 100 random matrices representing the null-hypothesis
to test exclusively linear correlation measures. We define M time of zero genuine correlations.
series as:
We then define a frequency unit such that 1 Hz corresponds to
Xi = (1 − ρ)ξi + ρη, for i = 1, . . . , k 1/(time unit). For the present case we considered a frequency range
and (11) of 0.1–100 Hz for all the data derived from the above-described
numerical models.
Xi = ξi , for i = k + 1, . . . , M .
The ξi and η are independent colored or Gaussian white noise, 2.3. EEG data
i.e. ⟨ξi ⟩ = ⟨η⟩ = ⟨ξi η⟩ = 0 and ξi2 = 1 = η2 , where ⟨∗⟩ denotes
the average. Importantly, the ξi are generated for each of the time As a proof of principle we analyzed exemplarily several in-
series individually while η is a common noise component, which tracranial recordings of three patients suffering from pharma-
is the same for k ≤ M time series. Furthermore, the amount of coresistant epilepsy. Non-invasive studies had not allowed to
genuine correlations does not depend on the type of noise used. unequivocally localize the seizure onset zone and the patients un-
The power spectra of the time series only influence the amount of derwent long-term intracranial EEG recordings at the Department
random correlations while the parameter ρ controls the strength of Neurology of the University of Bern, Switzerland. Beside the
of genuine correlation. If ρ = 0 all M time series are independent need for invasive EEG studies, further inclusion criteria were excel-
and, hence, they do not contain any genuine cross-correlations. In
lent post-surgical outcome, here defined as class 1 or 2 according to
the limiting case ρ → 1, the independent noise term vanishes and
the ILAE classification (Wieser et al., 2001). The ethics committee
the k time series become identical and equal to η. In this case the
of the Kanton of Bern had approved retrospective analysis of EEG
respective correlation coefficients become equal to one. With this
data. In addition, all the patients had given written informed con-
test model it is also possible to generate mutually uncorrelated
sent that their data from long-term EEG might be used for research
clusters of correlated time series. This can be achieved by using
and teaching purposes.
different and mutually independent noise terms for each cluster.
The number of implanted electrodes for patient one was 42, for
For the numerical studies four different correlation patterns
patient two 88 and for patient three 47. All electroencephalograms
derived from model (11) are considered. With the exception of
considered in this paper have been recorded with a sampling
model A, in all cases the total number of time series has been fixed
to M = 19. frequency of 512 Hz. As in the case of the test models, the segment
length L has been chosen as 10 s, which was divided in 10 one-
Model A: is just the bivariate case solved by formula (11), viz. two second windows T . The segments have been moved with a 5 s
time series are successively correlated by increasing the step width along the recordings in order to obtain the temporal
coupling parameter ρ . evolution of the three correlation measures.
Model B: two of 19 data channels are correlated with strength
parameter ρ .
3. Results
Model C: three mutually independent correlation clusters are
formed. Two clusters contain 7 signals and the third one
5 time series. The signals of each cluster are correlated 3.1. Analytical considerations
with the same strength determined by ρ . In this case 52
from 171 matrix elements contain genuine correlations. The most important feature of the test model described in Sec-
Model D: 11 signals are correlated with strength ρ . In this case 55 tion 2.2. is that it can be treated analytically. In order to evaluate
of 171 matrix elements contain genuine correlations. the correlation integral one has to normalize the time series, such
that they have zero mean and unit variance. Hence,
We considered two variants of each model with the same amount
of genuine, but different strength of random correlations: (a) Gaus- (1 − ρ)ξ1 + ρη
sian white noise and (b) 1/f noise (pink noise) has been used for X1 = (12a)
the variables ξ and η. 1 − 2ρ + 2ρ 2
All definitions of the multivariate measures are based on the (1 − ρ)ξ2 + ρη
Pearson correlation coefficient, which takes values between minus X2 = . (12b)
1 − 2ρ + 2ρ 2
and plus one, while many other bivariate measures as e.g. the mu-
tual information or the phase coherence are positive. In order to Evaluating the correlation integral yields the correlation coefficient
test the performance of the different techniques when a positive as a function of ρ :
definite bivariate measure is used, we also constructed the multi-
variate measures based on the absolute value of the Pearson co- ρ2
efficient. However, as the eigenvalues of the correlation matrix are C12 (ρ) = . (13)
1 − 2ρ + 2ρ 2
158 A.O. Marín García et al. / Neural Networks 46 (2013) 154–164
This result holds for two time series only. If in a multivariate data analytical result Cgen (formula (15)). For the case of the eigenvalue-
set mK ≤ M time series are correlated with the same strength pa- based measures also the corresponding analytical result (formula
rameter ρ , the amount of genuine cross-correlation is given by: (17) and (18) respectively) is drawn for comparison.
We start with the discussion of the simplest case of bivariate
ρ2 mK (mK − 1) correlation pattern. In Fig. 1results obtained for model A (Fig. 1(a)
Cgen (ρ) = . (14)
1 − 2ρ + 2ρ M (M − 1)
2 to (f)) and model B (Fig. 1(g) and (h)) are shown. The synchroniza-
tion index (Fig. 1(a)) is by construction far from the analytical re-
Finally, for the case of KG clusters the amount of genuine correla-
tions is given by sult Cgen . The analytical result SIa is qualitatively different and takes
systematically values lower than Cgen . Therefore, the numerical es-
ρ2 KG
mK (mK − 1)
timate of the synchronization index stays almost always far from
Cgen (ρ) = . (15) Cgen (beside of the ranges ρ ≈ 0 and ρ ≈ 1) but approximates sat-
1 − 2ρ + 2ρ 2 k=1 M (M − 1) isfactory the theoretical curve SIa . As expected, the accuracy of the
If we assume block structure of the correlation matrix (3) and estimates decreases in the case of 1/f noise with somewhat larger
absence of random correlations, also for the eigenvalue based statistical fluctuations.
measures analytical formulas can be derived. Considering a single On the other hand the CCS-coefficient (Fig. 1(b)) describes
cluster with mK pairwise correlated signals with correlation coef- the theoretical curve Cgen considerably well. This is because for
ficient C (ρK ) from formula (13). Then the normalized eigenvalues model A Cgen and CCSa coincide. The same is true for the estimates
(Eq. (4)) read: of CMat (Fig. 1(c) and (d)) and SMat (Fig. 1(e) and (f)). Because
these measures contain a significance test, its sensitivity is reduced
1 − C (ρK ) such that small genuine and random correlations cannot be
Λi = ∀i = 1, . . . , mK
M distinguished such that the respective measure is set to zero. CCS
1 + (mK − 1)C (ρK ) starts to detect genuine correlations at about ρ ≈ 0.25 (white
ΛmK = (16) noise) and ρ ≈ 0.35 (1/f noise). The matrix measures are slightly
M
more sensitive (ρ ≈ 0.2 and ρ ≈ 0.3 for white and colored noise
1
Λi = ∀i > mK . respectively). Then, for higher values of the strength parameter
M CCS as well as CMat stay close, although somewhat below, the
This simplifies that for a situation with KG mutually independent theoretical curve. However, Cgen lies always within the 95%
groups, where the signals of each group are correlated pairwise confidence interval of CCS and CMat . The best performance in the bi-
correlated with strength C (ρK ), K = 1, . . . , KG the synchroniza- variate case shows SMat (Fig. 1(e) and (f)). Its median values of the
tion index reads: numerical estimates lie almost exactly upon the theoretical result.
1−C (ρ )
1−C (ρK )
The situation is qualitatively the same when a positive bi-
KG (m K −1)
K
log
M M variate measure in terms of the absolute value of the Pearson cor-
SIa = 1 +
K =1
i =1
log (M ) relation is used for the construction of the matrices CCSij and SACij
(Fig. 1(d) and (f)), although in this case the sensitivity is slightly re-
duced. If the significance test would have been performed for the
1+(mK −1)C (ρK ) 1+(mK −1)C (ρK )
M
log M
absolute values of the correlation matrix, Fig. 1(b), (c) and (d) were
+
log(M ) almost identical. Now the numerical estimates of CMat (Fig. 1(d))
are quantitatively very close to those of CCS (Fig. 1(b)). However,
(M −L) qualitatively CMat shows the same performance than in the case
1
1
M
log M
+ (17) shown in Fig. 1(c). This result could be anticipated because in this
i=L+1
log(M ) case the distributions of the absolute values of the correlation co-
KG
efficients enter the statistics, i.e. similar to the eigenvalues, which
where L = K =1 mK counts the total number of correlated chan- determine the CCS-coefficient there is no distinction between pos-
nels. itive and negative correlations.
In the same manner one obtains for the same multi-cluster In contrast it is more surprising that also SACij is only weakly
situation for the CCS-coefficient: affected by this change. In the case of SMat one should expect that
KG the average over the absolute value of the correlation coefficients
1
CCSa = (mK − 1)C (ρK ). (18) in Eq. (9) is less effective than in the former case. As opposed to this
M − 1 K =1 consideration the qualitative behavior of SMat is almost identical to
that observed in Fig. 1(e).
Formulas (15), (17) and (18) will be used in the following for
Table 1 gives an overview of the performance of the considered
the comparison with the different correlation measures estimated
correlation measures. The mean percentage error (Eq. (19)) of CCS
from the numerical data. Furthermore, a quantitative evaluation
is slightly higher than the one of CMat . The worst performance in
of the performance of the numerical estimates is provided by the
these terms shows the synchronization index. Although its esti-
mean percentage error (MPE), defined as:
mates are not strained by any kind of significance test, which natu-
Numerics analytics
rally decreases the sensitivity to detect small genuine correlations,
Cmeasure (ρ) − Cmeasure (ρ) dρ
its overall behavior yields the largest relative errors. With respect
MPE = analytics . (19)
Cmeasure (ρ)dρ to the mean percentage error the most precise measure in the bi-
variate case is SMat . Its MPE for pink noise is still lower than the
MPE of CMat for white noise.
3.2. Numerical results for model data This observation is qualitatively confirmed when turning to the
multivariate case of two correlated data channels (model B). In
In all figures displaying results derived from model data, the this case CMat and SMat behave also quantitatively similar as in
dashed line (green curve) indicates the results of the case of white the bivariate case of model A. The sensitivity and the magnitude
noise, the solid line (red curve) the corresponding results obtained of fluctuations are comparable for both measures and they ap-
for 1/f -noise and finally the thick solid line (black curve) shows the proximate the analytical result Cgen satisfactorily. For this reason
A.O. Marín García et al. / Neural Networks 46 (2013) 154–164 159
a b
c d
e f
g h
Fig. 1. Shown is the amount of genuine correlation for the bivariate case in panel a to f (model A) and the multivariate case of two correlated signals (model B) in panel g
and h. Green and red curves correspond to the realization of the models with white and 1/f noise respectively. Shown are median values for 100 realizations and the 95%
significance interval. (a) Synchronization index, (b) CCS-coefficient, (c) and (d) shows CMat based on the Pearson coefficient and its absolute value respectively, (e) and (f)
the same as (c) and (d) for SMat . The panels (g) and (h) display the results for the synchronization index and the CCS-coefficient respectively. The bold continuous presents
the analytical result Cgen , the thin dotted line in panel (a) and (b) and equivalently in panels (g) and (h), reflect the analytical results for SIa and CCSa respectively. (For
interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
only the results for the synchronization index SI (Fig. 1(g)) and the concise overview over the performances of the measures also for
CCS-coefficient (Fig. 1(h)) are shown. However, Table 1 provides a the analysis of this model. In short, the lowest relative error is again
160 A.O. Marín García et al. / Neural Networks 46 (2013) 154–164
a b
c d
e f
g h
Fig. 2. Results for model C (three mutually independent groups) are displayed in panels (a) to (f) and model D (one correlation cluster containing 11 channels) in panels
(g) and (h). Color coding is similar to Fig. 1. Shown are median values for 100 realization and the 95% significance interval. (a) Synchronization index, (b) CCS-coefficient,
(c) and (d) show CMat based on the Pearson coefficient and its absolute value respectively, (e) and (f) the same as t (c) and (d) for SMat . The panels (g) and (h) display the
results for the synchronization index and the CCS-coefficient respectively. The bold continuous presents the analytical result Cgen , the thin dotted line in panel (a) and (b)
and equivalently in panels (g) and (h), reflects the analytical results for SIa and CCSa respectively. (For interpretation of the references to colour in this figure legend, the
reader is referred to the web version of this article.)
162 A.O. Marín García et al. / Neural Networks 46 (2013) 154–164
Fig. 3. Results of the analysis of intracranial data of epilepsy patients: panel (a) and (b) seizure 1 of patient 1, panel (c) and (d) seizure of patient 2 and panel (e) and (f) for
patient 3. Panels (a), (c) and (e) display the CCS-coefficient (blue) and the synchronization index (red) and panel (b), (d) and (f) present the result for SMat (blue) and CMat
(red). The vertical black lines indicate seizure on- and offset. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of
this article.)
is slightly more sensitive and more precise in approximating the this peak constitutes the maximal value during the whole time
analytical results Cgen , it was expected that in general SMat > CMat . course shown in Fig. 3(c). Furthermore, SI does not encounter
As the differences in the performance are moderate, one could also the broad minimum just after seizure offset. Instead, during the
anticipate that both measures show the same time evolution. How- entire post-seizure period it fluctuates somehow around a nearly
ever, qualitative differences can be seen in comparison with the constant average value. On the other hand, neither SMat nor CMat
CCS-coefficient and the synchronization index. indicate such a pronounced maximum during seizure as the CCS-
For instance, in the case of patient 1 (Fig. 3(a) and (b)) the coefficient and SI do. Several peaks in the pre- and post-seizure
synchronization index shows only tiny fluctuation in the pre- period encounter much larger values and also the minimum of the
seizure phase. Then it increases during seizure, but notably CCS-coefficient just after seizure offset is less pronounced by the
later than the three other measures and misses completely the SAC- and CCS-matrix.
pronounced maximum during the first half of the seizure marked Finally, for patient 3 (Fig. 3(e) and (f)) both eigenvalue-based
by them. At seizure offset it decreases gradually instead of the measures show a pronounced broad maximum during the first half
abrupt drop shown by the competing methods, such that it reaches of the seizure. SMat and CMat also present some larger fluctuations
its pre-seizure level only at the end of the shown time-course. The after seizure onset, but by no means such a prominent maximum.
CCS-coefficient shows the main characteristics during the seizure Furthermore, the CCS-coefficient shows strong fluctuations during
as the measures derived by the matrices CCS and SAC. the pre- and post-seizure period, which are absent in the time
For patient 2 (Fig. 3(c) and (d)) both eigenvalue-based measures evolution of the other measures. In this case, the CCS-coefficient
show a pronounced maximum during the seizure. In either case, even encounters values much below those of SI.
A.O. Marín García et al. / Neural Networks 46 (2013) 154–164 163
4. Conclusions the sensitivity of CMat and SMat is slightly reduced under this
condition but performs otherwise with similar quality as for the
In view of the studies concerning the model data one can draw usual Pearson-coefficient.
the following conclusions: Also the computational cost is more favorable for the SAC-
The performance of methods based on the eigenvalues of the matrix. In terms of robustness both measures show an equally sta-
interrelation matrices is far of being optimal as we revealed by ble behavior. Furthermore, the CCS- and SAC-matrix provide the
comparing their numerical estimates with those of competing opportunity to study the dynamics of the topology of the func-
techniques and analytical results. The mean percentage error with tional network of a spatially extended system. In this case, one may
respect to the analytical results SIa and CCSa is in all considered understand the matrix as a weighted network and one may ap-
cases (with exception of model A) at least twice the MPE-value ply graph-theoretical tools in order to obtain detailed information
derived for the matrix based measures but occasionally it is more about e.g. the efficiency of information transfer, clustering, vul-
than an order of magnitude larger (compare with white noise val- nerability and other network properties (Horstmann et al., 2010;
ues of SMat for model B, C and D of Table 1). Furthermore, it turns out Kramer & Cash, 2012; Kramer, Kolaczyk, & Kirsch, 2008; Lehnertz
that the sensitivity CCS-coefficient reduces drastically for the de- et al., 2009; Schindler, Bialonski, Horstmann, Elger, & Lehnertz,
tection of small genuine correlation structures. In this case, also the 2008; Stam & Reijneveld, 2007).
robustness of this measure against the amount of random correla- In the present study we refrained from examining the depen-
tions is drastically reduced (Fig. 1(h)). On the other hand, it seems dence of the results on the parameters T , NT and NL . In previous
that the synchronization index shows similar deficiencies when studies the dependence of CCS matrix elements calculated from
several independent clusters are formed (Fig. 2(g)). In any case, the EEG signals turned out rather weak (Rummel et al., 2011). We do
sensitivity and robustness of both eigenvalue based measures is not expect a different finding for the measures discussed in the
deficient in comparison to the performance of SMat and CMat . The present paper. The reason is that in all cases the non-parametric
information of the whole amount of random as well as genuine and robust Mann–Whitney–Wilcoxon U-test is used for hypothe-
correlations of the multivariate data set is condensed in the distri- sis testing.
bution of eigenvalues. The separation of both contributions is more The application to exemplary intracranial EEG recordings
difficult when M eigenvalues are considered than M (M − 1)/2 ma- (Fig. 3) clearly illustrates how the deficiencies of eigenvalue-based
trix elements directly. measures may lead to serious discrepancies when interpreting the
However, we find that the most crucial deficiencies of both
results. First one notices that the CCS-coefficient provides in gen-
eigenvalue-based measures constitute on the fact that they do not
eral too large estimates. However, this might be of less importance
only depend on the strength of genuine correlations and the size of
if at least the time evolution reproduces qualitatively the correct
the correlation structure, but also on the topology of the correlation
correlation dynamics. Unfortunately this is not the case, neither
pattern. Beside that the numerical estimates of SI and CCS provide
for the CCS-coefficient, nor for the synchronization index. There
in general only a poor approximation of their corresponding ana-
may appear pronounced maxima in situations where the amount
lytical counterpart, they differ almost always drastically from Cgen .
of genuine correlations does not change at all. Broad regions of in-
Although they are originally declared as measures for the strength
creased correlations or sharp well pronounced peaks might occur
or total amount of genuine correlations (Cui et al., 2010; Müller
as illustrated in the seizure period of Fig. 3(c) and (e). Finally, also a
et al., 2011, 2008) they depend strongly on the spatial arrangement
systematic underestimation and a complete missing of prominent
of correlations.
maxima as shown in Fig. 3(a) and (d) may happen. Especially the
For instance, if the correlation pattern changes from a larger
drastically different behavior during critical periods of the record-
correlation cluster to several smaller structures such that the over-
ings, namely seizure on- and offset or the immediate post-seizure
all magnitude of genuine cross-correlations stays constant, eigen-
value based measures nevertheless might show a drastic increase period may cause confusions and misleading neurophysiologic in-
(compare the results derived from model C and D). Therefore, al- terpretations. In order to understand the brain dynamics generat-
ternations of this correlation measure do not necessarily indicate ing epileptic seizures or if one tries to discover the mechanism used
variations of the amount of genuine correlations but simply topo- by the brain to terminate the crisis, one needs definitely reliable
logical changes of the correlation pattern. measures whose numerical estimates are qualitatively and quan-
Only the measure derived from the CCS- and the SAC-matrix titatively correct (Netoff & Schiff, 2002; Schindler, Leung, Elger, &
show under all test conditions considered so far a stable and sat- Lehnertz, 2007).
isfying performance. For all cases (small, medium and large cor- In summary, we find measures based on the eigenvalues of
relations structures, few or many cluster formation) they reflect the equal-time cross-correlation matrix to be problematic when
quantitatively the amount of genuine correlations of a multivari- for the analysis of real world data. When analyzing the time
ate data set with a good accuracy. That is because the compari- evolution of cross-correlations, quantitative as well as qualitative
son with surrogate data as expressed in formula (7) and (9) discrepancies may occur. The reason is that the spectrum of
allows a reasonable estimate of genuine correlations from rather eigenvalues of the correlation matrix does not only contain
short time series. In particular SACij provides a good approxima- information about the magnitude of the cross-correlations, but also
tion of the correlation integral (2), which has to be evaluated over about their spatial distribution. Each spatial correlation structure
an infinite range. The small mean percentage errors of SMat are provokes its particular repulsion scheme of the eigenvalues (Baier
mainly caused by its reasonably good but still finite sensitivity; et al., 2007; Müller et al., 2005, 2006). Hence, eigenvalue based
small genuine correlations are not detected because the nonpara- measures automatically contain structural information about the
metric Mann–Whitney–Wilcoxon-rank may not distinguish origi- functional network and the time evolution of such measures
nal data from surrogates. Otherwise, the median values of SMat drop present a uncontrolled mixture of strength and topology features
precisely upon the analytical curves of Cgen . of the correlation pattern of a multivariate data set.
Furthermore, both concepts deliver reasonable results, when The CCS and SAC-matrix, on the other hand provide sensitive
the interrelation matrix is based on a positive measure. This and robust measures, which permit quantitative estimates of the
implies that our results can be expected to hold also when applied amount of genuine correlations within a multivariate data set
to measures as e.g. mutual information or phase coherence instead (as defined for time series of infinite length) even from finite
of the Pearson coefficient. Taking the absolute value of the equal- and eventually short time series. Additionally, they offer the
time cross-correlations has simulated this situation. It seems that possibility to study structural properties of the functional network
164 A.O. Marín García et al. / Neural Networks 46 (2013) 154–164
and provide detailed information about the spatial distribution Kramer, M. A., Kolaczyk, E. D., & Kirsch, H. E. (2008). Emergent network topology at
of cross-correlations. By using so called shift surrogates (Netoff seizure onset in humans. Epilepsy Research, 79, 173–186.
Kreuz, T., Mormann, F., Andrzejak, R. G., Kraskov, A., Lehnertz, K., & Grassberger, P.
& Schiff, 2002) instead of using the computationally expensive (2007). Measuring synchronization in coupled model systems: a comparison of
iterative Fourier-based surrogates (Schreiber & Schmitz, 2000) the different approaches. Physica D, 225, 29.
complexity of the methods can be considerably reduced. We also Laloux, L., Cizeau, P., Bouchaud, J.-P., & Potters, M. (1999). Noise dressing of financial
correlation matrices. Physical Review Letters, 83, 1467.
checked for the performance of the methods using shift surrogates Lehnertz, K., Bialonski, S., Horstmann, M. T., Krug, D., Rothkegel, A., Staniek, M.,
obtaining qualitatively similar results. However, the time saving et al. (2009). Synchronization phenomena in human epileptic brain networks.
might be orders of magnitude in dependence of the number of data Journal of Neuroscience Methods, 183, 42–48.
Mormann, F., Kreuz, T., Rieke, C., Adrrzejak, R. G., Kraskov, A., David, P., et al. (2005).
channels of the multivariate data set. On the predictability of epileptic seizures. Clinical Neurophysiology, 116, 569.
Furthermore, the results obtained for genuine correlation Müller, M., Baier, G., Galka, A., Stephani, U., & Muhle, H. (2005). Detection and
matrices might be relevant for the usage of sample co-variance characterization of changes of the correlation structure in multivariate time
series. Physical Review E, 71, Art. No. 46116.
matrices (Raudys & Sausargiene, 1998) in the context of pattern
Müller, M., Baier, G., López Jiménez, Y., Marín García, A. O., Rummel, C., &
recognition. With the increased velocity by using shift surrogates Schindler, K. (2011). Evolution of genuine cross-correlation strength of focal
also an application as a kind of control mechanism during the onset seizures. Journal of Clinical Neurophysiology, 28(5), 450–462.
learning phase of neural networks used for adaptively extracting Müller, M., Baier, G., Rummel, C., & Schindler, K. (2008). Estimating the strength
of genuine and random correlations in non-stationary mutlivariate time series.
cross-correlation features of large data streams by neural networks Europhysics Letters, 84, Art. No. 10009.
(Feng, 2004), is conceivable. Müller, M., López Jiménez, Y., Rummel, C., Baier, G., Galka, A., Stephany, U., et al.
Finally, this study demonstrates, that testing only whether a (2006). Localized short range correlations in the spectrum of the equal time
correlation matrix. Physical Review E, 74, Art. No. 41119.
measure detects a certain coupling between dynamical units may Münnix, M., Shimada, T., Schäfer, R., Leyvraz, F., Seligman, T. H., Guhr, T., et al.
not be sufficient. Assessing the qualitative and quantitative behav- (2012). Identifying states of a financial market. Scientific Reports, 2, 644.
ior of a measure is indispensable in order to avoid misinterpreta- Netoff, T. I., & Schiff, S. J. (2002). Decreased neural synchronization during
experimental seizures. Journal of Neuroscience, 22, 7297–7307.
tion of numerical results when applied to the real world data. Plerou, V., Gopikrishnan, P., Rosenow, B., Nunes Amaral, L. A., Guhr, T., & Stanley,
H. E. (2002). Random matrix approach to cross correlations in financial data.
Physical Review E, 65, Art. No. 066126.
Acknowledgments
Plerou, V., Gopikrishnan, P., Rosenow, B., Nunes Amaral, L. A., & Stanley, H. E. (1999).
Universal and nonuniversal properties of cross correlations in financial, time
M.M is grateful for the financial support of CONACyT Mexico series. Physical Review Letters, 83, 1471.
(Project No. 156667). K.S. is grateful for the support of the Raudys, S., & Sausargiene, A. (1998). Structures of the covariance matrices in the
classifier design. In Lecture notes in computer science: vol. 1451. Advances in
study by the Swiss National Science Foundation (project no. SNF pattern recognition (p. 583).
320030_122010). Rummel, C. (2008). Quantification of intra- and inter-cluster relations in
nonstationary and noisy data. Physical Review E, 77, 106708.
Rummel, C., Abela, E., Müller, M., Hauf, M., Scheidegger, O., Wiest, R., et al. (2011).
References Uniform approach to linear and nonlinear interrelation patterns in multivariate
time series. Physical Review E, 83, 066215.
Rummel, C., Baier, G., & Müller, M. (2007). Automated detection of time-dependent
Ansari-Asl, K., Senhadji, L., Bellanger, J.-J., & Wendling, F. (2006). Quantitative cross-correlation structure in nonstationary time series. Europhysics Letters, 80,
evaluation of linear and nonlinear methods characterizing interdependencies 68004.
between brain signals. Physical Review E, 74, Art. No. 031916. Rummel, C., Müller, M., Baier, G., Amor, F., & Schindler, K. (2010). Analyzing
Baier, G., Müller, M., Stephani, U., & Muhle, H. (2007). Characterizing correlation spatio-temporal patterns of genuine cross-correlations. Journal of Neuroscience
changes of complex pattern transitions: the case of epileptic activity. Physics Methods, 191, 94–100.
Letters. A, 363, 290. Rummel, C., Müller, M., & Schindler, K. (2008). Data-driven estimates of the Lumber
Carmeli, C., Knyazeva, M. G., Innocenti, G. M., & de Feo, O. (2005). Assessment of of clusters in multivariate time series. Physical Review E, 78, 066703.
EEG synchronization based on state-space analysis. NeuroImage, 24, 339. Schindler, K. A., Bialonski, S., Horstmann, M. T., Elger, C. E., & Lehnertz, K. (2008).
Cui, D., Liu, X., Wan, Y., & Li, X. (2010). Estimation of genuine and random Evolving functional network properties and synchronizability during human
synchronization in multivariate neural series. Neural Networks, 23, 698. epileptic seizures. Chaos, 18, 033119.
Feng, Da-Zheng (2004). A neural network learning for adaptively extracting Schindler, K., Leung, H., Elger, C. E., & Lehnertz, K. (2007). Assessing seizure
cross-correlation features between two high-dimensional data streams. IEEE dynamics by analysing the correlation structure of multichannel intracranial
Transactions on Neural Networks, 15, 1541. EEG. Brain, 130, 65.
Horstmann, M. T., Bialonski, S., Noennig, N., Mai, H., Prusseit, J., Wellmer, J., et al. Schreiber, T., & Schmitz, A. (2000). Surrogate time series. Physica D, 142, 346.
(2010). State dependend properties of epileptic brain networks: comparative Stam, C. J., & Reijneveld, J. C. (2007). Graph theoretical analysis of complex networks
graph theoretical analysis of simultaneously recorded EEG and MEG. Clinical in the brain. Nonlinear Biomedical Physics, 1, 3. http://dx.doi.org/10.1186/1753-
Neurophysiology, 121, 172–185. 4631-1-3.
Kantz, H., & Schreiber, T. (2004). Nonlinear time series analysis (2nd ed.). Cambridge Wieser, H. G., Blume, W. T., Fish, D., Goldensohn, E., Hufnagel, A., King, D., et al.
University Press. (2001). ILAE commission report. Proposal for a new classification of outcome
Kramer, M. A., & Cash, S. S. (2012). Epilepsy as a disorder of cortical network with respect to epileptic seizures following epilepsy surgery. Epilepsia, 42,
organization. Neuroscientist, http://dx.doi.org/10.1177/1073858411422754. 282–286.