arXiv:1503.07493v1 [nlin.CD] 25 Mar 2015


Nonlinear time-series analysis revisited a)

Elizabeth Bradley b)
Department of Computer Science, University of Colorado, Boulder, CO 80309-0430, USA and
Santa Fe Institute, Santa Fe, New Mexico 87501, USA

Holger Kantz c)
Max Planck Institute for the Physics of Complex Systems, Nöthnitzer Str. 38, D-01187 Dresden, Germany

In 1980 and 1981, two pioneering papers laid the foundation for what became known as nonlinear time-series analysis: the analysis of observed data, typically univariate, via dynamical systems theory. Based on the concept of state-space reconstruction, this set of methods allows us to compute characteristic quantities such as Lyapunov exponents and fractal dimensions, to predict the future course of the time series, and even to reconstruct the equations of motion in some cases. In practice, however, there are a number of issues that restrict the power of this approach: whether the signal accurately and thoroughly samples the dynamics, for instance, and whether it contains noise. Moreover, the numerical algorithms that we use to instantiate these ideas are not perfect; they involve approximations, scale parameters, and finite-precision arithmetic, among other things. Even so, nonlinear time-series analysis has been used to great advantage on thousands of real and synthetic data sets from a wide variety of systems ranging from roulette wheels to lasers to the human heart. Even in cases where the data do not meet the mathematical or algorithmic requirements to assure full topological conjugacy, the results of nonlinear time-series analysis can be helpful in understanding, characterizing, and predicting dynamical systems.

PACS numbers: 05.45.Tp


Keywords: time series analysis, other things

a) EB thanks the Max-Planck-Institut für Physik komplexer Systeme for hosting the visit during which this paper was written.
b) Electronic mail: [email protected]
c) Electronic mail: [email protected]

Nonlinear time-series analysis comprises a set of methods that extract dynamical information about the succession of values in a data set. This framework relies critically on the concept of reconstruction of the state space of the system from which the data are sampled. The foundations for this approach were laid around 1980, when deterministic chaos became a popular field of research and scientists were looking for evidence of chaos in natural and laboratory systems. One of the first, and still most spectacular, applications was the prediction of the path of a ball on a roulette wheel, which nicely demonstrated the power of these methods. Since then, nonlinear time-series analysis has left this narrow niche and moved into much broader use across all branches of science and engineering, as well as social science, the humanities, and beyond.

I. WHY NONLINEAR TIME SERIES ANALYSIS?

The goal of time-series analysis is to learn about the dynamics behind some observed time-ordered data. Early approaches to this employed linear stochastic models, more precisely, autoregressive (AR) and moving average (MA) models¹. These stationary Gaussian stochastic processes are fully characterized by their two-point auto-correlation function

c(τ) = ⟨x_t x_{t+τ}⟩ / ⟨x_t²⟩    (1)

or by their power spectrum, respectively. There are many data sets where this type of analysis leads to a good characterization, such as temperature anomalies: differences between the daily (maximum, mean, minimum) temperature at a given place and the many-year average of that quantity for the corresponding calendar day. Data of this type possess an almost Gaussian distribution with an almost exponentially decaying auto-correlation function; typically, the null hypothesis that they are generated by an AR(1) process cannot be rejected easily on the basis of observed data.

Of course, we know that temperatures can be predicted much more accurately by high-dimensional physics-based models of the atmosphere than by AR(1) models. That scalar temperature data look like AR data comes from the projection of dynamics in a high-dimensional state space onto a single quantity. This illustrates that, depending on one's point of view and one's access to a system's variables, the very same system might appear to have very different complexity.

As in any other analysis, the choice of a specific time-series analysis method requires justification by some hypothesis about the appropriate data model. Time-series analysis is essentially data compression: we compute a few characteristic numbers from a large sample of data. This reduced information can only enhance our knowledge about the underlying system if we can interpret it, and it becomes interpretable through the fact that the
chosen number has some specific meaning within some model framework. If the data do not stem from the appropriate model class, the chosen quantity might not make much sense, even if we can compute its numerical value on the given data set using some numerical algorithm. An illustrative example is the computation of the mean and the variance of some sample: we know how to do this, but are these two numbers always meaningful? If the hypothesis is well justified that the observed data are a sample from a Gaussian distribution, then these numbers characterize it completely and there is nothing else to compute. If, on the other hand, the data stem from a bimodal distribution, then the (still well defined) mean value is very atypical and the variance is not the most interesting feature.

The collection of ideas and techniques known as nonlinear time-series analysis can be extremely effective when the data model is deterministic dynamics in some state space. This analysis framework allows us to solve an inverse problem of considerable complexity: from data, we can infer properties of the invariant measure of some hidden dynamical system. In the best case, we can even determine equations of motion. And, if the underlying system is deterministic and low dimensional, this analysis framework brings out the relationships between geometry (fractal dimension), instability (Lyapunov exponents), and unpredictability (K-S entropy), which is a beautiful theoretical result from ergodic theory. Of course, the assumption of determinism makes these methods largely unsuitable for characterizing stochastic aspects of data. Anomalous diffusion, as first observed in Hurst's study of time-series data of the river Nile², is nowadays studied using detrended fluctuation analysis³; behavior like this is a signature of both nonlinearity and non-Gaussianity in the underlying stochastic process.

In this article, we want to describe, without too much detail or any attempt at a comprehensive bibliography, the ideas and concepts of nonlinear time-series analysis, to give a fair account of their usefulness, and to offer some perspectives for the future. Readers who want to enter this subject more deeply should consult one of the many comprehensive review articles or useful monographs on this topic, such as⁴,⁵.

II. THE BASICS

State-space reconstruction is the foundation of nonlinear time-series analysis. This quite remarkable result, which was first proposed by Packard et al. in 1979 and 1980⁶,⁷ and formalized by Takens soon thereafter⁸, allows one to reconstruct the full dynamics of a complicated nonlinear system from a single time series, in principle. The reconstruction is not, of course, identical to the internal dynamics, or this procedure would amount to a general solution to control theory's observer problem: how to identify all of the internal state variables of a system and infer their values from the signals that can be observed.

Even so, these reconstructions, if done right, can still be extremely useful because they are guaranteed to be topologically identical to the full dynamics. And since many important properties of dynamical systems are invariant under diffeomorphism, this means that conclusions drawn about the reconstructed dynamics also hold for the true dynamics of the system.

A. Delay-coordinate embedding

The standard strategy for state-space reconstruction is delay-coordinate embedding, where a series of past values of a single scalar measurement y from a dynamical system are used to form a vector that defines a point in a new space. Specifically, one constructs m-dimensional reconstruction-space vectors R(t) from m time-delayed samples of the measurements y(t), such that

R(t) = [y(t), y(t−τ), y(t−2τ), …, y(t−(m−1)τ)]

An example is shown in Figure 1. Mathematically, one can equivalently take forward delays instead of backward ones, but for practical purposes (e.g., predictions) it is better to obey causality in one's notation. If τ is very small, the m coordinates in each of these vectors are strongly correlated and so the embedded dynamics lie close to the main diagonal of the reconstruction space; as τ is increased, that reconstruction unfolds off that subspace.

The original embedding theorems only require that τ be nonzero and not a multiple of any orbit's period. This is only valid, however, when one is using real-valued arithmetic on an infinite amount of noise-free data from perfect sensors. In practice, when noisy, finite-length time-series data and floating-point arithmetic are involved, one needs a higher τ to properly unfold the dynamics off the main diagonal. The τ = 1 embedding in Figure 1, for instance, will be indistinguishable from a diagonal line if its thickness is smaller than the measurement noise level. Since improperly unfolded reconstructions are not topologically conjugate to the true dynamics, this is a real problem. For this and other reasons, it can be a challenge to estimate good values for the τ parameter, as described in more depth in Section II B.

The original embedding theorems also require m > 2d to assure topological conjugacy, where d is the true dimension of the underlying dynamics. The trajectory crossings in the two-dimensional projection of the embedded Rössler data in Figure 1, for instance, do not exist in the real attractor, and so the two structures do not have the same topology. Sauer et al.⁹ loosened this requirement to m > 2d_A, where d_A is the capacity dimension of the attractor. In practice, however, d is rarely known and d_A cannot be calculated without first embedding the data. A large number of heuristic methods have been proposed to work around this quandary. Many of these methods are computationally expensive, most of them
FIG. 1. A time series from the Rössler system (top) and a number of delay-coordinate embeddings of that time series with different values of the delay parameter τ.
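As a concrete illustration of the delay-coordinate construction, here is a minimal sketch in Python/NumPy; the helper name `delay_embed` and the array layout are our own, not part of any standard package.

```python
import numpy as np

def delay_embed(y, m, tau):
    """Return the (N - (m-1)*tau) x m matrix of delay vectors
    R(t) = [y(t), y(t - tau), ..., y(t - (m-1)*tau)]."""
    y = np.asarray(y)
    n = len(y) - (m - 1) * tau
    if n <= 0:
        raise ValueError("time series too short for this (m, tau)")
    # Row t holds the delay vector anchored at time t + (m-1)*tau,
    # coordinates ordered from most recent to oldest sample.
    return np.column_stack([y[(m - 1 - k) * tau : (m - 1 - k) * tau + n]
                            for k in range(m)])

# Example: embed a short ramp with m = 3, tau = 2.
R = delay_embed(np.arange(10), m=3, tau=2)
print(R.shape)   # (6, 3)
print(R[0])      # [4 2 0]
```

Each row is one reconstruction-space vector: with m = 3 and τ = 2, the first row [4, 2, 0] is [y(4), y(2), y(0)].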

require significant interpretation by, and input from, a human expert, and all of them break down if one has a short or noisy time series. These methods, and their limitations, are also discussed in Section II B.

There are two other requirements in the delay-coordinate embedding theorems, one of which is implicit in the formula above: that one has evenly spaced values of y. Data-acquisition systems do not have perfect time bases, so this can be a problem in practice. An obvious workaround here is interpolation, but then one is really studying a mix of real and interpolated dynamics. The final requirement is that the measurement process that produces y is a smooth, generic function on the state space of the system. This will not be the case, for instance, if some event counter in the data-acquisition system overflows. It can be hard to know whether the measurement function satisfies the theoretical requirement; strategies for doing so include changing the sampling frequency or measuring a different quantity and then repeating the analysis. If the results do not change, one can be more confident that they are correct. Formal proofs of that correctness, of course, are not possible because of the nature of real-world data and digital computers.

Multivariate time-series data are useful for other reasons besides the corroboration that is afforded via individual analyses of different components. It is also possible to perform multi-element reconstructions that combine the information in those components. In their 1980 paper⁷, Packard et al. conjectured that any m quantities that "...uniquely and smoothly label the states" of the attractor could serve as effective elements of a reconstruction-space vector. This powerful idea is used surprisingly rarely, even though it is fully supported by all routines of the TISEAN software package¹⁰. With the improvement of sensor technology, the dynamical analysis of multivariate data will likely become more important in the coming years, as discussed in Section V.

The kind of due-diligence exercise mentioned above is critical to the success of any nonlinear time-series analysis task. Data length, noise, nonstationarity, algorithm parameters, and the like have strong effects on the results, and the only way to know whether those effects are at work in one's results is to repeat the analysis while manipulating the data (downsampling, for instance, or analyzing the first and last half of the data set separately) and the analysis parameters: the m and τ values, the algorithmic scale parameters, etc. If one can also manipulate the experimental parameters, repeated analyses can reveal whether the data are sampled too coarsely in time to capture the details of the dynamics, or for too short a period to sample its overall structure.

B. Estimation of embedding parameters

The theoretical requirements on the embedding parameters, the delay τ and the embedding dimension m, are, as mentioned in the previous section, quite straightforward. In practice, however, one does not know the dimension of the system under study, nor does one have perfect data or a computer that uses infinite-precision arithmetic. Estimating good values for m and τ in the face of these difficulties is one of the main challenges of delay-coordinate embedding. Dozens of methods for doing so have been developed in the past few decades; we will only cover a few representative members of this set.

In traditional practice, one chooses τ first, most of-
ten by computing a statistic that measures the independence of τ-separated points in the time series. The first zero of the autocorrelation function of the time series, for instance, yields the smallest τ that maximizes the linear independence of the coordinates of the embedding vector; the first minima of the average mutual information¹¹ or the correlation sum¹²,¹³ occur at τ values that maximize more-general forms of independence. (One wants the smallest τ that is reasonable because the reconstructed attractor can fold over on itself as τ grows, causing other problems.) There are also geometry-based strategies for estimating τ by, for example, examining the continuity on the reconstructed attractor or the amount of space that it fills. While there has been some theoretical discussion¹⁴ of what constitutes an optimal τ, there are no universal strategies for putting those ideas into practice, especially since the process is system-dependent, and since a τ that works well for one purpose (e.g., prediction) may not work well for another (e.g., computing dynamical invariants).

After choosing a value for τ, the next step is to estimate the embedding dimension m. As in the case of τ, bigger is not necessarily better, since a single noisy point in the time series will affect m of the points in an m-embedding, so one wants the smallest m that affords a topologically correct result. There are two broad families of approaches to this, one based on the false near neighbor (FNN) algorithm of Kennel et al.¹⁵ and another that might be termed the asymptotic invariant approach. In the latter, one embeds the data for a range of dimensions, computes some dynamical invariant (e.g., those discussed in Section III), and selects the m where the value of that invariant settles down. In an FNN-based method, one embeds the data, computes each point's near neighbors, increases the embedding dimension, and repeats the near-neighbor calculation. If any of the relationships change, i.e., some neighbor in k dimensions is no longer a neighbor in k+1 dimensions, that is taken as an indication that the dynamics were not properly unfolded with m = k. Noise also disturbs neighbor relationships, though, and thus can affect the operation of FNN-based algorithms. No member of either family of methods provides a guarantee, but both offer effective strategies for estimating m. Again, it can be very useful to employ several different methods to corroborate one's results.

This two-step process is not the only approach. It has been noted that what really matters is the mτ product, i.e., how much of the data are spanned by the embedding vector, and thus that estimating the two parameters at the same time, in combination, may be more effective¹⁶,¹⁷. It has also been suggested that one need not use the same τ across all m coordinates of the embedding vector, i.e., that a systematically skewed embedding space may correspond better to the true dynamics¹⁸.

III. MATHEMATICAL BEAUTY: CHARACTERIZATION OF THE INVARIANT MEASURE

The invariant measure of a dynamical system can be characterized in a number of different ways: the fractal dimension of the invariant set, for instance, from the point of view of state-space geometry, or the Kolmogorov-Sinai (K-S) entropy if one is interested in uncertainty about the future of a chaotic trajectory. The stability with respect to infinitesimal perturbations can be quantified by the Lyapunov exponents. The topological equivalence guaranteed by the embedding theorems allows all of these quantities, and many others not mentioned here, to be determined from the time-series data.

1. Dimension estimates

There is a whole family of fractal dimensions D_q, usually called the Rényi dimensions. Their most intuitive definition is through a partitioning of the state space: the number of boxes N of size ε needed to cover a fractal set with dimension D₀ scales with the box size ε as ε^(−D₀). This is an evident generalization of the integer dimensions, as one can easily verify: a line segment, for instance, will yield D₀ = 1 via this procedure, regardless of whether the surrounding space has two, three or more dimensions. D₀, often called the capacity dimension, is closely related to the Hausdorff dimension¹. For the generalized dimensions, one has to determine the measure on every box from the partition and raise that measure to the power q, with Σᵢ pᵢ^q ∝ ε^((1−q) D_q) for ε → 0 and pᵢ being the weight on the ith box.

Direct application of these box-counting methods to the points in the reconstructed state space is possible, but involves significant memory and processing demands and its results can be very sensitive to data length. A more-efficient, more-robust estimator of fractal dimensions is the Grassberger-Procaccia correlation sum¹⁹. We recall only the simplest version, which yields D₂. Rather than count boxes that are occupied by data points, one instead examines the scaling of the correlation sum as a function of ε:

C₂(m, ε) := [1 / (2N(N−T))] Σᵢ Σ_{j<i−T} Θ(ε − ‖x⃗ᵢ − x⃗ⱼ‖)    (2)

where Θ is the Heaviside step function. C₂(m, ε) represents the fraction of pairs of data points in the m-dimensional embedding space whose spatial distance (measured by the Euclidean or maximum norm) is smaller than the scale ε. This number scales as ε^(D₂) if

¹ There is a prominent exception to this statement: while the Hausdorff dimension of the rational numbers is zero, as for any countable set of isolated points, their capacity dimension is 1 because they are dense.
m > D₂ ²⁰. The parameter T, going back to Theiler²¹, ensures that the temporal spacing between potential pairs of points is large enough to represent an independently identically distributed sample².

Formally, of course, the dimension of any finite point-set data should be zero. In the limit as ε → 0, methods that simply count occupied boxes correctly reflect this fact. In nonlinear time-series analysis, however, we are interested in the dimension of the set that is represented by the point-set data. The correlation sum provides an unbiased estimator for that quantity, and one that is accurate for small ε, unlike the box method, which is strongly biased towards small D values in this limit²².

There is a conundrum involved in any estimation of the dimension of a delay-coordinate embedding, which is sometimes known as the conflict between redundancy and irrelevancy¹⁴. Specifically, in order to assure that successive elements of a delay vector are independent, the time lag τ should be sufficiently large. This can, however (as mentioned in the second paragraph of Section II B), overfold the reconstructed dynamics, especially if the embedding dimension is high. In these situations, it can require extremely well-sampled data in order to correctly resolve the folds and voids in complicated chaotic attractors. One can turn this reasoning around to estimate the number of data points N needed to estimate the dimension of a data set²³; a pessimistic answer to this is N ≈ 100^(D₂) e^(D₂ h₂ τ), where h₂ is the correlation entropy of the dynamics, τ the time delay of the reconstruction, and e^(D₂ h₂ τ) describes the effects of folding in the delay embedding space due to the minimal embedding dimension m > D₂. Among other things, this means that the number of points needed to estimate the dimension of chaotic dynamics reconstructed from a scalar time series is much larger than in the original state space, where the entropy factor can be ignored and N > 42^(D₂) has been suggested²⁴.

2. Lyapunov exponents

Dimension estimates have pitfalls and caveats, but they are quite robust. Estimates of Lyapunov exponents are unstable. A number of creative strategies have been developed for estimating the full set of m Lyapunov exponents λ_k in the m-dimensional embedding space (e.g.,²⁵); there are also many algorithms for estimating λ₁, the largest exponent, alone (e.g.,²⁶,²⁷). Every one of these algorithms involves free parameters, however, and their results are often extremely sensitive to the values of those parameters, as well as to data length, noise, and the like. When working with reconstructed dynamics, one must also be aware of the issue of spurious exponents, since the number of Lyapunov exponents is equal to the number of dimensions in the ambient space. Scalar time-series data sampled from D-dimensional dynamical systems are typically embedded in m dimensions with m > D, and those dynamics have m Lyapunov exponents. Ideally, one would like to find D exponents that correspond to those of the original dynamics, or at least to identify the m − D extra ones that are spurious. There is a neat theory that predicts the numerical values of these spurious exponents in lowest-order approximation²⁸, but this cannot usually be reproduced in practice due to inaccessibility of these scales²⁹.

3. The Kolmogorov-Sinai entropy

Theoretically, the K-S entropy (rate) h_KS can be determined via Pesin's identity³⁰, which states that it is the sum of the positive Lyapunov exponents. Since spurious exponents are hard to identify, though, and can even be positive, it is difficult to put this into practice in the context of embedded data (or to use the Kaplan-Yorke formula in order to determine the Lyapunov dimension). Rather, one typically estimates h_KS through refined partitions, closely following its definition (e.g.,³¹). The most straightforward implementation of this approach discretizes the space of joint probabilities and searches for sequences of successive delay vectors in specific sequences of boxes. As in the case of box-counting implementations of fractal dimension calculations, this can lead to underestimation: a sequence that exists in the underlying dynamics may not be sampled by a given set of observations. In the box-counting implementation, every sequence with estimated probability 0 will systematically reduce the estimate of the K-S entropy. A way around this is to compute the correlation entropy (rate) h₂, which can be estimated by the correlation sum. To do this, one calculates Eq. (2) for a range of dimensions m that are larger than the assumed minimum for an embedding, obtaining h_m(ε) = ln C(m, ε) − ln C(m+1, ε). Ideally, for some range of ε, one should see a convergence of h_m(ε) → h₂ for large m³. For a consistency check, one can then go back to Pesin's identity and compare the estimate of h₂ to the sum of the positive Lyapunov exponents.

IV. WHAT PRACTITIONERS NEED

A precise characterization of the invariant measure is not the goal of most time-series analyses; moreover, few real-world data sets are measured by perfect sensors operating on low-dimensional dynamics, which means that

² If T is too small, the estimate of D₂ might be biased towards too small numbers, e.g., by intermittency effects.
³ ε values above this range lead to underestimation; ε values below it lead to large fluctuations.
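The correlation sum of Eq. (2), with its Theiler window T, is straightforward to implement. The sketch below is our own minimal version, using the maximum norm and a synthetic point set of known dimension rather than embedded data; it checks that the local log-log slope recovers D₂.

```python
import numpy as np

def correlation_sum(X, eps, theiler=10):
    """C2(m, eps) in the spirit of Eq. (2): the fraction of pairs of
    points in X whose max-norm distance is below eps, skipping pairs
    closer than `theiler` steps in time (the Theiler window T)."""
    # Pairwise max-norm distances (fine for ~1000 points).
    d = np.max(np.abs(X[:, None, :] - X[None, :, :]), axis=-1)
    i, j = np.triu_indices(len(X), k=theiler + 1)   # enforce j - i > T
    return np.mean(d[i, j] < eps)

# Sanity check on a set of known dimension: points drawn uniformly
# from the unit square should give a local slope near D2 = 2.
rng = np.random.default_rng(42)
X = rng.random((1000, 2))
e1, e2 = 0.02, 0.1
c1, c2 = correlation_sum(X, e1), correlation_sum(X, e2)
slope = (np.log(c2) - np.log(c1)) / (np.log(e2) - np.log(e1))
print(round(slope, 2))
```

The same C₂ values, computed for successive m on embedded data, give the correlation-entropy estimate h_m(ε) = ln C(m, ε) − ln C(m+1, ε) discussed above.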
a proper determination of, e.g., Lyapunov exponents, is out of reach, anyway. In practice, one typically wants to describe a signal in some formalized manner, perhaps in order to discriminate between it and some other signal. Other important tasks include noise reduction, detection of changes in dynamical properties within a given signal, or prediction of its future values. In all of these situations, nonlinear time-series analysis has something to contribute.

A. Signal and system characterization

A typical task is to characterize a single time series by a small set of numbers, for the purposes of classification, for instance, or comparison with other time series. Examples include medical diagnostics (is a patient healthy or sick?) or monitoring of machines (is a lathe bearing wearing out?). In these and many other important applications, nonlinear time-series analysis offers a large zoo of useful approaches, a few of which we describe below.

1. Surrogate data

In cases where strong evidence for some property is missing, one must resort to statistical hypothesis testing. With a finite data set, one can never prove results about the underlying dynamics; one can only calculate the probability that a particular finding would arise by chance under a simple null hypothesis. This approach can provide some evidence that more-complex (nonlinear, chaotic) dynamics are plausible, for instance.

In nonlinear time-series analysis, the test statistics (Lyapunov exponents, entropies, prediction errors, etc.) are complicated and their probability distributions under simple null hypotheses are typically unknown. Furthermore, the simple null hypotheses are typically not so simple. In the face of these challenges, one can proceed as follows. First, one chooses a particular statistical estimator (e.g., the violation of time-inversion invariance³², which is a nonlinear property). Second, one determines its value v_d on the target data set. Third, one interprets that value by comparing it to the distribution of values v_s obtained from a large number of time series that fulfill a certain null hypothesis (e.g., of AR processes). Depending on where the computed value v_d falls in this v_s distribution, one can compute the probability of obtaining that value by chance. This provides a confidence level by which the null hypothesis can be rejected.

How does one obtain the distribution of the test statistics under the null hypothesis? This is where surrogate data³³ enter the game. These are data that share certain properties of the time series under study and also fulfill a certain null hypothesis. The idea is that if one can produce a number of such surrogate time series, one can numerically compute the distribution of the test statistic on the null hypothesis. The critical questions here are:

- which properties of the original data should be shared by the surrogates?
- what should the null hypothesis be?

Some of the answers are easy: since insufficient time-series length poses severe problems, the individual surrogate data sets should have the same length as the series under study. Others are not: ideally, for instance, each of these sequences should represent the same marginal probability distribution as the original data. Since a rather powerful null model is the class of ARMA models, it is reasonable to require the surrogate data to have the same power spectrum (more precisely, the same periodogram) as the original data, i.e., that temporal two-point correlations are identical. This is very useful when one wants to test for nonlinearities, which express themselves in nontrivial temporal n-point correlations.

The technical way to create surrogate data with identical two-point correlations and identical marginal distribution³⁴ is to Fourier transform one's original data, randomize the relative Fourier phases, back transform (this creates close-to-Gaussian random numbers with identical Fourier spectrum), and map the results onto the original time-series values by rank ordering. The third step restores the original marginal distribution but partly destroys the correlations, so the power spectrum has to be re-adjusted by Wiener filtering. Some iteration of these steps is generally required until the features of the surrogate data converge. See³⁵ for a careful discussion of this family of methods.

While surrogate data tests are very useful, and very different from the bootstrapping techniques used in other data-analysis fields, there are a number of caveats of which one must be aware when using them. Prominent among these is the nonstationarity trap: surrogates, by construction, are stationary, whereas the original data may be nonstationary. A difference in test statistics between surrogates and original data, then, might have its origin in nonstationarities rather than in nonlinearities.

2. Permutation Entropy

Since the 1950s, entropy has been a well-established measure of the complexity and predictability of a time series³⁶. This is all very well in theory; in practice, however, estimating the entropy of an arbitrary, real-valued time series is a significant challenge. The K-S entropy, for instance, is defined as the supremum of the Shannon entropy rates of all partitions³⁷, but not any partition will do for this computation. There are creative ways to work around this, as described in Section III 3. The main issue is discretization: these entropy calculations require categorical data, symbols drawn from a finite alphabet, but time-series data are usually real-valued, and binning real-valued data from a dynamical system with anything other than a generating partition can destroy the correspondence between the true and symbolized dynamics³⁸.
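A single pass of the surrogate-generation recipe described above (phase randomization followed by rank ordering) can be sketched as follows; this is our own minimal illustration, which omits the iterative spectral re-adjustment that a full implementation requires.

```python
import numpy as np

def phase_randomized_surrogate(y, rng):
    """One pass of the surrogate recipe: randomize the Fourier phases
    (preserving the periodogram), then restore the original marginal
    distribution by rank ordering."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    spec = np.fft.rfft(y)
    phases = rng.uniform(0, 2 * np.pi, len(spec))
    phases[0] = 0.0                      # keep the zero-frequency bin real
    if n % 2 == 0:
        phases[-1] = 0.0                 # Nyquist bin must stay real
    gauss_like = np.fft.irfft(np.abs(spec) * np.exp(1j * phases), n)
    # Rank ordering: map the i-th smallest surrogate value onto the
    # i-th smallest value of the original series.
    surrogate = np.empty(n)
    surrogate[np.argsort(gauss_like)] = np.sort(y)
    return surrogate

rng = np.random.default_rng(0)
y = np.sin(np.arange(512) * 0.3) + 0.1 * rng.standard_normal(512)
s = phase_randomized_surrogate(y, rng)
print(np.allclose(np.sort(s), np.sort(y)))  # True: marginal preserved
```

Because the final rank-ordering step slightly distorts the periodogram, practical implementations iterate these steps until the surrogate's features converge, as the text notes.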
7

Permutation entropy39 is an elegant way to work


around this problem. Rather than computing the statis-
tics of sequences of categorical values, as in the calcula-
tion of K-S and Shannon entropy, permutation entropy
considers the statistics of ordinal permutations of short
subsequences of the time series. If (x1 , x2 , x3 ) = (9, 1, 7),
for example, then its ordinal pattern, (x1 , x2 , x3 ), is 231
since x2 x3 x1 . The ordinal pattern of the permu-
tation (x1 , x2 , x3 ) = (9, 7, 1) is 321. To compute the per-
mutation entropy, one considers all the permutations
in the set Sn of all n! permutations of order n, determines
the relative frequency with which they occur in the time
series, {xt }t=1,...,T :

|{t|t T n, (xt+1 , . . . , xt+n ) = }|


p() =
T n+1
where | | is set cardinality, and computes
X
HP E (n) = p() log2 p()
Sn

Like many algorithms in nonlinear time-series analysis,


this calculation has a free parameter: the length n of the
subsequences used in the calculation. The key consider-
ation in choosing it is that the value be large enough to
expose forbidden ordinal patterns but small enough that reasonable statistics over the ordinals can be gathered from the given time series. When this value is chosen properly, permutation entropy can be a powerful tool; among other things, it is robust to noise, requires no knowledge of the underlying mechanisms, and is identical to the Shannon entropy for many large classes of systems40.

FIG. 2. A signal and its recurrence plot. Reproduced with permission from Chaos 12:596 (2002). Copyright 2002 AIP Publishing.

3. Recurrence plots

A recurrence plot41 is a two-dimensional visualization of a sequential data set: essentially, a graphical representation of the recurrence matrix of that sequence. The pixels located at (i, j) and (j, i) on a recurrence plot (RP) are black if the distance between the ith and jth points in the time series falls within some threshold corridor

    δ_l < ||x_i − x_j|| < δ_h

for some appropriate choice of norm, and white otherwise. These plots can be very beautiful, particularly in the case of chaotic signals; see Figure 2 for an example. (There are also unthresholded RPs, which use color-coding schemes to represent a range of distances according to hue; these are even more striking.)

RPs are useful in that they bring out correlations at all scales in a manner that is obvious to the human eye, and they are one of the few analysis techniques that work with nonstationary time-series data. But their rich geometric structure, which in the case of chaotic signals is related to the unstable periodic orbits in the dynamics42, can make them hard to interpret. Recurrence quantification analysis (RQA)43 defines a number of quantitative metrics to describe this structure: the percentage of black points on the plot, for example, or the percentage of those black points that are contained in lines parallel to (but excluding) the main diagonal. RQA has been applied very successfully to many different kinds of time-series data, notably from physiological experiments (e.g.,44). An extremely useful review article is45.

4. Network characteristics for time series

Recently, recurrence plots have been interpreted in a very different way: namely, as the adjacency matrix of an undirected network46. In this approach, an RP of an N-point time series is converted into a network of N nodes, pairs of which are connected where the corresponding entries of the adjacency matrix are non-zero. One can then determine numerical values for different network characteristics, such as centrality, shortest path length, clustering coefficients, and many more. There are some evident questions, the most relevant being about the invariance of findings under variation of the threshold value δ_l, since this value determines the link density of the network and all network characteristics become trivial in the limit of full connectivity.
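The recurrence matrix that underlies both the plots and the network view is straightforward to compute. The sketch below is our own illustration (the function name and the threshold values are ours, and the Euclidean norm is just one of the admissible choices); it builds a thresholded recurrence matrix and reads it as the adjacency matrix of an undirected network:

```python
import numpy as np

def recurrence_matrix(states, lo, hi):
    """R[i, j] = 1 iff lo < ||x_i - x_j|| < hi (Euclidean norm).
    states: (N, d) array of (possibly delay-reconstructed) state vectors."""
    states = np.asarray(states, dtype=float)
    # All pairwise distances via broadcasting: (N, 1, d) - (1, N, d) -> (N, N)
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    return ((dists > lo) & (dists < hi)).astype(int)

# Reading the RP as a network: nodes are time points, links are recurrences.
theta = np.linspace(0, 8 * np.pi, 300)
states = np.column_stack([np.sin(theta), np.cos(theta)])
A = recurrence_matrix(states, lo=0.0, hi=0.2)
np.fill_diagonal(A, 0)       # drop self-recurrences
degrees = A.sum(axis=1)      # the degree sequence; density grows with hi
```

From A one can then compute centralities, path lengths, clustering coefficients, and the other network characteristics mentioned above, e.g., with a standard graph library.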
B. Prediction

Prediction strategies that work with state-space models have a long history, and a rich tradition, in nonlinear dynamics. The reconstruction machinery of Section II plays a critical role in these strategies, as it allows them to be brought to bear on the problem of predicting a scalar time series47. In 1969, for instance, Lorenz proposed his Method of Analogues, which searches the known state-space trajectory for the nearest neighbor of a given point and takes that neighbor's forward path as the forecast48; not long after the original embedding papers, Pikovsky showed that the Method of Analogues also works in reconstructed state spaces49.

Of course, the canonical prediction example in deterministic nonlinear dynamics is the roulette work of the Chaos Cabal at the University of California at Santa Cruz, a project that catalyzed not only a lot of nonlinear science, including the original embedding paper7, but also a lot of interest in the field from both scientific and lay communities50 that continues to this day51.

In the decades since Lorenz's Method of Analogues and the roulette project, a large number of creative strategies have been developed to predict the future course of a nonlinear dynamical system52. Most of these methods build some flavor of local model in patches of a reconstructed state space and then use that model to make the prediction of the next point. Early examples include24,53-55. This remains an active area of research and has even spawned a time-series prediction competition56.

The Method of Analogues is not only applicable to deterministic dynamics. The short-term transition probability density of a Markov process depends only on the current state, which can be approximated by a delay vector. The futures of delay vectors from a small neighborhood can be viewed as a sample of the distribution, one time step ahead. This approach has been used for modeling57 and predicting58 nonlinear stochastic processes.

Surprisingly, perfect embeddings are not required for successful predictions. In particular, reconstructions that do not satisfy the theoretical requirements on the embedding dimension m can give prediction methods enough traction to match or even exceed the accuracy of the same methods working in a full embedding, particularly when the data are noisy59. One can then try to optimize, e.g., the embedding parameters. Of course, overfitting can be an issue in any prediction strategy; one must be careful not to fool oneself by over-optimizing a predictor to the given data.

C. Noise and filtering

All real-world signals are contaminated by measurement noise. Most commonly, noise is treated as an additive random process on top of the true signal. Some forms of experimental apparatus contaminate the signal in different ways, however: shot noise, for instance, which appears only intermittently, or systematic bias in some measurement device. Regardless of its form, noise can interfere with nonlinear time-series analysis if it is too large, where "too large" depends greatly on the method that one wants to use.

Many studies in the literature are concerned with the fundamental issue of distinguishing chaos and noise (see60 and references therein). This can be a real challenge. Both types of signals exhibit irregular temporal fluctuations, with a fast decay of the auto-correlation function, and both are hard to forecast. They differ in the dynamical origin of these features: chaos is a deterministic process; noise is not. In a deterministic system, the short-term futures of two almost-identical states should be similar; in a pure noise process, that is improbable. But, as mentioned above, noise takes on many forms. The simplest and most tractable is white noise: sequences of independent, identically distributed (iid) random numbers. Their statistical independence, as expressed by the factorization of their joint probability distributions, can be easily identified by statistical tests.

If the noise is not additive, the challenge mounts. A noise-driven chaotic system, e.g., a nonlinear stochastic differential equation, produces something we might call noise. Mathematically speaking, such a system will, in any delay-coordinate embedding space, generate an invariant measure whose support has the full state-space dimension, without fractal structure. In such a system, infinitesimally close trajectories will not diverge exponentially fast, but rather separate diffusively, at least on short time scales. Nonetheless, if such "dynamical noise" or "interactive noise" is sufficiently weak, one can still identify and characterize the deterministic properties of the system. However, there is often a smooth transition between chaos and noise, leaving the whole issue without a clear resolution.

It is, however, our impression that this issue is overemphasized. In most time-series applications, the critical task is not to distinguish between chaos and noise, but rather to decide on the complexity of the process: whether it is linear or nonlinear, where it falls on the spectrum between redundancy and irrelevancy, etc. And then we are much better off, as there exist quite powerful tools for answering these questions (see, e.g., Sections IV A 1 and IV A 2).

Removing noise from a signal can also be a real challenge. Traditional filtering strategies discriminate between signal and noise using some sort of frequency threshold: e.g., removing all of the high-frequency components of the signal. In a chaotic signal, where the frequency spectrum is broad band, such a scheme will filter signal out along with the noise61. To be effective, filtering strategies for nonlinear time-series data must be tailored to and informed by the unique properties of nonlinear dynamics. One can, for instance, use the native geometry of the stable and unstable manifolds in a chaotic attractor62, or local models of the dynamics on the attractor63,64, to
reduce noise. One can also exploit the topology of such attractors in nonlinear filtering schemes65.

D. Issues and limitations

Nonlinear time-series analysis in the reconstructed state space is a powerful and useful idea, but it does have some practical limitations. These limitations are by no means fatal, but one has to be aware of them in order to report correct results.

In theory, delay-coordinate embedding is only guaranteed to work for an infinitely long, noise-free observation of a single dynamical system. This poses a number of problems in practice, beginning with nonstationarity: embedding a time series gathered from a system that is undergoing bifurcations, for instance, will produce a topological stew of those different attractors. Invariants computed from such a structure, needless to say, will not accurately describe any of the associated dynamical regimes. One can use the tests described at the end of Section II A to determine whether these effects are at work in one's results: e.g., repeating the analysis on different subsequences of the data and seeing if the results change. The recurrence plots described in Section IV A 3 can also be helpful in these situations, allowing one to quickly see if different parts of the signal have different dynamical signatures.

The analysis of different subsequences of a time series has many other uses besides detecting nonstationarity, including determining whether or not one has enough data to support one's analysis. The original embedding theorems require an infinite amount of data, but looser bounds have since been established for different problems24,66,67. It is important to know and attend to these limits; a computation of a Lyapunov exponent of a five-dimensional system from a data set that contains 100 points, for instance, should probably not be trusted. It is also important to keep these effects in mind when repeating analyses on subsets of one's data, since the changes in the results that one wants to use as a diagnostic tool can simply be the result of short data lengths.

Dimension is a major practical issue for many reasons, not just because it is not known a priori and can be a challenge to estimate. Most of the results cited above regarding the data length that is necessary for success in nonlinear time-series analysis scale with the dimension of the dynamical system, often quite badly. This becomes even more of a challenge in spatially extended systems, where the state space is high (or even infinite) dimensional and the dynamics is spatio-temporal. In cases like this, the full attractor cannot be reconstructed by delay-coordinate embedding. This can in some cases be circumvented by exploiting homogeneity of the system, however: if the dynamics is translationally invariant, local dynamics can be reconstructed and used for predictions68,69.

Noise effects also scale with dimension, since any noisy time-series point will affect m of the points in an m-dimensional embedding of those data. The detection and filtering strategies mentioned in Section IV C can help with noise problems, and subsequence analysis can be used to explore whether the data are adequate to support the analysis, but in the end there is simply no way around not having enough data.

Delay-coordinate embedding, as formulated at the beginning of Section II, requires data that are evenly sampled in time. If this is not true, constructing the delay vector R is impossible without interpolation, which introduces spurious dynamics into the results. There is, however, an elegant way around this issue if the data consist of discrete events, like the spikes generated by a neuron: one simply embeds the inter-spike intervals70. The idea here is that if the spikes can be considered to be the result of an integrate-and-fire process, then their spacing is an effective proxy for the integral of the corresponding internal variable, and that is a wholly justifiable quantity to embed. Even without integrate-and-fire dynamics, one can interpret inter-spike intervals as a specific Poincaré map, which justifies their embedding71. This also applies to the time series formed by all maxima (or all minima) of the signal.

Even though it is quite handy for practical purposes, using the same value of τ between successive elements of a delay vector may not be optimal. Indeed, using delay vectors of the form y(t), y(t − τ1), y(t − τ1 − τ2), . . . , y(t − τ1 − τ2 − · · · − τm−1), with non-negative τi, can introduce more time scales into the reconstruction, which has been shown to be useful in many situations17,72. Such strategies might also be a way to tackle signals from multi-scale dynamics: if there are different time and length scales involved, a fixed τ may be too large to resolve the short ones and/or too small to resolve the long ones. This is particularly evident when embedding a human ECG signal: using standard delay vectors, one can either unfold the QRS complex or represent the T wave as a loop, but not both4.

4 Concerning spatial scales, it has been shown73 that spatial distances might play a different role: so-called finite-size Lyapunov exponents might detect different strengths of instability at different spatial scales.

V. PERSPECTIVES

When getting involved in time-series analysis some 25 years ago, we could not have anticipated the wealth of data that would be available in 2015, facilitated by cheap and powerful sensors for all sorts of quantities, data-acquisition systems with sub-microsecond sampling rates and terabytes of memory, widespread remote-sensing technology, and incredible sense/compute power in small devices carried by the majority of the population of Earth. Commercial hardware and software are available
to monitor all kinds of things, from physiological parameters obtained during daily activity by watch-sized objects to real-time traffic flows gathered by cameras on highways. These data can be used to suggest life-changing health interventions, produce routing suggestions to avoid traffic jams that have not yet formed, and the like.

All this involves data analysis: often, time-series analysis. The bulk of the techniques used in the various academic and commercial communities that are concerned with this problem (data mining, machine learning, and the like) are linear and statistical. Analysis techniques that accommodate nonlinearity and determinism could be an extremely important weapon in this arsenal, but nonlinear time-series analysis is currently underused outside the field of nonlinear science. (Of course, much of this software is proprietary, so one must be careful about such generalizations; nonlinear time-series analysis may already be running on Google's computers and it would be hard for those outside the company to know.)

There are some serious barriers to the movement of nonlinear time-series analysis beyond the university desks of physicists and into widespread professional practice, however. Linear techniques have a long history and are taught in most academic programs. They are comparatively easy to use and they almost always produce answers. Whether or not those answers are correct, or meaningful, is a serious issue: cf. the discussion in Section I of the mean of a bimodal distribution. But to a community that is familiar with these linear techniques, the notion of learning a whole new methodology, one that relies on more-complex mathematics and only works if the data are good and the algorithm parameters are set right, can be daunting. One of us (EB) encountered significant resistance when attempting to convince the computer systems community to attend to nonlinearity and chaos in computer dynamics, an effect that could significantly impact the designs of those systems. Only when those effects become apparent and meaningful to those communities will nonlinear time-series analysis become more widespread. Another relevant issue here is whether low-dimensional deterministic dynamics is a good data model for broader use. So the only prediction that we make here is that nonlinear time-series analysis is still far from its culmination point, in terms of application.

What will be the relevant issues concerning the methodology itself? Here we can only speculate. It is evident that nonstationarity is still a major problem and many of its facets are not fully explored. Change-point detection is one of these. Distilling causality relationships from data is another critical open problem in nonlinear time-series analysis (e.g., couplings in climate science). Will this ever be possible? It is hard to say. On the algorithmic end of things, the various free parameters, and the sensitivity of the results to their values, are important issues. Will it be possible to design algorithms whose free parameters can be chosen systematically, via intuition, or perhaps even automatically? Such developments would streamline nonlinear time-series analysis, making it an indispensable tool for making sense of the real world.

1 G. E. P. Box and F. M. Jenkins, Time Series Analysis: Forecasting and Control, 2nd ed. (Holden Day, 1976).
2 H. Hurst, "Long-term storage of water reservoirs," Transactions of the American Society of Civil Engineers 116 (1951).
3 C. Peng, S. Buldyrev, S. Havlin, M. Simons, H. Stanley, and A. Goldberger, "Mosaic organization of DNA nucleotides," Physical Review E 49, 1685 (1994).
4 H. Abarbanel, Analysis of Observed Chaotic Data (Springer, 1996).
5 H. Kantz and T. Schreiber, Nonlinear Time Series Analysis (Cambridge University Press, 2004).
6 J. Crutchfield, "Prediction and stability in classical mechanics" (1979), senior thesis in physics and mathematics, University of California, Santa Cruz.
7 N. Packard, J. Crutchfield, J. Farmer, and R. Shaw, "Geometry from a time series," Physical Review Letters 45, 712 (1980).
8 F. Takens, "Detecting strange attractors in fluid turbulence," in Dynamical Systems and Turbulence, edited by D. Rand and L.-S. Young (Springer, Berlin, 1981) pp. 366-381.
9 T. Sauer, J. Yorke, and M. Casdagli, "Embedology," Journal of Statistical Physics 65, 579-616 (1991).
10 R. Hegger, H. Kantz, and T. Schreiber, "Practical implementation of nonlinear time series methods: The TISEAN package," Chaos: An Interdisciplinary Journal of Nonlinear Science 9, 413-435 (1999).
11 A. Fraser and H. Swinney, "Independent coordinates for strange attractors from mutual information," Physical Review A 33, 1134-1140 (1986).
12 W. Liebert and H. Schuster, "Proper choice of the time delay for the analysis of chaotic time series," Physics Letters A 142, 107-111 (1989).
13 P. Grassberger and I. Procaccia, "Measuring the strangeness of strange attractors," Physica D 9, 189-208 (1983).
14 M. Casdagli, S. Eubank, J. Farmer, and J. Gibson, "State space reconstruction in the presence of noise," Physica D 51, 52-98 (1991).
15 M. B. Kennel, R. Brown, and H. D. I. Abarbanel, "Determining minimum embedding dimension using a geometrical construction," Physical Review A 45, 3403-3411 (1992).
16 W. Liebert, K. Pawelzik, and H. Schuster, "Optimal embeddings of chaotic attractors from topological considerations," Europhysics Letters 14, 521 (1991).
17 L. Pecora, L. Moniz, J. Nichols, and T. Carroll, "A unified approach to attractor reconstruction," Chaos: An Interdisciplinary Journal of Nonlinear Science 17, 013110 (2007).
18 P. Grassberger, T. Schreiber, and C. Schaffrath, "Nonlinear time sequence analysis," International Journal of Bifurcation and Chaos 1, 521 (1991).
19 P. Grassberger and I. Procaccia, "Measuring the strangeness of strange attractors," Physica D 9, 189 (1983).
20 T. Sauer and J. Yorke, "How many delay coordinates do you need?" International Journal of Bifurcation and Chaos 3, 737 (1993).
21 J. Theiler, "Spurious dimension from correlation algorithms applied to limited time series data," Physical Review E 34, 2427 (1986).
22 P. Grassberger, "Finite sample corrections to entropy and dimension estimates," Physics Letters A 128, 369 (1988).
23 E. Olbrich and H. Kantz, "Inferring chaotic dynamics from time series: On which length scale determinism becomes visible," Physics Letters A 232, 63-69 (1997).
24 L. Smith, "Intrinsic limits on dimension calculations," Physics Letters A 133, 283-288 (1988).
25 J. Eckmann, S. Oliffson-Kamphorst, D. Ruelle, and S. Ciliberto, "Lyapunov exponents from time series," Physical Review A 34, 4971 (1986).
26 A. Wolf, J. Swift, H. Swinney, and J. Vastano, "Determining Lyapunov exponents from time series," Physica D 16, 285 (1985).
27 M. Sano and Y. Sawada, "Measurement of the Lyapunov spectrum from a chaotic time series," Physical Review Letters 55, 1091 (1985).
28 T. Sauer, J. Tempkin, and J. Yorke, "Spurious Lyapunov exponents in attractor reconstruction," Physical Review Letters 81, 4341 (1998).
29 H.-L. Yang, G. Radons, and H. Kantz, "Covariant Lyapunov vectors from reconstructed dynamics: The geometry behind true and spurious Lyapunov exponents," Physical Review Letters 109, 244101 (2012).
30 Y. Pesin, "Characteristic Lyapunov exponents and smooth ergodic theory," Russian Mathematical Surveys 32, 55 (1977).
31 H. Schuster and W. Just, Deterministic Chaos (Wiley, 2005).
32 T. Schreiber and A. Schmitz, "Discriminating power of measures for nonlinearity in a time series," Physical Review E 55, 5443 (1997).
33 J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and J. Farmer, "Testing for nonlinearity in time series: The method of surrogate data," Physica D 58, 77-94 (1992).
34 T. Schreiber and A. Schmitz, "Improved surrogate data for nonlinearity tests," Physical Review Letters 77, 635 (1996).
35 T. Schreiber and A. Schmitz, "Surrogate time series," Physica D 142, 346-382 (2000).
36 C. E. Shannon, "Prediction and entropy of printed English," Bell System Technical Journal 30, 50-64 (1951).
37 H. Petersen, Ergodic Theory (Cambridge University Press, 1989).
38 D. Lind and B. Marcus, An Introduction to Symbolic Dynamics and Coding (Cambridge University Press, 1995).
39 C. Bandt and B. Pompe, "Permutation entropy: A natural complexity measure for time series," Physical Review Letters 88, 174102 (2002).
40 J. Amigó, Permutation Complexity in Dynamical Systems: Ordinal Patterns, Permutation Entropy and All That (Springer, 2012).
41 J.-P. Eckmann, S. Kamphorst, and D. Ruelle, "Recurrence plots of dynamical systems," Europhysics Letters 4, 973-977 (1987).
42 E. Bradley and R. Mantilla, "Recurrence plots and unstable periodic orbits," Chaos: An Interdisciplinary Journal of Nonlinear Science 12, 596-600 (2002).
43 J. Zbilut and C. Webber, "Embeddings and delays as derived from recurrence quantification analysis," Physics Letters A 171, 199-203 (1992).
44 C. Webber and J. Zbilut, "Dynamical assessment of physiological systems and states using recurrence plot strategies," Journal of Applied Physiology 76, 965-973 (1994).
45 N. Marwan, M. Romano, M. Thiel, and J. Kurths, "Recurrence plots for the analysis of complex systems," Physics Reports 438, 237 (2007).
46 R. Donner, M. Small, J. Donges, N. Marwan, Y. Zou, R. Xiang, and J. Kurths, "Recurrence-based time series analysis by means of complex network methods," International Journal of Bifurcation and Chaos 21, 1019-1046 (2011).
47 J.-P. Eckmann and D. Ruelle, "Ergodic theory of chaos and strange attractors," Reviews of Modern Physics 57, 617 (1985).
48 E. Lorenz, "Atmospheric predictability as revealed by naturally occurring analogues," Journal of the Atmospheric Sciences 26, 636-646 (1969).
49 A. Pikovsky, "Noise filtering in the discrete time dynamical systems," Soviet Journal of Communications Technology and Electronics 31, 911-914 (1986).
50 T. Bass, The Eudaemonic Pie (Penguin, New York, 1992).
51 M. Small and C. Tse, "Predicting the outcome of roulette," Chaos: An Interdisciplinary Journal of Nonlinear Science 22, 033150 (2012).
52 M. Casdagli and S. Eubank, eds., Nonlinear Modeling and Forecasting (Addison Wesley, 1992).
53 J. Farmer and J. Sidorowich, "Predicting chaotic time series," Physical Review Letters 59, 845-848 (1987).
54 M. Casdagli, "Nonlinear prediction of chaotic time series," Physica D 35, 335-356 (1989).
55 G. Sugihara and R. May, "Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series," Nature 344, 734-741 (1990).
56 A. Weigend and N. Gershenfeld, eds., Time Series Prediction: Forecasting the Future and Understanding the Past (Santa Fe Institute Studies in the Sciences of Complexity, Santa Fe, NM, 1993).
57 F. Paparella, A. Provenzale, L. Smith, C. Taricco, and R. Vio, "Local random analogue prediction of nonlinear processes," Physics Letters A 235, 233-240 (1997).
58 M. Ragwitz and H. Kantz, "Markov models from data by simple nonlinear time series predictors in delay embedding spaces," Physical Review E 65, 056201 (2002).
59 J. Garland and E. Bradley, "Prediction in projection" (2015), arxiv.org/abs/1503.01678.
60 M. Cencini, M. Falcioni, E. Olbrich, H. Kantz, and A. Vulpiani, "Chaos or noise: Difficulties of a distinction," Physical Review E 62, 427 (2000).
61 J. Theiler and S. Eubank, "Don't bleach chaotic data," Chaos: An Interdisciplinary Journal of Nonlinear Science 3, 771-782 (1993).
62 J. Farmer and J. Sidorowich, "Exploiting chaos to predict the future and reduce noise," in Evolution, Learning and Cognition (World Scientific, 1988).
63 E. Kostelich and J. Yorke, "Noise reduction in dynamical systems," Physical Review A 38, 1649-1652 (1988).
64 P. Grassberger, R. Hegger, H. Kantz, C. Schaffrath, and T. Schreiber, "On noise reduction methods for chaotic data," Chaos: An Interdisciplinary Journal of Nonlinear Science 3, 127 (1993).
65 V. Robins, N. Rooney, and E. Bradley, "Topology-based signal separation," Chaos: An Interdisciplinary Journal of Nonlinear Science 14, 305-316 (2004).
66 A. Tsonis, J. Elsner, and K. Georgakakos, "Estimating the dimension of weather and climate attractors: Important issues about the procedure and interpretation," Journal of the Atmospheric Sciences 50, 2549-2555 (1993).
67 J.-P. Eckmann and D. Ruelle, "Fundamental limitations for estimating dimensions and Lyapunov exponents in dynamical systems," Physica D 56, 185-187 (1992).
68 M. Bär, R. Hegger, and H. Kantz, "Fitting partial differential equations to space-time dynamics," Physical Review E 59, 337 (1999).
69 U. Parlitz and C. Merkwirth, "Prediction of spatiotemporal time series based on reconstructed local states," Physical Review Letters 84, 1890 (2000).
70 T. Sauer, "Interspike interval embedding of chaotic signals," Chaos: An Interdisciplinary Journal of Nonlinear Science 5, 127 (1995).
71 R. Hegger and H. Kantz, "Embedding of sequences of time intervals," Europhysics Letters 38, 267-272 (1997).
72 D. Holstein and H. Kantz, "Optimal Markov approximations and generalized embeddings," Physical Review E 79, 056202 (2009).
73 E. Aurell, G. Boffetta, A. Crisanti, G. Paladin, and A. Vulpiani, "Predictability in the large: An extension of the concept of Lyapunov exponent," Journal of Physics A: Mathematical and General 30, 1 (1997).