
A theory for long-memory in supply and demand
Fabrizio Lillo (1, 2), Szabolcs Mike (1, 3), and J. Doyne Farmer (1)

(1) Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501
(2) INFM Unità di Palermo and Dipartimento di Fisica e Tecnologie Relative, viale delle Scienze I-90128, Palermo, Italy
(3) Budapest University of Technology and Economics, H-1111 Budapest, Budafoki út 8, Hungary
arXiv:cond-mat/0412708v2 [cond-mat.other] 23 Mar 2005
Recent empirical studies have demonstrated long-memory in the signs of orders to buy or sell in
financial markets [2, 19]. We show how this can be caused by delays in market clearing. Under the
common practice of order splitting, large orders are broken up into pieces and executed incrementally.
If the size of such large orders is power law distributed, this gives rise to power law decaying
autocorrelations in the signs of executed orders. More specifically, we show that if the cumulative
distribution of large orders of volume v is proportional to $v^{-\alpha}$ and the size of executed orders is
constant, the autocorrelation of order signs as a function of the lag τ is asymptotically proportional
to $\tau^{-(\alpha-1)}$. This is a long-memory process when α < 2. With a few caveats, this gives a good match
to the data. A version of the model also shows long-memory fluctuations in order execution rates,
which may be relevant for explaining the long-memory of price diffusion rates.
Contents

I. Introduction
II. Description of models
III. Analytic computation for fixed N model
   A. Autocorrelation in probabilistic terms
   B. de Moivre-Laplace approximation
   C. Pareto distribution
IV. Liquidity fluctuations of the λ model
V. Testing the predictions
   A. Market structure and order distributions
   B. Predicted vs. actual values of γ
   C. Run length
   D. Review of assumptions
VI. Discussion
Acknowledgments
Appendix
References and Notes

I. INTRODUCTION

A random process is said to have long-memory if it has an autocorrelation function that is not integrable. This happens, for example, when the autocorrelation function decays asymptotically as a power law of the form $\tau^{-\gamma}$ with γ < 1. This is important because it implies that values from the distant past can have a significant effect on the present, that the stochastic process lacks a typical time scale, and that a stochastic process whose increments have long-memory displays anomalous diffusion. Examples of long-memory processes and anomalous diffusion have been observed in many physical, biological and economic systems ranging from turbulence [26] to chaotic dynamics due to flights and trapping [14], dynamics of aggregates of amphiphilic molecules [23] and DNA sequences [24, 25]. In finance the volatility, roughly defined as the diffusion rate of price fluctuations, is known to be a long-memory process [4, 8]. In this paper we analyze a mechanism for creating a long-memory process, based on converting a static power law distribution into a random process with a power law autocorrelation function. Other examples of stochastic processes relating power laws to long-memory have been given by Mandelbrot [21] (analyzed by Taqqu and Levy [27]), and in the context of DNA sequences by Buldyrev et al. [5].

Recently a new long-memory property of the order flow in a financial market was independently observed by Bouchaud et al. in the Paris Stock Exchange [2] and Lillo and Farmer in the London Stock Exchange (LSE) [19]. These studies have shown that there is a remarkable persistence in buying vs. selling. Labeling the signs of trading orders as ±1 according to whether they are to buy or to sell, the autocorrelation of observed order signs is strongly positive, asymptotically decaying roughly as a power law $\tau^{-\gamma}$, where γ ≈ 0.6. Such positive autocorrelations can be measured at statistically significant levels over time lags as long as two weeks.

For example, in Fig. 1 we show the empirical autocorrelation function of the time series of signs of orders that result in immediate trades for the stock Shell. The autocorrelation function is well described by a power law decay over almost three decades and a least squares fit to this gives γ = 0.53. The fact that γ < 1 implies that this is a long-memory process, i.e. its autocorrelation function decays so slowly that it is not integrable. This is important because it implies that values from the distant past have a significant effect on the present. A diffusion process built from long-memory increments has a variance $\sigma^2$ that grows in time as $\sigma^2(\tau) \sim \tau^{2H}$, where H is called the Hurst exponent. For 0 < γ < 1, H = 1 − γ/2. For a normal diffusion process H = 1/2, but when H > 1/2 the variance grows faster than $\tau$,
which is called anomalous diffusion. Another important consequence is that statistical averages converge slowly, e.g. the mean of a quantity that displays anomalous diffusion converges as $T^{-(1-H)}$, where T is the sample size. The signs of orders in the LSE have been shown to pass tests for long-memory with a high degree of statistical significance [19].
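
As a small numerical illustration of the two relations just stated, the following sketch converts a measured autocorrelation exponent γ into a Hurst exponent via H = 1 − γ/2 and evaluates the scale $T^{-(1-H)}$ at which sample means converge. The function names are hypothetical and chosen only for this illustration; the value γ = 0.53 is the fit quoted for Fig. 1.

```python
def hurst_from_gamma(gamma: float) -> float:
    """Hurst exponent H = 1 - gamma/2, stated in the text for 0 < gamma < 1."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("the relation H = 1 - gamma/2 assumes 0 < gamma < 1")
    return 1.0 - gamma / 2.0


def mean_convergence_scale(sample_size: int, hurst: float) -> float:
    """Scale T**-(1-H) at which the mean of T observations converges."""
    return sample_size ** (-(1.0 - hurst))


# gamma = 0.53 is the least squares fit quoted for Shell in Fig. 1.
H = hurst_from_gamma(0.53)

# Compare an ordinary diffusion (H = 1/2) with the long-memory case.
normal_scale = mean_convergence_scale(10**6, 0.5)
long_memory_scale = mean_convergence_scale(10**6, H)
```

With a million observations the long-memory scale is far larger than the ordinary $T^{-1/2}$ scale, which is the slow convergence referred to above.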
From an economic point of view this is important because of its implications for market efficiency. All other things being equal, since buy orders tend to drive the price up and sell orders tend to drive it down, this would imply that it was possible to make profits using a simple linear model to predict future price movements. In order to prevent this the market has to make substantial compensating adjustments [2, 3, 19]. The difficulty of making such adjustments perfectly may have important implications about the origin of long-memory in the volatility of prices.

FIG. 1: Autocorrelation function of the time series of signs of orders that result in immediate trades for the stock Shell traded at the London Stock Exchange in the period May 2000 - December 2002, a total of $5.8 \times 10^5$ events.

In this paper we hypothesize that the cause of the long-memory of order flow is a delay in market clearing. To make this clearer, imagine that a large investor like Warren Buffett decides to buy ten million shares of a company. It is unrealistic for him to simply state his demand to the world and let the market do its job. There are unlikely to be sufficient sellers present, and even if there were, revealing a large order tends to push the price up. Instead he keeps his intentions as secret as possible and trades the order incrementally over an extended period of time, possibly through intermediaries. In a study of this phenomenon, about a third of the dollar value of such institutional trades took more than a week to complete [6, 7]. This conflicts with standard neoclassical economic models, which assume market clearing, i.e. that the price always adjusts so that supply and demand are evenly matched. The fact that large orders are kept secret and executed incrementally implies that at any given time there may be a substantial imbalance of buyers and sellers, which can be interpreted as a failure of market clearing. Supply and demand do not match, and the market fails to clear. Effective market clearing is delayed by variable amounts that depend on fluctuations in the size and signs of unrevealed orders.

We propose a simple model to explain the long-memory of order flow based on delays in market clearing. We postulate that unrevealed hidden orders are distributed according to a power law. These are broken up into pieces, which we call revealed orders, that are submitted at a steady rate. We show that this leads to long-memory in order flow, yielding a model consistent with empirical observations. The main result is an analytic computation relating the exponent of the power law of the volume distribution of hidden orders to the rate of decay of the long-memory process characterizing revealed orders.

The paper is organized as follows: In Section II we define the two models that we study here, which we call the fixed N model and the λ model. In Section III we analytically compute the autocorrelation function of revealed orders for the fixed N model in terms of the parameters, and test it against simulation results. Section IV discusses the properties of the λ model, showing that it displays interesting temporal fluctuations. Section V compares the predictions to empirical evidence and discusses the assumptions of the model in the context of real markets. In Section VI we discuss the possible broader implications.

II. DESCRIPTION OF MODELS

We develop a model with two variations, which we call the λ model and the fixed N model. We first describe the λ model, which is more realistic, but for which we have only simulation results. We then describe the fixed N model, which is less realistic, but has the important advantage of being simpler, allowing us to obtain analytic results. Because of the simple nature of these results, they apply equally well to the λ model.

The λ model is defined as follows. Let N(t) be the number of hidden orders at time t = 1, 2, ..., T. At each time t generate a new hidden order with probability 0 < λ < 1 if N(t) > 0, or probability one if N(t) = 0. Assign each new hidden order a random sign $s_i$ and an initial size $v_i(t^*) = L \Delta v$, where $t^*$ is the time when the hidden order is created, and L = 1, 2, ... is drawn from a Pareto distribution $P(L) = \alpha L^{-(\alpha+1)}$, with α > 0. The random variables L and $s_i$ are IID [footnote 1]. At each timestep t an existing hidden order i is chosen at random with uniform probability, and a volume Δv of that order is removed, so that $v_i(t+1) = v_i(t) - \Delta v$. This generates a revealed order of volume Δv and sign $x_t = s_i$. A hidden order i is removed if $v_i(t+1) = 0$. Thus, the number of hidden orders N(t) fluctuates in time, depending on fluctuations in arrival and removal.

[Footnote 1] In the language of extreme value theory [9], the Pareto distribution is just one example of a power law. A distribution f(x) is a power law with tail exponent α if there exists a slowly varying function g(x) such that $\lim_{x \to \infty} f(x) g(x) = K x^{-\alpha}$, where K and α are positive constants. A function g(x) is a slowly varying function if for any t > 0, $\lim_{x \to \infty} g(tx)/g(x) = 1$. A common example of a slowly varying function is log x, so in this sense the function $x^{-\alpha} \log x$ is a power law. Thus, the term "power law" refers not to a specific distribution, but to an equivalence class of distributions with the same asymptotic scaling properties. It is clear from the calculations leading up to our main result, equation (17), that it is not necessary to assume that the distribution of volumes is strictly Pareto distributed; any power law distribution p(L) with a given tail exponent α will give the same asymptotic scaling for the autocorrelation function of revealed orders.

The fixed N model is the same, except that the number of hidden orders N is kept fixed. Thus, if a hidden order is removed it is immediately replaced by a new one with a random sign and a new size.

The main result of this paper is the calculation of the autocorrelation function of revealed order signs $x_t$ for the fixed N model. We show in the next section that the tail of the autocorrelation function asymptotically scales as $\tau^{-(\alpha-1)}$. While varying N affects the shape of the autocorrelation function for small τ, provided α is held fixed, it does not affect its asymptotic scaling. Even though N(t) varies in the λ model, the asymptotic behavior is independent of N(t), and so the asymptotic behavior of the autocorrelation function is the same. This is particularly convenient because it allows us to make a prediction in terms of observable quantities (see Section V).

III. ANALYTIC COMPUTATION FOR FIXED N MODEL

Because the hidden order arrival process is IID, it is possible to compute the autocorrelation of the fixed N model analytically. The basic idea of the computation is to understand the behavior of the autocorrelation conditioned on L, the initial length of the hidden order in units of the revealed order size Δv, and then combine the results for different values of L.

We begin by giving a simple intuitive argument for the asymptotic scaling. The probability at any instant of time that a revealed order comes from a hidden order of length L is $Q(L) \propto L p(L)$. This revealed order contributes to inducing a positive autocorrelation at lag τ only if the revealed order τ steps ahead comes from the same hidden order. In other words, in order to contribute to the autocorrelation function at lag τ, a hidden order must be of length L > Aτ, where A is a constant. Summing over all hidden orders gives an autocorrelation $\rho(\tau) \sim \int_{A\tau}^{\infty} Q(L)\, dL \sim \tau^{-(\alpha-1)}$, which is the main result, Eq. (17). In the remainder of this section, we present a more detailed calculation, which also allows us to compute the correct prefactor.

A. Autocorrelation in probabilistic terms

Under the convention that the signs of the revealed orders are $x_t = \pm 1$, because of the symmetry between buying and selling $E[x_t] = 0$ and $E[x_t^2] = 1$, where E denotes the expectation. Therefore the autocorrelation is simply $\rho(\tau) = E[x_t x_{t+\tau}]$. We can rewrite this as

\[
E[x_t x_{t+\tau}] = \sum_{L=1}^{\infty} Q(L)\, E[x_t x_{t+\tau} \mid L], \tag{1}
\]

where $E[x_t x_{t+\tau} \mid L]$ is conditioned on the hidden order that generated $x_t$ having length L. Q(L) is the probability that a revealed order drawn at random comes from a hidden order of length L. Let $q(\tau \mid L)$ be the probability that revealed orders at times t and t + τ came from the same hidden order, given that it has original length L. Because $E[x_t x_{t+\tau}] = 0$ if $x_t$ and $x_{t+\tau}$ came from different hidden orders, and $E[x_t x_{t+\tau}] = 1$ if they came from the same hidden order, the conditional expectation can be rewritten

\[
E[x_t x_{t+\tau} \mid L] = q(\tau \mid L), \tag{2}
\]

which implies

\[
\rho(\tau) = \sum_{L=1}^{\infty} Q(L)\, q(\tau \mid L). \tag{3}
\]

To compute Q, we note that the number of revealed orders coming from hidden orders of length L is proportional to L p(L), where p(L) is the probability that a hidden order has length L. To compute Q(L) we must properly normalize this by summing over L,

\[
Q(L) = \frac{L\, p(L)}{\sum_{L=1}^{\infty} L\, p(L)}. \tag{4}
\]

This gives

\[
\rho(\tau) = \frac{1}{\bar{L}} \sum_{L=1}^{\infty} L\, q(\tau \mid L)\, p(L), \tag{5}
\]

where $\bar{L}$ is the average value of L.

The conditional probability $q(\tau \mid L)$ can be written

\[
q(\tau \mid L) = w(L, \tau)\, p, \tag{6}
\]

where w(L, τ) is the probability that a given hidden order is still active after time τ, and p is the probability that it will be selected for execution assuming it is still active. By assumption p = 1/N.

Computing w(L, τ) is more complicated: Let s be the number of revealed orders drawn from a given hidden order during the τ − 1 timesteps between time t and time t + τ, and let $P_{\tau-1}(s < k)$ be the probability that s is less
than a given value k. Thus, for a hidden order that has length l at time t, the probability that it still exists at time t + τ is $P_{\tau-1}(s < l)$. For a hidden order with original length L, l is uniformly distributed with probability 1/L over the values 1, ..., L. Thus we can express w(L, τ) as a sum of probabilities, one for each possible value of l,

\[
w(L, \tau) = \frac{1}{L} \left( P_{\tau-1}(s < L-1) + P_{\tau-1}(s < L-2) + \ldots + P_{\tau-1}(s < 1) \right). \tag{7}
\]

The probabilities $P_{\tau-1}(s < k)$ can be expressed as sums of binomial probabilities, corresponding to the possible sequences with which a given hidden order generates k − 1 revealed orders,

\[
P_{\tau-1}(s < k) = \sum_{h=0}^{k-1} \binom{\tau-1}{h} p^h (1-p)^{\tau-1-h}. \tag{8}
\]

Therefore

\[
q(\tau \mid L) = \frac{p}{L} \sum_{j=1}^{L-2} \sum_{h=0}^{j} \binom{\tau-1}{h} p^h (1-p)^{\tau-1-h}. \tag{9}
\]

B. de Moivre-Laplace approximation

The autocorrelation can now be computed using Eq. (5). However, since the sums of binomial coefficients are difficult to manage we will make use of the de Moivre-Laplace approximation [11]. For $npq \gg 1$ one can approximate

\[
\binom{n}{k} p^k q^{n-k} \simeq \frac{1}{\sqrt{2\pi npq}} \exp\left( -\frac{(k - np)^2}{2npq} \right). \tag{10}
\]

As a consequence the sum of consecutive terms of a binomial distribution can be approximated as

\[
\sum_{k=k_1}^{k_2} \binom{n}{k} p^k q^{n-k} \simeq \frac{1}{2} \left[ \mathrm{erf}\left( \frac{k_2 - np + 1/2}{\sqrt{2npq}} \right) - \mathrm{erf}\left( \frac{k_1 - np - 1/2}{\sqrt{2npq}} \right) \right], \tag{11}
\]

where erf is the error function.

By converting the sum to an integral, and letting s = τ − 1, equation (9) becomes

\[
q(s+1 \mid L) \simeq \frac{p}{2L} \sum_{j=1}^{L-2} \left[ \mathrm{erf}\left( \frac{j - sp + 1/2}{\sqrt{2sp(1-p)}} \right) - \mathrm{erf}\left( \frac{-sp - 1/2}{\sqrt{2sp(1-p)}} \right) \right]
\simeq \frac{p}{2L} \int_{1/2}^{L-2+1/2} \left[ \mathrm{erf}\left( \frac{x - sp + 1/2}{\sqrt{2sp(1-p)}} \right) - \mathrm{erf}\left( \frac{-sp - 1/2}{\sqrt{2sp(1-p)}} \right) \right] dx. \tag{12}
\]

For the approximation of the sum by the integral we use $\sum_{i=a}^{b} f(i) \simeq \int_{a+1/2}^{b+1/2} f(x)\, dx$. Performing the last integral gives

\[
q(s+1 \mid L) \simeq \frac{p}{2L} \Bigg[ \sqrt{\frac{2}{\pi}\, sp(1-p)} \left( -\exp\left( -\frac{(sp)^2}{2sp(1-p)} \right) + \exp\left( -\frac{(L-1-sp)^2}{2sp(1-p)} \right) \right)
+ (sp - 1)\, \mathrm{erf}\left( \frac{1 - sp}{\sqrt{2sp(1-p)}} \right) + (L - 2)\, \mathrm{erf}\left( \frac{1/2 + sp}{\sqrt{2sp(1-p)}} \right) + (1 + sp - L)\, \mathrm{erf}\left( \frac{1 - L + sp}{\sqrt{2sp(1-p)}} \right) \Bigg]. \tag{13}
\]

The sum over L in Eq. (5) can be approximated by the integral

\[
\rho(\tau) \simeq \int_{1+1/2}^{\infty} q(\tau \mid L)\, \frac{p(L)\, L}{\bar{L}}\, dL. \tag{14}
\]

Finally, we need to translate the domain of validity of the de Moivre-Laplace approximation into more relevant terms. The condition $npq \gg 1$ applied to the binomial sums in Eq. (9) becomes $(\tau - 1)\, p\, (1 - p) \gg 1$. This leads to the condition

\[
\tau \gg \frac{N^2}{N-1} - 1 \simeq N, \tag{15}
\]

i.e. the approximation is valid as long as the lag is much greater than the number of hidden orders. Since the number of hidden orders is fixed, the approximation is always valid for sufficiently large τ.
We have tested these calculations for the simple case in which all hidden orders have the same size $L_0$, i.e.
$p(L) = \delta(L - L_0)$, where δ is the Dirac delta function.
This implies $\rho(\tau) = q(\tau \mid L_0)$, so that Eq. (13) gives a
closed form expression for the autocorrelation function.
As expected, the approximation always agrees very well
for large values of τ. The agreement is also good for small
values of τ when N is small and $L_0$ is sufficiently large.
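
A test of this kind can be reproduced with a direct simulation of the fixed N model of Section II. The following is a minimal sketch, not the authors' code: the function names, the seed handling and the way lengths are sampled are assumptions for illustration. Hidden orders are stored as [sign, remaining length] pairs, one of them is picked uniformly at each timestep, and an exhausted order is replaced immediately.

```python
import random


def simulate_fixed_n(n_orders, steps, draw_length, seed=0):
    """Return the revealed order signs x_t of a fixed N model run."""
    rng = random.Random(seed)

    def new_order():
        # [sign, remaining length in units of the revealed order size]
        return [rng.choice((-1, 1)), draw_length(rng)]

    orders = [new_order() for _ in range(n_orders)]
    signs = []
    for _ in range(steps):
        i = rng.randrange(n_orders)   # choose a hidden order uniformly
        signs.append(orders[i][0])    # reveal one unit carrying its sign
        orders[i][1] -= 1
        if orders[i][1] == 0:         # exhausted: replace it immediately
            orders[i] = new_order()
    return signs


def autocorr(signs, lag):
    """Sample estimate of E[x_t x_{t+lag}], using E[x_t] = 0."""
    n = len(signs) - lag
    return sum(signs[t] * signs[t + lag] for t in range(n)) / n


def pareto_length(alpha):
    """Integer lengths with P(L >= l) = l**-alpha (an assumed discretization)."""
    return lambda rng: int(rng.paretovariate(alpha))


# Constant size L0 = 10 with a single hidden order (N = 1).
signs_const = simulate_fixed_n(1, 200_000, lambda rng: 10, seed=1)
rho_1 = autocorr(signs_const, 1)
rho_5 = autocorr(signs_const, 5)
```

In this particular configuration (N = 1, constant $L_0$) the signs come in runs of exactly $L_0$ steps, so the simulated autocorrelation should be close to $1 - \tau/L_0$ for $\tau < L_0$ and close to zero beyond. Passing `pareto_length(1.5)` instead of the constant sampler gives a power law case, although much longer runs are needed before the tail of the autocorrelation settles.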
C. Pareto distribution

We now consider the more realistic case that the hidden order size L has a Pareto distribution

\[
p(L) = \frac{\alpha}{L^{\alpha+1}}, \tag{16}
\]

where α > 1 is the tail exponent. In this case the integral of Eq. (14) cannot be performed analytically. We can, however, give an analytical asymptotic expansion of the integral (14). The calculations detailed in the appendix make use of the saddle point approximation. The result is that the leading term of the asymptotic expansion of ρ(τ) is given by the terms depending on erf functions in Eq. (13), and the autocorrelation function decays asymptotically as

\[
\rho(\tau) \sim \frac{N^{\alpha-2}}{\alpha}\, \tau^{-(\alpha-1)}. \tag{17}
\]

This result indicates that the autocorrelation function decays as a power law with exponent γ = α − 1. The number of hidden orders affects the prefactor, but does not affect the scaling exponent. Interestingly, when α = 2 the prefactor is independent of N. When α < 2 it is a decreasing function of N, and when α > 2 it is an increasing function of N. The value α = 2 separates the regime where the size of hidden orders has infinite variance from the regime where the variance is finite [footnote 2].

[Footnote 2] Note that Buldyrev et al. [5] found a similar formula in the context of structure in DNA sequences.

Fig. 2 compares the autocorrelation function predicted by Eq. (17) to a simulation for α = 1.5, N = 1, N = 5, and N = 50. For large values of τ the match is excellent, both in terms of the slope and the size of the prefactor. For N = 1 the prediction matches the simulation across the entire range of τ. As expected, when N increases the prediction deviates at small τ, but still matches for large τ. We have also checked the consequences of varying α and find that the prefactor behaves as predicted by Eq. (17).

FIG. 2: (Color online) Autocorrelation of the fixed N model with α = 1.5, for N = 1 (green circles), N = 5 (red squares) and N = 50 (blue diamonds), based on a simulation with $T = 10^9$. This is compared to the asymptotic predictions of Eq. (17), shown as dashed black lines.

Note that we used $T = 10^9$ samples to simulate the model and compare to theory. This is because for α = 1.5 this is a strongly long-memory process, and the convergence is extremely slow. This will become an issue later on when we test the model against real data: even for very large sample sizes the error bars remain quite large.

IV. LIQUIDITY FLUCTUATIONS OF THE λ MODEL

We now return to discuss the λ model. As a reminder, this differs from the fixed N model analyzed so far in that the number of hidden orders N(t) is not fixed. Instead, new hidden orders are added with probability λ when N(t) > 0, and probability 1 otherwise. For the mean of N(t) to remain bounded it is necessary that the rate of creation of new orders equal the rate at which they are removed. This implies the model has a critical threshold where E[N(t)] → ∞. This can be simply computed as follows: Let n(t) be the total number of future revealed orders stored in all hidden orders at time t, i.e. $n(t) = \sum_{i=1}^{N(t)} v_i(t)/\Delta v$. The average rate of change of n(t) is

\[
E[n(t+1) - n(t)] = R(n(t))\, \bar{L} - 1.
\]

The first term represents addition of a new hidden order, and the second term the removal of a revealed order at every timestep. The creation rate R(n(t)) = λ when n(t) > 0 and R(n(t)) = 1 otherwise. The average length of a new hidden order is $\bar{L}$, which under the Pareto assumption is $\bar{L} = \sum_{L=1}^{\infty} L\, p(L) = \alpha/(\alpha - 1)$. In the limit where E[n(t)] is large it is a good approximation to say that n(t) is never zero, so that R(n(t)) = λ. Setting E[n(t + 1) − n(t)] = 0 implies the critical value $\lambda_c$ is

\[
\lambda_c = 1/\bar{L} = (\alpha - 1)/\alpha = \gamma/\alpha. \tag{18}
\]

For the last equality we have made use of the fact that γ does not depend on N in Eq. (17), which indicates that γ = α − 1 applies equally well to the λ model as long as $\lambda < \lambda_c$ (we have verified this in simulations). We also
confirm the dependence of the critical behavior on α in Fig. 3.

FIG. 3: (Color online) The average number of hidden orders as a function of the creation parameter λ for α = 1.3 (red downward pointing triangles), α = 1.5 (black circles) and α = 1.7 (green upward pointing triangles). The dashed lines are the corresponding predicted critical values $\lambda_c = (\alpha - 1)/\alpha$.

One of the interesting features of the λ model is that it generates long-memory fluctuations in the number of active hidden orders. This is caused by positive feedback between the number of orders and the accumulation rate. This is because the average rate at which hidden orders are executed is 1/N(t). Thus when N(t) is larger than average, the rate at which active hidden orders are removed is lower than average, which tends to cause N(t) to increase above its average value. Such an increase is triggered by random fluctuations in which one or more particularly large orders are created; when these orders are finally removed, N(t) decreases. N(t) thus makes large and persistent fluctuations. The autocorrelation function has an asymptotic power law decay of the form $\rho_N(\tau) \sim \tau^{-\gamma}$ as shown in Fig. 4. From simulations, we find that γ = α − 1.

FIG. 4: (Color online) Autocorrelation function of the number of active hidden orders in the λ model for four different values of λ, as shown in the inset. The dashed black lines have slope α − 1.

For this model fluctuations in the number of hidden orders correspond to fluctuations in the time to execute an order. In economics this is one aspect of what is called liquidity, which is a general term referring to the ease of execution of an order. One of the interesting properties of prices of economic time series is that they display what is commonly called clustered volatility, i.e. the diffusion rate of price changes is strongly autocorrelated in time, and in fact is a long-memory process [4, 8]. It has recently been shown that this is related to fluctuations in liquidity, in this case defined as the price response to an order of a given size [10]. The fact that this kind of model predicts long-memory fluctuations in another aspect of liquidity (the time to execute an order) may be related to the explanation of clustered volatility.

V. TESTING THE PREDICTIONS

Unfortunately, data comparing hidden orders and revealed orders are not widely available, which complicates the problem of testing this model. The only data set we know of that includes the kind of data that is needed for a proper test was used by Chan and Lakonishok [6, 7] to study the execution of customer orders at large brokerage firms. Unfortunately, they did not fit functional forms to the size distributions or test for long-memory, and we have not been able to obtain their data. Their study does make it clear that order splitting is very common, and suggests that the time scale on which order splitting occurs is sufficiently long to match the autocorrelations in order flow.

We compare the predictions of the model to the data in two different ways. The first is based on computation of the scaling exponents, described in Section V B, and the second is based on the properties of run length, described in Section V C. Before presenting the first test, we must first review the market structure.

A. Market structure and order distributions

Although we have no transaction data with direct information about hidden orders, we can perform an indirect test of the scaling relations predicted by the model which takes advantage of the market structure used in the New York Stock Exchange and the London Stock Exchange. They both employ two parallel markets which provide alternative methods of trading, called the on-book or "downstairs" market, and the off-book or "upstairs" market. In the LSE orders in the on-book market are placed publicly but anonymously and execution is completely automated. The off-book market, in contrast, operates through a bilateral exchange mechanism, via telephone calls or direct contact of the trading parties.
The anonymous nature of the on-book market facilitates order splitting,
and it is clear that it is a common practice. This is also supported by the fact that in our data
set it is possible to track the on-book orders for individ-
ual trading institutions, and the long-memory property
of order flow is evident even for single institutions [19]. In
contrast, off-book trading is based on personal relation-
ships and order splitting is believed to be less frequent.
This is because a series of orders of the same sign tends
to gradually change the price in a direction that is
unfavorable to the other party [6, 7].
Thus one might make the hypothesis that in the off-
book market people just submit their orders rather than
hiding them, while in the on-book market they hide their
true orders and execute them through a series of revealed
orders. While there is some truth in this hypothesis, it is
not strictly true. When we examine sequences of off-book
trades for individual institutions, we often see long runs
of trades of the same sign, suggesting that order split-
ting is also fairly common in the off-book market. Even
though order-splitting is not common when trading with
the same party, it is still possible to split a large order
and trade it in the off-book market with many different
parties. Thus the transactions in the off-book market
have already undergone some order splitting, and it is
not clear how well the distribution of transactions corre-
sponds to that for hidden orders.
Despite the caveats mentioned above, we will press
forward with the hypothesis that off-book trades can FIG. 5: Volume distributions of off-book trades
be used as a proxy for hidden orders, and see how (circles), on- book trades (diamonds) and the ag-
the predictions of our model match the empirical ob- gregate of both (squares). In (a) we show this for a
servations of order splitting. To this end we select 20 collection of 20 different stocks, normalizing the vol-
highly capitalized stocks traded at the London Stock Ex- ume of each by the mean volume before combining,
change in the period May 2000 - December 2002. The whereas (b) shows unnormalized values (in shares)
stocks we analyzed are Astrazeneca (AZN), Bae Systems for the stock Astrazeneca. The number of trades in
(BA.), Baa (BAA), BHP Billiton (BLT), Boots Group each case is 11 × 106 (aggregate on-book), 5.7 × 106
(BOOT), British Sky Broadcasting Group (BSY), Dia- aggregate off-book, 8.0 × 105 (AZN on-book) and
2.8 × 105 (AZN off-book). The dashed black lines
geo (DGE), Gus (GUS), Hilton Group (HG.), Lloyds Tsb
have the slope found by the Hill estimator (and are
Group (LLOY), Prudential (PRU), Pearson (PSON), shown for the largest one percent of the data).
Rio Tinto (RIO), Rentokil Initial (RTO), Reuters Group
(RTR), Sainsbury (SBRY), Shell Transport & Trading
Co. (SHEL), Tesco (TSCO), Vodafone Group (VOD),
and WPP Group (WPP). The number of trades for the
combined group of stocks is 16.7 × 106 ; of these 11 × 106 relation for volume V is P (V > x) ∼ x−α , and estimate
are on-book trades and 5.7 × 106 are off-book trades. α using a Hill estimator applied to the largest one per-
cent of the data [16]. For the aggregate data set this gives
In Fig. 5 we show the empirical probability distribu-
α = 1.59 for the off-book data, α = 2.90 for the on-book
tions for the volume of trades in both the off-book and
data, and α = 1.64 for the combined data3 . Similar
on-book markets in the London Stock Exchange. We
values are computed for individual stocks, as shown in
show this for an aggregate of 20 heavily traded stocks and
Fig. 6. The average values are α = 1.74 ± 0.23 for off-
for the single stock Astrazeneca, which is typical of the
book, α = 4.2 ± 1.5 for on-book, and α = 1.36 ± 0.10
stocks in the sample. This makes it clear that the tails die
overall. These results are consistent with the hypothesis
out more slowly in the off-book market. The largest trade
sizes in the off-book market are more than a factor of ten
larger than those in the on-book market; for Astrazeneca,
for example, the largest orders are roughly four million 3 The results for the combined data set are in rough agreement
shares in the off-book market vs. 200 thousand in the with those first reported for the NYSE and NASDAQ by Gopikr-
on-book market. Alternatively, to measure the decay of ishnan et al. [15], and for the LSE and Paris by Gabaix et al.
the tails more quantitatively, we assume the asymptotic [13].
8
FIG. 6: Scaling exponents α for the twenty stocks

FIG. 7: The scaling exponents α for the twenty
we study here, based on the hypothesis that the
stocks we study here (with the hypothesis P (V >
largest one percent of the trades V are described
x) ∼ x−α ), plotted against the exponent γ of
by the relation P (V > x) ∼ x−α . The stocks are
the autocorrelation function (under the hypothe-
arranged along the x axis in alphabetical order. The
sis ρ(τ ) ∼ τ −γ ). The error bars shown are the
circles refer to off-book trades, the diamonds to on-
95-percent confidence intervals of the Hill estima-
book and the squares to the aggregate of both. For
tor, under the assumptions of IID errors and perfect
comparison we draw a dashed line for α = 1.5.
Pareto scaling across the entire range of V . Both
assumptions are highly optimistic.
that order splitting is more common in the on-book market than it is in the off-book market. However, they also suggest that the separation between the styles of trading in these two markets is not absolute. They both show an approximate power law decay in their tails, although this decay is much steeper for the on-book market.

Finally Fig. 6 shows that the exponent for the volume distribution of the aggregate of the on- and off-book trades is systematically smaller than the exponent for either of them by themselves. This is caused by the aggregation of two distributions: Mixing distributions with different scaling properties tends to fatten the tails. It indicates that one should be very careful in aggregating distributions⁴.

⁴ When power law distributions are combined the one with the lowest tail exponent determines the tail exponent of the aggregate. For a finite sample, however, there are often slow convergence effects as a function of sample size that can alter this conclusion.

B. Predicted vs. actual values of γ

Taking the off-book market as a crude proxy for hidden orders, we test the model by comparing γ̂ = α − 1 as predicted by Eq. (16) to the value of γ measured directly from the order signs. The scaling exponent γ is measured by computing the Hurst exponent of the series of market order signs for each stock using the DFA method [25], and making use of the relation γ = 2(1 − H). (This is much more accurate than computing the autocorrelation function directly.) We compare the predicted and actual values in Fig. 7. The average value of the scaling exponent of the autocorrelation function is γ = 0.57 ± 0.05. This can be compared either to γ̂ = 0.74 ± 0.23 based on the average value of α, or to γ̂ = 0.59 based on the α for the aggregate distribution. In either case the agreement is well within the error bars. (The error bars, which are based on the standard error of the mean of the 20 stock sample, are highly optimistic due to correlations within the sample and possibly also due to skewness and systematic bias of the Hill estimates.)

As a stronger test, one might hope that variations in measured values of α might predict variations in measured values of γ. The model fails this test. Performing a regression of predicted vs. actual values gives a statistically insignificant, slightly negative slope. There are several possible explanations for this: First, as we have already discussed, the off-book data may be a poor proxy for hidden orders. Second, the sample errors are very large, particularly for measuring α. The error bars we have shown for α in Fig. 7 are the 95-percent confidence intervals of the Hill estimator under the assumption that the data are IID and that the top one percent of the values have converged to a perfect Pareto distribution. This is clearly far too optimistic. This can be seen by breaking the data into subsamples; the variation from year to year is much larger than the error bars given by the Hill estimator. Even though our samples are large, the errors are still large because both volume and order signs are long-memory processes [19, 20], and averages generally converge as $T^{-(1-H)}$, where H ≈ 0.75 in both cases. In addition, the measured values of α have larger errors than those of γ due to a strong tendency of the volume to trend upward, an effect that isn't easily removed by simple normalization. Gabaix et al. have conjectured that the exponent α for the volume distribution has a universal value α = 3/2; if true, this would imply that deviations from that value are purely statistical fluctuations. Finally, it is of course possible that our model is wrong, due to violations of the assumptions of the model. We list some of the possible problems in Section V D.
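As an illustration of the estimation step discussed in this section, the following Python sketch computes a Hill estimate of α from the largest one percent of a sample, together with the naive 95-percent interval that holds only under the IID and exact-Pareto assumptions criticized above. The function names, the synthetic Pareto sample, and the sample size are our own assumptions for illustration and are not taken from the data analysis in the paper.

```python
import math
import random


def hill_estimator(values, tail_fraction=0.01):
    """Hill estimate of alpha, assuming P(V > x) ~ x^(-alpha) in the tail.

    Uses the largest `tail_fraction` of the sample. Returns (alpha_hat, ci_half),
    where ci_half = 1.96 * alpha_hat / sqrt(k) is the half-width of the naive
    95% interval (valid only for IID data with an exact Pareto tail).
    """
    ordered = sorted(values, reverse=True)
    k = max(2, int(len(ordered) * tail_fraction))
    threshold = ordered[k]  # the (k+1)-th largest value
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    alpha_hat = 1.0 / mean_log_excess
    return alpha_hat, 1.96 * alpha_hat / math.sqrt(k)


def predicted_gamma(alpha_hat):
    """Model prediction for the autocorrelation exponent: gamma = alpha - 1."""
    return alpha_hat - 1.0


def gamma_from_hurst(hurst):
    """Relation between the Hurst exponent and gamma used in the text."""
    return 2.0 * (1.0 - hurst)


# Synthetic check on an exact Pareto sample (hypothetical data, alpha = 1.5).
rng = random.Random(0)
sample = [rng.paretovariate(1.5) for _ in range(200_000)]
alpha_hat, ci_half = hill_estimator(sample)
print(f"alpha_hat = {alpha_hat:.3f} +/- {ci_half:.3f}")
print(f"predicted gamma = {predicted_gamma(alpha_hat):.3f}")
```

On real volume data the interval returned here would be too narrow, for the reasons given in the text: the observations are not IID and the tail has not fully converged to a Pareto form.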
C. Run length
Another test for comparing the models to data concerns the distribution of run lengths. A run is a series of revealed orders that are all of the same sign. In figure 8 we compare the run length distribution of the real order flow with a simulation of both the fixed N model and the λ model. In panel (a) we show the autocorrelation function of the sign of market orders for the stock Astrazeneca (AZN) and compare it with the autocorrelation of a simulation of the two models. The parameters are N = 24 and α = 1.63 for the fixed N model and N = 21.1, α = 1.63, and λ = 0.38 for the λ model. These parameters were chosen to give a best fit to the autocorrelation function of the real data. Both models are able to capture the asymptotic behavior of the autocorrelation function, but the fixed N model clearly underestimates the autocorrelation function for small lags. We can get a more detailed test by comparing the run length distribution of the models and the data, as shown in panel (b) of figure 8. The figure shows that the λ model is able to describe the run length distribution, whereas the fixed N model underestimates the run length probability for long runs. The λ model appears to be a better candidate for describing real order flow.
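The run length comparison can be sketched in a few lines of Python. The simulation below is a simplified reading of the fixed N model and rests on assumptions of ours: N hidden orders are always active, one is picked uniformly at random at each step and emits a unit revealed order of its sign, and an exhausted hidden order is replaced by a new one with random sign and an integer size obtained by rounding up a Pareto draw. The function names are hypothetical.

```python
import math
import random
from collections import Counter


def simulate_fixed_n(n_hidden, alpha, n_steps, rng):
    """Return a list of +1/-1 revealed order signs (simplified fixed N model)."""
    def new_order():
        size = math.ceil(rng.paretovariate(alpha))  # integer size >= 1 (assumption)
        sign = rng.choice((1, -1))
        return [sign, size]

    orders = [new_order() for _ in range(n_hidden)]
    signs = []
    for _ in range(n_steps):
        i = rng.randrange(n_hidden)      # pick one hidden order at random
        signs.append(orders[i][0])       # it emits one unit revealed order
        orders[i][1] -= 1
        if orders[i][1] == 0:            # exhausted: replace with a fresh one
            orders[i] = new_order()
    return signs


def run_lengths(signs):
    """Lengths of maximal runs of consecutive equal signs."""
    runs = []
    current = 0
    previous = None
    for s in signs:
        if s == previous:
            current += 1
        else:
            if previous is not None:
                runs.append(current)
            current = 1
            previous = s
    if previous is not None:
        runs.append(current)
    return runs


def run_length_distribution(signs):
    """Empirical probability of each run length, keyed by length."""
    counts = Counter(run_lengths(signs))
    total = sum(counts.values())
    return {length: counts[length] / total for length in sorted(counts)}


signs = simulate_fixed_n(n_hidden=24, alpha=1.63, n_steps=100_000, rng=random.Random(1))
distribution = run_length_distribution(signs)
```

The same `run_length_distribution` function can be applied to an empirical series of market order signs, so that model and data are compared on equal footing.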
FIG. 8: (Color online) (a) Autocorrelation function of the market order sign for the stock Astrazeneca (black line) compared with the autocorrelation function of a numerical simulation of the fixed N model (red filled circles, parameters N = 24 and α = 1.63) and of the λ model (empty blue circles, parameters α = 1.63 and λ = 0.38, which implies an average value of N = 21.1). (b) Probability distribution of the run length for real data and simulations of the model. The symbols and parameters are the same as in panel (a).

D. Review of assumptions

Below we give a brief discussion of the assumptions of the model, as well as the circumstances under which their violation might alter the basic conclusions of the model.

• Distribution of hidden orders. This has already been discussed in some detail above. Here we want to add that we have not addressed the possible cause of the power law distribution of hidden orders. One possibility (originally suggested by Levy and Solomon and developed by Gabaix et al. [12, 17, 18]) is that the hidden order size distribution is in some way related to the power law distribution of the size of holdings of the largest market participants.

• IID hidden order arrival. Strong autocorrelations in hidden order size or hidden order signs could affect γ, particularly if these were strong enough to be long-memory.

• Distribution of revealed orders. In reality, revealed orders do not have constant size. If their distribution is sufficiently thin tailed we think the model should still be valid. Power law tails, however, might affect γ.

• Aggregation of orders. In reality, there is a limited number of brokerage firms, and when they receive hidden orders with opposite signs within a sufficiently short period of time, they may cross such orders internally before they execute the remainder externally. This will reduce the amount of unexecuted volume and improve market clearing. In our model it has the potential to change the effective value of N. However, because of the independence
of the asymptotic scaling behavior on N, we do not think this will affect γ.

• Feedback between order execution and order generation. In our model we do not worry about whether revealed orders are actually executed. In reality many revealed orders may never be executed. In this case there may be feedback effects, i.e., if an order is not executed the hidden order size is not decreased, and consequently may result in the generation of additional revealed orders when the agent tries again. We cannot say with certainty that such effects are not important. However, one piece of relevant evidence is that within statistical error the same scaling is observed for market orders, limit orders, and cancellations [19]. Since market orders are by definition executed immediately, this suggests that such feedback effects are of minor importance.

VI. DISCUSSION

We have presented and solved a rather idealized model of the long-memory of order flow which was designed to yield tractable results. As detailed in the preceding section, many of its assumptions are not strictly true. At the very least, though, it illustrates how two apparently disparate phenomena may be linked together, and makes quantitative predictions about their relationship. Because we lack the proper data to test the model, we have used an imperfect proxy to test the model. The model passes this test. However, it would be nice to do a more definitive test, based on a data set that more closely characterizes the dichotomy between hidden and revealed orders. Even if the model is not strictly true, the model could potentially be extended to include more realistic assumptions, such as a non-trivial distribution of revealed order sizes.

The long-memory of order signs is interesting for its own sake, but it may also have more profound effects on other aspects of the market. The persistent autocorrelation function associated with a long-memory process implies a high degree of predictability by just constructing a simple linear time series model (see refs. [2, 19]). Since buy orders tend to generate a positive price response, and sell orders tend to generate a negative price response, all other things being equal this would translate into easily exploitable predictable movements in prices. In order to prevent this from happening, other features of the market have to adjust to compensate. Such features include the size of buy vs. sell orders, the volume of unexecuted orders at the best prices, and many other aspects of the market [2, 3, 19]. Market participants do not behave out of philanthropic motives; presumably these effects all come about due to the application of profit-making strategies. It is not at all obvious what these strategies are, and how they combine to eliminate this inefficiency.

The market response to the long-memory of order flow is an interesting example of a self-organized collective phenomenon. It may be one of the causes of other important properties of prices, such as the long-memory in their diffusion rate. We have demonstrated that the λ model, which allows fluctuations in the number of hidden orders, automatically generates fluctuations in liquidity. This is known to affect price diffusion rates [10]. The independence of the result from the number of hidden orders, which was not obvious to us before doing the calculation, is a convenient property that makes it possible to test the model based on information that can be feasibly gathered. This is thus a falsifiable model.

ACKNOWLEDGMENTS

FL acknowledges partial funding support from research projects MIUR 449/97 “Dinamica di altissima frequenza nei mercati finanziari” and MIUR-FIRB RBNE01CW3M. SM and JDF would like to thank Credit Suisse First Boston, the McDonnell Foundation, Bill Miller, and Bob Maxfield for funding this work.

APPENDIX

In this appendix we evaluate the asymptotic behavior of the autocorrelation ρ(τ) of Eq. (14) when the hidden order size L has a Pareto distribution of Eq. (16). We split the integral of Eq. (14) in three parts and we set b = p(1 − p).

The first contribution is

$$-\int_{3/2}^{\infty} \frac{p\alpha}{2\bar{L}L^{\alpha+1}} \sqrt{\frac{2}{\pi}}\,\sqrt{bs}\,\exp\left(-\frac{(ps-1)^2}{2bs}\right) dL. \tag{19}$$

This can be calculated explicitly. It is

$$-\frac{p}{2\bar{L}} \sqrt{\frac{2}{\pi}}\,\sqrt{bs}\,\exp\left(-\frac{(ps-1)^2}{2bs}\right), \tag{20}$$

which asymptotically goes as

$$-\sqrt{s}\,\exp\left(-\frac{ps}{2(1-p)}\right). \tag{21}$$

This decay is very fast due to the exponential term.

The second contribution is

$$\frac{p\alpha}{2\bar{L}} \sqrt{\frac{2}{\pi}}\,\sqrt{bs} \int_{3/2}^{\infty} \frac{\exp\left(-\frac{(L-1-sp)^2}{2bs}\right)}{L^{\alpha+1}}\, dL. \tag{22}$$

This integral cannot be computed analytically. In order to get its asymptotic behavior for large s (i.e. large τ) we make use of the saddle point approximation [22]. To have an idea of the approximation let us consider the case in
which one has to calculate the asymptotic behavior of an integral of the type

$$\int_a^b dx\, e^{N f(x)} \tag{23}$$

for large values of N. If there exists a point $x_0$ in (a, b) which is a maximum for f(x), then we can expand f(x) around $x_0$, yielding

$$e^{N f(x)} \simeq \exp\left[N\left(f(x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2\right)\right], \tag{24}$$

and we can compute the Gaussian integral

$$\int_a^b dx\, e^{N f(x)} \simeq \sqrt{\frac{2\pi}{N\,|f''(x_0)|}}\,\exp(N f(x_0)). \tag{25}$$

The method can be applied also when the integral is not of the form (23), provided that the integrand can be written as exp(f(x, N)). In our case the integral in Eq. (22) can be rewritten as

$$\int_{3/2}^{\infty} \exp\left\{-\left[\frac{(L-1-sp)^2}{2bs} + (\alpha+1)\log L\right]\right\} dL. \tag{26}$$

By applying the saddle point approximation one easily gets for the integral the approximation

$$\sqrt{2\pi bs}\,\exp\left(\frac{1}{4bs}\right)(sp)^{-(\alpha+1)}, \tag{27}$$

and by putting also the prefactor we get for the second contribution

$$(\alpha-1)\,p(1-p)\,\exp\left(\frac{1}{4bs}\right)(sp)^{-\alpha} \sim \frac{1}{\tau^{\alpha}}. \tag{28}$$

Thus the second contribution gives a power law behavior but with an exponent α rather than α − 1.

The third contribution is the one depending on the three erf functions

$$\frac{p\alpha}{2\bar{L}} \int_{3/2}^{\infty} \left[(sp-1)\,\mathrm{erf}\left(\frac{1-sp}{\sqrt{2bs}}\right) + (L-2)\,\mathrm{erf}\left(\frac{1/2+sp}{\sqrt{2bs}}\right) + (1+sp-L)\,\mathrm{erf}\left(\frac{1-L+sp}{\sqrt{2bs}}\right)\right] \frac{dL}{L^{\alpha+1}}. \tag{29}$$

After some algebraic manipulations we can rewrite this term as

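$$\frac{p(\alpha-1)}{2\alpha}\left(\frac{\alpha}{\alpha-1} - 2\right)\left[\mathrm{erf}\left(\frac{1-ps}{\sqrt{2bs}}\right) + \mathrm{erf}\left(\frac{1/2+sp}{\sqrt{2bs}}\right)\right] + \frac{p(\alpha-1)}{2} \int_{3/2}^{\infty} (L-1-ps)\,\mathrm{erf}\left(\frac{1-ps}{\sqrt{2bs}}, \frac{L-1-ps}{\sqrt{2bs}}\right) \frac{1}{L^{\alpha+1}}\, dL, \tag{30}$$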
where $\mathrm{erf}(x_1, x_2) = \mathrm{erf}(x_2) - \mathrm{erf}(x_1)$ [1], and we have used the fact that L̄ = α/(α − 1). The term in square brackets has asymptotic behavior

$$\frac{p(\alpha-1)}{2\alpha}\left(\frac{\alpha}{\alpha-1} - 2\right)\sqrt{\frac{2bs}{\pi}}\,\frac{\exp\left(-\frac{p^2 s}{2b}\right)}{ps}\left(e^{p/2b} - e^{p/b}\right), \tag{31}$$

and it is dominated by the exponential. The result is obtained by using the asymptotic expansion of the erf function.

Finally we compute the asymptotic behavior of the integral in Eq. (30), i.e.

$$I \equiv \int_{3/2}^{\infty} (L-1-ps)\,\mathrm{erf}\left(\frac{1-ps}{\sqrt{2bs}}, \frac{L-1-ps}{\sqrt{2bs}}\right) \frac{1}{L^{\alpha+1}}\, dL. \tag{32}$$

It is convenient to perform first an integration by parts obtaining

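$$I = \left[\frac{1}{L^{\alpha}}\left(\frac{L}{1-\alpha} + \frac{1+ps}{\alpha}\right)\mathrm{erf}\left(\frac{1-ps}{\sqrt{2bs}}, \frac{L-1-ps}{\sqrt{2bs}}\right)\right]_{3/2}^{\infty} - \int_{3/2}^{\infty} \frac{1}{L^{\alpha}}\left(\frac{L}{1-\alpha} + \frac{1+ps}{\alpha}\right)\frac{2}{\sqrt{\pi}\sqrt{2bs}}\exp\left(-\frac{(L-1-ps)^2}{2bs}\right) dL. \tag{33}$$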
The finite term decays exponentially to zero because of the properties of the error function. The asymptotic behavior of the two integrals can be computed with the saddle point method in the same way as Eq. (22). Both decay asymptotically as $s^{-\alpha+1}$ and the final result is

$$\frac{p(\alpha-1)}{2}\, I \sim \frac{1}{\alpha p^{\alpha-2}}\frac{1}{s^{\alpha-1}} \sim \frac{1}{\alpha p^{\alpha-2}}\frac{1}{\tau^{\alpha-1}}, \tag{34}$$

which coincides with Eq. (17).
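As a numerical sanity check of the saddle point step used for the second contribution, the following Python sketch integrates Eq. (22) with the trapezoid rule and compares the result with the closed form of Eq. (28). It uses L̄ = α/(α − 1) as in the text. The parameter values (p = 0.04, α = 1.5, and the two values of s), the integration window of ten standard deviations around L = 1 + sp, and the function names are our own assumptions chosen for illustration.

```python
import math


def second_contribution_numeric(s, p, alpha, n_points=20001, width=10.0):
    """Trapezoid-rule value of Eq. (22), with L-bar = alpha / (alpha - 1).

    The Gaussian factor is negligible outside `width` standard deviations
    around L = 1 + s*p, so the integral is truncated to that window
    (an assumption that is reasonable only for large s).
    """
    b = p * (1.0 - p)
    sigma = math.sqrt(b * s)
    center = 1.0 + s * p
    lo = max(1.5, center - width * sigma)
    hi = center + width * sigma
    h = (hi - lo) / (n_points - 1)

    def integrand(L):
        gauss = math.exp(-((L - 1.0 - s * p) ** 2) / (2.0 * b * s))
        return gauss / L ** (alpha + 1.0)

    total = 0.5 * (integrand(lo) + integrand(hi))
    total += sum(integrand(lo + i * h) for i in range(1, n_points - 1))
    integral = total * h
    # p * alpha / (2 * L-bar) = p * (alpha - 1) / 2
    prefactor = 0.5 * p * (alpha - 1.0) * math.sqrt(2.0 / math.pi) * sigma
    return prefactor * integral


def second_contribution_saddle(s, p, alpha):
    """Closed form of Eq. (28)."""
    b = p * (1.0 - p)
    return (alpha - 1.0) * b * math.exp(1.0 / (4.0 * b * s)) * (s * p) ** (-alpha)


p, alpha = 0.04, 1.5
ratio_1 = second_contribution_numeric(5000, p, alpha) / second_contribution_saddle(5000, p, alpha)
ratio_2 = second_contribution_numeric(20000, p, alpha) / second_contribution_saddle(20000, p, alpha)
print(ratio_1, ratio_2)
```

Both ratios should be close to one and should approach one as s grows, which is what an asymptotic approximation of this kind is expected to do.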
[1] M. Abramowitz. Handbook of mathematical functions, with formulas, graphs and mathematical tables. Dover Publications, 1974.
[2] J-P. Bouchaud, Y. Gefen, M. Potters, and M. Wyart. Fluctuations and response in financial markets: The subtle nature of “random” price changes. Quantitative Finance, 4(2):176–190, 2004.
[3] J-P. Bouchaud, J. Kockelkoren, and M. Potters. Random walks, liquidity molasses and critical response in financial markets. Technical report, Science and Finance, 2004.
[4] F. J. Breidt, N. Crato, and P. J. F. de Lima. Modeling long-memory stochastic volatility. Working paper, Johns Hopkins University, 1993.
[5] S. V. Buldyrev, A. L. Goldberger, S. Havlin, C-K. Peng, M. Simons, and H. E. Stanley. Physical Review E, 47:4514, 1993.
[6] L. K. C. Chan and J. Lakonishok. Institutional trades and intraday stock price behavior. Journal of Financial Economics, 33:173–199, 1993.
[7] L. K. C. Chan and J. Lakonishok. The behavior of stock prices around institutional trades. The Journal of Finance, 50(4):1147–1174, 1995.
[8] Z. Ding, C. W. J. Granger, and R. F. Engle. A long memory property of stock returns and a new model. Journal of Empirical Finance, 1:83, 1993.
[9] J. P. Embrechts, C. Kluppelberg, and T. Mikosch. Modeling Extremal Events for Insurance and Finance. Springer-Verlag, Berlin, 1997.
[10] J. D. Farmer, L. Gillemot, F. Lillo, S. Mike, and A. Sen. What really causes large price changes? Working paper 04-02-006, Santa Fe Institute, 2004. To appear in Quantitative Finance, August 2004.
[11] W. Feller. An Introduction to Probability Theory and its Applications, volume 1. Wiley and Sons, third edition, 1950.
[12] X. Gabaix, P. Gopikrishnan, V. Plerou, and H. E. Stanley. A theory of power-law distributions in financial market fluctuations. Nature, 423:267–270, 2003.
[13] X. Gabaix, P. Gopikrishnan, V. Plerou, and H. E. Stanley. A theory of large fluctuations in stock market activity. Technical report, MIT Economics Dept. and NBER, Oct. 20 2004.
[14] T. Geisel, J. Nierwetberg, and A. Zacherl. Physical Review Letters, 54:616, 1985.
[15] P. Gopikrishnan, V. Plerou, X. Gabaix, and H. E. Stanley. Statistical properties of share volume traded in financial markets. Physical Review E, 62(4):R4493–R4496, 2000. Part A.
[16] B. M. Hill. A simple general approach to inference about the tail of a distribution. The Annals of Statistics, 3(5):1163–1174, 1975.
[17] M. Levy. Market efficiency, the Pareto wealth distribution, and the distribution of stock returns. In S. N. Durlauf and L. E. Blume, editors, The Economy as an Evolving Complex System III. Oxford University Press, 2005.
[18] M. Levy and S. Solomon. Power laws are logarithmic Boltzmann laws. International Journal of Modern Physics C, 7(4):595–601, 1996.
[19] F. Lillo and J. D. Farmer. The long memory of the efficient market. Studies in Nonlinear Dynamics & Econometrics, 8(3), 2004.
[20] I. N. Lobato and C. Velasco. Long-memory in stock-market trading volume. Journal of Business & Economic Statistics, 18:410–427, 2000.
[21] B. Mandelbrot. Long-run linearity, locally Gaussian processes, H-spectra, and infinite variances. International Economic Review, 10:82–113, 1969.
[22] F. W. J. Olver. Asymptotics and special functions. Academic Press, 1974.
[23] A. Ott, J-P. Bouchaud, D. Langevin, and W. Urbach. Anomalous diffusion in “living polymers”: A genuine Levy flight? Physical Review Letters, 65:2201–2204, 1990.
[24] C-K. Peng, S. V. Buldyrev, A. L. Goldberger, S. Havlin, F. Sciortino, M. Simons, and H. E. Stanley. Long-range correlations in nucleotide sequences. Nature, 365:168–170, 1992.
[25] C-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E. Stanley, and A. L. Goldberger. Mosaic organization of DNA nucleotides. Physical Review E, 49(2):1685–1689, 1994.
[26] L. F. Richardson. Proceedings of the Royal Society of London, Serial A, 110:709, 1926.
[27] M. S. Taqqu and Joshua B. Levy. Using renewal processes to generate long-range dependence and high variability. In E. Eberlein and M. S. Taqqu, editors, Dependence in Probability and Statistics, pages 73–89. Birkhäuser, Boston, 1986.
