Machine Learning Paper - 6
Recent empirical studies have demonstrated long-memory in the signs of orders to buy or sell in
financial markets [2, 19]. We show how this can be caused by delays in market clearing. Under the
common practice of order splitting, large orders are broken up into pieces and executed incrementally.
If the size of such large orders is power law distributed, this gives rise to power law decaying
autocorrelations in the signs of executed orders. More specifically, we show that if the cumulative
distribution of large orders of volume $v$ is proportional to $v^{-\alpha}$ and the size of executed orders is
constant, the autocorrelation of order signs as a function of the lag $\tau$ is asymptotically proportional
to $\tau^{-(\alpha-1)}$. This is a long-memory process when $\alpha < 2$. With a few caveats, this gives a good match
to the data. A version of the model also shows long-memory fluctuations in order execution rates,
which may be relevant for explaining the long-memory of price diffusion rates.
...existing hidden order $i$ is chosen at random with uniform probability, and a volume $\Delta v$ of that order is removed, so that $v_i(t+1) = v_i(t) - \Delta v$. This generates a revealed order of volume $\Delta v$ and sign $x_t = s_i$. A hidden order $i$ is removed if $v_i(t+1) = 0$. Thus, the number of hidden orders $N(t)$ fluctuates in time, depending on fluctuations in arrival and removal.

The fixed $N$ model is the same, except that the number of hidden orders $N$ is kept fixed. Thus, if a hidden order is removed it is immediately replaced by a new one with a random sign and a new size.

The main result of this paper is the calculation of the autocorrelation function of revealed order signs $x_t$ for the fixed $N$ model. We show in the next section that the tail of the autocorrelation function asymptotically scales as $\tau^{-(\alpha-1)}$. While varying $N$ affects the shape of the autocorrelation function for small $\tau$, providing $\alpha$ is held fixed, it does not affect its asymptotic scaling. Even though $N(t)$ varies in the $\lambda$ model, the asymptotic behavior is independent of $N(t)$, and so the asymptotic behavior of the autocorrelation function is the same. This is particularly convenient because it allows us to make a prediction in terms of observable quantities (see Section V).

...compute the correct prefactor.

A. Autocorrelation in probabilistic terms

Under the convention that the signs of the revealed orders are $x_t = \pm 1$, because of the symmetry between buying and selling $E[x_t] = 0$ and $E[x_t^2] = 1$, where $E$ denotes the expectation. Therefore the autocorrelation is simply $\rho(\tau) = E[x_t x_{t+\tau}]$. We can rewrite this as

$$E[x_t x_{t+\tau}] = \sum_{L=1}^{\infty} Q(L)\, E[x_t x_{t+\tau} \mid L], \tag{1}$$

where $E[x_t x_{t+\tau} \mid L]$ is conditioned on the hidden order that generated $x_t$ having length $L$. $Q(L)$ is the probability that a revealed order drawn at random comes from a hidden order of length $L$. Let $q(\tau|L)$ be the probability that revealed orders at times $t$ and time $t+\tau$ came from the same hidden order, given that it has original length $L$. Because $E[x_t x_{t+\tau}] = 0$ if $x_t$ and $x_{t+\tau}$ came from different hidden orders, and $E[x_t x_{t+\tau}] = 1$ if they came from the same hidden order, the conditional expectation can be rewritten
than a given value $k$. Thus, for a hidden order that has length $l$ at time $t$, the probability that it still exists at time $t+\tau$ is $P_{\tau-1}(s < l)$. For a hidden order with original length $L$, $l$ is uniformly distributed with probability $1/L$ over the values $1, \ldots, L$. Thus we can express $w(L, \tau)$ as a sum of probabilities, one for each possible value of $l$.

$$w(L,\tau) = \frac{1}{L}\left(P_{\tau-1}(s < L-1) + P_{\tau-1}(s < L-2) + \ldots + P_{\tau-1}(s < 1)\right). \tag{7}$$

The probabilities $P_{\tau-1}(s < k)$ can be expressed as sums of binomial probabilities, corresponding to the possible sequences with which a given hidden order generates $k-1$ revealed orders.

$$P_{\tau-1}(s < k) = \sum_{h=0}^{k-1} \binom{\tau-1}{h} p^h (1-p)^{\tau-1-h}. \tag{8}$$

Therefore

$$q(\tau|L) = \frac{p}{L} \sum_{j=1}^{L-2} \sum_{h=0}^{j} \binom{\tau-1}{h} p^h (1-p)^{\tau-1-h}. \tag{9}$$

B. de Moivre-Laplace approximation

The autocorrelation can now be computed using Eq. (5). However, since the sums of binomial coefficients are difficult to manage we will make use of the de Moivre-Laplace approximation [11]. For $npq \gg 1$ one can approximate

$$\binom{n}{k} p^k q^{n-k} \simeq \frac{1}{\sqrt{2\pi npq}} \exp\left(-\frac{(k-np)^2}{2npq}\right). \tag{10}$$

As a consequence the sum of consecutive terms of a binomial distribution can be approximated as

$$\sum_{k=k_1}^{k_2} \binom{n}{k} p^k q^{n-k} \simeq \frac{1}{2}\left[\mathrm{erf}\left(\frac{k_2 - np + 1/2}{\sqrt{2npq}}\right) - \mathrm{erf}\left(\frac{k_1 - np - 1/2}{\sqrt{2npq}}\right)\right], \tag{11}$$

where erf is the error function.

By converting the sum to an integral, and letting $s = \tau - 1$, equation (9) becomes
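$$q(s+1|L) \simeq \frac{p}{2L} \sum_{j=1}^{L-2} \left[\mathrm{erf}\left(\frac{j - sp + 1/2}{\sqrt{2sp(1-p)}}\right) - \mathrm{erf}\left(\frac{-sp - 1/2}{\sqrt{2sp(1-p)}}\right)\right] \simeq \frac{p}{2L} \int_{1/2}^{L-2+1/2} \left[\mathrm{erf}\left(\frac{x - sp + 1/2}{\sqrt{2sp(1-p)}}\right) - \mathrm{erf}\left(\frac{-sp - 1/2}{\sqrt{2sp(1-p)}}\right)\right] dx, \tag{12}$$

Carrying out the integral gives

$$q(s+1|L) \simeq \frac{p}{2L} \Bigg( \sqrt{\frac{2}{\pi}} \sqrt{sp(1-p)} \left[ -\exp\left(-\frac{(sp-1)^2}{2sp(1-p)}\right) + \exp\left(-\frac{(L-1-sp)^2}{2sp(1-p)}\right) \right] + (sp-1)\,\mathrm{erf}\left(\frac{1-sp}{\sqrt{2sp(1-p)}}\right) + (L-2)\,\mathrm{erf}\left(\frac{1/2+sp}{\sqrt{2sp(1-p)}}\right) + (1+sp-L)\,\mathrm{erf}\left(\frac{1-L+sp}{\sqrt{2sp(1-p)}}\right) \Bigg). \tag{13}$$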
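The closed form of Eq. (13) can be checked numerically against the exact double sum of Eq. (9). The following is a minimal sketch using only the Python standard library; the function names `q_exact` and `q_approx` and the parameter values are illustrative choices, not taken from the paper.

```python
import math

def q_exact(tau, L, p):
    """Exact double sum of Eq. (9) for lag tau, original length L, probability p."""
    n = tau - 1
    total = 0.0
    cdf = 0.0
    for h in range(L - 1):  # h = 0 .. L-2
        # math.comb returns 0 when h > n, so terms beyond n vanish
        cdf += math.comb(n, h) * p**h * (1 - p) ** (n - h)
        if h >= 1:  # the outer sum over j starts at j = 1
            total += cdf
    return p / L * total

def q_approx(tau, L, p):
    """Closed form of Eq. (13), with s = tau - 1 (requires tau > 1)."""
    s = tau - 1
    var = s * p * (1 - p)
    c = math.sqrt(2 * var)
    gauss = math.sqrt(2 / math.pi) * math.sqrt(var) * (
        -math.exp(-((s * p - 1) ** 2) / (2 * var))
        + math.exp(-((L - 1 - s * p) ** 2) / (2 * var))
    )
    erfs = (
        (s * p - 1) * math.erf((1 - s * p) / c)
        + (L - 2) * math.erf((0.5 + s * p) / c)
        + (1 + s * p - L) * math.erf((1 - L + s * p) / c)
    )
    return p / (2 * L) * (gauss + erfs)

if __name__ == "__main__":
    # Illustrative values: p = 0.2, L = 60, increasing lag
    for tau in (3, 21, 201):
        print(tau, q_exact(tau, 60, 0.2), q_approx(tau, 60, 0.2))
```

The agreement is expected to improve as $(\tau-1)p(1-p)$ grows, since that is the regime in which the de Moivre-Laplace approximation holds.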
The sum over $L$ in Eq. (5) can be approximated by the integral

$$\rho(\tau) \simeq \int_{1+1/2}^{\infty} \frac{p(L)\,L}{\bar{L}}\, q(\tau|L)\, dL. \tag{14}$$

Finally, we need to translate the domain of validity of the de Moivre-Laplace approximation into more relevant terms. The condition $npq \gg 1$ in Eq. (9) becomes $(\tau - 1)p(1-p) \gg 1$. This leads to the condition

$$\tau \gg \frac{N^2}{N-1} - 1 \simeq N, \tag{15}$$

i.e. the approximation is valid as long as the lag is much greater than the number of hidden orders. Since the number of hidden orders is fixed, the approximation is always valid for sufficiently large $\tau$.
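The validity condition can be illustrated by measuring how far the erf approximation of Eq. (11) is from the exact binomial sum as the lag grows past the number of hidden orders. This is a rough sketch only: it assumes $p = 1/N$ (one of $N$ hidden orders chosen uniformly at each step), and the helper names and the choice of summing from 0 up to the mean are illustrative.

```python
import math

def binomial_sum(n, p, k1, k2):
    """Exact sum of binomial probabilities for k = k1 .. k2."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k1, k2 + 1))

def erf_sum(n, p, k1, k2):
    """de Moivre-Laplace approximation of the same sum, Eq. (11)."""
    c = math.sqrt(2 * n * p * (1 - p))
    return 0.5 * (math.erf((k2 - n * p + 0.5) / c) - math.erf((k1 - n * p - 0.5) / c))

def laplace_error(N, tau):
    """Absolute error of Eq. (11) for n = tau - 1 trials, assuming p = 1/N."""
    n, p = tau - 1, 1.0 / N
    k2 = int(n * p)  # sum from 0 up to (roughly) the mean
    return abs(binomial_sum(n, p, 0, k2) - erf_sum(n, p, 0, k2))

if __name__ == "__main__":
    N = 10
    for tau in (5, 11, 101, 1001):  # lags below, near, and well above N
        print(tau, (tau - 1) * (1 / N) * (1 - 1 / N), laplace_error(N, tau))
```

The second printed column is $(\tau-1)p(1-p)$, so the output shows the error alongside the quantity that the condition requires to be large.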
C. Pareto distribution
$$\rho(\tau) \sim \frac{N^{\alpha-2}}{\alpha}\, \tau^{-(\alpha-1)}. \tag{17}$$

This result indicates that the autocorrelation function decays as a power law with exponent $\gamma = \alpha - 1$. The number of hidden orders affects the prefactor, but does not affect the scaling exponent. Interestingly, when $\alpha = 2$ the prefactor is independent of $N$. When $\alpha < 2$ it is a decreasing function of $N$, and when $\alpha > 2$ it is an increasing function of $N$. The value $\alpha = 2$ separates the regime where the size of hidden orders has infinite variance from the regime where the variance is finite.²

Fig. 2 compares the autocorrelation function predicted by Eq. (17) to a simulation for $\alpha = 1.5$, $N = 1$, $N = 5$, and $N = 50$. For large values of $\tau$ the match is excellent, both in terms of the slope and the size of the prefactor. For $N = 1$ the prediction matches the simulation across the entire range of $\tau$. As expected, when $N$ increases the prediction deviates at small $\tau$, but still matches for large $\tau$. We have also checked the consequences of varying $\alpha$ and find that the prefactor behaves as predicted by Eq. (17).

Note that we used $T = 10^9$ samples to simulate the model and compare to theory. This is because for $\alpha = 1.5$ this is a strongly long-memory process, and the convergence is extremely slow. This will become an issue later on when we test the model against real data – even for very large sample sizes the error bars remain quite large.

² Note that Buldyrev et al. [5] found a similar formula in the context of structure in DNA sequences.

We now return to discuss the $\lambda$ model. As a reminder, this differs from the fixed $N$ model analyzed so far in that the number of buffers $N(t)$ is not fixed. Instead, new buffers are added with probability $\lambda$ when $N(t) > 0$, and probability 1 otherwise. For the mean of $N(t)$ to remain bounded it is necessary that the rate of creation of new orders equal the rate at which they are removed. This implies the model has a critical threshold where $E[N(t)] \to \infty$. This can be simply computed as follows: Let $n(t)$ be the total number of future revealed orders stored in all hidden orders at time $t$, i.e. $n(t) = \sum_{i=1}^{N(t)} v_i(t)/\Delta v$. The average rate of change of $n(t)$ is

$$E[n(t+1) - n(t)] = R(n(t))\,\bar{L} - 1.$$

The first term represents addition of a new hidden order, and the second term the removal of a revealed order at every timestep. The creation rate $R(n(t)) = \lambda$ when $n(t) > 0$ and $R(n(t)) = 1$ otherwise. The average length of a new hidden order is $\bar{L}$, which under the Pareto assumption is $\bar{L} = \sum_{L=1}^{\infty} L\,p(L) = \alpha/(\alpha - 1)$. In the limit where $E[n(t)]$ is large it is a good approximation to say that $n(t)$ is never zero, so that $R(n(t)) = \lambda$. Setting $E[n(t+1) - n(t)] = 0$ implies the critical value $\lambda_c$ is

$$\lambda_c = 1/\bar{L} = (\alpha - 1)/\alpha = \gamma/\alpha. \tag{18}$$

For the last equality we have made use of the fact that $\gamma$ does not depend on $N$ in Eq. (17), which indicates that $\gamma = \alpha - 1$ applies equally well to the $\lambda$ model as long as $\lambda < \lambda_c$ (we have verified this in simulations). We also
C. Run length
of the asymptotic scaling behavior on $N$, we do not think this will affect $\gamma$.

• Feedback between order execution and order generation. In our model we do not worry about whether revealed orders are actually executed. In reality many revealed orders may never be executed. In this case there may be feedback effects, i.e., if an order is not executed the hidden order size is not decreased, and consequently may result in the generation of additional revealed orders when the agent tries again. We cannot say with certainty that such effects are not important. However, one piece of relevant evidence is that within statistical error the same scaling is observed for market orders, limit orders, and cancellations [19]. Since market orders are by definition executed immediately, this suggests that such feedback effects are of minor importance.

VI. DISCUSSION

We have presented and solved a rather idealized model of the long-memory of order flow which was designed to yield tractable results. As detailed in the preceding section, many of its assumptions are not strictly true. At the very least, though, it illustrates how two apparently disparate phenomena may be linked together, and makes quantitative predictions about their relationship. Because we lack the proper data to test the model, we have used an imperfect proxy to test the model. The model passes this test. However, it would be nice to do a more definitive test, based on a data set that more closely characterizes the dichotomy between hidden and revealed orders. Even if the model is not strictly true, the model could potentially be extended to include more realistic assumptions, such as a non-trivial distribution of revealed order sizes.

The long-memory of order signs is interesting for its own sake, but it may also have more profound effects on other aspects of the market. The persistent autocorrelation function associated with a long-memory process implies a high degree of predictability by just constructing a simple linear time series model (see refs. [2, 19]). Since buy orders tend to generate a positive price response, and sell orders tend to generate a negative price response, all other things being equal this would translate into easily exploitable predictable movements in prices. In order to prevent this from happening, other features of the market have to adjust to compensate. Such features include the size of buy vs. sell orders, the volume of unexecuted orders at the best prices, and many other aspects of the market [2, 3, 19]. Market participants do not behave out of philanthropic motives; presumably these effects all come about due to the application of profit-making strategies. It is not at all obvious what these strategies are, and how they combine to eliminate this inefficiency.

The market response to the long-memory of order flow is an interesting example of a self-organized collective phenomenon. It may be one of the causes of other important properties of prices, such as the long-memory in their diffusion rate. We have demonstrated that the $\lambda$ model, which allows fluctuations in the number of hidden orders, automatically generates fluctuations in liquidity. This is known to affect price diffusion rates [10]. The independence on the number of hidden orders, which was not obvious to us before doing the calculation, is a convenient property of our result that makes it possible to test the model based on information that can be feasibly gathered. This is thus a falsifiable model.

ACKNOWLEDGMENTS

FL thanks partial funding support from research projects MIUR 449/97 “Dinamica di altissima frequenza nei mercati finanziari” and MIUR-FIRB RBNE01CW3M. SM and JDF would like to thank Credit Suisse First Boston, the McDonnell Foundation, Bill Miller, and Bob Maxfield for funding this work.

APPENDIX

In this appendix we evaluate the asymptotic behavior of the autocorrelation $\rho(\tau)$ of Eq. (14) when the hidden order size $L$ has a Pareto distribution of Eq. (16). We split the integral of Eq. (14) in three parts and we set $b = p(1-p)$.

The first contribution is

$$-\int_{3/2}^{\infty} \frac{p\alpha}{2\bar{L} L^{\alpha+1}} \sqrt{\frac{2}{\pi}} \sqrt{bs}\, \exp\left(-\frac{(ps-1)^2}{2bs}\right) dL. \tag{19}$$

This can be calculated explicitly. It is

$$-\frac{p}{2\bar{L}} \sqrt{\frac{2}{\pi}} \sqrt{bs}\, \exp\left(-\frac{(ps-1)^2}{2bs}\right), \tag{20}$$

which asymptotically goes as

$$-\sqrt{s}\, \exp\left(-\frac{ps}{2(1-p)}\right). \tag{21}$$

This decay is very fast due to the exponential term.

The second contribution is

$$\frac{p\alpha}{2\bar{L}} \sqrt{\frac{2}{\pi}} \sqrt{bs} \int_{3/2}^{\infty} \frac{\exp\left(-\frac{(L-1-sp)^2}{2bs}\right)}{L^{\alpha+1}}\, dL. \tag{22}$$

This integral cannot be computed analytically. In order to get its asymptotic behavior for large $s$ (i.e. large $\tau$) we make use of the saddle point approximation [22]. To have an idea of the approximation let us consider the case in
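which one has to calculate the asymptotic behavior of an integral of the type

$$\int_a^b dx\, e^{N f(x)} \tag{23}$$

for large values of $N$. If there exists a point $x_0$ in $(a, b)$ which is a maximum for $f(x)$, then we can expand $f(x)$ around $x_0$, yielding

$$e^{N f(x)} \simeq \exp\left[N\left(f(x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2\right)\right], \tag{24}$$

and we can compute the Gaussian integral

$$\int_a^b dx\, e^{N f(x)} \simeq \sqrt{\frac{2\pi}{N |f''(x_0)|}}\, \exp(N f(x_0)). \tag{25}$$

The method can be applied also when the integral is not of the form (23), given that the integrand can be written as $\exp(f(x, N))$. In our case the integral in Eq. (22) can be rewritten as

$$\int_{3/2}^{\infty} \exp\left[-\frac{(L-1-sp)^2}{2bs} - (\alpha+1) \log L\right] dL. \tag{26}$$

By applying the saddle point approximation one easily gets for the integral the approximation

$$\sqrt{2\pi bs}\, \exp\left(\frac{1}{4bs}\right) (sp)^{-(\alpha+1)}, \tag{27}$$

and by putting also the prefactor we get for the second contribution

$$(\alpha-1)\,p(1-p)\, \exp\left(\frac{1}{4bs}\right) (sp)^{-\alpha} \sim \frac{1}{\tau^{\alpha}}. \tag{28}$$

Thus the second contribution gives a power law behavior but with an exponent $\alpha$ rather than $\alpha - 1$.

The third contribution is the one depending on the three erf functions

$$\frac{p\alpha}{2\bar{L}} \int_{3/2}^{\infty} \left[ (sp-1)\,\mathrm{erf}\left(\frac{1-sp}{\sqrt{2bs}}\right) + (L-2)\,\mathrm{erf}\left(\frac{1/2+sp}{\sqrt{2bs}}\right) + (1+sp-L)\,\mathrm{erf}\left(\frac{1-L+sp}{\sqrt{2bs}}\right) \right] \frac{1}{L^{\alpha+1}}\, dL. \tag{29}$$

After some algebraic manipulations we can rewrite this term as

$$\frac{p(\alpha-1)}{2\alpha} \left(\frac{\alpha}{\alpha-1} - 2\right) \left[\mathrm{erf}\left(\frac{1-ps}{\sqrt{2bs}}\right) + \mathrm{erf}\left(\frac{1/2+sp}{\sqrt{2bs}}\right)\right] + \frac{p(\alpha-1)}{2} \int_{3/2}^{\infty} (L-1-ps)\, \mathrm{erf}\left(\frac{1-ps}{\sqrt{2bs}}, \frac{L-1-ps}{\sqrt{2bs}}\right) \frac{1}{L^{\alpha+1}}\, dL, \tag{30}$$

where the two-argument error function is $\mathrm{erf}(z_1, z_2) = \mathrm{erf}(z_2) - \mathrm{erf}(z_1)$. Denoting the integral in Eq. (30) by $I$ and integrating by parts gives

$$I = \left[\frac{1}{L^{\alpha}} \left(\frac{L}{1-\alpha} + \frac{1+ps}{\alpha}\right) \mathrm{erf}\left(\frac{1-ps}{\sqrt{2bs}}, \frac{L-1-ps}{\sqrt{2bs}}\right)\right]_{3/2}^{\infty} - \int_{3/2}^{\infty} \frac{1}{L^{\alpha}} \left(\frac{L}{1-\alpha} + \frac{1+ps}{\alpha}\right) \frac{2}{\sqrt{\pi}} \frac{1}{\sqrt{2bs}} \exp\left(-\frac{(L-1-ps)^2}{2bs}\right) dL. \tag{33}$$

The finite term decays exponentially to zero because of the properties of the error function. The asymptotic behavior of the two integrals can be computed with the saddle point method in the same way as Eq. (22). Both decay asymptotically as $s^{-\alpha+1}$ and the final result is

$$\frac{p(\alpha-1)}{2}\, I \sim \frac{1}{\alpha p^{\alpha-2}} \frac{1}{s^{\alpha-1}} \sim \frac{1}{\alpha p^{\alpha-2}} \frac{1}{\tau^{\alpha-1}}, \tag{34}$$

which coincides with Eq. (17).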
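As a numerical illustration of this asymptotic result, the fixed $N$ model can be simulated directly and its sign autocorrelation compared with the power law of Eq. (17). The sketch below rests on stated assumptions: hidden order lengths are drawn from a discrete Pareto law with $P(L \ge l) = l^{-\alpha}$, which may differ in detail from the distribution of Eq. (16), so only the scaling, not the exact prefactor, should be expected to agree. The function names and the sample size are illustrative.

```python
import random

def simulate_signs(N, alpha, T, seed=0):
    """Simulate T revealed order signs from the fixed-N model."""
    rng = random.Random(seed)

    def new_order():
        # Assumption: discrete Pareto length with P(L >= l) = l**(-alpha), l = 1, 2, ...
        u = 1.0 - rng.random()  # uniform in (0, 1]
        length = int(u ** (-1.0 / alpha))
        return [rng.choice((-1, 1)), length]  # [sign, remaining length]

    orders = [new_order() for _ in range(N)]
    signs = []
    for _ in range(T):
        i = rng.randrange(N)       # pick a hidden order uniformly at random
        order = orders[i]
        signs.append(order[0])     # reveal one unit with the hidden order's sign
        order[1] -= 1
        if order[1] == 0:          # exhausted: replace immediately (fixed N)
            orders[i] = new_order()
    return signs

def autocorr(signs, lag):
    """Sample estimate of E[x_t x_{t+lag}], using E[x_t] = 0 by symmetry."""
    n = len(signs) - lag
    return sum(signs[t] * signs[t + lag] for t in range(n)) / n

def predicted(N, alpha, tau):
    """Asymptotic prediction of Eq. (17)."""
    return N ** (alpha - 2) / alpha * tau ** (-(alpha - 1))

if __name__ == "__main__":
    signs = simulate_signs(N=1, alpha=1.5, T=200_000, seed=1)
    for tau in (1, 10, 100):
        print(tau, autocorr(signs, tau), predicted(1, 1.5, tau))
```

Because the process has long memory, estimates from a run of this length are noisy at large lags; far longer series are needed for a quantitative comparison.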
[1] M. Abramowitz. Handbook of mathematical functions, with formulas, graphs and mathematical tables. Dover Publications, 1974.
[2] J-P. Bouchaud, Y. Gefen, M. Potters, and M. Wyart. Fluctuations and response in financial markets: The subtle nature of “random” price changes. Quantitative Finance, 4(2):176–190, 2004.
[3] J-P. Bouchaud, J. Kockelkoren, and M. Potters. Random walks, liquidity molasses and critical response in financial markets. Technical report, Science and Finance, 2004.
[4] F. J. Breidt, N. Crato, and P. J. F. de Lima. Modeling long-memory stochastic volatility. Working paper, Johns Hopkins University, 1993.
[5] S. V. Buldyrev, A. L. Goldberger, S. Havlin, C-K. Peng, M. Simons, and H. E. Stanley. Physical Review E, 47:4514, 1993.
[6] L. K. C. Chan and J. Lakonishok. Institutional trades and intraday stock price behavior. Journal of Financial Economics, 33:173–199, 1993.
[7] L. K. C. Chan and J. Lakonishok. The behavior of stock prices around institutional trades. The Journal of Finance, 50(4):1147–1174, 1995.
[8] Z. Ding, C. W. J. Granger, and R. F. Engle. A long memory property of stock returns and a new model. Journal of Empirical Finance, 1:83, 1993.
[9] J. P. Embrechts, C. Kluppelberg, and T. Mikosch. Modeling Extremal Events for Insurance and Finance. Springer-Verlag, Berlin, 1997.
[10] J. D. Farmer, L. Gillemot, F. Lillo, S. Mike, and A. Sen. What really causes large price changes? Working paper 04-02-006, Santa Fe Institute, 2004. To appear in Quantitative Finance, August 2004.
[11] W. Feller. An Introduction to Probability Theory and its Applications, volume 1. Wiley and Sons, third edition, 1950.
[12] X. Gabaix, P. Gopikrishnan, V. Plerou, and H. E. Stanley. A theory of power-law distributions in financial market fluctuations. Nature, 423:267–270, 2003.
[13] X. Gabaix, P. Gopikrishnan, V. Plerou, and H. E. Stanley. A theory of large fluctuations in stock market activity. Technical report, MIT Economics Dept. and NBER, Oct. 20 2004.
[14] T. Geisel, J. Nierwetberg, and A. Zacherl. Physical Review Letters, 54:616, 1985.
[15] P. Gopikrishnan, V. Plerou, X. Gabaix, and H. E. Stanley. Statistical properties of share volume traded in financial markets. Physical Review E, 62(4):R4493–R4496, 2000. Part A.
[16] B. M. Hill. A simple general approach to inference about the tail of distribution. The Annals of Statistics, 3(5):1163–1174, 1975.
[17] M. Levy. Market efficiency, the pareto wealth distribution, and the distribution of stock returns. In S. N. Durlauf and L. E. Blume, editors, The Economy as an Evolving Complex System III. Oxford University Press, 2005.
[18] M. Levy and S. Solomon. Power laws are logarithmic boltzmann laws. International Journal of Modern Physics C, 7(4):595–601, 1996.
[19] F. Lillo and J. D. Farmer. The long memory of the efficient market. Studies in Nonlinear Dynamics & Econometrics, 8(3), 2004.
[20] I. N. Lobato and C. Velasco. Long-memory in stock-market trading volume. Journal of Business & Economic Statistics, 18:410–427, 2000.
[21] B. Mandelbrot. Long-run linearity, locally gaussian processes, h-spectra, and infinite variances. International Economic Review, 10:82–113, 1969.
[22] F. W. J. Olver. Asymptotics and special functions. Academic Press, 1974.
[23] A. Ott, J-P. Bouchaud, D. Langevin, and W. Urbach. Anomalous diffusion in “living polymers”: A genuine levy flight? Physical Review Letters, 65:2201–2204, 1990.
[24] C-K. Peng, S. V. Buldyrev, A. L. Goldberger, S. Havlin, F. Sciortino, M. Simons, and H. E. Stanley. Long-range correlations in nucleotide sequences. Nature, 365:168–170, 1992.
[25] C-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E. Stanley, and A. L. Goldberger. Mosaic organization of dna nucleotides. Physical Review E, 49(2):1685–1689, 1994.
[26] L. F. Richardson. Proceedings of the Royal Society of London, Serial A, 110:709, 1926.
[27] M. S. Taqqu and Joshua B. Levy. Using renewal processes to generate long-range dependence and high variability. In E. Eberlein and M. S. Taqqu, editors, Dependence in Probability and Statistics, pages 73–89. Birkhäuser, Boston, 1986.