Lectures On Integrable Probability
Abstract
These are lecture notes for a mini-course given at the St. Petersburg School in
Probability and Statistical Physics in June 2012. Topics include integrable models of
random growth, determinantal point processes, Schur processes and Markov dynamics on them, Macdonald processes and their application to asymptotics of directed
polymers in random media.
Preface
These lectures are about probabilistic systems that can be analyzed by essentially algebraic methods.
The historically first example of such a system goes back to De Moivre (1738) and
Laplace (1812) who considered the problem of finding the asymptotic distribution of a sum
of i.i.d. random variables for Bernoulli trials, when the pre-limit distribution is explicit,
and took the limit of the resulting expression. While this computation may look like a simple exercise when viewed from the heights of modern probability, in its time it likely served the role of a key stepping stone: first rigorous proofs of central limit theorems appeared only in the beginning of the XXth century.
At the moment we are arguably in a De Moivre-Laplace stage for a certain class
of stochastic systems which is often referred to as the KPZ universality class, after an
influential work of Kardar-Parisi-Zhang in the mid-80s. We will be mostly interested in the case of one space dimension. The class includes models of random growth that have built-in mechanisms of smoothing and lateral growth, as well as directed polymers in space-time uncorrelated random media and driven diffusive lattice gases.
While the class and some of its members have been identified by physicists, the first
examples of convincing (actually, rigorous) analysis were provided by mathematicians,
who were also able to identify the distributions that play the role of the Gaussian law.
Nowadays, they are often referred to as the Tracy-Widom type distributions, as they had previously appeared in Tracy-Widom's work on spectra of large random matrices.
The reason for mathematicians' success was that there is an unusually extensive amount of algebra and combinatorics required to gain access to suitable pre-limit formulas that admit large time limit transitions. As we will argue below, the solvable or
integrable members of the class should be viewed as projections of much more powerful
objects whose origins lie in representation theory. In a way, this is similar to integrable
systems that can also be viewed as projections of representation theoretic objects; and this
is one reason we use the words "integrable probability" to describe the phenomenon. There are also much more direct links between integrable systems and integrable probability, some of which we mention below.
The goal of these notes is not to give a survey of a variety of integrable probabilistic
models that are known by now (there are quite a few, and not all of them are members
of the KPZ universality class), but rather to give a taste of them by outlining some of the
algebraic mechanisms that lead to their solution.
The notes are organized as follows.
In Section 1 we give a brief and non-exhaustive overview of the integrable members of
the KPZ universality class in (1+1) dimensions.
In Section 2 we provide the basics of the theory of symmetric functions, which may be seen as the language of classical representation theory.
In Section 3 we discuss determinantal random point processes, a fairly recent class
of point processes that proved very effective for the analysis of growth models and also
for the analysis of integrable probabilistic models of random matrix type.
Section 4 explains a link between a particular random growth model (known as the
polynuclear growth process or PNG) and the so-called Plancherel measures on partitions
that originates from representation theory of symmetric groups.
In Section 5 we introduce the general class of Schur measures, which includes the Plancherel measures; members of this class can be viewed as determinantal point processes, which provides a key to their analysis. We also perform such an analysis in the case
of the Plancherel measure, thus providing a proof of the celebrated Baik-Deift-Johansson
theorem on asymptotics of longest increasing subsequences of random permutations.
Section 6 explains how integrable models of stochastic growth can be constructed with
representation theoretic tools, using the theory of symmetric functions developed earlier.
In Section 7 we show how one can use celebrated Macdonald symmetric functions to
access the problem of asymptotic behavior of certain directed random polymers in (1+1)
dimensions (known as the O'Connell-Yor polymers). The key feature here is that the
formalism of determinantal processes does not apply, and one needs to use other tools
that here boil down to employing Macdonald-Ruijsenaars difference operators.
Introduction
Suppose that you are building a tower out of unit blocks. Blocks are falling from the
sky, as shown at Figure 1 (left picture), and the tower slowly grows. If you introduce
randomness here by declaring the times between arrivals of blocks to be independent
identically distributed (i.i.d.) random variables, then you get the simplest 1d random
growth model. The kind of question we would like to answer here is: what is the height $h(t)$ of the tower at time $t$?
The classical central limit theorem (see e.g. [Billingsley-95, Chapter 5] or
[Kallenberg-02, Chapter 4]) provides the answer:
$$h(t)\approx c_1^{-1}\,t+c_2\,c_1^{-3/2}\,\sqrt{t}\;\xi,$$
where $c_1$ and $c_2$ are the mean and standard deviation of the times between arrivals of the blocks, respectively, and $\xi$ is a standard normal random variable $N(0,1)$.
Figure 1: Growth models based on the falling boxes: 1d model, 2d model without interactions, 2d model with sticky boxes (ballistic deposition)
Figure 2: Broken line with slopes $\pm1$, local minimum where a box can be added, and correspondence with particle configurations on $\mathbb Z$.
Figure 3: Wedge and flat initial conditions: broken lines and corresponding particle
configurations.
Theorem 1.1 ([Johansson-00]). Suppose that at time 0 the interface $h(x;t)$ is a wedge ($h(x,0)=|x|$) as shown at Figure 3 (left picture). Then for every $x\in(-1,1)$
$$\lim_{t\to\infty}\mathbb P\left(\frac{h(tx;t)-c_1(x)\,t}{c_2(x)\,t^{1/3}}\le s\right)=F_2(s),$$
where $c_1(x)$, $c_2(x)$ are certain (explicit) functions of $x$.
Theorem 1.2 ([Sasamoto-05], [Borodin-Ferrari-Prahofer-Sasamoto-07]). Suppose that at time 0 the interface $h(x;t)$ is flat as shown at Figure 3 (right picture). Then for every $x\in\mathbb R$
$$\lim_{t\to\infty}\mathbb P\left(\frac{h(x;t)-c_3t}{c_4t^{1/3}}\le s\right)=F_1(s),$$
where $c_3$, $c_4$ are certain (explicit) positive constants.
Here $F_1(s)$ and $F_2(s)$ are distributions from random matrix theory, known under the name of Tracy-Widom distributions. They are the limiting distributions for the largest eigenvalues in the Gaussian Orthogonal Ensemble and the Gaussian Unitary Ensemble of random matrices (which are the probability measures with density proportional to $\exp(-\mathrm{Trace}(X^2))$ on real symmetric and Hermitian matrices, respectively), see [Tracy-Widom-94], [Tracy-Widom-96].
These two theorems give the conjectural answer for the whole universality class of 2d random growth models, which is usually referred to as the KPZ (Kardar-Parisi-Zhang) universality class. Comparing to the answer in the 1d case, we see that the asymptotic behavior becomes more delicate: while the scaling by $t^{1/3}$ is always the same, the resulting distribution may also depend on the subclass of our model. Also, conjecturally, the only two generic subclasses are the ones we have seen. They are distinguished by whether the global surface profile is locally curved or flat near the observation location.
Let us concentrate on the wedge initial condition. In this case there is yet another
reformulation of the model. Write in each box (i, j) of the positive quadrant a random
waiting time w(i,j) . Once our random interface (of type pictured in Figure 2) reaches
the box (i, j) it takes time w(i,j) for it to absorb the box. Now the whole quadrant is filled
with nonnegative i.i.d. random variables. How can one reconstruct the growth of the interface from these numbers? More precisely, at what time $T(i,j)$ is a given box $(i,j)$ absorbed by the growing interface? A simple argument shows that
$$T(i,j)=\max_{(1,1)=b[1]\to b[2]\to\dots\to b[i+j-1]=(i,j)}\ \sum_{k=1}^{i+j-1}w_{b[k]},\qquad(1.1)$$
where the maximum is taken over all directed (leading away from the origin) paths joining $(1,1)$ and $(i,j)$, see Figure 4. The quantity (1.1) is known as the (directed) Last Passage Percolation time. Indeed, if you think of the numbers $w_{(i,j)}$ as the times needed to percolate into a given box, then (1.1) gives the time to percolate from $(1,1)$ to $(i,j)$ in the worst case scenario.
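Grouping the paths in (1.1) by their last step gives the recursion $T(i,j)=w_{(i,j)}+\max\bigl(T(i-1,j),\,T(i,j-1)\bigr)$, which makes the Last Passage Percolation time easy to compute. The following sketch is our own illustration (function names are ours, not from the text):

```python
import random

def lpp_times(w):
    """Last Passage Percolation times via the recursion
    T(i, j) = w[i][j] + max(T(i-1, j), T(i, j-1)),
    equivalent to the maximum over directed paths in (1.1)."""
    rows, cols = len(w), len(w[0])
    T = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            prev = max(T[i - 1][j] if i > 0 else 0.0,
                       T[i][j - 1] if j > 0 else 0.0)
            T[i][j] = w[i][j] + prev
    return T

# exponential waiting times, as in the setting of Theorem 1.1
random.seed(0)
n = 200
w = [[random.expovariate(1.0) for _ in range(n)] for _ in range(n)]
T = lpp_times(w)
print(T[n - 1][n - 1] / n)  # for rate-1 exponential weights, T(n,n)/n -> 4
```

For rate-1 exponential weights the law of large numbers constant is known to be 4, and a single run at $n=200$ already lands reasonably close to it.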
Figure 4: The quadrant filled with waiting times and two (out of $\binom{4}{1}=4$ possibilities) directed paths joining $(1,1)$ and $(4,1)$.
Universality considerations make one believe that the limit behavior of the Last Passage Percolation time should not depend on the distribution of w(i,j) (if this distribution is
not too wild), but we are very far from proving this at the moment. However, again, the
Last Passage Percolation time asymptotics has been computed for certain distributions,
e.g. for the exponential distribution in the context of Theorem 1.1.
Let us present another example where the (conjecturally universal) result can be rigorously proven. Consider the homogeneous, density 1 Poisson point process in the first quadrant, and let $L(\theta)$ be the maximal number of points one can collect along a North-East path from $(0,0)$ to $(\theta,\theta)$, as shown at Figure 5.
This quantity can be seen as a limit of the LPP times when $w_{(i,j)}$ takes only two values, 0 and 1, and the probability of 1 is very small. Such considerations explain why $L(\theta)$ should also be in the KPZ universality class. And, indeed, this is true.
Figure 5: The Poisson point process in the first quadrant and a North-East path joining $(0,0)$ and $(\theta,\theta)$ and collecting the maximal number of points, which is 5 here.
Theorem 1.3 ([Baik-Deift-Johansson-99]). For every $s\in\mathbb R$,
$$\lim_{\theta\to\infty}\mathbb P\left(\frac{L(\theta)-2\theta}{\theta^{1/3}}\le s\right)=F_2(s).$$
It is not hard to show that Theorem 1.3 is equivalent to
Theorem 1.4 ([Baik-Deift-Johansson-99]). Let $\sigma$ be a uniformly distributed permutation of the set $\{1,\dots,n\}$, and let $\ell_n(\sigma)$ be the length of the longest increasing subsequence of $\sigma$. Then
$$\lim_{n\to\infty}\mathbb P\left(\frac{\ell_n-2\sqrt n}{n^{1/6}}\le s\right)=F_2(s).$$
The problem of understanding the limit behavior of $\ell_n$ has a long history and goes back to the book of Ulam of 1961 [Ulam-61]. Ulam conjectured that $\mathbb E\,\ell_n\sim c\sqrt n$ but was not able to identify the constant; he also conjectured Gaussian fluctuations. In 1974 Hammersley [Hammersley-72] proved, via a subadditivity argument, that there exists a constant $c$ such that $\ell_n/\sqrt n\to c$, and this constant was identified in 1977 by Kerov and Vershik [Vershik-Kerov-77].
The random variable `n has an interesting interpretation in terms of an airplane boarding problem. Imagine a simplified airplane with one seat in each of n rows, large distances
between rows, and one entrance in front. Each entering passenger has a ticket with a seat
number, but the order of passengers in the initial queue is random (this is our random
permutation). Suppose that each passenger has a carry-on, and it takes one minute for
that person to load it into the overhead bin as soon as (s)he reaches her/his seat. The
aisle is narrow, and nobody can pass the passenger who is loading the carry-on. It turns
out that the total time to board the airplane is precisely `n . Let us demonstrate this with
an example.
Consider the permutation $\sigma=2413$ with $\ell_4(\sigma)=2$. The airplane boarding looks as follows: the first passenger enters the airplane and proceeds to the seat number 2. While (s)he loads a carry-on, the other passengers stay behind, and the one with the ticket for the seat number 1 ((s)he was the third person in the original queue) starts loading her/his carry-on. After one minute, the passenger with the ticket for the seat number 4 proceeds to his seat and also starts loading, as well as the one aiming for the seat number 3. In two minutes the boarding is complete.
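The boarding dynamics can be simulated and compared against a direct longest-increasing-subsequence computation. In the simplified model above, a passenger starts loading one round after every earlier-queued passenger with a smaller seat number (exactly those block the aisle ahead) has finished; the sketch below is our own illustration of this rule:

```python
import random
from bisect import bisect_left

def boarding_time(queue):
    """Total boarding time: passenger k starts loading one round after all
    earlier-queued passengers with smaller seat numbers have finished."""
    finish = []
    for k, seat in enumerate(queue):
        wait = max((finish[j] for j in range(k) if queue[j] < seat), default=0)
        finish.append(wait + 1)
    return max(finish, default=0)

def lis_length(seq):
    """Length of the longest increasing subsequence, via patience sorting."""
    piles = []
    for v in seq:
        i = bisect_left(piles, v)
        if i == len(piles):
            piles.append(v)
        else:
            piles[i] = v
    return len(piles)

print(boarding_time([2, 4, 1, 3]))  # the example from the text: prints 2

# the boarding time agrees with the LIS length on random queues
random.seed(1)
for _ in range(100):
    perm = random.sample(range(1, 21), 20)
    assert boarding_time(perm) == lis_length(perm)
```

The recursion inside `boarding_time` is precisely the classical dynamic program for the longest increasing subsequence ending at each position, which is why the two quantities agree.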
Interestingly enough, if the queue is divided into groups, as often happens in reality, then the boarding time (for long queues) will only increase by the factor $\sqrt k$, where $k$ is the number of the groups.
Let us now proceed to more recent developments. In the Last Passage Percolation problem we were maximizing a functional $H(x)$ over a set $X$. A general statistical mechanics principle says that such a maximization can be seen as a zero-temperature limit of the Gibbs ensemble on $X$ with Hamiltonian $H(x)$. More formally, we have the following essentially obvious statement:
$$\max_{x\in X}H(x)=\lim_{\beta\to\infty}\frac1\beta\,\ln\left(\sum_{x\in X}e^{\beta H(x)}\right).$$
The parameter $\beta$ is usually referred to as the inverse temperature in the statistical mechanics literature.
In the Last Passage Percolation model, X is the set of all directed paths joining (1, 1)
with a point (a, b), and the value of H on path x is the sum of w(i,j) along the path x. The
Gibbs ensemble in this case is known under the name of a Directed Polymer in Random
Media. The study of such objects with various path sets and various choices of noise (i.e.
w(i,j) ) is a very rich subject.
Directed Polymers in Random Media appeared for the first time close to thirty years
ago in an investigation of low temperature expansion of the partition function of the Ising
model with domain wall boundary conditions, see [Huse-Henley-85], [Imbrie-Spenser-88],
but nowadays there are many other physical applications. Let us give one concrete model
where such polymers arise.
Consider a set of massive particles in $\mathbb Z$ that evolve in discrete time as follows. At each time moment the mass of each particle is multiplied by a random variable $d_{t,x}$, where $t$ is the time moment and $x$ is the particle's position. The random variables $d_{t,x}$ are typically assumed to be i.i.d. Then each particle gives birth to a twin of the same mass, and the twin moves to $x+1$. If we now start at time 0 with a single particle of mass 1 at $x=1$, then the mass $Z(T,x)$ of all particles at $x$ at time $T$ can be computed as a sum over all directed paths $(1,1)=b[1]\to b[2]\to\dots\to b[x+T-1]=(T,x)$ joining $(1,1)$ and $(T,x)$:
$$Z(T,x)=\sum_{(1,1)=b[1]\to b[2]\to\dots\to b[x+T-1]=(T,x)}\ \prod_{k=1}^{x+T-1}d_{b[k]}.\qquad(1.2)$$
This model can be used as a simplified description for the migration of plankton with dt,x
representing the state of the ocean at location x and time t which affects the speed of
growth of the population. Independent dt,x model quickly changing media, e.g. due to the
turbulent flows in the ocean.
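Just like the Last Passage Percolation time, the path sum (1.2) can be evaluated by dynamic programming: grouping the paths by their last step gives $Z(t,x)=d_{t,x}\bigl(Z(t-1,x)+Z(t,x-1)\bigr)$. A minimal sketch (our own, with 0-based indices):

```python
import random

def partition_function(d):
    """Sum over directed (up/right) paths from the corner to (t, x) of the
    product of the weights d along the path, as in (1.2); computed via
    Z(t, x) = d[t][x] * (Z(t-1, x) + Z(t, x-1))."""
    T, X = len(d), len(d[0])
    Z = [[0.0] * X for _ in range(T)]
    for t in range(T):
        for x in range(X):
            if t == 0 and x == 0:
                Z[t][x] = d[t][x]
            else:
                Z[t][x] = d[t][x] * ((Z[t - 1][x] if t > 0 else 0.0)
                                     + (Z[t][x - 1] if x > 0 else 0.0))
    return Z

random.seed(2)
d = [[random.random() for _ in range(5)] for _ in range(5)]
print(partition_function(d)[4][4])
```

The recursion is the positive-temperature analog of the max-plus recursion for (1.1): the maximum is replaced by a sum and the additive weights by multiplicative ones.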
Directed Polymers in Random Media exhibit a very interesting phenomenon called intermittency: the existence of large peaks, happening with small probability, that are high enough to dominate the asymptotics of the moments. Physicists believe that intermittency is widespread in nature and, for instance, the mass distribution in the universe or a magnetogram of the sun show intermittent behavior. To see this phenomenon
in our model, suppose for a moment that dt,x does not depend on t. Then there would
be locations where the amount of plankton exponentially grows, while in other places all
the plankton quickly dies, so we see very high peaks. Now it is reasonable to expect that
such peaks would still be present when dt,x are independent both of t and x and this will
cause intermittency. Proving and quantifying intermittency is, however, rather difficult.
Regarding the distribution of Z(T, x), it was long believed in the physics literature
that it should belong to the same KPZ universality class as the Last Passage Percolation.
Now, at least in certain cases, we can prove it. The following integrable random polymer was introduced and studied by Seppalainen [Seppalainen-12], who proved the $t^{1/3}$ exponent for the fluctuations. The next theorem is a refinement of this result.
Theorem 1.5 ([Borodin-Corwin-Remenik-12]). Assume $d_{t,x}$ are independent positive random variables with density
$$\frac{1}{\Gamma(\theta)}\,x^{-\theta-1}\exp\left(-\frac1x\right).$$
Then there exist $\theta_0>0$ and (explicit) $c_1,c_2>0$ such that for $0<\theta<\theta_0$,
$$\lim_{n\to\infty}\mathbb P\left(\frac{\ln Z(n,n)-c_1n}{c_2n^{1/3}}\le s\right)=F_2(s).$$
The upper bound $\theta_0$ on the parameter $\theta$ in this theorem is technical, and it will probably be removed in future works.
In a similar way to our transition from Last Passage Percolation to monotone paths in a Poisson field and longest increasing subsequences, we can do a limit transition here, so that the discrete paths in (1.2) turn into Brownian bridges, while $d_{t,x}$ turns into the space-time white noise. Let us explain in more detail how this works, as this will provide a direct link to the Kardar-Parisi-Zhang equation that gave the name to the KPZ universality class.
For a Brownian bridge $B=B(s)$ we obtain a functional
$$H(B)=\int\dot W(s,B(s))\,ds,\qquad(1.3)$$
where $\dot W$ is the 2d white noise. Thus, the partition function $Z(t,x)$ has the form
$$Z(t,x)=\frac{1}{\sqrt{2\pi t}}\,\exp\left(-\frac{x^2}{2t}\right)\,\mathbb E\,{:}\exp{:}\bigl(H(B)\bigr),\qquad(1.4)$$
where E is the expectation with respect to the law of the Brownian bridge which starts
at 0 at time 0 and ends at x at time t, and : exp : is the Wick ordered exponential,
see [Alberts-Khanin-Quastel-12b] and references therein for more details. Note that the
randomness coming from the white noise is still there, and Z(t, x) is a random variable.
Another way of defining $Z(t,x)$ is through the stochastic PDE it satisfies:
$$\frac{\partial}{\partial t}Z(t,x)=\frac12\,\frac{\partial^2}{\partial x^2}Z(t,x)+\dot W Z.\qquad(1.5)$$
This is known as the stochastic heat equation. Indeed, if we remove the part with the
white noise in (1.5), then we end up with the usual heat equation.
Writing $U(t,x)=\ln Z(t,x)$ (the Hopf-Cole transform), one formally obtains from (1.5) the Kardar-Parisi-Zhang equation
$$\frac{\partial}{\partial t}U(t,x)=\frac12\,\frac{\partial^2}{\partial x^2}U(t,x)+\frac12\left(\frac{\partial}{\partial x}U(t,x)\right)^2+\dot W.\qquad(1.6)$$
Similarly, the moments $F(t;x_1,\dots,x_N)=\mathbb E\bigl[Z(t,x_1)\cdots Z(t,x_N)\bigr]$ satisfy
$$\frac{\partial}{\partial t}F=\left(\frac12\sum_{i=1}^N\frac{\partial^2}{\partial x_i^2}+\sum_{i\neq j}\delta(x_i-x_j)\right)F,\qquad(1.7)$$
where $\delta$ is the Dirac delta-function. (1.7) is known as the evolution equation of the quantum delta-Bose gas. It was the second quantum many body system solved via the Bethe ansatz, see [Lieb-Liniger-63], [McGuire-64].
There is also a deeper analogy: both integrable systems and integrable probability models can be viewed as shadows of representation theory of infinite-dimensional Lie groups and algebras. However, while integrable PDEs often represent rather exotic behavior from the point of view of general PDEs, integrable probability delivers universal behavior for the whole universality class of similar models. Moreover, in the rare occasions when the universality can be proved (e.g. in random matrices, see the recent reviews [Erdos-Yau-12], [Tao-Vu-12] and references therein, or in (1+1)d polymers in the so-called intermediate disorder regime, see [Alberts-Khanin-Quastel-12a]), one shows that the generic behavior is the same as in the integrable case. Then the integrable case provides the only known route to an explicit description of the answer.
While we will not explain in these notes the representation theoretic undercurrent in
any detail, we cannot and do not want to get rid of it completely. In what follows we will
rely on the theory of symmetric functions, which is the algebraic-combinatorial apparatus of representation theory.
Acknowledgments
We are very grateful to Ivan Corwin, Grigori Olshanski, and Leonid Petrov for very
valuable comments on an earlier version of this text. A. B. was partially supported by
the NSF grant DMS-1056390. V. G. was partially supported by RFBR-CNRS grants
10-01-93114 and 11-01-93105.
Symmetric functions
In this section we briefly review certain parts of the theory of symmetric functions. If
the reader is familiar with the basics of this theory, (s)he might want to skip this section
returning to it, if necessary, in the future. There are several excellent treatments of
symmetric functions in the literature, see e.g. [Macdonald-95], [Sagan-01], [Stanley-99].
We will mostly follow the notations of [Macdonald-95] and recommend the same book for
the proofs.
Our first aim is to define the algebra $\Lambda$ of symmetric functions in infinitely many variables. Let $\Lambda_N=\mathbb C[x_1,\dots,x_N]^{S_N}$ be the space of polynomials in $x_1,\dots,x_N$ which are symmetric with respect to permutations of the $x_j$. $\Lambda_N$ has a natural grading by the total degree of a polynomial.
Let $\pi_{N+1}:\mathbb C[x_1,\dots,x_{N+1}]\to\mathbb C[x_1,\dots,x_N]$ be the map defined by setting $x_{N+1}=0$. It preserves the ring of symmetric polynomials and the gradings. Thus we obtain a tower of graded algebras
$$\mathbb C\xleftarrow{\ \pi_1\ }\Lambda_1\xleftarrow{\ \pi_2\ }\Lambda_2\xleftarrow{\ \pi_3\ }\Lambda_3\xleftarrow{\ \ }\cdots.$$
We define $\Lambda$ as the projective limit of the above tower:
$$\Lambda=\varprojlim\Lambda_N=\{(f_1,f_2,f_3,\dots)\mid f_j\in\Lambda_j,\ \pi_jf_j=f_{j-1},\ \deg(f_j)\text{ are bounded}\}.$$
Three distinguished families of elements of $\Lambda$ are the elementary symmetric functions $e_k=\sum_{i_1<\dots<i_k}x_{i_1}\cdots x_{i_k}$, the complete homogeneous symmetric functions $h_k=\sum_{i_1\le\dots\le i_k}x_{i_1}\cdots x_{i_k}$, and the power sums $p_k=\sum_i x_i^k$, $k=1,2,\dots$.
Theorem 2.1. The systems $\{e_k\}$, $\{h_k\}$, $\{p_k\}$ are algebraically independent generators of $\Lambda$. In other words, $\Lambda$ can be seen as the algebra of polynomials in the $h_k$, or the algebra of polynomials in the $e_k$, or the algebra of polynomials in the $p_k$:
$$\Lambda=\mathbb C[e_1,e_2,\dots]=\mathbb C[h_1,h_2,\dots]=\mathbb C[p_1,p_2,\dots].$$
The proof of this statement can be found in [Macdonald-95, Chapter I, Section 2].
Theorem 2.1 for the polynomials in finitely many variables is known as the fundamental
theorem of symmetric polynomials.
It is convenient to introduce generating functions for the above generators:
$$H(z):=\sum_{k=0}^{\infty}h_kz^k,\qquad E(z):=\sum_{k=0}^{\infty}e_kz^k,\qquad P(z):=\sum_{k=1}^{\infty}p_kz^{k-1}.\qquad(2.1)$$

Proposition 2.2. We have
$$H(z)=\prod_i\frac{1}{1-x_iz},\qquad E(z)=\prod_i(1+x_iz),\qquad(2.2)$$
$$P(z)=\frac{d}{dz}\sum_i\ln\frac{1}{1-x_iz}.\qquad(2.3)$$
In particular,
$$H(z)=\frac{1}{E(-z)}=\exp\left(\sum_{k=1}^{\infty}\frac{p_kz^k}{k}\right).\qquad(2.4)$$
Proof. In order to prove (2.2) for $E(z)$, open the parentheses in $\prod_i(1+x_iz)$ and compare with the definition of $e_k$. For $H(z)$, note that
$$\prod_i\frac{1}{1-x_iz}=\prod_i\bigl(1+x_iz+x_i^2z^2+x_i^3z^3+\dots\bigr)$$
and open the parentheses again. Finally, using the power series expansion of the logarithm we get
$$\frac{d}{dz}\sum_i\ln\frac{1}{1-x_iz}=\frac{d}{dz}\sum_i\left(x_iz+\frac{x_i^2z^2}{2}+\frac{x_i^3z^3}{3}+\dots\right)=\sum_i\bigl(x_i+x_i^2z+x_i^3z^2+\dots\bigr)=\sum_{k=1}^{\infty}p_kz^{k-1}.$$
Figure 6: Left panel: Young diagram $\lambda$ of size 9 with row lengths $(4,3,2)$ and column lengths $(3,3,2,1)$. Right panel: transposed diagram $\lambda'$.
Let $\lambda$ be a Young diagram of size $n$ or, equivalently, a partition of $n$. In other words, $\lambda$ is a sequence $\lambda_1\ge\lambda_2\ge\dots$ of non-negative integers (which are identified with the row lengths of the Young diagram), such that $\sum_i\lambda_i=|\lambda|=n$. The diagram whose row lengths are the column lengths of $\lambda$ is called the transposed diagram and denoted $\lambda'$. In other words, for each $i$, $\lambda'_i$ is equal to the number of $j$ such that $\lambda_j\ge i$. We draw Young diagrams as collections of unit boxes, and Figure 6 gives an example of a Young diagram and its transpose.
The length $\ell(\lambda)$ of $\lambda$ is defined as the number of non-zero $\lambda_i$ (equivalently, the number of rows in $\lambda$). Clearly, $\ell(\lambda)=\lambda'_1$.
We denote the set of all Young diagrams by $\mathbb Y$. By definition $\mathbb Y$ includes the empty partition $\varnothing=(0,0,\dots)$.
Definition 2.3. The Schur polynomial $s_\lambda(x_1,\dots,x_N)$ is a symmetric polynomial in $N$ variables parameterized by a Young diagram $\lambda$ with $\ell(\lambda)\le N$ and given by
$$s_\lambda(x_1,\dots,x_N)=\frac{\det\bigl[x_i^{\lambda_j+N-j}\bigr]_{i,j=1}^{N}}{\prod_{i<j}(x_i-x_j)}.\qquad(2.5)$$
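Formula (2.5) can be checked against the (standard, not stated here) combinatorial description of $s_\lambda$ as a generating function of semistandard Young tableaux, which appear later in this section. The sketch below is our own illustration for a small example:

```python
from itertools import permutations, product
from math import prod

def det(m):
    """Determinant by permutation expansion (fine for small matrices)."""
    n = len(m)
    total = 0.0
    for sigma in permutations(range(n)):
        inv = sum(1 for a in range(n) for b in range(a + 1, n) if sigma[a] > sigma[b])
        total += (-1) ** inv * prod(m[i][sigma[i]] for i in range(n))
    return total

def schur_bialternant(lam, xs):
    """Formula (2.5): det[x_i^(lam_j + N - j)] / prod_{i<j}(x_i - x_j)."""
    N = len(xs)
    lam = list(lam) + [0] * (N - len(lam))
    num = det([[xs[i] ** (lam[j] + N - (j + 1)) for j in range(N)] for i in range(N)])
    den = prod(xs[i] - xs[j] for i in range(N) for j in range(i + 1, N))
    return num / den

def schur_tableaux(lam, xs):
    """Sum of monomials over fillings with weakly increasing rows and
    strictly increasing columns (semistandard Young tableaux)."""
    N = len(xs)
    cells = [(r, c) for r, length in enumerate(lam) for c in range(length)]
    total = 0.0
    for filling in product(range(1, N + 1), repeat=len(cells)):
        t = dict(zip(cells, filling))
        if all((c == 0 or t[(r, c - 1)] <= t[(r, c)]) and
               (r == 0 or t[(r - 1, c)] < t[(r, c)]) for (r, c) in cells):
            total += prod(xs[v - 1] for v in filling)
    return total

lam, xs = (2, 1), [2.0, 3.0, 5.0]
print(schur_bialternant(lam, xs), schur_tableaux(lam, xs))  # equal values
```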
Proposition 2.4. The Schur functions $s_\lambda$, with $\lambda$ ranging over the set of all Young diagrams, form a linear basis of $\Lambda$. They are related to the generators $e_k$ and $h_k$ through the Jacobi-Trudi formulas:
$$s_\lambda=\det\bigl[h_{\lambda_i-i+j}\bigr]_{i,j=1,\dots,\ell(\lambda)}=\det\bigl[e_{\lambda'_i-i+j}\bigr]_{i,j=1,\dots,\ell(\lambda')}.$$

Theorem 2.5 (Cauchy identity). We have
$$\sum_{\lambda\in\mathbb Y}s_\lambda(x_1,x_2,\dots)\,s_\lambda(y_1,y_2,\dots)=\prod_{i,j}\frac{1}{1-x_iy_j},\qquad(2.7)$$
and also
$$\sum_{\lambda\in\mathbb Y}\frac{p_\lambda(x_1,x_2,\dots)\,p_\lambda(y_1,y_2,\dots)}{z_\lambda}=\exp\left(\sum_{k=1}^{\infty}\frac{p_k(x_1,x_2,\dots)\,p_k(y_1,y_2,\dots)}{k}\right)=\prod_{i,j}\frac{1}{1-x_iy_j},\qquad(2.8)$$
where
$$p_\lambda=p_{\lambda_1}p_{\lambda_2}\cdots p_{\lambda_{\ell(\lambda)}}\quad\text{and}\quad z_\lambda=\prod_{i\ge1}i^{m_i}\,m_i!,$$
with $m_i$ denoting the number of rows of length $i$ in $\lambda$.
Remark. The right-hand sides of (2.7) and (2.8) should be viewed as formal power series via
$$\frac{1}{1-x_iy_j}=1+x_iy_j+(x_iy_j)^2+(x_iy_j)^3+\dots.$$
The proof of Theorem 2.5 can be found in [Macdonald-95, Chapter I, Section 4]. In fact, (2.7) is a particular case of the more general skew Cauchy identity. We need to introduce further notations in order to state it.
Definition 2.6. Let $\lambda$ be any Young diagram. Expand $s_\lambda(x,y)$ as a linear combination of Schur symmetric functions in the variables $y_i$; the coefficients of this expansion are called skew Schur functions and denoted $s_{\lambda/\mu}$:
$$s_\lambda(x,y)=\sum_{\mu\in\mathbb Y}s_{\lambda/\mu}(x)\,s_\mu(y).$$

Theorem 2.7 (Skew Cauchy identity). We have
$$\sum_{\lambda\in\mathbb Y}s_{\lambda/\mu}(x_1,x_2,\dots)\,s_{\lambda/\nu}(y_1,y_2,\dots)=\prod_{i,j}\frac{1}{1-x_iy_j}\,\sum_{\tau\in\mathbb Y}s_{\mu/\tau}(y_1,y_2,\dots)\,s_{\nu/\tau}(x_1,x_2,\dots).\qquad(2.9)$$
For the proof of this statement see [Macdonald-95, Chapter I, Section 5, Example 26]. In order to see that (2.7) is indeed a particular case of (2.9), we need the following generalization of the Jacobi-Trudi identity (its proof can be found in the same section of [Macdonald-95]).
Proposition 2.8. Assuming that $h_k=0$ for $k<0$, we have
$$s_{\lambda/\mu}=\det\bigl[h_{\lambda_i-\mu_j-i+j}\bigr]_{i,j=1,\dots,\max(\ell(\lambda),\ell(\mu))}.$$
In particular, specializing all the variables to zero gives
$$s_{\lambda/\mu}=\begin{cases}1,&\lambda=\mu,\\0,&\text{otherwise}.\end{cases}$$
Proposition 2.9. Let $x$ and $y$ be two sets of variables. For any $\lambda,\mu\in\mathbb Y$ we have
$$s_{\lambda/\mu}(x,y)=\sum_{\nu\in\mathbb Y}s_{\lambda/\nu}(x)\,s_{\nu/\mu}(y).$$
A specialization $\rho$ of the algebra $\Lambda$ is an algebra homomorphism from $\Lambda$ to $\mathbb C$, i.e. a linear map satisfying
$$(fg)(\rho)=f(\rho)\,g(\rho),$$
where $f(\rho)$ denotes the application of $\rho$ to $f$. Note that the condition $\sum_i|u_i|<\infty$ implies that the series in (2.10) converges for any $k\ge1$.
Sometimes it is important to know which specializations are positive in a certain sense. We call a specialization $\rho$ Schur-positive if for every Young diagram $\lambda$ we have
$$s_\lambda(\rho)\ge0.$$
There is an explicit classification for Schur-positive specializations.

Theorem 2.11. The Schur-positive specializations are parameterized by pairs of sequences of non-negative reals $\alpha=(\alpha_1\ge\alpha_2\ge\dots\ge0)$ and $\beta=(\beta_1\ge\beta_2\ge\dots\ge0)$ satisfying $\sum_i(\alpha_i+\beta_i)<\infty$ and an additional parameter $\gamma\ge0$. The specialization with parameters $(\alpha;\beta;\gamma)$ can be described by its values on power sums:
$$p_1\mapsto p_1(\alpha;\beta;\gamma)=\gamma+\sum_i(\alpha_i+\beta_i),$$
$$p_k\mapsto p_k(\alpha;\beta;\gamma)=\sum_i\bigl(\alpha_i^k+(-1)^{k-1}\beta_i^k\bigr),\qquad k\ge2.$$
Equivalently, in terms of generating functions,
$$\sum_{k=0}^{\infty}h_k(\alpha;\beta;\gamma)z^k=e^{\gamma z}\prod_{i\ge1}\frac{1+\beta_iz}{1-\alpha_iz}.$$
Theorem 2.11 has a number of equivalent reformulations; in particular, it is equivalent to the description of all characters of the infinite symmetric group $S(\infty)$ and to the classification of totally positive triangular Toeplitz matrices. The first proofs of Theorem 2.11 were obtained (independently) by Thoma [Thoma-64] and Edrei [Edrei-53]; a proof by a different method can be found in [Vershik-Kerov-81], [Kerov-03], [Kerov-Okounkov-Olshanski-98], and yet another proof is given in [Okounkov-94].
Our next goal is to study the simplest Schur-positive specializations more thoroughly. Given two Young diagrams $\mu\subseteq\lambda$, we say that $\lambda/\mu$ is a horizontal strip if $0\le\lambda'_i-\mu'_i\le1$ for all $i$, and that $\lambda/\mu$ is a vertical strip if $0\le\lambda_i-\mu_i\le1$ for all $i$.
A semistandard Young tableau of shape $\lambda$ and rank $N$ is a filling of the boxes of $\lambda$ with numbers from 1 to $N$ in such a way that the numbers strictly increase along the columns and weakly increase along the rows. A standard Young tableau of shape $\lambda$ is a filling of the boxes of $\lambda$ with numbers from 1 to $|\lambda|$ in such a way that the numbers strictly increase both along the columns and along the rows (in particular, this implies that each number appears exactly once). Examples of Young tableaux are given in Figure 7.
1 1 2 5 5
2 3 4
3 4
1 2 3 5
4 6 8
7
Figure 7: Left panel: a semistandard Young tableau of shape (5, 3, 2) and rank 5. Right
panel: a standard Young tableau of shape (4, 3, 1).
The number of all semistandard Young tableaux of shape $\lambda$ and rank $N$ is denoted $\mathrm{Dim}_N(\lambda)$. The number of all standard Young tableaux of shape $\lambda$ is denoted $\dim(\lambda)$. These quantities have representation-theoretic interpretations: namely, $\mathrm{Dim}_N(\lambda)$ is the dimension of the irreducible representation of the unitary group $U(N)$ indexed by (the highest weight) $\lambda$, and $\dim(\lambda)$ is the dimension of the irreducible representation of the symmetric group $S(|\lambda|)$ indexed by $\lambda$.
The proofs of the following statements are a combination of Theorem 2.11 and results
of [Macdonald-95, Chapter I].
Proposition 2.12. Suppose that $\alpha_1=c$ and all other $\alpha$-, $\beta$-, $\gamma$-parameters are zero. Then for the Schur-positive specialization $(\alpha;\beta;\gamma)$, $s_\lambda(\alpha;\beta;\gamma)=0$ unless $\lambda$ is a one-row Young diagram (i.e. $\ell(\lambda)\le1$) and, more generally, $s_{\lambda/\mu}(\alpha;\beta;\gamma)=0$ unless $\lambda/\mu$ is a horizontal strip. In the latter case
$$s_{\lambda/\mu}(\alpha;\beta;\gamma)=c^{|\lambda|-|\mu|}.$$

Proposition 2.13. Suppose that $\beta_1=c$ and all other $\alpha$-, $\beta$-, $\gamma$-parameters are zero. Then for the Schur-positive specialization $(\alpha;\beta;\gamma)$, $s_\lambda(\alpha;\beta;\gamma)=0$ unless $\lambda$ is a one-column Young diagram (i.e. $\lambda_1\le1$) and, more generally, $s_{\lambda/\mu}(\alpha;\beta;\gamma)=0$ unless $\lambda/\mu$ is a vertical strip. In the latter case
$$s_{\lambda/\mu}(\alpha;\beta;\gamma)=c^{|\lambda|-|\mu|}.$$
Proposition 2.14. Suppose that $\alpha_1=\alpha_2=\dots=\alpha_N=1$ and all other $\alpha$-, $\beta$-, $\gamma$-parameters are zero. Then for the Schur-positive specialization $(\alpha;\beta;\gamma)$,
$$s_\lambda(\alpha;\beta;\gamma)=\mathrm{Dim}_N(\lambda),\qquad\lambda\in\mathbb Y.$$

Proposition 2.15. Suppose that $\gamma=c$ and all $\alpha$- and $\beta$-parameters are zero. Then for the Schur-positive specialization $(\alpha;\beta;\gamma)$,
$$s_\lambda(\alpha;\beta;\gamma)=\frac{c^{|\lambda|}}{|\lambda|!}\,\dim(\lambda),\qquad\lambda\in\mathbb Y.$$
Determinantal point processes

In this section we introduce determinantal point processes, which are an important tool in the study of growth models in the KPZ universality class.
Consider a reasonable state space or one particle space $\mathbb X$, say the real line $\mathbb R$, or the Euclidean space $\mathbb R^d$, or a discrete space such as the set $\mathbb Z$ of integers or its subset. A point configuration $X$ in $\mathbb X$ is a locally finite (i.e. without accumulation points) collection of points of the space $\mathbb X$. For our purposes it suffices to assume that the points of $X$ are always pairwise distinct. The set of all point configurations in $\mathbb X$ will be denoted as $\mathrm{Conf}(\mathbb X)$.
A compact subset $A\subset\mathbb X$ is called a window. For a window $A$ and $X\in\mathrm{Conf}(\mathbb X)$, set $N_A(X)=|A\cap X|$ (the number of points of $X$ in the window). Thus, $N_A$ is a function on $\mathrm{Conf}(\mathbb X)$. We equip $\mathrm{Conf}(\mathbb X)$ with the Borel structure (i.e. $\sigma$-algebra) generated by the functions $N_A$ for all windows $A$.
A random point process on $\mathbb X$ is a probability measure on $\mathrm{Conf}(\mathbb X)$. We will often use the term particles for the elements of a random point configuration. Thus, we will speak about particle configurations.
The best known example of a random point process is the homogeneous (rate 1) Poisson process on $\mathbb R$. For any finite interval $A\subset\mathbb R$ (or, more generally, for any compact set $A$), the number $N_A$ of particles falling in $A$ is finite because, by the very assumption, $X$ has no accumulation points. Since $X$ is random, $N_A$ is random, too. Here are the key properties of the Poisson process (see e.g. [Billingsley-95, Section 23] or [Kallenberg-02, Chapter 10]):
$N_A$ has the Poisson distribution with parameter $|A|$, the length of $A$. That is,
$$\mathbb P(N_A=n)=e^{-|A|}\,\frac{|A|^n}{n!},\qquad n=0,1,2,\dots$$
If $A_1,\dots,A_k$ are pairwise disjoint intervals, then the corresponding random variables $N_{A_1},\dots,N_{A_k}$ are independent. This means that the particles do not interact.
The Poisson process can be constructed as follows. Let $M=1,2,3,\dots$ be a natural number. Take the interval $[-M/2,M/2]$ and place $M$ particles in it, uniformly and independently of each other. Observe that the mean density of the particles is equal to 1 for any $M$ because the number of particles and the length of the interval are the same. Now pass to the limit as $M\to\infty$. As $M$ gets large, the interval approximates the whole real line, and in the limit one obtains the Poisson random configuration.
Exercise 3.1. Assuming that the limit process exists, show that it satisfies the above two
properties concerning the random variables NA .
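The construction above (and the first property of Exercise 3.1) is easy to probe numerically: for $M$ uniform points on $[-M/2,M/2]$, the count in a fixed window of length 1 is Binomial$(M,1/M)$, which is approximately Poisson with parameter 1. A simulation sketch of our own:

```python
import random
from math import exp

random.seed(3)
M, trials = 200, 20000
counts = []
for _ in range(trials):
    # M particles placed uniformly on [-M/2, M/2]; count those in [-1/2, 1/2]
    counts.append(sum(1 for _ in range(M)
                      if -0.5 <= random.uniform(-M / 2, M / 2) <= 0.5))

emp_mean = sum(counts) / trials
emp_p0 = counts.count(0) / trials
print(emp_mean, emp_p0, exp(-1))  # empirical P(N_A = 0) is close to e^{-1}
```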
The above simple construction contains two important ideas: First, the idea of limit
transition. Starting from M particle random configurations one can get infinite particle
configurations by taking a limit. Second, the observation that the limit transition may
lead to a simplification. Indeed, the structure of the joint distribution of NA1 , . . . , NAk
simplifies in the limit.
Let us construct a discrete analog of the Poisson process. Replace the real line $\mathbb R$ by the lattice $\mathbb Z$ of integers. This will be our new state space. A particle configuration on $\mathbb Z$ is simply an arbitrary subset $X\subset\mathbb Z$; the assumption of absence of accumulation points holds automatically.
Fix a real number $p\in(0,1)$. The stationary Bernoulli process with parameter $p$ on $\mathbb Z$ is constructed as follows: for each integer $n\in\mathbb Z$ we put a particle at the node $n$ with probability $p$, independently of the other nodes. This procedure leads to a random particle configuration.
Equivalently, the Bernoulli process is a doubly infinite sequence $\xi_n$, $n\in\mathbb Z$, of binary random variables, such that each $\xi_n$ takes value 1 or 0 with probability $p$ or $1-p$, respectively, and, moreover, these variables are independent. Then the random configuration $X$ consists of those $n$'s for which $\xi_n=1$.
The following construction is a simple example of a scaling limit transition. Shrink our lattice by the factor of p. That is, instead of Z consider the isomorphic lattice pZ ⊂ R with mesh p, and transfer the Bernoulli process to pZ. The resulting scaled Bernoulli process can be regarded as a process on R because pZ is contained in R, and each configuration on pZ is simultaneously a configuration on R. As p goes to 0, the scaled Bernoulli process will approximate the Poisson process on the line. This is intuitively clear: for Bernoulli, as for Poisson, there is no interaction, and the mean density of particles for the scaled Bernoulli process is equal to 1.
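The convergence of the scaled Bernoulli process to Poisson can also be seen empirically. Here is a short Python sketch (with illustrative parameters) comparing the empirical distribution of the number of particles of the scaled Bernoulli process in the interval [0, 2) with the Poisson probabilities e^{−2} 2^k/k!.

```python
import random
from math import exp, factorial

def bernoulli_count(p, length, rng):
    """Number of particles of the scaled Bernoulli(p) process on pZ in [0, length)."""
    sites = round(length / p)            # sites 0, p, 2p, ... below `length`
    return sum(1 for _ in range(sites) if rng.random() < p)

rng = random.Random(1)
p, length, trials = 0.02, 2.0, 5000
counts = [bernoulli_count(p, length, rng) for _ in range(trials)]

freq = [counts.count(k) / trials for k in range(6)]
poisson = [exp(-length) * length ** k / factorial(k) for k in range(6)]
for k in range(6):
    print(k, round(freq[k], 3), round(poisson[k], 3))
```

For small p the count is Binomial(⌊length/p⌋, p), which is already very close to the Poisson distribution with the same mean.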
How does one describe a point process? The problem here comes from the fact that the space of particle configurations Conf(X) is typically infinite-dimensional, and thus there is no natural Lebesgue measure that could be used for writing densities. One solution is to use correlation functions.
Let us temporarily restrict ourselves to point processes on a finite or countable discrete space X (for instance, the reader may assume X = Z). Such a process is the same as a collection {ξ_x} of binary random variables, indexed by elements x ∈ X, which indicate the presence of a particle at x. (They are often called occupancy variables.) The Bernoulli process was a simple example. Now we no longer assume that these variables are independent and identically distributed; they may have an arbitrary law. Their law is simply an arbitrary probability measure P on the space of all configurations. This is a large space; it can be described as the infinite product space {0, 1}^X. Thus, defining a point process on X amounts to specifying a probability measure P on {0, 1}^X.
Definition 3.2. Let A range over finite subsets of X. The correlation function of a point process X on X is the function ρ(A) defined by

    ρ(A) = P(A ⊂ X).
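For independent occupancies (the Bernoulli process on a finite test space) the correlation function factorizes as ρ(A) = p^{|A|}. The following self-contained Python check verifies this by exact enumeration over all configurations; the 4-point state space is an illustrative choice, not from the text.

```python
from itertools import product
from fractions import Fraction

p = Fraction(1, 3)
X = range(4)  # a small discrete state space, {0, 1, 2, 3}

def prob(config):
    """Probability of an occupancy configuration under the Bernoulli(p) process."""
    w = Fraction(1)
    for occupied in config:
        w *= p if occupied else 1 - p
    return w

def rho(A):
    """Correlation function rho(A) = P(A is contained in the configuration)."""
    return sum(prob(c) for c in product([0, 1], repeat=len(X))
               if all(c[x] for x in A))

# for independent occupancies, rho(A) = p^|A|
assert rho({0}) == p
assert rho({0, 2}) == p ** 2
assert rho({0, 1, 3}) == p ** 3
print(rho({0, 2}))  # p^2 = 1/9
```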
The correlation measures ρ_n, n ≥ 1, are defined by

    ∫_{X^n} f ρ_n = E [ ∑_{x_{i_1}, …, x_{i_n} ∈ X} f(x_{i_1}, …, x_{i_n}) ],

where E denotes averaging with respect to our point process, and the sum on the right is taken over all n-tuples of pairwise distinct points of the random point configuration X. In particular, for n = 1 we have

    ∫_X f(x) ρ_1(dx) = E [ ∑_{x ∈ X} f(x) ].
If X ⊂ R and ρ_n is absolutely continuous with respect to the Lebesgue measure, then the probabilistic meaning of the nth correlation function is that of the density of the probability to find a particle in each of the infinitesimal intervals around the points x_1, x_2, …, x_n:

    ρ_n(x_1, x_2, …, x_n) dx_1 ⋯ dx_n = P(there is a particle in each interval (x_i, x_i + dx_i)).
For a random point process with a fixed number of particles, say N, described by a joint probability distribution P(dx_1, …, dx_N) (it is natural to assume that P is symmetric with respect to permutations of the arguments), the correlation measures for n ≤ N are given by

    ρ_n(dx_1, …, dx_n) = N!/(N − n)! ∫_{x_{n+1}, …, x_N ∈ X} P(dx_1, …, dx_N).    (3.1)
Indeed, we have

    E [ ∑_{x_{i_1}, …, x_{i_n} pairwise distinct} f(x_{i_1}, …, x_{i_n}) ]
      = ∫_{X^N} ∑_{x_{i_1}, …, x_{i_n} pairwise distinct} f(x_{i_1}, …, x_{i_n}) P(dx)
      = N!/(N − n)! ∫_{X^N} f(x_1, …, x_n) P(dx).
which can be identified with the distribution of eigenvalues of N × N random unitary matrices; here we equip the unitary group U(N) with the Haar measure of total mass 1. Another case is the measure on N-particle configurations on R with density proportional to

    ∏_{i<j} (x_i − x_j)² ∏_i e^{−x_i²},

which is the distribution of eigenvalues of a random Hermitian matrix from the Gaussian Unitary Ensemble (GUE), see [Mehta-04], [Forrester-10], [Anderson-Guionnet-Zeitouni-10], [Akemann-Baik-Francesco-11].
Theorem 3.9. Any biorthogonal ensemble is a determinantal point process. Its correlation kernel has the form

    K(x, y) = ∑_{i,j=1}^N φ_i(x) [G^{−t}]_{ij} ψ_j(y),

where G^{−t} is the inverse transpose of the Gram matrix G = [G_{ij}]_{i,j=1}^N:

    G_{ij} = ∫_X φ_i(x) ψ_j(x) μ(dx).
Remark. When the functions φ_i(x) and ψ_j(y) are biorthogonal, the Gram matrix is diagonal and can be easily inverted. This is the origin of the term "biorthogonal ensemble".
Proof of Theorem 3.9. Observe that

    ∫_{X^N} det[φ_i(x_j)]_{i,j=1}^N det[ψ_i(x_j)]_{i,j=1}^N μ(dx_1) ⋯ μ(dx_N)
      = ∑_{σ,τ ∈ S(N)} sgn(στ) ∫_{X^N} ∏_{i=1}^N φ_{σ(i)}(x_i) ψ_{τ(i)}(x_i) μ(dx_1) ⋯ μ(dx_N)
      = ∑_{σ,τ ∈ S(N)} sgn(στ) ∏_{i=1}^N G_{σ(i)τ(i)} = N! det G.

This implies that the normalization constant c_N in the definition of the biorthogonal ensemble above is equal to (N! det G)^{−1}, and that the matrix G is invertible.
We need to prove that ρ_n(x_1, …, x_n) = det[K(x_i, x_j)]_{i,j=1}^n for any n ≥ 1. For n > N the statement is trivial, as both sides vanish (the right-hand side vanishes because the matrix under the determinant has rank at most N, due to the explicit formula for K). Assume n ≤ N. By formula (3.1) for the correlation functions,

    ρ_n(x_1, …, x_n) = N! c_N/(N − n)! ∫_{X^{N−n}} det[φ_i(x_j)]_{i,j=1}^N det[ψ_i(x_j)]_{i,j=1}^N μ(dx_{n+1}) ⋯ μ(dx_N).
Let us introduce new functions

    φ'_k = ∑_{l=1}^N A_{kl} φ_l,    ψ'_k = ∑_{l=1}^N B_{kl} ψ_l,    k = 1, …, N,

where the matrices A = [A_{kl}] and B = [B_{kl}] are chosen so that

    ⟨φ'_i, ψ'_j⟩_{L²(X,μ)} = [A G B^t]_{ij} = δ_{ij}.
Also, AGB^t = Id implies det A · det B · det G = 1. Hence, the formula for the correlation functions can be rewritten as

    ρ_n(x_1, …, x_n) = 1/(N − n)! ∫_{X^{N−n}} det[φ'_i(x_j)]_{i,j=1}^N det[ψ'_i(x_j)]_{i,j=1}^N μ(dx_{n+1}) ⋯ μ(dx_N).
Opening up the determinants and using the fact that the φ'_i's and ψ'_j's are biorthonormal, we obtain

    ρ_n(x_1, …, x_n) = 1/(N − n)! ∑_{σ,τ ∈ S(N): σ(k)=τ(k), k>n} sgn(στ) ∏_{i=1}^n φ'_{σ(i)}(x_i) ψ'_{τ(i)}(x_i)
      = ∑_{1 ≤ j_1 < ⋯ < j_n ≤ N} det[φ'_{j_l}(x_k)]_{k,l=1}^n det[ψ'_{j_l}(x_k)]_{k,l=1}^n,

where the subset {j_1, …, j_n} plays the role of the common image σ({1, …, n}) = τ({1, …, n}), and the (N − n)! ways of arranging the common values σ(k) = τ(k), k > n, cancel the prefactor. The Cauchy–Binet formula now yields ρ_n(x_1, …, x_n) = det[ΦΨ^t], where Φ = [φ'_j(x_i)] and Ψ = [ψ'_j(x_i)] are n × N matrices, and

    [ΦΨ^t]_{ij} = ∑_{k,l,m=1}^N A_{kl} B_{km} φ_l(x_i) ψ_m(x_j) = ∑_{l,m=1}^N [A^t B]_{lm} φ_l(x_i) ψ_m(x_j).

Since AGB^t = Id implies A^t B = G^{−t}, the last expression is exactly K(x_i, x_j), which concludes the proof.
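Theorem 3.9 is concrete enough to verify by exact computation. The sketch below builds a toy biorthogonal ensemble with N = 2 on a 4-point space (the functions φ_i(x) = x^{i−1}, ψ_i(x) = x^{i−1}·2^{−x} are an illustrative choice, not from the text) and checks that the 1- and 2-point correlation functions equal the determinants of the kernel K(x, y) = Σ φ_i(x)[G^{−t}]_{ij}ψ_j(y).

```python
from fractions import Fraction

X = [0, 1, 2, 3]
wt = {x: Fraction(1, 2 ** x) for x in X}
phi = [lambda x: Fraction(1), lambda x: Fraction(x)]
psi = [lambda x: wt[x], lambda x: Fraction(x) * wt[x]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Gram matrix, its determinant, and its inverse
G = [[sum(phi[i](x) * psi[j](x) for x in X) for j in range(2)] for i in range(2)]
dG = det2(G)
Ginv = [[G[1][1] / dG, -G[0][1] / dG], [-G[1][0] / dG, G[0][0] / dG]]

def K(x, y):
    """Kernel of Theorem 3.9: K(x, y) = sum_ij phi_i(x) [G^{-t}]_{ij} psi_j(y)."""
    return sum(phi[i](x) * Ginv[j][i] * psi[j](y) for i in range(2) for j in range(2))

def weight(x, y):
    """Unnormalized weight det[phi_i(x_j)] det[psi_i(x_j)] of the ordered pair."""
    return det2([[phi[0](x), phi[0](y)], [phi[1](x), phi[1](y)]]) * \
           det2([[psi[0](x), psi[0](y)], [psi[1](x), psi[1](y)]])

Z = sum(weight(x, y) for x in X for y in X)
assert Z == 2 * dG                      # normalization constant = N! det G

rho1 = {x: sum(2 * weight(x, y) for y in X if y != x) / Z for x in X}
for x in X:
    assert rho1[x] == K(x, x)           # 1-point correlation function
    for y in X:
        if y != x:                      # 2-point correlation function
            assert 2 * weight(x, y) / Z == K(x, x) * K(y, y) - K(x, y) * K(y, x)
print("total density:", sum(rho1.values()))
```

All arithmetic is exact (rational), so the assertions test the theorem itself rather than floating-point proximity.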
The assumption of finiteness is not necessary as long as the sums in (3.2) converge.
Theorem 3.10. Let (u_1, …, u_n) and (v_1, …, v_n) be two n-tuples of vertices of our graph, and assume that for any nonidentical permutation σ ∈ S(n),

    { (π_1, …, π_n) | π_i : u_i → v_{σ(i)},  π_i ∩ π_j = ∅ for i ≠ j,  i, j = 1, …, n } = ∅.

Then the total weight of the families of pairwise nonintersecting paths joining u_i to v_i, i = 1, …, n, equals det[T(u_i, v_j)]_{i,j=1}^n, where T(u, v) denotes the sum of the weights of all paths joining u to v.
Theorem 3.10 means that if, in a suitable weighted oriented graph, we have nonintersecting paths with fixed starting and ending vertices, distributed according to their weights, then the distribution of the intersection points of these paths with any chosen section has the same structure as in Definition 3.8, and thus by Theorem 3.9 we obtain a determinantal point process. More generally, the distributions of the intersection points of the paths with finitely many distinct sections also form a determinantal point process. The latter statement is known as the Eynard–Mehta theorem, see e.g. [Borodin-Rains-05] and references therein.
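A small brute-force check of the determinantal path-counting identity (often called the Lindström–Gessel–Viennot lemma) is easy to run. In the sketch below all path weights equal 1, the graph is a tiny grid with east/north steps, and the chosen endpoints (an illustrative example, not from the text) satisfy the hypothesis of Theorem 3.10.

```python
from math import comb

def paths(a, b):
    """All monotone (east/north) lattice paths from a to b, as tuples of vertices."""
    (x, y), (X, Y) = a, b
    if (x, y) == (X, Y):
        return [((x, y),)]
    out = []
    if x < X:
        out += [((x, y),) + p for p in paths((x + 1, y), b)]
    if y < Y:
        out += [((x, y),) + p for p in paths((x, y + 1), b)]
    return out

u = [(0, 0), (0, 1)]
v = [(2, 1), (2, 2)]

# hypothesis of the theorem: for the crossing assignment u1 -> v2, u2 -> v1,
# every pair of paths shares a vertex
assert all(set(p1) & set(p2)
           for p1 in paths(u[0], v[1]) for p2 in paths(u[1], v[0]))

# brute-force count of vertex-disjoint pairs (pi_1 : u1 -> v1, pi_2 : u2 -> v2)
disjoint = sum(1 for p1 in paths(u[0], v[0]) for p2 in paths(u[1], v[1])
               if not set(p1) & set(p2))

# determinant of the single-path counting matrix (all weights equal to 1)
T = [[comb(v[j][0] - u[i][0] + v[j][1] - u[i][1], v[j][1] - u[i][1])
      for j in range(2)] for i in range(2)]
print(disjoint, T[0][0] * T[1][1] - T[0][1] * T[1][0])
```

Both counts equal 3 for this configuration: the determinant of ordinary path counts computes the number of nonintersecting families.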
A continuous time analog of Theorem 3.10 goes back to [Karlin-Mcgregor-59], who in particular proved the following statement (the next paragraph is essentially a quotation).
Consider a stationary stochastic process whose state space is an interval of the extended real line. Assume that the process has the strong Markov property and that its paths are everywhere continuous. Take n points x_1 < ⋯ < x_n and n Borel sets E_1 < ⋯ < E_n, and suppose n labeled particles start at x_1, …, x_n and execute the process simultaneously and independently. Then the determinant det[P_t(x_i, E_j)]_{i,j=1}^n, with P_t(x, E) being the transition probability of the process, is equal to the probability that at time t the particles will be found in the sets E_1, …, E_n, respectively, without any of them ever having been coincident in the intervening time.
Similarly to Theorem 3.10, this statement coupled with Theorem 3.9 (or, more generally, with the Eynard–Mehta theorem) leads to determinantal processes.
The aim of this section is to connect Last Passage Percolation with certain probability measures on the set of Young diagrams, often referred to as the Plancherel measures for the symmetric groups.
We start from the Poisson process in the first quadrant, as in Theorem 1.3 and Figure 5, but now we rotate the quadrant by 45 degrees, as in Figure 4. There is a graphical way to find the value of L(θ). Namely, for each point of the process draw two rays starting from it and parallel to the axes. Extend each ray till its first intersection with another ray. In this way we get a collection of broken lines, as shown in Figure 8. At the first intersection points of the rays we put new points, which form the second generation. Note now that L(θ) is equal to the number of broken lines separating (θ, θ) and the origin.
As it turns out, it is beneficial to iterate this process. We erase all the points of the original Poisson process, but keep the points of the second generation and draw broken lines joining them; we repeat this until no points inside the square with vertices (0, 0), (0, θ), (θ, 0), and (θ, θ) are left, as shown in Figure 9. Compute the number of broken lines separating (θ, θ) and (0, 0) at each step and record these numbers to form a Young diagram
Figure 8: Points of the Poisson process (in black), broken lines joining them, and points of the second generation (in light blue). The maximal number of points collected along monotonous paths joining (θ, θ) (the big green point) and (0, 0) coincides with the number of broken lines separating them, which is 4 in our case. Only the points inside the dashed square matter.
Figure 9: Left panel: Points of the second generation (in light blue), broken lines joining them, and points of the third generation (in dark red). Right panel: Points of the third generation (in dark red), broken lines joining them, and points of the fourth generation (in purple).
obtained tableau. Therefore, given that the size n(θ) of λ(θ) is m, the number of boxes in λ(θ) is also m, and the conditional distribution of λ(θ) is

    P(λ(θ) = λ | n(θ) = m) = dim²(λ)/m!.    (4.2)

In particular, ∑_{|λ|=n} dim²(λ) = n!.
On the other hand, if we recall the definition of dim(λ) as the dimension of the irreducible representation of the symmetric group S(n) indexed by λ, then, taking into account that |S(n)| = n!, the equality (4.2) is nothing other than the celebrated Burnside identity, which says that the squares of the dimensions of the irreducible complex representations of any finite group sum up to the number of elements in the group.
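The Burnside identity for S(n) can be verified directly with the hook length formula dim(λ) = n!/∏ h(□), a standard fact not derived in the text. Here is a self-contained Python check for small n.

```python
from math import factorial

def partitions(n, maxpart=None):
    """All partitions of n as weakly decreasing tuples."""
    maxpart = n if maxpart is None else maxpart
    if n == 0:
        return [()]
    out = []
    for k in range(min(n, maxpart), 0, -1):
        out += [(k,) + rest for rest in partitions(n - k, k)]
    return out

def dim(lam):
    """dim(lambda) via the hook length formula: n! / product of hook lengths."""
    n = sum(lam)
    col = lambda j: sum(1 for row in lam if row > j)   # column lengths
    hooks = 1
    for i, row in enumerate(lam):
        for j in range(row):
            hooks *= (row - j - 1) + (col(j) - i - 1) + 1   # arm + leg + 1
    return factorial(n) // hooks

for n in range(1, 8):
    assert sum(dim(lam) ** 2 for lam in partitions(n)) == factorial(n)
print("Burnside identity verified for n = 1..7; dim(3,2,1) =", dim((3, 2, 1)))
```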
Let us now give some intuition as to why the asymptotic behavior of L(θ) should be related to that of growth models and, in particular, to the ballistic deposition that we started with in Section 1. Introduce coordinates (z, t) in Figure 8 so that t is the vertical coordinate, and consider the following growth model. At time t the height profile is given by an integer-valued (random) function h(x, t); at time zero h(x, 0) ≡ 0. At any given time t and any point x, the left and right x-limits of the function h(x, t) differ by at most 1; in other words, h(x, t) is almost surely a step function with steps (+1) (up steps, when we read the values of the function from left to right) and (−1) (down steps). If there is a point of the Poisson process (of Figure 8) at (z, t), then at time t a seed is born at position x = z, which is a combination of an up and a down step, i.e. h(z, t) increases by 1. After that, the down step starts moving with speed 1 to the right, while the up step starts moving to the left with the same speed. When the next seed is born, another pair of up and down steps appears and also starts moving. When an up and a down step (born by different seeds)
meet each other, they disappear. This model is known as the Polynuclear Growth model (PNG), see [Meakin-98], [Prahofer-Spohn-00]; a very nice computer simulation of it is available at Ferrari's website [Ferrari], and in Figure 10 we show one possible height function.
Figure 10: Height function of the PNG model after the birth of 3 seeds.
Coming back to Figure 8, note that its broken lines symbolize the space-time trajectories of the up/down steps, while the second generation points are identified with the collisions of up and down steps. In particular, the positions of the up and down steps at time t = t_0 are the points of intersection of the line t = t_0 in Figure 8 with the parts of the broken lines of slope (−1) and (+1), respectively. Now it is easy to see that the PNG height h(0, t) at time t and point 0 is precisely the Last Passage Percolation time L(t). In order to observe the full Young diagram λ(θ) one should introduce the multi-layer PNG model, where a seed on level k, k ≥ 2, is born when up and down steps collide at level k − 1, and the position of the seed coincides with the position of the collision, see [Prahofer-Spohn-02] for details.
The PNG model is in the KPZ universality class, and obtaining information on its asymptotic behavior (roughness of the interface, fluctuation distributions) would give us similar (although conjectural) statements for other members of the same universality class.
The aim of this section is to show how the asymptotic behavior of the Poissonized
Plancherel measure and certain more general distributions on Young diagrams can be
analyzed.
Take any two Schur-positive specializations ρ_1, ρ_2 of the algebra of symmetric functions (those were classified in Theorem 2.11). The following definition first appeared in [Okounkov-01].
Definition 5.1. The Schur measure S_{ρ_1;ρ_2} is a probability measure on the set of all Young diagrams defined through

    P_{ρ_1,ρ_2}(λ) = s_λ(ρ_1) s_λ(ρ_2) / H(ρ_1; ρ_2),

where the normalizing constant H(ρ_1; ρ_2) is given by

    H(ρ_1; ρ_2) = exp( ∑_{k=1}^∞ p_k(ρ_1) p_k(ρ_2) / k ).
Remark. The above definition makes sense only if ρ_1, ρ_2 are such that

    ∑_λ s_λ(ρ_1) s_λ(ρ_2) < ∞,    (5.1)

and in the latter case this sum equals H(ρ_1; ρ_2), as follows from Theorem 2.5. The convergence of (5.1) is guaranteed, for instance, if |p_k(ρ_1)| < C r^k and |p_k(ρ_2)| < C r^k with some constants C > 0 and 0 < r < 1. In what follows we assume that this (or a similar) condition is always satisfied.
Proposition 5.2. Let ρ_θ be the (Schur-positive) specialization with single non-zero parameter γ = θ, i.e.

    p_1(ρ_θ) = θ,    p_k(ρ_θ) = 0, k > 1.

Then P_{ρ_θ, ρ_θ} is the Poissonized Plancherel measure (4.1).
Proof. This is an immediate corollary of Proposition 2.15.
Our next goal is to show that any Schur measure is a determinantal point process. Given a Young diagram λ, we associate to it the point configuration X(λ) = {λ_i − i + 1/2} ⊂ Z + 1/2. This is similar to the correspondence shown in Figures 2, 3. Note that X(λ) is semi-infinite: there are finitely many points to the right of the origin, but almost all points to the left of the origin belong to X(λ).
Theorem 5.3 ([Okounkov-01]). Suppose that the Young diagram λ is distributed according to the Schur measure S_{ρ_1;ρ_2}. Then X(λ) is a determinantal point process on Z + 1/2 with correlation kernel K(i, j) defined by the generating series

    ∑_{i,j ∈ Z+1/2} K(i, j) v^i w^{−j} = ( H(ρ_1; v) H(ρ_2; w^{−1}) / ( H(ρ_2; v^{−1}) H(ρ_1; w) ) ) ∑_{k=1/2, 3/2, 5/2, …} (w/v)^k,    (5.2)

where

    H(ρ; z) = ∑_{k=0}^∞ h_k(ρ) z^k = exp( ∑_{k=1}^∞ p_k(ρ) z^k / k ).
Remark 1. If we expand the H-functions on the right-hand side of (5.2) into power series and multiply the resulting expressions, then (5.2) can be viewed as a formal identity of power series.
Remark 2. There is also an analytical point of view on (5.2). Using the fact that (under suitable convergence conditions) the contour integral around zero

    1/(2πi) ∮ ( ∑_{k=−∞}^∞ a_k z^k ) dz / z^{n+1}

is equal to a_n, and also that when |w| < |v| we have

    ∑_{k=1/2, 3/2, 5/2, …} (w/v)^k = √(vw) / (v − w),

we can rewrite (5.2) in the form

    K(i, j) = 1/(2πi)² ∮∮ ( H(ρ_1; v) H(ρ_2; w^{−1}) / ( H(ρ_2; v^{−1}) H(ρ_1; w) ) ) ( √(vw) / (v − w) ) dv dw / ( v^{i+1} w^{−j+1} ),    (5.3)

with the integration going over two circles around the origin, |w| = R_1 and |v| = R_2, such that R_1 < R_2 and the functions H(ρ_1; u), H(ρ_2; u^{−1}) are holomorphic in the annulus R_1 − ε < |u| < R_2 + ε. In particular, if |p_k(ρ_1)| < C r^k and |p_k(ρ_2)| < C r^k with some constants C > 0 and 0 < r < 1, then any r < R_1 < R_2 < r^{−1} are suitable.
We now present a proof of Theorem 5.3 due to Johansson [Johansson-01b]; see [Okounkov-01] for the original proof.
Proof of Theorem 5.3. We have to prove that for any finite set A = {a_1, …, a_m} ⊂ Z + 1/2 we have

    ∑_{λ : A ⊂ X(λ)} s_λ(ρ_1) s_λ(ρ_2) / H(ρ_1; ρ_2) = det[K(a_i, a_j)]_{i,j=1}^m.

For this it suffices to prove the following formal identity of power series. Let x = (x_1, …, x_N) and y = (y_1, …, y_N) be two finite sets of variables; then

    ∑_{λ : A ⊂ X(λ)} s_λ(x) s_λ(y) / ∏_{i,j} (1 − x_i y_j)^{−1} = det[ K̂(a_i, a_j) ]_{i,j=1}^m.    (5.4)

Viewing the Schur measure in the variables x and y as a biorthogonal ensemble for the N-point configuration {λ_i + N − i}, with φ_i(ℓ) = x_i^ℓ and ψ_j(ℓ) = y_j^ℓ, Theorem 3.9 provides a correlation kernel; the kernel K̂ of (5.4) is obtained from it by the shift of the arguments by N − 1/2 coming from the identification λ_i − i + 1/2 = (λ_i + N − i) − (N − 1/2). The kernel of Theorem 3.9 here reads

    K̃(ℓ_1, ℓ_2) = ∑_{i,j=1}^N x_i^{ℓ_1} y_j^{ℓ_2} [G^{−t}]_{ij},    (5.5)

where G^{−t} is the inverse transpose of the N × N Gram matrix

    G_{ij} = ∑_{ℓ ≥ 0} x_i^ℓ y_j^ℓ = 1/(1 − x_i y_j).    (5.6)
The determinant of the matrix in (5.6) admits the closed form

    det[ 1/(1 − x_i y_j) ]_{i,j=1}^N = ∏_{i<j} (x_i − x_j)(y_i − y_j) / ∏_{i,j=1}^N (1 − x_i y_j).

This is known as the Cauchy determinant evaluation. One can prove it directly, see e.g. [Krattenthaler-99]. Another way is to recall that in the proof of Theorem 3.9 we showed that the determinant of G is the normalization constant of the measure, and we know the normalization constant from the very definition of the Schur measure, i.e. by the Cauchy identity.
By Cramer's rule, (G^{−t})_{k,ℓ} is the (k, ℓ) cofactor of G divided by det G. Using the fact that the submatrices of G are matrices of the same type, so that their determinants can also be evaluated by the Cauchy formula, we get

    (G^{−t})_{k,ℓ} = ∏_{j=1}^N (1 − x_j y_ℓ)(1 − x_k y_j) / ( (1 − x_k y_ℓ) ∏_{j≠k} (x_k − x_j) ∏_{j≠ℓ} (y_j − y_ℓ) ).
We claim that

    K̃(ℓ_1, ℓ_2) = ∑_{i,j=1}^N x_i^{ℓ_1} y_j^{ℓ_2} [G^{−t}]_{ij}
      = 1/(2πi)² ∮∮ ∏_{k=1}^N ( (1 − z y_k)(1 − v x_k) / ( (z − x_k)(v − y_k) ) ) ( z^{ℓ_1} v^{ℓ_2} / (1 − zv) ) dz dv,    (5.7)

with the contours chosen so that they enclose the singularities at z = x_k and v = y_k, but do not enclose the singularity at zv = 1. Indeed, (5.7) is just the evaluation of the double integral as the sum of its residues. Changing the variable z = 1/w and shifting ℓ_1, ℓ_2 by N − 1/2 in (5.7), we arrive at (5.3).
Applying Theorem 5.3 to the Poissonized Plancherel measure, we obtain

Corollary 5.4 ([Borodin-Okounkov-Olshanski-00], [Johansson-01a]). Suppose that λ is a random Young diagram distributed by the Poissonized Plancherel measure. Then the points of X(λ) form a determinantal point process on Z + 1/2 with correlation kernel

    K_θ(i, j) = 1/(2πi)² ∮∮ exp( θ(v − v^{−1} − w + w^{−1}) ) ( √(vw) / (v − w) ) dv dw / ( v^{i+1} w^{−j+1} ),

with the integration over positively oriented simple contours enclosing zero and such that |w| < |v|.
Our next aim is to study the behavior of K_θ(i, j) as θ → ∞. The argument below is due to Okounkov [Okounkov-03], but the results were obtained earlier in [Borodin-Okounkov-Olshanski-00], [Johansson-01a] by different tools. Let us start from the case i = j. Then K_θ(i, i) is the density of particles of our point process or, looking at Figure 3, the average local slope of the (rotated) Young diagram. Intuitively, one expects to see some non-trivial behavior when i is of order θ. To see that, set i = uθ. Then K_θ transforms into

    K_θ(uθ, uθ) = 1/(2πi)² ∮∮ exp( θ(S(v) − S(w)) ) dv dw / ( (v − w) √(vw) ),    (5.8)

with

    S(z) = z − z^{−1} − u ln z.
Our next aim is to deform the contours of integration so that ℜ(S(v) − S(w)) < 0 on them. (It is ok if ℜ(S(v) − S(w)) = 0 at finitely many points.) If we manage to do that, then (5.8) will decay as θ → ∞. Let us try to do this. First, compute the critical points of S(z), i.e. the roots of its derivative

    S′(z) = 1 + z^{−2} − u z^{−1}.

When |u| < 2 the equation S′(z) = 0 has two complex conjugate roots of absolute value 1, which we denote e^{±iφ}. Here φ satisfies 2 cos(φ) = u. Let us deform the contours so that both of them pass through the critical points and look as shown in Figure 11. We
Figure 11: Deformed contours: v-contour in blue and w-contour in green. The dashed contour is the unit circle and the black dots indicate the critical points of S(z).
claim that now ℜS(v) < 0 everywhere on its contour except at the critical points e^{±iφ}, and ℜS(w) > 0 everywhere on its contour except at the critical points e^{±iφ} (ℜS(v) = ℜS(w) = 0 at e^{±iφ}). To prove this, observe that ℜS(z) = 0 for z on the unit circle |z| = 1, and compute the gradient of ℜS(z) = ℜS(a + bi) on the unit circle (i.e. when a² + b² = 1):

    ℜS(a + bi) = a − a/(a² + b²) − (u/2) ln(a² + b²),

    ∇ℜS(a + bi) = ( 1 − (b² − a²)/(a² + b²)² − au/(a² + b²),  2ab/(a² + b²)² − bu/(a² + b²) )
      = ( 1 − b² + a² − au,  2ab − bu ) = ( 2a² − au,  2ab − bu ) = (2a − u)(a, b).    (5.9)

Identity (5.9) implies that the gradient vanishes at the points e^{±iφ}, points outwards from the unit circle on the right arc joining the critical points, and points inwards on the left arc. This implies our inequalities for ℜS(z) on the contours. (We assume that the contours are fairly close to the unit circle so that the gradient argument works.)
Now it follows that after the deformation of the contours the integral vanishes as θ → ∞. Does this mean that the correlation functions also vanish? Actually, no. The reason is that the integrand in (5.8) has a singularity at v = w. Therefore, when we deform the contours from the contour configuration with |w| < |v|, as we had in Corollary 5.4, to the contours of Figure 11, we pick up the residue of the integrand in (5.8) at v = w along the arc of the unit circle joining e^{−iφ} and e^{iφ}. This residue is

    1/(2πi) ∫_{e^{−iφ}}^{e^{iφ}} dz/z = φ/π.

Turning to the original picture, we see that the asymptotic density of particles at the point uθ changes from 0 when u ≥ 2 to 1 when u ≤ −2. This means that after rescaling by the factor θ^{−1} the Plancherel-random Young diagram asymptotically looks like Figure 12. This is a manifestation of the Vershik–Kerov–Logan–Shepp limit shape theorem, see [Vershik-Kerov-77], [Logan-Shepp-77].
In other words,

    lim_{θ→∞} K_θ(uθ, uθ) = φ/π = arccos(u/2)/π.

Figure 12: The celebrated Vershik–Kerov–Logan–Shepp curve as a limit shape for the Plancherel random Young diagrams.

More generally, one proves the following statement.

Theorem 5.5. For any u ∈ (−2, 2) and any x, y,

    lim_{θ→∞} K_θ(⌊uθ⌋ + x, ⌊uθ⌋ + y) = sin(φ(x − y)) / (π(x − y)) if x ≠ y, and φ/π otherwise,    (5.10)

where φ = arccos(u/2).
Remark. The right-hand side of (5.10) is known as the discrete sine kernel, and it is similar to the continuous sine kernel, which arises as a universal local limit of the correlation functions for eigenvalues of random Hermitian (Wigner) matrices, see e.g. [Erdos-Yau-12], [Tao-Vu-12] and references therein.
Proof of Theorem 5.5. The whole argument remains the same as in the case x = y = 0, except for the computation of the residue, which is now

    1/(2πi) ∫_{e^{−iφ}}^{e^{iφ}} dz / z^{x−y+1} = sin(φ(x − y)) / (π(x − y)).    (5.11)
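The residue computation (5.11) is elementary to verify numerically: parametrizing the arc as z = e^{iα}, α ∈ [−φ, φ], and applying midpoint quadrature reproduces the discrete sine kernel values. A short Python sketch:

```python
import cmath
from math import pi, sin

def arc_integral(m, phi, steps=20000):
    """(1/(2 pi i)) * integral of dz / z^{m+1} over the unit-circle arc
    from e^{-i phi} to e^{i phi}, by midpoint quadrature in the angle."""
    h = 2 * phi / steps
    total = 0j
    for k in range(steps):
        a = -phi + (k + 0.5) * h
        z = cmath.exp(1j * a)
        total += 1j * z * z ** (-(m + 1)) * h    # dz = i z da
    return total / (2j * pi)

phi = pi / 3
for m in (1, 2, 3, -2):
    val = arc_integral(m, phi)
    assert abs(val.imag) < 1e-9
    assert abs(val.real - sin(phi * m) / (pi * m)) < 1e-6
print("residue computation matches the discrete sine kernel")
```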
So far we have gained some understanding of what is happening in the bulk, while we started with Last Passage Percolation, which is related to the so-called edge asymptotic behavior, i.e. the limit fluctuations of λ_1. This corresponds to u = 2, at which point the above arguments no longer work. With some additional effort one can prove the following theorem:
Theorem 5.6 ([Borodin-Okounkov-Olshanski-00], [Johansson-01a]). For any two reals x, y we have

    lim_{θ→∞} θ^{1/3} K_θ(2θ + xθ^{1/3}, 2θ + yθ^{1/3}) = K_Airy(x, y),    (5.12)

where

    K_Airy(x, y) = 1/(2πi)² ∫∫ e^{ṽ³/3 − w̃³/3 − xṽ + yw̃} dṽ dw̃ / (ṽ − w̃),    (5.13)

with the contours as in the right panel of Figure 13. One shows that the Fredholm determinant det(1 − K_Airy)_{L²(s,+∞)} is the Tracy–Widom distribution F₂(s) from Section 1, see [Tracy-Widom-94].
Proof of Theorem 5.6. We start as in the proof of Theorem 5.5. When u = 2 the two critical points of S(z) merge, so that the contours now look as in Figure 13 (left panel) and the integral in (5.11) vanishes. Therefore, the correlation functions near the edge tend to 0. This is caused by the fact that the points of our process near the edge rarify, the distances between them become large, and the probability of finding a point at any given location tends to 0.
Figure 13: Contours for the edge-scaling limit (u = 2). Left panel: v-contour in blue and w-contour in green. The dashed contour is the unit circle. Right panel: limiting contours.
To see the non-trivial limiting behavior, we change the variables

    v = 1 + θ^{−1/3} ṽ,    w = 1 + θ^{−1/3} w̃

in the contour integral. Note that z = 1 is a double critical point of S(z) = z − z^{−1} − 2 ln(z), so that in the neighborhood of 1 we have

    S(z) = (1/3)(z − 1)³ + O((z − 1)⁴).

Now, as θ → ∞, we have

    exp( θ(S(v) − S(w)) ) = exp( (θ/3)(θ^{−1/3} ṽ)³ − (θ/3)(θ^{−1/3} w̃)³ + o(1) ) → exp( ṽ³/3 − w̃³/3 ).
We conclude that as θ → ∞,

    θ^{1/3} K_θ(2θ + xθ^{1/3}, 2θ + yθ^{1/3}) → 1/(2πi)² ∫∫ e^{ṽ³/3 − w̃³/3 − xṽ + yw̃} dṽ dw̃ / (ṽ − w̃),    (5.14)

and the contours here are the contours of Figure 13 (left panel) in the neighborhood of 1; they are shown in the right panel of Figure 13.
Using Theorem 5.6 we can now compute the asymptotic behavior of the Last Passage Percolation time, i.e. of λ_1. Using the inclusion–exclusion principle, for any A ∈ R we have

    P(λ_1 ≤ A) = 1 − ∑_{x>A} ρ_1(x) + 1/2! ∑_{x,y>A} ρ_2(x, y) − 1/3! ∑_{x,y,z>A} ρ_3(x, y, z) + ⋯    (5.15)

Recall that the correlation functions ρ_k are k × k determinants involving the kernel K_θ; substitute A = 2θ + sθ^{1/3} and send θ → ∞. The sums in (5.15) turn into integrals, and we get

    lim_{θ→∞} P(λ_1 ≤ 2θ + sθ^{1/3}) = 1 − ∫_{x>s} K_Airy(x, x) dx + 1/2! ∫∫_{x,y>s} det[K_Airy]_{2×2} dx dy − ⋯
In the last expression one recognizes the Fredholm determinant expansion (see e.g. [Lax-02] or [Simon-05]) for

    det(1 − K_Airy)_{L²(s,+∞)}.
The conceptual conclusion from all of the above is that as soon as we have an integral representation for the correlation kernel of a point process, many limiting questions can be answered by analyzing these integrals. The method of analysis that we presented is actually quite standard and has been well known (at least since the XIXth century) under the name of the steepest descent method. In the context of determinantal point processes and Plancherel measures it was pioneered by Okounkov, and we recommend [Okounkov-03] for additional details.
While in the previous sections we gave a few tools for solving problems of probabilistic origin, in this section we present a general framework which produces analyzable models.
6.1

The Schur process is a probability measure on sequences of Young diagrams

    λ⁽¹⁾ ⊃ μ⁽¹⁾ ⊂ λ⁽²⁾ ⊃ μ⁽²⁾ ⊂ ⋯ ⊃ μ⁽ᴺ⁻¹⁾ ⊂ λ⁽ᴺ⁾,

depending on 2N Schur-positive specializations ρ⁺₀, …, ρ⁺_{N−1}, ρ⁻₁, …, ρ⁻_N and given by

    P(λ⁽¹⁾, μ⁽¹⁾, λ⁽²⁾, μ⁽²⁾, …, λ⁽ᴺ⁾)
      = (1/Z) s_{λ⁽¹⁾}(ρ⁺₀) s_{λ⁽¹⁾/μ⁽¹⁾}(ρ⁻₁) s_{λ⁽²⁾/μ⁽¹⁾}(ρ⁺₁) ⋯ s_{λ⁽ᴺ⁾/μ⁽ᴺ⁻¹⁾}(ρ⁺_{N−1}) s_{λ⁽ᴺ⁾}(ρ⁻_N),    (6.1)

where Z is the normalization constant.
Schematically, the specializations link the consecutive diagrams as follows:

    ∅ →(ρ⁺₀) λ⁽¹⁾ →(ρ⁻₁) μ⁽¹⁾ →(ρ⁺₁) λ⁽²⁾ →(ρ⁻₂) μ⁽²⁾ →(ρ⁺₂) λ⁽³⁾ →(ρ⁻₃) ⋯
Here, as before,

    H(ρ; ρ′) = exp( ∑_{k=1}^∞ p_k(ρ) p_k(ρ′) / k ).
Given two specializations ρ₁, ρ₂, their union (ρ₁, ρ₂) is defined through its values on the power sums p_k:

    p_k(ρ₁, ρ₂) = p_k(ρ₁) + p_k(ρ₂).

Theorem 2.11 implies that if ρ₁ is a Schur-positive specialization with parameters (α⁽¹⁾, β⁽¹⁾, γ⁽¹⁾), and ρ₂ is a Schur-positive specialization with parameters (α⁽²⁾, β⁽²⁾, γ⁽²⁾), then (ρ₁, ρ₂) is a Schur-positive specialization with parameters (α⁽¹⁾ ∪ α⁽²⁾, β⁽¹⁾ ∪ β⁽²⁾, γ⁽¹⁾ + γ⁽²⁾), where α⁽¹⁾ ∪ α⁽²⁾ stands for the sequence obtained by rearranging the union of the sequences α⁽¹⁾ and α⁽²⁾ in decreasing order (and similarly for β⁽¹⁾ ∪ β⁽²⁾). In particular, if ρ₁ and ρ₂ specialize symmetric functions by substituting sets of variables, say (x₁, x₂, …) and (y₁, y₂, …) (which corresponds to zero β_i's and γ), then (ρ₁, ρ₂) substitutes all the variables (x₁, x₂, …, y₁, y₂, …).
The definition implies that for specializations ρ₁, …, ρ_k, ρ′₁, …, ρ′_m we have

    H(ρ₁, …, ρ_k; ρ′₁, …, ρ′_m) = ∏_{i=1}^k ∏_{j=1}^m H(ρ_i; ρ′_j).

The normalization constant of the Schur process turns out to be

    Z = ∏_{i<j} H(ρ⁺_i; ρ⁻_j).
We will also use the skew Cauchy identity

    ∑_λ s_{λ/μ}(ρ) s_{λ/ν}(ρ′) = H(ρ; ρ′) ∑_κ s_{μ/κ}(ρ′) s_{ν/κ}(ρ)

and the branching rule

    s_{λ/μ}(ρ, ρ′) = ∑_ν s_{λ/ν}(ρ) s_{ν/μ}(ρ′).

The above identities are valid for any specializations ρ, ρ′ such that all the sums are convergent, and they are just the results of the application of these specializations to the statements of Propositions 2.7 and 2.9.
We have:

    ∑ s_{λ⁽¹⁾}(ρ⁺₀) s_{λ⁽¹⁾/μ⁽¹⁾}(ρ⁻₁) s_{λ⁽²⁾/μ⁽¹⁾}(ρ⁺₁) ⋯ s_{λ⁽ᴺ⁾/μ⁽ᴺ⁻¹⁾}(ρ⁺_{N−1}) s_{λ⁽ᴺ⁾}(ρ⁻_N)
      = H(ρ⁺₀; ρ⁻₁) ∑ s_{μ⁽¹⁾}(ρ⁺₀) s_{λ⁽²⁾/μ⁽¹⁾}(ρ⁺₁) ⋯ s_{λ⁽ᴺ⁾/μ⁽ᴺ⁻¹⁾}(ρ⁺_{N−1}) s_{λ⁽ᴺ⁾}(ρ⁻_N)
      = H(ρ⁺₀; ρ⁻₁) ∑ s_{λ⁽²⁾}(ρ⁺₀, ρ⁺₁) s_{λ⁽²⁾/μ⁽²⁾}(ρ⁻₂) ⋯ s_{λ⁽ᴺ⁾/μ⁽ᴺ⁻¹⁾}(ρ⁺_{N−1}) s_{λ⁽ᴺ⁾}(ρ⁻_N),    (6.2)

where on the first step we applied the skew Cauchy identity to the sum over λ⁽¹⁾, and on the second step the branching rule to the sum over μ⁽¹⁾; we also used the fact that s_{λ/μ} = 0 unless μ ⊂ λ, and s_{∅/∅} = 1. Note that the summation in the last line of (6.2) runs over λ⁽²⁾, μ⁽²⁾, …, μ⁽ᴺ⁻¹⁾, λ⁽ᴺ⁾, i.e. there is no summation over λ⁽¹⁾, μ⁽¹⁾ anymore. Iterating this procedure, we get the value of the normalization constant Z.
It turns out that the one-dimensional marginals of the Schur process are Schur measures:

Proposition 6.3. The projection of the Schur process to the Young diagram λ⁽ᵏ⁾ is the Schur measure S_{ρ₁;ρ₂} with specializations

    ρ₁ = (ρ⁺₀, ρ⁺₁, …, ρ⁺_{k−1}),    ρ₂ = (ρ⁻_k, ρ⁻_{k+1}, …, ρ⁻_N).
6.2

Figure 15: Left panel: A plane partition of volume 5. Right panel: The corresponding 3d Young diagram.
which is one of the simplest (speaking of the definition, not of the properties) possible probability measures on this set. The normalization constant M is given by the celebrated MacMahon formula (see [MacMahon-1912], [Stanley-99, Section 7.20], [Macdonald-95, Chapter I, Section 5, Example 13])

    M = ∏_{n=1}^∞ (1 − qⁿ)^{−n}.
We claim that the above measure can be described via a Schur process. In fact this is
a particular case of a more general statement that we now present.
Definition 6.4. Fix two natural numbers A and B. For a Young diagram π ⊂ B^A = (B, …, B) (A times), set π̄ = B^A / π. A skew plane partition Π with support π̄ is a filling of all boxes of π̄ by nonnegative integers Π_{i,j} (we assume that Π_{i,j} is located in the ith row and jth column of B^A) such that Π_{i,j} ≥ Π_{i,j+1} and Π_{i,j} ≥ Π_{i+1,j} for all values of i, j. The volume of the skew plane partition Π is defined as

    volume(Π) = ∑_{i,j} Π_{i,j}.
The measure on the set of skew plane partitions with given support and weights proportional to q^{volume(Π)}, 0 < q < 1, is a Schur process. This fact has been observed and used in [Okounkov-Reshetikhin-01], [Okounkov-Reshetikhin-05], [Okounkov-Reshetikhin-06], [Boutillier-Mkrtchyan-Reshetikhin-Tingley-12], [Borodin-11].
The Schur process will be such that for any two neighboring specializations ρ⁺_k, ρ⁻_k at least one is trivial. This implies that each μ⁽ʲ⁾ coincides either with λ⁽ʲ⁾ or with λ⁽ʲ⁺¹⁾. Thus, we can restrict our attention to the λ⁽ʲ⁾'s only.
For a plane partition Π, we define the Young diagrams λ⁽ᵏ⁾(Π), 1 ≤ k ≤ A + B + 1, via

    λ⁽ᵏ⁾(Π) = ( Π_{i, i+k−A−1} | (i, i + k − A − 1) is a box of the support ).

Note that λ⁽¹⁾ = λ⁽ᴬ⁺ᴮ⁺¹⁾ = ∅. Figure 16 shows a skew plane partition and the corresponding sequence of Young diagrams.
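The diagonal-slice construction is easy to experiment with. The following self-contained Python sketch (the sample plane partition is made up for illustration) extracts the slices λ⁽ᵏ⁾, checks that the extreme slices are empty, that consecutive slices interlace, and that the slice sizes add up to the volume.

```python
# A sample plane partition: rows weakly decreasing left-to-right and top-to-bottom
pp = [[5, 3, 2],
      [4, 3, 1],
      [2, 1, 1]]
A = B = 3      # it sits in the 3 x 3 box, with full (non-skew) support

def diagonal_slice(k):
    """Young diagram lambda^{(k)}: entries pp[i][j] with j - i = k - A - 1 (1-based k)."""
    return tuple(pp[i][i + k - A - 1] for i in range(A)
                 if 0 <= i + k - A - 1 < B)

slices = [diagonal_slice(k) for k in range(1, A + B + 2)]
assert slices[0] == slices[-1] == ()      # lambda^{(1)} = lambda^{(A+B+1)} = empty

def interlace(mu, lam):
    """mu < lam in the sense lam_1 >= mu_1 >= lam_2 >= mu_2 >= ..."""
    mu = list(mu) + [0] * len(lam)
    return all(lam[i] >= mu[i] for i in range(len(lam))) and \
           all(mu[i] >= lam[i + 1] for i in range(len(lam) - 1))

# slices grow up to the main diagonal, then shrink, with interlacing at every step
assert all(interlace(slices[k], slices[k + 1]) for k in range(A))
assert all(interlace(slices[k + 1], slices[k]) for k in range(A, A + B))
assert sum(map(sum, slices)) == sum(map(sum, pp))   # total slice size = volume
print(slices)
```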
We need one more piece of notation. For a Young diagram π ⊂ B^A, define

    L(π) = { A + π_i − i + 1 | i = 1, …, A }.
Figure 16: A skew plane partition; the numbers shown are the entries of the filling.
This is an A-point subset of {1, 2, …, A + B}, and all such subsets are in bijection with the partitions contained in the box B^A; this is similar to the identification of Young diagrams and point configurations used in Theorem 5.3. The elements of L(π) mark the up-right steps in the boundary of π (= the back wall of the support), as in Figure 16.
Theorem 6.5. Let π be a partition contained in the box B^A. The measure on the plane partitions Π with support π̄ and weights proportional to q^{volume(Π)} is the Schur process with

    ρ⁺_j = (q^{−j}; 0; 0) if j ∈ L(π), and ρ⁺_j = (0; 0; 0) if j ∉ L(π);    (6.3)
    ρ⁻_j = (0; 0; 0) if j ∈ L(π), and ρ⁻_j = (q^{j}; 0; 0) if j ∉ L(π);    (6.4)
    ρ⁺₀ = ρ⁻_{A+B+1} = (0; 0; 0),

where (α; 0; 0) denotes the Schur-positive specialization with a single nonzero α-parameter.

Sketch of proof. The conditions on the supports of the skew Schur functions imply that a sequence of Young diagrams has nonzero weight under this Schur process if and only if

    λ⁽ʲ⁾ ≺ λ⁽ʲ⁺¹⁾ if j ∈ L(π),    λ⁽ʲ⁾ ≻ λ⁽ʲ⁺¹⁾ if j ∉ L(π),

where we write μ ≺ λ or λ ≻ μ if λ₁ ≥ μ₁ ≥ λ₂ ≥ μ₂ ≥ ⋯ .
On the other hand, Proposition 2.12 implies that the weight of (λ⁽¹⁾, λ⁽²⁾, …, λ⁽ᴬ⁺ᴮ⁺¹⁾) with respect to the Schur process from the hypothesis is equal to q raised to the power

    ∑_{j=2}^{A+B} |λ⁽ʲ⁾| ( −(j−1) 1_{j−1 ∈ L(π)} − (j−1) 1_{j−1 ∉ L(π)} + j·1_{j ∈ L(π)} + j·1_{j ∉ L(π)} ),

where the four terms are the contributions of ρ⁺_{j−1}, ρ⁻_{j−1}, ρ⁺_j, ρ⁻_j, respectively. Clearly, the sum is equal to ∑_{j=2}^{A+B} |λ⁽ʲ⁾| = volume(Π).
Theorem 6.5 gives a way to analyze random (skew) plane partitions via the approach of Section 5. Using this machinery one can prove various interesting limit theorems describing the asymptotic behavior of the model as q → 1, see [Okounkov-Reshetikhin-01], [Okounkov-Reshetikhin-05], [Okounkov-Reshetikhin-06], [Boutillier-Mkrtchyan-Reshetikhin-Tingley-12].
6.3

so that the probability for the word to have length k at time t equals

    e^{−(a₁ + ⋯ + a_N)t} ((a₁ + ⋯ + a_N)t)ᵏ / k!.
The growth of N(t) can be illustrated by a random point process with a point (t, j) appearing if the letter j is added to the word at time t, see Figure 17. For any T > 0, one can produce a Young diagram λ⁽ᴺ⁾(T) from all the points (t, j) with t ≤ T using the Robinson–Schensted–Knuth (RSK) algorithm, whose geometric version was given in Section 4. (We again refer the reader to [Sagan-01], [Fulton-97], [Knuth-73] for the details of RSK.) In particular, the length of the first row of λ⁽ᴺ⁾(T) equals the maximal number of points one can collect along a monotonous path joining (0, 0) and (T, N), as in Figure 17. More generally, let w_{N−k}(t) be the word obtained from w_N(t) by removing all instances of the letters N, N − 1, …, N − k + 1, and let λ⁽ᴺ⁻ᵏ⁾(T) denote the Young diagram corresponding to w_{N−k}(t), i.e. the Young diagram obtained from all the points (t, j) with t ≤ T, j ≤ N − k.
Proposition 6.6. For any t and N, the collection of (random) Young diagrams λ⁽¹⁾(t), λ⁽²⁾(t), …, λ⁽ᴺ⁾(t) forms a Schur process with probability distribution

    (1/Z) s_{λ⁽¹⁾}(a₁) s_{λ⁽²⁾/λ⁽¹⁾}(a₂) ⋯ s_{λ⁽ᴺ⁾/λ⁽ᴺ⁻¹⁾}(a_N) s_{λ⁽ᴺ⁾}(ρ_t),

where we identify a_i with the Schur-positive specialization with parameter α₁ = a_i and all other parameters 0, and ρ_t is the specialization with single non-zero parameter γ = t.
Proof. This statement follows from the properties of the RSK correspondence, cf. [Johansson-05].
Now let us concentrate on the random vector (ℓ₁(t), …, ℓ_N(t)) = (λ⁽¹⁾₁(t), …, λ⁽ᴺ⁾₁(t)) and try to describe its time evolution as t grows. Recall that in Figure 17 the value of ℓ_k(T) is the maximal number of points collected along monotonous paths joining (0, 0) and (T, k). Suppose that at time t a new letter k appears, so that there is a point (t, k)
Figure 17: Collection of points with coordinates (t, j) corresponding to the growth of a random word, and the path collecting the maximal number of points, ℓ_N(T) = λ⁽ᴺ⁾₁(T) = 5. Here N = 5.
in the picture. It means that ℓ_k grows by 1 (ℓ_k(t) = ℓ_k(t−) + 1), because we can add this point to any path coming to any (t′, k) with t′ < t. Clearly, ℓ_j does not change for j < k. Note that if ℓ_{k+1}(t−) > ℓ_k(t−), then ℓ_{k+1} also does not change, since the maximal number of points collected along a path passing through (t, k) is at most ℓ_k(t). Finally, if ℓ_k(t−) = ℓ_{k+1}(t−) = ⋯ = ℓ_{k+m}(t−), then all the numbers ℓ_k, …, ℓ_{k+m} increase by 1, since the optimal path now goes through (t, k) and collects ℓ_k points.
The above discussion shows that the evolution of (ℓ_1(t), …, ℓ_N(t)) is a Markov process, and it admits the following interpretation. Take N (distinct) particles on Z with coordinates ℓ_1 + 1 < ℓ_2 + 2 < ⋯ < ℓ_N + N. Each particle has an independent exponential clock of rate a_i. When a clock rings, the corresponding particle attempts to jump to the right by one. If that spot is empty, then the particle jumps and nothing else happens. Otherwise, in addition to the jump, the particle pushes all immediately right-adjacent particles by one, see Figure 18 for an illustration of these rules. The dynamics we have just described is known as the Long Range Totally Asymmetric Exclusion Process, and it is also a special case of PushASEP, see [Borodin-Ferrari-08a], [Spitzer-70].
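These rules are easy to simulate. Here is a minimal sketch (function names are ours; rates and the number of steps are illustrative choices):

```python
import random

def pushasep_step(pos, rates, rng):
    """One jump of long range TASEP / PushASEP: pick a particle with
    probability proportional to its clock rate, move it right by one,
    and push the block of immediately adjacent right neighbours."""
    # the first exponential clock to ring belongs to k w.p. rates[k]/sum(rates)
    k = rng.choices(range(len(pos)), weights=rates)[0]
    pos[k] += 1
    # push every particle that now collides with the one behind it
    for j in range(k + 1, len(pos)):
        if pos[j] == pos[j - 1]:
            pos[j] += 1
        else:
            break
    return pos

rng = random.Random(0)
N = 6
pos = list(range(1, N + 1))   # positions l_k + k with all l_k = 0
rates = [1.0] * N             # jump rates a_k
for _ in range(200):
    pushasep_step(pos, rates, rng)
assert all(pos[i] < pos[i + 1] for i in range(N - 1))  # exclusion preserved
```

Since every step moves at least one particle by one (and pushes can move more), the configuration stays strictly ordered while drifting to the right.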
Figure 18: PushASEP. Top panel: jump of a particle. Bottom panel: jump of a particle
which results in pushing.
We also note (without proof) that if instead of (λ^{(1)}_1(t), …, λ^{(N)}_1(t)) one considers the random vector (λ^{(1)}_1(t), λ^{(2)}_2(t), …, λ^{(N)}_N(t)), then the fixed time distribution of the particles (λ^{(N)}_N(t) − N, λ^{(N−1)}_{N−1}(t) − (N−1), …, λ^{(1)}_1(t) − 1) coincides with the fixed time distribution of TASEP started from the densely packed initial condition.
6.4
In the last section we linked Schur processes to simple particle dynamics like TASEP
using the RSK correspondence. In this section we produce another family of Markov
dynamics connecting such objects whose description is arguably more straightforward
and independent of complicated combinatorial algorithms, such as RSK.
We start by introducing a general framework. Let ρ and ρ' be two Schur-positive specializations such that H(ρ; ρ') < ∞. Define matrices p↑ and p↓ with rows and columns indexed by Young diagrams λ and μ as follows:

    p↑_{λμ}(ρ; ρ') = (1/H(ρ; ρ')) · (s_μ(ρ)/s_λ(ρ)) · s_{μ/λ}(ρ'),    (6.5)

    p↓_{λμ}(ρ; ρ') = (s_μ(ρ)/s_λ(ρ, ρ')) · s_{λ/μ}(ρ').    (6.6)
Proposition 6.7. The matrices p↑ and p↓ are stochastic, i.e. all matrix elements are non-negative, and for every λ ∈ Y we have

    Σ_{μ∈Y} p↑_{λμ}(ρ, ρ') = 1,    (6.7)

    Σ_{μ∈Y} p↓_{λμ}(ρ, ρ') = 1.    (6.8)
Note, however, that the evolution of the particles in TASEP is different from that of (λ^{(1)}_1(t), λ^{(2)}_2(t), …, λ^{(N)}_N(t)).
Proposition 6.8. The matrices p↑ and p↓ act on Schur measures as follows:

    Σ_{λ∈Y} S_{ρ1;ρ2}(λ) p↑_{λμ}(ρ2; ρ3) = S_{ρ1,ρ3;ρ2}(μ)

and

    Σ_{λ∈Y} S_{ρ1;ρ2,ρ3}(λ) p↓_{λμ}(ρ2; ρ3) = S_{ρ1;ρ2}(μ).

In particular, the Schur process of Section 6.3 can be written as

    S_{ρ^+_0,…,ρ^+_{N−1}; ρ^−}(λ^{(N)}) p↓_{λ^{(N)}λ^{(N−1)}}(ρ^+_0, …, ρ^+_{N−2}; ρ^+_{N−1}) ⋯ p↓_{λ^{(2)}λ^{(1)}}(ρ^+_0; ρ^+_1).

More generally, any Schur process can be viewed as a trajectory of a Markov chain with transitional probabilities given by matrices p↑ and p↓ (with suitable specializations) and initial distribution being a Schur measure.
Another property that we need is the following commutativity.
Proposition 6.9. The following commutation relation on matrices p↑ and p↓ holds:

    p↑(ρ1, ρ2; ρ3) p↓(ρ1; ρ2) = p↓(ρ1; ρ2) p↑(ρ1; ρ3).    (6.9)

Remark. In terms of acting on Schur measures, as in Proposition 6.8, (6.9) says that adding ρ3 and then removing ρ2 is the same as first removing ρ2 and then adding ρ3:

    S_{ρ4;ρ1,ρ2} p↑(ρ1, ρ2; ρ3) = S_{ρ3,ρ4;ρ1,ρ2},    S_{ρ3,ρ4;ρ1,ρ2} p↓(ρ1; ρ2) = S_{ρ3,ρ4;ρ1},

    S_{ρ4;ρ1,ρ2} p↓(ρ1; ρ2) = S_{ρ4;ρ1},    S_{ρ4;ρ1} p↑(ρ1; ρ3) = S_{ρ3,ρ4;ρ1}.
Proof of Proposition 6.9. We should prove that for any λ, ν ∈ Y

    Σ_{μ∈Y} p↑_{λμ}(ρ1, ρ2; ρ3) p↓_{μν}(ρ1; ρ2) = Σ_{μ∈Y} p↓_{λμ}(ρ1; ρ2) p↑_{μν}(ρ1; ρ3).

Using definitions (6.5), (6.6) this boils down to the specialized version of the skew Cauchy identity, which is Proposition 2.7, cf. [Borodin-11].
Commutativity relation (6.9) paves the way to introducing a family of new Markov chains through a construction that we now present. This construction first appeared in [Diaconis-Fill-90] and was heavily used recently for probabilistic models related to Young diagrams, see [Borodin-Ferrari-08b], [Borodin-Gorin-08], [Borodin-Gorin-Rains-10], [Borodin-Duits-11], [Borodin-11], [Borodin-Olshanski-12], [Betea-11], [Borodin-Corwin-11].
Take two Schur-positive specializations ρ1, ρ2 and a state space Y^{(2)} of pairs of Young diagrams (λ^{(2)}, λ^{(1)}) such that p↓_{λ^{(2)}λ^{(1)}}(ρ1; ρ2) > 0. Define a Markov chain on Y^{(2)} with the following transition probabilities:

    P((λ^{(2)}, λ^{(1)}) → (μ^{(2)}, μ^{(1)})) = p↑_{λ^{(1)}μ^{(1)}}(ρ1; ρ') · ( p↑_{λ^{(2)}μ^{(2)}}(ρ1, ρ2; ρ') p↓_{μ^{(2)}μ^{(1)}}(ρ1; ρ2) ) / ( Σ_{κ∈Y} p↑_{λ^{(2)}κ}(ρ1, ρ2; ρ') p↓_{κμ^{(1)}}(ρ1; ρ2) ).    (6.10)

In words, (6.10) means that the first Young diagram λ^{(1)} evolves according to the transition probabilities p↑_{λ^{(1)}μ^{(1)}}(ρ1; ρ'), and given λ^{(2)} and μ^{(1)} the distribution of μ^{(2)} is the distribution of the middle point in the two-step Markov chain with transitions p↑(ρ1, ρ2; ρ') and p↓(ρ1; ρ2).
Proposition 6.10. The above transitional probabilities on Y^{(2)} map the Schur process with distribution

    S_{ρ1,ρ2; ρ}(λ^{(2)}) p↓_{λ^{(2)}λ^{(1)}}(ρ1; ρ2)

to the Schur process with distribution

    S_{ρ1,ρ2; ρ,ρ'}(λ^{(2)}) p↓_{λ^{(2)}λ^{(1)}}(ρ1; ρ2).

Informally, the specialization ρ' was added to ρ and nothing else changed.

Proof. Direct computation based on Proposition 6.9, cf. [Borodin-Ferrari-08b, Section 2.2].
More generally, we can iterate the above construction and produce a Markov chain on sequences of Young diagrams λ^{(N)}, …, λ^{(1)} as follows. The first Young diagram λ^{(1)} evolves according to transition probabilities p↑_{λ^{(1)}μ^{(1)}}(ρ1; ρ'). Then, for any k ≥ 2, as soon as μ^{(k−1)} is defined, and given λ^{(k)}, the distribution of μ^{(k)} is the distribution of the middle point in the two-step Markov chain with transitions p↑(ρ1, …, ρk; ρ') and p↓(ρ1, …, ρ_{k−1}; ρk).

Similarly to Proposition 6.10 one proves that one step of the thus constructed Markov chain adds ρ' to the specialization ρ of the Schur process with distribution

    S_{ρ1,…,ρN; ρ}(λ^{(N)}) p↓_{λ^{(N)}λ^{(N−1)}}(ρ1, …, ρ_{N−1}; ρN) ⋯ p↓_{λ^{(2)}λ^{(1)}}(ρ1; ρ2).    (6.11)
The above constructions might look quite messy, so let us consider several examples where they lead to relatively simple Markov chains.

Take each ρk to be the Schur-positive specialization with single non-zero parameter α_1 = 1, and let ρ' be the Schur-positive specialization with single non-zero parameter β_1 = b. Consider a discrete time homogeneous Markov chain (λ^{(1)}(t), …, λ^{(N)}(t)) with the transitional probabilities defined above, started from the Schur process as in (6.11) with ρ being the trivial specialization (with all zero parameters). This implies that (λ^{(1)}(0), …, λ^{(N)}(0)) = (∅, …, ∅). Note that at any time t the Young diagram λ^{(k)}(t) has at most k non-empty rows and their coordinates satisfy the following interlacing conditions:

    λ^{(k)}_1 ≥ λ^{(k−1)}_1 ≥ λ^{(k)}_2 ≥ ⋯ ≥ λ^{(k−1)}_{k−1} ≥ λ^{(k)}_k.    (6.12)
In particular, λ^{(1)} has a single row, i.e. it is a number. The definitions imply that the transitional probabilities for λ^{(1)} are

    p↑_{λ^{(1)}μ^{(1)}}(ρ1; ρ') =
        b/(1+b), if μ^{(1)} = λ^{(1)} + 1,
        1/(1+b), if μ^{(1)} = λ^{(1)},
        0, otherwise.

In other words, the evolution of λ^{(1)} is a simple Bernoulli random walk with probability of move p = b/(1 + b). More generally, given that λ^{(k)}(t) = λ and λ^{(k−1)}(t + 1) = ν, the distribution of λ^{(k)}(t + 1) is given by

    Prob(λ^{(k)}(t+1) = μ | λ^{(k)}(t) = λ, λ^{(k−1)}(t+1) = ν) = ( s_{μ/λ}(0; b; 0) s_{μ/ν}(1; 0; 0) ) / ( Σ_κ s_{κ/λ}(0; b; 0) s_{κ/ν}(1; 0; 0) ) = const · b^{|μ|−|λ|},

given that 0 ≤ μ_i − λ_i ≤ 1, and μ and ν satisfy the interlacing conditions (6.12). In other words, the length of each row of λ^{(k)} independently increases by 1 with probability b/(1+b) unless this contradicts the interlacing conditions (in which case that length either stays the same or increases by 1 with probability 1).
In order to visualize the above transitional probabilities, consider N(N+1)/2 interlacing particles with integer coordinates x^j_i = λ^{(j)}_{j+1−i} − N + i, j = 1, …, N, i = 1, …, j. In other words, for each j we reorder the coordinates λ^{(j)}_i and make them strictly increasing. The coordinates of particles thus satisfy the inequalities

    x^j_{i−1} < x^{j−1}_{i−1} ≤ x^j_i.    (6.13)

Such arrays are often called Gelfand–Tsetlin patterns (or schemes) of size N, and under this name they are widely used in representation-theoretic contexts. We typically use the notation x^j_i both for the location of a particle and the particle itself. It is convenient to put particles x^j_i on adjacent horizontal lines, as shown in Figure 19.
Figure 19: The particles x^j_i, 1 ≤ i ≤ j ≤ N, placed on adjacent horizontal lines. Here N = 5.
The dynamics admits a sequential description: each particle is equipped with an independent coin which comes heads with probability b/(1+b), and at each time step we first determine x^1_1(t+1), then x^2_i(t+1), i = 1, 2, then x^3_i(t+1), i = 1, 2, 3, etc. To start with, we set x^1_1(t+1) = x^1_1(t) + 1 if the coin of the particle x^1_1 came heads; otherwise, x^1_1(t+1) = x^1_1(t). Subsequently, once the values of x^j_i(t+1), j = 1, …, k−1, i = 1, …, j, are determined, we define x^k_i(t+1) for each i = 1, …, k independently by the following procedure, in which rule 3 is performed only if the conditions for the previous two are not satisfied.

1. If i > 1 and x^k_i(t) = x^{k−1}_{i−1}(t+1) − 1, then we say that particle x^k_i is pushed by x^{k−1}_{i−1} and set x^k_i(t+1) = x^k_i(t) + 1.

2. If x^{k−1}_i(t+1) = x^k_i(t) + 1, then we say that particle x^k_i is blocked by x^{k−1}_i and set x^k_i(t+1) = x^k_i(t).

3. If the coin of the particle x^k_i came heads, then we set x^k_i(t+1) = x^k_i(t) + 1; otherwise, we set x^k_i(t+1) = x^k_i(t).
Informally, one can think of each particle as having a weight depending on its vertical coordinate, with higher particles being lighter. The dynamics is defined in such a way that heavier particles push lighter ones, and lighter ones are blocked by heavier ones, in order to preserve the interlacing conditions.
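The three rules are easy to check in simulation. Below is a sketch (indexing conventions are ours; we start from a densely packed configuration x^j_i = i − j − 1, a choice compatible with (6.13)):

```python
import random

def push_block_step(x, p, rng):
    """One step of the discrete-time push-block dynamics on an interlacing
    array; x[j][i] (0-based, 0 <= i <= j) stands for the particle x^{j+1}_{i+1}.
    Each coin comes heads with probability p = b/(1+b)."""
    new = [row[:] for row in x]
    for j in range(len(x)):                 # update levels from bottom to top
        for i in range(j + 1):
            if i > 0 and x[j][i] == new[j - 1][i - 1] - 1:
                new[j][i] = x[j][i] + 1     # rule 1: pushed by x^{j-1}_{i-1}
            elif i < j and new[j - 1][i] == x[j][i] + 1:
                new[j][i] = x[j][i]         # rule 2: blocked by x^{j-1}_i
            elif rng.random() < p:
                new[j][i] = x[j][i] + 1     # rule 3: coin came heads
    return new

rng = random.Random(7)
N, b = 5, 0.8
x = [[i - j - 1 for i in range(j + 1)] for j in range(N)]  # packed initial state
for _ in range(200):
    x = push_block_step(x, b / (1 + b), rng)
# interlacing (6.13) is preserved by the dynamics
for j in range(1, N):
    for i in range(1, j + 1):
        assert x[j][i - 1] < x[j - 1][i - 1] <= x[j][i]
```

The assertions hold for any coin sequence: rule 1 restores the weak inequality when a lower particle jumps, and rule 2 protects the strict one.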
One interesting fact about this dynamics is that the joint distribution of x^j_i(t + j) is the projection of the uniform distribution on the domino tilings of the so-called Aztec diamond, see [Nordenstam-10], [Borodin-Ferrari-08b]. A computer simulation showing the connection with tilings can be found on Ferrari's website [Ferrari].
Sending b → 0 and rescaling the time we get a continuous version of the above dynamics. Formally, the process Y(t) = {Y^j_i(t)}, t ≥ 0, j = 1, …, N, i = 1, …, j, is a continuous-time Markov chain defined as follows. Each of the N(N+1)/2 particles has an independent exponential clock of rate 1 (in other words, the times when clocks of particles ring are independent standard Poisson processes). If the clock of the particle Y^j_i rings at time t, we check whether Y^{j−1}_i(t) = Y^j_i(t) + 1. If so, then nothing happens; otherwise we let Y^j_i(t) = Y^j_i(t) + 1. If Y^j_i(t) = Y^{j+1}_{i+1}(t) = ⋯ = Y^{j+k}_{i+k}(t), then we also set Y^{j+1}_{i+1}(t) = Y^{j+1}_{i+1}(t) + 1, …, Y^{j+k}_{i+k}(t) = Y^{j+k}_{i+k}(t) + 1.
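A minimal sketch of this continuous-time dynamics (indexing conventions are ours; since all clocks have rate 1, the particle whose clock rings next is uniform among the N(N+1)/2 particles):

```python
import random

def y_step(x, rng):
    """One clock ring of the dynamics Y(t); x[j][i] (0-based) stands for
    the particle Y^{j+1}_{i+1}."""
    N = len(x)
    j = rng.randrange(N)
    i = rng.randrange(j + 1)
    # blocked by Y^{j-1}_i sitting immediately to the right on the level above?
    if i < j and x[j - 1][i] == x[j][i] + 1:
        return x                            # nothing happens
    x[j][i] += 1
    # push the chain Y^{j+1}_{i+1}, Y^{j+2}_{i+2}, ... that sat at the same spot
    jj, ii = j + 1, i + 1
    while jj < N and x[jj][ii] == x[j][i] - 1:
        x[jj][ii] += 1
        jj += 1
        ii += 1
    return x

rng = random.Random(11)
N = 5
x = [[i - j - 1 for i in range(j + 1)] for j in range(N)]  # packed initial state
for _ in range(500):
    y_step(x, rng)
for j in range(1, N):
    for i in range(1, j + 1):
        assert x[j][i - 1] < x[j - 1][i - 1] <= x[j][i]    # interlacing survives
```

As in the discrete case, blocking protects the strict inequality and pushing restores the weak one, so the interlacing assertions hold for every realization of the clocks.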
The Markov chain Y was introduced in [Borodin-Ferrari-08b] as an example of a 2d growth model relating classical interacting particle systems and random surfaces arising from dimers. A computer simulation of Y(t) can be found at Ferrari's website [Ferrari]. The restriction of Y to the N leftmost particles Y^1_1, …, Y^N_1 is the familiar totally asymmetric simple exclusion process (TASEP), the restriction of Y to the N rightmost particles Y^1_1, …, Y^N_N is long range TASEP (or PushASEP), while the particle configuration Y(t) at a fixed time t can be identified with a lozenge tiling of a sector in the plane and with a stepped surface (see the introduction in [Borodin-Ferrari-08b] for the details).
One proves that the fixed time distribution of Y(t) is the Schur process of Section 6.3 with a_i ≡ 1 (under the above identification of particle configurations and sequences of Young diagrams). Moreover, the restriction of Y(t) to the N leftmost particles Y^1_1, Y^2_1, …, Y^N_1 is the same as the restriction of the dynamics of Section 6.3, and a similar statement is true for the restrictions to Y^1_1, Y^2_2, …, Y^N_N and to Y^N_1, Y^N_2, …, Y^N_N. Nevertheless, the dynamics Y(t) is different from that of Section 6.3.
By appropriate limit transition we can also make the state space of our dynamics
continuous.
In this section we generalize the constructions of Sections 5, 6 based on the Schur functions
to their (q, t) deformation known as Macdonald symmetric functions and discuss new
applications.
7.1
In what follows we assume that q and t are real numbers satisfying 0 < q < 1, 0 < t < 1. One way to define Macdonald symmetric functions P_λ(·; q, t) indexed by Young diagrams is through the Gram–Schmidt orthogonalization procedure. Define the following scalar product on symmetric functions:

    ⟨p_λ, p_μ⟩ = δ_{λμ} z_λ Π_{i=1}^{ℓ(λ)} (1 − q^{λ_i})/(1 − t^{λ_i}).

Figure 20: Sample from the measure q^{volume} on skew plane partitions with one particular choice of support.
The Macdonald difference operators are given by

    D^r_N = Σ_{I⊂{1,…,N}, |I|=r} A_I(x; t) Π_{i∈I} T_{q,x_i},    r = 1, …, N,

where

    A_I(x; t) = t^{r(r−1)/2} Π_{i∈I, j∉I} (t x_i − x_j)/(x_i − x_j)

and (T_{q,x_i} f)(x_1, …, x_N) = f(x_1, …, q x_i, …, x_N). In particular,

    D^1_N = Σ_{i=1}^N Π_{j≠i} (t x_i − x_j)/(x_i − x_j) · T_{q,x_i}.

The Macdonald analogue of the Cauchy identity involves the kernel

    Π_{i,j} (t x_i y_j; q)_∞ / (x_i y_j; q)_∞,

where

    (a; q)_∞ = Π_{i=0}^∞ (1 − a q^i),
and

    Q_μ(x, y) = Σ_λ Q_{μ/λ}(x) Q_λ(y).
There is also a (q, t) analogue of the skew Cauchy identity, see [Macdonald-95, Chapter
VI, Section 7].
Another ingredient of our constructions related to Schur functions was the classification
of positive specializations of Theorem 2.11. For Macdonald polynomials the following
conjecture is due to Kerov:
Conjecture 7.6. The Macdonald-positive specializations are parameterized by pairs of sequences of non-negative reals α = (α_1 ≥ α_2 ≥ ⋯ ≥ 0) and β = (β_1 ≥ β_2 ≥ ⋯ ≥ 0) satisfying Σ_i (α_i + β_i) < ∞ and an additional parameter γ ≥ 0. The specialization with parameters (α; β; γ) can be described by its values on power sums:

    p_1 ↦ p_1(α; β; γ) = Σ_i α_i + ((1−q)/(1−t)) γ + ((1−q)/(1−t)) Σ_i β_i,

    p_k ↦ p_k(α; β; γ) = Σ_i α_i^k + (−1)^{k−1} ((1−q^k)/(1−t^k)) Σ_i β_i^k,    k ≥ 2,

or, equivalently, through the generating function

    Σ_{k=0}^∞ g_k(α; β; γ) z^k = e^{γz} Π_{i≥1} (1 + β_i z) (t α_i z; q)_∞/(α_i z; q)_∞.
The Macdonald measure assigns to a Young diagram λ the probability

    P_λ(ρ1) Q_λ(ρ2) / H_{q,t}(ρ1; ρ2),

where

    H_{q,t}(ρ1; ρ2) = exp( Σ_{k=1}^∞ (1/k) · ((1−t^k)/(1−q^k)) · p_k(ρ1) p_k(ρ2) ).

Remark 1. The above definition makes sense only if ρ1, ρ2 are such that

    Σ_λ P_λ(ρ1) Q_λ(ρ2) < ∞,    (7.1)

in which case this sum equals H_{q,t}(ρ1; ρ2), as follows from Proposition 7.4. The convergence of (7.1) is guaranteed, for instance, if |p_k(ρ1)| < C r^k and |p_k(ρ2)| < C r^k with some constants C > 0 and 0 < r < 1.

Remark 2. The definition of the Macdonald measure was first given almost ten years ago by Forrester and Rains, see [Forrester-Rains-05]. In addition to the Schur (q = t) case, Macdonald measures (and processes) were also studied by Vuletić [Vuletic-09] for the Hall–Littlewood symmetric functions, which correspond to q = 0. Recently, new applications of these measures and new tools to work with them were developed starting from [Borodin-Corwin-11].
Definition 6.1 of the Schur process also has a straightforward (q, t)-analogue involving Macdonald polynomials. In our applications we will use only one particular case of this definition, which is a (q, t)-generalization of the measures of Section 6.3:

    P(λ^{(1)}, …, λ^{(N)}) = P_{λ^{(1)}}(ρ^+_1) P_{λ^{(2)}/λ^{(1)}}(ρ^+_2) ⋯ P_{λ^{(N)}/λ^{(N−1)}}(ρ^+_N) Q_{λ^{(N)}}(ρ^−) / ( H_{q,t}(ρ^+_1; ρ^−) ⋯ H_{q,t}(ρ^+_N; ρ^−) ).

7.2
The construction of Section 6.4 of dynamics preserving the class of Schur processes can
be literally repeated for Macdonald processes by replacing all instances of skew Schur
polynomials by skew Macdonald polynomials, see [Borodin-Corwin-11, Section 2.3] for
the details.
Let us focus on one example. We set t = 0 till the end of this section and, thus, Macdonald functions are replaced with their degeneration known as q-Whittaker functions.
We continue the analogy with Section 6.4 and study the example which led to the process Y(τ) (we replaced the time parameter by τ, since t now has a different meaning). Namely, we set ρ^+_i to be the specialization with single non-zero parameter α_1 = 1 in Definition 7.8 and ρ^− to be the specialization with single non-zero β_1 = b. At each step the construction parallels that of Section 6.4; sending b → 0 and rescaling time, one arrives at a continuous-time Markov chain Z(τ) = {Z^j_i(τ)} in which the clock of particle Z^j_i rings at rate

    (1 − q^{Z^{j−1}_i(τ) − Z^j_i(τ) − 1}) (1 − q^{Z^j_i(τ) − Z^j_{i−1}(τ)}) / (1 − q^{Z^j_i(τ) − Z^{j−1}_{i−1}(τ) + 1});    (7.2)
in other words, this rate depends on the positions of three neighbors, as shown in Figure 21. If one of the neighbors does not exist, then the corresponding factor disappears from (7.2). When the clock of particle Z^j_i rings at time τ, we let Z^j_i(τ) = Z^j_i(τ) + 1. If Z^j_i(τ) = Z^{j+1}_{i+1}(τ) = ⋯ = Z^{j+k}_{i+k}(τ), then we also set Z^{j+1}_{i+1}(τ) = Z^{j+1}_{i+1}(τ) + 1, …, Z^{j+k}_{i+k}(τ) = Z^{j+k}_{i+k}(τ) + 1. Equivalently, one can think that when (7.2) becomes infinite because of the denominator vanishing, the corresponding particle immediately moves by one to the right. Note that if Z^{j−1}_i = Z^j_i + 1, then the rate (7.2) vanishes, therefore the blocking which was present in the definition of Y(τ) is also implied by the definition of Z(τ).
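The structure of the rate (7.2) is easy to probe numerically. A minimal sketch (function and argument names are ours; the factor layout follows the form of (7.2) printed above):

```python
def jump_rate(q, z, z_up, z_left, z_upleft):
    """Jump rate of the particle Z^j_i at position z, given the neighbours
    Z^{j-1}_i = z_up, Z^j_{i-1} = z_left, Z^{j-1}_{i-1} = z_upleft.
    Pass None for a neighbour that does not exist: its factor disappears.
    (When z_upleft == z + 1 the denominator vanishes and the rate is
    infinite: the particle moves immediately.)"""
    rate = 1.0
    if z_up is not None:
        rate *= 1 - q ** (z_up - z - 1)       # blocking factor
    if z_left is not None:
        rate *= 1 - q ** (z - z_left)
    if z_upleft is not None:
        rate /= 1 - q ** (z - z_upleft + 1)   # pushing denominator
    return rate

q = 0.5
# blocking: Z^{j-1}_i = Z^j_i + 1 makes the rate vanish
assert jump_rate(q, 0, 1, None, None) == 0.0
# far-away neighbours: the rate tends to 1
assert abs(jump_rate(q, 0, 50, -50, -50) - 1.0) < 1e-9
# q-TASEP degeneration: only the neighbour ahead, rate 1 - q^{gap-1}
assert jump_rate(q, 0, 3, None, None) == 1 - q ** 2
```

The three assertions reproduce the three properties stated in the text: vanishing under blocking, rate close to 1 when neighbors are far away, and the q-TASEP rate for the leftmost particles.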
Figure 21: Left panel: The distances to three neighbors which the rate of the particle depends on. Right panel: jump rate for q-TASEP.
The restriction of Z(τ) to the leftmost particles Z^1_1(τ), …, Z^N_1(τ) is a Markovian dynamics known as q-TASEP (see [Borodin-Corwin-11]). Namely, the jump rate of particle Z^i_1 at time τ is 1 − q^{Z^{i−1}_1(τ) − Z^i_1(τ) − 1}, as shown in Figure 21. When q → 0 we recover the familiar TASEP. There is one distinction from Y(τ): the restriction of Z(τ) to the rightmost particles is no longer a simple Markov process.
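q-TASEP itself is straightforward to simulate with exponential clocks; here is a sketch (helper names and parameters are illustrative):

```python
import random

def qtasep(q, N, T, rng):
    """Simulate q-TASEP up to time T: particle i jumps right with rate
    1 - q^{gap-1}, where gap is the distance to particle i-1 ahead of it;
    the first particle has rate 1. Step initial condition z_i = -i."""
    z = [-i for i in range(1, N + 1)]
    t = 0.0
    while True:
        rates = [1.0 if i == 0 else 1 - q ** (z[i - 1] - z[i] - 1)
                 for i in range(N)]
        t += rng.expovariate(sum(rates))    # time until the next clock rings
        if t > T:
            return z
        i = rng.choices(range(N), weights=rates)[0]
        z[i] += 1

rng = random.Random(3)
z = qtasep(q=0.4, N=6, T=20.0, rng=rng)
assert all(z[i - 1] - z[i] >= 1 for i in range(1, 6))  # exclusion preserved
```

A particle sitting directly behind its neighbor has gap 1 and hence rate 1 − q^0 = 0, so it can never be selected: the exclusion rule is automatic, and setting q = 0 recovers TASEP with unit rates.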
There are various interesting limit transitions as τ → ∞ and q → 1. Let us concentrate on one where a phenomenon known as crystallization appears. Let

    q = exp(−ε),    τ = t ε^{−2},

and send ε to 0. In this limit, the particles Z^j_i(τ) approximate a perfect lattice, namely,

    Z^j_i(τ) ≈ t ε^{−2} − (2i − j − 1) (ln ε)/ε.

The fluctuations around the points of this lattice are of order ε^{−1}. More precisely, the following theorem holds.

Theorem 7.9. Let

    q = exp(−ε),    τ = t ε^{−2},    Ẑ^j_i(t, ε) = Z^j_i(τ) − t ε^{−2} + (2i − j − 1) (ln ε)/ε.

Then, as ε → 0, the random variables ε Ẑ^j_i(t, ε) converge in distribution, and the limits are described in terms of the partition function

    ∫_{0<s_1<⋯<s_{N−1}<t} exp( B_1(0, s_1) + B_2(s_1, s_2) + ⋯ + B_N(s_{N−1}, t) ) ds_1 ⋯ ds_{N−1},    (7.3)

where B_1, …, B_N are independent standard Brownian motions and

    B_i(a, b) = B_i(b) − B_i(a).
The integral (7.3) is known as the partition function of the O'Connell–Yor semi-discrete polymer. This has the following explanation. Identify each sequence 0 < s_1 < s_2 < ⋯ < s_{N−1} < t with a piecewise-constant monotonous function with unit increments at the s_i or, equivalently, with a staircase-like path (directed polymer) joining (0, 1) and (t, N), as shown in Figure 22. Further, we view B_i(a, b) as an integral of the 1d white noise along the interval (a, b). Then the sum

    B_1(0, s_1) + B_2(s_1, s_2) + ⋯ + B_N(s_{N−1}, t)

turns into the integral of space-time white noise along the path defined by 0 < s_1 < s_2 < ⋯ < s_{N−1} < t. Integrating over all possible paths we arrive at the partition function, see [Borodin-Corwin-11], [Borodin-Corwin-Ferrari-12] and references therein for more details.
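The partition function (7.3) can be approximated on a grid by dynamic programming over staircase paths. Below is a minimal discretization sketch (the scheme and helper names are ours, not from the text); with zero noise the integral reduces to the volume t^{N−1}/(N−1)! of the simplex, which gives a simple sanity check:

```python
import math

def oy_partition_function(increments, dt):
    """Approximate (7.3): dt^{N-1} * sum over grid times 0<m_1<...<m_{N-1}<M
    of exp(energy), by dynamic programming. increments[i][m] plays the role
    of the white-noise integral of B_{i+1} over the m-th time slice."""
    N, M = len(increments), len(increments[0])
    # F[i][m]: weighted sum over paths sitting at level i just before slice m
    F = [[0.0] * (M + 1) for _ in range(N)]
    F[0][0] = 1.0
    for m in range(M):
        for i in range(N):
            up = F[i - 1][m] * dt if i > 0 else 0.0  # jump to level i here
            F[i][m + 1] = (F[i][m] + up) * math.exp(increments[i][m])
    return F[N - 1][M]

N, M, t = 3, 400, 1.0
dt = t / M
zero_noise = [[0.0] * M for _ in range(N)]
Z = oy_partition_function(zero_noise, dt)
assert abs(Z - t ** 2 / 2) < 0.01   # volume of the simplex 0 < s1 < s2 < 1
```

Feeding independent Gaussian increments `random.gauss(0, math.sqrt(dt))` instead of zeros produces one Monte Carlo sample of the random partition function (7.3).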
7.3
In the previous section we explained that the formalism of Macdonald processes leads to quite interesting probability models, but we do not know yet how to analyze them. The methods we used for the Schur processes were based on determinantal point processes.
Figure 22: Staircase-like path corresponding to the sequence 0 < s_1 < s_2 < ⋯ < s_{N−1} < t. Here N = 5.
Similar determinantal structure is not known to exist for the Macdonald processes and,
probably, there is no such structure at all. However, there is another approach based on
the Macdonald difference operators.
We start from the (q, t)-Cauchy identity of Proposition 7.4 in the form

    Σ_λ P_λ(x_1, …, x_N; q, t) Q_λ(Y; q, t) = Π_{i=1}^N Π(x_i; Y),    (7.4)

where

    Π(x; Y) = Π_j (t x y_j; q)_∞ / (x y_j; q)_∞ = exp( Σ_{k=1}^∞ (1/k) · ((1−t^k)/(1−q^k)) · p_k(Y) x^k ).
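The two expressions for Π(x; Y) can be compared numerically by truncating the infinite product and the exponential series; a sketch (truncation levels are arbitrary):

```python
import math

def qpoch(a, q, terms=200):
    """(a; q)_infinity, truncated to finitely many factors."""
    out = 1.0
    for i in range(terms):
        out *= 1 - a * q ** i
    return out

def Pi_product(x, ys, q, t):
    """Product form: prod_j (t*x*y_j; q)_inf / (x*y_j; q)_inf."""
    out = 1.0
    for y in ys:
        out *= qpoch(t * x * y, q) / qpoch(x * y, q)
    return out

def Pi_exp(x, ys, q, t, terms=200):
    """Exponential form: exp(sum_k (1-t^k)/(1-q^k) p_k(Y) x^k / k)."""
    s = sum((1 - t ** k) / (1 - q ** k) * sum(y ** k for y in ys) * x ** k / k
            for k in range(1, terms))
    return math.exp(s)

x, ys, q, t = 0.3, [0.5, 0.2], 0.6, 0.4
assert abs(Pi_product(x, ys, q, t) - Pi_exp(x, ys, q, t)) < 1e-9
```

Taking the logarithm of the product and expanding log(1 − a q^i) into a power series reproduces the exponential form term by term, which is what the numeric check confirms.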
Let us apply the Macdonald difference operator D^r_N to both sides of (7.4) with respect to the variables x_1, …, x_N. Proposition 7.3 yields

    Σ_λ e_r(q^{λ_1} t^{N−1}, q^{λ_2} t^{N−2}, …, q^{λ_N}) P_λ(x_1, …, x_N; q, t) Q_λ(Y; q, t) = D^r_N ( Π_{i=1}^N Π(x_i; Y) ).

Dividing by (7.4) and substituting x_i = a_i, we arrive at the following statement.

Proposition 7.11. For the Macdonald measure with specializations ρ_1 = (a_1, …, a_N) and ρ_2 we have

    E e_r(q^{λ_1} t^{N−1}, q^{λ_2} t^{N−2}, …, q^{λ_N}) = ( D^r_N Π_{i=1}^N Π(x_i; ρ_2) ) / ( Π_{i=1}^N Π(x_i; ρ_2) ) |_{x_i = a_i, 1 ≤ i ≤ N},    (7.5)

where

    Π(x; ρ) = exp( Σ_{k=1}^∞ (1/k) · ((1−t^k)/(1−q^k)) · p_k(ρ) x^k ).
Remark. Clearly, we can replace D^r_N with any product of such operators, similarly replacing e_r(q^{λ_1} t^{N−1}, …, q^{λ_N}) with the corresponding product, and the statement will still be valid.

The remarkable property of Proposition 7.11 is that while the Macdonald polynomials themselves are fairly mysterious objects with a complicated definition, the right-hand side of (7.5) is explicit. Therefore, we have formulas for averages of certain observables of Macdonald measures. These formulas can be compactly written via contour integrals. Let us present one particular case of such formulas.
Theorem 7.12. Assume that, in the notations of Proposition 7.11, t = 0, all parameters a_i are equal to 1, and ρ_2 is the Macdonald-positive specialization with single non-zero parameter γ = τ. In other words, we deal with the probability measure

    e^{−Nτ} P_λ(1, …, 1; q, 0) Q_λ((0; 0; τ); q, 0).

Then

    E q^{k λ_N} = ( (−1)^k q^{k(k−1)/2} / (2πi)^k ) ∮ ⋯ ∮ Π_{A<B} (z_A − z_B)/(z_A − q z_B) Π_{j=1}^k ( e^{(q−1)τ z_j} / (1 − z_j)^N ) (dz_j / z_j),

where the integration is over nested contours such that the z_j contour contains 1 and also q z_{j+1}, …, q z_k, and no other singularities of the integrand, see Figure 23.
Remark 1. The measure of Theorem 7.12, via the identification of Young diagrams with N rows and N-point particle configurations, coincides with the distribution of the vector (Z^N_1(τ), …, Z^N_N(τ)) of the stochastic dynamics of Section 7.2.

Remark 2. Theorem 7.12 admits various generalizations: t can be non-zero, both specializations can be arbitrary, we can compute expectations related to the higher order Macdonald operators, and we can also apply it to the joint distributions of various Young diagrams of the Macdonald process. See [Borodin-Corwin-11] and [Borodin-Corwin-Gorin-Shakirov-13] for details.
Figure 23: Nested contours of integration. Here k = 3, z_1 is integrated over the largest contour and z_3 is integrated over the smallest one.
The moments of q^{λ_N} can be combined into a q-Laplace transform, for which we can also get a neat expression in the form of a Fredholm determinant.
The kernel of this Fredholm determinant (see [Borodin-Corwin-11] for the precise statement of Theorem 7.13) has the form

    K_ζ(w, w') = (1/(2πi)) ∫_{−i∞+1/2}^{i∞+1/2} Γ(−s) Γ(1+s) (−ζ)^s g_{w,w'}(q^s) ds,

where

    g_{w,w'}(q^s) = (1/(q^s w − w')) · ( (q^s w; q)_∞ / (w; q)_∞ )^N · exp( τ w (q^s − 1) ).
One could try to compute the Laplace transform of a random variable Z through its moments via

    E exp(uZ) = E Σ_{k=0}^∞ u^k Z^k / k! = Σ_{k=0}^∞ (u^k / k!) E Z^k.    (7.6)
However, the moments of the partition function of the O'Connell–Yor directed polymer grow rapidly (as e^{k²}, cf. [Borodin-Corwin-12]) and the series in the right side of (7.6) does not converge for any u ≠ 0. This is caused by the intermittency which we discussed in Section 1. Similar things happen when one considers the fully continuous polymer which we briefly mentioned in Section 1, i.e. when one integrates 2d white noise over the paths of Brownian bridges. Nevertheless, physicists tried to overcome these difficulties and find the Laplace transform (and the distribution itself after that) using the moments (the latter could be computed in this model using the so-called Bethe ansatz for the delta-interaction quantum Bose gas). The non-rigorous argument used here is known as the replica trick and it has a long history (it was first applied to directed polymers in random media in [Kardar-87]); this is some sort of an analytic continuation argument for a function with specified values at integer points. However, the first answers to this problem obtained via the replica trick were incorrect. (They were later corrected though, cf. [Dotsenko-10], [Calabrese-Doussal-Rosso-10].) A correct mathematical computation of the Laplace transform for the continuous directed polymer appeared concurrently in the work of Amir–Corwin–Quastel [Amir-Corwin-Quastel-11] (which is fully rigorous) and, independently, of Sasamoto–Spohn [Sasamoto-Spohn-10]; it was based on previous work of Tracy–Widom on ASEP, see [Tracy-Widom-11] and references therein.
Therefore, the manipulations with contours which deduce Theorem 7.13 from Theorem 7.12 can be viewed as a fully rigorous q-incarnation (put otherwise, as a mathematical justification) of the non-rigorous replica trick. See [Borodin-Corwin-Sasamoto-12] for further developments in this direction.

The asymptotic analysis of the operator K in the Fredholm determinant of Theorem 7.13, based on the steepest descent ideas that we discussed in Section 5, leads to the Tracy–Widom distribution F_2(s) (which is also given by a Fredholm determinant, as we remarked after Theorem 5.6). Along these lines one proves the KPZ universality for the partition function of the O'Connell–Yor directed polymer, and also of certain integrable discrete polymers (Theorem 1.5), see [Borodin-Corwin-11], [Borodin-Corwin-Ferrari-12], [Borodin-Corwin-Remenik-12], [Borodin-Corwin-Sasamoto-12].
References
[Akemann-Baik-Francesco-11] G. Akemann, J. Baik, P. Di Francesco (editors), The Oxford Handbook of Random Matrix Theory, Oxford University Press, 2011.
[Alberts-Khanin-Quastel-12a] T. Alberts, K. Khanin, J. Quastel, Intermediate Disorder Regime for 1+1 Dimensional Directed Polymers, Annals of Probability, 42, no. 3 (2014), 1212–1256. arXiv:1202.4398.

[Alberts-Khanin-Quastel-12b] T. Alberts, K. Khanin, J. Quastel, The Continuum Directed Random Polymer, Journal of Statistical Physics, 154, no. 1-2 (2014), 305–326. arXiv:1202.4403.

[Amir-Corwin-Quastel-11] G. Amir, I. Corwin, J. Quastel, Probability distribution of the free energy of the continuum directed random polymer in 1 + 1 dimensions, Communications on Pure and Applied Mathematics, 64 (2011), 466–537. arXiv:1003.0443.

[Anderson-Guionnet-Zeitouni-10] G. W. Anderson, A. Guionnet, O. Zeitouni, An Introduction to Random Matrices, Cambridge University Press, 2010.

[Baik-Deift-Johansson-99] J. Baik, P. Deift, K. Johansson, On the distribution of the length of the longest increasing subsequence of random permutations, Journal of the American Mathematical Society, 12, no. 4 (1999), 1119–1178. arXiv:math/9810105.

[Barabasi-Stanley-95] A.-L. Barabási, H. E. Stanley, Fractal Concepts in Surface Growth, Cambridge University Press, 1995.