
Chapter 1

The Discrete Time Case


In this chapter we introduce the tools of stochastic analysis in the simple framework of discrete time random walks. Our presentation relies on the use of finite difference gradient and divergence operators which are defined along with single and multiple stochastic integrals. The main applications of stochastic analysis to be considered in the following chapters, including functional inequalities and mathematical finance, are discussed in this elementary setting. Some technical difficulties involving measurability and integrability conditions, that are typical of the continuous-time case, are absent in the discrete time case.
1.1 Normal Martingales
Consider a sequence $(Y_k)_{k\in\mathbb{N}}$ of (not necessarily independent) random variables on a probability space $(\Omega,\mathcal{F},P)$. Let $(\mathcal{F}_n)_{n\ge -1}$ denote the filtration generated by $(Y_n)_{n\in\mathbb{N}}$, i.e.
$$\mathcal{F}_{-1}=\{\emptyset,\Omega\}$$
and
$$\mathcal{F}_n=\sigma(Y_0,\ldots,Y_n),\qquad n\ge 0.$$
Recall that a random variable $F$ is said to be $\mathcal{F}_n$-measurable if it can be written as a function
$$F=f_n(Y_0,\ldots,Y_n)$$
of $Y_0,\ldots,Y_n$, where $f_n:\mathbb{R}^{n+1}\to\mathbb{R}$.

Assumption 1.1.1. We make the following assumptions on the sequence $(Y_n)_{n\in\mathbb{N}}$:

a) it is conditionally centered:
$$E[Y_n\mid\mathcal{F}_{n-1}]=0,\qquad n\ge 0,\tag{1.1.1}$$

b) its conditional quadratic variation satisfies:
$$E[Y_n^2\mid\mathcal{F}_{n-1}]=1,\qquad n\ge 0.$$

Condition (1.1.1) implies that the process $(Y_0+\cdots+Y_n)_{n\ge 0}$ is an $\mathcal{F}_n$-martingale, cf. Section 9.4 in the Appendix. More precisely, the sequence $(Y_n)_{n\in\mathbb{N}}$ and the process $(Y_0+\cdots+Y_n)_{n\ge 0}$ can be viewed respectively as a (correlated) noise and as a normal martingale in discrete time.

N. Privault, Stochastic Analysis in Discrete and Continuous Settings, Lecture Notes in Mathematics 1982, DOI 10.1007/978-3-642-02380-4_1, © Springer-Verlag Berlin Heidelberg 2009.
1.2 Stochastic Integrals
In this section we construct the discrete stochastic integral of predictable square-summable processes with respect to a discrete-time normal martingale.

Definition 1.2.1. Let $(u_k)_{k\in\mathbb{N}}$ be a uniformly bounded sequence of random variables with finite support in $\mathbb{N}$, i.e. there exists $N\ge 0$ such that $u_k=0$ for all $k\ge N$. The stochastic integral $J(u)$ of $(u_n)_{n\in\mathbb{N}}$ is defined as
$$J(u)=\sum_{k=0}^{\infty}u_kY_k.$$

The next proposition states a version of the Itô isometry in discrete time. A sequence $(u_n)_{n\in\mathbb{N}}$ of random variables is said to be $\mathcal{F}_n$-predictable if $u_n$ is $\mathcal{F}_{n-1}$-measurable for all $n\in\mathbb{N}$; in particular $u_0$ is constant in this case.

Proposition 1.2.2. The stochastic integral operator $J(u)$ extends to square-integrable predictable processes $(u_n)_{n\in\mathbb{N}}\in L^2(\Omega\times\mathbb{N})$ via the (conditional) isometry formula
$$E\big[|J(\mathbf{1}_{[n,\infty)}u)|^2\mid\mathcal{F}_{n-1}\big]=E\big[\|\mathbf{1}_{[n,\infty)}u\|^2_{\ell^2(\mathbb{N})}\mid\mathcal{F}_{n-1}\big],\qquad n\in\mathbb{N}.\tag{1.2.1}$$
Proof. Let $(u_n)_{n\in\mathbb{N}}$ and $(v_n)_{n\in\mathbb{N}}$ be bounded predictable processes with finite support in $\mathbb{N}$. The product $u_kY_kv_l$, $0\le k<l$, is $\mathcal{F}_{l-1}$-measurable, and $u_kY_lv_l$ is $\mathcal{F}_{k-1}$-measurable, $0\le l<k$. Hence
$$E\Big[\sum_{k=n}^{\infty}u_kY_k\sum_{l=n}^{\infty}v_lY_l\;\Big|\;\mathcal{F}_{n-1}\Big]
=E\Big[\sum_{k,l=n}^{\infty}u_kY_kv_lY_l\;\Big|\;\mathcal{F}_{n-1}\Big]$$
$$=E\Big[\sum_{k=n}^{\infty}u_kv_kY_k^2+\sum_{n\le k<l}u_kY_kv_lY_l+\sum_{n\le l<k}u_kY_kv_lY_l\;\Big|\;\mathcal{F}_{n-1}\Big]$$
$$=\sum_{k=n}^{\infty}E\big[E[u_kv_kY_k^2\mid\mathcal{F}_{k-1}]\mid\mathcal{F}_{n-1}\big]
+\sum_{n\le k<l}E\big[E[u_kY_kv_lY_l\mid\mathcal{F}_{l-1}]\mid\mathcal{F}_{n-1}\big]
+\sum_{n\le l<k}E\big[E[u_kY_kv_lY_l\mid\mathcal{F}_{k-1}]\mid\mathcal{F}_{n-1}\big]$$
$$=\sum_{k=n}^{\infty}E\big[u_kv_kE[Y_k^2\mid\mathcal{F}_{k-1}]\mid\mathcal{F}_{n-1}\big]
+2\sum_{n\le k<l}E\big[u_kY_kv_lE[Y_l\mid\mathcal{F}_{l-1}]\mid\mathcal{F}_{n-1}\big]$$
$$=\sum_{k=n}^{\infty}E[u_kv_k\mid\mathcal{F}_{n-1}]
=E\Big[\sum_{k=n}^{\infty}u_kv_k\;\Big|\;\mathcal{F}_{n-1}\Big].$$
This proves the isometry property (1.2.1) for $J$. The extension to $L^2(\Omega\times\mathbb{N})$ is proved using the following Cauchy sequence argument. Consider a sequence of bounded predictable processes with finite support converging to $u$ in $L^2(\Omega\times\mathbb{N})$, for example the sequence $(u^n)_{n\in\mathbb{N}}$ defined as
$$u^n=(u^n_k)_{k\in\mathbb{N}}=\big(u_k\mathbf{1}_{\{0\le k\le n\}}\mathbf{1}_{\{|u_k|\le n\}}\big)_{k\in\mathbb{N}},\qquad n\in\mathbb{N}.$$
Then the sequence $(J(u^n))_{n\in\mathbb{N}}$ is Cauchy and converges in $L^2(\Omega)$, hence we may define
$$J(u):=\lim_{k\to\infty}J(u^k).$$
From the isometry property (1.2.1) applied with $n=0$, the limit is clearly independent of the choice of the approximating sequence $(u^k)_{k\in\mathbb{N}}$. □
Note that by polarization, (1.2.1) can also be written as
$$E\big[J(\mathbf{1}_{[n,\infty)}u)J(\mathbf{1}_{[n,\infty)}v)\mid\mathcal{F}_{n-1}\big]
=E\big[\langle\mathbf{1}_{[n,\infty)}u,\mathbf{1}_{[n,\infty)}v\rangle_{\ell^2(\mathbb{N})}\mid\mathcal{F}_{n-1}\big],\qquad n\in\mathbb{N},$$
and that for $n=0$ we get
$$E[J(u)J(v)]=E\big[\langle u,v\rangle_{\ell^2(\mathbb{N})}\big],\tag{1.2.2}$$
and
$$E[|J(u)|^2]=E\big[\|u\|^2_{\ell^2(\mathbb{N})}\big],\tag{1.2.3}$$
for all square-integrable predictable processes $u=(u_k)_{k\in\mathbb{N}}$ and $v=(v_k)_{k\in\mathbb{N}}$.
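As a minimal numerical sketch of the isometry (1.2.3) (our assumptions, chosen only for illustration: a three-step horizon, deterministic parameters $p_k$ as in Section 1.5, and the ad hoc predictable choice $u_0=1$, $u_1=X_0$, $u_2=X_0X_1$), one can enumerate the $2^3$ Bernoulli paths, using the representation (1.4.6) of $Y_k$ introduced below:

```python
import itertools, math

# Enumerate all paths of (X_0, X_1, X_2) on {-1,+1}^3 and verify the
# discrete Ito isometry E[J(u)^2] = E[||u||^2_{l2}] exactly.
p = [0.5, 0.3, 0.7]          # deterministic p_k (arbitrary assumption)

def Y(k, x):                 # Y_k as a function of X_k, cf. (1.4.6)
    pk, qk = p[k], 1.0 - p[k]
    return (qk - pk + x) / (2.0 * math.sqrt(pk * qk))

lhs = rhs = 0.0
for omega in itertools.product((-1, 1), repeat=3):
    prob = math.prod(p[k] if x == 1 else 1.0 - p[k] for k, x in enumerate(omega))
    # predictable: u_k is a function of X_0, ..., X_{k-1} only
    u = [1.0, float(omega[0]), float(omega[0] * omega[1])]
    J = sum(u[k] * Y(k, omega[k]) for k in range(3))
    lhs += prob * J ** 2
    rhs += prob * sum(uk ** 2 for uk in u)

assert abs(lhs - rhs) < 1e-12   # both sides equal 3 here
```

Here the cross terms $E[u_kY_ku_lY_l]$, $k<l$, vanish exactly as in the proof of Proposition 1.2.2.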
Proposition 1.2.3. Let $(u_k)_{k\in\mathbb{N}}\in L^2(\Omega\times\mathbb{N})$ be a predictable square-integrable process. We have
$$E[J(u)\mid\mathcal{F}_k]=J(u\mathbf{1}_{[0,k]}),\qquad k\in\mathbb{N}.$$
Proof. In case $(u_k)_{k\in\mathbb{N}}$ has finite support in $\mathbb{N}$ it suffices to note that
$$E[J(u)\mid\mathcal{F}_k]=E\Big[\sum_{i=0}^{k}u_iY_i\;\Big|\;\mathcal{F}_k\Big]+\sum_{i=k+1}^{\infty}E[u_iY_i\mid\mathcal{F}_k]$$
$$=\sum_{i=0}^{k}u_iY_i+\sum_{i=k+1}^{\infty}E\big[E[u_iY_i\mid\mathcal{F}_{i-1}]\mid\mathcal{F}_k\big]$$
$$=\sum_{i=0}^{k}u_iY_i+\sum_{i=k+1}^{\infty}E\big[u_iE[Y_i\mid\mathcal{F}_{i-1}]\mid\mathcal{F}_k\big]$$
$$=\sum_{i=0}^{k}u_iY_i=J(u\mathbf{1}_{[0,k]}).$$
The formula extends to the general case by linearity and density, using the continuity of the conditional expectation on $L^2$ and the sequence $(u^n)_{n\in\mathbb{N}}$ defined as $u^n=(u^n_k)_{k\in\mathbb{N}}=(u_k\mathbf{1}_{\{0\le k\le n\}})_{k\in\mathbb{N}}$, $n\in\mathbb{N}$, i.e.
$$E\Big[\big(J(u\mathbf{1}_{[0,k]})-E[J(u)\mid\mathcal{F}_k]\big)^2\Big]
=\lim_{n\to\infty}E\Big[\big(J(u^n\mathbf{1}_{[0,k]})-E[J(u)\mid\mathcal{F}_k]\big)^2\Big]$$
$$=\lim_{n\to\infty}E\Big[\big(E[J(u^n)-J(u)\mid\mathcal{F}_k]\big)^2\Big]
\le\lim_{n\to\infty}E\Big[E\big[(J(u^n)-J(u))^2\mid\mathcal{F}_k\big]\Big]$$
$$=\lim_{n\to\infty}E\big[(J(u^n)-J(u))^2\big]=0,$$
by (1.2.3). □
Corollary 1.2.4. The indefinite stochastic integral $(J(u\mathbf{1}_{[0,k]}))_{k\in\mathbb{N}}$ is a discrete time martingale with respect to $(\mathcal{F}_n)_{n\ge -1}$.

Proof. We have
$$E[J(u\mathbf{1}_{[0,k+1]})\mid\mathcal{F}_k]=E\big[E[J(u\mathbf{1}_{[0,k+1]})\mid\mathcal{F}_{k+1}]\mid\mathcal{F}_k\big]
=E\big[E[J(u)\mid\mathcal{F}_{k+1}]\mid\mathcal{F}_k\big]
=E[J(u)\mid\mathcal{F}_k]=J(u\mathbf{1}_{[0,k]}).\qquad\Box$$
1.3 Multiple Stochastic Integrals
The role of multiple stochastic integrals in the orthogonal expansion of a random variable is similar to that of polynomials in the series expansion of a function of a real variable. In some situations, multiple stochastic integrals can be expressed using polynomials, such as in the symmetric case $p_n=q_n=1/2$, $n\in\mathbb{N}$, in which the Krawtchouk polynomials are used, see Relation (1.5.2) below.
Definition 1.3.1. Let $\ell^2(\mathbb{N})^{\circ n}$ denote the subspace of $\ell^2(\mathbb{N})^{\otimes n}=\ell^2(\mathbb{N}^n)$ made of functions $f_n$ that are symmetric in $n$ variables, i.e. such that for every permutation $\sigma$ of $\{1,\ldots,n\}$,
$$f_n(k_{\sigma(1)},\ldots,k_{\sigma(n)})=f_n(k_1,\ldots,k_n),\qquad k_1,\ldots,k_n\in\mathbb{N}.$$

Given $f_1\in\ell^2(\mathbb{N})$ we let
$$J_1(f_1)=J(f_1)=\sum_{k=0}^{\infty}f_1(k)Y_k.$$
As a convention we identify $\ell^2(\mathbb{N}^0)$ to $\mathbb{R}$ and let $J_0(f_0)=f_0$, $f_0\in\mathbb{R}$. Let
$$\Delta_n=\big\{(k_1,\ldots,k_n)\in\mathbb{N}^n:\ k_i\neq k_j,\ 1\le i<j\le n\big\},\qquad n\ge 1.$$
The following proposition gives the definition of multiple stochastic integrals by iterated stochastic integration of predictable processes in the sense of Proposition 1.2.2.
Proposition 1.3.2. The multiple stochastic integral $J_n(f_n)$ of $f_n\in\ell^2(\mathbb{N})^{\circ n}$, $n\ge 1$, is defined as
$$J_n(f_n)=\sum_{(i_1,\ldots,i_n)\in\Delta_n}f_n(i_1,\ldots,i_n)Y_{i_1}\cdots Y_{i_n}.$$
It satisfies the recurrence relation
$$J_n(f_n)=n\sum_{k=1}^{\infty}Y_kJ_{n-1}\big(f_n(*,k)\mathbf{1}_{[0,k-1]^{n-1}}(*)\big)\tag{1.3.1}$$
and the isometry formula
$$E[J_n(f_n)J_m(g_m)]=
\begin{cases}
n!\,\langle\mathbf{1}_{\Delta_n}f_n,g_m\rangle_{\ell^2(\mathbb{N})^{\otimes n}} & \text{if }n=m,\\
0 & \text{if }n\neq m.
\end{cases}\tag{1.3.2}$$
Proof. Note that we have
$$J_n(f_n)=n!\sum_{0\le i_1<\cdots<i_n}f_n(i_1,\ldots,i_n)Y_{i_1}\cdots Y_{i_n}
=n!\sum_{i_n=0}^{\infty}\sum_{0\le i_{n-1}<i_n}\cdots\sum_{0\le i_1<i_2}f_n(i_1,\ldots,i_n)Y_{i_1}\cdots Y_{i_n}.\tag{1.3.3}$$
Note that since $0\le i_1<i_2<\cdots<i_n$ and $0\le j_1<j_2<\cdots<j_n$ we have
$$E[Y_{i_1}\cdots Y_{i_n}Y_{j_1}\cdots Y_{j_n}]=\mathbf{1}_{\{i_1=j_1,\ldots,i_n=j_n\}}.$$
Hence
$$E[J_n(f_n)J_n(g_n)]
=(n!)^2E\Big[\sum_{0\le i_1<\cdots<i_n}f_n(i_1,\ldots,i_n)Y_{i_1}\cdots Y_{i_n}\sum_{0\le j_1<\cdots<j_n}g_n(j_1,\ldots,j_n)Y_{j_1}\cdots Y_{j_n}\Big]$$
$$=(n!)^2\sum_{\substack{0\le i_1<\cdots<i_n\\ 0\le j_1<\cdots<j_n}}f_n(i_1,\ldots,i_n)g_n(j_1,\ldots,j_n)E[Y_{i_1}\cdots Y_{i_n}Y_{j_1}\cdots Y_{j_n}]$$
$$=(n!)^2\sum_{0\le i_1<\cdots<i_n}f_n(i_1,\ldots,i_n)g_n(i_1,\ldots,i_n)$$
$$=n!\sum_{(i_1,\ldots,i_n)\in\Delta_n}f_n(i_1,\ldots,i_n)g_n(i_1,\ldots,i_n)
=n!\,\langle\mathbf{1}_{\Delta_n}f_n,g_n\rangle_{\ell^2(\mathbb{N})^{\otimes n}}.$$
When $n<m$ and $(i_1,\ldots,i_n)\in\Delta_n$ and $(j_1,\ldots,j_m)\in\Delta_m$ are two sets of indices, there necessarily exists $k\in\{1,\ldots,m\}$ such that $j_k\notin\{i_1,\ldots,i_n\}$, hence
$$E[Y_{i_1}\cdots Y_{i_n}Y_{j_1}\cdots Y_{j_m}]=0,$$
and this implies the orthogonality of $J_n(f_n)$ and $J_m(g_m)$. The recurrence relation (1.3.1) is a direct consequence of (1.3.3). The isometry property (1.3.2) of $J_n$ also follows by induction from (1.2.1) and the recurrence relation. □
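The orthogonality and isometry in (1.3.2) can also be observed numerically. The following sketch (our own illustration with arbitrary assumptions: three time steps, deterministic $p_k$, the symmetric function $f_2(i,j)=(i+1)(j+1)$ and $g_1(k)=k+1/2$) enumerates all paths and checks that $E[J_2(f_2)J_1(g_1)]=0$ and $E[J_2(f_2)^2]=2!\,\langle\mathbf{1}_{\Delta_2}f_2,f_2\rangle$:

```python
import itertools, math

p = [0.4, 0.6, 0.25]         # deterministic p_k (arbitrary assumption)

def Y(k, x):                 # cf. (1.4.6)
    pk, qk = p[k], 1.0 - p[k]
    return (qk - pk + x) / (2.0 * math.sqrt(pk * qk))

f2 = lambda i, j: (i + 1.0) * (j + 1.0)   # symmetric in (i, j)
g1 = lambda k: k + 0.5

E_J2J1 = E_J2sq = 0.0
for omega in itertools.product((-1, 1), repeat=3):
    prob = math.prod(p[k] if x == 1 else 1.0 - p[k] for k, x in enumerate(omega))
    ys = [Y(k, omega[k]) for k in range(3)]
    J2 = sum(f2(i, j) * ys[i] * ys[j]
             for i, j in itertools.product(range(3), repeat=2) if i != j)
    J1 = sum(g1(k) * ys[k] for k in range(3))
    E_J2J1 += prob * J2 * J1
    E_J2sq += prob * J2 ** 2

assert abs(E_J2J1) < 1e-10   # chaoses of different orders are orthogonal
target = 2.0 * sum(f2(i, j) ** 2
                   for i, j in itertools.product(range(3), repeat=2) if i != j)
assert abs(E_J2sq - target) < 1e-10   # isometry (1.3.2) with n = m = 2
```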
If $f_n\in\ell^2(\mathbb{N}^n)$ is not symmetric we let $J_n(f_n)=J_n(\tilde f_n)$, where $\tilde f_n$ is the symmetrization of $f_n$, defined as
$$\tilde f_n(i_1,\ldots,i_n)=\frac{1}{n!}\sum_{\sigma\in\Sigma_n}f(i_{\sigma(1)},\ldots,i_{\sigma(n)}),\qquad (i_1,\ldots,i_n)\in\mathbb{N}^n,$$
and $\Sigma_n$ is the set of all permutations of $\{1,\ldots,n\}$.

In particular, if $(k_1,\ldots,k_n)\in\Delta_n$, the symmetrization of $\mathbf{1}_{\{(k_1,\ldots,k_n)\}}$ in $n$ variables is given by
$$\tilde{\mathbf{1}}_{\{(k_1,\ldots,k_n)\}}(i_1,\ldots,i_n)=\frac{1}{n!}\mathbf{1}_{\{\{i_1,\ldots,i_n\}=\{k_1,\ldots,k_n\}\}},\qquad i_1,\ldots,i_n\in\mathbb{N},$$
and
$$J_n\big(\tilde{\mathbf{1}}_{\{(k_1,\ldots,k_n)\}}\big)=Y_{k_1}\cdots Y_{k_n}.$$
Lemma 1.3.3. For all $n\ge 1$ we have
$$E[J_n(f_n)\mid\mathcal{F}_k]=J_n\big(f_n\mathbf{1}_{[0,k]^n}\big),\qquad k\in\mathbb{N},\quad f_n\in\ell^2(\mathbb{N})^{\circ n}.$$

Proof. This lemma can be proved in two ways, either as a consequence of Proposition 1.2.3 and Proposition 1.3.2 or via the following direct argument, noting that for all $m=0,\ldots,n$ and $g_m\in\ell^2(\mathbb{N})^{\circ m}$ we have:
$$E\big[\big(J_n(f_n)-J_n(f_n\mathbf{1}_{[0,k]^n})\big)J_m\big(g_m\mathbf{1}_{[0,k]^m}\big)\big]
=\mathbf{1}_{\{n=m\}}\,n!\,\big\langle f_n(1-\mathbf{1}_{[0,k]^n}),g_m\mathbf{1}_{[0,k]^m}\big\rangle_{\ell^2(\mathbb{N}^n)}=0,$$
hence $J_n(f_n\mathbf{1}_{[0,k]^n})\in L^2(\Omega,\mathcal{F}_k)$, and $J_n(f_n)-J_n(f_n\mathbf{1}_{[0,k]^n})$ is orthogonal to $L^2(\Omega,\mathcal{F}_k)$. □

In other terms we have
$$E[J_n(f_n)]=0,\qquad f_n\in\ell^2(\mathbb{N})^{\circ n},\quad n\ge 1,$$
the process $(J_n(f_n\mathbf{1}_{[0,k]^n}))_{k\in\mathbb{N}}$ is a discrete-time martingale, and $J_n(f_n)$ is $\mathcal{F}_k$-measurable if and only if
$$f_n\mathbf{1}_{[0,k]^n}=f_n.$$
1.4 Structure Equations
Assume now that the sequence $(Y_n)_{n\in\mathbb{N}}$ satisfies the discrete structure equation:
$$Y_n^2=1+\varphi_nY_n,\qquad n\in\mathbb{N},\tag{1.4.1}$$
where $(\varphi_n)_{n\in\mathbb{N}}$ is an $\mathcal{F}_n$-predictable process. Condition (1.1.1) implies that
$$E[Y_n^2\mid\mathcal{F}_{n-1}]=1,\qquad n\in\mathbb{N},$$
hence the hypotheses of the preceding sections are satisfied. Since (1.4.1) is a second order equation, there exists an $\mathcal{F}_n$-adapted process $(X_n)_{n\in\mathbb{N}}$ of Bernoulli $\{-1,1\}$-valued random variables such that
$$Y_n=\frac{\varphi_n}{2}+X_n\sqrt{1+\Big(\frac{\varphi_n}{2}\Big)^2},\qquad n\in\mathbb{N}.\tag{1.4.2}$$
Consider the conditional probabilities
$$p_n=P(X_n=1\mid\mathcal{F}_{n-1})\quad\text{and}\quad q_n=P(X_n=-1\mid\mathcal{F}_{n-1}),\qquad n\in\mathbb{N}.\tag{1.4.3}$$
From the relation $E[Y_n\mid\mathcal{F}_{n-1}]=0$, rewritten as
$$p_n\Big(\frac{\varphi_n}{2}+\sqrt{1+\Big(\frac{\varphi_n}{2}\Big)^2}\Big)
+q_n\Big(\frac{\varphi_n}{2}-\sqrt{1+\Big(\frac{\varphi_n}{2}\Big)^2}\Big)=0,\qquad n\in\mathbb{N},$$
we get
$$p_n=\frac{1}{2}\Big(1-\frac{\varphi_n}{\sqrt{4+\varphi_n^2}}\Big),\qquad
q_n=\frac{1}{2}\Big(1+\frac{\varphi_n}{\sqrt{4+\varphi_n^2}}\Big),\tag{1.4.4}$$
and
$$\varphi_n=\sqrt{\frac{q_n}{p_n}}-\sqrt{\frac{p_n}{q_n}}=\frac{q_n-p_n}{\sqrt{p_nq_n}},\qquad n\in\mathbb{N},$$
hence
$$Y_n=\mathbf{1}_{\{X_n=1\}}\sqrt{\frac{q_n}{p_n}}-\mathbf{1}_{\{X_n=-1\}}\sqrt{\frac{p_n}{q_n}},\qquad n\in\mathbb{N}.\tag{1.4.5}$$
Letting
$$Z_n=\frac{X_n+1}{2}\in\{0,1\},\qquad n\in\mathbb{N},$$
we also have the relations
$$Y_n=\frac{q_n-p_n+X_n}{2\sqrt{p_nq_n}}=\frac{Z_n-p_n}{\sqrt{p_nq_n}},\qquad n\in\mathbb{N},\tag{1.4.6}$$
which yield
$$\mathcal{F}_n=\sigma(X_0,\ldots,X_n)=\sigma(Z_0,\ldots,Z_n),\qquad n\in\mathbb{N}.$$
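The relations above can be checked by direct arithmetic. The following sketch (an illustration under our own assumptions, here a few arbitrary values of $p$) verifies for a single step that the two-point law of (1.4.5) is conditionally centered with unit conditional variance, and satisfies the structure equation (1.4.1) pathwise with $\varphi=(q-p)/\sqrt{pq}$:

```python
import math

def y_values(p):
    """Values of Y on {X = +1} and {X = -1}, cf. (1.4.5)."""
    q = 1.0 - p
    return math.sqrt(q / p), -math.sqrt(p / q)

for p in (0.5, 0.3, 0.8):
    q = 1.0 - p
    y_plus, y_minus = y_values(p)
    phi = (q - p) / math.sqrt(p * q)
    # conditional centering and unit conditional quadratic variation
    assert abs(p * y_plus + q * y_minus) < 1e-12
    assert abs(p * y_plus ** 2 + q * y_minus ** 2 - 1.0) < 1e-12
    # structure equation (1.4.1), Y^2 = 1 + phi * Y, on both outcomes
    assert abs(y_plus ** 2 - (1.0 + phi * y_plus)) < 1e-12
    assert abs(y_minus ** 2 - (1.0 + phi * y_minus)) < 1e-12
```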
Remark 1.4.1. In particular, one can take $\Omega=\{-1,1\}^{\mathbb{N}}$ and construct the Bernoulli process $(X_n)_{n\in\mathbb{N}}$ as the sequence of canonical projections on $\Omega=\{-1,1\}^{\mathbb{N}}$ under a countable product $P$ of Bernoulli measures on $\{-1,1\}$. In this case the sequence $(X_n)_{n\in\mathbb{N}}$ can be viewed as the dyadic expansion of $X(\omega)\in[0,1]$ defined as:
$$X(\omega)=\sum_{n=0}^{\infty}\frac{1}{2^{n+1}}Z_n(\omega).$$
In the symmetric case $p_k=q_k=1/2$, $k\in\mathbb{N}$, the image measure of $P$ by the mapping $\omega\mapsto X(\omega)$ is the Lebesgue measure on $[0,1]$, see [139] for the non-symmetric case.
1.5 Chaos Representation
From now on we assume that the sequence $(p_k)_{k\in\mathbb{N}}$ defined in (1.4.3) is deterministic, which implies that the random variables $(X_n)_{n\in\mathbb{N}}$ are independent. Precisely, $X_n$ will be constructed as the canonical projection $X_n:\Omega\to\{-1,1\}$ on $\Omega=\{-1,1\}^{\mathbb{N}}$ under the measure $P$ given on cylinder sets by
$$P\big(\{\epsilon_0,\ldots,\epsilon_n\}\times\{-1,1\}^{\mathbb{N}}\big)=\prod_{k=0}^{n}p_k^{(1+\epsilon_k)/2}q_k^{(1-\epsilon_k)/2},\qquad (\epsilon_0,\ldots,\epsilon_n)\in\{-1,1\}^{n+1}.$$
The sequence $(Y_k)_{k\in\mathbb{N}}$ can be constructed as a family of independent random variables given by
$$Y_n=\frac{\varphi_n}{2}+X_n\sqrt{1+\Big(\frac{\varphi_n}{2}\Big)^2},\qquad n\in\mathbb{N},$$
where the sequence $(\varphi_n)_{n\in\mathbb{N}}$ is deterministic. In this case, all spaces $L^r(\Omega,\mathcal{F}_n)$, $r\ge 1$, have finite dimension $2^{n+1}$, with basis
$$\Big\{\mathbf{1}_{\{Y_0=\epsilon_0,\ldots,Y_n=\epsilon_n\}}:(\epsilon_0,\ldots,\epsilon_n)\in\prod_{k=0}^{n}\Big\{\sqrt{\tfrac{q_k}{p_k}},-\sqrt{\tfrac{p_k}{q_k}}\Big\}\Big\}
=\Big\{\mathbf{1}_{\{X_0=\epsilon_0,\ldots,X_n=\epsilon_n\}}:(\epsilon_0,\ldots,\epsilon_n)\in\prod_{k=0}^{n}\{-1,1\}\Big\}.$$
An orthogonal basis of $L^r(\Omega,\mathcal{F}_n)$ is given by
$$\Big\{Y_{k_1}\cdots Y_{k_l}=J_l\big(\tilde{\mathbf{1}}_{\{(k_1,\ldots,k_l)\}}\big):0\le k_1<\cdots<k_l\le n,\ l=0,\ldots,n+1\Big\}.$$
Let
$$S_n=\sum_{k=0}^{n}\frac{1+X_k}{2}=\sum_{k=0}^{n}Z_k,\qquad n\in\mathbb{N},\tag{1.5.1}$$
denote the random walk associated to $(X_k)_{k\in\mathbb{N}}$. If $p_k=p$, $k\in\mathbb{N}$, then
$$J_n\big(\mathbf{1}_{[0,N]}^{\circ n}\big)=K_n(S_N;N+1,p)\tag{1.5.2}$$
coincides with the Krawtchouk polynomial $K_n(\cdot;N+1,p)$ of order $n$ and parameter $(N+1,p)$, evaluated at $S_N$, cf. Proposition 4 of [115].

Let now $H_0=\mathbb{R}$ and let $H_n$ denote the subspace of $L^2(\Omega)$ made of integrals of order $n\ge 1$, called the chaos of order $n$:
$$H_n=\big\{J_n(f_n):f_n\in\ell^2(\mathbb{N})^{\circ n}\big\}.$$
The space of $\mathcal{F}_n$-measurable random variables is denoted by $L^0(\Omega,\mathcal{F}_n)$.
Lemma 1.5.1. For all $n\in\mathbb{N}$ we have
$$L^0(\Omega,\mathcal{F}_n)=(H_0\oplus\cdots\oplus H_{n+1})\cap L^0(\Omega,\mathcal{F}_n).\tag{1.5.3}$$

Proof. It suffices to note that $H_l\cap L^0(\Omega,\mathcal{F}_n)$ has dimension $\binom{n+1}{l}$, $1\le l\le n+1$. More precisely it is generated by the orthonormal basis
$$\Big\{Y_{k_1}\cdots Y_{k_l}=J_l\big(\tilde{\mathbf{1}}_{\{(k_1,\ldots,k_l)\}}\big):0\le k_1<\cdots<k_l\le n\Big\},$$
since any element $F$ of $H_l\cap L^0(\Omega,\mathcal{F}_n)$ can be written as $F=J_l(f_l\mathbf{1}_{[0,n]^l})$. Hence $L^0(\Omega,\mathcal{F}_n)$ and $(H_0\oplus\cdots\oplus H_{n+1})\cap L^0(\Omega,\mathcal{F}_n)$ have the same dimension
$$2^{n+1}=\sum_{k=0}^{n+1}\binom{n+1}{k},$$
and this implies (1.5.3) since
$$L^0(\Omega,\mathcal{F}_n)\supset(H_0\oplus\cdots\oplus H_{n+1})\cap L^0(\Omega,\mathcal{F}_n).\qquad\Box$$
As a consequence of Lemma 1.5.1 we have
$$L^0(\Omega,\mathcal{F}_n)\subset H_0\oplus\cdots\oplus H_{n+1}.$$
Alternatively, Lemma 1.5.1 can be proved by noting that
$$J_n\big(f_n\mathbf{1}_{[0,N]^n}\big)=0,\qquad n>N+1,\quad f_n\in\ell^2(\mathbb{N})^{\circ n},$$
and as a consequence, any $F\in L^0(\Omega,\mathcal{F}_N)$ can be expressed as
$$F=E[F]+\sum_{n=1}^{N+1}J_n\big(f_n\mathbf{1}_{[0,N]^n}\big).$$
Definition 1.5.2. Let $\mathcal{S}$ denote the linear space spanned by multiple stochastic integrals, i.e.
$$\mathcal{S}=\mathrm{Vect}\Big(\bigcup_{n=0}^{\infty}H_n\Big)
=\Big\{\sum_{k=0}^{n}J_k(f_k):f_k\in\ell^2(\mathbb{N})^{\circ k},\ k=0,\ldots,n,\ n\in\mathbb{N}\Big\}.\tag{1.5.4}$$
The completion of $\mathcal{S}$ in $L^2(\Omega)$ is denoted by the direct sum $\bigoplus_{n=0}^{\infty}H_n$.
The next result is the chaos representation property for Bernoulli processes, which is analogous to the Walsh decomposition, cf. [78]. Here this property is obtained under the assumption that the sequence $(X_n)_{n\in\mathbb{N}}$ is made of independent random variables since $(p_k)_{k\in\mathbb{N}}$ is deterministic, which corresponds to the setting of Proposition 4 in [38]. See [38] and Proposition 5 therein for other instances of the chaos representation property without this independence assumption.

Proposition 1.5.3. We have the identity
$$L^2(\Omega)=\bigoplus_{n=0}^{\infty}H_n.$$

Proof. It suffices to show that $\mathcal{S}$ is dense in $L^2(\Omega)$. Let $F$ be a bounded random variable. Relation (1.5.3) of Lemma 1.5.1 shows that $E[F\mid\mathcal{F}_n]\in\mathcal{S}$. The martingale convergence theorem, cf. e.g. Theorem 27.1 in [67], implies that $(E[F\mid\mathcal{F}_n])_{n\in\mathbb{N}}$ converges to $F$ a.s., hence every bounded $F$ is the $L^2(\Omega)$-limit of a sequence in $\mathcal{S}$. If $F\in L^2(\Omega)$ is not bounded, $F$ is the limit in $L^2(\Omega)$ of the sequence $(\mathbf{1}_{\{|F|\le n\}}F)_{n\in\mathbb{N}}$ of bounded random variables. □

As a consequence of Proposition 1.5.3, any $F\in L^2(\Omega,P)$ has a unique decomposition
$$F=E[F]+\sum_{n=1}^{\infty}J_n(f_n),\qquad f_n\in\ell^2(\mathbb{N})^{\circ n},\quad n\in\mathbb{N},$$
as a series of multiple stochastic integrals. Note also that the statement of Lemma 1.5.1 is sufficient for the chaos representation property to hold.
1.6 Gradient Operator
We start by defining the operator $D$ on the space $\mathcal{S}$ of finite sums of multiple stochastic integrals, which is dense in $L^2(\Omega)$ by Proposition 1.5.3.

Definition 1.6.1. We densely define the linear gradient operator
$$D:\mathcal{S}\longrightarrow L^2(\Omega\times\mathbb{N})$$
by
$$D_kJ_n(f_n)=nJ_{n-1}\big(f_n(*,k)\mathbf{1}_{\Delta_n}(*,k)\big),\qquad k\in\mathbb{N},\quad f_n\in\ell^2(\mathbb{N})^{\circ n},\quad n\in\mathbb{N}.$$

Note that for all $k_1,\ldots,k_{n-1},k\in\mathbb{N}$, we have
$$\mathbf{1}_{\Delta_n}(k_1,\ldots,k_{n-1},k)=\mathbf{1}_{\{k\notin(k_1,\ldots,k_{n-1})\}}\mathbf{1}_{\Delta_{n-1}}(k_1,\ldots,k_{n-1}),$$
hence we can write
$$D_kJ_n(f_n)=nJ_{n-1}\big(f_n(*,k)\mathbf{1}_{\{k\notin *\}}\big),\qquad k\in\mathbb{N},$$
where in the above relation, $*$ denotes the first $n-1$ variables $(k_1,\ldots,k_{n-1})$ of $f_n(k_1,\ldots,k_{n-1},k)$. We also have $D_kF=0$ whenever $F\in\mathcal{S}$ is $\mathcal{F}_{k-1}$-measurable.
On the other hand, $D_k$ is a continuous operator on the chaos $H_n$ since
$$\|D_kJ_n(f_n)\|^2_{L^2(\Omega)}=n^2\|J_{n-1}(f_n(*,k))\|^2_{L^2(\Omega)}
=n\,n!\,\|f_n(*,k)\|^2_{\ell^2(\mathbb{N}^{\otimes(n-1)})},\qquad f_n\in\ell^2(\mathbb{N}^n),\quad k\in\mathbb{N}.\tag{1.6.1}$$
The following result gives the probabilistic interpretation of $D_k$ as a finite difference operator. Given
$$\omega=(\omega_0,\omega_1,\ldots)\in\{-1,1\}^{\mathbb{N}},$$
let
$$\omega_+^k=(\omega_0,\omega_1,\ldots,\omega_{k-1},+1,\omega_{k+1},\ldots)$$
and
$$\omega_-^k=(\omega_0,\omega_1,\ldots,\omega_{k-1},-1,\omega_{k+1},\ldots).$$
Proposition 1.6.2. We have for any $F\in\mathcal{S}$:
$$D_kF(\omega)=\sqrt{p_kq_k}\,\big(F(\omega_+^k)-F(\omega_-^k)\big),\qquad k\in\mathbb{N}.\tag{1.6.2}$$
Proof. We start by proving the above statement for an $\mathcal{F}_n$-measurable $F\in\mathcal{S}$. Since $L^0(\Omega,\mathcal{F}_n)$ is finite dimensional it suffices to consider
$$F=Y_{k_1}\cdots Y_{k_l}=f(X_0,\ldots,X_{k_l}),$$
with, from (1.4.6):
$$f(x_0,\ldots,x_{k_l})=\frac{1}{2^l}\prod_{i=1}^{l}\frac{q_{k_i}-p_{k_i}+x_{k_i}}{\sqrt{p_{k_i}q_{k_i}}}.$$
First we note that from (1.5.3) we have for $(k_1,\ldots,k_n)\in\Delta_n$:
$$D_k(Y_{k_1}\cdots Y_{k_n})=D_kJ_n\big(\tilde{\mathbf{1}}_{\{(k_1,\ldots,k_n)\}}\big)
=nJ_{n-1}\big(\tilde{\mathbf{1}}_{\{(k_1,\ldots,k_n)\}}(*,k)\big)$$
$$=\frac{1}{(n-1)!}\sum_{i=1}^{n}\mathbf{1}_{\{k_i\}}(k)\sum_{(i_1,\ldots,i_{n-1})\in\Delta_{n-1}}\mathbf{1}_{\{\{i_1,\ldots,i_{n-1}\}=\{k_1,\ldots,k_{i-1},k_{i+1},\ldots,k_n\}\}}Y_{i_1}\cdots Y_{i_{n-1}}$$
$$=\sum_{i=1}^{n}\mathbf{1}_{\{k_i\}}(k)J_{n-1}\big(\tilde{\mathbf{1}}_{\{(k_1,\ldots,k_{i-1},k_{i+1},\ldots,k_n)\}}\big)
=\mathbf{1}_{\{k_1,\ldots,k_n\}}(k)\prod_{\substack{i=1\\k_i\neq k}}^{n}Y_{k_i}.\tag{1.6.3}$$
If $k\notin\{k_1,\ldots,k_l\}$ we clearly have $F(\omega_+^k)=F(\omega_-^k)=F(\omega)$, hence
$$\sqrt{p_kq_k}\,\big(F(\omega_+^k)-F(\omega_-^k)\big)=0=D_kF(\omega).$$
On the other hand if $k\in\{k_1,\ldots,k_l\}$ we have
$$F(\omega_+^k)=\sqrt{\frac{q_k}{p_k}}\prod_{\substack{i=1\\k_i\neq k}}^{l}\frac{q_{k_i}-p_{k_i}+\omega_{k_i}}{2\sqrt{p_{k_i}q_{k_i}}},\qquad
F(\omega_-^k)=-\sqrt{\frac{p_k}{q_k}}\prod_{\substack{i=1\\k_i\neq k}}^{l}\frac{q_{k_i}-p_{k_i}+\omega_{k_i}}{2\sqrt{p_{k_i}q_{k_i}}},$$
hence from (1.6.3) we get
$$\sqrt{p_kq_k}\,\big(F(\omega_+^k)-F(\omega_-^k)\big)
=\frac{1}{2^{l-1}}\prod_{\substack{i=1\\k_i\neq k}}^{l}\frac{q_{k_i}-p_{k_i}+\omega_{k_i}}{2\sqrt{p_{k_i}q_{k_i}}}\cdot 2^{l-1}
=\prod_{\substack{i=1\\k_i\neq k}}^{l}Y_{k_i}(\omega)
=D_k(Y_{k_1}\cdots Y_{k_l})(\omega)=D_kF(\omega),$$
using $\sqrt{p_kq_k}\big(\sqrt{q_k/p_k}+\sqrt{p_k/q_k}\big)=1$. In the general case, $J_l(f_l)$ is the $L^2$-limit of the sequence $E[J_l(f_l)\mid\mathcal{F}_n]=J_l(f_l\mathbf{1}_{[0,n]^l})$ as $n$ goes to infinity, and since from (1.6.1) the operator $D_k$ is continuous on all chaoses $H_n$, $n\ge 1$, we have
$$D_kF=\lim_{n\to\infty}D_kE[F\mid\mathcal{F}_n]
=\sqrt{p_kq_k}\lim_{n\to\infty}\big(E[F\mid\mathcal{F}_n](\omega_+^k)-E[F\mid\mathcal{F}_n](\omega_-^k)\big)
=\sqrt{p_kq_k}\,\big(F(\omega_+^k)-F(\omega_-^k)\big),\qquad k\in\mathbb{N}.\qquad\Box$$
The next property follows immediately from Proposition 1.6.2.

Corollary 1.6.3. A random variable $F:\Omega\to\mathbb{R}$ is $\mathcal{F}_n$-measurable if and only if
$$D_kF=0$$
for all $k>n$.

If $F$ has the form $F=f(X_0,\ldots,X_n)$, we may also write
$$D_kF=\sqrt{p_kq_k}\,\big(F_k^+-F_k^-\big),\qquad k\in\mathbb{N},$$
with
$$F_k^+=f(X_0,\ldots,X_{k-1},+1,X_{k+1},\ldots,X_n)$$
and
$$F_k^-=f(X_0,\ldots,X_{k-1},-1,X_{k+1},\ldots,X_n).$$
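The finite difference formula (1.6.2) is straightforward to implement. The following sketch (our own illustration; the two-step horizon, the parameters $p_k$ and the choice $F=Y_0Y_1$ are arbitrary assumptions) checks on every path that $D_0(Y_0Y_1)=Y_1$ and $D_1(Y_0Y_1)=Y_0$, in agreement with (1.6.3):

```python
import math

p = [0.3, 0.6]               # deterministic p_k (arbitrary assumption)

def Y(k, x):                 # cf. (1.4.6)
    pk, qk = p[k], 1.0 - p[k]
    return (qk - pk + x) / (2.0 * math.sqrt(pk * qk))

def F(omega):                # F = Y_0 Y_1
    return Y(0, omega[0]) * Y(1, omega[1])

def D(k, F, omega):
    """Finite difference gradient of (1.6.2)."""
    plus  = tuple(+1 if i == k else x for i, x in enumerate(omega))
    minus = tuple(-1 if i == k else x for i, x in enumerate(omega))
    pk = p[k]
    return math.sqrt(pk * (1.0 - pk)) * (F(plus) - F(minus))

for omega in ((-1, -1), (-1, 1), (1, -1), (1, 1)):
    assert abs(D(0, F, omega) - Y(1, omega[1])) < 1e-12
    assert abs(D(1, F, omega) - Y(0, omega[0])) < 1e-12
```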
The gradient $D$ can also be expressed as
$$D_kF(S_\cdot)=\sqrt{p_kq_k}\,\Big(F\big(S_\cdot+\mathbf{1}_{\{X_k=-1\}}\mathbf{1}_{\{\cdot\ge k\}}\big)-F\big(S_\cdot-\mathbf{1}_{\{X_k=1\}}\mathbf{1}_{\{\cdot\ge k\}}\big)\Big),$$
where $F(S_\cdot)$ is an informal notation for the random variable $F$ evaluated on a given path of $(S_n)_{n\in\mathbb{N}}$ defined in (1.5.1), and $S_\cdot+\mathbf{1}_{\{X_k=-1\}}\mathbf{1}_{\{\cdot\ge k\}}$ denotes the path of $(S_n)_{n\in\mathbb{N}}$ perturbed by forcing $X_k$ to be equal to $+1$.

We will also use the gradient $\nabla_k$ defined as
$$\nabla_kF=X_k\big(f(X_0,\ldots,X_{k-1},+1,X_{k+1},\ldots,X_n)-f(X_0,\ldots,X_{k-1},-1,X_{k+1},\ldots,X_n)\big),$$
$k\in\mathbb{N}$, with the relation
$$D_k=X_k\sqrt{p_kq_k}\,\nabla_k,\qquad k\in\mathbb{N},$$
hence $\nabla_kF$ coincides with $D_kF$ after squaring and multiplication by $p_kq_k$. From now on, $D_k$ denotes the finite difference operator which is extended to any $F:\Omega\to\mathbb{R}$ using Relation (1.6.2).
The $L^2$ domain of $D$, denoted $\mathrm{Dom}(D)$, is naturally defined as the space of functionals $F\in L^2(\Omega)$ such that
$$E\big[\|DF\|^2_{\ell^2(\mathbb{N})}\big]<\infty,$$
or equivalently by (1.6.1),
$$\sum_{n=1}^{\infty}n\,n!\,\|f_n\|^2_{\ell^2(\mathbb{N}^n)}<\infty,$$
if $F=\sum_{n=0}^{\infty}J_n(f_n)$.
The following is the product rule for the operator $D$.

Proposition 1.6.4. Let $F,G:\Omega\to\mathbb{R}$. We have
$$D_k(FG)=FD_kG+GD_kF-\frac{X_k}{\sqrt{p_kq_k}}\,D_kFD_kG,\qquad k\in\mathbb{N}.$$
Proof. Let $F_+^k(\omega)=F(\omega_+^k)$, $F_-^k(\omega)=F(\omega_-^k)$, $k\ge 0$. We have
$$D_k(FG)=\sqrt{p_kq_k}\,\big(F_+^kG_+^k-F_-^kG_-^k\big)$$
$$=\mathbf{1}_{\{X_k=-1\}}\sqrt{p_kq_k}\,\big(F(G_+^k-G)+G(F_+^k-F)+(F_+^k-F)(G_+^k-G)\big)$$
$$\quad+\mathbf{1}_{\{X_k=1\}}\sqrt{p_kq_k}\,\big(F(G-G_-^k)+G(F-F_-^k)-(F-F_-^k)(G-G_-^k)\big)$$
$$=\mathbf{1}_{\{X_k=-1\}}\Big(FD_kG+GD_kF+\frac{1}{\sqrt{p_kq_k}}D_kFD_kG\Big)
+\mathbf{1}_{\{X_k=1\}}\Big(FD_kG+GD_kF-\frac{1}{\sqrt{p_kq_k}}D_kFD_kG\Big).\qquad\Box$$
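Since the product rule is a pathwise identity, it can be verified by enumeration. The sketch below (our own illustration; the two-step horizon and the hypothetical choices $F=Y_0$, $G=Y_0Y_1$ are assumptions made only for this check) evaluates both sides of Proposition 1.6.4 on every path:

```python
import itertools, math

p = [0.3, 0.55]              # deterministic p_k (arbitrary assumption)

def Y(k, x):                 # cf. (1.4.6)
    pk, qk = p[k], 1.0 - p[k]
    return (qk - pk + x) / (2.0 * math.sqrt(pk * qk))

def D(k, H, omega):          # finite difference gradient (1.6.2)
    plus  = tuple(+1 if i == k else x for i, x in enumerate(omega))
    minus = tuple(-1 if i == k else x for i, x in enumerate(omega))
    return math.sqrt(p[k] * (1.0 - p[k])) * (H(plus) - H(minus))

F = lambda w: Y(0, w[0])
G = lambda w: Y(0, w[0]) * Y(1, w[1])

for omega in itertools.product((-1, 1), repeat=2):
    for k in range(2):
        lhs = D(k, lambda w: F(w) * G(w), omega)
        rhs = (F(omega) * D(k, G, omega) + G(omega) * D(k, F, omega)
               - omega[k] / math.sqrt(p[k] * (1.0 - p[k]))
                 * D(k, F, omega) * D(k, G, omega))
        assert abs(lhs - rhs) < 1e-12
```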
1.7 Clark Formula and Predictable Representation
In this section we prove a predictable representation formula for the functionals of $(S_n)_{n\ge 0}$ defined in (1.5.1).

Proposition 1.7.1. For all $F\in\mathcal{S}$ we have
$$F=E[F]+\sum_{k=0}^{\infty}E[D_kF\mid\mathcal{F}_{k-1}]Y_k\tag{1.7.1}$$
$$\phantom{F}=E[F]+\sum_{k=0}^{\infty}Y_kD_kE[F\mid\mathcal{F}_k].$$
Proof. The formula is obviously true for $F=J_0(f_0)$. Given $n\ge 1$, as a consequence of Proposition 1.3.2 above and Lemma 1.3.3 we have:
$$J_n(f_n)=n\sum_{k=0}^{\infty}J_{n-1}\big(f_n(*,k)\mathbf{1}_{[0,k-1]^{n-1}}(*)\big)Y_k$$
$$=n\sum_{k=0}^{\infty}J_{n-1}\big(f_n(*,k)\mathbf{1}_{\Delta_n}(*,k)\mathbf{1}_{[0,k-1]^{n-1}}(*)\big)Y_k$$
$$=n\sum_{k=0}^{\infty}E\big[J_{n-1}\big(f_n(*,k)\mathbf{1}_{\Delta_n}(*,k)\big)\mid\mathcal{F}_{k-1}\big]Y_k$$
$$=\sum_{k=0}^{\infty}E[D_kJ_n(f_n)\mid\mathcal{F}_{k-1}]Y_k,$$
which yields (1.7.1) for $F=J_n(f_n)$, since $E[J_n(f_n)]=0$. By linearity the formula is established for $F\in\mathcal{S}$.

For the second identity we use the relation
$$E[D_kF\mid\mathcal{F}_{k-1}]=D_kE[F\mid\mathcal{F}_k],$$
which clearly holds since $D_kF$ is independent of $X_k$, $k\in\mathbb{N}$. □
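On a finite Bernoulli space the Clark formula (1.7.1) is an exact pathwise identity, which can be checked directly. The following sketch (our own illustration; the two-step horizon, the parameters $p_k$ and the hypothetical functional $F$ below are arbitrary assumptions) computes the conditional expectations by summation over the unconditioned coordinates:

```python
import itertools, math

p = [0.3, 0.7]               # deterministic p_k (arbitrary assumption)
paths = list(itertools.product((-1, 1), repeat=2))

def prob(omega):
    return math.prod(p[k] if x == 1 else 1.0 - p[k] for k, x in enumerate(omega))

def Y(k, x):                 # cf. (1.4.6)
    pk, qk = p[k], 1.0 - p[k]
    return (qk - pk + x) / (2.0 * math.sqrt(pk * qk))

def F(omega):                # an arbitrary functional of the two steps
    return math.exp(omega[0]) + omega[0] * omega[1]

def D(k, omega):             # finite difference gradient (1.6.2)
    plus  = tuple(+1 if i == k else x for i, x in enumerate(omega))
    minus = tuple(-1 if i == k else x for i, x in enumerate(omega))
    return math.sqrt(p[k] * (1.0 - p[k])) * (F(plus) - F(minus))

def cond_D(k, omega):
    """E[D_k F | F_{k-1}], averaging over the coordinates X_k, X_{k+1}, ..."""
    return sum(math.prod(p[k + j] if x == 1 else 1.0 - p[k + j]
                         for j, x in enumerate(tail))
               * D(k, omega[:k] + tail)
               for tail in itertools.product((-1, 1), repeat=2 - k))

EF = sum(prob(w) * F(w) for w in paths)
for omega in paths:
    clark = EF + sum(cond_D(k, omega) * Y(k, omega[k]) for k in range(2))
    assert abs(clark - F(omega)) < 1e-12   # (1.7.1) holds pathwise
```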
Although the operator $D$ is unbounded we have the following result, which states the boundedness of the operator that maps a random variable to the unique process involved in its predictable representation.

Lemma 1.7.2. The operator
$$L^2(\Omega)\longrightarrow L^2(\Omega\times\mathbb{N}),\qquad
F\longmapsto\big(E[D_kF\mid\mathcal{F}_{k-1}]\big)_{k\in\mathbb{N}},$$
is bounded with norm equal to one.
Proof. Let $F\in\mathcal{S}$. From Relation (1.7.1) and the isometry formula (1.2.2) for the stochastic integral operator $J$ we get
$$\big\|E[D_\cdot F\mid\mathcal{F}_{\cdot-1}]\big\|^2_{L^2(\Omega\times\mathbb{N})}
=\|F-E[F]\|^2_{L^2(\Omega)}\tag{1.7.2}$$
$$\le\|F-E[F]\|^2_{L^2(\Omega)}+(E[F])^2=\|F\|^2_{L^2(\Omega)},$$
with equality in case $F=J_1(f_1)$. □

As a consequence of Lemma 1.7.2 we have the following corollary.

Corollary 1.7.3. The Clark formula of Proposition 1.7.1 extends to any $F\in L^2(\Omega)$.

Proof. Since $F\mapsto E[D_\cdot F\mid\mathcal{F}_{\cdot-1}]$ is bounded from Lemma 1.7.2, the Clark formula extends to $F\in L^2(\Omega)$ by a standard Cauchy sequence argument. □
Let us give a first elementary application of the above construction to the proof of a Poincaré inequality on Bernoulli space. Using (1.2.3) we have
$$\mathrm{Var}(F)=E\big[|F-E[F]|^2\big]
=E\Big[\Big(\sum_{k=0}^{\infty}E[D_kF\mid\mathcal{F}_{k-1}]Y_k\Big)^2\Big]
=E\Big[\sum_{k=0}^{\infty}\big(E[D_kF\mid\mathcal{F}_{k-1}]\big)^2\Big]$$
$$\le E\Big[\sum_{k=0}^{\infty}E\big[|D_kF|^2\mid\mathcal{F}_{k-1}\big]\Big]
=E\Big[\sum_{k=0}^{\infty}|D_kF|^2\Big],$$
hence
$$\mathrm{Var}(F)\le\|DF\|^2_{L^2(\Omega\times\mathbb{N})}.$$
More generally the Clark formula implies the following.

Corollary 1.7.4. Let $a\in\mathbb{N}$ and $F\in L^2(\Omega)$. We have
$$F=E[F\mid\mathcal{F}_a]+\sum_{k=a+1}^{\infty}E[D_kF\mid\mathcal{F}_{k-1}]Y_k,\tag{1.7.3}$$
and
$$E[F^2]=E\big[(E[F\mid\mathcal{F}_a])^2\big]+E\Big[\sum_{k=a+1}^{\infty}\big(E[D_kF\mid\mathcal{F}_{k-1}]\big)^2\Big].\tag{1.7.4}$$

Proof. From Proposition 1.2.3 and the Clark formula (1.7.1) of Proposition 1.7.1 we have
$$E[F\mid\mathcal{F}_a]=E[F]+\sum_{k=0}^{a}E[D_kF\mid\mathcal{F}_{k-1}]Y_k,$$
which implies (1.7.3). Relation (1.7.4) is an immediate consequence of (1.7.3) and the isometry property of $J$. □
As an application of the Clark formula of Corollary 1.7.4 we obtain the following predictable representation property for discrete-time martingales.

Proposition 1.7.5. Let $(M_n)_{n\in\mathbb{N}}$ be a martingale in $L^2(\Omega)$ with respect to $(\mathcal{F}_n)_{n\in\mathbb{N}}$. There exists a predictable process $(u_k)_{k\in\mathbb{N}}$ locally in $L^2(\Omega\times\mathbb{N})$ (i.e. $u(\cdot)\mathbf{1}_{[0,N]}(\cdot)\in L^2(\Omega\times\mathbb{N})$ for all $N>0$) such that
$$M_n=M_{-1}+\sum_{k=0}^{n}u_kY_k,\qquad n\in\mathbb{N}.\tag{1.7.5}$$
Proof. Let $k\ge 0$. From Corollaries 1.6.3 and 1.7.4 we have:
$$M_k=E[M_k\mid\mathcal{F}_{k-1}]+E[D_kM_k\mid\mathcal{F}_{k-1}]Y_k
=M_{k-1}+E[D_kM_k\mid\mathcal{F}_{k-1}]Y_k,$$
hence it suffices to let
$$u_k=E[D_kM_k\mid\mathcal{F}_{k-1}],\qquad k\ge 0,$$
to obtain
$$M_n=M_{-1}+\sum_{k=0}^{n}(M_k-M_{k-1})=M_{-1}+\sum_{k=0}^{n}u_kY_k.\qquad\Box$$
The divergence operator is introduced as the adjoint of $D$. Let $\mathcal{U}\subset L^2(\Omega\times\mathbb{N})$ be the space of processes defined as
$$\mathcal{U}=\Big\{\sum_{k=0}^{n}J_k\big(f_{k+1}(*,\cdot)\big):f_{k+1}\in\ell^2(\mathbb{N})^{\circ k}\otimes\ell^2(\mathbb{N}),\ k=0,\ldots,n,\ n\in\mathbb{N}\Big\}.$$
We refer to Section 9.7 in the appendix for the definition of the tensor product $\ell^2(\mathbb{N})^{\circ k}\otimes\ell^2(\mathbb{N})$, $k\ge 0$.
Definition 1.8.1. Let $\delta:\mathcal{U}\to L^2(\Omega)$ be the linear mapping defined on $\mathcal{U}$ as
$$\delta(u)=\delta\big(J_n(f_{n+1}(*,\cdot))\big)=J_{n+1}(\tilde f_{n+1}),\qquad f_{n+1}\in\ell^2(\mathbb{N})^{\circ n}\otimes\ell^2(\mathbb{N}),$$
for $(u_k)_{k\in\mathbb{N}}$ of the form
$$u_k=J_n\big(f_{n+1}(*,k)\big),\qquad k\in\mathbb{N},$$
where $\tilde f_{n+1}$ denotes the symmetrization of $f_{n+1}$ in $n+1$ variables, i.e.
$$\tilde f_{n+1}(k_1,\ldots,k_{n+1})=\frac{1}{n+1}\sum_{i=1}^{n+1}f_{n+1}(k_1,\ldots,k_{i-1},k_{i+1},\ldots,k_{n+1},k_i).$$

From Proposition 1.5.3, $\mathcal{S}$ is dense in $L^2(\Omega)$, hence $\mathcal{U}$ is dense in $L^2(\Omega\times\mathbb{N})$.
Proposition 1.8.2. The operator $\delta$ is adjoint to $D$:
$$E\big[\langle DF,u\rangle_{\ell^2(\mathbb{N})}\big]=E[F\delta(u)],\qquad F\in\mathcal{S},\quad u\in\mathcal{U}.$$
Proof. We consider $F=J_n(f_n)$ and $u_k=J_m(g_{m+1}(*,k))$, $k\in\mathbb{N}$, where $f_n\in\ell^2(\mathbb{N})^{\circ n}$ and $g_{m+1}\in\ell^2(\mathbb{N})^{\circ m}\otimes\ell^2(\mathbb{N})$. We have
$$E\big[\langle D_\cdot J_n(f_n),J_m(g_{m+1}(*,\cdot))\rangle_{\ell^2(\mathbb{N})}\big]
=nE\big[\langle J_{n-1}(f_n(*,\cdot)),J_m(g_{m+1}(*,\cdot))\rangle_{\ell^2(\mathbb{N})}\big]$$
$$=nE\big[\langle J_{n-1}\big(f_n(*,\cdot)\mathbf{1}_{\Delta_n}(*,\cdot)\big),J_m(g_{m+1}(*,\cdot))\rangle_{\ell^2(\mathbb{N})}\big]$$
$$=n\,\mathbf{1}_{\{n-1=m\}}\sum_{k=0}^{\infty}E\big[J_{n-1}\big(f_n(*,k)\mathbf{1}_{\Delta_n}(*,k)\big)J_m\big(g_{m+1}(*,k)\big)\big]$$
$$=n\,\mathbf{1}_{\{n-1=m\}}(n-1)!\sum_{k=0}^{\infty}\big\langle\mathbf{1}_{\Delta_n}(*,k)f_n(*,k),g_{m+1}(*,k)\big\rangle_{\ell^2(\mathbb{N}^{n-1})}$$
$$=n!\,\mathbf{1}_{\{n=m+1\}}\langle\mathbf{1}_{\Delta_n}f_n,g_{m+1}\rangle_{\ell^2(\mathbb{N}^n)}
=n!\,\mathbf{1}_{\{n=m+1\}}\langle\mathbf{1}_{\Delta_n}f_n,\tilde g_{m+1}\rangle_{\ell^2(\mathbb{N}^n)}$$
$$=E\big[J_n(f_n)J_{m+1}(\tilde g_{m+1})\big]=E[F\delta(u)].\qquad\Box$$
The next proposition shows that $\delta$ coincides with the stochastic integral operator $J$ on the square-summable predictable processes.
Proposition 1.8.3. The operator $\delta$ can be extended to $u\in L^2(\Omega\times\mathbb{N})$ with
$$\delta(u)=\sum_{k=0}^{\infty}u_kY_k-\sum_{k=0}^{\infty}D_ku_k-\sum_{k=0}^{\infty}\varphi_kD_ku_k\,Y_k,\tag{1.8.1}$$
provided all series converge in $L^2(\Omega)$, where $(\varphi_k)_{k\in\mathbb{N}}$ appears in the structure equation (1.4.1). We also have for all $u\in\mathcal{U}$:
$$E[|\delta(u)|^2]=E\big[\|u\|^2_{\ell^2(\mathbb{N})}\big]+E\Big[\sum_{k,l=0}^{\infty}D_ku_lD_lu_k\Big].\tag{1.8.2}$$
Proof. Using the expression (1.3.3) of $u_k=J_n(f_{n+1}(*,k))$ we have
$$\delta(u)=J_{n+1}(\tilde f_{n+1})
=\sum_{(i_1,\ldots,i_{n+1})\in\Delta_{n+1}}\tilde f_{n+1}(i_1,\ldots,i_{n+1})Y_{i_1}\cdots Y_{i_{n+1}}$$
$$=\sum_{k=0}^{\infty}\sum_{(i_1,\ldots,i_n)\in\Delta_n}\tilde f_{n+1}(i_1,\ldots,i_n,k)Y_{i_1}\cdots Y_{i_n}Y_k
-n\sum_{k=0}^{\infty}\sum_{(i_1,\ldots,i_{n-1})\in\Delta_{n-1}}\tilde f_{n+1}(i_1,\ldots,i_{n-1},k,k)Y_{i_1}\cdots Y_{i_{n-1}}|Y_k|^2$$
$$=\sum_{k=0}^{\infty}u_kY_k-\sum_{k=0}^{\infty}D_ku_k|Y_k|^2
=\sum_{k=0}^{\infty}u_kY_k-\sum_{k=0}^{\infty}D_ku_k-\sum_{k=0}^{\infty}\varphi_kD_ku_kY_k.$$
By polarization, orthogonality and density it suffices to take $u_k=g(k)J_n(f^{\otimes n})$, $f,g\in\ell^2(\mathbb{N})$, and to note that
$$\|\delta(u)\|^2_{L^2(\Omega)}=\big\|J_{n+1}\big(\mathbf{1}_{\Delta_{n+1}}f^{\otimes n}\circ g\big)\big\|^2_{L^2(\Omega)}
=\frac{1}{(n+1)^2}\Big\|\sum_{i=0}^{n}J_{n+1}\big(f^{\otimes i}\otimes g\otimes f^{\otimes(n-i)}\mathbf{1}_{\Delta_{n+1}}\big)\Big\|^2_{L^2(\Omega)}$$
$$=\frac{1}{(n+1)^2}\Big((n+1)!(n+1)\|f\|^{2n}_{\ell^2(\mathbb{N})}\|g\|^2_{\ell^2(\mathbb{N})}
+(n+1)!\,n(n+1)\|f\|^{2n-2}_{\ell^2(\mathbb{N})}\langle f,g\rangle^2_{\ell^2(\mathbb{N})}\Big)$$
$$=n!\,\|f\|^{2n}_{\ell^2(\mathbb{N})}\|g\|^2_{\ell^2(\mathbb{N})}+(n-1)!\,n^2\|f\|^{2n-2}_{\ell^2(\mathbb{N})}\langle f,g\rangle^2_{\ell^2(\mathbb{N})}$$
$$=\|u\|^2_{L^2(\Omega\times\mathbb{N})}+E\big[\langle g,DJ_n(f^{\otimes n})\rangle_{\ell^2(\mathbb{N})}\langle g,DJ_n(f^{\otimes n})\rangle_{\ell^2(\mathbb{N})}\big]$$
$$=\|u\|^2_{L^2(\Omega\times\mathbb{N})}+E\Big[\sum_{k,l=0}^{\infty}g(k)g(l)D_lJ_n(f^{\otimes n})D_kJ_n(f^{\otimes n})\Big]
=E\big[\|u\|^2_{\ell^2(\mathbb{N})}\big]+E\Big[\sum_{k,l=0}^{\infty}D_ku_lD_lu_k\Big].\qquad\Box$$

From the above argument the Skorohod isometry can also be written as
$$\|\delta(u)\|^2_{L^2(\Omega)}=E\big[\|u\|^2_{\ell^2(\mathbb{N})}\big]+E\Big[\sum_{k,l=0}^{\infty}D_ku_kD_lu_l\Big],$$
however this formulation does not lead to a well defined expression in the continuous time limit of Chapter 4.
In particular, (1.8.1) implies the following divergence formula.

Corollary 1.8.4. For $u\in L^2(\Omega\times\mathbb{N})$ and $F\in L^2(\Omega)$ we have
$$\delta(Fu)=F\delta(u)-\langle u,DF\rangle_{\ell^2(\mathbb{N})}-\delta\big(\varphi_{(\cdot)}u_{(\cdot)}D_\cdot F\big),\tag{1.8.3}$$
provided all series converge in $L^2(\Omega)$.

In the symmetric case $p_k=q_k=1/2$ we have $\varphi_k=0$, $k\in\mathbb{N}$, and
$$\delta(u)=\sum_{k=0}^{\infty}u_kY_k-\sum_{k=0}^{\infty}D_ku_k.$$
Moreover, (1.8.2) can be rewritten as a Weitzenböck type identity, cf. Section 7.6 for details:
$$\|\delta(u)\|^2_{L^2(\Omega)}+\frac{1}{2}\sum_{k,l=0}^{\infty}\|D_ku(l)-D_lu(k)\|^2_{L^2(\Omega)}
=\|u\|^2_{L^2(\Omega\times\mathbb{N})}+\|Du\|^2_{L^2(\Omega\times\mathbb{N}^2)}.\tag{1.8.4}$$
The last two terms in the right hand side of (1.8.1) vanish when $(u_k)_{k\in\mathbb{N}}$ is predictable, and in this case the Skorohod isometry (1.8.2) becomes the Itô isometry as shown in the next proposition.

Corollary 1.8.5. If $(u_k)_{k\in\mathbb{N}}$ satisfies $D_ku_k=0$, i.e. $u_k$ does not depend on $X_k$, $k\in\mathbb{N}$, then $\delta(u)$ coincides with the (discrete time) stochastic integral
$$\delta(u)=\sum_{k=0}^{\infty}Y_ku_k,\tag{1.8.5}$$
provided the series converges in $L^2(\Omega)$. If moreover $(u_k)_{k\in\mathbb{N}}$ is predictable and square-summable we have the isometry
$$E[\delta(u)^2]=E\big[\|u\|^2_{\ell^2(\mathbb{N})}\big],\tag{1.8.6}$$
and $\delta(u)$ coincides with $J(u)$ on the space of predictable square-summable processes.
1.9 Ornstein-Uhlenbeck Semi-Group and Process
The Ornstein-Uhlenbeck operator $L$ is defined as $L=-\delta D$, i.e. $L$ satisfies
$$LJ_n(f_n)=-nJ_n(f_n),\qquad f_n\in\ell^2(\mathbb{N})^{\circ n}.$$

Proposition 1.9.1. For any $F\in\mathcal{S}$ we have
$$LF=-\delta DF=-\sum_{k=0}^{\infty}Y_k(D_kF)=-\sum_{k=0}^{\infty}\sqrt{p_kq_k}\,Y_k\big(F_k^+-F_k^-\big).$$

Proof. Note that $D_kD_kF=0$, $k\in\mathbb{N}$, and use Relation (1.8.1) of Proposition 1.8.3. □
Note that $L$ can be expressed in other forms, for example
$$LF=-\sum_{k=0}^{\infty}\bar\nabla_kF,$$
where
$$\bar\nabla_kF=\mathbf{1}_{\{X_k=1\}}q_k\big(F(\omega)-F(\omega_-^k)\big)-\mathbf{1}_{\{X_k=-1\}}p_k\big(F(\omega_+^k)-F(\omega)\big)$$
$$=F-\big(\mathbf{1}_{\{X_k=1\}}\big(p_kF(\omega)+q_kF(\omega_-^k)\big)+\mathbf{1}_{\{X_k=-1\}}\big(p_kF(\omega_+^k)+q_kF(\omega)\big)\big)$$
$$=F-E[F\mid\mathcal{F}_k^c],\qquad k\in\mathbb{N},$$
and $\mathcal{F}_k^c$ is the $\sigma$-algebra generated by
$$\{X_l:l\neq k,\ l\in\mathbb{N}\}.$$
Let now $(P_t)_{t\in\mathbb{R}_+}=(e^{tL})_{t\in\mathbb{R}_+}$ denote the semi-group associated to $L$ and defined as
$$P_tF=\sum_{n=0}^{\infty}e^{-nt}J_n(f_n),\qquad t\in\mathbb{R}_+,$$
on $F=\sum_{n=0}^{\infty}J_n(f_n)\in L^2(\Omega)$. The next result shows that $(P_t)_{t\in\mathbb{R}_+}$ admits an integral representation by a probability kernel. Let $q_t^N:\Omega\times\Omega\to\mathbb{R}_+$ be defined by
$$q_t^N(\bar\omega,\omega)=\prod_{i=0}^{N}\big(1+e^{-t}Y_i(\omega)Y_i(\bar\omega)\big),\qquad \omega,\bar\omega\in\Omega,\quad t\in\mathbb{R}_+.$$
Lemma 1.9.2. Let the probability kernel $Q_t(\bar\omega,d\omega)$ be defined by
$$E\Big[\frac{dQ_t(\bar\omega,\cdot)}{dP}\;\Big|\;\mathcal{F}_N\Big](\omega)=q_t^N(\bar\omega,\omega),\qquad N\ge 1,\quad t\in\mathbb{R}_+.$$
For $F\in L^2(\Omega,\mathcal{F}_N)$ we have
$$P_tF(\bar\omega)=\int_{\Omega}F(\omega)Q_t(\bar\omega,d\omega),\qquad \bar\omega\in\Omega.\tag{1.9.1}$$
Proof. Since $L^2(\Omega,\mathcal{F}_N)$ has finite dimension $2^{N+1}$, it suffices to consider functionals of the form $F=Y_{k_1}\cdots Y_{k_n}$ with $0\le k_1<\cdots<k_n\le N$. By Relation (1.4.5) we have for $\bar\omega\in\Omega$, $k\in\mathbb{N}$:
$$E\big[Y_k(\cdot)\big(1+e^{-t}Y_k(\cdot)Y_k(\bar\omega)\big)\big]
=p_k\sqrt{\frac{q_k}{p_k}}\Big(1+e^{-t}\sqrt{\frac{q_k}{p_k}}\,Y_k(\bar\omega)\Big)
-q_k\sqrt{\frac{p_k}{q_k}}\Big(1-e^{-t}\sqrt{\frac{p_k}{q_k}}\,Y_k(\bar\omega)\Big)
=e^{-t}Y_k(\bar\omega),$$
which implies, by independence of the sequence $(X_k)_{k\in\mathbb{N}}$,
$$E\big[Y_{k_1}\cdots Y_{k_n}q_t^N(\bar\omega,\cdot)\big]
=E\Big[Y_{k_1}\cdots Y_{k_n}\prod_{i=0}^{N}\big(1+e^{-t}Y_i(\cdot)Y_i(\bar\omega)\big)\Big]$$
$$=\prod_{i=1}^{n}E\big[Y_{k_i}(\cdot)\big(1+e^{-t}Y_{k_i}(\cdot)Y_{k_i}(\bar\omega)\big)\big]
=e^{-nt}Y_{k_1}(\bar\omega)\cdots Y_{k_n}(\bar\omega)$$
$$=e^{-nt}J_n\big(\tilde{\mathbf{1}}_{\{(k_1,\ldots,k_n)\}}\big)(\bar\omega)
=P_tJ_n\big(\tilde{\mathbf{1}}_{\{(k_1,\ldots,k_n)\}}\big)(\bar\omega)
=P_t\big(Y_{k_1}\cdots Y_{k_n}\big)(\bar\omega).\qquad\Box$$
Consider the $\Omega$-valued stationary process $(X(t))_{t\in\mathbb{R}_+}=((X_k(t))_{k\in\mathbb{N}})_{t\in\mathbb{R}_+}$ with independent components and distribution given by
$$P(X_k(t)=1\mid X_k(0)=1)=p_k+e^{-t}q_k,\tag{1.9.2}$$
$$P(X_k(t)=-1\mid X_k(0)=1)=q_k-e^{-t}q_k,\tag{1.9.3}$$
$$P(X_k(t)=1\mid X_k(0)=-1)=p_k-e^{-t}p_k,\tag{1.9.4}$$
$$P(X_k(t)=-1\mid X_k(0)=-1)=q_k+e^{-t}p_k,\tag{1.9.5}$$
$k\in\mathbb{N}$, $t\in\mathbb{R}_+$.
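The transition probabilities (1.9.2)-(1.9.5) define, for each coordinate, a Markov kernel under which the noise $Y_k$ of (1.4.6) is an eigenfunction with eigenvalue $e^{-t}$. The following arithmetic sketch (our own illustration over a few arbitrary values of $p$ and $t$) checks that the rows are probability vectors and that $E[Y_k(t)\mid X_k(0)]=e^{-t}Y_k(0)$:

```python
import math

for p in (0.5, 0.2, 0.9):
    q = 1.0 - p
    for t in (0.0, 0.4, 2.5):
        e = math.exp(-t)
        # transition rows indexed by X_k(0) = +1 and X_k(0) = -1
        P_plus  = {+1: p + e * q, -1: q - e * q}   # (1.9.2)-(1.9.3)
        P_minus = {+1: p - e * p, -1: q + e * p}   # (1.9.4)-(1.9.5)
        y = {+1: math.sqrt(q / p), -1: -math.sqrt(p / q)}   # cf. (1.4.5)
        for row in (P_plus, P_minus):
            assert abs(sum(row.values()) - 1.0) < 1e-12
            assert all(-1e-12 <= v <= 1.0 + 1e-12 for v in row.values())
        # eigenvalue relation behind P_t J_1(f_1) = e^{-t} J_1(f_1)
        for x, row in ((+1, P_plus), (-1, P_minus)):
            mean = sum(row[s] * y[s] for s in (+1, -1))
            assert abs(mean - e * y[x]) < 1e-12
```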
Proposition 1.9.3. The process $(X(t))_{t\in\mathbb{R}_+}=((X_k(t))_{k\in\mathbb{N}})_{t\in\mathbb{R}_+}$ is the Ornstein-Uhlenbeck process associated to $(P_t)_{t\in\mathbb{R}_+}$, i.e. we have
$$P_tF=E[F(X(t))\mid X(0)],\qquad t\in\mathbb{R}_+,\tag{1.9.6}$$
for $F$ bounded and $\mathcal{F}_n$-measurable on $\Omega$, $n\in\mathbb{N}$.
Proof. By construction of $(X(t))_{t\in\mathbb{R}_+}$ in Relations (1.9.2)-(1.9.5) we have
$$P(X_k(t)=1\mid X_k(0))=p_k\Big(1+e^{-t}Y_k(0)\sqrt{\frac{q_k}{p_k}}\Big),\qquad
P(X_k(t)=-1\mid X_k(0))=q_k\Big(1-e^{-t}Y_k(0)\sqrt{\frac{p_k}{q_k}}\Big),$$
where $Y_k(0)$ is defined by (1.4.6), i.e.
$$Y_k(0)=\frac{q_k-p_k+X_k(0)}{2\sqrt{p_kq_k}},\qquad k\in\mathbb{N},$$
thus
$$dP\big(X_k(t)(\bar\omega)=\epsilon\mid X(0)\big)(\omega)=\big(1+e^{-t}Y_k(\omega)Y_k(\bar\omega)\big)\,dP\big(X_k(\bar\omega)=\epsilon\big),\qquad \epsilon=\pm 1.$$
Since the components of $(X_k(t))_{k\in\mathbb{N}}$ are independent, this shows that the law of $(X_0(t),\ldots,X_n(t))$ conditionally to $X(0)$ has the density $q_t^n(\bar\omega,\omega)$ with respect to $P$:
$$dP\big(X_0(t)(\bar\omega)=\epsilon_0,\ldots,X_n(t)(\bar\omega)=\epsilon_n\mid X(0)\big)(\bar\omega)
=q_t^n(\bar\omega,\omega)\,dP\big(X_0(\omega)=\epsilon_0,\ldots,X_n(\omega)=\epsilon_n\big).$$
Consequently we have
$$E[F(X(t))\mid X(0)=\bar\omega]=\int_{\Omega}F(\omega)q_t^N(\bar\omega,\omega)P(d\omega),\tag{1.9.7}$$
hence from (1.9.1), Relation (1.9.6) holds for $F\in L^2(\Omega,\mathcal{F}_N)$, $N\ge 0$. □
The independent components $X_k(t)$, $k\in\mathbb{N}$, can be constructed from the data of $X_k(0)=\epsilon$ and an independent exponential random variable $\tau_k$ via the following procedure. If $\tau_k>t$, let $X_k(t)=X_k(0)=\epsilon$; otherwise if $\tau_k<t$, take $X_k(t)$ to be an independent copy of $X_k(0)$. This procedure is illustrated in the following equalities:
$$P(X_k(t)=1\mid X_k(0)=1)=E[\mathbf{1}_{\{\tau_k>t\}}]+E[\mathbf{1}_{\{\tau_k<t\}}\mathbf{1}_{\{X_k=1\}}]=e^{-t}+p_k(1-e^{-t}),\tag{1.9.8}$$
$$P(X_k(t)=-1\mid X_k(0)=1)=E[\mathbf{1}_{\{\tau_k<t\}}\mathbf{1}_{\{X_k=-1\}}]=q_k(1-e^{-t}),\tag{1.9.9}$$
$$P(X_k(t)=-1\mid X_k(0)=-1)=E[\mathbf{1}_{\{\tau_k>t\}}]+E[\mathbf{1}_{\{\tau_k<t\}}\mathbf{1}_{\{X_k=-1\}}]=e^{-t}+q_k(1-e^{-t}),\tag{1.9.10}$$
$$P(X_k(t)=1\mid X_k(0)=-1)=E[\mathbf{1}_{\{\tau_k<t\}}\mathbf{1}_{\{X_k=1\}}]=p_k(1-e^{-t}).\tag{1.9.11}$$
The operator $L^2(\Omega\times\mathbb{N})\to L^2(\Omega\times\mathbb{N})$ which maps $(u_k)_{k\in\mathbb{N}}$ to $(P_tu_k)_{k\in\mathbb{N}}$ is also denoted by $P_t$. As a consequence of the representation of $P_t$ given in Lemma 1.9.2 we obtain the following bound.

Lemma 1.9.4. For $u\in L^2(\Omega\times\mathbb{N})$ we have
$$\|P_tu\|_{L^\infty(\Omega,\ell^2(\mathbb{N}))}\le\|u\|_{L^\infty(\Omega,\ell^2(\mathbb{N}))},\qquad t\in\mathbb{R}_+.$$
Proof. As a consequence of the representation formula (1.9.7) we have, $P(d\bar\omega)$-a.s.:
$$\|P_tu\|^2_{\ell^2(\mathbb{N})}(\bar\omega)=\sum_{k=0}^{\infty}|P_tu_k(\bar\omega)|^2
=\sum_{k=0}^{\infty}\Big(\int_{\Omega}u_k(\omega)Q_t(\bar\omega,d\omega)\Big)^2$$
$$\le\sum_{k=0}^{\infty}\int_{\Omega}|u_k(\omega)|^2Q_t(\bar\omega,d\omega)
=\int_{\Omega}\|u\|^2_{\ell^2(\mathbb{N})}(\omega)Q_t(\bar\omega,d\omega)
\le\|u\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}.\qquad\Box$$
1.10 Covariance Identities
In this section we state the covariance identities which will be used for the proof of deviation inequalities in the next section. The covariance $\mathrm{Cov}(F,G)$ of $F,G\in L^2(\Omega)$ is defined as
$$\mathrm{Cov}(F,G)=E\big[(F-E[F])(G-E[G])\big]=E[FG]-E[F]E[G].$$

Proposition 1.10.1. For all $F,G\in L^2(\Omega)$ such that $E\big[\|DF\|^2_{\ell^2(\mathbb{N})}\big]<\infty$ we have
$$\mathrm{Cov}(F,G)=E\Big[\sum_{k=0}^{\infty}E[D_kG\mid\mathcal{F}_{k-1}]\,D_kF\Big].\tag{1.10.1}$$
Proof. This identity is a consequence of the Clark formula (1.7.1):
Cov (F, G) = E[(F E[F])(G E[G])]
= E
__

k=0
E[D
k
F [ T
k1
]Y
k
__

l=0
E[D
l
G [ T
l1
]Y
l
__
= E
_

k=0
E[D
k
F [ T
k1
]E[D
k
G [ T
k1
]
_
=

k=0
E[E[E[D
k
G [ T
k1
]D
k
F [ T
k1
]]
= E
_

k=0
E[D
k
G [ T
k1
]D
k
F
_
,
and of its extension to G L
2
() in Corollary 1.7.3.
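The identity (1.10.1) can be verified by exact enumeration on a small Bernoulli space. The sketch below (our own illustration, not from the text) uses $D_kF = \sqrt{p_kq_k}(F^+_k - F^-_k)$ and computes the conditional expectation $E[\cdot \mid \mathcal{F}_{k-1}]$ by averaging out the coordinates with index $\geq k$:

```python
from itertools import product

def bernoulli_space(p):
    """All points of Omega = {-1,1}^N with product weights P(X_k=1)=p[k]."""
    for omega in product([1, -1], repeat=len(p)):
        w = 1.0
        for k, x in enumerate(omega):
            w *= p[k] if x == 1 else 1 - p[k]
        yield omega, w

def D(F, k, p):
    """Finite-difference gradient D_k F = sqrt(p_k q_k)(F_k^+ - F_k^-)."""
    s = (p[k] * (1 - p[k])) ** 0.5
    def DkF(omega):
        plus = omega[:k] + (1,) + omega[k + 1:]
        minus = omega[:k] + (-1,) + omega[k + 1:]
        return s * (F(plus) - F(minus))
    return DkF

def cond_exp(G, k, p):
    """E[G | F_{k-1}]: average out coordinates k, ..., N-1."""
    N = len(p)
    def E(omega):
        total = 0.0
        for tail in product([1, -1], repeat=N - k):
            w = 1.0
            for j, x in enumerate(tail, start=k):
                w *= p[j] if x == 1 else 1 - p[j]
            total += w * G(omega[:k] + tail)
        return total
    return E

def cov_identity_sides(F, G, p):
    """Left and right hand sides of the covariance identity (1.10.1)."""
    pts = list(bernoulli_space(p))
    EF = sum(w * F(o) for o, w in pts)
    EG = sum(w * G(o) for o, w in pts)
    lhs = sum(w * (F(o) - EF) * (G(o) - EG) for o, w in pts)
    rhs = 0.0
    for k in range(len(p)):
        DkF, EDkG = D(F, k, p), cond_exp(D(G, k, p), k, p)
        rhs += sum(w * EDkG(o) * DkF(o) for o, w in pts)
    return lhs, rhs

F = lambda o: o[0] + o[1] * o[2]
G = lambda o: o[0] * o[1] + o[2]
lhs, rhs = cov_identity_sides(F, G, [0.3, 0.6, 0.5])
print(abs(lhs - rhs) < 1e-9)
```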
A covariance identity can also be obtained using the semi-group $(P_t)_{t\in\mathbb{R}_+}$.

Proposition 1.10.2. For any $F, G \in L^2(\Omega)$ such that
\[
E\big[\|DF\|^2_{\ell^2(\mathbb{N})}\big] < \infty \quad \text{and} \quad E\big[\|DG\|^2_{\ell^2(\mathbb{N})}\big] < \infty,
\]
we have
\[
\mathrm{Cov}(F,G) = E\Big[\sum_{k=0}^\infty \int_0^\infty e^{-t}\, (D_kF)\, P_tD_kG\, dt\Big]. \qquad (1.10.2)
\]
Proof. Consider $F = J_n(f_n)$ and $G = J_m(g_m)$. We have
\[
\mathrm{Cov}(J_n(f_n), J_m(g_m)) = E[J_n(f_n)J_m(g_m)]
= 1_{\{n=m\}}\, n! \langle f_n, g_n 1_{\Delta_n}\rangle_{\ell^2(\mathbb{N}^n)}
\]
\[
= 1_{\{n=m\}}\, n!\, n \int_0^\infty e^{-nt}\, dt\, \langle f_n, g_n 1_{\Delta_n}\rangle_{\ell^2(\mathbb{N}^n)}
\]
\[
= 1_{\{n-1=m-1\}}\, n!\, n \int_0^\infty e^{-t} \sum_{k=0}^\infty \langle f_n(*,k), e^{-(n-1)t} g_n(*,k) 1_{\Delta_n}(*,k)\rangle_{\ell^2(\mathbb{N}^{n-1})}\, dt
\]
\[
= nm\, E\Big[\int_0^\infty e^{-t} \sum_{k=0}^\infty J_{n-1}(f_n(*,k)1_{\Delta_n}(*,k))\, e^{-(m-1)t} J_{m-1}(g_m(*,k)1_{\Delta_m}(*,k))\, dt\Big]
\]
\[
= nm\, E\Big[\int_0^\infty e^{-t} \sum_{k=0}^\infty J_{n-1}(f_n(*,k)1_{\Delta_n}(*,k))\, P_tJ_{m-1}(g_m(*,k)1_{\Delta_m}(*,k))\, dt\Big]
\]
\[
= E\Big[\int_0^\infty e^{-t} \sum_{k=0}^\infty D_kJ_n(f_n)\, P_tD_kJ_m(g_m)\, dt\Big]. \qquad \square
\]
By the relations (1.9.8)-(1.9.11) the covariance identity (1.10.2) shows that
\[
\mathrm{Cov}(F,G) = E\Big[\sum_{k=0}^\infty \int_0^\infty e^{-t}\, D_kF\, P_tD_kG\, dt\Big]
= E\Big[\int_0^1 \sum_{k=0}^\infty D_kF\, P_{-\log\alpha}D_kG\, d\alpha\Big]
\]
\[
= \int_0^1 \int_{\Omega\times\Omega} \sum_{k=0}^\infty D_kF(\omega)\, D_kG\big((\omega_i 1_{\{\tau_i<-\log\alpha\}} + \omega'_i 1_{\{\tau_i>-\log\alpha\}})_{i\in\mathbb{N}}\big)\, d\alpha\, P(d\omega)P(d\omega')
\]
\[
= \int_0^1 \int_{\Omega\times\Omega} \sum_{k=0}^\infty D_kF(\omega)\, D_kG\big((\omega_i 1_{\{U_i<\alpha\}} + \omega'_i 1_{\{U_i>\alpha\}})_{i\in\mathbb{N}}\big)\, P(d\omega)P(d\omega')\, d\alpha, \qquad (1.10.3)
\]
where $(U_i)_{i\in\mathbb{N}}$ is a family of independent identically distributed (i.i.d.) random variables, uniformly distributed on $[0,1]$. Note that the marginals of $(X_k,\ X_k1_{\{U_k<\alpha\}} + X'_k1_{\{U_k>\alpha\}})$ are identical when $X'_k$ is an independent copy of $X_k$. Letting
\[
\Phi_\alpha(s,t) = E\big[e^{isX_k}\, e^{it(X_k1_{\{U_k<\alpha\}} + X'_k1_{\{U_k>\alpha\}})}\big],
\]
we have the relation
\[
\mathrm{Cov}(e^{isX_k}, e^{itX_k}) = \Phi_1(s,t) - \Phi_0(s,t)
= \int_0^1 \frac{d\Phi_\alpha}{d\alpha}(s,t)\, d\alpha.
\]
Next we prove an iterated version of the covariance identity in discrete time, which is an analog of a result proved in [56] for the Wiener and Poisson processes.

Theorem 1.10.3. Let $n \in \mathbb{N}$ and $F, G \in L^2(\Omega)$. We have
\[
\mathrm{Cov}(F,G) = \sum_{d=1}^n (-1)^{d+1} E\Big[\sum_{1\leq k_1<\cdots<k_d} (D_{k_d}\cdots D_{k_1}F)(D_{k_d}\cdots D_{k_1}G)\Big] \qquad (1.10.4)
\]
\[
+\ (-1)^n E\Big[\sum_{1\leq k_1<\cdots<k_{n+1}} (D_{k_{n+1}}\cdots D_{k_1}F)\, E\big[D_{k_{n+1}}\cdots D_{k_1}G \mid \mathcal{F}_{k_{n+1}-1}\big]\Big].
\]
Proof. Take $F = G$. For $n = 0$, (1.10.4) is a consequence of the Clark formula. Let $n \geq 1$. Applying Lemma 1.7.4 to $D_{k_n}\cdots D_{k_1}F$ with $a = k_n$ and $b = k_{n+1}$, and summing on $(k_1,\ldots,k_n) \in \Delta_n$, we obtain
\[
E\Big[\sum_{1\leq k_1<\cdots<k_n} \big(E[D_{k_n}\cdots D_{k_1}F \mid \mathcal{F}_{k_n-1}]\big)^2\Big]
= E\Big[\sum_{1\leq k_1<\cdots<k_n} |D_{k_n}\cdots D_{k_1}F|^2\Big]
\]
\[
-\ E\Big[\sum_{1\leq k_1<\cdots<k_{n+1}} \big(E\big[D_{k_{n+1}}\cdots D_{k_1}F \mid \mathcal{F}_{k_{n+1}-1}\big]\big)^2\Big],
\]
which concludes the proof by induction and polarization. $\square$
As a consequence of Theorem 1.10.3, letting $F = G$ we get the variance inequality
\[
\sum_{k=1}^{2n} \frac{(-1)^{k+1}}{k!} E\big[\|D^kF\|^2_{\ell^2(\Delta_k)}\big]
\leq \mathrm{Var}(F)
\leq \sum_{k=1}^{2n-1} \frac{(-1)^{k+1}}{k!} E\big[\|D^kF\|^2_{\ell^2(\Delta_k)}\big],
\]
since
\[
E\Big[\sum_{1\leq k_1<\cdots<k_{n+1}} (D_{k_{n+1}}\cdots D_{k_1}F)\, E\big[D_{k_{n+1}}\cdots D_{k_1}F \mid \mathcal{F}_{k_{n+1}-1}\big]\Big]
\]
\[
= E\Big[\sum_{1\leq k_1<\cdots<k_{n+1}} E\big[(D_{k_{n+1}}\cdots D_{k_1}F)\, E\big[D_{k_{n+1}}\cdots D_{k_1}F \mid \mathcal{F}_{k_{n+1}-1}\big] \mid \mathcal{F}_{k_{n+1}-1}\big]\Big]
\]
\[
= E\Big[\sum_{1\leq k_1<\cdots<k_{n+1}} \big(E\big[D_{k_{n+1}}\cdots D_{k_1}F \mid \mathcal{F}_{k_{n+1}-1}\big]\big)^2\Big] \geq 0,
\]
see Relation (2.15) in [56] in continuous time. In a similar way, another iterated covariance identity can be obtained from Proposition 1.10.2.
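The first two terms of the variance sandwich (with the sums written over ordered indices, as in Theorem 1.10.3) can be checked by exact enumeration. The following sketch is our own illustration, not part of the text:

```python
from itertools import product

def flip(omega, k, x):
    return omega[:k] + (x,) + omega[k + 1:]

def Dk(F, k, s):
    """D_k F = sqrt(pq)(F_k^+ - F_k^-), with s = sqrt(pq)."""
    return lambda o: s * (F(flip(o, k, 1)) - F(flip(o, k, -1)))

def expect(F, N, p):
    tot = 0.0
    for o in product([1, -1], repeat=N):
        w = 1.0
        for x in o:
            w *= p if x == 1 else 1 - p
        tot += w * F(o)
    return tot

# Variance sandwich with n = 1:  t1 - t2 <= Var(F) <= t1,
# where t1, t2 sum the squared first and second iterated gradients
# over ordered indices k1 < k2 (example functional chosen arbitrarily).
N, p = 3, 0.4
s = (p * (1 - p)) ** 0.5
F = lambda o: o[0] + o[0] * o[1] - 2 * o[2]
var = expect(lambda o: F(o) ** 2, N, p) - expect(F, N, p) ** 2
t1 = sum(expect(lambda o, k=k: Dk(F, k, s)(o) ** 2, N, p) for k in range(N))
t2 = sum(expect(lambda o, k1=k1, k2=k2: Dk(Dk(F, k1, s), k2, s)(o) ** 2, N, p)
         for k1 in range(N) for k2 in range(k1 + 1, N))
print(t1 - t2 - 1e-9 <= var <= t1 + 1e-9)
```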
Corollary 1.10.4. Let $n \in \mathbb{N}$ and $F, G \in L^2(\Omega, \mathcal{F}_N)$. We have
\[
\mathrm{Cov}(F,G) = \sum_{d=1}^n (-1)^{d+1} E\Big[\sum_{1\leq k_1<\cdots<k_d\leq N} (D_{k_d}\cdots D_{k_1}F)(D_{k_d}\cdots D_{k_1}G)\Big]
\]
\[
+\ (-1)^n \int_{\Omega\times\Omega} \sum_{1\leq k_1<\cdots<k_{n+1}\leq N} D_{k_{n+1}}\cdots D_{k_1}F(\omega)\, D_{k_{n+1}}\cdots D_{k_1}G(\omega')\, q^N_t(\omega,\omega')\, P(d\omega)P(d\omega'). \qquad (1.10.5)
\]
Using the tensorization property
\[
\mathrm{Var}(FG) = E[F^2]\mathrm{Var}(G) + (E[G])^2\mathrm{Var}(F)
\leq E[F^2]\mathrm{Var}(G) + E[G^2]\mathrm{Var}(F)
\]
of the variance for independent random variables $F, G$, most of the identities in this section can be obtained by tensorization of elementary one-dimensional covariance identities.
The following lemma is an elementary consequence of the covariance identity proved in Proposition 1.10.1.

Lemma 1.10.5. Let $F, G \in L^2(\Omega)$ such that
\[
E[D_kF \mid \mathcal{F}_{k-1}] \cdot E[D_kG \mid \mathcal{F}_{k-1}] \geq 0, \qquad k \in \mathbb{N}.
\]
Then $F$ and $G$ are non-negatively correlated:
\[
\mathrm{Cov}(F,G) \geq 0.
\]
According to the next definition, a non-decreasing functional $F$ satisfies $D_kF \geq 0$ for all $k \in \mathbb{N}$.

Definition 1.10.6. A random variable $F : \Omega \to \mathbb{R}$ is said to be non-decreasing if for all $\omega_1, \omega_2 \in \Omega$ we have
\[
\omega_1(k) \leq \omega_2(k),\ k \in \mathbb{N}\ \Longrightarrow\ F(\omega_1) \leq F(\omega_2).
\]
The following result is then immediate from Proposition 1.6.2 and Lemma 1.10.5, and shows that the FKG inequality holds on $\Omega$. It can also be obtained from Proposition 1.10.2.

Proposition 1.10.7. If $F, G \in L^2(\Omega)$ are non-decreasing then $F$ and $G$ are non-negatively correlated:
\[
\mathrm{Cov}(F,G) \geq 0.
\]
Note however that the assumptions of Lemma 1.10.5 are actually weaker as they do not require $F$ and $G$ to be non-decreasing.
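The FKG inequality of Proposition 1.10.7 can be illustrated by enumeration: for any two coordinatewise non-decreasing functionals the covariance comes out non-negative. A minimal sketch (names and example functionals are ours):

```python
from itertools import product

def expect(F, N, p):
    """E[F] on Omega = {-1,1}^N with P(X_k = 1) = p, computed exactly."""
    tot = 0.0
    for o in product([1, -1], repeat=N):
        w = 1.0
        for x in o:
            w *= p if x == 1 else 1 - p
        tot += w * F(o)
    return tot

def cov(F, G, N, p):
    return expect(lambda o: F(o) * G(o), N, p) - expect(F, N, p) * expect(G, N, p)

# Two coordinatewise non-decreasing functionals.
F = lambda o: max(o[0], o[1]) + o[2]
G = lambda o: sum(o)
print(cov(F, G, 3, 0.35) >= 0)
```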
1.11 Deviation Inequalities

In this section, which is based on [59], we recover a deviation inequality of [19] in the case of Bernoulli measures, using covariance representations instead of the logarithmic Sobolev inequalities to be presented in Section 1.12. The method relies on a bound on the Laplace transform $L(t) = E[e^{tF}]$ obtained via a differential inequality and Chebyshev's inequality.

Proposition 1.11.1. Let $F \in L^1(\Omega)$ be such that $|F^+_k - F^-_k| \leq K$, $k \in \mathbb{N}$, for some $K \geq 0$, and $\|DF\|_{L^\infty(\Omega,\ell^2(\mathbb{N}))} < \infty$. Then
\[
P(F - E[F] \geq x)
\leq \exp\Big(-\frac{\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}}{K^2}\, g\Big(\frac{xK}{\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}}\Big)\Big)
\]
\[
\leq \exp\Big(-\frac{x}{2K}\log\Big(1 + \frac{xK}{\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}}\Big)\Big),
\]
with $g(u) = (1+u)\log(1+u) - u$, $u \geq 0$.
Proof. Although $D_k$ does not satisfy a derivation rule for products, from Proposition 1.6.4 we have
\[
D_ke^F = 1_{\{X_k=1\}}\sqrt{p_kq_k}\,(e^F - e^{F^-_k}) + 1_{\{X_k=-1\}}\sqrt{p_kq_k}\,(e^{F^+_k} - e^F)
\]
\[
= 1_{\{X_k=1\}}\sqrt{p_kq_k}\, e^F\big(1 - e^{-\frac{1}{\sqrt{p_kq_k}}D_kF}\big)
+ 1_{\{X_k=-1\}}\sqrt{p_kq_k}\, e^F\big(e^{\frac{1}{\sqrt{p_kq_k}}D_kF} - 1\big),
\]
hence
\[
D_ke^F = X_k\sqrt{p_kq_k}\, e^F\big(1 - e^{-\frac{X_k}{\sqrt{p_kq_k}}D_kF}\big), \qquad (1.11.1)
\]
and since the function $x \mapsto (e^x - 1)/x$ is positive and increasing on $\mathbb{R}$ we have:
\[
e^{-sF}\,\frac{D_ke^{sF}}{D_kF}
= 1_{\{X_k=-1\}}\,\frac{e^{s(F^+_k - F^-_k)} - 1}{F^+_k - F^-_k}
+ 1_{\{X_k=1\}}\,\frac{1 - e^{-s(F^+_k - F^-_k)}}{F^+_k - F^-_k}
\leq \frac{e^{sK} - 1}{K}.
\]
We first assume that $F$ is a bounded random variable with $E[F] = 0$. From Proposition 1.10.2 applied to $F$ and $e^{sF}$, noting that since $F$ is bounded,
\[
E\big[\|De^{sF}\|^2_{\ell^2(\mathbb{N})}\big] \leq C_K\, E[e^{2sF}]\,\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))} < \infty,
\]
for some $C_K > 0$, we have
\[
E[Fe^{sF}] = \mathrm{Cov}(F, e^{sF})
= E\Big[\int_0^\infty e^{-v}\sum_{k=0}^\infty D_ke^{sF}\, P_vD_kF\, dv\Big]
\]
\[
\leq \Big\|e^{-sF}\frac{De^{sF}}{DF}\Big\|_\infty\,
E\Big[e^{sF}\int_0^\infty e^{-v}\,\|DF\, P_vDF\|_{\ell^1(\mathbb{N})}\, dv\Big]
\]
\[
\leq \frac{e^{sK}-1}{K}\, E\Big[e^{sF}\,\|DF\|_{\ell^2(\mathbb{N})}\int_0^\infty e^{-v}\,\|P_vDF\|_{\ell^2(\mathbb{N})}\, dv\Big]
\]
\[
\leq \frac{e^{sK}-1}{K}\, E\big[e^{sF}\big]\,\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}\int_0^\infty e^{-v}\, dv
= \frac{e^{sK}-1}{K}\, E\big[e^{sF}\big]\,\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))},
\]
where we also applied Lemma 1.9.4 to $u = DF$. In the general case, letting $L(s) = E[e^{s(F-E[F])}]$, we have
\[
\log\big(E[e^{t(F-E[F])}]\big) = \int_0^t \frac{L'(s)}{L(s)}\, ds
= \int_0^t \frac{E[(F-E[F])e^{s(F-E[F])}]}{E[e^{s(F-E[F])}]}\, ds
\]
\[
\leq \frac{1}{K}\,\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}\int_0^t (e^{sK}-1)\, ds
= \frac{1}{K^2}\,(e^{tK} - tK - 1)\,\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))},
\]
$t \in \mathbb{R}_+$. We have for all $x \geq 0$ and $t \geq 0$:
\[
P(F - E[F] \geq x) \leq e^{-tx} E[e^{t(F-E[F])}]
\leq \exp\Big(\frac{1}{K^2}(e^{tK} - tK - 1)\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))} - tx\Big).
\]
The minimum in $t \geq 0$ in the above expression is attained with
\[
t = \frac{1}{K}\log\Big(1 + \frac{xK}{\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}}\Big),
\]
hence
\[
P(F - E[F] \geq x)
\leq \exp\Big(-\frac{1}{K}\Big(\Big(x + \frac{\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}}{K}\Big)\log\Big(1 + \frac{Kx}{\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}}\Big) - x\Big)\Big)
\]
\[
\leq \exp\Big(-\frac{x}{2K}\log\Big(1 + \frac{xK}{\|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}}\Big)\Big),
\]
where we used the inequality
\[
\frac{u}{2}\log(1+u) \leq (1+u)\log(1+u) - u, \qquad u \in \mathbb{R}_+.
\]
If $K = 0$, the above proof is still valid by replacing all terms by their limits as $K \to 0$. Finally if $F$ is not bounded the conclusion holds for $F_n = \max(-n, \min(F,n))$, $n \geq 1$, and $(F_n)_{n\in\mathbb{N}}$, $(DF_n)_{n\in\mathbb{N}}$ converge respectively almost surely and in $L^2(\Omega\times\mathbb{N})$ to $F$ and $DF$, with $\|DF_n\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))} \leq \|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}$. $\square$
In case $p_k = p$ for all $k \in \mathbb{N}$, the conditions
\[
|D_kF| \leq \beta,\ k \in \mathbb{N}, \quad \text{and} \quad \|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))} \leq \alpha^2,
\]
give
\[
P(F - E[F] \geq x)
\leq \exp\Big(-\frac{\alpha^2pq}{\beta^2}\, g\Big(\frac{x\beta}{\alpha^2\sqrt{pq}}\Big)\Big)
\leq \exp\Big(-\frac{x\sqrt{pq}}{2\beta}\log\Big(1 + \frac{x\beta}{\alpha^2\sqrt{pq}}\Big)\Big),
\]
which is Relation (13) in [19]. In particular if $F$ is $\mathcal{F}_N$-measurable, then
\[
P(F - E[F] \geq x)
\leq \exp\Big(-Npq\, g\Big(\frac{x}{\beta N\sqrt{pq}}\Big)\Big)
\leq \exp\Big(-\frac{x\sqrt{pq}}{2\beta}\log\Big(1 + \frac{x}{\beta N\sqrt{pq}}\Big)\Big).
\]
Finally we show a Gaussian concentration inequality for functionals of $(S_n)_{n\in\mathbb{N}}$, using the covariance identity (1.10.1). We refer to [17], [18], [61], [75], for other versions of this inequality.

Proposition 1.11.2. Let $F \in L^1(\Omega)$ be such that
\[
\Big\|\sum_{k=0}^\infty \frac{1}{2(p_k\wedge q_k)}\, |D_kF|\,\|D_kF\|_\infty\Big\|_\infty \leq K^2.
\]
Then
\[
P(F - E[F] \geq x) \leq \exp\Big(-\frac{x^2}{2K^2}\Big), \qquad x \geq 0. \qquad (1.11.2)
\]
Proof. Again we assume that $F$ is a bounded random variable with $E[F] = 0$. Using the inequality
\[
|e^{tx} - e^{ty}| \leq \frac{t}{2}\,|x-y|\,(e^{tx} + e^{ty}), \qquad x, y \in \mathbb{R}, \qquad (1.11.3)
\]
we have
\[
|D_ke^{tF}| = \sqrt{p_kq_k}\,|e^{tF^+_k} - e^{tF^-_k}|
\leq \frac{1}{2}\sqrt{p_kq_k}\, t\,|F^+_k - F^-_k|\,(e^{tF^+_k} + e^{tF^-_k})
\]
\[
= \frac{1}{2}\, t\,|D_kF|\,(e^{tF^+_k} + e^{tF^-_k})
\leq \frac{t}{2(p_k\wedge q_k)}\,|D_kF|\, E\big[e^{tF} \mid X_i,\ i \neq k\big]
\]
\[
= \frac{1}{2(p_k\wedge q_k)}\, t\, E\big[e^{tF}|D_kF| \mid X_i,\ i \neq k\big].
\]
Now Proposition 1.10.1 yields
\[
E[Fe^{tF}] = \mathrm{Cov}(F, e^{tF})
= \sum_{k=0}^\infty E\big[E[D_kF \mid \mathcal{F}_{k-1}]\, D_ke^{tF}\big]
\leq \sum_{k=0}^\infty \|D_kF\|_\infty\, E\big[|D_ke^{tF}|\big]
\]
\[
\leq \frac{t}{2}\sum_{k=0}^\infty \frac{1}{p_k\wedge q_k}\,\|D_kF\|_\infty\, E\big[E\big[e^{tF}|D_kF| \mid X_i,\ i \neq k\big]\big]
= \frac{t}{2}\, E\Big[e^{tF}\sum_{k=0}^\infty \frac{1}{p_k\wedge q_k}\,\|D_kF\|_\infty\,|D_kF|\Big]
\]
\[
\leq \frac{t}{2}\, E[e^{tF}]\,\Big\|\sum_{k=0}^\infty \frac{1}{p_k\wedge q_k}\,|D_kF|\,\|D_kF\|_\infty\Big\|_\infty.
\]
This shows that
\[
\log\big(E[e^{t(F-E[F])}]\big) = \int_0^t \frac{E[(F-E[F])e^{s(F-E[F])}]}{E[e^{s(F-E[F])}]}\, ds
\leq K^2 \int_0^t s\, ds = \frac{t^2}{2}K^2,
\]
hence
\[
e^{tx}\, P(F - E[F] \geq x) \leq E[e^{t(F-E[F])}] \leq e^{t^2K^2/2}, \qquad t \geq 0,
\]
and
\[
P(F - E[F] \geq x) \leq e^{\frac{t^2}{2}K^2 - tx}, \qquad t \geq 0.
\]
The best inequality is obtained for $t = x/K^2$. Finally if $F$ is not bounded the conclusion holds for $F_n = \max(-n, \min(F,n))$, $n \geq 0$, and $(F_n)_{n\in\mathbb{N}}$, $(DF_n)_{n\in\mathbb{N}}$ converge respectively to $F$ and $DF$ in $L^2(\Omega)$, resp. $L^2(\Omega\times\mathbb{N})$, with $\|DF_n\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))} \leq \|DF\|^2_{L^\infty(\Omega,\ell^2(\mathbb{N}))}$. $\square$
In case $p_k = p$, $k \in \mathbb{N}$, we obtain
\[
P(F - E[F] \geq x) \leq \exp\Big(-\frac{(p\wedge q)\, x^2}{\|DF\|^2_{\ell^2(\mathbb{N},L^\infty(\Omega))}}\Big).
\]
Proposition 1.11.3. We have $E[e^{\alpha|F|}] < \infty$ for all $\alpha > 0$, and $E[e^{\alpha F^2}] < \infty$ for all $\alpha < 1/(2K^2)$.

Proof. The bound (1.11.2) implies
\[
E\big[e^{\alpha|F|}\big] = \int_0^\infty P(e^{\alpha|F|} \geq t)\, dt
= \int_{-\infty}^\infty P(|F| \geq y/\alpha)\, e^y\, dy
\leq 1 + \int_0^\infty P(|F| \geq y/\alpha)\, e^y\, dy
\]
\[
\leq 1 + \int_0^\infty \exp\Big(-\frac{(y/\alpha - E[|F|])^2}{2K^2}\Big)\, e^y\, dy < \infty,
\]
for all $\alpha > 0$. On the other hand we have
\[
E[e^{\alpha F^2}] = \int_0^\infty P(e^{\alpha F^2} \geq t)\, dt
= \int_{-\infty}^\infty P(F^2 \geq y/\alpha)\, e^y\, dy
\leq 1 + \int_0^\infty P\big(|F| \geq (y/\alpha)^{1/2}\big)\, e^y\, dy
\]
\[
\leq 1 + \int_0^\infty \exp\Big(-\frac{((y/\alpha)^{1/2} - E[|F|])^2}{2K^2}\Big)\, e^y\, dy < \infty,
\]
provided $2\alpha K^2 < 1$. $\square$
1.12 Logarithmic Sobolev Inequalities

The logarithmic Sobolev inequalities on Gaussian space provide an infinite dimensional analog of Sobolev inequalities, cf. e.g. [77]. On Riemannian path space [22] and on Poisson space [6], [151], martingale methods have been successfully applied to the proof of logarithmic Sobolev inequalities. Here, discrete time martingale methods are used along with the Clark predictable representation formula (1.7.1) as in [46], to provide a proof of logarithmic Sobolev inequalities for Bernoulli measures. Here we are only concerned with modified logarithmic Sobolev inequalities, and we refer to [127], Theorem 2.2.8 and references therein, for the standard version of the logarithmic Sobolev inequality on the hypercube under Bernoulli measures.

The entropy of a random variable $F > 0$ is defined by
\[
\mathrm{Ent}[F] = E[F\log F] - E[F]\log E[F],
\]
for sufficiently integrable $F$.

Lemma 1.12.1. The entropy has the tensorization property, i.e. if $F, G$ are sufficiently integrable independent random variables we have
\[
\mathrm{Ent}[FG] = E[F]\,\mathrm{Ent}[G] + E[G]\,\mathrm{Ent}[F]. \qquad (1.12.1)
\]
Proof. We have
\[
\mathrm{Ent}[FG] = E[FG\log(FG)] - E[FG]\log E[FG]
\]
\[
= E[FG(\log F + \log G)] - E[F]E[G](\log E[F] + \log E[G])
\]
\[
= E[G]E[F\log F] + E[F]E[G\log G] - E[F]E[G](\log E[F] + \log E[G])
\]
\[
= E[F]\,\mathrm{Ent}[G] + E[G]\,\mathrm{Ent}[F]. \qquad \square
\]
In the next proposition we recover the modified logarithmic Sobolev inequality of [19] using the Clark representation formula in discrete time.
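Before that, the tensorization identity (1.12.1) can be checked numerically for two independent two-point random variables. The sketch below is our own illustration (not from the text):

```python
import math

def entropy(values, weights):
    """Ent[F] = E[F log F] - E[F] log E[F] for F > 0, given exactly."""
    EF = sum(w * v for v, w in zip(values, weights))
    return (sum(w * v * math.log(v) for v, w in zip(values, weights))
            - EF * math.log(EF))

# Independent two-point random variables F and G (values chosen arbitrarily).
p, q = 0.3, 0.55
F, G = (2.0, 5.0), (1.5, 4.0)
wF, wG = (p, 1 - p), (q, 1 - q)
lhs = entropy([f * g for f in F for g in G],
              [a * b for a in wF for b in wG])
EF = sum(a * f for f, a in zip(F, wF))
EG = sum(b * g for g, b in zip(G, wG))
rhs = EF * entropy(G, wG) + EG * entropy(F, wF)
print(abs(lhs - rhs) < 1e-12)
```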
Theorem 1.12.2. Let $F \in \mathrm{Dom}(D)$ with $F > \eta$ a.s. for some $\eta > 0$. We have
\[
\mathrm{Ent}[F] \leq E\Big[\frac{1}{F}\,\|DF\|^2_{\ell^2(\mathbb{N})}\Big]. \qquad (1.12.2)
\]
Proof. Assume that $F$ is $\mathcal{F}_N$-measurable and let $M_n = E[F \mid \mathcal{F}_n]$, $0 \leq n \leq N$. Using Corollary 1.6.3 and the Clark formula (1.7.1) we have
\[
M_n = M_{-1} + \sum_{k=0}^n u_kY_k, \qquad 0 \leq n \leq N,
\]
with $u_k = E[D_kF \mid \mathcal{F}_{k-1}]$, $0 \leq k \leq n \leq N$, and $M_{-1} = E[F]$. Letting $f(x) = x\log x$ and using the bound
\[
f(x+y) - f(x) = y\log x + (x+y)\log\Big(1 + \frac{y}{x}\Big)
\leq y(1 + \log x) + \frac{y^2}{x},
\]
we have:
\[
\mathrm{Ent}[F] = E[f(M_N)] - E[f(M_{-1})]
= E\Big[\sum_{k=0}^N f(M_k) - f(M_{k-1})\Big]
\]
\[
= E\Big[\sum_{k=0}^N f(M_{k-1} + Y_ku_k) - f(M_{k-1})\Big]
\leq E\Big[\sum_{k=0}^N Y_ku_k(1 + \log M_{k-1}) + \frac{Y_k^2u_k^2}{M_{k-1}}\Big]
\]
\[
= E\Big[\sum_{k=0}^N \frac{1}{E[F \mid \mathcal{F}_{k-1}]}\,\big(E[D_kF \mid \mathcal{F}_{k-1}]\big)^2\Big]
\leq E\Big[\sum_{k=0}^N E\Big[\frac{1}{F}\,|D_kF|^2 \mid \mathcal{F}_{k-1}\Big]\Big]
= E\Big[\frac{1}{F}\sum_{k=0}^N |D_kF|^2\Big],
\]
where we used the Jensen inequality (9.3.1) and the convexity of $(u,v) \mapsto v^2/u$ on $(0,\infty)\times\mathbb{R}$, or the Schwarz inequality applied to $1/\sqrt{F}$ and $(D_kF/\sqrt{F})_{k\in\mathbb{N}}$, as in the Wiener and Poisson cases [22] and [6]. This inequality is extended by density to $F \in \mathrm{Dom}(D)$. $\square$
By a one-variable argument, letting $df = f(1) - f(-1)$, we have
\[
\mathrm{Ent}[f] = pf(1)\log f(1) + qf(-1)\log f(-1) - E[f]\log E[f]
\]
\[
= p(E[f] + q\, df)\log(E[f] + q\, df) + q(E[f] - p\, df)\log(E[f] - p\, df) - (pf(1) + qf(-1))\log E[f]
\]
\[
= pE[f]\log\Big(1 + q\frac{df}{E[f]}\Big) + pq\, df\log f(1)
+ qE[f]\log\Big(1 - p\frac{df}{E[f]}\Big) - qp\, df\log f(-1)
\]
\[
\leq pq\, df\log f(1) - pq\, df\log f(-1)
= pq\, E[df\, d\log f],
\]
where the inequality uses $\log(1+u) \leq u$, so that the first and third terms cancel against each other. This, by tensorization, recovers the following $L^1$ inequality of [47], [29], proved in [151] in the Poisson case. In the next proposition we state and prove this inequality in the multidimensional case, using the Clark representation formula, similarly to Theorem 1.12.2.

Theorem 1.12.3. Let $F > 0$ be $\mathcal{F}_N$-measurable. We have
\[
\mathrm{Ent}[F] \leq E\Big[\sum_{k=0}^N D_kF\, D_k\log F\Big]. \qquad (1.12.3)
\]
Proof. Let $f(x) = x\log x$ and
\[
\Psi(x,y) = (x+y)\log(x+y) - x\log x - (1 + \log x)\, y, \qquad x,\ x+y > 0.
\]
From the relation
\[
Y_ku_k = Y_k\, E[D_kF \mid \mathcal{F}_{k-1}]
= q_k1_{\{X_k=1\}}E[(F^+_k - F^-_k) \mid \mathcal{F}_{k-1}] + p_k1_{\{X_k=-1\}}E[(F^-_k - F^+_k) \mid \mathcal{F}_{k-1}]
\]
\[
= 1_{\{X_k=1\}}E[(F^+_k - F^-_k)1_{\{X_k=-1\}} \mid \mathcal{F}_{k-1}]
+ 1_{\{X_k=-1\}}E[(F^-_k - F^+_k)1_{\{X_k=1\}} \mid \mathcal{F}_{k-1}],
\]
we have, using the convexity of $\Psi$:
\[
\mathrm{Ent}[F] = E\Big[\sum_{k=0}^N f(M_{k-1} + Y_ku_k) - f(M_{k-1})\Big]
= E\Big[\sum_{k=0}^N \Psi(M_{k-1}, Y_ku_k) + Y_ku_k(1 + \log M_{k-1})\Big]
\]
\[
= E\Big[\sum_{k=0}^N \Psi(M_{k-1}, Y_ku_k)\Big]
\]
\[
= E\Big[\sum_{k=0}^N p_k\Psi\big(E[F \mid \mathcal{F}_{k-1}],\ E[(F^+_k - F^-_k)1_{\{X_k=-1\}} \mid \mathcal{F}_{k-1}]\big)
+ q_k\Psi\big(E[F \mid \mathcal{F}_{k-1}],\ E[(F^-_k - F^+_k)1_{\{X_k=1\}} \mid \mathcal{F}_{k-1}]\big)\Big]
\]
\[
\leq E\Big[\sum_{k=0}^N E\big[p_k\Psi\big(F,\ (F^+_k - F^-_k)1_{\{X_k=-1\}}\big)
+ q_k\Psi\big(F,\ (F^-_k - F^+_k)1_{\{X_k=1\}}\big) \mid \mathcal{F}_{k-1}\big]\Big]
\]
\[
= E\Big[\sum_{k=0}^N p_k1_{\{X_k=-1\}}\Psi\big(F^-_k,\ F^+_k - F^-_k\big)
+ q_k1_{\{X_k=1\}}\Psi\big(F^+_k,\ F^-_k - F^+_k\big)\Big]
\]
\[
= E\Big[\sum_{k=0}^N p_kq_k\Psi(F^-_k, F^+_k - F^-_k) + p_kq_k\Psi(F^+_k, F^-_k - F^+_k)\Big]
\]
\[
= E\Big[\sum_{k=0}^N p_kq_k(\log F^+_k - \log F^-_k)(F^+_k - F^-_k)\Big]
= E\Big[\sum_{k=0}^N D_kF\, D_k\log F\Big]. \qquad \square
\]
The application of Theorem 1.12.3 to $e^F$ gives the following inequality for $F$ an $\mathcal{F}_N$-measurable random variable:
\[
\mathrm{Ent}[e^F] \leq E\Big[\sum_{k=0}^N D_kF\, D_ke^F\Big]
\]
\[
= E\Big[\sum_{k=0}^N p_kq_k\Psi\big(e^{F^-_k},\ e^{F^+_k} - e^{F^-_k}\big)
+ p_kq_k\Psi\big(e^{F^+_k},\ e^{F^-_k} - e^{F^+_k}\big)\Big]
\]
\[
= E\Big[\sum_{k=0}^N p_kq_ke^{F^-_k}\big((F^+_k - F^-_k)e^{F^+_k - F^-_k} - e^{F^+_k - F^-_k} + 1\big)
+ p_kq_ke^{F^+_k}\big((F^-_k - F^+_k)e^{F^-_k - F^+_k} - e^{F^-_k - F^+_k} + 1\big)\Big]
\]
\[
= E\Big[\sum_{k=0}^N p_k1_{\{X_k=-1\}}e^{F^-_k}\big((F^+_k - F^-_k)e^{F^+_k - F^-_k} - e^{F^+_k - F^-_k} + 1\big)
\]
\[
\qquad\qquad +\ q_k1_{\{X_k=1\}}e^{F^+_k}\big((F^-_k - F^+_k)e^{F^-_k - F^+_k} - e^{F^-_k - F^+_k} + 1\big)\Big]
\]
\[
= E\Big[e^F\sum_{k=0}^N \sqrt{p_kq_k}\,|Y_k|\big(\nabla_kF\, e^{\nabla_kF} - e^{\nabla_kF} + 1\big)\Big]. \qquad (1.12.4)
\]
This implies
\[
\mathrm{Ent}[e^F] \leq E\Big[e^F\sum_{k=0}^N \big(\nabla_kF\, e^{\nabla_kF} - e^{\nabla_kF} + 1\big)\Big]. \qquad (1.12.5)
\]
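The modified logarithmic Sobolev inequality (1.12.3) that underlies these bounds can be checked by exact enumeration for a strictly positive functional. A sketch (ours, not from the text):

```python
import math
from itertools import product

def weights(p):
    """All points of {-1,1}^N with their product-measure weights."""
    for o in product([1, -1], repeat=len(p)):
        w = 1.0
        for k, x in enumerate(o):
            w *= p[k] if x == 1 else 1 - p[k]
        yield o, w

def check_mlsi(F, p):
    """Exact evaluation of both sides of Ent[F] <= E[sum_k D_kF D_k log F]."""
    pts = list(weights(p))
    EF = sum(w * F(o) for o, w in pts)
    ent = sum(w * F(o) * math.log(F(o)) for o, w in pts) - EF * math.log(EF)
    rhs = 0.0
    for o, w in pts:
        for k in range(len(p)):
            pk, qk = p[k], 1 - p[k]
            plus, minus = o[:k] + (1,) + o[k + 1:], o[:k] + (-1,) + o[k + 1:]
            dF = (pk * qk) ** 0.5 * (F(plus) - F(minus))
            dlogF = (pk * qk) ** 0.5 * (math.log(F(plus)) - math.log(F(minus)))
            rhs += w * dF * dlogF
    return ent, rhs

F = lambda o: 3.0 + o[0] + 0.5 * o[1] * o[2]   # strictly positive functional
ent, rhs = check_mlsi(F, [0.2, 0.5, 0.7])
print(ent <= rhs)
```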
As noted in [29], Relation (1.12.3) and the Poisson limit theorem yield the $L^1$ inequality of [151]. More precisely, letting $M_n = (n + X_1 + \cdots + X_n)/2$, $F = \phi(M_n)$ and $p_k = \lambda/n$, $k \in \mathbb{N}$, $n \geq 1$, $\lambda > 0$, we have, from Proposition 1.6.2,
\[
\sum_{k=0}^n D_kF\, D_k\log F
= \frac{\lambda}{n}\Big(1 - \frac{\lambda}{n}\Big)(n - M_n)\big(\phi(M_n+1) - \phi(M_n)\big)\big(\log\phi(M_n+1) - \log\phi(M_n)\big)
\]
\[
+ \frac{\lambda}{n}\Big(1 - \frac{\lambda}{n}\Big)M_n\big(\phi(M_n) - \phi(M_n-1)\big)\big(\log\phi(M_n) - \log\phi(M_n-1)\big);
\]
in the limit as $n$ goes to infinity we obtain
\[
\mathrm{Ent}[\phi(U)] \leq \lambda\, E\big[(\phi(U+1) - \phi(U))(\log\phi(U+1) - \log\phi(U))\big],
\]
where $U$ is a Poisson random variable with parameter $\lambda$. In one variable we have, still letting $df = f(1) - f(-1)$,
\[
\mathrm{Ent}[e^f] \leq pq\, E\big[de^f\, d\log e^f\big]
= pq\,(e^{f(1)} - e^{f(-1)})(f(1) - f(-1))
\]
\[
= pq\, e^{f(-1)}\big((f(1)-f(-1))e^{f(1)-f(-1)} - e^{f(1)-f(-1)} + 1\big)
+ pq\, e^{f(1)}\big((f(-1)-f(1))e^{f(-1)-f(1)} - e^{f(-1)-f(1)} + 1\big)
\]
\[
\leq q\, e^{f(-1)}\big((f(1)-f(-1))e^{f(1)-f(-1)} - e^{f(1)-f(-1)} + 1\big)
+ p\, e^{f(1)}\big((f(-1)-f(1))e^{f(-1)-f(1)} - e^{f(-1)-f(1)} + 1\big)
\]
\[
= E\big[e^f(\nabla f\, e^{\nabla f} - e^{\nabla f} + 1)\big],
\]
where $\nabla_k$ is the gradient operator defined in (1.6.4). This last inequality is not comparable to the optimal constant inequality
\[
\mathrm{Ent}[e^F] \leq E\Big[e^F\sum_{k=0}^N p_kq_k\big(|\nabla_kF|\, e^{|\nabla_kF|} - e^{|\nabla_kF|} + 1\big)\Big] \qquad (1.12.6)
\]
of [19] since when $F^+_k - F^-_k \geq 0$ the right-hand side of (1.12.6) grows as $F^+_ke^{2F^+_k}$, instead of $F^+_ke^{F^+_k}$ in (1.12.5). In fact we can prove the following inequality which improves (1.12.2), (1.12.3) and (1.12.6).
inequality which improves (1.12.2), (1.12.3) and (1.12.6).
Theorem 1.12.4. Let F be T
N
-measurable. We have
Ent [e
F
] E
_
e
F
N

k=0
p
k
q
k
(
k
Fe

k
F
e

k
F
+ 1)
_
. (1.12.7)
Clearly, (1.12.7) is better than (1.12.6), (1.12.4) and (1.12.3). It also improves
(1.12.2) from the bound
xe
x
e
x
+ 1 (e
x
1)
2
, x R,
which implies
\[
e^F\big(\nabla_kF\, e^{\nabla_kF} - e^{\nabla_kF} + 1\big) \leq e^F\big(e^{\nabla_kF} - 1\big)^2 = e^{-F}\,|\nabla_ke^F|^2.
\]
By the tensorization property (1.12.1), the proof of (1.12.7) reduces to the following one dimensional lemma.
Lemma 1.12.5. For any $0 \leq p \leq 1$, $t \in \mathbb{R}$, $a \in \mathbb{R}$, $q = 1 - p$,
\[
pte^t + qae^a - \big(pe^t + qe^a\big)\log\big(pe^t + qe^a\big)
\]
\[
\leq pq\Big(qe^a\big((t-a)e^{t-a} - e^{t-a} + 1\big) + pe^t\big((a-t)e^{a-t} - e^{a-t} + 1\big)\Big).
\]
Proof. Set
\[
g(t) = pq\Big(qe^a\big((t-a)e^{t-a} - e^{t-a} + 1\big) + pe^t\big((a-t)e^{a-t} - e^{a-t} + 1\big)\Big)
\]
\[
-\ pte^t - qae^a + \big(pe^t + qe^a\big)\log\big(pe^t + qe^a\big).
\]
Then
\[
g'(t) = pq\Big(qe^a(t-a)e^{t-a} + pe^t\big(1 - e^{a-t}\big)\Big) - pte^t + pe^t\log\big(pe^t + qe^a\big)
\]
and $g''(t) = pe^th(t)$, where
\[
h(t) = -a - 2pt - p + 2pa + p^2t - p^2a + \log\big(pe^t + qe^a\big) + \frac{pe^t}{pe^t + qe^a}.
\]
Now,
\[
h'(t) = -2p + p^2 + \frac{2pe^t}{pe^t + qe^a} - \frac{p^2e^{2t}}{(pe^t + qe^a)^2}
= \frac{pq^2\,(e^t - e^a)\big(pe^t + (q+1)e^a\big)}{(pe^t + qe^a)^2},
\]
which implies that $h'(a) = 0$, $h'(t) < 0$ for any $t < a$ and $h'(t) > 0$ for any $t > a$. Hence, for any $t \neq a$, $h(t) > h(a) = 0$, and so $g''(t) \geq 0$ for any $t \in \mathbb{R}$ and $g''(t) = 0$ if and only if $t = a$. Therefore, $g'$ is strictly increasing. Finally, since $t = a$ is the unique root of $g' = 0$, we have that $g(t) \geq g(a) = 0$ for all $t \in \mathbb{R}$. $\square$
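The one-dimensional inequality of Lemma 1.12.5 is easy to probe numerically over a grid of parameters. A sketch (ours, not from the text):

```python
import math

def lhs(p, t, a):
    """Left-hand side of Lemma 1.12.5."""
    q = 1 - p
    m = p * math.exp(t) + q * math.exp(a)
    return p * t * math.exp(t) + q * a * math.exp(a) - m * math.log(m)

def rhs(p, t, a):
    """Right-hand side, with phi(x) = x e^x - e^x + 1."""
    q = 1 - p
    phi = lambda x: x * math.exp(x) - math.exp(x) + 1
    return p * q * (q * math.exp(a) * phi(t - a) + p * math.exp(t) * phi(a - t))

ok = all(lhs(p, t, a) <= rhs(p, t, a) + 1e-9
         for p in (0.1, 0.5, 0.9)
         for t in (-2.0, -0.5, 0.0, 1.0, 2.5)
         for a in (-1.5, 0.0, 0.7, 2.0))
print(ok)
```

Note that both sides vanish at $t = a$, in agreement with the proof, where $t = a$ is the unique minimum of $g$.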
This inequality improves (1.12.2), (1.12.3), and (1.12.6), as illustrated in one dimension in Figure 1.1, where the entropy is represented as a function of $p \in [0,1]$ with $f(-1) = 1$ and $f(1) = 3.5$.

[Fig. 1.1: Graph of the "optimal", "modified", "$L^1$" and "sharp" bounds on the entropy as a function of $p \in [0,1]$.]

The inequality (1.12.7) is a discrete analog of the sharp inequality on Poisson space of [151]. In the symmetric case $p_k = q_k = 1/2$, $k \in \mathbb{N}$, we have
\[
\mathrm{Ent}[e^F] \leq E\Big[e^F\sum_{k=0}^N p_kq_k\big(\nabla_kF\, e^{\nabla_kF} - e^{\nabla_kF} + 1\big)\Big]
\]
\[
= \frac{1}{8}\, E\Big[\sum_{k=0}^N e^{F^-_k}\big((F^+_k - F^-_k)e^{F^+_k - F^-_k} - e^{F^+_k - F^-_k} + 1\big)
\]
\[
\qquad\qquad +\ e^{F^+_k}\big((F^-_k - F^+_k)e^{F^-_k - F^+_k} - e^{F^-_k - F^+_k} + 1\big)\Big]
\]
\[
= \frac{1}{8}\, E\Big[\sum_{k=0}^N \big(e^{F^+_k} - e^{F^-_k}\big)\big(F^+_k - F^-_k\big)\Big]
= \frac{1}{2}\, E\Big[\sum_{k=0}^N D_kF\, D_ke^F\Big],
\]
which improves on (1.12.3).
Similarly the sharp inequality of [151] can be recovered by taking $F = \phi(M_n)$ in
\[
\mathrm{Ent}[e^F] \leq E\Big[e^F\sum_{k=0}^N p_kq_k\big(\nabla_kF\, e^{\nabla_kF} - e^{\nabla_kF} + 1\big)\Big]
\]
\[
= \frac{\lambda}{n}\Big(1 - \frac{\lambda}{n}\Big) E\Big[M_n e^{\phi(M_n)}\Big(\big(\phi(M_n-1) - \phi(M_n)\big)e^{\phi(M_n-1)-\phi(M_n)} - e^{\phi(M_n-1)-\phi(M_n)} + 1\Big)\Big]
\]
\[
+ \frac{\lambda}{n}\Big(1 - \frac{\lambda}{n}\Big) E\Big[(n - M_n) e^{\phi(M_n)}\Big(\big(\phi(M_n+1) - \phi(M_n)\big)e^{\phi(M_n+1)-\phi(M_n)} - e^{\phi(M_n+1)-\phi(M_n)} + 1\Big)\Big],
\]
which, in the limit as $n$ goes to infinity, yields
\[
\mathrm{Ent}[e^{\phi(U)}] \leq \lambda\, E\Big[e^{\phi(U)}\Big(\big(\phi(U+1) - \phi(U)\big)e^{\phi(U+1)-\phi(U)} - e^{\phi(U+1)-\phi(U)} + 1\Big)\Big],
\]
where $U$ is a Poisson random variable with parameter $\lambda$.
1.13 Change of Variable Formula

In this section we state a discrete-time analog of Itô's change of variable formula which will be useful for the predictable representation of random variables and for option hedging.

Proposition 1.13.1. Let $(M_n)_{n\in\mathbb{N}}$ be a square-integrable martingale and $f : \mathbb{R}\times\mathbb{N} \to \mathbb{R}$. We have
\[
f(M_n, n) = f(M_{-1}, -1) + \sum_{k=0}^n D_kf(M_k, k)\, Y_k
+ \sum_{k=0}^n E\big[f(M_k, k) - f(M_{k-1}, k) \mid \mathcal{F}_{k-1}\big]
\]
\[
+ \sum_{k=0}^n f(M_{k-1}, k) - f(M_{k-1}, k-1). \qquad (1.13.1)
\]
Proof. By Proposition 1.7.5 there exists a square-integrable predictable process $(u_k)_{k\in\mathbb{N}}$ such that
\[
M_n = M_{-1} + \sum_{k=0}^n u_kY_k, \qquad n \in \mathbb{N}.
\]
We write
\[
f(M_n, n) - f(M_{-1}, -1) = \sum_{k=0}^n f(M_k, k) - f(M_{k-1}, k-1)
\]
\[
= \sum_{k=0}^n f(M_k, k) - f(M_{k-1}, k) + f(M_{k-1}, k) - f(M_{k-1}, k-1)
\]
\[
= \sum_{k=0}^n \sqrt{\frac{p_k}{q_k}}\Big(f\Big(M_{k-1} + u_k\sqrt{\frac{q_k}{p_k}},\, k\Big) - f(M_{k-1}, k)\Big)Y_k
\]
\[
+ \sum_{k=0}^n \frac{p_k}{q_k}\,1_{\{X_k=-1\}}\Big(f\Big(M_{k-1} + u_k\sqrt{\frac{q_k}{p_k}},\, k\Big) - f(M_{k-1}, k)\Big)
+ 1_{\{X_k=-1\}}\Big(f\Big(M_{k-1} - u_k\sqrt{\frac{p_k}{q_k}},\, k\Big) - f(M_{k-1}, k)\Big)
\]
\[
+ \sum_{k=0}^n f(M_{k-1}, k) - f(M_{k-1}, k-1)
\]
\[
= \sum_{k=0}^n \sqrt{\frac{p_k}{q_k}}\Big(f\Big(M_{k-1} + u_k\sqrt{\frac{q_k}{p_k}},\, k\Big) - f(M_{k-1}, k)\Big)Y_k
+ \sum_{k=0}^n \frac{1}{q_k}1_{\{X_k=-1\}}E\big[f(M_k, k) - f(M_{k-1}, k) \mid \mathcal{F}_{k-1}\big]
\]
\[
+ \sum_{k=0}^n f(M_{k-1}, k) - f(M_{k-1}, k-1).
\]
Similarly we have
\[
f(M_n, n) = f(M_{-1}, -1)
- \sum_{k=0}^n \sqrt{\frac{q_k}{p_k}}\Big(f\Big(M_{k-1} - u_k\sqrt{\frac{p_k}{q_k}},\, k\Big) - f(M_{k-1}, k)\Big)Y_k
\]
\[
+ \sum_{k=0}^n \frac{1}{p_k}1_{\{X_k=1\}}E\big[f(M_k, k) - f(M_{k-1}, k) \mid \mathcal{F}_{k-1}\big]
+ \sum_{k=0}^n f(M_{k-1}, k) - f(M_{k-1}, k-1).
\]
Multiplying each increment in the above formulas respectively by $q_k$ and $p_k$ and summing on $k$ we get
\[
f(M_n, n) = f(M_{-1}, -1) + \sum_{k=0}^n q_k\big(f(M_k, k) - f(M_{k-1}, k-1)\big)
+ \sum_{k=0}^n p_k\big(f(M_k, k) - f(M_{k-1}, k-1)\big)
\]
\[
= f(M_{-1}, -1)
+ \sum_{k=0}^n \sqrt{p_kq_k}\Big(f\Big(M_{k-1} + u_k\sqrt{\frac{q_k}{p_k}},\, k\Big) - f(M_{k-1}, k)\Big)Y_k
- \sum_{k=0}^n \sqrt{p_kq_k}\Big(f\Big(M_{k-1} - u_k\sqrt{\frac{p_k}{q_k}},\, k\Big) - f(M_{k-1}, k)\Big)Y_k
\]
\[
+ \sum_{k=0}^n E\big[f(M_k, k) \mid \mathcal{F}_{k-1}\big] - f(M_{k-1}, k)
+ \sum_{k=0}^n f(M_{k-1}, k) - f(M_{k-1}, k-1)
\]
\[
= f(M_{-1}, -1)
+ \sum_{k=0}^n \sqrt{p_kq_k}\Big(f\Big(M_{k-1} + u_k\sqrt{\frac{q_k}{p_k}},\, k\Big) - f\Big(M_{k-1} - u_k\sqrt{\frac{p_k}{q_k}},\, k\Big)\Big)Y_k
\]
\[
+ \sum_{k=0}^n E\big[f(M_k, k) - f(M_{k-1}, k) \mid \mathcal{F}_{k-1}\big]
+ \sum_{k=0}^n f(M_{k-1}, k) - f(M_{k-1}, k-1). \qquad \square
\]
Note that in (1.13.1) we have
\[
D_kf(M_k, k) = \sqrt{p_kq_k}\Big(f\Big(M_{k-1} + u_k\sqrt{\frac{q_k}{p_k}},\, k\Big) - f\Big(M_{k-1} - u_k\sqrt{\frac{p_k}{q_k}},\, k\Big)\Big), \qquad k \in \mathbb{N}.
\]
On the other hand, the term
\[
E\big[f(M_k, k) - f(M_{k-1}, k-1) \mid \mathcal{F}_{k-1}\big]
\]
is analog to the generator part in the continuous time Itô formula, and can be written as
\[
p_kf\Big(M_{k-1} + u_k\sqrt{\frac{q_k}{p_k}},\, k\Big) + q_kf\Big(M_{k-1} - u_k\sqrt{\frac{p_k}{q_k}},\, k\Big) - f(M_{k-1}, k-1).
\]
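The decomposition (1.13.1) can be verified pathwise. The sketch below (our own, with constant $u_k = 1$ for simplicity and the conditional expectations evaluated in closed form since $u_k$ is predictable) checks the formula on every path of length 4:

```python
from itertools import product
import math

def Y(x, p):
    """Normalized Bernoulli increment: E[Y] = 0, E[Y^2] = 1."""
    q = 1 - p
    return math.sqrt(q / p) if x == 1 else -math.sqrt(p / q)

def check_path(omega, p, f, M0=0.0):
    """Pathwise evaluation of both sides of (1.13.1), with u_k = 1."""
    q = 1 - p
    M = M0
    mart = gen = shift = 0.0
    for k, x in enumerate(omega):
        up, dn = M + math.sqrt(q / p), M - math.sqrt(p / q)
        s = math.sqrt(p * q)
        mart += s * (f(up, k) - f(dn, k)) * Y(x, p)    # D_k f(M_k,k) Y_k
        gen += p * f(up, k) + q * f(dn, k) - f(M, k)   # E[f(M_k,k)-f(M_{k-1},k)|F_{k-1}]
        shift += f(M, k) - f(M, k - 1)                 # time-shift term
        M += Y(x, p)                                   # u_k = 1
    n = len(omega)
    return f(M, n - 1), f(M0, -1) + mart + gen + shift

p, f = 0.4, lambda x, n: x * x + n * x
ok = all(abs(a - b) < 1e-9
         for a, b in (check_path(o, p, f) for o in product([1, -1], repeat=4)))
print(ok)
```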
When $p_n = q_n = 1/2$, $n \in \mathbb{N}$, we have
\[
f(M_n, n) = f(M_{-1}, -1) + \sum_{k=0}^n \frac{f(M_{k-1} + u_k, k) - f(M_{k-1} - u_k, k)}{2}\, Y_k
\]
\[
+ \sum_{k=0}^n \frac{f(M_{k-1} + u_k, k) + f(M_{k-1} - u_k, k) - 2f(M_{k-1}, k)}{2}
+ \sum_{k=0}^n f(M_{k-1}, k) - f(M_{k-1}, k-1).
\]
The above proposition also provides an explicit version of the Doob decomposition for supermartingales. Naturally if $(f(M_n, n))_{n\in\mathbb{N}}$ is a martingale we have
\[
f(M_n, n) = f(M_{-1}, -1)
+ \sum_{k=0}^n \sqrt{p_kq_k}\Big(f\Big(M_{k-1} + u_k\sqrt{\frac{q_k}{p_k}},\, k\Big) - f\Big(M_{k-1} - u_k\sqrt{\frac{p_k}{q_k}},\, k\Big)\Big)Y_k
\]
\[
= f(M_{-1}, -1) + \sum_{k=0}^n D_kf(M_k, k)\, Y_k.
\]
In this case the Clark formula, the martingale representation formula Proposition 1.7.5 and the change of variable formula all coincide, and we have in particular
\[
D_kf(M_k, k) = E[D_kf(M_n, n) \mid \mathcal{F}_{k-1}] = E[D_kf(M_k, k) \mid \mathcal{F}_{k-1}], \qquad k \in \mathbb{N}.
\]
If $F$ is an $\mathcal{F}_N$-measurable random variable and $f$ is a function such that
\[
E[F \mid \mathcal{F}_n] = f(M_n, n), \qquad -1 \leq n \leq N,
\]
we have $F = f(M_N, N)$, $E[F] = f(M_{-1}, -1)$ and
\[
F = E[F] + \sum_{k=0}^N E[D_kf(M_N, N) \mid \mathcal{F}_{k-1}]Y_k
= E[F] + \sum_{k=0}^N D_kf(M_k, k)\, Y_k
= E[F] + \sum_{k=0}^N D_kE[f(M_N, N) \mid \mathcal{F}_k]\, Y_k.
\]
Such a function $f$ exists if $(M_n)_{n\in\mathbb{N}}$ is Markov and $F = h(M_N)$. In this case, consider the semi-group $(P_{k,n})_{0\leq k<n\leq N}$ associated to $(M_n)_{n\in\mathbb{N}}$ and defined by
\[
(P_{k,n}h)(x) = E[h(M_n) \mid M_k = x].
\]
Letting $f(x, n) = (P_{n,N}h)(x)$ we can write
\[
F = E[F] + \sum_{k=0}^N E[D_kh(M_N) \mid \mathcal{F}_{k-1}]Y_k
= E[F] + \sum_{k=0}^N D_k\big(P_{k,N}h\big)(M_k)\, Y_k.
\]
1.14 Option Hedging

In this section we give a presentation of the Black-Scholes formula in discrete time, or Cox-Ross-Rubinstein model, see e.g. [45], [74], [125], or §15-1 of [149], as an application of the Clark formula. In order to be consistent with the notation of the previous sections we choose to use the time scale $\mathbb{N}$, hence the index 0 is that of the first random value of any stochastic process, while the index $-1$ corresponds to its deterministic initial value.

Let $(A_k)_{k\in\mathbb{N}}$ be a riskless asset with initial value $A_{-1}$, and defined by
\[
A_n = A_{-1}\prod_{k=0}^n (1 + r_k), \qquad n \in \mathbb{N},
\]
where $(r_k)_{k\in\mathbb{N}}$ is a sequence of deterministic numbers such that $r_k > -1$, $k \in \mathbb{N}$. Consider a stock price with initial value $S_{-1}$, given in discrete time as
\[
S_n = \begin{cases} (1 + b_n)S_{n-1} & \text{if } X_n = 1, \\ (1 + a_n)S_{n-1} & \text{if } X_n = -1, \end{cases} \qquad n \in \mathbb{N},
\]
where $(a_k)_{k\in\mathbb{N}}$ and $(b_k)_{k\in\mathbb{N}}$ are sequences of deterministic numbers such that $-1 < a_k < r_k < b_k$, $k \in \mathbb{N}$. We have
\[
S_n = S_{-1}\prod_{k=0}^n \sqrt{(1+b_k)(1+a_k)}\,\Big(\frac{1+b_k}{1+a_k}\Big)^{X_k/2}, \qquad n \in \mathbb{N}.
\]
Consider now the discounted stock price given as
\[
\tilde{S}_n = S_n\prod_{k=0}^n (1 + r_k)^{-1}
= S_{-1}\prod_{k=0}^n \frac{1}{1+r_k}\sqrt{(1+b_k)(1+a_k)}\,\Big(\frac{1+b_k}{1+a_k}\Big)^{X_k/2}, \qquad n \in \mathbb{N}.
\]
If $-1 < a_k < r_k < b_k$, $k \in \mathbb{N}$, then $(\tilde{S}_n)_{n\in\mathbb{N}}$ is a martingale with respect to $(\mathcal{F}_n)_{n\geq -1}$ under the probability $P^*$ given by
\[
p^*_k = \frac{r_k - a_k}{b_k - a_k}, \qquad q^*_k = \frac{b_k - r_k}{b_k - a_k}, \qquad k \in \mathbb{N}.
\]
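The martingale property of the discounted price under $p^*_k$, $q^*_k$ reduces to the one-step identity $p^*(1+b) + q^*(1+a) = 1 + r$, which is immediate to verify numerically (a sketch, our own):

```python
def risk_neutral(a, b, r):
    """Risk-neutral probabilities p* = (r-a)/(b-a), q* = (b-r)/(b-a)."""
    p = (r - a) / (b - a)
    return p, 1 - p

a, b, r = -0.05, 0.08, 0.02
p, q = risk_neutral(a, b, r)
one_step = p * (1 + b) + q * (1 + a)     # E*[S_{n+1}/S_n | F_n]
print(abs(one_step - (1 + r)) < 1e-12, 0 < p < 1)
```

The condition $-1 < a < r < b$ is exactly what makes $p^*, q^* \in (0,1)$.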
In other terms, under $P^*$ we have
\[
E^*[S_{n+1} \mid \mathcal{F}_n] = (1 + r_{n+1})S_n, \qquad n \geq -1,
\]
where $E^*$ denotes the expectation under $P^*$. Recall that under this probability measure there is absence of arbitrage and the market is complete. From the change of variable formula Proposition 1.13.1 or from the Clark formula (1.7.1) we have the martingale representation
\[
\tilde{S}_n = S_{-1} + \sum_{k=0}^n Y_k D_k\tilde{S}_k
= S_{-1} + \sum_{k=0}^n \tilde{S}_{k-1}\sqrt{p^*_kq^*_k}\,\frac{b_k - a_k}{1 + r_k}\, Y_k.
\]
Definition 1.14.1. A portfolio strategy is represented by a pair of predictable processes $(\xi_k)_{k\in\mathbb{N}}$ and $(\eta_k)_{k\in\mathbb{N}}$ where $\xi_k$, resp. $\eta_k$, represents the number of units invested over the time period $(k, k+1]$ in the asset $S_k$, resp. $A_k$, with $k \geq 0$.

The value at time $k \geq -1$ of the portfolio $(\xi_k, \eta_k)_{0\leq k\leq N}$ is defined as
\[
V_k = \eta_{k+1}A_k + \xi_{k+1}S_k, \qquad k \geq -1, \qquad (1.14.1)
\]
and its discounted value is defined as
\[
\tilde{V}_n = V_n\prod_{k=0}^n (1 + r_k)^{-1}, \qquad n \geq -1. \qquad (1.14.2)
\]
Definition 1.14.2. A portfolio $(\xi_k, \eta_k)_{k\in\mathbb{N}}$ is said to be self-financing if
\[
A_k(\eta_{k+1} - \eta_k) + S_k(\xi_{k+1} - \xi_k) = 0, \qquad k \geq 0.
\]
Note that the self-financing condition implies
\[
V_k = \eta_kA_k + \xi_kS_k, \qquad k \geq 0.
\]
Our goal is to hedge an arbitrary claim on $\Omega$, i.e. given an $\mathcal{F}_N$-measurable random variable $F$ we search for a portfolio $(\xi_k, \eta_k)_{0\leq k\leq N}$ such that the equality
\[
F = V_N = \eta_NA_N + \xi_NS_N \qquad (1.14.3)
\]
holds at time $N \in \mathbb{N}$.

Proposition 1.14.3. Assume that the portfolio $(\xi_k, \eta_k)_{0\leq k\leq N}$ is self-financing. Then we have the decomposition
\[
V_n = V_{-1}\prod_{k=0}^n (1 + r_k)
+ \sum_{i=0}^n \xi_iS_{i-1}\sqrt{p^*_iq^*_i}\,(b_i - a_i)\, Y_i\prod_{k=i+1}^n (1 + r_k). \qquad (1.14.4)
\]
Proof. Under the self-financing assumption we have
\[
V_i - V_{i-1} = \eta_i(A_i - A_{i-1}) + \xi_i(S_i - S_{i-1})
= r_i\eta_iA_{i-1} + \big(a_i1_{\{X_i=-1\}} + b_i1_{\{X_i=1\}}\big)\xi_iS_{i-1}
\]
\[
= \xi_iS_{i-1}\big(a_i1_{\{X_i=-1\}} + b_i1_{\{X_i=1\}} - r_i\big) + r_iV_{i-1}
\]
\[
= \xi_iS_{i-1}\big((a_i - r_i)1_{\{X_i=-1\}} + (b_i - r_i)1_{\{X_i=1\}}\big) + r_iV_{i-1}
\]
\[
= (b_i - a_i)\,\xi_iS_{i-1}\big(-p^*_i1_{\{X_i=-1\}} + q^*_i1_{\{X_i=1\}}\big) + r_iV_{i-1}
\]
\[
= \xi_iS_{i-1}\sqrt{p^*_iq^*_i}\,(b_i - a_i)\, Y_i + r_iV_{i-1}, \qquad i \in \mathbb{N},
\]
by Relation (1.4.5), hence for the discounted portfolio we get:
\[
\tilde{V}_i - \tilde{V}_{i-1}
= \prod_{k=0}^i (1 + r_k)^{-1}V_i - \prod_{k=0}^{i-1} (1 + r_k)^{-1}V_{i-1}
= \prod_{k=0}^i (1 + r_k)^{-1}\big(V_i - V_{i-1} - r_iV_{i-1}\big)
\]
\[
= \xi_iS_{i-1}\sqrt{p^*_iq^*_i}\,(b_i - a_i)\, Y_i\prod_{k=0}^i (1 + r_k)^{-1}, \qquad i \in \mathbb{N},
\]
which successively yields (1.14.4). $\square$
As a consequence of (1.14.4) and (1.14.2) we immediately obtain
\[
\tilde{V}_n = \tilde{V}_{-1}
+ \sum_{i=0}^n \xi_iS_{i-1}\sqrt{p^*_iq^*_i}\,(b_i - a_i)\, Y_i\prod_{k=0}^i (1 + r_k)^{-1}, \qquad (1.14.5)
\]
$n \geq -1$. The next proposition provides a solution to the hedging problem under the constraint (1.14.3).
Proposition 1.14.4. Given $F \in L^2(\Omega, \mathcal{F}_N)$, let
\[
\xi_n = \frac{1}{S_{n-1}\sqrt{p^*_nq^*_n}\,(b_n - a_n)}\,
E^*[D_nF \mid \mathcal{F}_{n-1}]\prod_{k=n+1}^N (1 + r_k)^{-1}, \qquad (1.14.6)
\]
$0 \leq n \leq N$, and
\[
\eta_n = \frac{1}{A_n}\Big(\prod_{k=n+1}^N (1 + r_k)^{-1}\, E^*[F \mid \mathcal{F}_n] - \xi_nS_n\Big), \qquad (1.14.7)
\]
$0 \leq n \leq N$. Then the portfolio $(\xi_k, \eta_k)_{0\leq k\leq N}$ is self-financing and satisfies
\[
\eta_nA_n + \xi_nS_n = \prod_{k=n+1}^N (1 + r_k)^{-1}\, E^*[F \mid \mathcal{F}_n],
\]
$0 \leq n \leq N$, in particular we have $V_N = F$, hence $(\xi_k, \eta_k)_{0\leq k\leq N}$ is a hedging strategy leading to $F$.
Proof. Let $(\xi_k)_{-1\leq k\leq N}$ be defined by (1.14.6) and $\xi_{-1} = 0$, and consider the process $(\eta_n)_{-1\leq n\leq N}$ defined by
\[
\eta_{-1} = \frac{E^*[F]}{A_{-1}}\prod_{k=0}^N (1 + r_k)^{-1}
\quad \text{and} \quad
\eta_{k+1} = \eta_k - \frac{(\xi_{k+1} - \xi_k)S_k}{A_k},
\]
$k = -1, \ldots, N-1$. Then $(\xi_k, \eta_k)_{-1\leq k\leq N}$ satisfies the self-financing condition
\[
A_k(\eta_{k+1} - \eta_k) + S_k(\xi_{k+1} - \xi_k) = 0, \qquad -1 \leq k \leq N-1.
\]
Let now
\[
V_{-1} = E^*[F]\prod_{k=0}^N (1 + r_k)^{-1},
\quad \text{and} \quad
V_n = \eta_nA_n + \xi_nS_n, \quad 0 \leq n \leq N,
\]
and
\[
\tilde{V}_n = V_n\prod_{k=0}^n (1 + r_k)^{-1}, \qquad -1 \leq n \leq N.
\]
Since $(\xi_k, \eta_k)_{-1\leq k\leq N}$ is self-financing, Relation (1.14.5) shows that
\[
\tilde{V}_n = \tilde{V}_{-1}
+ \sum_{i=0}^n Y_i\,\xi_iS_{i-1}\sqrt{p^*_iq^*_i}\,(b_i - a_i)\prod_{k=0}^i (1 + r_k)^{-1}, \qquad (1.14.8)
\]
$-1 \leq n \leq N$. On the other hand, from the Clark formula (1.7.1) and the definition of $(\xi_k)_{-1\leq k\leq N}$ we have
\[
E^*[F \mid \mathcal{F}_n]\prod_{k=0}^N (1 + r_k)^{-1}
= E^*\Big[E^*[F]\prod_{k=0}^N (1 + r_k)^{-1}
+ \sum_{i=0}^N Y_i\, E^*[D_iF \mid \mathcal{F}_{i-1}]\prod_{k=0}^N (1 + r_k)^{-1}\ \Big|\ \mathcal{F}_n\Big]
\]
\[
= E^*[F]\prod_{k=0}^N (1 + r_k)^{-1}
+ \sum_{i=0}^n Y_i\, E^*[D_iF \mid \mathcal{F}_{i-1}]\prod_{k=0}^N (1 + r_k)^{-1}
\]
\[
= E^*[F]\prod_{k=0}^N (1 + r_k)^{-1}
+ \sum_{i=0}^n Y_i\,\xi_iS_{i-1}\sqrt{p^*_iq^*_i}\,(b_i - a_i)\prod_{k=0}^i (1 + r_k)^{-1}
= \tilde{V}_n
\]
from (1.14.8). Hence
\[
\tilde{V}_n = E^*[F \mid \mathcal{F}_n]\prod_{k=0}^N (1 + r_k)^{-1}, \qquad -1 \leq n \leq N,
\]
and
\[
V_n = E^*[F \mid \mathcal{F}_n]\prod_{k=n+1}^N (1 + r_k)^{-1}, \qquad -1 \leq n \leq N.
\]
In particular we have $V_N = F$. To conclude the proof we note that from the relation $V_n = \eta_nA_n + \xi_nS_n$, $0 \leq n \leq N$, the process $(\eta_n)_{0\leq n\leq N}$ coincides with the process defined by (1.14.7). $\square$
Note that we also have
\[
\eta_{n+1}A_n + \xi_{n+1}S_n = E^*[F \mid \mathcal{F}_n]\prod_{k=n+1}^N (1 + r_k)^{-1}, \qquad -1 \leq n \leq N.
\]
The above proposition shows that there always exists a hedging strategy starting from
\[
\tilde{V}_{-1} = E^*[F]\prod_{k=0}^N (1 + r_k)^{-1}.
\]
Conversely, if there exists a hedging strategy leading to
\[
\tilde{V}_N = F\prod_{k=0}^N (1 + r_k)^{-1},
\]
then $(\tilde{V}_n)_{-1\leq n\leq N}$ is necessarily a martingale with initial value
\[
\tilde{V}_{-1} = E^*[\tilde{V}_N] = E^*[F]\prod_{k=0}^N (1 + r_k)^{-1}.
\]
When $F = h(\tilde{S}_N)$, we have $E^*[h(\tilde{S}_N) \mid \mathcal{F}_k] = f(\tilde{S}_k, k)$ with
\[
f(x, k) = E^*\Big[h\Big(x\prod_{i=k+1}^N \frac{\sqrt{(1+b_i)(1+a_i)}}{1+r_i}\Big(\frac{1+b_i}{1+a_i}\Big)^{X_i/2}\Big)\Big].
\]
The hedging strategy is given by
\[
\xi_k = \frac{1}{S_{k-1}\sqrt{p^*_kq^*_k}\,(b_k - a_k)}\, D_kf(\tilde{S}_k, k)\prod_{i=k+1}^N (1 + r_i)^{-1}
\]
\[
= \frac{\prod_{i=k+1}^N (1 + r_i)^{-1}}{S_{k-1}(b_k - a_k)}
\Big(f\Big(\tilde{S}_{k-1}\frac{1+b_k}{1+r_k},\, k\Big) - f\Big(\tilde{S}_{k-1}\frac{1+a_k}{1+r_k},\, k\Big)\Big),
\]
$k \geq 0$. Note that $\xi_k$ is non-negative (i.e. there is no short-selling) when $f$ is an increasing function of $x$, e.g. in the case of European call options with payoff function $h(x) = (x - K)^+$.
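The hedging recipe above — the number of stock units as a finite difference of the price function — can be exercised numerically. The sketch below (ours, with constant $a$, $b$, $r$ and time indexed from 0 instead of $-1$ for simplicity) prices by backward induction and runs the self-financing portfolio along every path, checking exact replication of the claim:

```python
from itertools import product

def crr_hedge_check(S0, a, b, r, N, h):
    """Backward-induction pricing and delta hedging in the CRR model,
    with eta given by the finite difference of the price function
    (a sketch in the spirit of Proposition 1.14.4)."""
    p = (r - a) / (b - a)          # risk-neutral probability of an up move
    q = 1 - p
    # price[k][j] = portfolio value at time k after j up-moves
    price = [[0.0] * (k + 1) for k in range(N + 1)]
    for j in range(N + 1):
        price[N][j] = h(S0 * (1 + b) ** j * (1 + a) ** (N - j))
    for k in range(N - 1, -1, -1):
        for j in range(k + 1):
            price[k][j] = (p * price[k + 1][j + 1] + q * price[k + 1][j]) / (1 + r)
    ok = True
    for path in product([1, 0], repeat=N):      # 1 = up move
        S, j, V = S0, 0, price[0][0]
        for k, up in enumerate(path):
            Su, Sd = S * (1 + b), S * (1 + a)
            xi = (price[k + 1][j + 1] - price[k + 1][j]) / (Su - Sd)
            cash = V - xi * S                   # remainder at the riskless rate
            S = Su if up else Sd
            j += up
            V = cash * (1 + r) + xi * S         # self-financing update
        ok = ok and abs(V - h(S)) < 1e-9
    return ok

print(crr_hedge_check(100.0, -0.05, 0.08, 0.02, 4, lambda s: max(s - 100.0, 0.0)))
```

Replication is exact path by path, reflecting the completeness of the market under $P^*$.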
1.15 Notes and References

This chapter is a revision of [113] with some additions, and is mainly based on [59] and [115]. It is included for the sake of consistency and for the role it plays as an introduction to the next chapters. Other approaches to discrete-time stochastic analysis include [53], [54], [48], [78] and [89]; see [8] for an approach based on quantum probability. Deviation inequalities and logarithmic Sobolev inequalities are treated in [19], [46], [59]. We also refer to [5], [17], [18], [61], [75] for other versions of logarithmic Sobolev inequalities in discrete settings. See [74], §15-1 of [149], and [125] for other derivations of the Black-Scholes formula in the discrete time Cox-Ross-Rubinstein model.