Mircea Grigoriu
Stochastic Calculus
Applications in Science
and Engineering
Grigoriu, Mircea.
Stochastic calculus : applications in science and engineering / Mircea Grigoriu.
p.cm.
Includes bibliographical references and index.
ISBN 978-1-4612-6501-6
1. Stochastic analysis. I. Title.
QA274.2.G75 2002
519.2-dc21 2002074386
CIP
All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Springer Science+Business Media, LLC),
except for brief excerpts in connection with reviews or
scholarly analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereafter
developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the
former are not especially identified, is not to be taken as a sign that such names, as understood by the
Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.
Contents
1 Introduction 1
2 Probability Theory 5
2.1 Introduction . . 5
2.2 Probability space .. 5
2.2.1 Sample space . 5
2.2.2 σ-field . . . 6
2.2.3 Probability measure 8
2.3 Construction of probability spaces 12
2.3.1 Countable sample space 13
2.3.2 Product probability space . 13
2.3.3 Extension of probability measure 16
2.3.4 Conditional probability . 16
2.3.5 Sequence of sets 18
2.3.6 Sequence of events 20
2.4 Measurable functions . . . 21
2.4.1 Properties . . . . . 22
2.4.2 Definition of random variable 22
2.4.3 Measurable transformations 24
2.5 Integration and expectation . . . . . . 26
2.5.1 Expectation operator . . . . . 28
2.5.1.1 Finite-valued simple random variables 28
2.5.1.2 Positive random variables . . . . 29
2.5.1.3 Arbitrary random variables .... 29
2.5.2 Properties of integrals of random variables 30
2.5.2.1 Finite number of random variables 30
2.5.2.2 Sequence of random variables . 32
2.5.2.3 Expectation ............ 33
2.6 The L_q(Ω, F, P) space . . . . . . . . . . . . . . . . . . . . 34
2.7 Independence . . . . . . . . . . . 36
2.7.1 Independence of σ-fields . 36
2.7.2 Independence of events . 36
Bibliography 757
Index 771
Chapter 1
Introduction
Algebraic, differential, and integral equations are used in the applied sciences, engineering, economics, and the social sciences to characterize the current state of a physical, economic, or social system and forecast its evolution in time. Generally, the coefficients of and/or the input to these equations are not precisely known because of insufficient information, limited understanding of some underlying phenomena, and inherent randomness. For example, the orientation of the atomic
lattice in the grains of a polycrystal varies randomly from grain to grain, the spa-
tial distribution of a phase of a composite material is not known precisely for a
particular specimen, bone properties needed to develop reliable artificial joints
vary significantly with individual and age, forces acting on a plane from takeoff to
landing depend in a complex manner on the environmental conditions and flight
pattern, and stock prices and their evolution in time depend on a large number of
factors that cannot be described by deterministic models. Problems that can be
defined by algebraic, differential, and integral equations with random coefficients
and/or input are referred to as stochastic problems.
The main objective of this book is the solution of stochastic problems, that
is, the determination of the probability law, moments, and/or other probabilistic
properties of the state of a physical, economic, or social system. It is assumed that
the operators and inputs defining a stochastic problem are specified. We do not
discuss the mathematical formulation of a stochastic problem, that is, the selection
of the functional form of the equation and of the probability laws of its random
coefficients and input for a particular stochastic problem.
The book is addressed to researchers and graduate students. It is intended
to serve as a bridge between heuristic arguments used at times in the applied sci-
ences and the very rich mathematical literature that is largely inaccessible to many
applied scientists. Mathematicians will find interesting unresolved technical prob-
lems currently solved by heuristic assumptions.
Organization
The book is largely self-contained and has two parts. The first part includes
Chapters 2-5, and develops the probabilistic tools needed for the analysis of the
stochastic problems considered in the book. Essentials of probability theory are
reviewed in Chapter 2. An extensive review of stochastic processes, elements of stochastic integrals, Ito's formula, and a primer on stochastic differential equations and their numerical solution are presented in Chapters 3 and 4. Numerous ref-
erences are provided that contain details and material not included in Chapters 2,
3, and 4. Methods of Monte Carlo simulation for random variables, stochastic
processes, and random fields are discussed in Chapter 5. These methods provide
useful illustrations for some of the theoretical concepts in Chapters 2-4 and are
essential for the solution of many of the stochastic problems in this book.
The second part of the book, Chapters 6-9, develops methods for solving
stochastic problems. In this book stochastic problems are characterized by type
Table 1. Types of problems considered in the book:

                                    INPUT
    COEFFICIENTS        Deterministic     Stochastic
    Deterministic       Chapter 6         Chapter 7
    Stochastic          Chapter 8         Chapter 9
rather than by the particular field of application. Table 1 shows the four types
of problems considered in the book. Chapter 6 is concerned with deterministic problems, that is, problems in which the governing differential equations are deterministic but whose solution utilizes concepts of the theory of probability
and stochastic processes. Such problems arise in physics, mechanics, material
science, heat conduction, and many other fields. For example, consider a Laplace equation ∑_{i=1}^d ∂²u(x)/∂x_i² = 0 with Dirichlet boundary conditions defined on an open bounded subset D of ℝ^d. The solution of this equation at an arbitrary point x ∈ D is equal to the expectation of u(Y), where Y denotes the exit point from D of an ℝ^d-valued Brownian motion starting at x. Numerical algorithms based on
this approach are simpler and more efficient, though less general, than current fi-
nite element and finite difference based algorithms. Chapter 7 discusses problems
in which the equations describing a system have deterministic coefficients and the
input to these equations is stochastic. Such problems arise in the description of
various types of physical, economic, and ecological systems. Chapter 8 covers
problems defined by equations with random coefficients and deterministic input.
Examples include the derivation of macroscopic properties of a material from the
attributes of its constituents and phenomena, such as localization and pattern for-
mation, relevant in physics and mechanics. Chapter 9 is concerned with problems
characterized by equations with random coefficients and inputs. An example is
the system describing the propagation of pollutants through a soil deposit with
uncertain properties for which no realistic deterministic description is possible.
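The exit-point representation of the Laplace solution mentioned above lends itself to a simple Monte Carlo method. The sketch below is ours, not the book's: it uses the walk-on-spheres scheme, one standard way to simulate the exit point of a Brownian motion from the unit disk, and estimates the harmonic function u(x, y) = x from its boundary values.

```python
import math
import random

def walk_on_spheres(x, y, g, eps=1e-3, rng=random):
    """One Brownian path for the unit disk: repeatedly jump to a uniform
    point on the largest circle centred at the current position and
    contained in the disk; stop within eps of the boundary and return the
    boundary value g there (the exit point Y of the text)."""
    while True:
        r = 1.0 - math.hypot(x, y)          # distance to the boundary
        if r < eps:
            norm = math.hypot(x, y)         # project onto the unit circle
            return g(x / norm, y / norm)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += r * math.cos(theta)
        y += r * math.sin(theta)

def solve_laplace(x, y, g, n_samples=20000, seed=1):
    """Estimate u(x, y) = E[g(Y)] by averaging over independent paths."""
    rng = random.Random(seed)
    return sum(walk_on_spheres(x, y, g, rng=rng) for _ in range(n_samples)) / n_samples

# u(x, y) = x is harmonic in the unit disk with boundary values g(bx, by) = bx,
# so the estimate below should be close to the exact value u(0.3, 0.2) = 0.3.
print(solve_laplace(0.3, 0.2, lambda bx, by: bx))
```

Note that each path needs only a handful of jumps, and no mesh of D is ever built, which is the sense in which such algorithms are simpler than finite element or finite difference schemes.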
We now briefly describe the presentation of the material. Essential facts are
in boxes throughout the book. The book contains numerous examples that have
two purposes: to state consequences of the essential facts and to illustrate the use
of the stated facts in the solution of stochastic problems. The statements of the
facts and examples are followed by proofs or notes printed in smaller characters.
Complete proofs are given if they are not very technical. The notes include the
idea and/or the essential steps of technical proofs and references where complete
proofs can be found.
The advanced topics on probability theory and stochastic processes in the
first part of the book are essential for Chapters 6 and 7. Many of the developments
in Chapters 8 and 9 are largely based on the second moment calculus. How-
ever, extensions of the methods in Chapters 6 and 7 to the problems considered in
Chapters 8 and 9 require most of the advanced topics in Chapters 2-4.
Classroom use
The book can be used as a text for four different one-semester courses or
one two-semester course. The four one-semester courses can differ depending
on the applications they emphasize. The first, second, third, and fourth courses
emphasize the applications in Chapters 6, 7, 8, and 9, respectively, and require the
following background.
• The first course: properties of the conditional expectation (Chapter 2), stop-
ping times and martingales (Chapter 3), the Ito calculus and diffusion processes
(Chapter 4), and Monte Carlo techniques (Chapter 5).
• The second course: facts on stochastic processes (Chapter 3), the stochas-
tic integral, the Ito calculus, and the theory of stochastic differential equations
(Chapter 4), and Monte Carlo techniques (Chapter 5).
• The third course: a review of probabilistic concepts (Chapter 2), second mo-
ment calculus for stochastic processes (Chapter 3), and Monte Carlo techniques
(Chapter 5).
Acknowledgements
This book could not have been completed without the contributions of many
individuals. In particular, I wish to express my deepest appreciation to Professors
S. I. Resnick and G. Samorodnitsky of Cornell University for their technical ad-
vice and comments on various topics in the book, Dr. E. Simiu of National Insti-
tute of Standards and Technology for reviewing the entire manuscript, Professor
S. T. Ariaratnam of the University of Waterloo, Canada, for numerous stimulat-
ing discussions, as well as Professors P. R. Dawson, J. Jenkins, S. Mukherjee,
and Dr. C. Myers of Cornell University and Professor O. Ditlevsen of the Technical University of Denmark, Lyngby, for their useful comments. I also wish to thank my
doctoral students S. Arwade, E. Mostafa, and C. Roth for their many contribu-
tions, and Mr. C. Willkens for his enthusiastic and professional support of the
computer hardware and software used in this project. Finally, I am grateful to my
wife Betsy for understanding, encouragement, and support. During this project,
she became an accomplished sculptor and pilot.
My research on stochastic problems, some of which is incorporated in this
book, has been supported by the National Science Foundation, National Institute
of Standards and Technology, Electric Power Research Institute, Jet Propulsion
Laboratory, Air Force Office of Scientific Research, Federal Aviation Adminis-
tration, AON Financial Products, and other institutions. I am indebted to these organizations for their continued support.
Chapter 2
Probability Theory
2.1 Introduction
Essential concepts of probability theory needed in this text are reviewed
and are illustrated by examples. The review includes the concepts of events,
sample space, σ-field, measure, probability measure, probability space, condi-
tional probability, independence, random variable and vector, integral of random
variables, expectation, distribution, density, and characteristic functions, second
moment properties, convergence of sequences of random variables, conditional
expectation, and martingales. The readers familiar with these concepts can skip
this chapter entirely. However, some of those readers may benefit from using this
chapter as a summary of facts and examples needed in the rest of the book.
Example 2.1: Consider the experiment of rolling a die and two games associated with this experiment. In the first game, one loses $10 if ω ≤ 3 and wins $10 if ω > 3. In the second game, one loses $10 if ω ≤ 3 and wins $5, $10, or $15 if ω = 4, 5, or 6, respectively. The relevant information for the first and second games is given by the collections of subsets A_1 = {{1, 2, 3}, {4, 5, 6}} and A_2 = {{1, 2, 3}, {4}, {5}, {6}} of Ω, respectively. Playing these games does not require knowing in the finest detail the outcome of each roll of the die. Coarser information suffices. To play the first game we need to know only whether ω ≤ 3 or ω > 3. The second game requires a more refined description of the outcomes of the experiment because it has more options of interest. ◊
Note: Similar considerations apply to the experiment of testing steel specimens for strength. Suppose that each steel specimen is subjected to the same force of magnitude x > 0 and the objective is to design a safe steel structure. There are two relevant outcomes for this "game", survival A = {ω ≥ x} and failure B = A^c. The precise value of the strength ω of a particular specimen is not essential. To assess the likelihood of survival of steel specimens subjected to action x it is sufficient to know whether ω < x or ω ≥ x. ▲
2.2.2 σ-field

Let F be a collection of subsets of Ω relevant to a particular experiment. It seems natural to require that F has at least two properties: (1) if A ∈ F, then the outcomes corresponding to the non-occurrence of A should also be in F, that is, A ∈ F implies A^c ∈ F, so that F is closed to complements, and (2) if A_i ∈ F, i ∈ I, that is, if the subsets A_i occur individually, then ∪_{i∈J} A_i should also be in F for any subset J of the index set I. These heuristic considerations are consistent with the following properties defining F.

1. Ω ∈ F.
2. A ∈ F ⟹ A^c ∈ F.
3. A_i ∈ F, i ∈ I, I = a countable set ⟹ ∪_{i∈I} A_i ∈ F.   (2.1)
Note: The first condition in the definition of the σ-field F can be replaced with ∅ ∈ F because F is closed to complements. The last conditions imply that countable intersections of events are events. We have ∪_{i∈I} A_i ∈ F for A_i ∈ F (condition 3) so that (∪_{i∈I} A_i)^c ∈ F (condition 2) and (∪_{i∈I} A_i)^c = ∩_{i∈I} A_i^c ∈ F by De Morgan's formulas.

If the last condition in the definition of a σ-field F is replaced with the requirement that F is closed under finite union, F is said to be a field. Hence, a σ-field is a field. However, a field may not be a σ-field. There is no difference between a field and a σ-field for finite sample spaces. ▲
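On a finite sample space the σ-field generated by a collection can be computed by brute force, closing the collection under complements and unions until nothing new appears. A minimal sketch (the function name is ours), applied to the two collections of Example 2.1:

```python
def generate_sigma_field(omega, collection):
    """Smallest sigma-field on the finite sample space omega containing every
    set in collection: close repeatedly under complements and pairwise unions
    (enough on a finite space) until the collection stabilizes."""
    omega = frozenset(omega)
    sigma = {frozenset(), omega} | {frozenset(a) for a in collection}
    while True:
        new = {omega - a for a in sigma}                  # complements
        new |= {a | b for a in sigma for b in sigma}      # pairwise unions
        if new <= sigma:
            return sigma
        sigma |= new

# the two collections of Example 2.1
omega = {1, 2, 3, 4, 5, 6}
F1 = generate_sigma_field(omega, [{1, 2, 3}, {4, 5, 6}])
F2 = generate_sigma_field(omega, [{1, 2, 3}, {4}, {5}, {6}])
print(len(F1), len(F2))  # 4 16: the second game needs a finer sigma-field
```

The result illustrates the remark above that on a finite sample space a field is automatically a σ-field, so closure under complements and finite unions suffices.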
The members of F are called events, or F-measurable subsets of Ω, or just measurable subsets of Ω if there is no confusion regarding the reference σ-field. The pair (Ω, F) is said to be a measurable space.
Example 2.2: The collections A_i, i = 1, 2, defined in Example 2.1 are not σ-fields. However, each can be enlarged to a σ-field F_i = σ(A_i) by the following construction. ◊

For any collection A of subsets of Ω, let

σ(A) = ∩_{G ⊇ A} G,   (2.2)

where the intersection is over the σ-fields G on Ω that contain A. Then σ(A) is a unique σ-field, called the σ-field generated by A. There is no σ-field smaller than σ(A) that includes A.
• The Borel σ-fields on ℝ^d, d > 1, and ℝ are generated by the intervals in these spaces and are denoted by B(ℝ^d) = B^d and B = B^1 = B(ℝ), respectively. The Borel σ-fields on the intervals [a, b], [a, b), (a, b], and (a, b) of the real line are denoted by B([a, b]), B([a, b)), B((a, b]), and B((a, b)), respectively.
Note: The Borel σ-field constitutes a special case of σ(A) in Eq. 2.2 that is generated by the collection of open sets D of a space X. This collection is a topology on X if it (1) contains the empty set and the entire space, (2) is closed to finite intersections, and (3) is closed to arbitrary unions. There are notable similarities between a topology D and a σ-field F defined on a space X. Both D and F include the entire space and the empty set. However, D is closed to arbitrary unions and finite intersections, while F is closed under complements and countable unions and intersections.

If X = ℝ, B can be generated by any of the intervals (a, b), (a, b], [a, b), or [a, b] ([151], Section 1.7, p. 17), that is,

B = σ((a, b), −∞ ≤ a ≤ b ≤ +∞) = σ([a, b), −∞ < a ≤ b ≤ +∞)
  = σ([a, b], −∞ < a ≤ b < +∞) = σ((−∞, x], x ∈ ℝ) = σ(open subsets of ℝ).

The Borel σ-field B^d, d > 1, can be generated, for example, by the open intervals ×_{i=1}^d (a_i, b_i) of ℝ^d. ▲
Example 2.3: Let A be a subset of Ω. The σ-field generated by A is F = {∅, A, A^c, Ω}. The σ-fields F_i in Example 2.2 coincide with the σ-fields generated by the collections of events A_i. ◊
Example 2.5: The intervals (a, b], [a, b), and [a, b], a < b, are Borel sets although they are not open intervals, that is, they are not members of the usual topology D on the real line. Singletons, that is, isolated points of the real line, are also Borel sets. ◊

Proof: The intervals (a, b + 1/n), (a − 1/n, b), and (a − 1/n, b + 1/m) are open intervals that are in B for each n, m = 1, 2, …, so that ∩_{n,m≥1} (a − 1/n, b + 1/m) = [a, b] is in B. Similar calculations show that [a, b) and (a, b] are Borel sets. That singletons are Borel sets follows from the equality {a} = ∩_{n,m≥1} (a − 1/n, a + 1/m), since (a − 1/n, a + 1/m) ∈ B holds for each n, m, and a ∈ ℝ. ∎
that is, the volume of the interval [a_1, b_1] × ⋯ × [a_d, b_d].
2. The counting measure is used later in the book in conjunction with the Poisson process. Suppose that F is the power set of the sample space Ω, that is, the collection of all subsets of Ω. The counting measure of A ∈ F is the cardinality of A if A is finite and infinity otherwise.

3. A σ-finite measure has the property that for every A ∈ F there exists a sequence of disjoint events A_i, i = 1, 2, …, such that μ(A_i) < ∞ for every i, Ω = ∪_i A_i, and μ(A) = Σ_{i=1}^∞ μ(A ∩ A_i). For example, the Lebesgue measure λ on ℝ is σ-finite because λ(A) = Σ_{i∈ℤ} λ(A ∩ [i, i + 1)), where A ∈ B, ℤ denotes the set of integers, and λ = λ^1 is the Lebesgue measure on the real line.

4. A finite measure is a measure with the property μ(Ω) < ∞. The scaled version of this measure, μ(A)/μ(Ω), takes values in [0, 1].
A probability measure P on a measurable space (Ω, F) is a set function P : F → [0, 1] such that

1. P(Ω) = 1,
2. P(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ P(A_i) for disjoint events A_i ∈ F.   (2.4)
Note: A probability space can be obtained from a measure space (Ω, F, μ) with a finite measure μ by setting P(·) = μ(·)/μ(Ω). A set N in F with probability zero is called a null set. A property that is true on Ω \ N is said to hold almost everywhere (a.e.), almost surely (a.s.), for almost every ω, or with probability one (w.p.1). Because of the second condition in Eq. 2.4, we say that the probability measure is countably additive.

The definition of P is consistent with our intuition. Let A and B be two events associated with an experiment defining Ω and F. Suppose that the experiment is performed n times and denote by n_A and n_B the number of outcomes in which A and B are observed, respectively. For example, n_A and n_B may denote the number of outcomes {ω ≤ 3} and {ω > 3} when a die is rolled n times. The probabilities of A and B are given by the limits as n → ∞ of the relative frequencies of occurrence, n_A/n and n_B/n, of the events A and B. The relative frequencies have the properties (1) P(Ω) = lim_{n→∞}(n_Ω/n) = 1 since n_Ω = n for each n and (2) if A and B are disjoint events, P(A ∪ B) = lim_{n→∞}((n_A + n_B)/n) = lim_{n→∞}(n_A/n) + lim_{n→∞}(n_B/n) = P(A) + P(B), consistent with Eq. 2.4.

We also note that Eqs. 2.3 and 2.4 are meaningful because F is closed to countable unions. Also, A ∈ F implies that A^c is in the domain of P since F is a σ-field. ▲
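The relative-frequency interpretation just described is easy to reproduce numerically. A small sketch (the function name is ours) for the die-rolling experiment:

```python
import random

def relative_frequency(event, n_rolls, seed=2):
    """Fraction of n_rolls fair-die rolls whose outcome falls in event."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 6) in event for _ in range(n_rolls)) / n_rolls

A = {1, 2, 3}                      # the event {w <= 3} of the text
for n in (100, 10000, 100000):
    print(n, relative_frequency(A, n))
# the relative frequencies approach P(A) = 1/2 as n grows
```

With a fixed seed, the same sequence of rolls is used for every event, so the frequencies of the disjoint events {ω ≤ 3} and {ω > 3} add exactly to one, mirroring the additivity property (2) above.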
We assume throughout the book that all probability spaces are complete. A probability space (Ω, F, P) is complete if for every A ⊂ B such that B ∈ F and P(B) = 0 we have A ∈ F so that P(A) = 0 (Eq. 2.5). The following result shows that this assumption is not restrictive.
For any probability space (Ω, F, P) there exists a complete probability space (Ω, F̄, P̄) such that F ⊆ F̄ and P̄ = P on F ([40], Theorem 2.2.5, p. 29).
P(A) ≤ P(B), A ⊆ B, A, B ∈ F.
P(A) = 1 − P(A^c), A ∈ F.
P(A ∪ B) = P(A) + P(B) − P(A ∩ B), A, B ∈ F.   (2.5)
Proof: The first relationship follows since B = A ∪ (B \ A), A and B \ A are disjoint events, B \ A = B ∩ A^c, and the probability measure is a countably additive positive set function.

We have 1 = P(Ω) = P(A ∪ A^c) = P(A) + P(A^c) by Eq. 2.4. This also shows that the probability of the impossible event is zero because P(∅) = 1 − P(Ω) = 0.

The last relationship results from Eq. 2.4 and the observations that A ∪ B is the union of the disjoint sets {A ∩ B^c, A ∩ B, A^c ∩ B}, A is the union of the disjoint sets {A ∩ B^c, A ∩ B}, and B is the union of the disjoint sets {A^c ∩ B, A ∩ B}. ∎
If the events A_1, …, A_n ∈ F partition Ω, that is, A_i ∩ A_j = ∅ for i ≠ j and Ω = ∪_{i=1}^n A_i, then P(B) = Σ_{i=1}^n P(B ∩ A_i) for every B ∈ F (the total probability formula).

Proof: This equality follows from Eq. 2.4 since Ω = ∪_{i=1}^n A_i, A_i ∈ F, A_i ∩ A_j = ∅, i ≠ j, and B = ∪_i (B ∩ A_i) is a union of disjoint events. ∎
If A_i ∈ F, i = 1, 2, …, then

P(∪_{i=1}^∞ A_i) ≤ Σ_{i=1}^∞ P(A_i).   (2.7)

Proof: Since

∪_{i=1}^∞ A_i = A_1 ∪ (A_1^c ∩ A_2) ∪ (A_1^c ∩ A_2^c ∩ A_3) ∪ ⋯

and the events A_1, A_1^c ∩ A_2, A_1^c ∩ A_2^c ∩ A_3, … are disjoint, we have

P(∪_{i=1}^∞ A_i) = P(A_1) + P(A_1^c ∩ A_2) + P(A_1^c ∩ A_2^c ∩ A_3) + ⋯,

which gives Eq. 2.7 by using the first relation in Eq. 2.5. For example, P(A_1^c ∩ A_2) ≤ P(A_2), P(A_1^c ∩ A_2^c ∩ A_3) ≤ P(A_3), and so on. A set function P satisfying Eq. 2.7 is said to be subadditive. ∎
If A_i ∈ F, i = 1, …, n, then

P(∪_{i=1}^n A_i) = Σ_{i=1}^n P(A_i) − Σ_{i=2}^n Σ_{j=1}^{i−1} P(A_i ∩ A_j)
  + Σ_{i=3}^n Σ_{j=2}^{i−1} Σ_{k=1}^{j−1} P(A_i ∩ A_j ∩ A_k) − ⋯ + (−1)^{n+1} P(∩_{q=1}^n A_q).   (2.8)

Proof: This formula, the inclusion–exclusion formula, extends the last equation in Eq. 2.5 and results from a repeated application of this equation. For example, the probability P(A_1 ∪ A_2 ∪ A_3) is equal to the probability of the union of A_1 ∪ A_2 and A_3, so that P(A_1 ∪ A_2 ∪ A_3) = P(A_1 ∪ A_2) + P(A_3) − P((A_1 ∪ A_2) ∩ A_3); expanding the terms on the right by Eq. 2.5 gives Eq. 2.8 for n = 3. ∎
Example 2.7: Let F_i be the failure event of component i of a series system with n components, that is, a system that fails if at least one of its components fails. The system probability of failure is P_f = P(∪_{i=1}^n F_i). If the events F_i are disjoint, the failure probability is P_f = Σ_{i=1}^n P(F_i). Otherwise, Σ_{i=1}^n P(F_i) gives an upper bound on P_f. ◊
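For a small finite sample space, Eq. 2.8 can be checked directly. The sketch below (the function names are ours) compares the alternating sum over all m-fold intersections against the probability of the union computed outright:

```python
from itertools import combinations
from fractions import Fraction

def prob(event, omega):
    """Probability of event under the uniform measure on the finite set omega."""
    return Fraction(len(event & omega), len(omega))

def union_prob(events, omega):
    """P(A_1 u ... u A_n) via the inclusion-exclusion formula (Eq. 2.8):
    alternating sum of the probabilities of all m-fold intersections."""
    total = Fraction(0)
    for m in range(1, len(events) + 1):
        for family in combinations(events, m):
            inter = omega
            for a in family:
                inter = inter & a
            total += (-1) ** (m + 1) * prob(inter, omega)
    return total

omega = frozenset(range(1, 7))                       # one roll of a fair die
A = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})]
print(union_prob(A, omega), prob(frozenset().union(*A), omega))  # 2/3 2/3
```

Exact rational arithmetic (`Fraction`) avoids any floating-point ambiguity when checking that the two computations agree.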
Example 2.8: The probability of failure P_f = P(∪_{i=1}^n F_i) of the series system in the previous example can be calculated exactly by the inclusion–exclusion formula. However, the use of this formula is prohibitive for large values of n or even impossible if the probability of the events F_{i_1} ∩ ⋯ ∩ F_{i_m} is not known for m ≥ 2. The calculation of the bounds

P_f ≤ P_{f,u} = Σ_{i=1}^n P(F_i) − Σ_{i=2}^n max_{j=1,…,i−1} P(F_j ∩ F_i) and
P_f ≥ P_{f,l} = P(F_1) + Σ_{i=2}^n max(0, P(F_i) − Σ_{j=1}^{i−1} P(F_i ∩ F_j))

requires only the probabilities of the individual failure events and of their pairwise intersections. If P(F_i) = p for each i and P(F_i ∩ F_j) = p², i ≠ j, the bounds become

P_{f,u} = n p − (n − 1) p² and P_{f,l} = p + Σ_{i=2}^n max(0, p − (i − 1) p²).

These bounds are shown in Fig. 2.1 for p = 0.1 as a function of the system size n. The bounds are relatively wide and deteriorate as n increases. ◊
[Figure 2.1: the bounds P_{f,l} and P_{f,u} as functions of the system size n for p = 0.1.]
Proof: The equality P(F_i ∩ F_j) = p², i ≠ j, is valid if F_i and F_j are independent events, as we will see later in this section (Eq. 2.54). We have (Eq. 2.7)
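The bounds of Example 2.8 are straightforward to evaluate. A short sketch (the function name is ours), specialized as in the text to P(F_i) = p and P(F_i ∩ F_j) = p² for i ≠ j:

```python
def series_system_bounds(n, p):
    """Bounds of Example 2.8 for a series system with n components when
    P(F_i) = p and P(F_i ∩ F_j) = p**2 for i != j (assumed, as in the text)."""
    upper = n * p - (n - 1) * p ** 2
    lower = p + sum(max(0.0, p - (i - 1) * p ** 2) for i in range(2, n + 1))
    return lower, upper

p = 0.1
for n in (2, 10, 50):
    lower, upper = series_system_bounds(n, p)
    print(n, round(lower, 4), round(upper, 4))
# for n = 10 the bounds are 0.55 and 0.91; they widen as n grows, and the
# upper bound eventually exceeds 1, illustrating how the bounds deteriorate
```

The widening gap between the two bounds with increasing n is the behavior Fig. 2.1 displays.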
Proof: The set function P is positive for any A ∈ F, P(Ω) = Σ_{i=1}^∞ p_i = 1 by definition, and is countably additive since

(2.11)
Product σ-field:

F = F_1 × F_2 = σ(R), where   (2.12)

R = {A_1 × A_2 : A_1 ∈ F_1, A_2 ∈ F_2} = measurable rectangles.   (2.13)
P(A_1 × A_2) = P_1(A_1) P_2(A_2), A_1 ∈ F_1, A_2 ∈ F_2.   (2.14)
Note: The product sample space Ω contains the outcomes of both experiments generating the sample spaces Ω_1 and Ω_2. Let

G_1 = {A_1 × Ω_2 : A_1 ∈ F_1} and G_2 = {Ω_1 × A_2 : A_2 ∈ F_2}

be two collections of subsets of Ω. These collections are σ-fields on Ω, are included in F, and F = σ(R) = σ(G_1, G_2) since every member of R is the intersection of sets from G_1 and G_2.

The construction of the product probability measure is less simple. It can be shown that there exists a unique probability P on (Ω, F) that satisfies Eq. 2.14 ([40], Theorem 3.3.5, p. 59). ▲
Note: The sample space Ω corresponds to the experiment of rolling two dice. The product σ-field consists of all subsets of Ω since the members of R are {(i, j)}, ∪_{i∈I_1} {(i, j)}, ∪_{j∈I_2} {(i, j)}, and ∪_{i∈I_1, j∈I_2} {(i, j)}, where I_1, I_2 ⊆ {1, 2, 3, 4, 5, 6}. The product probability measure can be defined for each outcome ω = (i, j) and is P({ω}) = 1/36 because the members of Ω are equally likely. Also, P({ω}) = P({(i, j)}) is equal to P_1({i}) P_2({j}) = (1/6)(1/6). ▲
Example 2.10: Consider the same experiment as in the previous example but assume that the σ-fields on Ω_1 = Ω_2 = {1, 2, 3, 4, 5, 6} are

F_1 = {A_1 = {1, 2}, A_1^c, ∅, Ω_1} and F_2 = {A_2 = {1, 2, 3}, A_2^c, ∅, Ω_2}.
Ω = Ω_1 × ⋯ × Ω_n,
F = F_1 × ⋯ × F_n,
P = P_1 × ⋯ × P_n.   (2.15)
Example 2.11: A loaded coin with sides {1} and {0} and probabilities p ∈ (0, 1) and q = 1 − p, respectively, is tossed n times. The probability space for a single toss is Ω = {0, 1}, F = {∅, Ω, {0}, {1}}, P({1}) = p, and P({0}) = q. The corresponding elements of the product probability space for n tosses are

Ω^n = {ω = (ω_1, …, ω_n) : ω_i = 0 or 1},
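The product measure for repeated coin tosses can be checked numerically. The sketch below (names are ours) verifies that the product probabilities sum to one over Ω^n and that grouping the outcomes by their number of ones recovers the binomial distribution:

```python
from itertools import product
from math import comb

def product_prob(outcome, p):
    """Probability of one toss sequence under the product measure
    P = P_1 x ... x P_n with P({1}) = p and P({0}) = q = 1 - p."""
    result = 1.0
    for w in outcome:
        result *= p if w == 1 else 1.0 - p
    return result

n, p = 4, 0.3
omega_n = list(product((0, 1), repeat=n))       # the product sample space
total = sum(product_prob(w, p) for w in omega_n)
print(total)                                    # 1.0 up to rounding

# grouping outcomes by their number of ones recovers the binomial law
for k in range(n + 1):
    mass = sum(product_prob(w, p) for w in omega_n if sum(w) == k)
    print(k, mass, comb(n, k) * p**k * (1 - p)**(n - k))
```

Each toss sequence has probability p^k q^(n−k) when it contains k ones, and there are C(n, k) such sequences, which is the grouping displayed above.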
If (1) C is a field on Ω and (2) R is a real-valued, positive, and countably additive function defined on C such that R(Ω) = 1, then there exists a unique probability P on F = σ(C) such that P(A) = R(A) for each A ∈ C, that is, the restriction of P to C is equal to R ([66], Theorem 14, p. 94).
Example 2.12: Let Ω = ℝ and let C be the collection of all finite unions of intervals of the type (a, b] for a < b, (−∞, a], (a, ∞), and (−∞, ∞), to which we add the empty set. Let F : ℝ → [0, 1] be a continuous increasing function such that lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1. Define R : C → [0, 1] by R((a, b]) = F(b) − F(a), R((−∞, a]) = F(a), R((a, ∞)) = 1 − F(a), and R((−∞, ∞)) = 1. This set function can be extended uniquely to a probability measure on (ℝ, B). ◊
Note: The collection of subsets C is a field and σ(C) = B. The set function R is well defined because, for example, R((a, b] ∪ (b, c]) = R((a, c]) and R(ℝ) = 1, and it is finitely additive. Moreover, R is countably additive ([66], Proposition 9, p. 90). The above theorem implies the stated result. ▲
The conditional probability of an event A ∈ F given an event B ∈ F with P(B) > 0 is

P(A | B) = P(A ∩ B)/P(B).   (2.16)

Note: The set function P(· | B) is a probability because it is defined on F, takes positive values, P(Ω | B) = P(B)/P(B) = 1, and is countably additive, that is,

P(∪_{i∈I} A_i | B) = P((∪_{i∈I} A_i) ∩ B)/P(B) = P(∪_{i∈I} (A_i ∩ B))/P(B)
  = Σ_{i∈I} P(A_i ∩ B)/P(B) = Σ_{i∈I} P(A_i | B)

for any countable set I and disjoint events A_i, i ∈ I. ▲
If the events A_1, …, A_n ∈ F partition Ω and P(B) > 0, the total probability formula gives

P(B) = Σ_{i=1}^n P(B | A_i) P(A_i),

and the Bayes formula:

P(A_i | B) = P(B | A_i) P(A_i) / Σ_{j=1}^n P(B | A_j) P(A_j).
Example 2.13: Consider the experiment of rolling two dice. The sample space, the σ-field F, and the probability measure are Ω = {ω = (i, j) : i, j = 1, …, 6}, the collection of all subsets of Ω, and P({ω}) = 1/36, respectively. Consider the events A_1 = {(6, 2)} and A_2 = {(6, 2), (4, 4), (1, 6)}. The probabilities of these events given that B = {ω ∈ Ω : i + j = 8} has occurred are P(A_1 | B) = 1/5 and P(A_2 | B) = 2/5. ◊
Proof: The event B is {(6, 2), (5, 3), (4, 4), (3, 5), (2, 6)}. The probability of A_1 conditional on B is P(A_1 | B) = 1/5 since A_1 is one of the five equally likely members of B. The same result can be obtained from Eq. 2.16 since P(A_1 ∩ B) = P(A_1) = 1/36 and P(B) = 5/36. The conditional probability of A_1 given B^c is zero. Hence, the probability of occurrence of A_1 is 1/5 if B has occurred and zero if B^c has occurred. The conditional probabilities of A_2 under B and B^c are P(A_2 | B) = (2/36)/(5/36) = 2/5 and P(A_2 | B^c) = (1/36)/(31/36) = 1/31 since P(B) = 5/36 and P(B^c) = 1 − P(B) = 31/36. We will see in Example 2.86 that the probabilities P(A_2 | B), P(A_2 | B^c), P(B), and P(B^c) are sufficient to specify the conditional probability P(A_2 | G), where G = {∅, Ω, B, B^c} is the σ-field generated by B. ∎
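The computations in this example amount to counting outcomes, so they can be reproduced with exact rational arithmetic. A small sketch (names are ours):

```python
from fractions import Fraction

omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]   # 36 outcomes

def P(E):
    """Uniform measure on the two-dice sample space."""
    return Fraction(len(set(E)), len(omega))

def cond(A, B):
    """P(A | B) = P(A ∩ B) / P(B) (Eq. 2.16)."""
    return P(set(A) & set(B)) / P(B)

A1 = {(6, 2)}
A2 = {(6, 2), (4, 4), (1, 6)}
B = {w for w in omega if w[0] + w[1] == 8}

print(cond(A1, B), cond(A2, B), cond(A2, set(omega) - B))  # 1/5 2/5 1/31
```

The three printed values are exactly the conditional probabilities derived in the proof above.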
Example 2.14: Let X be the unknown strength of a physical system, that is, the maximum load that the system can sustain. Suppose that the system survives without any damage if subjected to a proof load test of intensity x_pr. Consider the events A = {X > x} and B = {X > x_pr} that the system strength exceeds x and x_pr, respectively. The probability that the system can sustain a load x given that B has occurred,

P_B(x) = P(X > max{x_pr, x}) / P(X > x_pr),

is larger than the system reliability P(X > x) calculated when the information provided by the proof load test is not available or is ignored. ◊
The limits superior and inferior of a sequence of events A_i ∈ F are defined by

lim sup_i A_i = ∩_{j=1}^∞ ∪_{i=j}^∞ A_i,   (2.19)

lim inf_i A_i = ∪_{j=1}^∞ ∩_{i=j}^∞ A_i.   (2.20)

Note: The limits in Eqs. 2.19–2.20 are subsets of Ω. The sequences of events B_j = ∪_{i=j}^∞ A_i and C_j = ∩_{i=j}^∞ A_i are monotone: B_j is a decreasing sequence, that is, B_j ⊇ B_{j+1}, while C_j is an increasing sequence, that is, C_j ⊆ C_{j+1}, for all j. These properties are denoted by B_j ↓ and C_j ↑, respectively. ▲
The limits in Eqs. 2.19 and 2.20 have some notable properties.
Proof: Recall that the function 1_A : Ω → {0, 1}, called the indicator function, is defined by 1_A(ω) = 1 for ω ∈ A and 1_A(ω) = 0 for ω ∉ A.
If ω ∈ lim sup_i A_i, then ω ∈ B_j = ∪_{i=j}^∞ A_i for every j, so that there exists i_j ≥ j such that ω ∈ A_{i_j}, j = 1, 2, …, and Σ_{i=1}^∞ 1_{A_i}(ω) ≥ Σ_j 1_{A_{i_j}}(ω) = ∞, implying that ω ∈ {ω : Σ_{i=1}^∞ 1_{A_i}(ω) = ∞}. Therefore, we have lim sup_i A_i ⊆ {ω : Σ_{i=1}^∞ 1_{A_i}(ω) = ∞}.

If ω ∈ {ω : Σ_{i=1}^∞ 1_{A_i}(ω) = ∞}, there exist i_j → ∞, j = 1, 2, …, such that ω ∈ A_{i_j}. Hence, ω ∈ ∪_{k≥j} A_k for all j, so that ω ∈ lim sup_i A_i, or {ω : Σ_{i=1}^∞ 1_{A_i}(ω) = ∞} ⊆ lim sup_i A_i ([151], Lemma 1.3.1, p. 6). ∎
lim inf_i A_i = {ω : Σ_{i=1}^∞ 1_{A_i^c}(ω) < ∞} = {ω : ω ∈ A_i, ∀ i ≥ j_0(ω)}.   (2.22)
Proof: If ω ∈ lim inf_i A_i, there exists j_0(ω) such that ω ∈ C_{j_0} = ∩_{i=j_0}^∞ A_i, that is, ω belongs to all sets A_i for i ≥ j_0(ω), so that ω can fail to belong only to some of the sets A_1, …, A_{j_0(ω)} and Σ_{i=1}^∞ 1_{A_i^c}(ω) is finite.

If ω is such that Σ_{i=1}^∞ 1_{A_i^c}(ω) is finite, then ω does not belong to only a finite number of subsets A_i. Hence, there exists j_0(ω) such that ω ∈ A_i for i ≥ j_0, or ω ∈ ∩_{i=j_0}^∞ A_i. ∎
Proof: If ω ∈ lim inf_i A_i, there exists j_0 such that ω ∈ C_{j_0}, that is, ω ∈ A_i for i ≥ j_0. Hence, ω ∈ A_i infinitely often, that is, ω ∈ lim sup_i A_i. De Morgan's law (∪_{j=1}^∞ ∩_{i=j}^∞ A_i)^c = ∩_{j=1}^∞ ∪_{i=j}^∞ A_i^c yields Eq. 2.23. ∎
If the limits in Eqs. 2.19 and 2.20 coincide, the sequence A_i is said to be a convergent sequence with the limit

lim_i A_i = lim sup_i A_i = lim inf_i A_i.

Proof: We need to show that lim sup A_i and lim inf A_i are equal. If A_i ↑, then B_j = ∪_{i≥j} A_i = ∪_{i≥1} A_i so that lim sup A_i = ∪_{i≥1} A_i. The sequence C_j = ∩_{i≥j} A_i in Eq. 2.20 is A_j so that lim inf A_i = ∪_{i≥1} A_i = lim A_i. Similar considerations give the limit of a decreasing sequence of subsets. ∎
since P is a countably additive function. By the first property in Eq. 2.5, the numerical sequence P(A_n) is increasing. Similar considerations apply for the case A_i ↓. If A_i is a decreasing sequence such that ∩_{i=1}^∞ A_i = ∅, we have lim_{i→∞} P(A_i) = P(∩_{i=1}^∞ A_i) = P(∅) = 0. ∎
lim_{i→∞} P(A_i) = P(lim_{i→∞} A_i) = P(A),   (2.25)
that is, for a convergent sequence of events, probability and limit operations
can be interchanged.
Proof: A direct consequence of Eq. 2.25 is that for any sequence of events A_i the equalities

P(lim sup A_i) = lim_{j→∞} P(B_j) = lim_{j→∞} P(∪_{i=j}^∞ A_i) and
P(lim inf A_i) = lim_{j→∞} P(C_j) = lim_{j→∞} P(∩_{i=j}^∞ A_i)

hold because B_j = ∪_{i=j}^∞ A_i and C_j = ∩_{i=j}^∞ A_i are decreasing and increasing sequences, respectively, lim sup A_i = lim B_j, and lim inf A_i = lim C_j.

Since A_i is a convergent sequence of events, the decreasing and increasing sequences D_j = sup_{i≥j} A_i and E_j = inf_{i≥j} A_i, respectively, converge to A. The inequalities P(E_j) ≤ P(A_j) ≤ P(D_j) hold for each j because E_j ⊆ A_j ⊆ D_j. By the continuity of the probability measure, P(E_j) and P(D_j) converge to P(A) as j → ∞, so that the limit of P(A_j) is P(A). ∎
Proof: Because lim sup_i A_i and lim inf_i A_i are events, as countable unions and intersections of events (Eqs. 2.19 and 2.20), we can calculate P(lim sup_i A_i) and P(lim inf_i A_i). The inequality P(lim inf_i A_i) ≤ P(lim sup_i A_i) follows from Eqs. 2.5 and 2.23. Because P(A_i) is a numerical sequence, it satisfies the inequality lim inf_i P(A_i) ≤ lim sup_i P(A_i). The proof of the remaining inequalities is left to the reader.

If the sequence A_i is convergent, lim sup_i A_i = lim inf_i A_i = lim_i A_i, so that the left and right terms of Eq. 2.26 coincide. This observation provides an alternative proof for Eq. 2.25. ∎
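For an eventually periodic sequence of events on a finite sample space, the tails B_j = ∪_{i≥j} A_i and C_j = ∩_{i≥j} A_i stabilize, so lim sup and lim inf can be computed with a finite horizon. A small sketch (names are ours):

```python
def limsup_liminf(A, horizon):
    """Compute lim sup A_i = intersection_j B_j with B_j = union_{i >= j} A_i,
    and lim inf A_i = union_j C_j with C_j = intersection_{i >= j} A_i.
    Tails are truncated at horizon; this is exact once A_i is periodic."""
    def B(j):
        out = set()
        for i in range(j, horizon):
            out |= set(A(i))
        return out
    def C(j):
        out = set(A(j))
        for i in range(j, horizon):
            out &= set(A(i))
        return out
    limsup, liminf = B(0), set()
    for j in range(horizon // 2):      # stay inside the sampled tails
        limsup &= B(j)
        liminf |= C(j)
    return limsup, liminf

# A_i alternates between {1, 2} and {2, 3}: the outcomes 1 and 3 occur
# infinitely often but not eventually; only 2 belongs to all but finitely
# many A_i, so lim sup = {1, 2, 3} and lim inf = {2}.
A = lambda i: {1, 2} if i % 2 == 0 else {2, 3}
print(limsup_liminf(A, horizon=50))
```

This makes concrete the characterizations above: lim sup collects outcomes occurring infinitely often, lim inf those occurring in all but finitely many of the A_i.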
If A_i ∈ F, i = 1, 2, …, and Σ_{i=1}^∞ P(A_i) < ∞, then P(lim sup_i A_i) = 0 (the Borel–Cantelli lemma).

Proof: We have

P(lim sup_i A_i) ≤ P(∪_{i=j}^∞ A_i) ≤ Σ_{i=j}^∞ P(A_i)

by the monotonicity and subadditivity of the probability measure. The extreme right term in these inequalities converges to zero as j → ∞ because it is assumed that Σ_{i=1}^∞ P(A_i) < ∞. Hence, P(lim sup_i A_i) = 0. ∎
Note: It is common to indicate this property of h by the notation h : (Ω, F) → (Ψ, G), h ∈ F/G, or even h ∈ F provided that there can be no confusion about the σ-field G. If the σ-fields F and G are not in doubt, it is sufficient to say that h is measurable. ▲
2.4.1 Properties
If h : (Ω, F) → (Ψ, G) is measurable and P is a probability measure on (Ω, F), then Q : G → [0, 1], defined by Q(B) = P(h⁻¹(B)), B ∈ G, is a probability measure on (Ψ, G).
Proof: The function Q has the properties Q(Ψ) = 1 by definition and Q(∪_{i∈I} B_i) = Σ_{i∈I} Q(B_i) for any countable set I and disjoint events B_i ∈ G, since the sets h⁻¹(B_i) are disjoint events in F and P is a probability. Hence, the set function Q is a probability measure on (Ψ, G). •
Proof: Let O_d denote the topology on R^d generated by the open intervals of R^d. Because g is a continuous function, we have g⁻¹(D) ∈ O_d for all D ∈ O_q. This function is measurable since the Borel fields B^d and B^q are generated by the open sets in R^d and R^q, respectively. •
σ(X) = X⁻¹(B^d) = {X⁻¹(B) : B ∈ B^d} (2.30)
Proof: The collection σ(X) of subsets of Ω is a σ-field because (1) R^d ∈ B^d so that Ω = X⁻¹(R^d) ∈ σ(X), (2) if B ∈ B^d, then X⁻¹(B) ∈ σ(X) and (X⁻¹(B))^c = X⁻¹(B^c) ∈ σ(X), and (3) if B_i ∈ B^d, i ∈ I, we have X⁻¹(B_i) ∈ σ(X) and ∪_{i∈I} X⁻¹(B_i) = X⁻¹(∪_{i∈I} B_i) ∈ σ(X), where I is a countable index set.
If H is another σ-field relative to which X is measurable, this field must include the subsets X⁻¹(B^d) of Ω so that H includes σ(X). •
The mapping X : (Ω, F) → (R, B) is a random variable if and only if
X⁻¹((−∞, x]) ∈ F, x ∈ R (2.31)
([151], Corollary 3.2.1, p. 77).
Note: This result is used extensively in applications to determine whether a function is measurable and to find properties of random variables. ▲
The R^d-valued function X : (Ω, F) → (R^d, B^d) is a random vector if and only if its coordinates are random variables.
Example 2.20: Let Ω = [0, 1], F = B([0, 1]), and P(dω) = dω. If α and β > α are constants, the function ω ↦ X(ω) = α + (β − α) ω is a random variable on (Ω, F, P) since it is continuous. The distribution of X is Q((x₁, x₂]) = (x₂ − x₁)/(β − α). ◊
Proof: Take B = (x₁, x₂] in the range of X. The distribution Q((x₁, x₂]) is equal to the probability P((ω₁, ω₂]) of the interval X⁻¹((x₁, x₂]), where ω_i = (x_i − α)/(β − α). •
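Example 2.20 can be checked by simulation; in the sketch below the constants α, β and the interval (x₁, x₂] are assumptions chosen for illustration:

```python
import random

random.seed(1)
alpha, beta = 2.0, 5.0                  # illustrative constants with beta > alpha
X = lambda w: alpha + (beta - alpha) * w

x1, x2 = 2.6, 4.1
exact = (x2 - x1) / (beta - alpha)      # Q((x1, x2]) from the example, here 0.5

n = 200_000
hits = sum(x1 < X(random.random()) <= x2 for _ in range(n))
print(exact, hits / n)                  # the empirical frequency matches Q
```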
Proof: For B ∈ H, g⁻¹(B) is in G since g is measurable. We have h⁻¹(A) ∈ F for any A ∈ G since h is measurable, so that h⁻¹(g⁻¹(B)) ∈ F. Hence, g ∘ h is measurable from (Ω, F) to (Φ, H). •
e^{−λ X}, and e^{√−1 u X} are random variables on the same probability space. ◊
[Figure: the inverse image g⁻¹ maps a set in the real line back to a set in the real line.]
Example 2.23: If (X, Y) : (Ω, F) → (R², B²) and g : (R², B²) → (R, B) are measurable functions, then g(X, Y) ∈ F/B. For example, the functions X ∨ Y = max(X, Y), X ∧ Y = min(X, Y), X + Y, X − Y, X Y, and X/Y are measurable from (Ω, F) to (R, B). The transformation X/Y is defined for Y ≠ 0 ([40], Theorem 3.1.5, p. 36). ◊
{1, 2, …}. The function (m, ω) ↦ X_m(ω) depending on the arguments m and ω is measurable from (Z⁺ × Ω, K × F) to (Ψ, G). Generally, this property does not hold if the discrete index m in this example is allowed to take values in an uncountable set, as we will see in the next chapter. ◊
Proof: Let A = {(m, ω) : X_m(ω) ∈ B} be the inverse image of the function (m, ω) ↦ X_m(ω) in Z⁺ × Ω corresponding to an arbitrary member B of G. Because X_m is measurable, this set can be expressed as the countable union ∪_m {ω : X_m(ω) ∈ B} of sets {ω : X_m(ω) ∈ B} that are in F for each m ≥ 1. Hence, A is in K × F. We also note that the function m ↦ X_m(ω) is K-measurable for a fixed ω ∈ Ω since {m : X_m(ω) ∈ B} is a subset of Z⁺ so that it is in K. •
Example 2.26: Let M : (Ω, F) → (Z⁺, K) and X_m : (Ω, F) → (Ψ, G) for m ∈ Z⁺ be measurable functions, where K consists of all parts of Z⁺. The function ω ↦ X_{M(ω)}(ω) is measurable from (Ω, F) to (Ψ, G). ◊
Note: Because X takes a constant value x_i in A_i and the subsets A_i partition Ω, we have P(A_i) = P(X = x_i), where {X = x_i} is an abbreviated notation for the event {ω ∈ Ω : X(ω) = x_i}. ▲
The collection of simple random variables is a vector space. Moreover, if X and Y are two simple random variables defined on a probability space (Ω, F, P), then X Y, X ∧ Y = min(X, Y), and X ∨ Y = max(X, Y) are simple random variables on the same space ([151], Section 5.1).
1_A(ω) = 1 if ω ∈ A and 1_A(ω) = 0 if ω ∉ A is a simple random variable. (2.33)
Proof: Because 1_A⁻¹(B) = {ω : 1_A(ω) ∈ B} is ∅, A, A^c, and Ω if 0, 1 ∉ B; 1 ∈ B but 0 ∉ B; 0 ∈ B but 1 ∉ B; and 0, 1 ∈ B, respectively, we have 1_A⁻¹(B) ∈ F for any B ∈ B so that 1_A is F/B-measurable. •
X_n = Σ_{k=1}^{n 2^n} ((k − 1)/2^n) 1_{A_{n,k}} + n 1_{B_n} ≥ 0
of simple random variables, where A_{n,k} = {ω : (k − 1)/2^n ≤ X(ω) < k/2^n} and B_n = {ω : X(ω) ≥ n} (Fig. 2.3). The sequence X_n, n = 1, 2, …, is increasing, that is, X_n ≤ X_{n+1}, and its members are measurable functions that are smaller than X. If X(ω) < ∞, then |X_n(ω) − X(ω)| ≤ 2^{−n} → 0 as n → ∞. If X(ω) = +∞, then X_n(ω) = n so that it approaches infinity as n increases. •
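The dyadic approximation X_n above can be sketched for a single sample value X(ω); the choice X(ω) = π below is an assumption for illustration:

```python
import math

def dyadic_approx(x, n):
    """X_n(w): (k-1)/2^n on A_{n,k} = {(k-1)/2^n <= X < k/2^n} when X < n, and n on B_n."""
    if x >= n:
        return float(n)
    return math.floor(x * 2 ** n) / 2 ** n

x = math.pi                                     # a fixed sample value X(w)
approx = [dyadic_approx(x, n) for n in range(1, 20)]
print(approx[:5])                               # increasing toward x = pi
```

The sequence is increasing, stays below x, and for n > x the error is at most 2^{−n}, as in the proof.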
If X is a simple random variable (Eq. 2.32) such that |x_i| < ∞, i ∈ I, its expectation is
E[X] = Σ_{i∈I} x_i P(A_i). (2.34)
Note: The expectation of the indicator function in Eq. 2.33 is E[1_A] = P(A). ▲
Example 2.27: Consider a loaded die with m < ∞ sides showing the numbers x_i with probabilities p_i, i = 1, …, m. Suppose we roll the die n times and n_i ≥ 0 denotes the number of times we see side i. The arithmetic average, x̄ = Σ_{i=1}^m x_i (n_i/n), of the observed values approximates E[X] in Eq. 2.34 for large values of n because n_i/n converges to p_i as n → ∞. ◊
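A simulation sketch of Example 2.27 (the face values and probabilities of the loaded die below are assumptions): the arithmetic average of many rolls approaches E[X] computed from Eq. 2.34.

```python
import random

random.seed(2)
faces = [1, 2, 3, 4, 5, 6]
probs = [0.1, 0.1, 0.1, 0.2, 0.2, 0.3]           # an assumed loaded die
EX = sum(x * p for x, p in zip(faces, probs))    # E[X] by Eq. 2.34, here 4.2

n = 100_000
rolls = random.choices(faces, weights=probs, k=n)
avg = sum(rolls) / n                             # x-bar = sum_i x_i (n_i / n)
print(EX, avg)
```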
The following properties are direct consequences of the definition in Eq. 2.34.
Proof: Let X = Σ_{i∈I} x_i 1_{A_i} and Y = Σ_{j∈J} y_j 1_{B_j}, where I and J are finite sets. X ≥ 0 implies x_i ≥ 0 so that E[X] is positive (Eq. 2.34), that is, the first property. To prove the second property, note that α X + β Y is a finite-valued simple random variable corresponding to the partition A_i ∩ B_j of Ω ([151], pp. 120-121). If X ≤ Y, the first property and the fact that Y − X ≥ 0 is a simple random variable imply E[Y − X] ≥ 0 or E[Y] ≥ E[X] (Eq. 2.34). •
If X, Y, and Xn, n = 1, 2, ... , are positive random variables with range [0, oo],
then ([151], Section 5.2.3)
Note: The last property, called the monotone convergence theorem, gives conditions under which limits and expectations can be interchanged ([151], Section 5.2.3). A similar result is given in Eq. 2.36. ▲
• If X is an arbitrary random variable such that at least one of E[X⁺] and E[X⁻] is finite, we define
E[X] = E[X⁺] − E[X⁻]. (2.40)
• If E[X⁺] and E[X⁻] are both finite, then E[X] in Eq. 2.40 exists and is finite. We say that X has a finite expectation or is P-integrable.
• If both E[X⁺] and E[X⁻] are unbounded, E[X] does not exist.
Note: If the expectations E[X⁺] and E[X⁻] are unbounded, E[X] given by Eq. 2.40 does not exist. If E[X⁺] and E[X⁻] are finite, then both E[X] < ∞ and E[|X|] < ∞ since X = X⁺ − X⁻ and |X| = X⁺ + X⁻. If one of the expectations E[X⁺] and E[X⁻] is finite and the other infinite, E[X] is defined but is unbounded. For example, E[X] exists and E[X] = +∞ if E[X⁺] = +∞ and E[X⁻] is bounded. ▲
Note: The integrals in Eqs. 2.41 and 2.42 are special cases of the Lebesgue-Stieltjes integral ∫ h dμ, where h is a measurable function on a measure space (Ω, F, μ). The random variable X and the probability measure P in Eqs. 2.41 and 2.42 correspond to the measurable function h and the measure μ, respectively. Hence, ∫ X dP has properties similar to ∫ h dμ. ▲
E[X 1_A] = Σ_{ω∈A} (i² + j²) (1/36) = [(2)(40) + (2)(34) + 32] (1/36) = 5,
where A = {ω = (i, j) : i + j = 8} = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}. ◊
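The computation E[X 1_A] = 5 can be reproduced exactly, with X(ω) = i² + j² for a roll ω = (i, j) of two fair dice, as in the example:

```python
from fractions import Fraction

# Omega = {(i, j)}, P({w}) = 1/36, X(w) = i^2 + j^2, A = {i + j = 8};
# E[X 1_A] sums X over A only, reproducing the value 5 from the text.
E = sum(Fraction(i * i + j * j, 36)
        for i in range(1, 7) for j in range(1, 7) if i + j == 8)
print(E)  # 5
```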
Note: We have |E[X]| < ∞ if and only if E[|X|] < ∞ by setting A = Ω. This result is consistent with an earlier comment (Eq. 2.40). ▲
Note: If A = Ω, Eq. 2.43 yields E[a X + b Y] = a E[X] + b E[Y], showing that E is a linear operator in agreement with Eqs. 2.35 and 2.38. This property implies that ∫_A Σ_{i=1}^n a_i X_i dP = Σ_{i=1}^n a_i ∫_A X_i dP if the random variables X_i are P-integrable over A and the a_i denote constants. ▲
(2.44)
If Y ≤ X a.s. and the integrals ∫ X dP and ∫ Y dP exist ([40], Section 3.2), then ∫_A Y dP ≤ ∫_A X dP. (2.45)
Note: The inequalities above are direct consequences of Eq. 2.45. For example, a P(A) ≤ ∫_A X dP follows from Eq. 2.45 with Y = a and X ≥ a a.s. The modulus inequality can be obtained from Eq. 2.45 with (−|X|, X) and (X, |X|) in place of (Y, X). ▲
then lim_{n→∞} ∫_A X_n dP = ∫_A (lim_{n→∞} X_n) dP = ∫_A X dP. (2.47)
Note: Under the conditions of this theorem we can interchange the limit and the integration operations. The a.s. convergence of X_n to X means that the numerical sequence X_n(ω) converges to X(ω) as n → ∞ for each ω ∈ Ω \ N, where N ∈ F and P(N) = 0 (Section 2.13). An event N with this property is called a null set.
The statements in Eq. 2.47 corresponding to the conditions (1) |X_n| ≤ Y a.s. and ∫_A Y dP < ∞, (2) |X_n| ≤ c a.s., and (3) X_n ≥ 0 a.s. is an increasing sequence with +∞ being an allowed value are referred to as the dominated convergence, bounded convergence, and monotone convergence theorems, respectively. Slightly weaker versions of the dominated, bounded, and monotone convergence conditions under which Eq. 2.47 holds can be found in [40] (Section 3.2). ▲
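A discrete sketch of the monotone convergence statement (3): with X_n = X ∧ n increasing to X, the expectations E[X_n] increase to E[X]. The values and probabilities below are assumptions for illustration:

```python
# Monotone convergence on a finite discrete space: X_n = min(X, n) increases
# to X pointwise, and E[X_n] increases to E[X].
values = [0.5, 1.5, 3.0, 7.0, 20.0]     # assumed range of X
probs  = [0.3, 0.3, 0.2, 0.15, 0.05]    # assumed probabilities

def E(f):
    return sum(f(x) * p for x, p in zip(values, probs))

EX = E(lambda x: x)                                        # E[X] = 3.25 here
approx = [E(lambda x, n=n: min(x, n)) for n in range(1, 25)]
print(EX, approx[-1])                                      # limits agree
```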
If Σ_n ∫_A |X_n| dP < ∞, then integration can be performed term by term ([40], Section 3.2), that is,
∫_A (Σ_n X_n) dP = Σ_n ∫_A X_n dP. (2.48)
limsup_{n→∞} X_n = inf_{n≥1} sup_{k≥n} X_k = lim_{n→∞} ∨_{k≥n} X_k,
where ∧_{k≥n} X_k and ∨_{k≥n} X_k are increasing and decreasing sequences, respectively. Because ∨_{k≥n} X_k = −∧_{k≥n} (−X_k), we have limsup_{n→∞} X_n = −liminf_{n→∞} (−X_n). The above limits of X_n resemble the limits for sequences of subsets in Ω (Eqs. 2.19 and 2.20).
If the sequence of random variables X_n is such that |X_n| ≤ Z, where Z ≥ 0 a.s. and P-integrable over A, then
∫_A (liminf_n X_n) dP ≤ liminf_n ∫_A X_n dP ≤ limsup_n ∫_A X_n dP ≤ ∫_A (limsup_n X_n) dP. (2.50)
Example 2.29: The inequalities in Eq. 2.26 can be obtained from the Lebesgue theorem in Eq. 2.50 with A = Ω, X_n = 1_{A_n}, and A_n ∈ F. ◊
2.5.2.3 Expectation
• L₂ is a vector space.
• L₂ is a Hilbert space with the inner product (X, Y) = E[X Y] for X, Y ∈ L₂ and the norm ‖X‖_{L₂} = (E[X²])^{1/2}, referred to as the L₂-norm.
Proof: We need to show that (1) (X, Y) = E[X Y] is an inner product on L₂, (2) L₂ is a vector space, and (3) L₂ with the metric d(X, Y) = (X − Y, X − Y)^{1/2} = ‖X − Y‖_{L₂} is complete.
The expectation E[X Y] defines an inner product on L₂ because
(0, X) = 0, X ∈ L₂,
(X, X) > 0, X ∈ L₂, X ≠ 0,
(X, Y) = (Y, X), X, Y ∈ L₂,
(X + Y, Z) = (X, Z) + (Y, Z), X, Y, Z ∈ L₂,
λ (X, Y) = (λ X, Y), X, Y ∈ L₂, λ ∈ R.
The function d : L₂ × L₂ → [0, ∞) defined by d(X, Y) = ‖X − Y‖_{L₂} is a metric on L₂ because
d(X, Y) = 0 if and only if X = Y,
d(X, Y) = d(Y, X) for each X, Y ∈ L₂,
d(X, Y) ≤ d(X, Z) + d(Z, Y) for each X, Y, Z ∈ L₂.
The first condition in the above equation is not satisfied in a strict sense. The condition holds if we do not distinguish between random variables that differ on a set of probability zero.
That L₂ is a vector space follows from the properties (a) X ∈ L₂ and λ ∈ R imply λ X ∈ L₂ since E[(λ X)²] = λ² E[X²] < ∞ and (b) X, Y ∈ L₂ implies X + Y ∈ L₂ since E[(X + Y)²] = E[X²] + E[Y²] + 2 E[X Y], E[X²] and E[Y²] are finite, and |E[X Y]| < ∞ by the Cauchy-Schwarz inequality discussed later in this chapter (Eq. 2.112).
It remains to show that L₂ endowed with the metric (X, Y) ↦ d(X, Y) = ‖X − Y‖_{L₂} is complete, that is, that any sequence X_n ∈ L₂, n = 1, 2, …, with d(X_n, X_m) → 0 as n, m → ∞ is convergent and its limit is in L₂. The proof that L₂ is a Hilbert space can be found in [66] (Proposition 4, p. 399). We only show here that, if X, X_n ∈ L₂ and ‖X_n − X‖_{L₂} → 0 as n → ∞, that is, X_n converges in mean square (m.s.) to X, then ‖X_n − X_m‖_{L₂} → 0 as n, m → ∞, that is, X_n is a Cauchy sequence in the mean square sense (Section 2.13). Take ε > 0 and an index n̄ such that ‖X_n − X‖_{L₂} < ε/2 and ‖X_m − X‖_{L₂} < ε/2 for n, m ≥ n̄. This is possible since ‖X_n − X‖_{L₂} converges to zero as n → ∞. We have ‖X_n − X_m‖_{L₂} ≤ ‖X_n − X‖_{L₂} + ‖X_m − X‖_{L₂} < ε for n, m ≥ n̄, that is, ‖X_n − X_m‖_{L₂} converges to zero as n, m → ∞. •
Proof: The first part follows from (E[|X|])^p ≤ E[|X|^p], 1 < p < ∞, X ∈ L_p, derived from Hölder's inequality with Y = 1. This inequality is proved later in this chapter (Eq. 2.113). Hölder's inequality with |X|^q in place of X gives (E[|X|^q])^p ≤ E[|X|^{p q}], or (E[|X|^q])^{q'/q} ≤ E[|X|^{q'}] for p = q'/q ≥ 1. Hence, E[|X|^{q'}] < ∞ implies E[|X|^q] < ∞. •
Note: The random variable X̂ ∈ L₂(Ω, G, P) has the smallest mean square error of all members of L₂(Ω, G, P), and is called the best m.s. estimator of X. The second equality in the above equation can be used to calculate X̂ and shows that the error X − X̂ is orthogonal to L₂(Ω, G, P). ▲
2.7 Independence
We define independent σ-fields and apply this definition to events and random variables. We also discuss the Borel zero-one law giving the probability of the lim sup of independent events.
• If I is finite and
P(∩_{i∈I} A_i) = Π_{i∈I} P(A_i), ∀ A_i ∈ F_i, (2.52)
Note: The above condition implies that any sub-collection of events A_i ∈ F_i, i ∈ I, must satisfy Eq. 2.52 since some of the events A_i may coincide with the sample space Ω. This requirement is consistent with Eq. 2.54. If the σ-fields F_i, i ∈ I, are on different sample spaces, the above independence condition has to be applied on the corresponding product measure space. ▲
Example 2.31: The sub-σ-fields F₁ and F₂ generated by the collections of events A₁ = {{1, 2, 3}, {4, 5, 6}} and A₂ = {{1, 2, 3}, {4}, {5}, {6}} in Example 2.1 are not independent. For example, the subset {1, 2, 3} belongs to both F₁ and F₂, but the probability P({1, 2, 3} ∩ {1, 2, 3}) = P({1, 2, 3}) = 1/2 differs from P({1, 2, 3}) P({1, 2, 3}) = 1/4. ◊
Example 2.32: Let A and B be two events in F. These events are independent if the σ-fields σ(A) = {∅, A, A^c, Ω} and σ(B) = {∅, B, B^c, Ω} are independent or, equivalently, if P(A ∩ B) = P(A) P(B). ◊
Proof: The independence of the σ-fields σ(A) and σ(B) requires that P(A_i ∩ B_j) = P(A_i) P(B_j) for all A_i ∈ σ(A) and B_j ∈ σ(B) (Eq. 2.52). The resulting non-trivial conditions of independence are P(A ∩ B) = P(A) P(B), P(A^c ∩ B) = P(A^c) P(B), P(A ∩ B^c) = P(A) P(B^c), and P(A^c ∩ B^c) = P(A^c) P(B^c). These conditions are equivalent with the classical requirement P(A ∩ B) = P(A) P(B) for the independence of two events (Eq. 2.53). For example, P(A^c ∩ B) = P(B) − P(A ∩ B) since A^c ∩ B = B \ (A ∩ B), so that P(A^c ∩ B) = P(A^c) P(B) if P(A ∩ B) = P(A) P(B). Similar considerations apply to show that P(A ∩ B^c) = P(A) P(B^c) follows from P(A ∩ B) = P(A) P(B). •
Note: Let A and B be two events such that P(A) > 0 and P(B) > 0. If A and B are independent, the occurrence of B does not affect the probability of A so that P(A | B) = P(A), implying P(A ∩ B) = P(A) P(B). An equivalent condition of independence for the events A and B is P(B | A) = P(B). Note that the events A and B = A^c are not independent because A cannot occur if B is observed. ▲
Example 2.33: The conditions of Eq. 2.54 are essential to assure the independence of three or more events. It is not sufficient to require that Eq. 2.54 be satisfied for the entire collection of events. For example, consider the sample space Ω = {1, 2, 3, 4} with F given by all parts of Ω and a probability measure P defined by P({1}) = √2/2 − 1/4, P({2}) = 1/4, P({3}) = 3/4 − √2/2, and P({4}) = 1/4. Let A₁ = {1, 3}, A₂ = {2, 3}, and A₃ = {3, 4} be some events on (Ω, F). The probability of A₁ ∩ A₂ ∩ A₃ = {3} is P({3}) = 3/4 − √2/2 and is equal to P(A₁) P(A₂) P(A₃). However, P(A₁ ∩ A₂) ≠ P(A₁) P(A₂) ([106], p. 2). ◊
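The counterexample of Example 2.33 can be verified numerically: the triple product condition holds while pairwise independence fails.

```python
import math

# Probabilities from Example 2.33 on the sample space {1, 2, 3, 4}.
P = {1: math.sqrt(2) / 2 - 0.25, 2: 0.25, 3: 0.75 - math.sqrt(2) / 2, 4: 0.25}
PA = lambda A: sum(P[w] for w in A)

A1, A2, A3 = {1, 3}, {2, 3}, {3, 4}
triple = PA(A1 & A2 & A3)                 # = P({3})
product = PA(A1) * PA(A2) * PA(A3)        # equals triple
pair, pair_prod = PA(A1 & A2), PA(A1) * PA(A2)   # these differ
print(triple, product, pair, pair_prod)
```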
Example 2.34: Let S_i, i = 1, 2, …, n, be the event that the maximum flow in a river during year i does not exceed the height of a flood protection system. The probability that there will be no flood in n years is P_s(n) = P(∩_{i=1}^n S_i). If the events S_i are independent, the reliability of the flood protection system in n years is P_s(n) = Π_{i=1}^n P(S_i), so that P_s(n) = p^n for the special case P(S_i) = p. ◊
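A one-line sketch of Example 2.34 for the special case P(S_i) = p; the values of p and n below are assumptions:

```python
# Reliability of the flood protection system over n years with independent,
# equally likely yearly "safe" events S_i: P_s(n) = p^n.
p, n = 0.98, 30
reliability = p ** n
print(round(reliability, 4))   # even a high yearly p erodes over many years
```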
The families C_t, t ∈ T, where T is an arbitrary index set, are independent if C_t, t ∈ I, are independent families for each finite subset I of T.
The above definitions and the following criterion can be used to prove independence of σ-fields. The criterion uses classes of events forming a π-system. A collection C of subsets of Ω is said to be a π-system if it is closed under finite intersections, that is, A, B ∈ C implies A ∩ B ∈ C.
Proof: For the proof see [151] (Proposition 4.5.2, p. 103). We only show that Σ_i P(A_i) = ∞ implies P(A_i i.o.) = 1. Note that, by the definition of the event {A_i i.o.} and the independence of the A_i,
1 − P(A_i i.o.) = lim_{j→∞} Π_{i≥j} (1 − P(A_i)).
The inequality 1 − x ≤ e^{−x}, 0 < x < 1, applied for x = P(A_i) gives 1 − P(A_i) ≤ exp[−P(A_i)] so that 1 − P(A_i i.o.) ≤ lim_{j→∞} e^{−Σ_{i≥j} P(A_i)}. Because Σ_{i≥j} P(A_i) = ∞ for all j, exp[−Σ_{i≥j} P(A_i)] is zero so that 1 − P(A_i i.o.) = 0. •
The random variables X_i, i ∈ I, defined on a probability space (Ω, F, P) are independent if the σ-fields σ(X_i) generated by these random variables are independent, where the index set I is finite or not.
∫_Ω X(ω) P(dω) = ∫_{Ω₁} [∫_{Ω₂} X(ω₁, ω₂) P₂(dω₂)] P₁(dω₁) (2.56)
holds. If in addition X is positive and either side of Eq. 2.56 exists, finite or infinite, so does the other side and Eq. 2.56 is valid ([40], p. 59).
Note: We have seen in Section 2.2.3 that for any probability space (Ψ, G, Q) there exists a complete probability space (Ψ, Ḡ, Q̄) such that G ⊂ Ḡ and Q̄ = Q on G. Hence, the assumption that X is defined on a complete probability space is not restrictive. The last statement of the theorem considers the case in which X is positive but may not be P-integrable, that is, the integral ∫_Ω X dP may not be finite. ▲
Proof: The measurable mapping (s, ω) ↦ X(s, ω) generalizes the time series considered in Example 2.25 because the index s takes values in [0, 1] rather than in a countable set. This mapping is a stochastic process (Chapter 3). The function X(·, ω), called the sample path ω of X, is defined on [0, 1] for each ω.
The indicator function 1_S : (R, B) → ({0, 1}, K) is measurable because S ∈ B and K = {∅, {0, 1}, {0}, {1}}, so that the inverse image of each member of K is ∅, R, S, or S^c, each of which is in B.
Proof: The mapping (x, ω) ↦ 1_{{X(ω)>x}} is measurable from ([0, ∞) × Ω, B([0, ∞)) × F) to ({0, 1}, K), where K = {∅, {0}, {1}, {0, 1}}.
Let (Ω, F) be a measurable space and μ, ν : F → [0, ∞] be two measures on this space. If μ(A) = 0, A ∈ F, implies ν(A) = 0, we say that ν is absolutely continuous with respect to μ and indicate this property by the notation ν ≪ μ. If ν ≪ μ and μ ≪ ν, then ν and μ are said to be equivalent measures.
Example 2.39: Consider a measure space (Ω, F, μ) and a measurable function h : (Ω, F) → ([0, ∞), B([0, ∞))). The set function
ν(A) = ∫_A h dμ, A ∈ F,
defines a measure on (Ω, F) with ν ≪ μ. ◊
Proof: For disjoint sets A_n ∈ F, the countable additivity of ν follows from
ν(∪_{n≥1} A_n) = ∫_{∪_{n≥1} A_n} h dμ = Σ_{n≥1} ∫_{A_n} h dμ,
where the interchange of integration and summation is justified by the monotone convergence theorem, lim_{n→∞} ∫_A X_n dμ = ∫_A (lim_{n→∞} X_n) dμ. If μ(A) = 0, then ν(A) = ∫_A h dμ = 0 so that ν ≪ μ. •
If μ and ν are σ-finite measures on a measurable space (Ω, F) such that ν ≪ μ, then there exists a measurable function
h = dν/dμ : (Ω, F) → ([0, ∞), B([0, ∞))), (2.58)
called the Radon-Nikodym derivative of ν with respect to μ, such that Eq. 2.57 holds ([66], Theorem 18, p. 116).
Example 2.40: Consider a probability space (Ω, F, P), a partition A_i ∈ F, i = 1, …, n, of Ω such that P(A_i) > 0, and a random variable X = Σ_{i=1}^n x_i 1_{A_i} defined on this space, where x_i ∈ R. Let Q be a probability measure on (Ω, F) such that Q(A_i) > 0 and h_i = P(A_i)/Q(A_i), i = 1, …, n. Denote expectations under the probability measures P and Q by E_P and E_Q, respectively. The expectations E_P[X] and E_Q[X h] of the random variables X and X h with respect to the measures P and Q coincide. ◊
Proof: The expectation of X h under Q is
E_Q[X h] = Σ_{i=1}^n x_i h_i Q(A_i) = Σ_{i=1}^n x_i (P(A_i)/Q(A_i)) Q(A_i) = Σ_{i=1}^n x_i P(A_i) = E_P[X].
Since both P and Q are zero only on the impossible event ∅, the function h is measurable and Q is absolutely continuous with respect to P on the σ-field σ(A_i, i = 1, …, n). The function h is the Radon-Nikodym derivative of P with respect to Q. ▲
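Example 2.40 can be reproduced with exact rational arithmetic; the values x_i, P(A_i), and Q(A_i) below are assumptions chosen for illustration:

```python
from fractions import Fraction as F

# Discrete change of measure: X = x_i on A_i, h_i = P(A_i)/Q(A_i),
# so that E_P[X] = E_Q[X h] as in Example 2.40.
x = [F(1), F(3), F(10)]
P = [F(1, 2), F(1, 3), F(1, 6)]      # assumed measure P
Q = [F(1, 4), F(1, 4), F(1, 2)]      # assumed measure Q
h = [p / q for p, q in zip(P, Q)]    # Radon-Nikodym weights

E_P_X = sum(xi * pi for xi, pi in zip(x, P))
E_Q_Xh = sum(xi * hi * qi for xi, hi, qi in zip(x, h, Q))
print(E_P_X, E_Q_Xh)                 # identical
```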
E_P[X] = ∫ X dP = ∫ X (dP/dQ) dQ = ∫ X h dQ = E_Q[X h], (2.59)
Note: F is the probability induced by X on (R, B) and constitutes a special case of the distribution in Eq. 2.29 for B = (−∞, x] and h = X. The definition is meaningful because X is F-measurable so that {ω : X(ω) ≤ x} ∈ F. The notation P(X ≤ x) used in Eq. 2.60 is an abbreviation for P({ω : X(ω) ≤ x}). ▲
Proof: Since F is a probability measure, its range is [0, 1]. Also, F is an increasing function because {X ≤ x₁} ⊂ {X ≤ x₂} for x₁ ≤ x₂ (Eq. 2.5). That F is right continuous follows from the continuity of the probability measure and the definition of F. Let {x_n} be a decreasing numerical sequence converging to x, B_n = {ω : X(ω) ≤ x_n}, and B = {ω : X(ω) ≤ x}. The sequence of events B_n is decreasing so that lim_{n→∞} B_n = ∩_{n≥1} B_n = B and (Eq. 2.25) F(x_n) = P(B_n) converges to P(B) = F(x),
showing that I_ξ and I_{ξ'} are disjoint intervals. The collection of the intervals I_ξ is countable since each I_ξ contains a rational number and the set of rational numbers is countable.
The sum of all jumps of F is Σ_{ξ∈J} [F(ξ+) − F(ξ−)] = Σ_{ξ∈J} [F(ξ) − F(ξ−)] ≤ 1, where J denotes the collection of jump points of F. Hence, δ n_δ ≤ 1 so that n_δ ≤ 1/δ, where n_δ denotes the number of jumps of F larger than δ.
The third property results from the equalities
lim_{x→+∞} P((−∞, x]) = P(Ω) = 1 and lim_{x→−∞} P((−∞, x]) = P(∅) = 0.
The fourth property follows from the equality P(B \ A) = P(B) − P(A), which holds since the event A = {ω : X(ω) ≤ a} is included in B = {ω : X(ω) ≤ b} for a ≤ b.
The last property holds because the event {a ≤ X < b} is the union of the disjoint events {X = a} and {a < X ≤ b} \ {X = b}, so that the probability of {a ≤ X < b} is P(X = a) + P({a < X ≤ b} \ {X = b}), and P({a < X ≤ b} \ {X = b}) is equal to P(a < X ≤ b) − P(X = b). •
Example 2.42: Consider the system in Example 2.14 and let F denote the distribution of the system strength X. The system reliability without and with a proof test at x_pr is P_s(x) = 1 − F(x) and P_s(x) = (1 − F(x ∨ x_pr))/(1 − F(x_pr)), respectively.
Figure 2.4 shows the system reliability as a function of x, disregarding and accounting for the proof test at x_pr.
Proof: Let {x_n} be a positive sequence such that x_n ↓ 0 as n → ∞. Define the sequence of intervals A_n = {ω ∈ Ω : X(ω) ∈ (x − x_n, x]} for x ∈ R. Since this sequence is decreasing, we have P({X = x}) = P(∩_{n≥1} A_n) = lim_{n→∞} P(A_n) and P(A_n) = F(x) − F(x − x_n). If F is continuous at x, then F(x) − F(x − x_n) → 0 as n → ∞ so that P({X = x}) = 0. Conversely, if P({X = x}) = 0, we have lim_{n→∞} [F(x) − F(x − x_n)] = 0 so that F is continuous at x since it is right continuous. •
Proof: If g = 1_B, B ∈ B, then Eqs. 2.62 and 2.63 give E[Y] = P(X ∈ B) = Q(B). If I is a finite index set and g = Σ_{i∈I} b_i 1_{B_i}, where the subsets B_i ∈ B partition R and the b_i are real constants, then Eqs. 2.62 and 2.63 give Σ_{i∈I} b_i Q(B_i) because the integration is a linear operator.
If g is an arbitrary positive Borel function, there exists a sequence of simple, increasing, and measurable functions g_n, n = 1, 2, …, converging to g as n → ∞. We have seen that the expectations of g_n(X) in Eqs. 2.62 and 2.63 coincide. The monotone convergence theorem shows that the expectations of g(X) in Eqs. 2.62 and 2.63 coincide. If g is an arbitrary Borel function, it can be represented by g = g⁺ − g⁻, where g⁺ and g⁻ are positive Borel functions. Because g(X) is integrable and the expectation is a linear operator, the formulas in Eqs. 2.62 and 2.63 give the same result.
That Eqs. 2.62 and 2.63 give the same result can be indicated by E_P[g ∘ X] = E_Q[g], where E_P and E_Q denote expectations under the probabilities P and Q, respectively. We note that the probabilities P and Q live on different measurable spaces: P is defined on (Ω, F) while Q is the probability measure induced by X on (R, B) (Eq. 2.29). Generally, the last two formulas in Eq. 2.63 involving Riemann-Stieltjes and Riemann integrals are used to calculate E[Y]. •
Example 2.44: The expectation of the Cauchy random variable X with density f(x) = a/[π (a² + x²)], a > 0, x ∈ R, does not exist. ◊
Proof: The expectation of X⁺ is (Eq. 2.63) E[X⁺] = ∫_0^∞ x a/[π (a² + x²)] dx = +∞, and similarly E[X⁻] = +∞, so that E[X] does not exist (Eq. 2.40). •
E_f[g(X)] = ∫_{I_f} g(x) f(x) dx = ∫_{I_q} [g(x) f(x)/q(x)] q(x) dx,
where the last equality holds since f is zero in I_q \ I_f. The integral on I_q is well defined because q is strictly positive on I_q, and gives the expectation of the random variable g(X) f(X)/q(X) with respect to the density function q. •
Example 2.46: Let X be a random variable with density f. It is assumed that the expectations of X and of all functions of X considered in this example exist and are finite. Let X_i, i = 1, …, n, be independent copies of X. The classical estimator X̄ = (1/n) Σ_{i=1}^n X_i of E[X] has mean E[X] and variance Var[X]/n. The importance sampling estimator of E[X] is X̄_is = (1/n) Σ_{i=1}^n X_i f(X_i)/q(X_i), where the X_i are drawn from a density q such that q(x) = 0 implies f(x) = 0. The mean and variance of X̄_is are E[X] and Var[X f(X)/q(X)]/n, respectively.
The success of the importance sampling technique depends on the functional form of q. For example, E[X] = 1/2 and Var[X] = 0.05 for a random
Note: The importance sampling technique in this illustration is a special case of Example 2.45 for g(x) = x. If we set q(x) = x f(x)/E[X], the variance of X̄_is is zero. However, this estimator cannot be used because it depends on E[X], that is, on the parameter to be estimated. ▲
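A simulation sketch of the importance sampling estimator X̄_is. The densities are assumptions, not the book's: f is taken to be the Beta(2, 2) density 6x(1 − x), which has E[X] = 1/2 and Var[X] = 0.05, matching the moments quoted in the example, and q is taken to be the uniform density on [0, 1].

```python
import random

random.seed(3)
f = lambda x: 6 * x * (1 - x)     # assumed target density (Beta(2,2))

# Importance sampling: draw X_i from q = Uniform(0,1) (q(x) = 1) and
# average the weighted values X_i f(X_i)/q(X_i).
n = 200_000
total = 0.0
for _ in range(n):
    x = random.random()           # sample from q
    total += x * f(x) / 1.0       # weight f(x)/q(x)
est = total / n
print(est)                        # close to E[X] = 1/2
```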
Note: We define the expectation for real-valued random variables and e^{√−1 u X} is complex-valued. The expectation of e^{√−1 u X} is complex-valued with real and imaginary parts E[cos(u X)] and E[sin(u X)], respectively. ▲
Example 2.47: The characteristic function exists for any random variable. For example, the characteristic function of the Cauchy random variable in Example 2.44 is φ(u) = exp(−a |u|), u ∈ R. We have seen that the expectation of this random variable does not exist.
The moment generating function, m(u) = E[exp(u X)], u ∈ R, is also used in calculations. However, m(·) may not be bounded; for example, the moment generating function of a Cauchy random variable is not. ◊
2.10.3.1 Properties
Proof: Eq. 2.64 gives φ(0) = 1, |φ(u)| = |∫_R e^{√−1 u x} f(x) dx| ≤ ∫_R |e^{√−1 u x}| f(x) dx = 1, and φ(u) = φ(−u)*.
Let Z = Σ_{k=1}^n z_k exp(√−1 u_k X) be a random variable. Because Z Z* = |Z|² is a positive random variable for any z_k ∈ C and u_k ∈ R,
0 ≤ E[Z Z*] = Σ_{k,l=1}^n z_k z_l* E[e^{√−1 (u_k − u_l) X}] = Σ_{k,l=1}^n z_k z_l* φ(u_k − u_l),
F(x₂) − F(x₁) = (1/(2π)) lim_{r→∞} ∫_{−r}^r [(e^{−√−1 u x₂} − e^{−√−1 u x₁})/(−√−1 u)] φ(u) du, (2.66)
where x₁ and x₂ > x₁ are points of continuity of F ([142], Theorem 3, p. 12).
To prove that φ is uniformly continuous in R, we have to show that for an arbitrary ε > 0 there exists δ > 0 such that |φ(u + h) − φ(u)| < ε if |h| < δ for all u ∈ R. The increment of the characteristic function in (u, u + h) satisfies the inequality |φ(u + h) − φ(u)| ≤ E[|e^{√−1 h X} − 1|], whose right side does not depend on u and converges to zero as h → 0.
The inequalities
|e^{√−1 x} − Σ_{k=0}^n (√−1 x)^k/k!| ≤ |x|^{n+1}/(n + 1)! and |e^{√−1 x} − Σ_{k=0}^n (√−1 x)^k/k!| ≤ 2 |x|^n/n!
give
|e^{√−1 u X}| |e^{√−1 h X} − 1 − √−1 h X|/|h| ≤ 2 |X| ∈ L₁
by the second inequality with n = 1 and x replaced by h X. The first inequality shows that this ratio converges to zero pointwise as h → 0, so that, by dominated convergence,
lim_{h→0} |(φ(u + h) − φ(u))/h − E[√−1 X e^{√−1 u X}]| = 0
Example 2.49: Let X be a random variable with the distribution function F(x) = Σ_{i∈I} p_i 1_{[x_i,∞)}(x), where I is a finite index set, p_i ≥ 0 such that Σ_{i∈I} p_i = 1, and {x_i} is an increasing sequence of real numbers. The density and the characteristic functions, f(x) = Σ_{i∈I} p_i δ(x − x_i) and φ(u) = Σ_{i∈I} p_i [cos(u x_i) + √−1 sin(u x_i)], of X are Fourier pairs. The characteristic function consists of a superposition of harmonics with amplitudes p_i and frequencies coinciding with the locations x_i of the jumps of F. ◊
The moment of order r of X is μ(r) = E[X^r] = ∫_{−∞}^{∞} x^r dF(x).
If the functions g(x) = (x − μ(1))^r and g(x) = |x − μ(1)|^r are considered, the resulting expectations define the central moments of order r and the absolute central moments of order r of X, respectively.
f(x) = (1/(√(2π) σ)) exp[−(1/2) ((x − μ)/σ)²], x ∈ R. (2.68)
The skewness and kurtosis coefficients of X are γ₃ = 0 and γ₄ = 3. The skewness coefficient is zero because f is symmetric about x = μ.
The random variable X follows a gamma distribution with parameters k > 0 and λ > 0 if it has the density
f(x) = x^{k−1} λ^k e^{−λ x}/Γ(k), x ≥ 0. (2.69)
The first four moments of this random variable, denoted by X ~ Γ(k, λ), are μ = k/λ, σ² = k/λ², γ₃ = 2/√k, and γ₄ = 3 (1 + 2/k).
The characteristic functions of the random variables with densities given by Eqs. 2.68 and 2.69 are
φ(u) = exp(√−1 μ u − σ² u²/2) for X ~ N(μ, σ²) (2.70)
and
φ(u) = 1/(1 − √−1 u/λ)^k for X ~ Γ(k, λ). (2.71)
Figure 2.5 shows the density and the characteristic functions of a gamma and a Gaussian random variable with mean μ = 1.5 and variance σ² = 1. Because these densities are not even functions, their characteristic functions are complex-valued. ◊
Example 2.52: Let N be a discrete random variable taking values in {0, 1, 2, …} with the probability
P(N = n) = (λ^n/n!) e^{−λ}, n = 0, 1, …, (2.72)
where λ > 0 is an intensity parameter. The probability measure in Eq. 2.72 defines a Poisson random variable. The characteristic function of X = a N + b is
φ(u) = exp(√−1 b u + λ (e^{√−1 a u} − 1)), (2.73)
where a and b are constants. The random variable X is referred to as Poisson with parameters (a, b, λ). ◊
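Eq. 2.73 can be checked against the defining series E[e^{√−1 u X}] = Σ_n e^{√−1 u (a n + b)} P(N = n); the parameter values below are assumptions:

```python
import cmath, math

# Characteristic function of X = a N + b, N ~ Poisson(lam):
# phi(u) = exp(i u b) * exp(lam * (exp(i u a) - 1))   (Eq. 2.73).
a, b, lam, u = 2.0, 1.0, 3.0, 0.7

# Direct evaluation of E[exp(i u X)] by summing the Poisson series.
direct = sum(cmath.exp(1j * u * (a * n + b))
             * lam ** n * math.exp(-lam) / math.factorial(n)
             for n in range(60))
closed = cmath.exp(1j * u * b) * cmath.exp(lam * (cmath.exp(1j * u * a) - 1))
print(abs(direct - closed))   # essentially zero
```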
Figure 2.5. The density and characteristic functions of a gamma and a Gaussian
variable with the same mean and variance
φ(u) = E[e^{√−1 u X}] = Σ_{k=1}^∞ E[e^{√−1 u (Y₁ + ⋯ + Y_k)}] P(N = k) + P(N = 0)
= Σ_{k=0}^∞ ((λ φ_Y(u))^k/k!) e^{−λ} = exp[−λ (1 − φ_Y(u))], (2.74)
where φ_Y denotes the characteristic function of Y₁. The expression of φ in Eq. 2.76 is an alternative form of the above result. •
If there is a characteristic function φ_n for every integer n ≥ 1 such that
φ(u) = (φ_n(u))^n, u ∈ R, (2.77)
then φ is said to be an infinitely divisible characteristic function.
Note: This definition shows that a random variable X with an infinitely divisible characteristic function has the representation X = Σ_{i=1}^n X_i^{(n)} for each n ≥ 1, where the X_i^{(n)}, i = 1, …, n, are independent identically distributed (iid) random variables with the characteristic function φ_n. ▲
Proof: The characteristic functions φ_n(u) = exp(√−1 μ u/n − σ² u²/(2 n)) and φ in Eq. 2.70 satisfy Eq. 2.77 for each n. Hence, X can be represented by a sum of n independent Gaussian variables with mean μ/n and variance σ²/n. •
Example 2.56: Let N denote a Poisson random variable with intensity λ > 0 and let a, b be some constants (Example 2.52). The characteristic function of a N + b is infinitely divisible. ◊
Proof: φ_n(u) = exp((λ/n) (e^{√−1 a u} − 1) + √−1 (b/n) u) and φ in Eq. 2.73 satisfy the condition φ = (φ_n)^n for any integer n ≥ 1. Hence, a Poisson variable X with parameters (a, b, λ) can be represented by a sum of n independent Poisson variables with parameters (a, b/n, λ/n) for each n (Example 2.52). •
If φ is an i.d. characteristic function, then φ has no real zeros, that is, φ(u) ≠ 0 for every u ∈ R ([124], Theorem 5.3.1, p. 80).
Note: This property can be used as a criterion for finding whether a particular characteristic function is not infinitely divisible. For example, the characteristic function in Example 2.49 is not infinitely divisible. ▲
Proof: If φ is infinitely divisible, there exist characteristic functions φ_n = φ^{1/n} for each integer n > 0, so that g(u) = lim_{n→∞} φ_n(u) = lim_{n→∞} (φ(u))^{1/n} takes only two values, zero for φ(u) = 0 and one for φ(u) ≠ 0. Because φ and φ_n are characteristic functions, there is an interval I ⊂ R containing zero in which both φ and φ_n are not zero, so that log(φ_n(u)) = (1/n) log(φ(u)), u ∈ I. The right side of this equation approaches zero as n → ∞ so that φ_n(u) → 1, u ∈ I, as n increases indefinitely.
The function g is a characteristic function as a limit of characteristic functions ([124], Chapter 3), can be either zero or one, and g(u) = 1 for u ∈ I. Hence, g(u) = 1 everywhere by the continuity of the characteristic function, so that φ(u) ≠ 0, ∀ u ∈ R. •
Proof: Each φ_k, k = 1, ..., m, is an i.d. characteristic function so that φ_k = (φ_{k,n})^n for some characteristic function φ_{k,n} and each integer n ≥ 1. The function φ is a characteristic function as a product of characteristic functions and
φ(u) = ∏_{k=1}^m φ_k(u) = ∏_{k=1}^m [φ_{k,n}(u)]^n = [∏_{k=1}^m φ_{k,n}(u)]^n = φ_n(u)^n
for any n ≥ 1, where φ_n = ∏_{k=1}^m φ_{k,n}. Hence, φ is infinitely divisible. •
54 Chapter 2. Probability Theory
Proof: The function |φ(u)|² = φ(u)(φ(u))* = φ(u) φ(−u) is an i.d. characteristic function by the previous property so that (|φ(u)|²)^{1/(2n)} = |φ(u)|^{1/n} is a characteristic function for each integer n > 0. •
Note: This fact can be used to construct infinitely divisible characteristic functions. For example, the characteristic function in Eq. 2.73 with b = 0 can be obtained from Eq. 2.78 with p = λ and g(u) = exp(√−1 a u), a ∈ ℝ. ▲
where p_n > 0 are real numbers and g_n denote characteristic functions ([124], Theorem 5.4.1, p. 83).
where b_k = p_n [G_n(a_k) − G_n(a_{k−1})]. For any a > 0 the last expression is the limit of a finite product of characteristic functions corresponding to Poisson random variables (Eq. 2.73). The stated result follows by taking the limit as a → ∞. •
If
log(φ(u)) = √−1 a u + ∫_ℝ ( e^{√−1 u x} − 1 − √−1 u x/(1 + x²) ) ((1 + x²)/x²) dθ(x)   (2.80)
for all u ∈ ℝ, then φ is an i.d. characteristic function. The constant a and the function θ are uniquely determined by φ ([124], Lemma 5.5.1, p. 85).
Note: The integrand of the integral in Eq. 2.80 is defined by continuity at x = 0 so that it is equal to −u²/2 at this point. ▲
ψ_n(u) = √−1 a_n u + ∫_ℝ ( e^{√−1 u x} − 1 − √−1 u x/(1 + x²) ) ((1 + x²)/x²) dθ_n(x),
a_n = n ∫_ℝ x/(1 + x²) dF_n(x),   θ_n(x) = n ∫_{−∞}^x y²/(1 + y²) dF_n(y),   (2.81)
and F_n denotes the distribution function of φ^{1/n} ([124], Lemma 5.5.3, p. 88).
log(φ(u)) = √−1 a u − σ² u²/2 + ∫_{ℝ\{0}} ( e^{√−1 u x} − 1 − √−1 u x/(1 + x²) ) ((1 + x²)/x²) dθ(x)   (2.82)
for each u ∈ ℝ because the integrand is −u²/2 at x = 0 (Eq. 2.80). An alternative form of this equation is
log(φ(u)) = √−1 a u − σ² u²/2 + ∫_{ℝ\{0}} ( e^{√−1 u x} − 1 − √−1 u x/(1 + x²) ) dθ̃(x),   (2.83)
where
θ̃(x) = ∫_{−∞}^x ((1 + y²)/y²) dθ(y) for x < 0,   θ̃(x) = −∫_x^∞ ((1 + y²)/y²) dθ(y) for x > 0.   (2.84)
The function θ̃ is defined on ℝ \ {0}, is increasing in (−∞, 0) and (0, ∞), satisfies the conditions θ̃(−∞) = 0, θ̃(∞) = 0, and the integral ∫_{(−ε,ε)\{0}} x² dθ̃(x) < ∞ for any ε > 0. The representation in Eq. 2.83 is unique and is referred to as the Lévy representation for the i.d. characteristic function φ. ▲
log(φ(u)) = √−1 a u − σ² u²/2 + ∫_{ℝ\{0}} ( e^{√−1 u x} − 1 − √−1 u x/(1 + x²) ) dλ_L(x)   (2.85)
Example 2.58: The characteristic functions of the Gaussian, Poisson, and compound Poisson random variables are special cases of the Lévy–Khinchine representation given by Eq. 2.85. ◊
Proof: The Lévy–Khinchine representation in Eq. 2.85 with λ_L = 0 gives the characteristic function of a Gaussian variable (Eq. 2.70). The characteristic function in Eq. 2.85 with σ² = 0 and λ_L(B) = λ 1_B(ξ) for ξ ∈ ℝ \ {0}, λ > 0, and B ∈ B, is
and has the form of the characteristic function for a Poisson variable (Eq. 2.73). If σ² = 0 and λ_L(dx) = λ dF(x), λ > 0, Eq. 2.85 becomes
A random variable X is α-stable if, for any a_i > 0 and independent copies X_i, i = 1, 2, of X, there exist c > 0 and b such that
a_1 X_1 + a_2 X_2 =ᵈ c X + b,   (2.86)
where =ᵈ denotes equality in distribution.
We give here two properties of α-stable random variables that are particularly useful for calculations. An extensive discussion on α-stable random variables is in [161, 162].
The characteristic function of an α-stable variable is infinitely divisible.
Proof: Let φ denote the characteristic function of an α-stable variable X and let X_1, ..., X_n denote independent copies of this random variable. Because X is an α-stable variable, we have Σ_{i=1}^n X_i =ᵈ c_n X + d_n, where c_n > 0 and d_n are real numbers. This equality gives [φ(u)]^n = φ(c_n u) e^{√−1 u d_n} or
φ(u) = [φ(u/c_n)]^n e^{−√−1 u d_n/c_n},
so that φ is the n-th power of a characteristic function for each n ≥ 1, that is, φ is infinitely divisible. •
This property shows that the class of α-stable variables is a subset of the class of random variables with i.d. characteristic functions. Hence, α-stable variables have all the properties of random variables with i.d. characteristic functions. For example, the characteristic functions of α-stable variables have no real zeros.
where ω(|u|, α) = −tan(πα/2) for α ≠ 1 and ω(|u|, α) = (2/π) log |u| for α = 1,   (2.88)
and μ ∈ ℝ, σ ≥ 0, |β| ≤ 1, 0 < α ≤ 2 are real constants ([124], Theorem 5.7.3, p. 102).
Note: The characteristic function of an α-stable random variable can be obtained from Eqs. 2.83 and 2.84 with
where c_1, c_2 ≥ 0 are real numbers such that c_1 + c_2 > 0 ([124], Theorem 5.7.2, p. 101). The integrals
∫_{−∞}^0 ( e^{√−1 u x} − 1 − √−1 u x/(1 + x²) ) dx/|x|^{α+1}   and   ∫_0^∞ ( e^{√−1 u x} − 1 − √−1 u x/(1 + x²) ) dx/|x|^{α+1}
can be calculated for α < 1, α = 1, and α > 1 and give Eqs. 2.87 and 2.88.
The parameters α, β, σ, and μ, referred to as stability index, skewness, scale, and shift or location, respectively, control the distribution type, the departure from a symmetric distribution about μ, the range of likely values, and the shift from zero. We use the notation X ∼ S_α(σ, β, μ) to indicate that X is a real-valued α-stable random variable with parameters (α, σ, β, μ). The density of an α-stable variable with β = 0 is symmetric about its location parameter μ. ▲
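The α-stable characteristic function can be evaluated directly; the sketch below assumes the parameterization log φ(u) = √−1 μ u − σ^α |u|^α [1 + √−1 β sign(u) ω(|u|, α)] with ω as in Eq. 2.88 (an assumption about the exact form of Eq. 2.87, which is not reproduced here). For α = 2 it recovers the Gaussian characteristic function:

```python
import numpy as np

def stable_cf(u, alpha, beta, sigma, mu):
    """Characteristic function of X ~ S_alpha(sigma, beta, mu), per Eqs. 2.87-2.88."""
    u = np.asarray(u, dtype=float)
    if alpha != 1.0:
        omega = -np.tan(np.pi * alpha / 2.0)
    else:
        safe = np.where(u == 0.0, 1.0, np.abs(u))   # avoid log(0); omega(0) := 0
        omega = (2.0 / np.pi) * np.log(safe)
    log_phi = (1j * mu * u
               - sigma ** alpha * np.abs(u) ** alpha
               * (1.0 + 1j * beta * np.sign(u) * omega))
    return np.exp(log_phi)

u = np.linspace(-4.0, 4.0, 401)
# alpha = 2: S_2(sigma, 0, mu) is Gaussian with cf exp(i mu u - sigma^2 u^2)
assert np.allclose(stable_cf(u, 2.0, 0.0, 1.0, 0.0), np.exp(-u ** 2))
# beta = 0: the characteristic function of a symmetric stable variable is real
assert np.allclose(stable_cf(u, 1.5, 0.0, 1.0, 0.0).imag, 0.0)
```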
Figure 2.6. Characteristic function of some α-stable random variables
be a measurable function, that is, X^{−1}(B) ∈ F for every Borel set B ∈ B^d. This function, called a random vector in ℝᵈ or an ℝᵈ-valued random variable, induces the probability measure Q(B) = P(X ∈ B) = P(X^{−1}(B)), B ∈ B^d, on the measurable space (ℝᵈ, B^d).
The domain and the range of this function are ℝᵈ and [0, 1], respectively. The following properties of F result directly from its definition in Eq. 2.90.
• lim_{x_k→∞} F(x), 1 ≤ k ≤ d, is the joint distribution of the ℝ^{d−1}-valued random variable (X_1, ..., X_{k−1}, X_{k+1}, ..., X_d).
• lim_{x_k→−∞} F(x) = 0 for k ∈ {1, ..., d}.
• The function x_k ↦ F(x) is increasing for each k ∈ {1, ..., d}.
• The function x_k ↦ F(x) is right continuous for each k ∈ {1, ..., d}.
If F is such that
f(x) = ∂ᵈ F(x)/(∂x_1 ··· ∂x_d)   (2.91)
exists, then f is called the joint density function of X.
Note: X takes values in the infinitesimal rectangle ×_{i=1}^d (x_i, x_i + dx_i] with probability P(∩_{i=1}^d {X_i ∈ (x_i, x_i + dx_i]}) ≃ f(x) dx. The distribution of one or more coordinates of X can be obtained from the joint distribution or the joint density of X. For example, the marginal distribution and marginal density of X_1 are F_1(x_1) = F(x_1, ∞, ..., ∞) and f_1(x_1) = ∫_{ℝ^{d−1}} f(x) dx_2 ··· dx_d or f_1(x_1) = dF_1(x_1)/dx_1, respectively. ▲
Suppose that the last d_2 < d coordinates of a random vector X have been measured and are equal to z = (z_1, ..., z_{d_2}). Let X^{(1)} and X^{(2)} be vectors consisting of the first d_1 = d − d_2 and the last d_2 coordinates of X. The density of the conditional vector X^{(1)} | (X^{(2)} = z) is
f(x^{(1)} | z) = f(x^{(1)}, z)/f^{(2)}(z),   (2.92)
where f^{(i)} denotes the density of X^{(i)}, i = 1, 2, and x^{(1)} = (x_1, ..., x_{d_1}).
Note: The definition of the conditional probability P(A | B) in Eq. 2.16 with A = {X_1 ∈ (x_1, x_1 + dx_1], ..., X_{d_1} ∈ (x_{d_1}, x_{d_1} + dx_{d_1}]} and B = {X_{d_1+1} ∈ (z_1, z_1 + dz_1], ..., X_d ∈ (z_{d_2}, z_{d_2} + dz_{d_2}]} provides a heuristic justification for Eq. 2.92. We can view P(A | B) as [f(x^{(1)}, z)/f^{(2)}(z)] dx^{(1)}, which gives the probability content of the infinitesimal rectangle (x_1, x_1 + dx_1] × ··· × (x_{d_1}, x_{d_1} + dx_{d_1}] under the condition X^{(2)} = z. For a rigorous discussion, see [66] (Section 21.3, pp. 416-417). ▲
If X and Y are ℝᵈ-valued random variables and there is a one-to-one correspondence between these variables given by y = g(x) and x = h(y), then the densities f_X and f_Y of X and Y are related by
f_Y(y) = f_X(h(y)) |∂x/∂y|,   x, y ∈ ℝᵈ,   (2.93)
where |∂x/∂y| denotes the absolute value of the Jacobian determinant of the mapping h.
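Eq. 2.93 can be illustrated numerically. A minimal sketch (NumPy; the example Y = exp(X) is an assumption made for illustration) transforms the standard Gaussian density of X through x = h(y) = log(y), |∂x/∂y| = 1/y, and checks that the resulting density integrates to one and reproduces E[Y] = e^{1/2}:

```python
import numpy as np

def f_X(x):
    """Standard Gaussian density of X."""
    return np.exp(-x ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

def f_Y(y):
    """Density of Y = g(X) = exp(X) via Eq. 2.93 with x = h(y) = log(y), |dx/dy| = 1/y."""
    return f_X(np.log(y)) / y

y = np.linspace(1e-6, 50.0, 400001)
fy = f_Y(y)
dy = np.diff(y)
total = np.sum(0.5 * (fy[1:] + fy[:-1]) * dy)                   # trapezoidal rule, ~ 1
mean = np.sum(0.5 * (y[1:] * fy[1:] + y[:-1] * fy[:-1]) * dy)   # ~ exp(1/2)

assert abs(total - 1.0) < 1e-3
assert abs(mean - np.exp(0.5)) < 0.01
```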
2.11.2 Independence
Consider a family of random variables X_i, i ∈ I, defined on a probability space (Ω, F, P), where the index set I is finite or infinite. We say that X_i are independent random variables if the σ-fields σ(X_i) generated by these random variables are independent (Section 2.7.3).
Note: If I = {1, ..., d} is finite, the condition in Eq. 2.94 becomes P(∩_{i=1}^d {X_i ≤ x_i}) = ∏_{i=1}^d P(X_i ≤ x_i), x_i ∈ ℝ, or F(x) = ∏_{i=1}^d F_i(x_i), where x = (x_1, ..., x_d), F denotes the joint distribution of X = (X_1, ..., X_d), and F_i is the distribution of X_i. The independence condition F(x) = ∏_{i=1}^d F_i(x_i) is also satisfied by all subsets of I = {1, ..., d}. For example, if we set x_d = ∞, this condition applies to the first d − 1 coordinates of X. If the distributions F and F_i have densities f and f_i, Eq. 2.94 implies f(x) = ∏_{i=1}^d f_i(x_i). ▲
Note: That Eqs. 2.95 and 2.96 give the same result can be shown by extending the arguments used to prove the equivalence of Eqs. 2.62 and 2.63. The chain of equalities in Eq. 2.96 holds because Q(dx) = dF(x) = f(x) dx. If q = 1, a_i denote some constants, and g(x) = Σ_{i=1}^d a_i x_i, then the expectation of g(X) is Σ_{i=1}^d a_i E[X_i], showing that expectation is a linear operator. ▲
• The joint characteristic and the joint density functions are Fourier pairs.
• If X has independent coordinates, φ(u) = ∏_{k=1}^d φ_k(u_k), where φ_k is the characteristic function of X_k.
• φ is uniformly continuous.
Proof: The first property results from the definition of the characteristic function since the characteristic and the density functions of X are Fourier pairs related by Eq. 2.97 and ([62], p. 524)
f(x) = (1/(2π)ᵈ) ∫_{ℝᵈ} e^{−√−1 uᵀx} φ(u) du.   (2.98)
If the coordinates of the random vector X are independent, the characteristic function becomes (Eq. 2.97)
φ(u) = E[e^{√−1 uᵀX}] = ∫_{ℝᵈ} ∏_{k=1}^d ( e^{√−1 u_k x_k} f_k(x_k) dx_k ) = ∏_{k=1}^d E[e^{√−1 u_k X_k}] = ∏_{k=1}^d φ_k(u_k),
where φ_k(u_k) = E[e^{√−1 u_k X_k}] is the characteristic function of the coordinate k of X. The Fourier transform of φ(u) shows that the density of X is equal to the product of the densities of its coordinates. Hence, the above equality provides an alternative way of checking whether a random vector has independent coordinates.
For the last property we need to show that for any ε > 0 there is δ > 0 such that ‖h‖ < δ implies |φ(u + h) − φ(u)| < ε, where ‖·‖ denotes the usual norm in ℝᵈ. The increment of the characteristic function from u to u + h is
2.11.4 Moments
Let X be an ℝᵈ-valued random variable on a probability space (Ω, F, P) and consider the function g(x) = ∏_{i=1}^d x_i^{s_i}, where s_i ≥ 0 are integers. Because g is continuous, g(X) is a real-valued random variable.
If X ∈ L_s, that is, X_i ∈ L_s for each i = 1, ..., d, the moments of order s = Σ_{i=1}^d s_i of X exist, are finite, and are given by
Note: The characteristic function can be used to calculate moments of any order of X provided that they exist. For example,
E[∏_{i=1}^d X_i^{s_i}] = (√−1)^{−s} [∂^s φ(u)/(∂u_1^{s_1} ··· ∂u_d^{s_d})]_{u=0}.   (2.100)
Note: The relationship between c_{i,j} and r_{i,j} holds by the linearity of the expectation operator. If c_{i,j} = 0 for i ≠ j, then X_i and X_j are said to be uncorrelated. If r_{i,j} = 0 for i ≠ j, then X_i and X_j are said to be orthogonal. If μ_i = μ_j = 0, the coordinates X_i and X_j of X are uncorrelated if and only if they are orthogonal. ▲
The pair (μ, r) or (μ, c) gives the second moment properties of X. A shorthand notation for these properties is X ∼ (μ, r) or X ∼ (μ, c). The information content of (μ, r) and (μ, c) is equivalent (Eq. 2.101).
ρ_{i,j} = c_{i,j}/(σ_i σ_j).   (2.103)
• |ρ_{i,j}| ≤ 1.
• ρ_{i,j} = ±1 if and only if X_i and X_j are linearly related.
• If X_i and X_j are independent, ρ_{i,j} = 0. The converse is not generally true.
Proof: These properties can be obtained from classical inequalities discussed later in this chapter or by direct calculations as shown here. Let X̃_i = (X_i − μ_i)/σ_i denote a scaled version of X_i with mean zero and variance one. Because E[(X̃_i ± X̃_j)²] = 2(1 ± ρ_{i,j}) ≥ 0, we have |ρ_{i,j}| ≤ 1.
If a, b are constants and X_i = a X_j + b, then ρ_{i,j} = ±1 results by direct calculations. If ρ_{i,j} = ±1, we have E[(X̃_i ∓ X̃_j)²] = 0. One of the inequalities in the next section shows that the last equality implies that X̃_i ∓ X̃_j = 0 a.s. so that X_i and X_j are linearly related a.s. (Example 2.65).
If X_i and X_j are independent, we have
E[(X_i − μ_i)(X_j − μ_j)] = ∫_{ℝ²} (ξ − μ_i)(η − μ_j) f(ξ, η) dξ dη = E[X_i − μ_i] E[X_j − μ_j] = 0,
because the joint density f of (X_i, X_j) is equal to the product of the densities of X_i and X_j. Generally, the converse of this property is not true. For example, let X_i be a Gaussian random variable with mean zero and variance one and let X_j = X_i². These variables are uncorrelated because E[X_i (X_j − 1)] = E[X_i³ − X_i] = 0 and X_i has zero odd moments. However, X_i and X_j are not independent since X_j is perfectly known if X_i is specified. •
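The counterexample in this proof is easy to reproduce by Monte Carlo simulation. The sketch below (NumPy; the sample size and seed are illustrative choices) shows a near-zero sample correlation between X_i and X_j = X_i² together with a higher-moment identity, E[X_i² X_j] = 3 ≠ E[X_i²] E[X_j] = 1, that reveals the dependence:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(200000)   # X_i ~ N(0, 1)
y = x ** 2                        # X_j = X_i^2, a function of X_i

# the sample correlation is near zero although X_j is determined by X_i
rho = np.corrcoef(x, y)[0, 1]
assert abs(rho) < 0.02

# dependence shows up in higher moments: E[X_i^2 X_j] = E[X_i^4] = 3,
# while E[X_i^2] E[X_j] = 1
assert abs(np.mean(x ** 2 * y) - 3.0) < 0.1
assert abs(np.mean(x ** 2) * np.mean(y) - 1.0) < 0.05
```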
f(x) = [(2π)ᵈ det(c)]^{−1/2} exp[ −(1/2)(x − μ)ᵀ c^{−1} (x − μ) ]  and   (2.104)
Note: The functions f and φ are defined on ℝᵈ and are real and complex-valued, respectively. If the covariance matrix c is diagonal, the coordinates of X are not only uncorrelated but also independent because the joint density of X is equal to the product of the marginal densities of this vector. Hence, independence and lack of correlation are equivalent concepts for Gaussian vectors.
To indicate that X is an ℝᵈ-valued Gaussian variable with second moment properties (μ, c), we write X ∼ N_d(μ, c) or just X ∼ N(μ, c) if there can be no confusion about the dimension of X. ▲
Figure 2.7 shows the density and the characteristic functions in Eqs. 2.107 and 2.108 for ρ = ±0.8. ◊
We give two essential properties satisfied by any Gaussian vector X ∼ N(μ, c). One of these properties uses the notations X^{(1)} and X^{(2)} for the first d_1 < d and the last d_2 = d − d_1 coordinates of X, μ^{(p)} = E[X^{(p)}], and c^{(p,q)} = E[(X^{(p)} − μ^{(p)})(X^{(q)} − μ^{(q)})ᵀ], p, q = 1, 2.
• Linear transformations of a Gaussian vector are Gaussian vectors.
• The conditional vector X̂ = X^{(1)} | (X^{(2)} = z) is an ℝ^{d_1}-valued Gaussian variable with mean and covariance matrices
μ̂ = μ^{(1)} + c^{(1,2)} (c^{(2,2)})^{−1} (z − μ^{(2)})  and  ĉ = c^{(1,1)} − c^{(1,2)} (c^{(2,2)})^{−1} c^{(2,1)}.   (2.109)
Figure 2.7. The density and the characteristic function of the standard bivariate Gaussian vector with ρ = 0.8 and ρ = −0.8
Proof: Let Y = a X + b, where a and b are real-valued (q, d) and (q, 1) constant matrices. The characteristic function of Y at v ∈ ℝ^q is
E[e^{√−1 vᵀY}] = E[e^{√−1 (aᵀv)ᵀX}] e^{√−1 vᵀb}.
The first term is the characteristic function of X for u = aᵀv so that (Eq. 2.105)
E[e^{√−1 vᵀY}] = exp( √−1 (aᵀv)ᵀμ − (1/2)(aᵀv)ᵀ c (aᵀv) ) exp(√−1 vᵀb) = exp( √−1 vᵀ(a μ + b) − (1/2) vᵀ(a c aᵀ) v ).
Hence, Y is a Gaussian vector with mean μ_Y = a μ + b and covariance c_Y = a c aᵀ.
The properties of X̂ can be obtained by direct calculations based on the density of X in Eq. 2.104 and the definition of the conditional density (Eq. 2.92). If d_1 = d_2 = 1, μ = 0, c_{1,1} = c_{2,2} = 1, and c_{1,2} = c_{2,1} = ρ, then X̂ = X_1 | (X_2 = z) is a Gaussian variable with mean μ̂ = ρ z and variance ĉ = 1 − ρ². If ρ = 0, that is, X_1 and X_2 are independent, X̂ and X_1 have the same distribution. Otherwise, X̂ has a smaller variance than X_1. The probabilistic characteristics of the conditional vector X̂ have useful applications in reliability studies. For example, suppose that X_1 and X_2 control the performance of a system and we can measure only X_2. If the correlation coefficient ρ is not zero, the measured value of X_2 can be used to reduce the uncertainty in X_1. •
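Eq. 2.109 translates directly into code. A minimal sketch (NumPy; the partitioning helper is illustrative) computes the conditional mean and covariance and checks the bivariate special case μ̂ = ρz, ĉ = 1 − ρ² discussed in the proof:

```python
import numpy as np

def gaussian_conditional(mu, c, d1, z):
    """Mean and covariance of X^(1) | (X^(2) = z) for X ~ N(mu, c), per Eq. 2.109."""
    mu1, mu2 = mu[:d1], mu[d1:]
    c11, c12 = c[:d1, :d1], c[:d1, d1:]
    c21, c22 = c[d1:, :d1], c[d1:, d1:]
    gain = c12 @ np.linalg.inv(c22)
    return mu1 + gain @ (z - mu2), c11 - gain @ c21

# bivariate case of the proof: X1 | (X2 = z) ~ N(rho z, 1 - rho^2)
rho = 0.8
mu = np.zeros(2)
c = np.array([[1.0, rho], [rho, 1.0]])
m, v = gaussian_conditional(mu, c, d1=1, z=np.array([2.0]))
assert np.allclose(m, rho * 2.0)
assert np.allclose(v, 1.0 - rho ** 2)
```

Note that the conditional variance 1 − ρ² never exceeds the unconditional variance 1, consistent with the remark that measuring X_2 reduces the uncertainty in X_1.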
E[(Z − X_1)²] = E[(a(X_2 − μ_2) − (X_1 − μ_1))²] = a² σ_2² + σ_1² − 2 a ρ σ_1 σ_2,
and takes its minimum value at a = ρ σ_1/σ_2. This optimal value of a is the solution of ∂E[(Z − X_1)²]/∂a = 0. •
Proof: The function r(X) is a random variable because {ω : r(X(ω)) ≤ y}, y ≥ 0, is equal to {ω : X(ω) ∈ [−r^{−1}(y), r^{−1}(y)]} and X is a random variable. We have
E[r(X)] = ∫_Ω r(X) dP ≥ ∫_{|X|>a} r(X) dP ≥ r(a) P(|X| > a)
because r ≥ 0 and r(x) ≥ r(a) for |x| ≥ a. The common form of the Chebyshev inequality corresponds to r(x) = |x|^p, where p ≥ 1 is an integer. •
Jensen's inequality. If g : ℝ → ℝ is a convex function and X and g(X) are integrable random variables, then
g(E[X]) ≤ E[g(X)].   (2.111)
Proof: We use two facts to prove Eq. 2.111. First, convex functions are continuous so that g(X) is a random variable. Second, a convex function g has the property
The inequalities |E[X]| ≤ E[|X|] and (E[X])^{2q} ≤ E[X^{2q}] for an integer q > 0 follow from Eq. 2.111 with g(x) = |x| and g(x) = x^{2q}, respectively. •
E[|X Y|] ≤ E[|X|^p]^{1/p} E[|Y|^q]^{1/q},  where p, q > 1 and 1/p + 1/q = 1.   (2.113)
Proof: The inequality |a b| ≤ |a|^p/p + |b|^q/q holds for a, b ∈ ℝ and p, q as in Eq. 2.113. The expectation of this inequality with a = X/(E[|X|^p])^{1/p} and b = Y/(E[|Y|^q])^{1/q} gives Eq. 2.113. The Hölder inequality gives E[|X|] ≤ E[|X|^p]^{1/p} for Y = 1 and the Cauchy–Schwarz inequality for p = q = 2.
The inequality E[|X|] ≤ E[|X|^p]^{1/p} shows that X ∈ L_p implies X ∈ L_1, where p > 1 is an integer. •
E[|X + Y|^p]^{1/p} ≤ E[|X|^p]^{1/p} + E[|Y|^p]^{1/p},  p ≥ 1.   (2.114)
Proof: The result follows by applying the Hölder inequality, with 1/p + 1/q = 1, to the terms on the right side of E[|X + Y|^p] ≤ E[|X| |X + Y|^{p−1}] + E[|Y| |X + Y|^{p−1}], so that we have
E[|X + Y|^p] ≤ (E[|X|^p]^{1/p} + E[|Y|^p]^{1/p}) E[|X + Y|^{(p−1)q}]^{1/q}.
μ = (μ_1, μ_2)ᵀ  and  c = [σ_1², ρ σ_1 σ_2; ρ σ_1 σ_2, σ_2²].
The above inequalities can be used to prove that the correlation coefficient ρ takes values in [−1, 1] and ρ = ±1 if and only if X_1 and X_2 are linearly related. ◊
Proof: Set X̃_i = (X_i − μ_i)/σ_i ∼ (0, 1). The Cauchy–Schwarz inequality applied to the random variables X̃_1 and X̃_2 gives |ρ|² ≤ 1. If X_1 = a X_2 + b, then ρ = a/|a| so that ρ = ±1. If ρ = 1, the expectation of (X̃_1 − X̃_2)² is zero so that P(|X̃_1 − X̃_2| > ε) = 0, ∀ε > 0, by the Chebyshev inequality. Hence, X̃_1 = X̃_2 a.s. or X_1 = a X_2 + b a.s. •
Note: The a.s. convergence has some equivalent definitions. For example, it can be shown that X_n →^{a.s.} X if and only if for any ε > 0 we have
lim_{m→∞} P(A_m(ε)) = P(lim_{m→∞} A_m(ε)) = P(limsup_{n→∞} {|X_n − X| > ε}) = P({|X_n − X| > ε} i.o.) = 0,
where A_m(ε) = ∪_{n≥m} {|X_n − X| > ε}. ▲
Figure 2.8. Types of convergence for random sequences and their relationship: a.s. convergence implies convergence in probability, which implies convergence in distribution; L_p convergence implies convergence in probability; convergence in probability implies a.s. convergence on a subsequence of every subsequence, and implies L_p convergence if X_n is dominated by some Y ∈ L_p
The limits of sequences of random variables have similar properties as the limits of numerical series. For example, X_n →^{a.s.} X and Y_n →^{a.s.} Y imply α X_n + β Y_n →^{a.s.} α X + β Y, where α, β ∈ ℝ, and X_n Y_n →^{a.s.} X Y.
If X_n →^{a.s.} X and Y ≥ 0 is a random variable such that |X_n| < Y a.s. for each n, then |X| ≤ Y a.s.
Note: Proofs of the above statements can be found, for example, in [40] (Chapter 4) and [150] (Section 6.3). ▲
Example 2.66: Let ([0, 1], B([0, 1]), P) be a probability space with P(dω) = dω and let X_n = 2ⁿ 1_{(0,1/n)} be a sequence of random variables defined on this space. The sequence X_n converges to zero in probability as n → ∞ but does not converge in L_p, p ≥ 1. ◊
Proof: For any ε > 0, we have P(|X_n| > ε) = P((0, 1/n)) = 1/n → 0 as n → ∞. Hence, X_n converges in probability to zero. On the other hand,
E[|X_n|^p] = 2^{np} P((0, 1/n)) = 2^{np}/n → ∞ as n → ∞.
Hence, convergence in probability does not imply L_p convergence ([151], p. 182). •
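The two claims of Example 2.66 can be checked by simulating ω uniformly on [0, 1]. A minimal sketch (NumPy; sample size and seed are illustrative) estimates P(|X_n| > 1/2) = 1/n, which vanishes, and E[|X_n|] = 2ⁿ/n, which diverges:

```python
import numpy as np

rng = np.random.default_rng(1)
omega = rng.uniform(0.0, 1.0, 100000)       # P(domega) = domega on [0, 1]

def X_n(n, omega):
    """X_n(omega) = 2^n on (0, 1/n) and zero elsewhere."""
    return np.where((omega > 0.0) & (omega < 1.0 / n), 2.0 ** n, 0.0)

for n in (10, 20, 30):
    xn = X_n(n, omega)
    p_hat = np.mean(np.abs(xn) > 0.5)       # estimates P(|X_n| > 1/2) = 1/n -> 0
    m_hat = np.mean(np.abs(xn))             # estimates E[|X_n|] = 2^n / n -> infinity
    assert abs(p_hat - 1.0 / n) < 0.01
    assert m_hat > 2.0 ** n / (2.0 * n)
```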
so that E[|X_n|^p] → 0 as n → ∞. However, the numerical sequence X_n(ω) does not converge to zero ([151], p. 182). •
S_n = Σ_{k=1}^n (X_k − μ)/n →^{pr} 0,  n → ∞.   (2.115)
This result is referred to as the weak law of large numbers ([106], p. 36). ◊
Proof: The mean and variance of S_n are μ_n = 0 and σ_n² = σ²/n. The Chebyshev inequality (Eq. 2.110) gives P(|S_n| > ε) ≤ σ_n²/ε² = σ²/(n ε²) for arbitrary ε > 0 and each n ≥ 1. Hence, P(|S_n| > ε) → 0 as n → ∞ so that S_n →^{pr} 0. This convergence indicates that most of the probability mass of S_n is concentrated in a small vicinity of zero for a sufficiently large n. The convergence S_n →^{pr} 0 does not provide any information on the behavior of the numerical sequence S_n(ω) for an arbitrary but fixed ω ∈ Ω. Other versions of the weak law of large numbers can be found in [151] (Section 7.2). •
(1/n) Σ_{k=1}^n X_k →^{a.s.} E[X_1],  n → ∞.   (2.116)
This result is known as the strong law of large numbers [151] (Sections 7.4 and 7.5). Figure 2.9 shows five samples of (1/n) Σ_{k=1}^n X_k, where X_k are independent Gaussian variables with mean zero and variance one. All samples approach the mean of X_1 as n increases in agreement with Eq. 2.116. ◊
Note: The strong law of large numbers characterizes the behavior of the numerical sequence (1/n) Σ_{k=1}^n X_k(ω). The a.s. convergence of (1/n) Σ_{k=1}^n X_k to μ = E[X_1] means that the numerical sequence (1/n) Σ_{k=1}^n X_k(ω) converges to μ for each ω ∈ Ω \ N, where N ∈ F and P(N) = 0. ▲
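The pathwise behavior described in this note can be observed on a single simulated sample. A minimal sketch (NumPy, illustrative seed) tracks the running mean (1/n) Σ_{k=1}^n X_k(ω) for iid N(0, 1) variables:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100000
x = rng.standard_normal(n)                          # iid N(0, 1), E[X_1] = 0
running_mean = np.cumsum(x) / np.arange(1, n + 1)   # (1/n) sum_{k=1}^n X_k(omega)

# the sample path of the running mean settles near E[X_1] = 0
assert abs(running_mean[-1]) < 0.02
assert np.max(np.abs(running_mean[n // 2:])) < 0.05
```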
S_n* = (1/√n) Σ_{k=1}^n (X_k − μ)/σ →^d N(0, 1),  n → ∞.   (2.117)
This result is known as the central limit theorem ([151], Section 8.2). ◊
Figure 2.9. Five samples of (1/n) Σ_{k=1}^n X_k for X_k independent N(0, 1) variables
Proof: The characteristic function of S_n* is
φ_{S_n*}(u) = exp[−√−1 u μ √n/σ] φ(u/(√n σ))ⁿ,
and results from φ(u/(√n σ))ⁿ = exp[n ln(φ(u/(√n σ)))] by expanding the function ln(φ(u/(√n σ))) in a Taylor series ([79], p. 376). The limit of the characteristic function of S_n* as n → ∞ is equal to exp(−u²/2), which is the characteristic function of the standard Gaussian variable. •
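Eq. 2.117 can be illustrated by simulation. A minimal sketch (NumPy; seed and sample sizes are illustrative) standardizes sums of iid uniform variables (μ = 1/2, σ² = 1/12) and compares the empirical distribution of S_n* with N(0, 1):

```python
import numpy as np

rng = np.random.default_rng(3)
reps, n = 20000, 200
# X_k iid uniform on [0, 1]: mu = 1/2, sigma^2 = 1/12
x = rng.uniform(0.0, 1.0, size=(reps, n))
s_star = (x.sum(axis=1) - n * 0.5) / (np.sqrt(n) * np.sqrt(1.0 / 12.0))

# the standardized sums are approximately N(0, 1)
assert abs(np.mean(s_star)) < 0.03
assert abs(np.std(s_star) - 1.0) < 0.02
assert abs(np.mean(s_star <= 1.0) - 0.8413) < 0.01   # Phi(1) ~ 0.8413
```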
R_n = Σ_{i=1}^n X_i,  n = 1, 2, ...,  and  R_0 = 0.   (2.118)
Example 2.72: Suppose that the random variables X_i in Eq. 2.118 are real-valued with finite mean μ and variance σ² and that the measurable space (W, 𝒢) is (ℝ, B). The mean, variance, and coefficient of variation of the random walk R_n corresponding to these random variables are
E[R_n] = n μ,  Var[R_n] = n σ²,  and  c.o.v.[R_n] = √Var[R_n]/E[R_n] = v/√n,
where v = σ/μ. ◊
If P(X_1 = 0) < 1 and E[|X_1|] < ∞, the asymptotic behavior of the random walk R_n as n → ∞ is ([150], Proposition 7.2.3, p. 563):
• R_n → +∞ a.s.,
• R_n → −∞ a.s., or
• −∞ = liminf R_n < limsup R_n = +∞ a.s.   (2.119)
Note: Eq. 2.119 shows that R_n converges to an infinity or oscillates between −∞ and +∞ as n → ∞. The additional information on the mean of X_1 in Eq. 2.120 allows us to predict the behavior of the random walk more precisely than in Eq. 2.119. For example, if the random variables X_k take the values one and zero with probabilities p, 0 < p < 1, and 1 − p, the corresponding random walk R_n converges to +∞ a.s. since E[X_1] = p. ▲
Example 2.73: Let X_1, X_2, ... be a sequence of iid Gaussian random variables with mean μ and standard deviation σ. Figure 2.10 shows samples of the random walk R_n in Eq. 2.118 for X_1 ∼ N(μ, σ²), where μ = 1, μ = 0, and μ = −1 and σ = 10. ◊
Figure 2.10. Sample paths of the random walk for X_k ∼ N(μ, 10²) with μ = 1, μ = 0, and μ = −1
is called the Lyapunov exponent in the analysis of dynamic systems (Section 8.7). The solution X_n is stable a.s., that is, it converges a.s. to zero, if λ_LE < 0 and diverges if λ_LE > 0. ◊
Proof: The recurrence formula for X_n yields X_n/x = ∏_{k=1}^n A_k for X_0 = x so that we have ln|X_n/x| = Σ_{k=1}^n ln|A_k|. It follows that ln|X_n/x| is a random walk since ln|A_k| are iid random variables (Eq. 2.118). If E[ln|A_1|] < 0, the random walk ln|X_n/x| converges to −∞ a.s. as n → ∞ (Eq. 2.120) so that the solution |X_n/x| = exp(Σ_{k=1}^n ln|A_k|) converges to zero a.s. If E[ln|A_1|] > 0, ln|X_n/x| tends to +∞ a.s. as n → ∞ so that |X_n/x| = exp(Σ_{k=1}^n ln|A_k|) approaches +∞ a.s. •
Example 2.75: Suppose that the random variables A_n in Example 2.74 are exponential with distribution F(x) = 1 − exp(−p x), p > 0, x ≥ 0. The solution X_n is stable a.s. for p > p* ≈ 0.56, that is, X_n →^{a.s.} 0 as n → ∞. Figure 2.11 shows the dependence of the Lyapunov exponent λ_LE on p in the range [0.4, 0.7]. ◊
Figure 2.11. Stability condition for X_n = A_n X_{n−1} with A_n iid exponential variables
Proof: The mean of ln|A_k| = ln(A_k) is finite and can be calculated from
λ_LE = ∫_0^∞ ln(u) p exp(−p u) du = −p ∫_{−∞}^∞ ξ exp[−ξ − p exp(−ξ)] dξ
by numerical integration. The Lyapunov exponent λ_LE = E[ln|A_1|] decreases with p and is negative for p > p* ≈ 0.56. •
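The integral for λ_LE can be evaluated numerically as in this proof. The sketch below (NumPy trapezoidal rule, using the substitution u = exp(t)) locates the zero crossing near p* ≈ 0.56; since E[ln Z] = −γ for a unit exponential Z, the closed form is λ_LE = −γ − ln p with γ Euler's constant, so p* = exp(−γ) ≈ 0.5615:

```python
import numpy as np

def lyapunov_exponent(p, n_grid=40001):
    """lambda_LE = E[ln A_1] = int_0^inf ln(u) p exp(-p u) du for A_1 ~ exponential(p),
    evaluated with the substitution u = exp(t) and the trapezoidal rule."""
    t = np.linspace(-30.0, 10.0, n_grid)
    integrand = t * p * np.exp(t - p * np.exp(t))
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))

assert lyapunov_exponent(0.5) > 0.0        # X_n diverges a.s.
assert lyapunov_exponent(0.6) < 0.0        # X_n -> 0 a.s.

# the zero crossing agrees with lambda_LE = -gamma - ln(p), i.e., p* = exp(-gamma)
ps = np.linspace(0.4, 0.7, 301)
vals = np.array([lyapunov_exponent(p) for p in ps])
p_star = ps[np.argmin(np.abs(vals))]
assert abs(p_star - np.exp(-np.euler_gamma)) < 0.01
```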
2.15 Filtration
Let (Ω, F) be a measurable space. An increasing collection F_0 ⊆ F_1 ⊆ ··· ⊆ F_n ⊆ ··· ⊆ F of sub-σ-fields of F is said to be a filtration in (Ω, F).
for example, we may be out of money. Hence, the time T at which we stop playing the game is a random variable that takes values in {1, 2, ...} and depends on the entire game history. Let F_n represent the knowledge accumulated at time n ≥ 1. The event {T = n} to quit the game after n rounds should be in F_n. A similar "game" can be envisioned between a physical or biological system and the environment. The system quits the game when its damage state reaches a critical level, which is the analogue to running out of cash. Depending on the situation, the time T at which such a system quits the game may be called failure time or death time.
We define in this section stopping times and prove some of their properties.
Additional information on stopping times can be found in [59] (Chapter 2) and
[151] (Section 10.7).
Note: This definition states that T is an F_n-stopping time if F_n contains sufficient information to determine whether T is smaller or larger than n for each time n ≥ 0. For example, T = inf{n ≥ 0 : |R_n| ≥ a}, a > 0, gives the first time when the random walk R in Eq. 2.118 exits (−a, a). We can determine whether T ≤ n or T > n by observing the random walk up to time n.
We also note that a constant random variable t ∈ {0, 1, ...} is a stopping time since {ω : t ≤ n} is either ∅ or Ω so that the event {t ≤ n} is in F_n for each n ≥ 0. If T is an F_n-stopping time so is T + t since {ω : T(ω) + t ≤ n} = {ω : T(ω) ≤ n − t} ∈ F_{(n−t)∨0} and F_{(n−t)∨0} ⊆ F_n for t ≥ 0. ▲
Proof: We show first that the collection of sets in the second definition of F_T is a σ-field, and then that the above two definitions of F_T are equivalent.
We have (1) Ω ∩ {T = n} = {T = n} ∈ F_n since T is a stopping time so that Ω ∈ F_T. (2) A ∈ F_T implies A^c ∈ F_T since A^c ∩ {T = n} = {T = n} ∩ (A ∩ {T = n})^c ∈ F_n,
Proof: Take B ∈ 𝒢. The random variable T(ω) = inf{n ≥ 0 : X_n(ω) ∈ B} satisfies the condition {T ≤ n} if at least one of the events {X_0 ∈ B}, {X_0 ∉ B, ..., X_{k−1} ∉ B, X_k ∈ B}, k = 1, ..., n, occurs, that is,
so that {T ≤ n} ∈ F_n because it consists of finite intersections and unions of events in F_k and F_k ⊆ F_n for k ≤ n. •
If S and T are F_n-stopping times such that S ≤ T, then F_S ⊆ F_T.
Proof: If A ∈ F_S, then A ∩ {S ≤ n} ∈ F_n for all n ≥ 0. We need to show that A ∩ {S ≤ n} ∈ F_n implies A ∩ {T ≤ n} ∈ F_n for all n ≥ 0. This implication is true since the sets A ∩ {S ≤ n} and {T ≤ n} are in F_n, and A ∩ {T ≤ n} = (A ∩ {S ≤ n}) ∩ {T ≤ n} holds if S ≤ T. •
If S and T are F_n-stopping times, then S ∧ T and S ∨ T are stopping times.
Proof: We have {S ∧ T ≤ n} = {S ≤ n} ∪ {T ≤ n} ∈ F_n and {S ∨ T ≤ n} = {S ≤ n} ∩ {T ≤ n} ∈ F_n for each n ≥ 0. •
To show that Y is independent of F_T, we need to prove that P(A ∩ {Y ∈ B}) = P(A) P(Y ∈ B), where A ∈ F_T and B ∈ B^∞. We have
P(A ∩ {Y ∈ B}) = Σ_{n=0}^∞ P(A ∩ {Y ∈ B} ∩ {T = n}) = Σ_{n=0}^∞ P(A ∩ {T = n}) P(Y ∈ B) = P(A) P(Y ∈ B).
Example 2.78: Let R be the random walk in Eq. 2.118, where X_i are iid Gaussian random variables with mean μ and standard deviation σ > 0. Let T = inf{n ≥ 0 : R_n ∉ (−a, a)} be the first time R leaves (−a, a), a > 0. Figure 2.12 shows 1000 samples of the stopping time T and the corresponding histogram of these samples for μ = 0.2, σ = 1, and a = 5. ◊
Figure 2.12. Samples of the stopping time T and the histogram of these samples
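Samples of the stopping time T in Example 2.78 can be generated as follows (a sketch in NumPy; the seed and the truncation n_max are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)

def first_exit_time(mu=0.2, sigma=1.0, a=5.0, n_max=10000):
    """First time T at which R_n = X_1 + ... + X_n, X_i iid N(mu, sigma^2),
    leaves (-a, a); n_max truncates the search."""
    r = np.cumsum(rng.normal(mu, sigma, n_max))
    outside = np.nonzero(np.abs(r) >= a)[0]
    return int(outside[0]) + 1 if outside.size else n_max

samples = np.array([first_exit_time() for _ in range(1000)])
assert samples.min() >= 1
assert 5.0 < samples.mean() < 60.0   # the drift mu = 0.2 pushes R out of (-5, 5)
```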
E[X^{(1)} | X^{(2)} = z] = ∫_{ℝ^{d_1}} x^{(1)} f(x^{(1)} | z) dx^{(1)}   (2.124)
of the conditional vector X^{(1)} | (X^{(2)} = z), that is, the expectation of X^{(1)} given the information X^{(2)} = z.
We extend the definition of the conditional expectation E[X^{(1)} | X^{(2)}] in Eq. 2.124 by considering information more general than X^{(2)} = z. This section defines the conditional expectation E[· | 𝒢], where (Ω, F, P) is a probability space and 𝒢 is a sub-σ-field of F. The conditional expectation E[· | 𝒢] is needed in many applications involving stochastic processes, as we will see in this and the subsequent chapters.
Example 2.79: Consider the experiment of rolling two dice. The sample space, σ-field, and probability measure for this experiment are Ω = {ω = (i, j) : i, j = 1, ..., 6}, all subsets of Ω, and P({ω}) = 1/36 for ω ∈ Ω, respectively. Define the random variables X(ω) = i + j and Z(ω) = i ∧ j. The expected value of X is E[X] = (1/36)[(1 + 1) + (1 + 2) + ··· + (6 + 6)] = 7.
Suppose we are told the value of Z and asked to determine the expectation of X given Z, that is, the expectation of the conditional variable X | Z denoted by E[X | Z]. The conditional expectation E[X | Z] is a simple random variable with density in Fig. 2.13 so that
Figure 2.13. Density of the conditional expectation E[X | Z], with masses P(Z = z), z = 1, ..., 6
The density and the conditional expectation of X | (Z = 6) are f_{X|Z}(x | 6) = δ(x − 12) and E[X | Z = 6] = 12, where δ denotes the Dirac delta function. The probability of the event {Z = 6} is 1/36. If Z = 5, there are three equally likely outcomes, (6, 5), (5, 5), and (5, 6), so that f_{X|Z}(x | 5) = (1/3) δ(x − 10) + (2/3) δ(x − 11) and E[X | Z = 5] = (10)(1/3) + (11)(2/3) = 32/3.
The probability of the event {Z = 5} is 3/36.
Note that the random variable E[X | Z] does not have a density. The above expressions of f_{X|Z} are formal. They are used here and in some of the following examples for simplicity. •
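The conditional expectations of Example 2.79 can be verified by enumerating the 36 outcomes. The sketch below uses exact rational arithmetic and also checks the property E[E[X | Z]] = E[X] = 7:

```python
from fractions import Fraction
from collections import defaultdict

# the 36 equally likely outcomes of rolling two dice
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]

# X = i + j, Z = min(i, j); E[X | Z = z] is the average of X over {Z = z}
values_by_z = defaultdict(list)
for i, j in outcomes:
    values_by_z[min(i, j)].append(i + j)

cond_exp = {z: Fraction(sum(xs), len(xs)) for z, xs in values_by_z.items()}

assert cond_exp[6] == 12                  # only (6, 6) remains
assert cond_exp[5] == Fraction(32, 3)     # (5, 5), (5, 6), (6, 5)
# E[E[X | Z]] = E[X] = 7
assert sum(Fraction(len(xs), 36) * cond_exp[z] for z, xs in values_by_z.items()) == 7
```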
Note: E[X | A_n] is the conditional expectation of X with respect to A_n and P_{A_n}(A) = P(A | A_n) = P(A ∩ A_n)/P(A_n). ▲
• E[X | 𝒢] is 𝒢-measurable.
• E[X | 𝒢] can be viewed as an approximation of X.
• E[X | 𝒢] satisfies the defining relation
Proof: Since the indicator functions 1_{A_n} are 𝒢-measurable and E[X | A_n] are some constants, the conditional expectation in Eqs. 2.125 and 2.126 is a 𝒢-measurable function. We also note that E[X | 𝒢] is a discrete random variable on (Ω, 𝒢) whose probability density function is f(x) = Σ_n P(A_n) δ(x − E[X | A_n]).
Since E[X | A_n] can be viewed as a local average of X over the subset A_n, E[X | 𝒢] provides an approximation for the random variable X. The accuracy of this approximation depends on the refinement of the partition {A_n}. If the partition has a single element that necessarily coincides with the sample space, E[X | 𝒢] is equal to E[X]. If
the sample space is countable and the partition is given by the elements {ω} of the sample space, E[X | 𝒢] is equal to X.
It remains to prove Eq. 2.127. If A ∈ 𝒢, then A = ∪_{k∈J} A_k for some index set J so that
∫_A X dP = Σ_{k∈J} ∫_{A_k} X dP = Σ_{k∈J} E[X | A_k] P(A_k)  and  ∫_A E[X | 𝒢] dP = Σ_{k∈J} E[X | A_k] P(A_k).
The above chain of equalities is based on properties of integrals of random variables and Eqs. 2.125 and 2.126. •
Example 2.80: Let (Ω, F, P) be a probability space, where Ω = [0, 1], F = B([0, 1]), and P(dω) = dω. Consider a random variable X(ω) = 2 + sin(2π ω), ω ∈ Ω, defined on this space. Let A_1 = [0, 1/4), A_2 = [1/4, 3/4), A_3 = [3/4, 1), and A_4 = {1} be a partition of Ω and denote by 𝒢 the σ-field generated by this partition. We have
E[X | 𝒢](ω) = 2 + 2/π for ω ∈ A_1,  2 for ω ∈ A_2,  2 − 2/π for ω ∈ A_3,  and x for ω ∈ A_4,
where x can be any real number. The lack of uniqueness is of no concern since P(A_4) = 0. The conditional expectation will be defined as a class of random variables that are equal a.s. (Eq. 2.128). Figure 2.14 shows the measurable functions X and E[X | 𝒢] for x = 3. The conditional expectation E[X | 𝒢] represents an approximation of X whose accuracy depends on the partition {A_n} of Ω. ◊
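The local averages E[X | A_k] = (1/P(A_k)) ∫_{A_k} X dP in Example 2.80 can be computed numerically; a direct calculation gives 2 + 2/π on A_1, 2 on A_2, and 2 − 2/π on A_3. The sketch below (NumPy trapezoidal rule; function names are illustrative) confirms these values:

```python
import numpy as np

def X_of(w):
    """X(omega) = 2 + sin(2 pi omega) on Omega = [0, 1]."""
    return 2.0 + np.sin(2.0 * np.pi * w)

def local_average(lo, hi, n=200001):
    """E[X | A] = (1 / P(A)) int_A X(omega) domega, by the trapezoidal rule."""
    w = np.linspace(lo, hi, n)
    vals = X_of(w)
    integral = np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(w))
    return integral / (hi - lo)

e1 = local_average(0.0, 0.25)    # A_1 = [0, 1/4):   2 + 2/pi
e2 = local_average(0.25, 0.75)   # A_2 = [1/4, 3/4): 2
e3 = local_average(0.75, 1.0)    # A_3 = [3/4, 1):   2 - 2/pi
assert abs(e1 - (2.0 + 2.0 / np.pi)) < 1e-6
assert abs(e2 - 2.0) < 1e-6
assert abs(e3 - (2.0 - 2.0 / np.pi)) < 1e-6
# E[E[X | G]] = E[X] = 2
assert abs(0.25 * e1 + 0.5 * e2 + 0.25 * e3 - 2.0) < 1e-6
```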
The defining relation (Eq. 2.127) is essential for calculations and reveals
some useful properties of the conditional expectation.
Figure 2.14. The measurable functions X and E[X | 𝒢] for x = 3
Example 2.82: Let X be a random variable with distribution function F and finite mean defined on a probability space (Ω, F, P). If a ∈ ℝ is such that F(a) ∈ (0, 1), then, with A = {X ≤ a},
E[X | A] = (1/P(A)) ∫_A X dP = (1/F(a)) ∫_{−∞}^a x dF(x)  and
E[X | A^c] = (1/P(A^c)) ∫_{A^c} X dP = (1/(1 − F(a))) ∫_a^∞ x dF(x).
so that the random variables E[X | 𝒢] and X have the same expectation. •
Note: The definition is meaningful because, if E[|X|] < ∞ and 𝒢 is a sub-σ-field of F, there exists a unique equivalence class of integrable random variables, denoted by E[X | 𝒢], that is 𝒢-measurable and satisfies the defining condition of Eq. 2.128 for all A ∈ 𝒢 ([40], Theorem 9.1.1, p. 297). ▲
Proof: If G = {∅, Ω}, then E[X | G] is constant on Ω so that Eq. 2.128 gives E[X] = E[X | G] for A = Ω.
If G = F, then E[X | G] = E[X | F] is F-measurable and the defining relation yields ∫_A (X − E[X | F]) dP = 0 for all A ∈ F so that X = E[X | F] a.s.
The defining relation gives E{E[X | G]} = E[X] for A = Ω. •
Proof: If Z = 1_A, A ∈ G, then Eq. 2.129 holds because E{(X − E[X | G]) 1_A} = ∫_A (X − E[X | G]) dP and this integral is zero by the defining relation. Eq. 2.129 is also valid for a simple random variable Z = Σ_n b_n 1_{A_n}, A_n ∈ G, by the linearity of expectation. The extension to an arbitrary random variable Z results from the representation of Z by a difference of two positive random variables, which can be defined as limits of simple random variables ([40], Section 9.1). •
[Figure: E[X | G] as the orthogonal projection of X on the subspace {Z : Z ∈ L2(Ω, G, P)}.]
Note: The function (X1, X2) ↦ ⟨X1, X2⟩ = E[X1 X2] is the inner product defined on L2 (Section 2.6). If ⟨X1, X2⟩ = 0, X1 and X2 are said to be orthogonal. A random variable Y ∈ L2(Ω, F, P) is orthogonal to L2(Ω, G, P) if ⟨Y, Z⟩ = 0 for all Z ∈ L2(Ω, G, P). Eq. 2.129 states that X − E[X | G] is orthogonal to L2(Ω, G, P). According to the orthogonal projection theorem in Eq. 2.51, E[X | G] is the best m.s. estimator of X given the information content of G. ▲
Proof: The random variables E[X Z | G] and Z E[X | G] are G-measurable by the definition of the conditional expectation and the fact that the product of two G-measurable functions, here Z and E[X | G], is G-measurable. If Z = 1_B, B ∈ G, Eq. 2.130 holds a.s. because for A ∈ G the left and the right sides of this equation both integrate over A to ∫_{A∩B} X dP, by the defining relation. Hence, Eq. 2.130 holds for a simple random variable Z because the conditional expectation is a linear function (Eq. 2.131). The extension to general random variables results from the representation of an arbitrary random variable by the difference of two positive random variables, which can be defined as limits of simple random variables ([40], Theorem 9.1.3, p. 300). •
If G1 and G2 are sub-σ-fields of F such that G1 ⊂ G2, then we can perform the following change of fields in the conditional expectation:

E{E[X | G2] | G1} = E{E[X | G1] | G2} = E[X | G1].  (2.135)

This gives the first equality of Eq. 2.135. The second equality of this equation results since E[X | G1] is G1-measurable and, therefore, G2-measurable. Hence, E{E[X | G1] | G2} = E[X | G1] by the first property of Eq. 2.131. •
Example 2.83: Let R = (R0, R1, ...) be the random walk in Eq. 2.118 and F_m be the σ-field generated by (R0, R1, ..., R_m), m ≥ 0. The average of the random walk at a time n > m given the past history (R0, R1, ..., R_m), that is, the σ-field F_m, is E[R_n | F_m] = R_m + (n − m) E[X1], where the X_k are independent identically distributed random variables in L1. ⋄
Proof: We have

E[R_n | F_m] = E[R_m + Σ_{k=m+1}^{n} X_k | F_m] = R_m + Σ_{k=m+1}^{n} E[X_k] = R_m + (n − m) E[X1],

since R_m is F_m-measurable and the X_k, k ≥ m + 1, are independent of F_m. •
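The key step above, the independence of future increments from F_m, can be checked by simulation. A sketch with an assumed step distribution X_k ~ Uniform(0, 1), so E[X1] = 1/2 (this choice is illustrative only): the average of R_n − R_m is close to (n − m) E[X1] whether the past value R_m was below or above its mean.

```python
import random

random.seed(1)
m, n, n_paths = 10, 30, 40_000
mean_step = 0.5                     # E[X_1] for X_k ~ Uniform(0, 1)

lo_bin, hi_bin = [], []             # future increments, grouped by the past
for _ in range(n_paths):
    r_m = sum(random.random() for _ in range(m))        # R_m
    incr = sum(random.random() for _ in range(n - m))   # R_n - R_m
    (lo_bin if r_m < m * mean_step else hi_bin).append(incr)

avg_lo = sum(lo_bin) / len(lo_bin)
avg_hi = sum(hi_bin) / len(hi_bin)
target = (n - m) * mean_step        # (n - m) E[X_1] = 10
```

Both bin averages are close to the target, illustrating that conditioning on the past does not change the mean of the future increment.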
Example 2.84: Let (Ω, F, P) be a probability space, {A_n} denote a countable partition of Ω such that A_n ∈ F for each n, and G be the σ-field generated by {A_n}. Then the random variable X − E[X | G] is orthogonal to G, where the conditional expectation E[X | G] is equal to Σ_n E[X | A_n] 1_{A_n}. ⋄
Proof: The conditional expectation E[X | G] is given by Eqs. 2.125 and 2.126, where the members of G are {∪_{n∈J} A_n} with J ⊆ {1, 2, ...}. It remains to show that X − E[X | G] is orthogonal to G, that is, that E[(X − E[X | G]) Z] = 0 for all G-measurable Z. Because Z is G-measurable, we have Z = Σ_n α_n 1_{A_n} so that

E[(X − E[X | G]) Z] = Σ_n α_n E[X 1_{A_n}] − Σ_{m,n} α_m E[X | A_n] E[1_{A_m} 1_{A_n}].

The above expression is zero since E[X 1_{A_n}] = E[X | A_n] P(A_n) (Eq. 2.126) and E[1_{A_m} 1_{A_n}] = P(A_m ∩ A_n) = δ_{mn} P(A_n), so that the second summation becomes Σ_n α_n E[X | A_n] P(A_n). •
Example 2.85: Let X and Z be random variables in L2(Ω, F, P). We have seen that the conditional expectation E[X | σ(Z)] = E[X | Z] is the best m.s. estimator of X with respect to the information content of σ(Z). The best m.s. linear estimator of X is

X̂ = E[X] + ((E[X Z] − E[X] E[Z]) / (E[Z²] − E[Z]²)) (Z − E[Z]).

This estimator has the functional form of the conditional Gaussian variable in Example 2.64, represents the linear regression of X with respect to Z, and becomes X̂ = E[X] if X and Z are not correlated. ⋄
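A simulation sketch of this formula with a hypothetical pair (X, Z) (Z uniform on (0, 1) and X = Z² plus Gaussian noise; these choices are illustrative only): the slope estimated from sample moments matches the regression formula, and the linear estimator has a larger mean-square error than the conditional expectation E[X | Z] = Z².

```python
import random

random.seed(2)
N = 50_000
# Hypothetical pair (X, Z), chosen only to illustrate the formula;
# the best linear estimator need not coincide with E[X | Z].
zs = [random.random() for _ in range(N)]
xs = [z * z + 0.1 * random.gauss(0.0, 1.0) for z in zs]

EX = sum(xs) / N
EZ = sum(zs) / N
EXZ = sum(x * z for x, z in zip(xs, zs)) / N
EZ2 = sum(z * z for z in zs) / N

beta = (EXZ - EX * EZ) / (EZ2 - EZ ** 2)   # slope of the linear regression

def x_hat(z):
    """Best m.s. linear estimator of X given Z (Example 2.85)."""
    return EX + beta * (z - EZ)

# Mean-square errors: the linear estimator minimizes the m.s. error among
# linear functions of Z, but cannot beat E[X | Z] = Z**2.
mse_linear = sum((x - x_hat(z)) ** 2 for x, z in zip(xs, zs)) / N
mse_cond = sum((x - z * z) ** 2 for x, z in zip(xs, zs)) / N
```

For this choice of (X, Z) the regression slope is close to 1, and mse_linear exceeds mse_cond, which is close to the noise variance 0.01.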
Proof: We have P(A | G) = E[1_A | G] by Eq. 2.136 and (Eqs. 2.125 and 2.126)

E[1_A | G] = E[1_A | B] 1_B + E[1_A | B^c] 1_{B^c} = ((1/P(B)) ∫_B 1_A dP) 1_B + ((1/P(B^c)) ∫_{B^c} 1_A dP) 1_{B^c}. •
2.18 Martingales
Let (Ω, F, P) be a probability space endowed with a filtration (F_n)_{n≥0} such that F_∞ = ∪_{n=1}^∞ F_n ⊆ F and let X_n, n = 0, 1, ..., be random variables defined on (Ω, F, P). The sequence X = (X0, X1, X2, ...) is referred to as a discrete time stochastic process or just a stochastic process. The numerical sequence (X0(ω), X1(ω), ...), ω ∈ Ω, is called a sample or sample path of X. Continuous time stochastic processes are discussed in Chapter 3.
The sequence X is an F_n-martingale if X_n is F_n-measurable and integrable for each n and E[X_n | F_m] = X_m for all n ≥ m (Eq. 2.138).
Note: If the equality in Eq. 2.138 is replaced by ≥ and ≤, then X is said to be an F_n-submartingale and F_n-supermartingale, respectively. The filtration F_n need not be mentioned if there is no confusion about it.
If the random variables X_n are in L_p(Ω, F, P), X is called a p-integrable martingale, submartingale, or supermartingale depending on the last condition of Eq. 2.138. If p = 2, then X is said to be a square integrable martingale, submartingale, or supermartingale. ▲
Suppose m ≥ 1 rounds have been completed in a game with unit stake. We can think of X_n as our total winnings (or losses) after n rounds of a game, so that X_n − X_m, n > m, gives our net total winnings in the future rounds m + 1, ..., n. The best m.s. estimator of X_n − X_m, n > m, given our knowledge F_m after m rounds is E[X_n − X_m | F_m], where F_m = σ(X1, ..., X_m), m ≥ 1, and F_0 = {∅, Ω}. If X is a martingale, then E[X_n − X_m | F_m] = 0, that is, our average fortune E[X_n | F_m] at a future time n is equal to our current fortune X_m.
This game can be generalized by allowing stakes other than one. Let A_i be our stake at game i = 0, 1, ..., where A_0 = 0. Because we decide our stake for round m + 1 based on our knowledge F_m after m games, A_{m+1} is F_m-measurable. Processes with this property are said to be predictable processes. The sequence A_1, A_2, ... is called a gambling strategy. The summation

M_n = Σ_{i=1}^{n} A_i (X_i − X_{i−1}),  (2.139)

gives our total winnings after n ≥ 1 games, where X_0 = 0. The sequence M = (M_0, M_1, ...) defined by Eq. 2.139 for n ≥ 1 and M_0 = 0 is a discrete version of the stochastic integral considered in Chapter 4. The integrand A_i is a predictable process and the integrator X_i is a martingale.
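Eq. 2.139 can be explored numerically. A sketch for a fair ±1 game with a hypothetical bounded strategy (stake 2 after a losing round, 1 otherwise; the strategy is an assumption made for illustration): because the integrator is a martingale and the stakes are predictable, E[M_n] = 0 for any such strategy.

```python
import random

random.seed(3)

def winnings(n, strategy):
    """Discrete stochastic integral M_n = sum_i A_i (X_i - X_{i-1})
    (Eq. 2.139) for a fair +/-1 game; strategy(history) returns the stake
    A_i and may use only the outcomes of rounds 1..i-1 (predictability)."""
    m, history = 0.0, []
    for _ in range(n):
        a = strategy(history)                 # decided before the round
        step = random.choice((-1.0, 1.0))     # X_i - X_{i-1}
        m += a * step
        history.append(step)
    return m

def bet_more_after_loss(history):
    """A hypothetical strategy: stake 2 after a losing round, else 1."""
    return 2.0 if history and history[-1] < 0 else 1.0

# X is a martingale and A is predictable, so E[M_n] = 0 for any strategy.
n_paths = 40_000
avg = sum(winnings(20, bet_more_after_loss) for _ in range(n_paths)) / n_paths
```

The empirical mean of M_20 stays near zero: no predictable strategy changes the expected fortune in a fair game.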
Example 2.87: Consider the random walk in Eq. 2.118. We have shown that E[R_n | F_m] = R_m + (n − m) E[X1] (Example 2.83), so that R is an F_n-martingale if E[X1] = 0.
Example 2.88: Let R = (R0, R1, ...) be the random walk in Eq. 2.118. If the random variables X_i are in L2(Ω, F, P) and have mean zero, the sequence S_n = R_n² = Σ_{i,j=1}^{n} X_i X_j, n ≥ 1, with S_0 = 0 is an F_n-submartingale and S_n − n E[X1²] is an F_n-martingale. ⋄
Proof: The sequences S_n and S_n − n E[X1²] have the first two properties in Eq. 2.138. We also have, for n > m, E[S_n | F_m] = S_m + (n − m) E[X1²] ≥ S_m, which establishes the remaining properties. •
Example 2.89: Let Y_n = exp(√−1 u R_n)/φ_n(u), u ∈ ℝ, where R = (R0, R1, ...) is the random walk in Eq. 2.118 and φ_n(u) = E[exp(√−1 u R_n)], u ∈ ℝ, denotes the characteristic function of R_n. Then Y_n is a martingale. ⋄
Proof: The characteristic function of R_n is φ_n(u) = φ(u)^n, where φ denotes the characteristic function of X1. The random variables Y_n are F_n = σ(R1, ..., R_n)-measurable for each n ≥ 0 and have unit mean. For n > m, we have

E[Y_n | F_m] = (e^{√−1 u R_m}/φ_n(u)) E[e^{√−1 u Σ_{k=m+1}^{n} X_k}] = e^{√−1 u R_m} φ(u)^{n−m}/φ_n(u) = Y_m,

since R_m ∈ F_m and the random variables X_k, k ≥ m + 1, are independent of F_m. •
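For the symmetric walk with P(X1 = ±1) = 1/2 the characteristic function is φ(u) = cos u, and the martingale property implies E[Y_n] = Y_0 = 1. A small Monte Carlo sketch (the parameters u and n are arbitrary choices):

```python
import cmath
import math
import random

random.seed(4)
u, n, n_paths = 0.7, 8, 30_000
phi = math.cos(u)       # characteristic function of X_1 = +/-1 with prob 1/2

acc = 0j
for _ in range(n_paths):
    r = sum(random.choice((-1, 1)) for _ in range(n))   # R_n
    acc += cmath.exp(1j * u * r) / phi ** n             # Y_n = e^{iuR_n}/phi(u)^n
mean_Y = acc / n_paths
# The sample mean of Y_n should be close to E[Y_n] = 1 (imaginary part near 0).
```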
2.18.1 Properties
Most of the properties given in this section follow directly from the definition of submartingale, martingale, and supermartingale processes.
Proof: The notation A_n ↑ means A_n ≥ A_{n−1}. We say that A is predictable because A_n is F_{n−1}-measurable. The representation in Eq. 2.142 shows that submartingales have a predictable part A that can be told ahead of time and an unpredictable part M.
We first show that if the representation in Eq. 2.142 exists, it is unique. We have E[X_n | F_{n−1}] = E[A_n | F_{n−1}] + E[M_n | F_{n−1}] = A_n + M_{n−1} for n ≥ 1. Substituting M_{n−1} in this equation with its expression from Eq. 2.142, we obtain the recurrence formula A_n = A_{n−1} + E[X_n | F_{n−1}] − X_{n−1}, which defines A uniquely with A_0 = 0.
We now show that the decomposition in Eq. 2.142 exists, that is, that there are processes A and M with the claimed properties. Let A_n, n = 0, 1, ..., be defined by the above recurrence formula with A_0 = 0. Note that A_n ∈ F_{n−1} and A_n ≥ A_{n−1} since X_n is a submartingale. Hence, A_n, n = 0, 1, ..., has the stated properties. We also have that M_n = X_n − A_n is a martingale, since E[M_n | F_{n−1}] = E[X_n | F_{n−1}] − A_n = X_{n−1} − A_{n−1} = M_{n−1} by the recurrence formula. •
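The recurrence A_n = A_{n−1} + E[X_n | F_{n−1}] − X_{n−1} can be exercised directly. A sketch for the submartingale S_n = R_n² of Example 2.88 with ±1 steps, where E[S_n | F_{n−1}] = S_{n−1} + 1, so the compensator must be A_n = n and M_n = S_n − n a martingale:

```python
import random

random.seed(5)

def doob_compensator(x, cond_exp_next):
    """Recurrence A_n = A_{n-1} + E[X_n | F_{n-1}] - X_{n-1} (with A_0 = 0)
    from the uniqueness part of the Doob decomposition; cond_exp_next maps
    X_{n-1} to E[X_n | F_{n-1}]."""
    a = [0.0]
    for n in range(1, len(x)):
        a.append(a[-1] + cond_exp_next(x[n - 1]) - x[n - 1])
    return a

# Submartingale S_n = R_n**2 for a fair +/-1 walk.
steps = [random.choice((-1, 1)) for _ in range(30)]
r, s = 0, [0.0]
for x in steps:
    r += x
    s.append(float(r * r))

a = doob_compensator(s, lambda s_prev: s_prev + 1.0)   # E[S_n|F_{n-1}] = S_{n-1}+1
m = [si - ai for si, ai in zip(s, a)]                  # martingale part
```

Along every path the recurrence yields A_n = n exactly, whatever the realized steps.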
Proof: The process M satisfies the first two conditions in Eq. 2.138: E[M_n²] is finite since the A_i and X_i have finite second moments, and M_n ∈ F_n follows from its definition and the properties of A_n and X_n. Recall that A_n ∈ F_{n−1}, n ≥ 1, since A is F_n-predictable. For n > m, we have E[M_n | F_m] = M_m since A_i ∈ F_{i−1} and X_i is a martingale, so that E[A_i (X_i − X_{i−1}) | F_{i−1}] = A_i E[X_i − X_{i−1} | F_{i−1}] = 0.
If the integrator X of the discrete stochastic integral in Eq. 2.139 is not a martingale, the integral M = (M_0, M_1, ...) may not be a martingale either. For example, we have E[M_n | F_{n−1}] = M_{n−1} + E[A_n (X_n − X_{n−1}) | F_{n−1}] for m = n − 1, or M_{n−1} + Z_n A_n, where Z_n = E[X_n − X_{n−1} | F_{n−1}]. •
the sequence X stopped at T. Figure 2.16 shows five samples of a random walk R_n = Σ_{i=1}^{n} X_i for n ≥ 1 and R_0 = 0 (Example 2.73, Fig. 2.10) stopped at the time T when it leaves the interval (−a, a), a = 100, for the first time, where the X_i are independent Gaussian variables with mean zero and variance 100.
[Figure 2.16: five samples of the random walk stopped at the first exit from (−100, 100).]
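A sketch of the stopped sequence of Figure 2.16, with the parameters given in the text (Gaussian steps with mean zero and variance 100, a = 100): after the stopping time T the stopped path R_{n∧T} is frozen at its exit value.

```python
import random

random.seed(6)

def stopped_walk(a=100.0, sigma=10.0, n_max=300):
    """One sample of R_n = sum X_i, X_i ~ N(0, 100), stopped when the walk
    first leaves (-a, a); returns the path of the stopped sequence R_{n^T}."""
    r, path = 0.0, [0.0]
    for _ in range(n_max):
        if abs(r) >= a:            # T has occurred: the path stays frozen
            path.append(r)
            continue
        r += random.gauss(0.0, sigma)
        path.append(r)
    return path

paths = [stopped_walk() for _ in range(5)]
# Once |R_n| >= 100, every later value of the stopped path equals R_T.
```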
E[|X_{n∧T}|] = ∫_{{T≤n}} |X_{n∧T}| dP + ∫_{{T>n}} |X_{n∧T}| dP = Σ_{k=0}^{n} ∫_{{T=k}} |X_k| dP + ∫_{{T>n}} |X_n| dP,

E[|X_n|] < ∞, and the inequalities ∫_{{T=k}} |X_k| dP ≤ E[|X_k|] and ∫_{{T>n}} |X_n| dP ≤ E[|X_n|]. For the last property in Eq. 2.138 we use X_{n∧T} = Σ_{k=0}^{n−1} X_k 1_{{T=k}} + X_n 1_{{T≥n}} so that X_{(n+1)∧T} − X_{n∧T} = (X_{n+1} − X_n) 1_{{T>n}}. The expectation E[X_{(n+1)∧T} − X_{n∧T} | F_n] is equal to 1_{{T>n}} E[X_{n+1} − X_n | F_n] since 1_{{T>n}} is F_n-measurable. If X is a submartingale, martingale, or supermartingale, E[X_{n+1} − X_n | F_n] is positive, zero, or negative so that X^T is a submartingale, martingale, or supermartingale, respectively. •
If (1) X is an F_n-martingale, (2) T is a stopping time with respect to F_n such that T < ∞ a.s., (3) X_T is integrable, and (4) E[X_n 1_{{T>n}}] → 0 as n → ∞, then E[X_T] = μ, where μ = E[X_n]. This statement is referred to as the optional stopping theorem.
The series Σ_{k=0}^{∞} E[X_k 1_{{T=k}}] is convergent because X_T is integrable. Hence, the expectation of X_T is μ. •
2.18.3 Inequalities
We present two inequalities that are most useful for applications. Additional inequalities and convergence theorems can be found in, for example, [66] (Chapter 24) and [151] (Chapter 10).
If X is a positive submartingale and X_n* = max_{0≤k≤n} X_k, then for every λ > 0,

P(X_n* ≥ λ) ≤ E[X_n 1_{{X_n* ≥ λ}}]/λ ≤ E[X_n]/λ.  (2.144)

Proof: For λ > 0 let T = min{k ≤ n : X_k ≥ λ} if there exists k ≤ n such that X_k ≥ λ and let T = n otherwise.
Because X is a positive submartingale and T is a stopping time, we have E[X_n] ≥ E[X_T] since T ≤ n. Also, X_n* ≥ λ implies X_T ≥ λ and X_n* < λ implies X_T = X_n. The above results give

E[X_n] ≥ E[X_T] ≥ λ P(X_n* ≥ λ) + E[X_n 1_{{X_n* < λ}}].
Also, E[(X_n*)²] = 2 ∫_0^∞ x P(X_n* ≥ x) dx ≤ 2 ∫_0^∞ E[X_n 1_{{X_n* ≥ x}}] dx by Eq. 2.144, and

∫_0^∞ E[X_n 1_{{X_n* ≥ x}}] dx = ∫_0^∞ (∫_Ω X_n 1_{{X_n* ≥ x}} dP) dx = ∫_Ω (∫_0^{X_n*} X_n dx) dP = ∫_Ω X_n* X_n dP = E[X_n* X_n] ≤ (E[(X_n*)²] E[X_n²])^{1/2},

where the Fubini theorem and the Cauchy–Schwarz inequality were used. We have found E[(X_n*)²] ≤ 2 (E[(X_n*)²] E[X_n²])^{1/2}, which gives the Doob inequality in Eq. 2.145. •
[Figure: samples of the submartingale X.]
The figure shows samples of X for random variables Z_k uniformly distributed in the range (0, 2 μ_k), where μ_k = p (e^{pk} − 1) and p = 0.1. The samples resemble the growth of a crack in a plate with the number n ≥ 0 of stress cycles ([79], p. 17). ⋄
Note: That X is a submartingale follows from E[X_n | F_m] = E[X_m + Σ_{k=m+1}^{n} Z_k | F_m] = X_m + Σ_{k=m+1}^{n} E[Z_k] ≥ X_m. The expectation E[X_n] = Σ_{k=0}^{n} E[Z_k] of X_n is an increasing function of time.
The submartingale X in this example is square integrable so that the inequality in Eq. 2.145 applies but gives a trivial result since X has increasing samples, so that X_n* = max_{0≤k≤n} X_k = X_n. Estimates of P(X_n* ≥ λ) and E[X_n 1_{{X_n* ≥ λ}}]/λ in Eq. 2.144 based on 1,000 samples of X are 0.2910 and 0.3136, respectively, for λ = 160. ▲
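Estimates of this kind can be reproduced approximately by simulation; the number of terms n is not stated in this extract, so n = 50 is an assumption made for illustration. The sketch also verifies the inequality chain of Eq. 2.144 on the sample (for this X the chain holds term by term by construction, since X ≥ 0).

```python
import math
import random

random.seed(8)
p, n, lam = 0.1, 50, 160.0       # p and lambda from the example; n is assumed
mu = [p * (math.exp(p * k) - 1.0) for k in range(n + 1)]

def sample_path():
    """X_n = sum_{k<=n} Z_k with Z_k ~ Uniform(0, 2*mu_k): a submartingale
    with increasing samples, so X_n* = X_n."""
    return sum(random.uniform(0.0, 2.0 * m) for m in mu)

n_samples = 5_000
xs = [sample_path() for _ in range(n_samples)]
p_max = sum(x >= lam for x in xs) / n_samples                # P(X_n* >= lam)
bound = sum(x for x in xs if x >= lam) / (n_samples * lam)   # E[X_n 1]/lam
mean_over_lam = (sum(xs) / n_samples) / lam                  # E[X_n]/lam
```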
2.19 Problems
2.1: Let Ω be a sample space and let A be a collection of subsets of Ω. Show that σ(A) = ∩_{G ⊇ A} G is a σ-field, where the G are σ-fields containing A.
F = {∅, Ω, {1, 2, 3}, {2, 4, 6}, {2, 4}, {1, 3, 5, 6}, {6}, {1, 2, 3, 4, 5}}
2.5: Suppose that A and B are independent events. Show that A^c and B are independent events.
2.7: Suppose that h : (ℝ, B) → (ℝ, B) is measurable and let a be a constant. Show that h_a(x) = h(x) if h(x) ≤ a and h_a(x) = a if h(x) > a defines a measurable function.
2.8: Let X be a random variable defined on a probability space (Ω, F, P). Show that the σ-field generated by X is the smallest σ-field with respect to which this random variable is measurable.
2.10: Show that inf_i X_i, sup_i X_i, lim inf_i X_i, and lim sup_i X_i are random variables on (Ω, F, P), where X_i, i = 1, 2, ..., are random variables on this space.
2.13: Let h_n = 1_{[n,n+1]}, n = 1, 2, ..., and let λ denote the Lebesgue measure. We have ∫_ℝ h_n dλ = 1 for all n. Show that ∫ (lim h_n) dλ ≠ lim ∫ h_n dλ.
2.14: Use the dominated convergence theorem to find the limit as n → ∞ of the integral ∫_0^∞ √x/(1 + n x³) dx.
2.16: Let X_i ~ S_α(σ_i, β_i, μ_i), i = 1, ..., n, be independent α-stable random variables. Find the characteristic function of Σ_{i=1}^{n} X_i. Comment on your result.
2.17: Let (Y1, Y2) be an ℝ²-valued Gaussian variable with the density in Eq. 2.107. Find the joint density of (X1 = g1(Y1), X2 = g2(Y2)), where g_i, i = 1, 2, are increasing functions. Specialize your results for g_i(y) = exp(y).
2.18: Calculate the mean and variance of the estimator X̂ in Example 2.64.
2.19: Show that an ℝ^d-valued random variable X ∈ L2 is Gaussian if and only if Σ_{i=1}^{d} α_i X_i is a real-valued Gaussian variable for every collection of constants α_1, ..., α_d.
2.22: Take a collection of random variables X_i, i ∈ I, with the same mean and variance but different distributions. Plot the probabilities P(|X_i| > x) as a function of x > 0 and a Chebyshev bound. Can the bound be used to approximate P(|X_i| > x)?
2.23: Let R_n = Σ_{i=1}^{n} X_i for n ≥ 1 and R_0 = 0, where the X_i > 0 are iid random variables. Find the probability law of the stopping time T = inf{n ≥ 0 : R_n > a}, where a > 0 is a constant.
2.27: Let X_i, i ≥ 1, be iid random variables defined on a probability space (Ω, F, P) such that P(X_i = 1) = P(X_i = −1) = 1/2. Consider the filtration F_n = σ(X1, ..., X_n) on this space and the random walk R_n = Σ_{i=1}^{n} X_i. Show that R_n² − n and (−1)^n cos(π R_n) are F_n-martingales.
3 Stochastic Processes
3.1 Introduction
In the previous chapter we defined a time series or a discrete time stochastic process as a countable family of random variables X = (X1, X2, ...). Time series provide adequate models in many applications. For example, X_n may denote the damage of a physical system after n loading cycles or the value of a stock at the end of day n. However, there are situations in which discrete time models are too coarse. For example, consider the design of an engineering system subjected to wind, wave, and other random forces over a time interval I. To calculate the system dynamic response, we need to know these forces at each time t ∈ I. The required collection of force values is an uncountable set of random variables indexed by t ∈ I, referred to as a continuous time stochastic process or just a stochastic process. We use upper case letters for all random quantities. A real-valued stochastic process is denoted by {X(t), t ∈ I} or X. If the process takes on values in ℝ^d, d > 1, we use the notation {X(t), t ∈ I} or X.
If X is indexed by a space coordinate ξ ∈ D ⊂ ℝ^q rather than time t ∈ I, then {X(ξ), ξ ∈ D} is called a random field. There are notable differences between random processes and fields. For example, the concept of past and future has a clear meaning for stochastic processes but not for random fields. However, stochastic processes and random fields share many properties. There are also situations in which a random function depends on both space and time parameters. The evolution in time of (1) the wave height in the North Sea, (2) the temperature everywhere in North America, (3) the ground acceleration in California during an earthquake, and (4) the Euler angles and the dislocation density in a material subjected to plastic deformation are some examples. These functions are random fields at each fixed time t ∈ I and stochastic processes at each site ξ ∈ D. They are referred to as space–time stochastic processes and are denoted by {X(t, ξ), t ≥ 0, ξ ∈ D}.
This chapter reviews essential concepts on stochastic processes and illustrates them with examples.
3.2 Definitions
Let X : I × Ω → ℝ^d be a function of two arguments, t ∈ I and ω ∈ Ω, where I is a subset of ℝ or [0, ∞) and (Ω, F, P) denotes a probability space. If X(t) is an ℝ^d-valued random variable on the probability space (Ω, F, P) for each t ∈ I, that is, X(t) is F-measurable for each t ∈ I, then X is said to be an ℝ^d-valued stochastic process.
Note: We will also refer to an ℝ^d-valued stochastic process with d > 1 as a vector stochastic process. The function X(·, ω) for a fixed ω ∈ Ω is called a sample path, path, sample, or realization of X. The function X(t, ·) for a fixed t is by definition an ℝ^d-valued random variable. The measurable space (ℝ^d, B^d) used in the definition of X is sufficiently general for our discussion. The definition of a stochastic process X does not require that the functions X(·, ω) : [0, ∞) → ℝ^d and X(·, ·) : [0, ∞) × Ω → ℝ^d be measurable. ▲
Random fields are defined in the same way. Let (Ω, F, P) denote a probability space and let X : D × Ω → ℝ^d be a function of two arguments, ξ ∈ D and ω ∈ Ω, where D ⊂ ℝ^q and q ≥ 1 is an integer. If X(ξ) is an ℝ^d-valued random variable on the probability space (Ω, F, P) for each ξ ∈ D, that is, X(ξ) is F-measurable for each ξ ∈ D, then X is said to be an ℝ^d-valued random field in D.
Example 3.1: Figure 3.1 shows daily average temperatures measured in Central
Park, New York City, during two different years starting on the first of January
and ending on the last day of December. These records can be interpreted as
two samples of a stochastic process {X (t), t = 1, ... , 365} giving daily average
temperatures in Central Park. A collection of samples of X of the type shown in
this figure can be used to estimate the probability law of X. <>
Example 3.2: Let Φ denote one of the three Euler angles of the atomic lattice orientation in aluminum AL 7075. Figure 3.2 shows 14,000 measurements of Φ performed on an aluminum AL 7075 plate D with dimensions 540 × 540 μm. The left plot shows the spatial variation of Φ over the plate. The right plot presents contour lines for Φ. These lines partition D into subsets characterized by nearly constant values of Φ, called grains or crystals. The plots in this figure can be
Figure 3.1. Two records of daily temperatures in Central Park, New York City in
two consecutive years
Figure 3.2. Three dimensional and contour lines for an Euler angle of atomic
lattice orientation in aluminum AL 7075
Figure 3.3. The mapping (t, ω) ↦ X(t, ω) = Y(ω) cos(t) and ten samples of X
is continuous. It is natural to ask whether the function (t, ω) ↦ X(t, ω) is measurable from ([0, 10] × [0, 1], B([0, 10]) × B([0, 1])) to ([−1, 1], B([−1, 1])), where [−1, 1] is the range of X. The answer to this question is in the affirmative (Example 3.4). ▲
Note: If X is a measurable stochastic process, then X(t, ·) is F-measurable and X(·, ω) is B(I)-measurable by Fubini's theorem (Section 2.8). The first statement is valid even if X is not measurable, simply because X is a stochastic process.
The assumption that a process is measurable may not be very restrictive since many of the stochastic processes used in applications are measurable (Example 3.4). ▲
In Section 2.15 we have defined filtration and adapted time series. These
concepts extend directly to the case in which the time index is continuous.
Note: In Section 2.15 we have defined the natural filtration F_n^X = σ(X1, ..., X_n), n ≥ 1, for a time series X = (X1, X2, ...). The definition of F_t^X in Eq. 3.1 is a direct extension of the definition of F_n^X to a continuous time index. If (F_t)_{t≥0} is a filtration on (Ω, F, P) such that F_t^X ⊆ F_t for each t ≥ 0, then X is F_t-adapted. ▲
Example 3.5: Let ([0, 1], B([0, 1]), P) be a probability space and Y(ω) = ω a random variable defined on this space, where P(dω) = dω. Consider also a stochastic process X : [0, 1] × Ω → ℝ defined by X(t, ω) = Y(ω). The natural filtration of X is F_t^X = B([0, 1]), t ≥ 0. ⋄
Note: We have σ(X(t)) = σ(Y) for all t ∈ [0, 1] so that F_t^X = σ(∪_{0≤s≤t} σ(X(s))) = σ(Y). Because ω ↦ Y(ω) is the identity function, the σ-field generated by the random variable Y is B([0, 1]). ▲
Example 3.6: Let {B(t), t ≥ 0} be a Brownian motion, F_t = σ(B(s), 0 ≤ s ≤ t) denote the natural filtration of B, and let g : [0, ∞) × ℝ → ℝ be a continuous function. The stochastic process Y(t) = g(t, B(t)) is F_t-adapted. For example, B(t)², B(t)² − t, B(t)^q for an arbitrary integer q ≥ 1, and max_{0≤s≤t} B(s)^q are F_t-adapted. We say that these processes are adapted to the Brownian motion. On the other hand, the process Z(t) = B(t + τ), τ > 0, is not adapted to the Brownian motion. ⋄
Note: The natural filtration of some of the processes derived from B may be smaller than F_t. For example, the natural filtration F̃_t = σ(B(s)², 0 ≤ s ≤ t) is smaller than F_t, t ≥ 0, because F̃_t contains the entire information on B² and |B| but not on B. Hence, the stochastic process B² is both F̃_t- and F_t-adapted ([131], Example 1.5.2, p. 78).
The random variable B(t + τ) is F_{t+τ}-measurable but it is not F_t-measurable for τ > 0. Hence, Z is not F_t-adapted. ▲
Note: A progressively measurable process X is measurable since B([0, t]) × F_t is included in B([0, ∞)) × F. The process X is also F_t-adapted because X(t, ·) is F_t-measurable for each t ≥ 0 (Fubini's theorem). ▲
if k and n are such that t ∈ ((k − 1) 2^{−n}, k 2^{−n}], so that it is also B([0, s]) × F_s-measurable for s ≥ k 2^{−n}. That the process X is progressively measurable follows from the continuity of the sample paths of X and the convergence of the mapping (t, ω) ↦ X^{(n)}(t, ω) to (t, ω) ↦ X(t, ω). •
and adapted. An adapted process may or may not be measurable. An adapted and measurable process may not be progressive. The following statement generalizes Example 3.7.
Proof: Suppose that X is right continuous and define the sequence of processes X_{n,t}. The relevant set is in B([0, t]) × F_t since X is adapted. Hence, X_{n,t}(s, ω) with 0 ≤ s ≤ t and ω ∈ Ω is B([0, t]) × F_t-measurable for each n ≥ 1 so that X_{n,t} is progressive. By right continuity we have X_{n,t}(s, ω) → X(s, ω) as n → ∞ for all 0 ≤ s ≤ t and ω ∈ Ω. We conclude that X is progressive since limits of measurable functions are measurable. •
We assume throughout the book that the filtration F_t, t ≥ 0, on (Ω, F, P) has two properties, referred to as the usual hypotheses:
1. F_t is complete, meaning that (a) F is complete, that is, for every A ⊂ B such that B ∈ F and P(B) = 0 we have A ∈ F so that P(A) = 0 (Section 2.2.3), and (b) F_0 contains all null subsets of F, that is, A ∈ F such that P(A) = 0 implies A ∈ F_0.
2. F_t is right continuous, that is, the σ-field F_{t+} = ∩_{s>t} F_s is equal to F_t for all t ≥ 0. The filtration F_{t+} provides an "infinitesimal peek into the future" in the sense that A ∈ F_{t+} implies A ∈ F_{t+s} for all s > 0. Note that a right continuous filtration F_n indexed by n = 0, 1, ..., is constant since F_{n+} = F_{n+1}.
Example 3.8: Let (Ω, F, (F_t)_{t≥0}, P) be a filtered probability space and X be an F_t-adapted stochastic process. If F_t is right continuous, then the random variable Ẋ(t) = lim sup_{u↓t} (X(u) − X(t))/(u − t) is F_t-measurable provided it exists. ⋄
Proof: Define the random variables Y_n = n (X(t + 1/n) − X(t)) and Z_n = sup_{m≥n} Y_m, and take p > n. Note that Z_p ∈ F_{t+1/p} ⊂ F_{t+1/n} so that inf_p Z_p ∈ F_{t+1/n} for any n and inf_p Z_p ∈ ∩_{n=1}^∞ F_{t+1/n} = F_{t+}. If the filtration F_t is right continuous, then Ẋ(t) is F_t-measurable for each t ≥ 0. •
3.3 Continuity
We have already defined right/left continuity, continuity, a.s. right/left con-
tinuity, and a.s. continuity. Generally, it is difficult to establish whether a process is
sample continuous from these definitions. The following Kolmogorov criterion
provides a simple alternative. If {X(t), t ∈ I} is a real-valued stochastic process, I ⊂ [0, ∞) is a closed interval, and there exist three constants α, β, ζ > 0 such that

E[|X(t) − X(s)|^α] ≤ ζ |t − s|^{1+β},  s, t ∈ I,  (3.2)

then lim_{h→0} sup_{s,t∈I, |s−t|<h} |X(s) − X(t)| = 0 a.s., that is, almost every sample path of X is uniformly continuous in I. An ℝ^d-valued process, d > 1, is sample continuous if each of its coordinates satisfies the condition in Eq. 3.2 ([197], Proposition 4.2, p. 57).
We give in this section some of the most common definitions of continuity for stochastic processes. Both continuity at a point and in an interval are discussed. Let {X(t), t ≥ 0} be an ℝ^d-valued stochastic process defined on a probability space (Ω, F, P).
X is continuous in probability at t ≥ 0 if lim_{s→t} P(‖X(s) − X(t)‖ > ε) = 0 for every ε > 0. X is continuous almost surely at t ≥ 0 if

P({ω : lim_{s→t} ‖X(s, ω) − X(t, ω)‖ = 0}) = 1

or, equivalently, P(Ω_t) = 0, where Ω_t denotes the set of ω for which X(s, ω) does not converge to X(t, ω) as s → t (Eq. 3.5).
Note: The Euclidean norm ‖x‖ = (Σ_{i=1}^{d} x_i²)^{1/2} is used in the above definitions. ▲
Proof: The process N has stationary independent increments, that is, (1) the distribution
of the random variable N(t) - N(s), s < t, depends only on the time lag t - s rather
than the times t and s and (2) the increments of N over non-overlapping time intervals are
Figure 3.5. Samples of a Brownian motion and a compound Poisson process with Y1 ~ N(0, σ²) and λ σ² = 1
independent random variables; for example, N(t) − N(u) and N(v) − N(s) are independent for s < v ≤ u < t (Sections 3.6.4 and 3.12). The property E[(B(t) − B(s))²] = |t − s| follows from the definition of B. The expression of the expectation E[(C(t) − C(s))²] can be found in [79] (Section 3.3). The limits of the second moments of the increments of B and C are zero as s → t. •
Example 3.11: The Brownian motion process (Example 3.4) and the compound Poisson process (Example 3.9) are a.s. continuous at each t ≥ 0.
Proof: The random variable B(t) − B(t − 1/n) is Gaussian with mean zero and variance 1/n for any integer n ≥ 1. For an ε > 0 and n = 1, 2, ... define the sequence of events A_n(ε) = {|B(t) − B(t − 1/n)| > ε}. The conclusion for B follows from Fatou's lemma (Section 2.3.6) and P(A_n(ε)) = 2 (1 − Φ(ε √n)) → 0 as n → ∞.
Example 3.12: Almost sure continuity at each time t does not imply sample continuity. ⋄
Proof: If X is a.s. continuous at each t ≥ 0, the probability of the set Ω_t in Eq. 3.5 is zero at each time. However, the probability of ∪_{t≥0} Ω_t may not be zero or may not even be defined since this set is an uncountable union of measurable sets, which is not necessarily measurable.
For example, let (Ω, F, P) be a probability space with Ω = [0, 1], F = B([0, 1]), and P(dω) = dω, and let X be a real-valued stochastic process defined by X(t, ω) = 0 for t < ω and X(t, ω) = 1 for t ≥ ω. Because Ω_t in Eq. 3.5 is {t}, we have P(Ω_t) = 0 so that X is continuous a.s. at each t ∈ [0, 1]. However, the probability of ∪_{t∈[0,1]} Ω_t = [0, 1] = Ω is equal to 1 so that the process is not sample continuous. Another example is the compound Poisson process, which is a.s. continuous at each time t but has piecewise constant samples. The Poisson process has the same property. •
Example 3.13: The Brownian motion B is sample continuous while the compound Poisson process C does not have continuous samples (Fig. 3.5). ⋄
Proof: Because B(t + h) − B(t) ~ N(0, h), h > 0, for any t ≥ 0, we have E[(B(t + h) − B(t))^4] = 3 h² so that the Kolmogorov condition in Eq. 3.2 is satisfied for α = 4, β = 1, and ζ ≥ 3. That C is not sample continuous can be seen from Fig. 3.5. •
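The moment computation in this proof can be confirmed by simulation: for increments distributed N(0, h), the sample fourth moment should be close to 3h² (a check of the constant used in the Kolmogorov condition, not part of the text).

```python
import random

random.seed(9)
h, n_samples = 0.01, 200_000
# B(t+h) - B(t) ~ N(0, h): simulate increments and estimate E[(.)**4].
m4 = sum(random.gauss(0.0, h ** 0.5) ** 4 for _ in range(n_samples)) / n_samples
target = 3.0 * h * h      # E[(B(t+h) - B(t))**4] = 3 h**2
```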
A process X is called càdlàg if its sample paths are a.s. right continuous with left limits. X is called càglàd if its sample paths are a.s. left continuous with right limits. The names are acronyms for the French "continu à droite, limites à gauche" and "continu à gauche, limites à droite", respectively.
Note: The Brownian motion is both càdlàg and càglàd because it has continuous samples almost surely. The samples of the compound Poisson process C are right continuous with left limits so that C is càdlàg. The difference C(t+) − C(t−) = C(t) − C(t−) at time t is one of the random variables Y_k if C has a jump at t and zero otherwise. ▲
Let (Ω, F, (F_t)_{t≥0}, P) be a filtered probability space and T : Ω → [0, ∞] be a random variable defined on this space. If {ω : T(ω) ≤ t} ∈ F_t for all t ≥ 0, then T is said to be an F_t-stopping time or a stopping time.
Note: A stopping time T is said to be finite a.s. if P(T < ∞) = 1 or P(T = ∞) = 0. ▲
Example 3.14: Any positive constant is a stopping time on an arbitrary filtered probability space (Ω, F, (F_t)_{t≥0}, P). ⋄
Proof: Let T = a, where a ≥ 0 is a real number. Then {ω : T(ω) ≤ t}, t ≥ 0, is ∅ if a > t and the sample space Ω if a ≤ t, so that it is in F_t, t ≥ 0. •
Proof: We have {T < t} = ∪_{n=1}^∞ {T ≤ t − 1/n} and {T ≤ t − 1/n} ∈ F_{t−1/n} ⊂ F_t so that {T < t} is in F_t. The event {T = t} is equal to {T ≤ t} ∩ {T < t}^c so that it is in F_t.
Suppose now that {T < t} ∈ F_t for all t ≥ 0. For an arbitrary t ≥ 0 and integer m ≥ 1 we have {T ≤ t} = ∩_{n≥m} {T < t + 1/n} so that {T ≤ t} ∈ F_{t+1/m} since {T < t + 1/n} ∈ F_{t+1/n} ⊂ F_{t+1/m}. Because m is arbitrary we have {T ≤ t} ∈ ∩_{m=1}^∞ F_{t+1/m} = F_{t+} = F_t, where the latter equality holds since F_t is assumed to be a right continuous filtration. •
Let (Ω, F, (F_t)_{t≥0}, P) be a filtered probability space, T_1, T_2, ... be stopping times on this space, and c ∈ [0, ∞] a constant. Then
• c + T_1 and c ∧ T_1 are F_t-stopping times.
• sup_{n≥1} T_n and inf_{n≥1} T_n are F_t-stopping times.
• lim inf_{n→∞} T_n and lim sup_{n→∞} T_n are F_t-stopping times.
Proof: If c = ∞, then c + T_1 = ∞ is a constant, which is a stopping time. Also, the random variable c ∧ T_1 = T_1 is a stopping time. If c < ∞, then {c + T_1 ≤ t} = {T_1 ≤ t − c} is ∅ for t < c and {T_1 ≤ t − c} ∈ F_{t−c} ⊂ F_t for t ≥ c. The event {c ∧ T_1 ≤ t} is Ω for t ≥ c and {T_1 ≤ t} for t < c. Hence, c + T_1 and c ∧ T_1 are stopping times.
The statements in the second bullet result from {sup_{n≥1} T_n ≤ t} = ∩_{n≥1} {T_n ≤ t}, {inf_{n≥1} T_n < t} = ∪_{n=1}^∞ {T_n < t}, and {T_n < t} ∈ F_t by the right continuity of the filtration F_t, t ≥ 0.
The statements in the last bullet follow from the definitions

lim inf_{n→∞} T_n = sup_{m≥1} inf_{n≥m} T_n  and  lim sup_{n→∞} T_n = inf_{m≥1} sup_{n≥m} T_n,

and the second bullet.
Let X(t), t ≥ 0, be a right continuous adapted process and B denote an open set. Then the first entrance time T = inf{t ≥ 0 : X(t) ∈ B} is an F_t-stopping time.
Proof: Since X is a right continuous process and B is an open set, we have {T < t} = ∪_{0≤s<t, s rational} {X(s) ∈ B} and {X(s) ∈ B} ∈ F_s ⊂ F_t. Hence, T is an F_t-stopping time because F_t = F_{t+} by assumption. •
Example 3.15: Let $T = \inf\{t \ge 0 : B(t) > a\}$ be the first time when a Brownian motion $B$ exceeds a level $0 < a < \infty$, that is, the first entrance time of $B$ in $(a, \infty)$, and let $\mathcal{F}_t$, $t \ge 0$, denote a right continuous filtration such that $\mathcal{F}_t \supset \sigma(B(s), 0 \le s \le t)$. Then $T$ is an $\mathcal{F}_t$-stopping time. Figure 3.6 shows the density of $T$ for $a = 1, 2, 3$ calculated from $P(T \le t) = 2\,[1 - \Phi(a/\sqrt{t})]$ by differentiation. ◊

Proof: $T$ is an $\mathcal{F}_t$-stopping time by the previous property. We have $P(B(t) > a) = 1 - \Phi(a/\sqrt{t})$ and
$$P(B(t) > a) = P(B(t) > a \mid T > t)\,P(T > t) + P(B(t) > a \mid T \le t)\,P(T \le t) = (1/2)\,P(T \le t),$$
where $\Phi$ denotes the distribution of the standard Gaussian variable. The last equality in the above formula holds because $P(B(t) > a \mid T > t) = 0$ by the definition of $T$, and $P(B(t) > a \mid T \le t) = 1/2$ because $B(T) = a$ at some time prior to $t$ when $T \le t$ and the Brownian motion is symmetric with respect to its starting point. ■
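The first-passage distribution in Example 3.15 can be checked by simulation. The sketch below (the barrier level, grid step, horizon, and sample sizes are illustrative choices, not from the text) discretizes Brownian paths with Gaussian increments and compares the empirical distribution of $T$ with $2\,[1 - \Phi(a/\sqrt{t})]$; the small residual discrepancy reflects the Monte Carlo and time-discretization errors.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

a = 1.0            # barrier level (hypothetical choice)
dt = 0.01          # Euler grid step
t_max = 10.0
n_paths = 5000
n_steps = int(t_max / dt)

# Brownian paths on the grid: cumulative sums of N(0, dt) increments.
paths = np.cumsum(rng.normal(0.0, math.sqrt(dt), size=(n_paths, n_steps)), axis=1)

# First passage time: first grid point at which the path exceeds a.
crossed = paths > a
hit = crossed.any(axis=1)
T = np.where(hit, (np.argmax(crossed, axis=1) + 1) * dt, np.inf)

# Compare the empirical P(T <= t) with 2*[1 - Phi(a/sqrt(t))] at t = 4.
t = 4.0
Phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
empirical = float(np.mean(T <= t))
exact = 2.0 * (1.0 - Phi(a / math.sqrt(t)))
print(empirical, exact)
```

The grid systematically underestimates crossings (a path can cross and return between grid points), so the empirical value sits slightly below the exact one; refining `dt` shrinks that bias.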
Example 3.16: Let $T$ be an $\mathcal{F}_t$-stopping time and $0 = t_0 < t_1 < \cdots < t_k < \cdots$ a sequence of numbers such that $t_k \to \infty$ as $k \to \infty$. Define an approximation $\bar T$ of $T$ by $\bar T = t_{k+1}$ if $T \in [t_k, t_{k+1})$ and $\bar T = \infty$ if $T = \infty$. Then $\bar T$ is an $\mathcal{F}_t$-stopping time. ◊

Proof: If $t \in [t_k, t_{k+1})$, we have $\{\bar T \le t\} = \{T < t_k\}$ and the last event is in $\mathcal{F}_{t_k} \subset \mathcal{F}_t$. Note that the random variable $\tilde T = t_k$ if $T \in [t_k, t_{k+1})$ and $\tilde T = \infty$ if $T = \infty$ is not an $\mathcal{F}_t$-stopping time. ■
[Figure 3.6: densities of the first passage time $T$ for $a = 1, 2, 3$, plotted against time.]
Note: $\mathcal{F}_T$ and $\mathcal{F}_{T+}$ include all events in $\mathcal{F}$ occurring prior to $T$. If the filtration $(\mathcal{F}_t)_{t\ge 0}$ is right continuous, as considered in our discussion, then $\mathcal{F}_T$ and $\mathcal{F}_{T+}$ coincide. The proof that $\mathcal{F}_T$ and $\mathcal{F}_{T+}$ are sub-$\sigma$-fields of $\mathcal{F}$ can be based on arguments as in Section 2.16 ([61], Proposition 1.4, p. 52). ▲

Note: Some of these properties will be used, for example, in Chapters 6 and 7 to solve deterministic partial differential equations and assess the evolution of mechanical, ecological, or other systems. The proof of these and related properties can be found in [61] (Proposition 1.4, p. 52). ▲
$$F_n(x^{(1)}, \ldots, x^{(n)}; t_1, \ldots, t_n) = P\big(\cap_{i=1}^{n}\{X(t_i) \in \times_{k=1}^{d}(-\infty, x_{i,k}]\}\big), \quad (3.10)$$
(3.11)
(3.12)
provided that these derivatives exist. The marginal density of $X$ is the density of the random variable $X(t)$ and is denoted by $f_1(\cdot; t)$ or $f(\cdot; t)$.
Generally, the information available on a stochastic process $X$ in applications is sufficient to estimate at most its first and second order finite dimensional distributions.

Note: A collection of distributions $F_n$, $n = 1, 2, \ldots$, of the type in Eq. 3.10 satisfies the consistency condition if any $F_m$ of order $m < n$ can be derived from $F_n$. For example,
$$F_m(x_1, \ldots, x_m; t_1, \ldots, t_m) = F_n(x_1, \ldots, x_m, \infty, \ldots, \infty; t_1, \ldots, t_m, \ldots, t_n). \quad (3.13)$$
The family of finite dimensional distributions
$$F_n(x_1, \ldots, x_n; t_1, \ldots, t_n) = \prod_{k=1}^{n}\Phi(x_k), \quad n = 1, 2, \ldots,$$
satisfies the conditions of consistency and symmetry. Hence, this family can be used to define a stochastic process but not the process $X$ in this example. ◊

Proof: That the family of distributions $F_n$ satisfies the conditions of consistency and symmetry follows from its definition.

However, this family cannot be used to define the process $X$ since (1) the sequence of events $A_n = \{\omega : X(t, \omega) > \varepsilon,\ X(t + 1/n, \omega) < -\varepsilon\}$ converges to the empty set as $n \to \infty$ by the continuity of the samples of $X$, so that $P(A_n) \to P(\emptyset) = 0$ as $n \to \infty$ for every $\varepsilon > 0$, and (2) $P(X(t) > \varepsilon,\ X(s) < -\varepsilon) = (\Phi(-\varepsilon))^2$ for any $t \ne s$ and $\varepsilon > 0$ by the definition of $F_n$, so that
$$P(A_n) = P\big(X(t) > \varepsilon,\ X(t + 1/n) < -\varepsilon\big) = (\Phi(-\varepsilon))^2$$
does not approach zero as $n \to \infty$, a contradiction. The conclusion is somewhat obvious in this example since the assumptions that $X(t)$ is independent of $X(s)$ for $t \ne s$ and that $X$ has continuous samples are at variance. ■
$$(X(t_1), \ldots, X(t_n)) \stackrel{d}{=} (X(t_1 + \tau), \ldots, X(t_n + \tau)) \quad (3.14)$$
Example 3.18: The Brownian motion $B$ is not a stationary process since $B(t) \sim N(0, t)$, so that its marginal distribution, $F(x; t) = P(B(t) \le x) = \Phi(x/\sqrt{t})$, changes in time. ◊

Example 3.19: Let $Y_k$ be independent identically distributed random variables that are independent of a random variable $Y$. The time series $X_k = Y + Y_k$, $k = 1, 2, \ldots$, is stationary. ◊
Proof: The finite dimensional distributions of the process $X = (X_1, X_2, \ldots)$ are
$$F_n(x_1, \ldots, x_n; k_1, \ldots, k_n) = P\big(\cap_{i=1}^{n}\{Y + Y_{k_i} \le x_i\}\big),$$
which do not change if the indices $k_1, \ldots, k_n$ are shifted by a common integer, since the random variables $Y_k$ are iid and independent of $Y$. ■
$$E[g(X(t))] = \lim_{\tau\to\infty}\frac{1}{\tau}\int_{-\tau/2}^{\tau/2} g(X(s))\,ds \quad \text{a.s.} \quad (3.15)$$
for any real-valued measurable function $g$ such that $E[|g(X(t))|] < \infty$.

Note: The condition in Eq. 3.15 states that the ensemble average $E[g(X(t))]$ can be calculated from the temporal average $(1/\tau)\int_{-\tau/2}^{\tau/2} g(X(s))\,ds$ on a sample path of $X$ of length $\tau \to \infty$ ([176], Section 3.2.3, [197], Section 2.3). ▲
Example 3.20: The time series $X = (X_1, X_2, \ldots)$ in Example 3.19 is ergodic if $Y = c$ is a constant. Otherwise, $X$ is not an ergodic time series. ◊

Proof: If $Y = c$, the random variables $g(c + Y_k)$ are independent, so that the sum
$$(1/n)\sum_{k=1}^{n} g(c + Y_k)$$
converges a.s. to $E[g(c + Y_1)]$ as $n \to \infty$ by the strong law of large numbers (Section 2.13), provided that $E[|g(c + Y_1)|]$ is finite.

If $Y$ is not a degenerate random variable, $X$ is not ergodic because Eq. 3.15 does not hold for all measurable functions $g$. For example, if $g(x) = x$ the left and right sides of Eq. 3.15 are $E[Y] + E[Y_1]$ and $Y(\omega) + E[Y_1]$, respectively, where $Y(\omega)$ is a sample of the random variable $Y$. ■
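The dichotomy in Example 3.20 shows up clearly in simulation. The sketch below (Gaussian distributions, the constant $c = 2$, and the path length are illustrative assumptions) computes the time average of one sample path of $X_k = Y + Y_k$: it recovers the ensemble mean when $Y = c$ is constant, but converges to $Y(\omega) + E[Y_1]$, not to $E[X_k]$, when $Y$ is random.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000                        # length of the simulated sample path

# Case 1: Y = c constant (c = 2), Y_k ~ N(0,1) iid.
c = 2.0
Yk = rng.normal(size=n)
time_avg_const = float(np.mean(c + Yk))   # converges to c + E[Y_1] = 2

# Case 2: Y random, one realization Y(omega) ~ N(0,1).
Y = float(rng.normal())
time_avg_random = float(np.mean(Y + Yk))  # converges to Y(omega), not E[Y] = 0
print(time_avg_const, time_avg_random, Y)
```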
where the subscripts of $f$ indicate the reference random variables. For example, the function $f_{X(t_n)\mid X(t_{n-1})}$ is the density of $X(t_n) \mid X(t_{n-1})$. The alternative forms of this definition can be obtained from Eq. 3.16 by multiplying with the density of $X(t_{n-1})$ and from Eq. 3.17 by dividing with the density of $(X(t_{n-1}), \ldots, X(t_1))$, respectively. A Markov process can also be defined by the condition $E[g(X(t+s)) \mid \mathcal{F}_t] = E[g(X(t+s)) \mid X(t)]$, where $t, s \ge 0$ are arbitrary times and $g : \mathbb{R} \to \mathbb{R}$ is a Borel function ([61], p. 156). ▲
The density of the conditional random variable X(t) I X(s), t > s, called
the transition density of X, has the following remarkable properties.
If X is a Markov process, then all its finite dimensional densities are specified
completely by its marginal and transition densities.
Proof: The density of the conditional random variable $X(t) \mid X(t_0)$, $t_0 < s < t$, is
$$f_{X(t)\mid X(t_0)}(x \mid x_0) = \frac{f_2(x_0, x; t_0, t)}{f_1(x_0; t_0)} = \frac{1}{f_1(x_0; t_0)}\int_{\mathbb{R}} f_3(x_0, y, x; t_0, s, t)\,dy = \int_{\mathbb{R}} \frac{f_3(x_0, y, x; t_0, s, t)}{f_2(x_0, y; t_0, s)}\,\frac{f_2(x_0, y; t_0, s)}{f_1(x_0; t_0)}\,dy,$$
which gives Eq. 3.19. ■
Example 3.21: Let $X(t) = \sum_{k=1}^{t} Y_k$ be a discrete time process, where $t \ge 1$ is an integer and $Y_k$, $k = 1, 2, \ldots$, are iid random variables. The process $X$ is Markov and $\varphi_{X(t)\mid(X(s)=y)}(u) = e^{\sqrt{-1}\,u\,y}\prod_{k=s+1}^{t}\varphi_{Y_1}(u)$ is the characteristic function of $X(t) \mid X(s)$, $t > s$. ◊

Proof: For $t > s$ we have $X(t) \mid (X(s) = y) = y + \sum_{k=s+1}^{t} Y_k$, so that future values $X(t)$ of $X$ are independent of its past values conditional on $X(s) = y$. The characteristic function results from the expression of $X(t) \mid (X(s) = y)$, where $\varphi_{Y_1}$ denotes the characteristic function of $Y_1$. ■
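The conditional characteristic function in Example 3.21 can be verified by Monte Carlo for a specific jump distribution. Assuming $Y_k \sim N(0, 1)$, so that $\varphi_{Y_1}(u) = e^{-u^2/2}$ (this choice and all numerical values below are illustrative), $X(t) \mid (X(s) = y) = y + \sum_{k=s+1}^{t} Y_k$:

```python
import numpy as np

rng = np.random.default_rng(2)

s, t, y, u = 3, 7, 1.5, 0.8        # illustrative values with t > s
n_mc = 100_000

# X(t) | (X(s) = y) = y + sum of (t - s) iid N(0,1) jumps.
X_cond = y + rng.normal(size=(n_mc, t - s)).sum(axis=1)
mc_cf = np.mean(np.exp(1j * u * X_cond))

# phi_{X(t)|X(s)=y}(u) = exp(sqrt(-1) u y) * phi_{Y_1}(u)^(t-s)
exact_cf = np.exp(1j * u * y) * np.exp(-u**2 / 2.0) ** (t - s)
print(abs(mc_cf - exact_cf))
```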
Note: The Brownian motion and the compound Poisson processes in Examples 3.4 and 3.9 have stationary independent increments. For example, by the definition of the Brownian motion process $B$, the increments of $B$ over non-overlapping intervals are independent and $B(t) - B(s) \sim N(0, t - s)$, $t \ge s$. ▲
$$f_n(x_1, \ldots, x_n; t_1, \ldots, t_n) = f_{Y_1}(x_1)\,f_{Y_2}(x_2 - x_1)\cdots f_{Y_n}(x_n - x_{n-1}), \quad (3.20)$$

$$P(X(t_1) \le x_1, \ldots, X(t_n) \le x_n) = P\Big(Y_1 \le x_1,\ Y_1 + Y_2 \le x_2,\ \ldots,\ \sum_{k=1}^{n} Y_k \le x_n\Big)$$
$$= \int_{-\infty}^{x_1} dy_1\, f_{Y_1}(y_1)\int_{-\infty}^{x_2 - y_1} dy_2\, f_{Y_2}(y_2)\cdots\int_{-\infty}^{x_{n-1} - \sum_{k=1}^{n-2} y_k} dy_{n-1}\, f_{Y_{n-1}}(y_{n-1})\int_{-\infty}^{x_n - \sum_{k=1}^{n-1} y_k} dy_n\, f_{Y_n}(y_n),$$
where $f_{Y_k}$ denotes the density of $Y_k$. The corresponding finite dimensional density of order $n$ of $X$ can be obtained from the above equation by differentiation or from the density of the random vector $(Y_1, \ldots, Y_n)$ and its relation to $(X(t_1), \ldots, X(t_n))$. ■
Proof: The densities of the conditional variables $X(t_n) \mid \{X(t_{n-1}), \ldots, X(t_1)\}$ and $X(t_n) \mid X(t_{n-1})$ coincide (Eq. 3.20). This property shows that the Brownian motion and the compound Poisson processes are Markov. ■
for $\tau_1, \tau_2 > 0$, where $F_\tau$, $f_\tau$, $\varphi_\tau$ denote the distribution, the density, and the characteristic function of $X(t + \tau) - X(t)$, $\tau > 0$.

Proof: Set $\Delta_1 = X(t + \tau_1) - X(t)$ and $\Delta_2 = X(t + \tau_1 + \tau_2) - X(t + \tau_1)$ for $\tau_1, \tau_2 > 0$. The distribution of $X(t + \tau_1 + \tau_2) - X(t) = \Delta_1 + \Delta_2$ is
$$P(X(t + \tau_1 + \tau_2) - X(t) \le x) = P(\Delta_1 + \Delta_2 \le x) = \int_{\mathbb{R}} F_{\tau_2}(x - y)\,f_{\tau_1}(y)\,dy$$
since the random variables $\Delta_1$ and $\Delta_2$ are independent. The density of $X(t + \tau_1 + \tau_2) - X(t)$ results from the differentiation of the above equation. The characteristic function of $X(t + \tau_1 + \tau_2) - X(t)$ is the product of the characteristic functions of $\Delta_1$ and $\Delta_2$, that is, $\varphi_{\tau_1+\tau_2}(u) = \varphi_{\tau_1}(u)\,\varphi_{\tau_2}(u)$. This relationship becomes $\varphi_{\sum_{k=1}^{m}\tau_k}(u) = \prod_{k=1}^{m}\varphi_{\tau_k}(u)$ for $m$ non-overlapping time intervals of length $\tau_k$, $k = 1, \ldots, m$. ■
The mean $\mu(t) = E[X(t)]$ and variance $\sigma(t)^2 = E[(X(t) - \mu(t))^2]$ of a process $X(t)$, $t \ge 0$, with $X(0) = 0$ and stationary independent increments are linear functions of time.

Proof: Properties of the expectation give
$$\mu(t + \tau) = E[(X(t + \tau) - X(t)) + X(t)] = \mu(\tau) + \mu(t),$$
$$\sigma(t + \tau)^2 = E\{[(X(t + \tau) - X(t)) - (\mu(t + \tau) - \mu(t)) + (X(t) - \mu(t))]^2\}$$
$$= E\{[(X(t + \tau) - X(t)) - (\mu(t + \tau) - \mu(t))]^2\} + E\{[X(t) - \mu(t)]^2\} = \sigma(\tau)^2 + \sigma(t)^2$$
for $t, \tau \ge 0$, so that $\mu(t) = t\,\mu(1)$ and $\sigma(t)^2 = t\,\sigma(1)^2$. This result is consistent with the evolution of the mean and variance of the Brownian motion $B$ and the compound Poisson process $C$. These moments are $E[B(t)] = 0$, $\mathrm{Var}[B(t)] = t$, $E[C(t)] = \lambda\,t\,E[Y_1]$, and $\mathrm{Var}[C(t)] = \lambda\,t\,E[Y_1^2]$ provided that $Y_1 \in L_2$, where $\lambda > 0$ is the intensity of the Poisson process $N$ in the definition of $C$ (Example 3.9). ■
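The linear growth of the mean and variance can be observed directly for a compound Poisson process. The sketch below (the intensity, the $N(0,1)$ jump distribution, and the sample size are illustrative choices) exploits the fact that, given $N(t)$, a sum of $N(t)$ standard Gaussian jumps is $N(0, N(t))$, so $E[C(t)] = 0$ and $\mathrm{Var}[C(t)] = \lambda\,t\,E[Y_1^2] = \lambda t$:

```python
import numpy as np

rng = np.random.default_rng(3)

lam = 2.0          # Poisson intensity (hypothetical value)
n_mc = 50_000

def sample_C(t):
    """Sample C(t) = sum_{k=1}^{N(t)} Y_k with Y_k ~ N(0,1): given N, C ~ N(0, N)."""
    N = rng.poisson(lam * t, size=n_mc)
    return rng.normal(size=n_mc) * np.sqrt(N)

for t in (1.0, 2.0, 4.0):
    C = sample_C(t)
    print(t, C.mean(), C.var())    # mean ~ 0, variance ~ lam * t
```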
$$f_n(x_1, \ldots, x_n; t_1, \ldots, t_n) = \frac{1}{\sqrt{t_1}}\,\phi\Big(\frac{x_1}{\sqrt{t_1}}\Big)\,\frac{1}{\sqrt{t_2 - t_1}}\,\phi\Big(\frac{x_2 - x_1}{\sqrt{t_2 - t_1}}\Big)\cdots\frac{1}{\sqrt{t_n - t_{n-1}}}\,\phi\Big(\frac{x_n - x_{n-1}}{\sqrt{t_n - t_{n-1}}}\Big), \quad (3.22)$$
where $\phi$ denotes the density of the standard Gaussian variable. The characteristic function of the increment $C(t) - C(s)$, $t \ge s$, of a compound Poisson process $C(t) = \sum_{k=1}^{N(t)} Y_k$ is
$$E\big[e^{\sqrt{-1}\,u\,(C(t)-C(s))}\big] = e^{-\lambda\,(t-s)\,[1 - \varphi_Y(u)]}, \quad (3.23)$$
where $\lambda > 0$ is the intensity of the underlying Poisson process $N$ and $\varphi_Y$ is the characteristic function of $Y_1$. ◊

Proof: Properties of the Brownian motion and Eq. 3.20 give Eq. 3.22. The characteristic function of $C(t) - C(s) \stackrel{d}{=} C(t - s)$, $t \ge s$, is
$$E\big[e^{\sqrt{-1}\,u\,C(t-s)}\big] = \sum_{n=0}^{\infty} E\big[e^{\sqrt{-1}\,u\,C(t-s)} \mid N(t - s) = n\big]\,\frac{[\lambda\,(t-s)]^n}{n!}\,e^{-\lambda\,(t-s)} = \sum_{n=0}^{\infty}\varphi_Y(u)^n\,\frac{[\lambda\,(t-s)]^n}{n!}\,e^{-\lambda\,(t-s)} = e^{-\lambda\,(t-s)\,[1-\varphi_Y(u)]},$$
where $\varphi_Y(u) = E[e^{\sqrt{-1}\,u\,Y_1}]$. ■
Proof: Let $X_n = (X(t_1), \ldots, X(t_n))$, where $n \ge 1$ is an integer and $(t_1, \ldots, t_n)$ denote arbitrary times. The vector $X_n$ can be expressed as a linear transformation of the Gaussian vector $Z$ so that it is Gaussian (Section 2.11.5). ■
(3.27)
(3.28)
Proof: Take $m = 2$ and $n = 2$ in Eq. 3.28. Suppose that $f_2$ in this equation defines a non-degenerate process in $\mathcal{X}$; the densities $f_2^{(k)}$ satisfy the condition $f_2^{(k)}(x_1, x_2) = f^{(k)}(x_1)\,f^{(k)}(x_2)$, so that
$$p\,f_2^{(1)}(x_1, x_2) + (1 - p)\,f_2^{(2)}(x_1, x_2) = p\,f^{(1)}(x_1)\,f^{(1)}(x_2) + (1 - p)\,f^{(2)}(x_1)\,f^{(2)}(x_2)$$
$$= \big(p\,f^{(1)}(x_1) + (1 - p)\,f^{(2)}(x_1)\big)\,\big(p\,f^{(1)}(x_2) + (1 - p)\,f^{(2)}(x_2)\big)$$
since uncorrelated translation variables are independent. The above equality gives
Example 3.25: Let $\zeta(\tau, a) = E[X(t)\,X(t + \tau)\,X(t + a)]$ be the second order correlation function of $X \in \mathcal{X}$ with finite dimensional density in Eq. 3.28. Then $\zeta(\tau, a) = \sum_{k} p_k\,\zeta_k(\tau, a)$, where $\zeta_k(\tau, a) = E[X_k(t)\,X_k(t + \tau)\,X_k(t + a)]$. ◊

Note: This is a special case of Eq. 3.29. The second order correlation functions $\zeta$ and $\zeta_k$ depend on only two arguments because the processes $X$ and $X_k$ are stationary. ▲
Note: The moments in Eq. 3.30 exist and are finite because $X(t) \in L_2$. For example, the entry $(i, j)$ of $r(t, s)$, giving the correlation function $r_{i,j}(t, s) = E[X_i(t)\,X_j(s)]$, is finite since $(E[X_i(t)\,X_j(s)])^2 \le E[X_i(t)^2]\,E[X_j(s)^2]$ by the Cauchy–Schwarz inequality, and the expectations $E[X_i(t)^2]$ exist and are finite by hypothesis. Similar considerations show that the mean and covariance functions of $X$ exist and are finite.

Generally, the condition $X(t) \in L_2$, $t \ge 0$, is insufficient for the existence of moments of $X$ of order higher than two. The class of Gaussian processes is an exception because their moments of order three and higher are related to the first two moments by algebraic equations ([79], Appendix B). ▲
The second moment properties of $X$ are given by the pair of functions $(\mu, r)$ or $(\mu, c)$ in Eq. 3.30.

Note: The functions $(\mu, r)$ and $(\mu, c)$ contain the same information since the correlation and covariance functions are related by $c(t, s) = r(t, s) - \mu(t)\,\mu(s)^{\mathsf T}$.

If $r_{i,j}(t, s) = 0$ at all times, the processes $X_i$ and $X_j$ are said to be orthogonal. If $c_{i,j}(t, s) = 0$ at all times, the processes $X_i$ and $X_j$ are said to be uncorrelated. If the processes $X_i$ and $X_j$ have mean zero, that is, $\mu_i(t) = 0$ and $\mu_j(t) = 0$ at all times, they are orthogonal if and only if they are uncorrelated. ▲
Example 3.26: Generally, the second moment properties of a process $X$ are insufficient to define its finite dimensional distributions. The Gaussian process is a notable exception. ◊

Proof: Let $n \ge 1$ be an integer and $t_i$, $i = 1, \ldots, n$, denote arbitrary times. The finite dimensional distribution of order $n$ of $X$ is the distribution of the vector $X_n = (X(t_1), \ldots, X(t_n))$. If $X$ is a Gaussian process, $X_n$ is a Gaussian vector whose second moment properties can be obtained from the mean and correlation functions of $X$. The distribution of $X_n$ is completely defined by its first two moments (Section 2.11.5). ■
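For a Gaussian process the construction in Example 3.26 is concrete: the finite dimensional distribution of order $n$ is the multivariate normal with mean vector $\mu(t_i)$ and covariance matrix $r(t_i, t_j)$. A sketch for the Brownian motion, where $\mu = 0$ and $r(t, s) = \min(t, s)$ (the times and sample size below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)

times = np.array([0.5, 1.0, 2.0, 3.0])          # arbitrary times t_1 < ... < t_n
mean = np.zeros(len(times))                      # mu(t) = 0 for Brownian motion
corr = np.minimum.outer(times, times)            # r(t_i, t_j) = min(t_i, t_j)

# The order-n fdd is N(mean, corr); sample it and check Var[B(t_i)] = t_i.
samples = rng.multivariate_normal(mean, corr, size=100_000)
print(samples.var(axis=0))
```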
where $\rho \in (0, 1)$ and $W_k$ are uncorrelated random variables with mean $E[W_k] = 0$ and variance $\mathrm{Var}[W_k] = 1$ that are uncorrelated with the initial value $X_0$ of $X$. The process $X$ is weakly stationary as $k \to \infty$ but may or may not be stationary depending on the higher order properties of $W_k$. ◊

Proof: The recurrence formulas $\mu_{k+1} = \rho\,\mu_k$ and $\sigma_{k+1}^2 = \rho^2\,\sigma_k^2 + 1$ for the mean $\mu_k = E[X_k]$ and the variance $\sigma_k^2 = E[(X_k - \mu_k)^2]$ of $X_k$ result by averaging the defining equation for $X$ and its square, respectively. The covariance function of $X$ satisfies the equation $c(k + p, k) = E[\tilde X_{k+p}\,\tilde X_k] = \rho^p\,\sigma_k^2$, where $\tilde X_k = X_k - \mu_k$. This equation results by taking the expectation of the product of $\tilde X_{k+p} = \rho^p\,\tilde X_k + \sum_{s=1}^{p}\rho^{s-1}\,W_{k+p-s}$ with $\tilde X_k$, because $X_u$ and $W_v$, $v > u$, are not correlated. The asymptotic values of the mean, variance, and covariance functions of $X$ as $k \to \infty$ are zero, $1/(1 - \rho^2)$, and $\rho^p/(1 - \rho^2)$, respectively, so that $X$ becomes a weakly stationary process for large values of $k$.

However, $X$ may not be stationary. For example, $\varphi_k(u) = E[\exp(\sqrt{-1}\,u\,X_k)]$ satisfies the recurrence formula $\varphi_{k+1}(u) = \varphi_k(\rho\,u)\,\varphi_{W_k}(u)$ if the random variables $W_k$ and $X_k$ are independent for all $k$'s, where $\varphi_{W_k}(u) = E[\exp(\sqrt{-1}\,u\,W_k)]$. Suppose that $X$ becomes stationary for large values of $k$ and denote the stationary characteristic function of $X$ by $\varphi$. For these values of $k$ the recurrence formula becomes $\varphi(u) \simeq \varphi(\rho\,u)\,\varphi_{W_k}(u)$ or $\varphi(u)/\varphi(\rho\,u) \simeq \varphi_{W_k}(u)$. This equality is not possible because its left side is time invariant while its right side depends on time if, for example, $W_k$ has the above second moment properties but different characteristic functions at each time $k$. ■
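The asymptotic second moment properties derived above can be checked by simulating the recursion $X_{k+1} = \rho\,X_k + W_{k+1}$. In the sketch below (Gaussian $W_k$, $X_0 = 0$, and all numerical values are illustrative assumptions), the sample variance approaches $1/(1 - \rho^2)$ and the lag-$p$ covariance approaches $\rho^p/(1 - \rho^2)$:

```python
import numpy as np

rng = np.random.default_rng(5)

rho = 0.8
n_paths, n_steps, p = 20_000, 200, 5

X = np.zeros(n_paths)
history = []
for _ in range(n_steps):
    X = rho * X + rng.normal(size=n_paths)   # X_{k+1} = rho*X_k + W_{k+1}
    history.append(X.copy())

Xk, Xkp = history[-1 - p], history[-1]       # lag p, deep in the stationary regime
var_inf = 1.0 / (1.0 - rho**2)               # asymptotic variance
cov_p = float(np.mean(Xk * Xkp))             # ~ rho^p / (1 - rho^2)
print(float(Xkp.var()), var_inf, cov_p, rho**p * var_inf)
```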
$$\xi(\tau) = E[g(G(t))\,g(G(t + \tau))] = \int_{\mathbb{R}^2} g(y_1)\,g(y_2)\,\phi(y_1, y_2; \rho(\tau))\,dy_1\,dy_2, \quad (3.31)$$
where $\phi$ denotes the joint density of $(G(t), G(t + \tau))$ (Section 2.11.5). The covariance function $\xi$ of $X$ takes on values in the range $[\xi^*, 1]$, where $\xi^* = E[g(Z)\,g(-Z)]$ and $Z$ is a standard Gaussian variable. The translation process $X$ is weakly stationary and stationary. ◊

Proof: The formula in Eq. 3.31 results from the definition of the translation process $X$ or a theorem by Price ([79], Section 3.1.1). This theorem also shows that $\xi(\tau)$ increases with $\rho(\tau)$. The covariance function of $X$ is $\xi(\tau) = 0$ and $1$ for $\rho(\tau) = 0$ and $1$, respectively. However, $\rho(\tau) = -1$ does not imply $\xi(\tau) = -1$. If we have $\rho(\tau) = -1$, that is, $G(t) = Z$ and $G(t + \tau) = -Z$, the value of the covariance $\xi^* = E[g(Z)\,g(-Z)]$ of $(X(t), X(t + \tau))$ depends on $g$. For example, $\xi^* = -1$ if $g(y) = y^3$ but $\xi^* > -1$ if $g(y) = \alpha + \exp(\beta + \gamma\,y)$, where $\alpha, \beta, \gamma$ are some constants ([79], Section 3.1.1). This property of the covariance function of $X$ is particularly relevant in applications when we attempt to fit
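The bound $\xi^*$ can be explored numerically: the correlation of $(g(Z), g(-Z))$ reaches $-1$ for the odd mapping $g(y) = y^3$ but stays above $-1$ for an exponential mapping. In the sketch below the constants $\alpha = 1$, $\beta = 0.5$, $\gamma = 1$ are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(6)
Z = rng.normal(size=200_000)

def corr_at_rho_minus_one(g):
    """Correlation of (g(Z), g(-Z)), i.e. the translation correlation when rho = -1."""
    a, b = g(Z), g(-Z)
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.mean(a * b))

xi_cubic = corr_at_rho_minus_one(lambda y: y**3)                        # reaches -1
xi_exp = corr_at_rho_minus_one(lambda y: 1.0 + np.exp(0.5 + 1.0 * y))   # stays > -1
print(xi_cubic, xi_exp)
```

For the exponential mapping with $\gamma = 1$ the exact value works out to $-e^{-1} \approx -0.37$, well above the cubic floor of $-1$.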
• $r_{i,j}$, $i \ne j$, satisfies the condition $r_{i,j}(t, s) = r_{j,i}(s, t)$. If $X$ is weakly stationary, then $r_{i,j}(\tau) = r_{j,i}(-\tau)$, where $\tau = t - s$.
• The inequality $|r_{i,j}(t, s)| \le \sqrt{r_{i,i}(t, t)\,r_{j,j}(s, s)}$ holds. If $X$ is weakly stationary, then $|r_{i,j}(\tau)| \le \sqrt{r_{i,i}(0)\,r_{j,j}(0)}$.

Proof: The first property follows from the definition of the correlation function because $r_{i,j}(t, s) = E[X_i(t)\,X_j(s)] = E[X_j(s)\,X_i(t)] = r_{j,i}(s, t)$. If $X$ is weakly stationary, we have $r_{i,j}(t + \tau, t) = r_{j,i}(t, t + \tau)$, so that $r_{i,j}(\tau) = r_{j,i}(-\tau)$. The Cauchy–Schwarz inequality gives the second property. We also have $r_{i,i}(t, t)\,r_{j,j}(s, s) \le [r_{i,i}(t, t) + r_{j,j}(s, s)]^2/4$ since $4\,a\,b \le (a + b)^2$ for any $a, b \in \mathbb{R}$. ■
Example 3.29: Let $X$ be a $\mathbb{C}^d$-valued process in $L_2(\Omega, \mathcal{F}, P)$, that is, its coordinates are complex-valued processes such that $E[X_i(t)\,X_i(t)^*] < \infty$, $i = 1, \ldots, d$, at all times. The correlation functions of $X$ are
$$r_{i,j}(t, s) = E[X_i(t)\,X_j(s)^*], \quad i, j = 1, \ldots, d, \quad (3.32)$$
and satisfy the condition $r_{i,j}(t, s) = r_{j,i}(s, t)^*$, where $z^*$ denotes the complex conjugate of $z \in \mathbb{C}$. ◊
Proof: Let $U_i$ and $V_i$ denote the real and imaginary parts of $X_i$. Then the property follows by expanding $E[X_i(t)\,X_j(s)^*]$ and $E[X_j(s)\,X_i(t)^*]$ in terms of $U_i, V_i, U_j, V_j$ and comparing terms. ■
Example 3.30: Let $A_k$, $B_k$ be uncorrelated random variables with mean zero and unit variance. The function
$$X(t) = \sum_{k=1}^{n} \sigma_k\,[A_k\cos(\nu_k t) + B_k\sin(\nu_k t)], \quad t \ge 0, \quad (3.33)$$
is a real-valued parametric stochastic process with mean zero and covariance function
$$c(t, s) = \sum_{k=1}^{n} \sigma_k^2\cos(\nu_k\,(t - s)), \quad (3.34)$$
where $\sigma_k, \nu_k > 0$, $k = 1, 2, \ldots, n$, are some constants. The process $X$ is weakly stationary. If in addition $A_k$, $B_k$ are Gaussian variables, then $X$ is a stationary Gaussian process. ◊

Proof: The mean is zero because $X$ is a linear function of $A_k$ and $B_k$, and these random variables have mean zero. The covariance function of $X$ is
$$c(t, s) = \sum_{k,l=1}^{n} \sigma_k\,\sigma_l\,E[(A_k\cos(\nu_k t) + B_k\sin(\nu_k t))\,(A_l\cos(\nu_l s) + B_l\sin(\nu_l s))]$$
$$= \sum_{k=1}^{n} \sigma_k^2\,(\cos(\nu_k t)\cos(\nu_k s) + \sin(\nu_k t)\sin(\nu_k s)) = \sum_{k=1}^{n} \sigma_k^2\cos(\nu_k\,(t - s)) = c(t - s)$$
by the linearity of the expectation operator and the properties of the random variables $A_k$, $B_k$. Because $c(t, s)$ depends on the time lag $t - s$ rather than the times $t$ and $s$, $X$ is a weakly stationary process.

If in addition the random variables $A_k$, $B_k$ are Gaussian, then $X$ is stationary since it is a weakly stationary Gaussian process. ■
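Eq. 3.34 can be confirmed by simulating Eq. 3.33 across a Monte Carlo ensemble (the $\sigma_k$, $\nu_k$, evaluation times, and sample size below are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(7)

sig = np.array([1.0, 0.5])       # sigma_k (hypothetical)
nu = np.array([2.0, 5.0])        # nu_k (hypothetical)
n_mc = 100_000

A = rng.normal(size=(n_mc, 2))   # uncorrelated, mean zero, unit variance
B = rng.normal(size=(n_mc, 2))

def X(t):
    """Eq. 3.33 evaluated for every sample of (A_k, B_k) in the ensemble."""
    return (sig * (A * np.cos(nu * t) + B * np.sin(nu * t))).sum(axis=1)

t, s = 1.3, 0.4
emp = float(np.mean(X(t) * X(s)))                       # ensemble covariance (mean is 0)
exact = float(np.sum(sig**2 * np.cos(nu * (t - s))))    # Eq. 3.34
print(emp, exact)
```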
A continuous function $r : \mathbb{R} \to \mathbb{C}$ is positive definite if and only if it has the representation
$$r(\tau) = \int_{\mathbb{R}} e^{\sqrt{-1}\,\nu\,\tau}\,dS(\nu), \quad (3.35)$$
where $S$ is a bounded, non-decreasing function.

Note: The function $r$ is said to be positive definite if $\sum_{k,l=1}^{n}\xi_k\,\xi_l^*\,r(t_k - t_l) \ge 0$ for any integer $n \ge 1$, arguments $t_k \in \mathbb{R}$, and complex constants $\xi_k$, $k = 1, \ldots, n$, where $z^*$ denotes the complex conjugate of $z \in \mathbb{C}$.

The proof of Bochner's theorem can be found in [45] (Section 7.4). We only show that $r(\cdot)$ given by Eq. 3.35 is positive definite. Note that
$$\sum_{k,l=1}^{n}\xi_k\,\xi_l^*\,r(t_k - t_l) = \int_{\mathbb{R}}\Big(\sum_{k=1}^{n}\xi_k\,e^{\sqrt{-1}\,\nu\,t_k}\Big)\Big(\sum_{l=1}^{n}\xi_l\,e^{\sqrt{-1}\,\nu\,t_l}\Big)^*\,dS(\nu) = \int_{\mathbb{R}}\Big|\sum_{k=1}^{n}\xi_k\,e^{\sqrt{-1}\,\nu\,t_k}\Big|^2\,dS(\nu)$$
is positive for any integer $n \ge 1$, arguments $t_k$, and complex constants $\xi_k$. ▲
The correlation and spectral density functions are Fourier pairs, that is,
$$r(\tau) = \int_{-\infty}^{\infty} e^{\sqrt{-1}\,\nu\,\tau}\,s(\nu)\,d\nu \quad \text{and} \quad s(\nu) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-\sqrt{-1}\,\nu\,\tau}\,r(\tau)\,d\tau. \quad (3.36)$$
Note: If $\mu = E[X(t)]$ is not zero, the spectral density function of $X$ has an atom at $\nu = 0$ equal to $\mu^2$ since $r(\tau) = \mu^2 + c(\tau)$ (Eq. 3.36). Formal calculations give
$$s(\nu) = \mu^2\,\delta(\nu) + \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-\sqrt{-1}\,\nu\,\tau}\,c(\tau)\,d\tau,$$
where $\delta(\cdot)$ denotes the Dirac delta function. In engineering applications the one-sided spectral density function
$$g(\nu) = 2\,s(\nu), \quad \nu \ge 0, \quad (3.38)$$
is preferred since $\nu \ge 0$ in Eq. 3.38 can be interpreted as frequency. By analogy with the spectral distribution $S$ of $X$, we can define the one-sided spectral distribution function of this process by
$$G(\nu) = \mu^2 + 2\int_{0^+}^{\nu} s(u)\,du,$$
where the notation $\int_{0^+}$ indicates that the integral does not include the value of the integrand at zero. ▲
We summarize now some properties of the spectral and the one-sided spectral density functions.

• $s(\cdot)$ or $g(\cdot)$ provide the second moment properties of $X$.
• $s(\cdot)$ and $g(\cdot)$ are positive functions and $s(\nu) = s(-\nu)$ for all $\nu \in \mathbb{R}$.
• An alternative to Eq. 3.36 involving only real-valued functions is
$$r(\tau) = \int_{0}^{\infty} g(\nu)\cos(\nu\,\tau)\,d\nu \quad \text{and} \quad g(\nu) = \frac{2}{\pi}\int_{0}^{\infty} r(\tau)\cos(\nu\,\tau)\,d\tau. \quad (3.39)$$
• The area under the spectral densities $s(\cdot)$ and $g(\cdot)$ is $r(0)$. If $E[X(t)] = 0$, this area is the variance of $X$.

Proof: The Fourier transform of the spectral density is the correlation function of $X$ and the mass of the atom of its spectral density at $\nu = 0$ is $\mu^2$. Hence, the functions $s(\cdot)$ and $g(\cdot)$ deliver the second moment properties of $X$. Because $S$ is an increasing function, the spectral densities $s(\cdot)$ and $g(\cdot)$ are positive. The spectral density $s(\cdot)$ is an even function since the correlation function of $X$ is even (Eq. 3.36). The last two properties result from Eq. 3.36, Eq. 3.38, and the fact that the spectral density $s(\cdot)$ is an even function. ■
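Eq. 3.39 can be verified numerically for a specific correlation function. Taking $r(\tau) = e^{-\lambda|\tau|}$ (the first order Markov form; the value of $\lambda$ and the quadrature grids below are illustrative choices), the one-sided spectral density is $g(\nu) = (2/\pi)\,\lambda/(\lambda^2 + \nu^2)$, and both directions of the cosine-transform pair can be checked by quadrature:

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule (kept explicit for portability)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

lam = 1.5
r = lambda tau: np.exp(-lam * np.abs(tau))

# g(nu) = (2/pi) * int_0^inf r(tau) cos(nu tau) dtau = (2/pi) * lam/(lam^2 + nu^2)
tau = np.linspace(0.0, 50.0, 200_001)
nu0 = 2.0
g_num = (2.0 / np.pi) * trapezoid(r(tau) * np.cos(nu0 * tau), tau)
g_exact = (2.0 / np.pi) * lam / (lam**2 + nu0**2)

# and back: r(tau0) = int_0^inf g(nu) cos(nu tau0) dnu
nus = np.linspace(0.0, 400.0, 800_001)
g_vals = (2.0 / np.pi) * lam / (lam**2 + nus**2)
tau0 = 0.7
r_num = trapezoid(g_vals * np.cos(nus * tau0), nus)
print(g_num, g_exact, r_num, float(r(tau0)))
```

The truncation of both integrals to finite intervals leaves a small tail error, which shrinks as the upper limits grow.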
Table 3.1. Examples of correlation and one-sided spectral density functions for weakly stationary processes with mean zero (adapted from [175])

BLWN: $r(\tau) = g_0\sin(\nu_c\,\tau)/\tau$; $g(\nu) = g_0$ for $0 < \nu \le \nu_c$ and $g(\nu) = 0$ for $\nu > \nu_c$.
RP: $r(\tau) = g_0\,(\nu_b - \nu_a)\sin(\rho)\cos(\gamma)/\rho$; $g(\nu) = g_0$ for $\nu_a < \nu \le \nu_b$ and $g(\nu) = 0$ for $\nu \notin (\nu_a, \nu_b]$.
Example 3.31: The spectral density and the one-sided spectral density functions of the stochastic process in Eq. 3.33 are
$$s(\nu) = \frac{1}{2}\sum_{k=1}^{n}\sigma_k^2\,[\delta(\nu - \nu_k) + \delta(\nu + \nu_k)] \quad \text{and} \quad g(\nu) = \sum_{k=1}^{n}\sigma_k^2\,\delta(\nu - \nu_k), \quad (3.40)$$

Table 3.1 gives examples of covariance and spectral density functions for real-valued weakly stationary processes, where WN, BLWN, RP, BN, 1M, and 2M denote white noise, band-limited white noise, rectangular pulse, binary noise, first order Markov, and second order Markov process, respectively, and $g_0$, $\nu_c$, $\nu_a$, $\nu_b$, $\zeta$, $\lambda$, and $\sigma^2$ are positive constants. The notations $\rho = (\nu_b - \nu_a)\,\tau/2$ and $\gamma = (\nu_b + \nu_a)\,\tau/2$ are also used in the above table.
obtained from its real and imaginary parts (Example 3.29) and has the following properties.

• The correlation function of $X$ has the property $r(t - s) = r(s - t)^*$.
• The correlation function of $X$ is positive definite.

Proof: The equality $r(t, s) = r(s, t)^*$ (Example 3.29) implies the first property. The second property results from the observation that the random variable $\big|\sum_{k=1}^{n}\xi_k\,X(t_k)\big|^2$ is positive for any integer $n \ge 1$, times $t_k$, and complex-valued constants $\xi_k$, so that its expectation $\sum_{k,l=1}^{n}\xi_k\,\xi_l^*\,r(t_k - t_l)$ is positive. The Bochner theorem (Eq. 3.35) states that $X$ has a spectral distribution function. ■
If $X$ is an $\mathbb{R}^d$-valued weakly stationary stochastic process with mean zero and continuous correlation function $r(\tau) = E[X(t + \tau)\,X(t)^{\mathsf T}]$, then
$$r_{i,j}(\tau) = \int_{\mathbb{R}} e^{\sqrt{-1}\,\nu\,\tau}\,dS_{i,j}(\nu) = \int_{\mathbb{R}} e^{\sqrt{-1}\,\nu\,\tau}\,s_{i,j}(\nu)\,d\nu, \quad \text{where}$$
$$S_{i,j} = S_{j,i}^* = \frac{1}{2}\big[S_1 - \sqrt{-1}\,S_2 - (1 - \sqrt{-1})\,(S_{i,i} + S_{j,j})\big] \quad \text{for } i \ne j,$$
$S_p$, $p = 1, 2$, is defined by Eq. 3.42 below, and $s_{i,j}(\nu) = dS_{i,j}(\nu)/d\nu$.

Proof: The representation of the entries $r_{i,i}$ of the correlation function follows from results in Section 3.7.2.2. We show now that the correlation functions $r_{i,j}$, $i \ne j$, have a similar representation. The integral representations of $r_{i,j}$ involving the spectral densities $s_{i,j}$ are valid if the functions $s_{i,j}$ exist.
The complex-valued process $Y(t) = \sum_{i=1}^{d}\zeta_i\,X_i(t)$, $\zeta_i \in \mathbb{C}$, is weakly stationary because it has mean zero and correlation function $r_Y(\tau) = E[Y(t + \tau)\,Y(t)^*] = \sum_{k,l=1}^{d}\zeta_k\,\zeta_l^*\,r_{k,l}(\tau)$. If $r_Y$ is continuous, then $r_Y$ has the representation in Eq. 3.36 since it is positive definite. Let $i, j \in \{1, \ldots, d\}$ be some fixed, distinct indices and let $Y_1$ and $Y_2$ be special cases of $Y$ corresponding to (1) $\zeta_i = \zeta_j = 1$ and $\zeta_k = 0$ for $k \ne i, j$ and (2) $\zeta_i = \sqrt{-1}$, $\zeta_j = 1$, and $\zeta_k = 0$ for $k \ne i, j$, respectively. The correlation functions of these processes,
$$r_{Y_1} = r_{i,i} + r_{j,j} + r_{i,j} + r_{j,i} \quad \text{and} \quad r_{Y_2} = r_{i,i} + r_{j,j} + \sqrt{-1}\,(r_{i,j} - r_{j,i}),$$
have the representations $r_{Y_p}(\tau) = \int_{\mathbb{R}} e^{\sqrt{-1}\,\nu\,\tau}\,dS_p(\nu)$, where $S_p$ denotes the spectral distribution of $Y_p$, $p = 1, 2$ (Eq. 3.36). The last two equations and the representation of the correlation functions $r_{i,i}$ in Bochner's theorem give
$$r_{i,j}(\tau) = \int_{\mathbb{R}} e^{\sqrt{-1}\,\nu\,\tau}\,dS_{i,j}(\nu), \quad \text{where} \quad S_{i,j} = S_{j,i}^* = \frac{1}{2}\big[S_1 - \sqrt{-1}\,S_2 - (1 - \sqrt{-1})\,(S_{i,i} + S_{j,j})\big],$$
for the correlation functions $r_{i,j}$, $i \ne j$. The spectral distributions $S_{i,j}$, $i \ne j$, are bounded complex-valued functions and the matrix $\{S_{i,j}\}$ is Hermitian since $S_{i,j} = S_{j,i}^*$. The spectral distributions $S_{i,j}$, $i \ne j$, are not monotone functions.

It remains to show that it is possible to define a spectral density for distinct pairs of coordinates of $X$. Consider a bounded interval $[\nu, \nu + \Delta\nu)$ and set $\Delta S_{i,j}(\nu) = S_{i,j}(\nu + \Delta\nu) - S_{i,j}(\nu)$. The Hermitian matrix $\{\Delta S_{i,j}(\nu)\}$, $i, j = 1, \ldots, d$, has the property $\sum_{i,j=1}^{d}\zeta_i\,\zeta_j^*\,\Delta S_{i,j}(\nu) \ge 0$, $\zeta_i \in \mathbb{C}$, since this sum equals the increment of the spectral distribution $S_Y$ of $Y(t) = \sum_{k=1}^{d}\zeta_k\,X_k(t)$ over $[\nu, \nu + \Delta\nu)$ and $S_Y$ is an increasing function. For a fixed pair of distinct indices $i, j \in \{1, \ldots, d\}$ and $\zeta_k = 0$ for $k \ne i, j$, the sum
$$\sum_{i,j=1}^{d}\zeta_i\,\zeta_j^*\,\Delta S_{i,j}(\nu)$$
becomes
$$\Delta S_{i,i}(\nu)\,|\zeta_i|^2 + \Delta S_{j,j}(\nu)\,|\zeta_j|^2 + 2\,\Re\big(\Delta S_{i,j}(\nu)\,\zeta_i\,\zeta_j^*\big) \ge 0, \quad \text{or} \quad \Delta S_{i,i}(\nu)\,|\zeta_i|^2 + \Delta S_{j,j}(\nu)\,|\zeta_j|^2 - 2\,|\Delta S_{i,j}(\nu)|\,|\zeta_i|\,|\zeta_j| \ge 0,$$
because the real part of $\Delta S_{i,j}(\nu)\,\zeta_i\,\zeta_j^*$ is smaller than its absolute value. The latter expression divided by $|\zeta_j|^2$ is a polynomial of $\eta = |\zeta_i|/|\zeta_j|$ that has no real roots, so that $|\Delta S_{i,j}(\nu)|^2 \le \Delta S_{i,i}(\nu)\,\Delta S_{j,j}(\nu)$.
$$s_{i,i}(\nu) = (1 - p)\,s_{w_i}(\nu) + p\,s_w(\nu), \quad i = 1, 2,$$
$$s_{1,2}(\nu) = s_{2,1}(\nu) = p\,s_w(\nu)$$
Note: Let $\Omega_0$ be the subset of $\Omega$ collecting the samples $X(\cdot, \omega)$ and $Y(\cdot, \omega)$ of $X$ and $Y$ that do not coincide. The processes $X$ and $Y$ are indistinguishable if $P(\Omega_0) = 0$. ▲

Note: Indistinguishable processes are modifications, but modifications may not be indistinguishable. Let $\Omega_t = \{\omega : X(t, \omega) \ne Y(t, \omega)\}$ be the subset of $\Omega$ on which the modifications $X$ and $Y$ differ at $t$. The set $\Omega_t$ is an event and $P(\Omega_t) = 0$. If the probability of $\cup_t\Omega_t$ is zero, then $X$ and $Y$ will be indistinguishable. Because $\cup_t\Omega_t$ is an uncountable union of members of $\mathcal{F}$, it may not be in $\mathcal{F}$, in which case $P(\cup_t\Omega_t)$ is not even defined. ▲
The following conditions of equivalence for X and Y are weaker and do not
require that these processes be defined on the same probability space.
The stochastic processes X and Y are said to be versions if they have the same
finite dimensional distributions.
The stochastic processes X and Y are said to be equal in the second moment
sense if they have the same second moment properties.
Note: Generally, processes with the same second moment properties have different finite
dimensional distributions. An exception is the class of Gaussian processes. Recall that two
Gaussian processes with the same second moment properties are versions.
Other types of equivalence between stochastic processes can be defined. For ex-
ample, processes with the same marginal distribution and second moment properties or
processes with the same finite dimensional distributions of order two can be viewed as
equivalent. A
Example 3.34: Suppose that the stochastic processes $X$ and $Y$ are modifications with right continuous samples. Then $X$ and $Y$ are indistinguishable. ◊

Proof: The event $\{X(r) \ne Y(r)\}$ has zero measure for each $r \in \mathbb{Q}$, so that the event $A = \cup_{r\in\mathbb{Q}}\{X(r) \ne Y(r)\}$ has zero probability since the set of rational numbers $\mathbb{Q}$ is countable. If $r_n \downarrow t$, $r_n \in \mathbb{Q}$, and $t \in \mathbb{R}$, then $\{X(r_n) \ne Y(r_n)\} \subset A$ for each $n$, so that we also have $\{X(t) \ne Y(t)\} \subset A$ at any time $t \in \mathbb{R}$ by right continuity. Hence, $X$ and $Y$ are indistinguishable. ■

Example 3.35: Let $X(t)$ and $Y(t)$, $t \in [0, 1]$, be real-valued processes defined on the probability space $([0, 1], \mathcal{B}([0, 1]), P)$, where $P(d\omega) = d\omega$, such that $X(t, \omega) = 0$ and $Y(t, \omega) = 0$ or $1$ for $t \ne \omega$ or $t = \omega$, respectively. The processes $X$ and $Y$ are not indistinguishable but are versions and modifications. ◊

Proof: The samples of $X$ and $Y$ differ for all $\omega \in \Omega$; for example, $\sup_{t\in[0,1]} X(t, \omega) = 0$ while $\sup_{t\in[0,1]} Y(t, \omega) = 1$ for each $\omega \in [0, 1]$.

The processes $X$ and $Y$ are modifications because $\Omega_t = \{\omega : X(t, \omega) \ne Y(t, \omega)\} = \{t\}$ and $P(\{t\}) = 0$ for any $t \in [0, 1]$. These processes are also versions since the vectors $(X(t_1), \ldots, X(t_n))$ and $(Y(t_1), \ldots, Y(t_n))$ differ on a subset of $\Omega$ with measure zero for any times $t_1, \ldots, t_n$ in $[0, 1]$. ■
Example 3.36: The mean and covariance functions of the Brownian motion and the compound Poisson processes $B$ and $C$ are
$$\mu_B(t) = 0, \quad c_B(t, s) = \min(t, s), \quad \text{and}$$
$$\mu_C(t) = \lambda\,t\,E[Y_1], \quad c_C(t, s) = \lambda\,\min(t, s)\,E[Y_1^2], \quad (3.44)$$
with the notations in Eq. 3.6 and the assumption $E[Y_1^2] < \infty$. If $E[Y_1] = 0$ and $\lambda\,E[Y_1^2] = 1$, the Brownian motion and the compound Poisson process are equal in the second moment sense. However, the samples of these processes differ significantly (Fig. 3.5), showing that the second moment equivalence provides little information on the sample properties. ◊

Proof: The Brownian motion has mean zero by definition. The expectation $E[B(t)\,B(s)]$, $t > s$, is
$$E[B(t)\,B(s)] = E[(B(t) - B(s))\,B(s)] + E[B(s)^2] = s,$$
since $E[(B(t) - B(s))\,B(s)] = E[B(t) - B(s)]\,E[B(s)] = 0$ by the independence of the increments of $B$ and $E[B(s)^2] = s$.

We have $E[C(t)] = \sum_{n=0}^{\infty} E[C(t) \mid N(t) = n]\,P(N(t) = n)$, $E[C(t) \mid N(t) = n] = n\,E[Y_1]$, and $P(N(t) = n) = e^{-\lambda t}\,(\lambda t)^n/n!$, so that $E[C(t)] = \lambda\,t\,E[Y_1]$. A similar conditioning argument gives the covariance function of $C$. ■
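The second moment equivalence of $B$ and $C$, and their very different sample behavior, can both be seen in a quick simulation. Taking $\lambda = 1$ and $N(0,1)$ jumps gives $E[Y_1] = 0$ and $\lambda\,E[Y_1^2] = 1$ (these and all other numerical values below are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(8)

lam, t, n_mc = 1.0, 3.0, 100_000

# Brownian motion: B(t) ~ N(0, t).
B = rng.normal(0.0, np.sqrt(t), size=n_mc)

# Compound Poisson with Y_k ~ N(0,1): given N(t), C(t) ~ N(0, N(t)).
N = rng.poisson(lam * t, size=n_mc)
C = rng.normal(size=n_mc) * np.sqrt(N)

# Identical second moments ...
print(B.mean(), C.mean(), B.var(), C.var())   # means ~ 0, variances ~ t
# ... but different sample behavior: C(t) = 0 exactly with probability e^{-lam*t}.
print(float(np.mean(C == 0.0)), float(np.exp(-lam * t)))
```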
• If a sequence $X_n$ has two m.s. limits $X$ and $Y$, then $X = Y$ a.s., that is,
$$X_n \xrightarrow{\text{m.s.}} X \ \text{and}\ X_n \xrightarrow{\text{m.s.}} Y \ \text{implies}\ P(X \ne Y) = 0. \quad (3.47)$$

Proof: The above statements show that the expectations and the m.s. limits can be interchanged under some conditions (Eqs. 3.45–3.46) and that the m.s. limit is unique with probability one, that is, the subset of $\Omega$ in which the m.s. limits of a sequence $X_n$ may differ has zero measure (Eq. 3.47).

We have
$$|E[X_m\,Y_n - X\,Y]| = |E[(X_m - X)\,Y_n + X\,(Y_n - Y)]| \le \|X_m - X\|\,\|Y_n\| + \|X\|\,\|Y_n - Y\|,$$
where the Cauchy–Schwarz inequality has been used to obtain the final upper bound on $|E[X_m\,Y_n - X\,Y]|$. The property in Eq. 3.46 follows from the above inequality since $X_m \xrightarrow{\text{m.s.}} X$ and $Y_n \xrightarrow{\text{m.s.}} Y$ by hypothesis.

The uniqueness with probability 1 of the m.s. limit in Eq. 3.47 follows from the Chebyshev inequality $P(|X - Y| > \varepsilon) \le E[(X - Y)^2]/\varepsilon^2$, $\varepsilon > 0$, and the inequalities
$$E[(X - Y)^2] \le 2\,E[(X - X_n)^2] + 2\,E[(X_n - Y)^2],$$
where the last inequality holds since $(a + b)^2 \le 2\,a^2 + 2\,b^2$. Because $X_n \xrightarrow{\text{m.s.}} X, Y$, we have $P(|X - Y| > \varepsilon) = 0$, $\forall\varepsilon > 0$. ■
The remainder of this section gives definitions and basic properties of m.s. continuity, differentiation, and integration for real-valued processes in $L_2(\Omega, \mathcal{F}, P)$. We also extend these results to $\mathbb{R}^d$-valued stochastic processes.
3.9.1 Continuity

We have defined in Section 3.3 several types of continuity for stochastic processes. Generally, it is difficult to show that a process $X$ is sample continuous, that is, that almost all samples of $X$ are continuous functions of time. Weaker definitions of continuity may be adequate in many applications. We define again m.s. continuity and give simple criteria for determining whether a process is m.s. continuous at a time $t$ and in a time interval.

A real-valued stochastic process $X$ in $L_2$ is m.s. continuous or continuous in the mean square sense at time $t$ if $\mathrm{l.i.m.}_{s\to t}\,X(s) = X(t)$, that is,
$$\|X(s) - X(t)\| \to 0 \quad \text{as } s \to t. \quad (3.48)$$

Note: This definition states that $X$ is m.s. continuous at $t$ if the distance between $X(t)$ and $X(s)$, defined by the second moment of the difference $X(t) - X(s)$, approaches zero as $s \to t$. The process $X$ is said to be m.s. continuous in an interval $I$ if it is m.s. continuous at each $t \in I$. The norm in Eq. 3.48 is the norm in $L_2$.

An $\mathbb{R}^d$-valued stochastic process $X$ is m.s. continuous at a time $t$ if and only if its coordinates $X_i$, $i = 1, \ldots, d$, are m.s. continuous at $t$. ▲

The definition in Eq. 3.48 is not very useful for checking m.s. continuity. We give a simple criterion for assessing whether a process is m.s. continuous.
• A real-valued process $X$ is m.s. continuous at a time $t$ if and only if its correlation function $r(u, v) = E[X(u)\,X(v)]$ is continuous at $u = v = t$, that is, $\lim_{u,v\to t} r(u, v) = r(t, t)$.
• A real-valued weakly stationary process $X$ is m.s. continuous at a time $t$ if and only if $\lim_{\tau\to 0} r(\tau) = r(0)$.
Example 3.37: The Brownian motion B, the compound Poisson process C, and the process X in Eq. 3.33 are m.s. continuous. Also, the processes B and X are sample continuous while C is not. ◊

Proof: The correlation functions of B and C at two times t and s are proportional to t ∧ s. Because t ∧ s is a continuous function, B and C are m.s. continuous. The process X is weakly stationary so that it is sufficient to observe that its correlation function is continuous at the origin (Eq. 3.34).

The differences between the sample properties of B, C, and X show that m.s. continuity is a relatively weak requirement that does not provide much information on the sample continuity of a process. •
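As a numerical illustration of this point, the sketch below estimates E[(X(t + h) − X(t))²] by Monte Carlo for the Brownian motion and for a compound Poisson process; the intensity λ = 2, the standard normal jumps, and the sample sizes are illustrative choices, not from the text. Both second moments tend to zero as h → 0, even though only B is sample continuous:

```python
import numpy as np

rng = np.random.default_rng(0)

def ms_gap_brownian(h, n=200_000):
    # E[(B(t+h) - B(t))^2] = h: the increment is N(0, h)
    return np.mean(rng.normal(0.0, np.sqrt(h), n) ** 2)

def ms_gap_compound_poisson(h, lam=2.0, n=50_000):
    # C(t+h) - C(t) is a sum of N ~ Poisson(lam * h) i.i.d. N(0, 1) jumps,
    # so E[(C(t+h) - C(t))^2] = lam * h * E[Y_1^2] = lam * h
    counts = rng.poisson(lam * h, n)
    incr = np.array([rng.normal(0.0, 1.0, k).sum() for k in counts])
    return np.mean(incr ** 2)

for h in (0.1, 0.01, 0.001):
    print(h, ms_gap_brownian(h), ms_gap_compound_poisson(h))
```

Both estimates shrink proportionally to h, consistent with the m.s. continuity of B and C at every t.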
3.9.2 Differentiation

The derivative Ẋ(t) = dX(t)/dt of the process X in Eq. 3.33 is

Ẋ(t, ω) = Σ_{k=1}^n σ_k ν_k [−A_k(ω) sin(ν_k t) + B_k(ω) cos(ν_k t)]

and can be calculated sample by sample. Generally, such calculations are not possible since analytical expressions are rarely available for the samples of arbitrary stochastic processes.
A real-valued process X in L2 is mean square differentiable at t if

l.i.m._{h→0} (X(t + h) − X(t))/h (3.49)

exists. This limit, denoted by Ẋ(t) or dX(t)/dt, is called the m.s. derivative of X at t.
Note: Because the limit in Eq. 3.49 is not known, it is not possible to calculate the second moment of the difference Ẋ(t) − (X(t + h) − X(t))/h. To prove the existence of Ẋ(t), we need to show that {(X(t + h_n) − X(t))/h_n} is a Cauchy sequence in L2 if h_n → 0 as n → ∞. Because L2 is complete, {(X(t + h_n) − X(t))/h_n} has a limit in L2 denoted by Ẋ(t). Recall that a sequence {Y_n} in L2 is Cauchy if ‖Y_n − Y_m‖₂ → 0 as m, n → ∞.
An ℝ^d-valued process X is m.s. differentiable at t if and only if its coordinates, that is, the real-valued stochastic processes X_i, i = 1, 2, ..., are m.s. differentiable at t. The process X is said to be m.s. differentiable in an interval I if it has an m.s. derivative at each time t ∈ I. ▲

If X is m.s. differentiable, then

d/dt E[X(t)] = E[Ẋ(t)],

∂/∂t E[X(t) X(s)] = E[Ẋ(t) X(s)], and

∂²/(∂t ∂s) E[X(t) X(s)] = E[Ẋ(t) Ẋ(s)], or r_{Ẋ,Ẋ}(t, s) = ∂²r(t, s)/(∂t ∂s). (3.50)
Proof: The formulas in Eq. 3.50 show that differentiation and expectation can be interchanged. The proof of these formulas is based on Eq. 3.45. For example, the derivative of E[X(t)] is given by the limit of E[(X(t + h) − X(t))/h] as h → 0. Because Ẋ exists, the expectation and the limit operations can be interchanged (Eq. 3.45) so that lim_{h→0} E[(X(t + h) − X(t))/h] is equal to E[l.i.m._{h→0} (X(t + h) − X(t))/h] = E[Ẋ(t)].

The existence of derivatives of X of order two and higher requires additional conditions. For example, Ẍ(t) exists and E[Ẍ(t)²] < ∞ if and only if ∂²r_{Ẋ,Ẋ}(u, v)/(∂u ∂v) exists and is finite at u = v = t. •
If X is weakly stationary, then

r_{Ẋ,Ẋ}(τ) = −d²r(τ)/dτ² = −r''(τ) and r_{Ẋ,Ẋ}(τ) = ∫_{−∞}^{∞} ν² e^{√−1 ν τ} dS(ν). (3.51)

Proof: The first formula in Eq. 3.51 follows from the last formula in Eq. 3.50. The second formula in Eq. 3.51 can be obtained by differentiating Eq. 3.35. These results give the following calculation formulas for m.s. derivatives. •

d/dt [a X(t) + b Y(t)] = a dX(t)/dt + b dY(t)/dt, a, b = constants,

d/dt [g(t) X(t)] = (dg(t)/dt) X(t) + g(t) dX(t)/dt, g = a differentiable function,

E[X^{(n)}(t)] = (d^n/dt^n) E[X(t)], so that E[X^{(n)}(t)] = 0, n ≥ 1, if X is weakly stationary,

r'(τ) = (d/dτ) E[X(t + τ) X(t)] = (d/dτ) E[X(t) X(t − τ)]
      = −E[X(t) Ẋ(t − τ)] = −E[X(t + τ) Ẋ(t)].
Proof: We need to show that ‖X(t + h) − X(t)‖₂ converges to zero as h → 0. Because X is m.s. differentiable, the right side of the equation X(t + h) − X(t) = h [(X(t + h) − X(t))/h] converges to zero in m.s. as h → 0, since the difference quotient converges to Ẋ(t) in L2 while h → 0. Hence, an m.s. differentiable process is m.s. continuous. •
3.9.3 Integration

Let {X(t), t ∈ [a, b]} be a real-valued stochastic process and let h : [a, b] → ℝ be a real-valued function, where [a, b] ⊂ ℝ. Our objective is to define the integrals

∫_a^b h(t) dX(t) and ∫_a^b X(t) dh(t) (3.52)

in the mean square sense. These integrals are random variables because their values depend on the particular sample of X used for calculations. If X is a parametric process, the integrals in Eq. 3.52 can be written explicitly. For example, if X is the process in Eq. 3.33 and h is an integrable function on [0, t], the integrals can be evaluated for each sample of X by classical calculus.
Consider a sequence of partitions p_n = (a = t_0^{(n)} < t_1^{(n)} < ··· < t_{m_n}^{(n)} = b) of [a, b] such that Δ(p_n) = max_k (t_k^{(n)} − t_{k−1}^{(n)}) → 0 as n → ∞, intermediate points t_k'(n) ∈ [t_{k−1}^{(n)}, t_k^{(n)}], and the sums

S_{h,X}(p_n) = Σ_{k=1}^{m_n} h(t_k'(n)) (X(t_k^{(n)}) − X(t_{k−1}^{(n)})) and

S_{X,h}(p_n) = Σ_{k=1}^{m_n} X(t_k'(n)) (h(t_k^{(n)}) − h(t_{k−1}^{(n)})). (3.53)
Note: The m.s. limits of the Cauchy sequences S_{h,X} and S_{X,h} define the integrals in Eq. 3.52. These definitions are admissible because the limits of the Cauchy sequences in Eq. 3.53 are in L2 and are independent of the intermediate points t_k'(n) ([157], Theorem 2.16, p. 41). The sequences S_{h,X} and S_{X,h} are similar to the Riemann–Stieltjes sums used to define classical integrals.

The independence of the limits of the Cauchy sequences in Eq. 3.53 of the intermediate points t_k'(n) of the partitions of [a, b] is essential for the definition of the m.s. integrals in Eq. 3.52. If the limits were dependent on t_k'(n), the above definitions would not be admissible. This condition is not always satisfied (Section 4.3).
The definition of the m.s. integrals in Eq. 3.52 can be extended to the integrals ∫_a^b h(t) dX(t) and ∫_a^b X(t) dh(t) depending on an ℝ^d-valued process X. The coordinates ∫_a^b h(t) dX_i(t) and ∫_a^b X_i(t) dh(t), i = 1, ..., d, of these integrals are given by limits of Cauchy sequences as in Eq. 3.53. ▲
Let p = (a = t_0 < t_1 < ··· < t_m = b) and q = (a = s_0 < s_1 < ··· < s_n = b) be partitions of [a, b], and define

• v_h(p) = Σ_{k=1}^m |h(t_k) − h(t_{k−1})| = the variation of h and

• v_r(p, q) = Σ_{k=1}^m Σ_{l=1}^n |r(t_k, s_l) − r(t_k, s_{l−1}) − r(t_{k−1}, s_l) + r(t_{k−1}, s_{l−1})| = the variation of r (3.55)

on [a, b] × [a, b] relative to the partitions p and q.

Note: The variation of r on [a, b] × [a, b] is defined by

v_r(p, q) = Σ_{k=1}^m Σ_{l=1}^n |E[(X(t_k) − X(t_{k−1})) (X(s_l) − X(s_{l−1}))]|

so that

v_h = sup_p v_h(p) = the total variation of h

on [a, b], where the supremum is taken over the set of all partitions p of [a, b], and

v_r = sup_{p,q} v_r(p, q) = the total variation of r on [a, b] × [a, b], (3.57)

where the supremum is calculated over all partitions (p, q) of [a, b] × [a, b].
If X is m.s. differentiable with an m.s. continuous derivative Ẋ on [a, b], then X is of bounded variation in the strong sense on [a, b] since

v_X(p) = Σ_{k=1}^m ‖∫_{t_{k−1}}^{t_k} Ẋ(t) dt‖₂ ≤ Σ_{k=1}^m ∫_{t_{k−1}}^{t_k} ‖Ẋ(t)‖₂ dt ≤ α (b − a),

where α = max_{t∈[a,b]} ‖Ẋ(t)‖₂. The equality and inequality in this equation follow from Eqs. 3.58 and 3.61 in a subsequent section. •
Example 3.41: The Brownian motion B is of bounded variation in the weak sense on any bounded interval [0, τ], 0 < τ < ∞, but is not of bounded variation in the strong sense on [0, τ]. ◊

Proof: Let p = (0 = t_0 < t_1 < ··· < t_m = τ) be a partition of [0, τ]. The variation of the correlation function r(t, s) = E[B(s) B(t)] = s ∧ t on [0, τ] × [0, τ] with respect to p is

v_r(p, p) = Σ_{k,l=1}^m |E[(B(t_k) − B(t_{k−1})) (B(t_l) − B(t_{l−1}))]|
= Σ_{k=1}^m |E[(B(t_k) − B(t_{k−1}))²]| = Σ_{k=1}^m (t_k − t_{k−1}) = τ < ∞.
Example 3.42: The compound Poisson process C in Eq. 3.6 with jumps Y_k ∈ L2 such that E[Y_k] = 0 is of bounded variation in the weak sense on [0, τ] but is not of bounded variation in the strong sense. ◊

Proof: Let p = (0 = t_0 < t_1 < ··· < t_m = τ) be a partition of [0, τ]. The variation of the correlation function of C on [0, τ] × [0, τ] relative to p is finite, while the strong variation satisfies v_C(p_n) → ∞ as n → ∞. The properties E[C(t)] = λ t E[Y_1], Var[C(t)] = λ t E[Y_1²], and E[C(t)²] = (λ t E[Y_1])² + λ t E[Y_1²] of C have been used in the above equations. •
If X is m.s. continuous on [a, b], then the integral ∫_a^t X(s) ds exists, is m.s. differentiable on [a, b], and

(d/dt) ∫_a^t X(s) ds = X(t), t ∈ [a, b]. (3.60)

If X is m.s. continuous on [a, b], then the following integral exists and

‖∫_a^b X(t) dt‖₂ ≤ ∫_a^b ‖X(t)‖₂ dt. (3.61)

Note: The existence of the integral ∫_a^b X(t) dt in the last two properties in Eq. 3.58 follows from the first property in this equation applied for the special case h(t) = t. The continuity of the function t ↦ ‖X(t)‖₂ on [a, b] implies the existence of ∫_a^b ‖X(t)‖₂ dt.
The proof of the statements in Eqs. 3.58, 3.59, 3.60, and 3.61 can be found in [157] (Theorem 2.22, p. 46, Theorem 2.23, p. 50, Theorem 2.24, p. 51, and Theorem 2.25, p. 51, respectively). ▲
If h is continuous on [a, b] and the correlation function r of X has a continuous second mixed partial derivative on [a, b] × [a, b], then the m.s. integrals ∫_a^b h(t) dX(t) exist and

E[(∫_a^b h(t) dX(t))²] = ∫_a^b ∫_a^b h(s) h(t) (∂²r(s, t)/(∂s ∂t)) ds dt, (3.62)

where β = max_{t∈[a,b]} |h(t)| ([157], Theorem 2.29, p. 59). This special case also shows that the integral ∫_a^b h(t) dX(t) exists if h is continuous and X is of bounded variation in the strong sense on [a, b] ([157], Theorem 2.26, p. 52). The proof of the statements in Eqs. 3.62, 3.63, and 3.64 can be found in [157] (Theorem 2.30, p. 60, Theorem 2.27, p. 53, and Theorem 2.31, p. 61, respectively).
We also note that the m.s. integrals in Eqs. 3.58, 3.59, 3.60, 3.62, and 3.64 with X and Z being equal to a Brownian motion process B are defined in the m.s. sense. For example, we have ‖∫_a^b B(t) dt‖₂ ≤ ∫_a^b ‖B(t)‖₂ dt ≤ b^{1/2} (b − a) and E[(∫_a^b B(t) dt)²] = ∫_{[a,b]²} (u ∧ v) du dv. ▲
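The second moment formula can be checked by Monte Carlo for [a, b] = [0, 1], where ∫∫_{[0,1]²} (u ∧ v) du dv = 1/3 (a minimal sketch; the discretization and sample size are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n_steps, n_paths = 200, 10_000
dt = 1.0 / n_steps
# Brownian paths on [0, 1] and a Riemann approximation of the m.s. integral
db = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
paths = np.cumsum(db, axis=1)
integrals = paths.sum(axis=1) * dt

second_moment = np.mean(integrals ** 2)
# exact value of the double integral of u ^ v over [0, 1]^2 is 1/3
print(second_moment)  # close to 1/3
```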
We summarize some properties of the m.s. integrals in Eq. 3.52 that are relevant for calculations. These properties resemble features of the Riemann–Stieltjes integral.

Integration by parts. The m.s. integral ∫_a^b h(t) dX(t) exists if and only if the m.s. integral ∫_a^b X(t) dh(t) exists and

∫_a^b h(t) dX(t) = h(t) X(t) |_a^b − ∫_a^b X(t) dh(t). (3.66)

The m.s. integral is linear. If h, k and X, Y ∈ L2 are such that ∫_a^b h(t) dX(t), ∫_a^b k(t) dX(t), and ∫_a^b h(t) dY(t) exist in m.s., then

• ∫_a^b [α h(t) + β k(t)] dX(t) = α ∫_a^b h(t) dX(t) + β ∫_a^b k(t) dX(t),

• ∫_a^b h(t) d[α X(t) + β Y(t)] = α ∫_a^b h(t) dX(t) + β ∫_a^b h(t) dY(t), (3.68)

where α, β ∈ ℝ. Similar properties hold for the m.s. integral ∫_a^b X(t) dh(t).
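For a smooth h, the integration by parts formula can be verified path by path on a discrete grid, since the Riemann–Stieltjes sums on the two sides of Eq. 3.66 differ only by terms of order Δh ΔB (a sketch with X = B, a Brownian motion, and the illustrative choice h(t) = t²):

```python
import numpy as np

rng = np.random.default_rng(3)
n_steps, n_paths = 400, 2_000
t = np.linspace(0.0, 1.0, n_steps + 1)
dt = t[1] - t[0]
h = t ** 2                                  # smooth function h(t) = t^2
db = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
B = np.hstack([np.zeros((n_paths, 1)), np.cumsum(db, axis=1)])

lhs = (h[:-1] * db).sum(axis=1)             # sum h(t_{k-1}) (B(t_k) - B(t_{k-1}))
rhs = h[-1] * B[:, -1] - (B[:, :-1] * np.diff(h)).sum(axis=1)  # h B|_0^1 - sum B dh
max_err = np.max(np.abs(lhs - rhs))
print(max_err)  # only a small discretization error remains
```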
Note: The proof of the statements in Eqs. 3.66, 3.67, 3.68, and 3.69 can be found in [157] (Theorem 2.17, p. 42, Theorem 2.18, p. 43, Theorem 2.19, p. 44, and Theorem 2.20, p. 44, respectively) and is based on the definition of the m.s. integrals in Eq. 3.52. For example, the expectation of S_{h,X}(p_n) is

E[S_{h,X}(p_n)] = Σ_{k=1}^{m_n} h(t_k'(n)) (E[X(t_k^{(n)})] − E[X(t_{k−1}^{(n)})]).

The left side of this equation converges to E[∫_a^b h(t) dX(t)] because S_{h,X}(p_n) is a Cauchy sequence in L2 approximating ∫_a^b h(t) dX(t) as Δ(p_n) → 0 (Eq. 3.45). The right side of the above equation converges as Δ(p_n) → 0 to the classical Riemann–Stieltjes integral ∫_a^b h(t) dE[X(t)]. ▲
Example 3.43: Let h : [a, b] → ℝ be a differentiable function on [a, b] and let X be an m.s. continuous process. The m.s. integral Y = ∫_a^b X(t) dh(t) is well defined and E[Y²] = ∫_a^b ∫_a^b r(t, s) g(t) g(s) dt ds exists and is bounded, where r denotes the correlation function of X and g(t) = dh(t)/dt. ◊
Example 3.44: Let h and X be as in Example 3.43. The mean and variance of the m.s. integral Y = ∫_a^b X(t) dh(t) are

μ_Y = ∫_a^b μ(t) g(t) dt and σ_Y² = ∫_a^b ∫_a^b c(t, s) g(t) g(s) dt ds,

where μ and c are the mean and covariance functions of X. If X is a Gaussian process, Y is a Gaussian variable with the characteristic function φ_Y(u) = exp(√−1 u μ_Y − u² σ_Y²/2). If X is a Brownian motion process, then μ_Y = 0 and σ_Y² = ∫_a^b ∫_a^b (t ∧ s) g(t) g(s) dt ds. ◊
Proof: The properties in Eq. 3.58 can be used to confirm the above statements. We present here a direct proof. The sum S_{gX,1}(p_n) = Σ_{k=1}^{m_n} g(t_k(n)) X(t_k(n)) (t_k^{(n)} − t_{k−1}^{(n)}) is a Cauchy sequence in L2 with expectation μ_{Y,n} = Σ_{k=1}^{m_n} g(t_k(n)) μ(t_k(n)) (t_k^{(n)} − t_{k−1}^{(n)}) that converges to μ_Y as Δ(p_n) → 0. Similar considerations show that the expectation of the square of the centered sums S_{gX,1}(p_n) − μ_{Y,n} converges to σ_Y² as Δ(p_n) → 0. If X is Gaussian, then Y is a Gaussian variable with the specified characteristic function. •
where g(t) = dh(t)/dt. The process Y(t), t ≥ 0, can represent the output of a linear dynamic system with transfer function g subjected to an input X (Section 7.2). If X is Gaussian, Y is a Gaussian process so that the above moments define all finite dimensional distributions of Y. ◊

Proof: The existence of the process Y and the expressions of its first two moments result from Eq. 3.58 since Y(t) = ∫_0^t X(s) dh(s) and h is differentiable so that it is of bounded variation on compact sets.

The process Y is m.s. differentiable since g X is m.s. continuous (Eq. 3.60) so that Y is also m.s. continuous. Note that Y satisfies the differential equation Ẏ(t) = g(t) X(t), t ≥ 0, with the initial condition Y(0) = 0. The right side of this equation is m.s. continuous so that its left side must also be m.s. continuous. Hence, ∫_0^t Ẏ(s) ds = Y(t) − Y(0) by Eq. 3.61, which gives Y(t) = ∫_0^t g(s) X(s) ds, that is, the definition of Y.
We give the definition of a compact set here for convenience. Consider a topological space (Section 2.2.2) and a subset K of this space. The set K is compact if every open cover of K has a finite subcover, that is, if any collection D_α, α ∈ I, of open sets such that K ⊂ ∪_{α∈I} D_α has a finite subcollection D_{α_1}, ..., D_{α_n} with the property K ⊂ ∪_{k=1}^n D_{α_k}. It can be shown that a set K ⊂ ℝ is compact if and only if K is closed and bounded ([26], Theorem 13.6, p. 95). Hence, the intervals [a, ∞) and [a, b) of the real line are not compact sets, but [a, b] is a compact set. The same conclusion follows from the definition. For example, [a, ∞) is not a compact set since it is not possible to extract from the open cover (a − 1/n, n), n = 1, 2, ..., of this interval a finite cover. •
Example 3.46: Let X be the real-valued stochastic process in Eq. 3.33 with covariance and spectral density functions given by Eqs. 3.34 and 3.40, respectively. An alternative representation of X and its covariance and spectral density functions is

X(t) = Σ_{p=−n}^n C_p e^{√−1 ν_p t}, c(τ) = Σ_{p=−n}^n (σ_p²/2) e^{√−1 ν_p τ}, and

s(ν) = Σ_{p=−n}^n (σ_p²/4) [δ(ν − ν_p) + δ(ν + ν_p)], (3.70)

where ν_{−k} = −ν_k, σ_{−k} = σ_k, k = 1, ..., n, the random coefficients C_p are complex-valued with means E[C_p] = 0 and correlations E[C_p C_q*] = (σ_p²/2) δ_pq, C_0 = 0, and σ_0 = 0. ◊
Note: The representation in Eq. 3.70 results from Eqs. 3.33, 3.34, and 3.40 and the relationships cos(u) = [e^{√−1 u} + e^{−√−1 u}]/2, sin(u) = [e^{√−1 u} − e^{−√−1 u}]/(2 √−1). The coefficients of the harmonics of X are C_p = σ_p [A_p − √−1 B_p]/2, where A_{−p} = A_p and B_{−p} = −B_p. The second moment properties of C_p are E[C_p] = 0 and E[C_p C_q*] = (σ_p σ_q/4) E[(A_p − √−1 B_p)(A_q + √−1 B_q)] = (σ_p²/2) δ_pq, where A_{−k} = A_k and B_{−k} = −B_k for k = 1, ..., n. ▲
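The second moment properties of the coefficients C_p can be checked by simulation. The sketch below uses ±1-valued A_p and B_p, which is sufficient here because only second moments enter the calculation; the values of σ_p and the sample size are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n_samples = 100_000
sigma = np.array([1.0, 0.5])                    # sigma_1, sigma_2 (illustrative)
A = rng.choice([-1.0, 1.0], (n_samples, 2))     # E[A_p] = 0, E[A_p A_q] = delta_pq
B = rng.choice([-1.0, 1.0], (n_samples, 2))     # same second moments, independent of A
C = sigma * (A - 1j * B) / 2.0                  # C_p = sigma_p (A_p - i B_p) / 2

# E[C_p C_q*] should be (sigma_p^2 / 2) delta_pq
emp = np.einsum('np,nq->pq', C, C.conj()) / n_samples
print(np.round(emp, 3))
```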
Note: This theorem shows that a complex-valued weakly stationary stochastic process can be constructed as a superposition of orthogonal harmonics e^{√−1 ν t} dZ(ν) with frequency ν and random amplitude and phase defined by the increments dZ(ν) of a process Z, called the spectral process associated with X ([45], Section 7.5). The representation in Eq. 3.71 is the continuous version of Eq. 3.70.

Since X is m.s. continuous, its correlation function is continuous so that the Bochner theorem in Eq. 3.35 can be applied and gives E[X(t + τ) X(t)] = ∫_{−∞}^∞ e^{√−1 ν τ} dS(ν).

The m.s. integral in Eq. 3.71 generalizes the m.s. integral defined by Eq. 3.52 because both its integrand and integrator are complex-valued. The definition, properties, and calculation formulas established for the m.s. integral in Eq. 3.52 can be extended to this type of integral by applying results of the previous section to the real and imaginary parts of the integral in Eq. 3.71. These considerations show, for example, that the integral in Eq. 3.71 exists in the mean square sense because its integrand e^{√−1 ν t} is a continuous function and the integrator Z is of bounded variation in the weak sense. The variation of the correlation function of Z in a bounded interval [−a, a] × [−a, a], a > 0, for a partition p = (−a = ν_0 < ν_1 < ··· < ν_m = a) of this interval is (Eq. 3.55)

v_r(p, p) = Σ_{k,l=1}^m |E[(Z(ν_k) − Z(ν_{k−1})) (Z(ν_l) − Z(ν_{l−1}))*]|
= Σ_{k=1}^m E[|Z(ν_k) − Z(ν_{k−1})|²] = Σ_{k=1}^m (S(ν_k) − S(ν_{k−1})) = S(a) − S(−a) (3.73)

so that Z is of bounded variation in the weak sense on [−a, a] for any a > 0. The process Z is also of bounded variation in the weak sense on ℝ because lim_{a→∞} [S(a) − S(−a)] = S(∞) < ∞. ▲
Note: The properties of the processes U, V are similar to the properties of the coefficients A_k, B_k in Eq. 3.33. The moments of U and V in Eq. 3.75 can be obtained from Eqs. 3.71 and 3.72 under the assumption that X is real-valued ([45], Section 7.6). ▲
Example 3.47: The correlation functions of the processes in Eqs. 3.71 and 3.74 follow from the second moment properties of the spectral processes Z and (U, V).

For example, the process Y(t) = Σ_{k=1}^n c_k X(t + t_k) has the spectral representation

Y(t) = ∫_{−∞}^∞ [Σ_{k=1}^n c_k e^{√−1 ν t_k}] e^{√−1 ν t} dZ(ν) = ∫_{−∞}^∞ h(ν) e^{√−1 ν t} dZ(ν),

where h(ν) = Σ_{k=1}^n c_k e^{√−1 ν t_k}.
Example 3.50: Let X be the process in Eq. 3.71 and define the process X̂ by X̂(t) = ∫_ℝ h(ν) e^{√−1 ν t} dZ(ν); X̂ is real-valued with the notation in Eqs. 3.74 and 3.75. ◊

Note: The definition of X̂ shows that this process can be obtained from X by a linear operation with the gain h(ν) = √−1, 0, and −√−1 for ν < 0, ν = 0, and ν > 0, respectively ([45], p. 142). The expression of the gain results from Eq. 3.78 and the definition of X̂. ▲
Proof: We have shown that the correlation and spectral density functions of the coordinates of X are Fourier pairs (Eq. 3.41). An alternative form of the correlation function in Eq. 3.41 is

r_{i,j}(τ) = ∫_ℝ e^{√−1 ν τ} s_{i,j}(ν) dν = ∫_0^∞ [(s_{i,j}(ν) + s_{i,j}(−ν)) cos(ν τ) + √−1 (s_{i,j}(ν) − s_{i,j}(−ν)) sin(ν τ)] dν,

so that

r_{i,j}(τ) = E[X_i(t + τ) X_j(t)] = ∫_0^∞ [g_{i,j}(ν) cos(ν τ) − h_{i,j}(ν) sin(ν τ)] dν.

The above expressions of r_{i,j} yield the definitions of the second moment properties of U and V in Eq. 3.81. •
Example 3.51: Let X be the ℝ²-valued process in Example 3.33 with coordinates X_i(t) = √(1 − ρ) W_i(t) + √ρ W(t), where (W_1, W_2, W) are zero-mean, weakly stationary processes that are mutually uncorrelated and ρ ∈ (0, 1). The process X = (X_1, X_2) has the spectral representation
(Ω, F, P) for each t ∈ D, where D ⊂ ℝ^{d'} and d, d' ≥ 1 are integers. We limit our discussion to real-valued random fields (d = 1) and denote these fields by X or {X(t), t ∈ D}.

The f.d.d. of order n of X is the probability of ∩_{i=1}^n {X(t_i) ≤ x_i}, where t_i ∈ D and x_i ∈ ℝ. The random field X is said to be strictly stationary/homogeneous or just stationary/homogeneous if its finite dimensional distributions are invariant under a space shift, that is, if the points t_i, i = 1, ..., n, are mapped into t_i + τ ∈ D, where τ ∈ ℝ^{d'} and the integers n ≥ 1 are arbitrary. The marginal and the finite dimensional distributions of order two of X are space invariant and depend only on the "space lag" (t_1 − t_2), respectively.
Suppose that X ∈ L2(Ω, F, P) and define by

μ(t) = E[X(t)], r(t, s) = E[X(t) X(s)], and c(t, s) = r(t, s) − μ(t) μ(s)

the mean, correlation, and covariance functions, respectively. The pair (μ, r) or (μ, c) defines the second moment properties of X. The properties of these functions result from Section 3.7.1. If X is stationary, then μ(t) = μ is constant and r(t, s), c(t, s) depend only on (t − s). If X is a weakly stationary/homogeneous field with spectral distribution S and spectral density s, then X has the spectral representation (Eq. 3.71)

X(t) = ∫_{ℝ^{d'}} e^{√−1 ν·t} dZ(ν), (3.85)

where the random field Z is such that E[Z(ν)] = 0 and E[dZ(ν) dZ(ν')*] = δ(ν − ν') dS(ν) = δ(ν − ν') s(ν) dν. If X is a real-valued field, then (Eq. 3.84) it admits a representation in terms of real harmonics; for example, consider the field

X(t) = Σ_{k=1}^n σ_k [A_k cos(ν_k · t) + B_k sin(ν_k · t)],
where E[A_k] = E[B_k] = 0, E[A_k B_l] = 0, E[A_k A_l] = E[B_k B_l] = δ_kl, σ_k > 0, and ν_k ∈ ℝ² for all k, l = 1, ..., n. The field has mean zero. The correlation and spectral density functions of X are

r(τ) = E[X(t + τ) X(t)] = Σ_{k=1}^n σ_k² cos(ν_k · τ) and

s(ν) = Σ_{k=1}^n (σ_k²/2) [δ(ν − ν_k) + δ(ν + ν_k)]. (3.88)
q = 0, 1, ..., in the (t_1, t_2)-space, so that the wave associated with the frequency ν_k has length 2π/√(ν_{k,1}² + ν_{k,2}²) and evolves in a direction perpendicular to the zero lines at an angle θ_k = tan^{−1}(ν_{k,2}/ν_{k,1}) relative to the coordinate t_1. •
The correlation function of X is symmetric, that is, r(t, s) = r(s, t)* (Example 3.29), and satisfies the condition ∫_a^b ∫_a^b |r(t, s)|² dt ds < ∞ since X is in L2. Consider the integral equation

∫_a^b r(t, s) φ(s) ds = λ φ(t), t ∈ [a, b]. (3.89)

Among the properties of the eigenvalues λ_k and eigenfunctions φ_k of this equation are: (3) each nonzero eigenvalue corresponds to at most a finite number of linearly independent eigenfunctions, (4) the eigenfunctions φ_k corresponding to distinct eigenvalues are orthogonal, that is, ∫_a^b φ_k(t) φ_l(t)* dt = 0 for k ≠ l and λ_k ≠ λ_l, and (5) if the correlation function r of X is square integrable and continuous in [a, b] × [a, b], the Mercer theorem holds, that is, r(t, s) = Σ_{k=1}^∞ λ_k φ_k(t) φ_k(s)*, and this series representation converges absolutely and uniformly in [a, b] × [a, b] ([1], Section 3.3, [49], Appendix 2, [98], Section 6.2).
X(t) = Σ_{k=1}^∞ σ_k X_k φ_k(t), X_k = (1/σ_k) ∫_a^b X(t) φ_k(t)* dt, (3.90)

where σ_k = λ_k^{1/2}.

Note: We only show that X_k in Eq. 3.90 has the stated properties. The mean of X_k is zero since E[X(t)] = 0. The correlation of these random variables is

E[X_k X_l*] = (1/(σ_k σ_l)) ∫_a^b [∫_a^b r(t, s) φ_l(s) ds] φ_k(t)* dt = (σ_l/σ_k) ∫_a^b φ_l(t) φ_k(t)* dt = δ_kl.
Example 3.53: Let {X(t), t ∈ [−Δ, Δ]}, Δ > 0, be a real-valued stochastic process with mean zero and correlation function r(t, s) = (1/4) e^{−2 α |τ|}, where τ = t − s denotes the time lag and α > 0. The Karhunen–Loeve representation of X in [−Δ, Δ] is

X(t) = Σ_{k=1}^∞ [√λ_k X_k φ_k(t) + √λ̂_k X̂_k φ̂_k(t)],

where

λ_k = 1/(4 α (1 + b_k²)), φ_k(t) = cos(2 α b_k t)/√(Δ + sin(4 α b_k Δ)/(4 α b_k)), and

λ̂_k = 1/(4 α (1 + b̂_k²)), φ̂_k(t) = sin(2 α b̂_k t)/√(Δ − sin(4 α b̂_k Δ)/(4 α b̂_k)).
Note: The eigenvalues and eigenfunctions in the representation of X are the solutions of the integral equation ∫_{−Δ}^{Δ} e^{−2 α |t−s|} φ(s) ds = 4 λ φ(t) defined by Eq. 3.89 ([49], Example 6-4.1, p. 99). ▲
Example 3.54: Let X be a real-valued process defined in [0, 1] with mean zero and correlation function r(t, s) = t ∧ s. The Karhunen–Loeve representation of this process results from the integral equation

∫_0^t s φ(s) ds + t ∫_t^1 φ(s) ds = λ φ(t), t ∈ (0, 1),

which gives ∫_t^1 φ(s) ds = λ φ'(t) by differentiation with respect to t. We can perform this operation since the left side of the above equation is differentiable so that φ on its right side is also differentiable. Differentiating one more time with respect to time, we have λ φ''(t) + φ(t) = 0 in (0, 1) with the boundary conditions φ(0) = 0 and φ'(1) = 0, which result from Eq. 3.89 with r(t, s) = t ∧ s and the equation ∫_t^1 φ(s) ds = λ φ'(t), respectively. The solution of the above differential equation gives the eigenvalues and eigenfunctions in the representation of X.
We note that the Brownian motion and the compound Poisson processes have correlation functions similar to the correlation function of the process X in this example. Hence, the above Karhunen–Loeve representation can be used to characterize these two processes. This observation also demonstrates a notable limitation of the Karhunen–Loeve representation: it cannot distinguish between processes that are equal in the second moment sense. •
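For this example the boundary value problem can be solved in closed form: λ φ'' + φ = 0 with φ(0) = 0 and φ'(1) = 0 gives φ_k(t) = √2 sin((k − 1/2) π t) and λ_k = 1/((k − 1/2)² π²). The sketch below checks the Mercer representation r(t, s) = Σ_k λ_k φ_k(t) φ_k(s) against t ∧ s numerically (the grid size and truncation level are illustrative):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 201)

def kl_pair(k):
    # Solving lam * phi'' + phi = 0, phi(0) = 0, phi'(1) = 0 gives
    # phi_k(t) = sqrt(2) sin((k - 1/2) pi t), lam_k = 1 / ((k - 1/2) pi)^2
    w = (k - 0.5) * np.pi
    return 1.0 / w ** 2, np.sqrt(2.0) * np.sin(w * t)

mercer = np.zeros((t.size, t.size))
for k in range(1, 200):
    lam, phi = kl_pair(k)
    mercer += lam * np.outer(phi, phi)   # partial sum of sum_k lam_k phi_k(t) phi_k(s)

err = np.max(np.abs(mercer - np.minimum.outer(t, t)))
print(err)  # the Mercer partial sum approaches r(t, s) = t ^ s
```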
Example 3.55: Let {W(t), t ∈ [a, b]} be a white noise defined formally as a process with mean zero and correlation function E[W(t) W(s)] = γ δ(t − s), where t, s ∈ [a, b], γ > 0 is interpreted as the noise intensity, and δ(·) denotes the Dirac delta function. Then W(t) = √γ Σ_{k=1}^∞ W_k φ_k(t) is the Karhunen–Loeve representation of W, where W_k are random variables with E[W_k] = 0 and E[W_k W_l] = δ_kl and {φ_k} is a collection of orthonormal functions spanning the class of square integrable functions defined in [a, b]. ◊

Proof: Eq. 3.89 gives γ φ(t) = λ φ(t) so that all eigenvalues are equal to γ and the eigenfunctions are arbitrary provided that they are orthonormal and span the space of square integrable functions. The first two moments of the random variables {W_k} result from the definition of X_k in Eq. 3.90. These calculations are formal since the process W does not exist in the second moment sense. However, this representation of the white noise is frequently used in engineering applications [175]. •
Example 3.56: Suppose that a weakly stationary process X with mean zero and unknown correlation function E[X(t) X(s)] is observed through an imperfect device. The observation equation is Y(t) = X(t) + W(t), 0 ≤ t ≤ τ, where W is the white noise process in the previous example. The processes X and W are not correlated. It is assumed that the observed process Y has mean zero and a known correlation function r_Y(t − s) = E[Y(t) Y(s)], t, s ∈ [0, τ]. Then the best m.s. estimator of X is

X̂_opt(t) = Σ_{k=1}^∞ (1/(1 + γ/λ_k)) Y_k φ_k(t), t ∈ [0, τ],

where γ/λ_k is the noise to signal ratio, (λ_k, φ_k) are the eigenvalues and eigenfunctions given by Eq. 3.89 with r(t, s) = r_Y(t − s), and the properties of the random variables Y_k result from Eq. 3.90 written for the process Y. If the noise intensity γ decreases to zero, the optimal estimator X̂_opt has the same Karhunen–Loeve representation as X. ◊
where Y_k = √λ_k X_k + √γ W_k (Eq. 3.90 and Example 3.55). The above equation shows that the Karhunen–Loeve representations of X and Y involve the same eigenfunctions φ_k and these functions can be obtained from Eq. 3.89 with r(t, s) = r_Y(t − s). The Karhunen–Loeve representation of Y can be calculated since this process is assumed to have known second moment properties.

Consider the estimator X̂(t) = Σ_{k=1}^∞ h_k Y_k φ_k(t), t ∈ [0, τ], of X, where the coefficients h_k need to be determined. The m.s. error, e = E[∫_0^τ (X̂(t) − X(t))² dt], of X̂ is equal to Σ_{k=1}^∞ E[|h_k Y_k − √λ_k X_k|²] so that it is minimized for h_k = λ_k/(λ_k + γ).

The expression of X̂ with h_k = λ_k/(λ_k + γ) is the stated optimal estimator. In some of the above operations we interchange summations with integrals and expectations. It can be shown that these operations are valid ([49], Appendix 2). •
Let N(D; t) or N(D; I) denote the number of times X exits D during a time interval I = [0, t]. Then

P_f(t) ≤ P(X(0) ∉ D) + E[N(D; t)]. (3.91)

Proof: We have P_f(t) = P({N(D; t) > 0} ∪ {X(0) ∉ D}) so that P_f(t) ≤ P(N(D; t) > 0) + P(X(0) ∉ D). Also, P(N(D; t) > 0) ≤ E[N(D; t)] since N(D; t) takes positive integer values on {N(D; t) > 0}.

Numerical results show that the upper bound on P_f(t) in Eq. 3.91 is relatively tight for highly reliable systems, that is, systems characterized by very small values of P_f(t) and E[N(D; t)] ([175], Chapter 7). If E[N(D; t)] is not finite or is finite but larger than 1, the upper bound in Eq. 3.91 provides no information on P_f(t). •
The following two sections give conditions under which E[N(D; t)] is finite and formulas for calculating E[N(D; t)] and the density of T.

Proof: We first note that the mean rate at which X crosses with negative slope a level x ∈ ℝ at a time t, that is, the mean x-downcrossing rate of X at t, is

λ(x)− = E[Ẋ(t)− | X(t) = x] f_{X(t)}(x).

If X is not a stationary process, its mean x-upcrossing and x-downcrossing rates will depend on time and will be denoted by λ(x; t)+ and λ(x; t)−, respectively.

The event X(0) < x < X(h) that X has an x-upcrossing in [0, h] can be written as {X(0) < x < X(0) + h Z_h} = {X(0) < x} ∩ {Z_h > (x − X(0))/h} so that
f(x; t) = (1/σ̂(t)) φ((x − μ̂(t))/σ̂(t)), where

μ̂(t) = μ̇(t) + [c_{,t}(t, s)|_{t=s}/c(t, t)] (x − μ(t)), c_{,t}(t, s) = ∂c(t, s)/∂t,

σ̂(t)² = c_{,ts}(t, s)|_{t=s} − (c_{,t}(t, s)|_{t=s})²/c(t, t), and c_{,ts}(t, s) = ∂²c(t, s)/(∂t ∂s).
Figure 3.7 shows the variation in time of the mean x-upcrossing rate λ(x; t)+ for a non-stationary Gaussian process X(t) = Y(t) | (Y(0) = ·), with x = 3.0 and t ∈ [0, 5]; the figure plots curves for four values (0.0, 1.0, 2.0, 2.5) of the conditioning parameter.
Proof: The ℝ²-valued Gaussian variable (X(t), Ẋ(t)) has mean μ = (μ(t), μ̇(t)) and covariance matrix c with entries c_11 = 1 − ρ(t)², c_12 = c_21 = ρ(t) ρ'(t), and c_22 = −ρ''(0) − ρ'(t)². Hence, Ẋ(t) | (X(t) = x) is a Gaussian random variable with mean and standard deviation μ̂(t) and σ̂(t) (Section 2.11.5).

The mean x-upcrossing rate of Y is λ(x)+ = (σ/√(2π)) φ(x) since Y(t) and Ẏ(t) are independent Gaussian variables so that E[Ẏ(t)+ | Y(t)] = E[Ẏ(t)+] and E[Ẏ(t)+] = σ/√(2π), where σ denotes the standard deviation of Ẏ(t). •
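This x-upcrossing rate formula for a unit-variance stationary Gaussian process can be checked by counting level crossings in simulated samples. The sketch below builds the process as a finite superposition of harmonics with Gaussian coefficients; the frequencies, weights, time step, and sample counts are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(7)
n_modes, x_level = 20, 1.0
nu = rng.uniform(0.5, 2.0, n_modes)               # frequencies (illustrative)
sig = np.full(n_modes, 1.0 / np.sqrt(n_modes))    # weights giving unit variance
t = np.arange(0.0, 50.0, 0.01)
cos_t, sin_t = np.cos(np.outer(t, nu)), np.sin(np.outer(t, nu))

count, total_time = 0, 0.0
for _ in range(200):
    A, B = rng.normal(size=n_modes), rng.normal(size=n_modes)
    X = cos_t @ (sig * A) + sin_t @ (sig * B)     # stationary Gaussian sample
    count += np.sum((X[:-1] < x_level) & (X[1:] >= x_level))
    total_time += t[-1]

rate_mc = count / total_time
sigma_dot = np.sqrt(np.sum(sig ** 2 * nu ** 2))   # standard deviation of dX/dt
rate_rice = sigma_dot / (2.0 * np.pi) * np.exp(-x_level ** 2 / 2.0)
print(rate_mc, rate_rice)
```

The empirical rate agrees with λ(x)+ = (σ_Ẋ/√(2π)) φ(x) within Monte Carlo error.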
168 Chapter 3. Stochastic Processes
Durbin formula. If c(·, ·) has continuous first order partial derivatives and lim_{s↑t} Var[X(t) − X(s)]/(t − s) < ∞, then

f_T(t) = E[1(X(s) < x, 0 ≤ s < t) Ẋ(t)+ | X(t) = x] f_{X(t)}(x). (3.94)

Note: The density f_T of T is equal to the mean x-upcrossing rate of the samples of X that have not left D = (−∞, x) in [0, t]. Because 1(X(s) < x, 0 ≤ s < t) is zero on the subset of the sample space corresponding to samples of X that upcross x at least once in [0, t], we have f_T(t) ≤ λ(x; t)+. Details on Durbin's formula in Eq. 3.94, extensions of this formula, and computer codes for calculating the density of T can be found in [56, 158, 159]. ▲
Figure 3.8. Exact and approximate probabilities P(T > t) for F(x) = Φ(x) and x = 1.5, t ∈ [0, 100]
3.11 Martingales
We have examined in Section 2.18 discrete time martingales. Here we consider continuous time stochastic processes that are martingales.
Note: The last condition in Eq. 3.95 can be replaced with E[X(t) | F_s] = X(s ∧ t), t, s ≥ 0, since E[X(t) | F_s] = X(t) = X(t ∧ s) for t < s and E[X(t) | F_s] = X(s) for t ≥ s.
If the equality E[X(t) | F_s] = X(s), s ≤ t, is replaced by ≥ and ≤, X is said to be an F_t-submartingale and F_t-supermartingale, respectively.

If X is in L2(Ω, F, P) and satisfies the last two conditions in Eq. 3.95, then X is said to be an F_t-square integrable martingale. Similarly, a submartingale/supermartingale X ∈ L2(Ω, F, P) is called a square integrable submartingale/supermartingale. If the first condition in Eq. 3.95 is replaced by E[|X(t)|^p] < ∞, p > 1, then X is said to be a p-integrable martingale. ▲
We have seen in Section 2.18 that the martingale, submartingale, and supermartingale can be viewed as models for fair, super-fair, and unfair games by interpreting t ≥ s, s, E[X(t) | F_s], and X(s) as future time, present time, average future fortune, and current fortune, respectively. For example, the average future fortune of a player E[X(t) | F_s] is equal to his or her current fortune X(s) for martingales but is larger/smaller than X(s) for submartingales/supermartingales. The conditions that a stochastic process X must satisfy to be a martingale, submartingale, and supermartingale add structure to this process, as we will see in this section.
Example 3.59: Consider a filtered probability space (Ω, F, (F_t)_{t≥0}, P) and a random variable Y ∈ F such that E[|Y|] < ∞. The process X(t) = E[Y | F_t], t ≥ 0, is an F_t-martingale. ◊

Proof: We have E[|X(t)|] = E[|E(Y | F_t)|] ≤ E[E[|Y| | F_t]] = E[|Y|] < ∞ by the Jensen inequality for conditional expectation (Section 2.17.2) and hypothesis. X is F_t-adapted because E[Y | F_t] is F_t-measurable for each t ≥ 0. Properties of the conditional expectation give E[X(t) | F_s] = E[E(Y | F_t) | F_s] = E[Y | F_s] = X(s) for all s ≤ t. Hence, X is a martingale, referred to as the martingale closed by the random variable Y. •
Proof: We have E[|B(t)|] ≤ (E[B(t)²])^{1/2} = t^{1/2} by the Cauchy–Schwarz inequality and properties of B. Since F_t = σ(B(s) : 0 ≤ s ≤ t) for all t ≥ 0, B is F_t-adapted. For t ≥ s we have

E[B(t) | F_s] = E[(B(t) − B(s)) + B(s) | F_s] = E[B(t) − B(s) | F_s] + E[B(s) | F_s] = B(s)

since the conditional expectation is a linear operator, the increment B(t) − B(s) is independent of B(u), u ≤ s, so that E[B(t) − B(s) | F_s] = E[B(t) − B(s)], B has mean zero, and B(s) is F_s-measurable. Hence, B is a martingale. •
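The martingale property of B can be probed by simulation: E[B(t) | F_s] = B(s) implies E[(B(t) − B(s)) g(B(s))] = 0 for functions g of the past (a minimal sketch; the times s, t and the test functions are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(8)
n, s, t = 400_000, 0.7, 1.3
Bs = rng.normal(0.0, np.sqrt(s), n)              # B(s)
Bt = Bs + rng.normal(0.0, np.sqrt(t - s), n)     # B(t) = B(s) + independent increment

# E[B(t) | F_s] = B(s) implies E[(B(t) - B(s)) g(B(s))] = 0
# for integrable functions g of the past
for g in (np.sin, np.tanh, np.square):
    print(np.mean((Bt - Bs) * g(Bs)))
```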
Example 3.61: The compound Poisson process C in Eq. 3.6 can be an F_t-martingale, submartingale, or supermartingale depending on the sign of E[Y_1], where F_t denotes the natural filtration of C and Y_1 is assumed to be in L1. ◊

Proof: The mean of the absolute value of C(t) is smaller than E[Σ_{k=1}^{N(t)} |Y_k|] = λ t E[|Y_1|], which is finite because E[|Y_1|] < ∞ by hypothesis. The process is F_t-adapted. For t ≥ s we have

E[C(t) | F_s] = C(s) + λ (t − s) E[Y_1],

since the increment C(t) − C(s) is independent of F_s with mean λ (t − s) E[Y_1]. Hence, C is a martingale, submartingale, or supermartingale according as E[Y_1] = 0, E[Y_1] > 0, or E[Y_1] < 0. •
Example 3.62: Let {B_i(ν), 0 ≤ ν ≤ ν̄}, i = 1, 2, be two independent Brownian motions, 0 < ν_1 < ν_2 < ··· < ν_n < ν_{n+1} = ν̄ < ∞ be some frequencies, and σ_k > 0, k = 1, ..., n, denote arbitrary numbers. Then

X(t) = Σ_{k=1}^n (σ_k/√ν_k) [B_1(ν_k) ΔB_1(ν_k) cos(ν_k t) + B_2(ν_k) ΔB_2(ν_k) sin(ν_k t)] (3.96)

is a weakly stationary process with one-sided spectral density function g(ν) = Σ_{k=1}^n σ_k² δ(ν − ν_k), where ΔB_i(ν_k) = B_i(ν_{k+1}) − B_i(ν_k), i = 1, 2. The definition of X constitutes a discrete version of Eq. 3.74.

Figure 3.9 shows a sample of X scaled to have unit variance for n = 100, Δν = ν_{k+1} − ν_k = 1/10, σ_k² = g(ν_k) Δν, and g(ν) = (2/π)/(ν² + 1). The estimated value of the coefficient of kurtosis of X is 2.18, indicating that X is not a Gaussian process. ◊
Proof: The mean and variance of the increments B_i(ν_k) ΔB_i(ν_k) are zero and ν_k Δν_k, respectively. The correlation of these increments for k < l is zero

since B_i(ν_k), ΔB_i(ν_k), and B_i(ν_l) are F_l-measurable and E[ΔB_i(ν_l) | F_l] = 0. Hence, the increments B_i(ν) ΔB_i(ν) have the same second moment properties as the increments of U and V in the spectral representation of a real-valued process (Eq. 3.74). •
Example 3.63: Let N be a Poisson process with intensity λ > 0. The random variables T_n = inf{t ≥ 0 : N(t) = n}, n = 1, 2, ..., define an increasing sequence of stopping times such that T_n → ∞ a.s. as n → ∞. The processes N^{T_n}, n = 1, 2, ..., are F_t-submartingales so that N is an F_t-local submartingale with the localizing sequence T_n. Note also that N is an F_t-submartingale. ⋄
Proof: The random variable T_n is an F_t-stopping time since the events {T_n ≤ t} and {N(t) ≥ n} coincide and {N(t) ≥ n} ∈ F_t. We also have T_n = Σ_{i=1}^n Z_i, where Z_i are independent exponential random variables with mean 1/λ so that T_{n+1} = T_n + Z_{n+1} ≥ T_n a.s. for all n ≥ 1 and T_n → ∞ a.s. as n → ∞ since E[Z_i] = 1/λ > 0 (Section 2.14).
We will see in Section 3.11.2 that stopped right continuous martingales, submartingales, and supermartingales are martingales, submartingales, and supermartingales, respectively. Hence, N^{T_n} is an F_t-submartingale for each n ≥ 1. •
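The representation T_n = Z_1 + ... + Z_n with iid exponential interarrival times used in the proof gives a direct way to simulate N. A minimal sketch (Python with NumPy, not from the book; λ, t, and the sample size are illustrative choices):

```python
import numpy as np

# Simulate the arrival times T_n as partial sums of iid exponential(1/lam)
# interarrival times; N(t) = #{n : T_n <= t} then has Poisson(lam*t) statistics.
rng = np.random.default_rng(1)
lam, t, nsim = 2.0, 5.0, 20000

Z = rng.exponential(1.0 / lam, size=(nsim, 100))   # interarrival times Z_i
T = np.cumsum(Z, axis=1)                           # stopping times T_1 < T_2 < ...
N_t = (T <= t).sum(axis=1)                         # N(t) for each simulated path

print(N_t.mean(), N_t.var())  # both ~ lam * t = 10
```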
Example 3.64: Let {B(t), t ≥ 0} be a Brownian motion in ℝ^d such that B(0) = x ∈ ℝ^d and let S(x, n) = {y ∈ ℝ^d : ‖y − x‖ ≤ n}, n = 1, 2, ..., be a sphere of radius n centered at x = (x_1, ..., x_d). The coordinates B_i of B are independent real-valued Brownian motions starting at x_i rather than zero. The random variables T_n = inf{t ≥ 0 : B(t) ∉ S(x, n)} define an increasing sequence of stopping times with respect to the natural filtration of B and T_n → ∞ a.s. as n → ∞. ⋄
Proof: The random variables T_n satisfy the condition T_n ≤ T_{n+1} and are stopping times since we can tell whether T_n exceeds an arbitrary time t ≥ 0 from properties of B up to this time. Since the Brownian motions B_i have continuous samples, they cannot reach infinity in a finite time so that T_n → ∞ a.s. as n → ∞. •
3.11. Martingales 173
3.11.1 Properties
Many of the properties of the martingales considered here are counterparts
of the properties of discrete time martingales in Section 2.18.1.
since X(v) − X(u) is F_v-measurable, F_v ⊂ F_s, and E[X(t) − X(s) | F_s] = 0 by the martingale property. The above expectations are finite since X(t) ∈ L_2, t ≥ 0. •
Proof: We have E[|X(t)|] < ∞ because X is a martingale. The random variable X(t) is F_t^X-measurable by the definition of the natural filtration. For t ≥ s we have
E[X(t) | F_s^X] = E{E[X(t) | F_s] | F_s^X} = E[X(s) | F_s^X] = X(s)
1. Predictable and optional processes are measurable because the σ-fields P and O are included in B([0, ∞)) × F, and
2. P ⊆ O.
We only show that P is included in O. Let X be an F_t-adapted process with left continuous paths and define X^(n)(t) = X(k/2^n) for t ∈ [k/2^n, (k + 1)/2^n) (Fig. 3.10). The process X^(n) is F_t-adapted, has right continuous samples, and approaches X as n → ∞.
[Figure 3.10. Sample of a predictable process X(·, ω); horizontal axis: time.]
for every non-negative right continuous F_t-martingale Y, every t ≥ 0, and F_t-stopping time T ([61], Theorem 5.1, p. 74).
Note: We have proved a similar decomposition for discrete time submartingales (Section 2.18.1). A right continuous F_t-submartingale X is said to be of class DL if the family X(t ∧ T), with T ranging over the collection of F_t-stopping times, is uniformly integrable for each t ≥ 0. A family of random variables Z_a, a ∈ I, is uniformly integrable if sup_{a∈I} ∫_{|Z_a|≥n} |Z_a| dP converges to zero as n → ∞.
We say that A is an increasing process if A(t, ω), t ≥ 0, are increasing functions for all ω ∈ Ω. For example, the Poisson process is increasing. ▲
If X is an F_t-submartingale and T_1, T_2 are F_t-stopping times assuming values in a finite set {t_1, t_2, ..., t_n} ⊂ [0, ∞), then ([61], Lemma 2.2, p. 55)

E[X(T_2) | F_{T_1}] ≥ X(T_1 ∧ T_2).   (3.97)

Note: This inequality is similar to the last condition in Eq. 3.95 with s, t replaced by the stopping times T_1, T_2. By the defining properties of the conditional expectation, Eq. 3.97 is satisfied if and only if ∫_A X(T_2) dP ≥ ∫_A X(T_1 ∧ T_2) dP, ∀ A ∈ F_{T_1}. ▲
Proof: Suppose that X(t), t ≥ 0, is a right continuous submartingale and T is a stopping time. We have E[|X(T ∧ t)|] < ∞ since X is a submartingale. Since X is right continuous and adapted, it is progressive (Section 3.2) so that X(T ∧ t) ∈ F_{T∧t} ⊂ F_t. Also, Eq. 3.98 with T_1 = s ≤ t and T_2 = T yields E[X(T ∧ t) | F_s] ≥ X(T ∧ s). Hence, X^T is a submartingale.
Suppose now that X(t), t ≥ 0, is a right continuous local submartingale, T is a stopping time, and that T_n, n = 1, 2, ..., is a localizing sequence for X. The above considerations applied to the submartingale X^{T_n} show that X^{T_n}(T ∧ t), t ≥ 0, is a submartingale. Hence, X^{T_n}(T ∧ t) = X(T_n ∧ T ∧ t) = X^T(T_n ∧ t), t ≥ 0, is a submartingale for all n ≥ 1 so that T_n is a localizing sequence for X^T.
Similar arguments can be used to show that stopped right continuous local martingales and supermartingales are local martingales and supermartingales, respectively. •
The family E[X(T ∧ t) | F_{T∧t}] of random variables indexed by the stopping times T is uniformly integrable for each t ≥ 0 ([40], Theorem 4.5.3, p. 96). Since X(T ∧ t) is bounded from below, then X(T ∧ t) indexed by T is uniformly integrable. Hence, X is of class DL. For example, the Poisson process N is right continuous and is bounded from below since N(t) ≥ 0, t ≥ 0. Hence, N is of class DL. •
3.11.3 Inequalities
Proof: We first note that Eq. 3.99 implies that φ(X) is an F_t-submartingale. The Jensen inequality can be used to show, for example, that |B| and B² are submartingales, where B is a Brownian motion. We have also established a Jensen inequality for random variables (Section 2.12).
Because φ is a convex function, we have φ(x) = sup{l(x) : l(u) ≤ φ(u), ∀u}, where l(u) = a u + b is a linear function with a, b real constants. Also,
since E[sup{l(X(t))} | F_s] ≥ E[l(X(t)) | F_s] for any linear function l, the conditional expectation is a linear operator, and X is an F_t-martingale. •
If X is an F_t-submartingale, 0 < τ < ∞, and F ⊂ [0, τ] is a finite set, then for each x > 0 and bounded interval (a, b) we have ([61], Lemma 2.3, p. 56 and Lemma 2.5, p. 57)
Note: If T_1 = min{t ∈ F : X(t) ≤ a} and T_2 = min{t > T_1 : t ∈ F, X(t) ≥ b}, we say that X has an oscillation in (T_1, T_2) larger than b − a. This event relates to the event of an a-downcrossing followed by a b-upcrossing defined for differentiable processes (Section 3.10.1). The number of oscillations of X in F larger than b − a is denoted by U(a, b, F).
The bounds in Eq. 3.100 show that the samples of X cannot be very rough because (1) E[X(τ)^+]/x and (E[X(τ)^+] − X(0))/x converge to zero as x → ∞ so that the maximum and the minimum values of X in F are finite with probability one and (2) the average number of oscillations of X in F that are larger than b − a is finite. ▲
[Figure 3.11. Probability that the largest value of a Brownian motion in [0, τ] exceeds x and a bound on this probability, for τ = 10.]
Note: The process [X] is called the square bracket or quadratic variation process. This process is also denoted by [X, X]. The latter notation is used almost exclusively in the later chapters of the book. ▲
Example 3.67: For a Brownian motion B(t), t ≥ 0, the quadratic variation process is [B](t) = t.
Proof: Take t > 0 and consider the partition p_n = (k t/n, k = 0, 1, ..., n) of [0, t]. The sequence of sums in Eq. 3.102 is S_n = Σ_{k=1}^n (B(k t/n) − B((k−1) t/n))², so that S_n has the same distribution as (t/n) Σ_{k=1}^n G_k², where G_k, k = 1, ..., n, are independent N(0, 1) variables, since the increments B(k t/n) − B((k−1) t/n) of B are independent Gaussian variables with mean zero and variance t/n. Hence, S_n → t E[G_1²] = t as n → ∞ by the law of large numbers. We will revisit this property of the Brownian motion later in this chapter (Eq. 3.112). •
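The property [B](t) = t is easy to observe numerically. A sketch (Python with NumPy, not from the book) evaluating the sums of squared increments on uniform partitions of [0, 1]:

```python
import numpy as np

# Numerical illustration of [B](t) = t: the sums S_n of squared Brownian
# increments over uniform partitions of [0, t] approach t as n grows.
rng = np.random.default_rng(2)
t = 1.0

for n in (10, 100, 10000):
    dB = rng.normal(0.0, np.sqrt(t / n), n)  # increments over n intervals
    S = np.sum(dB ** 2)                      # sum in Eq. 3.102
    print(n, S)                              # S approaches t = 1
```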
Example 3.68: Let N be a Poisson process with unit intensity. The quadratic variation process of M(t) = N(t) − t, t ≥ 0, is [M](t) = N(t).
Proof: Take t > 0 and consider the partition p_n = (k t/n, k = 0, 1, ..., n) of [0, t]. For a sufficiently large n, the intervals ((k−1) t/n, k t/n], k = 1, ..., n, of the partition contain at most one jump of N. Let J_n(t) be the collection of intervals ((k−1) t/n, k t/n] containing a jump of N. The sum S_n in Eq. 3.102 is

S_n = Σ_{k∉J_n(t)} (t/n)² + Σ_{k∈J_n(t)} (1 − t/n)² = (t/n)² (n − N(t)) + N(t) (1 − t/n)²,

which converges to N(t) a.s. as n → ∞. •
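The convergence of these sums to N(t) can be checked numerically. A sketch (Python with NumPy, not from the book; unit intensity and t = 10 are illustrative choices):

```python
import numpy as np

# Check of [M](t) = N(t): on a fine partition of [0, t], the sum of squared
# increments of M(t) = N(t) - t is dominated by the squared unit jumps of N.
rng = np.random.default_rng(3)
t, n = 10.0, 1_000_000

arrivals = np.cumsum(rng.exponential(1.0, 50))     # unit-intensity Poisson arrivals
grid = np.linspace(0.0, t, n + 1)
N = np.searchsorted(arrivals, grid, side="right")  # N evaluated on the grid
M = N - grid                                       # compensated Poisson process
S_n = np.sum(np.diff(M) ** 2)                      # sum in Eq. 3.102

print(N[-1], S_n)  # S_n differs from N(t) only by O(t/n) terms
```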
Proof: The expectation of the absolute value of X(t)² − [X](t) is finite for each t ≥ 0 since X is a square integrable martingale and [X] is in L_1 (Eq. 3.102). The process X² − [X] is F_t-adapted because [X] is adapted by definition. It remains to show that the last property in Eq. 3.95 holds. For t ≥ s we have
for any sequence of partitions s = t_0^(n) ≤ t_1^(n) ≤ ... ≤ t_{m_n}^(n) = t of the interval [s, t]. The left side of the above equalities converges in L_1 to [X](t) − [X](s) as Δ(p_n) → 0 (Eq. 3.102) so that
as the meshes of the partitions {t_k^(n)} and {u_k^(n)} of [0, t] converge to zero (Eq. 3.102). The convergence is in probability. If the martingales X and Y are square integrable, the convergence is in L_1. Similar statements hold for local martingales ([61], Proposition 6.2, p. 79).
If the covariation of the martingales X and Y is zero, that is, [X, Y] = 0, we say that X and Y are orthogonal. ▲
Note: The process X² is a submartingale which is bounded from below since X(t)² ≥ 0 for all t ≥ 0. We have seen that such a process is of class DL. Hence, X² admits the Doob-Meyer decomposition, where A in the Doob-Meyer decomposition is denoted by <X>.
Consider the decomposition of X² in Eqs. 3.103 and 3.105. Since X² − [X] and X² − <X> are martingales, so is (X² − <X>) − (X² − [X]) = [X] − <X>. Generally, [X] and <X> differ. If X is continuous, the processes [X] and <X> are indistinguishable ([61], p. 79). ▲
E[M(t)² | F_s] = E[((N(t) − N(s) − λ (t − s)) + (N(s) − λ s))² | F_s]
In Section 3.9.3.1 we have defined the variation and the total variation for deterministic functions and stochastic processes (Eqs. 3.54-3.56). In the following sections we will extend the concept of variation for deterministic functions to samples X(·, ω), ω ∈ Ω, of a stochastic process X(t), t ≥ 0. The variation of X(·, ω) in a time interval [a, b] is defined by V_X(ω) = sup_{all p} V_X(ω, p), where p = (a = t_0 < t_1 < ... < t_m = b) is a partition of [a, b], V_X(ω, p) = Σ_{k=1}^m |X(t_k, ω) − X(t_{k−1}, ω)|, and the supremum is calculated over all partitions p of [a, b]. We say that X is of finite variation if almost all samples of X are of finite variation on each compact of [0, ∞). Otherwise, X is said to be of unbounded variation.
is the counting process associated with the sequence T_n. If T = sup_n {T_n} = ∞ a.s., then N is said to be without explosions.
Note: The counting process N (1) has right continuous, piece-wise constant samples with unit jumps at the random times T_n and starts at zero with probability one if T_1 has no probability mass at zero, (2) takes on positive integer values including {+∞}, (3) has cadlag samples if T = ∞ a.s., and (4) is F_t-adapted if and only if T_n are F_t-stopping times. ▲
Note: The requirement of increments independent of the past is stronger than the requirement of independent increments stated in Section 3.6.4. Consider a process X adapted to a filtration F_t, t ≥ 0, and take 0 ≤ u < v ≤ s < t. Since X(t) − X(s) is independent of F_s by hypothesis, it is also independent of X(v) − X(u) ∈ F_v ⊆ F_s. ▲
Note: The probability in Eq. 3.107 results from the defining properties of the Poisson process ([147], Theorem 23, p. 14). The samples of N look like staircases with steps of unit height and random width. The Poisson process is a special case of the compound Poisson process (Fig. 3.5). We also note that N is continuous in probability, m.s., and a.s. at each time t > 0 but does not have continuous samples.
The probability in Eq. 3.107 shows that (1) the period between consecutive jumps of N is an exponential random variable with distribution P(T_1 > t) = P(N(t) = 0) = e^{−λ t} and mean 1/λ and (2) N starts at zero a.s. because P(T_1 > t) → 1 as t → 0. ▲
Note: These properties can be obtained from the probability law of N in Eq. 3.107 ([79], pp. 79-83). The relationship between cumulants and moments can be found in [79] (Appendix B, p. 377), where q ≥ 1 is an integer. Because E[N(t)] gives the expected number of jumps of N in time t, λ = E[N(t)]/t represents the average number of jumps per unit of time and is called the intensity or the mean arrival rate of N. ▲
E[(N(t) − λ t)² | F_s] = E[((N(t) − λ t) − (N(s) − λ s) + (N(s) − λ s))² | F_s]
Example 3.71: Let C(t) = Σ_{n≥1} Y_n 1(t ≥ T_n) = Σ_{n=1}^{N(t)} Y_n be the compound Poisson process in Example 3.9 with Y_1 ∈ L_q, q ≥ 2. The mean, variance, covariance function, cumulant of order q, and characteristic function of C are

E[C(t)] = λ t E[Y_1], Var[C(t)] = λ t E[Y_1²],
Cov[C(s), C(t)] = λ min(s, t) E[Y_1²], χ_q = λ t E[Y_1^q], and
φ(u; t) = exp[−λ t (1 − φ_{Y_1}(u))],   (3.109)

where λ > 0 is the intensity of the Poisson process N and φ_{Y_1} denotes the characteristic function of Y_1. ⋄
Note: These properties of C can be obtained from the definition of this process and the probability law of the Poisson process in Eq. 3.107 ([79], Section 3.3). ▲
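The first two moments in Eq. 3.109 can be verified by simulation. A sketch (Python with NumPy, not from the book; standard normal jumps are an illustrative choice, so E[Y_1] = 0 and E[Y_1²] = 1):

```python
import numpy as np

# Monte Carlo check of the mean and variance in Eq. 3.109 for a compound
# Poisson process with standard normal jumps Y_k (illustrative choice).
rng = np.random.default_rng(4)
lam, t, nsim = 3.0, 2.0, 200000

N = rng.poisson(lam * t, nsim)          # number of jumps of C in (0, t]
# Conditional on N, C(t) = Y_1 + ... + Y_N ~ N(0, N) for standard normal jumps.
C = np.sqrt(N) * rng.normal(size=nsim)

print(C.mean())  # ~ lam * t * E[Y1]   = 0
print(C.var())   # ~ lam * t * E[Y1^2] = 6
```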
Example 3.72: Let C be a compound Poisson process with E[|Y_1|] < ∞. Denote by F_t the natural filtration of C. Then C is an F_t-submartingale, martingale, or supermartingale if E[Y_1] ≥ 0, = 0, or ≤ 0, respectively. The compensated compound Poisson process C̄(t) = C(t) − λ t E[Y_1] is an F_t-martingale. If Y_1 ≥ 0 a.s., then A(t) = λ t E[Y_1] is the process in the Doob-Meyer decomposition of the submartingale C. ⋄
Proof: The first part of the statement has been proved in Example 3.61, where it was shown that E[C(t) | F_s] = λ (t − s) E[Y_1] + C(s) for s ≤ t.
If Y_1 ≥ 0 a.s., then C is a submartingale such that C(t) ≥ 0 for all t ≥ 0. Hence, C is of class DL (Section 3.11.2) so that it admits the Doob-Meyer decomposition in Section 3.11.1. Since C̄(t) = C(t) − λ t E[Y_1] is a martingale and λ t E[Y_1] is an adapted continuous increasing process, the process A in the Doob-Meyer decomposition is A(t) = λ t E[Y_1]. •
[Figure 3.12. A sample of the compound Poisson process C with two of its jumps in the interval (y, y + dy].]
Let M(t, dy) be a random measure giving the number of jumps of C in (y, y + dy] during a time interval (0, t]. This measure is equal to 2 for the sample in Fig. 3.12. The measure M(t, dy) (1) is random because its value depends on the sample of C, (2) has the expectation E[M(t, dy)] = λ t dF(y) = t μ(dy), where μ(dy) = λ dF(y) and F denotes the distribution of Y_1, and (3) provides the alternative definition,
Let C be a compound Poisson process, A be a Borel set in ℝ, and M(t, dy) be a random measure giving the number of jumps of C in (y, y + dy] during (0, t]. Then

C^A(t) = Σ_{k=1}^{N(t)} Y_k 1_A(Y_k) = Σ_{0<s≤t} ΔC(s) 1_A(ΔC(s)) = ∫_A y M(t, dy)   (3.111)

is a compound Poisson process, where ΔC(s) = C(s) − C(s−) and C(s−) = lim_{u↑s} C(u).
[Figure. Samples of a compound Poisson process and of the corresponding thinned compound Poisson process; horizontal axis: time.]
so that λ t (1 − φ_{Y_1 1_A(Y_1)}(u)) = λ t (P(Y_1 ∈ A) − φ_{Y_1}^{(A)}(u)) = λ̃ t (1 − φ̃(u)), where λ̃ = λ P(Y_1 ∈ A), φ_{Y_1}^{(A)}(u) = E[e^{√−1 u Y_1} 1_A(Y_1)], and φ̃(u) = φ_{Y_1}^{(A)}(u)/P(Y_1 ∈ A). Similarly,

C^h(t) = Σ_{k=1}^{N(t)} h(Y_k) and C̃^A(t) = Σ_{k=1}^{N(t)} h(Y_k) 1_A(Y_k) = Σ_{0<s≤t} h(ΔC(s)) 1_A(ΔC(s)) = ∫_A h(y) M(t, dy)

are compound Poisson processes with jumps h(Y_k) and h(Y_k) 1_A(Y_k) arriving at the mean rates λ and λ̃, respectively, where h : ℝ → ℝ is a Borel measurable function. ▲
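Thinning is easy to illustrate by simulation: each jump of C lands in A independently with probability P(Y_1 ∈ A), so the retained jumps arrive at the mean rate λ P(Y_1 ∈ A). A sketch (Python with NumPy, not from the book; exponential(1) jumps and A = (1, ∞) are illustrative choices):

```python
import numpy as np

# Thinning of a compound Poisson process: the jumps of C that fall in a Borel
# set A form a compound Poisson process with mean rate lam * P(Y1 in A).
rng = np.random.default_rng(5)
lam, t, nsim = 4.0, 1.0, 100000

N = rng.poisson(lam * t, nsim)   # jump counts of C on (0, t]
p_A = np.exp(-1.0)               # P(Y1 > 1) for exponential(1) jumps
kept = rng.binomial(N, p_A)      # number of jumps of C falling in A
expected = lam * p_A * t         # thinned mean rate times t

print(kept.mean(), expected)
```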
Example 3.73: Let C be a compound Poisson process with jumps satisfying the condition E[Y_1²] < ∞. The quadratic variation of the compensated compound Poisson process C̄(t) = C(t) − λ t E[Y_1] is [C̄](t) = Σ_{k=1}^{N(t)} Y_k². ⋄
that B starts at 0 ∈ ℝ^d. If B(0) = x, we say that the Brownian motion starts at x. Because the properties of B are determined by the properties of its coordinates B_i, we consider here only real-valued Brownian motions, and denote these processes by B. We have defined the Brownian motion process B in Example 3.4. This section gives essential properties of B.
There exists a modification of the Brownian motion with a.s. continuous paths ([147], Theorem 26, p. 17).
Note: This theorem is consistent with a previous result based on the Kolmogorov criterion (Eq. 3.2). We will work exclusively with continuous modifications of the Brownian motion B. Therefore, it will be assumed that B has continuous samples. ▲
lim_{n→∞} Σ_{k=1}^{m_n} (B(t_k^(n)) − B(t_{k−1}^(n)))² = t a.s.   (3.112)

Note: This property justifies the notation (dB(t))² = dt used extensively in applications. If the mesh of the sequence of partitions p_n converges to zero as n → ∞ but is not refining, then the limit in Eq. 3.112 exists in m.s. (Example 3.74). We have also seen that the sequence S_n = Σ_{k=1}^{m_n} (B(t_k^(n)) − B(t_{k−1}^(n)))² converges to t as n → ∞ in probability and L_1 (Eq. 3.102).
Figure 3.14 shows three samples of Z_{m_n}(s) = Σ_{k=1}^{m_n} (B(t_k^(n) ∧ s) − B(t_{k−1}^(n) ∧ s))² for s ≤ t, t = 1, t_k^(n) = k/m_n, and m_n = 10, 100, and 1,000 corresponding to a single sample of the Brownian motion B. The figure also shows three samples of Z_{m_n} corresponding to the same partition of [0, 1] (m_n = 1,000) but different samples of the Brownian motion. If m_n is small, the samples of Z_{m_n} can differ significantly from the identity function t ↦ t and from one another. However, these samples nearly coincide with t ↦ t as the partition is refined such that Δ(p_n) → 0 as n → ∞. ▲
[Figure 3.14. Samples of Z_{m_n} for m_n = 10, 100, and 1,000 (left) and three samples of Z_{m_n} for m_n = 1,000 (right); horizontal axes: time.]
Proof: Let p_n = (0 = t_0^(n) ≤ t_1^(n) ≤ ... ≤ t_{m_n}^(n) = t) be a sequence of refining partitions of [0, t] such that Δ(p_n) → 0 as n → ∞. The left side of the inequality

Σ_{k=1}^{m_n} [B(t_k^(n)) − B(t_{k−1}^(n))]² ≤ max_k |B(t_k^(n)) − B(t_{k−1}^(n))| Σ_{k=1}^{m_n} |B(t_k^(n)) − B(t_{k−1}^(n))|

and max_k |B(t_k^(n)) − B(t_{k−1}^(n))| converge to t and zero a.s. as n → ∞ by Eq. 3.112 and the continuity of the samples of B, respectively. To satisfy the above inequality, the summation Σ_{k=1}^{m_n} |B(t_k^(n)) − B(t_{k−1}^(n))| must approach infinity a.s. as n → ∞ ([147], Theorem 29, p. 19). •
Example 3.74: The equality in Eq. 3.112 holds also in the mean square sense if the mesh of the sequence of partitions p_n in this equation converges to zero as n → ∞. The sequence of partitions p_n does not have to be refining. ⋄
Proof: The first two moments of Y_{m_n} = Σ_{k=1}^{m_n} [B(t_k^(n)) − B(t_{k−1}^(n))]² − t are E[Y_{m_n}] = 0 and E[Y_{m_n}²] = 2 Σ_{k=1}^{m_n} (t_k^(n) − t_{k−1}^(n))² since the Brownian motion has stationary independent Gaussian increments and the fourth moment of N(0, σ²) is 3 σ⁴. Hence,

E[Y_{m_n}²] ≤ 2 max_k (t_k^(n) − t_{k−1}^(n)) Σ_{k=1}^{m_n} (t_k^(n) − t_{k−1}^(n)) = 2 t Δ(p_n) → 0 as n → ∞. •
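The identity E[Y_{m_n}²] = 2 Σ_k (t_k − t_{k−1})² can be checked by Monte Carlo on a non-refining, even random, partition. A sketch (Python with NumPy, not from the book):

```python
import numpy as np

# Monte Carlo check of E[Y^2] = 2 * sum_k (t_k - t_{k-1})^2 for
# Y = sum_k (B(t_k) - B(t_{k-1}))^2 - t on a random partition of [0, t].
rng = np.random.default_rng(6)
t, nsim = 1.0, 200000

grid = np.sort(np.concatenate(([0.0, t], rng.uniform(0.0, t, 30))))
dt = np.diff(grid)                                   # partition of [0, t]

dB = rng.normal(size=(nsim, dt.size)) * np.sqrt(dt)  # Brownian increments
Y = (dB ** 2).sum(axis=1) - t

print(Y.mean())                      # ~ 0
print(Y.var(), 2 * (dt ** 2).sum())  # these two agree
```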
Proof: Let p_n be a sequence of partitions of a time interval [0, t] and consider the sums
where ΔB_k^(n) = B(t_k^(n)) − B(t_{k−1}^(n)) and ΔN_k^(n) = N(t_k^(n)) − N(t_{k−1}^(n)). If the sequence of partitions p_n is refining and Δ(p_n) → 0 as n → ∞, then Σ_k (ΔB_k^(n))² and Σ_k (ΔN_k^(n))² converge to t and N(t) a.s., while Σ_k (ΔB_k^(n))(ΔN_k^(n)) converges to zero since the Brownian motion has continuous samples so that
and max_k |ΔB_k^(n)| → 0 as n → ∞. Hence, S_n converges to t + N(t) a.s. as n → ∞. If the sequence of partitions p_n is not refining, then S_n converges to t + N(t) in m.s. and probability as n → ∞.
Similar arguments can be applied to show that [B + C̄](t) = t + Σ_{k=1}^{N(t)} Y_k², where B is a Brownian motion, C(t) = Σ_{k=1}^{N(t)} Y_k is a compound Poisson process with jumps Y_1, Y_2, ... in L_2, C is independent of B, and C̄(t) = C(t) − λ t E[Y_1]. •
Let (Ω, F, (F_t)_{t≥0}, P) be a filtered probability space. A process {X(t), t ≥ 0} defined on this space is Levy if it
1. is F_t-adapted and starts at zero,
2. has stationary increments that are independent of the past, that is, X(t) − X(s), t > s, has the same distribution as X(t − s) and is independent of F_s, and
3. is continuous in probability.
Table 3.3 summarizes the three defining properties for the Poisson, Brownian motion, and Levy processes. The third property differentiates these processes. Yet, the Poisson, Brownian motion, and Levy processes are closely related. For example, the Poisson and Brownian motion processes are Levy processes since they are continuous in probability. The compound Poisson process C in Eq. 3.6 is also a Levy process because it is F_t-adapted, starts at zero, has stationary increments that are independent of the past, and P(|C(t) − C(s)| > 0) → 0 as |t − s| → 0.
Table 3.3. Defining properties for Poisson, Brownian motion, and Levy processes
We have shown that the Poisson, compound Poisson, and Brownian motion
processes are not m.s. differentiable. It will be seen that the samples of the Levy
process are also too rough to be differentiable. Yet, it is common in the engi-
neering literature to use the formal derivative of the compound Poisson, Brownian
motion, and Levy processes, called the Poisson, Gaussian, and Levy white noise
process, respectively, to model the input to various physical systems. Because
the Poisson, Gaussian, and Levy white noise processes do not exist, calculations
involving these processes are formal so that the resulting findings may be ques-
tionable. We will see in the following chapters how white noise processes can be
incorporated in the theory of stochastic differential equations in a rigorous man-
ner.
Example 3.76: If X_1 and X_2 are two independent Levy processes defined on the same filtered probability space (Ω, F, (F_t)_{t≥0}, P), then X = X_1 + X_2 is also a Levy process. ⋄
Proof: X is F_t-adapted and has stationary independent increments, since it is the sum of two processes with these properties. It remains to show that X is continuous in probability. Set A_i = {|X_i(t) − X_i(s)| > ε/2}, i = 1, 2, and A = {|X(t) − X(s)| > ε} for some ε > 0. Because A^c ⊇ A_1^c ∩ A_2^c, we have A ⊆ A_1 ∪ A_2 so that P(A) ≤ P(A_1) + P(A_2) implies P(|X(t) − X(s)| > ε) → 0 as |t − s| → 0 since the processes X_1, X_2 are continuous in probability. •
Example 3.77: The characteristic function of a Levy process X has the form
Proof: Levy processes have stationary independent increments, that is, the random variables X(t + s) − X(s) and X(s) − X(0), s, t ≥ 0, are independent, so that
Example 3.78: Let X be an F_t-adapted process starting at zero that has stationary increments that are independent of the past. Suppose that the characteristic function of the random variable X(t) for t ≥ 0 is φ(u; t) = e^{−t |u|^α}, where u ∈ ℝ and α ∈ (0, 2]. Then X is a Levy process that may or may not be an F_t-martingale depending on the value of α. ⋄
Proof: Let f denote the density of X(t) − X(s), t > s. For every ε > 0 the probability P(|X(t) − X(s)| > ε) = ∫_{(−ε,ε)^c} f(x) dx converges to zero as (t − s) → 0 since f(x) = (1/2π) ∫_{−∞}^∞ e^{−√−1 u x} e^{−(t−s) |u|^α} du approaches a delta function centered at the origin as (t − s) → 0 so that its integral over the interval (−ε, ε)^c converges to zero.
It can be shown that E[|X(t)|^p] is finite if and only if p ∈ (0, α) ([162], Proposition 1.2.16, p. 18). Hence, X is not a martingale for α ≤ 1 because the expectation E[|X(t)|] is not bounded, but this process is an F_t-martingale for α > 1. If α = 2, the characteristic function of X(t) is φ(u; t) = e^{−t u²} so that X(t) ~ N(0, 2t) is a square integrable martingale that has the same distribution as √2 B(t), where B denotes a Brownian motion process. •
3.14.1 Properties
Each Levy process has a unique modification which is Levy and cadlag ([147], Theorem 30, p. 21).
Note: Only this modification is considered in our discussion so that we assume that the Levy process has cadlag samples, that is, right continuous samples with left limits. The Brownian motion and the compound Poisson process are examples of processes with cadlag samples that are also Levy processes. ▲
Note: Let X be a Levy process, T denote a stopping time, and Y(t) = X(T + t) − X(T), t ≥ 0, be a new process derived from X. The process Y (1) is a Levy process adapted to F_{T+t}, (2) is independent of F_T, and (3) has the same distribution as X. Hence, the Brownian motion, Poisson, and compound Poisson processes have these properties because they are Levy processes. ▲
Note: X has only jump discontinuities because it has cadlag samples. The random variables X(t−) and X(t+) = X(t) denote the left and the right limits of X at time t, respectively (Fig. 3.15). ▲
[Figure 3.15. A sample of X near a jump at time t: the left limit X(t−), the right limit X(t+) = X(t), and the jump ΔX(t) = X(t) − X(t−).]
If a Levy process X has bounded jumps, that is, if there exists a finite constant c > 0 such that sup_t |ΔX(t)| ≤ c < ∞ a.s., then X has finite absolute moments of any order, that is, E[|X(t)|^n] < ∞ for n = 1, 2, ... ([147], Theorem 34, p. 25).
Proof: Because Levy processes have cadlag samples, they can have only jump discontinuities ([147], p. 6).
Let 0 < c < ∞ be a constant such that sup_t |ΔX(t)| ≤ c a.s. and define the stopping times
The sequence (T_1, T_2, ...) is strictly increasing since X is right continuous. Moreover, the stopped process X^{T_n} must satisfy the condition sup_t |X^{T_n}(t)| ≤ 2nc < ∞ since |X(t) − X(T_k)| cannot exceed 2c for t ∈ (T_k, T_{k+1}] by the definition of the stopping times T_k. Because a Levy process preserves its properties if it is restarted at a stopping time, the random variable T_n − T_{n−1} is independent of F_{T_{n−1}} and has the same distribution as T_1. Hence,

P(|X(t)| > 2nc) ≤ P(T_n < t) = P(e^{−T_n} > e^{−t}) ≤ E[e^{−T_n}] e^t = (E[e^{−T_1}])^n e^t

by the Chebyshev inequality. The last formula giving the tail behavior of the distribution of |X(t)| shows that X(t) has moments of any order since 0 ≤ E[e^{−T_1}] < 1. •
N^A(t) = Σ_{n=1}^∞ 1(T_n^A ≤ t)   (3.115)

is a Poisson process with intensity λ_L(A) = E[N^A(1)], called the Levy measure, where T_1^A = inf{t > 0 : ΔX(t) ∈ A} and T_{n+1}^A = inf{t > T_n^A : ΔX(t) ∈ A}, n ≥ 1.
Proof: The notation Ā is used for the closure of A; for example, the closure of an open interval A = (α, β) is Ā = [α, β]. The process N^A is defined by the jumps of X in A, where T_1^A = inf{t > 0 : ΔX(t) ∈ A} denotes the first time X has a jump in A and T_{n+1}^A = inf{t > T_n^A : ΔX(t) ∈ A}, n ≥ 1, is the time of jump n + 1 of X in A. The process N^A is similar to C^A in Eq. 3.111.
The definition in Eq. 3.115 is meaningful because Levy processes have cadlag samples so that they can have only jump discontinuities. The process N^A (1) is a counting process giving the number of jumps of X in A during a time interval, (2) is adapted by definition, (3) is without explosions since X has cadlag paths so that the random variables T_n^A are stopping times with the property lim_{n→∞} T_n^A = ∞, and (4) has stationary increments that are independent of the past. The distribution of these increments depends only on the time lag t − s because X has stationary increments. Hence, N^A is a Poisson counting process with intensity λ_L(A) = E[N^A(1)] < ∞ ([147], p. 26).
Let M(t, dy, ω) = #{s ≤ t : ΔX(s, ω) ∈ (y, y + dy]} be a random measure counting the number of jumps of X in (y, y + dy] during (0, t] for each fixed (t, ω). The Poisson process N^A can be defined by N^A(t, ω) = ∫_A M(t, dy, ω) for each ω, where λ_L(dy) = E[M(1, dy)] is the Levy measure of N^A for A = (y, y + dy]. •
since N^{A∩A_i} is a Poisson process with intensity λ_L(A ∩ A_i). Similar considerations can be used to prove the second equality in Eq. 3.116. The extension to an arbitrary function h 1_A ∈ L_2 can be found in [147] (Theorem 38, p. 28). ▲
Let X be a Levy process and A denote a Borel set such that 0 ∉ Ā. The associated jump process,

J^A(t) = Σ_{0<s≤t} ΔX(s) 1_A(ΔX(s)) = ∫_A y M(t, dy),   (3.117)

Proof: J^A has piece-wise constant cadlag sample paths with jumps ΔX(T_n^A) arriving at the times T_n^A defining the Poisson process N^A (Eq. 3.115). J^A is a compound Poisson process defined by the large jumps of X, that is, the jumps of X in A. Therefore, almost all samples of J^A are of bounded variation on compacts.
Since J^A is a compound Poisson process, it is also Levy. The process X − J^A is Levy as the difference of two Levy processes. •
3.14. Levy processes 195
(3.118)
Proof: The process in Eq. 3.118 can also be written as Y^a(t) = X(t) − J^A(t) with the notation in Eq. 3.117. The jumps of Y^a have magnitude smaller than a so that E[|Y^a(t)|^n] < ∞ for any order n, by one of the properties in Section 3.14.1. It is common to define the process Y^a for a = 1, but any other value of a > 0 is acceptable. •
Note: A similar property holds for processes obtained from a compound Poisson process by retaining its jumps in two disjoint sets. ▲
Proof: Recall that a process is said to have samples of finite variation on compacts if almost all its samples are of bounded variation on each compact of [0, ∞).
Take A = (−1, 1]^c. The processes J^A and X − J^A are Levy (Eq. 3.117). Any sample function s ↦ X(s, ω) has a finite number of jumps in A during a time interval [0, t] so that J^A is of finite variation on compacts. The jumps of the Levy process X − J^A are smaller than 1 so that this process has bounded absolute moments of any order. Because X − J^A has stationary independent increments and X(0) − J^A(0) = 0, we have E[X(t) − J^A(t)] = β t, where β = E[X(1) − J^A(1)]. Hence, Y(t) = (X(t) − J^A(t)) − β t is a martingale. The claimed representation of X results by setting Z(t) = J^A(t) + β t.
That Y and Z are independent follows from the observations that (1) the processes J^A and J^{A(ε)} are independent Levy processes according to the previous property applied for A_1 = A and A_2 = A(ε) = (−1, −ε) ∪ (ε, 1), ε ∈ (0, 1), and (2) the process J^{A(ε)} approaches X − J^A as ε → 0. •
A Levy process X with bounded jumps, that is, sup_t |ΔX(t)| ≤ a a.s., a > 0, has the representation
where M̄(t, dy) = M(t, dy) − t λ_L(dy), Z^d and Z^c are independent Levy processes, Z^c is a martingale with continuous paths, and Z^d is a martingale ([147], Theorem 41, p. 31).
Note: The representation in Eq. 3.119 shows that the process Y in the decomposition X = Y + Z of a general Levy process defined by the previous statement can be represented by the sum of two independent Levy processes, the processes Z^c and Z^d. ▲
The last two statements show that any Levy process X has the representa-
tion (Fig. 3.16)
[Figure 3.16. Decomposition of a Levy process X into Z(t) = J^A(t) + β t (Levy with large jumps) and Y(t) = [X(t) − J^A(t)] − β t (compensated Levy with bounded jumps); Y is the sum of a martingale Z^c with continuous samples and the martingale Z^d(t) = ∫_{|y|<a} y [M(t, dy) − t λ_L(dy)].]
Example 3.81: Let X(t), t ≥ 0, be a Levy process with characteristic function φ(u; t) = E[e^{√−1 u X(t)}] = e^{−t |u|^α}, where α ∈ (0, 2]. Figure 3.17 shows a sample of a Brownian motion B, its variation, and its quadratic variation, respectively. The figure also shows similar results for a Levy process X of the type considered in Example 3.78 with α = 1.5. The sample of X exhibits jump discontinuities whose effects are most visible in the sample of the quadratic variation process [X] of X. The steady increase in the sample of [X] resembling the sample of [B] is interrupted by jumps characteristic of the quadratic variation of a compound Poisson process. ⋄
Note: The sample of X in Fig. 3.17 has been generated by the recurrence formula X(t + Δt) = X(t) + ΔX(t), t = 0, Δt, 2Δt, ..., where X(0) = 0 and ΔX(t) are independent random variables with the characteristic function φ(u) = e^{−Δt |u|^α}. Additional information on the generation of samples of X and other processes is given in Chapter 5. ▲
[Figure 3.17. Samples of a Brownian motion B and of the Levy process X (top), their variation (middle), and their quadratic variation [B] and [X] (bottom); horizontal axes: time.]
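One standard way to realize this recurrence, sketched below (Python with NumPy, not the book's own implementation), generates the symmetric α-stable increments by the Chambers-Mallows-Stuck method; the parameters α, Δt, and the number of steps are illustrative choices:

```python
import numpy as np

# Recurrence X(t + dt) = X(t) + dX(t): the increments dX are symmetric
# alpha-stable with characteristic function exp(-dt |u|^alpha), generated
# here by the Chambers-Mallows-Stuck method (symmetric case, alpha != 1).
rng = np.random.default_rng(7)
alpha, dt, nsteps = 1.5, 0.01, 1000

V = rng.uniform(-np.pi / 2, np.pi / 2, nsteps)   # V ~ uniform(-pi/2, pi/2)
W = rng.exponential(1.0, nsteps)                 # W ~ exponential(1)
stable = (np.sin(alpha * V) / np.cos(V) ** (1 / alpha)
          * (np.cos((1 - alpha) * V) / W) ** ((1 - alpha) / alpha))
dX = dt ** (1 / alpha) * stable                  # scale dt^(1/alpha)

X = np.concatenate(([0.0], np.cumsum(dX)))       # sample path with X(0) = 0
print(X.shape)
```

The scale Δt^{1/α} follows from the self-similarity of the stable law, so that the increment has characteristic function e^{−Δt |u|^α}.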
Example 3.82: Let AL(dy) = dy fly Ia+! for y E lR \ {0} and a E (0, 2) be a
measure defining a Levy process X. Consider also a compound Poisson process
C(t) = L,~~? Yk. where A > 0 denotes the intensity of the Poisson process
N and Yk are independent random variables with the distribution F defined by
AL(dy) = A.dF(y).
There are notable similarities and essential differences between X and C.
Both processes are defined by random measures M(dt, dy) giving the number of
jumps in (t, t + dt] × (y, y + dy]. However, the average number of jumps in a
Borel set A per unit of time is finite for C but may be infinite for X. ◊
for A = (−a, a] and A = (−a, a), 0 < a < ∞, respectively. The large jumps of X
generate the compound Poisson process J_{(−a,a)^c} defined by Eq. 3.117. This process has
iid jumps |ΔX(t)| ≥ a arriving in time at the mean rate 2/(α a^α). The small jumps of X
do not define a compound Poisson process because the mean arrival rate of these jumps is
not finite. Additional considerations on the jumps of X and measures needed to count the
small jumps of this process can be found in [162] (Section 3.12). ▲
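The large-jump process described in this note can be simulated directly. The NumPy sketch below is a hypothetical illustration (not the book's code): the arrival rate is the total mass of λ_L(dy) = dy/|y|^{α+1} outside (−a, a), and the jump sizes follow the normalized tail of λ_L, i.e. a Pareto(α, a) magnitude with a random sign; both are assumptions derived from the stated measure.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, a, t_max = 1.5, 0.5, 10.0

# Mean jump arrival rate: total mass of lambda_L(dy) = dy/|y|**(alpha+1)
# outside (-a, a), that is, 2/(alpha * a**alpha)
lam = 2.0 / (alpha * a ** alpha)

n_jumps = rng.poisson(lam * t_max)                  # N(t_max)
times = np.sort(rng.uniform(0.0, t_max, n_jumps))   # jump times given N(t_max)

# Jump sizes: Pareto(alpha, a) magnitude (normalized tail of lambda_L), random sign
magnitudes = a * (1.0 - rng.random(n_jumps)) ** (-1.0 / alpha)
jumps = magnitudes * rng.choice([-1.0, 1.0], n_jumps)
C = np.cumsum(jumps)                                # C(t) sampled at its jump times
```

Every jump has magnitude at least a, so the mean arrival rate is finite; letting a → 0 makes the rate blow up, which is why the small jumps of X do not define a compound Poisson process.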
Proof: We show that X = X^(1) + X^(2) + X^(3), where X^(1)(t) = a B(t) − b t, a is a (d, d)
matrix, the coordinates of B ∈ ℝ^d are independent Brownian motions, X^(2) is a compound
Poisson process with jumps of magnitude larger than 1, and X^(3) is a process including all
jumps of X strictly smaller than 1.
The characteristic function of X^(1)(t) ~ N(−b t, a a^T t) is φ^(1)(u; t) = e^{−t ψ^(1)(u)},
where ψ^(1)(u) = √−1 u^T b + (1/2) u^T a a^T u.
Consider first the large jumps of X, that is, jumps with magnitude larger than 1,
and denote π^(2)(dy) = π(dy) 1{||y|| ≥ 1}. The total mass of π^(2) is finite and X^(2)(t) =
Σ_{s≤t} ΔX^(2)(s) is a compound Poisson process with characteristic function φ^(2)(u; t) =
[Figure: samples of the Lévy process X for α = 1.0, α = 1.5, and α = 1.7, t ∈ [0, 100].]
for every ε > 0, where π^(3,ε)(dy) = π(dy) 1{ε < ||y|| < 1}. This process is independent of
X^(2) and has the characteristic function φ^(3,ε)(u; t) = e^{−t ψ^(3,ε)(u)}, where
expectation E[sup_{0≤s≤t} ||Y(s)||²] and its limit, denoted by X^(3), is a Lévy process with
the characteristic function φ^(3)(u; t) = e^{−t ψ^(3)(u)}, where ([20], Theorem 1, p. 13)
Example 3.85: The Poisson process N and the square of a Brownian motion B
are semimartingales. ◊
Proof: The processes N and B² admit the representation M + A, where (M(t) = N(t) −
λt, A(t) = λt) and (M(t) = B(t)² − t, A(t) = t), respectively. The properties of M and
A show that N and B² are semimartingales. For example, B(t)² − t is a martingale since
E[B(t)²] = t < ∞, B² is F_t = σ(B(s), 0 ≤ s ≤ t)-adapted, and for t ≥ s we have
E[B(t)² | F_s] = E[(B(t) − B(s))² + 2 B(t) B(s) − B(s)² | F_s] = (t − s) + B(s)².
The martingale B(t)² − t has finite moments of any order and starts at zero. The process
A(t) = t is continuous, adapted, and A(0) = 0. We also note that the filtration F_t =
σ(B(s), 0 ≤ s ≤ t) includes the natural filtration of B². •
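The martingale property of B(t)² − t can be checked by simulation. The following NumPy sketch (an illustration under assumed parameters, not the book's material) tests the defining identity weakly: E[M(t) | F_s] = M(s) implies E[M(t) g(B(s))] = E[M(s) g(B(s))] for any bounded function g of the past.

```python
import numpy as np

rng = np.random.default_rng(2)
s, t, n_paths = 1.0, 3.0, 200_000

B_s = np.sqrt(s) * rng.standard_normal(n_paths)            # B(s) ~ N(0, s)
B_t = B_s + np.sqrt(t - s) * rng.standard_normal(n_paths)  # independent increment

M_s = B_s ** 2 - s
M_t = B_t ** 2 - t

# E[M(t) | F_s] = M(s) is checked weakly through E[M(t) g(B(s))] = E[M(s) g(B(s))]
# for a bounded test function g of the "past"
g = np.cos(B_s)
lhs, rhs = np.mean(M_t * g), np.mean(M_s * g)
```

Both Monte Carlo averages agree to sampling error, consistent with the conditional-expectation computation in the proof above.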
3.15 Problems
3.1: Let Y be a real-valued random variable on a probability space (Ω, F, P).
Find the natural filtration of the stochastic process X(t) = Y h(t), t ≥ 0, where
h : [0, ∞) → ℝ is a continuous function.
3.2: Complete the proofs of the properties of stopping times in Section 3.4.
3.4: Write the finite dimensional densities for an ℝ^d-valued Gaussian process X
with specified mean and covariance functions.
3.5: Is the process in Eq. 3.33 ergodic in the first two moments, that is, is Eq. 3.15
satisfied with g(x) = x^p, p = 1, 2?
3.6: Let X(t), t ≥ 0, be a real-valued Gaussian process with mean zero and
covariance function E[X(t) X(s)] = t ∧ s. Show that this process has independent
increments.
3.7: Show that a compound Poisson process has the second moment properties in
Eq. 3.109.
3.9: Consider the process X in Eq. 3.33 with A_k = A + U_k and B_k = B + V_k,
where A, B, U_k, V_k are mutually uncorrelated random variables with mean zero
and unit variance. Is X a weakly stationary process?
3.10: Complete the proof of the m.s. continuity criterion in Section 3.9.1.
3.12: Prove that the m.s. derivative of a process X exists and E[Ẋ(t)²] < ∞ if
and only if ∂²r(u, v)/(∂u ∂v) exists and is finite at u = v = t.
3.14: Prove that the m.s. derivative of a weakly stationary process X exists and
E[Ẋ(t)²] < ∞ if and only if d²r(τ)/dτ² exists and is finite at τ = 0.
3.16: Show that the variation V_h(p) defined by Eq. 3.54 increases as the partition
p is refined.
3.17: Prove the statements in Eqs. 3.58, 3.59, 3.60, and 3.61.
3.20: Develop the Karhunen-Loève representation for a band-limited white noise
process with mean zero (Table 3.1).
3.21: Find the mean crossing rate of a process Y(t) = a(t) X(t), where X is
given by Eq. 3.33 and a is a differentiable function.
3.22: Find the mean upcrossing rate of the translation process in Eq. 3.25, where
G satisfies the conditions in Eq. 3.92. Specialize your results for a lognormal and
exponential distribution function F.
3.23: Develop an algorithm for calculating the density f_T in Eq. 3.94 approximately.
3.25: Prove that a counting process N associated with the strictly increasing sequence
of positive random variables {T_n, n = 1, 2, ...} is adapted to the natural
filtration of N if and only if {T_n, n = 1, 2, ...} are stopping times.
3.27: Show that the process J_A defined by Eq. 3.117 has paths of finite variation
on compacts.
3.29: Show that the sum of two independent compound Poisson processes is a
compound Poisson process.
Chapter 4
Ito's Formula and Stochastic Differential Equations
4.1 Introduction
The probabilistic concepts reviewed in the previous chapters are applied to
develop one of the most useful tools for the solution of stochastic problems, the
Ito calculus. Our objectives are to:
1. Define the stochastic integral or the Ito integral, that is, an integral involving
càglàd (left continuous with right limits) integrands and semimartingale integrators.
The Ito integral cannot be defined in the Riemann-Stieltjes sense because
many semimartingales do not have sufficiently smooth samples. The stochastic
integral considered in this chapter is an extension of the original Ito integral
defined for Brownian motion integrators. We also define an alternative to the Ito
integral, called the Fisk-Stratonovich integral or the Stratonovich integral, that
has properties similar to the Riemann-Stieltjes integral. The relationship between
the Ito and the Stratonovich integrals is also established.
2. Develop a change of variable formula for functions of semimartingales, called
the Ito formula. This formula differs from the change of variable formula of
classical calculus and is essential for the solution of many stochastic problems in
engineering, physics, and other fields.
206 Chapter 4. Ito's Formula and Stochastic Differential Equations
S_{f,p}(g) = Σ_{k=1}^m f(t_k) [g(t_k) − g(t_{k-1})].    (4.1)
Note: The last equality in Eq. 4.2 holds if ∫_0^t f dg exists. The integral ∫_s^t f dg in this
equation is ∫_0^t 1_{[s,t]} f dg. ▲
Let B be a Brownian motion process. The sample paths s ↦ B(s, ω) of B
are of bounded q-variation on any finite interval [0, t] for q > 2 [182]. Hence,
∫_0^t f(s) dB(s, ω) exists as a Riemann-Stieltjes integral for almost all sample paths
of B if f is of bounded variation since 1/p + 1/q > 1 for p = 1 and q > 2.
The definition of ∫_0^t f dB as a Riemann-Stieltjes integral corresponding to the
sample paths of the Brownian motion is referred to as the path by path definition.
The integrals ∫_0^t e^s dB(s), ∫_0^t cos(s) dB(s), and ∫_0^t s^k dB(s)
exist as Riemann-Stieltjes integrals for almost all sample paths of the Brownian
motion because the functions e^s, cos(s), and s^k are of bounded variation.
These observations may suggest that the path by path definition can be
extended to more general integrals, for example, ∫_0^t B dB and ∫_0^t f dB, where
f is an arbitrary continuous function. However, the Riemann-Stieltjes integrals
∫_0^t B(s, ω) dB(s, ω) and ∫_0^t f(s) dB(s, ω) may not exist because:
1. The condition 1/p + 1/q > 1 is not satisfied by the integrand and the integrator
of the path by path integral ∫_0^t B(s, ω) dB(s, ω), and
2. The Riemann-Stieltjes integral ∫_0^t f(s) dg(s) does not exist for all continuous
functions f on [0, t] unless g is of bounded variation ([147], Theorem 52, p. 40),
so that the integral ∫_0^t f(s) dB(s, ω) cannot be defined as a path by path Riemann-Stieltjes
integral for an arbitrary continuous function f.
If, in addition, the sequence of partitions of [0, t] is refining, that is, p_n ⊂ p_{n+1},
then J_{B,n}(B) converges also a.s. to the limit in Eq. 4.3. The Ito integral differs
from B(t)²/2, that is, the expression of the integral ∫_0^t B dB obtained by the
formal use of classical calculus. ◊
Proof: The notation J_{B,n}(B) is used for simplicity although it is slightly inconsistent with
the definition of S_{f,p}(g) in Eq. 4.1. We have

where the last equality holds since Σ_{k=1}^{m_n} ((B(t_k^(n)))² − (B(t_{k-1}^(n)))²) is a telescoping series
whose sum is B(t)². The m.s. limit of J_{B,n}(B) is (B(t)² − t)/2 since

approaches zero as n → ∞, where ΔB_k = B(t_k^(n)) − B(t_{k-1}^(n)). Hence, the Ito integral
∫_0^t B dB can be defined as the m.s. limit of J_{B,n}(B). The integral can also be defined as
the limit in probability of J_{B,n}(B) because m.s. convergence implies convergence in
probability (Section 2.13).
Because lim_{n→∞} Σ_{k=1}^{m_n} [B(t_k^(n)) − B(t_{k-1}^(n))]² = t holds a.s. for a sequence of
refining partitions of [0, t] (Section 3.13 in this book, [147], Theorem 28, p. 18), J_{B,n}(B)
converges a.s. to (B(t)² − t)/2 for refining partitions so that in this setting the Ito integral
can be defined as the a.s. limit of J_{B,n}(B). •
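The convergence of the left-endpoint sums to (B(t)² − t)/2 is easy to observe numerically. The NumPy sketch below (an illustration under assumed parameters, not the book's code) evaluates J_{B,n}(B) on a sequence of refining dyadic partitions read off a single fine Brownian path:

```python
import numpy as np

rng = np.random.default_rng(3)
t, m_max = 1.0, 2 ** 17

# One fine Brownian path; coarser refining (dyadic) partitions are read off it
dB = np.sqrt(t / m_max) * rng.standard_normal(m_max)
B = np.concatenate([[0.0], np.cumsum(dB)])

def ito_sum(level):
    # Left-endpoint sum sum_k B(t_{k-1}) [B(t_k) - B(t_{k-1})] on 2**level intervals
    Bn = B[::m_max // 2 ** level]
    return np.sum(Bn[:-1] * np.diff(Bn))

exact = 0.5 * (B[-1] ** 2 - t)                       # (B(t)^2 - t)/2, Eq. 4.3
errors = [abs(ito_sum(lv) - exact) for lv in (4, 8, 12, 17)]
```

The error at the finest level is small because, along refining partitions, the sum of squared increments converges a.s. to t.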
J̄_{B,n}(B) = Σ_{k=1}^{m_n} B(t_k′^(n)) [B(t_k^(n)) − B(t_{k-1}^(n))]

that differs from J_{B,n}(B) in Example 4.1 by the choice of the intermediate points,
t_k′^(n) = (1 − θ) t_{k-1}^(n) + θ t_k^(n), θ ∈ [0, 1], instead of t_{k-1}^(n). The limit of J̄_{B,n}(B) as
n → ∞ exists in m.s. and in probability and is

l.i.m._{n→∞} J̄_{B,n}(B) = (1/2) B(t)² + (θ − 1/2) t.    (4.4)
∫_0^t B(s) ∘ dB(s) = ∫_0^t B ∘ dB = (1/2) B(t)².    (4.5)
The Stratonovich integral coincides with the result obtained by the formal use of
classical calculus. The difference between the Ito and Stratonovich integrals
is caused by the relationship between the integrands B(t_{k-1}^(n)) and B(t_k′^(n)) and the
integrators B(t_k^(n)) − B(t_{k-1}^(n)) of these integrals. They are independent for the Ito
integral but are dependent for the Stratonovich integral. ◊
Proof: Take a fixed θ ∈ [0, 1] and denote by ΔB_k, ΔB_k′, and ΔB_k″ the increments of B in
(t_{k-1}^(n), t_k^(n)), (t_{k-1}^(n), t_k′^(n)), and (t_k′^(n), t_k^(n)), respectively, where
Δt_k = t_k^(n) − t_{k-1}^(n). With this notation we have

J̄_{B,n}(B) = Σ_{k=1}^{m_n} B(t_{k-1}^(n)) ΔB_k + Σ_{k=1}^{m_n} ΔB_k′ ΔB_k

since B(t_k′^(n)) = B(t_{k-1}^(n)) + ΔB_k′. We have shown that the first term of J̄_{B,n}(B) converges
in m.s. to (B(t)² − t)/2 as n → ∞ and its limit defines the Ito integral (Eq. 4.3). It remains
to show that the second term in J̄_{B,n}(B), that is, Σ_{k=1}^{m_n} ΔB_k′ ΔB_k, converges in m.s. to
θ t since X_n →m.s. X and Y_n →m.s. Y imply X_n + Y_n →m.s. X + Y (Section 2.13). The first
moment of the second term of J̄_{B,n}(B) is
since ΔB_k = ΔB_k′ + ΔB_k″ and E[ΔB_k′ ΔB_k″] = E[ΔB_k′] E[ΔB_k″] = 0. The second
moment of this term,
E[(Σ_{k=1}^{m_n} ΔB_k′ ΔB_k)²] = E[(Σ_{k=1}^{m_n} (ΔB_k′)² + Σ_{k=1}^{m_n} ΔB_k′ ΔB_k″)²]
l.i.m._{n→∞} J̄_{B,n}(B) = (1/2)(B(t)² − t) + θ t = (1/2) B(t)² + (θ − 1/2) t.

This limit coincides with the Ito integral for θ = 0 but differs from this integral for θ ≠ 0.
The Stratonovich integral results for θ = 1/2. •
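The θ-dependence of the limit in Eq. 4.4 can be seen numerically. In the NumPy sketch below (illustrative only), B(t_k′^(n)) is replaced by the surrogate B(t_{k-1}^(n)) + θ ΔB_k; this is an assumption of the illustration, justified because the surrogate has the same covariance θ Δt_k with the integrator increment, which is what drives the cross term in the proof above.

```python
import numpy as np

rng = np.random.default_rng(4)
t, m = 1.0, 200_000
dB = np.sqrt(t / m) * rng.standard_normal(m)
B = np.concatenate([[0.0], np.cumsum(dB)])

def theta_sum(theta):
    # sum_k B(t_k') dB_k with B(t_k') replaced by B(t_{k-1}) + theta * dB_k, a
    # surrogate with the same covariance E[dB_k' dB_k] = theta * dt_k per interval
    return np.sum((B[:-1] + theta * dB) * dB)

limit = lambda theta: 0.5 * B[-1] ** 2 + (theta - 0.5) * t   # Eq. 4.4
```

Evaluating theta_sum at θ = 0, 1/2, and 1 recovers the Ito integral, the Stratonovich integral, and the backward-point variant, respectively.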
The stochastic integrals in Examples 4.1 and 4.2 can be extended to define
stochastic processes. Take a fixed t > 0 and set

M(s) = ∫_0^t 1_{[0,s]}(u) B(u) dB(u) = ∫_0^s B(u) dB(u) and    (4.6)

where s ∈ [0, t]. These integrals can also be defined, for example, as the m.s. limits
of the sums

J_{B,n}(B)(s) = Σ_{k=1}^{m_n} B(t_{k-1}^(n)) [B(t_k^(n) ∧ s) − B(t_{k-1}^(n) ∧ s)] and    (4.8)

J̄_{B,n}(B)(s) = Σ_{k=1}^{m_n} B(t_k′^(n)) [B(t_k^(n) ∧ s) − B(t_{k-1}^(n) ∧ s)].    (4.9)

Calculations as in Examples 4.1 and 4.2 can be used to show that J_{B,n}(B)(s) and
J̄_{B,n}(B)(s) converge in m.s. as n → ∞ to (B(s)² − s)/2 and B(s)²/2 + (θ − 1/2) s,
respectively.
Example 4.3: Let t > 0 be a fixed time and M be the stochastic process in
Eq. 4.6. Then (1) E[M(s)] = 0, (2) E[M(s)²] = s²/2, (3) M(s) = M(a) +
∫_a^s B(u) dB(u) for a ∈ (0, s), (4) M has continuous samples, and (5) M is a
square integrable martingale with respect to the natural filtration F_t = σ(B(s), 0 ≤
s ≤ t) of the Brownian motion B. ◊
Proof: The stated properties of M follow from Eq. 4.3 and properties of the Brownian motion.
However, we present here a direct proof of these properties.
With the notations B_{k-1} = B(t_{k-1}^(n)), ΔB_k = B(t_k^(n) ∧ s) − B(t_{k-1}^(n) ∧ s), and
F_k = F_{t_k^(n)}, the first moment of J_{B,n}(B)(s) is
for each n and s ∈ [0, t] because ΔB_k = 0 for s ≤ t_{k-1}^(n) and E[ΔB_k | F_{k-1}] = E[ΔB_k] =
0 for s > t_{k-1}^(n). The second moment of J_{B,n}(B)(s) is

E[(J_{B,n}(B)(s))²] = E[Σ_{k,l} B_{k-1} B_{l-1} ΔB_k ΔB_l]

since B_{k-1} and B_{l-1} are F_{k-1}-measurable for k > l, ΔB_k is independent of F_{k-1},
E[ΔB_k] = 0, and E[(ΔB_k)²] = t_k^(n) ∧ s − t_{k-1}^(n) ∧ s. Hence, property (3) follows from

∫_0^s B(u) dB(u) = ∫_0^t 1_{[0,a]}(u) B(u) dB(u) + ∫_0^t 1_{[a,s]}(u) B(u) dB(u)

and the linearity of J_{B,n}(B)(·) with respect to the integrand, where 0 ≤ a ≤ s ≤ t.
A proof that M is sample continuous can be found in [42] (Theorem 2.6, p. 38). The
proof starts with the observation that J_{B,n}(B)(·) is a sample continuous process
for s ∈ [t_{k-1}^(n), t_k^(n)).
It is left to show that M is a square integrable martingale. This result also follows
from Eq. 4.3. Here we only sketch a direct proof of this property. We show that J_{B,n}(B)
is square integrable, that is, E[(J_{B,n}(B)(s))²] < ∞, s ∈ [0, t], J_{B,n}(B)(s) ∈ F_s,
0 ≤ s ≤ t, and E[J_{B,n}(B)(s) | F_a] = J_{B,n}(B)(a) for s ≥ a (Section 3.11). We have
already shown that E[(J_{B,n}(B)(s))²] is finite. That J_{B,n}(B)(s) is F_s-adapted follows
from the definition of J_{B,n}(B)(s). The process J_{B,n}(B)(s) has the martingale property
E[J_{B,n}(B)(s) | F_a] = J_{B,n}(B)(a), s ≥ a, since

and

where B_{q-1} and ΔB_q have the same meaning as in the previous equations but are restricted
to the interval [a, s]. The above equalities hold by properties of the Brownian motion.
Because the martingale property is preserved by L₂-limits, M given by Eq. 4.6 is a square
integrable martingale ([42], Proposition 1.3, p. 13). •
Example 4.4: Consider a fixed time t and the stochastic process M̄ in Eq. 4.7.
This process, that is, the Stratonovich integral of B with respect to B on [0, s], is
not a martingale with respect to the natural filtration of B. ◊
Proof: Properties of the sequence of sums defining the Stratonovich integral can be used
to show that M̄ is not a martingale, as was done in the previous example. An alternative
approach is taken here. We note that M̄(t) = ∫_0^t B ∘ dB = B(t)²/2 (Eq. 4.5) cannot be
an F_t = σ(B(s), 0 ≤ s ≤ t)-martingale since its expectation E[B(t)²/2] = t/2 is not
constant. •
Example 4.5: The sample paths of the Ito and Stratonovich integrals, ∫_0^t B dB
and ∫_0^t B ∘ dB, differ and the difference between these integrals increases in time.
Figure 4.1 shows a sample of ∫_0^t B dB and a sample of ∫_0^t B ∘ dB corresponding
to the same sample path B(·, ω) of a Brownian motion B, that is, the functions
(B(t, ω)² − t)/2 and B(t, ω)²/2, respectively.
Samples of the Ito and the Stratonovich integrals corresponding to the same
sample B(·, ω) of a Brownian motion B are also shown in Fig. 4.2 with solid lines.
The dotted lines in this figure give sample paths of J_{B,n}(B)(·) and J̄_{B,n}(B)(·) for
two partitions p_n of [0, t] with points t_k^(n) = k t/m_n, k = 0, 1, ..., m_n, m_n = 10,
and m_n = 100 obtained from the same sample B(·, ω) of B. These plots suggest
that the sample paths of J_{B,n}(B)(·) and J̄_{B,n}(B)(·) approach the sample paths of
the Ito and Stratonovich integrals, respectively, as n increases. ◊
∫_0^t N_−(s) dN(s) = ∫_0^t N(s−) dN(s) = (1/2)(N(t)² − N(t)).    (4.10)
Proof: We first show that the Ito and the path by path Riemann-Stieltjes integrals of N_−
with respect to N coincide. Consider a classical Riemann-Stieltjes integral ∫_a^b α(x) dβ(x),
where β is a step function with jumps β_k at x_k, k = 1, 2, ..., n. If α and β are such that
they are not discontinuous from the right or from the left at each x_k simultaneously, then
([5], Theorem 7.11, p. 148)

∫_a^b α(x) dβ(x) = Σ_{k=1}^n α(x_k) β_k.
Figure 4.2. Samples of J_{B,n}(B) and J̄_{B,n}(B) approximating ∫ B dB and ∫ B ∘ dB for m_n = 10 and m_n = 100
Figure 4.3. A hypothetical sample of the Poisson process N and the corresponding sample of the Ito integral ∫ N_− dN
This result implies

∫_0^t N(s−, ω) dN(s, ω) = Σ_{k=1}^{N(t,ω)} N(T_{k-1}(ω), ω) [N(T_k(ω), ω) − N(T_{k-1}(ω), ω)]
= Σ_{k=1}^{N(t,ω)} (k − 1) = (1/2)(N(t, ω)² − N(t, ω)),
where T_k, k = 1, 2, ..., denote the jump times of N and T_0 = 0. Hence, the path by path
integral ∫_0^t N(s−, ω) dN(s, ω) coincides with the Ito integral in agreement with a result in
[147] (Theorem 17, p. 54), and J_{N,n}(N) converges a.s. to ∫_0^t N_− dN as n → ∞.
It remains to show that J_{N,n}(N) converges in m.s. to ∫_0^t N_− dN as n → ∞. Note
that J_{N,n}(N) can be written as

J_{N,n}(N) = (1/2) Σ_{k=1}^{m_n} (N(t_k^(n))² − N(t_{k-1}^(n))²) − (1/2) Σ_{k=1}^{m_n} (N(t_k^(n)) − N(t_{k-1}^(n)))²
= (1/2) N(t)² − (1/2) Σ_{k=1}^{m_n} (N(t_k^(n)) − N(t_{k-1}^(n)))²

so that J_{N,n}(N) is an increasing sequence such that 0 ≤ J_{N,n}(N) ≤ N(t)²/2 a.s. for each
n. Hence, we have J_{N,n}(N) → ∫_0^t N_− dN in m.s. as n → ∞ by dominated convergence and
the a.s. convergence of J_{N,n}(N) to ∫_0^t N_− dN (Section 2.5.2.2).
We also present here an alternative proof for the m.s. convergence of J_{N,n}(N) to
∫_0^t N_− dN = (N(t)² − N(t))/2 based on direct calculations and elementary arguments.
We need to show that the expectation of the square of

J_{N,n}(N) − (1/2)(N(t)² − N(t)) = (1/2) Σ_k [ΔN_k − (ΔN_k)²]

converges to zero as n → ∞, where ΔN_k = N(t_k^(n)) − N(t_{k-1}^(n)), that is, the sum

Σ_{k,l} E[(ΔN_k − (ΔN_k)²)(ΔN_l − (ΔN_l)²)]

converges to zero as Δt_k → 0, where Δt_k = t_k^(n) − t_{k-1}^(n). The first four moments of the
increments of N used to establish the above expression can be obtained from the observation
that the cumulants χ_r of any order r of ΔN_k are λ Δt_k and the relationships μ_1 = χ_1,
μ_2 = χ_2 + χ_1², μ_3 = χ_3 + 3 χ_1 χ_2 + χ_1³, and μ_4 = χ_4 + 3 χ_2² + 4 χ_1 χ_3 + 6 χ_1² χ_2 + χ_1⁴
between the moments μ_r of order r and the cumulants of a random variable ([79], p. 83
and p. 377). The cumulant of order r of a random variable with characteristic function φ is
equal to the value of (√−1)^{−r} d^r log φ(u)/du^r at u = 0. •
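The path by path evaluation of ∫_0^t N_− dN is easy to reproduce. The sketch below (a NumPy illustration under assumed parameters, not the book's code) builds the jump times of N from iid exponential interarrival times; at the k-th jump the integrand N(s−) equals k − 1, so the Riemann-Stieltjes sum telescopes exactly to Eq. 4.10.

```python
import numpy as np

rng = np.random.default_rng(5)
lam, t = 2.0, 10.0

# Jump times T_1 < T_2 < ... of a rate-lam Poisson process, truncated to [0, t]
arrivals = np.cumsum(rng.exponential(1.0 / lam, 100))
arrivals = arrivals[arrivals <= t]
n_t = len(arrivals)                        # N(t)

# At the k-th jump the integrand N(s-) equals k - 1, so the path by path sum is
integral = sum(k - 1 for k in range(1, n_t + 1))
exact = (n_t ** 2 - n_t) // 2              # (N(t)^2 - N(t))/2, Eq. 4.10
```

Unlike the Brownian motion examples, the identity here is exact for every sample path, not just in the limit of refining partitions.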
(3) extends this definition to stochastic integrals with càglàd integrands and semimartingale
integrators, and (4) gives properties of the stochastic integral that are
useful for calculations. The presentation in this and the following two sections is
based on [147].
The Ito integrals ∫_0^t B dB and ∫_0^t N_− dN in the previous section are special
cases of the stochastic integral defined here. The stochastic integral differs significantly
from the integrals in Section 3.9.3. The integrals in Section 3.9.3 (1) have
random integrand and deterministic integrator or vice versa and (2) the random
integrand/integrator of these integrals are in L₂. In contrast, both the integrand
and integrator of the stochastic integral considered here are stochastic processes
whose expectations may not exist.
4.4.1 Semimartingales
Let (Ω, F, (F_t)_{t≥0}, P) be a filtered probability space. We denote by S,
D, and L the classes of simple predictable processes, F_t-adapted processes with
càdlàg paths, and F_t-adapted processes with càglàd paths, respectively.
Let 0 = T_0 ≤ T_1 ≤ ··· ≤ T_{n+1} < ∞ be a finite sequence of F_t-stopping
times and H_i denote random variables such that H_i ∈ F_{T_i} and |H_i| < ∞ a.s.,
i = 0, 1, ..., n, that is, |H_i(ω)| < ∞ for ω ∈ Ω \ Ω_0 and P(Ω_0) = 0.
Note: We have S ⊂ L since the processes in S are F_t-adapted with càglàd, piecewise
constant sample paths.
The classes of processes S, D, and L are vector spaces. For example, the sum of
two F_t-adapted processes with càdlàg paths is an F_t-adapted process with càdlàg paths.
Also, multiplying an F_t-adapted process with càdlàg paths by a real constant does not
change its type, so that D is a vector space. ▲
Note: The mapping I_X is linear and its range is L₀ since I_X(H) is a finite sum of random
variables H_i weighted by increments of X. We also note that I_X(H) resembles J_{B,n}(B)
and J_{N,n}(N) used in Examples 4.1 and 4.6 to define the stochastic integrals ∫ B dB and
∫ N_− dN, respectively. ▲
To check whether I_X is a continuous mapping, we need to define the meaning
of convergence in S and L₀. A sequence H_n in S is said to converge uniformly
in (t, ω) to H ∈ S if, for any ε > 0, there is an integer n_ε ≥ 1 independent
of (t, ω) such that |H_n(t, ω) − H(t, ω)| < ε for n ≥ n_ε or, equivalently,
sup_{ω∈Ω, t≥0} |H_n(t, ω) − H(t, ω)| → 0 as n → ∞. The convergence in probability
is used in L₀.
Note: The continuity of I_X means that the uniform convergence H_n → H in S implies
the convergence I_X(H_n) →pr I_X(H) in L₀ as n → ∞.
The definition of the (total) semimartingale shows that the collection of (total) semimartingales
is a vector space because, if X_1 and X_2 are (total) semimartingales and c_1, c_2
are some constants, then c_1 X_1 + c_2 X_2 is a (total) semimartingale. ▲
Example 4.7: Let

C(t) = Σ_{k=1}^{N(t)} Y_k, t ≥ 0,    (4.13)

be a compound Poisson process, where N is a Poisson process with respect to a
filtration F_t and the iid random variables Y_k are F_0-measurable. Then C_−(t) =
C(t−) = lim_{s↑t} C(s) is a simple predictable process. ◊
Proof: Recall that a Poisson process N is an adapted counting process without explosions
that has stationary independent increments (Section 3.12). We have
{ω : T_n(ω) ≤ t} = {ω : N(t, ω) ≥ n} ∈ F_t for each t,
so that the jump times T_i of N are F_t-stopping times (see also [147], Theorem 22, p. 14).
We have H_i = Σ_{j=1}^i Y_j ∈ F_0 ⊆ F_{T_i}, i ≥ 1, |H_i| < ∞ a.s., and T_{n+1} < ∞ a.s. for
each n. The jump times are finite because P(T_{n+1} > t) ≤ E[T_{n+1}]/t by Chebyshev's
inequality and E[T_{n+1}] = (n + 1)/λ so that P(T_{n+1} > t) approaches zero as t → ∞,
where λ denotes the intensity of the Poisson process. Hence, C_− is in S. •
[Figure 4.4: a sample C_−(·, ω) of the left-continuous version of the compound Poisson process C.]
where Φ denotes the distribution of the standard Gaussian variable. Figure 4.5
shows the distribution of I_X(H) for n = 2 and two values of the intensity parameter
λ of the Poisson process. ◊
Note: The increments B(T_{i+1}) − B(T_i) of the Brownian motion are independent Gaussian
variables with mean zero and variance T_{i+1} − T_i conditional on the jump times T_i. Hence,
I_X(H) is a zero mean Gaussian variable with variance (T_2 − T_1) + 4 (T_3 − T_2) conditional
on the jump times. The formula for P(I_X(H) ≤ ξ) follows from the distribution P(T >
η) = exp(−λ η) of the time T between consecutive jumps of N. This distribution depends
only on η and λ since N has stationary independent increments. ▲
Note: Local martingales were defined in Section 3.11. A (local) martingale X is locally
square integrable if there exists an increasing sequence T_n, n ≥ 1, of stopping times such
that lim_{n→∞} T_n = ∞ a.s. and such that for every n ≥ 1 the process X^{T_n} is a square
integrable martingale.
These criteria show that a Brownian motion B is a semimartingale because B has
continuous paths and martingales are local martingales. A compound Poisson process C is
also a semimartingale because C is an adapted process with càdlàg paths of finite variation
on any bounded time interval (0, t]. We have also seen that C is a martingale if its jumps
Y_k are in L₁ and have mean zero. ▲
Note: We have used the representation of Lévy processes in Section 3.14.2 to show that
these processes are classical semimartingales and therefore semimartingales. The criteria
stated previously in this section show that decomposable processes are classical semimartingales.
Similar considerations can be applied to show that other processes are semimartingales.
For example, a compound Poisson process C(t) = Σ_{k=1}^{N(t)} Y_k with Y_1 ∈ L₁ is
a semimartingale since
the compensated compound Poisson process C̃(t) = C(t) − λ t E[Y_1] is a martingale, and
λ t E[Y_1] is adapted with continuous samples of finite variation on compacts (Eq. 4.15). ▲
Example 4.9: The square of a Brownian motion B, that is, the process X(t) =
B(t)², t ≥ 0, is a semimartingale. ◊
Note: The process X can be given in the form X(t) = M(t) + A(t), where M(t) = B(t)² − t
is a square integrable martingale, A(t) = t has continuous samples of finite variation on
compacts, and M(0) = A(0) = 0. Hence, X is a semimartingale.
We have seen that B² is a submartingale. That B² is a semimartingale also results
from the fact that submartingales are semimartingales. Let Y be a submartingale. Then Y is
a semimartingale since Y(t) = (Y(t) − E[Y(t)]) + E[Y(t)], Y(t) − E[Y(t)] is a martingale,
and E[Y(t)] is an increasing function so that it is of finite variation on compacts. ▲
If H ∈ S, X ∈ D, and 0 = T_0 ≤ T_1 ≤ ··· ≤ T_{n+1} < ∞ are stopping times, the
stochastic integral of H with respect to X on [0, t] is

J_X(H)(t) = H_0 X(0) + Σ_{i=1}^n H_i [X(t ∧ T_{i+1}) − X(t ∧ T_i)], t ≥ 0.    (4.16)

Alternative notations for the stochastic integral J_X(H)(t) defined by Eq. 4.16 are
∫_0^t H(s) dX(s), ∫_0^t H dX, and (H · X)(t). If the time t = 0 is excluded from the
definition in Eq. 4.16, the resulting stochastic integral, denoted by ∫_{0+}^t H dX, is

∫_{0+}^t H(s) dX(s) = Σ_{i=1}^n H_i [X(t ∧ T_{i+1}) − X(t ∧ T_i)]

so that

∫_0^t H(s) dX(s) = H_0 X(0) + ∫_{0+}^t H(s) dX(s).    (4.17)
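Since Eq. 4.16 is a finite sum, it can be evaluated directly. The sketch below is a hypothetical illustration (not the book's code) of J_X(H)(t) for a simple predictable H; a deterministic callable path stands in for a sample of the semimartingale X, and the names J, H0, H, T are choices of this illustration.

```python
def J(H0, H, T, X, t):
    # Eq. 4.16: J_X(H)(t) = H0 * X(0) + sum_i H_i [X(t ^ T_{i+1}) - X(t ^ T_i)],
    # with T = (T_0 = 0, T_1, ..., T_{n+1}) and X a callable path (a deterministic
    # path stands in here for a semimartingale sample)
    total = H0 * X(0.0)
    for i in range(1, len(T) - 1):
        total += H[i - 1] * (X(min(t, T[i + 1])) - X(min(t, T[i])))
    return total

# H = 2 on (T_1, T_2] = (0, 1] and H = -1 on (T_2, T_3] = (1, 3], against X(s) = s:
val = J(0.0, [2.0, -1.0], [0.0, 0.0, 1.0, 3.0], lambda s: s, 2.0)   # 2*1 - 1*1 = 1
```

The truncation t ∧ T_i in each term is what makes J_X(H)(·) a process: increments of X after time t do not contribute.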
Figure 4.6 illustrates the definition of J_X(H)(·) in Eq. 4.16 for t ∈ [0, 8].
Proof: The properties in Eqs. 4.18 and 4.19 follow from the definitions of I_X(H) and
J_X(H), the linearity of these integrals with respect to their integrands and integrators, and
the fact that S and D are vector spaces. For the property in Eq. 4.20 take 0 ≤ s ≤ t and
note that (Eq. 4.19)

∫_0^t H dX = H_0 X(0) + ∫_0^t (1_{(0,s]}(u) + 1_{(s,t]}(u)) H(u) dX(u)
= H_0 X(0) + ∫_{0+}^s H dX + ∫_{s+}^t H dX = ∫_0^s H dX + ∫_{s+}^t H dX

as stated. •
Note: The ucp convergence over a time interval [0, t] implies the convergence in probability
at an arbitrary time r ∈ [0, t] since

{ω : |H_n(r, ω) − H(r, ω)| > ε} ⊆ {ω : sup_{0≤s≤r} |H_n(s, ω) − H(s, ω)| > ε}

for any ε > 0 and r ∈ [0, t], P(sup_{0≤s≤t} |H_n(s) − H(s)| > ε) → 0 as n → ∞, and probability
is an increasing set function. ▲
The following two facts, given without proof, yield the desired generaliza-
tion of the stochastic integral in Eq. 4.16.
Note: These results show that any process H ∈ L can be approximated by a simple predictable
process in the sense of the ucp convergence and suggest that most of the properties
of the stochastic integral in Eq. 4.16 hold also for its extension J_X : L → D.
The extended definition of the stochastic integral includes the integrals ∫_0^t B dB and
∫_0^t N_− dN in Examples 4.1 and 4.6 because their integrands, B and N_−, are in L and their
integrators are semimartingales. ▲
Note: Recall that ΔY(t) = Y(t) − Y(t−) is the jump process of a càdlàg process Y. Our
illustration of the definition of the stochastic integral in Fig. 4.6 shows that the jumps of the
stochastic integral are equal to H(T_i) ΔX(T_i), where T_i, i = 1, 2, ..., denote the jump times
of the integrator, in agreement with the above property. ▲
the stochastic integral has been illustrated in Example 4.6 showing that the Ito integral
∫_0^t N_− dN and the path by path Riemann-Stieltjes integral ∫_0^t N(s−, ω) dN(s, ω) coincide.
If H ∈ L, there exists a sequence H_n of processes in S converging to H in ucp so
that we can find a subsequence H_{n_k} of H_n that converges a.s. to H (Section 2.13). The
integrals H_{n_k} · X can be calculated as path by path Riemann-Stieltjes integrals, that is,
as X is a square integrable martingale. The above calculations have used the martingale
properties of X. It remains to show that H · X has the martingale property. For s ≤ t the
integral (H · X)(t) is the sum of (H · X)(s) and terms H_i [X(T_{i+1} ∧ t) − X(T_i ∧ t)] for
T_i > s. The conditional expectation of these terms,

E[H_i (X(T_{i+1} ∧ t) − X(T_i ∧ t)) | F_s]
= E{E[H_i (X(T_{i+1} ∧ t) − X(T_i ∧ t)) | F_{T_i}] | F_s}
= E{H_i E[X(T_{i+1} ∧ t) − X(T_i ∧ t) | F_{T_i}] | F_s}

is zero since E[X(T_{i+1} ∧ t) − X(T_i ∧ t) | F_{T_i}] = 0 by the optional sampling theorem
(Section 3.11.2) so that H · X has the property E[(H · X)(t) | F_s] = (H · X)(s).
Stochastic integrals with Brownian motion integrator are of the type considered
here. For example, ∫_0^t B(s) dB(s) = (B(t)² − t)/2 is a martingale that is also square
integrable (Example 4.6) and ∫_0^t H(s) dB(s), H ∈ L, is a square integrable martingale if
E[∫_0^t H(s)² ds] < ∞, as we will see later in this chapter (Eq. 4.36). ▲
We state without proof an additional property of the stochastic integral,
but first we need to introduce a new concept. A sequence of stopping times
σ_n = (0 = T_0^(n) ≤ T_1^(n) ≤ ··· ≤ T_{k_n}^(n)) is said to tend to the identity if
(1) lim_{n→∞} sup_k T_k^(n) = ∞ a.s. and (2) sup_k |T_{k+1}^(n) − T_k^(n)| converges a.s. to zero
as n → ∞.
If X is a semimartingale as stated at the beginning of this section, Y is a process
in D or L, and σ_n is a sequence of random partitions tending to the identity, then

∫_0^t G(s) dM(s) = ∫_0^t G(s) dN(s) − λ ∫_0^t G(s) ds = −λ (t ∧ T_1)

so that G · M is not a martingale. On the other hand, Eq. 4.16 with (H, X) replaced by
(H, M) yields

∫_0^t H(s) dM(s) = ∫_0^t H(s) dN(s) − λ ∫_0^t H(s) ds = N(t ∧ T_1) − λ (t ∧ T_1).

That H · M is a martingale follows from
Example 4.11: Consider the stochastic integral J(t) = ∫_0^t C_− dB, where C is a
compound Poisson process (Eq. 4.13) and B denotes a Brownian motion that is
independent of C. If the jumps Y_k of C are in L₂, then

J(t) = ∫_0^t C(s−) dB(s) = Σ_{i=1}^∞ H_i [B(T_{i+1} ∧ t) − B(T_i ∧ t)],

E[J(t)] = 0, and

E[J(t) J(s)] = Σ_{i=1}^∞ E[H_i²] E[T_{i+1} ∧ t ∧ s − T_i ∧ t ∧ s],

where the last two equalities follow from properties of the increments of the Brownian
motion B and the independence between C and B. •
4.5.1 Definition
Note: It can be shown that any semimartingale X has a unique continuous local martingale
part X^c and that [X^c, X^c] = [X, X]^c ([147], p. 63). For example, X^c for the semimartingale
X = B + C is X^c = B so that [X, X]^c(t) = t, [X, X](t) = t + Σ_{k=1}^{N(t)} Y_k²,
and [X, X](t) − [X, X]^c(t) = Σ_{k=1}^{N(t)} Y_k² = Σ_{0≤s≤t} (ΔC(s))², where B and C denote
a Brownian motion and a compound Poisson process, respectively, and N, Y_k are defined
by Eq. 4.13. The process [X, X](t) − [X, X]^c(t) is given by the sum of the squares of
all jumps of C in [0, t]. This result holds for any semimartingale X (Eq. 4.25) because
semimartingales can have only jump discontinuities and the number of jumps is at most
countable.
We also note that the increment ΔX(0) = X(0) − X(0−) of X at t = 0 is X(0)
since X(0−) = 0 by assumption so that [X, X](0) = X(0)² and [X, X]^c(0) = 0. ▲
Note: The definition of [X, Y] in Eq. 4.26 is meaningful since X, Y are semimartingales
so that X_−, Y_− ∈ L and the above stochastic integrals are well defined. The quadratic
covariation of X and X coincides with the quadratic variation of X in Eq. 4.24. ▲
4.5.2 Properties
Polarization identity.
Proof: Since the collection of semimartingales defines a linear space, the quadratic vari-
ation process [X + Y, X + Y] is defined. By properties of the stochastic integral and
Eqs. 4.24-4.26 we have [X+ Y, X + Y] = [X, X] + [Y, Y] + 2 [X, Y], which yields the
polarization identity. •
Integration by parts.
The process [X, X] in Eq. 4.24 (1) is adapted with càdlàg, increasing samples,
(2) can be obtained for each t > 0 and s ∈ [0, t] from

X(0)² + Σ_{k=1}^{m_n} (X(t_k^(n) ∧ s) − X(t_{k-1}^(n) ∧ s))² →ucp [X, X](s),    (4.29)

as n → ∞, where we used ∫_0^t X(s−) dX(s) = X(0−) X(0) + ∫_{0+}^t X(s−) dX(s) and
X(0−) = 0. The identity (b − a)² = b² − a² − 2a(b − a) with a = X(t_{k-1}^(n)) and
b = X(t_k^(n)) gives

(ΔX(t))² = X(t)² − X(t−)² − 2 X(t−) ΔX(t) = Δ(X²)(t) − 2 X(t−) ΔX(t) = Δ[X, X](t)

by the first property in Section 4.4.4 and Eq. 4.24. •
as n-+ oo since supk fX(t~n))- X(t~~ 1 )i-+ 0 and I::;,1 fX(t~n))- X(t~~1 )i < oo by
the continuity and the finite variation of the samples of X, respectively. •
For each t > 0 and partitions p_n = (0 = t_0^{(n)} < t_1^{(n)} < ··· < t_{m_n}^{(n)} = t) such that Δ(p_n) → 0 as n → ∞ we have

X(0) Y(0) + Σ_{k=1}^{m_n} (X(t_k^{(n)} ∧ s) − X(t_{k−1}^{(n)} ∧ s)) (Y(t_k^{(n)} ∧ s) − Y(t_{k−1}^{(n)} ∧ s)) →^{ucp} [X, Y](s),  s ∈ [0, t],   (4.30)

and [X, Y](0) = X(0) Y(0) and Δ[X, Y](t) = ΔX(t) ΔY(t) ([147], Theorem 23, p. 61).
where p_n = (0 = t_0^{(n)} < ··· < t_{m_n}^{(n)} = t) is a partition of [0, t] such that Δ(p_n) → 0 as n → ∞. The first summation on the right side of the above equation converges in ucp to ∫ X_− dX as n → ∞. The second summation on the right side of the above equation is

Σ_{k=1}^{m_n} X(t_{k−1}^{(n)}) (X(t_k^{(n)}) − X(t_{k−1}^{(n)})) + Σ_{k=1}^{m_n} (X(t_k^{(n)}) − X(t_{k−1}^{(n)}))²

so that it converges in ucp to ∫₀ᵗ X_− dX + Σ_{0<s≤t} (ΔX(s))² as n → ∞. Hence,

X(t)² − X(0)² = 2 ∫₀ᵗ X(s−) dX(s) + Σ_{0<s≤t} (ΔX(s))²  or
Note: The Kunita-Watanabe inequality discussed in the following section (Eq. 4.32) gives |[X, Y](t)| ≤ ([X, X](t) [Y, Y](t))^{1/2} for H = K = 1 so that [X, X]^c(t) = 0 implies [X, Y]^c(t) = 0. Hence, [X, Y](t) is equal to Σ_{0<s≤t} ΔX(s) ΔY(s) by Eq. 4.30. This result shows that, for example, the quadratic covariation of a compound Poisson process and a Brownian motion is zero. ▲
Example 4.13: Consider the compound Poisson process C in Eq. 4.13. This process is a quadratic pure jump semimartingale with quadratic variation [C, C](t) = Σ_{k=1}^{N(t)} Y_k². If the jumps of C are one a.s., the compound Poisson process C coincides with a Poisson process N so that [N, N](t) = Σ_{k=1}^{N(t)} 1 = N(t). ◊

Proof: The compound Poisson process C is a semimartingale since it is adapted and has cadlag paths of finite variation on compacts (Section 4.4.1). We have seen in Example 4.12 that Δ[C, C](t) = Y_{N(t)}² if C has a jump at time t and Δ[C, C](t) = 0 otherwise. Hence, the quadratic variation of C in a time interval (0, t] is [C, C](t) = Σ_{k=1}^{N(t)} Y_k². •
Note: We have [X, X](t) = X(0)² by one of the properties in this section so that ∫₀ᵗ X_− dX results from Eq. 4.24. For example, take X(t) = A cos(ν t), where A is a real-valued random variable and ν > 0 is a real number. Then
Example 4.16: The quadratic covariation [B, C] of a Brownian motion B and the
compound Poisson process C is zero. The process [B + C, B + C] is equal to the
sum of the quadratic variations of B and C, that is,
N(t)
[B + C, B + C](t) = [B, B](t) + [C, C](t) = t + LYf.
k=l
Figure 4.7 shows a hypothetical sample path of C and [B + C, B + C]. The linear growth of [B + C, B + C] between the jump times of C corresponds to the quadratic variation [B, B](t) = t of the Brownian motion.

[Figure 4.7: a sample path C(·, ω) (top) and the corresponding path [B + C, B + C](·, ω) (bottom), plotted against time on [0, 9].]
n(dy) = |y|^{−(α+1)} dy, and α ∈ (0, 2) (Section 3.14.2). Figure 4.8 shows samples of the quadratic variation process [X, X] for ε = 0.1 and several values of α.
Figure 4.8. Samples of the quadratic variation of a Levy process with α = 1.0, 1.5, 1.7, and 1.99
Note: The generation of samples of X is based on the algorithm in Section 3.14.2 using the approximation X ≈ X^{(1)} + X^{(2)} + X^{(3,ε)}, where (X^{(1)}, X^{(2)}, X^{(3,ε)}) are mutually independent, X^{(1)}(t) = σ B(t) − a t, B is a Brownian motion, and X^{(2)} and X^{(3,ε)} are compound Poisson processes including jumps of X with magnitude in [1, ∞) and (ε, 1), respectively, and ε ∈ (0, 1). The plots in Fig. 4.8 are for ε = 0.1. ▲
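The two compound Poisson parts of this approximation can be sketched in code. The sampler below is an assumed reconstruction: the jump intensities and the inverse-CDF formula are derived from the stated Levy measure n(dy) = |y|^{−(α+1)} dy restricted to each magnitude range, and all parameter values are hypothetical:

```python
import numpy as np

ALPHA, EPS, T = 1.5, 0.1, 100.0    # stability index, truncation level, horizon (hypothetical)
rng = np.random.default_rng(1)

def truncated_power_jumps(a, b, alpha, horizon, rng):
    """Jump times and signed sizes of a compound Poisson process whose jump
    magnitudes have density proportional to |y|^-(alpha+1) on a <= |y| <= b."""
    b_term = 0.0 if np.isinf(b) else b ** -alpha
    lam = 2.0 / alpha * (a ** -alpha - b_term)      # total jump intensity (both signs)
    n = rng.poisson(lam * horizon)
    times = np.sort(rng.uniform(0.0, horizon, n))
    u = rng.uniform(0.0, 1.0, n)
    # inverse CDF of the normalized magnitude density on [a, b]
    mags = (a ** -alpha - u * (a ** -alpha - b_term)) ** (-1.0 / alpha)
    signs = rng.choice([-1.0, 1.0], n)              # symmetric Levy measure
    return times, signs * mags

t2, y2 = truncated_power_jumps(1.0, np.inf, ALPHA, T, rng)   # X^(2): jumps with |y| >= 1
t3, y3 = truncated_power_jumps(EPS, 1.0, ALPHA, T, rng)      # X^(3,eps): eps <= |y| <= 1
# [X, X] for the jump part is the running sum of squared jumps (Eq. 4.24):
times = np.concatenate([t2, t3]); jumps = np.concatenate([y2, y3])
qv = np.cumsum(jumps[np.argsort(times)] ** 2)
```

Plotting `qv` against the sorted jump times reproduces the qualitative behavior of Fig. 4.8: increasing step functions whose large jumps become rarer as α grows.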
∫₀^∞ |H(s)| |K(s)| |d[X, Y](s)| ≤ (∫₀^∞ H(s)² d[X, X](s))^{1/2} (∫₀^∞ K(s)² d[Y, Y](s))^{1/2}   (4.32)
Note: The integrals in Eq. 4.32 are defined since [X, X], [Y, Y] are cadlag, increasing, adapted processes and [X, Y] is a process with samples of finite variation on compacts. ▲
Proof: Set Z₁ = (∫₀^∞ H(s)² d[X, X](s))^{1/2} and Z₂ = (∫₀^∞ K(s)² d[Y, Y](s))^{1/2}. The Kunita-Watanabe inequality gives
integral and the fact that S is dense in ℒ. The above equality and the symmetry of [·, ·] give [H · X, K · Y](t) = ∫₀ᵗ H(s) d[X, K · Y](s) and [X, K · Y](s) = ∫₀ˢ K(u) d[X, Y](u) so that Eq. 4.34 results since d[X, K · Y](s) = K(s) d[X, Y](s). ▲
Σ_{k=1}^{m_n} Z(t_{k−1}^{(n)}) (X(t_k^{(n)}) − X(t_{k−1}^{(n)})) (Y(t_k^{(n)}) − Y(t_{k−1}^{(n)})) →^{ucp} ∫₀ᵗ Z(s−) d[X, Y](s)  as n → ∞,   (4.35)
Proof: We have

by the definition of the quadratic covariation (Eq. 4.26) and the associativity of the stochastic integral (Eq. 4.22), where Z_−(t) = Z(t−) and H · V denotes ∫ H dV. The sequence

Σ_{k=1}^{m_n} Z(t_{k−1}^{(n)}) [ X(t_k^{(n)}) Y(t_k^{(n)}) − X(t_{k−1}^{(n)}) Y(t_{k−1}^{(n)}) ]
E[ (∫₀ᵗ H(s) dM(s))² ] = E[ ∫₀ᵗ H(s)² d[M, M](s) ].   (4.36)
Proof: This property is very useful in applications since it provides a procedure for calculating the second moment of stochastic integrals with square integrable martingale integrators. For example, if M is a Brownian motion B, then Eq. 4.36 becomes

E[ (∫₀ᵗ H(s) dB(s))² ] = E[ ∫₀ᵗ H(s)² ds ]   (4.37)

since [B, B](t) = t. The result in Eq. 4.37 is referred to as the Ito isometry ([135], Lemma 3.1.5, p. 26).
The expectation on the right side of Eq. 4.36 exists by hypothesis. The expectation on the left side of this equation is defined because the stochastic integral H · M is a square integrable martingale by a preservation property of the stochastic integral (Section 4.4.4). This integral is zero at t = 0 by the definition of M. The assumption M(0) = 0 and Eq. 4.24 give

The expectation of the integral in this equation is zero since H · M is a square integrable martingale so that

E[ [H · M, H · M](t) ] = E[ (H · M)(t)² ].
We also have (Eq. 4.34)
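The Ito isometry in Eq. 4.37 also lends itself to a direct numerical check. The sketch below is a hypothetical Monte Carlo experiment (the integrand H(s) = B(s), grid size, and sample size are arbitrary choices, not from the text); for this integrand both sides of Eq. 4.37 equal t²/2 at t = 1:

```python
import numpy as np

# Monte Carlo check of E[(int_0^T H dB)^2] = E[int_0^T H^2 ds] with H(s) = B(s).
rng = np.random.default_rng(2)
T, n_steps, n_paths = 1.0, 400, 10000
dt = T / n_steps
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B_left = np.cumsum(dB, axis=1) - dB                # B(t_{k-1}) at the left endpoints
ito_integral = np.sum(B_left * dB, axis=1)         # sum_k B(t_{k-1}) (B(t_k) - B(t_{k-1}))
lhs = np.mean(ito_integral ** 2)                   # E[(int B dB)^2]
rhs = np.mean(np.sum(B_left ** 2, axis=1) * dt)    # E[int B^2 ds]
print(lhs, rhs, T ** 2 / 2)                        # all three values are close
```

The left-endpoint evaluation of the integrand is essential: it is what makes the discrete sums martingale increments, so that the cross terms in the expansion of the squared sum vanish.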
∫_{h(0)}^{h(t)} g′(u) du = ∫₀ᵗ g′(h(s)) dh(s),   (4.38)

d/dt [g(h(t))] = g′(h(t)) h′(t)  or  d[g(h(t))] = g′(h(t)) dh(t),   (4.39)
where the first derivatives g' and h' of functions g and h, respectively, are assumed
to exist. The Ito formula extends the rules of the classical calculus in Eqs. 4.38
and 4.39 to the case in which the deterministic function h is replaced with a semi-
martingale X.
The change of variables formula in Eq. 4.38 gives ∫₀ᵗ B(s) dB(s) = B(t)²/2 for g(y) = y²/2 and h replaced by a Brownian motion process B. This result is in disagreement with our previous calculations showing that the Ito integral ∫₀ᵗ B(s) dB(s) is equal to B(t)²/2 − t/2 (Eq. 4.3). On the other hand, the Stratonovich integral ∫₀ᵗ B(s) ∘ dB(s) = B(t)²/2 (Eq. 4.5) and the classical calculus give the same result.
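The disagreement can be seen directly from the defining Riemann sums of the two integrals: the Ito sum evaluates the integrand at the left endpoint of each partition interval, the Stratonovich sum at the midpoint. A minimal sketch (step count and seed are arbitrary):

```python
import numpy as np

# Left-endpoint (Ito) and midpoint (Stratonovich) Riemann sums for int_0^T B dB.
# The Ito sum approaches B(T)^2/2 - T/2, the Stratonovich sum equals B(T)^2/2.
rng = np.random.default_rng(3)
T, n = 1.0, 20000
dB = rng.normal(0.0, np.sqrt(T / n), size=n)
B = np.concatenate([[0.0], np.cumsum(dB)])         # B(t_0), ..., B(t_n)
ito = np.sum(B[:-1] * dB)                          # sum_k B(t_{k-1}) (B(t_k) - B(t_{k-1}))
strat = np.sum(0.5 * (B[:-1] + B[1:]) * dB)        # midpoint evaluation of the integrand
print(ito, B[-1] ** 2 / 2 - T / 2)                 # close for large n
print(strat, B[-1] ** 2 / 2)                       # equal up to rounding (telescoping sum)
```

The midpoint sum telescopes exactly to B(T)²/2 for every partition, while the left-endpoint sum differs from it by half the sum of squared increments, which converges to T/2: this is precisely the t/2 correction in Eq. 4.3.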
g(X(t)) − g(X(0)) = ∫₀₊ᵗ g′(X(s−)) dX(s) + (1/2) ∫₀₊ᵗ g″(X(s−)) d[X, X]^c(s)
 + Σ_{0<s≤t} [ g(X(s)) − g(X(s−)) − g′(X(s−)) ΔX(s) ]   (4.40)
Note: The notation g ∈ C²(ℝ) means that the function g : ℝ → ℝ has continuous second order derivative, and g′/g″ denote the first/second derivatives of g. The last two terms on the right side of Eq. 4.40 are not present in the classical calculus (Eqs. 4.38 and 4.39).
The Ito formula can also be given in the form

g(X(t)) − g(X(0)) = ∫₀₊ᵗ g′(X(s−)) dX(s) + (1/2) ∫₀₊ᵗ g″(X(s−)) d[X, X](s)
 + Σ_{0<s≤t} [ g(X(s)) − g(X(s−)) − g′(X(s−)) ΔX(s) − (1/2) g″(X(s−)) (ΔX(s))² ]   (4.42)
Proof: The proof of the Ito formula in Eq. 4.40 is presented in detail since we will use this
formula extensively. The proof, adapted from [147] (Theorem 32, p. 71), has two parts.
First, it is assumed that X is a continuous semimartingale. This result is then extended to
the case in which X is an arbitrary semimartingale. A heuristic proof of the Ito formula can
be found in [117] (Theorem 6.7.1, p. 184).
The following result is used in the proof. If g ∈ C²(ℝ) is defined on a bounded interval, the Taylor formula gives

g(x + h) − g(x) = h g′(x) + (h²/2) g″(x) + r(x, h),  h ∈ ℝ,   (4.43)

where |r(x, h)| ≤ h² a(|h|), a : [0, ∞) → [0, ∞) is increasing, and lim_{u↓0} a(u) = 0. Two cases are considered.
where

S₃^{(n)} = Σ_{k=1}^{m_n} r(X(t_{k−1}^{(n)}), X(t_k^{(n)}) − X(t_{k−1}^{(n)}))

correspond to the three terms of the Taylor formula in Eq. 4.43 applied to the intervals of p_n. The sums S₁^{(n)} and S₂^{(n)} converge in ucp to the Ito integrals ∫₀ᵗ g′(X(s−)) dX(s) and (1/2) ∫₀ᵗ g″(X(s−)) d[X, X](s) as n → ∞, respectively (Eqs. 4.23 and 4.35). The absolute value of S₃^{(n)} can be bounded by

|S₃^{(n)}| = | Σ_{k=1}^{m_n} r(X(t_{k−1}^{(n)}), X(t_k^{(n)}) − X(t_{k−1}^{(n)})) |
 ≤ max_{1≤k≤m_n} a(|X(t_k^{(n)}) − X(t_{k−1}^{(n)})|) Σ_{k=1}^{m_n} (X(t_k^{(n)}) − X(t_{k−1}^{(n)}))²
for each n. Because Σ_k (X(t_k^{(n)}) − X(t_{k−1}^{(n)}))² →^{ucp} [X, X](t) (Eq. 4.29), this sequence also converges to [X, X](t) in probability. We also have max_k a(|X(t_k^{(n)}) − X(t_{k−1}^{(n)})|) → 0 as n → ∞ since lim_{u↓0} a(u) = 0 and the function s ↦ X(s, ω) is continuous on [0, t] for almost all ω and, hence, uniformly continuous in this time interval, so that max_k |X(t_k^{(n)}) − X(t_{k−1}^{(n)})| → 0 a.s. as n → ∞.
In summary, we have shown that for each t ≥ 0 the sequences S₁^{(n)}, S₂^{(n)}, and S₃^{(n)} converge in probability to ∫₀ᵗ g′(X(s−)) dX(s), (1/2) ∫₀ᵗ g″(X(s−)) d[X, X](s), and zero, respectively, as n → ∞. Hence, we have

lim_{n→∞} P( | S₁^{(n)} + S₂^{(n)} + S₃^{(n)} − ∫₀ᵗ g′(X(s−)) dX(s) − (1/2) ∫₀ᵗ g″(X(s−)) d[X, X](s) | > ε ) = 0

for each ε > 0, so that

g(X(t)) − g(X(0)) = ∫₀ᵗ g′(X(s−)) dX(s) + (1/2) ∫₀ᵗ g″(X(s−)) d[X, X](s)   (4.45)
holds a.s. for each t. Note that X(s−) = X(s) for each s ∈ [0, t] since X is assumed to be continuous. Because the processes on both sides of Eq. 4.45 are continuous, this equality holds for all values of t. If X(0) is not zero, the Ito formula of Eq. 4.45 becomes

g(X(t)) − g(X(0)) = Σ_{k=1}^{m_n} ( g(X(t_k^{(n)})) − g(X(t_{k−1}^{(n)})) )
where Σ_{k,U} corresponds to the intervals of the partition p_n of [0, t] including jumps in U and Σ_{k,V} = Σ_k − Σ_{k,U}. Since X has cadlag sample paths, it has at most a countable number of jumps in [0, t] and, for a sufficiently fine partition p_n, the intervals (t_{k−1}^{(n)}, t_k^{(n)}] contain at most one jump of X in U so that

Σ_{k,V} ( g(X(t_k^{(n)})) − g(X(t_{k−1}^{(n)})) )

As in the previous case, the first term on the right side of the above equation converges in ucp to ∫₀₊ g′(X(s−)) dX(s) + (1/2) ∫₀₊ g″(X(s−)) d[X, X](s). The second term converges to
It remains to show that, as ε ↓ 0, Σ_{s∈U} (g(X(s)) − g(X(s−))) and the above summation converge to Σ_{0<s≤t} [g(X(s)) − g(X(s−))] and to the corresponding sum over all jumps of X in (0, t], respectively, and that Σ_{k,V} r(X(t_{k−1}^{(n)}), X(t_k^{(n)}) − X(t_{k−1}^{(n)})) approaches zero. Details on this part of the proof can be found in [147] (pp. 73-74). Hence, the limit of

g(X(t)) − g(X(0)) = Σ_{k=1}^{m_n} ( g(X(t_k^{(n)})) − g(X(t_{k−1}^{(n)})) )

as n → ∞ and ε ↓ 0 is

∫₀₊ᵗ g″(X(s−)) d[X, X](s) − (1/2) Σ_{0<s≤t} g″(X(s−)) (ΔX(s))² = ∫₀₊ᵗ g″(X(s−)) d[X, X]^c(s).
where Σ_{0<s≤t} means Σ_{0<s≤t, |ΔX(s)|≤1} and |θ(s)| ≤ |ΔX(s)| ≤ 1 a.s. Since g ∈ C²(ℝ), there is a constant a > 0 such that |g″(θ(s))| ≤ a and
B(t)ⁿ = n ∫₀ᵗ B(s)^{n−1} dB(s) + (n(n−1)/2) ∫₀ᵗ B(s)^{n−2} ds   (4.48)
Proof: The Ito formula in Eq. 4.40 (1) can be applied to the mapping B ↦ Bⁿ since this mapping is infinitely differentiable and B is a continuous square integrable martingale so that it is a semimartingale, (2) shows that Bⁿ is a semimartingale, and (3) gives a recurrence formula for calculating the stochastic integral ∫₀ᵗ B(s)^q dB(s), where q ≥ 1 is an integer (Eq. 4.48).

That g(B) = B² is a semimartingale also results from the representation B(t)² = A(t) + M(t), where A(t) = t is an adapted continuous process with A(0) = 0 and paths of finite variation on compacts, and M(t) = B(t)² − t is a square integrable martingale starting at zero. Hence, the process B² is decomposable so that it is a semimartingale (Section 4.4.1).

The equality in Eq. 4.48 also shows that the Ito integral ∫₀ᵗ B(s)^{n−1} dB(s) can be calculated path by path from (1/n) B(t)ⁿ − ((n−1)/2) ∫₀ᵗ B(s)^{n−2} ds and samples of the Brownian motion B. •
Proof: That N² is a semimartingale follows from the Ito formula. This formula applied to the function g(N) = N² gives (Eq. 4.40)

N(t)² − N(0)² = ∫₀₊ᵗ 2 N(s−) dN(s) + (1/2) ∫₀₊ᵗ 2 d[N, N]^c(s) + Σ_{0<s≤t} [ N(s)² − N(s−)² − 2 N(s−) ΔN(s) ].

We have seen that N is a quadratic pure jump semimartingale so that [N, N]^c(t) = 0 for all t ≥ 0. We also have

Σ_{0<s≤t} [ N(s)² − N(s−)² − 2 N(s−) ΔN(s) ] = Σ_{i=1}^{N(t)} [ N(T_i)² − N(T_{i−1})² − 2 N(T_{i−1}) ]
 = Σ_{i=1}^{N(t)} [ i² − (i−1)² − 2(i−1) ] = N(t),

where T_i, i = 1, 2, ..., denote the jump times of N. The above results and the Ito formula yield Eq. 4.49 since N(0) = 0. •
Example 4.21: Let C be the compound Poisson process defined by Eq. 4.13 with
jumps given by a sequence Yk of iid real-valued random variables. The process
C² is a semimartingale and

C(t)² = 2 ∫₀₊ᵗ C(s−) dC(s) + Σ_{i=1}^{N(t)} Y_i².   (4.50)
This formula provides an alternative way for calculating the stochastic integral ∫ C_− dC. ◊
Proof: According to Ito's formula, C² is a semimartingale and (Eq. 4.40)

C(t)² − C(0)² = ∫₀₊ᵗ 2 C(s−) dC(s) + (1/2) ∫₀₊ᵗ 2 d[C, C]^c(s) + Σ_{0<s≤t} [ C(s)² − C(s−)² − 2 C(s−) ΔC(s) ].

Because C is a quadratic pure jump semimartingale, we have [C, C]^c(t) = 0, t ≥ 0. The summation on the right side of the above equation is

Σ_{0<s≤t} (ΔC(s))² = Σ_{i=1}^{N(t)} Y_i²

and constitutes a compound Poisson process with jumps Y_i² occurring at the jump times T_i of C. These considerations and C(0) = 0 give Eq. 4.50. •
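Eq. 4.50 is a pathwise algebraic identity and can be verified on a single simulated set of jumps, using the fact that ∫₀₊ᵗ C(s−) dC(s) = Σ_i C(T_i−) Y_i for a pure jump integrator (the jump count and distribution below are hypothetical):

```python
import numpy as np

# Pathwise check of Eq. 4.50: C(t)^2 = 2 * int_{0+}^t C(s-) dC(s) + sum_i Y_i^2.
rng = np.random.default_rng(4)
Y = rng.normal(0.0, 2.0, size=50)                     # iid jump sizes Y_1, ..., Y_50 up to t
C_left = np.concatenate([[0.0], np.cumsum(Y)[:-1]])   # C(T_i -) = Y_1 + ... + Y_{i-1}
ito_integral = np.sum(C_left * Y)                     # int_{0+}^t C(s-) dC(s)
C_t = np.sum(Y)                                       # C(t)
print(C_t ** 2, 2 * ito_integral + np.sum(Y ** 2))    # the two sides agree exactly
```

The agreement is exact (up to rounding) for every sample path, reflecting that Eq. 4.50 is just the expansion (Σ_i Y_i)² = Σ_i Y_i² + 2 Σ_i (Y_1 + ··· + Y_{i−1}) Y_i.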
Example 4.22: Let C be the compound Poisson process in Eq. 4.13 and g a real-valued function with continuous second order derivative. An alternative form of the Ito formula in Eq. 4.40 is

g(C(t)) − g(C(0)) = ∫₀₊ᵗ g′(C(s−)) dC(s) + (1/2) ∫₀₊ᵗ g″(C(s−)) d[C, C]^c(s)
 + Σ_{0<s≤t} [ g(C(s)) − g(C(s−)) − g′(C(s−)) ΔC(s) ] = Σ_{0<s≤t} [ g(C(s)) − g(C(s−)) ]

since ∫₀₊ g′(C(s−)) dC(s) = Σ_{0<s≤t} g′(C(s−)) ΔC(s) and [C, C]^c = 0. We also have
Stochastic differential equations are discussed in Section 4.7. Figure 4.9 shows a
sample path of X. <>
Note: The differential form of the Ito formula in Eq. 4.42 applied to the coordinates of X
gives
Figure 4.9. A sample path (sin(B), cos(B)) of the Brownian motion on the unit circle
Example 4.24: Let Y and Z be real-valued processes with cadlag samples adapted to the natural filtration of a Brownian motion B. Let X(t) = A(t) + M(t) be another process, where A(t) = ∫₀ᵗ Y(s−) ds and M(t) = ∫₀ᵗ Z(s−) dB(s). Then X is a semimartingale, and

g(X(t)) − g(X(0)) = ∫₀₊ᵗ g′(X(s−)) dX(s) + (1/2) ∫₀₊ᵗ g″(X(s−)) d[X, X]^c(s)

for g ∈ C²(ℝ), where [X, X]^c(t) = [X, X](t) = [M, M](t) = ∫₀ᵗ Z(s−)² ds. ◊
Proof: Since A and M are semimartingales (Section 4.4.4), X is also a semimartingale. The expression of g(X(t)) − g(X(0)) is given by the Ito formula in Eq. 4.40.

The quadratic variation of X can be calculated from [X, X] = [M + A, M + A] = [M, M] + [A, A] + 2 [M, A]. We have (Eq. 4.34)

Σ_{k=1}^{m_n} (A(t_k^{(n)}) − A(t_{k−1}^{(n)}))² ≤ sup_k |A(t_k^{(n)}) − A(t_{k−1}^{(n)})| Σ_{k=1}^{m_n} |A(t_k^{(n)}) − A(t_{k−1}^{(n)})|

converges to zero for partitions p_n of [0, t] with mesh Δ(p_n) → 0 as n → ∞. Let i(t) = t denote the identity function. Then A(t) = ∫₀ᵗ Y(s−) di(s) so that the quadratic covariation of M and A is [M, A] = ∫₀ᵗ Y(s−) Z(s−) d[B, i](s), t ≥ 0, and

Σ_{k=1}^{m_n} (B(t_k^{(n)}) − B(t_{k−1}^{(n)})) (t_k^{(n)} − t_{k−1}^{(n)}) ≤ sup_k |B(t_k^{(n)}) − B(t_{k−1}^{(n)})| Σ_{k=1}^{m_n} |t_k^{(n)} − t_{k−1}^{(n)}|

converges to zero since Σ_{k=1}^{m_n} |t_k^{(n)} − t_{k−1}^{(n)}| = t and sup_k |B(t_k^{(n)}) − B(t_{k−1}^{(n)})| → 0 a.s. as n → ∞ by the continuity of B. Hence, the quadratic variation process of X is [X, X] = [M, M] as stated. •
+ (1/2) Σ_{i,j=1}^d ∫₀₊ᵗ (∂²g/∂x_i∂x_j)(X(s−)) d[X_i, X_j]^c(s)

dg(X(t)) = Σ_{i=1}^d (∂g/∂x_i)(X(t)) dX_i(t) + (1/2) Σ_{i,j=1}^d (∂²g/∂x_i∂x_j)(X(t)) d[X_i, X_j](t)

and

+ (1/2) Σ_{i,j=1}^d ∫₀ᵗ (∂²g/∂x_i∂x_j)(X(s)) d[X_i, X_j](s).   (4.56)
Example 4.25: Let B be a Brownian motion in ℝ^d and let X(t) = g(B(t)) = B(t)/‖B(t)‖, t ≥ 0, be the Brownian motion on the unit sphere S_d(0, 1) in ℝ^d centered at the origin of this space, where g : ℝ^d \ {0} → S_d(0, 1) and ‖x‖ = (Σ_{i=1}^d x_i²)^{1/2} denotes the Euclidean norm.
for each p = 1, ..., d. These forms are not stochastic differential equations for X of the type defined later in this chapter (Eq. 4.69) since the increments dX(t) of this process depend on X(t), B(t), and the increments dB(t) of B rather than just on X(t) and dB(t). The process X can be modified to satisfy a stochastic differential equation by a random time change ([135], Example 8.5.8, p. 149) or by state augmentation. For example, we can consider the evolution of the process (X, Y), where Y is an ℝ^d-valued process defined by dY(t) = dB(t). ◊
Proof: The differential form of the Ito formula in Eq. 4.56 gives

dX_p(t) = Σ_{i=1}^d (∂g_p/∂x_i)(B(t)) dB_i(t) + (1/2) Σ_{i,j=1}^d (∂²g_p/∂x_i∂x_j)(B(t)) d[B_i, B_j](t),

where d[B_i, B_j](t), which is (1/2)(2 dt − dt − dt) = 0 for i ≠ j since the processes B_i are independent Brownian motions. Hence, the double summation in the above expression of dX_p(t) becomes Σ_{i=1}^d (∂²g_p(B(t))/∂x_i²) dt. •
Example 4.26: Consider a real-valued function g : ℝ^d → ℝ with continuous and bounded first and second order derivatives. Let B be an ℝ^d-valued Brownian motion starting at x = (x₁, ..., x_d) ∈ ℝ^d whose coordinates B_i are independent Brownian motions such that B_i(0) = x_i, i = 1, ..., d. Denote by E^x the expectation operator for B(0) = x. The limit, called the generator of B, exists and is proportional with the Laplace operator Δ = Σ_{i=1}^d ∂²/∂x_i². This result provides a link between a class of stochastic processes and some deterministic partial differential equations (Chapter 6). ◊
Proof: The Ito formula in Eq. 4.56 gives

g(B(t)) − g(B(0)) = Σ_{i=1}^d ∫₀ᵗ (∂g/∂x_i)(B(s)) dB_i(s) + (1/2) Σ_{i=1}^d ∫₀ᵗ (∂²g/∂x_i²)(B(s)) ds
since d[B_i, B_j](s) = δ_ij ds, where δ_ij = 1 for i = j and δ_ij = 0 for i ≠ j.

The integrals in the first summation on the right side of the above equation are F_t = σ(B(s) : 0 ≤ s ≤ t)-martingales starting at zero. Hence, their expectation is zero. The integrals in the second summation on the right side of this equation can be defined as Riemann integrals and can be approximated by t ∂²g(B(θ(ω) t, ω))/∂x_i², θ(ω) ∈ (0, 1), for almost all ω's. The limit of these integrals scaled by t as t ↓ 0 is deterministic and equal to ∂²g(x)/∂x_i². These observations yield Eq. 4.57. •
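The generator limit can be illustrated by Monte Carlo. The sketch below is a hypothetical experiment (the test function g(x) = ‖x‖², dimension, time step, and sample size are all arbitrary choices); for this g the Laplacian is 2d, so the limit (1/2)Δg(x) equals d:

```python
import numpy as np

# Monte Carlo sketch of (E^x[g(B(t))] - g(x)) / t -> (1/2) Lap g(x) for small t,
# with g(x) = ||x||^2, so the limit is d (here d = 2).
rng = np.random.default_rng(5)
d, t, n_paths = 2, 0.01, 200_000
x = np.array([1.0, 0.0])                            # starting point B(0) = x
B_t = x + np.sqrt(t) * rng.standard_normal((n_paths, d))
g = lambda z: np.sum(z ** 2, axis=-1)
estimate = (np.mean(g(B_t)) - g(x)) / t             # should be close to d = 2
print(estimate)
```

For this particular g the expectation is available in closed form, E^x[g(B(t))] = ‖x‖² + t d, so the Monte Carlo estimate fluctuates around d with no bias from the finite t.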
that has continuous second and first derivatives relative to x and t, respectively, for each u ∈ ℝ. The Ito formula applied to Z(t) = g(X(t), t) = exp(√−1 u X(t) + t u²/2) gives
X(t) Y(t) − X(0) Y(0) = ∫₀₊ᵗ X(s−) ∘ dY(s) + ∫₀₊ᵗ Y(s−) ∘ dX(s).   (4.59)
Proof: If at least one of the semimartingales X and Y is continuous, we have [X, Y](t) = [X, Y]^c(t) + X(0) Y(0) so that (Eq. 4.28)

X(t) Y(t) = ∫₀₊ᵗ X(s−) dY(s) + ∫₀₊ᵗ Y(s−) dX(s) + [X, Y]^c(t) + X(0) Y(0).

This observation, Eq. 4.58, and the equality [X, Y] = [Y, X] give the integration by parts formula based on Stratonovich integrals. Eq. 4.59 coincides with the integration by parts formula in the classical calculus. •
g(X(t)) − g(X(0)) = ∫₀₊ᵗ g′(X(s−)) ∘ dX(s)   (4.60)

∫₀₊ᵗ g′(X(s−)) ∘ dX(s) = ∫₀₊ᵗ g′(X(s−)) dX(s) + (1/2) ∫₀₊ᵗ g″(X(s−)) d[X, X]^c(s)   (4.61)
Proof: If Eq. 4.61 is valid, the Ito formula in Eq. 4.40 gives Eq. 4.60. Hence, it is sufficient to show that (Eq. 4.58)

[g′(X), X]^c(t) = ∫₀₊ᵗ g″(X(s−)) d[X, X]^c(s).

If g ∈ C³(ℝ), the Ito formula can be applied to the semimartingale g′(X) and gives

so that

[g′(X), X]^c = [g″(X_−) · X, X]^c + (1/2) [g‴(X_−) · [X, X], X]^c.

The terms on the right side of the above equation are (Eq. 4.34)
Note: If the coordinates of X are continuous semimartingales, then the change of variable formula in Eq. 4.62 becomes

g(X(t)) − g(X(0)) = Σ_{i=1}^d ∫₀₊ᵗ (∂g/∂x_i)(X(s)) ∘ dX_i(s),   (4.63)
and coincides with the change of variable formula of the classical calculus. ▲
The Ito and the Fisk-Stratonovich integrals in Eqs. 4.55 and 4.62 are related by

Σ_{i=1}^d ∫₀₊ᵗ (∂g/∂x_i)(X(s−)) ∘ dX_i(s) = Σ_{i=1}^d ∫₀₊ᵗ (∂g/∂x_i)(X(s−)) dX_i(s)
 + (1/2) Σ_{i,j=1}^d ∫₀₊ᵗ (∂²g/∂x_i∂x_j)(X(s−)) d[X_i, X_j]^c(s)   (4.64)
Note: Generally, the evolution of the state X of a physical system is defined by a differential equation driven by a non-white (colored) noise so that X satisfies a Stratonovich stochastic differential equation ([102], Section 5.4.2). The relationship in Eq. 4.64 and the definition of the Fisk-Stratonovich integral in Section 4.6.3 can be used to express the solution X of a stochastic problem in functions of Riemann-Stieltjes and Ito integrals (Section 4.7.12). This representation of X is particularly useful for calculations because Ito integrals are martingales or semimartingales. ▲
∫₀ᵗ h(B(s)) ∘ dB(s) = ∫₀ᵗ h(B(s)) dB(s) + (1/2) ∫₀ᵗ h′(B(s)) ds   (4.65)

gives the relationship between the Stratonovich and the Ito integrals of h(B) with respect to B. ◊
Proof: The integrals ∫₀ᵗ h(B(s)) ∘ dB(s) and ∫₀ᵗ h(B(s)) dB(s) are defined by the properties of h and B. Eq. 4.64 with d = 1 and h = g′ gives Eq. 4.65 since d[B, B]^c(t) = d[B, B](t) = dt.

For example, if h(x) = exp(x), we have

∫₀ᵗ e^{B(s)} ∘ dB(s) = ∫₀ᵗ e^{B(s)} dB(s) + (1/2) ∫₀ᵗ e^{B(s)} ds   (4.66)
Proof: The coincidence of the Ito and the Stratonovich integrals with integrand depending on N and integrator N results from Eq. 4.64 with d = 1 because N is a quadratic pure jump semimartingale so that [N, N]^c = 0. The same result can be obtained from Eq. 4.58 since [h(N), N]^c = 0. The equality in Eq. 4.67 remains valid if N is replaced by a compound Poisson process. •
giving the rate of change of the state vector, where the functions a and b depend
on the system properties. Classical analysis is based on the assumptions that the
system properties and the input are perfectly known and deterministic. The evo-
lution of the state vector x is given by Riemann-Stieltjes integrals with integrands
depending on the system properties and integrators depending on input and time.
In this section, we generalize Eq. 4.68 by assuming that the input is a
stochastic process. The properties of the system are still assumed to be deter-
ministic and perfectly known so that the coefficients of Eq. 4.68 are deterministic,
known functions of the state vector and time. Because the input is random, the
state vector is a stochastic process. Let X denote the solution of Eq. 4.68 with w
replaced by a stochastic process W. It is common in applications to assume that
W is a white noise process for the following reasons ([175], Chapter 8).
1. In many cases the input has a much shorter memory than the system.
3. The first two moments of the state X of a linear system driven by white noise can be obtained simply provided that these moments exist. The system defined by Eq. 4.68 is said to be linear if a(x(t), t) = α(t) x(t) and b(x(t), t) = β(t) x(t), where the matrices α and β are functions of time. We will adopt in our discussion a more restrictive definition for linear systems, which is common in random vibration studies. We say that the system in Eq. 4.68 is linear if a(x(t), t) = α(t) x(t) and b(x(t), t) = β(t).
Table 4.1 summarizes the types of white noise processes used in this book. If the jumps of the compound Poisson process C are in L₂, the second moment properties of the Poisson and the Gaussian white noise processes have the same functional form and these processes can be made equal in the second moment sense.
The Levy white noise includes both the Gaussian and the Poisson white noise pro-
cesses (Section 3.14.2). The semimartingale noise is considered at the end of this
section. The use of the white noise model results in a significant simplification
in calculations, for example, linear filters driven by white noise ([175], Section
5.2) but can entail significant technical difficulties. For example, the Gaussian
white noise is frequently interpreted in applications as the derivative of the Brow-
nian motion, a stochastic process with a.s. non-differentiable sample paths! In
our discussion, we will deal with increments of the Brownian motion, compound
Poisson, Levy, and semimartingale processes rather than the corresponding white
noise processes. Accordingly, the evolution of the state vector X is defined by the
where a, b are (d, 1)- and (d, d′)-dimensional matrices whose entries are real-valued Borel measurable functions, Y is a vector in ℝ^{d′} consisting of d′ real-valued semimartingales, and the state X is an ℝ^d-valued stochastic process. The input Y includes the Brownian motion, Poisson, compound Poisson, and Levy processes. If Y is a Brownian motion B, the state X is called a diffusion or Ito process and the matrices a and b bᵀ are referred to as drift and diffusion coefficients, respectively. The first and second integrals in Eq. 4.70 are Riemann-Stieltjes and
1. X is adapted to the natural filtration F_t^Y = σ(Y(s), 0 ≤ s ≤ t) of Y, that is, X(t) is a function of Y(s), s ≤ t, at each time t ≥ 0.

3. The Riemann-Stieltjes and Ito integrals in Eq. 4.70 are well defined at all times t > 0.

If X(0) is random, F_t^Y needs to be extended to F_t = σ(X(0), F_t^Y).
A version of the input Y needs to be specified to construct a strong solution of Eq. 4.70. If we replace one version of Y by another version of this process, a different strong solution may result because these versions of Y can have different sample properties. We construct approximations of the strong solution in Monte Carlo studies because paths of X are calculated from paths of Y by integrating Eq. 4.70. The path behavior is not essential for the weak solution. A weak solution of Eqs. 4.69-4.70 is a pair of adapted processes (Ỹ, X) defined on a filtered probability space (Ω, H, (H_t)_{t≥0}, P) such that Ỹ is a version of Y and the pair (Ỹ, X) satisfies Eq. 4.70. The weak solution is completely defined by the initial condition, the functions a and b, and the finite dimensional distributions of Y. The particular version of the input does not have to be specified. The above definitions show that a strong solution is a weak solution but the converse is not generally true.
The uniqueness of the solution of Eqs. 4.69-4.70 is generally defined in two
ways. A strong solution is said to be unique in the strong or pathwise sense
if two different solutions of these equations have the same sample paths except
on a subset of Q of measure zero. The strong uniqueness requires that the set of
solutions of Eqs. 4.69-4.70 consists of indistinguishable processes. On the other
hand, two solutions, weak or strong, of these equations are unique in the weak
sense if they have the same finite dimensional distributions. Hence, the collection
of weak solutions of Eqs. 4.69-4.70 consists of processes that are versions of each
other ([42], Section 10.4, [131], Section 3.2.1, [135], Section 5.3).
where X(0) = 0, B is a Brownian motion starting at zero, and sign(x) = −1, 0, and 1 for x < 0, x = 0, and x > 0, respectively. It can be shown that this equation has a weak solution ([42], Section 7.3). The solution is unique in the weak sense ([42], pp. 248-249) but it is not unique in the strong sense. ◊
Proof: Let X(·, ω) be a solution of the above stochastic differential equation corresponding to a sample B(·, ω) of B, that is,
where ∫₀ᵗ a(X(s), s) ds and ∫₀ᵗ b(X(s), s) dB(s) are Riemann and Ito integrals, respectively.
Note: Generally, ∫₀ᵗ b(X(s), s) dB(s) cannot be defined as a path by path Riemann-Stieltjes integral because almost all sample paths of the Brownian motion process are of unbounded variation. We have seen that there is a relatively simple relationship between the Ito and Stratonovich integrals (Eq. 4.58) so that the solution of Eq. 4.71 can also be expressed in terms of Riemann-Stieltjes and Stratonovich integrals. The process X(s−) in Eqs. 4.69 and 4.70 is replaced by X(s) in Eq. 4.71 since diffusion processes have continuous samples, as we will see later in Section 4.7.1.1.
Note that diffusion processes are Markov, that is, the solution X of Eq. 4.71 is a Markov process. Let t₀ ∈ (0, t) and X(t₀) = x. Because the increments of B are independent of the past, future states,

X(t) = x + ∫_{t₀}^t a(X(s), s) ds + ∫_{t₀}^t b(X(s), s) dB(s),  t ≥ t₀,

of X are independent of the past states X(u), u < t₀, of this process conditional on its present state X(t₀) = x. The converse is not true. For example, the compound Poisson process is Markov but is not a diffusion process.

We also note that the solution of Eq. 4.71 has the strong Markov property, that is, if T is an F_t-stopping time, the process X(T + τ), τ ≥ 0, depends only on X(T), where F_t denotes the filtration generated by the natural filtration of the Brownian motion B and the initial state X(0). ▲
This section gives conditions for the existence and uniqueness of the solution of Eq. 4.71, outlines some essential properties of the solution, illustrates some theoretical concepts by examples, and demonstrates how Ito's formula can be used to develop differential equations for the moments, the density, and the characteristic function of X. In applications Eq. 4.71 is frequently given in the form

Ẋ(t) = a(X(t), t) + b(X(t), t) W_g(t),  t ≥ 0,   (4.72)

where W_g, called Gaussian white noise, is interpreted as the formal derivative of the Brownian motion process B ([175], Section 5.2.1).
Example 4.31: Let X be the solution of dX(t) = −α X(t) dt + β dB(t), t ≥ 0, where α > 0 and β are some constants, X(0) ∼ N(μ(0), γ(0)), B is a Brownian motion, and X(0) is independent of B. This real-valued diffusion process, called the Ornstein-Uhlenbeck process, is a special case of Eq. 4.71 with d = d′ = 1, drift −α x, and diffusion β². The theorems in the following section guarantee the existence and the uniqueness of the solution X.

Samples of the strong solution of this equation can be calculated from samples B(·, ω) of B and the recurrence formula giving X at the end of a time interval [t, t + Δt] from its value at the beginning of the interval and the input in [t, t + Δt]. The weak solution of the Ornstein-Uhlenbeck equation is given by a pair of a Brownian motion B and a Gaussian process X with mean μ(t) = E[X(t)] = μ(0) e^{−α t}, variance γ(t) = E[X(t)²] = γ(0) e^{−2α t} + β²/(2α) (1 − e^{−2α t}), and covariance c(t, s) = E[X(t) X(s)] = γ(s ∧ t) e^{−α |t−s|}. ◊
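A recurrence of the kind described above can be sketched as follows. The Euler-type discretization below is a hypothetical stand-in (the book's own recurrence formula is not reproduced here), and all parameter values are arbitrary; the simulated mean and variance can be compared with the formulas μ(t) = μ(0) e^{−α t} and γ(t) = γ(0) e^{−2α t} + β²/(2α)(1 − e^{−2α t}):

```python
import numpy as np

# Euler-type recurrence for dX = -alpha X dt + beta dB, started at the stationary
# distribution N(mu0, gamma0) with gamma0 = beta^2 / (2 alpha), so the variance stays put.
rng = np.random.default_rng(6)
alpha, beta, T, n_steps, n_paths = 1.0, 1.0, 1.0, 1000, 20000
dt = T / n_steps
mu0, gamma0 = 1.0, beta ** 2 / (2 * alpha)
X = mu0 + np.sqrt(gamma0) * rng.standard_normal(n_paths)   # X(0) ~ N(mu0, gamma0)
for _ in range(n_steps):
    X = X - alpha * X * dt + beta * np.sqrt(dt) * rng.standard_normal(n_paths)
print(X.mean(), mu0 * np.exp(-alpha * T))   # mean decays as mu0 e^{-alpha t}
print(X.var(), gamma0)                      # variance remains at the stationary value
```

Starting from the stationary variance β²/(2α) makes the check particularly clean: only the mean evolves, decaying exponentially at rate α.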
Proof: The recurrence formula for X shows that the Ornstein-Uhlenbeck process is Gaussian so that the second moment properties of X define its finite dimensional distributions. We establish here differential equations for some of the second moment properties of X. Complete differential equations for the second moment properties of the state vector of a linear system driven by white noise are in Section 7.2.1.1.

The mean equation can be obtained by averaging the defining equation of X. The stochastic integral equation
We need some definitions to state the main results of this section. Let a(x) and b(x) be (d, 1) and (d, d′) matrices whose entries are functions of only x ∈ ℝ^d. We say that a and b satisfy the uniform Lipschitz conditions if there exists a constant c > 0 such that
If (1) the drift and diffusion coefficients do not depend on time explicitly, are bounded functions, and satisfy the uniform Lipschitz conditions in Eq. 4.73, (2) B is a Brownian motion martingale on a filtered probability space (Ω, F, (F_t)_{t≥0}, P), (3) B(0) = 0, and (4) X(0) is F₀-measurable, then

• There exists a strong solution X of Eq. 4.71 unique in the strong sense,
• X is a B([0, ∞)) × F-measurable, adapted, and continuous process, and
• The law of X is uniquely determined by the drift and diffusion coefficients and the laws of B and X(0) ([42], Theorem 10.5, p. 228).
Proof: A Brownian motion B is a σ(B(s), 0 ≤ s ≤ t)-martingale. The strong uniqueness of the solution means that if X^{(1)} and X^{(2)} are two t-continuous solutions of Eq. 4.71, then X^{(1)}(t, ω) = X^{(2)}(t, ω) for all t ≥ 0 a.s.

If some of the conditions of the above theorem are violated, it does not mean that Eq. 4.71 does not have a solution or that its solution is not unique. The theorem gives only sufficient conditions for the existence and uniqueness of the solution of Eq. 4.71. We outline the plan of the proof of this theorem. Technicalities involved in this proof can be found in [42] (Chapter 10). Let

Generally, the processes I(·, X) and X(·) differ. This theorem gives conditions under which these processes are indistinguishable. Let (Ω, F, (F_t)_{t≥0}, P) be a filtered probability space with F_t = σ(X(0), B(s) : 0 ≤ s ≤ t) defined by the initial condition X(0) and the history of the Brownian motion B up to time t ≥ 0. The Brownian motion B is F_t-adapted by the definition of this filtration. We present now the essential steps of the proof.
(1) If X is F_t-adapted and B([0, ∞)) × F-measurable, then I(t, X) is a continuous
semimartingale. Since X is B([0, ∞)) × F-measurable and b is continuous, the entries of
b(X) are also B([0, ∞)) × F-measurable. Since the diffusion coefficients are bounded,
M is in L2. These observations and a preservation property of the stochastic integral show
that M is a square integrable martingale. The martingale M starts at zero and is continu-
ous. The integral giving A can be defined path by path because the drift coefficients are
bounded functions. Since X ∈ B([0, ∞)) × F and a is continuous, the entries of a(X) are
B([0, ∞)) × F-measurable. Also, A is an adapted continuous process, which starts at zero
and is of finite variation on compacts. Hence, A is also a semimartingale. We conclude
that I(·, X) is a semimartingale with continuous samples.
(2) Sensitivity of I(·, X) to X. The inequality ([42], Lemma 10.1, p. 222)

holds for each τ > 0, t ∈ [0, τ], and B([0, ∞)) × F-measurable, F_t-adapted processes
Y and Z such that Y(s) − Z(s) ∈ L2, where c_τ > 0 is a constant. The distance between
I(·, Y) and I(·, Z) in [0, t] depends on differences between the initial values and the time
histories of Y and Z.
(3) The uniqueness of the solution of Eq. 4.71. We show that there is at most one
B([0, ∞)) × F-measurable adapted solution of this equation ([42], Theorem 10.3, p. 224).
Let Y and Z be two solutions of Eq. 4.71 so that Y(0) = Z(0) = X(0), Y(t) = I(t, Y),
and Z(t) = I(t, Z) for all t ≥ 0. For a fixed τ > 0, we have

for t ∈ [0, τ] by the above inequality and the Fubini theorem. We need one additional fact
to establish the uniqueness of the solution. Consider two Lebesgue integrable functions
f and g in [0, τ] for some τ ∈ (0, ∞). If there is a constant ζ > 0 such that f(t) ≤
g(t) + ζ ∫_0^t f(s) ds for all t ∈ [0, τ], then ([42], Lemma 10.2, p. 224)

The latter inequality, a Gronwall-type inequality, with f(t) = E[‖Y(t) − Z(t)‖²], g(t) = 0, and ζ = c_τ gives
E[‖Y(t) − Z(t)‖²] = 0 so that P(Y(t) = Z(t)) = 1 for each t ∈ [0, τ] and, since τ
is arbitrary, this result holds for each t ≥ 0. The processes Y and Z are indistinguishable
since they are continuous (Section 3.8).
(4) The existence of the solution of Eq. 4.71. The proof is based on the construction
of a convergent sequence of processes defined by
X^(0)(t) = X(0),
X^(n)(t) = I(t, X^(n−1)), n = 1, 2, ...,
for all t ≥ 0. The properties of I in Eq. 4.76 imply that the processes X^(n) defined
by the above recurrence formula are F_t-adapted and continuous, and their law is uniquely
determined by X(0) and B. Moreover, the sequence of processes X^(n) converges to a limit
X a.s. and uniformly in [0, τ], and this property holds on any bounded time interval since
τ is arbitrary. The limit X has the same properties as the processes X^(n), so that X is
an adapted continuous process whose probability law is determined by X(0) and B. The
process X is the strong solution of Eq. 4.71 since it is constructed from a specified version
of the Brownian motion and some initial conditions. Hence, the processes I(·, X) and X
are indistinguishable in [0, ∞). •
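The Picard-type construction in step (4) can be sketched numerically. The fragment below is a minimal illustration (not the book's code): it fixes one Brownian path on a grid, takes the Ornstein-Uhlenbeck drift a(x) = −x with constant diffusion b = 1 as an assumed concrete case, and iterates Y ↦ I(·, Y); the successive iterates converge rapidly, as the contraction argument predicts.

```python
import math
import random

random.seed(1)
n, T = 200, 1.0
dt = T / n

# One fixed Brownian path B(t_0), ..., B(t_n).
B = [0.0]
for _ in range(n):
    B.append(B[-1] + math.sqrt(dt) * random.gauss(0.0, 1.0))

a = lambda x: -x        # drift (assumed example: Ornstein-Uhlenbeck)
x0, b = 1.0, 1.0        # initial condition and constant diffusion

def picard_step(Y):
    """One application of Y -> I(., Y) on the grid:
    I(t_k, Y) = X(0) + sum_{j<k} a(Y(t_j)) dt + b B(t_k)."""
    Z, s = [x0], 0.0
    for k in range(n):
        s += a(Y[k]) * dt
        Z.append(x0 + s + b * B[k + 1])
    return Z

Y = [x0] * (n + 1)      # X^(0)(t) = X(0)
sup_diffs = []
for m in range(12):
    Y_new = picard_step(Y)
    sup_diffs.append(max(abs(u - v) for u, v in zip(Y, Y_new)))
    Y = Y_new
```

The sup-norm distance between consecutive iterates shrinks roughly like T^m/m!, mirroring the factorial contraction used in the uniqueness argument.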
If (1) X(0) and B satisfy the conditions in the previous existence and uniqueness
theorem and (2) the matrices a and b are locally Lipschitz continuous functions
(Eq. 4.74) and satisfy some growth conditions (Eq. 4.75), then there is a unique,
adapted, and continuous solution X of Eq. 4.71 whose law is uniquely deter-
mined by the laws of X(0) and B ([42], Theorem 10.6, p. 229).
4.7. Stochastic differential equations 261
Example 4.32: Consider the deterministic differential equation dx(t) = x(t)² dt,
t ≥ 0, starting at x(0) = 1. The drift coefficient a(x) = x² does not satisfy the
growth condition in Eq. 4.75 for all x ∈ ℝ. Yet, this equation has the unique
solution x(t) = 1/(1 − t) in [0, 1). The solution x(t) = 1/(1 − t) becomes
unbounded as t ↑ 1, so that the solution does not exist at all times. The behavior
of x(t) suggests that the growth condition is needed to assure the existence of a
solution for all times t. ◊
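The blow-up can be checked numerically. The snippet below is an illustration (not from the book): it integrates dx/dt = x² by forward Euler with a fine step, tracks the exact solution 1/(1 − t) away from the singularity, and shows the trajectory growing without bound as t ↑ 1.

```python
# Forward Euler for dx/dt = x^2, x(0) = 1; exact solution x(t) = 1/(1 - t).
dt = 1e-5
x, t = 1.0, 0.0
x_half = None
while t < 0.999:
    x += x * x * dt           # Euler step for the quadratic drift
    t += dt
    if x_half is None and t >= 0.5:
        x_half = x            # numerical value near t = 0.5

exact_half = 1.0 / (1.0 - 0.5)  # exact value 2 at t = 0.5
```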
Example 4.33: Let x be the solution of dx(t) = 3 x(t)^{2/3} dt, t ≥ 0, with the
initial condition x(0) = 0. The drift a(x) = 3 x^{2/3} of this equation does not
satisfy the Lipschitz condition in Eq. 4.73 at x = 0. This equation has a solution
but its solution is not unique. For example, any member of the infinite collection
of functions x_a(t) defined by x_a(t) = (t − a)³ for t > a > 0 and x_a(t) = 0 for
t ≤ a is a solution. This observation suggests that the Lipschitz conditions are
needed to have a unique solution. ◊
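A direct check (illustrative only): for each a > 0 the function x_a(t) = (t − a)³ for t > a, 0 otherwise, satisfies ẋ = 3 x^{2/3}, so the equation has infinitely many solutions through x(0) = 0.

```python
def x_a(t, a):
    """One member of the family of solutions of dx = 3 x^(2/3) dt, x(0) = 0."""
    return (t - a) ** 3 if t > a else 0.0

def drift(x):
    return 3.0 * x ** (2.0 / 3.0)   # x >= 0 along these solutions

# Verify x_a'(t) = 3 x_a(t)^(2/3) on both sides of the corner t = a.
max_resid = 0.0
for a in (0.25, 0.5, 1.0):
    for t in (0.1, a + 0.3, a + 1.0, 3.0):
        lhs = 3.0 * (t - a) ** 2 if t > a else 0.0   # exact derivative
        rhs = drift(x_a(t, a))
        max_resid = max(max_resid, abs(lhs - rhs))
```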
[x]_ξ =  −ξ  if x < −ξ,
          x   if −ξ ≤ x ≤ ξ,
          ξ   if x > ξ,
Since the functions a([·]_ξ) and b([·]_ξ) satisfy the conditions of the first existence and
uniqueness theorem in Section 4.7.1.1, the solution X^(ξ) of the above equation exists and
is unique in the strong sense for each ξ > 0. Hence, Eq. 4.78 has a unique solution in the
strong sense.
We now show that

solves Eq. 4.78. Let i(t) = t denote the identity function. The multi-dimensional Itô
formula applied to the mapping (B, i) ↦ g(B, i) gives (Eq. 4.56)

X(t) − X(0) = ∫_0^t a X(s) dB(s) + ∫_0^t (c − a²/2) X(s) ds + (1/2) ∫_0^t a² X(s) ds
            = ∫_0^t c X(s) ds + ∫_0^t a X(s) dB(s),

where ∂g/∂u and ∂²g/∂u² denote partial derivatives of g with respect to its first argument.
Hence, the process X in Eq. 4.79 is the solution of Eq. 4.78. •
Suppose that the drift and diffusion coefficients in Eq. 4.71 satisfy the conditions
of the first theorem of existence and uniqueness in Section 4.7.1.1. If X and Z
are solutions of Eq. 4.71 corresponding to the initial conditions X(0) and Z(0),
respectively, such that X(0), Z(0) ∈ F_0 and X(0) − Z(0) ∈ L2, then

gives the solution X at time t ≥ 0 for a unit initial condition. For example,
X^(1)(t) − X^(2)(t) = (X^(1)(0) − X^(2)(0)) e^{−αt} for the Ornstein-Uhlenbeck process
X defined by dX(t) = −α X(t) dt + β dB(t), t ≥ 0, where α > 0 and β are some
constants. Hence, sup_{0≤s≤τ} (X^(1)(s) − X^(2)(s))² = (X^(1)(0) − X^(2)(0))² for this
process. ◊
Note: The bound in Eq. 4.80 depends on an undetermined constant c_τ and its definition
requires that X^(1)(0), X^(2)(0) be in L2. Because c_τ is not known, the bound only provides
qualitative information on the evolution in time of the difference between X^(1) and X^(2).
For the Ornstein-Uhlenbeck process X, Eq. 4.80 gives

Similar bounds can be obtained for any linear system. Hence, the bound in Eq. 4.80 is not
useful for these systems. ▲
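For the Ornstein-Uhlenbeck equation the qualitative statement can be made exact: two solutions driven by the same Brownian path differ by a deterministic exponential, because the noise cancels in the difference. A small check (illustrative, not from the book; Euler discretization with parameters α = 1, β = 0.5 chosen here):

```python
import math
import random

random.seed(2)
alpha, beta = 1.0, 0.5
n, T = 1000, 5.0
dt = T / n

# Two Euler paths of dX = -alpha X dt + beta dB driven by the SAME increments.
x1, x2 = 2.0, -1.0
d0 = x1 - x2
max_dev = 0.0
for k in range(1, n + 1):
    dB = math.sqrt(dt) * random.gauss(0.0, 1.0)
    x1 += -alpha * x1 * dt + beta * dB
    x2 += -alpha * x2 * dt + beta * dB
    # The difference obeys d_k = d0 (1 - alpha dt)^k exactly: noise cancels.
    max_dev = max(max_dev, abs((x1 - x2) - d0 * (1 - alpha * dt) ** k))

# (1 - alpha dt)^n approximates e^{-alpha T}.
final_ratio = (x1 - x2) / d0
```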
Suppose that the drift/diffusion coefficients and the Brownian motion in Eq. 4.71
satisfy the conditions of the first existence and uniqueness theorem in Sec-
tion 4.7.1.1. Then the solution X of this equation is a semimartingale with the
representation

X(t) = X(0) + A(t) + M(t), (4.81)

where A and M are defined by Eq. 4.77, A is an adapted process with samples of
finite variation on compacts, M is an F_t-square integrable martingale, A(0) = 0,
and M(0) = 0 ([42], Theorem 10.5, p. 228).
Note: The stated properties of A and M have been established in the proof of the existence
and uniqueness of the solution of Eq. 4.71 (Section 4.7.1.1). We calculate here just the first
two moments of a coordinate M_i of M.
M_i(t) = Σ_{k=1}^{d'} ∫_0^t b_{ik}(X(u)) dB_k(u) = Σ_{k=1}^{d'} (b_{ik}(X) · B_k)(t),

E[M_i(t)²] = Σ_{k,l=1}^{d'} E[(b_{ik}(X) · B_k)(t) (b_{il}(X) · B_l)(t)]
           = Σ_{k,l=1}^{d'} E[∫_0^t b_{ik}(X(u)) b_{il}(X(u)) d[B_k, B_l](u)]
           = Σ_{k=1}^{d'} E[∫_0^t b_{ik}(X(u))² du] = Σ_{k=1}^{d'} ∫_0^t E[b_{ik}(X(u))²] du
264 Chapter 4. Ito's Formula and Stochastic Differential Equations
by the definition of M_i, the linearity of the expectation operator, a property of the stochastic
integral (Eq. 4.34), and Fubini's theorem. The second moment of M_i(t) is finite since the entries
of b are assumed to be bounded. ▲
Example 4.36: Let d = d' = 1, a(x) = a(x), and b(x) = b(x) in Eq. 4.71. If
the conditions of the first existence and uniqueness theorem in Section 4.7.1.1 are
satisfied, then

by the linearity of quadratic covariation (Eq. 4.26). All quadratic covariations with argu-
ments Z and A or M are zero because Z is a constant process and A(0) = M(0) = 0. The
quadratic variation of A and the quadratic covariation of A and M are also zero. Hence, we
have (Eq. 4.34)
Figure 4.10. Sample paths of X defined by dX(t) = −X(t) dt + dB(t) and its
memoryless transformation Y(t) = X(t)³
dY(t) = [g'(X(t)) a(X(t)) + (1/2) g''(X(t)) b(X(t))²] dt + g'(X(t)) b(X(t)) dB(t).

Because g is an increasing function, x = g^{−1}(y) exists so that Y is a diffusion process
defined by the stochastic differential equation

dY(t) = [g'(g^{−1}(Y(t))) a(g^{−1}(Y(t))) + (1/2) g''(g^{−1}(Y(t))) b(g^{−1}(Y(t)))²] dt
        + g'(g^{−1}(Y(t))) b(g^{−1}(Y(t))) dB(t).
Example 4.38: Let X be the diffusion process in Example 4.36 and h : ℝ → ℝ
be a function with continuous second order derivative. Then,

∫_0^t h(X(s)) ∘ dX(s) = ∫_0^t h(X(s)) dX(s) + (1/2) ∫_0^t h'(X(s)) b(X(s))² ds,

where ∫_0^t h(X(s)) ∘ dX(s) and ∫_0^t h(X(s)) dX(s) denote Stratonovich and Itô in-
tegrals, respectively. ◊
Proof: We have seen that X is a semimartingale with continuous sample paths and that
d[X, X]^c(t) = d[X, X](t) = b(X(t))² dt. The above relationship between the Itô and
Stratonovich integrals follows from Eq. 4.61 with g' replaced by h. •
where a*(x, t) = a(x, t) − (1/2) b(x, t) [∂b(x, t)/∂x]. It is assumed that b(x, t)
has continuous second and first order partial derivatives with respect to x and t,
respectively. ◊
Proof: Let U(s) = b(X(s), s). The Itô and Stratonovich integrals of U with respect to B
are related by (Eq. 4.58)

The Itô formula applied to U gives

dU(t) = [a(X(t), t) ∂b/∂x + ∂b/∂t + (1/2) b(X(t), t)² ∂²b/∂x²] dt + b(X(t), t) (∂b/∂x) dB(t)

so that

d[U, B](t) = b(X(t), t) (∂b/∂x) d[B, B](t) = b(X(t), t) (∂b/∂x) dt.

Since [U, B] = [U, B]^c, the relationship between the Itô and Stratonovich integrals be-
comes

∫_0^t U(s) ∘ dB(s) = ∫_0^t U(s) dB(s) + (1/2) ∫_0^t b(X(s), s) (∂b(X(s), s)/∂x) ds,     (4.84)
Note: The Stratonovich version of Eq. 4.85 is (Eqs. 4.82 and 4.83)

X(t) = X(0) + ∫_0^t a*(X(s), s) ds + ∫_0^t b(X(s), s) ∘ dB(s),

where a* = a − (1/2) b ∂b/∂x.
these moments exist and are finite. Generally, the resulting moment equations
form an infinite hierarchy so that they cannot be solved exactly. ◊
Proof: If q ≥ 0 is an integer and g(x) = x^q, the Itô formula can be applied to the process
g(X) since X is a semimartingale and g ∈ C²(ℝ). The expectation of this formula gives

This equation becomes a differential equation for the moments of X(t) if the drift and
diffusion coefficients are polynomials. If a and b are polynomials of degree at most one,
the moment equations are closed so that they can be solved. •
with the convention μ(u; t) = 0 for any integer u ≤ −1, where μ̇(q; t) =
dμ(q; t)/dt. ◊
Proof: The drift and diffusion coefficients of X satisfy the uniform Lipschitz and the
growth conditions so that there exists a unique, adapted, and continuous solution if X(0) ∈
F_0. The differential form of the Itô formula is (Example 4.36)

so that

(d/dt) E[g(X(t))] = E[−α g'(X(t)) X(t) + (β²/2) g''(X(t))].

Eq. 4.86 results from the above equation with g(x) = x^q. •
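For the Ornstein-Uhlenbeck process the hierarchy closes: taking g(x) = x^q in the formula above gives μ̇(q; t) = −α q μ(q; t) + (β²/2) q (q − 1) μ(q − 2; t). The sketch below (illustrative; parameter values are assumptions) integrates the first two moment equations and compares them with the exact Gaussian moments of the OU process.

```python
import math

alpha, beta, x0 = 1.0, 0.7, 1.5
dt, n = 1e-4, 20000          # integrate the moment equations on [0, 2]

# Closed moment equations for dX = -alpha X dt + beta dB:
# mu1' = -alpha mu1,  mu2' = -2 alpha mu2 + beta^2  (mu(0) = 1).
mu1, mu2 = x0, x0 ** 2
for _ in range(n):
    d1 = -alpha * mu1
    d2 = -2.0 * alpha * mu2 + beta ** 2
    mu1 += d1 * dt
    mu2 += d2 * dt

t = dt * n
m_exact = x0 * math.exp(-alpha * t)                          # E[X(t)]
v_exact = beta ** 2 / (2 * alpha) * (1 - math.exp(-2 * alpha * t))
mu2_exact = m_exact ** 2 + v_exact                           # E[X(t)^2]
```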
Example 4.42: Let X be the Ornstein-Uhlenbeck process in Example 4.41. The
characteristic function φ(u; t) = E[exp(√−1 u X(t))], u ∈ ℝ, of X(t) satisfies
the partial differential equation

∂φ/∂t = −α u ∂φ/∂u − (β² u²/2) φ.     (4.87)

The stationary solution φ_s(u) = lim_{t→∞} φ(u; t) = exp(−β² u²/(4α)) of this
equation satisfies an ordinary differential equation obtained from Eq. 4.87 by set-
ting ∂φ/∂t = 0. Hence, the stationary value of an Ornstein-Uhlenbeck process at
any time t is a Gaussian variable with mean zero and variance β²/(2α), a known
result ([175], Example 5.6, p. 179). ◊
Proof: The partial differential equation of the characteristic function given by Eq. 4.87 can
be obtained by applying Itô's formula to g(X) = exp(√−1 u X), u ∈ ℝ. Because g is a
complex-valued function, the Itô formula previously developed cannot be applied directly.
However, this formula is linear in g so that we can apply it to the real and imaginary parts
of g separately and add the corresponding contributions.
This extended version of the Itô formula gives

∂φ(u; t)/∂t = E[−α X(t) √−1 u e^{√−1 u X(t)} + (β²/2) (√−1 u)² e^{√−1 u X(t)}]

when applied to g(X) = exp(√−1 u X). The result of Eq. 4.87 follows since ∂^k φ/∂u^k =
E[(√−1 X(t))^k exp(√−1 u X(t))], where k ≥ 1 is an integer. •
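The stationary variance β²/(2α) can also be checked by simulation. The sketch below (illustrative; an Euler discretization as in Section 4.7.3, with assumed parameters α = β = 1) propagates an ensemble of OU paths to a time at which transients have died out and compares the sample variance with β²/(2α) = 0.5.

```python
import math
import random

random.seed(3)
alpha, beta = 1.0, 1.0
dt, n_steps, n_paths = 0.01, 500, 5000   # integrate each path to t = 5

acc, acc2 = 0.0, 0.0
for _ in range(n_paths):
    x = 0.0
    for _ in range(n_steps):
        x += -alpha * x * dt + beta * math.sqrt(dt) * random.gauss(0.0, 1.0)
    acc += x
    acc2 += x * x

mean = acc / n_paths
var = acc2 / n_paths - mean * mean
target = beta ** 2 / (2.0 * alpha)       # stationary variance = 0.5
```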
Example 4.43: Let X be the solution of the stochastic differential equation

dX(t) = a(X(t), t) dt + b(X(t), t) dB(t), t ≥ t_0,     (4.88)

starting at X(t_0) = x_0. The density f(x; t | x_0; t_0) of the conditional variable
X(t) | (X(t_0) = x_0), referred to as the transition density of X, satisfies the
Fokker-Planck equation

∂f/∂t = −∂(a f)/∂x + (1/2) ∂²(b² f)/∂x²     (4.89)

with the initial condition f(x; t_0 | x_0; t_0) = δ(x − x_0) and the boundary conditions
lim_{|x|→∞} a(x, t) f(x; t | x_0; t_0) = 0, lim_{|x|→∞} b(x, t)² f(x; t | x_0; t_0) = 0, and
lim_{|x|→∞} ∂[b(x, t)² f(x; t | x_0; t_0)]/∂x = 0. Because X is a Markov process,
the density of X(t_0) and the transition density f(x; t | x_0; t_0) define all the finite
dimensional densities of this process.
The solution of the Fokker-Planck equation is completely defined by the
coefficients of Eq. 4.88 and does not depend on the particular version of B. Hence,
the Fokker-Planck equation can only deliver the weak solution of Eq. 4.88. We
also note that, if a process Y is such that the density of Y(t) | Y(t_0) = y_0 satisfies
Eq. 4.89, it does not mean that Y is a diffusion process. ◊
Proof: The expectation of the Itô formula (Eq. 4.42) gives

(d/dt) E[g] = E[a ∂g/∂x + (b²/2) ∂²g/∂x²]

for any real-valued function x ↦ g(x) with continuous second order derivative or

(d/dt) ∫_ℝ g f(x; t | x_0; t_0) dx = ∫_ℝ [a ∂g/∂x + (b²/2) ∂²g/∂x²] f(x; t | x_0; t_0) dx.

Integrating the last equation by parts and using the stated boundary conditions, we find

∫_ℝ g (∂f/∂t) dx = ∫_ℝ [−∂(a f)/∂x + (1/2) ∂²(b² f)/∂x²] g dx.

Since g is arbitrary, Eq. 4.89 follows. •
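For the Ornstein-Uhlenbeck equation the transition density is Gaussian with mean x_0 e^{−αt} and variance β²(1 − e^{−2αt})/(2α), and one can verify numerically that it satisfies Eq. 4.89 with a(x) = −αx and b = β. The check below uses central finite differences (an illustration; step sizes and sample points are ad hoc choices).

```python
import math

alpha, beta, x0 = 1.0, 1.0, 1.0

def f(x, t):
    """OU transition density f(x; t | x0; 0): Gaussian in x."""
    m = x0 * math.exp(-alpha * t)
    v = beta ** 2 * (1.0 - math.exp(-2.0 * alpha * t)) / (2.0 * alpha)
    return math.exp(-(x - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)

def fpe_residual(x, t, h=1e-4, k=1e-3):
    # d f / d t
    ft = (f(x, t + h) - f(x, t - h)) / (2.0 * h)
    # -d(a f)/dx with a(x) = -alpha x, i.e. +alpha d(x f)/dx
    dxf = ((x + k) * f(x + k, t) - (x - k) * f(x - k, t)) / (2.0 * k)
    # (beta^2/2) d^2 f / dx^2
    fxx = (f(x + k, t) - 2.0 * f(x, t) + f(x - k, t)) / k ** 2
    return ft - alpha * dxf - 0.5 * beta ** 2 * fxx

max_resid = max(abs(fpe_residual(x, t))
                for x in (-1.0, -0.2, 0.3, 1.1)
                for t in (0.3, 0.7, 1.5))
```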
Example 4.44: Let g(x, t) be a function with continuous second and first order
partial derivatives in x and t, respectively. The Stratonovich calculus obeys the
classical chain rule of differentiation, that is,

The Itô formula gives

Y(t) − Y(0) = ∫_0^t ((∂g/∂x) dX(s) + (∂g/∂s) ds) + (1/2) ∫_0^t (∂²g/∂x²) b² d[B, B](s)
            = ∫_0^t (a ∂g/∂x + ∂g/∂s + (1/2) b² ∂²g/∂x²) ds + ∫_0^t b (∂g/∂x) dB(s),

where g(x, t) has continuous second and first order partial derivatives in x and t, respec-
tively.
The relationship between the Itô and Stratonovich integrals given by Eq. 4.84 with
U replaced by b (∂g/∂x) yields

∫_0^t b (∂g/∂x) ∘ dB = ∫_0^t b (∂g/∂x) dB + (1/2) ∫_0^t ((∂b/∂x)(∂g/∂x) + b (∂²g/∂x²)) b ds

so that

Y(t) − Y(0) = ∫_0^t (∂g/∂s + a* (∂g/∂x)) ds + ∫_0^t b (∂g/∂x) ∘ dB,

where a* = a − (1/2) b ∂b/∂x (Example 4.39). Since we have d∘X(t) = a* dt + b ∘ dB(t),
the differential form of the above equation is d∘Y(t) = (∂g/∂x) d∘X(t) + (∂g/∂t) dt as
stated. •
Example 4.45: The Stratonovich calculus can be used to find the solution of some
Itô differential equations. For example, the solution of the Itô equation dX(t) =
a(X(t)) dt + b(X(t)) dB(t), t ≥ 0, with a(x) = h(x) h'(x)/2 and b(x) = h(x),
is X(t) = g^{−1}(g(X(0)) + B(t)), where h is a differentiable function and g is a
primitive of 1/h. ◊
Proof: The Stratonovich integral representation of the diffusion process X is X(t) =
X(0) + ∫_0^t h(X(s)) ∘ dB(s) since a* = a − (1/2) b ∂b/∂x = 0 (Eq. 4.83). The differ-
ential form of this equation is d∘X(t) = h(X(t)) ∘ dB(t). The rules of classical calculus
apply and give

∫_0^t d∘X(s)/h(X(s)) = ∫_0^t dB(s) = B(t)

so that g(X(t)) − g(X(0)) = B(t) ([131], Section 3.2.2). •
Example 4.46: The solution of the Itô stochastic differential equation dX(t) =
a(t) X(t) dt + b(t) X(t) dB(t) is

X(t) = X(0) exp[∫_0^t (a(s) − 0.5 b(s)²) ds + ∫_0^t b(s) dB(s)].

If a(t) = a and b(t) = b are constant, then

which yields the stated result. The process X is strictly positive a.s. on any bounded time
interval by the functional form of its stochastic differential equation and properties of the
Brownian motion, so that Itô's formula can be applied to the mapping x ↦ ln(x), which is
in C²((0, ∞)). •
Note: The requirement that G(·, ·, x) be an adapted, caglad process for each x ∈ ℝ guar-
antees that the stochastic integral in Eq. 4.91 is defined. The requirement that G(t, ω, ·)
satisfy a Lipschitz condition for each (t, ω) resembles the conditions in Section 4.7.1.1 for
the existence and uniqueness of the solution of differential equations driven by Gaussian
white noise.
The theorem in Eq. 4.91 can be stated for a system of stochastic integral equations
involving a finite number of input semimartingales ([147], p. 194). ▲
where S is a semimartingale. If X(0) ∈ F_0, t ↦ b(x, t) is continuous for each x,
and x ↦ b(x, t) satisfies a uniform Lipschitz condition for each t ≥ 0, then the
above equation has a unique solution that is a semimartingale.
If S is a compound Poisson process C, we have

X(t) = X(0) + Σ_{k=1}^{N(t)} b(X(T_k−), T_k) Y_k = X(0) + ∫_0^t b(X(s−), s) dC(s),

where (T_1, T_2, ...) and (Y_1, Y_2, ...) denote the jump times and the jumps of C,
respectively. ◊
Proof: The existence and uniqueness of the solution X follows from the above theorem.
The second condition on G in Eq. 4.91 becomes the uniform Lipschitz condition |b(t, x) −
b(t, x')| ≤ c |x − x'| for each t ≥ 0 and a constant c > 0 since b is a deterministic function.
If S is a compound Poisson process C, the sample paths of X are constant between
consecutive jumps of C and have jumps

at the jump times of C so that X(t) = X(T_k−) + b(X(T_k−), T_k) Y_k, t ∈ [T_k, T_{k+1}). This
recurrence formula gives the stated expression of X. ▲
X_i(t) = J_i(t) + Σ_{j=1}^{d'} ∫_0^t G_i^j(X)(s−) dS_j(s), i = 1, ..., d,     (4.92)

has a unique solution in D^d. If the processes J_i are semimartingales, then X_i,
i = 1, ..., d, are also semimartingales ([147], Theorem 7, p. 197).
Note: The solution of Eq. 4.92 is an ℝ^d-valued process X = (X_1, ..., X_d). This equation
extends Eq. 4.91 since it considers vector input/output processes and X depends on some
processes J_i. That the integrands in Eq. 4.92 do not depend explicitly on time is not a
restriction since the state X can be augmented to include time as a coordinate. ▲
If conditions for the existence and uniqueness for the solution of Eq. 4.92 hold
and X_i^(0) ∈ D, i = 1, ..., d, then the sequence X^(k), defined by the recurrence formula

X_i^(k+1)(t) = J_i(t) + Σ_{j=1}^{d'} ∫_0^t G_i^j(X^(k))(s−) dS_j(s), i = 1, ..., d,     (4.93)

converges to the solution of Eq. 4.92.
Example 4.49: Let X be the solution of the stochastic differential equation in Ex-
ample 4.48, where S = C is a compound Poisson process and a, b are not explicit
functions of time. This equation has a unique solution under the conditions stated
in Example 4.48.
We construct here the solution X based on elementary arguments. If there
is a function h with inverse h^{−1} such that h'(x) = 1/a(x), then

where X(T_k) = X(T_k−) + b(X(T_k−)) Y_k, T_k are the jump times of C, and Y_k
denote the jumps of this process. ◊
Example 4.50: Let X be the solution of dX(t) = X(t−) dC(t), t ≥ 0, where C
is a compound Poisson process. The Itô and Stratonovich interpretations of this
equation,

We have X(t) = X(0) + X(T_1−) ΔC(T_1) for t ∈ [T_1, T_2) so that

X(t) =  X(0)                        for t ∈ [0, T_1),
        X(0) (1 + Y_1)              for t ∈ [T_1, T_2),
        X(0) (1 + Y_1) (1 + Y_2)    for t ∈ [T_2, T_3).
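The product structure of these samples can be reproduced numerically. The snippet below (illustrative; jump intensity and jump distribution are assumptions) draws jump times and iid jumps for a compound Poisson path, applies the recurrence X(T_k) = X(T_k−)(1 + Y_k), and checks the result against X(0) ∏_k (1 + Y_k).

```python
import random

random.seed(4)
lam, T, x0 = 2.0, 5.0, 1.0

# Jump times of the Poisson process N on [0, T] and iid jumps Y_k.
times, t = [], 0.0
while True:
    t += random.expovariate(lam)
    if t > T:
        break
    times.append(t)
jumps = [random.uniform(-0.5, 0.5) for _ in times]

# Recurrence X(T_k) = X(T_k-) + X(T_k-) Y_k between consecutive jumps.
x = x0
for y in jumps:
    x = x + x * y

# Product form: X(T) = X(0) * prod_k (1 + Y_k).
prod = x0
for y in jumps:
    prod *= (1.0 + y)
```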
The relationship X(t) = X(0) ∏_{k=1}^{N(t)} (1 + Y_k), t ≥ 0, extends the above formulas to an arbitrary number of jumps.
Example 4.51: Let X be the process in Example 4.49 and g ∈ C²(ℝ) denote a
real-valued function. The change of the process g(X) in a time interval can be
obtained from the Itô formula in Eq. 4.40. A more familiar version of this result
is ([172], Theorem 4.2.2, p. 199)

where M is a Poisson random measure on [0, ∞) × ℝ defining a compound Pois-
son process C(t) = Σ_{k=1}^{N(t)} Y_k, E[M(ds, dy)] = (λ ds) dF(y), λ > 0 is the in-
tensity of the Poisson process N, and F denotes the distribution of the iid random
variables Y_k (Section 3.12). ◊
Proof: The Itô formula in Eq. 4.40 gives

Because X is a quadratic pure jump semimartingale, the above integral with integrator
[X, X]^c is zero so that

g(X(t)) − g(X(0)) = ∫_{0+}^t g'(X(s−)) [a(X(s−)) ds + b(X(s−)) dC(s)]
+ Σ_{0<s≤t} [g(X(s)) − g(X(s−)) − g'(X(s−)) ΔX(s)].

We also have

∫_{0+}^t g'(X(s−)) b(X(s−)) dC(s) = Σ_{0<s≤t} g'(X(s−)) ΔX(s)

so that

The result in Eq. 4.94 follows from the above equation and the integral form
of Σ_{0<s≤t} [g(X(s)) − g(X(s−))]. •
where a, b are such that this equation has a unique strong solution (Section 4.7.1.1).
An extensive discussion on numerical solutions for stochastic differential
equations driven by Brownian motion can be found in [115]. Essentials on nu-
merical solutions of Eq. 4.95 are in [131] (Section 3.4).
4.7.3.1 Definitions
Let p_n be a sequence of partitions of [0, t] with points 0 = t_0^(n) < t_1^(n) <
··· < t_{m_n}^(n) = t such that Δ(p_n) → 0 as n → ∞. We take m_n = n and t_k^(n) − t_{k−1}^(n) =
t/n so that the partition mesh is Δ(p_n) = t/n. In this section we use the notation
Δ(p_n) = δ_n = t/n for convenience.
A numerical solution X^(n) of Eq. 4.95 is a stochastic process that approxi-
mates the solution X of this equation. The values of X^(n) are calculated only at the
times t_k^(n) = k t/n, k = 0, 1, ..., n, that is, at the points of p_n. Any interpolation
method can be used to define X^(n) at all times in [0, t]. For example, the following
definition of X^(n) is based on linear interpolation.
Note: The approximating solution X^(n)(s) at a time s ∈ [t_{k−1}^(n), t_k^(n)] depends linearly on
the values of X^(n) at the two ends of this interval. The samples of X^(n) are continuous
but are not differentiable at the times t_k^(n) defining the partition p_n. ▲
Error functions. For a partition p_n of [0, t], we define the error functions
Note: The expectation E[sup_{0≤s≤t} |X(s) − X^(n)(s)|] provides a useful measure for the
accuracy of the approximation X^(n) over the entire time interval [0, t], but it is difficult to
evaluate. The error functions in Eq. 4.97 are calculated at the end of the time interval [0, t]
for simplicity.
We also note that e_s(p_n, t) measures the difference between the sample values of
the random variables X(t) and X^(n)(t) while e_w(p_n, t) gives the difference between the
expected values of these random variables. ▲
and
Note: This definition requires that the conditional expectation of the increment of X^(n)
in the time interval [t_k^(n), t_{k+1}^(n)] divided by δ_n converges in m.s. to the drift of X and that
the variance of the difference of the random parts of the approximate and exact solutions
converges to zero as δ_n → 0. An extensive discussion on this topic can be found in
[115] (Chapter 9). ▲
Proof: The increment of the solution X in Eq. 4.95 during a time interval [t_{k−1}^(n), t_k^(n)] is
given by the recurrence formula

X(t_k^(n)) = X(t_{k−1}^(n)) + ∫_{t_{k−1}^(n)}^{t_k^(n)} a(X(s)) ds + ∫_{t_{k−1}^(n)}^{t_k^(n)} b(X(s)) dB(s).     (4.99)

The Euler numerical solution X^(n) results from this equation by approximating the func-
tions a(X(s)) and b(X(s)), s ∈ [t_{k−1}^(n), t_k^(n)], by their values a(X(t_{k−1}^(n))) and b(X(t_{k−1}^(n))) at
the left end of this time interval. Any interpolation method can be used to define X^(n) at
all times in [0, t], for example, Eq. 4.96. •
The Euler numerical solution is the simplest scheme for solving stochastic
differential equations. This method is frequently used in the subsequent chap-
ters of the book to generate samples of diffusion and other processes defined by
stochastic differential equations.
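A minimal Euler-Maruyama sketch (illustrative, not the book's code) for the geometric Brownian motion of Example 4.34, dX = c X dt + a X dB, whose exact solution X(t) = x_0 exp((c − a²/2)t + a B(t)) is available for comparison; the mean pathwise error at t = 1 decreases as the partition is refined.

```python
import math
import random

def strong_error(n, n_paths, c=0.1, a=1.0, x0=1.0, T=1.0, seed=5):
    """Mean |X(T) - X^(n)(T)| over paths: Euler scheme vs exact GBM solution
    driven by the same Brownian increments."""
    rng = random.Random(seed)
    dt = T / n
    err = 0.0
    for _ in range(n_paths):
        x, btot = x0, 0.0
        for _ in range(n):
            dB = math.sqrt(dt) * rng.gauss(0.0, 1.0)
            x += c * x * dt + a * x * dB       # Euler step
            btot += dB
        exact = x0 * math.exp((c - 0.5 * a * a) * T + a * btot)
        err += abs(exact - x)
    return err / n_paths

err_coarse = strong_error(n=50, n_paths=300)
err_fine = strong_error(n=800, n_paths=300)
```

With strong order 0.5, a 16-fold refinement of the step should shrink the error by roughly a factor of 4.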
The Euler numerical solution converges strongly with order γ = 0.5 and weakly
with order γ = 1 under relatively mild conditions ([115], Theorem 9.6.2, p. 324,
[131], p. 162).
Note: An extensive discussion on the convergence of the Euler approximation can be found
in [115] (Chapters 8 and 9). For example, if X(0) ∈ L2 is F_0-measurable and a, b in
Eq. 4.95 are measurable functions in L2 satisfying some Lipschitz and growth conditions
(Eqs. 4.73 and 4.75), then a strongly consistent approximation X^(n) converges strongly to
X. Moreover, we have E[|X^(n)(t) − X(t)|] ≤ ρ (δ_n + ζ(δ_n))^{1/2}, where ρ is a positive
constant and ζ is the function in the definition of a strongly consistent approximation, that
is, ζ > 0 and lim_{u↓0} ζ(u) = 0 ([115], Theorem 9.6.2, p. 324). The function ζ is zero for
the Euler approximation so that the above upper bound becomes ρ δ_n^{1/2}. Hence, the Euler
approximation converges strongly with order γ = 0.5. ▲
Example 4.52: Consider the differential equation ẋ(s) = a(x(s), s), s ∈ [0, t],
starting at x(0) = x_0. If (1) the drift a is uniform Lipschitz in x, that is, there exists
β > 0 such that |a(x, s) − a(y, s)| ≤ β |x − y| for x, y ∈ ℝ and s ∈ [0, t], (2) the
functions ∂a/∂t and ∂a/∂x are continuous, and (3) all solutions x are bounded in
[0, t], then the error e_k^(n) = x(t_k^(n)) − x_k^(n) satisfies the inequality

|e_k^(n)| ≤ (α δ_n / (2β)) (e^{k β δ_n} − 1),

where α = max |∂a/∂t| + max |a ∂a/∂x| and x_k^(n) = x^(n)(t_k^(n)) denotes the Euler
approximation at time t_k^(n).
Figure 4.11 shows time histories of |e_k^(n)| and of the above upper bound on
the absolute value of the difference between the exact and the Euler numerical
solutions for a(x, t) = −5x, x(0) = 1, α = 25, β = 5, and two partitions p_n
of [0, 1] with n = 10 and 100. The bound on |e_k^(n)| is too wide to provide any
meaningful information on the accuracy of the Euler numerical solutions. Note
that the plots for n = 10 and n = 100 have different scales. ◊
Figure 4.11. Upper bound and error of the Euler numerical solution
Proof: The Euler recurrence and a second order Taylor expansion of the exact solution x
about t_k^(n) give

x_{k+1}^(n) = x_k^(n) + a(x_k^(n), t_k^(n)) δ_n,
e_{k+1}^(n) = e_k^(n) + [a(x(t_k^(n)), t_k^(n)) − a(x_k^(n), t_k^(n))] δ_n + (1/2) ẍ(θ_k^(n)) δ_n²,

where θ_k^(n) ∈ (t_k^(n), t_{k+1}^(n)). The absolute value of the square bracket in the above equation is smaller than β |x(t_k^(n)) −
x_k^(n)| = β |e_k^(n)| by assumption. We also have ẍ = da(x, t)/dt = ∂a/∂t + a ∂a/∂x so
that there exists α > 0 such that |ẍ(θ_k^(n))| ≤ α. The above recurrence formula for e_k^(n) and
the bounds give

|e_{k+1}^(n)| ≤ (1 + β δ_n) |e_k^(n)| + (α/2) δ_n² ≤ (α δ_n²/2) Σ_{j=0}^{k} (1 + β δ_n)^j

since e_0^(n) = 0. The latter inequality and (1 + c)^n ≤ e^{nc}, c > 0, deliver the stated bound
on the error with (c, n) replaced by (β δ_n, k + 1).
The drift of the differential equation considered in this example is a(x, t) = −5x
so that x(t) = e^{−5t} for x(0) = 1. The difference |a(x, t) − a(y, t)| = 5 |x − y| so that
we can take β = 5. Because ẍ = da(x, t)/dt = ∂a/∂t + a ∂a/∂x, ∂a/∂t = 0, and
a ∂a/∂x = (−5x)(−5) = 25x, we can take α = 25. •
X^(n)(t_k^(n)) − X^(n)(t_{k−1}^(n)) = a(X(t_{k−1}^(n))) Δt_k^(n) + b(X(t_{k−1}^(n))) ΔB_k^(n) + R̄_{k,1}^(n),     (4.100)

where

R̄_{k,1}^(n) = (1/2) b'(X(t_{k−1}^(n))) b(X(t_{k−1}^(n))) [(ΔB_k^(n))² − Δt_k^(n)].     (4.101)
Proof: Consider the increment of X in Eq. 4.99 during a time interval [t_{k−1}^(n), t_k^(n)]. Instead
of approximating the functions a(X(s)) and b(X(s)), s ∈ [t_{k−1}^(n), t_k^(n)], by their values
a(X(t_{k−1}^(n))) and b(X(t_{k−1}^(n))) at the left end of this time interval (Eq. 4.98), we use the Itô
formula to represent the integrands a and b in [t_{k−1}^(n), t_k^(n)]. The resulting expression is used
to establish the Milstein approximation.
The Itô formula applied to the functions a(X(s)) and b(X(s)) for s ∈ [t_{k−1}^(n), t_k^(n)]
gives

a(X(s)) − a(X(t_{k−1}^(n))) = ∫_{t_{k−1}^(n)}^s (a' a + (1/2) a'' b²) du + ∫_{t_{k−1}^(n)}^s a' b dB(u),
b(X(s)) − b(X(t_{k−1}^(n))) = ∫_{t_{k−1}^(n)}^s (b' a + (1/2) b'' b²) du + ∫_{t_{k−1}^(n)}^s b' b dB(u),

so that Eq. 4.99 becomes

X(t_k^(n)) − X(t_{k−1}^(n)) = ∫_{t_{k−1}^(n)}^{t_k^(n)} [a(X(t_{k−1}^(n))) + ∫_{t_{k−1}^(n)}^s (a' a + (1/2) a'' b²) du + ∫_{t_{k−1}^(n)}^s a' b dB(u)] ds
+ ∫_{t_{k−1}^(n)}^{t_k^(n)} [b(X(t_{k−1}^(n))) + ∫_{t_{k−1}^(n)}^s (b' a + (1/2) b'' b²) du + ∫_{t_{k−1}^(n)}^s b' b dB(u)] dB(s)

or

X(t_k^(n)) − X(t_{k−1}^(n)) = a(X(t_{k−1}^(n))) Δt_k^(n) + b(X(t_{k−1}^(n))) ΔB_k^(n) + R_k^(n),

where R_k^(n) = R_{k,1}^(n) + R_{k,2}^(n) represents the correction relative to the Euler approximation
and

R_{k,1}^(n) = ∫_{t_{k−1}^(n)}^{t_k^(n)} [∫_{t_{k−1}^(n)}^s b' b dB(u)] dB(s)
≈ R̄_{k,1}^(n) = b'(X(t_{k−1}^(n))) b(X(t_{k−1}^(n))) ∫_{t_{k−1}^(n)}^{t_k^(n)} [∫_{t_{k−1}^(n)}^s dB(u)] dB(s)
= (1/2) b'(X(t_{k−1}^(n))) b(X(t_{k−1}^(n))) [(ΔB_k^(n))² − Δt_k^(n)].
The arguments of the functions a and b were not written in some of the previous formulas
for simplicity. The final expression of R̄_{k,1}^(n) above is valid since (Eq. 4.3)

∫_{t_{k−1}^(n)}^{t_k^(n)} B(s) dB(s) = (1/2) (B(t_k^(n))² − t_k^(n)) − (1/2) (B(t_{k−1}^(n))² − t_{k−1}^(n)).

The Milstein approximation results by neglecting the component R_{k,2}^(n) of R_k^(n) and replac-
ing R_{k,1}^(n) by R̄_{k,1}^(n). •
If the drift and diffusion coefficients of X satisfy some type of Lipschitz and
growth conditions, then the Milstein approximation converges strongly with or-
der γ = 1 ([115], Theorem 10.3.5, p. 350).
Note: This property shows that the Milstein approximation is superior to the Euler approx-
imation. However, the difference between the Milstein and the Euler approximate solutions
for Eq. 4.95 corresponding to the same partition of the time interval [0, t] may not be sig-
nificant (Example 4.53). ▲
with the initial condition X(0) = x_0, where c and a are some constants. The
geometric Brownian motion X(t) = x_0 e^{(c−0.5a²)t+aB(t)} is the solution of this
equation (Example 4.34).
Figure 4.12 shows with dotted and solid lines samples of the exact solution
and the corresponding samples of X^(n) in the time interval [0, 1] obtained by the
Euler and Milstein formulas for c = 0.1, a = 2, and two partitions p_n of [0, 1]
with n = 50 and n = 150. The samples of X and X^(n) in the figure correspond to
the same sample B(·, ω) of B. The Milstein approximation is slightly superior to
the Euler approximation, an expected result since these approximations converge
strongly with order γ = 1 and γ = 0.5, respectively. ◊
Note: The Euler and the Milstein formulas for the geometric Brownian motion are

X_{k+1}^(n) = (1 + c/n) X_k^(n) + a X_k^(n) ΔB_k^(n)

and

X_{k+1}^(n) = (1 + c/n) X_k^(n) + a X_k^(n) ΔB_k^(n) + (a²/2) X_k^(n) [(ΔB_k^(n))² − 1/n],

respectively, with the notations X_k^(n) = X^(n)(t_k^(n)) and ΔB_k^(n) = B((k + 1)/n) − B(k/n)
for the increment of the Brownian motion in [k/n, (k + 1)/n].

Figure 4.12. Samples of the Euler and Milstein numerical solutions for a geomet-
ric Brownian motion process
Estimates of the error function e_s(p_n, t) in Eq. 4.97 must be based on samples of
the exact and approximate solutions corresponding to the same sample of the Brownian
motion, that is,

ê_s(p_n, t) = (1/n_s) Σ_{ω=1}^{n_s} |X(t, ω) − X^(n)(t, ω)|,

where X(·, ω) and X^(n)(·, ω) are calculated from the same sample B(·, ω) of B and n_s ≥ 1
is an integer denoting the number of samples used in estimation. The samples of X and
X^(n) needed to estimate the error e_w(p_n, t) do not have to correspond to the same samples
of the Brownian motion process B. ▲
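The comparison in Example 4.53 can be reproduced with this estimator. The sketch below (illustrative; parameter values are assumptions) runs Euler and Milstein on the same Brownian increments for the geometric Brownian motion and estimates e_s at t = 1; the Milstein error is visibly smaller, consistent with strong orders 1 and 0.5.

```python
import math
import random

def gbm_errors(n, n_paths, c=0.1, a=1.0, x0=1.0, T=1.0, seed=6):
    """Estimate e_s(p_n, T) for the Euler and Milstein schemes applied to
    dX = c X dt + a X dB, using the same increments for both schemes."""
    rng = random.Random(seed)
    dt = T / n
    e_euler = e_mil = 0.0
    for _ in range(n_paths):
        xe = xm = x0
        btot = 0.0
        for _ in range(n):
            dB = math.sqrt(dt) * rng.gauss(0.0, 1.0)
            xe += c * xe * dt + a * xe * dB
            xm += (c * xm * dt + a * xm * dB
                   + 0.5 * a * a * xm * (dB * dB - dt))  # Milstein correction
            btot += dB
        exact = x0 * math.exp((c - 0.5 * a * a) * T + a * btot)
        e_euler += abs(exact - xe)
        e_mil += abs(exact - xm)
    return e_euler / n_paths, e_mil / n_paths

e_euler, e_milstein = gbm_errors(n=64, n_paths=400)
```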
The difference between the Euler and the Milstein numerical solutions in
Eqs. 4.98, 4.100, and 4.101 is the correction term R̄_{k,1}^(n) that depends on the diffu-
sion part of Eq. 4.95. A major source of numerical error and instability relates to
the crude approximation of the drift part of Eq. 4.95 by both methods. Methods
developed for the numerical solution of deterministic differential equations can
provide superior approximations to the solution of Eq. 4.95. Consider a deter-
ministic differential equation

assumed to have a unique solution in [0, t]. For this equation the Euler and Mil-
stein approximations coincide. We present an alternative to these numerical solu-
tions, which provides a superior approximation to the solution of Eq. 4.102.
where x_k^(n) = x^(n)(t_k^(n)), α_{k,1} = a(x_k^(n), t_k^(n)), α_{k,2} = a(x_k^(n) + α_{k,1} δ_n/2, t_k^(n) +
δ_n/2), α_{k,3} = a(x_k^(n) + α_{k,2} δ_n/2, t_k^(n) + δ_n/2), and α_{k,4} = a(x_k^(n) + α_{k,3} δ_n, t_{k+1}^(n)).
Note: Additional information on the Runge-Kutta and related numerical methods for solv-
ing deterministic differential equations can be found in [115] (Section 8.2). ▲
where X_k^(n) = X^(n)(t_k^(n)), ΔB_k^(n) = B(t_{k+1}^(n)) − B(t_k^(n)), and the coefficients A_{k,i} have
the definitions of the coefficients α_{k,i} in Eq. 4.103 with X_k^(n) in place of x_k^(n).
Figure 4.13 shows samples of the Euler and Runge-Kutta (R-K) solutions
for p = 1, β = −1, n = 250, and n = 10,000 corresponding to the same sample
of B. The Euler numerical solution diverges if the time step is relatively large
(n = 250, δ_n = 100/250). On the other hand, the solution given by the Runge-
Kutta method is bounded for n = 250. Moreover, estimates of moments and other
properties of X from samples of X^(n) generated by the Runge-Kutta method with
n = 250 are accurate. The solutions of the Euler and the Runge-Kutta methods
are satisfactory and nearly coincide for δ_n = 100/10,000. ◊
Note: The selection of the time step δ_n for a numerical solution X^(n) must be such that the solution does not diverge and provides sufficient information on the samples and/or other properties of X. For example, δ_n = 100/250 is inadequate for the Euler method because the corresponding numerical solution X^(n) diverges, but seems to be satisfactory if the analysis is based on the Runge-Kutta method. &
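The effect of treating the drift with a higher-order deterministic scheme can be illustrated with a small numerical experiment. The equation dX(t) = −X(t)^3 dt + σ dB(t) below is a hypothetical stand-in for Eq. 4.95 (with additive noise, so that the Euler and Milstein solutions coincide); the Runge-Kutta step is applied to the drift only, and the Brownian increment is added as in the Euler scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def drift(x):
    return -x**3          # illustrative stiff drift; not the book's Eq. 4.95

def euler_path(x0, sigma, dt, nsteps, dB):
    x = np.empty(nsteps + 1); x[0] = x0
    for k in range(nsteps):
        x[k + 1] = x[k] + drift(x[k]) * dt + sigma * dB[k]
    return x

def rk4_drift_path(x0, sigma, dt, nsteps, dB):
    # classical Runge-Kutta treatment of the drift, Euler treatment of the noise
    x = np.empty(nsteps + 1); x[0] = x0
    for k in range(nsteps):
        a1 = drift(x[k])
        a2 = drift(x[k] + a1 * dt / 2)
        a3 = drift(x[k] + a2 * dt / 2)
        a4 = drift(x[k] + a3 * dt)
        x[k + 1] = x[k] + (a1 + 2*a2 + 2*a3 + a4) * dt / 6 + sigma * dB[k]
    return x

T, n = 10.0, 1000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), n)   # same Brownian sample for both schemes
xe = euler_path(2.0, 0.5, dt, n, dB)
xr = rk4_drift_path(2.0, 0.5, dt, n, dB)
print(np.max(np.abs(xe)), np.max(np.abs(xr)))
```

With this small time step both schemes track each other closely; for coarse time steps the Euler path can blow up while the Runge-Kutta path stays bounded, mirroring the behavior reported for Figure 4.13.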
284 Chapter 4. Ito's Formula and Stochastic Differential Equations
Figure 4.13. Samples of the Euler and Runge-Kutta numerical solutions
4.8 Problems
4.1: Show that the stochastic integral J_X(H) in Eq. 4.16 is an adapted càdlàg process.
4.2: Prove the equality ∫_0^t s dB(s) = t B(t) − ∫_0^t B(s) ds, where B is a Brownian motion.
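The identity in Problem 4.2 can also be checked numerically by approximating both sides with sums over a fine partition of [0, t]; this is a sanity check under discretization, not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)
t, n = 1.0, 100_000
ds = t / n
s = np.linspace(0.0, t, n + 1)
dB = rng.normal(0.0, np.sqrt(ds), n)          # Brownian increments
B = np.concatenate(([0.0], np.cumsum(dB)))    # Brownian path, B(0) = 0

lhs = np.sum(s[:-1] * dB)                 # Ito sum for the integral of s dB
rhs = t * B[-1] - np.sum(B[:-1]) * ds     # t B(t) minus a Riemann sum of B
print(lhs, rhs)
```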
4.3: Prove the associativity and the preservation properties of the stochastic integral (Section 4.4.4) for the special case of simple predictable integrands.
4.4: Show that the jump process Δ(H · X) is indistinguishable from H (ΔX), where H ∈ E and X is a semimartingale (Section 4.4.4).
4.5: Show that C^2 is a semimartingale, where C is the compound Poisson process in Eq. 4.13.
4.6: Check whether the processes
4.7: Show that X(t) = ∫_0^t h(s) dB(s), t ∈ [0, τ], τ > 0, is a Gaussian process with mean zero and covariance function E[X(s) X(t)] = ∫_0^{s∧t} h(u)^2 du, where h : [0, τ] → ℝ is a continuous function.
4.10: Write the Ito and the Stratonovich differential equations satisfied by the process Y(t) = e^{B(t)}, t ≥ 0, where B is a Brownian motion.
4.11: Transform the following Stratonovich and Ito differential equations into Ito and Stratonovich differential equations, respectively, that is, the Stratonovich equations
4.12: Show that X(t) = X(0) e^{−α t} + β ∫_0^t e^{−α (t−s)} dB(s) is the solution of the Ornstein-Uhlenbeck equation in Example 4.41.
4.16: Find the solution of dX(t) = a X(t) dt + b X(t) dC(t), X(0) > 0, where C is the compound Poisson process in Eq. 4.13 and a, b are constants.
Chapter 5
Monte Carlo Simulation
5.1 Introduction
Case 1 (d = 1). If X ∼ N(μ, γ), the scaled and translated MATLAB function μ + √γ randn can be used to generate independent samples of X. Some of the algorithms used to generate samples of a Gaussian variable are based on memoryless transformations of some random variables [27, 79, 155]. For example, Gaussian variables can be related simply to uniformly distributed random variables (Eq. 5.1).
Proof: If Z_1 and Z_2 are independent copies of N(0, 1), their density is f(z_1, z_2) = exp[−(z_1^2 + z_2^2)/2]/(2π). The density of the random variables (R, Θ) defined by the mapping Z_1 = R cos(Θ) and Z_2 = R sin(Θ) is f_{R,Θ}(r, θ) = r exp(−r^2/2)/(2π). Hence, R has the density f_R(r) = r exp(−r^2/2), Θ is uniformly distributed on (0, 2π], and the variables (R, Θ) are independent. Samples of R can be generated from samples of U_1 ∼ U(0, 1) and the representation R ≜ √(−2 ln(U_1)) derived from F_R(R) = U_1 and the fact that U_1 and 1 − U_1 have the same distribution, where F_R denotes the distribution of R. This method for generating samples of random variables is discussed further in Eq. 5.4. The projection of R on the coordinates (z_1, z_2) gives Eq. 5.1. •
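The proof above is the Box-Muller construction. A minimal sketch in Python (the sample size is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

def box_muller(n):
    # R = sqrt(-2 ln U1), Theta = 2 pi U2; project R on the two coordinates
    u1 = rng.random(n)
    u2 = rng.random(n)
    r = np.sqrt(-2.0 * np.log(u1))
    z1 = r * np.cos(2.0 * np.pi * u2)
    z2 = r * np.sin(2.0 * np.pi * u2)
    return z1, z2

z1, z2 = box_muller(200_000)
print(z1.mean(), z1.var(), np.corrcoef(z1, z2)[0, 1])
```

The two output streams are independent N(0, 1) samples, which the printed estimates confirm approximately.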
β_{ij} = ( γ_{ij} − Σ_{r=1}^{j−1} β_{ir} β_{jr} ) / ( γ_{jj} − Σ_{r=1}^{j−1} β_{jr}^2 )^{1/2}, 1 ≤ j ≤ i ≤ d, with Σ_{r=1}^{0} β_{ir} β_{jr} = 0. (5.3)
5.2. Random variables 289
Note: A sample of the column vector G can be obtained from the MATLAB function
randn(d, 1). A proof of the validity of Eqs. 5.2 and 5.3 can be found in [155] (Sec-
tion 3.5.3). An alternative method for generating samples of X is in Problem 5.1. &
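A sketch of the corresponding sampling step. It uses numpy's Cholesky factorization in place of the recursion in Eq. 5.3 (both produce the same lower-triangular β with β β^T = γ); the mean vector and covariance matrix below are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([1.0, -2.0, 0.5])
gamma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])   # covariance matrix, positive definite

beta = np.linalg.cholesky(gamma)      # lower-triangular, beta @ beta.T == gamma
n = 200_000
G = rng.normal(size=(3, n))           # independent N(0, 1) coordinates
X = mu[:, None] + beta @ G            # columns are samples of N(mu, gamma)

print(np.cov(X))
```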
Example 5.1: Let X ∼ S_α(σ, β, μ) be an α-stable random variable with scale σ > 0, skewness β, and location μ, where |β| ≤ 1 and α ∈ (0, 2]. If β = 0 and μ = 0, X has a symmetric density about zero and its samples for σ = 1 can be generated from samples of Z ∼ S_α(1, β, 0), where
with V and W as in Eq. 5.5, v_0 = −π β h(α)/(2α), and h(α) = 1 − |1 − α| ([79], Section 4.8.1, [162], Section 1.7). <>
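A sketch of the symmetric case (β = 0, μ = 0, σ = 1) using the Chambers-Mallows-Stuck representation, one standard form of the generator referred to above, with V uniform on (−π/2, π/2) and W a unit-mean exponential variable; for α = 2 the output is N(0, 2).

```python
import numpy as np

rng = np.random.default_rng(4)

def symmetric_stable(alpha, n):
    # Chambers-Mallows-Stuck: V ~ U(-pi/2, pi/2), W ~ Exp(1)
    V = rng.uniform(-np.pi / 2, np.pi / 2, n)
    W = rng.exponential(1.0, n)
    if alpha == 1.0:
        return np.tan(V)              # Cauchy case
    return (np.sin(alpha * V) / np.cos(V) ** (1.0 / alpha)
            * (np.cos(V - alpha * V) / W) ** ((1.0 - alpha) / alpha))

z = symmetric_stable(2.0, 200_000)    # S_2(1, 0, 0) is N(0, 2)
print(z.var())
```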
Example 5.2: Let X ∼ S_1(σ, 0, 0), σ > 0, be a Cauchy variable with density f(x) = σ/[π (x^2 + σ^2)] and characteristic function φ(u) = e^{−σ |u|}. Figure 5.1 shows n = 100 independent samples of X and a histogram of X based on 10,000
Figure 5.1. Samples of X ∼ S_1(0.2, 0, 0), a histogram of X, and the density f(x)
samples of this variable, respectively, for σ = 0.2. The figure also shows the density f of X. The samples of X have been obtained from σ Z and samples of Z in Eq. 5.6 for σ = 1. <>
samples of N by the inverse method in Eq. 5.4. However, the approach is inefficient if the probability that N takes relatively large values is not negligible. If E[N] = λ ≥ 10, it is convenient to generate samples of N from
where [x] denotes the integer part of x and G ∼ N(0, 1) ([155], Section 3.7.2).
Note: The approximation of N by the random variable Z in Eq. 5.8 is acceptable for large values of λ because the distribution of the random variable (N − λ)/√λ converges to the distribution of N(0, 1) as λ increases indefinitely. •
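Eq. 5.8 itself is not reproduced above; a common variant consistent with the note, taking Z as the integer part of λ + √λ G and clipping at zero, can be sketched as follows (the clipping at zero is an added assumption, not necessarily the book's exact formula).

```python
import numpy as np

rng = np.random.default_rng(5)

def poisson_normal_approx(lam, n):
    # Z = integer part of lam + sqrt(lam) G, clipped at 0; an assumption
    # consistent with the note above, not necessarily the book's Eq. 5.8
    g = rng.normal(size=n)
    return np.maximum(np.floor(lam + np.sqrt(lam) * g), 0.0)

lam = 50.0
z = poisson_normal_approx(lam, 200_000)
print(z.mean(), z.var())
```

For λ = 50 the sample mean and variance are close to λ, as the normal limit of (N − λ)/√λ suggests.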
Case 2 (d > 1). Let F_1, F_{k|k−1,...,1}, k = 2, ..., d, and F denote the distributions of the coordinate X_1 of X ∈ ℝ^d, the conditional random variable X_k | (X_{k−1}, ..., X_1), and X = (X_1, ..., X_d), respectively.
where U_1, ..., U_d are independent U(0, 1) random variables. Then, X and Z are equal in distribution.
Proof: The random vector Z has the same distribution as X by definition (Eq. 5.9). The representation of Z in Eq. 5.9 constitutes an extension of the inverse transform method in Eq. 5.4 to random vectors. •
Note: Let (u_1, ..., u_d) be a sample of (U_1, ..., U_d) and z = (z_1, ..., z_d) be the corresponding sample of Z obtained from Eq. 5.9. The first coordinate of z is z_1 = F_1^{−1}(u_1). The other coordinates of z can be calculated from F_{k|k−1,...,1}(z_k) = u_k for increasing values of k ≥ 2, where the vector (z_{k−1}, ..., z_1) has already been calculated in a previous step. •
f(x_1, x_2) = (1/(σ δ)) φ((x_1 − μ)/σ) φ((x_2 − x_1)/δ).
Figure 5.2. Samples of X = (X_1, X_2), where X_1 ∼ N(μ, σ^2) and X_2 | (X_1 = x_1) ∼ N(x_1, δ^2) for (σ = 1, δ = 1) and (σ = 1, δ = 3)
The mapping in Eq. 5.9 becomes Z_1 = μ + σ Φ^{−1}(U_1) and Z_2 | (Z_1 = z_1) = z_1 + δ Φ^{−1}(U_2), where U_1 and U_2 are independent copies of U(0, 1). Figure 5.2 shows 100 independent samples of X generated by the algorithm in Eq. 5.9 for σ = 1, δ = 1, and δ = 3. The dependence between X_1 and X_2 is stronger for smaller values of δ. <>
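For the bivariate example above, the mapping Z_1 = μ + σ Φ^{−1}(U_1), Z_2 = z_1 + δ Φ^{−1}(U_2) can be sketched directly; the parameter values below are illustrative, and the standard normal quantile comes from the Python standard library.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(6)
ppf = np.vectorize(NormalDist().inv_cdf)   # standard normal quantile Phi^{-1}

mu, sigma, delta, n = 0.0, 1.0, 1.0, 20_000
u1, u2 = rng.random(n), rng.random(n)
z1 = mu + sigma * ppf(u1)       # Z1 = mu + sigma Phi^{-1}(U1)
z2 = z1 + delta * ppf(u2)       # Z2 | (Z1 = z1) is N(z1, delta^2)
print(np.corrcoef(z1, z2)[0, 1])
```

For σ = δ = 1 the theoretical correlation of (Z_1, Z_2) is σ^2/(σ √(σ^2 + δ^2)) = 1/√2, which the estimate approximates.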
points, r = 1, ..., n. Samples of the random vector X can be generated from
exp( − ∫_{S^d} |b|^α γ(du) + √−1 tan(π α/2) ∫_{S^d} |b|^α sign(b) γ(du) ), α ≠ 1, 2,
exp( − ∫_{S^d} |b| γ(du) − √−1 (2/π) ∫_{S^d} b ln(|b|) γ(du) ), α = 1,
exp( − ∫_{S^d} |b|^2 γ(du) ), α = 2
Example 5.6: Let X ∈ ℝ^d be a translation vector, that is, X = g(Y), where Y is an ℝ^d-valued Gaussian vector with mean zero and covariance matrix ρ = {ρ_{i,j} = E[Y_i Y_j]} such that ρ_{i,i} = 1, i = 1, ..., d. Suppose that the memoryless mapping from Y to X is given by X_i = F_i^{−1}(Φ(Y_i)) = g_i(Y_i), i = 1, ..., d, where F_i are absolutely continuous distribution functions.
Samples of X can be generated from (1) samples of Y and the definition of X or (2) samples of U(0, 1), the distribution of X, and the algorithm in Eq. 5.9. The latter approach is less efficient for translation random vectors. <>
P(X_1 ≤ x_1, ..., X_d ≤ x_d) = P(Y_1 ≤ y_1, ..., Y_d ≤ y_d) = Φ_d(y; ρ),
called the multivariate translation distribution, is the input to the Monte Carlo simulation algorithm based on Eq. 5.9, where Φ_d(·; ρ) denotes the joint distribution function of the Gaussian vector Y ∼ N(0, ρ). &
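A sketch of approach (1) for a bivariate translation vector: generate correlated Gaussian samples Y, then apply X_i = F_i^{−1}(Φ(Y_i)). The exponential marginals and the correlation value 0.7 are illustrative assumptions, not from the text.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(7)
Phi = np.vectorize(NormalDist().cdf)       # standard normal distribution function

rho = np.array([[1.0, 0.7], [0.7, 1.0]])   # correlation of the Gaussian image Y
beta = np.linalg.cholesky(rho)
n = 50_000
Y = beta @ rng.normal(size=(2, n))         # samples of N(0, rho)

# illustrative marginals: exponential with rate 1, F^{-1}(u) = -ln(1 - u)
X = -np.log(1.0 - Phi(Y))
print(X.mean(axis=1))
```

The output marginals have mean one, and the dependence of Y survives the memoryless mapping (with a somewhat reduced correlation coefficient).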
Fixed frequencies. We first introduce some notation and definitions. Let ν* > 0 be a cutoff frequency and p_n = (0 = a_0 < a_1 < ··· < a_n = ν*) be a partition of the frequency range [0, ν*]. Denote by Δν_r = a_r − a_{r−1} and ν_r = (a_{r−1} + a_r)/2, r = 1, ..., n, the length and the midpoint of the frequency intervals defined by p_n. It is common to take equal frequency intervals, in which case we have Δν_r = ν*/n and ν_r = (r − 1/2) ν*/n. Consider also the ℝ^d-valued Gaussian variables A_r, B_r with mean zero and second moments
E[A_{r,k} A_{p,l}] = E[B_{r,k} B_{p,l}] = δ_{rp} ∫_{a_{r−1}}^{a_r} g_{k,l}(ν) dν ≈ δ_{rp} g_{k,l}(ν_r) Δν_r,
where g(ν) = s(ν) + s(−ν) and h(ν) = −√−1 (s(ν) − s(−ν)) (Section 3.9.4.2).
The approximate equalities in Eq. 5.11 can be used for small values of Δν_r.
If the frequency band of the process X is not bounded, a cutoff frequency ν* > 0 has to be selected to apply the Monte Carlo simulation algorithm considered in this section. The cutoff frequency has to be such that most of the energy of the process X corresponds to harmonics with frequencies in [0, ν*].
We now define a sequence of processes X^(n) and show that X^(n) is approximately a version of X for sufficiently large values of n. This property justifies the use of samples of X^(n) as substitutes for samples of X.
Proof: Recall that Δ(p_n) = max_{1≤r≤n} (a_r − a_{r−1}) denotes the mesh of p_n. For any partition of the frequency range [0, ν*], X^(n) is a process with mean zero and covariance function with entries
5.3. Stochastic processes and random fields 295
Hence, X^(n) is weakly stationary for each n. This process is Gaussian because it depends linearly on the Gaussian variables A_r and B_r. If d = 1, then g_{1,1} = g and h_{1,1} = 0 so that the covariance function of X^(n) is
lim_{n→∞} E[(X^(n)(t) X^(n)(s)^T)_{ij}] = ∫_0^{ν*} [g_{i,j}(ν) cos(ν (t − s)) − h_{i,j}(ν) sin(ν (t − s))] dν
Example 5.7: Let X be an ℝ^2-valued stationary Gaussian process with spectral density
s_{k,l}(ν) = (1 − ρ) δ_{k,l} s_k(ν) + ρ s̄(ν), k, l = 1, 2,
where s_k(ν) = 1/(2 ν_k) 1_{[−ν_k, ν_k]}(ν), s̄(ν) = 1/(2 ν̄) 1_{[−ν̄, ν̄]}(ν), and 0 < ν_k, ν̄ < ∞. Figure 5.3 shows sample paths of the coordinates of X for ν_k = 25, ν̄ = 5, two values of ρ, and n = 100 harmonics (Eq. 5.12). The frequency content of X depends strongly on ρ. The coordinates X_1 and X_2 of X are nearly in phase for values of ρ close to unity. <>
Note: Set ν* = max(ν_1, ν_2, ν̄), Δν_r = ν*/n, and ν_r = (r − 1/2) Δν_r for r = 1, ..., n. The covariance matrix of the random amplitudes A_r and B_r can be calculated from Eq. 5.11. The samples in Fig. 5.3 have been generated by the model in Eq. 5.12. &
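A scalar (d = 1) sketch of the fixed-frequency model with equal intervals, drawing independent amplitudes A_r, B_r ∼ N(0, g(ν_r) Δν) per the approximation in Eq. 5.11 and superposing harmonics as in Eq. 5.12; the band-limited one-sided density g below is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(8)

def spectral_samples(g, nu_star, n, t):
    # X^(n)(t) = sum_r [A_r cos(nu_r t) + B_r sin(nu_r t)], with independent
    # A_r, B_r ~ N(0, g(nu_r) dnu)  (midpoint approximation of Eq. 5.11)
    dnu = nu_star / n
    nu = (np.arange(1, n + 1) - 0.5) * dnu
    var = g(nu) * dnu
    A = rng.normal(0.0, np.sqrt(var))
    B = rng.normal(0.0, np.sqrt(var))
    return (A[:, None] * np.cos(np.outer(nu, t))
            + B[:, None] * np.sin(np.outer(nu, t))).sum(axis=0)

nu_star = 5.0
g = lambda nu: np.where(nu <= nu_star, 1.0 / nu_star, 0.0)   # one-sided density
t = np.linspace(0.0, 10.0, 200)
x = spectral_samples(g, nu_star, 100, t)
print(x[:3])
```

Since the density integrates to one over [0, ν*], the pointwise variance of the generated process is approximately one.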
Figure 5.3. Samples of the coordinates X_1 and X_2 of X for ρ = 0.3 and ρ = 0.7 (adapted from [79], Fig. 4.4)
Proof: The parameter λ > 0 indexing the sequence of processes approximating X has a similar role as n = 1, 2, ... in Eq. 5.12. The first equality in Eq. 5.13 is a statement of the spectral representation for an approximation of X obtained by neglecting the power of this process beyond a cutoff frequency ν* > 0. The second equality in Eq. 5.13 holds since N_r(ν*) < ∞ a.s. for ν* < ∞ and the integrators C_r, r = 1, 2, have piecewise constant samples.
The process X^(λ) has mean zero and covariance function
E[X^(λ)(t) X^(λ)(s)] = λ μ_2 ∫_0^{ν*} ζ(ν)^2 [cos(ν t) cos(ν s) + sin(ν t) sin(ν s)] dν = ∫_0^{ν*} g(ν) cos(ν (t − s)) dν,
where the last equality holds because λ μ_2 ζ(ν)^2 = g(ν) by hypothesis. Hence, the covariance function of X^(λ) converges to the covariance function of X as ν* → ∞, that is, X and lim_{ν*→∞} X^(λ) are equal in the second moment sense for each λ > 0. It can also be shown that X̃^(λ)(t) = X^(λ)(t)/Var[X^(λ)(t)]^{1/2} converges in distribution to N(0, 1) ([79], Example 3.12, p. 83). The approach in [79] can be used to show that the distribution of a vector with coordinates X̃^(λ)(t_k) becomes Gaussian as λ → ∞, where t_k are arbitrary times. In summary, we showed that (1) the second moments of X^(λ) approach the second moments of X as ν* → ∞ for any λ > 0 and (2) X^(λ) converges to a Gaussian process as λ → ∞ for any ν* > 0. Hence, X^(λ) becomes a version of X as λ and ν* increase indefinitely. •
The models X^(n) and X^(λ) in Eqs. 5.12 and 5.13 have similar features but differ in several notable ways. Both models consist of a superposition of a finite number of harmonics with random amplitudes. However, the frequencies of these harmonics are fixed for X^(n) but are random for X^(λ). The frequencies of the constituent harmonics of X^(λ) coincide with the jump times of the compound Poisson processes C_1 and C_2. Both processes, X^(n) and X^(λ), are weakly stationary for any n, λ, and ν*. However, X^(n) is a Gaussian process for any n and ν* while X^(λ) is Gaussian asymptotically as λ → ∞ for any value of ν*.
Note: The methods in Section 5.2.1 can be used to generate samples of the jumps of the compound Poisson processes C_r. The jump times of these processes can be generated in at least two ways. One approach is based on the representation T_{r,i} = Σ_{j=1}^{i} W_{r,j} of the jump times of C_r, where W_{r,j} are iid exponential random variables with mean 1/λ. The other approach can first generate samples of the Poisson variables N_r(ν*) (Example 5.3) and then place N_r(ν*) independent uniformly distributed points in [0, ν*]. &
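The two ways of generating the jump times can be sketched side by side; the values of λ and ν* are illustrative.

```python
import numpy as np

rng = np.random.default_rng(9)
lam, nu_star = 0.5, 20.0

# Approach 1: cumulative sums of iid exponential inter-arrival times with
# mean 1/lam, kept while they fall in [0, nu_star]
w = rng.exponential(1.0 / lam, 50)
times1 = np.cumsum(w)
times1 = times1[times1 <= nu_star]

# Approach 2: draw N(nu_star) ~ Poisson(lam * nu_star), then place that many
# independent uniformly distributed points in [0, nu_star]
n_pts = rng.poisson(lam * nu_star)
times2 = np.sort(rng.uniform(0.0, nu_star, n_pts))

print(len(times1), len(times2))
```

Both constructions produce the jump times of a homogeneous Poisson process of rate λ on [0, ν*]; the expected number of points is λν* = 10 here.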
Example 5.8: Let X be a stationary Gaussian process with mean zero, covariance function c(τ) = E[X(t + τ) X(t)] = sin(ν* τ)/(ν* τ), and one-sided spectral density g(ν) = (1/ν*) 1_{[0,ν*]}(ν).
Figure 5.4 shows a sample of X^(λ), the covariance function c(τ) of X, and an estimate of c(τ) obtained from samples of X^(λ).
Figure 5.4. A sample of X^(λ), the covariance function c(τ) of X, and an estimate of c(τ) from 500 samples of X^(λ)
(5.14)
X^(n)(t) = Σ_{r=1}^{n} [A_r cos(ν_r · t) + B_r sin(ν_r · t)], (5.15)
becomes a version of X, where ν_r · t = Σ_{k=1}^{d'} ν_{r,k} t_k, ν_r = (ν_{r,1}, ..., ν_{r,d'}), and t = (t_1, ..., t_{d'}).
Proof: The random field X^(n) is Gaussian with mean zero for any D and partition of it.
E[X^(n)(t) X^(n)(s)] = Σ_{r,p=1}^{n} E[(A_r cos(ν_r · t) + B_r sin(ν_r · t)) (A_p cos(ν_p · s) + B_p sin(ν_p · s))]
= Σ_{r=1}^{n} γ_r (cos(ν_r · t) cos(ν_r · s) + sin(ν_r · t) sin(ν_r · s)) = Σ_{r=1}^{n} γ_r cos(ν_r · (t − s)).
Considerations as in Eq. 5.12 show that the covariance function of X^(n) approaches the covariance function of X as the partition of D is refined and D is increased to ℝ^{d'}. Because X^(n) is a Gaussian field for any D and partition of it, X^(n) becomes in the limit a version of X. This property justifies the use of samples of X^(n) as a substitute for samples of X.
We note that the second moments γ_r of the random variables A_r and B_r in Eq. 5.14 must be consistent with the properties of the spectral density of real-valued random fields (Section 3.9.4.3). •
X(t) = Σ_{r=1}^{n} [A_r cos(ν_r · t) + B_r sin(ν_r · t)],
where A_r and B_r are independent Gaussian variables with mean zero and variance E[A_r^2] = E[B_r^2] = γ_r. This representation of X and the model of this process in Eq. 5.15 have the same functional form. Figure 5.5 shows two samples of X for d' = 2, n = 6, ν_1 = (1, 2), ν_2 = (2, 1), ν_3 = (2, 2), ν_4 = −ν_1, ν_5 = −ν_2, and ν_6 = −ν_3. The left sample is for γ_r = 1, r = 1, ..., 6. The right sample corresponds to γ_1 = γ_4 = 1 and γ_r = 0.01 for r ≠ 1, 4, and has a dominant wave with frequency ν_1 = (1, 2). <>
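A sketch of the model in Eq. 5.15 for the d' = 2 example above, using the frequencies ν_1, ..., ν_6 and the variances γ_r of the right-hand sample (dominant wave at ν_1 = (1, 2)).

```python
import numpy as np

rng = np.random.default_rng(12)

# frequencies from the example: nu_4..nu_6 are the negatives of nu_1..nu_3;
# gamma_r are the variances of the corresponding amplitudes
nus = np.array([[1.0, 2.0], [2.0, 1.0], [2.0, 2.0],
                [-1.0, -2.0], [-2.0, -1.0], [-2.0, -2.0]])
gammas = np.array([1.0, 0.01, 0.01, 1.0, 0.01, 0.01])

A = rng.normal(0.0, np.sqrt(gammas))
B = rng.normal(0.0, np.sqrt(gammas))

t1, t2 = np.meshgrid(np.linspace(0, 6, 61), np.linspace(0, 6, 61))
phase = np.tensordot(nus, np.stack([t1, t2]), axes=1)   # nu_r . t on the grid
X = (np.tensordot(A, np.cos(phase), axes=1)
     + np.tensordot(B, np.sin(phase), axes=1))          # one sample of the field
print(X.shape)
```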
Figure 5.5. Two samples of the random field X for d' = 2
Note: If d' = 2, we can think of cos(ν_r · t) = cos(ν_{r,1} t_1 + ν_{r,2} t_2) as a wave with length 2π/(ν_{r,1}^2 + ν_{r,2}^2)^{1/2} traveling in the direction θ_r = tan^{−1}(ν_{r,2}/ν_{r,1}). The wave length is given by the distance between the zeros of cos(ν_r · t), that is, the lines ν_{r,1} t_1 + ν_{r,2} t_2 = (2p − 1) π/2 in ℝ^2, where p is an integer. &
N_r(D)
be two compound Poisson processes, where N_r and Y_{r,k} are independent copies of N and Y_k, respectively, and the random variables T_{r,k} have the same meaning as T_k in the definition of the compound Poisson process C.
= Σ_{k=1}^{N_1(D)} ζ(T_{1,k}) Y_{1,k} cos(T_{1,k} · t) − Σ_{l=1}^{N_2(D)} ζ(T_{2,l}) Y_{2,l} sin(T_{2,l} · t) (5.17)
where the processes C_p, p = 1, 2, are independent copies of C in Eq. 5.16. This representation suggests the definition of the approximating sequence of fields in Eq. 5.17.
The compound Poisson process C has mean zero and covariance function
where ν, ν′ ∈ D, D(T_k) = D ∩ ( ×_{j=1}^{d'} [T_{k,j}, ∞) ), T_{k,j} denotes the coordinate j of T_k, and T_1 is uniformly distributed in D. The above derivations used the fact that the Poisson points T_k are uniformly distributed in D conditional on N(D). If D = ×_{j=1}^{d'} [−ν_j*, ν_j*], then E[1_{D(T_1)}(ν) 1_{D(T_1)}(ν′)] = Π_{j=1}^{d'} (ν_j ∧ ν_j′ + ν_j*)/(2 ν_j*) since the coordinates T_{1,j} of T_1 are uniformly distributed in [−ν_j*, ν_j*].
The definition of X^(λ) in Eq. 5.17 gives E[X^(λ)(t)] = 0 and
The above expression for the covariance function of X^(λ) results from
E[X^(λ)(t) X^(λ)(s)] = E[ Σ_{k,l=1}^{N_1(D)} ζ(T_{1,k}) ζ(T_{1,l}) Y_{1,k} Y_{1,l} cos(T_{1,k} · t) cos(T_{1,l} · s) ]
+ E[ Σ_{k,l=1}^{N_2(D)} ζ(T_{2,k}) ζ(T_{2,l}) Y_{2,k} Y_{2,l} sin(T_{2,k} · t) sin(T_{2,l} · s) ]
= E[ Σ_{k=1}^{N_1(D)} ζ(T_{1,k})^2 Y_{1,k}^2 cos(T_{1,k} · t) cos(T_{1,k} · s) ]
+ E[ Σ_{k=1}^{N_2(D)} ζ(T_{2,k})^2 Y_{2,k}^2 sin(T_{2,k} · t) sin(T_{2,k} · s) ],
k=l
the fact that the random variables Tr,k• k = 1, ... , Nr(D), r = 1, 2, are independent and
uniformly distributed in D conditional on Nr (D), the properties of the random variables
Yr,ko and E[Nr(D)] =A vv, where vv = fv dv denotes the volume of D.
Hence, x<"l is a weakly stationary random field for any A > 0 and D c JR'f'.
Moreover, x<"l becomes equal to X in the second moment sense if Dis increased to JR'f'.
Also, x<Jcl converges to a Gaussian field as A --+ oo for any D. We conclude that x<"l
becomes a version of X if A --+ oo and D is increased to JR'f'. •
The models in Eqs. 5.15 and 5.17 have features and limitations similar to those of the models in Eqs. 5.12 and 5.13.
Note: The generation of Poisson points in a subset of ℝ^{d'} is discussed in Section 5.3.3.3. The methods described in Section 5.2 can be used to generate samples of the random variable (Y_{r,1}, ..., Y_{r,N_r(D)}). &
s(ν) = 1/(2π √(1 − ρ^2)) exp[ −(ν_1^2 − 2 ρ ν_1 ν_2 + ν_2^2)/(2 (1 − ρ^2)) ], ν ∈ ℝ^2,
c(τ) = E[X(t) X(t + τ)] = exp[ −(τ_1^2 + 2 ρ τ_1 τ_2 + τ_2^2)/2 ], τ ∈ ℝ^2,
where |ρ| < 1. The spectral density of X is shown in Fig. 5.6 for ρ = 0.7.
Figure 5.7 shows four samples of X for t ∈ S = [0, 10] × [0, 10], λ = 0.5, Y_1 a Gaussian random variable with mean zero and variance one, D = [−3, 3] × [−3, 3], and ζ(ν)^2 = s(ν)/λ. The random field X^(λ) has approximately the same second moment properties as X for any λ and ζ(ν) satisfying the above condition. The model X^(λ) in Eq. 5.17 has been used to generate the samples shown in this figure. The average number of Poisson points in D is λ v_D = 18, where v_D denotes the volume of D. <>
Figure 5.7. Four samples of X^(λ)
Let [−ν*, ν*], 0 < ν* < ∞, be the frequency band of a deterministic function x(t), t ∈ ℝ. Then ([29], Sections 5-4, 5-5)
x(t) = lim_{n→∞} Σ_{k=−n}^{n} x(k t*) α_k(t; t*), where α_k(t; t*) = sin(π (t/t* − k)) / (π (t/t* − k)) and t* = π/ν*. (5.18)
Note: It is sufficient to know the values x(k t*), k ∈ ℤ, of x to reconstruct the entire signal. The function α_k is 1 if t/t* = k, is zero if t/t* is an integer different from k, and decreases rapidly to zero as |t/t* − k| increases. &
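Eq. 5.18 can be checked on a deterministic band-limited signal; the function below and the truncation range of the series are illustrative choices.

```python
import numpy as np

def alpha_k(t, k, tstar):
    # alpha_k(t; t*) = sin(pi (t/t* - k)) / (pi (t/t* - k)), equal to 1 at t/t* = k
    return np.sinc(t / tstar - k)    # numpy's sinc(u) is sin(pi u)/(pi u)

nu_star = np.pi                      # frequency band [-pi, pi], so t* = pi/nu* = 1
tstar = np.pi / nu_star
x = lambda t: np.cos(2.0 * t) + 0.5 * np.sin(1.0 * t)   # frequencies inside the band

t = 0.37
ks = np.arange(-2000, 2001)
approx = np.sum(x(ks * tstar) * alpha_k(t, ks, tstar))
print(approx, x(t))
```

The truncated series already reproduces x(t) to a few decimal places; the exact identity requires all k ∈ ℤ, which is why Eq. 5.18 cannot be used directly for simulation.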
X^(n)(t) = Σ_{k=n_t−n}^{n_t+n+1} X(k t*) α_k(t; t*), (5.19)
becomes a version of X as n → ∞, where n_t denotes the integer part of t/t* and n ≥ 1 is an integer.
Proof: We call nodes the times k t* at which the process is sampled, where t* = π/ν* and k ∈ ℤ. The integer n ≥ 1 is half the size of the window centered on the current cell [n_t t*, (n_t + 1) t*], that is, the cell containing the time argument t (Fig. 5.8). The model X^(n) in Eq. 5.19 depends linearly on the values of X at 2 (n + 1) nodes and coincides with X at the nodes included in its window.
Figure 5.8. The current cell [n_t t*, (n_t + 1) t*] and its window of nodes from (n_t − n) t* to (n_t + n + 1) t*
for almost all ω's since the samples of X consist of a superposition of harmonics with frequencies in the range [−ν*, ν*]. The representation lim_{n→∞} X^(n) cannot be used to generate samples of X because it depends on an infinite number of random variables, the random variables X(k t*), k ∈ ℤ.
Consider the model in Eq. 5.19. This stochastic process is Gaussian with mean zero and covariance function
E[X^(n)(t) X^(n)(s)] = Σ_{k=n_t−n}^{n_t+n+1} α_k(t; t*) ( Σ_{l=n_s−n}^{n_s+n+1} c((k − l) t*) α_l(s; t*) ),
where c denotes the covariance function of X. Hence, X^(n) is not stationary for finite values of n because E[X^(n)(t) X^(n)(s)] depends explicitly on the times t and s. However, E[X^(n)(t) X^(n)(s)] converges to c(t − s) as n → ∞. This follows from the sampling theorem in Eq. 5.18 applied twice to the covariance function c(·) in the above equation. We can apply this theorem to the covariance function of X because the frequency band of its Fourier transform is [−ν*, ν*] by hypothesis. Therefore, X^(n) becomes a version of X as the window size is increased to infinity. This property justifies the use of samples of X^(n) as a substitute for samples of X provided that n is sufficiently large. Numerical results show that X^(n) with n ≈ 10 approximates X satisfactorily [78, 88]. •
The model X^(n) in Eq. 5.19 can be used to develop an efficient Monte Carlo simulation algorithm for generating samples of stationary Gaussian processes.
Monte Carlo algorithm. Suppose that a sample X^(n)(·, ω) of X^(n) has been generated up to a time (n_t + 1) t*. The extension of this sample in the next cell [(n_t + 1) t*, (n_t + 2) t*] involves two steps:
1. Generate a sample of the conditional random variable
X̂((n_t + n + 2) t*) = X((n_t + n + 2) t*) | [X((n_t + n + 1) t*) = X((n_t + n + 1) t*, ω), X((n_t + n) t*) = X((n_t + n) t*, ω), ..., X((n_t − n) t*) = X((n_t − n) t*, ω)]. (5.20)
2. Calculate the corresponding sample of X^(n) in [(n_t + 1) t*, (n_t + 2) t*] from Eq. 5.19.
Note: The real-valued random variable X̂((n_t + n + 2) t*) is Gaussian with known mean and variance (Section 2.11.5). Because X is stationary and we consider in Eq. 5.20 only values of X at the past 2n + 2 nodes, the second moment properties of X̂((n_t + n + 2) t*) do not change in time so that they have to be calculated only once. The algorithms in Section 5.2.1 can be used to generate a sample of X̂((n_t + n + 2) t*).
The extension of the sample of X^(n) beyond the time (n_t + 1) t* should use a sample of the conditional variable
X((n_t + n + 2) t*) | [X((n_t + n + 1) t*) = X((n_t + n + 1) t*, ω), X((n_t + n) t*) = X((n_t + n) t*, ω), ...],
which accounts for the entire time history, rather than a sample of X̂((n_t + n + 2) t*). However, this exact formulation is impractical since its properties have to be recalculated at each node and depend on a vector of increasing length as time progresses. Moreover, the improved accuracy of this formulation may be insignificant because of the reduced influence of values of X at nodes far away from the cell containing the current time. Numerical results support this statement for processes that do not exhibit long range dependence. A process is said to have long range dependence if its covariance function is such that c(τ) ∼ τ^{−β} for τ → ∞, where β ∈ (0, 1) is a constant [78, 88].
The algorithm in Eq. 5.20 can be applied to narrowband processes, that is, processes whose spectral density is zero outside a small vicinity of a central frequency ν_0 > 0. However, the Monte Carlo simulation algorithm based on Eq. 5.20 can be inefficient if the central frequency ν_0 is large. A modified version of this algorithm developed for narrowband processes can be found in [78]. &
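A sketch of the idea behind Eq. 5.20: for a stationary process, the regression coefficients of the next node on a fixed window of past nodes, and the residual variance, are computed once and reused at every step (cf. the note above). The covariance function below is an illustrative choice, and the fixed window of m past nodes plays the role of the 2n + 2 conditioning nodes.

```python
import numpy as np

rng = np.random.default_rng(10)

def conditional_coeffs(c, m):
    # next node given the past m nodes: Gaussian with time-invariant regression
    # weights w and residual variance var (computed once for a stationary c)
    C = np.array([[c(abs(i - j)) for j in range(m)] for i in range(m)])
    r = np.array([c(k + 1) for k in range(m)])   # cov(next, node at lag k+1)
    w = np.linalg.solve(C, r)
    var = c(0) - r @ w
    return w, var

c = lambda tau: np.exp(-0.5 * tau)    # illustrative covariance, unit spacing t* = 1
m = 10
w, var = conditional_coeffs(c, m)

# initialize with an exact joint sample of m nodes, then extend node by node
cov0 = [[c(abs(i - j)) for j in range(m)] for i in range(m)]
x = list(rng.multivariate_normal(np.zeros(m), cov0))
for _ in range(2000):
    past = np.array(x[-m:][::-1])     # most recent node first, matching r
    x.append(w @ past + np.sqrt(var) * rng.normal())
x = np.array(x)
print(x.var())
```

Each extension step costs O(m) once the weights are known, which is what makes the windowed formulation practical compared with conditioning on the entire history.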
Note: The process X^(n) with coordinates in Eq. 5.21 is Gaussian with mean zero for any values of n_i. Straightforward calculations show that X^(n) becomes a version of X as min_{1≤i≤d} n_i → ∞ so that we can use samples of X^(n) as a substitute for samples of X provided that the n_i are sufficiently large. &
where Y_i and Y are independent band-limited Gaussian white noise processes with mean zero and spectral densities s_{Y_i}(ν) = (1/(2 ν̄_i)) 1_{[−ν̄_i, ν̄_i]}(ν) and s_Y(ν) = (1/(2 ν̄_0)) 1_{[−ν̄_0, ν̄_0]}(ν), respectively. The spectral density s of X has the entries s_{i,i}(ν) = (1 − ρ) s_{Y_i}(ν) + ρ s_Y(ν), i = 1, 2, and s_{1,2}(ν) = s_{2,1}(ν) = ρ s_Y(ν). Figure 5.9 shows the exact and approximate covariance function E[X_1(t + τ) X_1(t)] for t at a cell midpoint, lag times τ ≥ 0, ν̄_1 = 2π, ν̄_2 = 0.4π, and ν̄_0 = 0.2π so that ν_1* = 2π, ν_2* = 0.4π, t_1* = 0.5, and t_2* = 2.5. The approximate covariance functions for ρ = 0 and ρ = 0.5 corresponding to n_1 = 10 nearly coincide with the exact covariance functions of X_1 for these values of ρ. <>
Note: If the spacing t_i* = t* is the same for all the coordinates of X^(n) in Eq. 5.21, we have n_{i,t} = n_t. Suppose that n_i = n and n_{i,t} = n_t, and that a sample of X has been generated up to a time (n_t + 1) t*. The extension of this sample to the cell [(n_t + 1) t*, (n_t + 2) t*] requires a sample of X̂((n_t + n + 2) t*).
The simulation algorithm is less simple if the nodes in the representation of the coordinates of X^(n) in Eq. 5.21 are not equally spaced. Details on a simulation algorithm for this case can be found in [88]. &
Figure 5.9. Exact and approximate covariance functions of X_1 versus time lag τ
X^(n)(t) = Σ_{k_1=n_{t,1}−n_1}^{n_{t,1}+n_1+1} ··· Σ_{k_{d'}=n_{t,d'}−n_{d'}}^{n_{t,d'}+n_{d'}+1} X(k_1 t_1*, ..., k_{d'} t_{d'}*) Π_{a=1}^{d'} α_{k_a}(t_a; t_a*) (5.22)
becomes a version of X as n_a → ∞, a = 1, ..., d', where t_a* = π/ν_a*, n_{t,a} is the integer part of t_a/t_a*, n_a ≥ 1 are integers, and n = (n_1, ..., n_{d'}).
Note: The random field X^(n) is Gaussian with mean zero for any window widths n_a. The field is not homogeneous since E[X^(n)(t) X^(n)(s)] depends explicitly on t and s. The covariance function of X^(n) approaches the covariance function of X as n_a → ∞, a = 1, ..., d', so that X^(n) becomes a version of X for a window with infinite width. This property justifies the use of samples of X^(n) as a substitute for samples of X.
The representation of X in Eq. 5.22 is for t ∈ ×_{a=1}^{d'} [n_{t,a} t_a*, (n_{t,a} + 1) t_a*] and depends on values of X at nodes around this cell. Suppose that a sample of X^(n) has been generated for values of t in this cell. To extend this sample to a neighboring cell, it is necessary to generate new values of X conditional on the previously generated values of this field. An algorithm for generating samples of X based on the model in Eq. 5.22 is given in [88]. &
E[X^(n)(t) X^(n)(s)] = Σ_{k_1,l_1} Σ_{k_2,l_2} c((k_1 − l_1) t_1*, (k_2 − l_2) t_2*) Π_{u=1}^{2} α_{k_u}(t_u; t_u*) Π_{v=1}^{2} α_{l_v}(s_v; t_v*),
so that X^(n) is not stationary for a finite window width.
Numerical results in [88] show that the covariance function of X^(n) rapidly approaches the covariance function of X. Satisfactory approximations are reported in [88] for n_1 = n_2 ≈ 10. <>
where Y ∈ ℝ^{d'} is a stationary Gaussian process with mean zero and a prescribed covariance function.
Note: It is assumed that the (d, d) and (d, d') matrices a and b are such that Eq. 5.23 has a unique solution. If Y is a Gaussian white noise, then X is called a filtered Gaussian process.
The solution of Eq. 5.23 is
where U(t, ν) = ∫_0^t θ(t, s) e^{√−1 ν s} ds, θ(t, s) denotes the Green function for Eq. 5.23, Y(t) = ∫_ℝ e^{√−1 ν t} dZ(ν), and Z is a process with stationary, orthogonal increments (Section 3.9.4). The second moment properties of X in Eq. 5.23 can be calculated by the methods in Section 7.2.1.2. &
Note: The determination of the coefficients a and b of Eq. 5.23 is difficult in a general setting. Relatively simple results are available in special cases, for example, when Y is a stationary Gaussian process with mean zero and the matrices a, b are constant ([30], Theorem 1, p. 106).
A finite difference approximation of Eq. 5.23 can be used to obtain samples of X from samples of Y (Section 4.7.3). For example, we can use the forward finite difference scheme
X(t + Δt) = [i + a(t) Δt] X(t) + b(t) ΔY(t),
where i denotes the (d, d) identity matrix, Δt > 0 is the time step, and ΔY(t) = Y(t + Δt) − Y(t). &
(5.25)
becomes a version of the Gaussian process X as n_i → ∞, i = 1, ..., d.
Proof: The convergence of the Fourier series of c_{i,j} to c_{i,j} in R can be proved under various conditions. For example, if c_{i,j} is continuous and its first order partial derivatives are bounded in R, then the Fourier series of c_{i,j} converges to c_{i,j} at every interior point of R in the vicinity of which the mixed derivative ∂^2 c_{i,j}/∂t ∂s exists. This property holds in the special case in which c_{i,j} and its partial derivatives ∂c_{i,j}/∂t, ∂c_{i,j}/∂s, and ∂^2 c_{i,j}/∂t ∂s are continuous in R ([183], Chapter 7).
The Fourier series of c_{i,j} in R is
c_{i,j}(t, s) = Σ_{k,l=0}^{∞} λ_{k,l} [ a_{k,l}^{(i,j)} cos(ν_k t) cos(ν_l s) + b_{k,l}^{(i,j)} cos(ν_k t) sin(ν_l s) + c_{k,l}^{(i,j)} sin(ν_k t) cos(ν_l s) + d_{k,l}^{(i,j)} sin(ν_k t) sin(ν_l s) ],
with, for example,
d_{k,l}^{(i,j)} = (4/τ^2) ∫_0^τ du ∫_0^τ dv c_{i,j}(u, v) sin(ν_k u) sin(ν_l v),
for i, j = 1, ..., d, k, l = 0, 1, ..., ν_0 = 0, λ_{k,l} = 1/4 for k = l = 0, λ_{k,l} = 1/2 for k > 0, l = 0 or k = 0, l > 0, and λ_{k,l} = 1 for k, l > 0 ([183], Section 7.2).
The covariance functions c_{i,j}^{(n_i,n_j)}(t, s) = E[X_i^{(n_i)}(t) X_j^{(n_j)}(s)] represent partial sums of the Fourier series of c_{i,j}(t, s) since a_{k,l}^{(i,j)} = E[A_{i,k} A_{j,l}], b_{k,l}^{(i,j)} = E[A_{i,k} B_{j,l}], c_{k,l}^{(i,j)} = E[B_{i,k} A_{j,l}], and d_{k,l}^{(i,j)} = E[B_{i,k} B_{j,l}]. Hence, c_{i,j}^{(n_i,n_j)} converges to c_{i,j} in R as n_i, n_j → ∞ because the Fourier series of c_{i,j} is convergent in R by hypothesis. Hence, X^(n) becomes equal to X in the second moment sense as n_1, ..., n_d → ∞. Because X^(n) and X are Gaussian processes, X^(n) becomes a version of X as more and more terms are used in Eq. 5.25. This property justifies the use of samples of X^(n) as a substitute for samples of X if all n_i's are sufficiently large.
We also note that the coefficients A_{i,k}, B_{i,k} exist in m.s. since E[|A_{i,k} B_{j,l}|] ≤ (E[A_{i,k}^2] E[B_{j,l}^2])^{1/2} and, for example,
E[A_{i,k}^2] = (4/τ^2) ∫_0^τ du ∫_0^τ dv c_{i,i}(u, v) cos(ν_k u) cos(ν_k v) ≤ (4/τ^2) ∫_0^τ du ∫_0^τ dv |c_{i,i}(u, v)|,
and the last integral is bounded. •
Figure 5.10. Time evolution of exact and approximate variance functions, c(t, t) and c^{(n,n)}(t, t), for x_0 = 0, ρ = 0.6, and τ = 3
exact and approximate variance functions, c(t, t) and c^{(n,n)}(t, t), for x_0 = 0, ρ = 0.6, τ = 3, and several values of n. The approximate variance functions c^{(n,n)}(t, t) correspond to a function c* defined on [−τ, τ] × [−τ, τ] and extended periodically to the entire plane ℝ^2. This function is such that c*(t, s) = c(t, s) for (t, s) ∈ R = [0, τ] × [0, τ], c*(t, s) = c(−t, −s) for (t, s) ∈ [−τ, 0] × [−τ, 0], and c*(t, s) = 0 for (t, s) ∈ [0, τ] × [−τ, 0] ∪ [−τ, 0] × [0, τ]. The approximate covariance function c^{(n,n)}(τ, τ) approaches c(τ, τ)/2 as n increases since the periodic extension c* of c has a jump discontinuity on the boundary of [−τ, τ] × [−τ, τ]. The difference between c^{(n,n)}(τ, τ) and c(τ, τ) can be eliminated by using a periodic function c** that coincides with c in [0, τ] × [0, τ] and is continuous on the boundaries of [0, τ] × [0, τ]. <>
Note: The Fourier series of the covariance function c(t, s) = r(t, s) − μ(t) μ(s) of X is convergent and converges to c(t, s) since this function is continuous in [0, τ] × [0, τ] and its right and left partial derivatives are continuous almost everywhere in this set. •
A_0 = (1/(2π)) ∫_R X(t) dt,
c(t, s) = a_0 + Σ_{k_1,k_2,k_3,k_4=1}^{∞} a_{k_1,k_2,k_3,k_4} g_1(k_1 t_1) g_2(k_2 t_2) g_3(k_3 s_1) g_4(k_4 s_2),
with the appropriate expressions for the functions g_r. The covariance function of X^(n) is equal to the partial sum of the Fourier series of c. If the Fourier series of the covariance function of X is convergent in R*, then the covariance function of X^(n) converges to the covariance function of X as n → ∞. Because X and X^(n) are Gaussian, X^(n) becomes a version of X as n → ∞. Conditions for the convergence of Fourier series can be found, for example, in [43] (Section 11.5) and [183] (Chapters 1 and 7). •
where φ(·, ·; ρ) denotes the density of an ℝ²-valued Gaussian variable with mean zero, variance one, and correlation coefficient ρ. If d = 1, the finite dimensional densities of X are

f_{t_1,...,t_n}(x_1, ..., x_n) = φ(y_1, ..., y_n; ρ) ∏_{i=1}^n f(x_i)/φ(y_i),   (5.29)

where Y = (Y(t_1), ..., Y(t_n)) is a Gaussian vector with mean zero and covariance matrix ρ = {ρ(t_i − t_j) = E[Y(t_i) Y(t_j)]}, x_i ∈ ℝ, t_i ∈ ℝ^{d'}, i = 1, ..., n, φ(y_1, ..., y_n; ρ) denotes the density of Y, f = F' is the marginal density of X, and y_i = Φ^{−1}(F(x_i)). Similar results can be found for ℝ^d-valued random functions with d > 1.
The translation processes and fields are stationary, can follow any marginal distribution, and their correlation functions are completely defined by the marginal distributions F_i and the covariance functions ρ_{i,j} of Y. Generally, the marginal distributions F_i and the correlation functions ξ_{i,j} are available in applications; for example, they may be estimated from records of X. Hence, the covariance functions ρ_{i,j} defining the underlying Gaussian function Y have to be calculated from Eq. 5.28. It turns out that the functions (F_i, ξ_{i,j}) cannot be arbitrary. They must be such that the solutions ρ_{i,j} of Eq. 5.28 are covariance functions ([79], Section 3.1.1). If this condition is not satisfied, there is no translation model with the specified properties (F_i, ξ_{i,j}).
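As an illustration, the translation construction X(t) = F^{−1}(Φ(Y(t))) can be sketched in a few lines. The AR(1) Gaussian driver and the unit-mean exponential marginal F used below are illustrative assumptions, not the book's example.

```python
import math
import random

def sample_translation(n, dt=0.1, seed=1):
    """One sample path of a translation process X = F^{-1}(Phi(Y)), where Y is
    a stationary Gaussian sequence (discretized Ornstein-Uhlenbeck / AR(1))
    and F is the unit-mean exponential cdf (both are illustrative choices)."""
    rng = random.Random(seed)
    r = math.exp(-dt)                  # one-step correlation of Y
    y = rng.gauss(0.0, 1.0)            # stationary start, Y(0) ~ N(0, 1)
    xs = []
    for _ in range(n):
        u = 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))   # Phi(Y(t))
        u = min(max(u, 1e-12), 1.0 - 1e-12)              # guard the tails
        xs.append(-math.log(1.0 - u))                    # F^{-1}: Exp(1) marginal
        y = r * y + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)
    return xs
```

The resulting samples have an exponential marginal distribution by construction, while the correlation structure is inherited (in distorted form, per Eq. 5.28) from Y.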
Example 5.16: Let Y(t) = (a_1 Y_1(t_1) + a_2 Y_2(t_2)) (a_1² + a_2²)^{−1/2} be a real-valued Gaussian field, where t = (t_1, t_2) ∈ ℝ², a_k are constants, and Y_k are stationary Gaussian fields with mean zero, variance one, and spectral densities that are zero outside the frequency ranges [−ν_k*, ν_k*], k = 1, 2, and take constant non-zero values in these ranges.

Figure 5.12 shows a sample of Y(t) for t ∈ [0, 5] × [0, 5] and the translation image of this sample defined by the memoryless transformation X(t) = Y(t)³. Results are for a_1 = a_2 = 1, ν_1* = 10, and ν_2* = 5. The samples of Y_1 and Y_2 have been obtained by the algorithm in Eq. 5.22 with n_1 = n_2 = 5. <>
The class of translation processes in Eq. 5.27 can be extended to the class of mixtures of translation processes (Section 3.6.7). The non-Gaussian processes in this class can match not only correlation functions and marginal distributions, as translation processes do, but also higher order correlation functions. This property is illustrated by the following example.
for x > −e/√(e² − e). The first and second order correlation functions of the processes X_k are

g_k(α) = E[X_k(t) X_k(t + α)] = (e^{ρ_k(α)} − 1)/(e − 1),

ξ_k(α, σ) = E[X_k(t) X_k(t + α) X_k(t + σ)] = e^{ρ_k(α) + ρ_k(σ) + ρ_k(α − σ)}/(e − 1)^{3/2}.   (5.31)
Figure 5.13. Five samples of X(t) corresponding to p_1 = 0, 1/4, 1/2, 3/4, and 1
Note: The numerical methods in Section 4.7.3 can be applied to the differential equation defining X to calculate samples of X from samples of Y. This section considers only the first step of the above Monte Carlo algorithm, that is, the generation of samples of Y, assumed to be a Gaussian, Poisson, or Lévy noise. ▲
Note: The time increment Δt should correspond to the time step used to integrate the differential equation for X. The algorithm in Eq. 5.32 is based on properties of the Brownian motion process. ▲
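The recurrence underlying Eq. 5.32, B(t + Δt) = B(t) + √Δt G with G ~ N(0, 1) independent across steps, can be sketched as follows (the step size and seed are arbitrary choices):

```python
import math
import random

def brownian_path(n_steps, dt, seed=0):
    """Sample of a Brownian motion on [0, n_steps*dt] generated from
    independent N(0, dt) increments."""
    rng = random.Random(seed)
    b = [0.0]                                   # B(0) = 0
    for _ in range(n_steps):
        b.append(b[-1] + math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return b
```

The increments are independent and Gaussian with variance proportional to the time step, so B(t) has variance t, matching the defining properties of Brownian motion.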
Note: The number of jumps of C in (0, τ] is given by a Poisson random variable N(λτ) with intensity λτ (Example 5.3). The jump times of C are uniformly distributed in (0, τ] conditional on the number of jumps. The generation of N(λτ) independent samples of Y_1 can be based on the methods in Sections 5.2.1 and 5.2.2. Additional information on the generation of samples of C can be found in Sections 5.3.1.1 and 5.3.1.2.

The above approach can be extended to ℝ^{d'}-valued compound Poisson processes C(t) = Σ_{k=1}^{N(t)} Y_k, where Y_k are iid ℝ^{d'}-valued random variables. ▲
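A minimal sketch of the compound Poisson sampler described in the note. The Poisson count is generated here from exponential inter-arrival times, and the distribution of the Y_k is supplied by the caller; both are implementation choices, not the book's code.

```python
import random

def compound_poisson_path(lam, tau, sample_y, rng):
    """Jump times and jump values of C(t) = sum_{k<=N(t)} Y_k on (0, tau].
    The jump count is Poisson(lam*tau); conditional on the count, the jump
    times are iid uniform on (0, tau], as stated in the note."""
    # Poisson count via exponential inter-arrival times
    n, t = 0, rng.expovariate(lam)
    while t <= tau:
        n += 1
        t += rng.expovariate(lam)
    times = sorted(rng.uniform(0.0, tau) for _ in range(n))
    jumps = [sample_y(rng) for _ in range(n)]
    return times, jumps
```

A sample of C at any time t is then the sum of the jump values with jump times not exceeding t.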
Lévy white noise. The general form of a Lévy process L is given by the Lévy decomposition in Section 3.14.2. The Lévy process includes the Brownian motion and the compound Poisson processes, so that the Gaussian and Poisson white noise processes are special cases of the Lévy noise. We have already generated samples of L using the representation of this process in Section 3.14.2. We do not restate this Monte Carlo algorithm. Instead, we give a Monte Carlo algorithm for generating samples of an α-stable process L_α, that is, a Lévy process whose increments ΔL_α(t) = L_α(t + Δt) − L_α(t) are independent S_α((Δt)^{1/α}, 0, 0) random variables (Example 5.1).
L_α(t + Δt) = L_α(t) + ΔL_α(t),   ΔL_α(t) ~ S_α((Δt)^{1/α}, 0, 0),   (5.33)
Example 5.18: Let L_α(t), t ∈ [0, τ], be an α-stable Lévy process. Samples of this process for α = 1.0, 1.5, 1.7, and 1.99 have been generated in Section 3.14.2 based on a representation of L_α as a sum of compound Poisson processes with small and large jumps. Figure 5.16 shows samples of L_α for the same values of
α generated by the algorithm in Eq. 5.33. The size of the jumps of L_α decreases with α. The sample of L_α for α = 1.99 resembles the sample of a Brownian motion process. <>
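The increments S_α((Δt)^{1/α}, 0, 0) in Eq. 5.33 can be generated by the Chambers–Mallows–Stuck method for symmetric α-stable variables; this sampler is an assumed standard device, not stated in the text.

```python
import math
import random

def stable_increment(alpha, dt, rng):
    """One increment of a symmetric alpha-stable Levy process, distributed
    S_alpha((dt)^{1/alpha}, 0, 0), by the Chambers-Mallows-Stuck formula."""
    v = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
    w = rng.expovariate(1.0)
    if abs(alpha - 1.0) < 1e-12:
        z = math.tan(v)                          # Cauchy case, alpha = 1
    else:
        z = (math.sin(alpha * v) / math.cos(v) ** (1.0 / alpha)
             * (math.cos(v - alpha * v) / w) ** ((1.0 - alpha) / alpha))
    return dt ** (1.0 / alpha) * z

def stable_path(alpha, tau, n_steps, seed=0):
    """Sample of L_alpha on [0, tau] from the recurrence of Eq. 5.33."""
    rng = random.Random(seed)
    dt = tau / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + stable_increment(alpha, dt, rng))
    return path
```

For α = 2 the formula reduces to a Gaussian variable with variance 2, so L_2(t) has variance 2t, consistent with the Brownian-like samples observed for α near 2.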
process that is equal to B in the second moment sense is the process M* with increments dM*(t) = B(t) dB(t)/√t, t > 0. Figure 5.17 shows several samples of B and the corresponding samples of M*. The sample properties of these two processes differ significantly although their increments have the same first two moments and are not correlated.

It is common in the engineering literature to define white noise in the second moment sense. This example demonstrates once more the need for a precise definition of the white noise process that goes beyond second moment properties. <>
Proof: Consider the martingale M(t) = B(t)² − t. Denote by dM*(t) = B(t) dB(t)/√t, t > 0, the increments dM(t) = 2 B(t) dB(t) of M scaled by 2√t. The first two moments of the increments of M* are E[dM*(t)] = 0 and E[dM*(t) dM*(s)] = δ(t − s) dt, so that M* and B are equal in the second moment sense. However, while the increments of B are independent, those of M* are not.

An algorithm for generating samples of B was discussed previously in this section. The recurrence formula M*(t + dt) = M*(t) + dM*(t) can also be used to produce samples of M*. •
k!/(k_1! ⋯ k_n!) ∏_{i=1}^n (v_{A_i}/v_B)^{k_i},   (5.34)

where v_A denotes the volume of the set A.
The special case of Eq. 5.34 for n = 2 and k_2 = k − k_1 gives the probability that k_1 ≥ 0 out of k independent uniformly distributed points in B fall in A_1, that is, the binomial distribution. If the number of points k in B and the probability p_{A_1} that a uniformly distributed point in B falls in A_1 tend to infinity and zero, respectively, such that k p_{A_1} is a constant λ v_{A_1}, the binomial distribution converges to a Poisson distribution with intensity λ.
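The binomial-to-Poisson limit can be checked numerically by comparing the two probability mass functions at a large n and small p with n p fixed:

```python
import math

def binomial_pmf(k, n, p):
    """P(k successes in n trials), the n = 2 special case of Eq. 5.34."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def poisson_pmf(k, mean):
    """Poisson limit of the binomial pmf as n -> infinity, p -> 0, n*p fixed."""
    return math.exp(-mean) * mean ** k / math.factorial(k)
```

With n = 10^5 and n p = 3, the two pmfs agree to within about the order of p, illustrating the convergence stated above.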
Example 5.20: Let N be a Poisson process with intensity λ > 0 defined on ℝ² and B = [0, 1] × [0, 1] be a Borel set in this space. Consider also a generalized Poisson process Ñ with intensity λ̃(x) = 6 λ x_1² x_2 defined on the same space. The processes N and Ñ have the same average number of points in B. Figure 5.18 shows samples of N and Ñ in B, respectively, for λ = 100. The points of N are uniformly spread in B while the points of Ñ are concentrated in the upper right corner of B where λ̃ takes on larger values. <>
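A sketch of a sampler for the generalized Poisson process Ñ of Example 5.20, using thinning of a homogeneous process at the maximum of the intensity. The thinning construction is an assumed standard device, not the book's algorithm, and the intensity 6 λ x_1² x_2 is taken from the example above.

```python
import random

def inhomogeneous_poisson(lam, rng):
    """Points of a Poisson process on B = [0,1]^2 with intensity
    6*lam*x1^2*x2 (which integrates to lam over B), generated by thinning
    a homogeneous process of rate 6*lam, the maximum of the intensity."""
    lam_max = 6.0 * lam
    # Poisson(lam_max) candidate count on the unit square (area = 1)
    n, t = 0, rng.expovariate(lam_max)
    while t <= 1.0:
        n += 1
        t += rng.expovariate(lam_max)
    pts = []
    for _ in range(n):
        x1, x2 = rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)
        if rng.uniform(0.0, 1.0) <= x1 * x1 * x2:   # accept w.p. intensity/lam_max
            pts.append((x1, x2))
    return pts
```

The accepted points have mean count λ and cluster toward the corner (1, 1), reproducing the behavior of Ñ in Fig. 5.18.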
Figure 5.18. Samples of N and Ñ in B
that is, the polytope k consists of the points in ℝ^{d'} that are closest to T_k.
Figure 5.19 shows the Voronoi tessellations, for the homogeneous and inhomogeneous cases, corresponding to the samples of N and Ñ in Fig. 5.18. The sample of the Voronoi tessellation obtained from the sample of Ñ shows that it is possible to generate microstructures with grains of a wide range of sizes and shapes. <>
driven by a Brownian motion B ∈ ℝ^{d'}, where a and b are such that this equation has a unique solution (Section 4.7.1.1). Our objective is to estimate properties of the solution X of this equation.

Let θ and X(0) be the Green function and the initial state, respectively, for Eq. 5.38. The solution of this equation is X(t) = θ(t, 0) X(0) + M(t), where

M(t) = ∫_0^t θ(t, s) b(s) dB(s),   t ≥ 0,   (5.39)
Note: We only give here a heuristic justification of the time change theorem in a simplified setting. Let (Ω, F, P) be a probability space with the filtration F_t = σ(B(s) : 0 ≤ s ≤ t), t ≥ 0, where B is a Brownian motion process. Let M(t) = ∫_0^t A(s) dB(s) and assume that A is an F_t-adapted real-valued process that is independent of B and has the property lim_{t→∞} ∫_0^t A(s)² ds = ∞ a.s. Conditional on A, the increments A(s) dB(s) of M are independent and follow a Gaussian distribution with mean zero and variance A(s)² ds. These increments have the same distribution as those of a Brownian motion running in a different clock T defined locally by dT(s) = A(s)² ds, so that T(t) = ∫_0^t A(s)² ds and lim_{t→∞} T(t) = ∞ a.s. by hypothesis. If A(·) is a deterministic function a(·), the new clock T(t) = τ(t) = ∫_0^t a(s)² ds is a deterministic increasing function.
We note that the time change in Eq. 5.40 modifies the diffusion coefficient since the martingale in Eq. 5.39 giving the noise contribution to the solution of Eq. 5.38 becomes a Brownian motion in the new clock. We will see that the change of measure considered in the following section causes a change of the drift coefficient. ▲
Note: The following examples show that this Monte Carlo algorithm is preferable to algorithms based on a direct integration of Eq. 5.38 if the relationship between the new and original clocks is deterministic. If this condition is not satisfied, the use of the time change theorem in Monte Carlo simulation seems to be impractical. ▲
θ_{11}(t) = cos(ν̄ t) + ζ ν sin(ν̄ t)/ν̄, θ_{12}(t) = sin(ν̄ t)/ν̄, and ν̄ = ν √(1 − ζ²). The
denotes the new time. The mapping from the original to the new times is deter-
ministic in this case.
Figure 5.20 shows the relationship between the original and the new times
Figure 5.20. The time change and the evolution of the exact and the estimated
standard deviation of X
and the evolution of the exact and estimated standard deviation of X for ν = 1, ζ = 0.05, and β = 0.2. The estimated standard deviation of X is based on 200 samples of this process generated in the new clock. <>
Note: The state X = (X_1 = X, X_2 = Ẋ) is the solution of Eq. 5.38 with d = 2, d' = 1, (a X)_1 = X_2, (a X)_2 = −ν² X_1 − 2 ζ ν X_2, b_1 = 0, and b_2 = β. The process X represents the displacement of a linear oscillator with damping ratio ζ and natural frequency ν subjected to white noise.

The process M is a square integrable, continuous martingale. The new clock, τ(t) = [M, M](t) = β² ∫_0^t e^{−2 ζ ν s} θ_{12}(t − s)² ds, is deterministic and approaches infinity as t → ∞. The time change theorem states that M(t) = B̃(τ(t)) a.s. The samples of X used to estimate the standard deviation of this process were obtained in two steps. First, samples of B̃ have been generated. Second, the corresponding samples of X have been calculated from the samples of B̃ and the relationship between M and B̃. The method is more efficient than the classical Monte Carlo algorithms, which produce samples of X by integrating the differential equation for this process driven by samples of W. ▲
4 ∫_0^t B(s)² ds, converges to infinity a.s. as t → ∞ so that the time change theorem can be applied. Because the new time T is random, the mapping from B to M is complex and its use in Monte Carlo simulation may be impractical. Figure 5.21 shows 20 samples of T obtained from samples of B in the original clock and the
800.-----------~--~--.
,...., 1
New clock, 15 B(s)=M(T- (s))
700
T(t)=[M,M](t)
600
500
400
300
200
-15
-20 L__----~----~-------'
0 10 20 30
Figure 5.21. Samples of the new clock T and the Brownian motion B̃

corresponding samples of B̃(s) = M(T^{−1}(s)). The process B̃ is a Brownian motion in the new clock according to the time change theorem. <>
Note: Because M(t) = 2 ∫_0^t B(s) dB(s), we have (Section 4.5.3)

T(t) = [M, M](t) = 4 ∫_0^t B(s)² ds,

so that T is a random process with increasing samples. Define the sequence of stopping times T_1 = inf{t > 1 : B(t) = 0}, T_2 = inf{t > T_1 + 1 : B(t) = 0}, ..., and note that (1) ∫_0^∞ B(s)² ds = X_1 + X_2 + ⋯, where X_1 = ∫_0^{T_1} B(s)² ds, X_2 = ∫_{T_1}^{T_2} B(s)² ds, ..., (2) the random variables X_1, X_2, ... are independent and identically distributed, and (3) the expectation of X_1 is

E[X_1] = E[∫_0^{T_1} B(s)² ds] ≥ E[∫_0^1 B(s)² ds] = ∫_0^1 E[B(s)²] ds = 1/2.

We have ∫_0^t B(s)² ds → ∞ a.s. as t → ∞ because Σ_{k=1}^n X_k converges a.s. to infinity as n → ∞ (Section 2.14 in this book, [150], Proposition 7.2.3, p. 563) and its limit is ∫_0^∞ B(s)² ds.
The time change theorem states that M(t) = B̃([M, M](t)) a.s., t ≥ 0. The samples of B̃ in the new clock can be obtained from the recurrence formula

B̃(s + Δs, ω) = B̃(s, ω) + (M(T^{−1}(s + Δs, ω), ω) − M(T^{−1}(s, ω), ω)),   s ≥ 0,
Example 5.24: Let N be a Poisson process with jump times T_k, k = 1, 2, ..., and intensity λ > 0. The compensated Poisson process Ñ(t) = N(t) − λt is a martingale with quadratic variation [Ñ, Ñ](t) = [N, N](t) = N(t) (Section 3.12). However, the process B̃(s) = Ñ([Ñ, Ñ]^{−1}(s)), s ≥ 0, in Eq. 5.40 is not a Brownian motion since Ñ is not a continuous martingale as required by the time change theorem in Eq. 5.40. <>
Proof: Let B̃_k denote values of B̃ at time k Δs, where k ≥ 0 is an integer and Δs = 1 is a time unit in the new clock. The definition of B̃ in Eq. 5.40 gives B̃_k = Ñ(N^{−1}(k)) = k − λ T_k so that B̃_k − B̃_{k−1} = 1 − λ Z, where Z is an exponential random variable with mean 1/λ. Hence, B̃ cannot be a Brownian motion since its increments are not Gaussian variables. •
respectively, where x_i and z_i are independent samples generated from the densities f and q, respectively, where f(ξ) = dP(X ≤ ξ)/dξ, q(ξ) = dQ(X ≤ ξ)/dξ, and Q is a measure on (Ω, F) such that P ≪ Q.

Let X = exp(Y), Y ~ N(1, (0.3)²). The exact probability P(X > x) is 0.3712, 0.0211, 0.1603 × 10⁻³, 0.3293 × 10⁻⁴, and 0.7061 × 10⁻⁵ for x = 3, 5, 8, 9, and 10, respectively. The corresponding estimates p̂_MC(x) based on 10,000 samples are 0.3766, 0.0235, 0.2 × 10⁻³, 0, and 0. The estimates p̂_IS(x) based on the same number of samples are 0.3733, 0.0212, 0.1668 × 10⁻³, 0.3304 × 10⁻⁴, and 0.7084 × 10⁻⁵ for x = 3, 5, 8, 9, and 10, where the density q(z) = φ((z − x)/σ)/σ corresponds to a Gaussian variable with mean x and variance σ². While the importance sampling provides satisfactory estimates of P(X > x) for levels up to x = 10, the estimates p̂_MC(x) are inaccurate for x ≥ 8. <>
Note: The required probability is

P(X > x) = E_P[1_{(x,∞)}(X)],

where E_P denotes the expectation operator under P. The estimate p̂_MC(x) approximates this expectation. We also have

P(X > x) = E_Q[1_{(x,∞)}(X) f(X)/q(X)],

where q is the density of X under Q. This density has been selected such that 50% of its samples exceed x and the ratio f/q is bounded in ℝ. ▲
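The importance sampling estimate of this example can be sketched as follows. The sketch assumes Y ~ N(1, (0.3)²), the parameters consistent with the exact probabilities listed above; the implementation details are illustrative, not the book's code.

```python
import math
import random

def lognormal_tail_is(x, n=10000, sigma_q=1.0, seed=0):
    """Importance-sampling estimate of P(X > x) for X = exp(Y), Y ~ N(1, 0.3^2),
    with Gaussian proposal q = N(x, sigma_q^2) centered at the level x."""
    rng = random.Random(seed)
    mu, s = 1.0, 0.3

    def f(z):    # lognormal density of X under P
        if z <= 0.0:
            return 0.0
        return (math.exp(-0.5 * ((math.log(z) - mu) / s) ** 2)
                / (z * s * math.sqrt(2.0 * math.pi)))

    def q(z):    # Gaussian proposal density under Q
        return (math.exp(-0.5 * ((z - x) / sigma_q) ** 2)
                / (sigma_q * math.sqrt(2.0 * math.pi)))

    total = 0.0
    for _ in range(n):
        z = rng.gauss(x, sigma_q)
        if z > x:
            total += f(z) / q(z)      # weighted indicator 1(z > x) f(z)/q(z)
    return total / n
```

Roughly half of the proposal samples exceed x, so the estimator remains informative even for levels where direct Monte Carlo produces no exceedances at all.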
P_f = ∫_{ℝ^d} 1_{D^c}(x) f(x) dx = ∫_{ℝ^d} 1_{D^c}(x) (f(x)/q(x)) q(x) dx = E_Q[1_{D^c}(X) f(X)/q(X)],
where f and q are the densities of X under the probability measures P and Q, respectively. The measure P defines the original reliability problem. The measure Q is selected to recast the original reliability problem and is such that P ≪ Q. The estimates of P_f by direct Monte Carlo and importance sampling are denoted by p̂_{f,MC} and p̂_{f,IS}, respectively (Example 5.25).

Let X have independent N(0, 1) coordinates and D be a sphere of radius r > 0 centered at the origin of ℝ^d, that is, D = {x ∈ ℝ^d : ‖x‖ ≤ r}. Table 5.1 lists the exact and estimated probabilities of failure for d = 10 and several values of r. The estimates of P_f are based on 10,000 samples. The density q corresponds to an ℝ^d-valued Gaussian variable with mean (r, 0, ..., 0) and covariance matrix σ² i, where σ > 0 is a constant and i denotes the identity matrix. The direct
Table 5.1. Exact and estimated probabilities of failure for d = 10

  r    σ      P_f             p̂_{f,IS}         p̂_{f,MC}
  5    0.5    0.0053          0.0              0.0053
       1                      0.0009
       2                      0.0053
       3                      0.0050
  6    0.5    0.8414 × 10⁻⁴   0.0001 × 10⁻⁴    0.0
       1                      0.1028 × 10⁻⁴
       2                      0.5697 × 10⁻⁴
       3                      1.1580 × 10⁻⁴
       4                      1.1350 × 10⁻⁴
  7    0.5    0.4073 × 10⁻⁶   0.0              0.0
       1                      0.0016 × 10⁻⁶
       2                      0.1223 × 10⁻⁶
       3                      0.6035 × 10⁻⁶
       4                      0.4042 × 10⁻⁶
Monte Carlo method is inaccurate for relatively large values of r, that is, small probabilities of failure. The success of the importance sampling method depends on the density q. For example, p̂_{f,IS} is in error for d = 10 and σ = 0.5 but becomes accurate if σ is increased to 3 or 4. <>
Note: The exact reliability,

1 − P_f = P(Σ_{i=1}^d X_i² ≤ r²) = (1/Γ(d/2)) ∫_0^{r²/2} ξ^{d/2−1} e^{−ξ} dξ,

is an incomplete gamma function, where Γ(·) denotes the gamma function. The densities of X under the measures P and Q are the N(0, i) and N((r, 0, ..., 0), σ² i) densities, respectively, and the importance sampling estimate of the failure probability is

p̂_{f,IS} = (1/n) Σ_{k=1}^n 1_{D^c}(z_k) f(z_k)/q(z_k),

where z_k are independent samples of X under Q. ▲
where X(0) = x denotes the initial state and the coefficients a, b are such that the solution of this equation exists and is unique (Section 4.7.1.1).

We need some definitions to state the Girsanov theorem. Let

g(x, t) = a(x, t) + b(x, t) y(x, t),   (5.42)

where y(X(t), t) ∈ ℝ^{d'} is a bounded, measurable function that is F_t-adapted with càdlàg samples. Define an ℝ^d-valued stochastic process Z by

Z(t) = Z(0) + ∫_0^t g(Z(s), s) ds + ∫_0^t b(Z(s), s) dB(s),   t ∈ [0, τ],   (5.43)
(dQ/dP)(t) = exp( ∫_0^t y(X(s), s)ᵀ dB(s) − (1/2) ∫_0^t ‖y(X(s), s)‖² ds ),   (5.45)

where t ∈ [0, τ].
Girsanov's theorem. This theorem gives the following properties that are essential for Monte Carlo simulation ([147], pp. 108–114):

1. The process B̃ in Eq. 5.44 is a Brownian motion under Q.

2. The solution X of Eq. 5.41 under P also satisfies Eq. 5.43 under Q.

Note: The stochastic equations for X and Z in Eqs. 5.41 and 5.43 (1) have the same diffusion coefficients but different drift coefficients and (2) are driven by Brownian motions under the measures P and Q, respectively. The drift of Z can be obtained from the drift of X and Eq. 5.42. The sample paths of X under P and Z under Q differ. However, the processes X and Z have the same probability law. ▲
The Girsanov theorem can be used to improve the efficiency and accuracy of the direct Monte Carlo method. For example, let X denote the state of a system that fails if X leaves a safe set D in a time interval [0, τ]. Suppose that our objective is to estimate the probability of failure P_f(τ) of this system by Monte Carlo simulation. The classical Monte Carlo method generates n samples of X, counts the number of samples n_f of X that leave D at least once in [0, τ], and estimates P_f(τ) by n_f/n. This solution can be inefficient in applications because (1) the estimates of P_f(τ) have to be based on a very large number of samples of X for highly reliable systems, that is, systems with P_f(τ) ≪ 1, and (2) the computation time for obtaining a single sample of X can be significant for realistic systems. The Girsanov theorem provides a procedure for modifying the drift of X if considered under a new probability Q. If a significant number of samples of X under Q cause system failure, then the number n of samples needed to estimate P_f(τ) can be reduced significantly relative to the classical Monte Carlo method based on the definition of X under the original probability measure P. The following algorithm outlines the estimation of P_f(τ) based on samples of X under Q.
Monte Carlo algorithm. The estimation of the failure probability P_f(τ) involves three steps:

1. Select the function y in Eq. 5.42 such that the fraction of samples of X exiting D at least once during [0, τ] is relatively large.

2. Generate samples z_i(t) of X under Q and of the Radon–Nikodym derivative dQ/dP, i = 1, ..., n, for t ∈ [0, τ].

3. Estimate the failure probability P_f(τ) by (Girsanov's theorem)
p̂_f(τ) = (1/n) Σ_{i=1}^n 1(z_i exits D in [0, τ]) (dP/dQ)_i(τ),   (5.46)

where (dP/dQ)_i(τ) is the reciprocal of the sample of (dQ/dP)(τ) generated in step 2.
Note: The success of the Monte Carlo simulation algorithm in Eq. 5.46 depends essentially on the choice of the function y in Eq. 5.42. Unfortunately, there exists no simple method for selecting y. The following two examples are presented to clarify the statement of the Girsanov theorem. Estimates of P_f(τ) based on this theorem are given in Examples 5.29 and 5.30. These examples also present some practical ideas for selecting an adequate expression for y in Eq. 5.42. ▲
5.4. Improved Monte Carlo simulation 339

Example 5.27: Let (Ω, F, (F_t)_{t≥0}, P) be a filtered probability space and X be a process on this space defined by dX(t) = a dt + b dB(t), t ∈ [0, τ], where τ > 0, X(0) = x, B is a Brownian motion under P, and a, b are constants. Let Z be another process on the measurable space (Ω, F) satisfying the equation dZ(t) = b dB̃(t), t ∈ [0, τ], where Z(0) = x and B̃ is a Brownian motion with respect to a probability measure Q defined by (dQ/dP)(t) = exp[−(a/b) B(t) − (a²/(2b²)) t]. Then X and Z have the same law. The expectation of X(t) and the probability of the event A = {X(t) ≤ ξ}, ξ ∈ ℝ, are x + a t and Φ((ξ − x − a t)/(b √t)), respectively, and can be calculated from X under P or from Z, that is, X under the probability measure Q. <>
(Eq. 5.43).

We have E_P[X(t)] = E_P[x + a t + b B(t)] = x + a t under P. This expectation is also

E_P[X(t)] = E_Q[Z(t) (dP/dQ)(t)] = x + a t,

since B̃(t) =_d √t G, G ~ N(0, 1), under Q, and E[exp(u G)] = exp(u²/2), u ∈ ℝ. Similar considerations can be used to calculate higher order moments of X(t).

The probability of A = {X(t) ≤ ξ} is as stated since X(t) ~ N(x + a t, b² t) under P. We also have

P(A) = ∫_A (dP/dQ)(t) dQ = e^{−a² t/(2 b²)} ∫_A exp(a B̃(t)/b) dQ
     = e^{−a² t/(2 b²)} ∫_{(−∞, (ξ−x)/b]} e^{a y/b} φ(y/√t)/√t dy = Φ((ξ − x − a t)/(b √t)),

Proof: The Radon–Nikodym derivative in Eq. 5.45 with y = −a/b gives the above expression of dQ/dP. The process Z is a geometric Brownian motion equal to Z(t) = x exp(−0.5 b² t + b B̃(t)) for Z(0) = x. •
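The equality E_P[X(t)] = E_Q[Z(t) (dP/dQ)(t)] of Example 5.27 can be checked by simulation. The weight below is written in terms of the Q-Brownian motion B̃, an assumed reparametrization of the derivative given in the example; the parameter values are illustrative.

```python
import math
import random

def mean_via_girsanov(x0=0.0, a=1.0, b=1.0, t=1.0, n=20000, seed=0):
    """E_P[X(t)] for dX = a dt + b dB computed two ways: directly under P,
    and as E_Q[Z(t) dP/dQ] with dZ = b dBtilde under Q, where (in terms of
    the Q-Brownian motion) dP/dQ = exp((a/b) Btilde(t) - a^2 t/(2 b^2))."""
    rng = random.Random(seed)
    direct, weighted = 0.0, 0.0
    for _ in range(n):
        g = math.sqrt(t) * rng.gauss(0.0, 1.0)   # B(t) under P / Btilde(t) under Q
        direct += x0 + a * t + b * g             # sample of X(t) under P
        w = math.exp((a / b) * g - a * a * t / (2.0 * b * b))
        weighted += (x0 + b * g) * w             # Z(t) weighted by dP/dQ
    return direct / n, weighted / n
```

Both averages converge to x_0 + a t, although the weighted estimator has a larger variance because of the exponential weight.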
Example 5.29: Let X be the process in Example 5.27 with X(0) = x = 0. Our objective is to estimate P_f(τ) = P(max_{0≤t≤τ} X(t) > x_cr). If X is a performance index and x_cr denotes a critical level for a system, then P_f(τ) is the probability of failure in [0, τ].
Let Z be a process defined by dZ(t) = (a + b y) dt + b dB̃(t), where B̃ is a Brownian motion under a probability measure Q defined by Eq. 5.45 and
p̂_{f,MC}(τ) = (1/n) Σ_{i=1}^n 1_{(x_cr,∞)}( max_{0≤t≤τ} x_i(t) )

and
for X(0) = x and a > 0. Let p̂_{f,MC}(τ) and p̂_{f,IS}(τ) be the estimates of P_f(τ) in Example 5.29, where Z is defined by
Figure 5.22. Samples of X under P and samples of X under Q
Figure 5.22 shows samples of X under P and the corresponding samples of this process under the measure Q. Because of the drift correction obtained by Girsanov's theorem, the samples of X under Q are more likely to exceed x_cr. This property of the samples of X under Q results in superior estimates of P_f(τ). <>
Proof: As in the previous example we have modified the drift such that the mean of the modified process reaches x_cr at time τ. This condition gives y = a(x_cr − x e^{−aτ})/[b(1 − e^{−aτ})], so that y = 4.47 for the values considered. The Radon–Nikodym derivative (dQ/dP)(t) is equal to exp[y B̃(t) + y² t/2].

The measure change used in this example has been established by elementary considerations and is not optimal in the sense that the number of samples of X under the above measure Q that exceed x_cr in [0, τ] is relatively small (Fig. 5.22). Alternative measures Q can be selected such that approximately 50% of the samples of X under Q exceed x_cr in [0, τ]. The use of such measures would further increase the efficiency of the Monte Carlo algorithm for estimating P_f(τ). •
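The three-step algorithm can be sketched for the constant-coefficient setting of Example 5.29 (X(0) = 0, dX = a dt + b dB). The drift parameter y, step counts, and tolerances below are illustrative assumptions.

```python
import math
import random

def first_passage_is(xcr=3.0, a=0.0, b=1.0, tau=1.0, y=3.0,
                     n=4000, nsteps=150, seed=0):
    """Girsanov-based estimate of P(max_{[0,tau]} X(t) > xcr) for
    dX = a dt + b dB, X(0) = 0: paths are generated with the modified
    drift a + b*y under Q and the exit indicator is weighted by
    dP/dQ = exp(-y*Btilde(tau) - y^2*tau/2)."""
    rng = random.Random(seed)
    dt = tau / nsteps
    total = 0.0
    for _ in range(n):
        x, bt, hit = 0.0, 0.0, False
        for _ in range(nsteps):
            db = math.sqrt(dt) * rng.gauss(0.0, 1.0)   # increment of Btilde under Q
            bt += db
            x += (a + b * y) * dt + b * db             # sample of X under Q
            if x > xcr:
                hit = True
        if hit:
            total += math.exp(-y * bt - 0.5 * y * y * tau)
    return total / n
```

For a = 0 and b = 1 the exact continuous-time probability is 2(1 − Φ(x_cr/√τ)) by the reflection principle, about 0.0027 for x_cr = 3 and τ = 1; the discretized maximum slightly underestimates this value, but the estimator uses only a few thousand samples where direct Monte Carlo would need millions.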
5.5 Problems
5.1: Let X ~ N(μ, γ) be a Gaussian vector in ℝ^d. Develop an algorithm for generating samples of X using the fact that conditional Gaussian vectors are Gaussian vectors.
5.3: Use Eq. 5.9 to develop a Monte Carlo algorithm for generating samples of
the translation vector considered in Example 5.6.
5.4: Complete calculations in the text giving the covariance function of X^(n) in Eq. 5.12.

5.5: Complete the details of the proof showing that X^(n) in Eq. 5.12 becomes a version of X as Δ^(n) → 0 and ν* → ∞.

5.6: Complete the details of the proof showing that X^(λ) in Eq. 5.13 becomes equal to X in the second moment sense as ν* → ∞ and that X^(λ) becomes a Gaussian process as λ → ∞.

5.7: Show that the second moment properties of the compound Poisson processes C_p, p = 1, 2, used to define the process X^(λ) in Eq. 5.17 are as stated. Find the second moment properties of the increments dC_p(ν) of these processes.

5.8: Suppose that X is a stationary Gaussian process with mean zero and one-sided spectral density g(ν) = (1/ν*) 1_{[0, ν*]}(ν), ν ≥ 0, where ν* > 0 is a constant. Calculate the mean square error e = E[(X(t) − X^(n)(t))²] of the approximate representation X^(n) of X given by Eqs. 5.12 and 5.19.

5.9: Complete the details of the proof showing that X^(n) in Eq. 5.22 becomes a version of X as n_a → ∞, a = 1, ..., d'.

5.10: Redo Example 5.14 with an alternative periodic extension of the covariance function of X such that c^(n,n)(τ, τ) is approximately equal to c(τ, τ).

5.11: Prove Eq. 5.29. Find also the finite dimensional density of X for d > 1.
5.12: Prove the convergence of a binomial distribution to a Poisson distribution
under the conditions stated following Eq. 5.34.
5.13: Show that the points of N conditional on N(B) are uniformly distributed in B, where N is a Poisson random measure defined on ℝ^{d'} and B ∈ B(ℝ^{d'}) is bounded.
5.14: Repeat the analysis in Example 5.22 for the case in which X is an Ornstein-
Uhlenbeck process.
5.15: Calculate moments of order 3 and higher of X in Example 5.27 using the
definition of this process under the probability measure Q.
5.16: Find the density of the random variable T = inf{t ≥ 0 : X(t) = x}, x > 0, by using the change of measure in Example 5.27 and the distribution of the first time a Brownian motion B starting at B(0) = 0 reaches x (Section 3.4).
5.17: Extend the analysis in Example 5.30 to a process X defined by the stochastic differential equation dX(t) = (α X(t) + β X(t)³) dt + σ dB(t), where α, β, σ are some constants and B denotes a Brownian motion.
Chapter 6

Deterministic Systems and Input
6.1 Introduction
The current state of a system is commonly given by the solution of a deter-
ministic differential, algebraic, or integral equation. For example, the functions
giving the displacements of the points of a solid satisfy partial differential equa-
tions obtained from equilibrium conditions, kinematic constraints, and material
constitutive laws. Generally, it is not possible to solve analytically the equations
defining the system state. Numerical methods are needed for their solution. Most
numerical methods are global, that is, they determine the solution everywhere or
at a large number of points in a system even if the solution is needed at a single
point.
This chapter develops alternative methods for solving deterministic differ-
ential, algebraic, and integral equations that have two common features.
(1) The methods are local, that is, they give directly the value of the solution of
a differential or integral equation at an arbitrary point in the set on which the
solution is defined, rather than extracting its value from the field solution.
For algebraic equations, the methods give the value of a particular unknown
directly.
(2) The methods employ Monte Carlo simulation. Local solutions can be
expressed as averages of functions of some stochastic processes that can
be estimated by Monte Carlo simulation algorithms. A MATLAB function
for finding the local solution of a partial differential equation is given in
Section 6.2.1.3 for illustration.
We consider the following classes of differential, algebraic, and integral
equations and develop methods for finding their local solutions.
344 Chapter 6. Deterministic Systems and Input
∂u(x, t)/∂t = Σ_{i=1}^d a_i(x) ∂u(x, t)/∂x_i + (1/2) Σ_{i,j=1}^d β_{ij}(x) ∂²u(x, t)/(∂x_i ∂x_j) + q(x) u(x, t) + p(x, t),   (6.1)
where D is an open subset of ℝ^d, a_i, β_{ij} are real-valued functions defined on D ⊂ ℝ^d, d ≥ 1 is an integer, and q, p denote real-valued functions defined on D × [0, ∞). The solution u : D × (0, ∞) → ℝ of Eq. 6.1 depends on boundary and initial conditions. This equation is meaningful if the first order partial derivative and the second order partial derivatives of u with respect to t and x, respectively, are continuous. If the functions q and p depend only on x and ∂u/∂t = 0, Eq. 6.1 is said to be time-invariant.
In Section 6.4 we consider a class of partial differential equations that define
linear elasticity problems and differ from Eq. 6.1. The local solutions developed
in Sections 6.2 and 6.3 for Eq. 6.1 cannot be applied to elasticity problems. An
alternative solution based on Neumann series representations will be used to solve
locally some elasticity problems.
where i is the (m, m) identity matrix, g is an (m, m) matrix, φ and u denote m-dimensional vectors, and λ is a parameter. It is assumed that the entries of g and φ are real-valued. If φ ≠ 0, λ has a fixed value, and det(i − λ g) ≠ 0, Eq. 6.2 has a unique solution. If φ = 0, Eq. 6.2 defines an eigenvalue problem for the matrix g.
example, the finite element, boundary element, and finite difference methods, used
currently to obtain numerical solutions of partial differential equations.
Some possible limitations of the traditional computational methods are:
(1) the computer codes used for solution are relatively complex and may require
extensive preprocessing to define a particular problem in a specified format, (2) the
numerical algorithms may become unstable in some cases, (3) the errors caused
by the discretization of the domain of integration and the numerical integration
methods used in analysis cannot be bounded, and (4) the field solution must be
calculated even if the solution is needed at a single point or a small collection of
points in D. In contrast, numerical algorithms based on the random walk method
are simple to program, always stable, accurate, local, and ideal for parallel compu-
tation. However, the local solutions are less general than those using the traditional
computational methods.
where .6. = ""£1= 1 a2 ;ax? is the Laplace operator, OiJ = 1 fori = j, and
OiJ = 0 fori f. j. If p = 0, Eq. 6.6 becomes Llu(x) = 0, and is referred to
as the Laplace equation.
3. The heat, transport, and diffusion equations for heterogeneous media are given by Eq. 6.1 with q = 0, where p has the meaning of flux. The heat equation for a homogeneous medium with diffusion coefficient a > 0 is

∂u(x, t)/∂t = a² Δu(x, t) + p(x, t),   (6.7)

and is given by Eq. 6.1 with a_i(x) = 0, β_{ij}(x) = 2 a² δ_{ij}, and q = 0.
4. The Schrödinger equation for a heterogeneous medium is

Σ_{i=1}^d a_i(x) ∂u(x)/∂x_i + (1/2) Σ_{i,j=1}^d β_{ij}(x) ∂²u(x)/(∂x_i ∂x_j) + q(x) u(x) + p(x) = 0,   (6.8)

and the Schrödinger equation for a homogeneous medium is

(1/2) Δu(x) + q(x) u(x) + p(x) = 0,   (6.9)

and can be obtained from Eq. 6.8 by setting a_i(x) = 0 and β_{ij}(x) = δ_{ij}. The above Schrödinger equations are referred to as inhomogeneous and homogeneous if p ≠ 0 and p = 0, respectively.
Note: The boundary condition for the time-invariant version of Eq. 6.1 is u(x) = ξ(x),
x ∈ ∂D. The real-valued function ξ giving the boundary values of u in this case differs
from ξ in Eq. 6.10. We use the same notation for simplicity. ▲
1. The entries of the matrix β(x) = {β_ij(x)} are real-valued functions and β(x) is
a symmetric positive definite matrix for each x ∈ ℝ^d, that is, the eigenvalues of
β(x) are strictly positive for all x ∈ D. Hence, there exists a matrix b such that
β(x) = b(x) b(x)^T for each x ∈ D.
Note: There are several methods for calculating a matrix b satisfying the condition β(x) =
b(x) b(x)^T for a fixed x ∈ ℝ^d. The Cholesky factorization provides such a representation
for β (Section 5.2.1 in this book, [155], Section 3.5.3). An alternative solution is given
by the equality φ(x)^T β(x) φ(x) = l(x), where φ(x) is a (d, d) matrix whose columns
are the eigenvectors of β(x) and l(x) is a diagonal matrix whose non-zero entries are the
eigenvalues of β(x). If the eigenvalues of β(x) are distinct, then we can take b(x) =
φ(x) μ(x), where μ(x) is a diagonal matrix with μ_ii(x) = √(l_ii(x)). ▲
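The Cholesky route described in this note can be sketched in code. The fragment below is an illustration, not part of the book: it implements the textbook Cholesky recurrence in pure Python and checks that the factor b reproduces β = b bᵀ for a small symmetric positive definite matrix (the 2×2 matrix used in a later example of this section).

```python
import math

def cholesky(beta):
    """Lower-triangular factor b of a symmetric positive definite
    matrix beta, so that beta = b b^T (textbook Cholesky recurrence)."""
    d = len(beta)
    b = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(b[i][k] * b[j][k] for k in range(j))
            if i == j:
                b[i][j] = math.sqrt(beta[i][i] - s)
            else:
                b[i][j] = (beta[i][j] - s) / b[j][j]
    return b

# Symmetric positive definite test matrix (used again later in this section).
beta = [[1.25, 1.5], [1.5, 4.25]]
b = cholesky(beta)
# Reassemble b b^T; it must reproduce beta.
bbT = [[sum(b[i][k] * b[j][k] for k in range(2)) for j in range(2)]
       for i in range(2)]
```

Since the factor is not unique, a Cholesky factor and the eigenvector-based factor φ(x) μ(x) of the same β generally differ, but both satisfy β = b bᵀ.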
2. Set the drift coefficients of X equal to the coefficients a_i(x), i = 1, …, d, of Eq. 6.1, and choose b such that β(x) = b(x) b(x)^T for all
x ∈ ℝ^d. Let B be an ℝ^d-valued Brownian motion and let X̃ = (X, X_{d+1}) be an
ℝ^{d+1}-valued diffusion process defined by the stochastic differential equation
It is assumed that the drift and diffusion coefficients of X̃ satisfy the conditions
in Section 4.7.1.1 so that the solution X̃ of Eq. 6.11 exists and is unique. These
requirements impose additional conditions on the coefficients of Eq. 6.1. We also
note that X̃ is a semimartingale with continuous samples.
3. For the time-invariant version of Eq. 6.1, let X be defined by Eq. 6.11 with
X(0) = x ∈ D and let F_t = σ(B(s) : 0 ≤ s ≤ t) denote the natural filtration of
B. If D is a Borel set, then
4. For the general version of Eq. 6.1, let X̃ be defined by Eq. 6.11 with X(0) =
x ∈ D and X_{d+1}(0) = t > 0. Define also the set D_t = D × (0, t) for any t > 0.
If D is a Borel set, then
348 Chapter 6. Deterministic Systems and Input
Note: Consider Eq. 6.1 with q = 0. This equation can be written in the form
showing that p must be continuous in its arguments. It turns out that p has to satisfy a
stronger condition for u to have continuous partial derivatives of order 1 in t and order
2 in x: the function p must be Hölder continuous in D, that is, there exist constants
c, α ∈ (0, ∞) such that |p(x, t) − p(x′, t)| ≤ c ‖x − x′‖^α, where x, x′ ∈ D ([58],
p. 133).
Let X be a diffusion process starting at x ∈ ∂D. We say that x is a regular point
of D if the first time T when X exits D is zero a.s., that is, P(T = 0) = 1. A boundary
point that does not satisfy this condition is said to be an irregular point. The boundary
point {0} of D = {x ∈ ℝ² : ‖x‖ < 1} \ {0} is an irregular point with respect to a Brownian
motion B in ℝ² because B will not hit {0} in a finite time even if it starts at this point ([58],
p. 99). Useful considerations and examples on regular and irregular boundary points can be
found in [58] (Section 4A) and [134] (Section 9.2). That D may have irregular points does
not have significant practical implications on the local solution of Eq. 6.1 because (1) D
can be approximated by a regular subset if it has irregular points and (2) almost no
path of the diffusion processes used to estimate the local solution of Eq. 6.1 will ever hit
the irregular points of D, provided that D has a finite number of irregular points ([41],
pp. 49–50). ▲
If the conditions 1, 2, 4 and 5 are satisfied, then the local solution of Eq. 6.1
with the initial and boundary conditions in Eqs. 6.4 and 6.10 is
Proof: The superscripts of the expectation operator E^{(x,t)} indicate that X̃(0) = (x, t),
x ∈ D and t > 0. The boundary value of u depends on the exit point of X̃. This function
is equal to η(X(t)) if T = t (Eq. 6.4) and ξ(X(t − T)) if T < t (Eq. 6.10).
The generator of the diffusion process X̃ is defined by the limit
\[ A[g(x,t)] = \lim_{s \downarrow 0} \frac{E^{(x,t)}[g(\tilde X(s))] - g(x,t)}{s}, \quad (6.15) \]
where g is a real-valued function defined on ℝ^{d+1} with continuous second order partial
derivatives in x ∈ ℝ^d and continuous first order partial derivative in s ≥ 0. Denote by
∂g/∂x̃_i and ∂²g/∂x̃_i ∂x̃_j the partial derivatives of g with respect to the coordinates x̃_i and
x̃_j of x̃ = (x, x_{d+1}), so that x̃_i = x_i for i = 1, …, d, d + 1. We have seen that X̃ is
a continuous semimartingale (Section 4.7.1.1) so that we can apply the multi-dimensional
Itô formula (Section 4.6.2) to g(X̃(s)). This formula gives
\[ E^{(x,t)}[g(\tilde X(s))] - g(\tilde X(0)) = \sum_{i=1}^{d+1} E^{(x,t)}\!\left[ \int_0^s \frac{\partial g(\tilde X(\sigma))}{\partial \tilde x_i}\, d\tilde X_i(\sigma) \right] + \frac{1}{2} \sum_{i,j=1}^{d} E^{(x,t)}\!\left[ \int_0^s \left(b(\tilde X(\sigma))\, b(\tilde X(\sigma))^T\right)_{ij} \frac{\partial^2 g(\tilde X(\sigma))}{\partial x_i\, \partial x_j}\, d\sigma \right]. \]
For a small s > 0, the sum of integrals under the expectation on the right side of the above
equation divided by s can be approximated by
\[ \sum_{i=1}^{d} \frac{\partial g(\tilde X(\theta(\omega) s, \omega))}{\partial x_i}\, a_i(\tilde X(\theta(\omega) s, \omega)) - \frac{\partial g(\tilde X(\theta(\omega) s, \omega))}{\partial x_{d+1}} + \frac{1}{2} \sum_{i,j=1}^{d} \left(b(\tilde X(\theta(\omega) s, \omega))\, b(\tilde X(\theta(\omega) s, \omega))^T\right)_{ij} \frac{\partial^2 g(\tilde X(\theta(\omega) s, \omega))}{\partial x_i\, \partial x_j} \]
for each ω ∈ Ω, where θ(ω) ∈ [0, 1]. Since the drift and diffusion coefficients of X̃ are
bounded functions, X̃ has continuous sample paths, and g has continuous partial derivatives
of order 1 and order 2 with respect to x̃ and x, respectively, the limit of the above function
as s ↓ 0 is deterministic so that A in Eq. 6.15 is
\[ A = -\frac{\partial}{\partial x_{d+1}} + \sum_{i=1}^{d} a_i\, \frac{\partial}{\partial x_i} + \frac{1}{2} \sum_{i,j=1}^{d} (b\, b^T)_{ij}\, \frac{\partial^2}{\partial x_i\, \partial x_j}. \quad (6.16) \]
The generator of X̃ coincides with the differential operator of Eq. 6.1 for the drift and diffusion coefficients of X̃ in Eq. 6.11. We have performed similar calculations in Section 4.6.2
to show that the generator of a Brownian motion is proportional to the Laplace operator.
The expectation of the Itô formula applied to the function g(X̃(s)) can be given in the form
(Eq. 6.16)
We have seen that any semimartingale X̃ has the representation X̃(t) = X̃(0) +
A(t) + M(t), where A is an adapted process with càdlàg samples of finite variation on
compacts and M is a local martingale such that A(0) = M(0) = 0 (Section 4.4.1). Let
X̃^T(t) = X̃(0) + (A + M)^T(t) = X̃(0) + A^T(t) + M^T(t) be the process X̃ stopped at
T. Since A^T and M^T have the same properties as A and M, respectively, X̃^T is a
semimartingale so that Eq. 6.16 can be applied to the function g(X̃^T). Also,
g(X̃^T(s)) = g(X̃(T ∧ s)), ∫_0^s A[g(X̃^T(σ))] dσ = ∫_0^{T ∧ s} A[g(X̃(σ))] dσ, and T ≤ t,
so that the above equation gives
\[ E^{(x,t)}[g(\tilde X(T \wedge s))] = g(x,t) + E^{(x,t)}\!\left[ \int_0^{T \wedge s} A[g(\tilde X(\sigma))]\, d\sigma \right] \quad (6.17) \]
for s ≥ t. Eq. 6.17 is referred to as the Dynkin formula ([135], Theorem 7.4.1, p. 118).
If X̃ is defined by Eq. 6.11 and the solution u of Eq. 6.1 with the initial and boundary
conditions in Eqs. 6.4 and 6.10 is used in place of g, then Eq. 6.17 gives Eq. 6.14 since
X̃(s) ∈ D_t for s < T, and X̃(T) is on the boundary of D_t, so that A[u(X̃(s))] = −p(X̃(s))
(Eq. 6.1) and
\[ u(\tilde X(T)) = \begin{cases} \xi(X(t - T)), & \text{if } T < t, \\ \eta(X(t)), & \text{if } T = t. \end{cases} \]
We can replace g with u in Eq. 6.17 since the solution u has a continuous first order derivative in t and continuous second order partial derivatives in x. Additional technical considerations related to the above proof can be found in [42] (Chapter 6), [58] (Chapter 4), and
[135] (Chapter 9). •
If the conditions 1, 2, 3 and 5 are satisfied, the local solution of the time-invariant version of Eq. 6.1 with the boundary condition u(x) = ξ(x), x ∈ ∂D,
is
\[ u(x) = E^x[\xi(X(T))] + E^x\!\left[ \int_0^T p(X(\sigma))\, d\sigma \right]. \quad (6.18) \]
Proof: The expectation of the Ito formula applied to a real-valued function g E C2 (Rd)
6.2. Random walk method 351
gives
\[ E^x[g(X(s))] - g(x) = \sum_{i=1}^{d} E^x\!\left[\int_0^s a_i(X(\sigma))\,\frac{\partial g(X(\sigma))}{\partial x_i}\,d\sigma\right] + \frac{1}{2}\sum_{i,j=1}^{d} E^x\!\left[\int_0^s \left(b(X(\sigma))\,b(X(\sigma))^T\right)_{ij}\frac{\partial^2 g(X(\sigma))}{\partial x_i\,\partial x_j}\,d\sigma\right]. \]
Arguments similar to those used to derive Eq. 6.14 show that the generator of X is
\[ A = \sum_{i=1}^{d} a_i(x)\, \frac{\partial}{\partial x_i} + \frac{1}{2} \sum_{i,j=1}^{d} \left(b(x)\, b(x)^T\right)_{ij} \frac{\partial^2}{\partial x_i\, \partial x_j}, \]
The Itô formula is applied to the function u(X^{T_n}) for each n ≥ 1. Hence, the family ∫_0^{T_n} h_j(X(σ)) dB_j(σ), n = 1, 2, …, is uniformly integrable ([135],
Appendix C). The functions h_j are bounded Borel functions in D by the properties of the diffusion coefficients of X and of the solution u of the time-invariant version of Eq. 6.1. •
Example 6.1: Let D be a simply connected open bounded subset of ℝ^d and let
u be the solution of the Laplace equation Δu(x) = 0, x ∈ D, with the boundary
condition u(x) = ξ(x), x ∈ ∂D. Denote by T = inf{t > 0 : B(t) ∉ D} the first
time when an ℝ^d-valued Brownian motion B starting at x ∈ D leaves D. The
local solution is u(x) = E^x[ξ(B(T))] for each x ∈ D. ◊
Proof: The process B^T is a square integrable martingale since T is a stopping time and B
is a square integrable martingale. The Itô formula applied to the function u(B^T) gives
\[ u(B^T(t)) = u(x) + \sum_{i=1}^{d} \int_0^t \frac{\partial u(B^T(s))}{\partial x_i}\, dB_i^T(s) + \frac{1}{2} \int_0^t \Delta u(B^T(s))\, ds = u(x) + \sum_{i=1}^{d} \int_0^t \frac{\partial u(B^T(s))}{\partial x_i}\, dB_i^T(s), \]
where the latter equality holds since Δu(B^T(s)) = 0 for all s ≥ 0. The stochastic integrals
M_i(t) = ∫_0^t [∂u(B^T(s))/∂x_i] dB_i^T(s), t ≥ 0, are F_t-measurable and have finite expectation
for s ≤ t since E[M_i(t)²] = E[∫_0^t (∂u(B^T(s))/∂x_i)² ds] and the functions ∂u/∂x_i are bounded in
D. Hence, the processes M_i are martingales. Then u(B^T(t)) = u(x) + Σ_{i=1}^d M_i(t) is a martingale with expectation E^x[u(B^T(t))] = u(x) so that E^x[u(B(T))] = u(x) by letting
t → ∞. This result and the specified boundary conditions yield u(x) = E^x[ξ(B(T))]. •
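A Monte Carlo sketch of this local solution may clarify the construction. The domain (unit disk) and the boundary function ξ(x) = x₁ below are illustrative assumptions, not from the book, chosen because ξ is harmonic, so the exact solution is u(x) = x₁ and the estimate can be checked directly.

```python
import math, random

def laplace_local(x, xi, radius=1.0, dt=1e-3, ns=500, rng=None):
    """Estimate u(x) = E^x[xi(B(T))] on a disk: run ns Brownian paths
    from x until they leave the disk, average xi at the exit points."""
    rng = rng or random.Random(0)
    step = math.sqrt(dt)
    total = 0.0
    for _ in range(ns):
        b1, b2 = x
        while b1 * b1 + b2 * b2 < radius * radius:
            b1 += step * rng.gauss(0.0, 1.0)
            b2 += step * rng.gauss(0.0, 1.0)
        total += xi((b1, b2))   # exit point approximates B(T)
    return total / ns

# xi(x) = x_1 is harmonic, so u(x) = x_1 exactly; compare at x = (0.5, 0).
est = laplace_local((0.5, 0.0), lambda p: p[0])
```

The estimate carries both sampling error, of order 1/√n_s, and a discretization bias from detecting the exit on a time grid, in line with the accuracy discussion later in this section.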
Example 6.2: The RWM can solve locally elliptic and parabolic partial differen-
tial equations. The method cannot be used to solve hyperbolic partial differential
equations. <>
Proof: Let X be the diffusion process in Eq. 6.11. The generator of this process is
\[ A = \sum_{i=1}^{d} a_i\, \frac{\partial}{\partial x_i} + \frac{1}{2} \sum_{i,j=1}^{d} (b\, b^T)_{ij}\, \frac{\partial^2}{\partial x_i\, \partial x_j}. \]
Since b b^T is a real-valued symmetric matrix, its eigenvalues cannot be negative, so that the
RWM cannot be applied to solve hyperbolic partial differential equations ([73], Chapter 26).
For example, consider the time-invariant version of Eq. 6.1 with d = 2 and a_i(x) =
0 and let X be an ℝ²-valued diffusion process defined by the stochastic differential equation
dX(t) = b dB(t), t ≥ 0, where B is a Brownian motion in ℝ² and b denotes a (2, 2) real-valued matrix. The generator of X is (Eq. 6.16)
\[ A = \frac{1}{2} \sum_{i,j=1}^{2} (b\, b^T)_{ij}\, \frac{\partial^2}{\partial x_i\, \partial x_j}. \]
Since the eigenvalues of b b^T are non-negative, the differential operator defined by the generator of X cannot match the operator of a hyperbolic equation ([73], Chapter 26). •
through ∂D at an earlier time T < t. The values of u(X̃(T)) at these exit points
are η(X(t)) and ξ(X(t − T)), respectively.
Suppose that n_s independent samples of X̃ have been generated. Denote
by n′_s and n″_s the number of sample paths of X̃ that exit D_t through D and ∂D,
respectively. We have n′_s ≥ 0, n″_s ≥ 0, and n_s = n′_s + n″_s.
\[ \hat u(x, t) = \frac{n_s'}{n_s} \left[ \frac{1}{n_s'} \sum_{\omega'=1}^{n_s'} \eta(X(t, \omega')) \right] + \frac{n_s''}{n_s} \left[ \frac{1}{n_s''} \sum_{\omega''=1}^{n_s''} \xi(X(t - T(\omega''), \omega'')) \right] + \frac{1}{n_s} \sum_{\omega=1}^{n_s} \int_0^{T(\omega)} p(X(s, \omega))\, ds. \quad (6.19) \]
Note: There is no reason to continue the calculation of a sample of X̃ after this sample has
exited D_t. Hence, we need to generate samples of the process X̃ stopped at T.
The terms in the square brackets of Eq. 6.19 are estimates of the expectations
E^{(x,t)}[u(X̃(T)) | T = t] and E^{(x,t)}[u(X̃(T)) | T < t], and the ratios n′_s/n_s and n″_s/n_s
weighting these terms approximate the probabilities P(T = t) and P(T < t), respectively.
Hence, the sum of the first two terms in Eq. 6.19 is an estimate for E^{(x,t)}[u(X̃(T))].
The expectation E^{(x,t)}[∫_0^T p(X̃(s)) ds] in Eq. 6.14 is approximated by the last term in
Eq. 6.19.
The accuracy of the local solution given by Eq. 6.19 depends on (1) the sample size
n_s = n′_s + n″_s, (2) the time step used to generate sample paths of the diffusion process X̃,
and (3) the accuracy of the recurrence formula used for generating sample paths of X̃. The
algorithms in Section 4.7.3 can be used to generate samples of X̃. ▲
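The stopped samples required by Eq. 6.19 can be produced with an Euler recurrence for Eq. 6.11. The function below is a hypothetical sketch of such a sampler (the names are not from the book); it is checked against the classical mean exit time E^x[T] = x(1 − x) of a standard Brownian motion from the interval (0, 1).

```python
import math, random

def sample_exit(x0, drift, bmat, in_domain, dt=1e-3, rng=None):
    """One Euler sample of dX = a(X) dt + b(X) dB started at x0 and
    stopped the first time it leaves the domain; returns (T, X(T))."""
    rng = rng or random.Random(0)
    x = list(x0)
    d = len(x)
    t = 0.0
    sq = math.sqrt(dt)
    while in_domain(x):
        dB = [sq * rng.gauss(0.0, 1.0) for _ in range(d)]
        a = drift(x)
        bm = bmat(x)
        x = [x[i] + a[i] * dt + sum(bm[i][j] * dB[j] for j in range(d))
             for i in range(d)]
        t += dt
    return t, x

# Standard Brownian motion on (0, 1) started at 0.5: E^x[T] = x (1 - x) = 0.25.
rng = random.Random(1)
times = [sample_exit([0.5], lambda x: [0.0], lambda x: [[1.0]],
                     lambda x: 0.0 < x[0] < 1.0, rng=rng)[0]
         for _ in range(800)]
mean_T = sum(times) / len(times)
```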
\[ \hat u(x) = \frac{1}{n_s} \sum_{\omega=1}^{n_s} \xi(X(T(\omega), \omega)) + \frac{1}{n_s} \sum_{\omega=1}^{n_s} \int_0^{T(\omega)} p(X(s, \omega))\, ds, \quad (6.20) \]
\[ \Delta u(x) = 0, \quad x \in D, \qquad u(x) = \xi_r(x), \quad x \in \partial D_r, \ r = 1, \ldots, m. \quad (6.21) \]
Note: The local solution in Eq. 6.23 is a special case of Eq. 6.18 for a_i(x) = 0, β_ij(x) =
2 δ_ij, and q = p = 0. The Brownian motion B is adequate for solution because its
Figure 6.2. Local solution of the Laplace equation for a multiply connected domain D in ℝ²
The domain D in ℝ² has boundaries ∂D₁ = {(x₁, x₂) : x₁² + x₂² − 1 = 0} and ∂D₂ = {(x₁, x₂) :
(x₁ − 1/4)² + x₂² − (1/4)² = 0} kept at the constant temperatures 50 and 0,
respectively. The local solution by the RWM is u(x) = 50 p₁(x) for any point x ∈
D (Eq. 6.24). Because p₁(x) cannot be found analytically, it has been calculated
from Eq. 6.20 and samples of B starting at x. The largest error recorded at x =
(0.7, 0), (0.9, 0), (0, 0.25), (0, 0.5), and (0, 0.75) was found to be 2.79% for n_s =
1,000 independent samples of B with time steps Δt = 0.001 or 0.0001. Smaller
time steps were used at points x close to the boundary of D. The error can be
reduced by decreasing the time step and/or increasing the sample size.
Figure 6.3 shows the exact and RWM solutions along a circle of radius
r = 3/4 centered at the origin, where the angle θ ∈ [0, 2π) measured from the
coordinate x₁ gives the location on this circle. The RWM solutions with (Δt =
0.01, n_s = 100) and (Δt = 0.0001, n_s = 5000) are shown with circles and
stars, respectively. The accuracy of the RWM improves as the time step
decreases and the sample size increases [38]. ◊
Figure 6.3. Solution along a circle of radius r = 3/4 centered at the origin
Note: The temperature u is the solution of Δu = 0 with the boundary conditions u(x) =
50 for x ∈ ∂D₁ and u(x) = 0 for x ∈ ∂D₂. The exact solution is ([73], Example 16.3,
p. 296)
\[ u(x) = 25 \left( 1 - \frac{\log(\xi(x)^2 + \rho(x)^2)}{2 \log(a)} \right), \quad \text{where } a = 2 + \sqrt{3}. \]
This exact result was used to evaluate the accuracy of the local solution. ▲
\[ \sum_{i=1}^{2} a_i(x)\, \frac{\partial u(x)}{\partial x_i} + \frac{1}{2} \sum_{i,j=1}^{2} \beta_{ij}(x)\, \frac{\partial^2 u(x)}{\partial x_i\, \partial x_j} = 0, \quad x \in D \subset \mathbb{R}^2, \]
with the boundary conditions u(x) = ξ(x), x ∈ ∂D. If β(x) = {β_ij(x)} admits
the decomposition β(x) = b(x) b(x)^T for each x ∈ D, then the local solution is
u(x) = E^x[u(X(T))], where X is an ℝ²-valued diffusion process with drift coefficients a_i(x) and
diffusion coefficients b_ij(x), and T denotes the first time X
starting at x ∈ D exits D (Eq. 6.14). It is assumed that E^x[T] < ∞.
Generally, the expectation E^x[u(X(T))] cannot be obtained analytically
but it can be estimated from samples of X generated by Monte Carlo simulation.
Numerical results were obtained for D = (−2, 2) × (−1, 1), boundary conditions
ξ(x) = 1 for x ∈ (−2, 2) × {−1} and zero on the other boundaries of D, and the
coefficients a₁(x) = 1, a₂(x) = 2, β₁₁(x) = 1.25, β₁₂(x) = β₂₁(x) = 1.5, and
β₂₂(x) = 4.25 of the partial differential equation of u. The local approximations
of u based on n_s = 1,000 samples of X and a time step of Δt = 0.001 are
satisfactory. For example, these values are 0.2720, 0.2650, and 0.2222 at x =
(0.5, 0), (1.0, 0), and (1.5, 0). They differ from the finite element solution by less
than 5%. ◊
Note: The diffusion process X needed to estimate u(x) at an arbitrary point x ∈ D is
defined by the stochastic differential equation
\[ \begin{cases} dX_1(t) = dt + dB_1(t) + (1/2)\, dB_2(t), \\ dX_2(t) = 2\, dt + (1/2)\, dB_1(t) + 2\, dB_2(t), \end{cases} \]
where B₁ and B₂ are independent Brownian motions. A finite element solution has been
used to assess the accuracy of the local solution because the exact solution is not known in
this case. ▲
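As a quick consistency check, the diffusion matrix b read off the dB coefficients of the stochastic differential equation above reproduces the coefficients β_ij quoted in the example:

```python
# Rows of b are the dB coefficients of the two equations in the note.
b = [[1.0, 0.5], [0.5, 2.0]]
beta = [[sum(b[i][k] * b[j][k] for k in range(2)) for j in range(2)]
        for i in range(2)]
# beta equals [[1.25, 1.5], [1.5, 4.25]], i.e. beta = b b^T as required.
```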
Note: The function p_r(x) = P(B(T) ∈ ∂D_r) is the probability that B starting at x ∈ D
exits D for the first time through ∂D_r, r = 1, …, m (Eq. 6.23). The operator A = (1/2) Δ
is the generator of the Brownian motion B. Generally, the probabilities and expectations in
Eq. 6.26 cannot be obtained analytically but can be estimated from samples of B. ▲
Example 6.7: The Prandtl stress function for a bar in torsion is the solution of
Δu(x) = −2 G β, x ∈ D ⊂ ℝ², where G is the modulus of elasticity in
shear and β denotes the angle of twist. If D is a multiply connected domain
with the external boundary ∂D₁ and the interior boundaries ∂D_r, r = 2, …, m,
delineating the cavities of the beam cross section, the Prandtl function satisfies the
conditions u(x) = k_r for x ∈ ∂D_r, r = 1, …, m, where k_r are constants. One
of the constants k_r is arbitrary and can be set to zero, for example k₁ = 0 ([23],
Section 7-6).
The local solution for the Prandtl stress function is
with the notation in Eq. 6.26. If the Brownian motion B starts at a point on
the boundary of D, for example, x ∈ ∂D_r, and D is regular, then p_r(x) = 1,
p_q(x) = 0 for q ≠ r, and E^x[T] = 0 so that u(x) = k_r, that is, the local solution
satisfies the boundary conditions exactly [82, 84]. ◊
Note: The constant β can be calculated from the condition that the applied twist is twice the
volume of the Prandtl function over D ([23], Section 7-6). Therefore, the global solution
is needed to find β. However, the value of β may not be necessary in the design phase
selecting the shape of D from several possible alternatives. Because local solutions help
detect regions of stress concentration in D, they provide a useful tool for the selection of
adequate shapes for D, that is, shapes which do not cause large localized stresses.
If D is a simply connected set, then p_r(x) = 0 for r = 2, …, m so that the local
solution becomes u(x) = G β E^x[T]. ▲
\[ u(x) = G\,\beta\, E^x[T] = \frac{G\,\beta\,(r^2 - x_1^2 - x_2^2)}{2}, \quad x \in D, \]
and coincides with the expression of this function in solid mechanics books ([23],
Section 7-4). ◊
Note: Consider a sphere of radius r > 0 centered at the origin of ℝ^d and an ℝ^d-valued
Brownian motion B starting at x, ‖x‖ < r. Let T be the first time that B starting at x
exits the sphere. Then E^x[T] = (r² − ‖x‖²)/d (Section 8.5.1.1 in this book, [135],
Example 7.4.2, p. 119). ▲
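The closed form E^x[T] = (r² − ‖x‖²)/d is convenient for validating exit-time samplers. The sketch below (an illustration, not from the book) estimates the mean exit time of a planar Brownian motion from the unit disk, for which the formula gives 1/2 at the centre.

```python
import math, random

def mean_exit_time(x, r, dt=1e-3, ns=400, rng=None):
    """Monte Carlo estimate of E^x[T] for a standard Brownian motion
    started at x and stopped when it leaves the sphere of radius r."""
    rng = rng or random.Random(2)
    sq = math.sqrt(dt)
    total = 0.0
    for _ in range(ns):
        y = list(x)
        t = 0.0
        while sum(c * c for c in y) < r * r:
            y = [c + sq * rng.gauss(0.0, 1.0) for c in y]
            t += dt
        total += t
    return total / ns

# d = 2, r = 1, x = origin: the formula gives E^x[T] = (1 - 0) / 2 = 0.5.
est = mean_exit_time([0.0, 0.0], 1.0)
```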
Example 6.9: Estimates of the Prandtl stress function u(x) in Example 6.7 at
x = (2, 1), (3, 1), (4, 1), (2, 2), and (3, 2) for an elliptical cross section D = {x :
(x₁/a₁)² + (x₂/a₂)² < 1} with a₁ = 5 and a₂ = 3 have been obtained by Monte
Carlo simulation for G β = 1/2. For n_s = 100 samples of B generated with a
time step Δt = 0.001 the largest error of the estimated Prandtl function is nearly
33%. The error decreases to 2% if the sample size is increased to n_s = 1,000. ◊
Note: The exact expression of the Prandtl stress function ([23], Section 7-4) was used to assess the accuracy of the local
solution. ▲
Example 6.10: Consider a hollow circular shaft with inner radius a > 0 and outer
radius b > a and let T be the first time an ℝ²-valued Brownian motion B starting
at x ∈ D leaves D = {x ∈ ℝ² : a < ‖x‖ < b}. The local solution is
Note: The Prandtl stress function is u(x) = −(1/2) G β (xᵀx − b²) for a² < xᵀx < b²
([23], Section 7-9). This expression of u has been used to assess the accuracy of the local
solution. ▲
\[ \frac{\partial u(x,t)}{\partial t} = a^2 \sum_{i=1}^{d} \frac{\partial^2 u(x,t)}{\partial x_i^2} \quad (6.27) \]
with specified initial and boundary conditions giving the functions u(x, 0) for
x ∈ D and u(x, t) for x ∈ ∂D and t > 0, respectively.
Proof: The generator of X̃ is A = −∂/∂x_{d+1} + a² Δ so that Eq. 6.29 giving the local
solution of Eq. 6.27 is a special case of Eq. 6.14 for a_i = 0, β_ij = 2 a² δ_ij, q = 0, and
p = 0. The steady-state solution u_s(x) = lim_{t→∞} u(x, t) satisfies the Laplace equation
Δu_s = 0 since ∂u_s/∂t = 0.
The local solution in Eq. 6.29 can be extended to random heterogeneous media.
This extension is used in Section 8.5.1 to find effective conductivity coefficients for random
heterogeneous media. •
Example 6.11: Consider a rod of length l > 0 with initial temperature u(x, 0) =
100, x ∈ D = (0, l), and boundary temperatures u(0, t) = u(l, t) = 0 for
t > 0. The temperature at an arbitrary time t > 0 and location x ∈ (0, l) is
u(x, t) = 100 p₃(x, t), where p₁(x, t), p₂(x, t), and p₃(x, t) are the probabilities
that the ℝ²-valued diffusion process X̃ = (X₁, X₂) defined by
where T is the first time that X̃ starting at (x, t) exits D_t = (0, l) × (0, t), x ∈ (0, l),
and t > 0. The last equality in the above equation holds since u is the solution of the
heat equation, X̃(s) ∈ D_t for s ∈ (0, T), and A coincides with the differential operator of
Eq. 6.27 for d = 1.
The exact solution is
\[ u(x,t) = \sum_{n=1,3,\ldots} \frac{400}{n \pi}\, \sin\frac{n \pi x}{l}\, \exp\{-(n \pi a / l)^2\, t\} \]
for x ∈ (0, l) and t ≥ 0 ([73], Example 26.2, pp. 528–531). ▲
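The series above and a random walk estimate of u(x, t) = 100 P(X stays in (0, l) up to t) can be compared numerically. The sketch below is illustrative, with hypothetical function names; the simulated process uses the step size a √(2 Δt) implied by β = 2 a² δ.

```python
import math, random

def u_series(x, t, a=1.0, l=1.0, terms=400):
    """Exact series solution for the rod with u(x,0) = 100 and zero
    boundary temperatures."""
    return sum((400.0 / (n * math.pi)) * math.sin(n * math.pi * x / l)
               * math.exp(-(n * math.pi * a / l) ** 2 * t)
               for n in range(1, 2 * terms, 2))

def u_rwm(x, t, a=1.0, l=1.0, dt=5e-4, ns=1000, rng=None):
    """RWM estimate: u(x,t) = 100 P(the diffusion a sqrt(2) B stays
    in (0, l) during the remaining time t)."""
    rng = rng or random.Random(3)
    step = a * math.sqrt(2.0 * dt)
    steps = int(t / dt)
    hits = 0
    for _ in range(ns):
        y = x
        for _ in range(steps):
            y += step * rng.gauss(0.0, 1.0)
            if not 0.0 < y < l:
                break          # exited through a lateral boundary, xi = 0
        else:
            hits += 1          # survived until time t, eta = 100
    return 100.0 * hits / ns
```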
\[ \frac{\partial u}{\partial t} = -c\, \frac{\partial u}{\partial x} + d\, \frac{\partial^2 u}{\partial x^2} \]
defined in D = {x : x > 0} for t > 0 with the initial and boundary conditions
u(x, 0) = 0 and u(0, t) = u₀, t > 0, respectively. The local solution of this
equation at (x, t) is u(x, t) = u₀ P(T < t), where D_t = D × (0, t), T = inf{s >
0 : X̃(s) ∉ D_t}, X̃ = (X₁, X₂) is defined by
\[ \begin{cases} dX_1(s) = -c\, ds + (2 d)^{1/2}\, dB(s), \\ dX_2(s) = -ds, \end{cases} \]
and B denotes a Brownian motion. Figure 6.5 shows two samples of X̃ starting at
(x = 0.4, t = 0.2) and exiting D_t at times T < t and T = t.
Estimates of u(x, t) calculated at several locations x and times t are in error by less than 2% for c = 1, d = 1, u₀ = 1, a time step Δt = 0.001, and
n_s = 500 samples. The error can be reduced by increasing the sample size. For
example, the error decreases to approximately 1% if n_s = 1,000 samples are used
in calculations [86]. ◊
Note: The generator A = d (∂²/∂x₁²) − c (∂/∂x₁) − ∂/∂x₂ of X̃, Eq. 6.14, and the boundary
and initial conditions yield the stated local solution.
The exact solution of the mass transport equation giving the concentration of pollutant at an arbitrary point x and time t > 0 is
\[ u(x,t) = \frac{u_0}{2} \left[ \operatorname{erfc}\!\left( \frac{x - c\,t}{2\sqrt{d\,t}} \right) + \exp(c\,x/d)\, \operatorname{erfc}\!\left( \frac{x + c\,t}{2\sqrt{d\,t}} \right) \right], \]
Figure 6.5. Samples of X̃ with generator A = d (∂²/∂x₁²) − c (∂/∂x₁) − ∂/∂x₂
for c = 1 and d = 1, starting at (x = 0.4, t = 0.2)
where erfc(x) = (2/√π) ∫_x^∞ e^{−ξ²} dξ is the complementary error function ([195], Example Problem 3.1, p. 65). ▲
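The closed form is easy to evaluate with the standard library's erfc; the sketch below (illustrative, with a hypothetical function name) checks the boundary behaviour u(0, t) = u₀ and the vanishing far-field concentration.

```python
import math

def concentration(x, t, c=1.0, d=1.0, u0=1.0):
    """Closed-form solution of the mass transport equation above."""
    s = 2.0 * math.sqrt(d * t)
    return 0.5 * u0 * (math.erfc((x - c * t) / s)
                       + math.exp(c * x / d) * math.erfc((x + c * t) / s))
```

Since erfc(−z) + erfc(z) = 2, the expression reduces to u₀ at x = 0, matching the boundary condition.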
\[ A = \sum_{i=1}^{d} a_i(x)\, \frac{\partial}{\partial x_i} + \frac{1}{2} \sum_{i,j=1}^{d} \left(b(x)\, b(x)^T\right)_{ij} \frac{\partial^2}{\partial x_i\, \partial x_j}. \quad (6.31) \]
Let ρ, ζ : ℝ^d → ℝ be two functions such that (1) ρ has a compact support
and continuous second order partial derivatives, (2) ζ is a bounded Borel measurable function, and (3) the drift and diffusion coefficients in Eq. 6.30 are such that
this equation has a unique solution X that belongs a.s. to the support of ρ.
\[ \frac{\partial v}{\partial t} = A[v] + \zeta\, v \quad (6.33) \]
Proof: The derivation of the Feynman–Kac formula involves two steps. First, consider the
ℝ²-valued process (Y, Z) defined by
\[ \begin{cases} Y(t) = \rho(X(t)), \\ Z(t) = \exp\left( \int_0^t \zeta(X(\sigma))\, d\sigma \right), \end{cases} \]
with Y(0) = ρ(X(0)) = ρ(x) and Z(0) = 1. The expectation of the product of these two
processes is v(x, t) = E^x[Y(t) Z(t)] in Eq. 6.32. The Itô formula applied to the above
functions of X defining the processes Y and Z gives
\[ \begin{cases} dY(t) = A[\rho(X(t))]\, dt + \sum_{j=1}^{d} \left( \sum_{i=1}^{d} b_{ij}(X(t))\, \frac{\partial \rho(X(t))}{\partial x_i} \right) dB_j(t), \\ dZ(t) = \zeta(X(t))\, Z(t)\, dt. \end{cases} \]
Second, the product Y(t) Z(t) can be given in the form
by averaging, or
+ E^x[v(X(h), t − h)]
Example 6.13: Let Γ denote the time a Brownian motion B starting at zero spends
in the positive half line up to a time t > 0. The distribution and density of Γ are
time B spends in (0, ∞) during [0, t]. The corresponding Feynman–Kac functional is the
solution of the partial differential equation
The function v must be bounded and satisfies the continuity conditions v(0+, t) = v(0−, t)
and ∂v(0+, t)/∂x = ∂v(0−, t)/∂x at each t > 0. It can be shown that
\[ v(0, t) = \int_0^t \frac{1}{\pi \sqrt{\tau\, (t - \tau)}}\, e^{-\beta \tau}\, d\tau = \int_0^t e^{-\beta \tau} f(\tau)\, d\tau \]
so that the density of Γ is as asserted ([111], pp. 224–226). •
Proof: We first note that the local solution in Eq. 6.14 is a special case of Eq. 6.34 corresponding to q = 0.
Consider an ℝ^{d+2}-valued diffusion process (X̃, X_{d+2}) = (X, X_{d+1}, X_{d+2} = Z)
defined by the stochastic differential equation
\[ A^* = -\frac{\partial}{\partial x_{d+1}} + \sum_{i=1}^{d} a_i\, \frac{\partial}{\partial x_i} + q\, z\, \frac{\partial}{\partial x_{d+2}} + \frac{1}{2} \sum_{i,j=1}^{d} (b\, b^T)_{ij}\, \frac{\partial^2}{\partial x_i\, \partial x_j}, \]
and can be applied to an arbitrary function g(x, s, z) with continuous first order partial
derivatives in s and z and continuous second order partial derivatives in x. For the function
g(x, x_{d+1}, x_{d+2} = z) = ψ(x, x_{d+1}) z, we have
\[ A^*[g(x, s, z)] = z \left[ -\frac{\partial}{\partial x_{d+1}} + \sum_{i=1}^{d} a_i(x)\, \frac{\partial}{\partial x_i} + q(x, s) + \frac{1}{2} \sum_{i,j=1}^{d} \left(b(x)\, b(x)^T\right)_{ij} \frac{\partial^2}{\partial x_i\, \partial x_j} \right] \psi(x, x_{d+1}). \]
For g(X̃, Z) = ψ(X̃) Z and the solution u of Eq. 6.1 in place of ψ, the Itô formula
gives
by averaging. The above formula yields Eq. 6.34 since (X̃, Z) starts at (x, t, 1) so that
u(X̃(0)) Z(0) = u(x, t), and X̃(s) ∈ D_t for s ∈ (0, T) implies
for the boundary condition u(x) = ξ(x), where B is a Brownian motion in ℝ^d
starting at x ∈ D and T = inf{t > 0 : √2 B(t) ∉ D} is a stopping time. The
value of u can be estimated at x ∈ D from n_s samples of T and B(T) by
\[ \hat u(x) = \frac{1}{n_s} \sum_{\omega=1}^{n_s} \xi\!\left( \sqrt{2}\, B(T(\omega), \omega) \right) e^{q\, T(\omega)}. \]
Numerical results obtained for D = {x ∈ ℝ² : x₁² + x₂² < 4}, q = −Σ_{i=1}^2 a_i²,
and ξ(x) = exp(a₀ + a₁ x₁ + a₂ x₂) are in error by less than 1.5% for a₀ = 0.5,
a₁ = 0.3, a₂ = 0.3, and n_s = 500 samples of B generated with a time step
Δt = 0.001. ◊
Note: The errors were calculated with respect to the exact solution u(x) = exp(a₀ +
a₁ x₁ + a₂ x₂). That this function satisfies the Schrödinger equation results by elementary
calculations. ▲
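A Monte Carlo sketch of this estimator, under the assumptions of the example: the process √2 B is simulated on the disk of radius 2 and the weight e^{qT} is accumulated at the exit time. With q = −(a₁² + a₂²), the function exp(a₀ + a₁x₁ + a₂x₂) solves the homogeneous Schrödinger equation, so the estimate can be compared with it. The code is an illustration, not the book's implementation.

```python
import math, random

def schrodinger_local(x, xi, q, radius=2.0, dt=1e-3, ns=500, rng=None):
    """Estimate u(x) = E^x[xi(sqrt(2) B(T)) exp(q T)] on a disk."""
    rng = rng or random.Random(4)
    step = math.sqrt(2.0 * dt)    # increment of sqrt(2) B over dt
    total = 0.0
    for _ in range(ns):
        y1, y2 = x
        t = 0.0
        while y1 * y1 + y2 * y2 < radius * radius:
            y1 += step * rng.gauss(0.0, 1.0)
            y2 += step * rng.gauss(0.0, 1.0)
            t += dt
        total += xi((y1, y2)) * math.exp(q * t)
    return total / ns

a0, a1, a2 = 0.5, 0.3, 0.3
q = -(a1 * a1 + a2 * a2)          # q = -sum a_i^2, as in the example
xi = lambda p: math.exp(a0 + a1 * p[0] + a2 * p[1])
est = schrodinger_local((0.5, 0.0), xi, q)
exact = math.exp(a0 + a1 * 0.5)   # exact solution at x = (0.5, 0)
```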
\[ \hat u(x) = \frac{1}{n_s} \left[ \alpha \sum_{\omega'=1}^{n_s'} e^{q\, T(\omega')} + \beta \sum_{\omega''=1}^{n_s''} e^{q\, T(\omega'')} \right] = c'(x)\, \alpha + c''(x)\, \beta, \]
where x ∈ (0, l), n′_s and n″_s denote the number of samples of a Brownian motion
B starting at x ∈ D that exit D = (0, l) through the left and the right ends of
D, n_s = n′_s + n″_s denotes the sample size, and c′(x), c″(x), giving the weights
of the boundary values u(0) = α, u(l) = β, result from the above expression of
û(x). The largest value, e_max, of the error e = |û(x) − u(x)|/|u(x)| at x = k/10,
k = 1, …, 9, is smaller than 5% for l = 1, α = 1, β = 2, q = 1, Δt = 0.001, and
n_s = 500. The error can be reduced by increasing the sample size and/or reducing
the time step Δt used to generate Brownian motion samples. For example, e_max
is less than 1.5% for n_s = 1,000 and Δt = 0.0005 [85]. ◊
Note: The exact solution of this Schrödinger equation used to assess the accuracy of the
estimator û is
\[ u(x) = \alpha \cos\!\left( \sqrt{2 q}\, x \right) + \frac{\beta - \alpha \cos\!\left( \sqrt{2 q}\, l \right)}{\sin\!\left( \sqrt{2 q}\, l \right)}\, \sin\!\left( \sqrt{2 q}\, x \right). \]
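That this expression solves (1/2) u″ + q u = 0 with u(0) = α and u(l) = β can be verified numerically, for example with a central finite difference (an illustrative check, not from the book):

```python
import math

def u_exact(x, alpha=1.0, beta=2.0, q=1.0, l=1.0):
    """Exact solution quoted in the note above."""
    w = math.sqrt(2.0 * q)
    return (alpha * math.cos(w * x)
            + (beta - alpha * math.cos(w * l)) / math.sin(w * l)
            * math.sin(w * x))

# Residual of (1/2) u'' + q u at an interior point by central differences.
h, x, q = 1e-5, 0.3, 1.0
u2 = (u_exact(x + h) - 2.0 * u_exact(x) + u_exact(x - h)) / (h * h)
residual = 0.5 * u2 + q * u_exact(x)
```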
The estimator û of u is less accurate when D and/or q is large. This unsatisfactory
performance is caused by the dependence of û(x) on exp(q T), a random variable that can
have a heavy tail. To clarify this statement, consider the special case α = β = 1, q = 1,
l = 2a, and x = a, in which û(x) is equal to the expected value of exp(T). Estimates of
E[T] based on 1,000 sample paths of B generated with a time step Δt = 0.001 are 1.0097
for a = 1 and 4.0946 for a = 2. The corresponding estimates of the coefficient of variation v_T
of T are 0.7554 and 0.7663, respectively. The estimates of the coefficient of variation of
exp(T) are 1.7506 for a = 1 and 31.6228 for a = 2. The large coefficients of variation are
the cause of unstable estimates for the expectation of exp(T). In fact, E[exp(T)] may not
be bounded. For example, if T is an exponential random variable with expectation 1/λ and
λ ∈ (0, 1], then E[exp(T)] = +∞ so that it will not be possible to obtain stable estimates
of the mean of exp(T) from samples of T ([79], Section 2.2.6).
The performance of the estimator û can be improved by dividing D in sufficiently
small parts for which this estimator is accurate. Suppose that D = (0, l) is divided in m + 1
equal intervals and let a_k = u(k l/(m + 1)) be the unknown value of u at the division point
x_k = k l/(m + 1), k = 1, …, m. The relationships a_k = c′_k a_{k−1} + c″_k a_{k+1} between the
values of u at the division points can be obtained from the functional form of the estimator
û applied in the intervals ((k − 1) l/(m + 1), (k + 1) l/(m + 1)). The unknown values of u
at the division points can be obtained from these relationships and the boundary conditions
giving two additional equations, a₀ = α and a_{m+1} = β. For example, let α = 1, β = 2,
q = 1, l = 2, and m = 3. The relationships between the values of u at the division
points are a_k = (a_{k−1} + a_{k+1})/2, k = 1, 2, 3, by the symmetry of the Brownian motion.
These equations and the boundary conditions give a₁ = 1.25, a₂ = 1.50, and a₃ = 1.75.
Suppose that u(0.3) needs to be estimated. The estimate of u(0.3) can be obtained by this
approach for the domain (0, 0.5) and the boundary conditions u(0) = 1, u(0.5) = 1.25.
The error of this estimate is under 1.2% for n_s = 500 and Δt = 0.0005. ▲
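The relationships between the division-point values form a tridiagonal linear system. For the symmetric case a_k = (a_{k−1} + a_{k+1})/2 of the note, it can be solved with the Thomas algorithm; the sketch below (hypothetical function name) reproduces the worked numbers of the note.

```python
def interior_values(alpha, beta, m):
    """Solve 2 a_k - a_{k-1} - a_{k+1} = 0, k = 1..m, with a_0 = alpha
    and a_{m+1} = beta (Thomas algorithm for the tridiagonal system)."""
    rhs = [0.0] * m
    rhs[0] = alpha
    rhs[-1] += beta
    cp = [0.0] * m                 # modified superdiagonal coefficients
    dp = [0.0] * m                 # modified right-hand side
    cp[0] = -0.5
    dp[0] = rhs[0] / 2.0
    for k in range(1, m):
        denom = 2.0 + cp[k - 1]    # 2 - (-1) * cp[k-1]
        cp[k] = -1.0 / denom
        dp[k] = (rhs[k] + dp[k - 1]) / denom
    a = [0.0] * m
    a[-1] = dp[-1]
    for k in range(m - 2, -1, -1):
        a[k] = dp[k] - cp[k] * a[k + 1]
    return a

# The worked case in the note: alpha = 1, beta = 2, m = 3.
vals = interior_values(1.0, 2.0, 3)   # expect [1.25, 1.5, 1.75]
```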
Let u be the solution of Eq. 6.8 or 6.9, where q(x) = λ is a parameter and
p(x) = 0. Our objective is to find a local solution for these equations, that is, to
find the lowest eigenvalue for Eqs. 6.8 and 6.9. We use Eq. 6.9 to illustrate the
construction of the local solution.
\[ -\frac{1}{2}\, \Delta u(x) = \lambda\, u(x), \quad x \in D, \qquad u(x) = 0, \quad x \in \partial D. \quad (6.36) \]
\[ \lambda_0 = \sup\{\rho : E^x[\exp(\rho\, T)] < \infty, \text{ for all } x \in D\}, \quad (6.37) \]
Note: The trivial solution u = 0 always satisfies Eq. 6.36. A constant λ is said to be an
eigenvalue of Eq. 6.36 if there is a function u that is not zero everywhere in D and the pair
(λ, u) satisfies Eq. 6.36. If D is smooth, the operator −(1/2) Δ has a countable number of
distinct eigenvalues 0 < λ₀ < λ₁ < ⋯ such that λ_n → ∞ as n → ∞ ([43], Volume 1,
Section V.5). Also, the eigenvalues of the operator −(1/2) Δ are strictly positive, as can
be found from the equality ((1/2) Δu, u) = −λ (u, u) using integration by parts and the
fact that u is an eigenfunction, where (u, v) = ∫_D u(x) v(x) dx.
Let X̃ = (X = B, X_{d+1}) be an ℝ^{d+1}-valued diffusion process, where dX_{d+1}(t) =
dt, X(0) = x ∈ D, and X_{d+1}(0) = 0. The generator of X̃ is A = ∂/∂x_{d+1} + (1/2) Δ =
∂/∂t + (1/2) Δ, where Δ = Σ_{i=1}^d ∂²/∂x_i². Suppose that u_ρ is a solution of Eq. 6.36
for λ = ρ > 0. The Itô formula is applied to the function exp(ρ X_{d+1}(t)) u_ρ(X(t)).
If E^x[e^{ρT}] is finite, then u_ρ(x) = 0 for all x ∈ D since u_ρ(B(T)) = 0 by the boundary
condition in Eq. 6.36, B(s) ∈ D for s ∈ (0, T), and u_ρ(B(s)) satisfies Eq. 6.36 for λ = ρ
by assumption. In this case ρ cannot be an eigenvalue of Eq. 6.36 so that λ₀ must be at
least sup{ρ : E^x[e^{ρT}] < ∞, ∀x ∈ D}. It can be shown that the lowest eigenvalue of the
operator of Eq. 6.36 is given by Eq. 6.37 ([57], Section 8.8, pp. 263–270).
The use of Eq. 6.37 is impractical because it requires us to estimate the expectation E^x[exp(ρ T)] for many values of x ∈ D and values of ρ. A more efficient Monte
Carlo method for estimating the lowest eigenvalue of integral and differential operators is
discussed in Section 6.6.2. ▲
where B is a Brownian motion and T = inf{t ≥ 0 : B(t) ∉ D} is the first time B
starting at x ∈ D exits D. We cannot calculate the value of u(x) from the above
equation since B can exit D through both its ends so that u(B(T)) is either u(0) or
u(1), and u(0) is not known. Hence, the expectation E^x[u(B(T))], and therefore
the value u(x) of u, cannot be calculated. ◊
We will see that mixed boundary value problems can be solved locally by
using Brownian motion and diffusion processes, but these processes have to be
reflected toward the interior of D when they reach ∂D at a point where Neumann
conditions are prescribed. Also, we will have to extend the classical Itô formula
to handle functions of reflected Brownian motion and diffusion processes.
Example 6.17: The expectation of the local time process L exists and is equal to
E[L(t)] = √(2 t/π) for all t ≥ 0. ◊
Proof: The integral JJ1(-e,e) (B(s)) ds in the definition of the local time process is bounded
by t for any e > 0 so that
and ( 4 e ..(iI .f27r + o (e)) I (2 e) converges to ,J2 t I rc as e ,(, 0 so that the expectation of
L(t) is as stated.
Tanaka's formula given later in this section (Eq. 6.39) provides an alternative way for calculating E[L(t)]. According to this formula we have |B(t)| = β(t) + L(t), where β is a Brownian motion starting at |B(0)|. For B(0) = 0 we have E[L(t)] = E[|B(t)|] since E[β(t)] = 0, which gives the stated expectation for L(t) since B(t) ~ N(0, t). ■
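The expectation E[L(t)] can also be checked by simulating the occupation-time approximation (1/(2ε)) ∫_0^t 1_{(−ε,ε)}(B(s)) ds of the local time. A minimal sketch (an illustration of mine, with arbitrary discretization parameters):

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, dt, eps = 2000, 1000, 1e-3, 0.05
t = n_steps * dt                          # t = 1

# Brownian paths B(s) on a grid, B(0) = 0
dB = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
B = np.cumsum(dB, axis=1)

# Occupation-time approximation of the local time at zero:
# L(t) ~ (1/(2 eps)) * int_0^t 1_{(-eps,eps)}(B(s)) ds
L = (np.abs(B) < eps).sum(axis=1) * dt / (2 * eps)

print(L.mean(), np.sqrt(2 * t / np.pi))   # sample mean vs sqrt(2 t / pi) ~ 0.798
```

The sample mean approaches √(2t/π) as ε and the time step decrease and the number of paths grows.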
Proof: The mean rate at which X crosses zero is σ̇/π, where σ̇² = E[Ẋ(t)²] (Section 3.10.1). Let N(t) be the number of zero-crossings of X during a time interval [0, t] and let D_i be the largest time interval including the time T_i of zero-crossing i of X such that X(s) ∈ (−ε, ε) for all s ∈ D_i. The approximate length of D_i is 2ε/|Ẋ(T_i)| and has the expectation 2ε √(π/2)/σ̇ since the density of |Ẋ(T_i)| at the time of a zero-crossing of X is f(z) = (z/σ̇²) exp(−z²/(2σ̇²)), z ≥ 0.

The expected value of the total residence time D(t) = Σ_{i=1}^{N(t)} D_i of X in (−ε, ε) during a time interval (0, t) is

E[D(t)] = E[E[Σ_{i=1}^{N(t)} D_i | N(t)]] = E[N(t)] E[2ε/|Ẋ(T_i)|] = (σ̇ t/π)(2ε √(π/2)/σ̇) = 2ε t/√(2π),

so that E[D(t)]/(2ε) = t/√(2π). Note that the expectation of the local time is a fixed fraction of t for stationary mean square differentiable Gaussian processes. ■
|B(t)| − |B(0)| = ∫_0^t sign(B(s)) dB(s) + lim_{ε↓0} (1/(2ε)) ∫_0^t 1_{(−ε,ε)}(B(s)) ds, that is, |B(t)| = β(t) + L(t) a.s.,   (6.39)
Note: The process β(t) = |B(0)| + ∫_0^t sign(B(s)) dB(s) is a square integrable martingale with continuous samples and quadratic variation t (Section 4.5.3), where sign(x) is −1, 0, and 1 for x < 0, x = 0, and x > 0, respectively. Hence, β is a Brownian motion (Section 4.6.2). The second term on the right side of Eq. 6.39 is the local time process L(t) in Eq. 6.38. Figure 6.6 shows a few sample paths of B and the corresponding samples of the Brownian motion process β(t) = ∫_0^t sign(B(s)) dB(s) for B(0) = 0.
The Tanaka formula also shows that |B| is a semimartingale since β is a square integrable martingale and L is a continuous increasing process that is adapted to the natural filtration of B (Section 4.4.1). ♦
Let h_ε be a real-valued function defined by h_ε(x) = |x| for |x| > ε and h_ε(x) = ε/2 + x²/(2ε) for |x| ≤ ε, where ε > 0 is arbitrary (Fig. 6.7).
Figure 6.6. Five sample paths of a Brownian motion B and the corresponding samples of β for B(0) = 0
Note: If g is the identity function, then the process L becomes the local time process L defined by Eq. 6.38. ♦
g(|B(t)|) − g(|B(0)|) = ∫_0^t g'(|B(s)|) dβ(s) + (1/2) ∫_0^t g''(|B(s)|) ds + g'(0) L(t)   (6.41)
Proof: The classical Ito formula cannot be applied to g(X) = g(|B|) since the mapping B ↦ g(|B|) is not differentiable. Formal calculations using Ito's formula yield

g(|B(t)|) − g(|B(0)|) = ∫_0^t g'(|B(s)|) sign(B(s)) dB(s) + (1/2) ∫_0^t g''(|B(s)|) ds + ∫_0^t g'(|B(s)|) δ(B(s)) ds

since the first two derivatives of x ↦ |x| are sign(x) and 2 δ(x) in a formal sense, where δ denotes the Dirac delta function. The result can be obtained rigorously from the Tanaka formula: since |B(t)| = β(t) + L(t) by Eq. 6.39 and L increases only on the time set {s : B(s) = 0}, we have

g(|B(t)|) − g(|B(0)|) = ∫_0^t g'(|B(s)|) d(β + L)(s) + (1/2) ∫_0^t g''(|B(s)|) ds
= ∫_0^t g'(|B(s)|) dβ(s) + (1/2) ∫_0^t g''(|B(s)|) ds + ∫_0^t g'(|B(s)|) dL(s)
= ∫_0^t g'(|B(s)|) dβ(s) + (1/2) ∫_0^t g''(|B(s)|) ds + g'(0) L(t),

that is, the result in Eq. 6.41.
Direct calculations: The proof of Eq. 6.41 involves three steps. First, a sequence of approximations h_{n,ε}(x) converging to |x| is constructed, where h_{n,ε} ∈ C²(ℝ) for each integer n ≥ 1 and ε > 0. Second, the Ito formula is applied to the sequence of approximations g(h_{n,ε}(B)) of g(|B|). Third, it is shown that the resulting sequence of equations giving the increment of g(h_{n,ε}(B)) in a time interval [0, t] converges to Eq. 6.41 as n → ∞ and ε ↓ 0. Similar steps are used in [42] (Chapter 7) to prove Tanaka's formula.

1. Let h_ε be the function in Fig. 6.7 defined for some ε > 0. The first derivative of h_ε is h'_ε(x) = −1, x/ε, and 1 for x ≤ −ε, x ∈ [−ε, ε], and x ≥ ε, respectively. The function h_ε is not twice differentiable at x = ±ε, and we set h''_ε to zero at x = ±ε. With this definition the second derivative of h_ε is h''_ε(x) = (1/ε) 1_{(−ε,ε)}(x).
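The properties of h_ε just listed are easy to check numerically. A small sketch (an illustration of mine): h_ε is within ε/2 of |x|, and its first derivative matches the piecewise formula above.

```python
import numpy as np

def h(x, eps):
    """C^1 approximation of |x|: quadratic cap eps/2 + x^2/(2 eps) on (-eps, eps)."""
    return np.where(np.abs(x) > eps, np.abs(x), eps / 2 + x**2 / (2 * eps))

def h1(x, eps):
    """First derivative: -1, x/eps, and 1 on the three pieces."""
    return np.clip(x / eps, -1.0, 1.0)

def h2(x, eps):
    """Second derivative: (1/eps) 1_{(-eps,eps)}(x), set to zero at x = +-eps."""
    return np.where(np.abs(x) < eps, 1.0 / eps, 0.0)

eps = 0.1
x = np.linspace(-1.0, 1.0, 20001)
print(np.max(h(x, eps) - np.abs(x)))   # sup |h_eps - |x|| = eps/2, attained at x = 0
```

The uniform bound sup_x |h_ε(x) − |x|| = ε/2 is what drives the convergence h_ε → |x| as ε ↓ 0.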
Consider the sequence of functions

h_{n,ε}(x) = ∫_{−∞}^{∞} h_ε(x − y) φ_n(y) dy,

where the functions φ_n ∈ C^∞(ℝ) have compact support shrinking to {0} as n → ∞, for example, φ_n(y) = n φ(n y) and φ(y) = c exp(−(1 − y²)^{−1}) for |y| < 1 and φ(y) = 0 for |y| ≥ 1, where c > 0 is a constant such that ∫_{−1}^{1} φ(y) dy = 1. The functions h_{n,ε} are infinitely differentiable and are such that (1) h_{n,ε} → h_ε, h'_{n,ε} → h'_ε uniformly in ℝ as n → ∞ and (2) h''_{n,ε} → h''_ε pointwise as n → ∞ except at x = ±ε.
Consider first the functions h_{n,ε}. Since

h_{n,ε}(x) − h_ε(x) = ∫_{−∞}^{∞} (h_ε(x − y) − h_ε(x)) φ_n(y) dy,

it is sufficient to show that for η > 0 arbitrary there is n(η) such that ∫_{−∞}^{∞} |h_ε(x − y) − h_ε(x)| φ_n(y) dy < η for all n > n(η) and for all x ∈ ℝ. The function h_ε is uniformly continuous, so that there is a δ > 0 for which |h_ε(u) − h_ε(v)| < η if |u − v| < δ. Divide the domain of integration of ∫_{−∞}^{∞} |h_ε(x − y) − h_ε(x)| φ_n(y) dy in two intervals, (−δ, δ) and (−δ, δ)^c. The integral on (−δ, δ) is smaller than η for all n. The integral on (−δ, δ)^c is equal to zero for n large enough, so that there is n(η) such that |h_{n,ε}(x) − h_ε(x)| is smaller than η for all n > n(η). The same arguments can be used to prove the uniform convergence of h'_{n,ε} to h'_ε. That h''_{n,ε} converges to h''_ε pointwise for x ≠ ±ε results by similar considerations. At x = ε, the difference h''_{n,ε}(ε) − h''_ε(ε) is ∫_{−∞}^{∞} (h''_ε(ε − y) − h''_ε(ε)) φ_n(y) dy = (1/ε) ∫_0^{2ε} φ_n(y) dy and approaches 1/(2ε) ≠ 0 as n → ∞. Hence, h''_{n,ε} does not converge to h''_ε at x = ε and at x = −ε.
2. The Ito formula applied to B(t) ↦ (g ∘ h_{n,ε})(B(t)) = g(h_{n,ε}(B(t))) gives Eq. 6.42.

Consider first the left side of Eq. 6.42. By the definitions and the properties of h_{n,ε} and h_ε, the sequence of functions g ∘ h_{n,ε} converges to g ∘ h_ε uniformly in ℝ as n → ∞, so that (g ∘ h_{n,ε})(B) converges to (g ∘ h_ε)(B) in L² and a.s. Hence, the left side of Eq. 6.42 converges to the left side of Eq. 6.43 in L² and a.s. for each t ≥ 0 as n → ∞.
Consider the first integral on the right side of Eq. 6.42. For each t ≥ 0 the sequence of functions 1_{[0,t]} (g' ∘ h_{n,ε})(h'_{n,ε})(B) converges to 1_{[0,t]} (g' ∘ h_ε)(h'_ε)(B) uniformly on [0, ∞) × Ω as n → ∞. This observation and the equality (Section 4.5.3) imply the L² convergence of ∫_0^t g'(h_{n,ε}(B(s))) h'_{n,ε}(B(s)) dB(s) to the stochastic integral ∫_0^t g'(h_ε(B(s))) h'_ε(B(s)) dB(s) as n → ∞. The a.s. convergence of this sequence of integrals results by bounded convergence.
Consider the second integral on the right side of Eq. 6.42. The sequence of functions (g'' ∘ h_{n,ε})(h'_{n,ε})² converges to (g'' ∘ h_ε)(h'_ε)² uniformly in ℝ as n → ∞, so that (g'' ∘ h_{n,ε})(h'_{n,ε})²(B) converges to (g'' ∘ h_ε)(h'_ε)²(B) in L². Because the functions (g'' ∘ h_{n,ε})(h'_{n,ε})²(B) are bounded a.s. on [0, t], ∫_0^t g''(h_{n,ε}(B(s))) (h'_{n,ε}(B(s)))² ds converges to ∫_0^t g''(h_ε(B(s))) (h'_ε(B(s)))² ds in L² and a.s. as n → ∞ by bounded convergence.
Consider now the last integral in Eq. 6.42. The integrand g'(h_{n,ε}(B(s))) h''_{n,ε}(B(s)) converges to g'(h_ε(B(s))) h''_ε(B(s)) as n → ∞ for ω ∈ Ω \ N(s) and a fixed s ≥ 0, where N(s) = {ω : B(s, ω) = ±ε} ⊂ Ω and P(N(s)) = 0. Hence,

lim_{n→∞} g'(h_{n,ε}(B(s))) h''_{n,ε}(B(s)) = g'(h_ε(B(s))) h''_ε(B(s))

holds a.s. for each fixed s ≥ 0. Let N = {(s, ω) ∈ [0, t] × Ω : B(s, ω) = ±ε} be the subset of [0, t] × Ω where the preceding equality does not hold. The set N is λ × P-measurable, where λ denotes the Lebesgue measure. This set has measure zero since ∫_N (λ × P)(ds, dω) = ∫_{[0,t]} λ(ds) ∫_{N(s)} P(dω) (Fubini's theorem) and ∫_{N(s)} P(dω) = 0 for each s ∈ [0, t]. Therefore, the above equality is true for λ-almost all s ≥ 0, almost surely. Hence, the third integral on the right side of Eq. 6.42 converges to ∫_0^t g'(h_ε(B(s))) h''_ε(B(s)) ds in L² and a.s. for each t ≥ 0 as n → ∞ by bounded convergence (Section 2.5.2.2). This completes the proof of Eq. 6.43.
The expectation of the square of the difference

I(t) = ∫_0^t g'(h_ε(B(s))) h'_ε(B(s)) dB(s) − ∫_0^t g'(|B(s)|) sign(B(s)) dB(s)

of the first integrals on the right sides of Eqs. 6.43 and 6.44 is (Section 4.5.3)

E[I(t)²] = E[∫_0^t (g'(h_ε(B(s))) h'_ε(B(s)) − g'(|B(s)|) sign(B(s)))² ds],

and this bound converges to zero as ε ↓ 0 since (h'_ε(x))² ≤ 1, |h'_ε(x) − sign(x)| ≤ 1_{(−ε,ε)}(x), and g' is continuous.
The expectation of the square of the difference of the second integrals on the right sides of Eqs. 6.43 and 6.44 also converges to zero as ε ↓ 0. Indeed, the integrands differ only when B(s) ∈ (−ε, ε), and there

|g''(h_ε(B(s))) (h'_ε(B(s)))² − g''(|B(s)|)| ≤ c_1 + c_2,

where the constants c_1, c_2 > 0 exist since the functions g'' and h_ε are continuous and the Brownian motion is restricted to the interval (−ε, ε). Hence, the expectation of the square of this difference is bounded by a constant multiple of

E[(∫_0^t 1_{(−ε,ε)}(B(s)) ds)²] ≤ t E[∫_0^t 1_{(−ε,ε)}(B(s)) ds] → 0 as ε ↓ 0,

proving the convergence of ∫_0^t g''(h_ε(B(s))) (h'_ε(B(s)))² ds to ∫_0^t g''(|B(s)|) ds in L² as
ε ↓ 0. Hence, the limit in Eq. 6.44 must also exist in L² and the equality in Eq. 6.41 holds a.s. for each fixed t ≥ 0. That lim_{ε↓0} (1/(2ε)) ∫_0^t g'(h_ε(B(s))) 1_{(−ε,ε)}(B(s)) ds is equal to g'(0) L(t) follows from the observations that ∫_0^t 1_{(−ε,ε)}(B(s)) ds converges to zero as ε ↓ 0 and

|lim_{ε↓0} (1/(2ε)) ∫_0^t (g'(h_ε(B(s))) − g'(0)) 1_{(−ε,ε)}(B(s)) ds| ≤ lim_{ε↓0} (1/(2ε)) ∫_0^t |g'(h_ε(B(s))) − g'(0)| 1_{(−ε,ε)}(B(s)) ds ≤ lim_{ε↓0} (c ε/(2ε)) ∫_0^t 1_{(−ε,ε)}(B(s)) ds = 0,

where c > 0 and ε bound |g''(x)| and h_ε(x), respectively, for x ∈ (−ε, ε). ■
Note: This formula can be used to find the local solution for mixed boundary value problems. For example, if g is the solution of a partial differential equation defined in (0, l) and g' is prescribed at x = 0, the last term in the above equation captures the contribution of this boundary condition to the local solution. ♦
The term g'(0) L(t, ω) can be approximated by

g'(0) L(t, ω) ≈ (g'(0)/(2ε)) ∫_0^t 1_{(−ε,ε)}(B(s, ω)) ds

for a small ε > 0, that is, by the time B spends in the range (−ε, ε) during the time interval (0, t) scaled by g'(0)/(2ε). This approximation has been used in the following two examples to find local solutions by Monte Carlo simulation.
This equation specialized for the above boundary conditions gives the following
local solutions.
(i) The local solution cannot be applied directly since Ex[u(X(T))] = u_max is not known. The boundary condition u(0) = 0 and the local solution at x = 0 yield u_max = (a/2) E^0[∫_0^T (1 − X(s))² ds]. Once u_max has been calculated, we can apply the local solution to find the value of u at an arbitrary x ∈ (0, 1). The exact value of the tip displacement is u_max = 0.25 a. The estimated value of u_max based on n_s = 1,000 samples of X generated with a time step Δt = 0.001 is in error by 0.24%. Note that the value of u_max can also be used to solve locally the Dirichlet boundary value problem for u(0) = 0 and u(1) = u_max.
for g = u, where u'(x) = du(x)/dx and u''(x) = d²u(x)/dx². Taking the expectation of the above equation over the time interval (0, T) and using the fact that u is the solution of u''(X(s)) = −m(X(s))/κ for s < T yields the stated local solution [82]. ■
Example 6.20: Let u be the solution of (1/2) u''(x) + q u(x) = 0, x ∈ D = (0, l), with the Neumann and Dirichlet boundary conditions u'(0) = α and u(l) = β, where α, β are some constants. The local solution of this Schrödinger equation involves the process X_2(s) = e^{q s}: by considering the time interval [0, T], using the definition of X_2, and averaging, one obtains an equation containing the terms

+ q Ex[∫_0^T u(|B(s)|) X_2(s) ds] + (1/2) Ex[∫_0^T u''(|B(s)|) X_2(s) ds].

Since |B(s)| is in D for s < T, we have (1/2) u''(|B(s)|) + q u(|B(s)|) = 0, so that the sum of these two terms is zero. The resulting formula yields the stated expression of u(x) since u(|B(T)|) = β and X_2(s) = e^{q s}.

The exact solution of the Schrödinger equation is

u(x) = (α/√(2q)) sin(√(2q) x) + [β − (α/√(2q)) sin(√(2q) l)] cos(√(2q) x)/cos(√(2q) l)

for q > 0 (assuming cos(√(2q) l) ≠ 0). This solution has been used to calculate the error of the local solution. ■
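The exact solution can be checked numerically. A short sketch (an illustration of mine, not the book's code) builds u from the general solution A cos(√(2q) x) + B sin(√(2q) x), fits A and B to the two boundary conditions, and verifies the ODE and both conditions by finite differences; the values q = l = 1, α = 2, β = 3 are arbitrary choices:

```python
import math

q, l, alpha, beta = 1.0, 1.0, 2.0, 3.0
w = math.sqrt(2 * q)                        # with u'' = -w^2 u, (1/2) u'' + q u = 0

B = alpha / w                               # from u'(0) = B * w = alpha
A = (beta - B * math.sin(w * l)) / math.cos(w * l)  # from u(l) = beta (cos(w l) != 0)

def u(x):
    return A * math.cos(w * x) + B * math.sin(w * x)

# Verify the ODE residual and both boundary conditions by finite differences
h = 1e-5
x0 = 0.4
ode_resid = 0.5 * (u(x0 + h) - 2 * u(x0) + u(x0 - h)) / h**2 + q * u(x0)
up0 = (u(h) - u(-h)) / (2 * h)              # numerical u'(0)
print(ode_resid, up0, u(l))                 # ~0, alpha, beta
```

The same closed form serves as the reference when measuring the error of the Monte Carlo local solution.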
Proof: Let h_{n,ε} and h_ε be the functions used to prove Eq. 6.41, where ε > 0 and n ≥ 1 is an integer. The classical Ito formula applied to the function h_{n,ε}(X) gives

h_{n,ε}(X(t)) − h_{n,ε}(X(0)) = ∫_0^t h'_{n,ε}(X(s)) dX(s) + (1/2) ∫_0^t h''_{n,ε}(X(s)) b(X(s))² ds.

The limit of this equation as n → ∞ and ε ↓ 0 yields the stated representation of |X(t)| − |X(0)| based on arguments similar to those used to prove Eq. 6.41. The second integral on the right side of the above equation is equal to b(0)² L^{(X)}(t) since it converges to (1/2) ∫_0^t h''_ε(X(s)) b(X(s))² ds = (1/(2ε)) ∫_0^t b(X(s))² 1_{(−ε,ε)}(X(s)) ds as n → ∞. The proof can be completed by considerations of the type used to prove Eq. 6.44. ■
Let a and b > a be some constants and let r be a periodic function with period 2(b − a) defined by r(x) = |x − a| for |x − a| ≤ b − a. Figure 6.8 shows this function in [−3, 3] for a = 0 and b = 1. Let B be a Brownian motion.
The process r(B) has the range [0, b − a] and is referred to as Brownian motion reflected at two thresholds. Figure 6.9 shows a sample of a Brownian motion B and the corresponding sample of r(B) for a = 0 and b = 1.

Let x_k(a) = a + 2k(b − a) and x_k(b) = b + 2k(b − a), k ∈ ℤ, be the collection of points on the real line where r(x) is equal to 0 and b − a, respectively. For ε ∈ (0, (b − a)/2) define the intervals I_{k,ε}(a) = {x : |x − x_k(a)| ≤ ε} and I_{k,ε}(b) = {x : |x − x_k(b)| ≤ ε} centered on x_k(a) and x_k(b), respectively, and set I(a) = ∪_{k=−∞}^{∞} (x_k(a), x_k(b)) and I(b) = ∪_{k=−∞}^{∞} (x_k(b), x_{k+1}(a)).
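The reflection map r can be implemented as a triangle wave of period 2(b − a). A sketch (an illustration of mine) that also produces a sample of the reflected Brownian motion r(B):

```python
import numpy as np

def r(x, a=0.0, b=1.0):
    """Reflection map: period 2(b-a), equal to |x-a| for |x-a| <= b-a."""
    p = 2.0 * (b - a)
    y = np.mod(x - a, p)               # fold onto [0, 2(b-a))
    return np.where(y <= b - a, y, p - y)

x = np.linspace(-3.0, 3.0, 601)
print(r(0.0), r(1.0), r(2.0))          # 0, 1, 0: zeros at x_k(a), maxima at x_k(b)

# Brownian motion reflected at the two thresholds a = 0 and b = 1
rng = np.random.default_rng(2)
B = np.cumsum(np.sqrt(1e-3) * rng.standard_normal(5000))
rB = r(B)                              # sample of r(B), with range [0, 1]
```

The sample rB stays in [0, b − a] by construction, which is the behavior shown in Fig. 6.9.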
Figure 6.9. A sample of B and the corresponding sample of r(B) for a = 0 and b = 1
+ Σ_{k=−∞}^{∞} [g'(0) L(t; x_k(a)) − g'(b − a) L(t; x_k(b))] + (1/2) ∫_0^t g''(r(B(s))) ds,   (6.46)
Proof: The Tanaka formulas in Eqs. 6.39 and 6.41 cannot be used to prove Eq. 6.46 since they deal with Brownian motion processes reflected at a single threshold. Therefore, we have to use direct calculations to establish this second extension of Tanaka's formula.

We only prove that r(B) satisfies the condition in Eq. 6.47, that is, the special case of Eq. 6.46 for g equal to the identity function. The proof involves three steps (Eq. 6.41). First, a sequence of smooth approximations h_{n,ε}(B) converging to r(B) is developed. Second, the Ito formula is applied to the function h_{n,ε}(B). Third, the limit of the resulting formula is calculated for n → ∞ and ε ↓ 0.
1. Let ε > 0 and define

h_ε(x) = r(x) for x ∉ (∪_k I_{k,ε}(a)) ∪ (∪_l I_{l,ε}(b)),
h_ε(x) = ε/2 + (x − x_k(a))²/(2ε) for x ∈ I_{k,ε}(a), k ∈ ℤ,
h_ε(x) = b − a − ε/2 − (x − x_k(b))²/(2ε) for x ∈ I_{k,ε}(b), k ∈ ℤ.
Let h_{n,ε}(x) = ∫_ℝ h_ε(x − y) φ_n(y) dy be a sequence of approximations of h_ε, where the functions φ_n ∈ C^∞(ℝ) have compact support shrinking to {0} as n → ∞. The functions h_{n,ε} are infinitely differentiable for each n and ε > 0. Note that the sequences of functions h_{n,ε} used in the proofs of Eqs. 6.41 and 6.46-6.47 share the same notation but have different definitions.

For every ε > 0, the functions h_{n,ε} and h'_{n,ε} converge uniformly to h_ε and h'_ε in ℝ as n → ∞, respectively, and h''_{n,ε} converges pointwise to h''_ε in ℝ as n → ∞ except at x = x_k(a) ± ε and x = x_k(b) ± ε, k ∈ ℤ. The function h''_ε is not defined at x = x_k(a) ± ε, x = x_k(b) ± ε and is set to zero at these points (Section 6.2.3.1).
2. The Ito formula applied to the function h_{n,ε}(B(t)) yields an increment that converges to

h_ε(B(t)) − h_ε(B(0)) = ∫_0^t h'_ε(B(s)) dB(s) + (1/(2ε)) Σ_{k=−∞}^{∞} ∫_0^t (1_{I_{k,ε}(a)} − 1_{I_{k,ε}(b)})(B(s)) ds

as n → ∞, since h''_ε(x) = (1/ε) Σ_{k=−∞}^{∞} (1_{I_{k,ε}(a)} − 1_{I_{k,ε}(b)})(x).
The last equality in the above equation holds since the sequence

(1/(2ε)) ∫_0^t Σ_{k=−m}^{m} 1_{I_{k,ε}(a)}(B(s)) ds = Σ_{k=−m}^{m} (1/(2ε)) ∫_0^t 1_{I_{k,ε}(a)}(B(s)) ds

is positive, increasing, and bounded by t/(2ε) for each integer m ≥ 1. The L² limit of the above equation as ε ↓ 0 yields Eq. 6.47 for each t ≥ 0. Similar arguments can be used to establish Eq. 6.46.
The Tanaka formula in Eq. 6.39 is a special case of Eq. 6.47 for a = 0 and b → ∞. Under these conditions (1) the subsets I(a) and I(b) of ℝ approach (0, ∞) and (−∞, 0), respectively, so that 1_{I(a)}(x) − 1_{I(b)}(x) = sign(x), (2) the summation on the right side of Eq. 6.47 degenerates into a single term corresponding to I_{0,ε}(a) = (−ε, ε), and (3) the function r(x) becomes |x|. ■
with the boundary conditions u(x_1, 0) = u_0 for x_1 ∈ (0, ξ a), 0 < ξ < 1, u(x_1, 0) = 0 for x_1 ∈ (ξ a, a), u(x_1, b) = ξ u_0, and ∂u(x)/∂x_1 = 0 at x_1 = 0, x_1 = a.
The local solution is u(x) = Ex[u(X̃(T))], where X̃ = (r(X_1), X_2) and X = (X_1, X_2) is a diffusion process defined by the stochastic differential equation
The Ito formula then gives

u(h_ε(X_1(t)), X_2(t)) − u(h_ε(X_1(0)), X_2(0))
= ∫_0^t [−q (∂u/∂ξ_1) h'_ε + d_1 ((∂²u/∂ξ_1²)(h'_ε)² + (∂u/∂ξ_1) h''_ε) + d_2 (∂²u/∂ξ_2²)] ds
+ ∫_0^t (√(2 d_1) (∂u/∂ξ_1) h'_ε dB_1 + √(2 d_2) (∂u/∂ξ_2) dB_2),

where the arguments of the functions u and h_ε are (h_ε(X_1(s)), X_2(s)) and X_1(s), respectively.
The limit of the above equation as ε ↓ 0 provides a further extension of the Tanaka formula, which can be used to develop local solutions for transport equations. This approach has not been used here. The local solution and the numerical results reported in these examples have been obtained from the expectation of the above equation with ε in the range (0, 0.1).
The exact steady-state concentration distribution in (0, a) × (0, ∞) is given by a Fourier series ([194], Example Problem 3.2, p. 68) with exponents j_n = [(q/d_2)² + (2nπ d_1/(a d_2))²]^{1/2}. The exact solution shows that the concentration approaches a constant value ξ u_0 as x_2 → ∞. This solution is used to assess the accuracy of the local solution calculated in D = (0, a) × (0, b) with the additional boundary condition u(x_1, b) = ξ u_0. ■
Proof: Let (B_1, B_2) be two independent Brownian motions. The ℝ²-valued process X = (|B_1|, |B_2|) lives in the first orthant of the plane and its reflections are orthogonal to the boundaries of D. For example, if X is on the boundary x_1 = 0, then L_1 has a positive increment causing an increase of X_1 in the positive direction of coordinate x_1.

The Ito formula can be applied to g(X) in conjunction with Tanaka's formula providing a representation for |B_i|, and gives Eq. 6.49. Direct calculations based on arguments similar to those used to derive Eqs. 6.41 and 6.46 give

g(X(t)) − g(X(0)) = Σ_{i=1}^{2} ∫_0^t (∂g(X(s))/∂x_i) sign(B_i(s)) dB_i(s) + (1/2) ∫_0^t Δg(X(s)) ds + Σ_{i=1}^{2} ∫_0^t (∂g(X(s))/∂x_i) dL_i(s).
Case 2: Let d_ij, i, j = 1, 2, be a constant matrix with d_11, d_22 > 0. Then the process

X(t) = (d_11 |B_1(t)| + d_12 |B_2(t)|, d_21 |B_1(t)| + d_22 |B_2(t)|)

lives in the first orthant D of ℝ², where B_1 and B_2 are independent Brownian motion processes and |B_i| = β_i + L_i, i = 1, 2 (Eq. 6.39).

The process X takes values in the first orthant of ℝ² because d_11 and d_22 are strictly positive. For example, suppose that X(t) = (0, x_2), x_2 > 0, at a time t. The increment dL_1(t) of the local time L_1 causes a change of X from (0, x_2) to (d_11 dL_1(t), x_2 + d_21 dL_1(t)) ∈ D. Similar considerations apply for the other coordinate. Additional information and examples can be found in [42] (Chapter 8). ■
g(X(t)) − g(X(0)) = Σ_{i=1}^{2} ∫_0^t g,_i(X(s)) sign(B_i*(s)) dB_i*(s)
+ (1/2) ∫_0^t [c_11 g,_{11}(X(s)) + c_22 g,_{22}(X(s)) + 2 c_12 g,_{12}(X(s)) sign(B_1*(s)) sign(B_2*(s))] ds
+ Σ_{i=1}^{2} c_ii lim_{ε↓0} (1/(2ε)) ∫_0^t g,_i(X(s)) 1_{(−ε,ε)}(B_i*(s)) ds.   (6.51)
Note: The processes |B_i*| can be approximated by the sequences of processes h_{n_i,ε}(B_i*), n_i = 1, 2, ..., for i = 1, 2 and ε > 0. The Ito formula applied to the mapping (B_1*, B_2*) ↦ g(h_{n_1,ε}(B_1*), h_{n_2,ε}(B_2*)) gives

g(h_{n_1,ε}(B_1*(t)), h_{n_2,ε}(B_2*(t))) − g(h_{n_1,ε}(B_1*(0)), h_{n_2,ε}(B_2*(0)))
= Σ_{i=1}^{2} ∫_0^t g,_i h'_{n_i,ε} dB_i*(s) + (1/2) Σ_{i=1}^{2} ∫_0^t c_ii (g,_{ii} (h'_{n_i,ε})² + g,_i h''_{n_i,ε}) ds + ∫_0^t c_12 g,_{12} h'_{n_1,ε} h'_{n_2,ε} ds,
∂²u/∂x_1² + 3 ∂²u/∂x_2² = −16, x ∈ D = (0, 1) × (0, 1),
Ex[u(X(T))] − u(x) = Σ_{i=1}^{2} c_i Ex[∫_0^T (∂u(X(s))/∂x_i)(dB_i(s) + dL_i(s))]
+ (1/2) Ex[∫_0^T (∂²u(X(s))/∂x_1² + 3 ∂²u(X(s))/∂x_2²) ds],

where c_1 = 1 and c_2 = √3. If u is a solution, the above equation gives the stated result because (1) the partial derivatives ∂u/∂x_i are zero on the Neumann boundaries and the stochastic integrals ∫ [∂u/∂x_i] dB_i are martingales starting at zero, so that the first two expectations on the right side of the above equation are zero, and (2) ∂²u(X(s))/∂x_1² + 3 ∂²u(X(s))/∂x_2² = −16 for s ∈ (0, T). ■
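For the purely Dirichlet analogue of this example (u = 0 on all of ∂D), the relation above reduces to u(x) = 8 Ex[T] with X = (B_1, √3 B_2). A sketch (an illustration of mine; grid size, time step, and sample size are arbitrary choices) compares the Monte Carlo value at the center of D with a finite-difference solution:

```python
import numpy as np

rng = np.random.default_rng(3)

def mc_u(x, n_paths=3000, dt=2e-4):
    """u(x) = 8 E_x[T] for u_11 + 3 u_22 = -16 with u = 0 on the unit square."""
    pos = np.tile(np.asarray(x, dtype=float), (n_paths, 1))
    T = np.zeros(n_paths)
    alive = np.ones(n_paths, dtype=bool)
    scale = np.sqrt(dt) * np.array([1.0, np.sqrt(3.0)])   # X = (B1, sqrt(3) B2)
    while alive.any():
        pos[alive] += scale * rng.standard_normal((alive.sum(), 2))
        T[alive] += dt
        alive &= (pos > 0.0).all(axis=1) & (pos < 1.0).all(axis=1)
    return 8.0 * T.mean()

def fd_u(n=60, n_iter=8000):
    """Finite-difference (Jacobi) solution of u_11 + 3 u_22 = -16, value at center."""
    h = 1.0 / n
    u = np.zeros((n + 1, n + 1))
    for _ in range(n_iter):
        u[1:-1, 1:-1] = ((u[2:, 1:-1] + u[:-2, 1:-1])          # u_11 stencil
                         + 3.0 * (u[1:-1, 2:] + u[1:-1, :-2])  # 3 u_22 stencil
                         + 16.0 * h**2) / 8.0
    return u[n // 2, n // 2]

mc, fd = mc_u([0.5, 0.5]), fd_u()
print(mc, fd)    # the two estimates of u(0.5, 0.5) agree to a few percent
```

The random walk solves the equation only at the chosen point x, while the finite-difference scheme solves it everywhere; this is exactly the local/global distinction the section is about.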
However, the approach cannot be extended to solve locally general mixed boundary value problems characterized by less simple Neumann boundary conditions and arbitrary subsets D of ℝ^d.
Method 2: Consider the time-invariant version of Eq. 6.1 defined on an open bounded subset D of ℝ^d with the mixed boundary conditions u(x) = ξ_d(x), x ∈ ∂D_d, and ∇u(x) · c(x) = ξ_n(x), x ∈ ∂D_n, where ξ_d and ξ_n are prescribed functions, ∂D_d and ∂D_n are subsets of ∂D such that ∂D = ∂D_d ∪ ∂D_n and ∂D_d ∩ ∂D_n = ∅, ∇ = (∂/∂x_1, ..., ∂/∂x_d), and c denotes a specified ℝ^d-valued function. Our objective is to develop a heuristic algorithm for calculating the local solution of this partial differential equation.
Let x ∈ D be a point at which we would like to calculate the local solution and let X be an ℝ^d-valued diffusion process whose generator coincides with the differential operator of Eq. 6.1. The process X is started at x ∈ D if we seek the solution of Eq. 6.1 at this point. The sample paths of X can be divided into two classes:
1. Sample paths that reach ∂D for the first time at y ∈ ∂D_d. The contribution of these sample paths of X to the local solution is given by results for Dirichlet boundary value problems since the classical Ito formula is valid in D and u(y) is known.

2. Sample paths that reach ∂D for the first time at y ∈ ∂D_n. The solution for Dirichlet boundary value problems cannot be applied since u is not known on ∂D_n. The contribution of these samples of X to the local solution can be calculated by the following heuristic algorithm:

- Apply the classical Ito formula between x and y ∈ ∂D_n to derive a relationship between the values of u at x ∈ D and y ∈ ∂D_n.

- Restart X at x' ∈ D and find the point where the restarted process reaches ∂D for the first time. If this point is on ∂D_d the sample of X is stopped because u is known on ∂D_d. Otherwise, the previous step is repeated until the restarted process reaches ∂D_d.

The above algorithm, giving the contribution of the Neumann boundaries to the local solution, is illustrated in detail by the following example. In this algorithm the shift ||y − x|| from y ∈ ∂D to x ∈ D must be small but remains unspecified.
Example 6.24: Consider the local solution of the partial differential equation

∂²u/∂x_1² + 3 ∂²u/∂x_2² = −16, x = (x_1, x_2) ∈ D = (0, 1) × (0, 1),

with the mixed boundary conditions ∂u/∂x_1 = β(x_2) on {0} × (0, 1), u = 0 on {1} × (0, 1), ∂u/∂x_2 = α(x_1) on (0, 1) × {0}, and u = γ(x_1) on (0, 1) × {1}, where α, β, and γ are prescribed functions. The generator of the diffusion process X = (X_1 = B_1, X_2 = √3 B_2) coincides with the operator of the above partial differential equation, where B_1 and B_2 are independent Brownian motion processes.
Let X(·, ω) be a sample of X that reaches ∂D for the first time at Y(ω) ∈ ∂D_d. This sample is stopped at Y(ω) because u(Y(ω)) is known. The contribution of the sample X(·, ω) to the local solution is u(x, ω) = ξ_d(Y(ω)) + 8 T(ω), where T(ω) is the time it takes this sample of X to reach ∂D_d (Example 6.23). For example, ξ_d(Y(ω)) is equal to zero or γ(Y(ω)) if Y(ω) belongs to {1} × (0, 1) or (0, 1) × {1}, respectively.
Let X(·, ω) be a sample of X that reaches ∂D for the first time at Y^(1)(ω) ∈ ∂D_n (Fig. 6.10). This sample cannot be stopped because u is not known at Y^(1)(ω).

Figure 6.10. A sample of X that reaches the Neumann boundary at Y^(1)(ω), is restarted at X^(1)(ω) = Y^(1)(ω) + (h_1, 0), and is stopped when it reaches the Dirichlet boundary at Y(ω)

First, the contribution during the time interval [0, T^(1)(ω)] of this sample to the local solution is obtained by relating the value of u at Y^(1)(ω) to the value of u at the restart point (Fig. 6.10), for a shift h_1 > 0 in the positive direction of coordinate x_1, according to the boundary condition. Second, the contribution during the time interval [T^(1)(ω), T^(2)(ω)] of the process restarted at X^(1)(ω) to the local solution is
for a shift h_2 > 0 in the positive direction of coordinate x_2. Third, the contribution during the time interval [T^(2)(ω), T(ω)] of the process restarted at X^(2)(ω) to the local solution is

u(X^(2)(ω)) = ξ_d(Y(ω)) + 8 T(ω).

The stated result follows from the above equations. ■
6.3. Sphere walk method 395

Figure 6.11 illustrates the construction of the spherical process. The process consists of a sequence of open spheres included in D whose centers approach the boundary ∂D of D. The point x ∈ D where the local solution has to be calculated is the center of the first sphere. This sphere has radius r = min_{y∈∂D} ||x − y||, so that it is tangent to ∂D. The center X_1 of the second sphere is a random point on the surface of the first sphere, and its radius R_1 is such that this sphere is tangent to ∂D. The spherical process in Fig. 6.11 is stopped when the center of a sphere in this sequence enters a small vicinity of ∂D for the first time ([160], Section 1.2.1).
The local solution given by the SWM is based on the Green function, the
mean value property, and the Monte Carlo simulation method. The SWM can be
applied to find the local solution of partial differential equations of the type given
by Eq. 6.1 with Dirichlet and mixed boundary conditions. Efficient local solutions
by the SWM have been reported only for the Laplace and Poisson equations [200].
If u in Eq. 6.52 satisfies the boundary condition u(x) = ξ(x), x ∈ ∂D, then
(6.54)
Note: The formula in Eq. 6.53 with q u in place of p gives an integral equation for u. If g(x, y) is positive for all x, y ∈ D, the integral equation for u can be given in the form u(x) = c(x) E[q(Y) u(Y)], where c(x) = ∫_D g(x, y) dy and Y is a random vector with density ḡ(x, ·) = g(x, ·)/c(x). ♦
γ(r) = (1/(2π)) log(1/r) for d = 2, and γ(r) = 1/((d − 2) a_d r^{d−2}) for d ≥ 3,

where a_d = 2 π^{d/2}/Γ(d/2) is the surface area of the unit sphere in ℝ^d ([128], Section 1.2). The Green functions have various singularities depending on the functional form of the operator L and the dimension d. For example, the singularities of the Green function for the Laplace operator in ℝ² and ℝ³ are given by log(r) and r^{−1}, respectively.
Note: The Green function can be calculated as the limit as δ ↓ 0 of a sequence of solutions g_δ of Δg_δ(x, y) = −p_δ(x, y), x, y ∈ D, with the homogeneous boundary conditions g_δ(x, y) = 0, where p_δ(·, y) is zero outside the disc D_δ(y) = {x : ||x − y|| < δ} of radius δ centered at y ∈ D and satisfies the conditions ∫_{D_δ(y)} p_δ(x, y) dx = 1 and D_δ(y) ⊂ D for each δ > 0. The limit of g_δ(·, ·) as δ ↓ 0 yields the Green function. A comprehensive review of Green's functions can be found in [128]. ♦
L[u(x)] = 0, x ∈ D,   (6.55)
Note: The formula in Eq. 6.56 states that the value of u at the center of S(x, r) is equal to the average of the values of this function on ∂S(x, r) weighted by the kernel k of L. We will see that the kernel k is an essential ingredient of the SWM. Efficient algorithms for finding the local solution of partial differential equations by the SWM are possible only if k has a known analytical expression.

Two methods can be used to find the kernel of a differential equation. The kernel of L can be obtained from Eq. 6.52 using integration by parts and the Green formulas ([44], Volume 2, Chapter IV). There are few differential operators for which the kernel k in Eq. 6.56 has a simple form, for example,

k(x, y) = 1 for L = Δ, and k(x, y) = ||x − y|| c^{1/2}/sinh(||x − y|| c^{1/2}) for L = Δ − c, c > 0.
Example 6.27: The local solution of the Laplace equation in Eq. 6.23 for D = S(x, r) is u(x) = Ex[u(B(T))], where B denotes an ℝ^d-valued Brownian motion starting at B(0) = x and T = inf{t > 0 : B(t) ∉ S(x, r)}. Because B(T) is uniformly distributed on ∂S(x, r), we have
Note: The expression of u in Eq. 6.56 implies Eq. 6.57 since Y is a uniformly distributed ∂S(x, r)-valued random variable, and Eq. 6.58 follows from Eq. 6.57 by differentiation with respect to x_i. The notation Ex is used to indicate that the expectation is performed on a sphere centered at x.

The relationships in Eqs. 6.57 and 6.58 state that u and its partial derivatives at an arbitrary point x ∈ D are equal to averages of the values of u on ∂S(x, r) weighted by the kernel k and by its partial derivatives ∂k/∂x_i, respectively. Hence, we can calculate u(x) and ∂u(x)/∂x_i at x from the values of u on ∂S(x, r) provided that the kernel k is known.

If k ≥ 0 on ∂S(x, r), the mean value property in Eq. 6.57 can also be given in the form u(x) = α_d(x, r) Ex[u(Z)], where Z is an ∂S(x, r)-valued random variable with the probability density function
û(x) = (1/n_s) Σ_{ω=1}^{n_s} k(x, Y(ω)) u(Y(ω)) and ∂û(x)/∂x_i = (1/n_s) Σ_{ω=1}^{n_s} (∂k(x, Y(ω))/∂x_i) u(Y(ω)).   (6.60)
Note: Let G_i, i = 1, ..., d, be independent copies of N(0, 1) and define the random variable U = (Σ_{i=1}^{d} G_i²)^{1/2}. The vector (G_1/U, ..., G_d/U) is uniformly distributed on ∂S(0, 1) in ℝ^d ([155], Procedure N-4, p. 89). This observation can be used to generate independent samples of Y. ♦
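This sampling procedure can be sketched as follows (an illustration of mine):

```python
import numpy as np

rng = np.random.default_rng(4)

def uniform_on_sphere(n, d):
    """n independent points uniformly distributed on dS(0,1) in R^d."""
    g = rng.standard_normal((n, d))               # G_i ~ N(0,1), independent
    return g / np.linalg.norm(g, axis=1, keepdims=True)

y = uniform_on_sphere(5000, 3)
print(np.linalg.norm(y, axis=1)[:3])              # all norms equal 1
print(y.mean(axis=0))                             # close to the origin by symmetry
```

Uniformity follows from the rotational invariance of the standard Gaussian vector; scaling and shifting the samples yields uniform points on any ∂S(x, r).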
In applications the function u and its derivatives are specified on the boundary ∂D of D, which generally is not a sphere. Hence, Eqs. 6.57 and 6.58 cannot be applied directly. The following two sections use the spherical process in Fig. 6.11 to find the local solution for Dirichlet and mixed boundary value problems.
The points X_i are uniformly distributed on ∂S(X_{i−1}, R_{i−1}) (Eq. 6.57). Let X_{N+1} be the point of ∂D that is the closest to X_N (Fig. 6.12). By the definition of N, we have ||X_N − X_{N+1}|| < ε.
u(x, ω) = ∏_{i=1}^{N(ω)+1} k(X_{i−1}(ω), X_i(ω)) ξ(X_{N(ω)+1}(ω)).   (6.61)
Proof: Suppose that N(ω) = 2 (Fig. 6.11). The use of Eq. 6.57 along the sample (X_0(ω), X_1(ω), X_2(ω), X_3(ω)) of X gives u(x, ω) = ∏_{i=1}^{3} k(X_{i−1}(ω), X_i(ω)) ξ(X_3(ω)) since X_3(ω) is on the boundary ∂D of D, so that u(X_3(ω)) = ξ(X_3(ω)). The equality u(X_N) = k(X_N, X_{N+1}) ξ(X_{N+1}) is approximate since the spherical process is stopped before reaching the boundary of D. The approximation is likely to be satisfactory since u is a smooth function.
We have seen that samples of a Brownian motion (Example 6.27) or properties of Gaussian variables (Eq. 6.60) can be used to generate uniformly distributed points on ∂S(X_i, R_i). We will see in Section 8.5.1.1 that the exit time of a Brownian motion from a sphere is finite a.s. This property guarantees that the number of steps N in Eq. 6.61 that a spherical process takes to reach the boundary of D is finite a.s. ■
û(x) = (1/n_s) Σ_{ω=1}^{n_s} u(x, ω), where
Note: The calculation of the estimate uof u can be inefficient since the radii Ri (w) of the
spheres S(Xi (w), Ri (w)), i = 0, 1, ... , N(w), are the solutions of nonlinear optimization
problems. This difficulty can be reduced by using specialized algorithms for calculating
Ri (w) based on simplified representations of 8D [200]. A
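For the Laplace equation the kernel is k = 1, so Eq. 6.61 reduces to averaging the boundary values ξ(X_{N+1}(ω)). A walk-on-spheres sketch (an illustration of mine) on the unit square with the harmonic boundary data ξ(x) = x_1² − x_2², whose harmonic extension inside D is the same expression:

```python
import numpy as np

rng = np.random.default_rng(5)

def dist_to_boundary(p):
    """Distance from p to the boundary of the unit square (0,1)^2."""
    return min(p[0], 1.0 - p[0], p[1], 1.0 - p[1])

def wos(x, xi, eps=1e-4, n_samples=2000):
    """Walk-on-spheres estimate of the harmonic function with boundary data xi."""
    total = 0.0
    for _ in range(n_samples):
        p = np.array(x, dtype=float)
        r = dist_to_boundary(p)
        while r >= eps:                  # stop in a small vicinity of the boundary
            theta = 2.0 * np.pi * rng.random()
            p += r * np.array([np.cos(theta), np.sin(theta)])  # uniform on dS(p,r)
            r = dist_to_boundary(p)
        total += xi(p)
    return total / n_samples

xi = lambda p: p[0]**2 - p[1]**2         # harmonic, so u = xi inside D as well
est = wos([0.3, 0.6], xi)
print(est, 0.3**2 - 0.6**2)              # estimate vs exact value -0.27
```

For the square, the largest inscribed sphere radius is available in closed form, so no nonlinear optimization is needed; for general domains this radius computation is exactly the bottleneck mentioned in the note above.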
L = Σ_{i=1}^{2} a_i ∂/∂x_i + (1/2) Σ_{i,j=1}^{2} d_ij ∂²/(∂x_i ∂x_j), x ∈ D = (−2, 2) × (−1, 1),

and the Dirichlet boundary conditions u(x) = 1 for x ∈ (−2, 2) × {−1} and u(x) = 0 on the other boundaries of D, where a_i and d_ij are some constants.
Two versions of the SWM can be implemented depending on the sign of the kernel k of L. If k does not have a constant sign in D, the local solution must be based on Eq. 6.57. If k has a constant sign, for example, k ≥ 0 in D, the local solution can be based on Eq. 6.59 and a modified spherical process whose points are not uniformly distributed on the boundary of the spheres S(X_i, R_i).
Figure 6.13. Histogram and approximation of k for r = 1
Note: Let X be an ℝ²-valued diffusion process defined by the stochastic differential equation dX(t) = a dt + b dB(t), t ≥ 0, where a, b denote (2, 1), (2, 2) constant matrices and B is an ℝ²-valued Brownian motion. If a = (a_1, a_2) and b bᵀ = d = {d_ij}, the generator of X coincides with L. It was found that the function

h(a, r) = Σ_{i=1}^{2} (p_i(r)/σ_i) φ((a − μ_i)/σ_i)

with μ_1 = 1.25, μ_2 = 3.4, σ_1 = σ_2 = 0.3, and p_2(r) = 1 − p_1(r) approximates k satisfactorily, where p_1(r) = 0.6; 0.7; 0.8; 0.9 for r = 0.5; 1.0; 1.5; 2.0, and φ(ρ) = (2π)^{−1/2} exp(−0.5 ρ²). The histograms in Fig. 6.13 are based on n_s = 1,000 samples of X generated with a time step of 0.001. ♦
We have seen that the spherical process used to define the local solution in Eq. 6.61 reaches a small vicinity of the boundary ∂D of D in a finite number of steps. However, the product of kernels k(X_{i−1}, X_i) in Eq. 6.61 may not converge as the number of steps N increases, in which case the local solution given by this equation cannot be used, as demonstrated by the following example.
Example 6.30: Consider a thin rectangular plate in D = (0, a) × (0, b), a, b > 0, in ℝ². The plate is subjected to a uniform traction at the boundaries {0} × (0, b) and {a} × (0, b). The state of stress at a point x ∈ D in the plate is given by the vector τ(x) = (τ_11(x), τ_22(x), τ_12(x)). There is a mean value theorem for τ similar to Eq. 6.57 stating that

where θ ~ U(0, 2π) gives the location on the unit circle centered at x, a(θ) = a_1(θ)ᵀ a_2(θ),
Δu(x) = 0, x ∈ D,
u(x) = ξ_a(x), x ∈ ∂D_a,   (6.63)
∇u(x) · c(x) = ξ_n(x), x ∈ ∂D_n,

where c and ξ_a, ξ_n are ℝ²- and ℝ-valued prescribed functions, respectively.
The SWM in the previous section cannot solve mixed boundary value problems since u is not known on ∂D_n. We present here an extended version of the
SWM for the local solution of Eq. 6.63. Alternative versions of the SWM as well
as numerical examples can be found in [200] (Sections 3.3 and 5.3).
Suppose we need to calculate the local solution of Eq. 6.63 at a point
x ∈ D and that the spherical process in Fig. 6.12 has reached a point X_N in
since Y₀ − X_{N+1} = p c(X_{N+1}). Hence, the unknown value of u at X_{N+1} can be
related to p ξ_n(X_{N+1}), a known value, and u(Y₀). Now we restart the spherical
process from Y₀. Suppose that after M steps the restarted spherical process is at
Y_M in a small vicinity of ∂D and that Y_{M+1} ∈ ∂D_a. By Eq. 6.61, u(Y₀) can be
related to u(Y_{M+1}) = ξ_a(Y_{M+1}), which is known. If Y_{M+1} ∈ ∂D_n, then Y_{M+1}
has to be reflected in D and the previous algorithm needs to be repeated until a
Dirichlet boundary is reached. Hence, the value of u(x) can be calculated for any
sample of the spherical process.
and k^{(0)}(ξ, y) = 1. It can be shown that the equations of linear elasticity can be
given in the form
Note: Let I denote the identity operator. Under the above condition, the Neumann series
of the operator I − K^{(1)} is absolutely and uniformly convergent, so that the solution of
(I − K^{(1)})[μ(ξ)] = g(ξ), that is, the solution of Eq. 6.66, is (Section 8.3.1.4)

μ(ξ) = Σ_{s=0}^∞ (K^{(1)})^s [g(ξ)].

The above series representation of μ and Eq. 6.67 yield Eq. 6.68. Generally, the
condition under which Eq. 6.68 holds is not satisfied for elasticity problems. However, it
is possible to modify the series representation of the local solution such that it becomes
convergent, as demonstrated later in this section (Example 6.31). ▲
Suppose that the representation of the local solution in Eq. 6.68 is valid.
This representation cannot be used for numerical calculation since it has an infinite
number of terms. We have to approximate the local solution u(x), x ∈ D, by the
first m < ∞ terms in Eq. 6.68, that is,
u(x) ≈ û(x) = R[g(x)] + Σ_{i=1}^m R K^{(i)}[g(x)].   (6.69)
No probabilistic concepts have been used to establish Eqs. 6.68 and 6.69. We
will use Monte Carlo simulation to calculate the terms of û(x) in Eq. 6.69. A
random walk on ∂D, referred to as the boundary walk method (BWM), is used
to evaluate û(x) in Eq. 6.69.
Let Y₀, Y₁, …, Y_i, … be a ∂D-valued Markov chain, where f₀ and f_{i|i−1}
denote the densities of Y₀ and Y_i | Y_{i−1}, i = 1, 2, …, respectively. Define also
an ℝ^d-valued process with states Z₀, Z₁, …, Z_i, …, where

Z_i = q_i g(Y_i), i = 0, 1, …,
q_i = q_{i−1} k(Y_{i−1}, Y_i) / f_{i|i−1}(Y_i | Y_{i−1}), i = 1, 2, …, and   (6.70)
q₀ = r(x, Y₀) / f₀(Y₀).

The states Z₀, Z₁, …, and Z_i depend on Y₀, (Y₀, Y₁), …, and (Y₀, …, Y_i), respectively. The joint density f_{0,1,…,i} of (Y₀, …, Y_i) is f_{0,1,…,i}(y₀, y₁, …, y_i) =
f₀(y₀) Π_{j=1}^i f_{j|j−1}(y_j | y_{j−1}) for i ≥ 1.

If f_{0,1,…,i}(y₀, y₁, …, y_i) = 0 implies that the entries of the matrix
r(x, y₀) Π_{j=1}^i k(y_{j−1}, y_j) g(y_i) are zero for all x ∈ D, all boundary points
y₀, y₁, …, y_i ∈ ∂D, and i ≥ 0, then
Proof: We use the definitions in Eqs. 6.66, 6.67, 6.69, and 6.70 to prove Eq. 6.71. Note
that

R[g(x)] = ∫_{∂D} r(x, y) g(y) dσ(y) = ∫_{∂D} [r(x, y) g(y)/f₀(y)] f₀(y) dσ(y) = E[Z₀]

provided that f₀(y) = 0 implies that the entries of r(x, y) g(y) are zero for all x ∈ D and
y ∈ ∂D. The second term in Eq. 6.69 is E[Z₁], since f_{0,1}(y₀, y₁) = 0 implies the entries of r(x, y₀) k(y₀, y₁) g(y₁) are zero for all
x ∈ D and y₀, y₁ ∈ ∂D. ∎
Let (Y₀(ω), Y₁(ω), …), ω = 1, …, n_s, be n_s independent samples of the
Markov process (Y₀, Y₁, …) and let (Z₀(ω), Z₁(ω), …) denote the corresponding samples of (Z₀, Z₁, …).
The terms of the local solution in Eq. 6.68 can be approximated by the following
estimates of the expectation of the random variables Z_i, that is,
(6.72)
û(x) = (1/2) R[g(x)] + (1/2) Σ_{i=1}^∞ (R K^{(i−1)}[g(x)] + R K^{(i)}[g(x)])
     = (1/2) E[Z₀] + (1/2) Σ_{i=1}^∞ (E[Z_{i−1}] + E[Z_i]).
r₀ = ‖x − y₀‖, r_{i−1,i} = ‖y_{i−1} − y_i‖, φ_{x,y₀} denotes the angle between the interior
normal at y₀ ∈ ∂D and the ray from y₀ ∈ ∂D to x ∈ D, and φ_{y_{i−1},y_i} is the angle between
the interior normal at y_i and the ray from y_i ∈ ∂D to y_{i−1} ∈ ∂D.
The series Σ_{i=0}^∞ E[Z_i] cannot be used to calculate u(x) since it is not convergent.
Consider the series v(x) = Σ_{i=0}^∞ λ^i E[Z_i]. This series has a single pole at λ = −1 and
coincides with the series expansion of u(x) for λ = 1. The idea is to eliminate the pole of
v(x) and then use the resulting series to calculate the local solution u(x).
Consider the linear system in Eq. 6.73,

u = φ + λ g u,   (6.73)

where i is the (m, m) identity matrix, g is an (m, m) matrix, φ and u denote
m-dimensional vectors, and λ is a parameter. The entries of g and φ are real
numbers. If φ ≠ 0 and det(i − λ g) ≠ 0 for a specified value of λ, then Eq. 6.73
has a unique solution. If φ = 0, Eq. 6.73 becomes

u = λ g u   (6.74)

and defines an eigenvalue problem. We refer to Eqs. 6.73 and 6.74 as inhomogeneous and homogeneous, respectively.
Our objectives are to develop methods for finding directly the value of
a particular coordinate of the solution u of Eq. 6.73 and the dominant eigenvalue/eigenvector of Eq. 6.74, that is, to develop methods for the local solution
of Eqs. 6.73 and 6.74.
Consider the linear system

u = φ + g u   (6.75)

defined by Eq. 6.73 with λ = 1. This choice of λ is not restrictive. The objective
is to find the value of a coordinate u_i of u directly rather than extracting its value
from the global solution of Eq. 6.75.
Let X be a discrete time Markov process with discrete states {1 , ... , m + 1},
where m + 1 is an absorbing state. The process X starts at a state i E {1, ... , m}
and exits {1, ... , m} or, equivalently, enters the absorbing state m + 1, for the first
time in T transitions, where
Proof: The above matrix norm ‖·‖_m is ‖g‖_m = max_{‖ξ‖=1} ‖g ξ‖, where ‖·‖ is the
Euclidean norm.

Let S = T − 1 and i, i₁, …, i_{S(ω)}, i_{T(ω)} be a sample path X(·, ω) of X, which
starts at state i ∈ {1, …, m} and reaches the absorbing state m + 1 in T(ω) transitions. The
expectation in Eq. 6.79 is a sum over sample paths, where

P(ω) = p_{i,i₁} p_{i₁,i₂} ⋯ p_{i_{S(ω)−1},i_{S(ω)}} p_{i_{S(ω)}} for S(ω) > 0, and P(ω) = p_i for S(ω) = 0,

is the probability of sample path ω, and V(S(ω)) = v_{i,i₁} v_{i₁,i₂} ⋯ v_{i_{S(ω)−1},i_{S(ω)}} denotes the
value of V just prior to X entering the absorbing state m + 1 calculated along a sample path
ω of X. An alternative form of Eq. 6.79 is
w of X. An alternative form of Eq. 6.79 is
oo m m
Ui = L:: L:: ... L:: gi,i1 g;l .i2 ... gik-1 .ik rPik
k=Oi1=l ik=l
= ¢; + (g t/1) i + (g 2 t/1) i + . .. ,
where the first value of k is zero since, if X enters the absorbing state at the first transition,
T = 1 and S = 0. The resulting expression of u; represents the entry i of the Neumann
series expansion L~o gs t/1, which is convergent since II g lim< 1 by hypothesis. Since
(i- g)- 1 t/1 = L~o gs tfi, u; in Eq. 6.79 solves locally Eq. 6.75. •
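The estimator in Eq. 6.79 can be sketched in a few lines of code. The 3 × 3 system below is a hypothetical example (not from the text), and the transition probabilities p_{i,j} = |g_{i,j}| are one admissible choice; the exact solution of (i − g) u = φ has first coordinate u₁ ≈ 1.6129.

```python
import random

# Hypothetical 3x3 system u = phi + g u with ||g||_m < 1 (not from the text).
g = [[0.2, 0.1, 0.0],
     [0.1, 0.2, 0.1],
     [0.0, 0.1, 0.2]]
phi = [1.0, 2.0, 1.0]
m = 3

# One admissible choice of transition probabilities: p_ij = |g_ij|,
# with absorption probability p_i = 1 - sum_j p_ij at each state.
p = [[abs(g[i][j]) for j in range(m)] for i in range(m)]
p_abs = [1.0 - sum(row) for row in p]

def sample(i, rng):
    """One sample of V(T-1) * phi_{X(T-1)} / p_{X(T-1)} for the walk started at i."""
    v, state = 1.0, i
    while True:
        r = rng.random()
        acc = 0.0
        for j in range(m):
            acc += p[state][j]
            if r < acc:
                v *= g[state][j] / p[state][j]   # v_ij = g_ij / p_ij
                state = j
                break
        else:
            # absorbed: score phi at the last state visited in {1, ..., m}
            return v * phi[state] / p_abs[state]

rng = random.Random(0)
ns = 200_000
u1_hat = sum(sample(0, rng) for _ in range(ns)) / ns   # estimate of coordinate u_1
```

Any transition matrix with p_{i,j} > 0 wherever g_{i,j} ≠ 0 yields an unbiased estimate; the choice affects only the variance.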
If at least one of the probabilities p_i is not zero, the average number of transitions
of X to the absorbing state m + 1 is finite and equal to

w = (i − p)^{−1} 1,   (6.80)

where 1 ∈ ℝ^m has unit entries and w_i = E[T | X(0) = i], i = 1, …, m.
Proof: Let ρ be a real-valued function defined on {1, …, m} and

w_i = E[Σ_{k=0}^S ρ(X(k)) | X(0) = i]

for i ∈ {1, …, m} and S = T − 1. It can be shown that w_i = ρ(i) + Σ_{j=1}^m p_{i,j} w_j ([150],
pp. 108–110), or w = ρ + p w. If ρ(i) = 1 for all i ∈ {1, …, m}, then w = 1 + p w, which
gives Eq. 6.80. ∎
û_i = (1/n_s) Σ_{ω=1}^{n_s} V(T(ω) − 1, ω) φ_{i_{T(ω)−1}} / p_{i_{T(ω)−1}}.   (6.81)
Example 6.32: Let u be defined by Eq. 6.75 with m = 1. The exact solution of
this equation is u = φ/(1 − g). The associated Markov process X has two states
and the transition probability matrix

p̄ = [ p   1 − p ;  0   1 ].
If |g| < 1, Eq. 6.79 gives u = Σ_{k=0}^∞ g^k φ = φ Σ_{k=0}^∞ g^k, which coincides with
the exact solution φ/(1 − g) since the Neumann series Σ_{k=0}^∞ g^k is convergent with
sum 1/(1 − g). ◊
Example 6.33: Suppose that all probabilities p_i = 1 − Σ_{j=1}^m p_{i,j} are equal to
p ∈ (0, 1). Then the average number of transitions to absorption for the Markov
chain X with transition probabilities p_{i,j} is w_i = 1/p, for each i ∈ {1, …, m}. ◊
Proof: The solution w_i = 1/p, i ∈ {1, …, m}, can be obtained directly by noting that X
is absorbed with probability p or takes values in {1, …, m} with probability 1 − p at each
transition. Hence, T is the time to the first success for a sequence of independent Bernoulli
trials with probability of success p. Hence, T is independent of the initial value of X and
has the mean and variance E[T] = 1/p and Var[T] = (1 − p)/p², respectively.

Consider the Neumann series w = 1 + Σ_{s=1}^∞ p^s 1 corresponding to the solution
w = (i − p)^{−1} 1. Because the terms p^s 1 of Σ_{s=1}^∞ p^s 1 are equal to (1 − p)^s 1, this series
is convergent and all coordinates of w are equal to Σ_{s=0}^∞ (1 − p)^s = 1/p. For example,
the entries of p 1 are Σ_{j=1}^m p_{i,j} = 1 − p, so that p² 1 = p (p 1) = (1 − p) p 1. ∎
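The identity w = (i − p)^{−1} 1 can be checked numerically by iterating the fixed-point form w = 1 + p w of the Neumann series; the m = 4 matrix below, with equal entries and absorption probability 0.25 at every transition, is an assumed example.

```python
# Assumed example: m = 4 states, absorption probability p = 0.25 at every transition,
# the remaining probability mass spread equally over the m states.
m, p_absorb = 4, 0.25
p = [[(1.0 - p_absorb) / m] * m for _ in range(m)]

# Fixed-point iteration w = 1 + p w, the Neumann-series form of w = (i - p)^{-1} 1.
w = [1.0] * m
for _ in range(200):
    w = [1.0 + sum(p[i][j] * w[j] for j in range(m)) for i in range(m)]
# every coordinate of w converges to 1/p = 4, the mean number of transitions to absorption
```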
Example 6.35: Consider the Poisson equation Δu(x) + p(x) = 0 in Eq. 6.6
defined on an open bounded set D ⊂ ℝ² with the Dirichlet boundary condition
u(x) = ξ(x), x ∈ ∂D. The local solution of this problem by the RWM is in
Section 6.2.1.4. We present here an alternative local solution based on the finite
difference approximation,
u_{i,j} = (1/4)(u_{i+1,j} + u_{i−1,j} + u_{i,j+1} + u_{i,j−1}) + (a²/4) p(x_{i,j}),   (6.82)

of this Poisson equation, where x_{i,j} denotes the coordinate of the node (i, j) of
the finite difference mesh in D, a denotes the step of the finite difference mesh,
and u_{i,j} = u(x_{i,j}).
The method in this section can be applied to solve locally the linear system
defined by Eq. 6.82. However, we present an alternative local solution based on
n_s walkers that start at a node x of the finite difference mesh and can move to a
node just right, left, up, or down from x with equal probability. A walker k travels
through the nodes x_s^{(k)}, s = 1, …, m_k, of the finite difference mesh and exits
D at x_{m_k}^{(k)} ∈ ∂D, where m_k denotes the number of transitions walker k takes to
reach the boundary ∂D of D. For simplicity, we change notation and identify the
nodes of the finite difference mesh by a single subscript. The local solution can
be estimated by

The result provides an approximation for the local solution in Eq. 6.26. ◊
Proof: Note that the walkers' paths define a random walk in D. The first term in the
expression of û, that is, the sum (1/n_s) Σ_{k=1}^{n_s} ξ(x_{m_k}^{(k)}), is an estimate of the expectation
E^x[u(B(T))] in Eq. 6.26. The connection between the second term in the expression of û
and the integral ∫₀^T p(B(s)) ds in Eq. 6.26 is less obvious.

To show that E^x[∫₀^T p(B(s)) ds] is approximated by the second term in the expression of û(x), we develop an alternative interpretation of ∫₀^T p(B(s)) ds. Consider a
circle S(x, a) of radius a > 0 centered at x ∈ D, where a is the step of the finite difference mesh. Let ΔT₁ be the time a Brownian motion B starting at x needs to reach the
boundary of S(x, a) for the first time. Denote by Z₁ the exit point of B from S(x, a).
Consider another circle S(Z₁, a) and a Brownian motion starting at Z₁. Let ΔT₂ be the
time it takes this Brownian motion to reach for the first time the boundary of S(Z₁, a),
and so on. We stop this construction at step M if S(Z_M, a) is the first circle that is not
included in D. The process x = Z₀ ↦ Z₁ ↦ ⋯ ↦ Z_M is similar to the spherical process in Fig. 6.11. Because the radius a is small, the last point of the process
Z = (x = Z₀, Z₁, …, Z_M) is close to the boundary ∂D of D, so that ∫₀^T p(B(s)) ds
can be approximated by Σ_{r=1}^M p(Z_{r−1}) ΔT_r,
so that E[Σ_{r=1}^M p(Z_{r−1}) ΔT_r] = (a²/2) E[Σ_{r=1}^M p(Z_{r−1})], where we used E[ΔT_r] =
a²/2 (Section 8.5.1.1). These considerations show that the second term in the expression
of û constitutes a Monte Carlo estimate for (1/2) E[Σ_{r=1}^M p(Z_{r−1})] (a²/2). ∎
Example 6.36: Consider the partial differential equation in the previous example
with p(x) = 0 and D = (0, 1) × (0, 1). The coordinate of node (i, j) of the finite
difference mesh is x_{i,j} = ((i − 1) a, (j − 1) a), i, j = 1, …, n + 1, where n
denotes the number of equal intervals in (0, 1) and a = 1/n. The estimate of the
solution u of the Laplace equation Δu(x) = 0 in D at an arbitrary node (i, j) of
the finite difference mesh is û_{i,j} = (1/n_s) Σ_{k=1}^{n_s} ξ(x_{m_k}^{(k)}) (Example 6.35), where
ξ is the boundary value of u and x_{m_k}^{(k)} denotes the terminal point of sample
path k of the random walk.
If ξ is equal to 100 on {1} × (0, 1) and zero on the other boundaries of D,
the above estimate of u gives û_{i,j} = 100 n_s′/n_s, where n_s′ denotes the number
of samples of the random walk that exit D through the boundary {1} × (0, 1). The
estimates of u calculated at a relatively large number of points in D are in error
by less than 3% for a finite difference mesh with a = 1/1,000 and n_s = 1,000
samples. ◊
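A minimal sketch of this random-walk estimate, with a much coarser mesh than in the text (n = 16 rather than 1,000) and the walker started at the center of the square, where the exact solution equals 25 by symmetry:

```python
import random

# Random-walk estimate of Example 6.36: Laplace equation on the unit square,
# boundary value 100 on {1} x (0, 1) and 0 on the other boundaries.
n = 16                      # number of intervals; mesh step a = 1/n
ns = 40_000                 # number of walkers
rng = random.Random(1)

def walk(i, j):
    """Move a walker from node (i, j) until it exits the mesh; return the boundary value."""
    while 0 < i < n and 0 < j < n:
        step = rng.randrange(4)
        if step == 0:
            i += 1
        elif step == 1:
            i -= 1
        elif step == 2:
            j += 1
        else:
            j -= 1
    return 100.0 if i == n else 0.0   # xi = 100 only on the boundary {1} x (0, 1)

u_center = sum(walk(n // 2, n // 2) for _ in range(ns)) / ns
```

By the four-fold symmetry of the square, the exit probability through each side from the center is exactly 1/4, so the estimate concentrates around 100/4 = 25.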
Note: The exact solution of the Laplace equation with the domain and Dirichlet boundary
conditions considered in this numerical example is

u(x) = (400/π) Σ_{n=1,3,…}^∞ sinh(n π x₁) sin(n π x₂) / (n sinh(n π)). ▲
Example 6.37: Let u be the solution of Eq. 6.75. Suppose that the condition
‖g‖_m < 1 is not satisfied, so that the local solution in Eq. 6.79 cannot be used.
Let a = i − g, let μ denote an eigenvalue of a, and let μ̄ > max{|μ|}, c = μ̄^{−2}, φ* = c a φ,
and g* = i − c a². Then u is also the solution of u = φ* + g* u and ‖g*‖_m < 1.
Hence, Eq. 6.79 can be used to solve u = φ* + g* u locally. ◊
Proof: That u is the solution of u = φ* + g* u results by straightforward calculations
using the definitions of φ* and g*.
Consider the eigenvalue problem

g u = μ u,   (6.83)

obtained from Eq. 6.73 with φ = 0 and μ = 1/λ, where g is assumed to be a
real-valued symmetric matrix with simple positive eigenvalues. Our objective is
to find the largest eigenvalue of g and the corresponding eigenvector, referred to
as the dominant eigenvalue and eigenvector, respectively.
Example 6.38: Consider the eigenvalue problem in Eq. 6.83. Suppose that the
eigenvalues μ_k of g satisfy the condition μ₁ > μ_k for all k > 1. Let y^{(r)} ∈
ℝ^m, r = 0, 1, …, be a sequence of vectors generated by the recurrence formula
y^{(r)} = g y^{(r−1)}, where y^{(0)} does not exclude the dominant eigenvector of g, that
is, y^{(0)} = Σ_{i=1}^m β_i u_i, where u_i denote the eigenvectors of g and β₁ ≠ 0. Then
y^{(r)}/μ₁^r and y_i^{(r+1)}/y_i^{(r)} converge to β₁ u₁ and μ₁, respectively, as r → ∞. ◊
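The recurrence of Example 6.38 is the classical power iteration. A sketch on an assumed symmetric 3 × 3 matrix with eigenvalues 1, 2, and 4:

```python
# Power iteration of Example 6.38 on an assumed symmetric 3x3 matrix
# whose eigenvalues are 1, 2, and 4 (dominant eigenvector (1, 2, 1)).
g = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
y = [1.0, 1.0, 1.0]                  # y^(0) does not exclude the dominant eigenvector
mu1 = 0.0
for _ in range(60):
    z = [sum(g[i][j] * y[j] for j in range(3)) for i in range(3)]
    mu1 = z[1] / y[1]                # ratio y_i^(r+1) / y_i^(r) -> dominant eigenvalue
    s = max(abs(v) for v in z)
    y = [v / s for v in z]           # rescale to avoid overflow; direction is unchanged
# mu1 converges to 4, and y to a multiple of the dominant eigenvector
```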
Let p = {p_{i,j}} be an (m, m) matrix such that p_{i,j} ≥ 0 and Σ_{j=1}^m p_{i,j} =
1, and let X be a Markov chain with states {1, …, m} defined by the transition
probability matrix p. Let a = (a₁, …, a_m) ∈ ℝ^m and π_i > 0, i = 1, …, m,
be such that Σ_{i=1}^m π_i = 1. Define a discrete random variable Y taking the values
α_i = a_i/π_i with probabilities π_i, i = 1, …, m, and let V be the process defined
by the recurrence formula in Eq. 6.78, but starting at V(0) = α_i rather than 1.
Consider also the recurrence formula

x^{(r)} = g x^{(r−1)} / p_r,   (6.84)

starting at x^{(0)} ∈ ℝ^m such that x^{(0)} does not exclude the dominant eigenvector of
g. Set p_r = ‖g x^{(r−1)}‖ in Eq. 6.84 so that the vectors x^{(1)}, x^{(2)}, … have unit
norm. Let v_{i,j} = g_{i,j}/p_{i,j} if p_{i,j} > 0 and v_{i,j} = 0 if p_{i,j} = 0.
If the eigenvalues μ_k of g = gᵀ are such that μ = μ₁ > μ_k for k > 1, then

μ ≈ (e(q′)/e(q))^{1/(q′−q)} for q′ > q and large q, where   (6.85)

e(q) = Σ_{j=1}^m E[V(q) | X(0) = Y, X(q) = j] = p₁ ⋯ p_q Σ_{j=1}^m (x^{(q)})_j.   (6.86)
Proof: Let ω_{i,j} be a sample path of X starting at state i and reaching a state j for the first
time after q transitions. The value of V along this sample is α_i v_{i,i₁} v_{i₁,i₂} ⋯ v_{i_{q−1},j}, so that
the expectation of V along all samples ω_{i,j} of X is
For x^{(0)} = a we have g^r a = p₁ ⋯ p_{r−1} p_r x^{(r)} (Eq. 6.84), so that

e(q′)/e(q) = p_{q+1} ⋯ p_{q′} [Σ_{j=1}^m (x^{(q′)})_j] / [Σ_{j=1}^m (x^{(q)})_j] ≈ μ^{q′−q},

where the approximate equality holds for large values of q since the vectors x^{(q)} and x^{(q′)}
nearly coincide with the dominant eigenvector and the numbers p_{q+1}, …, p_{q′} are approximately equal to μ (Example 6.38). ∎
If the conditions under which Eqs. 6.85 and 6.86 hold are satisfied, the coordinate u_j of the
dominant eigenvector of Eq. 6.83 can be approximated by

u_j ≈ [(1 − μ̂) / (μ̂^{q+1} (1 − μ̂^{q′−q}))] Σ_{r=q+1}^{q′} E[V(r) | X(0) = Y, X(r) = j].   (6.87)
Proof: Suppose that q is sufficiently large so that the expectation E[V(r) | X(0) =
Y, X(r) = j] can be approximated by μ^r u_j. For q′ > q we have
probabilities π_i = 1/3, and a = (100, 100, 100). The values of e(q) and e(q′)
have been estimated from n_s = 1,000 samples. ◊
Note: The expectations in Eqs. 6.86 and 6.87 have been estimated from samples of X
generated from the transition probability matrix p defining this process and the probability
(π₁, …, π_m) of its initial state. ▲
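A Monte Carlo sketch of the estimate in Eqs. 6.85 and 6.86 for an assumed symmetric 3 × 3 matrix with dominant eigenvalue 4. With uniform transition probabilities p_{i,j} = 1/m and π_i = 1/m, the state X(q) is uniformly distributed, so the sum of conditional expectations defining e(q) reduces to m E[V(q)]; a plain average over chains then suffices, and the factor m cancels in the ratio of Eq. 6.85.

```python
import random

# Assumed symmetric matrix with eigenvalues 1, 2, 4 (dominant eigenvalue mu = 4).
g = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
m = 3
q, qp = 6, 8                      # q and q' in Eq. 6.85
ns = 200_000
rng = random.Random(2)

e = {q: 0.0, qp: 0.0}             # accumulators proportional to e(q) and e(q')
for _ in range(ns):
    state = rng.randrange(m)      # X(0) = Y with pi_i = 1/m
    v = float(m)                  # V(0) = a_i / pi_i = m for a = (1, 1, 1)
    for r in range(1, qp + 1):
        nxt = rng.randrange(m)    # uniform transitions, p_ij = 1/m
        v *= m * g[state][nxt]    # v_ij = g_ij / p_ij
        state = nxt
        if r in e:
            e[r] += v
mu_hat = (e[qp] / e[q]) ** (1.0 / (qp - q))
```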
where x_i ∈ D_i and V_i is the volume of D_i. The above equations defining û have the
form of Eq. 6.73, in which the entries of the matrices g, φ, and u are g(x_i, x_j) V_j,
φ(x_i), and u(x_i). Similar arguments show that Eq. 6.89 can be cast in the form of
Eq. 6.74. ◊
Example 6.42: Suppose that the kernel of Eqs. 6.88 and 6.89 is degenerate, that
is, g(x, y) = Σ_{s=1}^m g_{1,s}(x) g_{2,s}(y), x, y ∈ D, and that the functions g_{1,s} are
linearly independent. Then the solution of Eq. 6.88 is

u(x) = φ(x) + λ Σ_{s=1}^m g_{1,s}(x) h_s,

in which the coefficients h_s = ∫_D g_{2,s}(y) u(y) dy are given by the linear system
of algebraic equations

h_s = a_s + λ Σ_{t=1}^m h_t b_{t,s},   s = 1, …, m,

where a_s = ∫_D g_{2,s}(y) φ(y) dy and b_{t,s} = ∫_D g_{2,s}(y) g_{1,t}(y) dy.
or a system

with an infinite number of equations, where γ_{s,r} are some coefficients. ◊

Note: Because ∫_D g(x, y) χ_s(y) dy is an element of the space spanned by the functions
χ_r, it has the representation Σ_q γ_{s,q} χ_q(x). The coefficients γ_{s,q} can be calculated since
∫_D g(x, y) χ_s(y) dy is a known function.
If the integral in Eq. 6.88 can be performed term by term, we have

Hence,

so that the square brackets must be zero. The resulting conditions on the coefficients constitute a system with an infinite number of linear equations. The solution technique for this system
resembles the local method for solving Eq. 6.73 ([94], p. 89). ▲
If |φ(x)| ≤ c₁ and |g(x, y)| ≤ c₂ for all x, y ∈ D, 0 < c₁, c₂ < ∞, v = ∫_D dx <
∞, and |λ| c₂ v < 1, then the local solution of Eq. 6.88 is

u(x) = φ(x) + Σ_{k=1}^∞ λ^k u_k(x),   (6.91)
Proof: The Neumann series in Eq. 6.91 introduced in Eq. 6.88 gives

φ(x) + Σ_{k=1}^∞ λ^k u_k(x) = φ(x) + λ ∫_D g(x, y) φ(y) dy + Σ_{k=1}^∞ λ^{k+1} ∫_D g(x, y) u_k(y) dy

under the assumption that the integral ∫_D g(x, y) u(y) dy with u in Eq. 6.91 can be performed term by term. These calculations are formal since we have not yet imposed any
conditions on the series in Eq. 6.91. The functions u_k in Eq. 6.90 result by equating the
terms of the above equation that have λ at the same power. For example, the first two
functions of the series expansion in Eq. 6.91 are (Eq. 6.90)

u₁(x) = ∫_D g(x, y) φ(y) dy and u₂(x) = ∫_D g₂(x, z) φ(z) dz.
Because the inequalities |φ(x)| ≤ c₁ and |g(x, y)| ≤ c₂ are assumed to hold for
some constants c₁, c₂ > 0 and all x, y ∈ D, |g₂(x, y)| ≤ ∫_D |g(x, z)| |g(z, y)| dz ≤ v c₂²,
and, generally, |g_k(x, y)| ≤ (v c₂)^k/v for all k = 1, 2, …, the absolute value of u in
Eq. 6.91 satisfies
|u(x)| ≤ |φ(x)| + Σ_{k=1}^∞ |λ|^k ∫_D |g_k(x, y)| |φ(y)| dy ≤ c₁ [1 + Σ_{k=1}^∞ (|λ| c₂ v)^k].
If the condition |λ| c₂ v < 1 is satisfied, then Σ_k (|λ| c₂ v)^k is convergent, implying that
the Neumann series in Eq. 6.91 is absolutely and uniformly convergent, so that the term by
term integration performed above is valid. Moreover, the value of u(x) at an arbitrary point
x ∈ D can be approximated by a finite number of terms of the Neumann series in Eq. 6.91.
The method in this section for solving locally Eq. 6.88 relates to the BWM developed in Section 6.4 for solving locally linear elasticity problems. Both methods are based
on Neumann series representations of the local solutions, and the terms of these series are
calculated by Monte Carlo simulation (Eqs. 6.71 and 6.92). ∎
If the Neumann series of Eq. 6.91 is absolutely and uniformly convergent, and
the kernel g is positive, then

where Y^{(k)}(x) is a D-valued random variable with probability density ḡ_k(x, ·) =
g_k(x, ·)/a_k(x) and a_k(x) = ∫_D g_k(x, y) dy.
Note: The functions ḡ_k(x, ·) are proper density functions with support D because they are
positive and the volume under their graph is unity. The definition of u_k in Eq. 6.90 also has
the form u_k(x) = a_k(x) ∫_D ḡ_k(x, y) φ(y) dy, showing that the integral is the expectation
of φ(Y^{(k)}(x)). The recurrence formula given by Eq. 6.90 shows that, if g is positive, so
are the kernels g_k. Generally, the integrals in Eq. 6.90 must be calculated numerically.

If the kernel g is not positive on D × D, u_k(x) can be interpreted as the expectation
of the random variable g_k(x, Z^{(k)}) φ(Z^{(k)}) v, where Z^{(k)} is uniformly distributed in D and
v = ∫_D dx is the volume of D. ▲
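A sketch of this sampling interpretation for a hypothetical test problem (not from the text): D = (0, 1), g(x, y) = x y, φ(x) = x, λ = 1/2, whose exact solution is u(x) = 1.2 x. To avoid computing the iterated kernels g_k explicitly, each u_k is written, by Fubini's theorem, as an expectation over k independent uniform points in D:

```python
import random

lam = 0.5
def g(x, y): return x * y     # hypothetical degenerate kernel on D = (0, 1)
def phi(x): return x          # hypothetical free term; exact solution u(x) = 1.2 x

def u_hat(x, n_terms=8, ns_per_term=50_000, seed=3):
    """Estimate u(x) = phi(x) + sum_k lam^k u_k(x) by Monte Carlo."""
    rng = random.Random(seed)
    total = phi(x)
    for k in range(1, n_terms + 1):
        acc = 0.0
        for _ in range(ns_per_term):
            z_prev, w = x, 1.0
            for _ in range(k):        # k independent uniform points; volume v = 1
                z = rng.random()
                w *= g(z_prev, z)
                z_prev = z
            # estimates u_k(x) = E[g(x, Z_1) g(Z_1, Z_2) ... g(Z_{k-1}, Z_k) phi(Z_k)]
            acc += w * phi(z_prev)
        total += lam ** k * acc / ns_per_term
    return total

u07 = u_hat(0.7)                      # exact value: 1.2 * 0.7 = 0.84
```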
The local solution in the preceding section (Eq. 6.79) can be extended to
find local solutions for Eq. 6.88 ([48] and [95], p. 90). Let X = (X(0) =
x, X(1), …) be a discrete time, continuous state Markov process starting at the
point x ∈ D where the local solution needs to be determined. Let f(· | y) denote
the density of the conditional random variable X(i) | (X(i − 1) = y), i ≥ 1. This
density, referred to as the transition density, needs to be specified. Given a transition
density f(· | ·), the probability that X starting at x ∈ D exits D in one transition
is p(x) = ∫_{D^c} f(ξ | x) dξ. The following two equations define a stopping time
T giving the number of transitions that X starting at x ∈ D takes to exit D for the
first time and a discrete time process V = (V(0), V(1), …). It is assumed that
the transition density f(· | x) is such that T < ∞ a.s. for all x ∈ D.
If the Neumann series in Eq. 6.91 is absolutely and uniformly convergent, the
local solution of Eq. 6.88 at x ∈ D is
Proof: Let x = x₀ ↦ x₁ ↦ ⋯ ↦ x_k ↦ x_{k+1} with x₀, …, x_k ∈ D and x_{k+1} ∈ D^c be a
sample path of X that starts at x ∈ D and exits D for the first time in k + 1 transitions. Note
that X(T − 1) and X(T) are the last and the first values of X in D and D^c, respectively
(Eq. 6.93). The expectation of V(T − 1) φ(X(T − 1))/p(X(T − 1)) conditional on X(0) =
x and T = k + 1, k = 0, 1, …, is φ(x)/p(x) for k = 0, and the total expectation is

φ(x) + λ ∫_D g(x, x₁) φ(x₁) dx₁ + λ² ∫_{D×D} g(x, x₁) g(x₁, x₂) φ(x₂) dx₁ dx₂ + ⋯
= φ(x) + λ u₁(x) + λ² u₂(x) + ⋯,

with the notation in Eq. 6.90. Hence, the expectation in Eq. 6.95 coincides with the local
solution in Eq. 6.91. Because the Neumann series in Eq. 6.91 is absolutely and uniformly
convergent by hypothesis, Eq. 6.95 gives the local solution of Eq. 6.88. ∎
The local solution in Eq. 6.95 can be estimated from samples of X and V.
Let x ↦ X(1, ω) ↦ ⋯ ↦ X(T(ω) − 1, ω) ↦ X(T(ω), ω), ω = 1, …, n_s, be
n_s independent samples of X starting at x ∈ D.
û(x) = (1/n_s) Σ_{ω=1}^{n_s} V(T(ω) − 1, ω) φ(X(T(ω) − 1, ω)) / p(X(T(ω) − 1, ω)).   (6.96)
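A sketch of the estimator in Eq. 6.96 for a hypothetical one-dimensional problem (not from the text): D = (0, 1), g(x, y) = x y, φ(x) = x, λ = 1/2, with exact solution u(x) = 1.2 x, and an assumed Gaussian transition density f(· | y) = N(y, s²):

```python
import math
import random

lam = 0.5
def g(x, y): return x * y     # hypothetical kernel on D = (0, 1)
def phi(x): return x          # exact solution of u = phi + lam * int g u : u(x) = 1.2 x
s = 0.5                       # standard deviation of the transition density N(y, s^2)

def f_density(xi, y):
    return math.exp(-0.5 * ((xi - y) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def p_exit(y):
    """p(y) = probability that a N(y, s^2) proposal lands outside D = (0, 1)."""
    Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return Phi((0.0 - y) / s) + 1.0 - Phi((1.0 - y) / s)

def u_hat(x, ns=100_000, seed=4):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(ns):
        y, v = x, 1.0
        while True:
            xi = rng.gauss(y, s)
            if xi <= 0.0 or xi >= 1.0:               # X exits D: time T reached
                total += v * phi(y) / p_exit(y)      # score of Eq. 6.96
                break
            v *= lam * g(y, xi) / f_density(xi, y)   # update V while X stays in D
            y = xi
    return total / ns

u07 = u_hat(0.7)   # exact value 0.84
```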
d⁴u(x)/dx⁴ + χ(x) u(x) = q(x),   x ∈ D = (0, l),

with the boundary conditions u(0) = u(l) = 0 and u″(0) = u″(l) = 0, where
χ > 0 and q are specified functions. The integral form of the above differential
equation is

u(ξ) = ∫₀¹ g*(ξ, η) [q(η) − χ(η) u(η)] dη = φ(ξ) − ∫₀¹ g(ξ, η) u(η) dη,

where g* denotes the Green function of the differential operator d⁴/dx⁴, ξ = x/l,
η = y/l, φ(ξ) = ∫₀¹ g*(ξ, η) q(η) dη, and g(ξ, η) = g*(ξ, η) χ(η). Because
max_{ξ,η∈(0,1)} |g(ξ, η)| < 1, the local solution in Eq. 6.95 is valid and has been
used to calculate the local solution.
Numerical results have been obtained for l = 1, χ = 1, q = 1, and X
defined by X(k + 1) | (X(k) = ξ) ∼ N(ξ, σ²), where σ > 0 is a specified
parameter. These results are in error by less than 2% for σ in the range (0.2, 0.6)
and n_s = 500 samples. ◊
where ξ = x/l, η = y/l ∈ (0, 1). This function gives the value of u at x caused by a unit
action applied at y.

The solution of the above differential equation is the displacement function for a
simply supported beam with unit stiffness and span l that is supported on an elastic foundation of stiffness χ and is subjected to a load q. ▲
(6.97)
Proof: Let v, w : D → ℝ be square integrable functions. The inner product (v, w) =
∫_D v(x) w(x)* dx induces the norm ‖v‖ = (v, v)^{1/2}, where z* denotes the complex
conjugate of z ∈ ℂ. This definition is meaningful by the Cauchy-Schwarz inequality since
the functions v, w are square integrable in D.

Suppose that there exists a complex-valued solution (λ, u) of the eigenvalue problem in Eq. 6.89. The complex conjugate of Eq. 6.89 multiplied by u(x) and integrated with respect to x over D yields one relation, and multiplying Eq. 6.89 by u(x)* and integrating the result with respect to x over D yields another. The
two relations imply λ = λ* since g is symmetric and u is an eigenfunction, so that
the inner product (u, u) is real and non-zero.

Let u′ and u″ be two eigenfunctions corresponding to the eigenvalues λ′ and λ″ ≠
λ′, respectively. The inner product of these eigenfunctions is zero, that is, eigenfunctions
corresponding to distinct eigenvalues are orthogonal.
If (1) the kernel g in Eq. 6.89 is real-valued and symmetric, (2) the initial trial
function u^{(0)} does not exclude u₀, and (3) 0 < λ₀ < λ_k for k ≥ 1, then

λ₀ ≈ ‖u^{(i)}‖ / (u^{(i+1)}, u^{(i)}) and u₀(x) is proportional to u^{(i)}(x) for i large.   (6.99)
Proof: Because the collection of eigenfunctions of Eq. 6.89 is complete, we can represent
the initial trial function by

u^{(0)}(x) = Σ_{k=0}^∞ c_k^{(0)} u_k(x),   x ∈ D.

Suppose that c₀^{(0)} ≠ 0 so that the eigenfunction u₀ is not excluded from u^{(0)}. The proof of
Eq. 6.99 involves three steps. First, we note that

u^{(i)}(x) = Σ_{k=0}^∞ (c_k^{(i)}/λ_k^i) u_k(x),   i = 1, 2, …,

where c_k^{(i)} = c_k^{(i−1)}/‖u^{(i−1)}‖. For example (Eqs. 6.97 and 6.98),

u^{(1)}(x) = Σ_k (c_k^{(1)}/λ_k) u_k(x),

where c_k^{(1)} = c_k^{(0)}/‖u^{(0)}‖. Because u^{(1)} has the same functional form as u^{(0)}, we have
u^{(2)}(x) = Σ_k (c_k^{(2)}/λ_k²) u_k(x). The above representation of u^{(i)} follows by induction.

Second, the first term of the series giving u^{(i)} is dominant for a sufficiently large
value of i since 0 < λ₀ < λ_k for all k ≥ 1, so that λ₀^i u^{(i)}(x)/c₀^{(i)} converges to u₀(x) as
i → ∞.
Third, consider two successive approximations,
divided by ‖u^{(i)}‖² ≈ (c₀^{(0)})² (u₀, u₀)/λ₀^{2i} gives the approximate expression of λ₀ in
Eq. 6.99. The above equations also show that u₀ becomes proportional to u^{(i)} for a sufficiently large i. ∎
If the conditions in Eq. 6.99 are satisfied and g is a positive function, then

(6.100)

where a(x) = ∫_D g(x, y) dy and Y(x) denotes an ℝ^d-valued random variable
with the probability density function ḡ(x, ·) = g(x, ·)/a(x).

The functions in Eqs. 6.98 and 6.100 can be estimated from n_s independent
samples of Y(x) (Eq. 6.100) by

(6.101)
with the boundary conditions u(±1) = 0. The integral form of this equation is

where g is the Green function of the differential operator d²/dx². This equation
is a simple version of the Schrödinger equation, which can also be used to analyze
the stability of a simply supported beam subjected to an axial load.

Figure 6.14 shows the lowest eigenfunction and eigenvalue of the above
differential equation. The approximate eigenfunction calculated from Eq. 6.101
with i = 100 iterations and n_s = 500 samples at each iteration is practically
indistinguishable from the exact solution u₀(x) = cos(π x/2). In fact, a very
good approximation of u₀ results in a few iterations. However, the estimate of λ₀
in Eq. 6.99 becomes stable after a larger number of iterations, as demonstrated by
its variation with the number of iterations. ◊
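The iteration behind Eqs. 6.98 and 6.99 can be checked deterministically by replacing the Monte Carlo estimates with a simple grid quadrature (a simplification of the text's scheme, which samples Y(x) instead). The positive, symmetric kernel below is consistent with the Green function described in the note that follows, with a(x) = (1 − x²)/2:

```python
import math

def g(x, y):
    # positive, symmetric Green kernel on D = (-1, 1): (1 - max(x,y)) (1 + min(x,y)) / 2
    return (1.0 - max(x, y)) * (1.0 + min(x, y)) / 2.0

n = 200
h = 2.0 / n
xs = [-1.0 + (i + 0.5) * h for i in range(n)]        # midpoint grid on (-1, 1)

def inner(v, w): return h * sum(a * b for a, b in zip(v, w))
def norm(v): return math.sqrt(inner(v, v))

u = [1.0] * n                                        # u^(0) does not exclude u_0
lam0 = 0.0
for _ in range(20):
    nu = norm(u)
    # u^(i+1)(x) = int_D g(x, y) u^(i)(y) dy / ||u^(i)||, by midpoint quadrature
    u_next = [h * sum(g(x, y) * uy for y, uy in zip(xs, u)) / nu for x in xs]
    lam0 = norm(u) / inner(u_next, u)                # estimate of lambda_0 (Eq. 6.99)
    u = u_next
# lam0 approaches pi^2 / 4 = 2.4674... within a few iterations
```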
Note: The Green function of the differential operator d²/dx² is the solution of the differential equation g″(x, y) = δ(x − y), x, y ∈ D = (−1, 1), with the boundary conditions
Figure 6.14. Exact and approximate eigenfunction u₀(x) (left) and estimate of λ₀ versus iteration number (right); the exact value is λ₀ = 2.4674
g(±1, y) = 0, where the primes denote derivatives with respect to x and δ(·) is the Dirac
delta function. Elementary calculations give the expression of g.
The Green function g is positive, symmetric, and square integrable in D, so that Eq. 6.101
can be applied. The density of Y(x) is ḡ = g/a(x), where a(x) = ∫_{−1}^1 g(x, y) dy =
(1 − x²)/2.
The solution of u″ + λu = 0 is an even function since, if u(x) satisfies this equation,
so does u(−x). Hence, we can replace D by D* = (0, 1) and solve u″ + λu = 0 in D* with
the boundary conditions u(1) = 0 and u′(0) = 0. The general solution of this equation is
u(x) = α cos(√λ x) + β sin(√λ x), where α and β are some constants. Since the function u must satisfy the boundary conditions and cannot be zero everywhere in D*, we have β = 0 and cos(√λ) = 0, so that the lowest eigenvalue is λ₀ = π²/4. ▲
6.7 Problems
6.1: Develop an algorithm for solving locally Eq. 6.21 for the case in which ξ_r
are random fields defined on the boundaries ∂D_r of D.

6.2: Extend the algorithm in Problem 6.1 to solve locally Eq. 6.25 with p and ξ
random fields defined on D and ∂D_r, respectively.
6.3: Solve the Schrödinger equation (1/2) u″(x) + q(x) u(x) = 0, 0 < x < l, with
the Dirichlet boundary conditions u(0) = α and u(l) = β for two cases: (1) q is
a continuous function and (2) q is a random field with continuous samples.
6.4: Show that the eigenvalues of Eq. 6.36 with d = 1 and D = (0, 1) must be
strictly positive.
6.5: Let g(|B(t)|) and g(h_{n,ε}(B(t))) be the processes given by Eq. 6.41 and
Eq. 6.42, respectively. Evaluate numerically the difference between these two
processes in a time interval [0, 1] for several values of n and ε > 0, with g(x) = x².
6.7: Complete the details of the proof given in the text for the second extension
of Tanaka's formula in Eq. 6.46.
6.8: Generalize Eq. 6.41 to find a formula for the increment g(X(t)) − g(X(0)),
where g ∈ C²(ℝ) and X denotes an arbitrary diffusion process. Apply your result to two diffusion processes, an Ornstein-Uhlenbeck process and a geometric
Brownian motion process.
6.10: Apply the algorithm described in Fig. 6.10 to solve locally the partial dif-
ferential equation in Example 6.22.
6.11: Use the SWM to solve locally the Laplace equation Δu(x) = 0, x ∈
D = (−2, 2) × (−1, 1), with the Dirichlet boundary conditions u(x) = 1 for
x ∈ (−2, 2) × {−1} and u(x) = 0 on the other boundaries of D.
6.12: Repeat the analysis of the partial differential equation in Example 6.22 for
the same Dirichlet boundary conditions on (0, a) × {0} and (0, a) × {b} but different
Neumann boundary conditions. Use in your calculations ∂u(x)/∂x₁ = α on {0} ×
(0, b) and ∂u(x)/∂x₁ = β on {a} × (0, b), where α and β are some constants.
6.13: Show that the solution of Eq. 6.75 is u = (i − g)^{−1} φ = (Σ_{s=0}^∞ g^s) φ if
there exists γ ∈ (0, 1) such that ‖g x‖ < γ ‖x‖ for all x ∈ ℝ^m, where g⁰ = i
denotes the identity matrix.
6.14: Apply Eq. 6.79 to solve locally the problem in Example 6.34 by using transition probability matrices other than p₁ and p₂ in this example. Comment on the
dependence of the solution accuracy on the transition probability matrix. Extend
your analysis to the case in which φ in Eq. 6.75 is random.
6.15: Apply Eqs. 6.85 and 6.87 to find the dominant eigenvalue and eigenvector
for some real-valued symmetric matrices of your choice.
6.16: Use Eq. 6.95 to solve locally the differential equation in Example 6.44.
Consider for X (k + 1) I X (k) a different probability law than the one used in the
text.
Chapter 7

Deterministic Systems and Stochastic Input
7.1 Introduction
This chapter examines stochastic problems defined by
On the other hand, the second moment properties of X are insufficient to define
its probability law if Y is not Gaussian or if Y is Gaussian but V is a nonlinear
operator. Linear and nonlinear operators are examined separately. It is also shown
that some of the methods developed for properties of the solution of initial value
problems can be extended to a class of boundary value problems.
[Table 7.1: types of problems considered and results obtained. For each combination of input model and method the table indicates whether the mean/correlation functions, the higher order moments (HOM), or a partial differential equation (PDE) for the characteristic function φ of X can be obtained; a formal calculus applies if A, H, and K are deterministic and the input is GWN/PWN. Entries include: HOM for special cases; HOM if drift/diffusion are poly(X); mean/corr. if a, b are poly(X); PDE for φ in special cases; HOM if drift/diffusion are poly(X, S); Fokker-Planck-Kolmogorov equations; approximations.
• System: dX(t) = a(X(t−), t) dt + b(X(t−), t) dS(t) or Ẋ(t) = a(X(t−), t) + b(X(t−), t) p(S(t−))
• Input: dS(t) = dA(t) + H(t) dB(t) + K(t) dC(t) = dA(t) + dM(t)]
Table 7.1 lists the type of problem considered and results obtained in the
following two sections. The notations mean/corr., HOM, and PDE for φ mean
that differential equations are obtained for the mean/correlation, the higher order
moments, and the characteristic function φ of X. Qualifiers such as Z = solution
of an SDE with GWN/PWN or H = poly(B) mean that Z satisfies a stochastic
differential equation driven by Gaussian and/or Poisson white noise processes or
H is a polynomial of a Brownian motion B. The qualifiers DM and SAM in the
table heading refer to direct and state augmentation methods used in the analysis.
The defining equations for the state X and the driving noise S are given at the
bottom of the table.
The analysis is based on the Ito formula for semimartingales. We give this
7.1. Introduction 431
formula here for convenience. Let Y be an ℝ^m-valued semimartingale and let g : ℝ^m → ℝ be a function with continuous second order partial derivatives. Then g(Y) is a semimartingale and (Section 4.6.2)

g(Y(t)) − g(Y(0)) = Σ_{i=1}^m ∫_{0+}^t (∂g(Y(s−))/∂y_i) dY_i(s)
+ (1/2) Σ_{i,j=1}^m ∫_{0+}^t (∂²g(Y(s−))/∂y_i ∂y_j) d[Y_i, Y_j]^c(s)
+ Σ_{0<s≤t} [g(Y(s)) − g(Y(s−)) − Σ_{i=1}^m (∂g(Y(s−))/∂y_i) ΔY_i(s)], (7.2)

which for Y with continuous samples becomes

g(Y(t)) − g(Y(0)) = Σ_{i=1}^m ∫₀^t (∂g(Y(s))/∂y_i) dY_i(s) + (1/2) Σ_{i,j=1}^m ∫₀^t (∂²g(Y(s))/∂y_i ∂y_j) d[Y_i, Y_j](s), (7.3)

and for quadratic pure jump Y becomes

g(Y(t)) − g(Y(0)) = Σ_{i=1}^m ∫_{0+}^t (∂g(Y(s−))/∂y_i) dY_i(s) + Σ_{0<s≤t} [g(Y(s)) − g(Y(s−)) − Σ_{i=1}^m (∂g(Y(s−))/∂y_i) ΔY_i(s)]. (7.4)
Note: If Y has continuous samples, then Y(s−) = Y(s), ΔY_i(s) = 0, and [Y_i, Y_j]^c = [Y_i, Y_j] so that Eq. 7.2 becomes Eq. 7.3. If Y is a quadratic pure jump semimartingale, then [Y, Y]^c = 0 (Section 4.5.2), and Eq. 7.2 becomes Eq. 7.4. The formula in Eq. 7.4 applies if, for example, Y is a compound Poisson process. ▲
where the state X is an ℝ^d-valued stochastic process, the input Y takes values
in ℝ^{d′}, a denotes a (d, d)-matrix, and b is a (d, d′)-matrix. The solution of this
differential equation is
where the Green function θ, also called the unit impulse response function in dynamics, satisfies the equation ([175], Section 5.1.1)

∂θ(t, s)/∂t = a(t) θ(t, s), t ≥ s ≥ 0, (7.7)

and θ(s, s) is the identity matrix.
and set X₁ = X and X₂ = Ẋ. The ℝ²-valued process X = (X₁, X₂) is the solution of

d/dt [X₁(t); X₂(t)] = [0, 1; −β(t), −α(t)] [X₁(t); X₂(t)] + [0; 1] Y(t),

that is, it satisfies a differential equation of the type given by Eq. 7.5. ◊
Our objective is to develop differential equations for moments and other
probabilistic properties of X. If Y is a Gaussian process, the probability law of
X is defined completely by its first two moments since Eq. 7.5 is linear. If Y is
not Gaussian, X is generally non-Gaussian so that moments of order higher than
two and/or other statistics are needed to characterize its probability law ([79],
Section 5.2).
where the coordinates of B ∈ ℝ^{d′} are independent Brownian motions. The solution of this differential equation is (Eq. 7.6)
In Eq. 7.8 we can assume without loss of generality that the input is white
noise. If the input to this equation is colored, it can be approximated by the output
Y of a linear filter driven by white noise ([30], Section 2.17). The augmented
vector Z = (X, Y) satisfies a differential equation of the same form as Eq. 7 .8.
Note: If the coefficients a and b of Eq. 7.8 satisfy the conditions in Section 4.7.1.1, then the equation has a unique solution. If the functions θ and b are bounded, the integral in Eq. 7.9 with Brownian motion integrator is a σ(B(s), 0 ≤ s ≤ t)-square integrable martingale. We have also seen that X is a semimartingale (Section 4.7.1.1). ▲
Example 7.2: Let X be the process in Example 7.1, where Y is the solution of dY(t) = ρ(t) Y(t) dt + σ(t) dB(t), t ≥ 0, and B denotes a Brownian motion. The augmented state vector Z = (X₁, X₂, Y) is the solution of

d [X₁(t); X₂(t); Y(t)] = [0, 1, 0; −β(t), −α(t), 1; 0, 0, ρ(t)] [X₁(t); X₂(t); Y(t)] dt + [0; 0; σ(t)] dB(t).
The mean and correlation functions of X in Eq. 7.8 satisfy the differential equations

μ̇(t) = a(t) μ(t),
ṙ(t, t) = a(t) r(t, t) + r(t, t) a(t)^T + b(t) b(t)^T,
∂r(t, s)/∂t = a(t) r(t, s), t ≥ s,

with μ(0) = E[X(0)] and r(0, 0) = E[X(0) X(0)^T]. The covariance function c of X satisfies the same differential equations as r.
X_p(t) X_q(t) − X_p(0) X_q(0) = Σ_{k=1}^d ∫₀^t (δ_{pk} X_q(s) + X_p(s) δ_{qk}) dX_k(s)
+ (1/2) ∫₀^t (d[X_p, X_q](s) + d[X_q, X_p](s)),

that is,

X_p(t) X_q(t) − X_p(0) X_q(0) = ∫₀^t X_q(s) dX_p(s) + ∫₀^t X_p(s) dX_q(s)
+ (1/2) ∫₀^t (d[X_p, X_q](s) + d[X_q, X_p](s)).

The average of the left side of the above equation is r_pq(t, t) − r_pq(0, 0) so that its derivative with respect to t is ṙ_pq(t, t). The average of the first term on the right side of this equation is

E[∫₀^t X_q(s) dX_p(s)] = Σ_u ∫₀^t a_pu(s) r_qu(s, s) ds,

so that its derivative with respect to time is Σ_u a_pu(t) r_qu(t, t). In the same way, the average of ∫₀^t X_p(s) dX_q(s) can be obtained. The quadratic covariation process [X_p, X_q] is (Section 4.5)

[X_p, X_q](t) = Σ_{v=1}^{d′} ∫₀^t b_pv(s) b_qv(s) ds,

with time derivative Σ_{v=1}^{d′} b_pv(t) b_qv(t). The expectation of the last term on the right side of the Itô formula is Σ_{v=1}^{d′} ∫₀^t b_pv(s) b_qv(s) ds. Hence, the Itô formula gives by expectation and differentiation

ṙ_pq(t, t) = Σ_u a_pu(t) r_qu(t, t) + Σ_u a_qu(t) r_pu(t, t) + Σ_{v=1}^{d′} b_pv(t) b_qv(t).
for t ≥ s.
7.2. Linear systems 435
Note that the above equation can also be obtained by using classical analysis. We have

X_p(t) − X_p(s) = Σ_{u=1}^d ∫_s^t a_pu(σ) X_u(σ) dσ + Σ_{v=1}^{d′} ∫_s^t b_pv(σ) dB_v(σ),

and, by multiplying the above equation with X_q(s), the average of the resulting equation is

r_pq(t, s) − r_pq(s, s) = Σ_{u=1}^d ∫_s^t a_pu(σ) r_uq(σ, s) dσ

since E[X_q(s) dB_v(σ)] = 0 for σ > s,
showing that c and r are the solutions of the same differential equations, where the (d′, 1) and (d′, d′) matrices μ_b and ς_b are interpreted as the mean and the intensity of the driving white noise B_b in Eq. 7.11, respectively. The coordinates of the noise B_b are correlated.
If X is the solution of Eq. 7.8 with B replaced by B_b in Eq. 7.11, the mean and correlation functions of X are given by

μ̇(t) = a(t) μ(t) + b(t) μ_b,
ṙ(t, t) = a(t) r(t, t) + r(t, t) a(t)^T + b(t) μ_b μ(t)^T + μ(t) μ_b^T b(t)^T + b(t) ς_b ς_b^T b(t)^T, (7.12)
∂r(t, s)/∂t = a(t) r(t, s) + b(t) μ_b μ(s)^T, t ≥ s,

with μ(0) = E[X(0)] and r(0, 0) = E[X(0) X(0)^T]. The last two formulas hold with c in place of r and μ_b = 0.
Proof: The mean equation results by averaging the defining equation of X. The difference between the equations for X and μ gives the equation for the centered process X − μ, from which the correlation and covariance equations follow.

The probability law of X can be obtained from the properties of the Gaussian process X(t) | (X(0) = ξ) whose first two moments are given by the above equations with (ξ, ξ²) in place of (E[X(0)], E[X(0)²]).
The differential equations for the correlation function of X become
E[W(t)] = μ_w(t),
E[(W(t) − μ_w(t)) (W(s) − μ_w(s))^T] = q_w(t) δ(t − s). (7.13)

The functions μ_w(t) and q_w(t) denote the mean and the intensity of W at time t, respectively, and δ(·) is the Dirac delta function. It is assumed that there is no correlation between the noise and the initial state X(0).
The classical mean and correlation/covariance equations in linear random vibration coincide with Eq. 7.12 for μ_b = μ_w and ς_b ς_b^T = q_w. However, the derivation of these equations in linear random vibration is based on a heuristic definition of white noise (Eq. 7.13) and on formal calculations ([175], Chapter 5).
where X(0) = 0, α > 0 and β are constants, and W is a white noise with μ_w = 0 and q_w = 1 (Eq. 7.13). Denote the solutions of Eq. 7.14 by X_G and X_P if W(t) is a Gaussian white noise W_G(t) = dB(t)/dt and a Poisson white noise W_P(t) = dC(t)/dt, respectively. Suppose that W_G and W_P have the same first two moments as W so that X_G and X_P are equal in the second moment sense. The classical linear random vibration can only deliver the second moment properties of the state so that it cannot distinguish between the processes X_G and X_P. This is a severe limitation since the samples of these processes can differ significantly. Figure 7.1 shows a sample of X_G and three samples of X_P for C(t) = Σ_{k=1}^{N(t)} Y_k, where N is a Poisson process with intensity λ = 5, 10, and 100, and Y_k are independent copies of a Gaussian variable Y₁ with mean zero and variance E[Y₁²] such that λ E[Y₁²] = 1. The qualitative difference between the samples of X_P and X_G decreases as λ increases. ◊
Note: Recall that the mean and correlation functions of C are E[C(t)] = λ t E[Y₁] = 0 and E[C(t) C(s)] = λ E[Y₁²] min(t, s) (Section 3.12). Formal calculations of the type used in the classical linear random vibration show that the mean and the covariance functions of the Poisson white noise W_P(t) = dC(t)/dt are zero and λ E[Y₁²] δ(t − s), respectively. If we set λ E[Y₁²] = 1, the white noise processes W_G and W_P are equal in the second moment sense. ▲
Figure 7.1. A sample of X_G and three samples of X_P for λ = 5, 10, and 100 such that λ E[Y₁²] = 1
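Samples like those in Figure 7.1 are straightforward to generate by an Euler scheme. The sketch below assumes a first-order linear system ẋ = −αx + w(t) as a stand-in for Eq. 7.14 (its exact form is not recoverable here), driven either by Gaussian white noise or by Poisson white noise with matched second moments:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, T, dt = 1.0, 10.0, 1e-3
n = int(T / dt)

def sample_path(lam=None):
    """Euler scheme for x' = -alpha*x + w(t), x(0) = 0, driven by
    Gaussian white noise (lam is None) or by Poisson white noise
    dC/dt with rate lam and N(0, 1/lam) jumps, so that both inputs
    have mean 0 and intensity q_w = 1."""
    x = np.zeros(n + 1)
    for k in range(n):
        if lam is None:
            dw = np.sqrt(dt) * rng.standard_normal()          # dB(t)
        else:
            jumps = rng.poisson(lam * dt)                     # dN(t)
            dw = rng.standard_normal(jumps).sum() / np.sqrt(lam)  # dC(t)
        x[k + 1] = x[k] + (-alpha * x[k]) * dt + dw
    return x

x_gauss = sample_path()
paths = {lam: sample_path(lam) for lam in (5, 10, 100)}
```

As λ grows the compound Poisson input approaches Gaussian white noise and the X_P samples become visually indistinguishable from X_G, in line with the figure.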
If (1) the coefficients of a linear system are time-invariant and the driving noise is weakly stationary, that is, a(t) = a, b(t) = b, μ_w(t) = μ_w, and q_w(t) = q_w, and (2) μ(t) and r(t, t) in Eq. 7.12 become time-invariant as t → ∞, then a stationary solution X_s exists and its mean and covariance functions are given by

a μ_s + b μ_w = 0,
a c_s + c_s a^T + b q_w b^T = 0,
ċ_s(τ) = a c_s(τ), τ ≥ 0, (7.15)

where c_s and c_s(τ) are shorthand notations for c_s(t, t) and c_s(t + τ, t), respectively.
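The second relation in Eq. 7.15 is an algebraic Lyapunov equation for c_s, and the first gives μ_s directly. A quick numerical sketch (the matrices a, b and the noise moments are arbitrary stable choices):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# stable system matrix (eigenvalues with negative real part), noise path b
a = np.array([[0.0, 1.0],
              [-4.0, -0.8]])
b = np.array([[0.0],
              [1.0]])
mu_w = np.array([0.3])
q_w = np.array([[2.0]])

# a mu_s + b mu_w = 0  ->  mu_s = -a^{-1} b mu_w
mu_s = -np.linalg.solve(a, b @ mu_w)

# a c_s + c_s a^T + b q_w b^T = 0  (continuous Lyapunov equation)
c_s = solve_continuous_lyapunov(a, -b @ q_w @ b.T)
```

`solve_continuous_lyapunov(a, q)` solves a X + X aᵀ = q, so passing q = −b q_w bᵀ yields exactly the stationary covariance of Eq. 7.15.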
Proof: If X is weakly stationary, the above equations follow from Eq. 7.12 since μ(t) and c(t, t) do not vary in time so that their time derivatives are zero and c(t, s) depends only on the time lag τ = t − s rather than the times t and s.

The differential equations for μ(t) and the entries c_ij(t, t), i ≥ j, of c(t, t) have the form ξ̇(t) = 𝛼 ξ(t) + 𝛽, t ≥ 0, where 𝛼 and 𝛽 are constant matrices. The matrix 𝛼 is equal to a for μ(t). Hence, μ_s = lim_{t→∞} μ(t) exists and is finite if the eigenvalues of a have negative real part. Similarly, c_s = lim_{t→∞} c(t, t) exists and is finite if the eigenvalues of the corresponding matrix 𝛼 have negative real part ([166], Section 4.1.3, [175], Section 5.1.1).

Suppose that 𝛼 has distinct eigenvalues λ_i with negative real part and let ξ_s = Σ_i d_i v_i be the stationary solution of ξ̇(t) = 𝛼 ξ(t) + 𝛽, where v_i denote the eigenvectors of 𝛼. Then ξ(t) = Σ_i (c_i e^{λ_i t} + d_i) v_i, where c_i are some constants. If ξ(0) = Σ_i d_i′ v_i, then ξ(t) = Σ_i ((d_i′ − d_i) e^{λ_i t} + d_i) v_i. If ξ(0) = ξ_s, then d_i′ − d_i = 0 for all i's so that the general solution is ξ(t) = Σ_i d_i v_i = ξ_s, as stated. ∎
Example 7.5: Consider a linear dynamic system with n degrees of freedom, mass matrix m, damping matrix c, and stiffness matrix k that is subjected to a random forcing function Y. Let X be an ℝⁿ-valued process collecting the system displacements at its degrees of freedom. This process is defined by ([175], Section 5.2.1.1)

m Ẍ(t) + c Ẋ(t) + k X(t) = Y(t), t ≥ 0.

Let ν_i and φ_i, i = 1, ..., n, be the eigenvalues and eigenvectors of the eigenvalue problem ν² m φ = k φ. Suppose that (1) the damping matrix is c = α m + β k, where α, β are some constants, and (2) the eigenvalues ν_i are distinct. Then X has the representation

X(t) = Σ_{i=1}^n φ_i Q_i(t), t ≥ 0, (7.17)

where the modal coordinates Q_i satisfy

Q̈_i(t) + 2 ζ_i ν_i Q̇_i(t) + ν_i² Q_i(t) = φ_i^T Y(t)/m̄_i, (7.18)

the columns of the (n, n) matrix Φ are the eigenvectors φ_i, m̄_i are the non-zero entries of the diagonal matrix Φ^T m Φ, and the modal damping ratios ζ_i are given by (Φ^T c Φ)_ii = 2 ζ_i ν_i m̄_i. Any system response can be represented in the form

X(t) = Σ_{i=1}^n p φ_i Q_i(t), t ≥ 0, (7.19)
Note: The matrices Φ^T m Φ and Φ^T k Φ are diagonal by the definition of the eigenvalue problem. Because c = α m + β k, the matrix Φ^T c Φ is also diagonal. The parameters m̄_i, ζ_i, and k̄_i = ν_i² m̄_i are called modal mass, damping ratio, and stiffness, respectively.

If the matrix Φ^T c Φ is not diagonal, a similar approach can be used for the state vector (X, Ẋ) rather than X. In this case, the system eigenvalues are complex-valued, and a right and a left complex-valued eigenvector corresponds to each eigenvalue ([127], Section 3.7). ▲
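The modal decomposition above is a generalized eigenvalue problem, and the diagonality of Φᵀ m Φ and Φᵀ c Φ under proportional damping can be verified directly. A sketch for an illustrative 3-degree-of-freedom system (the mass/stiffness values and Rayleigh coefficients are arbitrary choices):

```python
import numpy as np
from scipy.linalg import eigh

# illustrative 3-DOF chain system
m = np.diag([2.0, 1.5, 1.0])
k = np.array([[ 6.0, -2.0,  0.0],
              [-2.0,  4.0, -2.0],
              [ 0.0, -2.0,  2.0]])
alpha, beta = 0.1, 0.02
c = alpha * m + beta * k            # proportional (Rayleigh) damping

# generalized eigenvalue problem  nu^2 m phi = k phi
nu2, phi = eigh(k, m)               # columns of phi are eigenvectors
nu = np.sqrt(nu2)                   # modal frequencies

m_bar = np.diag(phi.T @ m @ phi)    # modal masses
# modal damping ratios from (phi^T c phi)_ii = 2 zeta_i nu_i m_bar_i
zeta = np.diag(phi.T @ c @ phi) / (2.0 * nu * m_bar)
```

For Rayleigh damping the ratios reduce to ζ_i = (α + β ν_i²)/(2 ν_i), which the decomposition reproduces.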
Example 7.6: Consider a special case of the previous example with Y replaced by −m 1 W(t), where 1 denotes an (n, 1) matrix with unit entries and W is a scalar white noise with mean zero and intensity q_w(t) = q_w = π g₀. This case corresponds to a building with n degrees of freedom subjected to seismic ground acceleration modeled as a white noise with a one-sided spectral density of intensity g₀ > 0. The stationary displacement vector X(t) has mean zero and covariance Φ γ_q Φ^T, where γ_q denotes the covariance matrix of the n-dimensional vector Q(t) = (Q₁(t), ..., Q_n(t)). The entries (i, i) of γ_q are

(γ_q)_ii = Γ_i² π g₀/(4 ζ_i ν_i³), i = 1, ..., n, (7.20)

where Γ_i = (Φ^T m 1)_i/m̄_i are called modal participation factors. The correlation coefficients between the modal coordinates are
where χ = 4 ζ_i ζ_j λ (λ μ + 1)(λ + μ) + (1 − λ²)², λ = ν_i/ν_j, and μ = ζ_i/ζ_j. The above results can be used to calculate the second moment properties of the response of a linear system subjected to seismic ground acceleration and develop approximations for response maxima ([175], Example 5.10, p. 190). ◊
Proof: An arbitrary pair of modal coordinates, Q_i and Q_j, i ≠ j, satisfies the differential equations

d/dt [Q_i(t); Q̇_i(t); Q_j(t); Q̇_j(t)] =
[0, 1, 0, 0;
−ν_i², −2 ζ_i ν_i, 0, 0;
0, 0, 0, 1;
0, 0, −ν_j², −2 ζ_j ν_j] [Q_i(t); Q̇_i(t); Q_j(t); Q̇_j(t)] + [0; −Γ_i; 0; −Γ_j] W(t).

The above equation defines the evolution of the state vector (Q_i, Q̇_i, Q_j, Q̇_j). The second moment properties given by Eqs. 7.20 and 7.21 result from Eq. 7.15 applied to this state vector. ∎
Example 7.7: Let X be the solution of Eq. 7.5 with constant coefficients, that is, a(t) = a and b(t) = b so that the Green function θ in Eq. 7.7 depends only on the time lag. Suppose that (1) the system is causal, that is, θ(τ) = 0 for τ < 0, (2) Y is a weakly stationary process with mean zero, correlation function r_y, and spectral density s_y, and (3) Eq. 7.5 admits a weakly stationary solution. The spectral density of X is

s_x(ν) = h(ν) b s_y(ν) b^T h(ν)^{*T}, where h(ν) = ∫₀^∞ θ(u) e^{−√−1 ν u} du.
Proof: We have

X(t) = ∫_{−∞}^t θ(t − σ) b Y(σ) dσ = ∫₀^∞ θ(u) b Y(t − u) du

since the system is causal and time invariant. The starting time −∞ is needed for stationary response ([175], Section 5.2.2). The second equality in the above equation follows from the change of variable u = t − σ. Because Y has mean zero, the expectation of X is zero at all times. If the mean of Y is a constant μ_y ≠ 0, then E[X(t)] = ∫₀^∞ θ(u) du b μ_y (Section 3.9.3.3).

The correlation function of X,

E[X(t) X(s)^T] = ∫₀^∞ ∫₀^∞ θ(u) b E[Y(t − u) Y(s − v)^T] b^T θ(v)^T du dv,

depends only on τ = t − s, and its Fourier transform is

s(ν) = (1/(2π)) ∫_ℝ e^{−√−1 ν τ} r(τ) dτ = (1/(2π)) ∫_ℝ e^{−√−1 ν (τ − u + v)} e^{−√−1 ν (u − v)} r(τ) dτ,

where s_y(ν) = (1/(2π)) ∫_ℝ e^{−√−1 ν (τ − u + v)} r_y(τ − u + v) dτ for each (u, v). ∎
Example 7.8: Let X(t), t ≥ 0, be the solution of Eq. 7.5 with a(t) = a, b(t) = b, X(0) = x, and a Gaussian white noise input Y = W with second moment properties in Eq. 7.13. Suppose that Eq. 7.5 has a stationary solution X_s and let μ_s and c_s(·) denote its mean and covariance functions. The process

X̃(t) = X_s(t) | (X_s(0) = x)

is a version of X(t), t ≥ 0, defined by Eq. 7.5 with X(0) = x. The mean and covariance functions of X̃ are

μ̃(t) = μ_s + θ(t) (x − μ_s),
c̃(t, s) = c_s(t − s) − θ(t) c_s(0) θ(s)^T. (7.25)
Proof: That X̃ is Gaussian with the first two moments in Eq. 7.25 follows from properties of Gaussian vectors (Section 2.11.5). Because X and X̃ are Gaussian processes, they are versions if they have the same second moment properties. The stationary solution X_s satisfies the differential equation Ẋ_s(t) = a X_s(t) + b Y(t) so that Ẋ̃(t) = a X̃(t) + b Ỹ(t), t ≥ 0, results from Eq. 7.5 by conditioning with respect to X_s(0) = x, where Ỹ(t) = Y(t) | X_s(0) and X̃(0) = x. Because Y is a white noise process, Ỹ(t) = W(t) | X_s(0) = W(t) so that X and X̃ satisfy the same differential equation and initial condition.

The extension to a colored stationary Gaussian input Y is discussed in [76] for the case in which Y is defined by some of the coordinates of the state of a linear filter driven by Gaussian white noise. ∎
Example 7.9: Let X be the solution of Ẋ(t) = −α X(t) + Y(t), t ≥ 0, where X(0) = x, Y is the stationary solution of Ẏ(t) = −β Y(t) + W(t), W is a stationary Gaussian white noise with mean μ_w and intensity q_w, α ≠ β, and α, β > 0. The mean and variance functions of X(t), t ≥ 0, are

μ(t) = x e^{−α t} + (μ_w/(α β)) (1 − e^{−α t}),
c(t, t) = (q_w/(2 β (α + β))) [(1 − e^{−2 α t})/α − 2 (e^{−(α+β) t} − e^{−2 α t})/(α − β)]. (7.26)
Proof: The augmented state vector (X, Y) ∈ ℝ² satisfies the differential equation

d/dt [X(t); Y(t)] = [−α, 1; 0, −β] [X(t); Y(t)] + [0; 1] W(t).

This system has a stationary solution (X_s, Y_s) since it has constant coefficients satisfying the conditions in Eq. 7.15. The first two moments of (X_s, Y_s) are μ₁ = E[X_s(t)] = μ_w/(α β), μ₂ = E[Y_s(t)] = μ_w/β, γ_{1,1} = E[X̃_s(t)²] = q_w/[2 α β (α + β)], γ_{1,2} =
γ_{2,1} = E[X̃_s(t) Ỹ_s(t)] = q_w/[2 β (α + β)], and γ_{2,2} = E[Ỹ_s(t)²] = q_w/(2 β), where Z̃ = Z − E[Z] denotes the centered variable Z. The Green function is

θ(τ) = [e^{−α τ}, (e^{−β τ} − e^{−α τ})/(α − β); 0, e^{−β τ}],

so that the mean and covariance functions of (X̃(t), Ỹ(t)) = (X_s(t), Y_s(t)) | (X_s(0) = x, Y_s(0) = y) are (Eq. 7.25)

μ̃₁(t) = μ₁ + (x − μ₁) e^{−α t} + (y − μ₂) (e^{−β t} − e^{−α t})/(α − β),

c̃₁,₂(t, t) = c̃₂,₁(t, t)
= (q_w/(2 β)) {1/(α + β) − (1/(α + β)) e^{−(α+β) t} − (1/(α − β)) e^{−β t} (e^{−β t} − e^{−α t})}.
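Eq. 7.26 can be checked numerically by integrating the covariance equation ċ(t, t) = a c + c aᵀ + b q_w bᵀ for the augmented state (X, Y), starting from c₁₁(0) = c₁₂(0) = 0 and c₂₂(0) = q_w/(2β), since X(0) is deterministic and Y(0) is stationary and independent of it. A sketch with arbitrary parameter values:

```python
import numpy as np

alpha, beta, qw = 1.3, 0.7, 2.0

A = np.array([[-alpha, 1.0], [0.0, -beta]])
BQB = np.array([[0.0, 0.0], [0.0, qw]])   # b q_w b^T with b = (0, 1)^T

def cov_at(t, dt=1e-4):
    """Euler integration of c' = A c + c A^T + b q_w b^T from the
    initial covariance of (X(0) deterministic, Y(0) stationary)."""
    c = np.array([[0.0, 0.0], [0.0, qw / (2.0 * beta)]])
    for _ in range(int(t / dt)):
        c = c + dt * (A @ c + c @ A.T + BQB)
    return c

def c11_closed(t):
    """Variance of X(t) from Eq. 7.26."""
    return qw / (2 * beta * (alpha + beta)) * (
        (1 - np.exp(-2 * alpha * t)) / alpha
        - 2 * (np.exp(-(alpha + beta) * t) - np.exp(-2 * alpha * t))
          / (alpha - beta))
```

The (1, 1) entry of the integrated covariance matrix should agree with the closed form up to the Euler discretization error.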
Example 7.10: Let t ∈ (0, l) in Eq. 7.5 be a space parameter and let Y = W be a white noise with the second moment properties in Eq. 7.13. The solution of this equation satisfies the homogeneous boundary conditions Σ_{i=1}^d α_{pi} X_i(0) = 0 and Σ_{i=1}^d β_{qi} X_i(l) = 0, where p = 1, ..., d₁, q = 1, ..., d − d₁, 0 ≤ d₁ ≤ d, and (α_{pi}, β_{qi}) are some constants. Generally, one or more coordinates of X(0) are not known so that Eqs. 7.10 and 7.12 cannot be solved. This is the essential difference between the initial value problem considered previously in this chapter and the boundary value problem in this example.
where γ_{w,0}(t) = E[X(0) W(t)^T] and θ denotes the Green function in Eq. 7.7. Initial conditions for these equations are given by the second moment properties of X(0) = ξ Z, where ξ is a (d, d) constant matrix depending on θ(l, 0) and the coefficients (α_{pi}, β_{qi}) in the boundary conditions and Z = ∫₀^l θ(l, s) b(s) W(s) ds is an ℝ^d-valued random variable with mean and covariance matrices
The above covariance equations differ in two ways from the covariance
equation used in linear random vibration. The equations for c(t, t) and c(t, s)
include new terms and c(O, 0) is not generally available but can be determined
from the relationship between X(0) and X(l). ◊
Proof: To derive Eq. 7.27, we use the formal definition of white noise in Eq. 7.13. The solution of Eq. 7.8 at t = l gives the relationship between X(l) and X(0), so that

c(t, t) = E[X(t) X(t)^T]
= θ(t, 0) c(0, 0) θ(t, 0)^T + θ(t, 0) ∫₀^t γ_{w,0}(s) b(s)^T θ(t, s)^T ds
+ [∫₀^t θ(t, s) b(s) γ_{w,0}(s)^T ds] θ(t, 0)^T
+ ∫₀^t ds ∫₀^t du θ(t, s) b(s) E[W(s) W(u)^T] b(u)^T θ(t, u)^T,
where γ_{w,0}(t) = E[X(0) W(t)^T]. The above double integral degenerates into

∫₀^t θ(t, s) b(s) q_w(s) b(s)^T θ(t, s)^T ds

by the definition of the driving noise. The derivative of the covariance function of X(t) follows from

c(t, s) = E[X(t) X(s)^T] = E[(θ(t, s) X(s) + ∫_s^t θ(t, u) b(u) W(u) du) X(s)^T]

since X(t) = θ(t, s) X(s) + ∫_s^t θ(t, u) b(u) W(u) du. ∎
Example 7.11: Let V, Φ, M, and Q be the displacement, the rotation, the bending moment, and the shear force in a beam of length l fixed at the left end and simply supported at the right end. The state vector X = (V, Φ, M, Q) satisfies the differential equations V̇(t) = Φ(t), Φ̇(t) = −a M(t), Ṁ(t) = Q(t), and Q̇(t) = −W(t), where 1/a is the beam stiffness and W is a distributed white noise load with mean μ_w and intensity q_w.
The mean functions of the state coordinates are

μ₁(t) = −a μ_w l⁴ (−ξ⁴/24 + 5 ξ³/48 − ξ²/16),
μ₂(t) = −a μ_w l³ (−ξ³/6 + 5 ξ²/16 − ξ/8),
μ₃(t) = μ_w l² (−ξ²/2 + 5 ξ/8 − 1/8),
μ₄(t) = μ_w l (−ξ + 5/8),

where ξ = t/l ∈ (0, 1). Figure 7.2 shows the normalized functions μ̄₁ = μ₁/(a μ_w l⁴), μ̄₂ = μ₂/(a μ_w l³), μ̄₃ = μ₃/(μ_w l²), and μ̄₄ = μ₄/(μ_w l).
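The four mean functions can be checked against the state equations μ̇₁ = μ₂, μ̇₂ = −a μ₃, μ̇₃ = μ₄, μ̇₄ = −μ_w and the boundary conditions V(0) = Φ(0) = 0 and V(l) = M(l) = 0. A quick numerical check (unit values of a, μ_w, l are arbitrary normalizations):

```python
import numpy as np

a, mu_w, l = 1.0, 1.0, 1.0

def mu(xi):
    """Mean functions of (V, Phi, M, Q) at xi = t/l."""
    m1 = -a * mu_w * l**4 * (-xi**4 / 24 + 5 * xi**3 / 48 - xi**2 / 16)
    m2 = -a * mu_w * l**3 * (-xi**3 / 6 + 5 * xi**2 / 16 - xi / 8)
    m3 = mu_w * l**2 * (-xi**2 / 2 + 5 * xi / 8 - 1.0 / 8)
    m4 = mu_w * l * (-xi + 5.0 / 8)
    return np.array([m1, m2, m3, m4])

def dmu(xi, h=1e-6):
    """Central finite difference of mu with respect to t = l * xi."""
    return (mu(xi + h) - mu(xi - h)) / (2 * h * l)
```

The mean bending moment at the fixed end, μ₃(0) = −μ_w l²/8, matches the classical fixed-end moment of a propped cantilever under a uniform load of intensity μ_w.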
The differential equations for the variance functions of X are (Eq. 7 .27)
[Figure: plots of the variance functions c₁,₁(t, t) and c₄,₄(t, t) of X versus t/l.]
Integrating the state equations from 0 to l gives

V(l) = V(0) + l Φ(0) − (a l²/2) M(0) − (a l³/6) Q(0) + (a/6) ∫₀^l (l − s)³ W(s) ds,
Φ(l) = Φ(0) − a l M(0) − (a l²/2) Q(0) + (a/2) ∫₀^l (l − s)² W(s) ds,
s_x(ν) = s_y(ν)/|−ν² + √−1 β ν + ν₀² − h(ν)|²,

where h(ν) = ∫_{−∞}^∞ k(u) e^{−√−1 ν u} du. The process X describes the torsional motion of an airplane wing in a turbulent wind and Y models the effect of the buffeting forces on the wing. ◊
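For a concrete memory kernel the transfer function h and the output spectrum are easy to evaluate numerically. The exponential kernel k(u) = e^{−|u|} below is an illustrative choice, not the book's (its transform is h(ν) = 2/(1 + ν²)), as are the values of β and ν₀:

```python
import numpy as np

beta, nu0 = 0.4, 2.0

def h(nu, umax=50.0, n=200001):
    """Fourier transform of the kernel k(u) = exp(-|u|):
    h(nu) = integral of k(u) exp(-i nu u) du = 2 / (1 + nu^2),
    computed here by the trapezoid rule on a truncated grid."""
    u = np.linspace(-umax, umax, n)
    f = np.exp(-np.abs(u)) * np.exp(-1j * nu * u)
    du = u[1] - u[0]
    return (f[:-1] + f[1:]).sum() * du / 2.0

def s_x(nu, s_y=1.0):
    """Output spectral density for a flat input spectrum s_y."""
    denom = -nu**2 + 1j * beta * nu + nu0**2 - h(nu)
    return s_y / abs(denom) ** 2
```

The bound |h(ν)| ≤ ∫|k(u)| du used in the proof below is visible here: |h| never exceeds h(0) = 2.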
Proof: We first note that the moment equations developed in this section cannot be applied to find the second moment properties of X defined by the above equation.

Let X_s(t) = ∫_{−∞}^∞ e^{√−1 ν t} dZ(ν) be the spectral representation of X_s, where Z is a process with E[dZ(ν)] = 0 and E[dZ(ν) dZ(ν′)*] = s_x(ν) δ(ν − ν′) dν (Section 3.9.4.1). The second moment of the increments of Z depends on the spectral density s_x of X, which is not known. The representation of X_s gives
∫_{−∞}^∞ k(t − σ) X_s(σ) dσ = ∫_{−∞}^∞ [∫_{−∞}^∞ k(u) e^{−√−1 ν u} du] e^{√−1 ν t} dZ(ν) = ∫_{−∞}^∞ h(ν) e^{√−1 ν t} dZ(ν)

by the change of variables u = t − σ and the definition of h. Also, h(ν) is bounded for each ν since |h(ν)| ≤ ∫_{−∞}^∞ |k(u)| du, which is finite. The defining equation of X and the above equality give

∫_{−∞}^∞ [−ν² + √−1 β ν + ν₀² − h(ν)] e^{√−1 ν t} dZ(ν) = Y(t).
Let X̂ denote the left side of the above equation. Then, X̂ and Y must have the same second moment properties. The process X̂ has mean zero consistently with Y. The second
moment of X̂ is

E[X̂(t)²] = ∫_{−∞}^∞ |−ν² + √−1 β ν + ν₀² − h(ν)|² s_x(ν) dν,

so that

∫_{−∞}^∞ [|−ν² + √−1 β ν + ν₀² − h(ν)|² s_x(ν) − s_y(ν)] dν = 0,

which gives the stated expression of s_x. ∎
where the coordinates of the ℝ^{d_p}-valued process p are polynomials of S with coefficients that may depend on time.
Note: We have seen that a real-valued process S defined on a probability space (Ω, ℱ, P) endowed with a right continuous filtration (ℱ_t)_{t≥0} is a semimartingale if and only if it is a classical semimartingale, that is, it has the representation S(t) = S(0) + A(t) + M(t), where S(0), A, and M have the properties in Eq. 7.30 (Section 4.4.1 in this book, [147], Theorem 14, p. 105).

The matrices b in Eqs. 7.28 and 7.29 have the dimensions (d, d′) and (d, d_p), respectively. We use the same notation for simplicity. It is assumed that a and b are such that the solutions of Eqs. 7.28 and 7.29 exist and are unique (Sections 4.7.1.1 and 4.7.2). ▲
The martingale component of S in Eq. 7.30 is assumed to admit the representation

dM(t) = H(t) dB(t) + K(t) dC̄(t), (7.32)

where the entries of the (d′, d_b) and (d′, d_c) matrices H and K are processes in 𝓛, the coordinates of B and C̄ are independent Brownian motion and compensated compound Poisson processes, respectively, the jumps of the coordinates of C̄ have finite mean, B and C̄ are independent of each other, H is independent of (K, C̄), and K is independent of (H, B).
Note: Recall that the processes in 𝓛 are ℱ_t-adapted and have caglad paths (Section 4.4.1). The stochastic integrals defined in Section 4.4.3 are for integrands in 𝓛 and semimartingale integrators. We also note that the processes defined by Eqs. 7.30 and 7.32 do not include all semimartingales. For example, a general Lévy process (Section 3.14.2) does not admit the representation in these equations.
Since M(0) = 0, we have (Eq. 7.32)

M(t) = ∫_{0+}^t H(s) dB(s) + ∫_{0+}^t K(s) dC̄(s).
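The compensated compound Poisson part of M is a zero-mean martingale with Var[C̄(t)] = λ t E[Y₁²], which a small simulation illustrates (rate, jump law, and horizon below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
lam, t, n_paths = 4.0, 2.0, 100000

# C(t) = sum_{k=1}^{N(t)} Y_k with N ~ Poisson(lam * t), Y_k ~ N(1, 1);
# the compensated process is C_bar(t) = C(t) - lam * t * E[Y_1]
n_jumps = rng.poisson(lam * t, size=n_paths)
c_t = np.array([rng.normal(1.0, 1.0, k).sum() for k in n_jumps])
c_bar = c_t - lam * t * 1.0          # E[Y_1] = 1 here

mean_c_bar = c_bar.mean()            # should be near 0
var_c_bar = c_bar.var()              # theory: lam * t * E[Y_1^2] = 8 * 2
```

With Y₁ ~ N(1, 1) we have E[Y₁²] = 2, so the sample variance should be close to λ t E[Y₁²] = 16.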
are satisfied at each time t ≥ 0 for all indices i = 1, ..., d′, u = 1, ..., d_b, and v = 1, ..., d_c, then M_i, i = 1, ..., d′, are square integrable martingales. The second moment of M_i(t) = Σ_u H_iu · B_u(t) + Σ_v K_iv · C̄_v(t) is E[M_i(t)²] = Σ_u E[(H_iu · B_u(t))²] + Σ_v E[(K_iv · C̄_v(t))²] by the properties of H, K, B, and C̄. We have seen in Section 4.5.3 that E[(K_iv · C̄_v(t))²] = E[∫_{0+}^t K_iv(s)² d[C̄_v, C̄_v](s)]. Hence, M_i is a square integrable martingale under the above conditions.
Let 𝒢_t, t ≥ 0, be the filtration generated by B and C̄. Generally, 𝒢_t is smaller than the filtration ℱ_t considered at the beginning of this section. If the processes H and K are 𝒢_t-adapted and have caglad paths, then M is a 𝒢_t-local martingale. If in addition the process A in the representation of S (Eq. 7.30) is also 𝒢_t-adapted, then S is a 𝒢_t-semimartingale. For example, A can be a deterministic function with continuous samples of finite variation on compacts and the processes H and K can be memoryless transformations of B and C̄, respectively. ▲
The second moment of ∫_{0+}^t 3 (B(s)² − s) dB(s) is 9 ∫₀^t 2 s² ds = 6 t³. The square of ∫_{0+}^t 2 C(s−) dC(s) is 4 Σ_{k=1}^{N(t)} (Σ_{j=1}^{k−1} Y_j)² Y_k² so that its expectation is finite provided that E[Y₁²] < ∞ since C has a.s. a finite number of jumps in (0, t]. Hence, ∫_{0+}^t 2 C(s−) dC(s) is also a square integrable martingale. Since this process is independent of ∫_{0+}^t 3 (B(s)² − s) dB(s), M is a square integrable martingale. ∎
Let
Our objective is to find the first two moments and/or other probabilistic
properties of the ℝ^d-valued processes X in Eqs. 7.35 and 7.36 provided they exist.
The analysis is based on the Ito formula for semimartingales applied to functions
of X or to functions of an augmented state vector Z, which includes X. We
refer to these two approaches as the direct and the state augmentation methods,
respectively (Table 7.1). The first method is used in the next two sections to find
properties of X in Eq. 7.28 for square integrable and general martingale inputs.
Section 7.7.2.3 applies the state augmentation method to solve Eqs. 7.28 and 7.29.
Numerical examples are used to illustrate the application of these methods.
Let X be the solution of Eq. 7.28 with S in Eq. 7.30 and M in Eq. 7.32, where C̄_q(t) = C_q(t) − λ_q t E[Y_{q,1}], C_q(t) = Σ_{k=1}^{N_q(t)} Y_{q,k}, N_q are Poisson processes with intensities λ_q > 0, and Y_{q,k} are independent copies of Y_{q,1}, q = 1, ..., d_c. It is assumed that C is in L₂, the conditions in Eq. 7.33 are satisfied, and the coefficients a, b in Eq. 7.28 are such that this equation has a unique solution.
We now derive differential equations for the mean function μ(t) = E[X(t)] and the correlation function r(t, s) = E[X(t) X(s)^T] of X.
where H̃(t) = b(t) H(t), K̃(t) = b(t) K(t), and e is a (d_c, d_c) matrix with non-zero entries e_uu = √(λ_u E[Y_{u,1}²]), u = 1, ..., d_c.
Proof: The assumption that A is deterministic restricts the class of acceptable inputs. How-
ever, this assumption is adequate for the type of stochastic problems considered in the book,
as demonstrated by the examples in this and following chapters.
The average of Eq. 7.35 gives
since E[dM(t)] = 0. The derivative with respect to time of this equation yields the above
differential equation for JL (Eq. 7.7).
The Itô formula applied to x ↦ x_p x_q and the process X in Eq. 7.35 gives

X_p(t) X_q(t) − X_p(0) X_q(0)
= Σ_{i=1}^d ∫_{0+}^t (δ_{pi} X_q(s−) + X_p(s−) δ_{qi}) [Σ_{j=1}^d a_ij(s) X_j(s−) ds + Σ_{u=1}^{d′} b_iu(s) (dA_u(s) + dM_u(s))]
+ (1/2) Σ_{i,j=1}^d ∫_{0+}^t (δ_{pi} δ_{qj} + δ_{pj} δ_{qi}) Σ_{u,v=1}^{d_b} H̃_iu(s) H̃_jv(s) d[B_u, B_v](s)
+ Σ_{0<s≤t} [X_p(s) X_q(s) − X_p(s−) X_q(s−) − X_q(s−) ΔX_p(s) − X_p(s−) ΔX_q(s)],
where ΔX_j(s) = X_j(s) − X_j(s−) is the jump of X_j at time s. The differentiation with respect to time of the expected value of the above equation gives the second formula in Eq. 7.38. Except for the last term on the right side of the above equation, the calculations are straightforward and are not presented. The last term takes the simpler form

Σ_{0<s≤t} ΔX_p(s) ΔX_q(s),

which is zero at a time s unless both X_p and X_q have a jump at this time. The jumps of X_p are ΔX_p(s) = Σ_{v=1}^{d_c} K̃_pv(s) ΔC̄_v(s). Two distinct coordinates of C̄ have a jump at the same time with zero probability since C̄ has independent coordinates so that

ΔX_p(s) ΔX_q(s) = Σ_{v=1}^{d_c} K̃_pv(s) K̃_qv(s) (ΔC̄_v(s))².

Consider a small time interval (s, s + Δs] that will be reduced to a point by taking the limit Δs ↓ 0. If λ_v Δs ≪ 1, a component C̄_v of C̄ has at least one jump during the time interval (s, s + Δs] with probability 1 − e^{−λ_v Δs} ≈ λ_v Δs. We have

X_p(t) − X_p(s) = ∫_{s+}^t dX_p(u), t ≥ s,

and the differential equation for r(t, s), t ≥ s, results by multiplying this relation with X_q(s), calculating the expectation of the resulting equation, and differentiating this equation with respect to t. ∎
E[dM(t) dM(t)^T] = [H̃(t) H̃(t)^T + K̃(t) e (K̃(t) e)^T] dt,

where e is defined in Eq. 7.38 and E[dS(t)] = dA(t). Hence, the formal second moment calculus used in classical linear random vibration theory can be applied to find the second moment properties of X under the above conditions.
where μ(p; t) = E[X(t−)^p] = E[X(t)^p]. The expectations E[X(t−)^p] and E[X(t)^p] coincide since P(X(t) = X(t−)) = 1 at a fixed t.

If A(t) = 0 and H(t) = K(t) = 1, X is the solution of a stochastic differential equation driven by Brownian and Poisson white noise, and its first two moments satisfy the differential equations μ̇(1; t) = a μ(1; t) and μ̇(2; t) = 2 a μ(2; t) + b² (1 + λ E[Y₁²]). These moment equations are in agreement with our previous calculations (Section 4.7.1.3). ◊
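The two moment equations can be checked by Monte Carlo simulation of the underlying equation dX(t) = a X(t) dt + b (dB(t) + dC̄(t)). The parameter values below are arbitrary, and the jumps are taken as Y₁ ~ N(0, 1) so that E[Y₁] = 0 and the compensated increment equals the raw one:

```python
import numpy as np

rng = np.random.default_rng(7)
a, b, lam = -1.0, 0.5, 2.0
T, dt, n_paths = 1.0, 0.005, 200000
n_steps = int(T / dt)

x = np.zeros(n_paths)
for _ in range(n_steps):
    dB = np.sqrt(dt) * rng.standard_normal(n_paths)
    # compound Poisson increments with N(0, 1) jumps; conditionally on
    # k jumps the increment is N(0, k), i.e. sqrt(k) * N(0, 1)
    n_jumps = rng.poisson(lam * dt, n_paths)
    dC = rng.standard_normal(n_paths) * np.sqrt(n_jumps)
    x = x + a * x * dt + b * (dB + dC)

# moment equation: mu2' = 2 a mu2 + b^2 (1 + lam E[Y_1^2]), mu2(0) = 0
mu2_exact = b**2 * (1 + lam) / (-2 * a) * (1 - np.exp(2 * a * T))
mc_mu2 = (x**2).mean()
```

The Euler bias is of order dt and the statistical error of order n_paths^{-1/2}, both well below one percent here.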
Proof: The formulas in Eq. 7.38 do not apply since A is a random process. The above equations involving moments of X have been obtained by Itô's formula applied to the function X(t)^p. Recall the notation μ(t) = μ(1; t) = E[X(t)], r(t, t) = μ(2; t) = E[X(t)²], H̃(t) = b H(t), K̃(t) = b K(t), and e = √(λ E[Y₁²]). ∎
Example 7.15: Let X be defined by dX(t) = a X(t) dt + b dS(t), where dS(t) = dM(t) and M(t) = B(t)³ − 3 t B(t). The differential equations,

Note: The differential form of Itô's formula applied to the function B(t) ↦ B(t)ⁿ for an integer n ≥ 1 gives
The Ito formula in Eq. 7.2 can be applied to develop differential equations
for higher order moments of X and other probabilistic properties. However, the
determination of some coefficients of these equations can be cumbersome. Some
computational difficulties involved in the determination of higher order moments
of X are illustrated by the following example.
Example 7.16: Let X be defined by dX(t) = a(t) X(t) dt + b(t) dS(t), where dS(t) = dM(t) and dM(t) = H(t) dB(t). The moments μ(p; t) = E[X(t)^p] are the solutions of

μ̇(p; t) = p a(t) μ(p; t) + (1/2) p (p − 1) b(t)² E[X(t)^{p−2} H(t)²].

Proof: Itô's formula applied to the function X(t) ↦ X(t)^p gives

X(t)^p − X(0)^p = ∫₀^t p X(s)^{p−1} dX(s) + (1/2) ∫₀^t p (p − 1) X(s)^{p−2} d[X, X](s),

which yields the stated moment equation by averaging and differentiation with respect to time since d[X, X](s) = b(s)² H(s)² ds.
For X(0) = 0, the state is X(t) = ∫₀^t θ(t, s) b(s) H(s) dB(s), so that the expectations E[X(t)^{p−2} H(t)²] in the moment equations involve the joint law of X and H.

E[K̃(t) e e^T K̃(t)^T] in Eq. 7.38 may not exist. We attempt here to develop a differential equation for the characteristic function φ(u; t) = E[e^{√−1 u^T X(t−)}], u ∈ ℝ^d, of X(t) because this function is always defined.
Proof: The arguments u and t of the characteristic function φ(u; t) are not shown in the above equation for simplicity.
The Itô formula in Eq. 7.2 applied to X(t) ↦ exp(√−1 u^T X(t)) gives

e^{√−1 u^T X(t)} − e^{√−1 u^T X(0)} = √−1 Σ_{k=1}^d u_k ∫_{0+}^t e^{√−1 u^T X(s−)} dX_k(s)
− (1/2) Σ_{k,l=1}^d ∫_{0+}^t u_k u_l e^{√−1 u^T X(s−)} d[X_k, X_l]^c(s)
+ Σ_{0<s≤t} [e^{√−1 u^T X(s)} − e^{√−1 u^T X(s−)} − √−1 Σ_{k=1}^d u_k e^{√−1 u^T X(s−)} ΔX_k(s)], (7.40)

where

dX_k(s) = Σ_{l=1}^d a_kl(s) X_l(s−) ds + Σ_{r=1}^{d′} b_kr(s) dA_r(s)
+ Σ_{p=1}^{d_b} H̃_kp(s) dB_p(s) + Σ_{q=1}^{d_c} K̃_kq(s) dC̄_q(s),

d[X_k, X_l]^c(s) = Σ_{p,p′=1}^{d_b} H̃_kp(s) H̃_lp′(s) d[B_p, B_p′](s) = Σ_{p=1}^{d_b} H̃_kp(s) H̃_lp(s) ds,
and the expected value of the jump term over a small interval (t, t + Δt] is approximately

Σ_{q=1}^{d_c} E[e^{√−1 u^T X(t−)} e^{√−1 Σ_{k=1}^d u_k K̃_kq(t) Y_{q,1}} − e^{√−1 u^T X(t−)} | D_q] P(D_q)
= Δt Σ_{q=1}^{d_c} λ_q (E[e^{√−1 u^T X(t−)} e^{√−1 Σ_{k=1}^d u_k K̃_kq(t) Y_{q,1}}] − φ(u; t)),

where D_q denotes the event that C̄_q has a jump in (t, t + Δt]. ∎
Generally, Eq. 7.39 is not a partial differential equation for the characteristic function of X since some of the expectations in this equation cannot be expressed in terms of derivatives of φ. The condition in Eq. 7.39 becomes a partial differential equation for φ if, for example, H and K are deterministic functions, in which case X is the state of a linear system driven by Gaussian and Poisson white noise (Table 7.1). This special case has been considered in the previous section under the assumption Y_{q,1} ∈ L₂, q = 1, ..., d_c.
where a, b₁, and b₂ are deterministic functions of time. The input consists of a Brownian motion B and a compensated compound Poisson process C̄. The characteristic function of X satisfies the partial differential equation

∂φ/∂t = u a(t) ∂φ/∂u − (1/2) u² b₁(t)² φ + λ (E[e^{√−1 u b₂(t) Y₁}] − 1) φ,
458 Chapter 7. Deterministic Systems and Stochastic Input
where the arguments of the characteristic function are not shown. ◊
Proof: Apply Eq. 7.39 for $d = d' = 1$, $A(t) = 0$, $H(t) = b_1(t)$, and $K(t) = b_2(t)$. A special case of the above equation corresponding to $\lambda = 0$ has been discussed in Section 4.7.1.3. ■
Because the expectations on the right side of this equation cannot be expressed as partial derivatives of $\varphi$, it is not possible to obtain a partial differential equation for the characteristic function of $X$. ◊
where $\rho > 0$ and $\alpha \in (0,2)$ are some constants and $L_\alpha$ denotes an $\alpha$-stable Lévy motion. The characteristic function of $X$ satisfies the condition (Eq. 7.39)
$$\frac{\partial\varphi}{\partial t} = -\rho\,u\,\frac{\partial\varphi}{\partial u} - |u|^\alpha\,\varphi.$$
Proof: Note that the driving noise $L_\alpha$ does not admit the representation in Eqs. 7.30 and 7.32, so that Eq. 7.39 cannot be applied. The Itô formula in Eq. 7.3 applied to the function $X(t) \mapsto e^{\sqrt{-1}\,u\,X(t)}$ gives, since $[X,X]^c(t) = 0$ and $\int_{0+}^{t} e^{\sqrt{-1}\,u\,X(s-)}\,dL_\alpha(s) = \sum_{0<s\le t} e^{\sqrt{-1}\,u\,X(s-)}\,\Delta X(s)$,
$$e^{\sqrt{-1}\,u\,X(t)} - e^{\sqrt{-1}\,u\,X(0)} = -\sqrt{-1}\,u\,\rho \int_{0+}^{t} e^{\sqrt{-1}\,u\,X(s-)}\,X(s-)\,ds + \sum_{0<s\le t} e^{\sqrt{-1}\,u\,X(s-)}\,\big(e^{\sqrt{-1}\,u\,\Delta L_\alpha(s)} - 1\big).$$
Errors can result if we calculate the above expectation term by term since the expectations of the individual terms may not be finite, as we will see in Example 7.53. However, we will continue with formal calculations and perform the expectation term by term. The expectation of the first term on the right side, differentiated with respect to time, is $-\rho\,u\,\partial\varphi(u;t)/\partial u$. The expected contribution of the jump term over a small interval $\Delta t$ is $\varphi(u;t)\,\big(E[e^{\sqrt{-1}\,u\,\Delta L_\alpha}] - 1\big) = \varphi(u;t)\,\big(e^{-|u|^\alpha\,\Delta t} - 1\big)$ as $\Delta t \downarrow 0$. Since the time derivative of this term is $\varphi(u;t)\,(-|u|^\alpha)$, our formal calculations suggest that $\varphi$ satisfies the stated partial differential equation.
We also note that the solution $X(t) = \int_0^t e^{-\rho(t-s)}\,dL_\alpha(s)$ is an $\alpha$-stable variable $S_\alpha(\sigma, \beta, \mu)$ with $\sigma = \big((1 - e^{-\rho\alpha t})/(\rho\alpha)\big)^{1/\alpha}$, $\beta = 0$, and $\mu = 0$ (Section 2.10.3.2 in this book, [79], Example 3.27, p. 112). The characteristic function of $X(t)$ is
$$\varphi(u;t) = \exp\Big(-\frac{1 - e^{-\rho\alpha t}}{\rho\,\alpha}\,|u|^\alpha\Big),$$
so that
$$\frac{\partial\varphi}{\partial t} = -e^{-\rho\alpha t}\,|u|^\alpha\,\varphi, \qquad \frac{\partial\varphi}{\partial u} = -\alpha\,\frac{1 - e^{-\rho\alpha t}}{\rho\,\alpha}\,\mathrm{sign}(u)\,|u|^{\alpha-1}\,\varphi,$$
and $\partial\varphi/\partial t + \rho\,u\,\partial\varphi/\partial u = -|u|^\alpha\,\varphi$, showing that the above formal calculations yield a correct result in this case. ■
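This verification can also be carried out symbolically. A minimal sympy sketch; restricting to $u > 0$ (so that $|u|^\alpha = u^\alpha$) and the positivity assumptions on the symbols are choices of this check, not of the text:

```python
import sympy as sp

u, t, rho, alpha = sp.symbols('u t rho alpha', positive=True)

# Characteristic function of X(t) = int_0^t exp(-rho (t-s)) dL_alpha(s),
# an S_alpha(sigma, 0, 0) variable with sigma^alpha = (1 - exp(-rho*alpha*t))/(rho*alpha).
# For u > 0 we may write |u|^alpha = u**alpha.
c = (1 - sp.exp(-rho * alpha * t)) / (rho * alpha)
phi = sp.exp(-c * u**alpha)

# Check the formal PDE  dphi/dt + rho*u*dphi/du = -|u|^alpha * phi.
residual = sp.diff(phi, t) + rho * u * sp.diff(phi, u) + u**alpha * phi
print(sp.simplify(residual))  # -> 0
```

The vanishing residual confirms the formal calculation for $u > 0$; the case $u < 0$ follows by symmetry.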
rather than Eqs. 7.28 and 7.29. This definition of X includes the class of nonlinear systems, which will be considered in Section 7.3.2.3. If $a$ is linear in X and $b$ does not depend on X, the above equations coincide with Eqs. 7.28 and 7.29. It is assumed that the matrices $a$, $b$ are such that Eqs. 7.41 and 7.42 have a unique solution (Sections 4.7.1.1 and 4.7.2). Let S be the semimartingale process in Eqs. 7.30 and 7.32. We consider two special cases for S corresponding to the following definitions of the processes A, H, and K.
$$A(t) = \alpha(t), \quad H(t) = h(B(t), t), \quad K(t) = k(C(t-), t). \qquad (7.43)$$
$$dA(t) = \beta(S(t-), t)\,dt, \quad H(t) = h(S(t-), t), \quad K(t) = k(S(t-), t). \qquad (7.44)$$
Note: The augmented vector Z in Eq. 7.45 satisfies a stochastic differential equation driven
by Gaussian and Poisson white noise. Generally, this equation is nonlinear even if the
differential equation for X is linear.
If S corresponds to the special case in Eq. 7.44, then the augmented vector $Z = (X, S)$ is the solution of the stochastic differential equation
$$\begin{cases}
dX(t) = a(X(t-), t)\,dt + b(X(t-), t)\,\big[\beta(S(t-), t)\,dt + h(S(t-), t)\,dB(t) + k(S(t-), t)\,dC(t)\big],\\
dS(t) = \beta(S(t-), t)\,dt + h(S(t-), t)\,dB(t) + k(S(t-), t)\,dC(t).
\end{cases} \qquad (7.47)$$
where $p, q \ge 0$ are integers and $\mu(u, v; t) = 0$ if at least one of its arguments $u$, $v$ is strictly negative. The equations of order $p + q$ contain the moments $\mu(p-2, q+4; t)$ of order $(p-2) + (q+4) = p + q + 2$. It may seem that we cannot calculate the moments $\mu(p, q; t)$ of Z. However, the moments $\mu(p-2, q+4; t)$ can be obtained sequentially for increasing values of $p$ since the moments $\mu(0, q; t)$ depend only on S, so that they can be calculated from the defining equation of this process. We say that the above infinite hierarchy of equations is closed. ◊
Proof: The augmented vector Z is a diffusion process defined by the stochastic differential equation
$$\begin{cases}
dX(t) = a\,X(t)\,dt + 3\,b\,\big(Y(t)^2 - t\big)\,dB(t),\\
dY(t) = dB(t),
\end{cases}$$
since $dS(t) = 3\,(B(t)^2 - t)\,dB(t)$. The Itô formula applied to the function $Z = (X, Y) \mapsto X^p\,Y^q$ gives the above moment equations by averaging and differentiation. This example illustrates the state augmentation method in Eq. 7.45. ■
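The moment equations produced this way can be checked against direct simulation. The sketch below integrates $\dot\mu(2,0;t) = 2a\,\mu(2,0;t) + 18\,b^2\,t^2$ (the second-moment equation implied by the averaging above, using $E[(B_t^2 - t)^2] = 2t^2$) and compares it with a Euler-Maruyama estimate; the parameter values are illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, T, dt, npaths = -1.0, 0.5, 1.0, 1e-3, 100_000
nsteps = int(T / dt)

# Euler-Maruyama for dX = a X dt + 3 b (Y^2 - t) dB, dY = dB, X(0) = Y(0) = 0.
X = np.zeros(npaths)
Y = np.zeros(npaths)
for step in range(nsteps):
    t = step * dt
    dB = rng.normal(0.0, np.sqrt(dt), npaths)
    X += a * X * dt + 3 * b * (Y**2 - t) * dB
    Y += dB

# Moment equation: d/dt E[X^2] = 2 a E[X^2] + 9 b^2 E[(Y^2 - t)^2]
#                              = 2 a E[X^2] + 18 b^2 t^2, integrated by forward Euler.
m2 = 0.0
for step in range(nsteps):
    t = step * dt
    m2 += (2 * a * m2 + 18 * b**2 * t**2) * dt

print(m2, X.var())  # two estimates of E[X(1)^2]; they agree to Monte Carlo accuracy
```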
Example 7.22: Let X be the process in Example 7.21. We have found that the direct method cannot deliver a partial differential equation for the characteristic function of X (Example 7.19). However, we have
Proof: We have seen in the previous example that $Z = (X, Y)$ is a diffusion process. Let $\chi(u, v; t) = e^{\sqrt{-1}\,(u\,X(t) + v\,Y(t))}$ be a differentiable function of $Z(t)$. The Itô formula in Eq. 7.3 applied to the mapping $(X, Y) \mapsto \chi(u, v; t)$ gives
$$\chi(u,v;t) - \chi(u,v;0) = \int_0^t \sqrt{-1}\,u\,\chi(u,v;s)\,dX(s) + \int_0^t \sqrt{-1}\,v\,\chi(u,v;s)\,dY(s) - \frac{1}{2}\int_0^t \chi(u,v;s)\,\big[u^2\,d[X,X](s) + 2\,u\,v\,d[X,Y](s) + v^2\,d[Y,Y](s)\big],$$
where $d[X,X](s)$, $d[X,Y](s)$, and $d[Y,Y](s)$ are equal to $9\,b^2\,(Y(s)^2 - s)^2\,ds$, $3\,b\,(Y(s)^2 - s)\,ds$, and $ds$, respectively. By averaging the above equation and differentiating the result with respect to time, we obtain the stated partial differential equation for $\varphi$. The characteristic function $\varphi$ must satisfy the condition $\varphi(0, 0; t) = 1$ (Section 2.11.3). Also, $\varphi$ has to satisfy the initial condition $\varphi(u, v; 0) = \varphi_0(u)$ for $B(0) = 0$, as considered here, where $\varphi_0$ is the characteristic function of $X(0)$. Boundary conditions needed to solve the partial differential equation for $\varphi$ are considered later in this chapter (Eq. 7.62).
The formula in Eq. 7.39 can also be used to obtain the above partial differential equation for the characteristic function of Z. ■
The Itô formula applied to the function $Z(t) \mapsto \prod_i Z_i(t)^{q_i}$ (Eq. 7.48) gives the above condition. If the coefficients of the differential equations defining Z are polynomials of X and S, Eq. 7.49 is a differential equation for the moments of the augmented state vector Z. Generally, these equations form an infinite hierarchy. However, there are numerous special cases of Eq. 7.47 that are relevant in applications for which the resulting moment equations are closed, as demonstrated by the following examples. ▲
where $dS(t) = -\alpha\,S(t)\,dt + \sigma\sqrt{2\alpha}\,dB(t)$; $\alpha > 0$, $\sigma > 0$, $\beta > 0$, and $\nu > 0$ are some constants, B denotes a Brownian motion, $n \ge 1$ is an integer, and the $a_l$ are continuous functions. The process X represents the displacement of a linear oscillator with natural frequency $\nu$ and damping coefficient $\beta$ driven by a polynomial of the Ornstein-Uhlenbeck process S.
The moments $\mu(p, q, r; t) = E\big[X(t)^p\,\dot X(t)^q\,S(t)^r\big]$ of order $s = p + q + r$ of $Z = (X, \dot X, S)$ satisfy the ordinary differential equation
$$\dot\mu(p,q,r;t) = p\,\mu(p-1, q+1, r; t) - q\,\nu^2\,\mu(p+1, q-1, r; t) - (q\,\beta + r\,\alpha)\,\mu(p,q,r;t) + q \sum_{l=0}^{n} a_l(t)\,\mu(p, q-1, r+l; t) + r\,(r-1)\,\alpha\,\sigma^2\,\mu(p, q, r-2; t)$$
[Figure: coefficients of skewness and kurtosis, $\gamma_3$ and $\gamma_4$, of the stationary process X for $\nu^2 = 1.6$ and $\nu^2 = 9.0$, and the input $\sum_{l=0}^{n} a_l(t)\,S(t)^l$ with $n = 2$, $a_0(t) = a_1(t) = 0$]
$$d\begin{bmatrix} X(t)\\ \dot X(t)\\ S(t) \end{bmatrix} = \begin{bmatrix} \dot X(t)\\ -\nu^2\,X(t) - \beta\,\dot X(t) + \sum_{l=0}^{n} a_l(t)\,S(t)^l\\ -\alpha\,S(t) \end{bmatrix} dt + \begin{bmatrix} 0\\ 0\\ \sigma\sqrt{2\alpha} \end{bmatrix} dB(t).$$
The Itô formula applied to $(X(t), \dot X(t), S(t)) \mapsto X(t)^p\,\dot X(t)^q\,S(t)^r$ gives the above differential equation for the moments of $(X, \dot X, S)$ following averaging and differentiation.
To see that the moment equations are closed, consider the collection of equations
$$\frac{d}{dt}\begin{bmatrix} \mu(s,0,r;t)\\ \mu(s-1,1,r;t)\\ \vdots\\ \mu(1,s-1,r;t)\\ \mu(0,s,r;t) \end{bmatrix} = c\begin{bmatrix} \mu(s,0,r;t)\\ \mu(s-1,1,r;t)\\ \vdots\\ \mu(1,s-1,r;t)\\ \mu(0,s,r;t) \end{bmatrix} + \begin{bmatrix} 0\\ \sum_l a_l(t)\,\mu(s-1,0,r+l;t)\\ \vdots\\ (s-1)\sum_l a_l(t)\,\mu(1,s-2,r+l;t)\\ s\sum_l a_l(t)\,\mu(0,s-1,r+l;t) \end{bmatrix} + r\,(r-1)\,\alpha\,\sigma^2 \begin{bmatrix} \mu(s,0,r-2;t)\\ \mu(s-1,1,r-2;t)\\ \vdots\\ \mu(1,s-1,r-2;t)\\ \mu(0,s,r-2;t) \end{bmatrix}$$
for a fixed value of $s = p + q$, where $c$ is a time-invariant matrix that can be constructed from the moment equations. The differential equations for the moments of Z with $s = 1$ can be used sequentially for increasing values of $r$ to find $\mu(1, 0, r; t)$ and $\mu(0, 1, r; t)$ for any value of $r$. These moments are needed to solve the moment equations for Z with $s = 2$. These calculations can be continued to obtain moments of any order of the state vector $Z = (X, \dot X, S)$. ■
$$\begin{cases}
dX(t) = \big(-\rho\,X(t) + Y(t)\big)\,dt,\\
dY(t) = -a_y(Y(t))\,dt + b_y(Y(t))\,dB(t),
\end{cases}$$
Example 7.25: Let X be the process in Example 7.23. The characteristic function of the augmented state vector Z ($Z_1 = X$, $Z_2 = \dot X$, $Z_3 = S$) satisfies the partial differential equation
$$\frac{\partial\varphi}{\partial t} = -\alpha\,\sigma^2\,w^2\,\varphi - \nu^2\,v\,\frac{\partial\varphi}{\partial u} + (u - \beta\,v)\,\frac{\partial\varphi}{\partial v} - \alpha\,w\,\frac{\partial\varphi}{\partial w} + v \sum_{l=0}^{n} a_l(t)\,(-1)^l\,(\sqrt{-1})^{l+1}\,\frac{\partial^l\varphi}{\partial w^l},$$
$$d\begin{bmatrix} X(t)\\ S(t) \end{bmatrix} = \begin{bmatrix} a(t)\,X(t-) + p(S(t-), t)\\ \alpha(t)\,S(t-) \end{bmatrix} dt + \int_{\mathbb{R}^{d'}} \begin{bmatrix} 0\\ y \end{bmatrix} M(dt, dy), \qquad (7.50)$$
where $d_i = (d_{i,1}, \dots, d_{i,d'})$, $d_{i,r} \ge 0$ denote integers, and the $\zeta_{i,d_i}(t)$ are time-dependent coefficients. The compound Poisson process driving this equation has the representation
$$C(t) = \sum_{k=1}^{N(t)} Y_k = \int_{0+}^{t} \int_{\mathbb{R}^{d'}} y\,M(ds, dy), \qquad (7.52)$$
where N is a Poisson process of intensity $\lambda > 0$, the $Y_k$ are independent copies of an $\mathbb{R}^{d'}$-valued random variable $Y_1$ with distribution F and finite moments, and $M(dt, dy)$ is a Poisson random measure with expectation $E[M(dt, dy)] = \lambda\,dt\,dF(y)$. The moments $\mu(q_1, \dots, q_n; t) = E\big[\prod_{i=1}^{n} Z_i(t)^{q_i}\big]$ of $Z = (X, S)$ are defined as in Eq. 7.48, where $q_i \ge 0$ are integers, $q = \sum_{i=1}^{n} q_i$, and $n = d + d'$. ▲
$$\begin{aligned}
\dot\mu(q_1, \dots, q_n; t) &= \sum_{k,l=1}^{d} q_k\,a_{kl}(t)\,\mu(q_1, \dots, q_k - 1, \dots, q_l + 1, \dots, q_n; t)\\
&\quad + \sum_{k=1}^{d} \sum_{d_k} q_k\,\zeta_{k,d_k}(t)\,\mu(q_1, \dots, q_k - 1, \dots, q_d,\; q_{d+1} + d_{k,1}, \dots, q_n + d_{k,d'}; t)\\
&\quad + \sum_{k,l=d+1}^{n} q_k\,\alpha_{k-d,\,l-d}(t)\,\mu(q_1, \dots, q_k - 1, \dots, q_l + 1, \dots, q_n; t)\\
&\quad + \lambda\,E\Bigg[\prod_{i=1}^{d} X_i(t-)^{q_i} \int_{\mathbb{R}^{d'}} \prod_{j=1}^{d'} \sum_{r_j=0}^{q_{d+j}} \frac{q_{d+j}!}{r_j!\,(q_{d+j} - r_j)!}\,S_j(t-)^{r_j}\,y_j^{\,q_{d+j} - r_j}\,dF(y)\Bigg] - \lambda\,\mu(q_1, \dots, q_n; t),
\end{aligned} \qquad (7.53)$$
where the matrices $m(z, t)$ and $c(z, t, y)$ are such that the above equation has a unique solution (Section 4.7.2). The differential form of the Itô formula applied to $g(Z(t)) = \prod_{i=1}^{n} Z_i(t)^{q_i}$ gives (Eq. 7.2)
$$\begin{aligned}
d\mu(q_1, \dots, q_n; t) &= \Bigg[\sum_{k,l=1}^{d} q_k\,a_{kl}(t)\,\mu(\dots, q_k - 1, \dots, q_l + 1, \dots; t) + \sum_{k=1}^{d} \sum_{d_k} q_k\,\zeta_{k,d_k}(t)\,\mu(\dots, q_k - 1, \dots, q_{d+1} + d_{k,1}, \dots, q_n + d_{k,d'}; t)\\
&\qquad + \sum_{k,l=d+1}^{n} q_k\,\alpha_{k-d,\,l-d}(t)\,\mu(\dots, q_k - 1, \dots, q_l + 1, \dots; t)\Bigg]\,dt\\
&\quad + \lambda\,dt\,E\Bigg[\prod_{i=1}^{d} X_i(t)^{q_i}\Bigg(\int_{\mathbb{R}^{d'}} \prod_{j=1}^{d'} \sum_{r_j=0}^{q_{d+j}} \frac{q_{d+j}!}{r_j!\,(q_{d+j} - r_j)!}\,S_j(t)^{r_j}\,y_j^{\,q_{d+j} - r_j}\,dF(y) - \int_{\mathbb{R}^{d'}} \prod_{j=1}^{d'} S_j(t)^{q_{d+j}}\,dF(y)\Bigg)\Bigg],
\end{aligned}$$
with the convention that moments $\mu(q_1, \dots, q_n; t)$ with at least one strictly negative argument are zero.
An algorithm can be developed for solving Eq. 7.53 exactly following the approach in Example 7.23. The moments $\mu(0, \dots, 0, r_{d+1}, \dots, r_n)$ can be calculated exactly since S is a filtered Poisson process ([79], Section 3.3). If the coefficients of the differential equation defining Z are time invariant and this process becomes stationary as $t \to \infty$, the moments $\mu(q_1, \dots, q_n; t) = \mu(q_1, \dots, q_n)$ do not depend on time so that they satisfy algebraic equations obtained from Eq. 7.53 by setting $\dot\mu(q_1, \dots, q_n; t) = 0$. ■
$$d\begin{bmatrix} X(t)\\ S(t) \end{bmatrix} = \begin{bmatrix} a\,X(t-) + S(t-)^l\\ \alpha\,S(t-) \end{bmatrix} dt + \int_{\mathbb{R}} \begin{bmatrix} 0\\ y \end{bmatrix} M(dt, dy),$$
with the notation in Eq. 7.50. If $E[Y_1^k]$ exists and is finite for $k \ge 1$, the moments $\mu(p, q; t) = E[X(t)^p\,S(t)^q]$ of $Z = (X, S)$ satisfy the differential equation
Figure 7.5 shows the first four moments $\mu(p, 0; t)$, $p = 1, \dots, 4$, of X calculated from the above moment equations and their Monte Carlo estimates for $Z(0) = 0$, $a = -2\pi$, $\alpha = -1$, $l = 2$, $\lambda = 2$, and $Y_1 \sim N(0, 5)$. The dependence of the stationary skewness and kurtosis coefficients of X on the parameter $a$ is illustrated in Fig. 7.6 for $\lambda = 2$ and $\lambda = 200$. The stationary skewness and kurtosis
Figure 7.5. The mean, variance, coefficient of skewness, and coefficient of kurtosis functions of X (moment equations and Monte Carlo simulation)
[Figure 7.6. Stationary skewness and kurtosis coefficients of X as functions of $-a/\pi$ for $\lambda = 2$ and $\lambda = 200$]
The contribution of the Poisson jumps to these moment equations is
$$\lambda \sum_{k=1}^{q} \frac{q!}{k!\,(q-k)!}\,\mu(p, q-k; t)\,E[Y_1^k]$$
and involves moments of order smaller than $p + q$. We also note that the moments $\mu(0, q; t)$ can be calculated in advance from the characteristic function of S, whose cumulants are
$$\chi_q(t) = \lambda \int_0^t E\big[\big(Y_1\,e^{\alpha (t-s)}\big)^q\big]\,ds = \frac{\lambda\,E[Y_1^q]}{q\,\alpha}\,\big[\exp(q\,\alpha\,t) - 1\big]$$
$$\dot\mu(1, 0; t) = a\,\mu(1, 0; t) + \mu(0, l; t), \qquad \dot\mu(0, 1; t) = \alpha\,\mu(0, 1; t) + \lambda\,E[Y_1]\,\mu(0, 0; t)$$
for $s = 1$,
$$\dot\mu(2, 0; t) = 2\,a\,\mu(2, 0; t) + 2\,\mu(1, l; t),$$
$$\dot\mu(1, 1; t) = (a + \alpha)\,\mu(1, 1; t) + \mu(0, l+1; t) + \lambda\,E[Y_1]\,\mu(1, 0; t),$$
$$\dot\mu(0, 2; t) = 2\,\alpha\,\mu(0, 2; t) + \lambda\,\big(2\,E[Y_1]\,\mu(0, 1; t) + E[Y_1^2]\,\mu(0, 0; t)\big)$$
for $s = 2$, and
$$\dot\mu(3, 0; t) = 3\,a\,\mu(3, 0; t) + 3\,\mu(2, l; t),$$
$$\dot\mu(2, 1; t) = (2a + \alpha)\,\mu(2, 1; t) + 2\,\mu(1, l+1; t) + \lambda\,E[Y_1]\,\mu(2, 0; t),$$
$$\dot\mu(1, 2; t) = (a + 2\alpha)\,\mu(1, 2; t) + \mu(0, l+2; t) + \lambda\,\big(2\,\mu(1, 1; t)\,E[Y_1] + \mu(1, 0; t)\,E[Y_1^2]\big),$$
$$\dot\mu(0, 3; t) = 3\,\alpha\,\mu(0, 3; t) + \lambda\,\big(3\,\mu(0, 2; t)\,E[Y_1] + 3\,\mu(0, 1; t)\,E[Y_1^2] + \mu(0, 0; t)\,E[Y_1^3]\big)$$
for $s = 3$. The moments of order $s = 1$ can be calculated since the $\mu(0, q; t)$ are known for all values of $q$. The moment equations for $s = 2$ involve the unknown moments $\mu(1, l; t)$. These moments can be determined recursively from the moment equations
$$\dot\mu(1, q; t) = (a + q\,\alpha)\,\mu(1, q; t) + \mu(0, q + l; t) + \lambda \sum_{k=1}^{q} \frac{q!}{k!\,(q-k)!}\,\mu(1, q - k; t)\,E[Y_1^k]$$
and
$$\dot\mu(2, q; t) = (2a + q\,\alpha)\,\mu(2, q; t) + 2\,\mu(1, q + l; t) + \lambda \sum_{k=1}^{q} \frac{q!}{k!\,(q-k)!}\,\mu(2, q - k; t)\,E[Y_1^k],$$
starting with $q = 1$. ■
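The sequential solution strategy of this example can be sketched in code. The recursion below assumes the pattern displayed by the moment equations above; the parameter values, the choice $Y_1 \sim N(0, 1)$, and the forward-Euler integration are illustrative assumptions:

```python
import numpy as np
from math import comb

# Sequential solution of the moment equations for
#   dX = (a X + S^l) dt,  dS = alpha S dt + dC,
# with C a compound Poisson process of rate lam and Y_1 ~ N(0, sig2).
a, alpha, l, lam, sig2 = -1.0, -1.0, 2, 2.0, 1.0
T, dt = 2.0, 1e-4
n = int(T / dt)

def EY(k):
    # Moments of Y_1 ~ N(0, sig2): zero for odd k, (k-1)!! * sig2^(k/2) for even k.
    if k % 2:
        return 0.0
    m = 1.0
    for j in range(1, k, 2):
        m *= j
    return m * sig2 ** (k // 2)

Q = 8                              # highest S-exponent carried along
mu = {(0, 0): np.ones(n + 1)}      # mu(0, 0; t) = 1; Z(0) = 0, so other moments start at 0
for p in range(3):                 # solve mu(p, q; t) for p = 0, 1, 2
    for q in range(Q - p * l + 1):
        if (p, q) == (0, 0):
            continue
        m = np.zeros(n + 1)
        for i in range(n):
            rate = (p * a + q * alpha) * m[i]
            if p > 0:
                rate += p * mu[(p - 1, q + l)][i]
            rate += lam * sum(comb(q, k) * EY(k) * mu[(p, q - k)][i]
                              for k in range(1, q + 1))
            m[i + 1] = m[i] + rate * dt
        mu[(p, q)] = m

print(mu[(2, 0)][-1])              # E[X(T)^2] from the moment equations
```

With these parameters the stationary value of $\mu(0, 2; t)$ is $\lambda\,E[Y_1^2]/(2|\alpha|) = 1$, which the recursion reproduces.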
Example 7.27: Let $Z = (X, S)$ be the process in Example 7.26. The characteristic function of the augmented state vector Z satisfies the partial differential equation
$$\frac{\partial\varphi}{\partial t} = u\,a\,\frac{\partial\varphi}{\partial u} + (-1)^l\,(\sqrt{-1})^{l+1}\,u\,\frac{\partial^l\varphi}{\partial v^l} + \alpha\,v\,\frac{\partial\varphi}{\partial v} + \lambda\,\varphi\,\big[\varphi_Y(v) - 1\big],$$
Proof: Apply Itô's formula in Eq. 7.4 to $(X(t), S(t)) \mapsto \exp\big(\sqrt{-1}\,(u\,X(t) + v\,S(t))\big)$. We discuss here only the contribution of the last term in Eq. 7.4. The jumps of the coordinates of Z at a time $\sigma > 0$ are $\Delta X(\sigma) = 0$ and $\Delta S(\sigma) = \Delta C(\sigma)$, where $C(t) = \int_{0+}^{t} \int_{\mathbb{R}} y\,M(ds, dy)$. We have
$$E\big[e^{\sqrt{-1}\,(u X(t) + v S(t))} - e^{\sqrt{-1}\,(u X(t-) + v S(t-))}\big] = E\big[e^{\sqrt{-1}\,(u X(t-) + v S(t-))}\,\big(e^{\sqrt{-1}\,v\,\Delta C(t)} - 1\big)\big] = E\big[e^{\sqrt{-1}\,(u X(t-) + v S(t-))}\big]\,\big(E\big[e^{\sqrt{-1}\,v\,\Delta C(t)}\big] - 1\big),$$
where the last equality holds since $Z(t-)$ and $\Delta C(t)$ are independent.
Note also that the characteristic function $\bar\varphi(v) = E\big[e^{\sqrt{-1}\,v\,S(t)}\big]$ of $S(t)$ satisfies the partial differential equation
$$\frac{\partial\bar\varphi}{\partial t} = \alpha\,v\,\frac{\partial\bar\varphi}{\partial v} + \lambda\,\bar\varphi\,\big[\varphi_Y(v) - 1\big],$$
which can be obtained from the above partial differential equation of $\varphi$ for $u = 0$. ■
Figure 7.7 shows the first four moments of X calculated from the above moment equations and their Monte Carlo estimates for $Z(0) = 0$, $\nu = 2\pi$, $\zeta = 0.05$, $\alpha = -1$, $l = 2$, $\lambda = 2$, and $Y_1 \sim N(0, 12)$. The dependence of the stationary
[Figure 7.7. The mean, variance, coefficient of skewness, and coefficient of kurtosis functions of X (moment equations and Monte Carlo simulation)]
skewness and kurtosis coefficients of X on $\zeta$ is illustrated in Fig. 7.8 and Fig. 7.9 for $\lambda = 2$ and $\lambda = 200$, respectively, and $\nu = 2\pi$, $\pi$, and $\pi/2$. These coefficients approach $\gamma_{g,3} = 0$ and $\gamma_{g,4} = 3$, that is, the values corresponding to Gaussian variables, as $\zeta \to 0$. The stationary skewness and kurtosis coefficients of X differ significantly from $\gamma_{g,3}$ and $\gamma_{g,4}$ for other values of $\zeta$, indicating that X is a non-Gaussian process. The differences between the skewness and kurtosis coefficients of the stationary displacement X and $\gamma_{g,3}$ and $\gamma_{g,4}$ decrease as $\lambda$ increases, an expected finding since the input approaches a Gaussian process as $\lambda \to \infty$. ◊
7.3. Nonlinear systems 473
[Figures 7.8 and 7.9. Stationary skewness and kurtosis coefficients of X as functions of $\zeta$ for $\nu = 2\pi$, $\pi$, and $\pi/2$, and $\lambda = 2$ (Fig. 7.8) and $\lambda = 200$ (Fig. 7.9)]
Proof: The above moment equations have been obtained from Eq. 7.53 applied to the augmented vector $Z = (X, \dot X, S)$. The resulting system of equations for the moments of Z is closed so that we can calculate moments of any order of Z. ■
Systems defined by Eq. 7.54 in which $a$ and $b$ are linear functions of X are viewed as nonlinear. This definition of a nonlinear system is not general ([131], Section 3.3).
We have defined a linear system by Eq. 7.5 consistently with the theory of linear random vibration [175]. Moreover, there are relatively simple and general methods for finding properties of the state X of Eq. 7.5, as we have seen in Section 7.2. For example, the probability law of X is completely defined by its second moment properties if the input to Eq. 7.5 is a Gaussian process. In contrast, there is no general method for finding the probability law and other properties of X defined by Eq. 7.54. Some examples of nonlinear systems for which it is possible to find analytically properties of the state defined by Eq. 7.54 are in Section 7.3.1.4.
The state vector X in Eq. 7.54 can define the evolution of a dynamic system, material structure, damage state in a system, portfolio value, and other processes (Section 7.4). The determination of probabilistic properties of X is the main objective of nonlinear random vibration, a branch of stochastic mechanics and dynamics analyzing the response of nonlinear dynamic systems subjected to random input [175]. Two cases are distinguished in nonlinear random vibration depending on the functional form of $b$. If $b$ does not depend on the state X, that is, $b(X(t), t) = b(t)$, the nonlinear system is said to have additive noise. Otherwise, we say that the system has multiplicative noise. The behavior of systems with additive and multiplicative noise can differ significantly.
Our objectives in this section are as in the first part of the chapter. We attempt to find properties of X satisfying Eq. 7.54 driven by various noise processes Y. Because X is the solution of a nonlinear differential equation, it is not a Gaussian process even if Y is Gaussian. Most available results are for inputs Y that can be modeled by Gaussian white noise, but even for these inputs, probabilistic properties of X are not generally available in analytical form. Numerical methods, Monte Carlo simulation, and heuristic assumptions are used in most applications to characterize X.
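The additive/multiplicative distinction can be illustrated with a hypothetical scalar pair of systems, $dX = \alpha X\,dt + \sigma\,dB$ (additive) versus $dX = \alpha X\,dt + \sigma X\,dB$ (multiplicative); the model and all parameter values below are illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha_, sig, dt, n = -1.0, 1.0, 1e-3, 50_000

# One Euler-Maruyama sample path each of an additive- and a multiplicative-noise
# system driven by the same Brownian increments, started at X(0) = 1.
x_add, x_mul = 1.0, 1.0
path_add, path_mul = [x_add], [x_mul]
for _ in range(n):
    dB = rng.normal(0.0, np.sqrt(dt))
    x_add += alpha_ * x_add * dt + sig * dB            # noise enters independently of X
    x_mul += alpha_ * x_mul * dt + sig * x_mul * dB    # noise is scaled by the state
    path_add.append(x_add)
    path_mul.append(x_mul)

# The multiplicative path keeps the sign of its initial condition (X = 0 is
# absorbing in the continuous model), while the additive path crosses zero freely.
print(min(path_mul), min(path_add))
```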
Figure 7.10. Samples of X for multiplicative and additive noise, $a = \pm 1$, and $\sigma = 1$
Conditions for the existence and uniqueness of the solution of Eq. 7.55 are discussed in Section 4.7.1. The coefficients $a$ and $b\,b^T$ in Eq. 7.55 are called the drift and diffusion, respectively.
We develop differential equations for moments, densities, and characteristic
functions of X and discuss the solution of these equations. Solutions of Eq. 7.55
by Monte Carlo simulation and various approximations are also presented.
$$\dot\mu(q_1, \dots, q_d; t) = \sum_{i=1}^{d} E\Big[a_i(X(t), t)\,\frac{\partial g(X(t))}{\partial X_i(t)}\Big] + \frac{1}{2} \sum_{i,j=1}^{d} E\Big[\big(b(X(t), t)\,b(X(t), t)^T\big)_{ij}\,\frac{\partial^2 g(X(t))}{\partial X_i(t)\,\partial X_j(t)}\Big]. \qquad (7.57)$$
$$dg(X(t)) = \sum_{i=1}^{d} \frac{\partial g(X(t))}{\partial X_i(t)}\Big(a_i(X(t), t)\,dt + \sum_{k=1}^{d'} b_{ik}(X(t), t)\,dB_k(t)\Big) + \frac{1}{2} \sum_{i,j=1}^{d} \sum_{k,l=1}^{d'} \frac{\partial^2 g(X(t))}{\partial X_i(t)\,\partial X_j(t)}\,b_{ik}(X(t), t)\,b_{jl}(X(t), t)\,d[B_k, B_l](t).$$
The second term, $dM(t) = \sum_{i,k} \big(\partial g(X(t))/\partial X_i(t)\big)\,b_{ik}(X(t), t)\,dB_k(t)$, on the right side of the above equation is a local martingale by a preservation property of the stochastic integral, so that $E[dM(t)] = 0$ since $M(0) = 0$. The expectation of the above equation divided by $dt$ gives Eq. 7.57 because $d[B_k, B_l](t) = \delta_{kl}\,dt$.
If $X = X_s$, its moments are time invariant so that $\dot\mu(q_1, \dots, q_d; t) = 0$ and Eq. 7.57 yields Eq. 7.58. ■
The condition in Eq. 7.57 cannot be used to calculate moments of X since the expectations on its right side are not moments of X. If the drift and diffusion coefficients are polynomials of the state vector, these expectations become moments of X. Under these conditions, Eqs. 7.57 and 7.58 are differential and algebraic equations for the moments of the state vector, respectively, and are referred to as moment equations.
where the drift and diffusion are such that the solution X exists and is unique. The moments of $X(t)$ satisfy the condition (Eq. 7.57)
If the drift and diffusion coefficients are polynomials of X, this condition becomes a differential equation for the moments of the state X. ◊
Example 7.31: Suppose that the drift and diffusion coefficients in Example 7.30 are $a(x) = \beta x - x^3$ and $b(x) = \sigma$. The moments of X satisfy the differential equation
$$\dot\mu(q; t) = q\,\beta\,\mu(q; t) - q\,\mu(q+2; t) + \frac{q\,(q-1)\,\sigma^2}{2}\,\mu(q-2; t).$$
It is not possible to calculate the moments of X exactly since the above moment
equations form an infinite hierarchy. The calculation of a moment of any order
of X(t) involves the solution of an infinite set of equations simultaneously. We
present in Section 7.3.1.5 some heuristic methods, called closure methods, used
in physics and nonlinear random vibration theory to solve infinite hierarchies of
moment equations ([175], Section 6.1.1). If the moment equations do not form an
infinite hierarchy, we say that the moment equations are closed. In this case it is
possible to calculate exactly moments of any order of X. <>
These moment equations involve five unknowns, the moments $\mu(1; t)$, $\mu(2; t)$, $\mu(3; t)$, $\mu(4; t)$, and $\mu(5; t)$. Hence, it is not possible to find the first three moments of $X(t)$. The situation persists for any value of $q$.
The stochastic differential equation for X has a unique solution because its coefficients satisfy local Lipschitz and growth conditions (Section 4.7.1.1). The theory of deterministic ordinary differential equations can be used to assess the existence and uniqueness of the solution of the moment equations. ▲
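Although the hierarchy is infinite, the stationary moments of this example can still be checked numerically: the stationary density $f_s(x) = c\,\exp\big(2(\beta x^2/2 - x^4/4)/\sigma^2\big)$ (a standard stationary Fokker-Planck result, used here as an assumption) must have moments satisfying the stationary form of the moment equation. A sketch:

```python
import numpy as np

# Stationary density of dX = (beta X - X^3) dt + sigma dB (assumed standard result):
#   f_s(x) = c exp(2 (beta x^2/2 - x^4/4) / sigma^2).
beta, sigma = 1.0, 1.0
x = np.linspace(-6.0, 6.0, 20001)
dx = x[1] - x[0]
w = np.exp(2.0 * (beta * x**2 / 2 - x**4 / 4) / sigma**2)
w /= w.sum() * dx                  # normalize the density numerically

def mu(q):
    return (x**q * w).sum() * dx   # stationary moment E[X^q]

# Stationary moment equation: 0 = q beta mu(q) - q mu(q+2) + q (q-1) sigma^2/2 mu(q-2).
for q in (1, 2, 3, 4):
    res = q * beta * mu(q) - q * mu(q + 2)
    if q >= 2:
        res += q * (q - 1) * sigma**2 / 2 * mu(q - 2)
    print(q, res)                  # residuals are numerically zero
```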
$$\dot\mu(q; t) = \Big(-a\,q + \frac{q\,(q-1)\,\sigma^2}{2}\Big)\,\mu(q; t),$$
and has the stated solution. If $-a + (q-1)\,\sigma^2/2 < 0$, or equivalently, $a > a^*(q)$, then $\mu(q; t)$ decreases to zero as $t \to \infty$. The moment $\mu(q; t)$ is time invariant if $a = a^*(q)$. If $a < a^*(q)$, $\mu(q; t)$ converges to $\pm\infty$ as $t \to \infty$ depending on the sign of $\mu(q; 0)$.
We also note that X is a geometric Brownian motion process with solution (Section 4.7.1.3)
$$X(t) = X(0)\,\exp\big(-(a + \sigma^2/2)\,t + \sigma\,B(t)\big).$$
If $X(0)$ and B are independent, the resulting expression for $E[X(t)^q]$ coincides with $\mu(q; t)$ since $E[e^{u\,G}] = e^{u^2/2}$ for $G \sim N(0, 1)$. ▲
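The claimed coincidence can be verified symbolically. The sketch below divides out the factor $E[X(0)^q]$ and compares the growth factor of $E[X(t)^q]$, computed from the explicit solution and $E[e^{uG}] = e^{u^2/2}$, with the one predicted by the moment equation:

```python
import sympy as sp

q, t = sp.symbols('q t', positive=True)
a, sigma = sp.symbols('a sigma', positive=True)

# Growth factor of E[X(t)^q] from X(t) = X(0) exp(-(a + sigma^2/2) t + sigma B(t)),
# with B(t) = sqrt(t) G, G ~ N(0,1), and E[exp(u G)] = exp(u^2/2):
moment = sp.exp(-q * (a + sigma**2 / 2) * t) * sp.exp((q * sigma * sp.sqrt(t))**2 / 2)

# Growth factor predicted by the moment equation, exp((-a q + q (q-1) sigma^2/2) t):
predicted = sp.exp((-a * q + q * (q - 1) * sigma**2 / 2) * t)
print(sp.simplify(moment / predicted))  # -> 1
```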
$$\frac{\partial\varphi(u; t)}{\partial t} = \sqrt{-1} \sum_{i=1}^{d} u_i\,E\big[e^{\sqrt{-1}\,u^T X(t)}\,a_i(X(t), t)\big] - \frac{1}{2} \sum_{i,j=1}^{d} u_i\,u_j\,E\big[e^{\sqrt{-1}\,u^T X(t)}\,\big(b(X(t), t)\,b(X(t), t)^T\big)_{ij}\big], \qquad (7.60)$$
$$0 = \sqrt{-1} \sum_{i=1}^{d} u_i\,E\big[e^{\sqrt{-1}\,u^T X_s(t)}\,a_i(X_s(t), t)\big] - \frac{1}{2} \sum_{i,j=1}^{d} u_i\,u_j\,E\big[e^{\sqrt{-1}\,u^T X_s(t)}\,\big(b(X_s(t), t)\,b(X_s(t), t)^T\big)_{ij}\big]. \qquad (7.61)$$
Proof: Apply Eq. 7.3 to the mapping $X(t) \mapsto e^{\sqrt{-1}\,u^T X(t)}$. We can use this version of the Itô formula since X is a continuous semimartingale. The expectation of the Itô formula scaled by $dt$ yields Eq. 7.60 since $\int_0^t e^{\sqrt{-1}\,u^T X(s)}\,b_{ik}(X(s))\,dB_k(s)$ is a local martingale starting at zero and $d[B_k, B_l](t) = \delta_{kl}\,dt$.
Let $q_i \ge 1$ be integers and $q = \sum_{i=1}^{d} q_i$. If the entries of the matrices $a$ and $b$ are polynomials of X and the partial derivatives $\partial^q\varphi/\partial u_1^{q_1} \cdots \partial u_d^{q_d}$ of $\varphi$ exist, the right sides of Eqs. 7.60 and 7.61 can be expressed as partial derivatives of $\varphi$ with respect to the coordinates of $u$.
The characteristic function of the stationary solution $X_s$ satisfies the condition in Eq. 7.60 but it is time invariant so that Eq. 7.60 becomes Eq. 7.61. ■
We have seen in the previous section that moment equations can be obtained for a diffusion process X only if its drift and diffusion coefficients are polynomials of its coordinates. The same condition is needed in the current setting. If the drift and diffusion coefficients of X are polynomials of this process, then Eqs. 7.60 and 7.61 become partial differential equations for the characteristic function of X. In this case, it is possible to find the characteristic function of the state if initial and boundary conditions are specified for Eq. 7.60 and boundary conditions are given for Eq. 7.61. The initial condition for Eq. 7.60 is $\varphi(u; 0) = e^{\sqrt{-1}\,u^T x_0}$ and $\varphi(u; 0) = E\big[e^{\sqrt{-1}\,u^T X(0)}\big]$ if $X(0) = x_0$ and $X(0)$ is an $\mathbb{R}^d$-valued random variable, respectively. The boundary conditions for Eqs. 7.60 and 7.61 are discussed later in this section.
Example 7.33: The displacement X of an oscillator with cubic stiffness, called the Duffing oscillator, satisfies the equation
$$\ddot X(t) + 2\,\zeta\,\nu\,\dot X(t) + \nu^2\,\big(X(t) + a\,X(t)^3\big) = W(t),$$
where $\zeta \in (0, 1)$ is the damping ratio, $\nu$ denotes the natural frequency, $a$ is a real constant, and W is a Gaussian white noise with mean zero and one-sided spectral density of intensity $g_0 > 0$. The characteristic function of X satisfies the partial differential equation
$$\frac{\partial\varphi}{\partial t} = u_1\,\frac{\partial\varphi}{\partial u_2} - \nu^2\,u_2\,\frac{\partial\varphi}{\partial u_1} + \nu^2\,a\,u_2\,\frac{\partial^3\varphi}{\partial u_1^3} - 2\,\zeta\,\nu\,u_2\,\frac{\partial\varphi}{\partial u_2} - \frac{\pi\,g_0}{2}\,u_2^2\,\varphi,$$
with initial condition given by the distribution of the state X at $t = 0$. ◊
Proof: The vector process $(X_1 = X,\ X_2 = \dot X)$ satisfies the differential equation
$$\begin{cases}
dX_1(t) = X_2(t)\,dt,\\
dX_2(t) = -\nu^2\,\big(X_1(t) + a\,X_1(t)^3\big)\,dt - 2\,\zeta\,\nu\,X_2(t)\,dt + \sqrt{\pi\,g_0}\,dB(t),
\end{cases}$$
where B is a Brownian motion. The partial differential equation for the characteristic function of $(X_1, X_2)$ is given by Eq. 7.60. This equation must satisfy the boundary conditions $\varphi(0, 0; t) = 1$ and $\varphi,\ \partial^k\varphi/\partial u_1^k,\ \partial\varphi/\partial u_2 \to 0$ as $u_1^2 + u_2^2 \to \infty$ for $k = 1, 2, 3$ since the random variables $X_1(t)$ and $X_2(t)$ have finite moments (Eq. 7.62). ■
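When only the stationary behavior of this example is of interest, the boundary-value problem can be bypassed: $f_s(x_1, x_2) \propto \exp\big(-(4\zeta\nu/(\pi g_0))\,[\nu^2(x_1^2/2 + a\,x_1^4/4) + x_2^2/2]\big)$ is the classical stationary density of the Duffing oscillator from nonlinear random vibration, used here as an assumption to be verified against the stationary Fokker-Planck equation (Eq. 7.65):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
zeta, nu, a, g0 = sp.symbols('zeta nu a g0', positive=True)

# Candidate stationary density (classical result, assumed here):
E = nu**2 * (x1**2 / 2 + a * x1**4 / 4) + x2**2 / 2
f = sp.exp(-4 * zeta * nu / (sp.pi * g0) * E)

# Stationary Fokker-Planck equation for
#   dX1 = X2 dt, dX2 = (-nu^2 (X1 + a X1^3) - 2 zeta nu X2) dt + sqrt(pi g0) dB:
drift1 = x2
drift2 = -nu**2 * (x1 + a * x1**3) - 2 * zeta * nu * x2
lhs = (-sp.diff(drift1 * f, x1) - sp.diff(drift2 * f, x2)
       + sp.Rational(1, 2) * sp.pi * g0 * sp.diff(f, x2, 2))
print(sp.simplify(lhs))  # -> 0
```

The vanishing left side confirms that the assumed density solves Eq. 7.65 exactly, for any $a$.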
Boundary conditions. Suppose that the drift and diffusion coefficients in Eq. 7.55 are polynomials of X so that Eqs. 7.60 and 7.61 are partial differential equations for the characteristic function $\varphi(\cdot; t)$ of $X(t)$. Initial and boundary conditions need to be specified to solve these equations. The initial conditions result directly from the properties of the state at time $t = 0$. The boundary conditions can be obtained from the following properties of the characteristic function.
• $\varphi(0; t) = 1$,
• $|\varphi(u; t)| \le 1$,
• If $X(t)$ has a density, then $\varphi(u; t) \to 0$ as $\|u\| \to \infty$,
• If $X(t)$ has a density and finite moments of order $q$, then
$$\frac{\partial^q \varphi(u; t)}{\partial u_1^{q_1} \cdots \partial u_d^{q_d}} \to 0 \quad \text{as } \|u\| \to \infty, \qquad (7.62)$$
where $q_i \ge 0$ are integers and $\sum_{i=1}^{d} q_i = q$.
7.3. Nonlinear systems 481
Note: The first two conditions result from the definition of the characteristic function. The last two properties follow from the fact that, if $g$ is an integrable function and $\gamma(u) = \int_{\mathbb{R}} e^{\sqrt{-1}\,u\,x}\,g(x)\,dx$, then $\gamma(u) \to 0$ as $|u| \to \infty$ ([62], Lemma 3, p. 513). This fact implies $\varphi(u; t) \to 0$ as $\|u\| \to \infty$ since the density $f$ of $X(t)$ is assumed to be integrable. If $X(t)$ has finite moments of order $q$, then $x_1^{q_1} \cdots x_d^{q_d}\,f(x)$ is integrable so that the partial derivatives of order $q = \sum_{k=1}^{d} q_k$ of $\varphi$ converge to zero as $\|u\| \to \infty$. ▲
The density of the solution X of Eq. 7.55 satisfies two partial differential equations, the Fokker-Planck or forward Kolmogorov equation and the backward Kolmogorov equation. These equations can be derived from (1) the relationship between the characteristic and density functions and Eq. 7.60, (2) the Itô formula (Section 4.7.1.3), or (3) the Chapman-Kolmogorov formula (Examples 7.38 and 7.40).
Let $f(x; t \mid x_0; 0)$ be the density of $X(t) \mid (X(0) = x_0)$, $t \ge 0$, where $X(0) = x_0$ is a deterministic initial state. If the starting time is $s > 0$ rather than zero, the density of $X(t) \mid (X(s) = x_0)$ is denoted by $f(x; t \mid x_0; s)$, $0 < s \le t$. The shorthand notation $f(x; t)$ will also be used for $f(x; t \mid x_0; 0)$ provided there is no ambiguity. If the initial conditions are random, that is, $X(0)$ is an $\mathbb{R}^d$-valued random variable, the density of $X(t)$ can be calculated from $f(x; t \mid x_0; 0)$ and the density of $X(0)$. It is assumed that $f(x; t \mid x_0; 0)$ satisfies the conditions
If $f(x; t \mid x_0; 0)$ satisfies the conditions in Eq. 7.63, then $f$ is the solution of the Fokker-Planck equation
$$\frac{\partial f(x; t \mid x_0; 0)}{\partial t} = -\sum_{i=1}^{d} \frac{\partial}{\partial x_i}\big[a_i(x, t)\,f(x; t \mid x_0; 0)\big] + \frac{1}{2} \sum_{i,j=1}^{d} \frac{\partial^2}{\partial x_i\,\partial x_j}\big[\big(b(x, t)\,b(x, t)^T\big)_{ij}\,f(x; t \mid x_0; 0)\big]. \qquad (7.64)$$
$$-\sum_{i=1}^{d} \frac{\partial}{\partial x_i}\big[a_i(x, t)\,f_s(x)\big] + \frac{1}{2} \sum_{i,j=1}^{d} \frac{\partial^2}{\partial x_i\,\partial x_j}\big[\big(b(x, t)\,b(x, t)^T\big)_{ij}\,f_s(x)\big] = 0. \qquad (7.65)$$
Proof: We need to show that Eq. 7.60 multiplied by $e^{-\sqrt{-1}\,u^T x}/(2\pi)^d$ and integrated with respect to $u$ over $\mathbb{R}^d$ gives Eq. 7.64. Alternatively, we can show that Eq. 7.64 multiplied by $e^{\sqrt{-1}\,u^T x}$ and integrated over $\mathbb{R}^d$ with respect to $x$ gives Eq. 7.60. We follow the latter approach. The left side of Eq. 7.64 gives $\partial\varphi(u; t)/\partial t$. A term of the first summation on the right side of Eq. 7.64 multiplied by $e^{\sqrt{-1}\,u^T x}$ and integrated over $\mathbb{R}^d$ with respect to $x$ can be written as
$$\int_{\mathbb{R}^{d-1}} \prod_{j=1,\,j\ne i}^{d} \big(e^{\sqrt{-1}\,u_j\,x_j}\,dx_j\big)\,\Big\{\int_{\mathbb{R}} \frac{\partial}{\partial x_i}\big[a_i(x, t)\,f(x; t)\big]\,e^{\sqrt{-1}\,u_i\,x_i}\,dx_i\Big\} = -\sqrt{-1}\,u_i \int_{\mathbb{R}^d} a_i(x, t)\,f(x; t)\,e^{\sqrt{-1}\,u^T x}\,dx,$$
using integration by parts and the first set of conditions in Eq. 7.63. Hence, the term we considered becomes $-\sqrt{-1}\,u_i\,E\big[a_i(X(t), t)\,e^{\sqrt{-1}\,u^T X(t)}\big]$.
A term of the second summation on the right side of Eq. 7.64 can be written as
$$\xi_{ij} = \int_{\mathbb{R}^{d-2}} \prod_{m\ne i,j} \big(e^{\sqrt{-1}\,u_m\,x_m}\,dx_m\big)\,\Big\{\int_{\mathbb{R}^2} e^{\sqrt{-1}\,(u_i x_i + u_j x_j)}\,\frac{\partial^2}{\partial x_i\,\partial x_j}\big[\big(b(x, t)\,b(x, t)^T\big)_{ij}\,f(x; t)\big]\,dx_i\,dx_j\Big\}$$
following multiplication with $e^{\sqrt{-1}\,u^T x}$ and integration over $\mathbb{R}^d$ with respect to $x$. Integration by parts and the conditions in Eq. 7.63 give for the double integral in the brackets
$$-\sqrt{-1}\,u_i \int_{\mathbb{R}^2} e^{\sqrt{-1}\,(u_i x_i + u_j x_j)}\,\frac{\partial}{\partial x_j}\big[\big(b(x, t)\,b(x, t)^T\big)_{ij}\,f(x; t)\big]\,dx_i\,dx_j,$$
and a second integration by parts yields
$$-u_i\,u_j \int_{\mathbb{R}^2} e^{\sqrt{-1}\,(u_i x_i + u_j x_j)}\,\big(b(x, t)\,b(x, t)^T\big)_{ij}\,f(x; t)\,dx_i\,dx_j,$$
so that $\xi_{ij}$ contributes $-u_i\,u_j\,E\big[\big(b(X(t), t)\,b(X(t), t)^T\big)_{ij}\,e^{\sqrt{-1}\,u^T X(t)}\big]$, and the collected terms give Eq. 7.60. ■
$$\frac{\partial\varphi}{\partial t} = -\rho\,u\,\frac{\partial\varphi}{\partial u} - \frac{u^2}{2}\,\varphi \quad \text{for GWN}, \qquad \frac{\partial\varphi}{\partial t} = -\rho\,u\,\frac{\partial\varphi}{\partial u} + \lambda\,\big(E[e^{\sqrt{-1}\,u\,Y_1}] - 1\big)\,\varphi \quad \text{for PWN}, \qquad \frac{\partial\varphi}{\partial t} = -\rho\,u\,\frac{\partial\varphi}{\partial u} - |u|^\alpha\,\varphi \quad \text{for LWN},$$
where $\lambda > 0$ and $Y_1$ are parameters of the compound Poisson process and $\alpha \in (0, 2)$ defines the Lévy white noise. The density $f(x; t \mid x_0; 0)$ of the conditional process $X(t) \mid (X(0) = x_0)$ satisfies the partial differential equations
$$\frac{\partial f}{\partial t} = \rho\,\frac{\partial (x\,f)}{\partial x} + \frac{1}{2}\,\frac{\partial^2 f}{\partial x^2} \quad \text{for GWN}, \qquad \frac{\partial f}{\partial t} = \rho\,\frac{\partial (x\,f)}{\partial x} + \lambda \sum_{k=1}^{\infty} \frac{(-1)^k\,E[Y_1^k]}{k!}\,\frac{\partial^k f}{\partial x^k} \quad \text{for PWN}.$$
The first formula is a Fokker-Planck equation given by Eq. 7.64. The second formula is not a classical Fokker-Planck equation, and we refer to it as a generalized or extended Fokker-Planck equation. The generalized Fokker-Planck equation is valid if $Y_1$ has finite moments of any order and some additional conditions stated in the following comments are satisfied. It is not possible to derive a partial differential equation for $f$ under Lévy white noise with $\alpha \in (0, 2)$. ◊
Proof: The derivation of the partial differential equations for the characteristic and density functions of X under Poisson white noise is outlined here. The differential equation for the characteristic function of X driven by a Lévy white noise is in Example 7.20. The corresponding equation for a Gaussian noise results from Eq. 7.60.
Let $C(t) = \sum_{k=1}^{N(t)} Y_k$ be a compound Poisson process, where N is a Poisson process with intensity $\lambda > 0$ and the $Y_k$ are independent copies of a random variable $Y_1$. The Itô formula in Eq. 7.4 applied to $e^{\sqrt{-1}\,u\,X(t)}$ gives, after averaging,
$$\varphi(u; t) - 1 = -\rho\,u \int_{0+}^{t} \frac{\partial\varphi(u; s)}{\partial u}\,ds + \sum_{0<s\le t} \varphi(u; s)\,E\big[e^{\sqrt{-1}\,u\,\Delta C(s)} - 1\big].$$
$X(s-)$ is independent of future jumps $\Delta C(s) = C(s) - C(s-)$ of C, and $\Delta C(s)$ is either $Y_k$ for some $k$ or zero in any small time interval $\Delta s > 0$, so that the random variable $e^{\sqrt{-1}\,u\,\Delta C(s)} - 1$ is $e^{\sqrt{-1}\,u\,Y_k} - 1$ with probability $\lambda\,\Delta s$ and zero with probability $1 - \lambda\,\Delta s$ as $\Delta s \downarrow 0$. The partial differential equation for $\varphi$ results by differentiation.
The Fourier transform of the left side of the partial differential equation for $\varphi$ is $\partial f/\partial t$. The first term on the right side of the Fokker-Planck equation is given by the corresponding term of the equation for $\varphi$ by using integration by parts, the boundary conditions, and the equality
$$\frac{\partial^k f(x; t)}{\partial x^k} = \frac{1}{2\pi} \int_{\mathbb{R}} (-\sqrt{-1}\,u)^k\,e^{-\sqrt{-1}\,u\,x}\,\varphi(u; t)\,du,$$
which is valid for any integer $k \ge 1$. The Fourier transform of $\varphi(u; t)\,E[e^{\sqrt{-1}\,u\,Y_1}]$ is
$$\frac{1}{2\pi} \int_{\mathbb{R}} e^{-\sqrt{-1}\,u\,x}\,\varphi(u; t)\,E\Bigg[\sum_{k=0}^{\infty} \frac{(\sqrt{-1}\,u\,Y_1)^k}{k!}\Bigg]\,du = \sum_{k=0}^{\infty} \frac{E[Y_1^k]}{k!}\,\frac{1}{2\pi} \int_{\mathbb{R}} e^{-\sqrt{-1}\,u\,x}\,\varphi(u; t)\,(\sqrt{-1}\,u)^k\,du = \sum_{k=0}^{\infty} \frac{(-1)^k\,E[Y_1^k]}{k!}\,\frac{\partial^k f}{\partial x^k},$$
and the above integration can be performed term by term. The last equality follows from the relationships between the density and the characteristic functions given above. ■
The Fokker-Planck equations (Eqs. 7.64 and 7.65) are deterministic, linear partial differential equations of the second order defined for all $t \ge 0$ and $x \in \mathbb{R}^d$. Conditions for the existence and uniqueness of the solution of these equations can be found in [44] (Chapter IV). Comprehensive discussions on the Fokker-Planck equations are in [70, 174].
In reliability studies and other applications we are interested in some global properties of the state X defined by Eq. 7.55, for example, the distribution of the stopping time $T = \inf\{t > 0 : X(t) \notin D\}$ corresponding to an open subset D of $\mathbb{R}^d$ and deterministic initial state $X(0) = x_0 \in D$. Let $X^T(t) = X(t \wedge T)$ denote the process X stopped at T. The samples of $X^T$ and X coincide for times $t \le T$ but differ for $t > T$. The samples of $X^T$ are constant and equal to $X(T) \in \partial D$ at times $t \ge T$. If X is the state of a physical system and D denotes a safe set for this system, the probability of the event $\{X^T(t) \in D\}$ gives the fraction of samples of X that never left D in $[0, t]$, so that
$$P_s(t) = P\big(X^T(t) \in D\big) \qquad (7.66)$$
is the system reliability in [0, t]. We will see that the probability Ps (t) in Eq. 7.66
can be calculated from the solution of the Fokker-Planck equation for the den-
sity of X with appropriate boundary conditions. We review now some boundary
conditions that are relevant for reliability studies and other applications. Extensive
discussions on the type and classification of boundary conditions for diffusion pro-
cesses and the Fokker-Planck equation can be found elsewhere ([70], Chapter 5,
[111], Section 15.6).
$$\frac{\partial f}{\partial t} = -\sum_{i=1}^{d} \frac{\partial \lambda_i(x;t)}{\partial x_i}, \quad \text{where} \tag{7.67}$$

$$\lambda_i(x;t) = a_i\, f - \frac{1}{2} \sum_{j=1}^{d} \frac{\partial}{\partial x_j}\!\left[ \left(b\,b^T\right)_{ij}\, f \right], \quad i = 1, \ldots, d, \tag{7.68}$$

are the coordinates of a vector $\lambda(x;t) \in \mathbb{R}^d$, called the probability current.
$$\frac{\partial p_D(t)}{\partial t} = -\sum_{i=1}^{d} \int_D \frac{\partial \lambda_i(x;t)}{\partial x_i}\, dx = -\sum_{i=1}^{d} \int_{\partial D} \lambda_i(x;t)\, n_i(x)\, da(x), \tag{7.69}$$

where the last equality in the above equation follows from the Gauss or divergence theorem ([68], p. 116). The function

$$-\sum_{i=1}^{d} \lambda_i(x,t)\, n_i(x)\, da(x) = -\lambda(x,t) \cdot n(x)\, da(x)$$

gives the probability flow from $D$ to $D^c$ through an infinitesimal surface element $da(x)$ of $\partial D$. Hence, $-\int_{\partial D_1} \lambda(x,t) \cdot n(x)\, da(x)$ represents the probability flow from $D$ to $D^c$ through a subset $\partial D_1$ of $\partial D$.
7.3. Nonlinear systems 487
Note: The above definitions can be applied to subsets of the boundary $\partial D$ of $D$. Distinct subsets of $\partial D$ can be accessible or inaccessible boundaries.

The probability $P\!\left( X^T(t) \in D^c \right)$ is zero at $t = 0$ since $X(0) = x_0 \in D$. This probability is equal to $P(T \le t)$ (Eq. 7.66) and increases in time since more and more samples of $X$ reach the boundary of $D$. The density of $X^T(t)$ in $D$ is the solution of Eq. 7.64 with the absorbing boundary condition $f(x;t \mid x_0;0) = 0$ for $x \in \partial D$. This boundary condition eliminates the samples of $X$ once they reach $\partial D$, so that $P(T > t)$ is equal to $p_D(t)$ in Eq. 7.69.

Absorbing boundary conditions are used in reliability analysis to characterize system performance (Section 7.4.3). We have used reflecting boundaries in Section 6.2.3 to find local solutions for deterministic partial differential equations with Neumann boundary conditions. ▲
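As an illustration of estimating the reliability $P_s(t) = P(T > t)$ by simulation with absorption at $\partial D$, here is a minimal sketch; the Ornstein-Uhlenbeck drift, the noise level, and the safe set $D = (-1, 1)$ are illustrative assumptions, not taken from the text:

```python
import math
import random

def reliability_mc(a, b, x0, t_end, dt, low, high, n_paths, seed=1):
    """Estimate P(T > t_end) = P(X stays in D = (low, high) on [0, t_end])
    by Euler-Maruyama simulation with absorption at the boundary of D."""
    random.seed(seed)
    sqdt = math.sqrt(dt)
    n_steps = int(round(t_end / dt))
    survived = 0
    for _ in range(n_paths):
        x = x0
        alive = True
        for _ in range(n_steps):
            x += a(x) * dt + b(x) * sqdt * random.gauss(0.0, 1.0)
            if not (low < x < high):   # sample absorbed at the boundary of D
                alive = False
                break
        survived += alive
    return survived / n_paths

# illustrative Ornstein-Uhlenbeck drift and diffusion
ps = reliability_mc(a=lambda x: -x, b=lambda x: 0.5, x0=0.0,
                    t_end=1.0, dt=0.01, low=-1.0, high=1.0, n_paths=2000)
print(ps)
```

The estimate improves as `n_paths` grows and `dt` shrinks; the absorbing check mimics the boundary condition $f(x;t \mid x_0;0) = 0$ on $\partial D$.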
Proof: We have seen that $X(t) = X(0)\, e^{(c-\sigma^2/2)\,t + \sigma B(t)}$ (Example 7.32) so that $X(t) \ge 0$ at all times since $X(0) > 0$ by hypothesis. The process can be zero if and only if $(c - \sigma^2/2)\,t + \sigma B(t)$ becomes $-\infty$. The function $(c - \sigma^2/2)\,t + \sigma B(t)$ is bounded a.s. for $t$ finite and

$$\lim_{t\to\infty} \frac{1}{t}\left[ (c - \sigma^2/2)\,t + \sigma B(t) \right] = (c - \sigma^2/2) + \lim_{t\to\infty}\left[ \sigma B(t)/t \right] = c - \sigma^2/2$$

since $\lim_{t\to\infty} B(t)/t = 0$ a.s. ([150], Section 6.4). If $c - \sigma^2/2 < 0$, then $\lim_{t\to\infty} X(t) = 0$ a.s. so that $x = 0$ is an inaccessible attracting boundary. Otherwise, the boundary $x = 0$ cannot ever be reached so that $x = 0$ is an inaccessible natural boundary. •
488 Chapter 7. Deterministic Systems and Stochastic Input
$$f(x;t \mid 0;0) = \frac{2}{\sqrt{t}}\,\phi\!\left( \frac{x}{\sqrt{t}} \right), \quad x \ge 0,$$

for $B(0) = 0$ and a reflecting boundary at $x = 0$. ◊
Proof: The density $f(x;t \mid 0;0)$ satisfies the Fokker-Planck equation

$$\frac{\partial f}{\partial t} = \frac{1}{2}\,\frac{\partial^2 f}{\partial x^2}$$

with the boundary condition $\partial f(x;t \mid 0;0)/\partial x = 0$ at $x = 0$, $t > 0$, since $\lambda(x;t) = -(1/2)\,\partial f(x;t \mid 0;0)/\partial x$. The density $f$ can be calculated from the above Fokker-Planck equation with the stated boundary condition and the initial condition $f(x;0 \mid 0;0) = 1$ for $x = 0$ and zero otherwise.

We follow here a different approach using the observation that $B \mid (B(0) = 0)$ reflected at $x = 0$ is equal in distribution to $|B| \mid (B(0) = 0)$. Because $B(t) \mid (B(0) = 0)$ and $\sqrt{t}\, N(0,1)$ have the same distribution, we have for $\xi > 0$

$$P\!\left( |B(t)| \le \xi \mid B(0) = 0 \right) = P\!\left( -\xi < \sqrt{t}\, N(0,1) \le \xi \right) = \Phi\!\left( \frac{\xi}{\sqrt{t}} \right) - \Phi\!\left( -\frac{\xi}{\sqrt{t}} \right),$$

so that the corresponding density $f(x;t \mid 0;0) = (d/d\xi)\, P\!\left( |B(t)| \le \xi \mid B(0) = 0 \right)\big|_{\xi=x}$ is as stated ([154], Section 6.3). •
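The stated distribution of reflected Brownian motion can be checked numerically, using the fact that $B(t)$ has the same law as $\sqrt{t}\,N(0,1)$; a sketch (sample size and seed are arbitrary choices):

```python
import math
import random

def phi_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

random.seed(2)
t, xi = 1.0, 1.0
n = 100_000
# |B(t)| with B(0) = 0: B(t) has the same law as sqrt(t) * N(0, 1)
hits = sum(abs(math.sqrt(t) * random.gauss(0.0, 1.0)) <= xi for _ in range(n))
empirical = hits / n
exact = phi_cdf(xi / math.sqrt(t)) - phi_cdf(-xi / math.sqrt(t))  # 2*Phi(1) - 1
print(empirical, exact)
```

The empirical frequency agrees with $\Phi(\xi/\sqrt{t}) - \Phi(-\xi/\sqrt{t})$ to within sampling error.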
$$\cdots + \frac{\sigma^2}{2}\; \frac{f_{i+1,j} - 2\,f_{i,j} + f_{i-1,j}}{(\Delta x)^2},$$
Figure 7.11. The evolution of the density $f(x;t \mid 0;0)$ of $X(t) \mid X(0) = x_0$ for $\alpha = \beta = -1$, $\sigma = 1$, and $x_0 = 0$
where $f_{i,j} = f(x_i; t_j \mid x_0; 0)$. The above linear system of equations has been used to propagate the vector $f^{(j)} = (f_{1,j}, f_{2,j}, \ldots)$ in time. •
$$\eta = \int_{\mathbb{R}} \left[ \int_{\mathbb{R}} f(x; t+h \mid \xi; t)\, f(\xi; t \mid x_0; 0)\, d\xi - f(x; t \mid x_0; 0) \right] g(x)\, dx$$

and using integration by parts and the fact that $g$ is an arbitrary function. •
If $X$ is the solution of Eq. 7.55, then $f(x;t \mid y;s)$, $0 < s < t$, satisfies the backward Kolmogorov equation

$$\frac{\partial f(x;t \mid y;s)}{\partial s} = -\sum_{i=1}^{d} a_i(y,s)\, \frac{\partial f(x;t \mid y;s)}{\partial y_i} - \frac{1}{2} \sum_{i,j=1}^{d} \left( b(y,s)\, b(y,s)^T \right)_{ij}\, \frac{\partial^2 f(x;t \mid y;s)}{\partial y_i\, \partial y_j}. \tag{7.70}$$
Proof: Let $u(\xi,\sigma) = E^{(\xi,\sigma)}[g(X(t))]$, where $\sigma$ is an arbitrary point in $(0,t)$, the superscripts of $E^{(\xi,\sigma)}$ indicate the condition $X(\sigma) = \xi$, and $g \in C^2(\mathbb{R}^d)$. Take $s, h > 0$ such that $0 < s-h < s < t$ and $X(s-h) = y$. We have

$$\lim_{h\downarrow 0} \frac{E^{(y,s-h)}[g(X(s))] - g(y)}{h} = \lim_{h\downarrow 0} \frac{E^{(y,s-h)}[g(X(s)) - g(y)]}{h} = \mathcal{A}[g(y)],$$

where the second equality in the above equation holds by Itô's formula and $\mathcal{A}$ is the generator of the diffusion process $X$ (Section 6.2.1.1). The definition of the function $u$ gives

$$u(y,s-h) = E^{(y,s-h)}[g(X(t))] = E^{(y,s-h)}\!\left\{ E^{(X(s),s)}[g(X(t))] \right\} = E^{(y,s-h)}[u(X(s),s)]$$
so that

$$\frac{\partial u(y,s)}{\partial s} = \lim_{h\downarrow 0} \frac{u(y,s) - u(y,s-h)}{h} = -\lim_{h\downarrow 0} \frac{E^{(y,s-h)}[u(X(s),s)] - u(y,s)}{h} = -\mathcal{A}[u(y,s)]$$

or $\partial u(y,s)/\partial s + \mathcal{A}[u(y,s)] = 0$. The functions $\partial u/\partial s$ and $\mathcal{A}[u]$ are

$$\frac{\partial u(y,s)}{\partial s} = \int_{\mathbb{R}^d} g(x)\, \frac{\partial f(x;t \mid y;s)}{\partial s}\, dx \quad \text{and}$$
$$-1 = a(x_0)\, \frac{d\mu_T}{dx_0} + \frac{1}{2}\, b(x_0)^2\, \frac{d^2\mu_T}{dx_0^2}$$
Proof: Let $f(x;t \mid x_0;0)$ be the solution of the Fokker-Planck equation for $X$ with the boundary conditions $f(\alpha;t \mid x_0;0) = f(\beta;t \mid x_0;0) = 0$. The function $p_D(t) = \int_\alpha^\beta f(x;t \mid x_0;0)\, dx$ gives the fraction of samples of $X$ starting at $x_0 \in D$ that have not reached the boundaries of $D$ in $[0,t]$, that is, the probability $P(T > t)$. Integrating the backward Kolmogorov equation with respect to $x$ over $(\alpha,\beta)$, we find

since the drift and diffusion coefficients of $X$ do not depend explicitly on time. The integral of the above equation over the time interval $[0,\infty)$ gives an equation for the mean of $T$

since $\int_0^\infty (\partial p_D/\partial\tau)\, d\tau = p_D(\infty) - p_D(0) = -1$ and $\int_0^\infty p_D(\tau)\, d\tau = \mu_T$. •
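The equation for $\mu_T$ above can be solved by central finite differences; a sketch, using the Brownian-motion case $a = 0$, $b = 1$ on $D = (0,1)$, for which the exact solution is $\mu_T(x_0) = x_0(1 - x_0)$:

```python
def mean_exit_time(a, b, alpha, beta, m=200):
    """Solve a(x) mu' + 0.5 b(x)^2 mu'' = -1 on (alpha, beta),
    mu(alpha) = mu(beta) = 0, by central finite differences (Thomas algorithm)."""
    h = (beta - alpha) / m
    xs = [alpha + h * i for i in range(m + 1)]
    lo, di, up, rhs = [], [], [], []
    for i in range(1, m):                 # interior nodes
        x = xs[i]
        d = 0.5 * b(x) ** 2 / h ** 2
        c = a(x) / (2 * h)
        lo.append(d - c)                  # coefficient of mu[i-1]
        di.append(-2 * d)                 # coefficient of mu[i]
        up.append(d + c)                  # coefficient of mu[i+1]
        rhs.append(-1.0)
    n = len(di)                           # forward elimination
    for i in range(1, n):
        w = lo[i] / di[i - 1]
        di[i] -= w * up[i - 1]
        rhs[i] -= w * rhs[i - 1]
    mu = [0.0] * (m + 1)                  # back substitution; mu[0] = mu[m] = 0
    mu[m - 1] = rhs[n - 1] / di[n - 1]
    for i in range(n - 2, -1, -1):
        mu[i + 1] = (rhs[i] - up[i] * mu[i + 2]) / di[i]
    return xs, mu

xs, mu = mean_exit_time(a=lambda x: 0.0, b=lambda x: 1.0, alpha=0.0, beta=1.0)
print(mu[100])   # x0 = 0.5; exact mean exit time is 0.5 * (1 - 0.5) = 0.25
```

The central scheme is exact for this quadratic solution, so the numerical and analytic values coincide to rounding error.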
$$\cdots + \frac{1}{2}\, \frac{\partial^2 f(x;t \mid z;s+h)}{\partial z^2}\Big|_{z=\xi}\, (z-\xi)^2 + o\!\left( (z-\xi)^2 \right).$$

Integrating this expansion against $f(z;s+h \mid \xi;s)$ gives

$$f(x;t \mid \xi;s) = f(x;t \mid \xi;s+h) + \int_{\mathbb{R}} \frac{\partial f(x;t \mid z;s+h)}{\partial z}\Big|_{z=\xi}\, (z-\xi)\, f(z;s+h \mid \xi;s)\, dz + \int_{\mathbb{R}} \left[ \frac{1}{2}\, \frac{\partial^2 f(x;t \mid z;s+h)}{\partial z^2}\Big|_{z=\xi}\, (z-\xi)^2 + o\!\left( (z-\xi)^2 \right) \right] f(z;s+h \mid \xi;s)\, dz$$

or

$$\frac{f(x;t \mid \xi;s) - f(x;t \mid \xi;s+h)}{h} = \frac{\partial f(x;t \mid z;s+h)}{\partial z}\Big|_{z=\xi}\; \frac{1}{h} \int_{\mathbb{R}} (z-\xi)\, f(z;s+h \mid \xi;s)\, dz + \cdots$$
$$\frac{d}{dx}\left( a(x)\, f_s(x) \right) - \frac{1}{2}\, \frac{d^2}{dx^2}\left( b(x)^2\, f_s(x) \right) = 0$$

so that $a(x)\, f_s(x) - (1/2)\left( b(x)^2\, f_s(x) \right)' = -q$ or $h'(x) - 2\,\xi(x)\, h(x) = 2q$, where $h = b^2 f_s$, $\xi = a/b^2$, and $q$ is a constant, which must be zero by Eq. 7.63. Therefore, we have $h'(x) - 2\,\xi(x)\, h(x) = 0$, which yields the stated expression for $f_s$. •
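The resulting stationary density $f_s(x) = \left( c/b(x)^2 \right) \exp\!\left( 2\int^x \xi(y)\, dy \right)$ can be evaluated numerically; a sketch for illustrative Ornstein-Uhlenbeck coefficients $a(x) = -\rho x$, $b(x) = \sigma$, whose stationary variance is $\sigma^2/(2\rho)$:

```python
import math

# Stationary density from h' - 2*xi*h = 0 with h = b^2 * f_s and xi = a / b^2:
#   f_s(x) = (c / b(x)^2) * exp( 2 * integral of xi )
rho, sig = 1.0, 1.0
a = lambda x: -rho * x
b = lambda x: sig

dx = 0.01
xs = [-6.0 + dx * i for i in range(1201)]

# cumulative trapezoidal integral of xi from the left end of the grid
xi = [a(x) / b(x) ** 2 for x in xs]
I = [0.0]
for i in range(1, len(xs)):
    I.append(I[-1] + 0.5 * (xi[i] + xi[i - 1]) * dx)

fs = [math.exp(2.0 * Ii) / b(x) ** 2 for x, Ii in zip(xs, I)]
Z = sum(fs) * dx
fs = [v / Z for v in fs]                      # normalize to a probability density

var = sum(x * x * v for x, v in zip(xs, fs)) * dx
print(var, sig ** 2 / (2 * rho))              # both ~ 0.5 for this process
```

The numerically normalized density reproduces the known Gaussian stationary variance of the linear system.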
where $B$ is a Brownian motion. The stationary Fokker-Planck equation for the density $f_s$ of the stationary solution $X_s$ is (Eq. 7.65)

$$-\frac{\partial}{\partial x_1}\left( h_{,2}\, f_s \right) + \frac{\partial}{\partial x_2}\left( (h_{,1} + \rho\, h_{,2})\, f_s \right) + \frac{\pi g_0}{2}\, \frac{\partial^2 f_s}{\partial x_2^2} = 0.$$

Suppose that there exists a function $g$ such that $f_s(x) = g(h(x))$. Then the above Fokker-Planck equation becomes

$$\rho\left[ (h_{,2})^2\, g'(h) + g(h) \right] + \frac{\pi g_0}{2}\left[ (h_{,2})^2\, g''(h) + g'(h) \right] = 0$$

since $f_{s,1} = g'(h)\, h_{,1}$, $f_{s,2} = g'(h)\, h_{,2}$, $f_{s,22} = (h_{,2})^2\, g''(h) + g'(h)$, $h_{,12} = 0$, and $h_{,22} = 1$. Hence, $f_s$ viewed as a function of $h$ is the solution of

$$\rho\, \frac{\partial}{\partial x_2}\left[ h_{,2}\, f_s \right] + \frac{\pi g_0}{2}\, \frac{\partial^2 f_s}{\partial x_2^2} = 0$$

so that $\rho\, h_{,2}\, f_s + (\pi g_0/2)\, (\partial f_s/\partial x_2) = l(x_1)$, where $l$ is an arbitrary function of $x_1$. Because the left side of this equation approaches zero as $|x_2| \to \infty$ for each $x_1$ (Eq. 7.63), $l(x_1)$ must be zero. We have $\rho\, h_{,2}\, f_s + (\pi g_0/2)\, h_{,2}\, g' = 0$ or $\rho\, g + (\pi g_0/2)\, g' = 0$ following simplification with $h_{,2} = x_2$, so that $dg/g = -(2\rho/(\pi g_0))\, dh$. The solution of this equation is the stated stationary density $f_s$. •
The path integral method constructs the global solution of Eq. 7.55 by linking local solutions of this equation, that is, solutions over relatively small time intervals $[t_{k-1}, t_k]$, where $p = (0 = t_0 < t_1 < \cdots < t_n)$ is a partition of $[0,t]$ with mesh $\Delta(p) = \max_{1\le k\le n}(t_k - t_{k-1})$. Two properties of $X$ are used in the path integral method. First, $X$ is a Markov process so that its finite dimensional distributions can be obtained from the density of the initial state $X(0)$ and the density $f(\cdot\,;t \mid y;s)$ of the conditional vector $X(t) \mid (X(s) = y)$, $0 \le s \le t$, referred to as the transition density (Section 3.6.3). Second, if the time step $t_k - t_{k-1}$ is sufficiently small, the vector $X(t_k) \mid (X(t_{k-1}) = y)$ is approximately Gaussian with mean $y + a(y, t_{k-1})\,(t_k - t_{k-1})$ and covariance matrix $b(y, t_{k-1})\, b(y, t_{k-1})^T\, (t_k - t_{k-1})$.

For the numerical implementation of the path integral method both time and space need to be partitioned. Let $C_i$, $i = 1, \ldots, n$, be a partition of the set $D \subset \mathbb{R}^d$ in which $X$ takes values. If $X(t_{k-1}) \in C_i$, the probability of the event $\{X(t_k) \in C_j\}$ can be approximated by $\int_{C_j} f(x; t_k \mid y_i; t_{k-1})\, dx$, where $f(\cdot\,; t_k \mid y; t_{k-1})$ denotes the density of $X(t_k) \mid (X(t_{k-1}) = y)$ and $y_i \in C_i$ may be selected at the center of $C_i$. These considerations show that the evolution of $X$ can be approximated by a Markov chain with known transition probability matrix. If the drift and diffusion coefficients of $X$ do not depend explicitly on time, the transition probability matrix has to be calculated only once. Otherwise, the value of this matrix needs to be updated at each time step. Theoretical considerations on the path integral method and examples illustrating its application can be found in [132].
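A sketch of the path integral recursion for a scalar diffusion, with illustrative Ornstein-Uhlenbeck coefficients and grid; cell-to-cell probabilities are computed from the local Gaussian transition density:

```python
import math

def norm_cdf(x, m, s):
    return 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2.0))))

# Markov-chain (path integral) approximation for dX = -X dt + dB (illustrative)
a = lambda y: -y
bdiff = 1.0
dt = 0.05
dx = 0.1
centers = [-4.0 + dx * (i + 0.5) for i in range(80)]

s = bdiff * math.sqrt(dt)
P = []
for y in centers:                       # one row per source cell C_i
    m = y + a(y) * dt                   # local Gaussian mean
    row = [norm_cdf(c + dx / 2, m, s) - norm_cdf(c - dx / 2, m, s) for c in centers]
    tot = sum(row)
    P.append([r / tot for r in row])    # renormalize the truncated tails

# start from the cell nearest x0 = 0 and propagate to t = 3
p = [0.0] * len(centers)
p[40] = 1.0                             # centers[40] = 0.05
for _ in range(60):
    p = [sum(p[i] * P[i][j] for i in range(len(p))) for j in range(len(p))]

var = sum(c * c * pj for c, pj in zip(centers, p))
print(var)                              # close to the stationary variance 0.5
```

Because the drift and diffusion do not depend on time here, the transition matrix `P` is built only once, exactly as the text notes.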
Moment closure. Generally, the moment equations cannot be solved exactly be-
cause they form an infinite hierarchy (Example 7.31). Several heuristic solutions
have been advanced in nonlinear random vibration theory and physics to close the
moment equations. Most of these solutions are based on postulated relationships
between moments of the state vector X providing additional equations, which are
used to close the infinite hierarchy of moment equations [22, 175]. For example:
for the density of $X(t)$, where $f_k(x)$ are fixed densities and $p_k(t)$ are weights such that $p_k(t) \ge 0$ and $\sum_{k=1}^{n} p_k(t) = 1$ for all $t \ge 0$. If $X(t)$ is stationary, the weights $p_k$ are time-invariant. Let $X$ be a real-valued stationary diffusion process and let $e_i(p_1, \ldots, p_n)$ be the expression of a moment equation of order $i$ in which the moments of $X$ have been obtained from the density $f$. Define an objective function $e(p_1, \ldots, p_n) = \sum_{i=1}^{q_0} e_i(p_1, \ldots, p_n)^2$, where $q_0$ denotes a selected closure level. The optimal functions $p_k$ minimize the objective function, that is, they are the solutions of the system of equations $\partial e/\partial p_k = 0$, $k = 1, \ldots, n$ [74].
There is no justification for the relationships between moments of X postu-
lated by the central moment, Gaussian, and cumulant closure methods. Moreover,
the closure methods can deliver incorrect results, for example, negative even-order
moments [97]. Yet, these methods are used in large scale applications since they
are relatively simple and provide useful results in some cases [22, 105].
Example 7.43: Let $X$ be the solution of

$$dX(t) = -\left[ X(t) + a\, X(t)^3 \right] dt + \sqrt{\pi g_0}\; dB(t),$$

where $a$ and $g_0 > 0$ are some constants. The moments of $X$ satisfy the infinite hierarchy of equations

$$\dot{\mu}(p;t) = -p\, \mu(p;t) - a\, p\, \mu(p+2;t) + \frac{1}{2}\, \pi g_0\, p\,(p-1)\, \mu(p-2;t).$$

If $X$ is a stationary process, the Gaussian closure method gives the nonlinear algebraic equations

$$\begin{cases} -\mu(1) - a\left( 3\, \mu(1)\, \mu(2) - 2\, \mu(1)^3 \right) = 0,\\[4pt] -2\, \mu(2) - 2a\left[ 3\left( \mu(2) - \mu(1)^2 \right)^2 + 6\, \mu(1)^2\, \mu(2) - 5\, \mu(1)^4 \right] + \pi g_0 = 0, \end{cases}$$

for the first two moments of $X$. Approximate/exact values of the second moment $\mu(2)$ are 0.2773/0.2896, 0.3333/0.3451, 0.4415/0.4443, and 0.4927/0.4927 for $a = 1$, 0.5, 0.1, and 0.01, respectively, and $\pi g_0/2 = 1/2$. The approximations in this case are satisfactory ([175], Example 6.7, p. 228). ◊
Proof: The moment equations are given by Eq. 7.57. Let $\bar{\mu}(p;t) = E\!\left[ (X(t) - \mu(1;t))^p \right]$ denote the central moment of order $p$ of $X(t)$. The Gaussian closure requires that the moments of $X(t)$ of order three and higher behave as moments of Gaussian variables, for example, $\bar{\mu}(3;t) = 0$ and $\bar{\mu}(4;t) = 3\, \bar{\mu}(2;t)^2$ for the third and fourth order moments, so that

$$\mu(3;t) = 3\, \mu(1;t)\, \mu(2;t) - 2\, \mu(1;t)^3,$$
$$\mu(4;t) = 3\left[ \mu(2;t) - \mu(1;t)^2 \right]^2 + 6\, \mu(1;t)^2\, \mu(2;t) - 5\, \mu(1;t)^4.$$
These conditions and the moment equations for p = 1 and p = 2 are used in the Gaussian
closure procedure to calculate the first four moments of X. If X is stationary, we need to
solve the above coupled nonlinear algebraic equations to find the first two moments of the
state. •
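For the stationary case above with $\mu(1) = 0$, the second closure equation reduces to the quadratic $6a\,\mu(2)^2 + 2\,\mu(2) - \pi g_0 = 0$; a minimal sketch (the helper name is ours):

```python
import math

def gaussian_closure_second_moment(a, pi_g0=1.0):
    """Stationary Gaussian closure for dX = -(X + a X^3) dt + sqrt(pi g0) dB:
    with mu(1) = 0 the second moment solves 6 a m^2 + 2 m - pi g0 = 0."""
    if a == 0.0:
        return pi_g0 / 2.0                     # exact linear-case variance
    return (-2.0 + math.sqrt(4.0 + 24.0 * a * pi_g0)) / (12.0 * a)

for a in (0.5, 0.1, 0.01):                     # values quoted in the example
    print(a, round(gaussian_closure_second_moment(a), 4))
```

With $\pi g_0 = 1$ this reproduces the quoted closure values 0.3333, 0.4415, and 0.4927 for $a = 0.5$, 0.1, and 0.01.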
with the usual convention that $\mu(p,q;t) = 0$ if at least one of the arguments $(p,q)$ is strictly negative. These moment equations form an infinite hierarchy that cannot be solved exactly.

For a closure level $q_0 = 2$, the central moment and cumulant closure give the relationship

which can be used to close the moment equations of order $p + q = 1$ and 2. The corresponding approximate stationary solutions are $\mu(1,0) = 0$, $\mu(0,1) = 0$,

$$\mu(2,0) = \frac{1}{6a}\left( \sqrt{1 + \frac{3\, a\, \pi g_0}{\zeta\, \nu^3}} - 1 \right), \quad \mu(1,1) = 0, \quad \text{and} \quad \mu(0,2) = \frac{\pi g_0}{4\, \zeta\, \nu}.$$

If $a = 0$, the Duffing oscillator becomes a linear system so that the variance of the stationary displacement is $(\pi g_0)/(4\,\zeta\,\nu^3)$. ◊
Note: The above moment equations have been obtained from Eq. 7.57. They can also be derived by applying Itô's formula to the mapping $(X(t), \dot{X}(t)) \mapsto X(t)^p\, \dot{X}(t)^q$. ▲
Equivalent linearization. Let $X$ be the solution of Eq. 7.55 with $b(x,t) = b(t)$. Denote by $X_l$ the solution of the linear stochastic differential equation

This seems to be a reasonable criterion for finding the parameters of the approximating equation. The problem is that we cannot calculate the mean square error in Eq. 7.72 since the probability law of $X$ is not known. If the law of $X$ were known, there would be no need for constructing an approximation. To overcome this difficulty, we consider the following objective function.

Objective function. If Eq. 7.55 has additive noise and $X_l$ is defined by Eq. 7.71, then $\bar{a}$ is selected to minimize the m.s. error
Note: The objective function $\bar{e}$ in Eq. 7.73 depends only on the probability law of the Gaussian process $X_l$, defined completely by its first two moments, which can be calculated from Eq. 7.10. Because the moments of $X_l$ depend on the entries of $\bar{a}$, the objective function $\bar{e}$ in Eq. 7.73 is a function of $\bar{a}$. The optimal matrix $\bar{a}$ minimizes $\bar{e}$. The entries of the optimal matrix $\bar{a}$ satisfy a coupled system of nonlinear algebraic equations.

There are several reasons to question the accuracy of the approximating solution $X_l$ delivered by the equivalent linearization method: (1) there is little justification for using $\bar{e}$ in place of $e$, (2) the approximation $X_l$ is a Gaussian process while $X$ is not, and (3) the frequency contents of $X_l$ and $X$ may differ significantly. ▲
Despite these limitations, which can cause significant errors in reliability studies when using $X_l$ as an approximation for $X$, the equivalent linearization method is used extensively in engineering applications. Numerical results and theoretical considerations suggest that the method gives satisfactory approximations for the first two moments of the state vector at a fixed time [177, 153]. Theoretical considerations on the equivalent linearization method can be found in [19, 177, 153]. The equivalent linearization method has been extended to (1) accommodate Poisson rather than Gaussian noise [80] and (2) use nonlinear systems with known solutions to approximate the properties of the state of nonlinear systems whose solution is not known [175].
Example 7.45: Let $X$ be the displacement of a Duffing oscillator defined by

where $\beta > 0$ denotes the damping ratio, $a > 0$ provides a measure of nonlinearity, $\nu > 0$ is the frequency of the associated linear oscillator ($a = 0$), and $W$ is a stationary Gaussian white noise with mean zero and one-sided spectral intensity $g_0 > 0$. The displacement $X_l$ of the equivalent linear system is the solution of

$$\ddot{X}_l(t) + 2\,\beta\, \dot{X}_l(t) + \nu_{eq}^2\, X_l(t) = W(t),$$

where $\nu_{eq}$ is the only unknown coefficient. If the response is stationary, the natural frequency of the equivalent linear oscillator is

$$\nu_{eq} = \nu \left[ \frac{1}{2}\left( 1 + \sqrt{1 + \frac{3\, a\, \pi g_0}{\beta\, \nu^2}} \right) \right]^{1/2}.$$

The frequency $\nu_{eq}$ is equal to the frequency of the associated linear oscillator ($a = 0$) corrected by a factor depending on the magnitude of $a$. ◊
Proof: The linear differential equation for $X_l$ is constructed by replacing the nonlinear restoring force $\nu^2\left( X(t) + a\, X(t)^3 \right)$ by the linear restoring force $\nu_{eq}^2\, X_l(t)$. The solution of $\partial \bar{e}/\partial \nu_{eq}^2 = 0$ under the assumption that $X_l$ is a stationary process is

$$\nu_{eq}^2 = \nu^2\left( 1 + a\, \frac{E[X_l(t)^4]}{E[X_l(t)^2]} \right) = \nu^2\left( 1 + a\, \frac{3\, \pi g_0}{4\, \beta\, \nu_{eq}^2} \right),$$

where $\bar{e} = E\!\left[ \left( \nu^2\left( X_l(t) + a\, X_l(t)^3 \right) - \nu_{eq}^2\, X_l(t) \right)^2 \right]$. The last equality in the expression of $\nu_{eq}^2$ holds because $X_l$ is a Gaussian process and $E[X_l(t)^2] = \pi g_0/(4\, \beta\, \nu_{eq}^2)$ for a linear oscillator. The above nonlinear algebraic equation for $\nu_{eq}^2$ can be solved exactly in this case. Generally, the equations satisfied by the unknown parameters of an equivalent linear system have to be solved numerically. •
Perturbation method. Suppose that the linear and nonlinear parts of the matrices $a$ and $b$ in Eq. 7.55 are of order one and $\varepsilon$, $0 < \varepsilon \ll 1$, respectively, that is, $a = a_l + \varepsilon\, a_n$ and $b = b_l + \varepsilon\, b_n$, where the entries of the matrices $a$, $a_l$, $a_n$, $b$, $b_l$, and $b_n$ are of order one. Then the defining equation for $X$ is

$$dX(t) = \left( a_l(X(t), t) + \varepsilon\, a_n(X(t), t) \right) dt + \left( b_l(X(t), t) + \varepsilon\, b_n(X(t), t) \right) dB(t). \tag{7.74}$$

The solution is sought as a power series in $\varepsilon$, that is,

$$X(t) = X^{(0)}(t) + \varepsilon\, X^{(1)}(t) + \varepsilon^2\, X^{(2)}(t) + \cdots, \tag{7.75}$$

and is usually approximated by $X(t) \simeq X^{(0)}(t) + \varepsilon\, X^{(1)}(t)$, referred to as the first order perturbation solution.
The processes $X^{(0)}$ and $X^{(1)}$ in the first order perturbation solution satisfy the differential equations

Proof: The perturbation solution and the definition of $X$ (Eqs. 7.74 and 7.75) give

$$a_n(X(t), t) = a_n(X^{(0)}(t), t) + \sum_{i=1}^{d} \frac{\partial a_n}{\partial x_i}\!\left( X^{(0)}(t), t \right) \left( \varepsilon\, X_i^{(1)}(t) + \varepsilon^2\, X_i^{(2)}(t) + \cdots \right), \ldots$$

The above equations and approximations yield Eq. 7.76 by equating terms of the same order of magnitude. •
so that $X^{(0)}$ is a Gaussian process but $X^{(1)}$ is not. The process $X^{(1)}$ is the solution of a linear system driven by a polynomial of a filtered Gaussian process (Example 7.23). The stationary mean and variance of $X$ obtained from the first order perturbation solution are zero and $\mathrm{Var}[X(t)] \simeq \sigma_0^2\left( 1 - 3\,\varepsilon\,\sigma_0^2 \right) + o(\varepsilon)$, where $\sigma_0^2 = \pi g_0/(4\,\zeta\,\nu^3)$ is the variance of the displacement of the associated linear system ($\varepsilon = 0$), that is, the process $X^{(0)}$. ◊

Proof: The differential equations for $X^{(0)}$ and $X^{(1)}$ follow from Eq. 7.76.

The stationary processes $X^{(0)}$ and $X^{(1)}$ have mean zero. The variance of the first order perturbation solution is

The approach in Example 7.23 or direct calculations using the solutions of $X^{(0)}$ and $X^{(1)}$ can be used to find the approximate variance of $X$ based on the first order perturbation solution ([175], Example 6.11, p. 237). •
Example 7.47: Suppose that the damping ratio $\zeta$ of the Duffing oscillator in the previous example is also a small parameter, that is, $\zeta = \varepsilon > 0$. The perturbation method fails because the variance of the perturbation solutions increases indefinitely in time while the density of the stationary displacement $X$ is (Example 7.42)

$$f_s(x) = c\, \exp\!\left[ -\frac{2\,\varepsilon\,\nu^3}{\pi g_0}\left( x^2 + \frac{a}{2}\, x^4 \right) \right], \quad \varepsilon > 0,$$

so that the variance of the processes $X^{(0)}$ and $X^{(1)}$ approaches infinity as $t \to \infty$ since the linear oscillators defining these processes have no damping. •
Stochastic averaging. There are many applications in which the state $X$ varies slowly in time, that is, $\dot{X}(t) \sim O(\varepsilon)$, where $\varepsilon > 0$ is a small parameter ([122], Section 4.7). It is shown that approximate equations can be developed for the evolution of $X$ by averaging the differential equations developed for some memoryless transformations of this process. We illustrate this approach, referred to as the method of averaging, with two examples. A theorem justifying the validity of the method of averaging is stated.
$$\dot{\bar{a}}(t) = -\frac{\varepsilon}{2\pi\nu} \int_0^{2\pi} g\!\left( \bar{a}\cos(\psi),\, -\bar{a}\,\nu\sin(\psi),\, t \right) \sin(\psi)\, d\psi,$$

$$\dot{\bar{\varphi}}(t) = -\frac{\varepsilon}{2\pi\nu\,\bar{a}} \int_0^{2\pi} g\!\left( \bar{a}\cos(\psi),\, -\bar{a}\,\nu\sin(\psi),\, t \right) \cos(\psi)\, d\psi,$$

and $\bar{a}$ and $\bar{\varphi}$ represent averages of the amplitude and phase of $x(t)$ over a window of size $2\pi/\nu$. ◊
Proof: The change of variables,

$$d\bar{A}(t) = \varepsilon^2\left[ \frac{h_s(\bar{A}(t))}{\nu} + \frac{\pi\, s(\nu)}{2\,\nu^2\, \bar{A}(t)} \right] dt + \varepsilon\, \frac{\sqrt{\pi\, s(\nu)}}{\nu}\, d\bar{B}_1(t),$$

$$d\bar{\Phi}(t) = -\varepsilon^2\, \frac{h_c(\bar{A}(t))}{\nu\, \bar{A}(t)}\, dt + \varepsilon\, \frac{\sqrt{\pi\, s(\nu)}}{\nu\, \bar{A}(t)}\, d\bar{B}_2(t),$$

where
Proof: The change of variables $(X, \dot{X}) \mapsto (A, \Psi)$ applied to the defining equation for $X$ gives

$$\dot{A}(t) = -\frac{\varepsilon^2}{\nu}\, h\!\left( A\cos(\Psi),\, -\nu A\sin(\Psi) \right) \sin(\Psi) - \frac{\varepsilon}{\nu}\, Y(t)\sin(\Psi),$$

$$\dot{\Phi}(t) = -\frac{\varepsilon^2}{\nu A}\, h\!\left( A\cos(\Psi),\, -\nu A\sin(\Psi) \right) \cos(\Psi) - \frac{\varepsilon}{\nu A}\, Y(t)\cos(\Psi).$$

The temporal average over a time interval of duration $2\pi/\nu$ of the terms in the above equations that do not include the input $Y$ is

$$\dot{\bar{A}} = \frac{\varepsilon^2}{\nu}\, h_s(\bar{A}) - \frac{\varepsilon}{\nu}\, Y(t)\sin(\bar{\Psi}), \qquad \dot{\bar{\Phi}} = -\frac{\varepsilon^2}{\nu\,\bar{A}}\, h_c(\bar{A}) - \frac{\varepsilon}{\nu\,\bar{A}}\, Y(t)\cos(\bar{\Psi}),$$

where $\bar{\Psi}(t) = \nu t + \bar{\Phi}(t)$. We have performed a similar temporal average in the previous example.
The next step of the stochastic averaging method is to approximate the random input. It is assumed that the correlation time $\tau_c$ of the driving noise $Y$ is such that there exists a time $\Delta t > 0$ with the properties (1) $\bar{A}$ and $\bar{\Phi}$ are nearly constant in $[t - \Delta t, t]$ for any $t \ge 0$ and (2) $\Delta t \gg \tau_c$. Conditional on $\bar{A}(t - \Delta t) = a$ and $\bar{\Phi}(t - \Delta t) = \phi$, we have

so that

$$E[Y_1(t)] = -\frac{\varepsilon}{\nu a} \int_{t-\Delta t}^{t} E[Y(t)\, Y(\sigma)]\, \cos(\nu t + \phi)\, \cos(\nu\sigma + \phi)\, d\sigma \simeq -\frac{\varepsilon}{\nu a} \int_{-\infty}^{0} r(\tau)\, \cos(\nu t + \phi)\, \cos(\nu(t+\tau) + \phi)\, d\tau$$

$$= -\frac{\varepsilon}{2\,\nu a} \int_{-\infty}^{0} r(\tau)\left[ \cos(\nu\tau) + \cos(\nu(2t+\tau) + 2\phi) \right] d\tau = -\frac{\varepsilon}{2\,\nu a} \int_{-\infty}^{0} r(\tau)\, \cos(\nu\tau)\, d\tau + \text{oscillatory terms}.$$
The second equality in the above equation holds approximately since the correlation function is nearly zero for time lags larger than $\tau_c \ll \Delta t$. The oscillatory terms vanish by temporal averaging over a time interval $[0, 2\pi/\nu]$ so that

$$E[Y_1(t)] \simeq -\frac{\varepsilon}{2\,\nu a} \int_{-\infty}^{0} r(\tau)\, \cos(\nu\tau)\, d\tau = -\frac{\varepsilon}{4\,\nu a} \int_{-\infty}^{\infty} r(\tau)\, \cos(\nu\tau)\, d\tau = -\frac{\pi\,\varepsilon}{2\,\nu a}\, s(\nu).$$

The correlation and covariance functions of $Y_1$ are

$$r_1(\tau) = E[Y_1(t)\, Y_1(t+\tau)] \simeq E\!\left[ Y(t)\, Y(t+\tau)\, \sin(\nu t + \phi)\, \sin(\nu(t+\tau) + \phi) \right] = \frac{1}{2}\, r(\tau)\left[ \cos(\nu\tau) - \cos(\nu(2t+\tau) + 2\phi) \right] = \frac{1}{2}\, r(\tau)\, \cos(\nu\tau) + \text{oscillatory terms}$$

and $c_1(\tau) \simeq r_1(\tau) - \left( \frac{\pi\varepsilon}{2\nu a}\, s(\nu) \right)^2 = r_1(\tau) + O(\varepsilon^2)$, respectively. Because $r_1$ and $c_1$ are proportional to $r$, $Y_1$ is a broadband process that can be approximated by a white noise with intensity

$$q_1 = \frac{1}{2} \int_{-\infty}^{\infty} r(\tau)\, \cos(\nu\tau)\, d\tau + \text{oscillatory terms} \simeq \pi\, s(\nu),$$

where the last equality results by neglecting the oscillatory terms. Hence, $Y_1$ can be approximated by

$$Y_1(t) \simeq -\frac{\pi\,\varepsilon}{2\,\nu a}\, s(\nu) - \sqrt{\pi\, s(\nu)}\; W_1(t),$$

where $W_1$ is a stationary Gaussian white noise with mean zero. Similar considerations can be used to show that $Y_2(t) \simeq -\sqrt{\pi\, s(\nu)}\; W_2(t)$, where $W_2$ is a zero-mean Gaussian white noise. The above approximate representations of $Y_1$ and $Y_2$ introduced in the differential equations for $(\bar{A}, \bar{\Phi})$ give the stochastic differential equations for the approximate amplitude and phase of $X$. •
It turns out that the approach used in the previous two examples to derive simplified differential equations for the state of a system driven by random noise is valid under some conditions. Consider an $\mathbb{R}^d$-valued stochastic process defined by the differential equation

where the $\mathbb{R}^d$-valued functions $\alpha$, $\beta$ satisfy a sequence of conditions given by the Stratonovich-Khas'minskii theorem [112], $\varepsilon > 0$ denotes a small parameter, and $Y$ is an $\mathbb{R}^{d'}$-valued stationary Gaussian process whose coordinates are broadband processes with mean zero.
If $\alpha$ and $\beta$ in Eq. 7.77 satisfy the conditions in [112], then the solution $X$ of this equation converges in the weak sense to the solution of

(7.79)

Note: The approximate solution $\bar{X}$ of $X$ in Eq. 7.78 has the same functional form as the stochastic differential equation for $(\bar{A}, \bar{\Phi})$ in Example 7.49. The conditions that the matrices $\alpha$ and $\beta$ in Eq. 7.77 need to satisfy and the type of weak convergence of $X$ to $\bar{X}$ are defined in [112], and are not stated here.

The average operator $T^{av}$ is applied over all explicit time arguments. If the functions $\alpha$ and $\beta$ are periodic with period $\tau_0$, this operator becomes

$$T^{av}\{\cdot\} = \frac{1}{\tau_0} \int_{t}^{t+\tau_0} \{\cdot\}\, ds.$$

The symbol $\partial\beta/\partial X$ denotes a $(d,d)$ matrix with rows $(\partial\beta_k/\partial X_1, \ldots, \partial\beta_k/\partial X_d)$, $k = 1, \ldots, d$. The entries $b_{ij}$ of $b$ can be calculated from the expressions of $b\, b^T$ given by Eq. 7.79. ▲
Proof: The exact differential equations for $X = (A, \Phi)$ in Example 7.49 show that

$$\alpha_1 = -\frac{1}{\nu}\, h\!\left( A\cos(\Psi),\, -\nu A\sin(\Psi) \right)\sin(\Psi), \qquad \alpha_2 = -\frac{1}{\nu A}\, h\!\left( A\cos(\Psi),\, -\nu A\sin(\Psi) \right)\cos(\Psi),$$

$$\beta_1 = -\frac{1}{\nu}\, Y\sin(\Psi), \qquad \beta_2 = -\frac{1}{\nu A}\, Y\cos(\Psi),$$

with the notation in Eq. 7.77, where $\Psi(t) = \nu t + \Phi(t)$. The first coordinate of the drift of the stochastic differential equation for the approximate amplitude and phase of $X$ is

$$a_1 = T^{av}\{\alpha_1\} + T^{av}\left\{ \int_{-\infty}^{0} E\!\left[ \left( \frac{\partial\beta_1}{\partial X_1} \right)_t (\beta_1)_{t+\tau} + \left( \frac{\partial\beta_1}{\partial X_2} \right)_t (\beta_2)_{t+\tau} \right] d\tau \right\}.$$

Because $\partial\beta_1/\partial X_1 = \partial\beta_1/\partial A = 0$ and $\partial\beta_1/\partial X_2 = \partial\beta_1/\partial\Phi = -Y\cos(\Psi)/\nu$, the second term on the right side of the above equation is

where the last equality results by time averaging, as performed in Example 7.49 to find $E[Y_1(t)]$. Hence, we have $a_1 = h_s(A)/\nu + \pi\, s(\nu)/(2\,\nu^2 A)$. The first and second components of $a_1$, that is, $h_s(A)/\nu$ and $\pi\, s(\nu)/(2\,\nu^2 A)$, correspond to the time and the stochastic averaging (Example 7.49). Similar calculations can be used to find the remaining drift and diffusion coefficients. For example, the entry $(1,1)$ of $b\, b^T$ is

$$b_{11}^2 + b_{12}^2 = \frac{1}{\nu^2}\, T^{av}\left\{ \int_{-\infty}^{\infty} r(\tau)\, \sin(\nu t + \varphi)\, \sin(\nu(t+\tau) + \varphi)\, d\tau \right\}.$$

The expression under the integral coincides with the approximate correlation function of the process $Y_1$ in Example 7.49. •
with the same notation as in Examples 7.49 and 7.50. The approximate amplitude and phase of $X$ are the solutions of

$$d\bar{A}(t) = -\varepsilon^2\left( \frac{\nu\, \bar{A}(t)}{2} - \frac{\pi\, s(\nu)}{2\,\nu^2\, \bar{A}(t)} \right) dt + \varepsilon\, \frac{\sqrt{\pi\, s(\nu)}}{\nu}\, d\bar{B}_1(t),$$

$$d\bar{\Phi}(t) = \varepsilon\, \frac{\sqrt{\pi\, s(\nu)}}{\nu\, \bar{A}(t)}\, d\bar{B}_2(t).$$

An alternative form of the amplitude equation is

Note: The above stochastic differential equations for $\bar{A}$ and $\bar{\Phi}$ can be obtained by following the approach in Example 7.49 or using the formulation in Eqs. 7.78 and 7.79. The equations for $\bar{A}$ and $\bar{\Phi}$ were originally derived in [8]. △
Monte Carlo simulation. Monte Carlo simulation is the most general and con-
ceptually simple method for estimating properties of the state vector X. The
method involves (1) the generation of samples of the driving noise B in Eq. 7.55,
(2) numerical solutions of Eq. 7.55 to obtain samples of X (Sections 4.7.3 and
5.3.3.2), and (3) statistical techniques for estimating the required properties of X.
The only limitation of the Monte Carlo method is the computation time,
which can be excessive. For example, the estimation of the reliability of very
safe dynamic systems requires the generation of many samples of X since the
probability that the design conditions are violated is extremely low. The following
example presents another practical situation in which Monte Carlo simulation can
be inefficient.
Example 7.52: The process $X(t) = x\, e^{-t/2 + B(t)}$, called geometric Brownian motion, is the solution of $dX(t) = X(t)\, dB(t)$, $t \ge 0$, with the initial condition $X(0) = x$, where $B$ denotes a Brownian motion (Example 7.35). Suppose we want to estimate the moment of order $q$ of $X(t)$ from samples of this process generated by Monte Carlo simulation. Let $\hat{\mu}(q;t)$ be an estimator of $E[X(t)^q]$ defined as the arithmetic average of the random variables $X_i(t)^q$, where $X_i(t)$, $i = 1, \ldots$, are independent copies of $X(t)$. The number of samples of $X$ required such that the coefficient of variation of $\hat{\mu}(q;t)$ has a specified value $v_{req}$ is $n_{req} = \left( e^{q^2 t} - 1 \right)/v_{req}^2$. This number can be very large, for example, $n_{req} = 1{,}192{,}000$ samples are needed so that $v_{req} = 0.05$ for $t = 1$ and $q = 4$. ◊
Proof: We have

where the last equality holds because $q\, B(t) \sim N(0, q^2 t)$. This result can also be obtained by Itô's formula applied to the function $X(t)^q$.

Let $\hat{\mu}(q;t) = \frac{1}{n} \sum_{i=1}^{n} X_i(t)^q$ be an estimator of $E[X(t)^q]$, where $X_i(t)$ are independent copies of $X(t)$. This estimator is unbiased with variance

and coefficient of variation $\mathrm{C.o.v.}[\hat{\mu}(q;t)] = \sqrt{\left( e^{q^2 t} - 1 \right)/n}$. The required number of samples $n_{req}$ results from the condition $\mathrm{C.o.v.}[\hat{\mu}(q;t)] = v_{req}$. •
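A sketch of the sample-size formula $n_{req} = \left( e^{q^2 t} - 1 \right)/v_{req}^2$ (the helper name is ours):

```python
import math

def n_required(q, t, v_req):
    """Samples needed so that the coefficient of variation of the arithmetic-mean
    estimator of E[X(t)^q] for geometric Brownian motion equals v_req:
    n_req = (exp(q^2 t) - 1) / v_req^2."""
    return math.ceil((math.exp(q * q * t) - 1.0) / v_req ** 2)

print(n_required(2, 1.0, 0.05))   # moderate q already needs ~2.1e4 samples
print(n_required(4, 1.0, 0.05))   # the count grows like exp(q^2 t)
```

The exponential growth in $q^2 t$ is what makes the direct Monte Carlo estimation of high-order moments of this process inefficient.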
where $S$ is an $\mathbb{R}^{d'}$-valued semimartingale defined by Eqs. 7.30 and 7.32 and the coordinates of the $\mathbb{R}^{d_P}$-valued process $P$ are polynomials of $S$ (Eq. 7.29). Polynomials of semimartingales are semimartingales since sums of semimartingales are semimartingales and so are smooth memoryless transformations of semimartingales (Itô's formula, Section 4.6.2).

As in the previous sections, our objective is to develop differential equations for properties of the state vector $X$. The analysis is based on Itô's formula for semimartingales (Eq. 7.2). This formula can be applied to functions of the state $X$ in Eqs. 7.80 and 7.81 or to functions of an augmented state vector $Z$ including $X$. We have referred to these two ways of using the Itô formula as the direct and the state augmentation methods, respectively.
$$X_p(t)\, X_q(t) - X_p(0)\, X_q(0) = \sum_{i=1}^{d} \int_{0+}^{t} \left[ \delta_{pi}\, X_q(s-) + X_p(s-)\, \delta_{qi} \right] dX_i(s) + \cdots$$

The equality $X_p(s) = X_p(s-) + \Delta X_p(s)$ and previous calculations (Eq. 7.38) show that the last term on the right side of the above equation is $\sum_{0<s\le t} \Delta X_p(s)\, \Delta X_q(s)$. The average of the resulting equation is

$$\cdots = \sum_{0<s\le t} \sum_{\alpha=1}^{d_C} E\!\left[ K_{p\alpha}(s)\, K_{q\alpha}(s) \right] E\!\left[ \left( \Delta C_\alpha(s) \right)^2 \right].$$
Generally, Eq. 7.82 is not a differential equation for the moments of $X$, but it becomes such an equation if $a$ and $b$ are polynomials of $X$. Usually, the collection of equations for moments of $X$ forms an infinite hierarchy so that these equations cannot be solved exactly. Approximations, for example, closure methods, can be used to find the second moment properties of $X$.
$$\frac{\partial\varphi(u;t)}{\partial t} = \sqrt{-1}\, \sum_{i=1}^{d} u_i\, E\!\left[ e^{\sqrt{-1}\,u^T X(t-)}\, a_i(X(t-), t) \right] - \frac{1}{2} \sum_{i,j=1}^{d} u_i\, u_j\, E\!\left[ e^{\sqrt{-1}\,u^T X(t-)}\, \left( H(t)\, H(t)^T \right)_{ij} \right]$$

$$+ \sum_{\alpha=1}^{d_C} \lambda_\alpha\left\{ E\!\left[ e^{\sqrt{-1}\,u^T X(t-)}\; e^{\sqrt{-1}\, \sum_{i=1}^{d} u_i\, K_{i\alpha}(t)\, \Delta C_\alpha(t)} \right] - \varphi(u;t) \right\}. \tag{7.83}$$

Proof: The Itô formula gives

$$e^{\sqrt{-1}\,u^T X(t)} - e^{\sqrt{-1}\,u^T X(0)} = \sum_{i=1}^{d} \int_{0+}^{t} \sqrt{-1}\, u_i\, e^{\sqrt{-1}\,u^T X(s-)}\, dX_i(s) + \cdots,$$

where

$$dX_i(s) = a_i(X(s-), s)\, ds + \sum_{k=1}^{d_B} H_{ik}(s)\, dB_k(s) + \sum_{\alpha=1}^{d_C} K_{i\alpha}(s)\, dC_\alpha(s)$$

and $\Delta X_i(s) = \sum_{\alpha=1}^{d_C} K_{i\alpha}(s)\, \Delta C_\alpha(s)$. The above Itô formula, the assumption that all expectations in Eq. 7.83 exist and are finite, and considerations similar to those used to derive Eq. 7.39 give Eq. 7.83 by differentiation with respect to time. •
The condition in Eq. 7.83 is not a partial differential equation for $\varphi$ because the expectations in this equation cannot be expressed as functions of $\varphi$ and its partial derivatives for arbitrary $a$, $b$, $H$, and $K$. This condition becomes a partial differential equation for $\varphi$ under rather restrictive conditions. More general systems can be analyzed by the state augmentation method discussed in the next section.
where p > 0, ~, and a are some constants and La is an a-stable process. The
marginal distribution of X has a heavy tail, that is,
acp acp a
at= -pu au -lui cp.
for ~ = 0 and a = 1. This equation is a special case of the above partial differen-
tial equation of cp. ¢
Proof: Note that the process La in this example does not belong to the class of inputs S
in Eq. 7.83 so that this equation cannot be used. The It() formula applied to the function
X(t) ~ eHuX(t) gives
The expectation of the terms on the left side of the above equation gives φ at times t and
t = 0. However, the expectation of the right side of this equation cannot be calculated term
by term since individual terms may not have finite expectations. Hence, it is not possible to
find a partial differential equation for φ in this way. If we calculate the expectation term by
term, the result of these formal calculations, followed by differentiation with respect to time
and considerations as in Example 7.20, yields the above equation for the characteristic
function. ∎
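The heavy-tailed behavior of X can be checked numerically. The sketch below is an illustration, not the book's computation; the values of ρ, α, and the step size are made up. It integrates dX(t) = −ρ X(t) dt + dL_α(t) with an Euler scheme, using the dt^{1/α} scaling of symmetric α-stable increments:

```python
import numpy as np
from scipy.stats import levy_stable

# Euler scheme for dX(t) = -rho*X(t) dt + dL_alpha(t); the increments of a
# symmetric alpha-stable process over a step dt scale as dt**(1/alpha).
rng = np.random.default_rng(1)
alpha, rho, dt, n = 1.5, 0.5, 0.01, 50_000
dL = dt**(1.0/alpha) * levy_stable.rvs(alpha, 0.0, size=n, random_state=rng)
x = np.empty(n)
x[0] = 0.0
for i in range(1, n):
    x[i] = x[i-1] - rho*x[i-1]*dt + dL[i]
# the sample kurtosis far exceeds the Gaussian value 3: a heavy-tail signature,
# since the fourth moment of an alpha-stable marginal is infinite
kurt = np.mean(x**4) / np.mean(x**2)**2
print(kurt > 3.0)   # prints True
```

The sample kurtosis is dominated by the largest jumps and grows without bound as the sample size increases, which is the practical symptom of the power-law tail discussed below.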
Let X_i = X(i Δt), i = 1, …, n, be a sample of X generated by Monte Carlo
simulation with a time step Δt > 0. Let X_(1) ≥ ⋯ ≥ X_(n) be the order statistics
512 Chapter 7. Deterministic Systems and Stochastic Input
derived from the sample X_1, …, X_n. For example, X_(1) = max_{1≤i≤n} X_i and X_(n) =
min_{1≤i≤n} X_i are the largest and the smallest observations in the record of X. It can be
shown that the Hill estimator [100]

H_{k,n} = (1/k) Σ_{j=1}^{k} log( X_(j) / X_(k+1) )

converges in probability to 1/β as k, n → ∞ such that k/n → 0 [100]. The range (2.5, 5)
of the estimated values of β corresponds to ratios k/n = 0.0001, 0.0005, 0.001, 0.005, and
0.01 and n = 1,000,000.
The parameter β gives the rate of decay of the tail of the distribution of X(t). We
also note that the asymptotic expression of P(X(t) > x) gives

log(P(X(t) > x)) ≈ log(c) − β log(x) as x → ∞,

so that the graph of (log(x), log(P(X(t) > x))) is a straight line with slope −β. It is common
to refer to this behavior of the tail of the distribution of X as a power law [15]. ∎
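The Hill estimator is easy to implement. The sketch below checks it on synthetic Pareto samples with a known tail index β = 3 (a stand-in for the simulated record discussed above):

```python
import numpy as np

def hill_estimator(sample, k):
    # H_{k,n} = (1/k) * sum_{j=1}^{k} log(X_(j) / X_(k+1)),
    # with X_(1) >= ... >= X_(n) the descending order statistics
    x = np.sort(sample)[::-1]
    return np.mean(np.log(x[:k] / x[k]))

rng = np.random.default_rng(0)
beta, n = 3.0, 1_000_000
sample = (1.0 - rng.random(n))**(-1.0/beta)   # Pareto: P(X > x) = x**(-beta)
k = int(0.001*n)                              # ratio k/n = 0.001, as in the text
beta_hat = 1.0/hill_estimator(sample, k)
print(beta_hat)   # close to beta = 3
```

For exact Pareto data the estimator is unbiased, so even the small ratio k/n = 0.001 used in the text recovers β accurately; for data that are only asymptotically power-law, the choice of k trades bias against variance.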
which becomes a partial differential equation for φ if the coefficients a and b are
polynomials of X. ♦

Proof: The augmented vector Z is the solution of

dX(t) = a(X(t)) dt + 3 b(X(t)) (Y(t)² − t) dB(t),
dY(t) = dB(t).
The differential equations for the moments and the characteristic function of Z result by ap-
plying the Itô formula to the functions X(t)^p Y(t)^q and e^{√−1 u^T Z(t)}, respectively, taking
the average, and differentiating the result with respect to time. ∎
7.4 Applications

The developments in the first part of this chapter are used to illustrate solu-
tions of some stochastic problems, which can be defined by Eq. 7.1. We present
elementary criteria for model selection (Section 7.4.1), demonstrate how stochas-
tic differential equations and diffusion processes can be used to describe the evo-
lution of some properties of random heterogeneous materials (Section 7.4.2), dis-
cuss methods for reliability analysis based on the crossing theory for stochastic
processes, the Fokker-Planck equation, probability bounds, and Monte Carlo simula-
tion (Section 7.4.3), introduce elements of finance and establish the Black-Scholes
formula (Section 7.4.4), and give essentials of linear estimation and Kalman-Bucy
filtering (Section 7.4.5).
7.4.1 Models
In many cases, the laws defining the evolution in time of the state X of a
system are either partially understood or too complex to be derived from basic
principles, for example, climatic changes, wave and wind forces, stock prices,
seismic ground acceleration, and many other phenomena. A common approach in
these cases is to select a probabilistic model for X.
The selection of a model for X is based on the available information,
which consists of (1) one finite record or a set of finite records of X and (2) phys-
ical properties of this process; for example, if X(t) represents the strength of a
fiber at location t, it must be positive. Because of the limited information, the
model selection problem does not have a unique solution. Generally, there are
many competing models, that is, models that are consistent with the available
information. The selection of a model for X involves the following two steps.
1. Specify a finite collection of competing models M_k, k = 1, 2, …, m.
There are no theorems defining the models M_k. Simplicity and use should
guide the selection of the collection of competing models. Nevertheless, in
applications, familiarity with a particular model plays a significant role in
the construction of the collection of competing models.
2. Find the model M_opt = M_{k₀}, k₀ ∈ {1, …, m}, that is optimal in some
sense. In Section 9.8.2 we present a Bayesian framework for finding M_opt.
We use examples in this section to illustrate the classical approach for model
selection and demonstrate the practical importance of accounting for use in the
solution of the model selection problem.
It has been shown that earth climate is strongly correlated with calcium
concentration in ice deposits [199]. This dependence suggests that calcium con-
centration records obtained from ice-core samples can be used as a proxy for the
evolution of earth climate in time. Figure 7.12 shows an 80,000 year record of the
logarithm of the calcium concentration.

[Figure 7.12: calcium concentration record plotted against time (years).]
A stochastic differential equation driven by Brownian and α-stable noise (Eq. 7.84)
has been proposed to model the evolution of earth climate, where u is a two-well
potential, B denotes a Brownian motion, L_α is an α-stable process, and σ_Y, σ_1
are some constants [55].
Note: According to [55], Eq. 7.84 is a competing model since it is consistent with (1) the
record in Fig. 7.12 for α = 1.75 and σ_Y/σ_1 = 3 and (2) current climate dynamics theories.
Note that the differential equation of X can be given in the form
The selection of the stochastic process in Eq. 7.84 essentially follows the
classical approach in modeling. First, a functional form has been postulated for the
logarithm of the calcium concentration X. Second, the record in Fig. 7.12 has been
used to estimate the parameters of the postulated model. Details of the method
used to estimate the parameters of the processes in Eq. 7.84 are in [55]. The
potential in Eq. 7.84 has been obtained from the marginal histogram of X and the
relationship between the marginal distribution and the potential (Example 7.41).
The noise properties have been obtained from the statistical analysis of the record
in Fig. 7.12 corrected by the estimated drift of X. The analysis has shown that the
driving noise has a component of the α-stable type with α = 1.75 [55].
In most applications the collection M_k, k = 1, …, m, of competing mod-
els consists of a single model M with a specified functional form but unknown
parameters. Therefore, the model selection problem is reduced to the estimation
of the unknown parameters of M. This classical solution of the model selection
problem has been used in [55] to establish the model M in Eq. 7.84.
Translation processes can have any marginal distribution F but their cor-
relation function r cannot be chosen independently of F, and their finite dimen-
sional distributions F_n, n ≥ 2, are uniquely defined by F and r (Section 3.6.6 in
this book, [79], Section 3.1.1). Conditional Gaussian processes can be useful in
some particular applications but they cannot be calibrated to a specified marginal
distribution and correlation function ([79], Section 3.4). Diffusion processes can
match the marginal distribution of any signal provided that its correlation is ex-
ponential [122]. Filtered Poisson processes can match the correlation function
of any time series but they can only approximate its marginal distribution ([79],
Section 3.3).
Translation processes have attractive features for applications. However,
they cannot be viewed as universal non-Gaussian models. In addition to restric-
tions on their correlation and finite dimensional distribution functions, these mod-
els are not recommended in some cases for other reasons. For example, suppose
that our objective is to find properties of the state X of a system driven by a non-
Gaussian process Y. The representation of Y by a translation model Y_T causes
difficulties since there is no theory for finding properties of X driven by Y_T. The
output of a linear filter to Poisson white noise or to a polynomial of a class of
Gaussian processes is a preferable model for Y since the augmented system of
equations consisting of the defining differential equations for Y and X can be
solved by the methods discussed in this chapter. We consider the latter model of
Y in this section.
Suppose that the non-Gaussian input Y to a system with state X can be
modeled by the output of the linear filter in Eq. 7.85 to a polynomial Σ_{l=0}^{n} S^l of
an Ornstein-Uhlenbeck process S. Properties of X can be obtained by the state
augmentation method discussed earlier in this chapter, the representation of Y,
and the evolution equation for X. The selection of a model for Y is by no means
trivial and may require many iterations.

Model of input Y.

Ÿ(t) + β Ẏ(t) + ν² Y(t) = Σ_{l=0}^{n} S(t)^l,
dS(t) = −α S(t) dt + σ √(2α) dB(t),   (7.85)
Note: We have shown in Section 7.2.2.3 that the moment equations for the diffusion pro-
cess Z = (Y, Ẏ, S) are closed so that the moments of any order of the ℝ³-valued random
variable Z(t) can be calculated exactly for any t ≥ 0. Applying the Itô formula to products
of the coordinates of Z(t) and Z(s), averaging, and differentiating with respect to t gives
the above differential equation for r_{pq}(t, s).

The following example shows that the correlation functions r_{pq} can be calculated
from the above equation. Hence, we can calculate exactly the correlation function of the
non-Gaussian process Y in Eq. 7.85 and moments of any order of Y(t), t ≥ 0. ▲
Example 7.55: Suppose that Σ_{l=0}^{n} S(t)^l = S(t)², β = 0.25, ν² = 1.6, α = 0.12,
and σ = 1 in Eq. 7.85. The stationary skewness and kurtosis coefficients of Y are
γ₃ ≈ 1.8 and γ₄ ≈ 8, respectively. Hence, if we need to analyze the response of a
system to an input with γ₃ ≈ 1.8 and γ₄ ≈ 8, the process Y can be used to model
this input. Figure 7.13 shows some of the correlation functions of the stationary
ℝ³-valued process Z = (Y, Ẏ, S).

[Figure 7.13: correlation functions, including E[Ẏ(t) Ẏ(t + τ)], plotted against the time lag τ.]

The properties of the linear filter defining Y control to a great extent the correlation
function of this process if Σ_{l=0}^{n} S^l is a broadband process. ♦
Proof: The moments μ(p, q, r; s) are given in Example 7.23. The differential equations
giving the correlation functions r_{pq}(t, s) = E[Z_p(t) Z_q(s)] involve only moments of Z
since the drift coefficients a_p(Z(t), t) are polynomials of Z. However, some of these
moments are of order higher than 2 so that they are not correlations. The moments
E[Z₃(t)² Z₁(s)], E[Z₃(t)² Z₂(s)], and E[Z₃(t)² Z₃(s)], t ≥ s, in the differential
equations for the correlation function of Z are not known but can be calculated simply. ∎
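Example 7.55 can also be checked by direct Monte Carlo simulation. The sketch below assumes the Ornstein-Uhlenbeck form dS = −αS dt + σ√(2α) dB used above and a simple Euler integration scheme; both are assumptions of this illustration, not the book's computation:

```python
import numpy as np

# Euler integration of Ydotdot + beta*Ydot + nu2*Y = S(t)**2 with S an
# Ornstein-Uhlenbeck process; estimates stationary skewness/kurtosis of Y.
rng = np.random.default_rng(2)
beta, nu2, a, sigma = 0.25, 1.6, 0.12, 1.0
dt, nsteps, npaths = 0.02, 40_000, 200
s = rng.normal(0.0, sigma, npaths)            # start S at stationarity
y = np.zeros(npaths)
ydot = np.zeros(npaths)
samples = []
for i in range(nsteps):
    s += -a*s*dt + sigma*np.sqrt(2.0*a*dt)*rng.normal(size=npaths)
    y, ydot = y + ydot*dt, ydot + (s**2 - beta*ydot - nu2*y)*dt
    if i > nsteps//2:
        samples.append(y.copy())              # retain post-transient samples
z = np.concatenate(samples)
z -= z.mean()
gamma3 = np.mean(z**3)/np.mean(z**2)**1.5
gamma4 = np.mean(z**4)/np.mean(z**2)**2
print(gamma3, gamma4)   # positive skewness, kurtosis well above 3
```

Because the filter is slow relative to the correlation time 1/α of S, the response inherits much of the strong positive skewness of the chi-square-like input S², consistent with the values γ₃ ≈ 1.8 and γ₄ ≈ 8 quoted above.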
Consider a system with state X and let Y and Y_T be two models of the input
to this system, which are consistent with the available information. The model Y
is defined by a differential equation of the type in Eq. 7.85 and the model Y_T is a
translation process. The construction of the model Y exceeds by far the effort
required to define Y_T. However, once Y has been obtained, properties of the state
X corresponding to Y can be established by the methods in this chapter. On the
other hand, there are no practical methods for calculating properties of X driven
by Y_T.
holds for all t ≥ 0 and almost all ω's, where v_D denotes the volume of D. Hence,
the ensemble average E[Γ(ξ, t)] = γ(t) can be obtained from the spatial average
of an arbitrary sample of Γ at any time t and can be interpreted as a strain measure
at time t induced by the external action γ_e in a homogeneous material.
Let X(ξ, t) be an internal variable characterizing a material property at lo-
cation ξ ∈ D and time t ≥ 0, for example, one of the Euler angles of the atomic
lattice orientation in a polycrystal. Because material properties and internal strains
vary randomly in space, X is a random function defined on the same probability
space as Γ. Our objective is to derive an evolution equation for X. We proceed
under the assumptions that (1) the evolution of material properties is governed
only by plastic straining and (2) the internal variable X takes values on the real
line. Generally, X is an ℝ^d-valued random function but the assumption that X is
real-valued is satisfactory in many cases [14, 92].

If the material were homogeneous, then X(ξ, t) = X_h(t) would have the
same value everywhere in D and its evolution would be given by a deterministic
differential equation of the type Ẋ_h(t) = a(X_h(t)) γ̇(t). The function a is prob-
lem dependent; for example, a(x) is proportional to 2 cos(x) − c if x denotes
the atomic lattice orientation of a planar crystal with two slip systems, where c is
a constant (Section 8.6.2.1).
where ξ is a new time defined by dξ(t) = γ̇(t) dt, B denotes a Brownian mo-
tion, q_{W,0} is a constant, and b(·) is a function of the state X.
Proof: Because the medium is random and heterogeneous, the internal variable X is a
random function depending on time and space. The evolution of X is characterized by the
change of its probability law in time. However, the determination of the evolution of the
probability law of X is a very difficult task. A simplified description of this evolution has
been proposed in [92]. Let C be a collection of points ξ in D that are sufficiently far apart
so that the correlation between strain processes at distinct points in C can be neglected, that
is, the processes Γ(ξ, ·) and Γ(ξ′, ·) in Eq. 7.86 can be assumed to be uncorrelated, where
ξ, ξ′ ∈ C and ξ ≠ ξ′. The collection of functions of time Γ(ξ, ·), ξ ∈ C, can be viewed
as samples of a stochastic process γ̇(·) + Y(·), where γ̇ is the spatial average of the strain
rates in C and the noise Y captures the sample to sample variation of these strain rates.
Hence, the evolution of the internal variable X in a random heterogeneous medium can be
described by

Ẋ(t) = a(X(t)) γ̇(t) + a(X(t)) Y(t),

that is, the evolution equation Ẋ_h(t) = a(X_h(t)) γ̇(t) for a homogeneous medium in which
the strain rate γ̇(t) is replaced with γ̇(t) + Y(t). The equation of X becomes

dX/dξ = a(X) + a(X) (Y/γ̇)

in the new "time" ξ defined by dξ(t) = γ̇(t) dt. Because γ̇_e(t) > 0 by hypothesis, we
have γ̇(t) > 0 so that the new time is well defined. It is assumed in [14] that W = Y/γ̇ in
Eq. 7.88 is a Gaussian white noise with mean zero and intensity q_W,
where ξ_corr = γ̇ t_corr and t_corr are correlation times. The expectation E[Γ(ξ, s)²] is per-
formed with respect to the space parameter, and can be related to properties of the random
medium [14]. Generally, q_W may depend on X, in which case it can be given in the form
q_W(ξ)² = q_{W,0}² h(X(ξ))² [14]. ∎
The methods in Section 7.3.1 dealing with nonlinear systems driven by Brownian
motion can be used to calculate properties of the diffusion process X in Eq. 7.88.
The Fokker-Planck equation (Section 7.3.1.3) can be used to assess changes in the
properties of a material subjected to a sustained action.

The density f(x; ξ | x₀; ξ₀) of the conditional process X(ξ) | (X(ξ₀) = x₀) is
the solution of the Fokker-Planck equation (Eq. 7.64)

∂f/∂ξ = −∂/∂x (a(x) f) + (q_{W,0}²/2) ∂²/∂x² (b(x)² f).   (7.89)
Note: The dependence of the density f(x; ξ | x₀; ξ₀) on ξ can be used to characterize
the change in time of some material properties, referred to as the structural evolution.
If f(x; ξ | x₀; ξ₀) converges to a stationary density f_s(x) = lim_{ξ→∞} f(x; ξ | x₀; ξ₀),
we say that the structural evolution emerges in a pattern (Section 8.6). The stationary
density f_s is the solution of

d/dx (a(x) f_s(x)) − (q_{W,0}²/2) d²/dx² (b(x)² f_s(x)) = 0.
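The zero-flux solution of the last equation has the closed form f_s(x) ∝ b(x)^{−2} exp{(2/q_{W,0}²) ∫ a(y)/b(y)² dy}, which can be evaluated numerically. In this sketch the drift a(x) = 2 cos(x) − c is only the illustrative choice suggested earlier, and the values of c and q are made up:

```python
import numpy as np

def trapezoid(y, x):
    # simple trapezoidal quadrature on a grid
    return float(np.sum(0.5*(y[1:] + y[:-1])*np.diff(x)))

def stationary_density(a, b, q, x):
    # zero-flux solution: f_s(x) ∝ (1/b(x)**2) * exp((2/q**2) * ∫ a/b**2 dy)
    integrand = a(x)/b(x)**2
    expo = (2.0/q**2)*np.concatenate(
        ([0.0], np.cumsum(0.5*(integrand[1:] + integrand[:-1])*np.diff(x))))
    f = np.exp(expo - expo.max())/b(x)**2     # shift exponent for stability
    return f/trapezoid(f, x)                  # normalize on the grid

x = np.linspace(0.0, 2.0*np.pi, 2001)
f_s = stationary_density(a=lambda z: 2.0*np.cos(z) - 0.5,
                         b=lambda z: np.ones_like(z), q=0.5, x=x)
print(trapezoid(f_s, x))   # 1.0 up to rounding
```

The derivation: setting the probability flux a f_s − (q²/2) d/dx(b² f_s) to zero and solving the resulting first-order equation for g = b² f_s gives the exponential form used in the code.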
Figure 7.14. Applied strain history and corresponding stress histories in five
grains
The plots in Fig. 7.14 show the applied strain history and the corresponding shear
stress histories in the (x₁, x₂) plane in five different grains.

The stress histories in Fig. 7.14 represent samples of the random function
Γ(ξ, t) in Eq. 7.86, recorded at five different locations ξ in the polycrystal. By
following the arguments used to define the process X in Eq. 7.88, the stress sam-
ples in Fig. 7.14 can be used to define a stochastic process for the shear stress in
the polycrystals. The differences between these stress samples are caused by the
uncertainty in the material properties, since the input is deterministic.
Note: The stress analysis is based on the Taylor hypothesis, assuming that there is no inter-
action between grains. Crystal plasticity theory has been used to calculate stress histories
in individual grains. Details on this topic are in Section 8.6.2.2. ▲
Example 7.57: Let X be the atomic lattice orientation in a single crystal with two
slip systems subjected to a random shear strain rate γ̇ + Y = 1 + q W, where q is
a constant and W is a Gaussian white noise. The atomic lattice orientation process
is the solution of a stochastic differential equation of the type in Eq. 7.88 (Sec-
tion 8.6.2.1 in this book, [12], [14]).
Figure 7.15 shows the evolution of the density f(x; ξ) for an initial state X(0) ∼
U([0, 2π]) and two values of the noise intensity q. The evolution of f(x; ξ)
reaches a steady configuration that depends on the noise intensity. ♦
522 Chapter 7. Deterministic Systems and Stochastic Input
Figure 7.15. Density f(x; ξ) for X(0) ∼ U([0, 2π]) and two noise intensities
(q = 0.1 and q = 2)
Note: Generalities on crystal plasticity are in Section 8.6.2. The internal variable X in this
example is equal to twice the atomic lattice orientation. The general evolution equation for
a single crystal is in Section 8.6.2.1.

Because we assume that the crystal has two slip systems with deterministic prop-
erties, the material does not have random properties. The internal variable X is a random
process since the initial condition and applied strain are uncertain. ▲
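The evolution toward the steady configuration in Fig. 7.15 can be reproduced qualitatively by direct simulation. In this sketch the drift a(x) = 2 cos(x) − c (the planar-crystal form mentioned earlier, with a made-up c) multiplies both the drift and the noise, since the strain rate enters as 1 + qW:

```python
import numpy as np

# Euler scheme for dX = a(X) dxi + q*a(X) dB, X(0) ~ U([0, 2*pi));
# for small q the density concentrates near the stable zeros of a.
rng = np.random.default_rng(3)
c, q, dxi, nsteps, npaths = 0.5, 0.1, 0.005, 4_000, 20_000
a = lambda z: 2.0*np.cos(z) - c
x = rng.uniform(0.0, 2.0*np.pi, npaths)
for _ in range(nsteps):
    drift = a(x)
    x += drift*dxi + q*drift*np.sqrt(dxi)*rng.normal(size=npaths)
x = np.mod(x, 2.0*np.pi)        # the orientation is an angle on [0, 2*pi)
hist, _ = np.histogram(x, bins=64, range=(0.0, 2.0*np.pi), density=True)
print(hist.max() > 1.0)   # prints True: far above the uniform level 1/(2*pi)
```

Starting from the uniform initial density, the mass drains toward the stable zero of a (where a′ < 0), producing the peaked steady configuration seen in Fig. 7.15 for small q; larger q spreads the peak.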
system. Common values of τ are 50 years for a bridge, 20 years for an aircraft,
thousands of years for earth temperature, and weeks or months for a stock price.
The reliability in τ is the probability

P_s(τ) = P(X(t) ∈ D for all t ∈ [0, τ]),

where D is an open subset of ℝ^d, called the safe set, that results from the design
conditions. Our objective is to calculate the reliability P_s(τ) or the probability of
failure P_f(τ) = 1 − P_s(τ) in τ.
Example 7.58: Let X and Ẋ be the displacement and velocity of a linear or non-
linear oscillator that is subjected to a random input. Denote by X = (X₁ =
X, X₂ = Ẋ) the state vector of this system. The safe set is D = (−a, a) × ℝ or a
set bounded by a level curve of the total energy if the design objective is to control
the displacement or the total energy of the oscillator, respectively. ♦
Note: The first approximate equality is based on the assumption that the events A and B
are independent. The second approximate equality assumes that the D-outcrossings of X
follow a Poisson process. There is no rigorous justification for the first assumption but
numerical results support it. The second assumption is correct asymptotically as the size
of D and the reference period τ increase indefinitely ([45], Section 12.2). ▲
The probability of failure P_f(τ) takes values in [P_{f,l}(τ), P_{f,u}(τ)], where
0 ≤ t₁ < ⋯ < t_m ≤ τ, A_i = {X(t_i) ∉ D}, p_i = P(A_i), π_{ij} = P(A_i ∩ A_j),
p = (p₁, …, p_m), and π = {π_{ij}}, i, j = 1, …, m.
Proof: Let A = {X(0) ∈ D} and B = {N(D; τ) = 0}. Then P_s(τ) = P(A ∩ B) so that

P_f(τ) = P((A ∩ B)^c) = P(A^c ∪ B^c) ≤ P(A^c) + P(B^c),

which gives the upper bound. For the lower bound, set Z = Σ_{i=1}^{m} q_i 1_{A_i}, so that
E[Z] = q^T p and E[Z²] = q^T π q, where q = (q₁, …, q_m) and p are viewed as column
vectors. Because

E[(x Z − 1_{{Z ≠ 0}})²] = x² E[Z²] − 2 x E[Z] + P(Z ≠ 0) ≥ 0, ∀ x ∈ ℝ,
7.4. Applications 525
we have (E[Z])² − E[Z²] P(Z ≠ 0) ≤ 0, or P(Z ≠ 0) ≥ (E[Z])²/E[Z²], which gives the
lower bound (q^T p)²/(q^T π q) for any q ∈ ℝ^m. Tighter lower bounds can be obtained by
selecting q such that it maximizes (q^T p)²/(q^T π q) [170]. ∎
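The optimization over q has a closed form when π is positive definite: the maximizer of (q^T p)²/(q^T π q) is q* = π^{−1} p, and the resulting lower bound is p^T π^{−1} p. A sketch with made-up values of p and π (note that the diagonal of π must equal p, since π_{ii} = P(A_i)):

```python
import numpy as np

def failure_lower_bound(p, pi):
    # max over q of (q^T p)**2 / (q^T pi q) is attained at q = pi^{-1} p
    # and equals p^T pi^{-1} p
    return float(p @ np.linalg.solve(pi, p))

p = np.array([0.010, 0.015, 0.012])                 # p_i = P(A_i)
pi = np.array([[0.010, 0.004, 0.002],               # pi_ij = P(A_i ∩ A_j)
               [0.004, 0.015, 0.005],
               [0.002, 0.005, 0.012]])
lb = failure_lower_bound(p, pi)
print(lb)   # lies between max(p_i) and the upper bound sum(p_i)
```

Choosing q = e_i recovers the trivial bound p_i, so the optimized bound always dominates max_i p_i; the simple union bound Σ_i p_i remains an upper bound.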
Alternative lower bounds on the failure probability can be obtained. For example,
we have shown that the probability of failure can be expressed as a sum over n ≥ 0 of
terms involving the distribution of Σ_i |Δ_i|, where φ_{|Δ₁|} denotes the characteristic
function of |Δ₁|.
Methods of various complexity and accuracy have been proposed for cal-
culating the laws of Δ₁ and N(τ) [54]. One of the simplest methods for finding
the distribution of Δ₁ uses (1) the equivalence between the kinetic energy of the
oscillator mass at the initiation of a plastic excursion and the energy dissipated
by plastic deformation Δ₁ and (2) the distribution of Ẋ^el at the time of a ±u-
crossing of X^el, which can be obtained from the properties of this process [189].
The assumption that N(τ) is a Poisson random variable with intensity λ given by
the mean (−u, u)-outcrossing rate of X^el is not satisfactory since X^el is a narrowband
process, so that its (−u, u)-outcrossings occur in clusters and therefore they can-
not be independent. This difficulty can be overcome by replacing N with another
counting process Ñ, where Ñ(τ) gives the number of u-upcrossings in [0, τ] of
the envelope process R(t) = [X^el(t)² + X̂^el(t)²]^{1/2}, where X̂^el denotes the Hilbert
transform of X^el (Section 3.9.4.1). The envelope process (1) is larger than |X^el|
at all times, (2) is equal to |X^el| at the zero-crossing times of X̂^el, (3) has a single
excursion above u for each cluster of (−u, u)-outcrossings of X^el, and (4) may
have excursions outside (−u, u) during which X^el does not exit (−u, u). The ex-
cursions of R which are not associated with ±u-crossings of X^el are called empty
excursions, and are not considered in the calculation of P_s(τ). The process Ñ can
be approximated by a homogeneous Poisson process with intensity λ_Ñ(u) = r λ_R(u),
where λ_R(u) is the mean u-upcrossing rate of R and r is the fraction of non-empty
excursions of R. One expression of r is

r = (1 − e^{−π δ u}) / (π δ u),

where δ is a spectral bandwidth parameter of X^el; an alternative, more elaborate
expression of r involves the standard normal density and distribution functions.
Note: The first expression of r results from the distribution P(R(t) > u) = e^{−u²/2} of R,
the expressions of the mean crossing rates λ_R(u) and λ(u) of R and X^el, respectively, and
the following approximations and assumptions.

Let T̄ be the random duration of an excursion of R above u. The long run average
of T̄ is P(R(t) > u)/λ_R(u), since τ P(R(t) > u) approximates the average time R spends
above u during a time interval [0, τ], provided that τ is large, and λ_R τ gives the average
number of u-upcrossings of R in [0, τ].
Monte Carlo simulation. In Section 5.4.2 the Monte Carlo simulation method
was used to estimate the probability that the state of a physical system does not
leave a safe set during a specified time interval. Here we estimate the probability
that a structural system performs according to some design specifications during
an earthquake. The coordinates of the state vector X are displacements and ve-
locities at a finite number of points of a structural system. The safe set D results
from conditions on X. The input is the seismic ground acceleration, which can be
modeled by a process A(t) = ζ(t) A_s(t), t ∈ [0, τ], where τ > 0 is the duration
of the seismic event, A_s(t) is a stationary Gaussian process with mean zero and
spectral density s(ν), and ζ(t) denotes a deterministic modulation function.
Statistical analysis of a large number of seismic ground acceleration records
and plate tectonics considerations have been used to develop expressions for the
spectral density s(ν), modulation function ζ(t), and earthquake duration τ of the
seismic ground acceleration process A(t) at many sites in the US. It has been
shown that s(ν), ζ(t), and τ are completely defined by the moment magnitude M
of the seismic sources in the proximity of a site and the site-source distance R
[96, 139].

Let P_f(m, r) denote the probability that a structural system subjected to
the ground acceleration process A corresponding to a seismic source with (M =
m, R = r) does not satisfy some specified performance requirements. The probability

P_f = ∫∫ P_f(m, r) f_{M,R}(m, r) dm dr

can be used to assess the seismic performance of the system under consideration,
where f_{M,R} denotes the density of (M, R). Estimates of this density for the cen-
tral and eastern United States can be found in [64].
Example 7.60: A steel water tank with dimensions 20.3 × 16.8 × 4.8 ft is located
at the top of a 20 story steel frame building. The tank is supported at its corners
by four vertical legs of length 6.83 ft. It is assumed that (1) the steel frames of
the building remain linear and elastic during the seismic event but the tank legs
may yield, (2) the building and the tank oscillate in a vertical plane containing
the direction of the seismic ground motion, (3) there is no feedback from the tank
to the building, (4) the tank can be modeled as a rigid body, and (5) the design
condition for the tank requires that the largest deformation of its legs be smaller
than a critical value d_cr.
Let A_bt be the absolute acceleration process at the connection point be-
tween the building and the tank, so that this process represents the seismic input
to the tank. The calculation of fragility surfaces by Monte Carlo simulation in-
volves three steps. First, a pair of values (m, r) needs to be selected for (M, R)
and n_s samples of A_bt corresponding to these values of (M, R) have to be gen-
erated. Second, the dynamic response of the tank has to be calculated for each
sample of A_bt. Third, the probability P_f(m, r) has to be estimated by, for ex-
ample, the ratio n_f/n_s, where n_f denotes the number of tank response samples
that exceed the critical displacement d_cr. Figure 7.16 shows fragility surfaces for
the tank corresponding to several values of d_cr and two different designs, which
are sketched in the figure. The fragility surfaces in Fig. 7.16 corresponding to the
same critical value d_cr differ significantly for the two designs illustrated in this fig-
ure. Fragility surfaces of the type shown in Fig. 7.16 and information on the cost
of various design alternatives can be used in earthquake engineering to select an
optimal repair strategy for an existing system and/or develop efficient new seismic
designs. ♦
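The three-step procedure can be sketched in a few lines. The structural model below is a stand-in single-degree-of-freedom linear oscillator driven by modulated white noise; the modulation function, the oscillator parameters, and the value of d_cr are all made up. A real application would replace the integrator with the tank model driven by samples of A_bt:

```python
import numpy as np

rng = np.random.default_rng(4)

def peak_response(ns, dt=0.01, nsteps=2000, zeta=0.05, nu=2.0*np.pi):
    # Step 2: largest |displacement| of a SDOF oscillator for ns input samples
    x = np.zeros(ns); v = np.zeros(ns); peak = np.zeros(ns)
    for i in range(nsteps):
        t = i*dt
        mod = np.exp(-0.2*t)*(1.0 - np.exp(-2.0*t))   # modulation function
        acc = -2.0*zeta*nu*v - nu**2*x - mod*rng.normal(size=ns)/np.sqrt(dt)
        x = x + v*dt
        v = v + acc*dt
        peak = np.maximum(peak, np.abs(x))
    return peak

# Step 1: fix (m, r) and generate n_s input samples (implicit in the loop above);
# Step 3: estimate P_f(m, r) by the exceedance ratio n_f / n_s.
d_cr, n_s = 0.3, 2000
peaks = peak_response(n_s)
pf_hat = np.mean(peaks > d_cr)
print(pf_hat)
```

Repeating this computation on a grid of (m, r) pairs, each with its own spectral density, modulation, and duration, yields the fragility surface P_f(m, r).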
Note: The building's steel frame can be modeled by a system with r = 20 translation
degrees of freedom. If the system has classical modes of vibration, the displacement at
the structure-tank connection point is Y(t) = Σ_{i=1}^{r} φ_i Q_i(t), where the constants φ_i can
be obtained from the modal shapes of the building and the functions Q_i denote modal
coordinates. The modal coordinates satisfy the differential equations

Q̈_i(t) + 2 ζ_i ν_i Q̇_i(t) + ν_i² Q_i(t) = −γ_i A(t),   t ≥ 0,

with zero initial conditions, where ζ_i and ν_i denote modal damping ratios and frequencies,
respectively, and γ_i are modal participation factors (Examples 7.5 and 7.6). We have
[Figure 7.16: fragility surfaces P_f(m, r) for d_cr = 0.5%, 2.0%, and 6.0% and the two designs.]
where θ_i is the unit impulse response function for mode i (Example 7.5). Let

A(t) = ζ(t) Σ_{k=1}^{n} u_k [A_k cos(ν_k t) + B_k sin(ν_k t)]

be an approximate representation of the seismic ground acceleration for Monte Carlo sim-
ulation, where u_k, ν_k > 0 are constants and A_k, B_k are independent N(0, 1) variables
(Section 5.3.1.1). Then

Q_i(t) = Σ_{k=1}^{n} u_k [A_k θ_{c,ik}(t) + B_k θ_{s,ik}(t)] and

Y(t) = Σ_{k=1}^{n} u_k [ A_k ( Σ_{i=1}^{r} φ_i θ_{c,ik}(t) ) + B_k ( Σ_{i=1}^{r} φ_i θ_{s,ik}(t) ) ],

where θ_{c,ik}(t) = −γ_i ∫_0^t θ_i(t − s) ζ(s) cos(ν_k s) ds and θ_{s,ik}(t) is defined similarly
with sin in place of cos.

The stress-strain relationship for the tank legs considered in Fig. 7.16 is linear from
(0, 0) to (0.0012, 36 ksi), constant from (0.0012, 36 ksi) to (0.1356, 36 ksi), linear from
(0.1356, 36 ksi) to (0.585, 60 ksi), and constant for larger strains, where the first and second
numbers in parentheses denote strains and stresses, respectively. ▲
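The spectral representation of A(t) used above is straightforward to generate. In this sketch the spectral density s and the modulation ζ are made-up stand-ins, with the standard discretization u_k² = 2 s(ν_k) Δν:

```python
import numpy as np

rng = np.random.default_rng(5)
s = lambda nu: np.exp(-0.5*(nu - 5.0)**2)                # stand-in density s(nu)
zeta = lambda t: np.minimum(t/2.0, 1.0)*np.exp(-0.1*t)   # stand-in modulation

nu_k, dnu = np.linspace(0.1, 15.0, 150, retstep=True)
u_k = np.sqrt(2.0*s(nu_k)*dnu)
A_k = rng.normal(size=nu_k.size)
B_k = rng.normal(size=nu_k.size)
t = np.linspace(0.0, 20.0, 2001)
A_t = zeta(t)*((u_k*A_k) @ np.cos(np.outer(nu_k, t)) +
               (u_k*B_k) @ np.sin(np.outer(nu_k, t)))
# the unmodulated sum has stationary variance sum_k u_k**2 = 2 * ∫ s(nu) dnu
print(A_t.shape, float(u_k @ u_k))
```

Each sample of A_t costs one draw of the 2n Gaussian variables (A_k, B_k); the modal responses then follow by the convolutions θ_{c,ik} and θ_{s,ik} defined above.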
Durbin's formula. We have given in Section 3.10.2 formulas for the densities of
T and (T, Ẋ(T)), where X is a real-valued process, T = inf{t > 0 : X(t) ≥ u},
and u ∈ ℝ. These formulas are repeated in the following equation.
The joint density of the random variables (T, Ẋ(T)) and the density of T are

f_{T,Ẋ(T)}(t, z) = E[1_{{X(s)<u, 0≤s<t}} Ẋ(t)⁺ | X(t) = u, Ẋ(t) = z] f_{X(t),Ẋ(t)}(u, z),
f_T(t) = E[1_{{X(s)<u, 0≤s<t}} Ẋ(t)⁺ | X(t) = u] f_{X(t)}(u).   (7.94)
We show that Eq. 7.94 can be used to bound the density of T. Consider n
distinct times 0 < t₁ < ⋯ < t_n < t in [0, t] and the events A = {X(t) < u <
X(t + Δt)} and B_n = ∩_{i=1}^{n} {X(t_i) < u}, where Δt > 0 is a small time interval.
Then f_T can be bounded by the functions in Eq. 7.95. The bounds on f_T in this
equation have been used to develop computer codes for calculating the density of
T approximately [159].
Proof: The upper bound on f_T results from Eq. 7.94 since 1_{{X(s)<u, 0≤s<t}} ≤ 1_{B_n} ≤ 1.

Let T_k, k = 1, 2, …, denote the time of the k'th u-upcrossing of X. The probability
that this upcrossing occurs in [t, t + Δt] can be approximated by f_{T_k}(t) Δt, where f_{T_k} is the
density of T_k. Hence, the mean u-upcrossing rate λ(u; t)⁺ of X at time t is Σ_{k=1}^{∞} f_{T_k}(t),
so that we also have

Σ_{k=2}^{∞} ∫_0^t E[1_{{T₁=s}} 1_{{T_k=t}} Ẋ(s)⁺ Ẋ(t)⁺ | X(s) = X(t) = u] f_{X(s),X(t)}(u, u) ds
= ∫_0^t E[ 1_{{T₁=s}} Ẋ(s)⁺ ( Σ_{k=2}^{∞} 1_{{T_k=t}} Ẋ(t)⁺ ) | X(s) = X(t) = u ] f_{X(s),X(t)}(u, u) ds
≤ ∫_0^t E[1_{{T₁=s}} Ẋ(s)⁺ Ẋ(t)⁺ | X(s) = X(t) = u] f_{X(s),X(t)}(u, u) ds,

since Σ_{k=2}^{∞} 1_{{T_k=t}} Ẋ(t)⁺ = (1 − 1_{{T₁=t}}) Ẋ(t)⁺ ≤ Ẋ(t)⁺. The expression of f_{T₁} and
the above inequality give the lower bound on f_T in Eq. 7.95. ∎
be the probability of failure for a safe set D = (−∞, a), a > 0. Figure 7.17
shows the evolution in time of the exact and approximate probability of failure
P_f(t) for a = 1 and X(0) = 0. The approximate solutions correspond to Monte
Carlo simulation and a finite difference solution of the Fokker-Planck equation for
the density of X(t) | (X(0) = 0) with an absorbing boundary at x = a. ♦
Note: Let T = inf{t > 0 : X(t) ∉ (−∞, a)}. The distribution of this stopping time is the
probability of failure, that is, P_f(t) = P(T ≤ t). ▲
Figure 7.17. Evolution of the exact and approximate probability of failure P_f(t)
for X = B and a safe set (−∞, 1)
Figure 7.18. Probability of failure P_f(t) for X defined by dX(t) = (±X(t) −
X(t)³) dt + dB(t) and a safe set D = (−a, a)
x₀ = 0, with a = 1.6 and a = 1.1 for the two cases. The finite difference and the Monte
Carlo simulation methods have been used for calculations. ♦
Note: The initial and boundary conditions for the finite difference solution of the Fokker-
Planck equation are f(x; 0 | x₀; 0) = δ(x − x₀) and f(a; t | x₀; 0) = 0 for t > 0,
respectively. The condition f(a; t | x₀; 0) = 0, t > 0, defines an absorbing boundary for
the Fokker-Planck equation associated with the diffusion process X. ▲
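For the example above (X = B, D = (−∞, a)), the reflection principle for Brownian motion gives the failure probability in closed form, P_f(t) = 2(1 − Φ(a/√t)), which is a convenient check for the Monte Carlo and finite difference solutions. The sketch below compares it with an estimate from discretized paths:

```python
import numpy as np
from math import erf, sqrt

def pf_exact(a, t):
    # reflection principle: P(max_{s<=t} B(s) >= a) = 2*(1 - Phi(a/sqrt(t)))
    Phi = 0.5*(1.0 + erf(a/sqrt(t)/sqrt(2.0)))
    return 2.0*(1.0 - Phi)

rng = np.random.default_rng(6)
a, t, nsteps, npaths = 1.0, 1.0, 1000, 4000
dB = rng.normal(0.0, np.sqrt(t/nsteps), size=(npaths, nsteps))
path_max = np.cumsum(dB, axis=1).max(axis=1)
pf_mc = np.mean(path_max >= a)
# the discrete-path estimate sits slightly below the exact value, since the
# sampled maximum misses excursions between grid points
print(pf_exact(a, t), pf_mc)
```

The same discretization bias affects any crossing-based Monte Carlo estimate of P_f, which is one reason the finite difference solution of the Fokker-Planck equation with an absorbing boundary is a useful companion method.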
where ν₀, ε > 0 are some constants and Y is a stationary Gaussian process
with mean zero and a broadband spectral density s(ν). The envelope A(t) =
(X(t)² + Ẋ(t)²/ν₀²)^{1/2} of X can be approximated by a diffusion process (Exam-
ples 7.48 and 7.51). Let P_f(t) = 1 − P(max_{0≤s≤t} |X(s)| < a) be the probability
of failure for the state X in the above equation. An approximation of this proba-
bility of failure is P̃_f(t) = 1 − P(max_{0≤s≤t} A(s) < a).
Figure 7.19 shows Monte Carlo estimates of P_f(t) and P̃_f(t) obtained from
1,000 samples of X and A, respectively.

[Figure 7.19: P̃_f(t) (Fokker-Planck) and P_f(t) (Monte Carlo) plotted against time.]

The figure also shows the probability P̃_f(t) obtained from the finite difference
solution of the Fokker-Planck equation for the density of A. Numerical results
are for ε = 0.283, ν₀ = 1, s(ν₀) = 3.978, and a = 5. That P̃_f(t) is larger than
P_f(t) is consistent with our discussion in Example 7.59 on the relationship
between crossings of A and X.
Figure 7.20 shows the evolution of the joint density of (X, Ẋ) in time for
the case in which the driving noise is Y = W, where W is a zero-mean stationary
Gaussian white noise with spectral density s_W(ν) = s(ν₀), ν ∈ ℝ. The results in
this figure have been obtained by a finite difference solution of the Fokker-Planck
equation for the density of (X, Ẋ) [179]. ♦
Figure 7.20. Evolution of the joint density of (X, Ẋ) in time (adapted from [179])
The probability of failure of this oscillator for the safe set D = (−a, a) × ℝ is given by
P_f(t) = 1 − ∫_{−a}^{a} dx₁ ∫_{−∞}^{∞} dx₂ f(x; t | x₀; 0), where f(x; t | x₀; 0) is the solution of the
above Fokker-Planck equation with the initial and boundary conditions f(x; 0 | x₀; 0) =
δ(x − x₀) and f(±a, x₂; t | x₀; 0) = 0 for t > 0, x₂ ∈ ℝ, and x₀ ∈ D. ▲
7.4.4 Finance
The discussion in this section is based on Chapter 4 in [131]. Useful refer-
ences on this topic are [60], [126], and [151] (Section 10.16).
Let X(t) denote the price of a stock at time t. The relative return in a small
time interval $[t, t + \Delta t]$, $\Delta t > 0$, can be modeled by
$$\frac{\Delta X(t)}{X(t)} = c\, \Delta t + \sigma\, \Delta B(t),$$
where $c > 0$ and $\sigma > 0$ are the mean return rate and volatility, respectively, and B is a Brownian motion.
Bond, money market, savings account, and other similar assets are viewed as
riskless so that their evolution is given by a deterministic differential equation.
The following models are considered for the stock and bond prices and for the
value of a portfolio.
• The options are European calls, there is no cost for transactions, and riskless
assets have a constant rate of return.
• The market is rational.
• The self-financing condition dV (t) = Us (t) dX (t) + Ub(t) df3(t) holds.
Note: In a rational market there is no opportunity for arbitrage, that is, riskless strategies
guaranteeing a profit do not exist. The self-financing condition implies that changes in a
portfolio's value are caused solely by changes in the prices of stocks and bonds. There is
no infusion of new capital in or consumption of wealth from a portfolio. .&
536 Chapter 7. Deterministic Systems and Stochastic Input
The Black-Scholes formula states that the rational price for a European call
option at time t = 0 and exercise price k is V(0), where
$$g(x, t) = \frac{\ln(x/k) + (r + \sigma^2/2)\, t}{\sigma\, t^{1/2}}, \qquad h(x, t) = g(x, t) - \sigma\, t^{1/2}. \tag{7.97}$$
The process V(t) in Eqs. 7.96 and 7.97 is the value of a portfolio at time t
defined by the self-financing strategy
where $u_{,1}$ and $u_{,11}$ denote the first and second partial derivatives of u with respect to $x_1$,
and $u_{,2}$ is the partial derivative of u relative to $x_2$.
imply
$$dV(t) = \big[(c - r)\, U_s(t)\, X(t) + r\, V(t)\big]\, dt + \sigma\, U_s(t)\, X(t)\, dB(t)$$
so that
Because the above two expressions for V(t)- V(O) must coincide, we have
$$(c - r)\, U_s(t)\, X(t) + r\, u(X(t), \tau - t) = c\, X(t)\, u_{,1}(X(t), \tau - t) - u_{,2}(X(t), \tau - t) + \frac{\sigma^2}{2}\, X(t)^2\, u_{,11}(X(t), \tau - t),$$
$$U_s(t) = u_{,1}(X(t), \tau - t).$$
$$u_{,2}(x, s) = \frac{\sigma^2}{2}\, x^2\, u_{,11}(x, s) + r\, x\, u_{,1}(x, s) - r\, u(x, s)$$
for a function u defined for $x > 0$ and $s \in [0, \tau]$ with the deterministic terminal condition
$u(x, 0) = (x - k)^+$. The solution of this differential equation is
$$u(x, t) = x\, \Phi(g(x, t)) - k\, e^{-r t}\, \Phi(h(x, t))$$
with the notation in Eq. 7.97. •
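The closed-form price is straightforward to evaluate numerically. The sketch below implements Eqs. 7.96-7.97 directly, with t interpreted as the time to maturity; the function names are ours, not the book's.

```python
from math import erf, exp, log, sqrt

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def g(x, t, k, r, sigma):
    return (log(x / k) + (r + sigma**2 / 2.0) * t) / (sigma * sqrt(t))

def h(x, t, k, r, sigma):
    return g(x, t, k, r, sigma) - sigma * sqrt(t)

def u(x, t, k, r, sigma):
    """Rational price of a European call (Eq. 7.96); t = time to maturity."""
    return x * Phi(g(x, t, k, r, sigma)) - k * exp(-r * t) * Phi(h(x, t, k, r, sigma))

# price for the data of Example 7.64 below with strike k = 5
print(u(6.0, 1.0, 5.0, 0.5, 0.5))
```

The price always lies between the lower bound $x - k e^{-rt}$ and the stock price x, which gives a quick sanity check on the implementation.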
Example 7.64: Let $c = 1$, $\sigma = 0.5$, and $r = 0.5$ be the mean return rate, volatility,
and return rate, respectively. Suppose that (1) the current stock and bond prices
are X(0) = 6 and $\beta(0) = 6$ and (2) a European call option is available at a strike
price k at time $\tau = 1$. The question is whether the current offering of X(0) = 6 is
acceptable.
Figure 7.21 shows ten samples of the self-financing strategies $U_s$ and $U_b$ in
Eq. 7.98 and the portfolio value V in Eq. 7.96 for k = 10, and ten samples of V
for k = 20. These samples suggest that it is unlikely to make any profit at these
strike prices. Estimates of the probability $p_{\mathrm{prf}}$ of making a profit and the expected
profit $e_{\mathrm{prf}}$ are $p_{\mathrm{prf}} = 0.03$ and $e_{\mathrm{prf}} = 2.2$ for k = 10 and $p_{\mathrm{prf}} \simeq 0$ and $e_{\mathrm{prf}} = 2.1$
for k = 20. If k = 5, the estimates are $p_{\mathrm{prf}} = 0.55$ and $e_{\mathrm{prf}} = 7.03$, so that we
have a better than 50% chance to profit. The above estimates have been calculated
from 500 samples. ◊
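A crude version of such sample studies: simulate the geometric Brownian motion stock price $X(\tau) = X(0) \exp[(c - \sigma^2/2)\tau + \sigma B(\tau)]$ and count the paths ending above the strike. This is only a proxy for the profit probability estimated in the example (it ignores the option premium and the trading strategy), so the numbers need not match those above.

```python
import numpy as np

rng = np.random.default_rng(0)

def in_the_money_prob(x0=6.0, c=1.0, sigma=0.5, tau=1.0, k=10.0, n=500):
    """Fraction of stock-price samples X(tau) ending above the strike k,
    with X a geometric Brownian motion of mean return rate c, volatility sigma."""
    b = rng.normal(0.0, np.sqrt(tau), n)
    x_tau = x0 * np.exp((c - sigma**2 / 2.0) * tau + sigma * b)
    return np.mean(x_tau > k)

for k in (5.0, 10.0, 20.0):
    print(k, in_the_money_prob(k=k))
```

As expected, the estimated probability decreases monotonically in the strike price.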
Figure 7.21. Samples of trading strategies, stock price, and portfolio value
Example 7.65: Let $\bar X(t) = e^{-rt} X(t)$ and $\bar V(t) = e^{-rt} V(t)$ be the discounted
stock price and portfolio value at a time $t \in [0, \tau]$, where X and V are defined
on a probability space $(\Omega, \mathcal{F}, P)$ with a filtration $\mathcal{F}_t = \sigma(B(s), 0 \le s \le t)$
and B is the Brownian motion in the defining equation for X. The processes
$\bar X$ and $\bar V$ satisfy the differential equations $d\bar X(t) = \sigma\, \bar X(t)\, d\bar B(t)$ and $d\bar V(t) = U_s(t)\, d\bar X(t)$, where $\bar B$ is a Brownian motion under a probability measure Q defined by $(dQ/dP)(t) = \exp\big[-\big((c - r)/\sigma\big) B(t) - \big((c - r)^2/(2\sigma^2)\big) t\big]$. The process $\bar V$ is an
$\mathcal{F}_t$-martingale and
That $\bar B(t) = (c - r)\, t/\sigma + B(t)$ is a Brownian motion under Q was shown in Section 5.4.2.2.
The process
$$\bar V(t) = \bar V(0) + \sigma \int_0^t U_s(s)\, \bar X(s)\, d\bar B(s)$$
give
This function coincides with u in Eqs. 7.96 and 7.97 for a European call option ([131],
Section 4.2.2). •
7.4.5 Estimation
The evolution of the state X of many physical systems can only be modeled
approximately because of complexity and/or limited understanding. Generally,
the resulting approximate models are satisfactory for short term predictions of fu-
ture values of X but cannot be used for long term predictions of a system state.
Measurements of the actual state of a system can be used to update X and re-
duce errors caused by approximations in modeling. The Kalman-Bucy filtering
approach is an efficient tool for propagating the state X based on its model and
measurements. We present the essentials of this approach starting with time-invariant estimation problems and then extending these results to time-dependent linear problems with discrete and continuous time.
• The best mean square (m.s.) estimator of X given Z is the conditional expectation $\hat X = g(Z) = E[X \mid \mathcal{F}^Z] = E[X \mid Z]$.
• The best m.s. linear estimator of X given Z is
$$\hat X = \mu_x + \gamma_{xz}\, \gamma_{zz}^{-1}\, (Z - \mu_z), \tag{7.99}$$
$$\gamma = E\big[(X - \hat X)(X - \hat X)^T\big] = \gamma_{xx} - \gamma_{xz}\, \gamma_{zz}^{-1}\, \gamma_{zx}, \tag{7.100}$$
where X, g(Z), and Z are column vectors, $f_{X|Z}$ is the density of $X \mid Z$, and $f_Z$ denotes the density of Z.
Example 7.66: Suppose that $Z = h\, X + V$ is the observation vector, where h is a
known deterministic matrix and V is an $\mathbb{R}^{d'}$-valued observation noise with mean
$\mu_v$ and covariance matrix $\gamma_{vv}$. It is assumed that X and V are not correlated. The
best m.s. linear estimator of X and its error covariance matrix are
$$\hat X = \mu_x + \gamma_{xx}\, h^T \big(h\, \gamma_{xx}\, h^T + \gamma_{vv}\big)^{-1} (Z - \mu_z),$$
$$\gamma = \gamma_{xx} - \gamma_{xx}\, h^T \big(h\, \gamma_{xx}\, h^T + \gamma_{vv}\big)^{-1} h\, \gamma_{xx}.$$
The quality of the estimation depends on the noise-to-signal ratio $\gamma_{vv}/\gamma_{xx}$. The error covariance matrix is zero for perfect measurements ($\gamma_{vv} = 0$) and approaches
$\gamma_{xx}$ as $\gamma_{vv}/\gamma_{xx} \to \infty$. ◊
Proof: The above results follow by elementary calculations from the expressions for the
estimator $\hat X$ in Eq. 7.99 and its error covariance matrix in Eq. 7.100. •
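Eqs. 7.99 and 7.100 are easy to apply numerically. The sketch below packages them as a function and applies it to the observation model of Example 7.66, $Z = hX + V$; the numerical values of h and of the covariance matrices are hypothetical.

```python
import numpy as np

def lmmse(mu_x, mu_z, g_xx, g_xz, g_zz, z):
    """Best m.s. linear estimator (Eq. 7.99) and its error covariance (Eq. 7.100)."""
    k = g_xz @ np.linalg.inv(g_zz)       # gamma_xz gamma_zz^{-1}
    x_hat = mu_x + k @ (z - mu_z)
    gamma = g_xx - k @ g_xz.T            # gamma_xx - gamma_xz gamma_zz^{-1} gamma_zx
    return x_hat, gamma

# observation model Z = h X + V with X and V uncorrelated (values hypothetical)
h = np.array([[1.0, 0.0], [0.0, 2.0]])
mu_x = np.zeros(2)
g_xx = np.array([[2.0, 0.5], [0.5, 1.0]])
g_vv = 0.1 * np.eye(2)
mu_z = h @ mu_x                          # observation noise has mean zero here
g_xz = g_xx @ h.T                        # cov(X, Z)
g_zz = h @ g_xx @ h.T + g_vv             # cov(Z, Z)
x_hat, gamma = lmmse(mu_x, mu_z, g_xx, g_xz, g_zz, np.array([1.0, -0.5]))
print(x_hat, np.diag(gamma))
```

The error covariance never exceeds the prior covariance, consistent with the discussion of the noise-to-signal ratio above.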
of X(t) given the observation vector (Z(1), ..., Z(s)) and (2) the error covariance
matrix for $\hat X(t \mid s)$. The solution of this problem is referred to as filtering, prediction,
and smoothing if t = s, t > s, and t < s, respectively. The definitions in
Eqs. 7.103 and 7.104 imply $\hat X(0 \mid 0) = E[X(0)] = 0$ and $\gamma(0 \mid 0) = \psi$.
We note that $\hat X(t \mid s)$ and $\gamma(t \mid s)$ can be obtained from Eqs. 7.99 and 7.100.
However, the approach is inefficient for a large observation vector (Z(1), ..., Z(s)).
The following formulation provides a procedure for calculating $\hat X(t \mid s)$ and
$\gamma(t \mid s)$ efficiently. We give equations for the one-step predictor $\hat X(t \mid t-1)$,
the estimator $\hat X(t \mid t)$, and their error covariance matrices.
The best m.s. linear one-step predictor and the best m.s. estimator of X satisfy the equations
Proof: The first two formulas in Eq. 7.105 are based on the state propagation equation and
the estimation theory. The formulas in Eq. 7.106 are the Kalman-Bucy equations.
The first formula in Eq. 7.105 results from Eq. 7.102 applied in the time interval
[t-1, t], which gives $a(t-1)\, \hat X(t-1 \mid t-1) + b(t-1)\, W(t-1)$ for the initial condition
$\hat X(t-1 \mid t-1)$. Hence, the best m.s. estimator of X(t) is $\hat X(t \mid t-1) = a(t-1)\, \hat X(t-1 \mid t-1)$ since W(t-1) and X(t-1) are uncorrelated and W has mean zero. If we assume
that the noise processes V and W and the random variable X(0) are mutually independent,
then
where $N_2 = Z_2 - \hat Z_{2|Z_1} = Z_2 - \gamma_{z_2,z_1}\, \gamma_{z_1,z_1}^{-1}\, Z_1$, called the innovation vector, includes
only the information in $Z_2$ that is not in $Z_1$ ([166], Section 6.2) and, for example, $\gamma_{z_2,z_1}$
denotes the covariance matrix of $(Z_2, Z_1)$. The error covariance matrix of $X \mid (Z_1, Z_2)$ is
the error covariance matrix of $X \mid Z_1$ less a positive definite matrix of the type in Eq. 7.100
corresponding to the information content in $N_2$, that is,
$$\gamma_{x \mid (z_1, z_2)} = \gamma_{x \mid z_1} - \gamma_{x, n_2}\, \gamma_{n_2, n_2}^{-1}\, \gamma_{n_2, x}.$$
The formulas in Eqs. 7.105 and 7.106 result by replacing $Z_1$, $Z_2$, and $\hat Z_{2|Z_1}$ in the above
equations by (Z(1), ..., Z(t-1)), Z(t), and $\hat X(t \mid t-1)$, respectively. •
$$\gamma(t \mid t) = \left(1 - \frac{a^2\, \gamma(t-1 \mid t-1) + q}{a^2\, \gamma(t-1 \mid t-1) + q + r}\right) \big(a^2\, \gamma(t-1 \mid t-1) + q\big).$$
$$\dot\gamma(t) = a(t)\, \gamma(t) + \gamma(t)\, a(t)^T + b(t)\, q(t)\, b(t)^T - \gamma(t)\, h(t)^T\, r(t)^{-1}\, h(t)\, \gamma(t),$$
where $k(t) = \gamma(t)\, h(t)^T\, r(t)^{-1}$ and $\gamma(0) = \psi$. (7.110)
Note: We have seen that the best m.s. linear estimator $\hat X(t \mid t)$ of the state X(t) of a discrete
time system depends on observations linearly. We consider a similar estimator for the state
X in Eq. 7.107, that is, we take
$$\hat X(t) = \int_0^t p(t, s)\, Z(s)\, ds,$$
where the matrix p(t, s) will be determined from the condition that it minimizes the covariance of the error $X(t) - \hat X(t)$. The above definition of $\hat X$ implies $\hat X(0) = 0$ so that
$\gamma(0) = \psi$ (Eq. 7.108).
Under the assumption that W in Eq. 7.107 is a Gaussian white noise, it can be shown
that p satisfies the Wiener-Hopf equation ([33], Chapter 4; [135], Chapter 6).
Because $E\big[\hat X(t)\, Z(s)^T\big] = \int_0^t p(t, u)\, E[Z(u)\, Z(s)^T]\, du$ by the above definition of $\hat X$,
the Wiener-Hopf equation implies $E\big[(X(t) - \hat X(t))\, Z(s)^T\big] = 0$ for all $s \le t$, so that
$\hat X(t) = E[X(t) \mid \mathcal{F}_t]$, where $\mathcal{F}_t = \sigma(Z(s), 0 \le s \le t)$.
A heuristic proof of Eqs. 7.109 and 7.110 can be found in [166] (Section 7.2). The
proof is based on the finite difference approximation
To find $\hat X(t)$ and its properties, we need to perform the following operations. First, the Riccati equation has to be solved for the initial condition $\gamma(0) = \psi$. The solution of this equation and the second formula in Eq. 7.110 yield the
time evolution of the gain k. Second, Eq. 7.109 has to be solved for the initial condition $\hat X(0) = 0$. The methods in Section 7.2.1.2 can be used to find the second
moment properties of $\hat X$. If the state X is observed at some discrete times, the procedure in the previous section can be applied in conjunction with the recurrence
formula (Eq. 7.6)
If the observation noise is much smaller than the driving noise ($\Delta t \gg r$), then
$\gamma(n \mid n) \approx r$ and $\gamma(n \mid n-1) \approx \Delta t$. Hence, the prediction uncertainty is
approximately equal to the variance of the driving noise in a time step $\Delta t$. This
uncertainty is reduced in the filtering phase to the observation variance. ◊
Proof: The discrete version of the state equation is $X(n\, \Delta t) = X((n-1)\, \Delta t) + W(n)$,
where $E[W(n)] = 0$ and $E[W(n)\, W(m)] = \Delta t\, \delta_{nm}$. The formulas in Eq. 7.106 give the
above recurrence equations for $\gamma(n \mid n-1)$ and $\gamma(n \mid n)$. Since $\gamma(0 \mid 0) = 0$, we have
$\gamma(1 \mid 0) = \Delta t$ and $\gamma(1 \mid 1) = r\, \Delta t/(r + \Delta t) \approx r$, where the approximate equality holds
for $\Delta t \gg r$. Similar approximations hold for $\gamma(n \mid n-1)$ and $\gamma(n \mid n)$. •
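The recurrence in the proof can be iterated directly; the limits $\gamma(n \mid n-1) \approx \Delta t$ and $\gamma(n \mid n) \approx r$ for $\Delta t \gg r$ appear after a few steps. A minimal sketch:

```python
def kalman_variances(dt, r, n_steps=50):
    """Scalar recursion of Example 7.68: predictor variance g_pred = g + dt,
    filter variance g = g_pred * r / (g_pred + r), starting at gamma(0|0) = 0."""
    g = 0.0                              # gamma(0|0)
    g_pred = dt                          # gamma(1|0)
    for _ in range(n_steps):
        g_pred = g + dt                  # gamma(n|n-1)
        g = g_pred * r / (g_pred + r)    # gamma(n|n)
    return g_pred, g

g_pred, g = kalman_variances(dt=1.0, r=0.01)
print(g_pred, g)   # for dt >> r: prediction variance ~ dt, filtering variance ~ r
```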
Example 7.69: Let X be the state in Eq. 7.5, where Y = W and W is the white
noise in Eq. 7.13 assumed to have mean zero, that is, $\mu_W(t) = 0$. The mean and
covariance matrices of the initial state are $\mu(0)$ and c(0, 0), respectively. Suppose that we observe the state X through an imperfect device giving the readings
$Z(t) = h(t)\, X(t) + V(t)$ (Eq. 7.107), where h is a (d, d) matrix and V is a white
noise with $E[V(t)] = 0$ and $E[V(t)\, V(s)^T] = q_v(t)\, \delta(t - s)$ that is uncorrelated
with X and W. The best m.s. linear estimator of the state X is the solution of
where $k(t) = c(t, t)\, h(t)^T\, q_v(t)^{-1}$ and c is the covariance function of X. This
result is consistent with Eq. 7.109. ◊
Proof: Define $\hat X$ by $\dot{\hat X}(t) = \alpha(t)\, \hat X(t) + k(t)\, Z(t)$ and impose two conditions: (1) the
estimator $\hat X$ of X is unbiased, that is, $E[\hat X(t)] = E[X(t)]$ for all $t \ge 0$, and (2) the error
$\tilde X(t) = X(t) - \hat X(t)$ is minimized in the m.s. sense; for example, we may require that the
trace of $E[\tilde X(t)\, \tilde X(t)^T]$ be minimized.
The means $\mu$ and $\hat\mu$ of X and $\hat X$ satisfy the equations
The evolution of $\tilde\mu(t) = E[\tilde X(t)]$ and $\tilde c(t, t) = E\big[(\tilde X(t) - \tilde\mu(t))(\tilde X(t) - \tilde\mu(t))^T\big]$ can
be calculated from results in Section 7.2.1 since $b(t)\, W(t) - k(t)\, V(t)$ is a white noise
process with mean zero and intensity $b(t)\, q_w(t)\, b(t)^T + k(t)\, q_v(t)\, k(t)^T$. The initial condition for the differential equation of $\tilde c$ is $\tilde c(0, 0) = c(0, 0)$ since $\hat X(0) = \mu(0)$, so that
$\tilde X(0) - \tilde\mu(0) = X(0) - \mu(0)$.
The second condition, requiring the minimization of the error $\tilde X(t) = X(t) - \hat X(t)$,
is used to find the optimal gain, that is, the function k that minimizes the covariance of $\tilde X(t)$ at each time
$t \ge 0$. These calculations can be found in [129] (pp. 265-267). •
7.5 Problems
7.1: Derive the mean, correlation, and covariance equations in Eq. 7.12.
7.2: Find the second moment properties of the stochastic process X(t), $t \ge 0$, in
Example 7.5 for $\xi \ne \alpha\, m + \beta\, k$.
7.3: Extend the approach in Example 7.8 to the case in which the driving noise Y
is a colored stationary Gaussian process given by some of the coordinates of the
state of a linear time-invariant filter driven by Gaussian white noise.
7.4: Let X be the solution of Eq. 7.5 with Y being the white noise W in Eq. 7.13.
Suppose that the drift, diffusion, and noise parameters are time invariant, that is,
$a(t) = a$, $b(t) = b$, $\mu_W(t) = \mu_W$, and $q_W(t) = q_W$, and that the drift coefficients
are such that Eq. 7.5 has a stationary solution $X_s$ with second moment properties
given by Eq. 7.15. Show that the second moment properties of X are given by
Eq. 7.15 if X(0) is random and follows the marginal distribution of $X_s$.
7.5: Find the first four moments of the martingale $M(t) = B(t)^3 - 3\, t\, B(t)$, $t \ge 0$,
where B is a Brownian motion.
7.7: Find the first four moments of a process X defined by the stochastic differential equation $dX(t) = -\alpha\, X(t)\, dt + \beta\, dM(t)$ for $t \ge 0$ and X(0) = 0, where
$\alpha > 0$, $\beta$ are some constants, $M(t) = B(t)^3 - 3\, t\, B(t)$, and B is a Brownian
motion.
where $dM(t) = \chi(M(t))\, dB(t)$ and $\chi$ is a continuous function. Find whether the
moment equations corresponding to the augmented vector (X, M) are closed for
$\chi(\xi) = \xi$ and $\chi(\xi) = \xi^2$.
7.12: Write a differential equation involving moments of the augmented state vec-
tor Z = (X, S) in Eq. 7.47. Outline conditions under which these equations can
be used to find the moments of Z.
7.15: Find the solution for X defined in Example 7.32 in which B is replaced by
a compound Poisson process. Develop moment equations for X.
7.16: Let X be the process defined in Example 7.33. Develop moment equations
for $(X, \dot X)$. Are these moment equations closed?
7.19: Derive the partial differential equation for the characteristic function of the
state vector $(X, \dot X, S)$ defined in Example 7.28 and specify the initial and boundary conditions needed to solve this equation.
7.20: Derive the Fokker-Planck equation for the density of X defined by $dX(t) = a(X(t))\, dt + b(X(t))\, dB(t)$, $t \ge 0$, starting at $X(0) = x_0$ by using Itô's formula.
Assume that the conditions in Eq. 7.63 are satisfied.
7.21: Find the density of the geometric Brownian motion in Example 7.36 by
solving the Fokker-Planck equation numerically.
7.29: Develop the differential equations for the correlation functions $r_{pq}(t, s)$ of
the augmented state $Z = (X, \dot X, S)$ in Section 7.4.1.2 for both the transient and
stationary solutions. Confirm the numerical results in Fig. 7.13.
7.30: Define a reliability problem and calculate the bounds in Eq. 7.92 on the
probability of failure $P_f(\tau)$. Find also Monte Carlo estimates of $P_f(\tau)$.
7.31: Find the solution of the problem in Example 7.68 for the case in which X
is an Ornstein-Uhlenbeck process.
Chapter 8
Stochastic Systems and Deterministic Input
8.1 Introduction
This chapter examines algebraic, differential, and integral equations with
random coefficients. The boundary and/or initial conditions required for the so-
lution of differential and integral equations can be deterministic or random. The
input is deterministic for all types of equations. We refer to these problems as
stochastic systems with deterministic input.
Our objective is to find probabilistic properties of the solution of stochas-
tic systems subjected to deterministic inputs. Two classes of methods are avail-
able for solving this type of stochastic problem. The methods in the first class
constitute direct extensions of some of the results in Chapters 3 and 6, for ex-
ample, the local solution for a class of differential equations with random coef-
ficients (Section 8.2), the crossing solution for the random eigenvalue problem
(Section 8.3.2.6), and some of the homogenization methods for random hetero-
geneous media (Section 8.5.1). These methods involve advanced probabilistic
concepts and in many cases deliver detailed properties of solutions. The methods
in the second class are largely based on relatively simple probabilistic concepts
on random variables, processes, and fields. The main objective of these methods
is the calculation of the second moment properties of the solution of algebraic,
differential, and integral equations with random coefficients.
We (1) examine partial differential equations with the functional form in
Chapter 6 (Section 8.2), inhomogeneous and homogeneous algebraic equations
(Sections 8.3.1 and 8.3.2), and inhomogeneous and homogeneous differential and
integral equations (Sections 8.4.1 and 8.4.2), (2) calculate effective or macro-
scopic constants for random heterogeneous materials (Sections 8.5.1 and 8.5.2),
(3) study the evolution of material properties and the possibility of pattern formation in the context of elasticity and crystal plasticity (Sections 8.6.1 and 8.6.2),
(4) define the stochastic stability problem and illustrate its solution by some simple
examples, and (5) examine localization phenomena in heterogeneous soil layers
and nearly periodic dynamic systems (Sections 8.8.1 and 8.8.2).
with the boundary condition $U(x) = \Xi(x)$, $x \in \partial D$, where D is an open set
in $\mathbb{R}^2$, and $(A_i, H_{ij})$ and $\Xi$ are real-valued random fields defined on D and $\partial D$,
respectively. It is assumed that (1) the matrix $H = \{H_{ij}\}$ admits the representation
$H(x) = C(x)\, C(x)^T$ at each $x \in D$ and (2) the random fields $A_i$, $H_{ij}$, $C_{ij}$, and $\Xi$
are such that the local solution in Section 6.2 can be applied for almost all samples
of these fields.
Let $A_i(\cdot, \omega')$, $C_{ij}(\cdot, \omega')$, and $\Xi(\cdot, \omega')$ be samples of the random fields $A_i$,
$C_{ij}$, and $\Xi$, respectively, and define for each $\omega'$ an $\mathbb{R}^2$-valued diffusion process by
the stochastic differential equation
$$dX(t; \omega') = A(X(t; \omega'), \omega')\, dt + C(X(t; \omega'), \omega')\, dB(t, \omega'),$$
where B is an $\mathbb{R}^2$-valued Brownian motion. For a sample $\omega'$ of $A_i$, $C_{ij}$, and $\Xi$,
the local solution is
where $T(\omega') = \inf\{t > 0 : X(t; \omega') \notin D\}$ denotes the first time $X(\cdot; \omega')$ starting
at $x \in D$ exits D and the above expectation is with respect to the random variable
$X(T(\omega'); \omega')$ for a fixed $\omega'$ (Section 6.2.1). Statistics of the local solution U(x)
can be obtained from its samples $U(x, \omega')$.
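The sample-wise local solution can be sketched as a random walk average. The code below fixes one hypothetical sample $\omega'$ with constant drift and diffusion coefficients, runs diffusions from x until they exit $D = (-2, 2) \times (-1, 1)$, and averages the boundary values; only the bottom side carries a nonzero boundary value, as in the numerical example below. It is a sketch of the method, not the book's computation.

```python
import numpy as np

rng = np.random.default_rng(2)

def local_solution(x0, drift, c, xi_bottom, dt=0.01, n_walks=100):
    """U(x0) = E[Xi(X(T))] for one sample omega': run dX = drift dt + c dB
    from x0 until exit from D = (-2, 2) x (-1, 1); the boundary value is
    xi_bottom on the side x2 = -1 and zero elsewhere."""
    total = 0.0
    for _ in range(n_walks):
        x = np.array(x0, dtype=float)
        while -2.0 < x[0] < 2.0 and -1.0 < x[1] < 1.0:
            x = x + drift * dt + c @ rng.normal(0.0, np.sqrt(dt), 2)
        if x[1] <= -1.0:                 # exit through the bottom side
            total += xi_bottom
    return total / n_walks

# one hypothetical sample: zero drift, c = 0.5 * identity, Xi = 1 on the bottom side
print(local_solution((0.5, 0.0), np.zeros(2), 0.5 * np.eye(2), 1.0))
```

Averaging such sample solutions over independent samples of the coefficient fields then gives statistics of U(x), as in the histogram below.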
Numerical results have been obtained for $D = (-2, 2) \times (-1, 1)$,
with $|a_i| < i$, $i = 1, 2$, $C_{12}(x) \sim U(1/2 - a_3, 1/2 + a_3)$ with $|a_3| < 1/2$,
$\Xi(x) \sim U(1 - a_1, 1 + a_1)$ on the side $(-2, 2) \times \{-1\}$ of D, and $\Xi(x) = 0$
on all other boundaries of D, where $a_1 = 0.4$, $a_2 = 1$, and $a_3 = 0.2$, and the
random variables $A_i$, $C_{ij}$, and $\Xi$ are independent of each other. Figure 8.1 shows
a histogram of U((0.5, 0)) based on 100 independent samples of $A_i$, $C_{ij}$, and $\Xi$.
[Figure 8.1: histogram of U((0.5, 0)); mean = 0.2623, standard deviation = 0.1328, 100 samples.]
The solutions $U((0.5, 0), \omega')$ for each $\omega'$ are based on 1,000 independent samples
of X generated with a time step of $\Delta t = 0.001$. For $a_i = 0$, $i = 1, 2, 3$, the local
solution is deterministic, and its estimate at x = (0.5, 0) for the same sample size
is 0.2731. ◊
8.3 Algebraic equations

Let $Y = A - \lambda\, i$ for a fixed value of $\lambda$ such that $Y^{-1}$ exists a.s. Then Eq. 8.1
becomes
$$Y\, X = q, \quad q \in \mathbb{R}^n. \tag{8.2}$$
Our objective is to find the probability law of X from its definition in Eq. 8.2
and the probability law of Y. We will be able to achieve this objective in only a
few cases since (1) the inverse $Y^{-1}$ of a random matrix Y cannot generally be
found analytically and (2) the determination of the probability law of X from
$X = Y^{-1} q$ is usually impractical even if $Y^{-1}$ is known. Most of the methods in
this section focus on the approximate calculation of the first two moments of X
under the assumption $Y \in L_2$.
The Monte Carlo simulation method provides the most general solution of
Eq. 8.2 in the sense that it can deliver an estimate of the probability law of X. The
essential limitation of the method is the computation time, which can be excessive if
the solution of the associated deterministic problem (Eq. 8.3) is computationally
intensive and/or the required sample size $n_s$ is large. Another potential limitation
of Monte Carlo simulation relates to the available information on Y. For example,
if only the first two moments of Y are available, the use of Monte Carlo simulation
requires us to postulate the probability law of this random vector.
(8.4)
Note: The algorithms in Section 5.2 can be used to generate samples of Y. The Monte
Carlo simulation method is not restricted to linear equations. The method can be applied
to nonlinear equations, in which case the determination of the samples $X_k$ of X requires
the solution of a nonlinear algebraic equation rather than Eq. 8.3. ▲
$$\begin{bmatrix} \cos\Theta_1 & \cos\Theta_2 \\ \sin\Theta_1 & \sin\Theta_2 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} 0 \\ q \end{bmatrix},$$
where $\Theta_1 = \tan^{-1}\big(Z_2/(Z_1 + a)\big)$, $\Theta_2 = \tan^{-1}\big((a - Z_2)/(Z_1 + a)\big)$, $Z_1$ and
$Z_2$ are independent random variables uniformly distributed in $(-e\, a, e\, a)$, and
$0 \le e \le 1/3$. The random vector X gives the forces in the two-member truss
in Fig. 8.2 under a deterministic load q. Because of geometric imperfections, the
position of the joint defined by the intersection of the two bars is random with the
coordinates $(a + Z_1, Z_2)$.
[Figure 8.2: two-member truss under the vertical load q, with random joint coordinates $(a + Z_1, Z_2)$.]
Figure 8.3 shows histograms of $X_1$ and $X_2$ obtained from $n_s = 1,000$ truss
samples for a = 1, e = 0.3, and q = 1. The estimated mean, standard deviation,
skewness, and kurtosis are -1.16, 0.52, 1.23, and 4.17 for $X_1$ and 1.58, 0.52, 1.18,
and 3.94 for $X_2$. If e = 0, that is, there is no uncertainty in the truss geometry, the
member forces are $X_1 = -1$ and $X_2 = \sqrt{2}$. ◊
Note: The defining equations for $(X_1, X_2)$ result from the equilibrium conditions, where
$\Theta_1$ and $\Theta_2$ are the random orientations of the bars (Fig. 8.2). The entries of the matrix Y in
Eq. 8.2 are $Y_{11} = \cos(\Theta_1)$, $Y_{12} = \cos(\Theta_2)$, $Y_{21} = \sin(\Theta_1)$, and $Y_{22} = \sin(\Theta_2)$. The
Monte Carlo simulation was based on Eqs. 8.3 and 8.4. First, $n_s$ independent samples
of the joint coordinates $(a + Z_1, Z_2)$ were generated and the corresponding samples of
the angles $(\Theta_1, \Theta_2)$ and member forces $(X_1, X_2)$ were calculated. Second, the resulting
member forces were used to estimate moments and develop histograms for $(X_1, X_2)$. ▲
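The Monte Carlo procedure of the Note can be sketched directly: sample the joint coordinates, form Y, and solve $Y X = (0, q)$ for each sample. The deterministic limit e = 0 reproduces $X_1 = -1$ and $X_2 = \sqrt{2}$.

```python
import numpy as np

rng = np.random.default_rng(3)

def truss_forces(a=1.0, e=0.3, q=1.0, n_s=1000):
    """Monte Carlo samples of the member forces: sample the joint position
    (a + Z1, Z2), form Y(Theta1, Theta2), and solve Y X = (0, q)."""
    z1 = rng.uniform(-e * a, e * a, n_s)
    z2 = rng.uniform(-e * a, e * a, n_s)
    t1 = np.arctan2(z2, z1 + a)          # Theta_1
    t2 = np.arctan2(a - z2, z1 + a)      # Theta_2
    x = np.empty((n_s, 2))
    for i in range(n_s):
        y = np.array([[np.cos(t1[i]), np.cos(t2[i])],
                      [np.sin(t1[i]), np.sin(t2[i])]])
        x[i] = np.linalg.solve(y, [0.0, q])
    return x

x = truss_forces()
print(x.mean(axis=0), x.std(axis=0))     # compare with the estimates of Example 8.2
```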
1. Approximation of X:
$$X \simeq X(\mu_z) + \sum_{u=1}^{m} \frac{\partial X(\mu_z)}{\partial z_u}\, (Z_u - \mu_{z,u}). \tag{8.5}$$
Note: The approximation of the solution X in Eq. 8.5 is referred to as the linear or first
order Taylor approximation. We also note that the approximate second moment properties
of X in Eq. 8.6 depend only on the first two moments of Z.
The Taylor expansion is
where
$$R_z(z^{(0)}) = \frac{1}{2} \sum_{u,v=1}^{m} \frac{\partial^2 X(Z^*)}{\partial z_u\, \partial z_v}\, \big(Z_u - z_u^{(0)}\big)\big(Z_v - z_v^{(0)}\big)$$
and $Z^* = \theta\, z^{(0)} + (1 - \theta)\, Z$, $\theta \in (0, 1)$ ([149], Remark 2, p. 453; [26], Theorem 21.1,
p. 142). The approximation in Eq. 8.5 is given by the above Taylor expansion for $z^{(0)} = \mu_z$
without the remainder $R_z$.
The error of the linear approximation in Eq. 8.5 can be bounded by using the expression for $R_z$. This bound is rarely used in applications since it requires the calculation
of the second order partial derivatives $\partial^2 X/\partial z_u\, \partial z_v$, $u, v = 1, 2, \ldots, m$, which is a time
consuming task for large vectors Z. ▲
The approximate solutions in Eqs. 8.5 and 8.6 depend on the values of X
and its gradient for Z equal to its mean value, which are given by the following
two equations.
Proof: The solution of Eq. 8.7 corresponds to Eq. 8.2 with Z replaced by its mean value
and involves the inversion of a deterministic matrix $Y(\mu_z)$. The derivative of Eq. 8.2 with
respect to a coordinate of Z is
and gives Eq. 8.8 by setting $z = \mu_z$ since $\partial q/\partial z_u = 0$. The partial derivatives $\partial X(\mu_z)/\partial z_u$
are called sensitivity factors because they give the rate of change of X with respect to
perturbations in the coordinates of Z about $\mu_z$. If both the sensitivity factor with respect
to a coordinate $Z_k$ of Z and the uncertainty in this coordinate are small, then $Z_k$ can be set
equal to its mean value as a first approximation. •
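Eqs. 8.5-8.8 translate into a short numerical routine: solve the system at the mean of Z, obtain the sensitivity factors of Eq. 8.8 (here by finite differences of Y), and propagate the covariance of Z as in Eq. 8.6. The 2-by-2 system below is hypothetical.

```python
import numpy as np

def taylor_moments(y_of_z, q, mu_z, cov_z, h=1e-6):
    """First order Taylor approximation (Eqs. 8.5-8.8): solve at the mean and
    propagate cov(Z) through the sensitivities dX/dz_u = -Y^{-1} (dY/dz_u) X."""
    y0 = y_of_z(mu_z)
    x0 = np.linalg.solve(y0, q)              # X(mu_z), Eq. 8.7
    grad = np.empty((len(q), len(mu_z)))
    for u in range(len(mu_z)):
        z = mu_z.copy()
        z[u] += h
        dy = (y_of_z(z) - y0) / h            # dY/dz_u by finite differences
        grad[:, u] = -np.linalg.solve(y0, dy @ x0)   # Eq. 8.8
    return x0, grad @ cov_z @ grad.T         # approximate mean and covariance (Eq. 8.6)

# hypothetical system Y(z) = a + z1*R1 + z2*R2
a = np.array([[2.0, 0.3], [0.1, 1.5]])
r1 = np.array([[1.0, 0.0], [0.0, 0.0]])
r2 = np.array([[0.0, 0.0], [0.0, 1.0]])
y_of_z = lambda z: a + z[0] * r1 + z[1] * r2
mu_x, cov_x = taylor_moments(y_of_z, np.array([1.0, 1.0]),
                             np.zeros(2), 0.01 * np.eye(2))
print(mu_x, np.diag(cov_x))
```

The first-order mean is simply the solution at the mean of Z; the covariance depends only on the first two moments of Z, as noted above.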
Example 8.3: Let X be the solution of Eq. 8.2 with m = n = 1. Suppose that
Z takes values in a bounded interval (a, b) and that the function $Z \mapsto X(Z)$ has
a continuous second order derivative. The error of the first order approximation,
$\tilde X(Z) = X(\mu_z) + X'(\mu_z)\, (Z - \mu_z)$, can be bounded by
for almost all $\omega \in \Omega$. The calculation of this bound is impractical for realistic applications since $X''$ is not known and its numerical calculation can be prohibitive.
Moreover, the resulting bound may be too wide to be informative. ◊
Note: The approximate mean of X given by Eq. 8.7 can be obtained from the equilibrium
conditions for $Z_1 = 0$ and $Z_2 = 0$. The gradients of the coordinates of X relative to Z are
given by Eq. 8.8. The partial derivatives $\partial \cos(\Theta_u)/\partial Z_v$ and $\partial \sin(\Theta_u)/\partial Z_v$, $u, v = 1, 2$,
have to be calculated to find the entries of the matrices $\partial Y(Z)/\partial z_u$; for example, the element
(1, 1) of this matrix is
for u = 2. The entries (1,1), (1,2), (2,1), and (2,2) of the matrices $\partial Y(\mu_z)/\partial z_u$ are 0, $1/\sqrt{2} - 1$,
0, and -1 for u = 1 and 0, 0, 1, and $-1/\sqrt{2}$ for u = 2. The derivatives $\partial X(\mu_z)/\partial z_u$ in
Eq. 8.8 are (2, -2) for u = 1 and (2.83, 2) for u = 2. The second formula in Eq. 8.6 gives
an approximation for the covariance matrix of X.
Figure 8.4 shows the functions $Z \mapsto X_u(Z)$, u = 1, 2. The approximate coordinates
of X in Eq. 8.5 are hyperplanes $\pi_u$ tangent to the graphs of these functions at $(\mu_z, X_u(\mu_z))$,
u = 1, 2. The error of the approximate covariance in Eq. 8.6 depends on (1) the differences
between the hyperplanes $\pi_u$ and the functions $Z \mapsto X_u(Z)$ and (2) the uncertainty in Z.
For example, the approximate second moments of X are likely to be inaccurate if the above
differences are significant and/or the variance of the coordinates of Z is relatively large. ▲
$$X(Z) \simeq X(\mu_z) + \sum_{u=1}^{m} \frac{\partial X(\mu_z)}{\partial z_u}\, (Z_u - \mu_{z,u}) + \frac{1}{2} \sum_{u,v=1}^{m} \frac{\partial^2 X(\mu_z)}{\partial z_u\, \partial z_v}\, (Z_u - \mu_{z,u})(Z_v - \mu_{z,v}). \tag{8.9}$$
$$\mu_x \simeq X(\mu_z) + \frac{1}{2} \sum_{u,v=1}^{m} \frac{\partial^2 X(\mu_z)}{\partial z_u\, \partial z_v}\, \gamma_{z,uv}.$$
The above second order partial derivatives of X can be calculated from the second
partial derivatives of Eq. 8.2.
Nonlinear equations. Suppose that X is the solution of
$$Y\, X + N = q, \tag{8.10}$$
where Y is as in Eq. 8.2, N denotes an (n, 1) matrix whose entries are nonlinear
real-valued functions of (Z, X), and the $\mathbb{R}^m$-valued random variable Z collects
The gradients of X can be obtained from the above equation since the functions
$Z \mapsto Y(Z)$ and $(Z, X) \mapsto N(Z, X)$ are known, so that the partial derivatives
$\partial Y/\partial z_u$, $\partial N/\partial z_u$, and $\partial N/\partial x_k$ can be calculated.
$$Y = a + \varepsilon\, R, \tag{8.11}$$
1. Approximation of X:
Proof: The above perturbation solution is regular because the random part of Y is small
and the solution X does not differ significantly from the solution $x = a^{-1} q$ of Eq. 8.2
with $\varepsilon = 0$ (Example 8.5 in this section; [101], Section 1.2).
The representation of X in Eq. 8.12 and Eq. 8.2 give
$$(a + \varepsilon\, R)\, (X_0 + \varepsilon\, X_1 + \varepsilon^2\, X_2 + \cdots) = q \quad \text{or}$$
$$(a\, X_0 - q) + \varepsilon\, (a\, X_1 + R\, X_0) + \varepsilon^2\, (a\, X_2 + R\, X_1) + \cdots = 0,$$
that is, a power series that must be zero for all values of $\varepsilon$. We require that the coefficients
of all powers of $\varepsilon$ be zero according to a fundamental theorem of perturbation theory ([171],
$$X^{(p)} = \sum_{k=0}^{p} (-1)^k\, \varepsilon^k\, \big(a^{-1} R\big)^k\, a^{-1} q,$$
where $(a^{-1} R)^k$ is the identity matrix for k = 0. Note that the equations for $X_0, X_1, \ldots$
have the same deterministic operator.
The moments in Eq. 8.13 show that it is not possible to approximate $E[X\, X^T]$ to
the order $\varepsilon^2$ by using the first order approximation of X. The first order approximation
of X delivers only one of the three terms of order $\varepsilon^2$, the term $E[X_1\, X_1^T]$. The other two
terms cannot be obtained from the first order approximation. •
1. The perturbation method can be used to find the solution of Eq. 8.2 when the
random matrix Y has a more general form than in Eq. 8.11, for example, $Y = a + \varepsilon\, R_1 + \varepsilon^2\, R_2 + O(\varepsilon^3)$.
2. The perturbation method can also be applied to determine the solution of nonlinear problems defined by
$$Y\, X + \varepsilon\, N = q, \tag{8.14}$$
where Y is given by Eq. 8.11 and the entries of the (n, 1) matrix N are real-valued nonlinear functions of X and may be random. The first three terms of the
perturbation solution of this equation can be calculated from
$$a\, X_0 = q, \qquad a\, X_1 = -R\, X_0 - N(X_0), \qquad a\, X_2 = -R\, X_1 - N^*(X_0)\, X_1, \tag{8.15}$$
where $N^*(X_0)$ denotes the matrix of partial derivatives of N evaluated at $X_0$. These equations result by equating to zero the coefficients of the powers of $\varepsilon$ in
$$(a + \varepsilon\, R)\, (X_0 + \varepsilon\, X_1 + \cdots) + \varepsilon\, N(X_0) + \varepsilon \sum_{i=1}^{n} \frac{\partial N(X_0)}{\partial x_i}\, (\varepsilon\, X_{1,i} + \cdots) = q,$$
Example 8.5: Let x be the solution of the deterministic nonlinear algebraic equation $x + \varepsilon\, x^3 = 1$, where $\varepsilon$ is a small parameter. The approximation of the real
root of this equation to the order $\varepsilon^3$ is $x \simeq x^{(3)} = 1 - \varepsilon + 3\, \varepsilon^2 - 12\, \varepsilon^3$.
Figure 8.5 shows the exact root and the perturbation solutions of the first
three orders.
[Figure 8.5: exact root and perturbation solutions $x^{(1)}$, $x^{(2)}$, $x^{(3)}$ versus $\varepsilon \in (0, 0.5)$.]
For small values of $\varepsilon$ the perturbation solutions are satisfactory, and
their accuracy improves slightly with the order of the approximation. For large
values of $\varepsilon$ the perturbation solutions are inaccurate. ◊
Note: The power series $x = \sum_{k=0,1,\ldots} \varepsilon^k\, x_k$ and the equation $x + \varepsilon\, x^3 = 1$ give
$$(x_0 + \varepsilon\, x_1 + \varepsilon^2\, x_2 + \cdots) + \varepsilon\, (x_0 + \varepsilon\, x_1 + \varepsilon^2\, x_2 + \cdots)^3 = 1 \quad \text{or}$$
$$x_0 + \varepsilon\, x_1 + \varepsilon^2\, x_2 + \varepsilon^3\, x_3 + \cdots + \varepsilon\, \big[x_0^3 + 3\, \varepsilon\, x_0^2\, x_1 + 3\, \varepsilon^2\, (x_0^2\, x_2 + x_0\, x_1^2) + \cdots\big] = 1,$$
so that we have $x_0 = 1$, $x_1 + x_0^3 = 0$, $x_2 + 3\, x_0^2\, x_1 = 0$, and $x_3 + 3\, (x_0^2\, x_2 + x_0\, x_1^2) = 0$ by
equating terms of the same order of magnitude. These equations give $x_0 = 1$, $x_1 = -x_0^3 = -1$, $x_2 = -3\, x_0^2\, x_1 = 3$, and $x_3 = -3\, (x_0^2\, x_2 + x_0\, x_1^2) = -12$.
We note that the algebraic equation in this example provides an illustration of a
singular perturbation problem since the limit point $\varepsilon = 0$ differs in an essential way from
the limit $\varepsilon \to 0$. The equations with $\varepsilon = 0$ and $\varepsilon > 0$ have a single root and three roots,
respectively. Problems that are not singular are said to be regular ([101], Section 1.2). ▲
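The quality of the expansion is easy to examine numerically: for $\varepsilon > 0$ the cubic $\varepsilon x^3 + x - 1 = 0$ has a single real root (the left side is monotone in x), which can be computed with a polynomial solver and compared with $x^{(3)}$.

```python
import numpy as np

def x_pert(eps):
    """Third order regular perturbation solution of x + eps*x^3 = 1."""
    return 1.0 - eps + 3.0 * eps**2 - 12.0 * eps**3

def x_exact(eps):
    """Real root of eps*x^3 + x - 1 = 0 (unique for eps > 0)."""
    roots = np.roots([eps, 0.0, 1.0, -1.0])
    return roots[np.abs(roots.imag) < 1e-10].real[0]

for eps in (0.01, 0.1, 0.5):
    print(eps, x_exact(eps), x_pert(eps))
```

The agreement is excellent for small $\varepsilon$ and degrades rapidly as $\varepsilon$ approaches 0.5, consistent with Figure 8.5.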
Example 8.6: Consider the algebraic equation in Example 8.2 and represent the
random matrix Y as the sum of a deterministic part $a = E[Y]$ and a random part
$\varepsilon\, R = Y - a$. For $\varepsilon = 0.3$ the approximate means of $X_1$ and $X_2$ are -1.0130
(1.30%) and 1.4245 (0.73%), and the approximate standard deviations of $X_1$ and
$X_2$ are 0.3825 (-26.81%) and 0.3831 (-26.82%), respectively. These approximate results are based on Eq. 8.13. The numbers in parentheses give errors relative
to the Monte Carlo solution in Example 8.2. ◊
Note: The expectations E[Y] and $E[R_{ij}\, R_{kl}]$ needed for the solution are difficult to find analytically because of the complex dependence of R on the random variables specifying the
random geometry of the truss. These averages were estimated from $n_s = 1,000$ samples
of Y generated by Monte Carlo simulation. For example, the estimates of the entries (1, 1),
(1, 2), (2, 1), and (2, 2) of E[Y] are 0.9848, 0.7003, 0.0025, and 0.7038, respectively. ▲
Proof: Consider the sequence of sums $S_k = \sum_{r=0}^{k} (-1)^r\, m^r$. For $k > l$, we have
$$\| S_k\, x - S_l\, x \| = \Big\| \sum_{r=l+1}^{k} (-1)^r\, m^r\, x \Big\| \le \sum_{r=l+1}^{k} \| m^r\, x \| \le \sum_{r=l+1}^{k} \gamma^r\, \| x \| = \frac{\gamma^{l+1}\, (1 - \gamma^{k-l})}{1 - \gamma}\, \| x \| \le \frac{\gamma^{l+1}}{1 - \gamma}\, \| x \|$$
by norm properties and Eq. 8.16. Because the upper bound on $\| S_k\, x - S_l\, x \|$ approaches
zero for each $x \in \mathbb{R}^n$ as $l, k \to \infty$ and $\mathbb{R}^n$ is complete ([32], Theorem 3.8, p. 41), $S_k\, x$ is
Cauchy in $\mathbb{R}^n$ and has a limit. The bound
$$\| S_k\, (i + m)\, x - x \| = \| S_k\, x + S_k\, m\, x - x \| = \| (-1)^k\, m^{k+1}\, x \| \le \gamma^{k+1}\, \| x \|$$
shows that the difference between $S_k\, (i + m)\, x$ and x approaches zero as $k \to \infty$ because $\gamma \in (0, 1)$, so that
$\lim_{k \to \infty} S_k\, (i + m)\, x = x$. Because $S\, x = \lim_{k \to \infty} S_k\, x$ holds for each $x \in \mathbb{R}^n$ and
562 Chapter 8. Stochastic Systems and Deterministic Input
Let X be the solution of Eq. 8.2, where Y E L2. Denote by a = E[Y] and
R = Y- a the deterministic and random parts of Y, respectively. The Neumann
series solution for Eq. 8.2 and the approximate second moment properties of X
can be calculated from the following equations.
(8.19)
X = (i + Σ_{r=1}^∞ (−1)^r (a^{−1} R)^r) a^{−1} q a.s.
Generally, the first few terms of the above series are used to approximate X. For example,
the second moment properties of X given by Eq. 8.19 are based on the approximation
X = Σ_{r=0}^∞ U_r, where U_0 = a^{−1} q and

U_r = −a^{−1} R U_{r−1}, r = 1, 2, .... (8.20)
Σ_{r=0}^∞ λ^r U_r = a^{−1} q − λ a^{−1} R Σ_{r=0}^∞ λ^r U_r,
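The Neumann recursion U_0 = a^{−1} q, U_r = −a^{−1} R U_{r−1} can be sketched directly; a minimal numerical illustration (example matrices are assumed, not from the book), compared against a direct solve of (a + R) X = q:

```python
import numpy as np

def neumann_solve(a, R, q, terms=20):
    # Truncated Neumann series X = sum_r U_r with U_0 = a^{-1} q and
    # U_r = -a^{-1} R U_{r-1}; converges when the norm of a^{-1} R is below 1.
    ainv = np.linalg.inv(a)
    u = ainv @ q
    x = u.copy()
    for _ in range(terms):
        u = -ainv @ R @ u
        x += u
    return x

a = np.array([[2.0, 0.3], [0.1, 1.5]])        # assumed deterministic part
R = np.array([[0.05, -0.02], [0.03, 0.04]])   # one sample of the random part
q = np.array([1.0, -1.0])
x_series = neumann_solve(a, R, q)
x_exact = np.linalg.solve(a + R, q)
print(x_series, x_exact)
```

For a random R, the same recursion applied sample by sample (or truncated and averaged) yields the approximate second moment properties of X.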
1. Approximate solution X :
Proof: The unknown coefficients α and β should be determined from the condition that the
difference between X and the approximation X̃ is minimized in some sense. However, this difference cannot
be calculated directly since the probability law of X is unknown. If ‖Y^{−1} x‖ ≤ γ ‖x‖
a.s. for all x ∈ R^n and a constant γ < ∞, then
[Figure: approximation errors plotted against the coefficient of variation v = σ/μ.]

X = a^{−1} q − a^{−1} R X, (8.23)
derived from Eq. 8.2, where a = E[Y] and R = Y -a, and (2) a heuristic
approximation, referred to as the local independence hypothesis, stating that the
correlation between X and R can be disregarded ([164], p. 18, [173], Section 11.8).
There is no theoretical justification for this hypothesis. We review briefly this
method because of its wide use in some fields [164, 173]. Two versions of the
iteration method are presented.
E[X] ≈ a^{−1} q,
E[X X^T] ≈ a^{−1} q (a^{−1} q)^T + a^{−1} E[R X X^T R^T] (a^{−1})^T. (8.24)
Proof: The above equations for the first two moments of X can be obtained from the
iteration of order zero of Eq. 8.2, that is, Eq. 8.23.
The expectation of Eq. 8.23 and the local independence hypothesis, E[R X] =
E[R] E[X] = 0, yield the first formula in Eq. 8.24.
The expression of the product of Eq. 8.23 with itself and the local independence
hypothesis give the second formula in Eq. 8.24. This formula and the local independence
hypothesis can be used to calculate the expectations E[X_i X_j]. ■
Proof: The average of Eq. 8.23 involves the unknown expectation E[R X]. The approach
here is to develop an equation for E[R X] rather than using the local independence hypothesis in this equation. By multiplying Eq. 8.23 to the left with R and then averaging, we
have

E[R X] = E[R] a^{−1} q − E[R a^{−1} R X] = −E[R a^{−1} R X] ≈ −E[R a^{−1} R] E[X],

where the last approximate equality results from the local independence hypothesis. This
equality and Eq. 8.23 give the first formula in Eq. 8.25. The second formula in Eq. 8.25
results from the hypothesis of independence and the expression of X in Eq. 8.23. This
formula and the hypothesis of independence can be used to calculate the expectations
E[X_i X_j]. ■
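One consistent reading of the derivation above gives the improved mean approximation E[X] ≈ (i − a^{−1} E[R a^{−1} R])^{−1} a^{−1} q. The sketch below (assumed small zero-mean Gaussian R; parameters are illustrative, not from the book) estimates E[R a^{−1} R] from samples and compares with a brute-force Monte Carlo mean:

```python
import numpy as np

rng = np.random.default_rng(1)
a = np.array([[2.0, 0.3], [0.1, 1.5]])        # assumed deterministic part
q = np.array([1.0, -1.0])
ainv = np.linalg.inv(a)

def sample_R():
    # Zero-mean random part; the distribution is an assumption for illustration.
    return 0.05 * rng.standard_normal((2, 2))

ns = 20000
# Estimate E[R a^{-1} R] from samples.
ERaR = sum(R @ ainv @ R for R in (sample_R() for _ in range(ns))) / ns
# Improved iteration mean, from E[R X] ~ -E[R a^{-1} R] E[X]
# (local independence hypothesis):
mean_x = np.linalg.solve(np.eye(2) - ainv @ ERaR, ainv @ q)
# Reference: brute-force Monte Carlo mean of X = (a + R)^{-1} q.
mc = sum(np.linalg.solve(a + sample_R(), q) for _ in range(ns)) / ns
print(mean_x, mc)
```

When the random part is small, both estimates agree to second order in R, which is all the heuristic can deliver.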
Our objective is to solve the above random eigenvalue problem, that is, to
find probabilistic properties of the eigenvalues A; and the eigenvectors X; of A.
Note: It can be shown that the eigenvalues Λ_i are Borel measurable functions of the entries
of A ([21], Theorem 2.2, p. 23) so that they are random variables on (Ω, F, P). The eigenvectors X_i of A are also random variables on this probability space ([164], Theorems 1.6,
p. 37, and 1.8, p. 43). ▲
• The eigenvalues λ_i are real and the eigenvectors x_i are in R^n.
• If a is positive definite, its eigenvalues are positive.
• The eigenvectors corresponding to distinct eigenvalues are orthogonal and linearly independent.
• The eigenvectors are unique in the sense that an eigenvalue that is not multiple
can have only one eigenvector.
• If a has a multiple eigenvalue, it is possible to define a set of n linearly independent eigenvectors providing a basis in R^n.
Proof: The equation a x_i = λ_i x_i implies the equalities ((a x_i)^*)^T x_i = λ_i^* (x_i^*)^T x_i
and (x_i^*)^T (a x_i) = λ_i (x_i^*)^T x_i so that (λ_i − λ_i^*) (x_i^*)^T x_i = 0, where x_i^* is the complex conjugate of x_i; since (x_i^*)^T x_i > 0, the eigenvalues are real.
• |λ_i − μ_i| ≤ (Σ_{i,j=1}^n r_{ij}²)^{1/2} ≤ n max_{i,j} |r_{ij}|, where r is an (n, n) symmetric
matrix with real-valued entries and μ_i denote the eigenvalues of a + r ([127],
pp. 96-99, and [193], pp. 101-103).
• |λ − a_{kk}| ≤ r_k = Σ_{i=1, i≠k}^n |a_{ki}| for at least one k ([120], p. 371, [193], p. 71).
Proof: We prove the first two statements. The first statement follows from the inner product
of a x_i = λ_i x_i with x_i.
Suppose that the eigenvectors of a are normalized such that their norm is ‖x_k‖ =
(x_k, x_k)^{1/2} = 1. If x = Σ_{k=1}^n c_k x_k is a vector in R^n with ‖x‖ = (x, x)^{1/2} = 1, the
Rayleigh quotient is

λ_R = (a x, x)/(x, x) = Σ_{k=1}^n c_k² λ_k = λ_1 + Σ_{k=2}^n c_k² (λ_k − λ_1) ≥ λ_1

since ‖x‖² = Σ_{k=1}^n c_k² = 1 and λ_k ≥ λ_1 for k ≥ 2. Also, λ_R ≤ λ_n Σ_{k=1}^n c_k² = λ_n. If c_k = ε_k c_p for k ≠ p,

λ_R = λ_p + c_p² Σ_{k=1, k≠p}^n (λ_k − λ_p) ε_k²,
Λ_{1,2} = (S_11 + S_22)/2 ± [((S_11 − S_22)/2)² + S_12²]^{1/2}
X_i ≈ x_i + Σ_{p=1}^m x_{i,p} (Z_p − E[Z_p]), where x_{i,p} = Σ_{k=1, k≠i}^n ((x_k^T a_{,p} x_i)/(λ_i − λ_k)) x_k. (8.28)
Proof: The partial derivatives λ_{i,p} and x_{i,p} of Λ_i and X_i with respect to a coordinate Z_p of
Z in Eqs. 8.27 and 8.28 are called sensitivity factors. The expressions of Λ_i and X_i in
Eqs. 8.27 and 8.28 are referred to as first order approximations.
Let Λ^n + C_1 Λ^{n−1} + ··· + C_{n−1} Λ + C_n = 0 be the characteristic equation for Λ.
The differentiation of this equation with respect to Z_p yields
The above equation can be used to calculate the coordinates b_{ik}^{(p)} of x_{i,p} = Σ_{k=1}^n b_{ik}^{(p)} x_k
in the basis defined by the eigenvectors of a by noting that (1) (a − λ_i i) x_{i,p} is orthogonal
to x_i since (x_i^T a − λ_i x_i^T) x_{i,p} = 0 and (2) the product of the above equation with x_k^T to
the left yields

(λ_k − λ_i) b_{ik}^{(p)} = λ_{i,p} δ_{ki} − x_k^T a_{,p} x_i.

The latter equation gives the coordinates b_{ik}^{(p)} of x_{i,p} for i ≠ k since a has distinct eigenvalues by assumption. The coordinate b_{ii}^{(p)} remains undetermined and can be taken as zero so
that x_{i,p} = Σ_{k=1, k≠i}^n b_{ik}^{(p)} x_k. The choice b_{ii}^{(p)} = 0 is not restrictive since (a − λ_i i) x_{i,p}
and x_i are orthogonal so that the above results do not change if we add to x_{i,p} a vector
proportional to x_i. ■
The first order approximations of Λ_i and X_i in Eqs. 8.27 and 8.28 are linear
functions of Z so that the second moment properties of Z are sufficient to calculate the first two moments of the eigenvalues and eigenvectors of A. For example,
the approximate mean and variance of Λ_i are λ_i and Σ_{p,q=1}^m λ_{i,p} λ_{i,q} γ_{z,pq}, respectively.
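These first order second-moment formulas can be sketched numerically. The code below uses the matrices of Example 8.12 (reproduced from the text) with a single random coordinate Z_1; the variance value is an assumption for illustration:

```python
import numpy as np

# First order second-moment approximation for the eigenvalues of
# A(Z) = a + Z_1 a_1, using sensitivity factors lam_{i,1} = x_i^T a_1 x_i
# (distinct eigenvalues assumed).
a = np.array([[2.6, -1.1, 0.0],
              [-1.1, 5.2, -np.sqrt(2)],
              [0.0, -np.sqrt(2), 10.0]])
a1 = np.array([[0.1, -0.1, 0.0],
               [-0.1, 0.2, 0.0],
               [0.0, 0.0, 0.0]])      # a_{,1}: derivative of A with respect to Z_1
lam, x = np.linalg.eigh(a)             # columns of x are unit-norm eigenvectors
sens = np.array([x[:, i] @ a1 @ x[:, i] for i in range(3)])
var_z = 0.2**2                         # assumed variance of the coordinate Z_1
mean_lam = lam                         # first order mean: eigenvalues of a
var_lam = sens**2 * var_z              # first order variance
print(mean_lam, var_lam)
```

A useful sanity check is that the sensitivity factors sum to trace(a_{,1}), since Σ_i x_i^T a_{,1} x_i = tr(a_{,1}) for an orthonormal eigenvector basis.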
a = [  2.6  −1.1   0.0
      −1.1   5.2  −√2
       0.0  −√2    10 ],    R = Z [  0.1  −0.1  0.0
                                    −0.1   0.2  0.0
                                     0.0   0.0  0.0 ],

and Z is a random variable with mean zero (μ_Z = 0) and variance σ_Z². The
eigenvalues and eigenvectors of a, that is, of A with Z = 0, are λ_1 = 2.1647,
λ_2 = 5.2385, λ_3 = 10.3967, and
where the columns of b are the eigenvectors of a. The first order Taylor approximations of the eigenvalues and eigenvectors of A given by Eqs. 8.27 and 8.28 are

Λ_1 ≈ 2.1647 + 0.0449 Z, Λ_2 ≈ 5.2385 + 0.2383 Z, Λ_3 ≈ 10.3967 + 0.0168 Z,

and

X ≈ b + Z [ −0.0046   0.0118   0.0054
             0.0113   0.0025  −0.0105
             0.0024  −0.0082  −0.0032 ],

respectively. ◊
Note: The random coefficients of the characteristic polynomial det(A − Λ i) are C_0 = 1,
C_1 = −0.3 Z − 17.8, C_2 = 0.01 Z² + 3.82 Z + 88.31, and C_3 = −0.1 Z² − 8 Z − 117.9
so that C_{0,1} = 0, C_{1,1} = −0.3, C_{2,1} = 0.02 Z + 3.82, and C_{3,1} = −0.2 Z − 8. The
sensitivity factors for the eigenvalues can be calculated from Eq. 8.27
with Z set equal to zero. The first order approximation of the eigenvectors of A is given by
Eq. 8.28. For example, the coordinates of x_{1,1} are

b_{12}^{(1)} = (x_2^T a_{,1} x_1)/(λ_1 − λ_2) = −0.0124 and b_{13}^{(1)} = (x_3^T a_{,1} x_1)/(λ_1 − λ_3) = −0.00089.
where a and R are (n, n) matrices with real-valued entries that are deterministic
and random, respectively, det(a) ≠ 0, E[R] = 0, and ε is a small parameter. It
is assumed that a is symmetric and has distinct eigenvalues. Let λ_i and x_i be the
eigenvalues and eigenvectors of a. The eigenvectors x_i are scaled to have unit
norm.
The alternative form, (a − λ_i i) x_i^{(1)} = λ_i^{(1)} x_i − R x_i, of the above equation for order ε
and the orthogonality of the vectors (a − λ_i i) x_i^{(1)} and x_i imply x_i^T (λ_i^{(1)} x_i − R x_i) = 0
or λ_i^{(1)} = x_i^T R x_i since x_i^T x_i = 1.
Alternative calculations can be performed to find this result. For example, the equation for order ε multiplied to the left by x_j^T gives

x_j^T a Σ_{k=1}^n β_{ik} x_k + x_j^T R x_i = λ_i x_j^T Σ_{k=1}^n β_{ik} x_k + λ_i^{(1)} x_j^T x_i,

where x_i^{(1)} = Σ_{k=1}^n β_{ik} x_k and β_{ik}, k = 1, ..., n, denote the projections of x_i^{(1)} on the
eigenvectors of a. The last equation simplifies to β_{ij} (λ_j − λ_i) + x_j^T R x_i = λ_i^{(1)} δ_{ij} since
x_j^T a x_i = λ_i δ_{ij} and x_j^T x_i = δ_{ij} so that

λ_i^{(1)} = x_i^T R x_i for i = j and β_{ij} = (x_j^T R x_i)/(λ_i − λ_j) for i ≠ j.

The projection β_{ii} of x_i^{(1)} on x_i remains undetermined. Adding a vector proportional to
x_i to x_i^{(1)} = Σ_{j=1, j≠i} β_{ij} x_j does not change the above results so that we take β_{ii} = 0,
and the resulting first order approximation of X_i is x_i + ε x_i^{(1)}. Additional considerations
on the determination of x_i^{(1)} and approximations of order 2 and higher of the eigenvalues
and eigenvectors of matrix A can be found, for example, in [120] (Chapters 9 and 11). ■
The first order approximations of Λ_i and X_i in Eqs. 8.30 and 8.31 are linear
functions of R so that the second moment properties of this random matrix are
sufficient to calculate the first two moments of the eigenvalues and eigenvectors
of the random matrix A.
Example 8.13: Suppose that the deterministic and stochastic parts, a and R, of a
(3, 3) random matrix A (Eq. 8.29) are the matrices a and R in Example 8.12. It is
assumed that the random variable Z has mean zero and variance of order 1.
The zero order perturbation solution is given by the eigenvalues and the
eigenvectors of a (Example 8.12). The first order corrections of the eigenvalues
and eigenvectors of A are λ_1^{(1)} = x_1^T R x_1 = 0.045971 Z, λ_2^{(1)} = x_2^T R x_2 =
0.238710 Z, λ_3^{(1)} = x_3^T R x_3 = 0.015319 Z, and

x^{(1)} = Z [ −0.0046   0.0118   0.0054
              0.0113   0.0025  −0.0105
              0.0024  −0.0082  −0.0032 ],

respectively. The eigenvalues and eigenvectors to order ε are Λ_i ≈ λ_i +
ε λ_i^{(1)} and X_i ≈ x_i + ε x_i^{(1)}. The above expressions for x^{(1)} and Λ_i can be
used to calculate approximately moments and other probabilistic properties of the
eigenvalues and eigenvectors of A. ◊
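The first order eigenvalue correction λ_i + z x_i^T r x_i can be checked against exact eigenvalues of a + z r for a small sample value z. The sketch below uses the Example 8.12 matrices (r is the Z-independent factor of R); the value of z is an assumption:

```python
import numpy as np

a = np.array([[2.6, -1.1, 0.0],
              [-1.1, 5.2, -np.sqrt(2)],
              [0.0, -np.sqrt(2), 10.0]])
r = np.array([[0.1, -0.1, 0.0],
              [-0.1, 0.2, 0.0],
              [0.0, 0.0, 0.0]])
lam, x = np.linalg.eigh(a)
first_order = np.array([x[:, i] @ r @ x[:, i] for i in range(3)])

z = 0.02                                   # assumed small sample value of Z
exact = np.linalg.eigvalsh(a + z * r)
approx = lam + z * first_order
print(exact - approx)                      # residual is O(z^2)
```

Since the eigenvalues of a are well separated, the residual is of order z² divided by the spectral gaps, which is tiny here.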
Note: The first order approximation of the first eigenvector of A is (Eq. 8.31)
The calculations of moments and other probabilistic properties of λ_i^{(1)} and x_i^{(1)} are very
simple in this case since they depend linearly on Z. ▲
A =a+R, (8.32)
Note: The above approximations of the expectations of the eigenvalues and eigenvectors
of A coincide with the eigenvalues and eigenvectors of a.
The equation a E[X] + E[R X] = E[Λ X], obtained by averaging (a + R) X =
Λ X, cannot be used to calculate the first order moments of Λ and X since E[R X] and
E[Λ X] are not known. The approximations
and depends on the unknown averages E[Λ² X], E[R X], E[R a X], and E[R R X]. The
local independence hypothesis and the last equation give E[Λ²] E[X] ≈ a a E[X] +
E[R R] E[X] or

E[Λ_i²] x_i^T x_i ≈ x_i^T a a x_i + x_i^T E[R R] x_i,

which yields Eq. 8.34 since x_i^T x_i = 1 and x_i^T a a x_i = λ_i² x_i^T x_i = λ_i². ▲
Example 8.14: Consider the (3, 3) random matrix A in Example 8.13 with ε = 1.
The first moments of the eigenvalues and eigenvectors of A coincide with the
deterministic solution corresponding to matrix a. We have obtained the same approximation for the mean values of the eigenvalues and eigenvectors by the perturbation method. However, the approximate second moment of the first eigenvalue
of A is E[Λ_1²] ≈ 4.69 + 0.0038 σ_Z², and differs from the perturbation solution. ◊
[Figure: sample path of V(λ) with its second and third zero-crossings marked.]
Proof: By the Rice formula the mean zero-crossing rate of V is

ν(λ) = ∫_R |z| f_{(V(λ), V'(λ))}(0, z) dz = f_{V(λ)}(0) ∫_R |z| f_{V'(λ)|V(λ)}(z | 0) dz,

where f_{(V(λ), V'(λ))} and f_{V'(λ)|V(λ)} denote the densities of (V(λ), V'(λ)) and V'(λ) | V(λ),
respectively. The mean rate ν(λ) includes zero-crossings with both positive and negative
slopes. The second moment properties of the R²-valued random variable (V(λ), V'(λ))
are

μ(λ) = E[V(λ)] = λ^n + α(λ) μ,  μ'(λ) = E[V'(λ)] = n λ^{n−1} + β(λ) μ,
γ_00(λ, ρ) = α(λ) γ α(ρ)^T,  γ_01(λ, ρ) = α(λ) γ β(ρ)^T,  γ_10(λ, ρ) = β(λ) γ α(ρ)^T, and
γ_11(λ, ρ) = β(λ) γ β(ρ)^T, where μ and γ denote the mean and covariance matrices of
the random vector C.
If it is assumed that C is an R^n-valued Gaussian variable, then (V(λ), V'(λ)) is a
Gaussian vector, and the mean and variance of the conditional random variable V'(λ) |
V(λ) are m(λ) and σ(λ)² in Eq. 8.39, respectively (Section 2.11.5). The Rice formula
applied to the process V_G yields ν_G(λ) in Eq. 8.40.
The second moment properties of C may be difficult to obtain analytically, but can
be estimated simply from samples of A generated by Monte Carlo simulation. ■
The joint density of V'(ζ_s) at the y_s-crossings of V, s = 1, ..., r, that is, the
density of (V'(ζ_1), ..., V'(ζ_r)) conditional on (V(ζ_1) = y_1, ..., V(ζ_r) = y_r),
is

g(z | y, ζ) = (Π_{s=1}^r |z_s| f(z | y)) / (∫_{R^r} Π_{s=1}^r |u_s| f(u | y) du). (8.41)
Consider also the event G = ∩_{t=1}^q {V(λ_t) ∈ (x_t, x_t + Δx_t]}, Δx_t > 0, for some arbitrary
values of λ_t such that λ_t > ζ_r. The probability of the simultaneous occurrence of C and G
can be approximated by

P(C ∩ G) ≈ Π_{t=1}^q Δx_t ∫_{R^r} Π_{s=1}^r |z_s| f(z, y, x) dz,

where Δx_t is small and f(z, y, x) is the joint density of (V'(ζ_s), V(ζ_s)), s = 1, ..., r,
and V(λ_t), t = 1, ..., q, so that

P(G | C) = Π_{t=1}^q Δx_t ∫_{R^r} (Π_{s=1}^r |z_s| f(z, y) f(x | y, z)) / (∫_{R^r} Π_{s=1}^r |u_s| f(u, y) du) dz
         = Π_{t=1}^q Δx_t ∫_{R^r} g(z | y, ζ) f(x | y, z) dz,
Let ν(λ | y, z, ζ) be the mean zero-crossing rate of the conditional stochastic process V(λ) | (V(ζ_s) = y_s, V'(ζ_s) = z_s, s = 1, ..., r) for λ > ζ_r. The mean zero-crossing
rate of V(λ), λ > ζ_r, conditional on only V(ζ_s) = y_s, s = 1, ..., r, can be obtained by
integrating ν(λ | y, z, ζ) weighted by g(z | y, ζ) over all possible values of z. This mean
crossing rate with y = 0 yields ν(λ | ζ) in Eq. 8.42.
The determination of the functions g(z | y, ζ) and ν(λ | 0, z) is difficult for an
arbitrary vector C. If C is Gaussian, the conditional zero-crossing rate ν(λ | 0, z), denoted
by ν_G(λ | 0, z), can be obtained from Eq. 8.40 with the appropriate parameters and
g(z | y, ζ) becomes

g_G(z | y, ζ) = (Π_{s=1}^r |z_s| exp[−(1/2) (z − m)^T γ^{−1} (z − m)]) / (∫_{R^r} Π_{s=1}^r |u_s| exp[−(1/2) (u − m)^T γ^{−1} (u − m)] du),

where m and γ denote the mean and covariance matrices of the conditional random vector
(V'(ζ_1), ..., V'(ζ_r)) | (V(ζ_1) = y_1, ..., V(ζ_r) = y_r) [75]. ■
The mean zero-crossing rates ν(λ) and ν(λ | ζ) provide useful information
on the eigenvalues Λ_i of A. For example, (1) the average number of eigenvalues of
A in an interval I of the real line is ∫_I ν(λ) dλ or ∫_I ν(λ | ζ) dλ, and (2) the probability
that the number of eigenvalues N(I) in I exceeds a specified value q > 0 is
bounded by P(N(I) > q) ≤ E[N(I)]/q = (1/q) ∫_I ν(λ) dλ by the Markov inequality.
where g denotes the constant of gravity, 0 < a < l and m > 0 are some constants,
and K > 0 is a random variable. The eigenvalues, Λ_1 = g/l and Λ_2 = g/l +
2 a² K/(m l²), of A are the squares of the natural frequencies of a mechanical
system consisting of two pendulums with length l and mass m coupled by a spring
of random stiffness K. The mean zero-crossing rate of the process V can be
calculated exactly and is [75]

ν(λ) = δ(λ − g/l) + (m l²/(2 a²)) f_K((m l²/(2 a²)) (λ − g/l)).
can be obtained by direct calculations and show that the first eigenvalue of A is determin-
istic, in agreement with the crossing solution. •
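The interpretation of ν(λ) as an eigenvalue density can be illustrated by direct sampling of the pendulum eigenvalues. Parameter values and the stiffness distribution below are assumptions for illustration, not the book's choices:

```python
import numpy as np

rng = np.random.default_rng(2)
g, l, m, a_len = 9.81, 1.0, 1.0, 0.5        # assumed pendulum parameters
ns = 100000
K = rng.uniform(1.0, 3.0, ns)               # assumed stiffness distribution
lam1 = np.full(ns, g / l)                   # deterministic first eigenvalue
lam2 = g / l + 2 * a_len**2 * K / (m * l**2)
# The average number of eigenvalues in an interval I equals the integral of
# the mean zero-crossing rate over I; here estimated by counting.
I = (10.0, 11.0)
count = np.mean((lam2 >= I[0]) & (lam2 <= I[1])) + \
        np.mean((lam1 >= I[0]) & (lam1 <= I[1]))
print(count)
```

The delta term of ν(λ) corresponds to the deterministic eigenvalue Λ_1 = g/l; only the Λ_2 term contributes to intervals excluding g/l.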
A = [ (K_1 + K_2)/m          (K_2 l_2 − K_1 l_1)/m
      (K_2 l_2 − K_1 l_1)/J  (K_1 l_1² + K_2 l_2²)/J ],

where m, l_1, l_2, J > 0 are some constants and K_1, K_2 > 0 are random variables.
This matrix defines the eigenvalue problem for a rigid bar with mass m, length l,
and moment of inertia J relative to its mass center, which is supported at its ends
by two springs with stiffness K_1 and K_2. The mass center is at distances l_1 and l_2
from the two springs so that l = l_1 + l_2. Numerical results are given for a weight
m g = 3220 lb, l_1 = 4.5 ft, l_2 = 5.5 ft, J = m r², r = 4 ft, and lognormal variables K_1
and K_2 with E[K_1] = 2400 lb/ft, E[K_2] = 2600 lb/ft, coefficients of variation
v_1 = v_2 = 0.3, and correlation coefficient ρ [75]. The lognormal vector (K_1, K_2)
represents the image of an R²-valued Gaussian variable by a nonlinear mapping
([79], pp. 48-49). The matrix A has two eigenvalues and we have ∫_0^∞ ν(λ) dλ = 2.
Figure 8.8 shows the mean zero-crossing rates ν and ν_G of V and V_G, respectively, for ρ = 0, 0.5, and 0.9 in the left three plots. The mean zero-crossing
rates ν have been obtained by numerical integration. The plots also show the conditional mean zero-crossing rates ν_G(λ | Λ_1 = 46). The mean zero-crossing rates
ν_G(λ | Λ_1 = 46) are also shown in the three plots on the right side of Fig. 8.8
together with Monte Carlo (MC) simulation results. The Gaussian assumption is
satisfactory for the matrix A in this example. The additional information Λ_1 = 46
reduces the uncertainty in Λ_2. The reduction in uncertainty increases with the correlation coefficient ρ between K_1 and K_2. ◊
Note: The joint density of (V(λ), V'(λ)) is
where a = −(1/m + l_1²/J), b = −(1/m + l_2²/J), and c = (l_1 + l_2)²/(m J). The arguments
k_s, s = 1, 2, in the above expression are related to x and z by the mappings V(λ) = x and
V'(λ) = z. ▲
[Figure 8.8: mean zero-crossing rates ν(λ) (left) and conditional rates ν_G(λ | Λ_1 = 46) with Monte Carlo (MC) results (right) for ρ = 0.0, 0.5, and 0.9.]
Note: The squared frequencies of the system coincide with the eigenvalues of the (n, n)
[Figure: mean zero-crossing rate estimates under the Gaussian assumption and by Monte Carlo simulation.]
mathematical objects and should not be confused. The symbol Λ in Eq. 8.43 has been used
to parallel the notation Λ in Eq. 8.1. ▲
where the term in square brackets is a random function F(x) and the kernel of the
second integral K(x, ξ) = 1_{[0,x]}(ξ) (ξ − x) Y_1 is random (Eq. 8.44). ◊
Let V =A- AI for a fixed A such that Eq. 8.43, that is,
Note: The solution U of this equation is a random field on (Q, :F, P) because it is a
measurable function of the random coefficients of V and of the random initial/boundary
conditions for Eq. 8.45, which are defined on this probability space. A
It is assumed that (1) all random parameters in Eq. 8.45 have finite second
moments, (2) D is a linear operator, that is, D[a_1 U_1 + a_2 U_2] = a_1 D[U_1] +
a_2 D[U_2], where a_i, i = 1, 2, are some constants, and (3) the deterministic part
L = E[D] of D has an inverse L^{−1} that is a linear operator. The random part
R = D − L of D has zero expectation by definition. For example, the random
operator in Example 8.19 is D = d²/dx² + Λ Y_1 so that L = d²/dx² + Λ E[Y_1]
and R = Λ (Y_1 − E[Y_1]).
Our objective is to find probabilistic properties of the solution U of Eq. 8.45.
We have already mentioned that some of the local solutions in Chapter 6 can be
extended to solve a class of differential equations with random coefficients. We
also can approximate Eq. 8.45 by an algebraic equation with random coefficients
and use any of the methods in Section 8.3.1 to solve the resulting equation. The fi-
nite difference, finite element, or other methods can be used to discretize Eq. 8.45.
μ̂_U(x, t) = (1/n_s) Σ_{k=1}^{n_s} U_k(x, t),

ĉ_U(x_1, x_2, t_1, t_2) = (1/n_s) Σ_{k=1}^{n_s} (U_k(x_1, t_1) − μ̂_U(x_1, t_1)) (U_k(x_2, t_2) − μ̂_U(x_2, t_2))^T. (8.47)
Note: The algorithms in Chapter 5 can be used to generate samples of D. The Monte Carlo
simulation method is not restricted to linear operators. It can be applied to nonlinear equations, in which case the determination of the samples U_k of U requires solving a nonlinear
differential equation rather than Eq. 8.46. ▲
1. Approximation of U:
U(x, t; Z) ≈ U(x, t; μ_z) + Σ_{u=1}^m (∂U(x, t; μ_z)/∂z_u) (Z_u − μ_{z,u}). (8.48)
(8.49)
Note: The approximate solution for U in Eq. 8.48 is referred to as the linear or first order
Taylor solution. Considerations similar to the discrete case (Eqs. 8.5 and 8.6) can be used
to derive Eqs. 8.48 and 8.49. A
The functions U(x, t; μ_z) and ∂U(x, t; μ_z)/∂z_u needed to apply Eqs. 8.48
and 8.49 can be calculated from the following two equations. Let L̄ be the operator
D with μ_z in place of Z.
Proof: Generally, L̄ differs from L = E[D]. The function U(x, t; μ_z) in Eq. 8.50 results
from Eq. 8.45 with its initial and boundary conditions in which Z is set equal to μ_z.
The partial derivative (∂D/∂z_u)[U] + D[∂U/∂z_u] = 0 of Eq. 8.45 relative to a coordinate
Z_u of Z gives Eq. 8.51 by taking Z equal to μ_z in this equation.
The functions U and ∂U/∂z_u satisfy deterministic differential equations with the
same differential operator, the operator L̄. The derivatives ∂U/∂z_u give the rate of change
of U with respect to the coordinates of Z and are called sensitivity factors. If a sensitivity
factor ∂U/∂z_k is small and the variance of Z_k is not excessive, the effect of the uncertainty
in Z_k on the solution is limited so that Z_k may be taken equal to its mean value as a first
approximation. ■
The approximate second moment properties of U in Eq. 8.49 can be (a) im-
proved by retaining additional terms of the Taylor series approximating the map-
ping Z ~---+ U and (b) generalized to solve nonlinear equations. For example, let
U be the solution of
D [U(x, t)] +N [U(x, t)] = q(x, t), (8.52)
(∂D/∂z_u)[U] + D[∂U/∂z_u] + (∂N/∂z_u)[U] + Σ_{k=1}^d (∂N[U]/∂u_k) (∂U_k/∂z_u) = 0

and the functions Z ↦ D and (Z, U) ↦ N are known, the derivatives ∂D/∂z_u, ∂N/∂z_u,
and ∂N/∂u_k can be calculated, and therefore the gradients ∂U(x, t; μ_z)/∂z_u can be obtained from the above equation. ▲
In this case L̄ coincides with L = E[D] since the coefficients of D are linear in
the coordinates of Z. ◊
Note: The differential equations for U(x; μ_z) and ∂U(x; μ_z)/∂z_u, u = 1, 2, result from
Eqs. 8.50 and 8.51. Classical methods for solving deterministic differential equations can
be used to find U(x; μ_z) and ∂U(x; μ_z)/∂z_u. ▲
Example 8.21: Let U(t), t ≥ 0, be the solution of the nonlinear differential equation (d/dt − α − Z U(t)²) U(t) = q(t), where t ≥ 0, α is a constant, Z denotes
a random variable, and U(0) = 0. The function U(t; μ_z) and the sensitivity factor
V(t; μ_z) = ∂U(t; μ_z)/∂z satisfy the differential equations
[Figure: U(t; μ_z), U(t; −0.4), and the sensitivity factor V(t; μ_z) in time for q(t) = sin(ν t), ν = 5, and μ_z = −0.5.]

The sensitivity factor increases rapidly in time
and reaches a plateau at about t = 4, showing that the dependence of the solution
on the value of Z increases in time for t ≤ 4 and is nearly constant for t > 4.
This observation is consistent with the graph of U(t; −0.4), showing a notable
difference from U(t; μ_z) for times t ≥ 4. ◊
Note: The differential equation defining U is nonlinear. The linear and nonlinear parts
of the differential operator are D[U] = (d/dt − α) U and N[U] = −Z U³, respectively
(Eq. 8.52). ▲
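The state and its sensitivity factor can be integrated side by side. Writing the equation as dU/dt = α U + z U³ + q(t) and differentiating in z gives dV/dt = α V + U³ + 3 z U² V, with V = ∂U/∂z. A minimal Euler sketch (the value of α is an assumption; the book's value is not given in this excerpt):

```python
import numpy as np

# Euler integration of U and of the sensitivity V = dU/dz for
# dU/dt = alpha*U + z*U^3 + q(t), U(0) = 0 (the Example 8.21 form);
# differentiating in z: dV/dt = alpha*V + U^3 + 3*z*U^2*V, V(0) = 0.
alpha, z, nu = -1.0, -0.5, 5.0              # alpha is an assumed value
dt, T = 1e-3, 8.0
n = int(T / dt)
u = v = 0.0
for k in range(n):
    t = k * dt
    q = np.sin(nu * t)
    du = alpha * u + z * u**3 + q
    dv = alpha * v + u**3 + 3 * z * u**2 * v
    u, v = u + dt * du, v + dt * dv
print(u, v)
```

The two equations share the same linearized operator, which is the pattern noted after Eq. 8.51: the sensitivity equation is always linear even when the state equation is not.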
The equation

D = L + ε R, (8.53)
1. Approximation of U:
Proof: The approximation for U based on the first q terms of the power series in Eq. 8.54
is called the approximation of order q. The first order approximation U ≈ U_0 + ε U_1
corresponds to q = 1. The initial and boundary conditions for the differential equation
defining U_k, k = 0, 1, ..., consist of the terms of order ε^k of the corresponding conditions
for Eq. 8.45.
The recurrence formulas in Eq. 8.54 giving U_k can be obtained from Eq. 8.45 with
D in Eq. 8.53 and the solution U represented by the power series in Eq. 8.54. These
equations give
The coefficients of each power of ε must be zero by the fundamental theorem of perturbation theory ([171], p. 12) so that we have
for the first three terms of the perturbation series. These conditions give Eq. 8.54. The
expressions of the first two moments of U depend on the order of the approximation considered for the solution. For example, the approximations of the mean of U to the orders
ε and ε² are, respectively, E[U] = U_0 + O(ε²) and E[U] = U_0 + ε² E[U_2] + O(ε³).
Errors may result if the order of the approximating solution is insufficient to deliver all the
terms needed for calculating moments to a specified accuracy. For example, the first order
approximation for U delivers only one of the three terms of order ε² in the expression of
E[U(x_1, t_1) U(x_2, t_2)^T] (Eq. 8.55). ■
Note that (1) the terms of the perturbation series satisfy differential equations with the same operator
L but different inputs (Eq. 8.54), (2) the calculation of the terms U_k can be performed sequentially starting with U_0, (3) the formulas in Eq. 8.54 can be used to
express the solution of Eq. 8.45 in terms of the input q, and (4) the method can
be applied to solve nonlinear problems. For these problems, U_0 is the solution of
a nonlinear differential equation, but higher order terms satisfy linear differential
equations (Example 8.24).
Example 8.22: Let D = d/dt + β + ε Y(t), t ≥ 0, in Eq. 8.45 with d = 1, where
q and β > 0 are some constants, Y is a stochastic process with mean zero, and
U(0) = 0. Hence, the deterministic and random parts of D are L = (d/dt) + β
and R = Y, respectively. The first two terms of the perturbation series are

U_0(t) = (q/β) (1 − e^{−β t}) and U_1(t) = −(q/β) e^{−β t} ∫_0^t (e^{β s} − 1) Y(s) ds.
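For a constant sample Y(t) = y the integrals above close in elementary form, which allows a direct numerical check of the first order approximation against the exact solution of (d/dt + β + ε y) U = q. A minimal sketch with assumed parameter values:

```python
import numpy as np

# For dU/dt + (beta + eps*y) U = q with constant y and U(0) = 0, compare the
# exact solution with U0 + eps*U1 from the perturbation series.
q, beta, eps, y, t = 1.0, 1.0, 0.1, 0.7, 2.0
u_exact = q / (beta + eps * y) * (1.0 - np.exp(-(beta + eps * y) * t))
u0 = q / beta * (1.0 - np.exp(-beta * t))
# U1(t) = -(q/beta) e^{-beta t} * integral_0^t (e^{beta s} - 1) y ds, y constant
u1 = -(q / beta) * np.exp(-beta * t) * y * ((np.exp(beta * t) - 1.0) / beta - t)
print(u_exact, u0 + eps * u1)
```

The residual of the first order approximation is O(ε²), visibly smaller than the O(ε) error of U_0 alone.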
Example 8.23: Let U(t) be the solution of the differential equation in the previous example with Y(t) = Y a random variable uniformly distributed in (0, 1), q
a constant, and U(0) = 0. Denote by U_st = lim_{t→∞} U(t) the stationary solution of this equation. The distribution of this random variable is P(U_st ≤ u) =
P(Y > (q − u β)/(ε u)) and can be approximated by P(U_st ≤ u) ≈ P(Y >
(β²/(ε q)) (q/β − u)) for U ≈ U_0 + ε U_1. Figure 8.11 shows with dotted and
solid lines the exact and approximate probabilities P(U_st ≤ u), respectively, for
q = 1, β = 1, and ε = 0.1, 0.5, 1.0. As expected, the perturbation solution is
accurate for small values of ε. ◊
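The two probabilities can be compared directly, since U_st = q/(β + ε Y) gives a closed form for the exact distribution. A minimal sketch with the stated parameters:

```python
import numpy as np

# Exact and first order perturbation approximations of P(U_st <= u) for
# U_st = q/(beta + eps*Y), Y uniform in (0, 1).
q, beta, eps = 1.0, 1.0, 0.5
u = np.linspace(q / (beta + eps) + 1e-6, q / beta - 1e-6, 5)
exact = 1.0 - (q - u * beta) / (eps * u)          # P(Y >= (q - u*beta)/(eps*u))
approx = np.clip(1.0 - (beta**2 / (eps * q)) * (q / beta - u), 0.0, 1.0)
print(np.abs(exact - approx).max())
```

The support of U_st is [q/(β + ε), q/β]; the clip keeps the linearized approximation inside [0, 1] near the support edge.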
Proof: The exact solution results from U_st = q/(β + ε Y), which gives P(U_st ≤ u) =
P(Y ≥ (q − u β)/(ε u)) = 1 − (q − u β)/(ε u) for u ∈ [q/(β + ε), q/β] since Y is
uniformly distributed in (0, 1). ■

[Figure 8.11: exact (dotted) and approximate (solid) probabilities P(U_st ≤ u) for ε = 0.1, 0.5, and 1.0.]
Example 8.24: Let U be the solution of the nonlinear differential equation U̇(t) =
−U(t)² − (U(t) + Y)² sin(ε t), t ≥ 0, starting at U(0) = 1, where Y is a random variable taking values in [0, 1] and ε is a small parameter. The terms of the
perturbation solution
where φ_i, i = 1, 2, and γ_k, k = 1, ..., 7, are some deterministic functions of t and Y ([17],
pp. 57-58).
Approximate mappings relating the random parameter Y to U(t) can be used to find
the marginal distribution of the solution at an arbitrary time t. For example, consider the
mapping Y ↦ Û(t) = h(t, ε, Y) corresponding to the first order approximation U ≈
U_0 + ε U_1 for U. This mapping has the inverse ([17], p. 61),
d⁴U(x)/dx⁴ + Y(x) U(x) = q(x), x ∈ D = (0, 1)

U(x) = ∫_0^1 g(x, v) q(v) dv − ∫_0^1 g(x, v) Y(v) U(v) dv,

g(x, v) = { (1 − v) x [1 − x² − (1 − v)²]/6,  x ≤ v,
            (1 − x) v [1 − v² − (1 − x)²]/6,  x ≥ v. (8.56)
Note: The solution U can be interpreted as the deflection of a beam on elastic foundation
that has unit length, is simply supported at its ends, and is subjected to a load q. The
random field Y defines the stiffness of the beam-elastic support ensemble.
U(x_i) ≈ h Σ_{j=1}^n g(x_i, x_j) q(x_j) − h Σ_{j=1}^n g(x_i, x_j) Y(x_j) U(x_j),

where x_i = i h, i = 0, 1, ..., n, and h = 1/n. The matrix form of the above set of
equations is Y U = q, in which U collects the unknown values U(x_i) of U, the coordinate
i of q is h Σ_{j=1}^n g(x_i, x_j) q(x_j), and Y is a random matrix depending on the values of the
field Y at the nodes x_i of the finite difference mesh. The resulting algebraic equation has
random coefficients and can be solved by the methods in Section 8.3.1. ▲
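The discretization above can be sketched end to end for one sample of the stiffness field. The sample model for Y is an assumption for illustration; the Green function is the one in Eq. 8.56:

```python
import numpy as np

# Finite difference form of the integral equation for the beam:
# U(x_i) = h * sum_j g(x_i, x_j) q(x_j) - h * sum_j g(x_i, x_j) Y(x_j) U(x_j),
# written as (i + h G diag(Y)) U = h G q and solved for one sample of Y.
def g(x, v):
    # Green function of Eq. 8.56 (simply supported beam, unit length)
    if x <= v:
        return (1 - v) * x * (1 - x**2 - (1 - v)**2) / 6.0
    return (1 - x) * v * (1 - v**2 - (1 - x)**2) / 6.0

n = 50
h = 1.0 / n
x = np.arange(n + 1) * h
G = np.array([[g(xi, xj) for xj in x] for xi in x])
rng = np.random.default_rng(3)
Y = 1.0 + 0.1 * rng.standard_normal(n + 1)   # one sample of the field (assumed model)
q = np.ones(n + 1)
U = np.linalg.solve(np.eye(n + 1) + h * G @ np.diag(Y), h * G @ q)
print(U[n // 2])
```

Because g vanishes at x = 0 and x = 1, the computed deflection automatically satisfies the simply supported boundary conditions.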
Example 8.26: Suppose that U(t), t ≥ 0, is the solution of the differential equation (d/dt + β + Y(t)) U(t) = q(t), where U(0) = Z is a random variable,
Y denotes a stochastic process with positive sample paths a.s., β > 0, and q is an
integrable function of time. The deterministic and random parts of the differential
operator of this equation are L = d/dt + β and R = Y(t), respectively. The
integral form of the differential equation for U is
Note: The integral equation for U can be obtained by multiplying the differential equation
of U with the Green function of L and integrating over the time interval [0, t], that is,

∫_0^t e^{−β(t−s)} (d/ds + β) U(s) ds = ∫_0^t e^{−β(t−s)} q(s) ds − ∫_0^t e^{−β(t−s)} Y(s) U(s) ds.

Integration by parts of this equation gives the stated integral equation for U.
The integral equation for U can also be obtained by viewing U̇(t) = −β U(t) +
q(t) − Y(t) U(t) as a linear system with Green function g(t, s) = exp(−β (t − s)), t ≥ s,
driven by q(t) − Y(t) U(t). ▲
φ(x) = f(x) + λ ∫_0^1 k(x, y) φ(y) dy, (8.57)

defining a real-valued function φ(x) for x ∈ [0, 1], where f is a real-valued function and λ is a constant. The Neumann series solution of Eq. 8.44 is based on the
following fact.
8.4. Differential and integral equations 593
If the kernel k is square integrable on [0, 1] × [0, 1] and |λ| < ‖k‖^{−1}, the
solution of Eq. 8.57 is

φ(x) = f(x) − λ ∫_0^1 h(x, y; λ) f(y) dy, where (8.58)

h(x, y; λ) = −Σ_{r=1}^∞ λ^{r−1} k_r(x, y), k_1(x, y) = k(x, y), and
k_r(x, y) = ∫_0^1 k(x, z) k_{r−1}(z, y) dz, r = 2, 3, .... (8.59)
Proof: Let a(x)² = ∫_0^1 k(x, y)² dy and b(y)² = ∫_0^1 k(x, y)² dx. The norm
‖k‖ = (∫_0^1 ∫_0^1 k(x, y)² dx dy)^{1/2}
is finite by hypothesis so that there exists an α with the property below. We have
|k_2(x, y)| ≤ a(x) b(y)
from Eq. 8.59, properties of the norm of k, and the Cauchy-Schwarz inequality. For example,

k_3(x, y)² = (∫_0^1 k(x, z) k_2(z, y) dz)² ≤ ∫_0^1 k(x, z)² dz ∫_0^1 k_2(z, y)² dz ≤ a(x)² b(y)² α²

so that k_3(x, y)² ≤ a(x)² b(y)² α² and k_{n+2}(x, y)² ≤ α^{2n} a(x)² b(y)² or |k_{n+2}(x, y)| ≤
α^n a(x) b(y), n = 0, 1, ..., which gives

Σ_{r=1}^∞ |λ^r k_r(x, y)| ≤ Σ_{r=1}^∞ |λ|^r |k_r(x, y)| ≤ |λ| |k(x, y)| + Σ_{r=2}^∞ |λ|^r a(x) b(y) α^{r−2}.

The geometric series Σ_{r=1}^∞ (|λ| α)^r converges if |λ| < α^{−1} or |λ| < ‖k‖^{−1} since we can
take α = ‖k‖. Hence, the series Σ_{r=1}^∞ λ^r k_r(x, y) is absolutely and uniformly convergent
in [0, 1] × [0, 1] because its absolute value is bounded by a convergent series independent
of the arguments (x, y) multiplied by the function α^{−2} a(x) b(y), which is bounded almost
everywhere in [0, 1] × [0, 1].
If |λ| < ‖k‖^{−1}, the solution of the Fredholm equation is given by Eq. 8.58 since
the series in Eq. 8.59 giving the resolvent kernel h is absolutely and uniformly convergent
so that it can be integrated term by term, that is,

λ ∫_0^1 k(x, z) h(z, y; λ) dz = −Σ_{r=1}^∞ λ^r ∫_0^1 k(x, z) k_r(z, y) dz = −Σ_{r=2}^∞ λ^{r−1} k_r(x, y) = h(x, y; λ) + k(x, y).

Hence, we have λ ∫_0^1 k(x, z) h(z, y; λ) dz = k(x, y) + h(x, y; λ).
It remains to show that φ in Eq. 8.58 satisfies Eq. 8.57, that is, that

∫_0^1 [−h(x, y; λ) − k(x, y) + λ ∫_0^1 k(x, z) h(z, y; λ) dz] f(y) dy = 0.

The last condition is satisfied since the above integrand is zero. Hence, φ given by Eqs. 8.58
and 8.59 is the solution of Eq. 8.57. Additional considerations on the Neumann series in
Eq. 8.57 can be found in [99] (pp. 266-269), [110] (Chapter 3), and [187] (pp. 49-53).
The proof that Eqs. 8.58 and 8.59 give the solution of Eq. 8.57 can be obtained under
other conditions. For example, this result holds if the kernel is bounded in its domain of
definition, that is, |k(x, y)| ≤ β, 0 < β < ∞, and |λ| < β^{−1}. Moreover, the requirement
|λ| < β^{−1} can be replaced by |λ| < |λ_1|, where λ_1 denotes the smallest eigenvalue of k
([99], p. 267).
Because the series representation for h(x, y; λ) in Eq. 8.59 is absolutely and uniformly convergent almost everywhere in [0, 1] × [0, 1], the integration in Eq. 8.58 can be performed term by term. Therefore, the Neumann series solution is

φ(x) = f(x) + Σ_{r=1}^∞ λ^r ∫_0^1 k_r(x, y) f(y) dy.

Numerical calculations based on the Neumann series solution are usually based on the first two or three terms of this series. •
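To make the truncation concrete, the sketch below (our own kernel k(x, y) = xy, input f(x) = x, grid, and term count; none of it from the book) accumulates Neumann terms for a scalar Fredholm equation of the second kind and compares the result with the closed-form solution φ(x) = x/(1 − λ/3) available for this separable kernel:

```python
import numpy as np

# Truncated Neumann series for phi(x) = f(x) + lam * INT_0^1 k(x,y) phi(y) dy.
# Kernel, f, grid, and number of terms are illustrative choices.
def neumann_solve(k, f, lam, n_terms=20, n_grid=201):
    x = np.linspace(0.0, 1.0, n_grid)
    w = np.full(n_grid, x[1] - x[0])       # trapezoid quadrature weights
    w[0] *= 0.5
    w[-1] *= 0.5
    K = k(x[:, None], x[None, :])          # K[i, j] = k(x_i, x_j)
    fx = f(x)
    phi, term = fx.copy(), fx.copy()
    for _ in range(n_terms):               # add lam^r INT k_r(x,y) f(y) dy
        term = lam * K @ (w * term)
        phi += term
    return x, phi

# Separable kernel k(x,y) = x*y with f(x) = x: exact solution phi(x) = 1.5 x
# for lam = 1, since INT_0^1 y * phi(y) dy = c/3 with phi = c x.
x, phi = neumann_solve(lambda u, v: u * v, lambda u: u, lam=1.0)
err = np.max(np.abs(phi - 1.5 * x))
```

With λ = 1 < ‖k‖^{-1} = 3 the terms decay geometrically with ratio λ/3, so a handful of terms already reproduces the solution to quadrature accuracy.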
where D is an open bounded subset of ℝ^d, φ, f : D → ℝ^n, and k : D × D → ℝ^{n×n}. If (1) |k_{ij}(x, y)| ≤ β, 0 < β < ∞, for all x, y ∈ D and i, j = 1, ..., n, (2) ∫_D dx = v_D < ∞, and (3) |λ| < (n v_D β)^{-1}, then the Neumann series solution of Eq. 8.60 is

φ(x) = f(x) − λ ∫_D h(x, y; λ) f(y) dy,

where h(x, y; λ) = −Σ_{r=1}^∞ λ^{r-1} k_r(x, y) and the kernels k_r can be obtained from k_1(x, y) = k(x, y) and k_r(x, y) = ∫_D k(x, z) k_{r-1}(z, y) dz for r ≥ 2. ◊
Proof: We have

|Σ_{r=1}^∞ λ^r k_{r,ij}(x, y)| ≤ Σ_{r=1}^∞ |λ|^r n^{r-1} v_D^{r-1} β^r,

showing that Σ_{r=1}^∞ λ^r k_{r,ij}(x, y) is absolutely and uniformly convergent in D × D if the condition |λ| < (n v_D β)^{-1} is satisfied. The function h satisfies the equation λ ∫_D k(x, z) h(z, y; λ) dz = k(x, y) + h(x, y; λ).
The solutions of Eqs. 8.57-8.59 and Eqs. 8.60-8.61 can be extended to ob-
tain a Neumann series solution for Eq. 8.44. There is a notable difference between
these equations. The kernels of the integrals in Eqs. 8.57 and 8.60 are determinis-
tic while the kernel of the integral in Eq. 8.44 is stochastic.
If (1) there is β ∈ (0, ∞) such that |K_{ij}(x*, y*)| ≤ β a.s. for all x*, y* ∈ D* and i, j = 1, ..., n, (2) ∫_{D*} dx* = v* < ∞, and (3) |λ| < (n v* β)^{-1}, then the solution of Eq. 8.44 completed with initial and boundary conditions is given by a Neumann series with resolvent kernel

H(x*, y*; λ) = −Σ_{r=1}^∞ λ^{r-1} K_r(x*, y*), K_1(x*, y*) = K(x*, y*), and
K_r(x*, y*) = ∫_{D*} K(x*, z*) K_{r-1}(z*, y*) dz*, r = 2, 3, .... (8.63)
Note: The result in Example 8.27 can be applied for almost all samples of the stochastic kernel K since the entries of K are assumed to be bounded almost surely. Hence, the resulting Neumann series is absolutely and uniformly convergent a.s. We also note that the condition |λ| < (n v* β)^{-1} implies that the Neumann series is absolutely and uniformly convergent a.s. on any D ⊂ D* since ∫_D dx* = v ≤ v*. ▲
U(t) = f(t) + λ ∫_0^t K(t, s) U(s) ds, where f(t) = (q/β)(1 − e^{−βt}), K(t, s) = 1_{[0,t]}(s) Y e^{−β(t−s)}, and λ = −1 (Example 8.26). If |Y| < 1/τ a.s., the Neumann series is absolutely and uniformly convergent in [0, τ] a.s., and it gives the solution.
Figure 8.12. Exact and approximate mean and standard deviation functions of U
so that the Neumann series is absolutely and uniformly convergent in [0, τ] if τ |Y| < 1 a.s. Because |Y| < 0.5 a.s., the Neumann series giving the solution U is convergent for τ < 2. The first two kernels are K_1 = K and K_2(t, s) = ∫_0^τ K(t, u) K(u, s) du.

The approximate solution can be used to calculate moments and other properties of U. The exact solution U(t) = (q/(β + Y))(1 − e^{−(β+Y)t}) is a simple function of Y so that its moments can be obtained without difficulty. ▲
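A quick numerical cross-check of this statement (illustrative choices, not from the book: q = β = 1, t = 1.5, and Y uniform on (−0.4, 0.4) so that |Y| < 0.5 a.s.) compares the mean of the two-term Neumann approximation with the mean of the exact solution:

```python
import numpy as np

# Two-term Neumann approximation U ~ U0 + U1 versus the exact solution
# U(t) = q/(beta+Y) (1 - exp(-(beta+Y)t)); parameters are our own choices.
rng = np.random.default_rng(0)
q, beta, t = 1.0, 1.0, 1.5
ys = rng.uniform(-0.4, 0.4, 500)           # samples of Y, |Y| < 0.5 a.s.

U0 = (q / beta) * (1.0 - np.exp(-beta * t))
# U1 = -INT_0^t Y exp(-beta(t-s)) U0(s) ds, evaluated here in closed form:
U1 = -ys * (q / beta) * ((1.0 - np.exp(-beta * t)) / beta
                         - t * np.exp(-beta * t))
approx_mean = np.mean(U0 + U1)
exact_mean = np.mean(q / (beta + ys) * (1.0 - np.exp(-(beta + ys) * t)))
```

The two-term approximation is exact to first order in Y, so the two means differ only by terms of order E[Y^2].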
Example 8.29: Let U(t), t ≥ 0, be the solution of the differential equation (d/dt + β + Y(t)) U(t) = q with U(0) = 0, where Y is a zero-mean process. The decomposition method is based on a series representation U(t) = Σ_{r=0}^∞ U_r(t) of U, where

U_0(t) = ∫_0^t e^{−β(t−s)} q ds and U_1(t) = −∫_0^t e^{−β(t−s)} Y(s) U_0(s) ds

are the first two terms of the above decomposition of U, ℒ = d/dt + β, and ℛ = Y(t) ([3], Sections 4.7, 4.10, 5.1, 7.2, and 8.2). The functions U_0 and U_1 coincide with the first two terms of the Neumann series solution for U. ◊
Note: The differential equation satisfied by U is (ℒ + ℛ)[U(t)] = q. The integral version of the differential equation for U can be written symbolically as U(t) = ℒ^{-1}[q] − ℒ^{-1}ℛ[U(t)], suggesting the recurrence formulas U_0 = ℒ^{-1}[q] and U_k = −ℒ^{-1}ℛ[U_{k−1}] for k ≥ 1. ▲
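The recurrence can be sketched numerically. The fragment below (our own discretization, with a constant sample Y(t) = y chosen so that the exact solution q/(β + y)(1 − e^{−(β+y)t}) is available for comparison) applies ℒ^{-1}[g](t) = ∫_0^t e^{−β(t−s)} g(s) ds repeatedly:

```python
import numpy as np

# Decomposition recursion U0 = Linv[q], Uk = -Linv[Y U_{k-1}] for a constant
# sample Y(t) = y; Linv inverts L = d/dt + beta with zero initial condition.
q, beta, y = 1.0, 1.0, 0.3
t = np.linspace(0.0, 2.0, 2001)
h = t[1] - t[0]

def Linv(g):
    # exp(-beta t) times the cumulative trapezoid integral of exp(beta s) g(s)
    integrand = np.exp(beta * t) * g
    cum = np.concatenate(([0.0],
                          np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * h)))
    return np.exp(-beta * t) * cum

U = Linv(np.full_like(t, q))     # U0
approx = U.copy()
for _ in range(6):               # Uk = -Linv[y * U_{k-1}]
    U = -Linv(y * U)
    approx += U

exact = q / (beta + y) * (1.0 - np.exp(-(beta + y) * t))
err = np.max(np.abs(approx - exact))
```

The terms are bounded by (|y| t)^k / k!, so seven terms suffice on [0, 2] for |y| = 0.3.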
Example 8.30: Let U(t), t ≥ 0, be the solution of the differential equation (ℒ + ℛ)[U(t)] = q(t) in the previous example, where ℒ and ℛ are deterministic and random differential operators and q is an integrable function. The mean of U can be calculated approximately from the expectation of a truncated version of the above decomposition series.

Note: The expectation, ℒ[E[U(t)]] + E[ℛ U(t)] = q(t), of the differential equation for U cannot be used to calculate the first moment of the solution since E[ℛ U(t)] is not known. An equation can be developed for this unknown expectation by applying the operator ℛ to the solution U(t) = ℒ^{-1}[q(t)] − ℒ^{-1}ℛ[U(t)], which gives ℛ[U(t)] = ℛℒ^{-1}[q(t)] − ℛℒ^{-1}ℛ[U(t)].
Our objective is to solve the random eigenvalue problem defined by Eq. 8.66, that is, to find properties of the eigenvalues Λ_i and the eigenfunctions Φ_i of A. To complete the definition of the eigenvalue problem in Eq. 8.66, boundary conditions need to be specified for this equation.

Note: It can be shown that the eigenvalues and eigenfunctions of A are random variables and random fields, respectively, on a probability space (Ω, ℱ, P) on which A is defined ([164], Section 1.2).

More general eigenvalue problems arise in some applications. For example, we may be interested in finding the eigenvalues and eigenfunctions of the homogeneous equation A_1[Φ] = Λ A_2[Φ], where A_1 and A_2 are differential or integral operators with random coefficients. ▲
600 Chapter 8. Stochastic Systems and Deterministic Input
since Σ_i c_i^2 = 1 and λ_j ≥ λ_1. The inner product can be calculated term by term because the space spanned by the eigenfunctions of ℒ is complete.

The last property implies that any function in U can be represented as a linear form of the eigenfunctions of ℒ. •
Denote the deterministic and random parts of the operator A in Eq. 8.66 by ℒ and ℛ, respectively. We assume that the random coefficients of A are square integrable. Set ℒ = E[A] and ℛ = A − ℒ. Bounds on the eigenvalues of A can provide useful information on, for example, the dynamic properties of uncertain dynamic systems. Exact expressions for the eigenvalues and eigenfunctions of A can be most valuable in applications. However, these expressions are available only for very simple problems.
Note: The differential operator for this boundary value problem, A = ℒ = d^2/dx^2, is deterministic. The boundary condition at x = 1 is the only source of uncertainty. The set of admissible functions for a sample Z(ω) of Z consists of real-valued twice differentiable functions Φ(·, ω) with support [0, 1] and boundary conditions Φ(0, ω) = 0 and dΦ(1, ω)/dx + Z(ω) Φ(1, ω) = 0. The solution of this equation is Φ(x) = a cos(Λ^{1/2} x) + b sin(Λ^{1/2} x), where a and b are arbitrary constants. The first and second boundary conditions imply a = 0 and Λ^{1/2} cos(Λ^{1/2}) + Z sin(Λ^{1/2}) = 0, respectively. •
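For a given sample z of Z, the characteristic equation Λ^{1/2} cos(Λ^{1/2}) + z sin(Λ^{1/2}) = 0 can be solved numerically. A minimal bisection sketch (our own tolerance; z > 0 assumed, so that the smallest root of μ cos μ + z sin μ lies in (π/2, π)):

```python
import math

# Smallest eigenvalue Lambda = mu^2, where mu is the smallest positive root of
# mu cos(mu) + z sin(mu) = 0.  For z > 0 we have g(pi/2) = z > 0 and
# g(pi) = -pi < 0, so bisection on (pi/2, pi) applies.
def smallest_eigenvalue(z, tol=1e-12):
    g = lambda mu: mu * math.cos(mu) + z * math.sin(mu)
    lo, hi = math.pi / 2, math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    return mu * mu

# Limiting cases: z -> 0 gives Lambda -> (pi/2)^2 (free end), large z pushes
# Lambda toward pi^2 (clamped end).
lam_small = smallest_eigenvalue(1e-9)
lam_large = smallest_eigenvalue(1e9)
```

Sampling z from the distribution of Z and mapping each sample through this root finder yields samples of the random eigenvalue Λ_1.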
Example 8.32: Consider the boundary value problem in the previous example. The inner product ⟨(d^2 Φ(x)/dx^2), Φ(x)⟩ provides an upper bound on the lowest eigenvalue Λ_1 of this problem, so that E[Λ_1] ≤ E[⟨(d^2 Φ(x)/dx^2), Φ(x)⟩], where

Φ(x) = C (x^2 − ((Z + 2)/(Z + 1)) x),  C = [1/5 − (Z + 2)/(2(Z + 1)) + (Z + 2)^2/(3(Z + 1)^2)]^{−1/2},

and

‖Φ‖^2 = ∫_0^1 Φ(x)^2 dx = C^2 [1/5 − (Z + 2)/(2(Z + 1)) + (Z + 2)^2/(3(Z + 1)^2)].

The constant C is selected such that ‖Φ‖ = 1. •
If ‖φ_i‖ = 1 and λ_i is not a multiple eigenvalue, the first order approximations of the eigenvalues and eigenfunctions of A are

Λ_i ≃ λ_i + ε ⟨ℛ[φ_i], φ_i⟩, (8.69)
Φ_i(x) ≃ φ_i(x) + ε Σ_{k≠i} (⟨ℛ[φ_i], φ_k⟩/(λ_i − λ_k)) φ_k(x). (8.70)

Proof: We assume that the eigenvalues and eigenfunctions of A can be expanded in the convergent power series

Λ_i = λ_i + ε λ_i^{(1)} + ε^2 λ_i^{(2)} + ...,
Φ_i(x) = φ_i(x) + ε φ_i^{(1)} + ε^2 φ_i^{(2)} + ... (8.71)

so that ℒ[φ_i] = λ_i φ_i and ℒ[φ_i^{(1)}] + ℛ[φ_i] = λ_i φ_i^{(1)} + λ_i^{(1)} φ_i for order 1 and order ε, respectively. These equations are accompanied by conditions given by the terms of order 1 and ε of the boundary conditions prescribed for the eigenvalue problem of A. The inner product of the equation of order ε and φ_j gives

⟨ℒ[φ_i^{(1)}], φ_j⟩ + ⟨ℛ[φ_i], φ_j⟩ = λ_i ⟨φ_i^{(1)}, φ_j⟩ + λ_i^{(1)} δ_{ij},

or β_{ij} (λ_j − λ_i) + ⟨ℛ[φ_i], φ_j⟩ = λ_i^{(1)} δ_{ij} since the collection of eigenfunctions is complete so that φ_i^{(1)} = Σ_k β_{ik} φ_k and ⟨φ_i, φ_j⟩ = δ_{ij}. This equation yields λ_i^{(1)} = ⟨ℛ[φ_i], φ_i⟩ for i = j and β_{ij} = ⟨ℛ[φ_i], φ_j⟩/(λ_i − λ_j) for i ≠ j. The projection β_{ii} of φ_i^{(1)} on φ_i remains undetermined. We take β_{ii} = 0 since the above results do not change if we add to φ_i^{(1)} a function proportional to φ_i. Moreover, the eigenfunction Φ_i ≃ φ_i + ε φ_i^{(1)} approaches φ_i as ε → 0. •
Note: The deterministic and random parts of A are ℒ = Δ and ℛ = Z Δ, respectively. The first order correction of the eigenvalue i relative to λ_i is (Eq. 8.69)

λ_i^{(1)} = ⟨Z Δ[φ_i], φ_i⟩ = Z ⟨λ_i φ_i, φ_i⟩ = Z λ_i.

Similarly, β_{ij} = ⟨Z Δ[φ_i], φ_j⟩/(λ_i − λ_j) = Z λ_i ⟨φ_i, φ_j⟩/(λ_i − λ_j) = 0 for i ≠ j, so that φ_i^{(1)}(x) is zero. The result also follows from the alternative form Δ Φ(x) = (Λ/(1 + ε Z)) Φ(x) of the original boundary value problem, showing that the random and deterministic operators, (1 + ε Z) Δ and Δ, have the same eigenfunctions but different eigenvalues. The eigenvalues of the random operator scaled by (1 + ε Z) coincide with the eigenvalues of Δ. ▲
Note: The expectation, ℒ[E[Φ]] + E[ℛ[Φ]] = E[Λ Φ], of Eq. 8.66 cannot be used to find the mean values of eigenvalues and eigenfunctions of A[Φ] = Λ Φ since E[ℛ[Φ]] and E[Λ Φ] are not known. The expectation of Eq. 8.66 and the local independence hypothesis, giving the approximations E[ℛ[Φ]] ≈ E[ℛ] E[Φ] = 0 and E[Λ Φ] ≈ E[Λ] E[Φ], yield Eq. 8.72. ▲
gives E[Λ^2] E[Φ] ≈ ℒℒ[E[Φ]] + E[ℛℛ] E[Φ] by averaging and using the local independence hypothesis. The inner product of the last formula for Λ = Λ_i and Φ = Φ_i with φ_i yields Eq. 8.73 since E[Φ_i] ≈ φ_i, ⟨φ_i, φ_i⟩ = 1,

E[Λ_i^2] ⟨φ_i, φ_i⟩ ≈ ⟨ℒℒ[φ_i], φ_i⟩ + ⟨E[ℛℛ][φ_i], φ_i⟩,

and ℒ is a self-adjoint operator, so that ⟨ℒℒ[φ_i], φ_i⟩ = ⟨ℒ[φ_i], ℒ[φ_i]⟩ = λ_i^2 ⟨φ_i, φ_i⟩. There is no simple formula for the second moment of the eigenfunctions of a random operator A. ▲
Example 8.34: Let (Λ_i, Φ_i) be the solution of the eigenvalue problem (1 + Z) ΔΦ(x) = Λ Φ(x), x ∈ D, where Z is a random variable with mean zero and finite second moment. The approximate mean solution is E[Λ_i] ≈ λ_i and E[Φ_i] ≈ φ_i, where λ_i and φ_i denote the eigenvalues and eigenfunctions of the deterministic part ℒ = Δ of A = (1 + Z) Δ. The approximate second moment of an eigenvalue of A is E[Λ_i^2] ≈ λ_i^2 (1 + E[Z^2]), so that the approximate variance of Λ_i is λ_i^2 E[Z^2]. ◊
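Because the eigenvalues of A = (1 + Z)Δ are exactly Λ_i = (1 + Z) λ_i (see the note to Example 8.33), the second-moment formula can be checked by direct Monte Carlo simulation. Illustrative choices below (our own): Z uniform on (−0.3, 0.3) and λ_1 = π^2, the magnitude of the smallest Dirichlet eigenvalue of the Laplacian on [0, 1] (only λ_1^2 enters, so sign conventions do not matter):

```python
import numpy as np

# E[Lambda_i^2] = lam_i^2 (1 + E[Z^2]) for Lambda_i = (1 + Z) lam_i, E[Z] = 0.
rng = np.random.default_rng(1)
lam1 = np.pi ** 2
z = rng.uniform(-0.3, 0.3, 200_000)               # illustrative zero-mean Z
mc_second_moment = np.mean(((1.0 + z) * lam1) ** 2)
formula = lam1 ** 2 * (1.0 + np.mean(z ** 2))
```

The two quantities differ only through the sampling error of E[Z], which is small for a symmetric distribution and this sample size.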
complete information on the material features at the microscopic level, for exam-
ple, we know the concentration, the properties, and the geometry of the constituent
phases of a multiphase material as well as their spatial distribution.
Our objective is to calculate bulk, global, or effective properties, that is,
material properties at the macroscale or laboratory scale. These properties are
some type of averages of the material features at the microscale. Effective prop-
erties (1) are measured in most laboratory tests, (2) are used as material constants
in continuum theories, for example, the Lame constants of linear elasticity, and
(3) can be obtained in some cases even if the available information on the microstructure is partial, for example, we may know only the volume fraction and mechanical properties of the constituents of a multi-phase medium but not their geometry and spatial distribution.
Consider a microstructure and some differential equations describing the
state of the microstructure subjected to a specified action. Because material prop-
erties are random at the microscopic scale, these equations have random coeffi-
cients. The solution of these equations provides detailed information on the state
of the microstructure, which may not be needed. We can derive equations for
the material state at the macroscopic scale by averaging the solution of the state
equations for the microstructure, but this approach is computationally inefficient
since it requires finding first the state of the microstructure. Two methods have
been developed for establishing state equations at the macroscopic scale: the homogenization and representative volume methods. The homogenization method considers a family of problems for microstructures with scale δ > 0 and calculates the limit of the solution of these problems as δ → 0. The method also delivers
effective material properties. The representative volume method develops differ-
ential equations for the macroscopic state of a material, that is, for an average of
the microscopic state over a window much larger than the scale of the microstruc-
ture, referred to as representative volume. The macroscopic state equations can be
obtained directly, rather than by averaging the microscopic state equations.
It is common to assume that the material properties at the microscopic scale
can be modeled by ergodic random fields. Spatial averages of these properties
over a representative volume much larger than the microscale are nearly deter-
ministic and practically independent of the particular microstructure sample used
for calculations.
Generally, numerical algorithms are needed for calculating effective prop-
erties. However, some homogenization problems can be solved analytically, for
example, the determination of the effective properties of series and parallel sys-
tems. The components of a series and a parallel system carry the same load, the
applied load, and experience the same deformation, the system deformation, re-
spectively. A series system fails by its weakest component while a parallel system
fails when its last surviving component breaks.
Example 8.35: Consider a series and a parallel system with n linear elastic com-
ponents of stiffness Ki, i = 1, ... , n, where Ki are independent copies of a real-
valued random variable K > 0 with finite mean. These heterogeneous systems
8.5. Effective material properties 607
Example 8.36: Suppose that the components of the series and parallel systems
in Example 8.35 are made of Maxwell or Kelvin materials, that is, they are mod-
eled by a spring and a damper in series or a spring and a damper in parallel,
respectively. The overall force-displacement relations for a parallel system with
Maxwell and Kelvin components are
Γ(t) = f / Σ_{i=1}^n (1/K_i + t/C_i)^{-1}  and  Γ(t) = f / Σ_{i=1}^n K_i/(1 − exp[−(K_i/C_i) t]),
respectively, where K_i and C_i, i = 1, ..., n, denote the stiffness and damping parameters of component i, n is the number of components, t ≥ 0 is the time measured from the application of the force f > 0, and Γ(t) denotes the system elongation at time t.
The overall force-displacement relations for a series system are
f/Γ(t) = [Σ_{i=1}^n (1/K_i + t/C_i)]^{-1}  and  f/Γ(t) = [Σ_{i=1}^n (1 − exp(−(K_i/C_i) t))/K_i]^{-1}

for Maxwell and Kelvin components, respectively. ◊
608 Chapter 8. Stochastic Systems and Deterministic Input
Proof: The above results are consistent with the findings in Example 8.35. If C_i → ∞ and C_i → 0, the Maxwell and Kelvin materials, respectively, degenerate into linear springs and the above equations yield the results in Example 8.35.

The elongations Γ(t) of Maxwell and Kelvin materials with parameters (K_i, C_i) subjected to a force x > 0 are

Γ(t) = x (1/K_i + t/C_i)  and  Γ(t) = (x/K_i)(1 − exp[−(K_i/C_i) t]),

respectively.

Suppose that a parallel system with Maxwell components is loaded by f > 0. The deformation Γ(t) at time t of all system components is Γ(t) = F_i(t) (1/K_i + t/C_i), where the force F_i(t) in component i = 1, ..., n at time t is unknown. Because Σ_{i=1}^n F_i(t) = f at each t by equilibrium, we have

f = Σ_{i=1}^n F_i(t) = Γ(t) Σ_{i=1}^n 1/(1/K_i + t/C_i).

The elongation Γ(t) can be determined from the above equation for a specified load and time and then used to find the internal forces F_i(t). The deformation Γ(t) and internal forces F_i(t) are parametric stochastic processes (Section 3.6.5) since they are deterministic functions of time depending on a finite number of random parameters, the component stiffnesses and damping. A similar approach can be applied to find the global force-displacement relations for other systems. •
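The parallel-Maxwell relation is easy to evaluate directly. The sketch below (illustrative stiffness and damping samples, our own choices) computes Γ(t) = f / Σ_i (1/K_i + t/C_i)^{-1} and checks the elastic limit t → 0, where each Maxwell element acts as a spring and the parallel-spring result of Example 8.35 is recovered:

```python
import numpy as np

# Elongation of a parallel system with n Maxwell components:
# Gamma(t) = f / sum_i [1/K_i + t/C_i]^(-1).
rng = np.random.default_rng(2)
n, f = 20, 1.0
K = rng.uniform(0.5, 2.0, n)      # illustrative stiffness samples
C = rng.uniform(1.0, 5.0, n)      # illustrative damping samples

def elongation(t):
    return f / np.sum(1.0 / (1.0 / K + t / C))

# At t = 0 each Maxwell element is a spring, so the system stiffness is
# sum(K) and the elongation is f / sum(K); for t > 0 the dampers relax and
# the elongation grows monotonically.
gamma0 = elongation(0.0)
spring = f / np.sum(K)
```

Evaluating `elongation` over a grid of t values and over samples of (K_i, C_i) gives samples of the parametric stochastic process Γ.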
Example 8.37: Consider a rod of length l > 0 and random stiffness K(ξ) > 0, ξ ∈ [0, l]. The effective stiffness of the rod is K_eff = 1/∫_0^l dξ/K(ξ). Figure 8.13 shows samples of K_eff as functions of the rod length. The samples nearly coincide as the rod length increases. The plots are for the stiffness random field K(ξ) = a + Φ(X(ξ))(b − a), ξ ∈ [0, l], a = 1, b = 2, and a process X defined by dX(ξ) = −ρ X(ξ) dξ + √(2ρ) dB(ξ), where ρ = 10, B is a Brownian motion, and X(0) ~ N(0, 1). ◊
Proof: The determination of the effective stiffness is based on the approach in Example 8.35 for series systems. Suppose that the rod is subjected to a fixed elongation d > 0. A random force F > 0 needs to be applied at the ends of the rod to generate this elongation. The strain caused by F at ξ ∈ [0, l] is Γ(ξ) = F/K(ξ), so that d = ∫_0^l Γ(ξ) dξ or F ∫_0^l dξ/K(ξ) = d, which gives the rod stiffness. The equivalent homogeneous rod has stiffness K̄(l) = l K_eff.

We note that the stiffness K is a translation random field (Section 3.6.6) with the properties K(ξ) ~ U(a, b) for all ξ ∈ [0, l] and scaled covariance function approximately equal to exp(−ρ |ξ_1 − ξ_2|), where ξ_1, ξ_2 are points in [0, l]. •
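The construction in this example can be simulated directly. The sketch below (our own grid and seed, with the exact one-step update for the stationary OU process) generates one sample of K(ξ) = a + Φ(X(ξ))(b − a) and evaluates K_eff = 1/∫_0^l dξ/K(ξ) by quadrature:

```python
import math
import numpy as np

# One sample of the effective rod stiffness for the translation field of
# Example 8.37: a = 1, b = 2, rho = 10; grid and seed are our own choices.
rng = np.random.default_rng(3)
a, b, rho, l, h = 1.0, 2.0, 10.0, 4.0, 1e-3
n = int(l / h)
X = np.empty(n + 1)
X[0] = rng.standard_normal()              # X(0) ~ N(0, 1), stationary start
decay = math.exp(-rho * h)                # exact OU one-step update
scale = math.sqrt(1.0 - decay ** 2)       # keeps unit stationary variance
for k in range(n):
    X[k + 1] = decay * X[k] + scale * rng.standard_normal()

Phi = 0.5 * (1.0 + np.array([math.erf(v / math.sqrt(2.0)) for v in X]))
K = a + Phi * (b - a)                     # K(xi) ~ U(a, b) marginally
compliance = np.sum(h / K[:-1])           # INT_0^l dxi / K(xi)
K_eff = 1.0 / compliance
```

Since K(ξ) ∈ (a, b), K_eff is guaranteed to lie between a/l and b/l; repeating over samples shows l K_eff settling near the harmonic mean 1/E[1/K] = 1/ln 2 ≈ 1.44 as l grows, consistent with the near-coincidence of samples in Figure 8.13.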
Figure 8.13. Samples of K_eff as functions of the rod length
cases. For example, suppose that the series system in Example 8.35 has two types of components with stiffnesses k_a and k_b and volume fractions ξ_a and ξ_b, respectively, that is, there are n_a = n ξ_a and n_b = n ξ_b components with stiffness k_a and k_b. The effective stiffness of the components of a homogeneous series system with n components is k_eff = 1/(ξ_a/k_a + ξ_b/k_b). However, the above partial information on the microstructure is insufficient in other cases. For example, consider two
microstructures of a material consisting of disconnected inclusions imbedded in a
matrix with a 50% inclusion volume fraction. In the first microstructure the inclu-
sions have much higher conductivity than the matrix. In the second microstructure
the matrix has much higher conductivity than the inclusions. The microstructures
are indistinguishable if the information is limited to phase properties and concen-
tration. However, the effective conductivity of the second microstructure exceeds
by far the corresponding property of the first microstructure. In this case we need
detailed information on the microstructure to calculate the effective conductivity.
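The two-phase series formula above is easy to check against a direct computation with a finite chain of components (illustrative numbers, our own choices):

```python
# Per-component effective stiffness of a two-phase series system:
# keff = 1 / (xi_a/k_a + xi_b/k_b).
ka, kb, xa, xb = 1.0, 3.0, 0.5, 0.5
keff = 1.0 / (xa / ka + xb / kb)          # = 1.5 here

# Direct check: a series chain of n = 10 components (5 of each stiffness)
# has system stiffness 1 / sum(1/k_i); multiplying by n recovers keff.
n = 10
ks = [ka] * 5 + [kb] * 5
k_sys = 1.0 / sum(1.0 / k for k in ks)
```

The agreement n · k_sys = k_eff reflects that only the volume fractions, not the arrangement, matter for a series system.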
The following two sections illustrate methods for calculating effective prop-
erties for random heterogeneous materials. In Section 8.5.1 we estimate the ef-
fective conductivity of multi-phase random materials. The analysis is based on an
extension of the random walk method (Section 6.2) and smooth approximations
of the conductivity field. In Section 8.5.2 we illustrate the determination of the
effective material properties for linear elasticity. The analysis is based on proper-
ties of random fields, concepts of linear elasticity, and perturbation and Neumann
series methods.
8.5.1 Conductivity
There are numerous studies providing approximations of and bounds on
macroscopic properties of heterogeneous media, such as electric or thermal con-
ductivity, magnetic permeability, and diffusion coefficients. Most of the available
results are for special microstructures, such as two-phase composites with inclu-
sions of simple geometries [184, 192]. A notable exception is the Monte Carlo
simulation method in [113, 186] for calculating the effective conductivity of two-
phase materials. The method uses samples of Brownian motions having differ-
ent speeds in different phases and non-zero probabilities of reflection at phase
interface. Numerical results by this Monte Carlo simulation method have been
reported for two-phase media. The method appears to be impractical for calculating the effective conductivity of general heterogeneous media because it requires the solution of complex first passage problems for ℝ^2- and ℝ^3-valued Brownian motion processes.
In this section we discuss an alternative method for estimating the effective
conductivity of random heterogeneous media. The method is general, and can
be applied to estimate the effective conductivity of heterogeneous media with an
arbitrary number of phases, which can have any geometry. As previously stated,
the analysis is based on an extension of the random walk method discussed in
Section 6.2.
defined on an open bounded subset D of ℝ^d with u(x) = 0 for x ∈ ∂D, where ∂D denotes the boundary of D, σ > 0 is a constant, and Δ = Σ_{i=1}^d ∂^2/∂x_i^2. The solution u can be interpreted as the temperature in a homogeneous medium with constant conductivity σ > 0, which is subjected to a unit flux everywhere in D and zero boundary conditions. Let

T = inf{t ≥ 0 : B(t) ∉ D, B(0) = x ∈ D} (8.75)

be the first time an ℝ^d-valued Brownian motion B starting at x ∈ D exits D. Then

u(x) = (1/(2σ)) E_x[T], x ∈ D. (8.76)
Proof: The Itô formula applied to the mapping B ↦ g(B), g ∈ C^2(ℝ^d), in the time interval [0, T] gives, by expectation,

E_x[g(B(T))] = g(x) + E_x[∫_0^T A[g(B(s))] ds],

where A = (1/2) Σ_{i=1}^d ∂^2/∂x_i^2 = (1/2) Δ is the generator of B (Section 6.2.1.3). The notation A, used in this section for the generator of B and in the following section for the generator of a diffusion process X, has no relation to the similar notation used previously in this chapter for some differential operators (Section 8.4).

The local solution in Eq. 8.76 results by writing the above equation with u in place of g and noting that B(T) ∈ ∂D, so that u(B(T)) = 0, and that B(s) ∈ D for s < T, so that A[u(B(s))] = −1/(2σ). •
Example 8.38: If D in Eq. 8.74 is a sphere S_d(r) = {ξ ∈ ℝ^d : ‖ξ‖ < r} of radius r > 0 centered at the origin of ℝ^d, then

u(x) = E_x[T(r)]/(2σ) = (r^2 − ‖x‖^2)/(2 d σ), x ∈ S_d(r),

where T(r) is given by Eq. 8.75 with D = S_d(r). Hence, the temperature at the center of the sphere is u(0) = r^2/(2 d σ). ◊
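The mean exit time implied by this example, E_x[T(r)] = (r^2 − ‖x‖^2)/d for a standard Brownian motion (σ cancels, since B itself does not depend on the conductivity), can be checked by simulation. A sketch in d = 2 with our own step size and path count:

```python
import numpy as np

# Monte Carlo estimate of E_0[T(r)] for standard Brownian motion exiting the
# disk of radius r in d = 2; theory gives r^2 / d = 0.5 for r = 1.
rng = np.random.default_rng(4)
d, r, dt, n_paths = 2, 1.0, 1e-3, 2000
pos = np.zeros((n_paths, d))
T = np.zeros(n_paths)
alive = np.ones(n_paths, dtype=bool)
while alive.any():
    k = int(alive.sum())
    pos[alive] += np.sqrt(dt) * rng.standard_normal((k, d))
    T[alive] += dt
    alive[alive] = np.linalg.norm(pos[alive], axis=1) < r
mean_T = T.mean()
```

The Euler time-stepping slightly overestimates T because boundary crossings between steps go undetected, a bias of order √dt that is negligible at this step size.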
Proof: Let k > 0 be an integer and let T_k = min(k, T(r)). The result follows from the expectation of the Itô formula applied to the mapping B ↦ ‖B‖^2 in the time interval [0, T_k]. •

Consider now the heterogeneous version of Eq. 8.74,

∇ · (σ(x) ∇u(x)) = −γ(x), x ∈ D, (8.77)

with u(x) = 0 for x ∈ ∂D, where ∇ = (∂/∂x_1, ..., ∂/∂x_d) and γ(x) denotes the flux at x ∈ D.
Let X be a diffusion process defined by

dX(t) = a(X(t)) dt + b(X(t)) dB(t), (8.78)

where the entries of the (d, 1) and (d, d) matrices a and b are real-valued functions defined on ℝ^d and B is an ℝ^d-valued Brownian motion. Define the drift and diffusion coefficients of X by

a_p(x) = ∂σ(x)/∂x_p, p = 1, ..., d, and b(x) = √(2 σ(x)) i, (8.79)

where i is the (d, d) identity matrix. It is assumed that the drift and diffusion coefficients defined by Eq. 8.79 satisfy the conditions in Section 4.7.1.1 so that the solution of Eq. 8.78 exists and is unique. Let

T* = inf{t ≥ 0 : X(t) ∉ D, X(0) = x ∈ D}

denote the first time when X starting at X(0) = x ∈ D exits D. It is assumed that T* has a finite expectation (Section 6.2.1.1).
Proof: We first note that, if the flux γ is unity, then the local solution of Eq. 8.77 becomes u(x) = E_x[T*].

Let g be a function with continuous second order partial derivatives and assume X(0) = x. The average of the Itô formula applied to the mapping X ↦ g(X), g ∈ C^2(ℝ^d), in a time interval [0, t] gives

E_x[g(X(t))] = g(x) + E_x[∫_0^t A[g(X(s))] ds],

where A denotes the generator of X.

The conductivity field in a multi-phase medium has jumps at the interface between its constituents so that the drift and diffusion coefficients given by Eq. 8.79 do not satisfy the uniform Lipschitz conditions. However, we can develop smooth approximations of the conductivity field σ of a multi-phase material, and use these approximations to calculate the effective conductivity, as illustrated later in Example 8.39. •
σ_eff(r) = r^2 / (2 d E^0[T*(r)]). (8.82)

Proof: We have shown in the previous sections that the values of the function u at the center of a sphere S_d(r) containing a homogeneous material with conductivity σ_eff and a heterogeneous material with conductivity field σ(x), x ∈ S_d(r), are r^2/(2 d σ_eff) and E^0[T*(r)], respectively (Eqs. 8.76 and 8.81 and Example 8.38). The effective conductivity of a heterogeneous medium results from the condition E^0[T*(r)] = r^2/(2 d σ_eff(r)).

If the conductivity field is periodic, the dependence of σ_eff on r becomes weak for values of r exceeding the scale of material periodicity because additional segments of material beyond this value do not bring any new information. •
To find Σ_eff, we require that the temperature at the center of the sphere S_d(r) containing the homogeneous material matches U(0, ω) = E^0[T*(r, ω)] for almost all ω's, that is, r^2/(2 d Σ_eff(r, ω)) = U(0, ω), which yields (Eq. 8.82)

Σ_eff(r, ω) = r^2 / (2 d E^0[T*(r, ω)]), ω ∈ Ω. (8.83)

The resulting effective conductivity is a random variable. The following Monte Carlo algorithm can be used to calculate the effective conductivity Σ_eff:

• Select a sphere S_d(r), that is, a radius r > 0, and generate n_ω independent samples Σ(x, ω), x ∈ S_d(r), of the conductivity random field Σ.

• For each material sample, estimate E^0[T*(r, ω)] from n_s independent samples of the associated diffusion process and calculate Σ_eff(r, ω) from Eq. 8.83; an estimate of the expectation of Σ_eff(r) is the average of these values,

Ê[Σ_eff(r)] = (1/n_ω) Σ_{j=1}^{n_ω} Σ_eff(r, ω_j). (8.84)
Note: The conductivity random field can be defined by a translation random field, that is, Σ(x) = F^{-1} ∘ Φ(G(x)), where F is a specified absolutely continuous distribution, Φ denotes the distribution of N(0, 1), and G is a Gaussian random field with differentiable samples (Sections 3.6.6 and 5.3.3.1 in this book; [79], Section 3.1).

Let Σ(·, ω) be a sample of Σ and let X^{(ω)} denote the diffusion process X in Eqs. 8.78 and 8.79, in which σ(·) is replaced by Σ(·, ω). Let T*(r, ω) be the first time X^{(ω)} starting at x = 0 exits S_d(r). The expectation of T*(r, ω) can be estimated from n_s independent samples of X^{(ω)} generated by the methods in Section 5.3.3.2. The effective conductivity corresponding to a material sample ω in S_d(r) is given by Eq. 8.83. An estimate of the expected value of Σ_eff(r) associated with S_d(r) can be obtained from Eq. 8.84.

If the radius r of S_d(r) is much larger than the correlation distance of the conductivity field and Σ is an ergodic random field, the dependence of Σ_eff(r, ω) on ω is likely to be insignificant for almost all material samples so that Σ_eff(r) can be obtained from a single material sample and is nearly deterministic. ▲
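A one-dimensional sketch of this random-walk algorithm (our own step size, seed, and path count, with a constant conductivity field σ(x) = 1.5 for which the estimate must return 1.5) simulates Eq. 8.78 by an Euler scheme and applies Eq. 8.82 with d = 1:

```python
import numpy as np

# Random-walk estimate sigma_eff(r) = r^2 / (2 d E0[T*(r)]) in d = 1, using
# an Euler scheme for dX = sigma'(X) dt + sqrt(2 sigma(X)) dB (Eqs. 8.78-8.79).
rng = np.random.default_rng(5)
sigma = lambda x: np.full_like(x, 1.5)     # constant test field (our choice)
dsigma = lambda x: np.zeros_like(x)        # drift a(x) = sigma'(x)
r, dt, n_paths = 1.0, 1e-3, 2000
x = np.zeros(n_paths)                      # all paths start at the center
T = np.zeros(n_paths)
alive = np.ones(n_paths, dtype=bool)
while alive.any():
    xa = x[alive]
    dB = np.sqrt(dt) * rng.standard_normal(xa.size)
    x[alive] = xa + dsigma(xa) * dt + np.sqrt(2.0 * sigma(xa)) * dB
    T[alive] += dt
    alive[alive] = np.abs(x[alive]) < r
sigma_eff = r ** 2 / (2.0 * T.mean())      # Eq. 8.82 with d = 1
```

Replacing `sigma` and `dsigma` by a smooth sample Σ(·, ω) of the conductivity field and averaging over material samples gives the estimate Ê[Σ_eff(r)] of Eq. 8.84.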
values of n_ω and n_s so that the resulting estimates of the effective conductivity can be unstable, that is, their values may differ significantly from sample to sample. For example, Σ̂_eff(r) is 1.7270 and 1.5812 for (n_ω = 10, n_s = 10) and (n_ω = 50, n_s = 10), respectively. The corresponding estimates of the coefficients of variation of Σ̂_eff(r) are 8.33% and 4.12%. Stable values of Σ̂_eff(r) result for n_ω ≥ 50 and n_s ≥ 500. For n_ω = 50 and n_s = 500 the estimated effective conductivity is Σ̂_eff(r) = 1.4997, and has a coefficient of variation of 1.6%. ◊
Note: The above numerical results are for a smooth approximation Σ̃ ∈ C^2(S_1(r)) of the conductivity random field Σ of the two-phase material. The approximation Σ̃ is such that Eq. 8.78 with the drift and diffusion coefficients in Eq. 8.79 has a unique solution. ▲
Σ_eff = (r^2/2) [ −∫_{−r}^0 (y/Σ(y)) dy + (∫_{−r}^r (y/Σ(y)) dy) (∫_{−r}^0 dy/Σ(y)) / (∫_{−r}^r dy/Σ(y)) ]^{−1}
Figure 8.15 shows estimates of Σ_eff based on 1,000 material samples. Small and large values of ρ correspond physically to rapid and slow spatial fluctuations of conductivity, respectively. These values of ρ can also be viewed as corresponding to large and small
U(x) = −∫_{−r}^x (y/Σ(y)) dy + [(∫_{−r}^r (y/Σ(y)) dy)/(∫_{−r}^r dy/Σ(y))] ∫_{−r}^x dy/Σ(y).

The conditions in Eqs. 8.82 and 8.83 give the stated expression for Σ_eff. This expression and samples of Σ were used to construct the plots in Fig. 8.15. •
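For any given conductivity sample, the one-dimensional expression for Σ_eff can be evaluated by quadrature. The sketch below (trapezoid rule; the sample fields are our own choices) checks the constant-field case, where the formula must return the constant itself since the flux term vanishes by symmetry:

```python
import numpy as np

# Quadrature evaluation of
# Sigma_eff = (r^2/2) [ -I1 + I2 * I3 / I4 ]^(-1), where
# I1 = INT_{-r}^0 y dy/Sigma, I2 = INT_{-r}^r y dy/Sigma,
# I3 = INT_{-r}^0 dy/Sigma,  I4 = INT_{-r}^r dy/Sigma.
def sigma_eff(Sig, r=1.0, n=20001):
    yf = np.linspace(-r, r, n)                 # full interval [-r, r]
    yh = np.linspace(-r, 0.0, (n + 1) // 2)    # half interval [-r, 0]
    def trap(vals, y):                         # composite trapezoid rule
        h = y[1] - y[0]
        return h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    I1 = trap(yh / Sig(yh), yh)
    I2 = trap(yf / Sig(yf), yf)
    I3 = trap(1.0 / Sig(yh), yh)
    I4 = trap(1.0 / Sig(yf), yf)
    return (r ** 2 / 2.0) / (-I1 + I2 * I3 / I4)

const = sigma_eff(lambda y: 2.0 + 0.0 * y)            # constant field -> 2.0
hetero = sigma_eff(lambda y: 2.0 + np.sin(np.pi * y)) # a sample-like field
```

Applying `sigma_eff` to samples of the random field Σ reproduces the kind of sample statistics plotted in Fig. 8.15.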
Y_k = 1, and h(t) = φ(t_1/0.2) φ(t_2/0.2), where φ denotes the density of N(0, 1).
Figure 8.16 shows the dependence of estimates of the expectation of Σ_eff on λ. The estimates of E[Σ_eff] for each value of λ have been obtained from ten material samples and thirty samples of X for each material sample (Eq. 8.84). The estimates of E[Σ_eff] nearly coincide with and are much larger than the matrix conductivity σ for small and large values of λ, respectively. ◊
Note: The average number of inclusions in D′ is equal to λ v_{D′}, where v_{D′} denotes the volume of D′. The effective conductivity increases with λ since the inclusions have higher conductivity than the matrix. ▲
8.5.2 Elasticity
Our objective is the determination of effective properties for a class of ran-
dom heterogeneous materials characterized by properties with a scale of spatial
fluctuations much smaller than the overall dimension of material specimens. This
type of heterogeneity is common to polycrystalline metals and multi-phase mate-
rials, for example, aluminum and glass fiber reinforced plastics.
Consider a random heterogeneous material in an open bounded subset D
of ℝ^d that is in equilibrium under some external actions. Let S_ij(x), Γ_ij(x), A_ijkl(x), and C_ijkl(x) denote, respectively, the stress, strain, stiffness, and compliance tensors at x ∈ D. It is assumed that:

1. The material is linearly elastic with the stress-strain relationship

S_ij(x) = A_ijkl(x) Γ_kl(x), (8.85)

where S = (S_11, S_22, S_33, S_12, S_23, S_31) and Γ = (Γ_11, Γ_22, Γ_33, Γ_12, Γ_23, Γ_31) denote column vectors, and A and C are the corresponding matrices of stiffness and compliance coefficients for three-dimensional elasticity problems (d = 3). The fourth order tensor A_ijkl has the properties: (a) A_ijkl = A_ijlk and A_ijkl = A_jikl by the symmetry of the stress and strain tensors and (b) A_ijkl = A_klij by the existence of the strain energy ([125], p. 141). The compliance coefficients C_ijkl have similar properties. Hence, the tensors A_ijkl and C_ijkl in ℝ^3 have 21 distinct entries. ▲
The constitutive law (Eq. 8.85), the kinematic relationships (Eq. 8.87), the equilibrium conditions

S_{ij,j}(x) + F_i(x) = 0, x ∈ D, (8.88)

where F_i is the coordinate i of the body force vector, and the boundary conditions, for example, tractions, displacements, or a mixture of these actions on the boundary ∂D of D, need to be used to find the displacement, strain, and stress fields in D. The equilibrium conditions can also be expressed in terms of displacements.
Denote by

S̄_ij(x) = (1/v(x)) ∫_{D(x)} S_ij(ξ) dξ  and  Γ̄_ij(x) = (1/v(x)) ∫_{D(x)} Γ_ij(ξ) dξ (8.89)
the stress and strain moving average fields, where v(x) = ∫_{D(x)} dξ and D(x) is a subset of D centered on x such that D(x) is small relative to D but large with respect to the scale of fluctuation of S_ij and Γ_ij. Also, D(x) has the same
shape and size for every x. A set D(x) with this property is referred to as a
representative volume. Our objective is to find properties of the moving average
fields S̄_ij and Γ̄_ij. Two options are available. The first option is to derive properties of S̄_ij and Γ̄_ij from Eq. 8.89. The disadvantage of this approach is that it requires knowledge of the probability laws of the random fields (S_ij, Γ_ij, U_i). The second option is to find properties of S̄_ij and Γ̄_ij directly. This approach requires the development of new constitutive laws for the average stress and strain fields, and such laws are difficult to derive in a general setting. For example, Eqs. 8.85 and 8.89 yield
S̄_ij(x) = (1/v(x)) ∫_{D(x)} A_ijkl(ξ) Γ_kl(ξ) dξ = [(1/v(x)) ∫_{D(x)} A_ijkl(ξ) (Γ_kl(ξ)/Γ̄_kl(x)) dξ] Γ̄_kl(x) = Ā_ijkl(x) Γ̄_kl(x) (8.90)

and

Γ̄_ij(x) = (1/v(x)) ∫_{D(x)} C_ijkl(ξ) S_kl(ξ) dξ = [(1/v(x)) ∫_{D(x)} C_ijkl(ξ) (S_kl(ξ)/S̄_kl(x)) dξ] S̄_kl(x) = C̄_ijkl(x) S̄_kl(x), (8.91)
respectively, provided that Γ̄_kl(x) and S̄_kl(x) are not zero. The matrix form of the above equations is S̄(x) = Ā(x) Γ̄(x) and Γ̄(x) = C̄(x) S̄(x), respectively, with notation as in Eq. 8.86. If the average is performed over D, the argument x is dropped.

The weighted averages of the stiffness and compliance coefficients defined by Eqs. 8.90 and 8.91 give the constitutive law for the average stresses and strains, where Ā_ijkl and C̄_ijkl are called the overall or effective stiffness and compliance coefficients, respectively. These definitions of Ā_ijkl and C̄_ijkl are impractical since they involve the fields S_ij and Γ_ij. The following sections present approximations of and bounds on the effective stiffness and compliance coefficients, and are based on developments in [137, 194].
U_i(x) = \gamma_{ik}\, x_k, \qquad x \in \partial D, \qquad (8.92)
where \gamma_{ij} = \gamma_{ji} are specified constants defining the magnitude of the applied boundary deformation. Let \bar S_{ij}, \bar\Gamma_{ij}, and \bar U denote the averages over D of S_{ij}, \Gamma_{ij}, and U, respectively, where U is the strain energy of the body. It is assumed that the body forces F_i in Eq. 8.88 are zero.
620 Chapter 8. Stochastic Systems and Deterministic Input
The average strain and the average strain energy are ([194], Section II.B)

\bar\Gamma_{ij} = \frac{1}{v} \int_D \Gamma_{ij}(\xi)\, d\xi = \gamma_{ij}, \qquad \bar U = \frac{1}{2v} \int_D S_{ij}(\xi)\, \Gamma_{ij}(\xi)\, d\xi = \frac{1}{2}\, \bar S_{ij}\, \gamma_{ij}, \qquad (8.93)
where S_{ij,j}(x) = \partial S_{ij}(x) / \partial x_j, n_i(x) is the coordinate i of the exterior normal n(x) at x \in \partial D, and d\sigma(x) denotes a surface element on \partial D ([73], Section 9.4).
Let u^{(ij)}(\xi) be a vector in \mathbb{R}^d with entries U_i(\xi)\, \delta_{qj}, where i, j are some fixed indices and q = 1, \ldots, d, that is, u^{(ij)}(\xi) has zero coordinates except for the coordinate j, which is U_i(\xi). The boundary conditions in Eq. 8.92 imply that the entries of u^{(ij)}(\xi) are (\gamma_{ik}\, \xi_k)\, \delta_{qj} for \xi \in \partial D. These observations and the Gauss divergence theorem give

\int_D U_{i,j}(\xi)\, d\xi = \int_{\partial D} \gamma_{ik}\, \xi_k\, \delta_{qj}\, n_q(\xi)\, d\sigma(\xi) = \int_D \frac{\partial (\gamma_{ik}\, \xi_k)}{\partial \xi_j}\, d\xi = \int_D \gamma_{ij}\, d\xi = v\, \gamma_{ij}
The indices i and j are arbitrary but fixed. The average strain energy accumulated in the elastic body is

\bar U = \frac{1}{2v} \int_D S_{ij}(x)\, U_{i,j}(x)\, dx = \frac{1}{2v} \int_D \left( S_{ij}(\xi)\, U_i(\xi) \right)_{,j}\, d\xi = \frac{1}{2v} \int_{\partial D} S_{ij}(\xi)\, U_i(\xi)\, n_j(\xi)\, d\sigma(\xi) = \frac{1}{2v} \int_{\partial D} S_{ij}(\xi)\, \gamma_{ik}\, \xi_k\, n_j(\xi)\, d\sigma(\xi)
= \frac{1}{2v} \int_D \frac{\partial \left( S_{ij}(\xi)\, \gamma_{ik}\, \xi_k \right)}{\partial \xi_j}\, d\xi = \frac{\gamma_{ij}}{2v} \int_D S_{ij}(\xi)\, d\xi = \frac{1}{2}\, \bar S_{ij}\, \gamma_{ij},

where the above equalities follow from the Gauss divergence theorem (using the equilibrium condition S_{ij,j}(\xi) = 0), the boundary conditions, and the definition of the average stress, provided that there are no body forces. •
The Voigt average A^{(V)}_{ijkl} = \frac{1}{v} \int_D A_{ijkl}(x)\, dx and the overall stiffness \bar A in Eq. 8.90 with D(x) = D satisfy the inequality

\gamma^T\, \bar A\, \gamma \le \gamma^T\, A^{(V)}\, \gamma. \qquad (8.95)
Proof: The Voigt average corresponds to the assumption that the strain field in a heterogeneous medium is uniform, and is analogous to the effective stiffness of a parallel system (Example 8.35). We say that A^{(V)} is an upper bound on \bar A since the matrix A^{(V)} - \bar A is positive definite (Eq. 8.95).
Let \bar A be the overall material stiffness given by Eq. 8.90 in which D(x) coincides with D. The corresponding average stress and strain fields over D are related by \bar S_{ij} = \bar A_{ijkl}\, \bar\Gamma_{kl}, so that the average strain energy is \bar U = \frac{1}{2}\, \gamma^T\, \bar A\, \gamma (Eq. 8.93), where \gamma is a column vector including all distinct components of the tensor \gamma_{ij}, that is, the average strains (\gamma_{11}, \gamma_{22}, \gamma_{33}, \gamma_{12}, \gamma_{23}, \gamma_{31}). As previously stated, these relationships are of little use for applications because \bar A_{ijkl} depends on detailed information on the stress and strain fields (Eqs. 8.90-8.91).
Let \Gamma_{ij}(x) be the strain field corresponding to the equilibrium configuration of a linear elastic random heterogeneous material in D with the boundary conditions in Eq. 8.92. The strain energy corresponding to this field is \bar U = \frac{1}{2}\, \gamma^T\, \bar A\, \gamma. Consider also the uniform strain field \gamma_{ij} corresponding to a linear elastic homogeneous material in D under the same boundary conditions (Eq. 8.92). We have

\bar U = \frac{1}{2}\, \gamma^T\, \bar A\, \gamma = \frac{1}{2v} \int_D \Gamma_{ij}(x)\, A_{ijkl}(x)\, \Gamma_{kl}(x)\, dx \le \frac{1}{2v} \int_D \gamma_{ij}\, A_{ijkl}(x)\, \gamma_{kl}\, dx = \frac{1}{2}\, \gamma^T\, A^{(V)}\, \gamma,

since the actual strain field minimizes the potential energy among strain fields compatible with Eq. 8.92. •
Example 8.42: Consider a rod of length v > 0 and unit cross section characterized by the random stiffness A(x), 0 \le x \le v. The displacements of the left and right ends of the rod are zero and \gamma\, v > 0, respectively. The overall constitutive law is \bar S = \bar A\, \bar\Gamma and

\bar A = \frac{1}{v} \int_0^v A(x)\, \frac{\Gamma(x)}{\bar\Gamma}\, dx \le A^{(V)} = \frac{1}{v} \int_0^v A(x)\, dx,

where \bar A is defined by Eq. 8.90. ◊
Proof: The average strain energy accumulated in the rod is \bar U = (1/2)\, \bar A\, \gamma^2. According to the principle of minimum potential energy, we have

\bar U = \frac{1}{2}\, \bar A\, \gamma^2 \le \frac{1}{2v} \int_0^v A(x)\, \gamma^2\, dx = \frac{1}{2}\, \gamma^2 \left( \frac{1}{v} \int_0^v A(x)\, dx \right) = \frac{1}{2}\, A^{(V)}\, \gamma^2,

so that \bar A \le A^{(V)}. •
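The inequality of Example 8.42 can be checked numerically. The sketch below is an illustration, not part of the text: it discretizes the rod into segments with independent lognormal stiffnesses (an assumed distribution). With prescribed end displacements and no body forces the stress is constant along the rod, so the segments act in series and the exact overall stiffness is the harmonic mean of A(x), which the Voigt average dominates.

```python
import math
import random

random.seed(0)

# Discretize the rod into n segments with independent random stiffnesses
# (lognormal samples are an illustrative choice, not from the text).
n = 10_000
A = [math.exp(random.gauss(0.0, 0.5)) for _ in range(n)]

# Voigt average A^(V): arithmetic mean of A(x) over the rod.
A_voigt = sum(A) / n

# With prescribed end displacements and zero body forces the stress is
# constant along the rod, so the exact overall stiffness is the harmonic
# mean of A(x) (a series system).
A_bar = n / sum(1.0 / a for a in A)

assert A_bar <= A_voigt  # Example 8.42: A_bar <= A^(V)
```

The gap between the two values grows with the variability of A, and vanishes for a homogeneous rod.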
where s_{ij} = s_{ji} are specified constants, n_j(x) are the coordinates of the exterior normal n(x) to \partial D at x \in \partial D, and v denotes the volume of D. Let \bar S_{ij}, \bar\Gamma_{ij}, and \bar U be the averages of S_{ij}, \Gamma_{ij}, and U over D, respectively. It is assumed that the body forces F_i in Eq. 8.88 are zero.
The average stress and the average strain energy are ([194], Section II.B)

\bar S_{ij} = \frac{1}{v} \int_D S_{ij}(\xi)\, d\xi = s_{ij}, \qquad \bar U = \frac{1}{2v} \int_D S_{ij}(\xi)\, \Gamma_{ij}(\xi)\, d\xi = \frac{1}{2}\, s_{ij}\, \bar\Gamma_{ij}. \qquad (8.97)
Proof: We have

\int_{\partial D} S_{ij}(\xi)\, \xi_k\, n_j(\xi)\, d\sigma(\xi) = \int_D \left( S_{ij}(\xi)\, \xi_k \right)_{,j}\, d\xi = \int_D \left( S_{ij,j}(\xi)\, \xi_k + S_{ij}(\xi)\, \delta_{jk} \right) d\xi = \int_D S_{ik}(\xi)\, d\xi

by the Gauss divergence theorem and the equilibrium condition S_{ij,j}(x) = 0. Because the stress field is equal to the constant field s_{ij} on the boundary of D, the left-hand side is s_{ij} \int_{\partial D} \xi_k\, n_j(\xi)\, d\sigma(\xi) = v\, s_{ik}, so that \bar S_{ik} = s_{ik}. The expression for \bar U follows with the notation used to derive \bar U in Eq. 8.93 and the boundary conditions. •
The Reuss average C^{(R)}_{ijkl} = \frac{1}{v} \int_D C_{ijkl}(x)\, dx and the overall compliance \bar C in Eq. 8.91 with D(x) = D satisfy the inequality

s^T\, \bar C\, s \le s^T\, C^{(R)}\, s. \qquad (8.99)
Proof: The Reuss average corresponds to the assumption that the stress field is uniform throughout the heterogeneous material, and is analogous to the effective stiffness of a series system (Example 8.35). We say that C^{(R)} is an upper bound on \bar C since C^{(R)} - \bar C is positive definite (Eq. 8.99).
Let \bar C be the overall material compliance given by Eq. 8.91 in which D(x) coincides with D. The corresponding average strains and stresses are related by \bar\Gamma_{ij} = \bar C_{ijkl}\, \bar S_{kl}, so that the average strain energy is \bar U = \frac{1}{2}\, s^T\, \bar C\, s (Eq. 8.97), where s is a column vector whose entries are the distinct elements of s_{ij}. As previously mentioned, the calculation of the average compliance coefficients \bar C_{ijkl} is impractical.
Let S_{ij}(x) be the stress field corresponding to the equilibrium configuration of a linear elastic random heterogeneous material subjected to the boundary conditions in Eq. 8.96. The average strain energy of this solid is \bar U = \frac{1}{2}\, s^T\, \bar C\, s. Consider also the constant stress field s_{ij} corresponding to a linear elastic homogeneous material in D satisfying Eq. 8.96. We have

\bar U = \frac{1}{2}\, s^T\, \bar C\, s = \frac{1}{2v} \int_D S_{ij}(x)\, C_{ijkl}(x)\, S_{kl}(x)\, dx \le \frac{1}{2v} \int_D s_{ij}\, C_{ijkl}(x)\, s_{kl}\, dx = \frac{1}{2}\, s_{ij}\, s_{kl}\, \frac{1}{v} \int_D C_{ijkl}(x)\, dx = \frac{1}{2}\, s^T\, C^{(R)}\, s,

since the actual stress field minimizes the complementary strain energy and s_{ij} satisfies the traction boundary conditions ([68], Section 10.9). •
The Voigt and Reuss averages can be used to bound the overall stiffness in the sense of the inequalities in Eqs. 8.95 and 8.99. For example, we have seen that A^{(V)} provides an upper bound on \bar A. We also have that

\left( C^{(R)} \right)^{-1} \le \bar A \le A^{(V)} \qquad (8.100)

in the sense of positive definiteness.
Example 8.43: Consider the random heterogeneous rod in Example 8.42 and assume that it is subjected to a traction s at its ends. The overall constitutive law is \bar S = \bar A\, \bar\Gamma or \bar\Gamma = \bar C\, \bar S and

\bar C = \frac{1}{v} \int_0^v C(x)\, \frac{S(x)}{s}\, dx \le C^{(R)} = \frac{1}{v} \int_0^v C(x)\, dx. \qquad \diamond
Proof: The average complementary strain energy accumulated in the rod is \bar U_c = (1/2)\, \bar C\, s^2. According to the principle of minimum complementary potential energy, we have

\bar U_c = \frac{1}{2}\, \bar C\, s^2 \le \frac{1}{2v} \int_0^v C(x)\, s^2\, dx = \frac{1}{2}\, s^2 \left( \frac{1}{v} \int_0^v C(x)\, dx \right) = \frac{1}{2}\, C^{(R)}\, s^2,

so that \bar C \le C^{(R)}. •
Example 8.44: Let D be a cube with unit sides containing a multi-phase material consisting of n isotropic homogeneous phases with shear and bulk elastic moduli G^{(r)} and K^{(r)}, and volume v_r, r = 1, \ldots, n, so that \sum_{r=1}^n v_r = 1. It is assumed that the dimension of the individual constituents is much smaller than unity. Let \bar G and \bar K denote the effective shear and bulk moduli of the material in D corresponding to uniform pure shear and hydrostatic pressure, respectively. For example, \bar G for a uniform pure shear s_0 in the (x_1, x_2)-plane is \bar G = s_0 / \bar\Gamma_{12}, where \bar\Gamma_{12} = \int_D \Gamma_{12}(x)\, dx since D has unit volume (Eq. 8.90). The Voigt and Reuss averages give the bounds

\left( \sum_{r=1}^n \frac{v_r}{G^{(r)}} \right)^{-1} \le \bar G \le \sum_{r=1}^n v_r\, G^{(r)} \quad\text{and}\quad \left( \sum_{r=1}^n \frac{v_r}{K^{(r)}} \right)^{-1} \le \bar K \le \sum_{r=1}^n v_r\, K^{(r)}. \qquad \diamond
Proof: The stated bounds follow from Eqs. 8.95 and 8.99, and their consistency follows from the Cauchy-Schwarz inequality \left( \sum_r a_r\, b_r \right)^2 \le \left( \sum_r a_r^2 \right) \left( \sum_r b_r^2 \right) applied for a_r = (v_r\, G_r)^{1/2} and b_r = (v_r / G_r)^{1/2}. This inequality holds for any a_r, b_r in \mathbb{R} or \mathbb{C}. •
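A numerical check of the bounds is immediate. The sketch below uses illustrative phase values (not from the text) to evaluate the Voigt and Reuss averages for the shear modulus and verify the ordering given by the Cauchy-Schwarz argument.

```python
# Volume fractions and phase shear moduli (illustrative values).
v = [0.2, 0.5, 0.3]
G = [1.0, 4.0, 10.0]

G_voigt = sum(vr * Gr for vr, Gr in zip(v, G))        # upper bound sum_r v_r G^(r)
G_reuss = 1.0 / sum(vr / Gr for vr, Gr in zip(v, G))  # lower bound (sum_r v_r/G^(r))^(-1)

# Cauchy-Schwarz with a_r = (v_r G^(r))^(1/2), b_r = (v_r / G^(r))^(1/2):
# 1 = (sum a_r b_r)^2 <= (sum a_r^2)(sum b_r^2), i.e., G_reuss <= G_voigt.
assert G_reuss <= G_voigt
```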
Approximations of the effective moduli can be obtained from Eqs. 8.90 and 8.91 in which the actual strain and stress fields are approximated. The accuracy of the resulting effective moduli \bar A_{ijkl} and \bar C_{ijkl} depends on the postulated strain and stress fields. We illustrate this approach by two examples. The perturbation and the Neumann series methods are applied in the following section to develop alternative approximations for \bar A_{ijkl} and \bar C_{ijkl}.
Example 8.45: Consider the multi-phase material in Example 8.44 and let \nu^{(r)} be the Poisson ratio of phase r, r = 1, \ldots, n. Suppose that the material is sheared in the (x_1, x_2)-plane by a deterministic stress s_0. The overall or effective shear modulus in this plane is \bar G = \sum_{r=1}^n v_r\, G^{(r)}\, \bar\Gamma_{12}^{(r)} / \bar\Gamma_{12}, where \bar\Gamma_{12}^{(r)} and \bar\Gamma_{12} are averages of the strain field \Gamma_{12} over D_r and D, respectively, and D_r denotes the subset of D occupied by phase r (Eq. 8.90 and Example 8.44). We also denote by \bar K the effective bulk modulus (Example 8.44).
The use of the above expression for the effective shear modulus \bar G requires knowledge of the strain field. Two options are available. We can calculate the actual strain field \Gamma_{12}, or we can postulate an approximate strain field and then calculate \bar\Gamma_{12}. We use the latter option and take \bar\Gamma_{12}^{(r)} to be the strain in a spherical inclusion with the properties of phase r that is embedded in an infinite elastic matrix with the macroscopic properties \bar G and \bar K of the composite. The resulting strain is \bar\Gamma_{12}^{(r)} = s_0 / \left( \bar G + \beta\, (G^{(r)} - \bar G) \right), where \beta = 2\, (4 - 5\bar\nu) / (15\, (1 - \bar\nu)), \bar\nu = (3\bar K - 2\bar G) / (6\bar K + 2\bar G), and \bar K = \sum_{r=1}^n v_r\, K^{(r)}\, \theta^{(r)} / \bar\theta [34]. The approximate contraction \theta^{(r)} of phase r is \theta^{(r)} = p / \left( \bar K + \alpha\, (K^{(r)} - \bar K) \right), where \alpha = (1 + \bar\nu) / (3\, (1 - \bar\nu)) and p denotes a deterministic hydrostatic pressure applied to D. The average strain and contraction over D are

\bar\Gamma_{12} = \sum_{r=1}^n v_r\, \bar\Gamma_{12}^{(r)} \quad\text{and}\quad \bar\theta = \sum_{r=1}^n v_r\, \theta^{(r)}.
(Figure 8.17: effective shear modulus \bar G and bulk modulus \bar K obtained from the self-consistent equations for material samples with n = 20 and n = 200 phases.)
Proof: The average of the stress field over D is

\bar S_{ij} = \frac{1}{v} \int_D S_{ij}(x)\, dx = \sum_{r=1}^n v_r \left[ \delta_{ij}\, \lambda^{(r)}\, \frac{1}{v_r} \int_{D_r} \Gamma_{aa}(x)\, dx + 2\, G^{(r)}\, \frac{1}{v_r} \int_{D_r} \Gamma_{ij}(x)\, dx \right],

so that \bar S_{12} = \sum_{r=1}^n 2\, v_r\, G^{(r)}\, \bar\Gamma_{12}^{(r)}, which yields the stated constitutive law since the total shearing strain is equal to 2\, \bar\Gamma_{12}.
The relationship between pressure and volume change \Gamma_{aa} = \Gamma_{11} + \Gamma_{22} + \Gamma_{33} for an isotropic homogeneous material is S_{aa} = (3\lambda + 2G)\, \Gamma_{aa} = 3K\, \Gamma_{aa}, where K denotes the bulk modulus. The average \bar S_{aa} of S_{aa} over D is

\bar S_{aa} = \sum_{r=1}^n 3\, v_r\, K^{(r)}\, \theta^{(r)},

so that \bar K is as stated.
The expressions of \bar G, \bar K, \alpha, \beta, and \bar\nu form a system of nonlinear algebraic equations whose solution gives \bar G and \bar K. The plots in Fig. 8.17 represent the solutions of these algebraic equations for material samples generated by Monte Carlo simulation. •
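The nonlinear system can be solved by a damped fixed-point iteration. The sketch below implements the relations of this example for a hypothetical two-phase material; the phase values, the starting point, and the damping factor are assumptions for illustration, not part of the text.

```python
def self_consistent(v, G, K, iters=500):
    # Start from the Voigt averages of the phase moduli.
    Gb = sum(vr * Gr for vr, Gr in zip(v, G))
    Kb = sum(vr * Kr for vr, Kr in zip(v, K))
    for _ in range(iters):
        nu = (3 * Kb - 2 * Gb) / (6 * Kb + 2 * Gb)    # nu_bar
        beta = 2 * (4 - 5 * nu) / (15 * (1 - nu))
        alpha = (1 + nu) / (3 * (1 - nu))
        # Phase strains and contractions up to the common factors s0 and p,
        # which cancel in the averages below.
        g = [1.0 / (Gb + beta * (Gr - Gb)) for Gr in G]
        t = [1.0 / (Kb + alpha * (Kr - Kb)) for Kr in K]
        G_new = (sum(vr * Gr * gr for vr, Gr, gr in zip(v, G, g))
                 / sum(vr * gr for vr, gr in zip(v, g)))
        K_new = (sum(vr * Kr * tr for vr, Kr, tr in zip(v, K, t))
                 / sum(vr * tr for vr, tr in zip(v, t)))
        Gb, Kb = 0.5 * (Gb + G_new), 0.5 * (Kb + K_new)  # damping for robustness
    return Gb, Kb

G_eff, K_eff = self_consistent([0.5, 0.5], [1.0, 3.0], [2.0, 5.0])
```

For moderate phase contrast the iteration converges quickly, and the resulting estimates fall between the Reuss and Voigt bounds of Example 8.44.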
Example 8.46: A collection of material particles that may have random shape and/or mechanical properties, whose interaction is localized at their contacts, is referred to as a granular material. In contrast to multi-phase media, the particles of a granular material are not embedded in a matrix. Most studies on granular materials focus on the development of continuum constitutive equations and the analysis of potential changes in material properties, referred to as structure evolution [72, 163]. The analysis of granular materials can be based on (1) concepts of molecular dynamics, (2) a continuum smoothing and balance approach, (3) kinetic theory based on the conservation theorem of statistical mechanics and the Maxwell-Boltzmann distribution, and (4) the mean field hypothesis [72, 108, 148].
We discuss solutions based on the mean field hypothesis. The effective bulk and effective shear moduli, \bar\lambda and \bar\mu, of a random packing of homogeneous and isotropic spheres of the same radius r > 0 given by the mean field hypothesis are

\bar\lambda = \frac{\mu\, k\, (1 - \alpha)}{8\pi\, r} \left[ \frac{a}{1 - \nu} - \frac{2b}{2 - \nu} \right], \qquad 2\bar\mu = \frac{\mu\, k\, (1 - \alpha)}{8\pi\, r} \left[ \frac{a}{1 - \nu} - \frac{b}{2 - \nu} \right], \qquad (8.101)
where \mu and \nu are the shear modulus and the Poisson ratio of the spheres, k denotes the average number of contact points for a sphere of the aggregate, a and b are some constants related to the geometry of the contact between two spheres, and \alpha is the aggregate porosity, that is, the ratio of void to total volume. The mean field hypothesis states that (1) each particle has the same number of contacts, equal to the average number of particle contacts over the entire aggregate, and (2) the contacts are equally spaced on the boundary of each particle. These assumptions hold in a perfectly ordered configuration of the aggregate. The use of the mean field hypothesis eliminates the only source of uncertainty, the geometry of packing, and gives an overall relationship between stresses and strains that is deterministic.
The derivation of the effective moduli in Eq. 8.101 involves two steps. First, the constitutive law of the interaction between two particles needs to be established. Second, the mean field hypothesis is applied to eliminate the uncertainty in the coefficients of the equilibrium conditions and establish overall constitutive laws. Details on these derivations can be found in [52]. ◊
Note: We summarize the essential steps of the mean field solution for the special case of an assembly of infinitely long cylindrical particles with circular cross section of radius r > 0 and parallel axes. Let S be a plane section through the assembly that is parallel to the axes of the particles. Consider a rectangular subset S' of S with unit side in the direction of the particle axes. The other side of S' has length a \gg r. The area of the intersection of S' with the cylindrical particles of the aggregate is a^* \simeq \sum_i N_i\, [2\, r\, \sin(\alpha_i)], where N_i denotes the number of cylindrical particles intersecting S' at an angle in [\alpha_i, \alpha_i + d\alpha_i) (Fig. 8.18). The numbers N_i are random since the packing is not ordered. The interaction
between two particles in contact consists of the forces F_n and F_t normal to and contained in the plane tangent to the boundaries of these particles at their contact point. It is assumed that there is no relative movement between particles along their axes, so that we can consider that F_t is contained in the (x_1, x_2) plane (Fig. 8.18).
Suppose that the assembly is in equilibrium under equal pressure in the x_1 and x_2 directions. These boundary conditions would generate uniform stress and strain fields in a homogeneous isotropic material replacing the aggregate of particles. Let \delta s_{ij} be a perturbation of the uniform stress field in this material and \delta\gamma_{ij} denote the corresponding increment in strains. Our objective is to determine the relationship between \delta s_{ij} and \delta\gamma_{ij}, that is, the overall constitutive law or, equivalently, the effective constants of the aggregate. Consider again the set S' and suppose that the stress increment is limited to the component parallel to axis x_2. The total force increment on this section is a\, \delta s_{22}. The equilibrium of the assembly is preserved if the original contact forces between particles are slightly altered. Let \delta F_n^{(p,c)} and \delta F_t^{(p,c)} denote changes in the contact forces caused by the stress increment \delta s_{22}. The assembly is in equilibrium if the condition

a\, \delta s_{22} = -\sum_{\text{particles}}\; \sum_{\text{contacts}} \left[ \delta F_n^{(p,c)}\, \cos(\theta^{(p,c)}) + \delta F_t^{(p,c)}\, \sin(\theta^{(p,c)}) \right]
is satisfied, where \theta^{(p,c)} is the angle between F_n and axis x_2 (Fig. 8.18). The summation includes contact forces acting on the particles which intersect S' and have their center above this plane. These forces are identified by the superscripts (p, c) in the above equation. The angles \theta^{(p,c)} are uncertain because of random packing, and therefore \delta F_n^{(p,c)} and \delta F_t^{(p,c)} are random variables.
The analysis of the interaction between two particles can be used to obtain the relationships \delta F_n = c_n\, \delta\gamma_n and \delta F_t = c_t\, \delta\gamma_t between the force increments \delta F_n and \delta F_t and the corresponding deformation increments, \delta\gamma_n and \delta\gamma_t, in the normal and tangential directions, where c_n and c_t are constants. The strain increments in the contact reference frame corresponding to these displacements can be calculated simply. Classical formulas of tensor calculus can be used to find the image \delta\Gamma_{ij} of the strain tensor \delta\Gamma'_{ij} in the (x_1, x_2) reference frame. The last two equations and the equilibrium conditions give relations between the stress increment \delta s_{22} and the strain increments \delta\Gamma_{ij}. The above results and the mean field hypothesis yield Eq. 8.101 following the arguments in [52]. ▲
The methods in Section 8.4.1 can be applied to find moments and other probabilistic properties of the displacement, strain, and stress fields for random heterogeneous media. We show here that some of the methods in this section, for example, the perturbation and Neumann series methods, can be applied to calculate approximately effective material constants and develop bounds on these constants that are tighter than the Voigt and Reuss averages.
It is assumed, as stated at the beginning of Section 8.5.2, that the stiffness coefficients A_{ijkl} are homogeneous, square integrable random fields, so that their expectations E[A_{ijkl}(x)] = a_{ijkl} exist and are constant. The random part of the stiffness coefficients, A_{ijkl}(x) - a_{ijkl}, has mean zero.
If (1) the random fields A_{ijkl} and F_i are ergodic and independent of each other and (2) D is sufficiently large so that sample and ensemble averages of A_{ijkl} and F_i nearly coincide, then the constitutive law for the average stress and strain fields can be approximated by \bar S_{ij}(x) = a_{ijkl}\, \bar\Gamma_{kl}(x), where \bar S_{ij} and \bar\Gamma_{ij} denote approximations to the order \varepsilon of \bar S_{ij} and \bar\Gamma_{ij} in Eq. 8.89, respectively.
Proof: The representation of the displacement field and Eqs. 8.85, 8.87, and 8.88 give

a_{ijkl}\, u^{(0)}_{k,lj}(x) + F_i(x) = 0 \quad\text{and}\quad a_{ijkl}\, u^{(1)}_{k,lj}(x) = -R_{ijkl,j}(x)\, \Gamma^{(0)}_{kl}(x) - \frac{1}{2}\, R_{ijkl}(x)\, \left( u^{(0)}_{k,lj}(x) + u^{(0)}_{l,kj}(x) \right)

to order 1 and \varepsilon, respectively, provided that F_i \sim O(1). The boundary conditions for the above equations correspond to the terms of order 1 and \varepsilon, respectively, of the specified boundary conditions. If the boundary conditions for the equations of order 1 and the body forces F_i are deterministic, the solutions u^{(0)}_i are deterministic. The solutions u^{(1)}_i are random fields irrespective of the properties of F_i and the type of boundary conditions since the differential equations defining these functions are driven by the random part of the stiffness coefficients. The approximate strain and stress fields to the order \varepsilon are
\Gamma_{ij}(x) = \Gamma^{(0)}_{ij}(x) + \varepsilon\, \Gamma^{(1)}_{ij}(x) + O(\varepsilon^2) \quad\text{and}\quad S_{ij}(x) = S^{(0)}_{ij}(x) + \varepsilon\, S^{(1)}_{ij}(x) + O(\varepsilon^2),

where \Gamma^{(q)}_{kl} = \left( u^{(q)}_{k,l} + u^{(q)}_{l,k} \right) / 2 and the expression for S_{ij} is given by the terms of order 1 and \varepsilon of A_{ijkl}(x)\, \Gamma_{kl}(x).
The expectation of the above expression of S_{ij} is

E\left[ S^{(0)}_{ij}(x) + \varepsilon\, S^{(1)}_{ij}(x) \right] = a_{ijkl}\, E\left[ \Gamma^{(0)}_{kl}(x) + \varepsilon\, \Gamma^{(1)}_{kl}(x) \right]

since R_{ijkl} and \Gamma^{(0)}_{kl} are not related, so that the expectation E\left[ R_{ijkl}(x)\, \Gamma^{(0)}_{kl}(x) \right] is equal to E\left[ R_{ijkl}(x) \right] E\left[ \Gamma^{(0)}_{kl}(x) \right], and R_{ijkl} has mean zero. If the random fields A_{ijkl} and F_i are ergodic, so are the stress and strain fields. Hence, for a sufficiently large D, the expectations E\left[ S^{(0)}_{ij}(x) + \varepsilon\, S^{(1)}_{ij}(x) \right] and E\left[ \Gamma^{(0)}_{ij}(x) + \varepsilon\, \Gamma^{(1)}_{ij}(x) \right] can be approximated by the sample averages of S^{(0)}_{ij} + \varepsilon\, S^{(1)}_{ij} and \Gamma^{(0)}_{ij} + \varepsilon\, \Gamma^{(1)}_{ij}, denoted by \bar S_{ij} and \bar\Gamma_{ij}, respectively. These observations give the approximate constitutive law \bar S_{ij} = a_{ijkl}\, \bar\Gamma_{kl}. •
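The order of the approximation can be illustrated in one dimension, where the exact overall stiffness of a rod under prescribed end displacements is the harmonic mean of A(x) = a + \varepsilon R(x). The sketch below (the uniform distribution for R is an assumption for illustration) shows that the defect of the first-order law \bar S = a \bar\Gamma decays roughly like \varepsilon^2.

```python
import random

random.seed(1)

a = 2.0
n = 20_000
raw = [random.uniform(-1.0, 1.0) for _ in range(n)]
m = sum(raw) / n
R = [r - m for r in raw]  # enforce a mean-zero sample, since E[R] = 0

def defect(eps):
    # Exact overall stiffness of the rod (harmonic mean of A = a + eps*R)
    # minus the first-order prediction a.
    A_bar = n / sum(1.0 / (a + eps * r) for r in R)
    return abs(A_bar - a)

e1, e2 = defect(0.2), defect(0.1)
assert e2 < 0.3 * e1  # halving eps cuts the defect by roughly a factor of 4
```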
Neumann series method. Let R_{ijkl}(x) = A_{ijkl}(x) - a_{ijkl} be the random part of A_{ijkl} and assume that the displacement field satisfies the boundary condition in Eq. 8.92. Consider a trial strain field

\tilde\Gamma^{(m)}(x) = \sum_{k=0}^m (-G\, R)^k\, \gamma(x),
where the last equality is valid if the above Neumann series is convergent. Under this condition we can approximate the strain field by the first m terms of the Neumann series expansion of \Gamma, that is, by the field \tilde\Gamma^{(m)}, so that

\bar U = \frac{1}{2}\, \gamma^T\, \bar A\, \gamma \simeq \frac{1}{2v} \int_D \tilde\Gamma^{(m)}_{ij}(x)\, (a + R)_{ijkl}(x)\, \tilde\Gamma^{(m)}_{kl}(x)\, dx.
Example 8.47: Consider a rod with unit cross section and random stiffness A(x), 0 \le x \le v, subjected to boundary conditions U(0) = 0 and U(v) = \gamma\, v at its left and right ends, respectively. The average and random parts of A, assumed to be a
is convergent a.s., and represents the strain field in the rod (Eqs. 8.58 and 8.59). We use the above representation of \Gamma to select trial strain fields \tilde\Gamma. The Voigt average (Eq. 8.94) corresponds to the trial strain field \tilde\Gamma(x) = \gamma. Since \bar A\, \gamma^2 \le (1/v) \int_0^v A(x)\, \tilde\gamma(x)^2\, dx holds for any trial strain field \tilde\gamma, it can be applied to the strain field \gamma + H(x)\, \gamma_1. We have selected \gamma_1^{opt} to minimize the upper bound (1/v) \int_0^v A(x)\, \left( \gamma + H(x)\, \gamma_1 \right)^2\, dx on \bar A\, \gamma^2. ▲
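The convergence of the Neumann series can be seen directly for the rod: the compliance 1/A(x) = (1/a) \sum_k (-R(x)/a)^k converges whenever |R(x)| < a, and truncations approach the exact average compliance. A minimal sketch, with an illustrative uniform R (an assumption, not from the text):

```python
import random

random.seed(2)

a = 2.0
n = 5_000
R = [random.uniform(-1.0, 1.0) for _ in range(n)]  # |R| < a ensures convergence

# Exact average compliance of the rod (a series system).
exact = sum(1.0 / (a + r) for r in R) / n

def truncated(m):
    # Truncated Neumann series: 1/(a + R) = (1/a) * sum_{k=0}^{m} (-R/a)^k + remainder.
    total = 0.0
    for r in R:
        total += sum((-r / a) ** k for k in range(m + 1)) / a
    return total / n

errs = [abs(truncated(m) - exact) for m in (0, 2, 4, 8)]
assert errs[0] > errs[1] > errs[2] > errs[3]  # geometric improvement with m
```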
8.6.1 Elasticity
Let D be an open bounded subset of \mathbb{R}^d, d = 1, 2, 3, containing a random heterogeneous elastic material. The boundary \partial D of D is subjected to specified displacements varying slowly in time. The stiffness tensor of the material in D is a random field A_{ijkl}(x, t), where x \in D, i, j, k, l = 1, 2, 3, and t \ge 0. Our objective is to assess whether the stiffness A_{ijkl} can evolve and, if it evolves, whether its evolution emerges in a pattern.
Let \Delta U(x, t), \Delta\Gamma_{ij}(x, t), and \Delta S_{ij}(x, t) denote the displacement, strain, and stress increments at x \in D during the time interval [t, t + \Delta t], \Delta t > 0, for example, \Delta U(x, t) = U(x, t + \Delta t) - U(x, t). We assume in our discussion the following models for material, kinematics, and external actions.
A_{ijkl}(x, t) = \Lambda(x, t)\, \delta_{ij}\, \delta_{kl} + \Xi(x, t)\, \left( \delta_{ik}\, \delta_{jl} + \delta_{il}\, \delta_{jk} \right), \qquad (8.102)

where \Gamma^*_{mn} = \Gamma_{mn} - (1/d)\, \Gamma_{pp}\, \delta_{mn} denotes the entry (m, n) of the deviatoric strain tensor and c^{(\lambda)} and c^{(\xi)} are specified constants. This rule can be viewed as an approximation of the crystal plasticity evolution defined later in Section 8.6.2 (Eq. 8.122).
(c) is locally linear, that is, the relationship between stress and strain increments in [t, t + \Delta t] is given by Eq. 8.105, where c_{ij} = c_{ji} are some constants. The loading is quasi-static and body forces are zero.
The strain increments

\Delta\gamma_{ij} = \frac{1}{2}\, \left( \Delta u_{i,j}(x, t) + \Delta u_{j,i}(x, t) \right) = \Delta\alpha_{ij}(t) = c_{ij}\, \Delta t

are space invariant. Because \Delta\gamma_{ij} / \Delta t = c_{ij}, we refer to c_{ij} as strain rates.
The material model defined by Eqs. 8.102, 8.103, 8.104, and 8.105 is quite simple. We use this model because it allows analytical developments. Such developments are not possible when dealing with more realistic material models, for example, the model for crystal plasticity in Section 8.6.2. We also note that the symbol \Delta used in this section to indicate increments of strain, stress, and other functions should not be confused with the Laplace operator used previously in the book. ▲
where
Proof: We note that the Fourier transforms l:.U;(q, t) of l:.U;(x, t) in Eq. 8.109 depend
on the particular samples of t:.A(q, t) and t:.S(q, t) used in this equation so that they are
random. The following calculations apply to samples of the random fields A and S. We
also note that the functions l:.U; (x, t), t:.l\(x, t), and l:.S(x, t) are defined only on a subset
D of Rd. The Fourier transforms in Eq. 8.109 correspond to extensions of these functions
to the entire space Rd. It is common to use periodic extensions.
The equilibrium condition t:.Sij,J (x, t) = 0 can be given in the form
The result in Eq. 8.109 is the Fourier transform of the above equality. •
\Delta\tilde\Lambda(x, t) = b^{(\lambda)}_{mn}\, \Delta\tilde U_{m,n}(x, t), \qquad \Delta\tilde\Xi(x, t) = b^{(\xi)}_{mn}\, \Delta\tilde U_{m,n}(x, t), \qquad (8.112)

where b^{(\lambda)}, b^{(\xi)}, b^{(\lambda)}_{mn}, and b^{(\xi)}_{mn} are constants depending on the average strain rates c_{ij}.
8.6. Evolution and pattern formation 637
Proof: The change \Delta\Lambda(x, t) = \Lambda(x, t + \Delta t) - \Lambda(x, t) = \Delta\lambda(t) + \varepsilon\, \Delta\tilde\Lambda(x, t) of the material parameter \Lambda during the time interval [t, t + \Delta t] depends on the strain increment in this interval (Eq. 8.104). The function \Delta\Gamma^*_{mn}\, \Delta\Gamma^*_{nm} can be approximated to the order \varepsilon by

\Delta\Gamma^*_{mn}\, \Delta\Gamma^*_{nm} = \Delta\alpha_{mn}\, \Delta\alpha_{nm} - \frac{1}{3}\, (\Delta\alpha_{pp})^2 + 2\, \varepsilon\, \left( \Delta\alpha_{mn}\, \Delta\beta_{nm} - \frac{1}{3}\, \Delta\alpha_{pp}\, \Delta\beta_{qq} \right),

where \Delta\beta_{pq}(x, t) = \left( \Delta\tilde U_{p,q}(x, t) + \Delta\tilde U_{q,p}(x, t) \right) / 2. The arguments of the functions in the above and some of the following equations are not shown for simplicity. The corresponding approximation of the square root of this function is

\left( \Delta\Gamma^*_{mn}\, \Delta\Gamma^*_{nm} \right)^{1/2} \simeq \sqrt{a} + \frac{\varepsilon}{\sqrt{a}}\, \left( \Delta\alpha_{pq} - \frac{1}{3}\, \Delta\alpha_{rr}\, \delta_{qp} \right) \Delta\beta_{qp}(x, t),

where a = \Delta\alpha_{pq}\, \Delta\alpha_{qp} - \frac{1}{3}\, (\Delta\alpha_{rr})^2 and the function (y + \varepsilon\, z)^{1/2} has been approximated by y^{1/2} + (\varepsilon\, z) / (2\, y^{1/2}). We have

\Delta\lambda(t) + \varepsilon\, \Delta\tilde\Lambda(x, t) = c^{(\lambda)} \left[ \sqrt{a} + \frac{\varepsilon}{\sqrt{a}}\, \left( \Delta\alpha_{pq}(t) - \frac{1}{3}\, \Delta\alpha_{rr}(t)\, \delta_{qp} \right) \Delta\beta_{qp}(x, t) \right],

so that the terms of order 1 and \varepsilon give \Delta\lambda(t) and \Delta\tilde\Lambda(x, t), respectively. The results in Eqs. 8.111-8.112 follow since \Delta\alpha_{ij}(t) = c_{ij}\, \Delta t (Eq. 8.107), so that a^{1/2} = \left( c_{pq}\, c_{qp} - (c_{rr})^2 / 3 \right)^{1/2} \Delta t. The constants of the first equalities in Eqs. 8.111 and 8.112 are, respectively,

b^{(\lambda)} = c^{(\lambda)} \left( c_{pq}\, c_{qp} - (c_{rr})^2 / 3 \right)^{1/2} \quad\text{and}\quad b^{(\lambda)}_{mn} = c^{(\lambda)}\, \frac{c_{mn} + c_{nm} - (2/3)\, c_{pp}\, \delta_{nm}}{\left( c_{pq}\, c_{qp} - (1/3)\, (c_{rr})^2 \right)^{1/2}}. \qquad (8.113)

The constants b^{(\xi)} and b^{(\xi)}_{mn} in the second equalities of Eqs. 8.111 and 8.112 can be obtained from b^{(\lambda)} and b^{(\lambda)}_{mn} by multiplication with c^{(\xi)} / c^{(\lambda)}. •
\Delta\tilde\Lambda(q, t) = \sqrt{-1}\, b^{(\lambda)}_{mn}\, q_n\, \Delta\tilde U_m(q, t), \qquad \Delta\tilde\Xi(q, t) = \sqrt{-1}\, b^{(\xi)}_{mn}\, q_n\, \Delta\tilde U_m(q, t), \qquad (8.114)
Note: The above equations show that the Fourier transforms of \Delta\tilde\Lambda(x, t) and \Delta\tilde\Xi(x, t) are linear functions of the Fourier transform of the displacement increments, that is, of the functions \Delta\tilde U_m(q, t). ▲
The material evolution rules in Eq. 8.111 show that the deterministic part of the stiffness coefficients varies linearly in time and remains space invariant. The evolution of the random part of the stiffness coefficients cannot be obtained directly from Eq. 8.112 since the increments of these coefficients depend on the increments \Delta\tilde U_{m,n} of the random part of the displacement field. However, the Fourier transform of this equation (Eq. 8.114) and Eq. 8.109 yield finite difference equations for \tilde\Lambda and \tilde\Xi. These equations are coupled, so that they have to be solved simultaneously. Generally, the coefficients of the equations defining the evolution of \tilde\Lambda and \tilde\Xi depend on the wave number q (Eq. 8.109). Hence, the relative magnitude of the amplitudes of the constitutive waves of \tilde\Lambda(q, t) and \tilde\Xi(q, t) may change in time, and this change implies evolution. Accordingly, the probability law of the random fields \tilde\Lambda(x, t) and \tilde\Xi(x, t) would also change in time, in which case we say that the material properties evolve.
Example 8.48: Consider a rod of length l > 0 with stiffness A(x, t) = a(t) + \varepsilon\, \tilde A(x, t), where a and \tilde A denote the deterministic and random parts of A and \varepsilon is a small parameter. It is assumed that the stiffness obeys the evolution rule \Delta A(x, t) = c^{(a)}\, \Delta\Gamma(x, t), where \Delta\Gamma(x, t) = \Delta U'(x, t) and \Delta A(x, t) denote strain and stiffness increments at x during the time interval [t, t + \Delta t], U(x, t) is the displacement function at location x and time t, and U'(x, t) = \partial U(x, t) / \partial x. The rod is subjected to an average elongation \Delta\alpha(t)\, x = c\, x\, \Delta t, c > 0, per unit length during the time interval [t, t + \Delta t].
The stiffness A does not evolve because the spectral density of \tilde A at a time t > 0 can be obtained from the spectral density of \tilde A at time t = 0 by scaling. Hence, a very simple system of the type considered in this example cannot evolve although its properties vary randomly in space. We will see in the following two examples that this behavior does not persist for more general systems. ◊
Proof: The increment of the displacement field in the rod during the time interval [t, t + \Delta t] can be approximated by \Delta U(x, t) = \Delta u(x, t) + \varepsilon\, \Delta\tilde U(x, t), x \in D = (0, l). The material model and the material evolution rule give (Eqs. 8.102, 8.103, and 8.104)

\Delta a(t) + \varepsilon\, \Delta\tilde A(x, t) = c^{(a)}\, \left( \Delta\alpha(t) + \varepsilon\, \Delta\tilde U'(x, t) \right),

so that \Delta a(t) = c^{(a)}\, \Delta\alpha(t) and \Delta\tilde A(x, t) = c^{(a)}\, \Delta\tilde U'(x, t). The relationship between \Delta\tilde U(q, t) and \tilde A(q, t) and the Fourier transform of the evolution rule of \tilde A give

\Delta\tilde A(q, t) = -\frac{c^{(a)}\, c}{a(t)}\, \tilde A(q, t)\, \Delta t

or, by taking the limit as \Delta t \to 0 of the finite difference equation for \tilde A scaled by \Delta t,

\frac{\partial \tilde A(q, t)}{\partial t} = -\frac{c^{(a)}\, c}{a(t)}\, \tilde A(q, t).
The solution

\tilde A(q, t) = \tilde A(q, 0)\, \exp\left( -\int_0^t \frac{c^{(a)}\, c}{a(s)}\, ds \right)

gives the fluctuating part of the stiffness at an arbitrary time t \ge 0. The above equation holds for almost all material samples. If \tilde A(x, 0) is a homogeneous random field with spectral density s(q, 0), the spectral density of \tilde A(x, t) is s(q, t) = s(q, 0)\, \exp\left( -2 \int_0^t \frac{c^{(a)}\, c}{a(s)}\, ds \right) since

E\left[ \tilde A(x, t)\, \tilde A(y, t) \right] = E\left[ \tilde A(x, 0)\, \tilde A(y, 0) \right] \exp\left( -2 \int_0^t \frac{c^{(a)}\, c}{a(s)}\, ds \right)

and the covariance function and the spectral density are Fourier pairs. The relationship between the spectral densities of \tilde A at the initial and an arbitrary time t \ge 0 shows that \tilde A(x, t) is a scaled version of \tilde A(x, 0). Therefore, there is no evolution. •
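The scaling argument can be made concrete. If a(s) = a_0 + c^{(a)} c\, s grows linearly, the damping factor \exp(-2 \int_0^t c^{(a)} c / a(s)\, ds) has the closed form (a_0 / (a_0 + c^{(a)} c\, t))^2 and is independent of the wave number, so the normalized spectrum never changes shape. The numbers below are illustrative assumptions.

```python
# Spectral ordinates of the fluctuating stiffness at t = 0 for a few wave
# numbers (illustrative values).
s0 = [4.0, 2.0, 1.0]

# a(s) = a0 + c_a*c*s gives int_0^t c_a*c/a(s) ds = log((a0 + c_a*c*t)/a0),
# so every ordinate is damped by the same q-independent factor.
a0, c_a, c, t = 1.0, 0.3, 1.0, 2.0
factor = (a0 / (a0 + c_a * c * t)) ** 2

s_t = [factor * s for s in s0]

# Ratios between ordinates are unchanged: the spectrum at time t is a scaled
# copy of the spectrum at time 0, so there is no evolution.
r0 = [s / s0[0] for s in s0]
rt = [s / s_t[0] for s in s_t]
assert all(abs(x - y) < 1e-12 for x, y in zip(r0, rt))
```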
Proof: The increments of the displacements in D during a time interval [t, t + \Delta t] are \Delta U_i(x, t) = \Delta u_i(x, t) + \varepsilon\, \Delta\tilde U_i(x, t), and \Delta\Gamma_{ij}(x, t) represent the corresponding strain increments at x \in D in [t, t + \Delta t] (Eq. 8.106). The relationship between stress and strain increments is (Eq. 8.105)

\begin{bmatrix} \Delta S_{11}(x, t) \\ \Delta S_{22}(x, t) \\ \Delta S_{12}(x, t) \end{bmatrix} = \begin{bmatrix} \Lambda(x, t) + 2\,\Xi(x, t) & \Lambda(x, t) & 0 \\ \Lambda(x, t) & \Lambda(x, t) + 2\,\Xi(x, t) & 0 \\ 0 & 0 & \Xi(x, t) \end{bmatrix} \begin{bmatrix} \Delta\Gamma_{11}(x, t) \\ \Delta\Gamma_{22}(x, t) \\ \Delta\Gamma_{12}(x, t) \end{bmatrix},

and the equilibrium conditions \Delta S_{ij,j}(x, t) = 0 give, to the order \varepsilon,

\left( \tilde\Lambda_{,1}(x, t) + 2\, \tilde\Xi_{,1}(x, t) \right) \Delta\alpha_{11} + \left( \lambda(t) + 2\, \xi(t) \right) \Delta\tilde U_{1,11}(x, t) + \lambda(t)\, \Delta\tilde U_{2,21}(x, t) + \tilde\Xi_{,2}(x, t)\, \frac{\Delta\alpha_{12}}{2} + \xi(t)\, \frac{\Delta\tilde U_{1,22}(x, t) + \Delta\tilde U_{2,12}(x, t)}{2} = 0 \quad\text{and}

\tilde\Xi_{,1}(x, t)\, \frac{\Delta\alpha_{12}}{2} + \xi(t)\, \left( \frac{\Delta\tilde U_{1,21}(x, t)}{2} + \Delta\tilde U_{2,11}(x, t) \right) + \tilde\Xi_{,2}(x, t)\, \Delta\alpha_{11} + \lambda(t)\, \Delta\tilde U_{1,12}(x, t) + \left( \lambda(t) + 2\, \xi(t) \right) \Delta\tilde U_{2,22}(x, t) = 0.
The Fourier transform of these equations,

\begin{bmatrix} q_1^2\, (\lambda(t) + 2\,\xi(t)) + q_2^2\, \xi(t)/2 & q_1\, q_2\, (\lambda(t) + \xi(t)/2) \\ q_1\, q_2\, (\lambda(t) + \xi(t)/2) & q_2^2\, (\lambda(t) + 2\,\xi(t)) + q_1^2\, \xi(t)/2 \end{bmatrix} \begin{bmatrix} \Delta\tilde U_1(q, t) \\ \Delta\tilde U_2(q, t) \end{bmatrix}
= \sqrt{-1} \begin{bmatrix} q_1\, \Delta\alpha_{11}\, \left( \tilde\Lambda(q, t) + 2\, \tilde\Xi(q, t) \right) + q_2\, \Delta\alpha_{12}\, \tilde\Xi(q, t)/2 \\ q_1\, \Delta\alpha_{12}\, \tilde\Xi(q, t)/2 + q_2\, \Delta\alpha_{11}\, \tilde\Xi(q, t) \end{bmatrix},

constitutes an algebraic system of linear equations for (\Delta\tilde U_1, \Delta\tilde U_2), with solution \Delta\tilde U_i = \sqrt{-1}\, \left( b_{i1}\, \tilde\Lambda + b_{i2}\, \tilde\Xi \right) / (a_{11}\, a_{22} - a_{12}^2), where a_{ij} denote the entries of the above matrix. The coefficients a_{ij} and b_{ij} are functions of the wave numbers q and the current average material parameters \lambda and \xi. The arguments in the above equations are suppressed for simplicity.
The strain function,

\left[ \Gamma^*_{mn}(x, t)\, \Gamma^*_{nm}(x, t) \right]^{1/2} \simeq \frac{1}{\sqrt{2}}\, \left[ (\Delta\alpha_{11})^2 + (\Delta\alpha_{12})^2 \right]^{1/2}
+ \frac{\varepsilon}{\sqrt{2}\, \left[ (\Delta\alpha_{11})^2 + (\Delta\alpha_{12})^2 \right]^{1/2}}\, \left[ \Delta\alpha_{11}\, \left( \Delta\tilde U_{1,1}(x, t) - \Delta\tilde U_{2,2}(x, t) \right) + \Delta\alpha_{12}\, \left( \Delta\tilde U_{1,2}(x, t) + \Delta\tilde U_{2,1}(x, t) \right) \right],

and the material evolution rules give

\Delta\lambda = c^{(\lambda)}\, c\, (1 + \mu^2)^{1/2}\, \Delta t / \sqrt{2} \quad\text{and}\quad \Delta\xi = c^{(\xi)}\, c\, \mu\, (1 + \mu^2)^{1/2}\, \Delta t / \sqrt{2}

for terms of order 1, and

\Delta\tilde\Lambda(x, t) = \bar c^{(\lambda)}\, \left[ \Delta\tilde U_{1,1}(x, t) - \Delta\tilde U_{2,2}(x, t) + \mu\, \left( \Delta\tilde U_{1,2}(x, t) + \Delta\tilde U_{2,1}(x, t) \right) \right],
\Delta\tilde\Xi(x, t) = \bar c^{(\xi)}\, \left[ \Delta\tilde U_{1,1}(x, t) - \Delta\tilde U_{2,2}(x, t) + \mu\, \left( \Delta\tilde U_{1,2}(x, t) + \Delta\tilde U_{2,1}(x, t) \right) \right],

for terms of order \varepsilon, where \bar c^{(\lambda)} = c^{(\lambda)} / \sqrt{2\, (1 + \mu^2)} and \bar c^{(\xi)} = c^{(\xi)} / \sqrt{2\, (1 + \mu^2)}. The
material parameters \lambda and \xi vary linearly in time at different rates if \mu \neq 1. The evolution of the fluctuating part of the material parameters is less simple. The Fourier transforms of the last two equations,

\Delta\tilde\Lambda(q, t) = \sqrt{-1}\, \bar c^{(\lambda)}\, \left[ (q_1 + \mu\, q_2)\, \Delta\tilde U_1(q, t) + (-q_2 + \mu\, q_1)\, \Delta\tilde U_2(q, t) \right],
\Delta\tilde\Xi(q, t) = \sqrt{-1}\, \bar c^{(\xi)}\, \left[ (q_1 + \mu\, q_2)\, \Delta\tilde U_1(q, t) + (-q_2 + \mu\, q_1)\, \Delta\tilde U_2(q, t) \right],
and the linear system of equations giving the Fourier transform of \Delta\tilde U as a function of the Fourier transforms of \tilde\Lambda and \tilde\Xi can be used to determine the evolution in time of \tilde\Lambda and \tilde\Xi. Because the increments of \tilde\Lambda and \tilde\Xi in [t, t + \Delta t] are proportional, we have

\Delta\tilde\Lambda(q, t) = -\bar c^{(\lambda)}\, c\, \Delta t\, f(q; \mu, \lambda, \xi)\, \tilde\Lambda(q, t)

or, in the limit as \Delta t \to 0,

\frac{\partial}{\partial t} \tilde\Lambda(q, t) = -\bar c^{(\lambda)}\, c\, f(q; \mu, \lambda, \xi)\, \tilde\Lambda(q, t) = \alpha(q; \mu, \lambda, \xi)\, \tilde\Lambda(q, t),

where

f(q; \mu, \lambda, \xi) = \frac{(q_1 + q_2\, \mu)\, \left( b_{11} + b_{12}\, c^{(\lambda)} / c^{(\xi)} \right) + (-q_2 + q_1\, \mu)\, \left( b_{21} + b_{22}\, c^{(\lambda)} / c^{(\xi)} \right)}{a_{11}\, a_{22} - a_{12}^2}.

The differential equation \partial\tilde\Lambda / \partial t = \alpha(q; \mu, \lambda, \xi)\, \tilde\Lambda shows that the constitutive waves of \tilde\Lambda(q, 0) change in time at different rates since \alpha(q; \mu, \lambda, \xi) is a function of q. That \tilde\Lambda evolves in time can also be seen from the solution.
Suppose that the random field \tilde\Lambda(x, 0) has the spectral density

s(q, 0) = \frac{1}{2} \sum_{k=1}^n a_k^2\, \left[ \delta(q + q^{(k)}) + \delta(q - q^{(k)}) \right]

at the initial time t = 0, where a_k > 0 are some positive constants and q^{(k)} \in \mathbb{R}^2 denote wave numbers. The field has the spectral representation

\tilde\Lambda(x, 0) = \sum_{k=1}^n a_k\, \left[ A_k\, \cos(q^{(k)} \cdot x) + B_k\, \sin(q^{(k)} \cdot x) \right],

where A_k and B_k are uncorrelated random variables with mean zero and variance 1, and q^{(k)} \cdot x = q_1^{(k)}\, x_1 + q_2^{(k)}\, x_2 (Section 3.9.4.3).
The spectral density, covariance function, and spectral representation of the random field \tilde\Lambda(x, t) at an arbitrary time t \ge 0 are

s(q, t) = \frac{1}{2} \sum_{k=1}^n \gamma_k(t)\, \left[ \delta(q + q^{(k)}) + \delta(q - q^{(k)}) \right],
c(\zeta, t) = E\left[ \tilde\Lambda(x, t)\, \tilde\Lambda(x + \zeta, t) \right] = \sum_{k=1}^n \gamma_k(t)\, \cos(q^{(k)} \cdot \zeta), \quad\text{and}
\tilde\Lambda(x, t) = \sum_{k=1}^n \gamma_k(t)^{1/2}\, \left[ A_k\, \cos(q^{(k)} \cdot x) + B_k\, \sin(q^{(k)} \cdot x) \right],

respectively, where \gamma_k(t) = a_k^2\, \exp\left[ 2 \int_0^t \alpha(q^{(k)}; \mu, \lambda(s), \xi(s))\, ds \right]. Hence, (1) \tilde\Lambda(x, t) is a weakly homogeneous random field with mean zero at each time t \ge 0 and (2) the spectral representations of the random fields \tilde\Lambda(x, 0) and \tilde\Lambda(x, t), t > 0, involve waves of the same frequencies but of different amplitudes.
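By contrast with Example 8.48, a q-dependent rate changes the relative wave amplitudes. The sketch below uses a hypothetical rate -0.05\,|q|^2 (the actual rate involves \mu, \lambda(s), and \xi(s)) with four illustrative wave numbers, and shows that the ratio of wave energies \gamma_k(t) changes in time.

```python
import math

# Four wave numbers and a hypothetical q-dependent decay rate -0.05*|q|^2
# (the actual rate depends on mu and the current lambda(s), xi(s)).
qs = [(2.0, 3.0), (3.0, 6.0), (5.0, 7.0), (10.0, 0.01)]
ak = [1.0, 1.0, 1.0, 1.0]

def gamma_k(k, t):
    q1, q2 = qs[k]
    rate = -0.05 * (q1 * q1 + q2 * q2)
    return ak[k] ** 2 * math.exp(2.0 * rate * t)  # gamma_k(t) = a_k^2 exp(2 * rate * t)

w0 = [gamma_k(k, 0.0) for k in range(4)]
w1 = [gamma_k(k, 1.0) for k in range(4)]

# The ratios of wave energies change in time, so the normalized spectrum
# changes shape: the field evolves, unlike the rod of Example 8.48.
assert abs(w0[0] / w0[3] - w1[0] / w1[3]) > 1e-6
```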
Suppose that the spectral density of \tilde\Lambda(x, t) has only four waves, with q^{(1)} = (2, 3), q^{(2)} = (3, 6), q^{(3)} = (5, 7), and q^{(4)} = (10, 0.01), and a_k = 1, k = 1, \ldots, 4. Figure 8.20 shows the evolution in time of a sample path of \tilde\Lambda for \mu = 1.
(Figure 8.20: sample paths of the random field over (x_1, x_2) at times t = 0.0, 1.8, 3.6, and 6.0.)
Ã(q; 0, ω) = ((2π)²/2) Σ_{k=1}^{n} a_k [ (A_k(ω) + √−1 B_k(ω)) δ(q + q^{(k)}) + (A_k(ω) − √−1 B_k(ω)) δ(q − q^{(k)}) ]
for the Fourier transform of each sample ω ∈ Ω of the material parameters. Because
Ã(q; 0, ω) has energy only at (q^{(k)}, −q^{(k)}), so does Ã(q; t, ω). Hence, Ã(q; t, ω) and
Ã(q; 0, ω) have the same frequencies, but the energy associated with these frequencies
differs for the two random fields because the ordinates of the spectral density of A are
modulated differently in time, depending on the values of a(q; μ, λ(s), ξ(s)). The difference
in the evolution of the energy associated with each frequency is caused by the dependence
of the function a(q; μ, λ, ξ) on q.

The sample in Fig. 8.20 is generated from a Gaussian random field. The model is
inadequate for representing material stiffness because stiffness is positive and bounded, and
Gaussian fields do not have these properties. We use the Gaussian model in Fig. 8.20
only for graphical illustration. ♦
Note: The unit vectors (s^α, n^α) are related to the lattice frame (e1, e2) by
646 Chapter 8. Stochastic Systems and Deterministic Input
[Figure 8.22: slip system and lattice frame geometry]
where θ^α denotes the angle between the direction s^α of the slip system α and e1. The
Schmid tensor in the lattice frame is t^α = s^α ⊗ n^α and has the entries t^α_{ij} = s^α_i n^α_j,
i, j = 1, 2, where s^α_1 = cos(θ^α), s^α_2 = sin(θ^α), n^α_1 = −sin(θ^α), and n^α_2 = cos(θ^α).
Note that the trace and the determinant of the matrix t^α are zero. We decompose t^α into its
symmetric and skew parts, denoted by p^α = sym(t^α) and

q^α = skew(t^α) = [ 0, 1/2; −1/2, 0 ],

respectively. The tensor q^α is constant only for planar crystals. ▲
(8.116)
Note: The deviatoric stress tensor depends on two parameters, Z1 = (S*_{11} − S*_{22})/2 =
(S11 − S22)/2 and Z2 = S*_{12} = S12. The resolved shear stress corresponding to the
pressure tensor S_m δ_{ij} is s^α = S_m δ_{ij} s^α_i n^α_j = S_m (s^α_i n^α_i) = 0, since s^α and n^α are
orthogonal, so that

by the definition of the Schmid tensor and the symmetry of the deviatoric tensor, where
δ_{ij} = 1 for i = j and δ_{ij} = 0 for i ≠ j.
The relationships between the components (p^α, q^α) and (P^α, Q^α) of the Schmid
tensors in the lattice and laboratory frames are

(8.118)

where A11 = A22 = cos(Φ) and A12 = −A21 = −sin(Φ) are the entries of the (2, 2)
matrix A. If the deformation of the crystal varies randomly in time, the angle of rotation Φ
and the components (P^α, Q^α) of the Schmid tensor are stochastic processes. ▲
1. The yield condition, defining a criterion for the occurrence of plastic flow, is

where S^α_{cr} can be interpreted as a strength parameter for the slip system α and
represents a material property.
Note: S^α_{cr} may or may not depend on the shearing rates γ̇^β of the slip systems β in the
crystal. Because the resolved shear stress s^α depends linearly on the stress parameters Z1
and Z2 (Eq. 8.117), the yield condition is a^α Z1 + b^α Z2 = ±S^α_{cr}, where a^α and b^α depend
on the orientation of the slip system α.

If S^α_{cr} does not depend on γ̇^β, the yield condition is said to be rate independent.
Otherwise, we deal with a visco-plastic model. A typical visco-plastic model is
s^α / S^α_{cr} = f(..., γ̇^β, ...), where f is a specified function of the shearing rates γ̇^β and
S^α_{cr} is a state variable whose time evolution is given by the differential equation
Ṡ^α_{cr} = g(S^α_{cr}, ..., |γ̇^β|, ...), for example,

(8.120)

where h_{αβ} are specified functions. This evolution equation is analogous to a modified
version of the Voce macroscopic model [50]. ▲
(8.121)

(8.122)

where R is a matrix defining the rigid body rotation and Λ = Ṙ Rᵀ denotes the
lattice spin.

Note: The sum of G = Σ_α γ̇^α p^α and W − Λ coincides with the plastic velocity gradient
L^p given by Eq. 8.121.
The evolution of the geometry and structure of a crystal experiencing plastic
deformation is illustrated in Fig. 8.23. Let D0 be a subset of ℝ^d (d = 2 in our discussion)
giving the geometry of the crystal at the initial time t = 0, and let Dt ⊂ ℝ^d be its
deformed shape at a later time t > 0. Denote by x ↦ y(x, t) the mapping from D0 to
Dt. If the imposed deformation and/or crystal properties are random, Dt is a random
subset of ℝ^d. It is assumed that the Jacobian of the mapping from D0 to Dt is not zero,
so that the mapping can be inverted. The physical meaning of this assumption is that a
non-zero volume in D0 cannot be mapped into a zero volume in Dt, or that there is no
subset of D0 of non-zero Lebesgue measure whose image in Dt has measure zero. The
gradients F(x, t) = {∂y_i(x, t)/∂x_j} of the mapping x ↦ y(x, t) in the laboratory frame
are given by F = R F^p, where R and F^p correspond to rigid body rotation and lattice
distortion, respectively (Fig. 8.23). The velocity gradient L = {∂ẏ_i(x, t)/∂y_j} in
the current coordinates can be calculated from L = Ḟ F^{−1} by the definition of F and
Φ̇ = Λ12 = W12 − (1/2) Σ_α γ̇^α  (8.123)

for the evolution of the lattice orientation, since q^α_{11} = q^α_{22} = 0 and q^α_{12} = −q^α_{21} =
1/2 (Fig. 8.22).
Note: If the crystal has a single slip system, there is no solution.

If the crystal has two slip systems, the unknown shearing rates γ̇^α, α = 1, 2, can
be calculated from the definition of G in Eq. 8.121. Then Eq. 8.122 yields a differential
equation for the angle Φ between the current lattice frame and the laboratory frame.

If the crystal has three or more slip systems, the constitutive laws of the slip systems
are needed to find γ̇^α. The symmetric part of the velocity gradient can be given in the form

can be used to determine γ̇^α, with S^α_{cr} obtained, for example, from Eq. 8.120. The
evolution of the lattice orientation follows from Eq. 8.122. ▲
Example 8.51: Consider an infinite planar crystal with two slip systems. The
crystal is subjected to plastic deformation with velocity gradient L. Let Φ(t)
denote the angle between ē1 and e1 at time t (Fig. 8.22), where ē1 is aligned with
the first slip system, that is, ē1 = s¹, and θ is the angle between the two slip
systems of the crystal. The solution of the differential equation

Φ̇(t) = (L12(t) − L21(t))/2 − (sec(2θ)/2) [(L12(t)
and
Because the grain has two slip systems, the slip shearing rates γ̇^α, α = 1, 2, can be
determined from the above equations as functions of G11 and G22. These expressions of
γ̇^α, α = 1, 2, and Eq. 8.123 yield a differential equation for Φ with known coefficients
and input.
We also note that the velocity gradient can be decomposed into three parts corresponding
to pure compression, pure shear, and spin modes,
(8.126)
Example 8.52: Suppose that the crystal in Example 8.51 is subjected to a pure
time-invariant compression, that is, L12(t) = L21(t) = 0 and L11 is a constant.
The lattice orientation angle Φ (Fig. 8.22) satisfies the differential equation
Φ̇(t) = −L11 sec(2θ) sin(2Φ(t)) and is

where Φ(0) denotes the initial state. The fixed points of the differential equation
for Φ are φ0 = n (π/2), where n is an integer.
Figure 8.24 shows trajectories of the lattice orientation angle Φ for L11 = 1,
several initial conditions Φ(0) ∈ (−π/2, π/2), and an angle θ = π/6 between
[Figure 8.24: trajectories of the lattice orientation angle Φ versus time, t ∈ [0, 2]]
the two slip systems of the crystal. The trajectories of the orientation angle
approach the stable equilibrium point φ0 = 0 as t → ∞, so that the density of Φ(t)
approaches a Dirac delta function centered at zero as time increases, irrespective
of the initial state Φ(0). Hence, the atomic lattice evolves in time and its evolution
emerges in a pattern [11, 119].
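This behavior can be sketched numerically by integrating Φ̇(t) = −L11 sec(2θ) sin(2Φ(t)) from Example 8.52 with forward Euler; the step size and horizon below are arbitrary choices.

```python
import math

# Sketch: forward-Euler integration of the lattice-orientation equation
# d(Phi)/dt = -L11 * sec(2*theta) * sin(2*Phi) from several initial conditions.
# Trajectories starting near zero should approach the stable fixed point phi0 = 0.

def integrate(phi0, L11=1.0, theta=math.pi / 6, dt=1e-3, t_end=10.0):
    phi = phi0
    sec2t = 1.0 / math.cos(2.0 * theta)
    for _ in range(int(t_end / dt)):
        phi += dt * (-L11 * sec2t * math.sin(2.0 * phi))
    return phi

for phi0 in (-0.6, -0.3, 0.3, 0.6):
    print(phi0, integrate(phi0))   # end values are all near 0
```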
The trajectories of the lattice orientation are less simple if L11 is time dependent.
For example, suppose that L12 = L21 = 0 and that L11 is a stationary
Gaussian process with mean zero, variance one, and one-sided spectral density
g(ν) = 2β/[π (ν² + β²)], ν ≥ 0. Figure 8.25 shows a sample of L11 for β = 0.1
Figure 8.25. Evolution of lattice orientation for pure time-variant random traction
and the corresponding sample of Φ for Φ(0) = 0.5 and θ = π/6. The sample
path of the lattice orientation Φ oscillates between the fixed points 0 and π/2
because the equilibrium position depends on the sign of L11, which changes in time
[11, 119]. In this case the lattice orientation Φ evolves but no pattern emerges
from its evolution. ♦
Note: Let z be a real-valued function defining the state of a physical system specified by
the differential equation ż(t) = h(z(t), θ), where h is a known function and θ denotes
a vector of parameters. The solutions z0(θ) with the property h(z0(θ), θ) = 0 are the
singular or fixed points of the above differential equation. Hence, a singular point is also
an equilibrium point, since ż(t) is zero at z0(θ). A fixed point is said to be stable if the
solution z starting in a small vicinity of z0(θ) remains near z0(θ) as t → ∞. If a fixed
point does not have this property, it is said to be unstable.

Several methods can be used to determine whether a fixed point is or is not stable.
For example, the linearization technique ([104], Chapter 8, [119]) considers the differential
equation
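The linearization test can be sketched numerically: a fixed point z0 of ż = h(z, θ) is stable when ∂h/∂z is negative at z0. The finite-difference check below is an illustration applied to the lattice equation of Example 8.52, not the book's procedure.

```python
import math

# Sketch of the linearization test: approximate dh/dz at a fixed point z0 by a
# central difference; the fixed point is stable when the slope is negative.
# h below is h(phi) = -L11 * sec(2*theta) * sin(2*phi) with L11 = 1, theta = pi/6.

def h(phi, L11=1.0, theta=math.pi / 6):
    return -L11 * math.sin(2.0 * phi) / math.cos(2.0 * theta)

def is_stable(z0, f=h, eps=1e-6):
    slope = (f(z0 + eps) - f(z0 - eps)) / (2.0 * eps)
    return slope < 0.0

print(is_stable(0.0))          # phi0 = 0 is stable for L11 > 0
print(is_stable(math.pi / 2))  # phi0 = pi/2 is unstable for L11 > 0
```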
8.6.2.2 Polycrystals
A polycrystal is an ensemble of a large number of crystals interacting with
each other. Consider a polycrystal occupying an open bounded subset D of ℝ^d
with boundary ∂D, subjected to a specified traction and velocity on ∂D_s and ∂D_v,
respectively, where ∂D_s ∪ ∂D_v = ∂D and ∂D_s ∩ ∂D_v = ∅. The objective
is to find the velocity and stress fields in D. The stress S and velocity V in
the polycrystal, viewed as a continuum, must satisfy (1) equilibrium conditions,
S_{ij,j} + b_i = 0, where b_i denote body forces, (2) kinematic constraints, L =
grad(V) = G + W, (3) a constitutive law, for example, S* = K G, where K
represents the material stiffness that can be calculated from the crystal properties,
and (4) an incompressibility requirement, div(V) = V_{i,i} = 0.
The material stiffness K depends in a complex way on the properties of the
constituent grains and their interaction. Generally, K is determined by the finite
element method based on:

(1) The relationships Gg = Cg Sg and Sg = Kg Gg derived from Eqs. 8.121 and
8.122, where Cg and Kg denote the grain compliance and stiffness, respectively,
Gg is the symmetric part of the plastic velocity gradient in a grain, and Sg denotes
the deviatoric part of the stress tensor in a grain. The parameters Cg and Kg
depend on the resolved shear stress and the Schmid tensor in the grain.
(2) Linking hypotheses relating the microscopic and macroscopic solutions. The
Taylor and Sachs hypotheses are frequently used.

- The Taylor hypothesis states that (a) the macroscopic and microscopic velocity
fields coincide, that is, G = Gg, and (b) the macroscopic deviatoric
stress tensor S is equal to the average of the deviatoric tensor Sg over the
crystal volume, that is, S = ⟨Sg⟩, where ⟨·⟩ denotes volume average.
Accordingly, we have the macroscopic constitutive law S = K G, where
K = ⟨Kg⟩. The Taylor hypothesis provides an upper bound analogous
to the Voigt average in Eq. 8.94.

- The Sachs hypothesis states that (a) the macroscopic and microscopic deviators
coincide, that is, S = Sg, and (b) the macroscopic velocity is equal
to the average velocity over the crystal volume, that is, G = ⟨Gg⟩.
Accordingly, we have the macroscopic constitutive law G = C S, where
C = ⟨Cg⟩.
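For scalar grain moduli the two hypotheses reduce to arithmetic and harmonic volume averages, which can be checked directly; the grain stiffness values below are hypothetical.

```python
# Sketch: for one-dimensional grain moduli, the Taylor hypothesis gives
# K_Taylor = <K_g> (arithmetic average) and the Sachs hypothesis gives
# C = <C_g> = <1/K_g>, i.e. K_Sachs = 1/<1/K_g> (harmonic average).  As with
# the Voigt/Reuss bounds of Eq. 8.94, Taylor bounds Sachs from above.

def taylor_stiffness(k_grains):
    """Volume average of the grain stiffnesses, K = <K_g>."""
    return sum(k_grains) / len(k_grains)

def sachs_stiffness(k_grains):
    """Inverse of the averaged grain compliances, K = 1 / <C_g>."""
    return len(k_grains) / sum(1.0 / k for k in k_grains)

k_grains = [1.0, 2.0, 4.0, 8.0]     # hypothetical grain stiffnesses
print(taylor_stiffness(k_grains))   # 3.75
print(sachs_stiffness(k_grains))    # 2.1333...
```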
Example 8.53: The calculation of the evolution of the stress, strain, and lattice
orientation in a polycrystal involves three steps. First, the constituent grains of the
polycrystal need to be characterized. Second, the finite element method can be
used to calculate the evolution of the polycrystal state variables. Third, statistical
methods need to be used to estimate global features of the polycrystal behavior.
Euler angle measurements of the atomic lattice orientation in a material
specimen can be used to develop probabilistic models for the geometrical
properties of the grains in a polycrystal. For example, Figs. 8.26 and 8.27 show
three-dimensional and contour plots of Φ1 for aluminum AL 7075, where Φ1 is one
of the three Euler angles defining the atomic lattice orientation. The plots corre-
Figure 8.26. Experimental data of <I> 1 and a sample of its translation random field
model. Three-dimensional representation
654 Chapter 8. Stochastic Systems and Deterministic Input
Figure 8.27. Experimental data of <I> 1 and a sample of its translation random field
model. Contour lines
have been obtained for a polycrystal subjected to a 5% initial prestrain and subse-
quently to strain cycles of amplitude 3%. Figure 8.29 shows the number of grains
[Figure 8.29: number of grains versus stress level s, curves for 1, 20, 50, and 100 strain cycles]
in the polycrystal with stresses exceeding a stress level s after 1, 20, 50, and 100
strain cycles. The plot provides information on the global state of stress in the
polycrystal and its evolution in time. <>
for an initial state X(0) = x0, where α ∈ ℝ^{d'} is a parameter, W ∈ ℝ^{d'} denotes a
white noise process, and d, d' ≥ 1 are integers. It is assumed that Eq. 8.127 has a
stationary solution denoted by Xs. If Xs = 0 satisfies Eq. 8.127, we say that this
equation admits a trivial stationary solution. For example, Xs = 0 is a solution of
Eq. 8.127 if this equation is homogeneous.
A main objective of stochastic stability analysis is the determination of subsets
of ℝ^{d'} including all values of α for which solutions X of Eq. 8.127 starting
in a small vicinity of a stationary solution Xs converge to Xs as time increases
indefinitely. If this behavior is observed for almost all samples of X, then Xs is
said to be stable a.s. Otherwise, Xs is unstable and a bifurcation, that is, a
qualitative change of this solution, may occur. Stability analysis is relevant for many
aerospace, mechanical, and structural systems that exhibit nonlinear behavior [9].

Generally, the functional form of the density of Xs and the Lyapunov exponent
are used to assess whether Xs is or is not stable [6, 24, 102, 130, 191]. The
Lyapunov exponent. If

λ_LE = lim_{t→∞} (1/t) ln ‖X̃(t)‖ < 0 or > 0,  (8.128)

then the stationary solution Xs is stable or unstable a.s., respectively. If λ_LE =
0, then Xs may or may not be stable [9, 10].
Note: Generally, the calculation of the Lyapunov exponent involves two steps. First, a
differential equation needs to be developed for the process X̃ = X − Xs. An approximate
version of this equation, obtained by linearization about Xs, is used in the analysis.
Accordingly, X̃ is the solution of the differential equation

dX̃_i(t) = Σ_{j=1}^{d} a_{ij} X̃_j(t) dt + Σ_{k=1}^{d'} Σ_{j=1}^{d} b^k_{ij} X̃_j(t) dB_k(t),  i = 1, ..., d,

where B_k, k = 1, ..., d', are Brownian motions independent of each other and the
coefficients a_{ij}, b^k_{ij} may depend on Xs. Hence, X̃ is a diffusion process conditional on
Xs. Second, the Itô formula can be used to show that S(t) = X̃(t)/‖X̃(t)‖ is a diffusion
process on the unit sphere in ℝ^d. It can be proved under some mild conditions
that the top Lyapunov exponent is given by λ_LE = E[q(S(t))] = ∫ q(s) μ(ds), where
q(s) = ⟨(ā − b̄ b̄ᵀ) s, s⟩ + (1/2) tr(b̄ b̄ᵀ), ā = {a_{ij}}, the entries of b̄ are Σ_{j=1}^{d} b^k_{ij} s_j,
and μ(·) denotes the marginal probability density function of S, assumed to be ergodic
[112]. Theoretical considerations on Lyapunov exponents and their use in applications can
be found in [6, 7, 13]. ▲
Example 8.54: Let X(t), t ≥ 0, be an ℝ²-valued diffusion process defined by

dX1(t) = a X1(t) dt + σ (X1(t) dB1(t) + X2(t) dB2(t))
dX2(t) = b X2(t) dt + σ (X2(t) dB1(t) − X1(t) dB2(t)),

where a, b, σ are some constants and B1, B2 denote independent Brownian motion
processes. The trivial stationary solution of the above stochastic differential
equation is stable a.s. if (a + b) I0(β) + (a − b) I1(β) < 0, where β =
(a − b)/(2σ²) and I0, I1 denote modified Bessel functions of the first kind. ♦
8.7. Stochastic stability 657
so that Φ is a diffusion process. The stationary density f(φ) of Φ satisfies the Fokker-
Planck equation

d/dφ [ (b − a) cos(φ) sin(φ) f(φ) ] = (σ²/2) d²f(φ)/dφ²,

and has the expression f(φ) = c exp( 2β cos(φ)² ), where c > 0 is a constant such that
∫_0^{2π} f(φ) dφ = 1 and f satisfies the boundary condition f(0) = f(2π). Elementary
calculations give c = 1/(2 e^β π I0(β)).

The Lyapunov exponent λ_LE = E[q(S(Φ(t)))] = ∫_0^{2π} q(S(φ)) f(φ) dφ has the
expression

λ_LE = (1/(2 I0(β))) [ (a + b) I0(β) + (a − b) I1(β) ].

Hence, the trivial stationary solution is stable a.s. if the stated condition is satisfied. For
example, the stationary trivial solution is stable a.s. for a = −1, b = −2, and σ = 1 but
unstable for a = 1, b = 2, and σ = 1. ▲
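The stated condition can be checked numerically. The sketch below evaluates λ_LE = [(a + b) I0(β) + (a − b) I1(β)]/(2 I0(β)), the expression derived above, computing the modified Bessel functions from their power series.

```python
import math

# Sketch: numerical check of the stability condition in Example 8.54, with
# beta = (a - b)/(2 sigma^2) and I0, I1 computed from their power series.

def bessel_i(nu, x, terms=40):
    """Modified Bessel function I_nu(x) for nu = 0 or 1, by power series."""
    return sum((x / 2.0) ** (2 * k + nu) / (math.factorial(k) * math.factorial(k + nu))
               for k in range(terms))

def lyapunov(a, b, sigma):
    beta = (a - b) / (2.0 * sigma**2)
    return ((a + b) * bessel_i(0, beta) + (a - b) * bessel_i(1, beta)) / (2.0 * bessel_i(0, beta))

print(lyapunov(-1.0, -2.0, 1.0))  # negative: trivial solution stable a.s.
print(lyapunov(1.0, 2.0, 1.0))    # positive: unstable
```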
that is approximated by

where ξ(t) = β − m Xs(t)^{m−1} and X(t)^m − Xs(t)^m ≈ m Xs(t)^{m−1} X̃(t). The
approximation is acceptable since we consider only small initial deviations from
the stationary solution. If λ_LE is strictly negative, then almost all samples of the
solution of Eq. 8.131 converge to zero as t → ∞, and we say that the stationary
solution of Eq. 8.129 is stable a.s.
λ_LE = α for Xs = 0 and λ_LE = (1 − m) α for Xs ≠ 0,

so that the trivial and non-trivial solutions of Eq. 8.129 are stable a.s. for α < 0 and
α > 0, respectively. Figure 8.30 shows five samples of the solution of Eq. 8.129
with S = σ B, α = ±1, σ = 1, m = 3, and X(0) = 0.1. For α = −1, the
samples of X approach the trivial solution Xs = 0 as time increases. The samples
[Figure 8.30: five samples of X(t) for α = −1 and α = 1, t ∈ [0, 20]]
The density of Xs is f_s(x) = c x^{2α/σ² − 1} exp[ −2 x^{m−1}/((m − 1) σ²) ] for
α > 0, where c > 0 is a constant. This density is the solution of the Fokker-Planck
equation for Xs (Section 7.3.1.4). For α < 0 the density of Xs is concentrated at
zero, that is, f_s(x) = δ(x). We note that f_s for α > 0 has different functional
forms for α < α_cr and α > α_cr, where α_cr = σ²/2. Figure 8.31 shows the density
f_s for m = 3, σ = 1, and several values of α. The thin and heavy lines correspond
to values of α smaller and larger than α_cr.
[Figure 8.31: densities f_s for m = 3, σ = 1, and several values of α]
Proof: The solution of Eq. 8.131 with S = σ B for a deterministic function ξ(·) is a
geometric Brownian motion (Section 4.7.1.1). Accordingly, we have

so that E[Xs(t)^{m−1}] = β − σ²/2 = α because the left side of the above equation is zero.

If Xs is ergodic, then (1/t) ∫_0^t Xs(u)^{m−1} du converges a.s. to E[Xs(t)^{m−1}] =
α as t → ∞, so that lim_{t→∞} Y(t) = α a.s. for Xs = 0 and lim_{t→∞} Y(t) = α −
m E[Xs(t)^{m−1}] = (1 − m) α for Xs ≠ 0. These asymptotic values of Y coincide with the
mean of this process at any t ≥ 0. Since X̃(t) = X̃(0) e^{t Y(t)}, the trivial and non-trivial
stationary solutions are stable a.s. if α < 0 and (1 − m) α < 0, respectively.
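The value λ_LE = α for the trivial solution can be checked by simulation. Assuming, as the proof states, that the linearized Eq. 8.131 with S = σB and constant ξ(t) = β is a geometric Brownian motion, (1/t) ln|X̃(t)| converges to β − σ²/2 = α; the path count and step size below are arbitrary choices.

```python
import math, random

# Sketch: Monte Carlo estimate of the Lyapunov exponent of the geometric
# Brownian motion X(t) = X(0) exp((beta - sigma^2/2) t + sigma B(t)); the
# estimate (1/T) ln|X(T)/X(0)| should approach alpha = beta - sigma^2/2.

random.seed(2)

def lyapunov_estimate(beta, sigma, t_end=200.0, dt=0.01):
    """Average of (1/T) ln|X(T)/X(0)| over a few sample paths."""
    n = int(t_end / dt)
    estimates = []
    for _ in range(20):
        log_x = 0.0   # ln(X(t)/X(0)); exact update for geometric Brownian motion
        for _ in range(n):
            log_x += (beta - 0.5 * sigma**2) * dt + sigma * random.gauss(0.0, math.sqrt(dt))
        estimates.append(log_x / t_end)
    return sum(estimates) / len(estimates)

est = lyapunov_estimate(beta=-0.5, sigma=1.0)
print(est)   # close to alpha = beta - sigma^2/2 = -1.0
```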
We now examine the stability of the stationary solution Xs by calculating the
Lyapunov exponent λ_LE in Eq. 8.128. Consider the approximate definition of X̃ in
Eq. 8.131 with S = σ B. The differential equation for R(t) = ln(X̃(t)²) is (Itô's formula)
Figure 8.32. Boundary a*(A.) between stability regions for the trivial and non-
trivial stationary solutions
it converges to the bifurcation point between the trivial and non-trivial stationary
solutions in Example 8.55. The result is not surprising since C becomes a version
of the driving noise σ B as λ → ∞.

Figure 8.33 shows samples of X with X(0) = 0.5, σ = 1, Y_1 ~ U(−a, a),
a = σ √3/3, λ = 30, and two values of the parameter α, α = 0.015 < α*(30)
and α = 1 > α*(30). Consistently with the stability regions in Fig. 8.32, the
[Figure 8.33: samples of X(t) for α = 0.015 (left) and α = 1.0 (right), t ∈ [0, 200]]
where C*(t) = Σ_{k=1}^{N(t)} ln(1 + Y_k) and Z(t) = (1/t) ∫_0^t ξ(s−) ds + C*(t)/t. The Itô
formula, applied to the function

whose arguments (t, x) correspond to the ℝ²-valued process (X1, X2) defined by dX1(t) =
dt and dX2(t) = dC*(t), gives

g(t, C*(t)) − g(0, C*(0)) = ∫_{0+}^{t} ξ(s−) g(s, C*(s−)) ds
    + Σ_{0<s≤t} [ g(s, C*(s−) + ΔC*(s)) − g(s, C*(s−)) ]

Z(t) = β − (m/t) ∫_{0+}^{t} Xs(u−)^{m−1} du + C*(t)/t

converges a.s. to β − m E[Xs(t−)^{m−1}] + λ E[ln(1 + Y_1)] as t → ∞. This solution
is stable a.s. for α > α*(λ) because E[Xs(t−)^{m−1}] = β + λ E[ln(1 + Y_1)]. To obtain
E[Xs(t)^{m−1}], note that the Itô formula applied to the function ln(Xs(t)²) gives

by averaging. Because Xs is a stationary process, the left side of the above equality is zero,
so that E[Xs(t−)^{m−1}] = β + λ E[ln(1 + Y_1)].

As in the previous example, the Lyapunov exponent is not needed since we know
the exact solution, but it is calculated for illustration. The Lyapunov exponent can be found
by Itô's formula applied to R(t) = ln(X(t)²). This formula gives
8.8.1 Soil liquefaction
Soils are heterogeneous materials whose properties exhibit significant spatial
fluctuations. Under external actions, soil deposits can experience a phase
transition. For example, sands and other cohesionless soils can change abruptly
their consistency from solid to near fluid if subjected to cyclic actions. This phase
transition phenomenon, called liquefaction, occurs if the pore water pressure
exceeds a critical value. Soil liquefaction can have catastrophic effects on buildings
and bridges during seismic events. Generally, relatively small volumes of a soil
deposit liquefy during an earthquake. The pockets of liquefied soil appear to have
random size and location [144].
Our objective is to (1) assess the potential for liquefaction of cohesionless
soil deposits subjected to earthquakes and (2) examine the spatial distribution and
the size of liquefied soil pockets. The potential for liquefaction is measured by the
pore water pressure. It is assumed that liquefaction occurs at a point x in a soil
deposit if the pore water pressure at x exceeds a critical value.
The cone tip resistance Z1 and the soil classification index Z2 largely control
the occurrence and the extent of liquefaction in a soil deposit. Because Z1 and
Z2 exhibit a significant spatial variation, it has been proposed to model them by
random functions [144]. Let Z = (Z1, Z2) be an ℝ²-valued random field defined
on an open bounded subset D of ℝ³. Because Z1 and Z2 are bounded, Z cannot be
approximated by a Gaussian random field. We model Z by a translation random
field defined by
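A one-dimensional sketch of the translation-field construction: a correlated Gaussian sequence is mapped through the standard normal CDF and then through an inverse marginal CDF. The book fits beta marginals to Z1 and Z2 [144]; the uniform marginal and AR(1) correlation used below are simplifications chosen only to keep the inverse CDF elementary.

```python
import math, random

# Sketch of a translation random field in one dimension:
# Z(x) = F^{-1}( Phi( G(x) ) ), where G is a correlated Gaussian sequence,
# Phi is the standard normal CDF, and F is a (here uniform) marginal CDF.
# The result is bounded and non-Gaussian, unlike G itself.

def phi(g):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(g / math.sqrt(2.0)))

def translation_sample(n, rho=0.9, lo=0.0, hi=10.0, seed=3):
    """Sample Z on n grid points; G is an AR(1) Gaussian sequence with
    lag-one correlation rho (a stand-in for the fitted correlation of [144])."""
    rng = random.Random(seed)
    g = rng.gauss(0.0, 1.0)
    z = []
    for _ in range(n):
        z.append(lo + (hi - lo) * phi(g))   # bounded marginal on (lo, hi)
        g = rho * g + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
    return z

z = translation_sample(1000)
print(min(z) >= 0.0 and max(z) <= 10.0)   # True: Z is bounded, unlike G
```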
Example 8.57: Consider a vertical section through a soil deposit D and let Dv =
(0, a) × (0, b) ⊂ ℝ² be a rectangle included in this section, with the sides of length
a and b oriented in the horizontal and vertical directions, respectively (Fig. 8.34).
The soil deposit in D is subjected to the deterministic ground acceleration x(t)
in Fig. 8.34. Data analysis shows that the random fields Z1 and Z2 follow beta
distributions. The parameters of these distributions and the correlation functions
of Z are given in [144].
Figure 8.35 shows contour lines for six samples of the pore water pres-
sure random field W corresponding to six independent samples of Z. The darker
8.8. Localization phenomenon 665
[Figure 8.34: soil deposit Dv with a = 60 m, b = 12 m, and boundary conditions]
Figure 8.35. Six samples of the pore water pressure random field W
shades in the figure are for larger values of W, and indicate an increased potential
for liquefaction. The samples of W are consistent with observed liquefaction
patterns [144]. The size and the location of the pockets of liquefaction show a
remarkable sample-to-sample variation. ♦
Note: An extensive data set has been used in [144] to estimate the probability law of the
random field Z. If it is assumed that the soil deposit in D is homogeneous, then liquefaction
either does not occur or occurs everywhere in D, in contrast with field observations. ▲
their neighbors will vibrate with even smaller amplitudes, so that the modes of
vibration of the dynamic system in Fig. 8.36 with slightly different oscillators and
weak couplings will be localized.
It is shown that properties of products of random matrices and the Lyapunov
exponents introduced in Section 8.7 can be used to find the rate of decay of the
localized modes.
Let Xn = An Xn−1, n = 1, 2, ..., where A1, A2, ... are independent
copies of a (d, d) matrix A with real-valued entries and X0 = x0 ∈ ℝ^d. Note that
the sequence Xn = Cn X0, n = 1, 2, ..., is an ℝ^d-valued Markov chain, where
Cn = An An−1 ··· A1.

We present some properties of the matrix Cn and define the upper Lyapunov
exponent for the sequence Xn.
Proof: Because E[(ln ‖A1‖_m)⁺] < ∞, the expectation E[ln ‖A1‖_m] exists but may
by properties of the norm, the definition of Cn, and the assumption that the matrices Ak
are independent and have the same distribution. The subadditivity of ln ‖Cn‖_m implies
the convergence

(1/n) E[ln ‖Cn‖_m] → inf_{q≥1} (1/q) E[ln ‖Cq‖_m], as n → ∞,
λ_LE = lim_{n→∞} (1/n) ln ‖Cn‖_m = lim_{n→∞} (1/n) ln ‖An ··· A1‖_m  (8.133)

holds a.s. and is called the upper or the top Lyapunov exponent ([24], Theorem 4.1,
p. 11).
Note: The result in Eq. 8.133 is a statement of a theorem by Furstenberg and Kesten [69].
Recall that we have used the upper Lyapunov exponent in Section 8.7 to assess the
sample stability of the stationary solutions of differential equations driven by Gaussian and
Poisson white noise. We have seen that these solutions are stable a.s. if λ_LE < 0. The
Lyapunov exponent in Eq. 8.133 can also be used for the stability analysis of discrete
systems with random properties, for example, a system with state Xn, n = 0, 1, ..., defined by
the recurrence relationship Xn = An Xn−1, n = 1, 2, ..., and a specified initial state X0.
Such relationships also result from discrete time approximations of stochastic differential
equations of the type in Eq. 8.131.
That the Lyapunov exponent gives the rate of growth of the solution is consistent
with our intuition. For example, let a_i, i = 1, 2, ..., be deterministic (d, d) matrices and
define the sequence x_n(x0) = a_n a_{n−1} ··· a_1 x0 = c_n x0, n = 1, 2, ..., starting from a
deterministic initial state x0. Let a^{(n)}_{min} and a^{(n)}_{max} denote the smallest and largest
eigenvalues of c_nᵀ c_n. We have (Section 8.3.2.1)
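The limit in Eq. 8.133 can be estimated by propagating a vector through the random matrix product and renormalizing at each step to avoid overflow; the scaled-rotation matrices below are a hypothetical test case, not from the text, for which λ_LE = ln c exactly.

```python
import math, random

# Sketch: Monte Carlo estimate of the top Lyapunov exponent of Eq. 8.133,
# lambda_LE = lim (1/n) ln ||A_n ... A_1||, by propagating a unit vector and
# accumulating the log of the norm growth at each step.

def top_lyapunov(matrix_sampler, n=20000, seed=4):
    rng = random.Random(seed)
    x = [1.0, 0.0]
    log_sum = 0.0
    for _ in range(n):
        a = matrix_sampler(rng)
        y = [a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1]]
        norm = math.hypot(y[0], y[1])
        log_sum += math.log(norm)
        x = [y[0] / norm, y[1] / norm]   # renormalize to avoid overflow
    return log_sum / n

def scaled_rotation(rng, c=2.0):
    """c times a rotation by a random angle; norms grow by exactly c per step."""
    t = rng.uniform(0.0, 2.0 * math.pi)
    return [[c * math.cos(t), -c * math.sin(t)], [c * math.sin(t), c * math.cos(t)]]

print(top_lyapunov(scaled_rotation))   # ln 2 = 0.6931...
```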
We have seen in Section 8.7 that systems with dimension d ≥ 2 have more
than one Lyapunov exponent. For stability analysis the upper Lyapunov exponent
is relevant. However, for localization we need the smallest positive Lyapunov
exponent, since it gives the least rate of decay of the state Xn, and this rate provides
information on the extent of the localization. We denote this Lyapunov exponent
by λ̃_LE. Additional information on the use of the Lyapunov exponent for mode
localization analysis and methods for calculating λ̃_LE in some special cases can
be found in [198] and [122] (Chapter 9).
Figure 8.37. A perfectly periodic continuous beam
that the original spans become Li = l + Ri, i = 1, ..., n, where Ri are independent
copies of a random variable R such that |R| ≪ l/2 a.s. The resulting beam
is referred to as a randomly disordered or nearly periodic beam.
Figures 8.38 and 8.39 show the first five modes of vibration of the perfectly
periodic and randomly disordered beams, respectively, for l = 1, s = 20, n =
100, R ~ U(−0.03, 0.03), and unit mass per unit of length. The modes of the
periodic beam extend over all spans. On the other hand, the modes of the randomly
disordered beam are localized on only a few spans. ♦
Note: The modal shapes in Figs. 8.38 and 8.39 were calculated by using the finite element
method. Two beam elements were used in each span of the perfectly periodic and randomly
disordered beams, so that both beams were modeled by discrete systems with 2 n + 1
degrees of freedom. An algorithm written in MATLAB was used for the modal analysis.
The example is taken from [198].
Let Θi be the rotation at support i = 1, ..., n + 1 and let Θ^{(i)} = (Θi, Θi+1).
Then Θ^{(i)} = T^{(i)} Θ^{(i−1)}, where T^{(i)} denotes a transition matrix that can be obtained
by classical methods of structural analysis. For the perfectly periodic beam the transition
matrices are deterministic and identical. On the other hand, for the randomly disordered
beam the matrices T^{(i)} are independent and identically distributed. The recurrence formula
for Θ gives

Θ^{(k)} = T^{(k)} ··· T^{(2)} Θ^{(1)}

so that ‖Θ^{(k)}‖ = ‖T^{(k)} ··· T^{(2)} Θ^{(1)}‖ and

‖Θ^{(k)}‖ = (‖T^{(k)} ··· T^{(2)} Θ^{(1)}‖ / ‖Θ^{(1)}‖) ‖Θ^{(1)}‖ ≤ ‖T^{(k)} ··· T^{(2)}‖_m ‖Θ^{(1)}‖,
[Figure 8.38: first five modes of vibration of the perfectly periodic beam, extending over all 201 degrees of freedom]

[Figure 8.39: first five modes of vibration of the randomly disordered beam, localized on a few spans]
so that

‖Θ^{(k)}‖ ~ e^{λ̃_LE (k−1)} ‖Θ^{(1)}‖, as k → ∞,

where λ̃_LE is the smallest positive Lyapunov exponent corresponding to the random
matrices T^{(k)}. The Lyapunov exponent λ̃_LE in the above equation gives the rate of decay of
the rotation at the supports, and is referred to as a localization factor. ▲
8.9 Problems
8.1: Consider a modified version of the partial differential equation in Exam-
ple 8.1 obtained by replacing its right side with an integrable function p(x),
x E D. Find the local solution of the resulting partial differential equation. Extend
your results to the case in which p(x) is a random field.
8.2: Extend the local solution of the partial differential equation in Problem 8.1
to the case in which D is a random set. Assume that the point x at which the local
solution is determined is in D a.s.
8.5: Solve the problem in Example 8.8 for Y being a shifted exponential random
variable with the density f(y) = λ e^{−λ (y−a)}, y ≥ a, where λ > 0 and a are some
constants.
8.6: Find the first two moments of X defined by Y X = q by both versions of the
iteration method, where Y = a + R, R ~ U(−α, α), and 0 < α < a.
8.7: Let S = (S11, S12 = S21, S22) be a Gaussian vector, where Sij are random
stresses. Suppose that the coordinates of S have the mean values (100, −60, 300),
are equally correlated with correlation coefficient ρ = 0.7, and have the same
coefficient of variation v = 0.3. Find the first two moments of the eigenvalues
and eigenvectors of the stress tensor {Sij}.
8.8: Consider an infinite thin plate with a through crack of length 2a. The plate
is subjected to uniform far field stresses, a tension Q1 perpendicular to the crack
and a shear Q2. Suppose that (Q1, Q2) is a Gaussian vector with mean (10, 4),
coefficients of variation (0.3, 0.3), and correlation coefficients ρ = 0, 0.7, and
0.9. Let Θ be the direction of crack extension. Find the distribution of Θ based
on the relationship Q1/Q2 = [1 − 3 cos(Θ)]/sin(Θ).
8.9: Find second order Taylor approximations for the eigenvalues and eigenvec-
tors of a square random matrix A.
8.11: Find the second moment properties of the eigenvalues and eigenvectors of
the random matrix in Example 8.12 by the iteration method.
8.12: Let G be an ℝⁿ-valued Gaussian variable with mean μ and covariance matrix γ. Find the mean zero-crossing rate ν(λ) of the stochastic process V(λ) = λⁿ + G₁ λ^{n−1} + · · · + G_{n−1} λ + G_n. Use your results to calculate ν(λ) for n = 3, μ = 0, and γ equal to the identity matrix.
8.14: Let U be the solution of Eq. 8.45 with D given by Eq. 8.53, where ε is a small parameter. Find expressions for the mean of U to the order ε² and for the correlation function of U to the order ε⁴. Calculate the confidence interval (E[U] − Std[U], E[U] + Std[U]) on U to the order ε².
8.15: Let U be the solution of d⁴U(x)/dx⁴ + N(x, U(x)) = q, x ∈ (0, l), with the boundary conditions U(0) = 0, dU(0)/dx = 0, U(l) = 0, and d²U(l)/dx² = 0, where q is constant and N(x, U(x)) is a nonlinear random function. Derive equations for the second moments of U(x) for N(x, U(x)) = A(x) U(x) + B(x) U(x)³, where A and B are square integrable random fields that are uncorrelated with each other.
8.16: Repeat the analysis in Problem 8.15 by using a finite difference approxima-
tion for the partial differential equation defining U.
8.19: Apply the Monte Carlo simulation method to estimate the second moment
properties and the distributions of the first three eigenvalues and eigenfunctions
of the random eigenvalue problem in Problem 8.18. Assume that Z is uniformly
distributed in (1, 2).
8.21: Generate additional samples of the process in Example 8.56 for different
values of A and a, and determine whether the observed sample properties are
consistent with the results in Fig. 8.32.
Chapter 9

Stochastic Systems and Input
9.1 Introduction
In most stochastic problems both the system properties and the input char-
acteristics are random. However, the degree of uncertainty in the input and the
system can be very different. For example, it is common in earthquake engineer-
ing to assume that the structural system is deterministic because of the very large
uncertainty in the seismic ground acceleration. Under this simplifying assumption
we can evaluate the seismic performance of structural systems by the methods in
Chapter 7. Similarly, developments in Chapters 6 or 8 can be applied to stochastic
problems characterized by a negligible uncertainty in both the system and input
or the input alone, respectively.
This chapter considers stochastic problems in which both the system and
input uncertainty have to be modeled. Methods for analyzing this type of stochastic
problem are discussed primarily in Section 9.2. The subsequent sections in this
chapter present applications from mechanics, physics, environment and ecology,
wave propagation in random media, and seismology. Some brief considerations
on model selection conclude the chapter. We start with a summary of the main
sources of uncertainty for general stochastic problems.
• Randomness in system and input properties. For example, the atomic lat-
tice orientation in metals exhibits random spatial variation (Section 8.6.2.2),
identically designed light bulbs and other physical systems have different
service life, properties of geological formations are characterized by notable
spatial fluctuations, the details of the next earthquake at a site in California
are not known.
• Finite information. For example, the functional form and the parameters of
most mathematical models used in finance, physics, mechanics, and other
fields of applied science and engineering need to be estimated from finite
674 Chapter 9. Stochastic Systems and Input
records so that they are uncertain and the degree of uncertainty depends on
the available sample size.
• Limited understanding and model simplification. For example, the un-
derlying principles of a phenomenon may not be well understood and/or
may be too complex for modeling and analysis so that simplified repre-
sentations need to be considered. The prediction of the outcome of a coin
tossing experiment is possible in principle because the coin can be viewed
as a rigid body subjected to an initial condition. However, the high sen-
sitivity of the outcome of this experiment to initial conditions, properties
of the landing surface, and many other parameters render the mathematical
modeling of this experiment impractical if not impossible. We need to settle
for a global characterization of the coin tossing experiment giving the
probability of seeing the head or the tail in a toss.
This chapter examines stochastic problems defined by

𝒟[X(x, t)] = Y(x, t), t ≥ 0, x ∈ D ⊂ ℝ^q, (9.1)

where D is a subset of ℝ^q, 𝒟 can be an algebraic, integral, or differential operator with random coefficients, Y(x, t) is an ℝ^m-valued random function depending on the space x ∈ D and time t ≥ 0 arguments, the ℝⁿ-valued random function X denotes the output, and m, n, q ≥ 1 are some integers. The output X depends on Y, 𝒟, and the initial/boundary conditions for Eq. 9.1, which can be deterministic or random. It is difficult to give conditions for the existence and uniqueness of the solution of Eq. 9.1 in this general setting. We will give such conditions for special forms of Eq. 9.1. In many applications it is sufficient to find the solution X at a finite number of points x_k ∈ D. Then Eq. 9.1 becomes an equation giving the evolution in time of an ℝ^d-valued stochastic process X(t), t ≥ 0, collecting the functions X(t, x_k). Similar considerations have been used in Section 7.1. Most of the results in this chapter relate to differential equations, that is, 𝒟 in Eq. 9.1 is a differential operator.
Our objective is to calculate second moment and other properties of X in Eq. 9.1 supplied with deterministic or random initial and boundary conditions, or of its discrete counterpart, the stochastic process X(t). There is no general method that can deliver the probability law of X or X(t). The next section presents methods for calculating probabilistic properties of X and X(t). Most of the methods in this
section can be viewed as extensions of developments in Chapters 6, 7, and 8. The
subsequent sections present applications from various fields of applied science
and engineering.
Two classes of methods are available for finding properties of the solution X of Eq. 9.1. The methods in the first class constitute direct extensions of some of the results in Chapters 6 and 7, for example, the local solution for some partial differential equations, the Monte Carlo simulation, the conditional analysis, the state augmentation, and the Liouville methods in Sections 9.2.1, 9.2.2, 9.2.3, 9.2.4, and 9.2.5, respectively. These methods involve rather advanced probabilistic tools and can deliver detailed properties of X. The methods in the second class extend some of the results in Chapter 8, for example, the Taylor, perturbation, and Neumann series, the Galerkin, the finite difference, and finite element methods in Sections 9.2.6, 9.2.7, 9.2.8, and 9.3, respectively. Most of these methods involve elementary probabilistic concepts and are particularly useful for finding the second moment properties of X.
with the boundary conditions ∂U/∂x₁ = 0 on {0} × (0, 1), U = 0 on {1} × (0, 1), ∂U/∂x₂ = 0 on (0, 1) × {0}, and U = 3.5349 x₁⁴ + 0.5161 x₁² + 3.0441 on (0, 1) × {1}. The coefficient Z₁ and the input Z₂ are independent random variables that are uniformly distributed in (3 − a, 3 + a), 0 < a < 3, and (16 − b, 16 + b), 0 < b < 16, respectively.
Let X(t; ω) be an ℝ²-valued diffusion process with coordinates

X₁(t, ω) = B₁(t) + L₁(t),
X₂(t, ω) = √Z₁(ω) (B₂(t) + L₂(t)),

where the Bᵢ are independent Brownian motions and the Lᵢ are local time processes (Section 6.2.3.1). Let (Z₁, Z₂)(ω) be a sample of (Z₁, Z₂). Denote by T(ω) = inf{t > 0 : X(t, ω) ∉ D} the first time X(·, ω) corresponding to this sample (Z₁, Z₂) and starting at x ∈ D leaves D. The local solution of the above partial differential equation at x ∈ D for a sample (Z₁, Z₂)(ω) of (Z₁, Z₂) is

U(x, ω) = E^x[h(X(T(ω), ω), ω)],

where h gives the values of U on the boundaries {1} × (0, 1) and (0, 1) × {1} of D. Figure 9.1 shows a histogram of U(x) for a = 0.5, b = 4, and x = (0.25, 0.25) that is obtained from 100 samples of Z. The estimates of U(x, ω) for each ω have been obtained from 500 samples of X(·, ω) generated with a time step Δt = 0.001. The estimated local solution at x = (0.25, 0.25) of the associated deterministic problem (a = b = 0) is 4.1808, and nearly coincides with the mean of the histogram in Fig. 9.1. ◊
9.2. Methods of analysis 677
Figure 9.1. Histogram of U((0.25, 0.25)) based on 100 samples of Z; mean = 4.2635, std = 0.3205
The local solution follows from the Itô formula,

U(x, ω) = E^x[h(X(T(ω), ω), ω)] + Σ_{i=1}^{2} c_i E^x[ ∫₀^{T(ω)} (∂U(X(s, ω), ω)/∂x_i) (dB_i(s) + dL_i(s)) ],

where c₁ = 1 and c₂ = √Z₁(ω). The boundary conditions and the partial differential equation defining U give the above local solution. ■
Note: The samples of X can be used to estimate properties of this random function. For example, r̂(x, t, y, s) = (1/n_s) Σ_{k=1}^{n_s} X(x, t; ω_k) X(y, s; ω_k)^T is an estimate for the correlation function E[X(x, t) X(y, s)^T] of X, where n_s denotes the number of samples. ▲
If the operator D in Eq. 9.1 has state dependent coefficients, the following
Monte Carlo algorithm can be applied to find properties of X.
Example 9.4: Let X be the process in Example 9.1 with the deterministic initial condition X(0) = x. If R and B are independent, then X|Θ = X|R is a Gaussian process with mean and correlation functions

μ(t | R) = E[X(t) | R] = x e^{−R t},
r(t, s | R) = E[X(t) X(s) | R] = x² e^{−R(t+s)} + (σ²/(2R)) e^{−R(t+s)} (e^{2R(t∧s)} − 1).

These two functions define the probability law of X|R. Properties of X can be obtained from the probability law of X|R by eliminating the condition on R. Numerical integration or Monte Carlo simulation can be used for solution. Figure 9.2 shows the evolution in time of the variance of X for σ = √2 and two cases, R = 1 and R ∼ U(a − b, a + b) with a = 1 and b = 0.7. ◊
Figure 9.2. Evolution of the variance of X for σ = √2, R ∼ U(0.3, 1.7), and R = 1
Proof: The mean and correlation equations in Section 7.2.1.1 can be used to find the second moment properties of X|R. Alternatively, we can use direct calculations to obtain the above equations. For example, the mean equation, μ̇(t | R) = −R μ(t | R) with initial condition μ(0 | R) = x, results by calculating the expectation of the equation for X|R. We also have

μ(t | R) = x e^{−∫₀ᵗ R(u) du},
r(t, s | R) = μ(t | R) μ(s | R) + σ² ∫₀^{t∧s} e^{−∫ᵤᵗ R(ξ) dξ − ∫ᵤˢ R(η) dη} du.
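The elimination of the condition on R can be sketched numerically with the law of total variance, Var[X(t)] = E{Var[X(t) | R]} + Var{E[X(t) | R]}, where the conditional moments above give Var[X(t) | R] = (σ²/(2R))(1 − e^{−2Rt}) and E[X(t) | R] = x e^{−Rt}. In the Python fragment below, the initial condition x = 1 and the quadrature grid are illustrative choices.

```python
# Law-of-total-variance sketch for Example 9.4 (x = 1 assumed; sigma = sqrt(2)).
import numpy as np

sigma = np.sqrt(2.0)                 # noise intensity
x0 = 1.0                             # assumed deterministic initial condition
r = np.linspace(0.3, 1.7, 2001)      # grid on the support of R ~ U(0.3, 1.7)

def total_variance(t):
    cond_var = (sigma**2 / (2.0 * r)) * (1.0 - np.exp(-2.0 * r * t))
    cond_mean = x0 * np.exp(-r * t)
    # uniform density: averages over the grid approximate expectations over R
    return cond_var.mean() + cond_mean.var()

var_t0 = total_variance(0.0)         # zero: the initial condition is deterministic
var_t5 = total_variance(5.0)         # near the stationary value E[sigma^2 / (2R)]
```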
The integral

r(t, s) = ∫_{ℝ¹⁰⁰} r(t, s | z) Π_{i=1}^{100} (1/√(2π)) e^{−z_i²/2} dz_i

is not feasible for this value of n. However, r(t, s) can be estimated without difficulty from r(·, · | Z) and samples of Z, as demonstrated in Fig. 9.3. ▲
Figure 9.3. Ten samples of G, the corresponding samples of R and of the conditional correlations r(t, t | R), and an estimate of r(t, t) based on 100 samples
The kernel H(t, s) for X in Example 9.4 is exp(−R (t − s)) 1(s ≤ t), where R is a random variable. The expectations E[H(t, s)] and E[H(t, u) H(s, v)] in the expressions of the mean and correlation functions of X are given by the moment generating function E[e^{ξ R}] of the random variable R evaluated at ξ = −(t − s) and ξ = −(t − u + s − v), respectively. ◊
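This moment generating function is available in closed form for simple distributions of R. The Python sketch below (with an assumed R ∼ U(0.3, 1.7) and illustrative time arguments) evaluates E[H(t, s)] both from the closed-form expression and by Monte Carlo.

```python
# E[H(t, s)] via the moment generating function of R (assumed R ~ U(0.3, 1.7)).
import numpy as np

rng = np.random.default_rng(1)
lo, hi = 0.3, 1.7

def mgf_uniform(xi):
    # E[exp(xi R)] for R ~ U(lo, hi), valid for xi != 0
    return (np.exp(xi * hi) - np.exp(xi * lo)) / ((hi - lo) * xi)

t, s = 2.0, 0.5
exact = mgf_uniform(-(t - s))                                # E[exp(-R (t - s))]
mc = np.exp(-rng.uniform(lo, hi, 200_000) * (t - s)).mean()  # Monte Carlo check
```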
Note: Consider a deterministic differential equation ẋ(t) = a(t) x(t) + y(t), where x ∈ ℝ^d, a is a (d, d) matrix whose entries are real-valued functions of time, y ∈ ℝ^d denotes the input, and the time argument t takes values in a bounded interval [0, τ]. If the entries of a are continuous functions, then (1) the transition matrix h(·, ·) is the unique solution of ∂h(t, s)/∂t = a(t) h(t, s), 0 ≤ s ≤ t, with h(s, s) equal to the identity matrix and (2) the unique solution of the above differential equation is

x(t) = h(t, 0) x(0) + ∫₀ᵗ h(t, s) y(s) ds

in the time interval [0, τ] ([30], Theorem 1, p. 20, and Theorem 1, p. 40). It is assumed that the samples of R satisfy the conditions of these theorems a.s.

The kernels H(t, s), E[H(t, s)], and E[H(t, u) H(s, v)] in the above equations can be interpreted as a Green's function for X, the mean of X, and the correlation function of this process. The expectation E[H(t, u) H(s, v)] is also referred to as a stochastic Green's function ([3], p. 86). ▲
Example 9.7: Let X be the process in Example 9.4. The operator defining this process has state independent coefficients and depends on a single random parameter Θ = R. The augmented process Z has the coordinates (Z₁ = X, Z₂ = R) with the initial condition Z₁(0) = X(0) and Z₂(0) = R. Let μ(p, q; t) = E[X(t)^p R^q] be the moment of order p + q of Z(t), where p, q ≥ 0 are integers. The moments of Z(t) satisfy the differential equation

μ̇(p, q; t) = −p μ(p, q + 1; t) + (p (p − 1) σ²/2) μ(p − 2, q; t)

with the convention that μ(p, q; t) is zero if at least one of the arguments p, q is strictly negative. Since the moment equations for Z(t) form an infinite hierarchy, it is not possible to solve these equations exactly.

The Fokker–Planck equation can be used to find the evolution in time of the density f of the diffusion process (Z₁, Z₂). For example, the stationary density f_s of Z is f_s(z₁, z₂) = c(z₂) exp(−z₁² z₂/σ²), where c(z₂) > 0 is a function of z₂ such that ∫_ℝ dz₁ ∫₀^∞ dz₂ f_s(z₁, z₂) = 1. ◊
Proof: The differential equation for the second coordinate of Z states that Θ is time-invariant. The moment equations for μ(p, q; t) are given by the Itô formula applied to the function (X(t), R) ↦ X(t)^p R^q. Although the moments μ(0, q; t) = E[R^q] are known for any value of q, the moment equations for Z(t) are not closed. For example, there are two differential equations for p + q = 1. The equation for (p = 0, q = 1) provides no information since the moments of R are known, while the equation for (p = 1, q = 0) involves the moments μ(1, 0; t) and μ(1, 1; t). There are three differential equations for p + q = 2. The equation for (p = 0, q = 2) provides no information, while the equations for (p = 2, q = 0) and (p = 1, q = 1) involve the moments μ(2, 0; t), μ(2, 1; t), μ(1, 1; t), and μ(1, 2; t). Hence, up to moments of order 2 we have three informative equations and five unknown moments.

The density f(·, ·; t) of (Z₁(t), Z₂(t)) is the solution of the partial differential equation (Section 7.3.1.3)

∂f/∂t = ∂/∂z₁ [ z₁ z₂ f + (σ²/2) ∂f/∂z₁ ].

The stationary density satisfies

∂/∂z₁ [ z₁ z₂ f_s + (σ²/2) ∂f_s/∂z₁ ] = 0.

Because the expression in the square brackets is a constant and this constant is zero by the boundary conditions at infinity, we have ln(f_s(z₁, z₂)) = −z₁² z₂/σ² + d(z₂), where d(z₂) is a constant with respect to the variable of integration z₁, which yields the stated expression for f_s. ■
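The stated stationary density can be checked by simulation: conditional on Z₂ = r, f_s is a zero-mean Gaussian density with variance σ²/(2r). A minimal Euler–Maruyama sketch of the process dX = −r X dt + σ dB (step size, horizon, and sample counts are arbitrary choices):

```python
# Euler-Maruyama check of the conditional stationary variance sigma^2 / (2 r)
# implied by f_s (all numerical parameters are illustrative).
import numpy as np

rng = np.random.default_rng(2)
sigma, r_fixed = 1.0, 1.0
dt, n_steps, n_paths = 0.005, 4000, 5000
x = np.zeros(n_paths)
for _ in range(n_steps):                       # dX = -r X dt + sigma dB
    x += -r_fixed * x * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

stat_var = sigma**2 / (2.0 * r_fixed)          # variance implied by f_s
sample_var = x.var()                           # estimate from the simulated paths
```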
driving Brownian motion B. The process in Example 9.7 is a special case corresponding to a deterministic rather than random noise intensity. The augmented state Z(t) = (Z₁(t) = X(t), Z₂(t) = R, Z₃(t) = Γ) satisfies a stochastic differential equation whose Fokker–Planck equation, giving the evolution of the density f of the state vector Z, can be used to establish and solve the differential equation for f_s. ▲
with the convention in the previous two examples. The moments of order p + q = 1 and p + q = 2 are given, respectively, by

μ̇(1, 0; t) = μ(0, 1; t),
μ̇(0, 1; t) = −μ(1, 0; t) − 2β μ(0, 1; t), and
dX₁(t) = X₂(t) dt,
dX₂(t) = −[X₁(t) + 2β X₂(t)] dt − σ₁ X₁(t) dB₁(t) + σ₂ dB₂(t).

The average of the Itô formula applied to the function (X₁, X₂) ↦ X₁^p X₂^q gives the moment equations.
The moment equations of a specified order p + q have the form ṁ(t) = a m(t). If the real parts of the eigenvalues of a are negative, the moments m(t) converge to a time-invariant finite value as t → ∞ and we say that the process is asymptotically stable in the moments of order p + q. For example, the eigenvalues of a for p + q = 1 are λ₁,₂ = −β ± √(β² − 1) so that X is asymptotically stable in the mean if β > 0. If the dimension of the vector m is large, it is convenient to use the Hurwitz criterion to assess the sign of the real parts of the eigenvalues of a ([30], Theorem 2, p. 55).
The differential equation defining X can be used to model a simply supported beam with constant stiffness χ and length l > 0. The beam is subjected to a fluctuating axial force W₁ applied at its ends. The beam deflection V(x, t) at time t ≥ 0 and coordinate x ∈ [0, l] satisfies a differential equation in which m denotes the beam mass per unit length. The representation V(x, t) = Y(t) sin(π x/l) of V corresponding to the first buckling mode gives the differential equation defining X.
where H(t) = [Z₁(t)² + (Z₂(t)/ν(t))²]^{1/2}, ν(t) = [χ(Z₃)/m]^{1/2} denotes the instantaneous frequency of the plate, and the parameters α, β in the definition of the functions χ and k may be random. The operator 𝒟* defining Z* has state independent coefficients.

The methods in Section 7.3.1 can be applied to find properties of Z* conditional on the random parameters in 𝒟*. For example, the stochastic averaging method can be used to derive an approximate evolution equation for the crack length Z₃ = A ([175], Section 7.5.2.3). ◊
with coefficients depending on the processes X and V_i. It is assumed that the functions a and b are such that the stochastic differential equations for X and V_i have unique solutions. The methods in Section 7.3.1 can be used to find properties of X and V_i.

Figure 9.4 shows three samples of X and the corresponding samples of the sensitivity factor V = ∂X/∂θ for X(0) = 0, a(X(t), θ₀) = −θ₀ X(t), b(X(t), θ₀) = 1, and θ₀ = 1. The resulting sensitivity factor V is the solution of the differential equation dV(t) = −(θ₀ V(t) + X(t)) dt, and has samples much smoother than the samples of X. ◊
Note: The differential equations for the sensitivity factors result by differentiation of the defining equation for X with respect to the coordinates θ_i of θ and then setting θ equal to θ₀.

Note that the differential equations for the sensitivity factors are linear in these factors. Also, the generation of samples of X and V can be performed simultaneously by considering the evolution of the vector (X, V), or it can be performed sequentially by calculating first samples of X and using them to calculate the corresponding samples of V. ▲
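The sequential sample generation described in the note can be sketched as follows in Python, with an Euler discretization of dX = −θ₀ X dt + dB and dV = −(θ₀ V + X) dt for θ₀ = 1, b = 1; the step size and horizon are arbitrary choices.

```python
# Sequential generation of a sample of X and the sensitivity factor V
# (Euler scheme; theta0 = 1, b = 1, step size illustrative).
import numpy as np

rng = np.random.default_rng(3)
theta0, dt, n = 1.0, 0.01, 1000
xs = np.zeros(n + 1)    # sample of X, dX = -theta0 X dt + dB
vs = np.zeros(n + 1)    # sample of V,  dV = -(theta0 V + X) dt
for k in range(n):
    dB = np.sqrt(dt) * rng.standard_normal()
    xs[k + 1] = xs[k] - theta0 * xs[k] * dt + dB
    vs[k + 1] = vs[k] - (theta0 * vs[k] + xs[k]) * dt
```

Since V has no Brownian forcing, its sample path is visibly smoother than that of X, consistent with Fig. 9.4.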
If ξ ↦ g(ξ, t) has continuous first order partial derivatives with respect to the coordinates of ξ and defines a one-to-one mapping, then

f(x; t) = f₀(g⁻¹(x, t)) |J(g⁻¹(x, t), t)|⁻¹, (9.3)

where |J| and g⁻¹(ξ, t) denote the Jacobian and the inverse of ξ ↦ g(ξ, t).

Note: The relationship between the random vectors X(t) and X₀ gives Eq. 9.3. This equation is useful if the expression for the mapping ξ ↦ g(ξ, t) can be obtained explicitly (Section 2.11.1 in this book, [176], Theorem 6.2.1, p. 143). The relationship X(t) = g(X₀, t) can also be used to calculate moments and other properties of X(t). ▲
If the solution of Eq. 9.2 exists and is unique, the density f(·; t) of X(t) satisfies the Liouville equation

∂f/∂t = − Σ_{i=1}^{d} ∂(f h_i)/∂x_i (9.4)

with the initial and boundary conditions f(x; 0) = f₀(x) and lim_{|x_i|→∞} f(x; t) = 0, i = 1, ..., d, respectively, so that

f(x; t) = f₀(g⁻¹(x, t)) exp( − ∫₀ᵗ Σ_{i=1}^{d} (∂h_i/∂x_i)(x(s), s) ds ), (9.5)

where s ↦ x(s) denotes the characteristic curve reaching x at time t.
Proof: The proof of the Fokker–Planck equation in Section 7.3.1.3 can be used to establish Eq. 9.4. Let φ(u; t) = E[e^{√−1 u^T X(t)}] be the characteristic function of X(t) so that f(x; t) = (1/(2π)^d) ∫_{ℝ^d} e^{−√−1 u^T x} φ(u; t) du. The derivative with respect to time of φ gives the evolution equation for φ. The Fourier transform of the left side of Eq. 9.4 is ∂φ/∂t. The Fourier transform of the right side of Eq. 9.4 is −Σ_{k=1}^{d} ∫_{ℝ^d} [∂(f h_k)/∂x_k] e^{√−1 u^T x} dx, which gives the right side of the above equation following integration by parts.

The explicit solution of Eq. 9.4 in Eq. 9.5 can be obtained from the associated Lagrange system ([176], Theorem 6.2.2, p. 146). ■
so that the solution can be written as X(t) = g(X(0), t). The mapping X(0) ↦ X(t) = g(X(0), t) can be used to calculate moments of X(t) from moments of X(0) and to find the density of X(t).
Note: The Jacobian of ξ ↦ g(ξ, t) is unity so that f(x; t) = f₀(g⁻¹(x, t)) by Eq. 9.3. The evolution equation for X is given by Ẋ₁(t) = h₁(X(t), t) = X₂(t) and Ẋ₂(t) = h₂(X(t), t) = −ν² X₁(t) so that ∂h₁/∂x₁ = 0, ∂h₂/∂x₂ = 0, and the exponential function in Eq. 9.5 is unity. The density f(x; t) in the above equation coincides with the result in Eq. 9.3 ([176], Example 6.1, p. 143). ▲
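The change-of-variable formula in Eq. 9.3 can be illustrated numerically for this oscillator: the mapping ξ ↦ g(ξ, t) is linear with unit Jacobian, so a Gaussian initial density is transported to a Gaussian density with covariance g gᵀ. A Python sketch (the values of ν and t, the standard normal initial density, and the sample size are all illustrative choices):

```python
# Unit-Jacobian check and density transport for the oscillator
# x1' = x2, x2' = -nu^2 x1 (nu, t, initial density, sample size illustrative).
import numpy as np

rng = np.random.default_rng(4)
nu, t = 2.0, 0.7
g = np.array([[np.cos(nu * t), np.sin(nu * t) / nu],
              [-nu * np.sin(nu * t), np.cos(nu * t)]])   # X(t) = g X(0)
jac = np.linalg.det(g)                                   # equals 1: volume preserved

x0 = rng.standard_normal((100_000, 2))                   # X(0) ~ N(0, I)
xt = x0 @ g.T                                            # push samples through g
sample_cov = np.cov(xt.T)
exact_cov = g @ g.T                                      # covariance implied by Eq. 9.3
```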
Example 9.13: Let X be the solution of Ẍ(t) + C Ẋ(t) + K X(t) = a sin(ν t), where C > 0 and K > 0 are independent random variables and a, ν > 0 are some constants. The augmented vector X̄ with coordinates (X̄₁ = X, X̄₂ = Ẋ, X̄₃ = C, X̄₄ = K) satisfies a differential equation whose last two coordinates are time-invariant, so that the density f̄ of X̄(t) satisfies the Liouville equation (Eq. 9.4)

∂f̄/∂t = − ∂(x̄₂ f̄)/∂x̄₁ − ∂[(−x̄₄ x̄₁ − x̄₃ x̄₂ + a sin(ν t)) f̄]/∂x̄₂.
Note: If in addition the input parameters (a, ν) were random, the corresponding augmented vector X̄ would be an ℝ⁶-valued process. The evolution of this process would be defined by the above differential equations and two additional equations stating that the parameters (a, ν) are time-invariant. ▲
Example 9.14: Let X be an ℝ²-valued process defined by Ẋ(t) = A X(t) with the initial condition X(0) = X₀, where the entries (1, 1), (1, 2), (2, 1), and (2, 2) of A are −A₁, A₂, A₁, and −A₂, respectively, and A₁, A₂ are random variables. Then ([176], Example 8.2, p. 223)

X(t) = (1/(A₁ + A₂)) [ A₂ + A₁ e^{−(A₁+A₂) t}    A₂ (1 − e^{−(A₁+A₂) t})
                        A₁ (1 − e^{−(A₁+A₂) t})   A₁ + A₂ e^{−(A₁+A₂) t} ] X₀.
Figure 9.5. Evolution of the mean and standard deviation of X₁ for lognormal random variables (A₁, A₂) and initial condition X₀ = (1, 0)
Figure 9.5 shows the evolution in time of estimates of the mean and standard deviation of X₁ for X₀ = (1, 0) and lognormal random variables (A₁, A₂) with means (1.0, 0.5), coefficients of variation (0.3, 0.4), and two values of the correlation coefficient ρ of the Gaussian images of A₁ and A₂. The estimates are based on 500 independent samples of X₁. The first two moments of X₁ converge to the corresponding moments of A₂/(A₁ + A₂) as t → ∞. ◊
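A small Monte Carlo sketch of the independent case ρ = 0 is given below in Python; the parameterization of the lognormal samples from the prescribed means and coefficients of variation is standard, and the sample size and time t = 10 are arbitrary choices. It confirms the convergence of X₁(t) to A₂/(A₁ + A₂).

```python
# Monte Carlo sketch for Example 9.14 with independent lognormal A1, A2
# (rho = 0 case; sample size and seed are arbitrary).
import numpy as np

rng = np.random.default_rng(5)

def lognormal(mean, cv, size):
    # lognormal samples with prescribed mean and coefficient of variation
    s2 = np.log(1.0 + cv**2)
    return rng.lognormal(np.log(mean) - 0.5 * s2, np.sqrt(s2), size)

n = 50_000
A1 = lognormal(1.0, 0.3, n)
A2 = lognormal(0.5, 0.4, n)
s = A1 + A2
t = 10.0
x1_t = (A2 + A1 * np.exp(-s * t)) / s    # X1(t) for X0 = (1, 0)
x1_limit = A2 / s                        # limit of X1(t) as t -> oo
```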
and the input are stochastic is elementary for the case in which 𝒟 has state independent coefficients.

We only present three examples illustrating how some of the methods in Sections 8.4.1.2–8.4.1.5 can be extended to solve Eq. 9.1.
Figure 9.6. Functions U(t; μ_z, μ_q), V_z(t; μ_z, μ_q), and V_q(t; μ_z, μ_q) for a = 1, μ_z = −0.5, μ_q = 1, and ν = 5
U(t; Z, Q) ≈ U(t; μ_z, μ_q) + V_z(t; μ_z, μ_q) (Z − μ_z) + V_q(t; μ_z, μ_q) (Q − μ_q).

For example, the approximate mean and variance of U(t) are μ(t) ≈ U(t; μ_z, μ_q) and σ(t)² ≈ V_z(t; μ_z, μ_q)² σ_z² + V_q(t; μ_z, μ_q)² σ_q². ◊
Proof: The function U(t; μ_z, μ_q) is the solution of the differential equation for U with (μ_z, μ_q) in place of (Z, Q). The sensitivity factors V_z and V_q can be obtained as in Section 8.4.1.2. For example, the differential equation,

(d/dt − a) V_q(t; μ_z, μ_q) − 3 μ_z U(t; μ_z, μ_q)² V_q(t; μ_z, μ_q) = sin(ν t),

for V_q results by differentiating the defining equation for U with respect to Q and setting (Z, Q) equal to its expectation (μ_z, μ_q). ■
Figure 9.7. Histograms of the exact and first order perturbation solutions for U(0) = 0, β = 1, t = 5, Q ∼ U(0.5, 1.5), and Y(t) = Y ∼ U(0.5, 1.5)
the assumption that Q and Y are independent random variables. The histograms of the exact and first order perturbation solutions are similar for ε = 0.1 but differ significantly for ε = 0.5. ◊
Note: The functions U₀ and U₁ satisfy the differential equations U̇₀(t) + β U₀(t) = Q with U₀(0) = 0 and U̇₁(t) + β U₁(t) = −Y(t) U₀(t) with U₁(0) = 0, respectively, which yield the stated solutions. The approximate mean is (E[Q]/β) (1 − e^{−β t}) + O(ε²). The expectation of the product (U₀(t) + ε U₁(t)) (U₀(s) + ε U₁(s)) gives the terms of order one and order ε but only one of the three terms of order ε² (Section 8.4.1.3). ▲
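The comparison in Fig. 9.7 can be reproduced in outline for constant Y. The Python sketch below compares the exact solution of dU/dt + (β + ε Y) U = Q, U(0) = 0, with the first order perturbation solution U₀ + ε U₁; the closed-form U₁ follows from the differential equations in the note, and the sample size is arbitrary.

```python
# Exact vs. first order perturbation solution for dU/dt + (beta + eps Y) U = Q,
# U(0) = 0, with constant random Y (sample size arbitrary).
import numpy as np

rng = np.random.default_rng(6)
beta, t = 1.0, 5.0
n = 100_000
Q = rng.uniform(0.5, 1.5, n)
Y = rng.uniform(0.5, 1.5, n)

def exact(eps):
    c = beta + eps * Y
    return Q * (1.0 - np.exp(-c * t)) / c

def first_order(eps):
    u0 = (Q / beta) * (1.0 - np.exp(-beta * t))
    u1 = (-(Y * Q / beta**2) * (1.0 - np.exp(-beta * t))
          + (Y * Q / beta) * t * np.exp(-beta * t))     # solves u1' + beta u1 = -Y u0
    return u0 + eps * u1

err_small = np.abs(exact(0.1) - first_order(0.1)).mean()
err_large = np.abs(exact(0.5) - first_order(0.5)).mean()
```

The mean absolute discrepancy grows sharply between ε = 0.1 and ε = 0.5, consistent with the histograms.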
Figure 9.8. Exact and approximate mean and variance of U for β = 1, U(0) = 0, Y ∼ U(−0.5, 0.5), Q ∼ U(0.5, 1.5), and τ = 2
mean and variance of the solution U in the time interval [0, 2]. The approximate
solution is given by the first two terms of the above Neumann series. In this case
the approximate mean and variance functions are accurate. Generally, a larger
number of terms needs to be retained from the Neumann series to obtain satisfactory approximations. ◊
Proof: The integral form of the differential equation for U is

U(t) = (Q/β) (1 − e^{−β t}) − Y e^{−β t} ∫₀ᵗ e^{β s} U(s) ds,

that is, U = f + λ K[U] with λ = −1, where f(t) = (Q/β) (1 − e^{−β t}) and K(t, s) = 1_{[0,t]}(s) Y e^{−β (t−s)} (Section 8.4.1.4). ■
X(x, t) = Σ_{p=1}^{m} C_p(t) φ_p(x), x ∈ D, t ≥ 0, (9.6)

where the real-valued coefficients C_p are random and may depend on time. The function X is a member of the space S spanned by the trial functions φ₁, ..., φ_m. Suppose that we approximate the solution of Eq. 9.1 by a member of S. The error or residual of this approximation is

R(x, t) = 𝒟[X(x, t)] − Y(x, t) = Σ_{p=1}^{m} 𝒟[C_p(t) φ_p(x)] − Y(x, t), (9.7)
Under some conditions Eq. 9.8 becomes an algebraic equation with random coefficients for C. For example, if 𝒟 is a linear differential operator involving only partial derivatives with respect to the coordinates of the space argument x, the solution of Eq. 9.1 is time-invariant so that X in Eq. 9.6 becomes X(x) = Σ_{p=1}^{m} C_p φ_p(x) and Eq. 9.8 yields the algebraic equation A C = Y, where A = {(𝒟[φ_p], φ_q)}, p, q = 1, ..., m, C = (C₁, ..., C_m), and Y = {(Y, φ_q)}, q = 1, ..., m. Properties of C can be obtained from the solution of the above algebraic equation.
Example 9.18: Let X be the solution of Ẋ(t) = A X(t), t ∈ [0, 1], with the initial condition X(0) = 1. The exact solution of this equation is X(t) = exp(A t). If A is uniformly distributed in the range (a₁, a₂), the exact moments of X are E[X(t)^k] = (e^{k a₂ t} − e^{k a₁ t})/(k t (a₂ − a₁)), k ≥ 1, t > 0.

Figure 9.9. Mean and standard deviation of X and its Galerkin approximation for A ∼ U(−1.5, −0.5)
where R(t) = −A + (1 − A t) C₁ + t (2 − A t) C₂ denotes the residual. These equations show that the vector C = (C₁, C₂) is the solution of an algebraic equation with random coefficients and input, and give the above expressions for C. If C has a large dimension, it is not possible to obtain analytical expressions for the coordinates of C. In this case, the Taylor series, perturbation, and other methods presented in the first part of this chapter can be used for solution ([178], pp. 511–522).

The accuracy of the Galerkin solution depends strongly on the trial functions. For example, the Galerkin solution X̂(t) = 1 + C t with C = A/(1 − 2A/3) derived from the orthogonality condition ∫₀¹ R(t) t dt = 0 provides unsatisfactory approximations for both the mean and variance of X in [0, 1].

We also note that the process Y(t) = X(t) − 1 satisfies the differential equation Ẏ(t) = A (Y(t) + 1) with the initial condition Y(0) = 0. It is convenient in some applications to modify the original differential equation so that its new version has homogeneous initial and boundary conditions. ▲
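The two-term Galerkin solution X̂(t) = 1 + C₁ t + C₂ t², whose residual is the R(t) = −A + (1 − A t) C₁ + t (2 − A t) C₂ quoted above, can be computed per sample of A by solving a 2×2 linear system. The Python sketch below assumes orthogonality of R to the trial functions t and t²; the sample size is arbitrary.

```python
# Two-term Galerkin solution of dX/dt = A X, X(0) = 1, on [0, 1], using
# Xhat = 1 + C1 t + C2 t^2 and orthogonality of the residual to t and t^2.
import numpy as np

rng = np.random.default_rng(7)

def galerkin_coeffs(a):
    # integrals of R(t) = -a + (1 - a t) C1 + t (2 - a t) C2 against t and t^2
    M = np.array([[1.0 / 2 - a / 3, 2.0 / 3 - a / 4],
                  [1.0 / 3 - a / 4, 1.0 / 2 - a / 5]])
    b = np.array([a / 2, a / 3])
    return np.linalg.solve(M, b)

A_samples = rng.uniform(-1.5, -0.5, 2000)
x1_hat = np.array([1.0 + galerkin_coeffs(a).sum() for a in A_samples])  # Xhat(1)
err = np.abs(x1_hat - np.exp(A_samples)).mean()   # vs. exact X(1) = exp(A)
```

In contrast to the one-term solution 1 + C t, this quadratic approximation tracks exp(A t) closely over [0, 1] for A in (−1.5, −0.5).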
Example 9.19: The Galerkin method can also be applied to solve nonlinear differential equations with random coefficients and input. Let X be the solution of Ẋ(t) = A X(t)² for t ∈ [0, 1] and X(0) = 1. Let X̂(t) = 1 + Σ_{p=1}^{m} C_p φ_p(t) with φ_p(0) = 0, p = 1, ..., m, be a Galerkin solution for this equation. The unknown coefficients of the Galerkin solution satisfy the nonlinear algebraic equation

Σ_{p=1}^{m} α_pk C_p − A Σ_{p,q=1}^{m} β_pqk C_p C_q = 0, k = 1, ..., m,

where α_pk = ∫₀¹ φ̇_p(t) φ_k(t) dt and β_pqk = ∫₀¹ φ_p(t) φ_q(t) φ_k(t) dt. ◊
Note: The Galerkin solution X̂ satisfies the initial condition for any values of the coefficients C_p. The residual is

R(t) = Σ_{p=1}^{m} C_p φ̇_p(t) − A ( Σ_{p=1}^{m} C_p φ_p(t) )² = Σ_{p=1}^{m} C_p φ̇_p(t) − A Σ_{p,q=1}^{m} C_p C_q φ_p(t) φ_q(t),

and its orthogonality to the trial functions gives the stated nonlinear algebraic equation. ▲
Example 9.20: Let X be the solution of Eq. 9.1 with D = (0, l), l > 0, and 𝒟 = ∂⁴/∂x⁴ + M(x) ∂²/∂t² depending on a random field M with strictly positive samples in (0, l). The initial and boundary conditions are X(x, 0) = 0, X(0, t) = X(l, t) = 0, and ∂²X(0, t)/∂x² = ∂²X(l, t)/∂x² = 0.

The coefficients C_p of the Galerkin solution

X̂(x, t) = Σ_{p=1}^{2} C_p(t) φ_p(x) = C₁(t) sin(π x/l) + C₂(t) sin(2π x/l)

satisfy the differential equations

M₁₁ C̈₁(t) + M₁₂ C̈₂(t) + (π/l)⁴ (l/2) C₁(t) = Y₁(t),
M₂₁ C̈₁(t) + M₂₂ C̈₂(t) + (2π/l)⁴ (l/2) C₂(t) = Y₂(t),

where M_pq = ∫₀ˡ M(x) sin(p π x/l) sin(q π x/l) dx, C_p(0) = 0, and Ċ_p(0) = 0, p, q = 1, 2. The terms on the right side of the above equations are Y_q(t) = ∫₀ˡ Y(x, t) sin(q π x/l) dx, q = 1, 2.
Suppose that the input is Y(x, t) = a Ȳ(t), where a is a constant and Ȳ denotes an Ornstein–Uhlenbeck process defined by dȲ(t) = −ρ Ȳ(t) dt + √(2ρ) dB(t), where ρ > 0 is a constant and B denotes a Brownian motion. Then the vector C(t) = (C₁(t), Ċ₁(t), C₂(t), Ċ₂(t), Ȳ(t)) is an ℝ⁵-valued diffusion process conditional on the random coefficients M_pq. The methods discussed previously in this chapter and results in Section 7.2.1 can be used to find properties of C. ◊
Note: The error of the approximate solution,

R(x, t) = Σ_{p=1}^{2} [ (p π/l)⁴ C_p(t) + M(x) C̈_p(t) ] sin(p π x/l) − Y(x, t),

and the orthogonality conditions give the differential equations for the unknown coefficients C_p. These coefficients and their first derivatives are zero at the initial time since X(x, 0) = 0. The coefficients C_p are stochastic processes defined by differential equations with random coefficients and input.

The random function X̂ gives the deflection of a simply supported beam with unit stiffness, span l > 0, and random mass M(x), x ∈ [0, l], that is subjected to a random load Y(x, t), x ∈ [0, l], t ≥ 0. ▲
Example 9.21: Let g : [0, 1] → ℝ be a bounded function that is continuous at x ∈ [0, 1] and let b_n(x) = Σ_{k=0}^{n} g(k/n) p_{n,k}(x) be the Bernstein polynomial of order n of g, where p_{n,k}(x) = [n!/(k! (n − k)!)] x^k (1 − x)^{n−k}. Then lim_{n→∞} b_n(x) = g(x). This limit holds uniformly in [0, 1] if g is continuous in this interval. The Bernstein polynomials can be extended simply to represent real-valued continuous functions defined on ℝ^d, d > 1 ([123], p. 51).

Let G : [0, 1] × Ω → ℝ be a bounded random function on a probability space (Ω, ℱ, P), which is uniformly continuous in probability on [0, 1]. Then the sequence of random Bernstein polynomials,

B_n(x) = Σ_{k=0}^{n} G(k/n) p_{n,k}(x), (9.9)

converges uniformly in probability to G in [0, 1]. This property suggests the use of Bernstein polynomials as trial functions in Eq. 9.6. The corresponding Galerkin solution is X̂(x, t) = Σ_{k=0}^{n} C_k(t) p_{n,k}(x), where the C_k are random unknown coefficients, which can be determined from Eq. 9.8.

Random Bernstein polynomials can also be used to develop parametric representations for stochastic processes. These representations are useful in Monte Carlo simulation and analytical studies ([79], Section 4.3.2.3). ◊
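A short numerical illustration of the deterministic convergence b_n → g is given below in Python; the test function g and the evaluation grid are arbitrary choices.

```python
# Convergence of Bernstein polynomials b_n to a continuous g on [0, 1]
# (test function and grid are arbitrary choices).
import numpy as np
from math import comb

def bernstein(g, n, x):
    # b_n(x) = sum_k g(k/n) C(n, k) x^k (1 - x)^(n - k)
    k = np.arange(n + 1)
    c = np.array([comb(n, int(j)) for j in k], dtype=float)
    p = c * x**k * (1.0 - x)**(n - k)
    return float(np.sum(g(k / n) * p))

g = lambda u: np.sin(2.0 * np.pi * u)
grid = np.linspace(0.0, 1.0, 101)
err_10 = max(abs(bernstein(g, 10, x) - g(x)) for x in grid)
err_100 = max(abs(bernstein(g, 100, x) - g(x)) for x in grid)
```

The maximum error decreases with n, in agreement with the uniform convergence stated above (the rate is only O(1/n), which is the price paid for the very robust convergence properties of Bernstein polynomials).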
Proof: For a given ε > 0 we can find δ > 0 so that |x − x'| < δ implies |g(x) − g(x')| < ε by the properties of g. We have

    |g(x) − b_n(x)| = |Σ_{k=0}^n (g(x) − g(k/n)) p_{n,k}(x)| ≤ Σ_{|k/n−x|<δ} |g(x) − g(k/n)| p_{n,k}(x) + Σ_{|k/n−x|≥δ} |g(x) − g(k/n)| p_{n,k}(x).

The sum indexed by |k/n − x| < δ is smaller than ε Σ_{|k/n−x|<δ} p_{n,k}(x) ≤ ε since |g(x) − g(k/n)| ≤ ε and Σ_{k=0}^n p_{n,k}(x) = 1. The sum indexed by |k/n − x| ≥ δ is smaller than 2β Σ_{|k/n−x|≥δ} p_{n,k}(x), where β bounds g, that is, sup_{x∈[0,1]} |g(x)| ≤ β. We also have

    Σ_{k=0}^n (k − nx)² p_{n,k}(x) = Σ_{k=0}^n [k(k − 1) − (2nx − 1)k + n²x²] p_{n,k}(x) = n x (1 − x),

so that Σ_{|k/n−x|≥δ} p_{n,k}(x) ≤ n x (1 − x)/(n δ)² ≤ 1/(4 n δ²).
9.2. Methods of analysis 699
These results show that |g(x) − b_n(x)| ≤ ε + β/(2nδ²), so that |g(x) − b_n(x)| ≤ 2ε for n ≥ β/(2εδ²). If g is continuous in [0, 1], then |g(x) − b_n(x)| ≤ 2ε holds with δ independent of x, so that b_n converges uniformly to g in this interval.
It remains to show that B_n in Eq. 9.9 converges uniformly in probability to G in [0, 1], that is, that for any ε, η > 0 there exists n(ε, η) such that P(|B_n(x) − G(x)| ≥ ε) ≤ η for n ≥ n(ε, η), or equivalently that

    d̃(B_n, G) = sup_{x∈[0,1]} d(B_n(x), G(x)) = sup_{x∈[0,1]} ∫_Ω [|B_n(x) − G(x)| / (1 + |B_n(x) − G(x)|)] P(dω)

can be made as small as desired as n → ∞. Take a fixed x ∈ [0, 1] and an integer n ≥ 1. Let I_i = [(i − 1)/n, i/n] be the subinterval of [0, 1] containing x. Since G is uniformly continuous in probability in [0, 1], for any ε, η > 0 there exists δ = δ(ε, η) such that |x − x'| ≤ δ implies P(|G(x) − G(x')| ≥ ε) ≤ η. Hence, we have P(|G(x) − G(ν/n)| ≥ ε) ≤ η for ν = i − 1, i if 1/n ≤ δ. Suppose that n satisfies the condition 1/n ≤ δ. Then
    d̃(B_n, G) = sup_{x∈[0,1]} ∫_Ω [|B_n(x) − G(x)| / (1 + |B_n(x) − G(x)|)] P(dω)
    = sup_{x∈[0,1]} ∫_Ω [|Σ_{ν=0}^n G(ν/n) p_{n,ν}(x) − G(x) Σ_{ν=0}^n p_{n,ν}(x)| / (1 + |B_n(x) − G(x)|)] P(dω)
    ≤ sup_{x∈[0,1]} ∫_Ω [Σ_{ν=0}^n |G(ν/n) − G(x)| p_{n,ν}(x) / (1 + |B_n(x) − G(x)|)] P(dω) ≤ 4βη + 3ε + β/(2nδ²),
where sup_{x∈[0,1]} |G(x)| ≤ β a.s. The above upper bound holds since (1) the integral in the above summation corresponding to ν = i − 1 can be separated in two integrals over the events {|G((i − 1)/n) − G(x)| ≥ ε} and {|G((i − 1)/n) − G(x)| < ε}, and these integrals are smaller than 2βη and ε, respectively, (2) similarly, the integral corresponding to ν = i is smaller than 2βη + ε, and (3) the remaining sum of integrals satisfies the inequality

    ∫_Ω [Σ'_ν |G(ν/n) − G(x)| p_{n,ν}(x) / (1 + |B_n(x) − G(x)|)] P(dω) ≤ ∫_Ω Σ'_ν |G(ν/n) − G(x)| p_{n,ν}(x) P(dω),

so that it is smaller than ε + β/(2nδ²) by the upper bound on |g(x) − b_n(x)|, where the sum Σ'_ν includes the terms with indices ν ≠ i − 1, i. We have found that d̃(B_n, G) ≤ 4βη + 3ε + β/(2nδ²) for any ε, η > 0 and n ≥ n(ε, η), so that B_n converges uniformly in probability to G on [0, 1] as n → ∞. •
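The uniform convergence of the deterministic Bernstein polynomials can be checked numerically. The sketch below (not from the book; the test function g and the evaluation grid are illustrative assumptions) evaluates b_n for a smooth g and shows the maximum error over [0, 1] shrinking as n grows.

```python
import math

def bernstein(g, n, x):
    """Evaluate the Bernstein polynomial b_n(x) = sum_k g(k/n) p_{n,k}(x)."""
    return sum(g(k / n) * math.comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

g = lambda x: math.sin(math.pi * x)          # a continuous function on [0, 1]
xs = [i / 50 for i in range(51)]             # evaluation grid
err = lambda n: max(abs(g(x) - bernstein(g, n, x)) for x in xs)

# the uniform error decreases as the polynomial order n grows
print(err(10), err(40), err(160))
```

The observed decay is consistent with the bound ε + β/(2nδ²) derived in the proof: halving the error requires roughly quadrupling n for a fixed δ.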
1. Develop a finite difference representation for Eq. 9.1 and for the initial and boundary conditions of this equation by approximating all derivatives at the nodes (k, i) by finite differences.
2. Find moments and other probabilistic properties of the unknowns X_{k,i} by the methods presented in this and previous chapters.
Note: The derivatives in Eq. 9.1 and in the initial and boundary conditions for this equation can be approximated by finite differences of various accuracy. For example, the partial derivative ∂X/∂t can be approximated at (x, t) by [X(x, t + Δt) − X(x, t)]/Δt or [X(x, t + Δt) − X(x, t − Δt)]/(2Δt). The errors of these finite difference approximations are of orders O(Δt) and O((Δt)²), respectively.
If only the spatial derivatives are approximated by finite differences, we obtain differential equations giving the time evolution for a vector with entries X(t, x_k). The coefficients of these equations depend on values of the random coefficients of V at the nodes of the finite difference representation for Eq. 9.1. ▲
Consider the partial differential equation

    U(x) ∂²X(x, t)/∂t² = V'(x) ∂X(x, t)/∂x + V(x) ∂²X(x, t)/∂x² + Y(x, t),

where U, V > 0 are random fields, V'(x) = dV(x)/dx, and the input Y(x, t) depends on both time and space arguments. The initial and boundary conditions are X(x, 0) = 0, X(0, t) = 0, and ∂X(1, t)/∂x = 0.
Take t_i = i Δt and x_k = k/n, k = 0, 1, ..., n, where n > 1 is an integer giving the number of equal intervals considered in D = (0, 1). The finite difference representation of the equation of X at (x_k, t_i) is

    X_{k,i+1} = A_k X_{k−1,i} + B_k X_{k,i} + C_k X_{k+1,i} − X_{k,i−1} + W_k Y_{k,i},

where X_{k,i} = X(x_k, t_i), V_k = V(x_k), V'_k = V'(x_k), U_k = U(x_k), and Y_{k,i} = Y(x_k, t_i). The coefficients in the above finite difference equation are

    A_k = W_k ( V_k/(1/n)² − V'_k/(2/n) ),
    B_k = 2 ( 1 − W_k V_k/(1/n)² ),
    C_k = W_k ( V'_k/(2/n) + V_k/(1/n)² ),
where W_k = (Δt)²/U_k. These coefficients are random because they depend on the random variables U_k, V_k, and V'_k. The above finite difference equations for X_{k,i} satisfy the initial and boundary conditions X_{k,−1} = X_{k,0} = 0 and X_{0,i} = 0, X_{n+1,i} = X_{n−1,i}, respectively. ◇
Note: The solution X gives the elongation of a rod with unit length and cross section area, material density U(x), and modulus of elasticity V(x), subjected to an axial force Y(x, t) at location x and time t. The differential equation for X follows from (1) the Hooke law stating that the stress in the rod is S(x, t) = V(x) ∂X(x, t)/∂x and (2) the Newton law giving the condition

    U(x) ∂²X(x, t)/∂t² = ∂S(x, t)/∂x + Y(x, t).

The rod is at rest at the initial time so that X_{k,−1} = X_{k,0} = 0 for all k's. The boundary conditions X_{0,i} = 0 and X_{n+1,i} = X_{n−1,i} show that the rod is fixed at x = 0 and that the stress is zero at x = 1. ▲
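One sample of this explicit time-stepping scheme can be sketched as follows (a sketch, not the book's implementation: the grid, time step, uniform random field models, and the constant load are illustrative assumptions, and V' is formed by central differences at interior nodes only).

```python
import random

random.seed(1)
n, dx, dt, steps = 20, 0.05, 0.01, 200                 # grid spacing 1/n and time step (assumed values)
U = [random.uniform(1.0, 2.0) for _ in range(n + 1)]   # one sample of the random density field at nodes
V = [random.uniform(1.0, 2.0) for _ in range(n + 1)]   # one sample of the random stiffness field at nodes
dV = [0.0] + [(V[k + 1] - V[k - 1]) / (2 * dx) for k in range(1, n)] + [0.0]  # V' (dropped at ends, a simplification)
Y = 1.0                                                # constant axial load (simplification)

X_prev = [0.0] * (n + 1)   # X_{k,-1} = 0: rod at rest
X_cur = [0.0] * (n + 1)    # X_{k,0} = 0
max_tip = 0.0
for _ in range(steps):
    X_next = [0.0] * (n + 1)                           # X_{0,i} = 0: fixed end
    for k in range(1, n + 1):
        right = X_cur[k + 1] if k < n else X_cur[n - 1]  # free end: X_{n+1,i} = X_{n-1,i}
        force = (dV[k] * (right - X_cur[k - 1]) / (2 * dx)
                 + V[k] * (right - 2 * X_cur[k] + X_cur[k - 1]) / dx**2 + Y)
        X_next[k] = 2 * X_cur[k] - X_prev[k] + dt**2 / U[k] * force
    max_tip = max(max_tip, X_next[n])
    X_prev, X_cur = X_cur, X_next
print(max_tip)   # largest free-end elongation observed over the run
```

The chosen time step respects the explicit-scheme stability condition Δt ≤ Δx √(U/V) for these coefficient ranges; repeating the loop over many samples of U, V would give Monte Carlo estimates of the moments of X_{k,i}.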
We conclude here our discussion on some methods for solving Eq. 9.1. The
remaining part of this chapter presents stochastic problems selected from various
fields of applied science and engineering. The solutions of the problems in the
following sections are based on the methods in Chapters 6, 7, 8, and the first
part of Chapter 9. An exception is the discussion of the stochastic finite element
method in the following section. This method can be applied to find the solution
of Eq. 9.1, but is discussed in the following section because of its wide use in
mechanics.
9.3 Mechanics
The finite element method is the preferred method for solving determinis-
tic and stochastic problems in mechanics. The method has been applied to analyze
problems in physics, chemistry, aerospace, and other fields [28].
The finite element method can be viewed as a mixture of the finite difference
and the Rayleigh-Ritz methods. The unknowns in the finite element method are
values of X in Eq. 9.1 at a finite number of points, called nodes. The equations
for the values of X at nodes are obtained from variational principles.
We formulate a variational principle used frequently in mechanics, illus-
trate the Rayleigh-Ritz method by a simple example, give essentials of the finite
element method for deterministic problems, extend this formulation to stochastic
problems, and present finite element solutions for some stochastic problems.
x ∈ D, boundary traction p(x, t) for x ∈ ∂D_σ and t ≥ 0, and imposed displacement u(x, t) for x ∈ ∂D_u and t ≥ 0, where ∂D = ∂D_σ ∪ ∂D_u and ∂D_σ ∩ ∂D_u = ∅. Let u_i(x, t), s_ij(x, t), and γ_ij(x, t) denote the displacement, stress, and strain fields at x ∈ D at time t ≥ 0. The vectors s = (s_11, s_12, s_13, s_22, s_23, s_33) and γ = (γ_11, γ_12, γ_13, γ_22, γ_23, γ_33) contain the entire information on stresses and strains since s_ij = s_ji and γ_ij = γ_ji. The equilibrium conditions require

    s_ij,j(x, t) = 0, x ∈ D, and s_ij(x, t) n_j(x, t) = p_i(x, t), x ∈ ∂D_σ,

for t ≥ 0, where s_ij,j = ∂s_ij/∂x_j, n_j(x, t) denotes the coordinate j of the exterior normal n(x, t) at x ∈ ∂D_σ at time t, and summation is performed on repeated subscripts (Section 8.5.2 in this book, [68], Section 3.4).
Let u(x, t) = (u_1, u_2, u_3)(x, t) be the displacement at x ∈ D ⊂ ℝ³ and t ≥ 0 and let u_k, k = 1, 2, be two arbitrary displacement fields in D̄ = D ∪ ∂D such that u_k(x, t) = u(x, t) for x ∈ ∂D_u and t ≥ 0. The difference δu = u_1 − u_2 is said to be a virtual displacement. Virtual displacements must vanish on ∂D_u but are arbitrary on ∂D_σ.
Note: The left and right sides of Eq. 9.11 are the first order variations of the strain energy and external work, respectively. It can be shown that the principle of virtual displacements constitutes an alternative statement of equilibrium. This principle is valid for any material behavior and magnitude of the displacement field ([28], Section 1.3). A dual of the principle of virtual displacements, called the principle of virtual forces, can also be established. This principle provides an alternative statement of compatibility ([28], Section 1.5, [68], Section 10.9). ▲
    (χ ∂⁴/∂x⁴ + m ∂²/∂t²) u(x, t) = y(x, t).

The solution is approximated by ū(x, t) = Σ_{k=1}^∞ c_k(t) φ_k(x), where the trial functions φ_k need to satisfy at least the kinematical boundary conditions and the weights c_k need to be determined. If we take φ_k(x) = sin(kπx/l), the Rayleigh-Ritz method gives

    c̈_k(t) + (k⁴π⁴χ/(m l⁴)) c_k(t) = (2/(m l)) ∫₀ˡ y(x, t) sin(kπx/l) dx,

where c_k(0) and ċ_k(0) result from g(x) = Σ_{k=1}^∞ c_k(0) sin(kπx/l) and h(x) = Σ_{k=1}^∞ ċ_k(0) sin(kπx/l), respectively, and the dots denote differentiation with respect to time. ◇
Proof: The function u represents the displacement at cross section x and time t of a simply supported beam with span l, stiffness χ, and mass m per unit length. The beam strain energy, SE, is

    SE = (χ/2) ∫₀ˡ (∂²ū/∂x²)² dx = (π⁴χ/(4l³)) Σ_{k=1}^∞ k⁴ c_k(t)²,

so that δ(SE) = (π⁴χ/(2l³)) Σ_{k=1}^∞ k⁴ c_k(t) δc_k(t), where δc_k(t) denotes the first order variation of the coefficients c_k(t). The first order variation of the external work, EW, is

    δ(EW) = ∫₀ˡ (y(x, t) − m ∂²ū(x, t)/∂t²) δū(x, t) dx = Σ_{k=1}^∞ [ ∫₀ˡ y(x, t) sin(kπx/l) dx − (m l/2) c̈_k(t) ] δc_k(t).

The differential equations for c_k follow from the condition δ(EW) = δ(SE) that must hold for any δc_k(t) by the principle of virtual displacements (Eq. 9.11).
The Rayleigh-Ritz and the Galerkin methods consider similar representations for the solution. However, there are two notable differences between these two methods. First, the trial functions in the Galerkin method must satisfy all boundary conditions, while the trial functions in the Rayleigh-Ritz method need satisfy only the kinematical boundary conditions. Second, the unknown coefficients of the Rayleigh-Ritz and the Galerkin representations are determined from the principle of virtual displacements and an orthogonality condition, respectively. •
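The modal equations above can be checked numerically. In the sketch below (unit values of χ, m, l and a uniform load y are illustrative assumptions, not from the book), the equilibrium of each modal equation gives the static amplitude c_k = f_k/ω_k², and the resulting midspan deflection converges to the classical value 5 y l⁴/(384 χ) for a uniformly loaded simply supported beam.

```python
import math

chi, m, l, y0 = 1.0, 1.0, 1.0, 1.0     # stiffness, mass per length, span, uniform load (assumed values)

def midspan_deflection(n_modes):
    u = 0.0
    for k in range(1, n_modes + 1):
        # modal load f_k = (2/(m l)) * integral of y0 sin(k pi x / l) over [0, l]
        fk = (2 / (m * l)) * y0 * l / (k * math.pi) * (1 - math.cos(k * math.pi))
        wk2 = chi * k**4 * math.pi**4 / (m * l**4)   # modal frequency squared
        # c_k solves c_k'' + wk2 c_k = f_k; its static equilibrium is f_k / wk2
        u += (fk / wk2) * math.sin(k * math.pi / 2)
    return u

exact = 5 * y0 * l**4 / (384 * chi)    # classical midspan deflection under uniform load
print(midspan_deflection(25), exact)
```

Only odd modes contribute (f_k = 0 for even k), and the truncation error decays like 1/k⁵, so a few modes already match the exact value closely.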
Example 9.24: Consider a linear elastic rod of length l > 0, constant stiffness χ > 0, unit cross section area, and specific weight p > 0, that is suspended vertically from one of its ends. Let x ∈ D = (0, l) denote the distance from the suspension point. We assume small deformations so that the strain field is equal to the first derivative of the displacement function with respect to x. The finite element method involves the following steps.
1. Partition D in finite elements, select nodes, and define node displacements. For example, let 0 = x_1 < x_2 < ··· < x_{n+1} = l, partition D in the finite elements D_k = (x_k, x_{k+1}), select x_k to be the node coordinates, and denote by U_k the displacement of node x_k, k = 1, ..., n, n + 1, in the direction of coordinate x.
2. Postulate the displacement field in each finite element. For example, take the displacement in the finite element k to be u^(k)(x) = α_k(x) U_k, where α_k(x) is a specified interpolation function and U_k collects the displacements of the nodes of element k.
Example 9.25: Consider the problem in Example 9.24 but assume that the rod stiffness is a random field A(x), x ∈ [0, l], and that the rod is subjected to a random action Y(x), x ∈ [0, l]. If we retain the finite element partition and the functional form of the displacement field in Example 9.24, the element stiffness and force matrices are

    A_k = ∫_{x_k}^{x_{k+1}} A(x) β_k(x)ᵀ β_k(x) dx and Y_k = ∫_{x_k}^{x_{k+1}} (p + Y(x)) α_k(x)ᵀ dx,

respectively. The global stiffness and force matrices A* and Y* are linear forms of A_k and Y_k, respectively. Moments and other properties of the random matrices A_k, Y_k, A*, and Y* can be calculated from the definition of these matrices and the probability laws of A and Y. For example, the first two moments of the entries A_{k,pq} of the matrices A_k are

    E[A_{k,pq}] = ∫_{x_k}^{x_{k+1}} E[A(x)] (β_k(x)ᵀ β_k(x))_{pq} dx,

    E[A_{k,pq} A_{l,st}] = ∫_{x_k}^{x_{k+1}} ∫_{x_l}^{x_{l+1}} E[A(x) A(y)] (β_k(x)ᵀ β_k(x))_{pq} (β_l(y)ᵀ β_l(y))_{st} dx dy.
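The moment formulas for the element matrices can be illustrated for a single two-node rod element. In the sketch below (an illustrative assumption, not the book's model: a lognormal stiffness that is constant over the element, so the spatial integral collapses to multiplication by the element length), a Monte Carlo estimate of E[A_{k,11}] is compared with the exact value E[A] (β_kᵀβ_k)_{11} h.

```python
import math, random

random.seed(2)
h = 0.1                                    # element length x_{k+1} - x_k (assumed)
B = [-1 / h, 1 / h]                        # strain-displacement row vector beta_k for linear shape functions
BtB = [[B[p] * B[q] for q in range(2)] for p in range(2)]   # beta_k^T beta_k (constant over the element)

mu, sig = 0.0, 0.25                        # lognormal stiffness parameters (assumed)
exact = math.exp(mu + sig**2 / 2) * h * BtB[0][0]   # E[A_{k,11}] = E[A] (beta^T beta)_11 h

N, est = 20000, 0.0
for _ in range(N):
    A = random.lognormvariate(mu, sig)     # one stiffness sample for the element
    est += A * h * BtB[0][0]               # sampled entry A_{k,11}
est /= N
print(est, exact)
```

For spatially varying A(x) the integrals would be evaluated by quadrature inside the Monte Carlo loop, but the comparison with the closed-form first moment is the same.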
The approach in Examples 9.24 and 9.25 can be applied to general solid mechanics problems with random material properties, input, and/or initial and boundary conditions. Consider an elastic solid in an open bounded subset D of ℝ^q, q = 1, 2, 3, that is in equilibrium. Let ∂D_σ and ∂D_u denote the parts of the boundary ∂D of D where traction and displacement are specified, respectively. It is assumed that (1) the density of the material in D exhibits a random spatial variation that can be modeled by a random field R(x), x ∈ D, (2) the material in D is linear elastic with random properties so that the relationship between the stress and strain fields is given by Hooke's law S(x, t) = A(x) γ(x, t), where the random field A = Aᵀ denotes the stiffness tensor, and (3) the solid in D is in dynamic equilibrium under the random action Y(x, t), x ∈ ∂D_σ, and displacement constraint on ∂D_u for all t ≥ 0.
Let D_k, k = 1, ..., n, be a collection of open subsets in D such that D̄ = ∪_{k=1}^n D̄_k and D_k ∩ D_l = ∅ for k ≠ l. The members of this partition of D are called finite elements. We denote by ∂D_{σ,k} = ∂D_σ ∩ ∂D_k the common part of the boundary of the finite element k and the boundary of D with specified traction. Let U_k be a vector collecting the displacements of the nodes of the finite element k. The displacement and strain fields in this element are u^(k)(x, t) = α_k(x) U_k(t) and γ^(k)(x, t) = β_k(x) U_k(t), where α_k is a specified function and β_k results from α_k by differentiation.
The principle of virtual displacements in Eq. 9.11 applied to this partition gives

    Σ_{k=1}^n ∫_{D_k} δγ^(k)(x, t)ᵀ S^(k)(x, t) dx = Σ_{k=1}^n ∫_{D_k} δu^(k)(x, t)ᵀ b(t, x) dx + Σ_{k=1}^n ∫_{∂D_{σ,k}} δu^(k)(x, t)ᵀ p(x, t) dσ(x)

for b(t, x) set equal to the inertia force −R(x) Ü(t, x). The relationship between the element displacement vectors U_k and the global displacement vector U, the requirement that the above equation be satisfied for all virtual displacements δU(t), and elementary matrix manipulations yield Eqs. 9.12 and 9.13. The matrices M_k, A*_k, and Y*_k can be random and are referred to as element mass, stiffness, and force matrices, respectively. The global mass, stiffness, and force matrices are denoted by M, A*, and Y*. •
is reduced significantly. On the other hand, the calculation efforts required to find the second moment properties of M_k exactly and by the second approximation are similar.
Frequently, the parameters defining constitutive laws in continuum material models are assumed to be random variables and/or fields for use in stochastic finite element studies; for example, the Lamé constants of linear elastic materials are represented by random fields [71, 114]. These representations are not recommended since they can be inconsistent with the material microstructure [136]. The incorporation of the microstructure features in the analysis does not pose any conceptual difficulty. It requires the use of adequate mechanical models and the calibration of the random coefficients of these models to the available information on the material microstructure. We have used such a representation in Section 8.6.2.2 to describe properties of metallic polycrystals.
    U = [ I + Σ_{r=1}^∞ ( −E[A*]⁻¹ (A* − E[A*]) )^r ] E[A*]⁻¹ Y*
    γ_01(t | R) = −(σ² R/(2a)) [ (1 − e^{−2at})/(2a) − t e^{−2at} ],

    γ_11(t | R) = (σ² R²/(2a)) [ (1 − e^{−2at})/(2a²) − (t/a) e^{−2at} − t² e^{−2at} ],

and approach σ²/(2a), −σ² R/(4a²), and σ² R²/(4a³) as t → ∞. The stationary variance of X based on the first order perturbation X_0(t) + ε X_1(t) is Var[X(t)] = (σ²/(2a)) (1 + ε² σ_R²/(2a²)). ◇
averaging with respect to this random variable. The approximate variance of X based on the first order perturbation does not include all terms of order ε².
A similar result can be obtained by the first order Taylor method. The processes X̄ and V in the first order Taylor approximation X(t) ≃ X̄(t) + V(t) R* of X satisfy the differential equations dX̄(t) = −a X̄(t) dt + σ dB(t) and dV(t) = −a V(t) dt − X̄(t) dt, where R* = ε R. •
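The two linear equations of the first order Taylor method can be simulated directly. The sketch below (unit values of a, σ and unit R* are illustrative assumptions) integrates both equations by Euler-Maruyama and checks the stationary moments Var[X̄] = σ²/(2a) and E[X̄ V] = −σ²/(4a²), which follow from the stated pair of differential equations.

```python
import math, random

random.seed(3)
a, sig = 1.0, 1.0                  # rate and noise scale (assumed values)
dt, steps, N = 0.01, 400, 4000     # Euler-Maruyama grid; t = 4 is near stationarity for a = 1

var_x, cov_xv = 0.0, 0.0
for _ in range(N):
    X, V = 0.0, 0.0                # dX = -a X dt + sig dB;  dV = -a V dt - X dt
    for _ in range(steps):
        dB = random.gauss(0.0, math.sqrt(dt))
        X, V = X + (-a * X) * dt + sig * dB, V + (-a * V - X) * dt
    var_x += X * X
    cov_xv += X * V
var_x /= N
cov_xv /= N
# stationary targets: Var[X] = sig^2/(2a) = 0.5, E[X V] = -sig^2/(4 a^2) = -0.25
print(var_x, cov_xv)
```

The simultaneous tuple assignment uses the old X in the update of V, which is the correct explicit Euler step for the coupled pair.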
imply a_0 + a_{12} δ_kl = 0 and a_1 δ_ki + a_2 δ_li = 0 so that ζ_2(G_k, G_l) = G_k G_l − δ_kl is in S_2. Similar orthogonality conditions show that the polynomials in S_3 have the form ζ_3(G_k, G_l, G_m) = G_k G_l G_m − G_k δ_lm − G_l δ_km − G_m δ_kl, where G_k, G_l, and G_m are independent copies of N(0, 1). A table giving polynomial chaos of various orders can be found in ([71], Section 2.4).
If X is a random variable with finite variance defined on (Ω, F, P), the series representation

    X = Σ_{p=0}^∞ Σ_{i_1, ..., i_p} a_{i_1 ··· i_p} ζ_p(G_{i_1}, ..., G_{i_p})     (9.14)

converges to X in the mean square sense.
Note: The convergence of the series in Eq. 9.14 indicates that we can approximate a square integrable random variable to any degree of accuracy by polynomials of Gaussian variables. The coefficients a_{i_1 ··· i_p} in the above expression of X can be determined simply since the Hermite polynomials ζ_p are orthogonal ([71], Section 2.4).
Approximations X̂ of X defined by finite sums including the first terms in Eq. 9.14, rather than the series representation in this equation, have been employed to solve stochastic problems [71]. The statement in Eq. 9.14 has only been used in these applications to select the functional form for X̂. ▲
Example 9.28: Let G be an N(0, 1) random variable and consider its memoryless transformations X_1 = exp(G) and X_2 = |G|. These random variables can be approximated by

    X̂_1 = Σ_{p=0}^m a_{p,1} ζ_p(G) and X̂_2 = Σ_{p=0}^m a_{p,2} ζ_p(G).

Figure 9.10. Exact and approximate mappings G ↦ X_k and G ↦ X̂_k (adapted from [63])

Figure 9.11, showing ratios of approximate to exact variances and kurtosis coefficients, confirms this observation [63]. ◇
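The coefficients a_p follow from the orthogonality of the Hermite polynomials, a_p = E[X ζ_p(G)]/E[ζ_p(G)²] with E[ζ_p(G)²] = p!. For the particular mapping X_1 = exp(G) they are known in closed form, a_p = e^{1/2}/p!. A sketch of the computation (the quadrature grid is an illustrative choice; the integrand decays fast enough that a simple trapezoid rule is very accurate):

```python
import math

def He(p, x):
    """Probabilists' Hermite polynomials via He_{k+1} = x He_k - k He_{k-1}."""
    h0, h1 = 1.0, x
    if p == 0:
        return h0
    for k in range(1, p):
        h0, h1 = h1, x * h1 - k * h0
    return h1

def coeff(p, f):
    """a_p = E[f(G) He_p(G)] / E[He_p(G)^2] with E[He_p^2] = p!."""
    npts, lo, hi = 4000, -10.0, 10.0
    dx = (hi - lo) / npts
    s = 0.0
    for i in range(npts + 1):
        x = lo + i * dx
        w = dx * (0.5 if i in (0, npts) else 1.0)        # trapezoid weights
        s += w * f(x) * He(p, x) * math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
    return s / math.factorial(p)

# exact chaos coefficients of X1 = exp(G): a_p = e^{1/2}/p!
for p in range(4):
    print(p, coeff(p, math.exp), math.exp(0.5) / math.factorial(p))
```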
Example 9.29: Let X(t) = F⁻¹ ∘ Φ(G(t)) = g(G(t)), t ∈ [0, τ], τ > 0, be a translation process, where F is a distribution and G denotes a stationary Gaussian process with mean zero and covariance function ρ(τ) = E[G(t + τ) G(t)]. Then X can be approximated by

    Y(t) = Σ_{ν=0}^{m*} b_ν(t) ζ_ν(G_{i_1}, ..., G_{i_ν}),

where b_ν(·) are some deterministic functions of time depending on the marginal distribution and the correlation function of X, and ζ_ν are Hermite polynomials. ◇
where bv (-) are some deterministic functions of time depending on the marginal
distribution and the correlation function of X, and l; v are Hermite polynomials. <>
Proof: We can approximate X(t) at a fixed timet :::: 0 by X(t) = r,;=oap ~p(G(t))
since X(t) is a random variable (Example 9.28). Since X is a stationary process, the
coefficients ap in the representation of X do not depend on time and can be calculated
from ap::: E[X(t) ~q(G(t))]/ E[~p(G(t)) 2 ]. The approximation X represents a nonlinear
memoryless transformation of a Gaussian process and is not very useful for calculations.
For example, it is difficult to find even the marginal distribution of X.
Consider the approximation G(t) = Lk=l Gk ((Jk(t) of G(t), where Gt, ... , Gm
denote independent N(O, 1) variables and CfJk are specified deterministic functions. For
example, the spectral representation (Sections 3.9.4.1 and 5.3.1.1) and the Karhunen-Loeve
expansion (Section 3.9.4.4) can be used to define G. The approximate representation X
with G in place of G becomes
where ~v(Gi 1 , ••• , Giv) are Hermite polynomials and bv(·) are functions of time that can
be obtained from the above equality [71].
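The construction can be sketched end to end: build Ĝ by the spectral representation and map it through F⁻¹ ∘ Φ. In the sketch below, the spectral density, its discretization, and the exponential marginal F are illustrative assumptions; since the spectral weights are normalized so that Var[Ĝ(t)] = 1, the marginal of the translated samples is exactly exponential.

```python
import math, random

random.seed(4)
m, dnu = 100, 0.3                       # spectral discretization (assumed values)
nus = [(k + 0.5) * dnu for k in range(m)]
s = [(2 / math.pi) / (1 + nu**2) * dnu for nu in nus]   # one-sided spectral masses (assumed spectrum)
tot = sum(s)
sig = [math.sqrt(v / tot) for v in s]   # normalized so Var[G_hat(t)] = 1

def gauss_value(t):
    """One sample of the spectral-representation Gaussian process at time t."""
    return sum(sk * (random.gauss(0, 1) * math.cos(nu * t) +
                     random.gauss(0, 1) * math.sin(nu * t))
               for sk, nu in zip(sig, nus))

Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))  # standard normal distribution
Finv = lambda u: -math.log(1 - u)                       # inverse of the exponential(1) distribution

xs = [Finv(Phi(gauss_value(1.7))) for _ in range(2000)]
print(sum(xs) / len(xs))                # should be near E[Exp(1)] = 1
```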
In many applications X is specified partially by its marginal distribution F and correlation function r(τ) = E[X(t + τ) X(t)]. The approximations X̂ and Y of X need to be calibrated such that they match as closely as possible these properties of X. The accuracy of the marginal distribution F̂ of X̂ depends on the number of terms considered in its representation. The marginal distribution of Y is likely to be less accurate than F̂ because its
Example 9.30: Consider a beam with length l > 0 and random stiffness. The beam is fixed and free at its left and right ends, respectively, and is subjected to a unit transverse load at its free end. Suppose that the beam stiffness can be modeled by a homogeneous Gaussian random field Z(x), x ∈ [0, l], with mean μ_Z > 0 and covariance function c_Z. Let Z_m(x) = μ_Z + Σ_{q=1}^m G_q h_q(x), x ∈ [0, l], be an approximate parametric representation of Z, where G_q are independent N(0, 1) random variables and h_q are specified deterministic functions. The above approximation of Z can be obtained, for example, by the spectral representation method (Sections 3.9.4.1 and 5.3.1.1 in this book, [79], Section 4.3.2) or the Karhunen-Loève series (Section 3.9.4.4 in this book, [71], Section 2.3).
Let 0 = x_1 < x_2 < ··· < x_{n+1} = l be nodes in [0, l] defining the finite elements D_k = (x_k, x_{k+1}), k = 1, ..., n. The displacement vector U_k and the interpolation matrix α_k(x) for the beam element k are
Figure 9.12. Estimates of the density of U(l) by Monte Carlo simulation using the exact solution and polynomial chaos solutions of second and third order, for μ_Z = 1 and c_Z(x_1, x_2) = (0.3)² exp(−|x_1 − x_2|) (adapted from [71], Fig. 5.36, p. 147)
9.4 Physics
We consider large systems whose macroscopic properties can be described by global variables. The Boltzmann equation, the Ising model, and noise induced transitions are discussed in Sections 9.4.1, 9.4.2, and 9.4.3, respectively. The Boltzmann equation examines systems consisting of a large number of identical particles and gives the evolution in time of the fraction of particles with some attributes; the Ising model provides a global characterization of magnetism for ferromagnetic materials; and diffusion processes can model the predator-prey relationship in a random environment. In some cases, the qualitative behavior of these systems changes abruptly; this change is referred to as a phase transition.
Proof: If there are no collisions, the number of particles in an elementary volume along a flow line is preserved so that g(x, p; t) = g(x + dx, p + dp; t + dt). This conservation condition gives Eq. 9.15 by expanding its right side in Taylor series, taking the limit of the resulting expression as dt → 0, and using the relationships dx = v dt and dp = h dt that give the change of the position and momentum vectors in (t, t + dt). The Boltzmann equation with no collisions and the Liouville equation in Eq. 9.4 coincide.
If there are collisions, the term (∂g/∂t)_coll needs to be specified. Generally, this term is approximated by (∂g/∂t)_coll = −(g − g_0)/τ_c, where g_0 is the distribution g at thermal equilibrium and τ_c denotes a relaxation time ([4], Section 9.1). This approximation gives (g − g_0)_t = (g − g_0)_{t=0} exp(−t/τ_c) since ∂g_0/∂t = 0 by the definition of the equilibrium distribution. •
Example 9.31: The steady-state Boltzmann transport equation is given by Eq. 9.15 with ∂g/∂t = 0. If τ_c is a small parameter and (∂g/∂t)_coll = −(g − g_0)/τ_c, the first order approximation of this equation is

    τ_c ( (1/m) Σ_{i=1}^3 p_i ∂/∂x_i + Σ_{i=1}^3 h_i ∂/∂p_i ) (g_0 + τ_c g_1 + ···) = −[(g_0 + τ_c g_1 + ···) − g_0].
The energy of the random field Ising model with n sites and configuration S is

    E = − Σ_{{i,j}} e_{ij} S_i S_j − Σ_{i=1}^n (h + Z_i) S_i,     (9.16)

where e_{ij} denotes the interaction energy and h + Z_i is the energy corresponding to an external magnetic field with a deterministic component h and a random component Z_i.
Note: The classical Ising model is given by Eq. 9.16 with Z_i = 0 at all sites.
The symbol Σ_{{i,j}} indicates that summation is performed over the nearest neighbor sites. For example, the summation has at each site of a two-dimensional square lattice four terms corresponding to the right/left and above/below neighboring sites ([4], Chapter 11, [103], Chapter 14). The random field Z can be interpreted as a spatially fluctuating external field superposed on h. ▲
If the interaction energy is isotropic, that is, e_{ij} = e, e > 0, and the fields h and Z_i are zero, the minimum of the resulting lattice energy is −e γ n/2, where γ n/2 gives the number of terms in the first summation of Eq. 9.16 and γ = 4 for a square lattice. This minimum value of E corresponds to a completely polarized lattice, that is, a lattice with all spins of the same orientation. It is not possible to find analytically the minimum of the lattice energy for the general setting in Eq. 9.16.
We give an alternative form of the lattice energy that can be used to develop approximations for the minimum lattice energy, and offers some interesting interpretations of the ferromagnetism phenomenon.
Note: Denote by N_{++}, N_{+−}, and N_{−−} the number of nearest neighbor pairs with spins (+, +), (+, −) and (−, +), and (−, −), respectively. It can be shown that γ N_+ = 2 N_{++} + N_{+−} and γ N_− = 2 N_{−−} + N_{+−}, so that N_{+−} = γ N_+ − 2 N_{++} and N_{−−} = (γ N_− − N_{+−})/2 = γ n/2 + N_{++} − γ N_+ ([103], Section 14.1). Hence, we have

    Σ_{{i,j}} S_i S_j = N_{++} + N_{−−} − N_{+−} = 4 N_{++} − 2 γ N_+ + γ n/2.
Example 9.32: Suppose that e_{ij} = e > 0 and Z_i = 0 for all i, j = 1, ..., n, so that E can be expressed in terms of the macroscopic attributes N_+ and N_{++} of the Ising lattice (Eq. 9.17). Let L = (N_+ − N_−)/n ∈ [−1, 1] denote the fraction of sites with up spin in excess of sites with down spin, that is, the magnetization per particle.
Numerical algorithms are needed to find the minimum of the lattice energy even for the special case in which Z_i = 0 at all sites, as considered here. To simplify the analysis, the Bragg-Williams model assumes N_{++} = (N_+/n)² (γ n)/2, which yields the expression E = −n (e γ L²/2 + h L) for the lattice energy. If h = 0, then E = −n e γ L²/2 with minimum −n e γ/2 corresponding to L = ±1. If h ≠ 0, the determination of the minimum energy is less simple. We use the fact that the value of L minimizing E maximizes the partition function

    Q = Σ_L n! / ( [n (1 + L)/2]! [n (1 − L)/2]! ) exp(−β E),

where the summation is over the admissible values L = −1, −1 + 2/n, ..., 1, β = 1/(k T), k denotes the Boltzmann constant, and T is the temperature ([103], Section 14.4). If n → ∞, the partition function is maximized at L = l*, where l* is the solution of l* = tanh((h + γ e l*)/(k T)). If h = 0, this equation has the solutions l* = 0 and l* = ±l_0 for T > T_c and T < T_c, respectively, where T_c = e γ/k. Hence, the lattice is ferromagnetic for T < T_c but has no magnetization otherwise. ◇
Note: The Bragg-Williams model and arguments similar to those used to obtain Eq. 9.17 yield the expression of the lattice energy.
The partition function is Q = Σ_{S_i} exp(−β E), where the summation is over all possible values of S_i ([103], Section 14.1). Because L is determined by N_+, the number of configurations (S_1, ..., S_n) having the same value of L is equal to the number of ways in which we can extract N_+ out of n objects, that is, n!/(N_+! N_−!). The equation l* = tanh(h/(k T) + e γ l*/(k T)) giving the value of L that maximizes Q results by using Stirling's formula to approximate the combinatorial coefficient in the expression of Q and retaining the dominant term of Q ([103], Section 14.4).
If h = 0, we have l* = tanh(e γ l*/(k T)) so that l* = 0 for e γ/(k T) < 1 or l* = 0 and l* = ±l_0 for e γ/(k T) > 1, where l_0 ∈ (0, 1). The solution l* = 0 under the condition e γ/(k T) > 1 is not valid since it corresponds to a minimum rather than a maximum of Q. The solution l* = ±l_0 is not unique because if h = 0 there is no intrinsic difference between up and down spins. ▲
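The fixed point equation l* = tanh((h + γ e l*)/(k T)) can be solved by direct iteration, which converges to the stable solution. A sketch (unit e and k, γ = 4 for a square lattice, and the starting point l_0 = 0.9 are illustrative choices; with these values T_c = e γ/k = 4):

```python
import math

def magnetization(T, h=0.0, e=1.0, gamma=4.0, kB=1.0, l0=0.9):
    """Iterate l <- tanh((h + gamma*e*l)/(kB*T)) to a stable fixed point."""
    l = l0
    for _ in range(2000):
        l = math.tanh((h + gamma * e * l) / (kB * T))
    return l

Tc = 1.0 * 4.0 / 1.0    # Tc = e*gamma/kB
# below Tc the lattice is ferromagnetic (l* near 1); above Tc it is not (l* = 0)
print(magnetization(0.5 * Tc), magnetization(2.0 * Tc))
```

Starting from l_0 = −0.9 instead would reach −l_0, illustrating that the nonzero solution is not unique for h = 0.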
zero and variance σ². Let Z(ω) be a sample of Z = (Z_1, ..., Z_n), and denote by E_min(ω) the minimum of the energy in Eq. 9.16 corresponding to this noise sample.
Figure 9.13 shows estimates of the mean and the coefficient of variation of Ē_min = E_min/(n σ) for noise intensities σ in the range [0, 10]. The estimates have been calculated from 1,000 samples of E_min at each noise level. The figure also shows samples of the lattice configuration S at the minimum energy for several noise levels (σ = 4, 6, 10).

Figure 9.13. Estimates of the mean and standard deviation of Ē_min versus the noise scale σ

For σ = 0 the minimum energy is −γ n/2 so that its coefficient of variation is zero. The magnitude of the coefficient of variation of Ē_min increases with the noise, and approaches an asymptotic value of −√((π/2 − 1)/n), which is −0.0378 for n = 400. The estimated mean of Ē_min converges to −√(2/π) = −0.7979 as the noise intensity increases. ◇
Note: An algorithm in Section 5.2.1 has been used to generate independent samples of Z needed to construct the plots in Fig. 9.13.
If the noise is very large, the atomic spins tend to have the same orientation as the noise so that (1) the contribution of the first term in Eq. 9.16 becomes negligible and (2) the second term is approximately equal in distribution with −Σ_{i=1}^n |Z_i|. Hence, Ē_min can be approximated by the random variable X = −(1/n) Σ_{i=1}^n |G_i|, where G_i are independent copies of N(0, 1). Because E[|G_1|] = √(2/π) and E[|G_1|²] = 1, the mean and variance of X are −√(2/π) and (1 − 2/π)/n, respectively, which yields the stated coefficient of variation of Ē_min. ▲
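The large-noise limit in the note can be checked by sampling X = −(1/n) Σ |G_i| directly (n = 400 as in the figure; the number of Monte Carlo samples is an illustrative choice):

```python
import math, random

random.seed(5)
n, N = 400, 2000
# X = -(1/n) sum |G_i| approximates E_min/(n sigma) for large noise
samples = [-sum(abs(random.gauss(0, 1)) for _ in range(n)) / n for _ in range(N)]
mean = sum(samples) / N
var = sum((s - mean)**2 for s in samples) / (N - 1)
cv = math.sqrt(var) / mean              # negative, since the mean is negative
# targets: mean -> -sqrt(2/pi) = -0.7979, cv -> -sqrt((pi/2 - 1)/n) = -0.0378 for n = 400
print(mean, cv)
```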
Example 9.34: The previous example examines the random field Ising model in Eq. 9.16 with h = 0 at an arbitrary temperature T. Here we take T = 0 and h ≠ 0. If the noise Z_i is zero at all sites and h < 0 has a large magnitude, then S_i = −1 at all sites, that is, all spins point down. As the external field h is increased, it reaches a critical value at which all spins change suddenly their orientation and the magnetization jumps from −1 to +1 (Fig. 9.14). Hence, the lattice experiences a phase transition since the atomic spins at all its sites change orientation abruptly.

Figure 9.14. Magnetization as a function of the external field h (σ = 1.5)

Suppose now that the previous experiment is repeated and that the noise in Eq. 9.16 is not zero but consists of independent Gaussian variables Z_i with mean zero and variance σ². We can still observe a phase transition in the presence of noise. However, the transition is less abrupt and becomes smoother as the noise intensity increases. The Ising model in this example has been used to study Barkhausen noise in magnetic materials [168]. ◇
Note: For small noise intensities, that is, small values of σ, the couplings between spins dominate and the lattice changes polarization through a large avalanche that sweeps through the entire system as the external field is increased to a particular level. For large values of σ, the coupling between spins is insignificant compared to the local disorder, and the lattice tends to magnetize in many single-spin flips, that is, avalanches contain a single spin. There exists a critical value σ_c of the disorder parameter σ defining a phase transition between these two limit cases. Near the critical point σ_c, avalanches can be described by a power law up to a size depending on the magnitude of |σ − σ_c| [168]. ▲
The Verhulst model is defined by the differential equation

    dx(t)/dt = p x(t) − x(t)²,     (9.19)

where p x(t) is the rate of population increase or decrease depending on the sign of p ∈ ℝ, and the term −x(t)² accounts for the fact that resources are limited. The parameter p is positive and negative for a favorable and hostile environment, respectively. The solution of the Verhulst model is

    x(t) = x(0) e^{pt} [1 + x(0) (e^{pt} − 1)/p]⁻¹,

and, as t → ∞, it approaches a steady state that can be viewed as a random variable with the density f_s(x) = δ(x) for p < 0 and f_s(x) = δ(x − p) for p > 0, where δ denotes the Dirac delta function.
The Verhulst model is a special case of Eq. 9.1 in which V has deterministic coefficients and Y is zero. The following two examples consider extensions of the Verhulst model that account for the random variations of the environment. It is shown that the state of these models experiences one or more noise induced transitions.
Example 9.35: Suppose that the rate p in the Verhulst model is replaced by p + σ dB(t)/dt, where B denotes a Brownian motion and σ is a constant giving the scale of the environmental noise. The state X of this version of the Verhulst model is a diffusion process defined by the stochastic differential equation

    dX(t) = (p X(t) − X(t)²) dt + σ X(t) dB(t).

The stationary density f_s of X has distinct functional forms for p < σ²/2, for σ²/2 < p < σ² (for example, p = 3/4 with σ = 1), and for p > σ². These changes in the functional form of f_s are caused by random fluctuations in the environment, since E[p dt + σ dB(t)]/dt = p has the same value as in the deterministic Verhulst model ([102], Section 6.4).
It can be shown that the above differential equation has the solution

    X(t) = X(0) exp((p − σ²/2) t + σ B(t)) [1 + X(0) ∫₀ᵗ exp((p − σ²/2) s + σ B(s)) ds]⁻¹,

where X(0) denotes the initial state ([115], p. 125). Figure 9.16 shows samples of X for p = 0.0, 0.75, and 2.0.
Proof: The density f(x; t | x₀; 0) of X(t) | (X(0) = x₀) satisfies the Fokker-Planck
equation (Section 7.3.1.3)

∂f/∂t = −∂[(ρ x − x²) f]/∂x + (σ²/2) ∂²(x² f)/∂x².

The stationary density f_s(x) = lim_{t→∞} f(x; t | x₀; 0) is the solution of the above equation
with ∂f_s/∂t = 0, that is,

0 = −d[(ρ x − x²) f_s]/dx + (σ²/2) d²(x² f_s)/dx²,

since f_s and its first derivative approach zero as |x| → ∞. The above equation gives
ln(f_s(x)) = 2 (ρ − σ²) ln(x)/σ² − 2 x/σ² + q by integration, where q is a constant.
The stated expression of f_s results by elementary manipulations.
If ρ > σ²/2, the integral ∫₀^∞ x^{2 (ρ/σ² − 1)} e^{−2 x/σ²} dx is bounded so that f_s exists
and has the above expression. If ρ < σ²/2, f_s is a delta function at x = 0, that is,
f_s(x) = δ(x) ([102], Section 6.4), a result consistent with the expression of X(t) showing
that, if ρ < σ²/2, then X(t) → 0 a.s. as t → ∞ since ρ − σ²/2 + σ B(t)/t converges
a.s. to ρ − σ²/2 as t → ∞ (Section 8.7). Also, note that the first two moments of X_s have
the same expression for ρ > σ²/2. Hence, the qualitative change in f_s at ρ = σ² cannot
be captured by the second moment characterization of X_s.
The solution x(t) = x(0) e^{ρ t} [1 + x(0) (e^{ρ t} − 1)/ρ]^{−1} of Eq. 9.19 has the asymptotic
value x_s = lim_{t→∞} x(t) = 0 and x_s = ρ for ρ < 0 and ρ > 0, respectively. Hence,
ρ = 0 constitutes a bifurcation point between two stable solutions. The steady-state
solution x_s = ρ differs from the mean of the stationary solution X_s, which is E[X_s] = ρ − σ²/2
for ρ > σ²/2, the mean of the Gamma density f_s; the two coincide only in the limit σ → 0.
The Lyapunov exponent λ_LE obtained by the method in Section 8.7 is λ_LE = ρ −
σ²/2 and λ_LE = (1 − m) (ρ − σ²/2), m = 2, for the trivial and non-trivial stationary
solutions, respectively. Since the stationary solution is stable a.s. if λ_LE < 0, the trivial and
non-trivial stationary solutions are stable a.s. for ρ < σ²/2 and ρ > σ²/2, respectively,
in agreement with the results in this section showing that f_s(x) = δ(x) for ρ < σ²/2 and
X_s > 0 a.s. for ρ > σ²/2. The approach in Section 8.7 delivers only the value ρ = σ²/2
of ρ at which the stationary solution X_s bifurcates but not the transition of f_s at
ρ = σ². However, the Lyapunov exponent method can be extended to detect the transition of f_s at
ρ = σ² [10].
We have modeled the uncertainty in the environment by a Gaussian white noise
for simplicity. A broadband noise with finite variance would have been more realistic. If
such a noise were used to model the environment, the differential equation for X would
have been a Stratonovich rather than an Itô equation. The relationship between the Itô and
Stratonovich differential equations in Section 4.7.1.2 can be used to recast a Stratonovich
equation into an Itô equation. •
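The stationary density derived in the proof above is a Gamma density, and its mean and mode can be checked numerically. The sketch below assumes the illustrative values ρ = 2 and σ = 1, which are not taken from the text:

```python
import numpy as np

# f_s(x) is proportional to x**(2*rho/sigma**2 - 2) * exp(-2*x/sigma**2), a Gamma
# density with shape 2*rho/sigma**2 - 1 and scale sigma**2/2 (exists for rho > sigma**2/2).
def fs_unnormalized(x, rho, sigma):
    return x ** (2.0 * rho / sigma**2 - 2.0) * np.exp(-2.0 * x / sigma**2)

rho, sigma = 2.0, 1.0                 # rho > sigma**2, so f_s has an interior maximum
x = np.linspace(1e-6, 20.0, 200_001)
dx = x[1] - x[0]
w = fs_unnormalized(x, rho, sigma)
mean = np.sum(x * w) / np.sum(w)      # Gamma mean: rho - sigma**2/2 = 1.5
mode = x[np.argmax(w)]                # interior maximum at rho - sigma**2 = 1.0
print(round(float(mean), 3), round(float(mode), 3))
```

The mode sits at ρ − σ², which is why the density changes qualitatively at ρ = σ², while the mean ρ − σ²/2 varies smoothly through that point.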
Figure 9.17. Real and imaginary parts of the characteristic function φ_s for λ = 12,
Y₁ ~ U(−a, a), a = 0.5, and ρ = 1, 2
Figure 9.17 shows the real and imaginary parts of the characteristic function
φ_s for λ = 12, Y₁ ~ U(−a, a), a = 0.5, and two values of ρ. The significant
difference between the two characteristic functions suggests that f_s may exhibit a
transition.

Figure 9.18. The density f_s for λ = 12, Y₁ ~ U(−a, a), a = 0.5, and ρ = 1, 2

The corresponding plots of f_s in Fig. 9.18 show that there is a qualitative
difference between the stationary solutions for ρ = 1 and ρ = 2. The most
likely values of X_s are in a small vicinity of x = 0 and away from x = 0 for
ρ = 1 and ρ = 2, respectively. We also note that Var[C(t)] = λ t E[Y₁²] = t
for λ = 12 and Y₁ ~ U(−a, a) with a = 0.5 so that the processes C(t) in this
example and σ B(t) with σ = 1 in the previous example have the same second
moment properties. ◊
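The second-moment statement can be checked by simulating the compound Poisson process C(t) = Σ_{k≤N(t)} Y_k directly; the sample size below is illustrative:

```python
import numpy as np

# C(t) = sum_{k=1}^{N(t)} Y_k with N(t) ~ Poisson(lam*t) and Y_1 ~ U(-a, a),
# so Var[C(t)] = lam * t * E[Y_1**2] = lam * t * a**2 / 3.
rng = np.random.default_rng(0)
lam, a, t, n = 12.0, 0.5, 1.0, 100_000
counts = rng.poisson(lam * t, size=n)                       # N(t) for each sample
c = np.array([rng.uniform(-a, a, size=k).sum() for k in counts])
print(round(float(c.var()), 2))   # theory: 12 * 1 * 0.25 / 3 = 1.0
```

With λ = 12 and a = 0.5 the variance matches that of σ B(t) with σ = 1, even though the two processes have very different sample paths.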
The expectation of the above equation can be performed term by term. The left side gives
φ(u; t) − φ(u; 0). The calculation of the expectation of the integral on the right side does
not pose difficulties and is not discussed. To calculate the expectation of the summation in
the above equation, we note that the term

and the jump times of C are uniformly distributed in (0, t] conditional on N(t), the above
conditional expectation is ∫_ℝ dF(y) ∫₀^t φ(u (1 + y); ξ) dξ/t − ∫₀^t φ(u; ξ) dξ/t for each
k = 1, …, N(t). The stated equation for φ results by differentiating the expectation of the
above Itô formula for e^{√−1 u X(t)}.

The numerical results in Fig. 9.17 have been obtained by using a finite difference
approximation of the equation for φ with the boundary conditions φ(±ū; t) = 0 and
∂φ(±ū; t)/∂u = 0, where ū = 200. The densities f_s in Fig. 9.18 are the Fourier transforms
of the corresponding characteristic functions φ_s. •
content and pollutant concentration in soil deposits are random functions satisfy-
ing partial differential equations with random coefficients and input. This section
presents models for the moisture content in soil deposits (Section 9.5.1), water
quality indicators in rivers (Section 9.5.2), and pollutant concentration in soil de-
posits (Section 9.5.3). We also give some numerical results for these environment
and ecology measures.
(9.21)
with the initial and boundary conditions Θ(z, 0) = Θ_i(z), z > 0, and Q =
∂Θ(0, t)/∂z = R, t ≥ 0, where R denotes the rainfall rate at (x, y), and the
coefficients ψ and K are the suction head and the hydraulic conductivity, respectively
([90], Chapter 5).
Note: The coefficients ψ and K can be related to Θ and K_s by empirical constitutive
relationships, so that Eq. 9.21 becomes a partial differential equation for Θ. The functions
Θ_i and Q = ∂Θ(0, t)/∂z are the moisture content at the initial time t = 0 and the water
flux in the vertical direction, respectively.
Because of the uncertainty in K_s, Eq. 9.21 has random coefficients. The boundary
condition at z = 0 can also be random since the rainfall rate R exhibits random spatial
and temporal fluctuations. The dependence of the functions Θ, Θ_s, K_s, and R on the
coordinates (x, y) is not shown in Eq. 9.21 because we assume that the water can flow only
in the z-direction. Therefore, the solution of Eq. 9.21 at a site (x, y) depends only on the
soil properties and rainfall rate at this site. ▲
where c and s are some constants ([90], p. 85). If K_s ≥ R, ponding cannot occur
so that T_p = +∞.
Note: The constants c and s depend on Θ_i, Θ_s, the soil water diffusion coefficient, the soil
saturated hydraulic conductivity, and other soil properties ([90], p. 84).
The ponding time in Eq. 9.22 is a nonlinear mapping of K_s and R so that T_p is a
random field if K_s and R are random fields. We can calculate properties of T_p from the
probability laws of K_s and R. ▲
[Figure: samples of K_s and R, and corresponding samples of 1/T_p and of 1/T_p computed with R = E[R].]
The lognormal random field K_s is homogeneous with mean μ_k and covariance function c_k,
so that σ_k² = c_k(0) and v_k = σ_k/μ_k ([79], p. 49). The finite dimensional
distributions of K_s can be obtained from the definition of K_s and the probability law of
G (Section 3.6.6 in this book, [79], p. 45). For example, the first order or the marginal
distribution of this random field is P(K_s ≤ z) = Φ((ln(z) − μ_g)/σ_g), where Φ denotes
the distribution of N(0, 1). It can also be shown that the scaled covariance function of
K_s satisfies the condition |ρ̃(ξ)| ≤ |ρ(ξ)| for all ξ ∈ ℝ², where ρ̃(ξ) = c_k(ξ)/c_k(0) =
[(1 + v_k²)^{ρ(ξ)} − 1]/v_k² ([79], p. 49). ▲
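The marginal-distribution formula can be checked by Monte Carlo; the values of μ_g, σ_g, and z below are illustrative, not parameters from the text:

```python
import math
import numpy as np

# K_s = exp(mu_g + sigma_g * G) with G ~ N(0, 1) marginally, so
# P(K_s <= z) = Phi((ln z - mu_g)/sigma_g), Phi the standard normal distribution.
rng = np.random.default_rng(2)
mu_g, sigma_g, z = 0.1, 0.4, 1.2
ks = np.exp(mu_g + sigma_g * rng.standard_normal(500_000))
empirical = float(np.mean(ks <= z))
exact = 0.5 * (1.0 + math.erf((math.log(z) - mu_g) / (sigma_g * math.sqrt(2.0))))
print(abs(empirical - exact) < 0.005)
```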
Probabilistic models for K_s and R of the type in Example 9.37 can also
be used in Eq. 9.21. Moreover, the model for R can be generalized by allowing
the rainfall to vary randomly not only in space but also in time. We have already
stated that analytical solutions cannot be obtained for Eq. 9.21, but this equation
can be solved by Monte Carlo simulation. First, samples of (K_s, R) need to be
generated and the corresponding samples of (Ψ, K) calculated. Second, Eq. 9.21
has to be solved for the samples of (K_s, R, Ψ, K) obtained in the previous step.
Third, the corresponding samples of the ponding time T_p result from its defining
equation Θ(0, T_p) = Θ_s. Fourth, statistics need to be calculated for Θ and T_p
from the calculated samples of these random fields.
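The four-step Monte Carlo procedure can be sketched as follows; `sample_ks_r` and `ponding_time` are hypothetical placeholders standing in for the random-field generator and for the solution of Eq. 9.21, not the models in the text:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_ks_r():
    # Step 1 (placeholder): draw a (K_s, R) pair; lognormal shapes chosen for illustration.
    ks = np.exp(0.1 + 0.4 * rng.standard_normal())
    r = np.exp(-0.5 + 0.3 * rng.standard_normal())
    return ks, r

def ponding_time(ks, r):
    # Steps 2-3 (placeholder): stands in for solving Eq. 9.21 and extracting T_p;
    # it only preserves the qualitative rule T_p = +inf when K_s >= R.
    return np.inf if ks >= r else ks / (r - ks)

samples = [ponding_time(*sample_ks_r()) for _ in range(10_000)]
finite = [t for t in samples if np.isfinite(t)]
frac = len(finite) / len(samples)      # estimate of P(ponding occurs)
med = float(np.median(finite))         # Step 4: a statistic of T_p from the samples
print(frac > 0.0, med > 0.0)
```

Replacing the two placeholders with a field generator and a Richards-equation solver gives the procedure described in the text.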
dl(t)/dt = −(k₁/v) l(t),                                         (9.24)
dc(t)/dt = −(k₂/v) (c(t) − c_s) − (k₁/v) l(t),

where k₁ and k₂ denote the deoxygenation and reaeration rates, respectively, and
c_s is the DO saturation concentration ([2], p. 137). The evolution of the state
vector (l, c) can be calculated simply from Eq. 9.24 for specified initial conditions.
Alternative models have been proposed for the evolution of (/, c) that account
for additional processes that may take place in streams ([2], pp. 135-150). Our
objectives are to establish a stochastic version for the evolution of BOD and DO
and calculate statistics of these water quality measures at any distance t > 0
downstream from a pollution source.
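For deterministic parameters, Eq. 9.24 can be integrated numerically and checked against its closed-form solution. The rate and velocity values below are those quoted later in Example 9.38; the initial conditions are illustrative:

```python
import math

# Eq. 9.24 integrated with a classical fourth-order Runge-Kutta step; t is
# distance downstream. k1 = 0.35, k2 = 0.75, v = 7.5, cs = 10 as in Example 9.38.
k1, k2, v, cs = 0.35, 0.75, 7.5, 10.0
l0, c0 = 6.8, 8.7                       # illustrative initial BOD and DO

def rhs(y):
    l, c = y
    return (-(k1 / v) * l, -(k2 / v) * (c - cs) - (k1 / v) * l)

def rk4_step(y, dt):
    a = rhs(y)
    b = rhs((y[0] + dt / 2 * a[0], y[1] + dt / 2 * a[1]))
    g = rhs((y[0] + dt / 2 * b[0], y[1] + dt / 2 * b[1]))
    d = rhs((y[0] + dt * g[0], y[1] + dt * g[1]))
    return (y[0] + dt / 6 * (a[0] + 2 * b[0] + 2 * g[0] + d[0]),
            y[1] + dt / 6 * (a[1] + 2 * b[1] + 2 * g[1] + d[1]))

n, t_end = 500, 5.0
dt = t_end / n
y = (l0, c0)
for _ in range(n):
    y = rk4_step(y, dt)

# Closed-form check: l(t) = l0 exp(-k1 t/v); the DO deficit d = cs - c satisfies
# d(t) = d0 exp(-k2 t/v) + (k1 l0/(k2 - k1)) (exp(-k1 t/v) - exp(-k2 t/v)).
l_exact = l0 * math.exp(-k1 * t_end / v)
d0 = cs - c0
d_exact = (d0 * math.exp(-k2 * t_end / v)
           + k1 * l0 / (k2 - k1) * (math.exp(-k1 * t_end / v) - math.exp(-k2 * t_end / v)))
print(abs(y[0] - l_exact) < 1e-8, abs((cs - y[1]) - d_exact) < 1e-8)
```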
Consider the differential equation

dX(t)/dt = A X(t) + Y(t),   t ≥ 0,                                (9.25)

where L(t) and C(t) are the coordinates of the state vector X(t), A₁₁ = −(K₁ +
K₃)/v, A₁₂ = 0, A₂₁ = −K₁/v, A₂₂ = −K₂/v, Y₁(t) = Z(t)/v, and Y₂(t) =
(K₂ c_s − K_{2,p})/v are the entries of A and Y(t). The uppercase letters in Eq. 9.25
are used to denote random variables and functions. The parameters K₁ and K₂
have the same meaning as in Eq. 9.24 but can be random. The parameters K₃, Z,
and K_{2,p} denote the rate at which pollution decreases because of sedimentation
and absorption, the rate of pollution increase along the river banks caused by
small pollution sources, and the rate of DO decrease owing to photosynthesis,
respectively ([2], p. 138, [138]).
The parameters K₃, Z, and K_{2,p} may depend
on the distance t from the main pollution source. The model in Eq. 9.25 is a linear
differential equation with random coefficients, random input, and deterministic
or random initial conditions. The model in Eq. 9.25 is an extension of Eq. 9.24
because it (1) considers additional processes, such as sedimentation, absorption,
and photosynthesis, and (2) incorporates the uncertainty in the model parameters
and input.
The method for calculating probabilistic properties of X in Eq. 9.25 depends
on the type of the driving noise. For example, the conditional analysis in
Section 9.2.3 and the methods in Chapter 7 can be used if Y is a Brownian motion,
a martingale, or a semimartingale. If Y is a random variable rather than a random
function, then
X(t) = [ Ψ₁(t)    0     ] X(0) + [ Ξ₁(t) ] = Ψ(t) X(0) + Ξ(t),    (9.26)
       [ Ψ₂(t)  Ψ₃(t) ]          [ Ξ₂(t) ]

f(x, z, κ; t)
  = f₀( (x₁ − ξ₁(t))/ψ₁(t), (−ψ₂(t) (x₁ − ξ₁(t)) + ψ₁(t) (x₂ − ξ₂(t)))/(ψ₁(t) ψ₃(t)), z, κ )
Note: The augmented vector X̄ = (X̄₁ = X₁, X̄₂ = X₂, X̄₃ = Z, X̄₄ = K_{2,p}) is defined
by Eqs. 9.26-9.27 and the differential equations dX̄₃(t)/dt = 0, dX̄₄(t)/dt = 0 with the initial
conditions X̄₃(0) = Z, X̄₄(0) = K_{2,p}. The density f(x, z, κ; t) is given by Eq. 9.5
applied to X̄ in which we use X(0) = Ψ(t)⁻¹ [X(t) − Ξ(t)] (Eq. 9.26). Generally, the
determination of the density f(x; t) = ∫_{ℝ²} f(x, z, κ; t) dz dκ of X(t) requires numerical
integration. ▲
Example 9.38: Suppose that X(0), Z, and K_{2,p} in Eq. 9.26 are random variables
and all the other parameters in this equation are deterministic. Let the densities
of X₁(0) and X₂(0) be proportional to the densities of N(6.8, 1) and N(8.7, 0.03)
in the intervals [0, ∞) and [0, c_s], respectively, and zero outside these intervals.
For example, f_{X₁(0)}(ξ) = c₁ 1_{[0,∞)}(ξ) φ(ξ − 6.8), where c₁ > 0 is a constant and
φ denotes the density of N(0, 1). Let Z and K_{2,p} be random variables that are
uniformly distributed in the intervals (0, 0.4) and (0, 0.2), respectively. It is also
assumed that the random variables X₁(0), X₂(0), Z, and K_{2,p} are mutually independent.
Then the joint density of (X₁(0), X₂(0), Z, K_{2,p}), that is, the density of
the augmented state vector X̄(t) = (X₁(t), X₂(t), Z, K_{2,p}) at t = 0, is

for x̄₁ ≥ 0, x̄₂ ∈ [0, c_s], x̄₃ ∈ [0, 0.4], and x̄₄ ∈ [0, 0.2], where c > 0 is a
normalization constant. If any of these conditions is violated, f₀ is zero. The
density of X̄(t) = (X(t), Z, K_{2,p}) at a later time t ≥ 0 is (Eq. 9.28)

if x̄₁ ∈ [ξ₁(t), ∞), x̄₂ ∈ [c_*, c*], x̄₃ ∈ [0, 0.4], and x̄₄ ∈ [0, 0.2] and zero if any
of these conditions is not satisfied, where c_* = max(0, (ψ₂/ψ₁) (x̄₁ − ξ₁) + ξ₂)
and c* = min(c_s, c_s ψ₃ + (ψ₂/ψ₁) (x̄₁ − ξ₁) + ξ₂) [138].
Figure 9.20 shows the joint density of X(t) at distances t = 0 and t = 5
miles downstream from a pollution source for k₁ = 0.35, k₂ = 0.75, k₃ = 0.20,
c_s = 10, and an average stream velocity v = 7.5 miles/day. The plots have
been calculated from the above expression of the density of X̄(t) by numerical
integration. They show the trend of (BOD, DO) with the distance t ≥ 0 and the
spread about this trend. Such plots can be used to calculate the probability that the
DO level does not fall below a critical level DO_cr at any site downstream from a
pollution source. Values of DO smaller than DO_cr can have serious consequences
for the aquatic life in a stream. ◊
Note: The densities in Fig. 9.20 have been calculated from the density of X̄(t) in which
the random variables Z and K_{2,p} were assumed to take discrete values. Numerical results
are for ten equally spaced values in the range of possible values of these variables.
Figure 9.20. The joint density of (BOD, DO) at t = 0 and t = 5 miles downstream
from a pollution source

The relationship between X(t) and (X(0), Z, K_{2,p}) in Eq. 9.26 can also be used
to calculate moments of BOD and DO. For example, the expectations of these pollution
measures are
The spatial and temporal variation of the flow velocity and pollutant concentration
depend on the properties of the soil deposit in D, the pollutant source, and the
boundary and initial conditions. Our objective is to find properties of the pollutant
concentration c(x, t). This function varies randomly in space and time if the soil
properties, the pollutant source, and/or the initial and boundary conditions are
uncertain.
The steady-state head in D is the solution of the partial differential equation

where the source functions, the head h₀, and the flux q₀ are prescribed functions,
∂D_u and ∂D_q partition the boundary of D, and n(x) is the unit outward
normal at x ∈ ∂D_q ([39], Chapter 20).
Note: The flux in D can be calculated from the Darcy law q(x) = k(x) ∇h(x) and the
solution of Eqs. 9.29 and 9.30. The velocity v of ground water is v = q/φ, where φ
denotes the effective soil porosity ([39], Chapter 20). ▲
∂c/∂t + c ∇·v + v·∇c − Σ_{i,j=1}^q [ (∂d_ij/∂x_i) ∂c/∂x_j + d_ij ∂²c/(∂x_i ∂x_j) ] = g(x, t),   (9.31)

If the flow is divergence free, that is, ∇·v = 0, the above equation becomes

∂c/∂t + v·∇c − Σ_{i,j=1}^q [ (∂d_ij/∂x_i) ∂c/∂x_j + d_ij ∂²c/(∂x_i ∂x_j) ] = g(x, t).

If in addition the coefficients d_ij are space invariant, the above equation becomes

∂c/∂t + v·∇c − Σ_{i,j=1}^q d_ij ∂²c/(∂x_i ∂x_j) = g(x, t).
734 Chapter 9. Stochastic Systems and Input
The defining equation for c is a special case of the type of partial differential equation
considered in Section 6.1. Hence, the local solutions in Sections 6.2 and 6.3 can be
used to find properties of the concentration at a specified location x ∈ D and time t ≥ 0. ▲
Example 9.39: Suppose that (1) the ground water flow in an infinite medium has
a known constant velocity v > 0 along the coordinate x₁, (2) an amount c₀ > 0
of pollutant is placed instantaneously at x = (0, 0, 0) and time t = 0, and (3) the
dispersion tensor d_ij = d δ_ij is constant. Then c satisfies the partial differential
equation ∂c/∂t + v ∂c/∂x₁ = d Δc, where Δ = Σ_{i=1}^3 ∂²/∂x_i², so that

c(x, t) = c₀ (4 π d t)^{−3/2} exp{−[(x₁ − v t)² + x₂² + x₃²]/(4 d t)}.

Figure 9.21 shows contour lines of the pollutant concentration at time t = 0.1, 1.5,
and 3.0 in a two dimensional medium for d = 0.2 and v = 4. Two phenomena
can be observed. The center of the pollutant cloud travels in the x₁ direction with
velocity v and the pollutant diffuses in the x₁ and x₂ directions as time increases. ◊
[Figure 9.21: contour lines of the pollutant concentration at t = 0.1, 1.5, and 3.0.]
Note: That the above expression of c is the solution of the transport equation ∂c/∂t +
v ∂c/∂x₁ = d Δc can be verified by direct calculations ([46], Chapter 1). As t ↓ 0, the
concentration converges to c₀ δ(x). The equality c₀ = ∫_{ℝ³} c(x, t) dx holds at each time
t ≥ 0 and expresses the conservation of mass. ▲
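The mass-conservation statement in the note can be checked numerically from the closed-form concentration; d = 0.2 and v = 4 are the values of the example, while c₀ = 1 and t = 1.5 are illustrative:

```python
import numpy as np

# c(x, t) = c0 * (4*pi*d*t)**(-3/2) * exp(-((x1 - v*t)**2 + x2**2 + x3**2)/(4*d*t))
# factors into three 1-D Gaussians, so the total mass can be checked one axis at a
# time; the shift v*t along x1 does not change the integral.
c0, d, v, t = 1.0, 0.2, 4.0, 1.5
s = np.linspace(-30.0, 30.0, 60_001)
ds = s[1] - s[0]
g = np.exp(-s**2 / (4.0 * d * t)) / np.sqrt(4.0 * np.pi * d * t)   # one Gaussian factor
mass = c0 * (np.sum(g) * ds) ** 3
print(round(float(mass), 6))   # conservation of mass: equals c0 = 1.0
```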
Example 9.40: Consider the transport problem in the previous example but assume
that the dispersion tensor D_ij(x) = D(x) δ_ij varies randomly in space. The
pollutant concentration C(x, t) becomes a random function of space and time defined
by (Eq. 9.31)

∂C/∂t + v ∂C/∂x₁ − Σ_i [ (∂D/∂x_i) ∂C/∂x_i + D ∂²C/∂x_i² ] = 0,
Figure 9.22. Samples of dispersion tensor and contour lines for the corresponding
samples of pollutant concentration
with the initial condition C(x, 0) = c₀ δ(x). It is not possible to find the probability
law of C analytically so that numerical methods need to be employed.
The top plots in Fig. 9.22 are two samples of the dispersion D assumed to
be a lognormal translation field defined by D(x) = exp(μ_g + σ_g G(x)), where
G(x) = (G₁(x₁) + G₂(x₂))/√2, x ∈ ℝ², and G_i, i = 1, 2, denote independent
Gaussian processes with mean zero and covariance functions E[G_i(x_i) G_i(x_i +
s_i)] = exp(−α |s_i|), α = 2. The expected value of D(x) is 0.2, that is, the
value of the dispersion in Example 9.39. The samples of D are for coefficients of
variation v_d = 0.7 and v_d = 1.5.
The bottom two plots in Fig. 9.22 are contour lines for the corresponding
samples of C at distances t = 0.1, 1.5, and 3 downstream from the pollution
source for v = 4. Figures 9.21 and 9.22 show that the uncertainty in the dispersion
tensor can have a significant effect on pollution spread. Also, the sample to sample
variation of pollutant concentration can be significant. ◊
Note: The parameters μ_g and σ_g in the definition of D(x) can be related to the mean μ_d
and standard deviation σ_d of D by μ_d = exp(μ_g + σ_g²/2) and σ_d² = μ_d² [exp(σ_g²) − 1].
These equations can be used to calculate the values of the parameters μ_g and σ_g for specified
values of μ_d and σ_d.
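The moment relations in the note can be inverted in closed form; the helper below is a sketch, and the function name and target values are ours, not the text's:

```python
import math

# Invert mu_d = exp(mu_g + sigma_g**2/2) and sd_d**2 = mu_d**2*(exp(sigma_g**2) - 1)
# for the lognormal field D(x) = exp(mu_g + sigma_g*G(x)) with Var[G(x)] = 1.
def lognormal_params(mu_d, sd_d):
    sigma_g2 = math.log(1.0 + (sd_d / mu_d) ** 2)
    return math.log(mu_d) - sigma_g2 / 2.0, math.sqrt(sigma_g2)

# E[D] = 0.2 with coefficient of variation 1.5, one of the cases in Fig. 9.22:
mu_g, sigma_g = lognormal_params(0.2, 0.3)
print(round(math.exp(mu_g + sigma_g**2 / 2.0), 10))   # round trip recovers 0.2
```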
The Monte Carlo simulation algorithm in Section 5.3.3.1 has been used to generate
samples of the dispersion tensor D. A MATLAB toolbox has been applied to solve the
partial differential equation of the pollutant concentration for samples of D.
Note that an extended version of the local solution in Section 6.2 of the type considered
in, for example, Section 8.2 can be used to find properties of C(x, t) at an arbitrary
location and time. Also, the methods discussed in the first part of this chapter can be used
to find moments and other properties of the pollutant concentration approximately. ▲
The calculations in the previous two examples are based on the assumption
that the flow velocity is known and deterministic. Generally, the flow velocity is
not known and may be a random function of x ∈ D and t ≥ 0. In this case, we
also have to solve Eq. 9.29 to find the head, which gives the flux and the flow velocity.
There are no simple methods for solving Eqs. 9.29 and 9.31 even in a deterministic
setting.
The stochastic version of Eqs. 9.29 and 9.31 results by assuming that the
hydraulic conductivity, the dispersion tensor, the input, and the initial and bound-
ary conditions are random functions with known probability law. Generally, the
probability law of these functions is not known so that it has to be inferred from
observations. Measurements of conductivity and dispersion can be performed at
a finite number of points x_k and constitute spatial averages over relatively small
volumes centered at x_k, where the size of these volumes depends on the measuring
device. Higher resolution devices tend to identify smaller scale features of
the measured fields so that the resulting records have a larger variability. Methods
for incorporating the quality of measurements in the probabilistic models for
conductivity and dispersion are discussed in [39] (Chapter 20) and [47].
Example 9.41: Consider a rod with length l > 0, cross section of unit area, and
random stiffness K(x) > 0 and density R(x) > 0, x E (0, l). The rod is free
and fixed at its left and right ends, respectively. It is assumed that a force Y(t) is
acting on the left end of the rod and that the rod is at rest at the initial time. Let
U(x, t) be the rod displacement at location x ∈ (0, l) and time t ≥ 0. Then

R(x) ∂²U(x, t)/∂t² = ∂[K(x) ∂U(x, t)/∂x]/∂x,

which becomes

∂V(x, t)/∂t = (1/R(x)) ∂T(x, t)/∂x,
∂T(x, t)/∂t = K(x) ∂V(x, t)/∂x,

with the notation V(x, t) = ∂U(x, t)/∂t and T(x, t) = K(x) ∂U(x, t)/∂x. The
functions V and T are the rate of change of U in time and the stress in the rod,
respectively. The initial and boundary conditions for the above equations are
V(x, 0) = 0, T(x, 0) = 0, T(0, t) = Y(t), and V(l, t) = 0.
Figure 9.23 shows contour lines for two samples of U corresponding to
Y(t) = sin(ν t), ν = 0.1, l = 10, R(x) = 1, and two independent samples of
K(x) = k₀ + a exp(G(x)), where k₀ = 2, a = 5, and G is a stationary Gaussian
field with mean zero and covariance function E[G(x) G(x′)] = exp(−|x − x′|).
The distinct pattern of the two wave samples is caused by the random fluctuations
of the rod stiffness. ◊

Figure 9.23. Contour lines of two samples of the solution of the wave equation
for a rod with random stiffness
Note: Because we assume that the rod is made of a linearly elastic material and its deformation
is small, the strain and stress at (x, t) are Γ(x, t) = ∂U(x, t)/∂x and S(x, t) =
K(x) ∂U(x, t)/∂x, respectively. The equilibrium condition,

Let X(x, t) be an ℝ²-valued function with coordinates (V, T). Then X is the solution
of Eq. 9.1 with D = i (∂/∂t) − α (∂/∂x) and Y(x, t) = β Q(x, t)/R(x), where i is the
identity matrix, and the entries of the matrices α and β are a₁₁ = a₂₂ = 0, a₁₂ = 1/R(x),
a₂₁ = K(x), β₁ = 1, and β₂ = 0. The methods discussed in this chapter can be applied to
find properties of X. ▲
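A minimal explicit scheme for the first-order system above can be sketched as follows. This is not the method used to produce Fig. 9.23; the constant-coefficient test case (K = R = 1, wave speed 1) and all numerical parameters are ours:

```python
import numpy as np

# Staggered-grid integration of dV/dt = (1/R) dT/dx, dT/dt = K dV/dx, with the
# stress prescribed at the left end, T(0, t) = Y(t), and the velocity fixed at
# the right end, V(l, t) = 0. A random-stiffness sample K(x) can be passed in
# place of the constant array used below.
def run_rod(K, R, l, t_end, forcing, nx=1000, cfl=0.4):
    dx = l / nx
    dt = cfl * dx / np.sqrt(np.max(K / R))     # CFL-limited time step
    V = np.zeros(nx + 1)                       # velocity at the nodes
    T = np.zeros(nx)                           # stress at the cell midpoints
    Km = 0.5 * (K[:-1] + K[1:])                # stiffness at the midpoints
    for n in range(int(t_end / dt)):
        t = n * dt
        V[0] += dt / (R[0] * (dx / 2)) * (T[0] - forcing(t))   # T(0, t) = Y(t)
        V[1:-1] += dt / (R[1:-1] * dx) * (T[1:] - T[:-1])
        V[-1] = 0.0                                            # V(l, t) = 0
        T += dt * Km / dx * (V[1:] - V[:-1])
    return V, T

K, R = np.ones(1001), np.ones(1001)
V, T = run_rod(K, R, l=10.0, t_end=4.0, forcing=lambda t: np.sin(0.5 * np.pi * t))
# After t = 4 the unit-speed wave front has passed x = 3 but not yet reached x = 6:
print(bool(abs(V[300]) > 1e-3), bool(abs(V[600]) < 1e-6))
```

Updating V from the current stresses and then T from the new velocities is the standard staggered (leapfrog) arrangement, stable for CFL numbers below one.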
[Figure: real and imaginary parts of the reflection coefficient A.]
from the above differential equation for A and the samples of Z in Fig. 9.25 for
k₀ = 5, ε = 0.5, and Z(x) = exp(G(x)), where G is a stationary Gaussian process
with mean zero and covariance function E[G(x) G(x′)] = exp(−|x − x′|),
x, x′ ∈ (0, l).
Figure 9.25. Ten samples of the field Z and 100 samples of the amplitude and
phase corresponding to 100 samples of Z and l = 3
where A and Φ denote the random amplitude and phase of the reflected wave.
The right graph in Figure 9.25 shows the amplitude of the reflected waves for 100
samples of Z and the corresponding phase of these waves relative to the incident
wave for l = 3. The sample to sample variation of the amplitude and phase of
the reflected waves is significant. Statistics of A and Φ can be calculated from
Monte Carlo simulation results as shown in Fig. 9.25. For example, an estimate
of the correlation coefficient ρ_{A,Φ} of (A, Φ) based on 1000 samples is −0.0195
for l = 3. This low value of ρ_{A,Φ} is consistent with the plot of the amplitude and
phase of the reflected wave in Fig. 9.25 showing virtually no linear trend between
these two random variables. ◊
Proof: The solution of the wave equation in (l, ∞) is U(x) = e^{−√−1 k₀ x} + A(l) e^{√−1 k₀ x}.
Denote by Ũ the solution of the wave equation in (0, l). The continuity conditions at x = l
require Ũ(l) = U(l) and Ũ′(l) = U′(l). The ratio of these two conditions is

U′(l)/U(l) = √−1 k₀ [A(l) e^{√−1 k₀ l} − e^{−√−1 k₀ l}] / [A(l) e^{√−1 k₀ l} + e^{−√−1 k₀ l}]

so that

A(l) = e^{−2 √−1 k₀ l} [√−1 k₀ Ũ(l) + Ũ′(l)] / [√−1 k₀ Ũ(l) − Ũ′(l)],

which gives the equation for A by differentiation with respect to l and by using continuity
conditions at x = 0 [140].

We have not imposed any condition on ε in our discussion. If ε is a small parameter,
it is possible to derive an evolution equation for the transition probability density of a
stochastic process related to the reflection coefficient A. The derivation of this asymptotic
result can be found in [140]. •
9.7 Seismology
The energy supplied by the movement of the tectonic plates strains the
earth's crust and can cause slips along faults. The fault slips are violent events
called earthquakes. Large slips can produce earthquakes that may have devastating
social and economic consequences. There are few large earthquakes but
many small seismic events.

Seismologists have had limited success in predicting the occurrence of individual
future earthquakes. However, simple global laws for earth dynamics have
been found from empirical observations, for example, the Gutenberg-Richter law
log(n(m)) = a − b m and the energy-magnitude relationship m = α + β log(e),
where n(m) is the number of earthquakes with magnitude larger than m, e denotes
the energy released during an earthquake, and a, b, α, β are some constants.
These two equations show that log(n(m)) and log(e) are linearly related.
Large systems with many components tend to evolve into a critical, unstable
state that can change abruptly because of minor disturbances. The critical state
results solely from the interaction between the system components so that it is
self-organized. A system may evolve into a critical state if it is large, receives a
steady supply of energy, and is dissipative [16]. These conditions are satisfied by
the collection of interacting crust faults. The system of tectonic plates extends
over large regions, receives a steady supply of energy from the motion of tectonic
plates, and dissipates energy through fault slips. This observation suggests that
the Gutenberg-Richter law can be a manifestation of the self-organized critical
behavior of earth dynamics [16].
We consider two earthquake fault models related to the plate tectonic theory.
The first model, referred to as a physical model, consists of a collection of con-
nected blocks that are pulled at a constant speed on a rough surface. The second
model, called cellular automata, is a mathematical abstraction. The states of both
models satisfy equations with random coefficients and random or deterministic
inputs, which can be described by Eq. 9.1.
Figure 9.26. Physical model of interacting crust faults (adapted from [35], Fig. 3)
is small. If the friction at a block site is exceeded, the block slips and the released
force may cause a single block or a collection of blocks to slip. It is possible that
a single block slip will trigger the slip of many neighboring blocks resulting in a
large avalanche of block slips and a significant reduction of the system potential
energy. Generally, this is not the case, and block slips are confined to small
neighborhoods. The block slips resemble the fault slips. The energies released by the
slip of one or a few blocks and the slip of many blocks correspond to small and
large seismic events, respectively.
Figure 9.27 shows the time evolution of the potential energy e(t) for the
system in Fig. 9.26. Following an initial time interval when potential energy is
built up in the system, the energy reaches a steady-state regime characterized by small
frequent and large infrequent negative jumps of magnitude X(t) = |e(t) − e(t−)|.
Consider a large time interval in the steady-state regime of the potential energy
[Figure 9.27: time evolution of the potential energy e(t), and the number n(ξ) of jumps of X with magnitude larger than ξ in log-log scale.]
and let n(ξ) denote the number of jumps of X in this time interval with magnitude
larger than an arbitrary energy value ξ > 0. The resulting relationship between
the logarithms of n(ξ) and ξ shown in Fig. 9.27 is nearly linear with a slope of
approximately −1.12, in agreement with the Gutenberg-Richter law [35].

The relationship between n(ξ) and ξ in Fig. 9.27 implies that the jump
process X associated with the potential energy must have a power-like tail, that is,
P(X(t) > ξ) ~ c ξ^{−p} as ξ → ∞ for some p, c > 0 ([161], Section 2) because
(1) the above power law implies that the logarithms of P(X(t) > ξ) and ξ are
linearly related and (2) n(ξ) scaled by n(0) is an estimate of P(X(t) > ξ).
Note: The relationship between n(ξ) and ξ in Fig. 9.27 gives P(X(t) > ξ) ≈ c ξ^{−1.12} for
large values of ξ, and shows that the upper tail of the distribution of X(t) is approximately
Cauchy. Let Y be a Cauchy random variable with density f(y) = 1/[π (1 + y²)] and
distribution F(y) = 1/2 + arctan(y)/π, y ∈ ℝ. Hence, lim_{y→∞} P(Y > y)/y^{−1} = 1/π
so that P(Y > y) ≈ (1/π) y^{−1} as y → ∞ and log(P(Y > y)) ≈ log(1/π) − log(y) so
that log(P(Y > y)) is linearly related to log(y) and the slope of this linear relationship is
−1 for Cauchy distributions.
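The slope of −1 claimed for the Cauchy tail can be verified directly from the distribution function:

```python
import numpy as np

# Standard Cauchy: P(Y > y) = 1/2 - arctan(y)/pi ~ (1/pi) * y**(-1), so the
# log-log slope of the upper tail tends to -1 as y grows.
tail = lambda y: 0.5 - np.arctan(y) / np.pi
y1, y2 = 100.0, 1000.0
slope = (np.log(tail(y2)) - np.log(tail(y1))) / (np.log(y2) - np.log(y1))
print(round(float(slope), 3))   # -1.0
```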
We also note that the results in Fig. 9.27 are for a stochastic system subjected to a
deterministic input since the properties of the rough surface and the velocity of the moving
plate in Fig. 9.26 are random and deterministic, respectively. Problems characterized by
stochastic systems and deterministic input have been examined in Chapter 8. We discuss
the model in Fig. 9.26 here because it can be extended simply to the case in which the
velocity of the moving plate is random. ▲
Consider a square lattice with n² sites and denote by Z(i, j; t) the value of
a random field defined on this lattice at time t = 0, 1, … and site (i, j), i, j =
1, …, n. We can think of the random field Z as a model for forces acting on
hypothetical blocks located at the lattice sites (Fig. 9.26).

Suppose that the lattice is in equilibrium at a time t ≥ 0, that is, Z(i, j; t) ≤
z_c at all nodes, where z_c > 0 is an integer that can be related to the magnitude
of the friction force between the blocks and the fixed plate of the physical model
in Fig. 9.26. The effect of the moving plate on the system in Fig. 9.26 can be
simulated in the cellular automata model by increasing the value of Z by unity at a
site selected at random. This site selection attempts to model the spatial variability
in the mechanical properties of a fault. If the updated value of Z exceeds z_c, the
values of Z have to be modified according to a redistribution rule until the field Z
does not exceed z_c at any site.
The algorithm for evolving the random field Z involves the following three
steps [16, 107]. Consider a time t and assume that at this time the lattice is in
equilibrium, that is, we have Z(i, j; t) ≤ z_c for all i, j = 1, …, n.

1. Select a site (i, j) at random and increase the value of Z at this site by unity,
that is, Z(i, j; t) ↦ Z̄(i, j; t) = Z(i, j; t) + 1. The values of Z at all the other
sites are not changed, that is, Z(k, l; t) ↦ Z̄(k, l; t) = Z(k, l; t) for k ≠ i and/or
l ≠ j.

2. If the updated value Z̄(i, j; t) exceeds z_c, reduce Z̄(i, j; t) by four and increase
the values of Z̄ at the nearest neighbors of (i, j) by unity.

3. Modify Z̄ according to the rules in the previous step at all sites with values
larger than z_c till equilibrium has been achieved. The resulting values of Z̄
correspond to time t + 1 and are denoted by Z(i, j; t + 1), i, j = 1, …, n.
Note: The initial values of Z are taken to be zero, that is, Z(i, j; 0) = 0 at all nodes. The
redistribution in the second step differs at boundary sites. For example, suppose that
Z̄(i, n; t) > z_c and i ∈ {2, …, n − 1}. Then Z̄(i, n; t) is reduced by four, and Z̄(i ± 1, n; t),
Z̄(i, n − 1; t) are increased by unity. A unit of energy is allowed to escape through the
boundary. The uncertainty in the input can be incorporated by replacing the first step with
Z(i, j; t) ↦ Z(i, j; t) + Q, where Q is a {1, …, a}-valued random variable and a > 0
denotes an integer smaller than z_c.

An earthquake is generated during a time interval (t, t + 1] if the application of the
above algorithm causes one or more sites to exceed the critical value z_c. If at time t the
values of Z(i, j; t) are relatively small with respect to z_c, a local slip associated with an
increase of Z at a site (step 1) is unlikely to propagate in the lattice. On the other hand, if
most of the values of Z are nearly equal to z_c, a local slip may trigger an avalanche of slips
covering a large region of the lattice. These two extreme events correspond to small and
large earthquakes, respectively. ▲
Let Θ(t) = Σ_{i,j=1}^n Z(i, j; t), t = 0, 1, ..., τ, and X(t) = |Θ(t) − Θ(t − 1)|,
where τ > 0 is an integer. The random variables Θ(t) and X(t) can be
interpreted as potential energy and earthquake magnitude, respectively, at time t.
Denote by N(x) = #{X(t) > x : t = 1, ..., τ} the number of earthquakes with
magnitude larger than x > 0 in the time interval [0, τ]. Then N̄(x) = N(x)/N(0)
is an estimate of the probability P(X(t) > x).
Figure 9.28 shows the evolution in time of two samples of Θ for n = 50,
Zc = 3, and τ = 30,000. The significant difference in the details of the samples of
Θ suggests that there is little hope of ever predicting details of future seismic events,
for example, the time and the magnitude of the next major earthquake. However,
the corresponding samples of N̄ versus x̄ = x/n² are quite similar, indicating that
global features of the earthquake dynamics can be predicted. The relation between
log(N̄) and x̄ is nearly linear for both samples, with slopes of approximately −1.17
and −1.29. This relationship is similar to the Gutenberg-Richter law introduced
at the beginning of Section 9.7.
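The dynamics in steps 1-3 and the statistics Θ(t) and X(t) can be sketched in a short simulation. This is a minimal illustrative stand-in, not the code used for Fig. 9.28: the lattice size, Zc, and duration below are smaller than the values quoted above, and the toppling test follows the redistribution rule stated in the note (a site with Z ≥ Zc slips).

```python
import numpy as np

def step(Z, zc, rng):
    """One time step: drive a random site by unity (step 1), then apply the
    redistribution rule (steps 2-3) until Z < zc at all sites."""
    n = Z.shape[0]
    i, j = rng.integers(n), rng.integers(n)
    Z[i, j] += 1                                   # step 1: random drive
    while True:
        over = np.argwhere(Z >= zc)                # sites that must slip
        if over.size == 0:
            return
        for i, j in over:
            Z[i, j] -= 4                           # step 2: redistribution
            for k, l in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
                if 0 <= k < n and 0 <= l < n:      # units crossing the
                    Z[k, l] += 1                   # boundary escape

rng = np.random.default_rng(0)
n, zc, tau = 20, 3, 2000                           # illustrative, not n=50, tau=30000
Z = np.zeros((n, n), dtype=int)                    # Z(i, j; 0) = 0 at all nodes
theta, X = [Z.sum()], []
for t in range(tau):
    step(Z, zc, rng)
    theta.append(Z.sum())                          # potential energy Theta(t)
    X.append(abs(theta[-1] - theta[-2]))           # magnitude X(t) = |Theta(t) - Theta(t-1)|
```

The histogram of the recorded magnitudes X can then be used to estimate N̄(x) = N(x)/N(0) as in the text.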
9.8. Model selection 745
[Figure 9.28: two samples of Θ versus time t, and the corresponding estimates of N̄ versus x̄ on a logarithmic scale.]
marginal distribution F and correlation function r(τ) = E[Y(t) Y(t + τ)]. Our
objective is to assess the sensitivity of the output X of Eq. 9.1 to equivalent models
for Y, that is, to a class of stochastic processes with marginal distribution
F and correlation function r. Sensitivity is measured by the difference between
estimates of the marginal distribution, the skewness coefficient, and the kurtosis
coefficient of X corresponding to equivalent models for Y.
In the following two examples we assume that Y is a stationary process with
mean zero, lognormal marginal distribution F, and covariance function r(τ) =
c(τ) = exp(−λ |τ|), λ > 0. Two equivalent models are considered for Y, a
translation process Y_T and a diffusion process Y_D, defined by

where y_l is the smallest value in the range of the lognormal density f(y) =
dF(y)/dy of Y [36].
Note: The stationary density f_D of the diffusion process Y_D satisfies

    −(d/dy)(−λ y f_D(y)) + (1/2)(d²/dy²)(b(y)² f_D(y)) = 0,

so that λ y f_D(y) + (1/2) d(b(y)² f_D(y))/dy = 0 by the boundary conditions in Section
7.3.1.3. This equation yields the above expression for b(·) by requiring that f_D coincides
with the target density f, and solving the resulting equation for b(·). ▲
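Solving λ y f(y) + (1/2) d(b(y)² f(y))/dy = 0 with b(y)² f(y) → 0 at y_l gives b(y)² f(y) = −2λ ∫_{y_l}^{y} u f(u) du, which is nonnegative because the marginal has mean zero. A minimal numerical sketch of this construction, assuming a lognormal marginal shifted to mean zero; the values of sigma and lam and the grid are illustrative:

```python
import numpy as np
from scipy.stats import lognorm
from scipy.integrate import cumulative_trapezoid

# Target marginal: lognormal shifted to mean zero, so y_l = -exp(sigma^2/2).
sigma, lam = 0.5, 1.0                       # illustrative parameter values
shift = np.exp(sigma**2 / 2)
dist = lognorm(s=sigma, loc=-shift)         # E[Y] = exp(sigma^2/2) + loc = 0

y = np.linspace(-shift + 1e-3, 6.0, 2000)
f = dist.pdf(y)

# b(y)^2 f(y) = -2*lam * int_{y_l}^{y} u f(u) du  (integrated stationary equation)
partial_mean = cumulative_trapezoid(y * f, y, initial=0.0)
b2 = -2.0 * lam * partial_mean / f          # nonnegative since E[Y] = 0
```

The resulting b2 is the squared diffusion coefficient b(y)² on the grid; it vanishes at y_l and in the upper tail, as the boundary conditions require.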
[Figure: samples of X_T (top) and X_D (bottom) versus time.]
Estimates of the skewness and kurtosis coefficients, γ₃ and γ₄, are γ̂₃ = 1.34 and
γ̂₄ = 7.6829 for X_T and γ̂₃ = 1.81 and γ̂₄ = 8.06 for X_D. The differences between
these estimates are relatively small. However, the differences between the
marginal histograms of X_T and X_D and estimates of the probabilities P(X_T > x)
and P(X_D > x) in Fig. 9.30 indicate that reliability measures limiting the allowable
values of X are sensitive to the particular model used for Y. All estimates are
based on 1,000 independent output samples. ♦
[Figure 9.30: histograms of X_T and X_D and estimates of P(X_T > x) and P(X_D > x).]
In Examples 9.43 and 9.44 we have assumed that (1) the operator V in
Eq. 9.1 is perfectly known and has deterministic coefficients and (2) the input Y
to this equation is a real-valued stationary process with known marginal distribution
and correlation function. We have seen that the properties of the solution
X of Eq. 9.1 depend on the particular model used for Y, although the models
considered for Y are equivalent in the sense that they have the same marginal
distribution and correlation function. These results are of concern since the information
available in most applications rarely exceeds the information postulated in
Examples 9.43 and 9.44, and suggest that, at least for important systems, we need
to find properties of X in Eq. 9.1 for a collection of equivalent models of Y.
[Figure: samples of X_T (top) and X_D (bottom) versus time.]
Example 9.45: Let Eq. 9.1 be a memoryless relation between two random variables,
the input Y and the output X. Suppose that Eq. 9.1 is X = V^{−1}[Y] = Y²
and we choose arbitrarily to model this equation by a linear regression X = a Y + b,
where (a, b) are unknown coefficients. If n independent samples (y_i, x_i), i =
1, ..., n, of these random variables are available, a and b can be estimated by

    â = [Σ_{i=1}^n y_i x_i − (1/n)(Σ_{i=1}^n y_i)(Σ_{i=1}^n x_i)] / [Σ_{i=1}^n y_i² − (1/n)(Σ_{i=1}^n y_i)²]

and

    b̂ = (1/n) Σ_{i=1}^n x_i − â (1/n) Σ_{i=1}^n y_i.

The selected linear regression model may provide no information on the relation
between X and Y. For example, if Y ~ N(0, 1), the estimates (â, b̂) converge
to (0, 1) as n → ∞ so that the regression line becomes X = 1 under perfect
information. ♦
Note: The estimates â and b̂ minimize the error Σ_{i=1}^n (a y_i + b − x_i)² and have the above
expressions. Note that â and b̂ converge to γ_xy/γ_yy and μ_x − (γ_xy/γ_yy) μ_y as n → ∞,
where (μ_x, μ_y) denote the means of (X, Y), γ_xy is the covariance of (X, Y), and γ_yy
denotes the variance of Y. If Y ~ N(0, 1), then μ_y = 0, μ_x = E[X] = E[Y²] = 1, and
γ_xy = E[X Y] = E[Y³] = 0, so that â → 0 and b̂ → 1 as n → ∞. ▲
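The convergence of (â, b̂) to (0, 1) for Y ~ N(0, 1) can be checked by simulation; a minimal sketch (the sample size is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100000
y = rng.standard_normal(n)                # Y ~ N(0, 1)
x = y**2                                  # true memoryless relation X = Y^2

# least-squares estimates of (a, b) in the postulated model X = a*Y + b
a_hat = ((y * x).sum() - y.sum() * x.sum() / n) / ((y * y).sum() - y.sum()**2 / n)
b_hat = x.mean() - a_hat * y.mean()
print(a_hat, b_hat)                       # near (0, 1): the fitted line X = 1
```

The fitted regression is nearly flat at X = 1, confirming that the linear model carries no information on the relation X = Y².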
[Figure: histograms of X_T and X_D and estimates of P(X_D > x).]
The method is based on a Bayesian inference approach, and is illustrated for the
case in which Eq. 9.1 is a Gaussian autoregressive process with unknown order,
coefficients, and input noise scale. We denote the output X and the input Y in
Eq. 9.1 by X and W, respectively. The competing models are

    M_k:  X_{t+1} = β₀ + Σ_{j=1}^k β_j X_{t+1−j} + W_t,  k = 1, ..., m,    (9.33)

where W_t are independent N(0, 1/h_k) variables, h_k > 0, (β₀, ..., β_k, h_k) are
unknown coefficients, and m ≥ 1 is a specified integer defining the size of the
collection of competing models.
- Prior information on the collection of competing models, that is, the value
of m ≥ 1 defining the largest autoregressive model considered in the analysis,
the prior densities f_k′(β_k, h_k) of the coefficients β_k = (β₀, β₁, ..., β_k)
and h_k in the definition of M_k, k = 1, ..., m, and the prior probabilities
p_k′ > 0, Σ_{k=1}^m p_k′ = 1, where p_k′ is the initial likelihood that M_k is the correct
model. An extensive discussion on the selection of f_k′(β_k, h_k) can be
found in [91].
- Utility function, which is problem specific and may depend on the model
use. Utility functions are needed in many applications since the observation
vector z is frequently insufficient to provide a reliable solution to
the model selection problem. Let l(M_k, M_q) be a utility function giving
the penalty if the model M_k is selected and the correct model is M_q, and
let E[M_k] = Σ_{q=1}^m l(M_k, M_q) p_q″ denote the expected utility associated
with the selection of the model M_k, where p_q″ denotes the posterior probability
that M_q is the correct model and will be calculated later in this
section (Eq. 9.37). The optimal model minimizes E[M_k] [89]. We take
l(M_k, M_q) = 1 − δ_kq in the following example, so that the optimal model
will have the highest posterior probability p_k″.
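Since Σ_q p_q″ = 1, the choice l(M_k, M_q) = 1 − δ_kq gives E[M_k] = 1 − p_k″, so minimizing the expected penalty selects the model with the highest posterior probability. A minimal sketch (the posterior probabilities below are hypothetical):

```python
# Expected utility E[M_k] = sum_q l(M_k, M_q) * p_q'' with l(M_k, M_q) = 1 - delta_kq.
post = [0.05, 0.30, 0.65]   # hypothetical posterior probabilities p_q'', q = 1, 2, 3

# expected penalty of selecting model k (0-based index)
expected_penalty = [sum(p for q, p in enumerate(post) if q != k) for k in range(3)]
best = min(range(3), key=lambda k: expected_penalty[k])   # optimal model index

print(best + 1)             # selects M_3, the model with the highest p_q''
```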
Our analysis delivers (1) an optimal model for V and Y, that is, the model
M_{k*} with E[M_{k*}] ≤ E[M_k] for k ≠ k*, and (2) the posterior densities f_k″(β_k, h_k)
of the coefficients of the models M_k and the probabilities p_k″, k = 1, ..., m. The
resulting optimal model M_{k*} can be used to estimate extremes, integrals, and
other properties of X relevant for reliability studies [89]. The coefficients of M_{k*}
are random because of the limited information. We now give formulas for calculating
f_k″ and p_k″.
Consider the model M_k in Eq. 9.33 and its regression form

    V_k = a_k β_k + (1/√h_k) G_k    (9.34)

corresponding to an observation vector z, where G_k is an (n − k, 1) matrix with
independent N(0, 1) entries, V_k = (X_{k+1}, ..., X_n) denotes a column vector,

    a_k = | 1  z_k      z_{k−1}  ...  z_1     |
          | 1  z_{k+1}  z_k      ...  z_2     |
          | ..................................|    (9.35)
          | 1  z_{n−1}  z_{n−2}  ...  z_{n−k} |

and β_k = (β₀, β₁, ..., β_k)^T is a column vector. Given z = (z_1, ..., z_n), the
vector V_k and the matrix a_k are perfectly known so that Eq. 9.34 becomes a
multiple regression model. Methods for regression analysis can be used to find
the posterior density of the coefficients (β_k, h_k) of M_k.
If f_k′(β_k, h_k) is a normal-gamma density with parameters (β_k′, γ_k′, p_k′, ν_k′) and
n > k, then f_k″(β_k, h_k) is also a normal-gamma density with parameters
(β_k″, γ_k″, p_k″, ν_k″) (Eq. 9.36).
Note: Consider a model M_k defined by Eq. 9.33. Under the above assumption on the prior
density of (β_k, h_k), β_k | h_k ~ N(β_k′, γ_k′/h_k) is a Gaussian vector with mean β_k′
and covariance matrix γ_k′/h_k, and h_k ~ G2(p_k′, ν_k′) follows a gamma-2 distribution with
parameters (p_k′, ν_k′) ([91], p. 226).

A density f_k′(β_k, h_k) is said to be a conjugate prior density if f_k′ and f_k″ have
the same functional form, for example, the prior and posterior densities in Eq. 9.36. The
posterior density f_k″(β_k, h_k) can be obtained by applying the Bayes formula (Section 2.3.4
in this book, [201], Section 3.2.3).
The results in Eq. 9.36 can be used to calculate the simple predictive distribution,
that is, the distribution of Y(k) = X_{n+1} conditional on M_k and based on the prior information
f_k′(β_k, h_k) and the observation vector z = (z_1, ..., z_n). We have Y(k) = z_k^T β_k + W_n
from Eq. 9.33 with t = n, where z_k = (1, z_n, ..., z_{n−k+1}) is a column vector. Hence,
Y(k) | h_k ~ N(z_k^T β_k″, (1 + z_k^T γ_k″ z_k)/h_k) so that the density of Y(k) is given by

    f^{(k)}(y) = ∫_0^∞ f_k(y | ξ) f_{h_k}″(ξ) dξ,

where f_k(y | ξ) is the density of Y(k) | (h_k = ξ) and f_{h_k}″ denotes the posterior density of
h_k. The above integral shows that the random variable (Y(k) − z_k^T β_k″)/√(ν_k″ (1 + z_k^T γ_k″ z_k))
is a standard Student t with ν_k″ degrees of freedom ([201], Section 3.2.4). ▲
The posterior model probabilities are

    p_k″ = f_k(z_k) p_k′ / Σ_{q=1}^m f_q(z_q) p_q′,    (9.37)

where f_k(z_k) = ∫ f_k(z_k | β_k, h_k) f_k′(β_k, h_k) dβ_k dh_k,
f_k(z_k | β_k, h_k) is the density of a_k β_k + G_k/√h_k conditional on (β_k, h_k),
z_k = (z_{k+1}, ..., z_n) (Eq. 9.36), and f_k′(β_k, h_k) denotes the prior density of
the coefficients (β_k, h_k) of M_k.
Note: The vector a_k β_k + G_k/√h_k is Gaussian with mean a_k β_k and covariance matrix
i_k/h_k conditional on (β_k, h_k), where i_k is an identity matrix. Generally, numerical integration
is needed to find f_k(z_k). Straightforward calculations show that, if f_k′(β_k, h_k) is
the normal-gamma density in Eq. 9.36, then

    f_k(z_k) = (2π)^{−(ν_k″ − ν_k′)/2} (|γ_k′|^{1/2} / |γ_k″|^{1/2}) · ((p_k′ ν_k′/2)^{ν_k′/2} / (p_k″ ν_k″/2)^{ν_k″/2}) · (Γ(ν_k″/2) / Γ(ν_k′/2)),

where Γ(·) denotes the Gamma function. Extensive considerations on the selection of the
prior density can be found, for example, in [91] (Chapter 3) and [201] (Section 8.4). ▲
Example 9.46: Two sets of samples of length n = 10, 20, 50, 100, 150, 200,
and 250 have been generated from an autoregressive Gaussian process X of order
k = 3 with parameters β₀ = 0, β₁ = 0.7, β₂ = −0.5, β₃ = −0.3, and 1/h_k =
1 (Eq. 9.33). We assume that the samples are of unknown origin, consider a
collection M_k, k = 1, 2, 3, of autoregressive processes, and apply the method in
this section to find the optimal model M_{k*} for X.

Figure 9.33 shows the dependence on the sample size of the posterior probabilities
p_k″ in Eq. 9.37 for the two sets of samples generated from M₃. The
posterior probabilities p₁″ are nearly zero for all values of n and are not shown.
The posterior probabilities p₂″ and p₃″ vary significantly from sample to sample
and with the sample size. However, the method delivers the correct model as
the available information, that is, the sample size n of the record, increases. The
parameters (β₀, β₁, β₂, β₃, h) of the optimal model M_{k*} = M₃ are random because
n is finite. For example, the means of the parameters (β₀, β₁, β₂, β₃) of this
model are

(1.0056, 0.6736, −0.4555, −0.3499) and (1.1558, 0.6636, −0.4734, −0.3330)

for the two samples of X with length n = 250 considered in Fig. 9.33. The
standard deviations of these parameters are

(0.0885, 0.0591, 0.0660, 0.0591) and (0.0826, 0.0596, 0.0676, 0.0598),

respectively. The posterior density of (β₀, β₁, β₂, β₃, h) given by Eq. 9.36 can be
used to find properties of the output of Eq. 9.1 at this information level.
[Figure 9.33: posterior probabilities p_k″ versus sample size n for the two sets of samples.]
The evolution of the posterior probabilities in Fig. 9.33 shows that M₂ can
be the optimal model if the sample size n is not sufficiently large, although (1) the
available samples have been generated from M₃ and (2) M₂ is a special case of
M₃. That simpler models may be superior to more general models for small samples
is not surprising. For small values of n the uncertainty in the coefficients of
M₃ is so large that its probability law becomes inconsistent with the observation
vector. ♦
Note: The prior information used in the analysis has been derived from five consecutive
values of X generated from M₃ with β₀ = 0, β₁ = 0.7, β₂ = −0.5, β₃ = −0.3, and
1/h_k = 1. This prior information has virtually no influence on the outcome in Fig. 9.33
since the sample size n considered in the analysis is much larger.

Because the absolute values of the roots of the polynomial 1 − β₁ z − β₂ z² − β₃ z³
with β₁ = 0.7, β₂ = −0.5, and β₃ = −0.3 differ from one, the state of the model
M₃ becomes stationary as t increases ([31], Theorem 3.1.3, p. 88). The samples used in
Fig. 9.33 correspond to the stationary state of M₃. ▲
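The calculation in this example can be sketched numerically. The sketch below is an illustrative stand-in, not the book's code: it uses a zero-mean normal-gamma prior with hypothetical hyperparameters (v0, a0, b0), equal prior probabilities p_k′, and compares the models M₁, M₂, M₃ through their marginal likelihoods f_k(z_k) as in Eq. 9.37:

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(1)

# Simulate a record from the AR(3) model of Example 9.46.
beta = [0.0, 0.7, -0.5, -0.3]                  # beta_0, beta_1, beta_2, beta_3
burn, n = 200, 1000
x = np.zeros(burn + n)
for t in range(3, len(x)):
    x[t] = (beta[0] + beta[1]*x[t-1] + beta[2]*x[t-2] + beta[3]*x[t-3]
            + rng.standard_normal())           # 1/h_k = 1
x = x[burn:]                                   # discard transient (stationary record)

def log_marginal(z, k, a0=2.0, b0=2.0, v0=10.0):
    """Log marginal likelihood f_k(z_k) of an AR(k) model under a conjugate
    prior: beta | h ~ N(0, v0*I/h), h ~ Gamma(shape=a0, rate=b0)."""
    y = z[k:]                                  # V_k in Eq. 9.34
    A = np.column_stack([np.ones(len(y))]      # a_k in Eq. 9.35
                        + [z[k-j:len(z)-j] for j in range(1, k+1)])
    V0inv = np.eye(k + 1) / v0
    Vninv = V0inv + A.T @ A                    # posterior precision of beta
    mun = np.linalg.solve(Vninv, A.T @ y)      # posterior mean (prior mean zero)
    an = a0 + len(y) / 2
    bn = b0 + 0.5 * (y @ y - mun @ Vninv @ mun)
    return (-0.5*len(y)*np.log(2*np.pi)
            + 0.5*(np.linalg.slogdet(V0inv)[1] - np.linalg.slogdet(Vninv)[1])
            + a0*np.log(b0) - an*np.log(bn) + lgamma(an) - lgamma(a0))

logm = np.array([log_marginal(x, k) for k in (1, 2, 3)])
post = np.exp(logm - logm.max())
post /= post.sum()                             # posterior probabilities, equal p_k'
print({f"M_{k}": round(p, 4) for k, p in zip((1, 2, 3), post)})
```

With a record of this length the posterior probability of M₃ dominates, consistent with the behavior of p_k″ for large n in Fig. 9.33.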
9.9 Problems
9.1: Find the local solution of the partial differential equation in Example 9.3
with Z₁ and Z₂ replaced by the random fields a₁ + exp(G₁(x)) and −a₂ + G₂(x),
respectively, where a₁, a₂ > 0 are constants, x ∈ D, G₁ and G₂ are homogeneous
Gaussian fields with mean zero and covariance functions c_k(ξ) = e^{−λ_k ‖ξ‖}, k =
1, 2, and G₁ is independent of G₂. For numerical calculations take a₁ = 1, a₂ =
5, and λ₁ = λ₂ = 5.
9.2: where D is an open bounded subset of ℝ^d and A > 0 denotes a random field.
Assume the initial and boundary conditions U(x, 0) = a > 0, x ∈ D, and
U(x, t) = 0 for x ∈ ∂D and t ≥ 0, respectively.
9.3: Find U in Example 9.3 by conditional analysis. Use any numerical method
to calculate the solution U | (Z₁, Z₂).
9.5: Find the first four moments of X in Example 9.7 by a closure method. Com-
pare the resulting moments with Monte Carlo solutions.
9.6: Find the function c(z₂) in the expression for the stationary density of the state
vector Z in Example 9.7.
9.7: Let X be the solution of (a + R) X(t) = Q(t), t ≥ 0, where R is a real-valued
random variable such that |R/a| < 1 a.s. and Q is a stochastic process in
L₂. Find the Neumann series representation of X and examine its convergence.
Calculate the second moment properties of X.
9.8: Find the second moment properties of X in Example 9.8 by one of the closure
methods in Section 7.3.1.5.
9.9: Find the function c(z₂, z₃) in the expression for the stationary density of the
state vector Z in Example 9.8.
9.10: Let X be the solution of Ẋ(t) = −a X(t)² for a > 0, t ≥ 0, and X(0) = Z,
where Z is a real-valued random variable. Find the density of X at some time
t ≥ 0 by using Eq. 9.3 and Eq. 9.5.
9.12: Solve the problem in Example 9.20 by the Galerkin method with Bernstein
polynomials as trial functions.
9.13: Find the first two moments of U in Example 9.3 by the stochastic finite
element method.
9.14: Complete the details of the stochastic finite element analysis in Exam-
ple 9.30 based on polynomial chaos.
9.15: Find the second order approximation of the solution of the steady-state
Boltzmann equation in Example 9.31.
9.16: Calculate the Lyapunov exponent for the stochastic differential equation in
Example 9.35. Discuss the stochastic stability of the stationary solution of this
equation as a function of p.
9.17: Suppose that K_r, r = 1, 2, 3, X(0), Z, and K_zp in Example 9.38 are real-valued
random variables. Find the marginal density of the state X(t) at an arbitrary
time t ≥ 0.
9.18: Find the local solution for Eq. 9.31 and a generalized version of this equation
for the case in which the dispersion tensor d_ij is random.
9.19: Repeat the calculations in Examples 9.43 and 9.44 for other input processes,
which are equivalent in some sense. Also, assume that some of the coefficients of
the differential equations in these examples are random.
9.20: Complete the details of the proof presented for Eqs. 9.36 and 9.38.
9.21: Extend results in Example 9.46 to the case in which the observations are
imperfect, that is, the sample values z_i are replaced with z_i + U_i, where U_i are
independent copies of a real-valued random variable with mean zero and known
variance.
9.22: Let X be a stationary Gaussian process with mean zero and covariance function
c(τ) = E[X(t) X(t + τ)]. The covariance function of X is not given. It
is known that c is one of the functions c₁(τ) = σ² sin(ν_c τ)/(ν_c τ), c₂(τ) =
σ² exp(−α |τ|), or c₃(τ) = σ² (1 + β |τ|) exp(−β |τ|), where σ, ν_c, α, β > 0
are some unknown constants. Generate a sample of length n of X with c = c₁,
σ = 1, and ν_c = 30. Suppose that the resulting sample is of unknown origin.
Apply the approach in Example 9.46 to find the optimal model for X from the
collection of stationary Gaussian processes with mean zero and covariance functions
c_k, k = 1, 2, 3.
Bibliography
[1] R. J. Adler. The Geometry of Random Fields. John Wiley & Sons, New
York, 1981.
[4] B. K. Agarwal and M. Eisner. Statistical Mechanics. John Wiley & Sons,
New York, 1988.
[8] S. T. Ariaratnam and H. N. Pi. On the first-passage time for envelope crossing
for a linear oscillator. International Journal of Control, 18(1):89-96,
1973.
[20] J. Bertoin. Lévy Processes. Cambridge University Press, New York, 1998.
[30] R. W. Brockett. Finite Dimensional Linear Systems. John Wiley & Sons,
Inc., New York, 1970.
[33] R. S. Bucy and P. D. Joseph. Filtering for Stochastic Processes with Appli-
cations to Guidance. Chelsea Publishing Company, New York, 1987.
[41] K. L. Chung. Green, Brown, and Probability. World Scientific, New Jersey,
1995.
[48] R. E. Cutkosky. A Monte Carlo method for solving a class of integral equations.
Journal of Research of the National Bureau of Standards, 47(2):113-115,
1951.
[50] P. R. Dawson and E. B. Martin. Computational mechanics for metal deformation
processes using polycrystal plasticity. Advances in Applied Mechanics,
pages 77-169, 1998.
[51] J. B. Diaz and L. E. Payne. Mean value theorems in the theory of elasticity.
In Proceedings of the Third U. S. National Congress of Applied Mechanics,
pages 293-303. The American Society of Mechanical Engineers, June 11-14,
1958.
[52] P. J. Digby. The effective elastic moduli of porous granular rocks. Journal
of Applied Mechanics, 48:803-808, 1981.
[80] M. Grigoriu. Equivalent linearization for Poisson white noise input. Probabilistic
Engineering Mechanics, 11(1):45-51, 1995.
[81] M. Grigoriu. Lyapunov exponents for nonlinear systems with Poisson white
noise. Physics Letters A, 217:258-262, 1996.
[83] M. Grigoriu. Mean and covariance equations for boundary value problems.
Journal of Engineering Mechanics, ASCE, 123(5):485-488, 1997.
[87] M. Grigoriu. A spectral representation based model for Monte Carlo sim-
ulation. Probabilistic Engineering Mechanics, 15(4):365-370, October
2000.
[91] H. Raiffa and R. Schlaifer. Applied Statistical Decision Theory. The M.I.T.
Press, Cambridge, Massachusetts, 1961.
[104] I. D. Huntley and R. M. Johnson. Linear and Nonlinear Differential Equations.
John Wiley & Sons, New York, 1983.
[105] R. A. Ibrahim. Parametric Random Vibration. John Wiley & Sons Inc.,
New York, 1985.
[106] R. Iranpour and P. Chacon. The Mark Kac Lectures. Macmillan Publishing
Company, New York, 1988.
[114] M. Kleiber and T. D. Hien. The Stochastic Finite Element. John Wiley &
Sons, New York, 1992.
[119] A. Kumar and P. R. Dawson. The simulation of texture evolution with finite
element over orientation space. Application to planar polycrystals. Applied
Mechanics and Engineering, 130:247-261, 1996.
[154] S. M. Ross. Stochastic Processes. John Wiley & Sons, New York, 1983.
[155] R. Rubinstein. Simulation and the Monte Carlo Method. John Wiley &
Sons, New York, NY, 1981.
[156] W. Rudin. Real and Complex Analysis. McGraw-Hill, Inc., New York,
1974.
[158] I. Rychlik. A note on Durbin's formula for the first-passage density. Statistics
& Probability Letters, 5:425-428, 1987.
[159] I. Rychlik and G. Lindgren. Crossreg, a computer code for first passage and
wave density analysis. Technical report, University of Lund and Lund Insti-
tute of Technology, Department of Mathematical Statistics, Lund, Sweden,
1990.
[161] G. Samorodnitsky. Long Range Dependence, Heavy Tails and Rare Events.
Lecture notes. MaPhySto, Center for Mathematical Physics and Stochastics,
Aarhus, Denmark, 2002.
[176] T. T. Soong. Probabilistic Modeling and Analysis in Science and Engineering.
John Wiley & Sons, New York, NY, 1981.
[177] P. D. Spanos. Stochastic linearization in structural dynamics. Applied Mechanics
Reviews, 34(1):1-8, 1981.
[178] P. D. Spanos and C. A. Brebbia, editors. Computational Stochastic Mechanics.
Elsevier Applied Science, New York, 1991.
[179] B. F. Spencer and L. A. Bergman. On the numerical solutions of the Fokker-Planck
equations for nonlinear stochastic systems. Nonlinear Dynamics,
4:357-372, 1993.
[180] D. Stoyan, W. S. Kendall, and J. Mecke. Stochastic Geometry and Its Ap-
plications. John Wiley & Sons, New York, 1987.
[183] G. P. Tolstov. Fourier Series. Dover Publications, Inc., New York, 1962.
[194] J. R. Willis. Variational and related methods for the overall properties of
composites. Advances in Applied Mechanics, 21:1-78, 1981.
[195] R. Willis and W. W.-G. Yeh. Groundwater Systems Planning and Manage-
ment. Prentice Hall, Englewood Cliffs, NJ, 1987.
Index

Stationary increments, 111, 122, 182, 186, 189
Stationary process, 119
  in the strict sense, 119
  in the weak sense, 128
  spectral density, 132
Stochastic differential equation, 253, 254, 322, 323
  numerical solution, 275
    Euler, 277
    Milstein, 280
  semimartingale input, 271
  Brownian motion input, 256
    diffusion process, 262
    equations for characteristic functions, 267
    equations for densities, 267
    equations for moments, 267
    semimartingale, 263
    Wong-Zakai theorem, 267
Stochastic integral, 208
  associativity, 225
  Ito integral, 208, 250
  preservation, 225
  semimartingale, 217
  simple predictable integrand, 221
  Stratonovich integral, 210, 249, 250
Stochastic process, 104
  adapted, 107
  càdlàg, càglàd, 113
  classes of, 119
  correlation, 127, 130
  covariance, 127
  finite dimensional densities, 117
  finite dimensional distribution, 117
  measurable, 106
  progressively measurable, 108
  sample properties, 110
  second moment properties, 127
Stopping time, 78, 114
Stratonovich integral, 249
Taylor series method, 554, 570, 584, 691
Translation process, 125
Voigt average, 621
White noise process, 144, 184, 186
Wong-Zakai theorem, 267