2 Continuous Time Markov Chains
As before we assume that we have a finite or countable statespace I, but now the Markov chains
X = {X(t) : t ≥ 0} have a continuous time parameter t ∈ [0, ∞). In some cases, but not the ones of
interest to us, this may lead to analytical problems, which we skip in this lecture.
2.1 Q-Matrices
In continuous time there is no smallest time step and hence we cannot speak about one-step transition matrices any more. If we are in state j ∈ I at time t, then we can ask for the probability of being in a different state k ∈ I at time t + h,

    f(h) := IP{X(t + h) = k | X(t) = j}.

We are interested in small time steps, i.e. small values of h > 0. Clearly, f(0) = 0. Assuming that f is differentiable at 0, the most interesting information we can obtain about f at the origin is its derivative f′(0), which we call qjk. Then

    IP{X(t + h) = k | X(t) = j} = qjk h + o(h).
Here o(h) is a convenient abbreviation for any function with the property that limh↓0 o(h)/h = 0, we
say that the function is of smaller order than h. An advantage of this notation is that the actual
function o might be a different one in each line, but we do not need to invent a new name for each
occurrence.
Again we have the Markov property, which states that if we know the state X(t) then all additional information about X at times prior to t is irrelevant for the future: for all k ≠ j, t0 < t1 < . . . < tn < t and x0, . . . , xn ∈ I,

    IP{X(t + h) = k | X(t) = j, X(tn) = xn, . . . , X(t0) = x0} = IP{X(t + h) = k | X(t) = j} = qjk h + o(h).
From this we get, for every j ∈ I,

    IP{X(t + h) = j | X(t) = j} = 1 − Σ_{k≠j} qjk h + o(h) = 1 + qjj h + o(h),

if we define qjj = −Σ_{k≠j} qjk. As above we have qjj = f′(0) for f(h) = IP{X(t + h) = j | X(t) = j}, but now f(0) = 1.
We can enter this information into a matrix Q = (qij : i, j ∈ I), which contains all the information about the transitions of the Markov chain X. This matrix is called the Q-matrix of the Markov chain. Its necessary properties are

    (i) 0 ≤ −qii < ∞ for all i ∈ I,
    (ii) qij ≥ 0 for all i ≠ j,
    (iii) Σ_{j∈I} qij = 0 for all i ∈ I.
We say that qij gives the rate at which we try to enter state j when we are in state i, or the jump
intensity from i to j.
Example 2.1: The Poisson process
Certain random occurrences (for example claim times to an insurance company, arrival times of customers in a shop, . . . ) happen randomly in such a way that
• the probability of one occurrence in the time interval (t, t + h) is λh + o(h), the probability of
more than one occurrence is o(h).
• occurrences in the time interval (t, t + h) are independent of what happened before time t.
Suppose now that X(t) is the number of occurrences by time t. Then X is a continuous time Markov
chain with statespace I = {0, 1, 2, . . .}. Because, by our assumptions,
                                      { λh + o(h)   if k = j + 1,
    IP{X(t + h) = k | X(t) = j} =     { o(h)        if k > j + 1,
                                      { 0           if k < j,
we get the Q-matrix
          ( −λ    λ    0    0   · · · )
    Q =   (  0   −λ    λ    0   · · · )
          (  0    0   −λ    λ   · · · )
          (  ·    ·    ·    ·         )
This process is called a Poisson process with rate λ.
Reason for this name: an analysis argument shows that, if X(0) = 0, then

    IP{X(t) = n} = e^{−λt} (λt)^n / n!   for n = 0, 1, 2, . . . ,

i.e. X(t) is Poisson distributed with parameter λt.
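This can be checked numerically. The sketch below (Python, not part of the original notes) simulates occurrences with independent Exp(λ) gaps between them, which is consistent with the o(h)-description above, and compares the empirical law of X(t) with the Poisson weights:

```python
import math
import random

def poisson_count(lam, t, rng):
    """Number of occurrences in [0, t] when the gaps between
    occurrences are independent Exp(lam) random variables."""
    n, clock = 0, rng.expovariate(lam)
    while clock <= t:
        n += 1
        clock += rng.expovariate(lam)
    return n

rng = random.Random(0)
lam, t, trials = 2.0, 1.5, 20000
counts = [poisson_count(lam, t, rng) for _ in range(trials)]

# compare the empirical IP{X(t) = n} with e^{-lam t} (lam t)^n / n!
for n in range(5):
    emp = counts.count(n) / trials
    exact = math.exp(-lam * t) * (lam * t) ** n / math.factorial(n)
    print(n, round(emp, 3), round(exact, 3))
```

The empirical frequencies agree with the Poisson(λt) weights up to Monte Carlo error.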
Example 2.2: The pure birth-process
Suppose we have a population of (immortal) animals reproducing in such a way that, independent of
what happened before time t and of what happens to the other animals, in the interval (t, t + h) each
animal
• gives birth to one child with probability λh + o(h),
• gives birth to more than one child with probability o(h).

Let p = λh + o(h) and q = 1 − p and X(t) the number of animals at time t. Then, given X(t) = n, the probability of exactly one birth in (t, t + h) is

    (n choose 1) p q^{n−1} = nλh + o(h),

and the probability of more than one birth in (t, t + h) is o(h). Hence X is a continuous time Markov chain with rates q_{n,n+1} = nλ and q_{nn} = −nλ.
Given the Q-matrix one can construct the paths of a continuous time Markov chain as follows. Suppose
the chain starts in a fixed state X0 = i for i ∈ I. Let
    J = min{t : Xt ≠ X0}
be the first jump time of X (we always assume that the minimum exists).
Theorem 2.1 Under the law IPi of the Markov chain started in X0 = i the random variables J and X(J) are independent. The distribution of J is exponential with rate qi := Σ_{j≠i} qij, which means

    IPi{J > t} = e^{−qi t}   for all t ≥ 0.
Let J1 , J2 , J3 , . . . be the times between successive jumps. Then
Yn = X(J1 + · · · + Jn )
defines a discrete time Markov chain with one-step transition matrix P given by
           { qij / qi   if i ≠ j,
    pij =  {
           { 0           if i = j.
Given Yn = i the next waiting time Jn+1 is exponentially distributed with rate qi and independent of
Y1 , . . . , Yn−1 and J1 , . . . , Jn .
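The construction just described can be sketched in Python: hold for an Exp(qi) time, then jump to a new state with probabilities qij/qi. (The two-state Q-matrix below is an arbitrary illustration, not one from the notes.)

```python
import random

def simulate_ctmc(Q, start, t_end, rng):
    """Path of a continuous time Markov chain with Q-matrix Q
    (a dict of dicts), recorded as a list of (jump time, state)."""
    state, clock, path = start, 0.0, [(0.0, start)]
    while True:
        qi = -Q[state][state]          # total jump rate out of state
        if qi == 0.0:                  # absorbing state: no more jumps
            return path
        clock += rng.expovariate(qi)   # exponential holding time
        if clock > t_end:
            return path
        u, acc = rng.random(), 0.0
        for j, rate in Q[state].items():
            if j == state:
                continue
            acc += rate / qi           # jump probabilities q_ij / q_i
            if u <= acc:
                state = j
                break
        path.append((clock, state))

Q = {0: {0: -1.0, 1: 1.0}, 1: {0: 2.0, 1: -2.0}}
rng = random.Random(1)
path = simulate_ctmc(Q, 0, 10.0, rng)
print(path[:4])
```

Each recorded jump changes the state, and the jump times are strictly increasing, as in the construction above.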
Why does the exponential distribution play a special role for continuous Markov chains?
Recall the Markov property in the following way: Suppose X(0) = i and let J be the time before the
first jump and t, h > 0. Then, using the Markov property in the second step,
IP{J > t + h | J > t} = IP{J > t + h | J > t, X(t) = i} = IP{J > t + h | X(t) = i} = IP{J > h}.
Hence the time J we have to wait for a jump satisfies the lack of memory property: if you have waited
for t time units and no jump has occurred, the remaining waiting time has the same distribution as
the original waiting time. This is sometimes called the waiting time paradox or constant failure rate
property.
The only distribution with the lack of memory property is the exponential distribution. Check that,
for IP{J > x} = e−µx , we have
    IP{J > t + h | J > t} = e^{−µ(t+h)} / e^{−µt} = e^{−µh} = IP{J > h}.
Another important property of the exponential distribution is the following:
Theorem 2.2 If S and T are independent exponentially distributed random variables with rate α resp.
β, then their minimum S ∧ T is also exponentially distributed with rate α + β and it is independent of
the event {S ∧ T = S}. Moreover,
    IP{S ∧ T = S} = α / (α + β)   and   IP{S ∧ T = T} = β / (α + β).
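Theorem 2.2 is easy to illustrate by simulation (a Python sketch with arbitrary rates α = 1.5, β = 2.5):

```python
import random

rng = random.Random(2)
alpha, beta, trials = 1.5, 2.5, 50000
wins, minima = 0, []
for _ in range(trials):
    s, t = rng.expovariate(alpha), rng.expovariate(beta)
    minima.append(min(s, t))
    if s < t:                      # the event {S ∧ T = S}
        wins += 1

# S ∧ T should be Exp(alpha + beta), so its mean is 1/(alpha + beta)
print(round(sum(minima) / trials, 3))   # ≈ 1/4
print(round(wins / trials, 3))          # ≈ alpha/(alpha + beta) = 0.375
```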
Example 2.3: The M/M/1 queue.
Suppose customers arrive according to a Poisson process with rate λ at a single server. Each
customer requires an independent random service time, which is exponential with mean 1/µ (i.e. with
rate µ). Let X(t) be the number of people in the queue (including people currently being served).
Then X is a continuous-time Markov chain with
          ( −λ        λ         0        0   · · · )
    Q =   (  µ    −(λ + µ)      λ        0   · · · )
          (  0        µ     −(λ + µ)     λ   · · · )
          (  ·        ·         ·        ·         )
The previous theorem says: If there is at least one person in the queue, the distribution of the jump
time (i.e. the time until the queue changes) is exponential with rate λ + µ and the probability that at
the jump time the queue is getting shorter is µ/(λ + µ).
Let pij(t) = IPi{X(t) = j} and P(t) = (pij(t) : i, j ∈ I), t ≥ 0, be the transition matrix function of the Markov chain. From P one can get the full information about the law of the Markov chain: for 0 < t1 < . . . < tn and j1, . . . , jn ∈ I,
IPi {X(t1 ) = j1 , . . . , X(tn ) = jn } = pij1 (t1 )pj1 j2 (t2 − t1 ) · · · pjn−1 jn (tn − tn−1 ).
P has the following properties:

(a) pij(t) ≥ 0 and Σ_{j∈I} pij(t) = 1 for all i ∈ I. Also

        lim_{t↓0} pij(t) = 1 if i = j, and 0 otherwise.

(b) Σ_{k∈I} pik(t) pkj(s) = pij(t + s) for all i, j ∈ I and s, t ≥ 0 (the Chapman-Kolmogorov equations).
Note that, by definition, pjk(h) = qjk h + o(h) for j ≠ k, and hence

    lim_{h↓0} (pjk(h) − δjk)/h = qjk,   i.e. P′(0) = Q.

From this we can see how the Q-matrix is obtainable from the transition matrix function. The converse operation is more involved; we restrict attention to the case of finite statespace I.
Consider P(t) as a matrix; then the Chapman-Kolmogorov equation can be written as P(t + s) = P(t)P(s). Differentiating with respect to s, respectively t, and setting the other variable to 0, gives

    P′(t) = QP(t)   and   P′(t) = P(t)Q.

These equations are called the Kolmogorov backward resp. Kolmogorov forward equations. We also know P(0) = I, the identity matrix. This matrix-valued differential equation has a unique solution P(t), t ≥ 0.
For the case of a scalar function p(t) and a scalar q instead of the matrix-valued function P and matrix Q the corresponding equation

    p′(t) = q p(t),   p(0) = 1,

would have the unique solution p(t) = e^{tq}. In the matrix case one can argue similarly, defining
    e^{Qt} = Σ_{k=0}^∞ Q^k t^k / k!,
with Q^0 = I. Then we get the transition matrix function P(t) = e^{Qt}, which satisfies the Kolmogorov forward and backward equations.
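For a finite statespace this series can be truncated to compute P(t) numerically. The Python sketch below does this with plain lists; the 3×3 Q-matrix is just an illustration, and for large ‖Q‖t a proper matrix-exponential routine would be preferable.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(Q, t, terms=60):
    """P(t) = e^{Qt} via the truncated series sum_k Q^k t^k / k!."""
    n = len(Q)
    P = [[float(i == j) for j in range(n)] for i in range(n)]  # Q^0 = I
    term = [row[:] for row in P]
    for k in range(1, terms):
        term = [[x * t / k for x in row] for row in mat_mul(term, Q)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

# an illustrative Q-matrix: off-diagonal rates, rows summing to zero
Q = [[-3.0, 1.0, 2.0], [0.0, -2.0, 2.0], [2.0, 1.0, -3.0]]
P = mat_exp(Q, 0.7)
print([round(sum(row), 6) for row in P])   # each row of P(t) sums to 1
```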
2.3 Resolvents
Let Q be the Q-matrix of the Markov chain X and P (t), t ≥ 0 the transition matrix function.
2.3.1 Laplace Transforms
A function f : [0, ∞) → IR is called good if f is continuous and either bounded (there exists K > 0 such that |f(t)| ≤ K for all t) or integrable (∫_0^∞ |f(t)| dt < ∞) or both. If f is good we can associate the Laplace transform f̂ : (0, ∞) → IR, which is defined by

    f̂(λ) = ∫_0^∞ e^{−λt} f(t) dt.
Theorem 2.3 (Uniqueness Theorem) If f and g are good functions on [0, ∞) and f̂(λ) = ĝ(λ) for all λ > 0, then f = g.
This implies that we can invert Laplace transforms uniquely, at least when restricting attention to
good functions.
2.3.2 Resolvents
The basic idea here is to calculate the exponential P(t) = e^{Qt} of the Q-matrix using the Laplace transform. Let λ > 0 and argue that

    R(λ) := ∫_0^∞ e^{−λt} P(t) dt = ∫_0^∞ e^{−λt} e^{Qt} dt = ∫_0^∞ e^{−(λI−Q)t} dt = (λI − Q)^{−1}.
Here the integrals over matrix-valued functions are understood componentwise, and the last step is
done by analogy with the real-valued situation, but can be made rigorous. Note that the inverse of
the matrix λI − Q can be calculated for all values λ which are not an eigenvalue of Q. Recall that
the inverse fails to exist if and only if the characteristic polynomial det(λI − Q) is zero, which is also
the criterion for λ to be an eigenvalue of Q.
Now R(λ) = (λI − Q)^{−1}, if it exists, can be calculated and P(t) can be recovered by inverting the Laplace transform. The matrix function R(λ) is called the resolvent of Q (or of X). Its components are the Laplace transforms of the functions pij,

    rij(λ) := ∫_0^∞ e^{−λt} pij(t) dt = p̂ij(λ).
Resolvents also have a probabilistic interpretation: Suppose we have an alarm-clock which rings,
independently of the Markov chain, at a random time A, which is exponentially distributed with
rate λ, i.e.
IP{A > t} = e−λt , IP{A ∈ (t, t + dt)} = λe−λt dt.
What is the state of the Markov chain when the alarm clock rings? The probability of being in state j is

    IPi{X(A) = j} = ∫_0^∞ IPi{X(t) = j, A ∈ (t, t + dt)} = ∫_0^∞ pij(t) λe^{−λt} dt = λ rij(λ),

hence

    IPi{X(A) = j} = λ rij(λ)   for A independent of X, exponential with rate λ.
We now discuss the use of resolvents in calculations for continuous-time Markov chains using the
following problem:
Problem: Consider a continuous-time Markov chain X with state space I := {A, B, C} and transi-
tions between the states described as follows:
• When currently in state A at time t, in the next time interval (t, t + h) of small length h and
independent of the past behaviour of the chain, with
– probability h + o(h) the chain jumps into state B,
– probability 2h + o(h) the chain jumps into state C,
– probability 1 − 3h + o(h) the chain remains in state A.
• Whilst in state B, the chain tries to enter state C at rate 2. The chain cannot jump into state
A directly from state B.
• On entering state C, the chain remains there for an independent exponential amount of time of
rate 3 before jumping. When the jump occurs, with probability 2/3 it is into state A, and with
probability 1/3 the jump is to state B.
Questions:
(a) Find the Q-matrix of the Markov chain X.
(b) If the Markov chain is initially in state A, that is X(0) = A, what is the probability that the
Markov chain is still in state A at time t, that is, IPA {X(t) = A} ?
(c) Starting from state C at time 0, show that the probability X is in state B at time t is given by
    1/3 − (1/3) e^{−3t}   for t ≥ 0.
(d) Initially, X is in state C. Find the distribution for the position of X after an independent
exponential time, T , of rate 4 has elapsed. In particular, you should find
    IPC{X(T) = B} = 1/7.
(e) Given that X starts in state C, find the probability that X enters state A at some point before
time t. [See Section 2.3.3 for solution.]
(f) What is the probability that we have reached state C at some point prior to time U , where U
is an independent exponential random variable of rate µ = 2.5, given that the chain is in state
B at time zero? [See Section 2.3.3 for solution.]
The method to obtain (a) should be familiar to you by now. For the jumps from state C we use
Theorem 2.2: Recall qCA is the rate of jumping from C to A and qCB is the rate of jumping from C to
B. We are given 3 = qCA + qCB , the rate of leaving state C. The probability that the first jump goes
to state A is, again by Theorem 2.2, qCA /(qCA + qCB ), which is given as 2/3. Hence we get qCA = 2
and qCB = 1. Altogether,
          ( −3    1    2 )
    Q =   (  0   −2    2 )
          (  2    1   −3 )
We now solve (b) with the resolvent method. Recall
    pAA(t) = IPA{X(t) = A},   and   rAA(λ) = ∫_0^∞ e^{−λt} pAA(t) dt = p̂AA(λ).

A computation gives det(λI − Q) = λ(λ + 3)(λ + 5).
(Side remark: λ is always a factor in this determinant, because Q is a singular matrix. Recall that
the sum over the rows is zero!) Now the inverse of a matrix A is given by
    (A^{−1})ij = ((−1)^{i+j} / det A) det(Mji),
where Mji is the matrix with row j and column i removed. Hence, in our example, the entry in
position AA is obtained by
    rAA(λ) = ((λI − Q)^{−1})AA = det( λ+2  −2 ; −1  λ+3 ) / det(λI − Q) = (λ² + 5λ + 4) / (λ(λ + 3)(λ + 5)).
It remains to invert the Laplace transform. For this purpose we need to form partial fractions,

    (λ² + 5λ + 4) / (λ(λ + 3)(λ + 5)) = α/λ + β/(λ + 3) + γ/(λ + 5).

Solving gives α = 4/15 (plug λ = 0), β = 1/3 (plug λ = −3) and γ = 2/5 (plug λ = −5). Hence

    rAA(λ) = 4/(15λ) + 1/(3(λ + 3)) + 2/(5(λ + 5)).
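Inverting the three partial fractions term by term gives pAA(t) = 4/15 + (1/3)e^{−3t} + (2/5)e^{−5t}, which answers (b). A quick Python check of this formula against the general facts pAA(0) = 1 and p′AA(0) = qAA = −3:

```python
import math

def p_AA(t):
    # inverse Laplace transform of r_AA, term by term
    return 4 / 15 + math.exp(-3 * t) / 3 + 2 * math.exp(-5 * t) / 5

print(p_AA(0.0))                    # = 1, a probability at t = 0
h = 1e-6
print((p_AA(h) - p_AA(0.0)) / h)    # ≈ q_AA = -3
print(p_AA(50.0))                   # ≈ 4/15, the long-run limit
```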
For part (c) we need to find the CB entry of the matrix (λI − Q)^{−1}. Recalling the inversion formula we get

    rCB(λ) = −det( λ+3  −1 ; −2  −1 ) / det(λI − Q) = (λ + 5) / (λ(λ + 3)(λ + 5)) = 1/(λ(λ + 3)) = (1/3)(1/λ − 1/(λ + 3)),
and inverting Laplace transforms gives
    IPC{X(t) = B} = pCB(t) = 1/3 − (1/3) e^{−3t}.
Suppose for part (d) that T is exponentially distributed with rate λ and independent of the chain.
Recall,
IPi {X(T ) = j} = λrij (λ),
and by matrix inversion,
    rCA(λ) = det( 0  λ+2 ; −2  −1 ) / det(λI − Q) = 2(λ + 2) / (λ(λ + 3)(λ + 5)).

Plugging in λ = 4,

    IPC{X(T) = A} = 4 rCA(4) = 2(4 + 2) / ((4 + 3)(4 + 5)) = 12/63 = 4/21.
Similarly,
    rCB(λ) = −det( λ+3  −1 ; −2  −1 ) / det(λI − Q) = (λ + 5) / (λ(λ + 3)(λ + 5)),

and

    rCC(λ) = det( λ+3  −1 ; 0  λ+2 ) / det(λI − Q) = (λ + 3)(λ + 2) / (λ(λ + 3)(λ + 5)).
Plugging λ = 4 we get
    IPC{X(T) = B} = 4 rCB(4) = 1/7   and   IPC{X(T) = C} = 4 rCC(4) = (4 + 2)/(4 + 5) = 2/3.
At this point it is good to check that the three values add up to one and thus define a probability
distribution on the statespace I = {A, B, C }.
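These values can be reproduced mechanically in Python with exact rational arithmetic, using the cofactor formula for the inverse quoted above (the helper functions are mine, not from the notes):

```python
from fractions import Fraction as F

def det2(a, b, c, d):
    return a * d - b * c

def inverse_3x3(M):
    """(A^{-1})_ij = (-1)^{i+j} det(M_ji) / det A, cofactor by cofactor."""
    det = (M[0][0] * det2(M[1][1], M[1][2], M[2][1], M[2][2])
           - M[0][1] * det2(M[1][0], M[1][2], M[2][0], M[2][2])
           + M[0][2] * det2(M[1][0], M[1][1], M[2][0], M[2][1]))
    inv = [[F(0)] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            rows = [r for r in range(3) if r != j]   # delete row j ...
            cols = [c for c in range(3) if c != i]   # ... and column i
            minor = det2(M[rows[0]][cols[0]], M[rows[0]][cols[1]],
                         M[rows[1]][cols[0]], M[rows[1]][cols[1]])
            inv[i][j] = (-1) ** (i + j) * minor / det
    return inv

Q = [[-3, 1, 2], [0, -2, 2], [2, 1, -3]]         # states A, B, C
lam = F(4)
M = [[lam * (i == j) - Q[i][j] for j in range(3)] for i in range(3)]
R = inverse_3x3(M)                               # resolvent (lam I - Q)^{-1}
row_C = [lam * r for r in R[2]]                  # IP_C{X(T) = A, B, C}
print(row_C)                                     # exactly 4/21, 1/7, 2/3
```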
2.3.3 First Hitting Times
Let Tj = inf{t > 0 : X(t) = j} be the first positive time we are in state j. Define, for i, j ∈ I,

    Fij(t) = IPi{Tj ≤ t};

then Fij(t) is the probability that we have entered state j at some time prior to time t, if we start the chain in i. We have the first hitting time matrix function

    F(t) = (Fij(t) : i, j ∈ I),   t ≥ 0,

and write

    Fij(t) = ∫_0^t fij(s) ds,

where fij(t) is the first hitting time density. The matrix functions f = (fij(t), t ≥ 0) and P are linked by the integral equation
    pij(t) = ∫_0^t fij(s) pjj(t − s) ds,   for i, j ∈ I and t ≥ 0.   (2.3.1)
This can be checked as follows: use the strong Markov property, which is the fact that the chain starts afresh at time Tj in state j and evolves independently of everything that happened before time Tj, to get
    pij(t) = IPi{Xt = j} = IPi{Xt = j, Tj ≤ t} = ∫_0^t IPi{Xt = j, Tj ∈ ds}
           = ∫_0^t IPi{Tj ∈ ds} IPj{Xt−s = j} = ∫_0^t fij(s) pjj(t − s) ds.
For good functions f, g define the convolution

    f ∗ g(t) = ∫_0^t f(s) g(t − s) ds,

and check by substituting u = t − s that f ∗ g = g ∗ f. We can rewrite the integral equation (2.3.1) as

    pij = fij ∗ pjj.

In order to solve this equation for the unknown fij we use Laplace transforms again.
Theorem 2.4 (Convolution Theorem) If f, g are good functions, then (f ∗ g)ˆ(λ) = f̂(λ) ĝ(λ).
To make this plausible look at the example of two functions f(t) = e^{−αt} and g(t) = e^{−βt}. Then

    f ∗ g(t) = ∫_0^t f(s) g(t − s) ds = e^{−βt} ∫_0^t e^{−(α−β)s} ds = (e^{−βt} − e^{−αt}) / (α − β).

Then

    (f ∗ g)ˆ(λ) = ∫_0^∞ e^{−λs} (f ∗ g)(s) ds = (1/(α − β)) (1/(λ + β) − 1/(λ + α)) = 1/((λ + α)(λ + β)) = f̂(λ) ĝ(λ).
The convolution theorem applied to the integral equation (2.3.1) gives a formula for the Laplace
transform of the first hitting time density
    f̂ij(λ) = rij(λ) / rjj(λ)   for i, j ∈ I and λ > 0.   (2.3.2)
We use this now to give an answer to (e) in our problem. We are looking for

    IPC{TA ≤ t} = FCA(t) = ∫_0^t fCA(s) ds.
We first find the Laplace transform fˆCA of fCA , using rCA and rAA ,
    f̂CA(λ) = det( 0  λ+2 ; −2  −1 ) / det( λ+2  −2 ; −1  λ+3 ) = 2(λ + 2) / (λ² + 5λ + 4),
and by partial fractions
    f̂CA(λ) = 2/(3(λ + 1)) + 4/(3(λ + 4)).
In this form, the Laplace transform can be inverted, which gives
    fCA(t) = (2/3) e^{−t} + (4/3) e^{−4t}   for t ≥ 0.
Now, by integration, we get
    IPC{TA ≤ t} = ∫_0^t fCA(s) ds = (2/3) ∫_0^t e^{−s} ds + (4/3) ∫_0^t e^{−4s} ds = 1 − (2/3) e^{−t} − (1/3) e^{−4t}.
Check that IPC {TA ≤ 0} = 0, as expected.
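The hitting probability can also be estimated by simulating the chain directly (a Python sketch; the jump rates are read off the Q-matrix derived in (a), and agreement is only up to Monte Carlo error):

```python
import math
import random

# jump rates: q_AB = 1, q_AC = 2, q_BC = 2, q_CA = 2, q_CB = 1
rates = {"A": [("B", 1.0), ("C", 2.0)],
         "B": [("C", 2.0)],
         "C": [("A", 2.0), ("B", 1.0)]}

def hits_A_by(t_max, rng):
    """Starting from C, is state A entered at some time <= t_max?"""
    state, clock = "C", 0.0
    while True:
        out = rates[state]
        q = sum(r for _, r in out)
        clock += rng.expovariate(q)
        if clock > t_max:
            return False
        u, acc = rng.random() * q, 0.0
        for nxt, r in out:
            acc += r
            if u <= acc:
                state = nxt
                break
        if state == "A":
            return True

rng = random.Random(3)
t, trials = 1.0, 60000
est = sum(hits_A_by(t, rng) for _ in range(trials)) / trials
exact = 1 - (2 / 3) * math.exp(-t) - (1 / 3) * math.exp(-4 * t)
print(round(est, 3), round(exact, 3))
```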
As in the case of rij = p̂ij the Laplace transforms fˆij also admit a direct probabilistic interpretation
with the help of an alarm clock, which goes off at a random time A, which is independent of the
Markov chain X and exponentially distributed with rate λ. Indeed, recalling that λe−xλ , x > 0 is the
density of A and fij is the density of Tj ,
    IPi{Tj ≤ A} = ∫_0^∞ IPi{Tj ≤ a} λe^{−aλ} da
                = ∫_0^∞ ∫_0^a fij(s) ds λe^{−aλ} da
                = ∫_0^∞ fij(s) ∫_s^∞ λe^{−aλ} da ds
                = ∫_0^∞ fij(s) e^{−sλ} ds
                = f̂ij(λ).
Altogether,
fˆij (λ) = IPi {Tj < A} for i, j ∈ I and λ > 0.
We use this now to give an answer to (f) in our problem. The question asks for IPB{TC < U}, where U is exponential with rate µ = 2.5 and independent of the Markov chain. We now know that

    IPB{TC < U} = f̂BC(µ) = rBC(µ)/rCC(µ) = −det( µ+3  −2 ; 0  −2 ) / det( µ+3  −1 ; 0  µ+2 ) = 2(µ + 3) / ((µ + 3)(µ + 2)) = 2/(µ + 2),

and plugging in µ = 2.5 gives IPB{TC < U} = 4/9.
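This answer has a direct sanity check: from B the only possible jump is to C (since qBA = 0), so TC is simply Exp(2), and Theorem 2.2 already gives IPB{TC < U} = 2/(2 + µ) = 2/(µ + 2). A two-line simulation agrees:

```python
import random

rng = random.Random(4)
mu, trials = 2.5, 50000
# from B the chain jumps straight to C at rate 2, so T_C ~ Exp(2)
hits = sum(rng.expovariate(2.0) < rng.expovariate(mu)
           for _ in range(trials))
print(round(hits / trials, 3))     # ≈ 2/(2 + 2.5) = 4/9 ≈ 0.444
```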
Example: A machine can be in three states A, B, C and the transition between the states is given
by a continuous-time Markov chain with Q-matrix
          ( −3    1    2 )
    Q =   (  3   −3    0 )
          (  1    2   −3 )
At time zero the machine is in state A. Suppose a supervisor arrives for inspection at the site of the
machine at an independent exponential random time T , with rate λ = 2. What is the probability that
he has arrived when the machine is in state C for the first time?
We write TC = inf{t > 0 : X(t) = C} for the first entry time and SC = inf{t > TC : X(t) ≠ C} for the first exit time from C. Then the required probability is IPA{TC < T, SC > T}. Note that J = SC − TC
is the length of time spent by the chain in state C during the first visit. This time is exponentially
distributed with rate qCA + qCB = 3 and we get
IPA {TC < T, SC > T } = IPA {TC < T }IPA {SC > T | TC < T }
= IPA {TC < T }IPA {T − TC < J | T > TC }.
By the lack of memory property, given T > TC the law of T̃ = T − TC is exponential with rate λ = 2
again. Hence
IPA {TC < T, SC > T } = IPA {TC < T }IPC {T̃ < J},
where T̃ is an independent exponential random variable with rate λ = 2. The right hand side can be
calculated. We write down
             ( λ+3   −1    −2  )
    λI − Q = ( −3    λ+3    0  )
             ( −1    −2    λ+3 )
and derive

    IPA{TC < T} = f̂AC(λ) = rAC(λ)/rCC(λ) = det( −1  −2 ; λ+3  0 ) / det( λ+3  −1 ; −3  λ+3 ) = 2(λ + 3) / (λ² + 6λ + 6).

Moreover, by Theorem 2.2, IPC{T̃ < J} = λ/(λ + 3). Hence, plugging in λ = 2,

    IPA{TC < T, SC > T} = 2λ / (λ² + 6λ + 6) = 4/22 = 2/11.
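The value 2/11 can be checked by Monte Carlo (a Python sketch; the rates are read off the machine's Q-matrix, and the helper function is mine):

```python
import random

# q_AB = 1, q_AC = 2, q_BA = 3, q_CA = 1, q_CB = 2
rates = {"A": [("B", 1.0), ("C", 2.0)],
         "B": [("A", 3.0)],
         "C": [("A", 1.0), ("B", 2.0)]}

def inspected_during_first_C(rng, lam=2.0):
    """Does the Exp(lam) inspection time fall inside the chain's
    first sojourn in state C, starting from A?"""
    T = rng.expovariate(lam)
    state, clock = "A", 0.0
    while True:
        out = rates[state]
        q = sum(r for _, r in out)
        hold = rng.expovariate(q)
        if clock + hold > T:       # inspector arrives in this sojourn
            return state == "C"
        clock += hold
        if state == "C":           # first visit to C ended before T
            return False
        u, acc = rng.random() * q, 0.0
        for nxt, r in out:
            acc += r
            if u <= acc:
                state = nxt
                break

rng = random.Random(6)
trials = 60000
est = sum(inspected_during_first_C(rng) for _ in range(trials)) / trials
print(round(est, 3), round(2 / 11, 3))
```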
2.4 Invariant Distributions
Recall the following two notions from the discrete time theory:
• A chain is irreducible if one can get from any state to any other state in finite time, and
• an irreducible chain is recurrent if the probability that we return to this state in finite time is
one.
We assume in this chapter that our chain satisfies these two conditions and also a technical condition called non-explosiveness, which ensures that the process is defined at all times; see Norris for further details.
Recall that in the discrete time case the stationary distribution π on I was given as the solution of
the equation πP = π and, by iteration,
π = πP = πP 2 = πP 3 = · · · = πP n for all n ≥ 0.
Hence we would expect an invariant distribution π of a continuous time Markov chain with transition matrix function (P(t) : t ≥ 0) to satisfy

    πP(t) = π   for all t ≥ 0.

If the statespace is finite, we can differentiate with respect to time to get πP′(t) = 0. Setting t = 0 and recalling P′(0) = Q we get

    πQ = 0.
The two boxed statements can be shown to be equivalent in the general case, see Norris Theorem 3.5.5.
Any probability distribution π satisfying πQ = 0 is called an invariant distribution, or sometimes
equilibrium or stationary distribution.
Theorem 2.5 If the chain is irreducible, recurrent and non-explosive with invariant distribution π, then

    lim_{t→∞} pij(t) = πj   for all i ∈ I,

and if Vj(t) = ∫_0^t 1{X(s)=j} ds is the time spent in state j up to time t, we have

    lim_{t→∞} Vj(t)/t = πj   with probability one.
This theorem is analogous to (parts of) the Big Theorem in the discrete time case. Note that it also implies that the invariant distribution, if it exists, is unique.
2.4.2 Symmetrisability
We say that Q is symmetrisable if we can find m = (mi : i ∈ I) with mi > 0 such that

    mi qij = mj qji   for all i, j ∈ I.

These are the detailed balance equations.
Theorem 2.6 If we find an m solving the detailed balance equations such that M = Σ_{i∈I} mi < ∞, then πi = mi/M defines the unique invariant distribution.
Note that, as in the discrete case, the matrix Q may not be symmetrisable, but the invariant distribution may still exist. Then the invariant distribution may be found using generating functions.
Example: The M/M/1 queue.
A single server has a service rate µ, customers arrive individually at rate λ. Let X(t) be the number of customers in the queue (including the customer currently served) and I = {0, 1, 2, . . .}. The Q-matrix is given by

          ( −λ       λ        0        · · ·     )
    Q =   (  µ    −λ − µ      λ        0  · · ·  )
          (  0       µ     −λ − µ      λ  · · ·  )
          (  ·       ·        ·        ·         )
We first try to find the invariant distribution using symmetrisability. For this purpose we have to
solve the detailed balance equations mi qij = mj qji , explicitly
    m0 λ = m1 µ   ⇒   m1 = (λ/µ) m0,
    m1 λ = m2 µ   ⇒   m2 = (λ/µ) m1,
    mn λ = mn+1 µ   ⇒   mn+1 = (λ/µ) mn,
for all n ≥ 2. Hence, denoting by ρ = λ/µ the traffic intensity, we have mi = ρ^i m0. If ρ < 1, then

    M = Σ_{i=0}^∞ mi = m0 Σ_{i=0}^∞ ρ^i = m0 / (1 − ρ) < ∞.
Hence we get that

    πi = mi / M = (1 − ρ) ρ^i   for all i ≥ 0.
In other words, if ρ < 1 the invariant distribution is the geometric distribution with parameter ρ.
Over a long time range the mean queue length is
    Σ_{i=1}^∞ i πi = Σ_{i=1}^∞ (1 − ρ) i ρ^i = (1 − ρ) ρ (Σ_{i=0}^∞ ρ^i)′ = ρ / (1 − ρ).
PS: In the case ρ > 1 the chain is not recurrent, in the case ρ = 1 the chain is null recurrent and no
invariant distribution exists.
As a (more widely applicable) alternative one can also solve the equation πQ = 0 using the generating
function
    π̂(s) := Σ_{n=0}^∞ s^n πn = π0 + π1 s + π2 s² + · · · .

The equations πQ = 0 read

    −λπ0 + µπ1 = 0,
    λπ0 − (λ + µ)π1 + µπ2 = 0,
    λπ1 − (λ + µ)π2 + µπ3 = 0,
and so on. Multiplying the first equation by s, the second by s², and so forth, and adding up, we get

    0 = π̂(s)(λs² − (λ + µ)s + µ) − π0 µ(1 − s),

hence

    π̂(s)(λs² − (λ + µ)s + µ) = π0 µ(1 − s).
This implies

    π̂(s) = π0 µ(1 − s) / (λs² − (λ + µ)s + µ) = π0 µ(1 − s) / ((s − 1)(λs − µ)) = π0 µ / (µ − λs).
To find π0 recall that π̂(1) = 1, hence

    π0 = (µ − λ)/µ = 1 − ρ,

so that π̂(s) = (1 − ρ)/(1 − ρs), hence πn = (1 − ρ)ρ^n. The mean queue length in the long term can be found using
    π̂′(s) = Σ_{n=1}^∞ n s^{n−1} πn,   and   π̂′(s) = ρ(1 − ρ) / (1 − ρs)²,

so that the mean queue length is π̂′(1) = ρ/(1 − ρ).
Last Example: The pure birth process.
Recall that the pure birth process X is a continuous time Markov chain with Q-matrix
          ( −λ     λ     0     0   · · · )
    Q =   (  0   −2λ    2λ     0   · · · )
          (  0     0   −3λ    3λ   · · · )
          (  ·     ·     ·     ·         )
This process is strictly increasing, and hence transient. We show that, if X(0) = 1, then

    IP{X(t) = n} = e^{−λt} (1 − e^{−λt})^{n−1}   for n = 1, 2, . . . ,

i.e. X(t) is geometrically distributed with parameter e^{−λt}. Note that this shows that, as t → ∞,
    IP{X(t) ≤ n} = Σ_{k=1}^n e^{−λt} (1 − e^{−λt})^{k−1} = 1 − (1 − e^{−λt})^n −→ 0,
hence X(t) → ∞ with probability one. For the proof one computes the resolvent entries r1n and inverts the Laplace transform by partial fractions; the coefficients turn out to be

    ak = (n − 1)! / Π_{j≠k} (j − k) = (−1)^{k−1} (n−1 choose k−1).
Now we have

    r1n(ρ) = Σ_{k=1}^n (−1)^{k−1} (n−1 choose k−1) 1/(ρ + kλ).
Inverting the Laplace transform yields

    p1n(t) = Σ_{k=1}^n (−1)^{k−1} (n−1 choose k−1) e^{−kλt} = e^{−λt} Σ_{k=0}^{n−1} (n−1 choose k) (−1)^k e^{−kλt} = e^{−λt} (1 − e^{−λt})^{n−1},