J. Appl. Prob. 45, 640–649 (2008)
Printed in England
© Applied Probability Trust 2008

HITTING TIME AND INVERSE PROBLEMS FOR MARKOV CHAINS

VICTOR DE LA PEÑA,∗ Columbia University


HENRYK GZYL,∗∗ IESA, Caracas
PATRICK MCDONALD,∗∗∗ New College of Florida

Abstract
Let Wn be a simple Markov chain on the integers. Suppose that Xn is a simple Markov
chain on the integers whose transition probabilities coincide with those of Wn off a finite
set. We prove that there is an M > 0 such that the Markov chain Wn and the joint
distributions of the first hitting time and first hitting place of Xn started at the origin for
the sets {−M, M} and {−(M + 1), (M + 1)} algorithmically determine the transition
probabilities of Xn .
Keywords: Inverse problem; Markov chain; first hitting time

2000 Mathematics Subject Classification: Primary 60G40; 35R30; 39A12

1. Introduction
Many problems arising in the natural sciences involve situations in which first hitting times
for an unknown diffusion process driving particles in an inaccessible region are given, and
from this data one seeks to determine properties of the underlying region and/or the particle
dynamics (see [4] for a general reference, see [2] for applications in neuroscience). As an
illustrative model problem, consider a long thin tube containing a liquid whose diffusivity is
known outside a given interval, say [−1, 1]. Suppose that particles are injected at the point
0 and that the first hitting time probabilities are given at a number of locations outside the
inaccessible interval [−1, 1]. We ask: What properties of the diffusivity can be determined
from the given data?
In this paper we study a discrete analog of the model diffusion problem sketched above. We
formalize the problem as follows.
Suppose that Wn is a simple Markov chain on the integers, Z (i.e. the transition probabilities
between two integers are nonzero if and only if the two integers are nearest neighbors). Suppose
that D ⊂ Z is a finite subset of Z. Suppose that Xn is a simple Markov chain whose transition
probabilities coincide with those of Wn outside of the subset D (we refer to such a Markov
chain as a simple D-perturbation of Wn ). The main result of this paper is the following.
Theorem 1.1. Let Wn be a simple Markov chain on the integers. Suppose that D ⊂ Z is a
finite set and that Xn is a simple D-perturbation of Wn . Let M = max{|i| : i ∈ D} + 1. Then
the Markov chain Wn and the joint distributions of the first hitting time and first hitting place

Received 22 August 2006; revision received 28 July 2008.


∗ Postal address: Department of Statistics, Columbia University, New York, NY 10027, USA.
Email address: [email protected]
∗∗ Postal address: Centro de Finanzas, IESA, Caracas, Venezuela. Email address: [email protected]
∗∗∗ Postal address: Division of Natural Science, New College of Florida, Sarasota, FL 34243, USA.
Email address: [email protected]

Downloaded from https://www.cambridge.org/core. IP address: 207.241.231.82, on 26 Jul 2018 at 23:22:18, subject to the Cambridge Core terms of use, available
at https://www.cambridge.org/core/terms. https://doi.org/10.1239/jap/1222441820

of Xn starting at the origin for the sets {−M, M} and {−(M + 1), (M + 1)} determine the
transition probabilities of Xn .
We view Theorem 1.1 as an inverse result. Viewed in this context, there are a great number of
directions in which we can seek generalizations. These directions include generalizations of the
state space (discrete problems on other graphs, continuous problems, etc.), generalizations of
the process (Markov chains on graphs, diffusions, etc.), and generalizations of the probabilistic
data which one is given (moments of hitting times, etc.).
Our result can be further sharpened: from the proof of Theorem 1.1 in Section 3, it is clear
that we need the joint distribution of time and place for n ≤ 3M −1, and given such information,
there is an algorithm which constructs the required transition probabilities (cf. Corollary 3.1).
In addition, our result does not depend on the initial state of the Markov chain. Given that this
is the case, we can formulate a version of ‘diffusion tomography’ in which we study properties
of a given inaccessible region using particles which interact with the region via diffusion.
Grünbaum and his colleagues (who coined the term ‘diffusion tomography’) have investigated
related problems in the context of medical imaging (for a survey of optical tomography and
associated inverse problems, see [1]; for applications of Markov chains to optical tomography,
see [5], [6], [7], and [8]). The present work differs from those cited above in several ways. In
particular, our work demonstrates that (in the language of the above papers) detailed ‘time-of-
flight’ data suffices to solve the highly nonlinear inverse problem in one space dimension.
Because the joint distributions cited in Theorem 1.1 determine the underlying D-perturba-
tion, Theorem 1.1 provides a context in which we can define and study meaningful problems
in applied statistics. We develop general statistical tools for such problems and discuss several
applications.
Our proof of Theorem 1.1 involves an analysis of the pathspace associated to our Markov
chain.
The paper is organized as follows. In Section 2 we develop the notation used throughout
the paper, the preliminary material needed for the proof of our main result, and an example
that illustrates how our result can be used in practice to derive consistent estimators for
the unknown transition probabilities. In Section 3 we provide a proof of our main result
and corollaries which provide an explicit algorithm for constructing the required transition
probabilities (cf. Corollary 3.1). In Section 4 we formalize and discuss our statistical results.

2. Background and notation


As in the introduction, let Wn be a simple Markov chain on the integers, and denote the
transition probabilities of Wn by

wi = P(Wn+1 = i + 1 | Wn = i).

For notational convenience, we will write wi∗ = 1 − wi for the backward hopping probabilities
(see Figure 1).
Throughout this paper, we will view the integers as defining a bidirected graph with edges
given by connecting nearest neighbors and a natural orientation given by the ordering of the
integers. With these conventions, by a path of length k in the integers we will mean a sequence
of k integers connected by edges.
Definition 2.1. Let Wn be a simple Markov chain on Z, and suppose that D ⊂ Z is a finite set.
We say that a Markov chain Xn on Z is a D-perturbation if the transition probabilities of Xn


[Figure: the integer line with sites . . . , −L, . . . , −2, −1, 0, 1, 2, . . . , L, . . . ; forward probabilities p0, p1, p2, . . . and backward probabilities p0*, p1*, p2*, . . . are marked on the edges.]

Figure 1: Transition probabilities for the one-dimensional problem.

coincide with those of Wn on Z \ D. We say that Xn is a simple D-perturbation of Wn if Xn is
a simple Markov chain which is a D-perturbation of Wn .
If L is a positive integer, we will denote by DL the subset of Z defined by
DL = {i ∈ Z : |i| ≤ L}.
Given an arbitrary finite D ⊂ Z, let L = max{|i| : i ∈ D}. Then D ⊂ DL . The sets DL play a
fundamental role in the sequel.
If Xn is a simple D-perturbation of Wn , we will denote by pi the transition probabilities of
Xn . If, as above, L = max{|i| : i ∈ D}, we can write

pi = P(Xn+1 = i + 1 | Xn = i) if |i| ≤ L, and pi = wi otherwise.

We will denote by Pl the probability measure associated to Xn which charges paths beginning
at l.
Given D ⊂ DL and M > L, we will denote by η = η−M,M the first hitting time of Xn for
{−M, M}:
η−M,M = inf{n ≥ 0 : Xn ∈ {−M, M}}.
We will write the joint distribution of the first hitting time and first hitting place as

P^M_0(k, −) = P_0(η = k, X_η = −M),
P^M_0(k, +) = P_0(η = k, X_η = M).
The first nontrivial example of the inverse problem described in Theorem 1.1 occurs for the
set D = D1 . We let Xn be a simple D1 -perturbation of Wn and we study this problem in detail.
Suppose that we are given the joint distribution of the first hitting time and first hitting place
for the set {−2, 2}. With the notation as above, we have
P^2_0(2, +) = p0 p1,   (2.1)
P^2_0(2, −) = p0* p−1*.

For all positive integers k, it is clear that

P^2_0(2k + 1) = 0.
In addition, every path of length 2k starting at 0 and having first hitting time of {−2, 2} given
by 2k and first hitting place given by 2 has an associated occurrence probability of the form
(p0* p−1)^{l1} (p0 p1*)^{l2} p0 p1,


where l1 + l2 + 1 = k. Thus,

P^2_0(2k, +) = Σ_{l1 + l2 + 1 = k} c_{l1,l2} (p0* p−1)^{l1} (p0 p1*)^{l2} p0 p1,   (2.2)

where the coefficient c_{l1,l2} counts the number of distinct paths occurring for the partition of k
given by l1 + l2 + 1 = k. Thus,

c_{l1,l2} = (k − 1 choose l1),

and we can sum:

P^2_0(2k, +) = (p0* p−1 + p0 p1*)^{k−1} P^2_0(2, +).   (2.3)

But p1* = 1 − p1 and p−1 = 1 − p−1*. From this we conclude that

p0* p−1 + p0 p1* = −(p0 p1 + p0* p−1* − 1) = 1 − (p0 p1 + p0* p−1*).

This together with a trivial algebraic computation proves the following lemma.
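The closed form (2.3) can be checked against a direct computation. The sketch below is ours, not part of the original argument, and all names are hypothetical: it propagates the exact distribution of the chain over the interior sites {−1, 0, 1} and reads off the joint law of (η, X_η) for {−2, 2}.

```python
def hitting_dist(p, M=2, start=0, tmax=20):
    """Exact joint law of (first hitting time, first hitting place) of
    {-M, M} for a simple chain on Z started at `start`.

    p maps each interior site i (|i| < M) to its forward probability
    P(X_{n+1} = i+1 | X_n = i).  Returns {(t, ±M): probability} for
    t <= tmax, by propagating the mass still strictly inside (-M, M).
    """
    interior = {start: 1.0}
    law = {}
    for t in range(1, tmax + 1):
        nxt = {}
        for i, mass in interior.items():
            for j, step in ((i + 1, p[i]), (i - 1, 1 - p[i])):
                if abs(j) == M:
                    law[(t, j)] = law.get((t, j), 0.0) + mass * step
                else:
                    nxt[j] = nxt.get(j, 0.0) + mass * step
        interior = nxt
    return law

# Example: p_{-1} = 3/4, p0 = 1/2, p1 = 1/4 gives P^2_0(2, +) = p0 p1 = 1/8
# and, by (2.3) with s = p0* p_{-1} + p0 p1* = 3/4, P^2_0(2k, +) = (3/4)^(k-1)/8.
law = hitting_dist({-1: 0.75, 0: 0.5, 1: 0.25})
assert abs(law[(2, 2)] - 0.125) < 1e-12
assert abs(law[(4, 2)] - 0.75 * 0.125) < 1e-12
```

The same dictionary also exhibits P^2_0(2, −) = p0* p−1* = 1/8 at the key (2, −2).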
Lemma 2.1. Let Wn be a simple Markov chain on the integers, and suppose that Xn is a simple
D1 -perturbation of Wn . Then the joint distribution of the first hitting time and first hitting place
of Xn starting at the origin for the set {−2, 2} does not determine the transition probabilities
of Xn .
In fact, the proof provides a method for constructing two different perturbations with first
hitting times and first hitting places having the same joint distributions:
 
(p−1, p0, p1) = (3/4, 1/2, 1/4),
(p−1, p0, p1) = (1/2, 3/4, 1/6).

We note that the conclusion of Lemma 2.1 depends on the location of the starting point: it
is trivial to verify that other starting points give hitting time probabilities which determine the
transitions.
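The two perturbations above can be compared directly: by (2.1)–(2.3) and their mirror images, the joint law of (η, X_η) for {−2, 2} from 0 is determined by the three numbers a = p0 p1, b = p0* p−1*, and s = p0* p−1 + p0 p1*, so it suffices to compare these. The following check is ours (function and variable names hypothetical):

```python
def signature(pm1, p0, p1):
    """The joint law of (η, X_η) for {-2, 2} started at 0 is the geometric
    family P^2_0(2k, +) = s**(k-1) * a, P^2_0(2k, -) = s**(k-1) * b, so it
    is determined by a = p0 p1, b = p0* p_{-1}*, s = p0* p_{-1} + p0 p1*."""
    a = p0 * p1
    b = (1 - p0) * (1 - pm1)
    s = (1 - p0) * pm1 + p0 * (1 - p1)
    return a, b, s

sig1 = signature(3/4, 1/2, 1/4)   # (p_{-1}, p0, p1) of the first chain
sig2 = signature(1/2, 3/4, 1/6)   # ... and of the second
# Both triples agree (each equals (1/8, 1/8, 3/4) up to rounding).
assert all(abs(x - y) < 1e-12 for x, y in zip(sig1, sig2))
```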
Suppose that we know the joint distribution of the first hitting time and first hitting place for
the set {−2, 2} and the set {−3, 3}. Then

P^3_0(3, +) = p0 p1 w2   (2.4)

and

P^3_0(5, +) = P^3_0(3, +)(p0 p1* + p0* p−1 + p1 w2*).   (2.5)

From (2.3), (2.4), and (2.5), we conclude that

p1 = (1/w2*) [ P^3_0(5, +)/P^3_0(3, +) − P^2_0(4, +)/P^2_0(2, +) ],   (2.6)

from which it follows that the transition probabilities are determined. In particular, we have
proven the following lemma.
Lemma 2.2. Let Wn be a simple Markov chain on the integers, and suppose that Xn is a simple
D1 -perturbation of Wn . Then the joint distributions of the first hitting time and first hitting place
of Xn starting at the origin for the sets {−2, 2} and {−3, 3} determine the transition probabilities
of Xn .
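Formula (2.6) can be exercised numerically. In the sketch below (ours, not from the paper; all names hypothetical) the four hitting probabilities are generated from the closed forms (2.1)–(2.5) for a known chain and then fed back through (2.6), recovering p1; in practice the same four ratios would come from observed hitting data.

```python
def recover_p1(pm1, p0, p1, w2):
    """Generate P^2_0(2,+), P^2_0(4,+), P^3_0(3,+), P^3_0(5,+) from the
    closed forms (2.1)-(2.5), then invert via (2.6)."""
    s = (1 - p0) * pm1 + p0 * (1 - p1)     # p0* p_{-1} + p0 p1*
    P2_2 = p0 * p1                         # (2.1)
    P2_4 = s * P2_2                        # (2.3) with k = 2
    P3_3 = p0 * p1 * w2                    # (2.4)
    P3_5 = P3_3 * (s + p1 * (1 - w2))      # (2.5)
    # (2.6): p1 = (1/w2*) [P^3_0(5,+)/P^3_0(3,+) - P^2_0(4,+)/P^2_0(2,+)]
    return (P3_5 / P3_3 - P2_4 / P2_2) / (1 - w2)

assert abs(recover_p1(0.3, 0.6, 0.7, 0.55) - 0.7) < 1e-9
```

The difference of the two ratios isolates p1 w2*, so dividing by w2* = 1 − w2 returns p1 for any valid parameters with w2 < 1.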


From Lemma 2.2, it follows that there is a natural statistical problem associated to a simple
D1 -perturbation of Wn . We formalize this problem as follows. From (2.1), (2.2), (2.6), and
their analogs for p−1 , and using the fact that pi∗ = 1 − pi for all i, we can see that the transition
probabilities p0 , p1 , and p−1 can be obtained from the 8-vector P̄ given by

P̄ = (P^2_0(2, ±), P^3_0(3, ±), P^2_0(4, ±), P^3_0(5, ±)).

Let {(X^i_1, X^i_2, . . . , X^i_5)}_{i=1}^m be m independent copies of our Markov chain up to time n = 5.
The natural estimator for the first component of the vector P̄ is given by

P^2_0(2, +, m) = (1/m) Σ_{i=1}^m ξ_{{η−2,2 = 2, X^i_2 = 2}},   (2.7)

where ξ_{{η−2,2 = 2, X^i_2 = 2}} is the indicator function for the given event. By the strong law of
large numbers, (2.7) is a consistent statistic for P^2_0(2, +). Repeating the construction for
each component of P̄, we obtain a strongly consistent vector estimator for P̄. Using the
continuous mapping theorem (if all the original probabilities in our random vector are nonzero),
we obtain a vector (p−1,m , p0,m , p1,m ) which is a strongly consistent estimator for (p−1 , p0 , p1 )
(cf. Section 4, below).
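The estimator (2.7) is straightforward to simulate. The following sketch is ours, not from the paper, with hypothetical names: it draws m independent runs of a simple D1-perturbation and records the empirical frequency of {η−2,2 = 2, X_2 = 2}, which by the strong law converges to P^2_0(2, +) = p0 p1.

```python
import random

def run_until_hit(p, M, rng):
    """One trajectory from 0 until the first visit to {-M, M};
    p maps interior sites to their forward probabilities."""
    x, t = 0, 0
    while abs(x) < M:
        x += 1 if rng.random() < p[x] else -1
        t += 1
    return t, x

def estimate(p, M, t, place, m, seed=0):
    """Empirical frequency of {eta = t, X_eta = place} over m independent
    runs: the natural estimator of the form (2.7)."""
    rng = random.Random(seed)
    return sum(run_until_hit(p, M, rng) == (t, place) for _ in range(m)) / m

p = {-1: 0.75, 0: 0.5, 1: 0.25}
est = estimate(p, M=2, t=2, place=2, m=100_000)   # target p0 p1 = 0.125
```

With 10^5 runs the empirical frequency lands within a few standard errors (about 10^-3 here) of the target.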
Observing that we have used eight parameters to determine three unknowns, it is natural to
question whether there is a more efficient algorithm which recovers the perturbation. With this
in mind, consider the set {−2, 3}, introduce the obvious notation, and follow the argument used
to establish Lemma 2.2, to obtain

p1 = (1/w2*) [ P^0_{−2,3}(5, +)/P^0_{−2,3}(3, +) − P^0_{−2,3}(4, −)/P^0_{−2,3}(2, −) ],

establishing that the transition probabilities of Xn are determined by Wn and the joint distribution
of the first hitting time and first hitting place of Xn starting at the origin for the set {−2, 3}.
While we suspect that this example is an anomaly (i.e. for L > 1, we cannot recover the
transition probabilities using data from a single interval), we are able to prove only that our
algorithm for solving the inverse problem fails.

3. Proof of Theorem 1.1


We begin with a reduction.
Lemma 3.1. Theorem 1.1 is true if and only if, for all L > 0, Theorem 1.1 is true for D = DL .
Proof. Let D be an arbitrary finite subset of Z, and let Xn be a simple D-perturbation
of Wn . Let L = max{|i| : i ∈ D}, and let M = L + 1. Then D ⊂ DL and Xn is a simple
DL -perturbation of Wn with max{|i| : i ∈ DL } + 1 = M. Assuming that the transition
probabilities of Xn are determined on DL , we find that the transition probabilities of Xn are
determined on D. This completes the proof.
To prove Theorem 1.1 for sets of the form DL , we will give a careful analysis of the structure
of paths beginning at the origin and having certain prescribed hitting properties. To this end,
let L > 0, let m > 0, and define (for the remainder of the paper)

T = (L + m + 1) + 2m. (3.1)


We consider paths beginning at the origin and having T as the first hitting time for L + m + 1.
More precisely, we define

Γ = {γ : γ(0) = 0, γ(T) = L + m + 1, γ(j) < L + m + 1 for all j < T}.   (3.2)

Elements of Γ have a nice lower bound.

Lemma 3.2. For Γ as in (3.2), if γ ∈ Γ then γ(j) > −(L + m) for all j ≤ T.
Proof. Let γ be a curve satisfying γ(0) = 0 and γ(j) = −(L + m). Then j ≥ L + m > L + m − 1.
Denote by τ the first time that γ visits L + m + 1. Then

τ ≥ j + (L + m) + (L + m + 1)
  > (L + m − 1) + (L + m) + (L + m + 1)
  = 3(L + m) ≥ (L + m + 1) + 2m = T.

We conclude that τ > T, which completes the proof.


We partition Γ by the first hitting times of L + m.

Lemma 3.3. Let Γ be as in (3.2), and define

Γ_k = {γ ∈ Γ : γ(T − (2k − 1)) = L + m, γ(j) < L + m for all j < T − (2k − 1)}.   (3.3)

Then

(a) Γ_i ∩ Γ_j = ∅ if i ≠ j,

(b) ⋃_{k=1}^{m+1} Γ_k = Γ.

Proof. If i ≠ j, elements of Γ_i and Γ_j have different first hitting times of L + m, and, thus,
Γ_i ∩ Γ_j = ∅. Since every element of Γ begins at 0 and hits L + m by time T, ⋃_{k=1}^{m+1} Γ_k = Γ.
This completes the proof.
Initial segments of paths in Γ_k define paths with nice first hitting properties. We make this
precise in the following lemma.
Lemma 3.4. Let 1 ≤ k ≤ m + 1. For γ ∈ Γ_k, the path Tk(γ) obtained by truncating γ at time
T − (2k − 1) satisfies
(a) Tk (γ )(0) = 0,
(b) Tk (γ )(T − (2k − 1)) = L + m,
(c) −(L + m) < Tk (γ )(j ) < L + m for all j < T − (2k − 1).
Proof. By definition, for γ ∈ Γ_k, the first hitting time of L + m is T − (2k − 1), from which
statement (b) and the right-hand side inequality of statement (c) follow. Statement (a) is trivial,
and the left-hand side inequality of statement (c) follows from Lemma 3.2. This completes the
proof.
Truncation provides for a decomposition of paths in Γ_k: each such path consists of an initial
segment which has nice first hitting properties, followed by an end segment which never visits
the vertex L. We make this precise in the following lemma.


Lemma 3.5. For 1 ≤ k < m + 1,

P_0(Γ_k) = P^{L+m}_0(T − (2k − 1), +) χ_k,

where χ_k is an expression which involves only the transition probabilities w_j, L < j ≤ L + m.
Proof. By Lemma 3.4, each γ ∈ Γ_k can be decomposed as a path Tk(γ) starting at 0 with
the first hitting time of {−(L + m), (L + m)} occurring at time T − (2k − 1) and position L + m,
followed by a path of length 2k − 1 which begins at L + m and ends when it makes its first
visit to L + m + 1. We will write

Γ̃_k = {γ : γ(0) = L + m, γ(2k − 1) = L + m + 1, γ(j) < L + m + 1 for all j < 2k − 1}.

By choice of k, if γ ∈ Γ̃_k, then γ(j) > L for all 0 ≤ j ≤ 2k − 1. Thus, if γ ∈ Γ̃_k, we can compute
P_{L+m}({γ}) in terms of the transition probabilities w_j, L < j ≤ L + m. Summing over all
elements of Γ̃_k gives an expression

χ_k = P_{L+m}(Γ̃_k),

which involves only the transition probabilities w_j, L < j ≤ L + m. Finally, we compute

P_0(Γ_k) = P^{L+m}_0(T − (2k − 1), +) χ_k,

as required. This completes the proof.


The following lemma is the essential step in establishing Theorem 1.1.
Lemma 3.6. Suppose that Wn is a simple Markov chain on Z. For L > 0, let Xn be a simple
DL -perturbation of Wn . Let m > 0. Then pL and p−L are determined by the Markov chain
Wn and the joint distributions of the first hitting time and first hitting place of Xn starting from
the origin for the sets {−(L + m), (L + m)} and {−(L + m + 1), (L + m + 1)}.
Proof. From Lemma 3.3 and Lemma 3.5, we have

P^{L+m+1}_0(T, +) = P_0(Γ)
                  = Σ_{k=1}^{m} P^{L+m}_0(T − (2k − 1), +) χ_k + P_0(Γ_{m+1}).   (3.4)

We let γ_* be the element of Γ_{m+1} which changes direction exactly twice. From (3.1) and (3.3),
γ_* is the path which starts at 0, moves right L + m units, moves left m units, and moves right
m + 1 units. Thus, we can explicitly compute the probability that γ_* occurs:

P_0({γ_*}) = χ_* pL,   (3.5)

where

χ_* = P^{L+m+1}_0(L + m + 1, +) ∏_{i=L+1}^{L+m−1} w_i ∏_{j=L+1}^{L+m} (1 − w_j).

If γ ∈ Γ_{m+1} \ {γ_*} then, as in Lemma 3.5, we may view γ as a truncation followed by a path
which never visits L. Thus, as in Lemma 3.5, we can write

P_0(Γ_{m+1} \ {γ_*}) = P^{L+m}_0(L + m, +) χ_{m+1},   (3.6)


where χ_{m+1} depends only on the transition probabilities w_j, L < j ≤ L + m. Using (3.4),
(3.5), and (3.6), we can solve for pL:

pL = (1/χ_*) [ P^{L+m+1}_0(T, +) − Σ_{k=1}^{m+1} P^{L+m}_0(T − (2k − 1), +) χ_k ].   (3.7)
By symmetry, there is a similar formula for p−L . This completes the proof.
Proof of Theorem 1.1. By Lemma 3.1, it suffices to consider the case in which D = DL ,
where L > 0 is arbitrary. We proceed by induction on L.
By Lemma 2.2, the result is true when L = 1. When L = n + 1, we can use Lemma 3.6
with m = 1 to determine pn+1 and p−(n+1) . Applying Lemma 3.6 n times, incrementing m
while decreasing L at each repetition, completes the proof.
From the proofs of Lemma 3.6 and Theorem 1.1, we note that, with M as defined in
Theorem 1.1, we only require values of the joint distribution of the exit time and place for time
t ≤ M + 3. It is then the case that (3.7) gives an explicit value for the transition probability
pL . From this we deduce the following corollary.
Corollary 3.1. Under the hypotheses of Theorem 1.1, there is an algorithmic procedure for
explicitly determining the transition probabilities for any given simple D-perturbation. With
M as in Theorem 1.1, the algorithm depends on data from the joint distribution of the exit time
and place for time t ≤ 3L + 2.
Proof. From the proof of Lemma 3.6 we use (3.7) L times, each time decreasing L by 1
and increasing m by 1. This has the effect of changing the value of T by 2 at each step. The
corollary follows from (3.1).
From the proof of Theorem 1.1, it is clear that the starting point of the process need not be an
element of the region which defines the perturbation, nor need the starting point be the origin.
From the point of view of application, this fact allows us to develop a ‘probabilistic scattering’
framework.
Finally, we note that there is an alternate proof of Theorem 1.1 using the strong Markov
property. Our proof, however, provides an algorithm for constructing the required transition
probabilities.

4. Associated statistics
As in the introduction, let Wn be a simple Markov chain on the integers. Suppose that D ⊂ Z
is a finite set and that Xn is a simple D-perturbation of Wn . Let M = max{|i| : i ∈ D} + 1 and,
for L = M − 1, view Xn as a simple DL -perturbation of Wn . By Theorem 1.1 and its proof,
the transition probabilities associated to Xn can be reconstructed from the components of the
2(2L + 2)-vector (cf. Corollary 3.1):
P̄ = (P^{L+1}_0(L + 1, ±), P^{L+2}_0(L + 2, ±), . . . , P^{L+2}_0(3L + 2, ±)).
Let {(X^i_1, X^i_2, . . . , X^i_{3L+2})}_{i=1}^m be m independent copies of our random walk up to time
n = 3L + 2. The natural estimator for the first component of the vector P̄ is given by

P^{L+1}_0(L + 1, +, m) = (1/m) Σ_{i=1}^m ξ_{{η−(L+1),L+1 = L+1, X^i_{L+1} = L+1}},   (4.1)

where ξ_{{η−(L+1),L+1 = L+1, X^i_{L+1} = L+1}} is the indicator function for the given event.


Theorem 4.1. Let Xn be a simple DL-perturbation of a given Markov chain Wn . Suppose
that, for L + 1 ≤ j, l ≤ 3L + 2, P^{L+1}_0(j, ±, m) and P^{L+2}_0(l, ±, m) are the natural estimators
defined as analogs of (4.1). Let pi,m denote the estimator of pi obtained using the estimators
P^{L+1}_0(j, ±, m) and P^{L+2}_0(l, ±, m), and the algorithm of Corollary 3.1. Then, for each i, pi,m
is a consistent estimator for pi .
Proof. By the strong law of large numbers, (4.1) is a consistent statistic for each component
of P̄, the vector determining the DL -perturbation Xn . By the proof of Theorem 1.1, it is clear
that each of the pi defining the perturbation is a rational function of the components of P̄. By the
continuous mapping theorem (if all the original probabilities in our random vector are nonzero),
we obtain a vector (p−L,m , p−L+1,m , . . . , pL,m ) which is a strongly consistent estimator for
(p−L , p−L+1 , . . . , pL ). This completes the proof.
There are a number of pertinent observations to be made concerning Theorem 4.1.
The construction of our consistent estimator involves the use of (3.7) and its analogues.
Since these equations involve probabilities in the denominators, we need to safeguard against
dividing by 0. Since we are assuming that each of the transition probabilities is positive, each
of the components of P̄ is nonzero. If we set

τ_{L+1} = inf{m ≥ 0 : ξ_{{η−(L+1),L+1 = L+1, X^m_{L+1} = L+1}} > 0},

then, by the strong law of large numbers, τ_{L+1} < ∞ almost surely and P^{L+1}_0(L + 1, +, m ∨ τ_{L+1})
is a nonzero consistent estimator of P^{L+1}_0(L + 1, +). The remaining components of P̄ can be
treated similarly, as long as we take as our running index m ∨ ⋁_{j=L+1}^{3L+2} τ_j. Given this, the
estimating vector P̄_{m ∨ ⋁ τ_j} converges almost surely to P̄ and, hence, we have a consistent
estimator.
We note that, while each of the entries in our vector estimator for the components of P̄ is a
maximum likelihood estimate for the respective probability, it is not clear that the vector estimate
for the transition probability vector defining the perturbation is a maximum likelihood estimate
(it is necessary to further study the joint behavior of the vector components before making such
a claim). Analogous claims hold for the derived vector (p−L,m , p−L+1,m , . . . , pL,m ).
We note that, as defined, the estimators pi,m , though consistent for pi , need not (and in fact
most likely will not) be unbiased.
As sketched in the introduction, our results were developed in the context of a model involving
first passage probabilities which arises in a variety of physical contexts (cf. [4]). This model
involves a natural inverse problem for a continuous process. More precisely, suppose that
a : R → R is a smooth function satisfying a(x) > 0 and a(x) = 1 if |x| ≥ 1. Let L be the
elliptic operator

L = (1/2) (a(x) d/dx)^2,
and suppose that Xt is a diffusion process with generator L. Suppose that we start the process at
x = 0 and stop it at the boundary of [−1, 1]. Suppose that we are given the joint distribution of
the first hitting time and hitting place, as well as the (spatial) derivative of the joint distribution
at the boundary. Then it is natural to inquire as to whether a(x) is determined. This is indeed
the case. An argument using our results and convergence of the discrete approximation (as well
as an independent argument using completeness theorems for products of eigenfunctions) is in
preparation.


While it is easy to see that there can be no general higher-dimensional analog of our results,
we have obtained completely analogous results for finite trees [3]. We expect that these results
will have applications involving communication networks.
In addition to the ‘standard’ physical problems and those related to medical imaging, we
foresee a number of other potential applications of our method. For example, it seems plausible
that there will be potential applications in ecology. Releasing an animal at a central location and
monitoring its arrival times and positions at a set of sites can be used to estimate how likely it
is that the animal wandered in specified regions outside our direct control. Similar experiments
could be conceived to analyze the flow of counterfeit money, spread of disease, and the impact
of advertising on the image of a product.

References
[1] Arridge, S. R. (1999). Optical tomography in medical imaging. Inverse Problems 15, R41–R93.
[2] Bal, G. and Chu, T. (2004). On the reconstruction of diffusions from first-exit time distributions. Inverse Problems
20, 1053–1065.
[3] De la Peña, V., Gzyl, H. and McDonald, P. (2008). Inverse problems for random walks on trees: network
tomography. To appear in Statist. Prob. Lett.
[4] Gardiner, C. W. (2004). Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences, 3rd
edn. Springer, Berlin.
[5] Grünbaum, F. A. (1992). Diffuse tomography: the isotropic case. Inverse Problems 8, 409–419.
[6] Grünbaum, F. A. (2003). Diffuse tomography as a source of challenging nonlinear inverse problems for a general
class of networks. Modern Signal Process. 40, 137–146.
[7] Grünbaum, F. A. and Matusevich, L. F. (2002). Explicit inversion formulas for a model in diffuse tomography.
Adv. Appl. Math. 29, 172–183.
[8] Patch, S. (1995). Recursive recovery of a family of Markov transition probabilities from boundary value data.
J. Math. Phys. 36, 3395–3412.
