Bulletin of The Seismological Society of America, Vol. 82, No. 1, Pp. 259-288, February 1992

Bulletin of the Seismological Society of America, Vol. 82, No. 1, pp.
259-288, February 1992
RAY B E N D I N G REVISITED
BY T. J. MOSER, G. NOLET, AND R. SNIEDER
ABSTRACT
Two improvements on conventional ray bending are presented in this paper.
First, gradient search methods are proposed to find the locations of successive
points on a ray such that the travel time between its endpoints is minimal. Since
only the integration of the travel time along the whole ray is involved, such
methods are inherently more stable than methods that use first and second
derivatives of the ray path to solve the raytracing equations. Second, it is shown
that interpolation between successive points on a ray is generally necessary to
obtain sufficient precision in the quadrature, but also advantageous in terms of
efficiency when Beta-splines are used.
Velocity discontinuities are easy to handle in the proposed bending algorithm.
The target for the travel time precision is 1 part in 10 4, which can be considered
necessary for many applications in seismic tomography. With the proposed
method, this precision was reached for a constant gradient velocity model using
the conjugate gradients algorithm on a five-point Beta-spline in single precision
arithmetic. Several existing methods for parametrization and minimization failed
to produce this target on this simple model even with computing times that are
an order of magnitude larger. Calculations in a spherical Earth yield the required
precision within feasible computation times. Finally, alternative search direc-
tions, which can be obtained from an approximate second order expansion of
the travel time, accelerate the convergence in the first few iterations of the
bending process.
INTRODUCTION
The steadily increasing ability to map the lateral heterogeneity of the Earth's
crust and deeper regions not only makes stricter demands on the precision of
seismic observations such as travel times; inversion procedures for such data,
notably the more recent tomographic methods (Nolet, 1987), are also more
exacting on the precision and efficiency of the algorithms used to solve the
forward problem. This paper takes a new look at the old problem of seismic
raytracing with a special emphasis on precision without loss of efficiency.
To find the ray geometry and travel time for a seismic ray between two fixed
points A and B in the Earth, there are essentially two different computational
approaches (Aki and Richards, 1980, Chapter 13): in the method of shooting,
rays are computed that leave point A in different directions by solving the
raytracing equations until one ray happens to arrive in B. In the bending
method, some initial guess of the ray is perturbed, while A and B are kept
fixed, until it satisfies the ray equations or minimizes the travel time.
One advantage of the method of ray bending over shooting is that the bending
method yields the travel time of the diffracted ray, even if the point B is in the
shadow zone where no "physical" rays arrive from A. Of course, the term
"physical" is somewhat misleading since in the real Earth diffracted seismic
energy does arrive in B and m a y have been observed as an arrival time. In
tomographic interpretations, such observations must be discarded when the
direct problem is solved by shooting rather than bending. Also, when the
259
260 T. J. M O S E R , G. N O L E T , AND R. S N I E D E R
diffracted wave arrives before the physical ray, it is likely that the fastest of the
two is observed. Unless this is taken into account, tomographic studies are
likely to yield models that are biased towards high velocities (Wielandt, 1987).
Bending methods have been introduced in seismology by Wesson (1971),
Chander (1975), and Julian and Gubbins (1977). Two variants exist of the
bending method. In the original method (Julian and Gubbins, 1977), of which
the most elaborate version is by Pereyra et al. (1980), the raytracing equations
£ -v =o (i)
for a ray ~, parametrized by the arc length s, in a velocity field c(~), are
approximated by discretizing the ray path into k + i points (~o, rl,. • •, rk) and
by using finite differences for the derivatives. The resulting system of equations
is then linearized and solved to find the perturbation that must be applied to
some initial guess of the ray in order to reduce the right-hand side of the ray
equations to 0. Since the linearized system is approximate, several iterations
are needed for convergence.
In the second variant, pioneered by U m and Thurber (1987) and Prothero
et al. (1988), one attempts to minimize directly the travel time as a functional of
the ray curve ~.
f ds
- - - ~ Min. (2)
T(~) = c
This second variant of ray bending has one distinct advantage over the finite-
difference approach. The precision of the finite-difference approach is limited by
the precision of numerical differentiation, which is inherently more unstable
than numerical integration. The linearized system in the first variant therefore
contains small errors. Since, in our experience, the system is ill-conditioned in
the presence of a complicate velocity model, these small errors prohibit conver-
gence to the desired accuracy or may even cause a catastrophic divergence.
Such stability problems do not occur when T is minimized directly, as in the
second variant of ray bending.
U m and Thurber (1987) represent the ray by a number of points, connected by
straight line segments. In this paper, such representations are referred to as
"polygonal paths." In the method presented by U m and Thurber, the precision
of the integration increases with the number of segments. For efficiency rea-
sons, the number of segments grows with the number of iterations. Travel time
accuracies of a few parts in 1 0 4 a r e obtained using considerable CPU time (see
Table 3 of this paper). All computations reported in this paper were performed
using one NS32532 processor of an Encore MM510 parallel computer with 2.5
Mflop. The method is not suitable for media with velocity discontinuities, nor
does it work well with predominantly constant velocity zones. These drawbacks
do not exist with a similar method developed by Prothero et al. (1988), at the
expense, however, of computation times that are larger by a factor 2.5 to 5.5.
In this paper, two improvements on the method of U m and Thurber (1987) are
investigated: ray parametrizations in terms of Beta-splines rather than polygo-
hal paths; and a global, rather than a three-point, perturbation scheme to
RAY BENDING REVISITED 261
minimize the travel time T. In contrast with Prothero et al. (1988), who use the
simplex method for minimization (Nelder and Mead, 1965), it is shown that the
smoothness of the travel time allows for the use of the derivatives of T in
minimization techniques, such as the conjugate gradients method (Fletcher and
Reeves, 1965). The Beta-spline curve representation (Barsky, 1988) is chosen
instead of trigonometric polynomials because of its greater flexibility.
Instead of a gradient method for minimizing the travel time, one can also use
second-order minimization methods that use information that is present in the
Hessian matrix. The last section of this paper shows that, using an approximate
expression for the inversion of the Hessian, this leads to a stable ray bending
method.
The presented method has been compared with other bending methods only
qualitatively. A comparison with respect to efficiency is only partly useful,
because of the different convergence behavior of the methods in presence of
strong heterogeneities or discontinuities. The presented method has no prob-
lems with such situations, unlike most other methods (Julian and Gubbins
(1977) in complicated velocity models, due to the ill-conditioning of their lin-
earized system of equations, and U m and Thurber (1987) in constant velocity
zones, due to their definition of the ray perturbation). A quantitative analysis is
given to show the advantages of Beta-splines with regard to polygonal paths,
both in efficiency and accuracy.
This study was motivated by the need to obtain more precise travel times, and
yet retain efficiency in the computations to enable three-dimensional models to
be incorporated in delay-time studies. In seismic tomography, the number of
data (rays) is notoriously.large. A precision of 0.1 to 0.2 sec in measured arrival
times of 12 minutes (P) or 20 minutes (PKP) is required. Thus, raytracing
algorithms should aim at a precision of about 1 part in 10 4. This shall be used
as the target precision in this paper.
The first section introduces the parametrization of a ray path by the coordi-
nares of successive points on it and the travel time and its gradient along it as
multivariate functions. The second section, on minimization techniques using
derivatives, shows how conjugate gradients can be used for ray bending. The
Beta-spline curve representation is introduced in t h e third section. In the fourth
section, it is shown that bending in the presence of discontinuities causes no
problems. The section on higher-order terms gives a search direction, obtained
from a second-order expansion of the travel time, which leads to a faster
convergence in the first bending iterations. The proposed bending method is
applied to a spherical E a r t h in the last section.
With exception of the last section, this paper uses dimensionless quantities
for distance, time, velocity, and slowness. All velocity fields, except in the last
section, are sampled on a 50 x 50 grid and bilinearly interpolated. The range of
the fields is 0 to 100 in both the x and z directions.
THE TRAVEL-TIME FUNCTION AND ITS DERIVATIVES
According to Fermat's principle, the gradient V T of the travel time T,
considered as a functional on the set of continuous curves 3, connecting a
specified source-receiver pair, vanishes for a seismic ray path. With exception of
some degenerate ray perturbations, such as shifting a finite ray segment over a
velocity discontinuity, T(~) is a smooth functional. This suggests the use of
VT(~) for finding the minimal travel time path. The variation of the travel time
along an initial guess path due to small perturbations in its geometry, for which
VT(~) is a measure, gives an indication about how to change the guess path in
order to search for a m i n i m u m of T.
For computational purposes, the paths under consideration can be discretized
into polygonal paths consisting of k + 1 points in a two- or three-dimensional
space, numbered from 0 to k and connected by straight line segments. Their
successive x and z coordinates or x, y, and z coordinates form a set of
n = 2(k + 1) or 3(k + 1) parameters t h a t are stored in a n-vector ~:
-: ( Xo, xl, . . , (3)
In the rest of this paper, two dimensions will be chosen and asymptotical
relations for the efficiency and accuracy will be given in terms of n. The
extension to three dimensions is trivial.
The travel time along such a path, T(7), can be approximated with the
trapezoidal rule
T(7) = i=lE
1(1 1)--ci + v/(x - x -1)2 + z _l) 2 , (4)
/
where c~ is the seismic velocity at (xi, zi). It is a function from ~n to ~, t h a t can

be evaluated with O(n) arithmetic operations. A Taylor expansion in 7 gives
T ( 7 + ~ 7 ) = T(7)+(VT(7),5"y)+O(II67112), (115711~ 0), (5)
where ~7 = (SXo, 6Zo, - • -, 6xk, 6z~) is a small perturbation on 7, (', " ) is the
standard inner product on the n-dimensional Euclidian space and I1"11
= v/( • , •) is the norm. The gradient of T in the n dimensional point 7 is an n
vector as well:
c3T OT OT
OT OT OT)
, , , , ' . . , , •
VT(7)= OXo OZo Oxl Ozl Oxk cgzk (6)
For ray bending purposes, the end points of the polygonal paths must be kept
fixed:
5x o = S z o = S x k = 6 z ~ = 0 . (7)
so t h a t the corresponding components of the gradient do not need to be com-

puted. The other components can be calculated in several ways, the simplest of
which is numerical differentiation of (4). A more reliable differentiation will be
given in the section "Interpolation with Beta-splines." The central difference
formula for numerical differentiation gives
T ( 7 + ~ 7 ) - T(7-57)=2(VT(7),57)+O(II57[13), (I]~711-~0), (8)
where 57 consists of suitable increments 5x i and ~z i (see Stoer and Bulirsch,

1980, Section 5.4.3). One component of VT(7) can be calculated by changing
the appropriate point of the polygonal path with 5x i or 5z i and - S x i or - S z i,
t a k i n g the difference in travel time along both perturbed paths and dividing
t h i s b y 2 5xi or 2 5Z i. S u c h n u m e r i c a l d i f f e r e n t i a t i o n h a s a d i s c r e t i z a t i o n e r r o r of
O(115'/II ~)(ll 5'/II-~ 0), w h i c h is m o r e a c c u r a t e t h a n for i n s t a n c e f o r w a r d differ-
ences. H o w e v e r , II5~ II c a n n o t be c h o s e n too small, b e c a u s e of t h e round-off e r r o r
t h a t comes in w h e n t h e n e a r l y e q u a l n u m b e r s T ( ~ + 5~) a n d T ( ' / - 5~) a r e
s u b t r a c t e d . F o r e a c h c o m p o n e n t , T ( ~ + 5~) - T ( ~ / - 5~) n e e d to be c a l c u l a t e d
only for t h e two s e g m e n t s in w h i c h ~ + 5 " / a n d " / - 5 - / a r e different. T h u s , b o t h
t h e t r a v e l t i m e a n d its g r a d i e n t c a n be c o m p u t e d w i t h O ( n ) o p e r a t i o n s .
F i g u r e 1 i l l u s t r a t e s t h e s m o o t h n e s s of t h e t r a v e l t i m e f u n c t i o n a l in a simpli-
fied e x a m p l e of a t h r e e - p o i n t p o l y g o n a l p a t h u s i n g t h e t r a p e z o i d a l a p p r o x i m a -
tion of t h e t r a v e l t i m e . T h e g r a d i e n t , c o m p u t e d f r o m t h e p e r t u r b e d p a t h s , gives
a correct i n d i c a t i o n of t h e d i r e c t i o n in w h i c h t h e middle p o i n t should be c h a n g e d
in o r d e r to d e c r e a s e t h e t r a v e l t i m e .
MINIMIZATION TECHNIQUES USING DERIVATIVES
Since t h e t r a v e l t i m e T ( ~ ) a n d its g r a d i e n t V T ( 7 ) c a n be c a l c u l a t e d easily,

m i n i m i z a t i o n t e c h n i q u e s u s i n g d e r i v a t i v e s c a n be u s e d for b e n d i n g a n initial
g u e s s p a t h ~o t o w a r d s t h e m i n i m u m t r a v e l t i m e p a t h . T h e c o n j u g a t e g r a d i e n t s
m e t h o d is u s e d here, b e c a u s e of its m o d e s t m e m o r y space r e q u i r e m e n t s , n a m e l y
only four n vectors, a n d b e c a u s e of its s u p e r i o r c o n v e r g e n c e p r o p e r t i e s . O t h e r
m e t h o d s a r e N e w t o n ' s m e t h o d , w h i c h n e e d s O ( n 2) m e m o r y space, a n d t h e
s t e e p e s t d e s c e n t m e t h o d , w h i c h h a s c o n v e r g e n c e p r o b l e m s in t h e p r e s e n c e of a
O
T--
O
t'xl
O
nt
N °
O
CD
O
t'--
O
CO
O
O~
10 20 30 40 50 60 70 80 90
X
FIG. 1. Contours of the travel time along polygonal paths of three points ((0, 0), (x, z), (100,100))
as a function of the position of the second point, computed with the trapezoidal rule. The contours
are for travel times from 100.0 to 170.0 with an increment of 5.0. Two guess paths (1 and 2) are
shown, together with the perturbations used for the numerical gradient calculation with 5x = 5z =
4.0. Path 3 is the minimal trapezoidal travel time path consisting of three points. The velocity field
is given by c(x, z) = 1.0 + 0.01z and, as for all figures, sampled on a 50 × 50 grid and bilinearly
interpolated. The minimal trapezoidal travel time is 98.6549, whereas an analytic solution gives
96,2424.
long narrow valley in the travel-time function. See Stoer and Bulirsch (1980) for
conjugate gradients and Newton's method and Conte and de Boor (1980) for the
shortcomings of the steepest descent method.
In the simple but unrealistic case that the travel time is a quadratic function
in 7 (equation 9), the conjugate gradients method leads to the minimal travel
time path in at most n steps.
1
T(7) = c + (b,7) + (9)
Here c is a constant real number, b a n vector, A a symmetric n × n-matrix.

The gradient of the quadratic T(7) is b + Am and the minimum path satisfies
VT(7)= b+Am=0.
Starting with ~o, the conjugate gradients method generates a series of paths
71, 72, - • • such that
T(~/1) = minT(too + uoVT(7o) ),

u0
T(~2) = min T(~ 1 + uoVT(~o) + u 1VT(71)),

/t o ~ U 1
• • .
T(~j+I) = rain T(Tj + uoVT(7o) + "'" +ujVT(~j)). (10)

ltO~ ltl, . . . ~ Uj
The first path ml can be obtained by a simple line minimization, that is the
minimization of T along the line 7o + uoVT(mo), where 7o and VT(mo) are
fixed vectors and u o is a scalar parameter. The second path 72 must be obtained
by minimizing T in the plane ml + u o V T ( % ) + ulVT(71), otherwise the result
of the first minimization would be spoiled. The j + 1-st path 7j+1 is found by
minimizing T in a j + 1-dimensional subspace of ~ n spanned by the vectors
VT(mo) . . . . . VT(mj), which are mutually orthogonal. The very expensive j + 1-
dimensional minimization is avoided by keeping track of another series of
vectors P o , . . - , Pj such that
Po = - V T ( ~ o ) , (11)
(VT(Tj),VT(7j))
pj = - V T ( T j ) + (VT(mj_I),VT('yj_j)) pj-I"
The minimizations on the right-hand side of (10) can then be replaced by line
minimizations:
T(7j+l) = min T(Tj + ppj). (12)

p
Only four n vectors need to be stored in memory, namely 7j, VT(Tj), VT(mj_ 1),
and pj. For details, see Stoer and Bulirsch (1980) or Press et al. (1986).
In general, the travel-time function is not quadratic, which is immediately
clear when it has, for instance, more than one stationary point. The vectors
VT(7o) . . . . . VT(Tj) are not exactly orthogonal and the conjugate gradients
method m a y take more steps than n. Press et al. (1986) argue that there are
O ( n 2) function evaluations necessary before finding the minimal point of a

quadratic function, because it is given by O(n 2) coefficients. Following this
argument, one can estimate the total n u m b e r of operations to find the m i n i m u m
r ay p ath as O(n3), although this is a very coarse rule of thumb, which makes no
account of the structure of the m a t r i x A nor of the function being not quadratic.
In any case, the conjugate gradients method can not diverge, because each time
a nonzero gradient has been found the line minimization results in a new travel
time t h a t is smaller t h a n the old one and a zero gradient means t h a t a
stationary point has been found. Figure 2 shows the performance of the
conjugate gradients bending of a polygonal pat h in a linear velocity medium,
and Figure 3 shows it in a r a n d o m medium.
In the ideal case of exact arithmetic, the process will converge to a stationary
travel-time path. However, round-off errors in the calculation of the travel time
and its derivatives will cause small deviations from the stationary path. There-
fore, a tolerance m us t be defined as a stopping criterion. One possible stopping
criterion is the decrease in t r a ve l time after an iteration step. If it is smaller
t h a n some predescribed tolerance, the process can be stopped. This tolerance
can be chosen as small as the computational precision of T. Another possibil-
ity is the change in position of the m i n i m u m point after an iteration. This toler-
ance should not be chosen smaller t h a n the square root of the computational
precision of T (Press et al., 1986).
A completely different error source is the approximation of continuous curves
with polygonal paths and the travel time along t h e m with the trapezoidal rule.
In exact arithmetic, the approximation of the travel time along a continuous
O
¢q
O
03
N~
o
ex)
o
t'--
O
o0
17
o
O~
10 20 30 40 50 60 70 80 90
X
FIG. 2. Successiveline minimizations of the conjugate gradients method from an initial straight
guess path consisting of 16 equidistant points ((0,0),..., (100, 100)). The velocity field is given by
c(x, z) = 1.0 + 0.01z. After 3.8 sec, 17 iterations, 235 travel-time function evaluations, and using a
tolerance equal to the machine precision, the travel time is 96.2970 (exact value: 96.2424).
266 T. J. M O S E R , G. N O L E T , A N D R. S N I E D E R
1.o ~IIIHIIIL/LIIIIqlIlIIIlIIJNIIII]IIIIIIILlIHILLIIIIq]II]IIIILIlI[ILII
2.o
,,t-
0
O4
i[iml-~l
0
03
,,~ ~llJtlll
N o h~[TII
0
£0
(30
0
O3
10 20 30 40 50 60 70 80 90
X
FIG. 3. Bending with conjugate gradients from a n initial guess path of 16 points in a complicated
medium, with velocities smoothly v a r y i n g from 1.0 to 2.0. After 1.9 sec, nine iterations, 116
travel-time function evaluations, and using a tolerance equal to the machine precision, the travel
time is 85.7733 (no analytic solution known).
curve with the trapezoidal rule is correct up to O ( 1 / k 2 ) ( k --~ ¢¢). When k is to

small, the polygonal path with a minimal trapezoidal travel time can be
completely wrong, because it m a y have missed rapid variations in the velocity
field. One way out is to refine it by doubling the number of nodes. The
integration is t h e n more accurate, but the cost of this improvement is to
increase the number of arithmetic operations by a factor of roughly 8, because of
the efficiency bound of O(n3). An alternative for the computation of T will be
given in the next section.
Several strategies are possible for choosing an initial guess, stopping criteria
and refining. In this paper, the convergence of the relative change in travel
time to the machine precision of 1.0 × 10 -s is used as a stopping criterion,
unless otherwise stated. Refining is omitted, because the interpolation with
Beta-splines (next section) is a more general way to enhance the accuracy. An
example of an initial guess path with two loops (Fig. 4) shows t h a t unrealistic
guesses are no problem using conjugate gradients. Including higher-order terms
in the Taylor expansion of the travel time (5) causes the initial guess to
converge rapidly towards the neighborhood of a m i n i m u m path (see section on
"Higher-order terms"). It is by no means clear whether the resulting path is a
global or only a local minimum. All t h a t can be said is t h a t an initial guess
path will converge to the minimal travel time path t h a t is closest to it in some
sense. Also, in the case of multipathing, it is not a priori clear which of the
paths will be selected. Advanced techniques from graph theory (Moser, 1991),
O
.r-
O
¢Xl
O
,q-
N~
o
co
o
oo
o
o~
10 20 30 40 50 60 70 80 90
X
FIG. 4. Bending with conjugate gradients from an unrealistic initial polygonal guess path with
two loops, consisting of 12 points. The velocity field is given by c(x, z) = 1.0 + 0.01z. After 4.3 sec,
27 iterations, 328 travel-time function evaluations, and using a tolerance of 0.0001, the travel time
is 96.6638 (exact value: 96.2424). Only every third iteration is shown.
however, provide good initial guesses and guarantee that they are close to the
global minimum.
INTERPOLATION WITH B E T A - S P L I N E S
There are two important reasons for interpolation between the points of the
polygonal paths.
First, consider the behavior of trapezoidal travel time minimization with
polygonal paths in concave slowness regions. The slowness distribution, a(x, z)
= 1/c(x, z), is concave in a region D C ~2 when for all (xl, zl), (x2, x2) e D
a n d 0 _< X___ 1
a(X) _> ),a 1 + (1 - h)a2, (13)
where a(k) = f f ( ~ X 1 q" (1 -- )k)X2, ~Z 1 -[- ( 1 - k ) z 2 ) , a 1 = q ( X l , Z l ) a n d a 2 =

a ( x 2, z2). In such a region, the linear interpolation between the slownesses at
two points (xl, Zl) and (x2, z2), the right-hand side of (13), is smaller than or
equal to the slowness itself, the left hand side of (13). The trapezoidal rule,
which is an integral over the linearly interpolated slowness, is therefore an
underestimation of the exact travel time, which is an integral over the real
slowness. This means that a minimization algorithm will always repel the
points of the polygonal paths out of the concave slowness region. The result is
one segment of the polygonal path "bridging" the region of concavity and a
completely incorrect travel time.
268 T. J. MOSER, G. NOLET, AND R. SNIEDER
Concave slowness regions a p p e a r in all realistic models; a c o m m o n e x a m p l e is

the low-velocity zone. Consider for i n s t a n c e t h e velocity d i s t r i b u t i o n
( z - 50) 2
c ( x , z) - 2500 + 1, (14)
whose inverse, the slowness, is concave in a zone a r o u n d z = 50, w h e r e c has its

m i n i m u m , n a m e l y (1 - 1 / 8 x / 3 ) 5 0 <_ z ___ (1 + 1 / 3 ~ f 3 ) 5 0 . T h e s u b s e q u e n t itera-
tions of t h e conjugate g r a d i e n t s b e n d i n g of a n initial polygonal p a t h w i t h points
((10, 10), (35, 35), (65, 65), (90, 90)) are s h o w n in F i g u r e 5. It can be seen t h a t t h e
second a n d t h i r d point, i n i t i a l l y at (35, 35) a n d (65, 65), are r e m o v e d from the
concave r e g i o n a n d e v e n t u a l l y coincide w i t h t h e source point (10, 10) a n d the
r e c e i v e r point (90, 90). T h e t r a p e z o i d a l t r a v e l t i m e is t h e n e q u a l to t h e t r a v e l
t i m e along a s t r a i g h t r a y p a t h in a h o m o g e n e o u s model w i t h a velocity e q u a l to
c(10, 10) = 1.64.
One possible solution to this p r o b l e m is to use e q u i d i s t r i b u t i n g paths, for
which t h e spacing b e t w e e n the points is a d a p t e d to t h e v a r i a t i o n s in t h e velocity
field ( P e r e y r a e t a l . , 1980). This r e q u i r e s e x t r a c o m p u t a t i o n time, however,
w h e r e a s i n t e r p o l a t i o n r e s u l t s in a considerable s a v i n g of time, as will be shown
below.
1 .o IHIIIIILIIIIqlllllllllllllllllllllLlllllLLlllqllllllllllllllLLhll~l~
2. o
O
v-
O
OJ
O
CO
O
£O
O
CO
O
Ob
10 20 30 40 50 60 70 80 90
X
FIG. 5. Conjugate gradients bending of a straight polygonal path of four points

((10, 10), (35, 35), (65, 65), (90, 90)) in a velocity field given by c ( x , z) = ( z - 50)2/2500 + 1, whose
inverse is a concave function around z = 50. The successive positions of the second point are given
with numbers 0,...,4. After 0.5 sec, nine iterations, 148 travel-time function evaluations, and
using a tolerance equal to the machine precision, the travel time is 69.0294.
RAY B E N D I N G R E V I S I T E D 269
A second reason to interpolate polygonal paths is the observation that seismic

ray paths are continuous curves. Two points on a ray with a small difference in
travel time will have a small difference in spatial position. The successive
points of a polygonal path are therefore not completely free to move with respect
to each other and constraints should be formulated to the minimization of T(~),
in order to reduce the part of the space ~ ~, wherein the minimum is sought. It
is however a delicate question, how to formulate such continuity constraints, so
that the omission of parts of ~ compensates the overhead of the extra computa-
n
tion. A good alternative is then to use fewer points and interpolate between
them. This reflects the fact that not all the points that are involved in the
numerical integration for computing the travel times need to be perturbed
independently from each other. In other words, the number of points to be
perturbed can be smaller in magnitude than the number of points to be
integrated over.
Beta-Splines
The Beta-spline curve representation (Barsky, 1988; see also Appendix A) is a
modification of the cubic B-spline curve (Newman and Sproull, 1981) with
several properties that are in favor of the ray bending procedure, described in
the previous section.
Its parametric representation allows the ray path or its approximations to be
multivalued in each coordinate direction so that there is no principal restriction
to the curve: it can make loops or oscillations if necessary. Its convex hull
property on the other hand ensures complete control on the behavior of the
curve, because each segment of the Beta-spline curve lies in the convex hull of
its four support points. Other curve interpolations or representations that do
not obey this property, like the minimal curvature spline (Reinsch, 1967) or the
trigonometric polynomial, can show undesired wanderings or oscillations (Gibbs'
phenomenon, see Courant and Hilbert, 1967). The Beta-spline curve does not
pass through its support points generally, but it can be forced to do so by
making three successive support points coincident, for instance the end points of
the ray or points on interfaces. Another property, the local control property,
makes the recomputation of one Beta-spline segment possible after the pertur-
bation of one support point without recomputing the rest of the curve. This
property is of great advantage in the calculation of the travel time and its
gradient, because they can be evaluated with the same efficiency as in the case
of polygonal paths. The behavior of the Beta-spline curve is further controlled
by two shape parameters /~1 and ~2 which describe the continuity of the first
and second derivatives at the joints of the segments. /~1 = 1 and /~2 = 0 is the
most natural case, when both derivatives are continuous. The Beta-splines are
equal to cubic B-splines in this setting. For/~1 =/~2 = 0, the Beta-spline curve
consists of straight line segments between the support points; therefore, both
cubic B-splines and polygonal paths are special cases of the Beta-spline repre-
sentation of the seismic ray.
Like the points specifying polygonal paths (3), the k + 1 support points V~ of
the Beta-spline curve are collected in an n-vector % where n is again 2(k + 1)
in two dimensions:
• : ..... (15)
The travel time along Beta-spline can be calculated as accurately as desired,

again with the trapezoidal rule:
k m 1( 1 1
) [[ -
i=l j=l Cij Ci,j- 1
where Qi(u), (i = 1, ..:~, k, 0 < u < 1) is a point on the ith curve segment, m is
the number of points Qi(uj) to be integrated over on that segment, and II " II is
the Euclidian norm. cij is the seismic velocity at Qi(uj). An appropriate choice
for uj is uj = j / m.
It is also possible to invoke higher-order schemes for integration, like Simp-
son's rule. The accuracy of the integration (16), considered as the inverse of the
error, is O(m2n 2) and can be increased as much as desirable by increasing m,
without changing the number of support points. The price for the more accurate
integration is O(mk) or O(mn) operations for one travel-time calculation.
The gradient of the travel-time function can be calculated numerically as
before by moving each successive support point V~ in each coordinate direction
over a fixed increment and recomputing the travel time only along the four
changed segments. The convergence of the bending is however strongly depend-
ent on the definition of the increment. An alternative is the formal differentia-
tion of the travel time along a Beta-spline curve with respect to one support
point (see Appendix B). Such a differentiation is more stable, because it can be
expressed as an integral along the four segments over quantities that are
known analytically. The only remaining differentiation is the calculation of the
gradient of the velocity field, which is singular only at discontinuities. In the
next section, it will be shown that this singularity causes no problems.
Table 1 shows the travel times in the velocity field c(x, z) = 1.0 + 0.01z
along a Beta-spline with minimal travel time as found with the conjugate
gradients bending, where the gradient is computed with numerical differentia-
tion with a fixed increment ~x = 5z along the ray and with the formal differen-
tiation, described above and in Appendix B. In all cases, the initial guess path
consists of 11 equidistant points from (0, 0) to (100, 100) (k = 10) and there are
10 integration points on each segment (m = 10). The computation time per
iteration is roughly the same for all results, namely about 1.2 sec. This example
shows that the numerical differentiation is stable for a range of increments
TABLE 1
ACCURACY OF BENDING WITH DIFFERENT
METHODS FOR GRADIENT CALCULATION
Increment Iterations R~sult~ Error (%)
10.0 8 96.2483 0.0062

1.0 9 96.2444 0.0021
0.1 8 96.2444 0.0021
0.01 7 96.2444 0.0021
0.001 10 96.2449 0.0026
0.0001 14 96.2597 0.0180
0.00001 61 96.7315 0.5100
Formal Diff. 10 96.2443 0.0020
Exact Value 96.2424
from 5x = 6z = 0.001 to 10.0 and t h a t the formal differentiation amounts to the

same as numerical differentiation with an optimal choice for the increment.
In any case, the number of arithmetic operations for finding one gradient
vector is O(mn), t h a n k s to the local control property. Following the rule of
thumb t h a t the minimization of a function requires O(n 2) evaluations of it, one
can estimate t h a t the total number of operations to find the m i n i m u m ray path
consisting of k + 1 support points and each segment interpolated with a Beta°
spline curve evaluated at m points is O(rnk 3) or O(mn3). This can be much less
t h a n the minimization with polygonal paths, because the number of parameters
n can be chosen much smaller. In practice, the minimization of the trapezoidal
travel time along polygonal paths can be done much more efficient with the
same accuracy or much more accurate with the same computational effort. One
can divide n by a factor a and multiply m with a, so t h a t the accuracy of the
trapezoidal travel time (16) is the same, as m × n is the same, but the
minimization is a2 times more efficient in the latter case. It must be kept in
mind however t h a t the number of support points n must be large enough for the
Beta-spline curve to accommodate the variations in the velocity field. When n is
too small, the continuity constraints are too strict; the travel time accuracy
along candidate paths may be acceptable for a large m, but the candidate paths
do not have enough degrees of freedom to approach the m i n i m u m travel time
path.
The accuracy and efficiency of minimization with polygonal paths and
Beta-splines are summarized in Table 2.
The advantages of Beta-splines can be illustrated with the example of Figure
2. An initial guess path consisting of k + 1 points, equally distributed from
(0, 0) to (100, 100), is bent towards the m i n i m u m path in the linear velocity field
c(x, z) = 1.0 + 0.01z, each time using the machine precision as a stopping
criterion. The results are given in Table 3. The first calculation of the travel-
time from (0, 0) to (100, 100), with k = 64 and m = 1, has been compared with
TABLE2
EFFICIENCY AND ACCURACY OF BENDING
Computation Time Accuracy
Polygonal p a t h s O( n 3) O( n 2)
Beta-splines O(mn 3) O ( m 2 n 2)
TABLE 3
PRAC~CAL COMPARISON
k m Iterations Function evM. Comp. time Result
1. Polygonal p a t h (/31 =/32 = 0) 64 1 67 795 58.0 96.2463

2. Idem, U m a n d T h u r b e r - m e t h o d 64 1 200 75.1 96.8554
3. Beta-spline (~1 = 1, f12 = 0) 8 8 6 82 5.3 96.2464
4. Beta-spline (~1 = 1, f12 = 0) 32 8 17 234 59.1 96.2429
5. Beta-spline (/31 = 1, ~2 = 0) 5 10 4 57 2.9 96.2514
Exact value 96.2424
Velocity field: c(x,z)= 1.0 + 0.01z; initial guess: k + I e q u i d i s t a n t points from (0,0) to
(100, 100).
t h e b e n d i n g m e t h o d of U m a n d T h u r b e r (1987). I n b o t h cases, no r e f i n i n g as
described b y U m a n d T h u r b e r h a s b e e n applied. T h e second row of T a b l e 3
shows t h a t t h e m e t h o d of U m a n d T h u r b e r n e e d s e x c e s s i v e l y m a n y i t e r a t i o n s to
r e a c h only a poor r e s u l t . T h e t h i r d c a l c u l a t i o n , w i t h k = 8 a n d m = 8, shows a n
e n o r m o u s g a i n in efficiency c o m p a r e d to t h e first calculation, w i t h p o l y g o n a l
p a t h s of 65 points, w h i l e p r e s e r v i n g t h e precision. T h e f o u r t h calculation, w i t h
k = 32 a n d m = 8, is t h e o t h e r w a y around: w i t h t h e s a m e c o m p u t a t i o n t i m e as
in t h e first c a l c u l a t i o n , one o b t a i n s a m u c h h i g h e r a c c u r a c y . A f t e r s o m e
e x p e r i e n c e , it a p p e a r s t h a t , in t h i s e x a m p l e , k = 4 a n d m = 10 is t h e m o s t
efficient choice to get a n a c c e p t a b l e e r r o r of I p a r t in 104.
F i g u r e 6 s h o w s t h e c o n j u g a t e g r a d i e n t s b e n d i n g of B e t a - s p l i n e s in t h e concave
slowness field of F i g u r e 5. T h e r e is no " b r i d g i n g " effect like w i t h t h e b e n d i n g of
p o l y g o n a l p a t h s , t h a n k s to t h e i n t e r p o l a t i o n b e t w e e n t h e s u p p o r t points, al-
t h o u g h a m o r e a c c u r a t e i n t e g r a t i o n still r e s u l t s in a l a r g e r t r a v e l t i m e . Also,
t h e m i n i m a l t r a v e l t i m e p a t h is m u c h s m o o t h e r w i t h t h e s a m e n u m b e r of
p a r a m e t e r s , t h a n k s to t h e B e t a - s p l i n e r e p r e s e n t a t i o n . F i g u r e 7 shows t h e
solution of t h e b e n d i n g of a B e t a - s p l i n e in t h e velocity field of F i g u r e 3.
DISCONTINUITIES
T h e o c c u r r e n c e of d i s c o n t i n u i t i e s in t h e velocity field is a p r o b l e m w i t h
classical b e n d i n g m e t h o d s ( T h u r b e r a n d E l l s w o r t h , 1980; U m a n d T h u r b e r ,
1,o 2.o
O
T-
O
¢q
O
03
No
O
£O
O
CO
O
Ob
10 20 30 40 50 60 70 80 90
X
FIG. 6. Conjugate gradients bending of Beta-splines in the same situation as in Figure 5 (k = 3,

m = 10, refer to the text). The successive positions of the second point are given with numbers
0, 1, 2. After 1.9 sec, four iterations, 60 travel-time function evaluations, and using a tolerance
equal to the machine precision, the travel time is 94.1967 (no analytic solution, but k = 8 and
m = 10, which can be considered as more accurate, gives 94.2376).
1.o ~IllilIIIIIIIIII~IIIIIFmJlIII]IIIIIIHI@2. 0
o
('NI
o
(D
0
I:D
10 20 30 40 50 60 70 80 90
X
FIG. 7. Conjugate gradients bending of Beta-splines in the velocity field of Figure 3 (k = 5,

m = 10). After 5.9 sec, eight iterations, 112 travel-time function evaluations, and using a tolerance
equal to the machine precision, the travel time is 89.0627 (no analytic solution known, but k = 25
and m = 50, which can be considered as much more accurate, gives 88.9772).
1987; Prothero e t a l . , 1988). Even without modifications, the conjugate gradi-

ents bending of Beta-splines has no major problems with discontinuities, as is
shown in Figure 8. The minimal travel time is found with an error smaller t h a n
0.1%. The corresponding ray path does not refract however at the discontinuity,
although three coincident support points could realize this. It is probably due to
the computational error in the numerical integration or differentiation t h a t
such a configuration of support points does not occur.
Discontinuities can be handled more correctly by posing constraints on the
paths for which the travel time is to be minimized. The candidate paths are
Beta-spline curves t h a t behave as differentiable curves in smooth velocity
regions, like in the previous section, but with three-fold support points t h a t are
constrained to lie on the interface. In this way, one can calculate different
phases on the seismogram, like reflections, multiples or reverberations.
The constraints can be easily incorporated by inserting an extra transforma-
tion between the support points ~/and the travel time T(7) along the path t h a t
they define. Suppose t h a t there is one interface in the model, given by a relation
z = I ( x ) , or x = I I ( u ) and z = I ~ ( u ) with u a variable parameter. ~, is then
transformed as follows:
~ ' = ( X 'o , z 'o , x 'l , z ~ . . . . . x '~ , z 'k ) , (17)

where x~ = x i and z~ = z/ when i is a free support point and
x ; = x i, z 'i = I ( x i ) (18)
o
,r--
c=1.0
o
OJ
0
03
0
',<1"
No
,0
(..0
o
cO
o~°~ c=2.0
10 20 30 40 50 60 70 80 90
X
FI~. 8. Conjugate gradients bending of Beta-splines (k = 10, m = 5) in a discontinuous velocity

field (c = 1.0 for z __ 50.0, c = 2.0 for z > 50.0). After 6.1 sec, eight iterations, 130 travel-time
function evaluations, and using a tolerance equal to the machine precision, the travel time is
100.897 (exact value obtained from bisection: 100.941).
or
when i is to lie on the interface, f ( x i ) is a function to transform the x

coordinate of the support point into a suitable parameter u that, in turn,
produces coordinates of a point on the interface, z i has become a dummy
variable, t h a t plays no role in the minimization process.
Figure 9 shows the constrained conjugate gradients bending of a Beta-spline
in a three layered model.
HIGHER-ORDER TERMS
Although the conjugate gradients bending has no problems with initial guess
paths t h a t are far away from a minimal travel time path (see Fig. 4), informa-
tion from a higher-order expansion of the travel time functional may provide a
faster convergence. Consider for instance the calculations of Figure 2, where a
relatively large number of iterations was necessary to find the ray path in a
rather simple velocity field. It was argued earlier (see Table 3) t h a t such a slow
convergence is cause by a ray path with too m a n y points and the neglect of the
velocity variations between the points. Yet, it can be expected t h a t taking into
account the second derivatives of the travel-time function enhances the conver-
gence, even when Beta-splines are used. This follows from the observation t h a t
the m i n i m u m of a quadratic function (9) can be found at once by inversion of the
Hessian, the matrix of second derivatives of the travel time. A non-quadratic
function can be minimized in an iterative procedure by calculating or estimat-
ing the inverse Hessian and using it for finding a better approximation of the
o c=1.0
o,,I
o
03
c=2.0
CD
b,-
C) c=3.0
10 20 30 40 50 60 70 80 90
X
FIG. 9. Constrained conjugate gradients bending of Beta-splines (/~1 = ~2 = 0, k a=5'ndm = 1) in a

discontinuous velocity field (c = 1.0 for z ~ 40.0, c = 2.0 for 40.0 < z < 70.0 c = 3.0 for
70.0 < z). After 0.08 sec, six iterations, 181 travel time function evaluations, and using a tolerance
equal to the machine precision, the travel time is 109.326 (equal to exact value obtained from
bisection).
r a y path. Such p r o c e d u r e s are described e x t e n s i v e l y in S t e e r a n d B u l i r s c h

(1980).
In t h e case t h a t t h e H e s s i a n is calculated, t h e m i n i m i z a t i o n of a function of n
v a r i a b l e s i n d e e d r e q u i r e s O(n 3) a r i t h m e t i c o p e r a t i o n s for each iteration. F o r a n
a l m o s t q u a d r a t i c function, t h e r e are r e l a t i v e l y few i t e r a t i o n s needed, so the
t o t a l n u m b e r of o p e r a t i o n s is a g a i n O(n3). In t h e case t h a t t h e inverse H e s s i a n
is e s t i m a t e d a n d u p d a t e d e a c h i t e r a t i o n , n i t e r a t i o n s are needed, each one
r e q u i r i n g O(n 2) o p e r a t i o n s (see P r e s s et al., 1986). Therefore, no profit is
n e c e s s a r i l y to be expected w h e n u s i n g second-order t e r m s i n s t e a d of only first
order t e r m s in t h e t r a v e l t i m e expansion.
A n o t h e r p r o b l e m is t h e n u m e r i c a l d i f f e r e n t i a t i o n for c a l c u l a t i n g second
derivatives, which can be v e r y u n s t a b l e , c a u s i n g e r r o r s in the c o m p u t e d Hes-
sian. T h e s e e r r o r s can be c a t a s t r o p h i c w h e n t h e Hessian, which is u s u a l l y
ill-conditioned, is i n v e r t e d . In such situations, which f r e q u e n t l y occur in compli-
cated velocity models, the c o n v e r g e n c e is d e s t r o y e d completely.
It is possible, however, to simplify the expressions for t h e second d e r i v a t i v e s
u s i n g some a s s u m p t i o n s w i t h r e g a r d to t h e d e r i v a t i v e s of the slowness field
along the r a y a n d t h e r a y itself. A p p e n d i x C give s a d e r i v a t i o n of a n expression,
e q u a t i o n (C25), t h a t can be used as a s e a r c h direction in the conjugate g r a d i e n t s
bending. It is a n u m e r i c a l l y efficient a n d stable a p p r o x i m a t i o n of the direction
t h a t would be o b t a i n e d f r o m t h e i n v e r s i o n of t h e Hessian. It gives a m u c h b e t t e r
direction for t h e line m i n i m i z a t i o n , at least at t h e first i t e r a t i o n s of the
bending. In t h e v i c i n i t y of t h e r a y p a t h , it is difficult to c o m p u t e n u m e r i c a l l y , so
t h a t the first-order s e a r c h is to be p r e f e r r e d in f u r t h e r iterations.
Table 4 is an extension of the calculations of Figure 2 and Table 3. Starting

with a Beta-spline consisting of six equidistant points from (0, 0) to (100,100)
and 10 interpolated points on each segment (k = 5, m = 10), in the velocity
field c ( x , z ) = 1.0 + 0.01 z, the conjugate gradients bending gives a travel time
of 96.2514 in five iterations, when the gradient is calculated with the formal
differentiation derived in Appendix B (first column). The second-order search
direction introduced above gives almost the same value, 96.2510, in the same
number of iterations, but with a much faster convergence in the first iterations.
The successive iterations of the first and second order search bending are shown
in Figure 10.
The problem of the efficiency of the second-order search is still a subject of
research.
B E N D I N G IN A S P H E R I C A L E A R T H
In this section, the conjugate gradients bending is applied to a spherical

Earth with discontinuities. The computations were done for a two-dimensional
slice through the Earth; three-dimensional computations would involve a gradi-
ent of larger dimension b u t are not expected to give significantly different
results. The primary aim was to see whether the spherical geometry, with
discontinuities, possess any special problems and to investigate the trade off
between precision and computing time.
TABLE 4
COMPARISON OF FIRST- AND SECOND-
ORDER SEARCH
Iteration 1st Order Search 2nd Order Search
0 98.0287 98.0287
1 96.5979 96.2614
2 96.2576 96.2523
3 96.2514 96.2510
4 96.2514 96.2510
5 96.2514 96.2510
Exact value 96.2424 96.2424
' i , ~ ' i , ' i ' i , i ,
04
No No
It')
25 50 75 25 50 75
X x
FIG. 10. B e n d i n g of B e t a - s p l i n e c u r v e s (/~ = 1, ~2 = 0, k = 5, m = 10) w i t h first-order s e a r c h
direction (left) a n d second-order s e a r c h d i r e c t i o n (right). T h e velocity field is t h e s a m e as i n F i g u r e
2, c(x, z) = 1.0 + 0.01z. T h e t r a v e l t i m e s of t h e s u c c e s s i v e i t e r a t i o n s a r e t a b u l a t e d i n T a b l e 4.
The curves t h a t are used are the cubic B-spline variant of Beta-splines
(/~1 = 1, ~u = 0). Their support points or nodes are specified by spherical coordi-
nates r and 0. To investigate the precision, rays are traced in a spherically
symmetric E a r t h (the PEMC model of Dziewonski et al., 1975), for which the
travel times are computed with a precision of 0.01 sec. The initial guess rays
are obtained by randomly perturbing the exact ray, both in the radial and in the
latitudinal (0) direction. The perturbations Ar and /x0 correlate from ray node
to ray node according to the following formula:
Ar i = 0.TAr/_ 1 + 0.3 P E R T × ( R A N D - 0.5), (1)
where P E R T determines the rms amplitude of the perturbation (which is about

0.16 × P E R T ) and R A N D is the pseudo random number with uniform distribu-
tion on [0, 1] as computed by a F o r t r a n library function. An identical algorithm
determines the perturbation A0 i. Figure 11 shows the first 100 values of/Xri for
P E R T = 100 km. The nodes obtained by adding the perturbations to the
coordinates of the initial ray are used as the support points for the Beta-spline
interpolation in the first iteration. Some experience showed t h a t a step size of
h = 20 km is the optimal choice for the numerical computation of the gradient
with the central difference formula:
c~T T(...,r i + h,... ) - T(...,r i- h,... )

Or i 2h '
1 OT T('",O i+h/ri,''')- T('",Oi-h/ri,'")
r aO~ 2h
Discontinuities were handled by including a threefold node on the discontinuity,

ensuring t h a t the interpolated curve actually passes through the point. Such
interface points are only perturbed in the 0 direction. Since a small shift along
a discontinuity with a velocity contrast gives a much larger change in T t h a n a
similar shift of a node within a continuous velocity field, a reduced h (of 0.2 km)
was used for the computation of ( 1 / r ) ( a T / O 0 ) on interface points.
Computations were performed for six values of the ray parameter p:
random ray perturbations used in test
O -
~r
OJ
t~?'
O
O
10.0 20.0 30.0 40.0 50.0 60.0 70.0 80.0 90.0

ray node index
FIG. 11. Perturbations of the first 100 nodes of the exact ray, used to produce an initial guess for
ray bending.
5, 7 . . . . ,10 sec/degree. With these slownesses, rays arrive between 21 ° and 86 °

epicentral distance, with t u r n i n g point depths ranging from 650 to more t h a n
2500 km, and travel times ranging between 394 and 763 sec. The main results
are summarized in Figure 12.
The decrease in travel time in the results of two subsequent line minimiza-
tions was used as a stopping criterion. Whenever this decrease was less t h a n
0.01 sec, the iterations were stopped. The resulting travel time was in all cases
within 0.17 sec of the exact T, with the average over all six slownesses close to
0.1 sec. The error grows slightly as the initial ray is perturbed further from the
correct ray (Fig. 12, right).
The computation time is of the order of 10 sec per ray if interface nodes in the
crust and upper mantle are perturbed from the initial ray with equal rms
perturbations (Fig. 12, left, black circles). It is more realistic to assume t h a t the
r a y near the endpoints is quite close to the initial ray. This situation is
approximated by leaving the interface points unperturbed, in which case the
computation time decreases by 2 to 5 sec (Fig. 12, left, open circles), while the
error is the same (Fig. 12, right). The computation time grows roughly linearly
with the rms amplitude of the deviation of the initial ray.
In these computations, a node spacing on the rays was used of about 100 km.
The computation of the gradient at interface points becomes numerically unsta-
ble if the distance to the next node is much closer t h a n the average spacing, but
a simple removal of such nodes was sufficient to improve the accuracy.
Additional characteristics of the algorithm were tested using different node
spacings and a constant P E R T of 100 km. Figure 13 shows t h a t the node
spacing is an important variable in determining the final error. The few errors
larger t h a n or equal to t h a n 0.1 sec all belong to long rays, with a small ray
parameter, and few (< 100) nodes. In view of the r a t h e r conservative stopping
criterion (0.01 sec), it can be inferred t h a t it is the spline interpolation t h a t is
not accurate enough to shape the ray if there are not enough nodes. This
inference was confirmed by letting one of these computations go on until the
decrease in T was below the machine precision, which did not improve the
error. In laterally heterogeneous E a r t h models, the node spacing should match
the wavelength of the heterogeneities to obtain an acceptable precision in T.
Figure 14 shows how the computation time increases as a function of the
number of nodes. The behavior is nonlinear, especially when only few nodes are
used and the limited precision is reached within a few iterations only.
. . . . i . . . . i
0 kO
~ 0
C3"r-
1.0
, , , , I , , , , L , , , i , I , , , ~ I
25.0 50.0 25.0 50.0

rms deviation (km) rms deviation (kin)
FIG. 12. Computation time (left) and travel-time error (right) in the final solutions of the ray
bending for different deviations of the initial guess from the exact ray. Open circles refer to the case
of u n p e r t u r b e d points at interfaces, black circles represent the calculations with perturbations on
all points (see text).
1-(3
c5
CD
L_
LO
0 0 C 0
o oo o o
o o
i i I i in, , I , , , I i , , I , , t
50 1 O0 150 200
nodes
FIG. 13. Travel-time error of the final solutions of the ray bending as a function of the number of
nodes along the path.
0
CM
dJ
c~
0 OD 0
0
o
0
,t-
o
o~
50 1 O0 150 200
nodes
FIG. 14. The computation time of the ray bending as a function of the number of nodes along the
path.
Plotting the computed rays and the exact rays shows that the largest errors
are near the turning point. The maximum ray location error for a bad initial
ray ( P E R T = 300 km, p = 8 sec/degree) was about 20 km.
CONCLUSIONS
The two most important points presented in this paper are the use of the
conjugate gradients method for ray bending and the representation of the ray
paths by Beta-splines. Minimizing the travel time along a curve is to be
preferred to fitting the ray equation, because it is more stable and can not
diverge. The use of information about the derivative of the travel time along a
curve with respect to perturbations of it is suggested by Fermat's principle.

Then, the conjugate gradients method for minimization using derivative infor-
mation is suitable because it requires the storage of only four vectors of the size
of the storage of the ray itself and it employs the structure of the travel-time
function. For the first few iterations it can be advantageous to use a pseudo
second-order search direction, obtained from an approximation of the inversion
of the Hessian of the travel time. The points of the ray path should be
interpolated during the minimization process, first because it avoids inaccura-
cies in concave slowness regions and secondly because interpolation results in
both a considerably higher accuracy and efficiency. Beta-splines are particu-
larly suitable as an interpolation scheme, because they have several properties,
like the local control and convex hull properties, that are in favor of the ray
bending and because they provide extra control on the curves by means of the
shape parameters. Ray bending in the presence of discontinuities in the velocity
field can be done by transforming the location parameters of a point of a ray,
such that it is forced to lie on the discon£inuity. There are no restrictions to the
initial guess path that can be chosen because divergence is impossible, but it is
not clear whether the resulting minimal travel time path is a global minimal
path or only a local minimal path.
ACKNOWLEDGMENTS
This research is financially supported by the Netherlands Technology Foundation (STW). We
would like to thank S. van der Lee and F. Neele for critical readings of the manuscript.
REFERENCES
Aki, K. and P. Richards (1990). Quantitative Seismology: Theory and Methods, W. H. Freeman, San
Francisco.
Barsky, B. A. (1988). Computer Graphics and Geometric Modeling Using Beta-Splines, Springer;
New York.
Chander, R. (1975). On tracing seismic rays with specified end points, J. Geophys. 41, 173-177.
Conte, S. D. and C. de Boor (1980). Elementary Numerical Analysis, McGraw-Hilh New York.
Courant, R. and D. Hilbert (1967). Methoden der Mathematischen Physik, Dritte Auflage, Springer.
Dziewonski, A. M., A. L. Hales, and E. R. Lapwood (1975). Parametrically simple earth models
consistent with geophysical data, Physics Earth Planet. Interiors 10, 12-48.
Fletcher, R. and C. M. Reeves (1965). Function minimization by conjugate gradients, The Computer
Journal 7, 149-154.
Julian, B. R. and D. Gubbins (1977). Three-dimensional seismic ray tracing, J. Geophys. 48,
95-114.
Moser, T. J. (1991). Shortest path calculation of seismic rays, Geophysics 56, 59-67.
Nelder, J. A. and R. Mead (1965). A simplex method for function minimization, The Computer
Journal 7, 308-313.
Newman, W. M. and R. F. Sproull (1981). Principles of Interactive Computer Graphics, McGraw-Hill,
New York.
Nolet, G. (1987). Seismic wave propagation and seismic tomography, in Seismic Tomography with
Applications in Global Seismology and Exploration Geophysics, G. Nolet, (Editor), Reidel,
Dordrecht, 1-23.
Pereyra, V., W. H. K. Lee, and H. B. Keller (1980). Solving two-point seismic ray tracing problems
in a heterogeneous medium. Part 1. A general adaptive finite difference method, Bull. Seism.
Soc. Am. 70, 79-99.
Press, H. P., B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling (1986). Numerical Recipes, The
Art of Scientific Computing, Cambridge University Press.
Prothero, W. A., W. J. Taylor, and J. A. Eickemeyer (1988). A fast, two-point, three-dimensional
ray tracing algorithm using a single step search method, Bull. Seism. Soc. Am. 78, 1190-1198.
Reinsch, C. H. (1967). Smoothing by spline functions, Numer. Math. 19, 177-183.
Stoer, J. and R. Bulirsch (1980). Introduction to Numerical Analysis. Springer, New York.
Thurber, C. H. and W. L. Ellsworth (1980). Rapid solution of ray tracing problems in heterogeneous
media, Bull. Seism. Soc. Am. 79, 1137-1148.
Urn, J. and C. H. Thurber (1987). A fast algorithm for two-point seismic ray tracing, Bull. Seism.
Soc. Am. 77, 972-986.
Wesson, R. L. (1971). Travel-time inversion for laterally inhomogeneous crustal velocity models,
Bull. Seism. Soc. Am. 61, 729-746.
Wielandt, E. (1987). On the validity of the ray approximation for interpreting delay times, in
Seismic Tomography with Applications in Global Seismology and Exploration Geophysics, G.
Nolet (Editor), Reidel. Dovdrecht.
APPENDIX A: BETA-SPLINES
This appendix summarizes the theory of Beta-splines; more details can be
found in Barsky (1988). A Beta-spline curve is a curve consisting of segments
that are joined together continuously or with a continuous derivative and that
is determined by two shape parameters. A point on a Beta-spline curve segment
~ ( u ) can be regarded as a weighted average of four support points Vi_2, Vi-1,
Vi, and Vi+l:
1
i(U) = ~ br(~l,[~2, u)gi+r, for0< u_< 1. (A1)
r= - 2
The four weighting factors br(~l, ~2, u) are cubic polynomials in u with coeffi-
cients determined by ~1 and ~2:
3
br(~l,~l,U)-= E Cgr(~l,~2) ug, for0_<u=< landr= - 2 , . . . . ,1 (A2)
g=0
These coefficients are chosen so that the joint between the ith and (h + 1)st
curve segments satisfies continuity conditions:
Q i + I ( 0 ) ~- Q i ( 1 ) , (A3)
(0) = (1),
d2Qi+l 2 d2Qi d~
2 (o) = -jj(1) +
From these conditions, /31 and ~2 come out as shape parameters because they
control the curve in a global manner when t h e support points are fixed. The
parameter /31 measures the discontinuity of dQ/du at a joint: ~1 = 1 for C 1
curves; the parameter ~2 determines the tension of the curve: for positive/32 the
curve is attracted towards its support points, for negative /~2 it is pushed away.
~1 = 1 and /~2 = 0 indicates continuity of first and second derivatives so pro-
duces the smoothest curves. The coefficient cgr(/~l, ~2) obtained by imposing the
282 T. J . M O S E R , G. N O L E T , A N D R. S N I E D E R
previous conditions are:
2~13 4~12 + 4~1 +/~e 2

CO, - 2 -- - - Co, - 1 = 6 C0, 0 ~--- ~ C0, 1 : 0,
6t~13 6t~1(~12- 1) 6~
m
Cl, -2 cl,-1= 5 C l , o-- - - 5 cl, 1 = 0,
6/~13 - 2 ~ 1 ~ - 2~12 - ~2
C2, - 2 -- c2_ 1 = 3
(A4)
2/~1e + Be
C2, 0 = 3 c2,1 = 0,
2~1 s ~13 + ~12 + ~ + /32

C3, - 2 --
2
C3,0
where
= 2613 + 461 e + 4~1 + 62 + 2.
The following properties and advantages with regard to other curve representa-
tions can be summed up.
Parametric Representation. This allows a curve to be multivalued in each
coordinate direction without causing ambiguity. In practice, this means that
loops are possible.
Convex Hull Property. The weight functions br(t31, ~2, u) satisfy:
1
E br(B~,~2,u) = 1, for all u. (A5)
r = --2
The im~ication of this fact is that Qi(u) lies in the convex hull of the support
points Vi_ 2, V i_ 1, Vi and Vi+ 1. This means that unexpected and undesired
oscillations or wanderings are impossible. One can easily control the curve by
its support points and its shape p a r a m e t e r s . T h i s a l s o means that interpolation
can be forced by simply s e t t ~ g Vi-2 = V / = Vi+l. The ~ n g e n t s a r e then
specified by V~_ 1 - Vi_ e and V/+ 2 - Vi+ 1. The endpoints Vo and Vn can be
interpolated automatically by projecting indices - 1 , - 2 and n + 1 on 0,
respectively n.
Local Control. Shifting one support point Vi causes only changes to the curve
segments Qi_l(u), Qi(u), Qi+l(u) and Qi+2(u). This provides an e a s y way to
recompute the whole curve if one point is perturbed. Therefore, Ql(u) can be
calculated very efficiently, namely in O(1) time, regardless the number of
points.
Application of Tension to the Curve. By varying ~2 one can attract or repel the
curve from the support points, each obeying the convex hull property (at least
for not too negative ~e). The original polygonal path defining the Beta-spline
curve is found back for B1 = ~2 = 0. The concept of shape parameters can be
extended to continuously varying shape parameters along the curve.
RAY B E N D I N G REVISITED 283
Generalization to 3D. There is no restriction to the dimension of the space in

which the curve is drawn; exactly the same formulation applies. Properties 1 to
4 remain valuable, also in three dimensions.
Most of these properties are illustrated by Figure 15: at the left side a
Beta-spline with /~1 = 1 and f12 = 0 and at the right side with fil = 0 and
f12 = 0; the endpoints are interpolated.
APPENDIX B: THE GRADIENT OF THE TRAVEL TIME ALONG A BETA-SPLINE
The change o f travel time along a Beta-spline Q(u) due to change of one
support point V i is
OT _ 0_. / R ds
Oy i og i dS T
0 /sR i[ Q'( u) I[ du
O~r~ e(Q(u))
f~ o llO'(u)lL}du
/( 1 oll O'(,,)t[ o 1 du. (B1)
= c(O(u)) ~'~, + lr~'(u)l] o'~, 4O(u))
The prime ' denotes the derivative with respect to u. The first term of (B1)
accounts for the extension of the ray path, the second term for the change in
0
0.1
-N
0
¢.D
, 1 J , [ , I , f J , i , q ~ I
10 20 30 40 50 60 70 80 90
X
FIG. 15. Beta-spline curves with ~1 = 1, ~2 = 0 (left) and B1 = 132 = 0 (right). The end points are
interpolated automatically.
velocity along the path. The differentiation of the first term is carried out as
follows:
o11~'(u)1, ~,(~) oO,(u)
where OQ'(u)/OV i is a 2 × 2, or 3 × 3 in three dimensions, identity matrix

multiplied with a scalar factor equal to the derivative with respect of u of one of
the polynomials br(u ). If the Beta-spline (~(u) is considered as segments (~j(u),
then
oq'j( u) P
ovi b~_j(u)
for j = i- 1to i+2and
OQ'j(,) - o
otherwise. The second term of (B1) can be differentiated as follows:
~ - v ~(u)) 0~(u)
OV i c ( Q ( u ) ) OV~ '
where O Q ( u ) / O V i is a 2 × 2 identity matrix again, this time multiplied with

one of the polynomials br(u) themselves:
-=~
OQ~(u)
o~i - b~ ~(u)
forj=i- lto i+2and
o~(u) - o
otherwise. The result is
aT
-,/o,(
j=i-1
1
c(~(u)) iI~;(u)11 b~-~(")
+,, ,, )( ) (B2)
The usual integration by parts of the first term of (B2) is omitted here, in order
to avoid second derivatives of the ray path.
APPENDIX C: HIGHER-ORDER TERMS IN THE TRAVEL-TIME EXPANSION
The purpose of this appendix is to derive an expression for a search direction,
using a second-order expansion in the travel time along a curve with respect to
perturbations on it. The travel time along a curve 3, parametrized by the arc
length s, in a slowness field a(F) = 1/c(~') is
T= f a(~)ds. (C1)
Let an initial guess path joining the source and the receiver be given by ro(So).
The source is located at ro(0), while the length of the path is S o. Let }(So) be a
perturbation perpendicular on ~(So), such that
(~, r o ) = 0 (C2)
and
~'(o) = ~(So) = o (c3)

The dot denotes the derivative with respect to so. This perturbation has two
effects on the travel time, one due to the change in the arc length and one due to
the sampling of a different velocity region. For the arc length, this gives to
second order
Os - • 1 ( ( ~ ~) ~)2)
OSo : 1 + (ro,~) + ~ , -()0, , (C4)
and for the slowness
o( o + g)= + (c5)
where : stands for the dyadic product. Inserting this into (C1) and (C4) gives
T(~) : To + T~(~+) + r~(g), (c6)

where Tn is the travel time perturbation of the nth order in ~:
To = f ~(~o) alSo, (C7a)
+ so, (C7b)
T2--/ {1~ (~:VVa(~,o)) + ()o, ~)(~,Va(ro))
(C7c)
286 W. J. MOSER, G. NOLET, AND R. SNIEDER
Now perturb ~ with an extra perturbation ~(, which is an order of magnitude
smaller. The travel time is stationary for a finite curve change ~(o) when it does
not change to first order by the extra small perturbation 5}(o)- Applying
integration by parts using~ (C3), one can write the perturbations in 7'1 and T 2
due to the perturbation 6~ as
5T 1 = fJ Va- d (~io),~)~so, (cso)
It has been used that (~o, 5~) = 0 and 5~(0) = 6~(So) = 0.

Note that 5T 1 vanishes when d/dso(aTo) = Va, that is, when To(so) is a true
ray. However, To(So) is in general not a ray, and a finite deformation {(so) is
needed to make the travel time stationary. This i s t h e case when 5T 1 + 5T 2
does not depend on arbitrary small perturbations 5~(So), which happens when
• . . ~ . • ..
(c91
This is a second-order differential equation that needs to be solved for ~ with
the boundary conditions (C3). The right-hand side of this expression is the
deviation from the ray equation for the reference curve ~o(So).
Constraining the perturbation to be perpendicular to the reference curve
eliminates one degree of freedom and makes the computation more efficient
without loss of generality. In order to take full advantage of this, it is conve-
nient to transform (C9) to ray coordinates. In order to do so, define two unit
vectors 01 and 02 such that (~o, ql' q2) form an orthonormal basis. It is assumed
here that the unit vectors do not rotate around the reference curve, that is
(~1, 02) = (01, ~2) = 0. (C10)
The perturbation ~ is perpendicular to the reference curve so that
{ = qlql + q2q2' (Cll)
In order to convert (C9).to ray coordinates, the derivatives of ~ are needed.

From the orthogonality (ro, ql) = (~o, 0e) = (01, qe) = 0 and the normalizations
(ro, ro) = (01, 01) = (02, 02) = 1 one derives by direct differentiation that
(~, ~o) = - (~,~o), (o12)

((}, ~) = 0 . (C13)
(~o, ~ o ) = 0 , (C14)
where 4 denotes either ql or 42. The projection of ql on the basis vectors gives
~1 = ( ~ 1 , 4 1 ) 4 1 + (~ql, q2)q2+ (~ql, ~ro")~ro. (c15)
With (C10), (C12), and (C13), this leads to
~1 = - ( r o , ql)ro • (C16)
The differentiation of (C16) and use of (C10) gives
/d3Fo • ..
ql
~ : - 41 ro - (Fo, 41~ro
~:; • (C17)
dso3'
Let the perpendicular component of an arbitrary vector ~ be denoted by
b~±--~- (u',r~)~
o r o = (~
u, ql )0 1 + (7, q2 )4 2. (C18)
By differentiation of (Cll) one finds, using (C16) and (C17),
-(t141+q242- {ql(:ro,41)+q2(Fo,42)}ro, (C19a)
~=/11q~+/i242- q~ o,41 ) +q2 ,q2 1} o • (C19b)
It follows that (C14) that
ro = (ro, 41)41 + (~o, 4~,)42 (C20)
Projecting (C9) on the unit vectors ql and q2 perpendicular to the reference

curve one obtains with (C19) and (C20) that
ds ° [a ds---o 41 + ds--~ as ° ] 42 - (q141ql: VVo -4- q 2 q 1 4 2 : V V a ) 4 1
- ( ql q142 : ~7 V a -4- q2 q2 q2 : V V a ) q2
- ( ( V a ) . =r o + Fo(Va)± , q141 + q2q2 ) = V± a - a~o, (C21)
where it is used that (ro, V f ) = df/ds o. By projecting this equation on ql

respectively q2 one finds using V V a - 2/a(Va)(Va)= -a2VV(1/a) and after
some algebraic manipulation that
d
(°-dqi ] '[- 02 ~ qi4j: ~7~7 __ qj
dso ~ dso } j=l 0
-[-(~] ~ (qi4j -[- 4j4i): (~70")(O'~0 - ~7~)qj : (4i, 6r~0 _ ~70) ' (C22)
This equation constitutes two coupled linear differential equations in ql and

q2 that need to be solved subject to the boundary condition qi(0) = qi(So) = O.
Note that this expression is different from the expression (13.15) of Aki and
Richards (1980). The reason for this is that they ignored the change in the path
length when the ray is perturbed, that is, they used Os/Os o = 1 rather than
(C4). Furthermore, expression (C22) is in ray coordinates, whereas Aki and
Richards (1980) consider a three-dimensional Cartesian coordinate system.
The ray estimate can be updated by solving (C22) recursively. When the
solution has converged to the true ray, the right-hand side of (C22) vanishes and
the solution does not change anymore. If one applies (C22) recursively, it may
be advantageous to solve a simplified version of (C22) rather than the exact
equations (C22). When the reference curve is near the true ray, (9, ~ o - V~)
0. It is therefore reasonable to ignore this term in the left hand side of (C22).
Furthermore, it may not be possible to compute the second derivative of
the velocity VV(1/o) with great numerical accuracy. Ignoring these second
derivatives gives
d ( dqi)~ " (~ii, £ro_ Va) (C23)

ds ° ~ ds~°
Note that in this approximation the components ql and q2 are decoupled.

Equation (C23) can be integrated in closed form. One can make a further
approximation and ignore the derivative of the slowness in the left-hand side of
(C23):
,°
(C24)
(T
This equation can be integrated to give
qi(s) ~ - (1- So s'Fi(s') ds' - s 1- So Fi(s') ds'' (C25)
where
..
Fi(s) : (c26)
O-
DEPARTMENT OF THEORETICAL GEOPHYSICS
PO Box 80.021
BUDAPESTLAAN 4 , 3 5 T A UTRECHT, THE NETHERLANDS
Manuscript received 28 J a n u a r y 1991

Bulletin of The Seismological Society of America, Vol. 82, No. 1, Pp. 259-288, February 1992

Uploaded by

Copyright:

Available Formats

Bulletin of The Seismological Society of America, Vol. 82, No. 1, Pp. 259-288, February 1992

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Bulletin of The Seismological Society of America, Vol. 82, No. 1, Pp. 259-288, February 1992

Uploaded by

Copyright:

Available Formats

Bulletin of the Seismological Society of America, Vol. 82, No. 1, pp.

259-288, February 1992

BY T. J. MOSER, G. NOLET, AND R. SNIEDER

-: ( Xo, xl, . . , (3)

where c~ is the seismic velocity at (xi, zi). It is a function from ~n to ~, t h a t can

T ( 7 + ~ 7 ) = T(7)+(VT(7),5"y)+O(II67112), (115711~ 0), (5)

so t h a t the corresponding components of the gradient do not need to be com-

T ( 7 + ~ 7 ) - T(7-57)=2(VT(7),57)+O(II57[13), (I]~711-~0), (8)

where 57 consists of suitable increments 5x i and ~z i (see Stoer and Bulirsch,

MINIMIZATION TECHNIQUES USING DERIVATIVES

Since t h e t r a v e l t i m e T ( ~ ) a n d its g r a d i e n t V T ( 7 ) c a n be c a l c u l a t e d easily,

Here c is a constant real number, b a n vector, A a symmetric n × n-matrix.

T(~/1) = minT(too + uoVT(7o) ),

T(~2) = min T(~ 1 + uoVT(~o) + u 1VT(71)),

T(~j+I) = rain T(Tj + uoVT(7o) + "'" +ujVT(~j)). (10)

T(7j+l) = min T(Tj + ppj). (12)

O ( n 2) function evaluations necessary before finding the minimal point of a

curve with the trapezoidal rule is correct up to O ( 1 / k 2 ) ( k --~ ¢¢). When k is to

a(X) _> ),a 1 + (1 - h)a2, (13)

where a(k) = f f ( ~ X 1 q" (1 -- )k)X2, ~Z 1 -[- ( 1 - k ) z 2 ) , a 1 = q ( X l , Z l ) a n d a 2 =

Concave slowness regions a p p e a r in all realistic models; a c o m m o n e x a m p l e is

whose inverse, the slowness, is concave in a zone a r o u n d z = 50, w h e r e c has its

FIG. 5. Conjugate gradients bending of a straight polygonal path of four points

A second reason to interpolate polygonal paths is the observation that seismic

The travel time along Beta-spline can be calculated as accurately as desired,

Increment Iterations R~sult~ Error (%)

10.0 8 96.2483 0.0062

from 5x = 6z = 0.001 to 10.0 and t h a t the formal differentiation amounts to the

Computation Time Accuracy

k m Iterations Function evM. Comp. time Result

1. Polygonal p a t h (/31 =/32 = 0) 64 1 67 795 58.0 96.2463

FIG. 6. Conjugate gradients bending of Beta-splines in the same situation as in Figure 5 (k = 3,

FIG. 7. Conjugate gradients bending of Beta-splines in the velocity field of Figure 3 (k = 5,

1987; Prothero e t a l . , 1988). Even without modifications, the conjugate gradi-

~ ' = ( X 'o , z 'o , x 'l , z ~ . . . . . x '~ , z 'k ) , (17)

FI~. 8. Conjugate gradients bending of Beta-splines (k = 10, m = 5) in a discontinuous velocity

when i is to lie on the interface, f ( x i ) is a function to transform the x

FIG. 9. Constrained conjugate gradients bending of Beta-splines (/~1 = ~2 = 0, k a=5'ndm = 1) in a

r a y path. Such p r o c e d u r e s are described e x t e n s i v e l y in S t e e r a n d B u l i r s c h

Table 4 is an extension of the calculations of Figure 2 and Table 3. Starting

In this section, the conjugate gradients bending is applied to a spherical

' i , ~ ' i , ' i ' i , i ,

Ar i = 0.TAr/_ 1 + 0.3 P E R T × ( R A N D - 0.5), (1)

where P E R T determines the rms amplitude of the perturbation (which is about

c~T T(...,r i + h,... ) - T(...,r i- h,... )

Discontinuities were handled by including a threefold node on the discontinuity,

random ray perturbations used in test

10.0 20.0 30.0 40.0 50.0 60.0 70.0 80.0 90.0

5, 7 . . . . ,10 sec/degree. With these slownesses, rays arrive between 21 ° and 86 °

25.0 50.0 25.0 50.0

curve with respect to perturbations of it is suggested by Fermat's principle.

previous conditions are:

2~13 4~12 + 4~1 +/~e 2

Cl, -2 cl,-1= 5 C l , o-- - - 5 cl, 1 = 0,

2~1 s ~13 + ~12 + ~ + /32

= 2613 + 461 e + 4~1 + 62 + 2.

Generalization to 3D. There is no restriction to the dimension of the space in

o11~'(u)1, ~,(~) oO,(u)

where OQ'(u)/OV i is a 2 × 2, or 3 × 3 in three dimensions, identity matrix