arXiv:cond-mat/0302579v1 [cond-mat.stat-mech] 27 Feb 2003
Financial Probabilities from Fisher Information
Raymond J. Hawkins† and B. Roy Frieden‡
† Mulsanne Capital Management, 220 Montgomery Street, Suite 506, San Francisco, CA 94104, USA
‡ Optical Sciences Center, University of Arizona, Tucson, AZ 85721, USA
September 16, 2018
Abstract
We present a novel synthesis of Fisher information and asset pricing theory that yields a practical method for reconstructing the probability density implicit in security prices. The Fisher information approach to these inverse problems transforms the search for a probability density into the solution of a differential equation for which a
substantial collection of numerical methods exists. We illustrate the
potential of this approach by calculating the probability density implicit in both bond and option prices. Comparing the results of this
approach with those obtained using maximum entropy we find that
Fisher information usually results in probability densities that are
smoother than those obtained using maximum entropy.
1 Introduction
Probability laws that derive from a variational principle provide an operational calculus for the incorporation of new knowledge. This feature, long
exploited in the physical sciences, is becoming increasingly popular in finance and economics where, for example, maximum entropy has found application as a useful and practical computational approach to financial economics (Maasoumi 1993, Sengupta 1993, Golan, Miller & Judge 1996, Fomby
& Hill 1997). It is generally felt that a candidate probability law p(x) should
be as non-informative and smooth (in some sense) as possible while maintaining consistency with the known information about x. While this criterion has
often been used to motivate the use of maximum entropy (Buck & Macaulay
1990), other variational approaches provide similar - and potentially superior
- degrees of smoothness to probability laws (Frieden 1988, Edelman 2001) and
it is the purpose of this paper to show that Fisher information (Frieden 1998)
- which provides just such a variational approach - can be used to reconstruct
probability densities of interest in financial economics. In particular, Fisher
information provides a variational calculus, a well developed computational
approach to the estimation of probability laws and yields a probability law
where smoothness is ensured across the range of support; in contrast to maximum entropy where smoothness tends to be concentrated in regions where
the probability density is very small.
Since maximum entropy is comparatively well known in financial economics and as it shares with Fisher information a common variational structure, we shall use it as a point of comparison when we review Fisher information in Sec. 2. In that section we shall see that Fisher information yields not
an analytic expression for the probability law, but rather, a differential equation for the probability law that is well known in the physical sciences: the
Schroedinger equation. Using this fortunate correspondence we can exploit
decades of development of computational approaches to the Schroedinger
equation in the construction of probability laws in finance and economics.
Like other variational approaches the Fisher information approach depends upon prior information in the form of equality constraints, and we
shall explore this in Sec. 3 where examples of increasingly complex structure
will be given. In particular, exploiting the correspondence between (i) the
resulting differential equations for probability laws in these examples from
financial economics and (ii) the Schroedinger equation, will permit the generation both of yield curves from observed bond prices and of probability
densities from observed option prices in new and efficient ways. In addition,
we shall see that the probability densities generated using Fisher information are, in general, smoother than those obtained from maximum entropy.
A brief summary of our results is given in Sec. 4 which is followed by two
appendices where details of the calculations are presented in somewhat expanded form.
2 Theory
When observed data are expectation values, their relationship to the probability density is that of a linear integral, and the calculation of the probability
density from these observations is a linear inverse problem.
In the most general case we have $M$ observed data values $d_1, \ldots, d_M = \{d_m\}$ that are known to be averages of known functions $\{f_m(x)\}$ and are
related to a probability density function $p(x)$ by
$$\int_a^b f_m(x)\,p(x)\,dx = d_m , \qquad m = 1, \ldots, M . \tag{1}$$
As there are an infinite number of probability densities that satisfy Eq. 1,
a regularizer is commonly introduced to impose a structural constraint on
the characteristics of p(x) and to choose a particular density. This can be
accomplished by constructing a Lagrangian and employing the variational
calculus. In the case of maximum entropy, the regularizer is the Shannon
entropy (Shannon 1948)
$$H \equiv -\int_a^b p(x)\ln[p(x)]\,dx , \tag{2}$$
with which one can form the Lagrangian
$$\int_a^b p(x)\ln[p(x)]\,dx + \sum_{m=1}^{M}\lambda_m\left[\int_a^b f_m(x)\,p(x)\,dx - d_m\right] , \tag{3}$$
where the {λm } are the Lagrange undetermined multipliers. For this Lagrangian the extremum solution with zero first variation is known to be (Jaynes
1968)
$$p_{ME}(x) = \frac{1}{Z(\lambda_1,\ldots,\lambda_M)}\exp\left[-\sum_{m=1}^{M}\lambda_m f_m(x)\right] , \tag{4}$$
where
$$Z(\lambda_1,\ldots,\lambda_M) = \int_a^b \exp\left[-\sum_{m=1}^{M}\lambda_m f_m(x)\right] dx . \tag{5}$$
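The step from the entropy Lagrangian to the exponential form can be made explicit. The following is a sketch of the standard argument; the functional-derivative symbol $\delta/\delta p(x)$ is notation introduced here and does not appear in the original text.

```latex
\frac{\delta}{\delta p(x)}
\left\{\int_a^b p\ln p\,dx
      + \sum_{m=1}^{M}\lambda_m\left[\int_a^b f_m\,p\,dx - d_m\right]\right\}
  = \ln p(x) + 1 + \sum_{m=1}^{M}\lambda_m f_m(x) = 0
\quad\Longrightarrow\quad
p(x) = e^{-1}\exp\left[-\sum_{m=1}^{M}\lambda_m f_m(x)\right] .
```

Enforcing normalization (for example through an additional multiplier attached to the constant function $f(x) = 1$) replaces the factor $e^{-1}$ by $1/Z$, which gives the solution quoted in Eqs. 4 and 5.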
By comparison, in most uses of the Fisher information approach the Shannon
entropy is simply replaced with Fisher information (Fisher 1925) in its shift-invariant form
$$I = \int_a^b \frac{p'(x)^2}{p(x)}\,dx , \tag{6}$$
where the prime denotes differentiation with respect to x. Consequently the
Lagrangian becomes
$$\int_a^b \frac{p'(x)^2}{p(x)}\,dx + \sum_{m=1}^{M}\lambda_m\left[\int_a^b f_m(x)\,p(x)\,dx - d_m\right] . \tag{7}$$
As before we seek an extremum solution with zero first variation. This time,
however, the Euler-Lagrange equations result in a differential equation for
p(x):
$$2\frac{d}{dx}\left[\frac{p'(x)}{p(x)}\right] + \left[\frac{p'(x)}{p(x)}\right]^2 - \sum_{m=1}^{M}\lambda_m f_m(x) = 0 . \tag{8}$$
This equation can be simplified through the use of a probability amplitude
function $q(x)$ where $q^2(x) \equiv p(x)$. Substituting the probability amplitude
into Eq. 8 yields the expression
$$\frac{d^2 q(x)}{dx^2} = \frac{q(x)}{4}\sum_{m=1}^{M}\lambda_m f_m(x) , \tag{9}$$
known in the physical sciences as the Schroedinger equation.
Thus we see that, although the use of Fisher information results in a
differential equation for p(x) instead of an algebraic function, the resulting
differential equation for q(x) is one for which a number of analytic solutions
are known and for which a substantial collection of numerical solutions exists.
Furthermore, as we shall see in the next section, the Fisher information solutions are, in general, smoother than those obtained using maximum entropy.
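The passage from the Fisher information Lagrangian to the Schroedinger form can be checked in a few lines. The sketch below applies the Euler-Lagrange equation to the integrand of Eq. 7 (the constants $d_m$ drop out) and then substitutes $p = q^2$; the shorthand $L$ for the integrand is introduced here for illustration.

```latex
% Integrand of Eq. 7:  L = p'^2/p + \sum_m \lambda_m f_m p
\frac{\partial L}{\partial p} - \frac{d}{dx}\frac{\partial L}{\partial p'}
  = -\left(\frac{p'}{p}\right)^2 + \sum_{m=1}^{M}\lambda_m f_m
    - 2\frac{d}{dx}\left(\frac{p'}{p}\right) = 0 .

% Substituting p = q^2:
\frac{p'}{p} = \frac{2q'}{q}, \qquad
\frac{d}{dx}\left(\frac{p'}{p}\right) = \frac{2q''}{q} - \frac{2q'^2}{q^2},

% so that
\left(\frac{p'}{p}\right)^2 + 2\frac{d}{dx}\left(\frac{p'}{p}\right)
  = \frac{4q'^2}{q^2} + \frac{4q''}{q} - \frac{4q'^2}{q^2}
  = \frac{4q''}{q}
\quad\Longrightarrow\quad
q'' = \frac{q}{4}\sum_{m=1}^{M}\lambda_m f_m .
```

The nonlinear terms in $q'$ cancel, which is why the amplitude $q(x)$ obeys a linear equation even though the equation for $p(x)$ is nonlinear.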
3 Examples
In this section we use the Fisher information approach to motivate practical computational approaches to the extraction of probability densities and
related quantities from financial observables. In each case we contrast the
results obtained by this method with those of maximum entropy. We begin
in Secs. 3.1 and 3.2 with examples from fixed income where observables are
expressible in terms of partial integrals of a probability density and of first
moments of the density. In Secs. 3.3 and 3.4 we finish the examples by examining the canonical problem of extracting risk-neutral densities from observed
option prices and extend this analysis by showing how Fisher information can
be used to calculate a generalized implied volatility.
3.1 The Probability Density in the Term Structure of Interest Rates: Yield Curve Construction
A basis for the use of information theory in fixed-income analysis comes from
the perspective developed by Brody & Hughston (2001) who observed that
the price of a zero-coupon bond - also known as the discount factor D(t) - can
be viewed as a complementary probability distribution where the maturity
date is taken to be an abstract random variable. The associated probability
density p(t) satisfies p(t) > 0 for all t, and
$$D(t) = \int_t^\infty p(\tau)\,d\tau , \tag{10}$$
$$\phantom{D(t)} = \int_0^\infty \Theta(\tau - t)\,p(\tau)\,d\tau , \tag{11}$$
where $\Theta(x)$ is the Heaviside step function.¹
To illustrate the differences between the Fisher information and maximum
entropy approaches to term structure estimation we consider the case of a
single zero coupon bond of tenor T = 10 years and a price of 28 cents on the
dollar or, equivalently, a discount factor of 0.28.² The maximum entropy
solution, Eq. 4, is
$$p_{ME}(t) = \frac{e^{-\lambda (t - T)\Theta(t - T)}}{T + 1/\lambda} , \tag{12}$$
and the corresponding discount factor is $D(T) = 1/(\lambda T + 1)$ from which the
Lagrange multiplier is found to be 0.2571.
The Fisher information solution is obtained by solving Eq. 9 with the
constraints of normalization and one discount factor:
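$$\frac{d^2 q(t)}{dt^2} = \frac{q(t)}{4}\left[\lambda_0 + \lambda_1\Theta(t - 10)\right] . \tag{13}$$
This problem is known³ to have the solution
$$q(t) = \begin{cases} A\cos(\alpha t) & \text{if } t \le T ; \\ B e^{-\beta t} & \text{if } t > T , \end{cases} \tag{14}$$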
where the coefficients A and B are determined by requiring that q(t) and
q ′ (t) (or the logarithmic derivative q ′ (t)/q(t)) match at t = T . Matching the
logarithmic derivative yields
$$\tan(\alpha T) = \beta/\alpha . \tag{15}$$
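Choosing $\alpha T = \pi/4$ and matching amplitudes we find that $B = A\exp(\pi/4)/\sqrt{2}$,
$$p_{FI}(t) = \begin{cases} \dfrac{8\alpha\cos^2(\alpha t)}{\pi + 4} & \text{if } t \le T ; \\[2ex] \dfrac{8\alpha\,e^{\pi/2 - \ln 2 - 2\beta t}}{\pi + 4} & \text{if } t > T ; \end{cases} \tag{16}$$
and $D(T) = 2/(\pi + 4) = 0.28$.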
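The numbers quoted for this single-bond example can be checked numerically. The sketch below is an illustration only: it takes the maximum entropy density to be uniform up to T and exponentially decaying beyond it (as the text describes), evaluates the Fisher information density of Eq. 16, and integrates both with a simple midpoint rule. The helper `integrate` and the cutoff `T_MAX` that stands in for the infinite upper limit are assumptions introduced here.

```python
import math

T = 10.0        # tenor of the zero-coupon bond in years (from the text)
D_OBS = 0.28    # observed discount factor (from the text)

# Maximum entropy: D(T) = 1/(lam*T + 1), so lam = (1/D - 1)/T.
lam_me = (1.0 / D_OBS - 1.0) / T

def p_me(t):
    """MaxEnt density: uniform up to T, then exponential decay at rate lam_me."""
    decay = 1.0 if t <= T else math.exp(-lam_me * (t - T))
    return decay / (T + 1.0 / lam_me)

# Fisher information: alpha*T = pi/4, and Eq. 15 then gives beta = alpha.
alpha = math.pi / (4.0 * T)
beta = alpha * math.tan(alpha * T)

def p_fi(t):
    """Eq. 16: squared cosine up to T, then exponential decay at rate 2*beta."""
    prefactor = 8.0 * alpha / (math.pi + 4.0)
    if t <= T:
        return prefactor * math.cos(alpha * t) ** 2
    return prefactor * math.exp(math.pi / 2.0 - math.log(2.0) - 2.0 * beta * t)

def integrate(f, a, b, n=40000):
    """Composite midpoint rule on [a, b] with n panels (hypothetical helper)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

T_MAX = 400.0   # assumed cutoff standing in for the infinite upper limit

norm_me = integrate(p_me, 0.0, T) + integrate(p_me, T, T_MAX)
norm_fi = integrate(p_fi, 0.0, T) + integrate(p_fi, T, T_MAX)
d_me = integrate(p_me, T, T_MAX)   # discount factor D(T), Eq. 10
d_fi = integrate(p_fi, T, T_MAX)

print(f"lambda (MaxEnt) = {lam_me:.4f}")
print(f"MaxEnt: norm = {norm_me:.6f}, D(T) = {d_me:.6f}")
print(f"Fisher: norm = {norm_fi:.6f}, D(T) = {d_fi:.6f}")
```

Both densities should integrate to one, and both discount factors should come out close to 0.28, with the Fisher value equal to 2/(π + 4).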
The maximum entropy and Fisher information probability densities (Eqs.
12 and 16 respectively) and the corresponding term structure of interest rates
are illustrated in Fig. 1. In the uppermost panel we see both probability densities as a function of tenor with the Fisher information result denoted in all
panels by the solid line. As expected, the maximum entropy result is uniform until the first (and in this case only) observation is encountered; beyond
which a decaying exponential is observed. The Fisher information result is
smoother reflecting the need to match both the amplitude and derivative at
the data point.
Looking at the probability densities alone, it is difficult to make any real aesthetic
choice between the two results. For this we turn to the two important derived
quantities: the spot rate $r(t)$ and the forward rate $f(t)$ that are related to
the discount factor by
$$D(t) = e^{-r(t)t} , \tag{17}$$
$$\phantom{D(t)} = e^{-\int_0^t f(\tau)\,d\tau} . \tag{18}$$
The spot rates are shown in the middle panel of Fig. 1. Both methods yield
a smooth result with the Fisher information solution showing less structure
than the maximum entropy solution. A greater difference between the two
methods is seen in the lowermost panel of Fig. 1 where the forward rate
is shown. It is the structure of this function that is often looked to when
assessing the relative merits of a particular representation of the discount
factor. The forward rate reflects the structure of the probability density as
expected from the relationship
f (t) = p(t)/D(t) ,
(19)
with the maximum entropy result showing more structure than the Fisher
information result due to the continuity at the level of the first derivative
imposed on p(t) by the Fisher information approach.
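The spot and forward rates for the single-bond example follow directly from Eqs. 17 and 19 once the discount factor is known in closed form. The sketch below is illustrative: the function names `maxent`, `fisher`, and `rates` are introduced here, and the closed-form discount factors are obtained by integrating the two densities of this section by hand.

```python
import math

T = 10.0                             # tenor in years (from the text)
lam = (1.0 / 0.28 - 1.0) / T         # MaxEnt multiplier from D(T) = 1/(lam*T + 1)
alpha = beta = math.pi / (4.0 * T)   # Fisher parameters for alpha*T = pi/4
c = 8.0 * alpha / (math.pi + 4.0)    # prefactor of Eq. 16

def maxent(t):
    """Return (p(t), D(t)) for the maximum entropy density."""
    norm = T + 1.0 / lam
    if t <= T:
        return 1.0 / norm, 1.0 - t / norm
    tail = math.exp(-lam * (t - T))
    return tail / norm, tail / (lam * norm)

def fisher(t):
    """Return (p(t), D(t)) for the Fisher information density."""
    if t <= T:
        p = c * math.cos(alpha * t) ** 2
        cdf = c * (t / 2.0 + math.sin(2.0 * alpha * t) / (4.0 * alpha))
        return p, 1.0 - cdf
    p = c * math.exp(math.pi / 2.0 - math.log(2.0) - 2.0 * beta * t)
    return p, p / (2.0 * beta)

def rates(density, t):
    """Spot rate from Eq. 17 and forward rate from Eq. 19 at tenor t > 0."""
    p, d = density(t)
    return -math.log(d) / t, p / d

for t in (5.0, 10.0, 15.0):
    r_me, f_me = rates(maxent, t)
    r_fi, f_fi = rates(fisher, t)
    print(f"t = {t:4.1f}  spot: ME {r_me:.4f}, FI {r_fi:.4f}   "
          f"forward: ME {f_me:.4f}, FI {f_fi:.4f}")
```

Beyond the bond's tenor both forward rates are constant: the maximum entropy value equals λ and the Fisher information value equals 2β.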
Figure 1: The probability density and derived interest rates for a single ten year discount factor of 0.28. (Three panels, top to bottom: probability, spot rate, and forward rate versus tenor in years, each showing the Fisher and MaxEnt results.)
It is a comparatively straightforward matter to extend this approach to
the construction of a term structure that is consistent with any number of arbitrarily spaced zero coupon bonds. A particularly convenient computational
approach based on transfer matrices is presented in Appendix A.
While previous work on inferring the term structure of interest rates from
observed bond prices has usually focused on the somewhat ad hoc application of splines to the spot or forward rates (McCulloch 1975, Vasicek &
Fong 1982, Shea 1985, Fisher, Nychka & Zervos 1995), the work of Frishling & Yamamura (1996) that minimized $df(t)/dt$ is similar in spirit to the
Fisher information approach. Their paper dealt with the often unacceptable
results that a straightforward application of splines to this problem of inference can produce. In some sense, Fisher information can be seen as an
information-theoretic approach to imposing the structure sought by Frishling
and Yamamura on the term structure of interest rates.⁴
3.2 The Probability Density in a Perpetual Annuity
Material differences between the probability densities generated by Fisher
information and maximum entropy are also seen when the observed data are
moments of the density - a common situation in financial applications. The
value of a perpetual annuity ξ, for example, is given by (Brody & Hughston
2001, Brody & Hughston 2002)
$$\xi = \int_0^{+\infty}\tau\,p(\tau)\,d\tau . \tag{20}$$
This first-moment constraint provides an interesting point of comparison for
the maximum entropy approach employed by Brody and Hughston and our
Fisher information approach. The maximum entropy solution is known to
be
$$p_{ME}(t) = \frac{1}{\xi}\exp(-t/\xi) \tag{21}$$
while the Fisher information solution is known (Frieden 1988) to be
$$p_{FI}(t) = c_1\,\mathrm{Ai}^2(c_2 t) , \tag{22}$$
where $\mathrm{Ai}(x)$ is Airy's function and the constants $c_i$ are determined uniquely
by normalization and by the constraint of Eq. 20.

Figure 2: The probability density functions associated with a perpetual annuity with a price of 1.0. (Probability density versus tenor for the MaxEnt and Fisher solutions.)

We show these two probability densities as a function of tenor for a perpetual annuity with a price
of $1.00 in Fig. 2. The two solutions are qualitatively similar in appearance;
both monotonically decrease with tenor and are quite smooth. However,
since the Fisher information solution starts out at zero tenor with a lower
value than the maximum entropy solution and then crosses the maximum
entropy solution so as to fall off more slowly than the maximum entropy solution
in the mid-range region, the Fisher information solution seems to be even
smoother. In the context of Fisher information, a measure of smoothness is
the size of the Cramer-Rao bound (CRB) 1/I (Frieden 1988) that, for the
Fisher information solution is found to be 1.308 by integrating Eq. 6. The
CRB for the maximum entropy solution is 1.000. Hence, by the criterion of
maximum Cramer-Rao bound, the Fisher information solution is significantly
smoother. We also compared the relative smoothness of these solutions using
Shannon’s entropy, H, (Eq. 2) as a measure. The maximum entropy solution is found to have an H of 1.0, while the Fisher information solution has
an H of 0.993. Hence the maximum entropy solution does indeed have a
larger Shannon entropy than does the Fisher information solution, but it is
certainly not much larger. Since the two solutions differ much more in their
CRB values than in their Shannon H values, it appears that the CRB is a
more sensitive measure of smoothness, and hence biasedness, of a probability
law.
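The Cramer-Rao bounds quoted above can be reproduced in closed form. The sketch below is an illustration under stated assumptions: it uses the standard values of Ai(0) and Ai'(0) in terms of the gamma function, together with three antiderivatives that follow from Ai'' = x Ai and can each be checked by differentiation, namely ∫Ai² dx = x Ai² − Ai'², ∫x Ai² dx = (Ai Ai' − x Ai'² + x² Ai²)/3, and ∫Ai'² dx = (2 Ai Ai' + x Ai'² − x² Ai²)/3. All variable names are introduced here.

```python
import math

XI = 1.0   # price of the perpetual annuity, i.e. the first moment in Eq. 20

# Airy function and its derivative at the origin (standard closed forms).
AI0 = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))     # Ai(0)
DAI0 = -1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))   # Ai'(0)

# Integrals over [0, infinity) from the antiderivatives quoted in the lead-in;
# every term vanishes at infinity, so only the values at x = 0 survive.
INT_AI2 = DAI0 ** 2                   # integral of Ai(x)^2
INT_X_AI2 = -AI0 * DAI0 / 3.0         # integral of x * Ai(x)^2
INT_DAI2 = -2.0 * AI0 * DAI0 / 3.0    # integral of Ai'(x)^2

# p(t) = c1 * Ai(c2*t)^2 with unit norm and mean XI (Eqs. 20 and 22).
c2 = INT_X_AI2 / (INT_AI2 * XI)
c1 = c2 / INT_AI2

# Eq. 6 with p = q^2 gives I = 4 * integral of q'(t)^2, q = sqrt(c1) * Ai(c2*t).
fisher_info_fi = 4.0 * c1 * c2 * INT_DAI2
crb_fi = 1.0 / fisher_info_fi

# The exponential density of Eq. 21 has p'/p = -1/XI, hence I = 1/XI^2.
crb_me = XI ** 2

print(f"c1 = {c1:.4f}, c2 = {c2:.4f}, p_FI(0) = {c1 * AI0 ** 2:.4f}")
print(f"Cramer-Rao bound: Fisher {crb_fi:.3f}, MaxEnt {crb_me:.3f}")
```

With a price of 1.0 this should give a Fisher information bound close to the 1.308 quoted in the text, and a Fisher density that starts below the maximum entropy value of 1 at zero tenor.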
An interesting result emerges from an examination of the smoothness as
a function of range. Since over the range 0 ≤ T ≤ 5 shown in Fig. 2 it appears that the Fisher information solution ought to be measurably smoother than the
maximum entropy solution by any criterion, we also computed the Shannon
H values over this interval alone. The results are H = 0.959 for maximum
entropy and H = 0.978 for Fisher information: the Fisher information solution has the larger entropy. This shows that the Fisher information solution
is smoother than the maximum entropy solution, even by the maximum entropy criterion, over this finite interval. It follows that the maximum entropy
solution has the larger entropy overall only due to its action in the long tail
region T > 5. Note, however, that in this region the maximum entropy
solution would have negligibly small values for most purposes.
Hence, the indications are that the criterion of maximum entropy places
an unduly strong weight on the behavior of the probability density in the tail
regions. Note by comparison that over the interval 0 ≤ T ≤ 5 the CRB values
for the two solutions still strongly differ, with 1.317 for the Fisher information
solution and 1.007 for the maximum entropy. Moreover, these are close to
their values as previously computed over the entire range 0 ≤ T ≤ ∞. Hence,
Fisher information gives comparatively lower weight to the tail regions of the
probability densities.
3.3 The Probability Density in Option Prices
While the reconstruction of probability densities from observed option prices
has been the topic of much research (cf. Jackwerth 1999 and references
therein), the expectation of global smoothness in the resulting probability
densities has often not been achieved through the use of techniques that
focus globally (e.g. maximum entropy (Stutzer 1994, Hawkins, Rubinstein &
Daniell 1996, Buchen & Kelley 1996, Stutzer 1996, Hawkins 1997, Avellaneda
1998)). As with term-structure estimation, various approaches to smoothing
probability densities have often been introduced in an ad hoc manner. Fisher
information, however, provides a natural manner for the introduction of a
measure of local smoothness into the option-based density reconstruction
problem.
For purposes of illustration, let us consider the expression for the price of
a European call option c(k):
$$c(k) = e^{-rt}\int_0^\infty \max(x - k, 0)\,p(x)\,dx , \tag{23}$$
where k is the strike price, t is the time to expiration, and r is the risk-free
interest rate. We take c(k), k, r, and t to be known and ask what p(x) is
associated with the observed c(k). The observed call value c(k) can also be
viewed as the mean value of the function $e^{-rt}\max(x - k, 0)$, and it is from
this viewpoint that a natural connection between option pricing theory and
information theory can be made.
Given a set of M observed call prices {c(km )} the Fisher Information
solution is obtained by solving Eq. 9 subject to these constraints and normalization,
$$\frac{d^2 q(x)}{dx^2} = \frac{q(x)}{4}\left[\lambda_0 + \sum_{m=1}^{M}\lambda_m\max(x - k_m, 0)\right] . \tag{24}$$
While the numerical method described in Appendix A can be applied to
Eq. 24 by replacing the terms multiplying $q(x)$ on the right-hand side of
Eq. 24 with a stepwise approximation or by treating the solution between
strike prices as a linear combination of Airy functions, Eq. 24 can be integrated directly using the Numerov method (cf. Appendix B).⁵ The maximum
entropy solution can be obtained in a straightforward manner by substituting
Eq. 4 into Eq. 23 and varying the $\{\lambda_m\}$ to reproduce the observed option
prices.
We applied the Fisher information approach to the example discussed
in Edelman (2001) of 3 call options on a stock with an annualized volatility
σ of 20% and a time to expiration t of 1 year. In this example the interest
rate r is set to zero, the stock price St is $1.00 and the three option prices
with strike price k of $0.95, $1.00, and $1.05 have Black-Scholes prices $0.105,
$0.080, and $0.059 respectively. We note that the stock can also be viewed in
this example as a call option with a strike price of $0.00. Given these ‘observations’ together with the normalization requirement on p(x) we generated
the solutions shown in Fig. 3. The smooth dash-dot curve is the lognormal
distribution from which the ‘observed’ option prices were sampled,⁶ and is
the distribution we are trying to recover from the option prices. The Fisher
information result is given by the solid curve and the MaxEnt result is the
dashed curve that was calibrated to the observed option prices.
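The three ‘observed’ option prices of this example can be regenerated from the Black-Scholes formula given in the Notes. The sketch below is a minimal illustration using only the standard library; the function names `norm_cdf` and `black_scholes_call` are introduced here.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def black_scholes_call(spot, strike, rate, sigma, tau):
    """Black-Scholes price of a European call with time to expiration tau."""
    sqrt_tau = math.sqrt(tau)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * tau) / (sigma * sqrt_tau)
    d2 = d1 - sigma * sqrt_tau
    return spot * norm_cdf(d1) - strike * math.exp(-rate * tau) * norm_cdf(d2)

# Parameters of the example: stock price 1.00, volatility 20%, one year, zero rate.
strikes = [0.95, 1.00, 1.05]
prices = [black_scholes_call(1.0, k, 0.0, 0.20, 1.0) for k in strikes]

for k, c in zip(strikes, prices):
    print(f"k = {k:.2f}  c(k) = {c:.3f}")
```

Rounded to three decimals, the results should agree with the prices $0.105, $0.080, and $0.059 quoted in the text.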
We began by considering the case of the probability density implicit in a
single at-the-money (i.e. k = $1.00) option price and the results are shown
in the upper panel. The maximum entropy solution is a sharply peaked
product of two exponentials. In contrast the Fisher information solution
has, by virtue of requiring continuity of the first derivative, a much smoother
appearance and is much closer in appearance to the lognormal density from
which the option price was generated. We do, however, see that with only a
single option price the asymmetric features of the lognormal density are not
recovered.
Adding the option prices at the k = $0.95 and k = $1.05 strikes (from Edelman's
example) to the problem results in the densities shown in the lower
panel of Fig. 3. With this information the agreement between the Fisher
information solution and the lognormal density is improved substantially
with the peak of the density functions nearly the same and the characteristic asymmetry of the lognormal density now seen in the Fisher information
result. The maximum entropy solution continues to be sharply peaked and
to deviate substantially from the lognormal density. Sharply peaked densities have appeared in previous work involving limited financial observables
(see, for example, Kuwahara & Marsh 1994) and this feature should not
be interpreted as a specific indictment of the maximum entropy approach.
Figure 3: Probability density functions associated with option prices. (Two panels, each showing probability density versus stock price ($) for the Fisher, MaxEnt, and lognormal densities; the upper panel uses the single at-the-money option price and the lower panel uses all three option prices.)
In previous work (Hawkins et al. 1996) we generated smooth implied densities from S&P-500 index options by assuming a lognormal prior based on
the at-the-money (k = 1) option instead of the uniform prior assumed in Eq. 2 and forming a Lagrangian using the Kullback-Leibler entropy (Kullback 1959)
$$G = -\int_a^b p(x)\ln\left[\frac{p(x)}{r(x)}\right] dx , \tag{25}$$
where r(x) is the prior - in our case lognormal - density. This points to one
of the key differences between the maximum entropy and Fisher information
approaches: the assumptions needed. When observations are limited - as they
often are in financial economics applications - smooth maximum entropy
densities often require a prior density: Fisher information simply imposes a
continuity constraint.
3.4 A Measure of Volatility
In addition to providing a way of calculating the density implied by option
prices, Fisher information yields, via the Cramer-Rao bound, a simultaneous
measurement of the intrinsic uncertainty in this system. This uncertainty
principle is, perhaps, most easily seen if we recall that Fisher information is
generally presented in terms of a system specified by a parameter θ with a
probability density p(x|θ) where
$$\int_a^b p(x|\theta)\,dx = 1 , \tag{26}$$
and the Fisher information is
$$\int_a^b \left[\frac{\partial p(x|\theta)/\partial\theta}{p(x|\theta)}\right]^2 p(x|\theta)\,dx . \tag{27}$$
For distributions that can be parameterized by a location parameter θ, Fisher
information can then be defined with respect to the family of densities
{p(x − θ)} (cf. (Cover & Thomas 1991)). This transformation leads directly
to Eq. 6 for the Fisher information. It also implies that the mean-square
error $e^2$ in the estimate of that location parameter is related to this Fisher
information, $I$, by the Cramer-Rao inequality $e^2 I \ge 1$. If, for example, $p(x)$
is a Gaussian, the Fisher information is $I = 1/\sigma^2$ and the Cramer-Rao bound on $e^2$ is $1/I = \sigma^2$, with $\sigma^2$ being
the variance. Given that the probability density reconstructed from option
prices is the conditional density of underlying asset levels at option expiration and given that a set of option prices is a measurement of the probability
density it follows that the Fisher information, via the Cramer-Rao bound,
provides a measure of the mean-square error in the estimate of the level of
the underlying asset at expiration provided by the option prices and, thus,
a natural generalization of the concept of implied volatility to non-Gaussian
underlying-asset distributions.
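The Gaussian case offers a simple numerical check of this idea. The sketch below estimates the shift-invariant Fisher information of Eq. 6 for an arbitrary density by a midpoint rule with central differences, then reports the square root of the Cramer-Rao bound. Reading that square root as the generalized volatility is an interpretation adopted here for illustration, and the function names and the parameters `MU` and `SIGMA` are hypothetical.

```python
import math

def fisher_information(p, a, b, n=20000):
    """Midpoint-rule estimate of Eq. 6, using central differences for p'(x)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        px = p(x)
        if px > 0.0:
            dp = (p(x + 0.5 * h) - p(x - 0.5 * h)) / h
            total += dp * dp / px * h
    return total

def gaussian(mu, sigma):
    """Return the Gaussian density with mean mu and standard deviation sigma."""
    def p(x):
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return p

MU, SIGMA = 1.0, 0.20   # hypothetical asset level and dispersion

info = fisher_information(gaussian(MU, SIGMA), MU - 8.0 * SIGMA, MU + 8.0 * SIGMA)
generalized_vol = math.sqrt(1.0 / info)

print(f"Fisher information = {info:.4f} (1/sigma^2 = {1.0 / SIGMA ** 2:.4f})")
print(f"sqrt(Cramer-Rao bound) = {generalized_vol:.4f}")
```

For a Gaussian the result should recover σ itself; passing any other reconstructed density to `fisher_information` gives the corresponding non-Gaussian value.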
4 Discussion and Summary
Fisher information provides a variational approach to the estimation of probability laws that are consistent with financial observables. Using this approach one can employ well developed computational approaches from formally identical problems in the physical sciences. The resulting probability
laws possess a degree of smoothness consistent with priors concerning the
probability density and/or quantities derived therefrom. In this paper we
have shown how Fisher information can be used to solve some canonical
inverse problems in financial economics and that the resulting probability
densities are generally smoother than those obtained by other methods such
as maximum entropy. Fisher information also has the virtue of providing
a natural measure of, and computational approach for determining, implied
volatility via the Cramer-Rao inequality.
Comparing the maximum entropy approach to that of Fisher information
it was seen that the former has the virtue of simplicity, always being an exponential. The Fisher information solution is always a differential equation
for the probability density. Although somewhat more complicated, this is, in
some sense, a virtue as the probability densities of physics and, presumably
of finance and economics, generally obey differential equations. Indeed, it is
in this application that we see the greatest potential for future applications of
Fisher information in finance and economics. For example, in addition to the
problems examined in Sec. 3, recent developments in macroeconomic modeling have focused upon placing all or nearly all heterogeneous microeconomic
agents within one stochastic and dynamic framework. This has the effect
of replacing the replicator dynamics of evolutionary games or the Malthusian dynamics of Friedman with the backward Chapman-Kolmogorov
equation (Aoki 1998, Aoki 2001). Such a statistical mechanical representation of
macroeconomics provides a natural link to Fisher information. The solutions
of the Schroedinger equation (Eq. 9) are known to be stationary solutions
to the Fokker-Planck equation (Gardiner 1996, Risken 1996). More generally, however, the entire Legendre-transform structure of thermodynamics
can be expressed using Fisher information in place of the Shannon entropy
- an essential ingredient for constructing a statistical mechanics (Frieden,
Plastino, Plastino & Soffer 1999) - and it has been shown (Frieden, Plastino,
Plastino & Soffer 2002) that both equilibrium and non-equilibrium statistical
mechanics can be obtained from Eq. 9: the output of a constrained process
that extremizes Fisher information.
The elegant and powerful Lagrangian approach to physics, in use for
over 200 years, has until a recent application of the Fisher information approach (Frieden & Soffer 1995, Frieden 1998) not had a formal method for
constructing Lagrangians. This formalism for deriving Lagrangians and probability laws that are consistent with observations should prove to be a very
useful tool in finance and economics.
Acknowledgments
We thank Prof. David Edelman for providing a preprint of his work and for
stimulating discussions. We also thank Dana Hobson, Prof. David A.B. Miller,
Leif Wennerberg, and Prof. Ewan Wright for helpful discussions and suggestions.
Appendix A
To generalize the amplitude-matching that we used in Sec. 3.1 we use a
transfer matrix method that is employed often for formally similar problems
in quantum electronics (Yeh, Yariv & Hong 1977).⁷ Given a set of discount
factors $\{D(T_m)\}$ we seek a solution to
$$\frac{d^2 q(t)}{dt^2} = \frac{q(t)}{4}\left[\lambda_0 + \sum_{m=1}^{M}\lambda_m\Theta(t - T_m)\right] . \tag{A-1}$$
Between tenors, say $T_{m-1}$ and $T_m$, it is straightforward to show that the
probability amplitude $q(t)$ is given by
$$q(t) = A_m e^{i\omega_m(t - T_m)} + B_m e^{-i\omega_m(t - T_m)} , \tag{A-2}$$
where $A_m$ and $B_m$ are coefficients to be determined by the matching conditions, $i = \sqrt{-1}$, and
$$\omega_m = \frac{1}{2}\sqrt{\sum_{j=0}^{m}\lambda_j} , \tag{A-3}$$
with {λj } being the Lagrange undetermined multipliers. We now consider
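the match across the $m$th tenor. Since $t - T_m = 0$, continuity of $q(t)$ gives
$$q(T_m) = A_L + B_L = A_{m+1} + B_{m+1} , \tag{A-4}$$
where the subscript $L$ denotes the value of the coefficients of $q(t)$ immediately
to the left of $T_m$. Continuity of the first derivative of $q(t)$ gives
$$A_L - B_L = \Delta_m (A_{m+1} - B_{m+1}) , \tag{A-5}$$
where $\Delta_m = \omega_{m+1}/\omega_m$. These two equations combine to give
$$\begin{pmatrix} A_L \\ B_L \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 + \Delta_m & 1 - \Delta_m \\ 1 - \Delta_m & 1 + \Delta_m \end{pmatrix}\begin{pmatrix} A_{m+1} \\ B_{m+1} \end{pmatrix} . \tag{A-6}$$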
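Since $A_m = A_L\exp(-i\omega_m\tau_m)$ and $B_m = B_L\exp(i\omega_m\tau_m)$ where $\tau_m = T_{m+1} - T_m$ it follows that the coefficients for $q(t)$ in layer $m$ are related to those in
layer $m + 1$ by
$$\begin{pmatrix} A_m \\ B_m \end{pmatrix} = \frac{1}{2}\begin{pmatrix} e^{-i\omega_m\tau_m} & 0 \\ 0 & e^{i\omega_m\tau_m} \end{pmatrix}\begin{pmatrix} 1 + \Delta_m & 1 - \Delta_m \\ 1 - \Delta_m & 1 + \Delta_m \end{pmatrix}\begin{pmatrix} A_{m+1} \\ B_{m+1} \end{pmatrix} . \tag{A-7}$$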
Given this relationship we can generate the function q(t) sequentially, beginning with the region beyond the longest tenor that we shall take to be
$T_{m+1}$. In this region we require that the solution be a decaying exponential $\exp(-\omega_{m+1}t)$ which implies that $B_{m+1} = 0$ and that $\omega_{m+1}$ be a positive
real number. Given a $\{\lambda_m\}$ and setting $A_{m+1} = 1$, it is now a straightforward matter to calculate the fixed income functions described in Sec. 3.1.
This transfer matrix procedure maps {λm } into a discount function and, in
a straightforward manner, into coupon bond prices.
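The backward, region-by-region construction described in this appendix can be sketched in a few dozen lines. The code below is not the authors' implementation: it is a variant that propagates the pair (q, q') across each region instead of the coefficients (A, B), and it takes the sign convention directly from Eq. A-1, so that q'' = k² q with k² equal to one quarter of the cumulative sum of multipliers (k is imaginary, and q oscillatory, where that sum is negative). The names `step`, `build_amplitude`, and `discount_factors` are introduced here.

```python
import bisect
import cmath
import math

def step(q, dq, k, dt):
    """Advance (q, q') by dt where q'' = k**2 * q with constant, possibly imaginary, k."""
    if k == 0:
        return q + dq * dt, dq
    ch, sh = cmath.cosh(k * dt), cmath.sinh(k * dt)
    return q * ch + dq * sh / k, q * k * sh + dq * ch

def build_amplitude(tenors, lams):
    """Return (q, k_last) for Eq. A-1; tenors = [T_1..T_M], lams = [lam_0..lam_M]."""
    cumulative, total = [], 0.0
    for lam in lams:
        total += lam
        cumulative.append(total)
    k = [cmath.sqrt(s) / 2.0 for s in cumulative]   # one k per region, 0..M
    if not k[-1].real > 0.0:
        raise ValueError("the multipliers must sum to a positive number for q to decay")
    m = len(tenors)
    # anchors[j] = (t_j, q, q') at the right edge of region j (at T_M for the last region)
    anchors = [None] * (m + 1)
    anchors[m] = (tenors[-1], 1.0 + 0j, -k[m])      # decaying exponential beyond T_M
    for j in range(m - 1, -1, -1):
        t_next, q, dq = anchors[j + 1]
        q, dq = step(q, dq, k[j + 1], tenors[j] - t_next)   # q and q' are continuous
        anchors[j] = (tenors[j], q, dq)

    def q_of_t(t):
        j = bisect.bisect_right(tenors, t)          # index of the region containing t
        t_j, q, dq = anchors[j]
        if j == m:
            return math.exp(-k[m].real * (t - t_j))
        return step(q, dq, k[j], t - t_j)[0].real
    return q_of_t, k[-1].real

def discount_factors(tenors, lams, n=4000):
    """D(T_m) at each tenor: tail mass of p = q**2 divided by its total mass."""
    q, k_last = build_amplitude(tenors, lams)
    edges = [0.0] + list(tenors)
    masses = []
    for a, b in zip(edges[:-1], edges[1:]):         # midpoint rule on each finite region
        h = (b - a) / n
        masses.append(h * sum(q(a + (i + 0.5) * h) ** 2 for i in range(n)))
    tail = 1.0 / (2.0 * k_last)                     # mass of exp(-2*k*(t - T_M)) beyond T_M
    total = sum(masses) + tail
    return [(sum(masses[j + 1:]) + tail) / total for j in range(len(tenors))]

# Single-bond example of Sec. 3.1: alpha = beta = pi/40, so lam_0 = -4*alpha**2
# and lam_0 + lam_1 = 4*beta**2.
alpha = math.pi / 40.0
example = discount_factors([10.0], [-4.0 * alpha ** 2, 8.0 * alpha ** 2])
print(f"D(10) = {example[0]:.5f}")
```

For the single-bond multipliers of Sec. 3.1 this should return the discount factor 2/(π + 4). In practice the multipliers would be varied until the computed discount factors match the observed ones; that outer search is not shown.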
Appendix B
Our choice of the Numerov method to solve Eq. 24 follows from the popularity
of this method for solving this equation in quantum mechanics (Koonin 1986).
In this appendix we present a brief overview of the method and its application to the option problem discussed in Section 3.3.
The Numerov method provides a numerical solution to⁸
$$\frac{d^2 q(x)}{dx^2} + G(x)\,q(x) = 0 , \tag{B-1}$$
that begins by expanding q(x) in a Taylor series
$$q(x \pm \Delta x) = \sum_{n=0}^{\infty}\frac{(\pm\Delta x)^n}{n!}\frac{d^n q(x)}{dx^n} . \tag{B-2}$$
Combining the two expansions in Eq. B-2 and retaining terms to fourth order yields
$$\frac{q(x + \Delta x) + q(x - \Delta x) - 2q(x)}{(\Delta x)^2} = \frac{d^2 q(x)}{dx^2} + \frac{(\Delta x)^2}{12}\frac{d^4 q(x)}{dx^4} + O((\Delta x)^4) . \tag{B-3}$$
The second derivative term in Eq. B-3 can be cleared by substituting Eq. B-1
and the fourth derivative term can be cleared by differentiating Eq. B-1 two
more times
$$\frac{d^4 q(x)}{dx^4} = -\frac{d^2 (G(x)q(x))}{dx^2} \cong -\frac{G(x + \Delta x)q(x + \Delta x) + G(x - \Delta x)q(x - \Delta x) - 2G(x)q(x)}{(\Delta x)^2} , \tag{B-4}$$
and substituting this result into Eq. B-3. Performing these substitutions and
rearranging terms we arrive at the Numerov algorithm:
$$q_{n+1} = \frac{2\left(1 - \frac{5(\Delta x)^2}{12}G_n\right)q_n - \left(1 + \frac{(\Delta x)^2}{12}G_{n-1}\right)q_{n-1}}{1 + \frac{(\Delta x)^2}{12}G_{n+1}} + O((\Delta x)^6) . \tag{B-5}$$
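The Numerov recursion translates directly into code. The sketch below is a minimal illustration rather than the authors' program; the function name `numerov` and its argument names are introduced here, and the test problem G(x) = 1, whose solution is cos(x), is chosen only because the answer is known.

```python
import math

def numerov(G, x0, dx, q0, q1, steps):
    """Integrate q'' + G(x) q = 0 with the three-term Numerov recursion.

    Starts from q(x0) = q0 and q(x0 + dx) = q1 and returns the grid and the
    solution at steps + 1 points.
    """
    w = dx * dx / 12.0
    xs = [x0, x0 + dx]
    qs = [q0, q1]
    for n in range(1, steps):
        x_next = x0 + (n + 1) * dx
        numerator = (2.0 * (1.0 - 5.0 * w * G(xs[n])) * qs[n]
                     - (1.0 + w * G(xs[n - 1])) * qs[n - 1])
        qs.append(numerator / (1.0 + w * G(x_next)))
        xs.append(x_next)
    return xs, qs

# Check on a case with a known answer: G = 1 gives q'' + q = 0, solved by cos(x).
dx = 0.01
xs, qs = numerov(lambda x: 1.0, 0.0, dx, 1.0, math.cos(dx), 314)
max_error = max(abs(q - math.cos(x)) for x, q in zip(xs, qs))
print(f"largest deviation from cos(x) on [0, 3.14]: {max_error:.2e}")
```

Two starting values are needed because the recursion is three-term; here the second is taken from the exact solution, whereas in the option problem it would come from the assumed behavior of q(x) at the edge of the range.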
To apply this to the option problem it is useful to note that the calculation
can be separated into two stages: first, the calculation of the {λ0 , q(x)} pair
that solves the eigenvalue problem for a fixed set of {λm }m>0 , and second,
the search for the set {λm }m>0 for which q(x) of the corresponding {λ0 , q(x)}
pair reproduces the observed option prices.
The {λ0 , q(x)} pair for a given {λm }m>0 is obtained by integrating q(x)
from both the lower and upper limits of the x range and varying λ0 until the
logarithmic derivatives of each solution match at an arbitrary mid point. In
this example that point was x = 1 and λ0 was found with a 1-dimensional
minimization algorithm using the difference in the logarithmic derivatives as a
penalty function and the minimum of the function $\sum_{m=1}^{M}\lambda_m\max(x - k_m, 0)/4$
as the initial guess for $\lambda_0$.
Once the logarithmic derivatives are matched the option prices can be
calculated and compared with the observed option prices. The results of this
comparison form the penalty function for a multidimensional variation of the
{λm }m>0 to obtain agreement between the observed and calculated option
prices.
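The first stage of this procedure, finding λ0 for fixed {λm}, can be sketched as follows. Several choices here are assumptions made for illustration and are not stated in the text: q is taken to vanish at both ends of the x range, the derivative at the matching point is estimated by a central difference, and bisection on the mismatch of the logarithmic derivatives replaces the one-dimensional minimization. The names `log_derivative`, `mismatch`, `solve_lam0`, and the `bracket` argument are hypothetical.

```python
import math

def log_derivative(G, x_start, x_match, n=2000):
    """Numerov-integrate q'' + G(x) q = 0 from x_start, where q = 0 is assumed,
    to one step past x_match, and return q'/q at x_match."""
    dx = (x_match - x_start) / n          # negative when integrating downward
    w = dx * dx / 12.0
    q = [0.0, dx]                         # the overall scale cancels in q'/q
    for i in range(1, n + 1):
        x_prev, x_cur, x_next = (x_start + (i + j) * dx for j in (-1, 0, 1))
        num = 2.0 * (1.0 - 5.0 * w * G(x_cur)) * q[i] - (1.0 + w * G(x_prev)) * q[i - 1]
        q.append(num / (1.0 + w * G(x_next)))
    return (q[n + 1] - q[n - 1]) / (2.0 * dx * q[n])   # central difference at x_match

def mismatch(lam0, lams, strikes, x_lo, x_hi, x_match):
    """Left minus right logarithmic derivative for Eq. 24 at the matching point."""
    def G(x):   # Eq. 24 rewritten in the form q'' + G q = 0
        return -(lam0 + sum(l * max(x - k, 0.0) for l, k in zip(lams, strikes))) / 4.0
    return log_derivative(G, x_lo, x_match) - log_derivative(G, x_hi, x_match)

def solve_lam0(lams, strikes, x_lo, x_hi, x_match, bracket, iterations=50):
    """Bisection on lam0; bracket = (a, b) must give mismatches of opposite sign."""
    a, b = bracket
    f_a = mismatch(a, lams, strikes, x_lo, x_hi, x_match)
    for _ in range(iterations):
        mid = 0.5 * (a + b)
        f_mid = mismatch(mid, lams, strikes, x_lo, x_hi, x_match)
        if (f_mid > 0.0) == (f_a > 0.0):
            a, f_a = mid, f_mid
        else:
            b = mid
    return 0.5 * (a + b)

# Check with no option constraints on [0, 2]: q = sin(pi*x/2), so lam0 = -pi**2.
lam0 = solve_lam0([], [], 0.0, 2.0, 1.0, bracket=(-16.0, -4.0))
print(f"lam0 = {lam0:.6f}  (expected {-math.pi ** 2:.6f})")
```

With strikes and multipliers supplied, the same routine would give the λ0 that pairs with a trial set {λm}; the outer multidimensional search over {λm} described above is not shown, and a suitable bracket has to be found for each trial set.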
Notes
1
Θ(x) = 0 if x < 0 and 1 otherwise.
These particular parameters were chosen so that the trigonometric functions resolve
into simple fractions.
3
See, for example, Section 22 of (Landau & Lifshitz 1977).
4
A similar pairing on the minimization of d2 f (t)/dt2 is seen in the work of Adams &
Van Deventer (1994) and the recent entropy work of Edelman (2001).
5
In this paper we focus solely on the equilibrium solutions: those densities that correspond to the lowest value of λ0 . Equation 24 is solved by a collection of {λ0 , q(x)}
pairs; the solutions q(x) corresponding to other λ0 being nonequilibrium solutions. These
solutions are the subject of a forthcoming paper.
6
The lognormal distribution is
!
1
[ln(ST /St ) − r(T − t)]2
p
p(ST , T |St , t) =
exp −
2σ 2 (T − t)
ST 2πσ 2 (T − t)
2
from which the Black-Scholes call price
c(k) =
d1
≡
d2
≡
St N (d1 ) − ke−r(T −t) N (d2)
ln(St /k) + (r + σ 2 /2)(T − t)
√
σ T −t
√
d1 − σ T − t
can be derived.
7
Our presentation of this derivation was derived, in part, from unpublished optoelectronics notes of Prof. David A.B. Miller.
8
The Numerov method can, in fact, be used to solve numerically the somewhat more
general equation
\[
\frac{d^2 q(x)}{dx^2} + G(x) q(x) = S(x) .
\]
The development of the algorithm follows the same line as that illustrated in this Appendix.
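As a hedged illustration of this more general case, the sketch below applies the standard three-point Numerov recursion with a source term to q''(x) + G(x) q(x) = S(x). The function name, argument order, and the test problem (G = 1, S = 1, whose solution with q(0) = 0 and q'(0) = 0 is 1 − cos x) are assumptions made for this example and are not taken from the Appendix.

```python
import math


def numerov_with_source(G, S, x0, h, q0, q1, n):
    """Integrate q'' + G(x) q = S(x) from x0 over n steps of size h,
    given the two starting values q(x0) = q0 and q(x0 + h) = q1."""
    q = [q0, q1]
    c = h * h / 12.0
    for i in range(1, n):
        x_prev, x, x_next = x0 + (i - 1) * h, x0 + i * h, x0 + (i + 1) * h
        rhs = (2.0 * (1.0 - 5.0 * c * G(x)) * q[i]
               - (1.0 + c * G(x_prev)) * q[i - 1]
               + c * (S(x_next) + 10.0 * S(x) + S(x_prev)))
        q.append(rhs / (1.0 + c * G(x_next)))
    return q


# Test problem (assumed for illustration): q'' + q = 1 on [0, pi].
h = math.pi / 1000
q = numerov_with_source(lambda x: 1.0, lambda x: 1.0,
                        0.0, h, 0.0, 1.0 - math.cos(h), 1000)
```

Setting S(x) = 0 recovers the homogeneous recursion. The two starting values must be supplied by the caller, for example from a boundary condition and a short series expansion.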
References
Adams, K. J. & Van Deventer, D. R. (1994), ‘Fitting yield curves and forward
rate curves with maximum smoothness’, J. Fixed Income pp. 52–62.
Aoki, M. (1998), New Approaches to Macroeconomic Modeling: Evolutionary Stochastic Dynamics, Multiple Equilibria, and Externalities as Field
Effects, Cambridge University Press, New York.
Aoki, M. (2001), Modeling Aggregate Behavior & Fluctuations in Economics:
Stochastic Views of Interacting Agents, Cambridge University Press,
New York.
Avellaneda, M. (1998), ‘Minimum-relative-entropy calibration of asset-pricing models’, International Journal of Theoretical and Applied Finance 1, 447–472.
Brody, D. C. & Hughston, L. P. (2001), ‘Interest rates and information geometry’, Proc. R. Soc. Lond. A 457, 1343–1364.
Brody, D. C. & Hughston, L. P. (2002), ‘Entropy and information in the
interest rate term structure’, Quantitative Finance 2, 70–80.
Buchen, P. W. & Kelly, M. (1996), ‘The maximum entropy distribution of an
asset inferred from option prices’, Journal of Financial and Quantitative
Analysis 31, 143–159.
Buck, B. & Macaulay, V. A., eds (1990), Maximum Entropy in Action: A
Collection of Expository Essays, Oxford University Press, Oxford, UK.
Cover, T. M. & Thomas, J. A. (1991), Elements of Information Theory, Wiley
Series in Telecommunications, John Wiley & Sons, Inc., New York.
Edelman, D. (2001), The minimum local cross-entropy criterion for inferring risk-neutral price distributions from traded option prices. Private
communication.
Fisher, M., Nychka, D. & Zervos, D. (1995), ‘Fitting the term structure of
interest rates with smoothing splines’, Technical Report FEDS 95-1,
Federal Reserve Board, Washington, D.C.
Fisher, R. A. (1925), ‘Theory of statistical estimation’, Proceedings of the
Cambridge Philosophical Society 22, 700–725.
Fomby, T. & Hill, R. C., eds (1997), Applying Maximum Entropy to Econometric Problems, Vol. 12 of Advances in Econometrics, JAI Press Inc.,
Greenwich, Connecticut.
Frieden, B. R. (1988), ‘Applications to optics and wave mechanics of the
criterion of maximum Cramer-Rao bound’, J. Mod. Opt. 35, 1297–1316.
Frieden, B. R. (1998), Physics from Fisher Information, Cambridge University Press, Cambridge.
Frieden, B. R., Plastino, A., Plastino, A. R. & Soffer, B. H. (1999), ‘Fisher-based thermodynamics: Its Legendre transform and concavity properties’, Phys. Rev. E 60, 48–53.
Frieden, B. R., Plastino, A., Plastino, A. R. & Soffer, B. H. (2002),
‘Schroedinger link between nonequilibrium thermodynamics and Fisher
information’, Phys. Rev. E 66, 046128.
Frieden, B. R. & Soffer, B. H. (1995), ‘Lagrangians of physics and the game
of Fisher-information transfer’, Phys. Rev. E 52, 2274–2286.
Frishling, V. & Yamamura, J. (1996), ‘Fitting a smooth forward rate curve
to coupon instruments’, J. Fixed Income pp. 97–103.
Gardiner, C. W. (1996), Handbook of Stochastic Methods: For Physics,
Chemistry, and the Natural Sciences, Vol. 13 of Springer Series in Synergetics, second edn, Springer-Verlag, New York.
Golan, A., Miller, D. & Judge, G. G. (1996), Maximum Entropy Econometrics: Robust Estimation with Limited Data, Financial Economics and
Quantitative Analysis, John Wiley & Sons, New York.
Hawkins, R. J. (1997), Maximum entropy and derivative securities, in
T. Fomby & R. C. Hill, eds, ‘Applying Maximum Entropy to Econometric Problems’, Vol. 12 of Advances in Econometrics, JAI Press Inc.,
Greenwich, Connecticut, pp. 277–301.
Hawkins, R. J., Rubinstein, M. & Daniell, G. (1996), Reconstruction of the
probability density implicit in option prices from incomplete and noisy
data, in K. M. Hanson & R. N. Silver, eds, ‘Maximum Entropy and
Bayesian Methods’, Vol. 79 of Fundamental Theories of Physics, Kluwer
Academic Publishers, Santa Fe, New Mexico, pp. 1–8.
Jackwerth, J. (1999), ‘Option implied risk-neutral distributions and implied
binomial trees: A literature review’, Journal of Derivatives 7, 66–82.
Jaynes, E. T. (1968), ‘Prior probabilities’, IEEE Transactions on Systems
Science and Cybernetics SSC-4, 227–241.
Koonin, S. E. (1986), Computational Physics, Addison-Wesley, New York.
Kullback, S. (1959), Information Theory and Statistics, Wiley, New York.
Kuwahara, H. & Marsh, T. (1994), ‘Why doesn’t the Black-Scholes model
fit Japanese warrants and convertible bonds?’, Japanese Journal of Financial Economics 1, 33–65.
Landau, L. D. & Lifshitz, E. M. (1977), Quantum Mechanics (Non-relativistic
Theory), Vol. 3 of Course of Theoretical Physics, third edn, Pergamon
Press, New York.
Maasoumi, E. (1993), ‘A compendium to information theory in economics
and econometrics’, Econometric Reviews 12, 137–181.
McCulloch, J. H. (1975), ‘The tax-adjusted yield curve’, J. Finance
XXX, 811–829.
Risken, H. (1996), The Fokker-Planck Equation: Methods of Solution and
Applications, Vol. 18 of Springer Series in Synergetics, second edn,
Springer-Verlag, New York.
Sengupta, J. K. (1993), Econometrics of Information and Efficiency, Kluwer
Academic, Dordrecht.
Shannon, C. E. (1948), ‘A mathematical theory of communication’, The Bell
System Technical Journal 27, 379–423.
Shea, G. S. (1985), ‘Term structure estimation with exponential splines’,
J. Finance XL, 319–325.
Stutzer, M. (1994), The statistical mechanics of asset prices, in R. Elworth,
W. Everitt & E. Lee, eds, ‘Differential Equations, Dynamical Systems,
and Control Science - A Festschrift in Honor of Lawrence Markus’, Vol.
152 of Lecture Notes in Pure and Applied Mathematics, Marcel Dekker,
New York, pp. 321–342.
Stutzer, M. (1996), ‘A simple nonparametric approach to derivative security
pricing’, J. Finance 51, 1633–1652.
Vasicek, O. A. & Fong, H. G. (1982), ‘Term structure modeling using exponential splines’, J. Finance XXXVII, 339–356.
Yeh, P., Yariv, A. & Hong, C. (1977), ‘Electromagnetic propagation in periodic stratified media. I. General theory’, J. Opt. Soc. Am. 67, 423–438.