
Financial Probabilities from Fisher Information

2003, Arxiv preprint cond-mat/0302579

We present a novel synthesis of Fisher information and asset pricing theory that yields a practical method for reconstructing the probability density implicit in security prices. The Fisher information approach to these inverse problems transforms the search for a probability density into the solution of a differential equation for which a substantial collection of numerical methods exists. We illustrate the potential of this approach by calculating the probability density implicit in both bond and option prices. Comparing the results of this approach with those obtained using maximum entropy, we find that Fisher information usually results in probability densities that are smoother than those obtained using maximum entropy.

arXiv:cond-mat/0302579v1 [cond-mat.stat-mech] 27 Feb 2003

Raymond J. Hawkins† and B. Roy Frieden‡
† Mulsanne Capital Management, 220 Montgomery Street Suite 506, San Francisco, CA 94104, USA
‡ Optical Sciences Center, University of Arizona, Tucson, AZ 85721, USA

September 16, 2018

1 Introduction

Probability laws that derive from a variational principle provide an operational calculus for the incorporation of new knowledge. This feature, long exploited in the physical sciences, is becoming increasingly popular in finance and economics where, for example, maximum entropy has found application as a useful and practical computational approach to financial economics (Maasoumi 1993, Sengupta 1993, Golan, Miller & Judge 1996, Fomby & Hill 1997). It is generally felt that a candidate probability law p(x) should be as non-informative and smooth (in some sense) as possible while maintaining consistency with the known information about x.
While this criterion has often been used to motivate the use of maximum entropy (Buck & Macaulay 1990), other variational approaches provide similar - and potentially superior - degrees of smoothness to probability laws (Frieden 1988, Edelman 2001). It is the purpose of this paper to show that Fisher information (Frieden 1998), which provides just such a variational approach, can be used to reconstruct probability densities of interest in financial economics. In particular, Fisher information provides a variational calculus and a well developed computational approach to the estimation of probability laws, and it yields a probability law whose smoothness is ensured across the range of support; with maximum entropy, by contrast, smoothness tends to be concentrated in regions where the probability density is very small. Since maximum entropy is comparatively well known in financial economics and shares with Fisher information a common variational structure, we shall use it as a point of comparison when we review Fisher information in Sec. 2. In that section we shall see that Fisher information yields not an analytic expression for the probability law but rather a differential equation for the probability law that is well known in the physical sciences: the Schroedinger equation. Using this fortunate correspondence we can exploit decades of development of computational approaches to the Schroedinger equation in the construction of probability laws in finance and economics. Like other variational approaches, the Fisher information approach depends upon prior information in the form of equality constraints, and we shall explore this in Sec. 3, where examples of increasingly complex structure will be given.
In particular, exploiting the correspondence between (i) the resulting differential equations for probability laws in these examples from financial economics and (ii) the Schroedinger equation will permit the generation both of yield curves from observed bond prices and of probability densities from observed option prices in new and efficient ways. In addition, we shall see that the probability densities generated using Fisher information are, in general, smoother than those obtained from maximum entropy. A brief summary of our results is given in Sec. 4, which is followed by two appendices where details of the calculations are presented in somewhat expanded form.

2 Theory

When observed data are expectation values, their relationship to the probability density is that of a linear integral, and the calculation of the probability density from these observations is a linear inverse problem. In the most general case we have $M$ observed data values $d_1, \ldots, d_M = \{d_m\}$ that are known to be averages of known functions $\{f_m(x)\}$ and are related to a probability density function $p(x)$ by

$$\int_a^b f_m(x)\,p(x)\,dx = d_m , \quad m = 1, \ldots, M . \tag{1}$$

As there are an infinite number of probability densities that satisfy Eq. 1, a regularizer is commonly introduced to impose a structural constraint on the characteristics of $p(x)$ and to choose a particular density. This can be accomplished by constructing a Lagrangian and employing the variational calculus. In the case of maximum entropy, the regularizer is the Shannon entropy (Shannon 1948)

$$H \equiv -\int_a^b p(x) \ln[p(x)]\,dx , \tag{2}$$

with which one can form the Lagrangian

$$\int_a^b p(x) \ln[p(x)]\,dx + \sum_{m=1}^{M} \lambda_m \left[ \int_a^b f_m(x)\,p(x)\,dx - d_m \right] , \tag{3}$$

where the $\{\lambda_m\}$ are the Lagrange undetermined multipliers. For this Lagrangian the extremum solution with zero first variation is known to be (Jaynes 1968)

$$p_{ME}(x) = \frac{1}{Z(\lambda_1, \ldots, \lambda_M)} \exp\left[ -\sum_{m=1}^{M} \lambda_m f_m(x) \right] , \tag{4}$$

where

$$Z(\lambda_1, \ldots, \lambda_M) = \int_a^b \exp\left[ -\sum_{m=1}^{M} \lambda_m f_m(x) \right] dx . \tag{5}$$

By comparison, in most uses of the Fisher information approach the Shannon entropy is simply replaced with Fisher information (Fisher 1925) in its shift-invariant form

$$I = \int_a^b \frac{[p'(x)]^2}{p(x)}\,dx , \tag{6}$$

where the prime denotes differentiation with respect to $x$. Consequently the Lagrangian becomes

$$\int_a^b \frac{[p'(x)]^2}{p(x)}\,dx + \sum_{m=1}^{M} \lambda_m \left[ \int_a^b f_m(x)\,p(x)\,dx - d_m \right] . \tag{7}$$

As before we seek an extremum solution with zero first variation. This time, however, the Euler-Lagrange equations result in a differential equation for $p(x)$:

$$-\left[ \frac{p'(x)}{p(x)} \right]^2 - 2 \frac{d}{dx}\left[ \frac{p'(x)}{p(x)} \right] + \sum_{m=1}^{M} \lambda_m f_m(x) = 0 . \tag{8}$$

This equation can be simplified through the use of a probability amplitude function $q(x)$ where $q^2(x) \equiv p(x)$. Substituting the probability amplitude into Eq. 8 yields the expression

$$\frac{d^2 q(x)}{dx^2} = \frac{q(x)}{4} \sum_{m=1}^{M} \lambda_m f_m(x) , \tag{9}$$

known in the physical sciences as the Schroedinger equation. Thus we see that, although the use of Fisher information results in a differential equation for $p(x)$ instead of an algebraic function, the resulting differential equation for $q(x)$ is one for which a number of analytic solutions are known and for which a substantial collection of numerical solutions exists. Furthermore, as we shall see in the next section, the Fisher information solutions are, in general, smoother than those obtained using maximum entropy.

3 Examples

In this section we use the Fisher information approach to motivate practical computational approaches to the extraction of probability densities and related quantities from financial observables. In each case we contrast the results obtained by this method with those of maximum entropy. We begin in Secs. 3.1 and 3.2 with examples from fixed income where observables are expressible in terms of partial integrals of a probability density and of first moments of the density. In Secs.
3.3 and 3.4 we finish the examples by examining the canonical problem of extracting risk-neutral densities from observed option prices and extend this analysis by showing how Fisher information can be used to calculate a generalized implied volatility.

3.1 The Probability Density in the Term Structure of Interest Rates: Yield Curve Construction

A basis for the use of information theory in fixed-income analysis comes from the perspective developed by Brody & Hughston (2001), who observed that the price of a zero-coupon bond - also known as the discount factor $D(t)$ - can be viewed as a complementary probability distribution where the maturity date is taken to be an abstract random variable. The associated probability density $p(t)$ satisfies $p(t) > 0$ for all $t$, and

$$D(t) = \int_t^\infty p(\tau)\,d\tau \tag{10}$$

$$\phantom{D(t)} = \int_0^\infty \Theta(\tau - t)\,p(\tau)\,d\tau , \tag{11}$$

where $\Theta(x)$ is the Heaviside step function.¹ To illustrate the differences between the Fisher information and maximum entropy approaches to term structure estimation we consider the case of a single zero coupon bond of tenor $T = 10$ years and a price of 28 cents on the dollar or, equivalently, a discount factor of 0.28.² The maximum entropy solution, Eq. 4, is

$$p_{ME}(t) = \frac{e^{-\lambda\Theta(t-T)}}{T + 1/\lambda} , \tag{12}$$

and the corresponding discount factor is $D(T) = 1/(\lambda T + 1)$, from which the Lagrange multiplier is found to be 0.2571. The Fisher information solution is obtained by solving Eq. 9 with the constraints of normalization and one discount factor:

$$\frac{d^2 q(t)}{dt^2} = \frac{q(t)}{4} \left[ \lambda_0 + \lambda_1 \Theta(t - 10) \right] . \tag{13}$$

This problem is known³ to have the solution

$$q(t) = \begin{cases} A\cos(\alpha t) & \text{if } t \le T ; \\ B e^{-\beta t} & \text{if } t > T , \end{cases} \tag{14}$$

where the coefficients $A$ and $B$ are determined by requiring that $q(t)$ and $q'(t)$ (or the logarithmic derivative $q'(t)/q(t)$) match at $t = T$. Matching the logarithmic derivative yields

$$\tan(\alpha T) = \beta/\alpha . \tag{15}$$

Choosing $\alpha T = \pi/4$ and matching amplitudes we find that $B = A\exp(\pi/4)/\sqrt{2}$,

$$p_{FI}(t) = \begin{cases} \dfrac{8\alpha}{\pi+4} \cos^2(\alpha t) & \text{if } t \le T ; \\[1ex] \dfrac{8\alpha}{\pi+4}\, e^{\pi/2 - \ln 2 - 2\beta t} & \text{if } t > T ; \end{cases} \tag{16}$$

and $D(T) = 2/(\pi + 4) = 0.28$.

The maximum entropy and Fisher information probability densities (Eqs. 12 and 16 respectively) and the corresponding term structure of interest rates are illustrated in Fig. 1. In the uppermost panel we see both probability densities as a function of tenor, with the Fisher information result denoted in all panels by the solid line. As expected, the maximum entropy result is uniform until the first (and in this case only) observation is encountered, beyond which a decaying exponential is observed. The Fisher information result is smoother, reflecting the need to match both the amplitude and derivative at the data point. Looking at the probability density alone, it is difficult to make any real aesthetic choice between the two results. For this we turn to the two important derived quantities, the spot rate $r(t)$ and the forward rate $f(t)$, which are related to the discount factor by

$$D(t) = e^{-r(t)t} \tag{17}$$

$$\phantom{D(t)} = e^{-\int_0^t f(\tau)\,d\tau} . \tag{18}$$

The spot rates are shown in the middle panel of Fig. 1. Both methods yield a smooth result, with the Fisher information solution showing less structure than the maximum entropy solution. A greater difference between the two methods is seen in the lowermost panel of Fig. 1, where the forward rate is shown. It is the structure of this function that is often looked to when assessing the relative merits of a particular representation of the discount factor. The forward rate reflects the structure of the probability density, as expected from the relationship

$$f(t) = p(t)/D(t) , \tag{19}$$

with the maximum entropy result showing more structure than the Fisher information result due to the continuity at the level of the first derivative imposed on $p(t)$ by the Fisher information approach.
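The single-bond example above can be checked numerically. The following is a minimal sketch, not the authors' code: it evaluates both densities, integrates them to recover the discount factor, and forms the forward rate of Eq. 19. Two assumptions are made here and flagged in the comments: the exponent in Eq. 12 is read as −λ(t − T)Θ(t − T), which is the reading consistent with the stated normalization T + 1/λ and with the decaying exponential described in the text, and the infinite upper limit is replaced by a hypothetical cutoff `t_max`. The function names are illustrative only.

```python
import math

T = 10.0       # tenor in years, from the text
D_OBS = 0.28   # observed discount factor, from the text

# Maximum entropy: D(T) = 1/(lam*T + 1)  =>  lam = (1/D - 1)/T
lam = (1.0 / D_OBS - 1.0) / T


def p_me(t):
    # Assumption: exponent read as -lam*(t - T)*Theta(t - T), the form
    # consistent with the normalization T + 1/lam.
    return math.exp(-lam * max(t - T, 0.0)) / (T + 1.0 / lam)


# Fisher information: alpha*T = pi/4 and, from Eq. 15, beta = alpha
alpha = math.pi / (4.0 * T)
beta = alpha
amp2 = 8.0 * alpha / (math.pi + 4.0)   # A**2 fixed by normalization


def p_fi(t):
    # Eq. 16, both branches
    if t <= T:
        return amp2 * math.cos(alpha * t) ** 2
    return amp2 * math.exp(math.pi / 2.0 - math.log(2.0) - 2.0 * beta * t)


def discount(p, t, t_max=400.0, n=100_000):
    """D(t) = integral of p from t to infinity (midpoint rule on [t, t_max])."""
    h = (t_max - t) / n
    return h * sum(p(t + (i + 0.5) * h) for i in range(n))


def forward_rate(p, t):
    """f(t) = p(t) / D(t), Eq. 19."""
    return p(t) / discount(p, t)


if __name__ == "__main__":
    print(f"lambda (MaxEnt) = {lam:.4f}")
    print(f"D(T), MaxEnt    = {discount(p_me, T):.4f}")
    print(f"D(T), Fisher    = {discount(p_fi, T):.4f}")
```

Under these assumptions the maximum entropy forward rate is constant and equal to λ beyond the ten-year point, while the Fisher information forward rate varies continuously through it.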
[Figure 1: The probability density and derived interest rates for a single ten year discount factor of 0.28. Panels, top to bottom: probability, spot rate, and forward rate, each plotted against tenor (years) for the Fisher and MaxEnt solutions.]

It is a comparatively straightforward matter to extend this approach to the construction of a term structure that is consistent with any number of arbitrarily spaced zero coupon bonds. A particularly convenient computational approach based on transfer matrices is presented in Appendix A. While previous work on inferring the term structure of interest rates from observed bond prices has usually focused on the somewhat ad hoc application of splines to the spot or forward rates (McCullough 1975, Vasicek & Fong 1982, Shea 1985, Fisher, Nychken & Zervos 1995), the work of Frishling & Yamamura (1996) that minimized $df(t)/dt$ is similar in spirit to the Fisher information approach. Their paper dealt with the often unacceptable results that a straightforward application of splines to this problem of inference can produce. In some sense, Fisher information can be seen as an information-theoretic approach to imposing the structure sought by Frishling and Yamamura on the term structure of interest rates.⁴

3.2 The Probability Density in a Perpetual Annuity

Material differences between the probability densities generated by Fisher information and maximum entropy are also seen when the observed data are moments of the density - a common situation in financial applications. The value of a perpetual annuity $\xi$, for example, is given by (Brody & Hughston 2001, Brody & Hughston 2002)

$$\xi = \int_0^{+\infty} \tau\,p(\tau)\,d\tau . \tag{20}$$

This first-moment constraint provides an interesting point of comparison for the maximum entropy approach employed by Brody and Hughston and our Fisher information approach.
The maximum entropy solution is known to be

$$p_{ME}(t) = \frac{1}{\xi} \exp(-t/\xi) , \tag{21}$$

while the Fisher information solution is known (Frieden 1988) to be

$$p_{FI}(t) = c_1 \mathrm{Ai}^2(c_2 t) , \tag{22}$$

where Ai(x) is Airy's function and the constants c_i are determined uniquely by normalization and by the constraint of Eq. 20. We show these two probability densities as a function of tenor for a perpetual annuity with a price of $1.00 in Fig. 2.

[Figure 2: The probability density functions associated with a perpetual annuity with a price of 1.0. Probability density plotted against tenor (0 to 5) for the MaxEnt and Fisher solutions.]

The two solutions are qualitatively similar in appearance; both monotonically decrease with tenor and are quite smooth. However, since the Fisher information solution starts out at zero tenor with a lower value than the maximum entropy solution and then crosses the maximum entropy solution so as to fall off more slowly than the maximum entropy solution in the mid-range region, the Fisher information solution seems to be even smoother. In the context of Fisher information, a measure of smoothness is the size of the Cramer-Rao bound (CRB) 1/I (Frieden 1988) that, for the Fisher information solution, is found to be 1.308 by integrating Eq. 6. The CRB for the maximum entropy solution is 1.000. Hence, by the criterion of maximum Cramer-Rao bound, the Fisher information solution is significantly smoother. We also compared the relative smoothness of these solutions using Shannon's entropy, H (Eq. 2), as a measure. The maximum entropy solution is found to have an H of 1.0, while the Fisher information solution has an H of 0.993. Hence the maximum entropy solution does indeed have a larger Shannon entropy than does the Fisher information solution, but it is certainly not much larger.
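The two smoothness measures quoted above for the maximum entropy solution can be reproduced by direct quadrature of Eqs. 2 and 6 applied to the exponential density of Eq. 21 with ξ = 1. The sketch below is illustrative only: the helper names and the finite cutoff `upper` standing in for the infinite range are assumptions, and the Airy-function solution of Eq. 22 is not evaluated here because its constants are not given in closed form in the text.

```python
import math

XI = 1.0  # price of the perpetual annuity, from the text


def integrate(f, a, b, n=200_000):
    """Midpoint-rule quadrature of f on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def p(t):
    # Maximum entropy density, Eq. 21
    return math.exp(-t / XI) / XI


def dp(t):
    # Derivative of Eq. 21 with respect to t
    return -math.exp(-t / XI) / XI ** 2


def fisher_information(upper):
    # Eq. 6 on [0, upper]; 'upper' is a hypothetical cutoff for the tail
    return integrate(lambda t: dp(t) ** 2 / p(t), 0.0, upper)


def shannon_entropy(upper):
    # Eq. 2 on [0, upper]
    return integrate(lambda t: -p(t) * math.log(p(t)), 0.0, upper)


if __name__ == "__main__":
    print(f"CRB 1/I = {1.0 / fisher_information(60.0):.3f}")
    print(f"H       = {shannon_entropy(60.0):.3f}")
```

Lowering `upper` restricts both integrals to a finite range of tenors, which is one way to see how much of each measure comes from the tail of the density.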
Since the two solutions differ much more in their CRB values than in their Shannon H values, it appears that the CRB is a more sensitive measure of smoothness, and hence biasedness, of a probability law.

An interesting result emerges from an examination of the smoothness as a function of range. Since over the range 0 ≤ T ≤ 5 shown in Fig. 2 it appears that the Fisher information solution ought to be measurably smoother than the maximum entropy solution by any criterion, we also computed the Shannon H values over this interval alone. The results are H = 0.959 for maximum entropy and H = 0.978 for Fisher information: the Fisher information solution has the larger entropy. This shows that the Fisher information solution is smoother than the maximum entropy solution, even by the maximum entropy criterion, over this finite interval. It follows that the maximum entropy solution has the larger entropy overall only due to its action in the long tail region T > 5. Note, however, that in this region the maximum entropy solution would have negligibly small values for most purposes. Hence, the indications are that the criterion of maximum entropy places an unduly strong weight on the behavior of the probability density in the tail regions. Note by comparison that over the interval 0 ≤ T ≤ 5 the CRB values for the two solutions still strongly differ, with 1.317 for the Fisher information solution and 1.007 for the maximum entropy solution. Moreover, these are close to their values as previously computed over the entire range 0 ≤ T ≤ ∞. Hence, Fisher information gives comparatively lower weight to the tail regions of the probability densities.

3.3 The Probability Density in Option Prices

While the reconstruction of probability densities from observed option prices has been the topic of much research (cf.
(Jackwerth 1999) and references therein), the expectation of global smoothness in the resulting probability densities has often not been achieved through the use of techniques that focus globally (e.g. maximum entropy (Stutzer 1994, Hawkins, Rubinstein & Daniell 1996, Buchen & Kelley 1996, Stutzer 1996, Hawkins 1997, Avellaneda 1998)). As with term-structure estimation, various approaches to smoothing probability densities have often been introduced in an ad hoc manner. Fisher information, however, provides a natural manner for the introduction of a measure of local smoothness into the option-based density reconstruction problem.

For purposes of illustration, let us consider the expression for the price of a European call option $c(k)$:

$$c(k) = e^{-rt} \int_0^\infty \max(x - k, 0)\,p(x)\,dx , \tag{23}$$

where $k$ is the strike price, $t$ is the time to expiration, and $r$ is the risk-free interest rate. We take $c(k)$, $k$, $r$, and $t$ to be known and ask what $p(x)$ is associated with the observed $c(k)$. The observed call value $c(k)$ can also be viewed as the mean value of the function $e^{-rt}\max(x - k, 0)$, and it is from this viewpoint that a natural connection between option pricing theory and information theory can be made. Given a set of $M$ observed call prices $\{c(k_m)\}$, the Fisher information solution is obtained by solving Eq. 9 subject to these constraints and normalization,

$$\frac{d^2 q(x)}{dx^2} = \frac{q(x)}{4} \left[ \lambda_0 + \sum_{m=1}^{M} \lambda_m \max(x - k_m, 0) \right] . \tag{24}$$

While the numerical method described in Appendix A can be applied to Eq. 24 by replacing the terms multiplying $q(x)$ on the right-hand side of Eq. 24 with a stepwise approximation or by treating the solution between strike prices as a linear combination of Airy functions, Eq. 24 can be integrated directly using the Numerov method (cf. Appendix B).⁵ The maximum entropy solution can be obtained in a straightforward manner by substituting Eq. 4 into Eq. 23 and varying the $\{\lambda_m\}$ to reproduce the observed option prices.
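To make the direct-integration route concrete, the following is a minimal sketch of a forward Numerov recurrence for an equation written as q'' + G(x) q = 0, the form used in Appendix B. Equation 24 maps onto that form with G(x) = −¼[λ0 + Σ λm max(x − km, 0)]. The function names, the argument order, and the choice to pass two starting values are assumptions made for illustration; the two-sided integration and the matching of logarithmic derivatives described in Appendix B are not reproduced here.

```python
import math


def numerov(G, x0, dx, n_steps, q0, q1):
    """Integrate q'' + G(x) q = 0 forward from q(x0) = q0, q(x0 + dx) = q1.

    Returns the grid points and the solution values (n_steps + 1 of each).
    """
    c = dx * dx / 12.0
    xs = [x0, x0 + dx]
    qs = [q0, q1]
    for n in range(1, n_steps):
        x_prev = x0 + (n - 1) * dx
        x_cur = x0 + n * dx
        x_next = x0 + (n + 1) * dx
        numerator = (2.0 * (1.0 - 5.0 * c * G(x_cur)) * qs[n]
                     - (1.0 + c * G(x_prev)) * qs[n - 1])
        qs.append(numerator / (1.0 + c * G(x_next)))
        xs.append(x_next)
    return xs, qs


def option_G(x, lam0, lams, strikes):
    """G(x) for Eq. 24 rewritten as q'' + G q = 0 (hypothetical helper)."""
    total = lam0 + sum(l * max(x - k, 0.0) for l, k in zip(lams, strikes))
    return -0.25 * total


if __name__ == "__main__":
    # Sanity check on a case with a known answer: G = 1 gives q = cos(x).
    xs, qs = numerov(lambda x: 1.0, 0.0, 0.01, 314, 1.0, math.cos(0.01))
    print(abs(qs[-1] - math.cos(xs[-1])))
```

Checking the recurrence against a closed-form solution first, as in the usage lines, is a cheap way to confirm the step size before attaching it to a search over the multipliers.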
We applied the Fisher information approach to the example discussed in Edelman (2001) of 3 call options on a stock with an annualized volatility σ of 20% and a time to expiration t of 1 year. In this example the interest rate r is set to zero, the stock price S_t is $1.00, and the three options with strike prices k of $0.95, $1.00, and $1.05 have Black-Scholes prices of $0.105, $0.080, and $0.059 respectively. We note that the stock can also be viewed in this example as a call option with a strike price of $0.00. Given these ‘observations’ together with the normalization requirement on p(x) we generated the solutions shown in Fig. 3. The smooth dash-dot curve is the lognormal distribution from which the ‘observed’ option prices were sampled⁶ and is the distribution we are trying to recover from the option prices. The Fisher information result is given by the solid curve, and the MaxEnt result, calibrated to the observed option prices, is the dashed curve. We began by considering the case of the probability density implicit in a single at-the-money (i.e. k = $1.00) option price, and the results are shown in the upper panel. The maximum entropy solution is a sharply peaked product of two exponentials. In contrast, the Fisher information solution has, by virtue of requiring continuity of the first derivative, a much smoother appearance and is much closer in appearance to the lognormal density from which the option price was generated. We do, however, see that with only a single option price the asymmetric features of the lognormal density are not recovered. Adding the option prices at the k = $0.95 and k = $1.05 strikes (from Edelman's example) to the problem results in the densities shown in the lower panel of Fig. 3.
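The three ‘observed’ prices in this example follow from the standard Black-Scholes call formula with the stated inputs. A short sketch, assuming only the parameters given in the text (the function names are illustrative):

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def black_scholes_call(s, k, r, sigma, tau):
    """Black-Scholes price of a European call with time to expiration tau."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


# Inputs from the text: S = 1.00, r = 0, sigma = 20%, one year to expiration
prices = {k: black_scholes_call(1.0, k, 0.0, 0.20, 1.0) for k in (0.95, 1.00, 1.05)}

if __name__ == "__main__":
    for strike, price in prices.items():
        print(f"k = {strike:.2f}  c = {price:.3f}")
```

Setting the strike close to zero returns a price close to the stock price itself, which is the sense in which the stock can be treated as a zero-strike call.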
With this information the agreement between the Fisher information solution and the lognormal density is improved substantially, with the peaks of the density functions nearly the same and the characteristic asymmetry of the lognormal density now seen in the Fisher information result. The maximum entropy solution continues to be sharply peaked and to deviate substantially from the lognormal density. Sharply peaked densities have appeared in previous work involving limited financial observables (see, for example, (Kuwahara & Marsh 1994)) and this feature should not be interpreted as a specific indictment of the maximum entropy approach.

[Figure 3: Probability density functions associated with option prices. Two panels (upper: a single at-the-money option; lower: three options), each showing probability density against stock price ($) for the Fisher, MaxEnt, and lognormal densities.]

In previous work (Hawkins et al. 1996) we generated smooth implied densities from S&P-500 index options by assuming a lognormal prior based on the at-the-money (k = 1) option instead of the uniform prior assumed in Eq. 2 and forming a Lagrangian using the Kullback-Leibler entropy (Kullback 1959)

$$G = -\int_a^b p(x) \ln\left[ \frac{p(x)}{r(x)} \right] dx , \tag{25}$$

where r(x) is the prior - in our case lognormal - density. This points to one of the key differences between the maximum entropy and Fisher information approaches: the assumptions needed. When observations are limited - as they often are in financial economics applications - smooth maximum entropy densities often require a prior density; Fisher information simply imposes a continuity constraint.

3.4 A Measure of Volatility

In addition to providing a way of calculating the density implied by option prices, Fisher information yields, via the Cramer-Rao bound, a simultaneous measurement of the intrinsic uncertainty in this system.
This uncertainty principle is, perhaps, most easily seen if we recall that Fisher information is generally presented in terms of a system specified by a parameter $\theta$ with a probability density $p(x|\theta)$ where

$$\int_a^b p(x|\theta)\,dx = 1 , \tag{26}$$

and the Fisher information is

$$\int_a^b \left[ \frac{\partial p(x|\theta)/\partial\theta}{p(x|\theta)} \right]^2 p(x|\theta)\,dx . \tag{27}$$

For distributions that can be parameterized by a location parameter $\theta$, Fisher information can then be defined with respect to the family of densities $\{p(x - \theta)\}$ (cf. (Cover & Thomas 1991)). This transformation leads directly to Eq. 6 for the Fisher information. It also implies that the mean-square error $e^2$ in the estimate of that location parameter is related to this Fisher information, $I$, by the Cramer-Rao inequality $e^2 I \ge 1$. If, for example, $p(x)$ is a Gaussian, the minimum Cramer-Rao bound is $I = 1/\sigma^2$ with $\sigma^2$ being the variance. Given that the probability density reconstructed from option prices is the conditional density of underlying asset levels at option expiration, and given that a set of option prices is a measurement of the probability density, it follows that the Fisher information, via the Cramer-Rao bound, provides a measure of the mean-square error in the estimate of the level of the underlying asset at expiration provided by the option prices and, thus, a natural generalization of the concept of implied volatility to non-Gaussian underlying-asset distributions.

4 Discussion and Summary

Fisher information provides a variational approach to the estimation of probability laws that are consistent with financial observables. Using this approach one can employ well developed computational approaches from formally identical problems in the physical sciences. The resulting probability laws possess a degree of smoothness consistent with priors concerning the probability density and/or quantities derived therefrom.
In this paper we have shown how Fisher information can be used to solve some canonical inverse problems in financial economics and that the resulting probability densities are generally smoother than those obtained by other methods such as maximum entropy. Fisher information also has the virtue of providing a natural measure of, and computational approach for determining, implied volatility via the Cramer-Rao inequality. Comparing the maximum entropy approach to that of Fisher information, it was seen that the former has the virtue of simplicity, always being an exponential. The Fisher information solution is always a differential equation for the probability density. Although somewhat more complicated, this is, in some sense, a virtue, as the probability densities of physics and, presumably, of finance and economics generally obey differential equations. Indeed, it is in this application that we see the greatest potential for future applications of Fisher information in finance and economics. For example, in addition to the problems examined in Sec. 3, recent developments in macroeconomic modeling have focused upon placing all or nearly all heterogeneous microeconomic agents within one stochastic and dynamic framework. This has the effect of replacing the replicator dynamics of evolutionary games or the Malthusian dynamics of Friedman with the backward Chapman-Kolmogorov equation (Aoki 1998, Aoki 2001). Such a statistical mechanical representation of macroeconomics provides a natural link to Fisher information. The solutions of the Schroedinger equation (Eq. 9) are known to be stationary solutions to the Fokker-Planck equation (Gardiner 1996, Risken 1996).
More generally, however, the entire Legendre-transform structure of thermodynamics can be expressed using Fisher information in place of the Shannon entropy - an essential ingredient for constructing a statistical mechanics (Frieden, Plastino, Plastino & Soffer 1999) - and it has been shown (Frieden, Plastino, Plastino & Soffer 2002) that both equilibrium and non-equilibrium statistical mechanics can be obtained from Eq. 9: the output of a constrained process that extremizes Fisher information. The elegant and powerful Lagrangian approach to physics, in use for over 200 years, has until a recent application of the Fisher information approach (Frieden & Soffer 1995, Frieden 1998) not had a formal method for constructing Lagrangians. This formalism for deriving Lagrangians and probability laws that are consistent with observations should prove to be a very useful tool in finance and economics.

Acknowledgments

We thank Prof. David Edelman for providing a preprint of his work and for stimulating discussions. We also thank Dana Hobson, Prof. David A.B. Miller, Leif Wennerberg, and Prof. Ewan Wright for helpful discussions and suggestions.

Appendix A

To generalize the amplitude-matching that we used in Sec. 3.1 we use a transfer matrix method that is employed often for formally similar problems in quantum electronics (Yeh, Yariv & Hong 1977).⁷ Given a set of discount factors $\{D(T_m)\}$ we seek a solution to

$$\frac{d^2 q(t)}{dt^2} = \frac{q(t)}{4} \left[ \lambda_0 + \sum_{m=1}^{M} \lambda_m \Theta(t - T_m) \right] . \tag{A-1}$$

Between tenors, say $T_{m-1}$ and $T_m$, it is straightforward to show that the probability amplitude $q(t)$ is given by

$$q(t) = A_m e^{i\omega_m (t - T_m)} + B_m e^{-i\omega_m (t - T_m)} , \tag{A-2}$$

where $A_m$ and $B_m$ are coefficients to be determined by the matching conditions, $i = \sqrt{-1}$, and

$$\omega_m = \frac{1}{2} \sqrt{\sum_{j=0}^{m} \lambda_j} , \tag{A-3}$$

with $\{\lambda_j\}$ being the Lagrange undetermined multipliers. We now consider the match across the mth tenor.
Since $t - T_m = 0$, continuity of $q(t)$ gives

$$q(T_m) = A_L + B_L = A_{m+1} + B_{m+1} , \tag{A-4}$$

where the subscript $L$ denotes the value of the coefficients of $q(t)$ immediately to the left of $T_m$. Continuity of the first derivative of $q(t)$ gives

$$A_L - B_L = \Delta_m (A_{m+1} - B_{m+1}) , \tag{A-5}$$

where $\Delta_m = \omega_{m+1}/\omega_m$. These two equations combine to give

$$\begin{pmatrix} A_L \\ B_L \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 + \Delta_m & 1 - \Delta_m \\ 1 - \Delta_m & 1 + \Delta_m \end{pmatrix} \begin{pmatrix} A_{m+1} \\ B_{m+1} \end{pmatrix} . \tag{A-6}$$

Since $A_m = A_L \exp(-i\omega_m \tau_m)$ and $B_m = B_L \exp(i\omega_m \tau_m)$, where $\tau_m = T_{m+1} - T_m$, it follows that the coefficients for $q(t)$ in layer $m$ are related to those in layer $m + 1$ by

$$\begin{pmatrix} A_m \\ B_m \end{pmatrix} = \frac{1}{2} \begin{pmatrix} e^{-i\omega_m \tau_m} & 0 \\ 0 & e^{i\omega_m \tau_m} \end{pmatrix} \begin{pmatrix} 1 + \Delta_m & 1 - \Delta_m \\ 1 - \Delta_m & 1 + \Delta_m \end{pmatrix} \begin{pmatrix} A_{m+1} \\ B_{m+1} \end{pmatrix} . \tag{A-7}$$

Given this relationship we can generate the function $q(t)$ sequentially, beginning with the region beyond the longest tenor that we shall take to be $T_{m+1}$. In this region we require that the solution be a decaying exponential $\exp(-\omega_{m+1} t)$, which implies that $B_{m+1} = 0$ and that $\omega_{m+1}$ be a positive real number. Given a $\{\lambda_m\}$ and setting $A_{m+1} = 1$, it is now a straightforward matter to calculate the fixed income functions described in Sec. 3.1. This transfer matrix procedure maps $\{\lambda_m\}$ into a discount function and, in a straightforward manner, into coupon bond prices.

Appendix B

Our choice of the Numerov method to solve Eq. 24 follows from the popularity of this method for solving this equation in quantum mechanics (Koonin 1986). In this appendix we present a brief overview of the method and its application to the option problem discussed in Section 3.3. The Numerov method provides a numerical solution to⁸

$$\frac{d^2 q(x)}{dx^2} + G(x)\,q(x) = 0 , \tag{B-1}$$

that begins by expanding $q(x)$ in a Taylor series

$$q(x \pm \Delta x) = \sum_{n=0}^{\infty} \frac{(\pm\Delta x)^n}{n!} \frac{d^n q(x)}{dx^n} . \tag{B-2}$$

Combining these two equations and retaining terms to fourth order yields

$$\frac{q(x + \Delta x) + q(x - \Delta x) - 2q(x)}{(\Delta x)^2} = \frac{d^2 q(x)}{dx^2} + \frac{(\Delta x)^2}{12} \frac{d^4 q(x)}{dx^4} + O((\Delta x)^4) . \tag{B-3}$$

The second derivative term in Eq. B-3 can be cleared by substituting Eq. B-1 and the fourth derivative term can be cleared by differentiating Eq.
B-1 two more times

$$\frac{d^4 q(x)}{dx^4} = -\frac{d^2 (G(x) q(x))}{dx^2} \cong -\frac{G(x + \Delta x) q(x + \Delta x) + G(x - \Delta x) q(x - \Delta x) - 2 G(x) q(x)}{(\Delta x)^2} , \tag{B-4}$$

and substituting this result into Eq. B-3. Performing these substitutions and rearranging terms we arrive at the Numerov algorithm:

$$q_{n+1} = \frac{2\left( 1 - \frac{5(\Delta x)^2}{12} G_n \right)}{1 + \frac{(\Delta x)^2}{12} G_{n+1}}\, q_n - \frac{1 + \frac{(\Delta x)^2}{12} G_{n-1}}{1 + \frac{(\Delta x)^2}{12} G_{n+1}}\, q_{n-1} + O((\Delta x)^6) . \tag{B-5}$$

To apply this to the option problem it is useful to note that the calculation can be separated into two stages: first, the calculation of the $\{\lambda_0, q(x)\}$ pair that solves the eigenvalue problem for a fixed set of $\{\lambda_m\}_{m>0}$, and second, the search for the set $\{\lambda_m\}_{m>0}$ for which $q(x)$ of the corresponding $\{\lambda_0, q(x)\}$ pair reproduces the observed option prices. The $\{\lambda_0, q(x)\}$ pair for a given $\{\lambda_m\}_{m>0}$ is obtained by integrating $q(x)$ from both the lower and upper limits of the $x$ range and varying $\lambda_0$ until the logarithmic derivatives of each solution match at an arbitrary mid point. In this example that point was $x = 1$, and $\lambda_0$ was found with a 1-dimensional minimization algorithm using the difference in the logarithmic derivatives as a penalty function and the minimum of the function $\sum_{m=1}^{M} \lambda_m \max(x - k_m, 0)/4$ as the initial guess for $\lambda_0$. Once the logarithmic derivatives are matched the option prices can be calculated and compared with the observed option prices. The results of this comparison form the penalty function for a multidimensional variation of the $\{\lambda_m\}_{m>0}$ to obtain agreement between the observed and calculated option prices.

Notes

1 $\Theta(x) = 0$ if $x < 0$ and 1 otherwise.
2 These particular parameters were chosen so that the trigonometric functions resolve into simple fractions.
3 See, for example, Section 22 of (Landau & Lifshitz 1977).
4 A similar pairing on the minimization of $d^2 f(t)/dt^2$ is seen in the work of Adams & Van Deventer (1994) and the recent entropy work of Edelman (2001).
5 In this paper we focus solely on the equilibrium solutions: those densities that correspond to the lowest value of $\lambda_0$. Equation 24 is solved by a collection of $\{\lambda_0, q(x)\}$ pairs; the solutions $q(x)$ corresponding to other $\lambda_0$ are nonequilibrium solutions. These solutions are the subject of a forthcoming paper.

6 The lognormal distribution is

$$ p(S_T, T \,|\, S_t, t) = \frac{1}{S_T \sqrt{2\pi\sigma^2 (T-t)}} \exp\left( -\frac{\left[ \ln(S_T/S_t) - r(T-t) \right]^2}{2\sigma^2 (T-t)} \right) , $$

from which the Black-Scholes call price

$$ c(k) = S_t N(d_1) - k e^{-r(T-t)} N(d_2) , $$

$$ d_1 \equiv \frac{\ln(S_t/k) + (r + \sigma^2/2)(T-t)}{\sigma \sqrt{T-t}} , $$

$$ d_2 \equiv d_1 - \sigma \sqrt{T-t} , $$

can be derived.

7 Our presentation of this derivation draws, in part, on unpublished optoelectronics notes of Prof. David A.B. Miller.

8 The Numerov method can, in fact, be used to solve numerically the somewhat more general equation

$$ \frac{d^2 q(x)}{dx^2} + G(x) q(x) = S(x) . $$

The development of the algorithm follows the same line as that illustrated in this Appendix.

References

Adams, K. J. & Van Deventer, D. R. (1994), ‘Fitting yield curves and forward rate curves with maximum smoothness’, J. Fixed Income pp. 52–62.
Aoki, M. (1998), New Approaches to Macroeconomic Modeling: Evolutionary Stochastic Dynamics, Multiple Equilibria, and Externalities as Field Effects, Cambridge University Press, New York.
Aoki, M. (2001), Modeling Aggregate Behavior & Fluctuations in Economics: Stochastic Views of Interacting Agents, Cambridge University Press, New York.
Avellaneda, M. (1998), ‘Minimum-relative-entropy calibration of asset-pricing models’, International Journal of Theoretical and Applied Finance 1, 447–472.
Brody, D. C. & Hughston, L. P. (2001), ‘Interest rates and information geometry’, Proc. R. Soc. Lond. A 457, 1343–1364.
Brody, D. C. & Hughston, L. P. (2002), ‘Entropy and information in the interest rate term structure’, Quantitative Finance 2, 70–80.
Buchen, P. W. & Kelley, M. (1996), ‘The maximum entropy distribution of an asset inferred from option prices’, Journal of Financial and Quantitative Analysis 31, 143–159.
Buck, B. & Macaulay, V. A., eds (1990), Maximum Entropy in Action: A Collection of Expository Essays, Oxford University Press, Oxford, UK.
Cover, T. M. & Thomas, J. A. (1991), Elements of Information Theory, Wiley Series in Telecommunications, John Wiley & Sons, Inc., New York.
Edelman, D. (2001), The minimum local cross-entropy criterion for inferring risk-neutral price distributions from traded option prices. Private communication.
Fisher, M., Nychken, D. & Zervos, D. (1995), Fitting the term structure of interest rates with smoothing splines, Technical Report FEDS 95-1, Federal Reserve Board, Washington, D.C.
Fisher, R. A. (1925), ‘Theory of statistical estimation’, Proceedings of the Cambridge Philosophical Society 22, 700–725.
Fomby, T. & Hill, R. C., eds (1997), Applying Maximum Entropy to Econometric Problems, Vol. 12 of Advances in Econometrics, JAI Press Inc., Greenwich, Connecticut.
Frieden, B. R. (1988), ‘Applications to optics and wave mechanics of the criterion of maximum Cramer-Rao bound’, J. Mod. Opt. 35, 1297–1316.
Frieden, B. R. (1998), Physics from Fisher Information, Cambridge University Press, Cambridge.
Frieden, B. R., Plastino, A., Plastino, A. R. & Soffer, B. H. (1999), ‘Fisher-based thermodynamics: Its Legendre transform and concavity properties’, Phys. Rev. E 60, 48–53.
Frieden, B. R., Plastino, A., Plastino, A. R. & Soffer, B. H. (2002), ‘Schroedinger link between nonequilibrium thermodynamics and Fisher information’, Phys. Rev. E 66, 046128.
Frieden, B. R. & Soffer, B. H. (1995), ‘Lagrangians of physics and the game of Fisher-information transfer’, Phys. Rev. E 52, 2274–2286.
Frishling, V. & Yamamura, J. (1996), ‘Fitting a smooth forward rate curve to coupon instruments’, J. Fixed Income pp. 97–103.
Gardiner, C. W. (1996), Handbook of Stochastic Methods: For Physics, Chemistry, and the Natural Sciences, Vol. 13 of Springer Series in Synergetics, second edn, Springer-Verlag, New York.
Golan, A., Miller, D. & Judge, G. G. (1996), Maximum Entropy Econometrics: Robust Estimation with Limited Data, Financial Economics and Quantitative Analysis, John Wiley & Sons, New York.
Hawkins, R. J. (1997), Maximum entropy and derivative securities, in T. Fomby & R. C. Hill, eds, ‘Applying Maximum Entropy to Econometric Problems’, Vol. 12 of Advances in Econometrics, JAI Press Inc., Greenwich, Connecticut, pp. 277–301.
Hawkins, R. J., Rubinstein, M. & Daniell, G. (1996), Reconstruction of the probability density implicit in option prices from incomplete and noisy data, in K. M. Hanson & R. N. Silver, eds, ‘Maximum Entropy and Bayesian Methods’, Vol. 79 of Fundamental Theories of Physics, Kluwer Academic Publishers, Santa Fe, New Mexico, pp. 1–8.
Jackwerth, J. (1999), ‘Option implied risk-neutral distributions and implied binomial trees: A literature review’, Journal of Derivatives 7, 66–82.
Jaynes, E. T. (1968), ‘Prior probabilities’, IEEE Transactions on Systems Science and Cybernetics SSC-4, 227–241.
Koonin, S. E. (1986), Computational Physics, Addison-Wesley, New York.
Kullback, S. (1959), Information Theory and Statistics, Wiley, New York.
Kuwahara, H. & Marsh, T. (1994), ‘Why doesn’t the Black-Scholes model fit Japanese warrants and convertible bonds?’, Japanese Journal of Financial Economics 1, 33–65.
Landau, L. D. & Lifshitz, E. M. (1977), Quantum Mechanics (Non-relativistic Theory), Vol. 3 of Course of Theoretical Physics, third edn, Pergamon Press, New York.
Maasoumi, E. (1993), ‘A compendium to information theory in economics and econometrics’, Econometric Reviews 12, 137–181.
McCullough, J. H. (1975), ‘The tax-adjusted yield curve’, J. Finance XXX, 811–829.
Risken, H. (1996), The Fokker-Planck Equation: Methods of Solution and Applications, Vol. 18 of Springer Series in Synergetics, second edn, Springer-Verlag, New York.
Sengupta, J. K. (1993), Econometrics of Information and Efficiency, Kluwer Academic, Dordrecht.
Shannon, C. E. (1948), ‘A mathematical theory of communication’, The Bell System Technical Journal 27, 379–423.
Shea, G. S. (1985), ‘Term structure estimation with exponential splines’, J. Finance XL, 319–325.
Stutzer, M. (1994), The statistical mechanics of asset prices, in R. Elworth, W. Everitt & E. Lee, eds, ‘Differential Equations, Dynamical Systems, and Control Science - A Festschrift in Honor of Lawrence Markus’, Vol. 152 of Lecture Notes in Pure and Applied Mathematics, Marcel Dekker, New York, pp. 321–342.
Stutzer, M. (1996), ‘A simple nonparametric approach to derivative security pricing’, J. Finance 51, 1633–1652.
Vasicek, O. A. & Fong, H. G. (1982), ‘Term structure modeling using exponential splines’, J. Finance XXXVII, 339–356.
Yeh, P., Yariv, A. & Hong, C. (1977), ‘Electromagnetic propagation in periodic stratified media. I. General theory’, J. Opt. Soc. Am. 67, 423–438.
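
As a supplementary illustration of the Numerov recursion of Appendix B, which advances q(x) one grid step at a time from the two preceding values, the following is a minimal Python sketch and not the authors' implementation. The function name `numerov`, its arguments, and the test case G(x) = 1 (for which q(x) = sin x solves q'' + G q = 0 exactly) are assumptions introduced here for illustration only.

```python
import math


def numerov(G, q0, q1, x0, dx, n_steps):
    """Integrate q'' + G(x) q = 0 on a uniform grid with the Numerov recursion.

    G       : callable returning G(x) (hypothetical interface)
    q0, q1  : starting values q(x0) and q(x0 + dx)
    n_steps : number of grid intervals; the result has n_steps + 1 entries
    """
    c = dx * dx / 12.0
    q = [q0, q1]
    for n in range(1, n_steps):
        x_n = x0 + n * dx
        # Numerator: 2 (1 - 5 c G_n) q_n - (1 + c G_{n-1}) q_{n-1}
        numerator = (2.0 * (1.0 - 5.0 * c * G(x_n)) * q[n]
                     - (1.0 + c * G(x_n - dx)) * q[n - 1])
        # Denominator: 1 + c G_{n+1}
        q.append(numerator / (1.0 + c * G(x_n + dx)))
    return q


# Illustrative check (an assumption, not from the paper): with G(x) = 1
# the exact solution starting from q(0) = 0 is sin(x).
dx = 0.01
q = numerov(lambda x: 1.0, 0.0, math.sin(dx), 0.0, dx, 100)
error = abs(q[-1] - math.sin(1.0))
print(f"Numerov error at x = 1: {error:.2e}")
```

For the two-stage option calculation described in Appendix B, one plausible arrangement would be to call such a routine once from each end of the x range and compare the logarithmic derivatives of the two solutions at the matching point. That outer search over λ0 is not shown here.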