Merton 1990


Chapter 11

CAPITAL MARKET THEORY AND THE PRICING OF
FINANCIAL SECURITIES

ROBERT MERTON*
Harvard University Graduate School of Business Administration

Contents
1. Introduction
2. One-period portfolio selection
3. Risk measures for securities and portfolios in the one-period model
4. Spanning theorems, mutual fund theorems, and bankruptcy constraints
5. Two special models of one-period portfolio selection
6. Intertemporal consumption and portfolio selection theory
7. Consumption and portfolio selection theory in the continuous-time model
8. Options, contingent claims analysis, and the Modigliani-Miller Theorem
9. Bankruptcy, transactions costs, and financial intermediation in the continuous-time model
10. Intertemporal capital asset pricing
References

* This chapter is a revised and expanded version of Merton (1982a). I thank A.M. Eikeboom for
technical assistance and D.A. Hannon for editorial assistance.

Handbook of Monetary Economics, Volume 1, Edited by B.M. Friedman and F.H. Hahn
© Elsevier Science Publishers B.V., 1990
498 R. Merton

1. Introduction

The core of financial economic theory is the study of individual behavior of
households in the intertemporal allocation of their resources in an environment
of uncertainty and of the role of economic organizations in facilitating these
allocations. The intersection between this specialized branch of micro-
economics and macroeconomic monetary theory is most apparent in the
theory of capital markets [cf. Fischer and Merton (1984)]. It is therefore
appropriate on this occasion to focus on the theories of portfolio selection,
capital asset pricing, and the roles that financial markets and intermediaries can
play in improving allocational efficiency.
The complexity of the interaction of time and uncertainty provides intrinsic
excitement to the study of the subject, and as we shall see, the mathematics of
capital market theory contains some of the most interesting applications of
probability and optimization theory. As exemplified by option pricing and
modern portfolio theory, the research with all its seemingly abstruse mathe-
matics has nevertheless had a direct and significant influence on practice. This
conjoining of intrinsic intellectual interest with extrinsic application is, indeed,
a prevailing theme of theoretical research in financial economics.
The tradition in economic theory is to take the existence of households, their
tastes, and endowments as exogenous to the theory. This tradition does not,
however, extend to economic organizations and institutions. They are regarded
as existing primarily because of the functions they serve instead of functioning
primarily because they exist. Economic organizations are endogenous to the
theory. To derive the functions of financial instruments, markets, and inter-
mediaries, a natural starting point is, therefore, to analyze the investment
behavior of individual households.
It is convenient to view the investment decision by households as having two
parts: (1) the "consumption-saving" choice where the individual decides how
much income and wealth to allocate to current consumption and how much to
save for future consumption including bequests; and (2) the "portfolio selec-
tion" choice where the investor decides how to allocate savings among the
available investment opportunities. In general, the two decisions cannot be
made independently. However, many of the important findings in portfolio
theory can be more easily derived in a one-period environment where the
consumption-savings allocation has little substantive impact on the results.
Thus, we begin in Section 2 with the formulation and solution of the basic
portfolio selection problem in a static framework taking as given the individ-
ual's consumption decision.
Using the analysis of Section 2, we derive necessary conditions for static
Ch. 11: Capital Market Theory 499

financial equilibrium that are used to determine restrictions on equilibrium
security prices and returns in Sections 3 and 4. In Sections 4 and 5 these
restrictions are used to derive spanning or mutual fund theorems that provide a
basis for an elementary risk-pooling theory of financial intermediation.
In Section 6 the combined consumption-portfolio selection problem is
formulated in a more realistic and more complex dynamic setting. As shown in
Section 7, dynamic models in which agents can revise and act on their decisions
continuously in time produce significantly sharper results than their discrete-
time counterparts and do so without sacrificing the richness of behavior found
in an intertemporal decision-making environment.
The continuous-trading model is used in Section 8 to derive a theory of
option, corporate-liability, and general derivative-security pricing. In Section 9
the dynamic portfolio strategies used to derive these prices are shown to
provide a theory of production for the creation of risk-sharing instruments by
financial intermediaries. The closing section of the chapter examines inter-
temporal-equilibrium pricing of securities and analyzes the conditions under
which allocations in the continuous-trading model are Pareto efficient.
As is evident from this brief overview of content, the chapter does not cover
a number of important topics in capital market theory. For example, there is
no attempt to make explicit how individuals and institutions acquire the
information needed to make their decisions, and in particular how they modify
their behavior in environments where there are significant differences in the
information available to various participants. Thus, we do not cover either the
informational efficiency of capital markets or the principal-agent problem and
theory of auctions as applied to financial contracting, intermediation, and
markets.¹ Although the analysis is not institutionally based, the context is one
of a domestic economy. Adler and Dumas (1983) provide an excellent survey
article on applications of the theory in an international context.

2. One-period portfolio selection

The basic investment-choice problem for an individual is to determine the
optimal allocation of his or her wealth among the available investment
opportunities. The solution to the general problem of choosing the best

¹ On the informational efficiency of the stock market, see Fama (1965, 1970a), Samuelson
(1965), Hirshleifer (1973), Grossman (1976), Grossman and Stiglitz (1980), Black (1986), and
Merton (1987a, 1987b). On financial markets and incomplete information generally, see the
excellent survey paper by Bhattacharya (1989). On financial markets and auction theory, see
Hansen (1985), Parsons and Raviv (1985), and Rock (1986). On the role of behavioral theory in
finance, see Hogarth and Reder (1986).

investment mix is called portfolio selection theory. The study of portfolio
selection theory begins with its classic one-period or static formulation.
There are n different investment opportunities called securities and the
random variable one-period return per dollar on security j is denoted by
$Z_j$ $(j = 1, \ldots, n)$, where a "dollar" is the "unit of account". Any linear
combination of these securities which has a positive market value is called a
portfolio. It is assumed that the investor chooses at the beginning of a period
that feasible portfolio allocation which maximizes the expected value of a von
Neumann-Morgenstern utility function² for end-of-period wealth. Denote this
utility function by $U(W)$, where W is the end-of-period value of the investor's
wealth measured in dollars. It is further assumed that U is an increasing strictly
concave function on the range of feasible values for W and that U is
twice-continuously differentiable.³ Because the criterion function for choice
depends only on the distribution of end-of-period wealth, the only information
about the securities that is relevant to the investor's decision is his subjective
joint probability distribution for $(Z_1, \ldots, Z_n)$.
In addition, it is assumed that:

Assumption 1. "Frictionless markets". There are no transactions costs or
taxes, and all securities are perfectly divisible.

Assumption 2. "Price taker". The investor believes that his actions cannot
affect the probability distribution of returns on the available securities. Hence,
if $w_j$ is the fraction of the investor's initial wealth $W_0$ allocated to security j,
then $\{w_1, \ldots, w_n\}$ uniquely determines the probability distribution of his
terminal wealth.

A riskless security is defined to be a security or feasible portfolio of securities
whose return per dollar over the period is known with certainty.

Assumption 3. "No-Arbitrage opportunities". All riskless securities must
have the same return per dollar. This common return will be denoted by R.

² von Neumann and Morgenstern (1947). For an axiomatic description, see Herstein and Milnor
(1953) and Machina (1982). Although the original axioms require that U be bounded, the
continuity axiom can be extended to allow for unbounded functions. See Samuelson (1977) for a
discussion of this and the St. Petersburg Paradox.
³ The strict concavity assumption implies that investors are everywhere risk-averse. Although
strictly convex or linear utility functions on the entire range imply behavior that is grossly at
variance with observed behavior, the strict concavity assumption also rules out Friedman-Savage
type utility functions whose behavioral implications are reasonable. Together with the assumption
that U is increasing, strict concavity also implies $U'(W) > 0$, which rules out investor satiation.

Assumption 4. "No-Institutional restrictions". Short sales of all securities,
with full use of proceeds, are allowed without restriction. If there exists a riskless
security, then the borrowing rate equals the lending rate.⁴
Hence, the only restriction on the choice for the $\{w_j\}$ is the budget
constraint that $\sum_{j=1}^{n} w_j = 1$.

Given these assumptions, the portfolio-selection problem can be formally
stated as:

$$\max_{\{w_1, \ldots, w_n\}} E\left\{ U\left( \sum_{j=1}^{n} w_j Z_j W_0 \right) \right\}, \tag{2.1}$$

subject to $\sum_{j=1}^{n} w_j = 1$, where E is the expectation operator for the subjective
joint probability distribution. If $(w_1^*, \ldots, w_n^*)$ is a solution to (2.1), then it will
satisfy the first-order conditions:

$$E\{U'(Z^* W_0) Z_j\} = \lambda / W_0, \quad j = 1, 2, \ldots, n, \tag{2.2}$$

where the prime denotes derivative; $Z^* \equiv \sum_{j=1}^{n} w_j^* Z_j$ is the random variable
return per dollar on the optimal portfolio; and $\lambda$ is the Lagrange multiplier for
the budget constraint. Together with the concavity assumptions on U, if the
$n \times n$ variance-covariance matrix of the returns $(Z_1, \ldots, Z_n)$ is non-singular
and an interior solution exists, then the solution is unique.⁵ This non-singularity
condition on the returns distribution eliminates "redundant" securities (i.e.
securities whose returns can be expressed as exact linear combinations of the
returns on other available securities).⁶ It also rules out that any one of the
securities is a riskless security.
If a riskless security is added to the menu of available securities [call it the
(n + 1)st security], then it is the convention to express (2.1) as the following

⁴ Borrowings and short sales are demand loans collateralized by the investor's total portfolio. The
"borrowing rate" is the rate on riskless-in-terms-of-default loans. Although virtually every
individual loan involves some chance of default, the empirical "spread" in the rate on actual
margin loans to investors suggests that this assumption is not a "bad approximation" for portfolio
selection analysis. However, an explicit analysis of risky loan evaluation and bankruptcy is
provided in Sections 8 and 9.
⁵ The existence of an interior solution is assumed throughout the analyses in the chapter. For a
complete discussion of necessary and sufficient conditions for the existence of an interior solution,
see Leland (1972) and Bertsekas (1974).
⁶ For a trivial example, shares of IBM with odd serial numbers are distinguishable from ones with
even serial numbers and are, therefore, technically different securities. However, because their
returns are identical, they are perfect substitutes from the point of view of investors. In portfolio
theory, securities are operationally defined by their return distributions, and therefore two
securities with identical returns are indistinguishable.

unconstrained maximization problem:

$$\max_{\{w_1, \ldots, w_n\}} E\left\{ U\left( \left[ \sum_{j=1}^{n} w_j (Z_j - R) + R \right] W_0 \right) \right\}, \tag{2.3}$$

where the portfolio allocations to the risky securities are unconstrained because
the fraction allocated to the riskless security can always be chosen to satisfy the
budget constraint (i.e. $w_{n+1}^* = 1 - \sum_{j=1}^{n} w_j^*$). The first-order conditions can be
written as:

$$E\{U'(Z^* W_0)(Z_j - R)\} = 0, \quad j = 1, 2, \ldots, n, \tag{2.4}$$

where $Z^*$ can be rewritten as $\sum_{j=1}^{n} w_j^* (Z_j - R) + R$. Again, if it is assumed that
the variance-covariance matrix of the returns on the risky securities is non-
singular and an interior solution exists, then the solution is unique.
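
As an illustration only, the first-order condition (2.4) can be solved numerically for a small discrete example. The sketch below assumes one risky security with two equally likely returns, a riskless return of 1.05, and logarithmic utility; all of these numbers, and the helper names `foc` and `solve_weight`, are hypothetical choices made for the example and do not come from the chapter. Because U is strictly concave, the left-hand side of (2.4) is decreasing in the portfolio weight, so simple bisection suffices.

```python
# Minimal sketch: solve (2.4) for a single risky security by bisection.
# The return distribution, R, W0 and the log utility are illustrative assumptions.

def foc(w, states, R, W0, marginal_utility):
    """Left-hand side of (2.4): E{U'(Z* W0)(Z - R)} with Z* = w (Z - R) + R."""
    return sum(p * marginal_utility((w * (z - R) + R) * W0) * (z - R)
               for p, z in states)

def solve_weight(states, R, W0, marginal_utility, lo, hi, tol=1e-12):
    """Bisection on the first-order condition; assumes foc(lo) > 0 > foc(hi)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc(mid, states, R, W0, marginal_utility) > 0:
            lo = mid   # expected marginal gain still positive: hold more
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

states = [(0.5, 1.30), (0.5, 0.90)]   # (probability, return per dollar)
R, W0 = 1.05, 100.0

def log_marginal(wealth):
    """U(W) = ln W, so U'(W) = 1/W."""
    return 1.0 / wealth

# The bracket keeps end-of-period wealth positive in both states.
w_star = solve_weight(states, R, W0, log_marginal, lo=-4.0, hi=6.9)
expected_optimal_return = sum(p * (w_star * (z - R) + R) for p, z in states)
print(w_star, expected_optimal_return)
```

In this example the investor borrows at the riskless rate to hold more than all of his wealth in the risky security, which is permitted under Assumption 4.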
As formulated, neither (2.1) nor (2.3) reflects the physical constraint that
end-of-period wealth cannot be negative. Moreover, although Assumption 4
permits short sales and borrowing, no explicit description was given to the
treatment of personal bankruptcy. A proper specification of the portfolio
problem thus requires the additional constraint that $Z^* \ge 0$ with probability
one.⁷
This non-negativity constraint does not by itself address the institutional
rules for bankruptcy, because the probability assessments on the {Zj} are
subjective. A set of rules that does is to forbid borrowing and short selling in
conjunction with limited-liability securities where, by law, $Z_j \ge 0$. These rules
can be formalized as restrictions on the permissible set of $\{w_j\}$, such that
$w_j \ge 0$, $j = 1, 2, \ldots, n + 1$, and (2.1) or (2.3) can be solved using the methods
of Kuhn and Tucker (1951) for inequality constraints. However, imposition of
these specific restrictions generally leads to unnecessary and significant losses in
the allocational efficiency of the capital markets. Moreover, they do not reflect
real-world institutional constraints.
In Sections 4 and 5, using alternative rules, we introduce personal bank-
ruptcy into the static model and analyze the portfolio-selection problem with
the non-negativity constraint on wealth. In Sections 7-9 we formally analyze
intertemporal portfolio behavior and the pricing of securities when investors,
firms, and their creditors all recognize the prospect of default. Those analyses
shall show that the important results derived for the classical, unconstrained
version of the model are robust with respect to these restrictions. Thus, until

⁷ If U is such that $U'(0) = \infty$, and by extension, $U'(W) = \infty$ for $W < 0$, then from (2.2) or (2.4) it is
easy to show that the event $Z^* \le 0$ is a set of measure zero. Mason (1981) and Karatzas,
Lehoczky, Sethi and Shreve (1986) study the effects of various bankruptcy rules on portfolio
behavior.

those sections, the non-negativity constraint on wealth and the treatment of
bankruptcy are ignored.
The optimal demand functions for risky securities, $\{w_j^* W_0\}$, and the resulting
probability distribution for the optimal portfolio will, of course, depend on
the risk preferences of the investor, his initial wealth, and the joint distribution
for the securities' returns. It is well known that the von Neumann-Morgenstern
utility function can only be determined up to a positive affine transformation.
Hence, the preference orderings of all choices available to the investor
are completely specified by the Pratt-Arrow⁸ absolute risk-aversion function,
which can be written as:

$$A(W) \equiv \frac{-U''(W)}{U'(W)}, \tag{2.5}$$

and the change in absolute risk-aversion with respect to a change in wealth is,
therefore, given by:

$$\frac{dA}{dW} = A'(W) = A(W)\left[ A(W) + \frac{U'''(W)}{U''(W)} \right]. \tag{2.6}$$

By the assumption that $U(W)$ is increasing and strictly concave, $A(W)$ is
positive, and such investors are called risk-averse. An alternative, but related,
measure of risk-aversion is the relative risk-aversion function, defined to be:

$$R(W) \equiv \frac{-U''(W) W}{U'(W)} = A(W) W, \tag{2.7}$$

and its change with respect to a change in wealth is given by:

R'(W) = A'(W)W + A(W). (2.8)
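
As a quick numerical check of (2.5) to (2.8), the sketch below evaluates the risk-aversion functions for an isoelastic (power) utility function, for which absolute risk-aversion is gamma/W and relative risk-aversion is the constant gamma. The choice of utility function, the parameter values, and the function names are assumptions made for illustration; they are not taken from the chapter.

```python
# Minimal sketch: risk-aversion measures for U(W) = W**(1 - gamma) / (1 - gamma),
# an illustrative utility function with gamma > 0 and gamma != 1.

def crra_derivatives(W, gamma):
    """Return U'(W), U''(W), U'''(W) for the power utility above."""
    u1 = W ** (-gamma)
    u2 = -gamma * W ** (-gamma - 1)
    u3 = gamma * (gamma + 1) * W ** (-gamma - 2)
    return u1, u2, u3

def absolute_risk_aversion(W, gamma):
    u1, u2, _ = crra_derivatives(W, gamma)
    return -u2 / u1                          # equation (2.5)

def absolute_risk_aversion_slope(W, gamma):
    _, u2, u3 = crra_derivatives(W, gamma)
    A = absolute_risk_aversion(W, gamma)
    return A * (A + u3 / u2)                 # equation (2.6)

def relative_risk_aversion(W, gamma):
    return absolute_risk_aversion(W, gamma) * W   # equation (2.7)

W, gamma = 50.0, 3.0
A = absolute_risk_aversion(W, gamma)
A_slope = absolute_risk_aversion_slope(W, gamma)
rel = relative_risk_aversion(W, gamma)
rel_slope = A_slope * W + A                  # equation (2.8)
print(A, A_slope, rel, rel_slope)
```

For this utility function absolute risk-aversion declines with wealth while relative risk-aversion is constant, so the slope computed from (2.8) is zero up to rounding.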

The certainty-equivalent end-of-period wealth, $W_c$, associated with a given
portfolio for end-of-period wealth whose random variable value is denoted by
W, is defined to be that value such that

$$U(W_c) = E\{U(W)\}, \tag{2.9}$$

i.e. Wc is the amount of money such that the investor is indifferent between
having this amount of money for certain or the portfolio with random variable
outcome W. The term "risk-averse" as applied to investors with strictly

⁸ The behavior associated with the utility function $V(W) \equiv aU(W) + b$, $a > 0$, is identical to that
associated with $U(W)$. Note: $A(W)$ is invariant to any positive affine transformation of $U(W)$. See
Pratt (1964).

concave utility functions is descriptive in the sense that the certainty-equivalent
end-of-period wealth is always less than the expected value of the associated
portfolio, $E\{W\}$, for all such investors. The proof follows directly by Jensen's
Inequality: if U is strictly concave, then

$$U(W_c) = E\{U(W)\} < U(E\{W\}),$$

whenever W has positive dispersion, and because U is an increasing function of
W, $W_c < E\{W\}$.
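
The inequality can be seen in a small worked example. The sketch below computes the certainty-equivalent from definition (2.9) for a two-outcome gamble under logarithmic utility; the gamble, the utility function, and the helper name `certainty_equivalent` are illustrative assumptions.

```python
import math

# Minimal sketch: certainty-equivalent wealth from (2.9) for a discrete gamble.
# The gamble and the log utility are illustrative assumptions.

def certainty_equivalent(outcomes, utility, inverse_utility):
    """Solve U(W_c) = E{U(W)} by inverting the utility function."""
    expected_utility = sum(p * utility(w) for p, w in outcomes)
    return inverse_utility(expected_utility)

outcomes = [(0.5, 50.0), (0.5, 150.0)]          # (probability, end-of-period wealth)
expected_wealth = sum(p * w for p, w in outcomes)
W_c = certainty_equivalent(outcomes, math.log, math.exp)
print(expected_wealth, W_c)
```

With logarithmic utility the certainty-equivalent is the geometric mean of the outcomes, which lies below their arithmetic mean, as Jensen's Inequality requires.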
The certainty-equivalent can be used to compare the risk-aversions of two
investors. An investor is said to be more risk-averse than a second investor if,
for every portfolio, the certainty-equivalent end-of-period wealth for the first
investor is less than or equal to the certainty-equivalent end-of-period wealth
associated with the same portfolio for the second investor with strict inequality
holding for at least one portfolio.
While the certainty-equivalent provides a natural definition for comparing
risk-aversions across investors, Rothschild and Stiglitz⁹ have in a corresponding
fashion attempted to define the meaning of "increasing risk" for a security so
that the "riskiness" of two securities or portfolios can be compared. In
comparing two portfolios with the same expected values, the first portfolio with
random variable outcome denoted by $W_1$ is said to be less risky than the second
portfolio with random variable outcome denoted by $W_2$ if

$$E\{U(W_1)\} \ge E\{U(W_2)\}, \tag{2.10}$$

for all concave U with strict inequality holding for some concave U. They
bolster their argument for this definition by showing its equivalence to the
following two other definitions:

There exists a random variable Z such that $W_2$ has the same distribution as
$W_1 + Z$, where the conditional expectation of Z given the outcome of $W_1$ is
zero (i.e. $W_2$ is equal in distribution to $W_1$ plus some "noise"). (2.11)

If the points of F and G, the distribution functions of $W_1$ and $W_2$, are
confined to the closed interval $[a, b]$, and $T(y) \equiv \int_a^y [G(x) - F(x)]\,dx$, then
$T(y) \ge 0$ and $T(b) = 0$ (i.e. $W_2$ has more "weight in its tails" than $W_1$). (2.12)
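
The three characterizations can be checked against each other on a small discrete example. In the sketch below, $W_2$ is built as $W_1$ plus independent zero-mean noise, as in (2.11); the integral condition (2.12) is then evaluated on a grid, and (2.10) is checked for two concave utility functions. The distributions, the grid, and the function names are hypothetical. The code uses the fact that, for a discrete distribution with support in $[a, b]$, the integral of the distribution function from a to y equals the expected value of max(y - W, 0).

```python
import math

# Minimal sketch: Rothschild-Stiglitz comparison of two discrete distributions.
# All distributions below are illustrative assumptions.

def integral_of_cdf(dist, y):
    """Integral of the distribution function from a to y, i.e. E[max(y - W, 0)]."""
    return sum(p * max(y - w, 0.0) for p, w in dist)

def T(dist_1, dist_2, y):
    """T(y) from (2.12), with F the distribution of W1 and G that of W2."""
    return integral_of_cdf(dist_2, y) - integral_of_cdf(dist_1, y)

def expected_utility(dist, utility):
    return sum(p * utility(w) for p, w in dist)

W1 = [(0.5, 90.0), (0.5, 110.0)]
noise = [(0.5, -10.0), (0.5, 10.0)]
# W2 = W1 + noise with noise independent of W1, so E[noise | W1] = 0 as in (2.11).
W2 = [(p * q, w + z) for p, w in W1 for q, z in noise]

a, b = 80.0, 120.0
grid = [a + k * (b - a) / 400 for k in range(401)]
T_values = [T(W1, W2, y) for y in grid]

log_gap = expected_utility(W1, math.log) - expected_utility(W2, math.log)
sqrt_gap = expected_utility(W1, math.sqrt) - expected_utility(W2, math.sqrt)
print(min(T_values), T_values[-1], log_gap, sqrt_gap)
```

Checking two utility functions does not, of course, establish (2.10) for all concave U; the point of the equivalence is that the integral condition does.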

⁹ Rothschild and Stiglitz (1970, 1971). There is an extensive literature, not discussed here, that
uses this type of risk measure to determine when one portfolio "stochastically dominates" another.
Cf. Hadar and Russell (1969, 1971), Hanoch and Levy (1969), and Bawa (1975).

A feasible portfolio with returns per dollar Z will be called an efficient
portfolio if there exists an increasing, strictly concave function V such that
$E\{V'(Z)(Z_j - R)\} = 0$, $j = 1, 2, \ldots, n$. Using the Rothschild-Stiglitz definition
of "less risky", a feasible portfolio will be an efficient portfolio only if
there does not exist another feasible portfolio which is less risky than it is. All
portfolios that are not efficient are called inefficient portfolios.
From the definition of an efficient portfolio, it follows that no two portfolios
in the efficient set can be ordered with respect to one another. From (2.10) it
follows immediately that every efficient portfolio is a possible optimal port-
folio, i.e. for each efficient portfolio there exists an increasing, concave U and
an initial wealth W0 such that the efficient portfolio is a solution to (2.1) or
(2.3). Furthermore, from (2.10), all risk-averse investors will be indifferent
between selecting their optimal portfolios from the set of all feasible portfolios
or from the set of all efficient portfolios. Hence, without loss of generality,
assume that all optimal portfolios are efficient portfolios.
With these general definitions established, we now turn to the analysis of the
optimal demand functions for risky assets and their implications for the
distributional characteristics of the underlying securities. A note on notation:
the symbol "$Z_e$" will be used to denote the random variable return per dollar
on an efficient portfolio, and a bar over a random variable (e.g. $\bar{Z}$) will denote
the expected value of that random variable.

Theorem 2.1. If Z denotes the random variable return per dollar on any
feasible portfolio and if $(Z_e - \bar{Z}_e)$ is riskier than $(Z - \bar{Z})$ in the Rothschild and
Stiglitz sense, then $\bar{Z}_e > \bar{Z}$.

Proof. By hypothesis, $E\{U([Z - \bar{Z}]W_0)\} > E\{U([Z_e - \bar{Z}_e]W_0)\}$. If $\bar{Z} \ge \bar{Z}_e$,
then trivially, $E\{U(ZW_0)\} > E\{U(Z_e W_0)\}$. But Z is a feasible portfolio
and $Z_e$ is an efficient portfolio. Hence, by contradiction, $\bar{Z}_e > \bar{Z}$.

Corollary 2.1a. If there exists a riskless security with return R, then $\bar{Z}_e \ge R$,
with equality holding only if $Z_e$ is a riskless security.

Proof. The riskless security is a feasible portfolio with expected return
R. If $Z_e$ is riskless, then by Assumption 3, $\bar{Z}_e = R$. If $Z_e$ is not riskless, then
$(Z_e - \bar{Z}_e)$ is riskier than $(R - R)$. Therefore, by Theorem 2.1, $\bar{Z}_e > R$.

Theorem 2.2. The optimal portfolio for a non-satiated, risk-averse investor will
be the riskless security (i.e. $w_{n+1}^* = 1$, $w_j^* = 0$, $j = 1, 2, \ldots, n$) if and only if
$\bar{Z}_j = R$ for $j = 1, 2, \ldots, n$.

Proof. From (2.4), $\{w_1^*, \ldots, w_n^*\}$ will satisfy $E\{U'(Z^* W_0)(Z_j - R)\} = 0$. If
$\bar{Z}_j = R$, $j = 1, 2, \ldots, n$, then $Z^* = R$ will satisfy these first-order conditions. By
the strict concavity of U and the non-singularity of the variance-covariance
matrix of returns, this solution is unique. This proves the "if" part. If $Z^* = R$ is
an optimal solution, then we can rewrite (2.4) as $U'(RW_0)E(Z_j - R) = 0$. By
the non-satiation assumption, $U'(RW_0) > 0$. Therefore, for $Z^* = R$ to be an
optimal solution, $\bar{Z}_j = R$, $j = 1, 2, \ldots, n$. This proves the "only if" part.

Hence, from Corollary 2.1a and Theorem 2.2, if a risk-averse investor chooses
a risky portfolio, then the expected return on that portfolio exceeds the riskless
rate, and a risk-averse investor will choose a risky portfolio if at least one
available security has an expected return different from the riskless rate.
Define the notation $E(Y \mid X_1, \ldots, X_q)$ to mean the conditional expectation
of the random variable Y, conditional on knowing the realizations for the
random variables $(X_1, \ldots, X_q)$.

Theorem 2.3. Let $Z_p$ denote the return on any portfolio p that does not contain
security s. If there exists a portfolio p such that for security s, $Z_s = Z_p + \varepsilon_s$,
where $E(\varepsilon_s) = E(\varepsilon_s \mid Z_j, j = 1, \ldots, n, j \ne s) = 0$, then the fraction of every
efficient portfolio allocated to security s is the same and equal to zero.

Proof. The proof follows by contradiction. Suppose $Z_e$ is the return on an
efficient portfolio with fraction $\delta_s \ne 0$ allocated to security s. Let Z be the
return on a portfolio with the same fractional holdings as $Z_e$ except instead of
security s, it holds the fraction $\delta_s$ in feasible portfolio $Z_p$. Hence, $Z_e =
Z + \delta_s(Z_s - Z_p)$ or $Z_e = Z + \delta_s \varepsilon_s$. By hypothesis, $\bar{Z}_e = \bar{Z}$, and because
portfolio Z does not contain security s, by construction, $E(\varepsilon_s \mid Z) = 0$. Therefore,
for $\delta_s \ne 0$, $Z_e$ is riskier than Z in the Rothschild-Stiglitz sense. But this
contradicts the hypothesis that $Z_e$ is an efficient portfolio. Hence, $\delta_s = 0$ for
every efficient portfolio.

Corollary 2.3a. Let $\psi$ denote the set of n securities with returns $(Z_1, \ldots, Z_{s-1},
Z_s, Z_{s+1}, \ldots, Z_n)$ and $\psi'$ denote the same set of securities, except
$Z_s$ is replaced with $Z_{s'}$. If $Z_{s'} = Z_s + \varepsilon_s$ and $E(\varepsilon_s) = E(\varepsilon_s \mid Z_1, \ldots, Z_{s-1},
Z_s, Z_{s+1}, \ldots, Z_n) = 0$, then all risk-averse investors would prefer to choose
their optimal portfolios from $\psi$ rather than $\psi'$.

The proof is essentially the same as the proof of Theorem 2.3, with $Z_s$
replacing $Z_p$. Unless the holdings of $Z_s$ in every efficient portfolio are zero, $\psi$
will be strictly preferred to $\psi'$.
Theorem 2.3 and its corollary demonstrate that all risk-averse investors
would prefer any "unnecessary" uncertainty or "noise" to be eliminated. In
particular, by this theorem, the existence of lotteries is shown to be inconsistent
with strict risk-aversion on the part of all investors.¹⁰ While the inconsistency
of strict risk-aversion with observed behavior such as betting on the
numbers can be "explained" by treating lotteries as consumption goods, it is
difficult to use this argument to explain other implicit lotteries such as callable,
sinking-fund bonds where the bonds to be redeemed are selected at random.
As illustrated by the partitioning of the feasible portfolio set into its efficient
and inefficient parts and the derived theorems, the Rothschild-Stiglitz defini-
tion of increasing risk is quite useful for studying the properties of optimal
portfolios. However, it is important to emphasize that these theorems apply
only to efficient portfolios and not to individual securities or inefficient
portfolios. For example, if $(Z_j - \bar{Z}_j)$ is riskier than $(Z - \bar{Z})$ in the Rothschild-
Stiglitz sense and if security j is held in positive amounts in an efficient or
optimal portfolio (i.e. $w_j^* > 0$), then it does not follow that $\bar{Z}_j$ must equal
or exceed $\bar{Z}$. In particular, if $w_j^* > 0$, it does not follow that $\bar{Z}_j$ must equal or
exceed R. Hence, to know that one security is riskier than a second security
using the Rothschild-Stiglitz definition of increasing risk provides no norma-
tive restrictions on holdings of either security in an efficient portfolio. And
because this definition of riskier imposes no restrictions on the optimal
demands, it cannot be used to derive properties of individual securities' return
distributions from observing their relative holdings in an efficient portfolio. To
derive these properties, a second definition of risk is required. Development of
this measure is the topic of Section 3.

3. Risk measures for securities and portfolios in the one-period model

In the previous section it was suggested that the Rothschild-Stiglitz measure is
not a natural definition of risk for a security. In this section a second definition
of increasing risk is introduced, and it is argued that this second measure is a
more appropriate definition for the risk of a security. Although this second
measure will not in general provide the same orderings as the Rothschild-
Stiglitz measure, it is further argued that the two measures are not in conflict,
and indeed, are complementary.
If $Z_e^K$ is the random variable return per dollar on an efficient portfolio K,
then let $V_K(Z_e^K)$ denote an increasing, strictly concave function such that, for
$V_K' \equiv dV_K / dZ_e^K$,

$$E\{V_K'(Z_j - R)\} = 0, \quad j = 1, 2, \ldots, n,$$

i.e. $V_K$ is a concave utility function such that an investor with initial wealth

¹⁰ I believe that Christian von Weizsäcker proved a similar theorem in unpublished notes some
years ago. However, I do not have a reference.

$W_0 = 1$ and these preferences would select this efficient portfolio as his optimal
portfolio. While such a function $V_K$ will always exist, it will not be unique. If
$\operatorname{cov}[x_1, x_2]$ is the functional notation for the covariance between the random
variables $x_1$ and $x_2$, then define the random variable, $Y_K$, by:

$$Y_K \equiv \frac{V_K' - E\{V_K'\}}{\operatorname{cov}[V_K', Z_e^K]}. \tag{3.1}$$

$Y_K$ is well defined as long as $Z_e^K$ has positive dispersion because $\operatorname{cov}[V_K', Z_e^K] < 0$.¹¹
It is understood that in the following discussion "efficient portfolio" will
mean "efficient portfolio with positive dispersion". Let $Z_p$ denote the random
variable return per dollar on any feasible portfolio p.

Definition. The measure of risk of portfolio p relative to efficient portfolio K
with random variable return $Z_e^K$, $b_p^K$, is defined by:

$$b_p^K \equiv \operatorname{cov}[Y_K, Z_p],$$

and portfolio p is said to be riskier than portfolio p' relative to efficient portfolio
K if $b_p^K > b_{p'}^K$.

Theorem 3.1. If $Z_p$ is the return on a feasible portfolio p and $Z_e^K$ is the return
on efficient portfolio K, then $\bar{Z}_p - R = b_p^K(\bar{Z}_e^K - R)$.

Proof. From the definition of $V_K$, $E\{V_K'(Z_j - R)\} = 0$, $j = 1, 2, \ldots, n$. Let $\delta_j$
be the fraction of portfolio p allocated to security j. Then, $Z_p = \sum_{j=1}^{n} \delta_j (Z_j -
R) + R$, and $\sum_{j=1}^{n} \delta_j E\{V_K'(Z_j - R)\} = E\{V_K'(Z_p - R)\} = 0$. By a similar
argument, $E\{V_K'(Z_e^K - R)\} = 0$. Hence, $\operatorname{cov}[V_K', Z_e^K] = (R - \bar{Z}_e^K)E\{V_K'\}$ and
$\operatorname{cov}[V_K', Z_p] = (R - \bar{Z}_p)E\{V_K'\}$. By Corollary 2.1a, $\bar{Z}_e^K > R$. Therefore,
$\operatorname{cov}[Y_K, Z_p] = (R - \bar{Z}_p)/(R - \bar{Z}_e^K)$.
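
Theorem 3.1 can be verified numerically on a small example. The sketch below assumes four equally likely states, two risky securities, a riskless return of 1.02, and an investor with logarithmic utility and $W_0 = 1$; every number and function name here is hypothetical. The investor's optimal portfolio is found by a damped Newton iteration on the first-order conditions (2.4), it then serves as the efficient portfolio K with $V_K' = 1/Z_e^K$, and the risk measure of an arbitrary feasible portfolio p is computed from (3.1) and the Definition.

```python
# Minimal sketch: numerical check of Theorem 3.1 in a four-state example.
# Returns, probabilities, R and the log utility are illustrative assumptions.

probs = [0.25, 0.25, 0.25, 0.25]
Z1 = [1.30, 1.10, 0.95, 0.90]
Z2 = [1.00, 1.25, 1.15, 0.85]
R = 1.02
X1 = [z - R for z in Z1]                  # excess returns on security 1
X2 = [z - R for z in Z2]                  # excess returns on security 2

def mean(v):
    return sum(p * x for p, x in zip(probs, v))

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum(p * (a - mu) * (b - mv) for p, a, b in zip(probs, u, v))

def portfolio_return(w1, w2):
    """Return per dollar with fractions w1, w2 in the risky securities."""
    return [R + w1 * a + w2 * b for a, b in zip(X1, X2)]

def log_optimal_weights(max_iter=50):
    """Newton ascent on E{ln Z}; its stationary point satisfies (2.4) with U' = 1/Z."""
    w1, w2 = 0.0, 0.0
    for _ in range(max_iter):
        Z = portfolio_return(w1, w2)
        g1 = mean([a / z for a, z in zip(X1, Z)])
        g2 = mean([b / z for b, z in zip(X2, Z)])
        if abs(g1) + abs(g2) < 1e-14:
            break
        h11 = -mean([a * a / z ** 2 for a, z in zip(X1, Z)])
        h22 = -mean([b * b / z ** 2 for b, z in zip(X2, Z)])
        h12 = -mean([a * b / z ** 2 for a, b, z in zip(X1, X2, Z)])
        det = h11 * h22 - h12 * h12
        d1 = -(h22 * g1 - h12 * g2) / det     # Newton direction, first component
        d2 = -(h11 * g2 - h12 * g1) / det     # Newton direction, second component
        t = 1.0
        while min(portfolio_return(w1 + t * d1, w2 + t * d2)) <= 0.0:
            t *= 0.5                          # keep wealth positive in every state
        w1, w2 = w1 + t * d1, w2 + t * d2
    return w1, w2

w1_K, w2_K = log_optimal_weights()
Z_e = portfolio_return(w1_K, w2_K)            # efficient portfolio K
V_prime = [1.0 / z for z in Z_e]              # V_K' for log utility and W0 = 1
scale = cov(V_prime, Z_e)
Y_K = [(v - mean(V_prime)) / scale for v in V_prime]   # equation (3.1)

Z_p = portfolio_return(0.3, -0.5)             # an arbitrary feasible portfolio p
b_p = cov(Y_K, Z_p)
lhs = mean(Z_p) - R
rhs = b_p * (mean(Z_e) - R)
print(lhs, rhs)
```

The portfolio p in this example is short security 2 and has a negative risk measure, so its expected return lies below the riskless rate, exactly as the theorem's proportionality requires.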

Hence, the expected excess return on portfolio p, $\bar{Z}_p - R$, is in direct proportion
to its risk, and because $\bar{Z}_e^K > R$, the larger is its risk, the larger is its
expected return. Thus, Theorem 3.1 provides the first argument why $b_p^K$ is a
natural measure of risk for individual securities.
A second argument goes as follows. Consider an investor with utility
function U and initial wealth $W_0$ who solves the portfolio selection problem:

$$\max_{w} E\{U([wZ_j + (1 - w)Z]W_0)\},$$

¹¹ For a proof, see Theorem 236 in Hardy, Littlewood and Pólya (1959).

where Z is the return on a portfolio of securities and $Z_j$ is the return on
security j. The optimal mix, $w^*$, will satisfy the first-order condition:

$$E\{U'([w^* Z_j + (1 - w^*)Z]W_0)(Z_j - Z)\} = 0. \tag{3.2}$$

If the original portfolio of securities chosen was this investor's optimal portfolio
(i.e. $Z = Z^*$), then the solution to (3.2) is $w^* = 0$. However, an optimal
portfolio is an efficient portfolio. Therefore, by Theorem 3.1, $\bar{Z}_j - R =
b_j^*(\bar{Z}^* - R)$. Hence, the "risk-return tradeoff" provided in Theorem 3.1 is a
condition for personal portfolio equilibrium. Indeed, because security j may be
contained in the optimal portfolio, $w^* W_0$ is similar to an excess demand
function. $b_j^*$ measures the contribution of security j to the Rothschild-Stiglitz
risk of the optimal portfolio in the sense that the investor is just indifferent to a
marginal change in the holdings of security j provided that $\bar{Z}_j - R = b_j^*(\bar{Z}^* -
R)$. Moreover, by the Implicit Function Theorem, we have from (3.2) that

$$\frac{\partial w^*}{\partial \bar{Z}_j} = \frac{w^* W_0 E\{U''(Z - Z_j)\} - E\{U'\}}{W_0 E\{U''(Z - Z_j)^2\}} > 0, \quad \text{at } w^* = 0. \tag{3.3}$$

Therefore, if $\bar{Z}_j$ lies above the "risk-return" line in the $(\bar{Z}, b^*)$ plane, then the
investor would prefer to increase his holdings in security j, and if $\bar{Z}_j$ lies below
the line, then he would prefer to reduce his holdings. If the risk of a security
increases, then the risk-averse investor must be "compensated" by a corre-
sponding increase in that security's expected return if his current holdings are
to remain unchanged.
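
The sign result in (3.3) can be illustrated with a two-state example. The sketch below assumes a log-utility investor whose optimal portfolio $Z^*$ has already been found (for a riskless return of 1.05 and a risky security returning 1.30 or 0.90 with equal probability, the optimum takes the values 1.40 and 0.84). It then solves (3.2) for the marginal holding $w^*$ of security j when the security's return is shifted up or down by a constant, which changes its expected return but not its dispersion. All numbers and function names are hypothetical.

```python
# Minimal sketch: marginal demand for security j around an optimal portfolio.
# Log utility and the two-state returns are illustrative assumptions; with log
# utility the initial wealth W0 scales out of the first-order condition.

def mix_foc(w, probs, Z_j, Z_star):
    """Condition (3.2) for U = ln: E{(Z_j - Z*) / (w Z_j + (1 - w) Z*)}."""
    return sum(p * (zj - zs) / (w * zj + (1 - w) * zs)
               for p, zj, zs in zip(probs, Z_j, Z_star))

def optimal_mix(probs, Z_j, Z_star, lo=-10.0, hi=12.0):
    """Bisection; assumes the condition is positive at lo and negative at hi."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mix_foc(mid, probs, Z_j, Z_star) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

probs = [0.5, 0.5]
Z_security = [1.30, 0.90]      # security j
Z_star = [1.40, 0.84]          # the investor's optimal portfolio

def shifted(delta):
    """Security j with its return raised by delta in every state."""
    return [z + delta for z in Z_security]

w_base = optimal_mix(probs, shifted(0.0), Z_star)
w_up = optimal_mix(probs, shifted(0.01), Z_star)
w_down = optimal_mix(probs, shifted(-0.01), Z_star)
print(w_base, w_up, w_down)
```

At the original expected return the investor wants no change in holdings; a higher expected return with unchanged dispersion produces a positive excess demand, and a lower one a negative excess demand.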
A third argument for why $b_p^K$ is a natural measure of risk for individual
securities is that the ordering of securities by their systematic risk relative to a
given efficient portfolio will be identical to their ordering relative to any other
efficient portfolio. That is, given the set of available securities, there is an
unambiguous meaning to the statement "security j is riskier than security i".
To show this equivalence along with other properties of the $b_p^K$ measure, we
first prove a lemma.

Lemma 3.1. (a) $E[Z_p \mid V_K'] = E[Z_p \mid Z_e^K]$ for efficient portfolio K. (b) If
$E[Z_p \mid Z_e^K] = \bar{Z}_p$, then $\operatorname{cov}[Z_p, V_K'] = 0$. (c) $\operatorname{cov}[Z_p, V_K'] = 0$ for efficient
portfolio K if and only if $\operatorname{cov}[Z_p, V_L'] = 0$ for every efficient portfolio L.

Proof. (a) V~: is a continuous, monotonic function of Z~ and hence, V~:


and Zff are in one-to-one correspondence. (b) cov[Zp, V~] = E [ V ~ ( Z p -
Zp)] = E { V ~ E [ Z p - (c) By definition, bpK=0 if and only if
cov[Zp, V~:] = 0. From Theorem 3.1, if bpK=0, then Zp = R. From Corollary
510 R. Merton

2.1a, 2~ > R for every efficient portfolio L. Thus, from Theorem 3.1, bpL = 0 if
and only if Zp -- R.

Properties of the $b_p^K$ measure of risk are:

Property 1. If $L$ and $K$ are efficient portfolios, then for any portfolio $p$,
$b_p^K = b_L^K b_p^L$.

From Corollary 2.1a, $\bar{Z}_e^K > R$ and $\bar{Z}_e^L > R$. From Theorem 3.1,
$b_L^K = (\bar{Z}_e^L - R)/(\bar{Z}_e^K - R)$,
$b_p^K = (\bar{Z}_p - R)/(\bar{Z}_e^K - R)$, and
$b_p^L = (\bar{Z}_p - R)/(\bar{Z}_e^L - R)$. Hence, the $b_p^K$ measure satisfies a
type of "chain rule" with respect to different efficient portfolios.
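
A small arithmetic check of the chain rule may help. The sketch below uses the
ratio form of the risk measure implied by Theorem 3.1; the expected returns are
made-up numbers and the helper name `beta` is an assumption of the example.

```python
# Illustrative expected gross returns (assumptions, not taken from the text).
R = 1.03        # riskless return
Z_K = 1.09      # expected return on efficient portfolio K
Z_L = 1.06      # expected return on efficient portfolio L
Z_p = 1.12      # expected return on an arbitrary portfolio p


def beta(mean_return, mean_reference, riskless):
    """Risk measure implied by Theorem 3.1: (Zbar - R) / (Zbar_reference - R)."""
    return (mean_return - riskless) / (mean_reference - riskless)


b_p_K = beta(Z_p, Z_K, R)   # risk of p relative to K
b_L_K = beta(Z_L, Z_K, R)   # risk of L relative to K
b_p_L = beta(Z_p, Z_L, R)   # risk of p relative to L
```

With these numbers the product `b_L_K * b_p_L` reproduces `b_p_K`, which is
Property 1.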

Property 2. If $L$ and $K$ are efficient portfolios, then $b_K^K = 1$ and $b_K^L > 0$.

Property 2 follows from Theorem 3.1 and Corollary 2.1a. Hence, all efficient
portfolios have positive systematic risk, relative to any efficient portfolio.

Property 3. $\bar{Z}_p = R$ if and only if $b_p^K = 0$ for every efficient portfolio $K$.

Property 3 follows from Theorem 3.1 and Properties 1 and 2.

Property 4. Let $p$ and $q$ denote any two feasible portfolios and let $K$ and $L$
denote any two efficient portfolios. $b_p^K \gtreqless b_q^K$ if and only if
$b_p^L \gtreqless b_q^L$.

Property 4 follows from Property 3 if $b_p^L = b_q^L = 0$. Suppose $b_p^L \neq 0$.
Then Property 4 follows from Properties 1 and 2 because
$b_q^L / b_p^L = (b_K^L b_q^K)/(b_K^L b_p^K) = b_q^K / b_p^K$. Thus, the $b_p^K$
measure provides the same orderings of risk for any reference efficient portfolio.

Property 5. For each efficient portfolio $K$ and any feasible portfolio $p$,
$Z_p = R + b_p^K(Z_e^K - R) + \varepsilon_p$, where $E(\varepsilon_p) = 0$ and
$E[\varepsilon_p V_L'(Z_e^L)] = 0$ for every efficient portfolio $L$.

From Theorem 3.1, $E(\varepsilon_p) = 0$. If portfolio $q$ is constructed by holding
one dollar in portfolio $p$, $b_p^K$ dollars in the riskless security, and short
selling $b_p^K$ dollars of efficient portfolio $K$, then $Z_q = R + \varepsilon_p$.
From Property 3, $\bar{Z}_q = R$ implies that $b_q^L = 0$ for every efficient
portfolio $L$. But $b_q^L = 0$ implies $0 = \mathrm{cov}[Z_q, V_L'] =
E[\varepsilon_p V_L']$ for every efficient portfolio $L$.

Property 6. If a feasible portfolio $p$ has portfolio weights
$(\delta_1, \ldots, \delta_n)$, then $b_p^K = \sum_1^n \delta_j b_j^K$.

Property 6 follows directly from the linearity of the covariance operator with
respect to either of its arguments. Hence, the systematic risk of a portfolio is
the weighted sum of the systematic risks of its component securities.
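
The linearity behind Property 6 can be checked on a small discrete example. In
the sketch below the vector `Y` is only a hypothetical stand-in for
$Y(Z_e^K)$, and the state returns and weights are invented for illustration;
the point is solely that the covariance of `Y` with a portfolio equals the
weighted sum of its covariances with the component securities.

```python
def mean(values, probs):
    """Expectation of a discrete random variable."""
    return sum(p * v for p, v in zip(probs, values))


def cov(xs, ys, probs):
    """Covariance of two random variables defined on the same states."""
    mx, my = mean(xs, probs), mean(ys, probs)
    return sum(p * (x - mx) * (y - my) for p, x, y in zip(probs, xs, ys))


PROBS = [0.25, 0.25, 0.25, 0.25]
Y = [-1.5, -0.5, 0.5, 1.5]          # hypothetical stand-in for Y(Z_e^K)
SECURITIES = [
    [1.00, 1.02, 1.05, 1.10],
    [1.20, 1.00, 0.95, 0.90],
    [1.03, 1.03, 1.03, 1.03],       # riskless security
]
WEIGHTS = [0.5, 0.3, 0.2]           # portfolio weights delta_j, summing to one

# State-by-state return on the portfolio with these weights.
portfolio = [
    sum(w * z[s] for w, z in zip(WEIGHTS, SECURITIES))
    for s in range(len(PROBS))
]

b_securities = [cov(Y, z, PROBS) for z in SECURITIES]
b_portfolio = cov(Y, portfolio, PROBS)
b_weighted_sum = sum(w * b for w, b in zip(WEIGHTS, b_securities))
```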
The Rothschild-Stiglitz measure of risk is clearly different from the $b_j^K$
measure here. The Rothschild-Stiglitz measure provides only for a partial
ordering, while the $b_j^K$ measure provides a complete ordering. Moreover, they
can give different rankings. For example, suppose the return on security $j$ is
independent of the return on efficient portfolio $K$, then $b_j^K = 0$ and
$\bar{Z}_j = R$. Trivially, $b_j^K = 0$ for the riskless security. Therefore, by
the $b_j^K$ measure, security $j$ and the riskless security have equal risk.
However, if security $j$ has positive variance, then by the Rothschild-Stiglitz
measure, security $j$ is more risky than the riskless security. Despite this, the
two measures are not in conflict and, indeed, are complementary. The
Rothschild-Stiglitz definition measures the "total risk" of a security in the sense
that it compares the expected utility from holding a security alone with the
expected utility from holding another security alone. Hence, it is the appropriate
definition for identifying optimal portfolios and determining the efficient
portfolio set. However, it is not useful for defining the risk of securities
generally because it does not take into account that investors can mix securities
together to form portfolios. The $b_j^K$ measure does take this into account
because it measures the only part of an individual security's risk which is
relevant to an investor: namely, the part that contributes to the total risk of his
optimal portfolio. In contrast to the Rothschild-Stiglitz measure of total risk,
the $b_j^K$ measures the "systematic risk" of a security (relative to efficient
portfolio $K$). Of course, to determine the $b_j^K$, the efficient portfolio set
must be determined. Because the Rothschild-Stiglitz measure does just that, the two
measures are complementary.
Although the expected return of a security provides an equivalent ranking to
its $b_p^K$ measure, the $b_p^K$ measure is not vacuous. There exist non-trivial
information sets which allow $b_p^K$ to be determined without knowledge of
$\bar{Z}_p$. For example, consider a model in which all investors agree on the
joint distribution of the returns on securities. Suppose we know the utility
function $U$ for some investor and the probability distribution of his optimal
portfolio, $Z^* W_0$. From (3.2) we therefore know the distribution of $Y(Z^*)$.
For security $j$, define the random variable
$\varepsilon_j \equiv Z_j - \bar{Z}_j$. Suppose, furthermore, that we have enough
information about the joint distribution of $Y(Z^*)$ and $\varepsilon_j$ to compute
$\mathrm{cov}[Y(Z^*), \varepsilon_j] = \mathrm{cov}[Y(Z^*), Z_j] = b_j^*$, but do
not know $\bar{Z}_j$.$^{12}$ However, Theorem 3.1 is a necessary condition for
equilibrium in the securities market. Hence, we can deduce the equilibrium expected
return on security $j$ from $\bar{Z}_j = R + b_j^*(\bar{Z}^* - R)$. Analysis of the
necessary information sets required to deduce the equilibrium structure of security
returns is an important topic in portfolio theory and one that will be explored
further in succeeding sections.

$^{12}$ A sufficient amount of information would be the joint distribution of $Z^*$
and $\varepsilon_j$. What is necessary will depend on the functional form of $U'$.
However, in no case will knowledge of $\bar{Z}_j$ be a necessary condition.

The manifest behavioral characteristic shared by all risk-averse utility maxi-
mizers is to diversify (i.e. to spread one's wealth among many investments).
The benefits of diversification in reducing risk depend upon the degree of
statistical interdependence among returns on the available investments. The
greatest benefits in risk reduction come from adding a security to the portfolio
whose realized return tends to be higher when the return on the rest of the
portfolio is lower. Next to such "counter-cyclical" investments in terms of
benefit are the non-cyclic securities whose returns are orthogonal to the return
on the portfolio. Least beneficial are the pro-cyclical investments whose returns
tend to be higher when the return on the portfolio is higher and lower when
the return on the portfolio is lower. A natural summary statistic for this
characteristic of a security's return distribution is its conditional expected-
return function, conditional on the realized return of the portfolio. Because the
risk of a security is measured by its marginal contribution to the risk of an
optimal portfolio, it is perhaps not surprising that there is a direct relation
between the risk measure of portfolio $p$, $b_p$, and the behavior of the
conditional expected-return function, $G_p(Z_e) \equiv E[Z_p \mid Z_e]$, where
$Z_e$ is the realized return on an efficient portfolio.

Theorem 3.2. If $Z_p$ and $Z_q$ denote the returns on portfolios $p$ and $q$,
respectively, and if for each possible value of $Z_e$,
$dG_p(Z_e)/dZ_e \geq dG_q(Z_e)/dZ_e$ with strict inequality holding over some
finite probability measure of $Z_e$, then portfolio $p$ is riskier than portfolio
$q$ and $\bar{Z}_p > \bar{Z}_q$.

Proof. From (3.1) and the linearity of the covariance operator, $b_p - b_q =
\mathrm{cov}[Y(Z_e), Z_p - Z_q] = E[Y(Z_e)(Z_p - Z_q)]$ because $E[Y(Z_e)] = 0$.
By the property of conditional expectations, $E[Y(Z_e)(Z_p - Z_q)] =
E(Y(Z_e)[G_p(Z_e) - G_q(Z_e)]) = \mathrm{cov}[Y(Z_e), G_p(Z_e) - G_q(Z_e)]$.
Thus, $b_p - b_q = \mathrm{cov}[Y(Z_e), G_p(Z_e) - G_q(Z_e)]$. From (3.1),
$Y(Z_e)$ is a strictly increasing function of $Z_e$ and, by hypothesis,
$G_p(Z_e) - G_q(Z_e)$ is a non-decreasing function of $Z_e$ for all $Z_e$ and a
strictly increasing function of $Z_e$ over some finite probability measure of
$Z_e$. From Theorem 236 in Hardy, Littlewood and Pólya (1959), it follows that
$\mathrm{cov}[Y(Z_e), G_p(Z_e) - G_q(Z_e)] > 0$, and therefore, $b_p > b_q$. From
Theorem 3.1, it follows that $\bar{Z}_p > \bar{Z}_q$.

Theorem 3.3. If $Z_p$ and $Z_q$ denote the returns on portfolios $p$ and $q$,
respectively, and if, for each possible value of $Z_e$,
$dG_p(Z_e)/dZ_e - dG_q(Z_e)/dZ_e = a_{pq}$, a constant, then
$b_p = b_q + a_{pq}$ and $\bar{Z}_p = \bar{Z}_q + a_{pq}(\bar{Z}_e - R)$.

Proof. By hypothesis, $G_p(Z_e) - G_q(Z_e) = a_{pq} Z_e + h$, where $h$ does not
depend on $Z_e$. As in the proof of Theorem 3.2, $b_p - b_q =
\mathrm{cov}[Y(Z_e), G_p(Z_e) - G_q(Z_e)] = \mathrm{cov}[Y(Z_e), a_{pq} Z_e + h]$.
Thus, $b_p - b_q = a_{pq}$ because $\mathrm{cov}[Y(Z_e), Z_e] = 1$ and
$\mathrm{cov}[Y(Z_e), h] = 0$. From Theorem 3.1, $\bar{Z}_p =
R + b_q(\bar{Z}_e - R) + a_{pq}(\bar{Z}_e - R) = \bar{Z}_q + a_{pq}(\bar{Z}_e - R)$.
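
The covariance arguments in the proofs of Theorems 3.2 and 3.3 can be
illustrated numerically. The sketch below builds a variable `Y` with the two
properties the proofs rely on, $E[Y] = 0$ and $\mathrm{cov}[Y, Z_e] = 1$, from a
logarithmic utility function; that functional form, the state returns, and the
two "gap" functions standing in for $G_p - G_q$ are all assumptions of the
example.

```python
def mean(values, probs):
    return sum(p * v for p, v in zip(probs, values))


def cov(xs, ys, probs):
    mx, my = mean(xs, probs), mean(ys, probs)
    return sum(p * (x - mx) * (y - my) for p, x, y in zip(probs, xs, ys))


PROBS = [0.2, 0.3, 0.3, 0.2]
Z_E = [0.90, 1.00, 1.10, 1.25]            # realized returns on an efficient portfolio

# Marginal utility V'(Z_e) = 1 / Z_e for log utility (an assumed utility function).
marginal_utility = [1.0 / z for z in Z_E]
mu_mean = mean(marginal_utility, PROBS)
mu_cov = cov(marginal_utility, Z_E, PROBS)    # negative: V' falls as Z_e rises
Y = [(mu - mu_mean) / mu_cov for mu in marginal_utility]

# Theorem 3.3 case: the gap G_p - G_q is linear in Z_e with slope a_pq.
A_PQ, H = 0.8, 0.05
linear_gap = [A_PQ * z + H for z in Z_E]

# Theorem 3.2 case: the gap is non-decreasing and strictly increasing on part
# of the support (a kinked function chosen for illustration).
kinked_gap = [max(z - 1.0, 0.0) for z in Z_E]

b_gap_linear = cov(Y, linear_gap, PROBS)      # equals a_pq
b_gap_kinked = cov(Y, kinked_gap, PROBS)      # strictly positive
```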

Theorem 3.4. If, for all possible values of $Z_e$,

(i) $dG_p(Z_e)/dZ_e > 1$, then $\bar{Z}_p > \bar{Z}_e$;
(ii) $0 < dG_p(Z_e)/dZ_e < 1$, then $R < \bar{Z}_p < \bar{Z}_e$;
(iii) $dG_p(Z_e)/dZ_e < 0$, then $\bar{Z}_p < R$;
(iv) $dG_p(Z_e)/dZ_e = a_p$, a constant, then $\bar{Z}_p = R + a_p(\bar{Z}_e - R)$.

The proof follows directly from Theorems 3.2 and 3.3 by substituting either
$Z_e$ or $R$ for $Z_q$ and noting that $dG_q(Z_e)/dZ_e = 1$ for $Z_q = Z_e$ and
$dG_q(Z_e)/dZ_e = 0$ for $Z_q = R$.
As Theorems 3.2-3.4 demonstrate, the conditional expected-return function
provides considerable information about a security's risk and equilibrium
expected return. It is, moreover, common practice for security analysts to
provide conditioned forecasts of individual security returns, conditioned on the
realized return of a broad-based stock portfolio such as the Standard & Poor's
500. As is evident from these theorems, the conditional expected-return
function does not in general provide sufficient information to determine the
exact risk of a security. As follows from Theorems 3.3 and 3.4(iv), the
exception is the case where this function is linear in $Z_e$. Although surely a
special case, it is a rather important one as will be shown in Section 4.

4. Spanning theorems, mutual fund theorems, and bankruptcy constraints

Definition. A set of $M$ feasible portfolios with random variable returns
$(X_1, \ldots, X_M)$ is said to span the space of portfolios contained in the set
$\Psi$ if and only if for any portfolio in $\Psi$ with return denoted by $Z_p$,
there exist numbers $(\delta_1, \ldots, \delta_M)$, $\sum_1^M \delta_j = 1$, such
that $Z_p = \sum_1^M \delta_j X_j$.

If $N$ is the number of securities available to generate the portfolios in $\Psi$
and if $M^*$ denotes the smallest number of feasible portfolios that span the space
of portfolios contained in $\Psi$, then $M^* \leq N$.
Fischer (1972) and Merton (1982a, pp. 611-614) use comparative statics
analysis to show that little can be derived about the structure of optimal
portfolio demand functions unless further restrictions are imposed on the class
of investors' utility functions or the class of probability distributions for
securities' returns. A particularly fruitful set of such restrictions is the one that
provides for a non-trivial (i.e. $M^* < N$) spanning of either the feasible or
efficient portfolio sets. Indeed, the spanning property leads to a collection of
"mutual fund" or "separation" theorems that are fundamental to modern
financial theory.
A mutual fund is a financial intermediary that holds as its assets a portfolio
of securities and issues as liabilities shares against this collection of assets.
Unlike the optimal portfolio of an individual investor, the portfolio of secur-
ities held by a mutual fund need not be an efficient portfolio. The connection
between mutual funds and the spanning property can be seen in the following
theorem:

Theorem 4.1. If there exist $M$ mutual funds whose portfolios span the portfolio
set $\Psi$, then all investors will be indifferent between selecting their optimal
portfolios from $\Psi$ or from portfolio combinations of just the $M$ mutual funds.

The proof of the theorem follows directly from the definition of spanning. If
$Z^*$ denotes the return on an optimal portfolio selected from $\Psi$ and if $X_j$
denotes the return on the $j$th mutual fund's portfolio, then there exist portfolio
weights $(\delta_1^*, \ldots, \delta_M^*)$ such that
$Z^* = \sum_1^M \delta_j^* X_j$. Hence, any investor would be indifferent between
the portfolio with return $Z^*$ and the $(\delta_1^*, \ldots, \delta_M^*)$
combination of the mutual fund shares.
Although the theorem states "indifference", if there are information-gather-
ing or other transactions costs and if there are economies of scale, then
investors would prefer the mutual funds whenever M < N. By a similar
argument, one would expect that investors would prefer to have the smallest
number of funds necessary to span $\Psi$. Therefore, the smallest number of such
funds, M*, is a particularly important spanning set. Hence, the spanning
property can be used to derive an endogenous theory for the existence of
financial intermediaries with the functional characteristics of a mutual fund.
Moreover, from these functional characteristics a theory for their optimal
management can be derived.
For the mutual fund theorems to have serious empirical content, the
minimum number of funds required for spanning M* must be significantly
smaller than the number of available securities N. When such spanning
obtains, the investor's portfolio-selection problem can be separated into two
steps: first, individual securities are mixed together to form the M* mutual
funds; second, the investor allocates his wealth among the M* funds' shares. If
the investor knows that the funds span the space of optimal portfolios, then he
need only know the joint probability distribution of $(X_1, \ldots, X_{M^*})$ to
determine his optimal portfolio. It is for this reason that the mutual fund
theorems are also called "separation" theorems. However, if the M* funds can
be constructed only if the fund managers know the preferences, endowments,
and probability beliefs of each investor, then the formal separation property
will have little operational significance.
In addition to providing an endogenous theory for mutual funds, the
existence of a non-trivial spanning set can be used to deduce equilibrium
properties of individual securities' returns and to derive optimal rules for
business firms making production and capital budgeting decisions. Moreover,
in virtually every model of portfolio selection in which empirical implications
beyond those presented in Sections 2 and 3 are derived, some non-trivial form
of the spanning property obtains.
While the determination of conditions under which non-trivial spanning will
obtain is, in a broad sense, a subset of the traditional economic theory of
aggregation, the first rigorous contributions in portfolio theory were made by
Arrow (1953, 1964), Markowitz (1959), and Tobin (1958). In each of these
papers, and most subsequent papers, the spanning property is derived as an
implication of the specific model examined, and therefore such derivations
provide only sufficient conditions. In two notable exceptions, Cass and Stiglitz
(1970) and Ross (1978) "reverse" the process by deriving necessary conditions
for non-trivial spanning to obtain. In this section necessary and sufficient
conditions for spanning are developed along the lines of Cass and Stiglitz and
Ross, leaving until Section 5 discussion of the specific models of Arrow,
Markowitz and Tobin.
Let $\Psi^f$ denote the set of all feasible portfolios that can be constructed from
a riskless security with return $R$ and $n$ risky securities with a given joint
probability distribution for their random variable returns $(Z_1, \ldots, Z_n)$.
Let $\Omega$ denote the $n \times n$ variance-covariance matrix of the returns on
the $n$ risky assets.

Theorem 4.2. Necessary conditions for the $M$ feasible portfolios with returns
$(X_1, \ldots, X_M)$ to span the portfolio set $\Psi^f$ are (i) that the rank of
$\Omega \leq M$ and (ii) that there exist numbers $(\delta_1, \ldots, \delta_M)$,
$\sum_1^M \delta_j = 1$, such that the random variable $\sum_1^M \delta_j X_j$ has
zero variance.

Proof. (i) The set of portfolios $\Psi^f$ defines a $(n + 1)$ dimensional vector
space. By definition, if $(X_1, \ldots, X_M)$ spans $\Psi^f$, then each risky
security's return can be represented as a linear combination of
$(X_1, \ldots, X_M)$. Clearly, this is only possible if the rank of
$\Omega \leq M$. (ii) The riskless security is contained in $\Psi^f$. Therefore, if
$(X_1, \ldots, X_M)$ spans $\Psi^f$, then there must exist a portfolio combination
of $(X_1, \ldots, X_M)$ which is riskless.

Proposition 4.1. If $Z_p = \sum_1^n a_j Z_j + b$ is the return on some security or
portfolio and if there are no "arbitrage opportunities" (Assumption 3), then
(1) $b = [1 - \sum_1^n a_j] R$ and (2) $Z_p = R + \sum_1^n a_j (Z_j - R)$.

Proof. Let $Z^\dagger$ be the return on a portfolio with fraction
$\delta_j^\dagger$ allocated to security $j$, $j = 1, \ldots, n$; $\delta_p$
allocated to the security with return $Z_p$; and
$(1 - \delta_p - \sum_1^n \delta_j^\dagger)$ allocated to the riskless security
with return $R$. If $\delta_j^\dagger$ is chosen such that
$\delta_j^\dagger = -\delta_p a_j$, then
$Z^\dagger = R + \delta_p (b - R[1 - \sum_1^n a_j])$. $Z^\dagger$ is a riskless
security, and therefore, by Assumption 3, $Z^\dagger = R$. But $\delta_p$ can be
chosen arbitrarily. Therefore, $b = [1 - \sum_1^n a_j] R$. Substituting for $b$, it
follows directly that $Z_p = R + \sum_1^n a_j (Z_j - R)$.
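
The hedging construction in this proof can be traced with numbers. In the
sketch below the state returns, the coefficients `A`, and the function name
`hedged_portfolio` are hypothetical; the code simply forms the portfolio
described in the proof and shows that its return is the same in every state,
so any intercept other than the no-arbitrage value would yield a riskless
return different from $R$.

```python
R = 1.05
Z = [[0.90, 1.10, 1.30],      # risky security 1 across three states
     [1.20, 1.00, 0.95]]      # risky security 2 across three states
A = [0.7, -0.2]               # coefficients a_j in Z_p = sum_j a_j Z_j + b


def hedged_portfolio(b, delta_p):
    """State-by-state return on the portfolio built in the proof."""
    n_states = len(Z[0])
    z_p = [sum(a_j * z[s] for a_j, z in zip(A, Z)) + b for s in range(n_states)]
    delta = [-delta_p * a_j for a_j in A]            # offsetting positions
    riskless_weight = 1.0 - delta_p - sum(delta)
    return [
        sum(d * z[s] for d, z in zip(delta, Z))
        + delta_p * z_p[s]
        + riskless_weight * R
        for s in range(n_states)
    ]


b_no_arbitrage = (1.0 - sum(A)) * R
fair = hedged_portfolio(b_no_arbitrage, delta_p=2.0)
mispriced = hedged_portfolio(b_no_arbitrage + 0.01, delta_p=2.0)
```

Here `fair` returns $R$ in every state, while `mispriced` returns
$R + 2 \times 0.01$ in every state, a riskless return above $R$ that Assumption 3
rules out.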

As long as there are no arbitrage opportunities, from Theorem 4.2 and
Proposition 4.1 it can be assumed without loss of generality that one of the
portfolios in any candidate spanning set is the riskless security. If, by
convention, $X_M = R$, then in all subsequent analyses the notation
$(X_1, \ldots, X_m, R)$ will be used to denote an $M$-portfolio spanning set where
$m \equiv M - 1$ is the number of risky portfolios (together with the riskless
security) that span $\Psi^f$.

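Theorem 4.3. A necessary and sufficient condition for $(X_1, \ldots, X_m, R)$ to
span $\Psi^f$ is that there exist numbers $(a_{ij})$ such that
$Z_j = R + \sum_1^m a_{ij}(X_i - R)$, $j = 1, 2, \ldots, n$.

Proof. If $(X_1, \ldots, X_m, R)$ span $\Psi^f$, then there exist portfolio weights
$(\delta_{1j}, \ldots, \delta_{Mj})$, $\sum_1^M \delta_{ij} = 1$, such that
$Z_j = \sum_1^M \delta_{ij} X_i$. Noting that $X_M = R$ and substituting
$\delta_{Mj} = 1 - \sum_1^m \delta_{ij}$, we have that
$Z_j = R + \sum_1^m \delta_{ij}(X_i - R)$. This proves necessity. If there exist
numbers $(a_{ij})$ such that $Z_j = R + \sum_1^m a_{ij}(X_i - R)$, then pick the
portfolio weights $\delta_{ij} = a_{ij}$ for $i = 1, \ldots, m$, and
$\delta_{Mj} = 1 - \sum_1^m \delta_{ij}$, from which it follows that
$Z_j = \sum_1^M \delta_{ij} X_i$. But every portfolio in $\Psi^f$ can be written as
a portfolio combination of $(Z_1, \ldots, Z_n)$ and $R$. Hence,
$(X_1, \ldots, X_m, R)$ spans $\Psi^f$ and this proves sufficiency.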
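
The sufficiency argument can be followed on a toy example in which a third
security is a known linear combination of two funds and the riskless security.
All returns, holdings, and loadings below are invented for illustration. The
sketch converts an investor's direct holdings of securities into equivalent
holdings of the funds and confirms that the two portfolios pay the same return
in every state.

```python
R = 1.05
X1 = [0.90, 1.00, 1.10, 1.30]     # fund 1 (here simply security 1)
X2 = [1.20, 1.00, 0.95, 1.05]     # fund 2 (here simply security 2)

# Security 3 satisfies Z_3 = R + 2 (X1 - R) - 1 (X2 - R): loadings (2, -1).
LOADINGS = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0]]   # a_ij for securities j = 1, 2, 3
Z3 = [R + 2.0 * (x1 - R) - 1.0 * (x2 - R) for x1, x2 in zip(X1, X2)]

# Direct holdings in the three risky securities; the remainder is riskless.
w = [0.2, 0.3, 0.1]
direct = [
    w[0] * z1 + w[1] * z2 + w[2] * z3 + (1.0 - sum(w)) * R
    for z1, z2, z3 in zip(X1, X2, Z3)
]

# Equivalent fund holdings: delta_i = sum_j w_j a_ij, remainder riskless.
fund_weights = [sum(w_j * a[i] for w_j, a in zip(w, LOADINGS)) for i in range(2)]
riskless_weight = 1.0 - sum(fund_weights)
via_funds = [
    fund_weights[0] * x1 + fund_weights[1] * x2 + riskless_weight * R
    for x1, x2 in zip(X1, X2)
]
```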

Let $\Omega_X$ denote the $m \times m$ variance-covariance matrix of the returns on
the $m$ portfolios with returns $(X_1, \ldots, X_m)$.

Corollary 4.3a. A necessary and sufficient condition for $(X_1, \ldots, X_m, R)$ to
be the smallest number of feasible portfolios that span (i.e. $M^* = m + 1$) is
that the rank of $\Omega$ equals the rank of $\Omega_X = m$.

Proof. If $(X_1, \ldots, X_m, R)$ span $\Psi^f$ and $m$ is the smallest number of
risky portfolios that does, then $(X_1, \ldots, X_m)$ must be linearly independent,
and therefore rank $\Omega_X = m$. Hence, $(X_1, \ldots, X_m)$ form a basis for the
vector space of security returns $(Z_1, \ldots, Z_n)$. Therefore, the rank of
$\Omega$ must equal the rank of $\Omega_X$. This proves necessity. If the rank of
$\Omega_X = m$, then $(X_1, \ldots, X_m)$ are linearly independent. Moreover,
$(X_1, \ldots, X_m) \in \Psi^f$. Hence, if the rank of $\Omega = m$, then there
exist numbers $(a_{ij})$ such that
$Z_j - \bar{Z}_j = \sum_1^m a_{ij}(X_i - \bar{X}_i)$ for $j = 1, 2, \ldots, n$.
Therefore, $Z_j = b_j + \sum_1^m a_{ij} X_i$, where
$b_j \equiv \bar{Z}_j - \sum_1^m a_{ij} \bar{X}_i$. By the same argument as that
used to prove Proposition 4.1, $b_j = [1 - \sum_1^m a_{ij}] R$. Therefore,
$Z_j = R + \sum_1^m a_{ij}(X_i - R)$. By Theorem 4.3, $(X_1, \ldots, X_m, R)$ span
$\Psi^f$.
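
The rank condition lends itself to a direct check. The sketch below computes
exact ranks of small variance-covariance matrices with rational arithmetic;
the four-state returns, the helper names `covariance_matrix` and `rank`, and
the two candidate third securities are assumptions made for the example. A
redundant third security leaves the rank of $\Omega$ equal to the rank of
$\Omega_X$, whereas a non-redundant one raises it.

```python
from fractions import Fraction as F


def covariance_matrix(returns, probs):
    """Variance-covariance matrix of state-by-state returns (exact arithmetic)."""
    means = [sum(p * x for p, x in zip(probs, r)) for r in returns]
    return [
        [
            sum(p * (ri[s] - mi) * (rj[s] - mj) for s, p in enumerate(probs))
            for rj, mj in zip(returns, means)
        ]
        for ri, mi in zip(returns, means)
    ]


def rank(matrix):
    """Rank by Gauss-Jordan elimination; exact because entries are Fractions."""
    rows = [list(row) for row in matrix]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found


PROBS = [F(1, 4)] * 4
R = F(21, 20)
X1 = [F(9, 10), F(1), F(11, 10), F(13, 10)]
X2 = [F(6, 5), F(1), F(19, 20), F(21, 20)]
Z3 = [R + 2 * (x1 - R) - (x2 - R) for x1, x2 in zip(X1, X2)]   # redundant
Z4 = [F(1), F(6, 5), F(9, 10), F(11, 10)]                      # not redundant

rank_omega_x = rank(covariance_matrix([X1, X2], PROBS))
rank_omega_redundant = rank(covariance_matrix([X1, X2, Z3], PROBS))
rank_omega_full = rank(covariance_matrix([X1, X2, Z4], PROBS))
```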

It follows from Corollary 4.3a that a necessary and sufficient condition for
non-trivial spanning of $\Psi^f$ is that some of the risky securities are redundant
securities. Note, however, that this condition is sufficient only if securities are
priced such that there are no arbitrage opportunities.
In all these derived theorems the only restriction on investors' preferences
was that they prefer more to less. In particular, it was not assumed that
investors are necessarily risk-averse. Although $\Psi^f$ was defined in terms of a
known joint probability distribution for $(Z_1, \ldots, Z_n)$, which implies
homogeneous beliefs among investors, inspection of the proof of Theorem 4.3 shows
that this condition can be weakened. If investors agree on a set of portfolios
$(X_1, \ldots, X_m, R)$ such that $Z_j = R + \sum_1^m a_{ij}(X_i - R)$,
$j = 1, 2, \ldots, n$, and if they agree on the numbers $(a_{ij})$, then by Theorem
4.3, $(X_1, \ldots, X_m, R)$ span $\Psi^f$ even if investors do not agree on the
joint distribution of $(X_1, \ldots, X_m)$.
These appear to be the weakest restrictions on preferences and probability
beliefs that can produce non-trivial spanning and the corresponding mutual
fund theorem. Hence, to derive additional theorems it is now further assumed
that all investors are risk-averse and that investors have homogeneous prob-
ability beliefs.
Define $\Psi^e$ to be the set of all efficient portfolios contained in $\Psi^f$.

Proposition 4.2. If $Z_e$ is the return on a portfolio contained in $\Psi^e$, then
any portfolio that combines positive amounts of $Z_e$ with the riskless security is
also contained in $\Psi^e$.

Proof. Let $Z = \delta(Z_e - R) + R$ be the return on a portfolio with positive
fraction $\delta$ allocated to $Z_e$ and fraction $(1 - \delta)$ allocated to the
riskless security. Because $Z_e$ is an efficient portfolio, there exists a strictly
concave, increasing function $V$ such that $E\{V'(Z_e)(Z_j - R)\} = 0$,
$j = 1, 2, \ldots, n$. Define $U(W) \equiv V(aW + b)$, where
$a \equiv 1/\delta > 0$ and $b \equiv (\delta - 1)R/\delta$. Because $a > 0$, $U$
is a strictly concave and increasing function. Moreover, $U'(Z) = aV'(Z_e)$. Hence,
$E\{U'(Z)(Z_j - R)\} = 0$, $j = 1, 2, \ldots, n$. Therefore, there exists a utility
function such that $Z$ is an optimal portfolio, and thus $Z$ is an efficient
portfolio.
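
The change of variables in this proof is easy to verify numerically. The
sketch below assumes a logarithmic $V$ and invented state returns, and the
helper `numerical_derivative` is a hypothetical name; it checks that
$aZ + b$ recovers $Z_e$ state by state and that $U'(Z) = aV'(Z_e)$.

```python
import math

R = 1.04
DELTA = 0.4                                   # positive fraction held in Z_e
Z_E = [0.85, 1.00, 1.15, 1.40]                # returns on the efficient portfolio
Z = [DELTA * (z - R) + R for z in Z_E]        # mix with the riskless security

a = 1.0 / DELTA
b = (DELTA - 1.0) * R / DELTA


def V(wealth):
    """Assumed utility function for which Z_e is optimal (log utility)."""
    return math.log(wealth)


def V_prime(wealth):
    return 1.0 / wealth


def U(wealth):
    """Utility constructed in the proof: U(W) = V(aW + b)."""
    return V(a * wealth + b)


def numerical_derivative(f, x, step=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


recovered = [a * z + b for z in Z]                     # should equal Z_E
U_prime_at_Z = [numerical_derivative(U, z) for z in Z]
scaled_V_prime = [a * V_prime(ze) for ze in Z_E]
```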

It follows immediately from Proposition 4.2 that for every number $\bar{Z}$ such
that $\bar{Z} \geq R$, there exists at least one efficient portfolio with expected
return equal to $\bar{Z}$. Moreover, we also have that if $(X_1, \ldots, X_M)$ are
the returns on $M$ candidate portfolios to span the space of efficient portfolios
$\Psi^e$, then without loss of generality it can be assumed that one of the
portfolios is the riskless security.

Theorem 4.4. Let $(X_1, \ldots, X_m)$ denote the returns on $m$ feasible
portfolios. If for security $j$ there exist numbers $(a_{ij})$ such that
$Z_j = \bar{Z}_j + \sum_1^m a_{ij}(X_i - \bar{X}_i) + \varepsilon_j$, where
$E[\varepsilon_j V_K'(Z_e^K)] = 0$ for some efficient portfolio $K$, then
$\bar{Z}_j = R + \sum_1^m a_{ij}(\bar{X}_i - R)$.

Proof. Let $Z_p$ be the return on a portfolio with fraction $\delta$ allocated to
security $j$; fraction $\delta_i = -\delta a_{ij}$ allocated to portfolio $X_i$,
$i = 1, \ldots, m$; and $1 - \delta - \sum_1^m \delta_i$ allocated to the riskless
security. By hypothesis, $Z_p$ can be written as
$Z_p = R + \delta[\bar{Z}_j - R - \sum_1^m a_{ij}(\bar{X}_i - R)] +
\delta\varepsilon_j$, where
$E[\delta\varepsilon_j V_K'] = \delta E[\varepsilon_j V_K'] = 0$. By construction,
$E(\varepsilon_j) = 0$, and hence, $\mathrm{cov}[Z_p, V_K'] = 0$. Therefore, the
systematic risk of portfolio $p$, $b_p^K = 0$. From Theorem 3.1, $\bar{Z}_p = R$.
But $\delta$ can be chosen arbitrarily. Therefore,
$\bar{Z}_j = R + \sum_1^m a_{ij}(\bar{X}_i - R)$.

Hence, if the return on a security can be written in this linear form relative
to the portfolios $(X_1, \ldots, X_m)$, then its expected excess return,
$\bar{Z}_j - R$, is completely determined by the expected excess returns on these
portfolios and the weights $(a_{ij})$.

Theorem 4.5. If, for every security $j$, there exist numbers $(a_{ij})$ such that
$Z_j = R + \sum_1^m a_{ij}(X_i - R) + \varepsilon_j$, where
$E[\varepsilon_j \mid X_1, \ldots, X_m] = 0$, then $(X_1, \ldots, X_m, R)$ span the
set of efficient portfolios $\Psi^e$.

Proof. Let $w_j^K$ denote the fraction of efficient portfolio $K$ allocated to
security $j$, $j = 1, \ldots, n$. By hypothesis, we can write
$Z_e^K = R + \sum_1^m \delta_i^K (X_i - R) + \varepsilon^K$, where
$\delta_i^K \equiv \sum_1^n w_j^K a_{ij}$ and
$\varepsilon^K \equiv \sum_1^n w_j^K \varepsilon_j$, where
$E[\varepsilon^K \mid X_1, \ldots, X_m] =
\sum_1^n w_j^K E[\varepsilon_j \mid X_1, \ldots, X_m] = 0$. Construct the portfolio
with return $Z$ by allocating fraction $\delta_i^K$ to portfolio $X_i$,
$i = 1, \ldots, m$, and fraction $1 - \sum_1^m \delta_i^K$ to the riskless
security. By construction, $Z_e^K = Z + \varepsilon^K$, where
$E[\varepsilon^K \mid Z] = E[\varepsilon^K \mid \sum_1^m \delta_i^K X_i] = 0$
because $E[\varepsilon^K \mid X_1, \ldots, X_m] = 0$. Hence, for
$\varepsilon^K \not\equiv 0$, $Z_e^K$ is riskier than $Z$ in the Rothschild-Stiglitz
sense, which contradicts that $Z_e^K$ is an efficient portfolio. Thus,
$\varepsilon^K \equiv 0$ for every efficient portfolio $K$, and all efficient
portfolios can be generated by a portfolio combination of $(X_1, \ldots, X_m, R)$.

Therefore, if we can find a set of portfolios $(X_1, \ldots, X_m)$ such that every
security's return can be expressed as a linear combination of the returns
$(X_1, \ldots, X_m, R)$, plus noise relative to these portfolios, then we have a
set of portfolios that span $\Psi^e$. The following theorem, first proved by Ross
(1978), shows that security returns can always be written in a linear form relative
to a set of spanning portfolios.

Theorem 4.6. Let $w_j^K$ denote the fraction of efficient portfolio $K$ allocated
to security $j$, $j = 1, \ldots, n$. $(X_1, \ldots, X_m, R)$ span $\Psi^e$ if and
only if there exist numbers $(a_{ij})$ for every security $j$ such that
$Z_j = R + \sum_1^m a_{ij}(X_i - R) + \varepsilon_j$, where
$E[\varepsilon_j \mid \sum_1^m \delta_i^K X_i] = 0$,
$\delta_i^K \equiv \sum_1^n w_j^K a_{ij}$, for every efficient portfolio $K$.

Proof. The "if" part follows directly from the proof of Theorem 4.5. In that proof,
we only needed that $E[\varepsilon^K \mid \sum_1^m \delta_i^K X_i] = 0$ for every
efficient portfolio $K$ to show that $(X_1, \ldots, X_m, R)$ span $\Psi^e$. The
proof of the "only if" part is long and requires the proof of four specialized
lemmas [see Ross (1978, appendix)]. It is, therefore, not presented here.

Corollary 4.6. $(X, R)$ span $\Psi^e$ if and only if there exists a number $a_j$
for each security $j$, $j = 1, \ldots, n$, such that
$Z_j = R + a_j(X - R) + \varepsilon_j$, where $E(\varepsilon_j \mid X) = 0$.

Proof. The "if" part follows directly from Theorem 4.5. The "only if" part is
as follows. By hypothesis, $Z_e^K = \delta^K (X - R) + R$ for every efficient
portfolio $K$. If $\bar{X} = R$, then from Corollary 2.1a, $\delta^K = 0$ for every
efficient portfolio $K$ and $R$ spans $\Psi^e$. Otherwise, from Theorem 2.2,
$\delta^K \neq 0$ for every efficient portfolio. By Theorem 4.6,
$E[\varepsilon_j \mid \delta^K X] = 0$, for $j = 1, \ldots, n$ and every efficient
portfolio $K$. But, for $\delta^K \neq 0$, $E[\varepsilon_j \mid \delta^K X] = 0$
if and only if $E[\varepsilon_j \mid X] = 0$.

In addition to Ross (1978), there have been a number of studies of the
properties of efficient portfolios [cf. Chen and Ingersoll (1983), Dybvig and
Ross (1982), and Nielsen (1986)]. However, there is still much to be determined.
For example, from Theorem 4.6, a necessary condition for $(X_1, \ldots, X_m, R)$ to
span $\Psi^e$ is that $E[\varepsilon_j \mid Z_e^K] = 0$, for $j = 1, \ldots, n$ and
every efficient portfolio $K$. For $m > 1$, this condition is not sufficient to
ensure that $(X_1, \ldots, X_m, R)$ span $\Psi^e$. The condition that
$E[\varepsilon_j \mid \sum_1^m \lambda_i X_i] = 0$ for all numbers $\lambda_i$
implies that $E[\varepsilon_j \mid X_1, \ldots, X_m] = 0$. If, however, the
$\{\lambda_i\}$ are restricted to the class of optimal portfolio weights
$\{\delta_i^K\}$ as in Theorem 4.6 and $m > 1$, it does not follow that
$E[\varepsilon_j \mid X_1, \ldots, X_m] = 0$. Thus,
$E[\varepsilon_j \mid X_1, \ldots, X_m] = 0$ is sufficient, but not necessary, for
$(X_1, \ldots, X_m, R)$ to span $\Psi^e$. It is not known whether any material
cases of spanning are ruled out by imposing this stronger condition. Empirical
application of the spanning conditions generally assumes that the condition
$E[\varepsilon_j \mid X_1, \ldots, X_m] = 0$ obtains.
Since $\Psi^e$ is contained in $\Psi^f$, any properties proved for portfolios that
span $\Psi^e$ must be properties of portfolios that span $\Psi^f$. From Theorems
4.3, 4.5, and 4.6, the essential difference is that to span the efficient portfolio
set it is not necessary that linear combinations of the spanning portfolios exactly
replicate the return on each available security. Hence, it is not necessary that
there exist redundant securities for non-trivial spanning of $\Psi^e$ to obtain. Of
course, all three theorems are empty of any empirical content if the size of the
smallest spanning set $M^*$ is equal to $(n + 1)$.
As discussed in the introduction to this section, all the important models of
portfolio selection exhibit the non-trivial spanning property for the efficient
portfolio set. Therefore, for all such models that do not restrict the class of
admissible utility functions beyond that of risk-aversion, the distribution of
individual security returns must be such that
$Z_j = R + \sum_1^m a_{ij}(X_i - R) + \varepsilon_j$, where $\varepsilon_j$
satisfies the conditions of Theorem 4.6 for $j = 1, \ldots, n$. Moreover, given
some knowledge of the joint distribution of a set of portfolios that span $\Psi^e$
with $(Z_j - \bar{Z}_j)$, there exists a method for determining the $(a_{ij})$ and
$\bar{Z}_j$.

Proposition 4.3. If, for every security $j$,
$E(\varepsilon_j \mid X_1, \ldots, X_m) = 0$ with $(X_1, \ldots, X_m)$ linearly
independent with finite variances and if the return on security $j$, $Z_j$, has a
finite variance, then the $(a_{ij})$, $i = 1, 2, \ldots, m$, in Theorems 4.5 and
4.6 are given by:

$$a_{ij} = \sum_{k=1}^m v_{ik} \, \mathrm{cov}[X_k, Z_j] ,$$

where $v_{ik}$ is the $i$-$k$th element of $\Omega_X^{-1}$.

The proof of Proposition 4.3 follows directly from the condition that
$E(\varepsilon_j \mid X_k) = 0$, which implies that
$\mathrm{cov}[\varepsilon_j, X_k] = 0$, $k = 1, \ldots, m$. The condition that
$(X_1, \ldots, X_m)$ be linearly independent is trivial in the sense that knowing
the joint distribution of a spanning set one can always choose a linearly
independent subset. The only properties of the joint distributions required to
compute the $(a_{ij})$ are the variances and covariances of $X_1, \ldots, X_m$ and
the covariances between $Z_j$ and $X_1, \ldots, X_m$. In particular, knowledge of
$\bar{Z}_j$ is not required because
$\mathrm{cov}[X_k, Z_j] = \mathrm{cov}[X_k, Z_j - \bar{Z}_j]$. Hence, for $m < n$
(and especially so for $m \ll n$), there exists a non-trivial information set which
allows the $(a_{ij})$ to be determined without knowledge of $\bar{Z}_j$. If
$\bar{X}_1, \ldots, \bar{X}_m$ are known, then $\bar{Z}_j$ can be computed by the
formula in Theorem 4.4. By comparison with the example in Section 3, the
information set required there to determine $\bar{Z}_j$ was a utility function and
the joint distribution of its associated optimal portfolio with
$(Z_j - \bar{Z}_j)$. Here, we must know a complete set of portfolios that span
$\Psi^e$. However, here only the second-moment properties of the joint distribution
need be known, and no utility function information other than risk-aversion is
required.
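
The two-step procedure described above can be sketched for $m = 2$. In the
example below the six equally likely states, the "true" loadings used to
generate $Z_j$, and the noise term are all invented; the noise is arranged in
offsetting pairs so that its conditional mean given $(X_1, X_2)$ is zero. The
loadings are recovered from second moments alone, and the expected return is
then obtained from the formula in Theorem 4.4.

```python
def mean(values):
    """Expectation over equally likely states."""
    return sum(values) / len(values)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


R = 1.03
X1 = [0.90, 0.90, 1.10, 1.10, 1.30, 1.30]
X2 = [1.20, 1.20, 1.00, 1.00, 0.95, 0.95]
NOISE = [0.05, -0.05, 0.02, -0.02, 0.10, -0.10]   # E[noise | X1, X2] = 0
TRUE_A = [0.8, 0.3]                                # hypothetical loadings

Z_j = [
    R + TRUE_A[0] * (x1 - R) + TRUE_A[1] * (x2 - R) + e
    for x1, x2, e in zip(X1, X2, NOISE)
]

# Omega_X and its inverse, written out by hand for the 2 x 2 case.
s11, s12, s22 = cov(X1, X1), cov(X1, X2), cov(X2, X2)
det = s11 * s22 - s12 * s12
v = [[s22 / det, -s12 / det],
     [-s12 / det, s11 / det]]

# Proposition 4.3: a_ij = sum_k v_ik cov[X_k, Z_j].
c = [cov(X1, Z_j), cov(X2, Z_j)]
a_hat = [v[i][0] * c[0] + v[i][1] * c[1] for i in range(2)]

# Theorem 4.4: expected return implied by the loadings and the fund means.
Z_bar_j = R + a_hat[0] * (mean(X1) - R) + a_hat[1] * (mean(X2) - R)
```

For $m > 2$ a general linear solver would replace the hand-written inverse.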
A special case of no little interest is when a single risky portfolio and the
riskless security span the space of efficient portfolios and Corollary 4.6 applies.
Indeed, the classic mean-variance model of Markowitz and Tobin, which is
discussed in Section 5, exhibits this strong form of separation. Moreover, most
macroeconomic models have highly aggregated financial sectors where inves-
tors' portfolio choices are limited to simple combinations of two securities:
"bonds" and "stocks". The rigorous microeconomic foundation for such
aggregation is precisely that $\Psi^e$ is spanned by a single risky portfolio and the
riskless security.
If $X$ denotes the random variable return on a risky portfolio such that $(X, R)$
spans $\Psi^e$, then the return on any efficient portfolio, $Z_e$, can be written
as if it had been chosen by combining the risky portfolio with return $X$ with the
riskless security. Namely, $Z_e = \delta(X - R) + R$, where $\delta$ is the
fraction allocated to the risky portfolio and $(1 - \delta)$ is the fraction
allocated to the riskless security. By Corollary 2.1a, the sign of $\delta$ will be
the same for every efficient portfolio, and therefore all efficient portfolios will
be perfectly positively correlated. If $\bar{X} > R$, then by Proposition 4.2, $X$
will be an efficient portfolio and $\delta > 0$ for every efficient portfolio.

Proposition 4.4. If $(Z_1, \ldots, Z_n)$ contain no redundant securities,
$\delta_j$ denotes the fraction of portfolio $X$ allocated to security $j$, and
$w_j^*$ denotes the fraction of any risk-averse investor's optimal portfolio
allocated to security $j$, $j = 1, \ldots, n$, then for every such risk-averse
investor:

$$w_j^* / w_k^* = \delta_j / \delta_k , \quad j, k = 1, 2, \ldots, n .$$

The proof follows immediately because every optimal portfolio is an efficient
portfolio, and the holdings of risky securities in every efficient portfolio are
proportional to the holdings in $X$. Hence, the relative holdings of risky
securities will be the same for all risk-averse investors. Whenever Proposition
4.4 holds and if there exist numbers $(\delta_j^*)$, where
$\delta_j^* / \delta_k^* = \delta_j / \delta_k$, $j, k = 1, \ldots, n$, and
$\sum_1^n \delta_j^* = 1$, then the portfolio with proportions
$(\delta_1^*, \ldots, \delta_n^*)$ is called the Optimal Combination of Risky
Assets. If such a portfolio exists, then without loss of generality it can always
be assumed that $X = \sum_1^n \delta_j^* Z_j$.

Proposition 4.5. If (X, R) spans $\Psi^e$, then $\Psi^e$ is a convex set.

Proof. Let $Z_e^1$ and $Z_e^2$ denote the returns on two distinct efficient portfolios.
Because (X, R) spans $\Psi^e$, $Z_e^1 = \delta_1(X - R) + R$ and $Z_e^2 = \delta_2(X - R) + R$. Because
they are distinct, $\delta_1 \ne \delta_2$, and so assume $\delta_1 \ne 0$. Let $Z \equiv \lambda Z_e^1 + (1 - \lambda) Z_e^2$
denote the return on a portfolio which allocates fraction $\lambda$ to $Z_e^1$ and $(1 - \lambda)$ to
$Z_e^2$, where $0 \le \lambda \le 1$. By substitution, the expression for Z can be rewritten as
$Z = \delta(Z_e^1 - R) + R$, where $\delta \equiv [\lambda + (\delta_2/\delta_1)(1 - \lambda)]$. Because $Z_e^1$ and $Z_e^2$ are
efficient portfolios, the sign of $\delta_1$ is the same as the sign of $\delta_2$. Hence, $\delta \ge 0$.
Therefore, by Proposition 4.2, Z is an efficient portfolio. It follows by
induction that for any integer k and numbers $\lambda_i$ such that $0 \le \lambda_i \le 1$, i =
1, ..., k, and $\sum_1^k \lambda_i = 1$, $Z^k \equiv \sum_1^k \lambda_i Z_e^i$ is the return on an efficient portfolio.
Hence, $\Psi^e$ is a convex set.
Definition. A market portfolio is defined as a portfolio that holds all available
securities in proportion to their market values. To avoid the problems of
"double counting" caused by financial intermediaries and inter-investor issues
of securities, the equilibrium market value of a security for this purpose is
defined to be the equilibrium value of the aggregate demand by individuals for
the security. In models where all physical assets are held by business firms and
business firms hold no financial assets, an equivalent definition is that the
market value of a security equals the equilibrium value of the aggregate
amount of that security issued by business firms. If $V_j$ denotes the market value
of security j and $V_R$ denotes the value of the riskless security, then

$$\delta_j^M = \frac{V_j}{\sum_{i=1}^{n} V_i + V_R}, \qquad j = 1, 2, \ldots, n,$$

where $\delta_j^M$ is the fraction of security j held in a market portfolio.

Theorem 4.7. If $\Psi^e$ is a convex set, and if the securities' market is in
equilibrium, then a market portfolio is an efficient portfolio.

Proof. Let there be K risk-averse investors in the economy with the initial
wealth of investor k denoted by $W_0^k$. Define $Z^k \equiv R + \sum_1^n w_j^k (Z_j - R)$ to be the
return per dollar on investor k's optimal portfolio, where $w_j^k$ is the fraction
allocated to security j. In equilibrium, $\sum_1^K w_j^k W_0^k = V_j$, j = 1, 2, ..., n, and
$\sum_1^K W_0^k \equiv W_0 = \sum_1^n V_j + V_R$. Define $\lambda_k \equiv W_0^k / W_0$, k = 1, ..., K. Clearly,
$0 \le \lambda_k \le 1$ and $\sum_1^K \lambda_k = 1$. By definition of a market portfolio,
$\sum_1^K w_j^k \lambda_k = \delta_j^M$, j = 1, 2, ..., n. Multiplying by $(Z_j - R)$ and summing over j,
it follows that

$$\sum_{k=1}^{K} \lambda_k \sum_{j=1}^{n} w_j^k (Z_j - R) = \sum_{k=1}^{K} \lambda_k (Z^k - R) = \sum_{j=1}^{n} \delta_j^M (Z_j - R) = Z_M - R,$$

where $Z_M$ is defined to be the return per dollar on the market portfolio. Because
$\sum_1^K \lambda_k = 1$, $Z_M = \sum_1^K \lambda_k Z^k$. But every optimal portfolio is an efficient portfolio.
Hence, $Z_M$ is a convex combination of the returns on K efficient portfolios.
Therefore, if $\Psi^e$ is convex, then the market portfolio is contained in $\Psi^e$.
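
The aggregation step in this proof can be checked numerically. The following Python sketch is only an illustration under assumed inputs: the two investors, their wealth levels, their portfolio fractions, and the realized returns are all hypothetical numbers, and the helper `portfolio_return` is a name introduced here. It confirms that the market-portfolio fractions computed from market values coincide with the wealth-weighted average of individual fractions, and that $Z_M$ equals the same weighted average of the investors' portfolio returns.

```python
# Hypothetical two-investor, two-security economy; every number is illustrative.
wealth = [60.0, 40.0]        # W_0^k, initial wealth of investor k
weights = [[0.5, 0.3],       # w_j^k: investor 1's fractions in securities 1 and 2
           [0.2, 0.6]]       # investor 2's fractions in securities 1 and 2
K, n = len(wealth), len(weights[0])

W0 = sum(wealth)                         # aggregate wealth W_0
lam = [w / W0 for w in wealth]           # lambda_k = W_0^k / W_0

# Equilibrium market values: V_j = sum_k w_j^k W_0^k
V = [sum(weights[k][j] * wealth[k] for k in range(K)) for j in range(n)]
V_R = W0 - sum(V)                        # aggregate holding of the riskless security

# Market-portfolio fractions computed from market values ...
delta_M = [V[j] / (sum(V) + V_R) for j in range(n)]
# ... and as the wealth-weighted average of individual fractions.
delta_avg = [sum(lam[k] * weights[k][j] for k in range(K)) for j in range(n)]


def portfolio_return(w, Z, R):
    """Return per dollar on a portfolio: R + sum_j w_j (Z_j - R)."""
    return R + sum(wj * (zj - R) for wj, zj in zip(w, Z))


# For one hypothetical realization of returns, Z_M equals sum_k lambda_k Z^k.
Z, R = [1.20, 0.95], 1.05
Z_M = portfolio_return(delta_M, Z, R)
Z_mix = sum(lam[k] * portfolio_return(weights[k], Z, R) for k in range(K))
```

The check is purely an accounting identity; it says nothing about whether the individual portfolios are in fact optimal, which is what the theorem additionally requires.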
Because a market portfolio can be constructed without the knowledge of
preferences, the distribution of wealth, or the joint probability distribution for
the outstanding securities, models in which the market portfolio can be shown
to be efficient are more likely to produce testable hypotheses. In addition, the
efficiency of the market portfolio provides a rigorous microeconomic justification
for the use of a "representative man" to derive equilibrium prices in
aggregated economic models, i.e. the market portfolio is efficient if and only if
there exists a concave utility function such that maximization of its expected
value with initial wealth equal to national wealth would lead to the market
portfolio as the optimal portfolio. Indeed, it is currently fashionable in the real
world to advise "passive" investment strategies that simply mix the market
portfolio with the riskless security. Provided that the market portfolio is
efficient, by Proposition 4.2 no investor following such strategies could ever be
convicted of "inefficiency". Moreover, the market portfolio will be efficient if
markets are "complete" in the sense of Arrow (1953, 1964) and Debreu (1959)
and investors have homogeneous beliefs. Unfortunately, general necessary and
sufficient conditions for the market portfolio to be efficient have not as yet
been derived.
However, even if the market portfolio were not efficient, it does have the
following important property:

Proposition 4.6. In all portfolio models with homogeneous beliefs and risk-
averse investors, the equilibrium expected return on the market portfolio exceeds
the return on the riskless security.

The proof follows directly from the proof of Theorem 4.7 and Corollary 2.1a.
Clearly, $\bar{Z}_M - R = \sum_1^K \lambda_k (\bar{Z}^k - R)$. By Corollary 2.1a, $\bar{Z}^k \ge R$ for k =
1, ..., K, with strict inequality holding if $Z^k$ is risky. But $\lambda_k > 0$. Hence,
$\bar{Z}_M > R$ if any risky securities are held by any investor. Note that using no
information other than market prices and quantities of securities outstanding,
the market portfolio (and combinations of the market portfolio and the riskless
security) is the only risky portfolio where the sign of its equilibrium expected
excess return can always be predicted.
Returning to the special case where $\Psi^e$ is spanned by a single risky portfolio
and the riskless security, it follows immediately from Proposition 4.5 and
Theorem 4.7 that the market portfolio is efficient. Because all efficient
portfolios are perfectly positively correlated, it follows that the risky spanning
portfolio can always be chosen to be the market portfolio (i.e. $X = Z_M$).
Therefore, every efficient portfolio (and hence, every optimal portfolio) can be
represented as a simple portfolio combination of the market portfolio and the
riskless security with a positive fraction allocated to the market portfolio. If all
investors want to hold risky securities in the same relative proportions, then
the only way in which this is possible is if these relative proportions are
identical to those in the market portfolio. Indeed, if there were one best
investment strategy, and if this "best" strategy were widely known, then
whatever the original statement of the strategy, it must lead to simply this
imperative: "hold the market portfolio".
Because for every security $\delta_j^M \ge 0$, it follows from Proposition 4.4 that in
equilibrium, every investor will hold non-negative quantities of risky securities,
and therefore it is never optimal to short sell risky securities. Hence, in models
where m = 1, the introduction of restrictions against short sales will not affect
the equilibrium.

Theorem 4.8. If $(Z_M, R)$ span $\Psi^e$, then the equilibrium expected return on
security j can be written as:

$$\bar{Z}_j = R + \beta_j (\bar{Z}_M - R),$$

where

$$\beta_j \equiv \frac{\operatorname{cov}[Z_j, Z_M]}{\operatorname{var}(Z_M)}.$$

The proof follows directly from Corollary 4.6 and Proposition 4.3. This
relation, called the Security Market Line, was first derived by Sharpe (1964) as
a necessary condition for equilibrium in the mean-variance model of Mar-
kowitz and Tobin when investors have homogeneous beliefs. This relation has
been central to most empirical studies of securities' returns published during
the last two decades. Indeed, the switch in notation from $a_{ij}$ to $\beta_j$ in this special
case reflects the almost universal adoption of the term, "the 'beta' of a
security", to mean the covariance of that security's return with the market
portfolio divided by the variance of the return on the market portfolio.
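
To make the definition of beta concrete, the sketch below computes $\beta_j$ from a small discrete joint distribution and evaluates the right-hand side of the Security Market Line. It is a minimal illustration, not an empirical procedure: the two-state distribution, the riskless return, and the function names `mean`, `cov`, `beta` and `sml_expected_return` are all assumptions introduced here, and the returns are not claimed to be equilibrium returns.

```python
def mean(xs, p):
    """Expected value of xs under state probabilities p."""
    return sum(pi * x for pi, x in zip(p, xs))


def cov(xs, ys, p):
    """Covariance of xs and ys under state probabilities p."""
    mx, my = mean(xs, p), mean(ys, p)
    return sum(pi * (x - mx) * (y - my) for pi, x, y in zip(p, xs, ys))


def beta(z_j, z_m, p):
    """beta_j = cov[Z_j, Z_M] / var(Z_M)."""
    return cov(z_j, z_m, p) / cov(z_m, z_m, p)


def sml_expected_return(R, beta_j, zbar_m):
    """Expected return the Security Market Line assigns: R + beta_j (Zbar_M - R)."""
    return R + beta_j * (zbar_m - R)


# Hypothetical two-state example; returns are per dollar (gross).
p = [0.5, 0.5]
z_m = [1.30, 0.90]      # market portfolio return in each state
z_j = [1.50, 0.80]      # security j return in each state
R = 1.05                # riskless return per dollar

b_j = beta(z_j, z_m, p)
implied = sml_expected_return(R, b_j, mean(z_m, p))
```

With these inputs the covariance is 0.07 and the market variance is 0.04, so the security has a beta of 1.75 and the line assigns it an expected return of 1.1375.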
In the special case of Theorem 4.8, $\beta_j$ measures the systematic risk of
security j relative to the efficient portfolio $Z_M$ (i.e. $\beta_j = b_j^M$ as defined in
Section 3), and therefore beta provides a complete ordering of the risk of
individual securities. As is often the case in research, useful concepts are
derived in a special model first. The term "systematic risk" was first coined by
Sharpe and was measured by beta. The definition in Section 3 is a natural
generalization. Moreover, unlike the general risk measure of Section 3, $\beta_j$ can
be computed from a simple covariance between $Z_j$ and $Z_M$. Securities whose
returns are positively correlated with the market are pro-cyclical, and will be
priced to have positive equilibrium expected excess returns. Securities whose
returns are negatively correlated are counter-cyclical, and will have negative
equilibrium expected excess returns.
In general, the sign of $b_j^k$ cannot be determined by the sign of the correlation
coefficient between $Z_j$ and $Z_e^k$. However, as shown in Theorems 3.2-3.4,
because $\partial Y(Z_e^k)/\partial Z_e^k > 0$ for each realization of $Z_e^k$, $b_j^k > 0$ does imply a
generalized positive "association" between the return on $Z_j$ and $Z_e^k$. Similarly,
$b_j^k < 0$ implies a negative "association".
Let $\Psi_{\min}$ denote the set of portfolios contained in $\Psi^f$ such that there exists
no other portfolio in $\Psi^f$ with the same expected return and a smaller variance.
Let $Z(\mu)$ denote the return on a portfolio contained in $\Psi_{\min}$ such that
$\bar{Z}(\mu) = \mu$, and let $\delta_j^\mu$ denote the fraction of this portfolio allocated to security
j, j = 1, ..., n.

Theorem 4.9. If $(Z_1, \ldots, Z_n)$ contain no redundant securities, then (a) for
each value $\mu$, $\delta_j^\mu$, j = 1, ..., n, are unique; (b) there exists a portfolio contained
in $\Psi_{\min}$ with return X such that (X, R) span $\Psi_{\min}$; and (c) $\bar{Z}_j - R = a_j(\bar{X} - R)$,
where $a_j \equiv \operatorname{cov}(Z_j, X)/\operatorname{var}(X)$, j = 1, 2, ..., n.

Proof. Let $\sigma_{ij}$ denote the i-jth element of $\Omega$ and, because $(Z_1, \ldots, Z_n)$
contain no redundant securities, $\Omega$ is non-singular. Hence, let $v_{ij}$ denote the
i-jth element of $\Omega^{-1}$. All portfolios in $\Psi_{\min}$ with expected return $\mu$ must have
portfolio weights that are solutions to the problem: $\min \sum_1^n \sum_1^n \delta_i \delta_j \sigma_{ij}$ subject
to the constraint $\bar{Z}(\mu) = \mu$. Trivially, if $\mu = R$, then $Z(R) = R$ and $\delta_j^R = 0$,
j = 1, 2, ..., n. Consider the case where $\mu \ne R$. The n first-order conditions
are:

$$0 = \sum_{j=1}^{n} \delta_j^\mu \sigma_{ij} - \lambda_\mu (\bar{Z}_i - R), \qquad i = 1, 2, \ldots, n,$$

where $\lambda_\mu$ is the Lagrange multiplier for the constraint. Multiplying by $\delta_i^\mu$ and
summing, we have that $\lambda_\mu = \operatorname{var}[Z(\mu)]/(\mu - R)$. By definition of $\Psi_{\min}$, $\lambda_\mu$
must be the same for all $Z(\mu)$. Because $\Omega$ is non-singular, the set of linear
equations has the unique solution:

$$\delta_j^\mu = \lambda_\mu \sum_{i=1}^{n} v_{ij} (\bar{Z}_i - R), \qquad j = 1, 2, \ldots, n.$$

This proves (a). From this solution, $\delta_j^\mu/\delta_k^\mu$, j, k = 1, 2, ..., n, are the same for
every value of $\mu$. Hence, all portfolios in $\Psi_{\min}$ with $\mu \ne R$ are perfectly
correlated. Hence, pick any portfolio in $\Psi_{\min}$ with $\mu \ne R$ and call its return X.
Then every $Z(\mu)$ can be written in the form $Z(\mu) = \delta(X - R) + R$. Hence,
(X, R) span $\Psi_{\min}$ which proves (b), and from Corollary 4.6 and Proposition
4.3, (c) follows directly.
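
The closed-form solution in this proof lends itself to a direct numerical check. The Python sketch below is restricted, as a simplifying assumption, to n = 2 securities so that $\Omega^{-1}$ can be written out by hand; the covariance matrix, expected returns, riskless return, and the function names `inv2` and `min_variance_weights` are hypothetical. It computes $\delta_j^\mu = \lambda_\mu \sum_i v_{ij}(\bar{Z}_i - R)$, with $\lambda_\mu$ fixed by the constraint $\bar{Z}(\mu) = \mu$.

```python
def inv2(m):
    """Inverse of a non-singular 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def min_variance_weights(omega, zbar, R, mu):
    """Weights delta_j^mu and multiplier lambda_mu for target mean mu (n = 2 only)."""
    e = [z - R for z in zbar]                      # expected excess returns
    v = inv2(omega)                                # elements v_ij of Omega^{-1}
    y = [sum(v[i][j] * e[i] for i in range(2)) for j in range(2)]
    q = sum(yj * ej for yj, ej in zip(y, e))       # e' Omega^{-1} e
    lam = (mu - R) / q                             # from sum_j delta_j e_j = mu - R
    return [lam * yj for yj in y], lam


def variance(delta, omega):
    """Portfolio variance sum_i sum_j delta_i delta_j sigma_ij."""
    return sum(delta[i] * delta[j] * omega[i][j]
               for i in range(2) for j in range(2))


# Hypothetical inputs.
omega = [[0.04, 0.01], [0.01, 0.09]]
zbar, R = [1.10, 1.15], 1.05

d_low, lam_low = min_variance_weights(omega, zbar, R, mu=1.10)
d_high, lam_high = min_variance_weights(omega, zbar, R, mu=1.20)
```

For both target means the ratio of the two weights is identical, which is the proportionality used to establish part (b), and the computed variance equals $\lambda_\mu(\mu - R)$ as stated in the proof.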

From Theorem 4.9, $a_k$ will be equivalent to $b_k^K$ as a measure of a security's
systematic risk provided that the $Z(\mu)$ chosen for X is such that $\mu > R$. Like
$\beta_k$, the only information required to compute $a_k$ is the joint second moments of
$Z_k$ and X. Which of the two equivalent measures will be more useful obviously
depends upon the information set that is available. However, as the following
theorem demonstrates, the $a_k$ measure is the natural choice in the case when
there exists a spanning set for $\Psi^e$ with m = 1.
Theorem 4.10. If (X, R) span $\Psi^e$ and if X has a finite variance, then $\Psi^e$ is
contained in $\Psi_{\min}$.

Proof. Let $Z_e$ be the return on any efficient portfolio. By hypothesis, $Z_e$ can
be written as $Z_e = R + a_e(X - R)$. Let $Z_p$ be the return on any portfolio in $\Psi^f$
such that $\bar{Z}_e = \bar{Z}_p$. By Corollary 4.6, $Z_p$ can be written as $Z_p = R + a_p(X - R) + \varepsilon_p$,
where $E(\varepsilon_p) = E(\varepsilon_p \mid X) = 0$. Therefore, $a_p = a_e$ if $\bar{Z}_p = \bar{Z}_e$;
$\operatorname{var}(Z_p) = a_p^2 \operatorname{var}(X) + \operatorname{var}(\varepsilon_p) \ge a_p^2 \operatorname{var}(X) = \operatorname{var}(Z_e)$.
Hence, $Z_e$ is contained in $\Psi_{\min}$.
Moreover, $\Psi^e$ will be the set of all portfolios in $\Psi_{\min}$ such that $\mu \ge R$.

Thus, whenever there exists a spanning set for $\Psi^e$ with m = 1, the means,
variances, and covariances of $(Z_1, \ldots, Z_n)$ are sufficient statistics to completely
determine all efficient portfolios. Such a strong set of conclusions suggests
that the class of joint probability distributions for $(Z_1, \ldots, Z_n)$ which admit a
two-fund separation theorem will be highly specialized. However, as the
following theorems demonstrate, the class is not empty.

Theorem 4.11. If $(Z_1, \ldots, Z_n)$ have a joint normal probability distribution,
then there exists a portfolio with return X such that (X, R) span $\Psi^e$.

Proof. Using the procedure applied in the proof of Theorem 4.9, construct a
risky portfolio contained in $\Psi_{\min}$, and call its return X. Define the random
variables, $\varepsilon_k \equiv Z_k - R - a_k(X - R)$, k = 1, ..., n. By part (c) of that theorem,
$E(\varepsilon_k) = 0$, and by construction, $\operatorname{cov}[\varepsilon_k, X] = 0$. Because $Z_1, \ldots, Z_n$ are
normally distributed, X will be normally distributed. Hence, $\varepsilon_k$ is normally
distributed, and because $\operatorname{cov}[\varepsilon_k, X] = 0$, $\varepsilon_k$ and X are independent. Therefore,
$E(\varepsilon_k) = E(\varepsilon_k \mid X) = 0$. From Corollary 4.6, it follows that (X, R) span $\Psi^e$.

It is straightforward to prove that if $(Z_1, \ldots, Z_n)$ can have arbitrary means,
variances, and covariances, and can be mutually independent, then a necessary
condition for there to exist a portfolio with return X such that (X, R) span $\Psi^e$
is that $(Z_1, \ldots, Z_n)$ be joint normally distributed. However, it is important to
emphasize both the word "arbitrary" and the prospect for independence. For
example, consider a joint distribution for $(Z_1, \ldots, Z_n)$ such that the joint
probability density function, $p(Z_1, \ldots, Z_n)$, is a symmetric function. That is,
for each set of admissible outcomes for $(Z_1, \ldots, Z_n)$, $p(Z_1, \ldots, Z_n)$ remains
unchanged when any two arguments of p are interchanged. An obvious special
case is when $(Z_1, \ldots, Z_n)$ are independently and identically distributed and
$p(Z_1, \ldots, Z_n) = p(Z_1) p(Z_2) \cdots p(Z_n)$.

Theorem 4.12. If $p(Z_1, \ldots, Z_n)$ is a symmetric function with respect to all its
arguments, then there exists a portfolio with return X such that (X, R) spans $\Psi^e$.
Proof. By hypothesis, $p(Z_1, \ldots, Z_i, \ldots, Z_n) = p(Z_i, \ldots, Z_1, \ldots, Z_n)$ for
each set of given values $(Z_1, \ldots, Z_n)$. Therefore, from the first-order conditions
for portfolio selection, (2.4), every risk-averse investor will choose
$w_1^* = w_i^*$. But, this is true for i = 1, ..., n. Hence, all investors will hold all
risky securities in the same relative proportions. Therefore, if X is the return
on a portfolio with an equal dollar investment in each risky security, then
(X, R) will span $\Psi^e$.

Samuelson (1967) was the first to examine this class of symmetric density
functions in a portfolio context. Chamberlain (1983) has shown that the class of
elliptical distributions characterizes the distributions that imply mean-variance
utility functions for all risk-averse expected utility maximizers. However, for
distributions other than Gaussian to obtain, the security returns cannot be
independently distributed.
The Arbitrage Pricing Theory (APT) model developed by Ross (1976a)
provides an important class of linear-factor models that generate (at least
approximate) spanning without assuming joint normal probability distributions.
Suppose the returns on securities are generated by:

$$Z_j = \bar{Z}_j + \sum_{i=1}^{m} a_{ij} Y_i + \varepsilon_j, \qquad j = 1, \ldots, n, \tag{4.1}$$

where $E(\varepsilon_j) = E(\varepsilon_j \mid Y_1, \ldots, Y_m) = 0$ and without loss of generality, $E(Y_i) = 0$
and $\operatorname{cov}[Y_i, Y_j] = 0$, $i \ne j$. The random variables $\{Y_i\}$ represent common
factors that are likely to affect the returns on a significant number of securities.
If it is possible to construct a set of m portfolios with returns $(X_1, \ldots, X_m)$
such that $X_i$ and $Y_i$ are perfectly correlated, i = 1, 2, ..., m, then the conditions
of Theorem 4.5 will be satisfied and $(X_1, \ldots, X_m, R)$ will span $\Psi^e$.
Although in general it will not be possible to construct such a set, by
imposing some mild additional restrictions on $\{\varepsilon_j\}$, Ross (1976a) derives an
asymptotic spanning theorem as the number of available securities, n, becomes
large. While the rigorous derivation is rather tedious, a rough description goes
as follows. Let $Z_p$ be the return on a portfolio with fraction $\delta_j$ allocated to
security j, j = 1, 2, ..., n. From (4.1), $Z_p$ can be written as:

$$Z_p = \bar{Z}_p + \sum_{i=1}^{m} a_{ip} Y_i + \varepsilon_p, \tag{4.2}$$

where $\bar{Z}_p = R + \sum_1^n \delta_j(\bar{Z}_j - R)$; $a_{ip} \equiv \sum_1^n \delta_j a_{ij}$; $\varepsilon_p \equiv \sum_1^n \delta_j \varepsilon_j$. Consider the set of
portfolios (called well-diversified portfolios) that have the property $\delta_j \equiv \mu_j/n$,
where $|\mu_j| \le M_j < \infty$ and $M_j$ is independent of n, j = 1, ..., n. Virtually by the
definition of a common factor, it is reasonable to assume that for every $n \gg m$,
a significantly positive fraction of all securities, $\lambda_i$, have $a_{ij} \ne 0$, and this will be
true for each common factor i, i = 1, ..., m. Similarly, because the $\{\varepsilon_j\}$
denote the variations in securities' returns not explained by common factors, it
is also reasonable to assume for large n that for each j, $\varepsilon_j$ is uncorrelated with
virtually all other securities' returns. Hence, if the number of common factors,
m, is fixed, then for all $n \gg m$, it should be possible to construct a set of
well-diversified portfolios $\{X_k\}$ such that for $X_k$, $a_{ik} = 0$, i = 1, ..., m, $i \ne k$,
and $a_{kk} \ne 0$. It follows from (4.2) that $X_k$ can be written as:

$$X_k = \bar{X}_k + a_{kk} Y_k + \frac{1}{n} \sum_{j=1}^{n} \mu_j^k \varepsilon_j, \qquad k = 1, \ldots, m.$$

But $|\mu_j^k|$ is bounded, independently of n, and virtually all the $\{\varepsilon_j\}$ are
uncorrelated. Therefore, by the Law of Large Numbers, as $n \to \infty$, $X_k \to \bar{X}_k + a_{kk} Y_k$
with probability one. So, as n becomes very large, $X_k$ and $Y_k$ become
perfectly correlated, and by Theorem 4.5, asymptotically $(X_1, \ldots, X_m, R)$ will
span $\Psi^e$. In particular, if m = 1, then asymptotically two-fund separation will
obtain independent of any other distributional characteristics of $Y_1$ or the $\{\varepsilon_j\}$.
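
The diversification step in this argument can be illustrated with a short calculation. The sketch below makes a stronger assumption than the text, namely that all of the $\varepsilon_j$ are mutually uncorrelated (the text requires this only for "virtually all" of them), so that the variance of the residual term $\frac{1}{n}\sum_j \mu_j \varepsilon_j$ is $\frac{1}{n^2}\sum_j \mu_j^2 \operatorname{var}(\varepsilon_j)$. The function name `residual_variance` and the numerical inputs are hypothetical.

```python
def residual_variance(mu, sigma2):
    """Variance of (1/n) * sum_j mu_j * eps_j for mutually uncorrelated eps_j.

    mu     : bounded coefficients mu_j, so that the weights are mu_j / n
    sigma2 : variances of the idiosyncratic terms eps_j
    """
    n = len(mu)
    return sum(m * m * s for m, s in zip(mu, sigma2)) / (n * n)


# Equal weights (mu_j = 1) and a common idiosyncratic variance of 0.04.
variances = {n: residual_variance([1.0] * n, [0.04] * n) for n in (10, 100, 1000)}
```

Under these assumptions the residual variance is 0.04/n, so each tenfold increase in the number of securities cuts the idiosyncratic variance of the well-diversified portfolio by a factor of ten.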
As can be seen from Theorem 2.3 and its corollary, all efficient portfolios in
the APT model are well-diversified portfolios. Unlike in the mean-variance
model, returns on all efficient portfolios need not, however, be perfectly
correlated. The model is also attractive because, at least in principle, the
equilibrium structure of expected returns and risks of securities can be derived
without explicit knowledge of investors' preferences or endowments. Indeed,
whenever non-trivial spanning of $\Psi^e$ obtains and the set of risky spanning
portfolios can be identified, much of the structure of individual securities
returns can be empirically estimated. For example, if we know of a set of
portfolios $\{X_i\}$ such that $E(\varepsilon_j \mid X_1, \ldots, X_m) = 0$, j = 1, ..., n, then by
Theorem 4.5, $(X_1, \ldots, X_m, R)$ span $\Psi^e$. By Proposition 4.3, ordinary-least-squares
(OLS) regression of the realized excess returns on security j, $Z_j - R$,
on the realized excess returns of the spanning portfolios, $(X_1 - R, \ldots, X_m - R)$,
will always give unbiased estimates of the $\{a_{ij}\}$. Of course, to apply
time-series estimation, it must be assumed that the spanning portfolios
$(X_1, \ldots, X_m)$ and $\{a_{ij}\}$ are intertemporally stable. For these estimators to be
efficient, further restrictions on the $\{\varepsilon_j\}$ are required to satisfy the
Gauss-Markov Theorem.
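
A minimal sketch of the regression just described, for the special case m = 1, is given below. Several choices here are assumptions rather than anything specified in the text: the regression is run without an intercept (consistent with the spanning relation $Z_j - R = a_j(X - R) + \varepsilon_j$), the four observations are invented, and the residuals are deliberately constructed to be orthogonal to the regressor in the sample so that the estimate recovers the coefficient exactly. With real data the estimate would be unbiased only on average. The function name `ols_no_intercept` is hypothetical.

```python
def ols_no_intercept(x, y):
    """Least-squares slope of y on x with no intercept: sum(x*y) / sum(x*x)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


a_true = 1.5                               # hypothetical loading a_j
x = [0.02, -0.01, 0.03, 0.01]              # realized excess returns X - R
eps = [0.01, -0.01, -0.01, 0.00]           # residuals, orthogonal to x in this sample
y = [a_true * xi + ei for xi, ei in zip(x, eps)]   # realized excess returns Z_j - R

a_hat = ols_no_intercept(x, y)
```

Because the sample cross-product of `x` and `eps` is zero by construction, `a_hat` equals `a_true` up to floating-point rounding.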
Early empirical studies of stock market securities' returns rarely found more
than two or three statistically significant common factors.13 Given that there are

13 Cf. King (1966), Livingston (1977), Farrar (1962), Feeney and Hester (1967), and Farrell
(1974). Unlike standard "factor analysis", the number of common factors here does not depend
upon the fraction of total variation in an individual security's return that can be "explained".
Rather, what is important is the number of factors necessary to "explain" the covariation between
pairs of individual securities.
tens of thousands of different corporate liabilities traded in U.S. securities
markets, there appears to be empirical foundation for the assumptions of the
APT model. More recent studies have, however, concluded that the number of
common factors may be considerably larger, and some have raised serious
questions about the prospect for identifying the factors by using stock-return
data alone.14
Although the analyses derived here have been expressed in terms of
restrictions on the joint distribution of security returns without explicitly
mentioning security prices, it is obvious that these derived restrictions impose
restrictions on prices through the identity that $Z_j \equiv V_j/V_{j0}$, where $V_j$ is the
random variable, end-of-period aggregate value of security j and $V_{j0}$ is its initial
value. Hence, given the characteristics of any two of these variables, the
characteristics of the third are uniquely determined. For the study of equilibrium
pricing, the usual format is to determine equilibrium $V_{j0}$ given the
distribution of $V_j$.

Theorem 4.13. If $(X_1, \ldots, X_m)$ denote a set of linearly independent portfolios
that satisfy the hypothesis of Theorem 4.5, and all securities have finite variances,
then a necessary condition for equilibrium in the securities' market is that

$$V_{j0} = \frac{\bar{V}_j - \sum_{i=1}^{m} \sum_{k=1}^{m} v_{ik} \operatorname{cov}(X_k, V_j)(\bar{X}_i - R)}{R}, \qquad j = 1, \ldots, n, \tag{4.3}$$

where $v_{ik}$ is the i-kth element of $\Omega_X^{-1}$.

Proof. By linear independence, $\Omega_X$ is non-singular. From the identity $V_j \equiv Z_j V_{j0}$
and Theorem 4.5, $V_j = V_{j0}[R + \sum_1^m a_{ij}(X_i - R) + \varepsilon_j]$, where
$E(\varepsilon_j \mid X_1, \ldots, X_m) = E(\varepsilon_j) = 0$. Taking expectations, we have that
$\bar{V}_j = V_{j0}[R + \sum_1^m a_{ij}(\bar{X}_i - R)]$. Noting that $\operatorname{cov}[X_k, V_j] = V_{j0} \operatorname{cov}[X_k, Z_j]$, we have
from Proposition 4.3 that $V_{j0} a_{ij} = \sum_{k=1}^m v_{ik} \operatorname{cov}[X_k, V_j]$. By substituting for $a_{ij}$ in
the $\bar{V}_j$ expression and rearranging terms, the theorem is proved.

Hence, from Theorem 4.13, a sufficient set of information to determine the
equilibrium value of security j is the first and second moments for the joint
distribution of $(X_1, \ldots, X_m, V_j)$. Moreover, the valuation formula has the
following important "linearity" properties:

14 There is considerable controversy on this issue. See Chamberlain and Rothschild (1983),
Dhrymes, Friend and Gultekin (1984, 1985), Roll and Ross (1980), Rothschild (1986), Shanken
(1982), and Trzcinka (1986).

Corollary 4.13a. If the hypothesized conditions of Theorem 4.13 hold and if
the end-of-period value of a security is given by $V = \sum_1^n \lambda_j V_j$, then in
equilibrium:

$$V_0 = \sum_{j=1}^{n} \lambda_j V_{j0}.$$

The proof of the corollary follows by substitution for V in formula (4.3). This
property of formula (4.3) is called "value-additivity".
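
A small numerical check of formula (4.3) and of value-additivity is sketched below for the special case of a single spanning portfolio (m = 1), in which (4.3) reduces to $V_{j0} = [\bar{V}_j - \operatorname{cov}(X, V_j)(\bar{X} - R)/\operatorname{var}(X)]/R$. The three-state distribution, the payoffs, and the function names `mean`, `cov` and `value` are hypothetical and serve only to illustrate the arithmetic.

```python
def mean(xs, p):
    """Expected value of xs under state probabilities p."""
    return sum(pi * x for pi, x in zip(p, xs))


def cov(xs, ys, p):
    """Covariance of xs and ys under state probabilities p."""
    mx, my = mean(xs, p), mean(ys, p)
    return sum(pi * (x - mx) * (y - my) for pi, x, y in zip(p, xs, ys))


def value(payoff, x, p, R):
    """Formula (4.3) with m = 1: initial value of a claim to `payoff`."""
    risk_adjustment = cov(x, payoff, p) * (mean(x, p) - R) / cov(x, x, p)
    return (mean(payoff, p) - risk_adjustment) / R


# Hypothetical three-state economy.
p = [0.25, 0.50, 0.25]
x = [0.80, 1.10, 1.40]          # return per dollar on the spanning portfolio X
R = 1.05                        # riskless return per dollar
V1 = [90.0, 100.0, 130.0]       # end-of-period value of security 1
V2 = [50.0, 60.0, 55.0]         # end-of-period value of security 2

# Value-additivity: a claim to 2*V1 + 3*V2 is worth 2*V1_0 + 3*V2_0.
combined = [2.0 * a + 3.0 * b for a, b in zip(V1, V2)]
lhs = value(combined, x, p, R)
rhs = 2.0 * value(V1, x, p, R) + 3.0 * value(V2, x, p, R)
```

As a sanity check, the formula assigns a value of one dollar both to a claim paying X per dollar invested and to a claim paying R with certainty, as it should.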

Corollary 4.13b. If the hypothesized conditions of Theorem 4.13 hold and if
the end-of-period value of a security is given by $V = qV_j + u$, where $E(u) =
E(u \mid X_1, \ldots, X_m) = \bar{u}$ and $E(q) = E(q \mid X_1, \ldots, X_m, V_j) = \bar{q}$, then in
equilibrium:

$$V_0 = \bar{q} V_{j0} + \bar{u}/R.$$

The proof follows by substitution for V in formula (4.3) and by applying the
hypothesized conditional-expectation conditions to show that $\operatorname{cov}[X_k, V] =
\bar{q} \operatorname{cov}[X_k, V_j]$. Hence, to value two securities whose end-of-period values differ
only by multiplicative or additive "noise", we can simply substitute the
expected values of the noise terms.
As discussed in Merton (1982a, pp. 642-651), Theorem 4.13 and its corol-
laries are central to the theory of optimal investment decisions by business
firms. To finance new investments, the firm can use internally available funds,
issue common stock or issue other types of financial claims (e.g. debt,
preferred stock, and convertible bonds). The selection from the menu of these
financial instruments is called the firm's financing decision. Although the
optimal investment and financing decisions by a firm generally require simulta-
neous determination, under certain conditions the optimal investment decision
can be made independently of the method of financing.
Consider firm j with random variable end-of-period value $V^j$ and q different
financial claims. The kth such financial claim is defined by the function $f_k(V^j)$,
which describes how the holders of this security will share in the end-of-period
value of the firm. The production technology and choice of investment
intensity, $V_j(I_j; \theta_j)$ and $I_j$, are taken as given where $\theta_j$ is a random variable. If it
is assumed that the end-of-period value of the firm is independent of its choice
of financial liabilities,15 then $V^j = V_j(I_j; \theta_j)$, and $\sum_1^q f_k \equiv V_j(I_j; \theta_j)$ for every
outcome $\theta_j$.

15 This assumption formally rules out financial securities that alter the tax liabilities of the firm
(e.g. interest deductions) or ones that can induce "outside" costs (e.g. bankruptcy costs).
However, by redefining $V_j(I_j; \theta_j)$ as the pre-tax-and-bankruptcy value of the firm and letting one of
the $f_k$ represent the government's tax claim and another the lawyers' bankruptcy-cost claim, the
analysis in the text will be valid for these extended securities as well [cf. Merton (1990, ch. 13)].
Suppose that if firm j were all equity-financed, there exists an equilibrium
such that the initial value of firm j is given by $V_{j0}(I_j)$.

Theorem 4.14. If firm j is financed by q different claims defined by the
functions $f_k(V^j)$, k = 1, ..., q, and if there exists an equilibrium such that the
return distribution of the efficient portfolio set remains unchanged from the
equilibrium in which firm j was all equity-financed, then

$$\sum_{k=1}^{q} f_{k0} = V_{j0}(I_j),$$

where $f_{k0}$ is the equilibrium initial value of financial claim k.

Proof. In the equilibrium in which firm j is all equity-financed, the end-of-period
random variable value of firm j is $V_j(I_j; \theta_j)$ and the initial value, $V_{j0}(I_j)$,
is given by formula (4.3), where $(X_1, \ldots, X_m, R)$ span the efficient set.
Consider now that firm j is financed by the q different claims. The random
variable end-of-period value of firm j, $\sum_1^q f_k$, is still given by $V_j(I_j; \theta_j)$. By
hypothesis, there exists an equilibrium such that the distribution of the efficient
portfolio set remains unchanged, and therefore the distribution of
$(X_1, \ldots, X_m, R)$ remains unchanged. By inspection of formula (4.3), the
initial value of firm j will remain unchanged, and therefore $\sum_1^q f_{k0} = V_{j0}(I_j)$.

Hence, for a given investment policy, the way in which the firm finances its
investment will not affect the market value of the firm unless the choice of
financial instruments changes the return distributions of the efficient portfolio
set. Theorem 4.14 is representative of a class of theorems that describe the
impact of financing policy on the market value of a firm when the investment
decision is held fixed, and this class is generally referred to as the Modigliani-
Miller Hypothesis, after the pioneering work in this direction by Modigliani
and Miller.16
Clearly, a sufficient condition for Theorem 4.14 to obtain is that each of the
financial claims issued by the firm is a "redundant security" whose payoff can
be replicated by combining already-existing securities. This condition is satisfied
by the subclass of corporate liabilities that provide for linear sharing
rules (i.e. $f_k(V) = a_k V + b_k$, where $\sum_1^q a_k = 1$ and $\sum_1^q b_k = 0$). Unfortunately,
as will be shown in Section 8, most common types of financial instruments
issued by corporations have non-linear payoff structures. As Stiglitz (1969,
1974) has shown for the Arrow-Debreu and Capital Asset Pricing Models,
16 Modigliani and Miller (1958). See also Stiglitz (1969, 1974), Fama (1978), and Miller (1977).
The "MM" concept has also been applied in other parts of monetary economics as in Wallace
(1981).
linearity of the sharing rules is not a necessary condition for Theorem 4.14 to
obtain. Nevertheless, the existence of non-linear payoff structures among wide
classes of securities makes the establishment of conditions under which the
hypothesis of Theorem 4.14 is valid no small matter.
Beyond the issue of whether firms can optimally separate their investment
and financing decisions, the fact that many securities have non-linear sharing
rules raises serious questions about the robustness of spanning models. As
already discussed, the APT model, for example, has attracted much interest
because it makes no explicit assumptions about preferences and places seem-
ingly few restrictions on the joint probability distribution of security returns. In
the APT model, $(X_1, \ldots, X_m, R)$ span the set of optimal portfolios and there
exist m numbers $(a_{1k}, \ldots, a_{mk})$ for each security k, k = 1, ..., n, such that
$Z_k = \sum_1^m a_{ik}(X_i - R) + R + \varepsilon_k$, where $E(\varepsilon_k) = E(\varepsilon_k \mid X_1, \ldots, X_m) = 0$.
Suppose that security k satisfies this condition and security q has a payoff
structure that is given by $Z_q = f(Z_k)$, where f is a non-linear function. If
security q is to satisfy this condition, then there must exist numbers
$(a_{1q}, \ldots, a_{mq})$ so that for all possible values of $(X_1, \ldots, X_m)$,
$E[f(\sum_1^m a_{ik}(X_i - R) + R + \varepsilon_k) \mid X_1, \ldots, X_m] = \sum_1^m a_{iq}(X_i - R) + R$. However,
unless $\varepsilon_k \equiv 0$ and $\varepsilon_q \equiv 0$, such a set of numbers cannot be found for a general
non-linear function f.
Since the APT model only has practical relevance if for most securities,
var(ek) > 0, it appears that the reconciliation of non-trivial spanning models
with the widespread existence of securities with non-linear payoff structures
requires further restrictions on either the probability distributions of securities
returns or investor preferences. How restrictive these conditions are cannot be
answered in the abstract. First, the introduction of general-equilibrium pricing
conditions on securities will impose some restrictions on the joint distribution
of returns. Second, the discussed benefits to individuals from having a set of
spanning mutual funds may induce the creation of financial intermediaries or
additional financial securities, that together with pre-existing securities will
satisfy the conditions of Theorem 4.6. Although the intertemporal models of
Sections 7-10 will explore these possibilities in detail, we examine here one
important area of non-linear risk-sharing: namely, personal bankruptcy.
With limited liability on the n risky assets ($Z_j \ge 0$, j = 1, ..., n), an unconstrained
portfolio return, $Z = \sum_1^n w_j(Z_j - R) + R$, can take on negative values
only if there is short selling (i.e. $w_j < 0$ for some j) or borrowing (i.e.
$\sum_1^n w_j > 1$). In the formulation of the portfolio-selection problem, the investor's
tor's portfolio is placed in escrow as collateral for all loans. However, because
the portfolio represents the investor's entire wealth, the value of the portfolio
is the only recourse for the investor's creditors to be paid. Hence, if an
investor's optimal unconstrained portfolio has the possibility that Z* < 0, then
the lenders of securities or cash may receive less than their promised payments.
Suppose, for example, that the investor's portfolio has $w_j^* \ge 0$ and $\sum_1^n w_j^* > 1$,
so that he borrows. Under our assumptions that neglect personal bankruptcy,
the investor borrows $(\sum_1^n w_j^* - 1)W_0$ and pays $R(\sum_1^n w_j^* - 1)W_0$ at the
end of the period. If, however, we take account of personal bankruptcy, then
the payment actually received by the investor's creditor is $R(\sum_1^n w_j^* - 1)W_0$ if
$Z^* \ge 0$ and $[R(\sum_1^n w_j^* - 1) + Z^*]W_0 = (\sum_1^n w_j^* Z_j)W_0$ if $Z^* < 0$. Thus, the actual
sharing rule between the investor and his creditor is that the investor receives
$W_0 \max[0, Z^*]$ and the creditor receives $W_0 \min[(\sum_1^n w_j^* - 1)R, \sum_1^n w_j^* Z_j] =
W_0[(\sum_1^n w_j^* - 1)R - \max(0, -Z^*)]$. Therefore, personal bankruptcy creates,
de facto, a set of securities with payoffs that are non-linear functions of the
returns on the n underlying risky assets.
Under the terms for borrowing and short selling in the unconstrained case,
the investor's end-of-period wealth is given by $Z^* W_0$. Under the same terms,
but with bankruptcy, the investor receives $W_0 \max[0, Z^*]$. In effect, the
bankruptcy provision guarantees that the value of the investor's portfolio is
never negative, and the provider of that guarantee is the investor's creditor.
That is, the payoff pattern to the investor is as if he held the unconstrained
portfolio, $Z^* W_0$, together with a "portfolio-value" guarantee with payoff
$W_0 \max[0, -Z^*]$. But, of course, the lenders of securities and cash recognize
that they are implicitly supplying this guarantee to the investor and realize that
if $Z^* < 0$, they will receive less than their promised payments. They will
therefore charge for the guarantee.
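
The sharing rule just described can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `sharing_rule` and the sample numbers (one risky asset, $w = 3$, $R = 1.05$, $W_0 = 100$) are illustrative assumptions. It computes the investor's payoff $W_0 \max[0, Z^*]$, the creditor's receipt, and the payoff of the implicit guarantee $W_0 \max[0, -Z^*]$.

```python
def sharing_rule(weights, returns, R, W0):
    """Split the end-of-period portfolio value between investor and creditor.

    weights : fractions w_j of initial wealth held in each risky asset
    returns : realized returns per dollar Z_j on those assets
    R       : return per dollar on the riskless security
    W0      : initial wealth
    """
    # Unconstrained portfolio return Z* = sum_j w_j (Z_j - R) + R
    z_star = sum(w * (z - R) for w, z in zip(weights, returns)) + R
    promised = (sum(weights) - 1.0) * R            # promised repayment per dollar of W0
    risky_value = sum(w * z for w, z in zip(weights, returns))
    investor = W0 * max(0.0, z_star)               # limited liability for the investor
    creditor = W0 * min(promised, risky_value)     # creditor receives the smaller amount
    guarantee = W0 * max(0.0, -z_star)             # shortfall absorbed by the creditor
    return z_star, investor, creditor, guarantee


# Hypothetical numbers: one risky asset with w = 3, so the investor borrows 2 * W0.
for z in (1.20, 0.50):
    z_star, investor, creditor, guarantee = sharing_rule([3.0], [z], R=1.05, W0=100.0)
    print(z, round(z_star, 4), round(investor, 2), round(creditor, 2), round(guarantee, 2))
```

In the low-return state the creditor's receipt equals the promised payment less the guarantee payoff, which is the identity $W_0[(\sum w_j^* - 1)R - \max(0, -Z^*)]$ stated above.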
Let $F(w_1, \ldots, w_n)$ denote the price charged by creditors for a guarantee
security with payoff function $\max(0, -\sum_{j=1}^{n} w_j (Z_j - R) - R)$. As will be
discussed in Section 8, this payoff function is identical to the one for a put option
(on the portfolio) with a zero exercise price. $F \geq 0$, and we assume that the
price schedule is a twice-continuously-differentiable function. $F = 0$ only if
creditors believe that $\mathrm{prob}\{\sum_{j=1}^{n} w_j (Z_j - R) + R < 0\} = 0$, and
$F_j(w_1, \ldots, w_n) \equiv \partial F(w_1, \ldots, w_n)/\partial w_j = 0$, $j = 1, \ldots, n$,
if $\mathrm{prob}\{\sum_{j=1}^{n} w_j (Z_j - R) + R > 0\} = 1$.
To capture the effect of personal bankruptcy and ensure the non-negativity
of end-of-period wealth, we require that the investor must always purchase a
guarantee security on his underlying risky-asset portfolio. The payoff function
to one "unit" is
$\sum_{j=1}^{n} w_j (Z_j - R) + R + \max(0, -\sum_{j=1}^{n} w_j (Z_j - R) - R) = \max(0, \sum_{j=1}^{n} w_j (Z_j - R) + R)$
and the price per unit is $1 + F(w_1, \ldots, w_n)$. The return per dollar invested in
each unit is thus $\max(0, \sum_{j=1}^{n} w_j (Z_j - R) + R)/[1 + F(w_1, \ldots, w_n)]$. The
portfolio-selection problem taking account of personal bankruptcy is formulated as:

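$$\max_{\{w_1, \ldots, w_n\}} E\left\{ U\left( \frac{\max\left[0, \sum_{j=1}^{n} w_j (Z_j - R) + R\right] W_0}{1 + F(w_1, \ldots, w_n)} \right) \right\}. \tag{4.4}$$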

If the price schedule $\{F\}$ is such that an interior maximum exists, then the
first-order conditions for the optimal portfolio, ($Z^* = \sum_{j=1}^{n} w_j^* (Z_j - R) + R$), are
given by

$$E^{+}\left\{ U'\left[\frac{Z^* W_0}{1 + F^*}\right] \left( Z_j - R - \frac{F_j^* Z^*}{1 + F^*} \right) \right\} = 0, \quad j = 1, \ldots, n, \tag{4.5}$$

where $E^{+}$ is the partial-expectation operator over the portion of the joint
distribution of $Z_1, \ldots, Z_n$ such that $Z^* \geq 0$; $F^* \equiv F(w_1^*, \ldots, w_n^*)$, and
$F_j^* \equiv F_j(w_1^*, \ldots, w_n^*)$, $j = 1, \ldots, n$.
With homogeneous probability beliefs among investors and creditors, if
$\mathrm{prob}\{Z^* > 0\} = 1$, then $E^{+} = E$, $F^* = 0$, and $F_j^* = 0$, $j = 1, \ldots, n$. Hence, for
all such optimal portfolios, we have from (4.5) that $E\{U'[Z^* W_0](Z_j - R)\} = 0$,
$j = 1, \ldots, n$, which is identical to (2.4). Therefore, the optimal portfolios
selected by these investors will be the same with or without explicit recognition
of the personal-bankruptcy constraint.
By inspection, the solution for $Z^*$ in (4.5) depends on the price schedule
$\{F\}$ and therefore, without further specification, little can be said about the
relation between such portfolios and the portfolios contained in the unconstrained
efficient portfolio set, $\Psi^e$. Consider, however, an institutional environment
in which there is a default-free intermediary that will buy and sell
put options on any security or portfolio of securities. $F(w_1, \ldots, w_n)$ must,
therefore, equal the price of a put option with zero exercise price on a portfolio
with return $\sum_{j=1}^{n} w_j (Z_j - R) + R$. Because the put options are traded,
personal-portfolio equilibrium in an unconstrained environment requires that the returns
on these put options satisfy (2.4) in the same way that any $Z_j$ does. Hence, for
any $Z_e \in \Psi^e$ and any $(w_1, \ldots, w_n)$, we have that

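$$E\left\{ V'(Z_e) \left[ \frac{\max\left(0, \sum_{j=1}^{n} w_j (Z_j - R) + R\right)}{1 + F(w_1, \ldots, w_n)} - R \right] \right\} = 0, \tag{4.6}$$

where $V$ is the strictly risk-averse utility function such that $Z_e$ is the associated
optimal portfolio. We can rewrite (4.6) as:

$$1 + F(w_1, \ldots, w_n) = E\left\{ G(Z_e) \max\left(0, \sum_{j=1}^{n} w_j (Z_j - R) + R\right) \right\} / R = E^{+}\left\{ G(Z_e) \left( \sum_{j=1}^{n} w_j (Z_j - R) + R \right) \right\} / R, \tag{4.7}$$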



where $G \equiv V'(Z_e)/E\{V'(Z_e)\}$ and, as in (4.5), $E^{+}$ is the partial-expectation
operator over the region $\sum_{j=1}^{n} w_j (Z_j - R) + R \geq 0$. By differentiating (4.7), we have
that

$$F_j(w_1, \ldots, w_n) = E^{+}\{ G(Z_e)(Z_j - R) \}/R, \quad j = 1, \ldots, n. \tag{4.8}$$


An economic interpretation of pricing formula (4.7) is as follows. Let
$dP(Z_1, \ldots, Z_n)$ denote the joint probability density function for $Z_1, \ldots, Z_n$.
Define $dQ(Z_1, \ldots, Z_n) \equiv G(Z_e)\, dP(Z_1, \ldots, Z_n)$. By definition, $G > 0$ and
$E\{G\} = 1$. Hence, $dQ$ is a well-defined probability density function with the
property that $dQ = 0$ if and only if $dP = 0$. From (4.7), the pricing function can
be expressed as:

$$1 + F(w_1, \ldots, w_n) = E_Q\left\{ \max\left(0, \sum_{j=1}^{n} w_j (Z_j - R) + R\right) \right\} / R, \tag{4.9}$$

where $E_Q$ and $E_Q^{+}$ are the corresponding expectation operators over $dQ$. By
inspection, (4.9) is the classic present value formula with discounting at the
riskless interest rate. Because (4.9) applies for any choice of $(w_1, \ldots, w_n)$, it
follows that the expected return (as measured over the $dQ$ distribution) on
every traded security is the same and equal to $R$. Hence, $dQ$ is a "risk-
adjusted" distribution for all traded securities. In the development of their
utility-based pricing theory for warrants and options, Samuelson and Merton
(1969) call $dQ$ the "util-prob" distribution for security returns. Although all
investors have the same $dP$, $dQ$ will, in general, be different for each investor.
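
To make the construction of $dQ$ concrete, the following is a minimal discrete-state sketch under stated assumptions: a single risky asset, three equally hypothetical states with made-up probabilities and returns, and a logarithmic utility $V(W) = \log W$ chosen only for illustration. The helper names (`log_utility_weight`, `util_prob`, `unit_price`) are not from the text. The code solves the first-order condition (2.4) by bisection, forms $G = V'(Z_e)/E\{V'(Z_e)\}$, sets $dQ = G\,dP$, and then prices a unit as in (4.9).

```python
def log_utility_weight(probs, z, R, tol=1e-12):
    """Solve E{V'(Z_e)(Z - R)} = 0 for V = log, where Z_e = w (Z - R) + R."""
    excess = [zi - R for zi in z]

    def foc(w):
        return sum(p * x / (w * x + R) for p, x in zip(probs, excess))

    # Keep Z_e > 0 in every state; this needs excess returns of both signs.
    lo = -R / max(excess) * (1 - 1e-9)
    hi = R / -min(excess) * (1 - 1e-9)
    while hi - lo > tol:            # foc is strictly decreasing in w
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def util_prob(probs, z, R):
    """Return the optimal weight and the util-prob distribution dQ = G dP."""
    w = log_utility_weight(probs, z, R)
    z_e = [w * (zi - R) + R for zi in z]
    marginal = [1.0 / v for v in z_e]                      # V'(Z_e) for V = log
    mean_marginal = sum(p * m for p, m in zip(probs, marginal))
    g = [m / mean_marginal for m in marginal]              # G > 0 and E{G} = 1
    q = [p * gi for p, gi in zip(probs, g)]
    return w, q


def unit_price(q, z, R, w):
    """1 + F(w) = E_Q{max(0, w (Z - R) + R)} / R, as in (4.9)."""
    return sum(qi * max(0.0, w * (zi - R) + R) for qi, zi in zip(q, z)) / R


# Hypothetical data: three states, one risky asset.
probs, z, R = [0.3, 0.4, 0.3], [0.8, 1.0, 1.4], 1.05
w_e, q = util_prob(probs, z, R)
print("E_Q[Z] =", sum(qi * zi for qi, zi in zip(q, z)), "vs R =", R)
print("unit price at w = 1:", unit_price(q, z, R, 1.0))
print("unit price at w = 6:", unit_price(q, z, R, 6.0))
```

Under $dQ$ the risky asset earns $R$ in expectation, so an unlevered position ($w = 1$) has a unit price of one and $F = 0$, while a heavily levered position that can default carries a strictly positive guarantee price.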
Under the pricing assumption of (4.7) and (4.8), we have the following
connection between the set of optimal underlying risky-asset portfolios with
personal bankruptcy and the unconstrained efficient portfolio set, $\Psi^e$.

Theorem 4.15. If, for every $Z_e\,(\equiv \sum_{j=1}^{n} w_j^e (Z_j - R) + R) \in \Psi^e$, the
portfolio-guarantee price schedule in (4.5) satisfies (4.7) and (4.8) for $w_j = w_j^e$,
$j = 1, \ldots, n$, then for any strictly concave and increasing $U$, there exists a solution
$(w_1^*, \ldots, w_n^*)$ to (4.5) such that $Z^* \in \Psi^e$.

Proof. Consider any strictly concave and increasing $U$. Let $Z^{**}$ denote the
return on the associated optimal unconstrained portfolio, $(w_1^{**}, \ldots, w_n^{**})$, for
initial wealth $W_0^{+}$. From (2.4), $E\{U'[Z^{**} W_0^{+}](Z_j - R)\} = 0$, $j = 1, \ldots, n$, and
hence, $Z^{**} \in \Psi^e$. Therefore, by hypothesis, (4.7) and (4.8) apply for
$G = U'[Z^{**} W_0^{+}]/E\{U'[Z^{**} W_0^{+}]\}$ and $w_j = w_j^{**}$, $j = 1, \ldots, n$. From (4.7) and
(4.8), it follows that
$F_j(w_1^{**}, \ldots, w_n^{**})/[1 + F(w_1^{**}, \ldots, w_n^{**})] = E^{+}\{U'[Z^{**} W_0^{+}](Z_j - R)\}/E^{+}\{U'[Z^{**} W_0^{+}] Z^{**}\}$,
$j = 1, \ldots, n$. By rearranging terms,
$E^{+}\{U'[Z^{**} W_0^{+}][Z_j - R - F_j(w_1^{**}, \ldots, w_n^{**}) Z^{**}/[1 + F(w_1^{**}, \ldots, w_n^{**})]]\} = 0$,
$j = 1, \ldots, n$. Therefore, by inspection, for $w_j^* = w_j^{**}$, $j = 1, \ldots, n$, and
$W_0^{+} = W_0/(1 + F^*)$, $Z^* = Z^{**}$ will satisfy (4.5).
Hence, there exists a solution to (4.5) such that $Z^* \in \Psi^e$.

Provided that the solution to (4.5) is unique, under the hypothesized conditions
of Theorem 4.15, any set of portfolios, $(X_1, \ldots, X_m, R)$, that spans $\Psi^e$
will also span the set of optimal underlying risky-asset portfolios $\{Z^*\}$, which
take account of the personal bankruptcy constraint.
To motivate the pricing schedule given in (4.7) and (4.8), we posited the
existence of a default-free intermediary that buys or sells put options on any
security or portfolio. This assumption implies a very rich set of available
risk-sharing financial instruments for investors. Although sufficient, such a set
is not required for the hypothesized conditions of Theorem 4.15 to obtain.
Suppose, for example, that no put options are traded, and the intermediary
issues only portfolio guarantees as part of a unit containing the underlying
risky-asset portfolio and restricts each investor to the purchase of only one type
of unit. If borrowing and short selling are permitted, this is no more than an
institutional representation for the lenders of securities and cash, and there is,
otherwise, no expansion in the set of risk-sharing opportunities for investors.
Nevertheless, provided that prices charged for the guarantees satisfy (4.7) and
(4.8), Theorem 4.15 still obtains. Moreover, (4.7) and (4.8) are consistent with
competitive-equilibrium pricing provided that the intermediary is default free.
Theorem 4.15 will not apply for portfolios in $\Psi^e$ such that there is no
feasible investment strategy for the intermediary to ensure no default if it
charges a finite price for the portfolio guarantee. Two models in which such
feasible strategies always exist and Theorem 4.15 applies are the Arrow-
Debreu complete-markets model and the continuous-time, intertemporal
model. Indeed, Theorem 4.15 was first proved by Cox and Huang (1989) in the
context of the continuous-time model. However, the effects of the non-
linearities induced by personal bankruptcy on the spanning theorems for the
static model of this section are not as yet fully worked out.
An alternative approach to the development of non-trivial spanning
theorems is to derive a class of utility functions for investors such that even
with arbitrary joint probability distributions for the available securities,
investors within the class can generate their optimal portfolios from the spanning
portfolios. Let $\Psi^u$ denote the set of optimal portfolios selected from $\Psi^f$ by
investors with strictly concave von Neumann-Morgenstern utility functions
$\{U_i\}$. Cass and Stiglitz (1970) have proved the following theorem.

Theorem 4.16. There exists a portfolio with return $X$ such that $(X, R)$ span $\Psi^u$
if and only if $A_i(W) = 1/(a_i + bW) > 0$, where $A_i$ is the absolute risk-aversion
function for investor $i$ in $\Psi^u$.[17]

The family of utility functions whose absolute risk-aversion functions can be
written as $1/(a + bW) > 0$ is called the "HARA" (hyperbolic absolute risk
aversion) family.[18] By appropriate choices for $a$ and $b$, various members of the
family will exhibit increasing, decreasing, or constant absolute and relative
risk-aversion. Hence, if each investor's utility function could be approximated
by some member of the HARA family, then it might appear that this
alternative approach would be fruitful. However, it should be emphasized that
the $b$ in the statement of Theorem 4.16 does not have a subscript $i$, and
therefore, for separation to obtain, all investors in $\Psi^u$ must have virtually the
same utility function.[19] Moreover, they must agree on the joint probability
distribution for $(Z_1, \ldots, Z_n)$. Hence, the only significant way in which
investors can differ is in their endowments of initial wealth.
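
How the choice of $a$ and $b$ governs the behavior of risk aversion within the HARA family can be illustrated with a short sketch. The function names below are illustrative assumptions; the formulas are $A(W) = 1/(a + bW)$, relative risk aversion $W A(W)$, and the derivative $A'(W) = -b/(a + bW)^2$, which follows by differentiating $A$.

```python
def absolute_risk_aversion(W, a, b):
    """A(W) = 1/(a + b W), defined only where a + b W > 0."""
    denom = a + b * W
    if denom <= 0:
        raise ValueError("a + b*W must be positive")
    return 1.0 / denom


def relative_risk_aversion(W, a, b):
    """W * A(W)."""
    return W * absolute_risk_aversion(W, a, b)


def slope_of_absolute_risk_aversion(W, a, b):
    """A'(W) = -b/(a + b W)^2, so its sign is the opposite of the sign of b."""
    return -b * absolute_risk_aversion(W, a, b) ** 2


# b = 0: constant absolute risk aversion 1/a.
print(absolute_risk_aversion(1.0, 2.0, 0.0), absolute_risk_aversion(50.0, 2.0, 0.0))
# a = 0, b > 0: constant relative risk aversion 1/b.
print(relative_risk_aversion(3.0, 0.0, 0.5), relative_risk_aversion(10.0, 0.0, 0.5))
# b < 0: increasing absolute risk aversion, valid only for W < -a/b.
print(slope_of_absolute_risk_aversion(4.0, 10.0, -1.0))
```

The domain check mirrors the restriction $a + bW > 0$ in the statement of Theorem 4.16.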
Cass and Stiglitz (1970) also examine the possibilities for more-general
non-trivial spanning (i.e. $1 \leq m < n$) by restricting the class of utility functions
and conclude, "... it is the requirement that there be any mutual funds, and
not the limitation on the number of mutual funds, which is the restrictive
feature of the property of separability" (p. 144). Hence, the Cass and Stiglitz
analysis is essentially a negative report on this approach to developing spanning
theorems.
In closing this section, two further points should be made. First, although
virtually all the spanning theorems require the generally implausible assump-
tion that all investors agree upon the joint probability distribution for se-
curities, it is not so unreasonable when applied to the theory of financial
intermediation and mutual fund management. In a world where the economic
concepts of "division of labor" and "comparative advantage" have content,
then it is quite reasonable to expect that an efficient allocation of resources
would lead to some individuals (the "fund managers") gathering data and
actively estimating the joint probability distributions and the rest either buying
this information directly or delegating their investment decisions by "agreeing
to agree" with the fund managers' estimates. If the distribution of returns is

[17] For this family of utility functions, the probability distribution for securities cannot be
completely arbitrary without violating the von Neumann-Morgenstern axioms. For example, it is
required that for every realization of $W$, $W > -a/b$ for $b > 0$ and $W < -a/b$ for $b < 0$. The latter
condition is especially restrictive.
[18] A number of authors have studied the properties of this family. See Merton (1971, p. 389) for
references.
[19] As discussed in footnote 17, the range of values for $a_i$ cannot be arbitrary for a given $b$.
Moreover, the sign of $b$ uniquely determines the sign of $A'(W)$.

such that non-trivial spanning of $\Psi^e$ does not obtain, then there are no gains to
financial intermediation over the direct sale of the distribution estimates.
However, if non-trivial spanning does obtain and the number of risky spanning
portfolios, m, is small, then a significant reduction in redundant information
processing and transactions can be produced by the introduction of mutual
funds. If a significant coalition of individuals can agree upon a common source
for the estimates and if they know that, based on this source, a group of mutual
funds offered spans $\Psi^e$, then they need only be provided with the joint
distribution for these mutual funds to form their optimal portfolios. On the
supply side, if the characteristics of a set of spanning portfolios can be
identified, then the mutual fund managers will know how to structure the
portfolios of the funds they offer. We explore this point further in Section 9.
The second point concerns the riskless security. It has been assumed
throughout that there exists a riskless security. Although some of the specifica-
tions will change slightly, virtually all the derived theorems can be shown to be
valid in the absence of a riskless security.[20] However, the existence of a riskless
security vastly simplifies many of the proofs.

5. Two special models of one-period portfolio selection

The two most cited models in the literature of portfolio selection are the
time-state preference model of Arrow (1953, 1964) and Debreu (1959) and the
mean-variance model of Markowitz (1959) and Tobin (1958). Because these
models have been central to the development of the microeconomic theory of
investment, there are already many review and survey articles devoted just to
each of these models.[21] Hence, only a focused description of each model is
presented here, with specific emphasis on how each model fits within the
framework of the analyses presented in the other sections. In particular, we
show that these models are special cases of the spanning models of the
preceding section. We also use the Arrow-Debreu model to re-examine the
portfolio-selection problem with the non-negativity constraint on wealth.
Under appropriate conditions, both the Arrow-Debreu and Markowitz-Tobin
models can be interpreted as multi-period, intertemporal portfolio-selection
models. However, such an interpretation is postponed until later sections.
The structure of the Arrow-Debreu model is described as follows. Consider
an economy where all possible configurations for the economy at the end of the
period can be described in terms of M possible states of nature. The states are

[20] Cf. Ross (1978) for spanning proofs in the absence of a riskless security. Black (1972) and
Merton (1972) derive the two-fund theorem for the mean-variance model with no riskless security.
[21] For the Arrow-Debreu model, see Hirshleifer (1965, 1966, 1970), Myers (1968), and Radner
(1972). For the mean-variance model, see Jensen (1972a, 1972b), and Sharpe (1970).

mutually exclusive and exhaustive. It is assumed that there are $N$ risk-averse
individuals with initial wealth $W_0^k$ and a von Neumann-Morgenstern utility
function $U^k(W)$ for investor $k$, $k = 1, \ldots, N$. Each individual acts on the basis
of subjective probabilities for the states of nature denoted by $P^k(\theta)$,
$\theta = 1, \ldots, M$. While these subjective probabilities can differ across investors, it is
assumed for each investor that $0 < P^k(\theta) < 1$, $\theta = 1, \ldots, M$. As was assumed
in Section 2, there are $n$ risky securities with returns per dollar $Z_j$ and initial
market value, $V_{j0}$, $j = 1, \ldots, n$, and the "perfect market" assumptions of that
section, Assumptions 1-4, are assumed here as well. Moreover, if state $\theta$
obtains, then the return on security $j$ will be $Z_j(\theta)$, and all investors agree on
the functions $\{Z_j(\theta)\}$. Because the set of states is exhaustive,
$[Z_j(1), \ldots, Z_j(M)]$ describe all the possible outcomes for the returns on
security $j$. In addition, there are available $M$ "pure" securities with the
properties that, for $i = 1, \ldots, M$, one unit (share) of pure security $i$ will be worth
\$1 at the end of the period if state $i$ obtains and will be worthless if state $i$ does
not obtain. If $\Pi_i$ denotes the price per share of pure security $i$ and if $X_i$ denotes
its return per dollar, then for $i = 1, \ldots, M$, $X_i$ as a function of the states of
nature can be written as $X_i(\theta) = 1/\Pi_i$ if $\theta = i$ and $X_i(\theta) = 0$ if $\theta \neq i$. All
investors agree on the functions $\{X_i(\theta)\}$, $i, \theta = 1, \ldots, M$.
Let $Z = Z(N_1, \ldots, N_M)$ denote the return per dollar on a portfolio of pure
securities that holds $N_j$ shares of pure security $j$, $j = 1, \ldots, M$. If
$V_0(N_1, \ldots, N_M) \equiv \sum_{j=1}^{M} N_j \Pi_j$ denotes the initial value of this portfolio, then the
return per dollar on the portfolio, as a function of the states of nature, can be
written as $Z(\theta) = N_\theta/V_0$, $\theta = 1, \ldots, M$.

Proposition 5.1. There exists a riskless security, and its return per dollar $R$
equals $1/(\sum_{i=1}^{M} \Pi_i)$.

Proof. Consider the pure-security portfolio that holds one share of each pure
security ($N_j = 1$, $j = 1, \ldots, M$). The return per dollar $Z$ is the same in every
state of nature and equals $1/V_0(1, \ldots, 1)$. Hence, there exists a riskless
security and, by Assumption 3, its return $R$ is given by $1/(\sum_{j=1}^{M} \Pi_j)$.

Proposition 5.2. For each security $j$ with return $Z_j$, there exists a portfolio of
pure securities, whose return per dollar exactly replicates $Z_j$.

Proof. Let $Z^j \equiv Z(Z_j(1), \ldots, Z_j(M))$ denote the return on a portfolio of
pure securities with $N_\theta = Z_j(\theta)$, $\theta = 1, \ldots, M$. It follows that
$V_0(Z_j(1), \ldots, Z_j(M)) = \sum_{i=1}^{M} \Pi_i Z_j(i)$ and $Z^j(\theta) = Z_j(\theta)/V_0$, $\theta = 1, \ldots, M$.
Consider a three-security portfolio with return $Z_p$ where fraction $V_0$ is invested
in $Z^j$; fraction $-1$ is invested in $Z_j$; and fraction $1 - V_0 - (-1) = (2 - V_0)$ is
invested in the riskless security. The return per dollar on this portfolio as a
function of the states of nature can be written as:

$$Z_p(\theta) = (2 - V_0) R + V_0 Z^j(\theta) - Z_j(\theta) = (2 - V_0) R,$$

which is the same for all states. Hence, $Z_p$ is a riskless security, and by
Assumption 3, $Z_p(\theta) = R$. Therefore, $V_0 = 1$, and $Z^j(\theta) = Z_j(\theta)$,
$\theta = 1, \ldots, M$.
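
Propositions 5.1 and 5.2 can be illustrated with a small numerical sketch. The pure-security prices and the security returns below are hypothetical, and the helper names are assumptions introduced only for this example. The code computes the riskless return $R = 1/\sum \Pi_i$, builds the replicating portfolio with $N_\theta = Z_j(\theta)$, and evaluates the three-security portfolio used in the proof.

```python
def riskless_return(state_prices):
    """Proposition 5.1: one share of every pure security pays $1 in each state."""
    return 1.0 / sum(state_prices)


def replicating_portfolio(state_prices, z_j):
    """Hold N_theta = Z_j(theta) shares of pure security theta."""
    shares = list(z_j)
    cost = sum(pi * n for pi, n in zip(state_prices, shares))   # V_0
    returns = [n / cost for n in shares]                        # Z^j(theta)
    return shares, cost, returns


def three_security_return(state_prices, z_j):
    """(2 - V_0) R + V_0 Z^j(theta) - Z_j(theta), state by state."""
    _, cost, returns = replicating_portfolio(state_prices, z_j)
    R = riskless_return(state_prices)
    return [(2 - cost) * R + cost * zr - z for zr, z in zip(returns, z_j)]


state_prices = [0.30, 0.40, 0.25]     # hypothetical Pi_1, Pi_2, Pi_3
z_priced = [1.5, 1.0, 0.6]            # consistent with the pure-security prices
z_mispriced = [1.5, 1.0, 1.0]         # replicating cost differs from one

print(riskless_return(state_prices))
print(replicating_portfolio(state_prices, z_priced)[1])
print(three_security_return(state_prices, z_mispriced))
```

For the mispriced security the three-security portfolio is riskless but earns $(2 - V_0)R \neq R$, which is the arbitrage that Assumption 3 rules out; for the correctly priced security the replicating cost is one and the returns coincide state by state.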

Proposition 5.3. The set of pure securities with returns $(X_1, \ldots, X_M)$ span the
set of all feasible portfolios that can be constructed from the $M$ pure securities
and the $n$ other securities.

The proof follows immediately from Propositions 5.1 and 5.2. Hence,
whenever a complete set of pure securities exists or can be constructed from
the available securities, then every feasible portfolio can be replicated by a
portfolio of pure securities. Models in which such a set of pure securities exists
are called complete-markets models in the sense that any additional securities
or markets would be redundant. Necessary and sufficient conditions for such a
set to be constructed from the available $n$ risky securities alone and therefore,
for markets to be complete, are that $n \geq M$; a riskless asset can be created and
Assumption 3 holds; and the rank of the variance-covariance matrix of
returns, $\Omega$, equals $M - 1$.
The connection between the pure securities of the Arrow-Debreu model
and the mutual fund theorems of Section 4 is immediate. To put this model in
comparable form, we can choose the alternative spanning set
$(X_1, \ldots, X_m, R)$, where $m \equiv M - 1$. From Theorem 4.3, the returns on the
risky securities can be written as:

$$Z_j = R + \sum_{i=1}^{m} a_{ij} (X_i - R), \quad j = 1, \ldots, n, \tag{5.1}$$

where the numbers $(a_{ij})$ are given by Proposition 4.3.

Note that nowhere in the derivation were the subjective probability assessments
of the individual investors required. Hence, individual investors need
not agree on the joint distribution for $(X_1, \ldots, X_m)$. However, by Theorem
4.3, investors cannot have arbitrary beliefs in the sense that they must agree on
the $(a_{ij})$ in (5.1).

Proposition 5.4. If $V_j(\theta)$ denotes the end-of-period value of security $j$, if state $\theta$
obtains, then a necessary condition for equilibrium in the securities market is that

$$V_{j0} = \sum_{k=1}^{M} \Pi_k V_j(k), \quad j = 1, \ldots, n.$$

The proof follows immediately from the proof of Proposition 5.2. It was shown
there that $V_0 \equiv \sum_{k=1}^{M} \Pi_k Z_j(k) = 1$. Multiplying both sides by $V_{j0}$ and noting the
identity $V_j(k) \equiv V_{j0} Z_j(k)$, it follows that $V_{j0} = \sum_{k=1}^{M} \Pi_k V_j(k)$.
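
As a brief numerical sketch of Proposition 5.4 (the pure-security prices and end-of-period values are hypothetical, and `initial_value` is a name introduced here for illustration), the initial value of a security is the sum of its state-contingent values weighted by the pure-security prices, and the implied returns then satisfy $\sum_k \Pi_k Z_j(k) = 1$:

```python
def initial_value(state_prices, end_values):
    """V_j0 = sum_k Pi_k V_j(k)."""
    return sum(pi * v for pi, v in zip(state_prices, end_values))


state_prices = [0.30, 0.40, 0.25]            # assumed prices Pi_k
end_values = [150.0, 100.0, 60.0]            # assumed end-of-period values V_j(k)

v_j0 = initial_value(state_prices, end_values)
returns = [v / v_j0 for v in end_values]     # Z_j(k) = V_j(k) / V_j0
print(v_j0, returns, initial_value(state_prices, returns))
```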
However, by Theorem 4.13 and Proposition 5.3, it follows that the $\{V_{j0}\}$ can
also be written as:

$$V_{j0} = \frac{\bar{V}_j - \sum_{i=1}^{m} \sum_{k=1}^{m} v_{ik}\, \mathrm{cov}(X_k, V_j)(\bar{X}_i - R)}{R}, \quad j = 1, \ldots, n, \tag{5.2}$$

where $v_{ik}$ is the $i$-$k$th element of $\Omega_X^{-1}$. Hence, from (5.2) and Proposition 5.4,
it follows that the $(a_{ij})$ in (5.1) can be written as:

$$a_{ij} = [Z_j(i) - R]/[1/\Pi_i - R], \quad i = 1, \ldots, m; \; j = 1, \ldots, n. \tag{5.3}$$

From (5.3), given the prices of the securities $\{\Pi_i\}$ and $\{V_{j0}\}$, the $\{a_{ij}\}$ will be
agreed upon by all investors if and only if they agree upon the $\{V_j(i)\}$
functions.
While it is commonly believed that the Arrow-Debreu model is completely
general with respect to assumptions about investors' beliefs, the assumption
that all investors agree on the $\{V_j(i)\}$ functions can impose non-trivial
restrictions on these beliefs. In particular, when there is production, it will in general
be inappropriate to define the states, tautologically, by the end-of-period
values of the securities, and therefore investors will at least have to agree on
the technologies specified for each firm.[22] However, as discussed in Section 4, it
is unlikely that a model without some degree of homogeneity in beliefs (other
than agreement on currently observed variables) can produce testable restric-
tions. Among models that do produce such testable restrictions, the assump-
tions about investors' beliefs in the Arrow-Debreu model are among the most
general.
To perhaps provide further intuition about the solution of the portfolio-

[22] If the states are defined in terms of end-of-period values of the firm in addition to
"environmental" factors, then the firms' production decisions will, in general, alter the state-space
description which violates the assumptions of the model. Moreover, I see no obvious reason why
individuals are any more likely to agree upon the $\{V_j(i)\}$ functions than upon the probability
distributions for the environmental factors. If sufficient information is available to partition the
states into fine enough categories to produce agreement on the $\{V_j(i)\}$ functions, then, given this
information, it is difficult to imagine how rational individuals would have heterogeneous beliefs
about the probability distributions for these states. As with the standard certainty model,
agreement on the technologies is necessary for Pareto optimality in this model. However, as Peter
Diamond has pointed out to me, it is not sufficient. Sufficiency demands the stronger requirement
that everyone be "right" in their assessment of the technologies. See Varian (1985) and Black
(1986, footnote 5) on whether differences of opinion among investors can be supported in this
model.

selection problem with the non-negativity constraint on wealth as analyzed in
Section 4, we reformulate that problem in the context of the Arrow-Debreu
model as follows: the investor selects a portfolio of pure securities so as to
maximize his expected utility of end-of-period wealth, subject to his budget
constraint and the feasibility requirement that wealth cannot be negative.
Without loss of generality, we can restrict the choices to just the pure
securities, because from Proposition 5.3 these securities span the set of all
feasible portfolios.
If $N_j$ denotes the number of shares of pure security $j$ held, $j = 1, \ldots, M$,
then the budget constraint requires that $W_0 = \sum_{j=1}^{M} N_j \Pi_j$. If $W(\theta)$ denotes
end-of-period wealth in state $\theta$, then from the payoff structure for pure
securities, $W(\theta) = N_\theta$, and the non-negativity-of-wealth constraint implies that
$N_\theta \geq 0$, $\theta = 1, \ldots, M$. The constrained optimization problem can thus be
expressed as:

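$$\max_{\{N_j\}} \left\{ \sum_{j=1}^{M} P(j) U(N_j) + \lambda \left[ W_0 - \sum_{j=1}^{M} \Pi_j N_j \right] + \sum_{j=1}^{M} \gamma_j N_j \right\}, \tag{5.4}$$

where $P(\theta)$ is the investor's subjective probability for state $\theta$ and $\lambda$ and
$\gamma_1, \ldots, \gamma_M$ are Kuhn-Tucker multipliers.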
From (5.4), the first-order conditions for the optimal portfolio $\{N_j^*\}$ can be
written as:

$$0 = P(j) U'(N_j^*) - \lambda^* \Pi_j + \gamma_j^*, \quad j = 1, \ldots, M, \tag{5.5a}$$

$$0 = W_0 - \sum_{j=1}^{M} \Pi_j N_j^*, \tag{5.5b}$$

$$0 = \gamma_j^* N_j^*, \quad j = 1, \ldots, M. \tag{5.5c}$$

Because $U' > 0$ and $U'' < 0$, we have from (5.5a) and (5.5c) that

$$\gamma_j^* = \max[0, \lambda^* \Pi_j - P(j) U'(0)], \quad j = 1, \ldots, M. \tag{5.6}$$

As noted in footnote 7, we have by inspection of (5.6) that $\gamma_j^* = 0$,
$j = 1, \ldots, M$, if $U'(0) = \infty$, and therefore, for such utility functions, the
non-negativity constraint on wealth is never binding.
Because $U'$ is monotonic and strictly decreasing, it is globally invertible.
Define $G(y) \equiv (U')^{-1}(y)$. From (5.5a) and (5.5c) we have that

$$N_j^* = \max[0, G(\lambda^* \Pi_j / P(j))], \quad j = 1, \ldots, M. \tag{5.7}$$

Thus, from (5.7), only the multiplier $\lambda^*$ need be found to complete the
solution for the optimal portfolio. Substituting for $N_j^*$ from (5.7) into (5.5b) we
have that

$$0 = W_0 - \sum_{j=1}^{M} \Pi_j \max[0, G(\lambda^* \Pi_j / P(j))]. \tag{5.8}$$

Because $dG/dy < 0$ and $\Pi_j \geq 0$, $\lambda^*$ can be determined as the unique solution to
the algebraic transcendental equation (5.8).
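
Equation (5.8) is easy to solve numerically because its right-hand side is monotone in the multiplier. The following is a minimal sketch under stated assumptions: the utility function $U(W) = -e^{-W}$ (so that $U'(0) = 1$ is finite and the constraint can bind), the three-state prices and probabilities, and the function names are all illustrative choices, not part of the original analysis.

```python
import math


def inverse_marginal_utility(y):
    """G(y) = (U')^{-1}(y) for the assumed utility U(W) = -exp(-W), i.e. U'(W) = exp(-W)."""
    return -math.log(y)


def constrained_holdings(lam, prices, probs):
    """Equation (5.7): N_j = max[0, G(lam * Pi_j / P(j))]."""
    return [max(0.0, inverse_marginal_utility(lam * pi / p)) for pi, p in zip(prices, probs)]


def budget_gap(lam, W0, prices, probs):
    """Right-hand side of (5.8); it is non-decreasing in lam."""
    holdings = constrained_holdings(lam, prices, probs)
    return W0 - sum(pi * n for pi, n in zip(prices, holdings))


def solve_multiplier(W0, prices, probs, tol=1e-13):
    """Bisection for lambda* in (5.8), assuming W0 > 0."""
    hi = max(p / pi for pi, p in zip(prices, probs))   # all holdings are zero here, so gap = W0 > 0
    lo = hi
    while budget_gap(lo, W0, prices, probs) > 0:       # lower lam until the budget is overspent
        lo *= 0.5
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if budget_gap(mid, W0, prices, probs) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


prices = [0.50, 0.30, 0.15]      # hypothetical pure-security prices Pi_j
probs = [0.20, 0.30, 0.50]       # hypothetical subjective probabilities P(j)
W0 = 0.4

lam_star = solve_multiplier(W0, prices, probs)
n_star = constrained_holdings(lam_star, prices, probs)
print(lam_star, n_star)
```

With these numbers the first state is expensive relative to its probability, so the investor holds none of that pure security and the non-negativity constraint binds there.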
To compare the solution to the portfolio problem here with the solutions of
the preceding sections that do not take explicit account of the non-negativity
constraint on wealth, note that the unconstrained formulation of (5.4) simply
imposes $\gamma_j \equiv 0$, $j = 1, \ldots, M$. If $\{N_j^{**}\}$ denotes the optimal unconstrained
solution, then from (5.5a) with $\gamma_j = 0$, we have that

$$N_j^{**} = G(\lambda^{**} \Pi_j / P(j)), \quad j = 1, \ldots, M, \tag{5.9}$$

where $\lambda^{**}$ is the multiplier associated with the budget constraint (5.5b). As in
(5.8), $\lambda^{**}$ is determined as the solution to:

$$0 = W_0 - \sum_{j=1}^{M} \Pi_j G(\lambda^{**} \Pi_j / P(j)). \tag{5.10}$$

Noting that $\max[0, G(\lambda^* \Pi_j / P(j))] = G(\lambda^* \Pi_j / P(j)) + \max[0, -G(\lambda^* \Pi_j / P(j))]$,
we can rearrange terms and rewrite (5.8) as:

$$W_0^{+} = \sum_{j=1}^{M} \Pi_j G(\lambda^* \Pi_j / P(j)), \tag{5.11}$$

where $W_0^{+} \equiv W_0 - \sum_{j=1}^{M} \Pi_j \max[0, -G(\lambda^* \Pi_j / P(j))]$. If the non-negativity
constraint in (5.4) is binding, so that the constrained and unconstrained solutions
are not identical, then for at least one state $j$, $G(\lambda^* \Pi_j / P(j)) < 0$. It follows
from the definition of $W_0^{+}$ that $W_0^{+} < W_0$. By inspection of (5.10) and (5.11),
we see that the formal structure of the constrained solution is the same as for
the unconstrained solution, except that the value of the initial endowment in
the constrained case, $W_0^{+}$, is smaller.
From (5.9) and (5.10), the optimal unconstrained portfolio allocation depends
on initial wealth, $W_0$ [i.e. $N_j^{**} = N_j^{**}(W_0)$]. The end-of-period value of
the optimal portfolio in state $j$ is given by $W^{**}(j) = N_j^{**}(W_0)$, $j = 1, \ldots, M$.
The payoff structure to a security that makes up the "shortfall" between the
payoffs to this unconstrained portfolio and zero is given by $\max[0, -W^{**}(j)]$
in state $j$, $j = 1, \ldots, M$. As discussed in Section 4, this security can be
interpreted as a portfolio-value or loan guarantee that pays the difference
between the promised payments to the investor's creditors and the value of the
portfolio. From Proposition 5.4, the equilibrium initial price of such a guarantee,
$F[W_0]$, is given by:

$$F[W_0] = \sum_{j=1}^{M} \Pi_j \max[0, -W^{**}(j)] = \sum_{j=1}^{M} \Pi_j \max[0, -N_j^{**}(W_0)]. \tag{5.12}$$

As in Section 4, we have a rather intuitive economic interpretation of the
optimal constrained portfolio strategy: the investor solves for his optimal
unconstrained portfolio as a function of initial wealth, $\{N_j^{**}(W_0)\}$. If
$N_j^{**}(W_0) < 0$ for some state $j$, then to implement the strategy, the investor
must borrow cash or short sell securities. However, the creditors making these
loans recognize that the collateral (i.e. the portfolio) may have insufficient
value to meet the promised payments. They therefore require that the investor
purchase a loan guarantee either from them or from a third-party guarantor.
Either way the investor pays for the guarantee and therefore he only has $W_0^{+}$ to
allocate to the pure securities in his "unconstrained" portfolio, where $W_0^{+}$ is
the solution to $0 = W_0^{+} + F[W_0^{+}] - W_0$. Thus, his feasible unconstrained
solution is $\{N_j^{**}(W_0^{+})\}$. Although $N_j^{**}(W_0^{+})$ may be negative in some states, the
payoffs to the entire portfolio including the guarantee,
$N_j^{**}(W_0^{+}) + \max[0, -N_j^{**}(W_0^{+})]$, are always non-negative. From (5.9), (5.11), and (5.12) it
is straightforward to show that $N_j^{*}(W_0) = N_j^{**}(W_0^{+}) + \max[0, -N_j^{**}(W_0^{+})]$,
$j = 1, \ldots, M$. If, of course, $N_j^{**}(W_0) \geq 0$ for $j = 1, \ldots, M$, then $F[W_0] = 0$ and
$W_0^{+} = W_0$.
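
The decomposition of the constrained solution into an unconstrained portfolio plus a guarantee can be verified numerically. The sketch below rests on illustrative assumptions: the utility $U(W) = -e^{-W}$, for which (5.9) and (5.10) have the closed form used in `unconstrained_holdings`, together with hypothetical prices and probabilities. The function names are not from the text.

```python
import math


def unconstrained_holdings(W, prices, probs):
    """N_j**(W) for the assumed utility U(W) = -exp(-W).

    Here G(y) = -ln(y), so (5.9) gives N_j** = ln(P(j)/Pi_j) - ln(lambda**),
    and the budget constraint (5.10) pins down -ln(lambda**) in closed form.
    """
    log_ratio = [math.log(p / pi) for pi, p in zip(prices, probs)]
    shift = (W - sum(pi * lr for pi, lr in zip(prices, log_ratio))) / sum(prices)
    return [lr + shift for lr in log_ratio]


def guarantee_price(W, prices, probs):
    """F[W] = sum_j Pi_j max[0, -N_j**(W)], as in (5.12)."""
    holdings = unconstrained_holdings(W, prices, probs)
    return sum(pi * max(0.0, -n) for pi, n in zip(prices, holdings))


def net_wealth(W0, prices, probs, tol=1e-12):
    """Solve 0 = W+ + F[W+] - W0 for W+ by bisection, assuming W0 > 0."""
    hi = W0
    step = 1.0
    lo = W0 - step
    while lo + guarantee_price(lo, prices, probs) > W0:
        step *= 2.0
        lo -= step
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + guarantee_price(mid, prices, probs) > W0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


prices = [0.50, 0.30, 0.15]      # hypothetical pure-security prices Pi_j
probs = [0.20, 0.30, 0.50]       # hypothetical subjective probabilities P(j)
W0 = 0.4

w_plus = net_wealth(W0, prices, probs)
n_unc = unconstrained_holdings(w_plus, prices, probs)
n_con = [n + max(0.0, -n) for n in n_unc]    # unconstrained portfolio plus guarantee payoff
print(w_plus, guarantee_price(w_plus, prices, probs), n_con)
```

The combined holdings are non-negative in every state and cost exactly $W_0$, since the unconstrained portfolio costs $W_0^{+}$ and the guarantee costs $F[W_0^{+}]$.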
As noted in Section 4, Cox and Huang (1989) were the first to recognize this
relation between the unconstrained and constrained solutions, and although
they did so in the framework of a continuous-time dynamic model, their
derivation is more like the one here than the development in Section 4. In
Sections 7-9 we analyze the non-negativity constraint and bankruptcy issue in
the context of the continuous-time model. This completes our examination of
the Arrow-Debreu model, and we turn now to the Markowitz-Tobin model.
The most elementary type of portfolio selection model in which all securities
are not perfect substitutes is one where the attributes of every optimal portfolio
can be characterized by two numbers: its "risk" and its "return". The
mean-variance portfolio selection model of Markowitz (1959) and Tobin
(1958) is such a model. In this model, each investor chooses his optimal
portfolio so as to maximize a utility function of the form H[E(W), var(W)],
subject to his budget constraint, where W is his random variable end-of-period

wealth. The investor is said to be "risk-averse in a mean-variance sense" if


$H_1 > 0$, $H_2 < 0$, $H_{11} < 0$, $H_{22} < 0$, and $H_{11} H_{22} - H_{12}^2 > 0$, where subscripts
denote partial derivatives.
In an analogous fashion to the general definition of an efficient portfolio in
Section 2, a feasible portfolio will be called a mean-variance efficient portfolio
if there exists a risk-averse mean-variance utility function such that this
feasible portfolio would be preferred to all other feasible portfolios. Let $\Psi^e_{mv}$
denote the set of mean-variance efficient portfolios. As defined in Section 4,
$\Psi_{\min}$ is the set of feasible portfolios such that there exists no other portfolio
with the same expected return and a smaller variance. For a given initial wealth
$W_0$, every risk-averse investor would prefer the portfolio with the smallest
variance among those portfolios with the same expected return. Hence, $\Psi^e_{mv}$ is
contained in $\Psi_{\min}$.

Proposition 5.5. If $(Z_1, \ldots, Z_n)$ are the returns on the available risky
securities, then there exists a portfolio contained in $\Psi^e_{mv}$ with return $X$ such that
$(X, R)$ span $\Psi^e_{mv}$ and $\bar{Z}_j - R = a_j (\bar{X} - R)$, where $a_j \equiv \mathrm{cov}(Z_j, X)/\mathrm{var}(X)$,
$j = 1, 2, \ldots, n$.

The proof follows immediately from Theorem 4.9.²³ Hence, all the properties
derived in the special case of two-fund spanning (m = 1) in Section 4 apply to
the mean-variance model. Indeed, because all such investors would prefer a
higher expected return for the same variance of return, $\Psi_{mv}$ is the set of
all portfolios contained in $\Psi_{\min}$ such that their expected returns are equal
to or exceed R. Hence, as with the complete-markets model, the mean-variance
model is also a special case of the spanning models developed in
Section 4.
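
The linear relation in Proposition 5.5 can be checked numerically. The sketch below is only an illustration: the three expected returns, the covariance matrix and the riskless return are hypothetical numbers, and the risky spanning portfolio is taken to have weights proportional to the inverse covariance matrix times the expected excess returns, which is the standard mean-variance construction rather than something derived in this section.

```python
import numpy as np

# Hypothetical inputs (not from the chapter): three risky securities.
R = 1.05                                   # riskless return per dollar
Z_bar = np.array([1.08, 1.12, 1.15])       # expected returns per dollar
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])       # variance-covariance matrix of returns

# Risky spanning portfolio X: weights proportional to cov^{-1}(Z_bar - R),
# normalised so that the fractions sum to one (assumed construction).
raw = np.linalg.solve(cov, Z_bar - R)
x = raw / raw.sum()

X_bar = x @ Z_bar                          # expected return on X
var_X = x @ cov @ x                        # var(X)
a = (cov @ x) / var_X                      # a_j = cov(Z_j, X) / var(X)

lhs = Z_bar - R                            # expected excess return on each security
rhs = a * (X_bar - R)                      # a_j times expected excess return on X
print(np.allclose(lhs, rhs))
```

Under these assumptions the two sides agree for every security, and the expected return on X exceeds R, so X lies in the mean-variance efficient set.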
If investors have homogeneous beliefs, then the equilibrium version of the
mean-variance model is called the Capital Asset Pricing Model.²⁴ It follows
from Proposition 4.5 and Theorem 4.7 that, in equilibrium, the market
portfolio can be chosen as the risky spanning portfolio. From Theorem 4.8, the
equilibrium structure of expected returns must satisfy the Security Market
Line.
Because of the mean-variance model's attractive simplicity and its strong
empirical implications, a number of authors²⁵ have studied the conditions

²³ In particular, the optimal portfolio demand functions are of the form derived in the proof of
Theorem 4.9. For a complete analytic derivation, see Merton (1972).
²⁴ Sharpe (1964), Lintner (1965), and Mossin (1966) are generally credited with independent
derivations of the model. Black (1972) extended the model to include the case of no riskless
security.
²⁵ Cf. Borch (1969), Feldstein (1969), Tobin (1969), Samuelson (1967), and Chamberlain (1983).
under which such a criterion function is consistent with the expected utility
maxim. Like the studies of general spanning properties cited in Section 4, these
studies examined the question in two parts. (i) What is the class of probability
distributions such that the expected value of an arbitrary concave utility
function can be written solely as a function of mean and variance? (ii) What is
the class of strictly concave von Neumann-Morgenstern utility functions whose
expected value can be written solely as a function of mean and variance for
arbitrary distributions? Since the class of distributions in (i) was shown in
Section 4 to be equivalent to the class of finite-variance distributions that admit
two-fund spanning of the efficient set, the analysis will not be repeated here.
To answer (ii), it is straightforward to show that a necessary condition is that U
have the form $W - bW^2$, with b > 0. This member of the HARA family is
called the quadratic and will satisfy the von Neumann axioms only if $W \le 1/(2b)$
for all possible outcomes for W. Even if U is defined to be $\max[W - bW^2, 1/(4b)]$,
so that U satisfies the axioms for all W, its expected value for
general distributions can be written as a function of just E(W) and var(W)
only if the maximum possible outcome for W is less than $1/(2b)$.
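
As a small illustration of why quadratic utility reduces to a mean-variance criterion, note that $E[W - bW^2] = E(W) - b[\operatorname{var}(W) + E(W)^2]$. The sketch below evaluates both sides for two hypothetical discrete distributions that share the same mean and variance but differ in shape; the value of b and the distributions are assumptions chosen so that every outcome lies below the satiation point $1/(2b)$.

```python
import numpy as np

b = 0.1                      # hypothetical curvature; satiation point 1/(2b) = 5

def U(W):
    """Quadratic utility W - b W^2."""
    return W - b * W**2

# Two hypothetical distributions with mean 2 and variance 1, different shapes.
W_a, p_a = np.array([1.0, 3.0]), np.array([0.5, 0.5])
W_b, p_b = np.array([0.0, 2.0, 4.0]), np.array([0.125, 0.75, 0.125])

def moments(W, p):
    mean = p @ W
    return mean, p @ (W - mean) ** 2

results = []
for W, p in ((W_a, p_a), (W_b, p_b)):
    mean, var = moments(W, p)
    direct = p @ U(W)                         # E[U(W)] computed outcome by outcome
    via_moments = mean - b * (var + mean**2)  # function of mean and variance only
    results.append((mean, var, direct, via_moments))
    print(mean, var, direct, via_moments)
```

Both distributions give the same expected utility because only the first two moments enter.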
Although both the Arrow-Debreu and Markowitz-Tobin models were
shown to be special cases of the spanning models in Section 4, they deserve
special attention because they are unquestionably the genesis of these general
models.

6. Intertemporal consumption and portfolio selection theory

As in the preceding analyses, the majority of papers on investment theory
under uncertainty have assumed that individuals act so as to maximize the
expected utility of end-of-period wealth and that intraperiod revisions are not
feasible. Therefore, all events which take place after next period are irrelevant
to their decisions. Of course, investors do care about events beyond "next
period", and they can review and change their allocations periodically. Hence,
the one-period, static analyses will only be robust under those conditions such
that an intertemporally-maximizing individual acts, each period, as if he were a
one-period, expected utility-of-wealth maximizer. In this section the lifetime
consumption-portfolio selection problem is solved, and conditions are derived
under which the one-period static portfolio problem is an appropriate "surro-
gate" for the dynamic, multi-period portfolio problem.
As in the early contributions by Hakansson (1970), Samuelson (1969), and
Merton (1969), the problem of choosing optimal portfolio and consumption
rules for an individual who lives T years is formulated as follows. The
individual investor chooses his consumption and portfolio allocation for each
period so as to maximize²⁶

$$E_0\left\{\sum_{t=0}^{T-1} U[C(t), t] + B[W(T), T]\right\}, \tag{6.1}$$

where C(t) is consumption chosen at age t; W(t) is wealth at age t; $E_t$ is the
conditional expectation operator conditional on knowing all relevant information
available as of time t; the utility function (during life) U is assumed to be
strictly concave in C; and the "bequest" function B is also assumed to be
concave in W.
It is assumed that there are n risky securities with random variable returns
between time t and t + 1 denoted by $Z_1(t+1), \ldots, Z_n(t+1)$, and there is a
riskless security whose return between t and t + 1, R(t), will be known with
certainty as of time t.²⁷ When the investor "arrives" at date t, he will know the
value of his portfolio, W(t). He chooses how much to consume, C(t), and then
reallocates the balance of his wealth, W(t) - C(t), among the available securities.
Hence, the accumulation equation between t and t + 1 can be written
as:²⁸

$$W(t+1) = \left[\sum_{j=1}^{n} w_j(t)[Z_j(t+1) - R(t)] + R(t)\right][W(t) - C(t)], \tag{6.2}$$

where $w_j(t)$ is the fraction of his portfolio allocated to security j at date
t, j = 1, ..., n. Because the fraction allocated to the riskless security can
always be chosen to equal $1 - \sum_{j=1}^{n} w_j(t)$, the choices for $w_1(t), \ldots, w_n(t)$ are
unconstrained.
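
The accumulation equation (6.2) is easy to misread, so a minimal sketch may help. The function name `next_wealth` and all numbers are hypothetical. The check confirms that (6.2) gives the same end-of-period wealth as tracking the dollar positions in each security, with the riskless position picking up whatever the risky fractions leave over (here a negative amount, i.e. borrowing).

```python
import numpy as np

def next_wealth(W, C, w, Z, R):
    """End-of-period wealth from the accumulation equation (6.2)."""
    w, Z = np.asarray(w, float), np.asarray(Z, float)
    return (w @ (Z - R) + R) * (W - C)

# Hypothetical numbers: two risky securities and one realization of returns.
W, C, R = 100.0, 10.0, 1.03
w = np.array([0.5, 0.8])            # risky fractions sum to 1.3: 30% is borrowed
Z = np.array([1.10, 0.95])          # realized returns per dollar

via_62 = next_wealth(W, C, w, Z, R)

# The same wealth obtained by tracking dollar positions explicitly.
invested = W - C
dollars_risky = w * invested
dollars_riskless = (1.0 - w.sum()) * invested
direct = dollars_risky @ Z + dollars_riskless * R

print(via_62, direct)
```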
It is assumed that there exist m state variables, $\{S_k(t)\}$, such that the

²⁶ The additive independence of the utility function and the single-consumption good assumptions
are made for analytic simplicity and because the focus of the chapter is on capital market
theory and not the theory of consumer choice. Fama (1970b) in discrete time and Meyer (1970)
and Huang and Kreps (1985) in continuous time, analyze the problem for non-additive and
temporally-dependent utilities. Although T is treated as known in the text, the analysis is
essentially the same for an uncertain lifetime with T a random variable [cf. Richard (1975) and
Merton (1971)]. The analysis is also little affected by making the direct-utility function "state-
dependent" (i.e. having U depend on other variables in addition to consumption and time). See
Merton (1990, ch. 6) for a summary of these various generalizations on preferences.
²⁷ This definition of a riskless security is purely technical and without normative significance. For
example, investing solely in the riskless security will not allow for a certain consumption stream
because R(t) will vary stochastically over time. On the other hand, a T-period, riskless-in-terms-of-
default coupon bond, which allows for a certain consumption stream, is not a riskless security
because its one-period return is uncertain. For further discussion, see Merton (1970, 1973b).
²⁸ It is assumed that all income comes from investment in securities. The analysis would be the
same with wage income provided that investors can sell shares against future income. However,
because institutionally this cannot be done, the "non-marketability" of wage income will cause
systematic effects on the portfolio and consumption decisions.
stochastic processes for $\{Z_1(t+1), \ldots, Z_n(t+1), R(t+1), S_1(t+1), \ldots, S_m(t+1)\}$
are Markov with respect to $S_1(t), \ldots, S_m(t)$,²⁹ and S(t) denotes
the m-vector of state-variable values at time t.
The method of stochastic dynamic programming is used to derive the optimal
consumption and portfolio rules. Define the function J[W(t), S(t), t] by:

$$J[W(t), S(t), t] \equiv \max E_t\left\{\sum_{\tau=t}^{T-1} U[C(\tau), \tau] + B[W(T), T]\right\}. \tag{6.3}$$

J, therefore, is the (utility) value of the balance of the investor's optimal
consumption-investment program from date t forward and, in this context, is
called the "derived" utility of wealth function. By the Principle of Optimality,
(6.3) can be rewritten as:

$$J[W(t), S(t), t] = \max\{U[C(t), t] + E_t(J[W(t+1), S(t+1), t+1])\}, \tag{6.4}$$

where "max" is over the current decision variables $[C(t), w_1(t), \ldots, w_n(t)]$.
Substituting for W(t + 1) in (6.4) from (6.2) and differentiating with respect to
each of the decision variables, we can write the n + 1 first-order conditions for
a regular interior maximum as:³⁰

$$0 = U_C[C^*(t), t] - E_t\left\{J_W[W(t+1), S(t+1), t+1]\left(\sum_{j=1}^{n} w_j^*(Z_j - R) + R\right)\right\} \tag{6.5}$$

and

$$0 = E_t\{J_W[W(t+1), S(t+1), t+1](Z_j - R)\}, \quad j = 1, 2, \ldots, n, \tag{6.6}$$

where $U_C \equiv \partial U/\partial C$, $J_W \equiv \partial J/\partial W$, and $(C^*, w_j^*)$ are the optimal values for the
decision variables. As in the static analysis of Section 2, we do not explicitly
impose the feasibility conditions that $C^* \ge 0$ and $W \ge 0$. Henceforth, except
where needed for clarity, the time indices will be dropped. Using (6.6), (6.5)

²⁹ Many non-Markov stochastic processes can be transformed to fit the Markov format by
expanding the number of state variables [cf. Cox and Miller (1968, pp. 16-18)]. To avoid including
"surplus" state variables, it is assumed that $\{S(t)\}$ represent the minimum number of variables
necessary to make $\{Z_j(t+1)\}$ Markov.
³⁰ Cf. Dreyfus (1965) for the dynamic programming technique. Sufficient conditions for existence
are described in Bertsekas (1974). Uniqueness of the solutions is guaranteed by: (1) strict
concavity of U and B; (2) no redundant securities; and (3) no arbitrage opportunities. See Cox,
Ingersoll and Ross (1985a) for corresponding conditions in the continuous-time version of the
model.
can be written as:

$$0 = U_C[C^*, t] - R\,E_t\{J_W[W(t+1), S(t+1), t+1]\}. \tag{6.7}$$

To solve for the complete optimal program, one first solves (6.6) and (6.7)
for C* and w* as functions of W(t) and S(t) when t = T - 1. This can be done
because J[W(T), S(T), T] = B[W(T), T], a known function. Substituting the
solutions for $C^*(T-1)$ and $w^*(T-1)$ in the right-hand side of (6.4), (6.4)
becomes an equation and, therefore, one has $J[W(T-1), S(T-1), T-1]$.
Using (6.6), (6.7), and (6.4) one can proceed to solve for the optimal rules in
earlier periods in the usual "backwards" recursive fashion of dynamic program-
ming. Having done so, one has a complete schedule of optimal consumption
and portfolio rules for each date expressed as functions of the (then) known
state variables W(t), S(t), and t. Moreover, as Samuelson (1969) has shown,
the optimal consumption rules will satisfy the "envelope condition" expressed
as:

$$J_W[W(t), S(t), t] = U_C[C^*(t), t], \tag{6.8}$$

i.e. at the optimum, the marginal utility of wealth (future consumption) will
just equal the marginal utility of (current) consumption. Moreover, from (6.8),
it is straightforward to show that $J_{WW} < 0$ because $U_{CC} < 0$. Hence, J is a
strictly concave function of wealth.
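
The first step of the backward recursion, and the envelope condition (6.8), can be illustrated with a deliberately simple case. The sketch below assumes logarithmic U and B, one risky security with a hypothetical two-point return distribution, a constant riskless return, and no state variables. It solves (6.6) and (6.7) at t = T - 1 by nested bisection (a numerical choice made here, not a method from the chapter) and then compares a finite-difference estimate of $J_W$ with $U_C$ at the optimum.

```python
import numpy as np

# Hypothetical one-asset example (all numbers are assumptions).
Z = np.array([1.30, 0.90])      # risky return per dollar in the two outcomes
p = np.array([0.5, 0.5])        # outcome probabilities
R = 1.05                        # riskless return per dollar

def U_C(C):
    return 1.0 / C              # marginal utility for U = log C

def B_W(W):
    return 1.0 / W              # B = log W, so J_W at date T equals B_W

def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f changes sign exactly once."""
    sign_lo = f(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def wealth_T(W, C, w):
    return (w * (Z - R) + R) * (W - C)          # accumulation equation (6.2)

def solve_last_step(W):
    """Solve the first-order conditions (6.6) and (6.7) at t = T - 1."""
    def w_star(C):
        return bisect(lambda w: p @ (B_W(wealth_T(W, C, w)) * (Z - R)), 0.0, 3.0)
    C = bisect(lambda C: U_C(C) - R * (p @ B_W(wealth_T(W, C, w_star(C)))),
               1e-6 * W, (1.0 - 1e-6) * W)
    return C, w_star(C)

def J(W):
    """Derived utility at T - 1: right-hand side of (6.4) at the optimum."""
    C, w = solve_last_step(W)
    return np.log(C) + p @ np.log(wealth_T(W, C, w))

W0, h = 10.0, 1e-4
C_opt, w_opt = solve_last_step(W0)
J_W = (J(W0 + h) - J(W0 - h)) / (2.0 * h)       # numerical derivative of J
print(C_opt, w_opt, J_W, U_C(C_opt))
```

In this example the numerical $J_W$ and $U_C[C^*]$ coincide up to discretization error, as (6.8) requires. Earlier dates would repeat the same step with the newly computed J in place of B.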
A comparison of the first-order conditions for the static portfolio-selection
problem, (2.4) in Section 2, with the corresponding conditions (6.6) for the
dynamic problem will show that they are formally quite similar. Of course,
they do differ in that, for the former case, the utility function of wealth is taken
to be exogenous while, in the latter, it is derived. However, the more
fundamental difference in terms of portfolio-selection behavior is that J is not
only a function of W, but also a function of S. The analogous condition in the
static case would be that the end-of-period utility function of wealth is also
state dependent.
To see that this difference is not trivial, consider the Rothschild-Stiglitz
definition of "riskier" that was used in the one-period analysis to partition the
feasible portfolio set into its efficient and inefficient parts. Let W1 and W2 be
the random variable, end-of-period values of two portfolios with identical
expected values. If $W_2$ is equal in distribution to $W_1 + Z$, where $E(Z \mid W_1) = 0$,
then from (2.10) and (2.11), $W_2$ is riskier than $W_1$ and every risk-averse
maximizer of the expected utility of end-of-period wealth would prefer $W_1$ to
$W_2$. However, consider an intertemporal maximizer with a strictly concave,
derived utility function J. It will not, in general, be true that
$E_t\{J[W_1, S(t+1), t+1]\} > E_t\{J[W_2, S(t+1), t+1]\}$. Therefore, although the intertemporal
maximizer selects his portfolio for only one period at a time, the optimal
portfolio selected may be one that would never be chosen by any risk-averse,
one-period maximizer. Hence, the portfolio-selection behavior of an inter-
temporal maximizer is, in general, operationally distinguishable from the
behavior of a static maximizer.
To adapt the Rothschild-Stiglitz definition to the intertemporal case, a
stronger condition is required: namely if $W_2$ is equal in distribution to $W_1 + Z$,
where $E[Z \mid W_1, S(t+1)] = 0$, then every risk-averse intertemporal maximizer
would prefer to hold $W_1$ rather than $W_2$ in the period t to t + 1. The proof
follows immediately from the concavity of J and Jensen's Inequality. Namely,
$E_t\{J[W_2, S(t+1), t+1]\} = E_t\{E(J[W_2, S(t+1), t+1] \mid W_1, S(t+1))\}$. By Jensen's
Inequality, $E(J[W_2, S(t+1), t+1] \mid W_1, S(t+1)) < J[E(W_2 \mid W_1, S(t+1)), S(t+1), t+1]
= J[W_1, S(t+1), t+1]$, and therefore $E_t\{J[W_2, S(t+1), t+1]\} < E_t\{J[W_1, S(t+1), t+1]\}$.
Hence, "noise" as denoted by Z must not
only be noise relative to $W_1$, but noise relative to the state variables
$S_1(t+1), \ldots, S_m(t+1)$. All the analyses of the preceding sections can be formally
adapted to the intertemporal framework by simply requiring that the "noise"
terms there, $\varepsilon$, have the additional property that $E_t(\varepsilon \mid S(t+1)) = E_t(\varepsilon) = 0$.
Hence, in the absence of further restrictions on the distributions, the resulting
efficient portfolio set for intertemporal maximizers will be larger than in the
static case.
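
A two-state numerical example may make the role of the stronger condition concrete. The derived utility function below, $J(W, S) = -S e^{-W}$, is purely hypothetical: it is strictly concave in W for S > 0 and has a marginal utility of wealth that rises with S. In case A the added term Z satisfies $E(Z \mid W_1) = 0$ but is perfectly correlated with S(t + 1); in case B it is independent of S(t + 1).

```python
import numpy as np

def J(W, S):
    # Hypothetical derived utility: strictly concave in W for S > 0, with
    # marginal utility of wealth increasing in the state variable S.
    return -S * np.exp(-W)

# Two equally likely values of S(t+1); W1 is the same in both states.
S = np.array([0.5, 2.0])
p = np.array([0.5, 0.5])
W1 = np.array([1.0, 1.0])

EJ_W1 = p @ J(W1, S)

# Case A: E(Z | W1) = 0, but Z is high exactly when S is high.
Z_corr = np.array([-0.5, 0.5])
EJ_W2_corr = p @ J(W1 + Z_corr, S)

# Case B: Z is independent of S, so E[Z | W1, S(t+1)] = 0 as well.
# Enumerate the four equally likely (S, Z) pairs.
S4 = np.repeat(S, 2)
Z4 = np.tile([-0.5, 0.5], 2)
EJ_W2_indep = np.mean(J(1.0 + Z4, S4))

print(EJ_W1, EJ_W2_corr, EJ_W2_indep)
```

In case A the "riskier" portfolio is preferred because Z pays off in the state where wealth is most valuable; in case B the Jensen's Inequality argument applies and $W_1$ is preferred.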
However, under certain conditions,³¹ the portfolio selection behavior of
intertemporal maximizers will be "as if" they were one-period maximizers. For
example, if $E_t[Z_j(t+1)] \equiv \bar{Z}_j(t+1) = E_t[Z_j(t+1) \mid S(t+1)]$, j = 1, 2, ..., n,
then the additional requirement that $E_t(\varepsilon \mid S(t+1)) = 0$ will automatically be
satisfied for any feasible portfolio, and the original Rothschild-Stiglitz "static"
definition will be valid. Indeed, in the cited papers by Hakansson, Samuelson,
and Merton, it is assumed that the security returns $\{Z_1(t), \ldots, Z_n(t)\}$ are
serially independent and identically distributed in time, which clearly satisfies
this condition.
Define the investment opportunity set at time t to be the joint distribution for
$\{Z_1(t+1), \ldots, Z_n(t+1)\}$ and the return on the riskless security, R(t). The
Hakansson et al. papers assume that the investment opportunity set is constant
through time. The condition $\bar{Z}_j(t+1) = E_t[Z_j(t+1) \mid S(t+1)]$, j = 1, ..., n,
will also be satisfied if changes in the investment opportunity set are either
completely random or time dependent in a non-stochastic fashion. Moreover,
with the possible exception of a few special cases, these are the only conditions
on the investment opportunity set under which $\bar{Z}_j(t+1) = E_t[Z_j(t+1) \mid S(t+1)]$,
j = 1, ..., n. Hence, for arbitrary concave utility functions, the one-period
analysis will be a valid surrogate for the intertemporal analysis only if changes
in the investment opportunity set satisfy these conditions.

³¹ See Fama (1970b) for a general discussion of these conditions.


Of course, by inspection of (6.6), if J were of the form $V[W(t), t] + H[S(t), t]$
so that $J_W = V_W$ is only a function of wealth and time, then for
arbitrary investment opportunity sets such an intertemporal investor will act
"as if" he is a one-period maximizer. Unfortunately, the only concave utility
function that will produce such a J function and satisfy the additivity specification
in (6.1) is $U[C, t] = a(t) \log[C]$ and $B[W, T] = b(T) \log[W]$, where either
a = 0 and b > 0 or a > 0 and $b \ge 0$. While some have argued that this utility
function is of special normative significance,³² any model whose results depend
singularly upon all individuals having the same utility function and where, in
addition, the utility function must have a specific form, can only be viewed as
an example, and not the basis for a general theory.
Hence, in general, the one-period static analysis will not be rich enough to
describe the investor behavior in an intertemporal framework. Indeed, without
additional assumptions, the only derived restrictions on optimal demand
functions and equilibrium security returns are the ones that rule out arbitrage.
Hence, to deduce additional properties, further assumptions about the
dynamics of the investment opportunity set are needed.

7. Consumption and portfolio selection theory in the continuous-time model

There are three time intervals or horizons involved in the consumption-portfolio
problem.³³ First, there is the trading horizon, which is the minimum
length of time between which successive transactions by economic agents can
be made in the market. In a sequence-of-markets analysis, it is the length of
time between successive market openings, and is therefore part of the specifica-
tion of the structure of markets in the economy. While this structure will
depend upon the tradeoff between the costs of operating the market and its
benefits, this time scale is not determined by the individual investor, and is the
same for all investors in the economy. Second, there is the decision horizon,
which is the length of time between which the investor makes successive
decisions, and it is the minimum time between which he would take any action.
For example, an investor with a fixed decision interval of one month, who
makes a consumption decision and portfolio allocation today, will under no
conditions make any new decisions or take any action prior to one month from

³² See Latané (1959), Markowitz (1976), and Rubinstein (1976) for arguments in favor of this
view, and Samuelson (1971), Goldman (1974), and Merton and Samuelson (1974) for arguments in
opposition to this view.
³³ These introductory paragraphs are adapted from Merton (1975, pp. 662-663). See the books
by Duffie (1988), Ingersoll (1987), and Merton (1990) for development of the continuous-time
model and extensive bibliographies.
now. This time scale is determined by the costs to the individual of processing
information and making decisions, and is chosen by the individual. Third, there
is the planning horizon, which is the maximum length of time for which the
investor gives any weight in his utility function. Typically, this time scale would
correspond to the balance of his lifetime and is denoted by T in the formulation
(6.1).
The static approach to portfolio selection implicitly assumes that the in-
dividual's decision and planning horizons are the same: "one period". While
the intertemporal approach distinguishes between the two, when individual
demands are aggregated to determine market equilibrium relations, it is
implicitly assumed in both approaches that the decision interval is the same for
all investors, and therefore corresponds to the trading interval.
If h denotes the length of time in the trading interval, then every solution
derived has, as an implicit argument, h. Clearly, if h changes, then the derived
behavior of investors would change, as indeed would any deduced equilibrium
relations.³⁴ I might mention, somewhat parenthetically, that empirical researchers
often neglect to recognize that h is part of a model's specification.
For example, in Theorem 4.6 the returns on securities were shown to have a
linear relation to the returns on a set of spanning portfolios. However, because
the n-period return on a security is the product (and not the sum) of the
one-period returns, this linear relation can only obtain for the single time
interval, h. If we define a fourth time interval, the observation horizon, to be
the length of time between successive observations of the data by the re-
searcher, then the usual empirical practice is to implicitly assume that the
decision and trading intervals are equal to the observation interval. This is
done whether the observation interval is daily, weekly, monthly, or annually!
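
The point that a linear relation can hold for only one interval length can be seen in a very small example. The sketch below assumes a hypothetical exact one-period relation $Z - R = b(X - R)$ with no noise term and a two-point distribution for X; it then compounds returns over two periods and compares the slopes between neighbouring outcomes. None of the numbers come from the chapter.

```python
from itertools import product

R, b = 1.05, 2.0                    # hypothetical riskless return and slope
X_outcomes = [0.90, 1.30]           # hypothetical one-period portfolio returns

def Z(X):
    # Exact one-period linear relation: Z - R = b (X - R), no noise term.
    return R + b * (X - R)

# Two-period returns are products (not sums) of one-period returns.
pairs = sorted({(x1 * x2, Z(x1) * Z(x2))
                for x1, x2 in product(X_outcomes, repeat=2)})

(x_lo, z_lo), (x_mid, z_mid), (x_hi, z_hi) = pairs
slope_low = (z_mid - z_lo) / (x_mid - x_lo)
slope_high = (z_hi - z_mid) / (x_hi - x_mid)
print(slope_low, slope_high)
```

The two slopes differ, so the two-period return on the security is not a linear function of the two-period return on X even though the one-period relation is exactly linear.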
If the frictionless-markets assumption (Assumption 1) is extended to include
no costs of information processing or operating the markets, then it follows
that all investors would prefer to have h as small as physically possible. Indeed,
the aforementioned general assumption that all investors have the same
decision interval will, in general, only be valid if all such costs are zero. This
said, it is natural to examine the limiting case when h tends to zero and trading
takes place continuously in time.
Consider an economy where the trading interval, h, is sufficiently small that
the state description of the economy can change only "locally" during the
interval (t, t + h). Formally, the Markov stochastic processes for the state
variables, S(t), are assumed to satisfy the property that one-step transitions are
permitted only to the nearest neighboring states. The analogous condition in

³⁴ If investor behavior were invariant to h, then investors would choose the same portfolio if they
were "frozen" into their investments for ten years as they would if they could revise their portfolios
every day.
the limiting case of continuous time is that the sample paths for S(t) are
continuous functions of time, i.e. for every realization of S(t + h), except possibly
on a set of measure zero, $\lim_{h \to 0}[S_k(t+h) - S_k(t)] = 0$, k = 1, ..., m. If,
however, in the continuous limit the uncertainty of "end-of-period" returns is to
be preserved, then an additional requirement is that $\lim_{h \to 0}[S_k(t+h) - S_k(t)]/h$
exists almost nowhere, i.e. even though the sample paths are continuous,
the increments to the states are not, and therefore, in particular, "end-of-period"
rates of return will not be "predictable" even in the continuous-time
limit. The stochastic processes that satisfy these conditions are called
diffusion processes.³⁵
Although such processes are almost nowhere differentiable in the usual
sense, under some mild regularity conditions there is a generalized theory of
stochastic differential equations which allows their instantaneous dynamics to
be expressed as the solution to the system of equations:³⁶

$$dS_i(t) = G_i(S, t)\,dt + H_i(S, t)\,dq_i(t), \quad i = 1, \ldots, m, \tag{7.1}$$

where $G_i(S, t)$ is the instantaneous expected change in $S_i(t)$ per unit time at
time t; $H_i^2$ is the instantaneous variance of the change in $S_i(t)$, where it is
understood that these statistics are conditional on S(t) = S. The $dq_i(t)$ are
Wiener processes with the instantaneous correlation coefficient per unit of time
between $dq_i(t)$ and $dq_j(t)$ given by the function $\eta_{ij}(S, t)$, i, j = 1, ..., m.³⁷
Moreover, specifying the functions $\{G_i, H_i, \eta_{ij}\}$, i, j = 1, ..., m, is sufficient
to completely determine the transition probabilities for S(t) between any two
dates.³⁸
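
A discretized simulation can illustrate the two limiting properties: increments vanish as h shrinks, while increments divided by h do not settle down. The sketch below uses the Euler-Maruyama scheme, which is a standard numerical approximation and not a method described in the chapter, for a single state variable with hypothetical drift and diffusion functions.

```python
import numpy as np

def simulate_diffusion(G, H, S0, T, n_steps, rng):
    """Euler-Maruyama approximation to dS = G(S,t) dt + H(S,t) dq (scalar S)."""
    h = T / n_steps
    S = np.empty(n_steps + 1)
    S[0] = S0
    for k in range(n_steps):
        t = k * h
        dq = rng.normal(0.0, np.sqrt(h))   # Wiener increment: mean 0, variance h
        S[k + 1] = S[k] + G(S[k], t) * h + H(S[k], t) * dq
    return S

def G(S, t):
    return 0.5 * (1.0 - S)                 # hypothetical drift

def H(S, t):
    return 0.2                             # hypothetical diffusion coefficient

def increment_stats(n_steps, seed=0):
    rng = np.random.default_rng(seed)
    path = simulate_diffusion(G, H, 1.0, 1.0, n_steps, rng)
    h = 1.0 / n_steps
    dS = np.abs(np.diff(path))
    return dS.mean(), (dS / h).mean()      # typical |dS| and typical |dS| / h

coarse = increment_stats(100)
fine = increment_stats(10_000)
print(coarse, fine)
```

With the finer grid the typical increment is smaller, roughly in proportion to the square root of h, while the typical difference quotient is larger, which is the discrete counterpart of continuity without differentiability.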
Under the assumption that the returns on securities can be described by
diffusion processes, Merton (1969, 1971) has solved the continuous-time analog
to the discrete-time formulation in (6.1), namely:

$$\max E_0\left\{\int_0^T U[C(t), t]\,dt + B[W(T), T]\right\}. \tag{7.2}$$

Neither negative consumption nor negative wealth is physically possible.

³⁵ See Feller (1966), Itô and McKean (1964), and Cox and Miller (1968).
³⁶ (7.1) is a short-hand expression for the stochastic integral:

$$S_i(t) = S_i(0) + \int_0^t G_i(S, \tau)\,d\tau + \int_0^t H_i(S, \tau)\,dq_i(\tau),$$

where $S_i(t)$ is the solution to (7.1) with probability one. For a general discussion and proofs, see
Itô and McKean (1964), McKean (1969), McShane (1974), and Harrison (1985).
³⁷ $\int_0^t dq_i = q_i(t) - q_i(0)$ is normally distributed with a zero mean and variance equal to t.
³⁸ See Feller (1966, pp. 320-321) and Cox and Miller (1968, p. 215). The transition probabilities
will satisfy the Kolmogorov or Fokker-Planck partial differential equations.
Hence, a feasible consumption-investment strategy for (7.2) must satisfy
$C(t) \ge 0$ and $W(t) \ge 0$ for $t \in [0, T]$. Because explicit recognition of these
constraints does little to complicate the analysis, we do so along the lines of
Karatzas, Lehoczky, Sethi and Shreve (1986) and Cox and Huang (1989).³⁹
The constraint on consumption is captured by the usual Kuhn-Tucker method.
The constraint on wealth is imposed by making zero wealth an "absorbing
state" so that if W(t) = 0, then $W(\tau) = 0$ and $C(\tau) = 0$ for $\tau \in [t, T]$.
Adapting the notation in Merton (1971),⁴⁰ the rate of return dynamics on
security j can be written as:

$$dP_j/P_j = \alpha_j(S, t)\,dt + \sigma_j(S, t)\,dZ_j, \quad j = 1, \ldots, n, \tag{7.3}$$

where $\alpha_j$ is the instantaneous conditional expected rate of return per unit time;
$\sigma_j^2$ is its instantaneous conditional variance per unit time; and $dZ_j$ are Wiener
processes, with the instantaneous correlation coefficient per unit time between
$dZ_j(t)$ and $dZ_k(t)$ given by the function $\rho_{jk}(S, t)$, j, k = 1, ..., n. In addition
to the n risky securities, there is a riskless security whose instantaneous rate of
return per unit time is the interest rate r(t).⁴¹ To complete the model's
dynamics description, define the functions $\mu_{ij}(S, t)$ to be the instantaneous
correlation coefficients per unit time between $dq_i(t)$ and $dZ_j(t)$, i = 1, ..., m;
j = 1, ..., n.⁴²
As in the discrete-time case, define J by:

$$J[W(t), S(t), t] \equiv \max E_t\left\{\int_t^T U[C(\tau), \tau]\,d\tau + B[W(T), T]\right\}, \tag{7.4}$$

subject to $J[0, S(t), t] = \int_t^T U[0, \tau]\,d\tau + B[0, T]$, which reflects the requirement
that W(t) = 0 is an absorbing state.

³⁹ Karatzas et al. use the dynamic programming technique. Cox and Huang use an alternative
method based on a martingale representation technology. As discussed in Merton (1990, ch. 6),
this method is especially powerful for solving optimization problems of this sort.
⁴⁰ Merton (1971, p. 377). $dP_j/P_j$ in continuous time corresponds to $Z_j(t+1) - 1$ in the
discrete-time analysis.
⁴¹ r(t) corresponds to R(t) - 1 in the discrete-time analysis, and is the "force-of-interest"
continuous rate. While the rate earned between t and t + dt, r(t), is known with certainty as of time
t, r(t) can vary stochastically over time.
⁴² Unlike in the Arrow-Debreu model, for example, it is not assumed here that the returns are
necessarily completely described by the changes in the state variables, $dS_i$, i = 1, ..., m, i.e., the
$dZ_j$ need not be instantaneously perfectly correlated with some linear combination of
$dq_1, \ldots, dq_m$. Rather, it is only assumed that $(dP_1/P_1, \ldots, dP_n/P_n, dS_1, \ldots, dS_m)$ is Markov in
S(t).
The continuous-time analog to (6.4) can be written as:⁴³

$$0 = \max\Bigl\{ U[C, t] + \lambda C + J_t + J_W\Bigl[\Bigl(\sum_{j=1}^{n} w_j(\alpha_j - r) + r\Bigr)W - C\Bigr] + \sum_{i=1}^{m} J_i G_i + \tfrac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m} J_{ij} H_i H_j \eta_{ij} + \tfrac{1}{2} J_{WW} \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sigma_{ij} W^2 + \sum_{i=1}^{m}\sum_{j=1}^{n} J_{iW} w_j \sigma_j H_i \mu_{ij} W \Bigr\}, \tag{7.5}$$

where $\lambda$ is the Kuhn-Tucker multiplier reflecting the non-negativity constraint
on consumption. The subscripts t, W, and i on J denote partial derivatives with
respect to the arguments t, W, and $S_i$ (i = 1, ..., m) of J, respectively, and
$\sigma_{ij} \equiv \sigma_i \sigma_j \rho_{ij}$ is the instantaneous covariance of the returns of security i with
security j, i, j = 1, ..., n. As was the case in (6.4), the "max" in (7.5) is over
the current decision variables $[C(t), w_1(t), \ldots, w_n(t)]$. If C* and w* are the
optimal rules, then the (n + 1) first-order conditions for (7.5) can be written
as:

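$$0 = U_C[C^*, t] + \lambda^* - J_W[W, S, t] \tag{7.6}$$

and

$$0 = J_W(\alpha_j - r) + J_{WW}\sum_{i=1}^{n} w_i^* \sigma_{ij} W + \sum_{i=1}^{m} J_{iW} \sigma_j H_i \mu_{ij}, \quad j = 1, \ldots, n. \tag{7.7}$$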

The Kuhn-Tucker condition, $\lambda^* C^* = 0$, implies that $\lambda^* = \max[0, J_W[W, S, t] - U_C[0, t]]$.
Hence, for regions in which C* > 0, equation (7.6) is identical to the
"envelope condition", (6.8), in the unconstrained discrete-time analysis. However,
unlike (6.6) in the discrete-time case, (7.7) is a system of equations which
is linear in the optimal demands for risky securities. Hence, if none of the risky
securities is redundant, then (7.7) can be solved explicitly for the optimal
demand functions using standard matrix inversion, i.e.

$$w_j^*(t)W(t) = K\sum_{k=1}^{n} v_{kj}(\alpha_k - r) + \sum_{i=1}^{m} B_i \zeta_{ij}, \quad j = 1, \ldots, n, \tag{7.8}$$

where $v_{kj}$ is the k-jth element of the inverse of the instantaneous
variance-covariance matrix of returns $[\sigma_{ij}]$;

$$\zeta_{ij} \equiv \sum_{k=1}^{n} v_{kj} \sigma_k H_i \mu_{ik}, \quad K \equiv -J_W/J_{WW}, \quad B_i \equiv -J_{iW}/J_{WW},$$

i = 1, ..., m and j = 1, ..., n.

⁴³ See Merton (1971, p. 381) and Kushner (1967, ch. IV, theorem 7).
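
The matrix inversion behind (7.8) can be sketched as follows. Every number below is hypothetical, including the values assigned to the partial derivatives of J, which in the model would come from solving (7.5). The check substitutes the computed dollar demands back into the first-order conditions (7.7).

```python
import numpy as np

# Hypothetical inputs: n = 3 risky securities, m = 2 state variables.
alpha = np.array([0.08, 0.10, 0.12])       # expected rates of return alpha_j
r = 0.03                                   # riskless rate
sigma = np.array([0.15, 0.20, 0.30])       # sigma_j
rho = np.array([[1.0, 0.3, 0.1],
                [0.3, 1.0, 0.4],
                [0.1, 0.4, 1.0]])          # rho_ij
H = np.array([0.02, 0.05])                 # H_i
mu = np.array([[0.5, -0.2, 0.0],
               [0.1,  0.3, -0.4]])         # mu_ij, shape (m, n)

# Hypothetical derivatives of J at the current (W, S, t).
J_W, J_WW = 1.0, -0.02                     # J_WW < 0 by concavity
J_iW = np.array([0.004, -0.003])           # cross-partials J_iW

cov = np.outer(sigma, sigma) * rho         # sigma_ij = sigma_i sigma_j rho_ij
v = np.linalg.inv(cov)                     # v_kj
K = -J_W / J_WW
B = -J_iW / J_WW
state_cov = mu * H[:, None] * sigma[None, :]   # entry (i, k): sigma_k H_i mu_ik
zeta = state_cov @ v                       # zeta_ij = sum_k v_kj sigma_k H_i mu_ik

demand = K * (v @ (alpha - r)) + B @ zeta  # dollar demands w_j^* W from (7.8)

# Residual of the first-order conditions (7.7) at these demands.
foc = J_W * (alpha - r) + J_WW * (cov @ demand) + J_iW @ state_cov
print(demand, foc)
```

The residual is zero up to rounding, which confirms that (7.8) is the solution of the linear system (7.7) for these inputs.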


As an immediate consequence of (7.8), we have the following mutual fund
theorem:

Theorem 7.1. If the returns dynamics are described by (7.1) and (7.3), then
there exist (m + 2) mutual funds constructed from linear combinations of the
available securities such that, independent of preferences, wealth distribution, or
planning horizon, investors will be indifferent between choosing from linear
combinations of just these (m + 2) funds or linear combinations of all n risky
securities and the riskless security.

Proof. Let mutual fund #1 be the riskless security; let mutual fund #2 hold
fraction $\delta_j^2 \equiv \sum_{k=1}^{n} v_{kj}(\alpha_k - r)$ in security j, j = 1, ..., n, and the balance
$(1 - \sum_{j=1}^{n} \delta_j^2)$ in the riskless security; let mutual fund #(2 + i) hold fraction
$\delta_j^{2+i} \equiv \zeta_{ij}$ in security j, j = 1, ..., n, and the balance $(1 - \sum_{j=1}^{n} \delta_j^{2+i})$ in the
riskless security for i = 1, ..., m. Consider a portfolio of these mutual funds which allocates
$d_2(t) = K$ dollars to fund #2; $d_{2+i}(t) = B_i$ dollars to fund #(2 + i), i =
1, ..., m; and $d_1(t) = W(t) - \sum_{i=2}^{2+m} d_i(t)$ dollars to fund #1. By inspection of
(7.8), this portfolio of funds exactly replicates the optimal portfolio holdings
chosen from among the original n risky securities and the riskless security.
However, the fractional holdings of these securities by the (m + 2) funds do
not depend upon the preferences, wealth, or planning horizon of the individuals
investing in the funds. Hence, every investor can replicate his optimal
portfolio by investing in the (m + 2) funds.
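
The replication argument in the proof can be traced numerically. In the sketch below the market parameters, the function names and the two investors' values of W, K and $B_i$ are all hypothetical. The fund compositions are computed once from market data alone, and each investor's holdings obtained through the funds are compared with the holdings implied directly by (7.8).

```python
import numpy as np

# Hypothetical market: n = 2 risky securities, m = 1 state variable.
alpha = np.array([0.08, 0.11])
r = 0.03
sigma = np.array([0.20, 0.30])
rho12 = 0.25
cov = np.array([[sigma[0]**2, rho12 * sigma[0] * sigma[1]],
                [rho12 * sigma[0] * sigma[1], sigma[1]**2]])
H = np.array([0.04])
mu = np.array([[0.6, -0.3]])               # mu_ij, shape (m, n)

v = np.linalg.inv(cov)
zeta = (mu * H[:, None] * sigma[None, :]) @ v

# Risky fractions of funds #2, ..., #(2+m); no investor-specific input is used.
funds = np.vstack([v @ (alpha - r), zeta])

def holdings_via_funds(W, K, B):
    d = np.concatenate([[K], B])           # dollars in funds #2, ..., #(2+m)
    d1 = W - d.sum()                       # dollars in fund #1 (riskless)
    risky = d @ funds
    riskless = d1 + d @ (1.0 - funds.sum(axis=1))
    return risky, riskless

def holdings_direct(W, K, B):
    risky = K * (v @ (alpha - r)) + B @ zeta   # equation (7.8)
    return risky, W - risky.sum()

investors = [(100.0, 40.0, np.array([5.0])),
             (250.0, 15.0, np.array([-20.0]))]
for W, K, B in investors:
    print(holdings_via_funds(W, K, B), holdings_direct(W, K, B))
```

The two investors differ in wealth and in K and $B_i$, yet both are served by the same fund compositions, which is the content of the theorem.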

Of course, as with the mutual fund theorems of Section 4, Theorem 7.1 is
vacuous if $m \ge n + 1$. However, for $m \ll n$, the (m + 2) portfolios provide for a
non-trivial spanning of the efficient portfolio set, and it is straightforward to
show that the instantaneous returns on individual securities will satisfy the
same linear specification relative to these spanning portfolios as was derived in
Theorem 4.6 for the one-period analysis.
It was shown in the discrete-time analysis of Section 6 that if
$E_t[Z_j(t+1) \mid S(t+1)] = \bar{Z}_j(t+1)$, j = 1, ..., n, then the intertemporal maximizer's demand
behavior is "as if" he were a static maximizer of the expected utility of
end-of-period wealth. The corresponding condition in the continuous-time case
is that the instantaneous rates of return on all available securities are uncorrelated
with the unanticipated changes in all state variables S(t) (i.e. $\mu_{ij} = 0$,
i = 1, ..., m, and j = 1, ..., n). Under this condition, the optimal demand
functions in (7.8) can be rewritten as:

$$w_j^*(t)W(t) = K\sum_{k=1}^{n} v_{kj}(\alpha_k - r), \quad j = 1, \ldots, n. \tag{7.9}$$

A special case of this condition occurs when the investment opportunity set is
non-stochastic [i.e. either H i = 0, i = 1 , . . . , m, or (a i, o-q, r) are, at most,
deterministic functions of time, i, j = 1 , . . . , n]. Optimal demands will also
satisfy (7.9) if preferences are such that the marginal utility of wealth of the
derived-utility function does not depend on S ( t ) (i.e. B i = 0, i = 1 . . . . . m). By
inspection of (7.6) this condition will obtain if the optimal consumption
function C* does not depend on S(t). In direct correspondence to the
discrete-time finding in Section 6, the only time-additive and independent
utility function to satisfy this condition is $U[C, t] = a(t) \log[C(t)]$, a preference
function which also has the property that C * ( t ) > 0 and ~ * = 0.
By inspection of (7.9) the relative holdings of risky securities, $w_j^*(t)/w_i^*(t)$,
are the same for all investors, and thus, under these conditions, the efficient
portfolio set will be spanned by just two funds: a single risky fund and a
riskless fund. Moreover, by the procedure used to prove Theorem 4.9 and
Theorem 4.10 in the static analysis, the efficient portfolio set here can be
shown to be generated by the set of portfolios with minimum (instantaneous)
variance for a given expected rate of return. Hence, under these conditions the
continuous-time intertemporal maximizer will act "as if" he were a static,
Markowitz-Tobin mean-variance maximizer. Although the demand functions
are formally identical to those derived from the mean-variance model, the
analysis here assumes neither quadratic preferences nor elliptic or normally
distributed security returns. Indeed, if for example the investment opportunity
set $\{\alpha_j, r, \sigma_{ij};\ i, j = 1, 2, \ldots, n\}$ is constant through time, then from (7.3) the
return on each risky security will be log-normally distributed, which implies
that all securities have limited liability.44
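
The two-fund property implied by (7.9) can be illustrated numerically. The following is only a minimal sketch: the covariance matrix, expected returns, interest rate, and the two values of K are hypothetical numbers chosen for illustration, and the helper names `solve` and `risky_demands` are not from the text. It computes the dollar demands $K \sum_k v_{kj}(\alpha_k - r)$ for two investors who differ only in K and prints the composition of each investor's risky holdings.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row_idx: abs(m[row_idx][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row_idx in range(n):
            if row_idx != col:
                factor = m[row_idx][col] / m[col][col]
                m[row_idx] = [x - factor * y for x, y in zip(m[row_idx], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def risky_demands(K, cov, alpha, r):
    """Dollar demands w_j* W = K * sum_k v_kj (alpha_k - r), where [v_kj] inverts cov."""
    excess = [a - r for a in alpha]
    return [K * x for x in solve(cov, excess)]


# Hypothetical investment opportunity set (illustrative assumptions only).
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]
alpha = [0.08, 0.11, 0.14]
r = 0.03

# Two investors who differ only in K, the reciprocal of absolute risk aversion.
for K in (50_000.0, 200_000.0):
    demands = risky_demands(K, cov, alpha, r)
    total = sum(demands)
    print(K, [round(d / total, 6) for d in demands])
```

Under these assumptions the printed proportions coincide for both investors: K scales the size of the risky position but not its composition, which is why a single risky fund plus the riskless security suffices.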
In the general case described in Theorem 7.1, the qualitative behavioral
differences between an intertemporal maximizer and a static maximizer can be
clarified further by analyzing the characteristics of the derived spanning
portfolios.
As already shown, fund #1 and fund #2 are the "usual" portfolios that
would be mixed to provide an optimal portfolio for a static maximizer. Hence,
the intertemporal behavioral differences are characterized by funds #(2 + i),
$i = 1, \ldots, m$. At the level of demand functions, the "differential demand" for

44See Merton (1971, pp. 384-388). It is also shown there that the returns will be lognormal on
the risky fund which, together with the riskless security, spans the efficient portfolio set. Joint
lognormal distributions are not elliptical distributions.

risky security j, $\Delta D_j^*$, is defined to be the difference between the demand for
that security by an intertemporal maximizer at time t and the demand for that
security by a static maximizer of the expected utility of "end-of-period" wealth
where the absolute risk-aversion and current wealth of the two maximizers are
the same. Noting that $K \equiv -J_W/J_{WW}$ is the reciprocal of the absolute risk-aversion
of the derived utility of wealth function, from (7.8) we have that

$$ \Delta D_j^* = \sum_{i=1}^{m} B_i \zeta_{ij}, \qquad j = 1, \ldots, n. \tag{7.10} $$

Lemma 7.1. Define:

$$ dY_i \equiv dS_i - \left( \sum_{j=1}^{n} \delta_j^i \left( \frac{dP_j}{P_j} - r\,dt \right) + r\,dt \right). $$

The set of portfolio weights $\{\delta_j^i\}$ that minimize the (instantaneous) variance of
$dY_i$ are given by $\delta_j^i = \zeta_{ij}$, $j = 1, \ldots, n$ and $i = 1, \ldots, m$.

Proof. The instantaneous variance of $dY_i$ is equal to
$[H_i^2 - 2 \sum_{j=1}^{n} \delta_j^i H_i \sigma_j \mu_{ij} + \sum_{j=1}^{n} \sum_{k=1}^{n} \delta_j^i \delta_k^i \sigma_{jk}]$.
Hence, the minimizing set of $\{\delta_j^i\}$ will satisfy
$0 = -H_i \sigma_j \mu_{ij} + \sum_{k=1}^{n} \delta_k^i \sigma_{jk}$, $j = 1, \ldots, n$. By matrix inversion, $\delta_j^i = \zeta_{ij}$.
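
The first-order conditions in the proof can be checked numerically. This is a sketch under stated assumptions: it uses two risky securities and a single state variable, all parameter values are hypothetical, and the function names are illustrative rather than taken from the text. It solves the first-order conditions for the weights and confirms that random perturbations of those weights never lower the instantaneous variance.

```python
import random


def hedge_variance(delta, H, sigma, mu, cov):
    """Instantaneous variance of dY_i for weights delta (the bracketed expression in the proof)."""
    n = len(delta)
    cross = sum(delta[j] * H * sigma[j] * mu[j] for j in range(n))
    quad = sum(delta[j] * delta[k] * cov[j][k] for j in range(n) for k in range(n))
    return H * H - 2.0 * cross + quad


def min_variance_weights(H, sigma, mu, cov):
    """Solve sum_k delta_k cov[j][k] = H sigma_j mu_j for n = 2 by Cramer's rule."""
    c = [H * sigma[j] * mu[j] for j in range(2)]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    return [(c[0] * cov[1][1] - c[1] * cov[0][1]) / det,
            (c[1] * cov[0][0] - c[0] * cov[1][0]) / det]


# Hypothetical parameters (illustrative assumptions only).
sigma = [0.2, 0.3]          # security return volatilities
rho = 0.25                  # correlation between the two security returns
cov = [[sigma[0] ** 2, rho * sigma[0] * sigma[1]],
       [rho * sigma[0] * sigma[1], sigma[1] ** 2]]
H = 0.1                     # volatility of the state variable
mu = [0.6, -0.3]            # correlations of the state variable with each return

best = min_variance_weights(H, sigma, mu, cov)
v_best = hedge_variance(best, H, sigma, mu, cov)

# Perturb the candidate weights at random; none should achieve a lower variance.
rng = random.Random(1)
never_lower = all(
    hedge_variance([b + rng.uniform(-1.0, 1.0) for b in best], H, sigma, mu, cov)
    >= v_best - 1e-12
    for _ in range(1000)
)
print(best, v_best, never_lower)
```

The residual variance is strictly positive here because no combination of the two securities is perfectly correlated with the state variable; the fund only replicates the state variable "most closely".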

The instantaneous rate of return on fund #(2 + i) is exactly
$[r\,dt + \sum_{j=1}^{n} \zeta_{ij}(dP_j/P_j - r\,dt)]$. Hence, fund #(2 + i) can be described as that
feasible portfolio whose rate of return most closely replicates the stochastic
part of the instantaneous change in state variable $S_i(t)$, and this is true for
$i = 1, \ldots, m$.
Consider the special case where there exist securities that are instantaneously
perfectly correlated with changes in each of the state variables. Without loss of
generality, assume that the first m securities are the securities such that $dP_i/P_i$
is perfectly positively correlated with $dS_i$, $i = 1, \ldots, m$. In this case,45 the
demand function (7.8) can be rewritten in the form:

$$ w_i^*(t) W(t) = \begin{cases} K \sum_{k=1}^{n} v_{ik}(\alpha_k - r) + B_i H_i/\sigma_i, & i = 1, \ldots, m, \\ K \sum_{k=1}^{n} v_{ik}(\alpha_k - r), & i = m + 1, \ldots, n. \end{cases} \tag{7.11} $$

45As will be shown in Section 10, this case is similar in spirit to the Arrow-Debreu complete-
markets model.

Hence, the relative holdings of securities m + 1 through n will be the same for
all investors, and the differential demand functions can be rewritten as:

$$ \Delta D_i^* = \begin{cases} B_i H_i/\sigma_i, & i = 1, \ldots, m, \\ 0, & i = m + 1, \ldots, n. \end{cases} \tag{7.12} $$

The composition of fund #(2 + i) reduces to a simple combination of security i
and the riskless security.
The behavior implied by the demand functions in (7.8) can be more easily
interpreted if they are rewritten in terms of the direct-utility and optimal-
consumption functions. The optimal-consumption function has the form
C*(t) = C*(W, S, t), and from (7.6) it follows immediately that, for C*(t) > 0:

$$ K = -U_C[C^*, t] / (U_{CC}[C^*, t]\, \partial C^*/\partial W), \tag{7.13} $$

$$ B_i = -(\partial C^*/\partial S_i)/(\partial C^*/\partial W), \qquad i = 1, \ldots, m. \tag{7.14} $$
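
A brief sketch of how (7.13) and (7.14) can be obtained, offered as a reading aid rather than as the derivation given in the text. It assumes two things that lie outside this passage: that the first-order condition in (7.6) for interior consumption is the envelope condition $U_C[C^*, t] = J_W$, and that the coefficient $B_i$ in (7.8) is defined as $-J_{iW}/J_{WW}$, where $J_{iW}$ denotes the cross-partial of J with respect to $S_i$ and W. Differentiating the envelope condition with respect to W and $S_i$ and substituting gives:

```latex
\begin{align*}
U_C[C^*, t] &= J_W(W, S, t)
  && \text{(assumed envelope condition)} \\
U_{CC}[C^*, t]\,\frac{\partial C^*}{\partial W} &= J_{WW},
  \qquad
U_{CC}[C^*, t]\,\frac{\partial C^*}{\partial S_i} = J_{iW} \\
K \equiv -\frac{J_W}{J_{WW}}
  &= -\frac{U_C[C^*, t]}{U_{CC}[C^*, t]\,\partial C^*/\partial W} \\
B_i \equiv -\frac{J_{iW}}{J_{WW}}
  &= -\frac{\partial C^*/\partial S_i}{\partial C^*/\partial W}
\end{align*}
```

The common factor $U_{CC}$ cancels in the ratio for $B_i$, which is why its sign depends only on the two partial derivatives of the optimal-consumption function.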

Because $\partial C^*/\partial W > 0$, it follows that the sign of $B_i$ equals the sign of
$(-\partial C^*/\partial S_i)$. An unanticipated change in a state variable is said to be
unfavorable if, ceteris paribus, such a change would reduce current optimal
consumption, e.g. an unanticipated increase in $S_i$ would be unfavorable if
$\partial C^*/\partial S_i < 0$. Inspection of (7.12), for example, shows that for such an
individual the differential demand for security i (which is perfectly positively
correlated with changes in $S_i$) will be positive. If there is an unanticipated
increase in $S_i$, then, ceteris paribus, there will be an unanticipated increase in
his wealth. Because $\partial C^*/\partial W > 0$, this increase in wealth will tend to offset the
negative impact on C* caused by the increase in $S_i$, and therefore the
unanticipated variation in C* will be reduced. In effect, by holding more of this
security, the investor expects to be "compensated" by larger wealth in the
event that $S_i$ changes in the unfavorable direction. Of course, if $\partial C^*/\partial S_i > 0$,
then the investor takes a differentially short position. However, in all cases
investors will allocate their wealth to the funds #(2 + i), $i = 1, \ldots, m$, so as to
"hedge" against unfavorable changes in the state variables S(t). 46
Analysis of the usual static model does not produce such hedging behavior
because the utility function is posited to depend only on end-of-period wealth
and therefore implicitly assumes that $\partial C^*/\partial S_i = 0$, $i = 1, \ldots, m$. Thus, in
addition to their manifest function of providing an "efficient" risk-return
tradeoff for end-of-period wealth, securities in the intertemporal model have a

46 This behavior obtains even when the return on fund #(2 + i) is not instantaneously perfectly
correlated with $dS_i$. See Merton (1990, ch. 15).

latent function of allowing consumers to hedge against other uncertainties.47
The effect on equilibrium security prices from these "hedging demands" is
examined in Section 10.
As a consequence of the richer role played by securities in the intertemporal
model, the number of securities required to span the set of optimal portfolios
will, in general, be larger than in the corresponding one-period model. It is
therefore somewhat surprising that non-trivial spanning can obtain in the
continuous-trading model. In the one-period analysis of Section 4, it was shown
that for general preferences a necessary and sufficient condition for a set of
portfolios to span the efficient portfolio set is that the returns on every security
can be written as a linear function of the returns on the spanning portfolios
plus noise. As discussed in Section 4, in the absence of complete markets in the
Arrow-Debreu sense, the widespread existence of corporate liability and other
securities with non-linear payoff structures appears to virtually rule out non-
trivial spanning unless further restrictions are imposed on either preferences or
the probability distributions of security returns. The hypothesized conditions of
Theorem 7.1 require only that investors be risk-averse with smooth prefer-
ences. Thus, it follows that the key to the spanning result is the combination of
continuous trading and diffusion processes for the dynamics of security returns.
As will be shown in the sections to follow, diffusion processes are "closed"
under non-linear transformations. That is, the dynamics of a reasonably
well-behaved function of diffusion-driven random variables will also be de-
scribed by a diffusion process. Thus, unlike in the static and discrete-time
dynamic models, the creation of securities whose payoff structures are non-
linear functions of existing security prices will not, in general, cause the size of
the portfolio spanning set to increase.

8. Options, contingent claims analysis, and the Modigliani-Miller Theorem

Futures contracts, options, loan guarantees, mortgage-backed securities, and
virtually all corporate liabilities are among the many types of securities with the
feature that their payoffs are contractually linked to the prices of other traded
securities at some future date. Contingent claims analysis (CCA) is a technique
for determining the price of such "derivative" securities. As indicated in
Sections 4 and 7, the fact that these contractual arrangements often involve

47For further discussion of this analysis, descriptions of specific sources of uncertainty, and
extensions to discrete-time examples, see Merton (1970, 1973b, 1975, 1977a). Breeden (1979) and
Merton (1990, ch. 6) show that similar behavior obtains in the case of multiple consumption goods
with uncertain relative prices. However, C* is a vector and $J_W$ is the "shadow" price of the
"composite" consumption bundle. Hence, the corresponding derived "hedging" behavior is to
minimize the unanticipated variations in $J_W$.

non-linear sharing rules has important implications for both corporate finance
and the structure of equilibrium asset prices. For this reason and because
derivative securities represent a significant and growing fraction of the outstanding
stock of financial instruments, CCA is a mainstream topic in financial
economic theory.
Although closely connected with the continuous-time portfolio models ana-
lyzed in the previous section, the origins of CCA are definitely rooted in the
pioneering work of Black and Scholes (1973) on the theory of option pricing.
Thus, we begin the study of derivative-security pricing with an analysis of
option securities.
A "European-type call (put) option" is a security that gives its owner the
right to buy (sell) a specified quantity of a financial or real asset at a specified
price, the "exercise price", on a specified date, the "expiration date". An
American-type option allows its owner to exercise the option on or before the
expiration date. If the owner chooses not to exercise the option on or before
the expiration date, then it expires and becomes worthless.
If V(t) denotes the price of the underlying asset at time t and E denotes the
exercise price, then from the contract terms, the price of the call option at the
expiration date T is given by $\max[0, V(T) - E]$ and the price of the put option
is $\max[0, E - V(T)]$. If there is positive probability that $V(T) > E$, and positive
probability that $V(T) < E$, then these options provide examples of securities
with contractually derived non-linear sharing rules with respect to the underlying
asset.
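
The expiration-date payoffs just described can be written down directly. The short sketch below uses hypothetical prices and illustrative function names; it evaluates both payoffs and shows the non-linearity by comparing the call payoff at an average price with the average of the call payoffs at two prices on either side of the exercise price.

```python
def call_payoff(v_T, exercise_price):
    """Call value at expiration: max[0, V(T) - E]."""
    return max(0.0, v_T - exercise_price)


def put_payoff(v_T, exercise_price):
    """Put value at expiration: max[0, E - V(T)]."""
    return max(0.0, exercise_price - v_T)


E = 100.0  # hypothetical exercise price
for v in (60.0, 100.0, 140.0):
    print(v, call_payoff(v, E), put_payoff(v, E))

# A linear sharing rule would give the same value on both sides of this comparison.
low, high = 60.0, 140.0
mid = 0.5 * (low + high)
payoff_at_average = call_payoff(mid, E)
average_of_payoffs = 0.5 * (call_payoff(low, E) + call_payoff(high, E))
print(payoff_at_average, average_of_payoffs)
```

Because the two quantities differ whenever the exercise price lies strictly between the two prices, no fixed linear combination of the underlying asset and the riskless security can match the payoff in every state.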
Although academic study of option pricing can be traced back to at least the
turn of the century, the "watershed" in this research is the Black and Scholes
(1973) model, which uses arbitrage arguments to derive option prices.48 It was,
of course, well known before 1973 that if a portfolio of securities can be
constructed to exactly match the payoffs to some security, then that security is
redundant, and to rule out arbitrage, its price is uniquely determined by the
prices of securities in the replicating portfolio. It was also recognized that
because the price of an option at its expiration date is perfectly functionally
related to the price of its underlying asset, the risk of an option position could
be reduced by taking an offsetting position in the underlying asset. However,
because portfolios involve linear combinations of securities and because the
option has a non-linear payoff structure, there is no static (i.e. "buy-and-hold")
portfolio strategy in the underlying asset that can exactly replicate the
payoff to the option. Thus, it would seem that an option cannot be priced by
arbitrage conditions alone. Black and Scholes had the fundamental insight that
a dynamic portfolio strategy in the underlying asset and the riskless security
can be used to hedge the risk of an option position. With the idea in mind that

48Black (1987) gives a brief history on how he and Scholes came to discover their model.

the precision of the hedge can be improved by increasing the frequency of
portfolio revisions, they focused on the limiting case of continuous trading. By
assuming that the price dynamics of the underlying asset are described by a
geometric Brownian motion and that the interest rate is constant, Black and
Scholes derive a trading strategy that perfectly hedges the option position.
They are, thus, able to determine the option price from the equilibrium
condition that the return on a perfectly hedged portfolio must equal the
interest rate.
Under the assumption that the dynamics for the underlying asset price are
described by a diffusion process with a continuous sample path, Merton (1970,
1973a, 1977b) uses the mathematics of Itô stochastic integrals to prove that
with continuous trading, the Black-Scholes dynamic portfolio strategy will
exactly replicate the payoff to an option held until exercise or expiration.
Therefore, under these conditions the Black-Scholes option price is a necess-
ary condition to rule out arbitrage. Using a simplified version of the arbitrage
proof in Merton (1977b), we derive the Black-Scholes price for a European
call option.
Following the notation in Section 7, we assume that the dynamics of the
underlying asset price are described by a diffusion process given by:

$$ dV = \alpha V\,dt + \sigma V\,dZ, \tag{8.1} $$

where $\sigma$ is, at most, a function of V and t. No cash payments or other
distributions will be made to the owners of this asset prior to the expiration
date of the option.
Let F(V, t) be the solution to the partial differential equation:

$$ \tfrac{1}{2} \sigma^2 V^2 F_{11} + r V F_1 - r F + F_2 = 0 \tag{8.2} $$

subject to the boundary conditions:

(a) $F(0, t) = 0$,
(b) $F/V$ bounded, (8.3)
(c) $F(V, T) = \max[0, V - E]$,

where subscripts denote partial derivatives with respect to the arguments of F.
A solution to (8.2)-(8.3) exists and is unique. 49
Consider a continuous-time portfolio strategy where the investor allocates
the fraction w(t) to the underlying asset and [1 - w(t)] to the riskless security.

49 (8.2) is a classic linear partial-differential equation of the parabolic type. If $\sigma^2$ is a continuous
function, then there exists a unique solution that satisfies boundary conditions (8.3). The usual
method for solving this equation is Fourier transforms.

If w(t) is a right-continuous function and P(t) denotes the value of the portfolio
at time t, then, from Section 7, the dynamics for P can be written as:

$$ dP = [w(\alpha - r) + r] P\,dt + w \sigma P\,dZ. \tag{8.4} $$

Suppose the investor selects the particular portfolio strategy
$w(t) = F_1(V, t) V(t)/P(t)$. Note that the strategy rule w(t) for each t depends on the
partial derivative of the known function F, the current price of the underlying
asset, and the current value of the portfolio. By substitution into (8.4), we
have that

$$ dP = [F_1 V(\alpha - r) + rP]\,dt + F_1 V \sigma\,dZ. \tag{8.5} $$

Since F is twice-continuously differentiable, Itô's Lemma 50 can be applied to
express the stochastic process for $F(V(t), t)$ as:

$$ dF = [\tfrac{1}{2} \sigma^2 V^2 F_{11} + \alpha V F_1 + F_2]\,dt + F_1 V \sigma\,dZ. \tag{8.6} $$

But F satisfies (8.2) and therefore (8.6) can be rewritten as:

$$ dF = [F_1 V(\alpha - r) + rF]\,dt + F_1 V \sigma\,dZ. \tag{8.7} $$

From (8.5) and (8.7), $dP - dF = [P - F] r\,dt$, an ordinary differential equation
with the well-known solution:

$$ P(t) - F(V(t), t) = [P(0) - F(V(0), 0)] e^{rt}. \tag{8.8} $$

If the initial investment in the portfolio is chosen so that $P(0) = F(V(0), 0)$,
then from (8.8), $P(t) = F(V(t), t)$ for $0 \le t \le T$. From (8.3) we have that
$P(t) = 0$ if $V(t) = 0$ and $P(T) = \max[0, V(T) - E]$. Thus, we have constructed a
feasible portfolio strategy in the underlying asset and the riskless security that
exactly replicates the payoff structure to a European call option with exercise
price E and expiration date T. By the standard no-arbitrage condition, two
securities with identical payoff structures must have the same price. Thus, the
equilibrium call option price at time t is given by F(V(t), t), the Black-Scholes
price.
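
The replication argument holds exactly only with continuous trading. A rough way to see what it implies in practice is to simulate the strategy "hold $F_1 V$ in the asset, the rest in the riskless security" with rebalancing at a finite number of dates and to watch the terminal replication error shrink as the number of dates grows. The sketch below is a Monte Carlo illustration under stated assumptions: constant $\sigma$ (so that F is the closed-form Black-Scholes price), hypothetical parameter values, and illustrative function names. The drift $\alpha$ is deliberately set away from r; it affects the simulated paths but never enters the hedge.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(V, E, r, sigma, tau):
    """Black-Scholes call value and its partial derivative F_1, for tau = T - t > 0."""
    x1 = (math.log(V / E) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    x2 = x1 - sigma * math.sqrt(tau)
    return V * norm_cdf(x1) - E * math.exp(-r * tau) * norm_cdf(x2), norm_cdf(x1)


def replication_error(n_steps, rng, V0=100.0, E=100.0, r=0.05,
                      alpha=0.12, sigma=0.2, T=1.0):
    """Terminal portfolio value minus option payoff along one simulated path."""
    dt = T / n_steps
    V = V0
    P, _ = bs_call(V0, E, r, sigma, T)            # initial investment P(0) = F(V(0), 0)
    for k in range(n_steps):
        _, f1 = bs_call(V, E, r, sigma, T - k * dt)
        risky = f1 * V                            # dollars held in the underlying asset
        riskless = P - risky                      # remainder in the riskless security
        growth = math.exp((alpha - 0.5 * sigma ** 2) * dt
                          + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
        V *= growth
        P = risky * growth + riskless * math.exp(r * dt)
    return P - max(0.0, V - E)


def rms_error(n_steps, n_paths=400, seed=0):
    """Root-mean-square replication error over simulated paths."""
    rng = random.Random(seed)
    errors = [replication_error(n_steps, rng) for _ in range(n_paths)]
    return math.sqrt(sum(e * e for e in errors) / n_paths)


for n in (10, 40, 160):
    print(n, round(rms_error(n), 3))
```

The error does not vanish for any finite number of revisions; the sketch only illustrates the direction of the limit that the continuous-trading proof establishes.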
The derivation did not assume that the equilibrium option price depends
only on the price of the underlying asset and the riskless interest rate. Thus, if
the option price is to depend on other prices or stochastic variables, then, by

50 Itô's Lemma is for stochastic differentiation, the analog to the Fundamental Theorem of the
calculus for deterministic differentiation. For a statement of the Lemma and applications in
economics, see Merton (1971, 1973a, 1982b, 1990). For its rigorous proof, see McKean (1969, p.
44).

inspection of (8.2), it must be because either $\sigma^2$ or r is a function of these
prices. Similarly, the findings that the option price is a twice continuously-differentiable
function of the underlying asset price and that its dynamics
follow a diffusion process are derived results and not assumptions.
Because Black and Scholes derived (8.2)-(8.3) for the case where $\sigma^2$ is a
constant (i.e. geometric Brownian motion), they were able to obtain a closed-form
solution, given by:

$$ F(V, t) = V \Phi(x_1) - E e^{-r(T-t)} \Phi(x_2), \tag{8.9} $$

where $x_1 \equiv [\log(V/E) + (r + \sigma^2/2)(T - t)]/\sigma\sqrt{T - t}$; $x_2 \equiv x_1 - \sigma\sqrt{T - t}$; and
$\Phi(\cdot)$ is the cumulative Gaussian density function. From (8.9) it follows that
the portfolio-construction rule is given by $w(t) = F_1 V/F = \Phi(x_1) V/F(V, t)$.
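
As a concrete illustration of (8.9), the sketch below evaluates the formula and the portfolio weight w(t) for one hypothetical set of inputs. The function names and the parameter values are assumptions made for the example; `tau` stands for T - t and must be positive.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function, written via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(V, E, r, sigma, tau):
    """Return (F, F_1) from (8.9) for constant sigma and time to expiration tau > 0."""
    x1 = (math.log(V / E) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    x2 = x1 - sigma * math.sqrt(tau)
    price = V * norm_cdf(x1) - E * math.exp(-r * tau) * norm_cdf(x2)
    return price, norm_cdf(x1)


# Hypothetical inputs: at-the-money call, one year to expiration.
V, E, r, sigma, tau = 100.0, 100.0, 0.05, 0.20, 1.0
price, f1 = black_scholes_call(V, E, r, sigma, tau)
w = f1 * V / price          # fraction of the replicating portfolio held in the asset
print(round(price, 4), round(f1, 4), round(w, 3))
```

For these inputs the weight w exceeds one, so the replicating portfolio is a levered position in the underlying asset financed by borrowing at the riskless rate. Note that the expected return $\alpha$ appears nowhere among the arguments.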
By inspection of (8.2) or (8.9), a striking feature of the Black-Scholes
analysis is that the determination of the option price and the replicating
portfolio strategy does not require knowledge of either the expected return on
the underlying asset $\alpha$ or investor risk preferences and endowments. Indeed,
the only variable or parameter required that is not directly observable is the
variance-rate function, $\sigma^2$. This feature, together with the relatively robust
nature of an arbitrage derivation, gives the Black-Scholes model an important
practical significance, and it has been widely adopted in the practicing financial
community.
In the derivation of the equilibrium call-option price, the only place that the
explicit features of the call option enter is in the specification of the boundary
conditions (8.3). Hence, by appropriately adjusting the boundary conditions,
the same methodology can be used to derive the equilibrium prices of other
derivative securities with payoff structures contingent on the price of the
underlying asset. For example, to derive the price of a European put option,
one need only change (8.3) so that F satisfies $F(0, t) = E \exp[-r(T - t)]$;
$F(V, t)$ bounded; and $F(V, T) = \max[0, E - V]$.
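
For constant $\sigma$, one candidate solution for the put is the call value from (8.9) minus V plus the discounted exercise price. That is the standard put-call parity relation, which is an assumption brought in for this sketch and is not stated in the text. The code below, with hypothetical inputs and illustrative function names, checks numerically that this candidate meets the three put boundary conditions just listed.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_value(V, E, r, sigma, tau):
    """European call value from (8.9), for tau = T - t > 0."""
    x1 = (math.log(V / E) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    x2 = x1 - sigma * math.sqrt(tau)
    return V * norm_cdf(x1) - E * math.exp(-r * tau) * norm_cdf(x2)


def put_value(V, E, r, sigma, tau):
    """Candidate put value via put-call parity (assumed, not derived in the text)."""
    return call_value(V, E, r, sigma, tau) - V + E * math.exp(-r * tau)


E, r, sigma, tau = 100.0, 0.05, 0.20, 1.0   # hypothetical inputs

# Condition at V = 0: value approaches the discounted exercise price.
print(put_value(1e-9, E, r, sigma, tau), E * math.exp(-r * tau))

# Boundedness: the put value stays below E however large V becomes.
print(max(put_value(v, E, r, sigma, tau) for v in (1.0, 50.0, 100.0, 500.0, 5000.0)))

# Terminal condition: just before expiration the value is close to max[0, E - V].
print(put_value(80.0, E, r, sigma, 1e-8), put_value(120.0, E, r, sigma, 1e-8))
```

Since the candidate is a linear combination of three solutions to (8.2) (the call, the asset itself, and a riskless discount bond), it also satisfies the partial differential equation, so only the boundary conditions need checking.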
Although options are rather specialized financial instruments, the Black-
Scholes option pricing methodology can be applied to a much broader class of
securities.
A prototypal example analyzed in Black and Scholes (1973) and Merton
(1970, 1974) is the pricing of debt and equity of a corporation. Consider the
case of a firm financed by equity and a single homogeneous zero-coupon debt
issue. The contractual obligation of the firm is to pay B dollars to the
debtholders on the maturity date T, and in the event that the firm does not pay
(i.e. defaults), then ownership of the firm is transferred to the debtholders.
The firm is prohibited from making payments or transferring assets to the
equityholders prior to the debt being retired. Let V(t) denote the market value
of the firm at time t. If at the maturity date of the debt, $V(T) \ge B$, then the
debtholders will receive their promised payment B and the equityholders will
have the "residual" value, $V(T) - B$. If, however, $V(T) < B$, then there are
inadequate assets within the firm to pay the debtholders their promised
amount. By the limited-liability provision of corporate equity, the equity-
holders cannot be assessed to make up the shortfall, and it is clearly not in
their interests to do so voluntarily. Hence, if V(T)< B, then default occurs
and the value of the debt is V(T) and the equity is worthless. Thus, the
contractually derived payoff function for the debt at time T, $f_1$, can be written
as:

$$ f_1(V, T) = \min[V(T), B], \tag{8.10} $$

and the corresponding payoff function for equity, $f_2$, can be written as:

$$ f_2(V, T) = \max[0, V(T) - B]. \tag{8.11} $$

Provided that default is possible but not certain, we have from (8.10) and
(8.11) that the sharing rule between debtholders and equityholders is a
non-linear function of the value of the firm. Moreover, the payoff structure to
equity is isomorphic to a European call option where the underlying asset is the
firm, the exercise price is the promised debt payment, and the expiration date
is the maturity date. Because $\min[V(T), B] = V(T) - \max[0, V(T) - B]$, the
debtholders' position is functionally equivalent to buying the firm outright from
the equityholders at the time of issue and simultaneously giving them an option
to buy back the firm at time T for B. Hence, provided that the conditions of
continuous-trading opportunities and a diffusion-process representation for the
dynamics of the firm's value are satisfied, the Black-Scholes option pricing
theory can be applied directly to the pricing of levered equity and corporate
debt with default risk.
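
The isomorphism between levered equity and a call option can be made concrete. The following sketch assumes a constant variance rate for the value of the firm, so that the closed form (8.9) applies with V as firm value and B as the exercise price; the numbers and function names are hypothetical. Debt is valued as firm value minus equity, in line with the identity min[V(T), B] = V(T) - max[0, V(T) - B].

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_value(V, E, r, sigma, tau):
    """European call value from (8.9), for tau > 0."""
    x1 = (math.log(V / E) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    x2 = x1 - sigma * math.sqrt(tau)
    return V * norm_cdf(x1) - E * math.exp(-r * tau) * norm_cdf(x2)


def price_debt_and_equity(V, B, r, sigma, T):
    """Value equity as a call on the firm and zero-coupon debt as the remainder."""
    equity = call_value(V, B, r, sigma, T)
    debt = V - equity
    promised_yield = math.log(B / debt) / T     # continuously compounded yield to maturity
    return equity, debt, promised_yield - r     # last item: spread over the riskless rate


# Hypothetical firm: value 100, promised payment 80 in five years.
equity, debt, spread = price_debt_and_equity(100.0, 80.0, 0.05, 0.30, 5.0)
print(round(equity, 3), round(debt, 3), round(spread, 5))

# Raising the variance rate of firm value shifts value from debt to equity.
equity_hi, debt_hi, spread_hi = price_debt_and_equity(100.0, 80.0, 0.05, 0.45, 5.0)
print(round(equity_hi, 3), round(debt_hi, 3), round(spread_hi, 5))
```

In this setting the debt is always worth less than a default-free bond with the same promised payment, so the computed spread is positive, and it widens as the variance rate rises.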
The same methodology can be applied quite generally to the pricing of
derivative securities by adjusting the boundary conditions in (8.3) to match the
contractually derived payoff structure. Cox, Ingersoll and Ross (1985b) use this
technique to price default-free bonds in their widely used model of the term
structure of interest rates. The survey articles by Smith (1976) and Mason and
Merton (1985) and the books by Cox and Rubinstein (1985) and Merton
(1990) provide applications of CCA in a broad range of areas, including the
pricing of general corporate liabilities, project evaluation and financing, pen-
sion fund and deposit insurance, and employment contracts such as guaranteed
wage floors and tenure. CCA can also take account of differential tax rates on
different types of assets as demonstrated by Scholes (1976) and Constantinides
and Scholes (1980). Although, in most applications, (8.2)-(8.3) will not yield
closed-form solutions, powerful computational methods have been developed
to provide high-speed numerical solutions for both the security price and its
first derivative.
As shown in Section 4, the linear generating process for security returns
which is required for non-trivial spanning in Theorem 4.6 is generally not
satisfied by securities with non-linear sharing rules. However, if the underlying
asset-price dynamics are diffusions, we have shown that the dynamics of
equilibrium derivative-security prices will also follow diffusion processes. The
existence of such securities is, therefore, consistent with the hypothesized
conditions of Theorem 7.1. Hence, the creation of securities with non-linear
sharing rules will not adversely affect the spanning results derived for the
continuous-time portfolio selection model of Section 7. Using a replication
argument similar to the one presented here, Merton (1974, 1977b) proves that
Theorem 4.14, the Modigliani-Miller Theorem, obtains under the conditions
of continuous trading and a diffusion representation for the dynamics of the
market value of the firm.

9. Bankruptcy, transactions costs, and financial intermediation in the
continuous-time model 51

In Sections 4 and 5 we analyzed the static portfolio-selection problem, taking
account of the non-negativity constraint on wealth and the prospect of personal
bankruptcy. Much the same analysis could be applied to the discrete-time
dynamic portfolio model of Section 6. Although, given the distribution of
security returns in these models, we can describe an algorithm for computing
the constrained optimal-portfolio demands from the unconstrained ones, the
non-linear sharing rules induced by explicit recognition of bankruptcy add
considerable complexity to the determination of conditions under which non-
trivial spanning of the efficient portfolio set obtains. This is especially so in
models where spanning occurs because of a specific assumption about the joint
probability distribution for returns as, for example, in the Markowitz-Tobin
mean-variance model.
In contrast, the continuous-time model of Section 7 is easily adapted to
include the effects of bankruptcy on portfolio choice and on the return
distributions of both investor and creditor portfolios. From the perspective of
investor behavior, we have, by inspection of (7.8), that the structure of
optimal-portfolio demands is not materially changed by including the non-
negativity constraints on wealth and consumption. Just as CCA was used in
Section 8 to determine the price and return characteristics of corporate debt

51This section is adapted from Merton (1989) and Merton (1990, ch. 14).

with default possibilities, so from the perspective of creditor behavior it can be
used to evaluate price and return characteristics of cash or security loans made
to an investor whose portfolio provides the sole collateral for these loans. As
we have seen, the introduction of non-linear sharing rules (in this case,
between an investor and his creditors) does not by itself violate the diffusion
assumption of security and portfolio returns. Therefore, the non-linearities
induced by taking account of personal bankruptcy do not alter the set of
spanning portfolios, and hence do not affect the conclusions of Theorem 7.1.
Theorem 7.1 can be used for product identification and implementation in
the continuous-time theory of financial intermediation. That is, if individual
securities are "pre-packaged" into a specified group of portfolios, then inves-
tors can achieve the same optimal allocations by selecting from just this group
as they could by choosing from the entire universe of available securities.
Theorem 7.1 thus serves to identify a class of risk-pooling investment products
for which there is a natural demand. Moreover, by specifying the dynamic
trading rules for creating these portfolios, the theorem also provides the
"blueprints" or production technologies for intermediaries to manufacture
these products.
As we have seen, contingent claims analysis can be used to price derivative
securities, including ones issued by intermediaries. The contribution of CCA to
the theory of intermediation is, however, deeper than just the pricing of
financial products. It also contributes to the theory of product implementation
by providing the production technologies to create risk-sharing products. The
portfolio-replication process used to derive the derivative-security price in
Section 8 applies whether or not the security actually exists. Thus, the specified
dynamic portfolio strategy used to create an arbitrage position against a traded
derivative security is also a prescription for synthesizing an otherwise non-existent
security. The investment required to fund the replicating portfolio,
$F(V(0), 0)$ in (8.8), becomes, in this context, the production cost to the
intermediary that creates the security. The blueprint for the dynamic production
process is given by the rules: hold $F_1(V(t), t) V(t)$ in the underlying risky
asset and $F(V(t), t) - F_1(V(t), t) V(t)$ in the riskless security at each time t from
t = 0 until t = T. The "raw materials" for the manufacturing process are the
underlying risky asset and the riskless security. The "output" produced is a set
of cash flows that are identical to the prescribed payoffs of the financial product
issued.
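
The production blueprint can be written as a small function that returns, at any date, the production cost and the two holdings. The sketch assumes the product being manufactured is a European call and that $\sigma$ is constant, so F and $F_1$ come from the Black-Scholes closed form; the function name and the inputs are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def production_blueprint(V, E, r, sigma, tau):
    """Return (cost, risky_dollars, riskless_dollars) for a synthetic European call.

    cost              = F(V, t), the value of the replicating portfolio
    risky_dollars     = F_1(V, t) * V, held in the underlying asset
    riskless_dollars  = F(V, t) - F_1(V, t) * V, held in the riskless security
    """
    x1 = (math.log(V / E) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    x2 = x1 - sigma * math.sqrt(tau)
    cost = V * norm_cdf(x1) - E * math.exp(-r * tau) * norm_cdf(x2)
    risky_dollars = norm_cdf(x1) * V
    return cost, risky_dollars, cost - risky_dollars


# Holdings at issue and after a hypothetical move in the asset price three months later.
for V, tau in ((100.0, 1.0), (110.0, 0.75)):
    cost, risky, riskless = production_blueprint(V, 100.0, 0.05, 0.20, tau)
    print(V, tau, round(cost, 3), round(risky, 3), round(riskless, 3))
```

A negative riskless holding means the intermediary finances part of its position in the asset by borrowing; the two holdings always sum to the current production cost, so no outside funds are needed after the initial investment.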
In Theorem 7.1, as in the mutual fund theorems of Section 4, investors are
shown to be indifferent between selecting their portfolios either from the m + 2
portfolios that span the efficient set or from all n + 1 available securities.
Similarly, investors are indifferent as to whether or not derivative securities are
available because they can use portfolio rules (8.2)-(8.3) to replicate the
payoffs to these securities. It would, thus, seem that the rich menu of financial
intermediaries and financial instruments observed in the real world has no
important risk-pooling or risk-sharing function in the environment posited in
Section 7.
Such indifference is indeed the case if, as assumed, all investors can gather
information and transact without cost. Hence, some type of transaction-cost
structure in which financial intermediaries and market makers have a compara-
tive advantage over individual investors and general business firms is required
to provide a raison d'être for financial intermediation and markets for derivative
securities.
Grossman and Laroque (1987) have shown that including transaction costs in
consumption goods alone does not affect the basic structure of optimal
portfolio demands in the continuous-trading model. However, from the work
of Leland (1985), Constantinides (1986), and Sun (1987), incorporation of
such costs for asset trading in the continuous-time model leads to considerable
technical difficulties.52 Moreover, development of a satisfactory equilibrium
theory of allocations and prices in the presence of transactions costs in assets
promises still more complexity because it requires a simultaneous endogenous
determination of prices, allocations, and the least-cost form of financial
intermediation and market structure.
To circumvent all this complexity and also preserve a role for intermedia-
tion, Merton (1989, 1990, ch. 14) introduces a continuous-time model in which
many agents cannot trade costlessly, but the lowest-cost transactors (by
definition, financial intermediaries) can. 53 In this model, the dynamic portfolio
analysis of Section 7 and CCA of Section 8 can be applied to determine the
production costs for financial products issued by intermediaries. However,
unlike in the standard zero-transaction-cost model, these risk-pooling and
risk-sharing products can significantly improve economic efficiency. That is, as
customers of intermediaries, high-transaction-cost agents can achieve invest-
ment allocations that are not otherwise possible by direct asset-market transac-
tions.
If, in addition, agents and intermediaries are price-takers in the traded-
securities markets and if there is a sufficient number of potential producers

52 With diffusion processes and proportional transactions costs, investors cannot trade continuously.
The reason is that with continuous trading, transactions costs at each trade will be
proportional to $|dZ|$, where dZ is a Brownian motion. However, for any non-infinitesimal T,
$\int_0^T |dZ| = \infty$ almost certainly and hence, with continuous trading, the total transactions cost is
unbounded with probability one.
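
A heuristic way to see the divergence in footnote 52, offered as a sketch rather than a proof of the almost-sure statement: split [0, T] into n equal intervals (the discretization is an assumption of this illustration) and compute the expected sum of absolute Brownian increments.

```latex
\begin{align*}
\Delta Z_k &\sim N(0, \Delta t), \qquad \Delta t = T/n, \\
E\,|\Delta Z_k| &= \sqrt{\frac{2\,\Delta t}{\pi}}, \\
E\left[\sum_{k=1}^{n} |\Delta Z_k|\right]
  &= n \sqrt{\frac{2T}{\pi n}}
   = \sqrt{\frac{2nT}{\pi}} \;\longrightarrow\; \infty
  \quad \text{as } n \to \infty .
\end{align*}
```

The expected total grows with the square root of the number of trades, so a cost proportional to the size of each trade cannot stay bounded as trading becomes continuous.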
53This model also appears in Merton (1978) where the cost of surveillance by the deposit insurer
is, in equilibrium, borne by the depositors in the form of a lower yield on their deposits. If all
investors can transact without cost, then none would hold deposits and instead would invest
directly in higher-yielding U.S. Treasury bills. Thus, to justify this form of intermediation, it is necessary to
assume that at least some investors face positive transactions cost for such direct investments in the
market.

who can trade continuously at zero marginal cost, then the equilibrium prices
of financial products equal the production costs of the lowest-cost producers. In
this competitive version of the model, equilibrium prices for derivative-security
products are given by the solution to (8.2) with the appropriate boundary
conditions. Merton (1990, ch. 14) shows that in this environment a set of
feasible contracts between customers and intermediaries exists that allows all
agents to achieve optimal consumption-bequest allocations as if they could
trade continuously without cost. Thus, in this limiting case of fully-efficient and
competitive intermediation, equilibrium asset prices and allocations are the
same as in the zero-transactions-cost version of the model. However, mutual
funds and derivative securities provide important economic benefits to inves-
tors and corporate issuers, even though these securities are priced in equilib-
rium as if they were redundant. With these remarks on the robustness of the
model as background, we turn now to the issues of general-equilibrium pricing
and the efficiency of allocations for the continuous-trading model of Section 7.

10. Intertemporal capital asset pricing

Using the continuous-time model of portfolio selection described in Section 7,
Merton (1973b) and Breeden (1979) aggregate individual investor demand
functions and impose market-clearing conditions to derive an intertemporal
model of equilibrium asset prices. Solnik (1974) and Stulz (1981) use the
model to derive equilibrium prices in an international context. By assuming
constant-returns-to-scale production technologies with stochastic outputs and
technical progress described by diffusion processes, Cox, Ingersoll and Ross
(1985a) develop a general-equilibrium version of the model which explicitly
integrates the real and financial sectors of the economy. Huang (1985a, 1985b,
1987) further strengthens the foundation for these models by showing that if
information in an economy with continuous-trading opportunities evolves
according to diffusion processes, then intertemporal-equilibrium security prices
will also evolve according to diffusion processes.
In Proposition 5.5 it was shown that if X denotes the return on any
mean-variance efficient portfolio (with positive dispersion), then the expected
returns on each of the n risky assets used to construct this portfolio will satisfy
$\bar{Z}_j - R = a_j(\bar{X} - R)$, where $a_j \equiv \mathrm{cov}(Z_j, X)/\mathrm{var}(X)$, $j = 1, \ldots, n$. Because the
returns on all mean-variance efficient portfolios are perfectly positively corre-
lated, this relation will apply with respect to any such portfolio. Moreover,
because Proposition 5.5 is purely a mathematical result, it follows immediately
for the model of Section 7 that at time t:

$$\alpha_j(t) - r(t) = a_j^*(t)\,[\alpha^*(t) - r(t)], \quad j = 1, \ldots, n, \tag{10.1}$$



where $\alpha^*$ is the expected return on an (instantaneously) mean-variance
efficient portfolio and $a_j^*$ equals the instantaneous covariance of the return on
security $j$ with this portfolio divided by the instantaneous variance of the
portfolio's return. Thus, knowledge of the expected return and variance of a
mean-variance efficient portfolio together with the covariance of that port-
folio's return with the return on asset j is sufficient information to determine
the risk and expected return on asset j. It is, however, generally difficult to
identify an ex ante mean-variance efficient portfolio by statistical estimation
alone and hence, the practical application of (10.1) is limited.
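
The purely mathematical character of (10.1) can be checked numerically. The sketch below is only an illustration under assumed inputs: the interest rate, expected returns, and covariance matrix are hypothetical numbers, and the efficient portfolio is taken to be the tangency portfolio of the three risky assets. It computes $a_j^*$ as covariance over variance and confirms that the right-hand side of (10.1) reproduces each asset's expected return.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(n):
            if k != col:
                f = m[k][col] / m[col][col]
                m[k] = [x - f * y for x, y in zip(m[k], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Hypothetical inputs (not taken from the chapter).
r = 0.03                                  # riskless rate
alpha = [0.08, 0.11, 0.06]                # expected returns on the risky assets
cov = [[0.040, 0.012, 0.004],
       [0.012, 0.090, 0.010],
       [0.004, 0.010, 0.025]]             # instantaneous covariance matrix
n = len(alpha)

# Tangency weights are proportional to (covariance matrix)^{-1} (alpha - r).
raw = solve(cov, [a - r for a in alpha])
w = [x / sum(raw) for x in raw]           # normalized to sum to one

alpha_star = sum(w[j] * alpha[j] for j in range(n))
var_star = sum(w[i] * cov[i][k] * w[k] for i in range(n) for k in range(n))

# a_j^* = cov(asset j, efficient portfolio) / var(efficient portfolio)
a_star = [sum(cov[j][k] * w[k] for k in range(n)) / var_star for j in range(n)]

# Right-hand side of (10.1), plus r; it should reproduce alpha[j].
implied = [r + a_star[j] * (alpha_star - r) for j in range(n)]
for j in range(n):
    print(j + 1, round(a_star[j], 6), round(implied[j], 6), alpha[j])
```

The check succeeds for any positive-definite covariance matrix, which is the sense in which (10.1) has no economic content until the efficient portfolio can be identified.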
As shown in Sections 4 and 5, the Sharpe-Lintner-Mossin Capital Asset
Pricing Model provides an example where identification without estimation is
possible. Because all investors in that model hold the same relative proportions
of risky assets, market-clearing conditions for equilibrium imply that the
market portfolio is mean-variance efficient. The mathematical identity (10.1)
is thus transformed into the Security Market Line (Theorem 4.8) which has
economic content. The market portfolio can, in principle, be identified without
knowledge of the joint distribution of security returns.
From (7.8) the relative holdings of risky assets will not in general be the
same for all investors in the continuous-trading model. Thus, the market
portfolio need not be mean-variance efficient as a condition for equilibrium,
and therefore the Security Market Line will not obtain in general. However,
Merton (1973b, p. 881, 1977a, p. 149, 1990, p. 510) and Breeden (1979, p.
273) show that equilibrium expected returns will satisfy:

$$\alpha_j(t) - r(t) = \sum_{i=1}^{m+1} B_{ij}(t)\,[\alpha^i(t) - r(t)], \quad j = 1, \ldots, n, \tag{10.2}$$

where $\alpha^1$ is the expected return on the market portfolio; $\alpha^i$ is the expected
return on a portfolio with the maximum feasible correlation of its return with
the change in state variable $S_{i-1}$, $i = 2, \ldots, m+1$; and the $\{B_{ij}\}$ correspond to
the theoretical multiple-regression coefficients from regressing the (instanta-
neous) returns of security $j$ on the returns of these $m + 1$ portfolios. (10.2) is a
natural generalization of the Security Market Line and is therefore aptly called
the Security Market Hyperplane.
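
As an illustration of (10.2), the sketch below treats the case of one state variable ($m = 1$), so that two portfolios span: the market and one hedging portfolio. All numerical inputs and the function name `hyperplane_coefficients` are hypothetical. The $B_{ij}$ are obtained as population multiple-regression coefficients by solving the two normal equations, and the resulting expected return is compared with what a single-beta Security Market Line would give.

```python
def hyperplane_coefficients(var1, var2, cov12, cov_j1, cov_j2):
    """Population multiple-regression coefficients of asset j on two portfolios.

    Solves the 2x2 normal equations
        var1  * b1 + cov12 * b2 = cov_j1
        cov12 * b1 + var2  * b2 = cov_j2
    by Cramer's rule.
    """
    det = var1 * var2 - cov12 * cov12
    b1 = (var2 * cov_j1 - cov12 * cov_j2) / det
    b2 = (var1 * cov_j2 - cov12 * cov_j1) / det
    return b1, b2


# Hypothetical inputs (not taken from the chapter).
r = 0.03                      # riskless rate
alpha_1 = 0.09                # expected return on the market portfolio
alpha_2 = 0.05                # expected return on the hedging portfolio for S_1
var1, var2, cov12 = 0.04, 0.01, 0.006   # second moments of the two portfolios
cov_j1, cov_j2 = 0.03, 0.008            # covariances of asset j with each portfolio

B1, B2 = hyperplane_coefficients(var1, var2, cov12, cov_j1, cov_j2)

# Security Market Hyperplane (10.2) with m + 1 = 2 portfolios.
alpha_j = r + B1 * (alpha_1 - r) + B2 * (alpha_2 - r)

# For comparison only: the single-beta Security Market Line prediction.
beta_market = cov_j1 / var1
alpha_sml = r + beta_market * (alpha_1 - r)

print(B1, B2, alpha_j, alpha_sml)
```

The two predictions differ whenever the hedging portfolio earns a nonzero risk premium and asset $j$ loads on it, which is the sense in which the hyperplane generalizes the line.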
Let $dX^i/X^i$, $i = 1, \ldots, m+1$, denote the instantaneous rate of return on the
$i$th portfolio whose expected return is represented on the right-hand side of
(10.2). It follows immediately from the definition of the $\{B_{ij}\}$ that the return
dynamics on asset j can be written as:

$$dP_j/P_j = r(t)\,dt + \sum_{i=1}^{m+1} B_{ij}(t)\,(dX^i/X^i - r(t)\,dt) + d\epsilon_j, \tag{10.3}$$

where $d\epsilon_j$ is a diffusion process such that $E_t(d\epsilon_j) =
E_t(d\epsilon_j \mid dX^1/X^1, \ldots, dX^{m+1}/X^{m+1}) = 0$. (10.3) is the continuous-trading
dynamic analog to the result derived for static models of spanning in Theorem
4.6. If the $\{B_{ij}(t)\}$ are sufficiently slowly varying functions of time relative to
the intervals over which successive returns are observed, then from (10.3)
these risk-measure coefficients can, in principle, be estimated using time-series
regressions of individual security returns on the spanning portfolios' returns.
From Lemma 7.1 we have that $dX^{i+1}/X^{i+1} = dS_i - dY_i$, $i = 1, \ldots, m$, where
$dY_i$ is uncorrelated with all securities' returns and therefore is uncorrelated
with $dX^k/X^k$ and $d\epsilon_j$, $k = 1, \ldots, m+1$, and $j = 1, \ldots, n$. It follows from
(10.3) that

$$dP_j/P_j = A_j(t)\,dt + B_{1j}(t)\,[dX^1/X^1 - r(t)\,dt] + \sum_{i=2}^{m+1} B_{ij}(t)\,dS_{i-1} + d\epsilon_j', \tag{10.4}$$

where $A_j(t)$ is a locally non-stochastic drift term and $d\epsilon_j' \equiv d\epsilon_j -
\sum_{i=2}^{m+1} B_{ij}(t)[dY_{i-1} - E_t(dY_{i-1})]$. Although $E_t(d\epsilon_j') = 0$ and
$E_t(d\epsilon_j \mid dX^1/X^1, dS_1, \ldots, dS_m) = 0$, it is not the case that
$E_t(d\epsilon_j' \mid dX^1/X^1, dS_1, \ldots, dS_m) = 0$ (unless $dY_i \equiv 0$, $i = 1, \ldots, m$).
Hence, unlike (10.3), (10.4) is not a properly specified regression equation
because $dS_i$ and $dY_k$ are not uncorrelated for every $i, k = 1, \ldots, m$. Thus, it
is not in general valid to regress speculative-price returns on the change in
non-speculative-price state variables to obtain estimates of the $\{B_{ij}(t)\}$.
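
The contrast between (10.3) and (10.4) can be seen in a small discrete-time simulation. The sketch below is illustrative only: it assumes one state variable, independent normal shocks, demeaned returns, and hypothetical volatilities and coefficients, none of which come from the chapter. The state-variable change is built as $dS_1 = dX^2/X^2 + dY_1$ with $dY_1$ uncorrelated with all security returns, as in Lemma 7.1. Regressing on the spanning portfolios recovers the coefficients, while regressing on $dS_1$ gives an attenuated estimate because the error term then contains $dY_1$.

```python
import random


def ols2(y, u, v):
    """Least-squares coefficients of y on (u, v), no intercept (series are mean-zero)."""
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    suy = sum(a * c for a, c in zip(u, y))
    svy = sum(b * c for b, c in zip(v, y))
    det = suu * svv - suv * suv
    return ((svv * suy - suv * svy) / det, (suu * svy - suv * suy) / det)


rng = random.Random(0)
N = 20000
B1, B2 = 1.2, 0.5              # hypothetical true coefficients in (10.3)
SD_X2, SD_DY = 0.02, 0.02      # hypothetical volatilities of dX^2/X^2 and dY_1

x1 = [rng.gauss(0.0, 0.03) for _ in range(N)]     # market excess return
x2 = [rng.gauss(0.0, SD_X2) for _ in range(N)]    # hedging-portfolio excess return
dy = [rng.gauss(0.0, SD_DY) for _ in range(N)]    # unhedgeable part of dS_1
eps = [rng.gauss(0.0, 0.01) for _ in range(N)]    # residual of asset j

ds = [a + b for a, b in zip(x2, dy)]              # dS_1 = dX^2/X^2 + dY_1
ret = [B1 * a + B2 * b + e for a, b, e in zip(x1, x2, eps)]

good = ols2(ret, x1, x2)   # regression on spanning portfolios, as in (10.3)
bad = ols2(ret, x1, ds)    # regression on the state variable, as in (10.4)

# Population value of the slope on dS_1 in the misspecified regression.
attenuated = B2 * SD_X2 ** 2 / (SD_X2 ** 2 + SD_DY ** 2)

print("spanning-portfolio regression:", good)
print("state-variable regression:   ", bad, "population slope:", attenuated)
```

Under these assumptions the bias disappears only when the volatility of $dY_1$ is zero, which corresponds to the perfect-hedging case noted in the text.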
By Itô's Lemma, in the region where $C_q > 0$, the unanticipated change in
investor q's optimal consumption rate can be written as:

$$dC_q - E_t(dC_q) = (\partial C_q/\partial W_q)\sum_{j=1}^{n} w_j^q W_q \sigma_j\,dZ_j + \sum_{i=1}^{m} (\partial C_q/\partial S_i) H_i\,dq_i, \tag{10.5}$$

where $\{w_j^q\}$ is his optimal holding of security $j$ as given in (7.8). Let $dX^*/X^*$
denote the return on the mean-variance efficient portfolio, which allocates
fraction $\sum_{i=1}^{n} v_{ij}(\alpha_i - r)$ to security $j$, $j = 1, \ldots, n$, and the balance to the
riskless security. By substitution for $w_j^q W_q$ from (7.8) and rearranging terms,
we can rewrite (10.5) as:

$$dC_q - E_t(dC_q) = V_q\,(dX^*/X^* - \alpha^*\,dt) - \sum_{i=1}^{m} (\partial C_q/\partial S_i)\,[dY_i - E_t(dY_i)], \tag{10.6}$$

where $V_q \equiv [\partial C_q/\partial W_q]K_q$ and $dY_i$ is as defined in Lemma 7.1. From Lemma
7.1, $dP_j/P_j$ and $dY_i$ are uncorrelated for $j = 1, \ldots, n$ and $i = 1, \ldots, m$. It

follows, therefore, that $\mathrm{cov}[dC_q, dP_j/P_j] = V_q\,\mathrm{cov}[dX^*/X^*, dP_j/P_j]$. If
$C \equiv \sum_q C_q$ denotes aggregate consumption, then by the linearity of the covariance
operator, we have that

$$\mathrm{cov}[dC, dP_j/P_j] = V\,\mathrm{cov}[dX^*/X^*, dP_j/P_j], \quad j = 1, \ldots, n, \tag{10.7}$$

where $V \equiv \sum_q V_q$. But, from (10.1), $(\alpha_j - r) = \mathrm{cov}[dX^*/X^*, dP_j/P_j](\alpha^* - r)/
\mathrm{var}[dX^*/X^*]$. Hence, from (10.1) and (10.7), we have that

$$\alpha_k - r = \beta_{kC}\,(\alpha_j - r)/\beta_{jC}, \quad k, j = 1, \ldots, n, \tag{10.8}$$

where $\beta_{kC} \equiv \mathrm{cov}[dC, dP_k/P_k]/\mathrm{var}[dC]$. Thus, from (10.8) a security's risk can
be measured by a single composite statistic: namely, the covariance between its
return and the change in aggregate consumption. Breeden (1979, pp. 274-276)
was the first to derive this relation which combines the generality of (10.2)-
(10.4) with the simplicity of the classic Security Market Line. 54
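
A minimal numerical reading of (10.8) follows. The covariances, the variance of aggregate consumption changes, and the reference asset's expected return are hypothetical figures, and the function names are introduced here only for illustration. Given the consumption betas of two securities and the expected excess return on one of them, (10.8) pins down the expected return on the other.

```python
def consumption_beta(cov_with_consumption, var_consumption):
    """beta_kC = cov[dC, dP_k/P_k] / var[dC]."""
    return cov_with_consumption / var_consumption


def implied_expected_return(r, beta_k, beta_j, alpha_j):
    """Expected return on security k implied by (10.8), given reference security j."""
    return r + (beta_k / beta_j) * (alpha_j - r)


# Hypothetical inputs (not taken from the chapter).
r = 0.02                       # riskless rate
var_c = 0.0004                 # variance of changes in aggregate consumption
beta_j = consumption_beta(0.0006, var_c)   # reference security j
beta_k = consumption_beta(0.0003, var_c)   # security k to be priced
alpha_j = 0.08                 # expected return on the reference security

alpha_k = implied_expected_return(r, beta_k, beta_j, alpha_j)
print(beta_j, beta_k, alpha_k)
```

Because only the ratio of betas enters, the variance of consumption cancels, so the covariance with aggregate consumption is the single statistic that matters.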
If in the model of Section 7 the menu of available securities is sufficiently
rich that investors can perfectly hedge against unanticipated changes in each of
the state variables $S_1, \ldots, S_m$, then from Lemma 7.1 $\mathrm{var}(dY_i) = 0$ for $i =
1, \ldots, m$. From (10.6), unanticipated changes in each investor's optimal
consumption rate are instantaneously perfectly correlated with the returns on a
mean-variance efficient portfolio and therefore, are instantaneously perfectly
correlated with unanticipated changes in aggregate consumption. This special
case analyzed in (7.11) and (7.12) takes on added significance because Breeden
(1979) among others has shown that intertemporal equilibrium allocations will
be Pareto efficient if such perfect hedging opportunities are available.
This efficiency finding for general preferences and endowments is perhaps
surprising, because it is well known that a competitive equilibrium does not in
general produce Pareto-optimal allocations without complete Arrow-Debreu
markets. Because the dynamics of the model in Section 7 are described by
diffusion processes, there is a continuum of possible states over any finite
interval of time. Therefore, complete markets in this model would seem to
require an uncountable number of pure Arrow-Debreu securities. However,
as we know from the work of Arrow (1953) and Radner (1972), an Arrow-Debreu
equilibrium allocation can be achieved without a full set of pure
time-state contingent securities if the menu of available securities is sufficient
for agents to use dynamic-trading strategies to replicate the payoff structures of
the pure securities. Along the lines of the contingent-claims analysis of Section

54Although equilibrium condition (10.2) will apply in the cases of either state-dependent direct
utility, U(C, S, t), utilities which depend on the path of past consumption, or models with
transactions costs for the consumption good, (10.8) will no longer obtain under these conditions
[cf. Grossman and Laroque (forthcoming) and Merton (1990, ch. 15)].

8, we now show that if perfect hedging of the state variables is feasible, then
the Radner conditions are satisfied by the continuous-trading model of Section
7.
By hypothesis, it is possible to construct portfolios whose returns are
instantaneously perfectly correlated with changes in each of the state variables,
$[dS_1(t), \ldots, dS_m(t)]$, as described by (7.1). For notational simplicity and
without loss of generality, assume that the first $m$ available risky securities are
these portfolios so that $dZ_i = dq_i$, $i = 1, \ldots, m$.
With subscripts denoting partial derivatives of $\Pi$ with respect to
$(S_1, \ldots, S_m, t)$, let $\Pi(S, t; \bar{S}, T)$ satisfy the linear partial differential equation:

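$$0 = \tfrac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m} H_i H_j \eta_{ij}\,\Pi_{ij} + \sum_{j=1}^{m}\left[G_j - H_j(\alpha_j - r)/\sigma_j\right]\Pi_j + \Pi_{m+1} - r\Pi \tag{10.9}$$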

subject to the boundary conditions: $\Pi(S, t; \bar{S}, T) \ge 0$ and
$\int_{\bar{S}} \Pi(S, t; \bar{S}, T)\,d\bar{S}_1 \cdots d\bar{S}_m \le 1$ for all $S$ and $t < T$; for a given $\epsilon > 0$,
$\Pi(S, T; \bar{S}, T) = 1/\epsilon^m$ if $\bar{S}_k - \epsilon/2 \le S_k \le \bar{S}_k + \epsilon/2$ for each $k = 1, 2, \ldots, m$ and
$\Pi(S, T; \bar{S}, T) = 0$ otherwise. 55
Consider the continuous-trading portfolio strategy that allocates the fraction
$\delta_j(t) = \Pi_j H_j/[\sigma_j V(t)]$ to security $j$, $j = 1, \ldots, m$, and the balance to the riskless
asset, where V(t) denotes the value of the portfolio at time t. From (7.3) the
dynamics of the portfolio value can be written as:

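$$\begin{aligned}
dV &= V\left[\left(\sum_{j=1}^{m}\delta_j(\alpha_j - r) + r\right)dt + \sum_{j=1}^{m}\delta_j\sigma_j\,dZ_j\right] \\
   &= \left[\sum_{j=1}^{m}\Pi_j H_j(\alpha_j - r)/\sigma_j + rV\right]dt + \sum_{j=1}^{m}\Pi_j H_j\,dq_j,
\end{aligned} \tag{10.10}$$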

because $dZ_j = dq_j$, $j = 1, \ldots, m$.
Because $\Pi$ is twice-continuously differentiable, we can use Itô's Lemma to
write the dynamics of $\Pi(S(t), t)$ as:

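$$d\Pi = \left(\tfrac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m} H_i H_j \eta_{ij}\,\Pi_{ij} + \sum_{j=1}^{m} G_j \Pi_j + \Pi_{m+1}\right)dt + \sum_{j=1}^{m}\Pi_j H_j\,dq_j. \tag{10.11}$$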

But, $\Pi(S, t)$ satisfies (10.9) and hence (10.11) can be rewritten as:

$$d\Pi = \left[\sum_{j=1}^{m}\Pi_j H_j(\alpha_j - r)/\sigma_j + r\Pi\right]dt + \sum_{j=1}^{m}\Pi_j H_j\,dq_j. \tag{10.12}$$

55Under mild regularity conditions on the functions $H$, $\eta$, $G$, $\alpha$, $\sigma$, and $r$, a solution exists and is
unique.

From (10.10) and (10.12), $d\Pi - dV = r(\Pi - V)\,dt$. Therefore, if the initial
investment in the portfolio is chosen so that $V(0) = \Pi(S(0), 0)$, then $V(t) =
\Pi(S(t), t)$ for $0 \le t \le T$.
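
The step from the difference of the two dynamics to the equality of the two values can be spelled out as follows. The symbol $D(t)$ is introduced here only for this illustration and is not part of the original notation.

```latex
\begin{aligned}
D(t) &\equiv \Pi(S(t), t) - V(t), \\
dD(t) &= r(t)\,D(t)\,dt
      \quad \text{(the } dq_j \text{ terms of (10.10) and (10.12) cancel)}, \\
D(t) &= D(0)\,\exp\!\left(\int_0^t r(s)\,ds\right), \\
D(0) = 0 \;&\Longrightarrow\; D(t) = 0, \qquad 0 \le t \le T .
\end{aligned}
```

Since the difference has no stochastic component, it grows at the riskless rate along every sample path and so stays at zero if it starts at zero.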
Thus, by using (m + 1) available securities, a dynamic portfolio strategy has
been constructed with a payoff structure that matches the boundary conditions
of (10.9). By taking the appropriate limit as $\epsilon \to 0$, the solution of (10.9)
provides the portfolio prescription to exactly replicate the payoff to a
pure Arrow-Debreu security, which pays \$1 at time $T$ if $S_k(T) = \bar{S}_k$, $k =
1, \ldots, m$, and pays 0 otherwise. 56 By changing the parameters $\bar{S}$ and $T$, one
can generate the portfolio rules to replicate all of the uncountable number of
Arrow-Debreu securities using just $m + 1$ securities. 57
As in the similar analysis of derivative-security pricing in Section 8, we have
here that $\Pi(S, t)$ will also be the equilibrium price for the corresponding
Arrow-Debreu security. Note, however, that unlike the analysis in Section 8,
the solution to (10.9) requires knowledge of the expected returns
$(\alpha_1, \ldots, \alpha_m)$. The reason is that the state variables of the system are not
speculative prices. If they were, then to avoid arbitrage, $G_j$, $H_j$, $\alpha_j$, $\sigma_j$, and $r$
would have to satisfy the condition $[G_j - rS_j]/H_j = (\alpha_j - r)/\sigma_j$, $j = 1, \ldots, m$.
In that case, the coefficient of $\Pi_j$ in (10.9) can be rewritten as $rS_j$, $j =
1, \ldots, m$, and the solution of (10.9) does not require explicit knowledge of
either $G_j$ or $\alpha_j$.
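
The substitution behind this remark is short enough to write out. The following is a worked restatement under the stated no-arbitrage condition, not an additional result.

```latex
\begin{aligned}
G_j - \frac{H_j(\alpha_j - r)}{\sigma_j}
  &= G_j - H_j \cdot \frac{G_j - rS_j}{H_j} = rS_j, \qquad j = 1, \ldots, m, \\
0 &= \tfrac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m} H_i H_j \eta_{ij}\,\Pi_{ij}
   + \sum_{j=1}^{m} rS_j\,\Pi_j + \Pi_{m+1} - r\Pi .
\end{aligned}
```

In this form neither the drifts $G_j$ nor the expected returns $\alpha_j$ appear, which parallels the preference-free pricing equation of Section 8.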
As we saw in Section 8, options are a fundamental security in the theory of
derivative-security pricing. As demonstrated in a discrete-time context by Ross
(1976b), options can also be used in an important way to complete markets and
thereby, to improve allocational efficiency.
The close connection between pure state securities and options in the
continuous-time model is perhaps best exemplified by the work of Breeden and
Litzenberger (1978). They analyze the pricing of state-contingent claims in the
scalar case where a single variable is sufficient to describe the state of the
economy. A "butterfly-spread" option strategy holds a long position in two call
options, one with exercise price $E - \Delta$ and the other with exercise price $E + \Delta$,

56Of course, with a continuum of states, the price of any one Arrow-Debreu security, like
the probability of a state, is infinitesimal. The solution to (10.9) is analogous to a probability
density and therefore, the actual Arrow-Debreu price is $\Pi(S, t)\,d\bar{S}_1 \cdots d\bar{S}_m$. The limiting
boundary condition for $t = T$ in (10.9) is a vector, generalized Dirac delta function.
57The derivation can be generalized to the case in Section 7, where $dZ_{m+1}, \ldots, dZ_n$ are not
perfectly correlated with the state variables by adding the mean-variance efficient portfolio to the
$m + 1$ portfolios used here. As shown in Merton (1990, chs. 14 and 16), such a portfolio must
generally be included as part of the state-space description if these path-independent pure
securities are to span the entire optimal consumption-bequest allocation set. Cox, Ingersoll and
Ross (1985a) present a more general version of partial differential equation (10.9), which describes
general-equilibrium pricing for all assets and securities in the economy. See Duffie (1986, 1988) for
discussion of existence of equilibrium in these general models.

and a short position in two call options with exercise price E, where the
expiration dates of the options are the same. If options are available on a
security whose price $V(t)$ is in one-to-one correspondence with the state
variable, then the payoff to a state-contingent claim which pays \$1 at time $T$ if
$V(T) = E$ and \$0 otherwise, can be approximated by $1/\Delta$ units of a butterfly
spread. This approximation becomes exact in the limit as $\Delta \to dE$, the
infinitesimal differential. Breeden and Litzenberger thus show that the pure state
security price is given by $(\partial^2 F/\partial E^2)\,dE$, where $F$ is the call-option pricing
function derived in Section 8. Under the specialized conditions for which the
Black-Scholes formula (8.9) applies, the solution for the pure state security
price has a closed-form given by $\exp[-r(T-t)]\,\Phi'(x_2)\,dE/(\sigma E\sqrt{T-t})$.
In the intertemporal version of the Arrow-Debreu complete-markets model,
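
The butterfly-spread approximation can be checked against the closed form. The sketch below is illustrative only: the parameter values are hypothetical, and it assumes that $x_2$ denotes the usual second argument of the Black-Scholes formula (often written $d_2$) and that $F$ is the standard Black-Scholes call price with constant $r$ and $\sigma$. The price of $1/\Delta$ butterflies, divided by $\Delta$, is compared with $\exp[-r(T-t)]\Phi'(x_2)/(\sigma E\sqrt{T-t})$.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density, i.e. the derivative of norm_cdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def x2_arg(v, e, r, sigma, tau):
    """Second Black-Scholes argument (assumed to be the chapter's x_2)."""
    return (math.log(v / e) + (r - 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))


def bs_call(v, e, r, sigma, tau):
    """Black-Scholes price of a European call with exercise price e."""
    d2 = x2_arg(v, e, r, sigma, tau)
    d1 = d2 + sigma * math.sqrt(tau)
    return v * norm_cdf(d1) - e * math.exp(-r * tau) * norm_cdf(d2)


def butterfly_density(v, e, r, sigma, tau, delta):
    """Price of 1/delta butterfly spreads centered at e, per unit of delta."""
    spread = (bs_call(v, e - delta, r, sigma, tau)
              - 2.0 * bs_call(v, e, r, sigma, tau)
              + bs_call(v, e + delta, r, sigma, tau))
    return (spread / delta) / delta


def closed_form_density(v, e, r, sigma, tau):
    """exp(-r tau) * Phi'(x_2) / (sigma * e * sqrt(tau))."""
    d2 = x2_arg(v, e, r, sigma, tau)
    return math.exp(-r * tau) * norm_pdf(d2) / (sigma * e * math.sqrt(tau))


# Hypothetical parameter values (not taken from the chapter).
v, e, r, sigma, tau = 100.0, 105.0, 0.05, 0.2, 0.5

approx = butterfly_density(v, e, r, sigma, tau, delta=0.5)
exact = closed_form_density(v, e, r, sigma, tau)
print(approx, exact)
```

Shrinking `delta` brings the two numbers closer, which is the discrete counterpart of the limit $\Delta \to dE$ in the text.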
there is a security for every possible state of the economy, but markets need
only be open "once" because agents will have no need for further trade. In the
model of this section, there are many fewer securities, but agents trade
continuously. Nevertheless, both models have many of the same properties. It
appears, therefore, that a good substitute for having a large number of markets
and securities is to have the existing markets open more frequently for trade.
In addressing this point as well as the robustness of the continuous-time
model, Duffie and Huang (1985) derive necessary and sufficient conditions for
continuous-trading portfolio strategies with a finite number of securities to
effectively complete markets in a Radner economy. As discussed in Section 5,
the Arrow-Debreu model permits some degree of heterogeneity in beliefs
among agents. Just so, Duffie and Huang show that the spanning results
derived here for continuous trading are robust with respect to heterogeneous
probability assessments among agents provided that their subjective probability
measures are uniformly absolutely continuous. In later work [Duffie and
Huang (1986)] they derive conditions where these results obtain in the more
general framework of differential information among agents. 58
Although continuous trading is, of course, only a theoretical proposition, the
continuous-trading solutions will be an asymptotically valid approximation to
the discrete-time solutions as the trading interval becomes small. 59 An in-depth
discussion of the mathematical and economic assumptions required for the
valid application of the continuous-time analysis is beyond the scope of this
chapter. 60 However, actual securities markets are open virtually all the time,

58Other research in this area includes Williams (1977), Hellwig (1982), Gennotte (1986), and
Dothan and Feldman (1986).
59See Samuelson (1970) and Merton and Samuelson (1974). Merton (1975, p. 663) discusses
special cases in which the limiting discrete-time solutions do not approach the continuous-time
solutions.
60Merton (1982b, 1990) discusses in detail the economic assumptions required for the con-
tinuous-time methodology. Moreover, most of the mathematical tools for manipulation of these
models are derived using only elementary probability theory and the calculus.

and hence the required assumptions are rather reasonable when applied in that
context.
In summary, we have seen that all the interesting models of portfolio
selection and capital market theory share in common the property of non-trivial
spanning. If, however, a model is to be broadly applicable, then it
should also satisfy the further conditions that: (i) the number of securities
required for spanning be considerably smaller than both the number of agents
and the number of possible states for the economy; and (ii) the creation of
securities with non-linear sharing rules by an individual investor or firm should
not, in general, alter the size of the spanning set. As we have also seen, the
continuous-trading model with vector diffusions for the underlying state vari-
ables meets these criteria. Motivated in part by the important work of Harrison
and Kreps (1979), Duffie and Huang (1985) use martingale representation
theorems to show that with continuous trading these conditions can also obtain
for a class of non-Markov, path-dependent processes, some of which do not
have continuous sample paths. 61 It remains, however, an open and important
research question as to whether in the absence of continuous trading these
criteria can be satisfied in interesting models with general preferences and
endowments.

61If the underlying dynamics of the system include Poisson-driven processes with discontinuous
sample paths, then the resulting equilibrium prices will satisfy a mixed partial difference-
differential equation. In the case of non-Markov path-dependent processes, the valuation condi-
tions cannot be represented as a partial differential equation.

References

Adler, M. and B. Dumas (1983) 'International portfolio choice and corporation finance: A
synthesis', Journal of Finance, 38: 925-984.
Arrow, K.J. (1953) 'Le rôle des valeurs boursières pour la répartition la meilleure des risques',
Econometrie, Colloques Internationaux du Centre National de la Recherche Scientifique, Vol.
XI, Paris, pp. 41-47.
Arrow, K.J. (1964) 'The role of securities in the optimal allocation of risk bearing,' Review of
Economic Studies, 31: 91-95.
Bawa, V.S. (1975) 'Optimal rules for ordering uncertain prospects', Journal of Financial
Economics, 2: 95-121.
Bertsekas, D.P. (1974) 'Necessary and sufficient conditions for existence of an optimal portfolio',
Journal of Economic Theory, 8: 235-247.
Bhattacharya, S. (1989) 'Financial markets and incomplete information: A review of some recent
developments', in: S. Bhattacharya and G.M. Constantinides, eds., Frontiers of modern financial
theory: Financial markets and incomplete information. Totowa: Rowman and Littlefield.
Black, F. (1972) 'Capital market equilibrium with restricted borrowing', Journal of Business, 45:
444-455.
Black, F. (1986) 'Noise', Journal of Finance, 41: 529-543.
Black, F. (1987) 'The week's citation classic', Current Contents/Social and Behavioral Science, 33:
17 August. Philadelphia: Institute for Scientific Information, p. 16.

Black, F. and M. Scholes (1973) 'The pricing of options and corporate liabilities', Journal of
Political Economy, 81: 637-654.
Borch, K. (1969) 'A note on uncertainty and indifference curves', Review of Economic Studies, 36:
1-4.
Breeden, D.T. (1979) 'An intertemporal asset pricing model with stochastic consumption and
investment opportunities', Journal of Financial Economics, 7: 265-296.
Breeden, D.T. and R. Litzenberger (1978) 'Prices of state-contingent claims implicit in option
prices', Journal of Business, 51: 621-651.
Cass, D. and J.E. Stiglitz (1970) 'The structure of investor preferences and asset returns, and
separability in portfolio allocation: A contribution to the pure theory of mutual funds', Journal
of Economic Theory, 2: 122-160.
Chamberlain, G. (1983) 'A characterization of the distributions that imply mean-variance utility
functions', Journal of Economic Theory, 29: 185-201.
Chamberlain, G. and M. Rothschild (1983) 'Arbitrage and mean-variance analysis on large asset
markets', Econometrica, 51: 1281-1301.
Chen, N. and J.E. Ingersoll (1983) 'Exact pricing in linear factor models with finitely-many assets:
A note', Journal of Finance, 38: 985-988.
Constantinides, G. (1986) 'Capital market equilibrium with transactions costs', Journal of Political
Economy, 94: 842-862.
Constantinides, G. and M. Scholes (1980) 'Optimal liquidation of assets in the presence of
personal taxes: Implications for asset pricing', Journal of Finance, 35: 439-443.
Cox, D.A. and H.D. Miller (1968) The theory of stochastic processes. New York: John Wiley.
Cox, J.C. and C. Huang (1989) 'Optimum consumption and portfolio policies when asset prices
follow a diffusion process', Journal of Economic Theory, 49: 33-83.
Cox, J.C. and C. Huang (forthcoming) 'A variational problem arising in financial economics with
an application to a portfolio turnpike theorem', Journal of Mathematical Economics.
Cox, J.C. and M. Rubinstein (1985) Options markets. Englewood Cliffs: Prentice-Hall.
Cox, J.C., J.E. Ingersoll and S.A. Ross (1985a) 'An intertemporal general equilibrium model of
asset prices', Econometrica, 53: 363-384.
Cox, J.C., J.E. Ingersoll and S.A. Ross (1985b) 'A theory of the term structure of interest rates',
Econometrica, 53: 385-408.
Debreu, G. (1959) Theory of value. New York: John Wiley.
Dhrymes, P., I. Friend and N. Gultekin (1984) 'A critical examination of the empirical evidence on
the arbitrage pricing theory', Journal of Finance, 39: 323-347.
Dhrymes, P., I. Friend and N. Gultekin (1985) 'New tests of the APT and their implications',
Journal of Finance, 40: 659-674.
Dothan, M.U. and D. Feldman (1986) 'Equilibrium interest rates and multiperiod bonds in a
partially observable economy', Journal of Finance, 41: 369-382.
Dreyfus, S.E. (1965) Dynamic programming and the calculus of variations. New York: Academic
Press.
Duffie, D. (1986) 'Stochastic equilibria: Existence, spanning number, and the "no expected
financial gain from trade" hypothesis', Econometrica, 54: 1161-1184.
Duffie, D. (1988) Security markets: Stochastic models. New York: Academic Press.
Duffie, D. and C. Huang (1985) 'Implementing Arrow-Debreu equilibria by continuous trading of
few long-lived securities', Econometrica, 53: 1337-1356.
Duffie, D. and C. Huang (1986) 'Multiperiod securities markets with differential information:
Martingales and resolution times', Journal of Mathematical Economics, 15: 283-303.
Dybvig, P. and S.A. Ross (1982) 'Portfolio efficient sets', Econometrica, 50: 1525-1546.
Fama, E. (1965) 'The behavior of stock market prices', Journal of Business, 38: 34-105.
Fama, E. (1970a) 'Efficient capital markets: A review of theory and empirical work', Journal of
Finance, 25: 383-417.
Fama, E. (1970b) 'Multiperiod consumption-investment decisions', American Economic Review,
60: 163-174.
Fama, E. (1978) 'The effects of a firm's investment and financing decisions on the welfare of its
securityholders', American Economic Review, 68: 272-284.
Farrar, D.E. (1962) The investment decision under uncertainty. Englewood Cliffs: Prentice-Hall.

Farrell, J.L. (1974) 'Analyzing covariation of returns to determine homogeneous stock groupings',
Journal of Business, 47: 186-207.
Feeney, G.J. and D. Hester (1967) 'Stock market indices: A principal components analysis', in: D.
Hester and J. Tobin, eds., Risk aversion and portfolio choice. New York: John Wiley.
Feldstein, M.S. (1969) 'Mean-variance analysis in the theory of liquidity preference and portfolio
selection', Review of Economic Studies, 36: 5-12.
Feller, W. (1966) An introduction to probability theory and its applications, Vol. 2. New York: John
Wiley.
Fischer, S. (1972) 'Assets, contingent commodities, and the Slutsky equations', Econometrica, 40:
371-385.
Fischer, S. and R.C. Merton (1984) 'Macroeconomics and finance: The role of the stock market',
in: K. Brunner and A.H. Melzer, eds., Essays on macroeconomic implications of financial and
labor markets and political processes, Vol. 21. Amsterdam: North-Holland.
Friend, I. and J. Bicksler, eds. (1977) Studies in risk and return, Vols. I & II. Cambridge, Mass.:
Ballinger.
Gennotte, G. (1986) 'Optimal portfolio choice under incomplete information', Journal of Finance,
41: 733-746.
Goldman, M.B. (1974) 'A Negative report on the "near-optimality" of the max-expected-log
policy as applied to bounded utilities for long-lived programs', Journal of Financial Economics,
1: 97-103.
Grossman, S. (1976) 'On the efficiency of competitive stock markets where traders have diverse
information', Journal of Finance, 31: 573-585.
Grossman, S. and G. Laroque (forthcoming) 'Asset pricing and optimal portfolio choice in the
presence of illiquid durable consumption goods', Econometrica.
Grossman, S. and J.E. Stiglitz (1980) 'On the impossibility of informationally efficient markets',
American Economic Review, 70: 393-408.
Hadar, J. and W.R. Russell (1969) 'Rules for ordering uncertain prospects', American Economic
Review, 59: 25-34.
Hadar, J. and W.R. Russell (1971) 'Stochastic dominance and diversification', Journal of
Economic Theory, 3: 288-305.
Hakansson, N. (1970) 'Optimal investment and consumption strategies under risk for a class of
utility functions', Econometrica, 38: 587-607.
Hanoch, G. and H. Levy (1969) 'The efficiency analysis of choices involving risk', Review of
Economic Studies, 36: 335-346.
Hansen, L. (1985) 'Auctions with contingent payments', American Economic Review, 75: 862-865.
Hardy, G.H., J.E. Littlewood and G. Pólya (1959) Inequalities. Cambridge: Cambridge University
Press.
Harrison, J.M. (1985) Brownian motion and stochastic flow systems. New York: John Wiley.
Harrison, J.M. and D. Kreps (1979) 'Martingales and arbitrage in multiperiod securities markets',
Journal of Economic Theory, 20: 381-408.
Hellwig, M.F. (1982) 'Rational expectations equilibrium with conditioning on past prices', Journal
of Economic Theory, 26: 279-312.
Herstein, I. and J. Milnor (1953) 'An axiomatic approach to measurable utility', Econometrica, 21:
291-297.
Hirshleifer, J. (1965) 'Investment decision under uncertainty: Choice-theoretic approaches',
Quarterly Journal of Economics, 79: 509-536.
Hirshleifer, J. (1966) 'Investment decision under uncertainty: Applications of the state-preference
approach', Quarterly Journal of Economics, 80: 252-277.
Hirshleifer, J. (1970) Investment, interest and capital. Englewood Cliffs: Prentice-Hall.
Hirshleifer, J. (1973) 'Where are we in the theory of information?', American Economic Review,
63: 31-39.
Hogarth, R.M. and M.W. Reder, eds. (1986) 'The behavioral foundations of economic theory',
Journal of Business No. 4, Part 2, 59: S181-S505.
Huang, C. (1985a) 'Information structure and equilibrium asset prices', Journal of Economic
Theory, 34: 33-71.
Huang, C. (1985b) 'Information structures and viable price systems', Journal of Mathematical
Economics, 14: 215-240.

Huang, C. (1987) 'An intertemporal general equilibrium asset pricing model: The case of diffusion
information', Econometrica, 55: 117-142.
Huang, C. and K. Kreps (1985) 'Intertemporal preferences with a continuous time dimension: An
exploratory study', Massachusetts Institute of Technology, Mimeo.
Ingersoll, Jr., J.E. (1987) Theory of financial decision making. Totowa: Rowman and Littlefield.
Itô, K. and H.P. McKean, Jr. (1964) Diffusion processes and their sample paths. New York:
Academic Press.
Jensen, M.C., ed. (1972a) Studies in the theory of capital markets. New York: Praeger.
Jensen, M.C. (1972b) 'Capital markets: Theory and evidence', Bell Journal of Economics and
Management Science, 3: 357-398.
Karatzas, I., J. Lehoczky, S. Sethi and S. Shreve (1986) 'Explicit solutions of a general
consumption/investment problem', Mathematics of Operations Research, 11: 261-294.
King, B.R. (1966) 'Market and industry factors in stock price behavior', Journal of Business, 39,
Supplement: 139-190.
Kuhn, H.W. and A.W. Tucker (1951) 'Nonlinear programming', in: J. Neyman, ed., Proceedings of
the Second Berkeley Symposium of Mathematical Statistics and Probability. Berkeley: University
of California Press.
Kushner, H.J. (1967) Stochastic stability and control. New York: Academic Press.
Latane, H. (1959) 'Criteria for choice among risky ventures', Journal of Political Economy, 67:
144-155.
Leland, H. (1972) 'On the existence of optimal policies under uncertainty', Journal of Economic
Theory, 4: 35-44.
Leland, H. (1985) 'Option pricing and replication with transactions costs', Journal of Finance, 40:
1283-1301.
Lintner, J. (1965) 'The valuation of risk assets and the selection of risky investments in stock
portfolios and capital budgets', Review of Economics and Statistics, 47: 13-37.
Livingston, M. (1977) 'Industry movements of common stocks', Journal of Finance, 32: 861-874.
Machina, M. (1982) ' "Expected utility" analysis without the independence axiom', Econometrica,
50: 277-323.
Markowitz, H. (1959) Portfolio selection: Efficient diversification of investment. New York: John
Wiley.
Markowitz, H. (1976) 'Investment for the long run: New evidence for an old rule', Journal of
Finance, 31: 1273-1286.
Mason, S. (1981) 'Consumption and investment incentives associated with welfare programs',
Working Paper 79-34, Harvard Business School.
Mason, S. and R.C. Merton (1985) 'The role of contingent claims analysis in corporate finance',
in: E. Altman and M. Subrahmanyam, eds., Recent advances in corporate finance. Homewood:
Richard D. Irwin.
McKean, Jr., H.P. (1969) Stochastic integrals. New York: Academic Press.
McShane, E.J. (1974) Stochastic calculus and stochastic models. New York: Academic Press.
Merton, R.C. (1969) 'Lifetime portfolio selection under uncertainty: The continuous-time case',
Review of Economics and Statistics, 51: 247-257.
Merton, R.C. (1970) 'A dynamic general equilibrium model of the asset market and its application
to the pricing of the capital structure of the firm', Working Paper 494-70, MIT Sloan School of
Management.
Merton, R.C. (1971) 'Optimum consumption and portfolio rules in a continuous-time model',
Journal of Economic Theory, 3: 373-413.
Merton, R.C. (1972) 'An analytic derivation of the efficient portfolio frontier', Journal of Financial
and Quantitative Analysis, 7: 1851-1872.
Merton, R.C. (1973a) 'Theory of rational option pricing', Bell Journal of Economics and
Management Science, 4: 141-183.
Merton, R.C. (1973b) 'An intertemporal capital asset pricing model', Econometrica, 41: 867-887.
Merton, R.C. (1974) 'On the pricing of corporate debt: The risk structure of interest rates',
Journal of Finance, 29: 449-470.
Merton, R.C. (1975) 'Theory of finance from the perspective of continuous time', Journal of
Financial and Quantitative Analysis, 10: 659-674.
Merton, R.C. (1977a) 'A re-examination of the capital asset pricing model', in: I. Friend and J.
Bicksler, eds., Studies in risk and return, Vols. I & II. Cambridge, Mass.: Ballinger.
Merton, R.C. (1977b) 'On the pricing of contingent claims and the Modigliani-Miller theorem',
Journal of Financial Economics, 5: 241-249.
Merton, R.C. (1978) 'On the cost of deposit insurance when there are surveillance costs', Journal
of Business, 51: 439-452.
Merton, R.C. (1982a) 'On the microeconomic theory of investment under uncertainty', in: K.J.
Arrow and M. Intriligator, eds., Handbook of mathematical economics, Vol. II. Amsterdam:
North-Holland.
Merton, R.C. (1982b) 'On the mathematics and economics assumptions of continuous-time
financial models', in: W.F. Sharpe and C.M. Cootner, eds., Financial economics: Essays in
honor of Paul Cootner. Englewood Cliffs: Prentice-Hall.
Merton, R.C. (1987a) 'On the current state of the stock market rationality hypothesis', in: S.
Fischer, R. Dornbusch and J. Bossons, eds., Macroeconomics and finance: Essays in honor of
Franco Modigliani. Cambridge, Mass.: MIT Press.
Merton, R.C. (1987b) 'A simple model of capital market equilibrium with incomplete
information', Journal of Finance, 42: 483-510.
Merton, R.C. (1989) 'On the application of the continuous-time theory of finance to financial
intermediation and insurance', Twelfth Annual Lecture of the Geneva Association, The Geneva
Papers on Risk and Insurance, 14: 225-262.
Merton, R.C. (1990) Continuous-time finance. Oxford: Basil Blackwell.
Merton, R.C. and P.A. Samuelson (1974) 'Fallacy of the log-normal approximation to optimal
portfolio decision making over many periods', Journal of Financial Economics, 1: 67-94.
Meyer, R.F. (1970) 'On the relationship among the utility of assets, the utility of consumption, and
investment strategy in an uncertain, but time-invariant world', in: J. Lawrence, ed., OR69:
Proceedings of the Fifth International Congress on Operational Research. Tavistock Publications.
Miller, M.H. (1977) 'Debt and taxes', Journal of Finance, 32: 261-276.
Modigliani, F. and M.H. Miller (1958) 'The cost of capital, corporation finance, and the theory of
investment', American Economic Review, 48: 261-297.
Mossin, J. (1966) 'Equilibrium in a capital asset market', Econometrica, 34: 768-783.
Myers, S.C. (1968) 'A time-state-preference model of security valuation', Journal of Financial
and Quantitative Analysis, 3: 1-33.
Nielsen, L.T. (1986) 'Mutual fund separation: Factor structure and robustness', Working Paper
86/87-2-3, Graduate School of Business, University of Texas at Austin.
Parsons, J. and A. Raviv (1985) 'Underpricing of seasoned issues', Journal of Financial
Economics, 14: 377-397.
Pratt, J.W. (1964) 'Risk aversion in the small and in the large', Econometrica, 32: 122-136.
Radner, R. (1972) 'Existence of plans, prices, and price expectations in a sequence of markets',
Econometrica, 40: 289-303.
Richard, S. (1975) 'Optimal consumption, portfolio, and life insurance rules for an uncertain lived
individual in a continuous-time model', Journal of Financial Economics, 2: 187-204.
Rock, K. (1986) 'Why new issues are underpriced', Journal of Financial Economics, 15: 187-212.
Roll, R. and S.A. Ross (1980) 'An empirical investigation of the arbitrage pricing theory', Journal
of Finance, 35: 1073-1103.
Ross, S.A. (1976a) 'Arbitrage theory of capital asset pricing', Journal of Economic Theory, 13:
341-360.
Ross, S.A. (1976b) 'Options and efficiency', Quarterly Journal of Economics, 90: 75-89.
Ross, S.A. (1978) 'Mutual fund separation in financial theory: The separating distributions',
Journal of Economic Theory, 17: 254-286.
Rothschild, M. (1986) 'Asset pricing theories', in: W.P. Heller, R.M. Starr and D.A. Starrett, eds.,
Uncertainty, information and communication: Essays in honor of Kenneth J. Arrow, Vol. III.
Cambridge: Cambridge University Press.
Rothschild, M. and J.E. Stiglitz (1970) 'Increasing risk I: A definition', Journal of Economic
Theory, 2: 225-243.
Rothschild, M. and J.E. Stiglitz (1971) 'Increasing risk II: Its economic consequences', Journal of
Economic Theory, 3: 66-84.
Rubinstein, M. (1976) 'The strong case for the generalized logarithmic utility model as the premier
model of financial markets', Journal of Finance, 31: 551-572.
Samuelson, P.A. (1965) 'Proof that properly anticipated prices fluctuate randomly', Industrial
Management Review, 6: 41-49. Reprinted in Samuelson (1972).
Samuelson, P.A. (1967) 'General proof that diversification pays', Journal of Financial and
Quantitative Analysis, 2: 1-13. Reprinted in Samuelson (1972).
Samuelson, P.A. (1969) 'Lifetime portfolio selection by dynamic stochastic programming', Review
of Economics and Statistics, 51: 239-246. Reprinted in Samuelson (1972).
Samuelson, P.A. (1970) 'The fundamental approximation theory of portfolio analysis in terms of
means, variances, and higher moments', Review of Economic Studies, 37: 537-542. Reprinted in
Samuelson (1972).
Samuelson, P.A. (1971) 'The "Fallacy" of maximizing the geometric mean in long sequences of
investing or gambling', Proceedings of the National Academy of Sciences, 68: 2493-2496.
Reprinted in Samuelson (1972).
Samuelson, P.A. (1972) in: R.C. Merton, ed., The collected scientific papers of Paul A. Samuelson,
Vol. III. Cambridge, Mass.: MIT Press.
Samuelson, P.A. (1977) 'St. Petersburg paradoxes: Defanged, dissected, and historically
described', Journal of Economic Literature, 15: 24-55.
Samuelson, P.A. and R.C. Merton (1969) 'A complete model of warrant pricing that maximizes
utility', Industrial Management Review, 10 (Winter): 17-46. Reprinted in Samuelson (1972).
Scholes, M. (1976) 'Taxes and the pricing of options', Journal of Finance, 31: 319-332.
Shanken, J. (1982) 'The arbitrage pricing theory: Is it testable?', Journal of Finance, 37:
1129-1140.
Sharpe, W. (1964) 'Capital asset prices: A theory of market equilibrium under conditions of risk',
Journal of Finance, 19: 425-442.
Sharpe, W. (1970) Portfolio theory and capital markets. New York: McGraw-Hill.
Smith, Jr., C.W. (1976) 'Option pricing: A review', Journal of Financial Economics, 3: 3-52.
Solnik, B.H. (1974) 'An equilibrium model of the international capital market', Journal of
Economic Theory, 8: 500-524.
Stiglitz, J.E. (1969) 'A re-examination of the Modigliani-Miller theorem', American Economic
Review, 59: 78-93.
Stiglitz, J.E. (1974) 'On the irrelevance of corporate financial policy', American Economic Review,
64: 851-886.
Stulz, R.M. (1981) 'A model of international asset pricing', Journal of Financial Economics, 9:
383-406.
Sun, T. (1987) 'Transactions costs and intervals in a discrete-continuous time setting for
consumption and portfolio choice', in: Connections between discrete-time and continuous-time financial
models, Ph.D. Dissertation, Graduate School of Business, Stanford University, ch. 1.
Tobin, J. (1958) 'Liquidity preference as behavior towards risk', Review of Economic Studies, 25:
68-85.
Tobin, J. (1969) 'Comment on Borch and Feldstein', Review of Economic Studies, 36: 13-14.
Trzcinka, C. (1986) 'On the number of factors in the arbitrage pricing model', Journal of Finance,
41: 347-368.
Varian, H. (1985) 'Divergence of opinion in complete markets: A note', Journal of Finance, 40:
309-318.
von Neumann, J. and O. Morgenstern (1947) Theory of games and economic behavior, 2nd edn.
Princeton: Princeton University Press.
Wallace, N. (1981) 'A Modigliani-Miller theorem for open-market operations', American
Economic Review, 71: 267-274.
Williams, J.T. (1977) 'Capital asset prices with heterogeneous beliefs', Journal of Financial
Economics, 5: 219-239.
