Jorion 1986


Bayes-Stein Estimation for Portfolio Analysis

Author(s): Philippe Jorion


Source: The Journal of Financial and Quantitative Analysis, Vol. 21, No. 3 (Sep., 1986), pp.
279-292
Published by: Cambridge University Press on behalf of the University of Washington School of
Business Administration
Stable URL: http://www.jstor.org/stable/2331042 .
Accessed: 02/09/2014 12:33


This content downloaded from 87.67.189.183 on Tue, 2 Sep 2014 12:33:52 PM


All use subject to JSTOR Terms and Conditions
JOURNAL OF FINANCIAL AND QUANTITATIVE ANALYSIS, VOL. 21, NO. 3, SEPTEMBER 1986

Bayes-Stein Estimation for Portfolio Analysis

Philippe Jorion*

Abstract

In portfolio analysis, uncertainty about parameter values leads to suboptimal portfolio choices. The resulting loss in the investor's utility is a function of the particular estimator chosen for expected returns. So, this is a problem of simultaneous estimation of normal means under a well-specified loss function. In this situation, as Stein has shown, the classical sample mean is inadmissible. This paper presents a simple empirical Bayes estimator that should outperform the sample mean in the context of a portfolio. Simulation analysis shows that these Bayes-Stein estimators provide significant gains in portfolio selection problems.

I. Introduction

One of the fundamental propositions of the modern theory of finance is that security risk has to be considered in the context of a portfolio. It is astonishing then that estimation techniques in finance have not recognized the implications of this result for efficient estimation of unknown parameters. In the context of a portfolio, using sample means to estimate expected returns amounts to ignoring information contained in other series, and could be compared to assessing the risk of a security by looking at the variance of its return, rather than at its contribution to overall portfolio risk.
This paper presents an application of shrinkage estimation to portfolio selection problems. Shrinkage estimators have already been used in finance (see [31], [7], and [19]), but always on an ad hoc basis. This paper provides a sound rationale for such estimators, and illustrates the extent of possible gains over classical estimators.
The application of portfolio analysis à la Markowitz [25] traditionally proceeds in two steps. First, the moments of the distribution of returns are estimated from a time-series of historical returns; then the mean variance portfolio selection problem is solved separately, as if the estimates were the true parameters. This
* Graduate School of Business, Columbia University, New York, NY 10027. This paper draws on part of the author's dissertation at the University of Chicago, and was presented at the 1983 NBER-NSF Conference on Bayesian Inference in Econometrics. The author is grateful to Arnold Zellner for useful discussions and comments and an anonymous JFQA referee who also helped to improve the paper. The research was supported by the Collège Interuniversitaire d'Etudes Doctorales dans les Sciences du Management.


"certainty equivalence" viewpoint, in which the sample estimates are treated as


the true values, has been criticized by Barry [1], Brown [11], and Klein and Bawa [22], who advocate a Bayesian approach that explicitly incorporates estimation error. But their conclusions should be carried further, which is what the present study proposes to do.
The impact of parameter uncertainty on optimal portfolio selection has been recognized by a number of authors (see [17], [13], and [20]) who show that the practical application of portfolio analysis is seriously hampered by estimation error, especially in expected returns. Variances and covariances are also unknown, but are more stable over time, as pointed out by Merton [27].
In this context, the relevant measure of estimation risk is the utility loss due to a portfolio choice based on sample estimates, rather than on true values. This loss is a function of the estimator chosen for the population moments. Consequently, one should select an estimator with average minimizing properties relative to this loss function. Brown [11] provided a Bayesian correction based on a diffuse prior, which reduces estimation risk, but the estimator in this case is still the sample mean, which too often takes extreme values. Further, choosing sample means does not fully exploit the multivariate nature of the problem. The issue here is not to estimate each expected return separately, in which case the sample mean would be optimal, but rather to minimize the impact of estimation risk on optimal portfolio choice. Thus, the portfolio context should be central to the estimation procedure.
Instead of the sample mean, an estimator obtained by "shrinking" the means toward a common value is proposed, which should lead to decreased estimation error with more than two assets in the portfolio. This result can be traced to the inadmissibility of the sample mean, which was first proved by Stein [29] and extended by Brown [8]. It stems from the fact that the effect of estimation error for all assets is summarized into one loss function, which should be minimized as a whole rather than each component separately.
Section II develops the topic of estimation risk, and Section III reviews the original Stein estimator and its extensions. An empirical Bayes interpretation is presented in Section IV. The shrinkage effect is explained by an informative prior. The parameters of the prior are not prespecified but, rather, are computed from the data themselves.
Section V illustrates the gains from Bayes-Stein estimation. By simulation analysis, it is shown that this estimator drastically reduces estimation error: expressed in risk-free equivalent return, the gain over the Bayes diffuse prior is of the order of a few percent per annum for sample sizes below 50. Some concluding remarks are offered in Section VI.

II. Estimation Risk

Traditionally, statistical estimation has been kept separate from portfolio decisions, mainly because portfolio choice has been analyzed in the "certainty equivalence" framework, in which the underlying moments are assumed known. But this two-step procedure is not optimal from an estimation viewpoint: efficiency can be improved by directly considering the effect of parameter uncertainty on the investor's utility.


First, estimation risk should be defined. In a one-period model, the usual rationality axioms lead the investor to maximize the expected utility of his or her end-of-period wealth. In terms of rates of return, the control problem is to choose a set of weights $q$ so as to maximize the expected utility of the return on the portfolio $z = q'r$, where $r$ is the vector of future observations,

(1)  $EU(z) = \int U(z)\, p(z \mid \theta)\, dz,$

subject to a feasibility constraint.


The problem has two components: (i) a utility function $U(z)$, generally characterized by a functional form and a set of parameters, both of which can be different across investors; and (ii) the conditional distribution of rates of return, conditioned on a set of parameters $\theta$, unknown for all practical purposes.
In the certainty equivalence framework, one assumes that $\theta$ equals its estimate $\hat\theta(y)$, based on some estimator, defined as a function of the observations $y$,

(2)  $\max_q\; E_z\left[U(z) \mid \theta = \hat\theta(y)\right].$

This approach obviously ignores the issue of estimation risk, or parameter uncertainty.
The Bayesian solution to this problem, as first suggested by Zellner and Chetty [33], is to define optimal portfolio choice in terms of the predictive density function. The latter is obtained after integrating out the unknown parameter $\theta$, which explicitly takes into account uncertainty about $\theta$. The investor's problem can be described as the maximization of the unconditional expected utility of his portfolio

$\max_q\; E_\theta\left[E_{z \mid \theta}\left[U(z) \mid \theta\right]\right],$ with

$E_\theta\left[E_{z \mid \theta}\left[U(z) \mid \theta\right]\right] = \int_\theta \int_z U(z)\, p(z \mid \theta)\, dz\; p(\theta \mid y, I_0)\, d\theta = \int_z U(z) \left[\int_\theta p(z \mid \theta)\, p(\theta \mid y, I_0)\, d\theta\right] dz.$

The term between brackets is defined as the predictive density function of $z = q'r$

(3)  $p(z \mid y) = \int_\theta p(z \mid \theta)\, p(\theta \mid y, I_0)\, d\theta,$

where $p(\theta \mid y, I_0)$ is the posterior density function of $\theta$, given the data and the prior information $I_0$,

$p(\theta \mid y, I_0) \propto f(y \mid \theta)\, p(\theta \mid I_0).$


As Klein and Bawa [22] have shown, this approach is optimal according to the expected utility von Neumann-Morgenstern axioms.
In a mean-variance framework the objective function can be reduced to a derived utility function

(4)  $EU(z) = F(\mu_z, \sigma_z^2),$

where $\mu_z = q'\mu$ and $\sigma_z^2 = q'\Sigma q$ are the expected return and variance of the portfolio. The control problem is to choose the vector of investment proportions $q$ so as to maximize expected utility, subject to the constraint that the weights must sum to one, for instance.
If the distribution moments $\theta = (\mu, \Sigma)$ are known, the choice must be optimal

(5)  $F(\mu_z^*, \sigma_z^{2*}) = F(q^*(\theta) \mid \mu, \Sigma) = F(q^{*\prime}\mu,\; q^{*\prime}\Sigma q^*) = F_{MAX}.$

On the other hand, if the parameters $\theta$ are unknown, the portfolio choice $\hat q$ will be made on the basis of the sample estimate $\hat\theta(y)$. The expected utility, measured in terms of the true underlying distribution, will necessarily be lower than before

$F(\hat q(\hat\theta(y)) \mid \mu, \Sigma) = F(\hat q'\mu,\; \hat q'\Sigma \hat q) \le F_{MAX}.$

Clearly, this loss in utility is due to parameter uncertainty. Following Brown [11], the loss due to estimation risk can be measured as

(6)  $L(q^*, \hat q) = \dfrac{F_{MAX} - F(\hat q)}{|F_{MAX}|}.$

Figure 1 illustrates the utility loss due to estimation risk. If the investor knew the true parameters, he would choose the portfolio represented by point A, where the weights $q^*$ are optimal relative to the true frontier (the solid line). Unfortunately, he only observes sample estimates, and selects portfolio B, with composition $\hat q$, which is optimal relative to the estimated frontier (the dashed line). However, this choice $\hat q$ is suboptimal relative to the true parameters: for point C, $F(\hat q) \le F_{MAX}$. The difference between these values can be attributed to estimation risk; it can also be expressed in risk-free equivalent return, by transforming the level of expected utility $F$ into a risk-free rate $R$

(7)  $F(\mu_z, \sigma_z^2) = F(R, 0).$
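The loss (6) and the risk-free equivalent (7) can be illustrated with a minimal sketch. It assumes, purely for illustration, the derived utility $F(\mu_z, \sigma_z^2) = \mu_z - (a/2)\sigma_z^2$, which is the certainty equivalent of a negative exponential utility under normal returns; with this choice $F(R, 0) = R$, so $F$ is already a risk-free equivalent return. The function names and the numerical inputs are hypothetical and are not taken from the paper.

```python
import numpy as np

def optimal_weights(mu, sigma, risk_aversion):
    """Weights maximizing q'mu - (a/2) q'Sigma q subject to sum(q) == 1."""
    ones = np.ones(len(mu))
    sinv_mu = np.linalg.solve(sigma, mu)
    sinv_one = np.linalg.solve(sigma, ones)
    b = ones @ sinv_mu
    c = ones @ sinv_one
    # Minimum-variance portfolio plus a zero-sum tilt toward high means.
    return sinv_one / c + (sinv_mu - (b / c) * sinv_one) / risk_aversion

def derived_utility(q, mu, sigma, risk_aversion):
    """Assumed derived utility F = mu_z - (a/2) sigma_z^2 (illustrative choice)."""
    return q @ mu - 0.5 * risk_aversion * (q @ sigma @ q)

def estimation_loss(mu_hat, sigma_hat, mu, sigma, risk_aversion):
    """Return (relative loss as in eq. (6), loss in risk-free equivalent return)."""
    q_star = optimal_weights(mu, sigma, risk_aversion)
    q_hat = optimal_weights(mu_hat, sigma_hat, risk_aversion)
    # Both portfolios are evaluated under the TRUE parameters.
    f_max = derived_utility(q_star, mu, sigma, risk_aversion)
    f_hat = derived_utility(q_hat, mu, sigma, risk_aversion)
    return (f_max - f_hat) / abs(f_max), f_max - f_hat

# Hypothetical true parameters and a noisy estimate of the means.
mu_true = np.array([0.010, 0.012, 0.008])
sigma_true = np.diag([0.0025, 0.0036, 0.0016])
mu_noisy = mu_true + np.array([0.01, -0.01, 0.0])
relative_loss, return_loss = estimation_loss(mu_noisy, sigma_true,
                                             mu_true, sigma_true, 3.0)
```

Because the weights chosen from the noisy estimate are still evaluated under the true parameters, the loss is zero when the estimate equals the truth and nonnegative otherwise, which mirrors the two properties of the loss function listed in the text.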

Various estimators $\hat\theta(y)$ imply various portfolio choices $\hat q(\hat\theta(y))$ and, thus, various losses $L(\theta, \hat\theta(y))$. This leads to a well-defined loss function for the estimator $\hat\theta(y)$ viewed as a function $t(\cdot)$ of the data:

1. for $\hat\theta(y) = \theta$, the loss is zero,
2. for any $\hat\theta(y) \ne \theta$, the loss is nonnegative.

[FIGURE 1. Portfolio Choice with Estimation Error. Axes: expected return against standard deviation.]

Because the loss is a function of random elements, it cannot be minimized as such. In sampling theory, the risk function for an estimator $t(\cdot)$ is defined as the expected loss over repeated samples

(8)  $R_t(\theta) = \int L(\theta, \hat\theta(y))\, f(y \mid \theta)\, dy.$

A decision rule $t_0(\cdot)$ is said to be inadmissible if there exists another rule $t_1(\cdot)$ with at least equal and sometimes lower risk for any value of the true unknown parameter $\theta$.
Thus, a reasonable minimum requirement for any estimator is admissibility. The central thesis of this paper is that the usual sample mean is not admissible for portfolio estimation.

III. Stein Estimation

Consider the problem of estimating $\mu$, the vector of means of $N$ normal random variables, distributed as

$y_t \sim N(\mu, \Sigma), \quad t = 1, \ldots, T,$

where the covariance matrix is assumed known. For $N$ greater than 2, Efron and Morris [16], generalizing Stein's [29] result, showed that the maximum-likelihood estimator $\hat\mu_{ML}(y)$, which is also the vector of sample means $Y$, is inadmissible relative to a quadratic loss function

(9)  $L(\mu, \hat\mu(y)) = (\mu - \hat\mu(y))'\, \Sigma^{-1}\, (\mu - \hat\mu(y)).$

The use of this loss function is relatively widespread because it leads to tractable results. It is interesting to study because, in the univariate case, the optimal estimator is the sample mean. Also, a quadratic loss is generally a good local approximation of a more general loss function expanded in a Taylor series. For repeated observations, the so-called James-Stein estimator

(10)  $\hat\mu_{JS}(y) = (1 - \hat w)\, Y + \hat w\, Y_0 \mathbf{1},$
This content downloaded from 87.67.189.183 on Tue, 2 Sep 2014 12:33:52 PM


All use subject to JSTOR Terms and Conditions
284 Journal of Financial and Quantitative Analysis

where $\hat w$ is defined as

(11)  $\hat w = \min\left[1,\; \dfrac{(N - 2)/T}{(Y - Y_0\mathbf{1})'\, \Sigma^{-1}\, (Y - Y_0\mathbf{1})}\right],$

has uniformly lower risk than the sample mean $Y$. This estimator is also called a shrinkage estimator, since the sample means are multiplied by a coefficient $(1 - \hat w)$ lower than one. Further, the estimator can be shrunk toward any point $Y_0$, and still have lower risk than the sample mean, but the gains are greater when $Y_0$ is close to the true value. For negative values of $(1 - \hat w)$, setting the coefficient equal to zero leads to an improved estimator: this is the positive part rule.¹ Note that this estimator is biased and nonlinear, since the shrinkage factor is itself a function of the data.
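Equations (10) and (11) can be sketched in a few lines. The function name `james_stein` and the example numbers below are hypothetical; the covariance matrix is treated as known, as in the text, and the shrinkage target is supplied by the caller.

```python
import numpy as np

def james_stein(y_bar, sigma, n_obs, target):
    """Positive-part James-Stein estimate of N means, shrunk toward a common target.

    y_bar  : vector of N sample means (Y in the text)
    sigma  : N x N covariance matrix of a single observation, assumed known
    n_obs  : number of observations T behind each sample mean
    target : scalar shrinkage point (Y_0 in the text)
    """
    n_assets = len(y_bar)
    dev = y_bar - target
    distance = dev @ np.linalg.solve(sigma, dev)  # (Y - Y0 1)' Sigma^{-1} (Y - Y0 1)
    if distance > 0:
        w = min(1.0, ((n_assets - 2) / n_obs) / distance)  # eq. (11)
    else:
        w = 1.0  # sample means already coincide with the target
    return (1.0 - w) * y_bar + w * target, w  # eq. (10)

# Hypothetical inputs: four means, T = 10, Sigma = 10 * I, target 2.5.
shrunk_means, weight = james_stein(np.array([1.0, 2.0, 3.0, 4.0]),
                                   10.0 * np.eye(4), 10, 2.5)
```

With these inputs the quadratic form equals 0.5 and $(N - 2)/T = 0.2$, so the shrinkage weight is 0.4 and each mean moves 40 percent of the way toward the target. Capping the weight at one is what implements the positive part rule: the coefficient $(1 - \hat w)$ can never turn negative.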
The superiority of the James-Stein estimator is a startling result. Indeed, statisticians have been slow to recognize this powerful new statistical technique, in spite of Lindley's [23] early description of it as "one of the most important statistical ideas of the decade."²
Basically, the result stems from the summation of the components of the loss function: Stein's estimators achieve uniformly lower risk than the maximum likelihood estimator, allowing increased risk to some individual components of the vector $\mu$. As a result, the inadmissibility of the sample mean has been extended by Brown ([8], [9], and [10]) to other loss functions under surprisingly weak conditions.
Unfortunately, proof of inadmissibility does not necessarily lead to the construction of better estimators, and the computation of the risk function is an arduous task. Berger [3], however, developed an approach that leads to improved estimators for polynomial loss functions. For a loss function that is the square of the usual quadratic loss, he finds that a shrinkage factor of the form

(12)  $w = \dfrac{b}{d + (Y - Y_0\mathbf{1})'\, T\Sigma^{-1}\, (Y - Y_0\mathbf{1})},$

with $0 \le b \le 2(N - 2)$ and some weak conditions on $d$, leads to an estimator better than the sample mean. Estimators of this form tend to be very robust with respect to the exact functional form of the loss, as Brown [8] demonstrated.
It has been shown that the assumptions of known $\Sigma$ and of normality are not critical, but Berger [5] has indicated that the improvement is most significant in "symmetric" situations, where variances are similar across series. As standard deviations of stock returns are large relative to sample means, and similar across stocks, we expect that Stein estimation should lead to significant improvement over the sample mean.

1 Berger and Bock [6] discuss methods for improving Stein estimators, based on eliminating singularities.
2 See [23], p. 285, and also the explanations advanced by Efron and Morris [14] for the resistance to this new concept.


IV. The Empirical Bayes Approach

The surprising results found by Stein have been given an interesting Bayesian interpretation. Consider the following informative conjugate prior for $\mu$

(13)  $p(\mu \mid \eta, \lambda) \propto \exp\left[-\tfrac{1}{2}\, (\mu - \mathbf{1}\eta)'\, (\lambda\Sigma^{-1})\, (\mu - \mathbf{1}\eta)\right],$

which could also be derived from a random means model. In purely Bayesian inference, such as in [34], the prior grand mean $\eta$ and prior precision $\lambda$ are assumed known a priori.
Instead, an empirical Bayes approach would let the parameters $\eta$ and $\lambda$ be derived directly from the data. Therefore, this approach will outperform the classical sample mean because it relies on a richer model and includes the sample mean as the special case $\lambda = 0$.³ The inadmissibility result found by Stein can be explained in a Bayesian framework by the fact that the sample mean corresponds to a diffuse prior, which is improper since it does not integrate to one. In that case, the Bayes rule need not be admissible.⁴
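As discussed in Section II, optimal portfolio choice should be based on the predictive density function of the vector of future rates of return $r$. With the informative prior (13), this predictive density function $p(r \mid y, \Sigma, \lambda)$, conditional on $\Sigma$ and $\lambda$, is multivariate normal, with mean

(14)  $E[r] = (1 - w)\, Y + w\, \mathbf{1} Y_0,$

where $w = \dfrac{\lambda}{T + \lambda}$, $\quad Y_0 = x'Y = \dfrac{\mathbf{1}'\Sigma^{-1}}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}\, Y,$

and covariance matrix

(15)  $V[r] = \Sigma\left(1 + \dfrac{1}{T + \lambda}\right) + \dfrac{\lambda}{T(T + 1 + \lambda)}\, \dfrac{\mathbf{1}\mathbf{1}'}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}.$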

The derivation is detailed in the Appendix. It is interesting to note that, after integration of $\eta$, the grand mean $Y_0$ happens to be the average return for the minimum variance portfolio.⁵
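The predictive moments (14) and (15) can be sketched as follows, using the form of the covariance matrix given as (A.5) in the Appendix and taking $\Sigma$ and $\lambda$ as given. The function name `predictive_moments` and the numerical inputs are hypothetical.

```python
import numpy as np

def predictive_moments(y_bar, sigma, n_obs, lam):
    """Predictive mean and covariance of future returns for given Sigma and lambda.

    y_bar : vector of N sample means (Y)
    sigma : N x N covariance matrix, treated as known
    n_obs : sample size T
    lam   : prior precision lambda (0 corresponds to the diffuse prior)
    """
    ones = np.ones(len(y_bar))
    sinv_one = np.linalg.solve(sigma, ones)
    c = ones @ sinv_one                  # 1' Sigma^{-1} 1
    x = sinv_one / c                     # minimum variance portfolio weights
    y0 = x @ y_bar                       # grand mean Y_0 = x'Y
    w = lam / (n_obs + lam)              # shrinkage factor
    mean = (1.0 - w) * y_bar + w * y0 * ones
    cov = (sigma * (1.0 + 1.0 / (n_obs + lam))
           + lam / (n_obs * (n_obs + 1.0 + lam)) * np.outer(ones, ones) / c)
    return mean, cov

# Hypothetical inputs: three assets, T = 60.
sample_means = np.array([1.0, 2.0, 3.0])
known_sigma = np.diag([4.0, 9.0, 16.0])
diffuse_mean, diffuse_cov = predictive_moments(sample_means, known_sigma, 60, 0.0)
tight_mean, tight_cov = predictive_moments(sample_means, known_sigma, 60, 1e12)
```

Setting `lam` to zero reproduces the diffuse prior case, with mean $Y$ and covariance $\Sigma(1 + 1/T)$, while a very large `lam` pushes every predictive mean to the common value $Y_0$.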
Zellner and Chetty [33] and Brown [11] studied the diffuse prior case $\lambda = 0$, where the moments reduce to

(16)  $E[r \mid y] = Y, \qquad V[r \mid y, \Sigma] = \Sigma\left(1 + \dfrac{1}{T}\right).$

3 Efron and Morris [14] and Morris [28] describe this rationale in further detail.
4 See, for instance, [4].
5 Although not directly derived from a portfolio optimization process, the weights $x$ minimize the variance of the portfolio subject to the condition that they sum to one.

For very large values of $T$, the correction due to estimation risk disappears: the moments $E[r]$ and $V[r]$ tend to the usual values $Y$ and $\Sigma$, used in the certainty equivalence approach. But the richness of the empirical Bayes approach is that $\lambda$ was estimated from the data directly. The probability density function $p(\lambda \mid \mu, \eta, \Sigma)$ is a gamma distribution with mean at $(N + 2)/d$, where $d$ is defined as $(\mu - \mathbf{1}\eta)'\Sigma^{-1}(\mu - \mathbf{1}\eta)$, and is replaced by its sample estimate $(Y - \mathbf{1}Y_0)'\Sigma^{-1}(Y - \mathbf{1}Y_0)$. The shrinkage coefficient is then

(17)  $\hat w = \dfrac{N + 2}{(N + 2) + (Y - Y_0\mathbf{1})'\, T\Sigma^{-1}\, (Y - Y_0\mathbf{1})},$

which is a special form of (12).
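The step from the estimate of $\lambda$ to the shrinkage coefficient (17) can be written out explicitly. The notation $\hat d$ for the sample quadratic form is introduced here only for this derivation.

```latex
% Sample estimate of the quadratic form and of the prior precision
\hat d = (Y - \mathbf{1}Y_0)'\,\Sigma^{-1}\,(Y - \mathbf{1}Y_0),
\qquad
\hat\lambda = \frac{N + 2}{\hat d}.

% Substitution into w = \lambda / (T + \lambda)
\hat w = \frac{\hat\lambda}{T + \hat\lambda}
       = \frac{(N + 2)/\hat d}{T + (N + 2)/\hat d}
       = \frac{N + 2}{(N + 2) + T\,\hat d}.
```

This matches the form (12) with both of its constants $b$ and $d$ set equal to $N + 2$. Note that the condition $0 \le b \le 2(N - 2)$ stated with (12) then requires $N + 2 \le 2(N - 2)$, that is, at least six assets.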


In practice, $\Sigma$ is unknown, and could be replaced, as in Zellner and Chetty [33], by

(18)  $\hat\Sigma = \dfrac{T - 1}{T - N - 2}\, S,$

where $S$ is the usual unbiased sample covariance matrix. Substitution into (15) yields $V[r]$.
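Putting the pieces of this section together, a minimal sketch of the Bayes-Stein estimate of expected returns might look as follows. It assumes the rescaled covariance estimate $\hat\Sigma = \frac{T-1}{T-N-2} S$ for (18) and therefore requires $T > N + 2$; the function name `bayes_stein` and the simulated data are hypothetical.

```python
import numpy as np

def bayes_stein(returns):
    """Bayes-Stein estimate of expected returns from a T x N array of returns.

    Returns (shrunk means, shrinkage weight, grand mean Y_0, covariance estimate).
    """
    t, n = returns.shape
    if t <= n + 2:
        raise ValueError("need T > N + 2 for the covariance rescaling")
    y_bar = returns.mean(axis=0)                    # sample means Y
    s = np.cov(returns, rowvar=False)               # unbiased sample covariance S
    sigma_hat = (t - 1) / (t - n - 2) * s           # assumed form of eq. (18)
    ones = np.ones(n)
    sinv_one = np.linalg.solve(sigma_hat, ones)
    x = sinv_one / (ones @ sinv_one)                # minimum variance weights
    y0 = x @ y_bar                                  # grand mean Y_0
    dev = y_bar - y0
    d_hat = dev @ np.linalg.solve(sigma_hat, dev)   # sample quadratic form
    w = (n + 2) / ((n + 2) + t * d_hat)             # eq. (17)
    return (1.0 - w) * y_bar + w * y0, w, y0, sigma_hat

# Hypothetical data: 60 months of returns on 7 assets.
rng = np.random.default_rng(1)
simulated_returns = rng.normal(1.0, 5.0, size=(60, 7))
mu_bs, w_bs, y0_bs, sigma_bs = bayes_stein(simulated_returns)
```

The weight always lies between zero and one, so each shrunk mean sits between the corresponding sample mean and the grand mean $Y_0$. The predictive covariance matrix would then follow from (15) with $\hat\lambda = (N + 2)/\hat d$, a step that is left out of this sketch.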

V. Practical Applications

The first goal of this paper was to demonstrate the inadmissibility of portfolio selection procedures based on sample means. The practical interest of this general result will now be illustrated by specific examples of potential gains from using shrinkage estimators. The performance of various estimation procedures should be measured by the loss of utility due to estimation error, averaged over repeated samples. Since this risk function is seldom analytically tractable, the natural procedure is to resort to simulation analysis.
Table 1 illustrates typical stock return data. These are sample estimates from stock market returns for seven major countries, calculated over a 60-month period. It is apparent that standard deviations are not too different across assets, and that they are very large relative to sample means. This is precisely a situation in which Bayes-Stein estimation is likely to help.
The parameters $\mu$ and $\Sigma$ were chosen equal to the estimates reported in Table 1. $T$ independent vectors of returns were generated⁶ from this distribution. For each sampling, the following estimators were computed:

1) Certainty Equivalence: $Y$, $S$
2) Bayes Diffuse Prior: $Y$, $V[r, \lambda = 0]$
3) Minimum Variance ($w = 1$): $\mathbf{1}Y_0$, $V[r, \lambda \to \infty]$
4) Bayes-Stein ($w = \hat w(y, T)$): $(1 - \hat w)Y + \hat w\mathbf{1}Y_0$, $V[r, \hat\lambda(y)]$.
Brown [11] has found the second estimator to be generally superior to the first one. The third estimator, advocated by Jobson et al. [19], is an extreme case of shrinkage, and has no formal justification in this context because the system is forced to be stationary. This choice, however, yields a particularly simple portfolio selection rule: for all utility functions, the optimal weights will be those of the minimum variance portfolio.

6 Returns were generated by the IMSL subroutine GGNSM. All computations were performed in double-precision FORTRAN.

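[TABLE 1. The body of the table is not recoverable from this copy; only its summary statistics and notes survive.]
$c = \mathbf{1}'S^{-1}\mathbf{1} = 0.11838$
$b = Y'S^{-1}\mathbf{1} = 0.0953$, thus $Y_0 = b/c = 0.805$
$a = Y'S^{-1}Y = 0.15849$
$d = (Y - \mathbf{1}Y_0)'S^{-1}(Y - \mathbf{1}Y_0) = 0.08171$
Notes: Dollar returns in percent per month. Sample period is January 1977-December 1981 (T = 60).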
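As a worked illustration, the summary statistics above can be plugged into the shrinkage coefficient (17). The calculation below is only a sketch: it treats $S$ as if it were the known $\Sigma$, ignoring the rescaling in (18), and uses $N = 7$ (seven countries) and $T = 60$.

```latex
% Grand mean: return on the minimum variance portfolio (percent per month)
Y_0 = \frac{b}{c} = \frac{0.0953}{0.11838} \approx 0.805

% Consistency check on the quadratic form
(Y - \mathbf{1}Y_0)'S^{-1}(Y - \mathbf{1}Y_0) = a - \frac{b^2}{c}
  = 0.15849 - \frac{(0.0953)^2}{0.11838} \approx 0.0818

% Shrinkage coefficient, eq. (17), with the reported value 0.08171
\hat w = \frac{N + 2}{(N + 2) + T \times 0.08171}
       = \frac{9}{9 + 60 \times 0.08171}
       = \frac{9}{13.9026} \approx 0.647
```

The small gap between 0.0818 and the reported 0.08171 is consistent with rounding in the reported value of $b$. Under these assumptions, roughly 65 percent of the weight goes to the common grand mean and 35 percent to the individual sample means.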

In order to find optimal weights, we had to define the investor's utility function. The negative exponential utility function was chosen here because of the existence of a closed-form solution for the optimal portfolio.⁷ But the results should be robust to the choice of the utility function. A quadratic utility function gave essentially the same results, which are not reported here.
For each drawing $k$, the optimal portfolio was computed for each possible estimator, leading to different values of the derived expected utility $F_i = F[\hat q(\hat\mu_i(y)) \mid \mu, \Sigma]$, for $i = 1$ to 4. Repeating the experiment $K = 1000$ independent times, the empirical risk function was defined as the average loss of expected utility

$\mathrm{Risk}_i = \dfrac{1}{K} \sum_{k=1}^{K} \dfrac{F_{MAX} - F_{ik}}{|F_{MAX}|}.$

Expected utility was also expressed in risk-free equivalent return, as in (7). Finally, to study the effect of sample size, the previous operations were repeated for various values of $T$ ranging from 25 to 200.
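The simulation design described above can be sketched in simplified form. This is not the paper's FORTRAN program: the true parameters below are hypothetical (the Table 1 values are not available in this copy), the derived utility is assumed to be $\mu_z - (a/2)\sigma_z^2$, the loss is reported directly in risk-free equivalent return rather than divided by $|F_{MAX}|$, only three of the four estimators are compared, and the predictive covariance adjustments of (15) are omitted.

```python
import numpy as np

def optimal_weights(mu, sigma, a):
    """Weights maximizing q'mu - (a/2) q'Sigma q subject to sum(q) == 1."""
    ones = np.ones(len(mu))
    sinv_mu = np.linalg.solve(sigma, mu)
    sinv_one = np.linalg.solve(sigma, ones)
    b, c = ones @ sinv_mu, ones @ sinv_one
    return sinv_one / c + (sinv_mu - (b / c) * sinv_one) / a

def utility(q, mu, sigma, a):
    """Assumed derived utility, equal to the risk-free equivalent return."""
    return q @ mu - 0.5 * a * (q @ sigma @ q)

def estimators(sample):
    """Mean and covariance inputs for three of the estimation rules."""
    t, n = sample.shape
    y_bar = sample.mean(axis=0)
    s = np.cov(sample, rowvar=False)
    sigma_hat = (t - 1) / (t - n - 2) * s            # assumed form of eq. (18)
    ones = np.ones(n)
    sinv_one = np.linalg.solve(sigma_hat, ones)
    y0 = (sinv_one @ y_bar) / (sinv_one @ ones)      # grand mean Y_0
    dev = y_bar - y0
    w = (n + 2) / ((n + 2) + t * (dev @ np.linalg.solve(sigma_hat, dev)))
    return {
        "certainty_equivalence": (y_bar, s),
        "minimum_variance": (y0 * ones, sigma_hat),
        "bayes_stein": ((1 - w) * y_bar + w * y0, sigma_hat),
    }

def empirical_risk(mu, sigma, a, t, n_draws, seed=0):
    """Average utility loss of each rule over repeated simulated samples."""
    rng = np.random.default_rng(seed)
    f_max = utility(optimal_weights(mu, sigma, a), mu, sigma, a)
    totals = {}
    for _ in range(n_draws):
        sample = rng.multivariate_normal(mu, sigma, size=t)
        for name, (m_hat, v_hat) in estimators(sample).items():
            q_hat = optimal_weights(m_hat, v_hat, a)
            # Evaluate the chosen portfolio under the TRUE parameters.
            loss = f_max - utility(q_hat, mu, sigma, a)
            totals[name] = totals.get(name, 0.0) + loss
    return {name: total / n_draws for name, total in totals.items()}

# Hypothetical true parameters (percent per month), five assets.
VOLS = np.array([5.0, 6.0, 7.0, 5.5, 6.5])
CORR = np.full((5, 5), 0.4) + 0.6 * np.eye(5)
MU_TRUE = np.array([0.8, 1.0, 1.5, 0.6, 1.2])
SIGMA_TRUE = np.outer(VOLS, VOLS) * CORR

if __name__ == "__main__":
    for t in (25, 50, 100, 200):
        print(t, empirical_risk(MU_TRUE, SIGMA_TRUE, a=0.1, t=t, n_draws=200))
```

Since every candidate portfolio satisfies the same sum-to-one constraint as the optimal one, each simulated loss is nonnegative by construction. How the three rules rank at a given sample size depends on the assumed parameters, so the output of this sketch should not be read as a replication of Table 2.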
Figure 2 depicts the empirical risk functions, also reported in Table 2. Several features are apparent. First, the Bayes-Stein estimator has always lower risk than both the certainty equivalence and the Bayes diffuse prior estimators. The improvement is noticeable and significant. In risk-free equivalent, the gain over the diffuse prior case ranges from 8 percent (T = 25) to 2 percent (T = 50) to 0.2 percent (T = 200) per annum. The reason behind the superiority of the Bayes-Stein estimator is that the shrinkage factor $w$ is directly derived from the data. For small sample sizes, large values of $w$ indicate that portfolio analysis should not rely too much on the observed dispersion in sample means, given the large coefficients of variation of stock returns. But, of course, as the sample size increases, so does the estimated precision of sample means, and the shrinkage factor decreases, thus putting less weight on the informative prior relative to the data.

7 The absolute risk aversion was chosen to be 1/(52.2% p.a.), as in Brown [11]. In annual terms, this implies that a 1 percent increase in the variance (10 percent in standard deviation) must be accompanied by an increase of about 1 percent in expected return.

[FIGURE 2. Empirical Risk Functions.]

TABLE 2
Empirical Risk Functions and Shrinkage Factors

Sample   Certainty     Bayes           Minimum    Bayes-    Shrinkage   Shrinkage
Size     Equivalence   Diffuse Prior   Variance   Stein     Mean        Std. Dev.
 25      1.5606        0.3452          0.1337     0.1815    0.5883      0.1564
 30      0.4846        0.2294          0.1100     0.1301    0.5692      0.1464
 40      0.2421        0.1526          0.0865     0.0916    0.5383      0.1427
 50      0.1578        0.1137          0.0762     0.0722    0.5199      0.1386
 60      0.1104        0.0856          0.0690     0.0576    0.5004      0.1365
 80      0.0828        0.0683          0.0608     0.0473    0.4555      0.1352
100      0.0569        0.0493          0.0577     0.0375    0.4275      0.1221
125      0.0430        0.0385          0.0528     0.0308    0.3997      0.1171
160      0.0338        0.0310          0.0510     0.0256    0.3504      0.1073
200      0.0253        0.0236          0.0484     0.0205    0.3164      0.0906
Notes: Negative exponential utility function with absolute risk aversion of 1/(52.2% p.a.). Maximum expected utility given the true parameters is 0.99734.

Next, the minimum-variance estimator performs well for small sample sizes, but is dominated for higher sample sizes. This is not astonishing: this strategy completely disregards any information contained in the sample means, which produces very good results for small samples, but is otherwise clearly inappropriate. For higher sample sizes, expected returns are more accurately estimated, and utility could be increased by taking into account the expected return of the portfolio. But this estimator would be particularly robust to nonstationarity. Finally, comparisons of the certainty equivalence and Bayes diffuse prior rules confirm the conclusions of Brown's study [11]: the Bayes diffuse prior uniformly dominates the classical rule.
Sections III and IV indicated that Bayes-Stein estimation should outperform


the sample mean, whatever the true parameter values, and the simulation analysis indicated that the gains were substantial. Surely, these gains must be sensitive to the choice of the "true" parameter values, but it seems that these results provide conservative estimates of the gains from the Bayes-Stein estimator. Consider how different the means are in Table 1: expressed on a per annum basis, they vary from 6 percent to 22 percent. Insofar as this dispersion might be considered unrealistic, the simulation analysis will be biased against Bayes-Stein estimation. To take the case even further, the analysis was repeated with the highest mean changed from 22 percent to 44 percent per annum. The gains from shrinkage estimation were, on average, cut in half, but the Bayes-Stein estimator still uniformly dominated the sample mean.

VI. Conclusions

This paper studied the effect of estimation error on portfolio choice. Since parameter uncertainty implies a loss of investor utility, decision theory should be based on this loss viewed as a function of the estimator and of the true parameter values. A fundamental result of statistical theory is that the sample mean is an inadmissible estimator when the number of parameters is greater than two. (There exists another estimator that always yields lower expected loss in repeated samples.) This result stems from the summation of the effect of estimation error for each component into one single loss measure. Thus, the portfolio context is central to this result. The issue was also analyzed in an empirical Bayes framework, and this paper presented a simple Bayes-Stein estimator that should improve on the classical sample mean. Next, the extent of gains from Bayes-Stein estimation was illustrated by simulation analysis. The classical rule was always outperformed, and the gains were often substantial, in the range of a few percent per annum in risk-free equivalent return.
Numerous other applications of this technique are possible in finance. For instance, extensions to improved estimation of beta coefficients are straightforward. Also, Jorion [21] evaluated the out-of-sample performance of various estimators, based on actual stock return data, and found that shrinkage estimators significantly outperform the classical sample mean.

Appendix

Bayes-Stein Estimation

The problem is to find, as in Zellner and Chetty [33], the predictive distribution of future returns $r$, conditional on the prior (13), the data $y = (y_1, \ldots, y_T)$, on the covariance matrix $\Sigma$ and on the scale factor $\lambda$

(A.1)  $p(r \mid y, \Sigma, \lambda) = \int\!\!\int p(r, \mu, \eta \mid y, \Sigma, \lambda)\, d\mu\, d\eta.$


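When necessary, $\Sigma$ and $\lambda$ will be estimated from the conditional distribution.
The joint density of $r$, $\mu$, and $\eta$ is given by

$p(r, \mu, \eta \mid y, \Sigma, \lambda) = p(r \mid \mu, \eta, \Sigma, \lambda)\; p(\mu, \eta \mid y, \Sigma, \lambda)$
$\qquad \propto p(r \mid \mu, \Sigma)\; f(y \mid \mu, \Sigma)\; p(\mu \mid \eta, \lambda)\; p(\eta).$

With normality, the likelihood function of $y$, given $\mu$ and $\Sigma$, is

(A.2)  $f(y \mid \mu, \Sigma) \propto \exp\left[-\tfrac{1}{2} \sum_{t=1}^{T} (y_t - \mu)'\, \Sigma^{-1}\, (y_t - \mu)\right],$

and the density function of $\mu$, given $\eta$ and $\lambda$, is given by the informative prior

(A.3)  $p(\mu \mid \eta, \lambda, \Sigma) \propto \exp\left[-\tfrac{1}{2}\, (\mu - \mathbf{1}\eta)'\, \lambda\Sigma^{-1}\, (\mu - \mathbf{1}\eta)\right].$

Here, the $\lambda$ parameter is a measure of the tightness of the prior; for $\lambda$ tending to zero, the prior tends to a diffuse prior. The parameter $\eta$ represents the unknown grand mean, which is given a diffuse prior. Instead of an informative prior on a model with constant means $\mu$, (A.3) could also represent a model where the means $\mu$ vary randomly around a common grand mean.
The predictive density function can then be written as

$p(r, \mu, \eta \mid y, \Sigma, \lambda) \propto \exp\left[-\tfrac{1}{2}\, G(r, \mu, \eta, y)\right],$ with

$G = (r - \mu)'\, \Sigma^{-1}\, (r - \mu) + \sum_{t=1}^{T} (y_t - \mu)'\, \Sigma^{-1}\, (y_t - \mu) + (\mu - \eta\mathbf{1})'\, \lambda\Sigma^{-1}\, (\mu - \eta\mathbf{1}).$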

After integration over $\eta$ and $\mu$, the predictive density can be shown to be normal with mean vector and covariance matrix as follows

(A.4)  $E[r] = (1 - w)\, Y + w\, \mathbf{1} Y_0,$

shrinkage factor: $w = \lambda/(T + \lambda)$,
vector of sample means: $Y = (1/T) \sum_{t=1}^{T} y_t$,
grand mean: $Y_0 = x'Y$,
weights of min. var. portfolio: $x' = \mathbf{1}'\Sigma^{-1} / (\mathbf{1}'\Sigma^{-1}\mathbf{1})$.

(A.5)  $V[r] = \Sigma\left(1 + \dfrac{1}{T + \lambda}\right) + \dfrac{\lambda}{T(T + 1 + \lambda)}\, \dfrac{\mathbf{1}\mathbf{1}'}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}.$

This covariance matrix has the following interpretation. The first term $\Sigma$ represents the variation of $y_t$ around the mean $\mu$. The second term $\Sigma/(T + \lambda)$ is due to the uncertainty in the measure of the sample average $Y$, whereas the third term corresponds to uncertainty in the common factor.


For $T$ large, $w$ tends to zero, $E[r]$ tends to $Y$, and $V$ tends to $\Sigma$. There is no estimation risk, and the sample means are accurate estimates of the expected returns. Similarly, for $\lambda$ very small, $w$ tends to zero, $E[r]$ to $Y$, and $V$ to $\Sigma(1 + (1/T))$. Bayes-Stein estimation is useless here, and estimation risk, due to imprecise knowledge of expected returns, is impounded in $V$ in the usual way.
In contrast, for very large values of $\lambda$, $w$ tends to one, $E[r]$ to $\mathbf{1}Y_0$, and $V$ to $\Sigma + (1/T)\, \mathbf{1}\mathbf{1}'/(\mathbf{1}'\Sigma^{-1}\mathbf{1})$. Estimation of the means can be based only on the common weighted average $Y_0$, and the matrix added to $\Sigma$ reflects uncertainty in this common mean.

References

[1] Barry, C. B. "Portfolio Analysis under Uncertain Means, Variances and Covariances." Journal of Finance, 29 (May 1974), 515-522.
[2] Bawa, V. S.; S. J. Brown; and R. W. Klein. Estimation Risk and Optimal Portfolio Choice. Amsterdam: North Holland (1979).
[3] Berger, J. "Minimax Estimation of a Multivariate Normal Mean under Polynomial Loss." Journal of Multivariate Analysis, 8 (June 1978), 173-180.
[4] ______. Statistical Decision Theory. New York: Springer-Verlag (1980).
[5] ______. "Selecting a Minimax Estimator of a Multivariate Normal Mean." The Annals of Statistics, 10 (March 1982), 81-92.
[6] Berger, J., and M. E. Bock. "Eliminating Singularities of Stein Type Estimators of Location Vectors." Journal of the Royal Statistical Society, 38 (1976), 166-170.
[7] Blume, M. "Betas and Their Regression Tendencies." Journal of Finance, 30 (June 1975), 785-795.
[8] Brown, L. D. "On the Admissibility of Invariant Estimators of One or More Location Parameters." Annals of Mathematical Statistics, 37 (Aug. 1966), 1087-1136.
[9] ______. "Estimation with Incompletely Specified Loss Functions." Journal of the American Statistical Association, 70 (June 1975), 417-427.
[10] ______. "A Heuristic Method for Determining Admissibility of Estimators, with Applications." Annals of Statistics, 1 (Sept. 1979), 960-994.
[11] Brown, S. J. "Optimal Portfolio Choice under Uncertainty: A Bayesian Approach." Ph.D. Diss., Univ. of Chicago (1976).
[12] ______. "The Effect of Estimation Risk on Capital Market Equilibrium." Journal of Financial and Quantitative Analysis, 14 (June 1979), 215-220.
[13] Dickenson, J. P. "The Reliability of Estimation Procedures in Portfolio Analysis." Journal of Financial and Quantitative Analysis, 9 (Sept. 1979), 447-462.
[14] Efron, B., and C. Morris. "Stein's Estimation Rule and its Competitors: An Empirical Bayes Approach." Journal of the American Statistical Association, 68 (March 1973), 117-130.
[15] ______. "Data Analysis Using Stein's Estimator and its Generalizations." Journal of the American Statistical Association, 70 (June 1975), 311-319.
[16] ______. "Families of Minimax Estimators of the Mean of a Multivariate Normal Distribution." The Annals of Statistics, 4 (Jan. 1976), 11-21.
[17] Frankfurter, G. M.; H. E. Phillips; and J. P. Seagle. "Portfolio Selection: the Effects of Uncertain Means, Variances and Covariances." Journal of Financial and Quantitative Analysis, 6 (Sept. 1971), 1251-1262.
[18] James, W., and C. Stein. "Estimation with Quadratic Loss." Proceedings of the 4th Berkeley Symposium on Probability and Statistics 1. Berkeley: Univ. of Calif. Press (1961), 361-379.
[19] Jobson, J. D.; B. Korkie; and V. Ratti. "Improved Estimation for Markowitz Portfolios using James-Stein Type Estimators." Proceedings of the American Statistical Association, Business and Economics Statistics Section, 41 (1979), 279-284.
[20] Jobson, J. D., and B. Korkie. "Estimation for Markowitz Efficient Portfolios." Journal of the American Statistical Association, 75 (Sept. 1980), 544-554.
[21] Jorion, P. "International Portfolio Diversification with Estimation Risk." Journal of Business, 58 (July 1985), 259-278.
[22] Klein, R. W., and V. S. Bawa. "The Effect of Estimation Risk on Optimal Portfolio Choice." Journal of Financial Economics, 3 (June 1976), 215-231.
[23] Lindley, D. V. "Discussion on Professor Stein's Paper." Journal of Royal Statistical Society, 24 (1962), 285-287.
[24] Lindley, D. V., and A. F. M. Smith. "Bayes Estimates for the Linear Model." Journal of the Royal Statistical Society, 34 (1972), 1-41.
[25] Markowitz, H. M. Portfolio Selection: Efficient Diversification of Investments. New York: Wiley and Sons (1959).
[26] Merton, R. C. "An Analytic Derivation of the Efficient Portfolio Frontier." Journal of Financial and Quantitative Analysis, 7 (Sept. 1972), 1851-1872.
[27] ______. "On Estimating the Expected Return on the Market." Journal of Financial Economics, 8 (Dec. 1980), 323-361.
[28] Morris, C. N. "Parametric Empirical Bayes Inference: Theory and Applications." Journal of the American Statistical Association, 78 (March 1983), 47-55.
[29] Stein, C. "Inadmissibility of the Usual Estimator for the Mean of a Multivariate Normal Distribution." Proceedings of the 3rd Berkeley Symposium on Probability and Statistics 1. Berkeley: Univ. of Calif. Press (1955), 197-206.
[30] ______. "Confidence Sets for the Mean of a Multivariate Normal Distribution." Journal of the Royal Statistical Society, 24 (1962), 265-296.
[31] Vasicek, O. "A Note on Using Cross-Sectional Information on Bayesian Estimation of Security Betas." Journal of Finance, 28 (Dec. 1973), 1233-1239.
[32] Zellner, A. An Introduction to Bayesian Inference in Econometrics. New York: Wiley and Sons (1971).
[33] Zellner, A., and V. K. Chetty. "Prediction and Decision Problems in Regression Models from the Bayesian Point of View." Journal of the American Statistical Association, 60 (June 1965), 608-615.
[34] Zellner, A., and W. Vandaele. "Bayes-Stein Estimators for k-Means, Regression and Simultaneous Equations Models." In Studies in Bayesian Econometrics and Statistics, S. Fienberg and A. Zellner, eds. Amsterdam: North Holland (1974).
