SC 99 30 PDF
Marc C. Steinbach
Abstract
Mean-variance portfolio analysis provided the first quantitative treatment of the trade-
off between profit and risk. We investigate in detail the interplay between objective and
constraints in a number of single-period variants, including semi-variance models. Particular
emphasis is laid on avoiding the penalization of overperformance. The results are then used as
building blocks in the development and theoretical analysis of multi-period models based on
scenario trees. A key property is the possibility to remove surplus money in future decisions,
yielding approximate downside risk minimization.
Key words. mean-variance analysis, downside risk, multi-period model
AMS subject classifications. 90A09, 90C15, 90C20
0 Introduction
The classical mean-variance approach for which Harry Markowitz received the 1990 Nobel Prize
in Economics offered the first systematic treatment of a dilemma that each investor faces: the
conflicting objectives of high profit and low risk. In dealing with this fundamental issue Markowitz
came up with a parametric optimization model that was both sufficiently general for a significant
range of practical situations and simple enough for theoretical analysis and numerical solution.
As the Swedish Academy of Sciences put it: his primary contribution consisted of developing a
rigorously formulated, operational theory for portfolio selection under uncertainty [62].
Indeed, the subject is so complex that Markowitz’ seminal work of the fifties [51, 52, 54]
probably raised more questions than it answered, thus initiating a tremendous amount of related
research. Before placing the present paper into perspective, the following paragraphs give a coarse
overview of these issues. A substantial number of references is included, but the list is far from
complete and cannot even contain all the relevant papers. We cite just a few references on each
subject (in chronological order, and usually only the most recent from several contributions of the
same author) to provide some starting points for the interested reader.
An important aspect of Pareto-optimal (efficient) portfolios is that each determines a
von Neumann–Morgenstern utility function [80] for which it maximizes the expected utility of the return on
investment. This allowed Markowitz to interpret his approach by the theory of rational behavior
under uncertainty [52], [54, Part IV]. Further, certain measures of risk averseness evolved as
a basic concept in economic theory. These are derived from utility functions and justified by
their relationship to the corresponding risk premiums, see Pratt [64], Arrow [2], Rubinstein [68],
Duncan [17], Kihlstrom and Mirman [36], Ross [67], Li and Ziemba [49]. Applications of utility
theory and risk averseness measures to portfolio selection are reported, e.g., by Mossin [59], Levy
and Markowitz [48], Kroll et al. [46], Jewitt [31], King and Jensen [40], Kijima and Ohnishi [37].
A fundamental (and still debated) question is how risk should be measured properly. Markowitz
discusses the pros and cons of replacing the variance by alternative risk measures in a more
general mean-risk approach [54, Chap. XIII]. These considerations and the theory of stochastic
dominance (see Bawa [5], Fishburn [20], Levy [47], Kijima and Ohnishi [38]) stimulated the research
in asymmetric risk measures like expectation of loss and semi-variance, cf. Konno [41], King [39],
Zenios and Kang [83], Uryasev and Rockafellar [79]. The properties of real return distributions also
led to risk models involving higher moments, see Ziemba [84], Kraus and Litzenberger [45], Konno
and Suzuki [43]. More recently the theoretical concept of coherent risk measures was introduced by
Artzner et al. [3], while portfolio tracking (or replication) approaches became popular in practice,
see King [39], Konno and Watanabe [44], Dembo and Rosen [15].
It is quite interesting that the mean-variance approach has received very little attention in the
context of long-term investment planning. Although Markowitz does consider true multi-period
models (where the portfolio may be readjusted several times during the planning horizon) [54,
Chap. XIII], these considerations use a utility function based on the consumption of wealth over
time rather than mean and variance of the final wealth, which places the problem in the realm of
dynamic programming (Bellman [6]). Further long-term and simplified multi-period approaches
are discussed, e.g., by Mossin [59], Samuelson [71], Hakansson [24], Merton and Samuelson [57],
Konno et al. [42]. Much research has also been carried out in the closely related field of continuous-
time models, see Merton [56], Harrison and Pliska [25], Heath et al. [27], Karatzas [34], Dohi and
Osaki [16]. Over the past decade more detailed multi-period models have become tractable due
to the progress in computing technology (both algorithms and hardware), see, e.g., Mulvey and
Vladimirou [60], Dantzig and Infanger [14], Consigli and Dempster [13], Beltratti et al. [7].
With the exception of this group, most of the work cited above neglects details like asset
liquidity or transaction costs. At least the second idealization causes serious errors when many
transactions are performed, as in continuous-time models. Imperfect markets are briefly discussed
by Markowitz [54, p. 297+]; later studies include Perold [63], He and Pearson [26], Karatzas et
al. [35], Jacka [30], Shirakawa [73], Morton and Pliska [58], Atkinson et al. [4].
A final issue concerns the assumptions of the investor about the future, which is represented
by probability distributions of the asset returns. Being based on assessments of financial analysts
or estimated from historical data (or both), these distributions are never exact. (Markowitz calls
them probability beliefs.) The question of the sensitivity of optimization results with respect to
errors in the distribution is discussed, e.g., by Jobson [32], Broadie [11], Chopra and Ziemba [12],
Best and Ding [8], MacLean and Weldon [50].
Additional material and references are found in a more recent book by Markowitz [55] and
any standard text on mathematical finance, like Sharpe [72], Elton and Gruber [18], Ingersoll [29],
Alexander and Sharpe [1], Zenios [82], Ziemba and Mulvey [85].
The present paper develops a fairly complete theoretical understanding of the multi-period
mean-variance approach based on scenario trees. This is achieved by analyzing various portfolio
optimization problems with gradually increasing complexity. Primal and dual solutions of these
problems are derived, and dual variables are given an interpretation where possible. The most important
aspect in our discussion is the precise interaction of objective (or risk measure) and constraints
(or set of feasible wealth distributions), a subject that has received little attention in the previous
literature. Clearly, arguing about the properties of risk measures may be meaningless
in an optimization context unless it is known which distributions are possible. A specific goal in
our analysis is to avoid a penalization of overperformance. In this context we discuss the role of
cash and, in some detail, variance versus semi-variance. A key ingredient of our most complex
multi-period model is an artificial arbitrage-like mechanism involving riskless though inefficient
portfolios and representing a choice between immediate consumption or future profit.
Each of the problems considered tries to isolate a certain aspect, usually under the most general
conditions even if practical situations typically exhibit more specific characteristics. However, we
give higher priority to a clear presentation, and inessential generality will sometimes be sacrificed
for technical simplicity. In particular, no inequality constraints are included unless necessary. (A
separate section is devoted to the influence of such restrictions.) Neither do we attempt to model
liquidity constraints or short-selling correctly, or to include transaction costs; we consider only
idealized situations without further justification. The work here is based on a multi-period mean-
variance model that was first proposed by Frauendorfer [21] and later refined by Frauendorfer and
Siede [23]. A complete application model (including transaction costs and market restrictions)
together with implications of the results developed here will be presented later in a joint paper.
Single-Period and Multi-Period Mean-Variance Models 3
Due to future uncertainty the portfolio optimization problems in this paper are all stochastic.
More precisely, they are deterministic equivalents of convex stochastic programs, cf. Wets [81].
Except for the semi-variance problems they are also quadratic programs involving a second-order
approximation of the return distribution in some sense, cf. Samuelson [70]. Based on earlier work
in nonlinear optimal control [74, 75, 78], the author has developed structure-exploiting numerical
algorithms for multi-stage convex stochastic programs like the ones discussed here [76, 77]. More
general problem classes and duality are studied by Rockafellar and Wets [66], and Rockafellar [65].
For background material on stochastic programming we refer the reader to Ermoliev and Wets [19],
Kall and Wallace [33], Birge [10], Birge and Louveaux [9], Ruszczyński [69].
The paper is organized as follows. Our analysis begins with single-period models in Section 1.
Although many of the results are already known, the systematic discussion of subtle details adds
insight that is essential in the multi-period case. To some extent this section has tutorial character;
the problems may serve as examples in an introductory course on optimization. Next, multi-period
mean-variance models are analyzed in Section 2, where the final goal consists in constructing
an approximate downside risk minimization through appropriate constraints. This material is
completely new; the research was motivated by practical experience with the application model
mentioned above. Some concluding remarks are finally given in Section 3.
(The existence of these two moments is assumed throughout the paper.) The choice of a specific
portfolio determines a certain distribution of the associated total return (or final wealth) w ≡ r∗ x.
Mean-variance analysis aims at forming the most desirable return distribution through a suitable
portfolio, where the investor’s idea of desirability depends solely on the first two moments.
Various formulations of the mean-variance problem exist. From the very beginning [52, 54],
Markowitz related his approach to the utility theory of von Neumann and Morgenstern [80]. As
we shall see later, maximizing the expectation of a concave quadratic utility function leads to a
formulation like
$$\begin{aligned} \max_x \quad & \mu\,\rho(x) - \tfrac{1}{2} R(x) \\ \text{s.t.} \quad & e^* x = 1, \end{aligned} \tag{1}$$
where e ∈ Rn denotes the vector of all ones. The objective models the actual goal of the investor,
a tradeoff between risk and reward (see footnote 1), while the budget equation e∗ x = 1 simply
specifies her initial wealth w0 (normalized to w0 = 1 w.l.o.g.). Our preferred formulation comes
closer to the original one; it minimizes risk subject to the budget equation and subject to the
condition that a certain desired reward ρ be obtained,
$$\begin{aligned} \min_x \quad & \tfrac{1}{2} R(x) \\ \text{s.t.} \quad & e^* x = 1, \\ & \rho(x) = \rho. \end{aligned} \tag{2}$$
Here the investor’s goal is split between objective and reward condition.
Footnote 1: Many authors attach a tradeoff parameter θ to the risk term and maximize ρ(x) − θR(x)/2, which is equivalent
to (1) if μ ≡ θ⁻¹ > 0. However, this problem becomes unbounded for θ ≤ 0, whereas (1) remains solvable for
μ ≤ 0. This is important in our analysis.
In this section we study the precise relation of Problems (1) and (2), and a number of in-
creasingly general single-period variants. We will include a cash account, then consider certain
inequality constraints, utility functions, and finally downside risk. Many of the results are already
known, but usually in a different form. Here we choose a presentation that facilitates the study of
nuances in the optimization problems and that integrates seamlessly with the more general case
of multi-period problems in Section 2.
Remark. The inclusions for α and γ are sharp but not the bound on |β|, and neither α < γ nor
β > 0 hold in general. Anyway, we need only α, γ, δ > 0.
Problem 1. Let us first consider the standard tradeoff formulation. To simplify the comparison
with our preferred formulation, we minimize negative utility
$$\begin{aligned} \min_x \quad & \tfrac{1}{2} x^* \Sigma x - \mu \bar r^* x \\ \text{s.t.} \quad & e^* x = 1. \end{aligned}$$
The Lagrangian is
$$L(x, \lambda; \mu) = \tfrac{1}{2} x^* \Sigma x - \mu \bar r^* x - \lambda (e^* x - 1).$$
Theorem 1. Problem 1 has the unique primal-dual solution
ρ = λβ + μγ = (β + μδ)/α.
Proof. From the Lagrangian one obtains the system of first order necessary conditions
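$$\begin{pmatrix} \Sigma & e \\ e^* & 0 \end{pmatrix} \begin{pmatrix} x \\ -\lambda \end{pmatrix} = \begin{pmatrix} \mu \bar r \\ 1 \end{pmatrix}.$$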
Its first row (dual feasibility) yields the optimal portfolio x. The optimal multiplier λ and reward ρ
are obtained by substituting x into the second row (primal feasibility) and the definition of ρ,
respectively. Uniqueness of the solution follows from strong convexity of the objective and full
rank of the constraint.
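As a numerical sanity check of Theorem 1, the following Python sketch solves the optimality system for a small two-asset example in exact rational arithmetic. It assumes the abbreviations α = e∗Σ⁻¹e, β = e∗Σ⁻¹r̄, γ = r̄∗Σ⁻¹r̄ and δ = αγ − β², and it uses x = Σ⁻¹(λe + μr̄) and λ = (1 − μβ)/α as read off from the two rows of the system. The covariance matrix, expected returns and tradeoff parameter are hypothetical values chosen only for illustration, and the helper names are not from the paper.

```python
from fractions import Fraction as F

def inv2(m):
    """Inverse of a 2x2 matrix stored as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical data (not from the paper): two risky assets.
Sigma = [[F(4, 100), F(1, 100)], [F(1, 100), F(9, 100)]]  # covariance matrix
rbar = [F(115, 100), F(108, 100)]                         # expected returns
e = [F(1), F(1)]                                          # vector of all ones
mu = F(1, 10)                                             # tradeoff parameter

Sigma_inv = inv2(Sigma)
alpha = dot(e, matvec(Sigma_inv, e))
beta = dot(e, matvec(Sigma_inv, rbar))
gamma = dot(rbar, matvec(Sigma_inv, rbar))
delta = alpha * gamma - beta ** 2

# Row 2 (primal feasibility) gives lambda; row 1 (dual feasibility) gives x.
lam = (1 - mu * beta) / alpha
x = matvec(Sigma_inv, [lam * ei + mu * ri for ei, ri in zip(e, rbar)])
rho = dot(rbar, x)  # resulting reward

print("x =", x, " lambda =", lam, " rho =", rho)
```

Exact fractions are used so that the identity ρ = λβ + μγ = (β + μδ)/α can be checked with equality rather than a floating-point tolerance.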
Remark. Although the qualitative interpretation of the tradeoff function is clear, the precise value
of the tradeoff parameter μ should also have an interpretation. Specifically, the resulting reward
is of interest. This is one reason why we prefer a different formulation of the mean-variance
problem. Other important reasons are greater modeling flexibility, and sparsity in the multi-
period formulation, see Section 2.
Its Lagrangian is
$$L(x, \lambda, \mu; \rho) = \tfrac{1}{2} x^* \Sigma x - \lambda (e^* x - 1) - \mu (\bar r^* x - \rho).$$
We refer to the dual variables λ, μ as budget multiplier and reward multiplier, respectively. It
will soon be shown that the optimal reward multiplier μ is precisely the tradeoff parameter of
Problem 1.
As in Theorem 1, the optimal portfolio x is obtained from the first row. Substitution of x into
rows two and three yields the optimal multipliers
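$$\begin{pmatrix} \lambda \\ \mu \end{pmatrix} = \left[ \begin{pmatrix} e^* \\ \bar r^* \end{pmatrix} \Sigma^{-1} \begin{pmatrix} e & \bar r \end{pmatrix} \right]^{-1} \begin{pmatrix} 1 \\ \rho \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \beta & \gamma \end{pmatrix}^{-1} \begin{pmatrix} 1 \\ \rho \end{pmatrix} = \frac{1}{\delta} \begin{pmatrix} \gamma - \beta\rho \\ \alpha\rho - \beta \end{pmatrix}.$$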
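$$\frac{1 - \mu\beta}{\alpha} = \frac{\delta - \alpha\beta\rho + \beta^2}{\alpha\delta} = \frac{\alpha\gamma - \alpha\beta\rho}{\alpha\delta} = \frac{\gamma - \beta\rho}{\delta}.$$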
Hence optimal portfolios also agree. The “only if” direction is trivial.
Remarks. Apparently, the optimality conditions of Problem 2 include the optimality conditions of
Problem 1, and additionally the reward condition. These n + 2 equations define a one-dimensional
affine subspace for the n + 3 variables x, λ, μ, ρ, which is parameterized by μ in Problem 1 and
by ρ in Problem 2. As an immediate consequence, the optimal risk is a quadratic function of ρ,
denoted as σ²(ρ). Its graph is called the efficient frontier (see footnote 2).
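To make the quadratic dependence on ρ concrete, the sketch below evaluates the optimal risk from the Problem 2 multipliers. It assumes x = Σ⁻¹(λe + μr̄), so that x∗Σx = λe∗x + μr̄∗x = λ + μρ, and it works with hypothetical scalar values of α, β, γ (any choice with δ = αγ − β² > 0 would serve equally well).

```python
from fractions import Fraction as F

# Hypothetical values of alpha, beta, gamma; only delta > 0 is required.
alpha, beta, gamma = F(40), F(44), F(49)
delta = alpha * gamma - beta ** 2

def multipliers(rho):
    """Budget and reward multipliers (lambda, mu) of Problem 2."""
    return (gamma - beta * rho) / delta, (alpha * rho - beta) / delta

def optimal_risk(rho):
    """Optimal risk x* Sigma x = lambda + mu * rho (see the assumption above)."""
    lam, mu = multipliers(rho)
    return lam + mu * rho

rho_hat = beta / alpha  # reward with the smallest optimal risk
for rho in (F(1), rho_hat, F(6, 5)):
    print(float(rho), float(optimal_risk(rho)))
```

In this example the risk is smallest at ρ = β/α, where the reward multiplier vanishes, and it grows quadratically on both sides.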
Theorem 4. In Problems 1 and 2, the optimal risk is
$$\sigma^2(\rho) = \frac{\alpha\rho^2 - 2\beta\rho + \gamma}{\delta} = \frac{\mu^2\delta + 1}{\alpha}.$$
Its global minimum over all rewards is attained at ρ̂ = β/α and has the positive value σ²(ρ̂) = 1/α.
The associated solution is x̂ = Σ⁻¹e/α, λ̂ = 1/α, μ̂ = 0.
Proof. By Definition 2 and Theorem 2,
Discussion. The optimal portfolio is clearly a reward-dependent linear combination of the reward-
independent portfolios Σ⁻¹e and Σ⁻¹r̄. Moreover, it is an affine function of ρ. The efficient frontier
and optimal investments into two risky assets are depicted in Fig. 1. Here, since n = 2, the optimal
portfolio is completely determined by the budget condition and the reward condition; it does not
depend on Σ and is thus correlation-independent. Not so the risk: for negatively correlated assets,
it has a pronounced minimum at a fairly large reward ρ̂. As the correlation increases, the lowest
possible risk also increases and is attained at a smaller reward. (These statements do not simply
generalize to the case n > 2.)
Footnote 2: More generally, efficient frontier refers to the set of all Pareto-optimal solutions in any multi-objective
optimization problem. The solutions (portfolios) are also called efficient. Strictly speaking this applies only to the
upper branch here, that is, ρ ≥ ρ̂ or, equivalently, μ ≥ 0 (see discussion below).
Figure 1: Portfolio with two risky assets; r̄1 = 1.15, r̄2 = 1.08. Left: efficient frontier for negatively
correlated, uncorrelated, and positively correlated assets. Right: optimal portfolio vs. reward.
A serious drawback of the model (in this form) is the fact that positive deviations from the
prescribed reward are penalized, and hence the “risk” increases when ρ is reduced below ρ̂. Indeed,
the penalization cannot be avoided, indicating that the model is somehow incomplete. We will
see, however, that unnecessary positive deviations from ρ do not occur if the model is extended
appropriately. For the moment let us accept that only the upper branch is relevant in practice.
$$\sigma^2(\rho) = (\rho - r^c)^2 / \delta^c.$$
Its global minimum over all rewards is attained at ρ̂ = rc and has value zero. The associated
solution has 100 % cash: (x̂, x̂c ) = (0, 1), λ̂ = μ̂ = 0.
The optimal budget multiplier λ is obtained from row two. Substitution into row one yields the
expression for x, and substituting x into row three yields xc . Substitution of x and xc into row
four gives
Theorem 6. Problem 3 with parameter ρ and Problem 4 with parameter μ are equivalent if and
only if ρ = rc + μδ c .
Proof. The proof is analogous to the proof of Theorem 3 and therefore omitted.
Discussion. Basically the situation is quite similar to Problem 2, the only qualitative difference
being the existence of one zero risk portfolio: for ρ = rc , the capital is completely invested in cash
and the risk vanishes. Otherwise a fraction of e∗ x = μ(β − rc α) is invested in risky assets and the
risk is positive, see Fig. 2. The optimal portfolio is now a mix of the (reward-independent) risky
portfolio (Σ⁻¹(r̄ − rc e), 0) and cash (0, 1). The following comparison shows how precisely the cash
account reduces risk when added to a set of (two or more) risky assets.
Theorem 7. The risk in Problem 3 is almost always lower than in Problem 2: If β ≠ rc α, then
the efficient frontiers touch in the single point
$$\rho = r^c + \frac{\delta^c}{\beta - r^c\alpha} = \frac{\gamma - r^c\beta}{\beta - r^c\alpha}, \qquad \sigma^2(\rho) = \frac{\delta^c}{(\beta - r^c\alpha)^2}$$
(see Fig. 2), where the solutions of both problems are “identical”: x or (x, 0). If β = rc α, then
xc ≡ 1 and e∗ x ≡ 0, and risks differ by the constant 1/α,
$$\frac{(\rho - r^c)^2}{\delta^c} + \frac{1}{\alpha} = \frac{\alpha\rho^2 - 2\beta\rho + \gamma}{\delta}.$$
Figure 2: Portfolio with two risky assets and cash; r̄1 , r̄2 as before, rc = 1.05. Left: efficient
frontiers with/without cash. Right: optimal portfolio vs. reward.
$$\mu = \frac{1}{\beta - r^c\alpha}, \qquad \lambda = -\frac{r^c}{\beta - r^c\alpha}.$$
This gives the stated values of ρ and σ²(ρ) by Theorem 5. Substituting ρ into the formulae for
λ, μ in Theorem 2 yields identical values in both problems. Hence the portfolios agree, too. The
curvatures of the efficient frontiers, d²σ²(ρ)/dρ², are 2α/δ and 2/δc, respectively. Now,
Thus 2α/δ > 2/δc > 0, implying that Problem 3 has lower risk if xc ≠ 0. The case β = rc α is
trivial: both efficient frontiers have ρ̂ = rc and identical curvatures.
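The statement of Theorem 7 can be illustrated numerically. The sketch below compares the two efficient frontiers for hypothetical scalar values of α, β, γ and rc; it assumes δc = γ − 2rcβ + rc²α, which is the value implied by x = μΣ⁻¹(r̄ − rc e) together with the relation ρ = rc + μδc.

```python
from fractions import Fraction as F

# Hypothetical scalars (not from the paper); beta != rc * alpha holds here.
alpha, beta, gamma = F(40), F(44), F(49)
rc = F(105, 100)
delta = alpha * gamma - beta ** 2
delta_c = gamma - 2 * rc * beta + rc ** 2 * alpha  # assumed definition, see above

def risk_without_cash(rho):
    """Efficient frontier of Problem 2."""
    return (alpha * rho ** 2 - 2 * beta * rho + gamma) / delta

def risk_with_cash(rho):
    """Efficient frontier of Problem 3."""
    return (rho - rc) ** 2 / delta_c

# Reward at which the two frontiers touch.
rho_touch = rc + delta_c / (beta - rc * alpha)

for rho in (F(1), F(11, 10), rho_touch, F(3, 2)):
    print(float(rho), float(risk_with_cash(rho)), float(risk_without_cash(rho)))
```

Away from the touching point the frontier with cash lies strictly below the one without, in line with the curvature comparison 2α/δ > 2/δc.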
To conclude this section, we show that it does not make sense to consider portfolios with more
than one riskless asset (and no further restrictions).
Lemma 2 (Arbitrage). Any portfolio having at least two riskless assets xc , xd with different
returns rc , rd can realize any desired reward at zero risk.
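The mechanism behind Lemma 2 is a 2×2 linear system: with all capital in the two riskless assets, the budget equation xc + xd = 1 and the reward condition rc xc + rd xd = ρ determine the weights uniquely whenever rc ≠ rd. A minimal sketch, with hypothetical returns and a hypothetical helper name:

```python
from fractions import Fraction as F

def riskless_mix(rho, rc, rd):
    """Weights (xc, xd) with xc + xd = 1 and rc*xc + rd*xd = rho; requires rc != rd."""
    xd = (rho - rc) / (rd - rc)
    return 1 - xd, xd

rc, rd = F(105, 100), F(103, 100)  # hypothetical riskless returns
for rho in (F(1), F(2), F(10)):
    xc, xd = riskless_mix(rho, rc, rd)
    print(rho, xc, xd)
```

Large rewards are obtained by going short in the asset with the lower return, which is why such a portfolio only makes sense under further restrictions.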
Basic assumptions. In addition to the conditions of the previous section we now require positive
cash return (rc ≤ rl does not make sense).
A1) Σ > 0.
A3) r̄ ≠ rc e.
A4) rc > 0.
3 The suggestive (but actually misleading) notion of an “asset with guaranteed total loss” is due to Infanger [28].
Problem 5. All covariances associated with xc or xl vanish, so that the risk and reward are
R(x, xc , xl ) = x∗ Σx and ρ(x, xc , xl ) = r̄∗ x + rc xc , respectively, and the optimization problem
reads
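$$\begin{aligned} \min_{x, x^c, x^l} \quad & \frac{1}{2} \begin{pmatrix} x \\ x^c \\ x^l \end{pmatrix}^{*} \begin{pmatrix} \Sigma & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ x^c \\ x^l \end{pmatrix} = \frac{1}{2} x^* \Sigma x \\ \text{s.t.} \quad & e^* x + x^c + x^l = 1, \quad x^l \ge 0, \\ & \bar r^* x + r^c x^c = \rho. \end{aligned}$$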
Note that the no-arbitrage condition xl ≥ 0 must be imposed; otherwise one could borrow arbitrary
amounts of money without having to repay. However, Lemma 2 still works for sufficiently small ρ.
This is precisely our intention.
Theorem 8. Problem 5 has unique primal and dual solutions x, xc , xl , λ, μ, η, where η is the
multiplier of the nonnegativity constraint xl ≥ 0. For ρ > rc , the optimal solution has xl = 0 and
η = −λ > 0, and is otherwise identical to the solution of Problem 3. Any reward ρ ≤ rc is obtained
at zero risk by investing in a linear combination of the two riskless assets, with primal-dual solution
$$x = 0, \quad x^c = \frac{\rho}{r^c}, \quad x^l = 1 - \frac{\rho}{r^c}, \quad \lambda = \mu = \eta = 0.$$
Proof. The system of necessary conditions can be written
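$$\begin{pmatrix} \Sigma & 0 & 0 & e & \bar r \\ 0 & 0 & 0 & 1 & r^c \\ 0 & 0 & 0 & 1 & 0 \\ e^* & 1 & 1 & 0 & 0 \\ \bar r^* & r^c & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ x^c \\ x^l \\ -\lambda \\ -\mu \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \eta \\ 1 \\ \rho \end{pmatrix}, \qquad x^l \ge 0, \quad \eta \ge 0, \quad x^l \eta = 0.$$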
As in Theorem 5, the first two rows yield λ = −rc μ and x = μΣ−1 (r̄ − rc e). The third row yields
η = −λ = rc μ. Hence, by complementarity of xl and η, rows four and five yield either xc and μ
as in Problem 3 (if xl = 0 and η ≥ 0; case 1), or xc + xl = 1 and rc xc = ρ (if xl ≥ 0 and η = 0;
case 2). Due to the nonnegativity of xl and η, case 1 can only hold for ρ ≥ rc and case 2 only for
ρ ≤ rc . (Indeed, for ρ = rc both cases coincide so that all variables are continuous with respect
to the parameter ρ.)
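The two cases of the proof translate into a short piecewise rule. The following sketch works at the level of the scalars α, β, γ, rc (hypothetical values) and tracks only the total risky fraction e∗x rather than the individual assets. It assumes δc = γ − 2rcβ + rc²α and r̄∗x = μ(γ − rcβ), both of which follow from x = μΣ⁻¹(r̄ − rc e); the function name and the returned dictionary layout are illustrative choices.

```python
from fractions import Fraction as F

alpha, beta, gamma = F(40), F(44), F(49)  # hypothetical scalars
rc = F(105, 100)                          # hypothetical cash return
delta_c = gamma - 2 * rc * beta + rc ** 2 * alpha  # assumed definition

def solve_problem5(rho):
    """Risky fraction e*x, cash xc, loss xl, multipliers mu and eta, reward and risk."""
    if rho <= rc:
        # Case 2: riskless mix of cash and loss asset, all multipliers zero.
        xc, xl = rho / rc, 1 - rho / rc
        return {"risky": F(0), "xc": xc, "xl": xl, "mu": F(0), "eta": F(0),
                "reward": rc * xc, "risk": F(0)}
    # Case 1: loss asset unused, remaining variables as in Problem 3.
    mu = (rho - rc) / delta_c
    risky = mu * (beta - rc * alpha)
    xc = 1 - risky
    return {"risky": risky, "xc": xc, "xl": F(0), "mu": mu, "eta": rc * mu,
            "reward": mu * (gamma - rc * beta) + rc * xc,
            "risk": mu ** 2 * delta_c}

for rho in (F(9, 10), rc, F(6, 5)):
    s = solve_problem5(rho)
    print(float(rho), float(s["xc"]), float(s["xl"]), float(s["risk"]))
```

At ρ = rc both branches return 100 % cash with zero multipliers, which reflects the continuity noted at the end of the proof.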
Problem 6. The tradeoff version of Problem 5 reads
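$$\begin{aligned} \min_{x, x^c, x^l} \quad & \tfrac{1}{2} x^* \Sigma x - \mu (\bar r^* x + r^c x^c) \\ \text{s.t.} \quad & e^* x + x^c + x^l = 1, \quad x^l \ge 0. \end{aligned}$$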
Figure 3: Portfolio with two risky assets, cash, and asset with guaranteed loss; r̄1 , r̄2 , rc as before.
Left: efficient frontier. Right: optimal portfolio vs. reward (asset 1, asset 2, cash, loss).
Let us first give the provocative answer “Yes, why not?”. From the point of view of the model,
the investor’s goal is minimizing the “risk” of earning less or more than the specified reward.
Therefore it makes sense to get rid of money whenever this reduces the variance—which it indeed
does for ρ < ρ̂. The model cannot know and consequently does not care how the investor will
interpret that, and it will use any possible means to take out capital if appropriate.
Of course, we can also offer a better interpretation. The fraction invested in xl is simply surplus
capital: the desired reward ρ is achieved at zero risk without that amount, so it should not be
invested in the first place—at least not into the portfolio under consideration. The investor may
enjoy a free lunch instead or support her favorite artist, if she prefers that to burning the money.
Or she may reconsider and decide to pursue a more ambitious goal: the model does not suggest
how to spend the surplus money. This interpretation of the new riskless (but inefficient) solutions
becomes obvious after the following observation.
Lemma 3. Problem 5 is equivalent to the modification of Problem 3 where the budget equation
e∗ x + xc = 1 is replaced by the inequality e∗ x + xc ≤ 1, i.e., less than 100 % investment is allowed.
Proof. With a slack variable s ≥ 0, the modified condition is equivalent to e∗ x + xc + s = 1, and
the modified Problem 3 becomes identical to Problem 5: the ominous loss asset is simply a slack
variable, xl ≡ s.
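$$U_\rho(w) := \mu_\rho w - \tfrac{1}{2} (w - \rho)^2, \qquad \mu_\rho \equiv \frac{\alpha\rho - \beta}{\delta}.$$
(μρ is the optimal reward multiplier for the desired reward ρ.) If ρ + μρ > 0, then this equivalence
remains valid for the normalized utility functions
$$\bar U_\rho(w) := \frac{1}{(\rho + \mu_\rho)^2} \left( U_\rho(w) + \tfrac{1}{2} \rho^2 \right) = \frac{w}{\rho + \mu_\rho} - \frac{w^2}{2 (\rho + \mu_\rho)^2},$$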
satisfying Ūρ(0) = 0 and max_{w∈R} Ūρ(w) = Ūρ(ρ + μρ) = 1/2. For a portfolio with two positively
correlated risky assets, Fig. 4 shows the normalized utility functions associated with several
desired rewards, and the resulting optimal wealth distribution functions given normally distributed
returns, r ∼ N(r̄, Σ).
Figure 4: Utility-based portfolio optimization; two positively correlated assets. Left: Normalized
utility functions Ūρ and optimal wealth distribution functions Φρ for ρ ∈ {0.9rc , rc , r̄1 , r̄2 }.
Symmetry center of Φρ curves at (ρ, 1/2) marked by ‘o’; top of Ūρ parabolas at (ρ + μρ , 1/2) marked by ‘|’.
Right: Family of optimal wealth distribution functions Φρ over the range ρ ∈ [1, 1.2].
The optimal wealth distributions have the explicit form
$$\Phi_\rho(w) := \frac{1}{\sqrt{2\pi}\,\sigma_\rho} \int_{-\infty}^{w} \exp\left( -\frac{(t - \rho)^2}{2\sigma_\rho^2} \right) dt = \Phi\left( \frac{w - \rho}{\sqrt{2}\,\sigma_\rho} \right),$$
where σρ² := σ²(ρ) = (μρ²δ + 1)/α, and Φ is the standard error integral,
$$\Phi(w) := \frac{1}{\sqrt{\pi}} \int_{-\infty}^{w} \exp(-t^2)\, dt.$$
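The error integral above is related to the standard library function `math.erf` by Φ(w) = (1 + erf(w))/2, so the distribution functions Φρ are easy to evaluate. The sketch below does this for a hypothetical reward and standard deviation and compares the result with `statistics.NormalDist`; the function names are illustrative.

```python
import math
from statistics import NormalDist

def error_integral(w):
    """Phi(w) = (1/sqrt(pi)) * integral of exp(-t^2) from -inf to w = (1 + erf(w)) / 2."""
    return 0.5 * (1.0 + math.erf(w))

def wealth_cdf(w, rho, sigma):
    """Phi_rho(w) for normally distributed final wealth with mean rho and std sigma > 0."""
    return error_integral((w - rho) / (math.sqrt(2.0) * sigma))

rho, sigma = 1.1, 0.15  # hypothetical reward and standard deviation
for w in (0.9, 1.1, 1.3):
    print(w, wealth_cdf(w, rho, sigma), NormalDist(rho, sigma).cdf(w))
```

The value at w = ρ is exactly 1/2, which is the symmetry center marked in the figures.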
When cash is included in the portfolio (Problem 4), utility functions Uρ , Ūρ and distribution
functions Φρ have precisely the same form, except that the final wealth becomes w = r∗x + rc xc,
and that μρ := (ρ − rc)/δc and σρ² := μρ²δc. The properties of Φρ give another indication of the
risk reduction mechanism described in Theorem 7: Fig. 5 shows that the slope of Φρ is rather
steep and becomes a jump in the zero-risk case, ρ = rc .
When a loss asset is also added to the portfolio in Problem 6, the utility functions Uρ , Ūρ
are exactly identical to the previous case, and even their optimal wealth distributions for ρ ≥ rc
coincide, see Fig. 6. For ρ ≤ rc , however, the wealth distributions Φρ := χ[ρ,∞) become indicator
functions rather than normal distributions: they all have μρ = σρ = 0 and a jump discontinuity
at ρ, producing zero risk. (In Problem 4 this happens only for ρ = rc .)
In this paper we do not wish to pursue the subject further. The interested reader should refer
to the original considerations of Markowitz [54, Part IV], the literature cited in the introduction
(especially [31, 37, 39, 40, 46, 48, 59]), and the references therein.
Figure 6: The same situation as in Fig. 4 with cash and loss included.
Otherwise some constraints become tight, excluding the corresponding assets from the portfolio
and increasing risk. More precisely, the following simple facts hold.
Proof. Statement 1 is trivial. Since each problem is convex, an optimal solution exists if and only if
the feasible set is nonempty. For ρ ∈ [r̄min , r̄max ], the feasible set clearly contains a (unique) convex
combination of xmin and xmax . Conversely, every convex combination of assets yields as reward
the same convex combination of individual expected returns, which lies in the range [r̄min , r̄max ].
This proves statement 2. To prove statement 3 consider Problem 2 first. Let ρ0 , ρ1 ∈ [r̄min , r̄max ]
with respective solutions x0 , x1 . Then xt := (1 − t)x0 + tx1 is feasible for ρt := (1 − t)ρ0 + tρ1 ,
t ∈ [0, 1], and convexity of the efficient frontier follows from convexity of R,
At r̄min and r̄max all the money is invested in one single asset: xmin or xmax . Each ρ ∈ (r̄min , r̄max )
determines a subset of two or more nonnegative assets whose efficient frontier gives the optimal
risk in that point. Since strict positivity is a generic property, each of these sub-portfolios is
optimal either in a single point or on an entire interval. Thus, the efficient frontier (in Problem 2)
is composed of finitely many quadratic pieces. Precisely the same arguments hold for Problem 3
since R(x, xc) ≡ R(x). In Problem 5, the efficient frontier consists of the segment σ²(ρ) ≡ 0 on
[0, rc ] and the pieces of Problem 3 on [rc , r̄max ].
Discussion. The theorem gives a simple characterization of the influence of standard nonnegativity
constraints. In a portfolio with three risky assets, the respective efficient frontiers of subportfolios
that contribute to the optimal solution in Problems 2, 3, and 5 might look as in Fig. 7. Other
inequalities, like upper bounds on the assets or limits on arbitrary asset combinations, will further
restrict the range of feasible rewards and increase the risk in a similar manner. This case was
already considered by Markowitz: he handles general linear inequalities by dummy assets (slacks)
and constraints Ax = b, x ≥ 0, where A = e∗ (with x ≥ 0 and ρ(x) ≥ ρ) is called the standard
case [54, p. 171]. Moreover, Markowitz devised an algorithm to trace the critical lines, that is,
the segments of the efficient frontier [53], [54, Chap. VIII]. The number of assets in an optimal
portfolio is investigated, e.g., by Nakasato and Furukawa [61].
(Of course, if P has a density φ, then Ξ = supp(φ) = {x ∈ Rn : φ(x) > 0}.) In the following we
will actually use the convex hull of the support in most cases, denoted by C := conv(Ξ).
Definition 3 (Downside risk). For a function w of the random vector r with distribution P ,
the downside risk of order q > 0 with target τ ∈ R is
$$R^q_\tau(w) := E\left[ |\min(w(r) - \tau, 0)|^q \right] = \int_{\mathbb{R}^n} |\min(w(r) - \tau, 0)|^q \, dP.$$
Remarks. Without the risk context such expectations are neutrally called lower partial moments,
with downside expected value or semi-deviation (order 1) and downside variance or semi-variance
(order 2) as special cases. In [54, Chap. XIII] Markowitz gives a qualitative discussion of the
linear case (expected value of loss, q = 1), the quadratic case (semi-variance, q = 2), and some
other measures of risk, by examining the associated utility functions. Expectation of loss has
recently gained interest as a coherent replacement for the popular Value-at-Risk (VaR), often
under alternative names like mean shortfall, tail VaR, or conditional VaR, cf. [3, 79].
In the following we are only interested in quadratic downside risk of portfolio returns like
w_{x,xc}(r, rc) = r∗x + rc xc. Moreover, we always use the desired reward as a natural choice for the
shortfall target, τ = ρ, and write simply Rρ(x, xc) instead of R^2_ρ(w_{x,xc}). The problems considered
in this section are downside risk versions of Problems 3 and 5, and of the modification of Problem 3
with ρ(x, xc ) ≥ ρ. In each case only the objective is changed: standard risk R is replaced by
downside risk Rρ . Before considering these problems we need some technical preparations.
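For a discrete return distribution the integral in Definition 3 becomes a finite sum. The following sketch evaluates the downside risk of a final wealth given by scenario values and probabilities; the scenario data and the function name are hypothetical, and the target is set to the expected wealth to mirror the choice τ = ρ.

```python
from fractions import Fraction as F

def downside_risk(wealth, prob, target, q=2):
    """Lower partial moment E|min(w - target, 0)|^q of a discrete distribution."""
    return sum(p * abs(min(w - target, 0)) ** q for w, p in zip(wealth, prob))

# Hypothetical scenarios of the final wealth and their probabilities.
wealth = [F(9, 10), F(1), F(6, 5)]
prob = [F(1, 4), F(1, 2), F(1, 4)]

rho = sum(p * w for w, p in zip(wealth, prob))  # expected wealth, used as target
variance = sum(p * (w - rho) ** 2 for w, p in zip(wealth, prob))
semi_variance = downside_risk(wealth, prob, rho)       # order q = 2
semi_deviation = downside_risk(wealth, prob, rho, q=1) # order q = 1

print(rho, variance, semi_variance, semi_deviation)
```

Only the scenarios below the target contribute, so the semi-variance never exceeds the variance; the missing part is the corresponding upside moment.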
For x ≠ 0 and c ∈ R let us introduce open and closed half-spaces
Lemma 4. Denote by ⊎ a disjoint union. Then, for x ≠ 0 and a > 0,
1. H(ax, c) = H(x, a⁻¹c), H(x, ac) = H(a⁻¹x, c), H(ax, ac) = H(x, c).
2. H(−x, −c) = Rn \ H̄(x, c).
3. H̄(x, c) = H(x, c) ⊎ ∂H(x, c), H̄(x, 0) = H(x, 0) ⊎ {x}⊥.
Statements 1 and 2 remain valid when H and H̄ are exchanged everywhere.
Proof. Trivial.
Lemma 5. For x ∈ Rn and a > 0,
1. Σ(ax) = Σ(x).
2. 0 ≤ Σ(x) ≤ Σ. (In particular, each Σ(x) is positive semidefinite.)
3. x∗Σ(x)x = E[min((r − r̄)∗x, 0)²].
4. x∗ [Σ(x) + Σ(−x)]x = x∗ Σx.
Proof. Statements 1, 2, 3 are obvious from the definitions and the first identity in Lemma 4.
The expressions in statement 4 are identical for x = 0; otherwise they differ by the integral of
((r − r̄)∗x)² over r̄ + {x}⊥, which is clearly zero.
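Statements 1, 3 and 4 of Lemma 5 can be checked on a small discrete distribution. The sketch below assumes that Σ(x) is the integral of (r − r̄)(r − r̄)∗ over r̄ + H(x, 0), with H(x, 0) = {r : r∗x < 0}; this reading is inferred from the way Σ(x) and H are used in the surrounding lemmas and proofs, and it should be treated as an assumption. Scenarios, probabilities and helper names are hypothetical.

```python
from fractions import Fraction as F

def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def quad(m, x):
    """Quadratic form x* M x."""
    return dot(x, [dot(row, x) for row in m])

def semi_cov(scen, prob, rbar, x=None):
    """Sum of p (r - rbar)(r - rbar)* over scenarios with (r - rbar)* x < 0.
    With x=None all scenarios are used, which gives the covariance matrix."""
    n = len(rbar)
    m = [[F(0)] * n for _ in range(n)]
    for r, p in zip(scen, prob):
        d = [ri - mi for ri, mi in zip(r, rbar)]
        if x is None or dot(d, x) < 0:
            for i in range(n):
                for j in range(n):
                    m[i][j] += p * d[i] * d[j]
    return m

# Hypothetical return scenarios for two assets, equally likely.
scen = [[F(9, 10), F(11, 10)], [F(12, 10), F(1)],
        [F(13, 10), F(12, 10)], [F(1), F(9, 10)]]
prob = [F(1, 4)] * 4
rbar = [sum(p * r[i] for r, p in zip(scen, prob)) for i in range(2)]
x = [F(3, 5), F(2, 5)]  # hypothetical portfolio direction

Sigma = semi_cov(scen, prob, rbar)
down = semi_cov(scen, prob, rbar, x)
up = semi_cov(scen, prob, rbar, [-xi for xi in x])
print(quad(down, x), quad(up, x), quad(Sigma, x))
```

Scenarios lying exactly on the hyperplane r̄ + {x}⊥ are counted in neither semi-variance matrix, but they contribute nothing to the quadratic form in the direction x, which is the argument used for statement 4.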
Lemma 6. For any random vector r the following holds.
1. The expectation lies in the convex hull of the support: r̄ ∈ C.
2. The covariance matrix and all semi-variance matrices are positive definite if and only if
Ξ has full dimension in the sense that its convex hull has nonempty interior:
int(C) ≠ ∅ ⇐⇒ Σ > 0 ⇐⇒ Σ(x) > 0 ∀x ∈ Rn .
3. If r is discrete with Σ > 0, then it has at least n + 1 realizations.
Proof. Assume r̄ ∉ C. Then r̄ has positive distance to C, and a vector x ≠ 0 exists so that
(r − r̄)∗x > 0 for all r ∈ C. Since expectation is the integral over Ξ ⊆ C, this yields the
contradiction 0 < E[(r − r̄)∗x] = 0, proving statement 1. Now assume int(C) = ∅. Then C is
contained in some hyperplane r̄ + {x}⊥ with x ≠ 0, implying
$$x^* \Sigma x = E\left[ ((r - \bar r)^* x)^2 \right] = 0.$$
Hence Σ is only positive semidefinite. Conversely, assume int(C) ≠ ∅ and x ≠ 0. Then (r − r̄)∗x < 0
for all r ∈ r̄ + H(x, 0). By Lemma 7 below, r̄ + H(x, 0) has positive measure. Therefore
$$x^* \Sigma(x) x = \int_{\bar r + H(x,0)} ((r - \bar r)^* x)^2 \, dP > 0,$$
showing that Σ(x) > 0. The proof of statement 2 is complete since Σ ≥ Σ(x) for all x. Now
statement 3 is an immediate consequence.
Lemma 7. Let int(C) ≠ ∅ and x ≠ 0. Then r̄ + H(x, 0) has positive measure.
Proof. The inner product s(x) := (r − r̄)∗ x is negative, zero, and positive on the respective sets
r̄ + H(x, 0), r̄ + {x}⊥ , and r̄ + H(−x, 0). Furthermore,
\[ \int_{\bar r + H(x,0)} s(x) \, dP + \int_{\bar r + \{x\}^\perp} s(x) \, dP + \int_{\bar r + H(-x,0)} s(x) \, dP = E[s(x)] = 0. \]
Therefore r̄ + H(x, 0) and r̄ + H(−x, 0) have either both positive measure or both measure zero.
The second case implies Ξ ⊆ r̄ + {x}⊥ , which leads to the contradiction int(C) = ∅.
Let us now study the downside risk versions of Problems 3 and 5 under the same assumptions
as before (A1 and A3 resp. A1, A3, and A4). It will be seen that in these cases the qualitative
behavior does not change significantly. This is mainly because the constraints are linear and Σ(x)
depends only on the direction and not on the magnitude of x (cf. Lemma 5).
Problem 7. We minimize downside risk Rρ (x, xc ) for risky assets and cash, with fixed desired
reward ρ(x, xc ) = ρ,
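\[
\begin{aligned}
\min_{x, x^c} \quad & \frac{1}{2} \int_{\mathbb{R}^n} \min(r^* x + r^c x^c - \rho, 0)^2 \, dP \\
\text{s.t.} \quad & e^* x + x^c = 1, \\
& \bar r^* x + r^c x^c = \rho.
\end{aligned}
\]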
Problem 8. Now minimize downside risk Rρ (x, xc , xl ) for risky assets, cash, and loss, with fixed
desired reward ρ(x, xc , xl ) = ρ,
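\[
\begin{aligned}
\min_{x, x^c, x^l} \quad & \frac{1}{2} \int_{\mathbb{R}^n} \min(r^* x + r^c x^c - \rho, 0)^2 \, dP \\
\text{s.t.} \quad & e^* x + x^c + x^l = 1, \quad x^l \ge 0, \\
& \bar r^* x + r^c x^c = \rho.
\end{aligned}
\]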
We are now ready to analyze Problems 7 and 8. In general, closed-form solutions cannot be
found due to the nonlinearity of downside risk with respect to the risky assets. However, we can
derive some important properties of the solutions and give a qualitative comparison to Problems
3 and 5.
Lemma 9. Optimal solutions always exist in Problems 7 and 8. The resulting downside risk is
nonnegative and not greater than the optimal risk in Problem 3 or 5, respectively. Moreover, the
riskless solutions of Problems 7 and 3 (8 and 5) are identical. (In general the solutions are not
unique.)
Proof. Convexity of min(w, 0) implies convexity of downside risk x∗ Σ(x)x and thus of Problems 7
and 8. The existence of solutions and the stated inclusion follow since 0 ≤ Σ(x) ≤ Σ by Lemma 5.
By assumption A1 and Lemma 6, zero risk requires x = 0, which holds under the same conditions
as in the standard risk case.
Theorem 12. In Problem 7, choose respective optimal portfolios (x± , xc± ) for ρ± := rc ± 1. Then
(ax± , axc± − a + 1) is optimal for ρ = rc ± a if a ≥ 0. Moreover, x± ≠ 0 and x+ ≠ x− .
[Figure: sketches in the (r1, r2)-plane showing the supports Ξ and Ξ′, the points r̄ and rc e, the vectors x and y, and the hyperplanes r̄ + {x}⊥ and r̄ + {y}⊥.]
Proof. The transformation of xc+ follows from e∗ x + xc = 1 if ax+ is optimal. Assume ax+ is not
optimal for ρ = rc + a > rc . By Lemma 8, x ≠ ax+ exists so that (r̄ − rc e)∗ x = ρ − rc and
and
Thus x+ cannot be optimal for ρ+ : a contradiction. The case ρ < rc is analogous, and ρ = rc is
trivial. Finally observe that (r̄ − rc e)∗ x− < 0 < (r̄ − rc e)∗ x+ .
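The transformation in Theorem 12 can be illustrated numerically. The sketch below uses hypothetical scenario data and helper names, and it does not solve Problem 7: it only checks that the map (x, xc) ↦ (ax, axc − a + 1) sends a feasible portfolio for ρ+ = rc + 1 to a feasible portfolio for ρ = rc + a and scales its downside risk by a², which is the mechanism behind the proof.

```python
# Illustrative data (hypothetical): four return scenarios for two risky assets.
RETURNS = [(1.00, 1.10), (1.20, 0.95), (0.90, 1.05), (1.15, 1.20)]
PROBS = [0.25, 0.25, 0.25, 0.25]
RC = 1.02  # cash return r^c


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def reward(x, xc):
    """rbar* x + r^c x^c."""
    rbar = [sum(p * r[i] for p, r in zip(PROBS, RETURNS)) for i in range(len(x))]
    return dot(rbar, x) + RC * xc


def downside_risk(x, xc, rho):
    """Objective of Problem 7: (1/2) E[min(r* x + r^c x^c - rho, 0)^2]."""
    return 0.5 * sum(p * min(dot(r, x) + RC * xc - rho, 0.0) ** 2
                     for p, r in zip(PROBS, RETURNS))


def transform(x, xc, a):
    """Map (x, x^c) to (a x, a x^c - a + 1) as in Theorem 12."""
    return [a * xi for xi in x], a * xc - a + 1.0


# A feasible (not necessarily optimal) portfolio for rho_+ = r^c + 1.
rbar = [sum(p * r[i] for p, r in zip(PROBS, RETURNS)) for i in range(2)]
direction = (1.0, 0.5)
t = 1.0 / dot([m - RC for m in rbar], direction)
x_plus = [t * d for d in direction]
xc_plus = 1.0 - sum(x_plus)

a = 0.3
x_a, xc_a = transform(x_plus, xc_plus, a)
risk_plus = downside_risk(x_plus, xc_plus, RC + 1.0)
risk_a = downside_risk(x_a, xc_a, RC + a)
print(reward(x_a, xc_a), risk_a, a * a * risk_plus)
```

Since the map is a bijection between the feasible sets for ρ+ and ρ, and it rescales the objective by the constant a², it also maps optimal portfolios to optimal portfolios.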
Theorem 13. Constants c± ∈ (0, 1) exist so that the optimal risk in Problem 7 is c+ (c− ) times
the optimal risk of Problem 3 on the upper (lower) branch.
Proof. The existence of c± ∈ (0, 1] with the stated properties follows from Lemma 9 and Theo-
rem 12. Statement 4 of Lemma 5 implies c± < 1.
Theorem 14. The same statements as in Theorems 12 and 13 hold on the upper branch in Prob-
lem 8. On the lower branch one has the (unique) riskless solution (x, xc , xl ) = (0, ρ/rc , 1 − ρ/rc ).
The last problem considered in this section is the downside risk version of the modification of
Problem 3. Again the riskless solutions are of interest.
Problem 9. We minimize downside risk Rρ (x, xc ) for risky assets and cash, with desired minimal
reward ρ(x, xc ) ≥ ρ,
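\[
\begin{aligned}
\min_{x, x^c, \theta} \quad & \frac{1}{2} \int_{\mathbb{R}^n} \min(r^* x + r^c x^c - \rho, 0)^2 \, dP \\
\text{s.t.} \quad & e^* x + x^c = 1, \\
& \bar r^* x + r^c x^c = \rho + \theta, \quad \theta \ge 0.
\end{aligned}
\]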
Remark. Note that downside risk is still calculated with respect to the desired reward ρ whereas
the actual reward is now ρ + θ. Otherwise the problem would be equivalent to Problem 8.
Lemma 10. (x, xc ) = (0, 1) is feasible for Problem 9 iff ρ ≤ rc . Otherwise, with xc ≡ 1 − e∗ x,
Problem 9 is equivalent to
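\[
\begin{aligned}
\min_{x, \theta} \quad & \frac{1}{2} \int_{\bar r + H(x, -\theta)} (\theta + (r - \bar r)^* x)^2 \, dP \\
\text{s.t.} \quad & (\bar r - r^c e)^* x = \rho + \theta - r^c, \quad \theta \ge 0.
\end{aligned}
\]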
Proof. The first part is trivial. The second part is proved as Lemma 8, the only difference being
that the reward condition now yields r∗ x + rc xc − ρ = (r − r̄)∗ x + θ instead of (r − r̄)∗ x.
Theorem 15. If rc e ∈ int(C), then the following holds in Problem 9.
1. For every ρ ≤ rc , (x, xc ) = (0, 1) is a riskless solution.
2. A portfolio with x ≠ 0 has zero risk iff Ξ ⊆ rc e + H̄(−x, rc − ρ).
3. For ρ < rc such a portfolio exists iff Ξ is contained in some closed half-space.
4. For ρ ≥ rc such a portfolio does not exist.
Proof. Statement 1 is trivial. If x ≠ 0, then downside risk clearly vanishes iff Ξ does not intersect
r̄ + H(x, −θ), that is, iff (r − r̄)∗ x + θ ≥ 0 for r ∈ Ξ. Substituting θ = (r̄ − rc e)∗ x − ρ + rc from
the reward equation yields the equivalent condition (r − rc e)∗ x ≥ ρ − rc for r ∈ Ξ, which proves
statement 2. (This condition implies feasibility of x since r̄ ∈ C.) Now, if Ξ is contained in some
closed half-space, then y ≠ 0 exists so that (r − rc e)∗ y ≥ −1 for r ∈ Ξ, see Fig. 8. For ρ < rc let
x := (rc − ρ)y to satisfy the zero-risk condition. The “only if” direction of statement 3 is trivial.
Finally observe that C contains an open ball centered at rc e ∈ int(C). On that ball the inner
product (r − rc e)∗ x takes positive and negative values for any x ≠ 0, showing that for ρ ≥ rc the
zero-risk condition (r − rc e)∗ x ≥ ρ − rc ≥ 0 cannot be satisfied.
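For a finite support the zero-risk condition of Theorem 15 is a finite set of linear inequalities and can be tested directly. The following sketch uses a hypothetical support with rc e in the interior of its convex hull; the function names and data are illustrative assumptions. It builds x := (rc − ρ)y for ρ < rc and scans unit directions to illustrate that no x ≠ 0 works for ρ = rc (the condition is then homogeneous in x, so unit vectors suffice).

```python
import math

# Illustrative support (hypothetical data) with r^c e in the interior of its convex hull.
SUPPORT = [(1.00, 1.10), (1.20, 0.95), (0.90, 1.05), (1.15, 1.20)]
RC = 1.05


def excess(r, x):
    """(r - r^c e)* x."""
    return sum((ri - RC) * xi for ri, xi in zip(r, x))


def has_zero_risk(x, rho):
    """Zero-risk condition: (r - r^c e)* x >= rho - r^c for all r in the support."""
    return all(excess(r, x) >= rho - RC for r in SUPPORT)


# rho < r^c: pick y with (r - r^c e)* y >= -1 on the support and set x := (r^c - rho) y.
y = (1.0, 1.0)
rho_low = 1.00
x_low = tuple((RC - rho_low) * yi for yi in y)

# rho = r^c: scan unit directions; none of them should satisfy the condition.
directions = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
              for k in range(360)]
found_at_rc = any(has_zero_risk(d, RC) for d in directions)
print(has_zero_risk(x_low, rho_low), found_at_rc)
```

The direction scan is only a numerical illustration of statement 4 on a grid of 360 directions, not a proof.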
Remarks. Similar arguments show that for ρ > rc (ρ = rc ) a zero-risk portfolio with x ≠ 0 exists
iff Ξ lies in a closed half-space not containing rc e (not containing rc e in its interior), see Ξ′ in
Fig. 8. This is why we need an additional no-arbitrage condition. Although rc e ∈ C would suffice,
we choose the stronger condition rc e ∈ int(C) to ensure a unique riskless solution (100 % cash) for
ρ = rc . Thus x ≠ 0 will produce positive risk for any ρ ≥ rc .
Discussion. Risk vanishes on the lower branch in Problem 9, but for the upper branch we know
only that it is convex; even the optimal portfolios for different ρ > rc may be unrelated. This is
because the actual reward may exceed the shortfall target, resulting in semi-variance half-spaces
far from r̄ and producing asymmetric (or decentral) risk integrals instead of Σ(x). One might
expect such truly nonlinear behavior from any downside risk measure, but it occurs only if the
actual reward may differ from the shortfall target. One might also associate zero risk on the lower
branch with downside risk, but this property occurs for standard risk as well and has nothing to
do with the objective; it is caused by the inequality ρ(x) ≥ ρ or e∗ x + xc ≤ 1.
1.7 Summary
We have discussed various formulations of the classical mean-variance approach to obtain single-
period models that give a qualitatively correct description of risk, particularly for unreasonably
small desired rewards. Positivity constraints and other inequalities have been studied, and down-
side risk models have been analyzed in detail. Thus we have clarified the effects and interaction of
all components in the portfolio optimization problems. In the following we use the results of this
section in developing multi-period models. The goal is to achieve an approximate minimization of
downside risk.
Tradeoff formulations or utility functions will not be considered any more since extra constraints
provide higher modeling flexibility and facilitate a better understanding of subtle details.
and
\[ R(x) = \sigma^2(r_{T+1}^* x_T) = E[(r_{T+1}^* x_T - \rho(x))^2]. \]
Lemma 11 (Siede [23]). The risk is given by
\[ R(x) = E[x_T^* (\Sigma_T + \bar r_T \bar r_T^*) x_T] - \rho(x)^2 = \sum_{j \in L} p_j x_j^* (\Sigma_j + \bar r_j \bar r_j^*) x_j - \rho(x)^2. \]
Proof. By definition,
\[
\begin{aligned}
R(x) &= E[(r_{T+1}^* x_T - \rho(x))^2] = E(x_T^* r_{T+1} r_{T+1}^* x_T) - \rho(x)^2 \\
&= E[E(x_T^* r_{T+1} r_{T+1}^* x_T \mid L_T)] - \rho(x)^2 \\
&= E[x_T^* E(r_{T+1} r_{T+1}^* \mid L_T) x_T] - \rho(x)^2 = E[x_T^* (\Sigma_T + \bar r_T \bar r_T^*) x_T] - \rho(x)^2.
\end{aligned}
\]
The discrete representation follows immediately.
Remarks. Notice that this representation yields a block-diagonal risk matrix because of the sep-
arate term ρ(x)2 . If the Hessian of the latter were included in the risk matrix, it would add a
completely dense rank-1 term: the dyadic product containing all the covariances −pj pk r̄j r̄k∗ be-
tween different leaves j, k ∈ L. Since ρ(x) = ρ is fixed in the optimization problems below, we can
neglect the term ρ2 except when considering the reward-dependence of risk.
Corollary 1. Denote by ρT (xT ) := r̄T∗ xT and RT (xT ) := x∗T ΣT xT the conditional reward and
risk of the final period, respectively, with realizations ρj (xj ) = r̄j∗ xj and Rj (xj ) = x∗j Σj xj on LT .
Then R(x) = Rc (x) + Rd (x) with
\[ R_c(x) := E[R_T(x_T)] = \sum_{j \in L} p_j x_j^* \Sigma_j x_j \]
and
\[ R_d(x) := E[\rho_T(x_T)^2] - \rho(x)^2 = \sum_{j \in L} p_j \rho_j(x_j)^2 - \rho(x)^2. \]
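The decomposition R = Rc + Rd and the representation of Lemma 11 can be verified on a small tree. The sketch below is a hedged illustration with hypothetical data: two leaves, two assets, and a three-point conditional return distribution in each leaf. It works with the scalar quantities ρj = r̄j∗xj and Rj = xj∗Σj xj instead of forming the matrices, using that xj∗(Σj + r̄j r̄j∗)xj is the conditional second moment of the final wealth.

```python
# Hypothetical two-stage data: each leaf j has probability p_j, a decision x_j, and a
# conditional discrete distribution of final-period returns given as (probability, return).
LEAVES = [
    (0.4, (0.6, 0.5), [(0.5, (1.00, 1.10)), (0.3, (1.20, 0.90)), (0.2, (0.95, 1.05))]),
    (0.6, (0.2, 0.7), [(0.25, (1.10, 1.00)), (0.5, (0.90, 1.15)), (0.25, (1.30, 1.05))]),
]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def conditional_moments(x, atoms):
    """Return (rho_j, R_j, second moment) of the final wealth r* x_j given leaf j."""
    mean = sum(q * dot(r, x) for q, r in atoms)          # rho_j = rbar_j* x_j
    second = sum(q * dot(r, x) ** 2 for q, r in atoms)   # x_j* (Sigma_j + rbar_j rbar_j*) x_j
    return mean, second - mean ** 2, second              # R_j = x_j* Sigma_j x_j


moments = [(p, conditional_moments(x, atoms)) for p, x, atoms in LEAVES]
rho = sum(p * m[0] for p, m in moments)

# Direct definition: variance of final wealth over all (leaf, atom) pairs.
risk_direct = sum(p * q * (dot(r, x) - rho) ** 2
                  for p, x, atoms in LEAVES for q, r in atoms)
risk_lemma11 = sum(p * m[2] for p, m in moments) - rho ** 2
risk_c = sum(p * m[1] for p, m in moments)                 # continuous part R_c
risk_d = sum(p * m[0] ** 2 for p, m in moments) - rho ** 2  # discrete part R_d
print(risk_direct, risk_lemma11, risk_c + risk_d)
```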
Let r̃j := pj r̄j and Σ̃j := pj (Σj + r̄j r̄j∗ ) in the leaves, j ∈ L. By assumption A5 below, Σ̃j > 0;
therefore we can define
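\[ \tilde\alpha_j := e^* \tilde\Sigma_j^{-1} e, \qquad \tilde\beta_j := e^* \tilde\Sigma_j^{-1} \tilde r_j, \qquad \tilde\gamma_j := \tilde r_j^* \tilde\Sigma_j^{-1} \tilde r_j, \qquad \tilde\delta_j := \tilde\alpha_j \tilde\gamma_j - \tilde\beta_j^2. \]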
and employ A5 again to define α̃i , . . . , δ̃i in analogy to α̃j , . . . , δ̃j . In the subsequent analysis (and
in the solution algorithm) these quantities will play a similar role as their counterparts in the
leaves, but they do not have the same meaning. In particular, r̄i := r̃i /pi and Σi := Σ̃i /pi − r̄i r̄i∗
are usually not the expectation and covariance matrix of the discrete distribution {rj }j∈S(i) .
Basic assumptions.
A5) ∀j ∈ V : Σ̃j > 0.
A6) ∃j ∈ V : r̃j is not a multiple of e.
Remarks. The role of these conditions is analogous to the single-period case: they ensure strict
convexity and avoid degenerate constraints. Assumption A5 also implies N ≥ n as a technical
requirement on the return discretization in each period. In practice one will usually have N > n,
otherwise the covariance matrices are only positive semidefinite by Lemma 6. Suitable multi-
period discretizations can be generated, e.g., by barycentric approximations [22].
Lemma 12. Under assumptions A5 and A6, the constants α̃j , γ̃j are all positive, the δ̃j are all
nonnegative, and at least one δ̃j is positive.
Proof. Positivity and nonnegativity are proved as in Lemma 1, where δ̃j = 0 iff r̃j , e are linearly
dependent.
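The sign pattern of Lemma 12 can be observed on a single leaf with n = 2 assets. The sketch below is a minimal illustration under stated assumptions: the covariance matrix, probability, and mean returns are hypothetical, and the helper names are not part of the model. It builds Σ̃j = pj(Σj + r̄j r̄j∗) and r̃j = pj r̄j as defined for the leaves and evaluates the four constants, once for a generic r̄j and once for r̄j parallel to e.

```python
E = (1.0, 1.0)


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def bilinear(u, m, v):
    """u* M v for 2-vectors u, v and a 2x2 matrix M."""
    return sum(u[i] * m[i][k] * v[k] for i in range(2) for k in range(2))


def leaf_constants(p, cov, rbar):
    """Return (alpha~, beta~, gamma~, delta~) for one leaf with n = 2 assets."""
    # Sigma~_j = p_j (Sigma_j + rbar_j rbar_j*)
    sigma_t = tuple(tuple(p * (cov[i][k] + rbar[i] * rbar[k]) for k in range(2))
                    for i in range(2))
    r_t = tuple(p * ri for ri in rbar)  # r~_j = p_j rbar_j
    inv = inverse_2x2(sigma_t)
    alpha = bilinear(E, inv, E)
    beta = bilinear(E, inv, r_t)
    gamma = bilinear(r_t, inv, r_t)
    return alpha, beta, gamma, alpha * gamma - beta ** 2


COV = ((0.04, 0.01), (0.01, 0.09))  # hypothetical covariance matrix, positive definite
generic = leaf_constants(0.4, COV, (1.10, 1.05))     # r~_j is not a multiple of e
degenerate = leaf_constants(0.4, COV, (1.10, 1.10))  # r~_j = 0.44 e
print(generic, degenerate)
```

In the degenerate case δ̃j vanishes up to rounding, in line with the statement that δ̃j = 0 iff r̃j and e are linearly dependent.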
Problem 10. The multi-period mean-variance problem (using i ≡ π(j)) reads
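\[
\begin{aligned}
\min_x \quad & \sum_{j \in L} \frac{1}{2} x_j^* \tilde\Sigma_j x_j - \frac{1}{2} \rho^2 \\
\text{s.t.} \quad & e^* x_0 = 1, \\
& e^* x_j = r_j^* x_i \quad \forall j \in V^*, \\
& \sum_{j \in L} \tilde r_j^* x_j = \rho.
\end{aligned}
\]
Its Lagrangian is
\[ L(x, \lambda, \mu; \rho) = \sum_{j \in L} \frac{1}{2} x_j^* \tilde\Sigma_j x_j - \frac{1}{2} \rho^2 - \lambda_0 (e^* x_0 - 1) - \sum_{j \in V^*} \lambda_j (e^* x_j - r_j^* x_i) - \mu \Big( \sum_{j \in L} \tilde r_j^* x_j - \rho \Big). \]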
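\[ \alpha_j := e^* \Sigma_j^{-1} e, \qquad \beta_j := e^* \Sigma_j^{-1} \bar r_j, \qquad \gamma_j := \bar r_j^* \Sigma_j^{-1} \bar r_j, \]
and
\[ \delta_j^c := (r_2^c)^2 \alpha_j - 2 r_2^c \beta_j + \gamma_j = (\bar r_j - r_2^c e)^* \Sigma_j^{-1} (\bar r_j - r_2^c e). \]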
and furthermore r̄0 = r̃0 /p̃0 , Σ0 = Σ̃0 /p̃0 − r̄0 r̄0∗ , and r̃1c := p̃0 rc (not p̃0 r1c ). Using A7 again,
α0 , β0 , γ0 are then defined in analogy to αj , βj , γj . Finally let δ0c := (r1c )2 α0 − 2r1c β0 + γ0 and recall
that t is the current time in j ∈ V .
A7) ∀j ∈ V : Σj > 0.
A9) rc ≠ 0.
Remark. The conditions here are similar to A5 and A6 (or A1 and A3), but in addition we require
nonzero cash returns rtc , t = 1, . . . , T + 1. The opposite case would unnecessarily complicate the
analysis and is not considered.
Lemma 15. Under assumptions A7–A9, the constants αj , γj are all positive, the δjc are nonneg-
ative, and at least one δjc is positive. Moreover, p̃0 ∈ (0, 1].
Proof. Positivity and nonnegativity are obvious, where δjc = 0 iff r̄j = rtc e. Now δjc ≥ 0 implies
pj /(δjc + 1) ∈ (0, pj ] and hence p̃0 ∈ (0, 1].
In the following we use two different formulations for both the reward and the risk, involving
again the conditional final-period risk and return. The latter is now ρT (xT , xcT ) := r̄T∗ xT + rTc +1 xcT
with realizations ρj (xj , xcj ) := r̄j∗ xj + rTc +1 xcj , and the discrete decision vector is x = (xj , xcj )j∈V .
\[ \rho(x) = E[r_{T+1}^* x_T + r_{T+1}^c x_T^c] = E[\bar r_T^* x_T + r_{T+1}^c x_T^c] = E[\rho_T(x_T, x_T^c)]. \]
Problem 11. Using the first formulation for reward and risk in Lemma 16, the multi-period
mean-variance problem with cash reads
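\[
\begin{aligned}
\min_x \quad & \sum_{j \in L} \frac{1}{2}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix}^{*}
\begin{pmatrix} \tilde\Sigma_j & r_{T+1}^c \tilde r_j \\ r_{T+1}^c \tilde r_j^* & r_{T+1}^c \tilde r_j^c \end{pmatrix}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix} - \frac{1}{2} \rho^2 \\
\text{s.t.} \quad & e^* x_0 + x_0^c = 1, \\
& e^* x_j + x_j^c = r_j^* x_i + r_t^c x_i^c \quad \forall j \in V^*, \\
& \sum_{j \in L} (\tilde r_j^* x_j + \tilde r_j^c x_j^c) = \rho.
\end{aligned}
\]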
Its Lagrangian is
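\[
\begin{aligned}
L(x, \lambda, \mu; \rho) = {} & \sum_{j \in L} \frac{1}{2}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix}^{*}
\begin{pmatrix} \tilde\Sigma_j & r_{T+1}^c \tilde r_j \\ r_{T+1}^c \tilde r_j^* & r_{T+1}^c \tilde r_j^c \end{pmatrix}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix} - \frac{1}{2} \rho^2 \\
& - \lambda_0 (e^* x_0 + x_0^c - 1) - \sum_{j \in V^*} \lambda_j (e^* x_j + x_j^c - r_j^* x_i - r_t^c x_i^c) \\
& - \mu \Big( \sum_{j \in L} (\tilde r_j^* x_j + \tilde r_j^c x_j^c) - \rho \Big).
\end{aligned}
\]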
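\[
\begin{aligned}
\text{s.t.} \quad & e^* x_0 + x_0^c = 1, \\
& e^* x_j + x_j^c = r_j^* x_i + r_t^c x_i^c \quad \forall j \in V^*, \\
& \sum_{j \in L} (\tilde r_j^* x_j + \tilde r_j^c x_j^c) = \rho + \theta, \quad \theta \ge 0.
\end{aligned}
\]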
Theorem 18. Problem 12 has a unique primal-dual solution. For ρ ≥ rc one obtains θ = 0,
η = μ − ρ ≥ 0, and otherwise the same solution as in Problem 11. For ρ ≤ rc one obtains η = 0
and θ = rc − ρ ≥ 0, giving reward ρ + θ = rc and the associated riskless solution of Problem 11.
(At ρ = rc both cases coincide.)
Proof. See appendix.
Its Lagrangian is
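\[
\begin{aligned}
L(x, \lambda, \eta, \mu; \rho) = {} & \sum_{j \in L} \frac{1}{2}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix}^{*}
\begin{pmatrix} \tilde\Sigma_j & r_{T+1}^c \tilde r_j \\ r_{T+1}^c \tilde r_j^* & r_{T+1}^c \tilde r_j^c \end{pmatrix}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix} - \frac{1}{2} \rho^2 \\
& - \lambda_0 (e^* x_0 + x_0^c + x_0^l - 1) - \sum_{j \in V^*} \lambda_j (e^* x_j + x_j^c + x_j^l - r_j^* x_i - r_t^c x_i^c) \\
& - \sum_{j \in V} \eta_j x_j^l - \mu \Big( \sum_{j \in L} (\tilde r_j^* x_j + \tilde r_j^c x_j^c) - \rho \Big).
\end{aligned}
\]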
Theorem 20 in the appendix characterizes optimal solutions in a way similar to the previous
theorems. However, the loss variables xlj still appear in the formulae, and the case distinctions
are more involved than in the single-period case (cf. Theorem 8) or in Problem 12. Later we will
discuss this in part; for the time being, the following results provide more insight.
Lemma 17 (Arbitrage in Problem 13). If assumption A11 is strictly violated, r1c e ∉ C0 , then
Problem 13 has a riskless solution for arbitrary ρ.
Proof. Since C0 is convex, y ∈ Rn exists so that (r − r1c e)∗ y ≥ 1 for r ∈ C0 . Given ρ ∈ R, let
x0 := (ρ/r2c − r1c )y, xc0 := 1 − e∗ x0 , and xl0 := 0 to obtain wj = (rj − r1c e)∗ x0 + r1c ≥ ρ/r2c for all
j ∈ L. Now let xj := 0, xcj := ρ/r2c , and xlj := wj − ρ/r2c ≥ 0.
For the analysis of actual riskless solutions consider in Rn the family of closed convex polyhedra
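\[ Z_0(\rho) := \Big\{ x : \Big( \frac{r_j}{r_1^c} - e \Big)^{*} x \ge \frac{\rho}{r^c} - 1 \ \ \forall j \in L \Big\} = \bigcap_{j \in L} \bar H\Big( e - \frac{r_j}{r_1^c}, \, 1 - \frac{\rho}{r^c} \Big). \]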
Lemma 18. Z0 (ρ) is nonempty (containing the origin) iff ρ ≤ rc . For ρ < rc , 0 ∈ int(Z0 (ρ)) and
\[ Z_0(\rho) = \Big( 1 - \frac{\rho}{r^c} \Big) Z_0(0). \]
Each Z0 (ρ) is bounded and hence compact. In particular, Z0 (rc ) = {0}.
Remark. More generally one can show that the following four conditions are equivalent under the
weaker no-arbitrage condition r1c e ∈ C0 .
1. r1c e ∈ int(C0 ).
2. Z0 (rc ) = {0}.
3. ∃ρ ∈ (−∞, rc ]: Z0 (ρ) is bounded.
4. ∀ρ ∈ (−∞, rc ]: Z0 (ρ) is bounded.
As in Theorem 15 we choose the stronger condition r1c e ∈ int(C0 ) to avoid unnecessary technical
complications and ambiguities.
Proof. By A11 there exists a convex combination Σj∈L ξj (rj − r1c e) = 0. Every x ∈ Z0 (ρ) then
satisfies
\[ 0 = \sum_{j \in L} \xi_j \Big( \frac{r_j}{r_1^c} - e \Big)^{*} x \ge \frac{\rho}{r^c} - 1. \]
Hence Z0 (ρ) = ∅ if ρ > rc and 0 ∈ Z0 (ρ) if ρ ≤ rc . Now let ρ < rc . For c > 0,
\[ \Big( \frac{r_j}{r_1^c} - e \Big)^{*} x \ge -1 \iff \Big( \frac{r_j}{r_1^c} - e \Big)^{*} cx \ge -c, \]
showing that x ∈ Z0 (0) iff (1 − ρ/rc )x ∈ Z0 (ρ). To verify 0 ∈ int(Z0 (ρ)), observe that Z0 (ρ)
contains the ball around the origin with radius
\[ \Big( 1 - \frac{\rho}{r^c} \Big) \Big/ \max_{j \in L} \Big\| \frac{r_j}{r_1^c} - e \Big\|_2 > 0, \]
cf. Fig. 10. Condition 1) in the remark holds by A11. We prove 2), 3), and 4) in natural order.
Assume first that 2) does not hold and let y ∈ Z0 (rc ) \ {0}. Then (rj /r1c − e)∗ y ≥ 0 for all j ∈ L
and hence (r/r1c − e)∗ y ≥ 0 for r ∈ C0 , i.e., C0 ⊆ r1c e + H̄(−y, 0). This yields the contradiction
r1c e ∉ int(C0 ) and proves 2), which obviously implies 3). Assume now that 4) does not hold, that
is, Z0 (ρ) is unbounded for some ρ ≤ rc . Then, since Z0 (ρ) is convex and 0 ∈ Z0 (ρ), y ≠ 0 exists
so that cy ∈ Z0 (ρ) for all c > 0, that is, (rj /r1c − e)∗ cy ≥ ρ/rc − 1 for j ∈ L. This implies
(rj /r1c − e)∗ cy ≥ 0 and hence (rj /r1c − e)∗ cy ≥ ρ/rc − 1 for c > 0, j ∈ L, and ρ ≤ rc . Thus
3) implies 4): either none of the polyhedra is bounded or all of them.
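The scaling relation Z0(ρ) = (1 − ρ/rc) Z0(0) and the enclosed ball from the proof can be probed with a simple membership test. The sketch below is illustrative: the first-period returns and cash returns are hypothetical values chosen so that r1c e lies in the interior of C0, the function names are assumptions, and rc is taken as the product r2c r1c for two periods.

```python
import math

R1C, R2C = 1.04, 1.05
RC = R1C * R2C  # two periods: r^c is the product of the cash returns
FIRST_PERIOD_RETURNS = [(1.00, 1.10), (1.20, 0.95), (0.90, 1.05), (1.15, 1.20)]  # hypothetical


def in_z0(x, rho, tol=1e-12):
    """Test (r_j / r_1^c - e)* x >= rho / r^c - 1 for all scenarios j."""
    bound = rho / RC - 1.0
    return all(sum((ri / R1C - 1.0) * xi for ri, xi in zip(rj, x)) >= bound - tol
               for rj in FIRST_PERIOD_RETURNS)


def ball_radius(rho):
    """Radius (1 - rho / r^c) / max_j ||r_j / r_1^c - e||_2 of the enclosed ball."""
    largest = max(math.sqrt(sum((ri / R1C - 1.0) ** 2 for ri in rj))
                  for rj in FIRST_PERIOD_RETURNS)
    return (1.0 - rho / RC) / largest


rho = 0.5
factor = 1.0 - rho / RC
inside, outside = (1.0, 2.0), (20.0, 0.0)  # a point of Z0(0) and a point outside Z0(0)
scaled_inside = tuple(factor * v for v in inside)
scaled_outside = tuple(factor * v for v in outside)
on_ball = (ball_radius(rho), 0.0)
print(in_z0(scaled_inside, rho), in_z0(scaled_outside, rho), in_z0(on_ball, rho))
```

The two sample points only spot-check the equivalence x ∈ Z0(0) iff (1 − ρ/rc)x ∈ Z0(ρ); they are chosen away from the boundary to avoid rounding ambiguities.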
Figure 10: First-period returns {rj }, r1c e ∈ R2 and associated zero-risk polyhedra Z0 (ρ) with
enclosed balls for three values of ρ (black = rc , dark grey, medium grey). Left: compact case,
r1c e ∈ int(C0 ). Right: unbounded case, r1c e ∈ ∂C0 .
Remarks. Strictly speaking, the riskless solutions for ρ < rc form an (n + 1)-dimensional cone
whose projection on the (x0 , xl0 )-space has Z0 (ρ) as base, see Fig. 11. Similarly, the projection
for ρ = rc is Z0 (rc ) × {0}, and in both cases the remaining variables are uniquely determined by
feasibility. (If r1c e ∈ ∂C0 , then x0 ∈ Z0 (rc ), xc0 , and xlj , j ∈ L, would not be unique for ρ = rc .)
Proof. The condition in 1 is obviously sufficient for zero risk (cf. Lemma 17). Let p := (pj )j∈L
and q := (ρj (xj , xcj ))j∈L . Then
\[ R_d(x) = \sum_{j \in L} p_j q_j^2 - \Big( \sum_{j \in L} p_j q_j \Big)^2 = q^* [\mathrm{Diag}(p) - p p^*] q. \]
By Lemma 19, this quadratic form vanishes if q = ρe and is positive else. Moreover, Rc (x) ≥ 0
for all x, and Rc (x) = 0 iff xj = 0 for all j ∈ L. This proves statement 1. The zero-risk condition
requires wj ≡ rj∗ x0 + r1c xc0 ≥ ρ/r2c in all scenarios. If this implied restriction on the root variables
holds, then the leaf variables are uniquely determined by statement 1 and xlj = wj −xcj ≥ 0, proving
statement 2. Since xc0 = 1 − e∗ x0 − xl0 , the inequality wj ≥ ρ/r2c is equivalent to x0 ∈ Z0 (ρ + rc xl0 ).
By Lemma 18 we conclude that x is optimal for ρ < rc iff xl0 ∈ [0, 1 − ρ/rc ] and x0 ∈ Z0 (ρ + rc xl0 ),
see Fig. 11. This proves statement 3. Statements 4 and 5 follow similarly: for ρ = rc the zero-risk
cone degenerates (only its vertex remains), and for ρ > rc it becomes empty.
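The quadratic form used for Rd in the proof is the variance of the scenario returns q under the leaf probabilities p. The following sketch, with hypothetical probabilities and returns, compares both expressions and illustrates that the form vanishes for q = ρe and is positive for a non-constant q.

```python
P = [0.2, 0.6, 0.2]  # hypothetical leaf probabilities, summing to one


def discrete_risk(q):
    """sum_j p_j q_j^2 - (sum_j p_j q_j)^2."""
    second_moment = sum(pj * qj ** 2 for pj, qj in zip(P, q))
    mean = sum(pj * qj for pj, qj in zip(P, q))
    return second_moment - mean ** 2


def quadratic_form(q):
    """q* [Diag(p) - p p*] q, written out entry by entry."""
    n = len(P)
    return sum(q[i] * ((P[i] if i == k else 0.0) - P[i] * P[k]) * q[k]
               for i in range(n) for k in range(n))


equal_returns = [1.1, 1.1, 1.1]   # q = rho e with rho = 1.1
spread_returns = [1.0, 1.1, 1.3]  # non-constant scenario returns
print(quadratic_form(equal_returns), quadratic_form(spread_returns))
```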
Lemma 19. Let p ∈ RN , p > 0, e∗ p = 1, and f (q) := ½ q ∗ [Diag(p) − pp∗ ]q. Then
Figure 11: Zero-risk cones over Z0 (ρ) for three values of ρ (black = rc , dark grey, light grey);
compact case. (Here Z0 (ρ) is a segment of the x0 axis.)
shows that surplus money is actually necessary for positive loss variables: if there is no surplus
money, then positive amounts xlj are better invested in cash to reduce risk.
Lemma 20. Let x be an optimal solution of Problem 13 for ρ > rc .
1. If xj ≠ 0 in a scenario j ∈ L, then xlj = 0.
2. xl0 = 0.
Remark. Statement 1 of the lemma is also obtained from Theorem 20, and moreover the plausible
fact that xlj = 0 in scenario j if ρj < ρ. However, the proof of that theorem is not constructive.
Proof. Assume xlj > 0. We modify the local variables in scenario j to construct a better feasible
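solution. Let a := (r̄j − r2c e)∗ xj , define ε ∈ (0, 1] as
\[ \varepsilon := \begin{cases} \min(1, r_2^c x_j^l / a) & \text{if } a > 0, \\ 1 & \text{else,} \end{cases} \]
and let
\[ (\hat x_j, \hat x_j^c, \hat x_j^l) := (x_j, x_j^c, x_j^l) - \varepsilon \, (x_j, \, -\bar r_j^* x_j / r_2^c, \, a / r_2^c). \]
Then x̂lj ≥ 0, ŵj = wj , ρ̂j = ρj , and the risk is reduced by the positive amount [1 − (1 − ε)2 ]x∗j Σj xj .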
This proves statement 1. For xl0 > 0 we modify xc0 and all xlj . Let (x̂0 , x̂c0 , x̂l0 ) := (x0 , xc0 +xl0 , 0) and
(x̂j , x̂cj , x̂lj ) := (xj , xcj , xlj + r1c xl0 ) for j ∈ L. Clearly, x̂ is feasible, R(x̂) = R(x), and x̂lj ≥ r1c xl0 > 0
in all scenarios. Thus, by statement 1, x̂ and consequently x are not optimal.
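The local modification in the proof of statement 1 can be replayed numerically. The sketch below is a hedged illustration for one scenario with n = 2 assets: the values of r̄j, Σj, r2c and the starting portfolio are hypothetical, and the step size is called eps in the code. It applies the update and reports the capital wj = e∗xj + xcj + xlj, the scenario return ρj = r̄j∗xj + r2c xcj, and the local risk xj∗Σj xj before and after.

```python
R2C = 1.05
RBAR = (1.10, 1.08)                   # conditional mean return in scenario j (hypothetical)
SIGMA = ((0.04, 0.01), (0.01, 0.09))  # conditional covariance matrix (hypothetical)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def local_risk(x):
    """x_j* Sigma_j x_j."""
    return sum(x[i] * SIGMA[i][k] * x[k] for i in range(2) for k in range(2))


def capital(x, xc, xl):
    """w_j = e* x_j + x_j^c + x_j^l."""
    return sum(x) + xc + xl


def scenario_return(x, xc):
    """rho_j = rbar_j* x_j + r_2^c x_j^c."""
    return dot(RBAR, x) + R2C * xc


def shift_to_cash(x, xc, xl):
    """Apply the modification from the proof of Lemma 20; return the new triple and eps."""
    a = dot([m - R2C for m in RBAR], x)
    eps = min(1.0, R2C * xl / a) if a > 0 else 1.0
    new_x = tuple(xi - eps * xi for xi in x)
    new_xc = xc + eps * dot(RBAR, x) / R2C
    new_xl = xl - eps * a / R2C
    return new_x, new_xc, new_xl, eps


x, xc, xl = (0.5, 0.3), 0.4, 0.02
x_hat, xc_hat, xl_hat, eps = shift_to_cash(x, xc, xl)
print(eps, capital(x_hat, xc_hat, xl_hat), scenario_return(x_hat, xc_hat), local_risk(x_hat))
```

With these numbers the step is limited by the available loss variable (eps < 1), so the loss is reduced to zero up to rounding while capital and scenario return are unchanged.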
Discussion. As in the one-period case, the 100 % cash solution plays a key role: it has the largest
reward among all riskless solutions. But now the solutions become degenerate for small rewards
ρ < rc , even if loss is allowed only in the second period, i.e., if xl0 = 0 is fixed. This does not
happen in Problem 12 (the modification of Problem 11 with bounded reward ρ(x) ≥ ρ), which
behaves precisely as the corresponding single-period problem. Of course, the degeneracy occurs
only for practically irrelevant rewards, and even then it can easily be avoided. (One may choose
the vertex of the zero-risk cone, i.e., xl0 = 1 − ρ/rc . This removes any surplus money immediately
and gives a unique solution.)
The only case of practical interest is ρ > rc , when the solution of Problem 11 remains optimal
in Problem 12. Why do we prefer the loss formulation, Problem 13? Obviously the risk cannot be
higher than in Problem 11 since every optimal solution of the latter remains feasible in the former
problem. Actually it turns out that the loss formulation gives strictly lower risk in most cases,
i.e., it allows better solutions than Problem 12. To develop a geometric understanding for this
observation, we compare Problems 11 and 13 in a simplified situation. A reformulation eliminates
all the budget equations and most of the portfolio variables in favor of the individual scenario
returns. The two risk terms Rc , Rd are then used to explain in which cases (and how) an optimal
solution of Problem 11 can be modified to give a better feasible solution of Problem 13.
Problem 14. Consider as example a portfolio consisting of just one risky asset and cash, using
the second formulation of reward and risk in Lemma 16. Include loss assets in the leaves but not
in the root, and write the problem with scenario returns ρj as additional variables,
\[ \min_{x, \{\rho_j\}} \quad \frac{1}{2} \sum_{j \in L} p_j [\Sigma_j x_j^2 + \rho_j^2] - \frac{1}{2} \rho^2 \]
Solving for xj , inserting it into the objective and using φj yields Problem (3). Differentiating the
Lagrangian with respect to the returns ρj gives optimality conditions
φj (ρj − rc − ψj x0 + θj ) + ρj = μ,
from which one obtains the expression (4). This derivation holds for the case without loss, too:
one just has to set xlj = θj = 0 everywhere. Finally, when the problem transformations above are
applied to the full primal-dual system of optimality conditions, it is observed that μ has the same
value in both problems.
Discussion. Consider the case θj = 0 first (Problem 11). We have to choose optimal values for x0
and for the scenario returns ρj so that their mean equals ρ. Defining dj := rc + ψj x0 gives the
continuous risk part
\[ R_c = \sum_{j \in L} p_j \varphi_j (\rho_j - d_j)^2 \ge 0. \]
This is a weighted average of scenario risks φj (ρj −dj )2 , each of which defines a parabola character-
ized by its offset dj and curvature φj . Both magnitude and distance of the offsets are influenced
by the common “spread factor” x0 : they all coincide with rc if x0 = 0 (100 % cash), whereas
x0 = 1 (no cash) yields the discrete distribution dj = r2c rj with mean d := r2c E({rj }). Clearly, the
continuous risk Rc is small when all the scenario returns are close to their respective offsets, while
the discrete risk Rd is small when they are close to each other. Thus, loosely speaking, x0 has the
job to balance the scenarios by adjusting d (close to ρ) without spreading the offsets too much.
Only one detail changes when θj > 0 is allowed (Problem 13): each offset dj is replaced by
dj − θj , that is, the parabolas may be shifted to the left separately in each scenario.
We can now explain the risk reduction mechanism. Consider an optimal solution of Problem 14
without loss. Typically there will be “good” scenarios (fortunate for the investor, with large offsets
dj > ρ) and “bad” scenarios (unfortunate for the investor, with small offsets dj < ρ). Moreover,
one expects that some of the scenario returns will lie on their local upper branch (ρj > dj ) and
some on their local lower branch (ρj < dj ). If in a good scenario the optimal return lies on the
lower branch, ρ < ρj < dj , then its contribution to the risk is canceled by shifting the parabola to
the left, θj := dj − ρj > 0. Clearly, nothing else changes, so that this gives a suboptimal feasible
solution of Problem 13 which is better than the optimal solution of Problem 11. The mechanism
here is precisely the same as in a single period (Problem 5) except that it now occurs locally in
individual scenarios.
Of course, the optimal solution of Problem 13 will readjust all the variables globally. Now x0
still has the job to balance scenarios by adjusting d close to ρ, but only without spreading the
small offsets dj < ρ too much. The large offsets dj > ρ produce surplus money in all sufficiently
good scenarios. These scenarios do not contribute to Rc , and all their returns are equal (and
slightly larger than ρ to balance the bad ones). This is proved in Theorem 20: surplus money
xlj > 0 implies r2c wj > μ and ρj = μ ≥ ρ. It means that a jump discontinuity is produced in
the distribution of final wealth, to which each of the riskless scenarios contributes a fraction. It
also means that the bad scenarios dominate the resulting risk: again we have an approximate
minimization of downside risk.
When looking for an instance of Problem 11 with ρ < ρj < dj one might try Problem 3 with
only two scenarios. However, the following results show that the effect cannot occur if n = 1 and
N = 2: in that case (with ρ > rc ) the single degree of freedom in x0 is sufficient to balance the
scenarios well. After proving this we give an example of the risk reduction with n = 1 and N = 3.
A slight modification finally shows that risk reduction can occur even in bad scenarios, that is,
optimal solutions of Problem 11 may have ρj < dj < ρ.
For the comparison of signs we define the equivalence relation
Figure 12: Optimal scenario returns (left; curves mu, rho1, rho2, rho3) and investments (right; curves cash1, asset1, cash2, asset2, cash3, asset3) for the small example problem.
Asset values unconstrained (top) and nonnegative (bottom; r̄max = 1.21).
Remark. Notice that sj depends only on the stochastic data (the return distribution), and in
particular not on ρ.
Corollary 2. In Problem 14 with two scenarios and without loss assets, both ρ1 and ρ2 lie on
their respective upper branches if ρ > rc .
Proof. Assumption A11 yields r1 < r1c < r2 , and the sums sj consist of one single term each since
N = 2. This implies that s1 , s2 are both positive.
Remark. A weak version of this result holds for r1c ∈ C0 . Then r1 ≤ r1c ≤ r2 , and s1 , s2 are both
nonnegative. Even that weaker version would prevent any beneficial shift θj > 0.
The example with N = 3 is simple: let (p1 , p2 , p3 ) = (0.2, 0.6, 0.2), (r1 , r2 , r3 ) = (1.0, 1.1, 1.2),
r1c = 1.05, Σ1 = Σ2 > 0, and j = 3. Then δ1c = 0.052 /Σ1 = δ2c , and hence
\[ s_3 = \sum_{k=1}^{2} \frac{p_k}{\delta_k^c + 1} (r_k - r_3)(r_k - r_1^c) = \frac{0.2(-0.2)(-0.05) + 0.6(-0.1)(0.05)}{\delta_1^c + 1} < 0. \]
The corollary proves that ρ3 lies on the lower branch even though r1c ∈ int(C0 ); thus introducing
the loss asset xl3 reduces the optimal risk. (The values of Σ3 , r2c , and r̄1 , r̄2 , r̄3 do not matter here
as long as A7, A8, A10, and A11 are satisfied. The reader may check that this holds for Σj = 1,
r2c = 1.05, and r̄j = 1.1, j ∈ {1, 2, 3}.) Setting instead r1c := 1.15 and Σ2 = Σ3 > 0 yields s1 < 0.
This shows that risk reduction can also occur in bad scenarios—at least for unreasonably large rtc .
Results for this small example problem (with loss assets included) are displayed in Fig. 12.
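The signs in this example can be recomputed in a few lines. The sketch below assumes that sj has the same form as the displayed s3 with r3 replaced by rj, and it drops the factor 1/(δkc + 1), which is legitimate only when the δkc of the contributing scenarios coincide, as in both variants of the example. The function name and the two-scenario data are hypothetical.

```python
def sign_sum(j, probs, returns, r1c):
    """Sum over k != j of p_k (r_k - r_j)(r_k - r_1^c); the common positive
    factor 1/(delta_k^c + 1) is omitted because it does not affect the sign."""
    return sum(probs[k] * (returns[k] - returns[j]) * (returns[k] - r1c)
               for k in range(len(probs)) if k != j)


PROBS = (0.2, 0.6, 0.2)
RETURNS = (1.0, 1.1, 1.2)

s3_numerator = sign_sum(2, PROBS, RETURNS, r1c=1.05)  # scenario j = 3 (index 2)
s1_numerator = sign_sum(0, PROBS, RETURNS, r1c=1.15)  # scenario j = 1 with r_1^c := 1.15

# Two scenarios with r_1 < r_1^c < r_2, as in Corollary 2: both sums are positive.
two_scenarios = [sign_sum(j, (0.5, 0.5), (1.0, 1.1), r1c=1.05) for j in range(2)]
print(s3_numerator, s1_numerator, two_scenarios)
```

Both three-scenario sums evaluate to −0.001 up to rounding, matching the numerator 0.2(−0.2)(−0.05) + 0.6(−0.1)(0.05) in the text.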
(As before the root can be excluded by considering only “large” desired rewards ρ > rc , with
rc := rTc +1 . . . r1c .) When there is surplus money, the whole scenario subtree rooted in j does not
contribute to the risk provided that sufficient node capitals wk are maintained. Implicitly this
condition defines a zero-risk polyhedron whose geometry is determined by the subtree’s discrete
return distribution. Of course, if j ∈ LT −1 , then one gets a cone similar to the one considered
before, but depending on wj . These observations imply that the generic optimal solution is highly
degenerate: any reasonable return discretization will include good and bad partial scenarios on
each level, and surplus money will almost always appear somewhere in the tree. However, even in
the zero-risk subtrees all leaf variables are again uniquely determined by the lower-level variables.
Obviously w^min_{t+s} is the sufficient capital for a node k ∈ Lt+s in the zero-risk subtree, and
the easiest way to maintain that amount is to invest precisely w^min_t in cash and remove the rest
(“invest” in xlj ). Thus all surplus money is taken out immediately in the root of the subtree and
each remaining node has 100 % cash.
3 Conclusions
We have seen that multi-period mean-variance problems behave similarly to their single-period
counterparts in many respects. Specifically, overperformance can be avoided by allowing the
removal of capital. Small desired rewards ρ ≤ rc are met exactly at zero risk. In that case all the
capital is either invested in cash or removed; thus minimizing the variance is trivially equivalent to
minimizing the semi-variance (or any other downside risk measure) without removing capital but
allowing the desired reward to be exceeded. That is, with x = (xj , xcj )j∈V and in abbreviated notation,
the problem
(Of course, the solutions of the second problem differ insofar as surplus money is invested in
cash instead of being removed.) For moderate values ρ > rc one cannot avoid overperformance
completely, but in effect the first problem still tends to minimize the semi-variance. More precisely,
the discrete part Rd approximates its downside version due to the existence of zero-risk subtrees.
The quality of that approximation decreases as ρ increases so that for large values the risk measure
becomes a blending of variance and semi-variance. Note that there is no such gradual process in the
single-period case, but there is a close similarity of single-period downside risk (Theorem 15) and
multi-period zero-risk polyhedra (Lemma 18). We may conclude that Problem 13 is a reasonable
multi-period model for an investor who wishes to minimize the semi-variance rather than the
variance of final wealth.
The previous comparison also gives some hints how an optimal policy of Problem 13 should
be interpreted. Again, positive values of xlj do not suggest burning that amount. They indicate
the presence of surplus money which the investor may spend immediately without the risk of
missing her goal, or which she may invest in cash to obtain a riskless extra profit. Of course, she
may also consume part of the surplus and invest the rest. Thus, if the investor implements the
optimal policy over the full planning horizon, she will approximately minimize the risk of ending
up with less than the desired amount, regardless of her choice. Interestingly, the second alternative
(investing) amounts to a single-period strategy with predetermined intermediate decisions, which
may be useful when the investor cannot react to the market until the end of her planning horizon,
or for some reason does not wish to do so.
However, it should be noted that the problem under consideration is not time-invariant due
to the reward condition. That condition involves an expectation over all scenarios, that is, over
the potential futures at t = 0. But at t = 1 most of these potential futures become impossible,
whichever scenario realizes. The terminal condition ρ(x) = ρ or ρ(x) ≥ ρ can usually not be
satisfied when the restricted expectation over the subtree is taken—unless it happens to be a zero-
risk subtree. Therefore only the immediate decision will be of interest for the typical investor.
Rather than following the original future policy, she will adjust the reward and solve the problem
anew for each decision. Of course, the investor may also build an extended model after each period
in pursuing a moving horizon technique.
In any case it seems appropriate to consider all riskless strategies (in addition to the efficient
ones) as reasonable choices in multi-period decision models. This does no harm since it includes
all the standard alternatives, but it opens up new possibilities like the trick described above.
Let us finally point out two issues that might be interesting subjects of future research. First,
the model presented here does not include any consumption preferences, although one can easily
specify hard constraints (exact, minimal, or maximal consumption) through a cash flow. However,
it is not clear how one should incorporate (soft) preferences and how the result would be related
to long-term models based on utility of consumption. Second, the multi-period setting enables the
investor to control higher moments of her distribution of final wealth, at least to some extent.
How would risk measures involving skewness, for example, behave in the context of our model?
Appendix
The appendix contains some proofs and a theorem that would disrupt the line of thought in the
main body of the multi-period section.
Proof of Theorem 16. The system of optimality conditions (for two periods) can be written
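\[
\begin{pmatrix}
0 & & & & e & -r_1 & \cdots & -r_N & 0 \\
& \tilde\Sigma_1 & & & & e & & & \tilde r_1 \\
& & \ddots & & & & \ddots & & \vdots \\
& & & \tilde\Sigma_N & & & & e & \tilde r_N \\
e^* & & & & & & & & \\
-r_1^* & e^* & & & & & & & \\
\vdots & & \ddots & & & & & & \\
-r_N^* & & & e^* & & & & & \\
0 & \tilde r_1^* & \cdots & \tilde r_N^* & & & & &
\end{pmatrix}
\begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_N \\ -\lambda_0 \\ -\lambda_1 \\ \vdots \\ -\lambda_N \\ -\mu \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \\ \rho \end{pmatrix}.
\]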
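For $j \in L$, $x_j = \tilde\Sigma_j^{-1}(\lambda_j e + \mu\tilde r_j)$ is immediately obtained from the $j$-th dual feasibility condition.
Substitution into the budget equation for $x_j$ and into dual feasibility condition 0, respectively,
yields $\lambda_j = (r_j^* x_0 - \mu\tilde\beta_j)/\tilde\alpha_j$ and
\[
0 = -\lambda_0 e + \sum_{j\in L} \lambda_j r_j
  = -\lambda_0 e + \sum_{j\in L} \frac{r_j^* x_0 - \mu\tilde\beta_j}{\tilde\alpha_j}\, r_j
  = -\lambda_0 e + \tilde\Sigma_0 x_0 - \mu\tilde r_0,
\]
giving $x_0 = \tilde\Sigma_0^{-1}(\lambda_0 e + \mu\tilde r_0)$. The budget equation for $x_0$ now reads
\[
1 = e^* x_0 = e^* \tilde\Sigma_0^{-1}(\lambda_0 e + \mu\tilde r_0) = \lambda_0\tilde\alpha_0 + \mu\tilde\beta_0,
\]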
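yielding $\lambda_0 = (1 - \mu\tilde\beta_0)/\tilde\alpha_0$. From the reward equation one finally obtains
\[
\begin{aligned}
\rho &= \sum_{j\in L} \tilde r_j^* x_j
      = \sum_{j\in L} (\lambda_j\tilde\beta_j + \mu\tilde\gamma_j)
      = \sum_{j\in L} \left( \frac{\tilde\beta_j}{\tilde\alpha_j}\, r_j^* x_0 - \mu\frac{\tilde\beta_j^2}{\tilde\alpha_j} + \mu\tilde\gamma_j \right) \\
     &= \tilde r_0^* x_0 + \mu \sum_{j\in L} \frac{\tilde\delta_j}{\tilde\alpha_j}
      = \lambda_0\tilde\beta_0 + \mu\tilde\gamma_0 + \mu \sum_{j\in L} \frac{\tilde\delta_j}{\tilde\alpha_j}.
\end{aligned}
\]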
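The last two relations determine the multipliers once the tilde quantities are known. The following minimal sketch eliminates λ0 from the budget relation λ0 α̃0 + μ β̃0 = 1 and the reward relation above, and then checks both numerically. It assumes δ̃ = α̃γ̃ − β̃² at every node, as suggested by the step from γ̃j − β̃j²/α̃j to δ̃j/α̃j; the closed form for μ in the code is this elimination and is not quoted from the paper. The function name and all numerical values are hypothetical.

```python
def solve_multipliers(rho, root, leaves):
    """Return (mu, lambda0) for a two-period tree.

    root and leaves are (alpha, beta, gamma) triples of the tilde quantities.
    Assumption: delta = alpha*gamma - beta**2 at every node.
    """
    a0, b0, g0 = root
    # Coefficient of mu after eliminating lambda0 from the reward equation.
    denom = (a0 * g0 - b0 ** 2) / a0 + sum((a * g - b ** 2) / a for a, b, g in leaves)
    if denom == 0.0:
        raise ValueError("denominator vanishes: every delta is zero")
    mu = (rho - b0 / a0) / denom
    lam0 = (1.0 - mu * b0) / a0
    return mu, lam0

# Illustrative (assumed) values, chosen so that every delta is positive.
root = (2.0, 2.2, 2.5)
leaves = [(1.0, 1.1, 1.3), (1.5, 1.6, 1.8)]
rho = 1.2

mu, lam0 = solve_multipliers(rho, root, leaves)

# Residuals of the budget and reward equations; both should be close to zero.
a0, b0, g0 = root
leaf_sum = sum((a * g - b ** 2) / a for a, b, g in leaves)
budget_residual = lam0 * a0 + mu * b0 - 1.0
reward_residual = lam0 * b0 + mu * g0 + mu * leaf_sum - rho
print(mu, lam0, budget_residual, reward_residual)
```

The guard on `denom` mirrors the role of the nondegeneracy assumption: if every δ̃ vanishes, the reward equation no longer determines μ.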
(Here we need assumption A6: the denominator vanishes if δ̃j = 0 for all j ∈ V .) As to the global
minimum, we have
and similarly
Subtracting $\rho^2$ yields a risk expression of the form $\sigma^2(\rho) = s + (\rho - c)^2/d - \rho^2$. Since the optimal
portfolio $x$ is an affine function of $\rho$ and $R$ is convex quadratic, the efficient frontier is either
strictly convex (iff $d < 1$), or $\sigma^2(\rho) \equiv 0$ (iff $d = 1$ and $c = s = 0$). But $s = \tilde\alpha_0^{-1} > 0$ by Lemma 12;
therefore $\sigma^2(\rho)$ has the global minimum $s - c^2/(1 - d)$ at $\hat\rho = c/(1 - d)$, as stated.
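The location of the minimum can be checked numerically. The sketch below evaluates the risk expression σ²(ρ) = s + (ρ − c)²/d − ρ² for assumed parameter values with s > 0 and 0 < d < 1 (the numbers are illustrative only) and confirms that ρ̂ = c/(1 − d) is a strict local minimizer, as strict convexity requires.

```python
def sigma2(rho, s, c, d):
    """Risk expression sigma^2(rho) = s + (rho - c)^2 / d - rho^2."""
    return s + (rho - c) ** 2 / d - rho ** 2

# Illustrative (assumed) parameters with s > 0 and 0 < d < 1.
s, c, d = 0.5, 0.4, 0.6

rho_hat = c / (1.0 - d)               # stationary point of sigma2
sigma2_min = sigma2(rho_hat, s, c, d)

# For d < 1 the second derivative 2/d - 2 is positive, so the risk must
# be strictly larger on both sides of the stationary point.
eps = 1e-3
is_minimum = all(sigma2(rho_hat + h, s, c, d) > sigma2_min for h in (-eps, eps))

print(rho_hat, sigma2_min, is_minimum)
```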
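\[
x_j^c = \frac{1}{r_2^c}\left( -\bar r_j^* x_j + \frac{\lambda_j}{\tilde r_j^c} + \mu \right). \tag{5}
\]
This yields $x_j$ and, by substitution into (5), $x_j^c$:
\[
x_j = -\frac{\lambda_j}{\tilde r_j^c}\,\Sigma_j^{-1}(\bar r_j - r_2^c e), \qquad
x_j^c = \frac{\lambda_j}{r_2^c \tilde r_j^c}\,(\gamma_j - r_2^c\beta_j + 1) + \frac{\mu}{r_2^c}.
\]
The budget equation for node $j$ then reads
\[
r_j^* x_0 + r_1^c x_0^c = e^* x_j + x_j^c
 = -\frac{\lambda_j}{\tilde r_j^c}\,(\beta_j - r_2^c\alpha_j) + \frac{\lambda_j}{r_2^c \tilde r_j^c}\,(\gamma_j - r_2^c\beta_j + 1) + \frac{\mu}{r_2^c}
 = \frac{\lambda_j}{r_2^c}\,\frac{\delta_j^c + 1}{\tilde r_j^c} + \frac{\mu}{r_2^c},
\]
which gives
\[
\lambda_j = \frac{\tilde r_j^c}{\delta_j^c + 1}\left[ r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu \right].
\]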
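Now we can proceed with the root variables. The condition $\partial L/\partial x_0^c = 0$ reads
\[
0 = -\lambda_0 + r_1^c \sum_{j\in L} \frac{\tilde r_j^c}{\delta_j^c + 1}\left[ r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu \right]
  = -\lambda_0 + r^c r_2^c\, \tilde r_0^* x_0 + r^c \tilde r_1^c x_0^c - \mu\tilde r_1^c,
\]
giving
\[
x_0^c = \frac{1}{r^c}\left( -r_2^c\, \bar r_0^* x_0 + \frac{\lambda_0}{\tilde r_1^c} + \mu \right). \tag{6}
\]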
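Similarly, after inserting $\lambda_j$ and then $x_0^c$, the condition $\partial L/\partial x_0 = 0$ reads
\[
\begin{aligned}
0 &= -\lambda_0 e + \sum_{j\in L} \frac{\tilde r_j^c}{\delta_j^c + 1}\, r_j \left[ r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu \right] \\
  &= -\lambda_0 e + (r_2^c)^2\, \tilde\Sigma_0 x_0 + r^c r_2^c\, \tilde r_0 x_0^c - \mu r_2^c \tilde r_0
   = (r_2^c)^2\, \tilde p_0 \Sigma_0 x_0 + \frac{\lambda_0}{r_1^c}\,(\bar r_0 - r_1^c e).
\end{aligned}
\]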
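This yields $x_0$ and, by substitution into (6), $x_0^c$:
\[
x_0 = -\frac{\lambda_0}{r_2^c \tilde r_1^c}\,\Sigma_0^{-1}(\bar r_0 - r_1^c e), \qquad
x_0^c = \frac{\lambda_0}{r^c \tilde r_1^c}\,(\gamma_0 - r_1^c\beta_0 + 1) + \frac{\mu}{r^c}.
\]
The budget equation for the root then reads
\[
1 = e^* x_0 + x_0^c
  = -\frac{\lambda_0}{r_2^c \tilde r_1^c}\,(\beta_0 - r_1^c\alpha_0) + \frac{\lambda_0}{r^c \tilde r_1^c}\,(\gamma_0 - r_1^c\beta_0 + 1) + \frac{\mu}{r^c}
  = \frac{\lambda_0}{r^c}\,\frac{\delta_0^c + 1}{\tilde r_1^c} + \frac{\mu}{r^c},
\]
which gives
\[
\lambda_0 = \frac{\tilde r_1^c}{\delta_0^c + 1}\,(r^c - \mu) = \tilde\rho\,(r^c - \mu).
\]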
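By Lemma 15 we have $\tilde p_0 \in (0, 1]$ and $\delta_0^c \ge 0$. Moreover, if $\tilde p_0 = 1$ then $\delta_j^c = 0$ for $j \in L$ and thus
$\delta_0^c > 0$. This proves the inclusion $\tilde\rho \equiv r^c \tilde p_0/(\delta_0^c + 1) \in (0, r^c)$. Altogether, the previous results give
\[
\tilde r_j^* x_j + \tilde r_j^c x_j^c
 = -\frac{\lambda_j}{r_2^c}\,(\gamma_j - r_2^c\beta_j) + \frac{\lambda_j}{r_2^c}\,(\gamma_j - r_2^c\beta_j + 1) + p_j\mu
 = \frac{\lambda_j}{r_2^c} + p_j\mu \tag{7}
\]
and similarly
\[
r_2^c\, \tilde r_0^* x_0 + \tilde r_1^c x_0^c
 = -\frac{\lambda_0}{r^c}\,(\gamma_0 - r_1^c\beta_0) + \frac{\lambda_0}{r^c}\,(\gamma_0 - r_1^c\beta_0 + 1) + \tilde p_0\mu
 = \frac{\lambda_0}{r^c} + \tilde p_0\mu. \tag{8}
\]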
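Upon insertion of $\lambda_j$ and then $\lambda_0$ the reward condition reads
\[
\begin{aligned}
\rho &= \sum_{j\in L} \left( \tilde r_j^* x_j + \tilde r_j^c x_j^c \right)
      = \sum_{j\in L} \frac{p_j}{\delta_j^c + 1}\left[ r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu \right] + \mu \\
     &= r_2^c\, \tilde r_0^* x_0 + \tilde r_1^c x_0^c - \tilde p_0\mu + \mu
      = \frac{\lambda_0}{r^c} + \mu
      = \frac{\tilde\rho}{r^c}\,(r^c - \mu) + \mu.
\end{aligned}
\]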
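This gives $\rho = \tilde\rho + \mu(r^c - \tilde\rho)/r^c$ and thus $\mu = r^c(\rho - \tilde\rho)/(r^c - \tilde\rho)$. To calculate the risk, let
$\bar R_j(x_j, x_j^c) := x_j^* \Sigma_j x_j + \rho_j(x_j, x_j^c)^2$ and use the last expression from (7) for $p_j \rho_j(x_j, x_j^c)$. Then
\[
\begin{aligned}
\bar R_j(x_j, x_j^c)
 &= \delta_j^c \left( \frac{\lambda_j}{\tilde r_j^c} \right)^2 + \left( \frac{\lambda_j}{\tilde r_j^c} + \mu \right)^2
  = (\delta_j^c + 1) \left( \frac{\lambda_j}{\tilde r_j^c} \right)^2 + 2\,\frac{\lambda_j}{\tilde r_j^c}\,\mu + \mu^2 \\
 &= \frac{[r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu]^2}{\delta_j^c + 1} + 2\,\frac{[r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu]}{\delta_j^c + 1}\,\mu + \mu^2 \\
 &= \frac{[r_2^c (r_j^* x_0 + r_1^c x_0^c)]^2}{\delta_j^c + 1} + \frac{\delta_j^c}{\delta_j^c + 1}\,\mu^2.
\end{aligned}
\]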
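Since $R(x) = \sum_{j\in L} p_j \bar R_j(x_j, x_j^c) - \rho^2$, we get
\[
\begin{aligned}
R(x) + \rho^2
 &= (r_2^c)^2 \sum_{j\in L} \frac{p_j}{\delta_j^c + 1}\left[ (r_j^* x_0)^2 + 2 r_1^c x_0^c\, r_j^* x_0 + (r_1^c x_0^c)^2 \right] + \mu^2 \sum_{j\in L} \frac{p_j \delta_j^c}{\delta_j^c + 1} \\
 &= (r_2^c)^2 \left[ x_0^* \tilde\Sigma_0 x_0 + 2 r_1^c x_0^c\, \tilde r_0^* x_0 + \tilde p_0 (r_1^c x_0^c)^2 \right] + \mu^2 (1 - \tilde p_0).
\end{aligned}
\]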
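Therefore we have
\[
\begin{aligned}
R(x) + \rho^2
 &= \tilde p_0\, \frac{(r^c)^2 + \delta_0^c \mu^2}{\delta_0^c + 1} + \mu^2 (1 - \tilde p_0)
  = r^c \tilde\rho + \mu^2 \left( 1 - \frac{\tilde p_0}{\delta_0^c + 1} \right) \\
 &= r^c \tilde\rho + \mu^2 \left( 1 - \frac{\tilde\rho}{r^c} \right)
  = r^c \tilde\rho + r^c\, \frac{(\rho - \tilde\rho)^2}{r^c - \tilde\rho}.
\end{aligned}
\]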
Subtracting ρ2 gives the stated risk formula whose minimum over all ρ is easily determined. The
remaining statements follow trivially.
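The closed forms obtained in this proof are easy to check numerically. The sketch below uses assumed values of $r^c$ and $\tilde\rho$ with $0 < \tilde\rho < r^c$ (the numbers and function names are illustrative only) and compares the final risk formula with the intermediate form in terms of $\mu$. By direct substitution the risk vanishes at $\rho = r^c$, which the code also evaluates.

```python
def reward_multiplier(rho, rc, rho_tilde):
    """mu = rc*(rho - rho_tilde)/(rc - rho_tilde), for 0 < rho_tilde < rc."""
    return rc * (rho - rho_tilde) / (rc - rho_tilde)

def risk(rho, rc, rho_tilde):
    """R = rc*rho_tilde + rc*(rho - rho_tilde)^2/(rc - rho_tilde) - rho^2."""
    return rc * rho_tilde + rc * (rho - rho_tilde) ** 2 / (rc - rho_tilde) - rho ** 2

# Illustrative (assumed) values with 0 < rho_tilde < rc.
rc, rho_tilde = 1.05, 0.80
rho = 1.10

mu = reward_multiplier(rho, rc, rho_tilde)

# Intermediate form from the proof: rc*rho_tilde + mu^2*(1 - rho_tilde/rc) - rho^2.
risk_via_mu = rc * rho_tilde + mu ** 2 * (1.0 - rho_tilde / rc) - rho ** 2
risk_direct = risk(rho, rc, rho_tilde)

# Risk at the riskless return rho = rc.
risk_at_cash = risk(rc, rc, rho_tilde)

print(mu, risk_direct, risk_via_mu, risk_at_cash)
```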
Proof of Theorem 18. The optimality conditions include the ones of Problem 11 except that $\rho$
must be replaced by $\rho + \theta$ (in $\partial L/\partial\mu = 0$). Additional conditions are $\mu - (\rho + \theta) = \eta$ (from
$\partial L/\partial\theta = 0$), $\theta \ge 0$, $\eta \ge 0$, and complementarity $\theta\eta = 0$. The proof now proceeds precisely as
the proof of Theorem 17, with $\rho$ replaced by $\rho + \theta$ in all instances. Finally, nonnegativity and
complementarity of $\theta$ and $\eta$ lead to the case distinction: either $\theta = 0$ and $\mu \ge \rho$, or $\theta > 0$ and
$\mu = \rho + \theta$. The formula $\mu = r^c(\rho + \theta - \tilde\rho)/(r^c - \tilde\rho)$ gives $\rho \ge r^c$ in the first case and $\rho + \theta = r^c$ in
the second case.
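The case distinction can be written out as a small routine. This is a minimal sketch assuming $0 < \tilde\rho < r^c$; the function name and the sample values are hypothetical. It returns the surplus $\theta$, the multiplier $\mu$, and the slack $\eta = \mu - (\rho + \theta)$, so that nonnegativity and complementarity can be inspected directly.

```python
def surplus_case(rho, rc, rho_tilde):
    """Return (theta, mu, eta) following the case distinction in the proof.

    Assumes 0 < rho_tilde < rc and uses
    mu = rc*(rho + theta - rho_tilde)/(rc - rho_tilde).
    """
    if rho >= rc:
        theta = 0.0           # first case: no surplus, mu >= rho
    else:
        theta = rc - rho      # second case: rho + theta = rc
    mu = rc * (rho + theta - rho_tilde) / (rc - rho_tilde)
    eta = mu - (rho + theta)  # from dL/dtheta = 0
    return theta, mu, eta

# Illustrative (assumed) values.
rc, rho_tilde = 1.05, 0.80
for rho in (1.00, 1.05, 1.10):
    theta, mu, eta = surplus_case(rho, rc, rho_tilde)
    print(rho, theta, mu, eta, theta * eta)
```

For a desired return below $r^c$ the routine reports a positive surplus and $\mu$ equal to $r^c$ up to rounding; above $r^c$ the surplus is zero and $\mu$ exceeds $\rho$.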
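Theorem 20. In Problem 13 let $\theta_j \equiv r_2^c x_j^l$ for $j \in L$, $\theta_0 \equiv r^c x_0^l$, and
\[
\bar\theta_0 := \frac{1}{\tilde p_0} \sum_{j\in L} \frac{p_j}{\delta_j^c + 1}\, \theta_j, \qquad
s_0 := \frac{1}{\tilde p_0} \sum_{j\in L} \frac{p_j}{\delta_j^c + 1}\, \theta_j r_j.
\]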
cf. (7) and the reward condition after (8). The additional optimality conditions above together
with these identities lead to the stated case distinctions when it is observed that $\lambda_j$ has the same
sign as $r_2^c w_j - \mu - \theta_j$ for $j \in L$.
Remark. Note that all the multipliers now have a natural interpretation. The reward multiplier $\mu$
is the maximal scenario return and a threshold value for surplus money: there is surplus money
in scenario $j$ iff $\rho_j = \mu$ and $r_2^c w_j > \mu$. The budget multipliers $\lambda_j$ (up to a scaling factor) measure
the difference between the desired return, or the scenario returns, and the threshold.
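Therefore we have
\[
\rho_j - d_j \sim \mu - r^c + r_2^c (r_j - r_1^c)\, \tilde\rho\, \frac{r^c - \mu}{r_2^c \tilde r_1^c}\, \frac{\bar r_0 - r_1^c}{\Sigma_0}
 = (\mu - r^c) \left( 1 - \frac{(r_j - r_1^c)(\bar r_0 - r_1^c)}{(\delta_0^c + 1)\, \Sigma_0} \right).
\]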
Acknowledgement
This work would have been impossible without many intensive discussions in the Operations
Research group at the University of St. Gallen. My sincerest thanks especially to K. Frauendorfer,
H. Siede, and D. Steiner.
References
[1] A. J. Alexander and W. F. Sharpe, Fundamentals of investment, Prentice-Hall, 1989.
[2] K. J. Arrow, Essays in the theory of risk-bearing, Markham, Chicago, 1971.
[3] P. Artzner, F. Delbaen, J. M. Eber, and D. Heath, Thinking coherently, Risk, 10 (1997),
pp. 68–71.
[4] C. Atkinson, S. R. Pliska, and P. Wilmott, Portfolio management with transaction costs., Proc.
R. Soc. Lond., Ser. A, 453 (1997), pp. 551–562.
[5] V. S. Bawa, Stochastic dominance: A research bibliography, Manag. Sci., 28 (1982), pp. 698–712.
[6] R. E. Bellman, Dynamic Programming, Princeton University Press, 1957.
[7] A. Beltratti, A. Consiglio, and S. A. Zenios, Scenario modeling for the management of inter-
national bond portfolios., Ann. Oper. Res., 85 (1999), pp. 227–247.
[8] M. Best and B. Ding, On the continuity of the minimum in parametric quadratic programs., J.
Optimization Theory Appl., 86 (1995), pp. 245–250.
[9] J. R. Birge, Stochastic programming computations and applications, INFORMS J. Comput., 9
(1997), pp. 111–133.
40 M. C. Steinbach
[37] M. Kijima and M. Ohnishi, Mean-risk analysis of risk aversion and wealth effects on optimal port-
folios with multiple investment opportunities, Ann. Oper. Res., 45 (1993), pp. 147–163.
[38] , Portfolio selection problems via the bivariate characterization of stochastic dominance rela-
tions., Math. Finance, 6 (1996), pp. 237–277.
[39] A. J. King, Asymmetric risk measures and tracking models for portfolio optimization under uncer-
tainty, Ann. Oper. Res., 45 (1993), pp. 165–177.
[40] A. J. King and D. L. Jensen, Linear-quadratic efficient frontiers for portfolio optimization., Appl.
Stochastic Models Data Anal., 8 (1992), pp. 195–207.
[41] H. Konno, Piecewise linear risk function and portfolio optimization, J. Oper. Res. Soc. Japan, 33
(1990), pp. 139–156.
[42] H. Konno, S. R. Pliska, and K.-I. Suzuki, Optimal portfolios with asymptotic criteria, Ann. Oper.
Res., 45 (1993), pp. 187–204.
[43] H. Konno and K.-I. Suzuki, A mean-variance-skewness portfolio optimization model., J. Oper. Res.
Soc. Japan, 38 (1995), pp. 173–187.
[44] H. Konno and H. Watanabe, Bond portfolio optimization problems and their applications to index
tracking: A partial optimization approach., J. Oper. Res. Soc. Japan, 39 (1996), pp. 295–306.
[45] A. Kraus and R. Litzenberger, Skewness preference and the valuation of risky assets, J. Fin., 21
(1976), pp. 1085–1094.
[46] Y. Kroll, H. Levy, and H. M. Markowitz, Mean-variance versus direct utility maximization, J.
Fin., 39 (1984), pp. 47–62.
[47] H. Levy, Stochastic dominance and expected utility: survey and analysis, Manag. Sci., 38 (1992),
pp. 555–593.
[48] H. Levy and H. M. Markowitz, Approximating expected utility by a function of mean and variance,
Amer. Econ. Rev., 69 (1979), pp. 308–317.
[49] Y. Li and W. T. Ziemba, Univariate and multivariate measures of risk aversion and risk premiums,
Ann. Oper. Res., 45 (1993), pp. 265–296.
[50] L. C. MacLean and K. Weldon, Estimating multivariate random effects without replication.,
Comm. Statist. Theory Methods, 25 (1996), pp. 1447–1469.
[51] H. M. Markowitz, Portfolio selection, J. Fin., 7 (1952), pp. 77–91.
[52] , The utility of wealth, J. Political Econ., 60 (1952), pp. 151–158.
[53] , The optimization of a quadratic function subject to linear constraints, Nav. Res. Logist. Quar-
terly, 3 (1956), pp. 111–133.
[54] , Portfolio Selection: Efficient Diversification of Investments, John Wiley, New York, 1959.
[55] , Mean-Variance analysis in portfolio choice and capital markets, Basil Blackwell, 1987.
[56] R. C. Merton, Optimal consumption and portfolio rules in a continuous-time model, J. Econ. Theory,
3 (1971), pp. 373–413.
[57] R. C. Merton and P. A. Samuelson, Fallacy of the log-normal approximation to optimal portfolio
decision-making over many periods, J. Fin. Econ., 1 (1974), pp. 67–94.
[58] A. J. Morton and S. R. Pliska, Optimal portfolio management with fixed transaction costs., Math.
Finance, 5 (1995), pp. 337–356.
[59] J. Mossin, Optimal multiperiod portfolio policies, J. Bus., 41 (1968), pp. 215–229.
[60] J. M. Mulvey and H. Vladimirou, Stochastic network optimization models for investment planning,
Ann. Oper. Res., 20 (1989), pp. 187–217.
[61] M. Nakasato and K. Furukawa, On the number of securities which constitute an efficient portfolio,
Ann. Oper. Res., 45 (1993), pp. 333–347.
[62] The Sveriges Riksbank (Bank of Sweden) Prize in Economic Sciences in Memory of Alfred Nobel.
Press Release of The Royal Swedish Academy of Sciences, Oct. 16, 1990. URL http://www.nobel.se
/laureates/economy-1990-press.html.
[63] A. F. Perold, Large-scale portfolio optimization, Manag. Sci., 30 (1984), pp. 1143–1160.
42 M. C. Steinbach
[64] J. W. Pratt, Risk aversion in the small and in the large, Econometrica, 32 (1964), pp. 122–136.
[65] R. T. Rockafellar, Duality and optimality in multistage stochastic programming, Ann. Oper. Res.,
85 (1999), pp. 1–19.
[66] R. T. Rockafellar and R. J.-B. Wets, Generalized linear-quadratic problems of deterministic and
stochastic optimal control in discrete time, SIAM J. Control Optimization, 28 (1990), pp. 810–822.
[67] S. Ross, Some stronger measures of risk aversion in the small and in the large, Econometrica, 49
(1981), pp. 621–638.
[68] M. E. Rubinstein, A comparative statistics analysis of risk premiums, J. Bus., 12 (1973), pp. 605–
615.
[69] A. Ruszczyński, Decomposition methods in stochastic programming, Math. Programming, 79 (1997),
pp. 333–353.
[70] P. A. Samuelson, The fundamental approximation theorem of portfolio analysis in terms of means,
variances, and higher moments, Rev. Econ. Studies, 25 (1958), pp. 65–86.
[71] , Lifetime portfolio selection by dynamic stochastic programming, Rev. Econ. Studies, 51 (1969),
pp. 239–246.
[72] W. F. Sharpe, Investment, Prentice-Hall, 1978.
[73] H. Shirakawa, Optimal consumption and portfolio selection with incomplete markets and upper and
lower bound constraints., Math. Finance, 4 (1994), pp. 1–24.
[74] M. C. Steinbach, Fast Recursive SQP Methods for Large-Scale Optimal Control Problems, Ph. D.
dissertation, University of Heidelberg, 1995.
[75] , Structured interior point SQP methods in optimal control, Z. Angew. Math. Mech., 76 (1996),
pp. 59–62.
[76] , Recursive direct optimization and successive refinement in multistage stochastic programs,
Preprint SC-98-27, ZIB, 1998.
[77] , Recursive direct algorithms for multistage stochastic programs in financial engineering, in
Operations Research Proceedings 1998, P. Kall and H.-J. Lüthi, eds., Springer-Verlag, 1999, pp. 241–
250.
[78] M. C. Steinbach, H. G. Bock, G. V. Kostin, and R. W. Longman, Mathematical optimization
in robotics: Towards automated high speed motion planning, Surv. Math. Ind., 7 (1998), pp. 303–340.
[79] S. Uryasev and R. T. Rockafellar, Optimization of Conditional Value-at-Risk, Research Report
99-4, University of Florida, 1999.
[80] J. von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, Princeton
University Press, 3rd ed., 1953.
[81] R. J.-B. Wets, Programming under uncertainty: The equivalent convex program, SIAM J. Appl.
Math., 14 (1984), pp. 89–105.
[82] S. A. Zenios, Financial optimization, Cambridge University Press, Cambridge, UK, 1992.
[83] S. A. Zenios and P. Kang, Mean-absolute deviation portfolio optimization for mortgage-backed
securities, Ann. Oper. Res., 45 (1993), pp. 433–450.
[84] W. T. Ziemba, Choosing investments when the returns have stable distributions, in Mathematical
programming in theory and practice, P. L. Hammer and G. Zoutendijk, eds., North-Holland, Ams-
terdam, 1974, pp. 443–482.
[85] W. T. Ziemba and J. M. Mulvey, eds., Worldwide asset and liability modeling, Cambridge Uni-
versity Press, Cambridge, UK, 1998.