
Konrad-Zuse-Zentrum für Informationstechnik Berlin
Takustraße 7, D-14195 Berlin-Dahlem, Germany

Marc C. Steinbach

Markowitz Revisited: Single-Period and Multi-Period Mean-Variance Models

Preprint SC 99-30 (August 1999)



Abstract
Mean-variance portfolio analysis provided the first quantitative treatment of the tradeoff
between profit and risk. We investigate in detail the interplay between objective and
constraints in a number of single-period variants, including semi-variance models. Particular
emphasis is laid on avoiding the penalization of overperformance. The results are then used as
building blocks in the development and theoretical analysis of multi-period models based on
scenario trees. A key property is the possibility to remove surplus money in future decisions,
yielding approximate downside risk minimization.
Key words. mean-variance analysis, downside risk, multi-period model
AMS subject classifications. 90A09, 90C15, 90C20

0 Introduction
The classical mean-variance approach for which Harry Markowitz received the 1990 Nobel Prize
in Economics offered the first systematic treatment of a dilemma that each investor faces: the
conflicting objectives high profit versus low risk. In dealing with this fundamental issue Markowitz
came up with a parametric optimization model that was both sufficiently general for a significant
range of practical situations and simple enough for theoretical analysis and numerical solution.
As the Swedish Academy of Sciences put it: his primary contribution consisted of developing a
rigorously formulated, operational theory for portfolio selection under uncertainty [62].
Indeed, the subject is so complex that Markowitz’ seminal work of the fifties [51, 52, 54]
probably raised more questions than it answered, thus initiating a tremendous amount of related
research. Before placing the present paper into perspective, the following paragraphs give a coarse
overview of these issues. A substantial number of references is included, but the list is far from
complete and cannot even contain all the relevant papers. We cite just a few references on each
subject (in chronological order, and usually only the most recent from several contributions of the
same author) to provide some starting points for the interested reader.
An important aspect of Pareto-optimal (efficient) portfolios is that each determines a von Neumann–Morgenstern
utility function [80] for which it maximizes the expected utility of the return on
investment. This allowed Markowitz to interpret his approach by the theory of rational behavior
under uncertainty [52], [54, Part IV]. Further, certain measures of risk averseness evolved as
a basic concept in economic theory. These are derived from utility functions and justified by
their relationship to the corresponding risk premiums, see Pratt [64], Arrow [2], Rubinstein [68],
Duncan [17], Kihlstrom and Mirman [36], Ross [67], Li and Ziemba [49]. Applications of utility
theory and risk averseness measures to portfolio selection are reported, e.g., by Mossin [59], Levy
and Markowitz [48], Kroll et al. [46], Jewitt [31], King and Jensen [40], Kijima and Ohnishi [37].
A fundamental (and still debated) question is how risk should be measured properly. Markowitz
discusses the pros and cons of replacing the variance by alternative risk measures in a more
general mean-risk approach [54, Chap. XIII]. These considerations and the theory of stochastic
dominance (see Bawa [5], Fishburn [20], Levy [47], Kijima and Ohnishi [38]) stimulated the research


in asymmetric risk measures like expectation of loss and semi-variance, cf. Konno [41], King [39],
Zenios and Kang [83], Uryasev and Rockafellar [79]. The properties of real return distributions also
led to risk models involving higher moments, see Ziemba [84], Kraus and Litzenberger [45], Konno
and Suzuki [43]. More recently the theoretical concept of coherent risk measures was introduced by
Artzner et al. [3], while portfolio tracking (or replication) approaches became popular in practice,
see King [39], Konno and Watanabe [44], Dembo and Rosen [15].
It is quite interesting that the mean-variance approach has received very little attention in the
context of long-term investment planning. Although Markowitz does consider true multi-period
models (where the portfolio may be readjusted several times during the planning horizon) [54,
Chap. XIII], these considerations use a utility function based on the consumption of wealth over
time rather than mean and variance of the final wealth, which places the problem in the realm of
dynamic programming (Bellman [6]). Further long-term and simplified multi-period approaches
are discussed, e.g., by Mossin [59], Samuelson [71], Hakansson [24], Merton and Samuelson [57],
Konno et al. [42]. Much research has also been carried out in the closely related field of continuous-
time models, see Merton [56], Harrison and Pliska [25], Heath et al. [27], Karatzas [34], Dohi and
Osaki [16]. Over the past decade more detailed multi-period models have become tractable due
to the progress in computing technology (both algorithms and hardware), see, e.g., Mulvey and
Vladimirou [60], Dantzig and Infanger [14], Consigli and Dempster [13], Beltratti et al. [7].
With the exception of this group, most of the work cited above neglects details like asset
liquidity or transaction costs. At least the second idealization causes serious errors when many
transactions are performed, as in continuous-time models. Imperfect markets are briefly discussed
by Markowitz [54, p. 297+]; later studies include Perold [63], He and Pearson [26], Karatzas et
al. [35], Jacka [30], Shirakawa [73], Morton and Pliska [58], Atkinson et al. [4].
A final issue concerns the assumptions of the investor about the future, which is represented
by probability distributions of the asset returns. Being based on assessments of financial analysts
or estimated from historical data (or both), these distributions are never exact. (Markowitz calls
them probability beliefs.) The question of the sensitivity of optimization results with respect to
errors in the distribution is discussed, e.g., by Jobson [32], Broadie [11], Chopra and Ziemba [12],
Best and Ding [8], MacLean and Weldon [50].
Additional material and references are found in a more recent book by Markowitz [55] and
any standard text on mathematical finance, like Sharpe [72], Elton and Gruber [18], Ingersoll [29],
Alexander and Sharpe [1], Zenios [82], Ziemba and Mulvey [85].
The present paper develops a fairly complete theoretical understanding of the multi-period
mean-variance approach based on scenario trees. This is achieved by analyzing various portfolio
optimization problems with gradually increasing complexity. Primal and dual solutions of these
problems are derived, and dual variables are given an interpretation where possible. The most important
aspect in our discussion is the precise interaction of objective (or risk measure) and constraints
(or set of feasible wealth distributions), a subject that has received little attention in the previous
literature. Arguing about the properties of risk measures may be meaningless
in an optimization context unless it is clear which distributions are possible. A specific goal in
our analysis is to avoid a penalization of overperformance. In this context we discuss the role of
cash and, in some detail, variance versus semi-variance. A key ingredient of our most complex
multi-period model is an artificial arbitrage-like mechanism involving riskless though inefficient
portfolios and representing a choice between immediate consumption or future profit.
Each of the problems considered tries to isolate a certain aspect, usually under the most general
conditions even if practical situations typically exhibit more specific characteristics. However, we
give higher priority to a clear presentation, and inessential generality will sometimes be sacrificed
for technical simplicity. In particular, no inequality constraints are included unless necessary. (A
separate section is devoted to the influence of such restrictions.) Neither do we attempt to model
liquidity constraints or short-selling correctly, or to include transaction costs; we consider only
idealized situations without further justification. The work here is based on a multi-period mean-
variance model that was first proposed by Frauendorfer [21] and later refined by Frauendorfer and
Siede [23]. A complete application model (including transaction costs and market restrictions)
together with implications of the results developed here will be presented later in a joint paper.

Due to future uncertainty the portfolio optimization problems in this paper are all stochastic.
More precisely, they are deterministic equivalents of convex stochastic programs, cf. Wets [81].
Except for the semi-variance problems they are also quadratic programs involving a second-order
approximation of the return distribution in some sense, cf. Samuelson [70]. Based on earlier work
in nonlinear optimal control [74, 75, 78], the author has developed structure-exploiting numerical
algorithms for multi-stage convex stochastic programs like the ones discussed here [76, 77]. More
general problem classes and duality are studied by Rockafellar and Wets [66], and Rockafellar [65].
For background material on stochastic programming we refer the reader to Ermoliev and Wets [19],
Kall and Wallace [33], Birge [10], Birge and Louveaux [9], Ruszczyński [69].
The paper is organized as follows. Our analysis begins with single-period models in Section 1.
Although many of the results are already known, the systematic discussion of subtle details adds
insight that is essential in the multi-period case. To some extent this section has tutorial character;
the problems may serve as examples in an introductory course on optimization. Next, multi-period
mean-variance models are analyzed in Section 2, where the final goal consists in constructing
an approximate downside risk minimization through appropriate constraints. This material is
completely new; the research was motivated by practical experience with the application model
mentioned above. Some concluding remarks are finally given in Section 3.

1 Single-period mean-variance analysis


Consider an investment in $n$ assets over a certain period of time. Denote by $x_\nu$ the capital invested
in asset $\nu$, by $x \in \mathbb{R}^n$ the portfolio vector, and by $r \in \mathbb{R}^n$ the random vector of asset returns,
yielding asset capitals $r_\nu x_\nu$ at the end of the investment period. Suppose that $r$ is given by a joint
probability distribution with expectation $\bar r := E(r)$ and covariance matrix

\[ \Sigma := E[(r - \bar r)(r - \bar r)^*] = E[r r^*] - \bar r \bar r^*. \]

(The existence of these two moments is assumed throughout the paper.) The choice of a specific
portfolio determines a certain distribution of the associated total return (or final wealth) w ≡ r∗ x.
Mean-variance analysis aims at forming the most desirable return distribution through a suitable
portfolio, where the investor’s idea of desirability depends solely on the first two moments.

Definition 1 (Reward). The reward of a portfolio is the mean of its return,

ρ(x) := E(r∗ x) = r̄∗ x.

Definition 2 (Risk). The risk of a portfolio is the variance of the return,

\[ R(x) := \sigma^2(r^* x) = E[(r^* x - E(r^* x))^2] = E[x^* (r - \bar r)(r - \bar r)^* x] = x^* \Sigma x. \]

Various formulations of the mean-variance problem exist. From the very beginning [52, 54],
Markowitz related his approach to the utility theory of von Neumann and Morgenstern [80]. As
we shall see later, maximizing the expectation of a concave quadratic utility function leads to a
formulation like
\[
\begin{aligned}
\max_x \quad & \mu \rho(x) - \tfrac12 R(x) \\
\text{s.t.} \quad & e^* x = 1,
\end{aligned}
\tag{1}
\]

where $e \in \mathbb{R}^n$ denotes the vector of all ones. The objective models the actual goal of the investor,
a tradeoff between risk and reward,¹ while the budget equation $e^* x = 1$ simply specifies her initial
wealth $w_0$ (normalized to $w_0 = 1$ w.l.o.g.). Our preferred formulation comes closer to the original
one; it minimizes risk subject to the budget equation and subject to the condition that a certain
desired reward $\rho$ be obtained,

\[
\begin{aligned}
\min_x \quad & \tfrac12 R(x) \\
\text{s.t.} \quad & e^* x = 1, \\
& \rho(x) = \rho.
\end{aligned}
\tag{2}
\]

¹ Many authors attach a tradeoff parameter $\theta$ to the risk term and maximize $\rho(x) - \theta R(x)/2$, which is equivalent
to (1) if $\mu \equiv \theta^{-1} > 0$. However, this problem becomes unbounded for $\theta \le 0$, whereas (1) remains solvable for
$\mu \le 0$. This is important in our analysis.

Here the investor’s goal is split between objective and reward condition.
In this section we study the precise relation of Problems (1) and (2), and a number of increasingly
general single-period variants. We will include a cash account, then consider certain
inequality constraints, utility functions, and finally downside risk. Many of the results are already
known, but usually in a different form. Here we choose a presentation that facilitates the study of
nuances in the optimization problems and that integrates seamlessly with the more general case
of multi-period problems in Section 2.

1.1 Risky assets only


The simplest situation is given by portfolios consisting exclusively of risky assets. In this case we
impose two conditions on the return distribution.
Basic assumptions.
A1) The covariance matrix is positive definite, Σ > 0.
A2) r̄ is not a multiple of e.
Remarks. The first assumption means that all n assets (and any convex combination) are indeed
risky; riskless assets like cash will be treated separately if present. The second assumption implies
$n \ge 2$ and guarantees a non-degenerate situation: otherwise Problem (1) would always have the
same optimal portfolio $x = \Sigma^{-1} e / (e^* \Sigma^{-1} e)$ regardless of the tradeoff parameter, and Problem (2)
would have contradicting constraints except for one specific value of the desired reward: $\rho = \bar r^* e / n$.
Notice that no formal restrictions are imposed on the value of r̄, although r̄ > 0 (and even r̄ > e)
will usually hold in practice.
Due to assumption A1 we can define the following constants that will be used throughout this
section:

\[ \alpha := e^* \Sigma^{-1} e, \qquad \beta := e^* \Sigma^{-1} \bar r, \qquad \gamma := \bar r^* \Sigma^{-1} \bar r, \qquad \delta := \alpha\gamma - \beta^2. \]

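Lemma 1. The constants α, γ, and δ are positive. More precisely,

\[
\alpha \in \left[ \frac{n}{\lambda_{\max}(\Sigma)}, \frac{n}{\lambda_{\min}(\Sigma)} \right], \qquad
\gamma \in \left[ \frac{\|\bar r\|_2^2}{\lambda_{\max}(\Sigma)}, \frac{\|\bar r\|_2^2}{\lambda_{\min}(\Sigma)} \right], \qquad
|\beta| < \frac{\sqrt{n}\, \|\bar r\|_2}{\lambda_{\min}(\Sigma)},
\]

where $\lambda_{\min}$, $\lambda_{\max}$ denote the minimal and maximal eigenvalue, respectively.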
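Proof. Since Σ > 0 (by A1), we have

\[
\alpha = e^* \Sigma^{-1} e \in \left[ \|e\|_2^2\, \lambda_{\min}(\Sigma^{-1}),\; \|e\|_2^2\, \lambda_{\max}(\Sigma^{-1}) \right]
= \left[ \frac{n}{\lambda_{\max}(\Sigma)}, \frac{n}{\lambda_{\min}(\Sigma)} \right].
\]

The inclusion for γ is analogous. Since r̄, e are linearly independent and Σ > 0 (by A2 and A1), the
2×2 matrix

\[
\begin{pmatrix} e^* \\ \bar r^* \end{pmatrix} \Sigma^{-1} \begin{pmatrix} e & \bar r \end{pmatrix}
= \begin{pmatrix} \alpha & \beta \\ \beta & \gamma \end{pmatrix} > 0
\]

has positive determinant δ. Thus $|\beta| < \sqrt{\alpha\gamma} \le \sqrt{n}\, \|\bar r\|_2 / \lambda_{\min}(\Sigma)$.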

Remark. The inclusions for α and γ are sharp but not the bound on |β|, and neither α < γ nor
β > 0 hold in general. Anyway, we need only α, γ, δ > 0.
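
The constants of Lemma 1 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it is restricted to n = 2 so that plain Python suffices, the covariance matrix `Sigma` is a hypothetical choice, and the expected returns are those quoted in the caption of Fig. 1. The helper names are illustrative.

```python
import math

def inverse_2x2(S):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad_form(u, M, v):
    """Compute u* M v for 2-vectors u, v and a 2x2 matrix M."""
    return sum(u[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

def eigenvalues_sym_2x2(S):
    """Return (lambda_min, lambda_max) of a symmetric 2x2 matrix."""
    (a, b), (_, d) = S
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

# Hypothetical covariance matrix (assumption); it is positive definite (A1).
Sigma = [[0.04, 0.006], [0.006, 0.01]]
# Expected returns from the caption of Fig. 1; not a multiple of e (A2).
r_bar = [1.15, 1.08]
e = [1.0, 1.0]
n = 2

Sigma_inv = inverse_2x2(Sigma)
alpha = quad_form(e, Sigma_inv, e)
beta = quad_form(e, Sigma_inv, r_bar)
gamma = quad_form(r_bar, Sigma_inv, r_bar)
delta = alpha * gamma - beta ** 2

lam_min, lam_max = eigenvalues_sym_2x2(Sigma)
norm_r_sq = sum(v * v for v in r_bar)   # squared Euclidean norm of r_bar
```

With these values one can confirm that α, γ, δ are positive and that α, γ, |β| respect the eigenvalue bounds stated in the lemma.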

Problem 1. Let us first consider the standard tradeoff formulation. To simplify the comparison
with our preferred formulation, we minimize negative utility
\[
\begin{aligned}
\min_x \quad & \tfrac12 x^* \Sigma x - \mu \bar r^* x \\
\text{s.t.} \quad & e^* x = 1.
\end{aligned}
\]

The Lagrangian is
\[ L(x, \lambda; \mu) = \tfrac12 x^* \Sigma x - \mu \bar r^* x - \lambda (e^* x - 1). \]
Theorem 1. Problem 1 has the unique primal-dual solution

\[ x = \Sigma^{-1} (\lambda e + \mu \bar r), \qquad \lambda = (1 - \mu\beta)/\alpha \]

and associated reward

\[ \rho = \lambda\beta + \mu\gamma = (\beta + \mu\delta)/\alpha. \]

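Proof. From the Lagrangian one obtains the system of first-order necessary conditions

\[
\begin{pmatrix} \Sigma & e \\ e^* & 0 \end{pmatrix}
\begin{pmatrix} x \\ -\lambda \end{pmatrix}
= \begin{pmatrix} \mu \bar r \\ 1 \end{pmatrix}.
\]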

Its first row (dual feasibility) yields the optimal portfolio x. The optimal multiplier λ and reward ρ
are obtained by substituting x into the second row (primal feasibility) and the definition of ρ,
respectively. Uniqueness of the solution follows from strong convexity of the objective and full
rank of the constraint.
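
As an illustration of Theorem 1, the closed-form solution can be evaluated and checked against the optimality conditions. This is a sketch under stated assumptions: n = 2, a hypothetical covariance matrix, the expected returns from the caption of Fig. 1, and an arbitrary tradeoff parameter; the function names are illustrative.

```python
def solve_2x2(S, b):
    """Solve S y = b for a 2x2 matrix S by Cramer's rule."""
    (a11, a12), (a21, a22) = S
    det = a11 * a22 - a12 * a21
    return [(b[0] * a22 - a12 * b[1]) / det, (a11 * b[1] - a21 * b[0]) / det]

def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def tradeoff_solution(Sigma, r_bar, mu):
    """Closed-form primal-dual solution of Problem 1 (Theorem 1), n = 2."""
    e = [1.0, 1.0]
    Sinv_e = solve_2x2(Sigma, e)        # Sigma^{-1} e
    Sinv_r = solve_2x2(Sigma, r_bar)    # Sigma^{-1} r_bar
    alpha, beta, gamma = dot(e, Sinv_e), dot(e, Sinv_r), dot(r_bar, Sinv_r)
    lam = (1 - mu * beta) / alpha       # budget multiplier
    x = [lam * se + mu * sr for se, sr in zip(Sinv_e, Sinv_r)]
    rho = lam * beta + mu * gamma       # associated reward
    return x, lam, rho

Sigma = [[0.04, 0.006], [0.006, 0.01]]  # hypothetical covariance (assumption)
r_bar = [1.15, 1.08]                    # expected returns from Fig. 1
mu = 0.01                               # illustrative tradeoff parameter
x, lam, rho = tradeoff_solution(Sigma, r_bar, mu)

# Residual of the first row of the optimality system: Sigma x - lam e - mu r_bar.
residual = [dot(row, x) - lam - mu * ri for row, ri in zip(Sigma, r_bar)]
```

The residual vanishes up to rounding, the budget equation holds, and the returned reward agrees with the mean return of the computed portfolio.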

Remark. Although the qualitative interpretation of the tradeoff function is clear, the precise value
of the tradeoff parameter μ should also have an interpretation. In particular, the resulting reward
is of interest. This is one reason why we prefer a different formulation of the mean-variance
problem. Other important reasons are greater modeling flexibility, and sparsity in the multi-period
formulation, see Section 2.

Problem 2. The mean-variance problem with prescribed reward reads


\[
\begin{aligned}
\min_x \quad & \tfrac12 x^* \Sigma x \\
\text{s.t.} \quad & e^* x = 1, \\
& \bar r^* x = \rho.
\end{aligned}
\]

Its Lagrangian is
\[ L(x, \lambda, \mu; \rho) = \tfrac12 x^* \Sigma x - \lambda (e^* x - 1) - \mu (\bar r^* x - \rho). \]
We refer to the dual variables λ, μ as budget multiplier and reward multiplier, respectively. It
will soon be shown that the optimal reward multiplier μ is precisely the tradeoff parameter of
Problem 1.

Theorem 2. Problem 2 has the unique primal-dual solution

\[ x = \Sigma^{-1} (\lambda e + \mu \bar r), \qquad \lambda = (\gamma - \beta\rho)/\delta, \qquad \mu = (\alpha\rho - \beta)/\delta. \]



Proof. The system of first-order optimality conditions reads


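\[
\begin{pmatrix} \Sigma & e & \bar r \\ e^* & 0 & 0 \\ \bar r^* & 0 & 0 \end{pmatrix}
\begin{pmatrix} x \\ -\lambda \\ -\mu \end{pmatrix}
= \begin{pmatrix} 0 \\ 1 \\ \rho \end{pmatrix}.
\]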

As in Theorem 1, the optimal portfolio x is obtained from the first row. Substitution of x into
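rows two and three yields the optimal multipliers

\[
\begin{pmatrix} \lambda \\ \mu \end{pmatrix}
= \left[ \begin{pmatrix} e^* \\ \bar r^* \end{pmatrix} \Sigma^{-1} \begin{pmatrix} e & \bar r \end{pmatrix} \right]^{-1} \begin{pmatrix} 1 \\ \rho \end{pmatrix}
= \begin{pmatrix} \alpha & \beta \\ \beta & \gamma \end{pmatrix}^{-1} \begin{pmatrix} 1 \\ \rho \end{pmatrix}
= \frac{1}{\delta} \begin{pmatrix} \gamma - \beta\rho \\ \alpha\rho - \beta \end{pmatrix}.
\]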

Uniqueness of the solution follows as in Theorem 1.


Theorem 3. Problem 1 with parameter μ and Problem 2 with parameter ρ are equivalent if and
only if μ equals the optimal reward multiplier of Problem 2 or, equivalently, ρ equals the optimal
reward of Problem 1.
Proof. The required conditions, μ = (αρ − β)/δ and ρ = (β + μδ)/α, are clearly equivalent. It
follows that the optimal budget multipliers of both problems are identical,

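\[
\frac{1 - \mu\beta}{\alpha} = \frac{\delta - \alpha\beta\rho + \beta^2}{\alpha\delta} = \frac{\alpha\gamma - \alpha\beta\rho}{\alpha\delta} = \frac{\gamma - \beta\rho}{\delta}.
\]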
Hence optimal portfolios also agree. The “only if” direction is trivial.
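
The correspondence between the two parameterizations in Theorem 3 can be sketched with scalars only. The values of α, β, γ below are hypothetical (chosen so that δ > 0), and the function names are illustrative.

```python
def tradeoff_from_reward(rho, alpha, beta, gamma):
    """Optimal reward multiplier of Problem 2: mu = (alpha*rho - beta)/delta."""
    delta = alpha * gamma - beta ** 2
    return (alpha * rho - beta) / delta

def reward_from_tradeoff(mu, alpha, beta, gamma):
    """Optimal reward of Problem 1: rho = (beta + mu*delta)/alpha."""
    delta = alpha * gamma - beta ** 2
    return (beta + mu * delta) / alpha

def budget_multiplier_problem1(mu, alpha, beta):
    """Budget multiplier of Problem 1: lambda = (1 - mu*beta)/alpha."""
    return (1 - mu * beta) / alpha

def budget_multiplier_problem2(rho, alpha, beta, gamma):
    """Budget multiplier of Problem 2: lambda = (gamma - beta*rho)/delta."""
    delta = alpha * gamma - beta ** 2
    return (gamma - beta * rho) / delta

# Hypothetical constants with delta = alpha*gamma - beta^2 > 0 (assumption).
alpha, beta, gamma = 104.4, 113.5, 123.6
rho = 1.12                                   # illustrative desired reward

mu = tradeoff_from_reward(rho, alpha, beta, gamma)
rho_back = reward_from_tradeoff(mu, alpha, beta, gamma)
lam1 = budget_multiplier_problem1(mu, alpha, beta)
lam2 = budget_multiplier_problem2(rho, alpha, beta, gamma)
```

Mapping ρ to μ and back recovers ρ, and both problems then share the same budget multiplier, which is the content of the equivalence.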
Remarks. Apparently, the optimality conditions of Problem 2 include the optimality conditions of
Problem 1, and additionally the reward condition. These n + 2 equations define a one-dimensional
affine subspace for the n + 3 variables x, λ, μ, ρ, which is parameterized by μ in Problem 1 and
by ρ in Problem 2. As an immediate consequence, the optimal risk is a quadratic function of ρ,
denoted as $\sigma^2(\rho)$. Its graph is called the efficient frontier.²
Theorem 4. In Problems 1 and 2, the optimal risk is

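\[ \sigma^2(\rho) = (\alpha\rho^2 - 2\beta\rho + \gamma)/\delta = (\mu^2\delta + 1)/\alpha. \]

Its global minimum over all rewards is attained at $\hat\rho = \beta/\alpha$ and has the positive value $\sigma^2(\hat\rho) = 1/\alpha$.
The associated solution is $\hat x = \Sigma^{-1} e/\alpha$, $\hat\lambda = 1/\alpha$, $\hat\mu = 0$.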
Proof. By Definition 2 and Theorem 2,

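\[
\begin{aligned}
\sigma^2(\rho) = x^* \Sigma x &= (\lambda e + \mu \bar r)^* \Sigma^{-1} (\lambda e + \mu \bar r) \\
&= \lambda^2 \alpha + 2\lambda\mu\beta + \mu^2\gamma = \lambda(\lambda\alpha + \mu\beta) + \mu(\lambda\beta + \mu\gamma).
\end{aligned}
\]

Using λ, ρ from Theorem 1 (so that $\lambda\alpha + \mu\beta = 1$ and $\lambda\beta + \mu\gamma = \rho$) and λ, μ from Theorem 2 gives

\[ \sigma^2(\rho) = \lambda + \mu\rho = (\alpha\rho^2 - 2\beta\rho + \gamma)/\delta = (\mu^2\delta + 1)/\alpha. \]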

The remaining statements follow trivially.
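
The two expressions for the optimal risk in Theorem 4 and the location of its minimum can be evaluated directly. This is a minimal sketch with hypothetical constants α, β, γ (any values with δ > 0 would do); the function names are illustrative.

```python
import math

def frontier_risk(rho, alpha, beta, gamma):
    """Optimal risk as a function of the reward rho."""
    delta = alpha * gamma - beta ** 2
    return (alpha * rho ** 2 - 2 * beta * rho + gamma) / delta

def frontier_risk_from_multiplier(mu, alpha, beta, gamma):
    """Optimal risk as a function of the reward multiplier mu."""
    delta = alpha * gamma - beta ** 2
    return (mu ** 2 * delta + 1) / alpha

# Hypothetical constants with delta > 0 (assumption).
alpha, beta, gamma = 104.4, 113.5, 123.6
delta = alpha * gamma - beta ** 2

rho_hat = beta / alpha                       # reward of the minimum-risk portfolio
min_risk = frontier_risk(rho_hat, alpha, beta, gamma)

# Evaluate both forms on a few sample rewards around rho_hat.
samples = [1.0, 1.05, rho_hat, 1.1, 1.2]
pairs = [
    (frontier_risk(rho, alpha, beta, gamma),
     frontier_risk_from_multiplier((alpha * rho - beta) / delta, alpha, beta, gamma))
    for rho in samples
]
forms_agree = all(math.isclose(a, b, rel_tol=1e-9) for a, b in pairs)
```

Both forms coincide on the sample rewards, and the smallest risk equals 1/α and is attained at ρ̂ = β/α.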

Discussion. The optimal portfolio is clearly a reward-dependent linear combination of the reward-independent
portfolios $\Sigma^{-1} e$ and $\Sigma^{-1} \bar r$. Moreover, it is an affine function of ρ. The efficient frontier
and optimal investments into two risky assets are depicted in Fig. 1. Here, since n = 2, the optimal
portfolio is completely determined by the budget condition and the reward condition; it does not
depend on Σ and is thus correlation-independent. Not so the risk: for negatively correlated assets,
it has a pronounced minimum at a fairly large reward ρ̂. As the correlation increases, the lowest
possible risk also increases and is attained at a smaller reward. (These statements do not simply
generalize to the case n > 2.)

[Figure 1: Portfolio with two risky assets; $\bar r_1 = 1.15$, $\bar r_2 = 1.08$. Left: efficient frontier for negatively
correlated, uncorrelated, and positively correlated assets. Right: optimal portfolio vs. reward (asset 1: high risk; asset 2: low risk).]

² More generally, efficient frontier refers to the set of all Pareto-optimal solutions in any multi-objective
optimization problem. The solutions (portfolios) are also called efficient. Strictly speaking this applies only to the
upper branch here, that is, ρ ≥ ρ̂ or, equivalently, μ ≥ 0 (see discussion below).

A serious drawback of the model (in this form) is the fact that positive deviations from the
prescribed reward are penalized, and hence the “risk” increases when ρ is reduced below ρ̂. Indeed,
the penalization cannot be avoided, indicating that the model is somehow incomplete. We will
see, however, that unnecessary positive deviations from ρ do not occur if the model is extended
appropriately. For the moment let us accept that only the upper branch is relevant in practice.

1.2 Risky assets and riskless cash


Now consider n risky assets and an additional cash account $x^c$ with deterministic return $r^c \equiv \bar r^c$.
The portfolio is $(x, x^c)$, and $x, r, \bar r, \Sigma$ refer only to its risky part.
Basic assumptions. Assumption A2 is replaced by a similar condition on the extended portfolio,
which may now consist of just one risky asset and cash.
A1) Σ > 0.
A3) $\bar r \ne r^c e$.
Remarks. Again, the second assumption excludes degenerate situations, and no restrictions are
imposed to ensure realistic returns. In practice one can typically assume $\bar r > r^c e > 0$ (or even
$r^c e \ge e$), which satisfies A3. The constants α, β, γ are defined as before; they are related to the
risky part of the portfolio only. (Replacing A2 makes δ = 0 possible, but δ plays no role here.)
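Problem 3. Any covariance associated with cash vanishes, so that the risk and reward are
$R(x, x^c) = x^* \Sigma x$ and $\rho(x, x^c) = \bar r^* x + r^c x^c$, respectively, and the optimization problem reads

\[
\begin{aligned}
\min_{x, x^c} \quad & \frac12 \begin{pmatrix} x \\ x^c \end{pmatrix}^* \begin{pmatrix} \Sigma & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ x^c \end{pmatrix} = \frac12 x^* \Sigma x \\
\text{s.t.} \quad & e^* x + x^c = 1, \\
& \bar r^* x + r^c x^c = \rho.
\end{aligned}
\]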

Theorem 5. Problem 3 has the unique primal-dual solution

\[
\begin{aligned}
x &= \Sigma^{-1} (\lambda e + \mu \bar r) = \mu \Sigma^{-1} (\bar r - r^c e), & \lambda &= -r^c \mu, \\
x^c &= 1 - \mu (\beta - r^c \alpha), & \mu &= (\rho - r^c)/\delta^c,
\end{aligned}
\]

where $\delta^c := (r^c)^2 \alpha - 2 r^c \beta + \gamma > 0$. The resulting optimal risk is

\[ \sigma^2(\rho) = (\rho - r^c)^2 / \delta^c. \]

Its global minimum over all rewards is attained at $\hat\rho = r^c$ and has value zero. The associated
solution has 100 % cash: $(\hat x, \hat x^c) = (0, 1)$, $\hat\lambda = \hat\mu = 0$.

Proof. The system of optimality conditions is


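\[
\begin{pmatrix} \Sigma & 0 & e & \bar r \\ 0 & 0 & 1 & r^c \\ e^* & 1 & 0 & 0 \\ \bar r^* & r^c & 0 & 0 \end{pmatrix}
\begin{pmatrix} x \\ x^c \\ -\lambda \\ -\mu \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ 1 \\ \rho \end{pmatrix}.
\]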

The optimal budget multiplier λ is obtained from row two. Substitution into row one yields the
expression for $x$, and substituting $x$ into row three yields $x^c$. Substitution of $x$ and $x^c$ into row
four gives

\[ \rho = \bar r^* x + r^c x^c = \mu(\gamma - r^c \beta) + r^c - \mu(r^c \beta - (r^c)^2 \alpha) = r^c + \mu\delta^c, \]

yielding μ. The positivity of $\delta^c$ follows (with A3) from

\[ \delta^c = (\bar r - r^c e)^* \Sigma^{-1} (\bar r - r^c e). \]

Finally, the second formula for $x$ yields

\[ \sigma^2(\rho) = \mu^2 (\bar r - r^c e)^* \Sigma^{-1} (\bar r - r^c e) = \mu^2 \delta^c = (\rho - r^c)^2/\delta^c. \]

The remaining statements (ρ̂ = rc etc.) follow trivially.
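
The solution of Theorem 5 can be reproduced numerically. The sketch below assumes n = 2, a hypothetical covariance matrix, and the returns quoted in the captions of Figs. 1 and 2; the desired reward and all function names are illustrative.

```python
def solve_2x2(S, b):
    """Solve S y = b for a 2x2 matrix S by Cramer's rule."""
    (a11, a12), (a21, a22) = S
    det = a11 * a22 - a12 * a21
    return [(b[0] * a22 - a12 * b[1]) / det, (a11 * b[1] - a21 * b[0]) / det]

def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cash_solution(Sigma, r_bar, r_c, rho):
    """Closed-form solution of Problem 3 (risky assets plus cash), n = 2."""
    e = [1.0, 1.0]
    excess = [ri - r_c for ri in r_bar]        # r_bar - r_c * e
    Sinv_excess = solve_2x2(Sigma, excess)     # Sigma^{-1} (r_bar - r_c * e)
    delta_c = dot(excess, Sinv_excess)         # positive by A1 and A3
    mu = (rho - r_c) / delta_c                 # reward multiplier
    lam = -r_c * mu                            # budget multiplier
    x = [mu * s for s in Sinv_excess]          # risky part of the portfolio
    x_c = 1 - mu * dot(e, Sinv_excess)         # equals 1 - mu*(beta - r_c*alpha)
    risk = (rho - r_c) ** 2 / delta_c
    return x, x_c, lam, mu, risk

Sigma = [[0.04, 0.006], [0.006, 0.01]]   # hypothetical covariance (assumption)
r_bar = [1.15, 1.08]                     # expected returns from Fig. 1
r_c = 1.05                               # cash return from Fig. 2
rho = 1.10                               # illustrative desired reward

x, x_c, lam, mu, risk = cash_solution(Sigma, r_bar, r_c, rho)
variance = dot(x, [dot(row, x) for row in Sigma])     # x* Sigma x
x0, x0_c, _, _, risk0 = cash_solution(Sigma, r_bar, r_c, r_c)
```

The computed portfolio satisfies the budget and reward conditions, its variance matches the closed-form risk, and the choice ρ = r^c returns the all-cash portfolio with zero risk.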

Problem 4. Problem 3 also has a tradeoff version:


\[
\begin{aligned}
\min_{x, x^c} \quad & \tfrac12 x^* \Sigma x - \mu (\bar r^* x + r^c x^c) \\
\text{s.t.} \quad & e^* x + x^c = 1.
\end{aligned}
\]

Theorem 6. Problem 3 with parameter ρ and Problem 4 with parameter μ are equivalent if and
only if $\rho = r^c + \mu\delta^c$.

Proof. The proof is analogous to the proof of Theorem 3 and therefore omitted.

Discussion. Basically the situation is quite similar to Problem 2, the only qualitative difference
being the existence of one zero-risk portfolio: for $\rho = r^c$, the capital is completely invested in cash
and the risk vanishes. Otherwise a fraction of $e^* x = \mu(\beta - r^c \alpha)$ is invested in risky assets and the
risk is positive, see Fig. 2. The optimal portfolio is now a mix of the (reward-independent) risky
portfolio $(\Sigma^{-1}(\bar r - r^c e), 0)$ and cash $(0, 1)$. The following comparison shows how precisely the cash
account reduces risk when added to a set of (two or more) risky assets.

Theorem 7. The risk in Problem 3 is almost always lower than in Problem 2: If $\beta \ne r^c \alpha$, then
the efficient frontiers touch in the single point

\[
\rho = r^c + \frac{\delta^c}{\beta - r^c \alpha} = \frac{\gamma - r^c \beta}{\beta - r^c \alpha}, \qquad
\sigma^2(\rho) = \frac{\delta^c}{(\beta - r^c \alpha)^2},
\]

(see Fig. 2), where the solutions of both problems are “identical”: $x$ or $(x, 0)$. If $\beta = r^c \alpha$, then
$x^c \equiv 1$ and $e^* x \equiv 0$, and risks differ by the constant $1/\alpha$,

\[ \frac{(\rho - r^c)^2}{\delta^c} + \frac{1}{\alpha} = \frac{\alpha\rho^2 - 2\beta\rho + \gamma}{\delta}. \]

[Figure 2: Portfolio with two risky assets and cash; $\bar r_1$, $\bar r_2$ as before, $r^c = 1.05$. Left: efficient
frontiers with/without cash. Right: optimal portfolio vs. reward (asset 1, asset 2, cash).]


Proof. If $\beta \ne r^c \alpha$, then Problem 3 has a unique zero-cash solution, $x^c = 0$, with

\[ \mu = \frac{1}{\beta - r^c \alpha}, \qquad \lambda = -\frac{r^c}{\beta - r^c \alpha}. \]

This gives the stated values of ρ and $\sigma^2(\rho)$ by Theorem 5. Substituting ρ into the formulae for
λ, μ in Theorem 2 yields identical values in both problems. Hence the portfolios agree, too. The
curvatures of the efficient frontiers, $d^2\sigma^2(\rho)/d\rho^2$, are $2\alpha/\delta$ and $2/\delta^c$, respectively. Now,

\[ \alpha\delta^c - \delta = (r^c)^2 \alpha^2 - 2 r^c \alpha\beta + \alpha\gamma - \alpha\gamma + \beta^2 = (r^c \alpha - \beta)^2 > 0. \]

Thus $2\alpha/\delta > 2/\delta^c > 0$, implying that Problem 3 has lower risk if $x^c \ne 0$. The case $\beta = r^c \alpha$ is
trivial: both efficient frontiers have $\hat\rho = r^c$ and identical curvatures.
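
The comparison of Theorem 7 can be illustrated with scalars only. In this sketch α, β, γ are hypothetical constants (with δ > 0 and β ≠ r^c α), the cash return is the one quoted for Fig. 2, and the function names are illustrative.

```python
import math

def risk_without_cash(rho, alpha, beta, gamma):
    """Efficient frontier of Problem 2."""
    delta = alpha * gamma - beta ** 2
    return (alpha * rho ** 2 - 2 * beta * rho + gamma) / delta

def risk_with_cash(rho, r_c, alpha, beta, gamma):
    """Efficient frontier of Problem 3."""
    delta_c = r_c ** 2 * alpha - 2 * r_c * beta + gamma
    return (rho - r_c) ** 2 / delta_c

def touching_reward(r_c, alpha, beta, gamma):
    """Reward at which both frontiers touch (requires beta != r_c * alpha)."""
    slope = beta - r_c * alpha
    if slope == 0:
        raise ValueError("beta == r_c * alpha: the frontiers do not touch")
    return (gamma - r_c * beta) / slope

# Hypothetical constants (assumption); r_c as in Fig. 2.
alpha, beta, gamma = 104.4, 113.5, 123.6
r_c = 1.05

rho_t = touching_reward(r_c, alpha, beta, gamma)
frontiers_touch = math.isclose(
    risk_with_cash(rho_t, r_c, alpha, beta, gamma),
    risk_without_cash(rho_t, alpha, beta, gamma),
    rel_tol=1e-9,
)
# Risk reduction achieved by the cash account on a few sample rewards.
gaps = [
    risk_without_cash(rho, alpha, beta, gamma) - risk_with_cash(rho, r_c, alpha, beta, gamma)
    for rho in (0.9, 1.0, 1.05, 1.1, 1.2, 1.3)
]
```

At the touching reward both risks agree, while at the other sample rewards the frontier with cash lies below the one without.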

To conclude this section, we show that it does not make sense to consider portfolios with more
than one riskless asset (and no further restrictions).

Lemma 2 (Arbitrage). Any portfolio having at least two riskless assets $x^c$, $x^d$ with different
returns $r^c$, $r^d$ can realize any desired reward at zero risk.

Proof. Choose $x^c = (\rho - r^d)/(r^c - r^d)$, $x^d = 1 - x^c$, and invest nothing in other assets.
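
The construction in the proof of Lemma 2 is a one-line computation. The sketch below uses hypothetical riskless returns and an illustrative function name.

```python
def riskless_mix(rho, r_c, r_d):
    """Split unit wealth between two riskless assets so that the reward is rho."""
    if r_c == r_d:
        raise ValueError("the two riskless returns must differ")
    x_c = (rho - r_d) / (r_c - r_d)
    return x_c, 1 - x_c

# Hypothetical riskless returns (assumption) and a few target rewards.
r_c, r_d = 1.05, 1.02
targets = [0.5, 1.0, 1.05, 2.0]
mixes = [riskless_mix(rho, r_c, r_d) for rho in targets]
rewards = [r_c * x_c + r_d * x_d for x_c, x_d in mixes]
```

Every target reward is met exactly up to rounding, at the price of large short positions when the target lies far outside the interval between the two returns.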

1.3 Risky assets, cash, and guaranteed total loss


Let us now consider a portfolio with $n \ge 1$ risky assets, a riskless cash account as in Problem 3,
and in addition an “asset” $x^l$ with guaranteed total loss, i.e., $r^l \equiv \bar r^l = 0$. (Notice that $x^l$ is not
“risky” in the sense of an uncertain future.) At first glance this situation seems strange, but it
will turn out to be useful.³

Basic assumptions. In addition to the conditions of the previous section we now require positive
cash return ($r^c \le r^l$ does not make sense).

A1) $\Sigma > 0$.

A3) $\bar r \ne r^c e$.

A4) $r^c > 0$.

³ The suggestive (but actually misleading) notion of an “asset with guaranteed total loss” is due to Infanger [28]
who constructed an example problem similar to the one considered here.



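Problem 5. All covariances associated with $x^c$ or $x^l$ vanish, so that the risk and reward are
$R(x, x^c, x^l) = x^* \Sigma x$ and $\rho(x, x^c, x^l) = \bar r^* x + r^c x^c$, respectively, and the optimization problem
reads

\[
\begin{aligned}
\min_{x, x^c, x^l} \quad & \frac12 \begin{pmatrix} x \\ x^c \\ x^l \end{pmatrix}^* \begin{pmatrix} \Sigma & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ x^c \\ x^l \end{pmatrix} = \frac12 x^* \Sigma x \\
\text{s.t.} \quad & e^* x + x^c + x^l = 1, \quad x^l \ge 0, \\
& \bar r^* x + r^c x^c = \rho.
\end{aligned}
\]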

Note that the no-arbitrage condition xl ≥ 0 must be imposed; otherwise one could borrow arbitrary
amounts of money without having to repay. However, Lemma 2 still works for sufficiently small ρ.
This is precisely our intention.
Theorem 8. Problem 5 has unique primal and dual solutions $x, x^c, x^l, \lambda, \mu, \eta$, where η is the
multiplier of the nonnegativity constraint $x^l \ge 0$. For $\rho > r^c$, the optimal solution has $x^l = 0$ and
$\eta = -\lambda > 0$, and is otherwise identical to the solution of Problem 3. Any reward $\rho \le r^c$ is obtained
at zero risk by investing in a linear combination of the two riskless assets, with primal-dual solution

\[ x = 0, \qquad x^c = \frac{\rho}{r^c}, \qquad x^l = 1 - \frac{\rho}{r^c}, \qquad \lambda = \mu = \eta = 0. \]
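Proof. The system of necessary conditions can be written

\[
\begin{pmatrix} \Sigma & 0 & 0 & e & \bar r \\ 0 & 0 & 0 & 1 & r^c \\ 0 & 0 & 0 & 1 & 0 \\ e^* & 1 & 1 & 0 & 0 \\ \bar r^* & r^c & 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} x \\ x^c \\ x^l \\ -\lambda \\ -\mu \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ \eta \\ 1 \\ \rho \end{pmatrix},
\qquad x^l \ge 0, \quad \eta \ge 0, \quad x^l \eta = 0.
\]
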

As in Theorem 5, the first two rows yield $\lambda = -r^c \mu$ and $x = \mu \Sigma^{-1}(\bar r - r^c e)$. The third row yields
$\eta = -\lambda = r^c \mu$. Hence, by complementarity of $x^l$ and η, rows four and five yield either $x^c$ and μ
as in Problem 3 (if $x^l = 0$ and $\eta \ge 0$; case 1), or $x^c + x^l = 1$ and $r^c x^c = \rho$ (if $x^l \ge 0$ and $\eta = 0$;
case 2). Due to the nonnegativity of $x^l$ and η, case 1 can only hold for $\rho \ge r^c$ and case 2 only for
$\rho \le r^c$. (Indeed, for $\rho = r^c$ both cases coincide so that all variables are continuous with respect
to the parameter ρ.)
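
The two cases of Theorem 8 translate into a simple piecewise rule. The following sketch works with aggregate quantities only (total risky investment, cash, loss asset); α, β, γ are hypothetical constants, the cash return is taken from Fig. 2, and the function name and dictionary keys are illustrative.

```python
def loss_asset_solution(rho, r_c, alpha, beta, gamma):
    """Aggregate solution of Problem 5 for a desired reward rho (needs r_c > 0)."""
    delta_c = r_c ** 2 * alpha - 2 * r_c * beta + gamma
    if rho > r_c:
        # Case 1: no surplus capital; identical to the solution of Problem 3.
        mu = (rho - r_c) / delta_c
        risky = mu * (beta - r_c * alpha)          # total risky investment e* x
        return {"risky": risky, "cash": 1 - risky, "loss": 0.0,
                "mu": mu, "eta": r_c * mu, "risk": (rho - r_c) ** 2 / delta_c}
    # Case 2: the reward is reached with cash alone; the rest is surplus.
    return {"risky": 0.0, "cash": rho / r_c, "loss": 1 - rho / r_c,
            "mu": 0.0, "eta": 0.0, "risk": 0.0}

# Hypothetical constants (assumption); r_c as in Fig. 2.
alpha, beta, gamma = 104.4, 113.5, 123.6
r_c = 1.05

ambitious = loss_asset_solution(1.10, r_c, alpha, beta, gamma)
modest = loss_asset_solution(0.9 * r_c, r_c, alpha, beta, gamma)
boundary = loss_asset_solution(r_c, r_c, alpha, beta, gamma)
```

For a reward above the cash return the loss asset stays empty and the multiplier η is positive; for a reward of 0.9 r^c about one tenth of the capital is surplus and the risk is zero.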
Problem 6. The tradeoff version of Problem 5 reads
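\[
\begin{aligned}
\min_{x, x^c, x^l} \quad & \tfrac12 x^* \Sigma x - \mu (\bar r^* x + r^c x^c) \\
\text{s.t.} \quad & e^* x + x^c + x^l = 1, \quad x^l \ge 0.
\end{aligned}
\]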

Theorem 9. Problem 6 with $\mu > 0$ is equivalent to Problem 5 (with $\rho > r^c$) iff $\rho = r^c + \mu\delta^c$.
Every solution of Problem 5 with $\rho \le r^c$ is optimal for Problem 6 with $\mu = 0$. Problem 6 is
unbounded for $\mu < 0$: no solution exists.
Proof. The necessary conditions of both problems are identical, except that in the tradeoff problem
μ is given and the reward condition is missing. The condition $\eta = r^c \mu$ together with nonnegativity
and complementarity of $x^l$, η leads immediately to the three given cases.
Discussion. Apparently, at the price of slightly increased complexity, Problem 5 correctly captures
the case of an overly pessimistic investor. It minimizes something that qualitatively resembles a
quadratic downside risk (or shortfall risk): the risk of obtaining less than the desired amount, see
Fig. 3. In that sense the model is now more realistic. (In contrast, its tradeoff version becomes
degenerate for μ = 0 and does not extend to μ < 0.) But what does it mean to “invest” knowingly
in an asset with guaranteed total loss? Does it not imply that one could as well burn the money?
[Figure 3: Portfolio with two risky assets, cash, and asset with guaranteed loss; $\bar r_1$, $\bar r_2$, $r^c$ as before.
Left: efficient frontier. Right: optimal portfolio vs. reward (asset 1, asset 2, cash, loss).]

Let us first give the provocative answer “Yes, why not?”. From the point of view of the model,
the investor’s goal is minimizing the “risk” of earning less or more than the specified reward.
Therefore it makes sense to get rid of money whenever this reduces the variance—which it indeed
does for ρ < ρ̂. The model cannot know and consequently does not care how the investor will
interpret that, and it will use any possible means to take out capital if appropriate.
Of course, we can also offer a better interpretation. The fraction invested in xl is simply surplus
capital: the desired reward ρ is achieved at zero risk without that amount, so it should not be
invested in the first place—at least not into the portfolio under consideration. The investor may
enjoy a free lunch instead or support her favorite artist, if she prefers that to burning the money.
Or she may reconsider and decide to pursue a more ambitious goal: the model does not suggest
how to spend the surplus money. This interpretation of the new riskless (but inefficient) solutions
becomes obvious after the following observation.
Lemma 3. Problem 5 is equivalent to the modification of Problem 3 where the budget equation
$e^* x + x^c = 1$ is replaced by the inequality $e^* x + x^c \le 1$, i.e., less than 100 % investment is allowed.
Proof. With a slack variable $s \ge 0$, the modified condition is equivalent to $e^* x + x^c + s = 1$, and
the modified Problem 3 becomes identical to Problem 5: the ominous loss asset is simply a slack
variable, $x^l \equiv s$.

1.4 Utility functions


Let us start a brief excursion into utility-based portfolio optimization by considering Problem 1,
the tradeoff formulation of the mean-variance model for n risky assets. In utility theory, the
portfolio is chosen so that some function U (w), the investor’s subjective utility of final wealth
w = r∗ x, has maximal expectation for the given return distribution. The connection is apparent:
minimizing the tradeoff function with parameter μρ is equivalent to maximizing the expectation
E[Uρ (r∗ x)] if we define the family of concave quadratic utility functions

1 αρ − β
Uρ (w) := μρ w − (w − ρ)2 , μρ ≡ .
2 δ
(μρ is the optimal budget multiplier of the desired reward ρ.) If ρ + μρ > 0, then this equivalence
remains valid for the normalized utility functions
 
1 1 2 w w2
Ūρ (w) := U ρ (w) + ρ = − ,
(ρ + μρ )2 2 ρ + μρ 2(ρ + μρ )2

satisfying Ūρ(0) = 0 and max_{w∈ℝ} Ūρ(w) = Ūρ(ρ + μρ) = 1/2. For a portfolio with two positively
correlated risky assets, Fig. 4 shows the normalized utility functions associated with several
desired rewards, and the resulting optimal wealth distribution functions given normally distributed
returns, r ∼ N(r̄, Σ). The optimal wealth distributions have the explicit form

\[
  \Phi_\rho(w) := \frac{1}{\sqrt{2\pi}\,\sigma_\rho} \int_{-\infty}^{w}
                  \exp\Bigl(-\frac{(t - \rho)^2}{2\sigma_\rho^2}\Bigr)\,dt
                = \Phi\Bigl(\frac{w - \rho}{\sqrt{2}\,\sigma_\rho}\Bigr),
\]

where σρ² := σ²(ρ) = (μρ² δ + 1)/α, and Φ is the standard error integral,

\[
  \Phi(w) := \frac{1}{\sqrt{\pi}} \int_{-\infty}^{w} \exp(-t^2)\,dt.
\]

Figure 4: Utility-based portfolio optimization; two positively correlated assets. Left: Normalized
utility functions Ūρ and optimal wealth distribution functions Φρ for ρ ∈ {0.9rc, rc, r̄1, r̄2}.
Symmetry center of Φρ curves at (ρ, 1/2) marked by ‘o’; top of Ūρ parabolas at (ρ + μρ, 1/2) marked
by ‘|’. Right: Family of optimal wealth distribution functions Φρ over the range ρ ∈ [1, 1.2].

Figure 5: The same situation as in Fig. 4 with cash included.
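The normalized utility and the wealth distribution above can be evaluated directly. The following is a minimal Python sketch, not taken from the paper: the function names and the numerical values of ρ, μρ, and σρ are hypothetical and chosen only for illustration. It relies on the identity Φ(z) = (1 + erf(z))/2 for the error integral defined above.

```python
import math

def normalized_utility(w, rho, mu):
    """U-bar_rho(w) = w/(rho+mu) - w^2 / (2 (rho+mu)^2); requires rho + mu > 0."""
    s = rho + mu
    if s <= 0:
        raise ValueError("normalization requires rho + mu > 0")
    return w / s - w * w / (2.0 * s * s)

def error_integral(z):
    """Phi(z) = (1/sqrt(pi)) * integral_{-inf}^{z} exp(-t^2) dt = (1 + erf(z)) / 2."""
    return 0.5 * (1.0 + math.erf(z))

def wealth_cdf(w, rho, sigma):
    """Phi_rho(w) = Phi((w - rho) / (sqrt(2) * sigma)) for sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive; the zero-risk case is a jump at rho")
    return error_integral((w - rho) / (math.sqrt(2.0) * sigma))

# Hypothetical illustrative values (assumptions, not data from the paper).
rho, mu, sigma = 1.10, 0.25, 0.08

u_at_zero = normalized_utility(0.0, rho, mu)       # utility of zero wealth
u_at_top = normalized_utility(rho + mu, rho, mu)   # top of the parabola
cdf_at_rho = wealth_cdf(rho, rho, sigma)           # symmetry center of Phi_rho
```

The three computed values correspond to the properties stated in the text: Ūρ(0) = 0, the maximum 1/2 at w = ρ + μρ, and the symmetry center (ρ, 1/2) of Φρ.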
When cash is included in the portfolio (Problem 4), utility functions Uρ , Ūρ and distribution
functions Φρ have precisely the same form, except that the final wealth becomes w = r∗ x + rc xc ,
and that μρ := (ρ − rc )/δ c and σρ2 := μ2ρ δ c . The properties of Φρ give another indication of the
risk reduction mechanism described in Theorem 7: Fig. 5 shows that the slope of Φρ is rather
steep and becomes a jump in the zero-risk case, ρ = rc .
When a loss asset is also added to the portfolio in Problem 6, the utility functions Uρ , Ūρ
are exactly identical to the previous case, and even their optimal wealth distributions for ρ ≥ rc
coincide, see Fig. 6. For ρ ≤ rc , however, the wealth distributions Φρ := χ[ρ,∞) become indicator
functions rather than normal distributions: they all have μρ = σρ = 0 and a jump discontinuity
at ρ, producing zero risk. (In Problem 4 this happens only for ρ = rc .)
In this paper we do not wish to pursue the subject further. The interested reader should refer
to the original considerations of Markowitz [54, Part IV], the literature cited in the introduction
(especially [31, 37, 39, 40, 46, 48, 59]), and the references therein.

Figure 6: The same situation as in Fig. 4 with cash and loss included.

1.5 Influence of inequalities


Except for the no-arbitrage condition xl ≥ 0, all the problems considered so far were purely
equality-constrained. Now we study problems with inequalities. Let us first view ρ as a lower
bound (not an exact value) for the desired reward.
Theorem 10. Consider the following modifications of Problems 2, 3, and 5: the reward equation
is replaced by the inequality r̄∗ x ≥ ρ in Problem 2 and by r̄∗ x + rc xc ≥ ρ in Problems 3 and 5.
Then:
1. The solution of each original problem for ρ ≥ ρ̂ is also the unique solution of the correspond-
ing modified problem (upper branch).
2. The solution of Problem 2 or 3 with reward ρ̂ is also the unique solution of the corresponding
modified problem for any ρ ≤ ρ̂ (lower branch).
3. Any solution of Problem 5 with ρ(x, xc , xl ) ∈ [ρ, rc ] is a riskless solution of the modified
problem with ρ < rc . That is, any portfolio (0, xc , 1 − xc ) with xc ∈ [ρ/rc , 1] is optimal.
Proof. Obvious.
Discussion. Specifying the desired reward as a lower bound rather than an exact value leads to
reasonable behavior on the lower branch of the efficient frontier. The minimal-risk solution is
simply extended to all sufficiently small rewards, yielding again a quadratic downside-like risk
in each case. In Problem 3 this provides an alternative to introducing a loss asset. (By state-
ment 3, the combination yields non-unique solutions but no further advantages.) The lower bound
formulation—as originally introduced by Markowitz—might appear more natural than the loss as-
set, but mathematically both are equivalent: optimal investments in risky assets and the resulting
risk are identical, only the surplus money is put in xl (removed immediately) in the latter case
and invested in cash xc in the former case—to be removed afterwards. This difference could be
interpreted as reflecting certain attitudes of the investor toward surplus money: e∗ x + xc ≤ 1
and ρ(x) ≥ ρ would model respective preferences for immediate consumption or for future profit,
whereas the combination would express indecision. Instead, we simply interpret the loss model
as giving the investor a choice between consumption and profit in some situations. Leaving the
choice open seems preferable in view of the multi-period case.
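Statements 1 and 2 of Theorem 10 can be illustrated for Problem 2 with a few lines of Python. This is only a sketch under stated assumptions: it uses the frontier expression σ²(ρ) = (μρ² δ + 1)/α with μρ = (αρ − β)/δ quoted in Section 1.4 and the minimizer ρ̂ = β/α; the function names and the constants α, β, γ are hypothetical.

```python
def frontier_variance(rho, alpha, beta, delta):
    """Optimal risk for the reward *equation*: sigma^2(rho) = (mu^2 delta + 1) / alpha."""
    mu = (alpha * rho - beta) / delta
    return (mu * mu * delta + 1.0) / alpha

def frontier_variance_lower_bound(rho, alpha, beta, delta):
    """Optimal risk when rho is only a lower bound on the reward.

    For rho <= rho_hat the minimal-risk solution is reused (lower branch);
    for rho >= rho_hat the bound is active and nothing changes (upper branch).
    """
    rho_hat = beta / alpha
    return frontier_variance(max(rho, rho_hat), alpha, beta, delta)

# Hypothetical constants with delta = alpha * gamma - beta^2 > 0.
alpha, beta, gamma = 50.0, 55.0, 61.0
delta = alpha * gamma - beta * beta   # 25.0
rho_hat = beta / alpha                # 1.1

equality_low = frontier_variance(1.0, alpha, beta, delta)
bound_low = frontier_variance_lower_bound(1.0, alpha, beta, delta)
equality_high = frontier_variance(1.2, alpha, beta, delta)
bound_high = frontier_variance_lower_bound(1.2, alpha, beta, delta)
```

With these numbers the equality-constrained risk below ρ̂ is twice the minimal risk 1/α, whereas the lower-bound formulation stays at 1/α; above ρ̂ both formulations agree.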
In practice, nonnegativity constraints x ≥ 0, xc ≥ 0 will usually be included to prohibit
short-selling assets or borrowing cash.⁴ The budget equation then implies that only convex
combinations of the assets are permitted. (That is, strictly speaking, convex combinations of
single-asset portfolios.) Of course, the unconstrained solution remains valid if and only if it is
nonnegative anyway. Otherwise some constraints become tight, excluding the corresponding assets from
the portfolio and increasing risk. More precisely, the following simple facts hold.

⁴ Even if borrowing is allowed, it should also be modeled as a (separate) nonnegative asset since the
interest rate differs from the one for investing.

Theorem 11. Include nonnegativity constraints x ≥ 0, xc ≥ 0 in Problems 2, 3, and 5, and
assume for simplicity that r̄ > rc e > 0. Denote by r̄min, r̄max the minimal and maximal expected
return in the portfolio, and choose corresponding assets xmin, xmax.

1. Problems 3 and 5 have xmin = xc and xmin = xl , respectively.

2. In each problem, an optimal solution exists iff ρ ∈ [r̄min , r̄max ].

3. The efficient frontier is convex and piecewise quadratic (or linear).

Proof. Statement 1 is trivial. Since each problem is convex, an optimal solution exists if and only if
the feasible set is nonempty. For ρ ∈ [r̄min , r̄max ], the feasible set clearly contains a (unique) convex
combination of xmin and xmax . Conversely, every convex combination of assets yields as reward
the same convex combination of individual expected returns, which lies in the range [r̄min , r̄max ].
This proves statement 2. To prove statement 3 consider Problem 2 first. Let ρ0 , ρ1 ∈ [r̄min , r̄max ]
with respective solutions x0 , x1 . Then xt := (1 − t)x0 + tx1 is feasible for ρt := (1 − t)ρ0 + tρ1 ,
t ∈ [0, 1], and convexity of the efficient frontier follows from convexity of R,

σ 2 (ρt ) ≤ R(xt ) ≤ (1 − t)R(x0 ) + tR(x1 ) = (1 − t)σ 2 (ρ0 ) + tσ 2 (ρ1 ).

At r̄min and r̄max all the money is invested in one single asset: xmin or xmax . Each ρ ∈ (r̄min , r̄max )
determines a subset of two or more nonnegative assets whose efficient frontier gives the optimal
risk in that point. Since strict positivity is a generic property, each of these sub-portfolios is
optimal either in a single point or on an entire interval. Thus, the efficient frontier (in Problem 2)
is composed of finitely many quadratic pieces. Precisely the same arguments hold for Problem 3
since R(x, xc ) ≡ R(x). In Problem 5, the efficient frontier consists of the segment σ 2 (ρ) ≡ 0 on
[0, rc ] and the pieces of Problem 3 on [rc , r̄max ].

Discussion. The theorem gives a simple characterization of the influence of standard nonnegativity
constraints. In a portfolio with three risky assets, the respective efficient frontiers of subportfolios
that contribute to the optimal solution in Problems 2, 3, and 5 might look as in Fig. 7. Other
inequalities, like upper bounds on the assets or limits on arbitrary asset combinations, will further
restrict the range of feasible rewards and increase the risk in a similar manner. This case was
already considered by Markowitz: he handles general linear inequalities by dummy assets (slacks)
and constraints Ax = b, x ≥ 0, where A = e∗ (with x ≥ 0 and ρ(x) ≥ ρ) is called the standard
case [54, p. 171]. Moreover, Markowitz devised an algorithm to trace the critical lines, that is,
the segments of the efficient frontier [53], [54, Chap. VIII]. The number of assets in an optimal
portfolio is investigated, e.g., by Nakasato and Furukawa [61].
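Statement 2 of Theorem 11 is easiest to see with two risky assets, where the budget and reward equations already determine the portfolio. The following Python sketch is an illustration only; the function name and the numbers are hypothetical, and it covers just the two-asset instance of Problem 2 with x ≥ 0.

```python
def long_only_two_asset(rho, rbar, cov):
    """Two-asset version of Problem 2 with x >= 0.

    Budget x1 + x2 = 1 and reward rbar1*x1 + rbar2*x2 = rho fix the portfolio,
    so the problem is feasible iff that portfolio is nonnegative.
    Returns ((x1, x2), risk) or None if no feasible portfolio exists.
    """
    r1, r2 = rbar
    if r1 == r2:
        raise ValueError("expected returns must differ")
    x2 = (rho - r1) / (r2 - r1)
    x1 = 1.0 - x2
    if x1 < 0.0 or x2 < 0.0:
        return None
    risk = cov[0][0] * x1 * x1 + 2.0 * cov[0][1] * x1 * x2 + cov[1][1] * x2 * x2
    return (x1, x2), risk

# Hypothetical data: expected returns and covariance matrix of two assets.
rbar = (1.05, 1.15)
cov = [[0.010, 0.002], [0.002, 0.040]]

inside = long_only_two_asset(1.10, rbar, cov)      # rho within [rbar_min, rbar_max]
too_high = long_only_two_asset(1.20, rbar, cov)    # rho above rbar_max
too_low = long_only_two_asset(1.00, rbar, cov)     # rho below rbar_min
at_min = long_only_two_asset(1.05, rbar, cov)      # all money in the first asset
```

Rewards outside [r̄min, r̄max] would require a negative position and are therefore reported as infeasible, while the endpoints put all the money into a single asset.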

1.6 Downside risk


In the discussion of Section 1.3 we made the remark that Problem 5 resembles a downside risk.
This will now be investigated in detail. We have to work with the distribution of returns, but the
entire analysis can be given in geometric terms using its support. Consider a probability space
(Rn , B, P ) and let Ξ ∈ B denote the support of P , i.e., the smallest closed Borel set with measure
one,

\[
  \Xi = \operatorname{supp}(P) := \bigcap_{P(S) = 1} S.
\]

(Of course, if P has a density φ, then Ξ = supp(φ) = {x ∈ Rn : φ(x) > 0}.) In the following we
will actually use the convex hull of the support in most cases, denoted by C := conv(Ξ).

Figure 7: Efficient frontiers for Problems 2, 3, and 5 with nonnegative assets. (Curve labels in the
plot: assets 2,3; assets 1,2; assets 1,2,3; cash & assets 2,3; cash & loss.)

Definition 3 (Downside risk). For a function w of the random vector r with distribution P,
the downside risk of order q > 0 with target τ ∈ ℝ is

\[
  R^q_\tau(w) := E\bigl[\,|\min(w(r) - \tau, 0)|^q\,\bigr]
               = \int_{\mathbb{R}^n} |\min(w(r) - \tau, 0)|^q \, dP.
\]
Remarks. Without the risk context such expectations are neutrally called lower partial moments,
with downside expected value or semi-deviation (order 1) and downside variance or semi-variance
(order 2) as special cases. In [54, Chap. XIII] Markowitz gives a qualitative discussion of the
linear case (expected value of loss, q = 1), the quadratic case (semi-variance, q = 2), and some
other measures of risk, by examining the associated utility functions. Expectation of loss has
recently gained interest as a coherent replacement for the popular Value-at-Risk (VaR), often
under alternative names like mean shortfall, tail VaR, or conditional VaR, cf. [3, 79].
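For a discrete distribution the expectation in Definition 3 becomes a finite sum. A minimal Python sketch follows; the function name and the sample values are hypothetical and serve only to illustrate the orders q = 1 and q = 2.

```python
def downside_risk(values, probs, target, q=2):
    """Lower partial moment of order q > 0: E[ |min(w - target, 0)|^q ].

    values: realizations of the scalar outcome w(r); probs: their probabilities.
    """
    if q <= 0:
        raise ValueError("order q must be positive")
    return sum(p * abs(min(w - target, 0.0)) ** q for w, p in zip(values, probs))

# Hypothetical final-wealth distribution with three outcomes.
values = [0.9, 1.0, 1.2]
probs = [0.25, 0.5, 0.25]

expected_loss = downside_risk(values, probs, target=1.0, q=1)   # order 1
semi_variance = downside_risk(values, probs, target=1.0, q=2)   # order 2
```

Only the outcome 0.9 falls short of the target 1.0, so outcomes above the target do not contribute to either measure.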
In the following we are only interested in quadratic downside risk of portfolio returns like
wx,xc (r, rc ) = r∗ x + rc xc . Moreover, we always use the desired reward as a natural choice for the
shortfall target, τ = ρ, and write simply Rρ (x, xc ) instead of Rρ2 (wx,xc ). The problems considered
in this section are downside risk versions of Problems 3 and 5, and of the modification of Problem 3
with ρ(x, xc ) ≥ ρ. In each case only the objective is changed: standard risk R is replaced by
downside risk Rρ . Before considering these problems we need some technical preparations.
For x ≠ 0 and c ∈ ℝ let us introduce open and closed half-spaces

\[
  H(x, c) := \{ r \in \mathbb{R}^n : r^* x < c \}, \qquad
  \bar H(x, c) := \{ r \in \mathbb{R}^n : r^* x \le c \},
\]

and portfolio-dependent semi-variance matrices

\[
  \Sigma(x) := \int_{\bar r + H(x, 0)} (r - \bar r)(r - \bar r)^* \, dP, \qquad x \ne 0.
\]

For x = 0 let Σ(0) := ½Σ where Σ is of course the usual covariance matrix,

\[
  \Sigma := \int_{\mathbb{R}^n} (r - \bar r)(r - \bar r)^* \, dP.
\]

Lemma 4. Denote by ⊎ a disjoint union. Then, for x ≠ 0 and a > 0,
1. H(ax, c) = H(x, a⁻¹c), H(x, ac) = H(a⁻¹x, c), H(ax, ac) = H(x, c).
2. H(−x, −c) = ℝⁿ \ H̄(x, c).
3. H̄(x, c) = H(x, c) ⊎ ∂H(x, c), H̄(x, 0) = H(x, 0) ⊎ {x}⊥.
Statements 1 and 2 remain valid when H and H̄ are exchanged everywhere.
Proof. Trivial.
Lemma 5. For x ∈ Rn and a > 0,
1. Σ(ax) = Σ(x).
2. 0 ≤ Σ(x) ≤ Σ. (In particular, each Σ(x) is positive semidefinite.)
3. x∗ Σ(x)x = E[min((r − r̄)∗ x, 0)2 ].
4. x∗ [Σ(x) + Σ(−x)]x = x∗ Σx.
Proof. Statements 1, 2, 3 are obvious from the definitions and the first identity in Lemma 4.
The expressions in statement 4 are identical for x = 0; otherwise they differ by the integral of
((r − r̄)∗ x)2 over r̄ + {x}⊥ , which is clearly zero.
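For a discrete return distribution the semi-variance matrix Σ(x) is a finite sum over the realizations in the open half-space r̄ + H(x, 0), so statements 1, 3, and 4 of Lemma 5 can be checked numerically. The following Python sketch is illustrative only; the helper names and the four equally likely realizations are hypothetical.

```python
def mean(points, probs):
    n = len(points[0])
    return [sum(p * r[i] for r, p in zip(points, probs)) for i in range(n)]

def semi_variance_matrix(points, probs, x):
    """Sigma(x): sum of p (r - rbar)(r - rbar)^T over realizations with (r - rbar)^T x < 0.

    For x = 0 the convention Sigma(0) = Sigma / 2 is used.
    """
    n = len(x)
    rbar = mean(points, probs)
    is_zero = all(v == 0 for v in x)
    S = [[0.0] * n for _ in range(n)]
    for r, p in zip(points, probs):
        d = [ri - mi for ri, mi in zip(r, rbar)]
        s = sum(di * xi for di, xi in zip(d, x))
        if is_zero:
            weight = 0.5 * p
        elif s < 0:
            weight = p
        else:
            continue
        for a in range(n):
            for b in range(n):
                S[a][b] += weight * d[a] * d[b]
    return S

def covariance(points, probs):
    """Usual covariance matrix, Sigma = 2 * Sigma(0)."""
    half = semi_variance_matrix(points, probs, [0.0] * len(points[0]))
    return [[2.0 * v for v in row] for row in half]

def quad(M, x):
    return sum(x[a] * M[a][b] * x[b] for a in range(len(x)) for b in range(len(x)))

# Hypothetical discrete distribution: four equally likely return vectors.
points = [(0.9, 1.0), (1.1, 1.3), (1.3, 0.9), (1.0, 1.2)]
probs = [0.25, 0.25, 0.25, 0.25]
x = [1.0, 0.5]
neg_x = [-v for v in x]

rbar = mean(points, probs)
lhs3 = quad(semi_variance_matrix(points, probs, x), x)
rhs3 = sum(p * min(sum((ri - mi) * xi for ri, mi, xi in zip(r, rbar, x)), 0.0) ** 2
           for r, p in zip(points, probs))
lhs4 = lhs3 + quad(semi_variance_matrix(points, probs, neg_x), x)
rhs4 = quad(covariance(points, probs), x)
scaled = semi_variance_matrix(points, probs, [3.0 * v for v in x])
```

Here `lhs3`/`rhs3` compare the two sides of statement 3, `lhs4`/`rhs4` those of statement 4, and `scaled` is Σ(3x), which by statement 1 should coincide with Σ(x).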
Lemma 6. For any random vector r the following holds.
1. The expectation lies in the convex hull of the support: r̄ ∈ C.
2. The covariance matrix and all semi-variance matrices are positive definite if and only if
Ξ has full dimension in the sense that its convex hull has nonempty interior:
int(C) ≠ ∅ ⇐⇒ Σ > 0 ⇐⇒ Σ(x) > 0 ∀x ∈ ℝⁿ.
3. If r is discrete with Σ > 0, then it has at least n + 1 realizations.
Proof. Assume r̄ ∉ C. Then r̄ has positive distance to C, and a vector x ≠ 0 exists so that
(r − r̄)∗x > 0 for all r ∈ C. Since expectation is the integral over Ξ ⊆ C, this yields the
contradiction 0 < E[(r − r̄)∗x] = 0, proving statement 1. Now assume int(C) = ∅. Then C is
contained in some hyperplane r̄ + {x}⊥ with x ≠ 0, implying

\[
  x^* \Sigma x = E\bigl[((r - \bar r)^* x)^2\bigr] = 0.
\]

Hence Σ is only positive semidefinite. Conversely, assume int(C) ≠ ∅ and x ≠ 0. Then (r − r̄)∗x < 0
for all r ∈ r̄ + H(x, 0). By Lemma 7 below, r̄ + H(x, 0) has positive measure. Therefore

\[
  x^* \Sigma(x) x = \int_{\bar r + H(x, 0)} ((r - \bar r)^* x)^2 \, dP > 0,
\]

showing that Σ(x) > 0. The proof of statement 2 is complete since Σ ≥ Σ(x) for all x. Now
statement 3 is an immediate consequence.
Lemma 7. Let int(C) ≠ ∅ and x ≠ 0. Then r̄ + H(x, 0) has positive measure.
Proof. The inner product s(x) := (r − r̄)∗x is negative, zero, and positive on the respective sets
r̄ + H(x, 0), r̄ + {x}⊥, and r̄ + H(−x, 0). Furthermore,

\[
  \int_{\bar r + H(x, 0)} s(x) \, dP + \int_{\bar r + \{x\}^\perp} s(x) \, dP
  + \int_{\bar r + H(-x, 0)} s(x) \, dP = E[s(x)] = 0.
\]

Therefore r̄ + H(x, 0) and r̄ + H(−x, 0) have either both positive measure or both measure zero.
The second case implies Ξ ⊆ r̄ + {x}⊥, which leads to the contradiction int(C) = ∅.

Let us now study the downside risk versions of Problems 3 and 5 under the same assumptions
as before (A1 and A3 resp. A1, A3, and A4). It will be seen that in these cases the qualitative
behavior does not change significantly. This is mainly because the constraints are linear and Σ(x)
depends only on the direction and not on the magnitude of x (cf. Lemma 5).

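Problem 7. We minimize downside risk Rρ(x, xc) for risky assets and cash, with fixed desired
reward ρ(x, xc) = ρ,

\[
\begin{aligned}
  \min_{x,\,x^c} \quad & \frac{1}{2} \int_{\mathbb{R}^n} \min(r^* x + r^c x^c - \rho, 0)^2 \, dP \\
  \text{s.t.} \quad & e^* x + x^c = 1, \\
                    & \bar r^* x + r^c x^c = \rho.
\end{aligned}
\]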

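Problem 8. Now minimize downside risk Rρ(x, xc, xl) for risky assets, cash, and loss, with fixed
desired reward ρ(x, xc, xl) = ρ,

\[
\begin{aligned}
  \min_{x,\,x^c,\,x^l} \quad & \frac{1}{2} \int_{\mathbb{R}^n} \min(r^* x + r^c x^c - \rho, 0)^2 \, dP \\
  \text{s.t.} \quad & e^* x + x^c + x^l = 1, \quad x^l \ge 0, \\
                    & \bar r^* x + r^c x^c = \rho.
\end{aligned}
\]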

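Lemma 8. With xc ≡ 1 − e∗x − xl and θ ≡ rc xl, Problem 8 is equivalent to

\[
\begin{aligned}
  \min_{x,\,\theta} \quad & \frac{1}{2} x^* \Sigma(x) x \\
  \text{s.t.} \quad & (\bar r - r^c e)^* x = \rho + \theta - r^c, \quad \theta \ge 0.
\end{aligned}
\]

When fixing θ = 0, the resulting problem is equivalent to Problem 7.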

Proof. The modified reward condition is immediately obtained by the identity xc = 1 − e∗x − xl.
Using r∗x + rc xc − ρ = (r − r̄)∗x gives the downside risk

\[
  \int_{\mathbb{R}^n} \min((r - \bar r)^* x, 0)^2 \, dP
  = \int_{\bar r + H(x, 0)} ((r - \bar r)^* x)^2 \, dP = x^* \Sigma(x) x.
\]

(The special case x = 0 is easily verified.) Clearly, θ = 0 means xl = 0, yielding Problem 7.

We are now ready to analyze Problems 7 and 8. In general, closed-form solutions cannot be
found due to the nonlinearity of downside risk with respect to the risky assets. However, we can
derive some important properties of the solutions and give a qualitative comparison to Problems
3 and 5.

Lemma 9. Optimal solutions always exist in Problems 7 and 8. The resulting downside risk is
nonnegative and not greater than the optimal risk in Problem 3 or 5, respectively. Moreover, the
riskless solutions of Problems 7 and 3 (8 and 5) are identical. (In general the solutions are not
unique.)

Proof. Convexity of min(w, 0)² implies convexity of downside risk x∗Σ(x)x and thus of Problems 7
and 8. The existence of solutions and the stated bounds follow since 0 ≤ Σ(x) ≤ Σ by Lemma 5.
By assumption A1 and Lemma 6, zero risk requires x = 0, which holds under the same conditions
as in the standard risk case.

Theorem 12. In Problem 7, choose respective optimal portfolios (x±, xc±) for ρ± := rc ± 1. Then
(ax±, axc± − a + 1) is optimal for ρ = rc ± a if a ≥ 0. Moreover, x± ≠ 0 and x+ ≠ x−.

Figure 8: Hyperplanes (left) and half-spaces (right) in downside risk (n = 2). (Labels in the plot:
axes r1, r2; points r̄ and rc e; vectors x, y; hyperplanes r̄ + {x}⊥ and r̄ + {y}⊥; sets Ξ and Ξ′.)

Proof. The transformation of xc+ follows from e∗x + xc = 1 if ax+ is optimal. Assume ax+ is not
optimal for ρ = rc + a > rc. By Lemma 8, x ≠ ax+ exists so that (r̄ − rc e)∗x = ρ − rc and

\[
  x^* \Sigma(x) x < a x_+^* \Sigma(a x_+) a x_+ = a^2 x_+^* \Sigma(x_+) x_+,
\]

where the last equality holds by Lemma 5. Hence,

\[
  (\bar r - r^c e)^* a^{-1} x = a^{-1} (\rho - r^c) = 1 = \rho_+ - r^c
\]

and

\[
  a^{-1} x^* \Sigma(a^{-1} x) a^{-1} x = a^{-2} x^* \Sigma(x) x < x_+^* \Sigma(x_+) x_+.
\]

Thus x+ cannot be optimal for ρ+: a contradiction. The case ρ < rc is analogous, and ρ = rc is
trivial. Finally observe that (r̄ − rc e)∗x− < 0 < (r̄ − rc e)∗x+.

Theorem 13. Constants c± ∈ (0, 1) exist so that the optimal risk in Problem 7 is c+ (c− ) times
the optimal risk of Problem 3 on the upper (lower) branch.
Proof. The existence of c± ∈ (0, 1] with the stated properties follows from Lemma 9 and Theo-
rem 12. Statement 4 of Lemma 5 implies c± < 1.

Theorem 14. The same statements as in Theorems 12 and 13 hold on the upper branch in Prob-
lem 8. On the lower branch one has the (unique) riskless solution (x, xc , xl ) = (0, ρ/rc , 1 − ρ/rc ).

Proof. This is a simple case distinction.


Remarks. When assumption A1 (Σ > 0) is dropped, the following can be shown using Lemmas
4–8. A feasible portfolio with x ≠ 0 has zero risk in Problem 7 or 8 iff Ξ is contained in the
hyperplane r̄ + {x}⊥. In Problem 7 such a portfolio exists for ρ = rc iff Ξ lies in a hyperplane
containing both r̄ and rc e (see x in Fig. 8), and for ρ ≠ rc iff Ξ lies in a hyperplane containing r̄
but not rc e (see y in Fig. 8). Similarly, in Problem 8 such a portfolio exists for ρ ≤ rc iff Ξ lies in
any hyperplane containing r̄ (see x, y in Fig. 8), and for ρ > rc iff Ξ lies in a hyperplane containing
r̄ but not rc e (see y in Fig. 8). In both cases, an arbitrage is thus possible if Ξ lies in a hyperplane
containing r̄ but not rc e.
Discussion. Up to now downside risk behaves qualitatively similarly to standard risk: the efficient
frontier is still piecewise quadratic, and optimal portfolios are always combinations of reward-
independent portfolios. Only uniqueness is not guaranteed any more, and the curvatures of upper
and lower branches of the efficient frontier may differ. (Optimal portfolios will usually also differ
from their standard risk counterparts, of course.) The similarity is caused by fixing the reward:
this places the mean r̄ on the boundary of semi-variance half-spaces so that the properties of Σ(x)
come into play. The two risk measures become identical if the return distribution is symmetric
with respect to rotations about r̄. In that case Σ(x) = ½Σ for all x, and c+ = c− = ½.

The last problem considered in this section is the downside risk version of the modification of
Problem 3. Again the riskless solutions are of interest.
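Problem 9. We minimize downside risk Rρ(x, xc) for risky assets and cash, with desired minimal
reward ρ(x, xc) ≥ ρ,

\[
\begin{aligned}
  \min_{x,\,x^c,\,\theta} \quad & \frac{1}{2} \int_{\mathbb{R}^n} \min(r^* x + r^c x^c - \rho, 0)^2 \, dP \\
  \text{s.t.} \quad & e^* x + x^c = 1, \\
                    & \bar r^* x + r^c x^c = \rho + \theta, \quad \theta \ge 0.
\end{aligned}
\]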

Remark. Note that downside risk is still calculated with respect to the desired reward ρ whereas
the actual reward is now ρ + θ. Otherwise the problem would be equivalent to Problem 8.
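Lemma 10. (x, xc) = (0, 1) is feasible for Problem 9 iff ρ ≤ rc. Otherwise, with xc ≡ 1 − e∗x,
Problem 9 is equivalent to

\[
\begin{aligned}
  \min_{x,\,\theta} \quad & \frac{1}{2} \int_{\bar r + H(x, -\theta)} (\theta + (r - \bar r)^* x)^2 \, dP \\
  \text{s.t.} \quad & (\bar r - r^c e)^* x = \rho + \theta - r^c, \quad \theta \ge 0.
\end{aligned}
\]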

Proof. The first part is trivial. The second part is proved as Lemma 8, the only difference being
that the reward condition now yields r∗ x + rc xc − ρ = (r − r̄)∗ x + θ instead of (r − r̄)∗ x.
Theorem 15. If rc e ∈ int(C), then the following holds in Problem 9.
1. For every ρ ≤ rc, (x, xc) = (0, 1) is a riskless solution.
2. A portfolio with x ≠ 0 has zero risk iff Ξ ⊆ rc e + H̄(−x, rc − ρ).
3. For ρ < rc such a portfolio exists iff Ξ is contained in some closed half-space.
4. For ρ ≥ rc such a portfolio does not exist.
Proof. Statement 1 is trivial. If x ≠ 0, then downside risk clearly vanishes iff Ξ does not intersect
r̄ + H(x, −θ), that is, iff (r − r̄)∗x + θ ≥ 0 for r ∈ Ξ. Substituting θ = (r̄ − rc e)∗x − ρ + rc from
the reward equation yields the equivalent condition (r − rc e)∗x ≥ ρ − rc for r ∈ Ξ, which proves
statement 2. (This condition implies feasibility of x since r̄ ∈ C.) Now, if Ξ is contained in some
closed half-space, then y ≠ 0 exists so that (r − rc e)∗y ≥ −1 for r ∈ Ξ, see Fig. 8. For ρ < rc let
x := (rc − ρ)y to satisfy the zero-risk condition. The “only if” direction of statement 3 is trivial.
Finally observe that C contains an open ball centered at rc e ∈ int(C). On that ball the inner
product (r − rc e)∗x takes positive and negative values for any x ≠ 0, showing that for ρ ≥ rc the
zero-risk condition (r − rc e)∗x ≥ ρ − rc ≥ 0 cannot be satisfied.
Remarks. Similar arguments show that for ρ > rc (ρ = rc) a zero-risk portfolio with x ≠ 0 exists
iff Ξ lies in a closed half-space not containing rc e (not containing rc e in its interior), see Ξ in
Fig. 8. This is why we need an additional no-arbitrage condition. Although rc e ∈ C would suffice,
we choose the stronger condition rc e ∈ int(C) to ensure a unique riskless solution (100 % cash) for
ρ = rc. Thus x ≠ 0 will produce positive risk for any ρ ≥ rc.
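The zero-risk condition of Theorem 15 can be tested directly when the support Ξ is a finite set. The Python sketch below is an illustration under assumptions: the function names, the three realizations, and the cash return are hypothetical, and the construction x := (rc − ρ)y follows the proof of statement 3 with y = (1, 1).

```python
def downside_risk_problem9(x, rho, r_c, points, probs):
    """0.5 * E[min(r'x + r_c x_c - rho, 0)^2] with x_c = 1 - e'x (budget equation)."""
    x_c = 1.0 - sum(x)
    total = 0.0
    for r, p in zip(points, probs):
        wealth = sum(ri * xi for ri, xi in zip(r, x)) + r_c * x_c
        total += p * min(wealth - rho, 0.0) ** 2
    return 0.5 * total

def satisfies_zero_risk_condition(x, rho, r_c, points):
    """Check (r - r_c e)'x >= rho - r_c for every realization r in the support."""
    return all(sum((ri - r_c) * xi for ri, xi in zip(r, x)) >= rho - r_c
               for r in points)

# Hypothetical data: three return scenarios whose convex hull contains r_c * e
# in its interior, and a cash return r_c.
points = [(0.8, 0.9), (1.3, 1.0), (1.0, 1.4)]
probs = [1.0 / 3.0] * 3
r_c = 1.02

# Lower branch (rho < r_c): y satisfies (r - r_c e)'y >= -1 on the support.
rho_low = 0.97
y = [1.0, 1.0]
x_low = [(r_c - rho_low) * yi for yi in y]
ok_low = satisfies_zero_risk_condition(x_low, rho_low, r_c, points)
risk_low = downside_risk_problem9(x_low, rho_low, r_c, points, probs)

# Upper branch (rho >= r_c): one risky portfolio; the condition fails here.
rho_high = 1.05
x_high = [0.5, 0.5]
ok_high = satisfies_zero_risk_condition(x_high, rho_high, r_c, points)
risk_high = downside_risk_problem9(x_high, rho_high, r_c, points, probs)
```

In this example the scaled portfolio on the lower branch has zero downside risk although it contains risky assets, while the sample portfolio on the upper branch violates the condition and has positive risk, in line with statements 3 and 4.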
Discussion. Risk vanishes on the lower branch in Problem 9, but for the upper branch we know
only that it is convex; even the optimal portfolios for different ρ > rc may be unrelated. This is
because the actual reward may exceed the shortfall target, resulting in semi-variance half-spaces
far from r̄ and producing asymmetric (or decentral) risk integrals instead of Σ(x). One might
expect such truly nonlinear behavior from any downside risk measure, but it occurs only if the
actual reward may differ from the shortfall target. One might also associate zero risk on the lower
branch with downside risk, but this property occurs for standard risk as well and has nothing to
do with the objective; it is caused by the inequality ρ(x) ≥ ρ or e∗ x + xc ≤ 1.

1.7 Summary
We have discussed various formulations of the classical mean-variance approach to obtain single-
period models that give a qualitatively correct description of risk, particularly for unreasonably
small desired rewards. Positivity constraints and other inequalities have been studied, and down-
side risk models have been analyzed in detail. Thus we have clarified the effects and interaction of
all components in the portfolio optimization problems. In the following we use the results of this
section in developing multi-period models. The goal is to achieve an approximate minimization of
downside risk.
Tradeoff formulations or utility functions will not be considered any more since extra constraints
provide higher modeling flexibility and facilitate a better understanding of subtle details.

2 Multi-period mean-variance analysis


For the multi-period mean-variance models we consider a planning horizon of T + 1 periods (not
necessarily equidistant) in discrete time t = 0, . . . , T + 1. The portfolio is allocated at t = 0
and thereafter restructured at t = 1, . . . , T , before the investor obtains her reward after the final
period, at time T + 1. The portfolios and return vectors are x_t, r_{t+1} ∈ ℝⁿ, t = 0, …, T, yielding
asset capitals r_t^ν x_{t−1}^ν just before the decision at time t. Cash, its return, and loss assets (if present)
are denoted by x_t^c, r_t^c, and x_t^l; the wealth is w_t = r_t∗ x_{t−1} (or r_t∗ x_{t−1} + r_t^c x_{t−1}^c). Cash returns r_t^c are
assumed to be known a priori, whereas the evolution of asset returns is of course random. The
decision at time t is made after observing the realizations of r1 , . . . , rt but prior to observations of
rt+1 , . . . , rT +1 , leading to a nonanticipative policy x = (x0 , . . . , xT ).
Suppose that the distribution of returns until T is given by a scenario tree: each r_t has finitely
many realizations r_j with probabilities p_j > 0, j ∈ L_t, so that L_t forms a level set in the tree. The
set of all nodes is V := ⋃_{t=0}^{T} L_t, and the set of leaves, each representing a scenario, is L := L_T.
We denote by 0 ∈ L0 the root, by j ∈ Lt the current node (a partial scenario), by i ≡ π(j) ∈ Lt−1
its parent node, and by S(j) ⊆ Lt+1 the set of child nodes (successors). The return in the final
period may be given by continuous distributions in each leaf. Thus rt , xt are random vectors
on a discrete-continuous probability space that possesses a filtration generated by the tree. The
conditional expectation r̄_T := E(r_{T+1} | L_T) and its covariance matrix

\[
  \Sigma_T := E\bigl[(r_{T+1} - \bar r_T)(r_{T+1} - \bar r_T)^* \mid L_T\bigr]
            = E(r_{T+1} r_{T+1}^* \mid L_T) - \bar r_T \bar r_T^*
\]

define random variables on the same space, with realizations r̄_j, Σ_j on L_T.
The discrete decision vector is denoted x = (xj )j∈V . As before, reward and risk are defined as
mean and variance of the final wealth, wT +1 . In the absence of cash these definitions read

\[
  \rho(x) = E(r_{T+1}^* x_T) = E(\bar r_T^* x_T) = \sum_{j \in L} p_j \bar r_j^* x_j
\]

and

\[
  R(x) = \sigma^2(r_{T+1}^* x_T) = E\bigl[(r_{T+1}^* x_T - \rho(x))^2\bigr].
\]
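Lemma 11 (Siede [23]). The risk is given by

\[
  R(x) = E\bigl[x_T^* (\Sigma_T + \bar r_T \bar r_T^*) x_T\bigr] - \rho(x)^2
       = \sum_{j \in L} p_j x_j^* (\Sigma_j + \bar r_j \bar r_j^*) x_j - \rho(x)^2.
\]

Proof. By definition,

\[
\begin{aligned}
  R(x) &= E\bigl[(r_{T+1}^* x_T - \rho(x))^2\bigr] = E(x_T^* r_{T+1} r_{T+1}^* x_T) - \rho(x)^2 \\
       &= E\bigl[E(x_T^* r_{T+1} r_{T+1}^* x_T \mid L_T)\bigr] - \rho(x)^2 \\
       &= E\bigl[x_T^* E(r_{T+1} r_{T+1}^* \mid L_T) x_T\bigr] - \rho(x)^2
        = E\bigl[x_T^* (\Sigma_T + \bar r_T \bar r_T^*) x_T\bigr] - \rho(x)^2.
\end{aligned}
\]

The discrete representation follows immediately.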

Figure 9: Scenario tree for the two-period mean-variance model.

Remarks. Notice that this representation yields a block-diagonal risk matrix because of the sep-
arate term ρ(x)2 . If the Hessian of the latter were included in the risk matrix, it would add a
completely dense rank-1 term: the dyadic product containing all the covariances −pj pk r̄j r̄k∗ be-
tween different leaves j, k ∈ L. Since ρ(x) = ρ is fixed in the optimization problems below, we can
neglect the term ρ2 except when considering the reward-dependence of risk.
Corollary 1. Denote by ρ_T(x_T) := r̄_T∗ x_T and R_T(x_T) := x_T∗ Σ_T x_T the conditional reward and
risk of the final period, respectively, with realizations ρ_j(x_j) = r̄_j∗ x_j and R_j(x_j) = x_j∗ Σ_j x_j on L_T.
Then R(x) = R_c(x) + R_d(x) with

\[
  R_c(x) := E[R_T(x_T)] = \sum_{j \in L} p_j x_j^* \Sigma_j x_j
\]

and

\[
  R_d(x) := E[\rho_T(x_T)^2] - \rho(x)^2 = \sum_{j \in L} p_j \rho_j(x_j)^2 - \rho(x)^2.
\]

Proof. Obvious from Lemma 11.


Remark. We call Rc the continuous part and Rd the discrete part of the risk. This distinction will
be useful in the subsequent analysis: Rc is the expectation of the conditional variance (of wT +1 ),
measuring the average final-period risk, whereas Rd is the variance of the conditional expectation,
measuring how well the individual scenario returns are balanced.
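The decomposition of Corollary 1 is the law of total variance on the scenario tree and can be verified numerically. The following Python sketch is an illustration only; the function names and the two hypothetical leaves (probability, conditional mean, conditional covariance, and leaf portfolio) are assumptions made for the example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def quad_form(M, x):
    return sum(x[a] * M[a][b] * x[b] for a in range(len(x)) for b in range(len(x)))

def reward(leaves):
    """rho(x) = sum_j p_j rbar_j' x_j."""
    return sum(p * dot(rbar, x) for p, rbar, cov, x in leaves)

def risk_lemma11(leaves):
    """R(x) = sum_j p_j x_j' (Sigma_j + rbar_j rbar_j') x_j - rho(x)^2."""
    second_moment = sum(p * (quad_form(cov, x) + dot(rbar, x) ** 2)
                        for p, rbar, cov, x in leaves)
    return second_moment - reward(leaves) ** 2

def risk_parts(leaves):
    """Continuous part R_c (mean conditional variance) and discrete part R_d
    (variance of the conditional expectations)."""
    r_c = sum(p * quad_form(cov, x) for p, rbar, cov, x in leaves)
    r_d = sum(p * dot(rbar, x) ** 2 for p, rbar, cov, x in leaves) - reward(leaves) ** 2
    return r_c, r_d

# Hypothetical leaves: (p_j, rbar_j, Sigma_j, x_j).
leaves = [
    (0.4, [1.05, 1.10], [[0.010, 0.002], [0.002, 0.030]], [0.6, 0.5]),
    (0.6, [1.02, 1.20], [[0.020, 0.000], [0.000, 0.050]], [0.3, 0.7]),
]

total_risk = risk_lemma11(leaves)
continuous_part, discrete_part = risk_parts(leaves)
```

In this example the discrete part is small but positive because the two conditional rewards (1.18 and 1.146) are not perfectly balanced.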
Apparently, period T + 1 with its continuous distribution (of which only the conditional mean
and variance enter the problem) corresponds to the single period in the classical model. Indeed,
for T = 0 the multi-period model reduces precisely to the one-period case, where the “scenario
tree” consists of the root only, and x ≡ xT , r ≡ rT +1 , r̄ ≡ r̄T , Σ ≡ ΣT , w ≡ wT +1 . Assuming
idealized transactions with no loss of capital in the whole multi-period situation, the single budget
equation e∗x = 1 is supplemented by the set {e∗x_t = r_t∗ x_{t−1}}, t = 1, …, T, with discrete
representation {e∗x_j = r_j∗ x_{π(j)}}, j ∈ V∗, where V∗ := V \ {0}.
The following analysis requires some lengthy technical proofs; these are moved into the ap-
pendix. Moreover, for technical simplicity, increasing amounts of the presentation will be spe-
cialized for the two-period problem, T = 1. In that case it is convenient to number the leaves
as j = 1, . . . , N , so that V ∗ = S(0) = L = {1, . . . , N }, see Fig. 9. We would like to point out,
however, that the two-period models exhibit almost all the properties of the general multi-period
case. Comments will be given at the end of each section.

2.1 Risky assets only


Let us begin the investigation of multi-period problems with the case of purely risky assets. The
model discussed here was originally proposed by Frauendorfer [21] (with slightly different objective)
and later refined by Frauendorfer and Siede [23].
In this section we impose regularity conditions similar to A1 and A2 in all nodes and in at
least one node, respectively. The conditions for Lt−1 require certain definitions on Lt which in
turn depend on the conditions for Lt . To avoid a nested presentation, the conditions will be stated
after the definitions.
22 M. C. Steinbach

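Let r̃_j := p_j r̄_j and Σ̃_j := p_j(Σ_j + r̄_j r̄_j∗) in the leaves, j ∈ L. By assumption A5 below, Σ̃_j > 0;
therefore we can define

\[
  \tilde\alpha_j := e^* \tilde\Sigma_j^{-1} e, \quad
  \tilde\beta_j := e^* \tilde\Sigma_j^{-1} \tilde r_j, \quad
  \tilde\gamma_j := \tilde r_j^* \tilde\Sigma_j^{-1} \tilde r_j, \quad
  \tilde\delta_j := \tilde\alpha_j \tilde\gamma_j - \tilde\beta_j^2.
\]

Recursively for t = T, …, 1 and i ∈ L_{t−1} let

\[
  \tilde r_i := \sum_{j \in S(i)} \frac{\tilde\beta_j}{\tilde\alpha_j}\, r_j, \qquad
  \tilde\Sigma_i := \sum_{j \in S(i)} \frac{1}{\tilde\alpha_j}\, r_j r_j^*,
\]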

and employ A5 again to define α̃i , . . . , δ̃i in analogy to α̃j , . . . , δ̃j . In the subsequent analysis (and
in the solution algorithm) these quantities will play a similar role as their counterparts in the
leaves, but they do not have the same meaning. In particular, r̄i := r̃i /pi and Σi := Σ̃i /pi − r̄i r̄i∗
are usually not the expectation and covariance matrix of the discrete distribution {rj }j∈S(i) .
Basic assumptions.
A5) ∀j ∈ V : Σ̃j > 0.
A6) ∃j ∈ V : r̃j is not a multiple of e.
Remarks. The role of these conditions is analogous to the single-period case: they ensure strict
convexity and avoid degenerate constraints. Assumption A5 also implies N ≥ n as a technical
requirement on the return discretization in each period. In practice one will usually have N > n,
otherwise the covariance matrices are only positive semidefinite by Lemma 6. Suitable multi-
period discretizations can be generated, e.g., by barycentric approximations [22].
Lemma 12. Under assumptions A5 and A6, the constants α̃j , γ̃j are all positive, the δ̃j are all
nonnegative, and at least one δ̃j is positive.

Proof. Positivity and nonnegativity are proved as in Lemma 1, where δ̃j = 0 iff r̃j , e are linearly
dependent.
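Problem 10. The multi-period mean-variance problem (using i ≡ π(j)) reads

\[
\begin{aligned}
  \min_x \quad & \sum_{j \in L} \frac{1}{2} x_j^* \tilde\Sigma_j x_j - \frac{1}{2}\rho^2 \\
  \text{s.t.} \quad & e^* x_0 = 1, \\
                    & e^* x_j = r_j^* x_i \quad \forall j \in V^*, \\
                    & \sum_{j \in L} \tilde r_j^* x_j = \rho.
\end{aligned}
\]

Its Lagrangian is

\[
  L(x, \lambda, \mu; \rho) = \sum_{j \in L} \frac{1}{2} x_j^* \tilde\Sigma_j x_j - \frac{1}{2}\rho^2
    - \lambda_0 (e^* x_0 - 1) - \sum_{j \in V^*} \lambda_j (e^* x_j - r_j^* x_i)
    - \mu \Bigl(\sum_{j \in L} \tilde r_j^* x_j - \rho\Bigr).
\]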

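Theorem 16. Problem 10 has the unique primal-dual solution

\[
  x_j = \tilde\Sigma_j^{-1} (\lambda_j e + \mu \tilde r_j), \qquad
  \lambda_j = \frac{w_j - \mu \tilde\beta_j}{\tilde\alpha_j}, \qquad
  \mu = \Bigl(\rho - \frac{\tilde\beta_0}{\tilde\alpha_0}\Bigr) \Big/
        \sum_{j \in V} \frac{\tilde\delta_j}{\tilde\alpha_j},
\]

where w_0 = 1 and w_j = r_j∗ x_i for j ∈ V∗. The associated optimal risk is

\[
  R(x) \equiv \sigma^2(\rho) = \frac{1}{\tilde\alpha_0}
    + \Bigl(\rho - \frac{\tilde\beta_0}{\tilde\alpha_0}\Bigr)^2 \Big/
      \sum_{j \in V} \frac{\tilde\delta_j}{\tilde\alpha_j} - \rho^2.
\]

Its global minimizer ρ̂ and minimal risk, respectively, are

\[
  \hat\rho = \frac{\tilde\beta_0}{\tilde\alpha_0} \Big/
             \Bigl(1 - \sum_{j \in V} \frac{\tilde\delta_j}{\tilde\alpha_j}\Bigr), \qquad
  \sigma^2(\hat\rho) = \frac{1}{\tilde\alpha_0}
    - \Bigl(\frac{\tilde\beta_0}{\tilde\alpha_0}\Bigr)^2 \Big/
      \Bigl(1 - \sum_{j \in V} \frac{\tilde\delta_j}{\tilde\alpha_j}\Bigr).
\]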

Proof. See appendix (for T = 1).


Discussion. The proof of Theorem 16 is given for the two-period case only, where expressions for
the solution variables are derived first in the leaves and then in the root. In the multi-period case
this generalizes readily to a recursive procedure which is actually a highly efficient algorithm for
practical computations. We call that recursion the tree-sparse Schur complement method [76, 77].
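For the two-period case (T = 1) with n = 2 assets, the closed-form expressions of Theorem 16 can be evaluated in two sweeps: constants from the leaves to the root, then multipliers and portfolios from the root to the leaves. The Python sketch below is a direct evaluation of these formulas for illustration; it is not the tree-sparse Schur complement implementation of [76, 77], and all names and the three hypothetical leaves are assumptions.

```python
def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

E = [1.0, 1.0]

def constants(S_tilde, r_tilde):
    """Return (inverse of S_tilde, alpha~, beta~, delta~) for one node."""
    S_inv = inv2(S_tilde)
    alpha = dot(E, matvec(S_inv, E))
    beta = dot(E, matvec(S_inv, r_tilde))
    gamma = dot(r_tilde, matvec(S_inv, r_tilde))
    return S_inv, alpha, beta, alpha * gamma - beta * beta

def solve_two_period(leaves, rho):
    """leaves: list of (p_j, r_j, rbar_j, Sigma_j); returns (x_0, [x_j], risk)."""
    data = []
    for p, r, rbar, S in leaves:                      # leaf constants
        r_t = [p * rbar[0], p * rbar[1]]
        S_t = [[p * (S[a][b] + rbar[a] * rbar[b]) for b in range(2)] for a in range(2)]
        data.append((r, r_t) + constants(S_t, r_t))
    # Root quantities from the recursion for r~_i and Sigma~_i.
    r0 = [sum(beta / alpha * r[a] for r, _, _, alpha, beta, _ in data) for a in range(2)]
    S0 = [[sum(r[a] * r[b] / alpha for r, _, _, alpha, _, _ in data) for b in range(2)]
          for a in range(2)]
    S0_inv, a0, b0, d0 = constants(S0, r0)
    s = d0 / a0 + sum(delta / alpha for _, _, _, alpha, _, delta in data)
    mu = (rho - b0 / a0) / s
    lam0 = (1.0 - mu * b0) / a0                       # w_0 = 1
    x0 = matvec(S0_inv, [lam0 + mu * r0[0], lam0 + mu * r0[1]])
    xs = []
    for r, r_t, S_inv, alpha, beta, _ in data:        # back to the leaves
        lam = (dot(r, x0) - mu * beta) / alpha        # w_j = r_j' x_0
        xs.append(matvec(S_inv, [lam + mu * r_t[0], lam + mu * r_t[1]]))
    risk = 1.0 / a0 + (rho - b0 / a0) ** 2 / s - rho ** 2
    return x0, xs, risk

# Hypothetical scenario tree: (p_j, first-period return r_j, rbar_j, Sigma_j).
leaves = [
    (0.3, [1.10, 0.95], [1.06, 1.04], [[0.010, 0.002], [0.002, 0.020]]),
    (0.4, [1.00, 1.05], [1.05, 1.08], [[0.012, 0.001], [0.001, 0.018]]),
    (0.3, [0.92, 1.15], [1.04, 1.10], [[0.015, 0.003], [0.003, 0.025]]),
]
rho = 1.15
x0, xs, risk = solve_two_period(leaves, rho)
```

A useful sanity check is that the returned portfolios satisfy the budget equations and the reward equation of Problem 10, and that the closed-form risk agrees with the risk evaluated from the portfolios via Lemma 11.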
The specialization to a single period gives similar results as Theorems 1, 2, and 4, but under
slightly weaker conditions: only Σ + r̄r̄∗ > 0 rather than Σ > 0 is now required, so that riskless
portfolios may exist. (We use the weaker condition anyway since Σ̃ = Σ + r̄r̄∗ appears naturally
in the problem.) The two following lemmas establish the precise relationship.
Lemma 13. If Σ̃ > 0 but not Σ > 0, then the nullspace is N (Σ) = span(Σ̃−1 r̄).
Proof. If Σx = 0, then Σ̃x = r̄r̄∗ x and x = (r̄∗ x)Σ̃−1 r̄ ∈ span(Σ̃−1 r̄).
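Lemma 14. Consider the single-period case of Problem 10 under assumptions A1 and A2, i.e., $\Sigma \equiv \Sigma_0 > 0$. Then
$$
\tilde\alpha_0 = \frac{\alpha + \delta}{1 + \gamma}, \qquad
\tilde\beta_0 = \frac{\beta}{1 + \gamma}, \qquad
\tilde\gamma_0 = \frac{\gamma}{1 + \gamma}, \qquad
\tilde\delta_0 = \frac{\delta}{1 + \gamma},
$$
and
$$
\tilde x = x, \qquad \tilde\lambda = \lambda, \qquad \tilde\mu = \mu + \rho,
$$
where quantities with and without tilde refer to problems 10 and 2, respectively.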
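Proof. The Sherman–Morrison formula applied to $\tilde\Sigma = \Sigma + \bar r \bar r^*$ yields
$$
\tilde\Sigma^{-1} = \Sigma^{-1} - \frac{\Sigma^{-1} \bar r \bar r^* \Sigma^{-1}}{1 + \bar r^* \Sigma^{-1} \bar r}
= \Sigma^{-1} - \frac{\Sigma^{-1} \bar r \bar r^* \Sigma^{-1}}{1 + \gamma}.
$$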
Using this in the definitions of α̃0 , β̃0 , γ̃0 , δ̃0 gives the first set of identities after a few elemen-
tary calculations. The relation of the solutions is similarly obtained by simple but more lengthy
calculations (first μ̃, then λ̃, then x̃).
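The first set of identities in Lemma 14 can also be checked numerically. The following NumPy sketch assumes that the tilde constants are defined in analogy to the untilded ones, that is, $\tilde\alpha_0 = e^* \tilde\Sigma^{-1} e$, $\tilde\beta_0 = e^* \tilde\Sigma^{-1} \bar r$, $\tilde\gamma_0 = \bar r^* \tilde\Sigma^{-1} \bar r$, and $\delta = \alpha\gamma - \beta^2$ (with or without tilde). The matrix `sigma` and vector `rbar` are random test data, not values from the paper.

```python
import numpy as np


def constants(sigma, rbar):
    """Return (alpha, beta, gamma, delta) for a positive definite matrix sigma."""
    e = np.ones(len(rbar))
    sinv_e = np.linalg.solve(sigma, e)
    sinv_r = np.linalg.solve(sigma, rbar)
    alpha = e @ sinv_e
    beta = e @ sinv_r
    gamma = rbar @ sinv_r
    return alpha, beta, gamma, alpha * gamma - beta ** 2


rng = np.random.default_rng(0)
n = 4
a = rng.standard_normal((n, n))
sigma = a @ a.T + n * np.eye(n)             # hypothetical covariance, Sigma > 0
rbar = 1.0 + 0.1 * rng.standard_normal(n)   # hypothetical mean returns

alpha, beta, gamma, delta = constants(sigma, rbar)
sigma_tilde = sigma + np.outer(rbar, rbar)
tilde = np.array(constants(sigma_tilde, rbar))
expected = np.array([alpha + delta, beta, gamma, delta]) / (1.0 + gamma)

# Sherman-Morrison form of the inverse of sigma_tilde.
sinv = np.linalg.inv(sigma)
sm_inverse = sinv - sinv @ np.outer(rbar, rbar) @ sinv / (1.0 + gamma)
```

Here `tilde` and `expected` should agree to rounding error, and `sm_inverse` should match the directly computed inverse of `sigma_tilde`.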

2.2 Risky assets and cash


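Here we define local constants for the two-period case only but state the general optimization problem. Denote by $r_1^c, r_2^c$ the deterministic (and thus scenario-independent) cash returns in periods one and two, respectively, and by $r^c := r_2^c r_1^c$ their combined return. As before, let $\tilde r_j = p_j \bar r_j$, $\tilde\Sigma_j = p_j (\Sigma_j + \bar r_j \bar r_j^*)$ for $j \in L$, and define in addition $\tilde r_j^c := p_j r_2^c$. With assumption A7 below let
$$
\alpha_j := e^* \Sigma_j^{-1} e, \qquad \beta_j := e^* \Sigma_j^{-1} \bar r_j, \qquad \gamma_j := \bar r_j^* \Sigma_j^{-1} \bar r_j,
$$
and
$$
\delta_j^c := (r_2^c)^2 \alpha_j - 2 r_2^c \beta_j + \gamma_j = (\bar r_j - r_2^c e)^* \Sigma_j^{-1} (\bar r_j - r_2^c e).
$$
In the root define
$$
\tilde p_0 := \sum_{j \in S(0)} \frac{p_j}{\delta_j^c + 1}, \qquad
\tilde r_0 := \sum_{j \in S(0)} \frac{p_j}{\delta_j^c + 1} r_j, \qquad
\tilde\Sigma_0 := \sum_{j \in S(0)} \frac{p_j}{\delta_j^c + 1} r_j r_j^*,
$$
and furthermore $\bar r_0 = \tilde r_0 / \tilde p_0$, $\Sigma_0 = \tilde\Sigma_0 / \tilde p_0 - \bar r_0 \bar r_0^*$, and $\tilde r_1^c := \tilde p_0 r^c$ (not $\tilde p_0 r_1^c$). Using A7 again, $\alpha_0, \beta_0, \gamma_0$ are then defined in analogy to $\alpha_j, \beta_j, \gamma_j$. Finally let $\delta_0^c := (r_1^c)^2 \alpha_0 - 2 r_1^c \beta_0 + \gamma_0$ and recall that $t$ is the current time in $j \in V$.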

Basic assumptions. For the general multi-period case assume

A7) $\forall j \in V$: $\Sigma_j > 0$.

A8) $\exists j \in V$: $\bar r_j \ne r_t^c e$.

A9) $r^c \ne 0$.

Remark. The conditions here are similar to A5 and A6 (or A1 and A3), but in addition we require
nonzero cash returns rtc , t = 1, . . . , T + 1. The opposite case would unnecessarily complicate the
analysis and is not considered.

Lemma 15. Under assumptions A7–A9, the constants αj , γj are all positive, the δjc are nonneg-
ative, and at least one δjc is positive. Moreover, p̃0 ∈ (0, 1].

Proof. Positivity and nonnegativity are obvious, where δjc = 0 iff r̄j = rtc e. Now δjc ≥ 0 implies
pj /(δjc + 1) ∈ (0, pj ] and hence p̃0 ∈ (0, 1].

In the following we use two different formulations for both the reward and the risk, involving again the conditional final-period risk and return. The latter is now $\rho_T(x_T, x_T^c) := \bar r_T^* x_T + r_{T+1}^c x_T^c$ with realizations $\rho_j(x_j, x_j^c) := \bar r_j^* x_j + r_{T+1}^c x_j^c$, and the discrete decision vector is $x = (x_j, x_j^c)_{j \in V}$.

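Lemma 16. The reward in the presence of cash can be written
$$
\rho(x) = \sum_{j \in L} \bigl( \tilde r_j^* x_j + \tilde r_j^c x_j^c \bigr) = \sum_{j \in L} p_j \rho_j(x_j, x_j^c).
$$
The risk has the two representations
$$
\begin{aligned}
R(x) &= \sum_{j \in L}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix}^*
\begin{pmatrix} \tilde\Sigma_j & r_{T+1}^c \tilde r_j \\ r_{T+1}^c \tilde r_j^* & r_{T+1}^c \tilde r_j^c \end{pmatrix}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix} - \rho(x)^2 \\
&= \sum_{j \in L} p_j x_j^* \Sigma_j x_j + \sum_{j \in L} p_j \rho_j(x_j, x_j^c)^2 - \rho(x)^2 =: R_c(x) + R_d(x).
\end{aligned}
$$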

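Proof. By definition, the continuous representation of the reward reads
$$
\rho(x) = E[r_{T+1}^* x_T + r_{T+1}^c x_T^c] = E[\bar r_T^* x_T + r_{T+1}^c x_T^c] = E[\rho_T(x_T, x_T^c)].
$$
Similarly, using Lemma 11,
$$
\begin{aligned}
R(x) &= E\left[
\begin{pmatrix} x_T \\ x_T^c \end{pmatrix}^*
\left(
\begin{pmatrix} \Sigma_T & 0 \\ 0 & 0 \end{pmatrix}
+ \begin{pmatrix} \bar r_T \\ r_{T+1}^c \end{pmatrix}
\begin{pmatrix} \bar r_T \\ r_{T+1}^c \end{pmatrix}^*
\right)
\begin{pmatrix} x_T \\ x_T^c \end{pmatrix}
\right] - \rho(x)^2 \\
&= E(x_T^* \Sigma_T x_T) + E[(\bar r_T^* x_T + r_{T+1}^c x_T^c)^2] - \rho(x)^2.
\end{aligned}
$$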

In both cases, the stated discrete formulae are readily obtained.
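The equality of the two risk representations in Lemma 16 can be illustrated for a single leaf $j$. The NumPy sketch below builds the block matrix of the first representation from $\tilde\Sigma_j = p_j(\Sigma_j + \bar r_j \bar r_j^*)$, $\tilde r_j = p_j \bar r_j$, and $\tilde r_j^c = p_j r_{T+1}^c$, and compares the quadratic form with $p_j x_j^* \Sigma_j x_j + p_j \rho_j^2$. All numerical data and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
a = rng.standard_normal((n, n))
sigma_j = a @ a.T + np.eye(n)                # hypothetical Sigma_j > 0
rbar_j = 1.0 + 0.1 * rng.standard_normal(n)  # hypothetical mean returns
p_j, r_cash = 0.3, 1.04                      # scenario probability, cash return
x_j = rng.standard_normal(n)                 # risky holdings in leaf j
xc_j = 0.7                                   # cash holding in leaf j

# Scaled quantities as defined in the text.
sigma_t = p_j * (sigma_j + np.outer(rbar_j, rbar_j))
r_t = p_j * rbar_j
rc_t = p_j * r_cash

# First representation: quadratic form with the (n+1) x (n+1) block matrix.
block = np.block([
    [sigma_t, (r_cash * r_t)[:, None]],
    [(r_cash * r_t)[None, :], np.array([[r_cash * rc_t]])],
])
z = np.append(x_j, xc_j)
first = z @ block @ z

# Second representation: continuous part plus squared scenario return.
rho_j = rbar_j @ x_j + r_cash * xc_j
second = p_j * (x_j @ sigma_j @ x_j) + p_j * rho_j ** 2
```

Both values omit the common term $-\rho(x)^2$, which does not depend on the representation.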

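Problem 11. Using the first formulation for reward and risk in Lemma 16, the multi-period mean-variance problem with cash reads
$$
\begin{aligned}
\min_x \quad & \sum_{j \in L} \frac12
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix}^*
\begin{pmatrix} \tilde\Sigma_j & r_{T+1}^c \tilde r_j \\ r_{T+1}^c \tilde r_j^* & r_{T+1}^c \tilde r_j^c \end{pmatrix}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix} - \frac12 \rho^2 \\
\text{s.t.} \quad & e^* x_0 + x_0^c = 1, \\
& e^* x_j + x_j^c = r_j^* x_i + r_t^c x_i^c \quad \forall j \in V^*, \\
& \sum_{j \in L} \bigl( \tilde r_j^* x_j + \tilde r_j^c x_j^c \bigr) = \rho.
\end{aligned}
$$
Its Lagrangian is
$$
\begin{aligned}
L(x, \lambda, \mu; \rho) = {} & \sum_{j \in L} \frac12
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix}^*
\begin{pmatrix} \tilde\Sigma_j & r_{T+1}^c \tilde r_j \\ r_{T+1}^c \tilde r_j^* & r_{T+1}^c \tilde r_j^c \end{pmatrix}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix} - \frac12 \rho^2 \\
& - \lambda_0 (e^* x_0 + x_0^c - 1) - \sum_{j \in V^*} \lambda_j (e^* x_j + x_j^c - r_j^* x_i - r_t^c x_i^c) \\
& - \mu \Bigl( \sum_{j \in L} \bigl( \tilde r_j^* x_j + \tilde r_j^c x_j^c \bigr) - \rho \Bigr).
\end{aligned}
$$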

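Theorem 17. Problem 11 with $T = 1$ has the unique primal-dual solution
$$
\begin{aligned}
x_j &= -\frac{\lambda_j}{\tilde r_j^c} \Sigma_j^{-1} (\bar r_j - r_2^c e), &
x_0 &= -\frac{\lambda_0}{\tilde r_1^c} \Sigma_0^{-1} (\bar r_0 - r_1^c e), \\
x_j^c &= \frac{1}{r_2^c} \Bigl[ \frac{\lambda_j}{\tilde r_j^c} (\gamma_j - r_2^c \beta_j + 1) + \mu \Bigr], &
x_0^c &= \frac{1}{r^c} \Bigl[ \frac{\lambda_0}{\tilde r_1^c} (\gamma_0 - r_1^c \beta_0 + 1) + \mu \Bigr], \\
\lambda_j &= \frac{\tilde r_j^c}{\delta_j^c + 1} \bigl[ r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu \bigr], &
\lambda_0 &= \frac{\tilde r_1^c}{\delta_0^c + 1} (r^c - \mu) \equiv \tilde\rho (r^c - \mu), \\
\mu &= r^c \frac{\rho - \tilde\rho}{r^c - \tilde\rho}, &
\tilde\rho &:= \frac{\tilde r_1^c}{\delta_0^c + 1} \in (0, r^c).
\end{aligned}
$$
The associated risk is
$$
R(x) \equiv \sigma^2(\rho) = \tilde\rho \, \frac{(r^c - \rho)^2}{r^c - \tilde\rho}.
$$
Its global minimum is attained at $\hat\rho = r^c$ and has value zero. The associated solution has 100 % cash: $(\hat x_0, \hat x_0^c) = (0, 1)$, $(\hat x_j, \hat x_j^c) = (0, r_1^c)$, $\hat\lambda = 0$, $\hat\mu = r^c$.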
Proof. See appendix.
Remark. Problem 11 also has a unique solution if $r_2^c \ne 0$ and $r_1^c = 0$, and it has multiple solutions for $r_2^c = 0$ (regardless of $r_1^c$). These situations are qualitatively different and quite unrealistic, however, and therefore not of interest here.
Discussion. Zero risk is now possible (with (xj , xcj ) = (0, r1c ) for j ∈ L) since balanced scenario
returns do not require investments in risky assets. In fact, they require 100 % cash in both periods
so that the riskless solution is unique.
It can be seen that the whole situation is actually covered by the results of the previous section
if one replaces in each node xj by (xj , xcj ), etc. This way the analysis extends again to the general
multi-period case. The details are obtained precisely as in the previous section when proper
replacements are carried out everywhere. However, the splitting into cash and risky assets makes
the definitions of intermediate quantities and the specialization to the single-period case somewhat
more involved.
To conclude this section we show that specifying ρ as a desired minimal reward has precisely
the same effect as in the single-period case.
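Problem 12. We modify Problem 11 by requiring $\rho$ to be a lower bound on the reward (with associated slack $\theta \ge 0$ and dual slack $\eta \ge 0$),
$$
\begin{aligned}
\min_{x, \theta} \quad & \sum_{j \in L} \frac12
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix}^*
\begin{pmatrix} \tilde\Sigma_j & r_{T+1}^c \tilde r_j \\ r_{T+1}^c \tilde r_j^* & r_{T+1}^c \tilde r_j^c \end{pmatrix}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix} - \frac12 (\rho + \theta)^2 \\
\text{s.t.} \quad & e^* x_0 + x_0^c = 1, \\
& e^* x_j + x_j^c = r_j^* x_i + r_t^c x_i^c \quad \forall j \in V^*, \\
& \sum_{j \in L} \bigl( \tilde r_j^* x_j + \tilde r_j^c x_j^c \bigr) = \rho + \theta, \quad \theta \ge 0.
\end{aligned}
$$
The Lagrangian is now
$$
\begin{aligned}
L(x, \theta, \lambda, \mu, \eta; \rho) = {} & \sum_{j \in L} \frac12
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix}^*
\begin{pmatrix} \tilde\Sigma_j & r_{T+1}^c \tilde r_j \\ r_{T+1}^c \tilde r_j^* & r_{T+1}^c \tilde r_j^c \end{pmatrix}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix} - \frac12 (\rho + \theta)^2 \\
& - \lambda_0 (e^* x_0 + x_0^c - 1) - \sum_{j \in V^*} \lambda_j (e^* x_j + x_j^c - r_j^* x_i - r_t^c x_i^c) \\
& - \mu \Bigl( \sum_{j \in L} \bigl( \tilde r_j^* x_j + \tilde r_j^c x_j^c \bigr) - \rho - \theta \Bigr) - \eta \theta.
\end{aligned}
$$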

Theorem 18. Problem 12 has a unique primal-dual solution. For $\rho \ge r^c$ one obtains $\theta = 0$, $\eta = \mu - \rho \ge 0$, and otherwise the same solution as in Problem 11. For $\rho \le r^c$ one obtains $\eta = 0$ and $\theta = r^c - \rho \ge 0$, giving reward $\rho + \theta = r^c$ and the associated riskless solution of Problem 11. (At $\rho = r^c$ both cases coincide.)
Proof. See appendix.
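The case distinction of Theorem 18 can be sketched in a few lines using the closed forms for $\mu$ and $\sigma^2(\rho)$ from Theorem 17. The function names and the numerical values of $r^c$ and $\tilde\rho$ below are hypothetical; the only requirement used is $\tilde\rho \in (0, r^c)$.

```python
def reward_multiplier(rho, r_c, rho_t):
    """mu = r^c (rho - rho~) / (r^c - rho~), as in Theorem 17."""
    return r_c * (rho - rho_t) / (r_c - rho_t)


def risk(rho, r_c, rho_t):
    """sigma^2(rho) = rho~ (r^c - rho)^2 / (r^c - rho~), as in Theorem 17."""
    return rho_t * (r_c - rho) ** 2 / (r_c - rho_t)


def lower_bound_solution(rho, r_c, rho_t):
    """Return (theta, eta, achieved reward, risk) for the lower-bound problem."""
    if rho >= r_c:
        theta = 0.0
        eta = reward_multiplier(rho, r_c, rho_t) - rho
    else:
        theta = r_c - rho   # slack lifts the reward to the riskless level r^c
        eta = 0.0
    reward = rho + theta
    return theta, eta, reward, risk(reward, r_c, rho_t)


# Hypothetical constants (assumptions for illustration only).
r_c, rho_t = 1.1, 0.4
small = lower_bound_solution(1.0, r_c, rho_t)   # desired reward below r^c
large = lower_bound_solution(1.2, r_c, rho_t)   # desired reward above r^c
```

For $\rho \ge r^c$ the dual slack simplifies to $\eta = \mu - \rho = \tilde\rho (\rho - r^c)/(r^c - \tilde\rho)$, which is nonnegative as the theorem requires.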

2.3 Risky assets, cash, and loss


We are now entering the main section. Although it will still be a simplification, Problem 13 below
covers all the essential aspects of the multi-period application model mentioned in the introduction.
The notation and constants of the previous section remain valid, and the discrete decision vector
is x = (xj , xcj , xlj )j∈V . We keep the general multi-period notation only in the problem statement
and Lagrangian; the remaining analysis now concentrates on the two-period case.
Basic assumptions. In addition to the assumptions of the previous section we require positive cash returns and (as in Theorem 15) a no-arbitrage condition on the discrete part of the return distribution, involving $C_0 := \operatorname{conv}(\{r_j\}_{j \in L})$.
A7) $\forall j \in V$: $\Sigma_j > 0$.
A8) $\exists j \in V$: $\bar r_j \ne r_t^c e$.
A10) $r_1^c > 0$, $r_2^c > 0$.
A11) $r_1^c e \in \operatorname{int}(C_0)$.
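Problem 13. The multi-period mean-variance problem with cash and loss reads
$$
\begin{aligned}
\min_x \quad & \sum_{j \in L} \frac12
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix}^*
\begin{pmatrix} \tilde\Sigma_j & r_{T+1}^c \tilde r_j \\ r_{T+1}^c \tilde r_j^* & r_{T+1}^c \tilde r_j^c \end{pmatrix}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix} - \frac12 \rho^2 \\
\text{s.t.} \quad & e^* x_0 + x_0^c + x_0^l = 1, \quad x_0^l \ge 0, \\
& e^* x_j + x_j^c + x_j^l = r_j^* x_i + r_t^c x_i^c, \quad x_j^l \ge 0 \quad \forall j \in V^*, \\
& \sum_{j \in L} \bigl( \tilde r_j^* x_j + \tilde r_j^c x_j^c \bigr) = \rho.
\end{aligned}
$$
Its Lagrangian is
$$
\begin{aligned}
L(x, \lambda, \eta, \mu; \rho) = {} & \sum_{j \in L} \frac12
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix}^*
\begin{pmatrix} \tilde\Sigma_j & r_{T+1}^c \tilde r_j \\ r_{T+1}^c \tilde r_j^* & r_{T+1}^c \tilde r_j^c \end{pmatrix}
\begin{pmatrix} x_j \\ x_j^c \end{pmatrix} - \frac12 \rho^2 \\
& - \lambda_0 (e^* x_0 + x_0^c + x_0^l - 1) - \sum_{j \in V^*} \lambda_j (e^* x_j + x_j^c + x_j^l - r_j^* x_i - r_t^c x_i^c) \\
& - \sum_{j \in V} \eta_j x_j^l - \mu \Bigl( \sum_{j \in L} \bigl( \tilde r_j^* x_j + \tilde r_j^c x_j^c \bigr) - \rho \Bigr).
\end{aligned}
$$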

Theorem 20 in the appendix characterizes optimal solutions in a way similar to the previous
theorems. However, the loss variables xlj still appear in the formulae, and the case distinctions
are more involved than in the single-period case (cf. Theorem 8) or in Problem 12. Later we will
discuss this in part; for the time being, the following results provide more insight.
Lemma 17 (Arbitrage in Problem 13). If assumption A11 is strictly violated, $r_1^c e \notin C_0$, then Problem 13 has a riskless solution for arbitrary $\rho$.
Proof. Since $C_0$ is convex, $y \in \mathbb{R}^n$ exists so that $(r - r_1^c e)^* y \ge 1$ for $r \in C_0$. Given $\rho \in \mathbb{R}$, let $x_0 := (\rho / r_2^c - r_1^c) y$, $x_0^c := 1 - e^* x_0$, and $x_0^l := 0$ to obtain $w_j = (r_j - r_1^c e)^* x_0 + r_1^c \ge \rho / r_2^c$ for all $j \in L$. Now let $x_j := 0$, $x_j^c := \rho / r_2^c$, and $x_j^l := w_j - \rho / r_2^c \ge 0$.
For the analysis of actual riskless solutions consider in $\mathbb{R}^n$ the family of closed convex polyhedra
$$
Z_0(\rho) := \Bigl\{ x : \Bigl( \frac{r_j}{r_1^c} - e \Bigr)^* x \ge \frac{\rho}{r^c} - 1 \ \ \forall j \in L \Bigr\}
= \bigcap_{j \in L} \bar H \Bigl( e - \frac{r_j}{r_1^c}, \, 1 - \frac{\rho}{r^c} \Bigr).
$$

Lemma 18. $Z_0(\rho)$ is nonempty (containing the origin) iff $\rho \le r^c$. For $\rho < r^c$, $0 \in \operatorname{int}(Z_0(\rho))$ and
$$
Z_0(\rho) = \Bigl( 1 - \frac{\rho}{r^c} \Bigr) Z_0(0).
$$
Each $Z_0(\rho)$ is bounded and hence compact. In particular, $Z_0(r^c) = \{0\}$.
Remark. More generally one can show that the following four conditions are equivalent under the
weaker no-arbitrage condition r1c e ∈ C0 .
1. r1c e ∈ int(C0 ).
2. Z0 (rc ) = {0}.
3. ∃ρ ∈ (−∞, rc ]: Z0 (ρ) is bounded.
4. ∀ρ ∈ (−∞, r c ]: Z0 (ρ) is bounded.
As in Theorem 15 we choose the stronger condition r1c e ∈ int(C0 ) to avoid unnecessary technical
complications and ambiguities.

Proof. By A11 there exists a convex combination $\sum_{j \in L} \xi_j (r_j - r_1^c e) = 0$. Every $x \in Z_0(\rho)$ then satisfies
$$
0 = \sum_{j \in L} \xi_j \Bigl( \frac{r_j}{r_1^c} - e \Bigr)^* x \ge \frac{\rho}{r^c} - 1.
$$
Hence $Z_0(\rho) = \emptyset$ if $\rho > r^c$ and $0 \in Z_0(\rho)$ if $\rho \le r^c$. Now let $\rho < r^c$. For $c > 0$,
$$
\Bigl( \frac{r_j}{r_1^c} - e \Bigr)^* x \ge -1 \iff \Bigl( \frac{r_j}{r_1^c} - e \Bigr)^* cx \ge -c,
$$
showing that $x \in Z_0(0)$ iff $(1 - \rho / r^c) x \in Z_0(\rho)$. To verify $0 \in \operatorname{int}(Z_0(\rho))$, observe that $Z_0(\rho)$ contains the ball around the origin with radius
$$
\Bigl( 1 - \frac{\rho}{r^c} \Bigr) \Big/ \max_{j \in L} \Bigl\| \frac{r_j}{r_1^c} - e \Bigr\|_2 > 0,
$$
cf. Fig. 10. Condition 1) in the remark holds by A11. We prove 2), 3), and 4) in natural order. Assume first that 2) does not hold and let $y \in Z_0(r^c) \setminus \{0\}$. Then $(r_j / r_1^c - e)^* y \ge 0$ for all $j \in L$ and hence $(r / r_1^c - e)^* y \ge 0$ for $r \in C_0$, i.e., $C_0 \subseteq r_1^c e + \bar H(-y, 0)$. This yields the contradiction $r_1^c e \notin \operatorname{int}(C_0)$ and proves 2), which obviously implies 3). Assume now that 4) does not hold, that is, $Z_0(\rho)$ is unbounded for some $\rho \le r^c$. Then, since $Z_0(\rho)$ is convex and $0 \in Z_0(\rho)$, $y \ne 0$ exists so that $cy \in Z_0(\rho)$ for all $c > 0$, that is, $(r_j / r_1^c - e)^* cy \ge \rho / r^c - 1$ for $j \in L$. This implies
(rj /r1c − e)∗ cy ≥ 0 and hence (rj /r1c − e)∗ cy ≥ ρ/rc − 1 for c > 0, j ∈ L, and ρ ≤ rc . Thus
3) implies 4): either none of the polyhedra is bounded or all of them.
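The scaling property of Lemma 18 and the enclosed ball used in the proof can be explored numerically. The NumPy sketch below uses four hypothetical first-period return vectors in $\mathbb{R}^2$ whose convex hull contains $r_1^c e$ in its interior; the data and the helper names `in_z0` and `ball_radius` are assumptions made for this illustration.

```python
import numpy as np

# Hypothetical first-period returns r_j in R^2 and cash returns.
r = np.array([[0.9, 0.9], [1.2, 1.0], [1.3, 1.3], [1.0, 1.2]])
r1c, r2c = 1.05, 1.04
rc = r2c * r1c
a = r / r1c - 1.0  # row j holds (r_j / r_1^c - e)^*


def in_z0(x, rho):
    """Membership test for the zero-risk polyhedron Z_0(rho)."""
    return bool(np.all(a @ x >= rho / rc - 1.0))


def ball_radius(rho):
    """Radius of the ball around the origin inside Z_0(rho), for rho < r^c."""
    return (1.0 - rho / rc) / np.max(np.linalg.norm(a, axis=1))


rho = 0.8
scale = 1.0 - rho / rc
rng = np.random.default_rng(0)
samples = rng.uniform(-10.0, 10.0, size=(200, 2))

# x in Z_0(0) iff (1 - rho / r^c) x in Z_0(rho).
scaling_ok = all(in_z0(x, 0.0) == in_z0(scale * x, rho) for x in samples)

# Points just inside the ball should belong to Z_0(rho).
angles = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
ball_points = 0.999 * ball_radius(rho) * np.column_stack([np.cos(angles), np.sin(angles)])
ball_ok = all(in_z0(x, rho) for x in ball_points)
```

The origin lies in `Z_0(rho)` exactly when `rho <= rc`, so `in_z0(np.zeros(2), rc)` holds while any larger reward gives an empty set.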

Figure 10: First-period returns {rj }, r1c e ∈ R2 and associated zero-risk polyhedra Z0 (ρ) with
enclosed balls for three values of ρ (black = rc , dark grey, medium grey). Left: compact case,
r1c e ∈ int(C0 ). Right: unbounded case, r1c e ∈ ∂C0 .

Theorem 19. Consider the feasible solutions of Problem 13. Then:

1. Risk vanishes iff xj = 0 and xcj = ρ/r2c in all scenarios; it is positive else.

2. Leaf variables of riskless solutions depend uniquely on the root variables.

3. For ρ < rc , risk vanishes on an (n + 1)-dimensional cone over Z0 (ρ).

4. For ρ = rc , the unique zero-risk portfolio has 100 % cash.

5. For ρ > rc , risk is strictly positive.

Remarks. Strictly speaking, the riskless solutions for ρ < rc form an (n + 1)-dimensional cone
whose projection on the (x0 , xl0 )-space has Z0 (ρ) as base, see Fig. 11. Similarly, the projection
for ρ = rc is Z0 (rc ) × {0}, and in both cases the remaining variables are uniquely determined by
feasibility. (If r1c e ∈ ∂C0 , then x0 ∈ Z0 (rc ), xc0 , and xlj , j ∈ L, would not be unique for ρ = rc .)

Proof. The condition in 1 is obviously sufficient for zero risk (cf. Lemma 17). Let $p := (p_j)_{j \in L}$ and $q := (\rho_j(x_j, x_j^c))_{j \in L}$. Then
$$
R_d(x) = \sum_{j \in L} p_j q_j^2 - \Bigl( \sum_{j \in L} p_j q_j \Bigr)^2 = q^* [\operatorname{Diag}(p) - p p^*] q.
$$

By Lemma 19, this quadratic form vanishes if q = ρe and is positive else. Moreover, Rc (x) ≥ 0
for all x, and Rc (x) = 0 iff xj = 0 for all j ∈ L. This proves statement 1. The zero-risk condition
requires wj ≡ rj∗ x0 + r1c xc0 ≥ ρ/r2c in all scenarios. If this implied restriction on the root variables
holds, then the leaf variables are uniquely determined by statement 1 and xlj = wj −xcj ≥ 0, proving
statement 2. Since xc0 = 1 − e∗ x0 − xl0 , the inequality wj ≥ ρ/r2c is equivalent to x0 ∈ Z0 (ρ + rc xl0 ).
By Lemma 18 we conclude that x is optimal for ρ < rc iff xl0 ∈ [0, 1 − ρ/rc ] and x0 ∈ Z0 (ρ + rc xl0 ),
see Fig. 11. This proves statement 3. Statements 4 and 5 follow similarly: for ρ = rc the zero-risk
cone degenerates (only its vertex remains), and for ρ > rc it becomes empty.

Lemma 19. Let $p \in \mathbb{R}^N$, $p > 0$, $e^* p = 1$, and $f(q) := \tfrac12 q^* [\operatorname{Diag}(p) - p p^*] q$. Then
$$
\min_q f(q) \quad \text{s.t.} \quad p^* q = \rho
$$
has the unique minimizer $q = \rho e$, with optimal value $f(\rho e) = 0$.



Figure 11: Zero-risk cones over Z0 (ρ) for three values of ρ (black = rc , dark grey, light grey);
compact case. (Here Z0 (ρ) is a segment of the x0 axis.)

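Proof. The Lagrangian of the minimization problem is
$$
L(q, \eta) = \tfrac12 q^* [\operatorname{Diag}(p) - p p^*] q - \eta (p^* q - \rho),
$$
yielding the optimality condition
$$
0 = \frac{\partial L}{\partial q} = \operatorname{Diag}(p) q - p p^* q - \eta p = \operatorname{Diag}(q) p - \rho p - \eta p.
$$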
Since p > 0, this gives qj − ρ − η = 0 for j = 1, . . . , N , and hence q = (ρ + η)e. Now p∗ q = ρ
implies η = 0, as required. Clearly, f (ρe) = 0.
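A small numerical illustration of Lemma 19: the quadratic form $q^* [\operatorname{Diag}(p) - p p^*] q$ is the variance of the scenario returns $q$ under the probabilities $p$, so it vanishes for equal returns and is positive otherwise. The probabilities and returns below are hypothetical, and `f` is simply a direct transcription of the definition in the lemma.

```python
import numpy as np


def f(q, p):
    """f(q) = 1/2 q^* [Diag(p) - p p^*] q."""
    return 0.5 * q @ (np.diag(p) - np.outer(p, p)) @ q


p = np.array([0.2, 0.6, 0.2])        # hypothetical scenario probabilities
rho = 1.1
q_opt = rho * np.ones(3)             # equal returns, the claimed minimizer
q_other = np.array([1.0, 1.1, 1.2])  # another point with p^* q = rho

value_opt = f(q_opt, p)
value_other = f(q_other, p)

# Weighted variance of q_other under p, for comparison with 2 f(q_other).
mean_other = p @ q_other
variance_other = p @ (q_other - mean_other) ** 2
```

Both candidate points satisfy the constraint $p^* q = \rho$, but only the constant vector attains the value zero.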
Statement 3 of Theorem 19 says that $x_0^l > 0$ ($x_j^l > 0$) is possible when the total cash return $r^c = r_2^c r_1^c w_0$ (the second-period cash return $r_2^c w_j$) exceeds the desired reward. The following result shows that surplus money is actually necessary for positive loss variables: if there is no surplus money, then positive amounts $x_j^l$ are better invested in cash to reduce risk.
Lemma 20. Let $x$ be an optimal solution of Problem 13 for $\rho > r^c$.
1. If $x_j \ne 0$ in a scenario $j \in L$, then $x_j^l = 0$.

2. $x_0^l = 0$.
Remark. Statement 1 of the lemma is also obtained from Theorem 20, and moreover the plausible
fact that xlj = 0 in scenario j if ρj < ρ. However, the proof of that theorem is not constructive.

Proof. Assume $x_j^l > 0$. We modify the local variables in scenario $j$ to construct a better feasible solution. Let $a := (\bar r_j - r_2^c e)^* x_j$, define $\epsilon \in (0, 1]$ as
$$
\epsilon := \begin{cases} \min(1, r_2^c x_j^l / a) & \text{if } a > 0, \\ 1 & \text{else,} \end{cases}
$$
and replace $(x_j, x_j^c, x_j^l)$ by
$$
(\hat x_j, \hat x_j^c, \hat x_j^l) := (x_j, x_j^c, x_j^l) - \epsilon (x_j, -\bar r_j^* x_j / r_2^c, a / r_2^c).
$$
Then $\hat x_j^l \ge 0$, $\hat w_j = w_j$, $\hat\rho_j = \rho_j$, and the risk is reduced by the positive amount $[1 - (1 - \epsilon)^2] x_j^* \Sigma_j x_j$.
This proves statement 1. For xl0 > 0 we modify xc0 and all xlj . Let (x̂0 , x̂c0 , x̂l0 ) := (x0 , xc0 +xl0 , 0) and
(x̂j , x̂cj , x̂lj ) := (xj , xcj , xlj + r1c xl0 ) for j ∈ L. Clearly, x̂ is feasible, R(x̂) = R(x), and x̂lj ≥ r1c xl0 > 0
in all scenarios. Thus, by statement 1, x̂ and consequently x are not optimal.
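The local modification used in the proof of statement 1 can be traced on numbers. The sketch below applies the step to one leaf with hypothetical data and returns the modified triple; the function name `remove_loss_step` and all values are assumptions for illustration. The budget $e^* x_j + x_j^c + x_j^l$ and the scenario return $\bar r_j^* x_j + r_2^c x_j^c$ should be unchanged, while $x_j^* \Sigma_j x_j$ shrinks by the factor $(1 - \epsilon)^2$.

```python
import numpy as np


def remove_loss_step(x, xc, xl, rbar, r2c):
    """Shift money from the loss asset into cash as in the proof of Lemma 20."""
    a = (rbar - r2c) @ x                      # a = (rbar_j - r_2^c e)^* x_j
    eps = min(1.0, r2c * xl / a) if a > 0 else 1.0
    x_new = x - eps * x                       # scale risky holdings by (1 - eps)
    xc_new = xc + eps * (rbar @ x) / r2c      # compensate the return with cash
    xl_new = xl - eps * a / r2c               # pay for it from the loss asset
    return x_new, xc_new, xl_new, eps


# Hypothetical leaf data.
sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
rbar = np.array([1.1, 1.2])
r2c = 1.05
x, xc = np.array([0.5, 0.3]), 0.2

results = []
for xl in (0.1, 0.02):  # large and small loss positions
    x_new, xc_new, xl_new, eps = remove_loss_step(x, xc, xl, rbar, r2c)
    results.append({
        "eps": eps,
        "xl_new": xl_new,
        "budget_gap": (x_new.sum() + xc_new + xl_new) - (x.sum() + xc + xl),
        "return_gap": (rbar @ x_new + r2c * xc_new) - (rbar @ x + r2c * xc),
        "risk_before": x @ sigma @ x,
        "risk_after": x_new @ sigma @ x_new,
    })
```

With the larger loss position the step removes the risky holdings completely ($\epsilon = 1$); with the smaller one only a fraction can be financed.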

Discussion. As in the one-period case, the 100 % cash solution plays a key role: it has the largest
reward among all riskless solutions. But now the solutions become degenerate for small rewards
ρ < rc , even if loss is allowed only in the second period, i.e., if xl0 = 0 is fixed. This does not
happen in Problem 12 (the modification of Problem 11 with bounded reward ρ(x) ≥ ρ), which
behaves precisely as the corresponding single-period problem. Of course, the degeneracy occurs
only for practically irrelevant rewards, and even then it can easily be avoided. (One may choose
the vertex of the zero-risk cone, i.e., xl0 = 1 − ρ/rc . This removes any surplus money immediately
and gives a unique solution.)
The only case of practical interest is ρ > rc , when the solution of Problem 11 remains optimal
in Problem 12. Why do we prefer the loss formulation, Problem 13? Obviously the risk cannot be
higher than in Problem 11 since every optimal solution of the latter remains feasible in the former
problem. Actually it turns out that the loss formulation gives strictly lower risk in most cases,
i.e., it allows better solutions than Problem 12. To develop a geometric understanding for this
observation, we compare Problems 11 and 13 in a simplified situation. A reformulation eliminates
all the budget equations and most of the portfolio variables in favor of the individual scenario
returns. The two risk terms Rc , Rd are then used to explain in which cases (and how) an optimal
solution of Problem 11 can be modified to give a better feasible solution of Problem 13.
Problem 14. Consider as example a portfolio consisting of just one risky asset and cash, using the second formulation of reward and risk in Lemma 16. Include loss assets in the leaves but not in the root, and write the problem with scenario returns $\rho_j$ as additional variables,
$$
\begin{aligned}
\min_{x, \{\rho_j\}} \quad & \sum_{j \in L} \tfrac12 p_j [\Sigma_j x_j^2 + \rho_j^2] - \tfrac12 \rho^2 \\
\text{s.t.} \quad & \rho_j = \bar r_j x_j + r_2^c x_j^c \quad \forall j \in L, \\
& x_0 + x_0^c = 1, \\
& x_j + x_j^c + x_j^l = r_j x_0 + r_1^c x_0^c, \quad x_j^l \ge 0 \quad \forall j \in L, \\
& \sum_{j \in L} p_j \rho_j = \rho.
\end{aligned}
$$
This specialization of Problem 13 is only considered for $\rho \ge r^c$, but for arbitrary $\rho \in \mathbb{R}$ as a specialization of Problem 11, i.e., when all the loss variables $x_j^l = 0$ are fixed. (In these cases the solution is unique by Theorems 17 and 20.)
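Lemma 21. For simplicity assume $\bar r_j \ne r_2^c$ for all $j \in L$. (A8 guarantees this for just one $j \in L$.) Let $\phi_j := \Sigma_j / (\bar r_j - r_2^c)^2 > 0$, $\psi_j := r_2^c (r_j - r_1^c)$, and $\theta_j \equiv r_2^c x_j^l$. Then Problem 14 is equivalent to
$$
\begin{aligned}
\min_{x_0, \{\rho_j, \theta_j\}} \quad & \sum_{j \in L} \tfrac12 p_j [\phi_j (\rho_j - r^c - \psi_j x_0 + \theta_j)^2 + \rho_j^2] - \tfrac12 \rho^2 \\
\text{s.t.} \quad & \sum_{j \in L} p_j \rho_j = \rho, \quad \theta_j \ge 0 \quad \forall j \in L.
\end{aligned}
\tag{3}
$$
The optimal scenario returns are
$$
\rho_j = \frac{\phi_j (r^c + \psi_j x_0 - \theta_j) + \mu}{\phi_j + 1}, \quad j \in L. \tag{4}
$$
Moreover, the optimal reward multiplier $\mu$ has the same value in both problems.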
Proof. Eliminate
$$
x_0^c = 1 - x_0, \qquad x_j^c = (r_j - r_1^c) x_0 + r_1^c - x_j - x_j^l.
$$
Then substitute $x_j^c$ into the $\rho_j$ equation and use $\psi_j$ and $\theta_j$ to obtain
$$
\rho_j = (\bar r_j - r_2^c) x_j + r^c + \psi_j x_0 - \theta_j.
$$
Solving for $x_j$, inserting it into the objective and using $\phi_j$ yields Problem (3). Differentiating the Lagrangian with respect to the returns $\rho_j$ gives optimality conditions
$$
\phi_j (\rho_j - r^c - \psi_j x_0 + \theta_j) + \rho_j = \mu,
$$

from which one obtains the expression (4). This derivation holds for the case without loss, too:
one just has to set xlj = θj = 0 everywhere. Finally, when the problem transformations above are
applied to the full primal-dual system of optimality conditions, it is observed that μ has the same
value in both problems.
Discussion. Consider the case $\theta_j = 0$ first (Problem 11). We have to choose optimal values for $x_0$ and for the scenario returns $\rho_j$ so that their mean equals $\rho$. Defining $d_j := r^c + \psi_j x_0$ gives the continuous risk part
$$
R_c = \sum_{j \in L} p_j \phi_j (\rho_j - d_j)^2 \ge 0.
$$

This is a weighted average of scenario risks φj (ρj −dj )2 , each of which defines a parabola character-
ized by its offset dj and curvature φj . Both magnitude and distance of the offsets are influenced
by the common “spread factor” x0 : they all coincide with rc if x0 = 0 (100 % cash), whereas
x0 = 1 (no cash) yields the discrete distribution dj = r2c rj with mean d := r2c E({rj }). Clearly, the
continuous risk Rc is small when all the scenario returns are close to their respective offsets, while
the discrete risk Rd is small when they are close to each other. Thus, loosely speaking, x0 has the
job to balance the scenarios by adjusting d (close to ρ) without spreading the offsets too much.
Only one detail changes when θj > 0 is allowed (Problem 13): each offset dj is replaced by
dj − θj , that is, the parabolas may be shifted to the left separately in each scenario.
We can now explain the risk reduction mechanism. Consider an optimal solution of Problem 14
without loss. Typically there will be “good” scenarios (fortunate for the investor, with large offsets
dj > ρ) and “bad” scenarios (unfortunate for the investor, with small offsets dj < ρ). Moreover,
one expects that some of the scenario returns will lie on their local upper branch (ρj > dj ) and
some on their local lower branch (ρj < dj ). If in a good scenario the optimal return lies on the
lower branch, ρ < ρj < dj , then its contribution to the risk is canceled by shifting the parabola to
the left, θj := dj − ρj > 0. Clearly, nothing else changes, so that this gives a suboptimal feasible
solution of Problem 13 which is better than the optimal solution of Problem 11. The mechanism
here is precisely the same as in a single period (Problem 5) except that it now occurs locally in
individual scenarios.
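The shift mechanism can be illustrated with a few numbers. The sketch below evaluates the continuous part $R_c = \sum_j p_j \phi_j (\rho_j - d_j + \theta_j)^2$ and the discrete part $R_d = \sum_j p_j \rho_j^2 - \rho^2$ before and after setting $\theta_j := d_j - \rho_j$ in every scenario with $\rho < \rho_j < d_j$. The probabilities, curvatures, offsets, and scenario returns are hypothetical and are not an optimal solution of any problem in the paper; they only show the bookkeeping.

```python
import numpy as np


def risk_parts(p, phi, d, rho_j, theta, rho):
    """Return (R_c, R_d) for given scenario returns and parabola shifts."""
    r_cont = p @ (phi * (rho_j - d + theta) ** 2)
    r_disc = p @ rho_j ** 2 - rho ** 2
    return r_cont, r_disc


# Hypothetical data: scenario 3 is "good" and lies on its lower branch.
p = np.array([0.2, 0.6, 0.2])
phi = np.array([400.0, 400.0, 400.0])
d = np.array([1.08, 1.12, 1.20])
rho_j = np.array([1.09, 1.12, 1.15])
rho = 1.12  # mean of the scenario returns under p

no_shift = np.zeros(3)
# Shift the parabola to the left where rho < rho_j < d_j.
shift = np.where((rho_j > rho) & (rho_j < d), d - rho_j, 0.0)

before = risk_parts(p, phi, d, rho_j, no_shift, rho)
after = risk_parts(p, phi, d, rho_j, shift, rho)
```

Only the continuous part changes: the scenario returns themselves are untouched, so $R_d$ is identical before and after the shift.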
Of course, the optimal solution of Problem 13 will readjust all the variables globally. Now x0
still has the job to balance scenarios by adjusting d close to ρ, but only without spreading the
small offsets dj < ρ too much. The large offsets dj > ρ produce surplus money in all sufficiently
good scenarios. These scenarios do not contribute to Rc , and all their returns are equal (and
slightly larger than ρ to balance the bad ones). This is proved in Theorem 20: surplus money
xlj > 0 implies r2c wj > μ and ρj = μ ≥ ρ. It means that a jump discontinuity is produced in
the distribution of final wealth, to which each of the riskless scenarios contributes a fraction. It
also means that the bad scenarios dominate the resulting risk: again we have an approximate
minimization of downside risk.
When looking for an instance of Problem 11 with ρ < ρj < dj one might try Problem 3 with
only two scenarios. However, the following results show that the effect cannot occur if n = 1 and
N = 2: in that case (with ρ > rc ) the single degree of freedom in x0 is sufficient to balance the
scenarios well. After proving this we give an example of the risk reduction with n = 1 and N = 3.
A slight modification finally shows that risk reduction can occur even in bad scenarios, that is,
optimal solutions of Problem 11 may have ρj < dj < ρ.
For the comparison of signs we define the equivalence relation

a∼b : ⇐⇒ sign(a) = sign(b) ∈ {−1, 0, 1}

and use the fact that {−1, 0, 1} is a multiplicative subgroup in R.


[Figure 12, four panels with horizontal axis from 1 to 1.25. Left panels (vertical axis 1 to 1.6): curves mu, rho1, rho2, rho3, where rho3 = mu in the top panel. Right panels (vertical axis 0 to 1.4): curves cash1, asset1, cash2, asset2, cash3, asset3.]

Figure 12: Optimal scenario returns (left) and investments (right) for the small example problem.
Asset values unconstrained (top) and nonnegative (bottom; r̄max = 1.21).

Lemma 22. In Problem 14 without loss assets, $\rho_j - d_j \sim s_j (\rho - r^c)$ where
$$
s_j := \sum_{k \in L \setminus \{j\}} \frac{p_k}{\delta_k^c + 1} (r_k - r_j)(r_k - r_1^c).
$$

Proof. See appendix.

Remark. Notice that sj depends only on the stochastic data (the return distribution), and in
particular not on ρ.

Corollary 2. In Problem 14 with two scenarios and without loss assets, both ρ1 and ρ2 lie on
their respective upper branches if ρ > rc .

Proof. Assumption A11 yields r1 < r1c < r2 , and the sums sj consist of one single term each since
N = 2. This implies that s1 , s2 are both positive.

Remark. A weak version of this result holds for r1c ∈ C0 . Then r1 ≤ r1c ≤ r2 , and s1 , s2 are both
nonnegative. Even that weaker version would prevent any beneficial shift θj > 0.
The example with $N = 3$ is simple: let $(p_1, p_2, p_3) = (0.2, 0.6, 0.2)$, $(r_1, r_2, r_3) = (1.0, 1.1, 1.2)$, $r_1^c = 1.05$, $\Sigma_1 = \Sigma_2 > 0$, and $j = 3$. Then $\delta_1^c = 0.05^2 / \Sigma_1 = \delta_2^c$, and hence
$$
s_3 = \sum_{k=1}^{2} \frac{p_k}{\delta_k^c + 1} (r_k - r_3)(r_k - r_1^c)
= \frac{0.2(-0.2)(-0.05) + 0.6(-0.1)(0.05)}{\delta_1^c + 1} < 0.
$$
By Lemma 22, $\rho_3$ lies on the lower branch for $\rho > r^c$ even though $r_1^c \in \operatorname{int}(C_0)$; thus introducing the loss asset $x_3^l$ reduces the optimal risk. (The values of $\Sigma_3$, $r_2^c$, and $\bar r_1, \bar r_2, \bar r_3$ do not matter here as long as A7, A8, A10, and A11 are satisfied. The reader may check that this holds for $\Sigma_j = 1$, $r_2^c = 1.05$, and $\bar r_j = 1.1$, $j \in \{1, 2, 3\}$.) Setting instead $r_1^c := 1.15$ and $\Sigma_2 = \Sigma_3 > 0$ yields $s_1 < 0$. This shows that risk reduction can also occur in bad scenarios, at least for unreasonably large $r_t^c$.
Results for this small example problem (with loss assets included) are displayed in Fig. 12.
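The sign computations for $s_j$ in this example are easy to reproduce. The sketch below evaluates the sum from Lemma 22 for the three-scenario data given in the text, using $\delta_k^c = 0.05^2$ for all $k$ (the choice $\Sigma_j = 1$, $r_2^c = 1.05$, $\bar r_j = 1.1$ mentioned above), and for a hypothetical two-scenario distribution as in Corollary 2. Indices are zero-based in the code, and the helper name `s` mirrors the symbol in the lemma.

```python
def s(j, p, r, r1c, delta_c):
    """s_j = sum over k != j of p_k / (delta_k^c + 1) * (r_k - r_j) * (r_k - r_1^c)."""
    return sum(
        p[k] / (delta_c[k] + 1.0) * (r[k] - r[j]) * (r[k] - r1c)
        for k in range(len(p))
        if k != j
    )


# Three-scenario example from the text.
p = [0.2, 0.6, 0.2]
r = [1.0, 1.1, 1.2]
delta_c = [0.05 ** 2] * 3

s3 = s(2, p, r, 1.05, delta_c)           # good scenario, r_1^c = 1.05
s1_large_cash = s(0, p, r, 1.15, delta_c)  # bad scenario, r_1^c = 1.15

# Hypothetical two-scenario case with r_1 < r_1^c < r_2 (Corollary 2).
p2, r2 = [0.5, 0.5], [1.0, 1.1]
s_two = [s(j, p2, r2, 1.05, [0.05 ** 2] * 2) for j in range(2)]
```

Both three-scenario values are negative, while the two-scenario values are positive, in line with Corollary 2.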

2.4 The general multi-period case


The general situation with loss assets becomes quite complicated and we do not attempt a thorough
analysis. Instead, we state some plausible extensions of the two-period results that have been
confirmed in practical computations.
Surplus money can now appear in any node $j \in L_t$, $t \in \{0, \dots, T\}$, if the (partial) scenario is sufficiently good up to that point, that is, if
$$
w_j > w_t^{\min} := \frac{\mu}{r_{T+1}^c \cdots r_{t+1}^c}.
$$

(As before the root can be excluded by considering only “large” desired rewards ρ > rc , with
rc := rTc +1 . . . r1c .) When there is surplus money, the whole scenario subtree rooted in j does not
contribute to the risk provided that sufficient node capitals wk are maintained. Implicitly this
condition defines a zero-risk polyhedron whose geometry is determined by the subtree’s discrete
return distribution. Of course, if j ∈ LT −1 , then one gets a cone similar to the one considered
before, but depending on $w_j$. These observations imply that the generic optimal solution is highly degenerate: any reasonable return discretization will include good and bad partial scenarios on each level, and surplus money will almost always appear somewhere in the tree. However, even in the zero-risk subtrees all leaf variables are again uniquely determined by the lower-level variables.
Obviously $w_{t+s}^{\min}$ is the sufficient capital for a node $k \in L_{t+s}$ in the zero-risk subtree, and
the easiest way to maintain that amount is to invest precisely wtmin in cash and remove the rest
(“invest” in xlj ). Thus all surplus money is taken out immediately in the root of the subtree and
each remaining node has 100 % cash.
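The threshold $w_t^{\min}$ and the resulting split of node capital into cash and removed surplus can be sketched as follows. The function names, the list layout of the cash returns, and the numerical values are assumptions made here for illustration; only the formula $w_t^{\min} = \mu / (r_{T+1}^c \cdots r_{t+1}^c)$ is taken from the text.

```python
import math


def min_capital(mu, cash_returns, t):
    """w_t^min = mu / (r^c_{T+1} ... r^c_{t+1}); cash_returns[k] holds r^c_{k+1}."""
    return mu / math.prod(cash_returns[t:])


def split_surplus(w_j, mu, cash_returns, t):
    """Return (cash, removed) for a node at time t, or None without surplus."""
    w_min = min_capital(mu, cash_returns, t)
    if w_j <= w_min:
        return None
    return w_min, w_j - w_min  # keep w_min in cash, remove the rest


# Hypothetical data: two periods (T = 1) with cash returns r_1^c and r_2^c.
cash_returns = [1.05, 1.04]
mu = 1.2

w_min_root = min_capital(mu, cash_returns, 0)
w_min_leaf_level = min_capital(mu, cash_returns, 1)
good_node = split_surplus(1.25, mu, cash_returns, 1)
bad_node = split_surplus(1.10, mu, cash_returns, 1)
```

Holding `w_min` in cash until the end of the horizon yields exactly $\mu$, which is why the remaining nodes of a zero-risk subtree can be 100 % cash.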

3 Conclusions
We have seen that multi-period mean-variance problems behave similarly to their single-period counterparts in many respects. Specifically, it is possible to avoid overperformance by allowing capital to be removed. Small desired rewards $\rho \le r^c$ are met exactly at zero risk. In that case all the capital is either invested in cash or removed; thus minimizing the variance is trivially equivalent to minimizing the semi-variance (or any other downside risk measure) without removing capital but allowing the desired reward to be exceeded. That is, with $x = (x_j, x_j^c)_{j \in V}$ and in abbreviated notation, the problem
$$
\min_x R(x) \quad \text{s.t.} \quad \rho(x) = \rho, \quad e^* x_j + x_j^c \le w_j \quad \forall j \in V,
$$
is equivalent to the downside risk problem
$$
\min_x R_\rho^2(x) \quad \text{s.t.} \quad \rho(x) \ge \rho, \quad e^* x_j + x_j^c = w_j \quad \forall j \in V.
$$

(Of course, the solutions of the second problem differ insofar as surplus money is invested in
cash instead of being removed.) For moderate values ρ > rc one cannot avoid overperformance
completely, but in effect the first problem still tends to minimize the semi-variance. More precisely,
the discrete part Rd approximates its downside version due to the existence of zero-risk subtrees.
The quality of that approximation decreases as ρ increases so that for large values the risk measure
becomes a blending of variance and semi-variance. Note that there is no such gradual process in the
single-period case, but there is a close similarity of single-period downside risk (Theorem 15) and
multi-period zero-risk polyhedra (Lemma 18). We may conclude that Problem 13 is a reasonable
multi-period model for an investor who wishes to minimize the semi-variance rather than the
variance of final wealth.
The previous comparison also gives some hints on how an optimal policy of Problem 13 should be interpreted. Again, positive values of $x_j^l$ do not suggest burning that amount. They indicate the presence of surplus money which the investor may spend immediately without risk of missing her goal, or which she may invest in cash to obtain a riskless extra profit. Of course, she may also consume part of the surplus and invest the rest. Thus, if the investor implements the optimal policy over the full planning horizon, she will approximately minimize the risk of ending up with less than the desired amount, regardless of her choice. Interestingly, the second alternative
(investing) amounts to a single-period strategy with predetermined intermediate decisions, which
may be useful when the investor cannot react to the market until the end of her planning horizon,
or for some reason does not wish to do so.
However, it should be noted that the problem under consideration is not time-invariant due
to the reward condition. That condition involves an expectation over all scenarios, that is, over
the potential futures at t = 0. But at t = 1 most of these potential futures become impossible,
whichever scenario realizes. The terminal condition ρ(x) = ρ or ρ(x) ≥ ρ can usually not be
satisfied when the restricted expectation over the subtree is taken—unless it happens to be a zero-
risk subtree. Therefore only the immediate decision will be of interest for the typical investor.
Rather than following the original future policy, she will adjust the reward and solve the problem
anew for each decision. Of course, the investor may also build an extended model after each period
in pursuing a moving horizon technique.
In any case it seems appropriate to consider all riskless strategies (in addition to the efficient
ones) as reasonable choices in multi-period decision models. This does no harm since it includes
all the standard alternatives, but it opens up new possibilities like the trick described above.
Let us finally point out two issues that might be interesting subjects of future research. First,
the model presented here does not include any preferences of consumption, although one can easily
specify hard constraints (exact, minimal, or maximal consumption) through a cash flow. However,
it is not clear how one should incorporate (soft) preferences and how the result would be related
to long-term models based on utility of consumption. Second, the multi-period setting enables the
investor to control higher moments of her distribution of final wealth—at least to some extent.
How would risk measures involving skewness, e.g., behave in the context of our model?

Appendix
The appendix contains some proofs and a theorem that would disrupt the line of thought in the
main body of the multi-period section.
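Proof of Theorem 16. The system of optimality conditions (for two periods) can be written
$$
\begin{pmatrix}
0 & & & & e & -r_1 & \cdots & -r_N & 0 \\
& \tilde\Sigma_1 & & & & e & & & \tilde r_1 \\
& & \ddots & & & & \ddots & & \vdots \\
& & & \tilde\Sigma_N & & & & e & \tilde r_N \\
e^* & & & & & & & & \\
-r_1^* & e^* & & & & & & & \\
\vdots & & \ddots & & & & & & \\
-r_N^* & & & e^* & & & & & \\
0 & \tilde r_1^* & \cdots & \tilde r_N^* & & & & &
\end{pmatrix}
\begin{pmatrix}
x_0 \\ x_1 \\ \vdots \\ x_N \\ -\lambda_0 \\ -\lambda_1 \\ \vdots \\ -\lambda_N \\ -\mu
\end{pmatrix}
=
\begin{pmatrix}
0 \\ 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \\ \rho
\end{pmatrix}.
$$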

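For $j \in L$, $x_j = \tilde\Sigma_j^{-1} (\lambda_j e + \mu \tilde r_j)$ is immediately obtained from the $j$-th dual feasibility condition. Substitution into the budget equation for $x_j$ and into dual feasibility condition 0, respectively, yields $\lambda_j = (r_j^* x_0 - \mu \tilde\beta_j) / \tilde\alpha_j$ and
$$
0 = -\lambda_0 e + \sum_{j \in L} \lambda_j r_j
= -\lambda_0 e + \sum_{j \in L} \frac{r_j^* x_0 - \mu \tilde\beta_j}{\tilde\alpha_j} r_j
= -\lambda_0 e + \tilde\Sigma_0 x_0 - \mu \tilde r_0,
$$
giving $x_0 = \tilde\Sigma_0^{-1} (\lambda_0 e + \mu \tilde r_0)$. The budget equation for $x_0$ now reads
$$
1 = e^* x_0 = e^* \tilde\Sigma_0^{-1} (\lambda_0 e + \mu \tilde r_0) = \lambda_0 \tilde\alpha_0 + \mu \tilde\beta_0,
$$
yielding $\lambda_0 = (1 - \mu \tilde\beta_0) / \tilde\alpha_0$. From the reward equation one finally obtains
$$
\begin{aligned}
\rho &= \sum_{j \in L} \tilde r_j^* x_j = \sum_{j \in L} (\lambda_j \tilde\beta_j + \mu \tilde\gamma_j)
= \sum_{j \in L} \Bigl( r_j^* x_0 \frac{\tilde\beta_j}{\tilde\alpha_j} - \mu \frac{\tilde\beta_j^2}{\tilde\alpha_j} + \mu \tilde\gamma_j \Bigr) \\
&= \tilde r_0^* x_0 + \mu \sum_{j \in L} \frac{\tilde\delta_j}{\tilde\alpha_j}
= \lambda_0 \tilde\beta_0 + \mu \tilde\gamma_0 + \mu \sum_{j \in L} \frac{\tilde\delta_j}{\tilde\alpha_j} \\
&= \frac{\tilde\beta_0}{\tilde\alpha_0} - \mu \frac{\tilde\beta_0^2}{\tilde\alpha_0} + \mu \tilde\gamma_0 + \mu \sum_{j \in L} \frac{\tilde\delta_j}{\tilde\alpha_j}
= \frac{\tilde\beta_0}{\tilde\alpha_0} + \mu \sum_{j \in V} \frac{\tilde\delta_j}{\tilde\alpha_j}.
\end{aligned}
$$
This gives the reward multiplier using Lemma 12,
$$
\mu = \Bigl( \rho - \frac{\tilde\beta_0}{\tilde\alpha_0} \Bigr) \Big/ \sum_{j \in V} \frac{\tilde\delta_j}{\tilde\alpha_j}.
$$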

(Here we need assumption A6: the denominator vanishes if δ̃j = 0 for all j ∈ V .) As to the global
minimum, we have

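\[
\begin{aligned}
x_j^* \tilde\Sigma_j x_j
  &= \lambda_j (\lambda_j \tilde\alpha_j + \mu \tilde\beta_j) + \mu (\lambda_j \tilde\beta_j + \mu \tilde\gamma_j) \\
  &= \lambda_j r_j^* x_0 + \mu \frac{\tilde\beta_j r_j^* x_0 + \mu \tilde\delta_j}{\tilde\alpha_j}
   = \frac{(r_j^* x_0)^2}{\tilde\alpha_j} + \mu^2 \frac{\tilde\delta_j}{\tilde\alpha_j}
\end{aligned}
\]

and similarly

\[
x_0^* \tilde\Sigma_0 x_0 = \lambda_0 + \mu \frac{\tilde\beta_0 + \mu \tilde\delta_0}{\tilde\alpha_0}
  = \frac{1}{\tilde\alpha_0} + \mu^2 \frac{\tilde\delta_0}{\tilde\alpha_0} .
\]

Therefore,

\[
\begin{aligned}
\sum_{j \in L} x_j^* \tilde\Sigma_j x_j
  &= \sum_{j \in L} \frac{(r_j^* x_0)^2}{\tilde\alpha_j} + \mu^2 \sum_{j \in L} \frac{\tilde\delta_j}{\tilde\alpha_j}
   = x_0^* \tilde\Sigma_0 x_0 + \mu^2 \sum_{j \in L} \frac{\tilde\delta_j}{\tilde\alpha_j} \\
  &= \frac{1}{\tilde\alpha_0} + \mu^2 \sum_{j \in V} \frac{\tilde\delta_j}{\tilde\alpha_j}
   = \frac{1}{\tilde\alpha_0} + \Bigl( \rho - \frac{\tilde\beta_0}{\tilde\alpha_0} \Bigr)^2 \Big/ \sum_{j \in V} \frac{\tilde\delta_j}{\tilde\alpha_j} .
\end{aligned}
\]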

Subtracting $\rho^2$ yields a risk expression of the form $\sigma^2(\rho) = s + (\rho - c)^2/d - \rho^2$. Since the optimal
portfolio x is an affine function of ρ and R is convex quadratic, the efficient frontier is either
strictly convex (iff d < 1), or $\sigma^2(\rho) \equiv 0$ (iff d = 1 and c = s = 0). But $s = \tilde\alpha_0^{-1} > 0$ by Lemma 12;
therefore $\sigma^2(\rho)$ has the global minimum $s - c^2/(1 - d)$ at $\hat\rho = c/(1 - d)$, as stated.
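
The closed-form multiplier and root portfolio derived in this proof can be checked against a direct solve of the optimality system. The following is only a minimal sketch in exact rational arithmetic. All numerical data are hypothetical, and the definitions $\tilde\Sigma_j = p_j(\Sigma_j + \bar r_j \bar r_j^*)$, $\tilde r_j = p_j \bar r_j$, $\tilde\Sigma_0 = \sum_j r_j r_j^*/\tilde\alpha_j$, $\tilde r_0 = \sum_j (\tilde\beta_j/\tilde\alpha_j) r_j$ and $\tilde\delta_j = \tilde\alpha_j \tilde\gamma_j - \tilde\beta_j^2$ are assumptions read off from the displayed identities, since they are not restated in this appendix.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A z = b exactly by Gauss-Jordan elimination with row swaps."""
    m = len(A)
    M = [[F(v) for v in row] + [F(rhs)] for row, rhs in zip(A, b)]
    for c in range(m):
        piv = next(i for i in range(c, m) if M[i][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for i in range(m):
            if i != c and M[i][c] != 0:
                f = M[i][c]
                M[i] = [vi - f * vc for vi, vc in zip(M[i], M[c])]
    return [row[m] for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def abgd(S, r):
    """Return (alpha, beta, gamma, delta, S^{-1} e, S^{-1} r) for matrix S and vector r."""
    e = [F(1)] * len(r)
    u, v = solve(S, e), solve(S, r)
    a, b, g = dot(e, u), dot(e, v), dot(r, v)
    return a, b, g, a * g - b * b, u, v

# Hypothetical data: n = 2 assets, N = 2 equally likely scenarios.
n, N, rho = 2, 2, F(6, 5)
p = [F(1, 2), F(1, 2)]
r = [[F(11, 10), F(9, 10)], [F(9, 10), F(6, 5)]]   # first-period returns r_j
rbar = [[F(11, 10), F(1)], [F(1), F(11, 10)]]      # second-period mean returns
Sig = [[[F(1, 25), F(0)], [F(0), F(1, 100)]]] * N  # second-period covariances

# Assumed definitions of the tilde quantities in the leaves.
Sig_t = [[[p[j] * (Sig[j][i][k] + rbar[j][i] * rbar[j][k]) for k in range(n)]
          for i in range(n)] for j in range(N)]
r_t = [[p[j] * v for v in rbar[j]] for j in range(N)]

# KKT system in the unknowns (x0, x1, ..., xN, -lam0, -lam1, ..., -lamN, -mu).
size = n * (N + 1) + N + 2
K = [[F(0)] * size for _ in range(size)]
rhs = [F(0)] * size
i_lam0, i_mu = n * (N + 1), size - 1
for i in range(n):
    K[i][i_lam0] = K[i_lam0][i] = F(1)
for j in range(N):
    xj, lj = n * (j + 1), i_lam0 + 1 + j
    for i in range(n):
        K[i][lj] = K[lj][i] = -r[j][i]
        K[xj + i][lj] = K[lj][xj + i] = F(1)
        K[xj + i][i_mu] = K[i_mu][xj + i] = r_t[j][i]
        for k in range(n):
            K[xj + i][xj + k] = Sig_t[j][i][k]
rhs[i_lam0], rhs[i_mu] = F(1), rho
z = solve(K, rhs)
x0_kkt, mu_kkt = z[:n], -z[i_mu]

# Closed form from the proof.
leaf = [abgd(Sig_t[j], r_t[j]) for j in range(N)]
Sig0 = [[sum(r[j][i] * r[j][k] / leaf[j][0] for j in range(N)) for k in range(n)]
        for i in range(n)]
r0 = [sum(leaf[j][1] / leaf[j][0] * r[j][i] for j in range(N)) for i in range(n)]
a0, b0, g0, d0, u0, v0 = abgd(Sig0, r0)
D = d0 / a0 + sum(d / a for a, _, _, d, _, _ in leaf)  # sum over j in V
mu = (rho - b0 / a0) / D
lam0 = (1 - mu * b0) / a0
x0 = [lam0 * u + mu * v for u, v in zip(u0, v0)]

print(mu == mu_kkt, x0 == x0_kkt)
```

Exact fractions are used so that agreement is an identity rather than a floating-point coincidence; the tiny Gauss-Jordan routine stands in for any linear solver.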

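Proof of Theorem 17. The system of optimality conditions can be written

\[
\begin{aligned}
&\partial L/\partial x_0 = 0: & -\lambda_0 e + \sum_{j \in L} \lambda_j r_j &= 0, \\
&\partial L/\partial x_0^c = 0: & -\lambda_0 + \sum_{j \in L} \lambda_j r_1^c &= 0, \\
&\partial L/\partial x_j = 0: & \tilde\Sigma_j x_j + r_2^c \tilde r_j x_j^c - \lambda_j e - \mu \tilde r_j &= 0 \quad \forall j \in L, \\
&\partial L/\partial x_j^c = 0: & r_2^c \tilde r_j^* x_j + r_2^c \tilde r_j^c x_j^c - \lambda_j - \mu \tilde r_j^c &= 0 \quad \forall j \in L, \\
&\partial L/\partial \lambda_0 = 0: & e^* x_0 + x_0^c &= 1, \\
&\partial L/\partial \lambda_j = 0: & e^* x_j + x_j^c - r_j^* x_0 - r_1^c x_0^c &= 0 \quad \forall j \in L, \\
&\partial L/\partial \mu = 0: & \sum_{j \in L} \tilde r_j^* x_j + \tilde r_j^c x_j^c &= \rho.
\end{aligned}
\]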

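The dual feasibility condition $\partial L/\partial x_j^c = 0$ gives

\[
x_j^c = \frac{1}{r_2^c} \Bigl( -\bar r_j^* x_j + \frac{\lambda_j}{\tilde r_j^c} + \mu \Bigr). \tag{5}
\]

Substitution into $\partial L/\partial x_j = 0$ yields

\[
0 = \tilde\Sigma_j x_j + \tilde r_j \Bigl( -\bar r_j^* x_j + \frac{\lambda_j}{\tilde r_j^c} + \mu \Bigr) - \lambda_j e - \mu \tilde r_j
  = p_j \Sigma_j x_j + \frac{\lambda_j}{r_2^c} (\bar r_j - r_2^c e),
\]

which gives $x_j$ and, upon substitution into (5), $x_j^c$:

\[
x_j = -\frac{\lambda_j}{\tilde r_j^c} \Sigma_j^{-1} (\bar r_j - r_2^c e), \qquad
x_j^c = \frac{\lambda_j}{r_2^c \tilde r_j^c} (\gamma_j - r_2^c \beta_j + 1) + \frac{\mu}{r_2^c} .
\]

Therefore the budget equation $\partial L/\partial \lambda_j = 0$ reads

\[
r_j^* x_0 + r_1^c x_0^c = e^* x_j + x_j^c
  = -\frac{\lambda_j}{\tilde r_j^c} (\beta_j - r_2^c \alpha_j) + \frac{\lambda_j}{r_2^c \tilde r_j^c} (\gamma_j - r_2^c \beta_j + 1) + \frac{\mu}{r_2^c}
  = \lambda_j \frac{\delta_j^c + 1}{r_2^c \tilde r_j^c} + \frac{\mu}{r_2^c} ,
\]

from which one obtains

\[
\lambda_j = \frac{\tilde r_j^c}{\delta_j^c + 1} \bigl[ r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu \bigr].
\]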

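Now we can proceed with the root variables. The condition $\partial L/\partial x_0^c = 0$ reads

\[
0 = -\lambda_0 + r_1^c \sum_{j \in L} \frac{\tilde r_j^c}{\delta_j^c + 1} \bigl[ r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu \bigr]
  = -\lambda_0 + r^c r_2^c \tilde r_0^* x_0 + r^c \tilde r_1^c x_0^c - \mu \tilde r_1^c ,
\]

giving

\[
x_0^c = \frac{1}{r^c} \Bigl( -r_2^c \bar r_0^* x_0 + \frac{\lambda_0}{\tilde r_1^c} + \mu \Bigr). \tag{6}
\]

Similarly, after inserting $\lambda_j$ and then $x_0^c$, the condition $\partial L/\partial x_0 = 0$ reads

\[
\begin{aligned}
0 &= -\lambda_0 e + \sum_{j \in L} \frac{\tilde r_j^c}{\delta_j^c + 1}\, r_j \bigl[ r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu \bigr] \\
  &= -\lambda_0 e + (r_2^c)^2 \tilde\Sigma_0 x_0 + r^c r_2^c \tilde r_0 x_0^c - \mu r_2^c \tilde r_0
   = (r_2^c)^2 \tilde p_0 \Sigma_0 x_0 + \frac{\lambda_0}{r_1^c} (\bar r_0 - r_1^c e).
\end{aligned}
\]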
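This yields $x_0$ and, by substitution into (6), $x_0^c$:

\[
x_0 = -\frac{\lambda_0}{r_2^c \tilde r_1^c} \Sigma_0^{-1} (\bar r_0 - r_1^c e), \qquad
x_0^c = \frac{\lambda_0}{r^c \tilde r_1^c} (\gamma_0 - r_1^c \beta_0 + 1) + \frac{\mu}{r^c} .
\]

Thus the root budget equation is

\[
1 = e^* x_0 + x_0^c
  = -\frac{\lambda_0}{r_2^c \tilde r_1^c} (\beta_0 - r_1^c \alpha_0) + \frac{\lambda_0}{r^c \tilde r_1^c} (\gamma_0 - r_1^c \beta_0 + 1) + \frac{\mu}{r^c}
  = \lambda_0 \frac{\delta_0^c + 1}{r^c \tilde r_1^c} + \frac{\mu}{r^c} ,
\]

which gives

\[
\lambda_0 = \frac{\tilde r_1^c}{\delta_0^c + 1} (r^c - \mu) = \tilde\rho\, (r^c - \mu).
\]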

By Lemma 15 we have p̃0 ∈ (0, 1] and δ0c ≥ 0. Moreover, if p̃0 = 1 then δjc = 0 for j ∈ L and thus
δ0c > 0. This proves the inclusion ρ̃ ≡ rc p̃0 /(δ0c + 1) ∈ (0, rc ). Altogether, the previous results give

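\[
\tilde r_j^* x_j + \tilde r_j^c x_j^c
  = -\frac{\lambda_j}{r_2^c} (\gamma_j - r_2^c \beta_j) + \frac{\lambda_j}{r_2^c} (\gamma_j - r_2^c \beta_j + 1) + p_j \mu
  = \frac{\lambda_j}{r_2^c} + p_j \mu \tag{7}
\]

and similarly

\[
r_2^c \tilde r_0^* x_0 + \tilde r_1^c x_0^c
  = -\frac{\lambda_0}{r^c} (\gamma_0 - r_1^c \beta_0) + \frac{\lambda_0}{r^c} (\gamma_0 - r_1^c \beta_0 + 1) + \tilde p_0 \mu
  = \frac{\lambda_0}{r^c} + \tilde p_0 \mu. \tag{8}
\]

Upon insertion of $\lambda_j$ and then $\lambda_0$ the reward condition reads

\[
\begin{aligned}
\rho &= \sum_{j \in L} \tilde r_j^* x_j + \tilde r_j^c x_j^c
      = \sum_{j \in L} \frac{p_j}{\delta_j^c + 1} \bigl[ r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu \bigr] + \mu \\
     &= r_2^c \tilde r_0^* x_0 + \tilde r_1^c x_0^c - \tilde p_0 \mu + \mu
      = \frac{\lambda_0}{r^c} + \mu = \frac{\tilde\rho}{r^c} (r^c - \mu) + \mu.
\end{aligned}
\]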
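This gives $\rho = \tilde\rho + \mu (r^c - \tilde\rho)/r^c$ and thus $\mu = r^c (\rho - \tilde\rho)/(r^c - \tilde\rho)$. To calculate the risk, let
$\bar R_j(x_j, x_j^c) := x_j^* \Sigma_j x_j + \rho_j(x_j, x_j^c)^2$ and use the last expression from (7) for $p_j \rho_j(x_j, x_j^c)$. Then

\[
\begin{aligned}
\bar R_j(x_j, x_j^c)
  &= \delta_j^c \Bigl( \frac{\lambda_j}{\tilde r_j^c} \Bigr)^2 + \Bigl( \frac{\lambda_j}{\tilde r_j^c} + \mu \Bigr)^2
   = (\delta_j^c + 1) \Bigl( \frac{\lambda_j}{\tilde r_j^c} \Bigr)^2 + 2 \frac{\lambda_j}{\tilde r_j^c} \mu + \mu^2 \\
  &= \frac{[r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu]^2}{\delta_j^c + 1} + 2 \frac{[r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu]}{\delta_j^c + 1} \mu + \mu^2 \\
  &= \frac{[r_2^c (r_j^* x_0 + r_1^c x_0^c)]^2}{\delta_j^c + 1} + \mu^2 \frac{\delta_j^c}{\delta_j^c + 1} .
\end{aligned}
\]

Since $R(x) = \sum_{j \in L} p_j \bar R_j(x_j, x_j^c) - \rho^2$, we get

\[
\begin{aligned}
R(x) + \rho^2
  &= (r_2^c)^2 \sum_{j \in L} \frac{p_j}{\delta_j^c + 1} \bigl[ (r_j^* x_0)^2 + 2 r_1^c x_0^c\, r_j^* x_0 + (r_1^c x_0^c)^2 \bigr]
     + \mu^2 \sum_{j \in L} \frac{p_j \delta_j^c}{\delta_j^c + 1} \\
  &= (r_2^c)^2 \bigl[ x_0^* \tilde\Sigma_0 x_0 + 2 r_1^c x_0^c\, \tilde r_0^* x_0 + \tilde p_0 (r_1^c x_0^c)^2 \bigr] + \mu^2 (1 - \tilde p_0).
\end{aligned}
\]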

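Using (8), the first term equals

\[
(r_2^c)^2 \tilde p_0 \bigl[ x_0^* \Sigma_0 x_0 + (\bar r_0^* x_0 + r_1^c x_0^c)^2 \bigr]
  = (r_2^c)^2 \tilde p_0 \Bigl[ \delta_0^c \Bigl( \frac{\lambda_0}{r_2^c \tilde r_1^c} \Bigr)^2
     + \Bigl( \frac{\lambda_0}{r_2^c \tilde r_1^c} + \frac{\mu}{r_2^c} \Bigr)^2 \Bigr],
\]

which is further simplified to

\[
\tilde p_0 \Bigl[ (\delta_0^c + 1) \Bigl( \frac{\lambda_0}{\tilde r_1^c} \Bigr)^2 + 2 \frac{\lambda_0}{\tilde r_1^c} \mu + \mu^2 \Bigr]
  = \tilde p_0 \Bigl[ \frac{(r^c - \mu)^2}{\delta_0^c + 1} + 2 \frac{r^c - \mu}{\delta_0^c + 1} \mu + \mu^2 \Bigr].
\]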

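Therefore we have

\[
\begin{aligned}
R(x) + \rho^2
  &= \tilde p_0 \frac{(r^c)^2 + \delta_0^c \mu^2}{\delta_0^c + 1} + \mu^2 (1 - \tilde p_0)
   = r^c \tilde\rho + \mu^2 \Bigl( 1 - \frac{\tilde p_0}{\delta_0^c + 1} \Bigr) \\
  &= r^c \tilde\rho + \mu^2 \Bigl( 1 - \frac{\tilde\rho}{r^c} \Bigr)
   = r^c \tilde\rho + r^c \frac{(\rho - \tilde\rho)^2}{r^c - \tilde\rho} .
\end{aligned}
\]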

Subtracting $\rho^2$ gives the stated risk formula whose minimum over all ρ is easily determined. The
remaining statements follow trivially.
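
The minimization mentioned at the end of this proof can be spelled out. The following worked step is a sketch that uses only the expression for $R(x) + \rho^2$ obtained above and the inclusion $\tilde\rho \in (0, r^c)$; the factored form on the right is a rearrangement added here, not a formula quoted from the theorem.

```latex
\begin{align*}
\sigma^2(\rho)
  &= r^c \tilde\rho + r^c \frac{(\rho - \tilde\rho)^2}{r^c - \tilde\rho} - \rho^2
   = \frac{\tilde\rho\,(\rho - r^c)^2}{r^c - \tilde\rho}, \\
\frac{d\sigma^2}{d\rho}
  &= \frac{2 \tilde\rho\,(\rho - r^c)}{r^c - \tilde\rho} = 0
   \iff \rho = r^c, \qquad \sigma^2(r^c) = 0 .
\end{align*}
```

Since $\tilde\rho/(r^c - \tilde\rho) > 0$, the stationary point is the global minimum. It is consistent with a pure cash position: $\rho = r^c$ gives $\mu = r^c$, hence $\lambda_0 = \tilde\rho (r^c - \mu) = 0$, $x_0 = 0$ and $x_0^c = \mu/r^c = 1$.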

Proof of Theorem 18. The optimality conditions include the ones of Problem 11 except that ρ
must be replaced by ρ + θ (in ∂L/∂μ = 0). Additional conditions are μ − (ρ + θ) = η (from
∂L/∂θ = 0), θ ≥ 0, η ≥ 0, and complementarity θη = 0. The proof now proceeds precisely as
the proof of Theorem 17, with ρ replaced by ρ + θ in all instances. Finally nonnegativity and
complementarity of θ and η lead to the case distinction: either θ = 0 and μ ≥ ρ, or θ > 0 and
μ = ρ + θ. The formula μ = rc (ρ + θ − ρ̃)/(rc − ρ̃) gives ρ ≥ rc in the first case and ρ + θ = rc in
the second case.
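
The case distinction in this proof translates directly into a small rule for the multipliers. The sketch below is illustrative only: the function name `reward_multiplier` and the numerical values are hypothetical, and it assumes $0 < \tilde\rho < r^c$ as established in the proof of Theorem 17.

```python
from fractions import Fraction as F

def reward_multiplier(rho, rc, rho_t):
    """Return (theta, mu, eta) for a desired reward rho, assuming 0 < rho_t < rc.

    First case  (rho >= rc): theta = 0 and eta = mu - rho >= 0.
    Second case (rho <  rc): theta = rc - rho > 0, so that rho + theta = rc and eta = 0.
    """
    theta = F(0) if rho >= rc else rc - rho
    mu = rc * (rho + theta - rho_t) / (rc - rho_t)
    eta = mu - (rho + theta)
    return theta, mu, eta

rc, rho_t = F(11, 10), F(1, 2)                 # hypothetical values of r^c and rho-tilde
high = reward_multiplier(F(6, 5), rc, rho_t)   # desired reward above the cash return
low = reward_multiplier(F(1), rc, rho_t)       # desired reward below the cash return
print(high, low)
```

In both cases complementarity $\theta\eta = 0$ holds by construction, which is a quick sanity check on the two branches.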
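Theorem 20. In Problem 13 let $\theta_j \equiv r_2^c x_j^l$ for $j \in L$, $\theta_0 \equiv r^c x_0^l$, and

\[
\bar\theta_0 := \frac{1}{\tilde p_0} \sum_{j \in L} \frac{p_j}{\delta_j^c + 1}\, \theta_j, \qquad
s_0 := \frac{1}{\tilde p_0} \sum_{j \in L} \frac{p_j}{\delta_j^c + 1}\, \theta_j r_j .
\]

Then every primal-dual solution satisfies

\[
\begin{aligned}
x_j &= -\frac{\lambda_j}{\tilde r_j^c} \Sigma_j^{-1} (\bar r_j - r_2^c e), \qquad
x_j^c = \frac{1}{r_2^c} \Bigl[ \frac{\lambda_j}{\tilde r_j^c} (\gamma_j - r_2^c \beta_j + 1) + \mu \Bigr], \\
x_0 &= -\frac{\lambda_0}{r_2^c \tilde r_1^c} \Sigma_0^{-1} (\bar r_0 - r_1^c e) - \frac{1}{r_2^c} \Sigma_0^{-1} (\bar\theta_0 \bar r_0 - s_0), \\
x_0^c &= \frac{1}{r^c} \Bigl[ \frac{\lambda_0}{\tilde r_1^c} (\gamma_0 - r_1^c \beta_0 + 1) + \mu + (\gamma_0 + 1) \bar\theta_0 - \bar r_0^* \Sigma_0^{-1} s_0 \Bigr], \\
\lambda_j &= \frac{\tilde r_j^c}{\delta_j^c + 1} \bigl[ r_2^c (r_j^* x_0 + r_1^c x_0^c) - \mu - \theta_j \bigr] = -\eta_j \le 0, \\
\lambda_0 &= \frac{\tilde r_1^c}{\delta_0^c + 1} \bigl[ r^c - \mu - \theta_0 - (\gamma_0 - r_1^c \beta_0 + 1) \bar\theta_0 + (\bar r_0 - r_1^c e)^* \Sigma_0^{-1} s_0 \bigr] = -\eta_0 \le 0, \\
\mu &= r^c \frac{\rho - \tilde\rho}{r^c - \tilde\rho} + \frac{\tilde\rho}{r^c - \tilde\rho} \bigl[ (\gamma_0 - r_1^c \beta_0 + 1) \bar\theta_0 + \theta_0 - (\bar r_0 - r_1^c e)^* \Sigma_0^{-1} s_0 \bigr].
\end{aligned}
\]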
In particular, the following case distinction can be made in each scenario. If $\eta_j > 0$, then $x_j^l = 0$,
$\rho_j(x_j, x_j^c) < \mu$, and $r_2^c w_j < \mu$. Conversely, if $\eta_j = 0$, then $x_j^l \ge 0$, $\rho_j(x_j, x_j^c) = \mu \ge \rho$, and
$r_2^c w_j \ge \mu$. In this case the leaf variables are

\[
x_j = 0, \qquad x_j^c = \mu/r_2^c, \qquad x_j^l = w_j - \mu/r_2^c,
\]

giving $x_j^l = 0$ (100 % cash) if $r_2^c w_j = \mu$ and $x_j^l > 0$ otherwise.

The following case distinction holds in the root. If $\eta_0 > 0$, then $x_0^l = 0$ and $\rho < \mu$. Otherwise,
if $\eta_0 = 0$, then $x_0^l \ge 0$ and $\rho = \mu$.
Proof. The optimality conditions of Theorem 17 remain valid except that xlj now appears in all
the budget conditions ∂L/∂λj = 0, j ∈ V . Additional optimality conditions in each node are
λj = −ηj (from ∂L/∂xlj = 0), xlj ≥ 0, ηj ≥ 0, and complementarity xlj ηj = 0. The expressions
above are obtained precisely as in the proof of Theorem 17 when slack variables xlj and derived
quantities θj , θ̄0 , s0 are included. This derivation also yields intermediate results
\[
\rho_j(x_j, x_j^c) = \mu + \frac{\lambda_j}{\tilde r_j^c} \le \mu, \quad j \in L, \qquad
\rho = \mu + \frac{\lambda_0}{r^c} \le \mu,
\]

cf. (7) and the reward condition after (8). The additional optimality conditions above together
with these identities lead to the stated case distinctions when it is observed that λj has the same
sign as r2c wj − μ − θj for j ∈ L.
Remark. Note that all the multipliers now have a natural interpretation. The reward multiplier μ
is the maximal scenario return and a threshold value for surplus money: there is surplus money
in scenario j iff ρj = μ and r2c wj > μ. The budget multipliers λj (up to a scaling factor) measure
the difference of desired return or scenario returns to the threshold.

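Proof of Lemma 22. Formula (4) for $\rho_j$ (with $\theta_j = 0$) gives

\[
(\phi_j + 1)(\rho_j - d_j) = \phi_j d_j + \mu - (\phi_j + 1) d_j = \mu - d_j
\]

and hence $\rho_j - d_j \sim \mu - d_j \equiv \mu - r^c - \psi_j x_0$. By Theorem 17,

\[
x_0 = -\frac{\lambda_0}{r_2^c \tilde r_1^c}\, \frac{\bar r_0 - r_1^c}{\Sigma_0}
    = -\tilde\rho\, \frac{r^c - \mu}{r_2^c \tilde r_1^c}\, \frac{\bar r_0 - r_1^c}{\Sigma_0} .
\]

Therefore we have

\[
\begin{aligned}
\rho_j - d_j &\sim \mu - r^c + r_2^c (r_j - r_1^c)\, \tilde\rho\, \frac{r^c - \mu}{r_2^c \tilde r_1^c}\, \frac{\bar r_0 - r_1^c}{\Sigma_0} \\
  &= (\mu - r^c) \Bigl[ 1 - \frac{r_j - r_1^c}{\delta_0^c + 1}\, \frac{\bar r_0 - r_1^c}{\Sigma_0} \Bigr].
\end{aligned}
\]

By Theorem 17, $\mu - r^c = r^c (\rho - r^c)/(r^c - \tilde\rho) \sim \rho - r^c$. The second factor equals

\[
\begin{aligned}
1 - \frac{(r_j - r_1^c)(\bar r_0 - r_1^c)}{[(\bar r_0 - r_1^c) \Sigma_0^{-1} (\bar r_0 - r_1^c) + 1]\, \Sigma_0}
  &\sim (\bar r_0 - r_1^c)^2 + \Sigma_0 - (r_j - r_1^c)(\bar r_0 - r_1^c) \\
  &= \Sigma_0 + \bar r_0^2 - (r_j + r_1^c) \bar r_0 + r_j r_1^c \\
  &\sim \tilde\Sigma_0 - (r_j + r_1^c) \tilde r_0 + \tilde p_0 r_j r_1^c \\
  &= \sum_{k \in L} \frac{p_k}{\delta_k^c + 1} \bigl[ r_k^2 - (r_j + r_1^c) r_k + r_1^c r_j \bigr] \\
  &= \sum_{k \ne j} \frac{p_k}{\delta_k^c + 1} (r_k - r_j)(r_k - r_1^c) = s_j .
\end{aligned}
\]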

This completes the proof.
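
The last two equalities of the chain are a purely algebraic identity and can be checked on sample numbers. The following sketch uses hypothetical single-asset data; the weights $p_k/(\delta_k^c + 1)$ and the scalar moments $\tilde p_0$, $\tilde r_0$, $\tilde\Sigma_0$ are taken in the form in which they appear in the sums above, which is an assumption about the definitions given elsewhere in the paper.

```python
from fractions import Fraction as F

# Hypothetical single-asset scenario data (three leaves).
p = [F(1, 4), F(1, 2), F(1, 4)]             # scenario probabilities p_k
delta_c = [F(1, 10), F(1, 5), F(3, 10)]     # leaf quantities delta_k^c
r = [F(9, 10), F(21, 20), F(6, 5)]          # first-period returns r_k
r1c = F(51, 50)                             # first-period cash return r_1^c

w = [pk / (dk + 1) for pk, dk in zip(p, delta_c)]    # weights p_k / (delta_k^c + 1)
p0 = sum(w)                                          # p~_0
r0 = sum(wk * rk for wk, rk in zip(w, r))            # r~_0 (scalar)
Sig0 = sum(wk * rk * rk for wk, rk in zip(w, r))     # Sigma~_0 (scalar)

def s_direct(j):
    """s_j as the sum over k != j of w_k (r_k - r_j)(r_k - r_1^c)."""
    return sum(w[k] * (r[k] - r[j]) * (r[k] - r1c) for k in range(len(r)) if k != j)

def s_moments(j):
    """The same quantity written as Sigma~_0 - (r_j + r_1^c) r~_0 + p~_0 r_j r_1^c."""
    return Sig0 - (r[j] + r1c) * r0 + p0 * r[j] * r1c

checks = [s_direct(j) == s_moments(j) for j in range(len(r))]
print(checks)
```

The term $k = j$ can be dropped from the sum because its factor $r_k - r_j$ vanishes, which is why the two functions agree exactly.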

Acknowledgement
This work would have been impossible without many intensive discussions in the Operations
Research group at the University of St. Gallen. My sincerest thanks especially to K. Frauendorfer,
H. Siede, and D. Steiner.

References
[1] A. J. Alexander and W. F. Sharpe, Fundamentals of investment, Prentice-Hall, 1989.
[2] K. J. Arrow, Essays in the theory of risk-bearing, Markham, Chicago, 1971.
[3] P. Artzner, F. Delbaen, J. M. Eber, and D. Heath, Thinking coherently, Risk, 10 (1997),
pp. 68–71.
[4] C. Atkinson, S. R. Pliska, and P. Wilmott, Portfolio management with transaction costs., Proc.
R. Soc. Lond., Ser. A, 453 (1997), pp. 551–562.
[5] V. S. Bawa, Stochastic dominance: A research bibliography, Manag. Sci., 28 (1982), pp. 698–712.
[6] R. E. Bellman, Dynamic Programming, Princeton University Press, 1957.
[7] A. Beltratti, A. Consiglio, and S. A. Zenios, Scenario modeling for the management of inter-
national bond portfolios., Ann. Oper. Res., 85 (1999), pp. 227–247.
[8] M. Best and B. Ding, On the continuity of the minimum in parametric quadratic programs., J.
Optimization Theory Appl., 86 (1995), pp. 245–250.
[9] J. R. Birge, Stochastic programming computations and applications, INFORMS J. Comput., 9
(1997), pp. 111–133.

[10] J. R. Birge and F. Louveaux, Introduction to Stochastic Programming, Springer-Verlag, 1997.


[11] M. Broadie, Computing efficient frontiers using estimated parameters, Ann. Oper. Res., 45 (1993),
pp. 21–58.
[12] V. K. Chopra and W. T. Ziemba, The effects of errors in means, variances, and covariances on
optimal portfolio choice, J. Portfolio Manag., (1993), pp. 6–11.
[13] G. Consigli and M. A. H. Dempster, Solving dynamic portfolio problems using stochastic pro-
gramming, Z. Angew. Math. Mech., 77 (1997), pp. S535–S536.
[14] G. B. Dantzig and G. Infanger, Multi-stage stochastic linear programs for portfolio optimization,
Ann. Oper. Res., 45 (1993), pp. 59–76.
[15] R. S. Dembo and D. Rosen, The practice of portfolio replication. A practical overview of forward
and inverse problems., Ann. Oper. Res., 85 (1999), pp. 267–284.
[16] T. Dohi and S. Osaki, A note on portfolio optimization with path-dependent utility, Ann. Oper.
Res., 45 (1993), pp. 77–90.
[17] G. T. Duncan, A matrix measure of multivariate local risk aversion, Econometrica, 45 (1977),
pp. 895–903.
[18] E. J. Elton and M. J. Gruber, Modern portfolio theory and investment analysis, John Wiley, New
York, 3rd ed., 1987.
[19] Y. Ermoliev and R. J.-B. Wets, eds., Numerical techniques for stochastic optimization, Springer-
Verlag, 1988.
[20] P. C. Fishburn, Mean-risk analysis with risk associated with below-target returns, Amer. Econ. Rev.,
67 (1977), pp. 116–125.
[21] K. Frauendorfer, The stochastic programming extension of the Markowitz approach, Int. J. Mass-
Parallel Comput. Inform. Syst., 5 (1995), pp. 449–460.
[22] K. Frauendorfer, Barycentric scenario trees in convex multistage stochastic programming, Math. Programming,
75 (1996), pp. 277–293.
[23] K. Frauendorfer and H. Siede, Portfolio selection using multi-stage stochastic programming, tech.
report, University of St. Gallen, 1998.
[24] N. H. Hakansson, Multi-period mean-variance analysis: Toward a general theory of portfolio choice,
J. Fin., 26 (1971), pp. 857–884.
[25] J. M. Harrison and S. R. Pliska, A stochastic calculus model of continuous trading: Complete
markets, Stochastic Processes Appl., 15 (1983), pp. 313–316.
[26] H. He and N. D. Pearson, Consumption and portfolio policies with incomplete markets and short-
sale constraints: The infinite dimensional case, J. Econ. Theory, 54 (1991), pp. 259–304.
[27] D. C. Heath, S. Orey, V. C. Pestien, and W. D. Sudderth, Minimizing or maximizing the
expected time to reach zero, SIAM J. Control Optimization, 25 (1987), pp. 195–205.
[28] G. Infanger, 1994. Private Communication (via K. Frauendorfer, 1998).
[29] J. E. Ingersoll, Jr., Theory of financial decision making, Rowman & Littlefield, 1987.
[30] S. D. Jacka, A martingale representation result and an application to incomplete financial markets,
Math. Finance, 2 (1992), pp. 239–250.
[31] I. Jewitt, Choosing between risky prospects: The characterization of comparative statistics results,
and location independent risk, Manag. Sci., 35 (1989), pp. 60–70.
[32] J. D. Jobson, Confidence regions for the mean-variance efficient set: An alternative approach to
estimation risk, Rev. Quant. Fin. Accounting, 1 (1991), p. 293.
[33] P. Kall and S. W. Wallace, Stochastic Programming, John Wiley, New York, 1994.
[34] I. Karatzas, Optimization problems in the theory of continuous trading, SIAM J. Control Optimiza-
tion, 27 (1989), pp. 1221–1258.
[35] I. Karatzas, J. P. Lehoczky, S. E. Shreve, and G. L. Xu, Martingale and duality methods for
utility maximization in an incomplete market, SIAM J. Control Optimization, 29 (1991), pp. 702–730.
[36] R. E. Kihlstrom and L. J. Mirman, Constant, increasing, and decreasing risk aversion with many
commodities, Rev. Econ. Studies, 48 (1981), pp. 271–280.

[37] M. Kijima and M. Ohnishi, Mean-risk analysis of risk aversion and wealth effects on optimal port-
folios with multiple investment opportunities, Ann. Oper. Res., 45 (1993), pp. 147–163.
[38] M. Kijima and M. Ohnishi, Portfolio selection problems via the bivariate characterization of
stochastic dominance relations., Math. Finance, 6 (1996), pp. 237–277.
[39] A. J. King, Asymmetric risk measures and tracking models for portfolio optimization under uncer-
tainty, Ann. Oper. Res., 45 (1993), pp. 165–177.
[40] A. J. King and D. L. Jensen, Linear-quadratic efficient frontiers for portfolio optimization., Appl.
Stochastic Models Data Anal., 8 (1992), pp. 195–207.
[41] H. Konno, Piecewise linear risk function and portfolio optimization, J. Oper. Res. Soc. Japan, 33
(1990), pp. 139–156.
[42] H. Konno, S. R. Pliska, and K.-I. Suzuki, Optimal portfolios with asymptotic criteria, Ann. Oper.
Res., 45 (1993), pp. 187–204.
[43] H. Konno and K.-I. Suzuki, A mean-variance-skewness portfolio optimization model., J. Oper. Res.
Soc. Japan, 38 (1995), pp. 173–187.
[44] H. Konno and H. Watanabe, Bond portfolio optimization problems and their applications to index
tracking: A partial optimization approach., J. Oper. Res. Soc. Japan, 39 (1996), pp. 295–306.
[45] A. Kraus and R. Litzenberger, Skewness preference and the valuation of risky assets, J. Fin., 21
(1976), pp. 1085–1094.
[46] Y. Kroll, H. Levy, and H. M. Markowitz, Mean-variance versus direct utility maximization, J.
Fin., 39 (1984), pp. 47–62.
[47] H. Levy, Stochastic dominance and expected utility: survey and analysis, Manag. Sci., 38 (1992),
pp. 555–593.
[48] H. Levy and H. M. Markowitz, Approximating expected utility by a function of mean and variance,
Amer. Econ. Rev., 69 (1979), pp. 308–317.
[49] Y. Li and W. T. Ziemba, Univariate and multivariate measures of risk aversion and risk premiums,
Ann. Oper. Res., 45 (1993), pp. 265–296.
[50] L. C. MacLean and K. Weldon, Estimating multivariate random effects without replication.,
Comm. Statist. Theory Methods, 25 (1996), pp. 1447–1469.
[51] H. M. Markowitz, Portfolio selection, J. Fin., 7 (1952), pp. 77–91.
[52] H. M. Markowitz, The utility of wealth, J. Political Econ., 60 (1952), pp. 151–158.
[53] H. M. Markowitz, The optimization of a quadratic function subject to linear constraints, Nav. Res.
Logist. Quarterly, 3 (1956), pp. 111–133.
[54] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investments, John Wiley, New York, 1959.
[55] H. M. Markowitz, Mean-Variance analysis in portfolio choice and capital markets, Basil Blackwell, 1987.
[56] R. C. Merton, Optimal consumption and portfolio rules in a continuous-time model, J. Econ. Theory,
3 (1971), pp. 373–413.
[57] R. C. Merton and P. A. Samuelson, Fallacy of the log-normal approximation to optimal portfolio
decision-making over many periods, J. Fin. Econ., 1 (1974), pp. 67–94.
[58] A. J. Morton and S. R. Pliska, Optimal portfolio management with fixed transaction costs., Math.
Finance, 5 (1995), pp. 337–356.
[59] J. Mossin, Optimal multiperiod portfolio policies, J. Bus., 41 (1968), pp. 215–229.
[60] J. M. Mulvey and H. Vladimirou, Stochastic network optimization models for investment planning,
Ann. Oper. Res., 20 (1989), pp. 187–217.
[61] M. Nakasato and K. Furukawa, On the number of securities which constitute an efficient portfolio,
Ann. Oper. Res., 45 (1993), pp. 333–347.
[62] The Sveriges Riksbank (Bank of Sweden) Prize in Economic Sciences in Memory of Alfred Nobel.
Press Release of The Royal Swedish Academy of Sciences, Oct. 16, 1990.
URL http://www.nobel.se/laureates/economy-1990-press.html.
[63] A. F. Perold, Large-scale portfolio optimization, Manag. Sci., 30 (1984), pp. 1143–1160.

[64] J. W. Pratt, Risk aversion in the small and in the large, Econometrica, 32 (1964), pp. 122–136.
[65] R. T. Rockafellar, Duality and optimality in multistage stochastic programming, Ann. Oper. Res.,
85 (1999), pp. 1–19.
[66] R. T. Rockafellar and R. J.-B. Wets, Generalized linear-quadratic problems of deterministic and
stochastic optimal control in discrete time, SIAM J. Control Optimization, 28 (1990), pp. 810–822.
[67] S. Ross, Some stronger measures of risk aversion in the small and in the large, Econometrica, 49
(1981), pp. 621–638.
[68] M. E. Rubinstein, A comparative statistics analysis of risk premiums, J. Bus., 12 (1973), pp. 605–
615.
[69] A. Ruszczyński, Decomposition methods in stochastic programming, Math. Programming, 79 (1997),
pp. 333–353.
[70] P. A. Samuelson, The fundamental approximation theorem of portfolio analysis in terms of means,
variances, and higher moments, Rev. Econ. Studies, 25 (1958), pp. 65–86.
[71] P. A. Samuelson, Lifetime portfolio selection by dynamic stochastic programming, Rev. Econ. Studies, 51 (1969),
pp. 239–246.
[72] W. F. Sharpe, Investment, Prentice-Hall, 1978.
[73] H. Shirakawa, Optimal consumption and portfolio selection with incomplete markets and upper and
lower bound constraints., Math. Finance, 4 (1994), pp. 1–24.
[74] M. C. Steinbach, Fast Recursive SQP Methods for Large-Scale Optimal Control Problems, Ph. D.
dissertation, University of Heidelberg, 1995.
[75] M. C. Steinbach, Structured interior point SQP methods in optimal control, Z. Angew. Math. Mech., 76 (1996),
pp. 59–62.
[76] M. C. Steinbach, Recursive direct optimization and successive refinement in multistage stochastic programs,
Preprint SC-98-27, ZIB, 1998.
[77] M. C. Steinbach, Recursive direct algorithms for multistage stochastic programs in financial engineering, in
Operations Research Proceedings 1998, P. Kall and H.-J. Lüthi, eds., Springer-Verlag, 1999, pp. 241–250.
[78] M. C. Steinbach, H. G. Bock, G. V. Kostin, and R. W. Longman, Mathematical optimization
in robotics: Towards automated high speed motion planning, Surv. Math. Ind., 7 (1998), pp. 303–340.
[79] S. Uryasev and R. T. Rockafellar, Optimization of Conditional Value-at-Risk, Research Report
99-4, University of Florida, 1999.
[80] J. von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, Princeton
University Press, 3rd ed., 1953.
[81] R. J.-B. Wets, Programming under uncertainty: The equivalent convex program, SIAM J. Appl.
Math., 14 (1984), pp. 89–105.
[82] S. A. Zenios, Financial optimization, Cambridge University Press, Cambridge, UK, 1992.
[83] S. A. Zenios and P. Kang, Mean-absolute deviation portfolio optimization for mortgage-backed
securities, Ann. Oper. Res., 45 (1993), pp. 433–450.
[84] W. T. Ziemba, Choosing investments when the returns have stable distributions, in Mathematical
programming in theory and practice, P. L. Hammer and G. Zoutendijk, eds., North-Holland, Ams-
terdam, 1974, pp. 443–482.
[85] W. T. Ziemba and J. M. Mulvey, eds., Worldwide asset and liability modeling, Cambridge Uni-
versity Press, Cambridge, UK, 1998.
