On a class of minimax stochastic programs
Alexander Shapiro∗ and Shabbir Ahmed†
August 12, 2003
Abstract
For a particular class of minimax stochastic programming models, we
show that the problem can be equivalently reformulated into a standard
stochastic programming problem. This permits the direct use of standard decomposition and sampling methods developed for stochastic programming. We also show that this class of minimax stochastic programs
subsumes a large family of mean-risk stochastic programs where risk is
measured in terms of deviations from a quantile.
Key words: worst case distribution, problem of moments, Lagrangian duality,
mean risk stochastic programs, deviation from a quantile.
1 Introduction
A wide variety of decision problems under uncertainty involves optimization of
an expectation functional. An abstract formulation for such stochastic programming problems is
\[
\min_{x \in X} \ \mathbb{E}_P[F(x,\omega)], \tag{1.1}
\]
where X ⊆ R^n is the set of feasible decisions, F : R^n × Ω → R is the objective
function and P is a probability measure (distribution) on the space Ω equipped
with a sigma algebra F. The stochastic program (1.1) has been studied in great
detail, and significant theoretical and computational progress has been achieved
(see, e.g., [17] and references therein).
In the stochastic program (1.1) the expectation is taken with respect to the
probability distribution P which is assumed to be known. However, in practical
applications, such a distribution is not known precisely, and has to be estimated
from data or constructed using subjective judgments. Often, the available information is insufficient to identify a unique distribution. In the absence of full
∗ Supported by the National Science Foundation under grant DMS-0073770.
† Supported by the National Science Foundation under grant DMI-0133943.
information on the underlying distribution, an alternative approach is as follows.
Suppose that a set P of possible probability distributions for the uncertain parameters is known; then it is natural to optimize the expectation functional (1.1)
corresponding to the “worst” distribution in P. This leads to the following
minimax stochastic program:
\[
\min_{x \in X} \ f(x) := \sup_{P \in \mathcal{P}} \mathbb{E}_P[F(x,\omega)]. \tag{1.2}
\]
Theoretical properties of minimax stochastic programs have been studied
in a number of publications. In that respect we can mention the pioneering works
of Žáčková [22] and Dupačová [3, 4]; for more recent publications see [19] and
references therein. These problems have also received considerable attention in
the context of bounding and approximating stochastic programs [1, 7, 9]. A
number of authors have proposed numerical methods for minimax stochastic
programs. Ermoliev, Gaivoronsky and Nedeva [5] proposed a method based on
the stochastic quasigradient algorithm and generalized linear programming. A
similar approach along with computational experience is reported in [6]. Breton
and El Hachem [2] developed algorithms based on bundle methods and subgradient optimization. Riis and Andersen [15] proposed a cutting plane algorithm.
Takriti and Ahmed [21] considered minimax stochastic programs with binary
decision variables arising in power auctioning applications, and developed a
branch-and-cut scheme. All of the above numerical methods require explicit
solution of the inner optimization problem sup_{P∈P} E_P[F(x, ω)] at the candidate
solution x in each iteration. Consequently, such approaches are inapplicable in
situations where the respective expectations cannot be computed numerically
because the set Ω, although finite, is prohibitively large, or is possibly infinite.
In this paper, we show that a fairly general class of minimax stochastic
programs can be equivalently reformulated into standard stochastic programs
(involving optimization of expectation functionals). This permits a direct application of powerful decomposition and sampling methods that have been developed for standard stochastic programs in order to solve large-scale minimax
stochastic programs. Furthermore, the considered class of minimax stochastic
programs is shown to subsume a large family of mean-risk stochastic programs,
where the risk is measured in terms of deviations from a quantile.
2 The problem of moments
In this section we discuss a variant of the problem of moments. This will provide
us with basic tools for the subsequent analysis of minimax stochastic programs.
Let us denote by X the (linear) space of all finite signed measures on (Ω, F).
We say that a measure µ ∈ X is nonnegative, and write µ ⪰ 0, if µ(A) ≥ 0 for
any A ∈ F. For two measures µ1, µ2 ∈ X we write µ2 ⪰ µ1 if µ2 − µ1 ⪰ 0.
That is, µ2 ⪰ µ1 if µ2(A) ≥ µ1(A) for any A ∈ F. It is said that µ ∈ X is
a probability measure if µ ⪰ 0 and µ(Ω) = 1. For given nonnegative measures
µ1, µ2 ∈ X consider the set
\[
M := \{\mu \in X : \mu_1 \preceq \mu \preceq \mu_2\}. \tag{2.1}
\]
Let ϕi (ω), i = 0, ..., q, be real valued measurable functions on (Ω, F) and bi ∈ R,
i = 1, ..., q, be given numbers. Consider the problem
\[
\begin{array}{ll}
\max_{P \in M} & \int_\Omega \varphi_0(\omega)\, dP(\omega) \\
\text{subject to} & \int_\Omega dP(\omega) = 1, \\
& \int_\Omega \varphi_i(\omega)\, dP(\omega) = b_i, \ i = 1, \ldots, r, \\
& \int_\Omega \varphi_i(\omega)\, dP(\omega) \le b_i, \ i = r+1, \ldots, q.
\end{array} \tag{2.2}
\]
In the above problem, the first constraint implies that the optimization is performed over probability measures, the next two constraints represent moment
restrictions, and the set M represents upper and lower bounds on the considered
measures. If the constraint P ∈ M is replaced by the constraint P ⪰ 0, then
the above problem (2.2) becomes the classical problem of moments (see, e.g.,
[12],[20] and references therein). As we shall see, however, the introduction of
lower and upper bounds on the considered measures makes the above problem
more suitable for an application to minimax stochastic programming.
We make the following assumptions throughout this section:
(A1) The functions ϕi(ω), i = 0, ..., q, are µ2-integrable, i.e.,
\[
\int_\Omega |\varphi_i(\omega)|\, d\mu_2(\omega) < \infty, \quad i = 0, \ldots, q.
\]
(A2) The feasible set of problem (2.2) is nonempty, and, moreover, there exists
a probability measure P ∗ ∈ M satisfying the equality constraints as well
as the inequality constraints as equalities, i.e.,
\[
\int_\Omega \varphi_i(\omega)\, dP^*(\omega) = b_i, \quad i = 1, \ldots, q.
\]
Assumption (A1) implies that ϕi(ω), i = 0, ..., q, are integrable with respect to
every measure P ∈ M, and hence problem (2.2) is well defined. By assumption
(A2), we can make the change of variables P = P∗ + µ and write problem (2.2)
in the form
\[
\begin{array}{ll}
\max_{\mu \in M^*} & \int_\Omega \varphi_0(\omega)\, dP^*(\omega) + \int_\Omega \varphi_0(\omega)\, d\mu(\omega) \\
\text{subject to} & \int_\Omega d\mu(\omega) = 0, \\
& \int_\Omega \varphi_i(\omega)\, d\mu(\omega) = 0, \ i = 1, \ldots, r, \\
& \int_\Omega \varphi_i(\omega)\, d\mu(\omega) \le 0, \ i = r+1, \ldots, q,
\end{array} \tag{2.3}
\]
where
\[
M^* := \{\mu \in X : \mu_1^* \preceq \mu \preceq \mu_2^*\} \tag{2.4}
\]
with µ∗1 := µ1 − P∗ and µ∗2 := µ2 − P∗.
The Lagrangian of problem (2.3) is
\[
L(\mu,\lambda) := \int_\Omega \varphi_0(\omega)\, dP^*(\omega) + \int_\Omega L_\lambda(\omega)\, d\mu(\omega), \tag{2.5}
\]
where
\[
L_\lambda(\omega) := \varphi_0(\omega) - \lambda_0 - \sum_{i=1}^{q} \lambda_i \varphi_i(\omega), \tag{2.6}
\]
and the (Lagrangian) dual of (2.3) is the following problem:
\[
\begin{array}{ll}
\min_{\lambda \in \mathbb{R}^{q+1}} & \psi(\lambda) := \sup_{\mu \in M^*} L(\mu,\lambda) \\
\text{subject to} & \lambda_i \ge 0, \ i = r+1, \ldots, q.
\end{array} \tag{2.7}
\]
It is straightforward to see that
\[
\psi(\lambda) = \int_\Omega \varphi_0(\omega)\, dP^*(\omega) + \int_\Omega [L_\lambda(\omega)]_+\, d\mu_2^*(\omega) - \int_\Omega [-L_\lambda(\omega)]_+\, d\mu_1^*(\omega), \tag{2.8}
\]
where [a]_+ := max{a, 0}.
By the standard theory of Lagrangian duality we have that the optimal value
of problem (2.3) is always less than or equal to the optimal value of its dual (2.7).
It is possible to give various regularity conditions (constraint qualifications)
ensuring that the optimal values of problem (2.3) and its dual (2.7) are equal to
each other, i.e., that there is no duality gap between problems (2.3) and (2.7).
For example, we have (by the theory of conjugate duality, [16]) that there is no
duality gap between (2.3) and (2.7), and the set of optimal solutions of the dual
problem is nonempty and bounded, iff the following assumption holds:
(A3) The optimal value of (2.2) is finite, and there exists a feasible solution to
(2.2) for all sufficiently small perturbations of the right hand sides of the
(equality and inequality) constraints.
We refer to [18] (and references therein) for a discussion of constraint
qualifications ensuring the “no duality gap” property in the problem of moments.
By the above discussion we have the following result.
Proposition 1 Suppose that the assumptions (A1)–(A3) hold. Then problems
(2.2) and (2.3) are equivalent and there is no duality gap between problem (2.3)
and its dual (2.7).
Remark 1 The preceding analysis simplifies considerably if the set Ω is finite,
say Ω := {ω1, ..., ωK}. Then a measure P ∈ X can be identified with a vector
p = (p1, ..., pK) ∈ R^K. We have, of course, that P ⪰ 0 iff pk ≥ 0, k = 1, ..., K.
The set M can be written in the form
\[
M = \{p \in \mathbb{R}^K : \mu_{1k} \le p_k \le \mu_{2k}, \ k = 1, \ldots, K\},
\]
for some numbers µ2k ≥ µ1k ≥ 0, k = 1, ..., K, and problems (2.2) and (2.3) become
linear programming problems. In that case the optimal values of problem
(2.2) (problem (2.3)) and its dual (2.7) are equal to each other by standard
linear programming duality without a need for constraint qualifications, and
assumption (A3) is superfluous.
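To make Remark 1 concrete, here is a minimal sketch (in Python, using scipy.optimize.linprog; all instance data are hypothetical) of solving the finite-support version of (2.2) as a linear program:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical finite-support data: K scenarios, one equality moment constraint.
K = 5
phi0 = np.array([1.0, 3.0, 2.0, 5.0, 4.0])  # phi_0(omega_k): objective values
phi1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])  # phi_1(omega_k): moment function
b1 = 2.0                                    # prescribed moment, E[phi_1] = b1
lo = np.full(K, 0.05)                       # mu_{1k}: lower bounds on p_k
hi = np.full(K, 0.50)                       # mu_{2k}: upper bounds on p_k

# linprog minimizes, so negate phi0 to maximize sum_k p_k * phi0(omega_k).
res = linprog(
    c=-phi0,
    A_eq=np.vstack([np.ones(K), phi1]),     # sum_k p_k = 1, sum_k p_k phi1_k = b1
    b_eq=np.array([1.0, b1]),
    bounds=list(zip(lo, hi)),
    method="highs",
)
print("worst-case distribution:", res.x)
print("optimal value of (2.2):", -res.fun)
```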
Let us now consider, further, a specific case of (2.2) where
\[
M := \{\mu \in X : (1-\varepsilon_1)P^* \preceq \mu \preceq (1+\varepsilon_2)P^*\}, \tag{2.9}
\]
i.e., µ1 = (1−ε1)P∗ and µ2 = (1+ε2)P∗, for some reference probability measure
P∗ satisfying assumption (A2) and numbers ε1 ∈ [0, 1], ε2 ≥ 0. In that case the
dual problem (2.7) takes the form:
\[
\begin{array}{ll}
\min_{\lambda \in \mathbb{R}^{q+1}} & \mathbb{E}_{P^*}\bigl\{\varphi_0(\omega) + \eta_{\varepsilon_1,\varepsilon_2}[L_\lambda(\omega)]\bigr\} \\
\text{subject to} & \lambda_i \ge 0, \ i = r+1, \ldots, q,
\end{array} \tag{2.10}
\]
where Lλ(ω) is defined in (2.6) and
\[
\eta_{\varepsilon_1,\varepsilon_2}[a] := \begin{cases} -\varepsilon_1 a, & \text{if } a \le 0, \\ \varepsilon_2 a, & \text{if } a > 0. \end{cases} \tag{2.11}
\]
Note that the function ηε1,ε2[·] is convex and piecewise linear, and Lλ(ω) is affine
in λ for every ω ∈ Ω. Consequently, the objective function of (2.10) is convex
in λ. Thus, the problem of moments (2.2) has been reformulated as a convex
stochastic programming problem (involving optimization of an expectation
functional) of the form (1.1).
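For illustration, the following sketch (in Python; the sample data and the moment function are hypothetical) evaluates the objective of the dual problem (2.10) as a sample average under P∗, using the piecewise linear function (2.11):

```python
import numpy as np

def eta(a, eps1, eps2):
    """The piecewise linear penalty eta_{eps1,eps2}[a] of (2.11)."""
    return np.where(a <= 0.0, -eps1 * a, eps2 * a)

def dual_objective(lam, phi0, phi, eps1, eps2):
    """Sample-average estimate of the objective of (2.10).

    phi0 : shape (N,), samples of phi_0(omega) under P*.
    phi  : shape (N, q+1), columns (1, phi_1(omega), ..., phi_q(omega)).
    lam  : shape (q+1,), multipliers (lambda_0, ..., lambda_q).
    """
    L = phi0 - phi @ lam  # L_lambda(omega) of (2.6)
    return np.mean(phi0 + eta(L, eps1, eps2))

# Hypothetical data: N samples of omega, with q = 1 and phi_1(omega) = omega.
rng = np.random.default_rng(0)
N = 10_000
omega = rng.normal(size=N)
phi0 = omega**2
phi = np.column_stack([np.ones(N), omega])
print(dual_objective(np.array([0.5, 0.0]), phi0, phi, eps1=0.2, eps2=0.3))
```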
3 A class of minimax stochastic programs
We consider a specific class of minimax stochastic programming problems of the
form
\[
\min_{x \in X} \ f(x), \tag{3.1}
\]
where f(x) is the optimal value of the problem:
\[
\begin{array}{ll}
\max_{P \in M} & \int_\Omega F(x,\omega)\, dP(\omega) \\
\text{subject to} & \int_\Omega dP(\omega) = 1, \\
& \int_\Omega \varphi_i(\omega)\, dP(\omega) = b_i, \ i = 1, \ldots, r, \\
& \int_\Omega \varphi_i(\omega)\, dP(\omega) \le b_i, \ i = r+1, \ldots, q,
\end{array} \tag{3.2}
\]
and M is defined as in (2.9). Of course, this is a particular form of the minimax
stochastic programming problem (1.2) with the set P formed by probability
measures P ∈ M satisfying the corresponding moment constraints.
We assume that the set X is nonempty and assumptions (A1)–(A3), of
section 2, hold for the functions ϕi (·), i = 1, ..., q, and ϕ0 (·) := F (x, ·) for all x ∈
X. By the analysis of section 2 (see Proposition 1 and dual formulation (2.10))
we have then that the minimax problem (3.1) is equivalent to the stochastic
programming problem:
\[
\begin{array}{ll}
\min_{(x,\lambda) \in \mathbb{R}^{n+q+1}} & \mathbb{E}_{P^*}[H(x,\lambda,\omega)] \\
\text{subject to} & x \in X \ \text{and} \ \lambda_i \ge 0, \ i = r+1, \ldots, q,
\end{array} \tag{3.3}
\]
where
\[
H(x,\lambda,\omega) := F(x,\omega) + \eta_{\varepsilon_1,\varepsilon_2}\Bigl[F(x,\omega) - \lambda_0 - \sum_{i=1}^{q} \lambda_i \varphi_i(\omega)\Bigr]. \tag{3.4}
\]
Note that by reformulating the minimax problem (3.1) into problem (3.3),
which is a standard stochastic program involving optimization of an expectation
functional, we avoid explicit solution of the inner maximization problem with
respect to the probability measures. The reformulation, however, introduces
q + 1 additional variables.
For problems with a prohibitively large (or possibly infinite) support Ω, a
simple but effective approach to attack (3.3) is the sample average approximation (SAA) method. The basic idea of this approach is to replace the expectation functional in the objective by a sample average function and to solve
the corresponding SAA problem. Depending on the structure of the objective
function F (x, ω) and hence H(x, λ, ω), a number of existing stochastic programming algorithms can be applied to solve the obtained SAA problem. Under mild
assumptions, the SAA method has been shown to have attractive convergence
properties. For example, a solution to the SAA problem quickly converges to a
solution to the true problem as the sample size N is increased. Furthermore, by
repeated solutions of the SAA problem, statistical confidence intervals on the
quality of the corresponding SAA solutions can be obtained. Detailed discussion
of the SAA method can be found in [17, Chapter 6] and references therein.
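As a rough illustration, the sketch below (in Python; the newsvendor-type objective F, the sampling distribution, and the choice X = R are all hypothetical, and the derivative-free solver is adequate only for this tiny two-variable example) forms and minimizes the sample average of H(x, λ0, ω) from (3.4) with q = 0:

```python
import numpy as np
from scipy.optimize import minimize

def eta(a, eps1, eps2):
    return np.where(a <= 0.0, -eps1 * a, eps2 * a)

# Hypothetical newsvendor-style objective F(x, omega) = c*x - r*min(x, omega),
# which is convex in x for r >= 0.
c, r = 1.0, 2.0
def F(x, omega):
    return c * x - r * np.minimum(x, omega)

rng = np.random.default_rng(1)
omega = rng.exponential(scale=10.0, size=2000)  # SAA sample drawn from P*
eps1, eps2 = 0.2, 0.2                           # q = 0, so lambda = (lambda_0,)

def saa_objective(z):
    x, lam0 = z
    Fx = F(x, omega)
    return np.mean(Fx + eta(Fx - lam0, eps1, eps2))  # sample average of (3.4)

res = minimize(saa_objective, x0=np.array([5.0, 0.0]), method="Nelder-Mead")
print("SAA solution (x, lambda_0):", res.x, " objective:", res.fun)
```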
3.1 Stochastic programs with convex objectives
In this section, we consider minimax stochastic programs (3.1) corresponding
to stochastic programs where the objective function is convex. Note that if the
function F (·, ω) is convex for every ω ∈ Ω, then the function f (·), defined as the
optimal value of (3.2), is given by the maximum of convex functions and hence
is convex. Not surprisingly, the reformulation preserves convexity.
Proposition 2 Suppose that the function F(·, ω) is convex for every ω ∈ Ω.
Then for any ε1 ∈ [0, 1] and ε2 ≥ 0 and every ω ∈ Ω, the function H(·, ·, ω) is
convex and
\[
\partial H(x,\lambda,\omega) = \begin{cases}
(1-\varepsilon_1)\,\partial F(x,\omega) \times \{\varepsilon_1 \varphi(\omega)\}, & \text{if } N(x,\lambda,\omega) < 0, \\
(1+\varepsilon_2)\,\partial F(x,\omega) \times \{-\varepsilon_2 \varphi(\omega)\}, & \text{if } N(x,\lambda,\omega) > 0, \\
\bigcup_{\tau \in [-\varepsilon_1,\varepsilon_2]} (1+\tau)\,\partial F(x,\omega) \times \{-\tau \varphi(\omega)\}, & \text{if } N(x,\lambda,\omega) = 0,
\end{cases} \tag{3.5}
\]
where the subdifferentials ∂H(x, λ, ω) and ∂F(x, ω) are taken with respect to
(x, λ) and x, respectively, and
\[
N(x,\lambda,\omega) := F(x,\omega) - \lambda_0 - \sum_{i=1}^{q} \lambda_i \varphi_i(\omega), \qquad \varphi(\omega) := (1, \varphi_1(\omega), \ldots, \varphi_q(\omega)).
\]
Proof. Consider the function ψ(z) := z + ηε1,ε2[z]. We can write
\[
H(x,\lambda,\omega) = \psi\bigl(V(x,\lambda,\omega)\bigr) + \lambda_0 + \sum_{i=1}^{q} \lambda_i \varphi_i(\omega),
\]
where V(x, λ, ω) := F(x, ω) − λ0 − Σ_{i=1}^{q} λi ϕi(ω). The function V(x, λ, ω) is
convex in (x, λ), and for ε1 ∈ [0, 1] and ε2 ≥ 0, the function ψ(z) is monotonically
nondecreasing and convex. Convexity of H(·, ·, ω) then follows. The
subdifferential formula (3.5) is obtained by the chain rule.
Let us now consider instances of (3.3) with a finite set of realizations of ω:
\[
\begin{array}{ll}
\min_{(x,\lambda) \in \mathbb{R}^{n+q+1}} & h(x,\lambda) := \sum_{k=1}^{K} p_k^*\, H(x,\lambda,\omega_k) \\
\text{subject to} & x \in X \ \text{and} \ \lambda_i \ge 0, \ i = r+1, \ldots, q,
\end{array} \tag{3.6}
\]
where Ω = {ω1, ..., ωK} and P∗ = (p∗1, ..., p∗K). The above problem can either
correspond to a problem with finite support of ω or may be obtained by sampling
as in the SAA method. Problem (3.6) has a nonsmooth convex objective
function, and often can be solved by using cutting plane or bundle type methods that use subgradient information (see, e.g., [8]). By the Moreau-Rockafellar
theorem we have that
\[
\partial h(x,\lambda) = \sum_{k=1}^{K} p_k^*\, \partial H(x,\lambda,\omega_k), \tag{3.7}
\]
where all subdifferentials are taken with respect to (x, λ). Together with (3.5)
this gives a formula for a subgradient of h(·, ·), given subgradient information
for F (·, ω).
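For instance, the following sketch (in Python, reusing the hypothetical newsvendor objective of the SAA sketch above) assembles a subgradient of h at a point (x, λ0) from (3.5) and (3.7) with q = 0; such subgradients are exactly what a cutting plane or bundle solver for (3.6) would consume:

```python
import numpy as np

def tau_of_N(Nval, eps1, eps2):
    """The factor tau in (3.5): tau = -eps1 if N < 0 and tau = eps2 if N > 0;
    at N = 0 any tau in [-eps1, eps2] yields a valid subgradient."""
    return np.where(Nval < 0.0, -eps1, eps2)

def subgrad_h(x, lam0, omega, p, eps1, eps2, c=1.0, r=2.0):
    Fx = c * x - r * np.minimum(x, omega)   # F(x, omega_k)
    gF = np.where(x < omega, c - r, c)      # a subgradient of F(., omega_k)
    tau = tau_of_N(Fx - lam0, eps1, eps2)
    gx = np.sum(p * (1.0 + tau) * gF)       # x-component, by (3.5) and (3.7)
    glam0 = np.sum(p * (-tau))              # lambda_0-component (phi(omega) = 1)
    return gx, glam0

p = np.full(4, 0.25)
omega = np.array([2.0, 4.0, 6.0, 8.0])
print(subgrad_h(5.0, 1.0, omega, p, eps1=0.2, eps2=0.3))
```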
3.2 Two-stage stochastic programs
A wide variety of stochastic programs correspond to optimization of the expected value of a future optimization problem. That is, let F (x, ω) be defined
as the optimal value function
\[
F(x,\omega) := \min_{y \in Y(x,\omega)} G_0(x,y,\omega), \tag{3.8}
\]
where
\[
Y(x,\omega) := \{y \in Y : G_i(x,y,\omega) \le 0, \ i = 1, \ldots, m\}, \tag{3.9}
\]
Y is a nonempty subset of a finite dimensional vector space and Gi (x, y, ω),
i = 0, ..., m, are real valued functions. Problem (1.1), with F(x, ω) given in
the form (3.8), is referred to as a two-stage stochastic program, where the
first-stage variables x are decided prior to the realization of the uncertain
parameters, and the second-stage variables y are decided after the uncertainties
are revealed. The following result shows that a minimax problem corresponding
to a two-stage stochastic program is itself a two-stage stochastic program.
Proposition 3 If F(x, ω) is defined as in (3.8), then the function H(x, λ, ω),
defined in (3.4), is given by
\[
H(x,\lambda,\omega) = \inf_{y \in Y(x,\omega)} G(x,\lambda,y,\omega), \tag{3.10}
\]
where
\[
G(x,\lambda,y,\omega) := G_0(x,y,\omega) + \eta_{\varepsilon_1,\varepsilon_2}\Bigl[G_0(x,y,\omega) - \lambda_0 - \sum_{i=1}^{q} \lambda_i \varphi_i(\omega)\Bigr]. \tag{3.11}
\]
Proof. The result follows by noting that
\[
G(x,\lambda,y,\omega) = \psi\Bigl(G_0(x,y,\omega) - \lambda_0 - \sum_{i=1}^{q} \lambda_i \varphi_i(\omega)\Bigr) + \lambda_0 + \sum_{i=1}^{q} \lambda_i \varphi_i(\omega),
\]
and the function ψ(z) := z + ηε1,ε2[z] is monotonically nondecreasing for ε1 ≤ 1
and ε2 ≥ 0.
By the above result, if the set Ω := {ω1, ..., ωK} is finite, then the reformulated
minimax problem (3.3) can be written as one large-scale optimization problem:
\[
\begin{array}{ll}
\min_{x,\lambda,y_1,\ldots,y_K} & \sum_{k=1}^{K} p_k^*\, G(x,\lambda,y_k,\omega_k) \\
\text{subject to} & y_k \in Y(x,\omega_k), \ k = 1, \ldots, K, \\
& x \in X, \ \lambda_i \ge 0, \ i = r+1, \ldots, q.
\end{array} \tag{3.12}
\]
A particularly important case of two-stage stochastic programs is the two-stage
stochastic (mixed-integer) linear program, where F(x, ω) := V(x, ξ(ω)) and
V(x, ξ) is given by the optimal value of the problem:
\[
\begin{array}{ll}
\min_{y} & c^T x + q^T y, \\
\text{subject to} & W y = h - T x, \ y \in Y.
\end{array} \tag{3.13}
\]
Here ξ := (q, W, h, T) represents the uncertain (random) parameters of problem
(3.13), and X and Y are defined by linear constraints (possibly with integrality
restrictions). By applying standard linear programming modelling principles to
the piecewise linear function ηε1,ε2, we obtain that H(x, λ, ξ(ω)) is given by the
optimal value of the problem:
\[
\begin{array}{ll}
\min_{y,u^+,u^-} & c^T x + q^T y + \varepsilon_1 u^- + \varepsilon_2 u^+ \\
\text{subject to} & W y = h - T x, \\
& u^+ - u^- = c^T x + q^T y - \varphi^T \lambda, \\
& y \in Y, \ u^+ \ge 0, \ u^- \ge 0,
\end{array} \tag{3.14}
\]
where ϕ := (1, ϕ1(ω), . . . , ϕq(ω))^T. As before, if the set Ω := {ω1, ..., ωK}
is finite, then the reformulated minimax problem (3.3) can be written as one
large-scale mixed-integer linear program:
\[
\begin{array}{ll}
\min_{x,\lambda,y,u^+,u^-} & c^T x + \sum_{k=1}^{K} p_k^* \bigl( q_k^T y_k + \varepsilon_1 u_k^- + \varepsilon_2 u_k^+ \bigr) \\
\text{subject to} & W_k y_k = h_k - T_k x, \ k = 1, \ldots, K, \\
& u_k^+ - u_k^- = c^T x + q_k^T y_k - \varphi_k^T \lambda, \ k = 1, \ldots, K, \\
& y_k \in Y, \ u_k^+ \ge 0, \ u_k^- \ge 0, \ k = 1, \ldots, K, \\
& x \in X.
\end{array} \tag{3.15}
\]
The optimization model stated above has a block-separable structure which can,
in principle, be exploited by existing decomposition algorithms for stochastic
(integer) programs. In particular, if Y does not have any integrality restrictions,
then the L-shaped (or Benders) decomposition algorithm and its variants can
be immediately applied (see, e.g., [17, Chapter 3]).
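As a small illustration, the sketch below (in Python, with scipy.optimize.linprog; the instance is hypothetical, with scalar x, q = 0, K = 2 scenarios, W_k = T_k = 1, and no integrality restrictions) assembles and solves the linear program (3.15) directly:

```python
import numpy as np
from scipy.optimize import linprog

K, c, eps1, eps2 = 2, 1.0, 0.2, 0.3
p = np.array([0.5, 0.5])   # reference probabilities p_k^*
qk = np.array([2.0, 0.5])  # scenario recourse costs q_k
hk = np.array([8.0, 3.0])  # scenario right-hand sides h_k

# Variable layout: [x, lambda_0, y_1, y_2, u+_1, u+_2, u-_1, u-_2].
cost = np.concatenate(([c, 0.0], p * qk, p * eps2, p * eps1))

A_eq, b_eq = [], []
for k in range(K):                 # W_k y_k + T_k x = h_k
    row = np.zeros(2 + 3 * K)
    row[0], row[2 + k] = 1.0, 1.0
    A_eq.append(row); b_eq.append(hk[k])
for k in range(K):                 # u+_k - u-_k - c x - q_k y_k + lambda_0 = 0
    row = np.zeros(2 + 3 * K)
    row[0], row[1], row[2 + k] = -c, 1.0, -qk[k]
    row[2 + K + k], row[2 + 2 * K + k] = 1.0, -1.0
    A_eq.append(row); b_eq.append(0.0)

bounds = [(0.0, 10.0), (None, None)] + [(0.0, None)] * (3 * K)  # X = [0, 10]
res = linprog(cost, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=bounds, method="highs")
print("x* =", res.x[0], " lambda_0* =", res.x[1], " value =", res.fun)
```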
4 Connection to a class of mean-risk models
Note that the stochastic program (1.1) is risk-neutral in the sense that it is
concerned with the optimization of an expectation objective. To extend the
stochastic programming framework to a risk-averse setting, one can adopt the
mean-risk framework advocated by Markowitz and further developed by many
others. In this setting the model (1.1) is extended to
\[
\min_{x \in X} \ \mathbb{E}[F(x,\omega)] + \gamma R[F(x,\omega)], \tag{4.1}
\]
where R[Z] is a dispersion statistic of the random variable Z used as a measure
of risk, and γ is a weighting parameter to trade off mean with risk. Classically,
the variance statistic has been used as the risk measure. However, it is known
that many typical dispersion statistics, including variance, may cause the
mean-risk model (4.1) to provide inferior solutions. That is, an optimal solution
to the mean-risk model may be stochastically dominated by another feasible solution.
Recently, Ogryczak and Ruszczyński [14] have identified a number of statistics
which, when used as the risk measure R[·] in (4.1), guarantee that the mean-risk
solutions are consistent with stochastic dominance theory. One such dispersion
statistic is
\[
h_\alpha[Z] := \mathbb{E}\bigl\{\alpha [Z - \kappa_\alpha]_+ + (1-\alpha)[\kappa_\alpha - Z]_+\bigr\}, \tag{4.2}
\]
where 0 ≤ α ≤ 1 and κα = κα(Z) denotes the α-quantile of the distribution
of Z. Recall that κα is said to be an α-quantile of the distribution of Z if
Pr(Z < κα) ≤ α ≤ Pr(Z ≤ κα), and the set of α-quantiles forms the interval
[a, b] with a := inf{z : Pr(Z ≤ z) ≥ α} and b := sup{z : Pr(Z ≥ z) ≥ 1 − α}. In
particular, if α = 1/2, then κα(Z) becomes the median of the distribution of Z
and
\[
h_{1/2}[Z] = \tfrac{1}{2}\,\mathbb{E}\bigl|Z - \kappa_{1/2}\bigr|,
\]
i.e., it is half of the mean absolute deviation from the median.
In [14], it is shown that mean-risk models (4.1), with R[·] := hα [·] and γ ∈
[0, 1], provide solutions that are consistent with stochastic dominance theory.
In the following, we show that minimax models (3.3) subsume such mean-risk
models (4.1).
Consider Lλ(ω), defined in (2.6), and α := ε2/(ε1 + ε2). Observe that, for
fixed λi, i = 1, ..., q, and ε1 > 0, ε2 > 0, a minimizer λ̄0 of EP∗ ηε1,ε2[Lλ(ω)],
over λ0 ∈ R, is given by an α-quantile of the distribution of the random variable
Z(ω) := ϕ0(ω) − Σ_{i=1}^{q} λi ϕi(ω) defined on the probability space (Ω, F, P∗). In
particular, if ε1 = ε2, then λ̄0 is the median of the distribution of Z. It follows
that if ε1 and ε2 are positive, then the minimum of the expectation in (3.3), with
respect to λ0 ∈ R, is attained at an α-quantile of the distribution of F(x, ω) −
Σ_{i=1}^{q} λi ϕi(ω) with respect to the probability measure P∗. In particular, if the
moment constraints are not present in (3.2), i.e., q = 0, then problem (3.3) can
be written as follows:
\[
\min_{x \in X} \ \mathbb{E}_{P^*}[F(x,\omega)] + (\varepsilon_1 + \varepsilon_2)\, h_\alpha[F(x,\omega)], \tag{4.3}
\]
where hα is defined as in (4.2). The above discussion leads to the following
result.
Proposition 4 The mean-risk model (4.1) with R[·] := hα [·] is equivalent to
the minimax model (3.3) with ε1 = γ(1 − α), ε2 = αγ and q = 0.
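The quantile argument behind Proposition 4 is easy to check numerically; the following sketch (in Python; the sample of Z is hypothetical) verifies that minimizing EP∗ ηε1,ε2[Z − λ0] over λ0 returns (ε1 + ε2)hα[Z] with α = ε2/(ε1 + ε2):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def eta(a, eps1, eps2):
    return np.where(a <= 0.0, -eps1 * a, eps2 * a)

def h_alpha(z, p, alpha):
    """The quantile-deviation statistic (4.2) for a discrete distribution (z, p)."""
    order = np.argsort(z)
    z, p = z[order], p[order]
    kappa = z[np.searchsorted(np.cumsum(p), alpha)]  # a (lower) alpha-quantile
    return np.sum(p * (alpha * np.maximum(z - kappa, 0.0)
                       + (1.0 - alpha) * np.maximum(kappa - z, 0.0)))

rng = np.random.default_rng(2)
z = rng.normal(size=9)        # hypothetical realizations of Z under P*
p = np.full(9, 1.0 / 9.0)
eps1, eps2 = 0.3, 0.6
alpha = eps2 / (eps1 + eps2)

g = lambda lam0: np.sum(p * eta(z - lam0, eps1, eps2))
best = minimize_scalar(g, bounds=(z.min(), z.max()), method="bounded")
print(best.fun, (eps1 + eps2) * h_alpha(z, p, alpha))  # agree up to tolerance
```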
The additional term (ε1 + ε2 )hα [F (x, ω)], which appears in (4.3), can be
interpreted as a regularization term. We conclude this section by discussing the
effect of such regularization.
Consider the case when the function F(·, ω) is convex and piecewise linear
for all ω ∈ Ω. This is the case, for example, when F(x, ω) is the value function
of the second-stage linear program (3.13). Consider the stochastic programming
problem (with respect to the reference probability distribution P ∗ ):
\[
\min_{x \in X} \ \mathbb{E}_{P^*}[F(x,\omega)], \tag{4.4}
\]
and the corresponding mean-risk or minimax model (4.3). Suppose that X is
polyhedral, the support Ω of ω is finite, and both problems (4.3) and (4.4)
have finite optimal solutions. Then from the discussion at the end of Section
3, the problems (4.3) and (4.4) can be stated as linear programs. Let S0 and
Sε1,ε2 denote the sets of optimal solutions of (4.4) and (4.3), respectively. Then
by standard theory of linear programming, we have that, for all ε1 > 0 and
ε2 > 0 sufficiently small, the inclusion Sε1 ,ε2 ⊂ S0 holds. Consequently, the term
(ε1 +ε2 )hα [F (x, ω)] has the effect of regularizing the solution set of the stochastic
program (4.4). We further illustrate this regularization with an example.
Example 1 Consider the function F(x, ω) := |ω − x|, x, ω ∈ R, with ω having
the reference distribution P∗(ω = −1) = p∗1 and P∗(ω = 1) = p∗2 for some
p∗1 > 0, p∗2 > 0, p∗1 + p∗2 = 1. We have then that
\[
\mathbb{E}_{P^*}[F(x,\omega)] = p_1^* |1 + x| + p_2^* |1 - x|.
\]
Let us discuss first the case where p∗1 = p∗2 = 1/2. Then the set S0 of optimal
solutions of the stochastic program (4.4) is given by the interval [−1, 1]. For
ε2 > ε1 and ε1 ∈ (0, 1), the corresponding α-quantile κα (F (x, ω)), with α :=
ε2 /(ε1 + ε2 ), is equal to the largest of the numbers |1 − x| and |1 + x|, and for
ε2 = ε1 the set of α-quantiles is given by the interval with the end points |1 − x|
and |1 + x|. It follows that, for ε2 ≥ ε1 , the mean-risk (or minimax) objective
function in problem (4.3):
\[
f(x) := \mathbb{E}_{P^*}[F(x,\omega)] + (\varepsilon_1 + \varepsilon_2)\, h_\alpha[F(x,\omega)],
\]
is given by
\[
f(x) = \begin{cases}
\tfrac12 (1-\varepsilon_1)|1-x| + \tfrac12 (1+\varepsilon_1)|1+x|, & \text{if } x \ge 0, \\
\tfrac12 (1+\varepsilon_1)|1-x| + \tfrac12 (1-\varepsilon_1)|1+x|, & \text{if } x < 0.
\end{cases}
\]
Consequently, Sε1 ,ε2 = {0}. Note that for x = 0, the random variable F (x, ω)
has minimal expected value and variance zero (with respect to the reference
distribution P ∗ ). Therefore it is not surprising that x = 0 is the unique optimal
solution of the mean-risk or minimax problem (4.3) for any ε1 ∈ (0, 1) and
ε2 > 0.
Suppose now that p∗2 > p∗1. In that case S0 = {1}. Suppose, further, that
ε1 ∈ (0, 1) and ε2 ≥ ε1, and hence α ≥ 1/2. Then for x ≥ 0 the corresponding
α-quantile κα(F(x, ω)) is equal to |1 − x| if α < p∗2, κα(F(x, ω)) = 1 + x if
α > p∗2, and κα(F(x, ω)) can be any point of the interval [|1 − x|, 1 + x] if α = p∗2.
Consequently, for α ≤ p∗2 and x ≥ 0,
\[
f(x) = (p_1^* + \varepsilon_2 p_1^*)(1 + x) + (p_2^* - \varepsilon_2 p_1^*)|1 - x|.
\]
It follows then that Sε1,ε2 = {1} if and only if p∗1 + ε2 p∗1 < p∗2 − ε2 p∗1. Since α ≤ p∗2
means that ε2 ≤ (p∗2/p∗1)ε1, we have that for ε2 in the interval [ε1, (p∗2/p∗1)ε1],
the set Sε1,ε2 coincides with S0 if and only if ε2 < (p∗2/p∗1 − 1)/2. For ε2 in this
interval we can view ε̄2 := (p∗2/p∗1 − 1)/2 as the breaking value of the parameter
ε2, i.e., for ε2 bigger than ε̄2 an optimal solution of the minimax problem moves
away from the optimal solution of the reference problem.
Suppose now that p∗2 > p∗1 and α ≥ p∗2. Then for x ≥ 0,
\[
f(x) = (p_1^* + \varepsilon_1 p_2^*)(1 + x) + (p_2^* - \varepsilon_1 p_2^*)|1 - x|.
\]
In that case the breaking value of ε1, for ε1 ≤ (p∗1/p∗2)ε2, is ε̄1 := (1 − p∗1/p∗2)/2.
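A numerical check of Example 1 (a Python sketch; the helper h_alpha is the hypothetical implementation of (4.2) used in the earlier sketch) confirms the breaking value: with p∗1 = 0.3, p∗2 = 0.7 and ε1 = 0.5 we get ε̄2 = (p∗2/p∗1 − 1)/2 = 2/3, and the minimizer jumps away from x = 1 once ε2 exceeds it:

```python
import numpy as np

def h_alpha(z, p, alpha):
    order = np.argsort(z)
    z, p = z[order], p[order]
    kappa = z[np.searchsorted(np.cumsum(p), alpha)]
    return np.sum(p * (alpha * np.maximum(z - kappa, 0.0)
                       + (1.0 - alpha) * np.maximum(kappa - z, 0.0)))

def f(x, p1, p2, eps1, eps2):
    alpha = eps2 / (eps1 + eps2)
    z = np.array([abs(-1.0 - x), abs(1.0 - x)])  # F(x, -1) and F(x, 1)
    p = np.array([p1, p2])
    return p @ z + (eps1 + eps2) * h_alpha(z, p, alpha)  # objective of (4.3)

p1, p2, eps1 = 0.3, 0.7, 0.5
xs = np.linspace(-2.0, 2.0, 4001)
for eps2 in (0.6, 0.8):  # breaking value is (p2/p1 - 1)/2 = 2/3
    vals = [f(x, p1, p2, eps1, eps2) for x in xs]
    print(f"eps2 = {eps2}: argmin of f is x = {xs[int(np.argmin(vals))]:.3f}")
# Expected: minimizer stays at x = 1 for eps2 < 2/3 and moves away above 2/3.
```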
5 Numerical results
In this section, we describe some numerical experiments with the proposed minimax stochastic programming model. We consider minimax extensions of two-stage stochastic linear programs with finite support of the random problem
parameters. We assume that q = 0 (i.e., that the moment constraints are not
present in the model) since, in this case, the minimax problems are equivalent
to mean-risk extensions of the stochastic programs, where risk is measured in
terms of quantile deviations.
Recall that, owing to the finiteness of the support, the minimax problems
reduce to the specially structured linear programs (3.15). We use an ℓ∞-trust-region
based decomposition algorithm for solving the resulting linear programs.
The method along with its theoretical convergence properties is described in
[11]. The algorithm has been implemented in ANSI C with the GNU Linear
Programming Kit (GLPK) [13] library routines to solve linear programming
subproblems. All computations have been carried out on a Linux workstation
with dual 2.4 GHz Intel Xeon processors and 2 GB RAM.
The stochastic linear programming test problems in our experiments are
derived from those used in [10]. We consider the problems LandS, gbd, 20term,
and storm. Data for these instances are available from the website:
http://www.cs.wisc.edu/∼swright/stochastic/sampling
These problems involve an extremely large number of scenarios (joint realizations
of the uncertain problem parameters). Consequently, for each problem, we
consider three instances, each with 1000 sampled scenarios. The reference
distribution P∗ for these instances corresponds to equal weights assigned to each
sampled scenario.
Recall that a minimax model with parameters ε1 and ε2 is equivalent to a
mean-risk model (involving quantile deviations) with parameters γ := ε1 + ε2
and α := ε2 /(ε1 + ε2 ). We consider α values of 0.5, 0.7, and 0.9, and ε1 values
of 0.1, 0.3, 0.5, 0.7, and 0.9. Note that the values of the parameters ε2 and γ
are uniquely determined by ε2 = αε1 /(1 − α) and γ = ε1 /(1 − α). Note also
that some combinations of ε1 and α are such that γ > 1, and consequently the
resulting solutions are not guaranteed to be consistent with stochastic dominance.
First, for each problem, the reference stochastic programming models (with
ε1 = ε2 = 0) corresponding to all three generated instances were solved. Next,
the minimax stochastic programming models for the various ε1 -α combinations
were solved for all instances. Various dispersion statistics corresponding to the
optimal solutions (from the different models) with respect to the reference distribution P ∗ were computed. Table 1 presents the results for the reference
stochastic program corresponding to the four problems. The first six rows of
the table display various cost statistics corresponding to the optimal solution
with respect to P∗. The presented data are averages over the three instances.
The terms “Abs Med-Dev,” “Abs Dev,” “Std Dev,” “Abs SemiDev,” and “Std
SemiDev” stand for the statistics mean absolute deviation from the median,
mean absolute deviation, standard deviation, absolute semi-deviation, and standard semi-deviation, respectively. The last two rows of the table display the average (over the three instances) number of iterations and CPU seconds required.
Tables 2-4, 5-7, 8-10, and 11-13 present the results for the problems LandS, gbd,
20term, and storm, respectively. Each table (in the set 2-13) corresponds to
a particular α value and each column in a table corresponds to a particular ε1
value. The statistics are organized in the rows as in Table 1.
For a fixed level of α, increasing ε1 corresponds to increasing the allowed perturbation of the reference distribution in the minimax model, and to increasing
the weight γ for the risk term in the mean-risk model. Consequently, we observe
from the tables that this leads to solutions with higher expected costs. We also
observe that the values of some of the dispersion statistics decrease, indicating a
reduction in risk. Similar behavior occurs upon increasing α for a fixed level
of ε1.
A surprising observation from the numerical results, is that the considered
problem instances are very robust with respect to perturbations of the reference
distribution P ∗ . Even with large perturbations of the reference distribution, the
perturbations of the objective values of the solutions are relatively small.
A final observation from the tables is the large variability of computational
effort across the various ε1-α combinations. This can be partly explained by the
regularizing nature of the minimax (or mean-risk) objective function discussed
in Section 4. For certain ε1-α combinations, the piecewise linear objective
function may become very sharp, resulting in faster convergence of the algorithm.
                 LandS        gbd           20term           storm
Expected cost    225.524231   1655.544680   254147.150217    15498557.910287
Abs Med-Dev      46.631711    502.017789    10022.597583     304941.126223
Abs Dev          46.950206    539.633584    10145.862901     313915.600392
Std Dev          59.263949    715.331904    12079.769991     371207.137372
Abs SemiDev      23.475103    269.816792    5072.931451      156957.800196
Std SemiDev      44.55075     605.012796    8824.368440      261756.118948
Iterations       47.333333    57.333333     275.333333       5000.000000
CPU seconds      0.666667     0.666667      32.333333        2309.333333

Table 1: Statistics corresponding to the reference stochastic program
                 ε1 = 0.1      ε1 = 0.3      ε1 = 0.5      ε1 = 0.7      ε1 = 0.9
Expected cost    225.572591    225.741331    225.992122    226.394508    226.950087
Abs Med-Dev      45.907760     45.031670     44.408435     43.743833     43.040004
Abs Dev          46.239633     45.384829     44.787747     44.151339     43.465447
Std Dev          58.277697     57.158371     56.408928     55.625918     54.838189
Abs SemiDev      23.119817     22.692415     22.393873     22.075670     21.732724
Std SemiDev      43.783344     42.972997     42.477392     41.969667     41.451160
Iterations       3357.333333   3357.000000   75.000000     70.000000     67.333333
CPU seconds      196.333333    195.333333    1.000000      1.000000      1.000000

Table 2: Problem: LandS, α = 0.5
                 ε1 = 0.1      ε1 = 0.3      ε1 = 0.5      ε1 = 0.7      ε1 = 0.9
Expected cost    225.604863    225.860693    226.308477    226.924267    227.732520
Abs Med-Dev      45.691352     44.759124     43.910281     43.143102     42.325219
Abs Dev          46.007275     45.108807     44.283083     43.536913     42.757779
Std Dev          57.941208     56.790168     55.787518     54.917470     54.011196
Abs SemiDev      23.003638     22.554403     22.141542     21.768456     21.378890
Std SemiDev      43.475330     42.691880     42.022589     41.445042     40.850585
Iterations       5000.000000   72.666667     64.666667     70.666667     68.000000
CPU seconds      293.000000    1.333333      1.000000      1.000000      1.000000

Table 3: Problem: LandS, α = 0.7
                 ε1 = 0.1      ε1 = 0.3      ε1 = 0.5      ε1 = 0.7      ε1 = 0.9
Expected cost    225.657561    226.231321    227.158529    228.239015    228.723862
Abs Med-Dev      45.442289     44.060225     42.928119     42.065348     41.752750
Abs Dev          45.767970     44.452844     43.357957     42.497211     42.175589
Std Dev          57.607866     55.952645     54.636028     53.623360     53.263650
Abs SemiDev      22.883985     22.226422     21.678979     21.248606     21.087795
Std SemiDev      43.208970     42.131097     41.267697     40.541459     40.277103
Iterations       65.666667     63.333333     59.666667     60.000000     1700.333333
CPU seconds      1.000000      1.000000      1.000000      1.000000      95.666667

Table 4: Problem: LandS, α = 0.9
                 ε1 = 0.1      ε1 = 0.3      ε1 = 0.5      ε1 = 0.7      ε1 = 0.9
Expected cost    1655.544680   1656.696416   1659.855302   1682.355861   1685.929064
Abs Med-Dev      502.017789    496.352624    488.853714    450.792925    446.110925
Abs Dev          539.633584    536.241586    531.229091    501.621083    498.632665
Std Dev          715.331904    711.500466    705.983292    678.924336    675.058785
Abs SemiDev      269.816792    268.120793    265.614546    250.810542    249.316332
Std SemiDev      605.012796    602.666522    599.863871    586.674962    583.868317
Iterations       79.000000     70.000000     70.666667     63.333333     66.000000
CPU seconds      1.000000      0.666667      0.333333      1.000000      1.000000

Table 5: Problem: gbd, α = 0.5
                 ε1 = 0.1      ε1 = 0.3      ε1 = 0.5      ε1 = 0.7      ε1 = 0.9
Expected cost    1655.570090   1657.065720   1663.667081   1685.507685   1771.068164
Abs Med-Dev      501.772028    496.543213    483.940329    452.686268    396.222865
Abs Dev          539.433816    536.230530    523.960900    504.791237    442.632554
Std Dev          715.147566    711.256157    702.313425    678.743484    619.951050
Abs SemiDev      269.716908    268.115265    261.980450    252.395618    221.316277
Std SemiDev      604.913488    602.289640    598.708725    584.678270    536.764625
Iterations       75.666667     71.333333     71.666667     63.333333     60.666667
CPU seconds      1.000000      0.333333      1.000000      0.666667      0.666667

Table 6: Problem: gbd, α = 0.7
                 ε1 = 0.1      ε1 = 0.3      ε1 = 0.5      ε1 = 0.7      ε1 = 0.9
Expected cost    1656.742634   1668.311115   1701.823473   2054.695770   2104.715572
Abs Med-Dev      497.768654    496.665323    493.446982    411.528582    416.847130
Abs Dev          536.887996    533.105354    531.749429    437.668526    434.673818
Std Dev          711.572070    703.357694    685.771387    523.260522    518.807426
Abs SemiDev      268.443998    266.552677    265.874714    218.834263    217.336909
Std SemiDev      601.247020    588.428289    568.798933    412.283174    401.524723
Iterations       82.333333     73.333333     68.000000     63.000000     65.333333
CPU seconds      1.000000      1.000000      1.000000      1.000000      0.666667

Table 7: Problem: gbd, α = 0.9
                 ε1 = 0.1        ε1 = 0.3        ε1 = 0.5        ε1 = 0.7        ε1 = 0.9
Expected cost    254156.095167   254172.391617   254272.736467   254338.015533   254368.068283
Abs Med-Dev      9865.177167     9780.642083     9547.653600     9424.043778     9387.429861
Abs Dev          10026.802482    9935.445188     9700.126819     9550.311715     9505.935648
Std Dev          11909.728572    11827.617116    11506.971142    11323.282363    11267.261816
Abs SemiDev      5013.401241     4967.722594     4850.063410     4775.155857     4752.967824
Std SemiDev      8654.320482     8590.418211     8289.983061     8106.878737     8049.794913
Iterations       284.333333      304.333333      277.000000      279.000000      273.333333
CPU seconds      35.666667       36.000000       32.666667       33.000000       32.666667

Table 8: Problem: 20term, α = 0.5

                 ε1 = 0.1        ε1 = 0.3        ε1 = 0.5        ε1 = 0.7        ε1 = 0.9
Expected cost    254157.838900   254329.222033   254545.403950   254655.101626   254670.894863
Abs Med-Dev      9850.254233     9437.472500     9220.598261     9120.799565     9114.212673
Abs Dev          10008.715887    9566.587326     9360.187746     9345.966887     9337.985153
Std Dev          11891.616364    11344.804887    11002.470097    10958.635626    10947.621409
Abs SemiDev      5004.357943     4783.293663     4680.093873     4672.983444     4668.992577
Std SemiDev      8637.923902     8129.071863     7767.960224     7738.486426     7726.880046
Iterations       314.000000      273.666667      281.333333      275.666667      1834.000000
CPU seconds      37.666667       32.333333       34.000000       32.666667       431.333333

Table 9: Problem: 20term, α = 0.7

                 ε1 = 0.1        ε1 = 0.3        ε1 = 0.5        ε1 = 0.7        ε1 = 0.9
Expected cost    254310.491039   254508.075023   254523.789306   255016.024362   255924.359895
Abs Med-Dev      9521.522703     9330.217798     9323.709859     9457.586872     9301.167534
Abs Dev          9649.482731     9469.459564     9461.571942     9534.906547     9353.787342
Std Dev          11414.001982    11115.029895    11104.118907    11187.286221    11114.273955
Abs SemiDev      4824.741366     4734.729782     4730.785971     4767.453273     4676.893671
Std SemiDev      8174.885367     7857.867726     7846.314786     7817.800413     7648.477100
Iterations       290.000000      311.333333      351.000000      443.000000      475.666667
CPU seconds      34.000000       36.666667       41.333333       53.000000       57.666667

Table 10: Problem: 20term, α = 0.9
                 ε1 = 0.1          ε1 = 0.3          ε1 = 0.5          ε1 = 0.7          ε1 = 0.9
Expected cost    15498566.363404   15498566.363404   15499088.268081   15499180.567248   15499502.750174
Abs Med-Dev      304768.076897     304768.076898     303407.515564     303257.959564     302867.196699
Abs Dev          313743.532720     313743.532725     312276.310249     312125.258689     311544.608977
Std Dev          371020.831673     371020.831676     369656.245604     369528.782376     369239.025550
Abs SemiDev      156871.766360     156871.766363     156138.155125     156062.629345     155772.304488
Std SemiDev      261623.460914     261623.460918     260734.509580     260641.989695     260124.766710
Iterations       5000.000000       1718.666667       80.666667         1719.666667       1713.333333
CPU seconds      2344.333333       785.000000        16.666667         802.333333        793.666667

Table 11: Problem: storm, α = 0.5

                 ε1 = 0.1          ε1 = 0.3          ε1 = 0.5          ε1 = 0.7          ε1 = 0.9
Expected cost    15498597.009358   15498912.463425   15499225.250463   15499412.008793   15501297.484860
Abs Med-Dev      304768.076896     304264.549632     303585.261410     303345.546078     303651.417922
Abs Dev          313693.250991     313144.675810     312532.808424     312284.406979     312788.985008
Std Dev          371019.219394     370329.421866     369731.473411     369444.593031     368414.762740
Abs SemiDev      156846.625495     156572.337905     156266.404212     156142.203490     156394.492504
Std SemiDev      261574.859566     260945.658373     260501.734375     260321.919740     259467.084176
Iterations       3359.666667       3358.000000       1718.333333       3358.666667       1713.333333
CPU seconds      1565.333333       1643.000000       807.333333        1574.000000       788.666667

Table 12: Problem: storm, α = 0.7

                 ε1 = 0.1          ε1 = 0.3          ε1 = 0.5          ε1 = 0.7          ε1 = 0.9
Expected cost    15498693.299136   15499682.020559   15507795.606679   15517578.176729   15522145.320553
Abs Med-Dev      305049.775029     304785.970204     302892.249844     302161.828558     304312.662029
Abs Dev          314177.306262     313993.856585     312331.074727     310811.051642     311205.223802
Std Dev          370960.125688     370450.826583     367610.550262     366045.144715     366481.769877
Abs SemiDev      157088.653131     156996.928293     156165.537364     155405.525821     155602.611901
Std SemiDev      261570.538153     260717.002761     256621.384854     254270.119605     253850.982098
Iterations       3360.000000       5000.000000       76.333333         3362.333333       1701.666667
CPU seconds      1610.333333       2361.000000       15.333333         1623.333333       822.666667

Table 13: Problem: storm, α = 0.9
References
[1] J. R. Birge and R. J.-B. Wets. Computing bounds for stochastic programming problems by means of a generalized moment problem. Mathematics
of Operations Research, 12(1):149–162, 1987.
[2] M. Breton and S. El Hachem. Algorithms for the solution of stochastic dynamic minimax problems. Computational Optimization and Applications,
4:317–345, 1995.
[3] J. Dupačová. Minimax stochastic programs with nonseparable penalties. In
Optimization techniques (Proc. Ninth IFIP Conf., Warsaw, 1979), Part 1,
volume 22 of Lecture Notes in Control and Information Sci., pages 157–163,
Berlin, 1980. Springer.
[4] J. Dupačová. The minimax approach to stochastic programming and an
illustrative application. Stochastics, 20:73–88, 1987.
[5] Y. Ermoliev, A. Gaivoronsky, and C. Nedeva. Stochastic optimization
problems with partially known distribution functions. SIAM Journal on
Control and Optimization, 23:697–716, 1985.
[6] A. A. Gaivoronski. A numerical method for solving stochastic programming
problems with moment constraints on a distribution function. Annals of
Operations Research, 31:347–370, 1991.
[7] H. Gassmann and W. T. Ziemba. A tight upper bound for the expectation of a convex function of a multivariate random variable. Mathematical
Programming Study, 27:39–53, 1986.
[8] J.-B. Hiriart-Urruty and C. Lemaréchal. Convex Analysis and Minimization
Algorithms II. Springer-Verlag, Berlin, 1996.
[9] P. Kall. An upper bound for SLP using first and total second moments.
Annals of Operations Research, 30:670–682, 1991.
[10] J. Linderoth, A. Shapiro, and S. Wright. The empirical behavior of sampling
methods for stochastic programming. Annals of Operations Research, to
appear.
[11] J. Linderoth and S. Wright. Decomposition algorithms for stochastic programming on a computational grid. Computational Optimization and Applications, 24:207–250, 2003.
[12] H. J. Landau (ed.). Moments in Mathematics, Proc. Sympos. Appl. Math.,
37, Amer. Math. Soc., Providence, RI, 1987.
[13] A. Makhorin. GNU Linear Programming Kit, Reference Manual, Version
3.2.3. http://www.gnu.org/software/glpk/glpk.html, 2002.
[14] W. Ogryczak and A. Ruszczyński. Dual stochastic dominance and related
mean-risk models. SIAM Journal on Optimization, 13:60–78, 2002.
[15] M. Riis and K. A. Andersen. Applying the minimax criterion in stochastic
recourse programs. Technical Report 2002/4, University of Aarhus, Department of Operations Research, Aarhus, Denmark, 2002.
[16] R. T. Rockafellar. Conjugate Duality and Optimization. Regional Conference Series in Applied Mathematics, SIAM, Philadelphia, PA, 1974.
[17] A. Ruszczyński and A. Shapiro, editors. Stochastic Programming, volume 10
of Handbooks in Operations Research and Management Science. North-Holland, 2003.
[18] A. Shapiro. On duality theory of conic linear problems. In Semi-Infinite
Programming, M.A. Goberna and M.A. López (eds), Kluwer Academic
Publishers, pages 135–165, 2001.
[19] A. Shapiro and A. Kleywegt. Minimax analysis of stochastic programs.
Optimization Methods and Software, 17:523–542, 2002.
[20] J.E. Smith. Generalized Chebychev inequalities: Theory and applications
in decision analysis. Operations Research, 43: 807-825, 1995.
[21] S. Takriti and S. Ahmed. Managing short-term electricity contracts under
uncertainty: A minimax approach. Technical report, School of Industrial &
Systems Engineering, Georgia Institute of Technology, Atlanta, GA, 2002.
[22] J. Žáčková. On minimax solutions of stochastic linear programming problems. Čas. Pěst. Mat., 91:423–430, 1966.