Module S-5/1
Svetlozar T. Rachev
Literature Recommendations
1) Jitka Dupačová, Jan Hurt and Josef Štěpán, Stochastic Modeling in Economics
and Finance, Kluwer Academic Publishers, 2002
1 Introduction
4.1.2 Parameters and Special Cases of the Stable Distribution
4.1.3 Properties of Stable Random Variables
4.1.4 Truncated α-Stable Distributions
4.2 Stable Modeling of Risk Factors
4.2.1 Modeling Financial Returns with Stable Distributions
4.3 Univariate and Multivariate Distributions
4.4 Fitting a Multivariate Distribution
4.5 Dependence Modeling and Copulas
5 ALM Implementation
5.1 Finding an adequate model
5.1.1 Exponentially Weighted Moving Average Models
5.1.2 VaR Backtesting
5.2 Solving the Optimization Problem
5.3 Efficient Frontiers
5.4 Postoptimality Analysis and Backtesting
5.4.1 Postoptimality Analysis
5.4.2 Portfolio Backtesting
Asset liability management (ALM) attempts to find the optimal investment strat-
egy under uncertainty in both the asset and liability streams. In the past, the two
sides of the balance sheet have usually been separated, but simultaneous consid-
eration of assets and liabilities can be very advantageous when they have common
risk factors. If assets are allocated such that they are highly correlated with the
liabilities, it is possible to reduce the risk of the entire portfolio.
Traditionally, banks and insurance companies used accrual accounting for es-
sentially all their assets and liabilities. They would take on liabilities, such as
deposits, life insurance policies or annuities. They would invest the proceeds from
these liabilities in assets such as loans, bonds or real estate. All assets and liabil-
ities were held at book value. Doing so disguised possible risks arising from how
the assets and liabilities were structured.
Two of the earlier ALM frameworks for constructing portfolios of fixed-income
securities are dedication and immunization. Basic dedication assumes that the
future liability payments are deterministic and finds an allocation such that bond
income is sufficient to cover the liability payments in each time period. Achieving
this type of cashflow matching in every period is likely to be costly, so traditional
immunization models match cashflows on average providing a cheaper, but usually
riskier, portfolio. The immunized portfolio is constructed by matching the present
values and interest rate sensitivities of the assets and liabilities, and it results in
an allocation that hedges against a small parallel shift in the term structure of
interest rates.
Consider the following simple example (taken from riskglossary.com): A bank
borrows USD 100 Mio at 3.00% for a year and lends the same money at 3.20%
to a highly-rated borrower for 5 years. For simplicity, we assume that all interest
rates are annually compounded and all interest accumulates to the maturity of the
respective obligations. The net transaction appears profitable, since the bank is
earning a 20 basis point spread. However, the transaction also entails considerable
risk.
At the end of a year, the bank will have to find new financing for the loan,
which will have 4 more years before it matures. If interest rates have risen, the
bank may have to pay a higher rate of interest on the new financing than the fixed
3.20% it is earning on its loan. Suppose, for example, that at the end of a year, the
applicable 4-year interest rate is 6.00%. The bank is in serious trouble. It is going
to be earning 3.20% on its loan and paying 6.00% on its financing.
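Accrual accounting does not recognize the problem. The book value of the
loan (the bank’s asset) is
\[ 100\,(1.032) = 103.2 \text{ Mio USD}, \]
and the book value of the financing (the bank’s liability) is
\[ 100\,(1.030) = 103.0 \text{ Mio USD}. \]
Based upon accrual accounting, the bank earned USD 200,000 in the first year.
However, market value accounting recognizes the bank’s predicament. The
respective market values of the bank’s asset and liability are
\[ \frac{100\,(1.032)^5}{(1.060)^4} = 92.72 \text{ Mio USD} \quad \text{and} \quad 100\,(1.030) = 103.0 \text{ Mio USD}. \]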
Hence, from a market-value accounting standpoint, the bank has lost USD
10.28 Mio. So which result offers a better portrayal of the bank’s situation, the
accrual accounting profit or the market-value accounting loss? The bank is in
trouble, and the market-value loss reflects this. Ultimately, accrual accounting
will recognize a similar loss. The bank will have to secure financing for the loan
at the new higher rate, so it will accrue the as-yet unrecognized loss over the 4
remaining years of the position.
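The arithmetic of the example can be retraced with a short script. This is only a sketch of the calculation described above; the variable names are chosen here for illustration, and all amounts are in USD Mio.

```python
# Accrual (book) value versus market value for the bank example.
principal = 100.0      # USD Mio, borrowed for 1 year and lent for 5 years
borrow_rate = 0.030    # 1-year financing rate
loan_rate = 0.032      # 5-year loan rate, interest accumulates to maturity
new_rate = 0.060       # applicable 4-year rate at the end of year one

# Book values after one year: interest accrued at the contractual rates.
book_asset = principal * (1 + loan_rate)
book_liability = principal * (1 + borrow_rate)
accrual_profit = book_asset - book_liability

# Market values after one year: the loan repays principal * (1 + loan_rate)**5
# in four years and is discounted at the new 4-year rate; the financing is due now.
market_asset = principal * (1 + loan_rate) ** 5 / (1 + new_rate) ** 4
market_liability = principal * (1 + borrow_rate)
market_loss = market_liability - market_asset

print(f"accrual profit:    {accrual_profit:.2f} Mio")
print(f"market-value loss: {market_loss:.2f} Mio")
```

The two printed figures correspond to the USD 200,000 accrual profit and the USD 10.28 Mio market-value loss quoted in the text.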
The problem in this example was caused by a mismatch between assets
and liabilities. Prior to the 1970’s, such mismatches tended not to be a sig-
nificant problem. Interest rates in developed countries experienced only modest
fluctuations, so losses due to asset-liability mismatches were small or trivial. Many
firms intentionally mismatched their balance sheets. Because yield curves were
generally upward sloping, banks could earn a spread by borrowing short and lending
long. But things started to change in the 1970s, which ushered in a period
of volatile interest rates that continued into the early 1980s. US regulation, which
had capped the interest rates that banks could pay depositors, was abandoned
to stem a migration overseas of the market for USD deposits. Managers of many
firms, who were accustomed to thinking in terms of accrual accounting, were slow
to recognize the emerging risk. Some firms suffered staggering losses. Because
the firms used accrual accounting, the result was not so much bankruptcies as
crippled balance sheets. Firms gradually accrued the losses over the subsequent 5
or 10 years.
One of the victims of the changing conditions was the US mutual life insurance
company the Equitable. During the early 1980s, the USD yield curve was inverted,
with short-term interest rates spiking into the high teens. The Equitable sold a
number of long-term guaranteed interest contracts (GICs) guaranteeing rates of
around 16% for periods up to 10 years. During this period, GICs were routinely
written for principal of USD 100 Mio or more. The Equitable invested the assets short-term
to earn the high interest rates guaranteed on the contracts. Short-term interest
rates soon came down. When the Equitable had to reinvest, it couldn’t get nearly
the interest rates it was paying on the GICs. The firm was crippled. Eventually,
it had to demutualize and was acquired by the Axa Group.
We conclude that the earlier framework of accrual accounting is inadequate
for ALM because it misses the stochastic nature of interest rates and liabilities
and the dynamic nature of investing. The two main tools that help to capture the
dynamic and stochastic characteristics are stochastic control and stochastic
programming. Stochastic control methods model uncertainty in a continuous-time
setting through Itô processes, but a drawback is that only a few driving
variables, or states, can be handled. Applications of stochastic control in ALM
include, for example, surplus optimization for pension funds and life insurance.
This lecture is organized as follows:
• Chapter two reviews major issues on risk and optimization. Properties of the
Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) risk measure
are covered.
$R = (r_1, \dots, r_n)'$. The risk associated with the portfolio return $r_p = \omega' R$ is given
by $\sigma_p^2 = \omega' \Sigma \omega$, where $\Sigma$ is the covariance matrix of $R$. While the variance of
the investment return is the most traditional risk measure, a common criticism is
that the variance penalizes both large gains and large losses. A modification to an
asymmetric risk measure that accounts only for large losses is the semivariance:
\[ E\,[\omega' E(R) - \omega' R]^+ . \]
\[ E\,[r^* - \omega' R]^+ . \]
\[ m_p = E\,|\omega' R - \omega' E(R)| , \]
Discussion of several other tail risk measures, including the Conditional Value
at Risk (CVaR), can be found in [1] and [2]. While CVaR is not widely used in finance,
it has properties that make it a very logical alternative to VaR. These properties
are referred to as coherence and will be described shortly.
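Define a random variable $T_\beta(x)$ on the $\beta$-tail of the loss $L(x)$ through the
distribution function:
\[
\Psi_{T_\beta}(x, \zeta) =
\begin{cases}
0, & \zeta < \mathrm{VaR}_\beta(x), \\
\dfrac{\Psi_L(x, \zeta) - \beta}{1 - \beta}, & \zeta \ge \mathrm{VaR}_\beta(x).
\end{cases}
\tag{2.1}
\]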
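For a given decision $x$, the Conditional Value at Risk at confidence level $\beta$ is the
mean of the tail random variable $T_\beta(x)$ with distribution function (2.1),
$\mathrm{CVaR}_\beta(x) = E[T_\beta(x)]$, and it satisfies
\[ E(L(x) \mid L(x) \ge \mathrm{VaR}_\beta(x)) \le \mathrm{CVaR}_\beta(x) \le E(L(x) \mid L(x) > \mathrm{VaR}_\beta(x)). \tag{2.2} \]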
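For a loss with finitely many values, the tail distribution (2.1) and the bounds (2.2) can be checked directly. The following is a minimal sketch, assuming distinct loss values and taking VaR as the smallest loss whose cumulative probability reaches $\beta$; the function names and the four-point example are illustrative and not part of the original notes.

```python
def var_cvar(losses, probs, beta):
    """VaR and CVaR of a discrete loss distribution with distinct atoms."""
    pairs = sorted(zip(losses, probs))
    cum = 0.0
    for k, (loss, p) in enumerate(pairs):
        cum += p
        if cum >= beta:        # first atom with Psi_L(loss) >= beta
            break
    var = pairs[k][0]
    # Tail distribution (2.1): the atom at VaR keeps mass (Psi_L(VaR) - beta)/(1 - beta),
    # every atom above VaR keeps mass p/(1 - beta).
    tail = [(var, (cum - beta) / (1 - beta))]
    tail += [(loss, p / (1 - beta)) for loss, p in pairs[k + 1:]]
    cvar = sum(loss * q for loss, q in tail)
    return var, cvar

def conditional_mean(losses, probs, keep):
    """E(L | keep(L)) for a discrete loss distribution."""
    mass = sum(p for l, p in zip(losses, probs) if keep(l))
    return sum(l * p for l, p in zip(losses, probs) if keep(l)) / mass

losses, probs, beta = [1.0, 2.0, 3.0, 4.0], [0.25] * 4, 0.7
var, cvar = var_cvar(losses, probs, beta)
lower = conditional_mean(losses, probs, lambda l: l >= var)   # E(L | L >= VaR)
upper = conditional_mean(losses, probs, lambda l: l > var)    # E(L | L > VaR)
print(var, lower, cvar, upper)
```

In this example the probability atom at VaR is split by $\beta$, which is why CVaR lies strictly between the two conditional expectations.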
2.2 Coherence
To help define a sensible risk measure, [3] introduces properties that are required of
a coherent risk measure; however, VaR does not satisfy these properties in general.
As is well known, VaR is not sub-additive: Examples have been constructed where
the VaR of the sum of two portfolios is greater than the sum of individual VaRs.
Lack of subadditivity is very undesirable because diversification is not promoted.
However, for the special class of elliptical distributions, VaR is sub-additive and
coherent [5].
The following properties of coherence are stated adhering to the axiomatic
definition in [1]. If V is the space of real-valued random variables, a risk measure
is a functional ρ : V −→ R. If the random variables v, v ′ ∈ V are thought of as
losses, then ρ is coherent if it is
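\[
\begin{aligned}
\min_{\omega} \quad & \sigma_p^2 = \omega' \Sigma \omega \\
\text{s.t.} \quad & \omega' \mu = \mu_0, \\
& \sum_{i=1}^{n} \omega_i = 1.
\end{aligned}
\tag{2.3}
\]
The above problem is easily solved with Lagrangian techniques; the solution
can be found in [6]. As $\mu_0$ is varied, the resulting portfolios trace out the mean-variance
efficient frontier. If no short-selling is allowed, the restriction $\omega_i \ge 0$ is
also included.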
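As an illustration of the Lagrangian approach, the first-order conditions of the equality-constrained problem form a linear system in the weights and the two multipliers. The sketch below solves that system in plain Python for the case with short-selling allowed; the covariance matrix, mean vector and helper names are made-up assumptions, and the closed-form treatment in [6] is not reproduced here.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def min_variance_weights(Sigma, mu, mu0):
    """Minimize w' Sigma w subject to w' mu = mu0 and sum(w) = 1."""
    n = len(mu)
    # Stationarity: 2 Sigma w - l1 mu - l2 1 = 0, followed by the two constraints.
    K = [[2 * Sigma[i][j] for j in range(n)] + [-mu[i], -1.0] for i in range(n)]
    K.append(list(mu) + [0.0, 0.0])
    K.append([1.0] * n + [0.0, 0.0])
    return solve(K, [0.0] * n + [mu0, 1.0])[:n]

def portfolio_variance(w, Sigma):
    return sum(w[i] * Sigma[i][j] * w[j] for i in range(len(w)) for j in range(len(w)))

# Hypothetical inputs for three assets.
Sigma = [[0.04, 0.01, 0.00], [0.01, 0.09, 0.02], [0.00, 0.02, 0.16]]
mu = [0.05, 0.08, 0.12]
weights = min_variance_weights(Sigma, mu, 0.08)
print(weights, portfolio_variance(weights, Sigma))
```

Repeating the call for a grid of target returns `mu0` gives points on the efficient frontier. With the no-short-selling restriction the first-order system no longer suffices and a quadratic programming solver would be needed.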
A drawback of optimization problem (2.3) is that it requires a large number
of parameters to be estimated. If there are n risky assets, the covariance matrix
consists of n(n + 1)/2 elements. For instance, if the universe of assets consists of
the S&P500, over 125,000 variances/covariances must be estimated. A solution,
as found in [37], is to model each asset with a multifactor equation:
\[ r_i = \mu_i + \sum_{j=1}^{k} \beta_{ij} F_j + \epsilon_i, \qquad i = 1, \dots, n, \tag{2.4} \]
where $F_j$ is the deviation of the random factor $j$ from its mean and $\mathrm{cov}(F_j, F_l) = 0$
for all $j \ne l$. Examples of typical factors include inflation, interest rates, and
GDP. The asset specific risks $\epsilon_i$ have zero expectation, are uncorrelated, and are
independent of the factors. The portfolio $r_p = \omega' R$ can be written as
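\[ r_p = \mu_p + \sum_{j=1}^{k} \beta_{pj} F_j + \epsilon_p, \]
where
\[ \mu_p = \omega' \mu, \qquad \beta_{pj} = \sum_{i=1}^{n} \omega_i \beta_{ij}, \qquad \epsilon_p = \sum_{i=1}^{n} \omega_i \epsilon_i. \]
It follows that the variance of the portfolio is
\[ \sigma_p^2 = \sum_{j=1}^{k} \beta_{pj}^2 \sigma_{F_j}^2 + \sum_{i=1}^{n} \omega_i^2 \sigma_{\epsilon_i}^2. \]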
The first term in the right-hand side of this equation is the systematic or market
risk, and the second term is the unsystematic risk of the portfolio. If equal weight
is given to each asset, ωi = 1/n, the unsystematic risk is bounded by c/n for some
constant c, so this risk can be diversified away as n grows large. Using the factor
model in the minimum variance optimization problem gives:
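\[
\begin{aligned}
\min \quad & \sigma_p^2 = \sum_{j=1}^{k} \beta_{pj}^2 \sigma_{F_j}^2 + \sum_{i=1}^{n} \omega_i^2 \sigma_{\epsilon_i}^2 \\
\text{s.t.} \quad & \omega' \mu = \mu_0, \\
& \beta_{pj} = \sum_{i=1}^{n} \omega_i \beta_{ij}, \\
& \sum_{i=1}^{n} \omega_i = 1.
\end{aligned}
\]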
The factor sensitivities βij , factor variances, and specific risk variances can be
estimated through linear regression in equation (2.4). This results in a significant
reduction in the number of parameter estimates needed as compared to optimiza-
tion problem (2.3).
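The saving in parameters and the decomposition into systematic and unsystematic risk can be made concrete with a small sketch. The factor loadings and variances below are invented for illustration, and the parameter count only covers the quantities named above (sensitivities, factor variances, specific variances), not the means.

```python
def factor_portfolio_variance(w, B, var_F, var_eps):
    """Return (systematic, unsystematic) parts of the portfolio variance."""
    n, k = len(w), len(var_F)
    beta_p = [sum(w[i] * B[i][j] for i in range(n)) for j in range(k)]
    systematic = sum(beta_p[j] ** 2 * var_F[j] for j in range(k))
    unsystematic = sum(w[i] ** 2 * var_eps[i] for i in range(n))
    return systematic, unsystematic

def n_parameters(n, k):
    """Second-moment parameters: full covariance matrix versus k-factor model."""
    full = n * (n + 1) // 2        # distinct variances and covariances
    factor = n * k + k + n         # sensitivities, factor variances, specific variances
    return full, factor

# Hypothetical three-asset, two-factor example.
B = [[1.0, 0.5], [0.8, -0.2], [1.2, 0.1]]   # beta_ij
var_F = [0.04, 0.01]                         # factor variances
var_eps = [0.02, 0.03, 0.01]                 # specific risk variances
w = [0.5, 0.3, 0.2]

systematic, unsystematic = factor_portfolio_variance(w, B, var_F, var_eps)
print(systematic, unsystematic)
print(n_parameters(500, 3))
```

For 500 assets and, say, three factors, the count drops from 125,250 covariance entries to about two thousand regression-based estimates.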
Both of the above are quadratic optimization problems. As an alternative
to mean-variance analysis, one can optimize the risk measures mentioned in the
previous section. Also in [37], the author illustrates that a linear optimization
problem can be achieved when the variance of the portfolio is replaced with its
mean-absolute deviation $m_p$. Since $R$ is multivariate normal, the relation
$m_p = \sqrt{2/\pi}\,\sigma_p$ holds, so minimizing the mean-absolute deviation will produce the
same optimal portfolio as minimizing the variance. In addition, the equivalent linear
program is easily modified to penalize upside and downside deviations from
the mean with different weights.
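The proportionality between mean-absolute deviation and standard deviation under normality can be checked by simulation. This is only a sanity check with arbitrary illustrative values for the portfolio mean and volatility; it relies on the fact that a linear portfolio of multivariate normal returns is itself normal.

```python
import math
import random

random.seed(0)
mu_p, sigma_p = 0.05, 0.20       # hypothetical portfolio mean and volatility

# Simulate normally distributed portfolio returns.
sample = [random.gauss(mu_p, sigma_p) for _ in range(200_000)]
mean = sum(sample) / len(sample)
mad = sum(abs(r - mean) for r in sample) / len(sample)

theory = math.sqrt(2.0 / math.pi) * sigma_p
print(f"simulated m_p = {mad:.4f}, sqrt(2/pi) * sigma_p = {theory:.4f}")
```

Because the ratio $m_p / \sigma_p$ is the same constant for every portfolio, ranking portfolios by either measure gives the same ordering in the normal case.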
The class of elliptical distributions offers special properties in portfolio theory
that are useful in minimizing VaR or CVaR. The following gives a very brief
review; a more complete introduction to elliptical distributions and their portfolio
implications is found in [5]. For any elliptically distributed random vector $R$ with
finite variance for all univariate marginals, variance is equivalent to any positive
homogeneous risk measure $\rho$. If $r_p = \omega' R$ and $\tilde r_p = \tilde\omega' R$ are two linear portfolios
with corresponding variances $\sigma_p^2$ and $\tilde\sigma_p^2$:
\[
\begin{aligned}
\text{s.t.} \quad & \omega' \mu = \mu_0, \\
& \sum_{i=1}^{n} \omega_i = 1.
\end{aligned}
\]
Details of stable portfolio theory are found in [30], and a comparison of allocations
under the normal and stable assumptions is found in [27].
One would like to be able to perform risk-return analysis for a portfolio by mini-
mizing VaR or CVaR subject to a constraint on the return for any distributional
assumption. In general, VaR is difficult to optimize and is usually not used in this
setting. Typically, one can model the returns with any distribution and then gen-
erate a discrete distribution of scenarios, but in this case, VaR is non-smooth and
non-convex in the portfolio positions with multiple local extrema [36]. CVaR, on
the other hand, has a representation that is easy to optimize both as a constraint
and as an objective for a set of scenarios. Additionally, if CVaR is constrained to
be small, VaR must necessarily be small. Conversely, minimization of VaR may
produce very different solutions than minimization of CVaR: VaR minimization
may stretch the tail of the distribution beyond VaR, resulting in a poor CVaR.
2.4.1 Uryasev’s Optimization Shortcut
As defined earlier, for the decision $x \in \mathbb{R}^n$, $L(x)$ is the random variable representing
the loss, or negative return, with associated $\mathrm{VaR}_\beta(x)$ and $\mathrm{CVaR}_\beta(x)$. To
begin, define the function
\[ \Gamma_\beta(x, \zeta) = \zeta + \frac{1}{1 - \beta}\, E\,[L(x) - \zeta]^+, \tag{2.5} \]
and, in addition,
produce the same efficient frontiers as λ is varied. As is already shown, the optimal
solution to (2.8) can be found by a joint convex optimization problem. Similarly,
problems (2.9) and (2.10) produce the same optimal solution as
\[ \min\; -R(x) \quad \text{subject to} \quad \Gamma_\beta(x, \zeta) \le \lambda, \]
An extension of these optimization procedures to risk shaping with CVaR is
found in [33]. For confidence level $\beta_i$ with corresponding loss tolerance $\lambda_i$, for
$i = 1, \dots, I$,
When $L(x)$ has a discrete distribution arising from, for example, a scenario
tree or sampling, equation (2.5) becomes
\[ \tilde\Gamma_\beta(x, \zeta) = \zeta + \frac{1}{1 - \beta} \sum_{i=1}^{S} p_i\,[L_i(x) - \zeta]^+, \tag{2.11} \]
where $L(x)$ takes the value $L_i(x)$ with probability $p_i$ for $i = 1, \dots, S$. Additionally,
if $L(x)$ is linear, then $\tilde\Gamma_\beta$ is convex and piecewise linear. By introducing auxiliary
variables, a CVaR optimization problem can be solved by linear programming as
illustrated in the next section.
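For a fixed decision $x$, the discrete function (2.11) can be evaluated and minimized over $\zeta$ by inspection. The sketch below uses the known result of Rockafellar and Uryasev that this minimum equals CVaR and is attained at VaR; because the function is convex and piecewise linear in $\zeta$ with kinks at the scenario losses, it suffices to try those values. The function names and the scenario losses are illustrative assumptions.

```python
def gamma_tilde(losses, probs, beta, zeta):
    """Discrete auxiliary function (2.11) for fixed scenario losses L_i(x)."""
    excess = sum(p * max(loss - zeta, 0.0) for loss, p in zip(losses, probs))
    return zeta + excess / (1.0 - beta)

def cvar_by_minimization(losses, probs, beta):
    """Minimize (2.11) over zeta by checking the kinks at the scenario losses."""
    zeta_star = min(losses, key=lambda z: gamma_tilde(losses, probs, beta, z))
    return zeta_star, gamma_tilde(losses, probs, beta, zeta_star)

# Four equally likely scenario losses for some fixed decision x.
losses, probs, beta = [1.0, 2.0, 3.0, 4.0], [0.25] * 4, 0.7
zeta_star, cvar = cvar_by_minimization(losses, probs, beta)
print(zeta_star, cvar)
```

No sorting of losses or explicit quantile computation is needed, which is what makes the representation convenient inside an optimization over $x$.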
Chapter 3
In this chapter we continue with the problem of portfolio optimization and introduce
stochastic programming as a solution technique.
negative return of the portfolio is given by
\[ L(x) = -x' R. \]
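where $\mu_0$ is the required portfolio return, and by varying $\mu_0$, the efficient frontier
is obtained. This optimization problem fits into the form of equation (2.8). If
the uncertainty in the return is given through the set of scenarios $\{R_1, \dots, R_S\}$
where each $R_s \in \mathbb{R}^n$ occurs with probability $p_s$, Uryasev’s optimization shortcut
produces the equivalent problem
\[
\begin{aligned}
\min \quad & \zeta + \frac{1}{1 - \beta} \sum_{s=1}^{S} p_s\,[-x' R_s - \zeta]^+ \\
\text{s.t.} \quad & x' \mu \ge \mu_0, \\
& x \in X, \ \zeta \in \mathbb{R},
\end{aligned}
\]
which, after introducing auxiliary variables $y_s$, becomes the linear program
\[
\begin{aligned}
\min \quad & \zeta + \frac{1}{1 - \beta} \sum_{s=1}^{S} p_s\, y_s \\
\text{s.t.} \quad & x' \mu \ge \mu_0, \\
& x' R_s + \zeta + y_s \ge 0, \quad s = 1, \dots, S, \\
& y_s \ge 0, \quad s = 1, \dots, S, \\
& x \in X, \ \zeta \in \mathbb{R}.
\end{aligned}
\]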
This program is used to compare hedging strategies for international asset allo-
cation in [36]. In addition, the CVaR portfolio is compared with a portfolio min-
imizing the mean-absolute deviation. The empirical results indicate that CVaR
and MAD produce similar risk-return frontiers in a static setting. However, in
dynamic backtesting where the models are repeatedly applied over a time horizon,
CVaR produces higher returns and lower volatility than MAD.
Here, $w_T$ is the terminal wealth, and the max is taken over all fixed-mix decision
rules. In a fixed-mix rule, the portfolio is reallocated in each time period to keep
a certain percentage of wealth in each asset. As $\lambda$ is varied between zero and one,
a type of efficient frontier is obtained. While the number of decision variables is
greatly reduced, the problem becomes non-convex, and a global search algorithm
is necessary.
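A fixed-mix rule is easy to simulate along a single scenario, which also shows where the non-convexity comes from: terminal wealth is a product of per-period growth factors, each linear in the mix, so it is a polynomial rather than a linear function of the decision. The sketch below assumes simple per-period asset returns and ignores transaction costs; the function name and numbers are illustrative.

```python
def fixed_mix_wealth(w0, mix, returns):
    """Terminal wealth when the portfolio is rebalanced to `mix` every period.

    mix:     target fractions of wealth per asset (summing to one)
    returns: one list of simple asset returns per period
    """
    wealth = w0
    for period in returns:
        growth = sum(frac * (1.0 + r) for frac, r in zip(mix, period))
        wealth *= growth
    return wealth

# Two assets over two periods along one hypothetical scenario.
scenario_returns = [[0.10, 0.02], [-0.05, 0.03]]
terminal = fixed_mix_wealth(100.0, [0.6, 0.4], scenario_returns)
print(terminal)
```

Evaluating this function over all scenarios for a candidate mix gives the distribution of $w_T$ that enters the objective, so a global search only has to explore the low-dimensional space of mixes.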
The coherence of a risk measure in a multi-period setting is also defined in
terms of a wealth process $w = (w_1, \dots, w_T)$ where $w_1$ is a known deterministic
wealth. It is shown in [14] that a weighted average of CVaR over the time horizon
is coherent: If $\mathrm{CVaR}_\beta(-w_t)$ is the CVaR associated with the negative wealth $-w_t$,
then a coherent risk measure is given by
\[ \rho(w) = \rho(w_1, \dots, w_T) = \sum_{t} \mu_t\, \mathrm{CVaR}_\beta(-w_t), \tag{3.2} \]
where the weights are nonnegative and sum to one. Here, coherence means that
ρ is
When implementing the risk measure in (3.2), one can apply Uryasev’s optimization
shortcut in the same manner as in the previous sections: Uryasev’s formula
can be applied to each $\mathrm{CVaR}_\beta(-w_t)$ where the loss $L$ is taken to be the negative
wealth $-w_t$, and the wealth in each stage is a function of some decision variables.
Of course, there will also be constraints such as the balance of wealth between
stages. This is illustrated in detail in the next chapter for the surplus wealth in
an ALM problem.
3.3 Formulation of the Stochastic Program
\[ l_2(\xi_1) \le x_2 \le u_2(\xi_1) \]
• $q_2(x_1, x_2, \xi_1)$ is a cost of decision $x_2$ for the given realization of the first stage
uncertainty $\xi_1$ and the given first stage decision $x_1$,
• $Q_2(x_1, x_2, \xi_1, \xi_2)$ is the cost of decision $x_2$ for given realizations of uncertainties
$\xi_1$ and $\xi_2$ and the given first stage decision $x_1$,
• $B_2(\xi_1)$ is the technology matrix that converts a first stage decision into
resources in the second stage, and
It is possible to remove the cost function $Q_2$ by including the second term of the
objective in the cost function $q_2$. The problem is said to have fixed recourse when
$A_2$ is independent of $\xi_1$. The subscripts indicate at which $t$ a value is known
except in the case of $\xi_t$. For instance, the realizations of $B_2$, $A_2$, and $b_2$ are all
known at $t = 2$, which is the beginning of the second stage, but $\xi_2$ is not realized
until after $t = 2$.
The full 2-stage recourse problem incorporates the second stage problem as
follows: With the optimal value of the second stage problem (3.3) denoted by
$Q_1(x_1, \xi_1)$, the 2-stage problem minimizes the sum of a first stage cost $q_1(x_1)$ and
the expected value of the second stage cost $E\,Q_1(x_1, \xi_1)$:
\[
\begin{aligned}
\min_{x_1} \quad & q_1(x_1) + E\,Q_1(x_1, \xi_1) \\
\text{s.t.} \quad & A_1 x_1 = b_1, \\
& l_1 \le x_1 \le u_1.
\end{aligned}
\tag{3.4}
\]
The first set of constraints in the above problem is referred to as the first stage
constraints. A good introduction to the various properties of 2-stage recourse
problems, such as feasibility, is found in [4].
An obvious criticism of the 2-stage model is that it only allows one recourse
decision to be made, not a sequence of decisions over the time horizon. A multistage
recourse program can provide a more realistic model, but it is more complex
and can often be very difficult to solve numerically. As in the 2-stage problem,
the initial vector of decisions $x_1$ is made before the first realization of uncertainty
$\xi_1$, and a second stage decision $x_2$ is then made based on $x_1$ and $\xi_1$. In the $T$-stage
problem, this process continues for the uncertainties $\xi_t$, $t = 1, \dots, T - 1$, and the
decision vectors $x_t$, $t = 1, \dots, T$. There is usually one additional realization of
uncertainty $\xi_T$ following the final decision $x_T$.
The $T$-stage recourse program can be defined recursively as an extension of the
2-stage program. Let the uncertainty up to and including stage $t$, for $t = 1, \dots, T$,
be denoted by $\xi^t = \{\xi_j,\ j = 1, \dots, t\}$, where each $\xi_j$ is the uncertainty realized in
stage $j$. Similarly, let the decisions up to and including stage $t$ be denoted by
$x^t = \{x_j,\ j = 1, \dots, t\}$, where each $x_j$ is the decision made for stage $j$. The first
stage problem is essentially the same as problem (3.4):
\[ l_1 \le x_1 \le u_1, \]
\[ \sum_{\tau=1}^{t} B_{t+1,\tau}(\xi^t)\, x_\tau + A_{t+1}(\xi^t)\, x_{t+1} = b_{t+1}(\xi^t), \tag{3.7} \]
To numerically solve the recourse problem (3.5)-(3.6), the distribution of $(\xi_1, \dots, \xi_T)$
is approximated by a set of scenarios usually organized in the form of a scenario
tree. Figure 3.1 contains an example of a small scenario tree similar to the
one that will be used in the 2-stage ALM problem discussed later. A first stage
optimal allocation is found in the node at $t = 1$, and optimal recourse allocations
are found in every node at $t = 2$. In the 2-stage problem, there is no additional
allocation decision made at the nodes at $t = 3$. The tree shown in the figure is
called balanced because each node at $t = 2$ is connected to two nodes at $t = 3$.
To describe the scenario tree, assume the nodes of the scenario tree are numbered
starting with the value of one at $t = 1$, and let $I_t$ be the number of nodes
up to and including those at $t$. Define the sets of indices $\mathcal{I}_t = \{I_{t-1} + 1, \dots, I_t\}$, for
$t = 2, \dots, T + 1$, with $I_1 = 1$. A scenario $s$, which is a path through the scenario
tree, is then represented by the set of indices $(i_2, \dots, i_{T+1})$ where $i_t \in \mathcal{I}_t$. Two
useful functions defined on the node indices are the predecessor, $\mathrm{pred}(\cdot)$, and the
descendant, $\mathrm{dec}(\cdot)$: $\mathrm{pred}(i_t)$ returns the node in $\mathcal{I}_{t-1}$ connected to $i_t$, and $\mathrm{dec}(i_t)$
returns the subset of nodes in $\mathcal{I}_{t+1}$ connected to $i_t$. At $t$, the probability of being
at node $i_t \in \mathcal{I}_t$ is denoted by $p(i_t)$ so that $\sum_{i_t \in \mathcal{I}_t} p(i_t) = 1$. Sometimes it is
more useful to use the transition probabilities $p(i_t, i_{t+1})$, for $i_{t+1} \in \mathrm{dec}(i_t)$, where
$\sum_{i_{t+1} \in \mathrm{dec}(i_t)} p(i_t, i_{t+1}) = 1$.
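The node numbering, the functions pred and dec, and the two kinds of probabilities can be represented with plain dictionaries. The following sketch assumes a small balanced tree with one node at $t = 1$, two at $t = 2$ and four at $t = 3$; the tree shape and the transition probabilities are made up for illustration.

```python
# Nodes are numbered 1 (root, t = 1), 2-3 (t = 2), 4-7 (t = 3).
pred = {2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}                 # pred(i): parent of node i
trans = {(1, 2): 0.5, (1, 3): 0.5,                           # transition probabilities
         (2, 4): 0.3, (2, 5): 0.7, (3, 6): 0.6, (3, 7): 0.4}

def dec(i):
    """Nodes in the next time step connected to node i."""
    return sorted(j for j, parent in pred.items() if parent == i)

def node_prob(i):
    """Unconditional probability p(i): product of transitions along the path from the root."""
    p = 1.0
    while i in pred:
        p *= trans[(pred[i], i)]
        i = pred[i]
    return p

def scenarios():
    """All scenarios as index tuples (i_2, ..., i_{T+1}), one per leaf."""
    paths = []
    for leaf in sorted(i for i in pred if not dec(i)):
        path = [leaf]
        while path[-1] in pred:
            path.append(pred[path[-1]])
        paths.append(tuple(reversed(path))[1:])              # drop the root node
    return paths

print(dec(2), scenarios())
print([round(node_prob(i), 2) for i in (4, 5, 6, 7)])
```

Storing only `pred` and the transition probabilities keeps the tree compact; node probabilities and scenario paths are derived on demand.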
Figure 3.1: A Scenario Tree.
The discrete and finite distribution of a scenario tree allows the stochastic recourse
problem to be written as a deterministic program. Once a scenario tree
is constructed, each node $i_t$ of the scenario tree determines values for $A_t(\xi^{t-1})$,
$B_t(\xi^{t-1})$, $b_t(\xi^{t-1})$, $l_t(\xi^{t-1})$, $u_t(\xi^{t-1})$, and $q_t(\cdot, \xi^{t-1})$, which are denoted by $A_{i_t}$, $B_{i_t}$,
$b_{i_t}$, $l_{i_t}$, $u_{i_t}$, and $q_{i_t}(\cdot)$. The recourse problem (3.5)-(3.6) can then be written as
\[ \text{s.t.} \quad A_1 x_1 = b_1, \tag{3.8} \]
\[ l_1 \le x_1 \le u_1, \]
\[ l_{i_t} \le x_t \le u_{i_t}, \]
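\[
\begin{aligned}
\min \quad & \langle q_1, x_1 \rangle + \sum_{i_2 \in \mathcal{I}_2} p(i_2) \langle q_{i_2}, x_{i_2} \rangle + \cdots + \sum_{i_T \in \mathcal{I}_T} p(i_T) \langle q_{i_T}, x_{i_T} \rangle \\
\text{subject to} \quad & A_1 x_1 = b_1, \\
& B_{i_2} x_1 + A_{i_2} x_{i_2} = b_{i_2}, \quad \forall i_2 \in \mathcal{I}_2, \\
& B_{i_3} x_{\mathrm{pred}(i_3)} + A_{i_3} x_{i_3} = b_{i_3}, \quad \forall i_3 \in \mathcal{I}_3, \\
& \qquad \vdots \\
& B_{i_T} x_{\mathrm{pred}(i_T)} + A_{i_T} x_{i_T} = b_{i_T}, \quad \forall i_T \in \mathcal{I}_T, \\
& l_{i_t} \le x_{i_t} \le u_{i_t}, \quad \forall i_t \in \mathcal{I}_t, \ t = 1, \dots, T.
\end{aligned}
\]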
This arborescent form implicitly includes the non-anticipatory constraints, i.e., that the
decision taken at $t$ does not depend on the uncertainty that is realized in the
future. Note that the decision vectors are $x_{i_t}$, $i_t \in \mathcal{I}_t$, $t = 1, \dots, T$, so there is one
decision for each node of the scenario tree except for those at $T + 1$.
The split-variable formulation is an equivalent form that lends itself to decom-
position and parallel implementation. If there are a total of S sample paths in
the scenario tree, S independent subproblems are created by allowing all decisions
to be scenario dependent. For the multistage case, the individual subproblem for
scenario s with nodes (i2 , ..., iT +1 ) is
plus any upper and lower bounds on $x_t^s$. When combining all subproblems into
one problem, non-anticipatory constraints must be explicitly considered in this
formulation: For any two scenarios $s$ and $s'$ with a common path up to and
including $t$, $x_j^s = x_j^{s'}$, for $j = 1, \dots, t$, must be enforced. Essentially this amounts
to a 0-1 matrix of coefficients. If $p_s$ is the probability of scenario $s$, the overall
split-variable representation for the multistage program is
\[ \min \sum_{s=1}^{S} p_s \left( \langle q_1, x_1^s \rangle + \langle q_{i_2}, x_2^s \rangle + \cdots + \langle q_{i_T}, x_T^s \rangle \right), \]
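The explicit non-anticipatory constraints of the split-variable form can be generated mechanically from the scenario paths. The sketch below is one possible bookkeeping, assuming each scenario is given as its node tuple $(i_2, \dots, i_{T+1})$ and that the decision at stage $t$ may depend only on $(i_2, \dots, i_t)$; it links consecutive scenarios within each group rather than all pairs, which enforces the same equalities with fewer rows. The function name and the example tree are illustrative.

```python
def nonanticipativity_constraints(scenarios, T):
    """List of (t, s, s2), meaning x_t^s = x_t^{s2} must hold."""
    constraints = []
    for t in range(1, T + 1):
        groups = {}
        for s, path in enumerate(scenarios):
            # History available when the stage-t decision is made: nodes i_2, ..., i_t.
            groups.setdefault(path[:t - 1], []).append(s)
        for members in groups.values():
            # Chain the members: s_0 = s_1, s_1 = s_2, ...
            constraints += [(t, a, b) for a, b in zip(members, members[1:])]
    return constraints

# Four scenarios of a balanced 2-stage tree, written as (i_2, i_3).
paths = [(2, 4), (2, 5), (3, 6), (3, 7)]
links = nonanticipativity_constraints(paths, T=2)
print(links)
```

Each triple corresponds to one block row with coefficients +1 and -1 on the two linked decision vectors, which is the sparse coupling structure that decomposition methods exploit.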
A specific ALM problem is now put into the form of a stochastic program with the
goal of finding the allocations over a time horizon in a set of assets that optimize
a tradeoff between risk and reward. The risk measure is a weighted average
of the CVaR of the negative surplus wealth at each stage, and the reward is the
expected final surplus wealth. Let the asset prices and liability price be denoted by
$s_t$ and $l_t$, respectively. There are $n$ assets available at each time giving $s_t \in \mathbb{R}^n$,
and there is just one liability stream giving $l_t \in \mathbb{R}$. For the $T$-stage problem,
$(s_t, l_t)$ are defined for $t = 1, \dots, T + 1$. The current prices known today are $(s_1, l_1)$,
so these are not random variables; however, $(s_t, l_t)$ is a random vector
with realizations in $\mathbb{R}^{n+1}$ known at time $t$ for $t = 2, \dots, T + 1$. The CVaR of interest
in stage $t$ is just the CVaR of the distribution of the surplus wealth at $t + 1$. For
instance, the stage 1 CVaR is determined by the distribution of surplus wealth
at $t = 2$, which depends on the allocation decision at $t = 1$. For this reason, the
CVaR of interest in stage $t$ is denoted as $\mathrm{CVaR}_\beta(-sw_{t+1})$ where $sw_t$ is the surplus
wealth at time $t$. The problem that will be solved can now be written as:
\[ \min \ \lambda \sum_{t=1}^{T} \mu_t\, \mathrm{CVaR}_\beta(-sw_{t+1}) - (1 - \lambda)\, E(sw_{T+1}) \tag{3.13} \]
\[ \text{s.t. an initial wealth constraint,} \tag{3.14} \]
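For the given scenario tree, Uryasev’s formula can now be applied to each CVaR:
\[ \mathrm{CVaR}_\beta(-sw_{t+1}) = \zeta_t + \frac{1}{1 - \beta} \sum_{i_{t+1} \in \mathcal{I}_{t+1}} p(i_{t+1}) \left[ l_{i_{t+1}} - \langle s_{i_{t+1}}, a_{\mathrm{pred}(i_{t+1})} \rangle - \zeta_t \right]^+, \]
where there is one auxiliary variable $\zeta_t$ introduced for each stage. To simplify
things, let
\[ h_{i_{t+1}}(\zeta_t, a_{\mathrm{pred}(i_{t+1})}) = l_{i_{t+1}} - \langle s_{i_{t+1}}, a_{\mathrm{pred}(i_{t+1})} \rangle - \zeta_t, \quad \text{and} \tag{3.19} \]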
An important task in ALM is the identification and adequate modeling of the underlying
risk factors. The dynamics of financial risk factors are well known to
exhibit some of the following phenomena: heavy tails, skewness and high-kurtotic
residuals. The recognition and description of these phenomena go back to
the seminal papers of Mandelbrot (1963) and Fama (1965). In this chapter we
introduce the α-stable distribution as an extension of the normal distribution.
Due to its summation stability and the fact that it generalizes the Gaussian
distribution, the class of α-stable distributions seems to be an ideal candidate to
describe the return distribution of the considered risk factors. For further description
of the stable distribution and its applications in
financial theory see Samorodnitsky et al. (1994) or Rachev and Mittnik (1999).
4.1 Stable Distributions
This section reviews some of the main features of the stable distribution as the
natural extension of the Gaussian distribution. The notion of stable distribution
was introduced in the 1920’s by P. Lévy. A stable distribution can be defined in
four equivalent ways, given in the following definitions: A random variable $X$
follows a stable distribution if for any positive numbers $A$ and $B$ there exists a
positive number $C$ and a real number $D$ such that
\[ A X_1 + B X_2 \stackrel{d}{=} C X + D, \tag{4.1} \]
where $X_1$ and $X_2$ are independent copies of $X$ and “$\stackrel{d}{=}$” denotes equality in distribution.
Therefore, a distribution $f$ is stable if it is invariant under convolution, i.e., if
there exist real constants $C > 0$ and $D$ such that
\[ f_{(AX_1 + d_1) + (BX_2 + d_2)}(s) := \int_{-\infty}^{+\infty} f(A(s - l) + d_1)\, f(Bl + d_2)\, dl = f(Cs + D) \tag{4.2} \]
\[ C^\alpha = A^\alpha + B^\alpha \tag{4.3} \]
where $X_1, X_2, \dots, X_n$ are independent copies of $X$. Again, the number $C_n$ and the
stability index of the distribution are closely linked and we get $C_n = n^{1/\alpha}$, where
$\alpha \in (0, 2]$ is the same as in equation (4.3).
The third definition of a stable distribution is a generalisation of the central
limit theorem. Stable distributions are in fact the only distributions that can be
obtained as limits of normalized sums of iid random variables. A random variable
$X$ is said to be stable if it has a domain of attraction, i.e., if there is a sequence
of iid random variables $Y_1, Y_2, \dots$ and sequences of positive numbers $\{d_n\}$ and real
numbers $\{c_n\}$, such that
\[ \frac{Y_1 + Y_2 + \cdots + Y_n}{d_n} + c_n \stackrel{d}{\Rightarrow} X. \tag{4.5} \]
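Finally, the last equivalent way to define a stable random variable provides
information about its characteristic function. A random variable $X$ has a stable
distribution if there are parameters $0 < \alpha \le 2$, $\sigma \ge 0$, $-1 \le \beta \le 1$, and $\mu$ real
such that its characteristic function has the following form:
\[
E(e^{iXt}) =
\begin{cases}
\exp\left( -\sigma^\alpha |t|^\alpha \left[ 1 - i \beta\, \mathrm{sign}(t) \tan\frac{\pi\alpha}{2} \right] + i \mu t \right), & \text{if } \alpha \ne 1, \\
\exp\left( -\sigma |t| \left[ 1 + i \beta \frac{2}{\pi}\, \mathrm{sign}(t) \ln |t| \right] + i \mu t \right), & \text{if } \alpha = 1.
\end{cases}
\tag{4.6}
\]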
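Since the characteristic function is the one object that is always available in closed form, it is convenient to have it as a function. The sketch below implements (4.6) with the standard library; the function name and argument order are choices made here. Two well-known special cases serve as checks: $\alpha = 2$ gives $\exp(-\sigma^2 t^2)$, a Gaussian with variance $2\sigma^2$, and $\alpha = 1$, $\beta = 0$ gives the Cauchy form $\exp(-\sigma |t|)$.

```python
import cmath
import math

def stable_cf(t, alpha, sigma, beta, mu):
    """Characteristic function (4.6) of a stable law with the given parameters."""
    if t == 0:
        return 1.0 + 0.0j
    sign = 1.0 if t > 0 else -1.0
    if alpha != 1:
        skew = 1 - 1j * beta * sign * math.tan(math.pi * alpha / 2)
        log_cf = -(sigma * abs(t)) ** alpha * skew + 1j * mu * t
    else:
        skew = 1 + 1j * beta * (2 / math.pi) * sign * math.log(abs(t))
        log_cf = -sigma * abs(t) * skew + 1j * mu * t
    return cmath.exp(log_cf)

t = 0.7
gaussian_case = stable_cf(t, alpha=2.0, sigma=1.5, beta=0.0, mu=0.0)
cauchy_case = stable_cf(t, alpha=1.0, sigma=1.5, beta=0.0, mu=0.0)
skewed_case = stable_cf(t, alpha=1.2, sigma=1.0, beta=-0.5, mu=0.1)
print(gaussian_case, cauchy_case, skewed_case)
```

For $\beta = 0$ and $\mu = 0$ the returned value is real, in line with the symmetry of the distribution.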
Definition 4.1.1 implies definition 4.1.1, which can be shown in the following way:
Let $\alpha \ne 1$ and let $X_1, X_2, \dots, X_n$ be independent copies of the stable random variable
$X$. Thus, we can write
\[ X_1 + X_2 + \cdots + X_n \stackrel{d}{=} c_n X + d_n. \]
4.1.2 Parameters and Special Cases of the Stable Distribution
\[ X \sim S_\alpha(\sigma, \beta, \mu) \]
where $\alpha$ is the so-called index of stability ($0 < \alpha \le 2$). The lower the value of $\alpha$,
the more leptokurtic the distribution is. This can be considered a very attractive
property for modeling financial asset returns. In empirical studies, the value of
$\alpha$ for asset returns is often found to lie between 1 and 2. For $\alpha > 1$, the location
parameter $\mu$ is the mean of the distribution. Figure 4.1 shows the probability
density function for symmetric alpha-stable random variables for different values
of $\alpha$.
Figure 4.1: Probability density functions for standard symmetric α-stable random
variables, α = 2, α = 1 (dotted) and α = 0.5 (dashed).
The second parameter $\beta$ is the skewness parameter ($-1 \le \beta \le 1$). A stable
distribution with $\beta = \mu = 0$ is called a symmetric α-stable distribution (SαS). If
$\beta < 0$, the distribution is skewed to the left; if $\beta > 0$, the distribution is skewed to
the right. We conclude that the stable distribution can also capture asymmetric
asset returns.
The parameter $\sigma$ is the scale parameter ($\sigma \ge 0$) and $\mu$ is the location (or shift) parameter ($\mu \in \mathbb{R}$).
Figure 4.2: Probability density functions for stable random variables with α = 1.2,
β varying, β = 0, β = −0.5 (dashed) and β = −1 (dotted).
Figure 4.2 shows the probability density function for some skewed alpha-stable
random variables with α = 1.2.
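Generally the probability density function of a stable distribution cannot be
specified in explicit form. However, there are three special cases where this is
possible: the Gaussian distribution ($\alpha = 2$), the Cauchy distribution ($\alpha = 1$, $\beta = 0$)
and the Lévy distribution ($\alpha = 1/2$, $\beta = 1$). The Cauchy distribution $S_1(\sigma, 0, \mu)$ has density
\[ f_1(x) = \frac{\sigma}{\pi((x - \mu)^2 + \sigma^2)} \tag{4.7} \]
and, for $\mu = 0$, distribution function
\[ P(X \le x) = 0.5 + \frac{1}{\pi} \arctan\frac{x}{\sigma}. \tag{4.8} \]
The Lévy distribution $S_{1/2}(\sigma, 1, \mu)$ has density
\[ \left( \frac{\sigma}{2\pi} \right)^{1/2} \frac{1}{(x - \mu)^{3/2}} \exp\left( -\frac{\sigma}{2(x - \mu)} \right) \tag{4.9} \]
and is concentrated on $(\mu, \infty)$.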
In this section we summarize some properties of stable distributions that are
useful in modeling financial data or in simulation.
The first property mentioned is the so-called summation stability. Let $X_1$, $X_2$
be independent random variables with $X_i \sim S_\alpha(\sigma_i, \beta_i, \mu_i)$, $i = 1, 2$. Then
$X_1 + X_2 \sim S_\alpha(\sigma, \beta, \mu)$, with
\[ \sigma = (\sigma_1^\alpha + \sigma_2^\alpha)^{1/\alpha}, \qquad \beta = \frac{\beta_1 \sigma_1^\alpha + \beta_2 \sigma_2^\alpha}{\sigma_1^\alpha + \sigma_2^\alpha}, \qquad \mu = \mu_1 + \mu_2. \tag{4.10} \]
For the proof we refer to Samorodnitsky et al. (1994). Thus, the sum of two independent
alpha-stable distributed random variables with the same index α is also alpha-stable
with the same index of stability α.
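The parameter update in (4.10) is simple enough to wrap in a helper, which is handy when aggregating independent stable risk factors. The sketch below takes parameter triples in the order $(\sigma, \beta, \mu)$; the function name and the sample values are illustrative.

```python
def stable_sum_params(alpha, params1, params2):
    """Parameters (sigma, beta, mu) of X1 + X2 according to (4.10).

    X1 and X2 are assumed independent with the same index alpha;
    params1 and params2 are (sigma, beta, mu) triples.
    """
    (s1, b1, m1), (s2, b2, m2) = params1, params2
    scale_sum = s1 ** alpha + s2 ** alpha
    sigma = scale_sum ** (1.0 / alpha)
    beta = (b1 * s1 ** alpha + b2 * s2 ** alpha) / scale_sum
    return sigma, beta, m1 + m2

# alpha = 2: scales combine like Gaussian standard deviations.
gauss_sum = stable_sum_params(2.0, (3.0, 0.0, 1.0), (4.0, 0.0, 2.0))
# alpha = 1: scales simply add.
cauchy_sum = stable_sum_params(1.0, (1.0, 0.0, 0.0), (2.0, 0.0, 0.0))
# Equal scales with opposite skewness give a symmetric sum.
mixed_sum = stable_sum_params(1.5, (1.0, 1.0, 0.0), (1.0, -1.0, 0.0))
print(gauss_sum, cauchy_sum, mixed_sum)
```

The skewness of the sum is a scale-weighted average of the individual skewness parameters, so it always stays within $[-1, 1]$.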
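The second proposition concerns the parameter $\sigma$. The Gaussian distribution
can be scaled by multiplication with a constant. This property extends to
$0 < \alpha \le 2$.
Let $X \sim S_\alpha(\sigma, \beta, \mu)$ and let $a \in \mathbb{R} \setminus \{0\}$. Then, for $\alpha \ne 1$,
\[ aX \sim S_\alpha(|a| \sigma, \ \mathrm{sign}(a) \beta, \ a \mu). \]
The parameter $\sigma$ is therefore often called the scale parameter. The proof of
4.1.3 can easily be done by using the characteristic function of stable distributions:
\[
\begin{aligned}
\ln E e^{it(aX)} &= -\sigma^\alpha |ta|^\alpha \left( 1 - i \beta\, \mathrm{sign}(ta) \tan\frac{\pi\alpha}{2} \right) + i \mu (ta) \\
&= -(\sigma |a|)^\alpha |t|^\alpha \left( 1 - i \beta\, \mathrm{sign}(a)\, \mathrm{sign}(t) \tan\frac{\pi\alpha}{2} \right) + i (\mu a) t.
\end{aligned}
\]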
The third proposition concerns the shift parameter $\mu$. It was already discussed
that in the case of $\alpha = 2$ the parameter $\mu$ is a shift parameter for the Gaussian
distribution. The same can be inferred about $\mu$ for any admissible $\alpha$. Let
$X \sim S_\alpha(\sigma, \beta, \mu)$ and let $a$ be a real constant. Then $X + a \sim S_\alpha(\sigma, \beta, \mu + a)$.
This follows directly by interpreting $a$ as an $S_\alpha(0, 0, a)$ stable random variable and
applying the summation stability proposition. For $1 < \alpha \le 2$, the shift parameter
$\mu$ equals the mean.
Finally, we can also interpret the last parameter β. It can be identified as a
skewness parameter. X ∼ Sα (σ, β, µ) is symmetric if and only if β = 0 and µ = 0.
It is symmetric about µ if and only if β = 0. We can prove this by the fact that
a random variable is symmetric if and only if its characteristic function is real.
By definition 4.1.1 this is the case if and only if β = 0 and µ = 0. The second
statement follows from property 4.1.3. In order to indicate that X is symmetric,
i.e. β = 0 and µ = 0, we write
X ∼ SαS
Despite these advantages the stable distribution so far is only rarely used in prac-
tical implementations. A major reason for the limited use of stable distributions
in applied work is that there are in general no closed-form expressions for its
probability density function. Numerical approximations are nontrivial and
computationally demanding. Another shortcoming in applications is that, for α < 2, all
moments of order ≥ α are infinite. Therefore, for some applications, e.g. GARCH
models with conditions on the innovations such as E(εt ) = 0 and V (εt ) = 1, t ∈ N,
the stable distribution is at first not applicable. In the sequel, following Menn
and Rachev (2004) we will give a brief introduction to a new class of probability
distributions that combines the modeling flexibility of stable distributions with
the existence of arbitrary moments.
A possibility to guarantee the existence of moments of order ≥ α is to truncate
the stable distribution at certain limits and add two normally distributed tails to
the distribution. Depending on where the truncation is made, the distribution
can still be clearly more heavy-tailed than a normal distribution but may provide
finite variance. This idea leads to the definition of a so-called smoothly truncated
stable distribution.
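Let gθ denote the density of some α-stable distribution with parameter vector
θ = (α, β, σ, µ) and let hi , i = 1, 2, denote the densities of two normal distributions
with parameters (νi , τi ), i = 1, 2. Furthermore, let a, b ∈ R be two real numbers
with a ≤ µ ≤ b. The density of a smoothly truncated stable distribution (STS-
distribution) is defined by
$$ f(x) = \begin{cases} h_1(x) & \text{for } x < a \\ g_\theta(x) & \text{for } a \le x \le b \\ h_2(x) & \text{for } x > b \end{cases} $$
where the parameters of h1 and h2 are determined by the following conditions:
(i) Continuity:
$$ h_1(a) \overset{!}{=} g_\theta(a) \quad \text{and} \quad h_2(b) \overset{!}{=} g_\theta(b) $$
(ii) Tail probabilities:
$$ p_1 := \int_{-\infty}^{a} h_1(x)\,dx = \int_{-\infty}^{a} g_\theta(x)\,dx $$
$$ p_2 := \int_{b}^{\infty} h_2(x)\,dx = \int_{b}^{\infty} g_\theta(x)\,dx $$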
The class of smoothly truncated stable (STS) distributions will be denoted by
$S^{trunc}$ in the following. Probability distributions used for modeling white
noise processes, such as the innovations of a time series model, are usually
assumed to be standardized, with zero mean and unit variance. It remains to
calculate the parameters (νi , τi ), i = 1, 2, of the two normal distributions. The
conditions above lead to the following equations for the parameters (νi , τi ), i = 1, 2:
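$$ \tau_1 = \frac{\varphi\left(\Phi^{-1}(p_1)\right)}{g_\theta(a)} \quad \text{and} \quad \nu_1 = a - \tau_1 \Phi^{-1}(p_1) \tag{4.11} $$
$$ \tau_2 = \frac{\varphi\left(\Phi^{-1}(p_2)\right)}{g_\theta(b)} \quad \text{and} \quad \nu_2 = b + \tau_2 \Phi^{-1}(p_2) \tag{4.12} $$
where ϕ and Φ denote the density and distribution function of the standard normal
distribution.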
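Following Menn and Rachev (2004), a useful property of α-stable distributions
is the scale and translation invariance, which is transmitted to the class of STS-
distributions. For an STS-distributed random variable X with truncation points a and b
and constants c and d,
$$ Y := cX + d \sim S_{\tilde\alpha}(\tilde\sigma, \tilde\beta, \tilde\mu) \in S^{trunc} \tag{4.13} $$
with
$$ \tilde a = ca + d, \quad \tilde b = cb + d, \quad \tilde\alpha = \alpha, \quad \tilde\sigma = |c|\sigma, \quad \tilde\beta = \operatorname{sign}(c)\beta, \quad \tilde\mu = c\mu + d. $$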
The main advantage, however, is that the mean EX and the second moment
EX² of an STS-distributed random variable X exist:
$$ EX = a p_1 - \tau_1\left(\Phi^{-1}(p_1)\,p_1 + \varphi(\Phi^{-1}(p_1))\right) + \int_a^b x\, g_\theta(x)\,dx + b p_2 + \tau_2\left(\Phi^{-1}(p_2)\,p_2 + \varphi(\Phi^{-1}(p_2))\right) $$
where, as above, ϕ denotes the density and Φ the distribution function of the
standard normal distribution. p1 and p2 denote the cut-off-probabilities given in
equation (4.1.4) and Gθ is the distribution function of the α-stable distribution
with parameter-vector θ = (α, β, σ, µ).
It should be pointed out that since there exists no closed form expression for
the density gθ of a stable distribution, the mean and the variance of an STS-
distribution can only be calculated with the help of numerical integration.
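A minimal sketch of the construction, under the assumption that the stable core is the Cauchy law S1 (σ, 0, µ). This choice is made only because its density and distribution function are explicit; for a general gθ both would have to be evaluated numerically. All function names are illustrative. The code computes the tail parameters from (4.11) and (4.12) and the mean as the sum of the two normal tail terms and a midpoint-rule integral over [a, b].

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf is phi, cdf is Phi, inv_cdf is Phi^{-1}

def cauchy_pdf(x, sigma=1.0, mu=0.0):
    """Density of the Cauchy law S_1(sigma, 0, mu)."""
    return sigma / (math.pi * ((x - mu) ** 2 + sigma ** 2))

def cauchy_cdf(x, sigma=1.0, mu=0.0):
    """Distribution function of S_1(sigma, 0, mu)."""
    return 0.5 + math.atan((x - mu) / sigma) / math.pi

def sts_tails(a, b, pdf=cauchy_pdf, cdf=cauchy_cdf):
    """Cut-off probabilities and normal tail parameters from (4.11)-(4.12)."""
    p1, p2 = cdf(a), 1.0 - cdf(b)
    z1, z2 = STD.inv_cdf(p1), STD.inv_cdf(p2)
    tau1 = STD.pdf(z1) / pdf(a)
    tau2 = STD.pdf(z2) / pdf(b)
    return {"p1": p1, "p2": p2, "z1": z1, "z2": z2,
            "nu1": a - tau1 * z1, "tau1": tau1,
            "nu2": b + tau2 * z2, "tau2": tau2}

def sts_mean(a, b, pdf=cauchy_pdf, cdf=cauchy_cdf, steps=20000):
    """Mean of the STS law: closed-form normal tails plus midpoint rule on [a, b]."""
    t = sts_tails(a, b, pdf, cdf)
    left = a * t["p1"] - t["tau1"] * (t["z1"] * t["p1"] + STD.pdf(t["z1"]))
    right = b * t["p2"] + t["tau2"] * (t["z2"] * t["p2"] + STD.pdf(t["z2"]))
    h = (b - a) / steps
    xs = (a + (k + 0.5) * h for k in range(steps))
    middle = h * sum(x * pdf(x) for x in xs)
    return left + middle + right

tails = sts_tails(-3.0, 8.0)      # asymmetric truncation of a standard Cauchy
mean_asym = sts_mean(-3.0, 8.0)
mean_sym = sts_mean(-5.0, 5.0)    # symmetric truncation: the mean vanishes
```

The untruncated Cauchy law has no mean at all, so the finite values returned here come entirely from the truncation.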
In this section we will give some examples on the superior fit of stable distribu-
tions to financial returns compared to the Gaussian distribution that is used in
Distribution                Stable                                  Gaussian
Parameters                  alpha    beta      sigma    mu          µ         σ
Unemployment Rate           1.6691   -1.0000   0.0124    0.0316     0.0337    0.0207
Working Output              1.4474    1.0000   0.0723    0.0234    -0.0074    0.1454
Gross Domestic Product      1.6325   -1.0000   0.0830   -0.0493    -0.0495    0.1986
Consumer Price Index        1.2061    0.0880   0.3130    0.0385     0.0194    0.8058
Annual Saving               1.2849    1.0000   0.0123    0.0563     0.0433    0.0283
Personal Income             2.0000    0.1427   0.0147    0.0744     0.0744    0.0210
Figure 4.3: Normal and Stable fit to log return of Working Output per hour (curves shown: empirical density, stable fit, Gaussian (normal) fit).
that puts more weight to the tails of the distribution. Table 4.2.1 shows the
results for the considered goodness-of-fit criteria. For most variables we find a
clearly better fit of the stable distribution compared to the normal.
For ALM problems, scenarios are often generated by calibrating a time-series
model to multivariate data and simulating from it.
There are two major approaches to modeling multivariate data:
Figure 4.6: Fit of Gaussian and Stable distribution to residuals of monthly infla-
• Fit each individual time-series with a univariate distribution and use a cop-
ula to describe the dependence structure.
The second approach is more flexible in the sense that it allows any type of
distribution to be fit to the individual series. For instance, one can first calibrate
complex univariate models like GARCH etc. and then capture the dependence
with a time-varying copula.
where the innovations process Eτ = (e1τ , ..., e6τ )′ is assumed to be white noise with
covariance matrix Σ. It is both easy to calibrate and easy to simulate scenarios
from VAR models. An introduction to modeling and estimation of VAR models
can be found in [38].
To simulate the VAR model, one needs to make a distributional assumption
for the innovations. After estimation of the VAR(1) model, the residuals are
computed by
Êτ = R̃τ − Π̂1 R̃τ −1 , (4.19)
and the standardized residuals Σ̂−1/2 Êτ are plotted in figure (5.2). The usual
assumption is that the innovations are Gaussian, in which case the standardized
residuals should be i.i.d. multivariate Normal(0,In ). However, based on the results
on financial return data of the previous sections, it might also be promising to
use a more flexible or heavy-tailed distribution like the α-stable or the truncated
stable distribution.
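To make the residual computation (4.19) concrete, here is a small pure-Python sketch for a bivariate VAR(1) with Gaussian innovations. The coefficient matrix, the Cholesky factor of the innovation covariance and all function names are made-up illustrative values, not estimates from the ALM data. Simulating a path and then applying (4.19) with the same coefficient matrix recovers the simulated innovations.

```python
import random

def mat_vec(m, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def simulate_var1(pi1, chol, r0, steps, seed=0):
    """Simulate R_t = Pi1 R_{t-1} + E_t with E_t = L g_t and g_t ~ N(0, I)."""
    rng = random.Random(seed)
    path, innovations = [list(r0)], []
    for _ in range(steps):
        g = [rng.gauss(0.0, 1.0) for _ in r0]
        e = mat_vec(chol, g)                      # covariance of e is L L'
        nxt = [a + b for a, b in zip(mat_vec(pi1, path[-1]), e)]
        innovations.append(e)
        path.append(nxt)
    return path, innovations

def var1_residuals(pi1, path):
    """Residuals E_t = R_t - Pi1 R_{t-1}, as in equation (4.19)."""
    return [[x - y for x, y in zip(cur, mat_vec(pi1, prev))]
            for prev, cur in zip(path, path[1:])]

PI1 = [[0.30, 0.05],
       [0.10, 0.20]]          # hypothetical VAR(1) coefficient matrix
CHOL = [[0.020, 0.000],
        [0.010, 0.040]]       # hypothetical lower-triangular Cholesky factor

path, innovations = simulate_var1(PI1, CHOL, [0.0, 0.0], steps=237, seed=1)
residuals = var1_residuals(PI1, path)
```

Replacing the Gaussian draws `g` by draws from a heavy-tailed law is the only change needed to simulate under a different distributional assumption for the innovations.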
An n-dimensional random vector Z has a multivariate stable distribution if for
any a > 0 and b > 0 there exist c > 0 and d ∈ Rn such that
$$ aZ_1 + bZ_2 \overset{d}{=} cZ + d, $$
where Z1 and Z2 are independent copies of Z.
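where $(\tilde z^1, \tilde z^2)$ is an SαS vector with spectral measure $\Gamma_{(\tilde z^1, \tilde z^2)}$ and $y^{\langle k \rangle} = |y|^k \operatorname{sign}(y)$.
Additionally, the covariation norm is given by
$$ \|\tilde z^i\|_\alpha = \left([\tilde z^i, \tilde z^i]_\alpha\right)^{1/\alpha}. $$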
See [30] for details on estimating the index of stability, spectral measure, and scale
parameter for a general stable random vector.
vi. Correlation is only defined when the variances of the risks are finite. It is
not an appropriate dependence measure for very heavy-tailed risks where
variances appear infinite.
For an illustration of points 2 and 4, consider the following example (see Embrechts
et al., 1999):
Consider two rv’s X and Y that are lognormally distributed with µX = µY = 0,
σX = 1 and σY = 2. One can show that, whatever joint distribution is specified
for the given marginals, not every correlation in [−1, 1] can be attained.
In fact, there exist bounds for the minimal and the maximal attainable
correlation, [ρmin , ρmax ], which in the given case are [−0.090, 0.666].
Allowing σY to increase, this interval becomes arbitrarily small, as one can see
in figure 4.7. Here, it is interesting to note that the two boundaries represent the
case where the two rv’s are perfectly positive dependent (the max. correlation
line) or perfectly negative dependent (the min. correlation line) respectively.
Thus, although the attainable interval for ρ shrinks to zero from both sides as σY
grows, the dependence between X and Y is by no means weak. This indicates
that it is wrong to interpret small correlation as weak dependence.
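The bounds quoted above can be reproduced with a few lines of Python. The closed-form expressions used here are not spelled out in the text; they follow from taking Y = exp(σY Z) (perfect positive dependence) or Y = exp(−σY Z) (perfect negative dependence) with X = exp(σX Z) and a standard normal Z, and computing the correlation from the lognormal moments. The function name is our own.

```python
import math

def lognormal_corr_bounds(sigma_x, sigma_y):
    """(rho_min, rho_max) for X ~ Lognormal(0, sigma_x^2), Y ~ Lognormal(0, sigma_y^2).

    rho_max is attained for comonotonic X and Y, rho_min for countermonotonic ones.
    """
    # sqrt((e^{sx^2} - 1)(e^{sy^2} - 1)); expm1(t) = e^t - 1
    denom = math.sqrt(math.expm1(sigma_x ** 2) * math.expm1(sigma_y ** 2))
    rho_min = math.expm1(-sigma_x * sigma_y) / denom
    rho_max = math.expm1(sigma_x * sigma_y) / denom
    return rho_min, rho_max

rho_min, rho_max = lognormal_corr_bounds(1.0, 2.0)   # approximately (-0.090, 0.666)

# The attainable interval collapses towards zero as sigma_y grows.
widths = [lognormal_corr_bounds(1.0, s)[1] - lognormal_corr_bounds(1.0, s)[0]
          for s in (2.0, 3.0, 4.0, 5.0)]
```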
A single statistical parameter like the linear correlation coefficient will not
be able to capture the entire dependence structure between two rv’s in the gen-
eral case. At this point a general concept of describing the dependence structure
within multivariate distributions is needed. Since marginal distributions are very
illustrative, easy to handle and often used as basic building blocks for the de-
sign of a multivariate distribution, the idea of separating the description of the
joint multivariate distribution into the marginal behaviour and the dependence
structure is very attractive. One representation of the dependence structure that
satisfies this concept is a copula. A copula is a function that combines the mar-
ginal distributions to form the joint multivariate distribution. A copula is the
distribution function of a random vector in Rn with standard uniform marginals.
One can alternatively define a copula as a function and impose certain restrictions.
Figure 4.7: Maximum and minimum attainable correlation for X ∼
Lognormal(0, 1) and Y ∼ Lognormal(0, sigma).
A copula is any real valued function $C : [0, 1]^n \to [0, 1]$, i.e. a mapping of the unit
hypercube into the unit interval, which has the following three properties:
$$ \sum_{i_1=1}^{2} \cdots \sum_{i_n=1}^{2} (-1)^{i_1+\cdots+i_n}\, C(u_{1i_1}, \ldots, u_{ni_n}) \ge 0 $$
where the function C can be identified as a joint distribution function with
standard uniform marginals, namely the copula of the random vector X. In equation (4.21),
it can be clearly seen how the copula combines the marginals to the joint distribution.
Sklar’s theorem provides a theoretic foundation for the copula concept.¹
Theorem (Sklar). Let F be a joint distribution function with continuous margins F1 , . . . , Fn .
Then there exists a unique copula $C : [0, 1]^n \to [0, 1]$ such that for all x1 , . . . , xn
in $\bar{R} = [-\infty, \infty]$, (4.21) holds. Conversely, if C is a copula and F1 , . . . , Fn are
distribution functions, then the function F given by (4.21) is a joint distribution
function with margins F1 , . . . , Fn . For the case that the marginals Fi are not
all continuous, it can be shown² that the joint distribution function can also be
expressed as in equation (4.21), although C is no longer unique in this case.
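The converse part of the theorem is what makes copulas useful for simulation: any copula can be combined with any set of margins. The following sketch draws from a bivariate Gaussian copula and attaches an exponential and a lognormal margin by inverse transform. The choice of copula, margins, parameter values and function names is purely illustrative and not taken from the text.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()

def sample_gaussian_copula(rho, n, seed=0):
    """Draw n pairs (u, v) from the bivariate Gaussian copula with parameter rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1 = g1
        z2 = rho * g1 + math.sqrt(1.0 - rho ** 2) * g2   # corr(z1, z2) = rho
        # Map to uniforms; the clamp guards against u = 0 or u = 1.
        u = min(max(STD.cdf(z1), 1e-12), 1.0 - 1e-12)
        v = min(max(STD.cdf(z2), 1e-12), 1.0 - 1e-12)
        pairs.append((u, v))
    return pairs

def exp_quantile(u, rate=1.0):
    """Inverse distribution function of the exponential law."""
    return -math.log(1.0 - u) / rate

def lognormal_quantile(u, mu=0.0, sigma=1.0):
    """Inverse distribution function of the lognormal law."""
    return math.exp(mu + sigma * STD.inv_cdf(u))

uv = sample_gaussian_copula(rho=0.7, n=20000, seed=42)
xy = [(exp_quantile(u), lognormal_quantile(v)) for u, v in uv]
```

The pairs in `xy` have exactly the prescribed margins, while their dependence structure is inherited from the bivariate normal distribution.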
Examples of copulas
i. If the rv’s Xi are independent, then the copula is just the product over the
$$ C^{ind}(x_1, \ldots, x_n) = x_1 \cdots x_n. $$
where ρ ∈ (−1, 1) and Φ−1 (α) = inf{ x | Φ(x) ≥ α} is the univariate inverse
standard normal distribution function. Applying CρGa to two univariate
standard normally distributed rv’s results in a standard bivariate normal
distribution with correlation coefficient ρ.
¹ For further discussion see [35].
² See [35].
Note that, since the copula and the marginals can be arbitrarily combined,
this (and any other) copula can be applied to any set of univariate rv’s.
The outcome will then surely not be multivariate normal, but the resulting
multivariate distribution has inherited the dependence structure from the
multivariate normal distribution.
$$ C_\beta^{Gu}(x, y) = \exp\left(-\left\{(-\log x)^{1/\beta} + (-\log y)^{1/\beta}\right\}^{\beta}\right), $$
$$ \hat{P}_{Ga}(X > q_{0.99} \mid Y > q_{0.99}) = 0.\overline{3} $$
$$ \hat{P}_{Gu}(X > q_{0.99} \mid Y > q_{0.99}) = 0.75 $$
This is another indicator for the increased probability for the joint occurrence of
extreme events.
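The tendency of the Gumbel copula to produce joint extremes can also be seen from the theoretical conditional exceedance probability P(U > u | V > u) = (1 − 2u + C(u, u))/(1 − u) for uniform margins. The sketch below evaluates it at u = 0.99. The parameter values and function names are assumptions for illustration; the figures quoted above are empirical estimates and are not reproduced by this calculation.

```python
import math

def gumbel_copula(u, v, beta):
    """Gumbel copula in the parameterization used above, 0 < beta <= 1."""
    s = (-math.log(u)) ** (1.0 / beta) + (-math.log(v)) ** (1.0 / beta)
    return math.exp(-(s ** beta))

def joint_exceedance(copula, u):
    """P(U > u | V > u) for a bivariate copula with uniform margins."""
    return (1.0 - 2.0 * u + copula(u, u)) / (1.0 - u)

u = 0.99
# beta = 1 gives the independence copula: the conditional probability is 1 - u.
p_indep = joint_exceedance(lambda a, b: gumbel_copula(a, b, 1.0), u)
# beta = 0.5 gives strong upper tail dependence; the limit for u -> 1 is 2 - 2**beta.
p_gumbel = joint_exceedance(lambda a, b: gumbel_copula(a, b, 0.5), u)
```

For β < 1 the conditional probability stays bounded away from zero as u approaches one, whereas under independence it vanishes.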
In the previous section we considered a bivariate distribution to show that
marginal distributions and correlation are insufficient information to fully specify
the joint distribution. This example was constructed in the following way, using
a copula:
Let X and Y be two rv’s with standard normal distributions. Obviously the
outcome for the bivariate distribution when applying an arbitrary copula is not
bivariate normal in general. This is only the case when choosing the Gaussian
copula C = CρGa .
Thus, the following copula has been constructed:
$$ f(x) = 1_{(\gamma, 1-\gamma)}(x) + \frac{2\gamma - 1}{2\gamma}\, 1_{(\gamma, 1-\gamma)^c}(x) $$
$$ g(y) = -1_{(\gamma, 1-\gamma)}(y) - \frac{2\gamma - 1}{2\gamma}\, 1_{(\gamma, 1-\gamma)^c}(y) $$
with γ ∈ [1/4, 1/2]. For γ < 1/2, the joint density disappears on the square [γ, 1 − γ]²
such that the joint distribution is surely not bivariate normal. However, the linear
correlation coefficient between X and Y exists. From symmetry considerations
(the copula density c satisfies c(u, v) = c(1 − u, v), 0 ≤ u, v ≤ 1) it can be deduced that
ρX,Y = 0, irrespective of γ. Therefore, uncountably many bivariate distributions with
standard normal marginals and zero correlation exist that are not bivariate normal.
Chapter 5
ALM Implementation
      Asset Class               Benchmark
s1    Cash                      Ryan Labs Cash Index
s2    Bonds                     Lehman U.S. Aggregate Bond Index
s3    Equities                  S&P500
s4    International Equities    Morgan Stanley EAFE Index
s5    Mortgages                 Lehman Mortgage Index
Once a time-series model is found, it is simple to generate sample paths for the
returns and then convert the returns back to index values. Note that in our
example τ is interpreted as time, and in the previous chapter, t is interpreted as
the stage in a stochastic program. It is possible that they will coincide; however,
there will usually be many smaller time periods between stages. In our application,
a time-series model is fit to monthly data, but a stage covers a 6-month period.
Figure (5.1) contains the plots of the monthly returns for the components
of Rτ . There are 237 data points corresponding to the returns for the months
of April 1985 to December 2004. An obvious characteristic of the data is the
volatility clustering, especially noticeable in the equity index. This indicates that
a time-series model with time-varying volatilities is appropriate.
[Figure 5.1: Monthly returns of the components of Rτ , April 1985 to December 2004; one panel is titled "Monthly Liability Returns".]
As a first step in fitting a model to the data, the major trends of the individual
series are removed by an exponentially weighted moving average (EWMA) process
for the mean. The means of the univariate return series are assumed to follow:
where λm is a fixed parameter. By writing mτ = (m1τ , ..., m6τ )′ , the new return
series of interest is
R̃τ = Rτ − mτ , (5.3)
and as the next step, a vector autoregressive (VAR) model is calibrated to R̃τ .
For the data at hand, the AIC indicates that the VAR of order 1 is optimal.
More generally, one may fit a multivariate autoregressive moving average
(ARMA) model; however, multivariate financial data typically indicate only an
autoregressive component, so it is reasonable to restrict the model to VAR.
Extensions of the VAR model that additionally include economic regime changes
and long term equilibria in an ALM context may be used as well.
To find the optimal value of λm , a coarse grid was created, and for each element
in the grid, the AICs of low order VAR models were compared. VAR(1) always
resulted in the lowest AIC for any value of λm in the grid. A fine grid for λm
was then constructed, and the AICs of the corresponding VAR(1) models were
compared. This procedure gave an optimal value of λm = 0.952.
After estimation of the VAR(1) model, the residuals are computed by
σ̂ 1 σ̂ 2 σ̂ 3 σ̂ 4 σ̂ 5 σ̂ 6
0.0404 0.0015 0.0124 0.0450 0.0494 0.0105
The first noticeable point is that the volatilities corresponding to the equity returns
are the largest, the volatility corresponding to the bond returns is smaller, and
the volatility corresponding to the cash returns is very small. Also, the volatility
corresponding to the liability returns is almost as large as that of the equities,
meaning that the liabilities of pension funds are actually quite risky. The second
noticeable point is that the liability returns and bond returns are highly correlated
as one would expect. This means that when the optimization program is solved
Figure 5.2: Standardized residuals $\hat\Sigma^{-1/2} \hat E_\tau$ of the VAR(1) model for Rτ (panels Series 1 to Series 6).
Figure 5.3: QQ-plot of the standard normal versus the standardized residuals $\hat\Sigma^{-1/2} \hat E_\tau$ (horizontal axis: quantiles of the standard normal).
for the minimum risk portfolio, one could expect a large allocation in the bonds
to offset the risk in the liabilities.
A symmetric stable distribution is fit to each of the univariate residual series
of the VAR(1) model by maximum likelihood estimation. The estimates of the
tail index α̂i and scale parameter σ̂αi from each univariate series êi are given in table
(5.2). The estimation was restricted to symmetric distributions because of the
short length of the data series. Alternatively, it is reasonable to assume that
α = 1.8 for financial data and carry out the estimation for the scale parameter
alone. The empirical density of the liability return innovations is compared to
Figure 5.4: Density functions for the residuals of the liability return series.
both the estimated normal density and the estimated stable density in figures
(5.4) and (5.5). As is seen, the stable density better matches the peak of the
empirical density and has a slower decay at the tails than that of the normal density.
Two goodness-of-fit measures are employed to compare the normal fit and
the stable fit of the univariate series: the Kolmogorov distance (KD) and the
Anderson-Darling (AD) statistic. The KD and AD for the normal and stable
estimated distributions for each of the series can be found in tables (5.3) and
(5.4). The normal fit slightly outperforms the stable fit twice under the KD
measure, but the stable fit is clearly superior under the AD measure.
A sub-Gaussian distribution can be fitted to the residuals Eτ of the ALM
Figure 5.5: Right tail of the density functions for the residuals of the liability
return series.
They can be compared with the ML estimates in table (5.2). The moment
estimate for Q given by the above equations is not symmetric, but a symmetric
estimate is given by $\hat Q = \left((\hat q_{ij}^2 + \hat q_{ji}^2)/2\right)$. The standardized residuals $\hat Q^{-1/2} \hat E_\tau$
are also computed and are plotted in figure (5.6). In this case, the data points
should all be temporally and serially independent realizations of a S1.8705 (1, 0, 0)
random variable. This is clearly not the case because there is a significant amount
of volatility clustering. The qq-plot of the stable random variable versus the ag-
gregated standardized residuals is found in figure (5.7). This plot appears closer
to linear than the qq-plot with the standard normal in (5.3), which indicates the
sub-Gaussian provides a better fit than the multivariate normal; however, neither
of these captures the time-varying nature of the innovations.
To account for the volatility clustering, different types of models are implemented:
The first assumes the innovations are Gaussian with a time-varying covariance
matrix, and the second assumes the innovations are sub-Gaussian with a time-
varying dispersion matrix.
Given a multivariate data set {Eτ , τ = 1, ..., τm } with zero mean, the sample
Figure 5.6: Standardized residuals $\hat Q^{-1/2} \hat E_\tau$ of the VAR(1) model for Rτ (panels Series 1 to Series 6).
Figure 5.7: QQ-plot of the symmetric stable with α = 1.8705 versus the standardized residuals $\hat Q^{-1/2} \hat E_\tau$ (horizontal axis: quantiles of the symmetric alpha stable with alpha = 1.8705).
$$ \hat\Sigma = \frac{1}{\tau_m - 1} \sum_{\tau=1}^{\tau_m} E_\tau E_\tau'. \tag{5.5} $$
Note that there is equal weight applied to each observation of the data set. To
allow a time-varying volatility estimate, the covariance estimate at time τ is al-
lowed to depend on the data before time τ , and the weights are assumed to decay
exponentially from the most recent observation:
where 0 < λe < 1 and the weights are chosen so that they sum to one for an
infinite series. The estimate can also be written in the recursive form
$$ RMSE_2^i(\lambda_e) = \sqrt{\frac{1}{\tau_m} \sum_{\tau=1}^{\tau_m} \left((e_\tau^i)^2 - \hat\sigma^2_{\tau|\tau-1,ii}(\lambda_e)\right)^2}, \tag{5.7} $$
where σ̂τ2|τ −1,ii (λe ) is a diagonal component of Σ̂τ |τ −1 in equation (5.6). Since the
data series is assumed to have zero mean, $E_{\tau-1}\left((e_\tau^i)^2\right) = \sigma^2_{\tau|\tau-1,ii}$, so the prediction
error of (eiτ )2 is the difference of terms inside the square root in equation (5.7).
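The selection of the decay factor for one univariate series can be sketched as follows. The sketch assumes the recursive EWMA form σ̂²τ|τ−1 = (1 − λe )(eτ−1 )² + λe σ̂²τ−1|τ−2 for a diagonal element, a user-supplied starting value, and a simple grid search. The synthetic series, the grid and the function names are illustrative and not taken from the ALM data.

```python
import math

def ewma_variance_forecasts(e, lam, init_var):
    """One-step-ahead forecasts sigma^2_{t|t-1} for a zero-mean series e.

    forecasts[t] depends only on e[0], ..., e[t-1] and the starting value.
    """
    forecasts = []
    var = init_var
    for x in e:
        forecasts.append(var)
        var = (1.0 - lam) * x * x + lam * var   # recursive EWMA update
    return forecasts

def rmse2(e, lam, init_var):
    """Root mean squared prediction error of e_t^2, as in equation (5.7)."""
    f = ewma_variance_forecasts(e, lam, init_var)
    return math.sqrt(sum((x * x - v) ** 2 for x, v in zip(e, f)) / len(e))

def best_decay(e, grid, init_var):
    """Grid value of the decay factor that minimizes the RMSE."""
    return min(grid, key=lambda lam: rmse2(e, lam, init_var))

# Synthetic series with a jump in volatility half way through.
series = ([0.01 * (-1) ** t for t in range(60)]
          + [0.05 * (-1) ** t for t in range(60)])
grid = [round(0.800 + 0.001 * k, 3) for k in range(200)]   # 0.800, ..., 0.999
lam_star = best_decay(series, grid, init_var=series[0] ** 2)
```

Applying `best_decay` to each univariate residual series yields the individual optimal decay factors that are then combined into a single value.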
A single optimal estimate λ∗e for the decay factor is computed from the RMSE of
each univariate series through the formulas:
$$ \lambda_e^* = \sum_{i} \phi_i \lambda_i^*, \tag{5.8} $$
$$ \hat q^{\,p}_{\tau|\tau-1,jj} = (1 - \lambda_e)\, |e^j_{\tau-1}|^p A(p) + \lambda_e\, \hat q^{\,p}_{\tau-1|\tau-2,jj} \tag{5.10} $$
$$ B_{\tau|\tau-1,ij} = (1 - \lambda_e)\, e^i_{\tau-1} e^j_{\tau-1} A(q) + \lambda_e B_{\tau-1|\tau-2,ij} \tag{5.11} $$
and the symmetric estimator for the time-varying dispersion matrix is given by
$\hat Q_{\tau|\tau-1} = \left((\hat q^2_{\tau|\tau-1,ij} + \hat q^2_{\tau|\tau-1,ji})/2\right)$. This model is referred to as the stable
exponentially weighted moving average model (SEWMA). The authors also extend the
estimation technique for the decay factor by considering the prediction error of
$|e^i_\tau|^p$. They note that $E_{\tau-1}(|e^i_\tau|^p) = q^p_{\tau|\tau-1,ii}/A(p)$ and suggest minimizing the
following RMSE for each univariate series:
$$ RMSE_p^i(\lambda_e) = \sqrt{\frac{1}{\tau_m} \sum_{\tau=1}^{\tau_m} \left(A(p)\, |e^i_\tau|^p - \hat q^{\,p}_{\tau|\tau-1,ii}(\lambda_e)\right)^2}. \tag{5.13} $$
The single optimal decay factor λ∗e is then found by replacing $RMSE_2^i$ with
$RMSE_p^i$ in equations (5.8-5.9). Using the VAR(1) residuals of the ALM data, this
technique is applied in both the normal and sub-Gaussian cases with p = α/3. A
grid for λ was constructed with increments of 0.001, and RM SEpi (λ) was mini-
mized over this grid. In both cases, a value of λe = 0.95 for equations (5.10-5.12)
is found to be appropriate. The exact values of λ∗e are found in table (5.5).
There are difficulties in implementing the SEWMA model for the ALM resid-
          α         p         λ∗e
Normal    2         0.6667    0.9496
Stable    1.8705    0.6235    0.9494
Table 5.5: Comparison of the optimal decay factor λ∗e under the normal and stable
assumptions using the selection criterion based on $RMSE_p^i$.
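$$ g \sim N(0, \sigma_g^2), \qquad y \sim S_\alpha(\sigma_y, 0, 0), \qquad s \sim S_{\alpha/2}\left(\frac{2\sigma_y^2}{\sigma_g^2} \left(\cos\frac{\pi\alpha}{4}\right)^{2/\alpha}, 1, 0\right), $$
$$ y \overset{d}{=} \sqrt{s}\, g. $$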
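For α = 1 this representation can be checked by simulation with the standard library alone. In that case the subordinator is a Lévy variable S1/2 (c, 1, 0), which has the same law as c/Z² for a standard normal Z, and y should be Cauchy with scale σy . The shortcut for sampling s is specific to α = 1 and is our choice for the illustration; other values of α require a general stable random number generator. The parameter values and the function name are hypothetical.

```python
import math
import random
from statistics import median

def sample_cauchy_via_subordination(sigma_y, sigma_g, n, seed=0):
    """Sample y = sqrt(s) * g for alpha = 1, with s Levy and g Gaussian."""
    rng = random.Random(seed)
    alpha = 1.0
    # Scale of the subordinator: (2 sigma_y^2 / sigma_g^2) * cos(pi alpha / 4)^(2 / alpha)
    scale_s = (2.0 * sigma_y ** 2 / sigma_g ** 2) \
        * math.cos(math.pi * alpha / 4.0) ** (2.0 / alpha)
    ys = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        while z == 0.0:                    # avoid division by zero
            z = rng.gauss(0.0, 1.0)
        s = scale_s / (z * z)              # S_{1/2}(scale_s, 1, 0), the Levy law
        g = rng.gauss(0.0, sigma_g)        # governing Gaussian variable
        ys.append(math.sqrt(s) * g)
    return ys

sigma_y = 0.02
ys = sample_cauchy_via_subordination(sigma_y, sigma_g=1.5, n=20000, seed=7)
# For a Cauchy law with scale sigma_y, the median of |y| equals sigma_y.
med_abs = median(abs(y) for y in ys)
```

The result does not depend on σg , which only fixes how the scale is split between the subordinator and the Gaussian factor.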
See [32] and the reference therein. If the governing Gaussian distribution Gτ for
the multivariate data has a time-varying covariance matrix Στ |τ −1 = στ |τ −1,ij
and each univariate series is modeled with an αi -stable random variable with time-
varying scale parameter qτ |τ −1,i , the previous results suggest a way to model Eτ
with a time-varying sub-Gaussian-like distribution:
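$$ E_\tau = \begin{pmatrix} \sqrt{s^1_\tau}\, g^1_\tau \\ \vdots \\ \sqrt{s^n_\tau}\, g^n_\tau \end{pmatrix}, \tag{5.14} $$
$$ G_\tau = \begin{pmatrix} g^1_\tau \\ \vdots \\ g^n_\tau \end{pmatrix} \sim N\left(0, \Sigma_{\tau|\tau-1}\right), \tag{5.15} $$
$$ s^i_\tau \sim S_{\alpha_i/2}\left(\frac{2 q^2_{\tau|\tau-1,i}}{\sigma^2_{\tau|\tau-1,ii}} \left(\cos\frac{\pi\alpha_i}{4}\right)^{2/\alpha_i}, 1, 0\right). \tag{5.16} $$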
When generating a sample for Eτ , the samples of siτ , i = 1, ..., n, are taken from the
same random seed so that the above equations will be close to the sub-Gaussian
representation where the same subordinator multiplies each component of the
normal random vector. In the above equations, the covariance of the governing
Gaussian distribution captures the dependence between the series, and each sub-
ordinator siτ is chosen to give the proper tail index and scale parameter for each of
the univariate series. Recall that for the sub-Gaussian distribution, all marginals
have the same tail index, so the above equations are actually an extension that
allows different tail indexes, αi , for the marginals. The scale parameters and
covariance matrix are estimated from EWMA equations already seen. The time-varying
estimate for the scale parameter is given by:
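$$ \hat\sigma^{p_i}_{\tau|\tau-1,i} = (1 - \lambda_e)\, |e^i_{\tau-1}|^{p_i} A(p_i) + \lambda_e\, \hat\sigma^{p_i}_{\tau-1|\tau-2,i}, \tag{5.17} $$
$$ \hat\Sigma_{\tau|\tau-1} = (1 - \lambda_e)\, E^*_{\tau-1} {E^*_{\tau-1}}' + \lambda_e\, \hat\Sigma_{\tau-1|\tau-2}. \tag{5.18} $$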
The forecasting performances of the EWMA and SSEWMA models are examined
by comparing the predicted VaRs with the observed returns as in [31]. From the
definition of VaR, the null hypothesis to test is:
for a return series {rτ }. This hypothesis is tested for each ALM return series
ri = {rτi , τ = 1, ..., 237}, i = 1, ..., 6, and for various values of β.
In this backtesting analysis, both the VAR(1)-EWMA and VAR(1)-SSEWMA
models are fit to a moving window of 100 data points. Since it is difficult to
estimate the tail index of the stable distribution with such a short time-series,
it is assumed that αi = 1.8 for each of the univariate series in the SSEWMA
model. Let $\widehat{VaR}_\beta(\tau)$, for τ = 101, ..., 237, be the estimate of VaRβ (τ ) from a
model calibrated to {rτ̃ , τ̃ = τ − 100, ..., τ − 1}. If equation (5.19) holds, then
$$ \chi_\tau = 1\left(r_\tau < -\widehat{VaR}_\beta(\tau)\right) = \begin{cases} 1 & \text{with probability } 1 - \beta, \\ 0 & \text{with probability } \beta, \end{cases} \tag{5.20} $$
where 1(·) is the indicator function, and the total number of VaR exceedings has
a binomial distribution:
$$ X = \sum_{\tau=101}^{237} \chi_\tau \sim \mathrm{Bin}(137, 1 - \beta). \tag{5.21} $$
The testing rule is to reject the null hypothesis at level of significance 100δ% if
$$ \sum_{k=0}^{X} \binom{137}{k} (1 - \beta)^k \beta^{137-k} \le \delta/2, \tag{5.22} $$
or
$$ \sum_{k=0}^{X} \binom{137}{k} (1 - \beta)^k \beta^{137-k} \ge 1 - \delta/2. \tag{5.23} $$
The number of exceedings and the corresponding p-values for each ALM return
series are contained in tables (5.6-5.7). The conclusions are:
• At the 5% level of significance (δ = 0.05), the EWMA model is rejected three times for
β = 0.99 and once for β = 0.95 while the SSEWMA model is never rejected.
Exceedings and p-values
β r1 r2 r3 r4 r5 r6
0.99 4 (0.0252) 3 (0.0990) 4 (0.0252) 4 (0.0252) 3 (0.0990) 3 (0.0990)
0.95 12 (0.0405) 11 (0.0850) 9 (0.2984) 8 (0.4955) 10 (0.1657) 6 (0.9379)
0.90 16 (0.4168) 13 (0.9851) 14 (0.7920) 16 (0.4168) 14 (0.7920) 12 (0.7586)
0.80 26 (0.8639) 28 (0.7987) 26 (0.8639) 32 (0.2773) 28 (0.7987) 26 (0.8639)
Table 5.6: Number of VaRβ exceedings in 137 data points with corresponding
p-values under the normal assumption.
Table 5.7: Number of VaRβ exceedings in 137 data points with corresponding
p-values under the stable assumption (α = 1.8).
• For β = 0.90 and 0.80, the EWMA model produces reasonably large p-
values, which just indicates that the normal distribution could be suitable
for forecasting more toward the middle of the distribution.
Overall, the SSEWMA model provides a better fit to the tails and is preferable
based on the examination of the p-values.
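The binomial test can be sketched in a few lines of Python. The p-values in table (5.6) are consistent with twice the smaller of F(X) and 1 − F(X), where F is the Bin(137, 1 − β) distribution function summed from k = 0; this reading of the table is our inference, and the function names are our own. Rejecting when this p-value is at most δ is equivalent to the two-sided rule stated above.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(Bin(n, p) <= x), summed from k = 0."""
    return sum(comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(x + 1))

def var_backtest_p_value(exceedings, n, beta):
    """Two-sided p-value 2 * min(F(X), 1 - F(X)) for the number of VaR exceedings."""
    cdf = binom_cdf(exceedings, n, 1.0 - beta)
    return min(1.0, 2.0 * min(cdf, 1.0 - cdf))

def reject(exceedings, n, beta, delta=0.05):
    """Reject the null hypothesis at level of significance 100 * delta percent."""
    return var_backtest_p_value(exceedings, n, beta) <= delta

# Entries of table (5.6), normal assumption, 137 out-of-sample points.
p_r1_99 = var_backtest_p_value(4, 137, 0.99)   # table reports 0.0252
p_r2_99 = var_backtest_p_value(3, 137, 0.99)   # table reports 0.0990
p_r6_95 = var_backtest_p_value(6, 137, 0.95)   # table reports 0.9379
```

With δ = 0.05, four exceedings at β = 0.99 lead to a rejection while three do not, which matches the three rejections of the EWMA model noted in the conclusions.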
The ALM optimization problem is now solved using scenarios generated from the
time-series models of the previous section. First, efficient frontiers are developed
from the 2-stage problem with scenarios based on the EWMA and SSEWMA
models, and postoptimality analysis is briefly discussed. Then, backtesting is
carried out to compare the performance of the 1-stage problem versus the 2-
stage recourse problem and the normal assumption versus the stable assumption.
The results from varying the distributional assumption are mixed, but the 2-
stage recourse problem outperforms the 1-stage problem. Before presenting these
results, the parameters of the optimization problem are first specified.
For pension funds, decisions are made approximately on an annual basis, so a
stage in the stochastic program should correspond to 12 months. A twelve month
stage left too few data points in the backtesting, so the decision was made to
shorten the stage to cover a six month period. In addition to giving more points
for comparison in the backtesting, the time-series models should generate more
reliable scenarios over the shorter time period.
For the 2-stage problem, a balanced scenario tree is generated with $10^4$ first
stage scenarios and $10^7$ second stage scenarios, giving $10^3$ second stage nodes
connected to each first stage node. This huge number of scenarios gives fairly reliable
optimal allocations, and memory limitations did not allow much larger scenario
trees to be considered. The first stage scenarios were created by simulating $10^4$
sample paths of the time-series model out to six months, and the second stage
scenarios were created by simulating another $10^3$ sample paths out an additional
six months for each of the first stage scenarios. Scenario reduction and bundling using
the methods of probability metrics was also attempted in order to create a better
set of first stage scenarios, but these methods could not handle sample paths of
this number with the given hardware.
It is necessary to convert the generated sample paths of the returns back
to the index values of the benchmarks. This is not a problem when using the
normal distribution, but it does cause some small difficulties when using the stable
distribution. Since the returns have infinite variance under the stable assumption
and are temporally dependent, the sample paths of the corresponding index values
will explode. For this reason, the stable return scenarios are truncated at levels
corresponding to p-values of 0.001 and 0.999 of the estimated distribution. This
eliminates the explosion of the index values while still fitting the tail of the return
distribution better than the normal assumption.
For the efficient frontiers and at the start date of the backtesting, it is as-
sumed that the pension fund is fully funded: the total asset wealth and the lia-
bility obligation are both taken to be $1,000, and because of the structure of the
deterministic equivalent form of the optimization problem, any pension fund that
is fully funded will have the same optimal allocations (as a percent of the asset
wealth). For instance, a fund with an initial $1,000,000 in both asset wealth and
liability obligation has the same optimal allocations as one with $1,000 in both.
Including transaction costs, the optimal allocations depend also on the initial
allocation, not just the generated scenarios and initial wealth. In this case, it is
assumed that the fund initially holds 40% of its wealth in bonds and 60% of its
wealth in equities. A reasonable assumption for the trading costs, as a percent of
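wealth traded, is obtained from data on mutual funds in [9]. In our example, the
median trading cost (TC) is 0.70% of fund assets per year:
$$ TC = 0.0070 \cdot \text{Fund Assets}. $$
The turnover, defined as the ratio of annual fund sales to the fund assets, is
determined to have a median of 0.70:
$$ \text{Turnover} = \frac{\text{Fund Sales}}{\text{Fund Assets}} = 0.70. $$
Taking the traded wealth to be twice the fund sales, this gives
$$ TC \approx \frac{0.0070}{2 \cdot 0.70}\, \text{Traded Wealth}, $$
or trading costs are approximately 0.5% of the traded wealth. Additionally
assuming that the transaction costs are the same for each of the five ALM asset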
classes, the values of T CB i = T CS i = 0.005, for i = 1, ..., 5, are used in the
optimization problem.
The numerical results of the efficient frontiers for the 2-stage recourse problem
are now given. Recall that the risk measure for the 2-stage problem is:
where swt+1 is the surplus wealth at the end of stage t (and sw1 = 0 since the
pension fund is initially fully funded). A confidence level of β = 0.95 is used in this
section to emphasize the differences between the normal and stable assumptions.
For the remainder, it is taken that µ1 = µ2 = 0.5, and the study of assigning different
weights to the CVaR at different stages is left for a later time. Since the reward
is the expected surplus wealth at the end of the second stage, E(sw3 ), the efficient
frontier is obtained by varying λ in the minimization objective: λρ2 −(1−λ)E(sw3 ).
Figure (5.8) contains three different efficient frontiers:
Figure 5.8: Efficient frontiers under the normal assumption and stable assumption
for β = 0.95.
• Optimization with transaction costs and the same set of scenarios generated
from the normal assumption.
The optimal allocations, as percents of the initial wealth, can be found in the
tables (6.1-6.3) in the appendix of the lecture notes.
In all three cases and for any value of λ, the optimal first stage allocations
are some combination of the bond and international equity indexes. The portfolio
that maximizes the expected final surplus wealth (λ = 0) invests entirely in the
international equity index, and the minimum risk portfolio (λ = 1) invests entirely
in the bond index. A couple of immediate comments can be made about the figure.
Since the stable distribution has a higher probability of extreme events, the frontier
of the stable distribution lies below that of the normal distribution. The inclusion
of transaction costs also moves the efficient frontier downward, and the distance
it moves for various values of λ depends on the initial allocation.
A few risk-reward points obtained by replacing the surplus wealth with the
wealth in the optimization problem, under the normal assumption, are also in-
cluded in figure (5.8). The optimal allocations, found in table (6.4), are very
different in this case: The minimum risk portfolio has a very large proportion
of wealth invested in the cash index. When the corresponding ρ2 and E(sw3 )
are calculated, the points for the risk-averse portfolios lie far below the efficient
frontier. This illustrates the advantage of considering the liabilities and assets
together in the same optimization problem. Maximizing the expected final wealth
and maximizing the expected final surplus wealth result in the same values of ρ2
and E(sw3 ) because of the linearity of the problem.
The basic postoptimality analysis examines how the optimal value of a stochastic
program changes as the initial probability distribution P1 becomes contaminated
with another probability distribution P2. Usually problems of the following form
are considered:
\[ \varphi(P_1) = \min_{x_1 \in X} F(x_1, P_1), \tag{5.25} \]
and the contaminated distribution is the mixture
\[ P_\psi = (1 - \psi) P_1 + \psi P_2, \qquad 0 \le \psi \le 1. \]
This means that the scenarios of both distributions are aggregated into one set of
scenarios where the probabilities of the scenarios in P1 are weighted by 1 − ψ and
the probabilities of the scenarios in P2 are weighted by ψ. If the optimal solution
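is denoted by
\[ x_1(P_1) = \arg\min_{x_1 \in X} F(x_1, P_1), \tag{5.27} \]
a set of bounds for the optimal value of the stochastic program under the
contaminated distribution, φ(Pψ), is given by
\[ (1-\psi)\,\varphi(P_1) + \psi\,\varphi(P_2) \;\le\; \varphi(P_\psi) \;\le\; \min\bigl\{ (1-\psi)\,\varphi(P_1) + \psi\,F(x_1(P_1), P_2),\; \psi\,\varphi(P_2) + (1-\psi)\,F(x_1(P_2), P_1) \bigr\}. \tag{5.28} \]
For the minimum risk problem (λ = 1), let P1 and P2 be the scenario distributions
under the normal and stable assumptions, and let ρn2 and ρs2
correspond to the 2-stage risk under the normal and stable distributions, respectively.
Also, denote the risk under distribution Pψ by ρψ2. As seen in tables (6.1-6.2),
the optimal allocations under both the normal assumption and the stable assumption
invest all the wealth in the bond index. It follows that F(x(P2), P1) = ρn2
and F(x(P1), P2) = ρs2, so the lower and upper bounds in equation (5.28) coincide and produce
\[ \rho^\psi_2 = (1-\psi)\,\rho^n_2 + \psi\,\rho^s_2 = (1 - \psi) \cdot 246.13 + \psi \cdot 291.21. \]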
The minimum risk in the 2-stage program is then easily calculated when scenarios
under the normal assumption and stable assumption are combined. The general
contamination technique can also be applied for any value of λ, but direct
information about the risk can no longer be calculated.
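The contamination argument can be checked numerically. The following sketch assumes the standard contamination bounds, which hold when F(x, P) is linear in P: the lower bound is the ψ-weighted average of the two optimal values, and the upper bound is the smaller of the two values obtained by evaluating each distribution's optimal solution under the mixture. The function names are hypothetical, and the two risk values are the λ = 1 entries of tables (6.1) and (6.2).

```python
def contaminate(scen1, scen2, psi):
    """Pool two scenario sets given as (value, probability) pairs,
    weighting P1 by 1 - psi and P2 by psi."""
    return ([(v, (1.0 - psi) * p) for v, p in scen1]
            + [(v, psi * p) for v, p in scen2])


def contamination_bounds(psi, phi1, phi2, f_x1_on_p2, f_x2_on_p1):
    """Lower and upper bounds on the optimal value under P_psi.

    phi1, phi2: optimal values under P1 and P2.
    f_x1_on_p2: objective of P1's optimal solution evaluated under P2.
    f_x2_on_p1: objective of P2's optimal solution evaluated under P1.
    """
    lower = (1.0 - psi) * phi1 + psi * phi2
    upper = min((1.0 - psi) * phi1 + psi * f_x1_on_p2,
                psi * phi2 + (1.0 - psi) * f_x2_on_p1)
    return lower, upper


rho_n, rho_s = 246.13, 291.21  # minimum 2-stage risk, normal and stable
for psi in (0.0, 0.25, 0.5, 1.0):
    # Same optimal allocation under both distributions, so the cross
    # evaluations equal the optimal values themselves.
    lower, upper = contamination_bounds(psi, rho_n, rho_s, rho_s, rho_n)
    print(psi, round(lower, 2), round(upper, 2))

pooled = contaminate([(1.0, 0.5), (2.0, 0.5)], [(5.0, 1.0)], psi=0.25)
```

Because both distributions share the same optimal allocation here, the two bounds coincide for every ψ; when the optimal solutions differ, the gap between them measures how sensitive the optimal value is to the contamination.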
Finally, some backtesting results will be presented. The first round includes trans-
action costs, and the initial conditions for each run of the optimization problem
come from the previous period considered. This provides a realistic comparison
for the 1-stage problem versus the 2-stage problem, but it is difficult to calculate
the realized risk using the risk measure that was optimized. In the second round,
the transaction costs are removed and the initial conditions are reset every run of
the optimization problem. This allows the realized risk to be directly calculated
in terms of the optimized risk measure and provides a better comparison for the
distributional assumptions; however, this setup favors the 1-stage problem over
the 2-stage problem because the second stage becomes irrelevant.
Dynamic Backtesting: 1-stage versus 2-stage
This section performs the dynamic backtesting of the minimum risk 1-stage
and 2-stage portfolios with transaction costs. The 2-stage problem finds the
optimal allocations that minimize ρ2, and the 1-stage problem finds the optimal
allocation that minimizes
ρ1 = CVaRβ (−sw2 ). (5.30)
For a given distributional assumption, the same sets of scenarios are used when
solving the 1-stage and 2-stage problems: The 1-stage problem is just restricted
to considering the 104 first stage scenarios.
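For a finite set of scenarios, ρ1 can be evaluated directly. The sketch below uses the Rockafellar-Uryasev representation CVaR_β(L) = min_a { a + E[(L − a)+] / (1 − β) }; for a discrete distribution the minimum is attained at one of the scenario losses, so it is enough to scan them. The surplus wealth values are made up for illustration and are not taken from the lecture's scenarios.

```python
def cvar(losses, beta, probs=None):
    """CVaR at level beta of a discrete loss distribution.

    Evaluates a + E[(L - a)+] / (1 - beta) at every scenario loss a and
    returns the smallest value; equal probabilities are assumed by default.
    """
    n = len(losses)
    if probs is None:
        probs = [1.0 / n] * n

    def objective(a):
        excess = sum(p * max(x - a, 0.0) for x, p in zip(losses, probs))
        return a + excess / (1.0 - beta)

    return min(objective(a) for a in losses)


# Hypothetical first-stage surplus wealth scenarios sw2 (equally likely)
sw2 = [-40.0, -25.0, -10.0, 0.0, 5.0, 10.0, 15.0, 20.0, 30.0, 45.0]
rho1 = cvar([-s for s in sw2], beta=0.80)
print(round(rho1, 4))
```

With ten equally likely scenarios and β = 0.80, the result is the average of the two largest losses, here 32.5.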
The time-series models are fit to a moving window of 100 data points under
both the normal and stable assumptions using the EWMA and SSEWMA models,
respectively. Running the optimization problems with scenarios generated from
the time-series models fit to the first 100 monthly data points gives optimal
allocations for the six month period beginning in July, 1993. It is again assumed that
the pension fund is initially fully funded with 40% of wealth in the bond index
and 60% of wealth in the equity index. The window is then shifted forward by
6 data points, and the optimization problems output optimal allocations for Jan-
uary, 1994. The asset wealths resulting from the previous allocations, and those
allocations themselves, are used as the initial conditions for the new optimization
problems. This setup means that the 2-stage problem is run on a rolling horizon:
Since new scenarios are generated every 6 months, only the first stage allocations
are actually implemented.
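The rolling-horizon bookkeeping can be sketched as follows. The snippet only enumerates the estimation windows and the dates at which allocations are implemented; it assumes monthly data indexed from 0, a window of 100 points, a shift of 6 points and 22 runs (the initial window plus the 21 shifts reported in these notes). The function names are hypothetical, and fitting the models and solving the optimization problems are left out.

```python
def rolling_windows(n_runs=22, window=100, step=6):
    """Half-open index ranges [start, end) of the estimation windows."""
    return [(k * step, k * step + window) for k in range(n_runs)]


def allocation_dates(n_runs=22, start_year=1993, start_month=7, step=6):
    """(year, month) at which each run's first-stage allocation is applied."""
    dates = []
    for k in range(n_runs):
        months = (start_month - 1) + k * step
        dates.append((start_year + months // 12, months % 12 + 1))
    return dates


windows = rolling_windows()
dates = allocation_dates()
for (start, end), (year, month) in list(zip(windows, dates))[:3]:
    print(f"fit on data[{start}:{end}] -> allocate in {month}/{year}")
```

Each run would fit the time-series model on its window, generate scenarios, solve the 1-stage or 2-stage problem, and carry the resulting asset wealths forward as the initial conditions of the next run.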
Since it is difficult to obtain a good estimate for the tail index of a stable
distribution with only 100 data points, it is assumed that α = 1.8 in the SSEWMA
model. The backtesting, therefore, gives a comparison of the normal assumption
with the stable assumption for this particular value of the tail index.
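To give a feeling for what a tail index of α = 1.8 implies, the sketch below draws from a standard symmetric α-stable law using the Chambers-Mallows-Stuck method and counts large observations against a normal sample. This is not the SSEWMA model: skewness is set to zero, the scale to one, and the threshold and sample size are arbitrary choices for illustration. In this parameterization α = 2 corresponds to a normal law with variance 2, which is used as the benchmark.

```python
import math
import random


def symmetric_stable(alpha, rng):
    """One draw from a standard symmetric alpha-stable law (alpha != 1),
    generated with the Chambers-Mallows-Stuck method."""
    u = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
    w = max(rng.expovariate(1.0), 1e-300)  # guard against an exact zero
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos(u - alpha * u) / w) ** ((1.0 - alpha) / alpha))


rng = random.Random(0)
n = 20000
threshold = 6.0

stable_hits = sum(abs(symmetric_stable(1.8, rng)) > threshold
                  for _ in range(n))
# Normal benchmark with variance 2 (the alpha = 2 member of the family)
normal_hits = sum(abs(rng.gauss(0.0, math.sqrt(2.0))) > threshold
                  for _ in range(n))
print(stable_hits, normal_hits)
```

The stable sample produces far more observations beyond the threshold than the normal sample, which is the heavier-tail effect that separates the two distributional assumptions in the backtest.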
The window is shifted 21 times resulting in a final surplus wealth for July,
2004. Since this results in only 22 values of the surplus wealth for comparison,
the confidence level of CVaR was reduced to β = 0.80 in ρ1 and ρ2 . To measure
the relative performances, it is necessary to calculate the risk of the realized
surplus wealths. However, it is not reasonable to directly calculate the CVaR of
these values because the surplus wealth that is used as the initial condition in
the optimization problems varies over the time horizon and is different for the
different assumptions. It is also not possible to calculate the CVaR of the return
of the surplus wealth because the surplus wealth is not strictly positive. By the
translation invariance property of a coherent risk measure, it is more reasonable
to look at the change in surplus wealth:
\[ \mathrm{CVaR}_\beta\bigl(-(sw_2 - sw_1)\bigr) = \mathrm{CVaR}_\beta(-sw_2) + sw_1, \]
since sw1 is a fixed initial condition. Therefore, minimizing the CVaR in the
next time period has the effect of minimizing the CVaR of the change, but one
still cannot make a direct comparison because the asset wealth also varies for the
different assumptions over the horizon. The measure of realized risk, ρ̃, used in
the comparison is the CVaR at 80% confidence level of the change in negative
surplus wealth per dollar of asset wealth from the previous period. One can
expect that minimizing ρ1 and ρ2 produces small values of ρ̃, but ρ̃ does not
give a perfect comparison of risk because the resulting optimal allocations depend
on the ratios of assets to liabilities, not just the asset wealths. Values of ρ̃ and
the final surplus wealth are found in table (5.8). For comparison, this table also
includes values for the fixed-mixed rule of 40% bonds and 60% equity, and the
rule of 100% in bonds. Under both the normal and stable assumptions, the 2-stage
recourse problem outperforms the 1-stage problem by both reducing ρ̃ and
increasing the final surplus wealth. While the 2-stage problem under the stable
assumption results in the highest final surplus wealth, the normal assumption gave
lower values of ρ̃. The fixed-mixed rules were no match for the stochastic
programs: both produced larger values of ρ̃ and a much smaller final surplus wealth.

Strategy                    ρ̃         final sw
1-stage Normal              0.0466    1177.29
1-stage Stable              0.0509    1077.64
2-stage Normal              0.0456    1209.22
2-stage Stable              0.0491    1217.92
Fixed-Mixed 0/40/60/0/0     0.0924     241.04
Fixed-Mixed 0/100/0/0/0     0.0776    -371.39
Table 5.8: Dynamic backtesting results.
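The realized risk measure can be recomputed from the realized values. The sketch below takes the liability value and the 1-stage normal asset wealth from table (6.5), forms the negative change in surplus wealth per unit of the previous period's asset wealth, and averages the worst observations. The notes do not state how the sample CVaR of 22 values is estimated; averaging the worst floor((1 − β) · n) losses is an assumption made here.

```python
import math

# (liability value, 1-stage normal asset wealth), 7/93 to 7/04, table (6.5)
rows = [
    (1000.00, 1000.00), (1067.19, 1036.08), (936.20, 1006.58),
    (932.01, 1015.83), (1085.82, 1119.68), (1267.52, 1230.57),
    (1137.78, 1245.16), (1208.93, 1545.82), (1354.74, 1894.36),
    (1503.39, 1961.80), (1566.35, 2259.70), (1726.53, 2599.19),
    (1536.38, 2531.37), (1528.22, 2568.31), (1715.38, 2694.08),
    (1864.70, 2884.14), (1915.56, 3003.98), (1960.23, 3100.08),
    (2068.61, 3230.29), (2291.06, 3393.44), (2180.78, 3405.28),
    (2400.67, 3558.11), (2392.77, 3570.05),
]

assets = [a for _, a in rows]
surplus = [a - l for l, a in rows]

# Loss per period: negative change in surplus wealth per unit of the
# asset wealth held at the start of the period.
losses = [-(surplus[t + 1] - surplus[t]) / assets[t]
          for t in range(len(rows) - 1)]


def tail_mean(values, beta):
    """Average of the worst floor((1 - beta) * n) values (assumed estimator)."""
    k = max(1, math.floor((1.0 - beta) * len(values) + 1e-9))
    worst = sorted(values, reverse=True)[:k]
    return sum(worst) / k


rho_tilde = tail_mean(losses, 0.80)
final_sw = surplus[-1]
print(round(rho_tilde, 4), round(final_sw, 2))
```

With this estimator the 22 realized periods give ρ̃ ≈ 0.0466 and a final surplus wealth of about 1177.28, in line with the 1-stage Normal row of table (5.8) up to rounding.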
Figures (5.9-5.11) show the evolution of the asset wealths and liability value
over the time horizon. One can see that minimizing CVaR does not look like a
typical index tracking problem because the upside is not penalized. The asset
wealths and the liability values are in table (6.5), and the optimal allocations can
be found in the appendix. These tables also include the percent of asset wealth
lost to transaction costs.
An additional comparison of the performance of the stable and normal distributions
can be obtained by VaR backtesting similar to section 5.1.2. Future
material on this issue will be provided in the lecture.
[Figure 5.9: Dynamic backtesting: 1-stage versus 2-stage under the normal assumption. Series: 2-stage asset wealth, 1-stage asset wealth, liability value; time axis 1994 to 2004.]
[Figure 5.10: Dynamic backtesting: 1-stage versus 2-stage under the stable assumption. Series: 2-stage asset wealth, 1-stage asset wealth, liability value; time axis 1994 to 2004.]
[Figure 5.11. Series: 0/40/60/0/0 asset wealth, 0/100/0/0/0 asset wealth, liability value; time axis 1994 to 2004.]
Optimal First Stage Allocations
λ E(sw3) ρ2 CVaR1 CVaR2 Cash Bonds Equities Int. Eq. Mortgages
0 72.18 399.46 319.44 479.48 0 0 0 100 0
0.10 71.68 389.07 319.44 458.70 0 0 0 100 0
0.20 70.82 384.04 319.44 448.63 0 0 0 100 0
0.25 64.49 364.44 299.09 429.80 0 10.6483 0 89.3517 0
0.30 42.92 306.50 232.28 380.71 0 48.3706 0 51.6294 0
0.35 32.63 284.80 207.23 362.36 0 64.9963 0 35.0037 0
0.40 25.61 273.09 194.71 351.47 0 75.1793 0 24.8207 0
0.45 19.75 265.12 186.68 343.57 0 82.9291 0 17.0709 0
0.50 15.08 259.96 182.16 337.76 0 88.1928 0 11.8072 0
0.60 6.72 253.12 177.31 328.92 0 95.7591 0 4.2409 0
0.75 -2.74 248.36 175.47 321.24 0 100 0 0 0
1.00 -20.23 246.13 175.47 316.79 0 100 0 0 0
Table 6.1: Efficient frontier under the normal assumption with β = 0.95.
Optimal First Stage Allocations
λ E(sw3) ρ2 CVaR1 CVaR2 Cash Bonds Equities Int. Eq. Mortgages
0 70.47 409.60 321.91 497.29 0 0 0 100 0
0.10 70.08 401.18 321.91 480.45 0 0 0 100 0
0.20 69.36 396.97 321.91 472.03 0 0 0 100 0
0.25 68.92 395.45 321.91 468.99 0 0 0 100 0
0.30 57.09 365.74 288.65 442.84 0 20.8749 0 79.1251 0
0.35 43.60 337.37 256.10 418.64 0 44.3374 0 55.6626 0
0.40 34.79 322.43 239.12 405.73 0 58.8847 0 41.1153 0
0.45 27.67 312.73 228.49 396.97 0 69.8927 0 30.1073 0
0.50 21.83 306.25 221.75 390.75 0 78.1959 0 21.8041 0
0.60 13.85 299.59 216.06 383.12 0 87.1561 0 12.8439 0
0.75 2.97 294.23 212.78 375.68 0 95.7426 0 4.2574 0
1.00 -19.85 291.21 212.08 370.34 0 100 0 0 0
Table 6.2: Efficient frontier under the stable assumption with β = 0.95.
Optimal First Stage Allocations
λ E(sw3) ρ2 CVaR1 CVaR2 Cash Bonds Equities Int. Eq. Mortgages Trans. Costs
0 59.28 415.61 328.26 502.96 0 0 0 99.0050 0 0.9950
0.10 58.80 405.57 328.26 482.89 0 0 0 99.0050 0 0.9950
0.20 57.83 399.83 328.26 471.41 0 0 0 99.0050 0 0.9950
0.25 37.16 327.52 251.65 403.39 0 40.0000 0 59.4030 0 0.5970
0.30 34.81 321.78 247.18 396.37 0 42.6534 0 56.7496 0 0.5970
0.35 21.80 294.29 217.23 371.35 0 61.8604 0 37.5426 0 0.5970
0.40 13.93 281.12 203.63 358.61 0 72.4813 0 26.9217 0 0.5970
0.45 7.14 271.91 194.45 349.36 0 81.0294 0 18.3736 0 0.5970
0.50 2.07 266.31 189.48 343.14 0 86.5774 0 12.8256 0 0.5970
0.60 -6.25 259.47 184.36 334.58 0 94.1698 0 5.2331 0 0.5971
0.75 -15.27 254.86 181.98 327.75 0 99.4030 0 0 0 0.5970
1.00 -31.96 253.21 181.98 324.45 0 99.4030 0 0 0 0.5970
Table 6.3: Efficient frontier under the normal assumption with transaction costs and β = 0.95.
Table 6.4: Wealth optimization under the normal assumption with no transaction costs and β = 0.95.
Columns: Date; Liability Value; Asset Wealth for 1-stage Normal, 1-stage Stable,
2-stage Normal, 2-stage Stable, Fixed-mixed 0/40/60/0/0, Fixed-mixed 0/100/0/0/0
7/93 1000.00 1000.00 1000.00 1000.00 1000.00 1000.00 1000.00
1/94 1067.19 1036.08 1033.60 1034.93 1032.71 1067.69 1034.72
7/94 936.20 1006.58 1003.17 1005.07 1002.12 1031.23 1000.94
1/95 932.01 1015.83 1009.89 1014.77 1009.76 1061.09 1010.78
7/95 1085.82 1119.68 1106.91 1118.45 1108.27 1233.43 1102.13
1/96 1267.52 1230.57 1213.50 1235.10 1219.35 1376.50 1182.06
7/96 1137.78 1245.16 1227.06 1250.82 1234.14 1382.09 1163.18
1/97 1208.93 1545.82 1486.14 1552.85 1532.14 1609.50 1220.62
7/97 1354.74 1894.36 1818.87 1902.97 1877.59 1862.24 1288.37
1/98 1503.39 1961.80 1883.63 1970.72 1944.43 1937.84 1351.51
7/98 1566.35 2259.70 2169.65 2269.97 2239.69 2136.26 1389.74
1/99 1726.53 2599.19 2495.61 2611.01 2576.17 2371.80 1460.65
7/99 1536.38 2531.37 2448.79 2555.91 2543.31 2411.78 1424.34
1/00 1528.22 2568.31 2490.92 2610.18 2603.17 2498.60 1433.68
7/00 1715.38 2694.08 2612.77 2729.57 2724.48 2599.06 1509.31
1/01 1864.70 2884.14 2803.19 2910.55 2916.84 2621.32 1631.85
7/01 1915.56 3003.98 2920.14 3030.85 3038.17 2495.37 1700.87
1/02 1960.23 3100.08 3013.55 3127.81 3135.36 2436.33 1755.28
7/02 2068.61 3230.29 3140.12 3259.18 3267.05 2202.77 1829.00
1/03 2291.06 3393.44 3298.72 3423.79 3432.06 2176.37 1921.38
7/03 2180.78 3405.28 3310.23 3435.73 3444.03 2398.13 1928.08
1/04 2400.67 3558.11 3458.80 3589.93 3598.60 2659.32 2014.61
7/04 2392.77 3570.05 3470.40 3601.98 3610.68 2633.80 2021.38
Table 6.5: Dynamic backtesting: Realized liability value and asset wealths for the
optimal allocations with β = 0.80.
Date Cash Bonds Equities Int. Eq. Mortgages Trans. Costs
(initial) 0 40 60 0 0
7/93 0 89.9043 4.9691 4.5791 0 0.5476
1/94 0 89.7865 0.1678 9.9954 0 0.0503
7/94 0 85.6719 9.9092 4.3211 0 0.0979
1/95 0 89.7313 10.2284 0 0 0.0403
7/95 0 58.9767 40.7269 0 0 0.2964
1/96 0 0 99.4273 0 0 0.5727
7/96 0 0 100 0 0 0
1/97 0 0 100 0 0 0
7/97 0 0 100 0 0 0
1/98 0 0 100 0 0 0
7/98 0 0 100 0 0 0
1/99 0 88.4698 10.6410 0 0 0.8891
7/99 0 82.3947 17.5437 0 0 0.0616
1/00 0 81.7418 18.2582 0 0 0
7/00 0 91.9907 7.9093 0 0 0.1000
1/01 0 99.9294 0 0 0 0.0706
7/01 0 100 0 0 0 0
1/02 0 100 0 0 0 0
7/02 0 100 0 0 0 0
1/03 0 100 0 0 0 0
7/03 0 100 0 0 0 0
1/04 0 100 0 0 0 0
Table 6.6: Dynamic backtesting: Allocations (as a percent of asset wealth) for the
1-stage optimization problem under the normal assumption with β = 0.80.
Date Cash Bonds Equities Int. Eq. Mortgages Trans. Costs
(initial) 0 40 60 0 0
7/93 0 91.4944 3.7318 4.2140 0 0.5599
1/94 0 90.9029 0 9.0523 0 0.0448
7/94 0 86.5537 9.8023 3.5455 0 0.0985
1/95 0 89.8535 10.1135 0 0 0.0330
7/95 0 50.4015 49.2155 0 0 0.3830
1/96 0 0 99.5129 0 0 0.4871
7/96 0 0 100 0 0 0
1/97 0 0 100 0 0 0
7/97 0 0 100 0 0 0
1/98 0 0 100 0 0 0
7/98 0 0 100 0 0 0
1/99 0 82.2585 16.9148 0 0 0.8267
7/99 0 67.1027 32.7497 0 0 0.1477
1/00 0 66.1382 33.8618 0 0 0
7/00 0 89.5188 10.2507 0 0 0.2305
1/01 0 99.9081 0 0 0 0.0919
7/01 0 100 0 0 0 0
1/02 0 100 0 0 0 0
7/02 0 100 0 0 0 0
1/03 0 100 0 0 0 0
7/03 0 100 0 0 0 0
1/04 0 100 0 0 0 0
Table 6.7: Dynamic backtesting: First stage allocations (as a percent of asset
wealth) for the 2-stage optimization problem under the normal assumption with
β = 0.80.
Columns: Date; 1-stage Normal; 1-stage Stable; 2-stage Normal (CVaR1, CVaR2);
2-stage Stable (CVaR1, CVaR2)
7/93 139.97 118.33 139.99 248.58 118.35 211.53
1/94 165.24 165.42 166.51 286.72 166.45 285.05
7/94 13.25 24.93 14.91 77.33 26.09 95.20
1/95 -21.85 -2.31 -20.81 25.29 -2.22 52.64
7/95 90.97 112.84 92.62 202.52 111.41 229.59
1/96 165.43 199.19 159.34 217.74 192.03 262.59
7/96 17.94 53.73 11.81 89.41 46.62 139.86
1/97 -294.58 -201.09 -302.18 -335.87 -253.20 -259.12
7/97 -530.24 -415.32 -539.84 -622.72 -481.01 -531.86
1/98 -343.84 -223.30 -353.58 -327.22 -290.62 -217.16
7/98 -603.38 -471.40 -614.36 -624.62 -545.81 -522.57
1/99 -682.94 -569.73 -693.82 -572.10 -652.59 -544.71
7/99 -914.55 -825.38 -937.53 -938.14 -921.85 -929.62
1/00 -971.35 -876.49 -1009.61 -1015.10 -987.07 -971.63
7/00 -839.05 -731.40 -871.71 -770.64 -847.11 -725.54
1/01 -911.81 -782.59 -939.38 -834.34 -904.92 -756.63
7/01 -986.08 -850.41 -1014.32 -934.61 -975.67 -848.00
1/02 -1053.78 -948.54 -1082.97 -1014.99 -1077.36 -991.05
7/02 -1032.56 -886.39 -1063.62 -965.99 -1023.88 -879.87
1/03 -899.44 -761.41 -931.94 -768.19 -904.66 -706.25
7/03 -1013.32 -881.03 -1046.02 -906.55 -1024.45 -858.39
1/04 -933.25 -832.85 -967.31 -802.10 -982.10 -820.57
Table 6.8: Dynamic backtesting: Optimal values of CVaR0.80 for scenarios gener-
ated under the normal and stable assumptions.