Munich Personal RePEc Archive
CALLABLE SWAPS, SNOWBALLS
AND VIDEOGAMES
Albanese, Claudio
Independent Consultant
9 September 2007
Online at https://mpra.ub.uni-muenchen.de/5229/
MPRA Paper No. 5229, posted 09 Oct 2007 UTC
CALLABLE SWAPS, SNOWBALLS AND VIDEOGAMES
CLAUDIO ALBANESE
Abstract. Although economically more meaningful than the alternatives,
short rate models have been dismissed for financial engineering applications in
favor of market models as the latter are more flexible and best suited to cluster
computing implementations. In this paper, we argue that the paradigm shift
toward GPU architectures currently taking place in the high performance computing world can potentially change the situation and tilt the balance back in
favor of a new generation of short rate models. We find that operator methods
provide a natural mathematical framework for the implementation of realistic
short rate models that match features of the historical process such as stochastic monetary policy, calibrate well to liquid derivatives and provide new
insights on complex structures. In this paper, we show that callable swaps,
callable range accruals, target redemption notes (TARNs) and various flavors
of snowballs and snowblades can be priced with methods numerically as precise, fast and stable as the ones based on analytic closed form solutions by
means of BLAS level-3 methods on massively parallel GPU architectures.
1. Introduction
Fixed income derivatives have a history spanning three decades and represent
the playground where many of the innovations in Mathematical Finance have been
first introduced. One-factor short rate models are described in (Vasicek 1977),
(Ho and Lee 1986), (Hull and White 1993), (Cox et al. 1985) and (Black and
Karasinski 1991). Multi-factor short rate models are in (Cheyette 1992), (Longstaff
and Schwartz 2001), (Chen 1996) and general affine models are in (Duffie and
Kan 1996) and (Dai and Singleton 2003). All these models are characterized by at
least partial analytic solvability for the discount curve and caplets. The constraint
of analytic solvability derives from engineering considerations and applicability to
pricing purposes. However, requiring solvability lessens the main advantage of such
models which is the potential proximity between the pricing measure as is revealed
by derivative prices and the real world econometric estimates based on historical
time series.
Market models based on an over-complete set of stochastic processes constrained
by drift restrictions are in (Heath et al. 1992) and in (Brace et al. 1997), (Jamshidian
1997) and (Sandmann and Sondermann 1997). Market models are more flexible
and calibrate better to swaption prices than the traditional short rate models. The
implied dynamics for swaption prices and the term structure itself however do not
reflect econometric evidence. This is often compensated for by using a market model
to decompose an exotic in elementary contracts and then applying a model such as
SABR in (Hagan et al. 2002) to risk manage at the book level.
Date: first version: September 9th, 2007. Revised October 1, 2007.
I am deeply indebted to Manlio Trovato for the many discussions and work in collaboration
on interest rate modeling.
The Markov functional model in (Hunt et al. 2000) can be regarded as a light
version of a BGM model which offers a number of computational advantages as it
is implementable on lattices, see (Bennett and Kennedy 2005). The technique of
recombining bushy trees discussed in (Tang and Lange 2001) also offers a systematic
way of building recombining lattice models out of a BGM specification. These
lattice models however have limitations as they face the challenge of having to
substantially prune the number of nodes to control dimensional explosion. The
industry standard for path dependent options such as target redemption notes
(TARNs), callable swaps such as CMS spread range accruals, callable snowballs
and TARN snowblades is currently to use market models and Montecarlo methods,
combined with the variance reduction techniques in (Longstaff and Schwartz 2001).
Let us recall that a CMS is a swap paying LIBOR or a fixed rate against a
floating rate given by a fixed tenor swap rate prevailing at the time the coupon
is paid. A TARN is a structure which terminates whenever either a maturity is
reached or the total amount paid on the structured coupon leg exceeds a certain
threshold. A snowball is a structured swap with a funding leg and a coupon stream
whereby the coupon paid on a given date is given by the sum of a fraction of the
coupon paid in the previous period plus an amount determined by the realization
of the rate process in the coupon period itself. A snowblade is similar to a snowball
except that it is not callable and termination is triggered by a TARN condition.
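To fix ideas, the short sketch below (a hypothetical Python fragment, not tied to any model in this paper) computes the structured coupons of a snowball along one rate path and checks the TARN termination condition; the coupon rule max(strike − rate, 0), the fraction k, the strike and the target are arbitrary values chosen only for illustration.

    # Hypothetical illustration of snowball coupons and a TARN trigger along one path;
    # the coupon rule, k, strike and target below are arbitrary example values.

    def snowball_coupons(rates, k=0.5, strike=0.04):
        """Coupon stream C_i = k*C_{i-1} + max(strike - rate_i, 0), with C_0 = 0."""
        coupons, prev = [], 0.0
        for r in rates:
            c = k * prev + max(strike - r, 0.0)
            coupons.append(c)
            prev = c
        return coupons

    def tarn_termination(coupons, target):
        """Index of the first coupon date at which the accrued total exceeds the
        target, or None if the target is never reached (the note runs to maturity)."""
        total = 0.0
        for i, c in enumerate(coupons):
            total += c
            if total > target:
                return i
        return None

    path = [0.050, 0.042, 0.031, 0.025, 0.028, 0.035]   # one hypothetical rate path
    cs = snowball_coupons(path)
    print(cs, tarn_termination(cs, target=0.05))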
From a computational viewpoint, short rate lattice models typically execute
on single threaded CPUs. Montecarlo schemes for market models are naturally
parallelizable and are well suited to cluster computing implementations. The predominance of cluster computing in the last decade was one of the leading factors sustaining the prevalence of market models.
In this paper, we introduce a new and flexible modeling framework for short rate
models which does not rely on analytic solvability in any sense at all. Instead of evaluating hypergeometric functions, as our main numerical engine we make use of matrix-matrix multiplication routines implemented on the massively parallel multi-core architectures typical of emerging graphics processing units (GPUs). GPUs are
particularly interesting from an engineering viewpoint as they are mass-produced
and low cost due to the applications to video-games and computer aided design. We
show that consideration of this more general class of models allows one to overcome the limitations of traditional analytically solvable models: such models can be made to resemble to a greater degree the econometric evidence, can be calibrated well and can be used
as a basis for efficient pricing of exotic structures such as callable swaps, TARNs,
snowballs and snowblades.
GPU chipsets are based on arrays of single-instruction-multiple-data (SIMD)
processors running at lower frequency than current CPUs but capable of processing hundreds of threads per clock cycle. Unlike CPU clusters, GPU multi-core
chipsets have large blocks of fast shared memory and fast interconnects. Also
unlike CPUs, GPUs are optimized for single precision arithmetic, while double
precision is implemented in software and typically comes with a performance impact of about a factor 10. These architectures are particularly suitable to execute
tasks that can leverage on a high degree of parallelism, fast shared memory and
broad communication bandwidth such as BLAS level-3 methods and in particular
large matrix-matrix multiplications. The level-3 methods we use involve time discretization steps as small as one hour and for this reason exhibit internal
smoothing mechanisms which are not present in sparse BLAS level-2 methods or
Montecarlo methods. Thanks to the internal smoothing properties, calculations can
be safely carried out in single precision in their entirety. One can even safely numerically differentiate probability kernels to find sensitivities and price path-dependent
options within single precision floating point arithmetic.
In this paper, we show how to structure pricing algorithms based on level-3 BLAS
which execute particularly efficiently on GPUs, are robust in the native single precision and allow one to price a variety of fixed income exotics including European
and Bermuda swaptions, callable constant maturity swaps (CMSs), callable CMS
spread range accruals, CMS and CMS spread target redemption notes (TARNs),
callable and TARN snowballs. The model can be freely specified as long as it is a
Markov model for the short rate and possibly other factors under the risk neutral
measure. Specification strategies can vary; one could proceed non-parametrically
as well as semi-parametrically. The numerical scheme we use is based on a lattice
which on current hardware can realistically admit up to 1500 sites (we use 672
in our example) and an elementary time discretization step not longer than a few
hours at most. Partial explanations for the smoothing phenomenon in terms of
rigorous results are in (Albanese and Mijatovic 2006) and (Albanese 2006). We
find that the smallness of the time discretization step is crucial for the existence
of internal smoothing and the usability of a single precision engine. Padé approximants and implicit methods can be implemented on such architectures using mixed
precision iterative methods as discussed in (Buttari et al. 2007). From the mathematical viewpoint, instead of analytic closed form solutions we make use of operator
methods, functional calculus and algebraic manipulations, see (Albanese 2006).
The paper is structured as follows. In Section 2, we describe our working model,
a 2-factor stochastic monetary policy model. This model could easily be extended
by introducing a stochastic volatility term, but for simplicity’s sake we don’t pursue this possibility. In Section 3, we review the basic concepts behind operator
methods such as path-ordered exponentials, the fast-exponentiation algorithm, the
Feynman path-integral expansion, Dyson’s formula, pointwise kernel convergence
and diffusion smoothing. Section 4 is about the theory of Abelian processes and
how this applies to path-dependent structured products. Section 5 covers moment
methods. Section 6 contains a discussion of scaling laws and numerical complexity
on massively parallel GPU architectures. We present strategies to price portfolios of
exotics collectively and valuation strategies for callable and TARN snowball structures. Lattice models for TARNs are discussed in (Albanese and Trovato 2006); that method is based on the theory of Abelian processes and was first implemented on a CPU architecture. To our knowledge, this is the first article in the public literature
where snowballs are priced by means of a lattice model implemented on a massively
parallel GPU architecture.
2. A working example: A stochastic monetary policy short rate
model
The numerical methods described in this paper apply in great generality to all
multi-factor interest rate models based on a Markov process where one of the factors
is the short rate and the representation corresponds to the risk neutral measure
whereby one uses the money market account as the numeraire asset. No analytic
solvability assumption is made, not even for the term structure of interest rates.
The only limitation is given by the number of sites in the discretization lattice for
the short rate itself and the additional factors. To illustrate the methodology, we
discuss in detail a relatively simple example with 84 discretization points for the
short rate and 8 regimes, for a total of 672 lattice sites. However, lattices with up
to 1500 sites are still quite realistic from the engineering viewpoint given current
hardware.
Although short rate models are within range of our approach, the lattice size
limitation makes it impractical to implement market models describing an overcomplete set of risk factors subject to drift restrictions. The problem is not so
much with the drift restriction, but rather with the sheer number of forward rates in a market model which would need to be discretized and fitted on a lattice. In this case,
the technique of non-exploding bushy trees in (Tang and Lange 2001) provides an
effective strategy for lattice discretization which is however quite distinct from our
approach.
To take advantage of GPU platforms, we use operator methods from the start
and express our model by means of a Markov generator given by a time dependent
matrix of the form
(1)   L(x, a; x', a'; t) = D(x, x' | a, t)\, \delta_{a a'} + T_{mp}(a, a' | x, t)\, \delta_{x x'}
A state variable is given by a couple y = (x, a) whereby x is related to the short
rate by a function of the form
(2)
r̄(x, t) ≡ r(x)λ(t).
The function r(x) is specified initially as a regular grid extending from 0 to 50%.
The variable a is a monetary policy regime indicator. In our example, x = 0, ..., 83.
The function λ(t) is piecewise constant in time and is determined iteratively, starting
from current time t = 0, in such a way as to fit the term structure of interest rates, as
explained below. The function λ(t) that applies to our example is plotted in Figure
2. Notice that this function does not deviate from 1 by more than 5%. This is
obtained by construction: we aim at explaining the general shape of the yield curve
by specifying the stochastic monetary policy dynamics as opposed to doing so by
tweaking the function λ(t).
The mean reversion term D(x, x'|a, t) is a tri-diagonal matrix such that D(x, x'|a, t) = 0 if |x − x'| ≥ 2 and its first and second moments are given by

(3)   \kappa_1 \big( \theta_1 - \bar r(x, t) \big) + \mu_a(x) \equiv \sum_{x'} D(x, x' | a, t)\, \big( r(x') - r(x) \big),

(4)   \sigma_a^2 \, r(x)^{2 \beta(x)} \equiv \sum_{x'} D(x, x' | a, t)\, \big( r(x') - r(x) \big)^2.

Here, κ1 = 6.25% and θ1 = 13%. The variable a takes eight values and distinguishes between 8 different monetary policies: a = 0 corresponds to a deflation scenario and the other states for a > 0 correspond to a normal monetary policy regime with rates falling, stable or rising at various rates. More precisely, the diffusion term D(x, x'|a, t) is a tri-diagonal matrix such that D(x, x'|a, t) = 0 if |x − x'| ≥ 2 and its first and second moments are given by

(5)   \mu_a(x) \equiv \sum_{x'} D(x, x' | a, t)\, \big( r(x') - r(x) \big),

(6)   \sigma(t)^2 \, r(x)^{2 \beta(x)} \equiv \sum_{x'} D(x, x' | a, t)\, \big( r(x') - r(x) \big)^2.
Here, σ(t) = 11.5% + 6.5% (1 − e^{-t}) and the functions µa(x) are given by the
following table:
a          a = 0              a = 1     a = 2     a = 3     a = 4    a = 5    a = 6    a = 7
µa(x)      κ0 (θ0 − r(x))     −75bp     −50bp     −25bp     0bp      25bp     50bp     100bp

Table 1. The drift µa(x) for each monetary policy regime.
Here, κ0 = 38% and θ0 = 0.25%. The monetary policy dynamics is chosen
in such a way as to be in qualitative agreement with the historical process in Figure 1. A
more refined, quantitative econometric analysis would be useful and possible here,
but venturing in this direction is not the objective of this article. See (Dai and
Singleton 2003) and (Ait-Sahalia et al. 2005) and references therein for a review of
econometric studies. Finally, the function β(x) is calibrated to the implied volatility
backbone, as explained below.
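As an illustration of how a tri-diagonal matrix with the prescribed first and second moments can be assembled on the rate grid, the following sketch (hypothetical Python with numpy; the drift and variance targets are example inputs with β(x) = 1, not the calibrated model) solves the two moment conditions (5)-(6) row by row and fills the diagonal so that each row sums to zero.

    import numpy as np

    # Hypothetical sketch: build a tri-diagonal generator D on the rate grid from
    # target first and second moments, as in eqs. (5)-(6); the drift and variance
    # targets below are example inputs (beta(x) = 1), not the calibrated model.

    def tridiagonal_generator(r, mu, var):
        """Tri-diagonal Markov generator with sum_x' D[x,x'](r[x']-r[x]) = mu[x]
        and sum_x' D[x,x'](r[x']-r[x])^2 = var[x]; rows sum to zero."""
        n = len(r)
        D = np.zeros((n, n))
        for x in range(1, n - 1):
            hu, hd = r[x + 1] - r[x], r[x - 1] - r[x]
            # solve u*hu + d*hd = mu[x] and u*hu^2 + d*hd^2 = var[x]
            # for the up and down rates u, d of row x
            A = np.array([[hu, hd], [hu ** 2, hd ** 2]])
            u, d = np.linalg.solve(A, [mu[x], var[x]])
            u, d = max(u, 0.0), max(d, 0.0)   # crude positivity floor; moments only approximate when it binds
            D[x, x + 1], D[x, x - 1] = u, d
            D[x, x] = -(u + d)
        # simple one-sided boundary rows, kept minimal for the illustration
        D[0, 1] = max(mu[0], 0.0) / (r[1] - r[0]); D[0, 0] = -D[0, 1]
        D[-1, -2] = max(-mu[-1], 0.0) / (r[-1] - r[-2]); D[-1, -1] = -D[-1, -2]
        return D

    r = np.linspace(0.0, 0.50, 84)              # the 84-point short rate grid
    mu = 0.0625 * (0.13 - r)                    # example drift kappa_1*(theta_1 - r)
    var = (0.115 * np.maximum(r, 1e-3)) ** 2    # example local variance sigma^2 * r^2
    D = tridiagonal_generator(r, mu, var)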
The time dependent matrix Tmp(a, a'|x, t) gives the transition probability rates between monetary policy regimes. This matrix is parametrized in terms of a transition probability intensity

(7)   p(t) = p_0 + (p_1 - p_0)\, e^{-\tau_p t},

with p0 = 50%, p1 = 40%, τp = 3Y, a mean reversion rate

(8)   k(t) = k_0 + (k_1 - k_0)\, e^{-\tau_k t},

with k0 = 12%, k1 = 4%, τk = 15Y, and a monetary policy drift term

(9)   m(t) = m_0 + (m_1 - m_0)\, e^{-\tau_m t},

with m0 = −15%, m1 = 1% and τm = 0.5Y. If a > 0, we set

(10)   T_{mp}(a, a+1 | x, t) = \frac{1}{2} p(t) + (4 - a)\, k(t) + m(t).

If a − 1 > 0, then

(11)   T_{mp}(a, a-1 | x, t) = \frac{1}{3} p(t) - \frac{2}{3} (4 - a)\, k(t) - \frac{2}{3} m(t),

and if a − 2 > 0,

(12)   T_{mp}(a, a-2 | x, t) = \frac{1}{6} p(t) - \frac{1}{3} (4 - a)\, k(t) - \frac{1}{3} m(t).

The deflation regime is a bit special and for it we set

(13)   T_{mp}(0, 3 | x, t) = T_{mp}(0, 4 | x, t) = 5\%

and, if a = 1, 2, 3, we set

(14)   T_{mp}(a, 0 | x, t) = 10\% \, (4 - a) \left( 1 - \frac{r(x)}{r_0} \right)_{+}.

Here r0 = 5%. All other matrix elements of Tmp(a, a'|x, t) are zero.
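The fragment below (again a hypothetical Python sketch, using the parameter values quoted above and the exponential forms as printed) assembles the regime transition intensities of equations (7)-(14); filling the diagonal so that rows sum to zero is an assumption consistent with Tmp being a Markov generator, and no positivity adjustment is applied.

    import numpy as np

    # Hypothetical assembly of the regime transition intensities T_mp(a, a'|x, t)
    # of eqs. (7)-(14), with the parameter values and exponential forms as printed;
    # the zero-sum diagonal is an assumption (generator convention) and no
    # positivity adjustment is applied in this sketch.

    def regime_generator(t, r_x, r0=0.05):
        """T_mp(a, a'|x, t) for one value r_x of the short rate grid function r(x)."""
        p = 0.50 + (0.40 - 0.50) * np.exp(-3.0 * t)    # eq. (7), tau_p = 3Y
        k = 0.12 + (0.04 - 0.12) * np.exp(-15.0 * t)   # eq. (8), tau_k = 15Y
        m = -0.15 + (0.01 + 0.15) * np.exp(-0.5 * t)   # eq. (9), tau_m = 0.5Y
        T = np.zeros((8, 8))
        for a in range(1, 8):
            if a + 1 <= 7:                             # a = 7 has no higher regime
                T[a, a + 1] = p / 2 + (4 - a) * k + m                    # eq. (10)
            if a - 1 > 0:
                T[a, a - 1] = p / 3 - 2 * (4 - a) * k / 3 - 2 * m / 3    # eq. (11)
            if a - 2 > 0:
                T[a, a - 2] = p / 6 - (4 - a) * k / 3 - m / 3            # eq. (12)
        T[0, 3] = T[0, 4] = 0.05                                         # eq. (13)
        for a in (1, 2, 3):
            T[a, 0] = 0.10 * (4 - a) * max(1.0 - r_x / r0, 0.0)          # eq. (14)
        np.fill_diagonal(T, -T.sum(axis=1))            # rows sum to zero
        return T

    T = regime_generator(t=1.0, r_x=0.03)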
We stress that this is just a working example chosen here to illustrate the methods and a calibration exercise. The mathematics and numerical analysis in the following sections apply regardless of the specific model specification. The yield curves produced by this model are in Figures 3, 4, 5, 6, 7. The
projected monetary policy scenarios are given in Figure 10 and the state dependent
correlations are plotted in Figures 9 and 8.
Figure 1. History of short rates (fund rates) for US dollar, the
Euro and the Japanese Yen.
Figure 2. Time dependent adjustment function λ(t) to fit the
term structure of rates. Notice that this function is very close to
1. This is achieved by calibrating the drift of the monetary policy
process.
3. Numerical methods
To price interest rate derivatives under a short rate model we need to accomplish
a few basic tasks. The first is to evaluate numerically the discounted propagator
satisfying the equation
(15)   \frac{d}{dt_1} U(t_1, t_2) + \big( L(t_1) - \lambda(t_1)\, r \big)\, U(t_1, t_2) = 0,
where r is the multiplication operator by the short rate grid function r(x), with
final time condition U (t2 , t2 ) = I, the identity operator, for all time ordered pairs
t1 ≤ t2 . This is the propagator which is used to discount cash flows under the risk
neutral measure. Having done this, we need to evaluate more general propagators
carrying path information for path-dependent options.
The single-name propagator satisfying equation (15) is given by the path-ordered
exponential
(16)   U(t_1, t_2) = \mathcal{P} \exp \left( \int_{t_1}^{t_2} \big( L(s) - \lambda(s)\, r \big)\, ds \right).
There are two useful methods to express a path-ordered exponential. One is the
Feynman path-integral expansion
(17)   \mathcal{P} \exp \left( \int_{t_1}^{t_2} L(s)\, ds \right) = \lim_{N \to \infty} \big( I + \delta t_N L(t_1) \big) \big( I + \delta t_N L(t_1 + \delta t_N) \big) \cdots \big( I + \delta t_N L(t_N) \big),

where \delta t_N = (t_2 - t_1)/N. The second is Dyson's formula

(18)   \mathcal{P} \exp \left( \int_{t_1}^{t_2} L(s)\, ds \right) = \sum_{n=0}^{\infty} \frac{1}{n!}\, \mathcal{P} \left( \int_{t_1}^{t_2} L(s)\, ds \right)^{n},

where

(19)   \mathcal{P} \left( \int_{t_1}^{t_2} L(s)\, ds \right)^{n} = n! \int_{t_1}^{t_2} ds_1 \int_{s_1}^{t_2} ds_2 \cdots \int_{s_{n-1}}^{t_2} ds_n \, L(s_1) L(s_2) \cdots L(s_n).
Dyson’s formula is useful from a theoretical viewpoint. In the next section we
provide examples of its use by deriving a numerical moment method to price a number of Abelian path-dependent options, ranging from risky annuities to volatility
swaps and cliquets.
The Feynman path integral representation is interesting in this context as this
formula can be implemented numerically very efficiently on GPU architectures by
the method of fast exponentiation. The method works as follows. Assume that
the dynamic generators L(t) are piecewise constant as a function of time. Suppose
L(t) = Li in the time interval [ti, ti + (∆t)i]. Assume that δt is chosen so small that
the following two conditions hold:

(FE1)   \min_{y \in \Lambda} \big( 1 + \delta t \, L_i(y, y) \big) \ge 1/2,

(FE2)   \log_2 \frac{(\Delta t)_i}{\delta t} = n \in \mathbb{N}.

This condition leads to intervals δt of the order of one hour of calendar time and this is indeed the choice we make. To compute e^{(\Delta t)_i L_i}(x, y), we first define the elementary propagator

(20)   u_{\delta t}(x, y) = \delta_{xy} + \delta t \, L_i(x, y)

and then evaluate in sequence u_{2\delta t} = u_{\delta t} \cdot u_{\delta t}, \; u_{4\delta t} = u_{2\delta t} \cdot u_{2\delta t}, \ldots, u_{2^n \delta t} = u_{2^{n-1}\delta t} \cdot u_{2^{n-1}\delta t}. Matrix multiplication is accomplished numerically by invoking either the
single precision routine sgemm or the double precision routine dgemm. Since GPU
floating point units are natively single precision and double precision is slower by
a factor 10, our advice is to use sgemm. Interestingly enough, we find that single
precision is robust for all practical applications we considered, including the working
example in this paper. In fact, the algorithm of fast exponentiation as described above, with a δt satisfying the above bound, has self-smoothing properties which keep the impact of floating point errors far below what one would naively expect from a back-of-the-envelope worst case estimate. The mathematical reasons
behind this phenomenon appear to be deep and not well understood; see (Albanese
and Mijatovic 2006) and (Albanese 2006) for some results in this direction.
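A minimal sketch of the fast-exponentiation step is given below in Python, with a numpy matrix product standing in for the sgemm call dispatched to the GPU through CUDA-BLAS; the three-state generator is a toy example used only to exercise the routine.

    import numpy as np

    # Hypothetical sketch of fast exponentiation by repeated squaring in single
    # precision; the numpy product stands in for sgemm on the GPU and the toy
    # generator is used only to exercise the routine.

    def fast_exponentiate(L, dt_total, dt_target=1.0 / (365 * 24)):
        """Approximate exp(dt_total * L) by squaring the elementary propagator
        u_dt = I + dt*L of eq. (20); dt_target defaults to one hour of calendar time."""
        n = int(np.ceil(np.log2(max(dt_total / dt_target, 1.0))))   # condition (FE2)
        dt = dt_total / 2 ** n                   # 2^n doublings recover dt_total exactly
        assert (1.0 + dt * np.diag(L)).min() >= 0.5, "condition (FE1) violated"
        u = np.eye(L.shape[0], dtype=np.float32) + np.float32(dt) * L.astype(np.float32)
        for _ in range(n):
            u = u @ u                            # u_{2 dt} = u_dt . u_dt, and so on
        return u

    L = np.array([[-0.6, 0.6, 0.0],              # toy 3-state Markov generator
                  [0.3, -0.6, 0.3],
                  [0.0, 0.6, -0.6]])
    P = fast_exponentiate(L, dt_total=1.0)       # one-year propagator
    print(P.sum(axis=1))                         # rows of a propagator sum to ~1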
Fast exponentiation per se suffices to evaluate all discount functions, to price European and Bermudan swaptions and to find the CMS convexity corrections. We find that calibrating model parameters to at-the-money options is quite feasible by using a Levenberg-Marquardt algorithm with box bounds, as the sensitivities are sharp and stable even if single precision arithmetic is used. The calibration results for at-the-money European swaptions are shown in Figures 11, 12 and 13. The skews we obtain are in Figures 14, 15 and 16. Notice that the persistent skews are controlled by the occupancy probability of the deflation state, as the other time dependencies in the model are concentrated on short maturities. The function β(r) is chosen in a time homogeneous way so as to fit the implied volatility backbone in Figures 17, 18 and 19. Notice that, being time homogeneous, the function β(r) does not help to create a persistent implied volatility skew. The Bermuda backbone in Figures 20 and 21 is also an important calibration target. Finally, the CMS convexity corrections, i.e. the difference between an equilibrium CMS swap spread and the plain vanilla swap of matching maturity, are given in Figures 22 and 23.
4. Abelian Processes
In (Albanese 2006) and (Albanese and Vidler 2007), we introduce the notion of
Abelian processes as an extension of the notion of stochastic integral over a diffusion
process, see also (Albanese and Trovato 2006) and (Albanese et al. 2006). Abelian
processes are characterized by an algebraic commutativity condition which allows
one to extend the key results of stochastic calculus such as Girsanov’s theorem,
Ito’s lemma, the Cameron-Martin theorem and the Feynman-Kac formula.
To explain the notion of Abelian process, consider a lattice process yt = (xt, at) such as
the one introduced in the previous section and consider a discretely valued Markov
chain process mt dependent on the path followed by yt such that the joint process
(yt , mt ) is Markov and its generator is given by the lifted operator
(21)
L̃(y, m; y ′ , m′ ; t) = L(y, y ′ ; t)δmm′ + A(m; m′ |y)δyy′ .
Consider the algebra spanned by sums and products of the operators A(y) of matrix elements A(y)mm′ = A(m; m′ |y). If this algebra is commutative, i.e. if the
commutators [A(y), A(y ′ )] ≡ A(y)A(y ′ ) − A(y ′ )A(y) vanish for all pairs y, y ′ , then
the process mt is called Abelian.
It is simple to prove that if two matrices A and B commute and A has mutually distinct eigenvalues, then they can be diagonalized simultaneously. Let u be an eigenvector
of A of eigenvalue λ. Then we have that
(22)
BAu = λBu = ABu.
Hence, also Bu is an eigenvector of A of eigenvalue λ. Since the eigenvalues of A are
assumed to be mutually distinct, Bu needs to be proportional to u, i.e. Bu = µu
for some µ. This implies that u is also an eigenvector of B of eigenvalue µ.
The set of matrices with at least two coinciding eigenvalues has zero measure in the space of all matrices: it is a codimension-1 surface, and any neighborhood of a point on this surface contains a matrix whose eigenvalues are all distinct. Hence, if a matrix has degenerate eigenvalues, there are arbitrarily small perturbations which lift the degeneracy, and one concludes that degeneracies are a fairly unusual occurrence. However, from the numerical viewpoint one has to pay attention to the phenomenon, as near degeneracies can translate into ill-conditioning. In this case, floating point errors can be amplified and give rise to numerical instabilities.
If the Abelian property is satisfied and unless near-degeneracies and ill-conditioning manifest themselves, then there exists a numerically usable non-singular linear transformation S(m; i) which diagonalizes all the operators A(m; m'|y) simultaneously.
Hence, the transformed lifted generator S −1 L̃S is a complex, block-diagonal matrix
which can be fast-exponentiated by means of the BLAS routine cgemm which implements a complex matrix-matrix multiplication in single precision. This exponential
gives the joint distribution of the pairs (yt , mt ) for all time horizons t.
Range accruals, possibly capped and floored, provide an example for the application of this theory: the time a given rate spends within a certain range is in fact an Abelian process. The maximum achieved by a given rate over a certain period of time is also an example of an Abelian process, and lookback options can thus be priced by block-diagonalization.
A discrete version of the notion of Abelian process is particularly useful to price
interest rate derivatives such as target redemption notes (TARNs). Consider a
coupon period ∆T and a state dependent payoff which is paid at times T_i = i∆T and is of the following general form

(23)   C_s(y_{i-1}, y_i, T_i) - C_f(y_{i-1}, y_i, T_i),

where C_s is the coupon paid on the so-called structured leg of the TARN and C_f is the coupon paid instead on the so-called funding leg. A TARN is characterized by a maturity T = T_n for some integer n > 0 and a target H. One says that the target is reached on the k-th coupon date T_k if

(24)   \sum_{i=1}^{k} C_s(y_{i-1}, y_i, T_i) > H.
A TARN is structured so that cash flows continue either until maturity or at the
first coupon date at which the target is reached, whichever comes first.
To price a TARN, one can consider the following lifted propagator
(25)   U_i(y_{i-1}, m_{i-1}; y_i, m_i) = \mathcal{P} \exp \left( \int_{T_{i-1}}^{T_i} L(t)\, dt \right)(y_{i-1}; y_i) \; K(y_{i-1}, m_{i-1}; y_i, m_i),

where

(26)   K(y_{i-1}, m_{i-1}; y_i, m_i) \equiv p(y_{i-1}, y_i, T_i)\, \delta\!\left( m_i - m_{i-1} - \left[ \frac{C_s(y_{i-1}, y_i, T_i)}{\Delta C_s} \right]_{-} \right) + \big( 1 - p(y_{i-1}, y_i, T_i) \big)\, \delta\!\left( m_i - m_{i-1} - \left[ \frac{C_s(y_{i-1}, y_i, T_i)}{\Delta C_s} \right]_{+} \right).
Here, (∆Cs ) is a discretization interval for the structured coupon dimension. If
ξ ∈ R, we denote with [ξ]+ the smallest integer larger than ξ and with [ξ]− the
largest integer smaller than ξ. Furthermore, p(yi−1 , yi , Ti ) is chosen so that
(27)   p(y_{i-1}, y_i, T_i) \left[ \frac{C_s(y_{i-1}, y_i, T_i)}{\Delta C_s} \right]_{-} + \big( 1 - p(y_{i-1}, y_i, T_i) \big) \left[ \frac{C_s(y_{i-1}, y_i, T_i)}{\Delta C_s} \right]_{+} = \frac{C_s(y_{i-1}, y_i, T_i)}{\Delta C_s}.
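The choice of p in equation (27) is a randomized rounding: the coupon, measured in units of ∆Cs, is split between its lower and upper integer brackets with probabilities chosen so that the discretized increment is correct in expectation. A minimal hypothetical sketch, with floor and ceiling standing in for the strict brackets [·]± of the text:

    import math

    # Hypothetical sketch of the randomized rounding behind eq. (27): a coupon c is
    # mapped to the lower or upper multiple of dc with probabilities p and 1-p such
    # that the expected discretized increment equals c/dc.

    def split_coupon(c, dc):
        lo, hi = math.floor(c / dc), math.ceil(c / dc)
        if lo == hi:
            return [(1.0, lo)]
        p = (hi - c / dc) / (hi - lo)         # solves p*lo + (1-p)*hi = c/dc
        return [(p, lo), (1.0 - p, hi)]

    print(split_coupon(0.0137, 0.005))        # 2.74 units: weight 0.26 on 2, 0.74 on 3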
Let K(y_{i-1}, y_i) be the operators with matrix elements

(28)   K(y_{i-1}, y_i)_{m_{i-1} m_i} = K(y_{i-1}, m_{i-1}; y_i, m_i).
The operators K(yi−1 , yi ) commute with each other for all choices of pairs (yi−1 , yi ).
Hence, in most cases and assuming no ill-conditioning situations arise, they can be
simultaneously diagonalized.
To complete the definition of the operator K(yi−1 , mi−1 ; yi , mi ) in a usable way,
we need to limit the range of variability of the variable m to a finite interval m =
0, 1, ...M for some M > 0 and specify boundary conditions for the process as it
reaches the site m = M . A useful strategy is to impose periodic boundary conditions
so that upon surpassing M , the process winds back and re-enters from the point
m = 0, as if the interval [0, M ] was wrapped around a circle. With this choice, the
situation is simple as the block-diagonalization can be accomplished in all cases by
means of an explicit partial Fourier transform whose implementation does not give rise to ill-conditioning. Let F be the operator such that

(29)   \hat u(y, p) = (F u)(y, p) = \sum_{m=0}^{M} e^{-i p m}\, u(y, m),

where p = 2\pi j / (M+1), j = 0, 1, \ldots, M. Then we have that the transformed operator

(30)   \hat U_i = F\, U_i\, F^{-1}

is block-diagonal and has matrix elements

(31)   \hat U_i(y_{i-1}, p_{i-1}; y_i, p_i) = \delta_{p_{i-1} p_i}\, \mathcal{P} \exp \left( \int_{T_{i-1}}^{T_i} L(t)\, dt \right)(y_{i-1}; y_i) \sum_{m=0}^{M} K(y_{i-1}, 0; y_i, m)\, e^{i p_i m}.
We conclude that the joint distribution of the discrete time process mTi and the
underlying process yTi at time Ti conditioned to starting at y0 and m0 = 0 at time
0 is given by
(32)   \big[ U_1 U_2 \cdots U_i \big](y_0, m_0; y_{T_i}, m_i) = \frac{1}{M+1} \sum_{j=0}^{M} e^{\frac{2 \pi i\, j\, (m_i - m_0)}{M+1}} \left[ \prod_{l=1}^{i} \hat U_l(p_j) \right](y_0, y_{T_i}),

where \hat U_l(p_j) is the operator with matrix elements \hat U_l(p_j)(y, y') = \hat U_l(y, p_j; y', p_j).
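To make the mechanism concrete, the following toy check (hypothetical Python; the production implementation would instead multiply the complex blocks with cgemm) verifies that the lifted one-period kernel with a periodic accrual counter coincides with the inverse discrete Fourier transform of its frequency blocks; the sign convention of the transform is chosen for internal consistency of the sketch.

    import numpy as np

    # Toy numerical check of the block-diagonalization of eqs. (29)-(32): for a
    # one-period kernel U(y,y') and integer coupon increments c(y,y'), the lifted
    # kernel on (y, m) with a periodic counter m = 0..M equals the inverse discrete
    # Fourier transform of the frequency blocks U(y,y') * exp(-i p_j c(y,y')).
    # The kernel and increments are arbitrary; only the identity itself is tested.

    rng = np.random.default_rng(0)
    ny, M = 4, 7
    U = rng.random((ny, ny)); U /= U.sum(axis=1, keepdims=True)   # toy stochastic kernel
    c = rng.integers(0, 3, size=(ny, ny))                         # discretized coupons

    # direct construction of the lifted kernel on the product space (y, m)
    lifted = np.zeros((ny, M + 1, ny, M + 1))
    for m in range(M + 1):
        for y in range(ny):
            for yp in range(ny):
                lifted[y, m, yp, (m + c[y, yp]) % (M + 1)] = U[y, yp]

    # reconstruction from the Fourier blocks, one complex ny-by-ny matrix per frequency
    p = 2 * np.pi * np.arange(M + 1) / (M + 1)
    blocks = U[None, :, :] * np.exp(-1j * p[:, None, None] * c[None, :, :])
    recon = np.zeros_like(lifted)
    for m in range(M + 1):
        for mp in range(M + 1):
            phase = np.exp(1j * p * (mp - m))
            recon[:, m, :, mp] = (phase[:, None, None] * blocks).sum(axis=0).real / (M + 1)

    print(np.allclose(lifted, recon))          # True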
Notice here one of the differences between probability theory expressed via stochastic calculus as opposed to our formalism based on operator methods: in the
former one focuses on measure changes which are given by one parameter families
of positive linear transformations, while in our case we can make full use of general complex valued linear transformations, as the applicability of operator methods
does not hinge on the existence of a probabilistic interpretation for the dynamic
generators.
5. Moment Methods
In addition to applying block-diagonalization techniques, for most Abelian payoffs one can also use a moment method based on Dyson’s expansion. The method
applies whenever one needs to evaluate moments of stochastic integrals over a
bridge, such as
(33)   J(y, y') \equiv \mathbb{E}_0\!\left[ \left( \int_0^T \phi(y_t, t)\, dt \right)^{m} \delta(y_T - y') \,\Big|\, y_0 = y \right].

Here m is an integer and [0, T] is a given interval. In this case we have that

(34)   J(y, y') = \frac{d^m}{d\epsilon^m}\Big|_{\epsilon=0}\, \mathcal{P} \exp \left( \int_0^T L_\epsilon(t)\, dt \right)(y, y'),

where

(35)   L_\epsilon(t)(y, y') = L(y, y'; t) + \epsilon\, \phi(y; t)\, \delta_{yy'}.
For instance, to evaluate a range accrual where each coupon payment is proportional to the time a given swap spread s1t − s2t is in the interval [L, U ], one can use
the formula above with
(36)   \phi(y; t) = \mathbf{1}\big( s_1(y; t) - s_2(y; t) \in [L, U] \big).
If the range accrual is capped and/or floored the problem is complicated by the
fact that in general knowledge of the first moment is not sufficient and all moments are required to reconstruct the probability distribution function for the time
elapsed in the given range. However, even in this case one can take advantage of the simplicity of the moment method by using an ansatz to extrapolate the form of the distribution from the knowledge of its first two or four moments. As an ansatz
distribution one can use for instance either the log-normal or the chi-squared distribution. In general, it is possible to verify the quality of the ansatz by using the
block-diagonalization method in the previous section, as range accruals are based
on a continuous time Abelian process.
Similarly, discrete payoffs such as TARNs can also be priced approximately by
means of moment methods. Consider a moment of a stochastic sum over a bridge
of the form
(37)   F(y, y') \equiv \mathbb{E}_0\!\left[ \left( \sum_{j=1}^{i} \phi_j(y_{T_{j-1}}, y_{T_j}) \right)^{m} \delta(y_{T_i} - y') \,\Big|\, y_0 = y \right],

where m is an integer and [0, T_1, \ldots, T_i] is a sequence of time points. In this case we have that

(38)   F(y, y') = \frac{d^m}{d\epsilon^m}\Big|_{\epsilon=0} \left[ \prod_{j=1}^{i} U_\epsilon(T_{j-1}, T_j) \right](y, y'),

where the matrix elements of the operators U_\epsilon(T_{j-1}, T_j) are given by

(39)   U_\epsilon(y, T_{j-1}; y', T_j) = \mathcal{P} \exp \left( \int_{T_{j-1}}^{T_j} L(t)\, dt \right)(y, y')\, \exp\big( \epsilon\, \phi_j(y, y') \big).
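As a sketch of this moment method, the fragment below (hypothetical Python with a toy kernel and coupon functions) forms the ε-deformed one-period operators of equation (39), multiplies them over the coupon dates and extracts the first two conditional moments of the accrued sum on a bridge by central finite differences at ε = 0.

    import numpy as np

    # Hypothetical sketch of the moment method of eqs. (37)-(39): deform the
    # one-period kernels elementwise by exp(eps * phi_j), multiply them over the
    # coupon dates and differentiate numerically at eps = 0. Toy inputs only.

    def bridge_moments(U_list, phi_list, eps=1e-4):
        """First two conditional moments of sum_j phi_j(y_{T_{j-1}}, y_{T_j}) on a
        bridge, from central finite differences of the deformed propagator product."""
        def product(e):
            P = np.eye(U_list[0].shape[0])
            for U, phi in zip(U_list, phi_list):
                P = P @ (U * np.exp(e * phi))       # eq. (39), elementwise deformation
            return P
        P0, Pp, Pm = product(0.0), product(eps), product(-eps)
        m1 = (Pp - Pm) / (2 * eps)                  # first derivative at eps = 0
        m2 = (Pp - 2 * P0 + Pm) / eps ** 2          # second derivative at eps = 0
        return P0, m1 / P0, m2 / P0                 # normalize by the bridge weight

    rng = np.random.default_rng(1)
    ny = 5
    U = rng.random((ny, ny)); U /= U.sum(axis=1, keepdims=True)
    phi = 0.01 * rng.random((ny, ny))
    P0, mean, second = bridge_moments([U, U], [phi, phi])
    print(mean[0, 0], second[0, 0])                 # moments on the bridge y0 = 0 -> y = 0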
When pricing a TARN instead, one needs to proceed by iteration going forward
in time. In this case, at every step one needs to apply the kernel to a probability
distribution function ψ(y, Tj−1 ) which is initialized to have unit mass at the initial
point y = y0. The function φj(yTj−1, yTj) in the kernel is the j-th coupon function. At each date, one then numerically differentiates with respect to ǫ at least twice and evaluates at ǫ = 0. This provides the first two moments of the distribution of the total accrued by the structured leg on a given bridge originating from the spot
and allows one to reconstruct the probability distribution function and the survival
probability conditioned on a bridge. This survival probability needs to be applied
to each coupon amount which then has to be discounted to current time using the
discounted transition probability kernel.
An application of moment methods is given in Figures 25, 26, 27, 28, 29, 30, 31
where we show the pricing functions of callable CMS spread range accruals.
6. Scaling of algorithmic complexity, portfolios and snowballs
When using GPUs it is important to familiarize ourselves with fairly counterintuitive scaling properties for the run time required by various tasks, as this is not
proportional to the traditional concept of algorithmic complexity intended as the
total number of floating point operations. Since matrix-matrix multiplications are
distributed across a large number of cores, the larger the matrices one multiplies, the better the hardware performance. In this section we show how these concepts
apply to portfolio valuations and pricing of snowballs and snowblades.
Consider pricing a large portfolio of European or callable instruments and suppose one has available in memory all the required discounted transition probability
kernels and data structures that describe cash-flows. In this case the best strategy
is to organize the pricing functions for all the instruments in the portfolio in a
large matrix, attributing one column vector to each instrument, and perform the
backward induction collectively on the entire portfolio. The alternative strategy of iterating on each instrument separately is far more time consuming.
Task                                                   GPU-O    GPU-D    Host-O    Ratio
Initialization                                          4.79     5.16      3.77     0.79
Calibration to term structure of rates                  4.76     6.14     61.88    13.00
585 European swaptions                                   7.62    10.17     85.30    11.19
Portfolio of CMS spread range accruals:
  9 callable swaps and 3 callable snowballs             21.03    23.74    134.59     6.40
30240 ATM European swaptions                            11.34    16.16    138.68    12.23
13725 ATM European swaptions and
  13725 ATM Bermuda swaptions                           56.56    59.02    272.19     4.81

Table 2. Execution times in seconds for various portfolios under various configurations. GPU-O: using the GPU with host side optimized code. GPU-D: using the GPU with host side debug code. Host-O: using the host only with optimized code and Intel MKL libraries. Ratio: column Host-O versus column GPU-O.
Table 2 illustrates performance with examples of portfolios, also comparing the
performance of optimized code versus debug code. The code for this example is
written almost entirely in VB.NET, the CPU is a 2 GHz Xeon and the GPU is
an Nvidia 8800 GTX card originally intended for the games market, the API to
the GPU being given by the CUDA-BLAS library. One can notice that the debug
code runs only 5% slower than optimized code. The code is entirely managed
but by extrapolation one would conclude that the speed up achievable by using
native C code as opposed to VB.NET would also be negligible. The computational
bottleneck is in fact captured by calls to sgemm executing on the GPU.
One can notice that while a portfolio of 585 European swaptions takes 7.62 seconds
to price, a portfolio with 30,240 swaptions takes only 11.34 seconds. Execution time
thus scales in a very non-linear way with respect to the number of floating point
operations. This is achieved by organizing all pricing functions of the portfolio
as column vectors in a matrix and applying backward induction to the portfolio in
aggregate. The alternative strategy of pricing each swaption individually would give
rise to linear scaling and be much more time consuming. However, by multiplying a
large kernel matrix with a large payoff matrix, the load is partitioned more evenly
across the cores, each one handling the task of multiplying sub-blocks, and as a
consequence the performance is greatly enhanced.
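The collective backward induction can be sketched as follows (hypothetical Python; on the GPU each matrix product below would be a single sgemm call per date): the pricing functions of all instruments are packed as the columns of one matrix, the discounted one-period kernel is multiplied into that matrix, and an optional exercise condition is applied columnwise.

    import numpy as np

    # Hypothetical sketch of collective backward induction on a portfolio: all
    # pricing functions are columns of one matrix, so each date costs one dense
    # matrix-matrix product (one sgemm call on the GPU). Kernels and payoffs are toys.

    def price_portfolio(kernels, terminal_payoffs, exercise_values=None):
        """kernels: list of (n, n) discounted one-period propagators, first to last date;
        terminal_payoffs: (n, k) matrix, one column per instrument;
        exercise_values: optional list of (n, k) intrinsic values (Bermudan feature)."""
        V = terminal_payoffs.astype(np.float32)
        for step in reversed(range(len(kernels))):
            V = kernels[step].astype(np.float32) @ V      # one product for all columns
            if exercise_values is not None:
                V = np.maximum(V, exercise_values[step].astype(np.float32))
        return V

    rng = np.random.default_rng(2)
    n, k = 100, 500                                       # 100 lattice sites, 500 deals
    U = np.exp(-0.05) * (0.9 * np.eye(n) + 0.1 / n)       # crude discounted kernel
    values = price_portfolio([U] * 3, rng.random((n, k)))
    print(values.shape)                                   # (100, 500)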
Notice that the performance gain with respect to an application running on the
host varies. The initialization is slightly slower when using a GPU due to memory
allocation on the card. The calibration to the term structure of interest rates shows a gain by a factor of about 13. The pricing of European swaptions shows speedups by a factor
11-12, with a slightly better performance for the larger portfolio. The speedup for
the portfolios containing callables is not as great and is only 4.81 for the portfolio
with 13,725 Bermuda swaptions. The reason for this is that the implementation
was designed in such a way that the exercise condition was checked on the host
and at each exercise date a portion of the portfolio was moved across the bus. This
can easily be avoided by checking the exercise condition with a special function
written in C and executing on the GPU side. By eliminating the bus overhead, the
performance gain is again in the range 11-12.
Another instance where counter-intuitive scaling is revealed and can be taken
advantage of is the pricing of snowballs and snowblades. A snowball is a callable
swap whose structured coupon at time Ti has the following form
(40)   C_{T_i} = k\, C_{T_{i-1}} + \Phi_i(y_{T_{i-1}}, y_{T_i}),
where k is a fixed parameter. The difficulty with snowballs is that the coupon
process is not Abelian, in fact it is path-dependent. Hence, the methods in the
two previous sections are not applicable to reduce dimensionality. However, using
multi-core scaling it is possible to set up a very efficient pricing scheme for snowballs.
Callable snowballs are priced by backward induction while snowblades are priced
by forward induction using either block-diagonalizations or, for greater efficiency,
the moment method. Consider first callable snowballs. In this case, one needs to
discretize the coupon dimension in intervals ∆C, so that a generic coupon can be
approximated as CTi = (∆C) nTi, where nTi = 0, 1, ..., N − 1 is a discrete time, integer-valued process. In the working example we consider here we choose N =
100. A strategy to implement the backward induction scheme for a single callable
snowball is to organize the payoff function at maturity in a matrix with N columns,
each one indexed by the state variable y and representing the price conditional on
the discretized value of the previous coupon paid. One can then iterate backward
by applying the pricing kernel to this matrix of conditional pricing functions. After
each step in the iteration, one needs to reshuffle the pricing functions in such a way
that the conditioning relation on each column is satisfied. When pricing a portfolio
of European or callable snowballs there is obviously a great advantage in pricing
them all together.
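A minimal sketch of this scheme for a single callable snowball is given below (hypothetical Python; the kernel, the coupon rule and the treatment of the call right are toy simplifications, the last being a crude cap on the continuation value rather than a comparison with the value of cancelling the remaining swap). The value is carried as a matrix indexed by the lattice state y and the discretized previous coupon n, the columns are reshuffled according to the coupon recursion and the kernel is applied to all columns at once.

    import numpy as np

    # Hypothetical sketch of backward induction for one callable snowball: the value
    # is a matrix V[y, n], n being the discretized previous coupon; kernel, coupon
    # rule and call treatment are toy simplifications.

    def price_callable_snowball(U_list, r_grid, k=0.5, strike=0.04, dC=0.002, N=100,
                                call_value=1.0):
        ny = len(r_grid)
        V = np.zeros((ny, N), dtype=np.float32)           # value beyond the last date
        n_idx = np.arange(N)
        for U in reversed(U_list):
            # coupon paid now: k * (previous coupon) + max(strike - r, 0)
            coupon = k * dC * n_idx[None, :] + np.maximum(strike - r_grid[:, None], 0.0)
            n_next = np.clip(np.rint(coupon / dC).astype(int), 0, N - 1)
            # reshuffle the columns so each one is conditioned on its own coupon level
            reshuffled = coupon + V[np.arange(ny)[:, None], n_next]
            V = U.astype(np.float32) @ reshuffled.astype(np.float32)
            V = np.minimum(V, call_value)   # crude stand-in for the issuer's call right
        return V[:, 0]                      # today's value, previous coupon = 0

    r_grid = np.linspace(0.0, 0.50, 84)
    U = np.exp(-0.03) * (0.9 * np.eye(84) + 0.1 / 84)     # toy discounted kernel
    print(price_callable_snowball([U] * 10, r_grid)[:5])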
Pricing functions for callable snowballs are given in Figures 32, 33, 34.
When pricing a TARN snowball instead one needs to go forward in time and
obtain the survival probability distributions ψj (y, n) on a bridge starting from the
spot and ending at y at time Tj . This requires obtaining the joint probability that
at time Tj the site y is occupied and the coupon paid at time Tj−1 is approximately
(∆C)n. In this case, at every step one needs to multiply the n-th column by e^{ǫ k n ∆C} and then apply the kernel to all columns collectively, where the function φj(yTj−1, yTj) is the j-th coupon function. At each date, one then numerically differentiates with respect to ǫ at least twice and, based on the outcome, reconstructs the probability distribution function for the total accrued on the structured leg and
the survival probability conditioned on a bridge. This survival probability needs to
be applied to the coupon amount which then has to be discounted to current time.
Also in this case, on massively parallel multicore architectures one obtains major
speedup ratios by pricing portfolios of TARN snowballs in one single iteration.
7. Conclusions
We have shown that short rate models specified semi-parametrically or even
non-parametrically are viable from the engineering viewpoint as frameworks within
which to price exotic single name interest rate derivatives. We find that the recently
introduced GPU platforms which are being mass-produced because of applications
to the video-game and computer-aided-design markets are an ideal hardware environment for such modeling approaches. We find that BLAS level-3 methods for
which the speed-up ratios are more significant are stable under single precision
arithmetic and that sensitivities are also smooth. The mathematical framework
that supports such analysis is based on operator methods, fast exponentiation and
the theory of Abelian processes. It is also based on techniques to exploit the highly
non-linear scaling behavior observed on massively parallel multi-core architectures
between the nominal algorithmic complexity and execution time. The methods
presented here can possibly be extended to cross-currency derivatives and hybrids
by modeling correlations by means of dynamic conditioning, as we have already
demonstrated in the case of unmanaged CDOs in (Albanese and Vidler 2007), but
we leave the discussion of these extensions to a future article.
References
Ait-Sahalia, Y., Hansen L.P. and J. A. Scheinkman (2005). Operator Methods for Continuous
Time Markov Processes. Journal of Economic Theory.
Albanese, C. (2006). Operator Methods, Abelian Processes and Dynamic Conditioning. preprint,
available at www.level3finance.com.
Albanese, C. and A. Mijatovic (2006). Convergence Rates for Diffusions on Continuous-Time
Lattices. preprint, available at www.level3finance.com.
Albanese, C. and A. Vidler (2007). A Structural Model for Bespoke CDOs. Willmott Magazine.
Albanese, C. and M. Trovato (2006). A Stochastic Volatility Model for Callable CMS
Swaps and Translation Invariant Path Dependent Derivatives. preprint, available at
www.level3finance.com.
Albanese, C., H. Lo and A. Mijatovic (2006). Spectral Methods for Volatility Derivatives. preprint.
Bennett, M.N. and J.E. Kennedy (2005). Common Interests. Finance and Stochastics.
Black, F. and P. Karasinski (1991). Bond and Option Pricing when Short Rates are Lognormal.
Financial Analysts Journal pp. 52–59.
Brace, A., D. Gatarek and M. Musiela (1997). The Market Model of Interest Rate Dynamics.
Mathematical Finance 7, 127–154.
Buttari, A., J. Dongarra, J. Kurzak, J. Langou, P. Lusczek and S. Tomov (2007). Exploiting
Mixed Precision Floating Point Hardware in Scientific Computations. preprint, available at
www.netlib.org.
Chen, Lin (1996). Interest Rate Dynamics, Derivatives Pricing, and Risk Management. Springer.
Cheyette, O. (1992). Term Structure Dynamics and Mortgage Valuation. Journal of Fixed Income
1, 28–41.
Cox, J.C., J.E. Ingersoll and S.A. Ross (1985). A theory of the term structure of interest rates.
Econometrica. 53, 385–407.
Dai, Q. and K. Singleton (2003). Term Structure Dynamics in Theory and Reality. Review of
Financial Studies 16, 631–678.
Duffie, D. and R. Kan (1996). A Yield-Factor Model of Interest Rates. Mathematical Finance
6, 379–406.
Hagan, P., D. Kumar, A. Lesniewski and D. Woodward (2002). Managing Smile Risk. Willmott
Magazine pp. 84–108.
Heath, D., R. Jarrow and A. Morton (1992). Bond Pricing and the Term Structure of Interest
Rates: A New Methodology. Econometrica 60, 77–105.
Ho, T.S.Y. and S.B. Lee (1986). Term Structure Movements and Pricing Interest Rate Contingent
Claims. Journal of Finance.
Hull, J. and A. White (1993). One-factor Interest Rate Models and the Valuation of Interest Rate
Derivative Securities. Journal of Financial and Quantitative Analysis 28, 235–54.
Hunt, P.J., J.E. Kennedy and A. Pelsser (2000). Markov Functional Interest Rate Models. Finance
and Stochastics 4, 391–408.
Jamshidian, F. (1997). Libor and Swap Market Models and Measures. Finance and Stochastics
1, 293–330.
Longstaff, F.A. and E.S. Schwartz (2001). Valuing American Options by Simulation: a Simple
Least Squares Approach. Rev. Financial Stud. 14, 113–148.
Sandmann, K. and D. Sondermann (1997). A Note on the Stability of Lognormal Interest Rate
Models and the Pricing of Eurodollar Futures. Mathematical Finance 7, 119–125.
Tang, Y. and J. Lange (2001). A Non-Exploding Bushy Tree Technique and its Application to
the Multifactor Interest Rate Market Model. Computational Finance.
Vasicek, O. A. (1977). An Equilibrium Characterization of the Term Structure. Journal of Financial Economics 5, 177–188.
Claudio Albanese
E-mail address:
[email protected]
Figure 3. Zero curves in the deflation regime.
Figure 4. Zero curves in the regime with drift -75 bp/year.
Figure 5. Zero curves in the regime with drift -25 bp/year.
Figure 6. Zero curves in the regime with drift +25 bp/year.
Figure 7. Zero curves in the regime with drift +100 bp/year.
Figure 8. Backbone of 2y-10Y correlation, i.e. correlation between daily returns of the 2Y swap rate versus the 10Y swap rate
as a function of the short rate.
Figure 9. Backbone of 1y-20Y correlation, i.e. correlation between daily returns of the 1Y swap rate versus the 20Y swap rate
as a function of the short rate.
Figure 10. Projected occupancy probabilities of monetary policy regimes.
Figure 11. Implied volatilities for at-the-money European swaptions of tenor 2Y compared to market data.
Figure 12. Implied volatilities for at-the-money European swaptions of tenor 5Y compared to market data.
Figure 13. Implied volatilities for at-the-money European swaptions of tenor 20Y compared to market data.
Figure 14. Implied volatility skews for European swaptions of
tenor 2Y compared to market data.
Figure 15. Implied volatility skews for European swaptions of
tenor 5Y compared to market data.
Figure 16. Implied volatility skews for European swaptions of
tenor 20Y compared to market data. Only implied volatilities for extreme strikes 16% over the forward failed to compute, as the graph shows.
Figure 17. Implied volatility backbone, i.e. the scatterplot of the
implied at the money volatility of 2Y into 2Y European swaptions
versus the corresponding forward rate.
Figure 18. Implied volatility backbone, i.e. the scatterplot of the
implied at the money volatility of 4Y into 5Y European swaptions
versus the corresponding forward rate.
Figure 19. Implied volatility backbone, i.e. the scatterplot of
the implied at the money volatility of 10Y into 20Y European
swaptions versus the corresponding forward rate.
Figure 20. Bermuda premium backbone, i.e. the Bermuda premium of 4Y into 2Y at-the-money swaptions plotted against the
corresponding forward rate.
Figure 21. Bermuda premium backbone, i.e. the Bermuda premium of 5Y into 10Y at-the-money swaptions plotted against the
corresponding forward rate.
Figure 22. 2Y into 5Y convexity backbone, i.e. the convexity
correction for 2Y into 5Y European constant-maturity swaps
(CMS) with respect to European swaptions.
Figure 23. 3Y into 2Y convexity backbone, i.e. the convexity
correction for 3Y into 2Y European constant-maturity swaps
(CMS) with respect to European swaptions.
Figure 24. 10Y into 20Y convexity backbone, i.e. the convexity
correction for 10Y into 20Y European constant-maturity swaps
(CMS) with respect to European swaptions.
Figure 25. Pricing functions of callable CMS spread range accruals.
Figure 26. Pricing functions of callable CMS spread range accruals.
Figure 27. Pricing functions of callable CMS spread range accruals.
Figure 28. Pricing functions of callable CMS spread range accruals.
Figure 29. Pricing functions of callable CMS spread range accruals.
Figure 30. Pricing functions of callable CMS spread range accruals.
Figure 31. Pricing functions of callable CMS spread range accruals.
Figure 32. Pricing functions of callable snowball CMS spread
range accruals.
Figure 33. Pricing functions of callable snowball CMS spread
range accruals.
Figure 34. Pricing functions of callable snowball CMS spread
range accruals.