Credit Loss and Systematic LGD

Jon Frye and Michael Jacobs Jr.
The authors thank Irina Barakova, Terry Benzschawel, Andy Feltovich, Brian Gordon, Paul
Huck, J. Austin Murphy, Ed Pelz, Michael Pykhtin, and May Tang for comments on previous
versions, and participants at the 2011 Federal Interagency Risk Quantification Forum, the
2011 International Risk Management Conference, and the First International Conference on
Credit Analysis and Risk Management.
Abstract
This paper presents a model of systematic LGD that is simple and effective. It is simple in that it
uses only parameters appearing in standard models. It is effective in that it survives statistical
testing against more complicated models.
Any views expressed are the authors’ and do not necessarily represent the views of the
management of the Federal Reserve Bank of Chicago, the Federal Reserve System, the Office of
the Comptroller of the Currency, or the U.S. Department of the Treasury.
Credit loss varies from period to period both because the default rate varies and because the loss
given default (LGD) rate varies. The default rate has been tied to a firm's probability of default
(PD) and to factors that cause default. The LGD rate has proved more difficult to model because
continuous LGD is more subtle than binary default and because LGD data are fewer in number
and lower in quality.
Studies show that the two rates vary together systematically.1 Systematic variation works against
the lender, who finds that an increase in the number of defaults coincides with an increase in the
fraction that is lost in a default. Lenders should therefore anticipate systematic LGD within their
credit portfolio loss models, which are required to account for all material risks.
This paper presents a model of systematic LGD that is simple and effective. It is simple in that it
uses only parameters that are already part of standard models. It is effective in that it survives
statistical testing against more complicated models. It may therefore serve for comparison in
tests of other models of credit risk as well as for purposes of advancing models of credit spreads
that include premiums for systematic LGD risk.
The LGD model is derived in the next section. The section on research methods discusses the
style of statistical testing to be used and the direct focus on credit loss modeling, both of which
are rare in the portfolio credit loss literature. Three sections prepare the way for statistical
testing. These sections develop the model of credit loss for the finite portfolio, discuss the data
to be used for calibration and testing, and introduce alternative hypotheses. Two sections
perform the statistical tests. The first tests each exposure cell—the intersection of rating grade
and seniority—separately. The second brings together sets of cells: all loans, all bonds, or all
instruments. Having survived statistical testing, the LGD model is applied in the section that
precedes the conclusion.
The LGD model

This section derives the LGD model. It begins with the simplest portfolio of credit exposures and
assumes that loss and default vary together. This assumption by itself produces a general
formula for the relationship of LGD to default. The formula depends on the distributions of loss
and default. We note that different distributions of loss and default produce similar
relationships, so we specify a distribution based on convenience and ease of application. The
result is the specific LGD function that appears as Equation (3).
The “asymptotically fine-grained homogeneous” portfolio of credit exposures has enough same-
sized exposures that there is no need to keep track of individual defaults.2 Only the rates of loss,
default, and LGD matter. These rates are random variables, and each one has a probability
distribution.
Because loss is the most important random variable, it is the loss distribution that we wish to
calibrate carefully, and it is the loss model that we wish to control for error. We symbolize the
cumulative distribution functions of the rates of loss and default by CDF_Loss and CDF_DR.
Our first assumption is that greater default rates and greater loss rates go together. This
assumption puts very little structure on the variables. It is much less restrictive than the
common assumption that greater default rates and greater LGD rates go together. The technical
assumption is that the asymptotic distributions of default and loss are comonotonic. This
implies that the loss rate and the default rate take the same quantile, q, within their respective
distributions:
(1)  q = CDF_Loss[Loss] = CDF_DR[DR]
The product of the default rate and the LGD rate equals the loss rate. Therefore, for any value of
q, the LGD rate equals the ratio of loss to default, which in turn depends on q and on inverse
cumulative distribution functions:
(2)  LGD = Loss / DR = CDF_Loss⁻¹[CDF_DR[DR]] / DR
This expresses the asymptotic LGD rate as a function of the asymptotic default rate and it holds
true whenever the distributions of loss and default are comonotonic. This function might take
many forms depending on the forms of the distributions. Since LGD is a function of default, one
could use Equation (2) to infer a distribution of LGD and study it in isolation; however, we keep
the focus on the distribution of loss and on the nature of an LGD function that is consistent with
the distribution of loss.
In particular, we begin with a loss model having only two parameters. If a two-parameter loss
model were not rich enough to describe credit loss data, a more complicated model could readily
show this in a straightforward statistical test. The same is true of the default model. Therefore,
our provisional second assumption is that both credit loss and default have two-parameter
distributions in the asymptotic portfolio.
Testing this assumption constitutes the greater part of this study. Significant loss models with
more than two parameters are not found; a two-parameter loss model appears to be an adequate
description of the loss data used here. Therefore, this section carries forward the assumption of
two-parameter distributions of loss and default.
In principle, any two-parameter distributions could be used for the CDFs in Equation (2). In
practice, we compare three distributions: Vasicek, Beta, and Lognormal, arranging that each has
the same mean and the same standard deviation. To obtain values that are
economically meaningful, we turn to the freely available credit loss data published by Altman
and Karlin for high-yield bonds, 1989-2007. The means and standard deviations appear in the
first column of Table 1. The other three columns describe distributions that share these
statistics. Figure 1 compares the variants of Equation (2) that result.
Table 1. Calibration of three distributions to mean and SD of loss and default, Altman-Karlin data, 1989-2007.
[Table 1 lists, for the Vasicek, Beta, and Lognormal distributions, the PDF, CDF, and inverse CDF, each calibrated to the common mean and standard deviation of the Altman-Karlin data.] For the Vasicek distribution with mean p and correlation ρ, the CDF and inverse CDF are

CDF[x] = Φ[(√(1 − ρ) Φ⁻¹[x] − Φ⁻¹[p]) / √ρ],   CDF⁻¹[q] = Φ[(Φ⁻¹[p] + √ρ Φ⁻¹[q]) / √(1 − ρ)]

where Φ denotes the standard normal CDF and Φ⁻¹ its inverse.
[Figure 1 plots the implied LGD rate against the default rate from 0% to 20% for the three calibrated distributions; the curves rise from roughly 40% to 70% and differ visibly only at low default rates.]
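For readers who wish to reproduce Figure 1, the following Python sketch implements Equation (2) with pluggable distribution functions. The Vasicek formulas are those shown with Table 1; the parameter values are illustrative assumptions, not the Altman-Karlin calibration, and other distributions (for example, scipy.stats.lognorm) can be substituted through the same interface.

    import numpy as np
    from scipy.stats import norm

    def vasicek_cdf(x, p, rho):
        # Vasicek CDF with mean p and correlation rho
        return norm.cdf((np.sqrt(1 - rho) * norm.ppf(x) - norm.ppf(p)) / np.sqrt(rho))

    def vasicek_icdf(q, p, rho):
        # inverse Vasicek CDF
        return norm.cdf((norm.ppf(p) + np.sqrt(rho) * norm.ppf(q)) / np.sqrt(1 - rho))

    def lgd_equation_2(dr, cdf_dr, icdf_loss):
        # Equation (2): LGD = CDF_Loss^{-1}[CDF_DR[DR]] / DR
        return icdf_loss(cdf_dr(dr)) / dr

    # Illustrative parameters (assumed): PD = 4.5%, EL = 1.5%, rho = 15%
    pd_, el, rho = 0.045, 0.015, 0.15
    dr = np.linspace(0.005, 0.20, 40)
    lgd = lgd_equation_2(dr,
                         cdf_dr=lambda x: vasicek_cdf(x, pd_, rho),
                         icdf_loss=lambda q: vasicek_icdf(q, el, rho))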
As Figure 1 illustrates, the three distributions produce approximately the same LGD-default
relationship. They differ principally when the default rate is low. This is the range in which they
would be the most difficult to distinguish empirically, because a low default rate generates few
defaults and substantial random variation in annual average LGD. The Lognormal Distribution
produces the relationship with lowest overall slope; however, of the three distributions, the
Lognormal has the fattest tail.
Our choice between the distributions is guided by practical considerations. Unlike the Beta
Distribution, the Vasicek Distribution has explicit formulas for its CDF and its inverse CDF.
Unlike the Lognormal Distribution, the Vasicek Distribution constrains all rates to be less than
100%. Importantly, estimates of the Vasicek correlation parameter already exist within current
credit loss models. This makes the Vasicek distribution, by far, the easiest for a practitioner to
apply. Therefore, our third assumption is that loss and default obey the Vasicek distribution.
Our fourth assumption is that the value of ρ in CDF_Loss equals the value of ρ in CDF_DR. This
assumption is testable. Alternative E, introduced later, tests it by allowing the values to differ, but
it does not find that the values are significantly different. Therefore we carry forward the
assumption that the values of ρ are the same.
Substituting the expressions for the Vasicek CDF and inverse CDF into Equation (2) produces
the LGD function:
(3)  LGD = Φ[Φ⁻¹[DR] − k] / DR

(4)  k = (Φ⁻¹[PD] − Φ⁻¹[EL]) / √(1 − ρ)
This expresses the asymptotic LGD rate as a function of the asymptotic default rate. These rates
equal the conditionally expected rates for a single exposure. Equation (3) underlies the null
hypothesis in the tests that follow.
The three parameters PD, EL, and ρ combine to form a single quantity that we refer to as the
LGD Risk Index and symbolize by k. If EL = PD (that is, if ELGD equals 1.0), then k = 0 and
LGD = 1, irrespective of DR. Except when the LGD Risk Index equals 0, LGD is a strictly
monotonic function of DR, as shown in Appendix 1. For commonly encountered values of PD,
EL, and ρ, k is between 0 and 2.
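As a worked illustration of Equations (3) and (4), the sketch below computes the LGD Risk Index and the conditional LGD rate; the parameter values are assumptions chosen only to exercise the function.

    import numpy as np
    from scipy.stats import norm

    def lgd_risk_index(pd, el, rho):
        # Equation (4): k = (Phi^-1[PD] - Phi^-1[EL]) / sqrt(1 - rho)
        return (norm.ppf(pd) - norm.ppf(el)) / np.sqrt(1 - rho)

    def conditional_lgd(dr, pd, el, rho):
        # Equation (3): LGD = Phi[Phi^-1[DR] - k] / DR
        return norm.cdf(norm.ppf(dr) - lgd_risk_index(pd, el, rho)) / dr

    # If EL = PD (ELGD = 1), then k = 0 and LGD = 1 irrespective of DR
    print(conditional_lgd(0.10, pd=0.05, el=0.05, rho=0.15))   # -> 1.0
    print(conditional_lgd(0.10, pd=0.05, el=0.025, rho=0.15))  # -> between 0 and 1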
To recap, we derive the LGD function by making four assumptions. The first assumption is that
a greater rate of credit loss accompanies a greater rate of default. This plausible starting place
immediately produces a general expression for LGD, Equation (2). The second assumption is
that the distributions of loss and default each have two parameters. Later sections of this paper
attempt, unsuccessfully, to find a statistically significant loss model with more parameters. The
third assumption is that the distributions are specifically Vasicek. This assumption is a matter of
convenience; distributions such as Beta and Lognormal produce similar relationships but they
would be more difficult to implement. The fourth assumption is that the value of ρ estimated
from default data also applies to the loss distribution. This assumption is testable, and it
survives testing in later sections. The four assumptions jointly imply Equation (3), which
expresses the LGD rate as a function of the default rate. This LGD function is consistent with the
assumption that credit loss has a two-parameter Vasicek distribution.
Research methods
This section discusses two research methods employed by this paper. First, this paper tests in an
unusual way. Rather than showing the statistical significance of Equation (3), it shows the lack
of significance of more complicated models that allow the LGD-default function to be steeper or
flatter than Equation (3). Second, this paper calibrates credit loss models to credit loss data.
Rather than assume that the parameters of a credit loss model have been properly established by
the study of LGD, it investigates credit loss directly.
This study places its preferred model in the role of the null hypothesis. The alternatives explore
the space of differing sensitivity by allowing the LGD function to be equal to, steeper than, or
flatter than Equation (3). The tests show that none of the alternatives have statistical
significance compared to the null hypothesis. This does not mean that the degree of systematic
LGD risk in Equation (3) can never be rejected, but a workmanlike attempt has not met with
success. Accepting a more complicated model that had not demonstrated significance would
mean accepting an uncontrolled probability of Type I Error.
A specific hypothesis test has already been alluded to. Equation (3) assumes that the parameter
ρ appearing in CDF_Loss takes the same value as the ρ appearing in CDF_DR. An
alternative allows the two values of correlation to differ. This alternative is not found to be
statistically significant in tests on several different data sets; the null hypothesis survives
statistical testing.
We do not try every possible alternative model, nor do we test using every possible data set; it is
impossible to exhaust all the possibilities. Still, these explorations and statistical tests have
content. The function for systematic LGD variation is simple, and it survives testing. A risk
manager could use the function as it is. If he prefers, he could test the function as we do. A test
might show that Equation (3) can be improved. Given, however, that several alternative LGD
models do not demonstrate significance on a relatively long, extensive, and well-observed data
set, an attitude of heightened skepticism is appropriate. In any case, the burden of proof is
always on the model that claims to impart a more detailed understanding of the evidence.
The second method used in this paper is to rely on credit loss data and credit loss models to gain
insight into credit risk. By contrast, the models developed in the last century, such as
CreditMetrics™, treat the distribution of credit loss as something that can be simulated but not
analyzed directly. This, perhaps, traces back to the fact that twentieth century computers ran at
less than 1% the speed of current ones, and some shortcuts were needed. But the reason to
model LGD and default is to obtain a model of credit loss. The model of credit loss should be the
focus of credit loss research, and these days it can be.
We make this difference vivid by a comparison. Suppose a risk manager wants to quantify the
credit risk for a specific type of credit exposure. Having only a few years of data, he finds it quite
possible that the pattern of LGD rates arises by chance. He concludes that the rates of LGD and
default are independent, and he runs his credit loss model accordingly. This two-stage approach
never tests whether independence is valid using credit loss data and a credit loss model, and it
provides no warrant for this elision.
Single stage methods are to be preferred because each stage of statistical estimation introduces
uncertainty. A multi-stage analysis can allow the uncertainty to grow uncontrolled. Single stage
methods can control uncertainty. One can model the target variable—credit loss—directly and
quantify the control of Type I Error.
The first known study to do this is Frye (2010), which tests whether credit loss has a two-
parameter Vasicek Distribution. One alternative is that the portfolio LGD rate is independent of
the portfolio default rate.3 This produces an asymptotic distribution of loss that has three
parameters: ELGD, PD, and ρ. The tests show that, far from being statistically significant, the
third parameter adds nothing to the explanation of loss data used.
The above illustrates the important difference touched upon earlier. If LGD and default are
modeled separately, the implied credit loss distribution tends to contain all the parameters
3 The LGD of an individual exposure, since it is already conditioned on the default of the exposure, is
independent of it. Nonetheless, an LGD can depend on the defaults of other firms or on their LGDs. This
dependence between exposures can produce correlation between the portfolio LGD rate and the portfolio
default rate, thereby affecting the systematic risk, systematic risk premium, and total required credit
spread on the individual loan.
stemming from either model. By contrast, this paper begins with a parsimonious credit loss
model and finds the LGD function consistent with it. If a more complicated credit loss model
were to add something important, it should demonstrate statistical significance in a test.
We hypothesize that credit loss data cannot support extensive theorizing. This hypothesis is
testable, and it might be found wanting. Nevertheless, the current research represents a
challenge to portfolio credit loss models running at financial institutions and elsewhere. If those
models have not demonstrated statistical significance against this approach, they may be
seriously misleading their users.
The current paper extends Frye (2010) in three principal ways. First, it derives and uses
distributions that apply to finite-size portfolios. Second, it controls for differences of rating and
differences of seniority by using Moody’s exposure-level data. Third, it develops alternative
models that focus specifically on the steepness of the relationship between LGD and default.
These are the topics of the next three sections.
Credit loss in the finite portfolio

This section derives the distribution of loss for a portfolio with few exposures, taking a different
approach from the pioneering work by Pykhtin and Dev. Later sections use this distribution to
test the LGD function against alternatives.
As usual, one must keep separate the concepts related to the population and the concepts related
to the sample. Economic and financial conditions give rise to the population variables. These are
the conditionally expected default rate, symbolized DR, and the conditionally expected LGD
rate, symbolized LGD. LGD is tied to DR by Equation (3) or by one of the alternatives developed
later. In a sample of credit loss data, the quantities of interest are the number of defaults, D, and
the portfolio average LGD rate, symbolized LGD̄.
At the outset we recognize two cases. The first case is that D equals 0. In this case, credit loss
equals zero. This case has probability equal to (1 − DR)^N.
The second case, when D = d > 0, produces a distribution of the portfolio average LGD rate.
Average LGD would approach normality for large D, according to the Central Limit Theorem.
We assume normality for all D for two reasons: for convenience, and for the practical benefit
that normality allows average LGD outside the range [0, 1]. This is important because the credit
loss data include portfolio-years where LGD̄ is negative. The variance of the distribution is
assumed equal to σ²/d:

(5)  f[LGD̄ | DR, D = d] = (√d / σ) φ[(LGD̄ − LGD) √d / σ]

where φ is the standard normal PDF and LGD is the conditionally expected rate given by Equation (3) or by an alternative.
The conditional distribution of D and LGD̄ is then the product of the binomial distribution of D
and the normal distribution of LGD̄:

(6)  f[d, LGD̄ | DR] = C(N, d) DR^d (1 − DR)^(N−d) (√d / σ) φ[(LGD̄ − LGD) √d / σ]

where C(N, d) is the binomial coefficient.
In a portfolio with uniform exposure amounts, the loss rate equals the default rate times the LGD
rate. We pass from the portfolio's LGD rate to its loss rate with the monotonic transformation
Loss = (d / N) LGD̄:

(7)  f[d, Loss | DR] = C(N, d) DR^d (1 − DR)^(N−d) (N √d / (d σ)) φ[(N Loss / d − LGD) √d / σ]
Summing over d, combining the two cases, and removing the conditioning on DR produces the
distribution of credit loss in the finite portfolio:
(8)  P[Loss = 0] = ∫₀¹ (1 − DR)^N f_DR[DR] dDR;   for Loss > 0,
     f_Loss[Loss] = ∫₀¹ f_DR[DR] ∑_{d=1}^{N} f[d, Loss | DR] dDR
where f_DR[·] is the PDF of the Vasicek density with parameters PD and ρ. This distribution
depends on the parameters of the default distribution, PD and ρ. It also depends on any
additional parameters of the LGD function. These consist solely of EL in the null hypothesis of
Equation (3) but include an additional parameter in the alternatives introduced later. Finally,
the distribution depends on N, the number of exposures. As N increases without limit, Equation
(8) becomes the Vasicek distribution with mean equal to EL. For small N, however, the
decomposition of EL into PD and ELGD has an effect on the distribution of loss.
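To make Equation (8) concrete, the following sketch evaluates the loss density at a positive loss rate by simple quadrature over the Vasicek distribution of DR, with the conditional LGD of Equation (3). The grid size and the parameter values are implementation assumptions.

    import numpy as np
    from scipy.special import comb
    from scipy.stats import norm

    def vasicek_pdf(x, p, rho):
        # Vasicek density with mean p and correlation rho
        z = norm.ppf(x)
        return np.sqrt((1 - rho) / rho) * np.exp(
            0.5 * z**2 - (np.sqrt(1 - rho) * z - norm.ppf(p))**2 / (2 * rho))

    def loss_density(loss, N, pd, el, rho, sigma, n_grid=2000):
        # f_Loss[loss] for loss > 0: integrate Equation (7) against the DR density
        k = (norm.ppf(pd) - norm.ppf(el)) / np.sqrt(1 - rho)
        dr = np.linspace(1e-6, 1 - 1e-6, n_grid)
        dx = dr[1] - dr[0]
        clgd = norm.cdf(norm.ppf(dr) - k) / dr              # Equation (3)
        dens = 0.0
        for d in range(1, N + 1):                           # sum over default counts
            binom = comb(N, d) * dr**d * (1 - dr)**(N - d)
            dens += np.sum(vasicek_pdf(dr, pd, rho) * binom * (N / d)
                           * norm.pdf(N * loss / d, clgd, sigma / np.sqrt(d)) * dx)
        return dens

    # Parameters of Figure 2's finite portfolio (sigma assumed as in the text)
    print(loss_density(0.05, N=10, pd=0.10, el=0.05, rho=0.15, sigma=0.01))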
Figure 2. Distributions of loss for asymptotic and finite portfolios
[Figure 2 plots the density of the credit loss rate from 0% to 20% for the asymptotic portfolio and for a portfolio of N = 10 exposures with PD = 10%, ELGD = 50%, and ρ = 15%.]
Figure 2 compares the distribution of loss for the asymptotic portfolio to the distribution for a
portfolio containing 10 exposures. Each distribution has EL = 5% and ρ = 15%. Those values
completely describe the distribution of credit loss in the asymptotic portfolio. For the sake of
comparison, EL is assumed to decompose into PD = 10% and ELGD = 50%. Credit loss in the finite
portfolio has the distribution of Equation (8). The point mass at zero loss has probability 43%;
therefore, the area under the curve illustrated in Figure 2 is 57%. Assuming σ = 1% produces
distinct humps for one, two, and three defaults. The hump for one default is centered at less
than 5% loss, while the hump for three defaults is centered at greater than 15% loss. In other
words, LGD tends to be greater when there are more defaults.
Under the usual statistical assumptions—the parameters are stable over time and the variable
DR is independent across years—the log of the likelihood function of the data is:

(9)  LnL_Loss = ∑_t ln f_Loss[Loss_t]
Data
The data comprise twenty-seven years of observations drawn from Moody's Corporate Default Rate Service™.
An exposure "cell"—the intersection of a rating grade and a seniority class—controls for both
borrower quality and exposure quality. A cell is assumed to be a homogeneous portfolio of
statistically identical exposures as called for in the loss models.
Distributions of credit loss can say nothing about cases where the loss amount is unknown.
Therefore, we restrict the definition of default to cases where Moody’s observes a post-default
price. By contrast, studies of default in isolation can include defaults that produce unknown loss.
We refer to this less-restrictive definition as “nominal default” and note that it produces default
rates that are generally greater than the ones we present.
We delimit the data set in several ways. To have notched ratings available at the outset, the data
sample begins with 1983. To align with the assumption of homogeneity, a firm must be classified
as industrial, public utility, or transportation and headquartered in the US. Ratings are taken to
be Moody's "senior" ratings of firms, which usually correspond to the rating of the firm’s long-
term senior unsecured debt if such exists. To focus on cells that have numerous defaults, we
analyze firms rated Ba3 or lower. We group the ratings C, Ca, Caa, Caa1, Caa2, and Caa3 into a
single grade we designate "C". This produces five obligor rating grades altogether: Ba3, B1, B2,
B3, and C.
To align with the assumption of homogeneity, debt issues must be dollar denominated, intended
for the U.S. market, and not guaranteed or otherwise backed. We define five seniority classes:
Senior Secured Loans (Senior Secured instruments with Debt Class "Bank Credit
Facilities")
Senior Secured Bonds (Senior Secured instruments with Debt Class "Equipment Trusts",
"First Mortgage Bonds", or "Regular Bonds/Debentures")
Senior Unsecured Bonds ("Regular Bonds/Debentures" or "Medium Term Notes")
Senior Subordinated Bonds ("Regular Bonds/Debentures")
Subordinated Bonds ("Regular Bonds/Debentures").
This excludes convertible bonds, preferred stock, and certain other instruments.
A firm is defined to be exposed in a cell-year if on January 1st the firm has one of the five obligor
ratings, it is not currently in default, and it has a rated issue in the seniority class. A firm is
defined to default if there is a record of nominal default and one or more post-default prices are
observed. The LGD of the obligor’s exposures in the cell equals 1.0 minus the average of such
prices expressed as a fraction of par; there is exactly one LGD for each default. The default rate
in the cell-year is the number of LGD's divided by the number of firms that are exposed, and the
loss rate is the sum of the LGD's divided by the number of firms that are exposed. There is no
correction for firms that are exposed to default for only part of the year, perhaps because their
debts mature or because their ratings are withdrawn.
To make ideas concrete, consider the most-populated cell, Senior Secured Loans made to B2-
rated firms. This cell has 1842 firm-years of exposure. However, public agencies began rating
loans only in the latter half of the data sample; of the twenty-seven years of the data sample in
total, only fourteen years contain loans to B2-rated firms. Of those fourteen years, only six
record a default by a B2-rated firm that had a rated loan outstanding. Those six years contain all
the information about the LGD-default relationship that is contained within the cell. In all, the
cell generates fourteen annual observations on the three variables needed to calibrate the
distribution of loss:
N, the number of firms exposed to default,
D, the number of defaults, and
Loss, the sum of the LGD’s divided by N, or zero if D = 0.
Alternative hypotheses

This section presents alternative LGD functions that have an additional parameter and might
provide a better fit to the data. Designed to focus on a particular question, the alternatives
necessarily have functional forms that appear more complicated than Equation (3).
In general, a statistical alternative could have any number of functional forms. For example, one
might test Equation (3) against a linear LGD hypothesis:
(10)  LGD = u + v·DR

where u is an intercept and v is a slope parameter.
Linear Equation (10) can be mentally compared to the curved function for the Vasicek
Distribution that is illustrated in Figure 1. If the straight line were wholly above the curved line,
its expected loss would be too high. Therefore, the straight line and curved line cross. If
parameter v takes a positive value, as is likely, the lines cross twice. Therefore, a calibration of
Equation (10) would likely produce a straight line that is shallower than Equation (3) at the left
and steeper than Equation (3) at the right. If this calibration were statistically significant, the
verdict would be that Equation (3) is too steep in some places and too flat in others.
Such an answer is not without interest, but we address a simpler question. If the LGD function
of Equation (3) does not adequately represent the data, we want to know whether a better
function is steeper or flatter. Therefore our alternatives have a special feature: the additional
parameter changes the LGD-default relationship but has no effect on EL. When the parameter
takes a particular value, the alternative becomes identical to Equation (3), and when the
parameter takes a different value, the alternative becomes steeper or flatter than Equation (3).
For all values of the parameter, the mathematical expectation of loss is equal to the value of
parameter EL. When we test against such an alternative, we are testing for a difference in slope
alone. Although the slope of the LGD-default relationship is not the only aspect of systematic
LGD risk that is important, it has first-order importance.
Alternative A takes the following form, using for convenience the substitution EL = PD·ELGD:

(11)  LGD = a·ELGD + (1 − a) Φ[Φ⁻¹[DR] − k] / DR,   k = (Φ⁻¹[PD] − Φ⁻¹[PD·ELGD]) / √(1 − ρ)
The additional parameter in Alternative A is symbolized by a. If a takes the value zero,
Alternative A becomes identical to Equation (3). If a takes the value 1.0, the function collapses to
ELGD; in other words, when a = 1 Alternative A becomes a model in which LGD is a constant in
the asymptotic portfolio.
Figure 3 illustrates Alternative A for five values of a. If parameter a takes a negative value,
Alternative A is steeper than Equation (3). If parameter a takes a value greater than 1.0,
Alternative A is negatively sloped. Thus, Alternative A can represent an entire spectrum of
slopes of the LGD-default relationship: equal to, steeper than, or flatter than the null hypothesis.
[Figure 3 plots Alternative A over default rates of 0% to 20% and LGD rates of roughly 40% to 80%. The labeled curves are a = −1, a = 0 (the null hypothesis), a = 1, and a = 2; the curve for a = 1 is flat and the curve for a = 2 slopes downward.]
Irrespective of the value of ρ or the decomposition of EL into PD and ELGD, the expectation of
loss equals the value of the parameter EL:

(12)  E[Loss] = ∫₀¹ (a·ELGD + (1 − a) Φ[Φ⁻¹[DR] − k] / DR) DR f_DR[DR] dDR
             = a·ELGD·PD + (1 − a)·EL = EL
Thus, the value of a affects the relationship between LGD and default but has no effect on EL.
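A quadrature-free check of this property: the Monte Carlo sketch below draws the Vasicek default rate and confirms that the expectation of loss under Alternative A stays at EL for any a. Parameter values are assumed for illustration.

    import numpy as np
    from scipy.stats import norm

    def alt_a_lgd(dr, pd, elgd, rho, a):
        # Equation (11): a blend of constant ELGD and the Equation (3) function
        k = (norm.ppf(pd) - norm.ppf(pd * elgd)) / np.sqrt(1 - rho)
        return a * elgd + (1 - a) * norm.cdf(norm.ppf(dr) - k) / dr

    def expected_loss(pd, elgd, rho, a, n=400_000):
        # Equation (12) by simulation over the Vasicek default rate
        z = np.random.default_rng(0).standard_normal(n)
        dr = norm.cdf((norm.ppf(pd) + np.sqrt(rho) * z) / np.sqrt(1 - rho))
        return np.mean(dr * alt_a_lgd(dr, pd, elgd, rho, a))

    for a in (-1.0, 0.0, 1.0, 2.0):
        print(a, expected_loss(pd=0.05, elgd=0.50, rho=0.15, a=a))  # all near 0.025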
We use Alternative A to challenge the null hypothesis, but it is also an approximation to other
LGD models that might be used instead. Appendix 2 compares Alternative A to the LGD model
of Michael Pykhtin and finds that the approximation is quite good when the value of a is zero.
This is exactly the case that survives statistical testing. Therefore, although we do not test
explicitly against Pykhtin’s model, we believe that we test against a model that is quite similar.
Alternatives B and C have forms similar to Alternative A. In fact, the three alternatives can be
identical to each other when applied to a single homogeneous cell. However, the assumption
that parameter b (or c) is uniform across several cells is different from the assumption that
parameter a is uniform across the cells. For this reason, Alternatives B and C are defined here
and applied in the section that tests several cells in parallel.
(13a), (13b)  [The LGD functions of Alternatives B and C, parallel in form to Equation (11) with parameter b or c entering in place of a.]
When parameter b (or c) takes the value 0, Alternative B (or C) becomes identical to Equation
(3). When the parameters take other values, the associated LGD-default relationships become
steeper or flatter. In any case, the mathematical expectation of loss equals the parameter EL.
The fourth alternative is designated Alternative E. It assumes that both loss and default have
Vasicek distributions, but that the correlation relevant to loss, ρe, differs from the correlation
relevant to default, ρ. Substituting the Vasicek CDF and inverse CDF into Equation (2),

(14)  LGD = Φ[(√ρ Φ⁻¹[EL] + √ρe (√(1 − ρ) Φ⁻¹[DR] − Φ⁻¹[PD])) / (√ρ √(1 − ρe))] / DR
When ρe = ρ, Equation (14) becomes identical to Equation (3). Figure 4 illustrates this when ρe
takes the value 14.51%. Relative to that, greater values of ρe make the function steeper and lesser
values of ρe make the function flatter or negatively sloped. For any value of ρe, the mathematical
expectation of loss equals the mean of the Vasicek loss distribution, which is parameter EL.
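A minimal sketch of Alternative E as stated in Equation (14); it recovers Equation (3) when ρe equals ρ. The ρe values match Figure 4, while PD, EL, and ρ are assumed for illustration.

    import numpy as np
    from scipy.stats import norm

    def alt_e_lgd(dr, pd, el, rho, rho_e):
        # Equation (14): the loss correlation rho_e may differ from the default correlation rho
        z = (np.sqrt(1 - rho) * norm.ppf(dr) - norm.ppf(pd)) / np.sqrt(rho)
        loss = norm.cdf((norm.ppf(el) + np.sqrt(rho_e) * z) / np.sqrt(1 - rho_e))
        return loss / dr

    # rho_e = rho reproduces Equation (3); larger rho_e steepens the function
    for rho_e in (0.10, 0.1451, 0.19):
        print(rho_e, alt_e_lgd(0.10, pd=0.0459, el=0.01496, rho=0.1451, rho_e=rho_e))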
Figure 4. Alternative E for three values of ρe

[Figure 4 plots the LGD rate (roughly 40% to 90%) against the default rate (0% to 20%) for ρe = 19%, ρe = 14.51% (the null hypothesis), and ρe = 10%.]
This section introduces four alternative LGD models. Each alternative contains an extra
parameter that can allow LGD to be more (or less) sensitive to DR than the null hypothesis. The
additional parameter has no effect on the expectation of loss. Therefore, the alternatives focus
solely on the question of the slope of the LGD-default relationship. Later sections use the
alternatives in statistical challenges to Equation (3).
Tests of single cells

This section performs tests on the twenty-five cells one cell at a time. Each cell isolates a
particular Moody’s rating and a particular seniority. Each test calibrates Equation (8) twice:
once using Equation (3) and once using an alternative LGD function. The likelihood ratio
statistic determines whether the alternative produces a significant improvement. Judged as a
whole, the results to be presented are consistent with the idea that Equation (3) does not
misstate the relationship between LGD and default.
As with most studies that use the likelihood ratio, it is compared to a distribution that assumes
an essentially infinite, “asymptotic” data set. The statistic itself, however, is computed from a
sample of only twenty-seven years of data. This gives the statistic a degree of sampling variation
that it would not have in the theoretical asymptotic data. As a consequence, tail observations
tend to be encountered more often than they should be. This creates a bias toward finding
statistical significance. This bias strengthens a finding of no significance, such as produced here.
Most risk managers are currently unable to calibrate all the parameters of a loss model by
maximum likelihood estimation (MLE). A scientific finding that is valid only when MLE is
employed would be useless to them. Instead, we calibrate mean parameters along the lines
followed by practitioners. Our estimator for PD in a cell is the commonly-used average annual
default rate. Our estimator for EL is the average annual loss rate. ELGD is the ratio of EL to PD.
In the case of ρ, we find MLE to be more convenient than other estimators. (The next section
checks the sensitivity of test results to the estimate of ρ.) We begin with the MLE found by
maximizing the following expression in ρ within each cell:

(15)  LnL_D = ∑_t ln[∫₀¹ C(N_t, d_t) DR^(d_t) (1 − DR)^(N_t − d_t) f_DR[DR] dDR]
where f_DR[·] is the PDF of the Vasicek density with parameters equal to the estimate of PD and
the value of ρ. Consistent with the assumptions made in developing Equation (3), this value of ρ
is assumed valid for the loss
distribution as well, except in the case of Alternative E.
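A sketch of the Equation (15) estimation follows: choose ρ to maximize the likelihood of the annual default counts, integrating the binomial over the Vasicek distribution of DR (here via its quantile representation). The data arrays are hypothetical placeholders.

    import numpy as np
    from scipy.optimize import minimize_scalar
    from scipy.stats import binom, norm

    def lnl_default(rho, d, n, pd_hat, n_grid=801):
        # integrate over q, the quantile of DR, so dDR integrals become dq integrals
        q = (np.arange(n_grid) + 0.5) / n_grid
        dr = norm.cdf((norm.ppf(pd_hat) + np.sqrt(rho) * norm.ppf(q)) / np.sqrt(1 - rho))
        like = [np.mean(binom.pmf(dt, nt, dr)) for dt, nt in zip(d, n)]
        return np.sum(np.log(like))

    d = np.array([0, 1, 0, 3, 2, 0, 5, 1])          # hypothetical default counts
    n = np.array([40, 42, 45, 50, 48, 52, 55, 60])  # hypothetical exposure counts
    pd_hat = np.mean(d / n)                         # the text's estimator of PD
    res = minimize_scalar(lambda r: -lnl_default(r, d, n, pd_hat),
                          bounds=(0.01, 0.60), method="bounded")
    print(res.x)                                    # MLE of rho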
The parameter σ measures the random dispersion of an individual LGD around its
conditionally expected value. This is needed to calibrate the distribution of loss for the finite
portfolio, but it has no role in the asymptotic LGD function of Equation (3). From this
perspective, σ is a “nuisance” parameter. To estimate it, we consider every cell-year in which
there are two or more LGDs. In each such cell-year we calculate the unbiased estimate of the
standard deviation. The dispersion measured around the data mean is less than the dispersion
around any other number, including the conditional expectation. Therefore, the average
standard deviation, 20.30%, should represent an underestimate of σ and should understate the
contribution of purely random causes.
These parameters—PD, ELGD, ρ, and σ—are the only ones required under the null hypothesis.
The alternative has the extra parameter that controls the slope of the LGD-default relationship,
and that parameter is estimated by MLE. Thus, the only parameter informed by the loss data in
the context of the loss model is the additional parameter of the alternative hypothesis. This is
believed to bias the test toward finding a statistically significant result, and this bias strengthens
the findings of no statistical significance.
Table 2 shows summary statistics, parameter estimates, and test statistics for each cell. The test
statistics are stated as the difference between the maximum log likelihood using the alternative
and the log likelihood using the null. Twice this difference would have the chi-square
distribution with one degree of freedom in an asymptotic data set. Differences greater than the
5% critical value of 1.92 are noted in bold face. The test statistics for Alternatives B and C would
be identical to those presented for Alternative A.
Table 2. Basic statistics, parameter estimates, and test statistics by cell
Key to Table 2:
EL, PD, and ρ: Estimates as discussed in the text; ELGD = EL / PD.
D: The number of defaults in the cell, counting within all 27 years.
N: The number of firm-years of exposure in the cell, counting within all 27 years.
D Years: The number of years that have at least one default.
N Years: The number of years that have at least one firm exposed.
NomD: The number of nominal defaults (including defaults where the resulting loss is unknown).
NomPD: Average of annual nominal default rates.
a, ρe: MLEs of the parameters in Alternatives A and E.
ΔLnL: the pick-up in LnL_Loss provided by Alternative A or E relative to the null hypothesis.
Statistical significance at the 5% level is indicated in bold.
Along the right and bottom margins, average EL, PD, and ρ are weighted by N; other averages
are unweighted.
Along the bottom and on the right of Table 2 are averages. The overall averages at the bottom
right corner contain the most important fact about credit data: they are few in number. The
average cell has only 31 defaults, which is about one per year. Since defaults cluster in time, the
average cell has defaults in only 9 years, and only these years can shed light on the connection
between LGD and default.
Not only are the data few in number, they have a low signal-to-noise ratio: the random variation
of LGD, measured by σ = 20.30%, is material compared to the magnitude of the systematic
effect and the number of LGDs that are observed. A data set such as used here, spanning many
years with many LGDs, provides the best opportunity to see through the randomness and to
characterize the degree of systematic LGD risk.
In Table 2, there are two cells with log likelihood pickups greater than 1.92: Loans to B2-rated
firms and Senior Subordinated Bonds issued by C-rated firms. This does not signal statistical
significance because many tests are being performed. If twenty-five independent tests are
conducted, and if each has a size of 5%, then two or more nominally significant results would
occur with probability 36%. Of the two or more nominally significant results, one cell is
estimated steeper than Equation (3) and one cell is estimated flatter than Equation (3). Nothing
about this pattern suggests that the LGD function of Equation (3) is either too steep or too flat.
Considering all twenty-five cells including the twenty-three cells that lack nominal significance,
there is about a 50-50 split. About half the cells have an estimated LGD function that is steeper
than Equation (3) and about half have an estimated LGD function that is flatter than Equation
(3). A pattern like this would be expected if the null hypothesis were correct.
Summarizing, this section performs statistical tests of the null hypothesis one cell at a time. Two
cells produce nominal significance, which is an expected result if the null hypothesis were
correct. Of the two cells, one cell has an estimated LGD function that is steeper than the null
hypothesis and the other cell has an estimated LGD function that is flatter than the null
hypothesis. Of the statistically insignificant results, about half the cells have an estimated LGD
function that is steeper than the null hypothesis and the other half have an estimated LGD
function that is flatter than the null hypothesis. The pattern of results is of the type to be
expected when the null hypothesis is correct. This section provides no good evidence that
Equation (3) either overstates or understates LGD risk.
Tests of sets of cells

This section tests using several cells at once. To coordinate the credit cycle across cells, we
assume that the conditional rates are connected by a comonotonic copula. Operationally, the
conditional rate in every cell depends on a single risk factor. All cells therefore provide
information about the state of this factor.
We begin by analyzing the five cells of loans taken together. There are 6,120 firm-years of
exposure in all. The cell-specific estimates of EL and PD are equal to those appearing in Table 2.
The average of the standard deviation of loan LGD provides the estimate σ = 23.3%. We
estimate ρ = 18.5% by maximizing the following likelihood in ρ:

(16)  LnL_D = ∑_t ln[∫₀¹ ∏_i C(N_it, d_it) DR_i[q]^(d_it) (1 − DR_i[q])^(N_it − d_it) dq]

where DR_i[q] = Φ[(Φ⁻¹[PD_i] + √ρ Φ⁻¹[q]) / √(1 − ρ)] is the conditional default rate of cell i at quantile q, evaluated at the cell's estimated PD.
The top section of Table 3 shows the estimates of the parameters and pickups of LnL that result.
The estimates of a, b, c, and ρe suggest steepness that is slightly less than the null hypothesis, but
none of the alternatives comes close to the statistically significant pickup of ΔLnL > 1.92. For the
five cells of loans taken together, the null hypothesis survives testing by each of the four
alternatives.
Turning to the twenty cells of bonds, some firms have bonds outstanding in different seniority
classes in the same year. Of the total of 10,585 firm-years of bond exposure, 9.0% have exposure
in two seniority classes, 0.4% have exposure in three classes, and 0.1% have exposure in all four
classes. This creates an intricate dependence between cells rather than independence. Assuming
that this degree of dependence does not invalidate the main result, the middle section of Table 3
shows parameter values suggesting steepness slightly greater than the null hypothesis. Again,
none of the alternative models come close to statistical significance and the null hypothesis
survives testing.
When all loans and bonds are considered together, 16.0% of firm-years have exposure to two or
more classes. Analyzing these simultaneously produces the parameter estimates in the bottom
section of Table 3. Once again, the alternative models remain far from statistically significant
and the null hypothesis survives testing.
The foregoing tests use maximum likelihood estimates of ρ. Risk managers take estimates of ρ
from various sources. These include vended models, asset or equity return correlations, credit
default swaps, regulatory authorities, and inferences from academic studies. All of these sources
are presumably intended to produce an estimate of the statistical parameter that appears in a
Vasicek Distribution relevant for an asymptotic portfolio.4 Still, it is natural to ask whether a
different value of ρ would lead to a different conclusion about the statistical significance of the
alternative hypotheses.
To investigate this, we repeat the analysis of Table 3 for the collection of Loan cells. In each
repetition, we assume a value of ρ. Based on that, we calculate the log likelihood under the null
hypothesis and under Alternative A. A significant result would be indicated by a difference in log
likelihoods greater than 1.92.
Figure 5 displays the results. The lower line is the log likelihood of loss under the null
hypothesis, and the upper line is the maximum log likelihood of loss under Alternative A. When
ρ equals 18.5% the two are nearly equal, as already shown in Table 3. When ρ takes a different
value, the two log likelihoods tend to differ. However, the difference between them never
exceeds 1.92 for ρ in the range 4.8% to 45.4%. It is likely that any estimate of correlation for
Moody’s-rated loans would be in this range. Therefore, the null hypothesis appears robust with
respect to the uncertainty in the estimate of correlation.
[Figure 5 plots the log likelihood of loss under the null hypothesis and the maximum log likelihood under Alternative A against the assumed value of ρ, from 0% to 50%. The difference between the curves reaches 1.92 only outside the range ρ = 4.8% to ρ = 45.4%.]
The results of a statistical test depend on every technique used. For example, our estimator of
PD is the average annual default rate. A maximum likelihood estimate of PD, by contrast, would
take into account both the default rates and the numbers of exposures. One must always be
4 Frye (2008) discusses the difference between the correlation in a statistical distribution and the
correlation between asset returns that is often used as an estimator.
aware that other techniques, as well as other alternative hypotheses or other data sets, could
lead to other conclusions. An exhaustive check is impossible. It seems more important that
existing credit loss models be tested for statistical significance than to try to anticipate every
possibility.
The tests of this section employ three sets of cells: all loans, all bonds, or all instruments
including both loans and bonds. This allows many or all cells to contribute information about
the unobserved systematic risk factor. None of the tests produce evidence to suggest that the
null hypothesis seriously misstates LGD risk. With respect to the collection of loans, the
conclusion is shown to be robust over a range of possible values of correlation.
Applications

This section discusses the practical implementation and the practical effects of the LGD model.
The model can be included within a standard credit loss model without much difficulty. Outside
the model, it can be used to obtain scenario-specific LGDs. If the LGD model were used to
establish the capital charges associated with new credit exposures, new incentives would result.
A standard credit loss model could use Equation (3) to determine conditionally expected LGD.
Estimates of parameters PD and EL (or ELGD) are already part of the credit model. The value of
ρ has little impact on the LGD-default relationship. A practical estimator of ρ might be a
weighted average of an exposure’s correlations with other exposures.
Some credit loss models work directly with unobserved factors that establish the conditional
expectations, and these models would have DR readily available. Other credit models have
available only the simulated default rate. In each simulation run, these models could place the
portfolio default rate within a percentile of its distribution, and use that percentile to estimate
the DR of each defaulted exposure in the simulation run. An LGD would be drawn from a
distribution centered at the conditionally expected rate. This approximation is expected to
produce reasonable results for the simulated distribution of loss. Every exposure would have
LGD risk, and portfolio LGD would be relatively high in simulation runs where the default rate
is relatively high.
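A sketch of that percentile mapping, assuming the simulation has already produced a vector of portfolio default rates; the empirical-rank step and all function names are illustrative, not part of any particular vendor model.

    import numpy as np
    from scipy.stats import norm

    def scenario_lgds(sim_default_rates, pd, el, rho):
        # place each run's default rate within a percentile of its distribution...
        n = len(sim_default_rates)
        ranks = np.argsort(np.argsort(sim_default_rates))
        q = (ranks + 0.5) / n
        # ...estimate the conditionally expected DR at that percentile...
        dr = norm.cdf((norm.ppf(pd) + np.sqrt(rho) * norm.ppf(q)) / np.sqrt(1 - rho))
        # ...and apply Equation (3) to obtain the conditionally expected LGD
        k = (norm.ppf(pd) - norm.ppf(el)) / np.sqrt(1 - rho)
        return norm.cdf(norm.ppf(dr) - k) / dr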
Outside a credit loss model, risk managers might want to have an estimate of expected LGD
under particular scenarios. One important scenario is that DR has a tail realization. In a tail
event, there would be many defaults, and individual LGDs should average out quite close to the
conditionally expected LGD rate.
In the bad tail, conditionally expected LGD is greater than ELGD. Figure 6 shows the difference
at the 99.9th percentile. Functions for six different exposures are illustrated. Based on its PD,
each exposure has ρ taken from the Basel II formula.
Figure 6: 99.9th percentile LGD less ELGD
[Figure 6 plots the add-on, the 99.9th percentile LGD minus ELGD, against ELGD from 0% to 100% for six exposures: PD = 10% with ρ = 12.1%; PD = 3% with ρ = 14.7%; PD = 1% with ρ = 19.3%; PD = 0.3% with ρ = 22.3%; PD = 0.1% with ρ = 23.4%; and PD = 0.03% with ρ = 23.8%. Add-ons range up to about 20%.]
An exposure with PD = 10% is illustrated on the top line. If ELGD were equal to 10%, LGD in the
99.9th percentile would equal (10% + 12%) = 22%, which is more than twice the value of ELGD.
If ELGD were equal to 20%, LGD in the 99.9th percentile would equal (20% + 16%) = 36%. The
diagram makes clear that the LGD function extracts a premium from exposures having the
low-ELGD, high-PD combination. Relative to systems that ignore LGD risk, this relatively
discourages exposures that have exhibited low historical LGD rates and relatively favors
exposures that have low PD rates.
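This arithmetic can be verified directly from Equations (3) and (4), using the top-line parameters of Figure 6; the computed add-on is roughly the 12 points quoted for ELGD = 10%.

    import numpy as np
    from scipy.stats import norm

    pd, rho, elgd = 0.10, 0.121, 0.10
    el = pd * elgd
    # DR at the 99.9th percentile of its Vasicek distribution
    dr999 = norm.cdf((norm.ppf(pd) + np.sqrt(rho) * norm.ppf(0.999)) / np.sqrt(1 - rho))
    k = (norm.ppf(pd) - norm.ppf(el)) / np.sqrt(1 - rho)
    lgd999 = norm.cdf(norm.ppf(dr999) - k) / dr999      # Equation (3)
    print(dr999, lgd999 - elgd)                         # add-on of about 0.12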
In Figure 6, the conditional LGD rate depends on both parameters—PD and ELGD. That traces
back to the derivation of the LGD function. If the LGD function had no sensitivity to PD, the
credit loss distribution would have three parameters rather than two. Thus, the idea that the
distribution of credit loss can be seen only two parameters deep with existing data has a very
practical risk management implication.
If this approach to LGD were used to set capital charges for extensions of credit, new incentives
would result. The asymptotic loss distribution has two parameters, EL and ρ. Assuming that the
parameter ρ is uniform across a set of exposures, two credit exposures having the same EL
would have the same credit risk. The capital attributed to any exposure would be primarily a
function of its EL. EL, rather than the breakdown of EL into PD and ELGD, would become the
primary focus of risk managers. This would produce an operational efficiency and also serve the
more general goals of credit risk management.
Conclusion
If credit loss researchers had thousands of years of data, they might possess a detailed
understanding of the relationship between the LGD rate and the default rate. However, only a
few dozen years of data exist. Logically, it is possible that these data are too scanty to allow
careful researchers to distinguish between theories. This possibility motivates the current paper.
This study begins with simple statistical models of credit loss and default and infers LGD as a
function of the default rate. Using a long and carefully observed data set, this function is tested
but it is not found to be too steep or too shallow. It produces greater LGD rates with greater
default rates. It uses only parameters that are already part of credit loss models; therefore, the
LGD function can be implemented as it is. It can also be subject to further testing. By far, the
most important tests would be against the portfolio credit loss models now running at financial
institutions. If those models do not have statistical significance against Equation (3), they
should be modified to improve their handling of systematic LGD risk.
References
Altman, E. I., and B. Karlin, 2010, Special report on defaults and returns in the high-yield bond
and distressed debt market: The year 2009 in review and outlook, NYU Salomon Center report,
February.
Frye, J., 2008, Correlation and asset correlation in the structural portfolio model, Journal of
Credit Risk 4(2) (Summer).

Frye, J., 2010, Modest means, Risk (January), 94-98.
Gordy, M., 2003, A risk-factor model foundation for ratings-based bank capital rules, Journal of
Financial Intermediation 12(3) (July), 199-232.
Gupton, G., C. Finger, and M. Bhatia, 1997, CreditMetrics™ Technical Document.
Pykhtin, M., 2003, Unexpected Recovery Risk, Risk (August), pp. 74-78.
Pykhtin, M., and A. Dev, 2002, Analytical approach to credit risk modeling, Risk (March),
S26-S32.
Appendix 1: Analysis of the LGD function
Appendix 1 analyzes Equation (3). It can be restated using the substitution EL = PD ELGD:
(17)  LGD = Φ[Φ⁻¹[DR] − k] / DR

(18)  k = (Φ⁻¹[PD] − Φ⁻¹[PD·ELGD]) / √(1 − ρ)
LGD functions differ from each other only because their parameter values produce different
values of k. We refer to k as the LGD Risk Index.
Figure 7 illustrates the LGD function with a base case (E0) and with three contrasting cases. The
base case has the parameter values ELGD = 32.6%, PD = 4.59%, and ρ = 14.51%. These produce
the value k = 0.53. Each contrasting case doubles one of the three parameters, ELGD, PD, or ρ.
Case E1 is the same LGD function as illustrated in Figure 1.
[Figure 7 plots LGD rates of 0% to 40% against default rates of 0% to 20% for the base case E0 and the three contrasting cases; for example, case E3 has ELGD = 32.6%, PD = 4.59%, ρ = 29%, and k = 0.58.]
Along each LGD function, LGD rises moderately with default. In their nearly linear regions from
5% to 15%, LGD rises by slightly less than 10% for each of the illustrated exposures.
LGD lines cannot cross, because the LGD Risk Index k acts like a shift factor. Comparing
the three contrasting cases, E1 is the most distant from E0. That is because the unconditional
expectation, ELGD, has the most effect on k; not surprisingly, ELGD is the most important
variable affecting the conditional expectation, LGD. Next most important is PD, which has
partially offsetting influences on the numerator of k. Least important is the value of ρ. This is
useful to know because the value of ρ might be known within limits that are tight enough—say,
5%-25% for corporate credit exposures—to put tight bounds on the influence of ρ.
In general, an estimate of PD tends to be close to the average annual default rate. (Our estimator
of PD is in fact exactly equal to the average annual default rate.) An estimate of ELGD, however,
tends to be greater than the average annual LGD rate. The latter is sometimes referred to as
“time-weighted” LGD, since it weights equally the portfolio average LGDs that are produced at
different times. By contrast, an estimate of ELGD is “default-rate-weighted.” This tends to be
greater than the time-weighted average, because it places greater weight on the times when the
default rate is elevated, and these tend to be times when the LGD rate is elevated. As a
consequence, the point (PD, ELGD) tends to appear above the middle of a data swarm.
The LGD function passes close to the point (PD, ELGD). This can be seen by inspection of
Equation (17). In the unrealistic but mathematically permitted case that ρ = 0, if DR = PD then
LGD = ELGD. In other words, if ρ = 0 the LGD function passes exactly through the point (PD,
ELGD). In the realistic case that ρ > 0, the LGD function passes lower than this. In Figure 7,
Function E1 passes through (4.59%, 62.9%), which is 2.3% lower than (PD = 4.59%, ELGD =
65.1%). Function E2 passes through (9.18%, 29.4%), which is 3.2% lower than (PD = 9.18%,
ELGD = 32.6%).
For a given combination of PD and ELGD, the “drop”—the vertical difference between the point
(PD, ELGD) and the function value—depends on ρ; greater ρ produces greater drop. (On the
other hand, greater ρ allows the data to disperse further along the LGD function. This is the
mechanism that keeps EL invariant when ρ becomes greater.) The amount of the drop can be
placed within limits that are easy to calculate. If ρ takes the value of 25%, the drop is 4%-7% for
all PD less than 50% and all ELGD between 10% and 70%. If ρ takes the value of 4%, the drop is
less than 1% for all PD and all ELGD. For all values of parameters that are likely to be
encountered, the LGD function tends to pass slightly lower than the point (PD, ELGD).
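A quick check of the E1 drop quoted above, using Equations (17) and (18):

    import numpy as np
    from scipy.stats import norm

    def drop(pd, elgd, rho):
        # vertical distance between ELGD and the LGD function evaluated at DR = PD
        k = (norm.ppf(pd) - norm.ppf(pd * elgd)) / np.sqrt(1 - rho)
        lgd_at_pd = norm.cdf(norm.ppf(pd) - k) / pd     # Equation (17) at DR = PD
        return elgd - lgd_at_pd

    print(drop(pd=0.0459, elgd=0.651, rho=0.1451))      # roughly the 2% quoted for E1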
The LGD function of Equation (3) is strictly monotonic. Figure 8 illustrates this for seven
exposures that share a common value of PD (5%) and a common value of ρ (15%), but differ
widely in ELGD.
Because both the axes of Figure 8 are on a logarithmic scale, the slopes of lines in Figure 8 can
be interpreted as elasticities, which measure responsiveness in percentage terms. The elasticity
of LGD with respect to DR is defined as
(19)  elasticity = (∂LGD / ∂DR) (DR / LGD) = ∂ ln LGD / ∂ ln DR
Looking in the range 1% < DR < 10%, the slope is greater for lines that are lower; that is, the
elasticity of LGD with respect to DR is high when ELGD is low. Thus, when default rates rise the
biggest percentage changes in LGD are likely to be seen in low-ELGD exposures.
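The elasticity of Equation (19) can be evaluated numerically; the central difference in log space below is an implementation choice, and the parameters are illustrative.

    import numpy as np
    from scipy.stats import norm

    def lgd(dr, pd, el, rho):
        k = (norm.ppf(pd) - norm.ppf(el)) / np.sqrt(1 - rho)
        return norm.cdf(norm.ppf(dr) - k) / dr

    def elasticity(dr, pd, el, rho, h=1e-5):
        # d ln LGD / d ln DR by central difference
        up = lgd(dr * (1 + h), pd, el, rho)
        dn = lgd(dr * (1 - h), pd, el, rho)
        return (np.log(up) - np.log(dn)) / (2 * h)

    # low ELGD implies high elasticity, as Figure 8 suggests
    print(elasticity(0.05, pd=0.05, el=0.05 * 0.10, rho=0.15))
    print(elasticity(0.05, pd=0.05, el=0.05 * 0.70, rho=0.15))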
Figure 8. LGD functions: PD = 5%, ρ = 15%, and seven values of ELGD
Figure 8 represents by extension the entire range of LGD functions that can arise. Each of the
LGD functions illustrated in Figure 8 could apply to infinitely many other exposures that have
parameters implying the same value of k.
Appendix 2: Alternative A and Pykhtin’s LGD model
A solid theoretical model of LGD is provided by Michael Pykhtin. This Appendix discusses
Pykhtin’s model and then illustrates that Alternative A is similar to it. In fact, Alternative A can
be thought of as an approximation to Pykhtin’s model, if the slopes are low or moderate.
Therefore, although we do not test directly against Pykhtin’s model, this suggests that we test
against an alternative that is very much like it.
Pykhtin’s LGD model depends on a single factor that can be the same one that gives rise to
variation of the default rate. Adapting Pykhtin’s original notation and reversing the dependence
on Z, there are three parameters that control the relationship between LGDPyk and the standard
normal factor Z:
(20)  LGD_Pyk[Z] = Φ[−m / s] − exp[m + s²/2] Φ[−m / s − s],
      where m = μ − ρ_LGD σ Z and s = σ √(1 − ρ_LGD²)
Pykhtin’s three parameters are μ, σ, and ρ_LGD. Roughly stated, these measure the log of the
initial value of collateral, the dispersion of its ending value, and the correlation between its
return and the risk factor Z. Obviously, this is a model of the dynamics of collateral; LGD is
determined as the outcome of those dynamics. If there is very little collateral, LGD takes a high
value and there is very little for the model to do. Thus, the contribution of the model is most
apparent when ELGD is low.
Pykhtin’s LGD model can be combined with Vasicek’s default model, which relates the rate of
default to the unobserved risk factor Z:
(21)  DR = Φ[(Φ⁻¹[PD] + √ρ Z) / √(1 − ρ)],   equivalently   Z = (√(1 − ρ) Φ⁻¹[DR] − Φ⁻¹[PD]) / √ρ
The expression for Z can be substituted into Equation (20) to produce a relationship between
LGD and default. In this relationship, LGD is a monotonic increasing function of DR that
approaches the limits of zero and one as DR approaches the same limits.
Pykhtin’s LGD model could be used to test the null hypothesis of Equation (3). To produce the
correct value of EL, the parameter values must obey the following restriction:
(22)  EL = PD·ELGD = ∫ DR[z] LGD_Pyk[z] φ[z] dz
Maximizing Equation (9) using the LGD function of Equation (20) and subject to the constraint
expressed by Equation (22) is believed to require a substantial commitment to numerical
optimization of what is apt to be a weakly identified model. In the much simpler distribution of
loss for the asymptotic portfolio, Frye (2010) finds that the Pykhtin parameters interact strongly
with each other and produce an optimum with limiting behavior; that is, to produce the
maximum likelihood one of the parameters must be allowed to tend toward negative infinity.
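For reference, a sketch of Equations (20) through (22): the conditional LGD for lognormal collateral, the factor mapping, and a root search that calibrates μ so that expected loss equals PD·ELGD. The quadrature grid and the bracketing interval are implementation assumptions.

    import numpy as np
    from scipy.optimize import brentq
    from scipy.stats import norm

    def pykhtin_lgd(z, mu, sigma, rho_lgd):
        # Equation (20): expected LGD given the factor Z, lognormal collateral
        m = mu - rho_lgd * sigma * z              # conditional mean of log collateral
        s = sigma * np.sqrt(1 - rho_lgd**2)       # conditional dispersion
        return norm.cdf(-m / s) - np.exp(m + s**2 / 2) * norm.cdf(-m / s - s)

    def calibrate_mu(sigma, rho_lgd, pd, elgd, rho):
        # Equation (22): solve E[DR(Z) * LGD(Z)] = PD * ELGD for mu
        z = np.linspace(-8, 8, 4001)
        w = norm.pdf(z) * (z[1] - z[0])
        dr = norm.cdf((norm.ppf(pd) + np.sqrt(rho) * z) / np.sqrt(1 - rho))  # Eq. (21)
        f = lambda mu: np.sum(w * dr * pykhtin_lgd(z, mu, sigma, rho_lgd)) - pd * elgd
        return brentq(f, -5.0, 5.0)

    # the low-ELGD exposure of this appendix: PD = 5%, ELGD = 20%, rho = 15%
    print(calibrate_mu(sigma=0.32, rho_lgd=0.32, pd=0.05, elgd=0.20, rho=0.15))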
Rather than test directly against Pykhtin’s model, we test against Alternative A and other
alternatives. We compare the two LGD models for a low-ELGD credit exposure: PD = 5%, ELGD
= 20%, EL = 1%, and ρ = 15%. In Alternative A, this specification leaves undetermined only the
value of parameter a. In the Pykhtin model, it leaves undetermined two parameters, because of
the three LGD parameters one of them can be established by Equation (22).
Figure 9 illustrates the comparison at three distinct levels of LGD risk: low, medium, and high.
The low level of LGD risk produces an almost-constant LGD function. The medium level is
consistent with Equation (3). In the high level, the LGD functions are steep and varied. At each
level of LGD risk, the line in Figure 9 representing Alternative A appears in green. Every line in
Figure 9 produces expected loss equal to 1%. The parameter values of these LGD functions are
shown in Table 4.
[Figure 9 plots LGD rates of 0% to 40% against default rates of 0% to 20% at the three levels of LGD risk; the Alternative A curves are labeled a = 0.87, a = 0, and a = −3.]
Table 4. LGD functions in Figure 9

Low LGD risk
  Alternative A:   a = 0.867
  Pykhtin model:   μ = −0.220, σ = 0.100, ρ_LGD = 0.100

Medium LGD risk
  Alternative A:   a = 0.000 (null hypothesis)
  Pykhtin model:   μ = −0.119, σ = 0.320, ρ_LGD = 0.320
                   μ = 0.294, σ = 0.950, ρ_LGD = 0.230
                   μ = −0.169, σ = 0.075, ρ_LGD = 0.950

High LGD risk
  Alternative A:   a = −3.000
  Pykhtin model:   μ = 0.256, σ = 0.640, ρ_LGD = 0.640
                   μ = 0.550, σ = 0.950, ρ_LGD = 0.590
                   μ = −0.044, σ = 0.235, ρ_LGD = 0.950
When LGD risk is low, the LGD-default relationship is nearly flat at 20%. This is true of both
Pykhtin’s model (σ = ρ_LGD = 10%) and of Alternative A (a = 0.867). The two lines appear as
nearly constant functions and might be indistinguishable in the rendering of Figure 9.
Three variants of Pykhtin’s model are compared to the LGD model of Equation (3), which is
Alternative A with a = 0. The extra parameters of Pykhtin’s model introduce some nuance into
the shape of the relationship, but not much. Two of the variants of Pykhtin’s model involve large
parameter values (either σ or ρ_LGD equals 95%), and the third one involves equal values
(σ = ρ_LGD = 32%). Despite diverse sets of parameter values, the LGD functions are nearly the same except
at the left, where LGD functions are particularly difficult to distinguish by empirical data.
Comparing to the high-risk case when parameter a equals -3, the nuance of Pykhtin’s model is
clear. Economically, the borrower posts considerable collateral (μ is elevated), but the collateral
is subject to both great systematic risk and great idiosyncratic risk. The shapes produced by
the Pykhtin model are different from the shape of Alternative A and somewhat different from
each other; the one with ρ_LGD equal to 95% is distinct from the other two. If the slope of the LGD
function were found to be this steep, the nuance provided by the Pykhtin model might make a
significant contribution relative to Alternative A.