A Lognormal Model For Insurance Claims Data
All content following this page was uploaded by Carlos Diniz on 27 August 2014.
Abstract:
• In insurance, risk theory quantifies risks and fits models for pricing and for
insurance company ruin, based chiefly on observations of the number of claims, N(w),
corresponding to an exposure w, and of the total amount of claims incurred, Y(w).
The main difficulty is that the distribution function of Y(w), and consequently the
likelihood function used to estimate the parameters, is hard to obtain.
This work assumes a Poisson(wλ) distribution, λ > 0, for N(w) and a lognormal(µ, σ²)
distribution, −∞ < µ < ∞ and σ² > 0, for the individual claims Z_i, and presents
maximum-likelihood estimates for λ, µ and σ².
Key-Words:
• 62J02, 62F03.
132 Zuanetti, Diniz and Leite
1. INTRODUCTION
In the insurance area, the main goals of risk theory are to study, analyze,
specify dimensions and quantify risks. Risk theory is also responsible
for fitting models of pricing and insurance company ruin, especially based on
observations of the random variables for the number of claims, N(w), and the
total amount of claims incurred, Y(w), defined as
$$ Y(w) = \sum_{i=1}^{N(w)} Z_i \, I_{(N(w)>0)} . \tag{1.1} $$
Assuming that N(w), Z_1, Z_2, ... are independent and that the individual claims
are identically distributed, Jorgensen and Souza ([4]) discussed estimation
and inference for the parameters when the number of claims follows a Poisson
process and the individual claims follow a gamma distribution.
Using the properties of the Tweedie family of exponential dispersion models
([8]; [3]), Jorgensen and Souza ([4]) determined, using the convolution formula,
that Y(w) | N(w) follows an exponential dispersion model and that the joint
distribution of N(w) and Y(w)/w follows a Tweedie compound Poisson distribution.
For more details about exponential dispersion models, see [2] and [3].
Although the gamma distribution represents individual claim values very well
in some situations, in other cases a lognormal distribution may be more suitable
for Z_1, Z_2, ... . Examples are collisions in car insurance and common fires,
where individual claim values can increase almost without limit but cannot fall
below zero, most values lie near the lower limit, and the natural logarithm of
the individual claim value is normally distributed.
The aim of this paper is to estimate the parameters of the distributions of
$Y(w) = \sum_{i=1}^{N(w)} Z_i \, I_{(N(w)>0)}$ and N(w), where N(w), Z_1, Z_2, ...
are independent, Z_1, Z_2, ... is a sequence of random variables with
lognormal(µ, σ²) distribution and N(w) follows a Poisson distribution with rate λ.
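The compound model described above can be explored by simulation. The following is a minimal sketch, assuming a single group with claim count N ∼ Poisson(λ) and individual claims Z_i ∼ lognormal(µ, σ²); the function names, the seed, and the parameter values (λ = 2, µ = 7.1, σ² = 0.1) are illustrative choices rather than part of the model specification.

```python
import math
import random

def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate by multiplying uniforms (suitable for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_total_claims(lam, mu, sigma2, rng):
    """Return (N, Y): the claim count and the total claim amount for one group."""
    n = sample_poisson(lam, rng)
    claims = [rng.lognormvariate(mu, math.sqrt(sigma2)) for _ in range(n)]
    return n, sum(claims)  # Y = 0 when N = 0, matching the indicator in (1.1)

rng = random.Random(2024)  # fixed seed so the run is reproducible
sample = [simulate_total_claims(2.0, 7.1, 0.1, rng) for _ in range(20000)]
mean_total = sum(y for _, y in sample) / len(sample)
# Under independence, E[Y] = lam * E[Z] = lam * exp(mu + sigma2 / 2).
expected_total = 2.0 * math.exp(7.1 + 0.1 / 2)
```

Comparing `mean_total` with `expected_total` gives a quick sanity check that the simulated totals behave as the model predicts.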
2. LOGNORMAL MODEL
3. PARAMETER ESTIMATION
However, since the lognormal distribution is defined with reference to the
normal distribution, estimating µ, σ² and λ from the likelihood function for these
parameters based on the variables N(w) and Y(w) is equivalent to estimating µ,
σ² and λ from the likelihood function based on the variables N(w) and
$$ X_+(w) = \sum_{i=1}^{N(w)} X_i \, I_{(N(w)>0)} , $$
Then, we have
$$ X_+(w) \mid N(w) = n \;\sim\; \mathrm{Normal}(n\mu, n\sigma^2), \quad \text{for } n \ge 1 . \tag{3.1} $$
and
$$ \hat{\lambda} = \frac{\sum_{i=1}^{m} \delta_i N_i}{m} = \frac{\sum_{i=1}^{m} N_i}{m} . \tag{3.6} $$
Let $S = \sum_{i=1}^{m} N_i$ be the total number of claims and $U = \sum_{i=1}^{m} X_{+i}$.
Hence, S follows a Poisson(mλ) distribution and U | (N = n) follows a
$\mathrm{Normal}\big(\mu \sum_{i=1}^{m} n_i,\ \sigma_0^2 \sum_{i=1}^{m} n_i\big)$ distribution,
where N = (N_1, N_2, ..., N_m) and n = (n_1, n_2, ..., n_m) is the observed vector of
numbers of claims for the m groups. Thus U | (S = s) follows a Normal(µs, σ_0² s)
distribution and the exact distribution of λ̂ is given by
$$ P\Big(\hat{\lambda} = \frac{c}{m}\Big) = P(S = c) = \frac{\exp(-m\lambda)\,(m\lambda)^c}{c!} \quad \text{for } c = 0, 1, 2, \dots . \tag{3.7} $$
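As an illustration of (3.6) and (3.7), the sketch below computes λ̂ from a vector of claim counts and evaluates its exact probability mass function. The function names and the example counts are hypothetical; the probability is computed on the log scale only as a numerical convenience.

```python
import math

def lambda_hat(counts):
    """MLE of lambda from (3.6): total number of claims divided by the number of groups."""
    return sum(counts) / len(counts)

def lambda_hat_pmf(c, m, lam):
    """P(lambda_hat = c/m) = P(S = c) with S ~ Poisson(m * lam), as in (3.7)."""
    # Work on the log scale so that large values of m * lam do not overflow.
    log_p = -m * lam + c * math.log(m * lam) - math.lgamma(c + 1)
    return math.exp(log_p)

counts = [2, 0, 3, 1]           # hypothetical claim counts for m = 4 groups
estimate = lambda_hat(counts)    # total of 6 claims over 4 groups
prob = lambda_hat_pmf(sum(counts), len(counts), 2.0)  # P(lambda_hat = 6/4) when lambda = 2
```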
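The cumulative distribution function of µ̂ given S > 0, $F_{\hat{\mu}|S>0}(v)$, for v ∈ R,
is
\begin{align}
P(\hat{\mu} \le v \mid S > 0)
&= P\Big[ (\hat{\mu} \le v) \cap \Big( \bigcup_{j=1}^{\infty} \{S = j\} \Big) \,\Big|\, S > 0 \Big] \notag \\
&= \frac{P\big( \hat{\mu} \le v,\ \bigcup_{j=1}^{\infty} \{S = j\},\ S > 0 \big)}{P(S > 0)} \notag \\
&= \frac{P\big( \hat{\mu} \le v,\ \bigcup_{j=1}^{\infty} \{S = j\} \big)}{P(S > 0)} \notag \\
&= \frac{\sum_{j=1}^{\infty} P(\hat{\mu} \le v,\ S = j)}{P(S > 0)} \notag \\
&= \frac{\sum_{j=1}^{\infty} P(\hat{\mu} \le v \mid S = j)\, P(S = j)}{P(S > 0)} \tag{3.8} \\
&= \frac{\sum_{j=1}^{\infty} P\big( \tfrac{U}{S} \le v \mid S = j \big)\, P(S = j)}{P(S > 0)} \notag \\
&= \frac{\sum_{j=1}^{\infty} P\big( \tfrac{U}{j} \le v \mid S = j \big)\, P(S = j)}{P(S > 0)} \notag \\
&= \frac{\sum_{j=1}^{\infty} P(U \le jv \mid S = j)\, P(S = j)}{P(S > 0)} \notag \\
&= \sum_{j=1}^{\infty} F_U(jv)\, \frac{\exp(-m\lambda)\,(m\lambda)^j}{j!\,\big[1 - \exp(-m\lambda)\big]} , \notag
\end{align}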
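\begin{align}
f_{\hat{\mu}|S>0}(v)
&= \frac{dF_{\hat{\mu}|S>0}(v)}{dv} \notag \\
&= \sum_{j=1}^{\infty} f_U(jv)\, j\, \frac{\exp(-m\lambda)\,(m\lambda)^j}{j!\,\big[1 - \exp(-m\lambda)\big]} \notag \\
&= \frac{\exp(-m\lambda)}{1 - \exp(-m\lambda)} \sum_{j=1}^{\infty} f_U(jv)\, \frac{(m\lambda)^j}{(j-1)!} \tag{3.9} \\
&= \frac{\exp(-m\lambda)}{1 - \exp(-m\lambda)} \sum_{r=0}^{\infty} f_U\big((r+1)v\big)\, \frac{(m\lambda)^{r+1}}{r!} \notag \\
&= \frac{(m\lambda)\exp(-m\lambda)}{1 - \exp(-m\lambda)} \sum_{r=0}^{\infty} f_U\big((r+1)v\big)\, \frac{(m\lambda)^r}{r!} , \notag
\end{align}
where f_U is the probability density function of the Normal(µ(r+1), σ_0²(r+1))
distribution.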
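The series in (3.8) and (3.9) can be evaluated numerically by truncating the sum over j. The sketch below is one way to do this with the Python standard library. It takes F_U and f_U in the j-th term to be the distribution function and density of U | (S = j) ∼ Normal(µj, σ_0² j). The function names, the truncation rule, and the value m = 20 are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def zero_truncated_weights(m, lam):
    """Return [(j, P(S = j | S > 0))] for S ~ Poisson(m * lam), truncated far in the tail."""
    rate = m * lam
    log_norm = math.log1p(-math.exp(-rate))          # log P(S > 0)
    upper = int(rate + 12 * math.sqrt(rate) + 30)     # assumed cut-off for the infinite sum
    weights = []
    for j in range(1, upper + 1):
        log_w = -rate + j * math.log(rate) - math.lgamma(j + 1) - log_norm
        weights.append((j, math.exp(log_w)))
    return weights

def mu_hat_cdf(v, m, lam, mu, sigma0_sq):
    """P(mu_hat <= v | S > 0), using the last line of the derivation leading to (3.8)."""
    total = 0.0
    for j, w in zero_truncated_weights(m, lam):
        u_given_j = NormalDist(mu * j, math.sqrt(sigma0_sq * j))  # U | S = j
        total += u_given_j.cdf(j * v) * w
    return total

def mu_hat_pdf(v, m, lam, mu, sigma0_sq):
    """Density of mu_hat given S > 0: the sum of j * f_U(j v) times the same weights."""
    total = 0.0
    for j, w in zero_truncated_weights(m, lam):
        u_given_j = NormalDist(mu * j, math.sqrt(sigma0_sq * j))
        total += j * u_given_j.pdf(j * v) * w
    return total

# Example with lambda = 2, mu = 7.1, sigma_0^2 = 0.1 and a hypothetical m = 20 groups.
p_at_mu = mu_hat_cdf(7.1, 20, 2.0, 7.1, 0.1)
density_near_mu = mu_hat_pdf(7.05, 20, 2.0, 7.1, 0.1)
```

Because every term F_U(jv) equals 1/2 at v = µ, the distribution function evaluated at the true µ should be very close to 0.5, which is a convenient check on the truncation.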
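Let k be the number of groups with a number of claims greater than zero.
If σ² is unknown, the maximum likelihood estimate of σ² is
$$ \hat{\sigma}^2 = \frac{\sum_{i=1}^{m} \delta_i \dfrac{(X_{+i} - N_i \hat{\mu})^2}{N_i}}{\sum_{i=1}^{m} \delta_i} = \frac{\sum_{j=1}^{k} \dfrac{(X_{+j} - N_j \hat{\mu})^2}{N_j}}{k} \quad \text{if } N_j > 0, \text{ for all } j = 1, 2, \dots, k . \tag{3.10} $$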
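A small numerical sketch of (3.10) follows. It assumes µ̂ = U/S, the ratio of the total of the log-claims to the total number of claims, which is the form used in the derivation of (3.8). The function names and the data are hypothetical.

```python
def mu_hat(x_plus, counts):
    """Assumed estimator mu_hat = U / S: total of the X_+ values over total claim count."""
    return sum(x_plus) / sum(counts)

def sigma2_hat(x_plus, counts):
    """Estimate of sigma^2 following (3.10), averaging over the k groups with claims."""
    mu = mu_hat(x_plus, counts)
    # Groups with N_i = 0 have delta_i = 0 and are skipped.
    terms = [(x - n * mu) ** 2 / n for x, n in zip(x_plus, counts) if n > 0]
    return sum(terms) / len(terms)

counts = [2, 0, 1]          # hypothetical claim counts; the second group has no claims
x_plus = [14.0, 0.0, 7.3]   # hypothetical sums of log-claims per group
mu_est = mu_hat(x_plus, counts)          # 21.3 / 3
sigma2_est = sigma2_hat(x_plus, counts)  # (0.04 / 2 + 0.04 / 1) / 2
```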
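$$ \widehat{E}[Z] = \exp\Big( \hat{\mu} + \frac{1}{2} \hat{\sigma}^2 \Big) , $$
$$ \widehat{\mathrm{Var}}[Z] = \exp\big( 2\hat{\mu} + 2\hat{\sigma}^2 \big) - \exp\big( 2\hat{\mu} + \hat{\sigma}^2 \big) , $$
$$ \widehat{E}[N] = \hat{\lambda} $$
and
$$ \widehat{\mathrm{Var}}[N] = \hat{\lambda} . $$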
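The plug-in formulas for the individual claim can be checked with a few lines of Python. The sketch below uses illustrative function names and evaluates the formulas at µ = 7.1 and σ² = 0.1, the values used for the simulated data in the applications, which gives the mean 1274.11 and variance 170728.8 quoted there.

```python
import math

def claim_mean(mu, sigma2):
    """Lognormal mean exp(mu + sigma2 / 2), evaluated at the supplied parameter values."""
    return math.exp(mu + 0.5 * sigma2)

def claim_variance(mu, sigma2):
    """Lognormal variance exp(2 mu + 2 sigma2) - exp(2 mu + sigma2)."""
    return math.exp(2 * mu + 2 * sigma2) - math.exp(2 * mu + sigma2)

mean_z = claim_mean(7.1, 0.1)
var_z = claim_variance(7.1, 0.1)
```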
Suppose that (x_{+1}, n_1), (x_{+2}, n_2), ..., (x_{+m}, n_m) are observations of the
independent random vectors (X_{+1}, N_1), (X_{+2}, N_2), ..., (X_{+m}, N_m), where m is
the number of groups present in the insurance portfolio, N_i ∼ Poisson(λ), and
X_{+i} | (N_i = n_i) ∼ Normal(µ_i, n_i σ²), i = 1, 2, ..., m, with the following
regression structure for the location parameter:
$$ \mu_i = \alpha n_i + \beta \sum_{j=1}^{n_i} v_{ij} , $$
where v_{ij} represents the covariate of the j-th individual claim of the i-th group,
for i = 1, 2, ..., m and j = 1, 2, ..., n_i.
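Defining $r_i = \sum_{j=1}^{n_i} v_{ij}$, the log likelihood function for the parameters
α, β, σ² and λ is given by
$$ l(\alpha, \beta, \sigma^2, \lambda) = \sum_{i=1}^{m} \left\{ \delta_i \left[ -\frac{1}{2} \log\big( 2\pi n_i \sigma^2 \big) + n_i \log(\lambda) - \frac{1}{2 n_i \sigma^2} \big( x_{+i} - \alpha n_i - \beta r_i \big)^2 \right] + (-\lambda) \right\} . $$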
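The log likelihood above translates directly into code. The following sketch evaluates it for given parameter values; the function and argument names are illustrative, and, as in the displayed expression, the term −log(n_i!) that does not depend on the parameters is left out.

```python
import math

def log_likelihood(alpha, beta, sigma2, lam, x_plus, counts, r):
    """Log likelihood of (alpha, beta, sigma2, lam), up to an additive constant."""
    total = 0.0
    for x, n, r_i in zip(x_plus, counts, r):
        total += -lam                     # Poisson term, present for every group
        if n > 0:                         # delta_i = 1 only for groups with claims
            resid = x - alpha * n - beta * r_i
            total += (-0.5 * math.log(2 * math.pi * n * sigma2)
                      + n * math.log(lam)
                      - resid ** 2 / (2 * n * sigma2))
    return total

# Hypothetical data: one group with a single claim lying exactly on the regression
# line, and one group with no claims.
value = log_likelihood(1.0, 2.0, 1 / (2 * math.pi), 1.0,
                       x_plus=[2.0, 0.0], counts=[1, 0], r=[0.5, 0.0])
```

In this example the normal and log(λ) terms vanish, so only the two −λ contributions remain. A numerical optimizer could maximize this function, although the closed-form estimators make that unnecessary here.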
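$$ \hat{\beta} = \frac{\displaystyle \sum_{j=1}^{k} \frac{X_{+j}\, r_j}{N_j} - \frac{\sum_{j=1}^{k} X_{+j} \, \sum_{j=1}^{k} r_j}{\sum_{j=1}^{k} N_j}}{\displaystyle \sum_{j=1}^{k} \frac{r_j^2}{N_j} - \frac{\big( \sum_{j=1}^{k} r_j \big)^2}{\sum_{j=1}^{k} N_j}} , \tag{4.2} $$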
$$ \hat{\sigma}^2 = \frac{\sum_{j=1}^{k} \dfrac{(X_{+j} - \hat{\mu}_j)^2}{N_j}}{k} , \quad N_j > 0, \text{ for all } j , \tag{4.3} $$
where µ̂_j = α̂ n_j + β̂ r_j.
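The estimators (4.2) and (4.3) can be sketched in Python as follows. The expression used here for α̂, namely (Σ X_{+j} − β̂ Σ r_j) / Σ N_j, is an assumption obtained by setting the derivative of the log likelihood with respect to α to zero; the function name and the data are hypothetical.

```python
def fit_regression(x_plus, counts, r):
    """Return (alpha_hat, beta_hat, sigma2_hat) using only the k groups with claims."""
    groups = [(x, n, r_j) for x, n, r_j in zip(x_plus, counts, r) if n > 0]
    k = len(groups)
    sum_x = sum(x for x, _, _ in groups)
    sum_n = sum(n for _, n, _ in groups)
    sum_r = sum(r_j for _, _, r_j in groups)
    sum_xr_over_n = sum(x * r_j / n for x, n, r_j in groups)
    sum_r2_over_n = sum(r_j ** 2 / n for _, n, r_j in groups)
    # beta_hat as in (4.2)
    beta = ((sum_xr_over_n - sum_x * sum_r / sum_n)
            / (sum_r2_over_n - sum_r ** 2 / sum_n))
    # alpha_hat: assumed form, from the score equation for alpha
    alpha = (sum_x - beta * sum_r) / sum_n
    # sigma2_hat as in (4.3), with mu_hat_j = alpha_hat * n_j + beta_hat * r_j
    sigma2 = sum((x - alpha * n - beta * r_j) ** 2 / n for x, n, r_j in groups) / k
    return alpha, beta, sigma2

# Hypothetical noise-free data generated with alpha = 7.0 and beta = 0.5.
counts = [2, 3, 0, 1]
r = [1.0, 4.5, 0.0, 0.2]
x_plus = [7.0 * n + 0.5 * r_j for n, r_j in zip(counts, r)]
alpha_est, beta_est, sigma2_est = fit_regression(x_plus, counts, r)
```

With noise-free data the fit should return the generating values of α and β and a variance estimate of essentially zero, which makes this a useful self-check before applying the function to real data.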
5. APPLICATIONS
that is, the expected individual claim value is 1274.11 MU with a variance of
170728.8 MU².
λ     2      1.7     0.3
µ     7.1    7.06    0.04
σ²    0.1    0.09    0.01
From the distribution function of µ̂ given S > 0, defined in (3.8), we can
calculate P(µ̂ ≤ v | S > 0) for different values of v ∈ R. Table 3 shows the
probability of µ̂ ≤ v given S > 0, considering λ = 2, µ = 7.1, σ² = 0.1 (used for the
data simulation) and s = 34 (observed in this dataset).
Table 3: P(µ̂ ≤ v | S > 0) for λ = 2, µ = 7.1, σ² = 0.1 and s = 34
that is, the probability of occurrence of no claims in each group is practically null.
The observed values of N , X+ and δ are presented in Table 4 and the simulated
individual claim values vary between 440.91 MU and 3212.9 MU.
Table 6 shows the probability of µ̂ ≤ v given S > 0, considering λ = 100,
µ = 7.1, σ² = 0.1 (used for the data simulation) and s = 1891 (observed in this
dataset).
Table 6: P(µ̂ ≤ v | S > 0) for λ = 100, µ = 7.1, σ² = 0.1 and s = 1891
From the results of Table 6 we have P[6.4 ≤ µ̂ ≤ 7.1] = 0.978.
6. CONCLUDING REMARKS
ACKNOWLEDGMENTS
We would like to thank the associate editor and referees for their valuable
suggestions and comments on an earlier version of this paper.
REFERENCES
[1] Crow, E.L. and Shimizu, K. (Eds.) (1988). Lognormal Distributions: Theory
and Application, Dekker, New York.
[2] Jorgensen, Bent (1986). Some properties of exponential dispersion models,
Scand. J. Statist., 13, 187–198.
[3] Jorgensen, Bent (1987). Exponential dispersion models (with discussion),
J. Royal Statistical Society Ser. B, 49, 127–162.
[4] Jorgensen, B. and Souza, M.C.P. de (1994). Fitting Tweedie’s compound
Poisson model to insurance claims data, Scandinavian Actuarial Journal, 69–93.
[5] Levy, E. (1992). Pricing European average rate currency options, Journal of
International Money and Finance, 14, 474–491.
[6] Milevsky, M.A. and Posner, S.E. (1998). Asian Options, the sum of lognormal,
and the Reciprocal Gamma distribution, Journal of Financial and Quantitative
Analysis, 33, 3.
[7] Slimane, S.B. (2001). Bounds to the distribution of a sum of independent
lognormal random variables, IEEE Transactions on Communications, 20, 6.
[8] Tweedie, M.C.K. (1947). Functions of a statistical variate with given means,
with special reference to Laplacian distributions, Proc. Camb. Phil. Soc., 49,
41–49.