Handout 9 PDF

Parametric Survival Models
Christoph Dätwyler and Timon Stucki
9. May 2011
Contents
Introduction
Parametric Model
Distributional Assumption
Weibull Model
Accelerated Failure Time Assumption
A More General Form of the AFT Model
Weibull AFT Model
Log-Logistic Model
Other Parametric Models
The Parametric Likelihood
Frailty Models
Summary
Parametric Survival Model
Basic Idea
The survival time follows a distribution.
Goal
Use data to estimate parameters of this distribution
⇒ completely specified model
⇒ prediction of time-quantiles
Parametric Survival Model vs. Cox PH Model
Parametric Survival Model

+ Completely specified h(t) and S(t)
+ More consistent with theoretical S(t)
+ time-quantile prediction possible
– Assumption on underlying distribution
Cox PH Model
– Distribution of survival time unknown
– Less consistent with theoretical S(t) (typically step function)
+ Does not rely on distributional assumptions
+ Baseline hazard not necessary for estimation of hazard ratio
Probability Density, Hazard and Survival Function
Main Assumption
The survival time T is assumed to follow a distribution with density
function $f(t)$.
⇒ $S(t) = P(T > t) = \int_t^{\infty} f(u)\,du$
Recall:
$h(t) = -\dfrac{\frac{d}{dt} S(t)}{S(t)}$, $\quad S(t) = \exp\left(-\int_0^t h(u)\,du\right)$
Density Function in Relation to the Hazard and Survival
Function
Remark
f (t) = h(t)S(t)
Proof:
$h(t) S(t) = -\dfrac{\frac{d}{dt} S(t)}{S(t)}\, S(t) = -\dfrac{d}{dt} S(t) = \dfrac{d}{dt} \int_{\infty}^{t} f(u)\,du = f(t)$
Key Point
Specifying one of the three functions f (t), S(t) or h(t) specifies
the other two functions.
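The key point can be checked numerically. The following R sketch starts from a Weibull hazard only, recovers S(t) by integrating the hazard, forms f(t) = h(t)S(t), and compares both with the closed forms. The values λ = 0.5 and p = 1.5 are illustrative assumptions, not estimates from any data.

```r
# Illustrative parameter values (assumed, not estimated from any data)
lambda <- 0.5
p <- 1.5

# Start from the hazard only
h <- function(t) lambda * p * t^(p - 1)

# S(t) = exp(-integral_0^t h(u) du), by numerical integration
S <- function(t) sapply(t, function(x) exp(-integrate(h, 0, x)$value))

# f(t) = h(t) S(t)
f <- function(t) h(t) * S(t)

t_grid <- c(0.5, 1, 2)
S_closed <- exp(-lambda * t_grid^p)
# R's Weibull uses S(t) = exp(-(t/scale)^shape), so scale = lambda^(-1/p)
f_builtin <- dweibull(t_grid, shape = p, scale = lambda^(-1 / p))

print(cbind(t = t_grid, S_numeric = S(t_grid), S_closed,
            f_numeric = f(t_grid), f_builtin))
```

The numeric and closed-form columns should agree up to integration error.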
Commonly Used Distributions and Parameters
Distribution    f(t)                                             S(t)                         h(t)
Exponential     $\lambda \exp(-\lambda t)$                       $\exp(-\lambda t)$           $\lambda$
Weibull         $\lambda p t^{p-1} \exp(-\lambda t^p)$           $\exp(-\lambda t^p)$         $\lambda p t^{p-1}$
Log-logistic    $\frac{\lambda p t^{p-1}}{(1+\lambda t^p)^2}$    $\frac{1}{1+\lambda t^p}$    $\frac{\lambda p t^{p-1}}{1+\lambda t^p}$
Modeling of the parameters:
- λ is reparameterized in terms of predictor variables and
  regression parameters.
- Typically for parametric models, the shape parameter p is
  held fixed.
Weibull Model
Assuming $T \sim \mathrm{Weibull}(\lambda, p)$ with probability density function
$f(t) = \lambda p t^{p-1} \exp(-\lambda t^p)$, where $p > 0$ and $\lambda > 0$,
the hazard function is given by
$h(t) = \lambda p t^{p-1}$.
p is called shape parameter:
- If p > 1 the hazard increases
- If p = 1 the hazard is constant (exponential model)
- If p < 1 the hazard decreases
[Figure: Hazard function h(t) against t for p < 1, p = 1 and p > 1]
Graphical Evaluation of Weibull Assumption
Property of Weibull Model

For the Weibull model, $\log(-\log(S(t)))$ is linear in $\log(t)$:
$S(t) = \exp(-\lambda t^p)$
⇒ $-\log(S(t)) = \lambda t^p$
⇒ $\log(-\log(S(t))) = \log(\lambda) + p \log(t)$
This property allows a graphical evaluation of the appropriateness
of a Weibull model by plotting
$\log(-\log(\hat{S}(t)))$ vs. $\log(t)$,
where $\hat{S}(t)$ is the Kaplan-Meier survival estimate.
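As a small illustration of this linearity, the R sketch below evaluates the exact Weibull survival function on a grid and fits a straight line to the transformed values. The parameters are assumed for illustration; with real data, S would be replaced by the Kaplan-Meier estimate and the points would only be approximately linear.

```r
# Illustrative Weibull parameters (assumed for this sketch)
lambda <- 0.2
p <- 1.4

t_grid <- c(1, 2, 5, 10, 20)
S <- exp(-lambda * t_grid^p)

# Transform both axes as in the diagnostic plot
y <- log(-log(S))
x <- log(t_grid)

# A straight-line fit recovers intercept log(lambda) and slope p
fit <- lm(y ~ x)
print(coef(fit))
print(c(log_lambda = log(lambda), p = p))
```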
Example: Remission Data
We consider the remission data of 42 leukemia patients.
- 21 patients given treatment (TRT = 1)
- 21 patients given placebo (TRT = 0)
Note: The survival time (time to event) is the time until a patient
went out of remission, which means that the patient relapsed.
[Figure: $\log(-\log(\hat{S}(t)))$ against $\log(t)$ for TRT = 0 and TRT = 1]
- straight lines ⇒ Weibull
- same slope ⇒ PH
Weibull PH Model
Recall: $h(t) = \lambda p t^{p-1}$
Weibull PH model:
- Reparameterize λ with
  $\lambda = \exp(\beta_0 + \beta_1 TRT)$.
- Then the hazard ratio (TRT = 1 vs. TRT = 0) is
  $HR = \dfrac{\exp(\beta_0 + \beta_1)\, p t^{p-1}}{\exp(\beta_0)\, p t^{p-1}} = \exp(\beta_1)$,
  which indicates that the PH assumption is satisfied.
Note: This result depends on p having the same value for TRT = 1
and TRT = 0 (otherwise time would not cancel out).
Exponential PH Model
The exponential distribution is a special case of the Weibull
distribution with p = 1.
Weibull density function:
$f(t) = \underbrace{\exp(-\lambda t^p)}_{S(t)} \; \underbrace{\lambda p t^{p-1}}_{h(t)}$
Setting p = 1 gives the density function of an exponential
distribution
$f(t) = \underbrace{\exp(-\lambda t)}_{S(t)} \; \underbrace{\lambda}_{h(t)}$.
Exponential PH Model
Remission Data
Running the exponential model leads to the following software
output:
Exponential regression
log relative-hazard form
t       Coef.     Std. Err.   z       p >|z|
trt     −1.527    .398        −3.83   0.00
cons    −2.159    .218        −9.90   0.00
The estimated hazard ratio is obtained from the estimated
coefficient $\hat{\beta}_1$ by
$\widehat{HR}(TRT = 1 \text{ vs. } TRT = 0) = \exp(\hat{\beta}_1) = \exp(-1.527) = 0.22$.
Coefficient estimates are obtained by MLE and are asymptotically normal.
Interpretation
This means that the hazard for the group with TRT = 0 is bigger
than the one for the group with TRT = 1 because 0.22 < 1,
indicating a positive effect of the treatment.
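The hazard ratio can be recomputed from the reported coefficient. The sketch below also adds a 95% Wald interval, which is not part of the original output; it is derived here from the reported standard error under the usual asymptotic normality assumption.

```r
# Estimates copied from the exponential regression output above
beta1 <- -1.527   # coefficient of trt
se1   <- 0.398    # its standard error

hr <- exp(beta1)
# Wald interval on the log scale, then exponentiated (an added illustration)
ci <- exp(beta1 + c(-1, 1) * qnorm(0.975) * se1)

print(round(c(HR = hr, lower = ci[1], upper = ci[2]), 3))
```

An interval that lies entirely below 1 is consistent with the significant Wald test reported in the output.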
First example: Humans vs. dogs

- Let $S_H(t)$ and $S_D(t)$ denote the survival functions for humans
  and dogs respectively.
- Known: In general dogs grow older seven times faster than
  humans.
- In AFT terminology: The probability of a dog surviving past t
  years is equal to the one of a human surviving past 7t years.
- This means:
  $S_D(t) = S_H(7t)$
- In other words, dogs accelerate through life about seven times
  faster than humans.
AFT Models
AFT Models
AFT Models describe stretching out or contraction of survival time
as a function of predictor variables.
Illustration of AFT Assumption
Second example: Smokers vs. nonsmokers

- Let $S_S(t)$ and $S_{NS}(t)$ denote the survival functions for
  smokers and nonsmokers respectively.
AFT assumption
$S_{NS}(t) = S_S(\gamma t)$ for $t \ge 0$
Definition
γ > 0 is the so-called acceleration factor and is a constant.
Expressing the AFT assumption
The AFT assumption can be expressed

- in terms of survival function as seen before:
  $S_{NS}(t) = S_S(\gamma t)$
- in terms of random variables for survival time:
  $\gamma T_{NS} = T_S$,
  where $T_{NS}$ is a random variable following some distribution
  representing the survival time for nonsmokers and $T_S$ the
  analogous one for smokers.
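The two formulations are equivalent. A short worked step, added here for completeness and reading the equality of random variables as equality in distribution:

```latex
S_{NS}(t) = P(T_{NS} > t) = P(\gamma T_{NS} > \gamma t) = P(T_S > \gamma t) = S_S(\gamma t), \qquad \gamma > 0 .
```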
Acceleration factor
The acceleration factor makes it possible to evaluate the effect of
predictor variables on the survival time.
Acceleration factor
The acceleration factor is a ratio of time-quantiles corresponding to
any fixed value of S(t).
[Figure: Survival curves S(t) against t for Group 1 (G = 1) and Group 2 (G = 2) with γ = 2; the horizontal distances to the G = 1 and G = 2 curves are marked at S(t) = 0.75, 0.50 and 0.25]
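A minimal R sketch of this quantile-ratio property, assuming two Weibull groups that differ only by a time scaling; the shape, scale and acceleration factor are illustrative values, not estimates.

```r
# Two groups whose survival times differ only by a time scaling (assumed values)
accel  <- 2      # acceleration factor, as in the figure
shape  <- 1.5    # common shape parameter
scale1 <- 10     # scale for group 1; group 2 is stretched by accel

q <- c(0.75, 0.50, 0.25)   # fixed values of S(t)

# Time-quantiles: t with S(t) = q, hence lower.tail = FALSE
t1 <- qweibull(q, shape = shape, scale = scale1, lower.tail = FALSE)
t2 <- qweibull(q, shape = shape, scale = accel * scale1, lower.tail = FALSE)

print(t2 / t1)   # the same ratio for every q
```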
Interpretation of the Acceleration Factor
Assuming the event is undesirable for the individual,
comparing two groups (levels of covariates) leads to the following
general interpretation:
γ > 1 ⇒ exposure benefits survival
γ < 1 ⇒ exposure harmful to survival
For the hazard ratio, we have:
HR > 1 ⇒ exposure harmful to survival
HR < 1 ⇒ exposure benefits survival
γ = HR = 1 ⇒ no effect from exposure

General Form of AFT Model
Consider an AFT model with one predictor X. The model can be
expressed on the log scale as
$\log(T) = \alpha_0 + \alpha_1 X + \epsilon$,
where $\epsilon$ is a random error following some distribution.
Distribution of T    Distribution of log(T)
Exponential          Extreme value
Weibull              Extreme value
Log-logistic         Logistic
Lognormal            Normal
Some distributions (e.g. Weibull) have an additional parameter σ,
which scales $\epsilon$:
$\log(T) = \alpha_0 + \alpha_1 X + \sigma \epsilon$
Here (and in R) the model is parametrized using $\sigma = 1/p$:
$\log(T) = \alpha_0 + \alpha_1 X + \frac{1}{p} \epsilon$
The model in terms of the survival time T is
$T = \exp\left(\alpha_0 + \alpha_1 X + \frac{1}{p} \epsilon\right) = \exp(\alpha_0) \cdot \exp(\alpha_1 X) \cdot \exp\left(\frac{1}{p} \epsilon\right)$
Remark
The AFT model is multiplicative in terms of T and additive in terms of
log(T).
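For the Weibull case the log-linear form can be checked against R's built-in Weibull quantiles. The sketch below assumes illustrative values for the coefficients and uses the quantile function of the standard (minimum) extreme value distribution, log(−log(1 − u)), for the error term.

```r
# Assumed illustrative values
alpha0 <- 2
alpha1 <- 0.5
p <- 1.5
x <- 1
u <- c(0.1, 0.5, 0.9)   # probabilities P(T <= t)

# Quantiles of the standard (minimum) extreme value distribution
eps_u <- log(-log(1 - u))

# Quantiles of log(T) from the AFT form with sigma = 1/p
log_T_aft <- alpha0 + alpha1 * x + (1 / p) * eps_u

# Same quantiles from a Weibull with shape p and scale exp(alpha0 + alpha1 * x)
log_T_weibull <- log(qweibull(u, shape = p, scale = exp(alpha0 + alpha1 * x)))

print(all.equal(log_T_aft, log_T_weibull))
```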
AFT Model
- We again use the remission data.
- We consider the variable TRT as the only predictor (TRT = 1
  and TRT = 0)
AFT Model Assumption
The ratio of time-quantiles is constant (γ) for all fixed values
S(t) = q.
Expression for time-quantiles
- Solve for t in terms of S(t)
- Scale t in terms of predictors
Weibull AFT Model
Recall: $S(t) = \exp(-\lambda t^p)$
1. Solving for t gives:
   $S(t) = \exp(-\lambda t^p) \Leftrightarrow -\log(S(t)) = \lambda t^p \Leftrightarrow t = (-\log(S(t)))^{1/p} \, \dfrac{1}{\lambda^{1/p}}$
2. Reparameterizing $\dfrac{1}{\lambda^{1/p}} = \exp(\alpha_0 + \alpha_1 TRT)$ yields
   $t = (-\log(S(t)))^{1/p} \exp(\alpha_0 + \alpha_1 TRT)$.
   (TRT used to scale time to any fixed value of S(t))
3. In terms of any fixed probability S(t) = q we get
   $t = (-\log(q))^{1/p} \exp(\alpha_0 + \alpha_1 TRT)$.

Weibull AFT Model: Acceleration Factor
The acceleration factor for a fixed value of S(t) = q is calculated
as follows:
Level of covariates: TRT = 1 and TRT = 0
Then the acceleration factor γ is
$\gamma = \gamma(TRT = 1 \text{ vs. } TRT = 0) = \dfrac{(-\log(q))^{1/p} \exp(\alpha_0 + \alpha_1)}{(-\log(q))^{1/p} \exp(\alpha_0)} = \exp(\alpha_1)$.
Note: As in the PH form of the model, this result depends on p
having the same value for TRT = 1 and TRT = 0.
R Code and R Output
> library(survival)
> weibull.aft <- survreg(Surv(Survt, status) ~ TRT, dist = 'weibull')
> summary(weibull.aft)
Call:
survreg(formula = Surv(Survt, status) ~ TRT, dist = "weibull")
Value Std. Error z p
(Intercept) 2.248 0.166 13.55 8.30e-42
TRT 1.267 0.311 4.08 4.51e-05
Log(scale) -0.312 0.147 -2.12 3.43e-02
Scale= 0.732
Weibull distribution
Loglik(model)= -106.6 Loglik(intercept only)= -116.4
Chisq= 19.65 on 1 degrees of freedom, p= 9.3e-06
Number of Newton-Raphson Iterations: 5
n= 42
R Code and R Output: Acceleration Factor
The estimated acceleration factor γ̂ comparing the treatment group

to the placebo group (TRT = 1 vs. TRT = 0) is now:
γ̂ = exp(α̂1 )
> exp(weibull.aft$coefficient[2])
TRT
3.551374
Interpretation
The survival time for the treatment group (TRT = 1) is increased
by a factor of 3.55 compared to the placebo group (TRT = 0)
⇒ Treatment is positive.
Survival functions
Computing time-quantiles, for example the median:
$S(t) = 0.5 \Rightarrow \hat{t}_m = (-\log(0.5))^{1/\hat{p}} \cdot \exp(\hat{\alpha}_0 + \hat{\alpha}_1 TRT)$
Estimated survival times for the median S(t) = 0.5:
> median <- predict(weibull.aft,
+ newdata=list(TRT=c(0,1)),
+ type='quantile', p=0.5)
> median
1 2
7.242697 25.721526
> median[2]/median[1]
2
3.551374
[Figure: Estimated survival function S(t) against t for TRT = 0 and TRT = 1]
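The medians returned by predict() can be reproduced by hand from the time-quantile formula. The sketch below plugs in the rounded coefficients from the summary output (so the results differ slightly in the last digits) and uses the reported scale as 1/p.

```r
# Rounded estimates copied from the Weibull AFT summary output
alpha0 <- 2.248   # (Intercept)
alpha1 <- 1.267   # TRT
sigma  <- 0.732   # Scale = 1/p

q <- 0.5          # median: S(t) = 0.5
trt <- c(0, 1)

# t = (-log(q))^(1/p) * exp(alpha0 + alpha1 * TRT)
t_q <- (-log(q))^sigma * exp(alpha0 + alpha1 * trt)

print(round(t_q, 2))
print(t_q[2] / t_q[1])   # ratio of medians = exp(alpha1)
```

Changing q gives any other time-quantile; the ratio between the two groups stays at exp(α̂1).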
Relation between Weibull AFT and PH coefficients
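- AFT: $\dfrac{1}{\lambda^{1/p}} = \exp(\alpha_0 + \alpha_1 TRT)$
  ⇔ $(1/p) \log(\lambda) = -(\alpha_0 + \alpha_1 TRT)$
  ⇔ $\log(\lambda) = -p(\alpha_0 + \alpha_1 TRT)$
- PH: $\lambda = \exp(\beta_0 + \beta_1 TRT)$
  ⇔ $\log(\lambda) = \beta_0 + \beta_1 TRT$
This indicates the following relationship between the coefficients:
$\beta_j = -\alpha_j p$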
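Applied to the Weibull AFT output shown earlier, this relationship gives the PH coefficients without refitting. The sketch uses the rounded estimates from that output; the resulting hazard ratio is a derived illustration and is not printed in the original slides.

```r
# Rounded estimates copied from the Weibull AFT summary output
alpha <- c(intercept = 2.248, TRT = 1.267)
sigma <- 0.732            # Scale = 1/p

p <- 1 / sigma            # Weibull shape parameter
beta <- -alpha * p        # PH coefficients: beta_j = -alpha_j * p
hr <- exp(beta["TRT"])    # hazard ratio, TRT = 1 vs. TRT = 0

print(round(c(p = p, beta, HR = unname(hr)), 3))
```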
Exponential PH and AFT Model
We obtained $\beta_j = -\alpha_j p$ for the Weibull model. In the special case
of the exponential model where p = 1 we have
$\beta_j = -\alpha_j$.
Remark
The exponential PH and AFT are in fact the same model, except
that the parametrization is different.
Exponential PH and AFT Model
Example:
The estimated values from the exponential example above support
this result.
Coefficient PH Model: $\hat{\beta}_1 = -1.527$
Coefficient AFT Model: $\hat{\alpha}_1 = 1.527$
We also have
$\widehat{HR}(TRT = 1 \text{ vs. } TRT = 0) = \exp(\hat{\beta}_1) = \exp(-\hat{\alpha}_1) = \dfrac{1}{\hat{\gamma}}$.
Property of the Weibull Model
Proposition
AFT assumption holds ⇔ PH assumption holds (given that p is
fixed)
Proof for the considered example (TRT = 1 and TRT = 0):
- [⇒]: $\gamma = \exp(\alpha_1)$
  Assume γ is constant ⇒ $\alpha_1$ is constant
  $HR = \exp(\beta_1) = \exp(-p \alpha_1)$ ⇒ HR is constant
- [⇐]: $HR = \exp(\beta_1)$
  Assume HR is constant ⇒ $\beta_1$ is constant
  $\gamma = \exp(\alpha_1) = \exp(-\beta_1 / p)$ ⇒ γ is constant
Possible Plots
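Possible results for plots of $\log(-\log(\hat{S}(t)))$ against $\log(t)$:
- Parallel straight lines ⇒ Weibull (or exponential if p = 1); PH and
  AFT assumptions hold.
- Parallel but not straight ⇒ not Weibull; PH but not AFT.
- Neither parallel nor straight ⇒ not Weibull, not PH and not AFT.
- Straight but not parallel ⇒ Weibull, not PH and not AFT (p not fixed).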

Hazard Function of Log-Logistic Model
The log-logistic distribution accommodates an AFT model but not
a PH model.
Hazard function is
$h(t) = \dfrac{\lambda p t^{p-1}}{1 + \lambda t^p}$,
with p > 0 and λ > 0.
Shape of hazard function:
- p ≤ 1: hazard decreases
- p > 1: hazard unimodal
[Figure: Hazard function h(t) against t for p ≤ 1 and p > 1]
PO Assumption
Definition
In a proportional odds (PO) survival model, the odds ratio is
constant over time.
- Survival odds: odds of surviving beyond time t
  $\dfrac{S(t)}{1 - S(t)} = \dfrac{P(T > t)}{P(T \le t)}$
- Failure odds: odds of getting the event by time t
  $\dfrac{1 - S(t)}{S(t)} = \dfrac{P(T \le t)}{P(T > t)}$
PO Assumption
The failure odds of the log-logistic survival model are
$\dfrac{1 - S(t)}{S(t)} = \dfrac{\frac{\lambda t^p}{1 + \lambda t^p}}{\frac{1}{1 + \lambda t^p}} = \lambda t^p$.
The failure odds ratio (OR) for two different groups (1 and 2) is
(for p fixed)
$OR(1 \text{ vs. } 2) = \dfrac{(1 - S_1(t)) / S_1(t)}{(1 - S_2(t)) / S_2(t)} = \dfrac{\lambda_1 t^p}{\lambda_2 t^p} = \dfrac{\lambda_1}{\lambda_2}$.
Hence, the log-logistic model is a proportional odds (PO) model.

Graphical Evaluation of Log-Logistic Assumption
The log-failure odds can be written as
$\log\left(\dfrac{1 - S(t)}{S(t)}\right) = \log(\lambda t^p) = \log(\lambda) + p \log(t)$,
which is a linear function of log(t).
Graphical Evaluation of Log-Logistic Assumption

- Plot $\log\left(\dfrac{1 - \hat{S}(t)}{\hat{S}(t)}\right)$ against $\log(t)$ ($\hat{S}$ are the KM-survival
  estimates).
- If the plot is linear with slope p, then the survival time follows
  a log-logistic distribution.
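Both log-logistic properties, a constant failure odds ratio and linear log failure odds, can be illustrated with exact survival functions. The parameter values in this R sketch are assumptions chosen for illustration; with data, the Kaplan-Meier estimates would take the place of the exact S(t).

```r
# Assumed illustrative parameters: common shape p, group-specific lambda
p <- 2
lambda1 <- 0.05
lambda2 <- 0.2

S_ll <- function(t, lambda, p) 1 / (1 + lambda * t^p)   # log-logistic survival

t_grid <- c(1, 2, 5, 10)
odds1 <- (1 - S_ll(t_grid, lambda1, p)) / S_ll(t_grid, lambda1, p)
odds2 <- (1 - S_ll(t_grid, lambda2, p)) / S_ll(t_grid, lambda2, p)

# Failure odds ratio is constant over time and equals lambda1 / lambda2
print(odds1 / odds2)

# Log failure odds are linear in log(t): intercept log(lambda1), slope p
fit <- lm(log(odds1) ~ log(t_grid))
print(coef(fit))
```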
Log-Logistic Example with the Remission Data
We consider a new categorical variable WBCCAT :
- WBCCAT = 2 if logWBC ≥ 2.5 (high count)
- WBCCAT = 1 if logWBC < 2.5 (medium count)
The graphical evaluation of WBCCAT = 2 and WBCCAT = 1:
[Figure: Log Failure Odds vs. Log Time, $\log((1 - \hat{S}(t)) / \hat{S}(t))$ against $\log(t)$ for WBCCAT = 1 and WBCCAT = 2]
⇒ straight lines indicate log-logistic distribution
AFT Log-Logistic Model
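AFT log-logistic model with WBCCAT as only predictor:
We solve
$S(t) = \dfrac{1}{1 + \lambda t^p} = \dfrac{1}{1 + (\lambda^{1/p} t)^p}$
for t and obtain
$t = \left(\dfrac{1}{S(t)} - 1\right)^{1/p} \dfrac{1}{\lambda^{1/p}}$.
We reparameterize the factor on the right as
$\dfrac{1}{\lambda^{1/p}} = \exp(\alpha_0 + \alpha_1 WBCCAT)$.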
AFT Log-Logistic Model: Acceleration Factor
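We get
$t = \left(\dfrac{1}{S(t)} - 1\right)^{1/p} \exp(\alpha_0 + \alpha_1 WBCCAT)$.
For a fixed probability S(t) = q, the expression for t is
$t = \left(q^{-1} - 1\right)^{1/p} \exp(\alpha_0 + \alpha_1 WBCCAT)$.
The acceleration factor γ for S(t) = q is
$\gamma(WBCCAT = 2 \text{ vs. } WBCCAT = 1) = \dfrac{\left(q^{-1} - 1\right)^{1/p} \exp(\alpha_0 + 2\alpha_1)}{\left(q^{-1} - 1\right)^{1/p} \exp(\alpha_0 + 1\alpha_1)} = \exp(\alpha_1)$.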
R Code and R Output
> logistic.aft <- survreg(Surv(Survt, status) ~ WBCCAT,
+ dist='loglogistic', data=remdata)
> summary(logistic.aft)
Call:
survreg(formula = Surv(Survt, status) ~ WBCCAT, data = remdata,
dist = "loglogistic")
Value Std. Error z p
(Intercept) 4.094 0.586 6.98 2.92e-12
WBCCAT -0.987 0.337 -2.93 3.40e-03
Log(scale) -0.564 0.154 -3.67 2.41e-04
Scale= 0.569
Log logistic distribution

Loglik(model)= -111.2 Loglik(intercept only)= -115.4
Chisq= 8.28 on 1 degrees of freedom, p= 0.004
Number of Newton-Raphson Iterations: 4
n= 42
R Code and R Output: Acceleration Factor
The estimated acceleration factor γ̂ comparing WBCCAT = 2
(high count) and WBCCAT = 1 (medium count) is now:
$\hat{\gamma} = \exp(\hat{\alpha}_1)$
> exp(logistic.aft$coefficient[2])
WBCCAT
0.3728214
⇒ $\hat{S}_1(t) = \hat{S}_2(0.37t)$
($\hat{S}_i$ is the survival function for WBCCAT = i, i = 1, 2)
Interpretation
The survival time for the group with high count (WBCCAT = 2) is
"accelerated" by a factor of 0.37 compared to the group with
medium count (WBCCAT = 1) ⇒ a high WBC count is harmful to survival.
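To make the factor 0.37 concrete, the time-quantile formula can be evaluated for both groups. The sketch below uses the rounded coefficients from the log-logistic summary output; the quantile values themselves are a derived illustration and are not printed in the original slides.

```r
# Rounded estimates copied from the log-logistic AFT summary output
alpha0 <- 4.094    # (Intercept)
alpha1 <- -0.987   # WBCCAT
sigma  <- 0.569    # Scale = 1/p

wbccat <- c(1, 2)

# t = (1/q - 1)^(1/p) * exp(alpha0 + alpha1 * WBCCAT)
t_quantile <- function(q) (1 / q - 1)^sigma * exp(alpha0 + alpha1 * wbccat)

med <- t_quantile(0.50)   # medians for WBCCAT = 1 and 2
q25 <- t_quantile(0.25)   # times at which S(t) = 0.25

print(round(rbind(median = med, q25 = q25), 2))
print(c(med[2] / med[1], q25[2] / q25[1]))   # both equal exp(alpha1)
```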
PO Log-Logistic Model
The proportional odds (PO) form of the log-logistic model can be
formulated by reparameterizing λ.
Failure odds:
$\dfrac{1 - S(t)}{S(t)} = \dfrac{\frac{\lambda t^p}{1 + \lambda t^p}}{\frac{1}{1 + \lambda t^p}} = \lambda t^p$.
Reparameterizing λ gives
$\lambda = \exp(\beta_0 + \beta_1 WBCCAT)$.
Hence, the failure odds ratio is
$OR(WBCCAT = 2 \text{ vs. } WBCCAT = 1) = \dfrac{t^p \exp(\beta_0 + 2\beta_1)}{t^p \exp(\beta_0 + 1\beta_1)} = \exp(\beta_1)$.
Comparing AFT and PO Log-Logistic Model
Parameterizations:
- AFT model: $\dfrac{1}{\lambda^{1/p}} = \exp(\alpha_0 + \alpha_1 WBCCAT)$
- PO model: $\lambda = \exp(\beta_0 + \beta_1 WBCCAT)$
Hence, we have the relationship
$\beta_0 = -\alpha_0 p$ and $\beta_1 = -\alpha_1 p$.
Note: If p is fixed this leads to: AFT ⇔ PO
So we can calculate the estimated OR with the coefficients of the
AFT model by:
$\widehat{OR} = \exp(\hat{\beta}_1) = \exp(-\hat{\alpha}_1 \hat{p}) = 5.66$
> alpha1 <- logistic.aft$coefficient[2]
> p <- 1/logistic.aft$scale
> exp(-alpha1*p)
5.662691
Graphical Evaluation
[Figure: Log Failure Odds vs. Log Time, $\log((1 - \hat{S}(t)) / \hat{S}(t))$ against $\log(t)$ for WBCCAT = 1 and WBCCAT = 2]
Graphical Evaluation
Proposition
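1. Straight lines ⇒ Log-logistic
2. Parallel plots and log-logistic ⇒ PO
3. Log-logistic and PO ⇒ AFT
Proof: Consider two groups (1 and 2).
1. $\log(\text{failure odds}) = \log(\lambda) + p \log(t)$
2. Parallel plots ⇒ p the same for both groups
   ⇒ $OR = \dfrac{t^p \lambda_1}{t^p \lambda_2} = \dfrac{\lambda_1}{\lambda_2}$
3. For S(t) = q, the acceleration factor is
   $\gamma = \dfrac{\left(q^{-1} - 1\right)^{1/p} \lambda_1^{-1/p}}{\left(q^{-1} - 1\right)^{1/p} \lambda_2^{-1/p}} = \left(\dfrac{\lambda_1}{\lambda_2}\right)^{-1/p}$,
   which does not depend on q.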
Generalized Gamma Model
The generalized gamma distribution is given by
$f(t) = \dfrac{\frac{p}{a^d}\, t^{d-1} \exp\left(-\left(\frac{t}{a}\right)^p\right)}{\Gamma(d/p)}$,
where
$\Gamma(z) = \int_0^{\infty} s^{z-1} e^{-s}\,ds$
and a > 0, d > 0, p > 0.
- The three parameters allow great flexibility in the distribution's
  shape.
- Weibull and lognormal distributions are special cases of the
  generalized gamma distribution (e.g. setting d = p gives us
  the Weibull distribution).
Lognormal Model
The lognormal distribution is given by
$f(t) = \dfrac{1}{t \sigma \sqrt{2\pi}} \exp\left(-\dfrac{(\log(t) - \mu)^2}{2\sigma^2}\right)$,
where µ and σ are the mean and standard deviation, respectively, of
the variable's natural logarithm (by definition, the variable's
logarithm is normally distributed).
- Shape similar to the log-logistic distribution (and yields similar
  model results)
- Accommodates an AFT model (as the log-logistic), but is not
  a proportional odds model (whereas the log-logistic model is a
  PO model)
Gompertz Model
- PH model but not AFT
- Hazard function (with one predictor (TRT)):
  $h(t) = [\exp(\xi t)] \cdot \exp(\beta_0 + \beta_1 TRT)$
  with parametrically specified baseline hazard $h_0(t) = \exp(\xi t)$
- ξ > 0: hazard increases exponentially with t
- ξ < 0: hazard decreases exponentially with t
- ξ = 0: constant hazard (exponential model)
Modeling the Shape Parameter
Many parametric models contain a shape parameter, which is
usually considered fixed.
Example:
- Weibull model
  Recall: $h(t) = \lambda p t^{p-1}$ where $\lambda = \exp(\beta_0 + \beta_1 TRT)$ and p,
  the shape parameter, unaffected by predictors.
- Alternative Weibull model
  Now: $h(t) = \lambda p t^{p-1}$ where $\lambda = \exp(\beta_0 + \beta_1 TRT)$ and
  $p = \exp(\delta_0 + \delta_1 TRT)$
- If $\delta_1 \ne 0$, the value of p differs by TRT
- Not a PH or AFT model if $\delta_1 \ne 0$, but still a Weibull model
Parametric Likelihood and Censoring
The likelihood function for a parametric model
- is a function of the observed data and the unknown
  parameters of the model.
- is based on the distribution f(t) of the survival time.
- depends on the censoring of the data.
Types of censoring:
◦ Right-censored
◦ Left-censored
◦ Interval-censored
Examples of Censored Subjects
- Right-censored: event not observed during 10 years of follow-up (t > 10)
- Left-censored: event occurred before the 10th year, exact time unknown (t < 10)
- Interval-censored: event occurred between the 8th and the 10th year (8 < t < 10)
Construction of the Likelihood on an Example
Assume a survival time distribution with probability density function

f (t).
Subject   Event Time                       Likelihood Contribution
Barry     t = 2                            $f(2)$
Gary      t > 8 (right-censored)           $\int_8^{\infty} f(t)\,dt$
Harry     t = 6                            $f(6)$
Carrie    t < 2 (left-censored)            $\int_0^2 f(t)\,dt$
Larry     4 < t < 8 (interval-censored)    $\int_4^8 f(t)\,dt$
Construction of the Likelihood on an Example
The likelihood function L is the product of each contribution:
$L = f(2) \cdot \int_8^{\infty} f(t)\,dt \cdot f(6) \cdot \int_0^2 f(t)\,dt \cdot \int_4^8 f(t)\,dt$
Assumptions for formulating L:
- Subjects are independent (product of contributions).
- No competing risks:
  No competing event prohibits a subject from eventually
  getting the event of interest.
  Example: Death
- Follow-up times are continuous without gaps
  (i.e. subjects do not return into study).
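The construction can be written out directly in R. The sketch below assumes an exponential distribution for f(t) purely for illustration (the slides leave f unspecified), builds the product of the five contributions, and maximizes the log-likelihood over the rate parameter.

```r
# Likelihood of the five example subjects, assuming an exponential model
# with unknown rate (the choice of distribution is an assumption)
lik <- function(rate) {
  dexp(2, rate) *                          # Barry: event at t = 2
    pexp(8, rate, lower.tail = FALSE) *    # Gary: right-censored, t > 8
    dexp(6, rate) *                        # Harry: event at t = 6
    pexp(2, rate) *                        # Carrie: left-censored, t < 2
    (pexp(8, rate) - pexp(4, rate))        # Larry: interval-censored, 4 < t < 8
}

print(lik(0.2))   # likelihood at one candidate value of the rate

# Maximum likelihood estimate of the rate by one-dimensional optimization
opt <- optimize(function(r) log(lik(r)), interval = c(0.01, 2), maximum = TRUE)
print(opt$maximum)
```

Censored subjects enter through the distribution function rather than the density, which is why all three censoring types fit into the same product.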
Maximum likelihood Estimates
The likelihood for M subjects is
$L = \prod_{i=1}^{M} L_i$.
The maximum likelihood estimates of the parameters are obtained
by solving the following system of equations
$\dfrac{\partial \log(L)}{\partial \beta_j} = 0, \quad j = 1, 2, \ldots, N$,
where N is the number of parameters $\beta_j$.

Parametric and Cox likelihood
In a parametric model, the parametric likelihood easily handles
right-, left- or interval-censored data.
In a Cox model, the Cox likelihood handles right-censored data, but
is not designed to accommodate left- or interval-censored data
directly.
Example:
- Health check for a nonsymptomatic outcome once every year
- If the event was detected e.g. at the beginning of the third year,
  the exact time when the event occurred was between the
  second and third year
- Fit a parametric model with the distribution of the outcome
  denoted by f(t)
- Each subject's contribution to the likelihood is obtained by
  integrating f(t) over the interval in which the subject had the event.
Frailty
What is frailty?
- Random component
- Accounts for variability due to unobserved individual-level
  factors (unaccounted for by the other predictors)
The frailty α (α > 0)
- is an unobserved multiplicative effect on the hazard
- follows some distribution g(α) with the mean of g(α) equal to
  1 (µ = 1)
- θ = Var(α), the variance of g(α), is a parameter to be estimated from the data
Hazard functions, Survival functions and Frailty
Express an individual's hazard function conditional on the frailty as
$h(t \mid \alpha) = \alpha h(t)$
This leads to:
$S(t \mid \alpha) = \exp\left(-\int_0^t h(u \mid \alpha)\,du\right) = \exp\left(-\int_0^t \alpha h(u)\,du\right) = \left[\exp\left(-\int_0^t h(u)\,du\right)\right]^{\alpha} = S(t)^{\alpha}$
Suppose α > 1. Then we get:
- Increased hazard
- Decreased survival
And vice versa for α < 1.
Survival functions in Frailty Models
Distinguish between
- the individual level or conditional survival function
  $S(t \mid \alpha)$
- and the population level or unconditional survival function
  $S_U(t)$, representing a population average.
Once the frailty distribution g(α) is chosen we find the
unconditional survival function by
$S_U(t) = \int_0^{\infty} S(t \mid \alpha)\, g(\alpha)\,d\alpha$
Then we can find the corresponding unconditional hazard $h_U(t)$
using the known relationship between survival and hazard function
$h_U(t) = \dfrac{-d[S_U(t)]/dt}{S_U(t)}$
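For a gamma frailty with mean 1 and variance θ the integral has a known closed form, $S_U(t) = (1 + \theta H(t))^{-1/\theta}$ with cumulative hazard $H(t) = \int_0^t h(u)\,du$, which gives $h_U(t) = h(t) / (1 + \theta H(t))$. The R sketch below checks the closed form against numerical integration for a Weibull baseline; all parameter values are assumptions chosen for illustration.

```r
# Assumed illustrative values
theta  <- 0.9   # frailty variance
lambda <- 0.1   # Weibull parameter lambda
p      <- 1.5   # Weibull shape: increasing conditional hazard

h <- function(t) lambda * p * t^(p - 1)   # baseline hazard h(t)
H <- function(t) lambda * t^p             # cumulative hazard, S(t) = exp(-H(t))

# S_U(t) = integral over alpha of S(t)^alpha * g(alpha),
# with g the gamma density with mean 1 and variance theta
S_U_numeric <- function(t) sapply(t, function(x)
  integrate(function(a) exp(-a * H(x)) *
              dgamma(a, shape = 1 / theta, scale = theta),
            lower = 0, upper = Inf)$value)

# Closed forms for the gamma frailty
S_U <- function(t) (1 + theta * H(t))^(-1 / theta)
h_U <- function(t) h(t) / (1 + theta * H(t))

t_grid <- c(1, 5, 20)
print(cbind(t = t_grid, S_U_numeric = S_U_numeric(t_grid), S_U = S_U(t_grid),
            h_conditional = h(t_grid), h_U = h_U(t_grid)))
```

In this sketch the conditional hazard keeps increasing while the unconditional hazard first rises and then falls.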
Example: Weibull PH model with and without frailty
Stata offers choices for g(α):
1. Gamma
2. Inverse-Gaussian
With the mean fixed at 1, both distributions are parameterized in
terms of the variance θ and typically yield similar results.
EXAMPLE
To illustrate the use of a frailty model, we apply the data from the
Veteran's Administration Lung Cancer Trial. The exposure of interest
is treatment status TX (standard = 1, test = 2). The control variables
are performance status (PERF), disease duration (DD), AGE, and
prior therapy (PRIORTX). The outcome is time to death (in days).
Vet Lung Cancer Trial
Predictors:
TX (dichotomous: 1 = standard, 2 = test)
PERF (continuous: 0 = worst, 100 = best)
DD (disease duration in months)
AGE (in years)
PRIORTX (dichotomous: 0 = none, 10 = some)
Model 1. No Frailty
$h(t) = \lambda p t^{p-1}$ where
$\lambda = \exp(\beta_0 + \beta_1 TX + \beta_2 PERF + \beta_3 DD + \beta_4 AGE + \beta_5 PRIORTX)$.
Weibull regression (PH form)
Log likelihood = −206.20418
t         Coef.     Std. Err.   z       p >|z|
tx        .137      .181        0.76    0.450
perf      −.034     .005        −6.43   0.000
dd        .003      .007        0.32    0.746
age       −.001     .009        −0.09   0.927
priortx   −.013     .022        −0.57   0.566
cons      −2.758    .742        −3.72   0.000
/ln p     −.018     .065        −0.27   0.786
p         .982      .064
1/p       1.02      .066
The estimate of the hazard ratio comparing TX = 2 vs. TX = 1 is
exp(0.137) = 1.15, controlling for performance status, disease
duration, age, and prior therapy. The estimate for the shape
parameter is 0.982, suggesting a slightly decreasing hazard over time.
EXAMPLE (continued)
Model 2. With Frailty
$h_j(t \mid \alpha_j) = \alpha_j h(t)$, j = 1, 2, ..., n, where
h(t) and λ are as above, $\alpha_j$ denotes the frailty for the j-th subject, and
α ∼ gamma(µ = 1, variance = θ).
Note: The $\alpha_j$ are not estimable (overparameterization), but the
variance of the frailty θ is estimated.
Weibull regression (PH form)
Gamma frailty
Log likelihood = −200.11338
t         Coef.     Std. Err.   z       p >|z|
tx        .105      .291        0.36    0.719
perf      −.061     .012        −5.00   0.000
dd        −.006     .017        −0.44   0.663
age       −.013     .015        −0.87   0.385
priortx   −.006     .035        −0.18   0.859
cons      −2.256    1.100       −2.05   0.040
/ln p     .435      .141        3.09    0.002
/ln the   −.150     .382        −0.39   0.695
p         1.54      .217
1/p       .647      .091
theta     .861      .329
Likelihood ratio test of theta = 0:
chibar2(01) = 12.18
Prob>=chibar2 = 0.000
Model 2 is the same Weibull model as Model 1 except that a frailty
component has been included. The frailty in Model 2 is assumed to
follow a gamma distribution with mean 1 and variance equal to theta (θ).
The estimate of theta is 0.861 (bottom row of output). A variance of
zero (theta = 0) would indicate that the frailty component does not
contribute to the model. A likelihood ratio test for the hypothesis
theta = 0 is shown directly below the parameter estimates and indicates
a chi-square value of 12.18 with 1 degree of freedom, yielding a highly
significant p-value of 0.000 (rounded to 3 decimals).
Notice how all the parameter estimates are altered with the inclusion of
the frailty. The estimate for the shape parameter is now 1.54, quite
different from the estimate 0.982 obtained from Model 1. The inclusion
of frailty not only has an impact on the parameter estimates but also
complicates their interpretation.
Comparing Model 2 with Model 1
For Model 1 we get $\widehat{HR} = \exp(0.137) = 1.15$.
For Model 2 we get $\widehat{HR} = \exp(0.105) = 1.11$.
Remark
In Model 2 the value we obtained is the estimated hazard ratio for
two individuals having the same frailty, one taking the test and the
other taking the standard treatment (and same levels of other
predictors).
Compare the estimated values for the shape parameter p:
- Model 1: $\hat{p} = 0.982$ (→ decreasing hazard)
- Model 2: $\hat{p} = 1.54$ (→ increasing individual level hazard)
BUT: For frailty models one has to distinguish between the
individual level and population level hazard.
- Individual level/conditional hazard is estimated to increase
- Population level/unconditional hazard has a unimodal shape
  (first increasing, then decreasing to 0)
Conditional and Unconditional Hazards in Frailty Models
[Figure: Four increasing individual level hazards h(t); the average hazard decreases from h1 at t1 to h2 at t2 (h2 < h1)]
- Hazard for individuals increases
- Average hazard decreases
Consider four individuals whose hazards increase linearly over time until
the event occurs. The two individuals with the highest hazards failed
between times t1 and t2 and the other two failed after t2. Consequently,
the average hazard (h2) of the two individuals still at risk at t2 is less
than the average hazard (h1) of the four individuals at risk at t1. Thus
the average hazard of the "at risk" population decreased from t1 to t2
(i.e., h2 < h1) because the individuals surviving past t2 were less frail
than the two individuals who failed earlier.
Conditional and Unconditional Hazards in Frailty Models
Estimated unconditional hazard, Model 2 (TX = 1, mean level for
other covariates, $\hat{p} = 1.54$)
[Figure: Estimated hazard function for tx = 1 against analysis time (Weibull regression)]
Population with different levels of frailty
→ "more frail individuals" (α > 1) are more likely to get the event
earlier
→ "at risk group" has an increasing proportion of less frail individuals
(α < 1)
→ decreasing population average hazard $h_U(t)$
→ frailty effect
Frailty Effect
This property, in which the unconditional hazard eventually decreases
over time because the "at risk group" has an increasing proportion of
less frail individuals, is called the frailty effect.
Example
Assume that plotting the Kaplan-Meier log-log survival estimates for
treatment TX = 2 vs. TX = 1 gives curves that start out
parallel but then converge over time.
- Interpretation 1: Effect of treatment weakens over time
  ⇒ PH model not appropriate
- Interpretation 2: Effect of treatment remains constant over
  time. Convergence is caused by an unobserved heterogeneity
  in the population
  ⇒ a PH model with frailty would be appropriate
Summary
- Parametric model: assume a distribution for the survival time
- PH, AFT and PO (examples: Weibull and log-logistic models)
- Parametric likelihood
- Frailty models: additional variability factor for the hazard
- Distinguish between conditional and unconditional hazards in frailty models
Thank you for your attention!
