Therneau Slides For Packet
Terry M. Therneau
Mayo Clinic
Spring 2009
Introduction
SURVIVAL ANALYSIS
Traditional
Extensions
Methods apply to any "time to" endpoint
• duration of unemployment
Introduction
• Multiple toxicities
• Social interactions
Introduction
Methods
Let
Event Rates
My examples are focused on SAS and S-Plus. SAS for the data manipulation,
S for the graphics, and either for the analysis. (These are the packages that I
know).
SAS
• The largest system: usage, revenue, manuals, code
S-Plus
Includes:
• documentation and support
• data management (e.g. import from SAS)
• graphical user interface
• add-on libraries
• Web interface to a central engine
Introduction
R: an open-source implementation of S.
• Base package: very solid, fast, free
• Add-ons: variable
  – survival: same code as S-Plus
  – cutting edge material, rapid growth
  – ...
• Much activity in the university realm
Risk sets
The Kaplan-Meier
The KM estimates a probability at each event time ti
The concept is well known, any code is fully debugged: what could go wrong?
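For reference, a minimal sketch of the computation in S (the survival library; the data set mydata and its variable names are hypothetical):

library(survival)                      # attached automatically in S-Plus
fit <- survfit(Surv(time, status) ~ 1, data = mydata)
summary(fit)    # at each event time, S(t) drops by (deaths)/(number at risk)
plot(fit)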
Risk sets
[Figure: Kaplan-Meier survival curve over years 0–6]
• Between years 5 and 6, 6 deaths were noted via unsolicited follow-up, and
added to the data set.
The correct denominator for a KM at some time t is the number of subjects for whom an event at time t would have been observed and recorded.
[Figure: number of subjects at risk (0–300) over years 0–5]
Cox model
Let ri = e^{Xiβ} be the risk score for each subject i.
The increment to the hazard function for a (hypothetical) subject with risk r0 is r0 d(t)/Σ(those at risk) ri, where d(t) is the number of events at time t.
Risk sets
The risk set for a KM or Cox model at some time t is the set of subjects who
could have a recorded event at time t, no more and no less.
• Multiple events
• Disallowed events
Simple Cox models
The Cox model models the event rate. The model assumes that the risk for
subject i is
λi(t) = λ0(t)eXi(t)β ,
where λ0 is an unspecified baseline hazard. Because the hazard ratio for two
subjects with fixed covariate vectors Xi and Xj
    λi(t)/λj(t) = λ0(t)e^{Xiβ} / (λ0(t)e^{Xjβ}) = e^{Xiβ} / e^{Xjβ}
is constant over time, the model is also known as the proportional hazards
model.
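A concrete fit, as a sketch using R's survival library and its built-in lung data (any data set with a time, a status, and covariates would do):

library(survival)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
summary(fit)    # the exp(coef) column is the hazard ratio e^beta,
                # assumed constant over time under the model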
Simple Cox models
• This avoids the main problem of censored data. Someone who is followed
for 18 months is a part of the rate computations for months 0–18, and not a
part of them thereafter.
Computation
Mathematically, the Cox model ends up as a “lottery” model.
Risk score for each subject = ri = exp(Xiβ).
Say a death occurs at day 174, subject 24. The likelihood term is
    r24 / Σi Yi(174) ri
where Yi(174) = 1 if subject i is:
• still alive
• not excluded
Probability( person who did win the lottery, should have won the lottery, given
the set of people who were participants in the drawing).
Simple Cox models
In a stratified model, a subject in stratum k has hazard λk(t)e^{Xiβ}, rather than a shared baseline λ0.
The computing idea here is not new. Laird and Olivier, JASA 1981, show how
to use a Poisson regression program to fit a Cox model.
Computer Implementation
Cox model program with just one more variable. Replace
time status strata x1, x2, . . .
with
(time1, time2] status strata x1, x2, . . .
• left truncation
Patient 1 has died after 102 days; patient 2 was transplanted at day 21 and
died on day 343. The data file for these two patients would be
Interval Status Transplant Age at Entry Prior Surgery
(0,102] 1 0 41 0
(0,21] 0 0 48 1
(21,343] 1 1 48 1
Note that static covariates such as age are simply repeated for a patient with
multiple lines of data.
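In S the same layout is fit with the three-argument form of Surv. The R survival package happens to ship this Stanford data, already in (start, stop] form, as the data set heart, so a sketch of the fit is:

library(survival)
# each (start, stop] row contributes to the risk sets it overlaps, so
# patient 2 is 'untreated' for days (0,21] and 'treated' thereafter
fit <- coxph(Surv(start, stop, event) ~ transplant + age + surgery,
             data = heart)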
Counting Process Form
• 103 subjects
  – 69 received a transplant
  – 34 did not
  – The data set will have 69*2 + 34 = 172 rows.
    ∗ maybe: 2 subjects were enrolled on the day of their transplant
• (The data in the appendix of K&P will not give the “correct” answers since
age is rounded to the nearest year).
• The data found in the JASA article of Crowley and Hu contains the actual
dates.
• Note that using 0.5 or 0.1 (or any number that does not occur naturally in
the data) instead of 0.9 will give the same result.
Counting Process Form
data temp;
   infile 'data.jasa';
   input id @6 birth_dt mmddyy8. @16 entry_dt mmddyy8.
            @26 tx_dt mmddyy8.
            @37 fu_dt mmddyy8. fustat prior_sx ;
   format birth_dt entry_dt tx_dt fu_dt date7.;

data stanford;
   set temp;
   age  = (entry_dt - birth_dt)/365.25 - 48;
   year = (entry_dt - mdy(10,1,67))/ 365.25;  *time since 1 Oct 67;
   wait = (tx_dt - entry_dt);
   if (id = 38) then wait = wait - .5;  *transplant and death on the same day;
   if (wait = .) then do;               *never transplanted: a single interval;
      rx = 0;
      start = 0;  stop = fu_dt - entry_dt;
      status = fustat;
      output;
      end;
   else do;                             *pre- and post-transplant intervals;
      rx = 0;
      start = 0;  stop = wait;
      status = 0;
      output;
      rx = 1;
      start = wait;  stop = fu_dt - entry_dt;
      status = fustat;
      output;
      end;
proc print;
Counting Process Form
Four steps
λi(t) = λ0(t)eXβ
Here λ0 is the baseline hazard, which we are not attempting to model. But
what is the best time scale t?
• calendar time?
For all but the first the counting process form may be necessary.
Counting Process Form
On this time scale, a patient diagnosed in 1985 and referred in 1987 is not at
risk for death (“observed death in my study”) in the interval 0-2 years.
Time alignment
Without the counting process form all intervals implicitly start at zero. The
older, and unsatisfactory solution is to create a dummy variable d = delay time
from diagnosis to entry, and enter d into the model as a covariate.
   Dx       Referral   Event      Traditional Interval     d     Correct Interval
   9Nov84   10Mar85    1Mar92     (0,2548]               121     (121,2669]
   8Aug82   28Apr85    18Jun90    (0,1877]               994     (994,2871]
   3Jul83   22Jul85    1Mar92     (0,2414]               750     (750,3164]
   28Nov84  1Aug85     12Feb88    (0,925]                246     (246,1171]
Counting Process Form
Age is the appropriate time scale for the model, since it is the strongest determinant of baseline cancer risk.
Counting Process Form
Mortgage defaults
The rate and timing of loan defaults (foreclosure) is of great interest to firms
that “bundle” loans to create an investment. Some time scale issues are:
• Loans do not become a part of the portfolio on day 0 (the local bank may
hold them for a time). This induces left truncation: the loan is not at risk for
“observed failure” until it is acquired. Failures before that point are neither
observable nor of interest.
• Two natural time scales are time since initiation and calendar time.
The baseline hazard for a Cox model is very flexible, and plays a similar role
to blocking effects in linear models — “effects that should be adjusted for, but
are not modeled directly”.
The default rate might vary with time since initiation, e.g., loans held for more
than 10 years represent stable families. Not all these effects can be modeled
or need to be modeled.
The flexible baseline hazard λ0 is a strength of the Cox model — you don’t
have to model everything.
One problem with λ0(t) is that you only get to pick one time scale. Either
time-since-initiation or calendar-time, but not both. If both are important
(likely), then one must be modeled explicitly using time-dependent covariates.
Suggestion: model the simpler one.
Another problem is that time cannot be extrapolated, the Cox model can only
predict within the range of λ0(t) for the data at hand.
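A sketch of what an age-as-time-scale fit looks like (data set and variable names hypothetical):

# subjects enter the risk set at their age of entry, which handles
# the left truncation automatically
fit <- coxph(Surv(age.entry, age.exit, status) ~ calendar.yr + trt,
             data = mydata)
# the second time scale (calendar year) enters as an explicit covariate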
Counting Process Form
(I prefer the S-Plus behavior, since such intervals usually mean that I have
made a coding error).
The most common problem with this convention is a death on day 0.
Suggestion: have the subject die on day 0.5 (or .1 or .7 or . . . ).
• The actual value of the number will not affect the model results, as long as the value is before the first actual death or censoring event.
• The value will affect the survival curve based on the Cox model (though it requires good eyesight to see it), as well as the tests for proportional hazards.
Counting Process Form
Summary
• The created data set can be carefully examined before the fit
  – Avoid errors
  – Assurance of exactly which model is being fit
Counting Process Form
• Some patients on the active arm were expected to have, and did have, an adverse reaction
  – small numbers (4–5%)
  – early onset
  – cross-over to placebo
  – no such reactions on placebo
  – variable weeks on rx added to form
SAS code
• Due to round off, about 1/2 of the treatment patients are ‘crossed over’ to
placebo 1–3 days before death or last follow-up
Four steps
Computation
The Cox likelihood is a series of terms, one for each event:
    re / Σ(those at risk) rj
where rj is the (modeled) risk for subject j and re is the risk for the person who actually had the event.
Pr(the person who won the drawing should have won the drawing)
• At each event time t, rj (t) depends only on the jth person’s covariates at
that time.
• Past history can be included, as well as the present, with suitable variables
Warnings
Responders vs non-responders
This particular incorrect analysis is re-discovered every few years in oncology.
(Or I should say re-published).
• Anyone who dies before week 4 is a non-responder, as are most of those who die before week 8.
Simple simulation:
• At each visit, ask the patient if they filled up their car with gas that day.
[Figure: Kaplan-Meier curves for "Pragmatists" versus "Others" from the simulation; survival 0.0–1.0 over months 0–30]
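The figure can be reproduced in a few lines; a sketch in R, where survival is generated with no group effect at all, and "pragmatist" status can only be earned at a visit the subject survived to attend:

library(survival)
set.seed(17)
n      <- 100
futime <- rexp(n, rate = 1/12)            # months of survival; one common rate
visit  <- sample(1:6, n, replace = TRUE)  # the visit where gas might be bought
# only subjects alive at the visit can answer 'yes'
prag <- (futime > visit) & (runif(n) < 0.5)
fit  <- survfit(Surv(futime, rep(1, n)) ~ prag)
plot(fit, lty = 1:2)   # the 'pragmatists' appear to survive longer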
Counting Process Form
Prophetic variables
Some time-dependent covariates are not predictors of an event as much as
they are markers of a failure-in-progress:
• Multiple-organ failure
• Ventilation
• Medication changes
? Cessation of diuretics in heart failure
So what?
Counting Process Form
Basic rule
At any time point, the covariates can be anything that you want, as long as
they use only information that would have been available on that day, had
analysis been done then.
• The pathologist asks "did I miss something", and reviews the slides of those subjects.
  – This is good clinical practice
• If something is spotted, you cannot go back and change the diagnosis group. Results will be badly biased.
Lagged variables
Often it is preferable not to incorporate a result found on day x immediately, but rather to lag it to appear at x + lag.
• Multi-day visits
  – If a patient has a multi-day appointment (common at Mayo), the dates on the electronic record may show a > b > c when actually they are concordant.
  – Solution: we often enter time-dependent lab test variables in the model at day + 3 or day + 7.
  – Model checking: don't trust anything that happens < 1 week before an event.
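The lag itself is a one-line change when the (start, stop] data set is built; a sketch with hypothetical variables id, labday, and bili in a long lab-results data frame:

lab$useday <- lab$labday + 7   # the result becomes 'known' one week later
# build the time-dependent intervals from useday rather than labday, so an
# event on day d can never be explained by a lab value drawn on day d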
Counting Process Form
• Treatment delays
  – UDCA is a drug used in liver disease that works by being incorporated into the bile reclamation loop.
  – It requires 4–6 months to reach saturation levels.
  – In a randomized trial of UDCA vs placebo, the end of the study contained an open label period.
  – On the day of cross-over, prior placebo patients are not yet getting benefit — should I count them as treated?
  – The treatment group variable was changed 3 months after the start of the drug.
Counting Process Form
Insidious look-ahead
Smoothed continuous variables
• It is tempting to use 160 (1/3 of the way between 150 and 180).
• Bad idea
Persistence
• If the spike is still present at the 1 year evaluation, this is a bad thing. (It means that the 'solitary' lesion likely was not solitary.)
• We know that the spike would have been present at 12 months, if the test
had been done then.
Multiple Events
Multiple Events
Methods
• Marginal models
Multiple Events
Answers
• Marginal models are simple to use, interpretable, and flexible. They can be
fit with standard software.
Multiple Events
Risk sets
• Parallel
? unordered events
? multiple toxicities for a single subject
? correlated family members
• Sequential
? ordered events
? repeated infections in immunocompromised children
? treatment - response - relapse
? any events that are labeled as 1, 2, 3, . . .
Multiple Events
When events are of different types, one should have different baselines.
Are first and second heart attacks the same type of event?
Multiple Events
Robust variance
The robust variance is based on a "sandwich" estimate
    V = I⁻¹ B I⁻¹
where
• B is a correction factor.
Multiple Events
1. The proper variance when a likelihood for distribution f is fit, but the data
come from g.
Conceptual derivation
• Correlated data
=⇒ grouped jackknife
Multiple Events
Correlated Data
• Two treatments.
The 2n × p matrix D of dfbeta residuals is collapsed to an n × p matrix D̃ by adding together the rows that belong to the same subject:
    D(2n × p)  ⟹  D̃(n × p),    d̃ij = the sum of dij over subject i's observations
D̃'D̃ is a grouped jackknife estimate of variance.
Multiple Events
• For a Cox model, this is the correlated data variance estimate of Wei, Lin
and Weissfeld.
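In S this estimate is requested by adding a cluster term to the model; a sketch (data set and variable names hypothetical):

fit <- coxph(Surv(time, status) ~ trt + cluster(id), data = mydata)
summary(fit)   # shows both se(coef) from the information matrix and the
               # 'robust se' from the grouped jackknife D'D estimate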
Multiple Events
λ(t) = λ0(t)eXβ+Zb
b ∼ G(0, Σ)
Marginal or RE?
• Marginal
  – Available in multiple packages
  – Works in all cases (jackknife connection)
  – Only the working independence variance is available
• Random Effects
  – Can accommodate complex variance relationships
  – Estimates of Σ and/or b may be of interest
  – Assumption that you have the right model
  – Young software
  – Debate over the form of the random effect
All of the arguments between the ‘GEE’ and ‘Mixed’ camps are available.
Multiple Events
S Example
SAS
No random effects
Multiple Events
Random effects: ~ 1 | id
          id
Variance: 7.428457
Multiple Events
Unordered events
• Each observation is entered into the data set as a single line, no differently
than it would be if the correlated observations were not present.
• Time scale questions are just as for any single event model (time from
diagnosis versus time from enrollment versus . . . ).
• If multiple events are of different types, then each event type may be placed
into a separate strata.
Multiple Events
• Correlated data from families, where each subject has “ordinary” single
event survival data.
The data below is for the 197 subjects who form a high risk subset.
Multiple Events
• When eye X fails, the appropriate risk set is all eyes that have not yet failed.
• Hazard for the left eye is independent of the status of the right eye, given all
appropriate covariates and the right model. Since we never have this, the
observations will be correlated.
Multiple Events
Combine all of the data into a single data set with 2n rows.
n= 394
coef exp(coef) se(coef) robust se z p
adult 0.0539 1.055 0.162 0.179 0.302 .76
trt -0.7789 0.459 0.169 0.149 -5.245 <.0001
SAS output
id id;
-----------------------------------------------------------
             Hazard
Variable      Ratio
TRT           0.459
ADULT         1.055
Multiple Events
Random effect
> coxme(Surv(time, status) ~ adult + trt, random= ~1|id, diabetes)
Random effects: ~ 1 | id
          id
Variance: 0.7979162
Multiple Events
Random effects
• Conditional vs marginal
  – The estimated treatment effect is larger; this is expected
  – E[exp(Xβ)] > exp(E[Xβ])
  – The per-subject effect is larger than the population effect
  – The lower limit of the confidence interval is about the same
• Testing
  – (33.7 on 3 df) − (22.5 on 2 df) = 11.2 on 1 df
Multiple Events
• A subject who is still at risk is assumed to be at risk for all of the event
types, e.g., one can die without first having progression of disease.
• If there are k event types, then each subject will have k observations in the
data set. (Often, k − 1 of them will be censored).
• Usually the model will be stratified by event type. That is, we do not
assume that cardiac death, stroke death, infectious death, . . . , have the
same baseline hazard rate.
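A sketch of constructing the stacked data set in S, assuming a one-row-per-subject data frame base with variables time, status, and an event-type code etype (1, ..., k):

k <- 3
n <- nrow(base)
stacked <- base[rep(1:n, k), ]              # k copies of every subject
stacked$outcome <- rep(1:k, each = n)       # which endpoint this row is for
stacked$status  <- ifelse(stacked$etype == stacked$outcome, stacked$status, 0)
fit <- coxph(Surv(time, status) ~ age + hgb + strata(outcome), data = stacked)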
Multiple Events Competing risks
Kyle (1993) studied all 241 cases of MGUS identified at the Mayo Clinic before
1/1/1971.
                      Alive   Dead
multiple myeloma          2     37
amyloidosis               0      8
macroglobulinemia         1      6
lymphoproliferative       2      3
none                     52    130
                         57    184
Multiple Events Competing risks
Covariates are
• hemoglobin (6.8–16.6)
We use two data sets; the first data set mgus has one observation per subject
and corresponds to the usual “time to first outcome” analysis. The second
data set mgus2 allows for a competing risks analysis. It contains one stratum
for each outcome type, with all 241 subjects appearing in each stratum. For
simplicity, collapse into “death,” “multiple myeloma,” and “other.”
The first two subjects in the study experience death without progression at
760 days and lymphoproliferative disease at 2,160 days, respectively.
If one assumes common coefficients, then the competing risks model has
identical results to a time to first event model (other than approximations for
ties).
Multiple Events Competing risks
Random Effects
A random effects model is not feasible for this data.
• There are n random coefficients bi, we need more than one event per b to
do estimation.
• In theory the model is identifiable with 1 obs/subject, but this is one case
where “as n goes to infinity” really means bigger than any n I’ll ever see.
Multiple Events Competing risks
The advantage of the larger data set is that it allows for easy estimation of
within-event-type coefficients.
Fitting separate models for each endpoint is the same as assuming all
possible covariate * event type interactions.
Is the effect of age identical for both outcomes, while controlling for a common effect of hemoglobin?
The median age of the study subjects is 64 years, and it is not surprising that
age is a significant predictor of the overall death rate.
Age is, however, of almost no importance in predicting the likelihood of a
plasma cell malignancy. (About 1%/year convert, regardless of age.)
Multiple Events Competing risks
UDCA is a natural bile acid normally found in humans in minimal amounts, but which naturally occurs in bears. When given orally it becomes incorporated into the recycled pool of bile salts. The potential benefit of UDCA was thought to result from its hydrophilic properties, i.e., it is less hepatotoxic when retained in the liver than the more hydrophobic bile acids. Accordingly, several randomized trials have now been undertaken to assess its efficacy. Mayo Clinic coordinated a randomized double-blind trial of the drug starting in 1988; one hundred eighty patients were enrolled.
89 UDCA
91 Placebo
3 centers
Endpoints
Placebo UDCA
Death 8 5
Transplant 8 7
Withdrawal 18 11
Drug toxicity 0 0
Histological progression 11 8
Development of varices 17 8
Development of ascites 4 1
Development of encephalopathy 1 3
Worsening of symptoms 9 6
Doubling of bilirubin 15 2
Multiple Events Parallel events
Total events
A decision was made between the study design and first analysis to not use
voluntary withdrawal or doubling of bilirubin as endpoints, the first due to
possible bias, the second due to laboratory variation.
Number of events per subject
            0    1    2    3    4
UDCA       63   17    6    3    0
Placebo    50   28   11    0    2

           First    Total
           Events   Events   Gain
UDCA         26       38      12
Placebo      41       58      17
[Figure: survival curves, 0.0–1.0 over years 0–5]
Multiple Events Parallel events
• Multiple events
  – Parallel events. It so happens that no one had 2 events of the same type, thus there is no ordering. At any given moment that a subject is at risk, any event could happen.
  – Separate strata per event type, as the baseline rates for each event are quite different. Also, some events cannot happen twice (worsening of symptoms), so a subject may cease to be at risk for one event but not for another.
First event
• Adverse events were all assessed at evaluation, with the exception of death
or transplant.
[Figure: dot plot of failure times (years 0–4) by subject]
Multiple Events Parallel events
• 929 patients
  – 315 Observation
  – 310 Levamisole
  – 304 Levamisole + 5FU
• Up to 9 years of follow-up
Multiple Events Parallel events
Setup
Parallel (unordered) events. At any given moment when a subject is still at
risk, either event could happen.
Separate strata for the two event types. There is no guarantee that they have the same rate, or the same shape of baseline rate.
Data set will have 929 ∗ 2 = 1858 observations. The first 929 are the data set
one would have built for a “time to death, ignoring progression” analysis, the
next 929 similar for progression.
Multiple Events Parallel events
Time to death
Marginal fit
------
data temp; set colon;
rx_lev = 1*(rx='Lev');
rx_lev5= 1*(rx='Lev+5FU');
Comparison
Interactions
One important issue to consider is strata by covariate interaction. Do we wish
to assume that the effect of “> 4 nodes” is the same for the death and
progression risks? Assume that node4 should be different
• It is equivalent to fitting all of the data at once, followed by V = D̃'D̃.
• The latter is easier to program, and it allows for different codings of the interactions (in particular, leaving them out).
It also allows the event data sets to be of different sizes, e.g., if one of the outcomes were missing for some subjects.
Ordered multiple events
Ordered outcomes
Andersen-Gill model
1. "An event is an event is an event"
2. A subject is not at risk for a third event until they have had a second event.
3. The model may have strata, but they have no relation to the event history, e.g., stratify by gender.
4. The counting process form will be needed unless we use time from last event.
Both "time since entry" and "time since last event" are commonly used time scales.
WLW
Treats an ordered data set as though it were an unordered, competing risks
problem.
2. Within strata 3, the data is “what I would have if the data recorder ignored
all information except event type 3”.
All patients appear in all strata.
3. In their paper WLW include all the strata*covariate interactions. This is not
necessary.
Strata
Assume a subject with events at t1, t2, t3 and no further follow up.
Ordered multiple events
Model                     Interval        Strata
AG                        (0, t1]           1
                          (t1, t2]          1
                          (t2, t3]          1
WLW                       (0, t1]           1
                          (0, t2]           2
                          (0, t3]           3
conditional               (0, t1]           1
                          (t1, t2]          2
                          (t2, t3]          3
conditional (gap time)    (0, t1]           1
                          (0, t2 − t1]      2
                          (0, t3 − t2]      3
Ordered multiple events
Risk sets
Subject Smith has experienced his second event on day 32. Who is at risk?
[Diagram: follow-up lines for several subjects over time 0–4, with prior events marked ✕ and the current events marked ●]
Ordered multiple events
• Andersen-Gill: 4
• WLW:
  – Strata 1: 1
  – Strata 2: 3
• Conditional:
  – Strata 1: 1
  – Strata 2: 2
Ordered multiple events
Data 1: Semi-Markov
[Diagram: the event histories of two example subjects over time 0–8, with the corresponding (start, stop, status, strata) data rows under each of the two data setups]
Ordered multiple events
                 Data 1                  Data 2
Stratified       PWP                     WLW
                 (conditional)           (competing risks)
Not stratified   Andersen-Gill           Lee, Wei, Amato
                                         (independent increments)
Risk sets
For any setup make sure that the risk sets are correct.
• For each strata separately: if at some time t the subject is at risk, then they are in the risk set.
  – "At risk": if the subject were to have an event at t, my analysis would count it as an event in this strata.
  – The WLW setup breaks this rule
• For any time t and any given strata, there is at most one copy of any subject.
Ordered multiple events
• We purposely chose the hidden covariate to have a larger effect than treatment.
The AG model treats the data as a Poisson process, which this data is.
Therefore, with all the covariates in the model the AG setup should correctly
estimate both the coefficient and the variance. More importantly, what
happens if the covariate is not included?
The conditional model is also correct if all of the covariates are included.
                                  coef    model var   robust var
AG,   without covariate     β2   −0.92     0.066        0.084
AG,   with covariate        β2   −0.93     0.066        0.066
                            β1    1.05     0.056        0.056
Cond, without covariate     β2   −0.67     0.070        0.068
Cond, with covariate        β2   −0.91     0.073        0.069
                            β1    1.03     0.065        0.064
WLW,  without covariate     β2   −1.23     0.066        0.113
WLW,  with covariate        β2   −1.60     0.069        0.117
                            β1    1.82     0.063        0.113
Ordered multiple events
AG model
• The coefficients and variances are correct when the variate is included.
• The robust variance estimate D̃'D̃ agrees with the information matrix variance I⁻¹ in this case.
• When the covariate is included, the model works well and is just slightly
less efficient than the “main effects” or AG model.
• When important covariates are missing the model fails badly. Why?
Long experience has shown that in non-randomized trials, the inclusion of all
appropriate covariates is essential (and perhaps impossible).
Ordered multiple events
WLW approach
The WLW model overestimates β2, and inclusion of the hidden covariate
makes the fit even worse.
For this particular setup we can work out the details in closed form.
If the true hazard ratio of treatment/control is λ1/λ2, then the WLW data set for event k has an initial ratio of (λ1/λ2)^k, which slowly dies down to an asymptotic value of λ1/λ2.
Ordered multiple events
The conditional model only has 6 treated subjects remaining in strata 5, none
of which had a further event during the time period. Thus, the infinite hazard
ratio is simply an unstable estimate due to small sample size. Strata 6 and 7
have no treated subjects at all so the hazard ratio cannot be estimated.
Ordered multiple events
Test:
• Is it a simulation study?
1. Yes: Read the paper or ask the author.
2. No: Reject H0.
Goals
The simulation shows 4 things:
1. The prior results were based on a single realization — we could have been
unlucky. This shows that the conclusions are true in general.
3. The conditional model has a substantial bias if there are any “hidden”
covariates.
• We can think of the hidden covariate as a latent trait, i.e. a random effect.
• Perhaps a random effects model will repair the damage?
• Do we need to have the right distribution for b?
[Figure: boxplots of the estimated treatment coefficient (0–3) for the AG +x, Co +x, and First +x models]
Ordered multiple events
• The simple Cox model is more variable: there is an average of approx 1.6
events/subject.
Ordered multiple events
[Figure: boxplots of the estimated treatment coefficient (0–3) for the AG +x, AG, Co, and First models]
Ordered multiple events
[Figure: boxplots of the estimated treatment coefficient (0–3) for the AG +x, AG, AG +f, and AG +f3 models]
Ordered multiple events
[Figure: boxplots of the estimated treatment coefficient (0–3) for the Co +x, Co, Co +f, and Co +f3 models]
Ordered multiple events
[Figure: density estimates (0–4) of the treatment coefficient for the AG and Co models]
θ           Andersen–Gill   Conditional
0–.1               9            921
.1–.2            106             14
.2–.25           125              5
.25–.3           190              7
.3–.35           197             10
.35–.4           152              5
.4–.5            160              8
.5–.6             42             11
.6+                9             18
Table 1: Distribution of the estimated frailty variance θ in 1,000 simulations
Ordered multiple events
Number of Recurrences
0 1 2 3 4
Number of Subjects 39 18 7 8 14
Follow-up after last event 38 17 5 6 12
One subject has no events and 0 months of follow-up. Since he adds nothing
to β̂, the observation has been excluded from the rest of the analysis.
Ordered multiple events
5 0 6 1 1 4 1 1
5 6 10 0 1 4 1 2
6 0 14 0 1 1 1 1
7 0 18 0 1 1 1 1
8 0 5 1 1 1 3 1
8 5 18 0 1 1 3 2
9 0 12 1 1 1 1 1
9 12 16 1 1 1 1 2
9 16 18 0 1 1 1 3
Ordered multiple events
Input data
5 6 1 1 4 1 1
5 10 0 1 4 1 2
5 10 0 1 4 1 3
5 10 0 1 4 1 4
9 12 1 1 1 1 1
9 16 1 1 1 1 2
9 18 0 1 1 1 3
9 18 0 1 1 1 4
Ordered multiple events
*
* Read in the bladder data set
* drop the one useless subject who has no follow-up time
* ;
data temp;
infile 'data.bladder' missover;
retain id 0;
data save.bladder1;
set temp;
drop futime r1-r4;
time1 =0;
enum =0;
proc print;
var id futime status rx rx1 rx2 rx3 rx4 size number enum;
Ordered multiple events
One “feature” of the data set can be seen by doing a fit with enum as a factor.
data temp;
set bladder1;
enum2 = 1*(enum=2);
enum3 = 1*(enum=3);
enum4 = 1*(enum=4);
enum5 = 1*(enum=5);
----------------------------------
Ordered multiple events
Enum
1 2 3 4 5
Number of Subjects 85 46 27 20 12
Number of events 47 29 22 14 0
A problem line:
• I used the data in examples for 3+ years before noticing the problem.
  – and then only when I fit the above model (one I don't recommend – I was just experimenting),
  – and no, I had not read the (start, stop] data set as carefully as I should have.
• Because only the first four recurrences are recorded, patients who have
had a fourth recurrence become immortal in the eyes of the model.
Presumably these patients are at high risk; any follow-up after the fourth
event will bias β̂ back toward zero.
• The 12 observations with enum=5 should be removed from the data set.
Ordered multiple events
• SAS 8 manual says “. . . the fifth observation is not needed for the WLW
model, but it is indispensable to the AG analysis.”
This can be particularly important for structural issues in the data set, such as
the above constraint on ≤ 4 events.
Ordered multiple events
• bladder1: The counting-process style data set, using all patients (190 obs).
Andersen-Gill models
> coxph(Surv(time1, time2, status) ~ rx + size + number,
        data=bladder1)
As predicted earlier, we see that inclusion of the strata 5 “data” biases the
coefficients toward zero. (Data set bladder1 will not be used again.)
Ordered multiple events
The coefficient ordering seen in the hidden covariate data set is repeated here, although not nearly as strongly as we found there.
Random effects: ~ 1 | id
          id
Variance: 0.9930668
Ordered multiple events
• This comparison isn’t quite fair since β̂ differs between the models. (But is
a good first approximation).
Ordered multiple events
for (i in 2:4) {
    temp1 <- coxph(Surv(futime, status) ~ rx + size + number,
                   bladder2, subset=(enum==i))
    temp2 <- cox.zph(temp1, transform='identity')
    lines(smooth.spline(temp2$x, temp2$y[,1], df=4), col=i)
}
[Figure: smoothed curves of Beta(t) for rx versus time (0–30), one per event number 1–4]
Ordered multiple events
WLW fit each of the 4 strata separately, then combine the 4 robust variance
matrices to get an overall variance. It is easy to verify that we can get the
same estimates and variance.
> round(fit$var[1:4,1:4], 3)
1 2 3 4
1 0.095 0.060 0.057 0.044
2 0.060 0.132 0.130 0.116
3 0.057 0.130 0.172 0.159
4 0.044 0.116 0.159 0.240
It is far easier, and I believe more correct, to simply fit a single treatment effect:
I think not. In most data sets we cannot afford to spend such a large number
of degrees of freedom. The use of all interactions in the WLW paper is partly
(mostly?) an artifact of the computational method chosen.
Ordered multiple events
In the WLW model the treatment effect grows slightly stronger over time, in the
conditional one it changes sign!
Ordered multiple events
SAS fit
proc phreg data=save.bladder2 covsandwich(aggregate);
model futime * status(0) = rx size number
/ ties=efron;
strata enum;
id id;
------------------
The PHREG Procedure
Model Information
Percent
Stratum ENUM Total Event Censored Censored
1 1 85 47 38 44.71
2 2 85 29 56 65.88
3 3 85 22 63 74.12
4 4 85 14 71 83.53
--------------------------------------------------------------
Total 340 112 228 67.06
Ordered multiple events
Why do the parameter estimates agree with S-Plus, but the SEs differ slightly?
• SAS does not compute the dfbeta residuals correctly for an Efron
approximation.
• This can be verified by computing dfbetas “by hand” for a few data points.
• Educated guess: dfbetas involve the cumulative hazard Λ̂(t), and SAS is
using the Breslow estimate of that quantity. (It is easier to compute).
• Practically, the difference is trivial (in all examples I have looked at).
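The "by hand" check is short in S; a sketch using the bladder2 data set as built above:

fit <- coxph(Surv(futime, status) ~ rx + size + number + strata(enum) +
             cluster(id), data = bladder2, method = "efron")
D <- residuals(fit, type = "dfbeta", collapse = bladder2$id)
t(D) %*% D    # the robust variance; compare with the printed robust se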
Ordered multiple events
CGD data
A clinical trial of gamma interferon versus placebo in children with CGD, a
genetic defect that leads to multiple recurrent infections. The data set is found
in Fleming and Harrington.
1 204 0 219 1 1 1 12
1 204 219 373 1 2 1 12
1 204 373 414 0 3 1 12
2 204 0 8 1 1 0 15
2 204 8 26 1 2 0 15
2 204 26 152 1 3 0 15
2 204 152 241 1 4 0 15
2 204 241 249 1 5 0 15
2 204 249 322 1 6 0 15
2 204 322 350 1 7 0 15
2 204 350 439 0 8 0 15
3 204 0 382 0 1 1 19
4 204 0 388 0 1 1 12
Ordered multiple events
data gamma;
infile '../../../data/cgd.data' missover;
input id 1-3 center 5-7 +1 random mmddyy6. rx 2. sex 2. age 3.
height 6.1 weight 6.1 inherit 2. steroids 2. propylac 2.
hos_cat 2. futime 4. (event1-event7) (7*4.);
tstart = 0;
enum = 1;
.
.
.
proc print;
#Turn the randomization date into a date object (as it should be)
# I have to insert "/" marks, since the timeDate function isn't
# smart enough to do without them
temp <- cgd0$rand.dt
tempm<- floor(temp/10000)
tempd<- floor(temp/100)%%100
tempy<- 1900 + temp%%100
cgd0$rand.dt <- timeDate(charvec=paste(tempm, tempd, tempy, sep=’/’),
format="%d%b%C")
Ordered multiple events
#Find the max event time for each subject, setting it to 0 for
# those with no events
n <- nrow(cgd0)
etemp <- as.matrix(cgd0[,14:20])
maxtime <- apply(cbind(0,etemp), 1, max, na.rm=T)
#
# Create the WLW style data set
# everyone gets 7 rows
#
tstop <- ifelse(is.na(etemp), cgd0$futime, etemp)
tstat <- ifelse(is.na(etemp), 0, 1)
cgd2 <- data.frame(cgd0[rep(1:n,7), 1:12],
time = c(tstop),
status=c(tstat),
enum = rep(1:7, rep(n,7))
)
1 204 219 1 1 1 12
1 204 373 1 2 1 12
1 204 414 0 3 1 12
1 204 414 0 4 1 12
1 204 414 0 5 1 12
1 204 414 0 6 1 12
1 204 414 0 7 1 12
2 204 8 1 1 0 15
2 204 26 1 2 0 15
2 204 152 1 3 0 15
2 204 241 1 4 0 15
2 204 249 1 5 0 15
2 204 322 1 6 0 15
2 204 350 1 7 0 15
Treatment only
cfit1 <- coxph(Surv(tstop, status) ~ rx + cluster(id),
               data=cgd1, subset=(enum==1))
cfita <- coxph(Surv(tstart, tstop, status) ~ rx + cluster(id),
               data =cgd1)
cfitc <- coxph(Surv(tstart, tstop, status) ~ rx + cluster(id)
               + strata(enum), data=cgd1)
cfitc2<- coxph(Surv(tstop-tstart, status) ~ rx + cluster(id)
               + strata(enum), data=cgd1)
cfitw <- coxph(Surv(time, status) ~ rx + cluster(id)
               + strata(enum), data=cgd2)
β se(β) robust se
Time to first event -1.09 0.34 0.34
A-G -1.10 0.26 0.31
conditional -0.86 0.28 0.29
cond (gap time) -0.88 0.28 0.28
WLW -1.34 0.27 0.36
Table 2: Fits for the CGD data
Ordered multiple events
The most important factors, other than treatment, are age, steroids and
enrollment center. Inheritance was a stratification factor.
...
Ordered multiple events
And here are those for models that stratify on center (coxph can have two
strata terms), and drop inheritance.
rx age steroids
Time to first event -1.22 -0.02 -0.99
A-G -1.26 -0.02 -1.05
conditional -1.20 -0.03 -1.07
WLW -1.59 -0.02 -1.25
Ordered multiple events
• The conditional and A-G models grow more similar, with the larger change in the conditional fit.
Frailty models
These use a Gamma random effects term.
Random effects: ~ 1 | id
          id
Variance: 0.63929
Random effects: ~ 1 | id
          id
Variance: 0.5804364
The random effects term seems to have "reconstructed" the hidden covariates.
Ordered multiple events
The predicted hazard curves based on the gap time model are interesting.
The hazards for all but the first strata seem to be quite similar. Subjects seem
to have a longer ‘wait’ to first event.
[Figure: predicted cumulative hazard (0–1.2) from the gap time model: Event 1, Events 2-3, Events 4-7]
Suggestion: we can use the first, and test for proportional hazards.
Ordered multiple events
Jail
Shelter
Street
Multi-state models
Assume a subject “Jones” enters our study on day 29 at a shelter, moves out
of the shelter on day 40, reenters the shelter on day 58, is jailed on day 60
and lost to our followup on day 80. The data set would be
One important part of the setup is deciding which variables should have strata
* covariate interactions, and which should not.
• The effect of age might well be different for shelter to street and shelter to
jail transitions.
• One may be forced to assume fewer interactions, to limit the total number of
coefficients in the model.
Multi-state models
• the waxing and waning nature of Crohn’s disease can make it difficult to
describe long-term outcomes
Multi-state models
The study consists of the cohort of 174 Olmsted County, Minnesota, residents
with a diagnosis of Crohn’s disease from January 1, 1970, through December
31, 1993.
Over their time course, subjects were classified into one of 8 possible states
[Diagram: states Entry, Remission, Mild, Treat 1, Treat 2, Treat 3, Surgery, S-Rem (surgical remission), and Dead, with the allowed transitions]
Multi-state models
The starting data set has one observation for each transition, containing the
prior state, current state, next state, time to transition, and covariates.
Consider a subject who enters on 1Mar78 into the mild state, treatment
escalates on 28Mar78, remission is achieved on 16May78, and the patient is
then followed forward to 28Sept78. The data would look like
If the final observation does not end in a transition, e.g., a subject enters the
‘mild’ state midway through observation and is still in it at the last follow-up,
then the nstate variable is equal to missing on the last line.
Internally, the state codes for the data set are 0=Remission, 1–4=Treatment,
5=Surgery, 6=Surgical remission and 7=Dead.
• The starting data set has 1314 transitions (not counting “Entry →”).
• The analysis data set will have 7884 observations in 42 different strata.
else do;
do i = 1 to 7;
if (nstate =i) then status =1;
else status =0;
strat = current*10 + i;
if (i ne current) then output;
end;
end;
Multi-state models
Initial analysis
Question: is the process Markov?
That is, do the outcome and risk for a subject depend only on the current
state?
• Clearly, the prior state has an influence on the rate with which we leave the
current state.
• But notice that the coefficients, which are all contrasts with the prior state of
“New patient”, are all about the same size.
• Most subjects go from the surgery state to surgical remission (134), but a
few directly to one of the 4 treatment states (36). The duration of these
treatment states is longer than usual. (Surgery is the definitive treatment.)
• The analysis so far is only for “time in state”, not distinguishing states
• We first note that, almost by definition, nothing can affect time in the
surgical state.
Multi-state models
[Figure: dot plot of duration in each state, days on a log scale (10 to 1000)]
A second fit (not shown) shows that the time is also not significantly related to
which state preceded surgery.
The outlier in the surgery time period (361 days) is at first worrisome, but it
actually does not have a major effect on the estimates, helped by the fact that
a Cox fit is essentially based on the ranks of the exit times.
Take home message: look at your data before starting the analysis.
Multi-state models
• entry=1 if this is a subject’s first state after entry into the study
Time in the surgery state is left out of the model, since it is clearly not related
to the covariates being tested (this is, in one sense, a particular state by
covariate interaction).
Multi-state models
The coefficient for being in a prior surgical state is (barely) not significant in
the reduced model.
The first state after entry has a reduced duration (1.8 fold increased rate of
departure from the state), but beyond that first transition a Markovian model
might be entertained as neither the prior state nor the duration in that state
has significant predictive power.
The analysis so far has only looked for a covariate’s overall effect on “time in
state”, not distinguishing the states.
Does gender, for instance, affect some transition times more than others?
It appears that the gender effect is largest for transitions out of the mild and
early treatment states.
Further questions
• Gender interactions?
• Hazard curves
Multi-state models
Hazard functions
> fit7 <- coxph(Surv(time, status) ~ male + ent.rem + ent.trt1 +
                strata(stratum), data=crohn1)
> tdata <- data.frame(male=1, ent.rem=0, ent.trt1=0)
> fsurv <- survfit(fit7, newdata=tdata)
> fsurv
Call: survfit.coxph(object = fit7, newdata = tdata)
These fits are based on the data set crohn1, which has one observation per
state.
They ask: how long do you stay in a state, ignoring which state is the
destination.
Multi-state models
[Figure: cumulative hazard (0–4) of leaving each state (Surgery, Treat 1, Treat 3, Mild, Remission, Surgical remission) versus months, log-type scale 0–200]
Multi-state models
Hazard functions
data tdata;
male=1; ent_rem=0; ent_trt1=0;
output;
proc plot;
plot sqrt(time) * surv; /* not legal SAS */
Frailty
Frailty Models
Frailty
Software Maturity
• In SAS
Frailty
• Simple effects
  – Stata
    ∗ Built in
  – S/R
    ∗ coxph(Surv(time, status) ~ frailty(grp))
  – SAS
    ∗ Macro from J. Klein
    ∗ Parametric Weibull + frailty, NLMIXED examples
• Complex
  – S/R: coxme function
Frailty
Data
There are exciting challenges with real data sets –
“Medical research at Mayo is like standing in front of a fire hydrant with your
mouth open.” – John Blinks
Introduction (Frailty)
In the last several years there has been significant and active research
concerning the addition of random effects to survival models.
The idea is that individuals have different frailties and that those who are most
frail will die earlier than the others.
Aalen (Stat Med 88) provides theoretical and practical motivation for frailty
models by discussing the impact of heterogeneity on analyses and by
illustrating how random effects can deal with it.
Introduction (Frailty)
He states
One issue is the selection problem: Assume 3 groups of subjects with hazards
of 20, 40, and 60 events/100 per year. What does the overall survival curve
look like?
Introduction (Frailty)
[Figure: survival on a log scale (0.005–1.000) over years 0–10, for the three subpopulations and the overall population]
Introduction (Frailty)
• This has an effect on our estimates and can bias them toward 0. Omori and
Johnson, Biometrika 1993: the rate of bias acquisition = var(X|t), where X
is the frailty with an initial mean of 1.
• Aalen shows that with a sufficiently variable frailty the population relative
risk can go from r to 1/r over time, even though the true relative risk stays
at r.
Introduction (Frailty)
The drug is very effective for the majority of patients, reducing the rate of
infections by half.
A study is planned with 1 year of follow-up per patient, 1000 subjects per arm.
Assuming that infections are a simple Poisson process with the above rates,
we can compute the expected number of subjects who will finish the year with
0, 1, . . . events directly:
Number of infections
0 1 2 3 4 5+
Placebo 201 256 182 98 46 217
Treatment 390 269 106 35 18 182
For a patient with 2 infections already, what is the chance of a third? In the
placebo arm it is 361/543 = 66% and for the treatment arm the chance of
another infection within the year is 235/341 = 69%, higher than the placebo
group!
The problem is that the treatment arm’s ‘2 events or more’ group is dominated
by the severe disease patients, 58%, while the placebo group is 37% severe
disease patients at that point.
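The two conditional percentages can be read directly off the table:

placebo <- c(201, 256, 182,  98, 46, 217)  # subjects with 0,1,2,3,4,5+ infections
treated <- c(390, 269, 106,  35, 18, 182)
sum(placebo[4:6]) / sum(placebo[3:6])      # 361/543 = 0.66
sum(treated[4:6]) / sum(treated[3:6])      # 235/341 = 0.69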
Introduction (Frailty)
This wouldn’t happen for an analysis stratified by subset, but we don’t know
the subsets.
Aalen shows that with a sufficiently important frailty, the population relative risk
can go from r to 1/r over time, even though the true relative risk stays at r.
Models (Frailty)
y = Xβ + Zb + ε
• y = response
• β = fixed effects
• β is still unbiased.
Preview
Consider a random effects Cox model
λi(t) = λ0(t)eXiβ+Zib
3. Penalized methods:
• Agree with EM for several cases
• Extensible to a larger variety of cases
• Fast, flexible
4. Laplace approximation
• Particular to the case of a Gaussian random effect
• Closely related to the penalized methods
• The basis for coxme
Models (Frailty)
λi(t) = λ0(t)e^{Xiβ+Zib}
with b ∼ G(0, σ²).
• Closure: models with and without Zib are from the same family.
  – Linear model: Gaussian
  – Cox model: Positive stable
Models (Frailty)
Simple Frailty
λ(t) = λ0(t)e^{Xβ+Zb}
Equivalent: λ(t) = λ0(t) ω e^{Xβ}
• ω ∼ G(1, σ²)
  – G is a positive distribution
    ∗ gamma
    ∗ positive stable
  – wlog we can assume that G has mean 1
Models (Frailty)
Iterate
• Even a bisection search gains one more (binary) digit per step
Formal algebra
Let φ(s) be the Laplace transform of ω, and φ^(n)(s) be its nth derivative. Parner (96) shows that the EM algorithm for solving the shared frailty model has the "E" step
    ω̂j = −φ^(dj+1)(Aj) / φ^(dj)(Aj),
where dj is the number of events in group j and Aj is the accumulated hazard for the group.
He also shows that the observed-data log-likelihood, i.e., the result after integrating out the frailty, is
    Lg = Σ_{i=1}^{n} δi log(λi e^{Xiβ}) + Σ_{j=1}^{q} log[(−1)^{dj} φ^(dj)(Aj)]
For the gamma frailty:
    φ(s) = (1 + s/ν)^(−ν)
    φ^(d)(s) = (−1/ν)^d (1 + s/ν)^(−(ν+d)) ∏_{i=0}^{d−1} (ν + i),
which gives
    ω̂j = (dj + 1/θ) / (Aj + 1/θ)
with θ = 1/ν the variance of the random effect.
If the variance of the random effect (θ) is infinite (no restriction on the size of
the random effect), then the family effect is the simple Poisson estimate
(events/ total time).
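The E-step update is then one line; a sketch with hypothetical per-group event counts d and accumulated hazards A:

theta <- 0.5                     # variance of the random effect (hypothetical)
d     <- c(3, 0, 1)              # events in each group
A     <- c(2.5, 1.1, 0.8)        # accumulated hazard for each group
omega <- (d + 1/theta) / (A + 1/theta)   # shrunken version of the raw d/A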
For the positive stable frailty:
    φ(s) = e^(−s^p)
    φ′(s) = −p s^(p−1) e^(−s^p)
    φ″(s) = p e^(−s^p) [p s^(2p−2) − (p − 1) s^(p−2)]
Laplace transform
    L(t) = e^(−t^α),   0 < α ≤ 1
Models (Frailty)
Parner shows
    L^(n)(t) = Σ_{i=0}^{n−1} C(n−1, i) L^(i)(t) h(n − i) t^(α+i−n)
    h(x) = −α(α − 1)(α − 2) · · · (α − x + 1)
Other patterns
For sibling pairs, Li worked out a frailty using
b = γ1 + γ2
where γ1 and γ2 are Gamma random effects, one of which is shared between
the siblings and one is not.
with
• bj ≡ log(ωj)
• Lg = PPL + Σj c(dj, σ), where dj is the total number of events in group j. This does not depend on β or b.
Rat data
• 50 litters
Gamma frailty
[Figure: profile likelihoods for θ (Theta) and β (Beta), rat data; log-likelihood from −186 to −181]
Models (Frailty)
• An indicator variable for each litter is added to the model (like a class
statement) = 51 coefficients.
Gaussian frailty
The hazard function for the sample is defined as λ(t) = λ0(t)e^{Xβ+Zb}.
Consider the log partial likelihood P L(β, b), treating β and b as covariates.
Assume b ∼ N (0, Σ). Integrating the random effect out of the partial likelihood
gives an integrated log-partial likelihood L of
    e^L = 1/√(2π|Σ|) ∫ e^{PL(β,b)} e^{−b′Σ⁻¹b/2} db
        = 1/√(2π|Σ|) ∫ e^{PPL(β,b)} db
        ≈ 1/√(2π|Σ|) e^{PPL(β,b̂)} ∫ e^{−(b−b̂)′H(b−b̂)/2} db
Models (Frailty)
    L = PPL(β, b̂) − (1/2)[ log|Σ| + log|H_{b̂b̂}| ]
Models (Frailty)
Laplace approximation
    L = ∫ e^{PL(β,b)} e^{−b′Ab/2} db
Let
    g(b) = PL(β, b) − b′Ab/2
and b̂ = the value that maximizes g(b). Then
    g(b) ≈ g(b̂) + (b − b̂)′ [∂g/∂b at b̂] + (b − b̂)′ [∂²g/∂b² at b̂] (b − b̂)/2
         = g(b̂) + 0 + (b − b̂)′H(b − b̂)/2
But
    ∫ 1/√(2π|H⁻¹|) e^{(b−b̂)′H(b−b̂)/2} db = 1
Models (Frailty)
So
    L = ∫ e^{PL(β,b)} e^{−b′Ab/2} db ≈ e^{g(b̂)} √(2π|H⁻¹|)
Models (Frailty)
• For a Cox partial likelihood, does integrating out a random effect still give a
valid PL?
? Ripatti and Palmgren, JASA, show that it does
• There is some early evidence from Cortinas (PhD thesis) that the ‘REML’ is
not as stable.
Variance: 0.4255484
Diabetic Retinopathy Data
Between 1972 and 1975 seventeen hundred forty-two patients were enrolled
in a multi-center study to evaluate the efficacy of photocoagulation treatment
for proliferative diabetic retinopathy; photocoagulation was randomly assigned
to one eye of each study patient, with the other eye serving as an untreated
control. A major goal was to assess whether treatment significantly delayed
the onset of severe visual loss.
We will use a subset of 197 patients, which is a 50% sample of the high-risk patients as defined by the study. The data set has 394 observations.
Diabetic Retinopathy Data
The four models below are a marginal, stratified, gamma frailty and Gaussian
frailty model for the diabetic retinopathy data set.
Random effects: ∼ 1 | id
id
Variance: 0.7784043
Diabetic Retinopathy Data
The Gaussian and gamma frailty models both estimate a substantial random
effect for subject, of size 0.78 and 0.85, respectively. (Gaussian random effect
with REML was .77).
The estimated size of the treatment coefficient rises with the variance of the
random effect, as is expected theoretically (Henderson JRSSB 99), but only
by a small amount.
The stratified model, which treats each subject as a separate stratum, is the most aggressive correction for possible correlation, and has the largest treatment effect but also the largest variance.
The adult/juvenile covariate is not estimable in the stratified model, since the
covariate value is constant within a given stratum.
Diabetic Retinopathy Data
λi(t) = λ0(t)eXiβ+bi
• An ‘average’ person has a risk which may be exp(.88) = 2.4 fold higher or
lower than the population as a whole.
Catheters may be removed for reasons other than infection, in which case the
observation is censored.
There are 38 patients, each with exactly 2 observations. Variables are the
subject id, age, sex (1=male, 2=female), disease type (glomerulo nephritis,
acute nephritis, polycystic kidney disease, and other), and the time to infection
or censoring for each insertion.
Diabetic Retinopathy Data Kidney
• In kfit3, θ̂ is essentially 0.
Diabetic Retinopathy Data Kidney
When the disease variable is left out of the random effects model, however,
we get a quite different result.
• Both the approximate Wald test and the likelihood show significance for θ.
[Figure: estimated gamma frailty (−2.0 to 0.5) for each subject, grouped by disease: Other, GN, AN, PKD]
Diabetic Retinopathy Data Kidney
• Subject 21, a 46 year old male, the median for the study is 45.5.
• In a model with only age and gender, this subject has a martingale residual
of -7.4; the next smallest is -1.8.
With this subject removed, neither the disease (p=0.53) nor the frailty (p>0.9)
are important.
With this subject in the model, it is a toss-up whether the disease or the frailty
term will be credited with ‘significance’.
Diabetic Retinopathy Data Kidney
Rebirth
Current
Questions
• Breast density
• Pharmacogenomics
Correlated Frailty issues
Statistical Questions
1. What to do with the probands?
• Exclude: wastes 426/1063 = 40% of our endpoints
• Include: biased
5. Missing data
6. Which model?
Breast Cancer Incidence / 100 000
20
40
■
Age
60
80
Correlated Frailty issues
Excess risk
Create a model that incorporates important covariates (parity, OC use, age,
...)
Look at the excess risk
    (number of observed events) / (expected, given the model) = O/E
• Formal shrinkage
? Random effects
? Bayesian
Correlated Frailty Familial Study
λi(a) = λ0(a)eXiβ+fj
f ∼ N (0, σ 2)
where
• a is age.
Correlated Frailty Familial Study
[Pedigree: one family, showing each member's age and the estimated per-person relative risks (0.94–2.90)]
Correlated Frailty Familial Study
• parity is 1 for those women with 1 or more offspring and 0 for nulliparous
women
• Family 574 is estimated to have high risk; there are 2 breast cancers and
only 3 blood relatives, and one of the cancers occurred before the age of
35.
Correlated Frailty Familial Study
[Figure: estimated frailty (1.0–5.0) for each family versus number of affected members (0–7); family 574 is the high outlier near 5.0]
Goal: for screening studies, select those families with the greatest likelihood of
inherited risk.
• Total number of affecteds per family was a first suggestion for the selection
protocol
• it does not take into account the total family size, the age-intervals of
follow-up, or the ages at which cancers occurred
• Useful tool even if the model isn’t completely correct (we don’t necessarily
believe a Gaussian frailty).
Frailty Familial Study
Many of these were advanced in age when the study began and were not
available for the follow-up questionnaire in 1991.
Frailty Familial Study
Penalized Models
The model can be viewed in 2 ways
Particular families commonly have 60% higher (or lower) risk than others.
Family 8
56
1.47
73 36 42 72 53 88 60 89 78 84 74
1.47 1.47 1.47 1.47 1.47
73 72 64 47 65 38 28 71 64 50 46 52 44
1.47 1.47 1.47 1.47 1.47 1.47
• It is clearly incorrect for a marry-in to "inherit" the familial breast cancer risk of the family they married into
Family 8
56
2.46
73 36 42 72 53 88 60 89 78 84 74
0.96 2.46 2.46 2.46 2.46
73 72 64 47 65 38 28 71 64 50 46 52 44
0.97 2.46 2.46 2.46 2.46 2.46
> length(fit2$frail)
[1] 424
[Figure: histogram of sfit1$frail[-(1:423)], counts up to 1000]
Frailty Familial Study
> fit2
Cox mixed-effects model fit by maximum likelihood
Data: main
Subset: (sex == "F" & proband == 0)
n=9399 (2876 observations deleted due to missing values)
Iterations= 7 82
NULL Integrated Penalized
Log-likelihood -5186.994 -5147.241 -5064.799
> length(fit2$frail)
[1] 424
> sfit$frail[420:424]
600 601 603 605 1000
-0.2381351 -0.0428534 -0.2052715 -0.3235427 -0.5020491
Frailty Familial Study
Sparse computation
• For sanity, the program can't deal with a 4528 × 4528 Hessian matrix.
• The program does not even provide the estimate, in order to remove
temptation.
Frailty Familial Study
[Figure: profiled marginal likelihood (−4634 to −4624) versus the variance of the random effect]
Individual frailty
> coxme(Surv(startage, endage, cancer) ~ parity0, main,
        random= ~1|gid, subset=(sex=='F' & proband==0))
• Not identifiable
Correlated Frailty Familial Study
• Some families are very large (max=130). Can one parameter capture an
entire family?
Correlated Frailty
Assume that the risk for subject i is
λi(a) = λ0(a)e^{Xiβ+bi}
b ∼ N(0, σ²K)
where the entries of K are:
1 = self
1/2 = mother/daughter, sisters, ...
1/4 = uncle/niece, ...
0 = different families
Correlated Frailty Familial Study
• To create a kinship matrix for the females, you need all the people.
? If men are left out, my sister and my daughter will appear unrelated.
>kfit1
n=9399 (2876 observations deleted due to missing values)
Iterations= 3 38
NULL Integrated Penalized
Log-likelihood -5186.994 -5170.335 -4910.234
> length(kfit1$frail)
[1] 9399
Correlated Frailty Familial Study
Family 8
56
2.53
73 36 42 72 53 88 60 89 78 84 74
1.39 1.53 1.63 1.56 1.59
73 72 64 47 65 38 28 71 64 50 46 52 44
0.95 2.75 2.65 1.22 1.22 1.25
• The variance of the random effect is much larger in this model, 0.91
• But only 490 'effective' degrees of freedom, due to the rich correlation structure
breast-prostate Familial Study
A sub-study was carried out by obtaining prostate cancer (PCa) information from 141 families. Male relatives over the age of 40 were screened for prostate cancer via a survey.
breast-prostate Familial Study
Three models
1. Common genes: each person’s risk of cancer depends on that of both male
and female relatives
2. Separate genes: a female’s risk of cancer is linked to the risk of her female
relatives, a male’s is linked to that of his male relatives, but there is no
interdependency
λi(t) = λ_{0 or 1}(t) e^{Xβ+bi}
λ0 = baseline risk for breast cancer, females
λ1 = baseline risk for prostate cancer, males
β = covariates = parity
breast-prostate Familial Study
For simplicity assume a single family with parents, daughter, grandson, and
granddaughter.
where a = σ1², b = σ2² and c = σ3² are three components of variance for M/M, F/F, and M/F relationships.
breast-prostate Familial Study
How do we do this?
The program allows the variance of the random effect to be written as Σ = σ1²A1 + σ2²A2 + σ3²A3, where A1, A2 and A3 are known matrices. It then estimates σ1², σ2², and σ3².
breast-prostate Familial Study
Let
    K = [ K1   K3 ]
        [ K3′  K2 ]
• K1 = male-male relationships
• K2 = female-female relationships
• K3 = male-female relationships
breast-prostate Familial Study
    kmat.m  = [ K1  0 ]    kmat.f = [ 0  0  ]    kmat.mf = [ 0    K3 ]
              [ 0   0 ]             [ 0  K2 ]              [ K3′  0  ]

    b ∼ N(0, σ1² kmat.m + σ2² kmat.f + σ3² kmat.mf)
breast-prostate Familial Study
#
# Fit the overall model
#
nfit1 <- coxme(Surv(startage, endage, cancer) ~ parity0 +
               strata(sex),
               main, random= ~1|gid,
               subset=(proband==0), varlist=kmat)

             coef  exp(coef)  se(coef)     z      p
parity0 -0.3253943 0.7222425 0.1212548 -2.68 0.0073
Note, the majority of the 12,459 deleted due to missing values are men who
were not a part of the substudy.
breast-prostate Familial Study
Separate Model
• Knowledge of the males is not informative for the females’ risk and
vice-versa
> nfit2a$coef
> sfit3$coef
parity0 gid
-0.3219134 0.918989
breast-prostate Familial Study
Combined Model
• Male history gives some information about female risk, and vice-versa, σ32
Results
            Variance
            M/F    F/F    M/M     2L
Common      0.76   0.76   0.76   64.4
Separate    0      0.92   0.84   72.7
Combined    0.27   0.92   0.88   74.5
• The separate effects model fits better than the common effect model
  – Chi-squared of 8.3 on 1 df
  – Female relatives are more informative about breast cancer than male relatives are about prostate cancer
  – Both are very useful, however.
• 37 centers
• 2323 patients
• Enrollments of 20–29 (5), 30–39 (11), 40–59 (8), 60–89 (6), 91, 104, 116,
120, 155, 183, 247.
• 40 or 60% censoring
Conclusions
Moderate censoring, β = 0.7
  σ²          Estimates
          β̂                      σ̂²
 .04    .715 (.119, .094)       .038
 .08    .702 (.083, .095)       .076
 .4     .691 (.122, .165)       .383
• Fit
• Repeat
Shrinkage
Look at the excess risk for each family:
    Σ Oi / Σ Ei
> range(bfit1$frail)
[1] -0.5411219 0.8903696
> var(bfit1$frail)
[1] 0.0438545
[Figure: fitted random effect versus excess risk (O/E) per family; values roughly 0.6–2.0 on both axes, with the small families pulled toward 1]
Shrinkage
Remember the formula for a gamma random effect
    b̂i = ln[ (Oi + 1/σ²) / (Ei + 1/σ²) ]
With 1/σ² ≈ 23:
For small families (O and E < 2), the estimated random effect for the family is
close to 0.
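For instance, with hypothetical counts: a small family with O = 2 and E = 0.8 gets b = ln(25/23.8) ≈ 0.05, a relative risk of about 1.05, while a large family with the same O/E ratio (O = 20, E = 8) gets b = ln(43/31) ≈ 0.33, about 1.39. The same observed ratio is shrunk far less when it rests on more information.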
1. The estimated variance term from the model is for the true family
information.
• What I would have with full knowledge (1000’s of members)
2. The estimated bi for each family is a best estimate, after the fact, with small
information, for that family
Code questions
What it does, it does well.
Current flaws:
• Random slopes
• Wider testing.
Statistical questions
• What is the best way to estimate the variance of the random effect: MLE,
REML, AIC, . . . ?
• Are the Wald and LR tests reliable – or even consistent – when the number
of coefficients grows with n?
• Growing in popularity
Expected Survival
• An important distinction is
  – individual expected survival
  – cohort expected survival
Expected Survival
• 45-year-old US male
The code below shows that the chance of reaching a 55th birthday is 0.911.
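A minimal sketch of that computation in S/R (the data frame values are illustrative; the rmap argument is the syntax of recent versions of the R survival package, older S-Plus used a ratetable() term in the formula):

library(survival)
# expected survival of a 45 year old US male, evaluated yearly for 10 years
efit <- survexp(~ 1,
                data = data.frame(age = 45*365.25, sex = "male",
                                  year = as.Date("2009-01-01")),
                rmap = list(age = age, sex = sex, year = year),
                times = 365.25*(0:10), ratetable = survexp.us)
efit$surv   # the year-10 value is the chance of reaching the 55th birthday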
Rate Tables
S-Plus has several built-in rate tables, for example survexp.us (US total), survexp.usr (US by race), and survexp.mn (Minnesota).
It is quite easy to build your own, for example cancer incidence rates by age,
sex, calendar year and tumor type.
• Covariates in the Cox model take the place of age, sex, etc.
[Figure: Predicted survival, PBC; survival versus years 0-10]
Expected Survival
If there are multiple observations in the temp2 data set, then the result will be
multiple curves.
If there are multiple strata in the Cox model, this also gives multiple curves.
• Model stratified on edema, which has 3 levels. Two observations in the temp2
data set, one is age 50 the other age 65.
Expected Survival
• 6 curves result: a 50 year old with edema=0, a 65 year old with edema=0, a 50
year old with edema=0.5, . . .
• In S-Plus the curves are a standard “survival curves” object (print and plot
them like any other survival curve).
A nice advantage of the S-Plus form is that the curve object is recognized as
a survival curve, so all of the usual plot options work. In particular, it is easily
superimposed on a Kaplan-Meier.
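For instance, a sketch (the pbc variable names are illustrative; status == 2 codes death in the R version of the data set):

library(survival)
fit   <- coxph(Surv(time, status == 2) ~ age + strata(edema), data = pbc)
temp2 <- data.frame(age = c(50, 65))
sfit  <- survfit(fit, newdata = temp2)   # 2 subjects x 3 strata = 6 curves
plot(sfit)
lines(survfit(Surv(time, status == 2) ~ 1, data = pbc), lty = 2)  # overlay a KM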
Expected Survival
data temp2;
age=50; lbili=log(2); lalb=log(3.5); lpro=log(10); output;
age=65; lbili=log(2); lalb=log(3.5); lpro=log(10); output;
In SAS the final data set has the curves one after the other, sorted by strata.
Expected Survival
“Mean” survival
If there is no newdata argument in S-Plus, or the mean option in SAS, then the
predicted curve will be for a subject with ‘average’ covariates.
• This “mean” curve is NOT the predicted survival for the group of subjects.
Time-dependent covariates
When the model contains time-dependent covariates baseline survival
estimates can still be produced, but the results can be quite surprising.
The data set pbcseq contains sequential laboratory measurements on the 312
protocol patients of the PBC study. Patients were scheduled to return at 6
months, 12 months, and yearly thereafter; most patients have these visits and
many have one or two “extra” in addition. A time dependent covariate model is
fit to the data as
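follows (a sketch; pbc2 stands for a hypothetical (start, stop] version of pbcseq, and the covariate list is illustrative):

library(survival)
# (start, stop] data built from the sequential laboratory visits
fit <- coxph(Surv(start, stop, event) ~ age + log(bili) + edema + log(protime),
             data = pbc2)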
[Figure: Sequential PBC data; Cox predicted curve versus Kaplan-Meier; survival, years 0-14]
Expected Survival
• the normal course is for liver function tests to slowly worsen over several
years, then progress rapidly as the sclerosis and damage near a critical
threshold, followed shortly thereafter by liver failure.
• The model captures this fact, that failure is preceded by large values of
bilirubin, edema, and prothrombin time.
• The survival for such a subject — if such a person even exists — would be
quite good.
Expected Survival
Creating such a covariate path is difficult; it is all too easy to create baseline
hazards that correspond to a subject who is misleading, uninteresting, or
impossible.
Expected Survival
SAS printout
The Kalbfleisch-Prentice estimate is harder to set up in this situation; both SAS
and S-Plus use the Nelson-Aalen estimate because it's easier.
• Create a data frame that contains the entire covariate path for a subject. It
must contain all of the variables used in the coxph call:
? (start, stop] show the intervals (variable names matching the data set
of course)
? The covariates over that interval.
? The strata that applies over that interval (for a stratified model).
? event is ignored but must be present
• Use the individual=T option of survfit, to declare that one curve is desired.
• Caveat Emptor
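A sketch of these steps (fit is the time-dependent model above; the path values and variable names are hypothetical):

# the entire covariate path for one hypothetical subject
path <- data.frame(start = c(0, 180, 365),
                   stop  = c(180, 365, 730),
                   event = 0,                 # ignored, but must be present
                   age   = 50,
                   bili  = c(1.1, 1.4, 2.3),
                   edema = 0)
sfit <- survfit(fit, newdata = path, individual = TRUE)   # one curve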
Expected Survival
• Nelson-Aalen estimator
? Variance of the estimate worked out in Tsiatis
? most common
? based on hazards
? easily extended to multiple events
• Kalbfleisch-Prentice estimator
? Kaplan-Meier like
? Somewhat harder to compute (iteration for tied deaths)
? the default in SAS
• Fleming-Harrington estimator
? Variant on the Nelson for tied deaths (identical logic to the Efron
approximation)
? the default in S-Plus
Expected Survival
Which method you choose makes little difference except in the tail of the curve
(where the std error is huge anyway).
Estimators
• naive
• Ederer (exact)
• Hakulinen (cohort)
• Conditional
Expected Survival
[Figure: true cohort survival; survival versus years 0-80]
Expected Survival
Naive estimate for a Cox model: Compute the mean covariate for the new
data set, and use the baseline statement to get a single curve.
Ederer (or exact) estimate: Compute the expected survival of each individual,
separately, and average the curves.
    Se(t) = (1/n) Σi=1..n Si(t) .
In the Cox model case this has been rediscovered and renamed as the “direct
adjusted survival curve”.
Due to a variety of factors, however, including the high cost and risk of the
procedure and the limited number of donor organs, this premise has never
been subjected to a comparative trial.
When a donor organ becomes available, a liver transplant team, either locally
at a center or collectively via the procedures of UNOS (United Network for
Organ Sharing), must decide which of the multiple needy recipients will receive
it. A randomized trial is socially unsaleable in this environment, both now and
in the conceivable future.
The data set olt contains the survival post orthotopic liver transplant for 214
liver transplants of PBC patients, done at Mayo Clinic, Baylor College of
Medicine, and Pittsburgh Medical Center from April of 1985 through
September of 1994.
Other than status, the variable names in the olt data set are identical to those
in the pbc data set.
The advantage of liver transplant for these subjects is quite impressive. If one
looks very closely, there is a small early disadvantage for transplant
corresponding to 6 early transplant deaths at 5, 16, 16, 19, 21 and 22 days.
Expected Survival
[Figure: observed survival post transplant; years 0-5]
SAS computation
In SAS we first generate all 214 individual curves, then average them using
proc means. A plot that overlays the Kaplan-Meier with the expected curve is
left to the reader. (In other words, SAS/graph defies my understanding.)
Expected Survival
The single change in the call to survexp is inclusion of the cohort=F argument;
the function then produces per-individual results rather than cohort values.
The object exp2 is simply a vector of 214 values, each of which is the value of
the appropriate per-subject survival curve (such as would be computed with
survfit), at the last follow-up time for that subject. The one sample log-rank
test is
    ( Σi (Ni − Ei) )² / Σi Ei
which is considered to be a chi-square on 1 degree of freedom.
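A sketch of this computation (the olt variable names are illustrative; status is assumed to be 0/1):

library(survival)
exp2  <- survexp(futime ~ 1, data = olt,
                 rmap = list(age = age, sex = sex, year = entry.dt),
                 ratetable = survexp.us, cohort = FALSE)
obs   <- sum(olt$status)          # observed deaths
expt  <- sum(-log(exp2))          # expected deaths (cumulative hazards)
chisq <- (obs - expt)^2 / expt    # compare to a chi-square on 1 df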
Expected Survival
Not easy
Other estimates
The “exact” estimate is not always the best, in spite of its name. As with other
statistical procedures, it is often the “exact computation of the wrong thing”.
? easy to compute
? asymptotically equivalent to Hakulinen if
observed = expected.
3. The expected survival of the matched cohort, assuming they have similar
censoring to the patients (Hakulinen).
Each of the fictional matched subjects is assumed to have entered the study
at the same time as his/her matched case. Each of them thus has a maximal
follow-up time, and the expected curve is computed with censoring due to
follow-up.
For the aortic valve study, the exact estimate of expected was 63% at 15
years, the estimate adjusted for censoring is 75%. The observed survival rate
at 15 years was 50% (first year mortality of 11%). Surgery was not as
life-saving as it seemed.
This will be an issue with any study that has significant changes in its patient
mix during the accrual period.
Expected Survival
Match each case with 103 controls. If a control dies, note it, and replace
him/her with a new one. If the case dies or is censored, however, stop all
follow-up of his/her controls at that point.
• The Hakulinen method has been rediscovered and called the “Bonsel”
estimate.
• The conditional estimate has been called the “direct survival curve”.
Expected Survival
Assume that futime is the follow-up time for the subjects, and that ctime is the
constructed censoring time variable
• For subjects who have died, ctime = the time at which they would have
been censored, usually analysis date - enrollment date.
Thomsen et al (1991) point out the difference between the naive and Ederer
estimates, and explain the difference in another way. They also discuss the
conditional estimator, but say that it is “difficult to interpret”.
The variance of S0 is found in Tsiatis, and is available from both SAS and
S-Plus. The variance of Se is found in Gail and Byar; it is a mess to compute.
The variance of Sh and Sc?
Expected Survival
2. The Ederer (direct adjusted) and Hakulinen (Bonsel) estimates have not
measurably differed in published comparisons. (But I have a recent
example where they do.)
3. The method used for the individual curves Si has little impact. (For the
Groningen curves the last point on the curve is .0931 for PL vs .0946 for CH).
               SAS     S-Plus
Population             survexp
Individual      √         √
Ederer                    √
Hakulinen                 √
Conditional               √

* SAS has no software for cohort estimates, but since each of the Si can be
generated, the Ederer estimate is computable with a macro.
Expected Survival
Under a Cox model, the Nelson-Aalen estimate for a hypothetical subject with
covariates X † and risk r† is
    Λ̂(t) = r† Σi=1..n ∫0^t dNi(s) / Σj=1..n Yj(s) r̂j(s)

          = r† ∫0^t dN(s) / Σj=1..n Yj(s) r̂j(s)
Most computer programs subtract the means from each column of X before
doing this computation, to avoid overflow in the exponential function.
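As a sketch of that centering (X a covariate matrix and beta its coefficients, both illustrative objects):

Xc   <- scale(X, center = TRUE, scale = FALSE)  # subtract the column means
risk <- exp(drop(Xc %*% beta))                  # risk scores, less prone to overflow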
Variance
The variance of Λ̂ consists of two terms
• Term 1 = the variance of the N-A estimator, assuming that β was known
    T1(t) = (r†)² Σi=1..n ∫0^t dNi(s) / [ Σj=1..n Yj(s) r̂j(s) ]²
Again, subtracting the means from each column of X makes the computer
program more stable.
Expected Survival
    T2(t) = q′(t) I⁻¹ q(t)

    q(t) = r† ∫0^t [ X† − X̄(s) ] dN(s) / Σj=1..n Yj(s) r̂j(s)
With two tied deaths, say subjects 1 and 2 of the five at risk, the increment

    1/(r1 + r2 + r3 + r4 + r5) + 1/(r1 + r2 + r3 + r4 + r5)

would be replaced by

    1/(r1 + r2 + r3 + r4 + r5) + 1/(.5 r1 + .5 r2 + r3 + r4 + r5) .
Kalbfleisch-Prentice estimate
    S(t; β̂) = Π t(k)≤t αk ,
where t(k) are the unique event times, and αk satisfies the equation
    Σi=1..n dNi(t(k)) ri(t(k)) / (1 − αk^ri(t(k)))  =  Σi ri(t(k)) ,
• The left term is thus a sum over the subjects with an event at that time, and
the right term a sum over everyone at risk.
If there is only one event at time t(k) then the equation can be solved for αk ,
    αk = ( 1 − r(k) / Σi ri(t(k)) )^(1/r(k)) ,
with r(k) the risk score of the subject who experienced the event.
If there are multiple events, solve by iteration. (It is known that 0 < α < 1).
A Greenwood style variance can be created, but has not been explored.
Expected Survival
• A closed form estimator for the direct adjusted estimate (Cox + Ederer) has
been worked out by Gail et al.
? It has not, however, appeared in any packages
? Because the n individual curves are all correlated, due to a common β̂
and baseline hazard Λ0, the estimator is complex
• Estimates for the conditional estimate have been worked out by Keiding
? Based on martingale arguments
? However, the conditional estimate is hard to interpret
Expected Survival
Start-stop data
• Many different types of things can be set up by creating a (start, stop] data
set.
• For some of them a survival curve makes sense, for some not
1. Multiple events
• max 1 event per strata (UDCA study)
? Assumption that someone who has had one event type is still
informative for other event types
• Multiple events in a strata (Andersen-Gill model)
? last year’s lecture: S estimates the time to first bad thing
? this year's lecture: no it doesn't
? To estimate time to first event, use a time to first event data set.
Expected Survival
If
• the two groups differ wrt several covariates x1, x2, x3, . . .
then fit a Cox model with sex as the strata, and the important covariates as
predictors.
Expected Survival
[Figure: PBC data; survival of males and females]
Look at a more realistic example.
[Figure: scatterplot of bilirubin (log scale) by edema group (0, 0.5, 1)]
Expected Survival
[Figure: PBC Study, survival by edema; years 0-12]

[Figure: comparison of adjusted and unadjusted survival; years 0-12]
Population adjustment
• Because
? bilirubin is continuous
? risk is very well modeled by log(bilirubin)
? the Cox model had the “right” variable in the right form
• Then we can get away with using the mean curve as a display.
? actually no, not even in this case
? major mistake in other data sets
• Population average
? What the KM “would have been” if all three edema groups had the same
distribution of bilirubin values
? What to use as the ‘same distribution’?
∗ the overall distribution of bilirubin
∗ the distribution of bilirubin in one particular group
[Figure: comparison of direct adjusted and unadjusted survival; years 0-12]

[Figure: survival curves; years 0-12]
Reprise
• When dealing with a rate table, one can compute survival for either a group
or for a person
? survival for a ‘person’ is of course the hypothetical survival for a large
number of identical persons
? (the actual survival for one person is a step function that goes from 1 to 0
on the day of their demise)
Expected Survival
Standard Populations
• Individual survival
? What is the survival of a 53 year old male, still alive on Oct 30, 2006?
? What is the expected number of years remaining?
? What is the probability of 12 more years?
? Use the US rate tables (or Minnesota, or . . . )
? In S/R: survexp routine with only 1 subject in the data set
• Cohort survival
? Given cohort of subjects with a given age/sex makeup, what is the
expected survival curve for the group as a whole?
? The appropriate thing to compare to a Kaplan-Meier
∗ corresponds to a hypothetical study where one had actually recruited
an age/sex matched control for every study subject.
∗ In S/R, use survexp with cohort=T and multiple subjects
◦ it calculates all n individual survival curves
◦ averages them appropriately (Ederer, Hakulinen, conditional)
? Why are age, sex, and race always the covariates?
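A sketch of the cohort computation (a hypothetical data frame mydata with futime, status, age, sex, and entry.dt; the rmap syntax is from recent versions of the R survival package):

library(survival)
km   <- survfit(Surv(futime, status) ~ 1, data = mydata)
efit <- survexp(~ 1, data = mydata,
                rmap = list(age = age, sex = sex, year = entry.dt),
                ratetable = survexp.us,
                times = 365.25*(0:10))   # Ederer cohort curve; giving futime as
                                         # a response allows Hakulinen/conditional
plot(km)
lines(efit, lty = 2)   # superimpose the expected curve on the KM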
Expected Survival
Cox models
Issues are exactly the same as before
• Individual survival
? What is the predicted survival for a 53 year old PBC patient with bilirubin
of 2, no edema, prothrombin of .5, and albumin of 3.5?
? In S/R use the survfit routine, with a Cox model fit as the first argument.
? In SAS, use the baseline statement
? Often applied to the immediate subjects of a study
? Less often, a new patient and an older definitive study.
• Cohort survival
? Using predictions from a prior Cox model, what is the expected survival
of a given cohort of subjects?
? Sometimes used for ‘what if’ questions.
? Sometimes used to adjust a survival curve.
? Sometimes used to display the results from a study.
• What variables?
Expected Survival
• D-penicillamine, the agent used in the trial for the PBC data set, was shown
not to be effective; several other drugs have since been evaluated in this
disease.
• The data are reported in Lindor et al. 1994; UDCA nearly halves the rate of
adverse outcomes.
Expected Survival
[Figure: survival for Placebo, UDCA, and Expected; years 0-5]
• Why does the placebo group have better survival than expected?
? Suspect: the rise in orthotopic liver transplant (OLT)
? At the time of the PBC natural history trial, transplant was new
? At the time of the UDCA trial, transplant was a recognized therapy
∗ the sickest patients are selected for transplant
∗ the treatment ‘rescues’ those who are about to die
Expected Survival
• However....
? We would like to retain the coefficients of the PBC natural history model
? The model has been validated in 3 other populations, with outstanding
stability
? Change only the baseline hazard Λ0
? Similar to changing only an intercept
[Figure: Death/OLT for Placebo, UDCA, and Expected; years 0-5]
Alternate code
Approach: Obtain the observed and expected event counts, by subsets of risk
score.
• Complication
? Some patients crossed over from placebo to UDCA
? This occurred at the planned end of the study
? UDCA takes about 6 months to saturate.
? How to deal with them?
∗ Only use time up to crossover
∗ Change groups at crossover (risk score at crossover)
∗ Use the risk score after 6 months of treatment, along with time after 6
months
Expected Survival
                                                After 6 Months
             Placebo             UDCA                UDCA
Score      n  Obs   Exp      n   Obs   Exp      n   Obs   Exp
≤ 4.4     24    0   1.0     29     2   2.4     52     2   3.0
4.4–5.4   38    5   3.1     26     1   6.4     17     2   3.2
5.4–6.4   20    1   4.3     20     5  10.7     11     6   4.7
> 6.4      9    7   7.1     13     7  21.1      5     3   5.2
Total     91   13  15.4     88    15  40.5     85    13  16
Table 3: Observed and expected numbers of death/OLT events
Competing risks
Monoclonal Gammopathy
The plasma cell lineage comprises only a small portion of human blood cells,
<3%, but is responsible for the production of immunoglobulins, an important
part of the body’s immune defense.
The population prevalence of MGUS increases with age from about 1% at age
50 to 3% for patients over 70.
Given its connection to serious plasma cell diseases, the potential prognostic
importance of MGUS is of interest. Is it:
• something in between?
Competing risks
The data set has 1384 subjects. If there were no censoring, one useful thing
to examine would be the crude hazard:
Notice that if the death rate were to increase, the progression rate would
decrease. This estimate, also called the cumulative incidence or marginal
incidence estimator, shows the influence of one cause upon another.
One is treating each death as “if this subject had not died, their future disease
course would be just like those who survived him/her”.
• The thought experiment must make sense: “progression with all other
causes of death eliminated”.
[Figure: Kaplan-Meier versus competing risk estimate; cumulative risk, years 0-25]
Competing risks
• The curve begins to flatten out at older ages, as the other infirmities
increase.
• “What is a patient’s ongoing risk, given that he/she is still coming in to see
me?”
The KM curve is used much more than it should be, because software is
available.
Competing risks
Computing CI estimates
Let S(t) be the overall Kaplan-Meier, “time to first bad thing”.
Method 1
1. If S has a jump at time t (some event happened), let ∆(t) = the jump size
at t = S(t − 0) − S(t).
2. Partition the jump across the event. If there were 3 events at time t, 2 of
type 1, 0 of type 2 and 1 of type 3, then
• CI1(t) will increase by (2/3)∆(t) at this point
• CI2(t) does not increase
• CI3(t) will increase by (1/3)∆(t) at this point
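A sketch of Method 1 (assuming vectors time and status, with status 0 = censored, 1 = event of type 1, 2 = event of type 2):

library(survival)
sfit <- survfit(Surv(time, status > 0) ~ 1)    # overall KM, any event type
jump <- -diff(c(1, sfit$surv))                 # jump size Delta(t) at each time
d1   <- sapply(sfit$time, function(t) sum(time == t & status == 1))
dall <- sapply(sfit$time, function(t) sum(time == t & status > 0))
CI1  <- cumsum(ifelse(dall > 0, jump * d1 / dall, 0))  # incidence of type 1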
Competing risks
This is extensible to more complex situations (like the Cox model), but not
very intuitive.
Method 3: redistribute-to-the-right
Averaging Survival
Given a mixed population
• ...
    S(t) = Σi=1..k pi Si(t)
Competing risks
Proof
Average Hazard
[Figure: survival (log scale) for the subpopulations and overall; years 0-10]
Competing risks
    λ(t) = f(t) / S(t)

    CI(t) = ∫0^t f(s) ds = ∫0^t λ(s) S(s) ds
Competing risks
• fk is a ‘defective’ distribution function
Competing risks
Consider the following small data set of AML patients (from Miller), with
survival times in months of: 9, 13, 13+, 18, 23, 28+, 31, 34, 45+, 48, and 161+.
                              Step
Time      0      1        2            3             4
9         1      1        1            1             1
13        1      1        1            1             1
13+       1      0        0            0             0
18        1      1+1/8    9/8          9/8           9/8
23        1      1+1/8    9/8          9/8           9/8
28+       1      1+1/8    0            0             0
31        1      1+1/8    9/8+9/40     27/20         27/20
34        1      1+1/8    9/8+9/40     27/20         27/20
45+       1      1+1/8    9/8+9/40     0             0
48        1      1+1/8    9/8+9/40     27/20+27/40   81/40
161+      1      1+1/8    9/8+9/40     27/20+27/40   0
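A sketch of the algorithm behind the table (weights start at 1, as in the table; status 1 = death, 0 = censored):

rttr <- function(time, status) {
  n <- length(time)
  o <- order(time, -status)      # time order; deaths before censorings at ties
  w <- rep(1, n)                 # each subject starts with weight 1
  for (i in seq_len(n)) {
    j <- o[i]
    if (status[j] == 0) {        # censored: pass the weight to later subjects
      later <- if (i < n) o[(i + 1):n] else integer(0)
      if (length(later)) w[later] <- w[later] + w[j] / length(later)
      w[j] <- 0
    }
  }
  w                              # w/n at each death = the KM jump there
}
# AML data: rttr(c(9,13,13,18,23,28,31,34,45,48,161),
#                c(1, 1, 0, 1, 1, 0, 1, 1, 0, 1,  0))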
• ordinary KM: redistribute the weights of those who die of other causes.
Treat removals due to other causes as though they were censored alive.
• At closure
? 639 OLT
? 79 died while waiting (12 “withdrawn”)
? 41 withdrawals from the list
? 102 remaining on the list
Competing risks Transplant
[Figure: competing risks on the waiting list; cumulative incidence of OLT, death, and withdrawal, plus the overall curve; years 0-5]
Note that the cumulative incidence curves sum to the total one.
The KM for death and the CI for death in this data set are quite different
• The KM estimates the expected death rate, among waiting list patients, if
OLT and withdrawal were to become unavailable.
• In this data, the actual effect of changes over time is the more interesting
question. This is what the CI curve estimates.
Competing risks Transplant
[Figure: Death on the waiting list; KM versus CI; cumulative probability, years 0-5]
[Figure: four panels of cumulative incidence curves by era (1990-92, 1993-95, . . . ); years 0-3]
Competing risks Transplant
[Figure: death on the waiting list by era (1990-92, 1993-95, 1996-97, 1998-99); cumulative probability 0-0.15]

[Figure: time to transplant, 4 eras (1990-92, 1993-95, 1996-97, 1998-99); cumulative probability versus years]
[Figure: time to transplant by blood type (A, B, AB versus O), four panels by era (1990-92, 1993-95, 1996-97, 1998-99); years 0-3]
Competing risks Transplant
Key Issue
1. The existence of a competing risk λ2 does not change the cause-specific risk λ1
• But the added accident risk may lower the lifetime risk of cancer occurrence
Example
Imagine 2 causes of death A and B
• Difficult to distinguish
? one-third of the A’s are mistakenly called B’s
? Essentially random
Informative Censoring
When dividing the weight equally among all those still at risk, the redistribute
algorithm makes a key assumption: the future course of a censored subject is
just like that of the others still at risk (censoring is uninformative).
Solutions:
• Give the weight unevenly, to those “most like” the censored subject.
? In a Cox model that has all the correct variables and coefficients this is
not necessary.
? Leads to time-dependent weights in the Cox model.
Liver transplant
For the waiting time data, competing risks analysis has been very useful.
Current status
Treatments          alive    dead
1            12      211     578
2            16      114     355
3            23       80     225
4            12       46     112
5             2       29      64
6+            4       29      33
Total        69      509
Competing risks Multi-state models
[Figure: cumulative incidence of progression and death; years 0-5]
Competing risks Multi-state models
[Figure: cumulative incidence of progression and death; years 0-5]
Competing risks Multi-state models
[Figure: cumulative incidence curves; years 0-10]
Delayed entry
For a large study of 3914 multiple myeloma (MM) patients seen at the Mayo
Clinic from 6/47 through 2/96 we had
• Date of diagnosis of MM
For clinical reasons, the principal investigator (Dr. R. Kyle) wanted to use
“survival from diagnosis of MM”, rather than “survival from arrival at Mayo”.
On this time scale the data is left-truncated: a patient who comes to Mayo 1
year after diagnosis would not have been seen had death occurred at < 1
year.
[Figure: naive versus correct survival estimates; years post Dx 0-5]
Competing risks Multi-state models
Computation
SAS
S-Plus
With delayed entry, e.g., referral patients who enter a group some time after
their diagnosis, the Kaplan-Meier curve is
    S(t) = Π s≤t ( 1 − d(s)/r(s) )
where
• The usual KM has r(t) = number who have not yet died at time t
• The difference here is that a subject does not enter the denominator until
he or she has arrived
• “Not counted as being in the lottery drawing, unless you could have won
the drawing.”
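In S-Plus or R this is just the three-argument form of Surv (a sketch; the mm variable names are hypothetical):

library(survival)
# entry  = years from MM diagnosis to arrival at Mayo
# futime = years from MM diagnosis to death or last contact
sfit <- survfit(Surv(entry, futime, status) ~ 1, data = mm)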
Competing risks Multi-state models
PBC study
Between January, 1974 and May, 1984 the Mayo Clinic conducted a
randomized double-blind study in primary biliary cirrhosis of the liver (PBC),
comparing the drug D-penicillamine (DPCA) with a placebo. There were 424
patients who met the eligibility criteria of the trial, of whom 312 agreed to
participate.
PBC is a rare but fatal chronic liver disease of unknown cause, with a
prevalence of about 50 cases per million population. The primary pathologic
event is destruction of the interlobular bile ducts, possibly by an immunologic
mechanism. Progression is slow but inexorable, with a median survival of
about 8 years. Females comprise 90% of the cases.
Patients were scheduled for a repeat visit at 6 and 12 months, and then yearly
thereafter. The primary analysis was conducted in July, 1986, two years after
the last patient had been entered. At that time 125 of the 312 patients had
died, eight patients had been lost to follow-up, and 19 had undergone
orthotopic liver transplantation (OLT).
Competing risks Multi-state models
[Figure: survival curves; years 0-12]

[Figure: PBC sequential data, bilirubin > 2; baseline data versus time dependent; years 0-14]
• The upper curve is survival for someone whose bilirubin never goes over 2.
Competing risks Multi-state models
Summary
The (start, stop] method produces a nice-looking curve.
Landmark curves
The problem with the time dependent covariate curves is that they do not
answer a clinical question: “This is the survival of a person with x, y and z”.
• They most certainly are not a description of the effect of high bilirubin
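A sketch of the landmark computation (a hypothetical subject-level data frame land: one row per patient still in follow-up at 25 months, with bili25 = the bilirubin value at the landmark):

library(survival)
land$group <- ifelse(land$bili25 > 2, "bili > 2", "bili <= 2")
sfit <- survfit(Surv(futime - 25, status) ~ group, data = land)  # from 25 months on
plot(sfit, lty = 1:2)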
[Figure: PBC sequential data, landmark at 25 months; bili <= 2 versus bili > 2; years 0-14]

[Figure: PBC sequential data, landmark at 25 months; bili <= 2, bili > 2, and the time-dependent curve; years 0-14]
Landmark
Advantages
Disadvantages