Censoring Issues in Survival Analysis: Kwan-Moon Leung, Robert M. Elashoff, and Abdelmonem A. Afifi
KEY WORDS: survival analysis, right censoring, interval censoring, informative censoring,
ignorability
ABSTRACT
A key characteristic that distinguishes survival analysis from other areas in statis-
tics is that survival data are usually censored. Censoring occurs when incomplete
information is available about the survival time of some individuals. We define
censoring through some practical examples extracted from the literature in various
fields of public health. With few exceptions, the censoring mechanisms in most
observational studies are unknown and hence it is necessary to make assumptions
about censoring when the common statistical methods are used to analyze cen-
sored data. In addition, we present situations in which censoring mechanisms can
be ignored. The effects of the censoring assumptions are demonstrated through
actual studies.
INTRODUCTION
Survival analysis is used in various fields for analyzing data involving the
duration between two events, or more generally the times of transition among
several states or conditions. It is also known as lifetime data analysis, reliability
analysis, time to event analysis, and event history analysis depending on the
type of application. In this paper, the term survival time is used interchangeably
with the terms risk period, lifetime, failure time, and time to a certain event.
To determine the survival time, we need to define two time points: the time of
origin, i.e. the time at which an original event, such as birth, occurs and the
time of failure, i.e. the time at which the final event, such as death, occurs. A
subject is said to be at risk if the original event has occurred, but the final event
has not.
A key characteristic that distinguishes survival analysis from other areas
in statistics is that survival data are usually censored or incomplete in some
way. We define censoring through some practical examples, then describe
the common statistical methods used to analyze censored data and discuss the
necessity of making assumptions about censoring when those methods are used.
We also discuss situations in which the censoring mechanism can be ignored,
and investigate the effects of different censoring assumptions in actual studies.
Finally, we indicate some trends in the future of research on censoring.
Annu. Rev. Public. Health. 1997.18:83-104. Downloaded from www.annualreviews.org
by Universidad Nacional de Colombia on 08/15/13. For personal use only.
ORIGINS OF CENSORING
Censoring occurs when incomplete information is available about the survival
time of some individuals. In this section, we present a number of concrete
examples extracted from the literature in various fields of public health. The
purpose is to use real-life situations to illustrate types of censoring and to
motivate the discussion presented in the later sections.
Examples
Example 1. Health insurance and mortality To examine the relationship be-
tween the status of insurance and the risk of subsequent mortality, adults older
than 25 years who reported that they were uninsured or privately insured in the
first National Health and Nutrition Examination Survey (NHANES I) (9) were
followed prospectively from initial interviews between 1971 and 1975 until
1987 (end of the Epidemiologic Follow-up Study, NHEFS). There were a total
of 4882 eligible subjects, of whom 669 subjects were uninsured.
The time period of interest was the time from the start of follow-up to death,
and the research question is whether this variable is affected by whether or not
the individual is insured. In general, an observation is said to be right censored
if the person was alive at study termination or was lost to follow-up at any
time during the study. By right censoring, it is meant that the survival time is
only known to exceed a certain value. In this study, the analysis was adjusted
for other factors such as baseline age, gender, race, smoking status, alcohol
consumption, obesity, self-rated health, employment status, and so forth.
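As a minimal sketch of how a right-censored observation of this kind might be recorded, the follow-up time and an event indicator can be derived from the dates available for each subject. The function name, argument names, and dates below are hypothetical and are not taken from the NHANES I data.

```python
from datetime import date
from typing import Optional, Tuple


def right_censored_observation(
    start: date,
    death: Optional[date],
    last_contact: date,
    study_end: date,
) -> Tuple[float, int]:
    """Return (years of follow-up, event indicator).

    The indicator is 1 if death was observed and 0 if the subject was
    right censored (alive at study end or lost to follow-up).
    """
    # Follow-up can be trusted only up to the earlier of the last
    # contact and the administrative end of the study.
    cutoff = min(last_contact, study_end)
    if death is not None and death <= cutoff:
        return (death - start).days / 365.25, 1
    # Otherwise the survival time is only known to exceed the cutoff.
    return (cutoff - start).days / 365.25, 0


# Hypothetical subjects: one observed death, one lost to follow-up.
died = right_censored_observation(
    date(1973, 1, 1), date(1980, 1, 1), date(1980, 1, 1), date(1987, 12, 31)
)
lost = right_censored_observation(
    date(1973, 1, 1), None, date(1978, 6, 30), date(1987, 12, 31)
)
```

For the second subject the recorded time is a lower bound on the survival time, which is what right censoring means in this example.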
Example 2. Do men and women relapse into alcoholism for different reasons?
A sample of 44 women and 50 men attending an alcohol treatment facility oper-
ated by the Western Australian Alcohol and Drug Authority were studied (29).
A range of demographic, social, and psychological measures were observed to
determine whether women and men relapse for different reasons. The length
of follow-up was three months and the variable of interest was the time from
variable was the time from diagnosis to death. In this sample, 28 right-censored
observations represented patients whose treatment was terminated because of
work, whichever came first. The workers were supposed to have health exami-
nations and to fill out questionnaires regarding respiratory symptoms at the start
of employment, then yearly at a routine examination, if attending the plant’s
health clinic because of respiratory symptoms, or when leaving employment.
The study sample consisted of 1301 subjects who were employed during
the study period and had at least two examinations. If the subjects reported
wheezing and dyspnea, they were considered symptomatic. The time variable
of interest was the time from employment to development of symptoms. Besides
fluoride exposure, other potential covariates included age, smoking habits, and
previous exposure to dust/gases.
In addition to right censoring, that is, leaving the potroom or ending the
survey without respiratory symptoms, some observations were singly interval
censored because for them the study endpoint was established only by periodic
examinations. Singly interval censoring means that the outcome variable
is not known exactly; it is known only to lie within a time interval.
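To illustrate how periodic examinations lead to an interval rather than an exact time, the sketch below brackets the onset of symptoms between the last symptom-free examination and the first symptomatic one. The function name and the examination times are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def interval_from_exams(
    exams: List[Tuple[float, bool]],
) -> Tuple[float, Optional[float]]:
    """Bracket the onset time from a series of examinations.

    exams holds (time since start of employment, symptomatic?) pairs.
    Returns (left, right), meaning onset lies in (left, right].
    right is None when no symptomatic examination was seen, i.e. the
    observation is right censored at the last examination.
    """
    left = 0.0
    for time, symptomatic in sorted(exams):
        if symptomatic:
            return left, time
        left = time
    return left, None


# Hypothetical worker examined yearly, first symptomatic at year 3.
print(interval_from_exams([(1.0, False), (2.0, False), (3.0, True)]))
# Hypothetical worker who left employment symptom free after year 2.
print(interval_from_exams([(1.0, False), (2.0, False)]))
```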
Example 6. Multicenter AIDS Cohort Study (MACS) From April 1984 to
September 1993, there were 4954 men between the ages of 18 and 70 who were
recruited for the Multicenter AIDS Cohort Study (MACS) (5). The MACS is
a longitudinal study of the natural history of human immunodeficiency virus
type 1 (HIV-1) among homosexual and bisexual men. Subjects were recruited
at four centers: Los Angeles, Chicago, Pittsburgh, and Baltimore.
An important time variable was the incubation period of AIDS (time from
HIV-1 seroconversion to an AIDS-defining illness). An observation was right
censored if the subject was AIDS free on September 1, 1993, or was lost to
follow-up. The censoring issue becomes more complicated when we realize
that both the time of HIV seroconversion and the time of AIDS onset are
known only up to a time interval since those times are determined by periodical
examinations. Such observations are called doubly interval censored, i.e. the
survival time (incubation period of AIDS) is subject to interval censoring on
the left and on the right.
Example 7. Survival with malignant melanoma In the period between 1971
and 1993, approximately 6000 patients with malignant melanoma were treated
by the staffs of the John Wayne Cancer Institute (JWCI) (24). The primary
objective of the study reported here was to examine the efficacy of a new
polyvalent melanoma cell vaccine (MCV) in treating patients with metastatic
disease. Such treated patients represent a subset of the JWCI patients. To provide
appropriate treatments, patients were followed periodically after admission to
detect any change in disease stages. Excluding patients with stage III disease
when first seen at the JWCI, we had 1548 patients in the data set. By the time of
analysis, 890 had advanced to stage III, 788 of whom died. Beside the indicator
further assumptions about the censoring mechanism when analyzing the data.
Statistical methods for handling censoring mechanisms are discussed in detail
in the next section.
While cases B and C in Figure 1 represent right censoring, Subject D repre-
sents a case with left truncation. Such a problem could happen, for example,
when a subject in the AIDS study was already HIV-1 seropositive prior to enroll-
ment and the time variable of interest is the incubation period of AIDS. There
are other situations, such as for Subject E, in which the observation is both left
and right censored; we call such observations doubly censored. An example of
this sort exists in the AIDS study example where a subject was already HIV-1
seropositive when enrolled but was still AIDS-free at the end of the study. A
crucial question here is whether the time ($D_L$) from the beginning of the risk
period to the beginning of the observation period is known. If $D_L$ is observed,
then one could apply the methodologies developed for no censoring (or right
censoring) for analysis with proper adjustment of the risk set. However, if $D_L$
is not observed, then we cannot specify the origin of the survival time. In this
case external information such as the distribution, over chronological time, of
the original event is needed (14, 17, 32).
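When $D_L$ is observed, one common way to adjust the risk set is to let each subject contribute to the product-limit estimate only after its entry time. The following is a minimal sketch under that assumption; the function name and the toy data are hypothetical, and all times are measured from the origin of the risk period.

```python
from typing import List, Tuple


def product_limit_delayed_entry(
    entry: List[float], exit_time: List[float], event: List[int]
) -> List[Tuple[float, float]]:
    """Product-limit estimate with the risk set adjusted for delayed entry.

    entry[i] is the observed D_L for subject i (0 if followed from origin),
    exit_time[i] is the failure or censoring time, event[i] is 1 for an
    observed failure and 0 for right censoring. Assumes entry[i] < exit_time[i].
    Returns a list of (failure time, estimated survival just after it).
    """
    failure_times = sorted({x for x, d in zip(exit_time, event) if d == 1})
    surv = 1.0
    curve = []
    for t in failure_times:
        # A subject is at risk at t only if it has already entered
        # observation and has not yet failed or been censored.
        at_risk = sum(1 for e, x in zip(entry, exit_time) if e < t <= x)
        failures = sum(1 for x, d in zip(exit_time, event) if x == t and d == 1)
        surv *= 1.0 - failures / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical data: two subjects followed from origin, two entering late.
curve = product_limit_delayed_entry(
    entry=[0.0, 0.0, 2.0, 3.0],
    exit_time=[1.0, 4.0, 5.0, 6.0],
    event=[1, 1, 0, 1],
)
```

In this toy data set the late entrants are not counted in the risk set at the first failure time, which is the adjustment the paragraph above refers to.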
Finally, in most applications there are cases where the origin and the event
both occur prior to the start of follow-up or after follow-up ends. Such cases
generally do not affect the analysis but they can affect the generalizability
of the findings. Subjects F and G in Figure 1 represent such cases known as
completely right censored and completely left censored, respectively.
censored data: complete data analysis, the imputation approach, analysis with
dichotomized data, and the likelihood-based approach.
Complete-data analysis As many researchers and statistical packages do when
faced with incomplete data, one can simply ignore the censored observations
and analyze only the uncensored complete observations. The main advantage
of this approach is simplicity. However, there are some disadvantages. (a) Loss
of efficiency: The loss in sample size can be considerable since it is not unusual,
especially in medical or epidemiological studies, that 50% or more observations
are censored. (b) Estimation bias: Inferences based on analyzing the uncen-
sored observations only may be biased. It is a common misconception that
one need not make any assumptions about the censoring mechanism when per-
forming a complete-data analysis. In reality, such an analysis requires a strong
assumption regarding the censoring mechanism: As in other incomplete-data
situations, complete-data analysis produces unbiased estimates only if the missing
(censored) observations are missing (censored) completely at random (25).
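The estimation bias of a complete-data analysis can be seen in a small simulation. The sketch below is illustrative only: it assumes exponential survival times with mean 10, independent uniform censoring times on (0, 20), and hypothetical function and variable names. It compares the mean of the uncensored observations with an estimate that uses the censored observations (total follow-up time divided by the number of failures, the exponential maximum likelihood estimate).

```python
import random
from typing import Tuple


def simulate_complete_case_bias(n: int = 20000, seed: int = 0) -> Tuple[float, float]:
    """Return (complete-case mean, censoring-aware mean) for simulated data."""
    rng = random.Random(seed)
    true_mean = 10.0
    total_time = 0.0       # follow-up time over all subjects
    uncensored_time = 0.0  # follow-up time over uncensored subjects only
    failures = 0
    for _ in range(n):
        t = rng.expovariate(1.0 / true_mean)  # survival time
        c = rng.uniform(0.0, 20.0)            # independent censoring time
        y = min(t, c)
        total_time += y
        if t <= c:
            failures += 1
            uncensored_time += y
    complete_case_mean = uncensored_time / failures
    censoring_aware_mean = total_time / failures
    return complete_case_mean, censoring_aware_mean


complete_case, censoring_aware = simulate_complete_case_bias()
print(f"complete-case mean: {complete_case:.2f}")
print(f"censoring-aware mean: {censoring_aware:.2f} (true mean is 10)")
```

Long survival times are more likely to be censored, so the uncensored subset over-represents short times and its mean falls well below the true mean, even though the censoring times themselves were generated independently of survival.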
Imputation approach Although imputation is one of the popular approaches
for handling incomplete data, it may not be appropriate for censored data. In the
context of right censoring, there are two extreme ways to impute the missing
survival times: (a) assuming all censored cases fail right after the time of
censoring, that is, left-point imputation or (b) assuming all censored cases never
fail, that is, right-point imputation. It is clear that neither of these approaches
is appropriate since the survival probabilities would be underestimated and
overestimated, respectively. Another approach is to assume that the failure
time after censoring follows a specific model and estimate the model parameters
in order to impute the residual survival time (time from censoring to failure).
However, this approach depends on the model assumptions, which are very
difficult to check without information on survival after censoring (the missing
information).
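The two extreme imputations can be read as crude lower and upper bounds on the survival probability at a given time. The sketch below computes both for a small right-censored sample; the function name and the data are hypothetical.

```python
from typing import List, Tuple


def extreme_imputation_bounds(
    times: List[float], events: List[int], t: float
) -> Tuple[float, float]:
    """Survival probability at time t under the two extreme imputations.

    times[i] is the observed time, events[i] is 1 for a failure and 0 for
    right censoring. Returns (left-point estimate, right-point estimate).
    """
    n = len(times)
    # Left-point imputation: censored cases fail at their censoring time,
    # so every subject is treated as a failure at its observed time.
    left_point = sum(1 for y in times if y > t) / n
    # Right-point imputation: censored cases never fail, so only observed
    # failures at or before t count against survival.
    right_point = sum(1 for y, d in zip(times, events) if y > t or d == 0) / n
    return left_point, right_point


# Hypothetical sample with two censored observations (at times 3 and 7).
lower, upper = extreme_imputation_bounds([2, 3, 5, 7, 8], [1, 0, 1, 0, 1], t=6)
print(lower, upper)
```

The gap between the two values widens as the proportion of censored observations grows, which is one way to see why neither extreme is a satisfactory estimate.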
versus nonoccurrence of the event within a fixed period of time and disregards
the survival times. In this case, the dichotomized data can be easily analyzed
by the standard techniques for binary outcomes, such as contingency tables and
logistic regression. However, there are some disadvantages of this approach:
2. Variability in the timing of the event among those who had the event within
the observation period cannot be modeled. Let us consider an extreme
example: Suppose we are studying the effect of a new drug on patients who
underwent surgery for a particular disease. Eighty percent of the subjects
who have the placebo have recurrence shortly after the surgery, while 80% of
the subjects who took the new drug remained disease free for at least 5 years
but had recurrence within 10 years. All other patients did not have recurrence
after 10 years. If we analyze these data with dichotomized outcomes, we
may find no difference between the treatment groups when the observation
period is 10 years. However, there would likely be a significant difference
between the treatment groups when the observation period is 5 years.
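The extreme example above can be checked with a few lines of arithmetic. The sketch below assumes 100 patients per group and places the recurrences at hypothetical times consistent with the description (0.5 years for the placebo group, 7 years for the new drug); None marks a patient with no recurrence within 10 years.

```python
from typing import List, Optional


def proportion_with_event(
    recurrence_times: List[Optional[float]], horizon: float
) -> float:
    """Proportion of patients whose recurrence occurred by the horizon."""
    events = sum(1 for t in recurrence_times if t is not None and t <= horizon)
    return events / len(recurrence_times)


# Hypothetical groups matching the 80% figures in the example.
placebo = [0.5] * 80 + [None] * 20
new_drug = [7.0] * 80 + [None] * 20

for horizon in (5.0, 10.0):
    print(
        horizon,
        proportion_with_event(placebo, horizon),
        proportion_with_event(new_drug, horizon),
    )
```

With a 10-year observation period both groups show the same recurrence proportion, while a 5-year period shows 80% against none, so the conclusion of a dichotomized analysis depends entirely on the chosen observation period.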
ABOUT CENSORING
To analyze data of this kind, we may proceed by considering the joint dis-
tribution of T (survival time) and C (censoring time), that is, the likelihood
function

$$L = \prod_{i=1}^{n} f_i(t_i, c_i), \tag{2}$$

where $f_i(t_i, c_i)$ is the density function of survival time $T = t_i$ and censoring time
$C = c_i$. However, we are often not interested in modeling both the survival
time and censoring time. Instead, we are only interested in the distribution
of the survival time or the effects of certain covariates on the survival time.
Furthermore, for the ith subject, we only observe $y_i = \min\{t_i, c_i\}$ and the
censoring indicator $\delta_i$ instead of $(t_i, c_i)$. Under these conditions, the distribution
function of T is described by statisticians as being nonidentifiable (33) unless
we make further assumptions. Here nonidentifiability means that there exists
more than one distribution function of T that is compatible with the data.
A simple but commonly used assumption to resolve this problem is indepen-
dent censoring, that is, we assume that the survival time T and the censoring time
C are independent. Note that some authors use the term random censoring when
they actually mean independent censoring. Under the independent censoring
assumption, analyses can be simply based on the likelihood function (Eq. 1).
For left truncation, data can be handled in a similar manner to right censoring
(for example, 14, 32). For interval censoring, the situation is slightly different
in that it requires the knowledge of the examination scheme (prospective study)
or sampling plan (retrospective study) as explained below.
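For right-censored data, the reason independent censoring simplifies the analysis can be sketched as follows. Eq. 1 is not reproduced in this section, so the notation here is an assumption: $f$ and $S$ denote the density and survivor function of T, $g$ and $G$ those of C, and $\delta_i = 1$ when the failure of subject i is observed. Under independence the contribution of each subject factorizes, and if $g$ and $G$ do not involve the parameters of interest they can be dropped:

```latex
\begin{aligned}
L &= \prod_{i=1}^{n} \bigl[ f(y_i)\, G(y_i) \bigr]^{\delta_i}
                     \bigl[ g(y_i)\, S(y_i) \bigr]^{1-\delta_i} \\
  &\propto \prod_{i=1}^{n} f(y_i)^{\delta_i}\, S(y_i)^{1-\delta_i}.
\end{aligned}
```

The last expression is the usual likelihood for right-censored data; it depends on the censoring times only through the observed $y_i$ and $\delta_i$.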
Suppose the disease process is denoted by $X(t)$, indicating the disease state at
time t. We assume that a particular subject is observed at times $t_0 < t_1 < \cdots < t_m$
to be in states $s_0, s_1, \ldots, s_m$, respectively. Here $s_j$, $j = 1, \ldots, m$, could
represent the disease stages. For these observations, the likelihood is given by

$$L = \Pr\{X(t_0) = s_0, \ldots, X(t_m) = s_m;\ T_0 = t_0, \ldots, T_m = t_m;\ M = m\}. \tag{3}$$

Here, not only are the examination times $T_0, T_1, \ldots, T_m$ assumed random, the
Interval Censoring
Compared with right censoring situations, relatively few articles were devoted
to the problem of ignorability of interval censoring. Similar to the defini-
tion of noninformative censoring for right-censored data, Gruger et al (10)
2. Random sampling Under this scheme, all subjects are examined or ob-
served in a more or less random fashion and the examination times are inde-
pendent of the subjects’ disease history.
3. Doctor’s care Under this scheme, the next examination time is chosen on
the basis of the subject’s current observed status. For example, if a patient is
in a critical stage, then the time of the next examination will be chosen to be in
the very near future.
Under the above examination schemes, Gruger et al (10) showed that the
likelihood function (Eq. 4) can be used to obtain an estimate of the survival
function (4, 23) or estimates of the regression coefficients of survival times on
the covariates (7, 18, 24). Gruger et al (10) considered another situation called
patient self-selection. In this scheme a patient's examination is initiated when
the patient feels unwell and/or when symptoms suggest that the disease is
advancing. Alternatively, a patient who feels unwell might refuse to appear
for examination because of loss of confidence in the efficacy of the treatment.
Under this scheme, the examination times are no longer noninformative and
must be taken into account when analyzing the data. That task, however,
requires modeling the joint distribution of the disease state and the examination
times [that is, the likelihood function (Eq. 3)]. Gruger et al (10) suggested that
investigators should plan in advance in order to avoid this difficulty.
Unlike the case of right censoring, there are very few articles on testing
the assumption of noninformative examination schemes for interval-censored
data. Heitjan (12) and Heitjan & Rubin (13) introduced the concept of “coarse”
data that have right censoring and interval censoring as special cases. They
provided the conditions under which it is appropriate to ignore the stochastic
nature of the coarsening (censoring), and called such conditions coarsening at
random. However, parametric assumptions are generally required in order to
test the coarsening at random conditions.
Lagakos & Williams (22) obtained the maximum likelihood estimates (stan-
dard errors) of the cone model discussed in the last section as λ̂ = 0.0409
(0.0123) and θ̂ = 0.25 (0.36). Notice that θ̂ is significantly different from one
(recall that θ = 1 means censoring is noninformative) based on a large-sample
test of H0 : θ = 1, and hence the noninformative censoring assumption is not
satisfied. We also obtained the estimates of the upper and the lower bounds
of the survival function based on Peterson’s procedure (27), and the estimated
survival function based on the procedures proposed by Fisher & Kanarek (8),
Slud & Rubinstein (31), and Klein & Moeschberger (19) with various model
parameters. Figure 4 displays the estimates of the survival function together
with the empirical distribution function derived from all 61 complete obser-
vations. Each of the three models that account for nonignorable censoring
Figure 4 Estimates of the survival function: lung cancer example. Kaplan-Meier estimate (filled diamond, three dashes); Lagakos-Williams estimate (four dashes); Peterson estimated bounds (•, two dashes); Fisher-Kanarek estimate (open triangle, two dashes); Slud-Rubinstein's estimate (+, two dashes); Klein-Moeschberger's estimate (∇, two dashes).
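The large-sample test of H0: θ = 1 mentioned for the cone model can be approximated from the reported estimate and standard error. The source does not state which large-sample test was used; the sketch below assumes a two-sided Wald z-test, and the function name is hypothetical.

```python
import math
from typing import Tuple


def wald_test(estimate: float, std_error: float, null_value: float) -> Tuple[float, float]:
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = (estimate - null_value) / std_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Reported estimate and standard error of theta, tested against theta = 1.
z, p_value = wald_test(estimate=0.25, std_error=0.36, null_value=1.0)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Under this assumption the statistic is a little beyond two standard errors below one, which is consistent with the statement that the estimate differs significantly from one at the conventional 5% level.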
the data with techniques for right censoring and for interval censoring using
Turnbull's estimate (34). We call this approach, i.e. replacing an
interval-censored observation by its right endpoint, right imputation.
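For readers unfamiliar with Turnbull's estimate, the following is a simplified sketch of the self-consistency (EM) iteration behind it, not the implementation used in the study. It assumes every observation is an interval (left, right] with left < right, uses the cells between consecutive endpoints as candidate support, and runs a fixed number of iterations; the function name and the data are hypothetical.

```python
import math
from typing import List, Tuple


def turnbull_em(
    intervals: List[Tuple[float, float]], n_iter: int = 500
) -> Tuple[List[Tuple[float, float]], List[float]]:
    """Estimate the probability mass in each cell from interval-censored data.

    intervals holds (left, right] pairs; right = math.inf marks right
    censoring. Returns the cells and the estimated mass in each cell.
    """
    points = sorted({x for pair in intervals for x in pair} | {0.0})
    cells = list(zip(points[:-1], points[1:]))
    # alpha[i][j] is 1 when cell j lies inside the interval of subject i.
    alpha = [
        [1.0 if left <= a and b <= right else 0.0 for a, b in cells]
        for left, right in intervals
    ]
    n = len(intervals)
    mass = [1.0 / len(cells)] * len(cells)
    for _ in range(n_iter):
        updated = [0.0] * len(cells)
        for row in alpha:
            # Share each subject across its admissible cells in
            # proportion to the current mass estimates.
            denom = sum(a * m for a, m in zip(row, mass))
            for j, a in enumerate(row):
                updated[j] += a * mass[j] / denom / n
        mass = updated
    return cells, mass


# Hypothetical data: onset in (0, 2] twice, in (2, 4] once, and one
# subject right censored at time 2.
cells, mass = turnbull_em([(0, 2), (0, 2), (2, 4), (2, math.inf)])
```

Right imputation would instead place the first three events exactly at times 2, 2, and 4 and then apply the Kaplan-Meier estimator, discarding the information that each event may have occurred anywhere earlier in its interval.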
Figure 5 displays the Kaplan-Meier estimate of the probability of symp-
toms based on right-imputed data and the Turnbull estimate (a generalization
of the Kaplan-Meier estimator for interval censoring). Note that the Kaplan-
Meier estimate underestimates the probability of symptoms in early follow-up
and overestimates it in late follow-up, thus resulting in an overestimate of the
length of survival. This over-optimistic estimate occurs because
the time to respiratory symptoms appears to be longer than it actually is when
II disease, the intermediate event is disease metastasis, and the final event is
death. Since the time of disease metastasis is known only to be between the
time of the last stage II disease and the time of the first stage III disease, the
time of the intermediate event is left interval-censored when one computes sur-
vival time for post-disease metastasis. In this example, the times of the final
events are either known exactly or they are right censored (i.e. there is no right
interval-censoring).
Because of the left interval-censoring, we cannot directly apply the standard
approaches for right-censored data for analysis. As mentioned in the second
section, a simple analytic approach is to impute the time of the intermediate
event (disease metastasis) by the right-point or the mid-point of the time interval
and then apply the standard techniques for right-censored data. However, this
approach may not be appropriate, as will be seen below. Table 2 presents the
estimates of the regression coefficients from the imputation approaches and
from an approach that takes the interval censoring into account as proposed by
Leung & Elashoff (24). Basically, they assumed that the time of the event within
the censored interval is governed by an unknown distribution, and proposed an
estimate of the distribution.
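To make the two imputation rules concrete, the sketch below computes the post-metastasis survival time for a single patient under right-point and mid-point imputation of the intermediate event. The function name and the visit times (in months) are hypothetical; this illustrates only the imputation step, not the method of Leung & Elashoff.

```python
def post_metastasis_survival(
    last_stage2_visit: float, first_stage3_visit: float, death: float, rule: str
) -> float:
    """Survival time after metastasis under a simple imputation rule.

    Metastasis is only known to lie between the last stage II visit and
    the first stage III visit. rule is "right" or "mid".
    """
    if rule == "right":
        onset = first_stage3_visit
    elif rule == "mid":
        onset = (last_stage2_visit + first_stage3_visit) / 2.0
    else:
        raise ValueError("rule must be 'right' or 'mid'")
    return death - onset


# Hypothetical patient: stage II at month 10, stage III at month 16,
# death at month 30.
print(post_metastasis_survival(10.0, 16.0, 30.0, "right"))
print(post_metastasis_survival(10.0, 16.0, 30.0, "mid"))
```

The two rules give different survival times for the same patient, and each treats the imputed onset as if it were known exactly, which is one reason the resulting standard errors can be too small.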
Comparison of these estimates suggests that large differences exist among
methods, especially for the estimates related to the interval censored time (the
time from the first diagnosed stage II disease to disease metastasis). First,
the imputation approaches led to very different estimates on the effect of site
and the effect of time between the first diagnosed stage II disease and dis-
ease metastasis (metastases) as compared to the “correct” method. Second,
the imputation approaches underestimate the standard errors of the regression
coefficient estimates. In summary, this example shows that one might obtain
biased estimates and incorrect statistical inferences by falsely assuming that the
time of event is equal to the right-point or the mid-point of the time interval.

Table 2 Parameter estimates of the Weibull proportional hazards model, Melanoma Study (example 7)*

MCV treatment (1 = treated, 0 = control)                   −0.194 (0.097)     −0.197 (0.076)     −0.311 (0.076)
Breslow depth (1 = depth ≥ 1.8 mm, 0 = depth < 1.8 mm)     −0.113 (0.104)     −0.068 (0.075)     −0.046 (0.075)
Gender (1 = male, 0 = female)                               0.241 (0.100)      0.271 (0.076)      0.241 (0.075)
Metastasis site (1 = distant, 0 = others)                   0.298 (0.126)      0.452 (0.088)      0.541 (0.088)
I{S ≥ 2 years}**                                           −0.991 (0.111)     −0.687 (0.082)     −0.087 (0.097)
λ1 (scale parameter of Weibull dist.)                       0.0345 (0.0069)    0.0499 (0.0073)    0.0887 (0.0131)
λ2 (shape parameter of Weibull dist.)                       1.161 (0.041)      1.043 (0.026)      0.854 (0.022)

* The survival time is defined as the time from disease metastasis to death.
** S represents the time from the first diagnosed stage II disease to disease metastasis, and I{S ≥ 2 years} = 1 if S ≥ 2 years and = 0 otherwise.
† See Reference 24.

For interval censoring (both singly and doubly interval censoring), researchers
often assume that the occurrence of an event coincides with the reporting time
(that is, right-imputation). In writing this paper, our intention was to direct
investigators’ attention to the dangers involved in analyzing censored survival
data and point out some techniques to avoid these pitfalls.
An approach to investigating the situation was explored by Fisher & Kanarek
(8): “In some situations a subset of the loss-to-follow-up cases may in fact
be followed although at considerable expense.” With such information, the
investigator can either test the assumption of noninformative censoring (36) or
estimate the risk of informative censoring (1). However, this strategy can be
very expensive and can only be done when censoring is nonlethal.
All the methods dealing with informative censoring discussed in the literature
assume that all censored cases are either all informative or all noninformative.
In practice, there are many situations in which all three kinds of censoring
(positive dependence, negative dependence, and noninformative censoring) are
present in one sample. Thus, it would be useful to extend the existing methods
to deal with all these situations. Furthermore, the methods described here for
estimating the survival function under various conditions assume a fixed model
parameter (Fisher-Kanarek’s α, Slud-Rubinstein’s ρ and Klein-Moeschberger’s
θ; see the last two sections for details). In practice, however, these values are
rarely known. Thus, it would also be useful to have guidelines that help
investigators determine the value, or at least a reasonable range, of the model
parameters based on sample data.
When covariates are available, sometimes it may be possible to recover some
of the information lost by identifying a surrogate response variable measured on
the censored subjects and using it to predict the residual survival time. Although
Cox (3) initiated this idea, no follow-up on this subject has appeared in the
literature.
Finally, the best way to handle censoring is to prevent it from happening
by a good design; no matter how effective the statistical methods are, some
information will be lost when analyzing censored data (2, 6, 35). These design
Literature Cited
1. Baker SG, Wax Y, Patterson BH. 1993. Regression analysis of grouped survival data: informative censoring and double sampling. Biometrics 49:379–89
2. Brook RJ. 1982. On the loss of information through censoring. Biometrika 69:137–
... censoring and left truncation. Biometrika 64:225–30
15. Jennrich R. 1983. A note on the behavior of the log rank permutation test under unequal censoring. Biometrika 70:133–37
16. Kalbfleisch JD, McKay RJ. 1979. On
... interval censoring in longitudinal data of respiratory symptoms in aluminium potroom workers: a comparison of methods. Stat. Med. 13:1771–80
29. Saunders B, Baily S, Phillips M, Allsop S. 1993. Women with alcohol problems: Do they relapse for reasons different to their male counterparts? Addiction 88:1413–22
30. Slud E, Byar D. 1988. How dependent causes of death can make risk factors appear protective. Biometrics 44:265–69
31. Slud EV, Rubinstein LV. 1983. Dependent competing risks and summary survival curves. Biometrika 70:643–49
32. Tsai WY, Jewell NP, Wang MC. 1987. A note on the product-limit estimator under right censoring and left truncation. Biometrika 74:883–86
33. Tsiatis A. 1975. A nonidentifiability aspect of the problem of competing risks. Proc. Natl. Acad. Sci. USA 72:20–22
34. Turnbull BW. 1976. The empirical distribution function with arbitrarily grouped, censored and truncated data. J. R. Stat. Soc. B 38:290–95
35. Turrero A. 1989. On the relative efficiency of grouped and censored survival data. Biometrika 76:125–31
36. Wax Y, Baker SG, Patterson BH. 1993. A score test for noninformative censoring using doubly sampled grouped survival data. Appl. Stat. 42:159–72