What Is A Cox Model?: Supported by Sanofi-Aventis
Supported by sanofi-aventis
What is a Cox model?
Stephen J Walters BSc MSc PhD CStat, Reader in Medical Statistics, School of Health and Related Research (ScHARR), University of Sheffield

● A Cox model is a statistical technique for exploring the relationship between the survival of a patient and several explanatory variables.
● Survival analysis is concerned with studying the time between entry to a study and a subsequent event (such as death).
Columns:
A  Survival time (years)
B  Number at risk at start of study
C  Number of deaths
D  Number censored
E  Proportion surviving until end of interval
F  Cumulative proportion surviving

A         B     C    D    E                   F
0.909     10    1    0    1 - 1/10 = 0.900    0.900
1.112      9    1    0    1 - 1/9  = 0.889    0.800
1.322*     8    0    1    1 - 0/8  = 1.000    0.800
1.328      7    1    0    1 - 1/7  = 0.857    0.686
1.536      6    1    0    1 - 1/6  = 0.833    0.571
2.713      5    1    0    1 - 1/5  = 0.800    0.457
2.741*     4    0    1    1 - 0/4  = 1.000    0.457
2.743      3    1    0    1 - 1/3  = 0.667    0.305
3.524*     2    0    1    1 - 0/2  = 1.000    0.305
4.079*     1    0    1    1 - 0/1  = 1.000    0.305

* Indicates a censored survival time
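
The arithmetic in columns E and F can be reproduced with a short script. The following is a minimal Python sketch of the product-limit (Kaplan-Meier) calculation, assuming one patient per survival time (no ties), as in the table. The names `observations` and `kaplan_meier` are illustrative and not part of the original text.

```python
# Survival times (years) with an event flag:
# True = death observed, False = censored (marked * in the table).
observations = [
    (0.909, True), (1.112, True), (1.322, False), (1.328, True),
    (1.536, True), (2.713, True), (2.741, False), (2.743, True),
    (3.524, False), (4.079, False),
]


def kaplan_meier(obs):
    """Return rows of (time, at_risk, interval_survival, cumulative_survival).

    Assumes each time belongs to exactly one patient (no tied times).
    """
    rows = []
    at_risk = len(obs)          # column B for the first row
    cumulative = 1.0            # everyone is alive at time zero
    for time, died in sorted(obs):
        deaths = 1 if died else 0
        interval = 1 - deaths / at_risk   # column E
        cumulative *= interval            # column F
        rows.append((time, at_risk, interval, cumulative))
        at_risk -= 1            # one patient leaves the risk set (death or censoring)
    return rows


km_rows = kaplan_meier(observations)
for time, at_risk, interval, cumulative in km_rows:
    print(f"{time:.3f}  {at_risk:2d}  {interval:.3f}  {cumulative:.3f}")
```

A censored time leaves the cumulative proportion unchanged but still reduces the number at risk for the next row, which is why the later deaths produce larger drops.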
[Figure: Control versus Interferon. Horizontal axis: time from randomisation to death (years). Hazard ratio 0.92 (95% CI: 0.74 to 1.13); p=0.411 (logrank).]

Number at risk at 0, 2, 4, 6 and 8 years:
Control     336  203  97  22  0
Interferon  338  215  84  23  0
given that the individual has survived up to the beginning of the interval. It can therefore be interpreted as the risk of dying at time t. The hazard function, denoted by h(t), can be estimated using the following equation:

\[
h(t) = \frac{\text{number of individuals experiencing an event in interval beginning at } t}{(\text{number of individuals surviving at time } t) \times (\text{interval width})}
\]

What is regression?

If we want to describe the relationship between the values of two or more variables we can use a statistical technique called regression [7]. If we have observed the values of two variables, X (for example, age of children) and Y (for example, height of children), we can perform a regression of Y on X. We are investigating the relationship between a dependent variable (the height of children) and the explanatory variable (the age of children).

When more than one explanatory (X) variable needs to be taken into account (for example, height of the father), the method is known as multiple regression. Cox's method is similar to multiple regression analysis, except that the dependent (Y) variable is the hazard function at a given time. If we have several explanatory (X) variables of interest (for example, age, sex and treatment group), then we can express the hazard or risk of dying at time t as:

\[
h(t) = h_0(t) \times \exp(b_{age} \cdot age + b_{sex} \cdot sex + \dots + b_{group} \cdot group)
\]

Taking natural logarithms of both sides:

\[
\ln h(t) = \ln h_0(t) + b_{age} \cdot age + b_{sex} \cdot sex + \dots + b_{group} \cdot group
\]

The quantity h_0(t) is the baseline or underlying hazard function and corresponds to the probability of dying (or reaching an event) when all the explanatory variables are zero. The baseline hazard function is analogous to the intercept in ordinary regression (since exp(0) = 1).

The regression coefficients b_age to b_group give the proportional change that can be expected in the hazard, related to changes in the explanatory variables. They are estimated by a complex statistical method called maximum likelihood [6], using an appropriate computer program (for example, SAS, SPSS or STATA).

The assumption of a constant relationship between the dependent variable and the explanatory variables is called proportional
Figure 3. Complementary log-log plot [3]. Horizontal axis: ln (time).
Table 2. Cox regression model fitted to the data from the AIM HIGH trial of interferon versus
no further treatment (control) in malignant melanoma (n=674)
CI: confidence interval; LM: locally metastatic; RMD: regionally metastatic at diagnosis; RMR: regionally metastatic at recurrence
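
To make the model form h(t) = h0(t) x exp(b_age.age + b_sex.sex + ... + b_group.group) more concrete, the following Python sketch computes the exponential term, which is the factor by which a patient's hazard differs from the baseline hazard. The coefficient values are hypothetical and chosen only for illustration; they are not estimates from the AIM HIGH trial, and `relative_hazard` is an illustrative name rather than a function from any statistics package.

```python
import math

# Hypothetical regression coefficients, for illustration only
# (NOT taken from Table 2 or the AIM HIGH trial).
coefficients = {"age": 0.02, "sex": 0.30, "group": -0.08}


def relative_hazard(covariates, coefs):
    """Return exp(b_age*age + b_sex*sex + ... + b_group*group).

    This is the multiplier applied to the baseline hazard h0(t).
    """
    linear_predictor = sum(coefs[name] * value for name, value in covariates.items())
    return math.exp(linear_predictor)


# With every explanatory variable at zero the multiplier is exp(0) = 1,
# so the hazard equals the baseline hazard h0(t).
baseline = relative_hazard({"age": 0, "sex": 0, "group": 0}, coefficients)

# Two patients who differ only in treatment group (0 = control, 1 = treated).
control = relative_hazard({"age": 50, "sex": 1, "group": 0}, coefficients)
treated = relative_hazard({"age": 50, "sex": 1, "group": 1}, coefficients)

# The ratio depends only on b_group; age, sex and h0(t) cancel out.
hazard_ratio = treated / control
print(f"baseline multiplier: {baseline:.3f}")
print(f"hazard ratio (treated vs control): {hazard_ratio:.3f}")
```

Because the baseline hazard cancels in the ratio, the hazard ratio for a one-unit change in a variable is exp(b) for that variable's coefficient, whatever the value of t.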