RCsplines
William D. Dupont
W. Dale Plummer
Department of Biostatistics
Vanderbilt University Medical School
Nashville, Tennessee
{ Figure: Example of a restricted cubic spline with three knots, t1, t2, and t3 }
where
$x_1 = x$
$(u)_+ = \begin{cases} u & : u > 0 \\ 0 & : u \le 0 \end{cases}$
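The definitions of the remaining covariates did not survive on this slide. As a hedged sketch, the Python function below assumes the normalized construction described in Harrell (2001): each additional covariate combines truncated cubes of (u)+ at knot t_j and at the last two knots, divided by (t_k - t_1)^2. The names pos, rcs_basis, and MEANBP_KNOTS are hypothetical, and this is not the rc_spline source code. The knots are the five values reported for meanbp later in these slides.

```python
def pos(u):
    """Positive part (u)+ : u if u > 0, otherwise 0."""
    return u if u > 0 else 0.0


def rcs_basis(x, knots):
    """Return the k-1 spline covariates [x1, ..., x_{k-1}] for one value x.

    Assumed form (Harrell 2001), for j = 1, ..., k-2:
      x_{j+1} = [ (x - t_j)+^3
                  - (x - t_{k-1})+^3 * (t_k - t_j) / (t_k - t_{k-1})
                  + (x - t_k)+^3 * (t_{k-1} - t_j) / (t_k - t_{k-1}) ]
                / (t_k - t_1)^2
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")
    scale = (t[-1] - t[0]) ** 2   # (t_k - t_1)^2
    gap = t[-1] - t[-2]           # t_k - t_{k-1}
    basis = [float(x)]            # x1 = x
    for j in range(k - 2):
        term = (pos(x - t[j]) ** 3
                - pos(x - t[-2]) ** 3 * (t[-1] - t[j]) / gap
                + pos(x - t[-1]) ** 3 * (t[-2] - t[j]) / gap)
        basis.append(term / scale)
    return basis


# Knot values reported by rc_spline for meanbp in these slides.
MEANBP_KNOTS = [47, 66, 78, 106, 129]

for bp in (60, 90, 120):
    print(bp, [round(v, 5) for v in rcs_basis(bp, MEANBP_KNOTS)])
```

With these knots the second covariate at meanbp = 60 is (60 - 47)^3 / (129 - 47)^2 = 0.32674, which agrees with the _Smeanbp2 value listed for that patient later in the slides. That agreement suggests, but does not prove, that rc_spline uses the same scaling.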
These covariates are
Default knot locations are placed at the quantiles of the
x variable given in the following table (Harrell 2001).
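Number of    Knot locations expressed in quantiles
knots (k)    of the x variable
---------------------------------------------------------------
    3        0.1    0.5    0.9
    4        0.05   0.35   0.65   0.95
    5        0.05   0.275  0.5    0.725  0.95
    6        0.05   0.23   0.41   0.59   0.77   0.95
    7        0.03   0.183  0.342  0.5    0.658  0.817  0.98
---------------------------------------------------------------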
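As an illustration of how the table is used, the sketch below looks up the quantiles for a given number of knots and evaluates them on a variable. It assumes linear interpolation between order statistics; Stata's own centile rule may differ slightly, so actual rc_spline knots could differ in the last digit. The names DEFAULT_KNOT_QUANTILES, quantile, and default_knots, and the toy data, are hypothetical.

```python
# Quantiles copied from the table above (Harrell 2001).
DEFAULT_KNOT_QUANTILES = {
    3: [0.1, 0.5, 0.9],
    4: [0.05, 0.35, 0.65, 0.95],
    5: [0.05, 0.275, 0.5, 0.725, 0.95],
    6: [0.05, 0.23, 0.41, 0.59, 0.77, 0.95],
    7: [0.03, 0.183, 0.342, 0.5, 0.658, 0.817, 0.98],
}


def quantile(sorted_values, q):
    """Quantile q of pre-sorted data by linear interpolation (an assumed rule)."""
    position = q * (len(sorted_values) - 1)
    lo = int(position)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = position - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def default_knots(values, nknots=5):
    """Default knot locations for the given number of knots."""
    ordered = sorted(values)
    return [quantile(ordered, q) for q in DEFAULT_KNOT_QUANTILES[nknots]]


# Toy data, not the SUPPORT data: the integers 0, 1, ..., 200.
toy_x = list(range(201))
print(default_knots(toy_x, nknots=5))
```

Placing knots at fixed quantiles puts them where the data are dense, while keeping the outer knots away from the extremes, where the spline is constrained to be linear.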
SUPPORT Study
A prospective observational study of hospitalized patients
{ Figure: Length of Stay (days), linear axis from 0 to 250 }
. gen log_los = log(los)
. rc_spline meanbp
number of knots = 5
value of knot 1 = 47
value of knot 2 = 66
value of knot 3 = 78
value of knot 4 = 106
value of knot 5 = 129
Regress log_los against all variables that start with the letters _S.
That is, against _Smeanbp1, _Smeanbp2, _Smeanbp3, and _Smeanbp4.
. regress log_los _S*
------------------------------------------------------------------------------
log_los | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Smeanbp1 | .0296009 .0059566 4.97 0.000 .017912 .0412899
_Smeanbp2 | -.3317922 .0496932 -6.68 0.000 -.4293081 -.2342762
_Smeanbp3 | 1.263893 .1942993 6.50 0.000 .8826076 1.645178
_Smeanbp4 | -1.124065 .1890722 -5.95 0.000 -1.495092 -.7530367
_cons | 1.03603 .3250107 3.19 0.001 .3982422 1.673819
------------------------------------------------------------------------------
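To show how the fitted coefficients turn into an expected length of stay, the sketch below evaluates the linear predictor by hand and back-transforms it with exp(). It assumes the spline covariates for meanbp = 60, 90, and 120 are those in the listing that appears later in these slides, which holds if rc_spline was run with the same five knots. The names COEF, ROWS, and linear_predictor are hypothetical.

```python
import math

# Coefficients copied from the regress output above.
COEF = {
    "_Smeanbp1": 0.0296009,
    "_Smeanbp2": -0.3317922,
    "_Smeanbp3": 1.263893,
    "_Smeanbp4": -1.124065,
    "_cons": 1.03603,
}
NAMES = ["_Smeanbp1", "_Smeanbp2", "_Smeanbp3", "_Smeanbp4"]

# Spline covariates copied from the listing later in these slides
# (assumed to apply to this model as well).
ROWS = {
    60: [60, 0.32674, 0.0, 0.0],
    90: [90, 11.82436, 2.055919, 0.2569899],
    120: [120, 56.40007, 22.30039, 10.11355],
}


def linear_predictor(row):
    """xb = _cons + sum of coefficient * covariate, on the log(los) scale."""
    return COEF["_cons"] + sum(COEF[n] * v for n, v in zip(NAMES, row))


for bp, row in ROWS.items():
    xb = linear_predictor(row)
    print(f"meanbp = {bp}: xb = {xb:.3f}, exp(xb) = {math.exp(xb):.1f} days")
```

The back-transformed value is on the scale of days but is the exponential of an expected log, not an arithmetic mean. The three values trace a U shape, with the shortest expected stay near meanbp = 90.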
{ Output omitted }
. predict y_hat, xb
{ Figure: observed and expected Length of Stay (days), logarithmic axis from 4 to 200 }
. drop _S* y_hat
{ Figure: Length of Stay (days), logarithmic axis from 4 to 200 }
. predict rstudent, rstudent
Define rstudent to be the studentized residual.
{ Figure: Studentized residuals with a lowess smoother }
{ Figure: two vertical axes (0 to 30 and 0 to 1) against Mean Arterial Blood Pressure (mm Hg), 20 to 180 }
...
------------------------------------------------------------------------------
hospdead | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
meanbp | .9845924 .0028997 -5.27 0.000 .9789254 .9902922
------------------------------------------------------------------------------
. predict p,p
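The odds ratio above is per 1 mm Hg of meanbp, so in this simple model the odds ratio for any other change is that value raised to a power. The sketch below illustrates the arithmetic; odds_ratio_for_change is a hypothetical helper, not a Stata function.

```python
# Odds ratio per 1 mm Hg, copied from the logistic output above.
OR_PER_MMHG = 0.9845924


def odds_ratio_for_change(delta_mmhg):
    """Odds ratio implied by the simple model for a change of delta_mmhg."""
    return OR_PER_MMHG ** delta_mmhg


print(f"60 vs. 90:  {odds_ratio_for_change(60 - 90):.3f}")
print(f"120 vs. 90: {odds_ratio_for_change(120 - 90):.3f}")
```

Because the log odds are linear in meanbp, this model can only describe risk that moves in one direction: it puts the odds ratio for 120 vs. 90 below 1, whereas the spline model reported later in these slides gives about 2.28 for the same comparison.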
{ Figure: Probability of Hospital Death, axis from 0 to 1 }
. drop _S* p
------------------------------------------------------------------------------
hospdead | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Smeanbp1 | -.1055538 .0203216 -5.19 0.000 -.1453834 -.0657241
_Smeanbp2 | .1598036 .1716553 0.93 0.352 -.1766345 .4962418
_Smeanbp3 | .0752005 .6737195 0.11 0.911 -1.245265 1.395666
_Smeanbp4 | -.4721096 .6546662 -0.72 0.471 -1.755232 .8110125
_cons | 5.531072 1.10928 4.99 0.000 3.356923 7.705221
------------------------------------------------------------------------------
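As a hedged illustration of what predict with the p option returns for this model, the sketch below applies the inverse logit to the linear predictor built from the coefficients above. The spline covariates are copied from the listing that appears later in these slides; the names COEF, CONS, ROWS, and prob_death are hypothetical.

```python
import math

# Coefficients for _Smeanbp1 ... _Smeanbp4 and _cons, copied from the output above.
COEF = [-0.1055538, 0.1598036, 0.0752005, -0.4721096]
CONS = 5.531072

# Spline covariates copied from the listing later in these slides.
ROWS = {
    60: [60, 0.32674, 0.0, 0.0],
    90: [90, 11.82436, 2.055919, 0.2569899],
    120: [120, 56.40007, 22.30039, 10.11355],
}


def prob_death(row):
    """Inverse logit of the linear predictor: p = 1 / (1 + exp(-xb))."""
    xb = CONS + sum(c * v for c, v in zip(COEF, row))
    return 1.0 / (1.0 + math.exp(-xb))


for bp, row in ROWS.items():
    print(f"meanbp = {bp}: estimated probability of hospital death = {prob_death(row):.3f}")
```

The estimated probability is lowest at meanbp = 90 and higher at both 60 and 120, a pattern the single-slope logistic model cannot represent.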
. test _Smeanbp2 _Smeanbp3 _Smeanbp4
 ( 1)  _Smeanbp2 = 0
 ( 2)  _Smeanbp3 = 0
 ( 3)  _Smeanbp4 = 0
           chi2(  3) =   80.69
         Prob > chi2 =    0.0000
We reject the null hypothesis that the log odds of death is a
linear function of mean BP.
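The reported Prob > chi2 = 0.0000 is a rounded value. As a small check, the sketch below evaluates the upper tail of a chi-square distribution with 3 degrees of freedom using its closed form, erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2). The statistic 80.69 itself cannot be recomputed here because the coefficient covariance matrix is not shown; chi2_sf_3df is a hypothetical helper name.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df."""
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


p_value = chi2_sf_3df(80.69)
print(f"p = {p_value:.3g}")   # scientific notation shows the magnitude
print(f"p = {p_value:.4f}")   # four decimals, as in the Stata display
```

The three tested coefficients are the nonlinear spline terms, so setting them all to zero reduces the model to one that is linear in meanbp on the log-odds scale.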
{ Figure: Probability of Hospital Death (0 to 1) against Mean Arterial Blood Pressure (mm Hg), 20 to 180 }
+-------------------------------------------+
| _Smean~1 _Smean~2 _Smean~3 _Smean~4 |
|-------------------------------------------|
178. | 60 .32674 0 0 |
575. | 90 11.82436 2.055919 .2569899 |
893. | 120 56.40007 22.30039 10.11355 |
+-------------------------------------------+
( 1) - 30 _Smeanbp1 - 11.49762 _Smeanbp2 - 2.055919 _Smeanbp3 -
> .2569899 _Smeanbp4 = 0
------------------------------------------------------------------------------
hospdead | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 3.65455 1.044734 4.53 0.000 2.086887 6.399835
------------------------------------------------------------------------------
Mortality odds ratio for patients with meanbp = 60 vs. meanbp = 90.
------------------------------------------------------------------------------
hospdead | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 2.283625 .5871892 3.21 0.001 1.379606 3.780023
------------------------------------------------------------------------------
Mortality odds ratio for patients with meanbp = 120 vs. meanbp = 90.
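The two odds ratios can be reproduced by hand, which makes the contrast in constraint ( 1) easier to read: each weight is the difference between the spline covariates of the two patients being compared, and the intercept cancels. The sketch below uses the coefficients and the listing from these slides; the names COEF, ROWS, and odds_ratio are hypothetical, and the confidence intervals are not reproduced because they need the coefficient covariance matrix.

```python
import math

# Logistic coefficients for _Smeanbp1 ... _Smeanbp4, copied from the output above.
COEF = [-0.1055538, 0.1598036, 0.0752005, -0.4721096]

# Spline covariates copied from the listing above.
ROWS = {
    60: [60, 0.32674, 0.0, 0.0],
    90: [90, 11.82436, 2.055919, 0.2569899],
    120: [120, 56.40007, 22.30039, 10.11355],
}


def odds_ratio(bp_a, bp_b):
    """exp of the difference in linear predictors between bp_a and bp_b."""
    diff = [a - b for a, b in zip(ROWS[bp_a], ROWS[bp_b])]
    log_or = sum(c * d for c, d in zip(COEF, diff))
    return math.exp(log_or)


print(f"60 vs. 90:  {odds_ratio(60, 90):.3f}")
print(f"120 vs. 90: {odds_ratio(120, 90):.3f}")
```

For 60 vs. 90 the differences are -30, -11.49762, -2.055919, and -.2569899, the same weights that appear in constraint ( 1).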
General Reference
Harrell FE: Regression Modeling Strategies: With
Applications to Linear Models, Logistic Regression, and
Survival Analysis. New York: Springer, 2001.
Cubic B-Splines
de Boor C: A Practical Guide to Splines.
New York: Springer-Verlag, 1978.
Software
Newson R: sg151, B-splines and splines parameterized
by values at reference points on the x-axis. 2000; STB-57: 20-27.
bspline.ado
Conclusions
Restricted cubic splines can be used with any
regression program that uses a linear predictor
– e.g. regress, logistic, glm, stcox etc.