https://scholarlyrepository.miami.edu/oa_dissertations/2102/
OPEN ACCESS DISSERTATIONS
Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods

Min Lu

Available for download on Thursday, November 14, 2019

Publication Date: 2018-05-23
Availability: Embargoed
Embargo Period: 2019-11-14
Degree Type: Dissertation
Degree Name: Doctor of Philosophy (PHD)
Department: Public Health Sciences (Medicine)
Date of Defense: 2018-05-03
First Committee Member: Hemant Ishwaran
Second Committee Member: J. Sunil Rao
Third Committee Member: Daniel Feaster
Fourth Committee Member: Wei Sun
Abstract
Estimation of individual treatment effect in observational data is complicated due to the challenges of confounding and selection bias. A useful inferential framework to address this is the counterfactual model, which takes the hypothetical stance of asking what if an individual had received both treatments. Making use of random forests (RF) within the counterfactual framework, I estimate individual treatment effects by directly modeling the response. This thesis consists of five chapters. Chapter 1 reviews the methodology in causal inference and provides mathematical notations. Major approaches reviewed include the potential outcome approach, the graphical approach and the counterfactual approach. Chapter 2 discusses assumptions for the counterfactual approach. P-values are useful in causal inference, but whenever they are used, caution must be taken. Section 2.3 and Section 2.4 propose machine learning methods as alternatives to p-values and to checking the proportional hazards assumption in survival analysis. These two sections are more general in content, even beyond the scope of the counterfactual approach. Chapter 3 describes six random forest methods for estimating individual treatment effects under the counterfactual framework and discusses model consistency and convergence of random forests in Section 3.6. Chapter 4 demonstrates the performance of these methods in complex simulations and how the most appropriate method is used in a real dataset for a continuous outcome. Chapter 5 addresses causal inference in survival analysis of ischemic cardiomyopathy. Treatment effect is viewed as a dynamic causal procedure. New random forest methods are proposed in this chapter to assess individual therapy overlap. These methods possess the unique feature of being able to incorporate external expert knowledge either in a fully supervised way (i.e., we have a strong belief that knowledge is correct), or in a minimally-supervised fashion (i.e., knowledge is not considered gold-standard).
Keywords
Causal Inference; Random Forests; Machine Learning; Survival; Individual Treatment Effect; Observational Data

Recommended Citation
Lu, Min, "Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods" (2018). Open Access Dissertations. 2102.
https://scholarlyrepository.miami.edu/oa_dissertations/2102
Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods

Min Lu
Dissertation Defense on May 3, 2018
Division of Biostatistics
University of Miami
Overview
1 Introduction
2 Continuous and Binary Outcome
    Definitions and notations
    Method
    Simulation
    Real Data Application
    Discussion
3 Survival Outcome
    Definitions & notations
    Treatment overlap oj (x)
    Treatment effect estimation
    Results
    Discussion
4 Reference
5 Other Papers
Introduction
• Comparison of potential outcomes under alternative treatments
• We only observe half of the potential outcomes
• A non-experimental study is subject to selection bias/confounding

Units   Observed outcome                 Potential outcomes       Treatment effects
                                         Treatment   Control
1       Y1                               Y1(1)       Y1(0)        Y1(1) − Y1(0)
⋮       ⋮                                ⋮           ⋮            ⋮
i       Yi                               Yi(1)       Yi(0)        Yi(1) − Yi(0)
⋮       ⋮                                ⋮           ⋮            ⋮
N       YN                               YN(1)       YN(0)        YN(1) − YN(0)
        (1/n1) Σ_T Yi − (1/n0) Σ_C Yi    Ȳ(1)        Ȳ(0)         Ȳ(1) − Ȳ(0)
Propensity score approaches (Rosenbaum and Rubin, 1983) can estimate an average treatment effect, Ȳ(1) − Ȳ(0).
We estimate the individual treatment effect (ITE), Yi (1) − Yi (0), for each unit i.
Definitions and notations
• Let {(X1 , T1 , Y1 ), . . . , (Xn , Tn , Yn )} denote the data, where Xi is the covariate vector for individual i
• Let Ti = 0 represent the control group and Ti = 1 the intervention group
• Let Yi (0) and Yi (1) denote the potential outcomes for i under the two treatments
• Given Xi = x, the individual treatment effect (ITE) for i is defined as
    τ (x) = E [Yi (1)|Xi = x] − E [Yi (0)|Xi = x]
• The average treatment effect (ATE) is defined as
    τ0 = E [Yi (1)] − E [Yi (0)] = E [τ (X)]
ITE estimation
For the ITE, τ (x) = E [Yi (1)|Xi = x] − E [Yi (0)|Xi = x], our counterfactual approaches rest on the assumption of strongly ignorable treatment assignment (SITA). SITA assumes that 0 < P(T = 1|x) < 1 for all x and that treatment assignment is conditionally independent of the potential outcomes given the covariates; i.e., T ⊥ {Y(0), Y(1)} | X. Under the assumption of SITA, we have
    τ (x) = E [Y|T = 1, X = x] − E [Y|T = 0, X = x]
Assuming that the outcome Y satisfies
    Y = f (X, T) + ε
where E(ε) = 0 and f is the unknown regression function, and assuming that SITA holds, we have
    τ (x) = f (x, 1) − f (x, 0)
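The identity τ (x) = f (x, 1) − f (x, 0) can be illustrated with a minimal sketch. The regression function `f` below is hypothetical (invented for this example, not taken from the talk), and the covariate is a single standard normal variable; averaging τ (X) over draws of X approximates the ATE τ0 = E [τ (X)].

```python
import random

def f(x, t):
    """Hypothetical regression function f(x, t); an assumption, not from the talk."""
    return 1.0 + 0.5 * x + t * (2.0 - x)

def ite(x):
    """tau(x) = f(x, 1) - f(x, 0), valid under SITA with Y = f(X, T) + eps."""
    return f(x, 1) - f(x, 0)

def ate(xs):
    """tau_0 = E[tau(X)], approximated by a sample average over covariate draws."""
    return sum(ite(x) for x in xs) / len(xs)

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(10_000)]  # assumed covariate distribution
tau_0 = ate(xs)
```

With this choice of `f`, the ITE is 2 − x, so it varies across individuals while the ATE is close to 2.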
Method
The methods for estimating the ITE are as follows:
1. Virtual twins (VT)
2. Virtual twins interaction (VT-I)
3. Counterfactual RF (CF)
4. Counterfactual synthetic RF (synCF)
5. Bivariate RF (bivariate)
6. Honest RF (honest RF)
7. Bayesian Additive Regression Trees (BART)
Virtual twins (VT)
Foster et al. (2011) proposed Virtual Twins (VT) for estimating counterfactual outcomes. In this approach, RF is used to regress Yi against (Xi , Ti ). Given an individual i with Ti = 1, one obtains i's counterfactual estimate Ŷi (0) by running the altered (Xi , 1 − Ti ) = (Xi , 0) down the forest; this estimate is denoted ŶVT (x, 0). The VT counterfactual estimate for τ (x) is
    τ̂VT (x) = ŶVT (x, 1) − ŶVT (x, 0)

[Figure: a random forest is trained on the data (Y, T, X1 , . . . , XP ) for units 1, . . . , N and returns, for each unit i, the predicted potential outcomes Ŷi (1) and Ŷi (0) and the estimated effect Ŷi (1) − Ŷi (0).]
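The VT recipe (fit one regression of Y on (X, T), then score each case together with its altered twin) can be sketched as follows. This is only an illustration: the learner is a deliberately crude piecewise-constant stand-in (cell means over bins of x), not a random forest, and the data-generating model, bin width and function names are assumptions made for the example. Treatment is randomized in the toy data, so unconfoundedness holds trivially.

```python
import random
from collections import defaultdict

def fit_cell_means(rows, width=0.25):
    """Stand-in learner (NOT a random forest): average y within (x-bin, t) cells."""
    sums = defaultdict(lambda: [0.0, 0])
    for x, t, y in rows:
        cell = sums[(int(x // width), t)]
        cell[0] += y
        cell[1] += 1

    def predict(x, t):
        total, count = sums.get((int(x // width), t), (0.0, 0))
        return total / count if count else float("nan")

    return predict

def virtual_twins_ite(predict, x):
    """tau_hat_VT(x) = Y_hat(x, 1) - Y_hat(x, 0): score the case and its altered twin."""
    return predict(x, 1) - predict(x, 0)

rng = random.Random(1)
rows = []
for _ in range(5000):
    x = rng.random()                             # one covariate on [0, 1)
    t = rng.randint(0, 1)                        # randomized treatment (toy setting)
    y = 1.0 + x + 2.0 * t + rng.gauss(0.0, 0.1)  # hypothetical truth: tau(x) = 2
    rows.append((x, t, y))

predict = fit_cell_means(rows)
tau_hat = virtual_twins_ite(predict, 0.6)
```

Swapping the stand-in for a real forest changes only `fit_cell_means`; the twin-scoring step stays the same.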
Virtual twins interaction (VT-I)
As noted in Foster et al. (2011), the VT approach can be improved by manually including treatment interactions in the design matrix. Thus, one runs an RF regression with Yi regressed against (Xi , Ti , Xi Ti ).

[Figure: a random forest is trained on the data (Y, T, X1 , . . . , XP , TX1 , . . . , TXP ) for units 1, . . . , N and returns, for each unit i, the predicted potential outcomes Ŷi (1) and Ŷi (0) and the estimated effect Ŷi (1) − Ŷi (0).]
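A minimal sketch of the VT-I design matrix, assuming covariates are stored as plain lists (the helper names are hypothetical). One practical point it makes explicit: when the treatment is flipped to score the counterfactual twin, the interaction columns have to be recomputed as well.

```python
def vt_interaction_row(x, t):
    """Return (x_1..x_P, t, t*x_1..t*x_P), the VT-I design row for one case."""
    return list(x) + [t] + [t * xj for xj in x]

def vt_interaction_design(X, T):
    """Stack the VT-I rows for all cases."""
    return [vt_interaction_row(x, t) for x, t in zip(X, T)]

X = [[0.5, 2.0], [1.5, -1.0]]
T = [1, 0]
design = vt_interaction_design(X, T)

# Counterfactual twin of the first case: flip T and rebuild the interactions.
twin = vt_interaction_row(X[0], 1 - T[0])
```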
Out-of-bag (OOB) estimates
Each tree in a forest is constructed from a bootstrap sample, which uses approximately 63% of the data. The remaining 37% of the data are called out-of-bag (OOB) and are used to calculate an OOB predicted value for a case.
For example, if 1000 trees are grown, approximately 370 will be used in calculating the OOB estimate for a given case. The inbag predicted value, on the other hand, uses all 1000 trees.
OOB estimates are generally much more accurate than insample (inbag) estimates (Breiman, 1996). To illustrate how OOB estimation applies to VT, suppose that case x is assigned treatment T = 1. Let $\hat{Y}^*_{VT}(x, T)$ denote the OOB predicted value for (x, T). The OOB counterfactual estimate for τ (x) is
    $\hat{\tau}_{VT}(x) = \hat{Y}^*_{VT}(x, 1) - \hat{Y}_{VT}(x, 0)$
Note that ŶVT (x, 0) is not OOB. This is because (x, 0) is a new data point and, technically speaking, cannot have an OOB predicted value, as the observation is not even in the training data.

[Figure: all data are used for training the random forest; only OOB predictions are used for the observed treatment arm.]
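The 63% / 37% split quoted for OOB estimation can be checked with a short simulation. The sketch below draws bootstrap samples of size n and counts the share of "trees" for which one fixed case is out-of-bag; the sample size, number of trees and seed are arbitrary choices for the illustration.

```python
import random

def oob_fraction(n, n_trees, seed=0):
    """Share of trees for which case 0 is out-of-bag under bootstrap sampling."""
    rng = random.Random(seed)
    oob_trees = 0
    for _ in range(n_trees):
        inbag = {rng.randrange(n) for _ in range(n)}  # one bootstrap sample of size n
        if 0 not in inbag:
            oob_trees += 1
    return oob_trees / n_trees

frac = oob_fraction(n=200, n_trees=1000)
# Probability that a given case is left out of one bootstrap sample of size 200.
limit = (1 - 1 / 200) ** 200
```

As n grows, `limit` tends to exp(−1), roughly 0.37, which is where the "approximately 370 of 1000 trees" figure comes from.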
Counterfactual random forests (CF)
Forests CF1 and CF0 are fit separately to the data {(Xi , Yi ) : Ti = 1} and {(Xi , Yi ) : Ti = 0}, respectively. To obtain a counterfactual ITE estimate, each data point is run down its natural forest as well as its counterfactual forest. If ŶCF,j (x, T) denotes the predicted value for (x, T) from CFj , for j = 0, 1, and $\hat{Y}^*_{CF,1}(x, 1)$ denotes the OOB predicted value for (x, 1), the counterfactual OOB ITE estimate for a case x assigned T = 1 is
    $\hat{\tau}_{CF}(x) = \hat{Y}^*_{CF,1}(x, 1) - \hat{Y}_{CF,0}(x, 0)$

[Figure: RANDOM FOREST (1) is trained on the T = 1 data and RANDOM FOREST (0) on the T = 0 data, each with columns (Y, T, X1 , . . . , XP ); together they return, for each unit i, the predicted potential outcomes Ŷi (1) and Ŷi (0) and the estimated effect Ŷi (1) − Ŷi (0).]
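The two-forest logic can be sketched with stand-in learners. In the illustration below, each "forest" is replaced by a per-arm table of bin means (not a random forest), and the OOB prediction for a case's own arm is mimicked by leaving that case's outcome out of its bin mean; both substitutions, the toy data model and all names are assumptions made for the example.

```python
import random
from collections import defaultdict

WIDTH = 0.25  # bin width of the stand-in learner (arbitrary choice)

def fit_arm(pairs):
    """Stand-in for one forest CF_j (NOT a random forest): per-bin sums of y."""
    cells = defaultdict(lambda: [0.0, 0])
    for x, y in pairs:
        cell = cells[int(x // WIDTH)]
        cell[0] += y
        cell[1] += 1
    return cells

def predict(cells, x, leave_out=None):
    """Bin mean; leave_out=y drops the case's own outcome, mimicking an OOB value."""
    total, count = cells.get(int(x // WIDTH), (0.0, 0))
    if leave_out is not None:
        total, count = total - leave_out, count - 1
    return total / count if count > 0 else float("nan")

def cf_ite(arms, x, t, y):
    """Own arm: held-out prediction; other arm: ordinary prediction (x is new to it)."""
    y1 = predict(arms[1], x, leave_out=y if t == 1 else None)
    y0 = predict(arms[0], x, leave_out=y if t == 0 else None)
    return y1 - y0

rng = random.Random(2)
data = []
for _ in range(4000):
    x = rng.random()
    t = rng.randint(0, 1)
    y = 1.0 + x + 2.0 * t + rng.gauss(0.0, 0.1)  # hypothetical truth: tau(x) = 2
    data.append((x, t, y))

arms = {j: fit_arm([(x, y) for x, t, y in data if t == j]) for j in (0, 1)}
estimates = [cf_ite(arms, x, t, y) for x, t, y in data]
mean_ite = sum(estimates) / len(estimates)
```

The asymmetry in `cf_ite` mirrors the slide: only the prediction from the case's natural arm can be held out, because the case never entered the training data of the other arm.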
Counterfactual synthetic random forests (synCF)
Using a collection of Breiman forests (called base learners) grown under different tuning parameters (mtry and nodesize), each generating a predicted value called a synthetic feature, a synthetic forest (Ishwaran and Malley, 2014) is defined as a secondary forest calculated using the new input synthetic features, along with all the original features. We denote its ITE estimate by τ̂synCF (x):
    τ̂synCF (x) = ŶsynCF,1 (x, 1) − ŶsynCF,0 (x, 0)
where ŶsynCF,j (x, T) is the predicted value for (x, T). As before, OOB estimation is used whenever possible.

[Figure: as in CF, RANDOM FOREST (1) and RANDOM FOREST (0) are trained separately on the T = 1 and T = 0 data, but the columns X1 , . . . , XP are augmented with the synthetic features Ŷ 1 , . . . , Ŷ M .]
Bivariate imputation approach
Causal Inference as Missing Data Problem

Y(0)   Y(1)   X1    ...   Xp
?      y1     x11   ...   x1p
y2     ?      x21   ...   x2p
?      y3     x31   ...   x3p
y4     ?      x41   ...   x4p
⋮      ⋮      ⋮           ⋮
?      yn     xn1   ...   xnp

To impute these missing outcomes, a bivariate splitting rule is used (Tang and Ishwaran, 2015). The completed bivariate values are used to calculate the bivariate counterfactual estimate
    τ̂bivariate (x) = Ŷbivariate,1 (x) − Ŷbivariate,0 (x)
Honest RF
In this method, an RF is run by regressing Yi on (Xi , Ti ), but using only a randomly selected 50% subset of the data. When fitting RF to this training data, a modified regression splitting rule is used. Rather than splitting tree nodes by maximizing the reduction in node variance, honest RF instead uses a splitting rule which maximizes the treatment difference within a node (see Procedure 1 and Remark 1 in Wager and Athey, 2015). We denote the honest RF estimate by τ̂honestRF (x).

[Figure: the HONEST RANDOM FOREST is trained on 50% of the data (units 1, . . . , N/2, with columns Y, T, X1 , . . . , XP ); the remaining 50% is used for prediction.]
Bayesian Additive Regression Trees (BART)
Hill (2011) proposed using BART to directly model the regression surface to estimate potential outcomes. This is therefore similar to VT, but with RF replaced by BART. The BART ITE estimate is defined as
    τ̂BART (x) = ŶBART (x, 1) − ŶBART (x, 0)
where ŶBART (x, T) denotes the predicted value for (x, T) from BART. Note that due to the highly adaptive nature of BART, no forced interactions are included in the design matrix.

[Figure: BART is trained on the data (Y, T, X1 , . . . , XP ) for units 1, . . . , N and returns, for each unit i, the predicted potential outcomes Ŷi (1) and Ŷi (0) and the estimated effect Ŷi (1) − Ŷi (0).]
Simulation study to assess different methods
• 20 covariates: 10 continuous variables, 10 binary variables
• A logistic regression model was used to simulate T, in which the linear predictor F(X) defined on the logit scale was
    $F(X) = -2 + .028X_1 - .374X_2 - .03X_3 + .118X_4 - .394X_{11} + .875X_{12} + .9X_{13}$
• Three different models were used for the continuous outcome Y, which was assumed to be Yi = fj (Xi , Ti ) + εi , where the εi were independent N(0, σ²). The mean functions fj for the three simulations were
    $f_1(X, T) = 2.455 - 1_{\{T=0\}} \times (.4X_1 + .154X_2 - .152X_{11} - .126X_{12}) - 1_{\{T=1,\, g(X)>0\}}$
    $f_2(X, T) = 2.455 - 1_{\{T=0\}} \times \sin(.4X_1 + .154X_2 - .152X_{11} - .126X_{12}) - 1_{\{T=1,\, g(X)>0\}}$
    $f_3(X, T) = 2.455 - 1_{\{T=0\}} \times \sin(.4X_1 + .154X_2 - .152X_{11} - .126X_{12}) - 1_{\{T=1,\, h(X)>0\}}$
  where $g(X) = .254X_2^2 - .152X_{11} - .4X_{11}^2 - .126X_{12}$ and $h(X) = .254X_3^2 - .152X_4 - .126X_5 - .4X_5^2$
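A minimal sketch of Simulation 1 as read from this slide. Two ingredients are assumptions, because the slide does not state them: the covariate distributions (X1 to X10 standard normal, X11 to X20 Bernoulli(0.5)) and the noise level σ = 1. The function and variable names are likewise chosen for the example.

```python
import math
import random

def draw_covariates(rng):
    # ASSUMPTION: X1..X10 ~ N(0, 1) and X11..X20 ~ Bernoulli(0.5).
    cont = [rng.gauss(0.0, 1.0) for _ in range(10)]
    binary = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(10)]
    return [0.0] + cont + binary  # x[0] is padding so that x[j] is X_j

def propensity(x):
    """P(T = 1 | x) from the linear predictor F(X) on the logit scale."""
    lin = (-2 + 0.028 * x[1] - 0.374 * x[2] - 0.03 * x[3] + 0.118 * x[4]
           - 0.394 * x[11] + 0.875 * x[12] + 0.9 * x[13])
    return 1.0 / (1.0 + math.exp(-lin))

def g(x):
    return 0.254 * x[2] ** 2 - 0.152 * x[11] - 0.4 * x[11] ** 2 - 0.126 * x[12]

def f1(x, t):
    """Mean function of Simulation 1."""
    value = 2.455
    if t == 0:
        value -= 0.4 * x[1] + 0.154 * x[2] - 0.152 * x[11] - 0.126 * x[12]
    elif g(x) > 0:
        value -= 1.0
    return value

def simulate(n, sigma=1.0, seed=0):
    # ASSUMPTION: sigma = 1.0; the slide only says the errors are N(0, sigma^2).
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = draw_covariates(rng)
        t = 1 if rng.random() < propensity(x) else 0
        y = f1(x, t) + rng.gauss(0.0, sigma)
        tau = f1(x, 1) - f1(x, 0)  # true ITE, known only because this is a simulation
        data.append((x, t, y, tau))
    return data

data = simulate(2000)
treated_share = sum(t for _, t, _, _ in data) / len(data)
```

Because the true τ (x) is stored with each case, the output can feed directly into bias and RMSE calculations for any ITE estimator.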
Performance measures
Given an estimator τ̂ of τ, the bias for group Gm was defined as
    $\mathrm{Bias}(m) = E\big[\hat{\tau}(X) \mid X \in G_m\big] - E\big[\tau(X) \mid X \in G_m\big], \quad m = 1, \ldots, M$
Our simulation experiments were replicated independently B times. Let Gm,b denote those x values that lie within the qm quantile of the propensity score from realization b. Let τ̂b be the ITE estimator from realization b. The conditional bias was estimated by
    $\widehat{\mathrm{Bias}}(m) = \frac{1}{B}\sum_{b=1}^{B} \hat{\tau}_{m,b} - \frac{1}{B}\sum_{b=1}^{B} \tau_{m,b}$
where
    $\hat{\tau}_{m,b} = \frac{1}{\#G_{m,b}} \sum_{x_i \in G_{m,b}} \hat{\tau}_b(x_i), \qquad \tau_{m,b} = \frac{1}{\#G_{m,b}} \sum_{x_i \in G_{m,b}} \tau(x_i)$
Similarly, we define the conditional root mean squared error (RMSE) of τ̂ by
    $\mathrm{RMSE}(m) = \sqrt{E\big[(\hat{\tau}(X) - \tau(X))^2 \mid X \in G_m\big]}, \quad m = 1, \ldots, M$
which we estimated using
    $\widehat{\mathrm{RMSE}}(m) = \sqrt{\frac{1}{B}\sum_{b=1}^{B} \frac{1}{\#G_{m,b}} \sum_{x_i \in G_{m,b}} \big[\hat{\tau}_b(x_i) - \tau_b(x_i)\big]^2}$
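The two estimators can be written down directly. In the sketch below, `est[b]` and `truth[b]` are assumed to hold the estimated and true ITE values for the cases falling in Gm,b in replication b; the data layout and the tiny input are made up for the illustration.

```python
import math

def conditional_bias(est, truth):
    """Average over replications of the within-group mean estimate minus mean truth."""
    B = len(est)
    mean_est = sum(sum(e) / len(e) for e in est) / B
    mean_true = sum(sum(t) / len(t) for t in truth) / B
    return mean_est - mean_true

def conditional_rmse(est, truth):
    """Square root of the replication average of within-group mean squared errors."""
    B = len(est)
    total = 0.0
    for e, t in zip(est, truth):
        total += sum((ei - ti) ** 2 for ei, ti in zip(e, t)) / len(e)
    return math.sqrt(total / B)

# Two replications with group sizes 2 and 3 (made-up numbers).
est = [[1.0, 3.0], [2.0, 2.0, 5.0]]
truth = [[1.0, 2.0], [2.0, 2.0, 2.0]]
bias = conditional_bias(est, truth)
rmse = conditional_rmse(est, truth)
```

Group sizes may differ across replications, which is why each replication is averaged over its own #Gm,b before the outer average over B.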
Simulation result:
Bias and Root-Mean-Squared-Error
[Figure: Bias and RMSE panels for Simulation 1, Simulation 2 and Simulation 3, comparing VT, VT-I, honest RF, bivariate, CF, synCF and BART at n = 500 and n = 5000.]
Project Aware: a counterfactual approach to understand the role of drug use in sexual risk
• Aim: how substance use plays a role in sexual risk (n = 2813, p = 99)
• Treatment (exposure) variable T: 0 = no substance use in the prior 6 months, 1 = any substance use in the prior 6 months leading to the study
• Outcome: number of unprotected sex acts within the last six months as reported by the individual
• Method: a synthetic forest was fit separately to each exposure group. This yielded estimated causal effects {τ̂synCF (xi ), i = 1, . . . , n} for τ (x), defined as the mean difference in number of unprotected sex acts for drug versus non-drug users
RF estimated causal effect of drug use
The true model for the outcome (number of unprotected sex acts) is Y = f (X, T) + ε, where
    f (X, T) = α0 T + h(X, T)
and h is some unknown function. Under the assumption of SITA,
    τ (x) = f (x, 1) − f (x, 0) = α0 + h(x, 1) − h(x, 0)
Now since we can assume a linear model $\alpha + \sum_{j=1}^{p} \beta_j x_j$ for the ITE, we have
    $\alpha_0 + h(x, 1) - h(x, 0) = \alpha + \sum_{j=1}^{p} \beta_j x_j$
Explore causal effect of drug use
Now we assume a linear model for the ITE. Standard errors and significance of linear model coefficients were determined using subsampling. We have
    $\tau(x) = \alpha + \sum_{j=1}^{p} \beta_j x_j$

                               Estimate   Std. Error       Z
Intercept (drug use)              16.97         9.36    1.81
CESD                               0.60         0.13    4.54
Condom change 2                  -19.38         2.96   -6.56
Condom change 3                  -23.33         3.23   -7.22
Condom change 4                  -21.46         3.39   -6.32
Condom change 5                  -24.02         3.41   -7.04
Usual care 3                       4.91         2.11    2.33
Marriage                          -1.61         0.73   -2.21
No health insurance                2.72         0.99    2.75
SU treatment last 6 months 2       6.38         3.20    2.00
Frequency of injection             3.59         1.77    2.02
Discussion
• In observational data with complex heterogeneity of treatment effect, individual estimates of treatment effect can be obtained in a principled way by directly modeling the response surface
• Successful estimation mandates highly adaptive and accurate regression methodology, and for this we relied on RF, a machine learning method with well-known properties for accurate estimation in complex nonparametric regression settings
• We encourage the use of out-of-bag estimation, a simple but underappreciated out-of-sample technique for improving accuracy
• The success of counterfactual synthetic RF can be attributed to three separate effects: (a) fitting separate forests to each treatment group, which improves adaptivity to confounding; (b) replacing Breiman forests with synthetic forests, which reduces bias; and (c) utilizing OOB estimation, which improves accuracy
Research Questions
We base our analysis on data from 1468 patients who were treated for ischemic cardiomyopathy at Cleveland Clinic from 1997 to 2007. Treatments include
· Coronary artery bypass grafting alone (CABG)
· CABG plus mitral valve annuloplasty (MVA)
· CABG plus surgical ventricular reconstruction (SVR)
· Listing for cardiac transplantation (LCTx)

• What are the average treatment effect (ATE) and the individual treatment effect (ITE)?
[Figure: diagram (apparently a four-set Venn diagram) of CABG, MVA, SVR and LCTx with region counts 31, 81, 43, 126, 231, 90, 65, 37, 77, 110, 124, 52, 121, 109 and 171, which sum to 1468; the extraction does not preserve which count belongs to which region.]

• What is the treatment effect?
• How can overlap be checked, and how can all data points be used when lack of overlap often results in sample size reduction?
• Have patients received optimal treatments?
Treatment effect on survival outcome
Let {(X1 , Z1 , T1 , δ1 ), . . . , (Xn , Zn , Tn , δn )} denote the data. The observed survival time is $T_i = \min(T_i^o, C_i^o)$, where $T_i^o$ is the true event time and $C_i^o$ is the true censoring time. We assume $T_i^o \perp C_i^o \mid (X_i, Z_i)$. Let $T^o(j)$ denote the potential outcome (event time) under treatment Z = j.

Units   Observed                Potential Outcomes                Treatment Effects
        Outcome    Treatment    Therapy 1    ...   Therapy M      for j over k
1       T1 , δ1    Z1           T1o (1)      ...   T1o (M)        Sj (t|x1 ) − Sk (t|x1 )
⋮       ⋮          ⋮            ⋮                  ⋮              ⋮
i       Ti , δi    Zi           Tio (1)      ...   Tio (M)        Sj (t|xi ) − Sk (t|xi )
The individual treatment effect (ITE) $\tau_{j,k}(t, x)$
[Figure: survival under CABG, survival under MVA, and the treatment effect of CABG over MVA, plotted against time (years).]
Definition
The individual treatment effect (ITE) at time $t$ for covariate $x$ for treatment $j$ over treatment $k$ is defined as
$\tau_{j,k}(t, x) = S_j(t \mid x) - S_k(t \mid x)$, where $S_l(t \mid x) = P\{T^o(l) > t \mid X = x\}$ is the survival function.
Overlap Assumption
Overlap between treatments $j$ and $k$ is said to hold for $x$ if $P(Z = j \mid x) > 0$ and $P(Z = k \mid x) > 0$. We define the overlap function as follows:
$o_j(x) = 1\{P(Z = j \mid x) > 0\}$
Weak Unconfoundedness Assumption
We say that weak unconfoundedness holds if, for all $j \in \{1, \ldots, M\}$,
$1\{Z = j\} \perp \{T^o(j), C^o(j)\} \mid X$
Under weak unconfoundedness,
$$
\begin{aligned}
\tau_{j,k}(t, x) &= P\{T^o(j) > t \mid X = x\} - P\{T^o(k) > t \mid X = x\} \\
&= P\{T^o > t \mid X = x, Z = j\} - P\{T^o > t \mid X = x, Z = k\} \\
&= S(t \mid x, Z = j) - S(t \mid x, Z = k)
\end{aligned}
$$
Individualized Treatment Rules
[Figure: potential survival functions under CABG, SVR, MVA, and LCTx, plotted against time (years).]
The optimal individualized treatment rule (ITR) selects the best treatment for each patient. For example, LCTx is the optimal treatment in the above figure.
Individualized Treatment Rules
The optimal decision rule for $x$, denoted as $d^{opt}(x)$, maximizes the restricted mean survival time (RMST)
$d^{opt}(x) = \arg\max_{l \in \{l : o_l(x) = 1\}} \int_0^{t_0} S(t \mid x, Z = l)\, dt$
and is uniquely determined by the ITE.
Integrating over $t \in [0, t_0]$, we define the ITE before time $t_0$ as
$\tau_{j,k}([0, t_0], x) = \int_0^{t_0} \tau_{j,k}(t, x)\, dt$
which can be interpreted as the difference in the number of years alive before time $t_0$ for treatment $j$ over $k$. Typically, $t_0$ is chosen to equal the maximum observed follow-up time.
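The rule above can be sketched numerically. The example below assumes each survival curve is available on a common time grid running from 0 to $t_0$ and approximates the integral with the trapezoidal rule. That quadrature is an assumption of this sketch, since forest survival estimates are typically step functions. The names `rmst` and `optimal_rule` are hypothetical.

```python
def rmst(times, surv):
    """Approximate the area under S(t) over the grid `times` (trapezoidal rule)."""
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (surv[i] + surv[i - 1]) * (times[i] - times[i - 1])
    return area


def optimal_rule(times, surv_by_treatment, eligible):
    """Pick the eligible treatment (o_l(x) = 1) with the largest RMST.

    surv_by_treatment maps a treatment label to its survival curve on `times`;
    eligible maps a treatment label to 0 or 1. Raises ValueError when no
    treatment is eligible.
    """
    candidates = {
        label: rmst(times, curve)
        for label, curve in surv_by_treatment.items()
        if eligible.get(label, 0) == 1
    }
    if not candidates:
        raise ValueError("no eligible treatment for this patient")
    return max(candidates, key=candidates.get)


def ite_before_t0(times, surv_j, surv_k):
    """ITE before t0 for j over k: difference of the two restricted means."""
    return rmst(times, surv_j) - rmst(times, surv_k)
```

In this sketch a treatment with the highest RMST is ignored when the patient is not eligible for it, which mirrors the restriction of the argmax to $\{l : o_l(x) = 1\}$.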
The average treatment effect (ATE) $\tau_{j,k}(t)$
[Figure: individual treatment effects of CABG over MVA and the average treatment effect of CABG over MVA, plotted against time (years).]
Definition
Define the average treatment effect (ATE) at time $t$ for treatment $j$ over treatment $k$, as
$\tau_{j,k}(t) = E_X\big[\tau_{j,k}(t, X) \mid o_j(X) = 1, o_k(X) = 1\big]$.
We define the ATE before time $t_0$ as $\tau_{j,k}([0, t_0]) = \int_0^{t_0} \tau_{j,k}(t)\, dt$
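A minimal sketch of how the two ATE quantities could be computed from estimated ITE curves, assuming one curve per patient on a shared time grid and 0/1 overlap indicators. The sample average stands in for $E_X$, and the trapezoidal rule for the integral; both are assumptions of this example, as are the function names.

```python
def ate_curve(ite_curves, overlap_j, overlap_k):
    """Average the ITE curves over patients with o_j = 1 and o_k = 1.

    ite_curves[i][m] is the estimated ITE of patient i at grid time m.
    """
    kept = [
        curve
        for curve, oj, ok in zip(ite_curves, overlap_j, overlap_k)
        if oj == 1 and ok == 1
    ]
    if not kept:
        raise ValueError("no patient satisfies overlap for both treatments")
    n_times = len(kept[0])
    return [sum(curve[m] for curve in kept) / len(kept) for m in range(n_times)]


def integrate(times, values):
    """Trapezoidal approximation of the integral of `values` over `times`."""
    return sum(
        0.5 * (values[m] + values[m - 1]) * (times[m] - times[m - 1])
        for m in range(1, len(times))
    )
```

Calling `integrate(times, ate_curve(...))` then gives an approximation of the ATE before $t_0$ when the grid ends at $t_0$.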
The average treatment effect (ATE) $\tau_{j,k}(t)$
Overlap Assumption
Overlap between treatments $j$ and $k$ is said to hold for $x$ if $P(Z = j \mid x) > 0$ and $P(Z = k \mid x) > 0$. We formally define the overlap function as follows:
$o_j(x) = 1\{P(Z = j \mid x) > 0\}$
Definition
Define the average treatment effect (ATE) at time $t$ for treatment $j$ over treatment $k$, as
$\tau_{j,k}(t) = E_X\big[\tau_{j,k}(t, X) \mid o_j(X) = 1, o_k(X) = 1\big]$,
where the term $\tau_{j,k}(t, X)$ is the ITE and the conditioning event $\{o_j(X) = 1, o_k(X) = 1\}$ is the treatment overlap.
We define the ATE before time $t_0$ as $\tau_{j,k}([0, t_0]) = \int_0^{t_0} \tau_{j,k}(t)\, dt$
Treatment overlap $o_j(x) = 1\{P(Z = j \mid x) > 0\}$
• Strategy 1. The overlap function can be estimated using
$\hat{o}_j(x; C) = 1\{\hat{P}(Z = j \mid x) > C\}$
where $0 < C < 1$ is a selected cutoff value
• Strategy 2. A unique feature of our study was the availability of expert knowledge for defining treatment eligibility
Table: Expert knowledge used for determining treatment eligibility
Treatment   Expert Knowledge Eligibility Criteria
CABG        (a) Ischemic symptoms (angina); viable myocardium with diseased but by-passable coronary arteries. If (a) was not available, eligibility was determined using: (b) ACC/AHA guidelines for CABG based on angina and coronary artery disease
SVR*        Anterior wall akinesia/dyskinesia; left ventricular end-diastolic diameter > 6 cm
MVA         3+/4+ mitral regurgitation (MR) present
LCTx*       Age < 70 years; NYHA functional class III/IV; creatinine level < 1.7 mg·dL$^{-1}$
* Treatments where expert knowledge is considered less accurate for determining eligibility
Let $E_j \in \{0, 1\}$ denote eligibility for treatment $j$. The overlap function can also be estimated using
$\hat{o}_j(x; C) = 1\{\hat{P}(E_j = 1 \mid x) > C\}$
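Both strategies reduce to thresholding an estimated probability at the cutoff $C$. The sketch below shows that step only; the probability itself would come from a fitted model, which is outside this example. The function names are hypothetical, and `overlap_pair` simply combines the two indicators needed for a pairwise comparison of $j$ and $k$.

```python
def overlap_indicator(prob, cutoff):
    """Estimated overlap: 1 if the estimated probability exceeds the cutoff C.

    `prob` is either an estimate of P(Z = j | x) (strategy 1) or of
    P(E_j = 1 | x) (strategy 2).
    """
    if not 0 < cutoff < 1:
        raise ValueError("cutoff must satisfy 0 < C < 1")
    return 1 if prob > cutoff else 0


def overlap_pair(prob_j, prob_k, cutoff):
    """1 only when overlap is estimated to hold for both treatments j and k."""
    return overlap_indicator(prob_j, cutoff) * overlap_indicator(prob_k, cutoff)
```

A patient with estimated probabilities 0.30 for $j$ and 0.05 for $k$ would, with $C = 0.08$, be excluded from the $j$ versus $k$ comparison because the second indicator is 0.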
Random forest approaches addressing $\hat{o}_j(x, C)$
• Approach I. Random forest classification (RF-C) approach. Our first approach uses the treatment received $Z_i$ as the outcome and $X_i$ as features and fits a random forest classification (RF-C) model to estimate $P\{Z = j \mid x\}$ (strategy 1)
• Approach II. Random forest distance (RF-D) approach. The general idea is to assign patient $i$'s eligibility for treatment $j$ by using the "random forest distance" of $i$ to treatment $j$ patients (strategy 1)
• Approach III. Multivariate random forest (MRF). We directly model expert knowledge by using the expert data as $M$-multivariate outcomes in a multi-label classification analysis (strategy 2)
Approach II. Random forest distance approach
Let $d^A_i$ be the count of the edges from $i$ to the closest common ancestor of $i$ and $i'$. Similarly, let $d^A_{i'}$ count the edges from $i'$ to the closest $(i, i')$ common ancestor. Define $D^A_{i,i'} = d^A_i + d^A_{i'}$. Let $d^R_i$ and $d^R_{i'}$ be the count of the edges from $i$ and $i'$ to the root node and define $D^R_{i,i'} = d^R_i + d^R_{i'}$. The distance is defined as
$d_{i,i'} = \dfrac{D^A_{i,i'}}{D^R_{i,i'}}$.
The forest distance is defined as the forest averaged distance, which we denote by $\bar{d}_{i,i'}$. We define the probability of assigning $i$ to treatment $j$ by the closeness of $i$ to treatment $j$ patients,
$\hat{P}\{Z = j \mid X_i\} = \dfrac{\sum_{i' : Z_{i'} = j} (1 - \bar{d}_{i,i'})}{\sum_{i'} (1 - \bar{d}_{i,i'})}$.
The distance between $X_i$ and $X_{i'}$ is the ratio of the number of edges connecting the red nodes to the ancestor, $N_A$, to the number of edges connecting the red nodes to the root node, $N_R$. Thus
$d_{i,i'} = (2 + 1)/(4 + 3) = 3/7$
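The single-tree distance and the assignment probability can be sketched as follows. The representation of a terminal node by its root-to-leaf path of node identifiers is an assumption of this example, as are the function names; whether patient $i$ itself is included in the sums is not stated on the slide, so the caller decides which patients to pass in.

```python
from fractions import Fraction


def tree_distance(path_i, path_k):
    """Tree distance D^A / D^R between two terminal nodes.

    Each path lists node ids from the root down to the terminal node,
    so a path with m + 1 entries has m edges to the root.
    """
    if not path_i or not path_k or path_i[0] != path_k[0]:
        raise ValueError("both paths must start at the same root node")
    common = 0
    for a, b in zip(path_i, path_k):
        if a != b:
            break
        common += 1
    ancestor_depth = common - 1          # edges from root to closest common ancestor
    d_root_i = len(path_i) - 1           # edges from i's node to the root
    d_root_k = len(path_k) - 1           # edges from i''s node to the root
    d_root = d_root_i + d_root_k
    if d_root == 0:                      # both cases sit in the root node
        return Fraction(0)
    d_anc = (d_root_i - ancestor_depth) + (d_root_k - ancestor_depth)
    return Fraction(d_anc, d_root)


def assignment_probability(forest_distances, treatments, j):
    """Share of closeness (1 - forest distance) carried by treatment-j patients."""
    weights = [1 - d for d in forest_distances]
    total = sum(weights)
    in_group = sum(w for w, z in zip(weights, treatments) if z == j)
    return in_group / total
```

With paths of 4 and 3 edges that split 2 edges below the root, `tree_distance` reproduces the slide's value of 3/7.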
Cutoff criteria $C$ in $\hat{o}_j(x, C)$
• Treatment overlap is determined by either $\hat{o}_j(x; C) = 1\{\hat{P}(Z = j \mid x) > C\}$ or $\hat{o}_j(x; C) = 1\{\hat{P}(E_j = 1 \mid x) > C\}$
• Let $M' = \{j_1, j_2\}$ denote the subset of treatment groups corresponding to CABG and MVA. We define the CABG and MVA cutoff as follows:
$C^* = \arg\min_{0 < c < 1} \dfrac{1}{2n} \sum_{i=1}^{n} \sum_{j' \in M'} 1\{E_{ij'} \neq \hat{o}_{j'}(X_i, c)\}$
Table: Cutoff values for estimating treatment eligibility
Method   Cutoff value   Misclassification error (CABG, MVA)   Misclassification error (all four treatments)
RF-C     0.08           0.26                                  0.32
RF-D     0.12           0.18                                  0.35
MRF      0.61           0.04                                  0.13
Fig: Misclassification error of CABG and MVA as a function of the cutoff value c, for Random Forest Classification (RF-C), Random Forest Distance (RF-D), and Multivariate Random Forest (MRF)
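The cutoff criterion can be sketched as a search over a finite grid of candidate values. The grid search is an assumption of this example, since the slide only states the minimization over $0 < c < 1$. Function names and the toy inputs are hypothetical.

```python
def misclassification_error(eligibility, probabilities, cutoff):
    """Share of (patient, treatment) pairs where E_ij differs from the estimated overlap.

    eligibility[i][j] is the expert eligibility (0 or 1) and
    probabilities[i][j] the estimated probability, for treatments j in M'.
    With two treatments in M' the divisor equals 2n, as in the criterion.
    """
    n = len(eligibility)
    m = len(eligibility[0])
    errors = sum(
        int(e != int(p > cutoff))
        for e_row, p_row in zip(eligibility, probabilities)
        for e, p in zip(e_row, p_row)
    )
    return errors / (m * n)


def select_cutoff(eligibility, probabilities, grid):
    """Return the grid value with the smallest error (first one on ties)."""
    return min(
        grid,
        key=lambda c: misclassification_error(eligibility, probabilities, c),
    )
```

A finer grid gives a closer approximation of the minimizer, at the cost of more evaluations of the error.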
Counterfactual survival analysis using random survival forests
We utilize ALL the data to estimate $S(t \mid x, Z)$
We estimate the survival function $S(t \mid x, Z)$ using virtual twin random survival forest interactions, denoted as RSF-VT-I, where we add all possible interactions between the treatment variable $Z$ and covariates $X$ to the design matrix to grow a random survival forest. The counterfactual ITE estimate is defined as
$\hat{\tau}_{j,k}(t, X_i) = \hat{S}^*(t \mid X_i, Z = Z_i = j) - \hat{S}(t \mid X_i, Z = k)$,
where $\hat{S}^*(t \mid X_i, Z_i = j)$ is the out-of-bag estimated survival value based on $i$'s original (unaltered) data.
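The design-matrix step and the counterfactual prediction can be illustrated without a forest library. In the sketch below, treatment is encoded with one indicator per treatment and every indicator is multiplied with every covariate; this encoding is an assumption, since the slide only says that all treatment-by-covariate interactions are added. `predict_survival` is a hypothetical stand-in for a fitted forest's prediction at a fixed time $t$, and the out-of-bag value is passed in by the caller.

```python
def design_row(x, z, treatments):
    """Covariates, treatment indicators, and all indicator-by-covariate products."""
    dummies = [1.0 if z == label else 0.0 for label in treatments]
    interactions = [d * value for d in dummies for value in x]
    return list(x) + dummies + interactions


def counterfactual_ite(x, k, treatments, s_oob_observed, predict_survival):
    """Out-of-bag survival under the received treatment minus predicted survival under k.

    The counterfactual row keeps the covariates x and switches the
    treatment to k before prediction.
    """
    counterfactual = design_row(x, k, treatments)
    return s_oob_observed - predict_survival(counterfactual)
```

Switching only the treatment label while holding $x$ fixed is what makes the second term a counterfactual prediction for the same patient.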
[Diagram: the training data (units $1, \ldots, N$ with $T$, $\delta$, $Z$, covariates $X_1, \ldots, X_P$, and interactions $ZX_1, \ldots, ZX_P$) are passed to a random survival forest, which returns the potential outcomes $\hat{S}_i(t \mid Z = j)$ and $\hat{S}_i(t \mid Z = k)$ for each unit $i$ and the treatment effects $\hat{\tau}_{j,k}(t \mid X_i)$.]
Average treatment effect on the treated (ATT)
Treatment assignment is not only adjusted, but also judged
Definition
Define the average treatment effect on the treated (ATT) at time $t$ for the treated $j$, for treatment $j$ over treatment $k$, as
$\tau_{\underline{j},k}(t) = E_X\big[\tau_{j,k}(t, X) \mid Z = j, o_j(X) = 1, o_k(X) = 1\big]$
Likewise, the ATT for the treated $k$, for treatment $j$ over $k$, is
$\tau_{j,\underline{k}}(t) = E_X\big[\tau_{j,k}(t, X) \mid Z = k, o_j(X) = 1, o_k(X) = 1\big]$
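The two ATT definitions differ only in which received-treatment group is averaged over. A minimal sketch at a single time point, assuming estimated ITE values, received treatments, and 0/1 overlap indicators are given per patient (the function name `att` is hypothetical, and the sample mean stands in for the expectation):

```python
def att(ite, received, overlap_j, overlap_k, treated):
    """Mean ITE of j over k among patients who received `treated`
    and satisfy overlap for both treatments."""
    kept = [
        tau
        for tau, z, oj, ok in zip(ite, received, overlap_j, overlap_k)
        if z == treated and oj == 1 and ok == 1
    ]
    if not kept:
        raise ValueError("no treated patient satisfies overlap for both treatments")
    return sum(kept) / len(kept)
```

Passing `treated=j` gives the ATT for the treated $j$, and `treated=k` the ATT for the treated $k$, with the same ITE values in both calls.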
[Figure: two panels plotting the treatment effect of CABG over MVA against time (years). Legend entries: overlap for CABG and MVA; received CABG and overlap for MVA; received MVA and overlap for CABG; average treatment effect of CABG over MVA; individual treatment effect of patients who received CABG; individual treatment effect of patients who received MVA.]
Results
The areas under the black, blue, and red lines of previous figure equal the ATE and ATT before $t_0$ (the maximum observed follow-up time), and thus represent the difference in number of years alive before $t_0$
o ATE before $t_0$ (black line): $\mathrm{ATE}_{jk} = \tau_{j,k}([0, t_0])$
o ATT before $t_0$ where $j$ is the treated (blue line): $\mathrm{ATT}_{jk} = \tau_{\underline{j},k}([0, t_0])$
o ATT before $t_0$ where $k$ is the treated (red line): $\mathrm{ATT}_{kj} = \tau_{j,\underline{k}}([0, t_0])$
Table: Difference in number of months alive before maximum follow-up time, $t_0 = 9.36$ years
                        ATE^o_jk                   ATT^o_jk          ATT^o_kj
Treatment j vs. k       MRF      RF-C     RF-D     Mean     SE       Mean     SE
(a) CABG vs. SVR        0.31     0.29     0.60     -2.67    3.74     0.70     0.93
(b) CABG vs. MVA        4.88     5.06     5.21     4.20     2.89     5.02     1.55
(c) CABG vs. LCTx       0.85     3.67     3.50     5.85     2.26     -0.74    1.11
(d) SVR vs. MVA         5.95     5.49     5.47     5.97     1.41     5.70     5.61
(e) SVR vs. LCTx        -1.40    -0.55    -1.08    2.57     1.52     -4.81    1.53
(f) MVA vs. LCTx        -11.80   -6.08    -6.81    -0.84    2.62     -14.97   1.36
Fig: Confidence intervals for individual treatment effects $\hat{\tau}_{j,k}(t, x)$ at $t = 5$ years. Each subfigure indicates a pairwise comparison for treatment $j$ versus $k$. Red and blue indicate patients with significant treatment effect (p-value < .05), where blue are from treatment $j$ group and red are from treatment group $k$. Thus, blue and red boxes correspond to some of the patients from blue and red lines in previous figure
Treatment effect heterogeneity test
Subgroup analysis
We fit a bump hunting model (Friedman and Fisher, 1999; Duong, 2015)
for subgroup analysis. To improve efficiency of the algorithm, we only
used variables found important by using random forest variable selection.
The estimated ITE was used for the outcome and all pre-treatment
covariates as independent variables
Table: Subgroup detection using bump hunting after variable selection. $\mathrm{CATE}^o_{jk}$ equals the conditional ATE before $t_0$, conditioned on subgroup criteria
Treatment j vs. k   CATE^o_jk/ATE^o_jk   Size/Total   % in j   % in k
CABG vs. SVR        -4.08/0.31           44/246       28.57    16.51
CABG vs. SVR        -7.26/0.31           31/246       10.71    12.84
CABG vs. LCTx       5.31/0.85            125/406      59.18    21.75
SVR vs. LCTx        7.66/-1.40           60/292       30.37    12.10
Subgroup criteria, in the order they appear in the table: BSA>2.23; Regurgitation Grade>0; Blood Urea Nitrogen<30; Creatinine<1.8; BMI>27.04; GFR>44.75; Blood Urea Nitrogen<25; LDL<133.31; BSA>1.83; BMI>27.77; 55.29<GFR<120.80
BSA=body surface area (m$^2$); BMI=body mass index; GFR=glomerular filtration rate; LDL=low-density lipoprotein cholesterol
Fig. Identifying patients who received optimal treatment and those who did not. Optimal therapy is defined as the eligible treatment maximizing restricted mean survival time (RMST). Pie charts display gain in months for alternative optimized therapies and their respective sample sizes. If the optimized treatment is the assigned treatment, gain is defined as zero.
Gain in months for patients who received SVR
but where optimal therapy was CABG
Treatment decisions
Concluding remarks
• One contribution of this paper is to offer estimation
methods for assessing treatment overlap under the
scenario that some treatments may have either gold
standard expert knowledge, or controversial knowledge
for judging eligibility
• For personalized treatment decision and dynamic
causal procedure of treatment effect, we develop a
virtual twin random survival forest, extended to include
interactions between treatment variables and all
pre-treatment covariates
• A key insight of this paper is to judge current treatment
decisions using pairwise ATT comparisons
Reference
Other papers
• Lu M. and Ishwaran H. (2017). A Prediction-Based Alternative to
P-values in Regression Models. To appear The Journal of Thoracic
and Cardiovascular Surgery
https://arxiv.org/abs/1701.04944
• Ishwaran H. and Lu M. (2018). Standard Errors and Confidence
Intervals for Variable Importance in Random Forest Regression,
Classification, and Survival. To appear Statistics in Medicine
• Rice T.W., Lu M., Ishwaran H., and Blackstone, E.H. (2017).
Precision Surgical Therapy for Adenocarcinoma of the Esophagus
and Esophagogastric Junction
Thank you all very much!
You can reach me at
[email protected]