PSMatching
Pan Jie, Lecturer, West China School of Public Health
Email: [email protected]
Website: panjie.org
"Quantitative Methods in Health Policy and Management Research", November 7, 2013
Introductory example
Question: What is the impact of a cash transfer
intervention on maternal mortality?
OLS?
Would there be selection bias?
Evaluation Problem
Let $Y_i^T$ be the medical cost of patient $i$ in the treatment group ($T$), i.e. $Y_i^T \mid T$. (1)
We can never observe the same patient's medical cost both with and without treatment at the
same time.
But we may hope to learn the average effect of treatment on medical cost among the treated:
$E(Y_i^T - Y_i^C \mid T)$.
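Why a naive comparison of group means does not recover this quantity can be written as a standard decomposition. This is a sketch using the notation above, with $C$ denoting the comparison group (a label assumed here, since the slide does not define it at this point):

```latex
\begin{aligned}
E(Y_i^T \mid T) - E(Y_i^C \mid C)
  &= \underbrace{E(Y_i^T - Y_i^C \mid T)}_{\text{average effect on the treated}}
   + \underbrace{E(Y_i^C \mid T) - E(Y_i^C \mid C)}_{\text{selection bias}}
\end{aligned}
```

The second term is zero when treatment is randomly assigned. Matching aims to shrink it by comparing units that are similar on observed characteristics.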
Generating Propensity Scores
Propensity scores can be estimated using
several methods, but the most commonly used
method is logistic regression.
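As an illustration of the logistic regression step, here is a minimal Python sketch that fits the model by gradient ascent and returns predicted probabilities. The covariates (a standardized age and a female indicator), the simulated data, and the function names are all hypothetical; in practice a statistical package would be used instead.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logit(X, t, lr=0.5, n_iter=1000):
    """Fit a logistic regression by batch gradient ascent.
    Returns [intercept, b1, ..., bk]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, ti in zip(X, t):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
            err = ti - p                      # score contribution of this unit
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity_scores(X, beta):
    # Predicted probability of treatment for every unit
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi))) for xi in X]

# Hypothetical simulated data: columns are standardized age and a female indicator
random.seed(0)
X = [[random.gauss(0, 1), float(random.random() < 0.5)] for _ in range(300)]
t = [1 if random.random() < sigmoid(-0.5 + 1.0 * x[0] + 0.5 * x[1]) else 0 for x in X]

beta = fit_logit(X, t)
phat = propensity_scores(X, beta)   # estimated propensity scores
```

The fitted `phat` plays the role of the propensity score used in the matching steps.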
Limitations of Matching
If the two groups do not have substantial
overlap, then substantial error may be
introduced:
E.g., if only the worst cases from the
untreated “comparison” group are compared
to only the best cases from the treatment
group, the result may be regression toward
the mean, which
makes the comparison group look better
makes the treatment group look worse.
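One simple way to check overlap before matching is a min-max comparison of the two groups' propensity scores. The sketch below is only an illustration: the scores are made up, and the min-max rule is one of several possible definitions of common support.

```python
def common_support(ps_treated, ps_control):
    """Return (low, high): the score range covered by both groups.
    If low > high, the groups do not overlap at all."""
    low = max(min(ps_treated), min(ps_control))
    high = min(max(ps_treated), max(ps_control))
    return low, high

def share_on_support(scores, low, high):
    # Fraction of units whose score lies inside the common range
    inside = [s for s in scores if low <= s <= high]
    return len(inside) / len(scores)

# Hypothetical propensity scores
ps_treated = [0.35, 0.50, 0.62, 0.71, 0.88, 0.93]
ps_control = [0.05, 0.12, 0.20, 0.33, 0.41, 0.64]

low, high = common_support(ps_treated, ps_control)
treated_share = share_on_support(ps_treated, low, high)
control_share = share_on_support(ps_control, low, high)
```

A small share of units on the common support in either group is a warning that the matched comparison rests on few, possibly atypical, cases.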
Propensity Score Overlap - 1
[Figure: back-to-back histograms of the propensity score (vertical axis, 0 to 1 in bins of 0.05) against number of infants (horizontal axis, 0 to 400 on each side), comparing 5-19% Poverty with < 5% Poverty.]
Data Source: MN Department of Health, Center for Health Statistics, LBD 1990-1999
Propensity Score Overlap - 2
[Figure: back-to-back histograms of the propensity score (vertical axis, 0 to 1 in bins of 0.05) against number of infants (horizontal axis), comparing 40-100% Poverty with < 5% Poverty.]
Data Source: MN Department of Health, Center for Health Statistics, LBD 1990-1999
Propensity Scores: An Example of NO Overlap
[Figure: distributions of the predicted probability (propensity score, 0 to 1) against number of subjects for Participants and Nonparticipants, with the range of matched cases marked.]
Propensity Score Matching (PSM)
Employs a predicted probability of group
membership
E.g. treatment vs. control group
Based on observed predictors and usually obtained from
logistic regression, it is used to create a counterfactual group
(Rosenbaum & Rubin, 1983)
Dependent variable: T=1, if participate; T=0, otherwise
T=f(age, gender, pre-cci, etc.)
Allows “quasi-randomized” experiment
Two subjects, one in treated group and one in the
control, with the same (or similar) propensity score,
can be seen as “randomly assigned” to either group
Criteria for “Good” PSM Before Matching
Identify treatment and comparison group with
substantial overlap
Same exclusion, inclusion criteria
Overweighting some variables (Medicare vs Medicaid)
Choose variables
Stratification Method
Divide the range of variation of the propensity
score in intervals such that within each interval
treated and control units have, on average, the
same propensity score
Calculate the difference in the outcome measure
between the treatment and the control group
in each interval
Average treatment effect is obtained as an
average of the differences across blocks, with
weights given by the distribution of treated
units across blocks
Discard observations in blocks where either
treated or control units are absent
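The stratification steps above can be sketched in Python as follows. This is a simplified illustration: it uses fixed equal-width blocks rather than refining the intervals until the propensity scores are balanced within each one, and the data and function name are hypothetical.

```python
def stratified_att(ps, treated, outcome, n_blocks=5):
    """Treatment effect by stratification on the propensity score.
    Blocks are equal-width intervals of [0, 1]; block differences are
    weighted by the number of treated units in each block."""
    blocks = {}
    for p, t, y in zip(ps, treated, outcome):
        b = min(int(p * n_blocks), n_blocks - 1)   # block index of this unit
        blocks.setdefault(b, {0: [], 1: []})[t].append(y)

    weighted_sum, total_treated = 0.0, 0
    for groups in blocks.values():
        if not groups[0] or not groups[1]:
            continue   # discard blocks with no treated or no control units
        n_t = len(groups[1])
        diff = sum(groups[1]) / n_t - sum(groups[0]) / len(groups[0])
        weighted_sum += n_t * diff
        total_treated += n_t
    if total_treated == 0:
        raise ValueError("no block contains both treated and control units")
    return weighted_sum / total_treated

# Hypothetical toy data
ps      = [0.10, 0.15, 0.30, 0.35, 0.38, 0.90]
treated = [0, 1, 0, 1, 1, 1]
outcome = [10.0, 14.0, 20.0, 26.0, 28.0, 50.0]

att = stratified_att(ps, treated, outcome)
```

In this toy data the treated unit with score 0.90 falls in a block without controls and is dropped, which is the discard rule stated on the slide.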
Nearest Neighbor Matching
Randomly order the participants and non-participants
Then select the first participant and find the non-participant
with the closest propensity score
Nearest Neighbor Matchup (1 to 1)
Caliper: 0.034677/4 ≈ 0.0087
Stata command: psmatch2 treatment, outcome(cost) pscore(phat) n(1)
norep cal(0.0087)
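To show the logic of 1-to-1 nearest neighbor matching with a caliper, here is a minimal Python sketch. It is a greedy illustration only and does not reproduce the internals of psmatch2. The data, the caliper of 0.05, and the function names are hypothetical.

```python
import random

def nn_match(ps_treated, ps_control, caliper, seed=0):
    """Greedy 1-to-1 nearest neighbor matching without replacement.
    Returns a list of (treated_index, control_index) pairs."""
    rng = random.Random(seed)
    order = list(range(len(ps_treated)))
    rng.shuffle(order)                       # random processing order
    available = set(range(len(ps_control)))
    pairs = []
    for i in order:
        if not available:
            break
        # closest control that has not been used yet
        j = min(available, key=lambda c: abs(ps_treated[i] - ps_control[c]))
        if abs(ps_treated[i] - ps_control[j]) <= caliper:
            pairs.append((i, j))
            available.remove(j)              # without replacement
    return pairs

def att_from_pairs(pairs, y_treated, y_control):
    # Mean outcome difference across matched pairs
    return sum(y_treated[i] - y_control[j] for i, j in pairs) / len(pairs)

# Hypothetical toy data
ps_treated = [0.30, 0.52, 0.90]
ps_control = [0.28, 0.33, 0.50, 0.60]
y_treated  = [12.0, 20.0, 40.0]
y_control  = [10.0, 11.0, 15.0, 18.0]

pairs = nn_match(ps_treated, ps_control, caliper=0.05)
att = att_from_pairs(pairs, y_treated, y_control)
```

The treated unit with score 0.90 has no control within the caliper and stays unmatched, so a tighter caliper trades sample size for closer matches.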
Radius and Kernel Matching
Radius matching
Each treated unit is matched only with the control
units whose propensity score falls in a predefined
neighborhood of the propensity score of the treated
unit
Kernel matching
All treated are matched with a weighted average of
all controls
Weights are inversely proportional to the distance
between the propensity scores of treated and
controls
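Both estimators can be sketched in a few lines of Python. The Gaussian kernel, the bandwidth, the radius, and the toy data below are assumptions made for illustration; the slide only says that weights fall as the distance between propensity scores grows.

```python
import math

def radius_att(ps_t, y_t, ps_c, y_c, radius):
    """Each treated unit is compared with the mean outcome of all
    controls whose propensity score lies within `radius`."""
    diffs = []
    for p, y in zip(ps_t, y_t):
        near = [yc for pc, yc in zip(ps_c, y_c) if abs(p - pc) <= radius]
        if near:   # treated units with no control in the radius are dropped
            diffs.append(y - sum(near) / len(near))
    return sum(diffs) / len(diffs)

def kernel_att(ps_t, y_t, ps_c, y_c, bandwidth):
    """Each treated unit is compared with a weighted average of all
    controls; weights shrink with distance (Gaussian kernel assumed)."""
    diffs = []
    for p, y in zip(ps_t, y_t):
        w = [math.exp(-0.5 * ((p - pc) / bandwidth) ** 2) for pc in ps_c]
        y0 = sum(wi * yc for wi, yc in zip(w, y_c)) / sum(w)
        diffs.append(y - y0)
    return sum(diffs) / len(diffs)

# Hypothetical toy data
ps_t, y_t = [0.30, 0.60], [12.0, 20.0]
ps_c, y_c = [0.28, 0.33, 0.58, 0.90], [10.0, 11.0, 15.0, 30.0]

att_radius = radius_att(ps_t, y_t, ps_c, y_c, radius=0.05)
att_kernel = kernel_att(ps_t, y_t, ps_c, y_c, bandwidth=0.06)
```

With a very small bandwidth the kernel estimate approaches nearest neighbor matching with replacement, while a large bandwidth moves it toward a simple difference in means.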
Radius Matching
[Figure: radius matching, with the radius r marked around a treated unit's propensity score.]
Kernel Matching
(psmatch2 treatment, outcome(cost) pscore(phat) kernel)
[Figure: kernel matching. Treated units (cost C_T, propensity score P_T) are compared with control units (propensity score P_C, cost C_C) at distances d1, d2, d3.]
Quantifiable Criteria-1
C1. Two sample t-statistics & Chi-square test
between treatment and matched control
observations
-Insignificant values
C1: T-test or chi-square test p-values
Variables           M1      M2      M3      M4      M5      M6      M7
Age                 0       0       0.001   0.709   0       0       0
Female              0.267   0.868   0.87    0.999   0.233   0.376   0.255
Male                0.267   0.868   0.87    0.999   0.233   0.376   0.255
Northeast           0.005   0.003   0.514   0.482   0.002   0       0.006
North central       0       0       0.962   0.999   0       0       0
South               0       0       0.966   0.999   0.003   0       0
West                0       0       0.948   0.999   0.61    0.024   0.45
Other region        0       0       0.999   0.999   0       0       0
CCI                 0       0       0.255   0.999   0       0       0
Point of service    0.407   0.937   0.891   0.999   0.455   0.689   0.515
Other plan type     0.407   0.937   0.891   0.999   0.455   0.689   0.515
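For a continuous covariate, the test statistic behind criterion C1 can be computed as below. This sketch uses Welch's unequal-variance version of the two-sample t-test, which is an assumption, since the slide does not say which variant was used. The data are hypothetical.

```python
import math
import statistics

def welch_t(x_treated, x_control):
    """Welch two-sample t statistic and its approximate degrees of freedom."""
    n1, n2 = len(x_treated), len(x_control)
    v1 = statistics.variance(x_treated) / n1   # squared standard error, treated
    v2 = statistics.variance(x_control) / n2   # squared standard error, control
    t = (statistics.mean(x_treated) - statistics.mean(x_control)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical covariate values (e.g. age) after matching
age_treated = [2.0, 4.0, 6.0]
age_control = [1.0, 3.0, 5.0]

t_stat, df = welch_t(age_treated, age_control)
```

A small absolute t statistic, and hence a large p-value, indicates that the covariate is balanced between the matched groups.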
Quantifiable Criteria-2
C2. Mean difference as a percentage of the
average standard deviation
$100 \cdot (\bar{X}_T - \bar{X}_C) \,/\, \left[\tfrac{1}{2}(s_{xT} + s_{xC})\right]$
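Criterion C2 is straightforward to compute for a single covariate. Here is a minimal sketch, assuming sample standard deviations and hypothetical data:

```python
import statistics

def standardized_difference(x_treated, x_control):
    """Mean difference as a percentage of the average standard deviation:
    100 * (mean_T - mean_C) / (0.5 * (sd_T + sd_C))."""
    mean_diff = statistics.mean(x_treated) - statistics.mean(x_control)
    avg_sd = 0.5 * (statistics.stdev(x_treated) + statistics.stdev(x_control))
    return 100.0 * mean_diff / avg_sd

# Hypothetical covariate values after matching
x_treated = [2.0, 4.0, 6.0]
x_control = [1.0, 3.0, 5.0]

std_diff = standardized_difference(x_treated, x_control)
```

Unlike a p-value, this measure does not shrink mechanically as the matched sample gets smaller, so it is a useful complement to C1.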
C2: The mean difference as a percentage of the average standard deviation
Variables           M1      M2      M3      M4      M5      M6      M7
M1: Nearest Neighbor; M2: 2 to 1; M3: Mahalanobis; M4: Mahalanobis with caliber
M5: Radius; M6: Kernel; M7: Stratified
Quantifiable Criteria-5
C5. Use the Kolmogorov-Smirnov test to
compare the density estimates of
the propensity scores of control units
with those of the treated units
-Insignificant values
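The statistic behind criterion C5 is the largest gap between the two empirical distribution functions of the propensity score. The sketch below computes only that statistic, on hypothetical data. A p-value would normally come from a statistics package.

```python
def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])                  # next point where a CDF jumps
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# Hypothetical propensity scores of treated units and matched controls
ps_treated = [0.10, 0.40]
ps_matched = [0.20, 0.30]

d_matched = ks_statistic(ps_treated, ps_matched)
d_disjoint = ks_statistic([0.1, 0.2], [0.8, 0.9])   # no overlap at all
```

A statistic near 0 means the two score distributions are nearly identical. A value of 1 means they do not overlap.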
C5: Comparison of the density estimates of the propensity scores of control units with those of the treated units
                    M1      M2      M3      M4      M5      M6      M7
Propensity Scores   0       0       0.001   0.192   0       0       0
Stata command:
glm cost treatment age …, link(log) family(gamma) robust
Estimated Total Health Care Expenditure
                               Regression Based
Matching Type                  Difference  S.E.     Treatment  Control  Difference  S.E.
Unmatched                      $4,247      $489     $10,398    $3,345   $7,053      $742
M1: Nearest neighbor           $3,969      $1,135   $10,398    $7,377   $3,021      $1,275
M2: 2 to 1                     $5,157      $1,232   $10,398    $7,364   $3,034      $978
M3: Mahalanobis                $4,823      $1,205   $10,398    $6,892   $3,506      $2,281
M4: Mahalanobis with caliber   $4,456      $994     $11,104    $6,641   $4,463      $3,252
M5: Radius                     $4,601      $659     $10,398    $7,786   $2,612      $1,278
M6: Kernel                     $4,823      $1,205   $10,398    $7,942   $2,456      $2,281
M7: Stratified                 $3,754      $1,009   $10,398    $8,358   $2,040      $2,564
Multivariate Regression After Propensity Score Matching
Is it necessary?
Results are at least as good as the ones after
Propensity Score Matching alone
Tells us the marginal effect of each variable on
the outcome measure
glm cost treatment age …, link(log) family(gamma) robust
Increases efficiency - double filtering
Covers your mistakes!
Why do we need Propensity Score Matching if we run multivariate regression? - 1
Conclusion
Propensity score matching creates a “quasi-randomized”
experiment from observational data
For retrospective data when true randomization is not
possible
Choosing among different types of matching
techniques is important and we should look at
several criteria
Multivariate analysis after applying the correct
matching technique increases the efficiency of
the outcome estimator
Research of Interest
Propensity Score Estimation with More
than Two Categories
(Imbens, 1999)
Accounting for Limited Overlap in
Estimation of Average Treatment Effect
(Optimal Subpopulation Average Treatment
(OSATE) estimation) (Crump, Hotz, Imbens,
Mitnik, work in progress)
Combining regression and propensity
score matching (Double Robustness)
(Wooldridge, work in progress)
Commonly Asked Questions
How do we select the propensity score
variables?
Should we include/exclude any propensity
score variables when we build our outcomes
regression covariate list?
Is it possible to over-match?
How can we control for unobservable effects?