Firth's Method (PMLE)
Georg Heinze
Medical University of Vienna
Medicine:
• Side effects of treatment 1/1000s to fairly common
• Hospital-acquired infections 9.8/1000 patient-days
• Epidemiologic studies of rare diseases 1/1000 to 1/200,000
Engineering:
• Rare failures of systems 0.1-1/year
Economy:
• E-commerce click rates 1-2/1000 impressions
Political science:
• Wars, election surprises, vetoes 1/dozens to 1/1000s
…
Georg Heinze
27.09.2017 2
Problems with rare events
Our interest
• Models
• for prediction of binary outcomes
• should be interpretable, i.e., betas should have a meaning: explanatory models
Logistic regression
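$\Pr(Y_i = 1 \mid x_i) = \pi_i = [1 + \exp(-x_i \beta)]^{-1}$
• Likelihood: $L(\beta \mid X) = \prod_{i=1}^{n} \pi_i^{y_i} (1 - \pi_i)^{1 - y_i}$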
• Its nth root: Probability of correct prediction
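A minimal sketch of the last bullet: the n-th root of the likelihood is the geometric mean of the probabilities the model assigns to the outcomes that were actually observed. The function names, the toy data and the coefficient values are illustrative assumptions, not taken from the talk.

```python
import math

def predicted_prob(x, beta):
    """Pr(Y = 1 | x) = 1 / (1 + exp(-x.beta)); x carries a leading 1 for the intercept."""
    eta = sum(xj * bj for xj, bj in zip(x, beta))
    return 1.0 / (1.0 + math.exp(-eta))

def log_likelihood(X, y, beta):
    """Sum of log pi_i for events and log(1 - pi_i) for non-events."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = predicted_prob(xi, beta)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total

def nth_root_of_likelihood(X, y, beta):
    """Geometric mean probability of predicting each observed outcome correctly."""
    return math.exp(log_likelihood(X, y, beta) / len(y))

# Toy data (hypothetical): intercept column plus one binary covariate.
X = [(1, 0), (1, 0), (1, 1), (1, 1)]
y = [0, 0, 0, 1]
print(nth_root_of_likelihood(X, y, beta=(-2.0, 2.0)))
print(nth_root_of_likelihood(X, y, beta=(0.0, 0.0)))  # coin-flip model: every pi_i is 0.5
```

With all coefficients at zero every outcome is predicted with probability 0.5, so the n-th root is 0.5 regardless of the data; a better-fitting model moves it towards 1.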
Rare event problems…
Penalized log-likelihood: $\log L^*(\beta) = \log L(\beta) + A(\beta)$, with penalty term $A(\beta)$
Firth's penalization for logistic regression
Firth-type penalization
• Firth's bias reduction method was proposed as a solution to the problem of separation in logistic regression (Heinze and Schemper, 2002)
        original        augmented (Firth-type penalization)
        A     B         A      B
Y=0     44    4         44.5   4.5
Y=1     1     1         1.5    1.5
        original                augmented
        A     B     total       A       B     total
Y=0     315   5     320         315.5   5.5   321
Y=1     31    1     32          31.5    1.5   33
total   346   6     352         347     7     354
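The augmentation in the 2×2 tables above can be sketched as adding ½ to each cell and recomputing summary quantities. This is an illustration only; the dictionary layout and the function names are assumptions, while the cell counts are those of the table.

```python
def augment(table, add=0.5):
    """Return a copy of the table with `add` added to every cell."""
    return {cell: n + add for cell, n in table.items()}

def event_rate(table):
    """Overall proportion of Y=1 in the table."""
    events = sum(n for (y, _), n in table.items() if y == 1)
    return events / sum(table.values())

def odds_ratio(table, group="B", ref="A"):
    """Odds of Y=1 in `group` divided by odds of Y=1 in `ref`."""
    def odds(g):
        return table[(1, g)] / table[(0, g)]
    return odds(group) / odds(ref)

# Keys are (y, group); counts are taken from the table above.
original = {(0, "A"): 315, (0, "B"): 5, (1, "A"): 31, (1, "B"): 1}
augmented = augment(original)

print(event_rate(original), event_rate(augmented))   # 32/352 versus 33/354
print(odds_ratio(original), odds_ratio(augmented))
```

In this example the augmented table has a slightly higher overall event rate (33/354) than the observed one (32/352), because the added half-counts pull each group's event proportion towards ½.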
Parameter                      ML    Jeffreys-Firth
Bias                           *     +18%
RMSE                           *     0.86
Bayesian non-collapsibility          63.7%
    x1  x2  y           x1  x2  y
1   ∗   ∗   ∗       1   ∗   ∗   ∗
1   ∗   ∗   ∗       1   ∗   ∗   ∗
1   ∗   ∗   ∗       1   ∗   ∗   ∗
1   ∗   ∗   ∗       1   ∗   ∗   ∗     each assigned a weight of 1
1   ∗   ∗   ∗       1   ∗   ∗   ∗
1   ∗   ∗   ∗       1   ∗   ∗   ∗
1   ∗   ∗   ∗       1   ∗   ∗   ∗
                    0   1   0   0
                    0   1   0   1
                    0   0   1   0     each assigned a weight of ½
                    0   0   1   1
• No shrinkage for the intercept, no rescaling of the variables
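A rough sketch of how the augmented data set above could be assembled. For each covariate, two pseudo records are appended: intercept column 0, that covariate set to 1, all others 0, once with y = 0 and once with y = 1, each with weight ½. The function name and the toy rows are hypothetical. Reading this as the logF(1,1) approach mentioned later in the talk is an assumption. Fitting would then be an ordinary weighted maximum likelihood logistic regression on the returned records, which is not shown here.

```python
def augment_with_pseudo_data(rows, n_covariates):
    """rows: list of (x_1, ..., x_p, y).

    Returns records (intercept, x_1, ..., x_p, y, weight).
    Original rows keep intercept 1 and weight 1; pseudo rows get
    intercept 0 (so the intercept is not shrunk) and weight 1/2.
    """
    augmented = [(1, *row[:-1], row[-1], 1.0) for row in rows]
    for j in range(n_covariates):
        unit = [1 if k == j else 0 for k in range(n_covariates)]
        for y in (0, 1):
            augmented.append((0, *unit, y, 0.5))
    return augmented

# Toy data (hypothetical): two covariates and a binary outcome.
rows = [(0.3, 1, 0), (1.2, 0, 1), (-0.7, 1, 0)]
for record in augment_with_pseudo_data(rows, n_covariates=2):
    print(record)
```

Because the pseudo records are expressed in the covariates' own units, no rescaling of the variables takes place, in line with the bullet above.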
        original                augmented
        A     B     total       A       B     total
Y=0     315   5     320         315.5   5.5   321
Y=1     31    1     32          31.5    1.5   33
total   346   6     352         347     7     354
OR (B vs A): 1.84
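$\sum_{i=1}^{n} \left[ y_i - \pi_i + h_i \left( \tfrac{1}{2} - \pi_i \right) \right] x_{ir} = 0; \quad r = 0, \dots, p$
where the $h_i$'s are the diagonal elements of the hat matrix $H = W^{1/2} X (X' W X)^{-1} X' W^{1/2}$, with $W = \mathrm{diag}(\pi_i (1 - \pi_i))$
They are equivalent to:
$\sum_{i=1}^{n} (y_i - \pi_i) x_{ir} + \sum_{i=1}^{n} \frac{h_i}{2} (y_i - \pi_i) x_{ir} + \sum_{i=1}^{n} \frac{h_i}{2} (1 - y_i - \pi_i) x_{ir} = 0$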
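The modified score equations can be solved iteratively. The sketch below is a simplified illustration for a model with an intercept and a single covariate, so that the 2×2 information matrix can be inverted by hand; it uses Fisher-scoring steps without step-halving and is not the implementation used in the talk. The function name and defaults are assumptions.

```python
import math

def firth_fit(x, y, max_iter=200, tol=1e-12):
    """Solve sum_i [y_i - pi_i + h_i (1/2 - pi_i)] x_ir = 0 for logit(pi_i) = b0 + b1 * x_i."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        pi = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [p * (1.0 - p) for p in pi]
        # Fisher information X'WX for the design (1, x_i) and its inverse.
        a = sum(w)
        b = sum(wi * xi for wi, xi in zip(w, x))
        c = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = a * c - b * b
        i00, i01, i11 = c / det, -b / det, a / det
        # Hat matrix diagonal: h_i = w_i * (1, x_i) (X'WX)^{-1} (1, x_i)'.
        h = [wi * (i00 + 2.0 * i01 * xi + i11 * xi * xi) for wi, xi in zip(w, x)]
        # Modified score vector.
        r = [yi - p + hi * (0.5 - p) for yi, p, hi in zip(y, pi, h)]
        u0 = sum(r)
        u1 = sum(ri * xi for ri, xi in zip(r, x))
        d0 = i00 * u0 + i01 * u1
        d1 = i01 * u0 + i11 * u1
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# 2x2 example from the talk: group A (x = 0) has 315 non-events and 31 events,
# group B (x = 1) has 5 non-events and 1 event.
x = [0] * 346 + [1] * 6
y = [0] * 315 + [1] * 31 + [0] * 5 + [1] * 1
b0, b1 = firth_fit(x, y)
# For this saturated model the estimate coincides with the odds ratio of the
# table with 1/2 added to each cell.
print(math.exp(b1), (1.5 / 5.5) / (31.5 / 315.5))
```

For real analyses a tested implementation with safeguards such as step-halving would be preferable; the point here is only to show where the hat diagonals $h_i$ enter the score.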
Georg Heinze – Logistic regression with rare events
CeMSIIS-Section for Clinical Biometrics 30
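FLAC: Firth's Logistic regression with Added Covariate
$\sum_{i=1}^{n} (y_i - \pi_i) x_{ir} + \underbrace{\sum_{i=1}^{n} \frac{h_i}{2} (y_i - \pi_i) x_{ir} + \sum_{i=1}^{n} \frac{h_i}{2} (1 - y_i - \pi_i) x_{ir}}_{\text{Pseudo data}} = 0$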
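Read as a weighted data set, the equation above says that each observation appears three times: once as itself with weight 1, and twice as pseudo data with weight $h_i/2$, once with the observed outcome and once with the opposite outcome. A minimal sketch of that bookkeeping follows; the function name is hypothetical and the hat diagonals `h` are assumed to be supplied from a previous Firth-type fit. The indicator `g` separates original from pseudo records, which is what the "Added Covariate" in the name is understood to refer to here; how it enters the final fit is not shown on this slide.

```python
def firth_pseudo_data(X, y, h):
    """Return records (x, y, weight, g); g = 0 marks original data, g = 1 pseudo data."""
    records = []
    for xi, yi, hi in zip(X, y, h):
        records.append((xi, yi, 1.0, 0))            # original observation
        records.append((xi, yi, hi / 2.0, 1))       # pseudo copy, same outcome
        records.append((xi, 1 - yi, hi / 2.0, 1))   # pseudo copy, opposite outcome
    return records

# Toy input (hypothetical values, including the hat diagonals).
X = [(1, 0), (1, 1)]
y = [0, 1]
h = [0.2, 0.4]
for record in firth_pseudo_data(X, y, h):
    print(record)
```

Summing weight × (y − π) over the three records of one observation gives $y_i - \pi_i + h_i(\tfrac{1}{2} - \pi_i)$, which is the per-observation contribution to the modified score.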
2. Modify the intercept in the Firth-type estimates such that the average predicted probability becomes equal to the observed proportion of events.
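The intercept modification in step 2 can be sketched as a one-dimensional root search: keep the slope part of the linear predictor fixed and move the intercept until the mean predicted probability matches the observed event proportion. The function name, the bisection bounds and the tolerance are assumptions; the example reuses the 2×2 table from earlier in the talk, with the slope taken from the table with ½ added to each cell.

```python
import math

def corrected_intercept(linear_parts, event_rate, lo=-30.0, hi=30.0, tol=1e-12):
    """Find b0 such that mean(1 / (1 + exp(-(b0 + lp_i)))) equals event_rate.

    linear_parts holds x_i.beta without the intercept. The mean predicted
    probability is increasing in b0, so bisection is sufficient. The bounds
    assume moderately sized linear predictors.
    """
    def mean_prob(b0):
        total = sum(1.0 / (1.0 + math.exp(-(b0 + lp))) for lp in linear_parts)
        return total / len(linear_parts)

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean_prob(mid) < event_rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Slope (log odds ratio, B vs A) from the augmented 2x2 table.
b1 = math.log((1.5 / 5.5) / (31.5 / 315.5))
linear_parts = [0.0] * 346 + [b1] * 6        # 346 subjects in A, 6 in B
b0 = corrected_intercept(linear_parts, event_rate=32 / 352)
print(b0, math.log(31.5 / 315.5))            # corrected versus uncorrected intercept
```

Only the intercept changes in this step; the slope estimates are left as they are.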
In our simulation study, we compared FLIC and FLAC to the following methods:
• ridge regression, RR
• RMSE of :
  equal effect sizes: ridge the winner
  unequal effect sizes: very good performance of FLAC and CP, closely followed by logF(1,1)
• Performance decreases if effects are very different
• With penalized (=shrinkage) methods one cannot achieve nominal coverage over all possible parameter values
• But one can achieve nominal coverage averaging over the implicit prior
• Good performance
Please see the reference lists therein for all other citations in this presentation.
Further references:
• Gustafson P, Greenland S. Interval estimation for messy observational data. Statistical Science 2009, 24:328-342.
• Rainey C. Estimating logit models with small samples. www.carlislerainey.com/papers/small.pdf (27 March 2017)