Logit, Probit and Multinomial Logit Models in R: Oscar Torres-Reyna
Logit, Probit and Multinomial Logit Models in R: Oscar Torres-Reyna
Logit, Probit and Multinomial Logit Models in R: Oscar Torres-Reyna
models in R
(v. 3.5)
Oscar Torres-Reyna
[email protected]
If outcome or dependent variable is categorical but are ordered (i.e. low to high), then use ordered
logit or ordered probit models. Some examples are:
Do you agree or disagree with the President? What is your socioeconomic status?
1 ‘Disagree’ 1 ‘Low’
2 ‘Neutral’ 2 ‘Middle’
3 ‘Agree’ 3 ‘High’
If outcome or dependent variable is categorical without any particular order, then use multinomial
logit. Some examples are:
If elections were held today, for which party What do you like to do on the weekends?
would you vote?
1 ‘Rest’
1 ‘Democrats’ 2 ‘Go to movies’
2 ‘Independent’ 3 ‘Exercise’
3 ‘Republicans’ OTR 2
Logit model
# Getting sample data
library(foreign)
mydata <- read.dta("https://dss.princeton.edu/training/Panel101.dta")
summary(logit)
Call:
glm(formula = y_bin ~ x1 + x2 + x3, family = binomial(link = "logit"),
data = mydata)
The Pr(>|z|) column shows the two-tailed p-values testing the null hypothesis that the
Deviance Residuals: coefficient is equal to zero (i.e. no significant effect). The usual value is 0.05, by this
Min 1Q Median 3Q Max measure none of the coefficients have a significant effect on the log-odds ratio of the
-2.0277 0.2347 0.5542 0.7016 1.0839 dependent variable. The coefficient for x3 is significant at 10% (<0.10).
Coefficients:
The z value also tests the null that the coefficient is equal to zero. For a 5%
Estimate Std. Error z value Pr(>|z|) significance, the z-value should fall outside the ±1.96.
(Intercept) 0.4262 0.6390 0.667 0.5048
The Estimate column shows the coefficients in log-odds form. When x3 increase by
x1 0.8618 0.7840 1.099 0.2717
x2 0.3665 0.3082 1.189 0.2343
one unit, the expected change in the log odds is 0.7512. What you get from this
x3 0.7512 0.4548 1.652 0.0986 . column is whether the effect of the predictors is positive or negative. See next page
--- for an extended explanation.
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
library(stargazer)
stargazer(logit, type="html", out="logit.htm")
NOTE: Use the option type = "text" if you want to see the results directly in the RStudio console.
OTR 4
Logit model: odds ratio
Odds ratio interpretation (OR): Based on the output below, when x3 increases by one unit, the odds of y = 1 increase by 112% -(2.12-1)*100-.
Or, the odds of y =1 are 2.12 times higher when x3 increases by one unit (keeping all other predictors constant). To get the odds ratio, you
need explonentiate the logit coefficient.
# Using package -–mfx--
# Estimating the odds ratio by hand
library(mfx)
cbind(Estimate=round(coef(logit),4), logitor(y_bin ~ x1 + x2 + x3, data=mydata)
Call:
OR=round(exp(coef(logit)),4)) logitor(formula = y_bin ~ x1 + x2 + x3, data = mydata)
The Estimate column shows the coefficients in log-odds form. When x3 increase by one unit, the expected change in the log odds is 0.7512. Lets hold x1 and x2
constant at their means, and vary x3 with values 1, 2, and 3, to get the predicted log-odds given each of the three values of x3:
logit.or = exp(coef(logit))
logit.or
(Intercept) x1 x2 x3
1.531417 2.367352 1.442727 2.119566
library(stargazer)
stargazer(logit, type="html", coef=list(logit.or), p.auto=FALSE, out="logitor.htm")
NOTE: Use the option type = "text" if you want to see the results
OTR directly in the RStudio console. 6
Logit model: predicted probabilities
The logit model can be written as (Gelman and Hill, 2007):
Pr(yi = 1) = Logit-1(Xiβ)
In the example:
coef(logit)
(Intercept) x1 x2 x3
0.4261935 0.8617722 0.3665348 0.7512115
Estimating the probability at the mean point of each predictor can be done by inverting the logit model.
Gelman and Hill provide a function for this (p. 81), also available in the R package –arm-
Pr(yi = 1) = 0.8328555
OTR 7
Logit model: predicted probabilities
Adding categorical variable, the model would be:
invlogit(coef(logit.cat)[1]+
coef(logit.cat)[2]*mean(mydata$x1)+
coef(logit.cat)[3]*mean(mydata$x2)+
coef(logit.cat)[4]*mean(mydata$x3)+
coef(logit.cat)[5]*1)
OTR 8
Logit model: predicted probabilities
invlogit = function (x) {1/(1+exp(-x))}
Estimating the probability when opinion = ‘Disagree’
invlogit(coef(logit.cat)[1]+
coef(logit.cat)[2]*mean(mydata$x1)+
coef(logit.cat)[3]*mean(mydata$x2)+
coef(logit.cat)[4]*mean(mydata$x3)+
coef(logit.cat)[6]*1)
Pr(yi = 1| opinion= “Disagree”) = 0.9077609
Estimating the probability when opinion = ‘Strongly disagree’
invlogit(coef(logit.cat)[1]+
coef(logit.cat)[2]*mean(mydata$x1)+
coef(logit.cat)[3]*mean(mydata$x2)+
coef(logit.cat)[4]*mean(mydata$x3)+
coef(logit.cat)[7]*1)
Pr(yi = 1| opinion= “Strongly disagree”) = 0.933931
Estimating the probability when opinion = ‘Strongly agree’
invlogit(coef(logit.cat)[1]+
coef(logit.cat)[2]*mean(mydata$x1)+
coef(logit.cat)[3]*mean(mydata$x2)+
coef(logit.cat)[4]*mean(mydata$x3))
Pr(yi = 1| opinion= “Strongly agree”) = 0.8764826
OTR 9
Logit model: predicted probabilities
Another way to estimate the predicted probabilities is by setting initial conditions.
Getting predicted probabilities holding all predictors or independent variables to their means.
After estimating the logit model and creating the dataset with the mean values of the predictors, you can use the predict()
function to estimate the predicted probabilities (for help/details type ?predict.glm), and add them to the allmean dataset.
allmean
x1 x2 x3 pred.prob
1 0.6480006 0.1338694 0.761851 0.8328555
When all predictor values are hold to their means, the probability of y = 1 is 83%.
OTR 10
Logit model: predicted probabilities with categorical variable
logit <- glm(y_bin ~ x1+x2+x3+opinion, family=binomial(link="logit"), data=mydata)
To estimate the predicted probabilities, we need to set the initial conditions. Getting predicted probabilities holding all predictors or
independent variables to their means for each category of categorical variable ‘opinion’:
Creating a new
allmean <- data.frame(x1=rep(mean(mydata$x1),4),
dataset with the
x2=rep(mean(mydata$x2),4), mean values of the
x3=rep(mean(mydata$x3),4), predictors for each
opinion=as.factor(c("Str agree","Agree","Disag","Str disag"))) category
allmean
x1 x2 x3 opinion
1 0.6480006 0.1338694 0.761851 Str agree
2 0.6480006 0.1338694 0.761851 Agree
3 0.6480006 0.1338694 0.761851 Disag
4 0.6480006 0.1338694 0.761851 Str disag
The object with the Dataset with the Requesting predicted Standard error of
logit coefficients conditions probabilities the prediction
allmean
x1 x2 x3 opinion fit se.fit residual.scale
1 0.6480006 0.1338694 0.761851 Str agree 0.8764826 0.07394431 1
2 0.6480006 0.1338694 0.761851 Agree 0.5107928 0.15099064 1
3 0.6480006 0.1338694 0.761851 Disag 0.9077609 0.06734568 1
4 0.6480006 0.1338694 0.761851 Str disag 0.9339310 0.06446677 1
(continue next page)
OTR 11
Logit model: predicted probabilities with categorical variable
names(allmean)[names(allmean)=="se.fit"] = "se.prob"
allmean
OTR 12
Logit model: predicted probabilities with categorical variable
# Plotting predicted probabilities and confidence intervals using ggplot2
library(ggplot2)
1.0
0.8
Pr(y=1)
0.6
0.4
0.2
Str agree Agree Disag Str disag
OTR Opinion 13
add footnote here
Logit model: marginal effects
# Using package –mfx-
# See http://cran.r-project.org/web/packages/mfx/mfx.pdf
install.packages("mfx") #Do this only once
library(mfx)
logitmfx(y_bin ~ x1+x2+x3, data=mydata)
Call:
logitmfx(formula = y_bin ~ x1 + x2 + x3, data = mydata)
OTR 14
Ordinal logit model
# Getting sample data
library(foreign)
mydata <- read.dta("https://dss.princeton.edu/training/Panel101.dta")
summary(m1)
Call:
polr(formula = opinion ~ x1 + x2 + x3, data = mydata, Hess = TRUE)
Coefficients:
Value Std. Error t value
x1 0.98140 0.5641 1.7397
x2 0.24936 0.2086 1.1954
x3 0.09089 0.1549 0.5867
Intercepts:
Value Std. Error t value
Str agree|Agree -0.2054 0.4682 -0.4388
Agree|Disag 0.7370 0.4697 1.5690
Disag|Str disag 1.9951 0.5204 3.8335
OTR 15
Ordinal logit model: p-values
# Getting coefficients and p-values
m1.coef
OTR 16
Ordered logit model
# The stargazer() function from the package –stargazer allows a publication quality of the
logit model.
# The model will be saved in the working directory under the name ‘m1.htm’ which you can open
with Word or any other word processor.
library(stargazer)
stargazer(m1, type="html", out="m1.htm")
NOTE: Use the option type = "text" if you want to see the results directly in the RStudio console.
OTR 17
Ordered logit model: odds ratios
# Relative risk ratios allow an easier interpretation of the logit coefficients. They are the
exponentiated value of the logit coefficients.
m1.or=exp(coef(m1))
m1.or
x1 x2 x3
2.668179 1.283198 1.095150
library(stargazer)
stargazer(m1, type="html", coef=list(m1.or), p.auto=FALSE, out="m1or.htm")
NOTE: Use the option type = "text" if you want to see the results
OTR directly in the RStudio console. 18
Ordinal logit model: predicted probabilities
# Use "probs" for predicted probabilities
The bold numbers are the predicted probabilities of each category when all
predictors are at their mean value
OTR 19
Ordinal logit model: predicted probabilities
# At specific values, example x1 and x2 at their means, and x3 = 1 and x3 = 2.
# Use "probs" for predicted probabilities given specific predictors
setup1 <- data.frame(x1=rep(mean(mydata$x1),2),
x2=rep(mean(mydata$x2),2),
x3=c(1,2)) Setup for new predicted
setup1 probabilities
x1 x2 x3
1 0.6480006 0.1338694 1
2 0.6480006 0.1338694 2
OTR 20
Ordinal logit model: marginal effects
# Load package "erer", use function ocMe() for marginal effects
library(erer)
x$out
OTR 21
Multinomial logit model
# Loading the required packages
library(foreign)
library(nnet)
library(stargazer)
mydata = read.dta("http://www.ats.ucla.edu/stat/data/hsb2.dta")
table(mydata$ses)
NOTE: This section is based on the UCLA website http://www.ats.ucla.edu/stat/r/dae/mlogit.htm, applied to data from the page
http://www.ats.ucla.edu/stat/stata/output/stata_mlogit_output.htm. Results here reproduce the output in the latter to compare, and to
provide an additional source to interpret outcomes. OTR 22
Multinomial logit model
# Running the multinomial logit model using the multinom() function
summary(multi1)
Call:
multinom(formula = ses2 ~ science + socst + female, data = mydata)
Coefficients:
(Intercept) science socst femalefemale These are the logit coefficients relative
low 1.912288 -0.02356494 -0.03892428 0.81659717 to the reference category. For example,
high -4.057284 0.02292179 0.04300323 -0.03287211 under ‘science’, the -0.02 suggests that
for one unit increase in ‘science’ score,
Std. Errors: the logit coefficient for ‘low’ relative to
(Intercept) science socst femalefemale ‘middle’ will go down by that amount,
low 1.127255 0.02097468 0.01951649 0.3909804
-0.02.
high 1.222937 0.02087182 0.01988933 0.3500151
In other words, if your science score
Residual Deviance: 388.0697
increases one unit, your chances of
AIC: 404.0697 staying in the middle ses category are
higher compared to staying in low ses.
OTR 23
Multinomial logit model
# The multinom() function does not provide p-values, you can get significance of the
coefficients using the stargazer() function from the package –stargazer.
# The model will be saved in the working directory under the name ‘multi1.htm’ which you can
open with Word or any other word processor.
library(stargazer)
stargazer(multi1, type="html", out="multi1.htm")
NOTE: Use the option type = "text" if you want to see the results directly in the RStudio console.
OTR 24
Multinomial logit model: relative risk ratios
# Relative risk ratios allow an easier interpretation of the logit coefficients. They are the
exponentiated value of the logit coefficients.
multi1.rrr = exp(coef(multi1))
multi1.rrr
(Intercept) science socst femalefemale
low 6.76855944 0.9767105 0.9618235 2.2627869
high 0.01729593 1.0231865 1.0439413 0.9676623
library(stargazer)
stargazer(multi1, type="html", coef=list(multi1.rrr), p.auto=FALSE, out="multi1rrr.htm")
NOTE: Use the option type = "text" if you want to see the results directly in the RStudio console.
OTR 25
Ordinal logit model: predicted probabilities
# At specific values, example science and socst at their means for males and females.
# Use "probs" for predicted probabilities given specific predictors
allmean
science socst female pred.prob.middle pred.prob.low pred.prob.high
1 51.85 52.405 male 0.5555769 0.1441171 0.3003061
2 51.85 52.405 female 0.4739293 0.2781816 0.2478890
allmean
science socst female pred.prob
1 51.85 52.405 male middle These are the predicted categories given the new data
2 51.85 52.405 female middle OTR 26
Sources
Greene, Econometric Analysis, 7th. ed.
Gelman and Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, 2007
UCLA, http://www.ats.ucla.edu/stat/r/dae/
StatsExchange, http://stats.stackexchange.com/
R packages:
-mfx- http://cran.r-project.org/web/packages/mfx/mfx.pdf
-erer- http://cran.r-project.org/web/packages/erer/erer.pdf
OTR 27