Logistic Regression
Logistic Regression
Logistic Regression
Introduction
1
Logistic Regression
2
Logistic Regression
Odds
3
Logistic Regression
Odds: Examples
Let’s say that the probability of success in a random experiment
is .8, thus p = .8
Then the probability of failure is q = 1 – p = .2
Odds are determined from probabilities and range between 0 and
infinity
odds(success) = p/(1-p) or p/q = .8/.2 = 4, (or 4:1)
that is, the odds of success are 4 to 1. which means 4 times out
of 5 it will be a success.
odds(failure) = q/p = .2/.8 = .25 (or 1:4)
Which means the odds of failure are 1 to 4
Odds greater than 1 indicates success is more likely than failure
Odds less than 1 indicates failure is more likely than success.
4
Logistic Regression
5
Logistic Regression
If you are male, the probability of being admitted is 0.7 and the
probability of not being admitted is 0.3.
Here are the same probabilities for females,
p = 3/10 = .3 q = 1 – .3 = .7
If you are female it is just the opposite, the probability of being
admitted is 0.3 and the probability of not being admitted is 0.7.
6
Logistic Regression
Odds: Examples 2
Thus, for a male, the odds of being admitted are 5.44 times
larger than the odds for a female being admitted.
7
Logistic Regression
Odds: Examples 2
8
Logistic Regression
Logit
9
Logistic Regression
Logit Example
10
Logistic Regression
11
Logistic Regression
12
Logistic Regression
14
Simple Logistic Regression
Taking the natural log of the odds makes the variable more
suitable for a regression, so the result of a logistic regression is
an equation that looks like this:
ln[Y/(1−Y)]=a+bX
You find the slope (b) and intercept (a) of the best-fitting
equation in a logistic regression using the maximum-likelihood
method, rather than the least-squares method you use for linear
regression.
Simple Logistic Regression
For the spider example, the equation is:
ln[Y/(1−Y)]=−1.6476+5.1215(grain size)
In order to predict the probability that spiders would live there, you
could measure the sand grain size, plug it into the equation, and get
an estimate of Y, the probability of spiders being on the beach.
Simple Logistic Regression
Furthermore, prediction about the presence or absence may be
made easily by making a decision based on a threshold (say 0.5)
on probability.
For example if the value of the probability is greater than 0.5, the
prediction is that for the given grain size, the spiders exist on the
beach, otherwise not.