Qualitative Response Regression Questions
Answer
For example, suppose we want to study the labor force participation (LFP) decision of adult males.
Since an adult is either in the labor force or not, LFP is a yes or no decision. Hence, the response
variable, or regressand (dependent variable), can take only two values, say, 1 if the person is in the
labor force and 0 if he or she is not. In other words, the regressand (dependent variable) is a binary,
or dichotomous, variable. For the present purposes, the important thing to note is that the
regressand (dependent variable) is a qualitative variable.
2. Explain the key difference between quantitative and qualitative response models
Answer
In a model where Y is quantitative, our objective is to estimate its expected, or mean, value given
the values of the regressors (independent variables). In models where Y is qualitative, by contrast,
our objective is to find the probability of something happening, such as voting for a Democratic
candidate, owning a house, belonging to a union, or participating in a sport. Hence,
qualitative response regression models are often known as probability models.
3. With a vivid example, explain the meaning of a trichotomous variable
Answer
A trichotomous variable is a categorical variable that has three distinct categories or levels. It
differs from a binary variable, which has only two categories.
Example: Suppose a voter chooses among three parties, CHADEMA, CCM and CUF. The dependent, or
response, variable (the party chosen) is then a trichotomous variable.
4. What is the Linear Probability Model (LPM)? Discuss how the LPM gives rise to the Bernoulli
probability distribution
Answer
Consider the following regression model:
Yi = β1 + β2Xi + ui
where X = family income and Y = 1 if the family owns a house and 0 if it does not own a house.
The model above looks like a typical linear regression model, but because the regressand is binary,
or dichotomous, it is called a linear probability model (LPM).
This is because the conditional expectation of Yi given Xi, E (Yi | Xi), can be interpreted as the
conditional probability that the event will occur given Xi, that is, Pr (Yi = 1 | Xi). Thus, in our
example, E (Yi | Xi) gives the probability that a family whose income is the given amount Xi owns
a house.
The justification of the name LPM for models like the equation above can be seen as follows:
Assuming E (ui) = 0, as usual (to obtain unbiased estimators), we obtain E (Yi | Xi) = β1 + β2Xi.
Now, if Pi = probability that Yi = 1 (that is, the event occurs), and (1 − Pi) = probability that
Yi = 0 (that is, the event does not occur), the variable Yi has the following (probability)
distribution:
Yi        Probability
0         1 − Pi
1         Pi
Total     1
That is, Yi follows the Bernoulli probability distribution. By the definition of mathematical
expectation, E (Yi) = 0(1 − Pi) + 1(Pi) = Pi. Comparing this with the expression above gives
E (Yi | Xi) = β1 + β2Xi = Pi.
In general, the expectation of a Bernoulli random variable is the probability that the random
variable equals 1.
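The link between the LPM and the Bernoulli mean can be checked numerically. The sketch below is only an illustration: the coefficients and the income value are hypothetical, not estimates from any data set.

```python
def bernoulli_mean(p):
    """Mean of Y, where Y = 1 with probability p and Y = 0 with probability 1 - p."""
    return 0 * (1 - p) + 1 * p


def lpm_conditional_mean(beta1, beta2, x):
    """E(Y | X) = beta1 + beta2 * X, which the LPM reads as Pr(Y = 1 | X)."""
    return beta1 + beta2 * x


# Hypothetical values, chosen only for illustration.
beta1, beta2, income = -0.9, 0.1, 14
p_i = lpm_conditional_mean(beta1, beta2, income)

print("Pr(Y = 1 | X):", round(p_i, 4))
print("E(Y) of the matching Bernoulli variable:", round(bernoulli_mean(p_i), 4))
```

Both printed values coincide, which is the sense in which E (Yi | Xi) is a probability.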
5. Explain the setbacks of using the LPM and the ways of correcting those setbacks
i. Non-Normality of the Disturbances ui
The assumption of normality for ui is not tenable for the LPMs because, like Yi, the disturbances
ui also take only two values; that is, they also follow the Bernoulli distribution.
ui = Yi − β1 − β2Xi
                 ui                 Probability
When Yi = 1      1 − β1 − β2Xi      Pi
When Yi = 0      −β1 − β2Xi         1 − Pi
Obviously, ui cannot be assumed to be normally distributed; they follow the Bernoulli distribution.
We can mitigate this problem by increasing the sample size: the OLS point estimates remain
unbiased, and as the sample size grows the OLS estimators tend to be normally distributed, so the
usual statistical inference is approximately valid in large samples.
ii. Heteroscedastic Variances of the Disturbances
As statistical theory shows, for a Bernoulli distribution the theoretical mean and variance are,
respectively, p and p(1 − p), where p is the probability of success (i.e., something happening),
showing that the variance is a function of the mean. Hence the error variance is heteroscedastic.
We already know that, in the presence of heteroscedasticity, the OLS estimators, although
unbiased, are not efficient; that is, they do not have minimum variance.
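The two-point distribution of ui, and the way its variance changes with X, can be illustrated as follows. This is a minimal sketch that assumes hypothetical coefficients β1 = −0.9 and β2 = 0.1; it shows that ui has mean zero but variance Pi(1 − Pi), which differs across observations.

```python
def disturbance_moments(beta1, beta2, x):
    """Return (P, mean of u, variance of u) for the LPM disturbance at a given X."""
    p = beta1 + beta2 * x          # P = Pr(Y = 1 | X)
    u_if_y1 = 1 - p                # value of u when Y = 1, probability P
    u_if_y0 = -p                   # value of u when Y = 0, probability 1 - P
    mean_u = u_if_y1 * p + u_if_y0 * (1 - p)
    var_u = u_if_y1 ** 2 * p + u_if_y0 ** 2 * (1 - p)
    return p, mean_u, var_u


# Hypothetical coefficients and X values, for illustration only.
for x in (10, 14, 18):
    p, mean_u, var_u = disturbance_moments(-0.9, 0.1, x)
    print(f"X = {x}: P = {p:.2f}, E(u) = {mean_u:.2f}, var(u) = {var_u:.2f}")
```

The variance is largest where P is near 0.5 and shrinks as P approaches 0 or 1, so the disturbances cannot be homoscedastic.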
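Since the variance of ui depends on E (Yi | Xi), one way to resolve the heteroscedasticity problem
is to transform the model by dividing it through by √wi, where
wi = E (Yi | Xi)[1 − E (Yi | Xi)] = Pi(1 − Pi) is the variance of ui:
Yi/√wi = β1/√wi + β2(Xi/√wi) + ui/√wi
The transformed disturbance ui/√wi is homoscedastic. Because the true wi are unknown, in practice
they are replaced by the estimates Ŷi(1 − Ŷi) obtained from a first-stage OLS regression.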
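The weighting idea can be sketched as a two-step procedure. The example below is an illustration under stated assumptions: the income and ownership data are made up, the weight for each observation is estimated as Ŷi(1 − Ŷi) from a first-stage OLS fit, and observations whose fitted value falls outside the 0–1 interval are dropped because their estimated weight would not be positive. Dividing the model through by the square root of the weight and running OLS is equivalent to weighted least squares with weights equal to the reciprocal of the estimated variance, which is what the helper `fit_line` (a hypothetical name) computes.

```python
def fit_line(x, y, weights=None):
    """Least-squares fit of y = b1 + b2 * x; weights=None gives plain OLS."""
    if weights is None:
        weights = [1.0] * len(x)
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    b2 = s_xy / s_xx
    return y_bar - b2 * x_bar, b2


# Made-up data: family income and home ownership (1 = owns a house).
income = [8, 10, 12, 14, 16, 18, 20, 22]
owns = [0, 0, 0, 1, 0, 1, 1, 1]

# Step 1: plain OLS, then the fitted "probabilities".
b1_ols, b2_ols = fit_line(income, owns)
fitted = [b1_ols + b2_ols * x for x in income]

# Step 2: keep observations with 0 < fitted < 1 so the estimated variance
# fitted * (1 - fitted) is positive, then weight each one by its reciprocal.
kept = [(x, y, f) for x, y, f in zip(income, owns, fitted) if 0 < f < 1]
xs = [x for x, _, _ in kept]
ys = [y for _, y, _ in kept]
ws = [1 / (f * (1 - f)) for _, _, f in kept]
b1_wls, b2_wls = fit_line(xs, ys, ws)

print("OLS:", round(b1_ols, 4), round(b2_ols, 4))
print("WLS:", round(b1_wls, 4), round(b2_wls, 4), "using", len(kept), "observations")
```

In this toy sample the lowest and highest incomes produce fitted values outside 0–1, so only six observations enter the second step.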
iii. Nonfulfillment of 0 ≤ E (Yi | Xi) ≤ 1
Since E (Yi | Xi) in the linear probability models measures the conditional probability of the event
Y occurring given X, it must necessarily lie between 0 and 1. Although this is true a priori, there
is no guarantee that the fitted values Ŷi, the estimators of E (Yi | Xi), will necessarily fulfill
this restriction, and this is the real problem with the OLS estimation of the LPM.
This happens because OLS does not take into account the restriction that 0 ≤ E (Yi | Xi) ≤ 1 (an
inequality restriction). There are two ways of dealing with estimated Ŷi that may not lie between
0 and 1.
a. Estimate the LPM by the usual OLS method and check whether the estimated Ŷi lie
between 0 and 1. (Solution) If some are less than 0 (that is, negative), Ŷi is assumed to be
zero for those cases; if they are greater than 1, they are assumed to be 1.
b. Devise an estimating technique that will guarantee that the estimated conditional
probabilities Ŷi will lie between 0 and 1.
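Option (a) amounts to truncating the fitted values at the two bounds. A minimal sketch, using hypothetical fitted values rather than output from a real regression:

```python
def truncate_to_unit_interval(fitted_values):
    """Set fitted values below 0 to 0 and above 1 to 1; leave the rest unchanged."""
    return [min(1.0, max(0.0, value)) for value in fitted_values]


# Hypothetical OLS fitted values from an LPM, for illustration only.
fitted = [-0.08, 0.25, 0.58, 0.92, 1.08]
truncated = truncate_to_unit_interval(fitted)
print(truncated)
```

Truncation repairs the reported probabilities only after estimation; the coefficients themselves are still the unrestricted OLS estimates.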
iv. Questionable Value of R² as a Measure of Goodness of Fit
To see why, consider the figure above. Corresponding to a given X, Y is either 0 or 1. Therefore,
all the Y values will lie either along the X axis or along the line corresponding to 1. Therefore,
generally no LPM is expected to fit such a scatter well, whether it is the unconstrained LPM (Figure
a) or the truncated or constrained LPM (Figure b), an LPM estimated in such a way that it will not
fall outside the logical band 0–1. As a result, the conventionally computed R² is likely to be much
lower than 1 for such models. In most practical applications the R² ranges between 0.2 and 0.6. R²
in such models will be high, say, in excess of 0.8, only when the actual scatter is very closely
clustered around points A and B (Figure c), for in that case it is easy to fix the straight line by
joining the two points A and B. In this case the predicted Ŷi will be very close to either 0 or 1.
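The behaviour of R² described above can be reproduced with two small made-up samples: one in which the 0s and 1s overlap along the X axis, and one in which all observations cluster at two points. This is a sketch for illustration; the data are hypothetical.

```python
def lpm_r_squared(x, y):
    """Conventional R-squared from an OLS fit of y = b1 + b2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    ss_res = sum((yi - (b1 + b2 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up sample where the 0s and 1s overlap over the range of X.
r2_spread = lpm_r_squared([8, 10, 12, 14, 16, 18, 20, 22], [0, 0, 0, 1, 0, 1, 1, 1])

# Made-up sample clustered at two points, like A and B in the text.
r2_clustered = lpm_r_squared([1, 1, 1, 1, 10, 10, 10, 10], [0, 0, 0, 0, 1, 1, 1, 1])

print(round(r2_spread, 3), round(r2_clustered, 3))
```

The first sample gives an R² inside the 0.2 to 0.6 range mentioned above, while the clustered sample gives an R² of essentially 1.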
6. Derive the Logit model from the Logistic function and explain the meaning of its slope
coefficients
7. The following equation gives an estimated LPM for having children, where the
dependent variable is binary, taking the value 1 if a woman has children and 0 if she
does not.
Children = −1.997 + 0.175 age − 0.090 educ − 0.362 electric
se       = (0.094)  (0.003)     (0.006)      (0.0680)
n = 4361, R² = 0.560
where
age – the age of the woman in years
educ – the number of years the woman spent at school
electric – a dummy variable taking the value 1 if the woman is connected to electricity and
0 otherwise
Use the above information to answer the following questions.
i. Interpret the results
Solutions
The intercept of −1.997 gives the "probability" that a woman with age, educ and electric all equal
to zero has children. Since this value is negative, and since a probability cannot be negative, we
treat this value as zero.
The coefficient of 0.175 means that for a one-year increase in age, the probability that a woman
has children increases on average by 0.175, or about 17.5 percentage points, holding other factors
constant.
The coefficient of −0.090 means that for each additional year a woman spent at school (educ), the
probability that she has children decreases on average by 0.090, or about 9.0 percentage points,
holding other factors constant.
The coefficient of −0.362 means that the probability that a woman has children is on average lower
by 0.362, or about 36.2 percentage points, when she is connected to electricity, holding other
factors constant.
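The estimated equation can also be evaluated for particular women. The sketch below uses the reported coefficients; the three profiles are hypothetical and were chosen only to show a fitted value inside the 0–1 interval, one above 1 and one below 0.

```python
def predicted_probability(age, educ, electric):
    """Fitted value from the estimated LPM for having children."""
    return -1.997 + 0.175 * age - 0.090 * educ - 0.362 * electric


# Hypothetical profiles: (age in years, years of schooling, electricity dummy).
profiles = {
    "A": (20, 10, 0),
    "B": (30, 12, 1),
    "C": (15, 8, 1),
}

for name, (age, educ, electric) in profiles.items():
    p_hat = predicted_probability(age, educ, electric)
    in_range = 0 <= p_hat <= 1
    print(f"Profile {name}: fitted value = {p_hat:.3f}, within 0-1: {in_range}")
```

Profiles B and C illustrate the main weakness of the LPM: the fitted "probabilities" are not restricted to lie between 0 and 1.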