ECON 301 - Midterm - F2020 Answer Key - pdf-1601016920671
Rules:
1. You are required to submit your answers to Moodle by midnight, under “Midterm Exam
Dropbox” in the week of September 21 covering today.
2. The take-home midterm exam is designed to take 2 hours. You have an additional 3 hours
to submit your answers to Moodle, to account for any technical problems you might
encounter. Therefore, if you fail to submit your answers by midnight, you will receive a
zero without exception.
3. You need to write down your own original answers and submit them as a single file. You
can type your answers in software such as Word or LaTeX, or scan your handwritten
answers into a PDF file. In the latter case, you are responsible for the quality of the scan.
4. You are NOT allowed to resubmit, so make sure you are submitting the right file.
5. Making the same mistakes in a similar order will be judged as plagiarism and will be
sanctioned according to the Student’s Code of Misconduct.
6. Answers that are correct but copied & pasted or not properly paraphrased from another
source will be judged as plagiarism and sanctioned according to the Student’s Code of
Misconduct.
7. Especially in open-ended questions, but not limited to them, answers that follow steps similar to
those of your peers, even if properly paraphrased, will be suspected of plagiarism. Your
exam will then not count, and I will instead give you an oral exam in the week of
September 28.
8. The bottom line behind rules 5-7 is to write your own original answers.
9. There are 8 True/False/Uncertain questions and 4 problems, worth 100 points in total.
10. If you believe a question is vague, sharpen it as you see fit before answering.
11. Explain your answers carefully. You will get no credit for unsupported assertions or
guesses. Write as if you are trying to convince an intelligent person who does not already
know the answers. If your answers would not convince such a person, it will be assumed
that you do not really understand the material.
True, False, or Uncertain (24 points, 3 points each). Are the following statements true, false, or uncertain?
Justify your answers briefly. For each, the correct T/F/U is worth only 1 point and the justification is
worth 2 points.
(ii) OLS maximizes the correlation between actual Y and fitted values $\hat{Y}$.
(iv) OLS provides you the causal relation between the dependent variable and independent
variables.
(v) The error term in a regression equation is said to exhibit homoscedasticity if it has the same
value for all values of the explanatory variable.
False. “The error term in a regression equation is said to exhibit homoscedasticity if its variance
has the same value for all values of the explanatory variable.”
(vi) If you are estimating a multiple linear regression and the number of observations in your
sample exceeds the number of slope parameters by 1, the value of $R^2$ will be equal to 1.
True. Think about this as a curve-fitting problem. If the number of parameters in the curve you
are trying to fit to the data is equal to the number of data points, you will get a perfect fit. In this
case we have n parameters: n-1 slope parameters and an intercept parameter. This implies
$R^2 = 1$.
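The curve-fitting argument can be checked numerically in the simplest case. The sketch below is only an illustration: the data points are made up, and it uses one slope parameter with two observations (n = k + 1 = 2), so the fitted line passes through both points. A third, off-line observation is added to show that the perfect fit disappears once n exceeds the number of parameters.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the OLS line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(x, y):
    """R^2 = 1 - SSR / SST for the simple regression of y on x."""
    b0, b1 = simple_ols(x, y)
    y_bar = sum(y) / len(y)
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ssr / sst


# Hypothetical data: two observations, one slope plus an intercept.
r2_two_points = r_squared([1.0, 4.0], [3.0, -2.0])

# Same data plus one more observation that is not on the line.
r2_three_points = r_squared([1.0, 4.0, 6.0], [3.0, -2.0, 5.0])

print(r2_two_points, r2_three_points)
```

With two points the residuals are zero up to floating-point error, so the first value is 1; with three points it falls below 1.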
(vii) Suppose that you run the following 2 regressions: (i) regress Y on X and (ii) regress Y on X
and X2. The second regression will have a smaller standard error estimate.
Uncertain. This depends on how much of the sum of squared residuals will be explained by
X2. Remember that $\hat{\sigma}^2 = \frac{\sum \hat{u}_i^2}{n - k - 1}$. Both the numerator ($\sum \hat{u}_i^2$) and the denominator ($n - k - 1$)
will decrease. Then, the overall effect will depend on which decrease is more pronounced.
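A small arithmetic sketch of this trade-off follows. The sample size and the sums of squared residuals are hypothetical numbers chosen for illustration; the function simply evaluates the square root of SSR / (n - k - 1) for a regression with k slope parameters.

```python
import math


def regression_standard_error(ssr, n, k):
    """Standard error of the regression: sqrt(SSR / (n - k - 1))."""
    return math.sqrt(ssr / (n - k - 1))


n = 12  # hypothetical sample size

# Regression (i): Y on X, with a hypothetical SSR of 10.
ser_short = regression_standard_error(10.0, n, k=1)

# Regression (ii): the extra regressor barely reduces the SSR ...
ser_long_small_gain = regression_standard_error(9.5, n, k=2)

# ... or reduces it substantially.
ser_long_large_gain = regression_standard_error(8.0, n, k=2)

print(ser_short, ser_long_small_gain, ser_long_large_gain)
```

In the first case the lost degree of freedom dominates and the standard error rises; in the second the drop in SSR dominates and it falls.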
(viii) The ordinary least square estimators have the smallest variance among all the linear
unbiased estimators.
Uncertain. This holds only if the first five assumptions are satisfied; see the Gauss-Markov theorem.
Grading: 1 point for correct T/F + {0.5 if bad explanation, 1 if ok explanation, 2 if good
explanation}.
------------------------------------------------------------------------------
AP | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Inventories | .8665732 C
_cons | .2598805 .0084988
------------------------------------------------------------------------------
Find the withheld values of A, B and C, showing your calculations. You can round your answers
to 3 decimal places. (No partial points will be given.)
Answer:
In the STATA output below, obtained from a sample of high schools in a region, math10
is the percentage of students passing the mathematics test, totcomp measures the average teacher
compensation (salary + benefits) in U.S. dollars, enroll denotes the number of students enrolled, and
lnchprg is the percentage of students entitled to the school lunch program. The letter “l” in
front of some of the variables in the regression output denotes that the natural logarithms of these
variables are used.
. sum math10 totcomp enroll lnchprg
------------------------------------------------------------------------------
math10 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ltotcomp | 20.32717 3.960341 5.13 0.000 12.54177 28.11256
lenroll | -1.322726 .6907269 -1.91 0.056 -2.680584 .0351317
_cons | -180.2074 39.19439 -4.60 0.000 -257.2572 -103.1575
------------------------------------------------------------------------------
Holding everything else constant, if the average teacher compensation increases by 1 percent,
the math passing rate increases by about 0.2 percentage points on average.
If the average teacher compensation and enrollment are both equal to 1 (so that their logarithms are zero), we expect the math passing rate
to be about -180 percent on average. This obviously does not make sense, and it is because the
minimum values of enrollment and total compensation in our sample are far from 1.
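The two interpretations can be reproduced from the reported coefficients. The sketch below only redoes the arithmetic: the coefficient values are copied from the regression table, and the variable names are illustrative.

```python
import math

# Coefficients copied from the regression of math10 on ltotcomp and lenroll.
b_ltotcomp = 20.32717
b_lenroll = -1.322726
b_cons = -180.2074

# Level-log rule of thumb: a 1 percent increase in totcomp changes math10
# by roughly b / 100 percentage points.
approx_effect = b_ltotcomp / 100

# The same change computed exactly, using log(1.01) instead of 0.01.
exact_effect = b_ltotcomp * math.log(1.01)

# Prediction when totcomp = enroll = 1, so both logarithms are zero.
predicted_at_one = b_cons + b_ltotcomp * math.log(1) + b_lenroll * math.log(1)

print(round(approx_effect, 3), round(exact_effect, 3), predicted_at_one)
```

Both versions of the slope effect round to 0.2 percentage points, and the prediction at totcomp = enroll = 1 is just the intercept.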
d. (6 points) If we measured the teacher compensation using thousands of U.S. dollars, call it
totcomp_thousands, and run the regression including ltotcomp_thousands instead of
ltotcomp, what would happen to coefficient and standard error estimates of the slope
parameters?
equivalently,
$\widehat{math10} = \hat{\beta}_0 + \hat{\beta}_1 \log(1000) + \hat{\beta}_1 (\text{ltotcomp} - \log(1000)) + \hat{\beta}_2 \, \text{lenroll}$.
Thus, $\widehat{math10} = \hat{\beta}_0 + \hat{\beta}_1 \log(1000) + \hat{\beta}_1 \, \text{ltotcomp\_thousands} + \hat{\beta}_2 \, \text{lenroll}$. This shows
that using ltotcomp_thousands instead of ltotcomp will alter only the intercept estimate, which will
now be given by $\hat{\beta}_0 + \hat{\beta}_1 \log(1000)$. The slope parameter estimates will remain intact.
Their standard error estimates will also remain unchanged, since the variation in
ltotcomp_thousands comes entirely from ltotcomp and lenroll remains intact as a regressor.
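The rescaling result can be verified on a toy example. The sketch below is a simplification under stated assumptions: it uses a single regressor rather than the two in the exam regression, the six data points are invented, and it checks only the coefficient estimates, not the standard errors.

```python
import math


def simple_ols(x, y):
    """Return (intercept, slope) of the OLS line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: compensation in dollars and passing rates in percent.
totcomp = [32000.0, 35500.0, 38000.0, 41000.0, 45000.0, 52000.0]
math10 = [18.0, 22.5, 21.0, 27.0, 30.5, 33.0]

ltotcomp = [math.log(v) for v in totcomp]
ltotcomp_thousands = [math.log(v / 1000) for v in totcomp]

b0, b1 = simple_ols(ltotcomp, math10)            # dollars
c0, c1 = simple_ols(ltotcomp_thousands, math10)  # thousands of dollars

print(b1, c1)                        # slopes agree
print(c0, b0 + b1 * math.log(1000))  # intercept shifts by b1 * log(1000)
```

Because taking logs turns the change of units into a constant shift of the regressor, only the intercept absorbs it.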
e. (6 points) What is the crucial assumption you are making while interpreting OLS estimates?
Did omitting lnchprg from the regression above cause a violation of this assumption?
The zero conditional mean assumption. lnchprg should affect the math passing rate, since poor
students have fewer resources, which lowers their level of success on average. lnchprg and
teacher compensation should also be negatively correlated, since schools with more students in the
lunch program can raise less money from parents. Omitting lnchprg should therefore violate the
zero conditional mean assumption.
f. (8 points) If you include lnchprg in the regression above, how would it change your slope
estimates? Discuss thoroughly.
To answer this question, you need to make assumptions about the correlations among the lunch
program, enrollment, and teacher compensation. I will assume the lunch program is negatively
correlated with both teacher compensation (more obvious) and enrollment (less obvious),
assuming schools do not have a financial incentive to raise their enrollment by accepting
students from disadvantaged backgrounds. This argument also implies that enrollment and
average teacher salary should be positively correlated. Making different sign assumptions for
the correlations is perfectly fine as long as your answer is consistent with your assumptions.
As the lunch program should have a negative estimate in the long regression and its correlations
with both other regressors are assumed to be negative, the resulting direct bias from not
including the lunch program should be positive, suggesting that the short regression produces larger
estimates than the long one. The next step is to look at the indirect passover between enrollment
and teacher compensation. Since the correlation between them is assumed to be positive,
when you over-(under-)estimate one, you will also over-(under-)estimate the other. Then,
given that the direct biases for both regressors are in the same direction, the indirect passover
will only amplify the effect. We can then conclude that if we include the lunch program in the
regression, the estimates for both teacher compensation and enrollment will be lower in the
long regression, consistent with the positive bias we found for the short regression.
I am providing the evidence from the data below to double check these arguments:
. corr lenroll ltotcomp lnchprg
(obs=408)
. reg math10 ltotcomp lenroll lnchprg
------------------------------------------------------------------------------
math10 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ltotcomp | 11.91085 3.816609 3.12 0.002 4.407962 19.41375
lenroll | -1.892355 .6446503 -2.94 0.004 -3.159643 -.6250671
lnchprg | -.3040063 .0372827 -8.15 0.000 -.3772987 -.230714
_cons | -79.56061 38.40213 -2.07 0.039 -155.0536 -4.06766
------------------------------------------------------------------------------
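As a quick check of the sign of the bias, the slope estimates from the short regression (math10 on ltotcomp and lenroll) can be compared with those from the long regression that adds lnchprg. The sketch below only subtracts the coefficients reported in the two tables; the dictionary names are illustrative.

```python
# Slope estimates copied from the two regression tables.
short_reg = {"ltotcomp": 20.32717, "lenroll": -1.322726}
long_reg = {"ltotcomp": 11.91085, "lenroll": -1.892355}

# Short-minus-long difference: positive means the short regression
# overstates the coefficient relative to the long regression.
difference = {name: short_reg[name] - long_reg[name] for name in short_reg}

for name, diff in difference.items():
    print(f"{name}: short - long = {diff:.6f}")
```

Both differences are positive (about 8.42 for ltotcomp and 0.57 for lenroll), which matches the direction argued above.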
Answer:
E(m1) = (1/4) E(Y1) + (1/2) E(Y2) + (1/4) E(Y3) = 1/4μ + 1/2μ +1/4μ = μ.
E(m2) = (1/6) E(Y1) + (2/6) E(Y2) + (3/6) E(Y3) = 1/6μ + 2/6μ +3/6μ = μ.
Therefore, both estimates are unbiased. In this case, the estimator having a smaller variance is
better as it provides us a less dispersed range. Observe that, since each observation is a random
draw and uncorrelated with the other observations:
Denote Var(Yi) by σ2 for each i. Since Var(m1) = 6σ2/16<7σ2/18= Var(m2), we prefer the
estimator m1 over m2.
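The weights and variance factors can be checked with exact fractions. In the sketch below, the helper name is illustrative; it relies on the same assumption as the answer, namely uncorrelated draws with a common variance σ², so that the variance of a weighted sum is σ² times the sum of squared weights.

```python
from fractions import Fraction


def weight_summary(weights):
    """Return (sum of weights, sum of squared weights).

    The first must equal 1 for unbiasedness; the second multiplies
    sigma^2 to give the variance of the weighted average.
    """
    return sum(weights), sum(w * w for w in weights)


w1 = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
w2 = [Fraction(1, 6), Fraction(2, 6), Fraction(3, 6)]

sum1, var_factor1 = weight_summary(w1)
sum2, var_factor2 = weight_summary(w2)

print(sum1, var_factor1)  # 1 and 3/8, i.e. 6/16
print(sum2, var_factor2)  # 1 and 7/18
```

Both sets of weights sum to 1, and 6/16 is smaller than 7/18, so m1 has the smaller variance.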
a. (6 pts) Use the sample analog of $E(UX^2) = 0$ to get an estimate of β1. (If your answer to this
part is wrong, you will get a zero for part b.)
b. (7 pts) Is your estimate unbiased? Make sure to include all the steps in your derivation and
state any assumptions you utilize at any specific step.
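$\hat{\beta}_1 = \frac{\sum Y_i X_i^2}{\sum X_i^3} = \frac{\sum (\beta_1 X_i + U_i) X_i^2}{\sum X_i^3} = \beta_1 + \frac{\sum X_i^2 U_i}{\sum X_i^3}$. Then,
$E(\hat{\beta}_1) = E\left[\beta_1 + \frac{\sum X_i^2 U_i}{\sum X_i^3}\right] = \beta_1 + \frac{\sum X_i^2 E(U_i)}{\sum X_i^3} = \beta_1$.
The second-to-last equality follows because we treat the X's as fixed, and the last uses $E(U_i) = 0$. This estimate is unbiased.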
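The unbiasedness claim can be illustrated with a small Monte Carlo sketch. Everything in it is an assumption made for illustration: the no-intercept model Y_i = β1 X_i + U_i (inferred, since the problem statement is not reproduced in this section), the true slope, the fixed X values, and the standard normal errors. The estimator is the solution of the sample moment condition Σ (Y_i - b X_i) X_i² = 0.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

beta1 = 2.0                    # hypothetical true slope
x = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical regressor values, held fixed
sum_x3 = sum(xi ** 3 for xi in x)


def moment_estimate(y):
    """Solve sum((y_i - b * x_i) * x_i**2) = 0 for b."""
    return sum(yi * xi ** 2 for xi, yi in zip(x, y)) / sum_x3


reps = 20000
estimates = []
for _ in range(reps):
    # Draw new errors each replication while keeping X fixed.
    y = [beta1 * xi + random.gauss(0.0, 1.0) for xi in x]
    estimates.append(moment_estimate(y))

mean_estimate = sum(estimates) / reps
print(mean_estimate)
```

Across replications the average estimate should sit close to the assumed true slope, which is what unbiasedness with fixed X's predicts.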
c. (7 pts) Assume that X10≠0 (10 corresponds to the tenth observation in your sample) and
consider the estimator
P =
C
Is P an unbiased estimator of $\beta_1$? Make sure to include all the steps in your derivation and
state any assumptions you utilize at any specific step.