Section 12 PDF
∗
Caroline, Chris, Jimmy, Kaushiki, Leah
An important thing to keep in mind is that for this approach to work we need to find an instrument
Z that is related to X and unrelated to things we don't observe that might also change Y (so that Z
affects Y only through X). The first condition is empirically testable, but unfortunately this is not
true for the second condition.
3. Large outliers are unlikely: The X's, W's, Z's, and Y have nonzero finite fourth moments.
4. The two conditions for a valid instrument hold: instrument relevance and instrument exogeneity.
∗
We thank previous GSIs for the great section material they built over time. These section notes are heavily based
on their previous work.
1.2 Estimation: Two Stage Least Squares (TSLS)
• First stage: run the regression $X_i = \pi_0 + \pi_1 Z_i + \pi_2 W_i + v_i$ using OLS and compute the predicted
value $\hat{X}_i = \hat{\pi}_0 + \hat{\pi}_1 Z_i + \hat{\pi}_2 W_i$.
Remark 1: Note that the first stage decomposes X into two parts: a problematic component ($v_i$)
correlated with $u_i$ and a problem-free component ($\pi_0 + \pi_1 Z_i + \pi_2 W_i$) uncorrelated with $u_i$.
Remark 2: Instrument relevance means $\pi_1 \neq 0$; in practice we check that $\hat{\pi}_1$ is
statistically significant.
Remark 3: The idea behind TSLS is to use the predicted problem-free component of $X_i$ to
get a consistent estimate of $\beta_1$.¹
Remark 4: When you run the 2-step procedure manually in Stata the standard errors are
incorrect because they do not recognize that it is the 2nd stage of a two-stage process.
Luckily, Stata corrects for this automatically when you use a canned command such as
ivregress.
Remark 5: As before, the error u might be heteroskedastic, so it is important to use
heteroskedasticity-robust standard errors.
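The two-step procedure described in the remarks can be sketched in Python on simulated data. This is only an illustration under stated assumptions: the data-generating values (a true $\beta_1$ of 1.5, the first-stage coefficients, the unobserved confounder `a`), the helper `ols`, and all variable names are made up for the example and are not part of the section material. The sketch produces point estimates only; as Remark 4 warns, standard errors taken from a manual second stage would be wrong.

```python
import random

def ols(y, columns):
    """Return OLS coefficients of y on an intercept plus the given columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[r] * row[c] for row in rows) for c in range(p)]
         + [sum(row[r] * yi for row, yi in zip(rows, y))]
         for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[r][p] / A[r][r] for r in range(p)]

# Simulated data (all numbers are assumptions for illustration).
rng = random.Random(0)
n = 5000
Z = [rng.gauss(0, 1) for _ in range(n)]   # instrument
W = [rng.gauss(0, 1) for _ in range(n)]   # included exogenous regressor
a = [rng.gauss(0, 1) for _ in range(n)]   # unobserved, enters both v and u
X = [1 + 0.8 * Z[i] + 0.5 * W[i] + a[i] + rng.gauss(0, 1) for i in range(n)]
Y = [2 + 1.5 * X[i] + 0.7 * W[i] + a[i] + rng.gauss(0, 1) for i in range(n)]

# OLS of Y on X and W: inconsistent for beta_1 because X is correlated with u.
b_ols = ols(Y, [X, W])

# First stage: regress X on Z and W, then form the fitted values X_hat.
pi = ols(X, [Z, W])
X_hat = [pi[0] + pi[1] * Z[i] + pi[2] * W[i] for i in range(n)]

# Second stage: regress Y on X_hat and W.
b_tsls = ols(Y, [X_hat, W])

print("OLS  estimate of beta_1:", round(b_ols[1], 3))
print("TSLS estimate of beta_1:", round(b_tsls[1], 3))
```

In this design the OLS coefficient should drift away from 1.5 while the TSLS coefficient should land close to it, which is the point of Remark 3.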
1.3 Terminology
In general, there can be multiple endogenous regressors (X's), multiple included regressors
(W's), and multiple instrumental variables (Z's). For IV regression to be possible there must be at
least as many Z's as X's. Regarding the number of instruments (m) and the number of endogenous
regressors (k) we have the following terminology:
• Instrument relevance:
Why is this important? Intuitively, the more of the variation in X that is explained by the instruments,
the more information is available for use in IV regression (and the more accurate the estimator is).
estimator is biased, and TSLS t-statistics and confidence intervals are unreliable.
1
And what about β2? In this case the estimate of β2 is already consistent because by assumption Wi is one of our
good regressors, in the sense that E(ui | Wi) = 0.
2
Why 10? Take a look at Appendix 12.5.
• Instrument exogeneity:
Intuitively, exogeneity of the instruments means that they are uncorrelated with ui .
Exactly identified case: We cannot test exogeneity. But you can draw on your knowledge of the
subject to defend your instrument (this is indeed a very common practice).
When the model is over-identified, we can use what is called the Sargan test, or the Hansen
test, or the J-test. In essence, the test uses information from k good IVs to test whether
the extra m−k IVs are valid or not. In other words, the test assumes there are at least k
good IVs.
∗ step 1: take the residuals from TSLS estimation (note: use $X_i$, not $\hat{X}_i$)
$\hat{u}_i^{TSLS} = Y_i - (\hat{\beta}_0^{TSLS} + \hat{\beta}_1^{TSLS} X_i + \hat{\beta}_2^{TSLS} W_i)$
∗ step 2: estimate the following regression using OLS (using homoskedasticity-only standard
errors):
$\hat{u}_i^{TSLS} = \delta_0 + \delta_1 Z_{1i} + \dots + \delta_m Z_{mi} + \delta_{m+1} W_i + e_i$
∗ step 3: do the following hypothesis test (F-statistic with homoskedasticity):
- $H_0: \delta_1 = \delta_2 = \dots = \delta_m = 0$ vs $H_1$: at least one $\delta_j \neq 0$, $j = 1, \dots, m$
- Calculate $J = mF$ and compare with critical values from $\chi^2_{m-k}$ ($J \sim \chi^2_{m-k}$, where
$m-k$ is the degree of overidentification, the # of instruments minus the # of endogenous
regressors).
- If you reject the null, at least one (maybe more than one) instrument is not exogenous.
- If you fail to reject the null, then you have evidence that the instruments are approximately
uncorrelated with $\hat{u}_i^{TSLS}$.
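The three steps of the overidentification test can be sketched in Python as follows. This is a hedged illustration, not the section's Stata procedure: the simulated design (two instruments, one endogenous regressor, one control), the helper functions, and the `direct_effect` parameter that lets the second instrument enter the outcome equation directly are all assumptions invented for the example. The value 3.84 is the 5% critical value of a chi-squared distribution with one degree of freedom, which applies here because m − k = 1.

```python
import random

def ols(y, columns):
    """Return OLS coefficients of y on an intercept plus the given columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    A = [[sum(row[r] * row[c] for row in rows) for c in range(p)]
         + [sum(row[r] * yi for row, yi in zip(rows, y))]
         for r in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[r][p] / A[r][r] for r in range(p)]

def fitted(b, columns, i):
    """Fitted value for observation i given coefficients b (intercept first)."""
    return b[0] + sum(bj * col[i] for bj, col in zip(b[1:], columns))

def ssr(y, columns):
    """Sum of squared OLS residuals of y on an intercept and the columns."""
    b = ols(y, columns)
    return sum((y[i] - fitted(b, columns, i)) ** 2 for i in range(len(y)))

def j_statistic(Y, X, W, instruments):
    n, m = len(Y), len(instruments)
    exog = instruments + [W]
    # TSLS: first stage, then second stage on the fitted values.
    pi = ols(X, exog)
    X_hat = [fitted(pi, exog, i) for i in range(n)]
    b = ols(Y, [X_hat, W])
    # Step 1: residuals are built with the actual X, not X_hat.
    u_hat = [Y[i] - fitted(b, [X, W], i) for i in range(n)]
    # Steps 2 and 3: homoskedasticity-only F for delta_1 = ... = delta_m = 0.
    ssr_unrestricted = ssr(u_hat, exog)
    ssr_restricted = ssr(u_hat, [W])
    F = ((ssr_restricted - ssr_unrestricted) / m) / (ssr_unrestricted / (n - m - 2))
    return m * F

def simulate(direct_effect, n=2000, seed=1):
    """Hypothetical data; direct_effect != 0 makes Z2 an invalid instrument."""
    rng = random.Random(seed)
    Z1 = [rng.gauss(0, 1) for _ in range(n)]
    Z2 = [rng.gauss(0, 1) for _ in range(n)]
    W = [rng.gauss(0, 1) for _ in range(n)]
    a = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
    X = [1 + 0.8 * Z1[i] + 0.8 * Z2[i] + 0.5 * W[i] + a[i] + rng.gauss(0, 1)
         for i in range(n)]
    Y = [2 + 1.5 * X[i] + 0.7 * W[i] + direct_effect * Z2[i] + a[i] + rng.gauss(0, 1)
         for i in range(n)]
    return Y, X, W, [Z1, Z2]

CHI2_1DF_5PCT = 3.84  # critical value for m - k = 1

J_valid = j_statistic(*simulate(direct_effect=0.0))
J_invalid = j_statistic(*simulate(direct_effect=1.0))
print("J with valid instruments:  ", round(J_valid, 2))
print("J with one invalid instrument:", round(J_invalid, 2),
      "reject" if J_invalid > CHI2_1DF_5PCT else "fail to reject")
```

Note that a rejection tells you some instrument is suspect but not which one, since the test takes for granted that at least k instruments are good.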
1.5 IV in practice
• Where do valid instruments come from? If IV is so awesome, why don't we use it all the
time? Because in practice, it is really hard to find valid instruments. Two approaches are:
Economic theory. Example: weather affects only supply side in agricultural markets.
• Consistency vs Efficiency: While IV estimators are consistent given valid instruments, they
can be much less efficient than the OLS estimator and can behave poorly in finite samples.
These problems are greatly magnified if instruments are weakly correlated with the endogenous
variables.
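A small Monte Carlo experiment can make the consistency versus efficiency trade-off concrete. The sketch below is an assumption-laden illustration: the one-regressor, one-instrument design, the sample size of 200, the 300 replications, and the first-stage coefficients 1.0 ("strong") and 0.1 ("weak") are all invented for the example. It compares the spread of the OLS and IV estimates across replications; keep in mind that in this design OLS is tightly clustered around the wrong value, because X is endogenous.

```python
import random
import statistics

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def one_draw(rng, pi1, n=200, beta1=1.0):
    """Simulate one sample and return (OLS estimate, IV estimate) of beta1."""
    Z = [rng.gauss(0, 1) for _ in range(n)]
    a = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
    X = [pi1 * z + ai + rng.gauss(0, 1) for z, ai in zip(Z, a)]
    Y = [beta1 * x + ai + rng.gauss(0, 1) for x, ai in zip(X, a)]
    b_ols = cov(X, Y) / cov(X, X)
    b_iv = cov(Z, Y) / cov(Z, X)  # IV estimator with a single instrument
    return b_ols, b_iv

def spread(pi1, reps=300, seed=0):
    """Standard deviation of OLS and IV estimates across replications."""
    rng = random.Random(seed)
    draws = [one_draw(rng, pi1) for _ in range(reps)]
    ols_sd = statistics.stdev([d[0] for d in draws])
    iv_sd = statistics.stdev([d[1] for d in draws])
    return ols_sd, iv_sd

strong = spread(pi1=1.0)  # instrument strongly related to X
weak = spread(pi1=0.1)    # instrument weakly related to X

print("strong instrument: sd(OLS) = %.3f, sd(IV) = %.3f" % strong)
print("weak instrument:   sd(OLS) = %.3f, sd(IV) = %.3f" % weak)
```

With the strong instrument the IV estimates should be noticeably more dispersed than the OLS estimates, and with the weak instrument the dispersion should grow much further.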
2 Exercises
Problem 1 [SW 12.9]
A researcher is interested in the effect of military service on human capital. He collects data from a
random sample of 4000 workers aged 40 and runs the OLS regression Yi = β0 + β1 Xi + ui , where Yi is
worker i's annual earnings and Xi is a binary variable that is equal to 1 if the person served in the
military and 0 otherwise.
a) Explain why the OLS estimates are likely to be unreliable. (Hint: Which variables are omitted from
the regression? Are they correlated with military service?)
b) During the Vietnam War there was a draft, where priority for the draft was determined by a
national lottery. (The days of the year were randomly reordered 1 through 365. Those with
birthdates ordered first were drafted before those with birthdates ordered second, and so forth.)
Explain how the lottery might be used as an instrument to estimate the effect of military service
on earnings.
a) Regress weeksm1 (weeks worked by the mom) on the indicator variable morekids using OLS.
On average, do women with more than two children work less than women with two children?
How much less?
b) Explain why the OLS regression estimated in (a) is inappropriate for estimating the causal effect
of fertility (morekids) on labor supply (weeksm1).
c) The dataset contains the variable samesex, which is equal to 1 if the first two children are of the
same sex (boy-boy or girl-girl) and equal to 0 otherwise. Are couples whose first two children are
of the same sex more likely to have a third child? Is the effect large? Is it statistically significant?
d) Explain why samesex is a valid instrument for the IV regression of weeksm1 on morekids.
f) Estimate the regression of weeksm1 on morekids using samesex as an instrument. How large
is the fertility effect on labor supply?
g) Do the results change when you include the variables agem1, black, hispan, and othrace in the
labor supply regression (treating these variables as exogenous)? Explain why or why not.
Problem 4 [Weak instruments]
In an IV regression model with one regressor, Xi , and one instrument, Zi , the first-stage regression of
Xi onto Zi has $R^2 = 0.05$ and N = 100. Is Zi a strong instrument? [Hint: use the $R^2$ to compute the
homoskedasticity-only F-stat]. Would your answer change if $R^2 = 0.05$ and N = 500?
$\ln w_i = \alpha + \beta_1 s_i + \beta_2 e_i + \beta_3 e_i^2 + X'\gamma + u_i$
where $s_i$ denotes years of schooling, $e_i$ denotes years of work experience (calculated as $e_i = age_i - s_i - 6$),
$e_i^2$ denotes experience squared, and X is a vector with 26 control variables (geographic indicators and
parental education). Assume these control variables are exogenous.
a) A classmate tells you that this regression suffers from OVB and thus you must be careful in
interpreting the coefficients. How many endogenous variables do we have? How many instruments
do you need?
b) The author decides to use 3 instruments. The first one is col4, an indicator for whether a
four-year college is nearby. The other two are $age$ and $age^2$. The author argues that (i) $age$ and
$age^2$ are highly correlated with $e_i$ and $e_i^2$, and (ii) that they can be omitted from the model
for log-wage since it is work experience that matters. Do you agree with him? Is the model
just-identified or overidentified? Can you test the exogeneity of the instruments?
c) The results for β1 from OLS and IV are reported in the table below. Interpret the effect of
schooling on wages for the OLS regression and IV regression. Comment about the precision of
the estimates.
                              OLS      IV
R^2                           0.304    0.207
First-stage F-stat for s_i    -        8.07
Obs                           3010     3010
3 Solution to the exercises
Problem 1 [SW 12.9]
a) There are other factors that could affect both the choice to serve in the military and annual
earnings. One example could be education, although this could be included in the regression as
a control variable. Another variable is ability, which is difficult to measure and thus difficult to
control for in the regression.
b) The draft was determined by a national lottery so the choice of serving in the military was
random. Because it was randomly selected, the lottery number is uncorrelated with individual
characteristics that may affect earnings and hence the instrument is exogenous. Because it affected
the probability of serving in the military, the lottery number is relevant.
b) Both fertility and weeks worked are choice variables. A woman with a positive labor supply
regression error (a woman who works more than average) may also be a woman who is less likely
to have an additional child. This would imply that morekids is negatively correlated with the
regression error, so that the OLS estimator of $\beta_{morekids}$ is negatively biased.
$\widehat{morekids} = \underset{(0.001)}{0.346} + \underset{(0.002)}{0.068} \times samesex$
so that couples with samesex = 1 are 6.8 percentage points more likely to have an additional
child than couples with samesex = 0. The effect is highly significant (t-statistic = 35.2).
d) samesex is random and is unrelated to any of the other variables in the model including the
error term in the labor supply equation. Thus, the instrument is exogenous. From (c), the first
stage F-statistic is large (F = 1238) so the instrument is relevant. Together, these imply that
samesex is a valid instrument.
g) The results do not change in an important way. The reason is that samesex is unrelated to
agem1, black, hispan, othrace, so that there is no omitted variable bias in the IV regression of
(f).
b) No. The J test suggests that $E(u_i | Z_{1i}, Z_{2i}) \neq 0$, but doesn't provide evidence about whether the
problem is with $Z_{1i}$ or $Z_{2i}$ (or both).
Problem 4 [Weak instruments]
$N = 100: \quad F_{homosk-only} = \frac{R^2/k}{(1 - R^2)/(N - k - 1)} = \frac{0.05}{0.95/98} = 5.16$
$N = 500: \quad F_{homosk-only} = \frac{R^2/k}{(1 - R^2)/(N - k - 1)} = \frac{0.05}{0.95/498} = 26.2$
In the first case (with N = 100) the F-statistic is lower than the rule-of-thumb value of 10, so Zi is a weak
instrument. In the second case (with N = 500) it is greater than 10, so Zi is not weak.
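The arithmetic above is easy to check with a few lines of Python. The function name `first_stage_f` and its arguments are hypothetical, chosen only for this sketch; the formula is the homoskedasticity-only F-statistic written in terms of the first-stage $R^2$, with k instruments being tested and no other regressors.

```python
def first_stage_f(r2, n, k=1):
    """Homoskedasticity-only F-statistic computed from the first-stage R^2."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

RULE_OF_THUMB = 10  # threshold used in the text for a weak instrument

for n in (100, 500):
    f = first_stage_f(0.05, n)
    verdict = "weak" if f < RULE_OF_THUMB else "not weak"
    print(f"N = {n}: F = {f:.2f} ({verdict})")
```

The same $R^2$ leads to opposite conclusions at the two sample sizes because the F-statistic scales roughly in proportion to N.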
b) The model is just-identified (3 endogenous variables and 3 instruments) so we cannot test
exogeneity. The use of age and age squared as instruments can be questioned. Although age is clearly
exogenous, some unobservables such as social skills may be correlated with both age and wage.
This illustrates the general point that there can be disagreement on assumptions of instrument
validity.
c) The OLS estimate of β1 is 0.073, so that wages rise by 7.3% on average with each extra year
of schooling holding everything else constant. This estimate is an inconsistent estimate of β1
given omitted ability, as discussed before. The IV estimate (or 2SLS estimate) is 0.132, so that
an extra year of schooling is estimated to lead to a 13.2% increase in wage. The IV estimator
is much less efficient than OLS. The standard error of β̂1,OLS is over 10 times larger than the