Chapter 10: Random Regressors and Moment-Based Estimation

Walter R. Paczkowski
Rutgers University

Principles of Econometrics, 4th Edition
10.1 Linear Regression with Random x's

10.1.1 The Small Sample Properties of the Least Squares Estimators

Under assumptions A10.1–A10.6:
1. The least squares estimator is unbiased.
2. The least squares estimator is the best linear unbiased estimator of the regression parameters, and the usual estimator of σ² is unbiased.
3. The distributions of the least squares estimators, conditional upon the x's, are normal, and their variances are estimated in the usual way.
• The usual interval estimation and hypothesis testing procedures are valid.
10.1.2 Large Sample Properties of the Least Squares Estimators

Now we can say:
– Under assumptions A10.1, A10.2, A10.3*, A10.4, and A10.5, the least squares estimators:
1. Are consistent.
– They converge in probability to the true parameter values as N → ∞.
2. Have approximate normal distributions in large samples, whether the errors are normally distributed or not.
– Our usual interval estimators and test statistics are valid if the sample is large.
3. If assumption A10.3* is not true, and in particular if cov(x, e) ≠ 0 so that x and e are correlated, then the least squares estimators are inconsistent.
– They do not converge to the true parameter values even in very large samples.
– None of our usual hypothesis testing or interval estimation procedures are valid.
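The consequence in point 3 can be seen directly by simulation: generate data in which x and e are correlated and watch where the least squares slope settles as N grows. The sketch below is illustrative only; the parameter values, the correlation of 0.6, and the variable names are assumptions, not taken from the text.

```python
import numpy as np

# Minimal sketch: the OLS slope does not converge to beta2 when cov(x, e) != 0.
# True parameters and the x-e correlation (0.6) are illustrative assumptions.
rng = np.random.default_rng(123)
beta1, beta2, rho = 1.0, 1.0, 0.6

for n in (100, 10_000, 1_000_000):
    # Draw (x, e) jointly normal with correlation rho
    x, e = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n).T
    y = beta1 + beta2 * x + e
    # Least squares slope
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    print(f"N = {n:>9}:  b2 = {b2:.4f}")
```

In this design the estimates settle near β2 + cov(x, e)/var(x) = 1.6 rather than the true value 1.0, which is exactly the limit derived in Appendix 10B.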
10.1.3 Why Least Squares Estimation Fails

FIGURE 10.1 (a) Correlated x and e
10.2.1 Measurement Error

Let y = annual savings and x* = the permanent annual income of a person.
– A simple regression model is:

Eq. 10.1:  $y_i = \beta_1 + \beta_2 x_i^* + v_i$

– Current income is a measure of permanent income, but it does not measure permanent income exactly.
• It is sometimes called a proxy variable.
• To capture this feature, specify that:

Eq. 10.2:  $x_i = x_i^* + u_i$
Substituting:

Eq. 10.3:  $y_i = \beta_1 + \beta_2 x_i^* + v_i = \beta_1 + \beta_2 (x_i - u_i) + v_i = \beta_1 + \beta_2 x_i + (v_i - \beta_2 u_i) = \beta_1 + \beta_2 x_i + e_i$
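Because the combined error $e_i = v_i - \beta_2 u_i$ shares the measurement error $u_i$ with $x_i$, the regressor and the error are correlated and the least squares slope is attenuated toward zero. The following is a minimal simulation sketch; the variances, the slope, and the variable names are illustrative assumptions.

```python
import numpy as np

# Sketch of attenuation bias caused by measurement error in x.
# All parameter values below are illustrative assumptions.
rng = np.random.default_rng(7)
n, beta1, beta2 = 200_000, 1.0, 0.5
sig2_xstar, sig2_u = 4.0, 1.0

x_star = rng.normal(0, np.sqrt(sig2_xstar), n)   # "permanent income" (unobserved)
v = rng.normal(0, 1.0, n)
u = rng.normal(0, np.sqrt(sig2_u), n)            # measurement error
y = beta1 + beta2 * x_star + v                   # Eq. 10.1
x = x_star + u                                   # Eq. 10.2: observed, mismeasured regressor

b2 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)      # OLS slope
print("OLS slope:", round(b2, 3))
print("Attenuated limit beta2*var(x*)/(var(x*)+var(u)):",
      round(beta2 * sig2_xstar / (sig2_xstar + sig2_u), 3))
```

The slope converges to $\beta_2 \sigma_{x^*}^2 / (\sigma_{x^*}^2 + \sigma_u^2)$, the attenuation implied by cov(x, e) = −β2σ²_u in this setup.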
10.2.2 Simultaneous Equations Bias
10.2.3 Omitted Variables
10.2.4 Least Squares Estimation of a Wage Equation

Estimating our wage equation, we have:
10.3 Estimators Based on the Method of Moments

10.3.1 Method of Moments Estimation of a Population Mean and Variance

The kth moment of a random variable Y is the expected value of the random variable raised to the kth power:

Eq. 10.7:  $E(Y^k) = \mu_k = k\text{th moment of } Y$

– The kth population moment in Eq. 10.7 can be estimated consistently using the sample (of size N) analog:

Eq. 10.8:  $\widehat{E(Y^k)} = \hat{\mu}_k = \sum_i y_i^k / N = k\text{th sample moment of } Y$
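Equating the first two population moments to their sample analogs gives the method of moments estimates of the mean and variance, $\tilde{\mu} = \bar{y}$ and $\tilde{\sigma}^2 = \sum y_i^2/N - \bar{y}^2$. A minimal sketch follows; the simulated data are an illustrative assumption.

```python
import numpy as np

# Method of moments estimation of a population mean and variance from sample moments.
# Simulated data with true mean 10 and variance 9 (illustrative assumptions).
rng = np.random.default_rng(42)
y = rng.normal(loc=10.0, scale=3.0, size=5000)

m1 = np.mean(y)           # first sample moment:  sum(y_i)   / N
m2 = np.mean(y ** 2)      # second sample moment: sum(y_i^2) / N

mu_mom = m1               # method of moments estimate of the mean
sig2_mom = m2 - m1 ** 2   # method of moments estimate of the variance (divides by N, not N-1)
print(mu_mom, sig2_mom)
```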
10.3.2 Method of Moments Estimation in the Simple Linear Regression Model

These are equivalent to the least squares normal equations, and their solution is:

$b_2 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad b_1 = \bar{y} - b_2 \bar{x}$

– Under "nice" assumptions, the method of moments principle of estimation leads us to the same estimators for the simple linear regression model as the least squares principle.
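A quick numerical check that the moment-based formulas reproduce least squares; the simulated regression below is an illustrative assumption.

```python
import numpy as np

# The method of moments solution in the simple regression equals the OLS solution.
# Simulated data with intercept 2 and slope 0.8 (illustrative assumptions).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = 2.0 + 0.8 * x + rng.normal(0, 1, 500)

b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1 = y.mean() - b2 * x.mean()

b2_ls, b1_ls = np.polyfit(x, y, 1)   # least squares fit for comparison
print(b1, b2)                        # method of moments
print(b1_ls, b2_ls)                  # least squares (identical up to rounding)
```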
10.3.3 Instrumental Variables Estimation in the Simple Linear Regression Model

Suppose that there is another variable, z, such that:
1. z does not have a direct effect on y, and thus it does not belong on the right-hand side of the model as an explanatory variable.
2. z is not correlated with the regression error term e.
3. z is strongly (or at least not weakly) correlated with x, the endogenous explanatory variable.

If such a variable z exists, then it can be used to form the moment condition:

Eq. 10.16:  $E(ze) = 0 \;\Rightarrow\; E[z(y - \beta_1 - \beta_2 x)] = 0$

– Using Eqs. 10.13 and 10.16, the sample moment conditions are:

Eq. 10.17:  $\dfrac{1}{N}\sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0, \qquad \dfrac{1}{N}\sum z_i (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0$
Solving these equations leads us to the method of moments estimators, which are usually called the instrumental variable (IV) estimators:

Eq. 10.18:  $\hat{\beta}_2 = \dfrac{N\sum z_i y_i - \sum z_i \sum y_i}{N\sum z_i x_i - \sum z_i \sum x_i} = \dfrac{\sum (z_i - \bar{z})(y_i - \bar{y})}{\sum (z_i - \bar{z})(x_i - \bar{x})}, \qquad \hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$
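A minimal sketch of Eq. 10.18 on simulated data with one endogenous regressor and one instrument; the data-generating design (the coefficients, the shared shock creating endogeneity, and the instrument strength) is an illustrative assumption, not from the text.

```python
import numpy as np

# IV estimator of Eq. 10.18 versus OLS on simulated data.
# Design values below are illustrative assumptions.
rng = np.random.default_rng(1)
n, beta1, beta2 = 5000, 1.0, 1.0

z = rng.normal(size=n)                  # instrument: correlated with x, independent of e
common = rng.normal(size=n)             # shared shock that makes x endogenous
x = 0.8 * z + common + rng.normal(size=n)
e = 0.7 * common + rng.normal(size=n)   # cov(x, e) != 0 but cov(z, e) = 0
y = beta1 + beta2 * x + e

b2_ols = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b2_iv = np.sum((z - z.mean()) * (y - y.mean())) / np.sum((z - z.mean()) * (x - x.mean()))
b1_iv = y.mean() - b2_iv * x.mean()

print("OLS slope (inconsistent):", round(b2_ols, 3))
print("IV  slope:", round(b2_iv, 3), "  IV intercept:", round(b1_iv, 3))
```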
These new estimators have the following properties:
– They are consistent, if z is exogenous, with E(ze) = 0.
– In large samples the instrumental variable estimators have approximate normal distributions.
• In the simple regression model:

Eq. 10.19:  $\hat{\beta}_2 \sim N\!\left(\beta_2,\ \dfrac{\sigma^2}{r_{zx}^2 \sum (x_i - \bar{x})^2}\right)$
These new estimators have the following properties (continued):
– The error variance is estimated using the estimator:

$\hat{\sigma}_{IV}^2 = \dfrac{\sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i)^2}{N - 2}$
10.3.3a The Importance of Using Strong Instruments

Note that we can write the variance of the instrumental variables estimator of β2 as:

$\mathrm{var}(\hat{\beta}_2) = \dfrac{\sigma^2}{r_{zx}^2 \sum (x_i - \bar{x})^2} = \dfrac{\mathrm{var}(b_2)}{r_{zx}^2}$

– Because $r_{zx}^2 \le 1$, the variance of the instrumental variables estimator will always be larger than the variance of the least squares estimator, and thus it is said to be less efficient.
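A two-line illustration of how quickly this efficiency loss grows as the instrument weakens; the correlations used are assumed values.

```python
# Ratio of IV to least squares variance, var(beta2_IV)/var(b2) = 1 / r_zx^2,
# for a few assumed instrument-regressor correlations.
for r_zx in (0.9, 0.5, 0.1):
    print(f"r_zx = {r_zx}:  variance ratio = {1 / r_zx**2:.1f}")
```

With $r_{zx} = 0.1$ the IV variance is 100 times the least squares variance, which is why weak instruments are such a practical concern.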
10.3.4 Instrumental Variables Estimation in the Multiple Regression Model

To extend our analysis to a more general setting, consider the multiple regression model:

$y = \beta_1 + \beta_2 x_2 + \cdots + \beta_K x_K + e$

– Let xK be an endogenous variable correlated with the error term.
– The first K − 1 variables are exogenous variables that are uncorrelated with the error term e; they are "included" instruments.
The first-stage regression has the endogenous variable xK on the left-hand side, and all exogenous and instrumental variables on the right-hand side.
– The first-stage regression is:

Eq. 10.20:  $x_K = \gamma_1 + \gamma_2 x_2 + \cdots + \gamma_{K-1} x_{K-1} + \theta_1 z_1 + \cdots + \theta_L z_L + v_K$
The second-stage regression is based on the original specification:

Eq. 10.22:  $y = \beta_1 + \beta_2 x_2 + \cdots + \beta_K \hat{x}_K + e^*$
The error variance is estimated using:

Eq. 10.23:  $\hat{\sigma}_{IV}^2 = \dfrac{\sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_{2i} - \cdots - \hat{\beta}_K x_{Ki})^2}{N - K}$
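A sketch of the two stages (Eqs. 10.20 and 10.22) and the variance estimator (Eq. 10.23) on simulated data with one included exogenous regressor, one endogenous regressor, and two instruments; all design values and variable names are illustrative assumptions. Note that Eq. 10.23 forms residuals with the original xK, not the first-stage prediction.

```python
import numpy as np

# Two-stage least squares "by hand" (Eqs. 10.20, 10.22, 10.23) on simulated data.
# One exogenous regressor x2, one endogenous regressor x3, two instruments z1, z2.
# All design values below are illustrative assumptions.
rng = np.random.default_rng(3)
n = 2000
beta = np.array([1.0, 0.5, 1.0])        # intercept, beta2, beta3

x2 = rng.normal(size=n)
z1, z2 = rng.normal(size=n), rng.normal(size=n)
common = rng.normal(size=n)
x3 = 0.3 * x2 + 0.6 * z1 + 0.4 * z2 + common + rng.normal(size=n)   # endogenous
e = 0.8 * common + rng.normal(size=n)
y = beta[0] + beta[1] * x2 + beta[2] * x3 + e

ones = np.ones(n)

# Stage 1 (Eq. 10.20): regress x3 on all exogenous variables and instruments
Z = np.column_stack([ones, x2, z1, z2])
gamma_hat, *_ = np.linalg.lstsq(Z, x3, rcond=None)
x3_hat = Z @ gamma_hat

# Stage 2 (Eq. 10.22): regress y on the exogenous variables and the prediction x3_hat
X2 = np.column_stack([ones, x2, x3_hat])
beta_hat, *_ = np.linalg.lstsq(X2, y, rcond=None)

# Eq. 10.23: the error variance uses the ORIGINAL x3, not x3_hat
X = np.column_stack([ones, x2, x3])
resid = y - X @ beta_hat
sigma2_iv = resid @ resid / (n - X.shape[1])

print("2SLS estimates:", np.round(beta_hat, 3))
print("sigma^2_IV    :", round(sigma2_iv, 3))
```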
10.3.4a Using Surplus Instruments in Simple Regression

In the simple regression, if x is endogenous and we have L instruments:

$\hat{x} = \hat{\gamma}_1 + \hat{\theta}_1 z_1 + \cdots + \hat{\theta}_L z_L$

– The two sample moment conditions are:

$\dfrac{1}{N}\sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0, \qquad \dfrac{1}{N}\sum \hat{x}_i (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0$
Solving these two conditions gives:

$\hat{\beta}_2 = \dfrac{\sum (\hat{x}_i - \bar{\hat{x}})(y_i - \bar{y})}{\sum (\hat{x}_i - \bar{\hat{x}})(x_i - \bar{x})} = \dfrac{\sum (\hat{x}_i - \bar{x})(y_i - \bar{y})}{\sum (\hat{x}_i - \bar{x})(x_i - \bar{x})}, \qquad \hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$
10.3.4b Surplus Moment Conditions

A second instrument z2 supplies an additional moment condition:

$E(z_2 e) = E[z_2 (y - \beta_1 - \beta_2 x)] = 0$

The three sample moment conditions are:

$\dfrac{1}{N}\sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = \hat{m}_1 = 0$

$\dfrac{1}{N}\sum z_{i1} (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = \hat{m}_2 = 0$

$\dfrac{1}{N}\sum z_{i2} (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = \hat{m}_3 = 0$
10.3.5 Assessing Instrument Strength Using the First Stage Model

10.3.5a One Instrumental Variable

10.3.5b More Than One Instrumental Variable

Eq. 10.25:  $x_K = \gamma_1 + \gamma_2 x_2 + \cdots + \gamma_{K-1} x_{K-1} + \theta_1 z_1 + \cdots + \theta_L z_L + v_K$
10.3.6 Instrumental Variables Estimation of the Wage Equation
10.3.7 Partial Correlation
10.3.8 Instrumental Variables Estimation in a General Model

Think of G = Good explanatory variables, B = Bad explanatory variables, and L = Lucky instrumental variables.
– It is a necessary condition for IV estimation that L ≥ B.
– If L = B, then there are just enough instrumental variables to carry out IV estimation.
• The model parameters are said to be just identified or exactly identified in this case.
• The term identified is used to indicate that the model parameters can be consistently estimated.
– If L > B, then we have more instruments than are necessary for IV estimation, and the model is said to be overidentified.
Consider the B first-stage equations:

Eq. 10.29:  $x_{G+j} = \gamma_{1j} + \gamma_{2j} x_2 + \cdots + \gamma_{Gj} x_G + \theta_{1j} z_1 + \cdots + \theta_{Lj} z_L + v_j, \quad j = 1, \ldots, B$

The predicted values are:

$\hat{x}_{G+j} = \hat{\gamma}_{1j} + \hat{\gamma}_{2j} x_2 + \cdots + \hat{\gamma}_{Gj} x_G + \hat{\theta}_{1j} z_1 + \cdots + \hat{\theta}_{Lj} z_L, \quad j = 1, \ldots, B$

In the second stage of estimation we apply least squares to:

Eq. 10.30:  $y = \beta_1 + \beta_2 x_2 + \cdots + \beta_G x_G + \beta_{G+1} \hat{x}_{G+1} + \cdots + \beta_K \hat{x}_K + e^*$
10.3.8a Assessing Instrument Strength in a General Model

$x_{G+1} = \gamma_{11} + \gamma_{21} x_2 + \cdots + \gamma_{G1} x_G + \theta_{11} z_1 + \theta_{21} z_2 + v_1$

$x_{G+2} = \gamma_{12} + \gamma_{22} x_2 + \cdots + \gamma_{G2} x_G + \theta_{12} z_1 + \theta_{22} z_2 + v_2$
10.3.8b Hypothesis Testing with Instrumental Variables Estimates

When testing the null hypothesis H0: βk = c, use of the test statistic $t = (\hat{\beta}_k - c)/\mathrm{se}(\hat{\beta}_k)$ is valid in large samples.
– It is common, but not universal, practice to use critical values, and p-values, based on the t distribution rather than the more strictly appropriate N(0,1) distribution.
– The reason is that tests based on the t-distribution tend to work better in samples of data that are not large.
When testing a joint hypothesis, such as H0: β2 = c2, β3 = c3, the test may be based on the chi-square distribution with the number of degrees of freedom equal to the number of hypotheses (J) being tested.
– The test itself may be called a "Wald" test, a likelihood ratio (LR) test, or a Lagrange multiplier (LM) test.
– These testing procedures are all asymptotically equivalent.
10.3.8c Goodness-of-Fit with Instrumental Variables Estimates
10.4 Specification Tests

10.4.1 The Hausman Test for Endogeneity
If the null hypothesis of exogeneity is true, both the least squares estimator and the instrumental variables estimator are consistent.
– Naturally, if the null hypothesis is true, use the more efficient estimator, which is the least squares estimator.
If the null hypothesis is false, the least squares estimator is not consistent, and the instrumental variables estimator is consistent.
– If the null hypothesis is not true, use the instrumental variables estimator, which is consistent.
Consider the model: $y = \beta_1 + \beta_2 x + e$
– Let z1 and z2 be instrumental variables for x.
1. Estimate the model $x = \gamma_1 + \theta_1 z_1 + \theta_2 z_2 + v$ by least squares, and obtain the residuals $\hat{v} = x - \hat{\gamma}_1 - \hat{\theta}_1 z_1 - \hat{\theta}_2 z_2$.
2. Include the residuals computed in step 1 as an explanatory variable in the original regression, $y = \beta_1 + \beta_2 x + \delta \hat{v} + e$.
– Estimate this "artificial regression" by least squares, and employ the usual t-test for the significance of the coefficient δ on $\hat{v}$.
3. If more than one variable is being tested for endogeneity, the test will be an F-test of joint significance of the coefficients on the included residuals.
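A minimal sketch of the three steps on simulated data with one potentially endogenous x and two instruments; the design values and variable names are illustrative assumptions.

```python
import numpy as np

# Regression-based Hausman test for endogeneity of x (steps 1-3 above).
# Simulated design values are illustrative assumptions.
rng = np.random.default_rng(5)
n = 1000
z1, z2 = rng.normal(size=n), rng.normal(size=n)
common = rng.normal(size=n)
x = 0.5 * z1 + 0.5 * z2 + common + rng.normal(size=n)
e = 0.8 * common + rng.normal(size=n)          # x is endogenous by construction
y = 1.0 + 1.0 * x + e

ones = np.ones(n)

# Step 1: first-stage regression of x on (1, z1, z2); keep the residuals v_hat
Z = np.column_stack([ones, z1, z2])
v_hat = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]

# Step 2: artificial regression of y on (1, x, v_hat); t-test on the v_hat coefficient
W = np.column_stack([ones, x, v_hat])
coef, *_ = np.linalg.lstsq(W, y, rcond=None)
resid = y - W @ coef
sigma2 = resid @ resid / (n - W.shape[1])
cov_coef = sigma2 * np.linalg.inv(W.T @ W)
t_delta = coef[2] / np.sqrt(cov_coef[2, 2])
print("t-statistic on v_hat:", round(t_delta, 2), "-> a large |t| rejects exogeneity")
```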
10.4.2 Testing Instrument Validity

A test of the validity of the surplus moment conditions is:
1. Compute the IV estimates $\hat{\beta}_k$ using all available instruments, including the G variables x1 = 1, x2, …, xG that are presumed to be exogenous, and the L instruments.
2. Obtain the residuals $\hat{e} = y - \hat{\beta}_1 - \hat{\beta}_2 x_2 - \cdots - \hat{\beta}_K x_K$.
3. Regress $\hat{e}$ on all the available instruments described in step 1.
4. Compute NR² from this regression, where N is the sample size and R² is the usual goodness-of-fit measure.
5. If all of the surplus moment conditions are valid, then $NR^2 \sim \chi^2_{(L-B)}$.
• If the value of the test statistic exceeds the 100(1 − α)-percentile of the $\chi^2_{(L-B)}$ distribution, then we conclude that at least one of the surplus moment conditions is not valid.
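A minimal sketch of this NR² test with one endogenous x and three instruments, so there are L − B = 2 surplus moment conditions; the design values are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# NR^2 (Sargan) test of the surplus moment conditions (steps 1-5 above).
# One endogenous x, three instruments -> chi-square with L - B = 2 degrees of freedom.
# Simulated design values are illustrative assumptions.
rng = np.random.default_rng(11)
n = 1000
z1, z2, z3 = (rng.normal(size=n) for _ in range(3))
common = rng.normal(size=n)
x = 0.5 * z1 + 0.4 * z2 + 0.3 * z3 + common + rng.normal(size=n)
e = 0.8 * common + rng.normal(size=n)
y = 1.0 + 1.0 * x + e

ones = np.ones(n)
Z = np.column_stack([ones, z1, z2, z3])     # all available instruments (incl. the constant)

# Step 1: 2SLS/IV estimates using all instruments
x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
beta_hat = np.linalg.lstsq(np.column_stack([ones, x_hat]), y, rcond=None)[0]

# Step 2: IV residuals use the ORIGINAL x
e_hat = y - np.column_stack([ones, x]) @ beta_hat

# Steps 3-4: regress e_hat on all instruments and compute N * R^2
fitted = Z @ np.linalg.lstsq(Z, e_hat, rcond=None)[0]
r2 = 1 - np.sum((e_hat - fitted) ** 2) / np.sum((e_hat - e_hat.mean()) ** 2)
nr2 = n * r2

# Step 5: compare with the chi-square(2) critical value
print("NR^2 =", round(nr2, 2), "  5% critical value:", round(stats.chi2.ppf(0.95, df=2), 2))
```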
10.4.3 Specification Tests for the Wage Equation

Table 10.2 Hausman Test Auxiliary Regression
10A.1 Conditional Expectations

$\mathrm{var}(Y \mid X = x) = \sum_y [y - E(Y \mid X = x)]^2 f(y \mid x)$
10A.2 Iterated Expectations

$E(Y) = E_X[E(Y \mid X)]$

Eq. 10A.3:  $E(XY) = E_X[X\, E(Y \mid X)]$
10A.3 Regression Model Application

Eq. 10A.5:  $E(e_i) = E_x[E(e_i \mid x_i)] = E_x(0) = 0$

Eq. 10A.6:  $E(x_i e_i) = E_x[x_i E(e_i \mid x_i)] = E_x(x_i \cdot 0) = 0$

$[x_i - E(x_i)][y_i - E(y_i)] = \beta_2 [x_i - E(x_i)]^2 + [x_i - E(x_i)] e_i$

$E\{[x_i - E(x_i)][y_i - E(y_i)]\} = \beta_2 E[x_i - E(x_i)]^2 + E\{[x_i - E(x_i)] e_i\}$

or

$\mathrm{cov}(x, y) = \beta_2\, \mathrm{var}(x) + \mathrm{cov}(x, e)$
10B The Inconsistency of the Least Squares Estimator

– If cov(x, e) ≠ 0, then:

$\beta_2 = \dfrac{\mathrm{cov}(x, y)}{\mathrm{var}(x)} - \dfrac{\mathrm{cov}(x, e)}{\mathrm{var}(x)}$

Eq. 10B.4:  $b_2 \rightarrow \dfrac{\mathrm{cov}(x, y)}{\mathrm{var}(x)} = \beta_2 + \dfrac{\mathrm{cov}(x, e)}{\mathrm{var}(x)} \neq \beta_2$
10C The Consistency of the IV Estimator

Eq. 10C.1:  $\hat{\beta}_2 = \dfrac{\sum (z_i - \bar{z})(y_i - \bar{y})}{\sum (z_i - \bar{z})(x_i - \bar{x})} = \dfrac{\sum (z_i - \bar{z})(y_i - \bar{y})/(N-1)}{\sum (z_i - \bar{z})(x_i - \bar{x})/(N-1)} = \dfrac{\widehat{\mathrm{cov}}(z, y)}{\widehat{\mathrm{cov}}(z, x)}$

Eq. 10C.2:  $\hat{\beta}_2 \rightarrow \dfrac{\mathrm{cov}(z, y)}{\mathrm{cov}(z, x)}$

Eq. 10C.3:  $\dfrac{\mathrm{cov}(z, y)}{\mathrm{cov}(z, x)} = \beta_2 + \dfrac{\mathrm{cov}(z, e)}{\mathrm{cov}(z, x)}$

If cov(z, e) = 0, then:

Eq. 10C.4:  $\hat{\beta}_2 \rightarrow \dfrac{\mathrm{cov}(z, y)}{\mathrm{cov}(z, x)} = \beta_2$
10D The Logic of Two-Stage Least Squares

Eq. 10D.2:  $x = \gamma_0 + \gamma_1 z + v$, so that $E(x) = \gamma_0 + \gamma_1 z$ and $x = E(x) + v$

– Substituting:

Eq. 10D.4:  $y = \beta_1 + \beta_2 x + e = \beta_1 + \beta_2 [E(x) + v] + e = \beta_1 + \beta_2 E(x) + (\beta_2 v + e)$

Eq. 10D.5:  $x = \hat{x} + \hat{v}$

Eq. 10D.6:  $y = \beta_1 + \beta_2 x + e = \beta_1 + \beta_2 (\hat{x} + \hat{v}) + e = \beta_1 + \beta_2 \hat{x} + (\beta_2 \hat{v} + e)$

Eq. 10D.7:  $y = \beta_1 + \beta_2 \hat{x} + (\beta_2 \hat{v} + e)$

– If we omit $\hat{v}$:

Eq. 10D.8:  $y = \beta_1 + \beta_2 \hat{x} + e$

Eq. 10D.9:  $y = \beta_1 + \beta_2 (\hat{x} + \hat{v}) - \beta_2 \hat{v} + e = \beta_1 + \beta_2 x - \beta_2 \hat{v} + e$
10E Testing for Weak Instruments

10E.1 A Test for Weak Identification
10E.2 Examples of Testing for Weak Identification

where

NWIFEINC = (FAMINC − WAGE × HOURS)/1000
Weak IV Example 1: Endogenous: MTR; Instrument: EXPER
– The estimated first-stage equation for MTR is Model (1) of Table 10E.3.
– The estimated coefficient of MTR in the estimated HOURS supply equation in Model (1) of Table 10E.4 is negative and significant at the 5% level.
Weak IV Example 2: Endogenous: MTR; Instruments: EXPER, EXPER², LARGECITY
– The first-stage equation estimates are reported in Model (2) of Table 10E.3.
– The estimated coefficient of MTR in the estimated HOURS supply equation in Model (2) of Table 10E.4 is negative and significant at the 5% level, although the magnitudes of all the coefficients are smaller in absolute value than in Model (1).
Weak IV Example 3: Endogenous: MTR, EDUC; Instruments: MOTHEREDUC, FATHEREDUC
– The first-stage equations for MTR and EDUC are Model (3) and Model (4) of Table 10E.3.
– The estimates of the HOURS supply equation, Model (3) of Table 10E.4, show parameter estimates that are wildly different from those in Model (1) and Model (2), and the very small t-statistic values imply very large standard errors, another consequence of instrumental variables estimation in the presence of weak instruments.
10E.3 Testing for Weak Identification: Conclusions

If instrumental variables are "weak," then the instrumental variables, or two-stage least squares, estimator is unreliable.
When there is a single endogenous variable, the first-stage F-test of the joint significance of the external instruments is an indicator of instrument strength.
If there is more than one endogenous variable on the right-hand side of an equation, then the F-test statistics from the first-stage equations do not provide reliable information about instrument strength.
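For the single-endogenous-variable case, the following is a minimal sketch of the first-stage F-test of the joint significance of the external instruments; the simulated design and variable names are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# First-stage F-test of the joint significance of the external instruments z1, z2
# for a single endogenous regressor x, with one included exogenous regressor x2.
# Simulated design values are illustrative assumptions.
rng = np.random.default_rng(8)
n = 500
x2 = rng.normal(size=n)
z1, z2 = rng.normal(size=n), rng.normal(size=n)
x = 0.3 * x2 + 0.2 * z1 + 0.1 * z2 + rng.normal(size=n)   # fairly weak instruments


def sse(A, b):
    """Sum of squared residuals from an OLS regression of b on the columns of A."""
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.sum((b - A @ coef) ** 2)


ones = np.ones(n)
Z_full = np.column_stack([ones, x2, z1, z2])   # unrestricted first stage
Z_rest = np.column_stack([ones, x2])           # restricted: instruments excluded

sse_u, sse_r = sse(Z_full, x), sse(Z_rest, x)
J, df = 2, n - Z_full.shape[1]
F = ((sse_r - sse_u) / J) / (sse_u / df)
print("First-stage F =", round(F, 2), "  p-value =", round(stats.f.sf(F, J, df), 4))
```

In practice a first-stage F well above the conventional rule-of-thumb threshold of about 10 is usually taken as evidence of adequately strong instruments.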
10F Monte Carlo Simulation

10F.1 Illustrations Using Simulated Data
Consider the outcome of two-stage least squares estimation using the two instruments z1 and z2, where we first obtain the first-stage regression of x on the two instruments:

Eq. 10F.1:  $\hat{x} = 0.1947 + 0.5700\, z_1 + 0.2068\, z_2$
            (se)     (0.079)    (0.089)     (0.077)

– The instrumental variables estimates are:

Eq. 10F.2:  $\hat{y}_{IV\_z_1,z_2} = 1.1376 + 1.0399\, x$
            (se)                   (0.116)   (0.194)
10F.1.1 The Hausman Test

To implement the Hausman test, estimate the first-stage equation shown in Eq. 10F.1 using the instruments z1 and z2.
– Compute the residuals:

$\hat{v} = x - \hat{x} = x - 0.1947 - 0.5700\, z_1 - 0.2068\, z_2$

– Include the residuals as an extra variable in the regression equation and apply least squares:

$\hat{y} = 1.1376 + 1.0399\, x + 0.9957\, \hat{v}$
(se)    (0.080)    (0.133)      (0.163)
10F.1.2 Test for Weak Instruments

10F.1.3 Testing the Validity of Surplus Instruments

If we use z1, z2, and z3 as instruments, there are two surplus moment conditions.
– The IV estimates using these three instruments are:

$\hat{y}_{IV\_z_1,z_2,z_3} = 1.0626 + 1.3535\, x$
10F.2 The Repeated Sampling Properties of IV/2SLS

For the IV estimator, the average of the 10,000 Monte Carlo estimates of β2 is:

$\dfrac{1}{10000}\sum_{m=1}^{10000} \hat{\beta}_{2m} = 1.0088$
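The following is a minimal sketch of such a repeated-sampling experiment: draw many samples from a design with an endogenous x and a valid instrument, compute the IV estimate of β2 in each, and average. The design, sample size, and seed are illustrative assumptions and will not reproduce the 1.0088 reported above, which comes from the text's own experiment.

```python
import numpy as np

# Repeated-sampling sketch: average of IV slope estimates across Monte Carlo samples.
# The design below is an illustrative assumption, not the book's exact experiment.
rng = np.random.default_rng(2024)
M, n, beta2 = 10_000, 100, 1.0
b2_iv = np.empty(M)

for m in range(M):
    z = rng.normal(size=n)
    common = rng.normal(size=n)
    x = 0.5 * z + common + rng.normal(size=n)
    e = 0.8 * common + rng.normal(size=n)
    y = 1.0 + beta2 * x + e
    b2_iv[m] = np.sum((z - z.mean()) * (y - y.mean())) / np.sum((z - z.mean()) * (x - x.mean()))

print("Average IV estimate of beta2 over", M, "samples:", round(b2_iv.mean(), 4))
```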