GMM 2

Generalized Method of Moments (GMM) Estimation with
Applications using STATA

David Guilkey
Review of Recursive Simultaneous Equations Models

The GMM estimator is typically used to correct for bias caused by
endogenous explanatory variables. So we start with a brief review
of the standard recursive simultaneous equations model:
Simple recursive structural equation model with endogenous
regressors:
$$Y_{i1} = X_i\beta + \varepsilon_{i1}$$
$$Y_{i2} = Y_{i1}\gamma + \varepsilon_{i2}$$
OLS estimation of the second equation will lead to biased and
inconsistent results if $E(\varepsilon_{i1}\varepsilon_{i2}) \neq 0$, i.e. if there is overlap in the set
of unobservable variables that affect the two outcomes,
sometimes called unobserved heterogeneity.
Examples:
The dependent variables are education and wage rate. Highly
motivated individuals may both get more education and receive
higher wages, so the effect of education on wages is biased upwards
(positive effect of education is overstated).
Second example: contraceptive use and fertility. If the
unobservable is fecundity, highly fecund women may be more
likely to use contraception and also more likely to have children,
biasing the effect of contraceptive use towards zero (the negative
effect of contraceptive use is understated).
Two Stage Least Squares

1. Estimate the first equation by OLS to get $\hat{\beta}$.
2. Replace $Y_{i1}$ with $\hat{Y}_{i1} = X_i\hat{\beta}$ and run OLS of $Y_{i2}$ on $\hat{Y}_{i1}$.
Note that this model can be estimated since we assume that $X_i$
does not have a direct effect on $Y_{i2}$.
Suppose that it did:
$$Y_{i2} = Y_{i1}\gamma + X_i\delta + \varepsilon_{i2}$$
Then, if we make the two stage least squares substitution, we get:

$$Y_{i2} = (X_i\hat{\beta})\gamma + X_i\delta + \varepsilon_{i2}$$
The perfect collinearity between the regressors implies that OLS
cannot separately identify the two coefficients.
Note that if the second equation takes the following form:
$$Y_{i2} = Y_{i1}\gamma + X_i^2\delta + \varepsilon_{i2}$$
Then the model would be identified by functional form.

Overidentification occurs if there are more exclusion restrictions
than needed. Typically, the higher the level of overidentification,
the more efficient the estimator.
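As a rough illustration of the two steps described in this section, the following Python sketch simulates the recursive model and compares OLS with two stage least squares. Everything in it is an assumption made for illustration and is not taken from the original examples: scalar regressors, no intercept, true values of 1.0 and 0.5 for the two coefficients, and a shared unobservable `u` that enters both error terms.

```python
import random

def slope(x, y):
    """No-intercept least squares slope: sum(x*y) / sum(x*x)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def simulate(n=20000, beta=1.0, gamma=0.5, seed=12345):
    """Simulate Y1 = X*beta + e1 and Y2 = Y1*gamma + e2 with correlated errors."""
    rng = random.Random(seed)
    x, y1, y2 = [], [], []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)          # unobservable shared by both equations
        e1 = u + rng.gauss(0.0, 1.0)
        e2 = u + rng.gauss(0.0, 1.0)
        y1i = beta * xi + e1
        x.append(xi)
        y1.append(y1i)
        y2.append(gamma * y1i + e2)
    return x, y1, y2

x, y1, y2 = simulate()

# OLS of Y2 on Y1 ignores the correlation between Y1 and e2.
gamma_ols = slope(y1, y2)

# Step 1: regress Y1 on X. Step 2: regress Y2 on the fitted values.
beta_hat = slope(x, y1)
y1_hat = [beta_hat * xi for xi in x]
gamma_2sls = slope(y1_hat, y2)

print("OLS estimate :", round(gamma_ols, 3))
print("2SLS estimate:", round(gamma_2sls, 3))
```

In this design the shared unobservable pushes the OLS estimate above the true value of 0.5, while the two stage estimate should land close to it.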
How do you find appropriate instruments? Some examples:

1. Effect of physical activity on BMI.
Could use the presence of recreational facilities in the community
where the person lives. Must be careful: individuals may self-select
into communities.
2. Effect of contraceptive use on fertility.
Presence of family planning facilities in the community. Again,
must be careful if the government targets the placement of
facilities. In some cases this is not an issue, e.g. Matlab in
Bangladesh.
3. Effect of fertility on education of children (Becker's
quantity-quality theory of children).
The exogenous shock of twins on fertility.
4. Effect of fertility on economic outcomes.
Studies have used twins and sex composition as identifying variables.
5. Effect of time spent on homework on achievement test results.
One author used a dummy variable for whether or not the student's
roommate had video games.
6. Effect of military service on economic outcomes for veterans
of the Vietnam war.
Angrist used the draft lottery number.
7. Effect of past fertility on contraceptive use.
Backdate the availability of services to the beginning of the
woman's childbearing years.
Review of Generalized Least Squares (GLS)
Consider the basic multivariate model:
$$Y = X\beta + \varepsilon$$
where $Y$ is N x 1, $X$ is N x K, and $\varepsilon$ is N x 1.
The ideal error term assumption is $\varepsilon \sim N(0, \sigma^2 I)$, where $I$ is an N x N
identity matrix.
This leads to the OLS estimator:
$$\hat{\beta} = (X'X)^{-1}X'Y$$
Now suppose that $\varepsilon \sim N(0, \Omega)$: unequal diagonal elements mean
that we have heteroskedastic errors, and non-zero off-diagonal
elements mean that we have autocorrelation.
Under these conditions, OLS is still unbiased but the usual standard
errors are incorrect. The correct covariance matrix is:
$$\mathrm{Cov}(\hat{\beta}) = (X'X)^{-1}X'\Omega X(X'X)^{-1}$$
This is sometimes referred to as the sandwich estimator.
Can approximate this covariance matrix using the robust option in
STATA for heteroskedasticity or the Newey-West option for
autocorrelation and heteroskedasticity.
An improved estimator is the GLS estimator:

$$\hat{\beta}_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y$$
This estimator is more efficient than OLS since it uses the
information in $\Omega$.
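Example for heteroskedasticity:
$$Y_i = X_i\beta + \varepsilon_i$$
where all terms are scalars and $\mathrm{var}(\varepsilon_i) = \sigma_i^2$.

$$\text{OLS:} \quad \hat{\beta} = \frac{\sum_{i=1}^{N} X_iY_i}{\sum_{i=1}^{N} X_i^2}$$
$$\text{GLS:} \quad \hat{\beta} = \frac{\sum_{i=1}^{N} X_iY_i/\sigma_i^2}{\sum_{i=1}^{N} X_i^2/\sigma_i^2}$$
Observations with high error variance are down-weighted relative to
observations with low error variance.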
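The scalar OLS and GLS formulas above, together with the scalar versions of the sandwich and GLS variances, can be sketched in a few lines of Python. The three data points and the error variances are invented for illustration only; the third observation is given a much larger variance so that the down-weighting is visible.

```python
def ols_slope(x, y):
    """Scalar OLS: sum(X*Y) / sum(X^2)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def gls_slope(x, y, sigma2):
    """Scalar GLS: each term is divided by that observation's error variance."""
    num = sum(xi * yi / s for xi, yi, s in zip(x, y, sigma2))
    den = sum(xi * xi / s for xi, s in zip(x, sigma2))
    return num / den

def ols_sandwich_var(x, sigma2):
    """Scalar form of (X'X)^-1 X' Omega X (X'X)^-1 with diagonal Omega."""
    sxx = sum(xi * xi for xi in x)
    return sum(xi * xi * s for xi, s in zip(x, sigma2)) / sxx ** 2

def gls_var(x, sigma2):
    """Scalar form of (X' Omega^-1 X)^-1 with diagonal Omega."""
    return 1.0 / sum(xi * xi / s for xi, s in zip(x, sigma2))

# Hypothetical data: the last observation is very noisy.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.1, 9.0]
sigma2 = [1.0, 1.0, 100.0]

b_ols = ols_slope(x, y)
b_gls = gls_slope(x, y, sigma2)
v_ols = ols_sandwich_var(x, sigma2)
v_gls = gls_var(x, sigma2)

print("OLS slope:", b_ols, "variance:", v_ols)
print("GLS slope:", b_gls, "variance:", v_gls)
```

With these numbers the noisy third point pulls the OLS slope upward, while GLS nearly ignores it and has the smaller variance.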
Generalized Method of Moments
Suppose we have the following simple model:
$$Y_i = X_i\beta + \varepsilon_i \quad \text{where it is assumed that } E(X_i\varepsilon_i) = 0$$

The method of moments estimator sets the sample counterpart of
this condition to zero:
$$\frac{1}{N}\sum_{i=1}^{N} X_i(Y_i - X_i\hat{\beta}) = 0 \quad \text{which leads to} \quad \hat{\beta} = \frac{\sum X_iY_i}{\sum X_i^2}$$
which is the OLS estimator.

Now suppose $E(X_i\varepsilon_i) \neq 0$ but $E(Z_i\varepsilon_i) = 0$, where Z is some other
variable. If we set the sample counterpart of this moment
condition to zero we get:
$$\hat{\beta}_{IV} = \frac{\sum Z_iY_i}{\sum Z_iX_i}$$
which is the same as the instrumental variables
estimator.
Note that if the variables are standardized, the denominator is proportional to the
sample correlation between X and Z, so the estimator is not defined (not
identified) if the correlation is zero.
Two conditions for an instrumental variable:
1. Correlated with the variable that needs to be instrumented
2. Uncorrelated with the error term.
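A minimal Python sketch of the scalar moment-condition estimator discussed above: it computes the ratio of sums and then checks that the sample moment is zero at the estimate. The four data points are made up for illustration; passing X itself as the instrument reproduces the OLS formula.

```python
def iv_slope(z, x, y):
    """Scalar method of moments / IV estimator: sum(Z*Y) / sum(Z*X)."""
    return sum(zi * yi for zi, yi in zip(z, y)) / sum(zi * xi for zi, xi in zip(z, x))

def sample_moment(z, x, y, b):
    """Sample counterpart of E[Z * (Y - X*b)]."""
    n = len(y)
    return sum(zi * (yi - xi * b) for zi, xi, yi in zip(z, x, y)) / n

# Hypothetical data for illustration only.
z = [1.0, 2.0, 3.0, 4.0]
x = [1.0, 3.0, 2.0, 5.0]
y = [2.0, 5.0, 5.0, 9.0]

b_iv = iv_slope(z, x, y)     # instrument Z
b_ols = iv_slope(x, x, y)    # using X as its own instrument gives OLS

print("IV estimate :", b_iv)
print("OLS estimate:", b_ols)
print("sample moment at IV estimate:", sample_moment(z, x, y, b_iv))
```

Because there is one moment condition and one parameter, the sample moment can be set exactly to zero, which is why an exactly identified model carries no information for testing instrument validity.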
Now suppose that X is N x K and Z is N x L and that L > K:
we have more instruments available than regressors
(overidentification).
The model in matrix form is:

$$Y = X\beta + \varepsilon \quad \text{where } E(X'\varepsilon) \neq 0$$
The moment conditions are:
$$E(Z'\varepsilon) = 0$$
Since we have more moment conditions than parameters, we could
throw out the extra moment conditions, but this would not lead to a
unique $\hat{\beta}$.
So what is done instead is to find the $\hat{\beta}$ that minimizes:
$$\min_{\beta}\,(Z'\varepsilon)'W(Z'\varepsilon)$$

where $W$ is an L x L weighting matrix.
If you substitute for $\varepsilon$:
$$\min_{\beta}\,[Z'(Y - X\beta)]'W[Z'(Y - X\beta)]$$

The solution for the optimal $\hat{\beta}$ is:
$$\hat{\beta} = (X'ZWZ'X)^{-1}X'ZWZ'Y$$
How do you choose W?

Using an identity matrix will do; however, the estimator ignores
useful information and is not optimal. Note that with $\varepsilon \sim N(0, \sigma^2 I)$:
$$\mathrm{var}(Z'\varepsilon) = \sigma^2(Z'Z)$$
So if we choose $W = (Z'Z)^{-1}$, we give higher weight to moment
conditions with lower variance (just like the GLS example above).
This leads to:
$$\hat{\beta}_{IV} = [X'Z(Z'Z)^{-1}Z'X]^{-1}X'Z(Z'Z)^{-1}Z'Y$$
Some books refer to this as generalized instrumental variables as
opposed to GMM and save the term GMM for models with general
functional forms.
It is easy to show that this estimator is the same as two stage least
squares.
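The equivalence with two stage least squares can be checked numerically. The Python sketch below handles only the simplest overidentified case, one regressor and two instruments (K = 1, L = 2), so that the weighting matrix is 2 x 2 and can be inverted by hand. The data and helper names are hypothetical and chosen purely for illustration.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad(u, w, v):
    """u' W v for 2-vectors u, v and a 2x2 matrix W."""
    return sum(u[i] * w[i][j] * v[j] for i in range(2) for j in range(2))

def gmm_linear(z1, z2, x, y, w):
    """(X'Z W Z'X)^-1 X'Z W Z'Y for one regressor and two instruments."""
    zx = [dot(z1, x), dot(z2, x)]    # Z'X
    zy = [dot(z1, y), dot(z2, y)]    # Z'Y
    return quad(zx, w, zy) / quad(zx, w, zx)

def two_sls(z1, z2, x, y):
    """Explicit two-step version: fit X on (z1, z2), then regress Y on the fit."""
    zz_inv = inv2([[dot(z1, z1), dot(z1, z2)], [dot(z2, z1), dot(z2, z2)]])
    zx = [dot(z1, x), dot(z2, x)]
    pi = [zz_inv[0][0] * zx[0] + zz_inv[0][1] * zx[1],
          zz_inv[1][0] * zx[0] + zz_inv[1][1] * zx[1]]   # first-stage coefficients
    x_hat = [pi[0] * a + pi[1] * b for a, b in zip(z1, z2)]
    return dot(x_hat, y) / dot(x_hat, x_hat)

# Hypothetical data for illustration only.
z1 = [1.0, 2.0, 3.0, 4.0, 5.0]
z2 = [1.0, 0.0, 1.0, 0.0, 2.0]
x = [1.5, 2.0, 3.5, 3.0, 6.0]
y = [3.0, 4.5, 6.0, 7.5, 11.0]

w_zz = inv2([[dot(z1, z1), dot(z1, z2)], [dot(z2, z1), dot(z2, z2)]])
w_identity = [[1.0, 0.0], [0.0, 1.0]]

b_gmm = gmm_linear(z1, z2, x, y, w_zz)
b_2sls = two_sls(z1, z2, x, y)
b_identity = gmm_linear(z1, z2, x, y, w_identity)

print("GMM with W = (Z'Z)^-1:", b_gmm)
print("Two stage least squares:", b_2sls)
print("GMM with W = I:", b_identity)
```

The first two numbers agree up to rounding, while the identity-weighted estimate differs, which illustrates that the choice of W matters only when the model is overidentified.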
Now suppose that $\varepsilon \sim N(0, \Omega)$; then $\mathrm{var}(Z'\varepsilon) = Z'\Omega Z$
and:
$$\hat{\beta}_{IV} = [X'Z(Z'\Omega Z)^{-1}Z'X]^{-1}X'Z(Z'\Omega Z)^{-1}Z'Y$$
Important specification tests:

1. It is important to show that your excluded variables have
explanatory power for the right-hand-side endogenous
variables.
2. It is important to show that the instruments are uncorrelated
with the error term and that the exclusion restrictions are
valid.
3. It is important to show that there are endogenous explanatory
variables in the model, since GMM is inefficient relative to
OLS if all variables are exogenous.
Leads to (for 2):
$$H_0: E(Z'\varepsilon) = 0$$
$$H_a: E(Z'\varepsilon) \neq 0$$
This test is easy to implement in STATA as a post-estimation
command. Note that the model must be overidentified for the test
to work. If the model is exactly identified, then $Z'\hat{\varepsilon} = 0$
regardless of whether Z contains endogenous variables or not.
Example using Data from Tanzania

regress idealnum gradef city agef goodsan goodwat fpmess

      Source |       SS       df       MS              Number of obs =    7175
-------------+------------------------------           F(  6,  7168) =  231.69
       Model |  7935.78153     6  1322.63026           Prob > F      =  0.0000
    Residual |  40918.6068  7168  5.70851099           R-squared     =  0.1624
-------------+------------------------------           Adj R-squared =  0.1617
       Total |  48854.3883  7174   6.8099231           Root MSE      =  2.3892

------------------------------------------------------------------------------
    idealnum |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gradef |  -.1657421    .009817   -16.88   0.000    -.1849863   -.1464978
        city |  -.3628387    .096505    -3.76   0.000    -.5520169   -.1736604
        agef |   .0470808   .0034304    13.72   0.000     .0403562    .0538055
     goodsan |  -.4821974   .1976966    -2.44   0.015    -.8697411   -.0946538
     goodwat |  -.6871424   .0697211    -9.86   0.000    -.8238164   -.5504684
      fpmess |  -.3810917   .0590844    -6.45   0.000    -.4969145   -.2652689
       _cons |   6.006125    .120121    50.00   0.000     5.770652    6.241597
------------------------------------------------------------------------------
ivregress 2sls idealnum gradef city agef goodsan goodwat (fpmess= radio tv lisradio)

Instrumental variables (2SLS) regression               Number of obs =    7101
                                                       Wald chi2(6)  = 1230.52
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.0573
                                                       Root MSE      =  2.5343

------------------------------------------------------------------------------
    idealnum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      fpmess |  -2.160707   .3515359    -6.15   0.000    -2.849705    -1.47171
      gradef |  -.1108485   .0147842    -7.50   0.000    -.1398251   -.0818719
        city |  -.1508169   .1110777    -1.36   0.175    -.3685252    .0668915
        agef |   .0616064   .0045227    13.62   0.000     .0527422    .0704706
     goodsan |   -.591477   .2107448    -2.81   0.005    -1.004529   -.1784248
     goodwat |  -.5264795   .0804746    -6.54   0.000    -.6842068   -.3687522
       _cons |    6.05979   .1291835    46.91   0.000     5.806595    6.312985
------------------------------------------------------------------------------
Instrumented:  fpmess
Instruments:   gradef city agef goodsan goodwat radio tv lisradio
ivregress 2sls idealnum gradef city agef goodsan goodwat (fpmess= radio tv lisradio),robust

Instrumental variables (2SLS) regression               Number of obs =    7101
                                                       Wald chi2(6)  = 1294.87
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.0573
                                                       Root MSE      =  2.5343

------------------------------------------------------------------------------
             |               Robust
    idealnum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      fpmess |  -2.160707   .3482724    -6.20   0.000    -2.843309   -1.478106
      gradef |  -.1108485     .01455    -7.62   0.000    -.1393659   -.0823311
        city |  -.1508169    .098868    -1.53   0.127    -.3445946    .0429609
        agef |   .0616064   .0046162    13.35   0.000     .0525588     .070654
     goodsan |   -.591477   .1483923    -3.99   0.000    -.8823206   -.3006335
     goodwat |  -.5264795   .0800936    -6.57   0.000      -.68346    -.369499
       _cons |    6.05979    .133223    45.49   0.000     5.798678    6.320903
------------------------------------------------------------------------------

estat endogenous

  Tests of endogeneity
  Ho: variables are exogenous

  Robust score chi2(1)            =  31.232  (p = 0.0000)
  Robust regression F(1,7093)     = 31.4789  (p = 0.0000)
ivregress gmm idealnum gradef city agef goodsan goodwat (fpmess= radio tv lisradio)

Instrumental variables (GMM) regression                Number of obs =    7101
                                                       Wald chi2(6)  = 1296.23
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.0572
GMM weight matrix: Robust                              Root MSE      =  2.5345

------------------------------------------------------------------------------
             |               Robust
    idealnum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      fpmess |  -2.161912   .3479156    -6.21   0.000    -2.843814   -1.480009
      gradef |  -.1109999   .0145499    -7.63   0.000    -.1395171   -.0824827
        city |   -.155256   .0986711    -1.57   0.116    -.3486477    .0381358
        agef |   .0613938   .0046154    13.30   0.000     .0523479    .0704398
     goodsan |  -.6016565   .1483434    -4.06   0.000    -.8924043   -.3109088
     goodwat |  -.5241035    .080061    -6.55   0.000    -.6810202   -.3671868
       _cons |   6.066762   .1331309    45.57   0.000      5.80583    6.327693
------------------------------------------------------------------------------

estat overid

  Test of overidentifying restriction:
  Hansen's J chi2(2) = 4.95188 (p = 0.0841)
gmm (idealnum-{b0}-{b1}*fpmess-{b2}*gradef-{b3}*city-{b4}*agef-{b5}*goodwat-{b6}*goodsan),instruments(gradef city agef goodsan goodwat radio tv lisradio)

GMM estimation

Number of parameters =   7
Number of moments    =   9
Initial weight matrix: Unadjusted                      Number of obs =    7101
GMM weight matrix:     Robust

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /b0 |   6.066762   .1331309    45.57   0.000      5.80583    6.327693
         /b1 |  -2.161912   .3479156    -6.21   0.000    -2.843814   -1.480009
         /b2 |  -.1109999   .0145499    -7.63   0.000    -.1395171   -.0824827
         /b3 |   -.155256   .0986711    -1.57   0.116    -.3486477    .0381358
         /b4 |   .0613938   .0046154    13.30   0.000     .0523479    .0704398
         /b5 |  -.5241035    .080061    -6.55   0.000    -.6810202   -.3671868
         /b6 |  -.6016565   .1483434    -4.06   0.000    -.8924043   -.3109088
------------------------------------------------------------------------------
Instruments for equation 1: gradef city agef goodsan goodwat radio tv lisradio
    _cons

estat overid

  Hansen's J chi2(2) = 4.95188 (p = 0.0841)
Extension to Limited Dependent Variables
Poisson Regression Model

Useful for count data:
Number of births
Number of deaths
Number of FP facilities in a community
Estimation method is maximum likelihood:
$$P(Y_i = y_i) = \frac{e^{-\lambda_i}\lambda_i^{y_i}}{y_i!}$$
where:
$$\lambda_i = e^{X_i\beta}$$
and
$$E(y_i) = \lambda_i = e^{X_i\beta}$$
So we can define moment conditions:

$$\sum_{i=1}^{N}(y_i - e^{X_i\beta})X_i' = 0$$
In this case, it is easy to show that the first order conditions for the
maximum likelihood estimator are:
$$\frac{\partial \ln L}{\partial \beta} = \sum_{i=1}^{N}(y_i - e^{X_i\beta})X_i' = 0$$
which means that maximum likelihood and method of moments
will be the same in this case.
If we have endogenous regressors:
$$\sum_{i=1}^{N}(y_i - e^{X_i\beta})Z_i' = 0$$
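To make the Poisson moment conditions concrete, here is a small Python sketch that solves the scalar version by Newton's method. It assumes a single regressor with no intercept, and the five data points and the alternative weights `z` are invented for illustration. Passing `z = x` gives the condition that coincides with the maximum likelihood first order condition; passing a different `z` gives the instrumented version.

```python
import math

def moment(y, x, z, b):
    """Sample moment sum((y_i - exp(x_i*b)) * z_i)."""
    return sum((yi - math.exp(xi * b)) * zi for yi, xi, zi in zip(y, x, z))

def poisson_mm(y, x, z, b=0.0, tol=1e-12, max_iter=100):
    """Solve moment(y, x, z, b) = 0 for scalar b by Newton's method."""
    for _ in range(max_iter):
        mu = [math.exp(xi * b) for xi in x]
        g = sum((yi - mi) * zi for yi, mi, zi in zip(y, mu, z))
        dg = -sum(zi * xi * mi for zi, xi, mi in zip(z, x, mu))  # derivative of g in b
        step = g / dg
        b -= step
        if abs(step) < tol:
            break
    return b

# Hypothetical count data for illustration only.
x = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [1, 1, 2, 4, 6]
z = [1.0, 1.0, 2.0, 2.0, 3.0]   # stand-in for an instrument

b_x = poisson_mm(y, x, x)   # moments weighted by X (same as the ML condition)
b_z = poisson_mm(y, x, z)   # moments weighted by Z

print("b using X in the moment condition:", b_x)
print("b using Z in the moment condition:", b_z)
```

The only change between the two calls is the variable that multiplies the residual, which mirrors the way the instrument list changes between the two `gmm` calls in the example that follows.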
Example using Data from Tunisia:

   idealnum |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |          2        0.05        0.05
          1 |         80        2.16        2.22
          2 |        861       23.28       25.49
          3 |        982       26.55       52.04
          4 |      1,265       34.20       86.24
          5 |        229        6.19       92.43
          6 |        280        7.57      100.00
------------+-----------------------------------
      Total |      3,699      100.00
Standard Maximum Likelihood

poisson idealnum births urban age15_25 age26_30 age31_35 educf dhspfp5 dclnf210 dclnf35

Poisson regression                                Number of obs   =       3699
                                                  LR chi2(9)      =     339.89
                                                  Prob > chi2     =     0.0000
Log likelihood = -6261.6451                       Pseudo R2       =     0.0264

------------------------------------------------------------------------------
    idealnum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      births |    .045359   .0044232    10.25   0.000     .0366898    .0540282
       urban |  -.1215596   .0238621    -5.09   0.000    -.1683284   -.0747908
    age15_25 |   .0709409   .0331187     2.14   0.032     .0060294    .1358525
    age26_30 |   .0760448   .0285344     2.67   0.008     .0201184    .1319712
    age31_35 |   .0617507   .0247624     2.49   0.013     .0132173     .110284
       educf |  -.0083566   .0027712    -3.02   0.003    -.0137881   -.0029251
     dhspfp5 |  -.0388122   .0236606    -1.64   0.101    -.0851862    .0075618
    dclnf210 |  -.0191308   .0208825    -0.92   0.360    -.0600598    .0217981
     dclnf35 |  -.0354968   .0203251    -1.75   0.081    -.0753334    .0043397
       _cons |   1.142399   .0399497    28.60   0.000     1.064099    1.220699
------------------------------------------------------------------------------
GMM with No Correction for Endogeneity

global xb "{b1}*births+{b2}*urban+{b3}*age15_25+{b4}*age26_30+{b5}*age31_35+{b6}*educf+{b7}*dhspfp5+{b8}*dclnf210+{b9}*dclnf35+{b0}"
gmm (idealnum-exp($xb)),instruments(births urban age15_25 age26_30 age31_35 educf dhspfp5 dclnf210 dclnf35)
warning: 268 missing values returned for equation 1 at initial values

GMM estimation

Number of parameters =  10
Number of moments    =  10
GMM weight matrix:     Robust                          Number of obs =    3699

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /b1 |    .045359   .0025938    17.49   0.000     .0402753    .0504427
         /b2 |  -.1215596   .0137967    -8.81   0.000    -.1486006   -.0945187
         /b3 |   .0709409   .0185094     3.83   0.000     .0346633    .1072186
         /b4 |   .0760448   .0161104     4.72   0.000     .0444689    .1076207
         /b5 |   .0617507   .0143083     4.32   0.000      .033707    .0897944
         /b6 |  -.0083566   .0014837    -5.63   0.000    -.0112646   -.0054486
         /b7 |  -.0388122   .0134263    -2.89   0.004    -.0651273    -.012497
         /b8 |  -.0191308   .0120829    -1.58   0.113    -.0428129    .0045513
         /b9 |  -.0354968   .0117735    -3.01   0.003    -.0585725   -.0124211
         /b0 |   1.142399   .0233899    48.84   0.000     1.096556    1.188243
------------------------------------------------------------------------------
Instruments for equation 1: births urban age15_25 age26_30 age31_35 educf
    dhspfp5 dclnf210 dclnf35 _cons
GMM with Instruments for Births

global xb "{b1}*births+{b2}*urban+{b3}*age15_25+{b4}*age26_30+{b5}*age31_35+{b6}*educf+{b7}*dhspfp5+{b8}*dclnf210+{b9}*dclnf35+{b0}"
gmm (idealnum-exp($xb)),instruments(urban age15_25 age26_30 age31_35 educf hosp55 clnc355 clnc255 dhspfp5 dclnf210 dclnf35)

GMM estimation

Number of moments    =  12
GMM weight matrix:     Robust                          Number of obs =    3699

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /b1 |    .178753   .0775377     2.31   0.021     .0267819    .3307241
         /b2 |  -.0445122   .0506145    -0.88   0.379    -.1437149    .0546904
         /b3 |   .7156728   .3993534     1.79   0.073    -.0670454    1.498391
         /b4 |   .5368065   .2891844     1.86   0.063    -.0299846    1.103597
         /b5 |   .3105718   .1603059     1.94   0.053     -.003622    .6247656
         /b6 |   .0082527   .0103891     0.79   0.427    -.0121095    .0286149
         /b7 |   .0024966   .0341734     0.07   0.942     -.064482    .0694752
         /b8 |  -.0051071   .0176397    -0.29   0.772    -.0396804    .0294661
         /b9 |   .0410424   .0505894     0.81   0.417    -.0581111    .1401959
         /b0 |   .1108901   .6380012     0.17   0.862    -1.139569    1.361349
------------------------------------------------------------------------------
Instruments for equation 1: urban age15_25 age26_30 age31_35 educf hosp55
    clnc355 clnc255 dhspfp5 dclnf210 dclnf35 _cons

estat overid

  Hansen's J chi2(2) = 5.61925 (p = 0.0602)
Probit
The dependent variable is now dichotomous and the model takes
the following form:
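$$Y_i^* = X_i\beta + \varepsilon_i \quad \text{where } \varepsilon_i \sim N(0,1)$$
The observed dependent variable is $Y_i$, which indicates the sign of the
latent variable $Y_i^*$ ($Y_i = 1$ if $Y_i^* > 0$ and $Y_i = 0$ otherwise).
We can show that:
$$E(Y_i) = \Phi(X_i\beta)$$
So we can define moment conditions:
$$\sum_{i=1}^{N}(Y_i - \Phi(X_i\beta))X_i' = 0$$
And the first order conditions for the maximum likelihood
estimator are:
$$\frac{\partial \ln L}{\partial \beta} = \sum_{i=1}^{N}\frac{(Y_i - \Phi(X_i\beta))\,\phi(X_i\beta)\,X_i'}{\Phi(X_i\beta)(1 - \Phi(X_i\beta))} = 0$$
So method of moments and maximum likelihood will not be the
same in this case, unless we define the residual to be:
$$\frac{(Y_i - \Phi(X_i\beta))\,\phi(X_i\beta)}{\Phi(X_i\beta)(1 - \Phi(X_i\beta))}$$
These have been referred to as generalized residuals; see
Gourieroux, Monfort, Renault, and Trognon (Journal of
Econometrics, 1987).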
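The generalized residual can be written in two equivalent ways: the single fraction used in the formulas here, and the two-branch form that appears in the STATA `gmm` calls for the probit examples (`curuse*$phi/$Phi-(1-curuse)*$phi/(1-$Phi)`). The Python sketch below checks numerically that the two agree; the function names and the index values are chosen only for illustration.

```python
import math

def norm_pdf(v):
    """Standard normal density."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)

def norm_cdf(v):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def generalized_residual(y, xb):
    """(y - Phi) * phi / (Phi * (1 - Phi)) evaluated at the index xb."""
    Phi, phi = norm_cdf(xb), norm_pdf(xb)
    return (y - Phi) * phi / (Phi * (1.0 - Phi))

def two_branch_form(y, xb):
    """y * phi / Phi - (1 - y) * phi / (1 - Phi), as typed in the gmm call."""
    Phi, phi = norm_cdf(xb), norm_pdf(xb)
    return y * phi / Phi - (1 - y) * phi / (1.0 - Phi)

for xb in (-1.5, -0.2, 0.0, 0.7, 2.0):
    for y in (0, 1):
        print(y, xb, generalized_residual(y, xb), two_branch_form(y, xb))
```

For y = 1 the residual reduces to phi/Phi and for y = 0 to -phi/(1 - Phi), which is why the two expressions coincide for a binary outcome.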
If we have endogenous regressors and a set of instruments, we can
define the following moment conditions:
$$\sum_{i=1}^{N}(Y_i - \Phi(X_i\beta))Z_i' = 0$$
and
$$\sum_{i=1}^{N}\frac{(Y_i - \Phi(X_i\beta))\,\phi(X_i\beta)\,Z_i'}{\Phi(X_i\beta)(1 - \Phi(X_i\beta))} = 0$$
Examples using Data from Zimbabwe:

Simple Probit
probit curuse ageyr educf city

Probit regression                                 Number of obs   =       4044
                                                  LR chi2(3)      =     123.15
                                                  Prob > chi2     =     0.0000
Log likelihood = -2473.4264                       Pseudo R2       =     0.0243

------------------------------------------------------------------------------
      curuse |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       ageyr |   .0247129   .0024259    10.19   0.000     .0199582    .0294676
       educf |   .0218258   .0067044     3.26   0.001     .0086853    .0349662
        city |   .1519386   .0461074     3.30   0.001     .0615698    .2423074
       _cons |  -1.353872   .0936086   -14.46   0.000    -1.537342   -1.170403
------------------------------------------------------------------------------
Method of Moments
global xb "{b0}+{b1}*ageyr+{b2}*educf+{b3}*city"
global Phi "normal($xb)"
global phi "normalden($xb)"
gmm (curuse-$Phi),instruments(ageyr educf city)

GMM estimation

Number of parameters =   4
Number of moments    =   4
GMM weight matrix:     Robust                          Number of obs =    4044

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /b0 |   -1.33728   .0917991   -14.57   0.000    -1.517203   -1.157357
         /b1 |   .0240179   .0022603    10.63   0.000     .0195878    .0284481
         /b2 |   .0226185   .0069999     3.23   0.001     .0088989    .0363381
         /b3 |   .1515532   .0458515     3.31   0.001     .0616859    .2414204
------------------------------------------------------------------------------
Instruments for equation 1: ageyr educf city _cons
Method of Moments with Generalized Residuals

gmm (curuse*$phi/$Phi-(1-curuse)*$phi/(1-$Phi)),instruments(ageyr educf city)

GMM estimation

Number of parameters =   4
Number of moments    =   4
GMM weight matrix:     Robust                          Number of obs =    4044

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /b0 |  -1.353872   .0919552   -14.72   0.000    -1.534101   -1.173643
         /b1 |   .0247129   .0023031    10.73   0.000      .020199    .0292269
         /b2 |   .0218258   .0068971     3.16   0.002     .0083076    .0353439
         /b3 |   .1519386   .0458888     3.31   0.001     .0619983    .2418789
------------------------------------------------------------------------------
Instruments for equation 1: ageyr educf city _cons
Model with Endogenous Regressor

Simple Probit
probit curuse ageyr educf city idealnum

Probit regression                                 Number of obs   =       3432
                                                  LR chi2(4)      =     129.53
                                                  Prob > chi2     =     0.0000
Log likelihood = -2166.2919                       Pseudo R2       =     0.0290

------------------------------------------------------------------------------
      curuse |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       ageyr |   .0275553   .0027264    10.11   0.000     .0222116     .032899
       educf |   .0158365   .0075078     2.11   0.035     .0011215    .0305514
        city |   .1585321   .0498385     3.18   0.001     .0608505    .2562138
    idealnum |  -.0093307   .0112343    -0.83   0.406    -.0313495    .0126881
       _cons |  -1.251713   .1103242   -11.35   0.000    -1.467944   -1.035482
------------------------------------------------------------------------------
Joint Estimation of Two Equations by Maximum Likelihood

ivprobit curuse ageyr educf city (idealnum= goodwat goodsan),first

Probit model with endogenous regressors           Number of obs   =       3432
                                                  Wald chi2(4)    =     536.69
                                                  Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
curuse       |
    idealnum |              .0420308    -9.80   0.000    -.4944318   -.3296739
       ageyr |   .0470123   .0023144    20.31   0.000     .0424763    .0515484
       educf |  -.0573738    .010782    -5.32   0.000    -.0785061   -.0362415
        city |  -.1813323   .0631177    -2.87   0.004    -.3050408   -.0576238
       _cons |   .9205355   .3242921     2.84   0.005     .2849347    1.556136
-------------+----------------------------------------------------------------
idealnum     |
       ageyr |   .0742471   .0038626    19.22   0.000     .0666766    .0818175
       educf |  -.1636876   .0111069   -14.74   0.000    -.1854567   -.1419186
        city |  -.2835769   .1114185    -2.55   0.011    -.5019533   -.0652006
     goodwat |  -.4396987   .0916246    -4.80   0.000    -.6192796   -.2601178
     goodsan |  -.1043804   .0716666    -1.46   0.145    -.2448443    .0360834
       _cons |   4.255872   .1497554    28.42   0.000     3.962356    4.549387
-------------+----------------------------------------------------------------
     /athrho |   1.093695   .2272441     4.81   0.000     .6483046    1.539085
    /lnsigma |   .6675622   .0120719    55.30   0.000     .6439017    .6912226
-------------+----------------------------------------------------------------
         rho |   .7982227   .0824534                      .5705275    .9119665
       sigma |   1.949479   .0235339                      1.903895    1.996155
------------------------------------------------------------------------------
Instrumented:  idealnum
Instruments:   ageyr educf city goodwat goodsan
------------------------------------------------------------------------------
Wald test of exogeneity (/athrho = 0): chi2(1) =    23.16 Prob > chi2 = 0.0000
Method of Moments Using Generalized Residuals

global xb "{b0}+{b1}*ageyr+{b2}*educf+{b3}*city+{b4}*idealnum"
global Phi "normal($xb)"
global phi "normalden($xb)"
gmm (curuse*$phi/$Phi-(1-curuse)*$phi/(1-$Phi)),instruments(ageyr educf city goodwat goodsan)

GMM estimation

Number of parameters =   5
Number of moments    =   6
GMM weight matrix:     Robust                          Number of obs =    3432

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /b0 |   1.925231   1.660998     1.16   0.246    -1.330266    5.180728
         /b1 |   .0921017   .0373101     2.47   0.014     .0189752    .1652282
         /b2 |  -.0956154   .0563338    -1.70   0.090    -.2060276    .0147968
         /b3 |  -.3823969   .2995131    -1.28   0.202    -.9694318     .204638
         /b4 |  -.8756558   .4847308    -1.81   0.071    -1.825711    .0743991
------------------------------------------------------------------------------
Instruments for equation 1: ageyr educf city goodwat goodsan _cons

estat overid

  Hansen's J chi2(1) = .439423 (p = 0.5074)
Logit
We can show that:
$$E(Y_i) = \frac{e^{X_i\beta}}{1 + e^{X_i\beta}}$$
So we can define moment conditions:
$$\sum_{i=1}^{N}(Y_i - E(Y_i))X_i' = 0$$
which are the same as the first order conditions for the maximum
likelihood estimator.
If we have endogenous regressors, we can define the following
moment conditions:
$$\sum_{i=1}^{N}(Y_i - E(Y_i))Z_i' = 0$$
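The claim that the logit moment condition equals the maximum likelihood first order condition can be checked numerically. This Python sketch, with made-up data, a scalar regressor and an arbitrary trial value of the coefficient, compares a finite-difference derivative of the log likelihood with the sum of (Y - E(Y)) X.

```python
import math

def logit_loglik(b, y, x):
    """Logit log likelihood: sum(y*x*b - log(1 + exp(x*b)))."""
    return sum(yi * xi * b - math.log(1.0 + math.exp(xi * b)) for yi, xi in zip(y, x))

def logit_moment(b, y, x):
    """Moment condition sum((y - p) * x) with p = exp(x*b) / (1 + exp(x*b))."""
    total = 0.0
    for yi, xi in zip(y, x):
        p = math.exp(xi * b) / (1.0 + math.exp(xi * b))
        total += (yi - p) * xi
    return total

# Hypothetical data and an arbitrary trial value of b, for illustration only.
y = [0, 1, 1, 0, 1]
x = [-1.0, 0.5, 2.0, 0.3, 1.2]
b = 0.4
h = 1e-6

# Central finite difference approximation to d lnL / d b.
numeric_score = (logit_loglik(b + h, y, x) - logit_loglik(b - h, y, x)) / (2.0 * h)
moment_value = logit_moment(b, y, x)

print("numerical derivative of lnL:", numeric_score)
print("moment condition value     :", moment_value)
```

The two values agree at any trial value of b, not only at the estimate, which is why the `gmm` and `logit` coefficient estimates coincide when all regressors are treated as exogenous.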
Example using data from Tanzania:

Simple Logit
logit curuse agef gradef city idealnum

Logistic regression                               Number of obs   =       6420
                                                  LR chi2(4)      =     293.40
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0739

------------------------------------------------------------------------------
      curuse |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        agef |   .0441804   .0053186     8.31   0.000     .0337561    .0546046
      gradef |    .205559   .0161523    12.73   0.000     .1739011    .2372169
        city |   .5334482   .1087663     4.90   0.000     .3202702    .7466263
    idealnum |  -.0521602   .0209566    -2.49   0.013    -.0932343   -.0110861
       _cons |  -4.355425    .240694   -18.10   0.000    -4.827177   -3.883674
------------------------------------------------------------------------------
GMM with Exogenous Regressors

global xb "{b0}+{b1}*agef+{b2}*gradef+{b3}*city+{b4}*idealnum"
global ey "exp($xb)/(1+exp($xb))"
gmm (curuse-$ey),instruments(agef gradef city idealnum)

GMM estimation

Number of parameters =   5
Number of moments    =   5
GMM weight matrix:     Robust                          Number of obs =    6420

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /b0 |  -4.355425   .2311157   -18.85   0.000    -4.808404   -3.902447
         /b1 |   .0441804   .0047281     9.34   0.000     .0349135    .0534472
         /b2 |    .205559     .01677    12.26   0.000     .1726903    .2384276
         /b3 |   .5334482   .1104231     4.83   0.000      .317023    .7498735
         /b4 |  -.0521602   .0207588    -2.51   0.012    -.0928467   -.0114737
------------------------------------------------------------------------------
Instruments for equation 1: agef gradef city idealnum _cons
GMM with Idealnum Endogenous

global xb "{b0}+{b1}*agef+{b2}*gradef+{b3}*city+{b4}*idealnum"
global ey "exp($xb)/(1+exp($xb))"
gmm (curuse-$ey),instruments(agef gradef city goodsan goodwat)

GMM estimation

Number of parameters =   5
Number of moments    =   6
GMM weight matrix:     Robust                          Number of obs =    6420

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /b0 |  -.7549698   2.294859    -0.33   0.742     -5.25281    3.742871
         /b1 |   .0724285   .0256051     2.83   0.005     .0222435    .1226135
         /b2 |   .1431967   .0205348     6.97   0.000     .1029493    .1834441
         /b3 |   .1939602    .201301     0.96   0.335    -.2005824    .5885028
         /b4 |  -.9054825   .7181797    -1.26   0.207    -2.313089    .5021237
------------------------------------------------------------------------------
Instruments for equation 1: agef gradef city goodsan goodwat _cons

estat overid

  Hansen's J chi2(1) = .153577 (p = 0.6951)
Extensions
Seemingly Unrelated Regression Model
Consider a two equation model:

$$Y_{i1} = X_{i1}\beta_1 + \varepsilon_{i1}$$
$$Y_{i2} = X_{i2}\beta_2 + \varepsilon_{i2}$$
where we assume that $E(\varepsilon_{i1}\varepsilon_{i2}) = \sigma_{12}$: contemporaneous
correlation between the errors of the two equations. For example, if
these are two equations for an individual, we assume that there is
overlap in the unobservables affecting the two outcomes. The
equations are seemingly unrelated because the association is
through unobservables.
Note that if $X_{i1} = X_{i2}$, then OLS yields the same results as the
seemingly unrelated regression estimator.
Example for Tunisia:

sureg (idealnum urban age15_25 age26_30 age31_35 educf dhspfp5 dclnf210 dclnf35) (births urban age15_25 age26_30 age31_35 educf hosp55 clnc255 clnc355),corr

Seemingly unrelated regression
----------------------------------------------------------------------
Equation          Obs  Parms        RMSE    "R-sq"       chi2        P
----------------------------------------------------------------------
idealnum         3699      8    1.098898    0.1511     646.08   0.0000
births           3699      8    1.966098    0.4528    3057.12   0.0000
----------------------------------------------------------------------

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
idealnum     |
       urban |  -.5341832     .04836   -11.05   0.000    -.6289671   -.4393993
    age15_25 |  -.4340123    .052471    -8.27   0.000    -.5368536    -.331171
    age26_30 |  -.2219301   .0501645    -4.42   0.000    -.3202506   -.1236096
    age31_35 |  -.0421442   .0487635    -0.86   0.387    -.1377188    .0534304
       educf |  -.0426635   .0051863    -8.23   0.000    -.0528284   -.0324986
     dhspfp5 |  -.1439378   .0444092    -3.24   0.001    -.2309783   -.0568973
    dclnf210 |  -.0822242   .0412718    -1.99   0.046    -.1631155   -.0013329
     dclnf35 |  -.1438506   .0401266    -3.58   0.000    -.2224973    -.065204
       _cons |   4.208579   .0503266    83.63   0.000     4.109941    4.307217
-------------+----------------------------------------------------------------
births       |
       urban |   -.472903   .0929165    -5.09   0.000    -.6550159   -.2907901
    age15_25 |  -4.146174   .0937669   -44.22   0.000    -4.329954   -3.962394
    age26_30 |  -2.875245    .089736   -32.04   0.000    -3.051124   -2.699366
    age31_35 |  -1.491799   .0872567   -17.10   0.000    -1.662819   -1.320778
       educf |  -.1171785   .0092421   -12.68   0.000    -.1352927   -.0990642
      hosp55 |  -.2233258   .1038585    -2.15   0.032    -.4268846    -.019767
     clnc255 |  -.0075894   .1029788    -0.07   0.941    -.2094241    .1942452
     clnc355 |   -.223498   .0657493    -3.40   0.001    -.3523643   -.0946318
       _cons |   6.543402   .0790779    82.75   0.000     6.388412    6.698392
------------------------------------------------------------------------------

Correlation matrix of residuals:

          idealnum    births
idealnum    1.0000
  births    0.2901    1.0000

Breusch-Pagan test of independence: chi2(1) =   311.228, Pr = 0.0000
GMM
gmm (eq1: idealnum - {b0}-{xb1:urban age15_25 age26_30 age31_35 educf dhspfp5
dclnf210 dclnf35}) (eq2:births- {c0}-{xb2: urban age15_25 age26_30 age31_35
educf hosp55 clnc255 clnc355}), instruments(eq1:urban age15_25 age26_30
age31_35 educf dhspfp5 dclnf210 dclnf35) instruments (eq2:urban age15_25
age26_30 age31_35 educf hosp55 clnc255 clnc355)
winitial(unadjusted,independent) wmatrix(unadjusted) twostep;

GMM estimation

Number of moments    =  18
GMM weight matrix:     Unadjusted                      Number of obs =    3699

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /b0 |   4.259347   .0512122    83.17   0.000     4.158973    4.359721
  /xb1_urban |  -.5224261   .0487445   -10.72   0.000    -.6179636   -.4268885
/xb1_age1~25 |    -.43919   .0524799    -8.37   0.000    -.5420487   -.3363313
/xb1_age2~30 |   -.225901    .050169    -4.50   0.000    -.3242304   -.1275715
/xb1_age3~35 |  -.0444741   .0487651    -0.91   0.362    -.1400519    .0511037
  /xb1_educf |  -.0413129   .0051912    -7.96   0.000    -.0514876   -.0311383
/xb1_dhspfp5 |  -.1716952   .0458525    -3.74   0.000    -.2615644   -.0818261
/xb1_dcl~210 |  -.0878041   .0426432    -2.06   0.039    -.1713832    -.004225
/xb1_dclnf35 |   -.209072   .0417471    -5.01   0.000    -.2908949   -.1272492
         /c0 |   6.552747   .0798257    82.09   0.000     6.396292    6.709203
  /xb2_urban |  -.4345315   .0938955    -4.63   0.000    -.6185632   -.2504998
/xb2_age1~25 |  -4.146713   .0937755   -44.22   0.000     -4.33051   -3.962916
/xb2_age2~30 |  -2.877945    .089742   -32.07   0.000    -3.053836   -2.702054
/xb2_age3~35 |  -1.493905   .0872593   -17.12   0.000     -1.66493   -1.322879
  /xb2_educf |  -.1163168   .0092479   -12.58   0.000    -.1344424   -.0981912
 /xb2_hosp55 |  -.3579445   .1077349    -3.32   0.001     -.569101    -.146788
/xb2_clnc255 |   .0673702    .106759     0.63   0.528    -.1418735     .276614
/xb2_clnc355 |  -.2343723   .0682729    -3.43   0.001    -.3681847   -.1005599
------------------------------------------------------------------------------
Instruments for equation 1: urban age15_25 age26_30 age31_35 educf dhspfp5
    dclnf210 dclnf35 _cons
Instruments for equation 2: urban age15_25 age26_30 age31_35 educf hosp55
    clnc255 clnc355 _cons
Panel Data Models
STATA has separate commands for instrumental variables
regressions for panel data models (xtivreg) and for dynamic
panel data models (xtdpdsys).
This is a topic for an entire talk. However, there are a couple of interesting
points:
Instrumental Variables and Panel Data:
Consider a simple panel data model with only time varying
regressors:
$$Y_{ti} = X_{ti}\beta + \mu_i + \varepsilon_{ti}$$
where there are K regressors and $\beta$ is K x 1. Now suppose that:
$$E(X_{ti}\varepsilon_{ti}) \neq 0 \quad \text{and} \quad E(X_{ti}\mu_i) \neq 0$$
So OLS, random effects, fixed effects, and first differencing are
not consistent estimators.
However, suppose a set of L instruments $Z_{ti}$ exists that are
uncorrelated with the errors but correlated with $X_{ti}$; then
instrumental variables methods can be used. On the surface, it
appears that identification requires $L \geq K$, but this
is not necessarily the case. To see this, suppose T=2:
$$Y_{1i} = X_{1i}\beta + \mu_i + \varepsilon_{1i}$$
$$Y_{2i} = X_{2i}\beta + \mu_i + \varepsilon_{2i}$$
We can think of this as two equations where we have restricted the
effect of the explanatory variables to be the same. Now $Z_{2i}$ is an
instrument for $X_{2i}$; however, so is $Z_{1i}$. In addition, Zs from
both time periods are valid instruments for $X_{1i}$ (assuming strong
exogeneity of the Zs).
Note that as we add time periods, the number of $\beta$s to be
estimated stays the same (unless we start adding time interactions,
for example) and so the level of overidentification increases.
Higher levels of overidentification typically increase the efficiency of
the resulting estimator.
This leads to what is referred to as a GMM style instrument set.
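Example:
Standard identification, sometimes called the summation
assumption:
Let $\nu_{ti} = \mu_i + \varepsilon_{ti}$,
$$\nu_i = \begin{pmatrix}\nu_{1i}\\ \nu_{2i}\\ \vdots\\ \nu_{Ti}\end{pmatrix} \quad \text{and} \quad Z_i = \begin{pmatrix}Z_{1i}'\\ Z_{2i}'\\ \vdots\\ Z_{Ti}'\end{pmatrix}$$
Then $E(Z_i'\nu_i) = 0$ requires that $L \geq K$ for identification.

The contemporaneous exogeneity assumption redefines:
$$Z_i = \begin{pmatrix} Z_{1i}' & 0 & \cdots & 0\\ 0 & Z_{2i}' & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & Z_{Ti}'\end{pmatrix}$$
which means that we have T x L orthogonality conditions, but we
can get even more if we go to strong exogeneity (Zs and $\nu$s
from different time periods orthogonal).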
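The way the number of orthogonality conditions grows under the three assumptions above can be tallied with a short Python helper. This is only a counting sketch: the T x T x L figure for strong exogeneity assumes every period's instruments are orthogonal to every period's error, and the comparison with K is the order condition only, which is necessary but not sufficient for identification. The function and key names are hypothetical.

```python
def moment_counts(T, L):
    """Number of orthogonality conditions under each instrument-set assumption."""
    return {
        "summation": L,                # stacked Z_i: one set of L moments
        "contemporaneous": T * L,      # block-diagonal Z_i: L moments per period
        "strong": T * T * L,           # all leads and lags of Z used in every period
    }

def order_condition(T, L, K):
    """True where the number of moments is at least the number of parameters K."""
    return {name: count >= K for name, count in moment_counts(T, L).items()}

# Hypothetical case: two periods, one instrument, two regressors.
print(moment_counts(2, 1))
print(order_condition(2, 1, 2))
```

In the hypothetical case printed above, L is smaller than K, so the summation assumption fails the order condition while the other two satisfy it, in line with the remark that L at least as large as K is not always required.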