Introduction To Econometric Solutions To Exercises (Part 2)
Solutions to Empirical Exercises
Chapter 3
Review of Statistics
Solutions to Empirical Exercises
1. (a)
Average Hourly Earnings, Nominal $s
                         Mean         SE(Mean)         95% Confidence Interval
AHE_1992                 11.63        0.064            11.50 to 11.75
AHE_2004                 16.77        0.098            16.58 to 16.96
                         Difference   SE(Difference)   95% Confidence Interval
AHE_2004 - AHE_1992      5.14         0.117            4.91 to 5.37
(b)
Average Hourly Earnings, Real $2004
                         Mean         SE(Mean)         95% Confidence Interval
AHE_1992                 15.66        0.086            15.49 to 15.82
AHE_2004                 16.77        0.098            16.58 to 16.96
                         Difference   SE(Difference)   95% Confidence Interval
AHE_2004 - AHE_1992      1.11         0.130            0.85 to 1.37
(c) The results from part (b) adjust for changes in purchasing power. These results should be used.
(d)
Average Hourly Earnings in 2004
                         Mean         SE(Mean)         95% Confidence Interval
High School              13.81        0.102            13.61 to 14.01
College                  20.31        0.158            20.00 to 20.62
                         Difference   SE(Difference)   95% Confidence Interval
College - High School    6.50         0.188            6.13 to 6.87
(e)
Average Hourly Earnings in 1992 (in $2004)
                         Mean         SE(Mean)         95% Confidence Interval
High School              13.48        0.091            13.30 to 13.65
College                  19.07        0.148            18.78 to 19.36
                         Difference   SE(Difference)   95% Confidence Interval
College - High School    5.59         0.173            5.25 to 5.93
(f)
Average Hourly Earnings in 2004
                               Mean         SE(Mean)         95% Confidence Interval
AHE_HS,2004 - AHE_HS,1992      0.33         0.137            0.06 to 0.60
AHE_Col,2004 - AHE_Col,1992    1.24         0.217            0.82 to 1.66
Col - HS Gap (1992)            5.59         0.173            5.25 to 5.93
Col - HS Gap (2004)            6.50         0.188            6.13 to 6.87
                               Difference   SE(Difference)   95% Confidence Interval
Gap_2004 - Gap_1992            0.91         0.256            0.41 to 1.41
Wages of high school graduates increased by an estimated 0.33 dollars per hour (with a 95%
confidence interval of 0.06 to 0.60); wages of college graduates increased by an estimated 1.24
dollars per hour (with a 95% confidence interval of 0.82 to 1.66). The College - High School gap
increased by an estimated 0.91 dollars per hour.
(g) Gender Gap in Earnings for High School Graduates
Year   Ȳ_m     s_m    n_m    Ȳ_w     s_w    n_w    Ȳ_m - Ȳ_w   SE(Ȳ_m - Ȳ_w)   95% CI
1992   14.57   6.55   2770   11.86   5.21   1870   2.71        0.173           2.37 to 3.05
2004   14.88   7.16   2772   11.92   5.39   1574   2.96        0.192           2.59 to 3.34
There is a large and statistically significant gender gap in earnings for high school graduates.
In 2004 the estimated gap was $2.96 per hour; in 1992 the estimated gap was $2.71 per hour
(in $2004). The increase in the gender gap is somewhat smaller for high school graduates than
it is for college graduates.
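The standard errors and confidence intervals in the gender-gap table can be checked from the reported means, standard deviations, and sample sizes. The following is a minimal sketch, assuming independent samples and a normal critical value of 1.96; the function name `diff_in_means` is illustrative and not part of the original solutions.

```python
import math


def diff_in_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Difference in means, its standard error, and a 95% confidence interval.

    Assumes the two samples are independent, so the variances of the two
    sample means add.
    """
    diff = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff, se, (diff - z * se, diff + z * se)


# Values taken from the table in part (g): men first, then women.
gap_1992 = diff_in_means(14.57, 6.55, 2770, 11.86, 5.21, 1870)
gap_2004 = diff_in_means(14.88, 7.16, 2772, 11.92, 5.39, 1574)

for year, (diff, se, (lo, hi)) in (("1992", gap_1992), ("2004", gap_2004)):
    print(f"{year}: gap = {diff:.2f}, SE = {se:.3f}, 95% CI = {lo:.2f} to {hi:.2f}")
```

Because the inputs are the rounded table values, the last digit of an interval endpoint can differ slightly from the table.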
Chapter 4
Linear Regression with One Regressor
Solutions to Empirical Exercises
1. (a)
Predicted Course_Eval = 4.00 + 0.133 × Beauty. The variable Beauty has a mean that is equal to 0; the
estimated intercept is the mean of the dependent variable (Course_Eval) minus the estimated
slope (0.133) times the mean of the regressor (Beauty). Thus, the estimated intercept is equal
to the mean of Course_Eval.
(c) The standard deviation of Beauty is 0.789. Thus
Professor Watson's predicted course evaluation = 4.00 + 0.133 × 0 × 0.789 = 4.00
Professor Stock's predicted course evaluation = 4.00 + 0.133 × 1 × 0.789 = 4.105
(d) The standard deviation of course evaluations is 0.55 and the standard deviation of beauty is
0.789. A one standard deviation increase in beauty is expected to increase course evaluation by
0.133 × 0.789 = 0.105, or 1/5 of a standard deviation of course evaluations. The effect is small.
(e) The regression R² is 0.036, so that Beauty explains only 3.6% of the variance in course
evaluations.
3. (a)
Predicted Ed = 13.96 - 0.073 × Dist. The regression predicts that if colleges are built 10 miles closer
to where students go to high school, average years of college will increase by 0.073 years.
(b) Bob's predicted years of completed education = 13.96 - 0.073 × 2 = 13.81
Bob's predicted years of completed education if he were 10 miles from college = 13.96 - 0.073 × 1 = 13.89
(c) The regression R² is 0.0074, so that distance explains only a very small fraction of years of
completed education.
(d) SER = 1.8074 years.
4. (a)
[Figure: scatterplot of Growth (vertical axis) against Trade Share (horizontal axis)]
Yes, there appears to be a weak positive relationship.
(b) Malta is the outlying observation with a trade share of 2.
(c)
+ How to get the standard error depends on the software that you are using.
An easy way is to re-specify the regression, replacing Female × Beauty with Male × Beauty. The
resulting regression is shown in (4) in the table. Now, the coefficient on Beauty is the effect of
Beauty for females and the standard error is given in the table. The 95% confidence interval is
(0.090 ± 1.96 × 0.040) × (2 × 0.79), or 0.02 to 0.27
3. This table contains the results from seven regressions that are referenced in these answers. The
Dependent Variable in all of the regressions is ED
Regressor             (1)         (2)          (3)         (4)         (5)
Dependent variable    ED          ln(ED)       ED          ED          ED
Dist                  -0.037**    -0.0026**    -0.081**    -0.081**    -0.110**
                      (0.012)     (0.0009)     (0.025)     (0.025)     (0.028)
Dist²                                          0.0046*     0.0047*     0.0065*
                                               (0.0021)    (0.0021)    (0.0022)
Tuition               0.191       0.014*       0.193*      0.194*      0.210*
                      (0.099)     (0.007)      (0.099)     (0.099)     (0.099)
Female                0.143**     0.010**      0.143**     0.141**     0.141**
                      (0.050)     (0.004)      (0.050)     (0.050)     (0.050)
Black                 0.351**     0.026**      0.334**     0.331**     0.333**
                      (0.067)     (0.005)      (0.068)     (0.068)     (0.068)
Hispanic              0.362**     0.026**      0.333**     0.329**     0.323**
                      (0.076)     (0.005)      (0.078)     (0.078)     (0.078)
Bytest                0.093**     0.0067**     0.093**     0.093**     0.093**
                      (0.003)     (0.0002)     (0.003)     (0.003)     (0.003)
Incomehi              0.372**     0.027**      0.369**     0.362**     0.217*
                      (0.062)     (0.004)      (0.062)     (0.062)     (0.090)
Ownhome               0.139*      0.010*       0.143*      0.141*      0.144*
                      (0.065)     (0.005)      (0.065)     (0.065)     (0.065)
DadColl               0.571**     0.041**      0.561**     0.654**     0.663**
                      (0.076)     (0.005)      (0.077)     (0.087)     (0.087)
MomColl               0.378**     0.027**      0.378**     0.569**     0.567**
                      (0.083)     (0.006)      (0.083)     (0.122)     (0.122)
DadColl × MomColl                                          -0.366*     -0.356*
                                                           (0.164)     (0.164)
Cue80                 0.029**     0.002**      0.026**     0.026**     0.026**
                      (0.010)     (0.0007)     (0.010)     (0.010)     (0.010)
Stwmfg                0.043*      0.003*       0.043*      0.042*      0.042*
                      (0.020)     (0.001)      (0.020)     (0.020)     (0.020)
Incomehi × Dist                                                        0.124*
                                                                       (0.062)
Incomehi × Dist²                                                       0.0087
                                                                       (0.0062)
Intercept             8.920**     2.266**      9.012**     9.002**     9.042**
                      (0.243)     (0.017)      (0.250)     (0.250)     (0.251)
F-statistics (p-values) and measures of fit
(a) Dist and Dist²                             6.08        6.00        8.35
                                               (0.002)     (0.003)     (0.000)
Interaction terms                                                      2.34
Incomehi × Dist and                                                    (0.096)
Incomehi × Dist²
SER                   1.538       0.109        1.537       1.536       1.536
R²                    0.281       0.283        0.282       0.283       0.283
Significant at the *5% and **1% significance level.
(a) The regression results for this question are shown in column (1) of the table. If Dist increases
from 2 to 3, education is predicted to decrease by 0.037 years. If Dist increases from 6 to 7,
education is predicted to decrease by 0.037 years. These values are the same because the
regression is a linear function relating ED and Dist.
(b) The regression results for this question are shown in column (2) of the table. If Dist increases
from 2 to 3, ln(ED) is predicted to decrease by 0.0026. This means that education is predicted
to decrease by 0.26%. If Dist increases from 6 to 7, ln(ED) is predicted to decrease by 0.0026.
This means that education is predicted to decrease by 0.26%. These values, in percentage terms,
are the same because the regression is a linear function relating ln(ED) and Dist.
(c) When Dist increases from 2 to 3, the predicted change in ED is:
(-0.081 × 3 + 0.0046 × 3²) - (-0.081 × 2 + 0.0046 × 2²) = -0.058.
This means that the number of years of completed education is predicted to decrease by 0.058
years. When Dist increases from 6 to 7, the predicted change in ED is:
(-0.081 × 7 + 0.0046 × 7²) - (-0.081 × 6 + 0.0046 × 6²) = -0.021.
This means that the number of years of completed education is predicted to decrease by 0.021 years.
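The two predicted changes in part (c) can be reproduced with a short calculation. This is a minimal sketch using the column (3) coefficients on Dist and Dist² (-0.081 and 0.0046); the function name `predicted_ed_change` is illustrative. The other regressors are held fixed, so they cancel in the difference.

```python
def predicted_ed_change(dist_from, dist_to, b_dist=-0.081, b_dist2=0.0046):
    """Predicted change in ED when Dist moves from dist_from to dist_to.

    Only the Dist and Dist^2 terms matter: every other regressor is held
    constant and drops out of the difference.
    """
    def dist_part(d):
        return b_dist * d + b_dist2 * d ** 2

    return dist_part(dist_to) - dist_part(dist_from)


change_2_to_3 = predicted_ed_change(2, 3)
change_6_to_7 = predicted_ed_change(6, 7)
print(f"Dist 2 -> 3: {change_2_to_3:.3f} years")
print(f"Dist 6 -> 7: {change_6_to_7:.3f} years")
```

The effect of one extra unit of Dist shrinks in magnitude as Dist grows, which is what the positive coefficient on Dist² implies.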
(d) The regression in (3) adds the variable Dist² to regression (1). The coefficient on Dist² is
statistically significant (t = 2.26) and this suggests that the addition of Dist² is important. Thus,
(3) is preferred to (1).
(e)
[Figure: Regression Functions. Years of Education (vertical axis, 14.95 to 15.4) plotted against Distance (10's of Miles) (horizontal axis, 0 to 10) for Regression (1) and Regression (3).]
(i) The quadratic regression in (3) is steeper for small values of Dist than for larger values.
The quadratic function is essentially flat when Dist = 10. The only change in the regression
functions for a white male is that the intercept would shift. The functions would have the
same slopes.
(ii) The regression function becomes positively sloped for Dist > 10. There are only 44 of the
3796 observations with Dist > 10. This is approximately 1% of the sample. Thus, this part of
the regression function is very imprecisely estimated.
(f) The estimated coefficient is -0.366. This is the extra effect on education, above and beyond
the separate MomColl and DadColl effects, when both mother and father attended college.
(g) (i) This is the coefficient on DadColl, which is 0.654 years
(ii) This is the coefficient on MomColl, which is 0.569 years
(iii) This is the sum of the coefficients on DadColl, MomColl and the interaction term. This is
0.654 + 0.569 - 0.366 = 0.857 years.
(h) Regression (5) adds the interaction of Incomehi and the distance regressors, Dist and Dist².
The implied coefficients on Dist and Dist² are:
Students who are not high income (Incomehi = 0)
.)
Non West
West
n
n
Because the samples are independent, the standard errors for the estimated difference in the
coefficients can be calculated as
SE(β̂_NonWest - β̂_West) = sqrt[ SE(β̂_NonWest)² + SE(β̂_West)² ].
For example, the standard error for the difference between the non-West and West coefficients on
Dist is
sqrt(0.028² + 0.045²) = 0.053.
The coefficients on Dist and Dist² in the West are very similar to the values for the non-West. This
means that the estimated regression coefficients are similar. The interaction terms Incomehi × Dist
and Incomehi × Dist² look different. In the non-West, the estimated regression function for high
income students was essentially flat (E8.3(h)), while the estimated regression function in the
West for students with Incomehi = 1 is very similar to the regression function for students with
Incomehi = 0. However, the coefficients on the interaction terms for the West sample are
imprecisely estimated and are not statistically different from the non-West sample. Indeed, the only
coefficient that differs statistically significantly across the two samples is the coefficient on Bytest.
The difference is 0.093 - 0.073 = 0.020, which has a standard error of
sqrt(0.003² + 0.006²) = 0.0067.
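The comparison of coefficients across the two independent samples can be sketched in a few lines. This is an illustrative calculation, assuming the two estimates are independent so that their variances add; the helper name `se_difference` is hypothetical, and the t-statistic for Bytest is computed here rather than taken from the original solutions.

```python
import math


def se_difference(se_a, se_b):
    """Standard error of the difference between two independent estimates."""
    return math.hypot(se_a, se_b)  # sqrt(se_a**2 + se_b**2)


# Coefficient on Dist: standard errors 0.028 (non-West) and 0.045 (West).
se_dist = se_difference(0.028, 0.045)

# Coefficient on Bytest: 0.093 (0.003) in the non-West, 0.073 (0.006) in the West.
diff_bytest = 0.093 - 0.073
se_bytest = se_difference(0.003, 0.006)
t_bytest = diff_bytest / se_bytest

print(f"SE of Dist difference:   {se_dist:.3f}")
print(f"Bytest difference:       {diff_bytest:.3f}")
print(f"SE of Bytest difference: {se_bytest:.4f}")
print(f"t-statistic for Bytest:  {t_bytest:.2f}")
```

A t-statistic above 1.96 in absolute value means the two Bytest coefficients differ at the 5% level.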
Regression Results for Non-Western and Western States
                      Non-West     West
Dist                  -0.110**     -0.092*
                      (0.028)      (0.045)
Dist²                 0.0065*      0.0041
                      (0.0022)     (0.0031)
Tuition               0.210*       0.523*
                      (0.099)      (0.242)
Female                0.141**      0.051**
                      (0.050)      (0.100)
Black                 0.333**      0.067**
                      (0.068)      (0.182)
Hispanic              0.323**      0.196**
                      (0.078)      (0.115)
Bytest                0.093**      0.073**
                      (0.003)      (0.006)
Incomehi              0.217*       0.407*
                      (0.090)      (0.169)
Ownhome               0.144*       0.199*
                      (0.065)      (0.127)
DadColl               0.663**      0.441**
                      (0.087)      (0.144)
MomColl               0.567**      0.283**
                      (0.122)      (0.262)
DadColl × MomColl     -0.356       0.142
                      (0.164)      (0.330)
Cue80                 0.026**      0.045**
                      (0.010)      (0.023)
Stwmfg                0.042*       0.031*
                      (0.020)      (0.044)
Incomehi × Dist       0.124*       0.005*
                      (0.062)      (0.090)
Incomehi × Dist²      0.0087       0.0000
                      (0.0062)     (0.0057)
Intercept             9.042**      9.227**
                      (0.251)      (0.524)
F-statistics (p-values) and measures of fit
(a) Dist and Dist²    8.35         2.66
                      (0.000)      (0.070)
(b) Interaction terms 2.34         0.01
Incomehi × Dist and   (0.096)      (0.993)
Incomehi × Dist²
SER                   1.536        1.49
R²                    0.283        0.218
n                     3796         943
Significant at the *5% and **1% significance level.
Chapter 10
Regression with Panel Data
Solutions to Empirical Exercises
1.
                  (1)          (2)          (3)          (4)
shall             -0.443**     -0.368**     -0.0461*     -0.0280
                  (0.048)      (0.035)      (0.019)      (0.017)
incar_rate                     0.00161**    0.00007      0.0000760
                               (0.00018)    (0.00009)    (0.000090)
density                        0.0267       0.172**      0.0916
                               (0.014)      (0.085)      (0.076)
avginc                         0.00121      0.00920      0.000959
                               (0.0073)     (0.0059)     (0.0064)
pop                            0.0427**     0.0115       0.00475
                               (0.0031)     (0.0087)     (0.0079)
pb1064                         0.0809**     0.104**      0.0292
                               (0.020)      (0.018)      (0.023)
pw1064                         0.0312**     0.0409**     0.00925
                               (0.0097)     (0.0051)     (0.0079)
pm1029                         0.00887      0.0503**     0.0733**
                               (0.012)      (0.0064)     (0.016)
Intercept         6.135**      2.982**      3.866**      3.766**
                  (0.019)      (0.61)       (0.38)       (0.47)
State Effects     No           No           Yes          Yes
Time Effects      No           No           No           Yes
F-Statistics and p-values testing exclusion of groups of variables
State Effects                               210.38       309.29
                                            (0.00)       (0.00)
Time Effects                                             13.90
                                                         (0.00)
R²                0.09         0.56         0.94         0.95
(a) (i) The coefficient is -0.368, which suggests that shall-issue laws reduce violent crime by 36%.
This is a large effect.
(ii) The coefficient in (1) is -0.443; in (2) it is -0.368. Both are highly statistically significant.
Adding the control variables results in a small drop in the magnitude of the coefficient.
(iii) Attitudes towards guns and crime. Quality of schools. Quality of police and other crime-
prevention programs.
(b) In (3) the coefficient on shall falls to -0.046, a large reduction in magnitude from (2).
Evidently there was important omitted variable bias in (2). The 95% confidence interval for
β_shall is now -0.086 to -0.007, or a reduction of 0.7% to 8.6%. The state effects are jointly
statistically significant, so this regression seems better specified than (2).
(c) The coefficient falls further in magnitude to -0.028. The coefficient is insignificantly different from zero. The
time effects are jointly statistically significant, so this regression seems better specified than (3).
(d) This table shows the coefficient on shall in the regression specifications (1)-(4). To save space,
coefficients for variables other than shall are not reported.
Dependent Variable = ln(rob)
                  (1)          (2)          (3)          (4)
shall             0.773**      0.529**      0.008        0.027
                  (0.070)      (0.051)      (0.026)      (0.025)
F-Statistics and p-values testing exclusion of groups of variables
State Effects                               190.47       243.39
                                            (0.00)       (0.00)
Time Effects                                             12.39
                                                         (0.00)
Dependent Variable = ln(mur)
shall             0.473**      0.313**      0.061*       0.015
                  (0.049)      (0.036)      (0.027)      (0.027)
F-Statistics and p-values testing exclusion of groups of variables
State Effects                               88.22        106.69
                                            (0.00)       (0.00)
Time Effects                                             9.73
                                                         (0.00)
The quantitative results are similar to the results using violent crimes: there is a large estimated
effect of concealed weapons laws in specifications (1) and (2). This effect is spurious and is due
to omitted variable bias, as specifications (3) and (4) show.
(e) There is potential two-way causality between this year's incarceration rate and the number of
crimes. Because this year's incarceration rate is much like last year's rate, there is a potential
two-way causality problem. There are similar two-way causality issues relating crime and shall.
(f) The most credible results are given by regression (4). The 95% confidence interval for
β_shall is +1% to -6.6%. This includes β_shall = 0. Thus, there is no statistically significant
evidence that concealed weapons laws have any effect on crime rates. The interval is wide, however,
and includes values as large in magnitude as -6.6%. Thus, at a 5% level the hypothesis that
β_shall = -0.066 (so that the laws reduce crime by 6.6%) cannot be rejected.
2.
Regressor         (1)            (2)            (3)
sb_useage         0.00407***     -0.00577***    -0.00372***
                  (0.0012)       (0.0012)       (0.0011)
speed65           0.000148       0.000425       0.000783*
                  (0.00041)      (0.00033)      (0.00042)
speed70           0.00240***     0.00123***     0.000804**
                  (0.00047)      (0.00033)      (0.00034)
ba08              0.00192***     0.00138***     0.000822**
                  (0.00036)      (0.00037)      (0.00035)
drinkage21        0.0000799      0.000745       0.00113**
                  (0.00099)      (0.00051)      (0.00054)
lninc             0.0181***      0.0135***      0.00626
                  (0.0011)       (0.0014)       (0.0039)
age               0.00000722     0.000979**     0.00132***
                  (0.00016)      (0.00038)      (0.00038)
State Effects     No             Yes            Yes
Year Effects      No             No             Yes
R²                0.544          0.874          0.897
(a) The estimated coefficient on seat belt usage is positive and statistically significant. On the face
of it, this suggests that seat belt usage leads to an increase in the fatality rate.
(b) The results change. The coefficient on seat belt usage is now negative and the coefficient is
statistically significant. The estimated value of β_SB = -0.00577, so that a 10% increase in seat belt
usage (so that sb_useage increases by 0.10) is estimated to lower the fatality rate by 0.000577
fatalities per million traffic miles. States with more dangerous driving conditions (and a higher
fatality rate) also have more people wearing seat belts. Thus (1) suffers from omitted variable
bias.
(c) The results change. The estimated value of β_SB = -0.00372.
(d) The time effects are statistically significant: the F-statistic = 10.91 with a p-value of 0.00.
The results in (3) are the most reliable.
(e) A 38% increase in seat belt usage from 0.52 to 0.90 is estimated to lower the fatality rate by
0.00372 × 0.38 = 0.0014 fatalities per million traffic miles. The average number of traffic miles
per year per state in the sample is 41,447 million. For a state with the average number of traffic miles, the
number of fatalities prevented is 0.0014 × 41,447 = 58 fatalities.
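The arithmetic in part (e) can be written out as a short calculation. This is a minimal sketch using the specification (3) coefficient; the variable names are illustrative, and the 41,447 figure is treated as millions of traffic miles, which is an assumption implied by the units of the fatality rate.

```python
# Coefficient on sb_useage from specification (3), in fatalities per million
# traffic miles per unit change in the usage fraction.
beta_sb = -0.00372

# Seat belt usage rising from 52% to 90%.
usage_change = 0.90 - 0.52

# Average traffic miles per state-year, assumed to be in millions of miles.
avg_traffic_miles_millions = 41_447

rate_change = beta_sb * usage_change
fatalities_prevented = -rate_change * avg_traffic_miles_millions

print(f"Change in fatality rate: {rate_change:.4f} per million traffic miles")
print(f"Fatalities prevented:    {fatalities_prevented:.1f}")
```

Carrying full precision gives a figure slightly above 58; the answer in the text rounds the rate change to 0.0014 before multiplying.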
(f) A regression yields
                   p̂        SE(p̂)
All Workers        0.242    0.004
No Smoking Ban     0.290    0.007
Smoking Ban        0.212    0.005
(b) From model (1), the difference is -0.078 with a standard error of 0.009. The resulting t-statistic
is -8.66, so the coefficient is statistically significant.
(c) From model (2) the estimated difference is -0.047, smaller in magnitude than the effect in model (1).
Evidently (1) suffers from omitted variable bias. That is, smkban may be correlated with
the education/race/gender indicators or with age. For example, workers with a college degree are
more likely to work in an office with a smoking ban than high-school dropouts, and college
graduates are less likely to smoke than high-school dropouts.
(d) The t-statistic is 5.27, so the coefficient is statistically significant at the 1% level.
(e) The F-statistic has a p-value of 0.00, so the coefficients are significant. The omitted education
status is "Master's degree or higher." Thus the coefficients show the increase in probability
relative to someone with a postgraduate degree. For example, the coefficient on Colgrad is 0.045,
so the probability of smoking for a college graduate is 0.045 (4.5%) higher than for someone
with a postgraduate degree. Similarly, the coefficient on HSdrop is 0.323, so the probability
of smoking for a high school dropout is 0.323 (32.3%) higher than for someone with a postgraduate
degree. Because the coefficients are all positive and get smaller as educational attainment
increases, the probability of smoking falls as educational attainment increases.
(f) The coefficient on Age² is statistically significant. This suggests a nonlinear relationship between
age and the probability of smoking. The figure below shows the estimated probability for a
white, non-Hispanic male college graduate with no workplace smoking ban.
2. (a) See the table above.
(b) The t-statistic is 5.47, very similar to the value for the linear probability model.
(c) The F-statistic is significant at the 1% level, as in the linear probability model.
(d) To calculate the probabilities, take the estimation results from the probit model to calculate ẑ,
and evaluate the cumulative standard normal distribution at ẑ, i.e., Prob(smoke) = Φ(ẑ). The
probability of Mr. A smoking without the workplace ban is 0.464 and the probability of smoking
with the workplace ban is 0.401. Therefore the workplace ban would reduce the probability of
smoking by 0.063 (6.3%).
(e) To calculate the probabilities, take the estimation results from the probit model to calculate ẑ,
and evaluate the cumulative standard normal distribution at ẑ, i.e., Prob(smoke) = Φ(ẑ). The
probability of Ms. B smoking without the workplace ban is 0.143 and the probability of smoking
with the workplace ban is 0.110. Therefore the workplace ban would reduce the probability of
smoking by 0.033 (3.3%).
(f) For Mr. A, the probability of smoking without the workplace ban is 0.449 and the probability
of smoking with the workplace ban is 0.402. Therefore the workplace ban would have a
considerable impact on the probability that Mr. A would smoke. For Ms. B, the probability
of smoking without the workplace ban is 0.145 and the probability of smoking with the
workplace ban is 0.098. In both cases the probability of smoking declines by 0.047 or 4.7%.
(Notice that this is given by the coefficient on smkban, 0.047, in the linear probability model.)
(g) The linear probability model assumes that the marginal impact of workplace smoking bans on
the probability of an individual smoking is not dependent on the other characteristics of the
individual. On the other hand, the probit model's predicted marginal impact of workplace
smoking bans on the probability of smoking depends on individual characteristics. Therefore,
in the linear probability model, the marginal impact of workplace smoking bans is the same for
Mr. A and Ms. B, although their profiles would suggest that Mr. A has a higher probability of
smoking based on his characteristics. Looking at the probit model's results, the marginal impact
of workplace smoking bans on the probability of smoking is different for Mr. A and Ms. B, because
their different characteristics are incorporated into the impact of the laws on the probability of
smoking. In this sense the probit model is likely more appropriate.
Are the impacts of workplace smoking bans large in a real-world sense? Most people might
believe the impacts are large. For example, in (d) the reduction on the probability is 6.3%.
Applied to a large number of people, this translates into a 6.3% reduction in the number of
people smoking.
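The contrast drawn in part (g) can be illustrated numerically. The sketch below starts from the no-ban probabilities reported in parts (d) and (e), recovers the implied probit index z for each person, and applies the same shift to both. The shift of -0.16 is a hypothetical value chosen for illustration, not a coefficient quoted in these solutions, and `ban_effect` is an illustrative helper name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Hypothetical shift in the probit index z caused by a workplace ban.
# This is an assumption for illustration only.
BAN_SHIFT = -0.16


def ban_effect(p_without_ban, shift=BAN_SHIFT):
    """Change in Prob(smoke) = Phi(z) when the index z moves by `shift`."""
    z = std_normal.inv_cdf(p_without_ban)  # index implied by the no-ban probability
    return std_normal.cdf(z + shift) - p_without_ban


effect_a = ban_effect(0.464)  # Mr. A, no-ban probability from part (d)
effect_b = ban_effect(0.143)  # Ms. B, no-ban probability from part (e)

print(f"Mr. A: change in probability = {effect_a:.3f}")
print(f"Ms. B: change in probability = {effect_b:.3f}")
```

The same shift in z moves the probability more for Mr. A, whose baseline is near 0.5 where the normal density is highest, than for Ms. B. In the linear probability model the change is the same constant for both.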
(h) An important concern is two-way causality. Do companies that impose a smoking ban have
fewer smokers to begin with? Do smokers seek employment with employers that do not have a
smoking ban? Do states with smoking bans already have more
or fewer smokers than states without smoking bans?
3. Answers are provided to many of the questions using the linear probability models. You can also
answer these questions using a probit or logit model. Answers are based on the following table:
Dependent Variable
Insured Insured Insured Healthy Healthy Healthy Any
Limitation
Regressor
(1) (2) (3) (4) (5) (6) (7)
selfemp 0.128**
(0.015)
0.174**
(0.014)
0.210**
(0.063)
0.010
(0.007)
0.020*
(0.008)
0.015
(0.008)
0.010
(0.012)
age 0.010**
(0.003)
0.001
(0.001)
0.0006
(0.0017)
0.002
(0.002)
0.003
(0.002)
age
2
0.00008*
(0.00003)
0.010**
(0.003)
0.00003
(0.00002)
0.000
(0.000)
0.000
(0.000)
age selfemp 0.000
(0.000)
deg_ged 0.151**
(0.027)
0.151**
(0.027)
0.045*
(0.020)
0.061*
(0.024)
deg_hs 0.254**
(0.016)
0.254**
(0.016)
0.099**
(0.012)
0.012
(0.012)
deg_ba 0.316**
(0.017)
0.316**
(0.017)
0.122**
(0.013)
0.042**
(0.014)
deg_ma 0.335**
(0.018)
0.335 **
(0.018)
0.128**
(0.015)
0.078**
(0.018)
deg_phd 0.366**
(0.026)
0.366 **
(0.025)
0.138**
(0.018)
0.084**
(0.027)
deg_oth 0.288**
(0.020)
0.287**
(0.020)
0.115**
(0.014)
0.049**
(0.017)
familysz 0.017**
(0.003)
0.017**
(0.003)
0.001
(0.002)
0.016**
(0.002)
race_bl 0.028*
(0.013)
0.028*
(0.013)
0.022*
(0.009)
0.035**
(0.010)
race_ot 0.048*
(0.023)
0.048**
(0.023)
0.029
(0.015)
0.046
(0.016)
reg_ne 0.037**
(0.012)
0.037**
(0.012)
0.006
(0.008)
0.046**
(0.011)
reg_mw 0.053**
(0.012)
0.053**
(0.012)
0.012
(0.008)
0.008
(0.011)
reg_so 0.003
(0.011)
0.004
(0.011)
0.001
(0.008)
0.007
(0.010)
male 0.037**
(0.008)
0.037**
(0.008)
0.015**
(0.005)
0.005
(0.007)
married 0.136**
(0.010)
0.136 **
(0.010)
0.001
(0.007)
0.017**
(0.009)
Intercept 0.817
(0.004)
0.299**
(0.054)
0.296**
(0.054)
0.927**
(0.003)
0.953**
(0.031)
0.902**
(0.035)
0.071
(0.044)
Significant at the 5% * or 1% ** level.
(a) Probability of being insured
                     p̂        SE(p̂)
All Workers          0.802    0.004
Self Employed        0.689    0.014
Not Self Employed    0.817    0.004
The self-employed are 12.8% less likely to have health insurance. This is a large number.
It is statistically significant: from (1) in the table the difference is significant at the 1% level.
(b) From specification (2), the result is robust to adding additional control variables. Indeed, after
controlling for other factors, the difference increases to 17.4%.
(c) See specification (2). There is evidence of nonlinearity (Age² is significant in the regression). The
plot below shows the effect of Age on the probability of being insured for a self-employed white
married male with a BA and a family size of four from the northeast. (The profile for others will
look the same, although it will be shifted up or down.) The probability of being insured increases
with Age over the range 20 to 65 years.
(d) Specification (3) adds an interaction of Age and selfemp. Its coefficient is not statistically
significant, and this suggests that the effect of selfemp does not depend on Age. (Note: this
answer is specific to the linear probability model. In the probit model, even without an
interaction, the effect of selfemp depends on the level of the probability of being insured, and this
probability depends on Age.)
(e) This is investigated in specifications (4)-(7). The effect of selfemp on health status or Any
Limitation is small and not statistically significant. This result obtains when the regression
controls for Age or for a full set of control variables.
There are potential problems with including healthy on the right hand side of the model
because of adverse selection problems. It is possible that only those less healthy
individuals pursue health insurance, perhaps through their employer. This causes a self-
selection problem that more healthy individuals might (a) choose to be self-employed or
(b) choose not to obtain health insurance. While the evidence suggests that there might
not be a strong correlation between health status and self-employment, the adverse
selection concerns still exist.
Chapter 12
Instrumental Variables Regression
Solutions to Empirical Exercises
1. This table shows the OLS and 2SLS estimates. Values for the intercept and coefficients on Seas are
not shown.
Regressor                  OLS          2SLS
ln(Price)                  -0.639       -0.867
                           (0.073)      (0.134)
Ice                        0.448        0.423
                           (0.135)      (0.135)
Seas and intercept         Not Shown    Not Shown
First Stage F-statistic                 183.0
(a) See the OLS column of the table above. The estimated elasticity is -0.639 with a standard error of 0.073.
(b) A positive demand error will shift the demand curve to the right. This will increase the
equilibrium quantity and price in the market. Thus ln(Price) is positively correlated with the
regression error in the demand model. This means that the OLS coefficient will be positively
biased.
(c) Cartel shifts the supply curve. As the cartel strengthens, the supply curve shifts in, reducing
supply and increasing price and profits for the cartel's members. Thus, Cartel is relevant. For
Cartel to be a valid instrument it must be exogenous, that is, it must be unrelated to the factors
affecting demand that are omitted from the demand specification (i.e., those factors that make up
the error in the demand model). This seems plausible.
(d) The first stage F-statistic is 183.0. Cartel is not a weak instrument.
(e) See the table. The estimated elasticity is -0.867 with a standard error of 0.134. Notice that the
estimate is more negative than the OLS estimate, which is consistent with the OLS estimator
having a positive bias.
(f) In the standard model of monopoly, a monopolist should increase price if the demand elasticity
is less than 1 in absolute value. (The increase in price will reduce quantity but increase revenue
and profits.) Here, the elasticity is less than 1 in absolute value.
2. (Results using full dataset)
                           Estimation Method
Regressor                  OLS          IV           IV
Morekids                   -5.387       -6.313       -5.821
                           (0.087)      (1.275)      (1.246)
Additional Regressors      Intercept    Intercept    Intercept, agem1, black, hispan, othrace
First Stage F-Statistic                 1238.2       1280.9
(a) The coefficient is -5.387, which indicates that women with more than 2 children work 5.387
fewer weeks per year than women with 2 or fewer children.
(b) Both fertility and weeks worked are choice variables. A woman with a positive labor supply
regression error (a woman who works more than average) may also be a woman who is less
likely to have an additional child. This would imply that Morekids is positively correlated with
the regression error, so that the OLS estimator of β_Morekids is positively biased.
(c) The linear regression of morekids on samesex (a linear probability model) yields
= 1.26, SE( |
2
Variable  n  mean  s.d.  n  mean  s.d.  X̄_b - X̄_w  se(X̄_b - X̄_w)  t-stat
ofjobs 2435 3.658 1.219 2435 3.664 1.219 0.006 0.035 0.18
yearsexp 2435 7.830 5.011 2435 7.856 5.079 0.027 0.145 0.18
honors 2435 0.051 0.221 2435 0.054 0.226 0.003 0.006 0.45
volunteer 2435 0.414 0.493 2435 0.409 0.492 0.006 0.014 0.41
military 2435 0.102 0.303 2435 0.092 0.290 0.009 0.008 1.11
empholes 2435 0.446 0.497 2435 0.450 0.498 0.004 0.014 0.29
workinschool 2435 0.561 0.496 2435 0.558 0.497 0.003 0.014 0.20
email 2435 0.480 0.500 2435 0.479 0.500 0.001 0.014 0.06
computerskills 2435 0.832 0.374 2435 0.809 0.393 0.024 0.011 2.17
specialskills 2435 0.327 0.469 2435 0.330 0.470 0.003 0.013 0.21
eoe 2435 0.291 0.454 2435 0.291 0.454 0.000 0.013 0.00
manager 2435 0.152 0.359 2435 0.152 0.359 0.000 0.010 0.04
supervisor 2435 0.077 0.267 2435 0.077 0.267 0.000 0.008 0.00
secretary 2435 0.333 0.471 2435 0.333 0.471 0.000 0.014 0.03
offsupport 2435 0.119 0.323 2435 0.119 0.323 0.000 0.009 0.00
salesrep 2435 0.151 0.358 2435 0.151 0.358 0.000 0.010 0.00
retailsales 2435 0.168 0.374 2435 0.168 0.374 0.000 0.011 0.00
req 2435 0.787 0.409 2435 0.787 0.409 0.000 0.012 0.00
expreq 2435 0.435 0.496 2435 0.435 0.496 0.000 0.014 0.00
comreq 2435 0.125 0.331 2435 0.125 0.331 0.000 0.009 0.00
educreq 2435 0.107 0.309 2435 0.107 0.309 0.000 0.009 0.00
compreq 2435 0.437 0.496 2435 0.437 0.496 0.000 0.014 0.03
orgreq 2435 0.073 0.260 2435 0.073 0.260 0.000 0.007 0.00
manuf 2435 0.083 0.276 2435 0.083 0.276 0.000 0.008 0.00
transcom 2435 0.030 0.172 2435 0.030 0.172 0.000 0.005 0.00
bankreal 2435 0.085 0.279 2435 0.085 0.279 0.000 0.008 0.00
trade 2435 0.214 0.410 2435 0.214 0.410 0.000 0.012 0.00
busservice 2435 0.268 0.443 2435 0.268 0.443 0.000 0.013 0.00
othservice 2435 0.155 0.362 2435 0.155 0.362 0.000 0.010 0.00
missind 2435 0.165 0.371 2435 0.165 0.371 0.000 0.011 0.00
chicago 2435 0.555 0.497 2435 0.555 0.497 0.000 0.014 0.00
high 2435 0.502 0.500 2435 0.502 0.500 0.000 0.014 0.00
female 2435 0.775 0.418 2435 0.764 0.425 0.011 0.012 0.88
college 2435 0.723 0.448 2435 0.716 0.451 0.007 0.013 0.51
call_back 2435 0.064 0.246 2435 0.097 0.295 0.032 0.008 4.11
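The last two columns of the table can be checked from the sample sizes and standard deviations in each row. This is a minimal sketch, assuming the two groups are independent samples and taking the reported difference in means as given; `balance_t_stat` is an illustrative name.

```python
import math


def balance_t_stat(diff, sd_1, n_1, sd_2, n_2):
    """Standard error and t-statistic for a difference in two group means."""
    se = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return se, diff / se


# call_back row: reported difference 0.032, s.d. 0.246 and 0.295, n = 2435 each.
se_call, t_call = balance_t_stat(0.032, 0.246, 2435, 0.295, 2435)

# computerskills row: reported difference 0.024, s.d. 0.374 and 0.393.
se_comp, t_comp = balance_t_stat(0.024, 0.374, 2435, 0.393, 2435)

print(f"call_back:      se = {se_call:.3f}, t = {t_call:.2f}")
print(f"computerskills: se = {se_comp:.3f}, t = {t_comp:.2f}")
```

Because the inputs are rounded to three decimals, the computed t-statistics can differ from the table in the second decimal place.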
2. (a) (i) A person will trade if he received good A but prefers good B or he received good B and
prefers good A. 50% receive good A, of these (100 - X)% prefer good B; 50% receive good B,
of these X% prefer good A. Let x = X/100. Thus Expected Fraction Traded = 0.5 × (1 - x) +
0.5 × x = 0.5.
(ii) Use X = 100%.
(iii) Use X = 50%.
(b)-(d)
Answers are based on the following table
Dependent Variable = Trade
                        All Traders          Dealers              Non-Dealers
Regressor               (1)       (2)        (3)       (4)        (5)       (6)       (7)       (8)
Goodb                             0.018                0.021                0.564
                                  (0.078)              (0.117)              (0.100)
Years_trade > 10                                                                      0.046
                                                                                      (0.128)
Trades Per Month > 8                                                                            0.297
                                                                                                (0.140)
Intercept               0.338     0.329      0.445     0.457      0.230     0.200     0.220     0.169
                        (0.039)   (0.057)    (0.058)   (0.085)    (0.049)   (0.079)   (0.055)   (0.050)
(b) From (1) the fraction of trades is 0.338; the t-statistic for $H_0$: p = 0.5 is
t = (0.338 − 0.5)/0.039 = −4.15, so the fraction is statistically significantly different from
p = 0.5. From (2) the fraction of recipients of good A who traded for good B was 0.329 and the
fraction of recipients of good B who traded for good A was 0.329 + 0.018 = 0.347. Both are
statistically significantly different from 0.5 at the 1% level. The fraction of good B recipients
who traded was not statistically significantly different from the fraction of good A recipients.
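The t-statistics behind part (b) can be reproduced with a few lines of Python. This is a sketch under the assumption that the intercept standard errors read from the table are 0.039 for column (1) and 0.057 for column (2); the helper name `t_statistic` is illustrative.

```python
def t_statistic(estimate: float, hypothesized: float, std_error: float) -> float:
    """t-statistic for testing that a coefficient equals a hypothesized value."""
    return (estimate - hypothesized) / std_error


# Column (1): overall fraction of trades, tested against p = 0.5.
t_all = t_statistic(0.338, 0.5, 0.039)
# Column (2): fraction of good A recipients who traded (the intercept).
t_good_a = t_statistic(0.329, 0.5, 0.057)

print(f"all traders:       t = {t_all:.2f}")
print(f"good A recipients: t = {t_good_a:.2f}")
```

Both statistics exceed the two-sided 1% critical value of 2.58 in absolute value. The corresponding test for good B recipients needs the standard error of the sum 0.329 + 0.018, which the table does not report.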
(c) The story is different for dealers (see (3) and (4)). The fraction of trades is 0.446, which is not
statistically significantly different from 0.5. This is true for recipients of good A and good B.
(d) Specifications (5)–(8) use data on non-dealers. (5) and (6) repeat the analysis from parts (b) and (c).
Specification (7) adds an indicator variable that is equal to 1 if the trader has been active in the
market for more than 10 years (approximately 25% of the traders). Specification (8) adds an
indicator variable that is equal to 1 if the trader reports making more than 8 trades per month
(approximately 25% of the traders). Long-term traders (7) are no different from short-term
traders: the coefficient on Years_trade > 10 is small and not statistically significant. Participants
who engage in more than 8 trades per month are different from those who don't: the coefficient
on Trades Per Month > 8 is large and statistically significant. The fraction of these traders who
traded their endowment was 0.297 + 0.169 = 0.466, which is not statistically significantly
different from 0.5.
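The significance statements in (d) can be checked as follows. This sketch assumes the flattened table is read as 0.046 (0.128) for Years_trade > 10 in column (7) and 0.297 (0.140) for Trades Per Month > 8 in column (8), which is the reading consistent with the discussion above; the variable names are illustrative.

```python
# (coefficient, standard error) pairs, as read from columns (7) and (8).
# The column assignment is an assumption based on the text of part (d).
estimates = {
    "Years_trade > 10": (0.046, 0.128),
    "Trades Per Month > 8": (0.297, 0.140),
}
CRITICAL_5PCT = 1.96  # two-sided 5% critical value, standard normal

t_stats = {name: coef / se for name, (coef, se) in estimates.items()}
for name, t in t_stats.items():
    verdict = "significant" if abs(t) > CRITICAL_5PCT else "not significant"
    print(f"{name}: t = {t:.2f} ({verdict} at the 5% level)")

# Fraction of frequent traders who traded: intercept plus indicator coefficient.
fraction_frequent = 0.169 + 0.297
print(f"fraction traded, frequent traders: {fraction_frequent:.3f}")
```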
Chapter 14
Introduction to Time Series Regression and Forecasting
Solutions to Empirical Exercises
1. (a)–(c)
                                                              Mean      Standard Deviation
Quarterly Growth Rate, Unscaled                               0.0083    0.0092
[$\ln(GDP_t/GDP_{t-1})$]
Quarterly Growth Rate, Percentage Points at an annual rate    3.30      3.68
[$400 \ln(GDP_t/GDP_{t-1})$]
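The two rows of the table differ only by the factor 400, which converts a quarterly log growth rate into percentage points at an annual rate. A minimal Python sketch of the conversion follows; the function names and the GDP levels in the example are hypothetical.

```python
import math


def quarterly_log_growth(gdp_prev: float, gdp_curr: float) -> float:
    """Unscaled quarterly growth rate, ln(GDP_t / GDP_{t-1})."""
    return math.log(gdp_curr / gdp_prev)


def annualized_percent(quarterly_growth: float) -> float:
    """Quarterly log growth expressed in percentage points at an annual rate."""
    return 400.0 * quarterly_growth


# Hypothetical GDP levels in two consecutive quarters.
growth = quarterly_log_growth(10_000.0, 10_083.0)
print(round(annualized_percent(growth), 2))

# Scaling the tabulated mean and standard deviation.
print(round(annualized_percent(0.0083), 2), round(annualized_percent(0.0092), 2))
```

Scaling the rounded mean 0.0083 gives 3.32 rather than the tabulated 3.30; the small gap presumably reflects rounding of the unscaled mean. The standard deviation scales to 3.68, matching the table.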
(d) Estimated Autocorrelations (unit free)
Lag Autocorrelation
1 0.29
2 0.17
3 0.03
4 0.02
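For readers who want to compute autocorrelations like those in (d), here is a minimal sketch of one common estimator: the lag-j sample autocovariance divided by the sample variance, with both taken around the full-sample mean. The function name is illustrative, the example series is made up, and textbook definitions can differ slightly in how the means are computed.

```python
def sample_autocorrelation(series, lag):
    """Lag-`lag` sample autocorrelation using the full-sample mean."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((y - mean) ** 2 for y in series)
    covariance = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance / variance


# Hypothetical growth-rate series, for illustration only.
growth = [0.010, 0.012, 0.007, 0.003, 0.009, 0.011, 0.006, 0.008]
for j in range(1, 5):
    print(j, round(sample_autocorrelation(growth, j), 2))
```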
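2. (a) $\widehat{\Delta Y}_t = \underset{(0.0010)}{0.0058} + \underset{(0.076)}{0.301}\,\Delta Y_{t-1}$, $R^2 = 0.086$, SER = 0.0088
(b) $\widehat{\Delta Y}_t = \underset{(0.0010)}{0.0052} + \underset{(0.081)}{0.272}\,\Delta Y_{t-1} + \underset{(0.086)}{0.096}\,\Delta Y_{t-2}$, $R^2 = 0.090$, SER = 0.0088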
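As a consistency check on the AR(1) estimates in (a), the intercept and slope imply a long-run mean growth rate of 0.0058/(1 − 0.301), which should be close to the sample mean of 0.0083 reported in Exercise 1. The sketch below also iterates the AR(1) equation to produce multi-step forecasts; the function names and the starting value are hypothetical.

```python
def ar1_long_run_mean(intercept: float, slope: float) -> float:
    """Mean implied by a stationary AR(1): intercept / (1 - slope)."""
    if abs(slope) >= 1.0:
        raise ValueError("the AR(1) is not stationary")
    return intercept / (1.0 - slope)


def ar1_forecast(intercept: float, slope: float, last_value: float, horizon: int = 1) -> float:
    """Iterated forecast `horizon` periods ahead from the last observed value."""
    forecast = last_value
    for _ in range(horizon):
        forecast = intercept + slope * forecast
    return forecast


print(round(ar1_long_run_mean(0.0058, 0.301), 4))

# Forecasts revert toward the long-run mean as the horizon grows.
# The starting growth rate 0.02 is a made-up value.
for h in (1, 2, 8):
    print(h, round(ar1_forecast(0.0058, 0.301, 0.02, h), 4))
```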
(c) Minimized values are marked with an asterisk (*)
Lag   BIC        AIC
1     −9.4234*   −9.4564*
2     −9.4063    −9.4557
3     −9.3834    −9.4494
4     −9.3598    −9.4423
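The lag choice in (c) follows from minimizing each criterion. The sketch below states the usual formulas, BIC(p) = ln(SSR(p)/T) + (p + 1) ln(T)/T and AIC(p) = ln(SSR(p)/T) + (p + 1)(2/T), and applies the selection rule to the tabulated values. It assumes the tabulated values are negative (the minus signs appear to have been lost in extraction; with an SER of about 0.0088, ln(SSR/T) is roughly −9.5). The SSR and T passed to the formulas are hypothetical.

```python
import math


def bic(ssr: float, n_obs: int, p: int) -> float:
    """Bayes information criterion for an AR(p) with an intercept."""
    return math.log(ssr / n_obs) + (p + 1) * math.log(n_obs) / n_obs


def aic(ssr: float, n_obs: int, p: int) -> float:
    """Akaike information criterion for an AR(p) with an intercept."""
    return math.log(ssr / n_obs) + (p + 1) * 2.0 / n_obs


def best_lag(criterion_by_lag):
    """Lag length with the smallest criterion value."""
    return min(criterion_by_lag, key=criterion_by_lag.get)


# Tabulated values, with the minus signs assumed.
bic_table = {1: -9.4234, 2: -9.4063, 3: -9.3834, 4: -9.3598}
aic_table = {1: -9.4564, 2: -9.4557, 3: -9.4494, 4: -9.4423}
print(best_lag(bic_table), best_lag(aic_table))

# Hypothetical SSR and sample size, only to show the formulas in use.
print(round(bic(0.0124, 160, 1), 3), round(aic(0.0124, 160, 1), 3))
```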
3. Regressing $\Delta Y_t$ on $Y_{t-1}$, $\Delta Y_{t-1}$, a time trend, and an intercept yields a t-statistic on $Y_{t-1}$ of
t = −2.51. The 10% critical value is −3.12, so the DF t-statistic is not significant at the
10% level.
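The Dickey-Fuller test is one-sided: the unit-root null is rejected only when the statistic is more negative than the critical value. The sketch below applies that rule, assuming the statistic is −2.51 (its minus sign appears to have been lost in extraction) and using the large-sample critical values for a regression with an intercept and a time trend. The function name is illustrative.

```python
# Large-sample ADF critical values, intercept and time trend included.
ADF_TREND_CRITICAL = {0.10: -3.12, 0.05: -3.41, 0.01: -3.96}


def rejects_unit_root(t_stat: float, level: float = 0.10) -> bool:
    """One-sided test: reject only if the statistic is below the critical value."""
    return t_stat < ADF_TREND_CRITICAL[level]


t_stat = -2.51  # sign assumed; the extracted text shows 2.51
for level in sorted(ADF_TREND_CRITICAL, reverse=True):
    print(f"{level:.0%} level: reject = {rejects_unit_root(t_stat, level)}")
```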
4. The QLR F-statistic is 1.26 (maximized value at 1966:01). This is less than the 10% critical value
of 5.00, so the null of stability is not rejected.
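5. (a) $\widehat{\Delta Y}_t = \underset{(0.0010)}{0.0060} + \underset{(0.081)}{0.270}\,\Delta Y_{t-1} + \underset{(0.0009)}{0.0018}\,\Delta R_{t-1} - \underset{(0.0010)}{0.0037}\,\Delta R_{t-2} + \underset{(0.0007)}{0.0098}\,\Delta R_{t-3} - \underset{(0.0008)}{0.0030}\,\Delta R_{t-4}$
$R^2 = 0.175$, SER = 0.0084
The $R^2$ has increased from 0.086 to 0.175.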
(b) The F-statistic is 6.93 with a p-value of 0.00.
(c) The QLR F-statistic is 4.80 (maximum in 1974:3). This is larger than the 1% critical value of
4.53 suggesting instability in the ADL(1,4) model.
6.
Selected Pseudo Out-Of-Sample Forecast Results
(percentage points at an annual rate)
Model        Forecast Error Mean (SE)   RMSFE
AR(1)        −0.22 (0.26)               2.06
ADL(1, 4)    −0.51 (0.29)               2.29
Naive        0.07 (0.28)                2.18
The AR and ADL models show a negative bias, but neither is statistically significant at the 5% level.
The AR model has the smallest RMSFE.
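The claim that neither bias is statistically significant can be verified by dividing each mean forecast error by its standard error. This sketch assumes the AR and ADL mean errors are negative, as the text states; the dictionary layout is illustrative.

```python
# (mean forecast error, standard error); negative signs follow the text's
# statement that the AR and ADL models show a negative bias.
forecast_bias = {
    "AR(1)": (-0.22, 0.26),
    "ADL(1, 4)": (-0.51, 0.29),
}
CRITICAL_5PCT = 1.96  # two-sided 5% critical value, standard normal

bias_t = {model: mean / se for model, (mean, se) in forecast_bias.items()}
for model, t in bias_t.items():
    significant = abs(t) > CRITICAL_5PCT
    print(f"{model}: t = {t:.2f}, significant at 5% = {significant}")
```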
7. (a) Table 14.3 Extended Dataset (Sample Period 1932:1–2002:12)
Regressors                                  (1)        (2)        (3)
Excess Return_{t−1}                         0.098      0.102      0.099
                                           (0.061)    (0.061)    (0.058)
Excess Return_{t−2}                                    0.040      0.029
                                                      (0.057)    (0.054)
Excess Return_{t−3}                                               0.098
                                                                 (0.054)
Excess Return_{t−4}                                               0.006
                                                                 (0.046)
Intercept                                   0.524      0.543      0.590
                                           (0.181)    (0.186)    (0.199)
F-statistic on all coefficients (p-value)   2.61       1.51       1.41
                                           (0.11)     (0.22)     (0.23)
R^2                                         0.009      0.009      0.016
158 Stock/Watson - Introduction to Econometrics - Second Edition
(b)
                                            (1)               (2)               (3)
Estimation Period                           1932:1–2002:12    1932:1–2002:12    1932:1–1982:12
Regressors
Excess Return_{t−1}                         0.093             0.109             0.128
                                           (0.135)           (0.124)           (0.07)
Excess Return_{t−2}                                           0.088
                                                             (0.153)
Δln(dividend yield_{t−1})                   0.005             0.007
                                           (0.132)           (0.119)
Δln(dividend yield_{t−2})                                     0.048
                                                             (0.129)
ln(dividend yield_{t−1})                                                        0.020
                                                                               (0.11)
Intercept                                   0.526             0.559             6.759
                                           (0.203)           (0.228)           (3.623)
F-statistic on all coefficients (p-value)   1.34              0.81
                                           (0.26)            (0.52)
R^2                                         0.007             0.007             0.022
(c) The ADF statistic from the regression using 1 lagged first difference and a constant term is
−2.78. This is smaller (more negative) than the 10% critical value, but not more negative than
the 5% critical value.
(d)
Model RMSFE
Zero Forecast 4.28
Constant Forecast 4.25
ADL(1, 1) 4.29
(e) No. The in-sample regressions in Tables 14.3 and 14.7 suggest that the coefficients on lagged
excess returns and lags of the first difference of the dividend yield are insignificant. The dividend
yield is persistent, and this makes statistical inference in (3) of Table 14.7 difficult. In the
pseudo out-of-sample experiment, the ranking of the forecasts (Constant, Zero, ADL(1,1)) is the
same as reported in the box.
Chapter 15
Estimation of Dynamic Causal Effects
Solutions to Empirical Exercises
1. (a) Mean = 0.27; Standard Deviation = 0.94
(b) $O_t$ is the greater of zero or the percentage point difference between oil prices at date t and their
maximum value during the past year. Thus $O_t \ge 0$, and $O_t = 0$ if the oil price at date t is not
greater than the maximum value over the past year.
[Figure: time-series plot of $O_t$, 1952–2002; vertical axis from 0.00 to 0.30]
(c) m was chosen using $0.75T^{0.33}$ rounded to the nearest integer; m = 6 in this case. The estimated
coefficients and 95% confidence intervals are shown in the figure in part (e).
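The truncation rule in (c) is easy to reproduce. The sketch below uses the exponent 1/3, as the rule of thumb is usually stated (the text prints it as 0.33). The sample size T = 620 is a hypothetical value for a monthly sample of roughly this length; the text does not report T.

```python
def hac_truncation(n_obs: int, exponent: float = 1.0 / 3.0) -> int:
    """Rule-of-thumb HAC truncation parameter: 0.75 * T**exponent, rounded."""
    return round(0.75 * n_obs ** exponent)


# T = 620 is an assumed sample size, not a figure from the text.
print(hac_truncation(620))
# Using the printed exponent 0.33 gives the same integer for this T.
print(hac_truncation(620, exponent=0.33))
```

Any T between about 395 and 650 gives m = 6 under this rule, so the result is not sensitive to the exact sample size assumed here.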
(d) The F-statistic testing that all 19 coefficients are equal to zero is 1.78, with a p-value 0.02;
the coefficients are significant at the 5% but not the 1% level.
(e) The cumulative multipliers show a persistent and large decrease in industrial production
following an increase in oil prices above their previous 12-month peak price. Specifically,
a 100% increase in oil prices leads to an estimated 15% decline in industrial production
after 18 months.
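Cumulative multipliers are running sums of the dynamic multipliers: the h-period cumulative multiplier is the sum of the coefficients on lags 0 through h. A minimal sketch follows, with made-up coefficient values since the estimates themselves are shown only in the figure.

```python
from itertools import accumulate


def cumulative_multipliers(dynamic_multipliers):
    """Running sum of dynamic multipliers (lag 0, lag 1, ...)."""
    return list(accumulate(dynamic_multipliers))


# Hypothetical dynamic multipliers for lags 0 to 4, for illustration only.
dynamic = [1.0, -2.0, -3.0, -1.5, -0.5]
print(cumulative_multipliers(dynamic))
```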
[Figure: Dynamic Effect of Oil on IP Growth. Panel (a): Estimated Dynamic Multipliers and 95% Confidence Interval. Panel (b): Estimated Cumulative Multipliers and 95% Confidence Interval. Horizontal axes: lags 0 to 18.]
(f) In this case $O_t$ is not exogenous and the results summarized in (e) are not reliable.
2. (a) Mean of $\pi^{CPI}$ = 4.10. Mean of $\pi^{PCED}$ = 3.67.
(b) Mean of Y = 0.44. The mean of Y is the difference in the means because $Y = \pi^{CPI} - \pi^{PCED}$.
(c) $Y = \pi^{CPI} - \pi^{PCED}$, so $E(Y) = E(\pi^{CPI}) - E(\pi^{PCED})$.
(d) $Y_t = \beta_0 + u_t$, so $E(Y_t) = \beta_0 + E(u_t) = \beta_0$ because $E(u_t) = 0$.
(e)
0
u = 1.13 with a standard error of 0.05 (using a lag truncation parameter of m = 6 for
the Newey-West HAC estimator). This value is slightly greater than 1.0, the value imposed above.
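5. (a) The estimated model is
$\widehat{\Delta Y}_t = \underset{(0.001)}{0.006} + \underset{(0.072)}{0.319}\,\Delta Y_{t-1}$,
$\sigma_t^2 = \underset{(0.000002)}{0.000001} + \underset{(0.083)}{0.141}\,u_{t-1}^2 + \underset{(0.080)}{0.848}\,\sigma_{t-1}^2$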
(Note: your estimates may differ slightly from those presented above depending on the software that
you used to estimate the model.)
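The variance equation above can be turned into conditional standard deviations by simple recursion. The sketch below uses the reported coefficients (0.000001, 0.141, 0.848); starting the recursion at the unconditional variance is an assumption of this sketch, and the residuals passed in are hypothetical.

```python
import math

# Coefficients of the estimated GARCH(1,1) variance equation.
OMEGA, ALPHA, BETA = 0.000001, 0.141, 0.848


def unconditional_variance(omega=OMEGA, alpha=ALPHA, beta=BETA):
    """Long-run variance implied by a stationary GARCH(1,1)."""
    return omega / (1.0 - alpha - beta)


def garch_variances(residuals, omega=OMEGA, alpha=ALPHA, beta=BETA):
    """Conditional variances, one per residual.

    The first variance is set to the unconditional variance (an assumption);
    each later one uses the previous residual and previous variance.
    """
    variances = [unconditional_variance(omega, alpha, beta)]
    for u in residuals[:-1]:
        variances.append(omega + alpha * u ** 2 + beta * variances[-1])
    return variances


print(round(ALPHA + BETA, 3))                          # persistence
print(round(math.sqrt(unconditional_variance()), 4))   # long-run std. dev.

# Hypothetical residuals, to show how a large shock widens the band.
for var in garch_variances([0.002, -0.025, 0.004, 0.001]):
    print(round(math.sqrt(var), 4))
```

The implied long-run standard deviation, about 0.0095, is close to the sample standard deviation of quarterly GDP growth (0.0092) reported in Chapter 14, Exercise 1.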
Solutions to Empirical Exercises in Chapter 16 163
(b)
[Figure: GARCH(1,1) Bands For GDP Growth, 1955–2003]
(c) The GARCH standard deviation bands narrow considerably in the early 1980s, providing
evidence of a decrease in volatility.