Assignment4 Exported
For this assignment, you are allowed to use R packages. Make sure you know how to use them to answer the
questions. I expect you to use robust tests when they are needed. This is something you should always keep in
mind.
Read the section “Cross-Section Application” of Wooldridge (2001) that you will find on Learn. The application
is the same one that you used in Assignment 3. Using the same subset that you used in Assignment 3,
reproduce Table 1 of the paper. Of course, you will not get the same results exactly because you do not use
the same sample. But, compare your results with Wooldridge’s. Do you reach the same conclusion? Explain.
For this part, use the same subset of Card’s data that you used for the previous question and in Assignment
3.
We want to estimate the return to education using the following model:
Question 1
Estimate the model by OLS and interpret the coefficients of ed76 and kww.
Question 2
David Card explores the possibility of kww being endogenous. To solve the problem, he tries to use iq as an
instrument for kww. Can you explain why kww may be endogenous and why iq may be a good instrument for
kww? You can cite Card’s arguments if you want.
Question 3
For each model, interpret the coefficients of ed76 and kww. Can you explain the difference between these
estimates and the OLS estimates? In theory, assuming that all instruments are valid, which model should
produce the most efficient estimator of the return to education? Explain. Do you observe this in your results?
Question 4
For this question, we assume that kww is exogenous. We want to consider the instruments Z1 = nearc4,
Z2 = nearc2, Z3 = nearc4 × daded and Z4 = nearc2 × daded. Perform the following tests and interpret the
results. If you cannot perform the test, explain why. All estimations must be performed by efficient GMM.
a. Test if ed76 is exogenous using Z1 only as instrument.
b. Test if ed76 is exogenous using Z2 only as instrument.
c. Test if Z1 is a valid instrument.
d. Test if the set of instruments {Z1 , Z2 , Z3 , Z4 } is valid.
e. Assuming that the set of instruments {Z1 , Z3 } is valid, test if the additional set of instruments {Z2 , Z4 }
is also valid.
f. Test the relevance of the four instruments.
g. Using the four instruments, test if the effect of experience on wage is the same for black and non-black
workers, using the Wald and LR statistics. You may have to modify your model to test this hypothesis.
For this part and the next, you will need one of the macro datasets in the Data folder. Each dataset contains
quarterly macroeconomic time series for a specific country. You first need to find your assigned dataset by
running the following code, with the seed replaced by your student ID:
set.seed(112233)
country <- c("Austria","Belgium","Canada","Denmark","Finland","France",
"Germany","Ireland","Italy","Japan","Netherlands","Spain",
"Sweden","Switzerland","UK","US")
mycountry <- sample(country,1)
mycountry
## [1] "Japan"
Using this seed, I was assigned to Japan. Therefore, my dataset is JapanData61_02.csv.
dat <- read.csv("Data/JapanData61_02.csv", header=TRUE)
This dataset is not set as a time series. It is a normal data.frame. Sometimes it is better to keep it in that
format, but sometimes it is better to convert it into a time series (all explained in the online tutorial). To
create a time series, look at the first element of the variable Date. For Japan, it is 1961.0, which means the first
quarter of 1961. The second is 1961.25, the third is 1961.5 and the fourth is 1961.75. You can create a new
dataset in time series format as follows:
datTS <- ts(dat, start=c(1961,1), frequency=4)
For this Part, you need to use the time series version of the dataset.
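As a quick check of how the start and frequency arguments map to the Date column, here is a minimal sketch on a made-up data frame. The object names toy and toyTS and all values are illustrative assumptions, not part of the assignment data.

```r
# Toy quarterly data frame standing in for the assignment dataset (values are made up)
toy <- data.frame(Date = seq(1961, by = 0.25, length.out = 8),
                  inf  = c(1.2, 0.8, 1.5, 1.1, 0.9, 1.3, 1.0, 1.4))

# Same call pattern as in the text: first quarter of 1961, four observations per year
toyTS <- ts(toy, start = c(1961, 1), frequency = 4)

# The time index implied by start/frequency should match the Date column
time(toyTS)[1:4]
all.equal(as.numeric(time(toyTS)), toy$Date)

# window() extracts a sub-period, here 1961Q3 to 1962Q2
sub <- window(toyTS[, "inf"], start = c(1961, 3), end = c(1962, 2))
sub
```

If the comparison with the Date column fails on your own dataset, the start date passed to ts is probably wrong.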
The Variables
Variables with no number attached to them are time t observations. Variables followed by 1 are
time (t + 1) observations and the ones followed by _i are time (t − i) observations. The datasets
contain different measures of GDP trend gap (gaps are log differences), inflation, and one or more
series of interest rates. The variable trend is the observed GDP trend gap, HP and BP are two
different filtered trend gaps and inf represents inflation rate. The other variables are interest
rates. The names differ across countries. For example, the rates for Canada are labelled bank (for
bank rate) and TB (for treasury bill rate); for the United States they are FFR and TB, and for the UK we
only have TB. If more than one rate is available, use the one you prefer (I do not care which one).
For example, if you want the correlation between πt and πt−1 you could do the following:
cor(datTS[,"inf"], datTS[,"inf_1"])
## [1] 0.5006251
Of course, you could also do
acf(datTS[,"inf"], plot=FALSE, lag.max = 1)
##
## Autocorrelations of series 'datTS[, "inf"]', by lag
##
## 0.00 0.25
## 1.000 0.501
Question 1
Plot the series of inflation, trend and interest rate. Interpret what you see. You could answer the following
questions in a text and more if you have things to add (I kind of expect more):
- Do you detect positive or negative autocorrelation? Explain.
- Do you think the series are stationary and ergodic? Explain.
- Do you detect positive or negative comovement between the series?
Question 2
Plot the ACF function of the three series and discuss the level of persistence of the series.
Question 3
Estimate the following model by OLS:
$$R_t = \beta_0 + \beta_1 Y_t + \beta_2 \pi_t + e_t,$$
where $R_t$ is the interest rate, $Y_t$ is the output trend gap and $\pi_t$ is the inflation rate. This is a simplified (and
not quite valid) Taylor Rule equation. It represents the reaction of the Central Bank to deviations from the
potential GDP and inflation target.
• Interpret the coefficients.
• Plot the residuals as function of time. Do you detect autocorrelation?
• Test the null hypothesis that the errors are not autocorrelated. Use the Durbin Watson and Breusch-
Godfrey tests.
• If you reject the hypothesis, print the coefficient matrix using robust standard errors.
• The variables are deviations from their trends. Why are we using deviations from trends instead of the
actual series? What would be the consequence of using the actual series on the properties of the OLS
estimators?
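To make the mechanics of the two autocorrelation tests concrete, here is a minimal base-R sketch on simulated data (not the assignment data). It computes the Durbin-Watson statistic and a first-order Breusch-Godfrey LM statistic by hand. All object names are illustrative, and setting the missing first lag of the residual to zero is an assumption of this sketch. Packaged versions exist (for example dwtest and bgtest in the lmtest package), and those are what you would normally use.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
# AR(1) errors with coefficient 0.7 (simulated for illustration only)
e <- as.numeric(arima.sim(list(ar = 0.7), n = n))
y <- 1 + 0.5 * x + e

fit <- lm(y ~ x)
u <- resid(fit)

# Durbin-Watson statistic: close to 2 under no autocorrelation,
# well below 2 under positive autocorrelation
dw <- sum(diff(u)^2) / sum(u^2)

# Breusch-Godfrey LM test of order 1: regress the residuals on the regressors
# and the lagged residual (missing first lag set to 0, an assumption here);
# LM = n * R^2 is asymptotically chi-square with 1 degree of freedom under H0
u_lag <- c(0, u[-n])
aux <- lm(u ~ x + u_lag)
bg <- n * summary(aux)$r.squared
pval <- pchisq(bg, df = 1, lower.tail = FALSE)

c(DW = dw, BG = bg, p.value = pval)
```

The simulated errors are strongly autocorrelated by construction, so both statistics should point toward rejecting the null here.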
Question 4
Estimate the model by GLS using the function gls from the nlme package, with the assumption that the
error follows an AR(1) process. Do you see an efficiency gain over OLS? Interpret the coefficients. Are you
worried when you look at the estimated autocorrelation coefficient?
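The call pattern for gls with AR(1) errors can be sketched as follows, again on simulated data rather than the assignment data. The object names sim, ols and fgls are illustrative assumptions. Note that the OLS standard errors shown in the comparison are the non-robust ones, so the comparison is only indicative.

```r
library(nlme)

set.seed(2)
n <- 200
x <- rnorm(n)
# AR(1) errors with coefficient 0.6 (simulated for illustration only)
e <- as.numeric(arima.sim(list(ar = 0.6), n = n))
sim <- data.frame(y = 1 + 0.5 * x + e, x = x)

ols <- lm(y ~ x, data = sim)
# corAR1(form = ~ 1) uses the row order of the data as the time index
fgls <- gls(y ~ x, data = sim, correlation = corAR1(form = ~ 1))

# Compare the standard errors from the two fits
cbind(OLS = coef(summary(ols))[, "Std. Error"],
      GLS = summary(fgls)$tTable[, "Std.Error"])

# Estimated AR(1) coefficient of the errors
phi <- coef(fgls$modelStruct$corStruct, unconstrained = FALSE)
phi
```

With real data, an estimated AR(1) coefficient very close to one is a signal worth discussing in your answer.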
The model is
$$R_t = \alpha + \rho_1 R_{t-1} + \rho_2 R_{t-2} + (1 - \rho_1 - \rho_2)\left[\phi_y Y_t + \phi_\pi \pi^e_{t+1}\right] + e_t,$$
where $\pi^e_{t+1}$ is the expected inflation rate. The interest rate lags are included to model smooth reactions
of the central bank to output or inflation shocks. The model is non-linear, because we interpret the
coefficients $\phi_y$ and $\phi_\pi$ as the effect of output and inflation shocks at the steady state (or the long run effect),
$R_t = R_{t-1} = R_{t-2} = R$.
We assume that inflation can be forecasted perfectly by the Central Bank, so we set $\pi^e_{t+1}$ to $\pi_{t+1}$. We consider
lags of interest rate as being predetermined, so we do not need to instrument them. We therefore use the
following set of instruments: $Z = \{R_{t-1}, R_{t-2}, Y_{t-1}, Y_{t-2}, \pi_t, \pi_{t-1}\}$.
We do not want to estimate a nonlinear model, so we estimate the following unrestricted model:
$$R_t = \alpha + \rho_1 R_{t-1} + \rho_2 R_{t-2} + \beta_y Y_t + \beta_\pi \pi_{t+1} + e_t.$$
We can always recover the other coefficients using $\phi_y = \beta_y/(1 - \rho_1 - \rho_2)$ and $\phi_\pi = \beta_\pi/(1 - \rho_1 - \rho_2)$. The
coefficients $\beta_y$ and $\beta_\pi$ are supposed to be smaller in absolute value because they represent the immediate
reactions of the Central Bank.
Question 1
Estimate the model by efficient GMM (using the HAC weighting matrix). Interpret your results. Test the
over-identifying restrictions, the strength of the instruments, and the exogeneity of πt+1 and Yt .
Question 2
Test if you can add the instruments Z2 = {Yt−3 , Yt−4 , πt−2 , πt−3 }.
Question 3
Construct a 95% confidence interval for φ̂y = β̂y /(1 − ρ̂1 − ρ̂2 ) and φ̂π = β̂π /(1 − ρ̂1 − ρ̂2 ). Use the Delta
method.
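The Delta method step can be sketched in base R as follows. The estimates in theta and the covariance matrix V are made-up numbers for illustration only, and the helper delta_ci is a hypothetical name. In practice you would plug in the coefficient vector and covariance matrix from your GMM fit. The gradient of β/(1 − ρ1 − ρ2) is 1/(1 − ρ1 − ρ2) with respect to β and β/(1 − ρ1 − ρ2)² with respect to each of ρ1 and ρ2.

```r
# Hypothetical estimates and covariance matrix (made-up numbers, illustration only)
theta <- c(rho1 = 0.9, rho2 = -0.2, betay = 0.06, betapi = 0.3)
V <- diag(c(0.01, 0.01, 0.0004, 0.0025))
dimnames(V) <- list(names(theta), names(theta))

# Delta-method confidence interval for theta[num] / (1 - rho1 - rho2)
delta_ci <- function(theta, V, num, level = 0.95) {
  d <- as.numeric(1 - theta["rho1"] - theta["rho2"])
  b <- as.numeric(theta[num])
  phi <- b / d
  # Gradient of phi with respect to every element of theta
  grad <- setNames(numeric(length(theta)), names(theta))
  grad[c("rho1", "rho2")] <- b / d^2
  grad[num] <- 1 / d
  se <- sqrt(drop(t(grad) %*% V %*% grad))
  z <- qnorm(1 - (1 - level) / 2)
  c(estimate = phi, lower = phi - z * se, upper = phi + z * se)
}

ci_y <- delta_ci(theta, V, "betay")
ci_pi <- delta_ci(theta, V, "betapi")
ci_y
ci_pi
```

The interval relies on asymptotic normality, so it can be poor when 1 − ρ1 − ρ2 is estimated close to zero.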
Question 4
Estimate the nonlinear model directly by efficient GMM, using the same instruments as in Question 1. The
momentfit package allows the regression to be nonlinear. For example, we can define the model as:
# The long-run term is multiplied by (1-rho1-rho2), as in the model stated above
m <- momentModel(SR~alpha+rho1*SR_1+rho2*SR_2+(1-rho1-rho2)*(phiy*HP+phipi*inf1),
~SR_1+SR_2+HP_1+HP_2+inf+inf_1, data=dat,
vcov="HAC",
theta0=c(alpha=mean(dat$SR), rho1=0, rho2=0, phiy=0, phipi=0))
Interpret your results. Also, compare the confidence intervals of φ̂y and φ̂π with the ones obtained in
Question 3.