Autocorrelation: when the error terms ut and ut-1, ut-2, …, ut-p are correlated, we call this serial correlation or autocorrelation.
A process described by the equation
ut = ρ1ut-1 + ρ2ut-2 + … + ρput-p + εt
is called the pth-order autoregressive
process, AR(p).
If p = 1, we have the 1st-order
autoregressive process AR(1): ut = ρ1ut-1 + εt
or, simply, ut = ρut-1 + εt (here εt is
known as a white noise series)
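As an illustration of the AR(1) definition, the sketch below simulates ut = ρut-1 + εt and measures the sample first-order autocorrelation. The function names `simulate_ar1` and `lag1_autocorr` are hypothetical, and drawing εt from a standard normal distribution is an assumption made only for this example.

```python
import numpy as np

def simulate_ar1(rho, n, seed=0):
    """Simulate u_t = rho * u_{t-1} + eps_t, with eps_t assumed standard normal."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)      # white noise series
    u = np.zeros(n)
    u[0] = eps[0]                     # start the recursion from the first shock
    for t in range(1, n):
        u[t] = rho * u[t - 1] + eps[t]
    return u

def lag1_autocorr(u):
    """Sample correlation between u_t and u_{t-1}."""
    d = np.asarray(u, dtype=float) - np.mean(u)
    return float(np.sum(d[1:] * d[:-1]) / np.sum(d * d))
```

For a long series, `lag1_autocorr(simulate_ar1(0.8, 5000))` should come out close to 0.8, while ρ = 0 should give a value near zero.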
• Example 1: consider the
consumption of electricity during
different hours of the day.
Because the temperature patterns
are similar between successive time
periods, we can expect consumption
patterns to be correlated between
neighboring periods.
• Example 2: The price of a
particular security or stock market
index at the close of successive days
or during successive hours is likely
to be serially correlated
[Figure: residual plot. Vertical axis: Residual (ut); horizontal axis: 0 to 50]
Positive autocorrelation: when
there is a clear tendency for
successive residuals to cluster on
one side of the zero line or the other
(as in the previous plot)
Negative autocorrelation: when
consecutive residuals change sign
frequently, i.e., jump frequently
above and below the zero line
Consequences of ignoring serial correlation
The OLS estimates and forecasts
will still be unbiased and consistent
However, the OLS estimates are no
longer BLUE and will be inefficient
Forecasts will also be inefficient
The estimated variances of the
regression coefficients will be
biased and inconsistent, and tests
of hypothesis are invalid.
If the serial correlation is positive
and the independent variable Xt is
growing over time, then:
The R² will be overestimated,
indicating a better fit than exists
The t-statistics will be higher; the
parameters will appear more
significant than they actually are
Testing for first-order serial correlation
1. The Durbin-Watson test statistic d:
d = Σ_{t=2}^{n} (ût – ût-1)² / Σ_{t=1}^{n} ût²
Drawbacks of the D-W test
Cannot be used when there are
lagged dependent variables
(e.g., Yt-1) among the independent
variables (distributed lag models)
There are instances when the DW
test is inconclusive
Cannot be used when the autocorrelation is
of higher order than one: quarterly data are
likely to display autocorrelation of order 4,
monthly data – of order 12. E.g., the
statistic d for quarterly data (order 4) has to
be modified as follows:
d4 = Σ_{t=5}^{n} (ût – ût-4)² / Σ_{t=1}^{n} ût²
In order to apply this d4 test, one has to use
the tables of d4 values
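The statistics d and d4 differ only in the lag used in the numerator, so both can be computed by one small function. This is a sketch, not a library routine: the name `durbin_watson` and its `lag` parameter are hypothetical, and the standard definition of d (lag 1, numerator summed from t = 2) is assumed.

```python
import numpy as np

def durbin_watson(resid, lag=1):
    """Sum of squared lag-differences of the residuals over their sum of squares.

    lag=1 gives the usual d; lag=4 gives the quarterly variant d4.
    """
    resid = np.asarray(resid, dtype=float)
    diff = resid[lag:] - resid[:-lag]     # u_t - u_{t-lag}, t = lag+1, ..., n
    return float(np.sum(diff ** 2) / np.sum(resid ** 2))
```

Residuals that alternate in sign, such as `[1, -1, 1, -1]`, give d = 3.0 (toward negative autocorrelation), while constant residuals give d = 0. The resulting value still has to be compared with the tabulated bounds.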
2. The Breusch-Godfrey LM test
(for n > 30) Consider the model:
Yt = β0+β1X1t+β2X2t+…+ βkXkt + ut
Step 1. Estimate this model by OLS.
Get the residuals ût.
Step 2. Regress ût against all the
independent variables in the model
(with a constant) plus
ût-1, ût-2, …, ût-p, where p is the
order of autocorrelation.
Get R² from this auxiliary regression.
Step 3. Compute χ² = (n-p)R²
If this value exceeds the critical
value χ²(p) of a chi-square
distribution with d.f. = p and (for
example) α = 5%, then reject the
null hypothesis H0 of no
autocorrelation of any order ≤ p.
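The three steps above can be sketched with NumPy alone. The names `ols` and `breusch_godfrey` are hypothetical, the design matrix `X` is assumed to already contain a column of ones, and the first p observations are dropped in the auxiliary regression (one common convention, assumed here). The function returns only the statistic (n-p)R²; the comparison with the χ²(p) critical value is left to the reader.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients and residuals for y = X b + e."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

def breusch_godfrey(y, X, p):
    """LM statistic (n - p) * R^2; X must include the constant column."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n = len(y)
    _, u = ols(X, y)                                   # Step 1: residuals
    # Step 2: regress u_t on X_t and u_{t-1}, ..., u_{t-p} (t = p+1, ..., n)
    lags = np.column_stack([u[p - j:n - j] for j in range(1, p + 1)])
    Z = np.column_stack([X[p:], lags])
    u_dep = u[p:]
    _, e = ols(Z, u_dep)
    r2 = 1.0 - np.sum(e ** 2) / np.sum((u_dep - u_dep.mean()) ** 2)
    return float((n - p) * r2)                         # Step 3
```

For p = 1 and α = 5% the critical value of χ²(1) is 3.841, so a returned statistic above that value rejects H0 of no first-order autocorrelation.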
Treatment of serial correlation
1. The Durbin iterative procedure
First, estimate ρ (the coefficient of Yt-1) from the equation:
Yt=β0(1-ρ)+ρYt-1+ β1Xt – β1ρXt-1 + vt
then reestimate the regression on
the transformed variables:
Yt - ρYt-1 = β0(1-ρ) + β1(Xt–ρXt-1) + vt
To avoid losing the 1st observation
in the differencing process,
Y1√(1-ρ²) and X1√(1-ρ²)
are used for the first transformed
Y and X.
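A minimal sketch of this procedure for a single regressor follows. The function name `durbin_two_step` is hypothetical. One detail is an assumption of this sketch rather than something stated above: the constant column is also rescaled (√(1-ρ²) for the first observation, 1-ρ for the rest), so that the fitted coefficient is β0 itself rather than β0(1-ρ).

```python
import numpy as np

def durbin_two_step(y, x):
    """Estimate rho, beta0, beta1 for Y_t = beta0 + beta1 X_t + u_t with AR(1) errors."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(y)
    # First regression: Y_t on a constant, Y_{t-1}, X_t and X_{t-1};
    # the coefficient of Y_{t-1} is the estimate of rho.
    Z = np.column_stack([np.ones(n - 1), y[:-1], x[1:], x[:-1]])
    coef, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)
    rho = float(coef[1])
    if abs(rho) >= 1:
        raise ValueError("estimated rho outside (-1, 1); transformation undefined")
    # Second regression on transformed variables, keeping the 1st observation.
    w = np.sqrt(1.0 - rho ** 2)
    y_star = np.concatenate([[w * y[0]], y[1:] - rho * y[:-1]])
    x_star = np.concatenate([[w * x[0]], x[1:] - rho * x[:-1]])
    c_star = np.concatenate([[w], np.full(n - 1, 1.0 - rho)])  # rescaled constant
    A = np.column_stack([c_star, x_star])
    beta, *_ = np.linalg.lstsq(A, y_star, rcond=None)
    return rho, float(beta[0]), float(beta[1])
```

Because the constant is rescaled together with Y and X, no division by (1-ρ) is needed afterwards to recover β0.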
Note: the above steps assume that
the initial model is of the form
Yt = β0 + β1Xt + ut, i.e., there is only
one independent variable.
When there is more than one
independent variable
X1t, X2t, …, Xkt, each of them must
be accompanied by its lagged
partner X1t-1, X2t-1, …, Xkt-1, which
drastically increases the total
number of variables in the model.
Note: when ρ ≈ 1, we have the
regression in difference form
(without the constant term)
Yt - Yt-1 = β1(Xt – Xt-1)+ vt
(Example: Course Notes, p. 67)
2. The Cochrane-Orcutt iterative procedure
If there are many explanatory variables in the
regression equation, Durbin’s method
involves a regression on too many variables
(2 × the number of explanatory variables, plus Yt-1).
In this case, we use the Cochrane-Orcutt
method.
Step 1. Estimate the model by OLS;
Get ût
Step 2. Estimate ρ in one of three ways:
(a) from the formula ρ = Σûtût-1/Σût²
(b) by OLS on the model (without a
constant) ût = ρût-1 + wt
(c) from the Durbin-Watson statistic, using
d ≈ 2(1-ρ), i.e., ρ ≈ 1 - d/2
Step 3. Transform the model’s
variables as follows: Yt* = Yt – ρYt-1;
X1t* = X1t – ρX1t-1, X2t* = X2t – ρX2t-1 ,..,
Xkt* = Xkt – ρXkt-1
Step 4. Regress Yt* against a constant,
X1t*, X2t*, …, Xkt*, and get OLS
estimates b1*, b2*, …, bk*, and new
residuals ût*
Step 5. Repeat Steps 2-4. Continue
until ρ calculated in Step 2 changes
between two consecutive
iterations by no more than some
preselected value, for example 0.01.
Step 6. The final ρ is used to get the
Cochrane-Orcutt estimates of the
parameters in the original model.
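The six steps can be sketched as a short loop. The function name `cochrane_orcutt` and the parameters `tol` and `max_iter` are hypothetical. Two choices are assumptions of this sketch: ρ is estimated with formula (a), and at each pass the residuals are recomputed from the original (untransformed) model using the latest coefficient estimates, which is one common reading of Step 5.

```python
import numpy as np

def cochrane_orcutt(y, X, tol=0.01, max_iter=50):
    """X holds the regressors WITHOUT a constant. Returns (rho, beta, iterations),
    where beta = [b0, b1, ..., bk] refers to the original model."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    n = len(y)
    Xc = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)             # Step 1
    rho = 0.0
    for it in range(1, max_iter + 1):
        u = y - Xc @ beta                                     # residuals, original model
        rho_new = float(np.sum(u[1:] * u[:-1]) / np.sum(u ** 2))  # Step 2(a)
        y_s = y[1:] - rho_new * y[:-1]                        # Step 3
        X_s = X[1:] - rho_new * X[:-1]
        A = np.column_stack([np.ones(n - 1), X_s])
        g, *_ = np.linalg.lstsq(A, y_s, rcond=None)           # Step 4
        # The transformed constant estimates b0 * (1 - rho); undo that scaling.
        beta = np.concatenate([[g[0] / (1.0 - rho_new)], g[1:]])
        converged = abs(rho_new - rho) <= tol                 # Step 5
        rho = rho_new
        if converged:
            break
    return rho, beta, it                                      # Step 6
```

The division by (1 - ρ) breaks down when the estimate of ρ is close to 1, in which case the difference-form regression is the more natural choice.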
Drawbacks of the Cochrane-
Orcutt procedure:
1. It may take many iterations until ρ converges
2. It does not guarantee that the RSS
obtained in the subsequent OLS
regressions will attain its global
minimum (it may attain only a
local one, depending on the initial
value of ρ)
A better way to ensure the RSS is
minimized would be to do a grid
search: different values of ρ (covering
the range from -0.9 to 0.9 in steps
of 0.1) are tried in obtaining
the Yt* and Xjt* variables, and the
estimates corresponding to the
lowest RSS found are selected.
One of the most popular grid
search techniques is the Hildreth-Lu
search procedure (not covered)
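A plain grid search over ρ, as described above, might look like the following sketch. It is not the Hildreth-Lu procedure itself, only the basic idea of trying each ρ and keeping the fit with the lowest RSS. The function name `grid_search_rho` is hypothetical, and dropping the first observation in the transformation is an assumption of this sketch.

```python
import numpy as np

def grid_search_rho(y, X, grid=None):
    """Try each rho on the grid; return (rho, beta, rss) for the lowest RSS.
    X holds the regressors WITHOUT a constant; beta = [b0, b1, ..., bk]."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    if grid is None:
        grid = np.round(np.arange(-0.9, 0.91, 0.1), 1)   # -0.9, -0.8, ..., 0.9
    best = None
    for rho in grid:
        y_s = y[1:] - rho * y[:-1]                       # Yt*
        X_s = X[1:] - rho * X[:-1]                       # Xjt*
        A = np.column_stack([np.ones(len(y_s)), X_s])
        g, *_ = np.linalg.lstsq(A, y_s, rcond=None)
        rss = float(np.sum((y_s - A @ g) ** 2))
        if best is None or rss < best[0]:
            best = (rss, float(rho), g)
    rss, rho, g = best
    # Convert the transformed constant back to the original intercept.
    beta = np.concatenate([[g[0] / (1.0 - rho)], g[1:]])
    return rho, beta, rss
```

A finer grid around the best value can be passed through `grid` for a second pass if more precision in ρ is wanted.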
3. Changing the functional form
Add the quadratic term, e.g. X², to the
model: Y = β0 + β1X + β2X² + ε
(see Course Notes, p. 76)
Use a double-log model (Ch. 8):
logY = β0 + β1logX + ε
Add dummy variables describing a
change in the structure of the data (Ch.