

AUTOCORRELATION

When the error terms u_t and u_{t-1}, u_{t-2}, …, u_{t-p} are correlated, we call this
pth-order serial correlation, or autocorrelation.

• A process described by the equation
u_t = ρ_1 u_{t-1} + ρ_2 u_{t-2} + … + ρ_p u_{t-p} + ε_t
is called a pth-order autoregressive process, AR(p).
• If p = 1, we have the first-order autoregressive process AR(1):
u_t = ρ_1 u_{t-1} + ε_t, or simply u_t = ρ u_{t-1} + ε_t,
where ε_t is a white noise series.

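A minimal sketch of an AR(1) error process, assuming standard normal white noise ε_t and a starting value of zero (neither is specified in the slides). The function names `simulate_ar1` and `lag1_autocorr` are illustrative only.

```python
import random

def simulate_ar1(rho, n, seed=0):
    """Generate u_t = rho * u_{t-1} + eps_t, with eps_t ~ N(0, 1) (assumed)."""
    rng = random.Random(seed)
    u = []
    prev = 0.0  # assumed starting value u_0 = 0
    for _ in range(n):
        prev = rho * prev + rng.gauss(0.0, 1.0)
        u.append(prev)
    return u

def lag1_autocorr(u):
    """Sample first-order autocorrelation: sum(u_t * u_{t-1}) / sum(u_t^2)."""
    num = sum(u[t] * u[t - 1] for t in range(1, len(u)))
    den = sum(x * x for x in u)
    return num / den

u = simulate_ar1(rho=0.8, n=5000)
print(round(lag1_autocorr(u), 2))  # should land close to the chosen rho
```

With a long enough series, the sample autocorrelation should be near the ρ used in the simulation, while ρ = 0 reduces the process to white noise.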
• Example 1: Consider the consumption of electricity during
different hours of the day.
• Because the temperature patterns are similar between successive
time periods, we can expect consumption patterns to be correlated
between neighboring periods.

• Example 2: The price of a
particular security or stock market
index at the close of successive days
or during successive hours is likely
to be serially correlated.

[Figure: Residual plot. Vertical axis: Residual (u_t), from -3 to 4; horizontal axis: Time, from 0 to 50.]

Positive autocorrelation: successive residuals show a clear
tendency to cluster on one side of the zero line or the other
(as in the previous plot).
Negative autocorrelation: consecutive residuals change sign
frequently, i.e., they jump frequently above and below the zero line.

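One rough way to see the difference between the two patterns is to count how often consecutive residuals change sign. The sketch below uses two made-up residual series (not data from the slides) and a hypothetical helper `count_sign_changes`.

```python
def count_sign_changes(residuals):
    """Number of times two consecutive residuals have opposite signs."""
    return sum(1 for a, b in zip(residuals, residuals[1:]) if a * b < 0)

# Made-up illustrative series of 10 residuals each
clustered = [1.2, 0.8, 0.5, 0.9, -0.4, -1.1, -0.7, -0.2, 0.6, 1.0]
alternating = [1.2, -0.8, 0.5, -0.9, 0.4, -1.1, 0.7, -0.2, 0.6, -1.0]

print(count_sign_changes(clustered))    # 2
print(count_sign_changes(alternating))  # 9
```

Few sign changes relative to the number of observations suggest positive autocorrelation, and many suggest negative autocorrelation. This is only an informal check, not a substitute for a formal test.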
Consequences of ignoring serial correlation
• The OLS estimates and forecasts will still be unbiased and consistent.
• However, the OLS estimates are no longer BLUE and will be inefficient.
• Forecasts will also be inefficient.

• The estimated variances of the regression coefficients will be
biased and inconsistent, so hypothesis tests are invalid.

• If the serial correlation is positive and the independent
variable X_t is growing over time, then:
  • R² will be overestimated, indicating a better fit than actually exists.
  • The t-statistics will be higher; the parameters will appear
    more significant than they actually are.

Testing for first-order serial correlation
1. The Durbin-Watson test statistic d

d = Σ_{t=2}^{n} (û_t − û_{t-1})² / Σ_{t=1}^{n} û_t²

• It can also be shown (for large samples, n > 30) that
d ≈ 2(1 − ρ), where ρ is the first-order autocorrelation
coefficient of the AR(1) error process u_t = ρ u_{t-1} + ε_t.

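The statistic can be computed directly from a list of residuals. The sketch below uses made-up residuals and illustrative function names; it also checks how close d is to 2(1 − ρ) when ρ is estimated as Σ û_t û_{t-1} / Σ û_t².

```python
def durbin_watson(resid):
    """d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

def lag1_autocorr(resid):
    """Estimate of rho: sum(e_t * e_{t-1}) / sum(e_t^2)."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

# Made-up residuals that cluster on one side of zero
resid = [1.2, 0.8, 0.5, 0.9, -0.4, -1.1, -0.7, -0.2, 0.6, 1.0]
d = durbin_watson(resid)
r = lag1_autocorr(resid)
print(round(d, 3))  # 0.594
print(round(2 * (1 - r), 3))
```

The two printed values differ by (û_1² + û_n²) / Σ û_t², which is why the approximation is only recommended for large samples.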
• The null hypothesis tested is H0: ρ = 0 (i.e., there is no
first-order autocorrelation).

Range of d          Test result
0 to dL             Positive autocorrelation
dL to dU            Inconclusive
dU to 4−dU          No autocorrelation (the midpoint is d = 2)
4−dU to 4−dL        Inconclusive
4−dL to 4           Negative autocorrelation

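The decision regions can be written as a small function. This is a sketch only: `dw_decision` is a hypothetical helper, the treatment of values that fall exactly on a boundary is an assumption, and the bounds used in the example (1.5 and 1.6) are placeholders rather than values taken from a Durbin-Watson table.

```python
def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic given tabulated bounds dL and dU."""
    if not 0.0 <= d <= 4.0:
        raise ValueError("d must lie between 0 and 4")
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "no autocorrelation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"

# Placeholder bounds; real values depend on n, the number of regressors, and alpha
for d in (0.9, 1.55, 2.0, 2.45, 3.2):
    print(d, dw_decision(d, d_lower=1.5, d_upper=1.6))
```

In practice dL and dU are looked up for the given sample size, number of explanatory variables, and significance level.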
Drawbacks of the D-W test
• It cannot be used when there are lagged dependent variables
(e.g., Y_{t-1}) among the independent variables (distributed lag models).
• There are instances when the D-W test is inconclusive.

• It cannot be used when the autocorrelation is of higher order than one:
quarterly data are likely to display autocorrelation of order 4, and
monthly data of order 12. For example, the statistic d for quarterly
data (order 4) has to be modified as follows:

d4 = Σ_{t=5}^{n} (û_t − û_{t-4})² / Σ_{t=1}^{n} û_t²

• To apply this d4 test, one has to use the tables of d4 values.

2. The Breusch-Godfrey LM test (for n > 30)
Consider the model:
Y_t = β_0 + β_1 X_{1t} + β_2 X_{2t} + … + β_k X_{kt} + u_t
Step 1. Estimate this model by OLS and get the residuals û_t.

Step 2. Regress û_t against all the independent variables in the
model (with a constant) plus û_{t-1}, û_{t-2}, …, û_{t-p}, where p is
the order of autocorrelation, and get R² from this auxiliary regression.

Step 3. Compute χ² = (n − p)R².
• If this value exceeds the critical value χ²(p) of a chi-square
distribution with d.f. = p and (for example) α = 5%, then reject the
null hypothesis H0 of no autocorrelation of any order ≤ p.

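Steps 1 to 3 can be sketched in plain Python. Everything below is illustrative: the helper names, the simulated data, and the choice to drop the first p observations from the auxiliary regression (so that it uses n − p observations) are assumptions, not details given in the slides.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def ols(rows, y):
    """OLS of y on the columns of rows; returns (coefficients, residuals)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    return beta, resid

def breusch_godfrey(x_rows, y, p):
    """Return the LM statistic (n - p) * R^2 from the auxiliary regression."""
    n = len(y)
    rows = [[1.0] + list(r) for r in x_rows]           # add the constant
    _, u = ols(rows, y)                                 # Step 1: residuals
    aux_rows = [rows[t] + [u[t - j] for j in range(1, p + 1)]
                for t in range(p, n)]                   # Step 2: add lagged residuals
    aux_y = u[p:]
    _, e = ols(aux_rows, aux_y)
    mean = sum(aux_y) / len(aux_y)
    tss = sum((v - mean) ** 2 for v in aux_y)
    r2 = 1.0 - sum(v * v for v in e) / tss
    return (n - p) * r2                                 # Step 3

def simulate(n, rho, seed):
    """Made-up data: Y_t = 1 + 2 X_t + u_t with AR(1) errors."""
    rng = random.Random(seed)
    x_rows, y, u = [], [], 0.0
    for t in range(n):
        u = rho * u + rng.gauss(0.0, 1.0)
        xt = 0.1 * t + rng.gauss(0.0, 1.0)
        x_rows.append([xt])
        y.append(1.0 + 2.0 * xt + u)
    return x_rows, y

x_rows, y = simulate(200, rho=0.7, seed=1)
stat = breusch_godfrey(x_rows, y, p=1)
print(stat > 3.841)  # 3.841 is the 5% critical value of chi-square with 1 d.f.
```

For other values of p, the critical value has to be taken from a chi-square table with p degrees of freedom.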
Treatment of serial correlation
1. The Durbin iterative procedure
• Estimate ρ as the OLS coefficient on Y_{t-1} in the equation:
Y_t = β_0(1 − ρ) + ρ Y_{t-1} + β_1 X_t − β_1 ρ X_{t-1} + v_t
• Then re-estimate the regression on the transformed variables:
Y_t − ρ Y_{t-1} = β_0(1 − ρ) + β_1 (X_t − ρ X_{t-1}) + v_t

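Both equations in Durbin's procedure follow from quasi-differencing the original model. A short derivation, assuming the one-regressor model with AR(1) errors and writing v_t for the white noise term as in the slide:

```latex
\begin{align*}
Y_t &= \beta_0 + \beta_1 X_t + u_t, \qquad u_t = \rho u_{t-1} + v_t \\
\rho Y_{t-1} &= \rho\beta_0 + \rho\beta_1 X_{t-1} + \rho u_{t-1} \\
Y_t - \rho Y_{t-1} &= \beta_0(1-\rho) + \beta_1 (X_t - \rho X_{t-1})
  + \underbrace{u_t - \rho u_{t-1}}_{v_t}
\end{align*}
```

Moving ρ Y_{t-1} to the right-hand side and expanding the bracket gives the first estimating equation, in which ρ appears as the coefficient on Y_{t-1}.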
• To avoid losing the first observation in the differencing process,
Y_1 √(1 − ρ²) and X_1 √(1 − ρ²)
are used for the first transformed Y and X.
Note: the above steps assume that the initial model is of the form
Y_t = β_0 + β_1 X_t + u_t, i.e., there is only one independent variable.

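The transformation, including the special treatment of the first observation, might be sketched as follows. The function name `quasi_difference` and the numbers are made up for illustration.

```python
import math

def quasi_difference(series, rho, keep_first=True):
    """Return z_t - rho * z_{t-1} for t >= 2; optionally prepend
    z_1 * sqrt(1 - rho^2) so that the first observation is not lost."""
    out = [series[t] - rho * series[t - 1] for t in range(1, len(series))]
    if keep_first:
        out.insert(0, series[0] * math.sqrt(1.0 - rho ** 2))
    return out

y = [10.0, 12.0, 13.0, 15.0]          # made-up values
y_star = quasi_difference(y, rho=0.5)
print(len(y_star))    # 4
print(y_star[1:])     # [7.0, 7.0, 8.5]
```

The same function would be applied to X with the same ρ. With `keep_first=False` the transformed series has one observation fewer than the original.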
When there is more than one independent variable,
X_{1t}, X_{2t}, …, X_{kt}, each of them must be accompanied by its
lagged partner X_{1,t-1}, X_{2,t-1}, …, X_{k,t-1}, which drastically
increases the total number of variables in the model.

Note: when ρ ≈ 1, we have the regression in difference form
(without the constant term):
Y_t − Y_{t-1} = β_1 (X_t − X_{t-1}) + v_t
(Example: Course Notes, p. 67)

2. The Cochrane-Orcutt iterative procedure
• If there are many explanatory variables in the regression equation,
Durbin's method involves a regression on too many variables
(twice the number of explanatory variables, plus Y_{t-1}).
• In this case, we use the Cochrane-Orcutt method.

Step 1. Estimate the model by OLS and get the residuals û_t.
Step 2. Estimate ρ in one of three ways:
(a) from the formula ρ = Σ û_t û_{t-1} / Σ û_t²,
(b) by OLS on the model (without a constant) û_t = ρ û_{t-1} + w_t, or
(c) from the approximation d ≈ 2(1 − ρ), i.e., ρ ≈ 1 − d/2.

Step 3. Transform the model's variables as follows:
Y_t* = Y_t − ρ Y_{t-1}; X_{1t}* = X_{1t} − ρ X_{1,t-1},
X_{2t}* = X_{2t} − ρ X_{2,t-1}, …, X_{kt}* = X_{kt} − ρ X_{k,t-1}
Step 4. Regress Y_t* against a constant, X_{1t}*, X_{2t}*, …, X_{kt}*,
and get the OLS estimates b_1*, b_2*, …, b_k* and new residuals û_t*.

Step 5. Repeat Steps 2-4. Continue until the ρ calculated in Step 2
changes between two consecutive iterations by no more than some
preselected value, for example 0.01.
Step 6. The final ρ is used to get the Cochrane-Orcutt estimates of
the parameters in the original model.

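A minimal sketch of the iteration for a model with one explanatory variable, using formula (a) for ρ. It assumes that the residuals fed back into Step 2 are recomputed from the original (untransformed) model with the latest coefficient estimates, and that the intercept of the transformed regression estimates β_0(1 − ρ); the slides do not spell out either point. Function names and the simulated data are illustrative.

```python
import random

def simple_ols(x, y):
    """Return (intercept, slope) of the OLS line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def estimate_rho(u):
    """Formula (a): sum(u_t * u_{t-1}) / sum(u_t^2)."""
    return sum(u[t] * u[t - 1] for t in range(1, len(u))) / sum(e * e for e in u)

def cochrane_orcutt(x, y, tol=0.01, max_iter=50):
    b0, b1 = simple_ols(x, y)                           # Step 1
    rho = 0.0
    for _ in range(max_iter):
        # Residuals of the original model (assumed convention)
        u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        new_rho = estimate_rho(u)                       # Step 2
        ys = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]   # Step 3
        xs = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        a_star, b1 = simple_ols(xs, ys)                 # Step 4
        b0 = a_star / (1.0 - new_rho)                   # undo the (1 - rho) scaling
        converged = abs(new_rho - rho) <= tol           # Step 5
        rho = new_rho
        if converged:
            break
    return b0, b1, rho                                  # Step 6

def simulate(n, rho, seed):
    """Made-up data: Y_t = 2 + 0.5 X_t + u_t with AR(1) errors."""
    rng = random.Random(seed)
    x, y, u = [], [], 0.0
    for t in range(n):
        u = rho * u + rng.gauss(0.0, 1.0)
        xt = 0.05 * t + rng.gauss(0.0, 1.0)
        x.append(xt)
        y.append(2.0 + 0.5 * xt + u)
    return x, y

x, y = simulate(500, rho=0.6, seed=1)
b0, b1, rho_hat = cochrane_orcutt(x, y)
print(round(b1, 2), round(rho_hat, 2))
```

The first observation is dropped in the transformed regression here; the square-root scaling of the first observation could be added in the same way as in Durbin's procedure.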
Drawbacks of the Cochrane-Orcutt procedure:
1. It may take many iterations until ρ converges.
2. It does not guarantee that the RSS obtained in the subsequent OLS
regressions will attain its global minimum (it may attain only a
local one, depending on the initial value of ρ).

• A better way to ensure the RSS is minimized is a grid search:
different values of ρ (covering the range from −0.9 to 0.9 in steps
of 0.1) are tried in obtaining the Y_t* and X_{jt}* variables, and the
estimates corresponding to the lowest RSS found are selected.
• One of the most popular grid search techniques is the Hildreth-Lu
search procedure (not covered).

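The grid search described above can be sketched for a model with one explanatory variable. This is an illustration under stated assumptions rather than the Hildreth-Lu procedure as published: the first observation is simply dropped, the data are simulated, and the function names are made up.

```python
import random

def simple_ols(x, y):
    """Return (intercept, slope) of the OLS line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def grid_search_rho(x, y):
    """Try rho = -0.9, -0.8, ..., 0.9 and keep the value with the lowest RSS."""
    best = None
    for i in range(-9, 10):
        rho = i / 10
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        a_star, b1 = simple_ols(xs, ys)
        rss = sum((yi - a_star - b1 * xi) ** 2 for xi, yi in zip(xs, ys))
        if best is None or rss < best["rss"]:
            # The transformed intercept estimates b0 * (1 - rho)
            best = {"rho": rho, "rss": rss, "b0": a_star / (1.0 - rho), "b1": b1}
    return best

def simulate(n, rho, seed):
    """Made-up data: Y_t = 2 + 0.5 X_t + u_t with AR(1) errors."""
    rng = random.Random(seed)
    x, y, u = [], [], 0.0
    for t in range(n):
        u = rho * u + rng.gauss(0.0, 1.0)
        xt = 0.05 * t + rng.gauss(0.0, 1.0)
        x.append(xt)
        y.append(2.0 + 0.5 * xt + u)
    return x, y

x, y = simulate(1000, rho=0.6, seed=2)
best = grid_search_rho(x, y)
print(best["rho"], round(best["b1"], 2))
```

Because every candidate ρ is evaluated, the result does not depend on a starting value, at the cost of running one regression per grid point. A finer grid around the best value could be searched in a second pass.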
3. Changing the functional form
• Add a quadratic term, e.g. X², to the model:
Y = β_0 + β_1 X + β_2 X² + ε (see Course Notes, p. 76)
• Use a double-log model (Ch. 8): log Y = β_0 + β_1 log X + ε
• Add dummy variables describing a change in the structure of the
data (Ch. 9)

