

14.384 Time Series Analysis, Fall 2007


Professor Anna Mikusheva
Paul Schrimpf, scribe
October 25, 2007

Lecture 18
More Non-Stationarity

We have seen that there is a discrete difference between stationarity and non-stationarity. When we have
a non-stationary process, limiting distributions are quite different from those in the stationary case. For example,
let $\epsilon_t$ be a martingale difference sequence, with $E(\epsilon_t^2 | I_{t-1}) = 1$, $E\epsilon_t^4 < k < \infty$. Then
$$\xi_T(\tau) = \frac{1}{\sqrt{T}} \sum_{t=1}^{[\tau T]} \epsilon_t \Rightarrow W(\cdot).$$
Then there is a sort of discontinuity in the limiting distribution of an AR(1) at $\rho = 1$:
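| | Unit Root | Stationary |
|---|---|---|
| Process | $y_t = y_{t-1} + \epsilon_t$ | $x_t = \rho x_{t-1} + \epsilon_t$ |
| Limiting distribution of $\hat\rho$ | $T(\hat\rho - 1) \Rightarrow \frac{\int W dW}{\int W^2 dt}$ | $\sqrt{T}(\hat\rho - \rho) \Rightarrow N(0, 1 - \rho^2)$ |
| Limiting distribution of $t$ | $t \Rightarrow \frac{\int W dW}{\sqrt{\int W^2 dt}}$ | $t \Rightarrow N(0, 1)$ |
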
In finite samples, the distribution of the t-stat is continuous in ρ ∈ [0, 1]. However, the limit distribution is
discontinuous at ρ = 1. This must mean that the convergence is not uniform. In particular, the convergence
of the t-stat to a normal distribution is slower, the closer ρ is to 1. Thus, in small samples, when ρ is close
to 1, the normal distribution badly approximates the unknown finite sample distribution of the t-statistic.
A more precise statement is that we have pointwise convergence, i.e.

$$\sup_x |P(t(\rho, T) \le x) - \Phi(x)| \to 0 \quad \forall \rho < 1$$

but not uniform convergence, i.e.

$$\sup_{\rho \in (0,1)} \sup_x |P(t(\rho, T) \le x) - \Phi(x)| \not\to 0$$

where $\Phi(\cdot)$ is the normal cdf. This means that a confidence set based on the normal approximation to the
t-statistic will have poor coverage for values of $\rho$ very close to the unit root. Since we do not know the true
value of $\rho$, we are in danger of obtaining a misleading confidence set.
Just how bad is the normal approximation? If you construct a 95% confidence interval based on a normal
approximation, then without a constant the coverage is 90%, with a constant 70%, and with a linear trend
35%.
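
The coverage problem can be checked with a small Monte Carlo experiment. The sketch below is only illustrative: it assumes Gaussian errors, $x_0 = 0$, a regression with a constant, and $T = 100$; the function names `t_stat_with_constant` and `coverage`, the seed, and the number of replications are all choices made for this example and are not part of the lecture.

```python
import math
import random

def t_stat_with_constant(rho, T, rng):
    """Simulate x_t = rho * x_{t-1} + eps_t with x_0 = 0 (assumed), then
    return the OLS t-statistic for the true rho from a regression of
    x_t on a constant and x_{t-1}."""
    x = [0.0]
    for _ in range(T):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    lag, cur = x[:-1], x[1:]
    m_lag, m_cur = sum(lag) / T, sum(cur) / T
    sxx = sum((a - m_lag) ** 2 for a in lag)
    sxy = sum((a - m_lag) * (b - m_cur) for a, b in zip(lag, cur))
    syy = sum((b - m_cur) ** 2 for b in cur)
    rho_hat = sxy / sxx
    s2 = (syy - rho_hat * sxy) / (T - 2)   # residual variance estimate
    return (rho_hat - rho) / math.sqrt(s2 / sxx)

def coverage(rho, T=100, reps=2000, seed=0):
    """Share of replications with |t| <= 1.96, i.e. in which the
    normal-approximation 95% interval contains the true rho."""
    rng = random.Random(seed)
    hits = sum(abs(t_stat_with_constant(rho, T, rng)) <= 1.96
               for _ in range(reps))
    return hits / reps

if __name__ == "__main__":
    for rho in (0.5, 1.0):
        print(rho, coverage(rho))
```

Under these assumptions the simulated coverage should be close to the nominal level for a value such as $\rho = 0.5$ and clearly below it at $\rho = 1$; the exact numbers depend on $T$, the seed, and the number of replications.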

Local to Unity Asymptotics


Local to unity asymptotics is one way to try to construct a better approximation. Let:

$$x_t = \rho x_{t-1} + \epsilon_t, \quad t = 1, \dots, T$$
$$\rho = \exp(c/T) \approx 1 + c/T, \quad c < 0$$

This model is not meant to be a literal way of describing the world. It is just a device for building a better
approximating limiting distribution. It can be shown that:
$$\frac{x_{[\tau T]}}{\sqrt{T}} \Rightarrow J_c(\tau) \qquad (1)$$
where $J_c(\tau)$ is an Ornstein-Uhlenbeck process.

Cite as: Anna Mikusheva, course materials for 14.384 Time Series Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu),
Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Definition 1. Ornstein-Uhlenbeck process: $J_c(\tau) = \int_0^\tau e^{c(\tau - s)} dW(s)$, so $J_c(\tau) \sim N\left(0, \frac{e^{2\tau c} - 1}{2c}\right)$
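
The variance in Definition 1 follows from the Itô isometry. The short derivation below is a worked check rather than part of the original notes; it uses only the definition of the process.

```latex
\begin{align*}
\operatorname{Var}\big(J_c(\tau)\big)
  &= E\left[\left(\int_0^\tau e^{c(\tau-s)}\,dW(s)\right)^2\right]
   = \int_0^\tau e^{2c(\tau-s)}\,ds \\
  &= \left[-\frac{e^{2c(\tau-s)}}{2c}\right]_{s=0}^{s=\tau}
   = \frac{e^{2\tau c}-1}{2c}.
\end{align*}
% As c -> 0 this tends to tau, the variance of W(tau).
```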

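We will not prove (1), but we will sketch the idea. First, observe that (taking $x_0 = 0$)
$$\frac{x_t}{\sqrt{T}} = \sum_{j=1}^{t} \rho^{t-j} \frac{\epsilon_j}{\sqrt{T}} = \sum_{j=1}^{t} e^{c(t/T - j/T)} \frac{\epsilon_j}{\sqrt{T}}$$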

Defining $\xi_T(\tau)$ as usual we have:

$$\frac{x_t}{\sqrt{T}} = \sum_{j=1}^{t} e^{c(t/T - j/T)} \Delta\xi_T(j/T)$$

then taking $\tau = t/T$ we have:

$$\frac{x_{[\tau T]}}{\sqrt{T}} = \int_0^\tau e^{c(\tau - s)} d\xi_T(s)$$

Finally, assuming convergence of the stochastic integral (which we could prove if we took care of some
technical details), gives:
$$\frac{x_{[\tau T]}}{\sqrt{T}} \Rightarrow \int_0^\tau e^{c(\tau - s)} dW(s) \equiv J_c(\tau)$$
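
A quick numerical check of this limit at $\tau = 1$ is to simulate $x_T/\sqrt{T}$ many times and compare its sample variance with the Ornstein-Uhlenbeck variance $(e^{2c} - 1)/(2c)$. The sketch below assumes Gaussian errors and $x_0 = 0$; the function names, $T = 200$, the seed, and the replication count are illustrative choices, not taken from the lecture.

```python
import math
import random

def scaled_endpoint(c, T, rng):
    """Return x_T / sqrt(T) for x_t = rho * x_{t-1} + eps_t with
    rho = exp(c / T) and x_0 = 0 (assumed)."""
    rho = math.exp(c / T)
    x = 0.0
    for _ in range(T):
        x = rho * x + rng.gauss(0.0, 1.0)
    return x / math.sqrt(T)

def ou_variance(c, tau=1.0):
    """Variance of the Ornstein-Uhlenbeck limit: (exp(2*tau*c) - 1) / (2c)."""
    return (math.exp(2.0 * tau * c) - 1.0) / (2.0 * c)

def simulated_variance(c, T=200, reps=4000, seed=0):
    """Sample variance of x_T / sqrt(T) across independent replications."""
    rng = random.Random(seed)
    draws = [scaled_endpoint(c, T, rng) for _ in range(reps)]
    mean = sum(draws) / reps
    return sum((d - mean) ** 2 for d in draws) / (reps - 1)

if __name__ == "__main__":
    c = -5.0
    print("simulated:", simulated_variance(c))
    print("limit:    ", ou_variance(c))
```

The two numbers should be close but not identical: the simulation carries both Monte Carlo noise and a finite-$T$ discretization error.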

Using this result, the limiting distribution of OLS will be (omitting several technical steps):
$$T(\hat\rho - \rho) \Rightarrow \frac{\int J_c(s) dW(s)}{\int J_c^2(s) ds}$$
$$t_{\rho = e^{c/T}} \Rightarrow t_c = \frac{\int J_c(s) dW(s)}{\sqrt{\int J_c^2(s) ds}}$$

If $c = 0$, $t_c$ has a Dickey-Fuller distribution. If $c \to -\infty$, then $t_c \Rightarrow N(0, 1)$. This was shown by Phillips (1987).
The convergence to this distribution is uniform (Mikusheva (2007)),
$$\sup_{\rho \in [0,1]} \sup_x |P(t(\rho, T) \le x) - P(t_c \le x \mid \rho = e^{c/T})| \to 0 \text{ as } T \to \infty$$
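
To see why $c = 0$ gives the Dickey-Fuller case, one can substitute $c = 0$ into the definition of the Ornstein-Uhlenbeck process. The worked step below assumes the integrals run over $[0, 1]$, which the notes leave implicit.

```latex
\begin{align*}
J_0(\tau) &= \int_0^\tau e^{0\cdot(\tau-s)}\,dW(s) = W(\tau), \\
t_0 &= \frac{\int_0^1 W(s)\,dW(s)}{\sqrt{\int_0^1 W^2(s)\,ds}}
     = \frac{\tfrac{1}{2}\big(W(1)^2 - 1\big)}{\sqrt{\int_0^1 W^2(s)\,ds}},
\end{align*}
% The last equality uses Ito's formula: \int_0^1 W dW = (W(1)^2 - 1)/2.
% This is the unit-root limit of the t-statistic from the table in the introduction.
```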

Figure 1 illustrates this convergence.

Confidence Sets
We usually construct confidence sets by inverting a test. Consider testing $H_0 : \rho = \rho_0$ vs $\rho \neq \rho_0$. We
construct a confidence set as $C(x) = \{\rho_0 : \text{hypothesis accepted}\}$. So, for example in OLS, we take $t = \frac{\hat\rho - \rho_0}{s.e.(\hat\rho)}$
and
$$C(x) = \left\{\rho_0 : -1.96 \le \frac{\hat\rho - \rho_0}{se(\hat\rho)} \le 1.96\right\} = [\hat\rho - 1.96\, se(\hat\rho),\ \hat\rho + 1.96\, se(\hat\rho)]$$
To construct confidence sets using local to unity asymptotics, we do exactly the same thing, except that the quantiles
of our limiting distribution depend on $\rho_0$, i.e.
$$C(x) = \left\{\rho_0 : q_1(\rho_0, T) \le \frac{\hat\rho - \rho_0}{se(\hat\rho)} \le q_2(\rho_0, T)\right\}$$


Figure 1: Local to Unity Asymptotics

where $q_1(\rho_0, T)$ and $q_2(\rho_0, T)$ are quantiles of $t_c$ for $c = T \log \rho_0$.


This approach was developed by Stock (1991). It only works when we have an AR(1) with no autocorrelation.
Some correction could be done in AR(p) to construct a confidence set for the largest autoregressive root.
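
The test inversion with $\rho_0$-dependent quantiles can be sketched as follows. This is a simplified illustration of the idea described above, not Stock's actual procedure: it assumes an AR(1) without a constant, approximates the quantiles of $t_c$ by simulating a discretized Ornstein-Uhlenbeck process, and uses equal-tailed 95% quantiles. The names `ols_t`, `tc_quantiles`, and `local_to_unity_set`, along with the grid and replication counts, are hypothetical.

```python
import math
import random

def ols_t(x, rho0):
    """t-statistic for H0: rho = rho0 from OLS of x_t on x_{t-1} (no constant)."""
    lag, cur = x[:-1], x[1:]
    sxx = sum(a * a for a in lag)
    sxy = sum(a * b for a, b in zip(lag, cur))
    syy = sum(b * b for b in cur)
    rho_hat = sxy / sxx
    s2 = (syy - rho_hat * sxy) / (len(cur) - 1)
    return (rho_hat - rho0) / math.sqrt(s2 / sxx)

def tc_quantiles(c, n, reps, rng, level=0.95):
    """Simulated equal-tailed quantiles of t_c, using n discretization steps."""
    rho = math.exp(c / n)
    sd = 1.0 / math.sqrt(n)
    draws = []
    for _rep in range(reps):
        j = num = den = 0.0
        for _step in range(n):
            dw = rng.gauss(0.0, sd)   # Brownian increment over a step of 1/n
            num += j * dw             # approximates the integral of J_c dW
            den += j * j / n          # approximates the integral of J_c^2 ds
            j = rho * j + dw
        draws.append(num / math.sqrt(den))
    draws.sort()
    lo = draws[int((1 - level) / 2 * reps)]
    hi = draws[min(reps - 1, int((1 + level) / 2 * reps))]
    return lo, hi

def local_to_unity_set(x, grid, reps=400, seed=0):
    """Collect every rho0 on the grid (values in (0, 1]) that is not rejected."""
    rng = random.Random(seed)
    T = len(x) - 1
    accepted = []
    for rho0 in grid:
        c = T * math.log(rho0)        # c = T log(rho0), as in the text
        q1, q2 = tc_quantiles(c, T, reps, rng)
        if q1 <= ols_t(x, rho0) <= q2:
            accepted.append(rho0)
    return accepted

if __name__ == "__main__":
    rng = random.Random(1)
    x = [0.0]
    for _ in range(100):
        x.append(0.9 * x[-1] + rng.gauss(0.0, 1.0))
    grid = [round(0.80 + 0.02 * k, 2) for k in range(11)]
    print(local_to_unity_set(x, grid))
```

Because the quantiles are re-simulated for each $\rho_0$, the accepted set need not be symmetric around $\hat\rho$, unlike the interval built from $\pm 1.96$.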

Grid Bootstrap
This was an approach developed by Hansen (1999). It has a local to unity interpretation. Suppose
$$x_t = \rho x_{t-1} + \sum_{j=1}^{p-1} \beta_j \Delta x_{t-j} + \epsilon_t$$

where $\rho$ will be the sum of AR coefficients; it is a measure of persistence. For the grid bootstrap we:
• Choose a grid on $[0, 1]$
• Test $H_0 : \rho = \rho_0$ vs $\rho \neq \rho_0$ for each point on the grid
  1. Regress $x_t$ on $x_{t-1}$ and $\Delta x_{t-1}, \dots, \Delta x_{t-p+1}$ to get $\hat\rho$ and the $t$-statistic for $\rho_0$
  2. Regress $x_t - \rho_0 x_{t-1}$ on $\Delta x_{t-1}, \dots, \Delta x_{t-p+1}$ to get $\hat\beta_j$
  3. Bootstrap:
    – Draw $\epsilon^*_t$ from the residuals of step 1
    – Form $x^*_t = \rho_0 x^*_{t-1} + \sum_{j=1}^{p-1} \hat\beta_j \Delta x_{t-j} + \epsilon^*_t$ and do OLS as in step 1
    – Repeat, and use the quantiles of the bootstrapped t-stats as critical values to form the test
• All $\rho_0$ for which the hypothesis is accepted form a confidence set
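
The steps above can be sketched in a few lines for the case $p = 2$, where there is a single lagged difference. This is a minimal illustration under stated assumptions and not Hansen's implementation: there is no constant or trend, the step 1 residuals are recentred before resampling, the first two observations are used as initial values of each bootstrap series, and the lagged difference in the bootstrap recursion is built from the bootstrap series itself. All function names and the defaults `B=99` and `level=0.95` are hypothetical.

```python
import math
import random

def design(x):
    """Return y = x_t, z1 = x_{t-1}, z2 = x_{t-1} - x_{t-2} for t = 2, ..., len(x)-1."""
    return x[2:], x[1:-1], [x[t - 1] - x[t - 2] for t in range(2, len(x))]

def ols_2reg(y, z1, z2):
    """OLS of y on (z1, z2), no intercept: (b1, b2, se of b1, residuals)."""
    a11 = sum(u * u for u in z1)
    a12 = sum(u * v for u, v in zip(z1, z2))
    a22 = sum(v * v for v in z2)
    c1 = sum(u * w for u, w in zip(z1, y))
    c2 = sum(v * w for v, w in zip(z2, y))
    det = a11 * a22 - a12 * a12
    b1 = (a22 * c1 - a12 * c2) / det
    b2 = (a11 * c2 - a12 * c1) / det
    resid = [w - b1 * u - b2 * v for u, v, w in zip(z1, z2, y)]
    s2 = sum(e * e for e in resid) / (len(y) - 2)
    return b1, b2, math.sqrt(s2 * a22 / det), resid

def t_stat(x, rho0):
    """Step 1: unrestricted regression, t-statistic for rho0, and residuals."""
    rho_hat, _, se, resid = ols_2reg(*design(x))
    return (rho_hat - rho0) / se, resid

def constrained_beta(x, rho0):
    """Step 2: regress x_t - rho0 * x_{t-1} on the lagged difference."""
    y, z1, z2 = design(x)
    num = sum((w - rho0 * u) * v for u, v, w in zip(z1, z2, y))
    return num / sum(v * v for v in z2)

def grid_bootstrap(x, grid, B=99, level=0.95, seed=0):
    rng = random.Random(seed)
    accepted = []
    for rho0 in grid:
        t_obs, resid = t_stat(x, rho0)
        mean_e = sum(resid) / len(resid)
        centred = [e - mean_e for e in resid]    # assumption: recentre residuals
        beta = constrained_beta(x, rho0)
        t_boot = []
        for _b in range(B):                      # step 3: bootstrap under H0
            xs = [x[0], x[1]]                    # assumption: data as initial values
            for _t in range(2, len(x)):
                e = rng.choice(centred)
                xs.append(rho0 * xs[-1] + beta * (xs[-1] - xs[-2]) + e)
            t_boot.append(t_stat(xs, rho0)[0])
        t_boot.sort()
        lo = t_boot[int((1 - level) / 2 * B)]
        hi = t_boot[min(B - 1, int((1 + level) / 2 * B))]
        if lo <= t_obs <= hi:
            accepted.append(rho0)
    return accepted
```

A call such as `grid_bootstrap(x, [0.80, 0.82, ..., 1.00])` on a simulated series returns the grid points that are not rejected. In practice a finer grid and a larger `B` would be used, at a proportional computational cost.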

Bayesian Perspective
From a Bayesian point of view, there is nothing special about unit roots if one assumes a flat prior. Sims
and Uhlig (1991) argue that all the attention paid to unit roots is non-productive. Phillips (1991) has a
reply that looks more carefully at the idea of uninformative priors. Sims and Uhlig (1991) had put a uniform
prior on [0, 1]. Phillips points out that this puts all weight on the stationary case. He argues that a uniform
prior is not necessarily uninformative, and points out that a Jeffreys prior would put much more weight
(asymptotically almost unity weight) on the non-stationary case. In this case Bayesian conclusions look
more like frequentists'. There is a Journal of Applied Econometrics issue about this debate.
