y φy ε ,ε ,σ2 t - UW Faculty Web Server · 2006-05-01 · Since the above result holds for any...
Introduction
Consider the simple AR(1) model for t = 1, . . . , T

  y_t = φ y_{t−1} + ε_t,  ε_t ∼ WN(0, σ²)

If |φ| < 1, then y_t ∼ I(0) and

  y_t = ψ(L)ε_t,  ψ(L) = Σ_{k=0}^∞ ψ_k L^k,  ψ_k = φ^k

such that

  Σ_{k=0}^∞ k|ψ_k| < ∞
  LRV (long-run variance) = σ²ψ(1)² = σ²(1 − φ)^{−2} < ∞

Furthermore, by the LLN and the CLT

  T^{−1} Σ_{t=1}^T y_t →_p E[y_t] = 0
  T^{−1/2} Σ_{t=1}^T y_t →_d N(0, σ²ψ(1)²) = N(0, σ²(1 − φ)^{−2})
  T^{1/2}(φ̂ − φ) →_d N(0, 1 − φ²)

where φ̂ = (Σ_{t=1}^T y_{t−1}²)^{−1} Σ_{t=1}^T y_{t−1} y_t is the least squares estimate of φ.
If φ = 1, then y_t ∼ I(1) and

  ψ_k = 1,  Σ_{k=0}^∞ k|ψ_k| = ∞,  σ²ψ(1)² = ∞

Furthermore,

  T^{−1} Σ_{t=1}^T y_t → ∞ as T → ∞
  T^{−1/2} Σ_{t=1}^T y_t → ∞ as T → ∞
  T^{1/2}(φ̂ − 1) →_p 0

Clearly, the asymptotic results for I(0) processes are not applicable.
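These contrasting limit results are easy to see by simulation. The sketch below is not part of the original derivation: it assumes ε_t ∼ N(0, 1) and y_0 = 0, and the helper `simulate_ar1` is an illustrative name, not from the notes.

```python
# Simulation sketch (assumed setup: eps_t ~ N(0,1), y_0 = 0).
import numpy as np

def simulate_ar1(phi, eps):
    """Generate y_t = phi * y_{t-1} + eps_t with y_0 = 0."""
    y = np.empty(len(eps))
    prev = 0.0
    for t, e in enumerate(eps):
        prev = phi * prev + e
        y[t] = prev
    return y

rng = np.random.default_rng(0)
eps = rng.standard_normal(100_000)

y_stat = simulate_ar1(0.5, eps)  # I(0): T^{-1} sum y_t ->p E[y_t] = 0
y_rw = simulate_ar1(1.0, eps)    # I(1): sample mean is O_p(T^{1/2}), diverges

print(y_stat.mean())  # close to 0
print(y_rw.mean())    # typically far from 0
```

The stationary series obeys the LLN, while the random-walk sample mean wanders without settling down.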
Sample Moments of I(1) Processes
When φ = 1

  y_t = y_{t−1} + ε_t = y_0 + Σ_{j=1}^t ε_j = Σ_{j=1}^t ε_j if y_0 = 0

Now, consider the sample mean of y_t when y_0 = 0:

  ȳ = T^{−1} Σ_{t=1}^T y_t = T^{−1} Σ_{t=1}^T (Σ_{j=1}^t ε_j)

Notice that the sample mean is a normalized sum of
partial sums of the white noise error term εt. As such,
it exhibits very different probabilistic behavior than
the sum of stationary and ergodic errors. It turns out
that the limit behavior of y when φ = 1 is described
by simple functionals of Brownian motion.
Brownian Motion
Standard Brownian motion (Wiener process) is a continuous-time process W(·) associating to each date r ∈ [0, 1] the scalar random variable W(r) such that:

1. W(0) = 0.
2. For any dates 0 ≤ r₁ < r₂ < · · · < r_k ≤ 1, the random increments
   W(r₂) − W(r₁), W(r₃) − W(r₂), . . . , W(r_k) − W(r_{k−1})
   are independent Gaussian random variables with W(t) − W(s) ∼ N(0, t − s) for s < t.
3. For any given realization, W(r) is continuous in r with probability 1. That is, W(·) ∈ C[0, 1] = space of continuous real-valued functions on [0, 1].
The standard Brownian motion, or Wiener process, may be intuitively thought of as the continuous-time limit of a random walk process in which the integer time index t = 1, 2, . . . has been rescaled to the continuous time index r ∈ [0, 1]. The Wiener process may be shown to have the following properties:

1. W(r) ∼ N(0, r)
2. σW(r) = B(r) ∼ N(0, σ²r)
3. W(r)² ∼ r · χ²(1)
4. W(r) is nowhere differentiable and exhibits unbounded variation.
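Property 1 can be checked numerically by approximating Brownian paths with scaled random walks. The helper name below is illustrative, and the grid sizes are arbitrary choices, not part of the notes.

```python
import numpy as np

def brownian_paths(n_paths, n_steps, rng):
    """Approximate W on a grid: W(t/n) ~ n^{-1/2} * (eps_1 + ... + eps_t)."""
    eps = rng.standard_normal((n_paths, n_steps))
    return np.cumsum(eps, axis=1) / np.sqrt(n_steps)

rng = np.random.default_rng(42)
W = brownian_paths(10_000, 1_000, rng)

r = 0.5
W_r = W[:, int(r * 1_000) - 1]  # cross-section of W(0.5) across paths
print(W_r.mean())  # ≈ 0
print(W_r.var())   # ≈ r = 0.5, matching W(r) ~ N(0, r)
```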
Partial Sum Processes and the Functional Central Limit Theorem
Let ε_t ∼ WN(0, σ²). For r ∈ [0, 1], define the partial sum process

  X_T(r) = T^{−1} Σ_{t=1}^{[Tr]} ε_t,  [Tr] = integer part of T · r
For example, let T = 10 and consider X_T(r) for r = 0, 0.01, 0.1, 0.2:

  r = 0:    [10 · 0] = 0,    X₁₀(0) = (1/10) Σ_{t=1}^0 ε_t = 0
  r = 0.01: [10 · 0.01] = 0, X₁₀(0.01) = 0
  r = 0.1:  [10 · 0.1] = 1,  X₁₀(0.1) = ε₁/10
  r = 0.2:  [10 · 0.2] = 2,  X₁₀(0.2) = (ε₁ + ε₂)/10

In general,

  X₁₀(r) = (ε₁ + · · · + ε_j)/10,  j/10 ≤ r < (j + 1)/10
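The bookkeeping above can be coded directly. The sketch below uses the deterministic placeholder draws ε_t = t (an assumption for illustration only) so the values reproduce the worked example with T = 10.

```python
import numpy as np

def partial_sum_process(r, eps):
    """X_T(r) = T^{-1} * sum_{t=1}^{[Tr]} eps_t, with [Tr] the integer part."""
    T = len(eps)
    k = int(np.floor(T * r))  # [Tr]
    return eps[:k].sum() / T

eps = np.arange(1.0, 11.0)  # placeholder draws eps_1, ..., eps_10

print(partial_sum_process(0.01, eps))  # [10*0.01] = 0 -> 0.0
print(partial_sum_process(0.1, eps))   # [10*0.1] = 1  -> eps_1/10 = 0.1
print(partial_sum_process(0.2, eps))   # [10*0.2] = 2  -> (eps_1+eps_2)/10 = 0.3
```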
For a sequence of errors ε1, . . . , εT :
1. the function X_T(r) is a random step function defined on [0, 1].
2. As T gets bigger, the spaces between the steps get smaller and the random step function looks more and more like a Wiener process.
The Functional Central Limit Theorem
For any fixed r ∈ [0, 1], consider

  √T X_T(r) = √T (T^{−1} Σ_{t=1}^{[Tr]} ε_t) = T^{−1/2} Σ_{t=1}^{[Tr]} ε_t
            = (√[Tr]/√T) · ([Tr]^{−1/2} Σ_{t=1}^{[Tr]} ε_t)

Now, as T → ∞

  √[Tr]/√T → √r
  [Tr]^{−1/2} Σ_{t=1}^{[Tr]} ε_t →_d N(0, σ²)

It follows from Slutsky’s theorem that

  √T X_T(r) →_d N(0, r · σ²) ≡ σ · W(r)

or

  √T X_T(r)/σ →_d N(0, r) ≡ W(r)

Notice that when r = 1, we have the usual result

  √T X_T(1)/σ = (σ√T)^{−1} Σ_{t=1}^T ε_t →_d N(0, 1) ≡ W(1)
Since the above result holds for any r ∈ [0, 1], one might expect that the result holds uniformly for r ∈ [0, 1]. In fact, the probability distribution of the sequence of stochastic step functions

  {√T X_T(·)/σ}_{T=1}^∞

defined on [0, 1] converges asymptotically to that of standard Brownian motion W(·).

This convergence result, known as Donsker’s Theorem for Partial Sums or the Functional Central Limit Theorem (FCLT), is often represented as

  √T X_T(·)/σ ⇒ W(·)

The symbol “⇒” denotes convergence in distribution for random functions.
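At r = 1 the FCLT reduces to the ordinary CLT, which is easy to verify by Monte Carlo. The replication count and the choice σ = 2 below are arbitrary, a sketch rather than part of the notes.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, T, n_rep = 2.0, 500, 20_000

eps = sigma * rng.standard_normal((n_rep, T))  # n_rep independent samples
stat = eps.sum(axis=1) / (sigma * np.sqrt(T))  # sqrt(T) * X_T(1) / sigma

print(stat.mean())  # ≈ 0
print(stat.var())   # ≈ 1, matching N(0, 1) = W(1)
```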
The Continuous Mapping Theorem
Recall, if X_T is a sequence of random variables such that X_T →_d X and g(·) is a continuous function, then g(X_T) →_d g(X). A similar result holds for random functions and is called the Continuous Mapping Theorem (CMT).

Let {S_T(·)}_{T=1}^∞ be a sequence of random functions such that

  S_T(·) ⇒ S(·),  g(·) = continuous functional

Then the CMT states that

  g(S_T(·)) ⇒ g(S(·))
Example 1

Suppose S_T(·) = √T X_T(·)/σ so that S(·) = W(·) by the FCLT. Let g(S_T(·)) = σ · S_T(·). Then

  g(S_T(·)) ⇒ g(W(·)) = σW(·)

Example 2

Let g(S_T(·)) = ∫₀¹ S_T(r)dr. Then

  g(S_T(·)) ⇒ g(W(·)) = ∫₀¹ W(r)dr
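Example 2 can be checked by simulation: apply the functional g(S) = ∫₀¹ S(r)dr (approximated by a time average) to discretized random-walk paths. Since ∫₀¹ W(r)dr ∼ N(0, 1/3), the cross-path variance should be near 1/3. The grid sizes below are arbitrary choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_steps = 10_000, 1_000
eps = rng.standard_normal((n_paths, n_steps))
W = np.cumsum(eps, axis=1) / np.sqrt(n_steps)  # approximate Brownian paths

integral = W.mean(axis=1)  # g(W) = integral of W over [0,1], Riemann approx.
print(integral.mean())     # ≈ 0
print(integral.var())      # ≈ 1/3 = Var(integral of W)
```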
Convergence of Sample Moments of I(1) Processes
Let y_t be the I(1) process

  y_t = y_{t−1} + ε_t,  ε_t ∼ WN(0, σ²)

For r ∈ [0, 1], define the partial sum process

  X_T(r) = T^{−1} Σ_{t=1}^{[Tr]} ε_t

such that √T X_T(·) ⇒ σW(·). The FCLT and the CMT may be used to deduce the following results:

  T^{−3/2} Σ_{t=1}^T y_{t−1} ⇒ σ ∫₀¹ W(r)dr
  T^{−2} Σ_{t=1}^T y_{t−1}² ⇒ σ² ∫₀¹ W(r)²dr
  T^{−1} Σ_{t=1}^T y_{t−1}ε_t ⇒ σ² ∫₀¹ W(r)dW(r) = σ²(W(1)² − 1)/2

For example, it can be shown that

  T^{−3/2} Σ_{t=1}^T y_{t−1} = ∫₀¹ √T X_T(r)dr ⇒ σ ∫₀¹ W(r)dr

using the FCLT and the CMT. The details are given in chapter 17 of Hamilton.
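The W(1)² − 1 form of the stochastic integral mirrors an exact finite-sample identity: with y_0 = 0, y_t² = y_{t−1}² + 2y_{t−1}ε_t + ε_t², and summing over t gives Σ y_{t−1}ε_t = ½(y_T² − Σ ε_t²). A quick numerical check of that identity (simulation setup assumed):

```python
import numpy as np

rng = np.random.default_rng(7)
T = 10_000
eps = rng.standard_normal(T)

y = np.cumsum(eps)                       # random walk with y_0 = 0
y_lag = np.concatenate(([0.0], y[:-1]))  # y_0, y_1, ..., y_{T-1}

lhs = (y_lag * eps).sum()                # sum of y_{t-1} * eps_t
rhs = 0.5 * (y[-1] ** 2 - (eps ** 2).sum())
print(abs(lhs - rhs))  # ~ 0: the identity is exact, up to rounding
```

Dividing both sides by T and applying the FCLT to y_T/√T reproduces the σ²(W(1)² − 1)/2 limit.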
Application: Unit Root Tests
To illustrate the convergence of sample moments of
I(1) processes, consider the AR(1) regression
  y_t = φ y_{t−1} + ε_t,  ε_t ∼ WN(0, σ²)
If φ = 1 then yt ∼ I(1); if |φ| < 1 then yt ∼ I(0). A
test of yt ∼ I(1) against the alternative that yt ∼ I(0)
may therefore be formulated as
H0 : φ = 1 vs. H1 : |φ| < 1
A natural test statistic is the t-statistic

  t_{φ=1} = (φ̂ − 1)/SE(φ̂)

where

  φ̂ = (Σ_{t=1}^T y_{t−1}²)^{−1} Σ_{t=1}^T y_{t−1} y_t
  SE(φ̂) = (σ̂² (Σ_{t=1}^T y_{t−1}²)^{−1})^{1/2}
  σ̂² = T^{−1} Σ_{t=1}^T (y_t − φ̂ y_{t−1})²
Consistency of φ̂ under H₀ : φ = 1

Under H₀ : φ = 1

  φ̂ − 1 = (Σ_{t=1}^T y_{t−1}²)^{−1} Σ_{t=1}^T y_{t−1}ε_t
        = (T^{−2} Σ_{t=1}^T y_{t−1}²)^{−1} T^{−2} Σ_{t=1}^T y_{t−1}ε_t

Using the results

  T^{−2} Σ_{t=1}^T y_{t−1}² ⇒ σ² ∫₀¹ W(r)²dr
  T^{−1} Σ_{t=1}^T y_{t−1}ε_t ⇒ σ² ∫₀¹ W(r)dW(r)

so that T^{−2} Σ_{t=1}^T y_{t−1}ε_t = T^{−1} · O_p(1) →_p 0, and the CMT, it follows that

  φ̂ − 1 →_p (σ² ∫₀¹ W(r)²dr)^{−1} × 0 = 0

so that φ̂ →_p 1.
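Superconsistency is visible in simulation: the no-intercept OLS estimate sits much closer to 1 than the usual T^{−1/2} rate would allow. The helper `ols_phi` below is an illustrative name, and the error distribution is an assumption of this sketch.

```python
import numpy as np

def ols_phi(y):
    """OLS slope in y_t = phi * y_{t-1} + eps_t (no intercept, y_0 = 0)."""
    y_lag = np.concatenate(([0.0], y[:-1]))
    return (y_lag @ y) / (y_lag @ y_lag)

rng = np.random.default_rng(3)
for T in (100, 1_000, 10_000):
    y = np.cumsum(rng.standard_normal(T))  # random walk: true phi = 1
    print(T, abs(ols_phi(y) - 1.0))        # shrinks roughly like 1/T
```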
DF Test with Intercept

  y_t = c + φ y_{t−1} + ε_t = x_t′β + ε_t
  x_t = (1, y_{t−1})′,  β = (c, φ)′

OLS gives

  β̂ = (Σ_{t=1}^T x_t x_t′)^{−1} Σ_{t=1}^T x_t y_t

  Σ_{t=1}^T x_t x_t′ = [ T, Σ_{t=1}^T y_{t−1} ; Σ_{t=1}^T y_{t−1}, Σ_{t=1}^T y_{t−1}² ]
  Σ_{t=1}^T x_t y_t = [ Σ_{t=1}^T y_t ; Σ_{t=1}^T y_{t−1} y_t ]
Now, under H₀ : φ = 1 and c = 0

  β̂ − β = [ ĉ − 0 ; φ̂ − 1 ] = (Σ_{t=1}^T x_t x_t′)^{−1} Σ_{t=1}^T x_t ε_t
        = [ T, Σ_{t=1}^T y_{t−1} ; Σ_{t=1}^T y_{t−1}, Σ_{t=1}^T y_{t−1}² ]^{−1} [ Σ_{t=1}^T ε_t ; Σ_{t=1}^T y_{t−1}ε_t ]
Problem: the elements of Σ_{t=1}^T x_t x_t′ and Σ_{t=1}^T x_t ε_t converge at different rates!

  [ T, Σ y_{t−1} ; Σ y_{t−1}, Σ y_{t−1}² ] = [ O(T), O_p(T^{3/2}) ; O_p(T^{3/2}), O_p(T²) ]
  [ Σ ε_t ; Σ y_{t−1}ε_t ] = [ O_p(T^{1/2}) ; O_p(T) ]
Implication: sensible convergence results cannot be obtained using a single traditional scaling. Scaling by T gives

  T(β̂ − β) = (T^{−2} Σ_{t=1}^T x_t x_t′)^{−1} T^{−1} Σ_{t=1}^T x_t ε_t
            = [ T^{−1}, T^{−2} Σ y_{t−1} ; T^{−2} Σ y_{t−1}, T^{−2} Σ y_{t−1}² ]^{−1} [ T^{−1} Σ ε_t ; T^{−1} Σ y_{t−1}ε_t ]
            ⇒ [ 0, 0 ; 0, σ² ∫₀¹ W(r)²dr ]^{−1} [ 0 ; σ² ∫₀¹ W(r)dW(r) ]

which is not well defined, since the limiting matrix is singular.
Sims-Stock-Watson Trick

Define the diagonal and invertible scaling matrix

  D_T = [ T^{1/2}, 0 ; 0, T ]

Then write

  D_T(β̂ − β) = D_T (Σ_{t=1}^T x_t x_t′)^{−1} D_T D_T^{−1} Σ_{t=1}^T x_t ε_t
             = (D_T^{−1} Σ_{t=1}^T x_t x_t′ D_T^{−1})^{−1} D_T^{−1} Σ_{t=1}^T x_t ε_t

where

  D_T(β̂ − β) = [ T^{1/2} ĉ ; T(φ̂ − 1) ]
  D_T^{−1} Σ_{t=1}^T x_t x_t′ D_T^{−1} = [ 1, T^{−3/2} Σ y_{t−1} ; T^{−3/2} Σ y_{t−1}, T^{−2} Σ y_{t−1}² ]
  D_T^{−1} Σ_{t=1}^T x_t ε_t = [ T^{−1/2} Σ ε_t ; T^{−1} Σ y_{t−1}ε_t ]

Therefore,

  D_T(β̂ − β) ⇒ [ 1, σ ∫₀¹ W(r)dr ; σ ∫₀¹ W(r)dr, σ² ∫₀¹ W(r)²dr ]^{−1} × [ N(0, σ²) ; σ² ∫₀¹ W(r)dW(r) ]

Straightforward algebra shows that

  T^{1/2} ĉ does not converge to N(0, σ²) (the limit is a functional of W)
  T(φ̂ − 1) ⇒ (∫₀¹ W_μ(r)²dr)^{−1} ∫₀¹ W_μ(r)dW(r)
  W_μ(r) = W(r) − ∫₀¹ W(s)ds
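A Monte Carlo sketch (helper name and simulation settings assumed, not from the notes) shows that T(φ̂ − 1) in the intercept case is not centred at zero: the demeaned Dickey-Fuller coefficient law is markedly left-skewed.

```python
import numpy as np

def df_with_intercept(y):
    """OLS of y_t on (1, y_{t-1}); returns (c_hat, phi_hat)."""
    X = np.column_stack((np.ones(len(y) - 1), y[:-1]))
    return np.linalg.lstsq(X, y[1:], rcond=None)[0]

rng = np.random.default_rng(11)
T, n_rep = 500, 2_000
stats = np.empty(n_rep)
for i in range(n_rep):
    y = np.cumsum(rng.standard_normal(T))       # H0: c = 0, phi = 1
    stats[i] = T * (df_with_intercept(y)[1] - 1.0)

print(stats.mean())  # well below 0: the limit is not centred at zero
```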
Convergence of Sample Moments with General Serial Correlation

  y_t = y_{t−1} + ψ*(L)ε_t = y_{t−1} + u_t,  ε_t ∼ WN(0, σ²)
  ψ*(L) is 1-summable
  LRV = σ²ψ*(1)² = γ₀ + 2 Σ_{j=1}^∞ γ_j,  γ_j = cov(u_t, u_{t−j})

FCLT:

  √T X_T(·) = T^{−1/2} Σ_{t=1}^{[T·]} u_t ⇒ √LRV × W(·)

1. T^{−3/2} Σ_{t=1}^T y_{t−1} ⇒ √LRV ∫₀¹ W(r)dr
2. T^{−2} Σ_{t=1}^T y_{t−1}² ⇒ LRV ∫₀¹ W(r)²dr
3. T^{−1} Σ_{t=1}^T y_{t−1}u_t ⇒ LRV ∫₀¹ W(r)dW(r) + ω,  ω = ½(LRV − γ₀)
4. T^{−1} Σ_{t=1}^T y_{t−1}ε_t ⇒ √(σ² · LRV) ∫₀¹ W(r)dW(r)
5. T^{−1} Σ_{t=1}^T y_{t−1}u_{t−1} ⇒ LRV ∫₀¹ W(r)dW(r) + ω + γ₀
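For a concrete case, take u_t = (1 − ξL)^{−1}ε_t with ξ = 0.5 and σ² = 1 (values chosen for illustration): then γ_j = γ₀ ξ^j with γ₀ = σ²/(1 − ξ²), and the autocovariance sum reproduces LRV = σ²(1 − ξ)^{−2} = 4. A numerical check, truncating the infinite sum:

```python
import numpy as np

sigma2, xi = 1.0, 0.5
gamma0 = sigma2 / (1 - xi**2)              # var(u_t) for AR(1) u_t
j = np.arange(1, 200)
gamma_j = gamma0 * xi**j                   # autocovariances gamma_j

lrv_from_sum = gamma0 + 2 * gamma_j.sum()  # gamma_0 + 2 * sum_j gamma_j
lrv_formula = sigma2 / (1 - xi) ** 2       # sigma^2 * psi*(1)^2

print(lrv_from_sum, lrv_formula)  # both ≈ 4.0
```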
Application: Asymptotic Distribution of the ADF Test

Assume y_t is I(1) and that ∆y_t ∼ AR(1):

  ∆y_t = ξ∆y_{t−1} + ε_t,  ε_t ∼ WN(0, σ²),  |ξ| < 1

Therefore, ∆y_t has Wold representation

  ∆y_t = ψ*(L)ε_t = u_t
  ψ*(L) = (1 − ξL)^{−1} = Σ_{j=0}^∞ ψ*_j L^j,  ψ*_j = ξ^j
  LRV = σ²ψ*(1)² = σ²(1 − ξ)^{−2}

The ADF test regression is

  y_t = φ y_{t−1} + ξ∆y_{t−1} + ε_t = x_t′β + ε_t
  x_t = (y_{t−1}, ∆y_{t−1})′,  β = (φ, ξ)′

Notice that the first element of x_t, y_{t−1}, is I(1), while the second, ∆y_{t−1}, is I(0).
OLS on the ADF test regression gives

  β̂ − β = (Σ_{t=1}^T x_t x_t′)^{−1} Σ_{t=1}^T x_t ε_t

where

  Σ_{t=1}^T x_t x_t′ = [ Σ y_{t−1}², Σ y_{t−1}∆y_{t−1} ; Σ ∆y_{t−1}y_{t−1}, Σ ∆y_{t−1}² ]
                     = [ O_p(T²), O_p(T) ; O_p(T), O_p(T) ]
  Σ_{t=1}^T x_t ε_t = [ Σ y_{t−1}ε_t ; Σ ∆y_{t−1}ε_t ] = [ O_p(T) ; O_p(T^{1/2}) ]
Use the Sims-Stock-Watson trick and define the scaling matrix

  D_T = [ T, 0 ; 0, T^{1/2} ]

Then write

  D_T(β̂ − β) = D_T (Σ_{t=1}^T x_t x_t′)^{−1} D_T D_T^{−1} Σ_{t=1}^T x_t ε_t
             = (D_T^{−1} Σ_{t=1}^T x_t x_t′ D_T^{−1})^{−1} D_T^{−1} Σ_{t=1}^T x_t ε_t

where

  D_T(β̂ − β) = [ T(φ̂ − 1) ; T^{1/2}(ξ̂ − ξ) ]
  D_T^{−1} Σ x_t x_t′ D_T^{−1} = [ T^{−2} Σ y_{t−1}², T^{−3/2} Σ y_{t−1}∆y_{t−1} ; T^{−3/2} Σ ∆y_{t−1}y_{t−1}, T^{−1} Σ ∆y_{t−1}² ]
  D_T^{−1} Σ x_t ε_t = [ T^{−1} Σ y_{t−1}ε_t ; T^{−1/2} Σ ∆y_{t−1}ε_t ]
Note: ∆y_{t−1}ε_t = u_{t−1}ε_t is a stationary and ergodic MDS with

  E[(u_{t−1}ε_t)²] = E[E[(u_{t−1}ε_t)² | I_{t−1}]] = E[u_{t−1}² E[ε_t²]] = σ²γ₀

Therefore, by the appropriate CLT

  T^{−1/2} Σ_{t=1}^T ∆y_{t−1}ε_t →_d N(0, σ²γ₀)
Using the convergence results for the sample moments of serially correlated I(1) processes, the above result, and the CMT gives

  T(φ̂ − 1) ⇒ √(σ²/LRV) (∫₀¹ W(r)²dr)^{−1} ∫₀¹ W(r)dW(r) = (1 − ξ) (∫₀¹ W(r)²dr)^{−1} ∫₀¹ W(r)dW(r)
  T^{1/2}(ξ̂ − ξ) →_d N(0, σ²/γ₀) = N(0, 1 − ξ²)

Furthermore, φ̂ and ξ̂ are asymptotically independent.
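The mixed rates can be seen in a single simulation (AR(1) recursion for ∆y_t as above; the regression helper and parameter values are illustrative assumptions): φ̂ is superconsistent, while ξ̂ behaves like an ordinary √T-consistent estimate.

```python
import numpy as np

def adf_estimates(y):
    """OLS of y_t on (y_{t-1}, dy_{t-1}); returns (phi_hat, xi_hat)."""
    dy = np.diff(y)
    X = np.column_stack((y[1:-1], dy[:-1]))  # regressors y_{t-1}, dy_{t-1}
    return np.linalg.lstsq(X, y[2:], rcond=None)[0]

rng = np.random.default_rng(5)
xi, T = 0.5, 5_000
eps = rng.standard_normal(T)
u = np.empty(T)
u[0] = eps[0]
for t in range(1, T):             # u_t = xi * u_{t-1} + eps_t
    u[t] = xi * u[t - 1] + eps[t]
y = np.cumsum(u)                  # y_t = y_{t-1} + u_t: an I(1) process

phi_hat, xi_hat = adf_estimates(y)
print(abs(phi_hat - 1.0))  # converges at rate T
print(abs(xi_hat - 0.5))   # converges at rate sqrt(T)
```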