1 Basic Geometric Intuition: For Example, See Theorems 6.3.8 and 6.3.9 in Lay's Linear Algebra Book On The Syllabus
𝑦 = 𝑋𝛽 + 𝜀 (1)
Label the 𝑘th column of 𝑋 by 𝑥(𝑘) (do not confuse it with 𝑥𝑖, the vector of covariates of unit 𝑖).
a In what space does the vector 𝑦 lie? What is the dimension of the column space of 𝑋? Draw a picture for 𝑛 = 2, 𝑘 = 1 and for 𝑛 = 3, 𝑘 = 1, 2.
b Show that 𝑋′𝑒 = 0 for the OLS vector of residuals 𝑒. State this as an orthogonality condition involving 𝑒 and the regressors. Show that there are unique 𝑒 and 𝑧 ∈ span{𝑥(1), . . . , 𝑥(𝑘)} such that 𝑦 = 𝑧 + 𝑒, 𝑒 is orthogonal to 𝑧, and ‖𝑒‖ is minimized (see the numerical sketch after part d).
c What happens if 𝑦 lies in the linear span of the columns of 𝑋? Draw a picture for 𝑛 = 3, 𝑘 = 2.
d What happens if 𝑋 is not of full column rank? Draw a picture for 𝑛 = 3, 𝑘 = 2. The vector 𝑧 obtained in b is unique. How can this be reconciled with the possible non-uniqueness of the solution to 𝑋′𝑋𝑏 = 𝑋′𝑦?
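A minimal numpy sketch (illustrative only; not part of the problem) that checks the orthogonality condition from part b and the point of part d: even when the normal equations have many solutions 𝑏, the projection 𝑧 = 𝑋𝑏 is the same for all of them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Full-rank case: n = 3, k = 2
X = rng.normal(size=(3, 2))
y = rng.normal(size=3)

b, *_ = np.linalg.lstsq(X, y, rcond=None)  # solves X'Xb = X'y
z = X @ b                                  # projection of y onto span{x(1), x(2)}
e = y - z                                  # residual vector

print(X.T @ e)   # numerically zero: e is orthogonal to every regressor

# Part d: duplicate a column so that rank(X) = 1 < k
Xd = np.column_stack([X[:, 0], X[:, 0]])
b1, *_ = np.linalg.lstsq(Xd, y, rcond=None)  # one solution of the normal equations
b2 = b1 + np.array([1.0, -1.0])              # another one, since Xd @ [1, -1] = 0
print(np.allclose(Xd @ b1, Xd @ b2))         # True: z = Xb is unique regardless
```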
1. If E(𝜀|𝑋) = 0, what can you say about E(𝜀|𝑍)? What relation should 𝛽 and 𝛾 satisfy
for E(𝑢|𝑍) = 0?
Let 𝜀 ∼ 𝑁 (0, 𝐼) and 𝑋 be a deterministic 𝑛 × 𝑘 matrix of rank 𝑘 (just so that we don’t have
to deal with statements conditional on 𝑋).
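As a sanity check on this setup (a hedged simulation sketch; the design and coefficients below are made up), recall that 𝑏 = 𝛽 + (𝑋′𝑋)⁻¹𝑋′𝜀 ∼ 𝑁(𝛽, (𝑋′𝑋)⁻¹) when 𝜀 ∼ 𝑁(0, 𝐼):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 2
X = rng.normal(size=(n, k))            # deterministic design of rank k (held fixed)
beta = np.array([1.0, -2.0])
A = np.linalg.inv(X.T @ X)

# b = beta + (X'X)^{-1} X' eps, so b ~ N(beta, (X'X)^{-1})
draws = np.array([beta + A @ (X.T @ rng.normal(size=n)) for _ in range(20_000)])
print(draws.mean(axis=0))              # close to beta
print(np.cov(draws.T))                 # close to (X'X)^{-1}, printed next
print(A)
```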
a Show that 𝑏 is a sufficient statistic for 𝛽. It is also complete (by Theorem 6.2.25 in Casella and Berger, but you don't have to show this). A sketch of one route appears after part c.
c Show this without using the completeness and sufficiency of 𝑏 (Hint: 1.5.16 in Hayashi).
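For part a, a sketch of one possible route (not necessarily the intended one): since 𝑋′𝑒 = 0 for 𝑒 = 𝑦 − 𝑋𝑏, the Pythagorean identity ‖𝑦 − 𝑋𝛽‖² = ‖𝑒‖² + ‖𝑋(𝑏 − 𝛽)‖² splits the Gaussian density as 𝑓(𝑦; 𝛽) = (2𝜋)^{−𝑛/2} exp(−½‖𝑒‖²) · exp(−½‖𝑋(𝑏 − 𝛽)‖²) = ℎ(𝑦) 𝑔(𝑏; 𝛽), so sufficiency of 𝑏 follows from the factorization theorem.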
𝑦𝑖² = 𝛽0 + 𝛽1 𝑥𝑖 + 𝜀𝑖 (5)
b Suppose 𝑦𝑖 is not a deterministic function of 𝑥𝑖. Argue using Jensen's inequality why the 'intuitive' suggestion will not lead to nice results.
c Confirm this by defining 𝑢𝑖 = 𝑦𝑖 − √(𝛽0 + 𝛽1 𝑥𝑖) and computing E(𝑦𝑖² |𝑋). What can and cannot be estimated from this?
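A sketch of the computation in part c, assuming the underlying model is 𝑦𝑖 = √(𝛽0 + 𝛽1 𝑥𝑖) + 𝑢𝑖 with E(𝑢𝑖 |𝑋) = 0 (consistent with the definition of 𝑢𝑖 above, but an assumption here): squaring gives 𝑦𝑖² = 𝛽0 + 𝛽1 𝑥𝑖 + 2√(𝛽0 + 𝛽1 𝑥𝑖) 𝑢𝑖 + 𝑢𝑖², so E(𝑦𝑖² |𝑋) = 𝛽0 + 𝛽1 𝑥𝑖 + E(𝑢𝑖² |𝑋). If in addition Var(𝑢𝑖 |𝑋) = 𝜎², regression (5) identifies 𝛽1 and the sum 𝛽0 + 𝜎², but 𝛽0 and 𝜎² cannot be separated.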
Let 𝑏𝑛 be the OLS estimate computed on the first 𝑛 observations. Suppose a new observation (𝑦𝑛+1, 𝑥𝑛+1) comes along. Establish a recursion which allows us to update 𝑏𝑛 using the new information without recomputing from scratch, using the following steps (a numerical check of the full recursion appears after step 4).
1. Define 𝐴𝑛 = (𝑋′𝑛 𝑋𝑛)⁻¹, so that 𝑏𝑛 = 𝐴𝑛 𝑋′𝑛 𝑦𝑛.
2. Establish the Sherman–Morrison formula
(𝐴 + 𝑢𝑣′)⁻¹ = 𝐴⁻¹ − (𝐴⁻¹𝑢𝑣′𝐴⁻¹)/(1 + 𝑣′𝐴⁻¹𝑢) (9)
3. Setting
𝐾𝑛 = (𝑋′𝑛−1 𝑋𝑛−1)⁻¹ 𝑥𝑛 / (1 + 𝑥′𝑛 (𝑋′𝑛−1 𝑋𝑛−1)⁻¹ 𝑥𝑛) (10)
= 𝐴𝑛−1 𝑥𝑛 / (1 + 𝑥′𝑛 𝐴𝑛−1 𝑥𝑛) (11)
show that
𝑏𝑛 = 𝑏𝑛−1 + 𝐾𝑛 (𝑦𝑛 − 𝑥′𝑛 𝑏𝑛−1) (12)
(the term in the denominator of 𝐾𝑛 is essentially the influence of observation 𝑥𝑛, showing how much attention 𝑏𝑛 should pay to it; see p. 21 in Hayashi for a bit more on this).
4. How much memory does this approach need? (In simple words, how many numbers do
you have to store to do a step of this iteration?)
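A minimal numpy sketch of the whole recursion (names and data are illustrative), confirming that update (12) reproduces full-sample OLS while storing only 𝐴𝑛 and 𝑏𝑛, which also bears on the memory question in step 4:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 3
X = rng.normal(size=(n, k))
y = X @ np.array([0.5, -1.0, 2.0]) + rng.normal(size=n)

# Initialize on the first k observations (assumed to have rank k)
A = np.linalg.inv(X[:k].T @ X[:k])   # A_n = (X_n' X_n)^{-1}, a k x k matrix
b = A @ X[:k].T @ y[:k]              # b_n = A_n X_n' y_n

for x_new, y_new in zip(X[k:], y[k:]):
    Ax = A @ x_new
    denom = 1.0 + x_new @ Ax         # the 'influence' term in the denominator
    K = Ax / denom                   # gain vector, eq. (11)
    A -= np.outer(Ax, Ax) / denom    # Sherman-Morrison update of (X'X)^{-1}, eq. (9)
    b = b + K * (y_new - x_new @ b)  # recursive update, eq. (12)

b_full, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(b, b_full))        # True: the recursion matches full-sample OLS
# Per step we store only A (k*k numbers) and b (k numbers) -- nothing grows with n.
```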