Econometrics I: Professor William Greene Stern School of Business Department of Economics


Econometrics I

Professor William Greene


Stern School of Business
Department of Economics

3-1/29 Part 3: Least Squares Algebra


Econometrics I

Part 3 – Least Squares Algebra

3-2/29 Part 3: Least Squares Algebra


Vocabulary
• Some terms to be used in the discussion:
• Population characteristics and entities vs. sample quantities and analogs
• Residuals and disturbances
• Population regression line and sample regression line
• Objective: Learn about the conditional mean function; 'estimate' β and σ².
• First step: Mechanics of fitting a line (hyperplane) to a set of data

3-3/29 Part 3: Least Squares Algebra


Fitting Criteria
• The set of points in the sample
• Fitting criteria - what are they?
  • LAD: Minimize over b: Σi |yi − xi′bLAD|
  • Least squares: Minimize over b: Σi (yi − xi′bLS)²
  • and so on
• Why least squares? A fundamental result: sample moments are "good"
  estimators of their population counterparts. We will examine this
  principle and apply it to least squares computation.

3-4/29 Part 3: Least Squares Algebra


An Analogy Principle for Estimating β
In the population: E[y|X] = Xβ, so
                   E[y − Xβ|X] = 0.
Continuing (assumed): E[xi εi] = 0 for every i.
Summing: Σi E[xi εi] = Σi 0 = 0.
Exchanging Σi and E[·]: E[Σi xi εi] = E[X′ε] = 0.
That is, E[X′(y − Xβ)] = 0.
So, if Xβ is the conditional mean, then E[X′ε] = 0.
We choose b, the estimator of β, to mimic this population result:
i.e., mimic the population mean with the sample mean.
Find b such that (1/n) X′e = (1/n) X′(y − Xb) = 0.
As we will see, the solution is the least squares coefficient
vector.

3-5/29 Part 3: Least Squares Algebra
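
A minimal numerical sketch of the analogy principle, in Python with numpy on simulated data (the coefficients and sample size here are made up for illustration): choosing b to zero the sample moment (1/n)X′(y − Xb) is the same as solving the normal equations X′y = X′Xb.

import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 0.5, -2.0])            # population coefficients
y = X @ beta + rng.normal(size=n)            # y = X beta + eps

# (1/n) X'(y - Xb) = 0  is equivalent to  X'y = X'X b.
b = np.linalg.solve(X.T @ X, X.T @ y)

e = y - X @ b
print(X.T @ e / n)    # the sample moment: zero up to rounding error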


Population Moments

We assumed that E[εi|xi] = 0. (Slide 2:40)

It follows that Cov[xi, εi] = 0.
Proof: Cov(xi, εi) = Cov(xi, E[εi|xi]) = Cov(xi, 0) = 0. (Theorem B.2)

If E[yi|xi] = xi′β, then β = (Var[xi])⁻¹ Cov[xi, yi].
Proof: Cov[xi, yi] = Cov[xi, E[yi|xi]] = Cov[xi, xi′β] = Var[xi] β.

This will provide a population analog to the statistics we compute
with the data.

3-6/29 Part 3: Least Squares Algebra
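
A small sketch of this population analogy (numpy, simulated data): with a single regressor, the sample version of β = (Var[xi])⁻¹ Cov[xi, yi] is the ratio of the sample covariance to the sample variance, and it coincides with the least squares slope.

import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
y = 2.0 + 0.7 * x + rng.normal(size=n)

# Sample moment version: Cov(x, y) / Var(x), both with divisor n.
b_moment = np.cov(x, y, bias=True)[0, 1] / np.var(x)

# Least squares slope from a fitted line with a constant term.
X = np.column_stack([np.ones(n), x])
b_ls = np.linalg.solve(X.T @ X, X.T @ y)

print(b_moment, b_ls[1])    # identical up to rounding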


U.S. Gasoline Market, 1960-1995

3-7/29 Part 3: Least Squares Algebra


Least Squares
• The example will be Gi regressed on xi = [1, PGi, Yi]
  (the U.S. gasoline market data).

• Fitting criterion: the fitted equation will be
  ŷi = b1xi1 + b2xi2 + ... + bKxiK.

• The criterion is based on the residuals:
  ei = yi − (b1xi1 + b2xi2 + ... + bKxiK) = yi − xi′b.
  Make the ei as small as possible.
  Form a criterion and minimize it.

3-8/29 Part 3: Least Squares Algebra


Fitting Criteria

• Sum of residuals: Σi ei
• Sum of squares: Σi ei²
• Sum of absolute values of residuals: Σi |ei|
• Absolute value of sum of residuals: |Σi ei|

We focus on Σi ei² now and Σi |ei| later.

3-9/29 Part 3: Least Squares Algebra
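
A quick sketch (numpy, simulated data) computing all four criteria for an arbitrary trial coefficient vector; the sum of squares is the one least squares minimizes.

import numpy as np

rng = np.random.default_rng(2)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

b = np.array([0.8, 0.4])          # a trial b, not necessarily optimal
e = y - X @ b

print("sum of residuals:          ", e.sum())
print("sum of squares:            ", (e ** 2).sum())
print("sum of absolute residuals: ", np.abs(e).sum())
print("abs. value of the sum:     ", abs(e.sum()))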


Least Squares Algebra

Σi ei² = Σi (yi − xi′b)² = e′e = (y − Xb)′(y − Xb)

Matrix and vector derivatives:
• Derivative of a scalar with respect to a vector
• Derivative of a column vector with respect to a row vector
• Other derivatives

3-10/29 Part 3: Least Squares Algebra


Least Squares Normal Equations

∂(Σi ei²)/∂b = ∂[Σi (yi − xi′b)²]/∂b
             = Σi 2(yi − xi′b)(−xi)
             = −2 Σi xi yi + 2 Σi xi xi′ b
             = −2X′y + 2X′Xb

3-11/29 Part 3: Least Squares Algebra
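
The derivative on this slide can be checked numerically. A sketch (numpy, simulated data) comparing the analytic gradient −2X′y + 2X′Xb of e′e with a central finite-difference approximation:

import numpy as np

rng = np.random.default_rng(3)
n, K = 50, 3
X = rng.normal(size=(n, K))
y = rng.normal(size=n)
b = rng.normal(size=K)

def sse(b):
    e = y - X @ b
    return e @ e                             # e'e as a function of b

analytic = -2 * X.T @ y + 2 * X.T @ X @ b    # the slide's result

h = 1e-6
numeric = np.array([
    (sse(b + h * np.eye(K)[k]) - sse(b - h * np.eye(K)[k])) / (2 * h)
    for k in range(K)])

print(np.allclose(analytic, numeric, rtol=1e-4))    # True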


Least Squares Normal Equations

∂[(y − Xb)′(y − Xb)]/∂b = −2X′(y − Xb) = 0

Dimensions: the object differentiated is (1×1); the derivative is
(−2)(n×K)′(n×1) = (−2)(K×n)(n×1) = (K×1).
Note: the derivative of a (1×1) scalar with respect to a (K×1)
vector is a (K×1) vector.

Solution: −2X′(y − Xb) = 0  ⇒  X′y = X′Xb

3-12/29 Part 3: Least Squares Algebra


Least Squares Solution

Assuming it exists: b = (X′X)⁻¹X′y

Note the analogy: β = [Var(x)]⁻¹ Cov(x, y)

b = [(1/n)X′X]⁻¹[(1/n)X′y] = [(1/n)Σi xi xi′]⁻¹[(1/n)Σi xi yi]

This suggests something desirable about least squares.

3-13/29 Part 3: Least Squares Algebra
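
A sketch (numpy, simulated data) of three numerically equivalent ways to compute b. In practice one solves the normal equations, or uses a library least squares routine, rather than forming (X′X)⁻¹ explicitly.

import numpy as np

rng = np.random.default_rng(4)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

b1 = np.linalg.inv(X.T @ X) @ (X.T @ y)          # textbook formula
b2 = np.linalg.solve(X.T @ X / n, X.T @ y / n)   # sample-moment form
b3 = np.linalg.lstsq(X, y, rcond=None)[0]        # library LS solver

print(np.allclose(b1, b2), np.allclose(b1, b3))  # True True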


Second Order Conditions
Necessary condition: first derivatives = 0:
∂[(y − Xb)′(y − Xb)]/∂b = −2X′(y − Xb)
Sufficient condition: second derivatives ...
∂²[(y − Xb)′(y − Xb)]/∂b∂b′ = ∂[−2X′(y − Xb)]/∂b′
                            = ∂[−2X′y + 2X′Xb]/∂b′
                            = 2X′X
(The derivative of a K×1 column vector with respect to a 1×K row
vector is a K×K matrix.)
3-14/29 Part 3: Least Squares Algebra
Side Result: Sample Moments

        ⎡ Σi xi1²     Σi xi1xi2   ...  Σi xi1xiK ⎤
        ⎢ Σi xi2xi1   Σi xi2²     ...  Σi xi2xiK ⎥
X′X  =  ⎢ ...         ...         ...  ...       ⎥
        ⎣ Σi xiKxi1   Σi xiKxi2   ...  Σi xiK²   ⎦

            ⎡ xi1²     xi1xi2   ...  xi1xiK ⎤
            ⎢ xi2xi1   xi2²     ...  xi2xiK ⎥
     =  Σi  ⎢ ...      ...      ...  ...    ⎥
            ⎣ xiKxi1   xiKxi2   ...  xiK²   ⎦

            ⎡ xi1 ⎤
            ⎢ xi2 ⎥
     =  Σi  ⎢ ... ⎥ [xi1  xi2  ...  xiK]
            ⎣ xiK ⎦

     =  Σi xi xi′
3-15/29 Part 3: Least Squares Algebra
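
A sketch (numpy, random data) verifying the identity X′X = Σi xi xi′:

import numpy as np

rng = np.random.default_rng(5)
n, K = 10, 3
X = rng.normal(size=(n, K))

lhs = X.T @ X                                        # K x K cross products
rhs = sum(np.outer(X[i], X[i]) for i in range(n))    # sum of outer products

print(np.allclose(lhs, rhs))    # True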
Does b Minimize e′e?

∂²(e′e)/∂b∂b′ = 2X′X, twice the matrix of sums of squares and cross
products shown on the previous slide.

If there were a single coefficient b, we would require this second
derivative to be positive, which it would be: 2x′x = 2Σi xi² > 0.

The matrix counterpart of a positive number is a positive definite
matrix.

3-16/29 Part 3: Least Squares Algebra


A Positive Definite Matrix
Matrix C is positive definite if a′Ca > 0 for every nonzero a.
Generally hard to check. Requires a look at characteristic roots
(later in the course).
For some matrices, it is easy to verify. X′X is one of these:
a′X′Xa = (a′X′)(Xa) = (Xa)′(Xa) = v′v = Σi vi² ≥ 0

Could v = 0 for a ≠ 0? v = 0 means Xa = 0. Is this possible? Not if
X has full column rank, i.e., no column of X is an exact linear
combination of the others.

Conclusion: b = (X′X)⁻¹X′y does indeed minimize e′e.

3-17/29 Part 3: Least Squares Algebra
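
A sketch (numpy, random data) of the argument: a′X′Xa equals the squared length of v = Xa, so it is nonnegative, and strictly positive when X has full column rank and a ≠ 0.

import numpy as np

rng = np.random.default_rng(6)
n, K = 50, 4
X = rng.normal(size=(n, K))     # full column rank with probability one

for _ in range(3):
    a = rng.normal(size=K)      # an arbitrary nonzero vector
    quad = a @ X.T @ X @ a
    v = X @ a
    print(quad > 0, np.isclose(quad, v @ v))    # True True

# Equivalently, every characteristic root of X'X is positive:
print(np.linalg.eigvalsh(X.T @ X).min() > 0)    # True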


Algebraic Results - 1

In the population: E[X′ε] = 0
In the sample: (1/n) Σi xi ei = 0

X′e = 0 means: for each column xk of X, xk′e = 0.

(1) Each column of X is orthogonal to e.
(2) If one of the columns of X is a column of ones, then
    i′e = Σi ei = 0: the residuals sum to zero.
(3) It follows that (1/n) Σi ei = 0, which mimics E[εi] = 0.

3-18/29 Part 3: Least Squares Algebra
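
A sketch (numpy, simulated data) of result (2): the residuals sum to zero because a column of ones is among the columns of X; without a constant term, X′e = 0 still holds but the residuals need not sum to zero.

import numpy as np

rng = np.random.default_rng(7)
n = 100
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)

# With a constant term:
X1 = np.column_stack([np.ones(n), x])
e1 = y - X1 @ np.linalg.solve(X1.T @ X1, X1.T @ y)
print(e1.sum())     # ~0: the residuals sum to zero

# Without a constant term:
X0 = x.reshape(-1, 1)
e0 = y - X0 @ np.linalg.solve(X0.T @ X0, X0.T @ y)
print(e0.sum())     # generally not zero
print(x @ e0)       # but x'e is still ~0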


Residuals vs. Disturbances
Disturbances (population): yi = xi′β + εi
Partitioning y: y = E[y|X] + ε
              = conditional mean + disturbance
Residuals (sample): yi = xi′b + ei
Partitioning y: y = Xb + e
              = projection + residual
(Note: projection into the column space of X, i.e., the set of
linear combinations of the columns of X. Xb is one of these.)

3-19/29 Part 3: Least Squares Algebra
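
In simulated data the disturbances are observable, so the distinction can be seen directly. A sketch (numpy): the residuals e estimate, but do not equal, the disturbances ε.

import numpy as np

rng = np.random.default_rng(8)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 0.5])
eps = rng.normal(size=n)     # disturbances: unobservable in real data
y = X @ beta + eps

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b                # residuals: computable from the sample

print(np.max(np.abs(e - eps)))    # small but nonzero: e is not eps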


Algebraic Results - 2
 A “residual maker” M = (I - X(X’X)-1X’)
 e = y - Xb= y - X(X’X)-1X’y = My
 My = The residuals that result when y is regressed on X
 MX = 0 (This result is fundamental!)
How do we interpret this result in terms of residuals?
When a column of X is regressed on all of X, we get a
perfect fit and zero residuals.
 (Therefore) My = MXb + Me = Me = e
(You should be able to prove this.
 y = Py + My, P = X(X’X)-1X’ = (I - M).
PM = MP = 0.
 Py is the projection of y into the column space of X.

3-20/29 Part 3: Least Squares Algebra
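
A sketch (numpy, simulated data) verifying the results on this slide on a small problem, where forming M explicitly is harmless (see slide 3-28 for why it is a bad idea when n is large):

import numpy as np

rng = np.random.default_rng(9)
n, K = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = rng.normal(size=n)

P = X @ np.linalg.solve(X.T @ X, X.T)    # projection matrix, n x n
M = np.eye(n) - P                        # the residual maker

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

print(np.allclose(M @ X, 0))             # MX = 0
print(np.allclose(M @ y, e))             # My = e
print(np.allclose(P @ M, 0))             # PM = 0
print(np.allclose(P @ y + M @ y, y))     # y = Py + My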


The M Matrix

• M = I − X(X′X)⁻¹X′ is an n×n matrix.
• M is symmetric: M = M′.
• M is idempotent: MM = M.
  (Just multiply it out.)
• M is singular: M⁻¹ does not exist.
  (We will prove this later as a side result in another derivation.)

3-21/29 Part 3: Least Squares Algebra
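
A sketch (numpy, random data) checking the three properties:

import numpy as np

rng = np.random.default_rng(10)
n, K = 15, 3
X = rng.normal(size=(n, K))

M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

print(np.allclose(M, M.T))          # symmetric: M = M'
print(np.allclose(M @ M, M))        # idempotent: MM = M
print(np.linalg.matrix_rank(M))     # n - K = 12 < n, so M is singular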


Results when X Contains a Constant Term
• X = [1, x2, …, xK]
• The first column of X is a column of ones.
• Since X′e = 0, x1′e = 0: the residuals sum to zero.

y = Xb + e
Define i = [1, 1, ..., 1]′, a column of n ones.
i′y = Σi yi = nȳ
i′y = i′Xb + i′e = i′Xb, so (1/n)i′y = (1/n)i′Xb, which implies
ȳ = x̄′b (the regression line passes through the means).
These results do not apply if the model has no constant term.

3-22/29 Part 3: Least Squares Algebra
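
A sketch (numpy, simulated data): with a constant term in X, the fitted hyperplane passes through the point of means.

import numpy as np

rng = np.random.default_rng(11)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)

print(np.isclose(y.mean(), X.mean(axis=0) @ b))    # True: ybar = xbar'b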


U.S. Gasoline Market, 1960-1995

3-23/29 Part 3: Least Squares Algebra


Least Squares Algebra

3-24/29 Part 3: Least Squares Algebra


Least Squares

3-25/29 Part 3: Least Squares Algebra


Residuals

3-26/29 Part 3: Least Squares Algebra


Least Squares Residuals (autocorrelated)

3-27/29 Part 3: Least Squares Algebra


Least Squares Algebra-3

M = I − X(X′X)⁻¹X′

M is n×n: potentially huge.

3-28/29 Part 3: Least Squares Algebra
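
Because M is n×n, it should never be formed just to compute residuals. A sketch (numpy, simulated data): e = y − Xb gives the same vector as My using only O(nK) memory.

import numpy as np

rng = np.random.default_rng(12)
n, K = 100_000, 5      # M would have 10^10 entries (about 80 GB)
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b          # identical to My, with no M required

print(np.abs(X.T @ e).max())    # ~0: the defining property X'e = 0 holds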


Least Squares Algebra-4

MX = 0 (each column of X, regressed on X, is fit perfectly, so its
residual vector is zero).

3-29/29 Part 3: Least Squares Algebra
