FECO Note 2 - Simple Linear Regression: Xuan Chinh Mai
February 3, 2018
Contents

1 Simple Linear Regression Model
2 OLS Estimator
4 Goodness-of-fit
1 Simple Linear Regression Model
Functional form:
$$\underbrace{Y}_{\text{Dependent Variable}} = \underbrace{\beta_0}_{\text{Intercept}} + \underbrace{\beta_1}_{\text{Slope}}\underbrace{X}_{\text{Regressor}} + \underbrace{U}_{\text{Error Term}} \qquad (1)$$
The model relies on the following assumptions:

A1 Linearity in parameters
A2 Zero conditional mean of the error term: $E(U \mid X) = 0$
A3 $(X_i, Y_i)$, $i = 1, \dots, N$, are independently and identically distributed
A4 Finite fourth moments (no large outliers): $0 < E(X^4) < \infty$ and $0 < E(Y^4) < \infty$
A5 Homoskedasticity: $Var(U_i \mid X_i) = \sigma_U^2$
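As a concrete illustration of the functional form (1) and assumptions A1-A5, here is a minimal Python sketch that simulates such a sample; the parameter values, sample size, and distributions are arbitrary choices for the example, not taken from the note.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative values (not from the note)
beta0, beta1, sigma_u, N = 2.0, 0.5, 1.0, 200

X = rng.normal(loc=10.0, scale=3.0, size=N)      # regressor
U = rng.normal(loc=0.0, scale=sigma_u, size=N)   # error term: E(U|X) = 0, homoskedastic
Y = beta0 + beta1 * X + U                        # functional form (1)
```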
Before going further, it is useful to establish a few algebraic identities that will simplify the derivations below.
$$\sum_{i=1}^{N}\left(X_i - \bar{X}\right) = 0 \qquad (2)$$
\begin{align*}
\sum_{i=1}^{N} X_i\left(X_i - \bar{X}\right) &= \sum_{i=1}^{N}\left(X_i - \bar{X} + \bar{X}\right)\left(X_i - \bar{X}\right) \\
&= \sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2 + \bar{X}\sum_{i=1}^{N}\left(X_i - \bar{X}\right) \\
&\overset{(2)}{=} \sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2
\end{align*}
$$\Rightarrow \sum_{i=1}^{N} X_i\left(X_i - \bar{X}\right) = \sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2 \qquad (3)$$
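Identities (2) and (3) are easy to confirm numerically; the following Python sketch checks them on an arbitrary simulated sample.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=50)
Xbar = X.mean()

# Identity (2): deviations from the mean sum to zero
print(np.isclose(np.sum(X - Xbar), 0.0))
# Identity (3): sum of X_i*(X_i - Xbar) equals the sum of squared deviations
print(np.isclose(np.sum(X * (X - Xbar)), np.sum((X - Xbar) ** 2)))
```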
2 OLS Estimator
The OLS estimators $(\hat{\beta}_0, \hat{\beta}_1)$ minimize the sum of squared residuals $\sum_{i=1}^{N}\hat{U}_i^2$, where $\hat{U}_i = Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i$; the first-order conditions (5) and (6) require $\sum_{i=1}^{N}\hat{U}_i = 0$ and $\sum_{i=1}^{N}\hat{U}_i X_i = 0$. From (5), we have:
$$\sum_{i=1}^{N}\hat{U}_i = 0 \;\Leftrightarrow\; \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X} \qquad (7)$$
From (6), we have:
\begin{align*}
\sum_{i=1}^{N}\hat{U}_i X_i &= 0 \\
\overset{(5)}{\Leftrightarrow}\quad \sum_{i=1}^{N}\left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\right)\left(X_i - \bar{X}\right) &= 0 \\
\overset{(7)}{\Leftrightarrow}\quad \sum_{i=1}^{N}\left[Y_i - \bar{Y} - \hat{\beta}_1\left(X_i - \bar{X}\right)\right]\left(X_i - \bar{X}\right) &= 0 \\
\Leftrightarrow\quad \sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)\left(X_i - \bar{X}\right) &= \hat{\beta}_1\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2
\end{align*}
$$\Leftrightarrow \hat{\beta}_1 = \frac{\sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)\left(X_i - \bar{X}\right)}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \qquad (8)$$
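A minimal sketch of formulas (7) and (8) in Python, with the result checked against numpy's built-in least-squares fit; the data-generating values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100
X = rng.normal(10.0, 3.0, size=N)
Y = 2.0 + 0.5 * X + rng.normal(0.0, 1.0, size=N)

# Equation (8): slope estimator
beta1_hat = np.sum((Y - Y.mean()) * (X - X.mean())) / np.sum((X - X.mean()) ** 2)
# Equation (7): intercept estimator
beta0_hat = Y.mean() - beta1_hat * X.mean()

# Compare with numpy's least-squares fit (coefficients returned highest degree first)
slope, intercept = np.polyfit(X, Y, deg=1)
print(np.allclose([beta1_hat, beta0_hat], [slope, intercept]))
```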
Moreover, from (7) and (8) we can derive some further useful expressions:
\begin{align*}
\hat{\beta}_1 &= \frac{\sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)\left(X_i - \bar{X}\right)}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \\
\overset{(1)}{\Leftrightarrow}\quad \hat{\beta}_1 &= \frac{\sum_{i=1}^{N}\left[\left(\beta_0 + \beta_1 X_i + U_i\right) - \bar{Y}\right]\left(X_i - \bar{X}\right)}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \\
\overset{(2)}{\Leftrightarrow}\quad \hat{\beta}_1 &= \beta_1\frac{\sum_{i=1}^{N}X_i\left(X_i - \bar{X}\right)}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} + \frac{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)U_i}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \\
\overset{(3)}{\Leftrightarrow}\quad \hat{\beta}_1 &= \beta_1 + \frac{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)U_i}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \qquad (9)
\end{align*}

$$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N}\left(\beta_0 + \beta_1 X_i + U_i\right) = \beta_0 + \beta_1\bar{X} + \bar{U} \qquad (10)$$

\begin{align*}
\hat{\beta}_0 &= \bar{Y} - \hat{\beta}_1\bar{X} \\
\overset{(10)}{\Leftrightarrow}\quad \hat{\beta}_0 &= \beta_0 + \left(\beta_1 - \hat{\beta}_1\right)\bar{X} + \bar{U} \qquad (11)
\end{align*}
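Since the errors are observed in a simulation, the decomposition (9) can be verified directly; a Python sketch with arbitrary parameters:

```python
import numpy as np

rng = np.random.default_rng(3)
N, beta0, beta1 = 100, 2.0, 0.5
X = rng.normal(10.0, 3.0, size=N)
U = rng.normal(0.0, 1.0, size=N)
Y = beta0 + beta1 * X + U

beta1_hat = np.sum((Y - Y.mean()) * (X - X.mean())) / np.sum((X - X.mean()) ** 2)

# Equation (9): beta1_hat = beta1 + sum((X_i - Xbar) U_i) / sum((X_i - Xbar)^2)
sampling_error = np.sum((X - X.mean()) * U) / np.sum((X - X.mean()) ** 2)
print(np.isclose(beta1_hat, beta1 + sampling_error))
```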
Remark: when dealing with a conditional expectation such as $E(U_i X_i \mid X_i)$, the variables after the conditioning bar are held fixed and can therefore be treated as constants, so they can be taken out of the expectation operator, e.g. $E(U_i X_i \mid X_i) = X_i E(U_i \mid X_i)$. Under the first four assumptions (A1-A4) of the Simple Linear Regression Model:
\begin{align*}
E\left(\hat{\beta}_1 \mid X_i\right) &\overset{(9)}{=} E\left[\beta_1 + \frac{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)U_i}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \,\middle|\, X_i\right] = \beta_1 + \frac{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)E\left(U_i \mid X_i\right)}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \overset{A2}{=} \beta_1 \\
E\left(\hat{\beta}_0 \mid X_i\right) &\overset{(11)}{=} E\left[\beta_0 + \left(\beta_1 - \hat{\beta}_1\right)\bar{X} + \bar{U} \,\middle|\, X_i\right] = \beta_0 + \left[\beta_1 - E\left(\hat{\beta}_1 \mid X_i\right)\right]\bar{X} + \frac{1}{N}\sum_{i=1}^{N}E\left(U_i \mid X_i\right) \overset{A2}{=} \beta_0
\end{align*}
Hence, the OLS estimators of $\beta_0$ and $\beta_1$ are unbiased:
$$E\left(\hat{\beta}_0 \mid X_i\right) = \beta_0 \quad \text{and} \quad E\left(\hat{\beta}_1 \mid X_i\right) = \beta_1$$
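A small Monte Carlo sketch of the unbiasedness result, holding X fixed across replications so that the simulation conditions on X; the design is an arbitrary illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
N, beta0, beta1, reps = 50, 2.0, 0.5, 5000
X = rng.normal(10.0, 3.0, size=N)          # fixed design: condition on X

b0_draws, b1_draws = [], []
for _ in range(reps):
    U = rng.normal(0.0, 1.0, size=N)       # E(U|X) = 0 (A2)
    Y = beta0 + beta1 * X + U
    b1 = np.sum((Y - Y.mean()) * (X - X.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    b1_draws.append(b1)
    b0_draws.append(b0)

# Averages of the estimates should be close to the true beta0 and beta1
print(np.mean(b0_draws), np.mean(b1_draws))
```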
Moreover, under the fifth assumption (A5), when the error term $U_i$ is homoskedastic, the variances of these estimators are:
\begin{align*}
Var\left(\hat{\beta}_1 \mid X_i\right) &\overset{(9)}{=} Var\left[\beta_1 + \frac{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)U_i}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \,\middle|\, X_i\right] \\
&= \frac{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2 Var\left(U_i \mid X_i\right)}{\left[\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2\right]^2} \\
&= \sigma_U^2\,\frac{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2}{\left[\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2\right]^2}
\end{align*}
$$\Leftrightarrow Var\left(\hat{\beta}_1 \mid X_i\right) = \frac{\sigma_U^2}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \;\overset{P}{\longrightarrow}\; \frac{1}{N}\frac{\sigma_U^2}{Var(X)} \qquad (12)$$
\begin{align*}
Var\left(\hat{\beta}_0 \mid X_i\right) &\overset{(11)}{=} Var\left[\beta_0 + \left(\beta_1 - \hat{\beta}_1\right)\bar{X} + \bar{U} \,\middle|\, X_i\right] \\
&= \bar{X}^2\, Var\left(\beta_1 - \hat{\beta}_1 \mid X_i\right) + Var\left(\bar{U} \mid X_i\right) \\
&= \bar{X}^2\, Var\left(\hat{\beta}_1 \mid X_i\right) + \frac{1}{N}Var\left(U_i \mid X_i\right) \\
&\overset{(12)}{=} \bar{X}^2\,\frac{\sigma_U^2}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} + \frac{1}{N}\sigma_U^2 \\
&= \sigma_U^2\,\frac{\sum_{i=1}^{N}\bar{X}^2 + \sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2}{N\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \\
&\overset{(2)}{=} \sigma_U^2\,\frac{\sum_{i=1}^{N}\left[\left(X_i - \bar{X}\right)^2 + \bar{X}^2 + 2\bar{X}\left(X_i - \bar{X}\right)\right]}{N\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2}
\end{align*}
$$\Leftrightarrow Var\left(\hat{\beta}_0 \mid X_i\right) = \sigma_U^2\,\frac{\sum_{i=1}^{N}X_i^2}{N\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \;\overset{P}{\longrightarrow}\; \frac{\sigma_U^2\, E\left(X^2\right)}{N\, Var(X)} \qquad (13)$$
and
$$Var\left(\hat{\beta}_1 \mid X_i\right) = \frac{\sigma_U^2}{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2} \approx \frac{1}{N}\frac{\sigma_U^2}{Var(X)}$$
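Continuing the same fixed-design Monte Carlo, the simulated variances of the estimators can be compared with the homoskedastic formulas (12) and (13); again, the design is an arbitrary illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
N, beta0, beta1, sigma_u, reps = 50, 2.0, 0.5, 1.0, 20000
X = rng.normal(10.0, 3.0, size=N)          # fixed design: condition on X
Sxx = np.sum((X - X.mean()) ** 2)

b0_draws, b1_draws = [], []
for _ in range(reps):
    U = rng.normal(0.0, sigma_u, size=N)   # homoskedastic errors (A5)
    Y = beta0 + beta1 * X + U
    b1 = np.sum((Y - Y.mean()) * (X - X.mean())) / Sxx
    b0 = Y.mean() - b1 * X.mean()
    b1_draws.append(b1)
    b0_draws.append(b0)

# Equation (12): Var(beta1_hat | X) = sigma_u^2 / sum (X_i - Xbar)^2
print(np.var(b1_draws), sigma_u ** 2 / Sxx)
# Equation (13): Var(beta0_hat | X) = sigma_u^2 * sum X_i^2 / (N * sum (X_i - Xbar)^2)
print(np.var(b0_draws), sigma_u ** 2 * np.sum(X ** 2) / (N * Sxx))
```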
When assumption A5 does not hold, the OLS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ still have a jointly normal sampling distribution in large samples, but the variances of that distribution no longer simplify to (12) and (13) and must be estimated with heteroskedasticity-robust formulas.
It can be seen that the variances of $\hat{\beta}_0$ and $\hat{\beta}_1$ converge to zero as $N \to \infty$, so the estimators are consistent: when $N$ is large, $\hat{\beta}_0$ and $\hat{\beta}_1$ are close to $\beta_0$ and $\beta_1$ with high probability. The larger $Var(X)$ is, the smaller $Var(\hat{\beta}_1)$ is; in other words, the more spread out the sample of $X$ is, the easier it is to trace out the relationship between $Y$ and $X$. Moreover, the smaller $Var(U)$ is, the smaller $Var(\hat{\beta}_1)$ is: if the errors $U$ are smaller, the data scatter more tightly around the regression line, so its slope can be estimated more accurately.
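A brief sketch of the consistency claim: as N grows, the slope estimate settles near the true β1 (the parameters below are arbitrary).

```python
import numpy as np

rng = np.random.default_rng(6)
beta0, beta1 = 2.0, 0.5

for N in (50, 500, 5000, 50000):
    X = rng.normal(10.0, 3.0, size=N)
    Y = beta0 + beta1 * X + rng.normal(0.0, 1.0, size=N)
    b1 = np.sum((Y - Y.mean()) * (X - X.mean())) / np.sum((X - X.mean()) ** 2)
    print(N, b1)   # estimates approach beta1 = 0.5 as N increases
```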
4 Goodness-of-fit
The goodness-of-fit of the model is measured by $R^2$ (R-squared), which shows how well the model explains the variation of the dependent variable in the sample. It is the ratio of the explained variation to the total variation. Define the following terms:
$$\text{Total Sum of Squares: } TSS = \sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)^2 \qquad (15)$$

$$\text{Explained Sum of Squares: } ESS = \sum_{i=1}^{N}\left(\hat{Y}_i - \bar{Y}\right)^2 \qquad (16)$$

$$\text{Sum of Squared Residuals: } SSR = \sum_{i=1}^{N}\left(Y_i - \hat{Y}_i\right)^2 = \sum_{i=1}^{N}\hat{U}_i^2 \qquad (17)$$
Note that the cross term between the residuals and the explained deviations is zero:
\begin{align*}
\sum_{i=1}^{N}\left(Y_i - \hat{Y}_i\right)\left(\hat{Y}_i - \bar{Y}\right) &= \sum_{i=1}^{N}\hat{U}_i\left(\hat{Y}_i - \bar{Y}\right) = -\bar{Y}\sum_{i=1}^{N}\hat{U}_i + \sum_{i=1}^{N}\hat{U}_i\left(\hat{\beta}_0 + \hat{\beta}_1 X_i\right) \\
&\overset{(5),(6)}{=} -\bar{Y}\sum_{i=1}^{N}\hat{U}_i + \hat{\beta}_0\sum_{i=1}^{N}\hat{U}_i + \hat{\beta}_1\sum_{i=1}^{N}\hat{U}_i X_i = 0
\end{align*}
hence,
$$\sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)^2 = \sum_{i=1}^{N}\left(Y_i - \hat{Y}_i\right)^2 + \sum_{i=1}^{N}\left(\hat{Y}_i - \bar{Y}\right)^2 \;\Leftrightarrow\; TSS = ESS + SSR \qquad (18)$$
so that $R^2 = ESS/TSS = 1 - SSR/TSS$.
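A Python sketch computing TSS, ESS, SSR and R² for a simulated sample, confirming the decomposition (18); the data-generating values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 100
X = rng.normal(10.0, 3.0, size=N)
Y = 2.0 + 0.5 * X + rng.normal(0.0, 1.0, size=N)

b1 = np.sum((Y - Y.mean()) * (X - X.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
Y_hat = b0 + b1 * X

TSS = np.sum((Y - Y.mean()) ** 2)       # (15)
ESS = np.sum((Y_hat - Y.mean()) ** 2)   # (16)
SSR = np.sum((Y - Y_hat) ** 2)          # (17)

print(np.isclose(TSS, ESS + SSR))       # decomposition (18)
print("R-squared:", ESS / TSS)
```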
The Simple Linear Regression (SLR) model performs poorly in certain cases, especially when one of its assumptions doesn't hold:
• SLR estimated by OLS does poorly when there are outliers (A4 doesn't hold)
• OLS does not by itself imply a causal effect, so without further assumptions the coefficients should be interpreted carefully
• The linear regression does poorly at very low or very high levels of the regressor, since the quality of the fit depends on the spread of the sample