04 Inference

Chapter 4: Multiple Regression Analysis – Inference
Econometrics
Michal Houda
University of South Bohemia in České Budějovice

Department of Applied Mathematics and Informatics

Sampling Distributions of the OLS Estimators
Classical Linear Model (CLM) Assumptions
Assumption MLR 6 (Normality)

ε ∼ N(0, σ²) and ε is independent of x1, . . . , xk.
Justifying normality
result of the central limit theorem (normality is not an issue with large sample sizes)
under MLR.1–6: OLS estimators are the best (minimum variance) unbiased
estimators (not only best linear)
population assumptions summarized as
y | x ∼ N(β0 + β1 x1 + . . . + βk xk, σ²)
Theorem 1 (Normal Sampling Distribution)

Under MLR.1–6, β̂j ∼ N(βj, var β̂j), that is,
(β̂j − βj) / sd β̂j ∼ N(0, 1).
Testing Hypotheses about a single population parameter
One-Sided and Two-Sided t-Test
Corollary 2 (t Distribution for the standardized estimators)

Under MLR.1–6,
(β̂j − βj) / se β̂j ∼ tn−k−1.
Statistical packages usually provide the t-ratio tβ̂j := β̂j / se β̂j automatically ⇒ tests of
H0 : βj = 0 against HA : βj ≠ 0
(two-sided tests) are straightforward.

One-sided alternatives should be considered in econometrics.

Right-Tailed t-Test
One-sided alternative — right-tailed test:
H 0 : βj = 0 against H A : βj > 0
significance level α: probability of rejecting H0 when it is true (the most popular choice: α := 0.05 = 5 %)
critical value: (1 − α)-percentile (quantile) of the appropriate distribution
(tn−k−1 here)
rejection rule: tβ̂j > tn−k−1 (1 − α)
As the degrees of freedom (n − k − 1) get larger, the t distribution approaches the standard normal distribution N(0, 1)
Compare: t120 (0.95) = 1.658 with u(0.95) = 1.645
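R
# standard normal density (red) as the reference curve
curve(dnorm(x), xlim=c(-5,5), col="red", lwd=4)
# t densities with 5 and 120 degrees of freedom, overlaid on the same plot
curve(dt(x, df=5), col="blue", lwd=2, add=TRUE)
curve(dt(x, df=120), col="green", lwd=1, add=TRUE)
# colors() lists the colour names accepted by col=
colors()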
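The convergence of the critical values can also be checked numerically. The following is a minimal sketch in base R; the chosen degrees of freedom (5, 30, 120, 522) are only illustrative.

```r
# Upper 5 % critical values of the t distribution for growing degrees of freedom
df <- c(5, 30, 120, 522)
t_crit <- qt(0.95, df = df)   # t quantiles
z_crit <- qnorm(0.95)         # standard normal quantile u(0.95)

# The gap to the normal quantile shrinks as df grows
print(data.frame(df = df,
                 t_crit = round(t_crit, 3),
                 gap = round(t_crit - z_crit, 3)))
```

For df = 120 the t quantile should agree with the value 1.658 quoted above.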
Right-Tailed t-Test
Example 3 (Hourly Wage Equation)

Data: WAGE1
Estimated equation: ln(wage) = 0.284 + 0.092 educ + 0.0041 exper + 0.022 tenure
H0 : βexper = 0 against HA : βexper > 0

texper ≈ 0.0041/0.0017 ≈ 2.39 > t522 (0.95) = 1.648 (or u(0.95) = 1.645)
p-value = Pr{texper > 2.39} = (1/2) · 0.0171 ≈ 0.0085
H0 rejected at α = 5 % (even at 1 %) . . . the effect of experience on wages is statistically significant
But: the estimated return to experience is not large; for example, 3 additional
years of experience raise the predicted wage by only 3 × 0.0041 = 0.0123, i.e. about 1.23 %
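The right-tailed decision in Example 3 can be reproduced from the reported t-ratio alone. A minimal sketch in base R, assuming only the figures given above (t ≈ 2.39, 522 degrees of freedom); the variable names are illustrative.

```r
t_exper <- 2.39    # t-ratio reported in Example 3
df      <- 522     # degrees of freedom n - k - 1 from the example
alpha   <- 0.05

crit  <- qt(1 - alpha, df)                     # (1 - alpha)-quantile of t_522
p_one <- pt(t_exper, df, lower.tail = FALSE)   # one-sided p-value Pr{t > 2.39}
reject <- t_exper > crit                       # right-tailed rejection rule

cat("critical value:", round(crit, 3), "\n")
cat("one-sided p-value:", round(p_one, 4), "\n")
cat("reject H0 at 5 %:", reject, "\n")
```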

Left-Tailed t-Test
One-sided alternative — left-tailed test:
H 0 : βj = 0 against H A : βj < 0
critical value: α-percentile (quantile) of the appropriate distribution

rejection rule: tβ̂j < tn−k−1 (α)

Left-Tailed t-Test
Example 4 (Student Performance and School Size)

Data: MEAP93
Estimated equation: math10 = 2.274 + 0.00046 totcomp + 0.048 staff − 0.00020 enroll
math10 . . . percentage of students passing the Michigan Educational Assessment Program (MEAP) standardized 10th-grade math test
totcomp . . . average annual teacher compensation (measure of teacher quality)
staff . . . number of staff per 1000 students (measure of attention received)
enroll . . . student enrollment (measure of school size)
H0 : βenroll = 0 against HA : βenroll < 0

tenroll ≈ −0.918 > t404 (0.05) = −1.649
p-value = Pr{tenroll < −0.918} = (1/2) · 0.36 = 0.18
H0 not rejected at 5 % (even at 15 %); changing the model:
Estimated equation: math10 = 2.274 + 0.00046 ln(totcomp) + 0.048 ln(staff) − 0.00020 ln(enroll)
tln(enroll) ≈ −1.829 < t404 (0.05) = −1.649 ⇒ H0 rejected at 5 %!
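The two left-tailed decisions in Example 4 follow from the reported t-ratios and 404 degrees of freedom. A minimal sketch in base R; only the t-ratios quoted above are used, and the variable names are illustrative.

```r
df    <- 404
alpha <- 0.05
crit  <- qt(alpha, df)     # alpha-quantile of t_404 (negative)

t_level <- -0.918          # t-ratio of enroll in the level model
t_log   <- -1.829          # t-ratio of ln(enroll) in the log model

# Left-tailed p-values: probability of a t-ratio at least this negative
p_level <- pt(t_level, df)
p_log   <- pt(t_log, df)

cat("critical value:", round(crit, 3), "\n")
cat("level model: p =", round(p_level, 3), " reject:", t_level < crit, "\n")
cat("log model:   p =", round(p_log, 3), " reject:", t_log < crit, "\n")
```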
Two-sided t-Test
H0 : βj = 0 against HA : βj ≠ 0
critical value: (1 − α/2)-percentile (quantile) of the appropriate distribution
rejection rule: |tβ̂j| > tn−k−1 (1 − α/2)
Example 5 (Determinants of College GPA)

Data: GPA1
Estimated equation: colGPA = 1.39 + 0.412 hsGPA + 0.015 ACT − 0.083 skipped
skipped . . . average number of lectures missed per week
1 H0 : βhsGPA = 0 against HA : βhsGPA ≠ 0: thsGPA = 4.396, p-value ≈ 10⁻⁵ ⇒ H0 rejected at any conventional level;
2 H0 : βACT = 0 against HA : βACT ≠ 0: tACT = 1.393, t137 (0.95) = 1.656, p-value = 0.166 ⇒ H0 not rejected at 10 %; the estimated effect is also small in practice;
3 H0 : βskipped = 0 against HA : βskipped ≠ 0: tskipped = −3.197, t137 (0.995) = 2.612, p-value = 0.0017 ⇒ H0 rejected at 1 %; but the effect is small in practice!
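The two-sided p-values in Example 5 can be recomputed from the reported t-ratios. A minimal sketch in base R, assuming the 137 degrees of freedom used above; `two_sided_p` is a helper name introduced here for illustration.

```r
# Two-sided p-value: probability mass in both tails beyond |t|
two_sided_p <- function(t, df) 2 * pt(-abs(t), df)

df     <- 137
t_vals <- c(hsGPA = 4.396, ACT = 1.393, skipped = -3.197)
p_vals <- two_sided_p(t_vals, df)

# Two-sided rejection rule at 10 % and 1 %
crit_10 <- qt(0.95, df)
crit_01 <- qt(0.995, df)

print(signif(p_vals, 3))
print(abs(t_vals) > crit_10)   # rejected at 10 %?
print(abs(t_vals) > crit_01)   # rejected at 1 %?
```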
Tests against Other Alternatives
H0 : βj = aj against HA : βj ≠ aj (or βj > aj, or βj < aj)
Example 6 (Campus Crime and Student Enrollment)

Data: CAMPUS (FBI’s Uniform Crime Report for 1992, n = 97)
ln(crime) = β0 + β1 ln(enroll) + ε
Estimated equation: ln(crime) = −6.63 + 1.27 ln(enroll)
H0 : β1 = 1 against HA : β1 > 1
(crime is more of a problem on larger campuses)
tln(enroll) ≈ (1.27 − 1)/0.11 ≈ 2.46 > t95 (0.95) = 1.66
p-value = Pr{tln(enroll) > 2.46} ≈ 0.0079
H0 rejected (at 1 %)

warning: this analysis holds no other factor fixed ⇒ elasticity 1.27 is not necessarily
a good estimate of ceteris paribus effect.
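When the hypothesized value is not zero, the t-ratio printed by the software cannot be used directly; the statistic has to be recentred at the hypothesized value. A minimal sketch in base R using the rounded figures of Example 6 (estimate 1.27, standard error 0.11, 95 degrees of freedom); the variable names are illustrative.

```r
b1_hat <- 1.27   # estimated elasticity of crime with respect to enrollment
a1     <- 1      # hypothesized value under H0
se_b1  <- 0.11   # standard error of the estimate
df     <- 95     # n - k - 1 = 97 - 1 - 1

t_stat <- (b1_hat - a1) / se_b1               # recentred t statistic
p_one  <- pt(t_stat, df, lower.tail = FALSE)  # right-tailed p-value
crit   <- qt(0.95, df)

cat("t =", round(t_stat, 2), " critical value =", round(crit, 2), "\n")
cat("one-sided p-value =", round(p_one, 4), "\n")
```

Because the inputs are rounded, the p-value may differ slightly from the one quoted above.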
Economic (Practical) vs. Statistical Significance
Example 7 (Participation Rates in 401(k) Plans)
Estimated equation: prate = 80.29 + 5.44 mrate + 0.269 age − 0.00013 totemp
totemp . . . total number of employees (firm size)
H0 : βtotemp = 0 against HA : βtotemp ≠ 0
ttotemp ≈ −3.25, p-value ≈ 0.001
H0 rejected (even at 0.1 %) ⇒ βtotemp statistically significant

but: holding mrate, age fixed, +10,000 employees ⇒ only 1.3 percentage
point decrease in participation rate — not practically very large (not
economically significant)

Confidence Intervals

95% confidence interval for βj : β̂j ± tn−k−1 (0.975) · se β̂j
interpretation: the (known) interval (β̲j ; β̄j) covers the unknown βj with 95 % probability, i.e. in 95 % of random samples
we only hope that our sample is one of these 95 %
connection with H0 : βj = aj against HA : βj ≠ aj: H0 is rejected at (say) the 5 % level ⇔ aj ∉ (β̲j ; β̄j)
R confint(model)
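When only an estimate and its standard error are available, the interval can be computed by hand. A minimal sketch in base R; `ci_t` is a helper name introduced here, and the inputs are the rounded exper figures from Example 3 (0.0041, 0.0017, 522 degrees of freedom), used purely as an illustration.

```r
# Confidence interval: estimate +/- t quantile * standard error
ci_t <- function(est, se, df, level = 0.95) {
  q <- qt(1 - (1 - level) / 2, df)   # e.g. the 0.975-quantile for level = 0.95
  c(lower = est - q * se, upper = est + q * se)
}

ci <- ci_t(est = 0.0041, se = 0.0017, df = 522)
print(round(ci, 5))

# Link to the two-sided test of H0: beta_j = 0 at the 5 % level:
# H0 is rejected exactly when 0 lies outside the interval
zero_outside <- ci["lower"] > 0 || ci["upper"] < 0
print(unname(zero_outside))
```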

Testing a Single Linear Combination of Parameters
Example 8 (Return to Education)
ln(wage) = β0 + β1 jc + β2 univ + β3 exper + ε

jc . . . # years attending a two-year college
univ . . . # years attending a four-year college
exper . . . # months in the workforce
H 0 : β1 = β 2 against HA : β1 < β2
cannot simply use the individual t statistics, as se(β̂1 − β̂2) ≠ se β̂1 − se β̂2
standard error estimated by se(β̂1 − β̂2) = √[(se β̂1)² + (se β̂2)² − 2 ĉov(β̂1, β̂2)]
(sometimes reported by the software).
easier technique: define θ1 := β1 − β2 , totcoll := jc + univ , and rewrite the model
ln(wage) = β0 + θ1 jc + β2 totcoll + β3 exper + ε
H0 : θ1 = 0 against HA : θ1 < 0
tθ̂1 ≈ −1.48, p-value ≈ 0.070 (data: TWOYEAR)
⇒ H0 is not rejected at 5 % (it is rejected at 10 %): there is some, but not strong,
evidence that a year at a two-year college is worth less than a year at a four-year college.
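Both routes to the standard error of β̂1 − β̂2 give the same number, which can be checked on artificial data. The sketch below uses base R only; the data are simulated with made-up coefficients and are not the TWOYEAR data, so the estimates themselves carry no meaning.

```r
set.seed(1)
n     <- 500
jc    <- rpois(n, 1)              # simulated years at a two-year college
univ  <- rpois(n, 2)              # simulated years at a four-year college
exper <- runif(n, 0, 200)         # simulated months in the workforce
lwage <- 1.5 + 0.06 * jc + 0.08 * univ + 0.005 * exper + rnorm(n, sd = 0.4)

# Route 1: original model, standard error from the covariance matrix
fit1 <- lm(lwage ~ jc + univ + exper)
b    <- coef(fit1)
V    <- vcov(fit1)
theta_hat <- unname(b["jc"] - b["univ"])
se_theta  <- sqrt(V["jc", "jc"] + V["univ", "univ"] - 2 * V["jc", "univ"])

# Route 2: reparametrized model, theta1 is the coefficient on jc
totcoll <- jc + univ
fit2 <- lm(lwage ~ jc + totcoll + exper)
tab2 <- summary(fit2)$coefficients

cat("route 1:", theta_hat, se_theta, "\n")
cat("route 2:", tab2["jc", "Estimate"], tab2["jc", "Std. Error"], "\n")
```

The second route has the practical advantage that the t-ratio and p-value for θ1 appear directly in the regression output.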
Multiple Regression Analysis – Statistical Inference
Testing Multiple Linear Restrictions: F test
Example 9 (Baseball Players’ Salaries)

ln(salary ) = β0 + β1 years + β2 gamesyr + β3 bavg + β4 hrunsyr + β5 rbisyr + ε
salary . . . 1993 total salary
years . . . # years in the league
gamesyr . . . average # games played per year
bavg . . . career batting average
hrunsyr . . . # home runs per year
rbisyr . . . runs batted in per year
             Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)  1.119e+01  2.888e-01   38.752   < 2e-16   ***
years        6.886e-02  1.211e-02    5.684   2.79e-08  ***
gamesyr      1.255e-02  2.647e-03    4.742   3.09e-06  ***
bavg         9.786e-04  1.104e-03    0.887   0.376
hrunsyr      1.443e-02  1.606e-02    0.899   0.369
rbisyr       1.077e-02  7.175e-03    1.500   0.134
H0 : β3 = β4 = β5 = 0 against HA : nonH0
. . . multiple (joint, three) exclusion restrictions


Data: MLB1 (n = 353)
Restricted model: ln(salary ) = β0 + β1 years + β2 gamesyr + ε
Test statistic: F-ratio
F := [(SSRr − SSRur)/q] / [SSRur/(n − k − 1)] ∼ Fq,n−k−1
In the example, F ≈ 9.55, p-value ≈ 4 · 10⁻⁶ ⇒ H0 rejected.
Note that none of the three t-statistics is individually significant!
(Reason: corr(hrunsyr, rbisyr) ≈ 0.89).
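The F-ratio and its p-value are easy to compute by hand. The sketch below, in base R, first evaluates the p-value for the reported F ≈ 9.55 with q = 3 and n − k − 1 = 347, and then checks the SSR formula against `anova()` on simulated data. `f_ratio` is a helper name introduced here, and the simulated variables are hypothetical, not the MLB1 data.

```r
# F-ratio from restricted and unrestricted sums of squared residuals
f_ratio <- function(ssr_r, ssr_ur, q, df_ur) {
  ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)
}

# p-value for Example 9 from the reported F statistic
n <- 353; k <- 5; q <- 3
p_val <- pf(9.55, q, n - k - 1, lower.tail = FALSE)
cat("p-value:", signif(p_val, 2), "\n")

# Check of the formula on simulated data (hypothetical variables x1, x2, x3)
set.seed(42)
x1 <- rnorm(200); x2 <- rnorm(200); x3 <- rnorm(200)
y  <- 1 + 0.5 * x1 + rnorm(200)
fit_ur <- lm(y ~ x1 + x2 + x3)   # unrestricted model
fit_r  <- lm(y ~ x1)             # restricted model: x2 and x3 excluded

F_manual <- f_ratio(sum(resid(fit_r)^2), sum(resid(fit_ur)^2),
                    q = 2, df_ur = df.residual(fit_ur))
F_anova  <- anova(fit_r, fit_ur)$F[2]
cat("manual F:", F_manual, " anova F:", F_anova, "\n")
```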
Overall significance test: H0 : β1 = . . . = βk = 0
H0 often rejected, even if R² is small

occasionally, the overall F is the focus of a study (e. g., to test whether some variable is
predictable based on selected factors — cf. efficient markets hypothesis)
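For the overall significance test the restricted model contains only the intercept, and the F-ratio can be written in the standard equivalent form F = (R²/k) / ((1 − R²)/(n − k − 1)). The sketch below, in base R, shows how a small R² can still give a decisive rejection in a large sample; the numbers (R² = 0.05, n = 1000, k = 3) are hypothetical and `f_overall` is a helper name introduced here.

```r
# Overall F statistic written in terms of R-squared
f_overall <- function(r2, n, k) (r2 / k) / ((1 - r2) / (n - k - 1))

r2 <- 0.05; n <- 1000; k <- 3            # hypothetical values
F_stat <- f_overall(r2, n, k)
p_val  <- pf(F_stat, k, n - k - 1, lower.tail = FALSE)

cat("F =", round(F_stat, 2), "\n")
cat("p-value =", signif(p_val, 2), "\n")
```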
Testing General Linear Restrictions: F test
Example 10
ln(price) = β0 + β1 ln(assess) + β2 ln(lotsize) + β3 ln(sqrft) + β4 bdrms + ε
price . . . house price
assess . . . the assessed housing value (before the sale)
lotsize . . . size of the lot (in square feet)
sqrft . . . square footage
bdrms . . . number of bedrooms
Data: HPRICE1, n = 88
              Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)    0.263743  0.569665     0.463   0.645
log(assess)    1.043065  0.151446     6.887   1.01e-09  ***
log(lotsize)   0.007438  0.038561     0.193   0.848
log(sqrft)    -0.103238  0.138430    -0.746   0.458
bdrms          0.033839  0.022098     1.531   0.129
Is the assessed housing price a rational valuation?
H0 : β1 = 1, β2 = β3 = β4 = 0
F ≈ 0.661, p-value≈ 0.62 ⇒ failed to reject H0 .

There is no evidence against rational valuation.
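The conclusion of Example 10 can be checked from the reported F statistic. A minimal sketch in base R, assuming q = 4 restrictions and n − k − 1 = 88 − 4 − 1 = 83 degrees of freedom; the variable names are illustrative.

```r
F_stat <- 0.661          # reported F statistic
q      <- 4              # number of restrictions in H0
df_ur  <- 88 - 4 - 1     # n - k - 1 of the unrestricted model

p_val <- pf(F_stat, q, df_ur, lower.tail = FALSE)   # upper-tail p-value
crit  <- qf(0.95, q, df_ur)                         # 5 % critical value

cat("p-value:", round(p_val, 2), "\n")
cat("5 % critical value:", round(crit, 2), "\n")
cat("reject H0:", F_stat > crit, "\n")
```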
