Introduction to Econometrics, 5th Edition: Chapter 2: Properties of the Regression Coefficients and Hypothesis Testing



Dougherty

Introduction to Econometrics,
5th edition
Chapter 2: Properties of the
Regression Coefficients and
Hypothesis Testing

© Christopher Dougherty, 2016. All rights reserved.


TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                         Review chapter            Regression model
Model                    X, unknown μ, σ²
Estimator                X̄

This sequence describes the testing of a hypothesis relating to a regression coefficient. It is concerned only with procedures, not with theory.


Hypothesis testing forms a major part of the foundation of econometrics and it is essential
to have a clear understanding of the theory.


The theory, discussed in sections R.9 to R.11 of the Review chapter, is non-trivial and
requires careful study. This sequence is purely mechanical and is not in any way a
substitute.

If you do not understand, for example, the trade-off between the size (significance level)
and the power of a test, you should study the material in those sections before looking at
this sequence.

In our standard example in the Review chapter, we had a random variable X with unknown population mean μ and variance σ². Given a sample of data, we used the sample mean X̄ as an estimator of μ.

                         Review chapter            Regression model
Model                    X, unknown μ, σ²          Y = β₁ + β₂X + u
Estimator                X̄                         β̂₂ = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)²

In the context of the regression model, we have unknown parameters β₁ and β₂, and we have derived estimators β̂₁ and β̂₂ for them. In what follows, we shall focus on β₂ and its estimator β̂₂.

Null hypothesis          H₀: μ = μ₀                H₀: β₂ = β₂⁰
Alternative hypothesis   H₁: μ ≠ μ₀                H₁: β₂ ≠ β₂⁰

In the case of the random variable X, our standard null hypothesis was that μ was equal to some specific value μ₀. In the case of the regression model, our null hypothesis is that β₂ is equal to some specific value β₂⁰.


Test statistic           t = (X̄ − μ₀) / s.e.(X̄)    t = (β̂₂ − β₂⁰) / s.e.(β̂₂)

For both the population mean μ of the random variable X and the regression coefficient β₂, the test statistic is a t statistic.


In both cases, it is defined as the difference between the estimated coefficient and its
hypothesized value, divided by the standard error of the coefficient.
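This common form of the test statistic can be sketched in Python (the function name is my own; the numbers anticipate the wage-inflation example later in this sequence):

```python
# Minimal sketch of the t statistic shared by both cases:
# (estimated coefficient - hypothesized value) / standard error.
def t_statistic(estimate, hypothesized_value, standard_error):
    return (estimate - hypothesized_value) / standard_error

# Slope estimate 0.82, null value 1.00, standard error 0.10
print(t_statistic(0.82, 1.00, 0.10))  # about -1.80
```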


Reject H₀ if             |t| > t_crit              |t| > t_crit

We reject the null hypothesis if the absolute value of the t statistic is greater than the critical value of t, given the chosen significance level.


Degrees of freedom       n − 1                     n − k = n − 2

There is one important difference. When locating the critical value of t, one must take
account of the number of degrees of freedom. In the case of the random variable X, this is
n – 1, where n is the number of observations in the sample.

In the case of the regression model, the number of degrees of freedom is n – k, where n is the number of observations in the sample and k is the number of parameters (β coefficients). For the simple regression model above, it is n – 2.
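As a sketch, the critical value for a given significance level and degrees of freedom can be looked up with scipy (assuming scipy is available) rather than a printed table:

```python
from scipy.stats import t

n = 20                            # observations, as in the example that follows
k = 2                             # parameters in the simple regression model
df = n - k                        # degrees of freedom = 18
t_crit = t.ppf(1 - 0.05 / 2, df)  # two-sided critical value at the 5% level
print(round(t_crit, 3))           # about 2.101, matching the printed t tables
```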

Example: p = β₁ + β₂w + u
Null hypothesis: H₀: β₂ = 1.0
Alternative hypothesis: H₁: β₂ ≠ 1.0

As an illustration, we will consider a model relating price inflation to wage inflation. p is the
percentage annual rate of growth of prices and w is the percentage annual rate of growth of
wages.

We will test the hypothesis that the rate of price inflation is equal to the rate of wage inflation. The null hypothesis is therefore H₀: β₂ = 1.0. (We should also test β₁ = 0.)


p̂ = 1.21 + 0.82w
    (0.05)  (0.10)

Suppose that the regression result is as shown (standard errors in parentheses). Our
actual estimate of the slope coefficient is only 0.82. We will check whether we should reject
the null hypothesis.

t = (β̂₂ − β₂⁰) / s.e.(β̂₂) = (0.82 − 1.00) / 0.10 = −1.80

We compute the t statistic by subtracting the hypothetical true value from the sample
estimate and dividing by the standard error. It comes to –1.80.


n  20 degrees of freedom  18 t crit, 5%  2.101

There are 20 observations in the sample. We have estimated 2 parameters, so there are 18
degrees of freedom.


The critical value of t with 18 degrees of freedom is 2.101 at the 5% level. The absolute
value of the t statistic is less than this, so we do not reject the null hypothesis.
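The whole decision rule for this example can be sketched as follows (scipy assumed available; the numbers are those of the wage-inflation example):

```python
from scipy.stats import t

b2_hat, b2_null, se = 0.82, 1.00, 0.10
t_stat = (b2_hat - b2_null) / se   # t statistic, about -1.80
t_crit = t.ppf(1 - 0.05 / 2, 18)   # critical value at the 5% level, 18 df
print(abs(t_stat) > t_crit)        # False: do not reject H0 at the 5% level
```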


Model Y = β₁ + β₂X + u

In practice it is unusual to have a feeling for the actual value of the coefficients. Very often
the objective of the analysis is to demonstrate that Y is influenced by X, without having any
specific prior notion of the actual coefficients of the relationship.
Null hypothesis: H₀: β₂ = 0
Alternative hypothesis: H₁: β₂ ≠ 0

In this case it is usual to define β₂ = 0 as the null hypothesis. In words, the null hypothesis is that X does not influence Y. We then try to demonstrate that the null hypothesis is false.


t = (β̂₂ − 0) / s.e.(β̂₂) = β̂₂ / s.e.(β̂₂)

For the null hypothesis β₂ = 0, the t statistic reduces to the estimate of the coefficient divided by its standard error.


This ratio is commonly called the t statistic for the coefficient and it is automatically printed
out as part of the regression results. To perform the test for a given significance level, we
compare the t statistic directly with the critical value of t for that significance level.
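The mechanics behind a printed t statistic can be reproduced by hand, using the estimator formula given earlier. A sketch with simulated data (the sample size, true coefficients, and disturbance variance here are invented purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50
X = rng.uniform(5, 20, n)                # invented regressor
Y = 1.0 + 0.8 * X + rng.normal(0, 2, n)  # invented true model with disturbance u

x_dev = X - X.mean()
b2 = x_dev @ (Y - Y.mean()) / (x_dev @ x_dev)  # OLS slope estimator
b1 = Y.mean() - b2 * X.mean()
resid = Y - b1 - b2 * X
s2 = resid @ resid / (n - 2)             # residual variance, n - k df
se_b2 = np.sqrt(s2 / (x_dev @ x_dev))    # standard error of the slope
t_stat = b2 / se_b2                      # t statistic for H0: beta2 = 0
print(round(b2, 3), round(se_b2, 3), round(t_stat, 2))
```

This is the same ratio, coefficient over standard error, that regression software prints automatically.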

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

Here is the output from the earnings function fitted in a previous slideshow, with the t
statistics highlighted.


You can see that the t statistic for the coefficient of S is enormous. We would reject the null
hypothesis that schooling does not affect earnings at the 1% significance level (critical
value about 2.59).

In this case we could go further and reject the null hypothesis that schooling does not
affect earnings at the 0.1% significance level.


The advantage of reporting rejection at the 0.1% level, instead of the 1% level, is that the
risk of mistakenly rejecting the null hypothesis of no effect is now only 0.1% instead of 1%.
The result is therefore even more convincing.

We have seen that the intercept does not have any plausible meaning, so it does not make
sense to perform a t test on it.


The next column in the output gives what are known as the p values for each coefficient. A p value is the probability of obtaining a t statistic at least as large in absolute value as the one actually obtained, as a matter of chance, if the null hypothesis H₀: β = 0 is true.

If you reject the null hypothesis H₀: β = 0, the p value is the probability that you are mistakenly rejecting a true null hypothesis, that is, making a Type I error. It therefore gives the significance level at which the null hypothesis would just be rejected.

If p = 0.05, the null hypothesis could just be rejected at the 5% level. If it were 0.01, it could
just be rejected at the 1% level. If it were 0.001, it could just be rejected at the 0.1% level. This
is assuming that you are using two-sided tests.
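The correspondence between a t statistic and its two-sided p value can be sketched with scipy (the degrees of freedom, 498, are those of the earnings regression above; the helper function is my own):

```python
from scipy.stats import t

def two_sided_p(t_stat, df):
    # Probability of a t statistic at least this large in absolute value,
    # by chance, if the null hypothesis is true (two-sided test).
    return 2 * (1 - t.cdf(abs(t_stat), df))

print(round(two_sided_p(6.82, 498), 3))  # 0.0 to three decimal places, as for S
print(round(two_sided_p(0.27, 498), 2))  # roughly 0.79, close to the _cons value
```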

In the present case p = 0 to three decimal places for the coefficient of S. This means that we can reject the null hypothesis H₀: β₂ = 0 at the 0.1% level, without having to refer to the table of critical values of t. (Testing the intercept does not make sense in this regression.)

The use of p values is a more informative approach to reporting the results of tests. It is widely used in the medical literature.


However, in economics, standard practice is to report results referring to 5% and 1% significance levels, and sometimes to the 0.1% level (when one can reject at that level).

Copyright Christopher Dougherty 2016.

These slideshows may be downloaded by anyone, anywhere for personal use.

Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.

The content of this slideshow comes from Section 2.6 of C. Dougherty, Introduction to Econometrics, fifth edition 2016, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oxfordtextbooks.co.uk/orc/dougherty5e/.

Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2016.04.19
