Regression Example


Autumn Voiku is attempting to forecast sales for Brookfield Farms based on a
multiple regression model. Voiku has constructed the following model:


$\text{sales} = b_0 + (b_1 \times \text{CPI}) + (b_2 \times \text{IP}) + (b_3 \times \text{GDP}) + \varepsilon_t$
Where:
sales = $ change in sales (in 000s)
CPI = change in the consumer price index
IP = change in industrial production (millions)
GDP = change in GDP (millions)
All changes in variables are in percentage terms.
Voiku uses monthly data from the previous 180 months for sales and for the
independent variables. The model estimates (with coefficient standard errors in
parentheses) are:

$\text{sales} = \underset{(5.4)}{10.2} + (\underset{(3.5)}{4.6} \times \text{CPI}) + (\underset{(5.9)}{5.2} \times \text{IP}) + (\underset{(6.8)}{11.7} \times \text{GDP})$

The sum of squared errors is 140.3 and the total sum of squares is 368.7.
Voiku calculates the unadjusted $R^2$, the adjusted $R^2$, and the standard error of
estimate to be 0.592, 0.597, and 0.910, respectively.
Voiku is concerned that one or more of the assumptions underlying multiple
regression has been violated in her analysis. In a conversation with Dave Grimbles,
CFA, a colleague who is considered by many in the firm to be a quant specialist,
Voiku says, "It is my understanding that there are five assumptions of a multiple
regression model:

Assumption 1: There is a linear relationship between the dependent and independent
variables.
Assumption 2: The independent variables are not random, and there is no correlation
between any two of the independent variables.
Assumption 3: The residual term is normally distributed with an expected value of zero.
Assumption 4: The residuals are serially correlated.
Assumption 5: The variance of the residuals is constant."

Grimbles agrees with Voiku's assessment of the assumptions of multiple regression.


Voiku tests and fails to reject each of the following four null hypotheses at the 99%
confidence level:

Hypothesis 1: The coefficient on GDP is negative.
Hypothesis 2: The intercept term is equal to -4.
Hypothesis 3: A 2.6% increase in the CPI will result in an increase in sales of more than
12.0%.
Hypothesis 4: A 1% increase in industrial production will result in a 1% decrease in sales.
Figure 1: Partial table of the Student's t-distribution (one-tailed probabilities)

df     p = 0.10   p = 0.05   p = 0.025   p = 0.01   p = 0.005
170    1.287      1.654      1.974       2.348      2.605
176    1.286      1.654      1.974       2.348      2.604
180    1.286      1.653      1.973       2.347      2.603

Figure 2: Partial F-table critical values for right-hand tail area equal to 0.05

             df1 = 1   df1 = 3   df1 = 5
df2 = 170    3.90      2.66      2.27
df2 = 176    3.89      2.66      2.27
df2 = 180    3.89      2.65      2.26

Figure 3: Partial F-table critical values for right-hand tail area equal to 0.025

             df1 = 1   df1 = 3   df1 = 5
df2 = 170    5.11      3.19      2.64
df2 = 176    5.11      3.19      2.64
df2 = 180    5.11      3.19      2.64

Concerning the assumptions of multiple regression, Grimbles is:

correct to agree with Voiku's statement of the assumptions.
incorrect to agree with Voiku's list of assumptions because three of the assumptions are stated
incorrectly.
incorrect to agree with Voiku's list of assumptions because two of the assumptions are stated
incorrectly.
incorrect to agree with Voiku's list of assumptions because one of the assumptions is stated
incorrectly.

Explanation
Assumption 2 is stated incorrectly. Some correlation between independent
variables is unavoidable; high correlation results in multicollinearity. An exact
linear relationship between linear combinations of two or more independent
variables should not exist.
Assumption 4 is also stated incorrectly. The assumption is that the residuals
are serially uncorrelated (i.e., they are not serially correlated).
Two of the five assumptions are therefore stated incorrectly.
For which of the four hypotheses did Voiku incorrectly fail to reject the null, based on the data
given in the problem?

Hypothesis 1.
Hypothesis 4.
Hypothesis 3.
Hypothesis 2.

Explanation
The critical values at the 1% level of significance (99% confidence) are 2.348
for a one-tail test and 2.604 for a two-tail test (df = 176).
The t-values for the hypotheses are:
Hypothesis 1: 11.7 / 6.8 = 1.72
Hypothesis 2: (10.2 + 4) / 5.4 = 14.2 / 5.4 = 2.63
Hypothesis 3: 12.0 / 2.6 = 4.6, so the hypothesized coefficient is 4.6, which
equals the estimate, and the t-stat is (4.6 - 4.6) / 3.5 = 0.
Hypothesis 4: (5.2 + 1) / 5.9 = 1.05
Hypotheses 1 and 3 are one-tail tests; 2 and 4 are two-tail tests. Only the
t-stat for Hypothesis 2 exceeds its critical value (2.63 > 2.604), so only
Hypothesis 2 should be rejected.
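The four t-tests can be reproduced with a short Python sketch. It uses only the estimates, standard errors, and Figure 1 critical values given in the problem. The names `t_stat`, `rejects`, and `hypotheses` are illustrative, and the hypothesized intercept of -4 is an assumption inferred from the 14.2 / 5.4 step in the explanation.

```python
# Estimates and standard errors as reported in the estimated model.
coef = {"intercept": 10.2, "CPI": 4.6, "IP": 5.2, "GDP": 11.7}
se = {"intercept": 5.4, "CPI": 3.5, "IP": 5.9, "GDP": 6.8}

# Critical values from Figure 1 (df = 176, 1% significance).
ONE_TAIL, TWO_TAIL = 2.348, 2.604


def t_stat(name, hypothesized):
    """t-statistic for the null that the coefficient equals `hypothesized`."""
    return (coef[name] - hypothesized) / se[name]


def rejects(t, tail):
    """Apply the rejection rule for an upper, lower, or two-tailed test."""
    if tail == "upper":
        return t > ONE_TAIL
    if tail == "lower":
        return t < -ONE_TAIL
    return abs(t) > TWO_TAIL


# (coefficient, hypothesized value, tail).
# The -4.0 for the intercept is an assumption implied by the explanation.
# The CPI value is rounded to one decimal, as the explanation does.
hypotheses = {
    "Hypothesis 1": ("GDP", 0.0, "upper"),
    "Hypothesis 2": ("intercept", -4.0, "two"),
    "Hypothesis 3": ("CPI", round(12.0 / 2.6, 1), "lower"),
    "Hypothesis 4": ("IP", -1.0, "two"),
}

results = {}
for label, (name, h0, tail) in hypotheses.items():
    t = t_stat(name, h0)
    results[label] = (round(t, 2), rejects(t, tail))
    print(f"{label}: t = {t:.2f}, reject null: {results[label][1]}")
```

Encoding the tail direction separately keeps the one-tail and two-tail critical values from being mixed up, which is the main trap in this question.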

The most appropriate decision with regard to the F-statistic for testing the null hypothesis that
the coefficients on all of the independent variables are simultaneously equal to zero at the
5 percent significance level is to:

fail to reject the null hypothesis because the F-statistic is smaller than the critical F-value of 3.19.
reject the null hypothesis because the F-statistic is larger than the critical F-value of 3.19.
fail to reject the null hypothesis because the F-statistic is smaller than the critical F-value of 2.66.
reject the null hypothesis because the F-statistic is larger than the critical F-value of 2.66.

Explanation
RSS = 368.7 - 140.3 = 228.4, F-statistic = (228.4 / 3) / (140.3 / 176) = 95.51.
The critical value for a one-tailed 5% F-test with 3 and 176 degrees of
freedom is 2.66. Because the F-statistic is greater than the critical F-value,
the null hypothesis that the coefficients on all of the independent variables
are simultaneously equal to zero should be rejected.
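The F-test arithmetic can be checked with a minimal Python sketch. The inputs are the sums of squares and sample size from the problem and the critical value from Figure 2; the variable names are illustrative.

```python
n, k = 180, 3   # number of observations and of independent variables
sse = 140.3     # sum of squared errors (unexplained variation)
sst = 368.7     # total sum of squares

rss = sst - sse            # regression (explained) sum of squares
msr = rss / k              # mean regression sum of squares
mse = sse / (n - k - 1)    # mean squared error, with 176 degrees of freedom
f_stat = msr / mse

F_CRIT_5PCT = 2.66         # Figure 2: df1 = 3, df2 = 176

reject_null = f_stat > F_CRIT_5PCT
print(f"RSS = {rss:.1f}, F = {f_stat:.2f}, reject null: {reject_null}")
```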
Regarding Voiku's calculations of $R^2$ and the standard error of estimate, she is:

correct in her calculation of both the unadjusted $R^2$ and the standard error of estimate.
incorrect in her calculation of both the unadjusted $R^2$ and the standard error of estimate.
correct in her calculation of the unadjusted $R^2$ but incorrect in her calculation of the standard
error of estimate.
incorrect in her calculation of the unadjusted $R^2$ but correct in her calculation of the standard
error of estimate.

Explanation
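$\text{SEE} = \sqrt{140.3 / (180 - 3 - 1)} = 0.893$
unadjusted $R^2 = (368.7 - 140.3) / 368.7 = 0.619$
Both values differ from Voiku's figures (0.910 and 0.592), so both of her calculations are incorrect.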
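As a quick check, the two statistics can be recomputed from the sums of squares given in the problem. This is a minimal sketch; the variable names are illustrative.

```python
import math

n, k = 180, 3              # observations and independent variables
sse, sst = 140.3, 368.7    # sum of squared errors and total sum of squares

# Standard error of estimate: square root of SSE over its degrees of freedom.
see = math.sqrt(sse / (n - k - 1))

# Unadjusted R-squared: explained variation as a share of total variation.
r_squared = (sst - sse) / sst

# Voiku's reported figures, for comparison.
voiku_r2, voiku_see = 0.592, 0.910

print(f"SEE = {see:.3f} (Voiku: {voiku_see})")
print(f"R^2 = {r_squared:.3f} (Voiku: {voiku_r2})")
```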
The multiple regression, as specified, most likely suffers from:

multicollinearity.
omitted variables.
heteroskedasticity.
serial correlation of the error terms.

Explanation
The regression is highly significant (based on the F-stat in Part 3), but the
individual coefficients are not. This pattern is a symptom of significant
multicollinearity. The t-stats for the significance of the regression
coefficients (intercept, CPI, IP, GDP) are, respectively, 1.89, 1.31, 0.88,
and 1.72. None of these is high enough to reject the hypothesis that the
coefficient is zero at the 5% level of significance (two-tailed critical value
of 1.974 from the t-table).
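The individual significance tests can be sketched in Python by dividing each estimate by its standard error. The inputs are the reported estimates and the Figure 1 critical value; the names are illustrative. With the reported figures, the CPI ratio works out to 4.6 / 3.5, or about 1.31.

```python
# Estimates and standard errors as reported in the estimated model.
coef = {"intercept": 10.2, "CPI": 4.6, "IP": 5.2, "GDP": 11.7}
se = {"intercept": 5.4, "CPI": 3.5, "IP": 5.9, "GDP": 6.8}

# Figure 1: df = 176, one-tailed p = 0.025, i.e. a two-tailed 5% test.
T_CRIT = 1.974

# t-statistic for the null hypothesis that each coefficient equals zero.
t_stats = {name: coef[name] / se[name] for name in coef}
significant = {name: abs(t) > T_CRIT for name, t in t_stats.items()}

for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}, significant at 5%: {significant[name]}")
```

A significant F-statistic combined with no individually significant coefficients is the pattern the explanation treats as evidence of multicollinearity.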
A 90 percent confidence interval for the coefficient on GDP is:

0.5 to 22.9.
4.4 to 20.8.
1.9 to 19.6.
1.5 to 20.0.

Explanation
A 90% confidence interval with 176 degrees of freedom is coefficient $\pm\ t_c \times \text{se}$
$= 11.7 \pm 1.654 \times 6.8$, or 0.5 to 22.9.
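The interval can be verified with a few lines of Python. This sketch uses the GDP estimate, its standard error, and the Figure 1 critical value; the variable names are illustrative.

```python
coef_gdp, se_gdp = 11.7, 6.8

# Figure 1: df = 176, one-tailed p = 0.05, which leaves 5% in each tail
# and therefore gives a 90% two-sided interval.
t_crit = 1.654

margin = t_crit * se_gdp
lower, upper = coef_gdp - margin, coef_gdp + margin

print(f"90% CI for the GDP coefficient: {lower:.1f} to {upper:.1f}")
```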
