

The ANOVA Table for Regression

Supplement to Section 11.5 - B. Habing, 11/20/03

Just as the ANOVA table can be used to test the null hypothesis that µ1 = µ2 = ... = µp when looking at the means of several groups, it can also be
used in regression. In the case of means, the null hypothesis says that knowing the group membership provides no extra information about y. In
the case of regression, the corresponding null hypothesis is that knowing x provides no extra information about y. If we were to guess the
same y value for every x, the regression line would be flat; that is, it would have zero slope. Therefore, the null hypothesis for the ANOVA table
in regression is H0: β1 = 0 and the alternative hypothesis is HA: β1 ≠ 0.

The main idea in setting up the ANOVA table for regression is that instead of comparing the individual observations to the group averages $\bar{y}_i$ (our
best guess for the observations in each group), we compare them to $\hat\beta_0 + \hat\beta_1 x_i$ (the best guess for the observation at each x). In other words,
we use essentially the same table as before, replace each $\bar{y}_i$ by $\hat\beta_0 + \hat\beta_1 x_i$, and make sure the degrees of freedom work out correctly.
When calculating the degrees of freedom, instead of using p (one for each group average) we use 2 (one for each β). The regression table is, if anything,
easier to work with because we have only one subscript to worry about instead of two.

An example of a completed ANOVA table for regression can be seen in Figure 11.8 on page 532 and in Figure 11.12 on page 536. Note that the
“Root MSE” in the SAS output is simply the square root of the MSE in the ANOVA table (this is called “S” in the MINITAB output). Also note that
the p-value from the ANOVA table is the same as the p-value for the t-test for the slope (0.0354) and that t = 3.66 is, up to rounding, the square root of
F = 13.36. The two-sided t-test of H0: β1 = 0 will always agree with the F-test of H0: β1 = 0. The t-test has the advantage that it can also be used for
one-sided alternatives (β1 > 0 or β1 < 0). The ANOVA table has the advantage that it will be useful in many other situations in STAT 516.
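
The agreement between the t-test and the F-test can be checked with a short derivation. This is only a sketch under the usual least-squares setup for simple linear regression; the symbol $S_{xx}$ and the standard-error formula for the slope are standard results brought in here as assumptions, not something stated in this supplement.

```latex
\begin{align*}
\hat\beta_0 &= \bar{y} - \hat\beta_1 \bar{x}
  \quad\Rightarrow\quad
  (\hat\beta_0 + \hat\beta_1 x_i) - \bar{y} = \hat\beta_1 (x_i - \bar{x}), \\
SSR &= \sum_{i=1}^{n} \bigl( (\hat\beta_0 + \hat\beta_1 x_i) - \bar{y} \bigr)^2
     = \hat\beta_1^2 \, S_{xx},
  \qquad S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \\
t &= \frac{\hat\beta_1}{\sqrt{MSE / S_{xx}}}, \\
t^2 &= \frac{\hat\beta_1^2 \, S_{xx}}{MSE}
     = \frac{SSR / 1}{MSE}
     = \frac{MSR}{MSE}
     = F .
\end{align*}
```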

The ANOVA Table for Comparing Means

| Source | SS (Sum of Squares, the numerator of the variance) | DF (the denominator) | MS (Mean Square, the variance) | F |
|---|---|---|---|---|
| Treatment (or Between or Model) | $SST = \sum_{i=1}^{p} \sum_{j=1}^{n_i} (\bar{y}_i - \bar{y})^2$ | $p-1$ | $MST = \frac{SST}{p-1}$ | $F = \frac{MST}{MSE}$ |
| Error (or Within) | $SSE = \sum_{i=1}^{p} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2$ | $n-p$ | $MSE = \frac{SSE}{n-p}$ | |
| Total | $TSS = \sum_{i=1}^{p} \sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2$ | $n-1$ | | |
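
As a rough illustration of how the entries of the table for comparing means are computed, here is a minimal Python sketch. The three groups and their values are made up for this example, and the function name `anova_means` is hypothetical; none of it comes from the textbook or from SAS/MINITAB output.

```python
# Hypothetical data: three groups with made-up observations (not from the text).
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [6.0, 7.0, 8.0, 7.0],
    "C": [9.0, 10.0, 11.0],
}


def anova_means(groups):
    """Build the ANOVA table for comparing means from {group: observations}."""
    all_obs = [y for ys in groups.values() for y in ys]
    n = len(all_obs)               # total number of observations
    p = len(groups)                # number of groups
    grand_mean = sum(all_obs) / n  # overall average, y-bar

    sst = 0.0  # treatment (between) sum of squares
    sse = 0.0  # error (within) sum of squares
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)  # group average, y-bar_i
        # Each of the n_i observations contributes (y-bar_i - y-bar)^2 to SST.
        sst += len(ys) * (group_mean - grand_mean) ** 2
        sse += sum((y - group_mean) ** 2 for y in ys)
    tss = sum((y - grand_mean) ** 2 for y in all_obs)

    mst = sst / (p - 1)
    mse = sse / (n - p)
    return {
        "SST": sst, "SSE": sse, "TSS": tss,
        "df_treatment": p - 1, "df_error": n - p, "df_total": n - 1,
        "MST": mst, "MSE": mse, "F": mst / mse,
    }


table = anova_means(groups)
for name, value in table.items():
    print(f"{name}: {value}")
```

In this sketch the treatment and error sums of squares add up to the total sum of squares, and the degrees of freedom add up in the same way, which is a quick check that a table has been filled in consistently.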

The ANOVA Table for Regression

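| Source | SS (Sum of Squares, the numerator of the variance) | DF (the denominator) | MS (Mean Square, the variance) | F |
|---|---|---|---|---|
| Regression (or Model) | $SSR = \sum_{i=1}^{n} ((\hat\beta_0 + \hat\beta_1 x_i) - \bar{y})^2$ | $2-1=1$ | $MSR = \frac{SSR}{1}$ | $F = \frac{MSR}{MSE}$ |
| Error | $SSE = \sum_{i=1}^{n} (y_i - (\hat\beta_0 + \hat\beta_1 x_i))^2$ | $n-2$ | $MSE = \frac{SSE}{n-2}$ | |
| Total | $TSS = \sum_{i=1}^{n} (y_i - \bar{y})^2$ | $n-1$ | | |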
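
The regression table can be filled in the same way. The following Python sketch assumes ordinary least-squares estimates for the intercept and slope and uses a small made-up data set; the function name `anova_regression` and the data are hypothetical and are not the textbook's example. It also computes the t statistic for the slope, using the standard formula for the slope's standard error, so that it can be compared with F.

```python
import math

# Hypothetical (x, y) data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def anova_regression(x, y):
    """Build the ANOVA table for simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares estimates of the slope and intercept.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Best guess for the observation at each x.
    fitted = [b0 + b1 * xi for xi in x]

    ssr = sum((f - y_bar) ** 2 for f in fitted)           # regression SS
    sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))  # error SS
    tss = sum((yi - y_bar) ** 2 for yi in y)              # total SS

    msr = ssr / 1        # regression df = 2 - 1 = 1
    mse = sse / (n - 2)  # error df = n - 2
    f_stat = msr / mse

    # t statistic for the slope: estimate divided by its standard error.
    t_stat = b1 / math.sqrt(mse / sxx)

    return {
        "b0": b0, "b1": b1,
        "SSR": ssr, "SSE": sse, "TSS": tss,
        "df_regression": 1, "df_error": n - 2, "df_total": n - 1,
        "MSR": msr, "MSE": mse, "root_MSE": math.sqrt(mse),
        "F": f_stat, "t": t_stat,
    }


reg_table = anova_regression(x, y)
for name, value in reg_table.items():
    print(f"{name}: {value}")
```

Squaring the `t` entry should reproduce the `F` entry up to floating-point error, and `root_MSE` corresponds to the quantity the SAS output labels “Root MSE”.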

Because they are made in the same way, the five keys we saw for the ANOVA table for means also apply to the ANOVA table for regression.
