Anova


ANOVA

ANOVA (Analysis Of Variance)


• It is used to test whether the means of three or more
groups differ, by analysing the variation in the data.
• ANOVA measures two sources of variation in the data
and compares their relative sizes
• Variation BETWEEN groups
– for each data value look at the difference between its group
mean and the overall mean
• Variation WITHIN groups
– for each data value look at the difference between that value
and the mean of its group
Why ANOVA instead of multiple t-test
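The slide gives no argument under this heading. One common reason, sketched here under the assumption that the pairwise t-tests are independent and each is run at a significance level of 0.05, is that the chance of at least one false positive grows with the number of tests. The function name `familywise_error` is illustrative only.

```python
from math import comb

def familywise_error(k_groups: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across all pairwise tests,
    assuming (for illustration) that the tests are independent."""
    n_tests = comb(k_groups, 2)  # number of pairwise comparisons among k groups
    return 1 - (1 - alpha) ** n_tests

for k in (3, 4, 5):
    # columns: number of groups, number of pairwise tests, familywise error rate
    print(k, comb(k, 2), round(familywise_error(k), 3))
# 3 3 0.143
# 4 6 0.265
# 5 10 0.401
```

A single ANOVA F test keeps the overall error rate at the chosen level, whatever the number of groups.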
One-Way ANOVA
• When there is just one explanatory variable, we refer
to the analysis of variance as one-way ANOVA.
• The one-way analysis of variance is used to test the
claim that three or more population means are equal
• This is an extension of the two independent samples
t-test
H0: µ1 = µ2 = µ3 = … = µk
• The null hypothesis is that all the population means
are equal
• The alternative hypothesis is that at least one of the
means is different
ANOVA (Analysis Of Variance)
• The ANOVA F-statistic is the ratio of the Between Group
Variance, MS(B), to the Within Group Variance, MS(W):
F = MS(B) / MS(W)
• A large F is evidence against H0, since it indicates that
there is more difference between groups than within
groups.
• If the alternative hypothesis is true, then F tends to be large.
• We reject H0 in favor of Ha if the F statistic is sufficiently
large.
One-Way ANOVA
• Conditions or Assumptions
– The data are randomly sampled and the observations are independent
– The population variances of the groups are assumed equal
– The residuals are normally distributed
One-Way ANOVA
• Here is the basic one-way ANOVA table

Source     SS     df     MS     F     p
Between
Within
Total
One-Way ANOVA
• The M.Pharm classroom is divided into three rows:
front, middle, and back
• The instructor noticed that the further the students were
from him, the more likely they were to miss class or
use an instant messenger during class
• He wanted to see if the students further away did worse
on the exams
One-Way ANOVA
The ANOVA doesn’t test that one mean is less than
another, only whether they’re all equal or at least
one is different.

H0: µF = µM = µB
One-Way ANOVA
• A random sample of the students in each row
was taken
• The score for those students on the exam was
recorded
– Front: 82, 83, 97, 93, 55, 67, 53
– Middle: 83, 78, 68, 61, 77, 54, 69, 51, 63
– Back: 38, 59, 55, 66, 45, 52, 52, 61
One-Way ANOVA
The summary statistics for the grades of each row are
shown in the table below

Row            Front    Middle     Back
Sample size        7         9        8
Mean           75.71     67.11    53.50
St. Dev        17.63     10.95     8.96
Variance      310.90    119.86    80.29
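The summary table can be reproduced from the raw scores. This is a minimal sketch using Python's standard `statistics` module; the dictionary name `scores` is illustrative, and `stdev` and `variance` are the sample versions (dividing by n − 1).

```python
from statistics import mean, stdev, variance

# Exam scores by classroom row, as listed on the earlier slide
scores = {
    "Front":  [82, 83, 97, 93, 55, 67, 53],
    "Middle": [83, 78, 68, 61, 77, 54, 69, 51, 63],
    "Back":   [38, 59, 55, 66, 45, 52, 52, 61],
}

# (sample size, mean, sample standard deviation, sample variance) per row
summary = {
    row: (len(x), mean(x), stdev(x), variance(x))
    for row, x in scores.items()
}

for row, (n, m, s, v) in summary.items():
    print(f"{row:<7} n={n}  mean={m:.2f}  sd={s:.2f}  var={v:.2f}")
```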


One-Way ANOVA
• Variation
– Variation is the sum of the squares of the
deviations between each value and the mean of the
values
– Sum of Squares is abbreviated SS and is often
followed by a label in parentheses, such as
SS(B) or SS(W), so we know which sum of squares
we’re talking about
– SS(Total) is the total Sum of Squares (total variation)
One-Way ANOVA
• There are two sources of variation
– the variation between the groups, SS(B), or the
variation due to the factor
– the variation within the groups, SS(W), or the
variation that can’t be explained by the factor so
it’s called the error variation
One-Way ANOVA
• Grand Mean
– The grand mean is the average of all the values when
the factor is ignored
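– It is a weighted average of the individual sample
means, where $\bar{x}_i$ and $n_i$ are the mean and size of sample $i$

$$\bar{x} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k}$$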
One-Way ANOVA
• Grand Mean for our example is 65.08

$$\bar{x} = \frac{7(75.71) + 9(67.11) + 8(53.50)}{7 + 9 + 8} = \frac{1562}{24} = 65.08$$
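The weighted average can be checked in a few lines. This sketch uses the rounded sample means from the summary table; the list names `sizes` and `means` are illustrative.

```python
sizes = [7, 9, 8]               # sample sizes for Front, Middle, Back
means = [75.71, 67.11, 53.50]   # rounded sample means from the summary table

# Weighted average of the sample means, weighted by sample size
grand_mean = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)
print(round(grand_mean, 2))  # 65.08
```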
One-Way ANOVA
• Between Group Variation, SS(B)
– The between group variation is the variation between each
sample mean and the grand mean
– Each individual variation is weighted by the sample size

$$SS(B) = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2 = n_1 (\bar{x}_1 - \bar{x})^2 + n_2 (\bar{x}_2 - \bar{x})^2 + \cdots + n_k (\bar{x}_k - \bar{x})^2$$
One-Way ANOVA
The Between Group Variation for our example is
SS(B) = 1902

$$SS(B) = 7(75.71 - 65.08)^2 + 9(67.11 - 65.08)^2 + 8(53.50 - 65.08)^2$$

With the rounded means shown, this sum is 1900.8376. With unrounded
means it is 1901.52, which rounds to 1902.
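The gap between 1900.84 and 1902 comes from rounding the means. A minimal sketch that computes SS(B) from the raw scores, so nothing is rounded early (the variable names are illustrative):

```python
from statistics import mean

groups = [
    [82, 83, 97, 93, 55, 67, 53],           # Front
    [83, 78, 68, 61, 77, 54, 69, 51, 63],   # Middle
    [38, 59, 55, 66, 45, 52, 52, 61],       # Back
]

# Grand mean of all 24 scores, ignoring the row factor
all_scores = [x for g in groups for x in g]
grand_mean = mean(all_scores)

# Squared distance of each group mean from the grand mean, weighted by group size
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
print(round(ss_between, 2))  # 1901.52
```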
One-Way ANOVA
• Within Group Variation, SS(W)
– The Within Group Variation is the weighted total of the
individual sample variances
– The weighting is done with the degrees of freedom
– The df for each sample is one less than the sample size
for that sample.
One-Way ANOVA
Within Group Variation

$$SS(W) = \sum_{i=1}^{k} df_i \, s_i^2 = df_1 s_1^2 + df_2 s_2^2 + \cdots + df_k s_k^2$$
One-Way ANOVA

• The within group variation for our example is 3386

$$SS(W) = 6(310.90) + 8(119.86) + 7(80.29) = 3386.31 \approx 3386$$
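The same weighted sum can be computed from the raw scores. This sketch relies on `statistics.variance` being the sample variance (divisor n − 1); the variable names are illustrative.

```python
from statistics import variance

groups = [
    [82, 83, 97, 93, 55, 67, 53],           # Front
    [83, 78, 68, 61, 77, 54, 69, 51, 63],   # Middle
    [38, 59, 55, 66, 45, 52, 52, 61],       # Back
]

# Each sample variance is weighted by its degrees of freedom, n_i - 1
ss_within = sum((len(g) - 1) * variance(g) for g in groups)
print(round(ss_within, 2))  # 3386.32
```

The small difference from 3386.31 is again due to the rounded variances used on the slide.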
One-Way ANOVA
• After filling in the sum of squares, we have …

Source     SS     df     MS     F     p
Between    1902
Within     3386
Total      5288
One-Way ANOVA
• Filling in the degrees of freedom gives this …

Source     SS     df     MS     F     p
Between    1902    2
Within     3386   21
Total      5288   23
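The slides do not show where these numbers come from. Under the standard one-way ANOVA definitions, with k = 3 groups and N = 24 scores in total (N is a symbol assumed here, not used on the slides), the degrees of freedom work out as:

```latex
\begin{aligned}
df(B) &= k - 1 = 3 - 1 = 2 \\
df(W) &= N - k = 24 - 3 = 21 \\
df(\text{Total}) &= N - 1 = 24 - 1 = 23 = df(B) + df(W)
\end{aligned}
```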
One-Way ANOVA
• Variances
– The variances are also called the Mean Squares,
abbreviated MS, often with an accompanying label:
MS(B) or MS(W)
– They are an average squared deviation from the mean and
are found by dividing the variation by the degrees of
freedom
– MS = SS / df

Variance = Variation / df
One-Way ANOVA
• MS(B) = 1902 / 2 = 951.0
• MS(W) = 3386 / 21 = 161.2
• MS(T) = 5288 / 23 = 229.9
– Notice that the MS(Total) is NOT the sum of
MS(Between) and MS(Within).
One-Way ANOVA
• Completing the MS gives …

Source     SS     df     MS      F     p
Between    1902    2     951.0
Within     3386   21     161.2
Total      5288   23     229.9


One-Way ANOVA
• F test statistic
– An F test statistic is the ratio of two sample
variances
– The MS(B) and MS(W) are two sample variances
and that’s what we divide to find F.
– F = MS(B) / MS(W)
• For our data, F = 951.0 / 161.2 = 5.9
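The mean squares and the F statistic follow directly from the table entries. A minimal sketch using the rounded SS values from the slides (variable names are illustrative):

```python
# Sums of squares and degrees of freedom taken from the ANOVA table
ss_between, df_between = 1902, 2
ss_within, df_within = 3386, 21

ms_between = ss_between / df_between   # MS(B) = SS(B) / df(B)
ms_within = ss_within / df_within      # MS(W) = SS(W) / df(W)
f_stat = ms_between / ms_within        # F = MS(B) / MS(W)

print(round(ms_between, 1), round(ms_within, 1), round(f_stat, 1))  # 951.0 161.2 5.9
```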
One-Way ANOVA
• Adding F to the table …

Source     SS     df     MS      F     p
Between    1902    2     951.0   5.9
Within     3386   21     161.2
Total      5288   23     229.9


One-Way ANOVA
• The F test is a right tail test
• The F test statistic has an F distribution with
df(B) numerator df and df(W) denominator df
• The p-value is the area to the right of the test
statistic
• $P(F_{2,21} > 5.9) = 0.009$
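The right-tail area can be checked without statistical tables. This sketch uses a closed form that holds only when the numerator df is 2, as it is here. The helper name `f_right_tail_df2` is hypothetical. For other degrees of freedom a library routine such as SciPy's `scipy.stats.f.sf` would be the usual choice, assuming SciPy is available.

```python
def f_right_tail_df2(f_value: float, df_denominator: int) -> float:
    """P(F > f_value) for an F distribution with 2 numerator df.

    Closed form valid only for 2 numerator degrees of freedom:
    P(F > f) = (1 + 2 f / d) ** (-d / 2), where d is the denominator df.
    """
    return (1 + 2 * f_value / df_denominator) ** (-df_denominator / 2)

p_value = f_right_tail_df2(5.9, 21)
print(round(p_value, 3))  # 0.009
```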
One-Way ANOVA
• Completing the table with the p-value

Source     SS     df     MS      F     p
Between    1902    2     951.0   5.9   0.009
Within     3386   21     161.2
Total      5288   23     229.9


One-Way ANOVA
• The p-value is 0.009, which is less than the
significance level of 0.05, so we reject the null
hypothesis.
• The null hypothesis is that the means of the
three rows in class were the same, but we
reject that, so at least one row has a different
mean.
One-Way ANOVA
• There is enough evidence to support the claim
that there is a difference in the mean scores of
the front, middle, and back rows in class.
• The ANOVA doesn’t tell which row is
different; you would need to run post hoc tests
to determine that
Post-hoc Tests
• Used to determine which mean or group of means is/are
significantly different from the others (after a significant F)
• The choice depends upon research design & research question:
• Bonferroni (more powerful when only a few comparisons are planned)
– Only some pairs of sample means are to be tested
– Desired alpha level is divided by no. of comparisons
• Tukey’s HSD Procedure
– when all pairs of sample means are to be tested
• Scheffé’s Procedure
– when sample sizes are unequal
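As a small illustration of the Bonferroni adjustment for the classroom example, the sketch below assumes all three pairs of rows are compared at an overall alpha of 0.05. It computes only the per-comparison threshold, not the pairwise tests themselves.

```python
from itertools import combinations

rows = ["Front", "Middle", "Back"]
alpha = 0.05  # desired overall significance level

# Every pair of rows that could be compared
pairs = list(combinations(rows, 2))

# Bonferroni: divide the desired alpha by the number of comparisons
alpha_per_test = alpha / len(pairs)

for a, b in pairs:
    print(f"{a} vs {b}: declare a difference only if p < {alpha_per_test:.4f}")
```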
Two-Way ANOVA
• Two-way ANOVA allows us to compare population means when
the populations are classified according to two (categorical)
factors.
• Example 1. We might like to look at SAT scores of students
who are male or female (1st factor) and either have or have not
had a preparatory course (2nd factor).
• Example 2. A researcher wants to investigate the effects of the
amounts of calcium and magnesium in a rat's diet on the rat's
blood pressure. Diets including high, medium and low
amounts of each mineral (but otherwise identical) will be fed
to the rats, and after a specified time on the diet the blood
pressure will be measured.
• Notice that the design includes nine different treatments
because there are three levels to each of the two factors.
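The nine treatments are simply every combination of a calcium level with a magnesium level. A short sketch that enumerates them (the level labels come from the example; the variable names are illustrative):

```python
from itertools import product

levels = ["high", "medium", "low"]

# Each treatment is one (calcium level, magnesium level) combination
treatments = list(product(levels, repeat=2))

for calcium, magnesium in treatments:
    print(f"calcium={calcium}, magnesium={magnesium}")
print(len(treatments))  # 9
```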
Two-Way ANOVA Table
