Session 3: ANOVA


ANOVA

Chap 11-1
Chap 11-2
Goals
After completing this chapter, you should be able
to:
Recognize situations in which to use analysis of variance
Understand different analysis of variance designs
Perform a single-factor hypothesis test and interpret results
Conduct and interpret pairwise comparison procedures after an
analysis of variance
Set up and perform randomized blocks analysis
Analyze the results of a two-factor analysis of variance test with
replications
Chap 11-3
Chapter Overview
Analysis of Variance (ANOVA)
One-Way ANOVA
  F-test
  Tukey-Kramer test
Randomized Complete Block ANOVA
  F-test
  Fisher's Least Significant Difference test
Two-factor ANOVA with replication
Chap 11-4
General ANOVA Setting
Investigator controls one or more independent
variables
Called factors (or treatment variables)
Each factor contains two or more levels (or
categories/classifications)
Observe effects on dependent variable
Response to levels of independent variable
Experimental design: the plan used to test
hypothesis
Chap 11-5
One-Way Analysis of Variance
Evaluate the difference among the means of three
or more populations

Examples: Accident rates for 1st, 2nd, and 3rd shift

Assumptions
Populations are normally distributed
Populations have equal variances
Samples are randomly and independently drawn
Chap 11-6
Completely Randomized Design
Experimental units (subjects) are assigned
randomly to treatments
Only one factor or independent variable
With two or more treatment levels
Analyzed by
One-factor analysis of variance (one-way ANOVA)
Called a Balanced Design if all factor levels
have equal sample size
Chap 11-7
Hypotheses of One-Way ANOVA

$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
All population means are equal
i.e., no treatment effect (no variation in means among
groups)

$H_A$: Not all of the population means are the same
At least one population mean is different
i.e., there is a treatment effect
Does not mean that all population means are different
(some pairs may be the same)
Chap 11-8
One-Factor ANOVA
All Means are the same:
The Null Hypothesis is True
(No Treatment Effect)
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
$H_A$: Not all $\mu_i$ are the same
[Figure: $\mu_1 = \mu_2 = \mu_3$]
Chap 11-9
One-Factor ANOVA
At least one mean is different:
The Null Hypothesis is NOT true
(Treatment Effect is present)
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
$H_A$: Not all $\mu_i$ are the same
[Figure: $\mu_1 = \mu_2 \neq \mu_3$ or $\mu_1 \neq \mu_2 \neq \mu_3$]
Chap 11-10
Partitioning the Variation
Total variation can be split into two parts:
SST = Total Sum of Squares
SSB = Sum of Squares Between
SSW = Sum of Squares Within
SST = SSB + SSW
Chap 11-11
Partitioning the Variation
Total Variation = the aggregate dispersion of the individual
data values across the various factor levels (SST)
Within-Sample Variation = dispersion that exists among
the data values within a particular factor level (SSW)
Between-Sample Variation = dispersion among the factor
sample means (SSB)
SST = SSB + SSW
(continued)
Chap 11-12
Partition of Total Variation
Total Variation (SST) = Variation Due to Factor (SSB) + Variation Due to Random Sampling (SSW)

SSB is commonly referred to as:
Sum of Squares Between
Sum of Squares Among
Sum of Squares Explained
Among Groups Variation

SSW is commonly referred to as:
Sum of Squares Within
Sum of Squares Error
Sum of Squares Unexplained
Within Groups Variation
Chap 11-13
Total Sum of Squares

$$SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{\bar{x}})^2$$
Where:
SST = Total sum of squares
k = number of populations (levels or treatments)
$n_i$ = sample size from population i
$x_{ij}$ = j-th measurement from population i
$\bar{\bar{x}}$ = grand mean (mean of all data values)
SST = SSB + SSW
Chap 11-14
Total Variation
(continued)
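$$SST = (x_{11} - \bar{\bar{x}})^2 + (x_{12} - \bar{\bar{x}})^2 + \cdots + (x_{k n_k} - \bar{\bar{x}})^2$$
[Figure: Response, X, plotted for Group 1, Group 2, and Group 3, with the grand mean marked]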
Chap 11-15
Sum of Squares Between
$$SSB = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2$$
Where:
SSB = Sum of squares between
k = number of populations
$n_i$ = sample size from population i
$\bar{x}_i$ = sample mean from population i
$\bar{\bar{x}}$ = grand mean (mean of all data values)
SST = SSB + SSW
Chap 11-16
Between-Group Variation
Variation Due to Differences Among Groups
$$SSB = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2$$
Mean Square Between = SSB/degrees of freedom
$$MSB = \frac{SSB}{k - 1}$$
Chap 11-17
Between-Group Variation
(continued)
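$$SSB = n_1 (\bar{x}_1 - \bar{\bar{x}})^2 + n_2 (\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k (\bar{x}_k - \bar{\bar{x}})^2$$
[Figure: Response, X, plotted for Group 1, Group 2, and Group 3, with the group means $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$ and the grand mean marked]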
Chap 11-18
Sum of Squares Within
$$SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$
Where:
SSW = Sum of squares within
k = number of populations
$n_i$ = sample size from population i
$\bar{x}_i$ = sample mean from population i
$x_{ij}$ = j-th measurement from population i
SST = SSB + SSW
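The identity SST = SSB + SSW follows from the definitions of the three sums of squares. A brief derivation is sketched here, writing $\bar{\bar{x}}$ for the grand mean and $\bar{x}_i$ for the mean of sample i (notation assumed for this sketch):

```latex
\begin{aligned}
x_{ij} - \bar{\bar{x}} &= (x_{ij} - \bar{x}_i) + (\bar{x}_i - \bar{\bar{x}}) \\
\sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{\bar{x}})^2
  &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2
   + \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2
   + 2 \sum_{i=1}^{k} (\bar{x}_i - \bar{\bar{x}}) \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i) \\
  &= SSW + SSB + 0
\end{aligned}
```

The cross term vanishes because the deviations from a sample mean sum to zero within each sample.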
Chap 11-19
Within-Group Variation
Summing the variation within each group and then adding over all groups
$$SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$
Mean Square Within = SSW/degrees of freedom
$$MSW = \frac{SSW}{N - k}$$
Chap 11-20
Within-Group Variation
(continued)
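$$SSW = (x_{11} - \bar{x}_1)^2 + (x_{12} - \bar{x}_1)^2 + \cdots + (x_{k n_k} - \bar{x}_k)^2$$
[Figure: Response, X, plotted for Group 1, Group 2, and Group 3, with the group means $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$ marked]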
Chap 11-21
One-Way ANOVA Table
Source of Variation    df       SS                 MS                     F ratio
Between Samples        k - 1    SSB                MSB = SSB / (k - 1)    F = MSB / MSW
Within Samples         N - k    SSW                MSW = SSW / (N - k)
Total                  N - 1    SST = SSB + SSW

k = number of populations
N = sum of the sample sizes from all populations
df = degrees of freedom
Chap 11-22
One-Factor ANOVA
F Test Statistic
$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
$H_A$: At least two population means are different
Test statistic
$$F = \frac{MSB}{MSW}$$
MSB is the mean square between (the between-groups estimate of variance)
MSW is the mean square within (the within-groups estimate of variance)
Degrees of freedom
$df_1 = k - 1$ (k = number of populations)
$df_2 = N - k$ (N = sum of sample sizes from all populations)
Chap 11-23
Interpreting One-Factor ANOVA
F Statistic
The F statistic is the ratio of the between
estimate of variance and the within estimate
of variance
The ratio must always be positive
$df_1 = k - 1$ will typically be small
$df_2 = N - k$ will typically be large

The ratio should be close to 1 if
$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ is true

The ratio will be larger than 1 if
$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ is false
Chap 11-24
One-Factor ANOVA
F Test Example
You want to see if three
different groups of women
cover different distances in 10
seconds. You randomly
select five measurements
from sprint trials for each
group. At the .05 significance
level, is there a difference in
mean distance?
group-1 group-2 group-3
254 234 200
263 218 222
241 235 197
237 227 206
251 216 204
Chap 11-25





One-Factor ANOVA Example:
Scatter Diagram
[Scatter diagram: Distance (axis from 190 to 270) plotted against group (1, 2, 3), with the group means and the grand mean marked]
$\bar{x}_1 = 249.2$, $\bar{x}_2 = 226.0$, $\bar{x}_3 = 205.8$
$\bar{\bar{x}} = 227.0$
Chap 11-26
One-Factor ANOVA Example
Computations
group-1 group-2 group-3

254 234 200
263 218 222
241 235 197
237 227 206
251 216 204
$\bar{x}_1 = 249.2$, $n_1 = 5$
$\bar{x}_2 = 226.0$, $n_2 = 5$
$\bar{x}_3 = 205.8$, $n_3 = 5$
$\bar{\bar{x}} = 227.0$
N = 15
k = 3
$SSB = 5 [(249.2 - 227)^2 + (226 - 227)^2 + (205.8 - 227)^2] = 4716.4$
$SSW = (254 - 249.2)^2 + (263 - 249.2)^2 + \cdots + (204 - 205.8)^2 = 1119.6$
$MSB = 4716.4 / (3 - 1) = 2358.2$
$MSW = 1119.6 / (15 - 3) = 93.3$
$F = \frac{2358.2}{93.3} = 25.275$
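The computations on this slide can be reproduced with a few lines of plain Python. This is a minimal sketch: the function name `one_way_anova` and the dictionary layout are chosen here for illustration, while the data values are the ones from the example table.

```python
# One-way ANOVA sums of squares for the sprint-distance example.
groups = {
    "group-1": [254, 263, 241, 237, 251],
    "group-2": [234, 218, 235, 227, 216],
    "group-3": [200, 222, 197, 206, 204],
}


def one_way_anova(samples):
    """Return (SSB, SSW, MSB, MSW, F) for a list of samples."""
    k = len(samples)                              # number of populations
    n_total = sum(len(s) for s in samples)        # N, sum of sample sizes
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]    # sample mean per group

    # Between-group variation: weighted squared distance of group means
    # from the grand mean.
    ssb = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within-group variation: squared distance of each value from its
    # own group mean.
    ssw = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)

    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return ssb, ssw, msb, msw, msb / msw


ssb, ssw, msb, msw, f_ratio = one_way_anova(list(groups.values()))
print(f"SSB = {ssb:.1f}, SSW = {ssw:.1f}")
print(f"MSB = {msb:.1f}, MSW = {msw:.1f}, F = {f_ratio:.3f}")
```

The function also works for unequal sample sizes, since each group contributes its own $n_i$ to SSB.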
Chap 11-27
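One-Factor ANOVA Example
Solution
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_A$: $\mu_i$ not all equal
$\alpha = .05$
$df_1 = 2$, $df_2 = 12$
Critical Value: $F_{\alpha} = F_{.05} = 3.885$
Test Statistic: $F = \frac{MSB}{MSW} = \frac{2358.2}{93.3} = 25.275$
Decision: Reject $H_0$ at $\alpha = 0.05$
Conclusion: There is evidence that at least one $\mu_i$ differs from the rest
[Figure: F distribution with the rejection region ($\alpha = .05$) to the right of $F_{.05} = 3.885$; the test statistic F = 25.275 falls in the "Reject $H_0$" region]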
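The critical value and the decision can be checked without a table in this particular example, because the F distribution has a simple closed-form upper tail when the numerator degrees of freedom equal 2: $P(F > f) = (1 + 2f/df_2)^{-df_2/2}$. The sketch below relies on that special case only; the function names are hypothetical, and for other values of $df_1$ a table or statistical software is needed.

```python
# Closed-form upper tail of the F distribution, valid ONLY when the
# numerator degrees of freedom df1 = 2 (as in this three-group example).


def f_upper_tail_df1_2(f_value, df2):
    """P(F > f_value) for an F(2, df2) distribution."""
    return (1.0 + 2.0 * f_value / df2) ** (-df2 / 2.0)


def f_critical_df1_2(alpha, df2):
    """Value f such that P(F > f) = alpha for an F(2, df2) distribution."""
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


alpha = 0.05
df2 = 12                   # N - k = 15 - 3
f_stat = 2358.2 / 93.3     # MSB / MSW from the example

f_crit = f_critical_df1_2(alpha, df2)
p_value = f_upper_tail_df1_2(f_stat, df2)
reject_h0 = f_stat > f_crit

print(f"F = {f_stat:.3f}, F crit = {f_crit:.3f}, p-value = {p_value:.2e}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Comparing the test statistic with the critical value and comparing the p-value with $\alpha$ are equivalent ways to reach the same decision.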
Chap 11-28
ANOVA -- Single Factor:
Excel Output
EXCEL: tools | data analysis | ANOVA: single factor

SUMMARY
Groups    Count    Sum     Average    Variance
Club 1    5        1246    249.2      108.2
Club 2    5        1130    226        77.5
Club 3    5        1029    205.8      94.2

ANOVA
Source of Variation    SS        df    MS        F         P-value     F crit
Between Groups         4716.4    2     2358.2    25.275    4.99E-05    3.885
Within Groups          1119.6    12    93.3
Total                  5836.0    14
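The SUMMARY rows of the Excel output can be reproduced with Python's standard `statistics` module. This is a small sketch using the example data; the dictionary keys follow the "Club" labels of the output above. It also illustrates a known property of balanced designs: with equal sample sizes, MSW equals the average of the sample variances.

```python
from statistics import mean, variance

# Example data, labeled as in the Excel SUMMARY table.
clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

for name, values in clubs.items():
    # variance() is the sample variance (divides by n - 1), as in Excel.
    print(f"{name}: count={len(values)}, sum={sum(values)}, "
          f"average={mean(values):.1f}, variance={variance(values):.1f}")

# Balanced design: MSW is the plain average of the group variances.
msw = mean([variance(v) for v in clubs.values()])
print(f"MSW = {msw:.1f}")
```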
Chap 11-29
The Tukey-Kramer Procedure
Tells which population means are significantly
different
e.g.: $\mu_1 = \mu_2 \neq \mu_3$
Done after rejection of equal means in ANOVA
Allows pair-wise comparisons
Compare absolute mean differences with critical
range
[Figure: x axis with $\mu_1 = \mu_2$ located apart from $\mu_3$]
Chap 11-30
Tukey-Kramer Critical Range




$$\text{Critical Range} = q_{\alpha} \sqrt{\frac{MSW}{2} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$$
where:
$q_{\alpha}$ = Value from standardized range table
with k and N - k degrees of freedom for
the desired level of $\alpha$
MSW = Mean Square Within
$n_i$ and $n_j$ = Sample sizes from populations (levels) i and j
Chap 11-31
The Tukey-Kramer Procedure:
Example
1. Compute absolute mean differences:
$|\bar{x}_1 - \bar{x}_2| = |249.2 - 226.0| = 23.2$
$|\bar{x}_1 - \bar{x}_3| = |249.2 - 205.8| = 43.4$
$|\bar{x}_2 - \bar{x}_3| = |226.0 - 205.8| = 20.2$
2. Find the q value from the table in appendix J with
k and N - k degrees of freedom for
the desired level of $\alpha$
$q_{\alpha} = 3.77$
Chap 11-32
The Tukey-Kramer Procedure:
Example
3. Compute Critical Range:
$$\text{Critical Range} = q_{\alpha} \sqrt{\frac{MSW}{2} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)} = 3.77 \sqrt{\frac{93.3}{2} \left( \frac{1}{5} + \frac{1}{5} \right)} = 16.285$$
4. Compare:
$|\bar{x}_1 - \bar{x}_2| = 23.2$
$|\bar{x}_1 - \bar{x}_3| = 43.4$
$|\bar{x}_2 - \bar{x}_3| = 20.2$
5. All of the absolute mean differences
are greater than critical range.
Therefore there is a significant
difference between each pair of
means at 5% level of significance.
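The Tukey-Kramer steps of this example can be sketched in Python as follows. The value of q (3.77) is taken from the slide rather than computed, and the function name `tukey_kramer_range` is chosen here for illustration.

```python
from itertools import combinations
from math import sqrt


def tukey_kramer_range(q_alpha, msw, n_i, n_j):
    """Critical range for comparing the means of samples i and j."""
    return q_alpha * sqrt((msw / 2.0) * (1.0 / n_i + 1.0 / n_j))


means = {"group-1": 249.2, "group-2": 226.0, "group-3": 205.8}
sizes = {"group-1": 5, "group-2": 5, "group-3": 5}
q_alpha = 3.77   # studentized range value read from the table (per the slide)
msw = 93.3       # mean square within from the ANOVA table

results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    crit = tukey_kramer_range(q_alpha, msw, sizes[a], sizes[b])
    results[(a, b)] = (diff, crit, diff > crit)
    verdict = "significant" if diff > crit else "not significant"
    print(f"{a} vs {b}: |diff| = {diff:.1f}, critical range = {crit:.3f}, {verdict}")
```

Because the critical range is computed per pair, the same loop handles unequal sample sizes, which is the case the Kramer extension is meant for.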
Chap 11-33
Tukey-Kramer in PHStat
Chap 11-34
Randomized Complete Block ANOVA
Like One-Way ANOVA, we test for equal population
means (for different factor levels, for example)...

...but we want to control for possible variation from a
second factor (with two or more levels)

Used when more than one factor may influence the
value of the dependent variable, but only one is of key
interest

Levels of the secondary factor are called blocks
Chap 11-35
Partitioning the Variation
Total variation can now be split into three parts:
SST = Total sum of squares
SSB = Sum of squares between factor levels
SSBL = Sum of squares between blocks
SSW = Sum of squares within levels
SST = SSB + SSBL + SSW
Chap 11-36
Sum of Squares for Blocking
$$SSBL = \sum_{j=1}^{b} k (\bar{x}_j - \bar{\bar{x}})^2$$
Where:
k = number of levels for this factor
b = number of blocks
$\bar{x}_j$ = sample mean from the j-th block
$\bar{\bar{x}}$ = grand mean (mean of all data values)
SST = SSB + SSBL + SSW
Chap 11-37
Partitioning the Variation
Total variation can now be split into three parts:
SST = SSB + SSBL + SSW
SST and SSB are computed as they were in One-Way ANOVA
SSW = SST - (SSB + SSBL)
Chap 11-38
Mean Squares
$$MSB = \text{Mean square between} = \frac{SSB}{k - 1}$$
$$MSBL = \text{Mean square blocking} = \frac{SSBL}{b - 1}$$
$$MSW = \text{Mean square within} = \frac{SSW}{(k - 1)(b - 1)}$$
Chap 11-39
Randomized Block ANOVA Table
Source of Variation    df                SS      MS      F ratio
Between Blocks         b - 1             SSBL    MSBL    MSBL / MSW
Between Samples        k - 1             SSB     MSB     MSB / MSW
Within Samples         (k - 1)(b - 1)    SSW     MSW
Total                  N - 1             SST

k = number of populations
N = sum of the sample sizes from all populations
b = number of blocks
df = degrees of freedom
Chap 11-40
Blocking Test
$H_0: \mu_{b1} = \mu_{b2} = \mu_{b3} = \cdots$
$H_A$: Not all block means are equal
$$F = \frac{MSBL}{MSW}$$
Blocking test: $df_1 = b - 1$, $df_2 = (k - 1)(b - 1)$
Reject $H_0$ if $F > F_{\alpha}$

Chap 11-41
Main Factor Test
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
$H_A$: Not all population means are equal
$$F = \frac{MSB}{MSW}$$
Main Factor test: $df_1 = k - 1$, $df_2 = (k - 1)(b - 1)$
Reject $H_0$ if $F > F_{\alpha}$
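The randomized block partition, the mean squares, and the two F ratios (blocking test and main factor test) can be sketched in plain Python. The function name `rcbd_anova` and the data below are hypothetical and serve only to illustrate the formulas; one observation per block and factor level is assumed.

```python
def rcbd_anova(table):
    """Randomized complete block ANOVA.

    table[j][i] is the single observation for block j under
    factor level (treatment) i.
    """
    b = len(table)        # number of blocks
    k = len(table[0])     # number of factor levels
    grand = sum(sum(row) for row in table) / (b * k)
    level_means = [sum(row[i] for row in table) / b for i in range(k)]
    block_means = [sum(row) / k for row in table]

    sst = sum((x - grand) ** 2 for row in table for x in row)
    ssb = b * sum((m - grand) ** 2 for m in level_means)   # each level has b values
    ssbl = k * sum((m - grand) ** 2 for m in block_means)  # each block has k values
    ssw = sst - (ssb + ssbl)

    msb = ssb / (k - 1)
    msbl = ssbl / (b - 1)
    msw = ssw / ((k - 1) * (b - 1))
    return {
        "SST": sst, "SSB": ssb, "SSBL": ssbl, "SSW": ssw,
        "MSB": msb, "MSBL": msbl, "MSW": msw,
        "F_factor": msb / msw,   # main factor test, df = (k-1, (k-1)(b-1))
        "F_block": msbl / msw,   # blocking test,    df = (b-1, (k-1)(b-1))
    }


# Hypothetical data: 5 blocks (rows) by 3 factor levels (columns).
scores = [
    [7, 9, 10],
    [8, 9, 10],
    [9, 9, 12],
    [10, 9, 12],
    [11, 12, 14],
]
for name, value in rcbd_anova(scores).items():
    print(f"{name} = {value:.3f}")
```

Each F ratio is then compared with the critical value $F_{\alpha}$ for its own pair of degrees of freedom.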
Chap 11-42
Fisher's Least Significant Difference Test
To test which population means are significantly
different
e.g.: $\mu_1 = \mu_2 \neq \mu_3$
Done after rejection of equal means in randomized
block ANOVA design
Allows pair-wise comparisons
Compare absolute mean differences with critical
range
[Figure: x axis with $\mu_1 = \mu_2$ located apart from $\mu_3$]
Chap 11-43
Fisher's Least Significant
Difference (LSD) Test
$$LSD = t_{\alpha/2} \sqrt{MSW} \sqrt{\frac{2}{b}}$$
where:
$t_{\alpha/2}$ = Upper-tailed value from Student's t-distribution
for $\alpha/2$ and (k - 1)(b - 1) degrees of freedom
MSW = Mean square within from ANOVA table
b = number of blocks
k = number of levels of the main factor
Chap 11-44
Fisher's Least Significant
Difference (LSD) Test
(continued)
$$LSD = t_{\alpha/2} \sqrt{MSW} \sqrt{\frac{2}{b}}$$
Compare: Is $|\bar{x}_i - \bar{x}_j| > LSD$?
$|\bar{x}_1 - \bar{x}_2|$
$|\bar{x}_1 - \bar{x}_3|$
$|\bar{x}_2 - \bar{x}_3|$
... etc
If the absolute mean difference
is greater than LSD then there
is a significant difference
between that pair of means at
the chosen level of significance.
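A short sketch of the LSD comparison in Python follows. All inputs are hypothetical: the MSW and the factor-level means are made up for illustration, and the t value 2.306 is the tabulated upper 0.025 point of Student's t with 8 degrees of freedom, which corresponds to the within degrees of freedom (k - 1)(b - 1) from the randomized block ANOVA table for k = 3 and b = 5.

```python
from itertools import combinations
from math import sqrt


def fisher_lsd(t_crit, msw, b):
    """Least significant difference for a randomized block design."""
    return t_crit * sqrt(msw) * sqrt(2.0 / b)


# Hypothetical inputs for illustration only.
t_crit = 2.306                 # t table value, alpha/2 = 0.025, 8 df
msw = 2.0                      # mean square within (made up)
b = 5                          # number of blocks
level_means = [10.0, 12.5, 13.0]   # factor-level sample means (made up)

lsd = fisher_lsd(t_crit, msw, b)
print(f"LSD = {lsd:.4f}")

significant = {}
for i, j in combinations(range(len(level_means)), 2):
    diff = abs(level_means[i] - level_means[j])
    significant[(i, j)] = diff > lsd
    print(f"levels {i + 1} and {j + 1}: |diff| = {diff:.1f}, "
          f"significant: {significant[(i, j)]}")
```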
RCBD Example
Chap 11-45
A physical therapist wished to compare three
methods for teaching patients to use a certain
prosthetic device. He felt that the rate of
learning would be different for patients of
different ages and wished to design an
experiment in which the influence of age could
be taken out.
Chap 11-46
Data:
Three patients in each of five age groups were
selected to participate in the experiment, and one
patient in each age group was randomly assigned
to each of the teaching methods. The methods of
instruction constitute our three treatments, and the
five age groups are the blocks. The data are shown in
the table.
Chap 11-47
Chap 11-48
Two-Way ANOVA
Examines the effect of
Two or more factors of interest on the
dependent variable
e.g.: Percent carbonation and line speed on
soft drink bottling process
Interaction between the different levels of these
two factors
e.g.: Does the effect of one particular
percentage of carbonation depend on which
level the line speed is set?
Chap 11-49
Two-Way ANOVA
Assumptions

Populations are normally distributed
Populations have equal variances
Independent random samples are
drawn
(continued)
Chap 11-50
Two-Way ANOVA
Sources of Variation
Two Factors of interest: A and B
a = number of levels of factor A
b = number of levels of factor B
N = total number of observations in all cells
Chap 11-51
Two-Way ANOVA
Sources of Variation
Source of variation                                       Degrees of Freedom
SST        Total Variation                                N - 1
$SS_A$     Variation due to factor A                      a - 1
$SS_B$     Variation due to factor B                      b - 1
$SS_{AB}$  Variation due to interaction between A and B   (a - 1)(b - 1)
SSE        Inherent variation (Error)                     N - ab

$SST = SS_A + SS_B + SS_{AB} + SSE$
(continued)
Chap 11-52
Two Factor ANOVA Equations

Total Sum of Squares:
$$SST = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n'} (x_{ijk} - \bar{\bar{x}})^2$$
Sum of Squares Factor A:
$$SS_A = b n' \sum_{i=1}^{a} (\bar{x}_i - \bar{\bar{x}})^2$$
Sum of Squares Factor B:
$$SS_B = a n' \sum_{j=1}^{b} (\bar{x}_j - \bar{\bar{x}})^2$$
Chap 11-53
Two Factor ANOVA Equations
(continued)
Sum of Squares Interaction Between A and B:
$$SS_{AB} = n' \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{x}_{ij} - \bar{x}_i - \bar{x}_j + \bar{\bar{x}})^2$$
Sum of Squares Error:
$$SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n'} (x_{ijk} - \bar{x}_{ij})^2$$
Chap 11-54
Two Factor ANOVA Equations
(continued)
where:
$$\bar{\bar{x}} = \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n'} x_{ijk}}{a b n'} = \text{Grand Mean}$$
$$\bar{x}_i = \frac{\sum_{j=1}^{b} \sum_{k=1}^{n'} x_{ijk}}{b n'} = \text{Mean of each level of factor A}$$
$$\bar{x}_j = \frac{\sum_{i=1}^{a} \sum_{k=1}^{n'} x_{ijk}}{a n'} = \text{Mean of each level of factor B}$$
$$\bar{x}_{ij} = \frac{\sum_{k=1}^{n'} x_{ijk}}{n'} = \text{Mean of each cell}$$
a = number of levels of factor A
b = number of levels of factor B
n' = number of replications in each cell
Chap 11-55
Mean Square Calculations
$$MS_A = \text{Mean square factor A} = \frac{SS_A}{a - 1}$$
$$MS_B = \text{Mean square factor B} = \frac{SS_B}{b - 1}$$
$$MS_{AB} = \text{Mean square interaction} = \frac{SS_{AB}}{(a - 1)(b - 1)}$$
$$MSE = \text{Mean square error} = \frac{SSE}{N - ab}$$
Chap 11-56
Two-Way ANOVA:
The F Test Statistic
F Test for Factor A Main Effect
$H_0: \mu_{A1} = \mu_{A2} = \mu_{A3} = \cdots$
$H_A$: Not all $\mu_{Ai}$ are equal
$$F = \frac{MS_A}{MSE}$$
Reject $H_0$ if $F > F_{\alpha}$

F Test for Factor B Main Effect
$H_0: \mu_{B1} = \mu_{B2} = \mu_{B3} = \cdots$
$H_A$: Not all $\mu_{Bi}$ are equal
$$F = \frac{MS_B}{MSE}$$
Reject $H_0$ if $F > F_{\alpha}$

F Test for Interaction Effect
$H_0$: factors A and B do not interact
to affect the mean response
$H_A$: factors A and B do interact
$$F = \frac{MS_{AB}}{MSE}$$
Reject $H_0$ if $F > F_{\alpha}$
Chap 11-57
Two-Way ANOVA
Summary Table
Source of Variation    Sum of Squares    Degrees of Freedom    Mean Squares                              F Statistic
Factor A               $SS_A$            a - 1                 $MS_A = SS_A / (a - 1)$                   $MS_A / MSE$
Factor B               $SS_B$            b - 1                 $MS_B = SS_B / (b - 1)$                   $MS_B / MSE$
AB (Interaction)       $SS_{AB}$         (a - 1)(b - 1)        $MS_{AB} = SS_{AB} / [(a - 1)(b - 1)]$    $MS_{AB} / MSE$
Error                  SSE               N - ab                MSE = SSE / (N - ab)
Total                  SST               N - 1
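The two-way sums of squares, mean squares, and F statistics in the summary table can be sketched in plain Python. The function name `two_way_anova`, the nested-list layout, and the data are all hypothetical; equal replication in every cell is assumed, as in the equations above.

```python
def two_way_anova(cells):
    """Two-factor ANOVA with replication.

    cells[i][j] is the list of replicate observations for level i of
    factor A and level j of factor B (same length in every cell).
    """
    a, b, n_rep = len(cells), len(cells[0]), len(cells[0][0])
    n_total = a * b * n_rep

    grand = sum(x for row in cells for cell in row for x in cell) / n_total
    mean_a = [sum(x for cell in cells[i] for x in cell) / (b * n_rep)
              for i in range(a)]
    mean_b = [sum(x for i in range(a) for x in cells[i][j]) / (a * n_rep)
              for j in range(b)]
    mean_cell = [[sum(cells[i][j]) / n_rep for j in range(b)] for i in range(a)]

    sst = sum((x - grand) ** 2 for row in cells for cell in row for x in cell)
    ss_a = b * n_rep * sum((m - grand) ** 2 for m in mean_a)
    ss_b = a * n_rep * sum((m - grand) ** 2 for m in mean_b)
    ss_ab = n_rep * sum((mean_cell[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
                        for i in range(a) for j in range(b))
    sse = sum((x - mean_cell[i][j]) ** 2
              for i in range(a) for j in range(b) for x in cells[i][j])

    mse = sse / (n_total - a * b)
    return {
        "SST": sst, "SS_A": ss_a, "SS_B": ss_b, "SS_AB": ss_ab, "SSE": sse,
        "MSE": mse,
        "F_A": (ss_a / (a - 1)) / mse,
        "F_B": (ss_b / (b - 1)) / mse,
        "F_AB": (ss_ab / ((a - 1) * (b - 1))) / mse,
    }


# Hypothetical data: 2 levels of A, 2 levels of B, 2 replications per cell.
data = [
    [[1, 3], [2, 4]],
    [[5, 7], [10, 12]],
]
table = two_way_anova(data)
for name, value in table.items():
    print(f"{name} = {value:.2f}")
```

All three F statistics share the denominator MSE; only the numerator mean square changes.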
Chap 11-58
Features of Two-Way ANOVA
F Test
Degrees of freedom always add up
N - 1 = (N - ab) + (a - 1) + (b - 1) + (a - 1)(b - 1)
Total = error + factor A + factor B + interaction
The denominator of the F Test is always the
same but the numerator is different
The sums of squares always add up
$SST = SSE + SS_A + SS_B + SS_{AB}$
Total = error + factor A + factor B + interaction
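The degrees-of-freedom identity can be verified by expanding the interaction term, as in this short worked check:

```latex
\begin{aligned}
(N - ab) + (a - 1) + (b - 1) + (a - 1)(b - 1)
  &= N - ab + a - 1 + b - 1 + (ab - a - b + 1) \\
  &= N - 1
\end{aligned}
```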
Chap 11-59
Examples:
Interaction vs. No Interaction
No interaction:
[Plot: Mean Response against Factor A Levels (1, 2), with one line each for Factor B Level 1, Level 2, and Level 3]
Interaction is present:
[Plot: Mean Response against Factor A Levels (1, 2), with one line each for Factor B Level 1, Level 2, and Level 3]
Chap 11-60
Chapter Summary
Described one-way analysis of variance
The logic of ANOVA
ANOVA assumptions
F test for difference in k means
The Tukey-Kramer procedure for multiple comparisons
Described randomized complete block designs
F test
Fisher's least significant difference test for multiple
comparisons
Described two-way analysis of variance
Examined effects of multiple factors and interaction
