MEL761: Statistics For Decision Making: Anova


1

MEL761: Statistics for Decision Making

Dr S G Deshmukh
Mechanical Department
Indian Institute of Technology
ANOVA
2
Learning Objectives
Understand the differences between various
experimental designs and when to use them.
Compute and interpret the results of a one-
way ANOVA.
Compute and interpret the results of a
randomized block design.
Compute and interpret the results of a two-
way ANOVA.
Understand and interpret interaction.
Know when and how to use multiple
comparison techniques.
3
Introduction to Design
of Experiments..1..

Experimental Design
- a plan and a structure to test
hypotheses in which the researcher
controls or manipulates one or more
variables.
4
Introduction to Design of Experiments
..2..
Independent Variable
Treatment variable is one that the experimenter
controls or modifies in the experiment.
Classification variable is a characteristic of the
experimental subjects that was present prior to
the experiment, and is not a result of the
experimenter's manipulations or control.
Levels or Classifications are the subcategories of
the independent variable used by the researcher
in the experimental design.

5
Introduction to Design
of Experiments ..3..

Dependent Variable
- the response to the different levels of
the independent variables.

6
Three Types
of Experimental Designs

Completely Randomized Design
Randomized Block Design
Factorial Experiments
7
Completely Randomized Design
[Figure: one independent variable, Machine Operator, with four levels (operators 1, 2, 3, 4); the valve opening measurements are listed under each operator.]
8
Valve Openings by Operator
Operator 1   Operator 2   Operator 3   Operator 4
6.33         6.26         6.44         6.29
6.26         6.36         6.38         6.23
6.31         6.23         6.58         6.19
6.29         6.27         6.54         6.21
6.40         6.19         6.56
             6.50         6.34
             6.19         6.58
             6.22
9
Analysis of Variance:
Assumptions
Observations are drawn from normally
distributed populations.
Observations represent random samples
from the populations.
Variances of the populations are equal.
10
One-Way ANOVA: Procedural
Overview
\[ H_o: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k \]
$H_a$: At least one of the means is different from the others
\[ F = \frac{MSC}{MSE} \]
If $F > F_c$, reject $H_o$.
If $F \le F_c$, do not reject $H_o$.
11
One-Way ANOVA:
Sums of Squares Definitions
total sum of squares = error sum of squares + between sum of squares
\[ SST = SSC + SSE \]
\[ \sum_{j=1}^{C}\sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2 = \sum_{j=1}^{C} n_j (\bar{X}_j - \bar{X})^2 + \sum_{j=1}^{C}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 \]
where:
i = a particular member of a treatment level
j = a treatment level
C = number of treatment levels
$n_j$ = number of observations in a given treatment level
$\bar{X}$ = grand mean
$\bar{X}_j$ = mean of a treatment group or level
$X_{ij}$ = individual value
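The partition SST = SSC + SSE can be checked numerically. The following is a minimal sketch, assuming the valve-opening measurements from the table earlier in the deck; the function name `one_way_sums_of_squares` is illustrative only and does not come from the slides.

```python
# Valve-opening measurements by operator (data from the example table).
groups = {
    1: [6.33, 6.26, 6.31, 6.29, 6.40],
    2: [6.26, 6.36, 6.23, 6.27, 6.19, 6.50, 6.19, 6.22],
    3: [6.44, 6.38, 6.58, 6.54, 6.56, 6.34, 6.58],
    4: [6.29, 6.23, 6.19, 6.21],
}


def one_way_sums_of_squares(groups):
    """Return (SSC, SSE, SST) for a dict of treatment level -> observations."""
    all_values = [x for obs in groups.values() for x in obs]
    grand_mean = sum(all_values) / len(all_values)
    ssc = 0.0  # between-treatment sum of squares
    sse = 0.0  # within-treatment (error) sum of squares
    for obs in groups.values():
        mean_j = sum(obs) / len(obs)
        ssc += len(obs) * (mean_j - grand_mean) ** 2
        sse += sum((x - mean_j) ** 2 for x in obs)
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    return ssc, sse, sst


ssc, sse, sst = one_way_sums_of_squares(groups)
print(f"SSC={ssc:.5f}  SSE={sse:.5f}  SST={sst:.5f}")
print("SST - (SSC + SSE) =", sst - (ssc + sse))  # zero up to rounding error
```

The identity holds for any grouping of the data, which is why the error sum of squares can also be obtained as SST minus SSC.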
12
Partitioning Total Sum
of Squares of Variation
SST
(Total Sum of Squares)
SSC
(Treatment Sum of Squares)
SSE
(Error Sum of Squares)
13
One-Way ANOVA:
Computational Formulas
\[ SSC = \sum_{j=1}^{C} n_j (\bar{X}_j - \bar{X})^2, \qquad df_C = C - 1 \]
\[ SSE = \sum_{j=1}^{C}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2, \qquad df_E = N - C \]
\[ SST = \sum_{j=1}^{C}\sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2, \qquad df_T = N - 1 \]
\[ MSC = \frac{SSC}{df_C}, \qquad MSE = \frac{SSE}{df_E}, \qquad F = \frac{MSC}{MSE} \]
where:
i = a particular member of a treatment level
j = a treatment level
C = number of treatment levels
$n_j$ = number of observations in a given treatment level
$\bar{X}$ = grand mean
$\bar{X}_j$ = column mean
$X_{ij}$ = individual value
14
One-Way ANOVA:
Preliminary Calculations
Operator   1          2          3          4          Total
           6.33       6.26       6.44       6.29
           6.26       6.36       6.38       6.23
           6.31       6.23       6.58       6.19
           6.29       6.27       6.54       6.21
           6.40       6.19       6.56
                      6.50       6.34
                      6.19       6.58
                      6.22
T_j        31.59      50.22      45.42      24.92      T = 152.15
n_j        5          8          7          4          N = 24
Mean       6.318000   6.277500   6.488571   6.230000   6.339583
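The column totals, counts, and means in this table are simple aggregates of the raw data. A short sketch, assuming the same valve-opening data; the variable names are illustrative:

```python
# Valve-opening measurements by operator (same data as the table).
valve = {
    1: [6.33, 6.26, 6.31, 6.29, 6.40],
    2: [6.26, 6.36, 6.23, 6.27, 6.19, 6.50, 6.19, 6.22],
    3: [6.44, 6.38, 6.58, 6.54, 6.56, 6.34, 6.58],
    4: [6.29, 6.23, 6.19, 6.21],
}

totals = {j: sum(obs) for j, obs in valve.items()}    # T_j
counts = {j: len(obs) for j, obs in valve.items()}    # n_j
means = {j: totals[j] / counts[j] for j in valve}     # column means

grand_total = sum(totals.values())                    # T
n_total = sum(counts.values())                        # N
grand_mean = grand_total / n_total                    # grand mean

for j in valve:
    print(f"Operator {j}: T_j={totals[j]:.2f}  n_j={counts[j]}  mean={means[j]:.6f}")
print(f"T={grand_total:.2f}  N={n_total}  grand mean={grand_mean:.6f}")
```

Note that the grand mean is the total divided by N, not the average of the four column means, because the sample sizes differ.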
15
One-Way ANOVA:
Sum of Squares Calculations
\[
\begin{aligned}
SSC &= \sum_{j=1}^{C} n_j (\bar{X}_j - \bar{X})^2 \\
    &= [5(6.318 - 6.339583)^2 + 8(6.2775 - 6.339583)^2 \\
    &\quad + 7(6.488571 - 6.339583)^2 + 4(6.23 - 6.339583)^2] \\
    &= 0.23658
\end{aligned}
\]
\[
\begin{aligned}
SSE &= \sum_{j=1}^{C}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 \\
    &= (6.33 - 6.318)^2 + (6.26 - 6.318)^2 + (6.31 - 6.318)^2 + (6.29 - 6.318)^2 \\
    &\quad + (6.4 - 6.318)^2 + (6.26 - 6.2775)^2 + (6.36 - 6.2775)^2 \\
    &\quad + \cdots + (6.19 - 6.230)^2 + (6.21 - 6.230)^2 \\
    &= 0.15492
\end{aligned}
\]
16
One-Way ANOVA:
Sum of Squares Calculations
\[
\begin{aligned}
SST &= \sum_{j=1}^{C}\sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2 \\
    &= (6.33 - 6.339583)^2 + (6.26 - 6.339583)^2 + (6.31 - 6.339583)^2 \\
    &\quad + \cdots + (6.19 - 6.339583)^2 + (6.21 - 6.339583)^2 \\
    &= 0.39150
\end{aligned}
\]
17
One-Way ANOVA: Mean Square
and F Calculations
\[ df_C = C - 1 = 4 - 1 = 3 \]
\[ df_E = N - C = 24 - 4 = 20 \]
\[ df_T = N - 1 = 24 - 1 = 23 \]
\[ MSC = \frac{SSC}{df_C} = \frac{.23658}{3} = .078860 \]
\[ MSE = \frac{SSE}{df_E} = \frac{.15492}{20} = .007746 \]
\[ F = \frac{MSC}{MSE} = \frac{.078860}{.007746} = 10.18 \]
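The mean square and F arithmetic can be reproduced in a few lines. This is a sketch that takes the sums of squares from the worked example as given; the variable names are illustrative:

```python
# Sums of squares and design sizes from the valve-opening example.
ssc, sse = 0.23658, 0.15492
c_levels, n_total = 4, 24

df_c = c_levels - 1          # treatment degrees of freedom
df_e = n_total - c_levels    # error degrees of freedom
df_t = n_total - 1           # total degrees of freedom

msc = ssc / df_c             # mean square between treatments
mse = sse / df_e             # mean square error
f_stat = msc / mse

print(f"df_C={df_c}  df_E={df_e}  df_T={df_t}")
print(f"MSC={msc:.6f}  MSE={mse:.6f}  F={f_stat:.2f}")
```

The degrees of freedom add up (3 + 20 = 23), mirroring the way the sums of squares add up.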
18
Analysis of Variance
for Valve Openings
Source of Variance df SS MS F

Between 3 0.23658 0.078860 10.18
Error 20 0.15492 0.007746
Total 23 0.39150

19


A Portion of the F Table for α = 0.05
(columns: numerator degrees of freedom df_1; rows: denominator degrees of freedom df_2)
df_2   1        2        3        4        5        6        7        8        9
1      161.45   199.50   215.71   224.58   230.16   233.99   236.77   238.88   240.54
...
18     4.41     3.55     3.16     2.93     2.77     2.66     2.58     2.51     2.46
19     4.38     3.52     3.13     2.90     2.74     2.63     2.54     2.48     2.42
20     4.35     3.49     3.10     2.87     2.71     2.60     2.51     2.45     2.39
21     4.32     3.47     3.07     2.84     2.68     2.57     2.49     2.42     2.37
Critical value for this example: $F_{.05,3,20} = 3.10$
20
One-Way ANOVA:
Procedural Summary
\[ H_o: \mu_1 = \mu_2 = \mu_3 = \mu_4 \]
$H_a$: At least one of the means is different from the others
$\nu_1 = 3$, $\nu_2 = 20$
Critical value: $F_{.05,3,20} = 3.10$
[Figure: F distribution with the non-rejection region to the left of the critical value 3.10 and the rejection region, α = .05, to its right.]
If $F > F_c = 3.10$, reject $H_o$.
If $F \le F_c = 3.10$, do not reject $H_o$.
Since $F = 10.18 > F_c = 3.10$, reject $H_o$.
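The decision rule amounts to a table lookup followed by a comparison. A minimal sketch, assuming only the rows of the α = 0.05 F table excerpted in the deck; `critical_f` and `decide` are hypothetical helper names:

```python
# Excerpt of the alpha = 0.05 F table: key is df2 (denominator),
# list position is df1 - 1 (numerator degrees of freedom 1..9).
F_TABLE_05 = {
    18: [4.41, 3.55, 3.16, 2.93, 2.77, 2.66, 2.58, 2.51, 2.46],
    19: [4.38, 3.52, 3.13, 2.90, 2.74, 2.63, 2.54, 2.48, 2.42],
    20: [4.35, 3.49, 3.10, 2.87, 2.71, 2.60, 2.51, 2.45, 2.39],
    21: [4.32, 3.47, 3.07, 2.84, 2.68, 2.57, 2.49, 2.42, 2.37],
}


def critical_f(df1, df2):
    """Look up the critical F value in the excerpted table."""
    return F_TABLE_05[df2][df1 - 1]


def decide(f_stat, f_crit):
    """Apply the rule: reject H_o only when F exceeds the critical value."""
    return "reject H_o" if f_stat > f_crit else "do not reject H_o"


f_crit = critical_f(df1=3, df2=20)
decision = decide(10.18, f_crit)
print(f"F_c = {f_crit:.2f}; observed F = 10.18; decision: {decision}")
```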
21
Excel Output
for the Valve Opening Example
Anova: Single Factor
SUMMARY
Groups       Count   Sum     Average       Variance
Operator 1   5       31.59   6.318         0.00277
Operator 2   8       50.22   6.2775        0.0110786
Operator 3   7       45.42   6.488571429   0.0101143
Operator 4   4       24.92   6.23          0.0018667
ANOVA
Source of Variation   SS            df   MS            F           P-value   F crit
Between Groups        0.236580119   3    0.07886004    10.181025   0.00028   3.09839
Within Groups         0.154915714   20   0.007745786
Total                 0.391495833   23
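The Within Groups mean square in this output is the pooled sample variance: the group variances weighted by their degrees of freedom. The sketch below uses Python's standard `statistics` module on the valve data to reproduce the Variance column and the pooled value; the variable names are illustrative.

```python
from statistics import mean, variance

# Valve-opening measurements by operator.
valve = {
    1: [6.33, 6.26, 6.31, 6.29, 6.40],
    2: [6.26, 6.36, 6.23, 6.27, 6.19, 6.50, 6.19, 6.22],
    3: [6.44, 6.38, 6.58, 6.54, 6.56, 6.34, 6.58],
    4: [6.29, 6.23, 6.19, 6.21],
}

# Count, sum, average, and sample variance for each group.
summary = {j: (len(obs), sum(obs), mean(obs), variance(obs))
           for j, obs in valve.items()}

n_total = sum(len(obs) for obs in valve.values())
df_within = n_total - len(valve)

# Pooled variance: sum of (n_j - 1) * s_j^2, divided by N - C.
pooled = sum((len(obs) - 1) * variance(obs) for obs in valve.values()) / df_within

for j, (count, total, avg, var) in summary.items():
    print(f"Operator {j}: count={count} sum={total:.2f} avg={avg:.6f} var={var:.7f}")
print(f"pooled variance (MS within) = {pooled:.9f}")
```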
22
Multiple Comparison Tests
An analysis of variance (ANOVA) test is
an overall test of differences among
groups.
Multiple Comparison techniques are used
to identify which pairs of means are
significantly different given that the
ANOVA test reveals overall
significance.
Tukey's honestly significant difference
(HSD) test requires equal sample sizes.
The Tukey-Kramer procedure is used when
sample sizes are unequal.
23
Tukey's Honestly Significant
Difference (HSD) Test
\[ HSD = q_{\alpha,C,N-C}\sqrt{\frac{MSE}{n}} \]
where: MSE = mean square error
n = sample size
$q_{\alpha,C,N-C}$ = critical value of the studentized range distribution from Table A.10
24
Data for Demonstration Problem
PLANT (Employee Age)
              1      2      3
              29     32     25
              27     33     24
              30     31     24
              27     34     25
              28     30     26

Group Means   28.2   32.0   24.8
n_j           5      5      5

C = 3
df_E = N - C = 12
MSE = 1.63
25
q Values for α = .01
Degrees of   Number of Populations
Freedom      2      3      4      5
1            90     135    164    186
2            14     19     22.3   24.7
3            8.26   10.6   12.2   13.3
4            6.51   8.12   9.17   9.96
...
11           4.39   5.14   5.62   5.97
12           4.32   5.04   5.50   5.84
$q_{.01,3,12} = 5.04$
26
Tukey's HSD Test
for the Employee Age Data
\[ HSD = q_{\alpha,C,N-C}\sqrt{\frac{MSE}{n}} = 5.04\sqrt{\frac{1.63}{5}} = 2.88 \]
\[ |\bar{X}_1 - \bar{X}_2| = |28.2 - 32.0| = 3.8 \]
\[ |\bar{X}_1 - \bar{X}_3| = |28.2 - 24.8| = 3.4 \]
\[ |\bar{X}_2 - \bar{X}_3| = |32.0 - 24.8| = 7.2 \]
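The HSD comparison for the employee age data can be sketched as follows. The critical value q = 5.04 is taken from the q table for α = .01; MSE is recomputed from the raw ages rather than typed in, and the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Employee ages by plant (demonstration problem data).
ages = {
    1: [29, 27, 30, 27, 28],
    2: [32, 33, 31, 34, 30],
    3: [25, 24, 24, 25, 26],
}
q_crit = 5.04   # q for alpha = .01, C = 3, N - C = 12 (from the q table)
n = 5           # common sample size per plant

means = {p: sum(v) / len(v) for p, v in ages.items()}
sse = sum((x - means[p]) ** 2 for p, v in ages.items() for x in v)
df_e = sum(len(v) for v in ages.values()) - len(ages)
mse = sse / df_e

hsd = q_crit * sqrt(mse / n)

# A pair differs significantly when |mean_r - mean_s| exceeds HSD.
significant = {
    (r, s): abs(means[r] - means[s]) > hsd
    for r, s in combinations(sorted(ages), 2)
}

print(f"MSE={mse:.2f}  HSD={hsd:.2f}")
for (r, s), flag in significant.items():
    print(f"plants {r} and {s}: |diff|={abs(means[r] - means[s]):.1f}  significant={flag}")
```

All three absolute differences (3.8, 3.4, 7.2) exceed 2.88, so every pair of plants differs significantly at α = .01.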
27
Tukey-Kramer Procedure:
The Case of Unequal Sample
Sizes
\[ HSD = q_{\alpha,C,N-C}\sqrt{\frac{MSE}{2}\left(\frac{1}{n_r} + \frac{1}{n_s}\right)} \]
where: MSE = mean square error
$n_r$ = sample size for the $r^{th}$ sample
$n_s$ = sample size for the $s^{th}$ sample
$q_{\alpha,C,N-C}$ = critical value of the studentized range distribution from Table A.10
28
Freighter Example: Means and
Sample Sizes for the Four
Operators
Operator Sample Size Mean
1 5 6.3180
2 8 6.2775
3 7 6.4886
4 4 6.2300
29
Tukey-Kramer Results
for the Four Operators
Pair      Critical Difference   |Actual Difference|
1 and 2   .1405                 .0405
1 and 3   .1443                 .1706*
1 and 4   .1653                 .0880
2 and 3   .1275                 .2111*
2 and 4   .1509                 .0475
3 and 4   .1545                 .2586*
*denotes significant at α = .05
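The critical differences in this table can be recomputed from the Tukey-Kramer formula. One assumption to flag: the slides do not list the α = .05 studentized range value, so q = 3.96 (for C = 4 and N - C = 20) is supplied here from a standard q table; it is consistent with the critical differences shown. The helper name `tukey_kramer` is illustrative.

```python
from itertools import combinations
from math import sqrt

# Sample sizes and means for the four operators (from the slides).
sizes = {1: 5, 2: 8, 3: 7, 4: 4}
means = {1: 6.3180, 2: 6.2775, 3: 6.4886, 4: 6.2300}
mse = 0.007746   # mean square error from the one-way ANOVA
q_crit = 3.96    # ASSUMPTION: q(.05; 4, 20) from a standard table, not from the slides


def tukey_kramer(n_r, n_s):
    """Critical difference for two samples of sizes n_r and n_s."""
    return q_crit * sqrt(mse / 2 * (1 / n_r + 1 / n_s))


results = {}
for r, s in combinations(sorted(sizes), 2):
    crit = tukey_kramer(sizes[r], sizes[s])
    diff = abs(means[r] - means[s])
    results[(r, s)] = (crit, diff, diff > crit)

for (r, s), (crit, diff, flag) in results.items():
    mark = "*" if flag else ""
    print(f"{r} and {s}: critical={crit:.4f}  actual={diff:.4f}{mark}")
```

Only the pairs involving operator 3 exceed their critical differences, which matches the starred rows.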
30
Partitioning the Total Sum of
Squares in the Randomized
Block Design
SST (Total Sum of Squares)
    SSC (Treatment Sum of Squares)
    SSE (Error Sum of Squares), which is split further into:
        SSR (Sum of Squares Blocks)
        SSE (Sum of Squares Error)
31
A Randomized Block Design
[Figure: grid of individual observations, with the levels of the single independent variable as columns and the levels of the blocking variable as rows.]
32
Randomized Block Design Treatment
Effects: Procedural Overview
\[ H_o: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k \]
$H_a$: At least one of the means is different from the others
\[ F = \frac{MSC}{MSE} \]
If $F > F_c$, reject $H_o$.
If $F \le F_c$, do not reject $H_o$.
33
Randomized Block Design:
Computational Formulas
\[ SSC = n\sum_{j=1}^{C} (\bar{X}_j - \bar{X})^2, \qquad df_C = C - 1 \]
\[ SSR = C\sum_{i=1}^{n} (\bar{X}_i - \bar{X})^2, \qquad df_R = n - 1 \]
\[ SSE = \sum_{i=1}^{n}\sum_{j=1}^{C} (X_{ij} - \bar{X}_j - \bar{X}_i + \bar{X})^2, \qquad df_E = (C - 1)(n - 1) = N - n - C + 1 \]
\[ SST = \sum_{i=1}^{n}\sum_{j=1}^{C} (X_{ij} - \bar{X})^2, \qquad df_T = N - 1 \]
\[ MSC = \frac{SSC}{C - 1}, \qquad MSR = \frac{SSR}{n - 1}, \qquad MSE = \frac{SSE}{N - n - C + 1} \]
\[ F_{treatments} = \frac{MSC}{MSE}, \qquad F_{blocks} = \frac{MSR}{MSE} \]
where: i = block group (row)
j = a treatment level (column)
C = number of treatment levels (columns)
n = number of observations in each treatment level (number of blocks - rows)
$X_{ij}$ = individual observation
$\bar{X}_j$ = treatment (column) mean
$\bar{X}_i$ = block (row) mean
$\bar{X}$ = grand mean
N = total number of observations
SSC = sum of squares columns (treatment)
SSR = sum of squares rows (blocking)
SSE = sum of squares error
SST = sum of squares total
34
Randomized Block Design:
Tread-Wear Example
                                  Speed
Supplier                 Slow   Medium   Fast   Block Means ($\bar{X}_i$)
1                        3.7    4.5      3.1    3.77
2                        3.4    3.9      2.8    3.37
3                        3.5    4.1      3.0    3.53
4                        3.2    3.5      2.6    3.10
5                        3.9    4.8      3.4    4.03
Treatment Means ($\bar{X}_j$)   3.54   4.16     2.98   3.56 (grand mean $\bar{X}$)
C = 3
n = 5
N = 15
35
Randomized Block Design:
Sum of Squares Calculations (Part
1)
\[
\begin{aligned}
SSC &= n\sum_{j=1}^{C} (\bar{X}_j - \bar{X})^2 \\
    &= 5[(3.54 - 3.56)^2 + (4.16 - 3.56)^2 + (2.98 - 3.56)^2] \\
    &= 3.484
\end{aligned}
\]
\[
\begin{aligned}
SSR &= C\sum_{i=1}^{n} (\bar{X}_i - \bar{X})^2 \\
    &= 3[(3.77 - 3.56)^2 + (3.37 - 3.56)^2 + (3.53 - 3.56)^2 + (3.10 - 3.56)^2 + (4.03 - 3.56)^2] \\
    &= 1.549
\end{aligned}
\]
36
Randomized Block Design:
Sum of Squares Calculations (Part
2)
\[
\begin{aligned}
SSE &= \sum_{i=1}^{n}\sum_{j=1}^{C} (X_{ij} - \bar{X}_j - \bar{X}_i + \bar{X})^2 \\
    &= (3.7 - 3.54 - 3.77 + 3.56)^2 + (3.4 - 3.54 - 3.37 + 3.56)^2 \\
    &\quad + \cdots + (2.6 - 2.98 - 3.10 + 3.56)^2 + (3.4 - 2.98 - 4.03 + 3.56)^2 \\
    &= 0.143
\end{aligned}
\]
\[
\begin{aligned}
SST &= \sum_{i=1}^{n}\sum_{j=1}^{C} (X_{ij} - \bar{X})^2 \\
    &= (3.7 - 3.56)^2 + (3.4 - 3.56)^2 + \cdots + (2.6 - 3.56)^2 + (3.4 - 3.56)^2 \\
    &= 5.176
\end{aligned}
\]
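The four sums of squares for the tread-wear example can be recomputed directly from the data. A minimal sketch, assuming rows are suppliers (blocks) and columns are speeds (treatments); the variable names are illustrative. Because it works with unrounded means, SSR comes out at 1.5493 rather than the value obtained from the two-decimal block means.

```python
# Tread-wear data: rows = suppliers 1..5 (blocks), columns = Slow, Medium, Fast.
wear = [
    [3.7, 4.5, 3.1],
    [3.4, 3.9, 2.8],
    [3.5, 4.1, 3.0],
    [3.2, 3.5, 2.6],
    [3.9, 4.8, 3.4],
]
n = len(wear)       # number of blocks (rows)
c = len(wear[0])    # number of treatment levels (columns)

grand = sum(sum(row) for row in wear) / (n * c)
col_means = [sum(row[j] for row in wear) / n for j in range(c)]   # treatment means
row_means = [sum(row) / c for row in wear]                        # block means

ssc = n * sum((m - grand) ** 2 for m in col_means)
ssr = c * sum((m - grand) ** 2 for m in row_means)
sse = sum((wear[i][j] - col_means[j] - row_means[i] + grand) ** 2
          for i in range(n) for j in range(c))
sst = sum((x - grand) ** 2 for row in wear for x in row)

print(f"SSC={ssc:.3f}  SSR={ssr:.3f}  SSE={sse:.3f}  SST={sst:.3f}")
```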
37
Randomized Block Design:
Mean Square Calculations
\[ MSC = \frac{SSC}{C - 1} = \frac{3.484}{2} = 1.742 \]
\[ MSR = \frac{SSR}{n - 1} = \frac{1.549}{4} = 0.387 \]
\[ MSE = \frac{SSE}{N - n - C + 1} = \frac{0.143}{8} = 0.018 \]
\[ F = \frac{MSC}{MSE} = \frac{1.742}{0.018} = 96.78 \]
38
Analysis of Variance
for the Tread-Wear Example
Source of Variance   SS      df   MS      F
Treatment            3.484   2    1.742   96.78
Block                1.549   4    0.387   21.50
Error                0.143   8    0.018
Total                5.176   14

39
Randomized Block Design
Treatment Effects: Procedural
Summary
\[ H_o: \mu_1 = \mu_2 = \mu_3 \]
$H_a$: At least one of the means is different from the others
\[ F = \frac{MSC}{MSE} = \frac{1.742}{0.018} = 96.78 \]
$F = 96.78 > F_{.01,2,8} = 8.65$, reject $H_o$.
40
Randomized Block Design
Blocking Effects: Procedural
Overview
\[ H_o: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 \]
$H_a$: At least one of the blocking means is different from the others
\[ F = \frac{MSR}{MSE} = \frac{.387}{.018} = 21.5 \]
$F = 21.5 > F_{.01,4,8} = 7.01$, reject $H_o$.
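The two F tests for the tread-wear example can be sketched as below. The first pair of ratios uses the rounded mean squares from the hand calculation; the second pair uses the unrounded sums of squares that appear in the Excel output for this example, which explains why Excel reports slightly larger F values. The critical values 8.65 and 7.01 are the ones quoted above; the variable names are illustrative.

```python
# Rounded mean squares from the hand calculation.
msc, msr, mse = 1.742, 0.387, 0.018
f_treatments = msc / mse
f_blocks = msr / mse

# Unrounded sums of squares as reported in the Excel output.
ss_cols, ss_rows, ss_err = 3.484, 1.5493333, 0.1426667
df_cols, df_rows, df_err = 2, 4, 8
mse_full = ss_err / df_err
f_treatments_full = (ss_cols / df_cols) / mse_full
f_blocks_full = (ss_rows / df_rows) / mse_full

# Critical values at alpha = .01 quoted in the procedural summaries.
f_crit_treatments, f_crit_blocks = 8.65, 7.01
reject_treatments = f_treatments > f_crit_treatments
reject_blocks = f_blocks > f_crit_blocks

print(f"rounded:   F_treatments={f_treatments:.2f}  F_blocks={f_blocks:.2f}")
print(f"unrounded: F_treatments={f_treatments_full:.2f}  F_blocks={f_blocks_full:.2f}")
print("reject H_o (treatments):", reject_treatments, " reject H_o (blocks):", reject_blocks)
```

Either way both null hypotheses are rejected; the rounding of MSE to 0.018 changes the reported F values but not the conclusions.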
41
Excel Output for Tread-Wear
Example: Randomized Block
Design
Anova: Two-Factor Without Replication
SUMMARY Count Sum Average Variance
Supplier 1 3 11.3 3.7666667 0.4933333
Supplier 2 3 10.1 3.3666667 0.3033333
Supplier 3 3 10.6 3.5333333 0.3033333
Supplier 4 3 9.3 3.1 0.21
Supplier 5 3 12.1 4.0333333 0.5033333
Slow 5 17.7 3.54 0.073
Medium 5 20.8 4.16 0.258
Fast 5 14.9 2.98 0.092
ANOVA
Source of Variation SS df MS F P-value F crit
Rows 1.5493333 4 0.3873333 21.719626 0.0002357 7.0060651
Columns 3.484 2 1.742 97.682243 2.395E-06 8.6490672
Error 0.1426667 8 0.0178333
Total 5.176 14
42
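Two-Way Factorial Design
[Figure: grid of cells, with the levels of the column treatment across the top and the levels of the row treatment down the side; each cell holds the observations for one row-column combination.]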
43
Two-Way ANOVA: Hypotheses
Row Effects: $H_o$: Row means are all equal.
$H_a$: At least one row mean is different from the others.
Column Effects: $H_o$: Column means are all equal.
$H_a$: At least one column mean is different from the others.
Interaction Effects: $H_o$: The interaction effects are zero.
$H_a$: There is an interaction effect.
44
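Formulas for Computing
a Two-Way ANOVA
\[ SSR = nC\sum_{i=1}^{R} (\bar{X}_i - \bar{X})^2, \qquad df_R = R - 1 \]
\[ SSC = nR\sum_{j=1}^{C} (\bar{X}_j - \bar{X})^2, \qquad df_C = C - 1 \]
\[ SSI = n\sum_{i=1}^{R}\sum_{j=1}^{C} (\bar{X}_{ij} - \bar{X}_i - \bar{X}_j + \bar{X})^2, \qquad df_I = (R - 1)(C - 1) \]
\[ SSE = \sum_{i=1}^{R}\sum_{j=1}^{C}\sum_{k=1}^{n} (X_{ijk} - \bar{X}_{ij})^2, \qquad df_E = RC(n - 1) \]
\[ SST = \sum_{i=1}^{R}\sum_{j=1}^{C}\sum_{k=1}^{n} (X_{ijk} - \bar{X})^2, \qquad df_T = N - 1 \]
\[ MSR = \frac{SSR}{R - 1}, \qquad F_R = \frac{MSR}{MSE} \]
\[ MSC = \frac{SSC}{C - 1}, \qquad F_C = \frac{MSC}{MSE} \]
\[ MSI = \frac{SSI}{(R - 1)(C - 1)}, \qquad F_I = \frac{MSI}{MSE} \]
\[ MSE = \frac{SSE}{RC(n - 1)} \]
where:
n = number of observations per cell
C = number of column treatments
R = number of row treatments
i = row treatment level
j = column treatment level
k = cell member
$X_{ijk}$ = individual observation
$\bar{X}_{ij}$ = cell mean
$\bar{X}_i$ = row mean
$\bar{X}_j$ = column mean
$\bar{X}$ = grand mean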
45
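A 2 × 3 Factorial Design
with Interaction
[Figure: cell means plotted against the column levels C1, C2, C3, with one line for each row level R1 and R2 (row effects).]
46
A 2 × 3 Factorial Design
with Some Interaction
[Figure: cell means plotted against the column levels C1, C2, C3, with one line for each row level R1 and R2 (row effects).]
47
A 2 × 3 Factorial Design
with No Interaction
[Figure: cell means plotted against the column levels C1, C2, C3, with one line for each row level R1 and R2 (row effects).]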
48
A 2 × 3 Factorial Design: Data and
Measurements for CEO Dividend
Example
                                 Location Where Company Stock is Traded
How Stockholders are             NYSE            AMEX            OTC             Row mean ($\bar{X}_i$)
Informed of Dividends
Annual/Quarterly Reports         2, 1, 2, 1      2, 3, 3, 2      4, 3, 4, 3      2.5
  cell mean                      X̄_11 = 1.5      X̄_12 = 2.5      X̄_13 = 3.5
Presentations to Analysts        2, 3, 1, 2      3, 3, 2, 4      4, 4, 3, 4      2.9167
  cell mean                      X̄_21 = 2.0      X̄_22 = 3.0      X̄_23 = 3.75
Column mean ($\bar{X}_j$)        1.75            2.75            3.625           X̄ = 2.7083
N = 24
n = 4
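The cell, row, column, and grand means can be recomputed from the raw ratings, and the gap between the two rows at each column gives a quick informal look at interaction. A sketch with illustrative variable names; the gap check is an informal aid and not a substitute for the F test on the interaction term.

```python
# CEO dividend ratings: cells[i][j] holds the n = 4 observations for
# row i (0 = Annual/Quarterly Reports, 1 = Presentations to Analysts)
# and column j (0 = NYSE, 1 = AMEX, 2 = OTC).
cells = [
    [[2, 1, 2, 1], [2, 3, 3, 2], [4, 3, 4, 3]],
    [[2, 3, 1, 2], [3, 3, 2, 4], [4, 4, 3, 4]],
]
R, C = len(cells), len(cells[0])


def mean(values):
    return sum(values) / len(values)


cell_means = [[mean(cells[i][j]) for j in range(C)] for i in range(R)]
row_means = [mean([x for j in range(C) for x in cells[i][j]]) for i in range(R)]
col_means = [mean([x for i in range(R) for x in cells[i][j]]) for j in range(C)]
grand_mean = mean([x for i in range(R) for j in range(C) for x in cells[i][j]])

# Difference between the two row cell means at each column level.
# Roughly constant gaps suggest little interaction.
row_gaps = [cell_means[1][j] - cell_means[0][j] for j in range(C)]

print("cell means:", cell_means)
print("row means:", [round(m, 4) for m in row_means])
print("column means:", col_means)
print("grand mean:", round(grand_mean, 4))
print("row gaps by column:", row_gaps)
```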
49
A 2 × 3 Factorial Design:
Calculations for the CEO Dividend
Example (Part 1)
\[
\begin{aligned}
SSR &= nC\sum_{i=1}^{R} (\bar{X}_i - \bar{X})^2 \\
    &= (4)(3)[(2.5 - 2.7083)^2 + (2.9167 - 2.7083)^2] \\
    &= 1.0418
\end{aligned}
\]
\[
\begin{aligned}
SSC &= nR\sum_{j=1}^{C} (\bar{X}_j - \bar{X})^2 \\
    &= (4)(2)[(1.75 - 2.7083)^2 + (2.75 - 2.7083)^2 + (3.625 - 2.7083)^2] \\
    &= 14.0833
\end{aligned}
\]
\[
\begin{aligned}
SSI &= n\sum_{i=1}^{R}\sum_{j=1}^{C} (\bar{X}_{ij} - \bar{X}_i - \bar{X}_j + \bar{X})^2 \\
    &= 4[(1.5 - 2.5 - 1.75 + 2.7083)^2 + (2.5 - 2.5 - 2.75 + 2.7083)^2 \\
    &\quad + (3.5 - 2.5 - 3.625 + 2.7083)^2 + (2.0 - 2.9167 - 1.75 + 2.7083)^2 \\
    &\quad + (3.0 - 2.9167 - 2.75 + 2.7083)^2 + (3.75 - 2.9167 - 3.625 + 2.7083)^2] \\
    &= 0.0833
\end{aligned}
\]
50
A 2 × 3 Factorial Design:
Calculations for the CEO Dividend
Example (Part 2)
\[
\begin{aligned}
SSE &= \sum_{i=1}^{R}\sum_{j=1}^{C}\sum_{k=1}^{n} (X_{ijk} - \bar{X}_{ij})^2 \\
    &= (2 - 1.5)^2 + (1 - 1.5)^2 + \cdots + (3 - 3.75)^2 + (4 - 3.75)^2 \\
    &= 7.7500
\end{aligned}
\]
\[
\begin{aligned}
SST &= \sum_{i=1}^{R}\sum_{j=1}^{C}\sum_{k=1}^{n} (X_{ijk} - \bar{X})^2 \\
    &= (2 - 2.7083)^2 + (1 - 2.7083)^2 + \cdots + (3 - 2.7083)^2 + (4 - 2.7083)^2 \\
    &= 22.9583
\end{aligned}
\]
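All five sums of squares for the CEO dividend example can be recomputed from the raw ratings. A minimal sketch with illustrative variable names; because it uses unrounded means, SSR comes out as 1.0417 rather than 1.0418, a difference that is purely rounding.

```python
# CEO dividend ratings: cells[i][j] = observations for row i, column j.
cells = [
    [[2, 1, 2, 1], [2, 3, 3, 2], [4, 3, 4, 3]],   # Annual/Quarterly Reports
    [[2, 3, 1, 2], [3, 3, 2, 4], [4, 4, 3, 4]],   # Presentations to Analysts
]
R, C, n = len(cells), len(cells[0]), len(cells[0][0])


def mean(values):
    return sum(values) / len(values)


cell_mean = [[mean(cells[i][j]) for j in range(C)] for i in range(R)]
row_mean = [mean([x for j in range(C) for x in cells[i][j]]) for i in range(R)]
col_mean = [mean([x for i in range(R) for x in cells[i][j]]) for j in range(C)]
grand = mean([x for i in range(R) for j in range(C) for x in cells[i][j]])

ssr = n * C * sum((m - grand) ** 2 for m in row_mean)
ssc = n * R * sum((m - grand) ** 2 for m in col_mean)
ssi = n * sum((cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
              for i in range(R) for j in range(C))
sse = sum((x - cell_mean[i][j]) ** 2
          for i in range(R) for j in range(C) for x in cells[i][j])
sst = sum((x - grand) ** 2
          for i in range(R) for j in range(C) for x in cells[i][j])

print(f"SSR={ssr:.4f}  SSC={ssc:.4f}  SSI={ssi:.4f}  SSE={sse:.4f}  SST={sst:.4f}")
```

The four components add up to SST, which gives a useful arithmetic check on hand calculations.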

51
A 2 × 3 Factorial Design:
Calculations for the CEO Dividend
Example (Part 3)
\[ MSR = \frac{SSR}{R - 1} = \frac{1.0418}{1} = 1.0418, \qquad F_R = \frac{MSR}{MSE} = \frac{1.0418}{0.4306} = 2.42 \]
\[ MSC = \frac{SSC}{C - 1} = \frac{14.0833}{2} = 7.0417, \qquad F_C = \frac{MSC}{MSE} = \frac{7.0417}{0.4306} = 16.35 \]
\[ MSI = \frac{SSI}{(R - 1)(C - 1)} = \frac{0.0833}{2} = 0.0417, \qquad F_I = \frac{MSI}{MSE} = \frac{0.0417}{0.4306} = 0.10 \]
\[ MSE = \frac{SSE}{RC(n - 1)} = \frac{7.7500}{18} = 0.4306 \]
52
Analysis of Variance
for the CEO Dividend Problem
Source of Variance   SS        df   MS       F
Row                  1.0418    1    1.0418   2.42
Column               14.0833   2    7.0417   16.35*
Interaction          0.0833    2    0.0417   0.10
Error                7.7500    18   0.4306
Total                22.9583   23

*Denotes significance at α = .01.
53
Excel Output for the
CEO Dividend Example (Part 1)
Anova: Two-Factor With Replication
SUMMARY NYSE ASE OTC Total
AQReport
Count 4 4 4 12
Sum 6 10 14 30
Average 1.5 2.5 3.5 2.5
Variance 0.3333 0.3333 0.3333 1
Presentation
Count 4 4 4 12
Sum 8 12 15 35
Average 2 3 3.75 2.9167
Variance 0.6667 0.6667 0.25 0.9924
Total
Count 8 8 8
Sum 14 22 29
Average 1.75 2.75 3.625
Variance 0.5 0.5 0.2679
54
Excel Output for the
CEO Dividend Example (Part 2)
ANOVA
Source of Variation SS df MS F P-value F crit
Sample 1.0417 1 1.0417 2.4194 0.1373 4.4139
Columns 14.083 2 7.0417 16.355 9E-05 3.5546
Interaction 0.0833 2 0.0417 0.0968 0.9082 3.5546
Within 7.75 18 0.4306
Total 22.958 23
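The F column and the decisions implied by this output can be reproduced from the SS and df columns. A sketch, taking the SS, df, and F crit values from the table as given; the F crit values 4.4139 and 3.5546 agree with the df_2 = 18 row of the α = 0.05 F table excerpt (4.41 and 3.55). Variable names are illustrative.

```python
# (SS, df, F crit) for each effect, as reported in the Excel output.
effects = {
    "Sample": (1.0417, 1, 4.4139),        # row factor
    "Columns": (14.083, 2, 3.5546),
    "Interaction": (0.0833, 2, 3.5546),
}
ss_within, df_within = 7.75, 18
ms_within = ss_within / df_within         # error mean square

results = {}
for source, (ss, df, f_crit) in effects.items():
    f_stat = (ss / df) / ms_within
    results[source] = (f_stat, f_stat > f_crit)

for source, (f_stat, significant) in results.items():
    print(f"{source:<12} F={f_stat:.4f}  significant={significant}")
```

Only the column effect (where the stock is traded) exceeds its critical value; the row effect and the interaction do not.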
