ANOVA Presentation
Presented By:
Richa Rai
Kinshuk Sharma
Moiz Ashraf Ansari
Geeta Verma
Shruti Godra
Overview
What is ANOVA?
When is it useful?
How does it work?
Some Examples
Limitations
Conclusions
Definitions
ANOVA: analysis of variation in an
experimental outcome and especially of
a statistical variance in order to
determine the contributions of given
factors or variables to the variance.
Remember: Variance: the square of the
standard deviation
Remember: R. A. Fisher, 1919- (evolutionary biology)
Introduction
Any data set has variability
Variability exists within groups
and between groups
The question ANOVA lets us answer: is this variability significant, or merely due to chance?
Comparing the variation within groups with the variation between groups helps us decide. If the two are about equal, the differences between the groups are likely due to chance and are not significant.
$H_0$: variability within groups = variability between groups, which means that $\mu_1 = \mu_n$
$H_a$: variability within groups $\neq$ variability between groups, or, $\mu_1 \neq \mu_n$
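To make the comparison concrete, here is a minimal Python sketch that computes a between-group and a within-group variance estimate and their ratio for a single factor. The three groups are hypothetical numbers chosen only for illustration (they do not come from these slides), and the function name is likewise an assumption.

```python
# Hypothetical data: three groups of four observations (not from the slides).
groups = {
    "g1": [4.0, 5.0, 6.0, 5.0],
    "g2": [7.0, 8.0, 9.0, 8.0],
    "g3": [5.0, 6.0, 7.0, 6.0],
}


def mean(values):
    return sum(values) / len(values)


def between_within(groups):
    """Return (mean square between, mean square within, F ratio)."""
    all_values = [x for g in groups.values() for x in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of groups
    n = len(all_values)      # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between, ms_within, ms_between / ms_within


ms_b, ms_w, f_ratio = between_within(groups)
print(f"MS between = {ms_b:.3f}, MS within = {ms_w:.3f}, F = {f_ratio:.2f}")
```

A ratio close to 1 is what the null hypothesis predicts; a ratio well above 1 suggests that the group means differ by more than chance alone would explain.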
Two-Way ANOVA
Two factors (manipulated variables)
Analysis of variance can be used to test for the effects of two factors.
Two or more groups to compare
Examples
A tea company can analyse its sales on the basis of four salesmen and three seasons: winter, summer and rainy season.
Educationalists can analyse three methods of teaching a subject in four different ways.
Procedure of Two Way ANOVA
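1. Take the total $T$ of the values of the individual items in all the samples.
2. Correction factor $= T^2/N$, where $N$ is the total number of items.
3. Total sum of squares:
$SST = [X_1^2 + X_2^2 + \dots + X_k^2] - T^2/N$
4. Sum of squares between the columns:
$SSC = \sum (\sum X_c)^2 / n_c - T^2/N$
where $\sum X_c$ is the total of a column and $n_c$ the number of items in it.
5. Sum of squares between the rows:
$SSR = \sum (\sum X_r)^2 / n_r - T^2/N$
where $\sum X_r$ is the total of a row and $n_r$ the number of items in it.
6. Sum of squares for the residual:
$SSE = SST - (SSC + SSR)$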
7. Determination of the degrees of freedom
Total number of items $= cr$
Total degrees of freedom $= cr - 1$
Degrees of freedom between columns $= c - 1$
Degrees of freedom between rows $= r - 1$
Degrees of freedom for the residual $= (c-1)(r-1)$
8. Calculation of the variance ratio or F-coefficient:
F-coefficient for variance between columns = variance between columns (MSC) / residual variance (MSE)
F-coefficient for variance between rows = variance between rows (MSR) / residual variance (MSE)
9. Preparation of the two-way ANOVA table

Source of variation   Sum of squares (SS)                                Degrees of freedom (d.f.)   Mean square (MS)             Variance ratio
Between columns       $SSC = \sum (\sum X_c)^2/n_c - T^2/N$              $c-1$                       $MSC = SSC/(c-1)$            $F = MSC/MSE$
Between rows          $SSR = \sum (\sum X_r)^2/n_r - T^2/N$              $r-1$                       $MSR = SSR/(r-1)$            $F = MSR/MSE$
Residual              $SSE = SST - (SSC + SSR)$                          $(c-1)(r-1)$                $MSE = SSE/\{(c-1)(r-1)\}$
Total                 $SST = [X_1^2 + X_2^2 + \dots + X_k^2] - T^2/N$    $cr-1$
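The nine steps above can be sketched as one small Python function for a table with a single observation per cell. This is an illustrative sketch only: the function name `two_way_anova`, the dictionary keys and the 3 x 3 example table are assumptions made here and are not part of the presentation.

```python
def two_way_anova(table):
    """Two-way ANOVA for an r x c table with one observation per cell.

    `table` is a list of r rows, each a list of c numbers.
    """
    r, c = len(table), len(table[0])
    n = r * c
    total = sum(sum(row) for row in table)                # step 1: T
    cf = total ** 2 / n                                   # step 2: T^2 / N
    sst = sum(x * x for row in table for x in row) - cf   # step 3
    col_totals = [sum(row[j] for row in table) for j in range(c)]
    row_totals = [sum(row) for row in table]
    ssc = sum(t * t / r for t in col_totals) - cf         # step 4: each column holds r items
    ssr = sum(t * t / c for t in row_totals) - cf         # step 5: each row holds c items
    sse = sst - (ssc + ssr)                               # step 6
    df_c, df_r, df_e = c - 1, r - 1, (c - 1) * (r - 1)    # step 7
    msc, msr, mse = ssc / df_c, ssr / df_r, sse / df_e
    return {                                              # steps 8 and 9
        "SST": sst, "SSC": ssc, "SSR": ssr, "SSE": sse,
        "df": (df_c, df_r, df_e, n - 1),
        "MSC": msc, "MSR": msr, "MSE": mse,
        "F_columns": msc / mse, "F_rows": msr / mse,
    }


# Hypothetical 3 x 3 table, used only to exercise the function.
example = [
    [3, 5, 7],
    [4, 6, 11],
    [5, 10, 12],
]
result = two_way_anova(example)
for key, value in result.items():
    print(key, value)
```

The two F values are then compared with the tabulated F limit for the matching pair of degrees of freedom.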
Example of Two-Way ANOVA
Set up the ANOVA table for the following information on three drugs tested to judge their effectiveness in reducing blood pressure for three different groups of people.

Group of people    Drug X    Drug Y    Drug Z
A                  14        10        11
                   15         9        11
B                  12         7        10
                   11         8        11
C                  10        11         8
                   11        11         7

1. Do the drugs act differently?
2. Are the different groups of people affected differently?
3. Is the interaction term significant?
Answer the above questions at the 5% significance level.
Solution:
Step 1. $T = 187$, $n = 18$, thus the correction factor $= 187 \times 187 / 18 = 1942.72$
Step 2. Total SS $= \sum X^2 - \text{C.F.} = 2019 - 1942.72 = 76.28$
Step 3. SS between the columns (that is, between the drugs):
$SSC = [C_1^2/n(C_1) + C_2^2/n(C_2) + C_3^2/n(C_3)] - \text{C.F.} = 28.77$
where $C_j$ is the total for drug $j$ and $n(C_j) = 6$ is the number of readings for it.
Step 4. SS between the rows (that is, between the groups of people):
$SSR = \sum (\sum X_r)^2 / n_r - T^2/N = 14.78$
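Step 5. SS within samples (error), the squared deviations of each reading from the mean $\bar{X}$ of its own group-and-drug cell:
$SSE = (X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots = 3.50$
Step 6. SS for interaction variation:
$SSI = SST - (SSC + SSR + SSE) = 76.28 - (28.77 + 14.78 + 3.50) = 29.23$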
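Steps 1 to 6 can be checked with a short Python sketch that recomputes every sum of squares from the raw readings. The data are the eighteen readings from the table above; the variable names and the nested-dictionary layout are assumptions made for this illustration. Without intermediate rounding the code gives SSC of about 28.78 and SSI of about 29.22, which differ from the figures quoted in the steps only in the last digit.

```python
# data[group][drug] holds the two readings for that cell (values from the slide).
data = {
    "A": {"X": [14, 15], "Y": [10, 9], "Z": [11, 11]},
    "B": {"X": [12, 11], "Y": [7, 8], "Z": [10, 11]},
    "C": {"X": [10, 11], "Y": [11, 11], "Z": [8, 7]},
}
drugs = ["X", "Y", "Z"]

values = [x for g in data for d in drugs for x in data[g][d]]
n = len(values)
total = sum(values)
cf = total ** 2 / n                                    # correction factor
sst = sum(x * x for x in values) - cf                  # total SS

# Between columns (drugs): each drug total is based on n / 3 readings.
col_totals = [sum(sum(data[g][d]) for g in data) for d in drugs]
ssc = sum(t * t for t in col_totals) / (n // len(drugs)) - cf

# Between rows (groups of people): each group total is based on n / 3 readings.
row_totals = [sum(sum(data[g][d]) for d in drugs) for g in data]
ssr = sum(t * t for t in row_totals) / (n // len(data)) - cf

# Within samples: squared deviations from each cell's own mean.
sse = 0.0
for g in data:
    for d in drugs:
        cell = data[g][d]
        cell_mean = sum(cell) / len(cell)
        sse += sum((x - cell_mean) ** 2 for x in cell)

ssi = sst - (ssc + ssr + sse)                          # interaction SS

print(f"T = {total}, C.F. = {cf:.2f}")
print(f"SST = {sst:.2f}, SSC = {ssc:.2f}, SSR = {ssr:.2f}")
print(f"SSE = {sse:.2f}, SSI = {ssi:.2f}")
```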
Two-factor ANOVA table

Source of variation                SS       d.f.           MS                  F-ratio                 5% F-limit
Between columns (between drugs)    28.77    (3-1) = 2      28.77/2 = 14.385    14.385/0.389 = 37.0     F(2, 9) = 4.26
Between rows (between people)      14.78    (3-1) = 2      14.78/2 = 7.390     7.390/0.389 = 19.0      F(2, 9) = 4.26
Interaction                        29.23    4              29.23/4 = 7.308     7.308/0.389 = 18.8      F(4, 9) = 3.63
Within samples (error)              3.50    (18-9) = 9     3.50/9 = 0.389
Total                              76.28    (18-1) = 17
Conclusion:
The above table shows that all three F-ratios are significant at the 5% level, which means that the drugs act differently, the different groups of people are affected differently, and the interaction term is significant.
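The decision rule behind this conclusion can be written as a few lines of Python. The sums of squares, degrees of freedom and 5% F-limits are the values quoted in the table above; the dictionary layout and variable names are assumptions made for this sketch, and the F-limits are taken from the slide rather than computed.

```python
# Values as quoted in the two-factor ANOVA table for the drug example.
sources = {
    "between drugs":  {"ss": 28.77, "df": 2, "f_limit": 4.26},
    "between people": {"ss": 14.78, "df": 2, "f_limit": 4.26},
    "interaction":    {"ss": 29.23, "df": 4, "f_limit": 3.63},
}
ss_error, df_error = 3.50, 9
ms_error = ss_error / df_error          # within-samples mean square

results = {}
for name, row in sources.items():
    mean_square = row["ss"] / row["df"]
    f_ratio = mean_square / ms_error
    significant = f_ratio > row["f_limit"]   # reject H0 when F exceeds the limit
    results[name] = (f_ratio, significant)
    print(f"{name}: F = {f_ratio:.2f}, limit = {row['f_limit']}, significant = {significant}")
```

Each F-ratio is the mean square of its source divided by the error mean square, and the effect is judged significant when that ratio exceeds the tabulated limit.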
Another example using the ANOVA table
The following table shows the number of items produced by five different workers in a factory on five different machines:

Worker    M1    M2    M3    M4    M5
A         10    14    23    18    20
B         11    15    24    17    21
C          9    12    20    16    19
D          8    13    17    17    20
E         12    15    19    15    22

Discuss, with the help of analysis of variance, whether the mean productivity of (a) workers and (b) machines differs significantly.
Solution:
Let us take the hypothesis that
(a) the mean productivity is the same for the five different
machines.
(b) the five workers do not differ with regard to mean
productivity.
The data are classified two ways, by worker and by machine, so two-way analysis of variance is the method called for.
In order to simplify calculations, we code the data by
subtracting 16 from each figure. The data in the coded
form are given below:
Coded data (each figure minus 16)

Worker      M1     M2     M3     M4     M5    Total
A           -6     -2      7      2      4     5 (r1)
B           -5     -1      8      1      5     8 (r2)
C           -7     -4      4      0      3    -4 (r3)
D           -8     -3      1      1      4    -5 (r4)
E           -4     -1      3     -1      6     3 (r5)
Total      -30    -11     23      3     22     7
          (C1)   (C2)   (C3)   (C4)   (C5)

Squares of the coded data

Worker      M1     M2     M3     M4     M5    Total
A           36      4     49      4     16    109
B           25      1     64      1     25    116
C           49     16     16      0      9     90
D           64      9      1      1     16     91
E           16      1      9      1     36     63
Total      190     31    139      7    102    469
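Coding the data by subtracting a constant is safe because sums of squared deviations do not change when every value is shifted by the same amount. The sketch below, written for illustration with assumed names (`raw`, `coded`, `total_ss`), rebuilds the coded table from the original counts and confirms that the total sum of squares is the same for both.

```python
# Items produced: rows are workers A-E, columns are machines M1-M5 (from the slide).
raw = [
    [10, 14, 23, 18, 20],
    [11, 15, 24, 17, 21],
    [9, 12, 20, 16, 19],
    [8, 13, 17, 17, 20],
    [12, 15, 19, 15, 22],
]


def total_ss(table):
    """Total sum of squares: sum of X^2 minus the correction factor T^2 / N."""
    values = [x for row in table for x in row]
    return sum(x * x for x in values) - sum(values) ** 2 / len(values)


coded = [[x - 16 for x in row] for row in raw]       # subtract 16 from each figure

row_totals = [sum(row) for row in coded]
col_totals = [sum(row[j] for row in coded) for j in range(5)]

print("row totals:", row_totals)
print("column totals:", col_totals)
print(f"SST raw = {total_ss(raw):.2f}, SST coded = {total_ss(coded):.2f}")
```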
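(1) Correction factor (C.F.) $= T^2/N = (7)^2/25 = 1.96$
(2) Total sum of squares of deviations:
$SST = (\sum X_1^2 + \sum X_2^2 + \dots + \sum X_5^2) - \text{C.F.}$
where $\sum X_i^2$ is the sum of the squared coded values in row $i$
$= (109 + 116 + 90 + 91 + 63) - 1.96 = 469 - 1.96 = 467.04$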
Calculation of sums of squares of deviations between columns (machines) and rows (workers)
(3) Sum of squares between machines (S.S. between columns):
$SSC = [C_1^2/n(C_1) + C_2^2/n(C_2) + \dots + C_5^2/n(C_5)] - \text{C.F.}$
$= (-30)^2/5 + (-11)^2/5 + (23)^2/5 + (3)^2/5 + (22)^2/5 - 1.96$
$= 900/5 + 121/5 + 529/5 + 9/5 + 484/5 - 1.96$
$= 2043/5 - 1.96 = 408.6 - 1.96 = 406.64$
(4) Sum of squares between workers (S.S. between rows):
$SSR = [r_1^2/n(r_1) + r_2^2/n(r_2) + \dots + r_5^2/n(r_5)] - \text{C.F.}$
$= (5)^2/5 + (8)^2/5 + (-4)^2/5 + (-5)^2/5 + (3)^2/5 - 1.96$
$= 25/5 + 64/5 + 16/5 + 25/5 + 9/5 - 1.96$
$= 139/5 - 1.96 = 27.8 - 1.96 = 25.84$
(5) Residual sum of squares:
$SSE = SST - (SSC + SSR)$
$= 467.04 - [406.64 + 25.84]$
$= 467.04 - 432.48 = 34.56$
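The arithmetic in steps (1) to (5) can be reproduced with a few lines of Python working directly on the coded data. The variable names are assumptions made for this sketch; the numbers are the coded values from the slides.

```python
# Coded data (original figures minus 16): rows are workers A-E, columns M1-M5.
coded = [
    [-6, -2, 7, 2, 4],
    [-5, -1, 8, 1, 5],
    [-7, -4, 4, 0, 3],
    [-8, -3, 1, 1, 4],
    [-4, -1, 3, -1, 6],
]
r, c = len(coded), len(coded[0])
n = r * c

grand_total = sum(sum(row) for row in coded)
cf = grand_total ** 2 / n                                  # (1) correction factor
sst = sum(x * x for row in coded for x in row) - cf        # (2) total SS

col_totals = [sum(row[j] for row in coded) for j in range(c)]
row_totals = [sum(row) for row in coded]
ssc = sum(t * t / r for t in col_totals) - cf              # (3) between machines
ssr = sum(t * t / c for t in row_totals) - cf              # (4) between workers
sse = sst - (ssc + ssr)                                    # (5) residual

print(f"C.F. = {cf:.2f}, SST = {sst:.2f}")
print(f"SSC = {ssc:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
```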
Two-factor ANOVA table

Source of variance (i)   Sum of squares (ii)   Degrees of freedom (iii)   Mean squares (iv) = (ii) / (iii)   Variance ratio (v)
Between machines         SSC = 406.64          c - 1 = 5 - 1 = 4          MSC = 101.66                       F = 101.66/2.16 = 47.06
Between workers          SSR = 25.84           r - 1 = 5 - 1 = 4          MSR = 6.46                         F = 6.46/2.16 = 2.99
Residual error           SSE = 34.56           (c - 1)(r - 1) = 16        MSE = 2.16
Total                    SST = 467.04          cr - 1 = 24
For $\nu_1 = 4$ and $\nu_2 = 16$, $F_{0.05} = 3.01$
Conclusion:
(i) The calculated value of F between machines, i.e. between columns, is 47.06, while the corresponding table value is 3.01. The calculated value of F is greater than the table value, so the hypothesis is rejected. The mean productivity does not appear to be the same for the different machines.
(ii) The calculated value of F between workers, i.e. between rows, is 2.99, while the corresponding table value is 3.01. The calculated value of F is less than the table value, so the hypothesis is accepted. Hence the five workers do not differ significantly with regard to mean productivity.
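The two decisions can be expressed as a short Python sketch that starts from the sums of squares in the ANOVA table. The critical value 3.01 is the table value quoted on the slide for 4 and 16 degrees of freedom at the 5% level; it is not computed here, and the variable names are assumptions.

```python
# Sums of squares from the two-factor ANOVA table (coded data).
ssc, ssr, sse = 406.64, 25.84, 34.56
c = r = 5                                # five machines, five workers

df_c, df_r = c - 1, r - 1
df_e = (c - 1) * (r - 1)

msc = ssc / df_c                         # mean square between machines
msr = ssr / df_r                         # mean square between workers
mse = sse / df_e                         # residual mean square

f_machines = msc / mse
f_workers = msr / mse
f_critical = 3.01                        # quoted table value for (4, 16) d.f. at 5%

reject_machines = f_machines > f_critical
reject_workers = f_workers > f_critical

print(f"F (machines) = {f_machines:.2f}, reject H0: {reject_machines}")
print(f"F (workers)  = {f_workers:.2f}, reject H0: {reject_workers}")
```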
Application
It is a useful technique for research in the following fields:
Economics
Biology
Education
Psychology
Sociology
Business/industry, etc.
Thank You