BCS301M55
Prepared by:
Purushotham P
Assistant Professor
SJC Institute of Technology
Email id: purushotham@sjcit.ac.in
Phone: 7338005244
Experimental unit:
For conducting an experiment, the experimental material is divided into smaller parts and
each part is referred to as an experimental unit. The unit that is randomly assigned to a
treatment is the experimental unit. The phrase “randomly assigned” is very important in this
definition.
Experiment:
A way of getting an answer to a question which the experimenter wants to know.
Treatment:
Different objects or procedures which are to be compared in an experiment are called
treatments.
Sampling unit:
The object that is measured in an experiment is called the sampling unit. This may be different
from the experimental unit.
Factor:
A factor is a variable defining a categorization. A factor can be fixed or random in nature. A
factor is termed as a fixed factor if all the levels of interest are included in the experiment. A
factor is termed as a random factor if all the levels of interest are not included in the experiment
and those that are can be considered to be randomly chosen from all the levels of interest.
Replication:
It is the repetition of the experimental situation by replicating the experimental unit.
ANOVA:
Analysis of variance (ANOVA) is an analysis tool used in statistics that splits an observed
aggregate variability found inside a data set into two parts: systematic factors and random factors.
The systematic factors have a statistical influence on the given data set, while the random factors do
not.
ANOVA is a statistical method used to analyze the differences between the means of two or
more groups or treatments, and to determine whether any of those differences are
statistically significant.
There are two main types of ANOVA: one-way (or unidirectional) and two-way. There are also
variations of ANOVA.
In the social sciences, ANOVA tests can be used to study the statistical significance of the
effect of various study environments on test scores. In medical research, the ANOVA test can be
used to identify the relationship between various types or brands of medication and their effect
on individuals with migraines or depression.
We can also use the ANOVA test to compare different suppliers and select the best available.
ANOVA is typically used when we have more than two sample groups and want to determine
whether there are any statistically significant differences between the means of these
independent sample groups.
CRD: A completely randomized design (CRD) is one where the treatments are assigned completely
at random so that each experimental unit has the same chance of receiving any one treatment.
RBD: A randomized block design is a restricted randomized design, in which experimental units are
first organized into homogeneous blocks and then the treatments are assigned at random to the units
within each block.
LSD: The Latin Square Design gets its name from the fact that we can write it as a square with Latin
letters to correspond to the treatments. The treatment factor levels are the Latin letters in the Latin
square design. The number of rows and columns has to correspond to the number of treatment levels.
PROBLEMS:
1. Three processes A, B and C are tested to see whether their outputs are equivalent. The
following observations of outputs are made:
A 10 12 13 11 10 14 15 13
B 9 11 10 12 13 - - -
C 11 10 15 14 12 13 - -
Carry out the analysis of variance and state your conclusion.
Sol. To carry out the analysis of variance, we form the following tables
        Observations               Total       Squares
A       10 12 13 11 10 14 15 13    T1 = 98     T1² = 9604
B       9 11 10 12 13              T2 = 55     T2² = 3025
C       11 10 15 14 12 13          T3 = 75     T3² = 5625
Total                              T = 228     -
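Sum of Squares
Correction Factor $CF = \frac{T^2}{N} = \frac{228^2}{19} = 2736$
Total sum of squares $TSS = \sum\sum x^2 - CF = 2794 - 2736 = 58$
$SST = \frac{9604}{8} + \frac{3025}{5} + \frac{5625}{6} - 2736$
⇒ 𝑆𝑆𝑇 = 1200.5 + 605 + 937.5 − 2736
⇒ 𝑆𝑆𝑇 = 2743 − 2736
⇒ 𝑆𝑆𝑇 = 7
⇒ 𝑆𝑆𝐸 = 𝑇𝑆𝑆 − 𝑆𝑆𝑇 = 58 − 7
⇒ 𝑆𝑆𝐸 = 51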
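As a cross-check, the shortcut computation for Problem 1 can be sketched in Python. This is a minimal illustration and not part of the original notes; the function name `one_way_sums_of_squares` is hypothetical, and it handles the unequal group sizes (8, 5 and 6) by dividing each squared total by its own group size.

```python
# Minimal sketch (assumption: helper name is illustrative, not from the notes).
def one_way_sums_of_squares(groups):
    """Return (CF, TSS, SST, SSE) for a list of samples of possibly unequal size."""
    n_total = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    cf = grand_total ** 2 / n_total                        # correction factor T^2 / N
    tss = sum(x * x for g in groups for x in g) - cf       # total sum of squares
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - cf   # between-treatment sum of squares
    sse = tss - sst                                        # error (within-treatment) sum of squares
    return cf, tss, sst, sse


processes = [
    [10, 12, 13, 11, 10, 14, 15, 13],  # A
    [9, 11, 10, 12, 13],               # B
    [11, 10, 15, 14, 12, 13],          # C
]
cf, tss, sst, sse = one_way_sums_of_squares(processes)
print(cf, tss, sst, sse)
```

The four values agree with the hand computation: CF = 2736, TSS = 58, SST = 7 and SSE = 51.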
        Observations      Total       Squares
S1      9 7 6 5 8         T1 = 35     T1² = 1225
S2      7 4 5 4 5         T2 = 25     T2² = 625
S3      6 5 6 7 6         T3 = 30     T3² = 900
Total                     T = 90      -
Sum of Squares
S1      81 49 36 25 64    255
S2      49 16 25 16 25    131
S3      36 25 36 49 36    182
$SST = \frac{1225}{5} + \frac{625}{5} + \frac{900}{5} - 540$
⇒ 𝑆𝑆𝑇 = 245 + 125 + 180 − 540
⇒ 𝑆𝑆𝑇 = 550 − 540
⇒ 𝑆𝑆𝑇 = 10
3. Three different kinds of food are tested on three groups of rats for 5 weeks. The
objective is to check the difference in mean weight (in grams) of the rats per week.
Apply one-way ANOVA using a 0.05 significance level to the following data:
Food 1 8 12 19 8 6 11
Food 2 4 5 4 6 9 7
Food 3 11 8 7 13 7 9
Sol. To carry out the analysis of variance, we form the following tables
        Observations         Total       Squares
F1      8 12 19 8 6 11       T1 = 64     T1² = 4096
F2      4 5 4 6 9 7          T2 = 35     T2² = 1225
F3      11 8 7 13 7 9        T3 = 55     T3² = 3025
Total                        T = 154     -
Sum of Squares
F2      16 25 16 36 81 49    223
$SST = \frac{4096}{6} + \frac{1225}{6} + \frac{3025}{6} - 1317.55$
⇒ 𝑆𝑆𝑇 = 682.66 + 204.166 + 504.166 − 1317.55
⇒ 𝑆𝑆𝑇 = 1391 − 1317.55
⇒ 𝑆𝑆𝑇 = 73.45
Total 18-1=17 - -
Since the calculated value F = 3.55 < 3.68 = F(2,15) at the 5% level of significance,
the null hypothesis is accepted: there is no significant difference between the three foods.
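The F ratio quoted for the food data can be reproduced with a short Python sketch. It is an illustration only; the variable names are chosen here, and the critical value 3.68 for F(2,15) is the one quoted in the text rather than computed.

```python
# Minimal sketch: one-way ANOVA F ratio for the three foods (names are illustrative).
foods = {
    "Food 1": [8, 12, 19, 8, 6, 11],
    "Food 2": [4, 5, 4, 6, 9, 7],
    "Food 3": [11, 8, 7, 13, 7, 9],
}

k = len(foods)                                   # number of treatments
n_total = sum(len(v) for v in foods.values())    # N = 18
grand_total = sum(sum(v) for v in foods.values())

cf = grand_total ** 2 / n_total
tss = sum(x * x for v in foods.values() for x in v) - cf
sst = sum(sum(v) ** 2 / len(v) for v in foods.values()) - cf
sse = tss - sst

mst = sst / (k - 1)            # mean square between treatments, d.f. = 2
mse = sse / (n_total - k)      # mean square error, d.f. = 15
f_ratio = mst / mse

F_CRITICAL = 3.68              # F(2, 15) at the 5% level, as quoted in the text
reject_h0 = f_ratio > F_CRITICAL
print(f"F = {f_ratio:.2f}, reject H0: {reject_h0}")  # F = 3.55, reject H0: False
```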
4. Three types of fertilizers are used on three groups of plants for 5 weeks. We want
to check if there is a difference in the mean growth of each group. Using the data
given below, apply a one-way ANOVA test at the 0.05 significance level.
Fertilizer 1 6 8 4 5 3 4
Fertilizer 2 8 12 9 11 6 8
Fertilizer 3 13 9 11 8 7 12
Sol.
To carry out the analysis of variance, we form the following tables
        Observations        Total       Squares
F1      6 8 4 5 3 4         T1 = 30     T1² = 900
F2      8 12 9 11 6 8       T2 = 54     T2² = 2916
F3      13 9 11 8 7 12      T3 = 60     T3² = 3600
Total                       T = 144     -
Sum of Squares
F1      36 64 16 25 9 16    166
$SST = \frac{900}{6} + \frac{2916}{6} + \frac{3600}{6} - 1152$
⇒ 𝑆𝑆𝑇 = 150 + 486 + 600 − 1152
⇒ 𝑆𝑆𝑇 = 1236 − 1152
⇒ 𝑆𝑆𝑇 = 84
        Observations    Total       Squares
A       6 7 3 8         T1 = 24     T1² = 576
B       5 5 3 7         T2 = 20     T2² = 400
C       5 4 3 4         T3 = 16     T3² = 256
Total                   T = 60      -
Sum of Squares
A       36 49 9 64      158
B       25 25 9 49      108
C       25 16 9 16      66
Grand Total ∑∑x² = 332
Set the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3$
Correction Factor $CF = \frac{T^2}{N} = \frac{60^2}{12} = \frac{3600}{12} = 300$
$SST = \frac{576}{4} + \frac{400}{4} + \frac{256}{4} - 300$
⇒ 𝑆𝑆𝑇 = 144 + 100 + 64 − 300
⇒ 𝑆𝑆𝑇 = 308 − 300
⇒ 𝑆𝑆𝑇 = 8
⇒ 𝑆𝑆𝐸 = 32 − 8
⇒ 𝑆𝑆𝐸 = 24
6. A trial was run to check the effects of different diets. Positive numbers indicate weight loss
and negative numbers indicate weight gain. Check if there is an average difference in the
weight of people following different diets using an ANOVA Table.
Low Fat    Low Calorie    Low Protein    Low Carbohydrate
8 2 3 2
9 4 5 2
6 3 4 -1
7 5 2 0
3 1 3 3
Sol.
To carry out the analysis of variance, we form the following tables
$SST = \frac{33^2}{5} + \frac{15^2}{5} + \frac{17^2}{5} + \frac{6^2}{5} - 252$
⇒ 𝑆𝑆𝑇 = 217.8 + 45 + 57.8 + 7.2 − 252
⇒ 𝑆𝑆𝑇 = 327.8 − 252
⇒ 𝑆𝑆𝑇 = 75.80
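The between-diet sum of squares can be checked with the following Python sketch. It is only an illustration: the dictionary keys follow the column labels as read from the table header, and the code keeps the exact correction factor 71²/20 = 252.05, whereas the hand computation rounds it to 252, so the two results differ slightly in the second decimal.

```python
# Minimal sketch: between-diet sum of squares with the unrounded correction factor.
diets = {
    "Low Fat": [8, 9, 6, 7, 3],
    "Low Calorie": [2, 4, 3, 5, 1],
    "Low Protein": [3, 5, 4, 2, 3],
    "Low Carbohydrate": [2, 2, -1, 0, 3],
}

n_total = sum(len(v) for v in diets.values())        # N = 20
grand_total = sum(sum(v) for v in diets.values())    # T = 71
cf = grand_total ** 2 / n_total                      # 252.05 (the text rounds to 252)

# Each term is (column total)^2 / (number of observations in that column).
terms = {name: sum(v) ** 2 / len(v) for name, v in diets.items()}
sst = sum(terms.values()) - cf

print(terms)
print(round(cf, 2), round(sst, 2))
```

With the exact correction factor the sketch gives SST = 75.75; using CF = 252 as in the text gives 75.80.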
7. The following data show the number of worms quarantined from the GI areas of four groups
of muskrats in a carbon tetrachloride anthelmintic study. Conduct a
two-way ANOVA test.
I II III IV
33 41 12 38
32 38 35 43
26 40 46 25
14 23 22 13
30 21 11 26
Subtracting 30 from all the observations, we get
        I      II     III    IV
        3      11     -18    8
        2      8      5      13
        -4     10     16     -5
        -16    -7     -8     -17
        0      -9     -19    -4
T       -15    13     -24    -5      Total = -31
T²      225    169    576    25
⇒ 𝑇𝑆𝑆 = 2293 − 48
⇒ 𝑇𝑆𝑆 = 2245
$SST = \frac{225}{5} + \frac{169}{5} + \frac{576}{5} + \frac{25}{5} - 48$
⇒ 𝑆𝑆𝑇 = 45 + 33.8 + 115.2 + 5 − 48
⇒ 𝑆𝑆𝑇 = 199 − 48
⇒ 𝑆𝑆𝑇 = 151
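The solution works with observations reduced by 30. The Python sketch below illustrates why this is safe: subtracting a constant from every observation leaves TSS and SST unchanged. The helper name `tss_and_sst` is hypothetical, and the code keeps the exact correction factor (48.05 for the coded data) rather than the rounded value 48 used in the hand computation.

```python
# Minimal sketch: sums of squares are invariant under a shift of origin.
def tss_and_sst(columns):
    """Return (TSS, SST) for a list of equally treated groups (columns)."""
    n_total = sum(len(c) for c in columns)
    cf = sum(sum(c) for c in columns) ** 2 / n_total
    tss = sum(x * x for c in columns for x in c) - cf
    sst = sum(sum(c) ** 2 / len(c) for c in columns) - cf
    return tss, sst


raw = [
    [33, 32, 26, 14, 30],  # group I
    [41, 38, 40, 23, 21],  # group II
    [12, 35, 46, 22, 11],  # group III
    [38, 43, 25, 13, 26],  # group IV
]
coded = [[x - 30 for x in col] for col in raw]   # same coding as in the solution

tss_raw, sst_raw = tss_and_sst(raw)
tss_coded, sst_coded = tss_and_sst(coded)
print(round(tss_raw, 2), round(tss_coded, 2))
print(round(sst_raw, 2), round(sst_coded, 2))
```

Both versions give TSS = 2244.95 and SST = 150.95, which round to the values 2245 and 151 obtained above.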
The observations are arranged in a two-way table with r rows and k columns:
                                        T       T²
x_11    x_12    x_13    …    x_1k       T_1     T_1²
x_21    x_22    x_23    …    x_2k       T_2     T_2²
x_31    x_32    x_33    …    x_3k       T_3     T_3²
 …       …       …      …     …          …       …
x_r1    x_r2    x_r3    …    x_rk       T_r     T_r²
Sum       P_1     P_2     P_3    …    P_k     = T
Squares   P_1²    P_2²    P_3²   …    P_k²
Define the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$ for the level of significance.
Find the sum of each variety (row-wise), T_i, and the sum of all the N observations, say T.
Find the correction factor $CF = \frac{T^2}{N}$
Find the sum of squares of individual items $TSS = \sum_i \sum_j x_{ij}^2 - CF$
Find the sum of the squares of rows $SSR = \sum_i \frac{T_i^2}{k} - CF$
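For reference, the quantities used in the worked problems that follow can be collected in one place. This summary is a sketch rather than part of the original notes: r (number of rows), k (number of columns), and the labels F_R and F_C are symbols assumed here, and the column and error terms are written the way the worked problems compute them.

```latex
\begin{aligned}
CF  &= \frac{T^2}{N}, \qquad N = rk \\
TSS &= \sum_{i=1}^{r}\sum_{j=1}^{k} x_{ij}^2 - CF \\
SSR &= \sum_{i=1}^{r} \frac{T_i^2}{k} - CF \\
SSC &= \sum_{j=1}^{k} \frac{P_j^2}{r} - CF \\
SSE &= TSS - SSR - SSC \\
MSR &= \frac{SSR}{r-1}, \qquad MSC = \frac{SSC}{k-1}, \qquad MSE = \frac{SSE}{(r-1)(k-1)} \\
F_R &= \frac{MSR}{MSE}, \qquad F_C = \frac{MSC}{MSE}
\end{aligned}
```

Each F ratio is then compared with the tabulated value for its degrees of freedom at the chosen level of significance.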
1. Set up an analysis of variance table for the following per acre production data for
three varieties of wheat, each grown on 4 plots, and state if the variety differences
are significant at the 5% significance level.
Per acre production data
Plot of land    Variety of wheat
                A    B    C
1               6    5    5
2               7    5    4
3               3    3    3
4               8    7    4
Sol.
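The squares of the observations are as follows:
Plot of land    Variety
                A     B     C
1               36    25    25
2               49    25    16
3               9     9     9
4               64    49    16
Grand Total ∑∑x² = 332
Set the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3$, N = 12
Correction Factor $CF = \frac{T^2}{N} = \frac{60^2}{12} = \frac{3600}{12} = 300$
$SSR = \frac{256}{3} + \frac{256}{3} + \frac{81}{3} + \frac{361}{3} - 300$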
⇒ 𝑆𝑆𝑅 = 85.33 + 85.33 + 27 + 120.33 − 300
⇒ 𝑆𝑆𝑅 = 318 − 300
⇒ 𝑆𝑆𝑅 = 18
$SSC = \frac{576}{4} + \frac{400}{4} + \frac{256}{4} - 300$
⇒ 𝑆𝑆𝐶 = 144 + 100 + 64 − 300
⇒ 𝑆𝑆𝐶 = 308 − 300
⇒ 𝑆𝑆𝐶 = 8
Therefore SSE=TSS-SSR-SSC
SSE=32-18-8=6
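The row, column and error sums of squares for the wheat data can be verified with a small Python sketch. It is a minimal illustration for a two-way layout with one observation per cell; the function name `two_way_sums_of_squares` and the dictionary keys are hypothetical.

```python
# Minimal sketch: two-way ANOVA sums of squares, one observation per cell.
def two_way_sums_of_squares(table):
    """table is a list of rows; returns CF, TSS, SSR, SSC and SSE in a dict."""
    r, c = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / (r * c)
    tss = sum(x * x for row in table for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in table) / c - cf            # each row has c cells
    col_totals = [sum(row[j] for row in table) for j in range(c)]
    ssc = sum(p * p for p in col_totals) / r - cf                 # each column has r cells
    return {"CF": cf, "TSS": tss, "SSR": ssr, "SSC": ssc, "SSE": tss - ssr - ssc}


# Rows are plots 1..4, columns are wheat varieties A, B, C.
wheat = [
    [6, 5, 5],
    [7, 5, 4],
    [3, 3, 3],
    [8, 7, 4],
]
result = two_way_sums_of_squares(wheat)
print(result)
```

The sketch reproduces CF = 300, TSS = 32, SSR = 18, SSC = 8 and SSE = 6.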
2. Three varieties of coal were analysed by four chemists and the ash-content in the varieties
was found to be as under.
Varieties Chemists
1 2 3 4
A 8 5 5 7
B 7 6 4 4
C 3 6 5 4
Carry out the analysis of variance.
Variety    Chemists                      T       T²
           1      2      3      4
A          8      5      5      7        25      625
B          7      6      4      4        21      441
C          3      6      5      4        18      324
P          18     17     14     15       = 64    -
P²         324    289    196    225
The squares are as follows
Variety    Chemists
           1     2     3     4
A          64    25    25    49
B          49    36    16    16
C          9     36    25    16
Grand Total ∑∑x² = 366
Set the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3$, N = 12
Correction Factor $CF = \frac{T^2}{N} = \frac{64^2}{12} = \frac{4096}{12} = 341.33$
$SSR = \frac{625}{4} + \frac{441}{4} + \frac{324}{4} - 341.33$
⇒ 𝑆𝑆𝑅 = 156.25 + 110.25 + 81 − 341.33
⇒ 𝑆𝑆𝑅 = 347.50 − 341.33
⇒ 𝑆𝑆𝑅 = 6.17
$SSC = \frac{324}{3} + \frac{289}{3} + \frac{196}{3} + \frac{225}{3} - 341.33$
⇒ 𝑆𝑆𝐶 = 108 + 96.33 + 65.33 + 75 − 341.33
⇒ 𝑆𝑆𝐶 = 344.66 − 341.33
⇒ 𝑆𝑆𝐶 = 3.33
Therefore SSE=TSS-SSR-SSC
SSE=24.67-6.17-3.33=15.17
3. Perform ANOVA and test at the 0.05 level of significance whether there are differences in the
detergents or in the engines for the following data:
Detergent Engine
I II III
A 45 43 51
B 47 46 52
C 48 50 55
D 42 37 49
Sol.
Subtract 45 from all the observations, we get
Detergent    Engine                 T       T²
             I      II     III
A            0      -2     6        4       16
B            2      1      7        10      100
C            3      5      10       18      324
D            -3     -8     4        -7      49
P            2      -4     27       = 25    -
P²           4      16     729      -
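The squares are
Detergent    Engine                Sum
             I     II     III
A            0     4      36       40
B            4     1      49       54
C            9     25     100      134
D            9     64     16       89
Grand Total ∑∑x² = 317
Correction Factor $CF = \frac{T^2}{N} = \frac{25^2}{12} = \frac{625}{12} = 52.08$
$TSS = 317 - 52.08 = 264.92$
$SSR = \frac{16}{3} + \frac{100}{3} + \frac{324}{3} + \frac{49}{3} - 52.08$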
⇒ 𝑆𝑆𝑅 = 5.33 + 33.33 + 108 + 16.33 − 52.08
⇒ 𝑆𝑆𝑅 = 163 − 52.08
⇒ 𝑆𝑆𝑅 = 110.92
$SSC = \frac{4}{4} + \frac{16}{4} + \frac{729}{4} - 52.08$
⇒ 𝑆𝑆𝐶 = 1 + 4 + 182.25 − 52.08
⇒ 𝑆𝑆𝐶 = 187.25 − 52.08
⇒ 𝑆𝑆𝐶 = 135.17
Therefore SSE=TSS-SSR-SSC
SSE=264.92-110.92-135.17=18.83
Hence the null hypothesis is rejected: there are significant differences both between the
detergents and between the engines.
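The conclusion for the detergent data can be checked by computing the two F ratios directly. The Python sketch below is only an illustration: the variable names are chosen here, it works on the original (uncoded) observations, and the critical values 4.76 for F(3,6) and 5.14 for F(2,6) are standard 5% table values supplied as assumptions rather than figures taken from this solution.

```python
# Minimal sketch: F ratios for detergents (rows) and engines (columns).
data = [
    [45, 43, 51],  # detergent A on engines I, II, III
    [47, 46, 52],  # detergent B
    [48, 50, 55],  # detergent C
    [42, 37, 49],  # detergent D
]
r, c = len(data), len(data[0])

grand_total = sum(sum(row) for row in data)
cf = grand_total ** 2 / (r * c)
tss = sum(x * x for row in data for x in row) - cf
ssr = sum(sum(row) ** 2 for row in data) / c - cf
ssc = sum(sum(col) ** 2 for col in zip(*data)) / r - cf
sse = tss - ssr - ssc

msr = ssr / (r - 1)                 # d.f. = 3
msc = ssc / (c - 1)                 # d.f. = 2
mse = sse / ((r - 1) * (c - 1))     # d.f. = 6
f_rows, f_cols = msr / mse, msc / mse

F_CRIT_ROWS = 4.76   # F(3, 6) at 5% (assumed standard table value)
F_CRIT_COLS = 5.14   # F(2, 6) at 5% (assumed standard table value)
print(round(f_rows, 2), f_rows > F_CRIT_ROWS)
print(round(f_cols, 2), f_cols > F_CRIT_COLS)
```

Both ratios exceed their critical values, which is consistent with rejecting the null hypothesis for detergents and for engines.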
C 25    B 23    A 20    D 20
A 19    D 19    C 21    B 18
B 19    A 14    D 17    C 20
D 17    C 20    B 21    A 15
Sol.
Null hypothesis H0: There is no significant difference between rows, columns and treatments.
Subtracting 20 from each observation, we get
                                          T       T²
C 5      B 3      A 0      D 0            8       64
A -1     D -1     C 1      B -2           -3      9
B -1     A -6     D -3     C 0            -10     100
D -3     C 0      B 1      A -5           -7      49
P     0       -4       -1       -7        = -12
P²    0       16       1        49        -       -
The squares are as follows:
C 25     B 9      A 0      D 0
A 1      D 1      C 1      B 4
B 1      A 36     D 9      C 0
D 9      C 0      B 1      A 25
Column sums: 36    46    11    29,    ∑∑x² = 122
Correction Factor $CF = \frac{T^2}{N} = \frac{(-12)^2}{16} = \frac{144}{16} = 9$
⇒ 𝑇𝑆𝑆 = 122 − 9
⇒ 𝑇𝑆𝑆 = 113
$SSR = \frac{64}{4} + \frac{9}{4} + \frac{100}{4} + \frac{49}{4} - 9$
⇒ 𝑆𝑆𝑅 = 16 + 2.25 + 25 + 12.25 − 9
⇒ 𝑆𝑆𝑅 = 55.5 − 9
⇒ 𝑆𝑆𝑅 = 46.5
$SSC = 0 + \frac{16}{4} + \frac{1}{4} + \frac{49}{4} - 9$
⇒ 𝑆𝑆𝐶 = 4 + 0.25 + 12.25 − 9
⇒ 𝑆𝑆𝐶 = 16.5 − 9
⇒ 𝑆𝑆𝐶 = 7.5
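Treatment    Observations              Q = ∑(Observations)    Q²
A            0     -1    -6    -5      -12                    144
B            3     -2    -1    1       1                      1
C            5     1     0     0       6                      36
D            0     -1    -3    -3      -7                     49
Sum of the squares of treatments $SST = \sum \frac{Q^2}{n} - CF$, with n = 4 observations per treatment
$SST = \frac{144}{4} + \frac{1}{4} + \frac{36}{4} + \frac{49}{4} - 9$
⇒ 𝑆𝑆𝑇 = 36 + 0.25 + 9 + 12.25 − 9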
⇒ 𝑆𝑆𝑇 = 57.50 − 9
⇒ 𝑆𝑆𝑇 = 48.50
∴ 𝑆𝑆𝐸 = 𝑇𝑆𝑆 − 𝑆𝑆𝑅 − 𝑆𝑆𝐶 − 𝑆𝑆𝑇 ⇒ 𝑆𝑆𝐸 = 113 − 46.5 − 7.5 − 48.50 = 10.5.
We know that 𝐹(3,6) = 4.76 at the 5% level of significance.
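The four sums of squares for this 4 × 4 Latin square can be recomputed from the original observations with a short Python sketch. It is an illustration only; the names `letters`, `values` and `treatment_totals` are chosen here, and the treatment totals are gathered by matching each letter in the layout with the observation in the same cell.

```python
# Minimal sketch: sums of squares for a 4 x 4 Latin square design.
letters = ["CBAD", "ADCB", "BADC", "DCBA"]   # treatment in each cell
values = [
    [25, 23, 20, 20],
    [19, 19, 21, 18],
    [19, 14, 17, 20],
    [17, 20, 21, 15],
]
m = len(values)                               # rows = columns = treatments = 4

grand_total = sum(sum(row) for row in values)
cf = grand_total ** 2 / (m * m)
tss = sum(x * x for row in values for x in row) - cf
ssr = sum(sum(row) ** 2 for row in values) / m - cf
ssc = sum(sum(col) ** 2 for col in zip(*values)) / m - cf

# Collect the total of the observations that received each treatment.
treatment_totals = {}
for letter_row, value_row in zip(letters, values):
    for treatment, x in zip(letter_row, value_row):
        treatment_totals[treatment] = treatment_totals.get(treatment, 0) + x
sst = sum(q * q for q in treatment_totals.values()) / m - cf

sse = tss - ssr - ssc - sst
print(tss, ssr, ssc, sst, sse)
```

The sketch gives TSS = 113, SSR = 46.5, SSC = 7.5, SST = 48.5 and SSE = 10.5, the same values as the computation on the coded data.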
5. Five varieties of paddy A, B, C, D, and E are tried. The plan, the varieties shown in each plot
and yields obtained in Kg are given in the following table (LSD)
B 95     E 85     C 139    A 117    D 97
E 90     D 89     B 75     C 146    A 87
C 116    A 95     D 92     B 89     E 74
A 85     C 130    E 90     D 81     B 77
D 87     B 65     A 99     E 89     C 93
Test whether there is a significant difference between rows and columns at 5% LOS.
Subtracting 100 from each observation, we get
                                                    T       T²
B -5     E -15    C 39     A 17     D -3            33      1089
E -10    D -11    B -25    C 46     A -13           -13     169
C 16     A -5     D -8     B -11    E -26           -34     1156
A -15    C 30     E -10    D -19    B -23           -37     1369
D -13    B -35    A -1     E -11    C -7            -67     4489
P     -27      -36      -5       22       -72       = -118
P²    729      1296     25       484      5184      -       -
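The squares are as follows:
B 25     E 225     C 1521    A 289     D 9
E 100    D 121     B 625     C 2116    A 169
C 256    A 25      D 64      B 121     E 676
A 225    C 900     E 100     D 361     B 529
D 169    B 1225    A 1       E 121     C 49
$SSC = \frac{729}{5} + \frac{1296}{5} + \frac{25}{5} + \frac{484}{5} + \frac{5184}{5} - 557$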
⇒ 𝑆𝑆𝐶 = 145.8 + 259.2 + 5 + 96.8 + 1036.8 − 557
⇒ 𝑆𝑆𝐶 = 1543.6 − 557
⇒ 𝑆𝑆𝐶 = 986.6
Treatment    Observations                     Q = ∑(Observations)    Q²
A            17     -13    -5     -15    -1     -17                  289
B            -5     -25    -11    -23    -35    -99                  9801
C            39     46     16     30     -7     124                  15376
D            -3     -11    -8     -19    -13    -54                  2916
E            -15    -10    -26    -10    -11    -72                  5184
$SST = \frac{289}{5} + \frac{9801}{5} + \frac{15376}{5} + \frac{2916}{5} + \frac{5184}{5} - 557$
⇒ 𝑆𝑆𝑇 = 57.8 + 1960.2 + 3075.2 + 583.2 + 1036.8 − 557
⇒ 𝑆𝑆𝑇 = 6713.2 − 557
⇒ 𝑆𝑆𝑇 = 6156.2
6. Present your conclusions after doing analysis of variance to the following results of
the Latin-square design experiment conducted in respect of five fertilizers which
A B C D E
16 10 11 9 9
E C A B D
10 9 14 12 11
B D E C A
15 8 8 10 18
D E B A C
12 6 13 13 12
C A D E B
13 11 10 7 14
Sol. Subtracting 10 from each of the given observations, we get
                                              T      T²
A 6     B 0     C 1     D -1    E -1          5      25
E 0     C -1    A 4     B 2     D 1           6      36
B 5     D -2    E -2    C 0     A 8           9      81
D 2     E -4    B 3     A 3     C 2           6      36
C 3     A 1     D 0     E -3    B 4           5      25
P     16      -6      6       1       14      = 31
P²    256     36      36      1       196     -      -
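The squares are as follows:
A 36    B 0     C 1     D 1    E 1
E 0     C 1     A 16    B 4    D 1
B 25    D 4     E 4     C 0    A 64
D 4     E 16    B 9     A 9    C 4
C 9     A 1     D 0     E 9    B 16
Column sums: 74    22    30    23    86,    ∑∑x² = 235
Correction Factor $CF = \frac{T^2}{N} = \frac{31^2}{25} = \frac{961}{25} = 38.44$
$TSS = 235 - 38.44 = 196.56$
$SSR = \frac{25}{5} + \frac{36}{5} + \frac{81}{5} + \frac{36}{5} + \frac{25}{5} - 38.44$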
⇒ 𝑆𝑆𝑅 = 5 + 7.2 + 16.2 + 7.2 + 5 − 38.44
⇒ 𝑆𝑆𝑅 = 40.60 − 38.44
⇒ 𝑆𝑆𝑅 = 2.16
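Sum of the column squares $SSC = \sum \frac{P^2}{n} - CF$, with n = 5 observations per column
$SSC = \frac{256}{5} + \frac{36}{5} + \frac{36}{5} + \frac{1}{5} + \frac{196}{5} - 38.44$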
⇒ 𝑆𝑆𝐶 = 51.2 + 7.2 + 7.2 + 0.2 + 39.2 − 38.44
⇒ 𝑆𝑆𝐶 = 105 − 38.44 ⇒ 𝑆𝑆𝐶 = 66.56
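To find the sum of the treatments
Treatment    Observations                Q = ∑(Observations)    Q²
A            6     4     8     3     1     22                   484
B            0     2     5     3     4     14                   196
C            1     -1    0     2     3     5                    25
D            -1    1     -2    2     0     0                    0
E            -1    0     -2    -4    -3    -10                  100
Sum of the squares of treatments $SST = \sum \frac{Q^2}{n} - CF$, with n = 5 observations per treatment
$SST = \frac{484}{5} + \frac{196}{5} + \frac{25}{5} + \frac{0}{5} + \frac{100}{5} - 38.44$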
⇒ 𝑆𝑆𝑇 = 96.8 + 39.2 + 5 + 0 + 20 − 38.44
⇒ 𝑆𝑆𝑇 = 161 − 38.44
⇒ 𝑆𝑆𝑇 = 122.56
∴ 𝑆𝑆𝐸 = 𝑇𝑆𝑆 − 𝑆𝑆𝑅 − 𝑆𝑆𝐶 − 𝑆𝑆𝑇
⇒ 𝑆𝑆𝐸 = 196.56 − 2.16 − 66.56 − 122.56
⇒ 𝑆𝑆𝐸 = 5.28
Sources of variation    d.f.       SS            MSS                     F Ratio                  Conclusion
Rows                    5-1=4      SSR = 2.16    MSR = 2.16/4 = 0.54     F = 0.54/0.44 = 1.227    F < F(4,12), H0 accepted
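A full ANOVA table for this 5 × 5 Latin square can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `latin_square_anova` is hypothetical, the critical value 3.26 for F(4,12) at the 5% level is a standard table value supplied here, and the column and treatment rows are computed by the sketch rather than quoted from the notes.

```python
# Minimal sketch: ANOVA table for an m x m Latin square design.
def latin_square_anova(letters, values):
    m = len(values)
    cf = sum(sum(row) for row in values) ** 2 / (m * m)
    tss = sum(x * x for row in values for x in row) - cf
    ss = {
        "rows": sum(sum(row) ** 2 for row in values) / m - cf,
        "columns": sum(sum(col) ** 2 for col in zip(*values)) / m - cf,
    }
    totals = {}
    for letter_row, value_row in zip(letters, values):
        for treatment, x in zip(letter_row, value_row):
            totals[treatment] = totals.get(treatment, 0) + x
    ss["treatments"] = sum(q * q for q in totals.values()) / m - cf
    sse = tss - sum(ss.values())
    mse = sse / ((m - 1) * (m - 2))               # error d.f. = (m-1)(m-2)
    table = {
        source: {"df": m - 1, "SS": s, "MS": s / (m - 1), "F": (s / (m - 1)) / mse}
        for source, s in ss.items()
    }
    table["error"] = {"df": (m - 1) * (m - 2), "SS": sse, "MS": mse}
    return table


letters = ["ABCDE", "ECABD", "BDECA", "DEBAC", "CADEB"]
values = [
    [16, 10, 11, 9, 9],
    [10, 9, 14, 12, 11],
    [15, 8, 8, 10, 18],
    [12, 6, 13, 13, 12],
    [13, 11, 10, 7, 14],
]
F_CRITICAL = 3.26   # F(4, 12) at 5% (assumed standard table value)
anova = latin_square_anova(letters, values)
for source in ("rows", "columns", "treatments"):
    row = anova[source]
    print(source, row["df"], round(row["SS"], 2), round(row["MS"], 2),
          round(row["F"], 3), "reject H0" if row["F"] > F_CRITICAL else "accept H0")
```

For the rows the sketch gives SS = 2.16, MS = 0.54 and F = 1.227, in agreement with the row of the table shown above.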
7. Set up an ANOVA table for the following information relating to three drugs tested to judge their effectiveness in
reducing blood pressure for three different groups of people:
Group of people    Drug
                   X     Y     Z
A                  14    10    11
                   15    9     11
B                  12    7     10
                   11    8     11
C                  10    11    8
                   11    11    7
Do the drugs act differently? Are the different groups of people affected differently? Is the interaction term
significant? Answer the above questions taking a significant level of 5%.
Sol.
Given observations from the different groups of people (A, B, C) for the different drugs (X, Y, Z) are as follows:
Group of people    Drug                 T        T²
                   X     Y     Z
A                  14    10    11       70       4900
                   15    9     11
B                  12    7     10       59       3481
                   11    8     11
C                  10    11    8        58       3364
                   11    11    7
P                  73    56    58       = 187    -
Where N = 6 + 6 + 6 = 18
Correction Factor $CF = \frac{T^2}{N} = \frac{187^2}{18} = \frac{34969}{18} = 1942.722$
The squares are as follows
Group Drug Sum of
of Squares
people X Y Z
$SSR = \frac{4900}{6} + \frac{3481}{6} + \frac{3364}{6} - 1942.722$
⇒ 𝑆𝑆𝑅 = 816.67 + 580.16 + 560.67 − 1942.722
⇒ 𝑆𝑆𝑅 = 14.78
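Sum of the column squares $SSC = \sum \frac{P^2}{n} - CF$, with n = 6 observations per column
$SSC = \frac{5329}{6} + \frac{3136}{6} + \frac{3364}{6} - 1942.722$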
⇒ 𝑆𝑆𝐶 = 888.16 + 522.66 + 560.67 − 1942.722
⇒ 𝑆𝑆𝐶 = 28.77
SS within samples (SST) = (14 − 14.5)² + (15 − 14.5)² + (10 − 9.5)² + (9 − 9.5)² + (11 − 11)² +
(11 − 11)² + (12 − 11.5)² + (11 − 11.5)² + (7 − 7.5)² + (8 − 7.5)² + (10 − 10.5)² + (11 − 10.5)²
+ (10 − 10.5)² + (11 − 10.5)² + (11 − 11)² + (11 − 11)² + (8 − 7.5)² + (7 − 7.5)²
SST = 3.50
Therefore,
SSE=TSS-SSR-SSC-SST
⇒ SSE = 76.28 − 14.78 − 28.77 − 3.5
⇒ SSE = 29.23
We have F(2,9)=4.26 , F(4,9)=3.63
Sources of variation    d.f.       SS             MSS                      F Ratio                 Conclusion
Rows                    3-1=2      SSR = 14.78    MSR = 14.78/2 = 7.39     F = 7.39/0.389 = 19     F > F(2,9), H0 rejected
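The computation for this two-way layout with two observations per cell can be sketched in Python. It is an illustration only, with variable names chosen here. Note the naming: `ss_within` is the quantity the solution labels SST (sum of squares within samples), and `ss_interaction` is the remainder the solution labels SSE. The critical values 4.26 and 3.63 are the ones quoted in the text.

```python
# Minimal sketch: two-way ANOVA with replication (two observations per cell).
cells = {
    "A": {"X": [14, 15], "Y": [10, 9], "Z": [11, 11]},
    "B": {"X": [12, 11], "Y": [7, 8], "Z": [10, 11]},
    "C": {"X": [10, 11], "Y": [11, 11], "Z": [8, 7]},
}
groups, drugs = list(cells), ["X", "Y", "Z"]

all_obs = [x for g in groups for d in drugs for x in cells[g][d]]
n = len(all_obs)                                   # N = 18
cf = sum(all_obs) ** 2 / n
tss = sum(x * x for x in all_obs) - cf

row_totals = [sum(x for d in drugs for x in cells[g][d]) for g in groups]
col_totals = [sum(x for g in groups for x in cells[g][d]) for d in drugs]
ssr = sum(t * t for t in row_totals) / (n // len(groups)) - cf   # 6 observations per group
ssc = sum(p * p for p in col_totals) / (n // len(drugs)) - cf    # 6 observations per drug

# Within-cell variation: squared deviations from each cell mean.
ss_within = sum(
    (x - sum(cells[g][d]) / len(cells[g][d])) ** 2
    for g in groups for d in drugs for x in cells[g][d]
)
ss_interaction = tss - ssr - ssc - ss_within

df_rows, df_cols = len(groups) - 1, len(drugs) - 1
df_inter = df_rows * df_cols                       # 4
df_within = n - len(groups) * len(drugs)           # 9
ms_within = ss_within / df_within

f_rows = (ssr / df_rows) / ms_within
f_cols = (ssc / df_cols) / ms_within
f_inter = (ss_interaction / df_inter) / ms_within

F_2_9, F_4_9 = 4.26, 3.63                          # 5% values quoted in the text
print(round(f_rows, 2), f_rows > F_2_9)
print(round(f_cols, 2), f_cols > F_2_9)
print(round(f_inter, 2), f_inter > F_4_9)
```

The row F ratio comes out as 19, matching the table above; the ratios for drugs and for the interaction are produced by the sketch and are not quoted from the notes.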