S-Chapter 16: Analysis of Variance (Anova) and Analysis of Covariance (Ancova)


S-CHAPTER 16

ANALYSIS OF VARIANCE (ANOVA) AND


ANALYSIS OF COVARIANCE (ANCOVA)
08/06/2022 Chapter 16 2

RELATIONSHIP AMONG TECHNIQUES (p.497)

ANOVA and ANCOVA are used for examining the differences in the
mean values of the dependent variables associated with the
effect of the controlled independent variables after taking into
account the influence of the uncontrolled independent variables.
• - Explain it again after citing the example and the null hypothesis. 

Discussion
1. Data requirements for: t-test, ANOVA, ANCOVA, regression analysis, and discriminant analysis

2. General functional form: Y = ƒ(X); Y = dependent variable, X = independent variable

t-test
    y = ƒ(x)
    y = Metric
    x = One non-metric ind var with only two categories.

ANOVA
    y = ƒ(x1, x2, ..., xk)
    y = Metric
    xi = Non-metric ind vars (factors) with ≥ 2 categories.
    One-way ANOVA: one ind var (factor)
    n-way ANOVA: n ind vars (factors)
    Treatment: a particular combination of factor levels.
    In one-way ANOVA, a treatment is the same as a factor level.

ANCOVA
    y = ƒ(x1, x2, ..., xk, z1, z2, ..., zm)
    y = Metric
    xi = Non-metric ind vars (factors); controllable
    zi = Metric ind vars (covariates); uncontrollable

Regression
    y = ƒ(x1, x2, ..., xk)
    y = b0 + b1x1 + b2x2 + ... + bkxk
    y = Metric
    xi = Metric/Non-metric. If non-metric, convert to dummy vars.

Discriminant
    y = ƒ(x1, x2, ..., xk)
    D = b0 + b1x1 + b2x2 + ... + bkxk
    y = Non-metric
    D = Discriminant scores
    xi = Metric
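
The data requirements above amount to a small decision rule. As a minimal sketch, it could be written as follows. The function name `choose_technique` and the encoding of each variable as a (scale, number of categories) pair are illustrative assumptions, not part of the slides.

```python
def choose_technique(dependent, independents):
    """Suggest a technique from the measurement scales of the variables.

    dependent: "metric" or "non-metric".
    independents: list of (scale, n_categories) pairs; scale is "metric" or
    "non-metric", and n_categories is None for metric variables.
    This encoding is a hypothetical one chosen for illustration.
    """
    if not independents:
        raise ValueError("at least one independent variable is required")
    factors = [v for v in independents if v[0] == "non-metric"]
    covariates = [v for v in independents if v[0] == "metric"]

    if dependent == "non-metric":
        # Discriminant analysis: non-metric y, metric x's.
        return "discriminant analysis" if not factors else "not covered here"
    if factors and covariates:
        return "ANCOVA"  # factors plus metric covariates
    if not factors:
        return "regression"  # all independent variables are metric
    if len(factors) == 1:
        # A single two-category factor can be handled by a t-test.
        if factors[0][1] == 2:
            return "t-test (or one-way ANOVA)"
        return "one-way ANOVA"
    return f"{len(factors)}-way ANOVA"


# Two factors (4 and 2 categories) plus two metric covariates.
print(choose_technique("metric", [("non-metric", 4), ("non-metric", 2),
                                  ("metric", None), ("metric", None)]))
```

Regression can also take non-metric variables once they are converted to dummy variables, so it remains an alternative wherever the sketch suggests ANOVA or ANCOVA.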

                  Income
                H    M    L
Family     L    1    3    5
Size       S    2    4    6

                  Income
                H    M    L
                1    2    3
Family     L  1
Size       S  2

An Alternative View of the Last Discussion on Data Requirements

Fig 16.1 (p.491): Relationship Among ANOVA, ANCOVA, t-test & Regression

Metric Dependent Variable
    One Independent Variable
        Binary → t Test
    One or More Independent Variables
        Categorical: Factor → ANOVA
            One Factor → One-Way ANOVA
            More than One Factor → n-Way ANOVA
        Categorical and Interval → ANCOVA
        Interval → Regression
p.486, 1st Para, Last Sentence (for example)

Ex: To examine the preference toward Total cereal of product use groups &
loyalty groups, taking into account the respondents' attitudes toward
nutrition and the importance they attach to breakfast as a meal.
- The first 2 ind vars will be measured categorically &
  the latter 2 ind vars will be measured on a 9-pt LS.

              Dep Var         Factors                      Covariates
              Y               X1            X2             X3                X4
              Preference      Product       Loyalty        Attitude          Importance
              Toward          Use                          Toward            of Cereal
              Total Cereal                                 Nutrition         as a Meal
Scale         Interval        Categorical   Categorical    Interval          Interval
              9-Pt LS         H, M, L, N    Loyal vs       9-Pt LS           9-Pt LS
                                            Non-loyal
t-test        √                             √
One-way       √               √      or     √
ANOVA
2-Way         √               √             √
ANOVA
ANCOVA        √               √             √              √                 √
Regression    √               √             √              √                 √

STATISTICS ASSOCIATED WITH ANOVA


Ex: Consider one-way ANOVA:
Let y = ƒ(x); y = Expenditure → Metric → Taka
x = Income → Non-metric → H, M, L
SSy = SSx + SSerror, or
SST = SSb + SSw, where

1. SSy is the total variation in Y.
\[ SS_y = \sum_{i=1}^{N} (Y_i - \bar{Y})^2 \]

2. SSbetween (SSx) is the variation in Y related to the variation in the means of the categories of X.
- This represents the portion of the variation in Y due to X.
\[ SS_x = \sum_{j=1}^{c} n (\bar{Y}_j - \bar{Y})^2 \]

3. SSwithin (SSerror) is the variation in Y due to the variation within each of the categories of X (it is due to error, not due to X).
\[ SS_{error} = \sum_{j=1}^{c} \sum_{i=1}^{n} (Y_{ij} - \bar{Y}_j)^2 \]
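
Why the total variation splits exactly into these two parts can be seen from a short derivation. This is a sketch using the same symbols as above and assuming n observations in each of the c categories.

```latex
% Split each deviation from the grand mean into a within-category part
% and a between-category part:
Y_{ij} - \bar{Y} = (Y_{ij} - \bar{Y}_j) + (\bar{Y}_j - \bar{Y})

% Square and sum over all observations:
\sum_{j=1}^{c} \sum_{i=1}^{n} (Y_{ij} - \bar{Y})^2
  = \sum_{j=1}^{c} \sum_{i=1}^{n} (Y_{ij} - \bar{Y}_j)^2
  + \sum_{j=1}^{c} n (\bar{Y}_j - \bar{Y})^2
  + 2 \sum_{j=1}^{c} (\bar{Y}_j - \bar{Y}) \sum_{i=1}^{n} (Y_{ij} - \bar{Y}_j)

% The cross-product term vanishes because deviations from a category
% mean sum to zero within that category:
\sum_{i=1}^{n} (Y_{ij} - \bar{Y}_j) = 0
\quad \Rightarrow \quad
SS_y = SS_{error} + SS_x
```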

STATISTICS ASSOCIATED WITH ANOVA (Contd.)

4. Eta Square (η²): The strength of the effect of X (ind. variable) on Y (the dependent variable), where
\[ \eta^2 = \frac{SS_x}{SS_y} = \frac{SS_y - SS_{error}}{SS_y}, \quad 0 \le \eta^2 \le 1 \]

5. F-statistic:
\[ F = \frac{SS_x / (c - 1)}{SS_{error} / (N - c)} = \frac{MS_x}{MS_{error}} \]
where MSx is the mean square related to X and MSerror is the mean square related to error.
The F-statistic is used to test the null hypothesis that all the category means are equal in the population, i.e.,
H0: μ1 = μ2 = μ3 = ... = μc

CONDUCTING ONE-WAY ANOVA

1. Identify the Dependent and Independent Variables
2. Decompose the Total Variation
3. Measure the Effects
4. Test Significance
5. Interpret the Results
• - Now look at each of these.

1. Identify the Dependent and Independent Variables


Dependent Variable = Y, Independent Variable = X.
- X is a categorical variable having c categories
- For each category of X, there are n observations on Y (Tab 16.1).
- The sample size in each category of X = n,
- So, the total sample size = n x c = N
- The sample size in each category of X is taken to be equal (i.e., n)
for the sake of simplicity but this is not a requirement.
2. Decomposition of the Total Variation
The total variation (SSy) in Y (dep. variable) is decomposed into:
SSy = SSx + SSerror or
SST = SSb + SSw, where
SSb = Variation between the categories of X.
- It represents variation in Y due to X.
- It is the portion of SSy related to X.
\[ SS_b = SS_x = \sum_{j=1}^{c} n (\bar{Y}_j - \bar{Y})^2 \]

SSw = Variation in Y related to the variation within each category of X.
- It is not accounted for by X; it is attributed to error.
\[ SS_w = SS_{error} = \sum_{j=1}^{c} \sum_{i=1}^{n} (Y_{ij} - \bar{Y}_j)^2 \]

The total variation is
\[ SS_y = \sum_{i=1}^{N} (Y_i - \bar{Y})^2 \]

3. Measurement of Effects
The strength of the effect of X on Y is measured as follows:
\[ \eta^2 = \frac{SS_x}{SS_y} = \frac{SS_y - SS_{error}}{SS_y}, \quad 0 \le \eta^2 \le 1 \]
It is the proportion of the variation in Y explained by the ind var X.

4. Significance Testing
In one-way ANOVA, we test:
H0: μ1 = μ2 = μ3 = ... = μc
using the F-statistic given by
\[ F = \frac{SS_x / (c - 1)}{SS_{error} / (N - c)} = \frac{MS_x}{MS_{error}} \]
where (c - 1) and (N - c) are the numerator and the denominator degrees of freedom.
The tabulated (critical) value of F for these degrees of freedom is read from
Appendix: Statistical Table 5 (5%, p.884; 1%, p.886).

5. Interpret the Results
If Fcal (Fobsd) > Ftab (Ftheo) then reject the H0.
If Fcal (Fobsd) < Ftab (Ftheo) then don't reject the H0.
Table 16.2: Coupon, In-store Prom, Store Sales & Clientele Rating (p.490)

Store Number | Y: Sales | X1: In-store Promotion | X2: Coupon Level | X3: Clientele Rating

1 10.00 1.00 1.00 9.00
2 9.00 1.00 1.00 10.00
3 10.00 1.00 1.00 8.00
4 8.00 1.00 1.00 4.00
5 9.00 1.00 1.00 6.00
6 8.00 2.00 1.00 8.00
7 8.00 2.00 1.00 4.00
8 7.00 2.00 1.00 10.00
9 9.00 2.00 1.00 6.00
10 6.00 2.00 1.00 9.00
11 5.00 3.00 1.00 8.00
12 7.00 3.00 1.00 9.00
13 6.00 3.00 1.00 6.00
14 4.00 3.00 1.00 10.00
15 5.00 3.00 1.00 4.00
16 8.00 1.00 2.00 10.00
17 9.00 1.00 2.00 6.00
18 7.00 1.00 2.00 8.00
19 7.00 1.00 2.00 4.00
20 6.00 1.00 2.00 9.00
21 4.00 2.00 2.00 6.00
22 5.00 2.00 2.00 8.00
23 5.00 2.00 2.00 10.00
24 6.00 2.00 2.00 4.00
25 4.00 2.00 2.00 9.00
26 2.00 3.00 2.00 4.00
27 3.00 3.00 2.00 6.00
28 2.00 3.00 2.00 10.00
29 1.00 3.00 2.00 9.00
30 2.00 3.00 2.00 8.00
Example of One-way ANOVA
- First do it manually.
- Then by making use of computer (SPSS).

Table 16.3 (p.498) Effect of In-Store Promotion on Sales

Level of In-Store Promotion (normalized sales)

Number            High (1)   Medium (2)   Low (3)
1                 10         8            5
2                 9          8            7
3                 10         7            6
4                 8          9            4
5                 9          6            5
6                 8          4            2
7                 9          5            3
8                 7          5            2
9                 7          6            1
10                6          4            2
Column totals     83         62           37
Category means    83/10      62/10        37/10
                  = 8.3      = 6.2        = 3.7

Grand mean:
\[ \bar{Y} = \frac{83 + 62 + 37}{30} = 6.067 \]
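
The columns of Table 16.3 are the Table 16.2 sales figures regrouped by promotion level. As a minimal sketch (the variable names are illustrative), the column totals, category means, and grand mean can be reproduced like this:

```python
# Sales (Y) and in-store promotion level (X1) for stores 1-30, from Table 16.2.
sales = [10, 9, 10, 8, 9, 8, 8, 7, 9, 6, 5, 7, 6, 4, 5,
         8, 9, 7, 7, 6, 4, 5, 5, 6, 4, 2, 3, 2, 1, 2]
promotion = [1] * 5 + [2] * 5 + [3] * 5 + [1] * 5 + [2] * 5 + [3] * 5

# Group the sales by promotion level (1 = High, 2 = Medium, 3 = Low).
groups = {level: [y for y, p in zip(sales, promotion) if p == level]
          for level in (1, 2, 3)}

totals = {level: sum(ys) for level, ys in groups.items()}
means = {level: sum(ys) / len(ys) for level, ys in groups.items()}
grand_mean = sum(sales) / len(sales)

for level in (1, 2, 3):
    print(level, totals[level], round(means[level], 3))
print("grand mean", round(grand_mean, 3))
```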

The various SSs are computed as follows (p.491).

\[ SS_y = \sum_{i=1}^{N} (Y_i - \bar{Y})^2 \]

= (10 - 6.067)^2 + (9 - 6.067)^2 + (10 - 6.067)^2 + (8 - 6.067)^2 + (9 - 6.067)^2
+ (8 - 6.067)^2 + (9 - 6.067)^2 + (7 - 6.067)^2 + (7 - 6.067)^2 + (6 - 6.067)^2
+ (8 - 6.067)^2 + (8 - 6.067)^2 + (7 - 6.067)^2 + (9 - 6.067)^2 + (6 - 6.067)^2
+ (4 - 6.067)^2 + (5 - 6.067)^2 + (5 - 6.067)^2 + (6 - 6.067)^2 + (4 - 6.067)^2
+ (5 - 6.067)^2 + (7 - 6.067)^2 + (6 - 6.067)^2 + (4 - 6.067)^2 + (5 - 6.067)^2
+ (2 - 6.067)^2 + (3 - 6.067)^2 + (2 - 6.067)^2 + (1 - 6.067)^2 + (2 - 6.067)^2

= (3.933)^2 + (2.933)^2 + (3.933)^2 + (1.933)^2 + (2.933)^2
+ (1.933)^2 + (2.933)^2 + (.933)^2 + (.933)^2 + (-.067)^2
+ (1.933)^2 + (1.933)^2 + (.933)^2 + (2.933)^2 + (-.067)^2
+ (-2.067)^2 + (-1.067)^2 + (-1.067)^2 + (-.067)^2 + (-2.067)^2
+ (-1.067)^2 + (.933)^2 + (-.067)^2 + (-2.067)^2 + (-1.067)^2
+ (-4.067)^2 + (-3.067)^2 + (-4.067)^2 + (-5.067)^2 + (-4.067)^2
= 185.867

\[ SS_x = \sum_{j=1}^{c} n (\bar{Y}_j - \bar{Y})^2 \]

= 10(8.3 - 6.067)^2 + 10(6.2 - 6.067)^2 + 10(3.7 - 6.067)^2
= 10(2.233)^2 + 10(.133)^2 + 10(-2.367)^2
= 106.067

\[ SS_{error} = \sum_{j=1}^{c} \sum_{i=1}^{n} (Y_{ij} - \bar{Y}_j)^2 \]

= (10 - 8.3)^2 + (9 - 8.3)^2 + (10 - 8.3)^2 + (8 - 8.3)^2 + (9 - 8.3)^2
+ (8 - 8.3)^2 + (9 - 8.3)^2 + (7 - 8.3)^2 + (7 - 8.3)^2 + (6 - 8.3)^2
+ (8 - 6.2)^2 + (8 - 6.2)^2 + (7 - 6.2)^2 + (9 - 6.2)^2 + (6 - 6.2)^2
+ (4 - 6.2)^2 + (5 - 6.2)^2 + (5 - 6.2)^2 + (6 - 6.2)^2 + (4 - 6.2)^2
+ (5 - 3.7)^2 + (7 - 3.7)^2 + (6 - 3.7)^2 + (4 - 3.7)^2 + (5 - 3.7)^2
+ (2 - 3.7)^2 + (3 - 3.7)^2 + (2 - 3.7)^2 + (1 - 3.7)^2 + (2 - 3.7)^2
= (1.7)^2 + (.7)^2 + (1.7)^2 + (-.3)^2 + (.7)^2
+ (-.3)^2 + (.7)^2 + (-1.3)^2 + (-1.3)^2 + (-2.3)^2
+ (1.8)^2 + (1.8)^2 + (.8)^2 + (2.8)^2 + (-.2)^2
+ (-2.2)^2 + (-1.2)^2 + (-1.2)^2 + (-.2)^2 + (-2.2)^2
+ (1.3)^2 + (3.3)^2 + (2.3)^2 + (.3)^2 + (1.3)^2
+ (-1.7)^2 + (-.7)^2 + (-1.7)^2 + (-2.7)^2 + (-1.7)^2
= 79.800

So, SSx + SSerror = 106.067 + 79.800 = 185.867 = SSy
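
The manual computation above can be checked with a few lines of Python. This is a minimal sketch using only the Table 16.3 data; the variable names are illustrative.

```python
# Normalized sales by level of in-store promotion (Table 16.3).
groups = {
    "High":   [10, 9, 10, 8, 9, 8, 9, 7, 7, 6],
    "Medium": [8, 8, 7, 9, 6, 4, 5, 5, 6, 4],
    "Low":    [5, 7, 6, 4, 5, 2, 3, 2, 1, 2],
}

all_values = [y for ys in groups.values() for y in ys]
grand_mean = sum(all_values) / len(all_values)

# Total variation: squared deviations of every observation from the grand mean.
ss_y = sum((y - grand_mean) ** 2 for y in all_values)

# Between-category variation: category size times squared deviation of the
# category mean from the grand mean.
ss_x = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
           for ys in groups.values())

# Within-category variation: squared deviations from each category's own mean.
ss_error = sum((y - sum(ys) / len(ys)) ** 2
               for ys in groups.values() for y in ys)

print(round(ss_y, 3), round(ss_x, 3), round(ss_error, 3))
```

Up to floating-point rounding, `ss_x + ss_error` equals `ss_y`, which is the decomposition SSy = SSx + SSerror.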



Therefore, the strength of the effect of X on Y is
\[ \eta^2 = \frac{SS_x}{SS_y} = \frac{106.067}{185.867} = .571 \]
- 57.1% of the variation in sales (Y) is accounted for by in-store promotion (X).

The H0: μ1 = μ2 = μ3 is tested by:
\[ F = \frac{SS_x / (c - 1)}{SS_{error} / (N - c)} = \frac{MS_x}{MS_{error}} = \frac{106.067 / (3 - 1)}{79.800 / (30 - 3)} = 17.944 \]
- F(2, 27, .05) = 3.35 < 17.944 (Fcal).
∴ We reject the H0.

So, comparing the category means (8.3, 6.2, and 3.7), we conclude that a high level
of in-store promotion leads to significantly higher sales.
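
The same two statistics can be recomputed from the sums of squares. This is a minimal sketch with illustrative variable names; the critical value 3.35 is the tabulated F(2, 27) at the 5% level quoted above, not something the code derives.

```python
# Sums of squares from the manual computation.
ss_x = 106.067      # between categories (in-store promotion)
ss_error = 79.800   # within categories (error)
ss_y = 185.867      # total

c = 3    # number of categories of X
N = 30   # total sample size

eta_squared = ss_x / ss_y              # share of variation in Y explained by X

ms_x = ss_x / (c - 1)                  # mean square related to X
ms_error = ss_error / (N - c)          # mean square related to error
f_cal = ms_x / ms_error

f_tab = 3.35                           # tabulated F(2, 27) at the 5% level
reject_h0 = f_cal > f_tab

print(round(eta_squared, 3), round(f_cal, 3), reject_h0)
```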
Table 16.4 (p. 504): One-Way ANOVA

Effect of In-Store Promotion on Sales
- Based on SPSS output

Source of Variation                    Sum of Squares   df   Mean Square   F Ratio   F Prob
Between groups (in-store promotion)    106.067          2    53.033        17.944    .000
Within groups (error)                  79.800           27   2.956
Total                                  185.867          29   6.409

Cell Means
Level of In-store Promotion   Count   Mean
High (1)                      10      8.300
Medium (2)                    10      6.200
Low (3)                       10      3.700
Total                         30      6.067

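Explanation of p value: the p value is the smallest significance level at which H0 can be rejected.
p = .000 (i.e., p < .0005) → H0 is rejected at any conventional level
p = .010 → H0 is rejected at the 1% level
p = .045 → H0 is rejected at the 5% level but not at the 1% level
p = .125 → H0 is rejected only at the 12.5% level or above, so it is not rejected at the 5% or 10% level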
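
For the special case of 2 numerator degrees of freedom, as in this example, the upper-tail probability of the F distribution has a closed form, P(F > f) = (1 + 2f/df2)^(-df2/2), so the p value and the critical values can be checked without statistical tables. The sketch below relies on that special case only; the function names are illustrative, and other numerator degrees of freedom need tables or a statistics library.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom.

    Valid only when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)


def f_critical_df1_2(alpha, df2):
    """Value f such that P(F > f) = alpha, again for 2 numerator df only."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


f_cal = 17.944   # F ratio from the one-way ANOVA table
df2 = 27         # N - c = 30 - 3

p_value = f_upper_tail_df1_2(f_cal, df2)
f_crit_05 = f_critical_df1_2(0.05, df2)
f_crit_01 = f_critical_df1_2(0.01, df2)

print(f"p = {p_value:.6f}")               # prints as .000 to three decimals
print(f"F(2, 27, .05) = {f_crit_05:.2f}")
print(f"F(2, 27, .01) = {f_crit_01:.2f}")
```

Because the p value is far below .0005, SPSS prints it as .000, and H0 is rejected at both the 5% and 1% levels.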

N-way ANOVA

In N-way ANOVA, we also examine the interactions between the factors.
- Interactions occur when the effect of a factor on the dependent
variable depends on the level (category) of the other factors.

Consider two factors X1 & X2 having c1 & c2 categories, respectively.
- The total variation is decomposed as follows:
• SSy = SSx1 + SSx2 + SSx1x2 + SSerror

• Exercise: Decompose SSy for X1, X2 & X3 having c1, c2 & c3 categories.
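
The two-factor decomposition can be illustrated on the Table 16.2 data, with in-store promotion as X1 and coupon level as X2. This is a minimal sketch with illustrative names. Obtaining the interaction term by subtraction is an assumption that holds for a balanced design (equal cell sizes), which is the case here with 5 stores per cell.

```python
# Sales, in-store promotion (X1) and coupon level (X2) for stores 1-30 (Table 16.2).
sales = [10, 9, 10, 8, 9, 8, 8, 7, 9, 6, 5, 7, 6, 4, 5,
         8, 9, 7, 7, 6, 4, 5, 5, 6, 4, 2, 3, 2, 1, 2]
promotion = [1] * 5 + [2] * 5 + [3] * 5 + [1] * 5 + [2] * 5 + [3] * 5
coupon = [1] * 15 + [2] * 15

grand_mean = sum(sales) / len(sales)


def ss_between(labels):
    """Sum over groups of (group size) * (group mean - grand mean)^2."""
    total = 0.0
    for label in set(labels):
        ys = [y for y, l in zip(sales, labels) if l == label]
        total += len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
    return total


ss_y = sum((y - grand_mean) ** 2 for y in sales)
ss_x1 = ss_between(promotion)
ss_x2 = ss_between(coupon)

# Variation between the 3 x 2 = 6 cells (treatments).
ss_cells = ss_between(list(zip(promotion, coupon)))

# Balanced design: interaction is what the cells explain beyond the main effects.
ss_x1x2 = ss_cells - ss_x1 - ss_x2
ss_error = ss_y - ss_cells

print(round(ss_x1, 3), round(ss_x2, 3), round(ss_x1x2, 3), round(ss_error, 3))
```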

Overall Effect or Multiple η²

The strength of the joint effect of 2 factors, called the overall effect, is given by
\[ \text{Multiple } \eta^2 = \frac{SS_{x_1} + SS_{x_2} + SS_{x_1 x_2}}{SS_y} \]

Significance of the Overall Effect
It is tested by an F-test:
\[ F = \frac{(SS_{x_1} + SS_{x_2} + SS_{x_1 x_2}) / df_n}{SS_{error} / df_d} = \frac{SS_{x_1, x_2, x_1 x_2} / df_n}{SS_{error} / df_d} = \frac{MS_{x_1, x_2, x_1 x_2}}{MS_{error}} \]
where dfn = (c1 - 1) + (c2 - 1) + (c1 - 1)(c2 - 1) = c1c2 - 1
dfd = N - c1c2
MS = Mean Square
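
As a numerical illustration, the overall effect can be computed for the in-store promotion (c1 = 3) and coupon (c2 = 2) example. This is a minimal sketch with illustrative variable names; the sums of squares are those of the two-way analysis of the 30 stores.

```python
# Sums of squares for the two-way analysis (promotion x coupon).
ss_x1 = 106.067     # in-store promotion
ss_x2 = 53.333      # coupon
ss_x1x2 = 3.267     # promotion x coupon interaction
ss_error = 23.200   # residual
ss_y = ss_x1 + ss_x2 + ss_x1x2 + ss_error   # total variation

c1, c2, N = 3, 2, 30

# Degrees of freedom: two main effects plus their interaction.
df_n = (c1 - 1) + (c2 - 1) + (c1 - 1) * (c2 - 1)
df_d = N - c1 * c2

ss_model = ss_x1 + ss_x2 + ss_x1x2
multiple_eta_squared = ss_model / ss_y
f_overall = (ss_model / df_n) / (ss_error / df_d)

print(df_n, df_d, round(multiple_eta_squared, 3), round(f_overall, 3))
```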

Table 16.5 (p. 503) Two-Way Analysis of Variance

Source of Variation      Sum of Squares   df   Mean Square   F        Sig of F   ω²
Main effects
  In-store promotion     106.067          2    53.033        54.862   .000       .557
  Coupon                 53.333           1    53.333        55.172   .000       .280
  Combined               159.400          3    53.133        54.966   .000
Two-way interaction      3.267            2    1.633         1.690    .206
Model                    162.667          5    32.533        33.655   .000
Residual (error)         23.200           24   0.967
Total                    185.867          29   6.409

Cell Means
In-store Promotion   Coupon   Count   Mean
High                 Yes      5       9.200
High                 No       5       7.400
Medium               Yes      5       7.600
Medium               No       5       4.800
Low                  Yes      5       5.400
Low                  No       5       2.000

Factor Level Means
Promotion    Coupon   Count   Mean
High                  10      8.300
Medium                10      6.200
Low                   10      3.700
             Yes      15      7.400
             No       15      4.733
Grand Mean            30      6.067
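
The slides do not define the last column of Table 16.5. One common definition of omega squared, assumed here rather than taken from the slides, is ω² = (SSx - dfx × MSerror) / (SStotal + MSerror). The sketch below checks that this assumed formula reproduces the tabulated values; the variable and function names are illustrative.

```python
# Values from the two-way ANOVA table.
ss_promotion, df_promotion = 106.067, 2
ss_coupon, df_coupon = 53.333, 1
ss_model = 162.667
ss_error, df_error = 23.200, 24

ss_total = ss_model + ss_error   # total variation
ms_error = ss_error / df_error   # residual mean square


def omega_squared(ss_effect, df_effect):
    """Assumed definition: (SS_effect - df_effect * MS_error) / (SS_total + MS_error)."""
    return (ss_effect - df_effect * ms_error) / (ss_total + ms_error)


omega_promotion = omega_squared(ss_promotion, df_promotion)
omega_coupon = omega_squared(ss_coupon, df_coupon)

print(round(omega_promotion, 3), round(omega_coupon, 3))
```

Under this assumed definition, in-store promotion accounts for about 55.7% of the variation in sales and coupon for about 28.0%.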

Table 16.6 (p. 506): Analysis of Covariance

Source of Variation      Sum of Squares   df   Mean Square   F        Sig of F
Covariates
  Clientele              .838             1    .838          .862     .363
Main effects
  Promotion              106.067          2    53.033        54.546   .000
  Coupon                 53.333           1    53.333        54.855   .000
  Combined               159.400          3    53.133        54.649   .000
Two-way interaction
  Promotion*coupon       3.267            2    1.633         1.680    .208
Model                    163.505          6    27.251        28.028   .000
Residual (error)         22.362           23   .972
Total                    185.867          29   6.409

Covariate    Raw Coefficient
Clientele    -.078
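
The covariate row and the raw coefficient can be reproduced from the Table 16.2 data with the pooled within-cell slope of sales on clientele rating. This is a minimal sketch with illustrative names, not a description of how SPSS produced the table. It matches here because the clientele ratings are balanced across the six promotion-by-coupon cells; an unbalanced covariate would need a full regression approach.

```python
# Sales (Y), promotion (X1), coupon (X2) and clientele rating (X3), Table 16.2.
sales = [10, 9, 10, 8, 9, 8, 8, 7, 9, 6, 5, 7, 6, 4, 5,
         8, 9, 7, 7, 6, 4, 5, 5, 6, 4, 2, 3, 2, 1, 2]
promotion = [1] * 5 + [2] * 5 + [3] * 5 + [1] * 5 + [2] * 5 + [3] * 5
coupon = [1] * 15 + [2] * 15
clientele = [9, 10, 8, 4, 6, 8, 4, 10, 6, 9, 8, 9, 6, 10, 4,
             10, 6, 8, 4, 9, 6, 8, 10, 4, 9, 4, 6, 10, 9, 8]

cells = list(zip(promotion, coupon))

# Pooled within-cell sums of squares and cross-products.
s_zz = s_zy = s_yy = 0.0
for cell in set(cells):
    ys = [y for y, k in zip(sales, cells) if k == cell]
    zs = [z for z, k in zip(clientele, cells) if k == cell]
    y_bar, z_bar = sum(ys) / len(ys), sum(zs) / len(zs)
    s_zz += sum((z - z_bar) ** 2 for z in zs)
    s_zy += sum((z - z_bar) * (y - y_bar) for z, y in zip(zs, ys))
    s_yy += sum((y - y_bar) ** 2 for y in ys)

raw_coefficient = s_zy / s_zz         # slope of sales on clientele rating
ss_covariate = s_zy ** 2 / s_zz       # variation attributed to the covariate
ss_residual = s_yy - ss_covariate     # error left after the covariate
df_residual = len(sales) - len(set(cells)) - 1
f_covariate = ss_covariate / (ss_residual / df_residual)

print(round(raw_coefficient, 3), round(ss_covariate, 3),
      round(ss_residual, 3), round(f_covariate, 3))
```

The small, non-significant covariate effect means that adjusting for clientele rating barely changes the error term (23.200 without the covariate versus 22.362 with it).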
