Two Sample Test & ANOVA Session 5


TWO-SAMPLE TESTS AND ANALYSIS OF VARIANCE

STAT8008 APPLIED STATISTICS FOR BUSINESS


SESSION 05
CHAPTER 10 - TWO-SAMPLE TESTS
Objectives:
• How to compare the means of two independent populations.
• How to compare the means of two related populations.
• How to compare the proportions of two independent populations.
• How to compare the variances of two independent populations.
TWO-SAMPLE TESTS
DCOVA

Two-Sample Tests

• Population means, independent samples. Example: Group 1 vs. Group 2.
• Population means, related samples. Example: same group before vs. after treatment.
• Population proportions. Example: Proportion 1 vs. Proportion 2.
• Population variances. Example: Variance 1 vs. Variance 2.
DIFFERENCE BETWEEN TWO
MEANS DCOVA

Goal: Test a hypothesis or form a confidence interval for the difference between two population means, μ1 – μ2.

The point estimate for the difference is X̄1 – X̄2.

Population means, independent samples. Two cases:
• σ1 and σ2 unknown, assumed equal.
• σ1 and σ2 unknown, not assumed equal.
DIFFERENCE BETWEEN TWO MEANS: INDEPENDENT
SAMPLES DCOVA
• Different data sources:
• Unrelated.
• Independent.
• Sample selected from one population has no effect on the sample selected from the other population.

• σ1 and σ2 unknown, assumed equal: use Sp to estimate the unknown σ. Use a pooled-variance t test.
• σ1 and σ2 unknown, not assumed equal: use S1 and S2 to estimate the unknown σ1 and σ2. Use a separate-variance t test.
HYPOTHESIS TESTS FOR
TWO POPULATION MEANS DCOVA
Two Population Means, Independent Samples

Lower-tail test:      Upper-tail test:      Two-tail test:
H0: μ1 ≥ μ2           H0: μ1 ≤ μ2           H0: μ1 = μ2
H1: μ1 < μ2           H1: μ1 > μ2           H1: μ1 ≠ μ2
i.e.,                 i.e.,                 i.e.,
H0: μ1 – μ2 ≥ 0       H0: μ1 – μ2 ≤ 0       H0: μ1 – μ2 = 0
H1: μ1 – μ2 < 0       H1: μ1 – μ2 > 0       H1: μ1 – μ2 ≠ 0
HYPOTHESIS TESTS FOR μ1 – μ2 DCOVA
Two Population Means, Independent Samples
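Lower-tail test:      Upper-tail test:      Two-tail test:
H0: μ1 – μ2 ≥ 0       H0: μ1 – μ2 ≤ 0       H0: μ1 – μ2 = 0
H1: μ1 – μ2 < 0       H1: μ1 – μ2 > 0       H1: μ1 – μ2 ≠ 0
Tail area: α          Tail area: α          Tail areas: α/2 and α/2
Critical value: -tα   Critical value: tα    Critical values: -tα/2 and tα/2

• Lower-tail test: reject H0 if tSTAT < -tα.
• Upper-tail test: reject H0 if tSTAT > tα.
• Two-tail test: reject H0 if tSTAT < -tα/2 or tSTAT > tα/2.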
HYPOTHESIS TESTS FOR μ1 - μ2 WITH σ1 AND σ2 UNKNOWN AND ASSUMED EQUAL
DCOVA

Assumptions:
• Samples are randomly and independently drawn.
• Populations are normally distributed or both sample sizes are at least 30.
• Population variances are unknown but assumed equal.
HYPOTHESIS TESTS FOR μ1 - μ2 WITH σ1 AND σ2 UNKNOWN AND ASSUMED EQUAL
(continued)
DCOVA
• The pooled variance is:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

• The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

• Where tSTAT has d.f. = (n1 + n2 – 2).
CONFIDENCE INTERVAL FOR μ1 - μ2 WITH σ1 AND σ2 UNKNOWN AND ASSUMED EQUAL
DCOVA

The confidence interval for μ1 – μ2 is:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$$

Where tα/2 has d.f. = n1 + n2 – 2.
POOLED-VARIANCE T TEST
EXAMPLE
DCOVA
You are a financial analyst for a brokerage firm. Is there a
difference in dividend yield between stocks listed on the NYSE
& NASDAQ? You collect the following data:
NYSE NASDAQ
Number 21 25
Sample mean 3.27 2.53
Sample std dev 1.30 1.16

Assuming both populations are approximately normal with equal variances, is there a difference in mean dividend yield (α = 0.05)?
POOLED-VARIANCE T TEST EXAMPLE:
CALCULATING THE TEST STATISTIC (continued)
H0: μ1 - μ2 = 0, i.e., (μ1 = μ2)
H1: μ1 - μ2 ≠ 0, i.e., (μ1 ≠ μ2)

The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$

where the pooled variance is:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)(1.30)^2 + (25 - 1)(1.16)^2}{(21 - 1) + (25 - 1)} = 1.5021$$
POOLED-VARIANCE T TEST EXAMPLE: HYPOTHESIS
TEST SOLUTION DCOVA
H0: μ1 - μ2 = 0, i.e., (μ1 = μ2)
H1: μ1 - μ2 ≠ 0, i.e., (μ1 ≠ μ2)
α = 0.05
df = 21 + 25 - 2 = 44
Critical Values: t = ±2.0154 (area 0.025 in each tail; reject H0 if tSTAT < -2.0154 or tSTAT > 2.0154)

Test Statistic:

$$t_{STAT} = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$

Decision: Reject H0 at α = 0.05, because 2.040 > 2.0154.
Conclusion: There is evidence of a difference in means.
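As a rough check of the arithmetic in this example, the following Python sketch recomputes the pooled variance and tSTAT from the summary statistics on the slide. The variable names are illustrative assumptions, and the critical value 2.0154 is copied from the slide rather than computed.

```python
import math

# Summary statistics from the slide (sample 1 = NYSE, sample 2 = NASDAQ).
n1, mean1, sd1 = 21, 3.27, 1.30
n2, mean2, sd2 = 25, 2.53, 1.16

# Pooled variance: weighted average of the two sample variances.
sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / ((n1 - 1) + (n2 - 1))

# Test statistic under H0: mu1 - mu2 = 0.
std_error = math.sqrt(sp2 * (1 / n1 + 1 / n2))
t_stat = ((mean1 - mean2) - 0) / std_error
df = n1 + n2 - 2

# Two-tail critical value for alpha = 0.05 and 44 d.f., taken from the slide.
t_crit = 2.0154
reject_h0 = abs(t_stat) > t_crit

print(f"Sp^2 = {sp2:.4f}, tSTAT = {t_stat:.3f}, d.f. = {df}, reject H0: {reject_h0}")
```

The statistic lands only slightly above the critical value, which is why the equal-variance assumption matters so much for this data set.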
POOLED VARIANCE T TEST IN EXCEL
DCOVA
POOLED-VARIANCE T TEST EXAMPLE:
CONFIDENCE INTERVAL FOR μ1 - μ2 DCOVA
Since we rejected H0, can we be 95% confident that µNYSE > µNASDAQ?

95% Confidence Interval for µNYSE - µNASDAQ:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = 0.74 \pm 2.0154 \times 0.3628 = (0.009,\ 1.471)$$

Since 0 is less than the entire interval, we can be 95% confident that µNYSE > µNASDAQ.
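The interval can be reproduced with a few lines of Python. This is a minimal sketch under the slide's assumptions: the variable names are illustrative, and the value 2.0154 for tα/2 with 44 d.f. is taken from the slide.

```python
import math

# Summary statistics from the slide (sample 1 = NYSE, sample 2 = NASDAQ).
n1, mean1, sd1 = 21, 3.27, 1.30
n2, mean2, sd2 = 25, 2.53, 1.16
t_crit = 2.0154  # t(alpha/2 = 0.025, d.f. = 44), copied from the slide

# Pooled variance and standard error of the difference in means.
sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
std_error = math.sqrt(sp2 * (1 / n1 + 1 / n2))

# Point estimate plus or minus the margin of error.
margin = t_crit * std_error
lower = (mean1 - mean2) - margin
upper = (mean1 - mean2) + margin

print(f"95% CI for muNYSE - muNASDAQ: ({lower:.3f}, {upper:.3f})")
```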
POOLED VARIANCE T CONFIDENCE INTERVAL IN DCOVA
EXCEL
HYPOTHESIS TESTS FOR μ1 - μ2 WITH σ1 AND σ2 UNKNOWN, NOT ASSUMED EQUAL
DCOVA
Assumptions:
• Samples are randomly and independently drawn.
• Populations are normally distributed or both sample sizes are at least 30.
• Population variances are unknown and cannot be assumed to be equal.
HYPOTHESIS TESTS FOR μ1 - μ2 WITH σ1 AND σ2 UNKNOWN AND NOT ASSUMED EQUAL
(continued)
DCOVA
This test is known as the separate-variance t test.

The formulae for this test are not covered in this chapter. See reference 3 for more details.

This test done in Excel is shown on the next slide.
SEPARATE-VARIANCE T TEST IN EXCEL
(continued)
DCOVA
                 NYSE    NASDAQ
Number           21      25
Sample mean      3.27    2.53
Sample std dev   1.30    1.16

• Using α = 0.05, this test fails to reject the null hypothesis.
• For these data it is important to determine whether equal variances can be assumed, because the null hypothesis was rejected when we assumed equal variances.
• Section 10.4 discusses a test that helps determine whether this assumption is reasonable.
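The slides do not give the separate-variance formulae, so the sketch below uses the standard Welch form of the statistic with the Welch-Satterthwaite degrees of freedom. That choice is an assumption about what the Excel worksheet computes, and the variable names are illustrative.

```python
import math

# Summary statistics from the slide (sample 1 = NYSE, sample 2 = NASDAQ).
n1, mean1, sd1 = 21, 3.27, 1.30
n2, mean2, sd2 = 25, 2.53, 1.16

# Each sample contributes its own variance; nothing is pooled.
v1 = sd1**2 / n1
v2 = sd2**2 / n2

# Separate-variance (Welch) t statistic under H0: mu1 - mu2 = 0.
t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(f"tSTAT = {t_stat:.3f}, approximate d.f. = {df:.2f}")
```

Under these assumptions the statistic is a little smaller than the pooled value of 2.040 and the degrees of freedom drop below 44, which is consistent with the slide's statement that this version of the test fails to reject the null hypothesis.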
RELATED POPULATIONS
THE PAIRED DIFFERENCE TEST DCOVA
Tests Means of 2 Related Populations
Related samples:
• Paired or matched samples.
• Repeated measures (before/after).
• Use difference between paired values:

Di = X1i - X2i
• Eliminates Variation Among Subjects.
• Assumptions:
• Differences are normally distributed.
• Or, if not Normal, use large samples.
RELATED POPULATIONS (continued)
THE PAIRED DIFFERENCE TEST DCOVA

The ith paired difference is Di, where

$$D_i = X_{1i} - X_{2i}$$

The point estimate for the paired difference population mean μD is D̄:

$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$

The sample standard deviation is SD:

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

n is the number of pairs in the paired sample
THE PAIRED DIFFERENCE TEST:
FINDING TSTAT
DCOVA

• The test statistic for μD is:

$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$

• Where tSTAT has n - 1 d.f.


THE PAIRED DIFFERENCE TEST:
POSSIBLE HYPOTHESES DCOVA
Paired Samples
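Lower-tail test:      Upper-tail test:      Two-tail test:
H0: μD ≥ 0            H0: μD ≤ 0            H0: μD = 0
H1: μD < 0            H1: μD > 0            H1: μD ≠ 0
Tail area: α          Tail area: α          Tail areas: α/2 and α/2
Critical value: -tα   Critical value: tα    Critical values: -tα/2 and tα/2

• Lower-tail test: reject H0 if tSTAT < -tα.
• Upper-tail test: reject H0 if tSTAT > tα.
• Two-tail test: reject H0 if tSTAT < -tα/2 or tSTAT > tα/2.
Where tSTAT has n - 1 d.f.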
THE PAIRED DIFFERENCE CONFIDENCE INTERVAL
DCOVA

The confidence interval for μD is:

$$\bar{D} \pm t_{\alpha/2}\frac{S_D}{\sqrt{n}}$$

where

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$
Paired Difference Test:
Example DCOVA
• Assume you send your salespeople to a “customer
service” training workshop. Has the training made a
difference in the number of complaints? You collect the
following data:

Number of Complaints:

Salesperson   Before (1)   After (2)   Difference, Di = (2) - (1)
C.B.          6            4           -2
T.F.          20           6           -14
M.H.          3            2           -1
R.K.          0            0           0
M.O.          4            0           -4
Total                                  -21

$$\bar{D} = \frac{\sum D_i}{n} = \frac{-21}{5} = -4.2$$

$$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$$
Paired Difference Test:
Solution DCOVA

• Has the training made a difference in the number of complaints (at the 0.01 level)?

H0: μD = 0
H1: μD ≠ 0
α = .01, D̄ = -4.2
d.f. = n - 1 = 4
Critical values: t0.005 = ±4.604 (area α/2 in each tail; reject H0 if tSTAT < -4.604 or tSTAT > 4.604)

Test Statistic:

$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$$

Decision: Do not reject H0 (tSTAT = -1.66 is not in the rejection region).
Conclusion: There is insufficient evidence of a change in the number of complaints.
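The paired calculation can be sketched in Python from the raw before/after counts. The list names are illustrative assumptions, and the critical value 4.604 is taken from the slide. The same quantities also give the confidence interval for μD defined earlier; because the code keeps SD unrounded, its interval endpoints can differ in the second decimal from a hand calculation that uses SD = 5.67.

```python
import math
from statistics import mean, stdev

# Complaints per salesperson (C.B., T.F., M.H., R.K., M.O.) from the slide.
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

# Paired differences, defined on the slide as (2) - (1) = after - before.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

d_bar = mean(diffs)   # point estimate of mu_D
s_d = stdev(diffs)    # sample standard deviation (n - 1 in the denominator)

# Test statistic under H0: mu_D = 0, with n - 1 = 4 d.f.
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))

# Two-tail critical value for alpha = 0.01 and 4 d.f., copied from the slide.
t_crit = 4.604
reject_h0 = abs(t_stat) > t_crit

# 99% confidence interval for mu_D.
margin = t_crit * s_d / math.sqrt(n)
ci = (d_bar - margin, d_bar + margin)

print(f"D-bar = {d_bar}, SD = {s_d:.2f}, tSTAT = {t_stat:.2f}, reject H0: {reject_h0}")
print(f"99% CI for mu_D: ({ci[0]:.2f}, {ci[1]:.2f})")
```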
PAIRED DIFFERENCE T TEST IN EXCEL
DCOVA

Decision: Do not reject H0


(tstat is not in the rejection region).

Conclusion: There is
insufficient evidence of a
change in the number of
complaints.
THE PAIRED DIFFERENCE CONFIDENCE INTERVAL
-- EXAMPLE DCOVA

The confidence interval for μD is:

$$\bar{D} \pm t_{\alpha/2}\frac{S_D}{\sqrt{n}}$$

We can be 99% confident that this interval contains the true value of μD.

D̄ = -4.2, SD = 5.67

$$\text{99\% CI for } \mu_D:\quad -4.2 \pm 4.604\,\frac{5.67}{\sqrt{5}} = (-15.87,\ 7.47)$$
TWO POPULATION
PROPORTIONS
DCOVA
Goal: test a hypothesis or form a confidence interval for the difference between two population proportions, π1 – π2.

Assumptions:
n1π1 ≥ 5, n1(1 - π1) ≥ 5
n2π2 ≥ 5, n2(1 - π2) ≥ 5

The point estimate for the difference is p1 – p2.
TWO POPULATION
PROPORTIONS DCOVA
Under the null hypothesis we assume π1 = π2, so we pool the two sample estimates.

The pooled estimate for the overall proportion is:

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$$

where X1 and X2 are the numbers of items of interest in samples 1 and 2.
TWO POPULATION
PROPORTIONS (continued)

The test statistic for π1 – π2 is a Z statistic:

$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}, \qquad p_1 = \frac{X_1}{n_1}, \qquad p_2 = \frac{X_2}{n_2}$$
HYPOTHESIS TESTS FOR
TWO POPULATION PROPORTIONS
DCOVA
Population proportions

Lower-tail test:      Upper-tail test:      Two-tail test:
H0: π1 ≥ π2           H0: π1 ≤ π2           H0: π1 = π2
H1: π1 < π2           H1: π1 > π2           H1: π1 ≠ π2
i.e.,                 i.e.,                 i.e.,
H0: π1 – π2 ≥ 0       H0: π1 – π2 ≤ 0       H0: π1 – π2 = 0
H1: π1 – π2 < 0       H1: π1 – π2 > 0       H1: π1 – π2 ≠ 0
HYPOTHESIS TESTS FOR
TWO POPULATION PROPORTIONS
(continued)
Population proportions DCOVA
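Lower-tail test:      Upper-tail test:      Two-tail test:
H0: π1 – π2 ≥ 0       H0: π1 – π2 ≤ 0       H0: π1 – π2 = 0
H1: π1 – π2 < 0       H1: π1 – π2 > 0       H1: π1 – π2 ≠ 0
Tail area: α          Tail area: α          Tail areas: α/2 and α/2
Critical value: -Zα   Critical value: Zα    Critical values: -Zα/2 and Zα/2

• Lower-tail test: reject H0 if ZSTAT < -Zα.
• Upper-tail test: reject H0 if ZSTAT > Zα.
• Two-tail test: reject H0 if ZSTAT < -Zα/2 or ZSTAT > Zα/2.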
HYPOTHESIS TEST EXAMPLE:
TWO POPULATION PROPORTIONS
DCOVA
Is there a significant difference between the
proportion of men and the proportion of women
who will vote Yes on Proposition A?

• In a random sample, 36 of 72 men and 35 of 50 women indicated they would vote Yes.

• Test at the .05 level of significance.


Hypothesis Test Example:
Two Population Proportions
(continued)

• The hypothesis test is:

H0: π1 – π2 = 0 (the two proportions are equal).
H1: π1 – π2 ≠ 0 (there is a significant difference between proportions).
• The sample proportions are:
• Men: p1 = 36/72 = 0.50
• Women: p2 = 35/50 = 0.70
• The pooled estimate for the overall proportion is:

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{36 + 35}{72 + 50} = \frac{71}{122} = .582$$
HYPOTHESIS TEST EXAMPLE: (continued)
TWO POPULATION PROPORTIONS DCOVA
The test statistic for π1 – π2 is:

$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(.50 - .70) - 0}{\sqrt{.582(1 - .582)\left(\dfrac{1}{72} + \dfrac{1}{50}\right)}} = -2.20$$

Critical Values = ±1.96 for α = .05 (area .025 in each tail; reject H0 if ZSTAT < -1.96 or ZSTAT > 1.96).

Decision: Reject H0, because -2.20 < -1.96.
Conclusion: There is evidence of a significant difference in the proportion of men and women who will vote yes.
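A short Python sketch of the same Z test follows. It works from the raw counts on the slide; the variable names are illustrative assumptions, and 1.96 is the two-tail critical value quoted on the slide.

```python
import math

# Counts from the slide: sample 1 = men, sample 2 = women.
x1, n1 = 36, 72
x2, n2 = 35, 50

p1 = x1 / n1
p2 = x2 / n2

# Pooled estimate of the common proportion under H0: pi1 = pi2.
p_bar = (x1 + x2) / (n1 + n2)

# Z statistic for H0: pi1 - pi2 = 0.
std_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
z_stat = ((p1 - p2) - 0) / std_error

z_crit = 1.96  # two-tail critical value for alpha = 0.05, from the slide
reject_h0 = abs(z_stat) > z_crit

print(f"p1 = {p1:.2f}, p2 = {p2:.2f}, pooled p = {p_bar:.3f}, ZSTAT = {z_stat:.2f}, reject H0: {reject_h0}")
```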
COMPARING TWO POPULATION
DCOVA
PROPORTIONS IN EXCEL
Decision: Reject H0.
Conclusion: There is
evidence of a significant
difference in the proportion
of men and women who
will vote yes.
CONFIDENCE INTERVAL FOR
TWO POPULATION PROPORTIONS
DCOVA

The confidence interval for π1 – π2 is:

$$(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
CONFIDENCE INTERVAL FOR TWO
POPULATION PROPORTIONS -- EXAMPLE
DCOVA
The 95% confidence interval for π1 – π2 is:

$$(0.50 - 0.70) \pm 1.96\sqrt{\frac{0.50(0.50)}{72} + \frac{0.70(0.30)}{50}} = (-0.37,\ -0.03)$$

Since this interval does not contain 0, we can be 95% confident the two proportions are different.
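The interval above uses the unpooled standard error, unlike the hypothesis test. A minimal Python sketch, with illustrative variable names and the slide's Zα/2 = 1.96:

```python
import math

# Sample proportions and sizes from the slide (1 = men, 2 = women).
p1, n1 = 36 / 72, 72
p2, n2 = 35 / 50, 50
z_crit = 1.96  # Z(alpha/2) for 95% confidence, from the slide

# Unpooled standard error: each sample keeps its own proportion.
std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

margin = z_crit * std_error
lower = (p1 - p2) - margin
upper = (p1 - p2) + margin

print(f"95% CI for pi1 - pi2: ({lower:.2f}, {upper:.2f})")
```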
TESTING FOR THE RATIO OF TWO POPULATION VARIANCES
DCOVA
Tests for two population variances use the F test statistic:

$$F_{STAT} = \frac{S_1^2}{S_2^2}$$

Hypotheses:
• Two-tail test: H0: σ1² = σ2², H1: σ1² ≠ σ2²
• Upper-tail test: H0: σ1² ≤ σ2², H1: σ1² > σ2²

Where:
S1² = Variance of sample 1 (the larger sample variance)
n1 = sample size of sample 1
S2² = Variance of sample 2 (the smaller sample variance)
n2 = sample size of sample 2
n1 – 1 = numerator degrees of freedom
n2 – 1 = denominator degrees of freedom
THE F DISTRIBUTION DCOVA
• The F critical value is found from the F table.
• There are two degrees of freedom required: numerator and denominator.
• The larger sample variance is always the numerator.
• When $F_{STAT} = S_1^2 / S_2^2$, df1 = n1 – 1 and df2 = n2 – 1.
• In the F table:
• numerator degrees of freedom determine the column.
• denominator degrees of freedom determine the row.
FINDING THE REJECTION REGION
DCOVA
Two-tail test:                  Upper-tail test:
H0: σ1² = σ2²                   H0: σ1² ≤ σ2²
H1: σ1² ≠ σ2²                   H1: σ1² > σ2²
Upper-tail area: α/2            Upper-tail area: α
Reject H0 if FSTAT > Fα/2       Reject H0 if FSTAT > Fα


F TEST: AN EXAMPLE DCOVA

You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE & NASDAQ. You collect the following data:

          NYSE    NASDAQ
Number    21      25
Mean      3.27    2.53
Std dev   1.30    1.16

Is there a difference in the variances between the NYSE & NASDAQ at the α = 0.05 level?
F TEST: EXAMPLE SOLUTION DCOVA
• Form the hypothesis test:
H0: σ1² = σ2² (there is no difference between variances.)
H1: σ1² ≠ σ2² (there is a difference between variances.)
• Find the F critical value for α = 0.05.
• Numerator d.f. = n1 – 1 = 21 – 1 = 20.
• Denominator d.f. = n2 – 1 = 25 – 1 = 24.
• Fα/2 = F.025, 20, 24 = 2.33.


F TEST: EXAMPLE SOLUTION
DCOVA
(continued)

H0: σ1² = σ2²
H1: σ1² ≠ σ2²

• The test statistic is:

$$F_{STAT} = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$

• Upper-tail area α/2 = .025, so reject H0 if FSTAT > F0.025 = 2.33.
• FSTAT = 1.256 is not in the rejection region, so we do not reject H0.
• Conclusion: There is not sufficient evidence of a difference in variances at α = .05.
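The F test for this example is a one-line ratio once the larger sample variance is placed in the numerator. A minimal Python sketch, assuming illustrative variable names and using the critical value 2.33 from the slide:

```python
# Sample sizes and standard deviations from the slide.
n_nyse, sd_nyse = 21, 1.30
n_nasdaq, sd_nasdaq = 25, 1.16

# Put the larger sample variance in the numerator, as the slides require.
(var1, n1), (var2, n2) = sorted(
    [(sd_nyse**2, n_nyse), (sd_nasdaq**2, n_nasdaq)], reverse=True
)

f_stat = var1 / var2
df_num = n1 - 1   # numerator degrees of freedom
df_den = n2 - 1   # denominator degrees of freedom

f_crit = 2.33  # F(0.025; 20, 24), copied from the slide
reject_h0 = f_stat > f_crit

print(f"FSTAT = {f_stat:.3f} with ({df_num}, {df_den}) d.f., reject H0: {reject_h0}")
```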
F TEST IN EXCEL DCOVA

Conclusion: There is not sufficient evidence of a difference in variances at α = .05.
CHAPTER SUMMARY
In this chapter we discussed:
• Comparing the means of two independent populations.

• Comparing the means of two related populations.

• Comparing the proportions of two independent populations.

• Comparing the variances of two independent populations.


CHAPTER 11 - ANALYSIS OF VARIANCE
Objectives:
• The basic concepts of experimental design.
• How to use one-way analysis of variance to test for differences among
the means of several groups.
• To use two-way analysis of variance and interpret the interaction
effect.
• To perform multiple comparisons in a one-way analysis of variance
and a two-way analysis of variance.
CHAPTER OVERVIEW
DCOVA
Analysis of Variance (ANOVA)

Completely Randomized Design Two-Way


One-Way ANOVA ANOVA
F-test
F-test
Tukey-
Kramer
Multiple Interaction
Comparisons Effects
Levene Test
For
Homogeneity Tukey Multiple
of Variance Comparisons
GENERAL ANOVA SETTING DCOVA

• Investigator controls one or more factors of interest.


• Each factor contains two or more levels.
• Levels can be numerical or categorical.
• Different levels produce different groups.
• Think of each group as a sample from a different
population.
• Observe effects on the dependent variable.
• Are the groups the same?
• Experimental design: the plan used to collect the data.
COMPLETELY RANDOMIZED
DESIGN
DCOVA
• Experimental units (subjects) are assigned randomly to groups.
• Subjects are assumed homogeneous.
• Only one factor or independent variable.
• With two or more levels.
• Analyzed by one-factor analysis of variance (ANOVA).
ONE-WAY ANALYSIS OF
VARIANCE DCOVA
• Evaluate the difference among the means of three or
more groups.
Examples: Number of accidents for 1st, 2nd, and 3rd shift.
Expected mileage for five brands of tires.

• Assumptions:
• Populations are normally distributed.
• Populations have equal variances.
• Samples are randomly and independently selected.
HYPOTHESES OF ONE-WAY
ANOVA DCOVA

• H0: μ1 = μ2 = μ3 = ⋯ = μc
• All population means are equal.
• i.e., no factor effect (no variation in means among groups.)

• H1: Not all of the population means are equal
• At least one population mean is different.
• i.e., there is a factor effect.
• Does not mean that all population means are different (some pairs may be the same).
ONE-WAY ANOVA
DCOVA
H0: μ1 = μ2 = μ3 = ⋯ = μc
H1: Not all μj are equal

When The Null Hypothesis is True
All Means are the same:
(No Factor Effect)

μ1 = μ2 = μ3
ONE-WAY ANOVA (continued)

DCOVA
H0: μ1 = μ2 = μ3 = ⋯ = μc
H1: Not all μj are equal
When The Null Hypothesis is NOT true
At least one of the means is different.
(Factor Effect is present).

μ1 = μ2 ≠ μ3 or μ1 ≠ μ2 ≠ μ3
PARTITIONING THE VARIATION
DCOVA
• Total variation can be split into two parts:

SST = SSA + SSW

SST = Total Sum of Squares (Total variation).
SSA = Sum of Squares Among Groups (Among-group variation).
SSW = Sum of Squares Within Groups (Within-group variation).
PARTITIONING THE VARIATION
(continued)

DCOVA
SST = SSA + SSW

Total Variation = the aggregate variation of the individual data values across the various factor levels (SST).

Among-Group Variation = variation among the factor sample means (SSA).

Within-Group Variation = variation that exists among the data values within a particular factor level (SSW).
PARTITION OF TOTAL VARIATION
DCOVA
Total Variation (SST) = Variation Due to Factor (SSA) + Variation Due to Random Error (SSW)
TOTAL SUM OF SQUARES DCOVA

SST = SSA + SSW

$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$$

Where:
SST = Total sum of squares
c = number of groups or levels
nj = number of values in group j
Xij = ith observation from group j
X̿ = grand mean (mean of all data values)
TOTAL VARIATION DCOVA
(continued)

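$$SST = (X_{11} - \bar{\bar{X}})^2 + (X_{12} - \bar{\bar{X}})^2 + \cdots + (X_{n_c c} - \bar{\bar{X}})^2$$

[Figure: Response, X plotted for Group 1, Group 2 and Group 3, with deviations measured from the grand mean.]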

AMONG-GROUP VARIATION DCOVA

SST = SSA + SSW

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$

Where:
SSA = Sum of squares among groups
c = number of groups
nj = sample size from group j
X̄j = sample mean from group j
X̿ = grand mean (mean of all data values)
AMONG-GROUP VARIATION
(continued)
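DCOVA

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$

Variation Due to Differences Among Groups.

$$MSA = \frac{SSA}{c - 1}$$

Mean Square Among = SSA/degrees of freedom.

[Figure: group distributions centered at different means μi and μj.]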
AMONG-GROUP VARIATION
DCOVA
(continued)

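$$SSA = n_1(\bar{X}_1 - \bar{\bar{X}})^2 + n_2(\bar{X}_2 - \bar{\bar{X}})^2 + \cdots + n_c(\bar{X}_c - \bar{\bar{X}})^2$$

[Figure: Response, X plotted for Group 1, Group 2 and Group 3, showing the group means X̄1, X̄2, X̄3 and the grand mean X̿.]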


WITHIN-GROUP VARIATION DCOVA

SST = SSA + SSW

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Where:
SSW = Sum of squares within groups
c = number of groups
nj = sample size from group j
X̄j = sample mean from group j
Xij = ith observation in group j
WITHIN-GROUP VARIATION
(continued)

DCOVA

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Summing the variation within each group and then adding over all groups.

$$MSW = \frac{SSW}{n - c}$$

Mean Square Within = SSW/degrees of freedom.

[Figure: a single group distribution centered at μj.]
WITHIN-GROUP VARIATION
DCOVA
(continued)

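$$SSW = (X_{11} - \bar{X}_1)^2 + (X_{12} - \bar{X}_2)^2 + \cdots + (X_{n_c c} - \bar{X}_c)^2$$

[Figure: Response, X plotted for Group 1, Group 2 and Group 3, with deviations measured from each group mean X̄1, X̄2, X̄3.]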


OBTAINING THE MEAN SQUARES
DCOVA
The Mean Squares are obtained by dividing the various sums of squares by their associated degrees of freedom.

Mean Square Among (d.f. = c - 1):
$$MSA = \frac{SSA}{c - 1}$$

Mean Square Within (d.f. = n - c):
$$MSW = \frac{SSW}{n - c}$$

Mean Square Total (d.f. = n - 1):
$$MST = \frac{SST}{n - 1}$$
One-Way ANOVA Table
DCOVA

Source of Variation   Degrees of Freedom   Sum Of Squares   Mean Square (Variance)   F
Among Groups          c - 1                SSA              MSA = SSA / (c - 1)      FSTAT = MSA / MSW
Within Groups         n - c                SSW              MSW = SSW / (n - c)
Total                 n – 1                SST

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
ONE-WAY ANOVA
F TEST STATISTIC DCOVA

H0: μ1 = μ2 = … = μc
H1: At least two population means are different
• Test statistic

$$F_{STAT} = \frac{MSA}{MSW}$$

MSA is mean squares among groups.
MSW is mean squares within groups.

• Degrees of freedom
• df1 = c – 1 (c = number of groups)
• df2 = n – c (n = sum of sample sizes from all populations)
INTERPRETING ONE-WAY ANOVA
F STATISTIC DCOVA

• The F statistic is the ratio of the among estimate of variance and the within estimate of variance:
• The ratio must always be positive.
• df1 = c - 1 will typically be small.
• df2 = n - c will typically be large.

Decision Rule:
• Reject H0 if FSTAT > Fα, otherwise do not reject H0.

[Figure: F distribution starting at 0, with the "do not reject H0" region below Fα and the "reject H0" region (area α) above it.]


ONE-WAY ANOVA
DCOVA
F TEST EXAMPLE
• You want to see if three different golf clubs yield different distances traveled by a ball struck on an automated driving machine.
• You randomly select five measurements from trials on an automated driving machine for each club.
• At the 0.05 significance level, is there a difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204
ONE-WAY ANOVA EXAMPLE:
SCATTER PLOT DCOVA
Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

x̄1 = 249.2, x̄2 = 226.0, x̄3 = 205.8
Grand mean x̿ = 227.0

[Scatter plot: Distance (190 to 270) against Club (1, 2, 3), with the group means X̄1, X̄2, X̄3 and the grand mean X̿ marked.]
ONE-WAY ANOVA EXAMPLE
COMPUTATIONS DCOVA

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

X̄1 = 249.2, n1 = 5
X̄2 = 226.0, n2 = 5
X̄3 = 205.8, n3 = 5
X̿ = 227.0, n = 15, c = 3

SSA = 5 (249.2 – 227)² + 5 (226 – 227)² + 5 (205.8 – 227)² = 4,716.4
SSW = (254 – 249.2)² + (263 – 249.2)² + … + (204 – 205.8)² = 1,119.6

MSA = 4,716.4 / (3 - 1) = 2,358.2
MSW = 1,119.6 / (15 - 3) = 93.3

$$F_{STAT} = \frac{2358.2}{93.3} = 25.275$$
ONE-WAY ANOVA EXAMPLE
SOLUTION DCOVA

H0: μ1 = μ2 = μ3
H1: μj not all equal
α = 0.05
df1 = 2, df2 = 12
Critical Value: Fα = F0.05 = 3.89 (reject H0 if FSTAT > 3.89)

Test Statistic:

$$F_{STAT} = \frac{MSA}{MSW} = \frac{2{,}358.2}{93.3} = 25.275$$

Decision: Reject H0 at α = 0.05.
Conclusion: There is evidence that at least one μj differs from the rest.
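The golf club computations can be reproduced from the raw distances with a short Python sketch. The dictionary layout and variable names are illustrative assumptions, and the critical value 3.89 is copied from the slide rather than computed.

```python
from statistics import mean

# Distances from the slide, one list per golf club.
groups = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)
c = len(groups)      # number of groups
n = len(all_values)  # total number of observations

# Among-group, within-group and total sums of squares.
ssa = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
ssw = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
sst = sum((x - grand_mean) ** 2 for x in all_values)

# Mean squares and the F statistic.
msa = ssa / (c - 1)
msw = ssw / (n - c)
f_stat = msa / msw

f_crit = 3.89  # F(0.05; 2, 12), from the slide
reject_h0 = f_stat > f_crit

print(f"SSA = {ssa:.1f}, SSW = {ssw:.1f}, SST = {sst:.1f}")
print(f"MSA = {msa:.1f}, MSW = {msw:.1f}, FSTAT = {f_stat:.3f}, reject H0: {reject_h0}")
```

The printout also lets you confirm the partition SST = SSA + SSW on real numbers.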
One-Way ANOVA
Excel Output DCOVA
THE TUKEY-KRAMER PROCEDURE
DCOVA

• Tells which population means are significantly different.
• e.g.: μ1 = μ2 ≠ μ3.
• Done after rejection of equal means in ANOVA.
• Allows paired comparisons:
• Compare absolute mean differences with critical range.

[Figure: distributions along the x axis with μ1 = μ2 and a separate μ3.]
YOU MUST MAKE THREE ASSUMPTIONS ABOUT
YOUR DATA TO USE THE ANOVA F TEST
DCOVA
• Randomness and Independence
• Of the samples selected.
• Normality
• Of the c groups from which the samples are selected.
• Homogeneity of Variance:
• The variances of the c groups are equal.
• Can be tested with Levene’s Test.
RANDOMNESS & INDEPENDENCE IS THE
MOST CRITICAL OF THE ASSUMPTIONS
DCOVA
• Experimental validity depends on random sampling and/or the randomization process.

• Randomly assigning items to groups avoids bias in the outcomes.

• Departures from this assumption can seriously affect inferences from the ANOVA.
THE NORMALITY ASSUMPTION DCOVA
• The one-way ANOVA F test is fairly robust against departures from the normal distribution.

• As long as the distributions do not depart greatly from normality, the level of significance of the F test is usually not greatly affected, particularly for large samples.

• Normality can be assessed by using a normal probability plot and/or a boxplot.
THE HOMOGENEITY OF VARIANCE
DCOVA
• With equal sample sizes, violations of this assumption do not seriously affect inferences.

• With unequal sample sizes, unequal variances can have a serious effect on inferences.

• To test this assumption, the Levene test for homogeneity-of-variance (discussed on page 357) can be used.
WHEN ASSUMPTIONS ARE VIOLATED . .
DCOVA
• When only the normality assumption is violated, you can use the Kruskal-Wallis rank test (see Section 12.5).

• When only the equal variance assumption is violated, you can use procedures similar to the separate-variance test (Section 10.1).

• When both the normality and equal variance assumptions are violated, data transformation is needed (see references 2 and 3).
LEVENE TEST FOR HOMOGENEITY OF VARIANCE
DCOVA
• Tests the assumption that the variances of each population are equal.

• First, define the null and alternative hypotheses:
• H0: σ1² = σ2² = … = σc².
• H1: Not all σj² are equal.

• Second, compute the absolute value of the difference between each value and the median of its group.

• Third, perform a one-way ANOVA on these absolute differences.
LEVENE HOMOGENEITY OF VARIANCE
TEST EXAMPLE
DCOVA

H0: σ1² = σ2² = σ3²
H1: Not all σj² are equal

Calculate Medians (values sorted)      Calculate Absolute Differences
Club 1   Club 2   Club 3               Club 1   Club 2   Club 3
237      216      197                  14       11       7
241      218      200                  10       9        4
251      227      204   (median)       0        0        0
254      234      206                  3        7        2
263      235      222                  12       8        18
Levene Homogeneity Of Variance
Test Example (continued)

Anova: Single Factor
DCOVA

SUMMARY
Groups    Count   Sum   Average   Variance
Club 1    5       39    7.8       36.2
Club 2    5       35    7         17.5
Club 3    5       31    6.2       50.2

Source of Variation   SS      df   MS     F       P-value   F crit
Between Groups        6.4     2    3.2    0.092   0.912     3.885
Within Groups         415.6   12   34.6
Total                 422     14

Since the p-value is greater than 0.05, there is insufficient evidence of a difference in the variances.
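The three Levene steps can be sketched in Python: take absolute deviations from each group's median, then run a one-way ANOVA on them. The helper `one_way_f` is a hypothetical name introduced here for illustration, and the critical value 3.885 is copied from the Excel output.

```python
from statistics import mean, median

# Golf club distances from the example.
groups = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}


def one_way_f(samples):
    """Return (SSA, SSW, FSTAT) for a one-way ANOVA on a list of samples."""
    all_values = [x for s in samples for x in s]
    grand_mean = mean(all_values)
    c, n = len(samples), len(all_values)
    ssa = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    ssw = sum((x - mean(s)) ** 2 for s in samples for x in s)
    return ssa, ssw, (ssa / (c - 1)) / (ssw / (n - c))


# Step 2: absolute difference between each value and its group median.
abs_diffs = {
    name: [abs(x - median(values)) for x in values]
    for name, values in groups.items()
}

# Step 3: one-way ANOVA on the absolute differences.
ssa, ssw, f_stat = one_way_f(list(abs_diffs.values()))

f_crit = 3.885  # F crit from the Excel output on the slide
reject_h0 = f_stat > f_crit

print({name: sum(d) for name, d in abs_diffs.items()})
print(f"SSA = {ssa:.1f}, SSW = {ssw:.1f}, FSTAT = {f_stat:.3f}, reject H0: {reject_h0}")
```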
WHEN THE ANOVA F TEST IS DCOVA
STATISTICALLY SIGNIFICANT
• When the null hypothesis of the F test is rejected, you conclude that at least one of the population means is different from the others.

• Now you need to determine which one(s) are different.

• In this book we use the Tukey-Kramer multiple comparison procedure for one-way ANOVA to answer this question.
TUKEY-KRAMER CRITICAL RANGE
DCOVA

$$\text{Critical Range} = Q_{\alpha}\sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)}$$

where:
Qα = Upper Tail Critical Value from Studentized Range Distribution with c and n - c degrees of freedom (see appendix E.7 table)
MSW = Mean Square Within
nj and nj’ = Sample sizes from groups j and j’
THE TUKEY-KRAMER
PROCEDURE: EXAMPLE DCOVA
Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

1. Compute absolute mean differences:
|x̄1 – x̄2| = |249.2 – 226.0| = 23.2
|x̄1 – x̄3| = |249.2 – 205.8| = 43.4
|x̄2 – x̄3| = |226.0 – 205.8| = 20.2

2. Find the Qα value from the table in appendix E.7 with c = 3 and (n – c) = (15 – 3) = 12 degrees of freedom:
Qα = 3.77
THE TUKEY-KRAMER
PROCEDURE: EXAMPLE (continued)

3. Compute Critical Range:

$$\text{Critical Range} = Q_{\alpha}\sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 3.77\sqrt{\frac{93.3}{2}\left(\frac{1}{5} + \frac{1}{5}\right)} = 16.285$$

4. Compare:
|x̄1 – x̄2| = 23.2
|x̄1 – x̄3| = 43.4
|x̄2 – x̄3| = 20.2

5. All of the absolute mean differences are greater than the critical range. Therefore there is a significant difference between each pair of means at the 5% level of significance.
Thus, with 95% confidence we conclude that the mean distance for club 1 is greater than club 2 and 3, and club 2 is greater than club 3.
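The comparison in steps 1 to 5 can be sketched in Python as follows. The group means, MSW and Qα = 3.77 are taken from the slides; the variable names and the use of `itertools.combinations` to enumerate the pairs are illustrative choices.

```python
import math
from itertools import combinations

# Sample means and sizes from the golf club example.
means = {"Club 1": 249.2, "Club 2": 226.0, "Club 3": 205.8}
sizes = {"Club 1": 5, "Club 2": 5, "Club 3": 5}

msw = 93.3      # Mean Square Within from the one-way ANOVA
q_alpha = 3.77  # Studentized range value for c = 3 and 12 d.f., from the slide

significant = {}
for a, b in combinations(means, 2):
    # Critical range for this particular pair of groups.
    critical_range = q_alpha * math.sqrt((msw / 2) * (1 / sizes[a] + 1 / sizes[b]))
    abs_diff = abs(means[a] - means[b])
    significant[(a, b)] = abs_diff > critical_range
    print(f"{a} vs {b}: |diff| = {abs_diff:.1f}, critical range = {critical_range:.3f}, "
          f"significant: {significant[(a, b)]}")
```

Because all three sample sizes are equal here, the critical range is the same for every pair; with unequal sizes it would change from pair to pair.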
FACTORIAL DESIGN: TWO-WAY ANOVA
DCOVA
• Examines the effect of:
• Two factors of interest on the dependent
variable.
• e.g., Percent carbonation and line speed on
soft drink bottling process.
• Interaction between the different levels of
these two factors:
• e.g., Does the effect of one particular
carbonation level depend on where line speed
is set?
TWO-WAY ANOVA
(continued)
DCOVA
• Assumptions:

• Populations are normally distributed.


• Populations have equal variances.
• Independent random samples are
selected.
TWO-WAY ANOVA
SOURCES OF VARIATION DCOVA

Two Factors of interest: A and B


r = number of levels of factor A.
c = number of levels of factor B.
n’ = number of replications for each cell.
n = total number of observations in all cells
n = (r)(c)(n’).
Xijk = value of the kth observation of level i of
factor A and level j of factor B.
TWO-WAY ANOVA DCOVA
SOURCES OF VARIATION (continued)

SST = SSA + SSB + SSAB + SSE

Source of variation                                    Degrees of Freedom
SSA: Factor A Variation                                r – 1
SSB: Factor B Variation                                c – 1
SSAB: Variation due to interaction between A and B     (r – 1)(c – 1)
SSE: Random variation (Error)                          rc(n’ – 1)
SST: Total Variation                                   n - 1
TWO-WAY ANOVA EQUATIONS
DCOVA

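Total Variation:

$$SST = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{\bar{X}})^2$$

Factor A Variation:

$$SSA = cn' \sum_{i=1}^{r} (\bar{X}_{i..} - \bar{\bar{X}})^2$$

Factor B Variation:

$$SSB = rn' \sum_{j=1}^{c} (\bar{X}_{.j.} - \bar{\bar{X}})^2$$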
TWO-WAY ANOVA EQUATIONS
(continued)
DCOVA

Interaction Variation:

$$SSAB = n' \sum_{i=1}^{r} \sum_{j=1}^{c} (\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{\bar{X}})^2$$

Sum of Squares Error:

$$SSE = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{X}_{ij.})^2$$
TWO-WAY ANOVA EQUATIONS
(continued)
DCOVA
where:

$$\bar{\bar{X}} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{rcn'} = \text{Grand Mean}$$

$$\bar{X}_{i..} = \frac{\sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{cn'} = \text{Mean of ith level of factor A } (i = 1, 2, \ldots, r)$$

$$\bar{X}_{.j.} = \frac{\sum_{i=1}^{r} \sum_{k=1}^{n'} X_{ijk}}{rn'} = \text{Mean of jth level of factor B } (j = 1, 2, \ldots, c)$$

$$\bar{X}_{ij.} = \sum_{k=1}^{n'} \frac{X_{ijk}}{n'} = \text{Mean of cell } ij$$

r = number of levels of factor A
c = number of levels of factor B
n’ = number of replications in each cell
MEAN SQUARE CALCULATIONS
DCOVA
$$MSA = \text{Mean square factor A} = \frac{SSA}{r - 1}$$

$$MSB = \text{Mean square factor B} = \frac{SSB}{c - 1}$$

$$MSAB = \text{Mean square interaction} = \frac{SSAB}{(r - 1)(c - 1)}$$

$$MSE = \text{Mean square error} = \frac{SSE}{rc(n' - 1)}$$
INTERPRETING RESULTS FROM A TWO
WAY ANOVA DCOVA

• First determine if the interaction is statistically significant.

• If the interaction is significant, then further analysis will focus on the interaction.

• If the interaction is not significant, then focus on the main effects.
TWO-WAY ANOVA:
THE F TEST STATISTICS DCOVA

F Test for Factor A Effect


F Test for Factor A Effect
H0: μ1.. = μ2.. = μ3.. = ⋯ = μr..
H1: Not all μi.. are equal
FSTAT = MSA / MSE
Reject H0 if FSTAT > Fα.

F Test for Factor B Effect
H0: μ.1. = μ.2. = μ.3. = ⋯ = μ.c.
H1: Not all μ.j. are equal
FSTAT = MSB / MSE
Reject H0 if FSTAT > Fα.

F Test for Interaction Effect
H0: the interaction of A and B is equal to zero
H1: interaction of A and B is not zero
FSTAT = MSAB / MSE
Reject H0 if FSTAT > Fα.
TWO-WAY ANOVA
SUMMARY TABLE DCOVA

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares                      F
Factor A              SSA              r – 1                MSA = SSA / (r – 1)               MSA / MSE
Factor B              SSB              c – 1                MSB = SSB / (c – 1)               MSB / MSE
AB (Interaction)      SSAB             (r – 1)(c – 1)       MSAB = SSAB / [(r – 1)(c – 1)]    MSAB / MSE
Error                 SSE              rc(n’ – 1)           MSE = SSE / [rc(n’ – 1)]
Total                 SST              n – 1
FEATURES OF TWO-WAY ANOVA
F TEST DCOVA

• Degrees of freedom always add up:
• n - 1 = rc(n’ - 1) + (r - 1) + (c - 1) + (r - 1)(c - 1).
• Total = error + factor A + factor B + interaction.
• The denominators of the F tests are always the same but the numerators are different.
• The sums of squares always add up:
• SST = SSE + SSA + SSB + SSAB.
• Total = error + factor A + factor B + interaction.
DO IN-STORE LOCATION AND THE
PERMISSIBILITY OF MOBILE PAYMENTS IMPACT
SALES DCOVA
Sales By Location and Permissibility Of Mobile Payments

Mobile Payments   In-Aisle   Front   Kiosk   Expert
No                30.06      32.22   30.78   30.33
No                29.96      31.47   30.91   30.29
No                30.19      32.13   30.79   30.25
No                29.96      31.86   30.95   30.25
No                29.74      32.29   31.13   30.55
Yes               30.66      32.81   31.34   31.03
Yes               29.99      32.65   31.80   31.77
Yes               30.73      32.81   32.00   30.97
Yes               30.72      32.42   31.07   31.43
Yes               30.73      33.12   31.69   30.72
EXCEL TWO-WAY ANOVA
RESULTS DCOVA

1. The interaction is not significant since its p-value is > 0.05.

2. Both Main Effects (In-Store Location and Mobile Payment Possibility) are significant.
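The sums of squares behind the Excel output can be sketched in plain Python from the sales table. This is a minimal illustration, not the Excel procedure itself: it assumes factor A is mobile payments (r = 2) and factor B is in-store location (c = 4), with n' = 5 replications per cell, and all names are illustrative. It stops at the F statistics because p-values need an F distribution function that the standard library does not provide.

```python
from statistics import mean

# Sales data from the slide: factor A = mobile payments, factor B = location.
sales = {
    "No": {
        "In-Aisle": [30.06, 29.96, 30.19, 29.96, 29.74],
        "Front": [32.22, 31.47, 32.13, 31.86, 32.29],
        "Kiosk": [30.78, 30.91, 30.79, 30.95, 31.13],
        "Expert": [30.33, 30.29, 30.25, 30.25, 30.55],
    },
    "Yes": {
        "In-Aisle": [30.66, 29.99, 30.73, 30.72, 30.73],
        "Front": [32.81, 32.65, 32.81, 32.42, 33.12],
        "Kiosk": [31.34, 31.80, 32.00, 31.07, 31.69],
        "Expert": [31.03, 31.77, 30.97, 31.43, 30.72],
    },
}

a_levels = list(sales)
b_levels = list(sales["No"])
r, c, n_rep = len(a_levels), len(b_levels), 5

all_values = [x for a in a_levels for b in b_levels for x in sales[a][b]]
grand = mean(all_values)

# Level means for each factor and the cell means.
a_mean = {a: mean([x for b in b_levels for x in sales[a][b]]) for a in a_levels}
b_mean = {b: mean([x for a in a_levels for x in sales[a][b]]) for b in b_levels}
cell = {(a, b): mean(sales[a][b]) for a in a_levels for b in b_levels}

# Sums of squares, following the two-way ANOVA equations.
ssa = c * n_rep * sum((a_mean[a] - grand) ** 2 for a in a_levels)
ssb = r * n_rep * sum((b_mean[b] - grand) ** 2 for b in b_levels)
ssab = n_rep * sum(
    (cell[a, b] - a_mean[a] - b_mean[b] + grand) ** 2
    for a in a_levels for b in b_levels
)
sse = sum((x - cell[a, b]) ** 2 for a in a_levels for b in b_levels for x in sales[a][b])
sst = sum((x - grand) ** 2 for x in all_values)

# Mean squares and F statistics.
df_error = r * c * (n_rep - 1)
mse = sse / df_error
f_a = (ssa / (r - 1)) / mse
f_b = (ssb / (c - 1)) / mse
f_ab = (ssab / ((r - 1) * (c - 1))) / mse

print(f"MSE = {mse:.4f} with {df_error} d.f.")
print(f"F(A) = {f_a:.2f}, F(B) = {f_b:.2f}, F(AB) = {f_ab:.2f}")
```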
THE TUKEY-KRAMER PROCEDURE
DCOVA
• The Tukey-Kramer Procedure can also be used for a two
way ANOVA when there is no significant interaction.
• Done after one or both of the factor effects is significant.
• The critical range for each factor for multiple
comparisons procedure for a two way ANOVA is:

$$\text{Critical Range For Factor A} = Q_{\alpha}\sqrt{\frac{MSE}{cn'}}$$

where:
Qα = Upper Tail Critical Value from Studentized Range Distribution with r and rc(n’ - 1) degrees of freedom (see appendix E.7 table).

$$\text{Critical Range For Factor B} = Q_{\alpha}\sqrt{\frac{MSE}{rn'}}$$

where:
Qα = Upper Tail Critical Value from Studentized Range Distribution with c and rc(n’ - 1) degrees of freedom (see appendix E.7 table).
THE TUKEY PROCEDURE: EXAMPLE
DCOVA
1. Compute absolute mean differences:
|x̄.1. – x̄.2.| = |30.274 - 32.378| = 2.104
|x̄.1. – x̄.3.| = |30.274 - 31.246| = 0.972
|x̄.1. – x̄.4.| = |30.274 - 30.759| = 0.485
|x̄.2. – x̄.3.| = |32.378 - 31.246| = 1.132
|x̄.2. – x̄.4.| = |32.378 - 30.759| = 1.619
|x̄.3. – x̄.4.| = |31.246 - 30.759| = 0.487

2. Compute the critical range:
MSE = 0.0821
r = 2, c = 4, so rc(n’ – 1) = 32
Qα = 3.84

$$\text{Critical Range} = 3.84\sqrt{\frac{0.0821}{10}} = 0.3482$$

3. All pairwise comparisons exceed the critical range and are significant. This indicates mean sales are different for the four in-store locations.
VISUALIZING INTERACTIONS: THE CELL DCOVA
MEANS PLOT

An Insignificant Interaction Will Yield Cell Means Plots With Approximately Parallel Line Segments.
INTERACTIONS WILL YIELD PLOTS WITH
SOME NON-PARALLEL LINE SEGMENTS
DCOVA
• No interaction: line segments are parallel.
• Interaction is present: some line segments not parallel.

[Figure: two cell means plots of Mean Response against Factor A Levels, each with one line per level of Factor B (Levels 1, 2 and 3). Left panel: parallel lines. Right panel: non-parallel lines.]

INTERPRETING AN INTERACTION
EFFECT
DCOVA
• A statistically significant interaction indicates that the effect one factor has on the dependent variable depends on what the level of another factor is.

• In a cell means plot this will show up as some non-parallel line segments.

• The following example has a significant interaction.


DO ACT PREP COURSE TYPE AND COURSE LENGTH IMPACT
MEAN ACT SCORES? DCOVA
ACT Scores for Different Types and Lengths of Courses
                   LENGTH OF COURSE
TYPE OF COURSE     Condensed     Regular
Traditional        26   18       34   28
Traditional        27   24       24   21
Traditional        25   19       35   23
Traditional        21   20       31   29
Traditional        21   18       28   26
Online             27   21       24   21
Online             29   32       16   19
Online             30   20       22   19
Online             24   28       20   24
Online             30   29       23   25
PLOTTING CELL MEANS SHOWS A STRONG
INTERACTION DCOVA

Nonparallel lines indicate the effect of condensing the course depends on whether the course is taught in the traditional classroom or by online distance learning.

The online course yields higher scores when condensed while the traditional course yields higher scores when not condensed (regular).
EXCEL ANALYSIS OF ACT PREP COURSE
DATA DCOVA

The interaction between course length & type is significant because its p-value is 0.0000.

Although the p-values associated with course length and course type are individually not significant, the interaction is significant, so you cannot directly conclude that these factors have no effect.
WITH THE SIGNIFICANT INTERACTION
COLLAPSE THE DATA INTO FOUR GROUPS
DCOVA
• After collapsing into four groups, do a one-way ANOVA.

• The four groups are:
1. Traditional course condensed.
2. Traditional course regular length.
3. Online course condensed.
4. Online course regular length.
EXCEL ANALYSIS OF COLLAPSED DATA
DCOVA

Group is a significant effect: p-value of 0.0003 < 0.05.

1. Traditional regular > Traditional condensed
2. Online condensed > Traditional condensed
3. Traditional regular > Online regular
4. Online condensed > Online regular

If the course is taken online, use the condensed version; if the course is taken by the traditional method, use the regular version.
CHAPTER SUMMARY
In this chapter we discussed:
• The basic concepts of experimental design.
• How to use one-way analysis of variance to test for differences among
the means of several groups.
• How to use two-way analysis of variance and interpret the interaction
effect.
• How to perform multiple comparisons in a one-way analysis of variance
and a two-way analysis of variance.
REFERENCE
• Levine, David M., Stephan, David F., & Szabat, Kathryn A. (2017). Statistics for managers using Microsoft Excel (8th ed.). Pearson Education. ISBN: 9780273787112.
