Two Sample Test & ANOVA Session 5
Two-Sample Tests
• Population means, independent samples. Example: Group 1 vs. Group 2.
• Population means, related samples. Example: same group before vs. after treatment.
• Population proportions. Example: Proportion 1 vs. Proportion 2.
• Population variances. Example: Variance 1 vs. Variance 2.
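DIFFERENCE BETWEEN TWO MEANS DCOVA
• Point estimate of μ1 – μ2: X̄1 – X̄2
• Two cases for independent samples:
  • σ1 and σ2 unknown, assumed equal.
  • σ1 and σ2 unknown, not assumed equal.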
DIFFERENCE BETWEEN TWO MEANS: INDEPENDENT SAMPLES
• Different data sources:
  • Unrelated.
  • Independent.
  • Sample selected from one population has no effect on the sample selected from the other population.
σ1 and σ2 unknown, assumed equal:
• Populations are normally distributed or both sample sizes are at least 30.
• The pooled variance is:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$
• The test statistic is:
$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
• Where tSTAT has d.f. = (n1 + n2 – 2).
CONFIDENCE INTERVAL FOR μ1 - μ2 WITH σ1 AND σ2 UNKNOWN AND ASSUMED EQUAL
The confidence interval for μ1 – μ2 is:
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where tα/2 has d.f. = (n1 + n2 – 2).
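The test statistic is:
$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\frac{1}{21} + \frac{1}{25}\right)}} = 2.040$$
where the pooled variance is:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$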
POOLED-VARIANCE T TEST EXAMPLE: HYPOTHESIS TEST SOLUTION
H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)
α = 0.05
df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154 (area .025 in each tail; reject H0 if tSTAT < -2.0154 or tSTAT > 2.0154)
Test Statistic:
$$t_{STAT} = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\frac{1}{21} + \frac{1}{25}\right)}} = 2.040$$
Decision: Reject H0 at α = 0.05.
Conclusion: There is evidence of a difference in means.
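The pooled-variance computation above can be reproduced from the summary statistics alone. The following is a minimal sketch using only the Python standard library; the function name `pooled_t_test` is hypothetical, and the critical value 2.0154 is copied from the slide rather than computed.

```python
import math

def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Return (pooled variance, t statistic, degrees of freedom) from summary statistics."""
    df = (n1 - 1) + (n2 - 1)
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = ((mean1 - mean2) - hypothesized_diff) / std_error
    return pooled_var, t_stat, df

# Summary statistics used in the slide example (sample 1 vs. sample 2).
pooled_var, t_stat, df = pooled_t_test(3.27, 1.30, 21, 2.53, 1.16, 25)

# Two-tail critical value for alpha = 0.05 and df = 44, copied from the slide.
T_CRITICAL = 2.0154
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"Sp^2 = {pooled_var:.4f}, t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Working from summary statistics keeps the sketch close to the hand calculation; with raw data, a statistics library would normally be used instead.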
POOLED VARIANCE T TEST IN EXCEL
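POOLED-VARIANCE T TEST EXAMPLE: CONFIDENCE INTERVAL FOR μ1 - μ2
Since we rejected H0 can we be 95% confident that µNYSE > µNASDAQ?
The 95% confidence interval for µNYSE - µNASDAQ is:
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = 0.74 \pm 2.0154 \times 0.3628 = (0.009, 1.471)$$
Because the entire interval lies above 0, we can be 95% confident that µNYSE > µNASDAQ.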
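As a quick check of the interval arithmetic, here is a small sketch that rebuilds the 95% interval from the values quoted on the slides. The variable names are illustrative only, and the critical value is taken from the slide rather than computed.

```python
import math

# Values quoted on the slides.
mean_diff = 3.27 - 2.53      # difference of the two sample means
pooled_var = 1.5021          # pooled variance Sp^2
n1, n2 = 21, 25
t_crit = 2.0154              # t(alpha/2) for 95% confidence with df = 44

margin = t_crit * math.sqrt(pooled_var * (1 / n1 + 1 / n2))
ci_low, ci_high = mean_diff - margin, mean_diff + margin

print(f"95% CI for mu1 - mu2: ({ci_low:.3f}, {ci_high:.3f})")
```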
HYPOTHESIS TESTS FOR μ1 - μ2 WITH σ1 AND σ2 UNKNOWN AND NOT ASSUMED EQUAL (continued)
• The population variances are unknown and cannot be assumed to be equal.
• This test is known as the separate-variance t test.
• This test done in Excel is shown on the next slide.
SEPARATE-VARIANCE T TEST IN EXCEL (continued)
                  NYSE    NASDAQ
Number            21      25
Sample mean       3.27    2.53
Sample std dev    1.30    1.16
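The slides show this test only as Excel output, so the formula itself does not appear in the text. The sketch below assumes the usual separate-variance form, with the Welch-Satterthwaite approximation for the degrees of freedom, and applies it to the summary table above. The function name is hypothetical.

```python
import math

def separate_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, approximate degrees of freedom) without pooling the variances."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (assumed form; not shown in the slides).
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, df

t_stat, df = separate_variance_t(3.27, 1.30, 21, 2.53, 1.16, 25)
print(f"t = {t_stat:.3f}, approximate df = {df:.1f}")
```

Because the two sample standard deviations are similar here, the statistic comes out close to the pooled-variance value.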
Di = X1i - X2i
• Eliminates Variation Among Subjects.
• Assumptions:
• Differences are normally distributed.
• Or, if not Normal, use large samples.
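RELATED POPULATIONS (continued)
THE PAIRED DIFFERENCE TEST
• The test statistic for μD is:
$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$
where
$$S_D = \sqrt{\frac{\sum_{i=1}^{n}(D_i - \bar{D})^2}{n - 1}}$$
and tSTAT has n – 1 d.f.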
Paired Difference Test: Example
• Assume you send your salespeople to a “customer
service” training workshop. Has the training made a
difference in the number of complaints? You collect the
following data:
Salesperson   Before   After   Difference, Di
...
M.O.          4        0       -4
ΣDi = -21
D̄ = ΣDi / n = -21 / 5 = -4.2
SD = 5.67
Paired Difference Test: Solution
Conclusion: There is
insufficient evidence of a
change in the number of
complaints.
THE PAIRED DIFFERENCE CONFIDENCE INTERVAL -- EXAMPLE
The confidence interval for μD is:
$$\bar{D} \pm t_{\alpha/2}\frac{S_D}{\sqrt{n}}$$
With D̄ = -4.2, SD = 5.67 and n = 5, the 99% CI for μD is:
$$-4.2 \pm 4.604\frac{5.67}{\sqrt{5}} = (-15.87, 7.47)$$
We can be 99% confident that this interval contains the true value of μD.
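The paired-difference interval can be checked from the summary values alone. This is a minimal sketch assuming H0: μD = 0 for the test statistic; the critical value 4.604 is the one quoted on the slide, and the variable names are illustrative.

```python
import math

# Summary values from the slides: mean difference, std dev of differences, number of pairs.
d_bar, s_d, n = -4.2, 5.67, 5
t_crit = 4.604               # t(alpha/2) for 99% confidence with df = n - 1 = 4

std_error = s_d / math.sqrt(n)
t_stat = (d_bar - 0) / std_error          # test statistic under H0: mu_D = 0
ci_low = d_bar - t_crit * std_error
ci_high = d_bar + t_crit * std_error

print(f"t = {t_stat:.2f}")
print(f"99% CI for mu_D: ({ci_low:.2f}, {ci_high:.2f})")
```

The interval contains 0, which agrees with the conclusion that there is insufficient evidence of a change in the number of complaints.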
TWO POPULATION PROPORTIONS
Goal: test a hypothesis or form a confidence interval for the difference between two population proportions, π1 – π2.
Assumptions:
n1 π1 ≥ 5 , n1(1 - π1) ≥ 5
n2 π2 ≥ 5 , n2(1 - π2) ≥ 5
The pooled estimate of the overall proportion is:
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$$
where X1 and X2 are the number of items of interest in samples 1 and 2.
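TWO POPULATION PROPORTIONS (continued)
$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
$$\text{where } \bar{p} = \frac{X_1 + X_2}{n_1 + n_2}, \quad p_1 = \frac{X_1}{n_1}, \quad p_2 = \frac{X_2}{n_2}$$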
HYPOTHESIS TESTS FOR TWO POPULATION PROPORTIONS
The test statistic for π1 – π2 = 0 is:
$$Z_{STAT} = \frac{(.50 - .70) - 0}{\sqrt{.582(1 - .582)\left(\frac{1}{72} + \frac{1}{50}\right)}} = -2.20$$
Critical Values = ±1.96 for α = .05
Decision: Reject H0.
Conclusion: There is evidence of a significant difference in the proportion of men and women who will vote yes.
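The Z test for two proportions can be sketched as follows. The counts 36 and 35 are not stated in the text; they are inferred from the sample proportions and sizes quoted above (0.50 × 72 and 0.70 × 50). The function name is hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Return (pooled proportion, Z statistic) for H0: pi1 - pi2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return p_bar, (p1 - p2) / std_error

# Counts inferred from the slide values: 0.50 * 72 = 36 and 0.70 * 50 = 35.
p_bar, z_stat = two_proportion_z(36, 72, 35, 50)

Z_CRITICAL = 1.96            # two-tail critical value for alpha = .05, from the slide
reject_h0 = abs(z_stat) > Z_CRITICAL

print(f"pooled p = {p_bar:.3f}, Z = {z_stat:.2f}, reject H0: {reject_h0}")
```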
COMPARING TWO POPULATION PROPORTIONS IN EXCEL
Decision: Reject H0.
Conclusion: There is
evidence of a significant
difference in the proportion
of men and women who
will vote yes.
CONFIDENCE INTERVAL FOR TWO POPULATION PROPORTIONS
$$(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
CONFIDENCE INTERVAL FOR TWO POPULATION PROPORTIONS -- EXAMPLE
The 95% confidence interval for π1 – π2 is:
$$(0.50 - 0.70) \pm 1.96\sqrt{\frac{0.50(0.50)}{72} + \frac{0.70(0.30)}{50}} = (-0.37, -0.03)$$
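A short sketch of the same interval calculation, using the unpooled standard error from the confidence interval formula. The function name is hypothetical, and Z = 1.96 is the value used on the slide.

```python
import math

def two_proportion_ci(p1, n1, p2, n2, z_crit):
    """Confidence interval for pi1 - pi2 using the unpooled standard error."""
    std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_crit * std_error, diff + z_crit * std_error

ci_low, ci_high = two_proportion_ci(0.50, 72, 0.70, 50, z_crit=1.96)
print(f"95% CI for pi1 - pi2: ({ci_low:.2f}, {ci_high:.2f})")
```

Unlike the hypothesis test, the interval does not use the pooled proportion, because it does not assume π1 = π2.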
• When $F_{STAT} = S_1^2 / S_2^2$: df1 = n1 – 1 ; df2 = n2 – 1.
• In the F table:
  • numerator degrees of freedom determine the column.
  • denominator degrees of freedom determine the row.
FINDING THE REJECTION REGION
• Two-tail test: H0: σ1² = σ2², H1: σ1² ≠ σ2². Reject H0 if FSTAT > Fα/2; otherwise do not reject H0.
• Upper-tail test: H0: σ1² ≤ σ2², H1: σ1² > σ2². Reject H0 if FSTAT > Fα; otherwise do not reject H0.
• Example: FSTAT = 1.256 is less than F0.025 = 2.33, so it is not in the rejection region and we do not reject H0.
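The variance-ratio statistic is simple to reproduce. The sketch below assumes the example uses the NYSE and NASDAQ sample standard deviations (1.30 with n = 21, and 1.16 with n = 25), which gives the quoted value of 1.256. The critical value is copied from the slide.

```python
# Assumed inputs: the NYSE / NASDAQ sample standard deviations and sizes.
s1, n1 = 1.30, 21
s2, n2 = 1.16, 25

f_stat = s1**2 / s2**2       # larger sample variance in the numerator
df1, df2 = n1 - 1, n2 - 1    # numerator and denominator degrees of freedom

F_CRITICAL = 2.33            # upper-tail F(0.025) value quoted on the slide
reject_h0 = f_stat > F_CRITICAL

print(f"F = {f_stat:.3f}, df = ({df1}, {df2}), reject H0: {reject_h0}")
```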
• Assumptions:
• Populations are normally distributed.
• Populations have equal variances.
• Samples are randomly and independently selected.
HYPOTHESES OF ONE-WAY ANOVA
• H0: μ1 = μ2 = μ3 = … = μc
  • All population means are equal.
  • i.e., no factor effect (no variation in means among groups).
[Figure: μ1 = μ2 = μ3]
ONE-WAY ANOVA (continued)
H0: μ1 = μ2 = μ3 = … = μc
H1: Not all μj are equal
When the null hypothesis is NOT true:
• At least one of the means is different (factor effect is present).
[Figure: μ1 = μ2 ≠ μ3 or μ1 ≠ μ2 ≠ μ3]
PARTITIONING THE VARIATION
• Total variation can be split into two parts:
SST = SSA + SSW
$$SST = \sum_{j=1}^{c}\sum_{i=1}^{n_j}(X_{ij} - \bar{\bar{X}})^2$$
Where:
SST = Total sum of squares
c = number of groups or levels
nj = number of values in group j
Xij = ith observation from group j
$\bar{\bar{X}}$ = grand mean (mean of all data values)
TOTAL VARIATION (continued)
$$SST = (X_{11} - \bar{\bar{X}})^2 + (X_{12} - \bar{\bar{X}})^2 + \cdots + (X_{n_c c} - \bar{\bar{X}})^2$$
[Figure: response X plotted by group]
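AMONG-GROUP VARIATION (continued)
$$SSA = n_1(\bar{X}_1 - \bar{\bar{X}})^2 + n_2(\bar{X}_2 - \bar{\bar{X}})^2 + \cdots + n_c(\bar{X}_c - \bar{\bar{X}})^2$$
[Figure: response X by group, showing the group means X̄1, X̄2, X̄3 and the grand mean]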
$$SSW = \sum_{j=1}^{c}\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)^2$$
Where:
SSW = Sum of squares within groups
c = number of groups
nj = sample size from group j
X̄j = sample mean from group j
Xij = ith observation in group j
WITHIN-GROUP VARIATION (continued)
$$SSW = \sum_{j=1}^{c}\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)^2$$
Summing the variation within each group and then adding over all groups.
$$MSW = \frac{SSW}{n - c}$$
Mean Square Within = SSW/degrees of freedom.
WITHIN-GROUP VARIATION (continued)
[Figure: response X by group, showing the group means X̄1, X̄2, X̄3]
c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
ONE-WAY ANOVA F TEST STATISTIC
H0: μ1= μ2 = … = μc
H1: At least two population means are different
• Test statistic
$$F_{STAT} = \frac{MSA}{MSW}$$
MSA is mean squares among groups.
MSW is mean squares within groups.
• Degrees of freedom
• df1 = c – 1 (c = number of groups)
• df2 = n – c (n = sum of sample sizes from all populations)
INTERPRETING ONE-WAY ANOVA F STATISTIC
Decision Rule:
• Reject H0 if FSTAT > Fα, otherwise do not reject H0.
ONE-WAY ANOVA F TEST EXAMPLE
• You want to see if three different golf clubs yield different distances traveled by a ball struck on an automated driving machine.
• You randomly select five measurements from trials on an automated driving machine for each club.
• At the 0.05 significance level, is there a difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204
ONE-WAY ANOVA EXAMPLE: SCATTER PLOT
[Scatter plot: Distance (190 to 270) against Club (1, 2, 3), marking the group means X̄1, X̄2, X̄3 and the grand mean]
x̄1 = 249.2, x̄2 = 226.0, x̄3 = 205.8
Grand mean x̄ = 227.0
ONE-WAY ANOVA EXAMPLE COMPUTATIONS
H0: μ1 = μ2 = μ3
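The computations for the golf club example can be sketched directly from the SSA, SSW, MSA, MSW and FSTAT definitions. The function name `one_way_anova` is hypothetical, and the critical value 3.89 is the standard F table value for α = 0.05 with 2 and 12 degrees of freedom, which does not appear in the text above.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of samples (one list per group)."""
    all_values = [x for g in groups for x in g]
    n, c = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Among-group and within-group sums of squares.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    msa, msw = ssa / (c - 1), ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "MSA": msa, "MSW": msw,
            "F": msa / msw, "df1": c - 1, "df2": n - c}

clubs = [
    [254, 263, 241, 237, 251],   # Club 1
    [234, 218, 235, 227, 216],   # Club 2
    [200, 222, 197, 206, 204],   # Club 3
]
result = one_way_anova(clubs)

F_CRITICAL = 3.89                # F table value for alpha = 0.05, df = (2, 12); not from the text
reject_h0 = result["F"] > F_CRITICAL

for name in ("SSA", "SSW", "MSA", "MSW", "F"):
    print(f"{name} = {result[name]:.3f}")
print(f"df = ({result['df1']}, {result['df2']}), reject H0: {reject_h0}")
```

The resulting MSW matches the 93.3 used later in the Tukey-Kramer example.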
YOU MUST MAKE THREE ASSUMPTIONS ABOUT YOUR DATA TO USE THE ANOVA F TEST
• Randomness and Independence
• Of the samples selected.
• Normality
• Of the c groups from which the samples are selected.
• Homogeneity of Variance:
• The variances of the c groups are equal.
• Can be tested with Levene’s Test.
RANDOMNESS & INDEPENDENCE IS THE MOST CRITICAL OF THE ASSUMPTIONS
• Experimental validity depends on random sampling and/or the
randomization process.
$$\text{Critical Range} = Q_\alpha\sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)}$$
where:
Qα = Upper Tail Critical Value from Studentized
Range Distribution with c and n - c degrees of
freedom (see appendix E.7 table)
MSW = Mean Square Within
nj and nj’ = Sample sizes from groups j and j’
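THE TUKEY-KRAMER PROCEDURE: EXAMPLE
Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

1. Compute absolute mean differences:
   |x̄1 – x̄2| = |249.2 – 226.0| = 23.2
   |x̄1 – x̄3| = |249.2 – 205.8| = 43.4
   |x̄2 – x̄3| = |226.0 – 205.8| = 20.2
2. Find the Qα value for c = 3 and n – c = 15 – 3 = 12 degrees of freedom: Qα = 3.77.
3. Compute the critical range:
$$\text{Critical Range} = Q_\alpha\sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 3.77\sqrt{\frac{93.3}{2}\left(\frac{1}{5} + \frac{1}{5}\right)} = 16.285$$
4. Compare each absolute mean difference (23.2, 43.4, 20.2) with the critical range.
5. All of the absolute mean differences are greater than the critical range. Therefore there is a significant difference between each pair of means at the 5% level of significance. Thus, with 95% confidence we conclude that the mean distance for club 1 is greater than for clubs 2 and 3, and the mean distance for club 2 is greater than for club 3.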
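The pairwise comparison can be sketched as a loop over all pairs of group means. Qα = 3.77 and MSW = 93.3 are the values quoted in the example; the function and variable names are illustrative.

```python
import math
from itertools import combinations

def tukey_kramer_range(q_alpha, msw, n_j, n_k):
    """Critical range for comparing two group means with sample sizes n_j and n_k."""
    return q_alpha * math.sqrt((msw / 2) * (1 / n_j + 1 / n_k))

means = {"Club 1": 249.2, "Club 2": 226.0, "Club 3": 205.8}
sizes = {"Club 1": 5, "Club 2": 5, "Club 3": 5}
Q_ALPHA, MSW = 3.77, 93.3        # values quoted in the example

results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    crit = tukey_kramer_range(Q_ALPHA, MSW, sizes[a], sizes[b])
    results[(a, b)] = (diff, crit, diff > crit)
    print(f"{a} vs {b}: |diff| = {diff:.1f}, critical range = {crit:.3f}, "
          f"significant: {diff > crit}")
```

Passing the two sample sizes separately keeps the sketch usable when group sizes differ, which is the case the Tukey-Kramer form is designed for.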
FACTORIAL DESIGN: TWO-WAY ANOVA
• Examines the effect of:
• Two factors of interest on the dependent
variable.
• e.g., Percent carbonation and line speed on
soft drink bottling process.
• Interaction between the different levels of
these two factors:
• e.g., Does the effect of one particular
carbonation level depend on where line speed
is set?
TWO-WAY ANOVA (continued)
• Assumptions:
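Total Variation:
$$SST = \sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n'}(X_{ijk} - \bar{\bar{X}})^2$$
Factor A Variation:
$$SSA = cn'\sum_{i=1}^{r}(\bar{X}_{i..} - \bar{\bar{X}})^2$$
Factor B Variation:
$$SSB = rn'\sum_{j=1}^{c}(\bar{X}_{.j.} - \bar{\bar{X}})^2$$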
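TWO-WAY ANOVA EQUATIONS (continued)
Interaction Variation:
$$SSAB = n'\sum_{i=1}^{r}\sum_{j=1}^{c}(\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{\bar{X}})^2$$
Error Variation:
$$SSE = \sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n'}(X_{ijk} - \bar{X}_{ij.})^2$$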
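TWO-WAY ANOVA EQUATIONS (continued)
where:
$$\bar{\bar{X}} = \frac{\sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n'} X_{ijk}}{rcn'} = \text{Grand Mean}$$
$$\bar{X}_{i..} = \frac{\sum_{j=1}^{c}\sum_{k=1}^{n'} X_{ijk}}{cn'} = \text{Mean of ith level of factor A } (i = 1, 2, ..., r)$$
$$\bar{X}_{.j.} = \frac{\sum_{i=1}^{r}\sum_{k=1}^{n'} X_{ijk}}{rn'} = \text{Mean of jth level of factor B } (j = 1, 2, ..., c)$$
$$\bar{X}_{ij.} = \frac{\sum_{k=1}^{n'} X_{ijk}}{n'} = \text{Mean of cell } ij$$
r = number of levels of factor A
c = number of levels of factor B
n’ = number of replications in each cell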
MEAN SQUARE CALCULATIONS
$$MSA = \text{Mean square factor A} = \frac{SSA}{r - 1}$$
$$MSB = \text{Mean square factor B} = \frac{SSB}{c - 1}$$
$$MSAB = \text{Mean square interaction} = \frac{SSAB}{(r - 1)(c - 1)}$$
$$MSE = \text{Mean square error} = \frac{SSE}{rc(n' - 1)}$$
INTERPRETING RESULTS FROM A TWO WAY ANOVA
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares                      F
Factor A              SSA              r – 1                MSA = SSA/(r – 1)                 MSA/MSE
Factor B              SSB              c – 1                MSB = SSB/(c – 1)                 MSB/MSE
AB (Interaction)      SSAB             (r – 1)(c – 1)       MSAB = SSAB/[(r – 1)(c – 1)]      MSAB/MSE
Error                 SSE              rc(n’ – 1)           MSE = SSE/[rc(n’ – 1)]
Total                 SST              n – 1
FEATURES OF TWO-WAY ANOVA F TEST
Mobile Payments   In-Aisle   Front   Kiosk   Expert
No                30.06      32.22   30.78   30.33
No                29.96      31.47   30.91   30.29
No                30.19      32.13   30.79   30.25
No                29.96      31.86   30.95   30.25
No                29.74      32.29   31.13   30.55
Yes               30.66      32.81   31.34   31.03
Yes               29.99      32.65   31.80   31.77
Yes               30.73      32.81   32.00   30.97
Yes               30.72      32.42   31.07   31.43
Yes               30.73      33.12   31.69   30.72
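The two-way ANOVA sums of squares and F ratios can be computed for a table like this one with a short sketch. It assumes a balanced design with n' = 5 replications per cell, treats Mobile Payments (No/Yes) as factor A and the four column labels as factor B, and uses a hypothetical function name. The response variable is not named in the text.

```python
def two_way_anova(cells):
    """cells maps (level_a, level_b) -> list of n' replicate observations (balanced design)."""
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    r, c = len(levels_a), len(levels_b)
    n_rep = len(next(iter(cells.values())))      # n', assumed equal in every cell

    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    grand = mean(x for obs in cells.values() for x in obs)
    mean_a = {a: mean(x for b in levels_b for x in cells[(a, b)]) for a in levels_a}
    mean_b = {b: mean(x for a in levels_a for x in cells[(a, b)]) for b in levels_b}
    mean_ab = {key: mean(obs) for key, obs in cells.items()}

    sst = sum((x - grand) ** 2 for obs in cells.values() for x in obs)
    ssa = c * n_rep * sum((mean_a[a] - grand) ** 2 for a in levels_a)
    ssb = r * n_rep * sum((mean_b[b] - grand) ** 2 for b in levels_b)
    ssab = n_rep * sum((mean_ab[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
                       for a in levels_a for b in levels_b)
    sse = sum((x - mean_ab[key]) ** 2 for key, obs in cells.items() for x in obs)

    df_error = r * c * (n_rep - 1)
    mse = sse / df_error
    return {"SST": sst, "SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse,
            "F_A": (ssa / (r - 1)) / mse,
            "F_B": (ssb / (c - 1)) / mse,
            "F_AB": (ssab / ((r - 1) * (c - 1))) / mse,
            "df_error": df_error}

locations = ["In-Aisle", "Front", "Kiosk", "Expert"]
rows = {
    "No": [[30.06, 32.22, 30.78, 30.33], [29.96, 31.47, 30.91, 30.29],
           [30.19, 32.13, 30.79, 30.25], [29.96, 31.86, 30.95, 30.25],
           [29.74, 32.29, 31.13, 30.55]],
    "Yes": [[30.66, 32.81, 31.34, 31.03], [29.99, 32.65, 31.80, 31.77],
            [30.73, 32.81, 32.00, 30.97], [30.72, 32.42, 31.07, 31.43],
            [30.73, 33.12, 31.69, 30.72]],
}
# Regroup the row-wise table into one list of replicates per (payment, location) cell.
cells = {}
for pay, table_rows in rows.items():
    for j, loc in enumerate(locations):
        cells[(pay, loc)] = [row[j] for row in table_rows]

table = two_way_anova(cells)
for name, value in table.items():
    print(f"{name} = {value:.4f}")
```

Each F ratio would then be compared with the F critical value for its own numerator degrees of freedom and rc(n' – 1) = 32 denominator degrees of freedom.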
EXCEL TWO-WAY ANOVA RESULTS
$$\text{Critical Range for Factor A} = Q_\alpha\sqrt{\frac{MSE}{cn'}}$$
where:
Qα = Upper Tail Critical Value from Studentized Range Distribution with r and rc(n’ – 1) degrees of freedom (see appendix E.7 table).

$$\text{Critical Range for Factor B} = Q_\alpha\sqrt{\frac{MSE}{rn'}}$$
where:
Qα = Upper Tail Critical Value from Studentized Range Distribution with c and rc(n’ – 1) degrees of freedom (see appendix E.7 table).
THE TUKEY PROCEDURE: EXAMPLE
1. Compute absolute mean differences:
2. Compute the critical range:
[Figure: two panels plotting mean response, with separate lines for Factor B Levels 1, 2 and 3]
Traditional 26 18 34 28
Traditional 27 24 24 21
Traditional 25 19 35 23
Traditional 21 20 31 29
Traditional 21 18 28 26
Online 27 21 24 21
Online 29 32 16 19
Online 30 20 22 19
Online 24 28 20 24
Online 30 29 23 25
PLOTTING CELL MEANS SHOWS A STRONG INTERACTION
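The cell means behind such a plot can be computed from the Traditional/Online table above. This is a sketch only: the four columns are unnamed in the text, so they are labelled generically here, and the function name is hypothetical.

```python
def cell_means(data, n_columns):
    """Average the replicate rows of each method, column by column."""
    return {
        method: [sum(row[j] for row in rows) / len(rows) for j in range(n_columns)]
        for method, rows in data.items()
    }

# Column labels are placeholders; the original header row is not in the text.
columns = ["Column 1", "Column 2", "Column 3", "Column 4"]
data = {
    "Traditional": [[26, 18, 34, 28], [27, 24, 24, 21], [25, 19, 35, 23],
                    [21, 20, 31, 29], [21, 18, 28, 26]],
    "Online": [[27, 21, 24, 21], [29, 32, 16, 19], [30, 20, 22, 19],
               [24, 28, 20, 24], [30, 29, 23, 25]],
}

means = cell_means(data, len(columns))
for j, name in enumerate(columns):
    gap = means["Traditional"][j] - means["Online"][j]
    print(f"{name}: Traditional = {means['Traditional'][j]:.1f}, "
          f"Online = {means['Online'][j]:.1f}, difference = {gap:+.1f}")
```

The difference between the two methods changes sign across the columns, so the lines in a cell-means plot cross, which is the pattern an interaction produces.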