Chap11 Two Sample Hypothesis Testing BBA 2K3


Statistics II

Chapter 11
Two-Sample Tests of Hypothesis,
or Testing for Two Population Parameters
Chapter Goals

After completing this chapter, you should be able to:
• Test hypotheses or form interval estimates for
  • two independent population means
    • Standard deviations known
    • Standard deviations unknown
  • two means from paired samples
  • the difference between two population proportions

Typical questions answered by two-sample tests:
• Is there a difference in the mean value of residential real estate sold by male agents and female agents in south Florida?
• Is there a difference in the mean number of defects produced on the day and the afternoon shifts at Kimble Products?
• Is there a difference in the mean number of days absent between young workers (under 21 years of age) and older workers (more than 60 years of age) in the fast-food industry?
Estimation for Two Populations

Estimating two population values:
• Population means, independent samples. Example: Group 1 vs. independent Group 2
• Paired samples. Example: same group before vs. after treatment
• Population proportions. Example: Proportion 1 vs. Proportion 2
Difference Between Two Means

Population means, independent samples

Goal: Form a confidence interval for the difference between two population means, μ1 – μ2

The point estimate for the difference is x̄1 – x̄2

Three cases are considered:
• σ1 and σ2 known
• σ1 and σ2 unknown but assumed equal
• σ1 and σ2 unknown, not assumed equal
Independent Samples

Population means, independent samples
• Different data sources
  • Unrelated
  • Independent
    • Sample selected from one population has no effect on the sample selected from the other population
• Use the difference between 2 sample means
σ1 and σ2 known

Assumptions:
• Samples are randomly and independently drawn
• Population distributions are normal or both sample sizes are ≥ 30
• Population standard deviations are known
σ1 and σ2 known
(continued)

When σ1 and σ2 are known and both populations are normal or both sample sizes are at least 30, the test statistic is a z value and the standard error of x̄1 – x̄2 is

$$ \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} $$
σ1 and σ2 known
(continued)

The confidence interval for μ1 – μ2 is:

$$ (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} $$
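The known-σ interval can be sketched in a few lines of Python. This is an illustration only: the function name `z_interval_diff_means` is made up for this sketch, and the inputs are the summary figures from the Gibbs baby food example that appears later in this chapter.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_diff_means(x1, x2, sigma1, sigma2, n1, n2, conf=0.95):
    """Confidence interval for mu1 - mu2 when sigma1 and sigma2 are known."""
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)    # standard error of x1 - x2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    diff = x1 - x2                                # point estimate
    return diff - z * se, diff + z * se, se

# Gibbs example figures: means 7.6 and 8.1, sigmas 2.3 and 2.9, n = 40 and 55
lo, hi, se = z_interval_diff_means(7.6, 8.1, 2.3, 2.9, 40, 55)
print(f"SE = {se:.4f}, 95% CI for mu1 - mu2 = ({lo:.3f}, {hi:.3f})")
```

If the interval contains 0, the data are consistent with no difference between the two population means at that confidence level.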
σ1 and σ2 unknown, assumed equal

Assumptions:
• Samples are randomly and independently drawn
• Population standard deviations are unknown
• The two standard deviations are equal

EXP (P315): Customers at FoodTown Super Markets have a choice when paying for their groceries. They may check out and pay using the standard cashier-assisted checkout, or they may use the new U-Scan procedure. In the standard procedure a FoodTown employee scans each item and puts it on a short conveyor, where another employee puts it in a bag and then into the grocery cart. In the U-Scan procedure the customer scans each item, bags it, and places the bags in the cart themselves. The U-Scan procedure is designed to reduce the time a customer spends in the checkout line. The U-Scan facility was recently installed at the Byrne Road FoodTown location.

The store manager would like to know if the mean checkout time using the standard checkout method is longer than using the U-Scan. She gathered the following sample information. The time is measured from when the customer enters the line until their bags are in the cart. Hence the time includes both waiting in line and checking out.
σ1 and σ2 unknown, assumed equal
(continued)

Forming interval estimates:
• The population standard deviations are assumed equal, so use the two sample standard deviations and pool them to estimate σ
• The test statistic is a t value with (n1 + n2 – 2) degrees of freedom
σ1 and σ2 unknown, assumed equal
(continued)

The pooled standard deviation is

$$ s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$
σ1 and σ2 unknown, assumed equal
(continued)

The confidence interval for μ1 – μ2 is:

$$ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$

Where t_{α/2} has (n1 + n2 – 2) d.f., and

$$ s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$
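As a rough sketch, the pooled interval can be computed as follows. The helper names are hypothetical, the summary statistics are taken from the NYSE/NASDAQ example later in this chapter, and the critical value 2.0154 (44 d.f., α = 0.05) is the table value quoted there rather than something computed here.

```python
from math import sqrt

def pooled_sd(s1, s2, n1, n2):
    """Pooled standard deviation s_p for two samples with assumed equal sigma."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def pooled_t_interval(x1, x2, s1, s2, n1, n2, t_crit):
    """Interval (x1 - x2) +/- t_crit * s_p * sqrt(1/n1 + 1/n2).

    t_crit is t_{alpha/2} with n1 + n2 - 2 d.f., read from a t table.
    """
    sp = pooled_sd(s1, s2, n1, n2)
    margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

# NYSE (sample 1) vs. NASDAQ (sample 2) dividend yields
ci_lo, ci_hi = pooled_t_interval(3.27, 2.53, 1.30, 1.16, 21, 25, t_crit=2.0154)
print(f"95% CI for mu1 - mu2: ({ci_lo:.4f}, {ci_hi:.4f})")
```

The critical value is passed in as a parameter because the Python standard library has no t distribution; a statistics package could supply it instead.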
σ1 and σ2 unknown, not assumed equal

Assumptions:
• Populations are normally distributed
• There is a reason to believe that the populations do not have equal variances
• Samples are independent
σ1 and σ2 unknown, not assumed equal
(continued)

Forming interval estimates:
• The population variances are not assumed equal, so we do not pool them
• The test statistic is a t value with degrees of freedom given by:

$$ df = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} $$
σ1 and σ2 unknown, not assumed equal
(continued)

The confidence interval for μ1 – μ2 is:

$$ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$

Where t_{α/2} has d.f. given by

$$ df = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} $$

Hypothesis Tests for the
Difference Between Two Means

• Testing hypotheses about μ1 – μ2
• Use the same situations discussed already:
  • Standard deviations known
  • Standard deviations unknown
    • Assumed equal
    • Assumed not equal

Hypothesis Tests for
Two Population Means
Two Population Means, Independent Samples

• Lower tail test: H0: μ1 ≥ μ2, HA: μ1 < μ2; i.e., H0: μ1 – μ2 ≥ 0, HA: μ1 – μ2 < 0
• Upper tail test: H0: μ1 ≤ μ2, HA: μ1 > μ2; i.e., H0: μ1 – μ2 ≤ 0, HA: μ1 – μ2 > 0
• Two-tailed test: H0: μ1 = μ2, HA: μ1 ≠ μ2; i.e., H0: μ1 – μ2 = 0, HA: μ1 – μ2 ≠ 0
Hypothesis tests for μ1 – μ2

Population means, independent samples
• σ1 and σ2 known: Use a z test statistic
• σ1 and σ2 unknown but assumed equal: Use sp to estimate unknown σ, use a t test statistic with n1 + n2 – 2 d.f.
• σ1 and σ2 unknown, not assumed equal: Use s1 and s2 to estimate unknown σ1 and σ2, use a t test statistic and calculate the required degrees of freedom
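For the unequal-variance case, the "required degrees of freedom" come from the df formula given earlier in this chapter. A minimal sketch follows, with a hypothetical function name `welch_df`. The inputs reuse the NYSE/NASDAQ sample figures from the later example purely to have numbers to plug in; that example itself assumes equal variances.

```python
def welch_df(s1, s2, n1, n2):
    """Degrees of freedom for the t statistic when variances are not pooled."""
    v1 = s1**2 / n1   # s1^2 / n1
    v2 = s2**2 / n2   # s2^2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

df = welch_df(1.30, 1.16, 21, 25)
print(f"df = {df:.2f}")
```

The result is generally not an integer and never exceeds n1 + n2 – 2; when reading a printed t table it is common to round it down.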
σ1 and σ2 known

The test statistic for μ1 – μ2 is:

$$ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} $$
σ1 and σ2 unknown,
large samples

Case: σ1 and σ2 unknown but assumed equal. The test statistic for μ1 – μ2 is:

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} $$

Where t has (n1 + n2 – 2) d.f., and

$$ s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$
σ1 and σ2 unknown,
small samples

Case: σ1 and σ2 unknown, not assumed equal. The test statistic for μ1 – μ2 is:

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$

Where t has d.f. given by

$$ df = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} $$

Hypothesis tests for μ1 – μ2
Two Population Means, Independent Samples

Example: σ1 and σ2 known (α is the significance level):
• Lower tail test: H0: μ1 – μ2 ≥ 0, HA: μ1 – μ2 < 0. Reject H0 if z < -zα
• Upper tail test: H0: μ1 – μ2 ≤ 0, HA: μ1 – μ2 > 0. Reject H0 if z > zα
• Two-tailed test: H0: μ1 – μ2 = 0, HA: μ1 – μ2 ≠ 0. Reject H0 if z < -zα/2 or z > zα/2
Pooled sp t Test Example
σ1 and σ2 unknown, assumed equal

You’re a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number            21      25
Sample mean       3.27    2.53
Sample std dev    1.30    1.16

Assuming equal variances, is there a difference in average yield (α = 0.05)?
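Calculating the Test Statistic

The test statistic is:

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{(3.27 - 2.53) - 0}{1.2256 \sqrt{\dfrac{1}{21} + \dfrac{1}{25}}} = 2.040 $$

Where:

$$ s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(21 - 1)(1.30)^2 + (25 - 1)(1.16)^2}{21 + 25 - 2}} = 1.2256 $$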
Solution

H0: μ1 - μ2 = 0, i.e. (μ1 = μ2)
HA: μ1 - μ2 ≠ 0, i.e. (μ1 ≠ μ2)
α = 0.05
df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154 (rejection regions of area .025 in each tail)

Test Statistic:

$$ t = \frac{3.27 - 2.53}{1.2256 \sqrt{\dfrac{1}{21} + \dfrac{1}{25}}} = 2.040 $$

Decision: Reject H0 at α = 0.05, since 2.040 > 2.0154.
Conclusion: There is evidence that the means are different.
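The arithmetic of this example can be checked with a short Python sketch. The variable names are illustrative, and the critical value 2.0154 is simply the table value quoted in the solution, not computed here.

```python
from math import sqrt

# NYSE (sample 1) and NASDAQ (sample 2) summary statistics from the example
n1, x1, s1 = 21, 3.27, 1.30
n2, x2, s2 = 25, 2.53, 1.16

# Pooled standard deviation
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Pooled t statistic under H0: mu1 - mu2 = 0
t_stat = ((x1 - x2) - 0) / (sp * sqrt(1 / n1 + 1 / n2))

t_crit = 2.0154                   # two-tailed, alpha = 0.05, 44 d.f. (from the text)
reject_h0 = abs(t_stat) > t_crit  # two-tailed decision rule

print(f"sp = {sp:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

The test statistic only just exceeds the critical value, so the rejection is a close call at α = 0.05.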
Exp: Gibbs Baby Food Company wishes to compare the weight gain of infants using its brand versus its competitor’s. A sample of 40 babies using the Gibbs products revealed a mean weight gain of 7.6 pounds in the first three months after birth. For the Gibbs brand, the population standard deviation is 2.3 pounds. A sample of 55 babies using the competitor’s brand revealed a mean increase in weight of 8.1 pounds. The population standard deviation is 2.9 pounds. At the .05 significance level, can we conclude that babies using the Gibbs brand gained less weight?

• Hypotheses (µ1 = Gibbs brand, µ2 = competitor’s brand):
  H0: µ1 ≥ µ2
  H1: µ1 < µ2
• Significance level: α = 0.05
• Calculation (σ1 and σ2 known, so use the z statistic):

$$ z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{7.6 - 8.1}{\sqrt{\dfrac{(2.3)^2}{40} + \dfrac{(2.9)^2}{55}}} = -0.94 $$

• Critical region: Reject H0 if z < −1.65
• Decision: Compare the calculated value with the tabulated value. Since −0.94 > −1.65, fail to reject H0.
• Conclusion: We cannot conclude that babies using the Gibbs brand gained less weight.
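A quick numerical check of the Gibbs example, written as a sketch with illustrative variable names. It computes the z statistic from the figures in the problem statement and takes the lower-tail critical value from the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist

# Gibbs brand (sample 1) and competitor (sample 2); sigmas are known
n1, x1, sigma1 = 40, 7.6, 2.3
n2, x2, sigma2 = 55, 8.1, 2.9
alpha = 0.05

# z statistic under H0: mu1 - mu2 = 0 (boundary of H0: mu1 >= mu2)
z_stat = (x1 - x2) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)

z_crit = NormalDist().inv_cdf(alpha)   # lower-tail critical value, about -1.645
p_value = NormalDist().cdf(z_stat)     # lower-tail p-value
reject_h0 = z_stat < z_crit

print(f"z = {z_stat:.3f}, critical = {z_crit:.3f}, p = {p_value:.3f}, reject H0: {reject_h0}")
```

The p-value is well above 0.05, which agrees with the decision not to reject H0.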
Paired Samples

Tests Means of 2 Related Populations
• Paired or matched samples
• Repeated measures (before/after)
• Use difference between paired values: d = x1 - x2
• Eliminates variation among subjects
• Assumptions:
  • Both populations are normally distributed
  • Or, if not normal, use large samples

Paired Differences

The ith paired difference is di, where

$$ d_i = x_{1i} - x_{2i} $$

The point estimate for the population mean paired difference is d̄:

$$ \bar{d} = \frac{\sum_{i=1}^{n} d_i}{n} $$

The sample standard deviation is

$$ s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}} $$

n is the number of pairs in the paired sample
Paired Differences
(continued)

The confidence interval for μd is

$$ \bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}} $$

Where t has n - 1 d.f. and sd is:

$$ s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}} $$

n is the number of pairs in the paired sample
Hypothesis Testing for
Paired Samples

The test statistic for μd is

$$ t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} $$

Where t has n - 1 d.f. and sd is:

$$ s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}} $$

n is the number of pairs in the paired sample
Hypothesis Testing for
Paired Samples
(continued)

• Lower tail test: H0: μd ≥ 0, HA: μd < 0. Reject H0 if t < -tα
• Upper tail test: H0: μd ≤ 0, HA: μd > 0. Reject H0 if t > tα
• Two-tailed test: H0: μd = 0, HA: μd ≠ 0. Reject H0 if t < -tα/2 or t > tα/2

Where t has n - 1 d.f.
Paired Samples Example

Assume you send your salespeople to a “customer service” training workshop. Is the training effective? You collect the following data:

Number of Complaints:

Salesperson   Before (1)   After (2)   Difference, di = (2) - (1)
C.B.          6            4           -2
T.F.          20           6           -14
M.H.          3            2           -1
R.K.          0            0           0
M.O.          4            0           -4
                           Sum         -21

$$ \bar{d} = \frac{\sum d_i}{n} = \frac{-21}{5} = -4.2 $$

$$ s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = 5.67 $$
Paired Samples: Solution

Has the training made a difference in the number of complaints (at the 0.05 level)?

H0: μd = 0
HA: μd ≠ 0
α = .05, d̄ = -4.2
d.f. = n - 1 = 4
Critical Value = ± 2.7765 (rejection regions of area α/2 in each tail)

Test Statistic:

$$ t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66 $$

Decision: Do not reject H0 (the t statistic, -1.66, is not in the rejection region).
Conclusion: There is not a significant change in the number of complaints.
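The paired calculation can be reproduced directly from the before/after complaint counts. This is a sketch with illustrative variable names; the critical value 2.7765 (4 d.f., α = 0.05, two-tailed) is the table value quoted in the solution.

```python
from math import sqrt
from statistics import mean, stdev

before = [6, 20, 3, 0, 4]   # complaints before training: C.B., T.F., M.H., R.K., M.O.
after = [4, 6, 2, 0, 0]     # complaints after training, same order

# Paired differences d_i = after - before, matching the (2) - (1) column
d = [a - b for a, b in zip(after, before)]
n = len(d)

d_bar = mean(d)             # point estimate of mu_d
s_d = stdev(d)              # sample standard deviation (n - 1 in the denominator)
t_stat = (d_bar - 0) / (s_d / sqrt(n))

t_crit = 2.7765             # from the text
reject_h0 = abs(t_stat) > t_crit

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Most of the spread in the differences comes from one salesperson (T.F., -14), which inflates s_d and keeps the t statistic inside the non-rejection region.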
Two Population Proportions

Goal: Form a confidence interval for or test a hypothesis about the difference between two population proportions, π1 – π2

Assumptions:
n1π1 ≥ 5, n1(1-π1) ≥ 5
n2π2 ≥ 5, n2(1-π2) ≥ 5

The point estimate for the difference is p1 – p2
Confidence Interval for
Two Population Proportions

The confidence interval for π1 – π2 is:

$$ (p_1 - p_2) \pm z_{\alpha/2} \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}} $$
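A minimal sketch of this interval in Python follows. The function name `proportion_diff_interval` is hypothetical, and the counts are borrowed from the Proposition A example later in this chapter (36 of 72 men, 31 of 50 women), which the text itself uses only for a hypothesis test.

```python
from math import sqrt
from statistics import NormalDist

def proportion_diff_interval(x1, n1, x2, n2, conf=0.95):
    """Confidence interval for pi1 - pi2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Men: 36 of 72 said Yes; women: 31 of 50 said Yes
p_lo, p_hi = proportion_diff_interval(36, 72, 31, 50)
print(f"95% CI for pi1 - pi2: ({p_lo:.3f}, {p_hi:.3f})")
```

The interval uses each sample's own proportion in the standard error. The hypothesis test, by contrast, pools the two proportions because it assumes π1 = π2 under H0.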
Hypothesis Tests for
Two Population Proportions

Population proportions
• Lower tail test: H0: π1 ≥ π2, HA: π1 < π2; i.e., H0: π1 – π2 ≥ 0, HA: π1 – π2 < 0
• Upper tail test: H0: π1 ≤ π2, HA: π1 > π2; i.e., H0: π1 – π2 ≤ 0, HA: π1 – π2 > 0
• Two-tailed test: H0: π1 = π2, HA: π1 ≠ π2; i.e., H0: π1 – π2 = 0, HA: π1 – π2 ≠ 0
Two Population Proportions

Since we begin by assuming the null hypothesis is true, we assume π1 = π2 and pool the two p estimates.

The pooled estimate for the overall proportion is:

$$ \bar{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2} $$

where x1 and x2 are the numbers from samples 1 and 2 with the characteristic of interest
Two Population Proportions
(continued)

The test statistic for π1 – π2 is:

$$ z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} $$
Hypothesis Tests for
Two Population Proportions

Population proportions
• Lower tail test: H0: π1 – π2 ≥ 0, HA: π1 – π2 < 0. Reject H0 if z < -zα
• Upper tail test: H0: π1 – π2 ≤ 0, HA: π1 – π2 > 0. Reject H0 if z > zα
• Two-tailed test: H0: π1 – π2 = 0, HA: π1 – π2 ≠ 0. Reject H0 if z < -zα/2 or z > zα/2
Example:
Two Population Proportions

Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?
• In a random sample, 36 of 72 men and 31 of 50 women indicated they would vote Yes
• Test at the .05 level of significance

Example:
Two Population Proportions
(continued)

• The hypothesis test is:
  H0: π1 – π2 = 0 (the two proportions are equal)
  HA: π1 – π2 ≠ 0 (there is a significant difference between proportions)
• The sample proportions are:
  • Men: p1 = 36/72 = .50
  • Women: p2 = 31/50 = .62
• The pooled estimate for the overall proportion is:

$$ \bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{36 + 31}{72 + 50} = \frac{67}{122} = .549 $$
Example:
Two Population Proportions
(continued)

The test statistic for π1 – π2 is:

$$ z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(.50 - .62) - 0}{\sqrt{.549(1 - .549)\left(\dfrac{1}{72} + \dfrac{1}{50}\right)}} = -1.31 $$

Critical Values = ±1.96 for α = .05 (rejection regions of area .025 in each tail)

Decision: Do not reject H0, since -1.31 lies between -1.96 and 1.96.
Conclusion: There is not significant evidence of a difference in the proportion who will vote yes between men and women.
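The Proposition A calculation can be verified with a few lines of Python. This is a sketch with illustrative variable names; the two-tailed p-value is an addition not shown on the slides.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 36, 72   # men voting Yes, men sampled
x2, n2 = 31, 50   # women voting Yes, women sampled
alpha = 0.05

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)   # pooled estimate under H0: pi1 = pi2

# z statistic with the pooled standard error
z_stat = ((p1 - p2) - 0) / sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

z_crit = NormalDist().inv_cdf(1 - alpha / 2)        # about 1.96
p_value = 2 * NormalDist().cdf(-abs(z_stat))        # two-tailed p-value
reject_h0 = abs(z_stat) > z_crit

print(f"pooled p = {p_pooled:.3f}, z = {z_stat:.2f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```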
Chapter Summary

• Compared two independent samples
  • Formed confidence intervals for the differences between two means
  • Performed z test for the differences in two means
  • Performed t test for the differences in two means
• Compared two related samples (paired samples)
  • Formed confidence intervals for the paired difference
  • Performed paired sample t tests for the mean difference
• Compared two population proportions
  • Formed confidence intervals for the difference between two population proportions
  • Performed z test for two population proportions
