
ES031 ENGINEERING DATA ANALYSIS

Module 5

TEST OF HYPOTHESIS:
TWO SAMPLES
Prepared by:
Engr. Kristan Ian Cabaña
IE Department

Adopted by:
Engr. Lynne Ivy L. Illaga
IE Department
OUTLINE

• Inference on the Difference in Means of Two Normal
Distributions, Variances Known
• Inference on the Difference in Means of Two Normal
Distributions, Variances Unknown
• Comparing Means of Two Related Populations
• Comparing Two Population Proportions
INFERENCE ON THE DIFFERENCE IN
MEANS OF TWO NORMAL
DISTRIBUTIONS
VARIANCES KNOWN
DIFFERENCE BETWEEN MEANS, VARIANCES KNOWN

TWO SAMPLE HYPOTHESIS TESTING


In a two-sample hypothesis test, two parameters from two populations
are compared.

For a two-sample hypothesis test,


1. the null hypothesis H0 is a statistical hypothesis that usually states
there is no difference between the parameters of two populations.
The null hypothesis always contains the symbol ≤, =, or ≥.
2. the alternative hypothesis Ha is a statistical hypothesis that is true
when H0 is false. The alternative hypothesis always contains the
symbol >, ≠, or <.

To write a null and alternative hypothesis for a two-sample hypothesis
test, translate the claim made about the population parameters from a
verbal statement to a mathematical statement.

H0: μ1 = μ2      H0: μ1 ≤ μ2      H0: μ1 ≥ μ2
Ha: μ1 ≠ μ2      Ha: μ1 > μ2      Ha: μ1 < μ2

Regardless of which pair of hypotheses is used, μ1 = μ2 is always assumed
to be true when the test statistic is computed.
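
Once the hypotheses are written, the symbol in Ha indicates which tail (or tails) of the sampling distribution holds the rejection region. The sketch below is a minimal illustration for a test whose statistic follows the standard normal distribution. The function name `z_rejection_region` and the string codes "!=", ">", "<" are hypothetical, chosen only for this example.

```python
from statistics import NormalDist


def z_rejection_region(alternative, alpha):
    """Return (lower, upper) critical z-values for the given Ha symbol.

    alternative: "!=" (two-tailed), ">" (right-tailed), or "<" (left-tailed).
    A value of None means there is no rejection region on that side.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "!=":
        z = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split over both tails
        return (-z, z)
    if alternative == ">":
        return (None, std_normal.inv_cdf(1 - alpha))
    if alternative == "<":
        return (std_normal.inv_cdf(alpha), None)
    raise ValueError("alternative must be '!=', '>' or '<'")


# Ha: mu1 > mu2 at alpha = 0.10: reject H0 when z exceeds the upper bound
print(z_rejection_region(">", 0.10))
```

H0 is rejected when the computed statistic falls below the lower bound or above the upper bound, whichever bounds exist for the chosen alternative.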

TWO SAMPLE Z-TEST


Three conditions are necessary to perform a z-test for the difference
between two population means μ1 and μ2:

1. The samples must be randomly selected.


2. The samples must be independent. Two samples are independent if
the sample selected from one population is not related to the
sample selected from the second population.
3. Each sample size must be at least 30, or, if not, each population
must have a normal distribution with a known standard deviation.

A two-sample z-test can be used to test the difference between two
population means μ1 and μ2 when a large sample (at least 30) is randomly
selected from each population and the samples are independent. The
standardized test statistic is

\[
Z_c = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}
\]

ILLUSTRATIVE EXAMPLE

A high school math teacher claims that students in her class will score
higher on the math portion of the ACT than students in a colleague’s
math class. The mean ACT math score for 49 students in her class is
22.1 and the standard deviation is 4.8. The mean ACT math score for 44
of the colleague’s students is 19.8 and the standard deviation is 5.4.
At 0.10 level of significance, can the teacher’s claim be supported?
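
One way to work this example is sketched below. It assumes that, because both samples have at least 30 observations, the sample standard deviations can stand in for σ1 and σ2. It also assumes the claim is written as Ha: μ1 > μ2, which makes this a right-tailed test. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the example (1 = teacher's class, 2 = colleague's class)
x1, s1, n1 = 22.1, 4.8, 49
x2, s2, n2 = 19.8, 5.4, 44
alpha = 0.10

# H0: mu1 <= mu2, Ha: mu1 > mu2 (claim); mu1 - mu2 = 0 is used under H0
# Assumption: s1 and s2 replace sigma1 and sigma2 since n1, n2 >= 30
z_c = ((x1 - x2) - 0) / sqrt(s1**2 / n1 + s2**2 / n2)

z_crit = NormalDist().inv_cdf(1 - alpha)   # right-tailed critical value
p_value = 1 - NormalDist().cdf(z_c)        # area to the right of z_c
reject_h0 = z_c > z_crit

print(f"z_c = {z_c:.3f}, critical z = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Under these assumptions the statistic is about 2.16, which exceeds the critical value of about 1.28, so H0 is rejected and the teacher's claim is supported at the 0.10 level.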
INFERENCE ON THE DIFFERENCE IN
MEANS OF TWO NORMAL
DISTRIBUTIONS
VARIANCES UNKNOWN
DIFFERENCE BETWEEN MEANS, VARIANCES UNKNOWN

TWO SAMPLE t-TEST


If samples of size less than 30 are taken from normally distributed populations, a
t-test may be used to test the difference between the population means
μ1 and μ2.
Three conditions are necessary to use a t-test for small independent samples:

1. The samples must be randomly selected.


2. The samples must be independent. Two samples are independent if
the sample selected from one population is not related to the sample
selected from the second population.
3. Each population must have a normal distribution.

A two-sample t-test is used to test the difference between two population
means μ1 and μ2 when a sample is randomly selected from each population.
Performing this test requires each population to be normally distributed,
and the samples should be independent. The standardized test statistic
depends on whether the population variances are assumed to be equal:

Pooled variance t-test, where σ1² = σ2²:

\[
t_c = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
\]

Where the pooled estimator is

\[
S_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}
\]

Degrees of Freedom = n1 + n2 − 2

Separate variance t-test, where σ1² ≠ σ2²:

\[
t_c = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
\]

Degrees of Freedom:

\[
\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2 / n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2\right)^2}{n_2 - 1}}
\]

ILLUSTRATIVE EXAMPLE

A random sample of 14 police officers in Brownsville has a mean annual
income of $35,800 and a standard deviation of $7,800. In Greensville, a
random sample of 15 police officers has a mean annual income of $35,100
and a standard deviation of $7,375. Test the claim at 0.01 level of
significance that the mean annual incomes in the two cities are not
the same. Assume the population variances are equal.
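
A possible worked solution using the pooled variance t-test is sketched below. The claim "not the same" is taken as Ha: μ1 ≠ μ2, a two-tailed test. The critical value 2.771 is an assumption read from a standard t-table (two-tailed α = 0.01, 27 degrees of freedom) rather than computed; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the example (1 = Brownsville, 2 = Greensville)
n1, x1, s1 = 14, 35800.0, 7800.0
n2, x2, s2 = 15, 35100.0, 7375.0

# H0: mu1 = mu2, Ha: mu1 != mu2 (claim); population variances assumed equal
df = n1 + n2 - 2
s_p = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)  # pooled standard deviation
t_c = ((x1 - x2) - 0) / (s_p * sqrt(1 / n1 + 1 / n2))

t_crit = 2.771  # from a t-table: two-tailed alpha = 0.01, df = 27
reject_h0 = abs(t_c) > t_crit

print(f"df = {df}, S_p = {s_p:.1f}, t_c = {t_c:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

With these inputs the statistic is about 0.25, well inside ±2.771, so H0 is not rejected: the samples do not give enough evidence at the 0.01 level that the mean incomes differ.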

ILLUSTRATIVE EXAMPLE
Arsenic concentration in public drinking water supplies is a
potential health risk. An article in the Arizona Republic (2001
Issue) reported drinking water arsenic concentrations in
parts per billion (ppb) for 10 metropolitan Phoenix
communities with a sample mean of 12.5 and standard
deviation of 7.63, and 10 communities in rural Arizona with a
sample mean of 27.5 and standard deviation of 15.3.
Determine whether any difference exists in mean arsenic
concentrations. Assume that the population variances are
not the same.
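
The separate variance t-test for this example can be sketched as follows. The example does not state a significance level, so α = 0.05 is assumed here. The critical value 2.160 is read from a standard t-table (two-tailed, 13 degrees of freedom), and the computed degrees of freedom are rounded down, which is a common conservative convention.

```python
from math import floor, sqrt

# Summary statistics from the example (1 = metropolitan Phoenix, 2 = rural Arizona)
n1, x1, s1 = 10, 12.5, 7.63
n2, x2, s2 = 10, 27.5, 15.3

# H0: mu1 = mu2, Ha: mu1 != mu2; population variances not assumed equal
v1 = s1**2 / n1  # estimated variance of the first sample mean
v2 = s2**2 / n2  # estimated variance of the second sample mean

t_c = ((x1 - x2) - 0) / sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
df_used = floor(df)  # round down before consulting the table

t_crit = 2.160  # from a t-table: two-tailed alpha = 0.05 (assumed), df = 13
reject_h0 = abs(t_c) > t_crit

print(f"t_c = {t_c:.3f}, df = {df:.2f} (use {df_used})")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Under the assumed α = 0.05, the statistic of about −2.77 lies beyond −2.160, so H0 is rejected: the data suggest that mean arsenic concentration differs between the two groups of communities.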
COMPARING MEANS OF TWO
RELATED POPULATIONS
COMPARING MEANS OF TWO RELATED POPULATIONS

INDEPENDENT AND DEPENDENT SAMPLES


Two samples are independent if the sample selected from one
population is not related to the sample selected from the second
population. Two samples are dependent if each member of one
sample corresponds to a member of the other sample. Dependent
samples are also called paired samples or matched samples.


Illustrative Examples:
Classify each pair of samples as independent or dependent.

Sample 1: The weight of 24 students in a first-grade class
Sample 2: The height of the same 24 students
These samples are dependent because the weight and height can be
paired with respect to each student.

Sample 1: The average price of 15 new trucks
Sample 2: The average price of 20 used sedans
These samples are independent because it is not possible to pair the
new trucks with the used sedans. The data represent prices for
different vehicles.

Z TEST FOR MEAN DIFFERENCE

\[
z_c = \frac{\bar{D} - \mu_D}{\sigma_D / \sqrt{n}}
\]

Where:

\[
\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}
\]

μD = hypothesized mean difference
σD = population standard deviation of the difference scores
n = sample size
Di = difference between two samples

PAIRED t TEST FOR MEAN DIFFERENCE

\[
t_c = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}
\]

Where:

\[
\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}
\qquad
s_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}
\]

μD = hypothesized mean difference
n = sample size
Di = difference between two samples

ILLUSTRATIVE EXAMPLE
A reading center claims that students will perform better on
a standardized reading test after going through the reading
course offered by their center. The table shows the reading
scores of 6 students before and after the course. At α = 0.05,
is there enough evidence to conclude that the students’
scores after the course are better than the scores before the
course?

Student          1   2   3   4   5   6
Score (before)  85  96  70  76  81  78
Score (after)   88  85  89  86  92  89
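
A paired t-test for this example might be set up as in the sketch below. The differences are defined here as after minus before, so the claim becomes Ha: μD > 0; defining them the other way round would flip the sign of the statistic and the direction of the tail. The critical value 2.015 is an assumption read from a standard t-table (one-tailed α = 0.05, n − 1 = 5 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev

before = [85, 96, 70, 76, 81, 78]
after = [88, 85, 89, 86, 92, 89]

# D_i = after - before, so improvement corresponds to Ha: mu_D > 0
d = [a - b for a, b in zip(after, before)]
n = len(d)

d_bar = mean(d)   # mean of the differences
s_d = stdev(d)    # sample standard deviation (divides by n - 1)

# H0: mu_D <= 0, Ha: mu_D > 0 (claim); mu_D = 0 is used under H0
t_c = (d_bar - 0) / (s_d / sqrt(n))

t_crit = 2.015    # from a t-table: one-tailed alpha = 0.05, df = 5
reject_h0 = t_c > t_crit

print(f"differences = {d}")
print(f"D-bar = {d_bar:.3f}, s_D = {s_d:.3f}, t_c = {t_c:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The statistic comes out at about 1.71, which is below 2.015, so H0 is not rejected: at α = 0.05 these six pairs do not provide enough evidence that scores improved.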
COMPARING TWO POPULATION
PROPORTIONS
COMPARING TWO POPULATION PROPORTIONS

TWO SAMPLE Z-TEST


A z-test is used to test the difference between two population proportions, p1
and p2.

Three conditions are required to conduct the test.

1. The samples must be randomly selected.


2. The samples must be independent.
3. The samples must be large enough to use a normal sampling
distribution. That is,
n1p1 ≥ 5, n1q1 ≥ 5,
n2p2 ≥ 5, and n2q2 ≥ 5.
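
The third condition can be checked before running the test, as in the sketch below. Since p1 and p2 are unknown in practice, the sketch assumes the sample proportions are used in their place, in which case n·p is simply the number of successes and n·q the number of failures. The function name `normal_approximation_ok` and the counts in the usage lines are hypothetical.

```python
def normal_approximation_ok(x1, n1, x2, n2, minimum=5):
    """Check n1*p1, n1*q1, n2*p2, n2*q2 >= minimum using sample proportions.

    With p-hat = x / n, n * p-hat equals x (successes) and
    n * q-hat equals n - x (failures), so the counts are compared directly.
    """
    counts = [x1, n1 - x1, x2, n2 - x2]
    return all(count >= minimum for count in counts)


# Hypothetical counts for illustration only
print(normal_approximation_ok(12, 40, 9, 50))  # every count is at least 5
print(normal_approximation_ok(12, 40, 3, 50))  # only 3 successes in sample 2
```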

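\[
Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
\]

WITH

\[
\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad \hat{p}_1 = \frac{x_1}{n_1}, \qquad \hat{p}_2 = \frac{x_2}{n_2}
\]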

ILLUSTRATIVE EXAMPLE
A recent survey stated that male college students smoke
less than female college students. In a survey of 1245 male
students, 361 said they smoke at least one pack of cigarettes
a day. In a survey of 1065 female students, 341 said they
smoke at least one pack a day. At α = 0.01, can you
support the claim that the proportion of male college
students who smoke at least one pack of cigarettes a day is
lower than the proportion of female college students who
smoke at least one pack a day?
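
The two-proportion z-test for this example can be sketched as follows, taking the claim as Ha: p1 < p2 (a left-tailed test) with males as sample 1 and females as sample 2. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the example (1 = male students, 2 = female students)
x1, n1 = 361, 1245
x2, n2 = 341, 1065
alpha = 0.01

p1_hat = x1 / n1
p2_hat = x2 / n2
p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion, used because H0 assumes p1 = p2

# H0: p1 >= p2, Ha: p1 < p2 (claim); p1 - p2 = 0 is used under H0
z = ((p1_hat - p2_hat) - 0) / sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))

z_crit = NormalDist().inv_cdf(alpha)  # left-tailed critical value
reject_h0 = z < z_crit

print(f"p1_hat = {p1_hat:.4f}, p2_hat = {p2_hat:.4f}, p_bar = {p_bar:.4f}")
print(f"z = {z:.3f}, critical z = {z_crit:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The statistic is about −1.57, which does not fall below the critical value of about −2.33, so H0 is not rejected: at α = 0.01 the data do not support the claim that the male proportion is lower.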
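HYPOTHESIS TESTING

SUMMARY

INDEPENDENT samples, difference in means

• σ or σ² known, n ≥ 30:

\[
Z_c = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}
\]

• σ or σ² known, n < 30:

\[
t_c = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
\]

• σ or σ² unknown, σ1² = σ2²:

\[
t_c = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
\]

• σ or σ² unknown, σ1² ≠ σ2²:

\[
t_c = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
\]

INDEPENDENT samples, proportions

\[
Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
\]

RELATED samples

• σ or σ² known:

\[
z_c = \frac{\bar{D} - \mu_D}{\sigma_D / \sqrt{n}}
\]

• σ or σ² unknown, n ≥ 30:

\[
z_c = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}
\]

• σ or σ² unknown, n < 30:

\[
t_c = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}
\]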
ES031 ENGINEERING DATA ANALYSIS

End of Module 5
