Module - 5 PROB
Notations
Population parameters
Population mean (µ)
Population standard deviation (σ)
Population size (N)
Population proportion (P)
Sample statistic
Sample mean (x̄)
Sample standard deviation (s)
Sample size (n)
Sample proportion (p)
Hypothesis Testing
Statistical hypothesis: a statement or a claim about one or more population parameters.
Example
Suppose we test for the population mean. Then
Null Hypothesis H0 : µ = µ0
Alt Hypothesis H1 : µ ≠ µ0 or µ > µ0 or µ < µ0
If H1 is µ ≠ µ0, the test is called a two-tailed test.
If H1 is µ > µ0, it is called a right-tailed test.
If H1 is µ < µ0, it is called a left-tailed test.
Types of errors
             H0 is true          H0 is false
Reject H0    Type I error        Correct decision
Accept H0    Correct decision    Type II error
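Steps involved in Hypothesis testing
1: State the null hypothesis H0 and the alternative hypothesis H1.
2: Fix the level of significance α.
3: Check the conditions and compute the test statistic.
4: Find the critical region and compare the calculated value with it.
5: State the conclusion.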
Table corresponding to critical values
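The critical values quoted in this module (1.645, 1.96, 2.58) are quantiles of the standard normal distribution. A minimal sketch of how they could be reproduced, assuming Python's standard-library `statistics.NormalDist`; the helper name `critical_value` is hypothetical and not part of the slides:

```python
from statistics import NormalDist


def critical_value(alpha: float, two_tailed: bool = True) -> float:
    """Return the positive z critical value for significance level alpha."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.10, 0.05, 0.01):
    two = critical_value(alpha, two_tailed=True)
    one = critical_value(alpha, two_tailed=False)
    print(f"alpha={alpha:.2f}  two-tailed={two:.3f}  one-tailed={one:.3f}")
```

The slides round these values, for example 2.576 to 2.58 and 1.645 to 1.65.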
Test for single proportion
Conditions
nP ≥ 5 and n(1 − P) ≥ 5
Null Hypothesis: H0 : P = P0
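Test statistic:
$Z = \dfrac{p - P}{\sqrt{PQ/n}}$, where Q = 1 − P,
follows the standard normal distribution.
Standard error of proportion $= \sqrt{PQ/n}$
95% confidence limits (that is, α = 5%) for P are
$\left(p - 1.96\sqrt{\dfrac{pq}{n}},\; p + 1.96\sqrt{\dfrac{pq}{n}}\right)$, where q = 1 − p.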
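As a rough illustration of the test statistic above, the following sketch checks the stated conditions and then evaluates Z. The function name `one_proportion_z` is hypothetical; the input values (p = 0.9, P = 0.85, n = 40) are the ones used in the worked solution of this module.

```python
import math


def one_proportion_z(p_hat: float, p0: float, n: int) -> float:
    """Z statistic for H0: P = p0, using the normal approximation."""
    # Conditions from the slide: nP >= 5 and n(1 - P) >= 5.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation conditions are not met")
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


z = one_proportion_z(p_hat=0.9, p0=0.85, n=40)
print(round(z, 4))
```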
Problems on single proportion
Solution contd:
Hence,
$Z = \dfrac{p - P}{\sqrt{PQ/n}} = \dfrac{0.9 - 0.85}{\sqrt{\dfrac{0.85 \times 0.15}{40}}} = 0.8856$
Note that Z follows standard normal distribution.
4: Critical region:
Zα = 1.65. The critical region is Z > 1.65. Since Cal Z = 0.8856 lies
in the acceptance region, we accept H0 (or fail to reject H0).
5: Conclusion:
There is no statistical evidence that more than 85% of the people
attacked by the disease survived.
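The critical-region step can be sketched for all three kinds of test (two-tailed, right-tailed, left-tailed). This is only an illustration: the function name `decide` and the string labels for the tails are assumptions, and the critical values are computed with `statistics.NormalDist` rather than read from a table.

```python
from statistics import NormalDist


def decide(z: float, alpha: float, tail: str) -> str:
    """Apply the critical-region rule; tail is 'two', 'right' or 'left'."""
    std_normal = NormalDist()
    if tail == "two":
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    elif tail == "right":
        reject = z >= std_normal.inv_cdf(1 - alpha)
    elif tail == "left":
        reject = z <= std_normal.inv_cdf(alpha)
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return "reject H0" if reject else "fail to reject H0"


# Right-tailed test at the 5% level with Cal Z = 0.8856.
print(decide(0.8856, alpha=0.05, tail="right"))
```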
Test of difference of proportions
Conditions
n1 p1 ≥ 5; n1 q1 ≥ 5; n2 p2 ≥ 5; n2 q2 ≥ 5
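Test statistic:
$Z = \dfrac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$,
where $P = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ and Q = 1 − P.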
Problems
1. In a random sample of 100 men taken from Village A, 60 were found to
be consuming alcohol. In another sample of 200 men taken from Village
B, 100 were found to be consuming alcohol. Do the two villages differ
significantly with respect to the proportion of men who consume alcohol?
Solution:
Let x1, x2 denote the number of men consuming alcohol in the samples from
Villages A and B respectively.
Here x1 = 60, n1 = 100, x2 = 100, n2 = 200
Sample proportions: $p_1 = \dfrac{x_1}{n_1} = \dfrac{60}{100} = 0.6$, $p_2 = \dfrac{x_2}{n_2} = \dfrac{100}{200} = 0.5$
1: H0 : P1 − P2 = 0 against H1 : P1 − P2 ≠ 0 (two-tailed test)
2: Level of significance α = 0.05
3: Test Statistic:
Check the conditions: n1 p1 = 100 × 0.6 = 60 > 5, n1 q1 = 40 > 5,
n2 p2 = 100 > 5, n2 q2 = 100 > 5
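Solution contd:
Hence, the pooled proportion is
$P = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{100 \times 0.6 + 200 \times 0.5}{100 + 200} = 0.533$
With Q = 1 − P = 0.467 and P1 − P2 = 0 under H0,
$Z = \dfrac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{0.1}{\sqrt{0.533 \times 0.467\left(\dfrac{1}{100} + \dfrac{1}{200}\right)}} = 1.6366$
Note that Z follows the standard normal distribution.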
Solution contd:
4: Critical region:
Since it is a two-tailed test, the critical value is $Z_{\alpha/2} = 1.96$.
The critical region is |Z| ≥ 1.96, that is,
−∞ < Z ≤ −1.96 or 1.96 ≤ Z < ∞.
Since −1.96 ≤ Cal Z = 1.6366 ≤ 1.96, we accept H0 (or fail to reject H0).
5: Conclusion:
There is no statistical evidence that the two villages differ
significantly in the proportion of men who consume alcohol.
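The Village A / Village B calculation can be checked with a short sketch. It assumes the pooled-proportion form of the statistic used in this solution; the function name `two_proportion_z` is hypothetical.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Z statistic for H0: P1 - P2 = 0 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # same as (n1*p1 + n2*p2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


z = two_proportion_z(x1=60, n1=100, x2=100, n2=200)
print(round(z, 4))        # compare with Cal Z in the solution
print(abs(z) >= 1.96)     # True would mean Z falls in the critical region
```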
Test of single mean
Conditions
If the population standard deviation is known, proceed with the z-test.
If the population standard deviation is not known, proceed with the
z-test when the sample is large (n ≥ 30). For a small sample, proceed with
the t-test (covered in Module 6).
Null hypothesis H0 : µ = µ0
Test statistic:
$Z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
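A minimal sketch of the single-mean statistic follows. The numbers are hypothetical illustration values (sample mean 1580, σ = 90, n = 100, µ0 = 1600) and are not taken from the slides; the function name `one_sample_z` is likewise an assumption.

```python
import math


def one_sample_z(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """Z statistic for H0: mu = mu0 with known (or large-sample) sigma."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


# Hypothetical data, chosen only to illustrate the arithmetic.
z = one_sample_z(sample_mean=1580, mu0=1600, sigma=90, n=100)
print(round(z, 2))
```

For a large sample with unknown σ, the sample standard deviation s would be passed in place of `sigma`.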
Problems
Solution contd:
4: Critical region:
Since it is a two-tailed test, the critical value is $Z_{\alpha/2} = 2.58$.
The critical region is |Z| ≥ 2.58.
Since |Cal Z| = 2.22 < 2.58, Cal Z = −2.22 lies in the acceptance region,
so we accept H0 (or fail to reject H0).
5: Conclusion:
There is no statistical evidence against the claim that the mean lifetime
of the tubes produced by the company is 1600 hours.
Test of difference of means
Conditions
If the population standard deviations are known, proceed with the z-test.
If the population standard deviations are not known, proceed with the
z-test when both samples are large (n1 ≥ 30 and n2 ≥ 30). For small
samples, proceed with the t-test (covered in Module 6).
Null hypothesis H0 : µ1 − µ2 = d
Test statistic:
$Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
When σ1 and σ2 are not known and both samples are large, the sample
standard deviations s1 and s2 are used in their place.
Problems
1. Intelligence test given to two groups of boys and girls gave the following
information:
Solution contd:
3: Test Statistic:
Since n1, n2 ≥ 30, we proceed with the Z-test.
4: Critical region:
Since it is a two-tailed test, $Z_{\alpha/2} = 1.96$. The critical region is
|Z| ≥ 1.96.
Since Cal Z = 2.6958 ≥ 1.96, we reject H0.
5: Conclusion:
We conclude that the difference in the mean scores of boys and girls is
statistically significant.
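A sketch of the two-sample calculation, assuming the usual large-sample statistic Z = ((x̄1 − x̄2) − d) / sqrt(s1²/n1 + s2²/n2). The data table for this problem is not reproduced here, so the numbers below are hypothetical and will not reproduce Cal Z = 2.6958; the function name `two_sample_z` is also an assumption.

```python
import math


def two_sample_z(mean1: float, sd1: float, n1: int,
                 mean2: float, sd2: float, n2: int,
                 d: float = 0.0) -> float:
    """Z statistic for H0: mu1 - mu2 = d with large samples."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return ((mean1 - mean2) - d) / standard_error


# Hypothetical summary statistics, for illustration only.
z = two_sample_z(mean1=75, sd1=8, n1=60, mean2=73, sd2=10, n2=100)
print(round(z, 4))
print("reject H0" if abs(z) >= 1.96 else "fail to reject H0")
```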
More problems- Test of single proportion
1. A sample poll of 100 voters chosen at random from all the voters in a
given district indicated that 55% of them were in favour of a particular
candidate. Find (i) 95% and (ii) 99% confidence limits for the proportion
of all the voters in favour of this candidate.
Solution
Given n = 100, p = 0.55 (proportion of voters favouring the candidate), so q = 1 − p = 0.45.
(i) α = 5% = 0.05 ⟹ $Z_{\alpha/2} = 1.96$
95% confidence limits are given by $\left(p - 1.96\sqrt{\dfrac{pq}{n}},\; p + 1.96\sqrt{\dfrac{pq}{n}}\right)$
$= \left(0.55 - 1.96\sqrt{\dfrac{0.55 \times 0.45}{100}},\; 0.55 + 1.96\sqrt{\dfrac{0.55 \times 0.45}{100}}\right)$
= (0.4525, 0.6475)
Solution contd.
(ii) α = 1% = 0.01 ⟹ $Z_{\alpha/2} = 2.58$
99% confidence limits are given by $\left(p - 2.58\sqrt{\dfrac{pq}{n}},\; p + 2.58\sqrt{\dfrac{pq}{n}}\right)$
$= \left(0.55 - 2.58\sqrt{\dfrac{0.55 \times 0.45}{100}},\; 0.55 + 2.58\sqrt{\dfrac{0.55 \times 0.45}{100}}\right)$
= (0.4216, 0.6784)
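Both sets of confidence limits can be reproduced with a short sketch. It assumes the critical value is taken from `statistics.NormalDist` instead of a table, so the 99% limits differ slightly in the fourth decimal place from the slide, which rounds 2.576 to 2.58. The function name `proportion_confidence_limits` is hypothetical.

```python
import math
from statistics import NormalDist


def proportion_confidence_limits(p_hat: float, n: int, confidence: float):
    """Return (lower, upper) confidence limits for a population proportion."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


for level in (0.95, 0.99):
    low, high = proportion_confidence_limits(0.55, 100, level)
    print(f"{level:.0%}: ({low:.4f}, {high:.4f})")
```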
2. The population proportion is expected to be around 0.7. Find the
sample size needed to estimate the proportion within 0.02 with confidence
level of 90%.
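Solution:
Given P = 0.7, so Q = 1 − P = 0.3, and margin of error |p − P| = 0.02
α = 10% = 0.1 ⟹ $Z_{\alpha/2} = 1.645$
We know that $Z_{\alpha/2} = \dfrac{p - P}{\sqrt{PQ/n}}$, so
$p - P = Z_{\alpha/2} \times \sqrt{\dfrac{PQ}{n}}$
$0.02 = 1.645 \times \sqrt{\dfrac{0.7 \times 0.3}{n}} \implies n = \left(\dfrac{1.645}{0.02}\right)^2 \times 0.21 = 1420.7$
Hence, n = 1421 approx.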
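The sample-size formula n = Z²PQ / E² (with E the margin of error) can be sketched as below. The function name `sample_size_for_proportion` is hypothetical, and the result is rounded up because a fractional observation is not possible.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p: float, margin: float, confidence: float) -> int:
    """Smallest n whose margin of error for a proportion is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


n = sample_size_for_proportion(p=0.7, margin=0.02, confidence=0.90)
print(n)
```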
3. A die is thrown 9000 times and throw of 3 or 4 is observed 3240 times.
Show that the die cannot be regarded as an unbiased one.
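Solution:
Let x be the number of times 3 or 4 occurs.
n = 9000, x = 3240, $p = \dfrac{3240}{9000} = 0.36$
1: H0 : P = 1/3 (unbiased) against H1 : P ≠ 1/3 (biased, two-tailed test)
2: Level of significance α = 0.05
3: Test Statistic:
Check the conditions:
nP = 9000 × 1/3 = 3000 > 5, n(1 − P) = 9000 × 2/3 = 6000 > 5
Hence, taking P = 1/3 and Q = 2/3,
$Z = \dfrac{p - P}{\sqrt{PQ/n}} = \dfrac{0.36 - 1/3}{\sqrt{\dfrac{(1/3)(2/3)}{9000}}} = 5.37$
4: Critical region:
Since it is a two-tailed test, $Z_{\alpha/2} = 1.96$ and the critical region is
|Z| ≥ 1.96. Since Cal Z = 5.37 ≥ 1.96, we reject H0.
5: Conclusion:
The die cannot be regarded as unbiased.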
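The value of Z in this problem is sensitive to how P = 1/3 is rounded. The sketch below compares the exact fraction with the rounded value 0.333; the function name `one_proportion_z` is hypothetical. Either way Z is far beyond 1.96, so the decision does not change.

```python
import math


def one_proportion_z(x: int, n: int, p0: float) -> float:
    """Z statistic for H0: P = p0 from x successes in n trials."""
    p_hat = x / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


z_exact = one_proportion_z(3240, 9000, 1 / 3)     # P kept as 1/3
z_rounded = one_proportion_z(3240, 9000, 0.333)   # P rounded to 0.333
print(round(z_exact, 2), round(z_rounded, 2))
```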
More problems- Test of difference of proportions
1. A cigarette manufacturing firm claims that its brand A cigarette outsells
its brand B by 8%. If it is found that 42 out of a sample of 200 smokers
prefer brand A and 18 out of another random sample of 100 smokers
prefer brand B, test whether the 8% difference is a valid claim.
Solution:
Given n1 = 200, n2 = 100, x1 = 42, x2 = 18
$p_1 = \dfrac{42}{200} = 0.21$, $p_2 = \dfrac{18}{100} = 0.18$
1. H0 : P1 − P2 = 0.08 against H1 : P1 − P2 < 0.08 (left-tailed test)
2. Level of significance α = 5% = 0.05
3. n1 p1 = 200 × 0.21 = 42 > 5, n1 q1 = 200 × 0.79 = 158 > 5,
n2 p2 = 100 × 0.18 = 18 > 5, n2 q2 = 82 > 5
$P = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{42 + 18}{200 + 100} = 0.2$
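Solution contd.
$Z = \dfrac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{(0.21 - 0.18) - 0.08}{\sqrt{0.2 \times 0.8\left(\dfrac{1}{200} + \dfrac{1}{100}\right)}} = -1.02$
4. Critical region: for a left-tailed test at α = 0.05, the critical region is
Z < −1.645. Since Cal Z = −1.02 lies in the acceptance region, we accept H0
(or fail to reject H0).
5. Conclusion: There is no statistical evidence against the claimed 8% difference.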
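When the hypothesised difference P1 − P2 is not zero, it is subtracted in the numerator. A minimal sketch for the cigarette problem, assuming the pooled standard error used in this module and a left-tailed test at the 5% level; all variable names are illustrative.

```python
import math
from statistics import NormalDist

x1, n1 = 42, 200      # smokers preferring brand A
x2, n2 = 18, 100      # smokers preferring brand B
claimed_diff = 0.08   # H0: P1 - P2 = 0.08

p1, p2 = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)
standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = ((p1 - p2) - claimed_diff) / standard_error

z_critical = NormalDist().inv_cdf(0.05)  # lower 5% point, about -1.645
reject = z < z_critical
print(round(z, 2), "reject H0" if reject else "fail to reject H0")
```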
2. In a year, there were 956 births in town A, of which 52.5% were male,
while in towns A and B combined, the proportion of male births in a total
of 1406 births was 0.496. Is there any significant difference in the
proportion of male births in the two towns?
Hint:
n1 = 956, n1 + n2 = 1406 ⟹ n2 = 450
p1 = 0.525, and the combined proportion is $0.496 = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$
Hence p2 = 0.434.
H0 : P1 − P2 = 0 against H1 : P1 − P2 ≠ 0 (two-tailed)
Cal Z = 3.184, $Z_{\alpha/2} = 1.96$
Cal Z lies in the rejection region. Hence reject H0 (there is a significant
difference in the proportion of male births in the two towns).
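The hint can be followed step by step in code: recover n2 and p2 from the combined figures, then compute Z. This is a sketch with illustrative variable names. Without rounding p2 to 0.434, Z comes out at roughly 3.17 rather than 3.184; the decision is the same.

```python
import math

n1, n_total = 956, 1406
p1, pooled = 0.525, 0.496

n2 = n_total - n1
# Solve pooled = (n1*p1 + n2*p2) / (n1 + n2) for p2.
p2 = (pooled * n_total - p1 * n1) / n2

standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / standard_error

print(n2, round(p2, 3), round(z, 2))
print("reject H0" if abs(z) >= 1.96 else "fail to reject H0")
```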
More problems- Test of single mean
(3.23, 2.57)
More problems- Test of difference of means
Try this