Unit 12
12.1 INTRODUCTION
We often strongly believe some result to be true. But after taking a sample, we may
notice that the sample data do not wholly support the result. The difference is due to either (i)
the original belief being wrong, or (ii) the sample being slightly one-sided.
Tests are, therefore, needed to distinguish between the two possibilities. These tests tell
about the likely possibilities and reveal whether or not the difference can be due to only
chance elements. If the difference is not due to chance elements, it is significant and,
therefore, these tests are called tests of significance. The whole procedure is known as
Testing of Hypothesis.
Setting up and testing hypotheses is an essential part of statistical inference. In order to
formulate such a test, usually some theory has been put forward, either because it is
believed to be true or because it is to be used as a basis for argument, but has not been
proved. For example, the hypothesis may be the claim that a new drug is better than the
current drug for treatment of a disease, diagnosed through a set of symptoms.
In each problem considered, the question of interest is simplified into two competing
claims/hypotheses between which we have a choice; the null hypothesis, denoted by H0,
against the alternative hypothesis, denoted by H1. These two competing claims /
hypotheses are not however treated on an equal basis; special consideration is given to
the null hypothesis. We have two common situations :
(i) The experiment has been carried out in an attempt to disprove or reject a
particular hypothesis, the null hypothesis; thus we give that one priority so it
cannot be rejected unless the evidence against it is sufficiently strong. For
example, null hypothesis H0: there is no difference in taste between coke
and diet coke, against the alternate hypothesis H1: there is a difference in the
tastes.
(ii) If one of the two hypotheses is ‘simpler’, we give it priority so that a more
‘complicated’ theory is not adopted unless there is sufficient evidence
against the simpler one. For example, it is ‘simpler’ to claim that there is no
difference in flavour between coke and diet coke than it is to say that there is
a difference.
The hypotheses are often statements about population parameters like expected value and
variance. For example, H0 might be the statement that the expected value of the height of
ten-year-old boys in the Indian population is not different from that of ten-year-old girls.
A hypothesis might also be a statement about the distributional form of a characteristic of
interest; for example, that the height of ten-year-old boys is normally distributed within
the Indian population.
Objectives
After studying this unit, you should be able to
• understand the basic concepts of Testing of Hypothesis,
• explain Null Hypothesis,
• differentiate Type-I and Type-II errors,
• apply student’s t-distribution,
• appreciate Chi-square test, and
• understand the use of common statistical tests.
(iv) Critical Region : The region composed of extreme sample values that are
very unlikely outcomes if the null hypothesis is true. The boundaries of the
critical region are determined by the alpha level. If the sample data fall in the
critical region, the null hypothesis is rejected. The α-level you set therefore affects the
outcome of the research.
(v) Collect data and compute the sample statistic using the formula
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$
where x̄ = sample mean,
µ = hypothesised population mean, and
σ_x̄ = standard error between x̄ and µ, given by
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
(vi) Make a decision and write down the decision rule.
Z-Score Statistics
The z-score is called a test statistic. The purpose of a test statistic is to determine
whether the result of a research study (the obtained difference) is more than what
would be expected by chance alone.
$$z = \frac{\text{Obtained difference}}{\text{Difference due to chance}}$$
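The ratio above can be sketched in a few lines of Python. This is only an illustration: the function name `z_statistic` and the numbers passed to it are hypothetical, and the population standard deviation σ is assumed to be known.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Return z = (obtained difference) / (difference due to chance)."""
    obtained_difference = sample_mean - mu0
    standard_error = sigma / math.sqrt(n)  # difference expected by chance alone
    return obtained_difference / standard_error

# Hypothetical numbers, chosen only to illustrate the calculation.
z = z_statistic(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(round(z, 2))  # prints 2.0
```

A value of z far from zero indicates an obtained difference that is large relative to what chance alone would produce.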
Now suppose a manufacturer produces articles of good quality. A
purchaser selects a sample at random. It so happens that the sample contains
many defective articles, and this leads the purchaser to reject the whole lot. Now, the
manufacturer suffers a loss even though he has produced articles of good quality.
Therefore, this Type-I error is called the “producer's risk”.
On the other hand, if we accept the entire lot on the basis of a sample and the lot is not
really good, the consumers suffer a loss. Therefore, this Type-II error is called the
“consumer's risk”.
In practical situations, still other aspects are considered while accepting or rejecting a lot.
The risks involved for both producer and consumer are compared. Then Type-I and
Type-II errors are fixed; and a decision is reached.
(iii) Most people would probably agree that the Type-I error in this situation is
by far the more serious. Thus, we would want α, the probability of
committing a Type-I error, to be very small indeed.
A convention that is generally observed when formulating the null and alternative
hypotheses of any statistical test is to state H0 so that the possible error of
incorrectly rejecting H0 (Type-I error) is considered more serious than the possible
error of incorrectly failing to reject H0 (Type-II error). In many cases, the decision
as to which type of error is more serious is admittedly not as clear-cut as that of
Example 12.1; experience will help to minimize this potential difficulty.
Types of Errors for a Hypothesis Test
The goal of any hypothesis testing is to make a decision. In particular, we will
decide whether to reject the null hypothesis, H0, in favour of the alternative
hypothesis, H1. Although we would always like to make a correct
decision, we must remember that the decision will be based on sample information,
and thus we are liable to make one of two types of error, as defined in Table 12.2.
The null hypothesis can be either true or false. Further, we will make a conclusion either
to reject or not to reject the null hypothesis. Thus, there are four possible situations that
may arise in testing a hypothesis as shown in Table 12.3.
Table 12.3 : Conclusions and Consequences for Testing a Hypothesis

                                 Conclusions
“State of Nature”                Do Not Reject Null Hypothesis    Reject Null Hypothesis
Null Hypothesis True             Correct conclusion               Type-I error
Alternative Hypothesis True      Type-II error                    Correct conclusion
The kind of error that can be made depends on the actual state of affairs (which, of
course, is unknown to the investigator). Note that we risk a Type-I error only if the null
hypothesis is rejected, and we risk a Type-II error only if the null hypothesis is not
rejected. Thus, we may make no error, or we may make either a Type-I error (with
probability α), or a Type-II error (with probability β), but not both. We don't know which
type of error corresponds to actuality and so would like to keep the probabilities of both
types of errors small. For a fixed sample size, there is an intuitively appealing relationship between the
probabilities for the two types of error : as α increases, β decreases; similarly, as β
increases, α decreases. The only way to reduce α and β simultaneously is to increase the
amount of information available in the sample, i.e. to increase the sample size.
You may note that we have carefully avoided stating a decision in terms of “accept the
null hypothesis H0”. Instead, if the sample does not provide enough evidence to support
the alternative hypothesis H1 we prefer a decision “not to reject H0”. This is because, if
we were to “accept H0”, the reliability of the conclusion would be measured by β, the
probability of Type-II error. However, the value of β is not constant, but depends on the
specific alternative value of the parameter and is difficult to compute in most testing
situations.
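The dependence of β on α, on the sample size and on the specific alternative value can be made concrete with a short sketch. It assumes an upper-tailed z test of H0 : µ = µ0 with known σ. The function name `type_ii_error` and all the numbers are hypothetical and serve only as an illustration.

```python
import math
from statistics import NormalDist

def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """beta for an upper-tailed z test of H0: mu = mu0 when the true mean is mu_true."""
    se = sigma / math.sqrt(n)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    cutoff = mu0 + z_alpha * se  # H0 is rejected when the sample mean exceeds this
    # beta = P(sample mean <= cutoff) when the true mean is mu_true
    return NormalDist(mu_true, se).cdf(cutoff)

# Hypothetical setting: mu0 = 100, true mean 105, sigma = 15.
beta_alpha_05 = type_ii_error(100, 105, 15, n=36, alpha=0.05)
beta_alpha_01 = type_ii_error(100, 105, 15, n=36, alpha=0.01)   # smaller alpha
beta_larger_n = type_ii_error(100, 105, 15, n=144, alpha=0.05)  # larger sample
print(round(beta_alpha_05, 3), round(beta_alpha_01, 3), round(beta_larger_n, 3))
```

In this sketch, lowering α from 0.05 to 0.01 raises β, while increasing the sample size lowers β for the same α. Changing `mu_true` changes β as well, which is why β cannot be quoted as a single number for a test.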
Next, calculate
$$|z| = \frac{|\bar{x} - \mu|}{\sigma/\sqrt{n}}$$
(i) If | z | < 1.96, the difference is not significant at the 5% level and H0 is not rejected;
otherwise H0 is rejected.
(ii) If | z | < 2.58, the difference is not significant at the 1% level and H0 is not rejected;
otherwise H0 is rejected.
Note : We have assumed that σ is known. If, however, σ is not known, we take σ
to be equal to the S.D. of the sample.
Example 12.3
A random sample of 400 male students has an average weight of 55 kg. Can we say
that the sample comes from a population with mean 58 kg and variance 9 kg²?
Solution
The null hypothesis H0 is that the sample comes from the given population.
In notation : H0 : µ = 58 kg and H1 : µ ≠ 58 kg.
Now
$$|z| = \frac{|\bar{x} - \mu|}{\sigma/\sqrt{n}}$$
Insert x̄ = sample mean = 55 kg, µ = population mean = 58 kg, n = 400 and
σ = population S.D. = 3 kg.
$$\text{Therefore } |z| = \frac{|55 - 58|}{3/\sqrt{400}} = 20 > 2.58$$
This value is highly significant. We will reject H0 on the basis of this sample. The
sample, therefore, is not likely to be from the given population.
Example 12.4
A random sample of 400 tins of vegetable oil labelled “5 kg net weight” has a
mean net weight of 4.98 kg with a standard deviation of 0.22 kg. Do we reject the
hypothesis of a net weight of 5 kg per tin on the basis of this sample at the 1% level of
significance?
Solution
The null hypothesis H0 is that the mean net weight of the tins is 5 kg.
In notation, H0 : µ = 5 kg.
Inserting x̄ = 4.98 kg, µ = 5 kg, σ = 0.22 kg and n = 400 in
$$|z| = \frac{|\bar{x} - \mu|}{\sigma/\sqrt{n}},$$
we get
$$|z| = \frac{|4.98 - 5|}{0.22/\sqrt{400}} = \frac{0.02}{0.011} = 1.82 < 2.58$$
Hence H0 is not rejected at the 1% level of significance.
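The arithmetic of Example 12.4 can be checked with a few lines of Python. The variable names are illustrative only; the figures are those given in the example, and 2.58 is the two-tailed 1% critical value used in this unit.

```python
import math

# Figures taken from Example 12.4.
x_bar, mu0, s, n = 4.98, 5.0, 0.22, 400

standard_error = s / math.sqrt(n)          # 0.22 / 20 = 0.011
z_abs = abs(x_bar - mu0) / standard_error  # |4.98 - 5| / 0.011
reject_at_1_percent = z_abs > 2.58

print(round(z_abs, 2), reject_at_1_percent)  # prints 1.82 False
```

Because the standard error is 0.22/20 = 0.011, a shortfall of 0.02 kg corresponds to fewer than two standard errors.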
Application of Hypothesis Testing
In this section, we will present applications of the hypothesis-testing logic. Among
the population parameters to be considered are (µ1 − µ2), p, and (p1 − p2).
The concepts of a hypothesis test are the same for all these parameters; the null and
alternative hypotheses, test statistic, and rejection region all have the same general
form. However, the manner in which the test statistic is actually computed depends
on the parameter of interest. For example, we saw that the large-sample test
statistic for testing a hypothesis about a population mean µ is given by
$$|z| = \frac{|\bar{x} - \mu_0|}{\sigma/\sqrt{n}}$$
while the test statistic for testing a hypothesis about the parameter p is
$$|z| = \frac{|\hat{p} - p_0|}{\sqrt{p_0 q_0 / n}}$$
where p̂ and p0 denote the sample proportion and the theoretical proportion
respectively, and q0 = 1 − p0.
The key to correctly setting up a hypothesis test is to determine first the
parameter of interest. In this section, we will present several examples illustrating
how to determine the parameter of interest. The following are the key words to
look for when conducting a hypothesis test about a population parameter.
Table 12.4 : Determining the Parameter of Interest

Parameter      Description
µ              Mean; average
(µ1 − µ2)      Difference in means or averages; mean difference;
               comparison of means or averages
p              Proportion; percentage; fraction; rate
(p1 − p2)      Difference in proportions, percentages, fractions, or
               rates; comparison of proportions, percentages,
               fractions, or rates
σ²             Variance; variation; precision
Test Statistic :
$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
Rejection Region for H0 (one-tailed test) : z > zα (or z < − zα)
Rejection Region for H0 (two-tailed test) : z < − zα/2 or z > zα/2
where zα is the z-value such that P(z > zα) = α; and zα/2 is the z-value such that P(z > zα/2) = α/2.
[Note: µ0 is our symbol for the particular numerical value specified for µ in the null hypothesis.]
Assumption : The sample size must be sufficiently large (say, n ≥ 30) so that the sampling
distribution of x̄ is approximately normal and that s, the sample standard deviation, provides a good approximation to σ.
Example 12.5
The mean time spent on studies by all students at a university last year was
40 hours per week. This year, a random sample of 35 students at the university was
drawn. The following summary statistics were computed:
x̄ = 42.1 hours; s = 13.85 hours
Test the hypothesis that µ, the population mean time spent on studies per week, is
equal to 40 hours against the alternative that µ is larger than 40 hours. Use a
significance level of α = 0.05.
Solution
We have previously formulated the hypotheses as
H0 : µ = 40
H1 : µ > 40
Note that the sample size, n = 35, is sufficiently large so that the sampling
distribution of x̄ is approximately normal and that s provides a good
approximation to σ. Since the required assumption is satisfied, we may proceed
with a large-sample test of hypothesis about µ.
Using a significance level of α = 0.05, we will reject the null hypothesis for this
one-tailed test if z > zα = z0.05, i.e., if z > 1.645. This rejection region is shown
in Figure 12.3.
Computing the value of the test statistic, we obtain
$$z \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{42.1 - 40}{13.85/\sqrt{35}} = 0.897$$
Since this value does not fall within the rejection region (Figure 12.3), we do not
reject H0. We say that there is insufficient evidence (at α = 0.05) to conclude that
the mean time spent on studies per week of all students at the university this year is
greater than 40 hours. We would need to take a larger sample before we could
detect whether µ > 40, if in fact this were the case.
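The upper-tailed decision rule of Example 12.5 can be sketched as follows. The helper name `upper_tailed_z_test` is hypothetical. The critical value is obtained from the standard normal distribution in Python's `statistics` module rather than read from a table, and the sample mean 42.1 is the value used in the computation of the test statistic.

```python
import math
from statistics import NormalDist

def upper_tailed_z_test(x_bar, mu0, s, n, alpha):
    """Large-sample test of H0: mu = mu0 against H1: mu > mu0."""
    z = (x_bar - mu0) / (s / math.sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha)  # z_alpha: upper-tail area equals alpha
    return z, z_crit, z > z_crit

z, z_crit, reject = upper_tailed_z_test(x_bar=42.1, mu0=40, s=13.85, n=35, alpha=0.05)
print(round(z, 3), round(z_crit, 3), reject)  # prints 0.897 1.645 False
```

Since the whole of α is placed in the upper tail, the critical value is z0.05 = 1.645 and not z0.025 = 1.96.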
Example 12.6
A sugar refiner packs sugar into bags weighing, on average, 1 kilogram. The
setting of the machine tends to drift, i.e. the average weight of bags filled by the
machine sometimes increases and sometimes decreases. It is important to control the
average weight of bags of sugar. The refiner wishes to detect shifts in the mean
weight of bags as quickly as possible, and reset the machine. In order to detect
shifts in the mean weight, he will periodically select 50 bags, weigh them, and
calculate the sample mean and standard deviation. The data of one such periodic sample
are as follows :
x̄ = 1.03 kg, s = 0.05 kg
Test whether the population mean µ is different from 1 kg at significance level
α = 0.01.
Solution
We formulate the following hypotheses :
H0 : µ = 1
H1 : µ ≠ 1
Since the sample size (50) exceeds 30, we may proceed with the large-sample test about
µ. Because shifts in µ in either direction are important, the test is
two-tailed.
At significance level α = 0.01, we will reject the null hypothesis for this two-tailed
test if
z < − zα/2 = − z0.005 or z > zα/2 = z0.005
i.e., if z < − 2.576 or z > 2.576.
The value of the test statistic is computed as follows :
$$z \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1.03 - 1}{0.05/\sqrt{50}} = 4.243$$
Since this value is greater than the upper-tail critical value (2.576), we reject the
null hypothesis and accept the alternative hypothesis at the significance level of
1%. We would conclude that the overall mean weight was no longer 1 kg, and
would run a less than 1% chance of committing a Type-I error.
Example 12.7
When flipped 1000 times, a coin landed heads up 515 times. Does this support the
hypothesis that the coin is unbiased?
Solution
The null hypothesis is that the coin is unbiased.
In notation, H0 : P = P0, where P0 = 0.5 and q0 = 1 − P0 = 0.5.
Now the sample proportion is
$$\hat{p} = \frac{515}{1000} = 0.515$$
$$|z| = \frac{|\hat{p} - p_0|}{\sqrt{p_0 q_0 / n}} = \frac{0.515 - 0.5}{\sqrt{(0.5 \times 0.5)/1000}} = 0.949 < 1.96$$
and hence also | z | < 2.58. We therefore do not reject the null hypothesis. The data are
consistent with the coin being unbiased.
Example 12.8
While throwing 5 dice 40 times, a person got a success 25 times, where getting a 4 was
called a success. Can we consider the difference between the expected value and the
observed value as being significant?
Solution
If we carefully examine the data, then the hypothesis can be stated as: the dice are
unbiased.
In notation, H0 : P = P0, where
$$P_0 = {}^{5}C_{4} \left(\frac{1}{6}\right) \left(\frac{5}{6}\right)^{4} = \left(\frac{5}{6}\right)^{5} = 0.4019$$
and q0 = 1 – P0 = 0.5981,
i.e. H0 : P = 0.4019 and H1 : P ≠ 0.4019.
The sample proportion is
$$\hat{p} = \frac{25}{40} = 0.625$$
$$|z| = \frac{|\hat{p} - p_0|}{\sqrt{p_0 q_0 / n}} = \frac{0.625 - 0.4019}{\sqrt{(0.4019 \times 0.5981)/40}} = 2.88 > 2.58$$
Hence the hypothesis H0 is to be rejected at the 1% level of significance, or we can say
that the value obtained is highly significant. The given data do not support H0.
Thus the dice do not appear to be unbiased.
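The two proportion tests above (Examples 12.7 and 12.8) use the same statistic, which can be sketched as one small function. The name `proportion_z` is hypothetical, and the null proportion for the dice is computed from the expression given in Example 12.8.

```python
import math

def proportion_z(successes, n, p0):
    """Large-sample z statistic for H0: p = p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)  # sqrt(p0 * q0 / n)
    return (p_hat - p0) / standard_error

# Example 12.7: 515 heads in 1000 flips, H0: p = 0.5.
z_coin = proportion_z(515, 1000, 0.5)

# Example 12.8: 25 successes in 40 throws, with P0 as given in the example.
p0_dice = math.comb(5, 4) * (1 / 6) * (5 / 6) ** 4
z_dice = proportion_z(25, 40, p0_dice)

print(round(z_coin, 3), round(p0_dice, 4), round(z_dice, 2))
```

Comparing |z| with 1.96 or 2.58 then gives the decision at the 5% or 1% level respectively.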
Tests of Population Means using Small Samples
When the assumption required for a large-sample test of hypothesis about µ is
violated, we need a hypothesis-testing procedure that is appropriate for use with
small samples. If we use the methods of the large-sample test, we will run
into trouble on two counts. Firstly, our small sample will underestimate the
population variance, so our test statistic will be wrong. Secondly, the means of
small samples are not normally distributed, so our critical values will be wrong.
We have learnt that the means of small samples have a t-distribution, and the
appropriate t-distribution will depend on the number of degrees of freedom in
estimating the population variance. If we use large samples to test a hypothesis,
then the critical values we use will depend upon the type of test (one- or two-tailed).
But if we use small samples, then the critical values will depend upon the degrees
of freedom as well as the type of test.
A hypothesis test about a population mean, µ, based on a small sample (n < 30)
consists of the elements listed in Table 12.6.
Table 12.6 : Small-sample Test of Hypothesis about a Population Mean

ONE-TAILED TEST                        TWO-TAILED TEST
H0 : µ = µ0                            H0 : µ = µ0
H1 : µ > µ0 (or H1 : µ < µ0)           H1 : µ ≠ µ0

Test Statistic :
$$t = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

Rejection Region                       Rejection Region
t > tα (or t < − tα)                   t < − tα/2 or t > tα/2

where the distribution of t is based on (n – 1) degrees of freedom; tα is the
t-value such that P (t > tα) = α; and tα/2 is the t-value such that
P (t > tα/2) = α/2.
Assumption: The relative frequency distribution of the population from which
the sample was selected is approximately normal.
As we noticed in the development of estimation procedures, when we are making
inferences based on small samples, more restrictive assumptions are required than
when making inferences from large samples. In particular, this hypothesis test
requires the assumption that the population from which the sample is selected is
approximately normal.
Notice that the test statistic given in Table 12.6 is a t statistic and is calculated
exactly as our approximation to the large-sample test statistic, z, given earlier in
this section. Therefore, just like z, the computed value of t indicates the direction
and approximate distance (in units of standard deviations) that the sample mean,
x̄, is from the hypothesized population mean, µ0.
Example 12.9
The expected lifetime of electric light bulbs produced by a given process was 1500
hours. To test a new batch a sample of 10 was taken which showed a mean lifetime
of 1410 hours. The standard deviation is 90 hours. Test the hypothesis that the
mean lifetime of the electric light bulbs has not changed, using a level of
significance of α = 0.05.
Solution
This question asks us to test that the mean has not changed, so we must employ a
two-tailed test :
H0 : µ = 1500
H1: µ ≠ 1500
Since we are restricted to a small sample, we must make the assumption that the
lifetimes of the electric light bulbs have a relative frequency distribution that is
approximately normal. Under this assumption, the test statistic will have a
t-distribution with (n − 1) = (10 − 1) = 9 degrees of freedom. The rejection rule is
then to reject the null hypothesis for values of t such that
t < − tα /2 or t > tα /2 with α /2 = 0.05/2 = 0.025.
From the t-table with 9 degrees of freedom, we find that
t0.025 = 2.262.
The value of the test statistic is
$$t = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{1410 - 1500}{90/\sqrt{10}} = -3.1623$$
The computed value of the test statistic, t = − 3.1623, falls below the critical value
of − 2.262. We reject H0 and accept H1 at significance level of 0.05, and conclude
that there is some evidence to suggest that the mean lifetime of all light bulbs has
changed.
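The way the critical value depends on the degrees of freedom can be illustrated with a short sketch built around Example 12.9. The dictionary below holds a few two-tailed 5% values copied from a standard t-table; it is not a complete table, and the names `T_CRIT_025` and `small_sample_t` are hypothetical.

```python
import math

# t_{0.025} for selected degrees of freedom, from a standard t-table
# (only a few rows are listed here for illustration).
T_CRIT_025 = {5: 2.571, 9: 2.262, 10: 2.228, 15: 2.131, 22: 2.074, 30: 2.042}
Z_CRIT_025 = 1.960  # large-sample (normal) value, for comparison

def small_sample_t(x_bar, mu0, s, n):
    """Return (t, degrees of freedom) for a small-sample test about a mean."""
    return (x_bar - mu0) / (s / math.sqrt(n)), n - 1

# Example 12.9: n = 10 bulbs, mean 1410 hours, S.D. 90 hours, H0: mu = 1500.
t, df = small_sample_t(x_bar=1410, mu0=1500, s=90, n=10)
reject = abs(t) > T_CRIT_025[df]
print(round(t, 4), df, reject)  # prints -3.1623 9 True
```

Every tabulated t value exceeds 1.96 and approaches it as the degrees of freedom increase, which is why the large-sample critical values are too lenient for small samples.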
Testing the Difference between Means
If x̄1 and x̄2 denote the means of the samples drawn from the first and second
populations respectively, having means µ1 and µ2 and standard deviations σ1 and σ2,
and if the sizes of the samples are n1 and n2, then it can be proved that the
distribution of the difference between the means, x̄1 − x̄2, is normal with mean
(µ1 − µ2) and standard deviation given by
$$\sigma = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Therefore,
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
and, under the null hypothesis µ1 = µ2,
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
is the standard normal variate.
When the two samples belong to the same population, we have σ1 = σ2 = σ; then
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
Similarly, the confidence limits for (µ1 − µ2) at various levels of confidence are :
(i) (x̄1 − x̄2) ± 1.96 σ at the 95% level of confidence, where σ here denotes the standard
deviation of x̄1 − x̄2 given above.
Note 1 : For samples drawn from the same population, this standard error is
$$S = \text{S.E.} = \sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Note 2 : If the S.D.s of the two populations, i.e. σ1 and σ2, are unknown, we use the S.D.s of the
samples in their places.
Example 12.10
A group of 200 students has a mean height of 154 cm. Another group of
300 students has a mean height of 152 cm. Can these be from the same
population with S.D. of 5 cm?
Solution
H0 : µ1 = µ2, i.e. the samples are from the same population.
H1 : µ1 ≠ µ2. Here x̄1 = 154 cm, x̄2 = 152 cm, σ = 5 cm, n1 = 200 and n2 = 300.
Now,
$$|z| = \frac{|\bar{x}_1 - \bar{x}_2|}{\sigma \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{154 - 152}{5 \sqrt{\dfrac{1}{200} + \dfrac{1}{300}}} = 4.38 > 3$$
i.e. the z-score is highly significant. Therefore, we reject H0, i.e. it is not likely that
the two samples are from the same population.
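As a rough check of Example 12.10, the general two-sample statistic can be coded and compared with the common-σ form. The function name `two_sample_z` is hypothetical, and the population standard deviations are assumed to be known.

```python
import math

def two_sample_z(x1_bar, x2_bar, sigma1, sigma2, n1, n2):
    """z for H0: mu1 = mu2, with standard error sqrt(sigma1^2/n1 + sigma2^2/n2)."""
    standard_error = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return (x1_bar - x2_bar) / standard_error

# Example 12.10: both groups are assumed to come from a population with S.D. 5 cm.
z = two_sample_z(154, 152, 5, 5, 200, 300)

# With sigma1 = sigma2 = sigma the statistic reduces to the common-sigma form.
z_common = (154 - 152) / (5 * math.sqrt(1 / 200 + 1 / 300))

print(round(z, 2))  # prints 4.38
```

The two expressions agree, as they must when σ1 = σ2.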
Example 12.11
Suppose it is claimed that in a very large batch of components, about 10% of items
contain some form of defect. It is proposed to check whether this proportion has
increased, and this will be done by drawing randomly a sample of 150
components. In the sample, 20 are defectives. Does this evidence indicate that the
true proportion of defective components is significantly larger than 10%? Test at
significance level α = 0.05.
Solution
We wish to perform a large-sample test about a population proportion, p :
H0 : p = 0.10 (i.e., no change in proportion of defectives)
H1 : p > 0.10 (i.e., proportion of defectives has increased)
where p represents the true proportion of defectives.
At significance level α = 0.05, the rejection region for this one-tailed test consists
of all values of z for which
z > z0.05 = 1.645
The test statistic requires the calculation of the sample proportion, p̂, of defectives :
$$\hat{p} = \frac{20}{150} = 0.133$$
Noting that q0 = 1 – p0 = 1 – 0.10 = 0.90, we obtain the following value of the test
statistic :
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{0.133 - 0.10}{\sqrt{(0.10)(0.90)/150}} = 1.347$$
This value of z lies outside the rejection region, so we conclude that the excess of the
sample proportion of defectives over 10% is not significant. We have no evidence to reject
the null hypothesis that the proportion defective is 0.10 at the 5% level of
significance. Since H0 was not rejected, the only error we could have made is a Type-II error
(not rejecting H0 when, in fact, it is not true); its probability β depends on the true
value of p and is not fixed by the choice of α = 0.05.
[Note that the interval
$$\hat{p} \pm 2\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.133 \pm 2\sqrt{\frac{(0.133)(1 - 0.133)}{150}} = 0.133 \pm 0.056$$
does not contain 0 or 1. Thus, the sample size is large enough to guarantee the
validity of the hypothesis test.]
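The sample-size check in the note can be written as a small helper. The name `sample_large_enough` is hypothetical; the rule coded here is the one stated in the note, namely that the interval p̂ ± 2√(p̂q̂/n) must not contain 0 or 1. The second call uses made-up numbers to show a case where the check fails.

```python
import math

def sample_large_enough(p_hat, n):
    """Check that p_hat +/- 2*sqrt(p_hat*q_hat/n) lies strictly inside (0, 1)."""
    half_width = 2 * math.sqrt(p_hat * (1 - p_hat) / n)
    lower, upper = p_hat - half_width, p_hat + half_width
    return (lower > 0 and upper < 1), (lower, upper)

# Example 12.11: 20 defectives in a sample of 150.
ok, (lower, upper) = sample_large_enough(20 / 150, 150)
print(ok, round(lower, 3), round(upper, 3))

# Hypothetical small survey: 1 defective in 50 items.
ok_small, _ = sample_large_enough(1 / 50, 50)
print(ok_small)
```

For the data of Example 12.11 the interval lies well inside (0, 1), whereas for the hypothetical small survey the lower limit falls below 0, so the large-sample test would not be appropriate there.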
Although small-sample procedures are available for testing hypotheses about a
population proportion, the details are omitted from our discussion. It is our experience
that they are of limited utility, since most surveys of binomial populations performed in
practice use samples that are large enough to employ the techniques of this section.
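$$t = \frac{|\bar{x} - \mu|}{\sqrt{\sigma^2 / n}} \quad \text{or} \quad t = \frac{|\bar{x} - \mu|}{\sigma} \sqrt{n}$$
where x̄ = sample mean, µ = actual or hypothetical mean of population,
n = sample size, σ = standard deviation of sample, with
$$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$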
Figure 12.4
Also note that the t-distribution is lower at the mean and higher at the
tails than the normal distribution, i.e. the t-distribution has
proportionally greater area at its tails than the normal distribution.
(iii) (a) If | t | exceeds t0.05, then the difference between x̄ and µ is significant at the
0.05 level of significance.
(b) If | t | exceeds t0.01, then the difference is said to be highly significant at the 0.01
level of significance.
(c) If | t | < t0.05, we conclude that the difference between x̄ and µ is not
significant and the sample might have been drawn from a population
with mean = µ, i.e. the data are consistent with the hypothesis.
(iv) Fiducial limits of population mean
For 95% :
$$\bar{x} \pm t_{0.05} \frac{\sigma}{\sqrt{n}}$$
For 99% :
$$\bar{x} \pm t_{0.01} \frac{\sigma}{\sqrt{n}}$$
Example 12.12
A random sample of 16 values from a normal population is found to have a mean
of 41.5 and standard deviation of 2.795. On this basis, is there any reason to reject
the hypothesis that the population mean µ = 43? Also find the confidence limits for
µ.
Solution
Null hypothesis : H0 : µ = 43 and
Alternative hypothesis : H1 : µ ≠ 43.
Here n = 16, so the degrees of freedom are n − 1 = 15; x̄ = 41.5, σ = 2.795 and µ = 43.
$$\text{Now } t = \frac{|\bar{x} - \mu|}{\sigma} \sqrt{15} = \frac{1.5 \times \sqrt{15}}{2.795} = 2.078$$
From the t-table for 15 degrees of freedom at the 0.05 level of significance, the value
of t is 2.13. Since 2.078 < 2.13, the difference between x̄ and µ is not significant.
Thus there is no reason to reject H0. To find the limits, using for 95%,
$$\bar{x} \pm t_{0.05} \frac{\sigma}{\sqrt{n}} = 41.5 \pm \frac{2.795}{\sqrt{16}} \times 2.13 = 41.5 \pm (0.6988)(2.13)$$
= (40.011, 42.988)
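The fiducial (confidence) limits can be computed with a short helper. The function name `fiducial_limits` is hypothetical; it follows the formula x̄ ± t σ/√n and takes the critical value as an argument, since the value is read from the t-table. The call below reproduces the figures of Example 12.12.

```python
import math

def fiducial_limits(x_bar, s, n, t_crit):
    """Return (lower, upper) limits x_bar -/+ t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Example 12.12: mean 41.5, S.D. 2.795, n = 16, tabulated t_0.05 = 2.13 for 15 d.f.
lower, upper = fiducial_limits(41.5, 2.795, 16, 2.13)
print(round(lower, 3), round(upper, 3))
```

Replacing the critical value by t0.01 for the same degrees of freedom would give the wider 99% limits.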
Example 12.13
Ten individuals are chosen at random from the population and their heights are
found to be, in inches, 63, 63, 64, 65, 66, 69, 69, 70, 70, 71. Discuss the suggestion
that the mean height in the universe is 65 inches, given that for 9 degrees of freedom
the value of Student's ‘t’ at the 0.05 level of significance is 2.262.
Solution
xi = 63, 63, 64, 65, 66, 69, 69, 70, 70, 71 and n = 10
$$\therefore \bar{x} = \frac{\sum x_i}{n} = \frac{670}{10} = 67$$
$$\text{and } \sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{88}{9}} = 3.13 \text{ inches}$$
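One way the calculation of Example 12.13 could be carried through from the raw data is sketched below. It assumes the statistic t = (x̄ − µ)√n/σ with σ computed using the divisor n − 1, which is what `statistics.stdev` returns, and it uses the tabulated value 2.262 quoted in the example. The variable names are illustrative only.

```python
import math
import statistics

heights = [63, 63, 64, 65, 66, 69, 69, 70, 70, 71]  # inches, from Example 12.13
mu0 = 65         # hypothesised mean height
t_crit = 2.262   # t at the 0.05 level for 9 d.f., as quoted in the example

n = len(heights)
x_bar = statistics.mean(heights)
s = statistics.stdev(heights)           # divisor n - 1
t = (x_bar - mu0) / (s / math.sqrt(n))

consistent_with_h0 = abs(t) < t_crit
print(x_bar, round(s, 2), round(t, 2), consistent_with_h0)
```

Under these assumptions the computed | t | is compared with 2.262 exactly as in rule (iii) above.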
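$$\sigma_1^2 = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} (x_i - \bar{x}_1)^2 \quad \text{and} \quad \sigma_2^2 = \frac{1}{n_2 - 1} \sum_{i=1}^{n_2} (x_i - \bar{x}_2)^2$$
Then the statistic t is given by
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_P^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$
where
$$\sigma_P^2 = \frac{(n_1 - 1)\,\sigma_1^2 + (n_2 - 1)\,\sigma_2^2}{n_1 + n_2 - 2}$$
is called the pooled estimate of the population variance.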
Example 12.15
Two types of drugs were used on 5 and 7 patients for reducing their weights in
Iswari’s ‘slim-beauty’ health club. Drug A was allopathic and drug B was herbal.
The decrease in weight after using the drugs for six months was as follows :
Drug A : 10 12 13 11 14
Drug B : 8 9 12 14 15 10 9
Is there a significant difference in the efficacy of the two drugs? If yes, which
drug should you buy?
Solution
Let the null hypothesis be H0 : µ1 = µ2 or H0 : µ1 − µ2 = 0.
Alternative hypothesis H1 : µ1 ≠ µ2 or H1 : µ1 − µ2 ≠ 0

x1i    (x1i − x̄1)    (x1i − x̄1)²    x2i    (x2i − x̄2)    (x2i − x̄2)²
10     −2            4              8      −3            9
12      0            0              9      −2            4
13      1            1              12      1            1
11     −1            1              14      3            9
14      2            4              15      4            16
                                    10     −1            1
                                    9      −2            4

$$\sum_{i=1}^{5} x_{1i} = 60, \quad \sum (x_{1i} - \bar{x}_1)^2 = 10 \quad \text{and} \quad \sum_{i=1}^{7} x_{2i} = 77, \quad \sum (x_{2i} - \bar{x}_2)^2 = 44$$
$$\text{Now, } \bar{x}_1 = \frac{\sum x_{1i}}{n_1} = \frac{60}{5} = 12 \quad \text{and} \quad \bar{x}_2 = \frac{\sum x_{2i}}{n_2} = \frac{77}{7} = 11$$
Also
$$\sigma_P^2 = \frac{(n_1 - 1)\,\sigma_1^2 + (n_2 - 1)\,\sigma_2^2}{n_1 + n_2 - 2},$$
$$\text{where } \sigma_1^2 = \frac{\sum (x_{1i} - \bar{x}_1)^2}{n_1 - 1} = \frac{10}{5 - 1} = 2.5 \quad \text{and} \quad \sigma_2^2 = \frac{\sum (x_{2i} - \bar{x}_2)^2}{n_2 - 1} = \frac{44}{7 - 1} = 7.3$$
$$\text{Therefore, } \sigma_P^2 = \frac{(5 - 1)\,2.5 + (7 - 1)\,7.3}{5 + 7 - 2} = \frac{4 \times 2.5 + 6 \times 7.3}{10} = 5.38$$
Then using the formula
$$t = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_P^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}, \quad \text{where } \mu_1 - \mu_2 = 0 \text{ for } H_0,$$
we get
$$t = \frac{12 - 11}{\sqrt{5.38 \left( \dfrac{1}{5} + \dfrac{1}{7} \right)}} = \frac{1}{\sqrt{5.38 \times \dfrac{12}{35}}} = \frac{1}{\sqrt{5.38 \times 0.343}} = 0.736$$
Now ν (df) = n1 + n2 − 2 = 10
For ν = 10, t0.05 = 2.228
Therefore, 0.736 < 2.228
Thus the null hypothesis is not rejected. Hence there is no significant difference in the
efficacy of the two drugs. Since there is no difference in efficacy between the two and
drug B is herbal, with no side effects, we should buy the herbal drug.
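The pooled-variance calculation of Example 12.15 can be sketched from the raw data as follows. The function name `pooled_t` is hypothetical. Because the code keeps σ2² = 44/6 unrounded, it gives a pooled variance of 5.4 and t ≈ 0.735, which differ slightly from the hand calculation, where 7.3 was used.

```python
import math
import statistics

def pooled_t(sample1, sample2):
    """Two-sample t statistic using the pooled estimate of the variance."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)  # divisor n1 - 1
    var2 = statistics.variance(sample2)  # divisor n2 - 1
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    mean_diff = statistics.mean(sample1) - statistics.mean(sample2)
    t = mean_diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, pooled_var, n1 + n2 - 2

drug_a = [10, 12, 13, 11, 14]
drug_b = [8, 9, 12, 14, 15, 10, 9]
t, pooled_var, df = pooled_t(drug_a, drug_b)
print(round(t, 3), round(pooled_var, 2), df)
```

The decision is unchanged by the rounding: | t | remains far below the tabulated value 2.228 for 10 degrees of freedom.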
Example 12.16
To test the effect of a fertilizer on rice production, 24 equal plots of a certain land
were selected. Half of them were treated with fertilizer, leaving the rest untreated.
Other conditions were the same. The mean production of rice on the untreated plots
was 4.8 quintals with a standard deviation of 0.4 quintal, while the mean yield on the
treated plots was 5.1 quintals with a standard deviation of 0.36 quintal. Can we say
that there is a significant improvement in the production of rice due to the use of
fertilizer at the 0.05 level of significance?
Solution
The null hypothesis H0 : µ1 = µ2 or H0 : µ1 − µ2 = 0
Alternative hypothesis H1 : µ1 ≠ µ2 or H1 : µ1 − µ2 ≠ 0
(or, one-sided, H1 : µ2 > µ1, i.e. the fertilizer improved the yield).
Given x̄1 = 4.8, n1 = 12, σ1 = 0.4 (untreated) and x̄2 = 5.1, n2 = 12, σ2 = 0.36 (treated),
$$\therefore \sigma_P^2 = \frac{11 (0.4)^2 + 11 (0.36)^2}{22} = 0.1448$$
Using the formula
$$t = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_P^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$
and taking the difference as treated minus untreated so that t is positive, we get
$$t = \frac{5.1 - 4.8 - 0}{\sqrt{0.1448 \left( \dfrac{1}{12} + \dfrac{1}{12} \right)}} = 1.93$$
For ν (df) = 12 + 12 − 2 = 22, the two-tailed value is t0.05 = 2.07
Therefore, 1.93 < 2.07
Thus we do not reject H0 against the two-sided alternative, i.e. there is no significant difference in rice production due to
the use of fertilizer.
12.5.1 Two Tailed and One Tailed Tests
While testing a hypothesis, we often talk of two-tailed tests and one-tailed tests. In the
previous tests the critical region lay along both the tails of the distributions. That is, we
did not want sample statistic (say mean) to be away from the population parameter (say
mean) in either direction. The test for such a hypothesis is non-directional or two-sided or
two-tailed. A two-tailed test of hypothesis will reject the null hypothesis Ho, if the sample
statistic is significantly higher than or lower than the hypothesized population parameter.
Thus in two-tailed test, the rejection (critical) region is located in both the tails.
For example, suppose you suspect that a particular 6th grader’s performance on a test in
Mathematics is not truly representative of the students who have appeared. The national
mean score in this test was found to be 75. The alternative (or research) hypothesis is :
H1 : µ ≠ 75 while the null hypothesis is : H0 : µ = 75.
Now our pre-determined probability level is 95%, i.e. a 5% level of significance for this
test. The test has a rejection (or critical) region of 5%, i.e. 0.05. This rejection
region is divided between both the tails of the distribution (Figure 12.5), i.e. 2.5% or 0.025
in the upper tail and 2.5% or 0.025 in the lower tail, since your hypothesis gives only a
difference and not a direction. You will reject the null hypothesis if the
sample mean falls into the area beyond 1.96 S.E. on either side. Otherwise, if it falls within the area of 0.475
on either side of the mean, which corresponds to 1.96 S.E., you do not reject the null hypothesis.
Figure 12.5
Suppose you want to reduce the risk of committing a Type-I error. Then reduce the size of
the rejection region: if the hypothesis is tested at the 1%, i.e. 0.01, level of significance and
we consult the table of areas under the normal curve, we find that the acceptance region of
0.495 on each side (one half of 0.99) extends to 2.58 S.E. from µH, i.e. from z-score = 0.
Figure 12.6
You will still reject the null hypothesis of no difference, if the class sample is either
much higher or much lower than our population mean of 75.
As distinguished from the two-tailed test, we can also apply a directional (one-sided, i.e.
one-tailed) test, because in some cases it is necessary to guard against only small (or only large) values of
x̄ (i.e. the sample mean). A one-tailed test is so called because the rejection region is
located in only one tail, which may be either on the upper or the lower side of the
distribution depending upon the form of the alternative hypothesis (H1). For example, suppose we
want to test the null hypothesis that the average income per household is Rs. 5000
against the alternative hypothesis that the income is greater than Rs. 5000. We will place all
the α risk on the upper side of the theoretical sampling distribution and the test will be
one-tailed. On the other hand, if we are testing that the average income per household is
Rs. 5000 against H1 that the income is less than Rs. 5000, the α risk is on the
lower side of the distribution and the test will again be one-sided.
Figure 12.7
Summing up, if the population’s specified mean is say µ0, then the null hypothesis
would be H0 : µ = µ0 and alternative (researcher’s) hypothesis could be either one
of
(i) H1 : µ ≠ µo (i.e. µ > µo or µ < µo).
(ii) H1 : µ > µo or
(iii) H1 : µ < µo
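The effect of the form of H1 on the critical value can be seen by computing the normal critical values directly. This is a small sketch using the standard normal distribution from Python's `statistics` module; the function name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(alpha, tails):
    """Critical z value: alpha in one tail (tails=1) or split over both (tails=2)."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)

for alpha in (0.05, 0.01):
    one_tailed = z_critical(alpha, 1)  # used for H1: mu > mu0 (or mu < mu0, with sign reversed)
    two_tailed = z_critical(alpha, 2)  # used for H1: mu != mu0
    print(alpha, round(one_tailed, 3), round(two_tailed, 3))
```

At the 5% level this gives 1.645 for a one-tailed test and 1.960 for a two-tailed test; at the 1% level the values are 2.326 and 2.576, which are the figures quoted as 2.33 and 2.58 in this unit.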
Example 12.17
Past records show that the mean marks of students taking statistics are 60 with
standard deviation of 15 marks. A new method of teaching is adopted and a
random sample of 64 students is chosen. After using the new method, the sample
gives the mean marks of 65. Is the new method better?
Solution
Here we are interested in knowing whether the marks increased on using the new
teaching method. Therefore, we use the one-tailed method :
The null hypothesis is : H0 : µ = 60
The alternative hypothesis is : H1 : µ > 60.
We have x̄ = 65, µ = 60, σ = 15 and n = 64; then
\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{65 - 60}{15 / \sqrt{64}} = \frac{5}{1.875} \approx 2.67 \]
Now suppose the researcher had predetermined the level of significance as
0.01 or 1% for his decision. Then 2.67 > 2.33 (here 2.33 is the z-score for the 0.01 level
on the upper tail of the distribution). Therefore, the observed value is highly
significant. That is, H0 is rejected and H1 is accepted. This means the new teaching
method is better.
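The calculation in Example 12.17 can be checked with a short Python sketch. The function name `one_sample_z` is only illustrative, and the critical value is taken from the standard normal distribution in Python's `statistics` module rather than from the printed table.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """z-score of a sample mean under H0: mu = mu0, with known sigma."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


# Figures from Example 12.17
z = one_sample_z(sample_mean=65, mu0=60, sigma=15, n=64)

# Upper-tail critical value for alpha = 0.01 (the table value 2.33)
z_crit = NormalDist().inv_cdf(1 - 0.01)

reject_h0 = z > z_crit
print(round(z, 2), round(z_crit, 2), reject_h0)  # prints: 2.67 2.33 True
```

Because the alternative is H1 : µ > 60, only the upper tail is used; for a two-tailed test the critical value would be taken at 1 − α/2 instead.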
Example 12.18
A manufacturer of an antibiotic claimed that his antibiotic was 90% effective in
curing a certain type of V. D. if used for a duration of 8 weeks. In a sample of
200 people who tried this, 160 people were cured. Determine whether his claim is
legitimate.
Solution
Let P = Probability for curing the V. D. by the use of the manufacturer’s antibiotic.
Setting two types of hypothesis as :
Null hypothesis : H0: P = 0.9 ⇒ claim is supported.
Researcher’s hypothesis : H1 : P < 0.9 ⇒ claim is rejected.
Figure 12.8
Now p̂ = proportion of successes in the given sample:
\[ \hat{p} = \frac{160}{200} = 0.8 \]
Now P = 0.9 ⇒ Q = 0.1.
Therefore,
\[ \sqrt{\frac{PQ}{n}} = \sqrt{\frac{(0.9)(0.1)}{200}} = 0.0212 \]
Thus the corresponding z-score will be
\[ z = \frac{\hat{p} - P}{\sqrt{PQ/n}} = \frac{0.8 - 0.9}{0.0212} = \frac{-0.1}{0.0212} = -4.71 \]
which is much less than − 2.33. Thus by our decision rule H0 is rejected and H1 is
accepted stating that his claim is not legitimate and that the sample results are
highly significant (at 0.01 level of significance).
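As a check on Example 12.18, the same one-sample proportion test can be sketched in Python. The function name `one_proportion_z` is an illustrative choice; the critical value −2.33 is the lower-tail table value used in the text.

```python
from math import sqrt


def one_proportion_z(successes, n, p0):
    """z-score of a sample proportion under H0: P = p0."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


# Figures from Example 12.18: 160 cured out of 200, claimed P = 0.9
z = one_proportion_z(successes=160, n=200, p0=0.9)

Z_CRIT_LOWER = -2.33  # lower-tail table value for alpha = 0.01
reject_h0 = z < Z_CRIT_LOWER
print(round(z, 2), reject_h0)  # prints: -4.71 True
```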
Test of Significance for Small Samples
So far we have discussed problems belonging to large samples. When a small
sample (size < 30) is considered, the above tests are inapplicable because the
assumptions we made for large-sample tests do not hold good for small samples.
In the case of small samples, it is not possible to assume (i) that the random sampling
distribution of a statistic is normal and (ii) that the sample values are sufficiently close to
the population values to calculate the S.E. of the estimate.
Thus an entirely new approach is required to deal with problems of small samples.
But one should note that the methods and theory of small samples are applicable to
large samples, while the converse is not true.
Degree of Freedom
By degrees of freedom (df) we mean the number of classes to which values
can be assigned arbitrarily or at will without violating the restrictions or
limitations placed.
For example, suppose we are asked to choose any 4 numbers whose total is 50.
Clearly we are free to choose any 3 numbers, say 10, 23 and 7, but the
fourth number, 10, is fixed since the total is 50 [50 – (10 + 23 + 7) = 10].
Thus we are given one restriction, hence the freedom of selection of numbers is
4 − 1 = 3.
The degrees of freedom are denoted by ν (nu) or df and are given by
ν = n − k, where n = number of classes and k = number of independent
constraints (or restrictions).
In general, for a Binomial distribution, ν = n − 1.
For a Poisson distribution, ν = n − 2 (since we use the total frequency and
the arithmetic mean).
For a normal distribution, ν = n − 3 (since we use the total frequency, mean and
standard deviation), etc.
12.5.2 Hypothesis Tests about the Difference between Two Population Means
There are two brands of coffee, A and B. Suppose a consumer group wishes to determine
whether the mean price per kg of brand A exceeds the mean price per kg of
brand B. That is, the consumer group will test the null hypothesis H0: (µ1 − µ2) = 0
against the alternative (µ1 − µ2) > 0. The large-sample procedure described in Table 12.7
is applicable for testing a hypothesis about (µ1 − µ2), the difference between two
population means.
Table 12.7 : Large-sample Test of Hypothesis about (µ1 − µ2)
Example 12.19
A consumer group selected independent random samples of super-markets
located throughout a country for the purpose of comparing the retail prices per kg
of coffee of brands A and B. The results of the investigation are summarised in
Table 12.8. Does this evidence indicate that the mean retail price per kg of brand A
coffee is significantly higher than the mean retail price per kg of brand B coffee?
Use a significance level of α = 0.01.
Table 12.8 : Coffee Prices
                       Brand A          Brand B
Sample size            n1 = 75          n2 = 64
Sample mean            x̄1 = Rs. 300     x̄2 = Rs. 295
Standard deviation     σ1 = Rs. 11      σ2 = Rs. 9
Solution
The consumer group wants to test the hypotheses
H0 : (µ1 − µ2) = 0 (i.e., no difference between mean retail prices)
H1 : (µ1 − µ2) > 0 (i.e., mean retail price per kg of brand A is higher than that of
brand B)
where, µ1 = Mean retail price per kg of brand A coffee at all
super-markets, and
µ2 = Mean retail price per kg of brand B coffee at all super-markets.
This one-tailed, large-sample test is based on a z statistic. Thus, we will reject H0 if
z > zα = z0.01. Since z0.01 = 2.33, the rejection region is given by z > 2.33
(Figure 12.9.)
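We compute the test statistic as follows :
\[ z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(300 - 295) - 0}{\sqrt{\dfrac{(11)^2}{75} + \dfrac{(9)^2}{64}}} = 2.947 \]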
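The following Python sketch reproduces this test statistic and applies the rejection rule z > 2.33 stated above. The function name `two_sample_z` and the parameter `d0` (the hypothesised difference D0) are illustrative names, not part of the unit.

```python
from math import sqrt


def two_sample_z(mean1, mean2, sd1, sd2, n1, n2, d0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = d0."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return ((mean1 - mean2) - d0) / standard_error


# Figures from Table 12.8 (coffee prices)
z = two_sample_z(mean1=300, mean2=295, sd1=11, sd2=9, n1=75, n2=64)

Z_CRIT = 2.33  # z_0.01, upper-tail table value used in the text
in_rejection_region = z > Z_CRIT
print(round(z, 3), in_rejection_region)  # prints: 2.947 True
```

With these figures the statistic falls in the rejection region, so at α = 0.01 the data support the alternative that brand A has the higher mean price.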
Example 12.20
There was a research on the weights at birth of the children of urban and rural
women. The researcher suspects there is a significant difference between the mean
weights at birth of children of urban and rural women. To test this hypothesis, he
selects independent random samples of weights at birth of children of mothers
from each group, calculates the mean weights and standard deviations and
summarizes them in Table 12.10. Test the researcher’s belief, using a significance level of
α = 0.02.
Table 12.10 : Weight at Birth Data
Solution
The researcher wants to test the following hypothesis :
H0: (µ1 − µ2) = 0 (i.e., no difference between mean weights at birth)
H1: (µ1− µ2) ≠ 0 (i.e., mean weights at birth of children of urban and rural
women are different) where µ1 and µ2 are the true mean weights at birth of
children of urban and rural women, respectively.
Since the sample sizes for the study are small (n1 = 15, n2 = 14), the following
assumptions are required:
(i) The two populations of weights at birth of children both have approximately
normal distributions.
(ii) The variances of the populations of weights at birth of children for two
groups of mothers are equal.
(iii) The samples were independently and randomly selected.
If these three assumptions are valid, the test statistic will have a t-distribution with
(n1 + n2 − 2) = (15 + 14 − 2) = 27 degrees of freedom. With a significance level of
α = 0.02, the rejection region (Figure 12.10) is given by
t < − t0.01 = − 2.473 or t > t0.01 = 2.473
Using this pooled sample variance in the computation of the test statistic, we
obtain
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\sigma_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{(3.5933 - 3.2029) - 0}{\sqrt{0.1881 \left( \dfrac{1}{15} + \dfrac{1}{14} \right)}} = 2.422 \]
Now the computed value of t does not fall within the rejection region; thus, we fail
to reject the null hypothesis (at α = 0.02) and conclude that there is insufficient
evidence of a difference between the mean weights at birth of children of urban
and rural women.
In this example, we can see that the computed value of t is very close to the upper
boundary of the rejection region. This region is specified by the significance level and
the degrees of freedom. How is the conclusion about the difference between the mean
weights at birth affected if the significance level is α = 0.05? We will answer the
question in the next example.
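A small Python sketch of the pooled t statistic in Example 12.20 is given below. It takes the sample means and the pooled variance 0.1881 exactly as quoted in the solution; the function name `pooled_t` is illustrative, and the critical value 2.473 is the table value used above.

```python
from math import sqrt


def pooled_t(mean1, mean2, pooled_var, n1, n2, d0=0.0):
    """Two-sample t statistic using a pooled variance (equal-variance assumption)."""
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return ((mean1 - mean2) - d0) / standard_error


# Figures quoted in the solution of Example 12.20
t = pooled_t(mean1=3.5933, mean2=3.2029, pooled_var=0.1881, n1=15, n2=14)
df = 15 + 14 - 2

T_CRIT = 2.473  # t_0.01 with 27 df (two-tailed test at alpha = 0.02)
reject_h0 = abs(t) > T_CRIT
print(round(t, 3), df, reject_h0)  # prints: 2.422 27 False
```

The decision depends only on comparing |t| with the tabulated critical value, so repeating the test at another significance level needs nothing more than a different `T_CRIT` from the t table.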
12.5.3 Test for Difference between Proportions
If two samples are drawn from different populations, we may be interested in finding out
whether the difference between the proportions of successes is significant or not. Let x1
and x2 be the number of items possessing the attribute A in random samples of sizes
n1 and n2 from the two populations respectively. Then the sample proportions of successes
are
\[ P_1 = \frac{x_1}{n_1} \quad \text{and} \quad P_2 = \frac{x_2}{n_2}, \]
which estimate the proportions of successes in the two populations.
Under the hypothesis that the proportions in the two populations are equal (to a common
value P, with Q = 1 – P),
\[ z = \frac{P_1 - P_2}{\sqrt{P Q \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} \]
In general, however, we do not know the population’s proportion of success. In such a
case, we can replace P by its best estimate, the pooled estimate of the actual proportion in
the population, where
\[ \text{Pooled estimate } (P) = \frac{n_1 P_1 + n_2 P_2}{n_1 + n_2} \quad \text{or} \quad P = \frac{x_1 + x_2}{n_1 + n_2} \quad \text{and} \quad Q = 1 - P. \]
Example 12.21
A machine produced 16 defective articles in a batch of 500. After overhauling, it
produced 3 defectives in a batch of 100. Has the machine improved?
Solution
H0 : P1 = P2, i.e. the machine has not improved after overhauling. H1 : P1 ≠ P2.
\[ \text{Now } P_1 = \frac{16}{500} = 0.032 \quad \text{and} \quad P_2 = \frac{3}{100} = 0.030. \]
Pooled estimate of the actual proportion in the population is given by
\[ P = \frac{x_1 + x_2}{n_1 + n_2} = \frac{16 + 3}{500 + 100} = 0.032 \]
Q = 1 – P = 0.968
\[ |z| = \frac{|P_1 - P_2|}{\sqrt{P Q \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{0.032 - 0.030}{\sqrt{0.032 \times 0.968 \left( \dfrac{1}{500} + \dfrac{1}{100} \right)}} = \frac{0.002}{0.019} = 0.105 < 1.96 \text{ (at 5\% level)} \]
H0 is not rejected, i.e. the sample gives no evidence that the machine has improved
significantly.
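The pooled two-proportion test of Example 12.21 can be sketched in Python as follows. The function name `two_proportion_z` is illustrative. Because the sketch keeps the pooled proportion unrounded (19/600 rather than 0.032), it gives a value of about 0.104 instead of the hand-rounded 0.105; the conclusion is the same.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: P1 = P2, using the pooled estimate of P."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # pooled estimate
    q = 1 - p
    standard_error = sqrt(p * q * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Figures from Example 12.21: defectives before and after overhauling
z = two_proportion_z(x1=16, n1=500, x2=3, n2=100)

Z_CRIT = 1.96  # two-tailed table value at the 5% level
reject_h0 = abs(z) > Z_CRIT
print(round(z, 3), reject_h0)
```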
Example 12.22
There are 1000 students in a college out of 20000 students in the whole university.
In a study 200 were found smokers in the college and 1000 in the university. Is
there a significant difference between the proportion of smokers in the college and
in the university?
Solution
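H0 : P1 = P2, i.e. there is no significant difference between the college and the
university in the proportion of smokers. H1 : P1 ≠ P2.
\[ \text{Proportion of smokers in the college, } P_1 = \frac{200}{1000} = 0.20 \]
\[ \text{Proportion of smokers in the university, } P_2 = \frac{1000}{20000} = 0.05 \]
Q2 = 1 − P2 = 0.95
Also n1 = 1000 and n1 + n2 = 20000 ∴ n2 = 19000.
Because the college is itself a part of the university, the standard error of P1 − P2 is
\[ \sqrt{P_2 Q_2 \times \frac{n_2}{n_1 (n_1 + n_2)}} \]
so that
\[ |z| = \frac{|P_1 - P_2|}{\sqrt{P_2 Q_2 \times \dfrac{n_2}{n_1 (n_1 + n_2)}}} = \frac{0.20 - 0.05}{\sqrt{0.05 \times 0.95 \times \dfrac{19000}{1000 \times 20000}}} = \frac{0.15}{0.00672} \approx 22.3 > 3 \]
Since the value is highly significant, it could not have arisen due to sampling
fluctuations. We therefore reject H0 and conclude that there is a significant difference
between the proportions of smokers in the college and in the university.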
SAQ 1
(a) A stenographer claims that she can take dictation at the rate of 120 words
per minute. Can we reject her claim on the basis of 100 trials in which she
demonstrated a mean of 116 words with standard deviation of 15 words?
(b) An automatic machine was designed to pack exactly 2 kg of tea. A sample
of 100 packs was examined to test the machine. The average weight was
found to be 1.94 kg with standard deviation of 0.10 kg. Is the machine
working properly?
(c) Prior to the institution of a new safety program, the average number of on-
the-job accidents per day at a factory was 4.5. To determine if the safety
program has been effective in reducing the average number of accidents per
day, a random sample of 30 days is taken after the institution of the new
safety program and the number of accidents per day is recorded. The sample
mean and standard deviation were computed as follows :
x̄ = 3.7, σ = 1.3
(i) Is there sufficient evidence to conclude (at significance level 0.01)
that the average number of on-the-job accidents per day at the factory
has decreased since the institution of the safety program?
(ii) What is the practical interpretation of the test statistic computed in
part (i)?
(d) A patented medicine claimed that it is effective in curing 90% of the patients
suffering from malaria. From a sample of 200 patients using this medicine, it
was found that only 170 were cured. Determine whether the claim is right or
wrong (Take 1% level of significance).
(e) Random samples from two populations gave the following results :
          Population A    Population B
Mean      490             500
SD        50              40
Size      300             300
(h) The breaking strengths of metal rods produced by a certain company have
a mean of 820 kg and a standard deviation of 50 kg. When a new manufacturing
process is adopted, it is claimed that the breaking strength can be improved.
A sample of 100 rods is tested and the results indicate a mean breaking
strength of 840 kg. Can we support this claim at a 1% level of significance?
SAQ 2
(a) A certain stimulus administered to each of 12 patients resulted in the
following increments in blood pressure : 5, 2, 8, − 1, 3, 0, 6, − 2, 1, 5, 0, 4.
Can it be concluded that the stimulus will in general be accompanied by an
increase in blood pressure, given that for 11 df the value of t0.05 = 2.201?
(b) Two types of batteries are tested for their length of life and the following results
are obtained.
             No. of Sample (n)    Mean (x̄)      Variance
Battery A    10                   500 hours     100
Battery B    10                   560 hours     121
          A            a            Total
B         (AB) 22      (aB) 38      60
b         (Ab) 8       (ab) 32      40
Total     30           70           100
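Now the formula for calculating the expected frequency of any class (cell) is, in notations :
\[ \text{Expected frequency} = \frac{R \times C}{N} \]
where R is the total of the row and C the total of the column in which the cell lies, and N
is the total number of observations.
For example, if we have two attributes A and B that are independent then the expected
frequency of the class (cell) AB would be
\[ \frac{30 \times 60}{100} = 18. \]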
Once the expected frequency of cell (AB) is decided the expected frequencies of
remaining three classes are automatically fixed.
Thus for class (aB) it would be 60 – 18 = 42
for class (Ab) it would be 30 – 18 = 12
for class (ab) it would be 70 – 42 = 28
This means that so far as a 2 × 2 association (contingency) table is concerned, there is
1 degree of freedom.
In such tables, the degrees of freedom are given by the formula ν = (c – 1) (r – 1),
where c = Number of columns and r = Number of rows.
Thus in 2 × 2 table df = (2 – 1) (2 – 1) = 1
3 × 3 table df = (3 – 1) (3 – 1) = 4
4 × 4 table df = (4 – 1) (4 – 1) = 9 etc.
If the data is not in the form of a contingency table but is a series of individual
observations, or a discrete or continuous series, then the degrees of freedom are calculated
by ν = n – 1, where n is the number of frequencies or the number of independent values.
The value of χ2 is then calculated as
\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]
where O = Observed frequency and E = Expected frequency.
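The steps above (expected frequencies from the marginal totals, degrees of freedom, and the χ2 sum) can be put together in a short Python sketch. It uses the 2 × 2 table of attributes A and B shown earlier; the function names are illustrative, and the resulting χ2 value is computed here only as an illustration, since the text does not work it out for this table.

```python
def expected_frequencies(observed):
    """Expected cell frequencies (row total x column total / N) under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


# Observed 2 x 2 table: rows B and b, columns A and a
observed = [[22, 38],
            [8, 32]]

expected = expected_frequencies(observed)
chi2 = chi_square(observed, expected)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)         # prints: [[18.0, 42.0], [12.0, 28.0]]
print(round(chi2, 3), df)
```

The computed χ2 would then be compared with the tabulated χ2 value for `df` degrees of freedom at the chosen level of significance.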
Example 12.23
The following table shows the number of people interviewed in each age group and
the number in each group estimated to have T. B.
Age group    Number interviewed    Number estimated to have T. B.
15 – 20      199                   1
20 – 25      300                   8
25 – 35      1128                  38
35 – 45      1375                  96
45 – 55      1089                  105
55 – 65      625                   56
65 – 75      155                   12
Do these figures justify the hypothesis that T. B. is equally prevalent in all age
groups?
Solution
If T. B. is equally prevalent in all groups, then in each age group
\[ \frac{316}{4871} \times 100 = 6.5\% \]
of the people suffer from it.
On this basis, the observed and expected frequencies would be as
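A minimal Python sketch of this calculation is given below. It assumes, as a simplification, that χ2 is computed only from the numbers estimated to have T. B. in each age group, with expected frequencies obtained by applying the common rate 316/4871 to the number interviewed. The critical value 12.59 (χ2 at the 5% level for 6 degrees of freedom) is a standard table value and is not quoted in the text.

```python
interviewed = [199, 300, 1128, 1375, 1089, 625, 155]
observed = [1, 8, 38, 96, 105, 56, 12]  # estimated T. B. cases per age group

# Common rate under the hypothesis of equal prevalence (about 6.5%)
rate = sum(observed) / sum(interviewed)

# Expected number of cases in each age group under that hypothesis
expected = [rate * n for n in interviewed]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CHI2_CRIT = 12.59  # assumed table value: 5% level, 6 degrees of freedom
reject_h0 = chi2 > CHI2_CRIT
print(round(rate * 100, 1), [round(e, 1) for e in expected])
print(round(chi2, 2), df, reject_h0)
```

Under these assumptions the statistic is far larger than the critical value, so the figures do not support the hypothesis of equal prevalence across age groups.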
Find the value of χ2 on the hypothesis that the dice were unbiased and hence show
that the data is consistent with the hypothesis so far as the χ2 test is concerned.
12.7 SUMMARY
In this unit we have learnt the procedures for testing hypotheses about various population
parameters.
In many practical problems, statisticians are called upon to make decisions about a
statistical population on the basis of sample observations. In attempting to reach such
decisions, it is necessary to make certain assumptions or guesses about the characteristics
of population, particularly about the probability distribution or the value of its
parameters. Such an assumption or statement about the population is called a Statistical
Hypothesis. The validity of a hypothesis will be tested by analysing the sample. The
procedure which enables us to decide whether a certain hypothesis is true or not, is called
Testing of Hypothesis.
A statistical test of significance involves two mutually exclusive and exhaustive
hypotheses, the null hypothesis (H0) and the alternative hypothesis (H1).
The null hypothesis specifies an expected value for the population. It usually takes the
form of “no effect” or “no difference”.
The alternative hypothesis denies the null hypothesis.
Statistical proof is both indirect and probabilistic. By rejecting the null hypothesis, we
assert the alternative hypothesis.
Rejection of the null hypothesis involves a judgement based on probability. If the
obtained result would have rarely occurred in the sampling distribution of the statistic,
we reject H0 and assert H1.
“Rarely” is, in turn, defined by probability. The 5 percent significance level means that,
if H0 were true, a result this extreme would be obtained by chance 5 percent of the time or
less. Similarly, using α = 0.01, H0 is rejected when the result would be obtained by chance
1 percent of the time or less.
Statistical proof is not absolute. Two types of error that may be made are Type-I or
Type α error and Type-II or Type β error.
A Type-I error consists of falsely rejecting H0; i.e. rejecting H0 when it is true. The
probability of this type of error is α.
A Type-II error occurs when we fail to reject H0, when in fact H0 is false.
Both the null and alternative hypotheses may be either non-directional or directional.
When non-directional, the critical region is found in both tails of the sampling
distribution. When directional, the critical region is only one-tailed.
Tests involving two sample means usually involve two different conditions, i.e. there are
two samples, which are drawn from two populations. The usual H0 is that the mean of the
first population equals the mean of the second population. Rejection of H0 permits us to
infer that the conditions produced different results.
Dependent tests of significance are used when the measurements are paired in some way.
This may be accomplished by using before-after measures on the same persons or objects
or by matching them on some known basis.
Hypothesis testing is an inferential process, which means that it uses limited information
as the basis for reaching a general conclusion. A sample provides only limited or
incomplete information about the whole population. This means we could make incorrect
conclusions.
Hypothesis testing involving two sample proportions is conceptually similar to the test of
significance of the difference between means. H0 is typically that the proportion of the
first population equals the proportion of the second population. Rejection of H0 permits
us to infer differences in the populations of the variable of interest.
Finally, we also discussed the chi-square (χ2) test. The chi-square test of independence
and goodness of fit is a prominent example of a non-parametric test, and it can be used to
evaluate a relationship between two nominal or ordinal variables.
\[ |z| = \frac{|\bar{x} - \mu|}{\sigma / \sqrt{n}}, \text{ we get} \]
\[ |z| = \frac{|116 - 120|}{15 / \sqrt{100}} = 2.67 > 1.96 \]
The difference is significant at both the 5% and 1% levels of significance (2.67 > 2.58),
i.e. the z-score of 2.67 is highly significant. Hence H0 is rejected, i.e. her
claim is to be rejected.
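If somebody is interested in the number of trials on the basis of which, with
the same figures, her claim would not have been rejected, he can proceed as follows :
\[ |z| = \frac{|116 - 120|}{15 / \sqrt{n}} = \frac{4}{15 / \sqrt{n}} \le 1.96 \]
\[ \text{i.e. } \frac{4 \sqrt{n}}{15} \le 1.96 \]
\[ \text{i.e. } \sqrt{n} \le 0.49 \times 15 \]
\[ \text{i.e. } \sqrt{n} \le 7.35 \]
\[ \text{i.e. } n \le (7.35)^2 \]
\[ \text{i.e. } n \le 54.02 \]
i.e. n = 54 trials at most.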
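The bound on n can be verified numerically. The sketch below, with the illustrative function name `max_trials_without_rejection`, solves the inequality for n and then confirms that n = 54 stays inside the acceptance region while n = 55 does not.

```python
from math import floor, sqrt


def max_trials_without_rejection(diff, sigma, z_crit):
    """Largest n for which |diff| / (sigma / sqrt(n)) <= z_crit."""
    return floor((z_crit * sigma / abs(diff)) ** 2)


def z_score(diff, sigma, n):
    return abs(diff) / (sigma / sqrt(n))


n_max = max_trials_without_rejection(diff=116 - 120, sigma=15, z_crit=1.96)
print(n_max)  # prints: 54

# Check the boundary: 54 trials would not reject the claim, 55 would
print(z_score(-4, 15, n_max) <= 1.96, z_score(-4, 15, n_max + 1) <= 1.96)  # prints: True False
```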
(b) The null hypothesis to be tested is that the machine is working properly.
In notations H0 : µ = 2 kg and H1 : µ ≠ 2 kg.
Substituting x̄ = 1.94, σ = 0.10, n = 100, we get
\[ |z| = \frac{|\bar{x} - \mu|}{\sigma / \sqrt{n}} = \frac{|1.94 - 2|}{0.10 / \sqrt{100}} = 6.0 > 2.58 \]
The z-score is highly significant; hence, we reject H0 on the basis of this
sample, i.e. the machine is not working properly.
(c) (i) In order to determine whether the safety program was effective, we
will conduct a large-sample test of
H0 : µ = 4.5 (i.e., no change in average number of on-the-job
accidents per day)
H1 : µ < 4.5 (i.e., average number of on-the-job accidents per day has
decreased)
where µ represents the average number of on-the-job accidents per
day at the factory after institution of the new safety program. For a
significance level of α = 0.01, we will reject the null hypothesis if
z < − z0.01 = − 2.33
The computed value of the test statistic is
\[ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{3.7 - 4.5}{1.3 / \sqrt{30}} = -3.37 \]
Since this value does fall within the rejection region, there is
sufficient evidence (at α = 0.01) to conclude that the average number
of on-the-job accidents per day at the factory has decreased since the
institution of the safety program. It appears that the safety program
was effective in reducing the average number of accidents per day.
(ii) If the null hypothesis is true, µ = 4.5. Recall that for large samples,
the sampling distribution of x̄ is approximately normal, with mean
\[ \mu_{\bar{x}} = \mu \quad \text{and standard deviation} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}. \]
Then the z-score for x̄, under the assumption that H0 is true, is given by
\[ z = \frac{\bar{x} - 4.5}{\sigma / \sqrt{n}} \]
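(d) Here p̂ = 170/200 = 0.85, p0 = 0.9 and q0 = 0.1, so that
\[ |z| = \frac{|\hat{p} - p_0|}{\sqrt{\dfrac{p_0 q_0}{n}}} = \frac{|0.85 - 0.9|}{\sqrt{\dfrac{0.9 \times 0.1}{200}}} = 2.36 < 2.58 \]
The null hypothesis H0 is not rejected at the 1% level of significance, i.e. the
claim is justified.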
(e) H0 : µ1 = µ2, i.e. the difference between the two means is not significant,
against H1 : µ1 ≠ µ2.
We have x̄1 = 490, x̄2 = 500, σ1 = 50, σ2 = 40, n1 = 300 and n2 = 300.
\[ |z| = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{|490 - 500|}{\sqrt{\dfrac{(50)^2}{300} + \dfrac{(40)^2}{300}}} = \frac{10}{\sqrt{\dfrac{2500}{300} + \dfrac{1600}{300}}} \]
\[ = \frac{10}{\sqrt{\dfrac{4100}{300}}} = \frac{10}{\sqrt{\dfrac{41}{3}}} = \frac{10}{\sqrt{13.67}} = 2.71 > 1.96 \]
Therefore, the null hypothesis H0 is rejected, i.e. the difference between the
two means is significant.
(f) P1 = 30% = 0.30 and P2 = 25% = 0.25, q1 = 0.70 and q2 = 0.75, n1 = 1200
and n2 = 900.
\[ |z| = \frac{|P_1 - P_2|}{\sqrt{\dfrac{P_1 q_1}{n_1} + \dfrac{P_2 q_2}{n_2}}} = \frac{0.30 - 0.25}{\sqrt{\dfrac{(0.3)(0.7)}{1200} + \dfrac{(0.25)(0.75)}{900}}} = \frac{0.05}{0.0195} = 2.56 \]
Therefore, | z | > 1.96 (i.e. at 5% level of significance). Hence, it is unlikely
that the real difference will be hidden.
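Note : At times you may be interested in comparing the proportion of
persons possessing an attribute in a sample (P1, sample size n1) with the
proportion in the population from which the sample is drawn (P2, with
Q2 = 1 − P2 and population size n1 + n2). In that case use :
\[ |z| = \frac{|P_1 - P_2|}{\sqrt{P_2 Q_2 \times \dfrac{n_2}{n_1 (n_1 + n_2)}}} \]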
and q̂ = 1 − p̂ = 0.535
Then we have
\[ z = \frac{(0.37 - 0.56) - 0}{\sqrt{(0.465)(0.535) \left( \dfrac{1}{100} + \dfrac{1}{100} \right)}} = -2.69 \]
This value falls below the critical value of − 2.33. Thus, at α = 0.01, we
reject the null hypothesis; there is sufficient evidence to conclude that the
proportion of patients giving reactions to needles of the old type is
significantly less than the corresponding proportion of patients giving
reactions to needles of the new type, i.e. p1 < p2.
The inference derived from the test in SAQ 1(g) is valid only if the sample
sizes, n1 and n2, are sufficiently large to guarantee that the intervals
\[ \hat{p}_1 \pm 2 \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1}} \quad \text{and} \quad \hat{p}_2 \pm 2 \sqrt{\frac{\hat{p}_2 \hat{q}_2}{n_2}} \]
Figure 12.13
Note : In practice, one should adopt a ‘one-tailed test’ only when there is sufficient
reason to expect that the difference will be in a specified direction. A two-tailed
test is more conservative than a one-tailed test, since it requires a more extreme
test statistic for the rejection of the null hypothesis.
SAQ 2
\[ \text{(a)} \quad \bar{x} = \frac{\sum x_i}{n} = \frac{31}{12} = 2.583 \approx 2.6 \]
\[ \text{Also} \quad \sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{104.92}{12 - 1}} = 3.08 \]
The null hypothesis H0 : µ = 0, i.e. assuming that the stimulus will not be
accompanied by an increase in blood pressure (or the mean increase in blood
pressure for the population is zero).
\[ \text{Now} \quad t = \frac{|\bar{x} - \mu|}{\sigma} \sqrt{n} = \frac{2.6 - 0}{3.08} \sqrt{12} = 2.924 \]
The table value, t0.05 for ν = 11, is 2.201.
Therefore, 2.924 > 2.201.
Thus the null hypothesis Ho is rejected, i.e. we find that our assumption is
wrong and we say that as a result of the stimulus the blood pressure will
increase.
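The same t statistic can be computed directly from the raw increments with Python's `statistics` module, as sketched below. Since the sketch does not round the mean and standard deviation along the way, it gives t of about 2.90 rather than the hand-rounded 2.924; the conclusion is unchanged.

```python
from math import sqrt
from statistics import mean, stdev

# Increments in blood pressure from SAQ 2(a)
increments = [5, 2, 8, -1, 3, 0, 6, -2, 1, 5, 0, 4]

n = len(increments)
x_bar = mean(increments)
s = stdev(increments)  # sample standard deviation, divisor n - 1

# Test statistic under H0: mu = 0
t = (x_bar - 0) / (s / sqrt(n))

T_CRIT = 2.201  # table value for 11 df quoted in the question
reject_h0 = t > T_CRIT
print(n - 1, round(x_bar, 3), round(s, 2), round(t, 2), reject_h0)
```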
(b) The null hypothesis H0 : µ1 = µ2 or H0 : µ1 − µ2 = 0, i.e. there is no
significant difference in the two batteries.
Alternative hypothesis H1 : µ1 ≠ µ2 or H1 : µ1 − µ2 ≠ 0.
Now n1 = 10, σ1 = √100 = 10, x̄1 = 500, n2 = 10, σ2 = √121 = 11
and x̄2 = 560.
\[ \text{Thus} \quad \sigma_P^2 = \frac{(n_1 - 1) \sigma_1^2 + (n_2 - 1) \sigma_2^2}{n_1 + n_2 - 2} = \frac{9 \times 100 + 9 \times 121}{18} = 110.5 \]
Using the formula
\[ t = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_P^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}, \quad \text{where } \mu_1 - \mu_2 = 0, \]
we get
\[ |t| = \frac{|500 - 560|}{\sqrt{110.5 \left( \dfrac{1}{10} + \dfrac{1}{10} \right)}} = 12.76 \]
The degrees of freedom ν (df) = 10 + 10 − 2 = 18
For ν = 18, t0.05 = 2.1
Therefore, 12.76 > 2.1 (much higher)
Thus the difference is highly significant (rejection of H0).
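A short Python sketch of the pooled-variance calculation for the two batteries follows. The function names `pooled_variance` and `pooled_t` are illustrative; the inputs are the sample sizes, means and variances given in the question.

```python
from math import sqrt


def pooled_variance(var1, n1, var2, n2):
    """Pooled estimate of a common variance from two sample variances."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)


def pooled_t(mean1, n1, var1, mean2, n2, var2):
    """Two-sample t statistic for H0: mu1 = mu2, assuming equal variances."""
    sp2 = pooled_variance(var1, n1, var2, n2)
    return (mean1 - mean2) / sqrt(sp2 * (1 / n1 + 1 / n2))


sp2 = pooled_variance(var1=100, n1=10, var2=121, n2=10)
t = pooled_t(mean1=500, n1=10, var1=100, mean2=560, n2=10, var2=121)
df = 10 + 10 - 2

T_CRIT = 2.1  # table value for 18 df used in the solution
print(sp2, round(abs(t), 2), df, abs(t) > T_CRIT)  # prints: 110.5 12.76 18 True
```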
(c) x̄ = 1570, µ = 1600, σ = 120, n = 100
\[ \text{Now,} \quad |t| = \frac{|\bar{x} - \mu|}{\sigma / \sqrt{n}} = \frac{|1570 - 1600|}{120 / \sqrt{100}} = 2.5 \]
At the 0.05 level of significance the critical value is 1.96.
Since | t | = 2.5 > 1.96, the claim is to be rejected.
(d) x̄ = 0.742, µ = 0.70, σ = 0.04, n = 10
\[ \text{Now,} \quad t = \frac{\bar{x} - \mu}{\sigma / \sqrt{n - 1}} = \frac{0.742 - 0.70}{0.04 / \sqrt{10 - 1}} = 3.15 \]
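(e)
\[ \text{Now,} \quad t = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{4.3 - 4}{1.2 / \sqrt{9}} = 0.75 \]
\[ \text{From table,} \quad t_{n-1,\, \alpha/2} = t_{8,\, 0.05} = 1.86. \]
Since calculated | t | < 1.86, we have no reason to reject H0.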
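(f) Clearly, it is a two-tailed test, so that
H0 : π = 0.30, and
H1 : π ≠ 0.30
\[ \text{Now,} \quad z = \frac{p - \pi}{\sigma_p} \]
\[ \text{where} \quad \sigma_p = \sqrt{\frac{\pi (1 - \pi)}{n}} = \sqrt{\frac{0.3 \times 0.7}{150}} = 0.037 \]
\[ \text{and} \quad p = \frac{50}{150} = 0.33 \]
\[ \text{then} \quad z = \frac{0.33 - 0.3}{0.037} = 0.81 \]
Since the z value of 0.81 is less than the critical value of z at α = 0.05,
which is 1.96, we cannot reject the null hypothesis.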
(g) It is a two-tailed test, so that
H0 : µ1 = µ2
H1 : µ1 ≠ µ2
Now, we have
n1 = 100, n2 = 50
x̄1 = 75, x̄2 = 70
σ1 = 10, σ2 = 12
α = 0.01
\[ z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{75 - 70}{\sqrt{\dfrac{10^2}{100} + \dfrac{12^2}{50}}} = 2.54 \]
Since this value of z is less than the critical value of z at α = 0.01 for a
two-tailed test, which is 2.58, we cannot reject the null hypothesis.
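(h) As the sample size is small, the t-distribution represents the data more closely
than the z distribution.
Mean x̄ = 120
\[ \text{and} \quad \sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = 6.69 \]
Figure 12.14
\[ x_1 = \bar{x} - t \frac{\sigma}{\sqrt{n}} = 120 - \frac{6.69}{\sqrt{8}} \times (2.36) = 114.42 \]
\[ \text{and} \quad x_2 = \bar{x} + t \frac{\sigma}{\sqrt{n}} = 125.58 \]
Hence 114.42 ≤ µ ≤ 125.58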
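The interval in (h) can be reproduced with the following Python sketch. The function name `t_interval` is illustrative; the sample size n = 8 and the table value t = 2.36 are the ones used in the calculation above.

```python
from math import sqrt


def t_interval(sample_mean, sd, n, t_crit):
    """Confidence limits: sample_mean -/+ t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


lower, upper = t_interval(sample_mean=120, sd=6.69, n=8, t_crit=2.36)
print(round(lower, 2), round(upper, 2))  # prints: 114.42 125.58
```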
(i) Since the sample size is small, we can use the t-test.
\[ \text{Now,} \quad t = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]
x̄ = 8.2, µ = 8, σ = 0.3, n = 9, then
\[ t = \frac{8.2 - 8}{0.3 / \sqrt{9}} = 2 \]
The critical value of t from the table at α = 0.01 for a one-tail test and df
(degrees of freedom) = 8 is 2.896.
Since our calculated value of t is less than the critical value of t, we cannot
reject the null hypothesis.
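(j) H0 : µ1 = µ2
H1 : µ1 ≠ µ2
\[ \text{Now,} \quad t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} \]
\[ \text{where} \quad \sigma_p^2 = \frac{(10 - 1)\, 4^2 + (8 - 1)\, 5^2}{10 + 8 - 2} = 19.94 \]
\[ \text{and} \quad t = \frac{23 - 26}{\sqrt{19.94 \left( \dfrac{1}{10} + \dfrac{1}{8} \right)}} \approx -1.42 \]
The critical value of t from the table at α = 0.05 for a two-tail test and
df = (n1 + n2 − 2) = 16 is 2.12. Since the numerical value of our calculated
‘t’ is less than the critical value of t, we cannot reject the null hypothesis.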
SAQ 3
On the hypothesis of unbiased dice the theoretical frequencies in 4096 throws are
the terms in the Binomial expansion of 4096 (5/6 + 1/6)2 and are given below :
Number of 0 1 2 3 4 5 6 7 Total
Successes
FURTHER READING
Grant, E. L. and R. S. Leavenworth (1980), Statistical Quality Control, 5th Ed., McGraw
Hill, New York.
Grewal, B. S. (2004), Higher Engineering Mathematics, Khanna Publisher, New Delhi.
Kreyszig Erwin (2002), Advanced Engineering Mathematics, John Wiley and Sons, Inc.
Richard, P. Runyon and Haber Audrey, Business Statistics, Richard D. Irwin, Inc.
Stroud, K. A. (2002), Engineering Mathematics, McMillan Press Ltd.
Stuart, A. and J. K. Orth, Kendalls’, Advanced Theory of Statistics, Volumes 1 and 2,
5th Ed., Kent UK, Arnold.
Walpole Ronald, E and Mayers Rammond, H (2002), Probability and Statistics for
Engineers and Scientists, McMillan Publishing Company, New York.