Testing Hypothesis
Hypothesis testing begins with an assumption, called a hypothesis, that we make about a population
parameter. Then we collect sample data, produce sample statistics, and use this information to decide
whether our hypothesized population parameter is correct.
To test the validity of our assumption, we determine the difference between the
hypothesized value and the actual value of the sample mean. Then we judge whether the difference is
significant.
The smaller the difference, the greater the likelihood that our hypothesized value for the mean is correct.
The larger the difference, the smaller the likelihood.
More often, the difference between the hypothesized population parameter and the actual statistic is
neither so large that we automatically reject our hypothesis nor so small that we accept it quickly.
So in hypothesis testing, as in most significant real-life decisions, clear-cut solutions are the exceptions.
We cannot accept or reject a hypothesis about a population parameter simply by intuition. Instead we
need to learn how to decide objectively, on the basis of sample information, whether to accept or reject
a hunch, guess, or assumption.
A hypothesis is a statement about a population parameter whose validity is to be tested on the
basis of a random sample drawn from the population.
In hypothesis testing, we must assume a value of the population parameter before we begin
sampling.
The assumption we wish to test is called the null hypothesis and is symbolized H0 (read "H sub zero").
Example: suppose we intend to test the hypothesis that the population mean is equal to 350.
That is, "the null hypothesis is that the population mean is equal to 350":
H0: µ = 350
H1: µ ≠ 350; the alternative hypothesis is that the population mean is not equal to 350
H1: µ > 350; the alternative hypothesis is that the population mean is greater than 350
H1: µ < 350; the alternative hypothesis is that the population mean is less than 350
If the alternative hypothesis is one-sided, the test procedure is said to be one-tailed; otherwise it is two-tailed.
Suppose that making a Type I error (rejecting a null hypothesis when it is true) involves the time and
trouble of reworking a batch of chemicals that should have been accepted. At the same time, making a
Type II error (accepting a null hypothesis when it is false) means taking a chance that an entire group
of users of this chemical compound will be poisoned. Obviously, the management of this company
will prefer a Type I error to a Type II error and, as a result, will set a very high level of significance in
its testing to get low βs.
Suppose, on the other hand, that making a Type I error involves disassembling an entire engine at the
factory, but making a Type II error involves relatively inexpensive warranty repairs by the dealers.
Then the manufacturer is more likely to prefer a Type II error and will set lower significance levels in
its testing.
The probability of making a Type I error is called the level of significance of the test and is denoted by α.
The probability of making a Type II error is denoted by β, and (1 − β) is called the power of the test.
The higher the significance level we use for testing a hypothesis, the higher the probability of rejecting
a null hypothesis when it is true.
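The trade-off between α and β can be illustrated numerically. The sketch below assumes an upper-tailed z test with a known population standard deviation; the function name `type2_error` and every number passed to it (a true mean of 355, σ = 20, n = 36) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type2_error(mu0, mu_true, sigma, n, alpha):
    """Beta for an upper-tailed z test of H0: mu <= mu0 when the true mean is mu_true."""
    se = sigma / sqrt(n)                         # standard error of the sample mean
    z_crit = NormalDist().inv_cdf(1 - alpha)     # critical z for the chosen alpha
    xbar_crit = mu0 + z_crit * se                # H0 is rejected when the sample mean exceeds this
    # beta = P(sample mean stays below the cutoff | true mean = mu_true)
    return NormalDist(mu_true, se).cdf(xbar_crit)

# Hypothetical inputs, for illustration only
for alpha in (0.01, 0.05, 0.10):
    beta = type2_error(mu0=350, mu_true=355, sigma=20, n=36, alpha=alpha)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

Under these assumptions, raising α lowers β and raises the power, which is the behaviour described in the two company examples.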
The purpose of hypothesis testing is not to question the computed value of sample statistic but
to make judgment about the difference between that sample statistic and hypothesized
population parameter.
What criterion should we use for deciding whether to accept or reject the null hypothesis?
In statistical terms, the value 0.05, 0.10, or 0.01 is called the significance level.
The remaining area, i.e. 0.95, 0.90, or 0.99, is where no significant difference exists.
If we assume the hypothesis is correct, then the significance level indicates the percentage
of sample means that lie outside certain limits.
For example, at a significance level of 0.10, 0.1 of the area under the curve is where a significant
difference exists and we reject the null hypothesis, and
0.9 of the area under the curve is where we would accept the null hypothesis.
The numerical values of the test statistic for which the null hypothesis is rejected are called critical
values of the test and these values constitute a region called critical region or rejection region of the
test.
Rejection Rule:
If the absolute value of the test statistic computed using sample data exceeds the absolute critical value
of the test, the null hypothesis is rejected.
When the null hypothesis is rejected, the test is called significant and when the null hypothesis is
not rejected, the test is called insignificant.
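For a z test, the critical values can be obtained from the standard normal distribution rather than a table. This is a minimal sketch; the helper names `critical_z` and `is_significant` are illustrative, and the rejection rule is applied in its absolute-value form, which for a one-tailed test assumes the sign of z already points in the direction of the alternative hypothesis.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=False):
    """Positive critical z value for a given significance level."""
    tail_area = alpha / 2 if two_tailed else alpha   # a two-tailed test splits alpha between tails
    return NormalDist().inv_cdf(1 - tail_area)

def is_significant(z_cal, alpha, two_tailed=False):
    """Rejection rule: |computed statistic| exceeds |critical value|."""
    return abs(z_cal) > critical_z(alpha, two_tailed)

print(round(critical_z(0.05), 3))                    # 1.645
print(round(critical_z(0.05, two_tailed=True), 2))   # 1.96
```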
Points to Note:
(i) Rejection of H0 indicates that an extremely unlikely sample has been drawn which
implies that H0 is very likely to be false.
(ii) Failing to reject H0 does not prove that H0 is true.
(iii) In testing hypothesis, the assumption is always made that the sample used in the test
process is a random sample.
(iv) It is assumed that the sampling distribution of the test statistic is known.
(v) H0, HA and α are determined before the test is carried out.
Step 1: Set up H0 and HA. HA decides whether the test is one or two tailed.
Step 2: Specify the level of significance (α).
Step 3: Select an appropriate test-statistic (z or t-test) and compute the value of the test-statistic using
sample data assuming null hypothesis to be true.
Step 4: Determine the critical values and the critical region of the test (using z or t table)
Step 5: If the numerical value of the test-statistic (i.e. z_cal or t_cal) falls in the rejection
region, we reject the null hypothesis; otherwise we accept the null hypothesis.
Step 6: Decide if the null hypothesis is to be rejected and write the conclusion of the test.
The test will be significant if H0 is rejected; otherwise the test will be insignificant.
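The steps above can be sketched as a single function for a large-sample test of a mean. This is an illustrative sketch, not a definitive implementation: the name `z_test`, its parameters, and the numbers in the usage line (a sample mean of 356, standard deviation 20, n = 64) are assumptions, with only the hypothesized mean of 350 taken from the earlier example.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu_0, sd, n, alpha, alternative):
    """Large-sample z test of a mean.

    alternative is "greater", "less" or "two-sided" (set in Step 1).
    Returns (z_cal, critical_value, reject_h0).
    """
    std_normal = NormalDist()
    # Step 3: test statistic, computed assuming H0 is true
    z_cal = (x_bar - mu_0) / (sd / sqrt(n))
    # Step 4: critical value; Step 5: apply the rejection rule
    if alternative == "two-sided":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z_cal) > z_crit
    elif alternative == "greater":
        z_crit = std_normal.inv_cdf(1 - alpha)
        reject = z_cal > z_crit
    elif alternative == "less":
        z_crit = -std_normal.inv_cdf(1 - alpha)
        reject = z_cal < z_crit
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z_cal, z_crit, reject

# Hypothetical sample values, for illustration only
z_cal, z_crit, reject = z_test(x_bar=356, mu_0=350, sd=20, n=64,
                               alpha=0.05, alternative="two-sided")
# Step 6: report the conclusion
print(f"z_cal = {z_cal:.2f}, critical value = ±{z_crit:.2f}, reject H0: {reject}")
```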
Solution:
A.
Decision Rule: A decision rule is a statement of the specific conditions under which the
null hypothesis is rejected.
Ho : µ ≤ 10
H1 : µ > 10 (Here HA shows that the test is one-tailed i.e. right-tailed test)
Sample size is 10, i.e. less than 30. The test statistic for a mean, when the population standard deviation is
not known, is t. The critical value from the t-distribution table (9 degrees of freedom, α = 0.05, one-tailed) is 1.833.
B. Calculation of t-statistic
$$t_{cal} = \frac{\bar{x} - \mu_{H_0}}{\hat{\sigma}_{\bar{x}}} = \frac{\bar{x} - \mu_{H_0}}{s/\sqrt{n}} = \frac{12 - 10}{3/\sqrt{10}} = 2.108$$
(BY: MUHAMMAD MEMON – IBA) PAGE (6)
C. Decision regarding the null hypothesis
Here |t_cal| > |t_tab| (2.108 > 1.833); therefore we reject the null hypothesis and accept the alternative
hypothesis that the mean is greater than 10.
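The arithmetic of this solution can be checked with a few lines of Python. The values below (sample mean 12, hypothesized mean 10, s = 3, n = 10) are read off the worked calculation, and the tabulated value 1.833 is the one quoted in the solution; the variable names are illustrative.

```python
from math import sqrt

x_bar, mu_0, s, n = 12, 10, 3, 10   # values from the worked calculation
t_tab = 1.833                        # tabulated critical value quoted in the solution

t_cal = (x_bar - mu_0) / (s / sqrt(n))   # estimated standard error is s / sqrt(n)
print(round(t_cal, 3))                   # 2.108
print(t_cal > t_tab)                     # True, so H0 is rejected
```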
The measure of disagreement between the sample data and the null hypothesis is called the observed
significance level (or p-value) for the test.
The p-value, for a specific statistical test, is the probability (assuming the null hypothesis is true) of
observing a value of the test statistic that is at least as contradictory to the null hypothesis, and as
supportive of the alternative hypothesis, as the actual one computed from the sample data.
α is the significance level (the size of the rejection or critical region), chosen before the test is carried out,
whereas the p-value is the observed significance level computed from the sample data.
For the above example, the value of the test statistic computed for the sample of n = 50 was z
= 2.12. Taking the test as upper-tailed, the observed significance level (p-value) for this test is
p-value = P(z ≥ 2.12) = 0.5 − 0.4830 = 0.0170
Since the p-value is less than the chosen value of α (i.e. the observed significance level is less than the
significance level), we reject the null hypothesis and accept the alternative hypothesis.
Note: In contrast, if we choose α = 0.01, we would not reject the null hypothesis because the p-value
for the test is larger than 0.01.
1. (a) If the test is one-tailed: the p-value is equal to the tail area beyond z in the same direction
as the alternative hypothesis. Thus, if the alternative hypothesis is of the form >, the p-
value is the area to the right of, or above the observed z value. Conversely, if the
alternative is of the form <, the p-value is the area to the left of, or below the observed z
value.
(b) If the test is two-tailed: the p-value is equal to twice the tail area beyond the observed z
value in the direction of the sign of z. That is, if z is positive, the p-value is twice the area
to the right of, or above the observed z value. Conversely, if z is negative, the p-value is
twice the area to the left of, or below the observed z value.
2. If the observed significance level (p-value) of the test is less than the chosen value of α, reject
the null hypothesis. Otherwise, do not reject the null hypothesis.
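The two rules above translate directly into code. The following is a minimal sketch using the standard normal distribution from Python's standard library; the function name `p_value` and the labels "greater", "less" and "two-sided" are illustrative choices, not notation from these notes.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """Observed significance level for a z statistic."""
    std_normal = NormalDist()
    if alternative == "greater":      # HA of the form >: area to the right of z
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # HA of the form <: area to the left of z
        return std_normal.cdf(z)
    if alternative == "two-sided":    # twice the tail area beyond z, in the direction of its sign
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

alpha = 0.05
p = p_value(2.12, "greater")
print(f"{p:.4f}")        # 0.0170
print(p < alpha)         # True: reject H0 at alpha = 0.05
```

With α = 0.01 the same p-value would not lead to rejection, since 0.0170 is larger than 0.01.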
𝑧0.05 = 1.645
Rejection Region or Critical Region: 𝑧 > 1.645
Ho : µ ≤ 2400,
Step-6: Conclusion:
The null hypothesis is rejected. Hence the test is significant and we conclude that the
company’s pipe has a mean strength that exceeds 2400 pounds per linear foot,
and thus the sample result is consistent with the manufacturer’s claim.
Step-2:
Here α = 0.05 and the test is two-tailed, therefore we divide α equally between the lower
and upper tails of the distribution of z, so α/2 = 0.025.
Step-3:
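Sample size is more than 30, therefore calculate the z statistic:
n = 100, x̄ = 3806, µ_Ho = 3675, s = 710, then
$$z = \frac{\bar{x} - \mu_{H_0}}{s/\sqrt{n}} = \frac{3806 - 3675}{710/\sqrt{100}} = 1.845$$
$$z_{\alpha/2} = z_{0.025} = 1.96$$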
Rejection Region or Critical Region: 𝑧 < −1.96 𝑜𝑟 𝑧 > 1.96
Step-5: Rejection Rule
Ho : µ = Rs. 3675,
Step-6: Conclusion:
Since the test is two-tailed and z is positive, the p-value is twice the area to the right of, or above, the
observed z value.
Here the p-value is not less than the chosen value of α (i.e. the observed significance level is more than the
significance level), so we cannot reject the null hypothesis, and HA is not supported.
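The decision in this example can be reproduced from the figures given in Step-3 (n = 100, sample mean 3806, hypothesized mean 3675, s = 710) with α = 0.05. This is a sketch using the normal distribution from the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, mu_0, s, alpha = 100, 3806, 3675, 710, 0.05   # figures from Step-2 and Step-3

z = (x_bar - mu_0) / (s / sqrt(n))                 # computed test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # two-tailed critical value, about 1.96
p = 2 * (1 - NormalDist().cdf(abs(z)))             # twice the tail area beyond z

print(f"z = {z:.3f}")                              # z = 1.845
print(f"p-value = {p:.3f}")
print("reject H0" if abs(z) > z_crit else "do not reject H0")
```

Both criteria agree: z lies inside (−1.96, 1.96) and the p-value exceeds 0.05, so the null hypothesis is not rejected.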
QUESTION:
The roofing contract for a new sports complex in San Francisco has been awarded to Parkhill Associates,
a large building contractor. Building specifications call for a movable roof covered by approximately
10,000 sheets of 0.04-inch-thick aluminum. The aluminum sheets cannot be appreciably thicker than
0.04 inch because the structure could not support the additional weight. Nor can the sheets be
appreciably thinner than 0.04 inch because the strength of the roof would be inadequate. Because of this
restriction on thickness, Parkhill carefully checks the aluminum sheets from its supplier. Of course,
Parkhill does not want to measure each sheet, so it randomly samples 100.
The sheets in the sample have a mean thickness of 0.0408 inch. From past experience with this supplier,
Parkhill believes that these sheets come from a thickness population with a standard deviation of 0.004
inch.
On the basis of these sample statistics, Parkhill must decide whether to accept or reject the shipment of
10,000 aluminum sheets sent by the supplier. (Take α = 0.05.)
Solution:
Step-I:
Ho : µ = 0.04
HA : µ ≠ 0.04 (Here HA shows that the test is two-tailed. )
Step-2:
Here α = 0.05 and the test is two-tailed, therefore we divide α equally between the lower
and upper tails of the distribution of z, so α/2 = 0.025.
Step-3:
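Sample size is more than 30, therefore calculate the z statistic:
$$z = \frac{\bar{x} - \mu_{H_0}}{\sigma/\sqrt{n}} = \frac{0.0408 - 0.04}{0.004/\sqrt{100}} = 2.0$$
$$z_{\alpha/2} = z_{0.025} = 1.96$$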
Rejection Region or Critical Region: 𝑧 < −1.96 𝑜𝑟 𝑧 > 1.96
Step-5: Rejection Rule
Ho : µ = 0.04
Step-6: Conclusion:
Hence the test is significant and Parkhill could conclude that a population with a true
mean of 0.04 inch would not produce a sample like this. The project supervisor would
reject the aluminum company’s statement about the mean thickness of the sheets.
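The Parkhill conclusion can be verified from the figures in the question (n = 100, sample mean 0.0408 inch, hypothesized mean 0.04 inch, σ = 0.004 inch, α = 0.05). The snippet is a sketch with illustrative variable names.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, mu_0, sigma, alpha = 100, 0.0408, 0.04, 0.004, 0.05   # figures from the question

standard_error = sigma / sqrt(n)                   # 0.0004 inch
z = (x_bar - mu_0) / standard_error                # sample mean is 2 standard errors above 0.04
z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # two-tailed critical value, about 1.96

print(f"z = {z:.2f}")                              # z = 2.00
print(abs(z) > z_crit)                             # True: z falls in the rejection region
```

The statistic only just exceeds the critical value, so the decision is sensitive to α: at α = 0.01 (critical value about 2.576) the same sample would not lead to rejection.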
Solution:
Step-I:
Ho : p = 0.05
HA : p < 0.05 (Here HA shows that the test is one-tailed. )
Step-2:
Here α = 0.01 and the test is one-tailed.
Step-3:
$$p_0 = \frac{10}{300} = 0.033$$
Before conducting the test of hypothesis, we check to determine whether the
sample size is large enough to use the normal approximation for the sampling
distribution of 𝑝0
$$p_{H_0} \pm 3\sigma_p = p_{H_0} \pm 3\sqrt{\frac{p_{H_0}(1 - p_{H_0})}{n}} = 0.05 \pm 3\sqrt{\frac{0.05 \times 0.95}{300}} = 0.05 \pm 0.04 \quad \text{or} \quad (0.01,\ 0.09)$$
Since the interval lies within (0, 1), the normal approximation will be adequate.
Ho : p = 0.05,
Step-6: Conclusion:
H0 is therefore not rejected. Hence the test is insignificant and the sample result
does not support the manufacturer's claim.
Hence, there is insufficient evidence at the 0.01 level of significance to indicate
that the shipment contains fewer than 5% defective batteries.
For the above example, the value of the test statistic computed for the sample of n = 300 was
z = −1.35. Since the test is lower-tailed, the observed significance level (p-value) for this test is
p-value = P(z ≤ −1.35) = 0.5 − 0.4115 = 0.0885
Since the p-value is greater than the chosen value of α = 0.01 (i.e. the observed significance level is
greater than the significance level), we do not reject the null hypothesis, and the alternative
hypothesis is not supported.
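The whole proportion test can be traced in code: the sample-size check, the test statistic, and the p-value. This sketch uses the figures from the solution (10 defectives in a sample of 300, hypothesized proportion 0.05, α = 0.01); the sample proportion is rounded to 0.033 as in the notes so that z matches the quoted −1.35, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

defectives, n, p_h0, alpha = 10, 300, 0.05, 0.01   # figures from the solution
p_sample = round(defectives / n, 3)                # 0.033, rounded as in the notes

# Sample-size check: p_H0 ± 3 standard errors must lie inside (0, 1)
sigma_p = sqrt(p_h0 * (1 - p_h0) / n)
lower, upper = p_h0 - 3 * sigma_p, p_h0 + 3 * sigma_p
print(0 < lower and upper < 1)                     # True: normal approximation is adequate

z = (p_sample - p_h0) / sigma_p                    # about -1.35
p_val = NormalDist().cdf(z)                        # lower-tailed test: area to the left of z
print(f"z = {z:.2f}, p-value = {p_val:.3f}")
print("reject H0" if p_val < alpha else "do not reject H0")
```

Using the unrounded proportion 10/300 gives a slightly smaller |z|, but the decision at α = 0.01 is the same.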
Question: How many standard errors around the hypothesized value should we use to be 99.44
percent certain that we accept the hypothesis when it is true?
Question: An automobile manufacturer claims that a particular model gets 28 miles to the gallon.
The Environmental Protection Agency, using a sample of 49 automobiles of this model,
finds the sample mean to be 26.8 miles per gallon. From previous studies, the
population standard deviation is known to be 5 miles per gallon. Could we reasonably
expect (within 2 standard errors) that we could select such a sample if indeed the
population mean is actually 28 miles per gallon?
Question: The manufacturer of The X-15 steel-belted radial truck tire claims that the mean mileage the
tire can be driven before the tread wears out is 60,000 miles. The population standard
deviation of the mileage is 5,000 miles. Crosset Truck Company bought 48 tires and found
that the mean mileage for its trucks is 59,500 miles. Is Crosset's experience different from
that claimed by the manufacturer at the .05 significance level?
Question: At the time she was hired as a server at the Grumney Family Restaurant, Beth Brigden was
told, “You can average more than $80 a day in tips.” Assume the standard deviation of the
population distribution is $3.24. Over the first 35 days she was employed at the restaurant,
the mean daily amount of her tips was $84.85. At the 0.1 significance level, can Ms.
Brigden conclude that she is earning an average of more than $80 in tips?
Question: Research at the University of Toledo indicates that 50 percent of students change their
major area of study after their first year in a program. A random sample of 100 students in
the College of Business revealed that 48 had changed their major area of study after their
first year of the program. Has there been a significant decrease in the proportion of students
who change their major after the first year in this program? Test at α= 0.05 .
Question: The McFarland Insurance Company Claims Department reports the mean cost to process a
claim is $60. An industry comparison showed this amount to be larger than most other
insurance companies, so the company instituted cost-cutting measures. To evaluate the
effect of the cost-cutting measures, the Supervisor of the Claims Department selected a
random sample of 26 claims processed last month. The sample information is reported
below.
At the .01 significance level is it reasonable to conclude that mean cost to process a claim
is now less than $60?