
Hypothesis testing

Luca Aguilar

Mathematics Department, School of Technology, University of Extremadura


[email protected]

May 2017

Overview

1 Introduction

2 Hypothesis tests for the parameters of key distributions

3 Relation between confidence intervals and hypothesis tests

4 Classical approach to hypothesis testing



1 Introduction

In the previous lecture you studied point and interval
estimation for the unknown parameter values of
distributions. Both are techniques employed in the
inferential phase of statistical analysis.
In this course we consider hypothesis testing, the third
main inferential technique.
As we shall see, hypothesis testing has much in common
with confidence interval construction, and we will draw
heavily on the results already presented for the sampling
distributions of the random variables underpinning such
constructions.

1 Introduction

We first introduce and illustrate hypothesis testing
procedures for the unknown parameter values of the main
distributions.
In the practical session we will introduce a test used to
explore whether a data set was drawn from a normal
population.

2 Hypothesis tests for the parameters of key distributions

In this section we consider procedures designed to test
whether the parameters of the Bernoulli, binomial, Poisson,
exponential and normal distributions take specified values.
We assume that a single variable, X, is of interest and we
measure its value for each element of a simple random
sample of size n drawn from the population under
consideration.
Due to the use of simple random sampling, it is reasonable
to assume that the values obtained, X1, X2, ..., Xn, are
independent and identically distributed (iid) with a common
distribution which is that of X in the population.

2.1 Bernoulli distribution

As we know, the Bernoulli distribution has a single
parameter: p, the probability of success.
Suppose X1, X2, ..., Xn are iid Bernoulli random variables
and we are interested in testing whether p takes a specific
value, p0 say.
We say that the null hypothesis is that p = p0, denoted as

H0 : p = p0.

2.1 Bernoulli distribution

The null hypothesis is tested against an alternative
hypothesis, denoted as H1, which in this case could be any
one of:
1 H1 : p ≠ p0;
2 H1 : p < p0;
3 H1 : p > p0.
The first is referred to as being two-sided because under it
p could be less than p0 or greater than p0.
The other two are referred to as being one-sided
alternative hypotheses.
Unless we have reasons for investigating one of the
one-sided alternatives, it is usual to test H0 against the
two-sided alternative hypothesis.

2.1 Bernoulli distribution

To make things more concrete, we will consider an
example, Example 1. A large computing company wants
to know what proportion of its workers are in their offices
after 20:00. On a typical working day they chose 50
members of staff at random and found that the proportion
of the 50 who were in their offices after 8 o'clock was
0.72.
So our point estimate of p is p̂ = 0.72.
We can calculate a 95% confidence interval for p,
obtaining the interval (0.60, 0.84).
Suppose now that we wanted to test the null hypothesis
that p = 0.75, i.e. H0 : p = 0.75, against the two-sided
alternative hypothesis that p ≠ 0.75, i.e. H1 : p ≠ 0.75.

2.1 Bernoulli distribution

Having identified the null and alternative hypotheses we
next calculate the value of a test statistic.
For the problem under consideration, and sufficiently large
n, this is none other than the value of the random variable
(p̂ − p)/√(p(1 − p)/n), with p replaced by its value under
the null hypothesis, namely p0.
Implicitly, then, we are assuming the null hypothesis to be
true. And we carry on assuming it to be true until we have
sufficient evidence to the contrary.
So we calculate the value of

Z0 = (p̂ − p0) / √(p0(1 − p0)/n).

2.1 Bernoulli distribution

Z0 = (p̂ − p0) / √(p0(1 − p0)/n).

If p = p0 we would expect p̂ to have a value close to p0,
and hence Z0 to have a value close to 0.
If p were actually less than p0 then we would expect p̂ to be
less than p0 and hence the value of Z0 to be negative.
If p were actually greater than p0 then we would expect p̂ to
be greater than p0 and hence the value of Z0 to be positive.

2.1 Bernoulli distribution

Z0 = (p̂ − p0) / √(p0(1 − p0)/n).

Returning to the data of Example 1, under H0 : p = 0.75
the observed value of Z0 is

z0 = (0.72 − 0.75)/√(0.75(1 − 0.75)/50) = −0.49.

Does this value provide evidence for or against the null
hypothesis?

2.1 Bernoulli distribution

Z0 = (p̂ − p0) / √(p0(1 − p0)/n).

Being an estimator, p̂ is a random variable and, as Z0 is a
function of it, Z0 is a random variable too.
According to the Central Limit Theorem, for large enough
n, the distribution of Z0 should be well approximated by the
standard normal distribution if p = p0.
To decide whether the calculated value of Z0 is sufficiently
far from 0 for us to reject the null hypothesis, we calculate
what is referred to as the p-value of the test, defined as:

p-value

Definition
The probability, under the null hypothesis, of observing a value
of the test statistic that is at least as extreme as its value for the
data.

In this definition, what "extreme" means depends on the
alternative hypothesis under consideration.
For the problem under consideration, the test statistic is Z0.
Values of Z0 close to 0 tend to support the null hypothesis.

p-value

Z0 = (p̂ − p0) / √(p0(1 − p0)/n).

For a two-sided alternative hypothesis, large values of Z0,
either positive or negative, tend to indicate that the
alternative hypothesis is true.
For H1 : p < p0, large negative values of Z0 tend to indicate
that the alternative hypothesis is true.
For H1 : p > p0, large positive values of Z0 tend to indicate
that the alternative hypothesis is true.

p-value

So, for the two-sided alternative hypothesis H1 : p ≠ p0 the
p-value is
P(Z0 ≤ −|z0| or Z0 ≥ |z0|).
As the standard normal distribution is symmetric, this
probability is

2P(Z0 ≤ −|z0|) = 2Φ(−|z0|),

where Φ denotes the distribution function of the standard
normal distribution.

p-value

For the one-sided alternative hypothesis H1 : p < p0 the
p-value is
P(Z0 ≤ z0) = Φ(z0).
For the one-sided alternative hypothesis H1 : p > p0 the
p-value is
P(Z0 ≥ z0) = 1 − Φ(z0).
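All three p-values are simple functions of the observed z0 and the standard normal distribution function. A minimal sketch of this mapping in R (the helper name p_value is illustrative, not part of the lecture):

# p-value of the z test for an observed z0, for each alternative hypothesis
p_value <- function(z0, alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
         two.sided = 2 * pnorm(-abs(z0)),  # 2 * Phi(-|z0|)
         less      = pnorm(z0),            # Phi(z0)
         greater   = 1 - pnorm(z0))        # 1 - Phi(z0)
}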

p-value

Returning to the hypothesis testing problem for the data
of Example 1, for the two-sided alternative hypothesis
H1 : p ≠ 0.75, the p-value is 2Φ(−|−0.49|) = 2Φ(−0.49).
In R this probability can be calculated using
2*pnorm(-0.49)
The value returned by R is 0.62.
If the alternative hypothesis had been H1 : p < 0.75, the
p-value would have been Φ(z0) = Φ(−0.49) = 0.31.
If the alternative hypothesis had been H1 : p > 0.75, the
p-value would have been
1 − Φ(−0.49) = 1 − 0.31 = 0.69.
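The whole calculation for Example 1 can be reproduced in a few lines of R; a minimal sketch using the figures of the example (p̂ = 0.72, p0 = 0.75, n = 50):

phat <- 0.72; p0 <- 0.75; n <- 50            # Example 1 figures
z0 <- (phat - p0) / sqrt(p0 * (1 - p0) / n)  # observed test statistic, about -0.49
2 * pnorm(-abs(z0))                          # two-sided p-value, about 0.62
pnorm(z0)                                    # p-value against H1: p < 0.75, about 0.31
1 - pnorm(z0)                                # p-value against H1: p > 0.75, about 0.69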

p-value

Given its definition, a p-value is a probability and must
therefore take a value in the interval [0, 1].
Large p-values tend to suggest that values of the test
statistic like that observed for the data are highly likely
under the null hypothesis.
Small p-values suggest that values of the test statistic like
that observed for the data are improbable under the null
hypothesis.
So, a large p-value does not provide evidence against the
null hypothesis.
A small p-value does provide evidence against the null
hypothesis and in favour of the alternative hypothesis.

p-value

But how small is small, I hear you ask.
It has become standard practice in many disciplines to take
p-values less than 0.05 (5%, or 1 in 20) as significant
statistical evidence for the rejection of the null hypothesis
and acceptance of the alternative hypothesis.

p-value

For our testing problem for the data of Example 1, none of
the p-values for the different alternative hypotheses are
less than 0.05.
So there is no statistical evidence to reject the null
hypothesis H0 : p = 0.75 in favour of any of the three
alternative hypotheses at the 5% level of significance.
In practice, we would only be interested in testing the null
hypothesis against one of the three alternative
hypotheses, not all three.

p-value

But what if we had wanted to test the null hypothesis
H0 : p = 0.55?

Now, z0 = (0.72 − 0.55)/√(0.55(1 − 0.55)/50) = 2.42,
and:
1 2Φ(−|z0|) = 2Φ(−2.42) = 0.016;
2 Φ(z0) = Φ(2.42) = 0.992;
3 1 − Φ(z0) = 1 − Φ(2.42) = 0.008.
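A quick check of these three numbers in R, using the same figures:

z0 <- (0.72 - 0.55) / sqrt(0.55 * (1 - 0.55) / 50)  # about 2.42
2 * pnorm(-abs(z0))                                 # two-sided p-value, about 0.016
pnorm(z0)                                           # p-value against H1: p < 0.55, about 0.992
1 - pnorm(z0)                                       # p-value against H1: p > 0.55, about 0.008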

p-value
So, if the alternative hypothesis had been H1 : p ≠ 0.55,
we would have rejected the null hypothesis H0 : p = 0.55
in favour of the alternative hypothesis at the 5%
significance level because 0.016 is less than 0.05.
If the alternative hypothesis had been H1 : p < 0.55 we
would not have rejected the null hypothesis in favour of
the alternative hypothesis at the 5% significance level
because 0.992 is (much) greater than 0.05.
If the alternative hypothesis had been H1 : p > 0.55 we
would have rejected the null hypothesis in favour of the
alternative hypothesis at the 5% significance level
because 0.008 is smaller than 0.05.
So, clearly, the result of a hypothesis test very much
depends on the alternative hypothesis being investigated,
not just the null hypothesis.

2.1 Bernoulli distribution

The steps in this hypothesis testing procedure are:
1 Identify the value of p0 for the null hypothesis H0 : p = p0.
2 Identify the alternative hypothesis H1 of interest (one of
p ≠ p0, p < p0 or p > p0).
3 Calculate the value of the test statistic
z0 = (p̂ − p0) / √(p0(1 − p0)/n) for the data.
4 Calculate the p-value for the chosen alternative hypothesis
(either 2Φ(−|z0|), Φ(z0) or 1 − Φ(z0)).
5 If the p-value is less than 0.05, reject the null hypothesis in
favour of the alternative hypothesis (at the 5% significance
level).
6 If not, there is no statistically significant evidence against
the null hypothesis and there is no reason to reject it in
favour of the alternative hypothesis.
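As a sketch, these six steps can be wrapped up in a small R function; the function name and arguments below are illustrative, not part of the lecture:

# z test for the success probability of a Bernoulli distribution
# phat: sample proportion; n: sample size; p0: value under H0;
# alternative: which alternative hypothesis is of interest
prop_z_test <- function(phat, n, p0,
                        alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  z0 <- (phat - p0) / sqrt(p0 * (1 - p0) / n)      # step 3
  p_value <- switch(alternative,                   # step 4
                    two.sided = 2 * pnorm(-abs(z0)),
                    less      = pnorm(z0),
                    greater   = 1 - pnorm(z0))
  list(z0 = z0, p_value = p_value,
       reject_at_5pct = p_value < 0.05)            # steps 5 and 6
}

# Example 1: H0: p = 0.75 against H1: p != 0.75; p-value about 0.62, so H0 is not rejected
prop_z_test(phat = 0.72, n = 50, p0 = 0.75)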

2.2 Binomial distribution B(k, p)

We will assume that the number of Bernoulli trials, k, is
known, and concentrate on hypothesis testing for p, the
probability of success in a Bernoulli trial.
Proceeding as in Section 2.1, the steps for this hypothesis
testing procedure are the same as those at the end of the
previous subsection apart from the third, which changes to:
3. Calculate the value of the test statistic
z0 = (p̂ − p0) / √(p0(1 − p0)/(kn)) for the data.
The only difference between the test statistic here and that
for the test in Section 2.1 is the inclusion of the number of
Bernoulli trials, k.

Example 2

45 Computer Engineering and 52 Software Engineering
students attempted to complete 6 simple programming
tasks in two hours. It was found that, for the 45 Computer
Engineering students, the proportion of successfully
completed programming tasks was 0.65, and for the 52
Software Engineering students the proportion was 0.78.
We can calculate 95% confidence intervals for the value
of p in the two populations of students, obtaining the
interval (0.59, 0.71) for the Computer Engineering
students and (0.73, 0.83) for the Software Engineering
students.
Here we consider testing the null hypothesis
H0 : p = 0.68 against the alternative hypothesis
H1 : p > 0.68 in each of the two populations.

Example 2

For the Computer Engineering students,
z0 = (0.65 − 0.68)/√(0.68(1 − 0.68)/(6 × 45)) = −1.06,
and for the Software Engineering students,
z0 = (0.78 − 0.68)/√(0.68(1 − 0.68)/(6 × 52)) = 3.79.
For the Computer Engineering students,
1 − Φ(z0) = 1 − Φ(−1.06) = 0.86,
and for the Software Engineering students,
1 − Φ(z0) = 1 − Φ(3.79) = 0.00008.
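A sketch of these two calculations in R, using the figures stated in Example 2:

# Computer Engineering students: phat = 0.65, k*n = 6*45 trials
z0_ce <- (0.65 - 0.68) / sqrt(0.68 * (1 - 0.68) / (6 * 45))
1 - pnorm(z0_ce)   # p-value against H1: p > 0.68, about 0.86

# Software Engineering students: phat = 0.78, k*n = 6*52 trials
z0_se <- (0.78 - 0.68) / sqrt(0.68 * (1 - 0.68) / (6 * 52))
1 - pnorm(z0_se)   # p-value against H1: p > 0.68, about 0.00008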

Example 2

As the p-value for the Computer Engineering students is
greater than 0.05, we have no reason to reject the null
hypothesis H0 : p = 0.68 in favour of the alternative
hypothesis H1 : p > 0.68 at the 5% significance level.
However, the p-value for the Software Engineering
students is far less than 0.05 and so we reject the null
hypothesis H0 : p = 0.68 in favour of the alternative
hypothesis H1 : p > 0.68 at the 5% significance level.

2.3 Poisson distribution

Here we consider hypothesis testing for the parameter λ of
a Poisson distribution.
Proceeding along analogous lines to those in Section 2.1,
the steps for this hypothesis testing procedure are the
same as those at the end of Section 2.1 apart from the first
three, which change to:
1 Identify the value of λ0 for the null hypothesis H0 : λ = λ0.
2 Identify the alternative hypothesis H1 of interest (one of
λ ≠ λ0, λ < λ0 or λ > λ0).
3 Calculate the value of the test statistic
z0 = (x̄ − λ0) / √(λ0/n) for the data.

Example 3

Number of users connecting to the network each minute
during a period of an hour:
4 5 7 8 7 6 7 4 4 7
9 7 3 5 10 8 7 4 7 10
6 6 6 6 9 8 8 7 4 4
9 6 11 5 12 11 7 4 4 10
8 11 6 8 8 10 7 12 6 6
7 10 8 12 3 6 3 8 4 4

The point estimate of λ, λ̂ = x̄, is 6.98.
We now test the null hypothesis H0 : λ = 8 against the
alternative hypothesis H1 : λ < 8.
The value of the test statistic is
z0 = (6.98 − 8)/√(8/60) = −2.79.

Example 3

The p-value is Φ(z0) = Φ(−2.79) = 0.003.
As the p-value is considerably less than 0.05, we reject
the null hypothesis H0 : λ = 8 in favour of the alternative
hypothesis H1 : λ < 8 at the 5% significance level.
We therefore have significant statistical evidence that the
mean number of connections to the network per minute is
less than 8.
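A sketch of the whole calculation in R, starting from the data in the table above:

# Connections per minute over one hour (the 60 values of Example 3)
x <- c(4, 5, 7, 8, 7, 6, 7, 4, 4, 7,
       9, 7, 3, 5, 10, 8, 7, 4, 7, 10,
       6, 6, 6, 6, 9, 8, 8, 7, 4, 4,
       9, 6, 11, 5, 12, 11, 7, 4, 4, 10,
       8, 11, 6, 8, 8, 10, 7, 12, 6, 6,
       7, 10, 8, 12, 3, 6, 3, 8, 4, 4)
lambda0 <- 8
z0 <- (mean(x) - lambda0) / sqrt(lambda0 / length(x))  # about -2.79
pnorm(z0)                                              # p-value against H1: lambda < 8, about 0.003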

Table 1: Summary of hypothesis tests, with the column headings denoting: D, distribution; NH, null hypothesis; AH,
alternative hypothesis; TS, test statistic; SD, sampling distribution under the null hypothesis; PV, p-value.

D            NH          AH          TS                                  SD         PV
Bernoulli    p = p0      p ≠ p0      Z0 = (p̂ − p0)/√(p0(1 − p0)/n)       N(0, 1)    2Φ(−|z0|)
                         p < p0                                                     Φ(z0)
                         p > p0                                                     1 − Φ(z0)
Binomial     p = p0      p ≠ p0      Z0 = (p̂ − p0)/√(p0(1 − p0)/(kn))    N(0, 1)    2Φ(−|z0|)
                         p < p0                                                     Φ(z0)
                         p > p0                                                     1 − Φ(z0)
Poisson      λ = λ0      λ ≠ λ0      Z0 = (X̄ − λ0)/√(λ0/n)               N(0, 1)    2Φ(−|z0|)
                         λ < λ0                                                     Φ(z0)
                         λ > λ0                                                     1 − Φ(z0)
Exponential  λ = λ0      λ ≠ λ0      Z0 = (λ0 X̄ − 1)√n                   N(0, 1)    2Φ(−|z0|)
                         λ < λ0                                                     1 − Φ(z0)
                         λ > λ0                                                     Φ(z0)
Normal       μ = μ0      μ ≠ μ0      T0 = (X̄ − μ0)/(S/√n)                t_{n−1}    2F_{t,n−1}(−|t0|)
                         μ < μ0                                                     F_{t,n−1}(t0)
                         μ > μ0                                                     1 − F_{t,n−1}(t0)
             σ² = σ0²    σ² ≠ σ0²    C0 = (n − 1)S²/σ0²                  χ²_{n−1}   2 min(F_{χ²,n−1}(c0), 1 − F_{χ²,n−1}(c0))
                         σ² < σ0²                                                   F_{χ²,n−1}(c0)
                         σ² > σ0²                                                   1 − F_{χ²,n−1}(c0)
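The two normal-distribution rows of Table 1 are not worked through in an example above, so here is a minimal sketch of both tests in R; the data vector x and the null values mu0 and sigma0_sq are illustrative, not from the lecture:

x <- c(10.2, 9.8, 11.1, 10.5, 9.6, 10.9, 10.4, 9.9)   # illustrative sample
n <- length(x)

# Test of the mean: H0: mu = mu0 against H1: mu != mu0
mu0 <- 10
t0 <- (mean(x) - mu0) / (sd(x) / sqrt(n))
2 * pt(-abs(t0), df = n - 1)                          # two-sided p-value
# t.test(x, mu = mu0) reports the same statistic and p-value

# Test of the variance: H0: sigma^2 = sigma0_sq against H1: sigma^2 != sigma0_sq
sigma0_sq <- 0.25
c0 <- (n - 1) * var(x) / sigma0_sq
2 * min(pchisq(c0, df = n - 1), 1 - pchisq(c0, df = n - 1))  # two-sided p-value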

3 Relation between confidence intervals and hypothesis tests

There is clearly much in common between the confidence
interval constructions and the hypothesis testing
procedures.
More generally, a 100(1 − α)% confidence interval for a
parameter, θ say, contains all those values of θ0 for which
the null hypothesis H0 : θ = θ0 would not be rejected at the
100α% significance level against the two-sided alternative
hypothesis H1 : θ ≠ θ0.

3 Relation between confidence intervals and hypothesis tests

So, a 95% confidence interval for θ identifies all those
values of θ0 for which the null hypothesis H0 : θ = θ0
would not be rejected at the 5% significance level against
the alternative hypothesis H1 : θ ≠ θ0.
As the significance level most often applied in hypothesis
testing is 5%, this is the main reason why 95%
confidence intervals are usually quoted.
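This duality is easy to see with R's built-in one-sample t test, where the 95% confidence interval and the two-sided 5%-level tests come from the same output; a sketch with an illustrative data vector:

x <- c(10.2, 9.8, 11.1, 10.5, 9.6, 10.9, 10.4, 9.9)   # illustrative sample

t.test(x)$conf.int             # 95% confidence interval for the mean

# Any mu0 inside that interval gives a two-sided p-value above 0.05,
# and any mu0 outside it gives a two-sided p-value below 0.05:
t.test(x, mu = 10.3)$p.value   # mu0 inside the interval  -> p-value > 0.05
t.test(x, mu = 12.0)$p.value   # mu0 outside the interval -> p-value < 0.05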

3 Relation between confidence intervals and hypothesis tests

As we have seen, a confidence interval gives us an idea of
the range of possible values the parameter of interest, θ,
might take (for a given confidence level).
On the other hand, the p-value of a test gives us an idea of
how likely, or unlikely, it is that θ = θ0 against a specified
alternative hypothesis.

4 Classical approach to hypothesis testing

We have based our approach to hypothesis testing on the
calculation of a p-value and its comparison with α = 0.05,
or the significance level of 100α% = 5%.
In the classical approach to hypothesis testing, a
significance level, 100α%, is chosen and the acceptance
region of values of the test statistic is identified for the
chosen significance level and the alternative hypothesis
under consideration.
The value of the test statistic is calculated and, if it falls
outside the acceptance region, and hence inside the
so-called critical region, the null hypothesis is rejected.
If not, the null hypothesis is not rejected.

4 Classical approach to hypothesis testing

Instead of identifying the acceptance and critical regions,
one can equivalently calculate the p-value of the test and
reject the null hypothesis if that p-value is less than α,
corresponding to the significance level of 100α%.
This is what we have done, employing the commonly used
significance level of 5%.
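As a sketch, the two decision rules applied to Example 1 (z0 ≈ −0.49, two-sided alternative, α = 0.05) give the same answer:

alpha <- 0.05
z0 <- -0.49                     # observed statistic from Example 1

# Classical approach: reject H0 if z0 falls inside the critical region
z_crit <- qnorm(1 - alpha / 2)  # critical value, about 1.96
abs(z0) > z_crit                # FALSE: z0 is in the acceptance region, so do not reject

# p-value approach: reject H0 if the p-value is below alpha
2 * pnorm(-abs(z0)) < alpha     # FALSE: the same decision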

Table 2: Summary of the two types of errors that can occur when
performing an hypothesis test.

              Accept H0          Reject H0
H0 true       correct decision   Type I error
H0 false      Type II error      correct decision

The significance level of a test is the probability of
committing a type I error.
We commit a type I error when we reject the null
hypothesis when it is in fact true.
This type of error is always possible, unless, of course, we
never reject the null hypothesis!
An argument for not doing so is based on a consideration
of the type II error.
We commit a type II error when we accept the null
hypothesis when in fact the alternative hypothesis is true
(i.e. we do not reject the null hypothesis when it is false).

Type I and Type II Errors

Clearly, we would like to reduce the probabilities of
committing such errors as much as possible.
However, it is difficult to control both of them
simultaneously and the classical approach is to fix one of
them: the probability of committing a type I error, α, or
equivalently the significance level 100α%.
For a given significance level, 100α%, the power of a test
is 1 − β, where β denotes the probability of committing a
type II error.
For a given significance level, it is natural to seek tests with
high power: so-called powerful tests.
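As a sketch, the power of the one-sided test for a Bernoulli proportion from Section 2.1 can be approximated directly under an assumed true value of p; the function name and the figures below are illustrative, not from the lecture:

# Approximate power of the test of H0: p = p0 against H1: p > p0 at level alpha,
# when the true success probability is p_true and the sample size is n
power_prop_test_upper <- function(p0, p_true, n, alpha = 0.05) {
  z_crit <- qnorm(1 - alpha)              # reject when Z0 > z_crit
  se0 <- sqrt(p0 * (1 - p0) / n)          # standard error used by the test statistic
  se1 <- sqrt(p_true * (1 - p_true) / n)  # standard error under the true p
  # P(phat > p0 + z_crit * se0) when phat is approximately N(p_true, se1^2)
  1 - pnorm((p0 + z_crit * se0 - p_true) / se1)
}

power_prop_test_upper(p0 = 0.55, p_true = 0.70, n = 50)   # power about 0.70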
