S1 Chp7 HypothesisTesting


Stats1 Chapter 7 :: Hypothesis Testing
Mrs Barton

Last modified: 27th April 2020


Experimental
i.e. Dealing with collected data.

Chp1: Data Collection
Methods of sampling, types of data, and populations vs samples.

Chp2: Measures of Location/Spread
Statistics used to summarise data, including mean, standard deviation, quartiles, percentiles. Use of linear interpolation for estimating medians/quartiles.

Chp3: Representation of Data
Producing and interpreting visual representations of data, including box plots and histograms.

Chp4: Correlation
Measuring how related two variables are, and using linear regression to predict values.

Theoretical
Deal with probabilities and modelling to make inferences about what we ‘expect’ to see or make predictions, often using this to reason about/contrast with experimentally collected data.

Chp5: Probability
Venn Diagrams, mutually exclusive + independent events, tree diagrams.

Chp6: Statistical Distributions
Common distributions used to easily find probabilities under certain modelling conditions, e.g. binomial distribution.

Chp7: Hypothesis Testing
Determining how likely observed data would have happened ‘by chance’, and making subsequent deductions.

What is Hypothesis Testing?
I throw a coin 10 times. For what numbers of heads might
you conclude that the coin is biased towards heads? Why?

Our intuition is that the further away we are from the ‘expected’
number of heads (i.e. 5 heads out of 10), the more unlikely it is.

[Figure: bar chart of p(x) against number of heads (x), 0 to 10. Annotations: 8 heads is still moderately likely to happen ‘by chance’; outcomes further into the tail are unlikely to happen ‘by chance’.]
! A hypothesis is a statement made about the value of a population parameter that we wish to test by collecting evidence in the form of a sample.
• The null hypothesis, $H_0$, is the default position, i.e. that nothing has changed, unless proven otherwise.
• The alternative hypothesis, $H_1$, is that there has been some change in the population parameter.

In this context…
We’re asking “is the coin biased?”. This is making a statement about the probability of getting Heads (i.e. the $p$ in $B(10, p)$).
The ‘default position’ is that the coin is fair, i.e. $H_0: p = 0.5$.
The ‘alternative’ position is that the coin is biased towards heads, i.e. $H_1: p > 0.5$ ($p$ is more than 0.5).

[Figure: bar chart of p(x) against number of heads (x), 0 to 10. For the range of outcomes outside the upper tail we wouldn’t conclude the coin is biased, i.e. we’d “accept $H_0$”. For the range of outcomes in the upper tail we’d conclude that this number of heads was too unlikely to happen by chance, and hence reject $H_0$ (i.e. that coin was fair) and accept $H_1$ (i.e. that coin was biased).]

! In a hypothesis test, the evidence from the sample is a test statistic.

In this context…
The test statistic is what we observed; in this case, $X$ is the number of heads seen in 10 throws.
Note that the test statistic has a distribution (i.e. across the possible things we might observe):
$X \sim B(10, p)$
noting that $p$ is not known until we start making assumptions.

[Figure: bar chart of p(x) against number of heads (x), 0 to 10.]

• The level of significance $\alpha$ is the maximum probability where we would reject the null hypothesis. This is usually 5% or 1%.

In this context…
We said that if we saw a number of heads within ranges of outcomes that were sufficiently unlikely, then we’d rule out that the coin is fair and conclude it was in fact biased.
But how unlikely is ‘sufficiently unlikely’? If $\alpha = 5\%$, then we’d find a region of outcomes where there’s (at most) a 5% chance of one of these extreme values happening ‘by chance’ (i.e. if the coin was fair).

[Figure: bar chart of p(x) against number of heads (x), 0 to 10. Annotation on the upper tail: if $\alpha = 5\%$, we’ve set this region so that there’s at most a 5% probability of seeing these numbers of heads ‘by chance’.]
What is Hypothesis Testing?

! Hypothesis testing in a nutshell then is:

1. We have some hypothesis we wish to test (e.g. coin is biased towards heads), so…
2. We collect some sample data by throwing the coin (giving us our ‘test statistic’) and…
3. If that number of heads (or more) is sufficiently unlikely to have emerged ‘just by chance’, then we conclude that our (alternative) hypothesis is correct, i.e. the coin is biased.
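The three steps can be tried out numerically for the 10-throw coin. This is only a rough sketch, not part of the original slides: it assumes a fair coin under the null hypothesis, an illustrative 5% threshold, and a made-up helper name `prob_at_least`.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ B(n, p): probability of k or more successes."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_throws = 10      # number of coin throws in the example
p_fair = 0.5       # assumed probability of heads if the coin is fair
alpha = 0.05       # illustrative 5% threshold for 'sufficiently unlikely'

for heads in range(n_throws + 1):
    tail = prob_at_least(heads, n_throws, p_fair)
    verdict = "unlikely by chance" if tail < alpha else "plausible by chance"
    print(f"{heads:2d} heads or more: {tail:.4f}  ({verdict})")
```

Each row answers step 3 for one possible observation: how likely is that many heads (or more) if the coin really were fair?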
Null Hypothesis and Alternative Hypothesis
[Textbook] John wants to see whether a coin is unbiased or whether it is biased towards coming down heads. He tosses the coin 8 times and counts the number of times, $X$, it lands head uppermost.

We said that our two hypotheses are about the population parameter.
Suppose $p$ is the probability of a coin landing heads.

Null hypothesis: $H_0: p = 0.5$
Alternative hypothesis: $H_1: p > 0.5$

Under the null hypothesis $H_0$, we assume that the population parameter is correct, in this case, that it is a normal coin and the probability of Heads is 0.5.
Under the alternative hypothesis $H_1$, there has been an underlying change in the population parameter, in this case that the coin is actually biased towards Heads.

The latter is known as a ‘one-tailed test’ because we’re saying the coin is biased one way or the other (i.e. $H_1: p > 0.5$ or $H_1: p < 0.5$).
But we could also have had the hypothesis ‘the coin is biased (either way)’, i.e. $H_1: p \ne 0.5$.
This is known as a two-tailed test.
Further Example
[Textbook] An election candidate believes she has the support of 40% of the residents in a particular town. A researcher wants to test, at the 5% significance level, whether the candidate is over-estimating her support. The researcher asks 20 people whether they support the candidate or not. 3 people say they do.
a) Write down a suitable test statistic.
b) Write down two suitable hypotheses.
c) Explain the condition under which the null hypothesis would be rejected.

a) The number of people who say they support the candidate.
For a hypothesis test involving the binomial distribution, the test statistic is always the count of successes.

b) $H_0: p = 0.4$
$H_1: p < 0.4$
The alternative hypothesis is that the candidate is overestimating her support, so we’re interested in the case where less than 40% support her (more than 40% would not undermine the candidate’s claim).

c) Null hypothesis would be rejected if the probability of 3 or fewer people supporting the candidate is less than 5%, given that $p = 0.4$.
This is the hard bit!
We always calculate the probability of seeing this outcome or more extreme (in this case, ‘more extreme’ meaning even fewer than 3 people, because this takes us even further from the expected number of people out of the 20 (i.e. 8) who would support her).
The “$p = 0.4$” bit is because, as discussed before, we calculate the probability of seeing the observed outcome of 3 people (or more extreme) if it occurred purely by chance (the null hypothesis), i.e. if the candidate did have 40% support.
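The condition in part (c) can be checked directly. The sketch below is an illustration only (the slide asks just for the condition, not the calculation); the helper name `binom_cdf` and the variable names are made up for this example, and it assumes the number of supporters is $B(20, 0.4)$ under the null hypothesis.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, x + 1))

n_asked = 20        # people asked by the researcher
p_claimed = 0.4     # support assumed under the null hypothesis
observed = 3        # people who said they support the candidate
alpha = 0.05        # 5% significance level

# 'Observed or more extreme' here means 3 or fewer supporters.
p_lower_tail = binom_cdf(observed, n_asked, p_claimed)
reject_null = p_lower_tail < alpha
print(f"P(X <= {observed}) = {p_lower_tail:.4f}; reject H0: {reject_null}")
```

Under these assumptions the tail probability comes out at roughly 0.016, which is below 5%, so the rejection condition in (c) would be met.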
Test Your Understanding
In the UK, 5% of students turn up late to school each day. Mr Hameed wishes to determine, to a 10% significance level, if his school, Piffin School, has a problem with attendance. He stands at the front gate one day and finds that 6 of the 40 students who pass him are late.
a) Write down a suitable test statistic.
b) Write down two suitable hypotheses.
c) Explain the condition under which the null hypothesis would be rejected.

a) The number of students who are late to school that day.

b) $H_0: p = 0.05$
$H_1: p > 0.05$
If there is indeed a problem with attendance (i.e. the school is not the norm) then we expect the proportion of students late at the school to be higher than 5%.

c) Null hypothesis would be rejected if the probability of 6 or more students being late to school is less than 10%, given that $p = 0.05$.

Just to reiterate again what’s going on here:
a) We assume that the school is the norm (i.e. 5% of students are late), and calculate the probability that 6 or more students would be late under this assumption.
b) If the probability of this happening by chance is sufficiently unlikely (less than the 10% significance level), we conclude that it probably isn’t just by chance, and the school’s lateness rate is worse than 5%.
Exercise 7A
Pearson Applied Year 1/AS
Pages 100-101
Critical Regions and Values
John wants to see whether a coin is unbiased or whether it is biased towards coming down heads. He tosses the coin 8 times and counts the number of times, $X$, it lands head uppermost. What values would lead to the null hypothesis being rejected?

As before, we’re interested in how likely a given outcome is to happen ‘just by chance’ under the null hypothesis (i.e. when the coin is not biased).

The probability of getting exactly 5 heads is only 22%, which is more likely to not happen than to happen. If we saw this number of heads, why would it not be sensible to think the coin is biased?
The probability is only low because there are lots of possible outcomes. But 5 heads forms part of a range of possible numbers of heads that collectively would be consistent with a coin not biased towards heads.

However, there are values which collectively form a range of ‘extreme values’ where it would be unlikely that the coin would be unbiased. Their combined probability is limited by the level of significance set (e.g. 5%).

[Figure: bar chart of probability under $H_0$ against number of heads, 0 to 8.]
Critical Regions and Values
John wants to see whether a coin is unbiased or whether it is biased towards coming down heads. He tosses the coin 8 times and counts the number of times, $X$, it lands head uppermost. What values would lead to the null hypothesis being rejected, if the significance level was 5%?

C.D.F. Binomial table ($n = 8$, $p = 0.5$):
x          0       1       2       3       4       5       6       7
P(X ≤ x)   0.0039  0.0352  0.1445  0.3633  0.6367  0.8555  0.9648  0.9961

What’s the probability that we would see 6 heads, or an even more extreme value? Is this sufficiently unlikely to support John’s claim that the coin is biased?
$P(X \ge 6) = 1 - P(X \le 5) = 1 - 0.8555 = 0.1445$
Insufficient evidence to reject null hypothesis (since $0.1445 > 0.05$).

What’s the probability that we would see 7 heads, or an even more extreme value?
$P(X \ge 7) = 1 - P(X \le 6) = 1 - 0.9648 = 0.0352$
Since 0.0352 < 0.05, this is very unlikely, so we reject the null hypothesis and accept the alternative hypothesis that the coin is biased.
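The C.D.F. table values and the two ‘or more extreme’ probabilities can be reproduced with a few lines of code. This is a minimal sketch assuming $X \sim B(8, 0.5)$ under the null hypothesis; the variable names are chosen here for illustration.

```python
from math import comb

n, p = 8, 0.5   # 8 tosses, fair coin assumed under the null hypothesis

# Probability of exactly x heads, then running totals give P(X <= x).
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
cdf = []
running = 0.0
for prob in pmf:
    running += prob
    cdf.append(running)

for x, value in enumerate(cdf):
    print(f"{x}  {value:.4f}")

# 'x or more' is the complement of 'x - 1 or fewer'.
p_six_or_more = 1 - cdf[5]
p_seven_or_more = 1 - cdf[6]
print(f"P(X >= 6) = {p_six_or_more:.4f}, P(X >= 7) = {p_seven_or_more:.4f}")
```

The key step is the complement: tables give ‘at most’ probabilities, so ‘6 or more’ has to be found as one minus ‘5 or fewer’.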

! The critical region is the range of values of the test statistic that would lead to you rejecting $H_0$.

C.D.F. Binomial table ($n = 8$, $p = 0.5$):
x          0       1       2       3       4       5       6       7       8
P(X ≤ x)   0.0039  0.0352  0.1445  0.3633  0.6367  0.8555  0.9648  0.9961  1

If level of significance 5%, critical region?
We saw that 95% is exceeded when $X = 6$ (since $P(X \le 6) = 0.9648$). This means $P(X \ge 7) = 1 - 0.9648 = 0.0352 < 0.05$, so the critical region is $X \ge 7$.
Fro Tip: Use the first value AFTER the one in the table that exceeds 95%.

! The value(s) on the boundary of the critical region are called critical value(s).
Critical value: $X = 7$
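The table-lookup tip can be written out as a tiny procedure. This sketch simply hard-codes the C.D.F. values from the table for $B(8, 0.5)$ and follows the tip literally; the names `cdf_table`, `first_exceeding` and `critical_value` are made up for illustration, and it assumes an upper-tail test where the first exceeding value is not the largest outcome.

```python
# C.D.F. values P(X <= x) for X ~ B(8, 0.5), copied from the slide's table.
cdf_table = {0: 0.0039, 1: 0.0352, 2: 0.1445, 3: 0.3633, 4: 0.6367,
             5: 0.8555, 6: 0.9648, 7: 0.9961, 8: 1.0}
alpha = 0.05

# First x whose cumulative probability exceeds 1 - alpha (here 95%).
first_exceeding = min(x for x, c in cdf_table.items() if c > 1 - alpha)
critical_value = first_exceeding + 1   # the tip: use the value AFTER it

# Probability of landing in the critical region X >= critical_value.
p_critical_region = 1 - cdf_table[critical_value - 1]
print(critical_value, round(p_critical_region, 4))
```

Going one past the first exceeding value works because $P(X \ge c) = 1 - P(X \le c - 1)$: once the cumulative value at $c - 1$ is above 95%, what is left for the tail is below 5%.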
Quickfire Critical Regions
Determine the critical region when we throw a coin where we’re trying to establish if there’s the specified bias, given the specified number of throws, when the level of significance is 5%.

Fro Reminder: At the positive tail, use the value AFTER the first that exceeds 95% (100 - 5). At the negative tail, we just use the first value that goes under the significance level.

1. Coin thrown 5 times. Trying to establish if biased towards heads. ($p = 0.5$, $n = 5$)
x          0       1       2       3       4
P(X ≤ x)   0.0312  0.1875  0.5000  0.8125  0.9688
Critical region: $X \ge 5$

2. Coin thrown 10 times. Trying to establish if biased towards heads. ($p = 0.5$, $n = 10$)
x          0       1       2       …   7       8       9
P(X ≤ x)   0.0010  0.0107  0.0547  …   0.9453  0.9893  0.9990
Critical region: $X \ge 9$

3. Coin thrown 10 times. Trying to establish if biased towards tails. ($p = 0.5$, $n = 10$)
x          0       1       2       …   7       8       9
P(X ≤ x)   0.0010  0.0107  0.0547  …   0.9453  0.9893  0.9990
Critical region: $X \le 1$
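The three quickfire cases can be checked without tables by growing a tail from its extreme end until adding another outcome would take it to 5% or more. This is a sketch under that reading of the rule; `critical_region` and its `tail` argument are hypothetical names invented for this example.

```python
from math import comb

def pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def critical_region(n, p, alpha, tail):
    """Return the boundary c of a one-tailed critical region.

    tail="upper": region is X >= c; tail="lower": region is X <= c.
    Returns None when even the most extreme outcome is not rare enough.
    """
    order = range(n, -1, -1) if tail == "upper" else range(0, n + 1)
    total, boundary = 0.0, None
    for x in order:                 # start at the extreme end of the chosen tail
        total += pmf(x, n, p)
        if total >= alpha:          # including x would reach the significance level
            break
        boundary = x
    return boundary

print(critical_region(5, 0.5, 0.05, "upper"))    # 5 throws, biased towards heads
print(critical_region(10, 0.5, 0.05, "upper"))   # 10 throws, biased towards heads
print(critical_region(10, 0.5, 0.05, "lower"))   # 10 throws, biased towards tails
```

The same loop serves both tails; only the direction of travel changes, which mirrors the reminder about reading the table from different ends.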
Actual Significance Level
John wants to see whether a coin is unbiased or whether it is biased towards coming down heads. He tosses the coin 8 times and counts the number of times, $X$, it lands head uppermost. What values would lead to the null hypothesis being rejected, if the significance level was 5%?

C.D.F. Binomial table ($n = 8$, $p = 0.5$):
x          0       1       2       3       4       5       6       7       8
P(X ≤ x)   0.0039  0.0352  0.1445  0.3633  0.6367  0.8555  0.9648  0.9961  1

We saw earlier that the critical region was $X \ge 7$, i.e. the region in which John would reject the null hypothesis (and conclude the coin was biased).

We ensured that $P(X \ge 7)$ was less than the significance level of 5%.
But what actually is $P(X \ge 7)$?
$P(X \ge 7) = 1 - P(X \le 6) = 1 - 0.9648 = 0.0352$

This is known as the actual significance level, i.e. the probability that we’re in the critical region. We expected this to be less than, but close to, 5%.

! The actual significance level is the actual probability of being in the critical region.
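Why the actual level is not exactly 5% becomes clearer when the probabilities of the neighbouring candidate regions are listed side by side. A short sketch, assuming $X \sim B(8, 0.5)$ under the null hypothesis (the helper name `prob_at_least` is made up here):

```python
from math import comb

def prob_at_least(c, n, p):
    """P(X >= c) for X ~ B(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c, n + 1))

n, p0 = 8, 0.5            # John's 8 tosses, fair coin under the null hypothesis
target_level = 0.05       # the 5% level John chose

# Candidate critical regions X >= c and the probability of landing in each.
for c in (8, 7, 6):
    print(f"X >= {c}: {prob_at_least(c, n, p0):.4f}")

actual_level = prob_at_least(7, n, p0)   # region X >= 7 found earlier
print(f"target {target_level:.2%}, actual {actual_level:.2%}")
```

Because the number of heads is a whole number, the tail probability jumps from one region to the next, so the largest region that stays under 5% usually falls short of it.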
Two-tailed test
Suppose I threw a coin 8 times and was now interested in how many heads would suggest it was a biased coin (i.e. either way!). How do we work out the critical values now, with 5% significance?

[Figure: bar chart of probability under $H_0$ against number of heads, 0 to 8, with a critical region at each end and the acceptance region in between.]

We split the 5% so there’s 2.5% at either tail, then proceed as normal:

C.D.F. Binomial table ($n = 8$, $p = 0.5$):
x          0       1       2       …   6       7       8
P(X ≤ x)   0.0039  0.0352  0.1445  …   0.9648  0.9961  1

Critical region at positive tail:
Look at closest value above 0.975 (then go one above):
$P(X \le 7) = 0.9961$, so the critical region is $X \ge 8$, i.e. $X = 8$.

Critical region at negative tail:
Look at closest value below 0.025.
$P(X \le 0) = 0.0039$, so the critical region is $X \le 0$, i.e. $X = 0$.
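The ‘split the significance level in half’ idea can be sketched as a single function that grows each tail separately. This is an illustration under the assumption that each tail must stay strictly below half the significance level; `two_tailed_region` is a hypothetical name.

```python
from math import comb

def two_tailed_region(n, p, alpha):
    """Critical region for a two-tailed test with alpha/2 allowed in each tail.

    Returns (lower, upper): reject when X <= lower or X >= upper.
    Either bound is None if no outcome in that tail is rare enough.
    """
    pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    half = alpha / 2

    lower, total = None, 0.0
    for x in range(n + 1):              # grow the lower tail from 0 upwards
        total += pmf[x]
        if total >= half:
            break
        lower = x

    upper, total = None, 0.0
    for x in range(n, -1, -1):          # grow the upper tail from n downwards
        total += pmf[x]
        if total >= half:
            break
        upper = x

    return lower, upper

print(two_tailed_region(8, 0.5, 0.05))   # 8 throws, fair coin under H0, 5% level
```

With only 8 throws and 2.5% per tail, each critical region shrinks to a single outcome, which is why two-tailed tests on small samples are hard to ‘pass’.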
Test Your Understanding
A random variable $X$ has binomial distribution $B(40, p)$. A single observation is used to test $H_0: p = 0.25$ against $H_1: p \ne 0.25$. The $\ne$ indicates bias either way, i.e. two-tailed.

a) Using the 2% level of significance, find the critical region of this test. The probability in each tail should be as close as possible to 0.01.
(This means you find the closest to 0.01, even if slightly above, rather than the closest under 0.01.)
b) Write down the actual significance level of the test.

C.D.F. Binomial table ($n = 40$, $p = 0.25$):
x          2       3       4       5       …   16      17      18      19
P(X ≤ x)   0.0010  0.0047  0.0160  0.0433  …   0.9884  0.9953  0.9983  0.9994

a) (Half of 0.02 is 0.01.)
Lower tail: $P(X \le 3) = 0.0047$ and $P(X \le 4) = 0.0160$; 0.0047 is closer to 0.01.
Upper tail: $P(X \ge 17) = 1 - 0.9884 = 0.0116$ and $P(X \ge 18) = 1 - 0.9953 = 0.0047$; 0.0116 is closer to 0.01.
Critical region is $X \le 3$ or $X \ge 17$.
Note that $X$ can’t go below 0 or exceed 40.
To ensure all method marks, always show the probability of being in the critical region (even if you don’t subsequently need the value!)

b) Actual significance level $= 0.0047 + 0.0116 = 0.0163$

Warning: Textbook has several typos in this example.
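The ‘as close as possible to 0.01’ rule differs from the usual ‘stay under the level’ rule, so it is worth spelling out. The sketch below assumes the distribution under the null hypothesis is $B(40, 0.25)$, which is what the tabulated values are consistent with; the function name `closest_two_tailed_region` is invented for this example.

```python
from math import comb

def closest_two_tailed_region(n, p, alpha):
    """Two-tailed region whose tail probabilities are each as close as
    possible to alpha/2 (they may land slightly above it)."""
    pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    half = alpha / 2

    lower_tail = {}                     # c -> P(X <= c)
    total = 0.0
    for x in range(n + 1):
        total += pmf[x]
        lower_tail[x] = total

    upper_tail = {}                     # c -> P(X >= c)
    total = 0.0
    for x in range(n, -1, -1):
        total += pmf[x]
        upper_tail[x] = total

    # Pick the boundary in each tail whose probability is nearest to alpha/2.
    lower = min(lower_tail, key=lambda c: abs(lower_tail[c] - half))
    upper = min(upper_tail, key=lambda c: abs(upper_tail[c] - half))
    actual_level = lower_tail[lower] + upper_tail[upper]
    return lower, upper, actual_level

lower, upper, actual_level = closest_two_tailed_region(40, 0.25, 0.02)
print(f"X <= {lower} or X >= {upper}; actual level {actual_level:.4f}")
```

Note that under this rule the actual significance level is the sum of both tail probabilities, and one tail is allowed to sit slightly above 0.01.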


Exercise 7B
Pearson Applied Year 1/AS
Pages 103-105
Doing a full one-tailed hypothesis test
We’ve done various bits of a hypothesis test, but haven’t actually properly conducted one yet. Let’s do an example!

John tosses a coin 8 times and it comes up heads 6 times. He claims the coin is biased towards heads. With a significance level of 5%, test his claim.

C.D.F. Binomial table ($n = 8$, $p = 0.5$):
x          0       1       2       3       4       5       6       7
P(X ≤ x)   0.0039  0.0352  0.1445  0.3633  0.6367  0.8555  0.9648  0.9961

STEP 1: Define test statistic $X$ (stating its distribution), and the parameter $p$.
$X$ is number of heads. $p$ is probability of heads.
$X \sim B(8, p)$

STEP 2: Write null and alternative hypotheses.
$H_0: p = 0.5$
$H_1: p > 0.5$

STEP 3: Determine probability of observed test statistic (or ‘more extreme’), assuming null hypothesis, i.e. determine probability we’d see this outcome just by chance.
Assume $H_0$ is true, so $X \sim B(8, 0.5)$.
$P(X \ge 6) = 1 - P(X \le 5) = 1 - 0.8555 = 0.1445$

STEP 4: Two-part conclusion:
1. Do we reject $H_0$ or not?
2. Put in context of original problem.
14.45% > 5%, so insufficient evidence to reject $H_0$.
There is insufficient evidence that the coin is biased towards heads.

NEW TO A LEVEL 2017: The probability of ‘the observed value or more extreme’ is known as the p-value.
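Steps 3 and 4 are mechanical once the hypotheses are written down, so they can be wrapped in a small function. This is a sketch rather than a standard routine: `one_tailed_test` and its `direction` argument are names made up for illustration.

```python
from math import comb

def one_tailed_test(observed, n, p0, alpha, direction):
    """Binomial hypothesis test of H0: p = p0.

    direction=">" tests H1: p > p0 using P(X >= observed);
    direction="<" tests H1: p < p0 using P(X <= observed).
    Returns (p_value, reject_h0).
    """
    if direction == ">":
        outcomes = range(observed, n + 1)
    elif direction == "<":
        outcomes = range(0, observed + 1)
    else:
        raise ValueError("direction must be '>' or '<'")
    p_value = sum(comb(n, x) * p0**x * (1 - p0)**(n - x) for x in outcomes)
    return p_value, p_value < alpha

# John's coin: 6 heads in 8 tosses, H0: p = 0.5, H1: p > 0.5, 5% level.
p_value, reject = one_tailed_test(6, 8, 0.5, 0.05, ">")
conclusion = "reject H0" if reject else "insufficient evidence to reject H0"
print(f"p-value = {p_value:.4f}: {conclusion}")
```

The function only delivers the first half of the two-part conclusion; putting the result back into the context of the coin still has to be done in words.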
Alternative method using critical regions
We can also find the critical region and see if the test statistic lies within it.

John tosses a coin 8 times and it comes up heads 6 times. He claims the coin is biased towards heads. With a significance level of 5%, test his claim.

C.D.F. Binomial table ($n = 8$, $p = 0.5$):
x          0       1       2       3       4       5       6       7
P(X ≤ x)   0.0039  0.0352  0.1445  0.3633  0.6367  0.8555  0.9648  0.9961

STEP 1: Define test statistic $X$ (stating its distribution), and the parameter $p$.
$X$ is number of heads. $p$ is probability of heads.
$X \sim B(8, p)$

STEP 2: Write null and alternative hypotheses.
$H_0: p = 0.5$
$H_1: p > 0.5$

STEP 3 (Alternative): Determine critical region.
$P(X \ge 7) = 1 - P(X \le 6) = 1 - 0.9648 = 0.0352 < 0.05$
Critical region is $X \ge 7$.

STEP 4: Two-part conclusion:
1. Do we reject $H_0$ or not?
2. Put in context of original problem.
6 is not in critical region, so do not reject $H_0$.
There is insufficient evidence that the coin is biased towards heads.
More on p-values
(Note that this is not covered in the Pearson textbook, but is in the specification)

Sheila wants to know if a coin is biased towards heads and throws it a large number of times, counting the number of heads. The p-value is less than 0.03. Conduct a hypothesis test at the 5% significance level.

Let $p$ be the probability of heads.
$H_0: p = 0.5$
$H_1: p > 0.5$
p-value $< 0.03 < 0.05$, so reject $H_0$.
Sufficient evidence to suggest the coin is biased towards heads.

Froflections: Ordinarily we’d calculate the probability of seeing the observed number of heads ‘or more extreme’. But this has already been done for us (i.e. the p-value), so we just need to compare this against the threshold.
Further Example
[Textbook] The standard treatment for a particular disease has a probability of 0.4 of success. A certain doctor has undertaken research in this area and has produced a new drug which has been successful with 11 out of 20 patients. The doctor claims the new drug represents an improvement on the standard treatment.
Test, at the 5% significance level, the claim made by the doctor.

STEP 1: Define test statistic $X$ (stating its distribution), and the parameter $p$.
$X$ is number of patients for whom trial was successful.
$p$ is probability of success in each patient.
$X \sim B(20, p)$

STEP 2: Write null and alternative hypotheses.
$H_0: p = 0.4$
$H_1: p > 0.4$

STEP 3: Determine probability of observed test statistic (or ‘more extreme’), assuming null hypothesis.
Assume $H_0$ is true, so $X \sim B(20, 0.4)$.
$P(X \ge 11) = 1 - P(X \le 10) = 1 - 0.8725 = 0.1275$

STEP 4: Two-part conclusion:
1. Do we reject $H_0$ or not?
2. Put in context of original problem.
12.75% > 5% so not enough evidence to reject $H_0$.
There is insufficient evidence that the new drug is better than the old one.
Exercise 7C
Pearson Applied Year 1/AS
Pages 106-107
For a two-tailed test, halve the significance level.

You need to know which tail of the distribution you are testing.
If $X \sim B(n, p)$ then the expected outcome is $np$.
If the observed value $x$ is less than the expected outcome, then consider $P(X \le x)$.
If the observed value $x$ is greater than the expected outcome, then consider $P(X \ge x)$.

A coin is tossed 20 times and lands on Heads 6 times. Use a two-tailed test with 5% significance to determine whether the coin is biased.

A random variable has distribution $X \sim B(50, p)$. A single observation of $x = 4$ is taken from this distribution. Test at the 2% significance level, $H_0$: p=0.02 against $H_1$: p
Two-Tailed Tests
We have already seen that if we’re interested in bias ‘either way’, we have two tails, and therefore have to split the critical region by halving the significance level at each end.

Over a long period of time it has been found that in Enrico’s restaurant the ratio of non-veg to veg meals is 2 to 1. In Manuel’s restaurant in a random sample of 10 people ordering meals, 1 ordered a vegetarian meal. Using a 5% level of significance, test whether or not the proportion of people eating veg meals in Manuel’s restaurant is different to that in Enrico’s restaurant.

Proportion eating veg meals at Enrico’s is $\frac{1}{3}$.
Let $p$ be the proportion of people at Manuel’s that order veg.
Let $X$ be number of people eating veg meals.
$X \sim B(10, p)$
$H_0: p = \frac{1}{3}$
$H_1: p \ne \frac{1}{3}$

If $H_0$ true then $X \sim B\left(10, \frac{1}{3}\right)$.
$P(X \le 1) = 0.1040 > 0.025$ (half significance as 2 tailed), therefore insufficient evidence to reject $H_0$.

Conclusion and what it means in context:
There is no evidence that proportion of veg meals at Manuel’s restaurant is different to Enrico’s.
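The restaurant example can be worked through with exact fractions, which avoids any rounding in the $\frac{1}{3}$. This is a sketch assuming $X \sim B(10, \frac{1}{3})$ under the null hypothesis; it picks the tail by comparing the observation with the expected value $np$, and all variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

n = 10                      # people sampled at Manuel's
p0 = Fraction(1, 3)         # proportion ordering veg under H0 (Enrico's ratio 2:1)
observed = 1                # veg meals seen in the sample
alpha = Fraction(5, 100)    # 5% significance level

expected = n * p0           # np, used to decide which tail the observation is in

def pmf(x):
    """P(X = x) for X ~ B(n, p0), as an exact fraction."""
    return comb(n, x) * p0**x * (1 - p0)**(n - x)

if observed < expected:
    tail_prob = sum(pmf(x) for x in range(0, observed + 1))   # P(X <= observed)
else:
    tail_prob = sum(pmf(x) for x in range(observed, n + 1))   # P(X >= observed)

reject = tail_prob < alpha / 2      # two-tailed: compare with half of alpha
print(f"tail probability = {float(tail_prob):.4f}, reject H0: {reject}")
```

The only change from a one-tailed test is the final comparison: the tail probability is judged against 2.5% rather than 5%.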
Test Your Understanding
Edexcel S2 Jan 2006 Q7a

Exercise 7D
Pearson Applied Year 1/AS
Pages 108-109
