Stat 139 - Unit 03 - Hypothesis Testing - 1 Per Page
Unit 3 Outline
• Motivating Example
• Hypothesis Testing Framework
• (Fisher’s) Randomization Test
• Permutation Test
• Testing in R
• Malaria sickens ~200 million people each year, killing about 1 million.
• Most malaria cases and deaths occur in sub-Saharan Africa.
• Malaria vaccines are difficult to develop due to the diversity of the
parasite and the complexity of its effects on the human body; this is an
area of intensive research.
Example: Malaria Vaccine Trial
• “Researchers reported that…out of 9 volunteers who
received four doses of the new vaccine, 3 contracted
the disease. Of 12 volunteers who received no vaccine,
10 became infected.”
• What is the scientific question of interest?
• What are the statistical hypotheses to address this
scientific question?
• What statistical analysis tool should we use to test these
hypotheses?
Unit 3 Outline
• Motivating Example
• Hypothesis Testing Framework
• (Fisher’s) Randomization Test
• Permutation Test
• Testing in R
Steps in testing Hypotheses
• There are 4 main steps to a statistical hypothesis
testing procedure:
1) Formulate the hypotheses: H0 and HA.
2) Calculate a test statistic that summarizes the evidence in the data
about these hypotheses.
3) Calculate the p-value based on a reference
distribution of the test statistic (assuming H0 is true).
4) Determine the conclusion of the test by comparing
the p-value to the significance level of the test
procedure. State how it generalizes internally and
externally.
Hypotheses
• A scientific hypothesis makes a testable statement
about the observable universe.
• A statistical hypothesis is more restricted in that it
concerns the behavior of a measurable (or
observable) random variable. It is often a statement
or claim about a parameter of a population or
distribution.
• Two competing types of statistical hypotheses for
any scientific problem: H0 vs. HA.
Hypothesis Testing
• Suppose that the treatment is randomly assigned to
selected units in a sample, while others are used as
“controls”,
[Figure: sampled units split into a Treatment group and a Control group]
2. Determine a Test Statistic
(& Reference Distribution)
• Statistic: a function of the data y = (y1, …, yn): $\hat{\theta} = \hat{\theta}(y)$.
• Test Statistic: a specific statistic used to weigh evidence
supporting and contradicting the null hypothesis.
• Reference Distribution: probability distribution of the test
statistic, assuming that the null hypothesis is true,
$f(\hat{\theta}(Y) \mid H_0)$. This is often called the sampling
distribution.
• What should be the test statistic for the malaria dataset?
How can we determine its reference distribution?
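One natural candidate for the malaria data is the difference in infection proportions between the control and vaccine groups. This is a minimal sketch, not the only possible choice; the variable names are made up for illustration, and the counts come from the trial description earlier in the unit.

```r
# Counts from the malaria vaccine trial described above
infected_vaccine <- 3
n_vaccine <- 9
infected_control <- 10
n_control <- 12

# Sample infection proportions in each group
p_hat_vaccine <- infected_vaccine / n_vaccine
p_hat_control <- infected_control / n_control

# Candidate test statistic: control proportion minus vaccine proportion
test_stat <- p_hat_control - p_hat_vaccine
test_stat
```

Taking control minus vaccine means that large positive values point toward a protective vaccine.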
3. Calculate the p-value
• p-value – the probability of observing our test
statistic or a more extreme one, assuming the null
hypothesis to be true. It is used as a measure of
the strength of evidence against the null hypothesis.
• The p-value is NOT the probability that the null hypothesis is true!
[Figure: histogram of the reference distribution; x-axis from -0.5 to 0.5, y-axis: Frequency]
Significance Level of a test
• p-value - probability that the test statistic would be at least as
extreme as observed, under the null hypothesis.
• A significance level (α) is the criterion compared against the
p-value. The null hypothesis is rejected if the p-value is no larger
than α.
• Generally, α reflects the probability of rejecting the null
hypothesis given that it is true (Type I error).
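The link between α and the Type I error rate can be checked by simulation. The sketch below is illustrative only: it uses a two-sample t-test on simulated normal data as a stand-in test (an assumption, not a method from this unit), generates data for which the null hypothesis is true, and records how often the p-value falls at or below α.

```r
set.seed(139)    # arbitrary seed, for reproducibility only
alpha <- 0.05
n_sims <- 5000   # number of simulated studies (arbitrary choice)

# Each simulated study draws both groups from the same distribution,
# so the null hypothesis of no difference is true by construction.
p_values <- replicate(n_sims, {
  group_a <- rnorm(20)
  group_b <- rnorm(20)
  t.test(group_a, group_b)$p.value
})

# Proportion of studies that (wrongly) reject the null at level alpha
type1_rate <- mean(p_values <= alpha)
type1_rate
```

The rejection rate should land close to α, which is what it means for α to control the Type I error rate.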
4. Determining the Conclusion
• We come to a conclusion about our hypotheses by
comparing the p-value to the Type I error rate.
• If the p-value is as small as or smaller than the pre-specified
level of the test, alpha (α), usually 0.05, we reject the
null hypothesis and say the result is statistically significant
at level α.
• If the p-value is larger than α, we are unable to reject the
null hypothesis.
• Warning: this is different from concluding that the null
hypothesis is true! Why?
• Based on the study design, we can then generalize internally
and/or externally (the scope of the inference procedure).
Unit 3 Outline
• Motivating Example
• Hypothesis Testing Framework
• (Fisher’s) Randomization Test
• Permutation Test
• Testing in R
Uncertainty in Randomized
Experiments
• In randomized experiments, uncertainty comes from
randomness of an assignment mechanism.
• How can we capture this randomization through a
summary measurement?
• (Fisher’s) Randomization Test is a distribution-free test
for treatment effect in randomized experiments.
(Fisher’s) Randomization Test
• Additive Treatment Effect: $\delta = Y_{c,i} - Y_{v,i}$
• H0: Zero treatment effect for all units, δ = 0. Each
unit’s outcome is the same, regardless of the treatment
assigned.
• Consequently, the distribution of outcomes is
identical in two groups.
• Ha: Non-zero treatment effect for ALL units, δ ≠ 0
• (This version is used in R&S, Section 1.3.1: $Y^* = Y + \delta$)
(Fisher’s) Randomization Test
Assumptions:
• Random assignment to groups.
• Under the H0, independence of study units.
– To be precise, there is an exchangeability of study units,
i.e., the labels assigning subjects to groups are
interchangeable.
– We will take advantage of this exchangeability when
building the sampling distribution (through simulation) of
the test statistic.
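Exchangeability is what justifies shuffling the group labels. As a small sketch, with the group sizes taken from the malaria trial and the object names chosen only for illustration, `sample()` with no other arguments returns a random permutation of its input, so the group sizes are preserved while the assignment of labels to units changes.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Group labels for the 21 volunteers: 9 vaccinated, 12 controls
labels <- c(rep("vaccine", 9), rep("control", 12))

# One random re-assignment of the same labels to the same units
shuffled <- sample(labels)

table(labels)    # original group sizes
table(shuffled)  # same group sizes after shuffling
```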
(Fisher’s) Randomization Test
Example 1: Randomization Test
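[Figure: bar chart of the observed counts (y-axis: Count) for the Control and Vaccine groups]
• Observed data
• Observed test statistic: $\bar{y}_c - \bar{y}_v = 0.5$
• Randomization distribution of $\bar{Y}_c - \bar{Y}_v$
– What is $\binom{21}{9}$?
– Exact vs. approximate
– R-code to come
• One-sided alternative
$H_0: \delta = 0$
$H_a: \delta > 0$ (or $\delta < 0$)
[Figure: histograms of the randomization distribution of $\bar{Y}_c - \bar{Y}_v$; x-axis from -0.5 to 0.5, y-axis: Frequency]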
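The "exact" route can be sketched directly. The binomial coefficient counts the ways to choose which 9 of the 21 volunteers receive the vaccine. Under H0 the 13 infections are fixed, so the number of infected volunteers who land in the vaccine group follows a hypergeometric distribution. The code below is one possible illustration with made-up variable names, not necessarily the calculation used in the lecture.

```r
n_total <- 21     # all volunteers
n_vaccine <- 9    # volunteers assigned to the vaccine
n_control <- 12   # volunteers assigned to control
n_infected <- 13  # total infections (3 + 10), fixed under H0

# Number of distinct ways to assign 9 of the 21 volunteers to the vaccine
n_assignments <- choose(n_total, n_vaccine)
n_assignments  # 293930

# Possible numbers of infected volunteers in the vaccine group
infected_in_vaccine <- 0:n_vaccine
prob <- dhyper(infected_in_vaccine, m = n_infected,
               n = n_total - n_infected, k = n_vaccine)

# Test statistic (control proportion minus vaccine proportion) for each case
stat <- (n_infected - infected_in_vaccine) / n_control -
  infected_in_vaccine / n_vaccine
obs_stat <- 10 / 12 - 3 / 9

# Exact one-sided p-value; the small tolerance guards against rounding error
p_exact <- sum(prob[stat >= obs_stat - 1e-12])
p_exact
```

With these counts the exact one-sided p-value comes out to 1/34, roughly 0.029, so a simulation with many shuffles should land near that value.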
Unit 3 Outline
• Motivating Example
• Hypothesis Testing Framework
• (Fisher’s) Randomization Test
• Permutation Test
• Randomization Test in R
Example 2: Natural Resistance to
Malaria
• Sickle-cell anemia – a hereditary blood disorder.
*Allison AC (1954). "Notes on sickle-cell polymorphism". Ann Hum Genet 19: 39–57.
Uncertainty in Observational
Studies
• We can apply the same “regrouping” idea.
However, because the assignment mechanism is not
random, the uncertainty is due to an “imaginary”
chance mechanism.
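The regrouping idea carries over to a numeric outcome with almost no change. The sketch below uses invented measurements for two hypothetical groups, "A" and "B". These are not data from the lecture or from Allison (1954). It builds the permutation distribution of the difference in group means by shuffling the group labels.

```r
set.seed(139)  # arbitrary seed, for reproducibility only

# Hypothetical data, invented purely for illustration
group <- c(rep("A", 6), rep("B", 6))
outcome <- c(4.1, 3.8, 5.0, 4.6, 4.9, 4.3,
             3.2, 3.9, 3.5, 4.0, 3.1, 3.6)

# Observed difference in group means
obs_diff <- mean(outcome[group == "A"]) - mean(outcome[group == "B"])

# Permutation distribution: shuffle the labels, recompute the difference
n_perms <- 10000
perm_diff <- replicate(n_perms, {
  shuffled <- sample(group)
  mean(outcome[shuffled == "A"]) - mean(outcome[shuffled == "B"])
})

# Two-sided p-value: share of shuffles at least as extreme as observed
p_two_sided <- mean(abs(perm_diff) >= abs(obs_diff) - 1e-12)
p_two_sided
```

The mechanics match a randomization test. What differs is the interpretation, because the groups here were not created by random assignment.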
Permutation Test: Same mechanics
[Figure: distributions for the Norm and SC groups, shown as Count and as Percent of Total; x-axis from 0 to 5, in units of 100 per mm^3]
• Permutation distribution:
[Figure: histogram of the permutation distribution; x-axis: Difference in S. cells counts, from -1.5 to 1.0, with 0.74 marked; y-axis: Frequency]
• p-value = 0.02
• Conclusion?
Randomization Test:
H0: Each unit in the population would have the same outcome regardless of
the treatment assigned; hence, distributions of outcomes under the two
treatments are the same.
Ha: Treatment effect is non-zero, δ ≠ 0, in the population.

Permutation Test:
H0: Outcomes in the population are not related to group status; hence,
outcome distributions in subpopulations with different group status are
the same.
Ha: There is an association between group status and outcomes in the
population.
Unit 3 Outline
• Motivating Example
• Hypothesis Testing Framework
• (Fisher’s) Randomization Test
• Permutation Test
• Randomization Test in R
R’s for loop
• Often in statistical computing, you would like to repeat
some process multiple times:
• Repeated Sampling (like for Fisher’s randomization test).
• Operating on a vector or matrix [though use of one of
apply, lapply, or tapply may be better]
• Markov chains
• Simulation Studies
• Using a loop of some sort can help with this task. R has two
major types:
• for loop: used to repeat a task a fixed number of times
(called the number of iterations)
• while loop: used to repeat a task until you satisfy a certain
condition (number of iterations undetermined)
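To make the contrast concrete, here is a small sketch of both loop types. The tasks (squaring integers, counting simulated coin flips until the first head) are invented for illustration only.

```r
# for loop: the number of iterations (5) is known in advance
squares <- rep(NA, 5)
for (i in 1:5) {
  squares[i] <- i^2
}
squares

# while loop: keep flipping a simulated fair coin until the first head,
# so the number of iterations is not known in advance
set.seed(139)  # arbitrary seed, for reproducibility only
flips <- 0
head_seen <- FALSE
while (!head_seen) {
  flips <- flips + 1
  head_seen <- runif(1) < 0.5
}
flips
```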
for loop syntax
1) Initialization step:
Define your constant (here: n.iter) and usually define a “blank” variable
(here: variable) which will store the results of the steps of the for loop.
2) for statement:
The word for, and in parentheses, the range of values that the iterator
(here: i) will take on (here: 1:n.iter). Here, the iterator will increase by
one each time through the loop (most common way to do it).
3) Body of work (inside the “{}”):
The work to be done each time through the loop. Usually there will
be a step that indexes the storage vector (here: variable[i]).
4) Results:
After the for loop is complete, the calculations and results to be
produced from the storage variable.

n.iter = 100
variable = rep(NA, n.iter)

for(i in 1:n.iter){
  samp = runif(10)
  variable[i] = max(samp)
}

hist(variable)
mean(variable)
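For a simple repeated simulation like this one, base R's `replicate()` can stand in for the explicit loop and the pre-allocated storage vector. This is an alternative sketch, not the approach shown on the slide; the seed is arbitrary.

```r
set.seed(139)  # arbitrary seed, for reproducibility only
n.iter <- 100

# replicate() evaluates the expression n.iter times and collects the results
variable <- replicate(n.iter, max(runif(10)))

length(variable)  # one stored maximum per iteration
mean(variable)    # the expected maximum of 10 uniform draws is 10/11
```

The for loop version makes the bookkeeping explicit, which is useful when each iteration depends on the previous one. `replicate()` is more compact when the iterations are independent.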
Fisher’s randomization test in R
####################################
# Performing Fisher's Randomization
# Test for the Malaria data
####################################
#Create the variables in the data set
#x: group indicator (1 = vaccine, 0 = control); y: infected (1) or not (0)
x=c(rep(1,9),rep(0,12))
y=c(rep(1,3),rep(0,6),rep(1,10),rep(0,2))
n=length(x)

#split into two groups and calculate the test statistic
y.v=y[x==1]
y.c=y[x==0]
ybar.obs=mean(y.c)-mean(y.v)

#initialize the vector to store the simulated test statistics
nsims=100000
ybar.sim=rep(NA,nsims)

#and the for loop to do all the work: reordering the x's, and splitting into groups
for(i in 1:nsims){
  x.sim = sample(x)
  ybar.sim[i]=mean(y[x.sim==0])-mean(y[x.sim==1])
}

mean(ybar.sim)
var(ybar.sim)
hist(ybar.sim, col="grey")

#one-sided p-value
mean(ybar.sim >= ybar.obs)
#two-sided p-value
mean( abs(ybar.sim) >= abs(ybar.obs))
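The same steps can be wrapped in a reusable function. The sketch below is one possible packaging, not part of the original slides: the function name `randomization_test`, its arguments, and the small tolerance `tol` used when comparing floating-point statistics are all assumptions made for illustration.

```r
# Hypothetical helper: simulation-based randomization test for the
# difference in means, control (x == 0) minus treatment (x == 1)
randomization_test <- function(y, x, nsims = 20000) {
  obs <- mean(y[x == 0]) - mean(y[x == 1])
  sims <- replicate(nsims, {
    x.sim <- sample(x)  # shuffle the group labels
    mean(y[x.sim == 0]) - mean(y[x.sim == 1])
  })
  tol <- 1e-12  # guards against floating-point rounding in the comparison
  list(obs = obs,
       p.one.sided = mean(sims >= obs - tol),
       p.two.sided = mean(abs(sims) >= abs(obs) - tol))
}

set.seed(139)  # arbitrary seed, for reproducibility only
x <- c(rep(1, 9), rep(0, 12))
y <- c(rep(1, 3), rep(0, 6), rep(1, 10), rep(0, 2))
result <- randomization_test(y, x)
result$obs
result$p.one.sided
result$p.two.sided
```

Because the p-values are estimated from random shuffles, they change slightly from run to run unless the seed is fixed; increasing `nsims` reduces that Monte Carlo variation.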
The Last Word
http://xkcd.com/892/