Decision Making and Hypothesis Testing 1


Welcome to the

Advanced Data Analytics


Semester 2

Decision making: Hypothesis testing Dr. Shahram Azizi


Decision making

Learning Objectives
At the end of this session, you will be able to:
• Understand the basics of decision making
• Carry out hypothesis testing:
1. Test of the mean of population(s)
2. Test of the proportion of population(s)
3. Test of the variance of population(s)
4. Test of goodness of fit
5. Test of independence between two random variables
Steps in Hypothesis testing

• Set up two competing hypotheses (the null hypothesis H0 and the alternative H1)
• Set a level of significance, called alpha.
• Calculate a test statistic
• Calculate the probability value (p-value), or find the rejection region
• Make a test decision about the null hypothesis
• State an overall conclusion
Binary decision making

https://onlinecourses.science.psu.edu/stat500/node/40
Methods for Making a Statistical Decision

• rejection region approach

• p-value (or probability value) approach.
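Both approaches lead to the same decision. The following Python sketch illustrates this for a two-tailed z-test. The function name `z_test_two_tailed` and the two test statistics are hypothetical and not taken from the slides.

```python
from statistics import NormalDist

def z_test_two_tailed(z, alpha):
    """Return (reject_by_region, reject_by_p_value, p_value) for a two-tailed z-test."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    # Rejection region approach: compare |z| with the critical value.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    reject_by_region = abs(z) > z_crit
    # p-value approach: probability of a statistic at least this extreme under H0.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    reject_by_p_value = p_value < alpha
    return reject_by_region, reject_by_p_value, p_value

# Hypothetical test statistics, chosen only for illustration.
for z in (-2.5, 0.8):
    print(z, z_test_two_tailed(z, alpha=0.05))
```

For any z, the two flags agree: |z| exceeds the critical value exactly when the p-value is below alpha.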


Test of µ, σ Known, Population Normally Distributed

• Test Statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
• Where x̄ is the sample statistic (the sample mean).
• µ0 is the value identified in the null hypothesis.
• σ is known.
• n is the sample size.
Test of µ, σ Known, Population Shape Not Known/Not Normal

• If n ≥ 30, Test Statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
• If n < 30, use a distribution-free test
Test of µ, σ Unknown, Population Normally Distributed

• Test Statistic:
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
where
• x̄ is the sample statistic (the sample mean).
• µ0 is the value identified in the null hypothesis.
• σ is unknown, s is the sample standard deviation.
• n is the sample size
• degrees of freedom on t are n – 1.
Test of µ, σ Unknown, Population Shape Not Known/Not Normal

• If n ≥ 30, Test Statistic:
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
• If n < 30, use a distribution-free test


The Formal Hypothesis Test for the Example, σ Known

• I. Hypotheses
• H0: µ = 1.3250 minutes
• H1: µ ≠ 1.3250 minutes
• II. Rejection Region
• α = 0.05
Decision Rule: If z < –1.96 or z > 1.96, reject H0.
[Figure: normal curve with "Reject H0" regions in both tails, beyond z = –1.96 and z = +1.96, and a "Do Not Reject H0" region between them]
The Formal Hypothesis Test, cont.
• III. Test Statistic
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{1.3229 - 1.3250}{0.0396 / \sqrt{80}} = \frac{-0.0021}{0.00443} = -0.47$$
• IV. Decision
Since the test statistic z = –0.47 falls between the critical
boundaries z = ±1.96, we do not reject H0 at the 5%
significance level.
The Formal Hypothesis Test, cont.

• V. Conclusion
There is not sufficient evidence to conclude that the robot
welder is out of adjustment.
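The five steps of this example can be reproduced with a short Python sketch. The numbers come from the slides; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu_0 = 1.3229, 1.3250   # sample mean and hypothesized mean (minutes)
sigma, n = 0.0396, 80          # known population standard deviation, sample size
alpha = 0.05

z = (x_bar - mu_0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value, about 1.96
reject_h0 = abs(z) > z_crit

print(f"z = {z:.2f}, critical value = +/-{z_crit:.2f}, reject H0: {reject_h0}")
```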
Example in simulation

• Problem
Generate 100 samples from N(3, 10). Based on the last 30
samples, decide whether µ = 2 or not at the level
α = 0.01 (assume that the variance is known).
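One possible way to run this simulation is sketched below in Python. It assumes that N(3, 10) denotes mean 3 and variance 10, which the slide does not state explicitly. The seed is arbitrary and only makes the run reproducible.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(0)                 # arbitrary seed, for reproducibility only
mu_true, variance = 3.0, 10.0  # assumption: second parameter of N(3, 10) is the variance
sigma = sqrt(variance)

data = [random.gauss(mu_true, sigma) for _ in range(100)]
sample = data[-30:]            # the last 30 observations

mu_0, alpha = 2.0, 0.01        # H0: mu = 2 versus H1: mu != 2
z = (fmean(sample) - mu_0) / (sigma / sqrt(len(sample)))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = p_value < alpha

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Because the data are random, the decision depends on the generated sample; changing the seed can change the outcome.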
Lower Tail Test of Population Mean with Known Variance

• Problem
Suppose the manufacturer claims that the mean lifetime
of a light bulb is more than 10,000 hours. In a sample of
30 light bulbs, it was found that they only last 9,900
hours on average. Assume the population standard
deviation is 120 hours. At .05 significance level, can we
reject the claim by the manufacturer?
Lower Tail Test of Population Mean with Known Variance

• Solution
http://www.r-tutor.com/elementary-statistics/hypothesis-testing/lower-tail-test-population-mean-known-variance
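As a cross-check, the light bulb problem can be worked in a few lines of Python. This is a sketch with illustrative variable names; the null hypothesis is taken to be µ ≥ 10,000, so the rejection region lies in the lower tail.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu_0 = 9900, 10000      # sample mean and claimed mean lifetime (hours)
sigma, n, alpha = 120, 30, 0.05

z = (x_bar - mu_0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(alpha)   # lower-tail critical value, about -1.645
p_value = NormalDist().cdf(z)          # lower-tail p-value
reject_h0 = z < z_crit

print(f"z = {z:.4f}, critical value = {z_crit:.4f}, reject H0: {reject_h0}")
```

The statistic falls far below the critical value, so the manufacturer's claim is rejected at the .05 level.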
Upper Tail Test of Population Mean with Known Variance

• Problem
Suppose the food label on a cookie bag states that there
is at most 2 grams of saturated fat in a single cookie. In
a sample of 35 cookies, it is found that the mean amount
of saturated fat per cookie is 2.1 grams. Assume that the
population standard deviation is 0.25 grams. At .05
significance level, can we reject the claim on food label?
Upper Tail Test of Population Mean with Known Variance

• Solution
http://www.r-tutor.com/elementary-statistics/hypothesis-testing/upper-tail-test-population-mean-known-variance
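The cookie problem follows the same pattern with the rejection region in the upper tail. In this Python sketch the null hypothesis is taken to be µ ≤ 2, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu_0 = 2.1, 2.0         # sample mean and claimed maximum (grams)
sigma, n, alpha = 0.25, 35, 0.05

z = (x_bar - mu_0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value, about 1.645
p_value = 1 - NormalDist().cdf(z)          # upper-tail p-value
reject_h0 = z > z_crit

print(f"z = {z:.4f}, critical value = {z_crit:.4f}, reject H0: {reject_h0}")
```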
Two-Tailed Test of Population Mean with Known Variance

• Problem
Suppose the mean weight of King Penguins found in an
Antarctic colony last year was 15.4 kg. In a sample of 35
penguins taken at the same time this year in the same colony,
the mean penguin weight is 14.6 kg. Assume the population
standard deviation is 2.5 kg. At .05 significance level,
can we reject the null hypothesis that the mean penguin
weight does not differ from last year?
Two-Tailed Test of Population Mean with Known Variance

• Solution
http://www.r-tutor.com/elementary-statistics/hypothesis-testing/two-tailed-test-population-mean-known-variance
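For the penguin problem the alternative is two-sided, so alpha is split between the two tails. A Python sketch with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu_0 = 14.6, 15.4       # this year's sample mean and last year's mean (kg)
sigma, n, alpha = 2.5, 35, 0.05

z = (x_bar - mu_0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value, about 1.96
p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-tailed p-value
reject_h0 = abs(z) > z_crit

print(f"z = {z:.4f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Here |z| is just inside the critical boundaries, so the null hypothesis is not rejected at the .05 level.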
Lower Tail Test of Population Mean with Unknown Variance

• Problem
Suppose the manufacturer claims that the mean lifetime
of a light bulb is more than 10,000 hours. In a sample of
30 light bulbs, it was found that they only last 9,900
hours on average. Assume the sample standard
deviation is 125 hours. At .05 significance level, can we
reject the claim by the manufacturer?
Lower Tail Test of Population Mean with Unknown Variance

• Solution
http://www.r-tutor.com/elementary-statistics/hypothesis-testing/lower-tail-test-population-mean-unknown-variance
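With the variance unknown, the statistic uses the sample standard deviation and is compared with a Student t critical value. The Python standard library has no t distribution, so this sketch hard-codes the critical value for 29 degrees of freedom from a standard t table instead of computing it.

```python
from math import sqrt

x_bar, mu_0 = 9900, 10000      # sample mean and claimed mean lifetime (hours)
s, n = 125, 30                 # sample standard deviation and sample size
df = n - 1                     # degrees of freedom

t = (x_bar - mu_0) / (s / sqrt(n))
t_crit = -1.6991               # lower-tail t value for alpha = 0.05, df = 29 (from a t table)
reject_h0 = t < t_crit

print(f"t = {t:.4f}, df = {df}, critical value = {t_crit}, reject H0: {reject_h0}")
```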
Upper Tail Test of Population Mean with Unknown Variance

• Problem
Suppose the food label on a cookie bag states that
there is at most 2 grams of saturated fat in a single
cookie. In a sample of 35 cookies, it is found that the
mean amount of saturated fat per cookie is 2.1 grams.
Assume that the sample standard deviation is 0.3 gram.
At .05 significance level, can we reject the claim on food
label?
Upper Tail Test of Population Mean with Unknown Variance

• Solution
http://www.r-tutor.com/elementary-statistics/hypothesis-testing/upper-tail-test-population-mean-unknown-variance
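The p-value approach can also be used for the t test. The Python sketch below approximates the upper-tail probability of the t distribution by integrating its density with Simpson's rule. The helpers `t_pdf` and `t_upper_tail` are hypothetical names written for this illustration, not library functions.

```python
from math import sqrt, pi, lgamma, exp, log

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3         # half the mass lies above 0 by symmetry

x_bar, mu_0 = 2.1, 2.0                 # sample mean and claimed maximum (grams)
s, n, alpha = 0.3, 35, 0.05

t = (x_bar - mu_0) / (s / sqrt(n))
p_value = t_upper_tail(t, df=n - 1)
reject_h0 = p_value < alpha

print(f"t = {t:.4f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The p-value is below .05, so the claim on the food label is rejected.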
Two-Tailed Test of Population Mean with Unknown Variance

• Problem
Suppose the mean weight of King Penguins found in an
Antarctic colony last year was 15.4 kg. In a sample of 35
penguins taken at the same time this year in the same colony,
the mean penguin weight is 14.6 kg. Assume the sample
standard deviation is 2.5 kg. At .05 significance level,
can we reject the null hypothesis that the mean penguin
weight does not differ from last year?
Two-Tailed Test of Population Mean with Unknown Variance

• Solution
http://www.r-tutor.com/elementary-statistics/hypothesis-testing/two-tailed-test-population-mean-unknown-variance
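A sketch of the two-tailed t test for the penguin data, in Python. The critical value for 34 degrees of freedom is hard-coded from a standard t table, since the standard library does not provide t quantiles.

```python
from math import sqrt

x_bar, mu_0 = 14.6, 15.4       # this year's sample mean and last year's mean (kg)
s, n = 2.5, 35                 # sample standard deviation and sample size
df = n - 1

t = (x_bar - mu_0) / (s / sqrt(n))
t_crit = 2.0322                # two-tailed t value for alpha = 0.05, df = 34 (from a t table)
reject_h0 = abs(t) > t_crit

print(f"t = {t:.4f}, df = {df}, critical values = +/-{t_crit}, reject H0: {reject_h0}")
```

The statistic has the same value as in the known-variance case, but the t critical value is slightly wider than 1.96; the null hypothesis is again not rejected.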
Example: Test of mean

1- Generate 100 samples from N(5, 9). Test the hypothesis
H0: µ = 5 versus H1: µ ≠ 5 based on the last 30 data points. Find
a 95% confidence interval for µ.
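A possible Python sketch of this exercise follows. It assumes that N(5, 9) denotes mean 5 and variance 9, and that the variance is treated as known so that a z-based test and interval apply; the slide specifies neither. The seed is arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(1)                     # arbitrary seed, for reproducibility only
sigma = 3.0                        # assumption: variance 9, treated as known
data = [random.gauss(5.0, sigma) for _ in range(100)]
sample = data[-30:]                # the last 30 observations

mu_0, alpha = 5.0, 0.05
x_bar = fmean(sample)
se = sigma / sqrt(len(sample))     # standard error of the mean
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

z = (x_bar - mu_0) / se
ci = (x_bar - z_crit * se, x_bar + z_crit * se)   # 95% confidence interval for mu
reject_h0 = abs(z) > z_crit

print(f"z = {z:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), reject H0: {reject_h0}")
```

The two-tailed test at level alpha rejects H0 exactly when µ0 = 5 lies outside the 95% confidence interval, so the interval and the test give the same answer.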
Discrete Variables and Test of a proportion

• Discrete data are the result of a counting process.


The sampled elements are sorted, and the elements
with the characteristic of interest are counted.
• Test of a proportion, p
Test of p, Sample Sufficiently Large

• If both n p ≥ 5 and n(1 – p) ≥ 5,
Test Statistic:
$$z = \frac{p - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}}$$
• where p = sample proportion
• p0 is the value identified in the null hypothesis.
• n is the sample size.
© 2002 The Wadsworth Group
Test of p, Sample Not Sufficiently Large

• If either n p < 5 or n(1 – p) < 5, convert the proportion to the
underlying binomial distribution.
• Note there is no t-test on a population proportion.
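One way to work with the underlying binomial distribution is to compute an exact tail probability. The Python sketch below does this for a lower-tail test. The function name `binomial_lower_tail` and the numbers (2 successes in 12 trials, p0 = 0.3) are hypothetical and do not come from the slides.

```python
from math import comb

def binomial_lower_tail(x, n, p0):
    """Exact P(X <= x) for X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x + 1))

# Hypothetical small sample: n * p0 = 3.6 < 5, so the z approximation is not used.
n, x, p0, alpha = 12, 2, 0.3, 0.05

p_value = binomial_lower_tail(x, n, p0)   # lower-tail test of H0: p >= p0
reject_h0 = p_value < alpha

print(f"exact p-value = {p_value:.4f}, reject H0: {reject_h0}")
```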
Example: test of proportion

https://onlinecourses.science.psu.edu/statprogram/node/164
Lower Tail Test of Population Proportion

• Problem
Suppose 60% of citizens voted in the last election. 85 out
of 148 people in a telephone survey said that they voted
in the current election. At .05 significance level, can we
reject the null hypothesis that the proportion of voters in
the population is above 60% this year?
Lower Tail Test of Population Proportion

• Solution
http://www.r-tutor.com/elementary-statistics/hypothesis-testing/lower-tail-test-population-proportion
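The voter problem can be sketched in Python as follows. The sketch assumes a significance level of 0.05 and takes the null hypothesis to be p ≥ 0.6; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

voted, n = 85, 148
p_hat = voted / n              # sample proportion
p0, alpha = 0.6, 0.05          # assumption: 0.05 significance level

# Large-sample check: both n*p0 and n*(1 - p0) are at least 5.
assert n * p0 >= 5 and n * (1 - p0) >= 5

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z_crit = NormalDist().inv_cdf(alpha)   # lower-tail critical value, about -1.645
reject_h0 = z < z_crit

print(f"p_hat = {p_hat:.4f}, z = {z:.4f}, reject H0: {reject_h0}")
```

The statistic lies above the critical value, so the null hypothesis is not rejected.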
Upper Tail Test of Population Proportion

• Problem
Suppose that 12% of apples harvested in an orchard
last year were rotten. 30 out of 214 apples in a harvest
sample this year turn out to be rotten. At .05
significance level, can we reject the null hypothesis that
the proportion of rotten apples in the harvest stays below
12% this year?
Upper Tail Test of Population Proportion

• Solution
http://www.r-tutor.com/elementary-statistics/hypothesis-testing/upper-tail-test-population-proportion
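A Python sketch of the apple problem, with the null hypothesis taken to be p ≤ 0.12 and the rejection region in the upper tail. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

rotten, n = 30, 214
p_hat = rotten / n             # sample proportion of rotten apples
p0, alpha = 0.12, 0.05

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z_crit = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value, about 1.645
p_value = 1 - NormalDist().cdf(z)          # upper-tail p-value
reject_h0 = z > z_crit

print(f"p_hat = {p_hat:.4f}, z = {z:.4f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```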
Two-Tailed Test of Population Proportion

• Problem
Suppose a coin toss turns up 12 heads out of 20 trials.
At .05 significance level, can one reject the null
hypothesis that the coin toss is fair?
Two-Tailed Test of Population Proportion

• Solution
http://www.r-tutor.com/elementary-statistics/hypothesis-testing/two-tailed-test-population-proportion
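For the coin toss, a fair coin means p0 = 0.5 and the alternative is two-sided. A Python sketch with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

heads, n = 12, 20
p_hat = heads / n              # sample proportion of heads
p0, alpha = 0.5, 0.05          # H0: the coin is fair

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value, about 1.96
p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-tailed p-value
reject_h0 = abs(z) > z_crit

print(f"z = {z:.4f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

With n p0 = n(1 – p0) = 10, the large-sample condition holds, and the null hypothesis of a fair coin is not rejected.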
Test of variance

http://www.itl.nist.gov/div898/handbook/eda/section3/eda358.htm
Goodness-of-Fit Test

https://onlinecourses.science.psu.edu/stat504/node/60
