Solution To Exam 1
Requirements
Column 1: Random sample; either (a) a large sample or (b) a normal distribution
Column 2: Random sample large enough that the number of successes and the number of failures are 5 or more
Column 3: Random sample; either (a) a large sample or (b) a normal distribution
Column 4: S: All populations have the same variance
          I: All samples are random and independently sampled, or objects are randomly assigned
          N: All populations are normally distributed, or a large sample
An acceptable answer would have been either (a) a random sample from each population or (b) a large sample or
sampling from a normal distribution.
2. In a medical test, the null hypothesis is that you are healthy. What would be the meaning of the power of the
blood test? Do not just give the definition. You must put it in terms of healthy and ill.
"Probability of rejecting the null when it is false is 1-β, also called the power of the test."
Replacing the phrase "the null" with the phrase "you are healthy" obtains the answer:
"The power is the probability of rejecting 'you are healthy' when it is false."
And rephrasing:
"The power is the probability of concluding that you are ill when in fact you are ill."
3. If n = 144, the sample mean = 16, and the population standard deviation = 24, what is the 90% confidence
interval for the population mean?
a. Find the standard error: σ/√n = 24/√144 = 24/12 = 2
b. Find the table value for 90%: Z = 1.645
c. Determine the margin of error: Z × σ/√n = 1.645 × 2 = 3.290
d. Write the conclusion: We are 90% confident that the population mean falls between 12.71 (=16-3.29) and
19.29 (= 16+3.29)
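The arithmetic in steps a through d can be checked with a short script. This is only a sketch: the variable names are illustrative, and the table value 1.645 is typed in from the solution rather than computed.

```python
import math

# Values given in the problem
n = 144             # sample size
sample_mean = 16.0  # sample mean
sigma = 24.0        # population standard deviation
z = 1.645           # table value for 90% confidence (taken from the solution)

# a. standard error = sigma / sqrt(n)
standard_error = sigma / math.sqrt(n)

# c. margin of error = table value * standard error
margin = z * standard_error

# d. interval = sample mean plus or minus the margin of error
lower = sample_mean - margin
upper = sample_mean + margin

print(f"standard error  = {standard_error:.2f}")
print(f"margin of error = {margin:.2f}")
print(f"90% interval    = {lower:.2f} to {upper:.2f}")
```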
4. How can you tell the difference between a confidence interval question from first level statistics and one from
One-Way ANOVA?
From Review of Statistics 1 notes, section 8: You wish to estimate a population mean
From the One-Way ANOVA notes, section 4.5: You wish to estimate differences in population means
5. You are interested in whether there is a difference in average effectiveness among five types of ads. You have 15 people
already divided into three groups based on where they shop: 5 people who shop at Walmart, 5 at Target, and 5 at
Bestbuy. In each group you randomly assign one of the ads to one of the people in the group. If you wished to
see if there is a difference in the ads, what type of experiment is this? You must explain your reasoning.
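2.2 Randomized Block Design – Randomization occurs only within subsets (blocks) of experimental units.
Here the three stores are the blocks, and the five ads are randomly assigned only within each store's group of five people.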
Parameter – a number that describes some aspect of the population; e.g. the mean
Statistic – a number that describes some aspect of the sample
Thus an example of a statistic would be a sample mean while an example of a parameter would be a population
mean
8. You are trying to estimate the difference between pairs of population average grades among three teaching
methods. What building block applies to this situation? You must explain your reasoning.
"4.5.1 Estimate –the difference in the sample means. By the Second Building Block of the course we know
that this difference tends to be in error and does not equal to the difference in population means. By the
Fourth Building Block we know that the largest error in the difference we would expect with a specified
probability is called the _________________________ which is found by multiplying ___________ and
___________."
From Review of statistics 1, n is the sample size. From the One-Way Anova notes, section 4.3: c is the
number of averages.
From the One-Way Anova notes, Section 4.4.1: "If the F-ratio is large (above an F-table value then reject equal
population means) The F table has two degrees of freedom: numerator degrees of freedom, c-1, and divisor
degrees of freedom, n-c."
From the problem c - 1 = 5 - 1 = 4 and n - c = (20*5) - 5 = 100 - 5 = 95. The closest table value is 2.49 = F
with 4 and 80 degrees of freedom.
Therefore using the example of section 4.4.2, the rejection region is: Reject Ho if F > 2.49
10. Suppose you want to compare work experience among types of MBAs. You have a random sample of students
who are enrolled in the cohort MBA, a random sample from the regular MBA, and a random sample from an
online MBA. One of the requirements of ANOVA is normality. In terms of this problem, restate the
normality assumption. Do not just give the definition but use the terms types of MBA and years of work experience.
Normality: The population distribution of the values of the response variable is assumed to be normal in
each population defined by the levels of the treatment.
In this problem the response variable is the work experience, with the populations being the students in each
type of MBA.
Replacing the names response variable with work experience and population with type of MBA results in:
The work experience is assumed to be normally distributed among the students in each type of MBA.
From the label in section 4.5 of the One-Way ANOVA notes: When you are "Estimating Differences in
Population Means"
12. You have three random samples of 10 each. If you have an ANOVA F test statistic value of 11.34, would you
reject equal means? Why or why not?
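With c - 1 = 3 - 1 = 2 and n - c = 30 - 3 = 27 degrees of freedom, the F-table value at the 0.05 level is 3.35.
Since 11.34 > 3.35 reject the null hypothesis of equal means.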
13. Different factors could affect a student’s test grade. If you could set up an experiment, what would be one
possible factor you would use?
From section 1 of the One-Way ANOVA notes:
Response variable, Y, – a quantitative variable that may depend on the values of a qualitative variable
(factor)
One possible answer would be teaching method. Note: Since this is an experiment, the factor must be
assignable.
Therefore you could not use factors such as gender or student year (Freshman, etc.) since you can only
measure those factors.
14. You have a t test value of 12.34. What does the 12.34 measure?
The t test value is of the form (x̄ − μ) / σ_x̄. The value 12.34 indicates that there are 12.34 sample standard errors
between the sample mean and the hypothesized population mean.
15. The probability associated with your test statistic is called the p-value. What is another name or symbol for
the probability attached to the rejection region?
Rejection Region: values of the test statistic that would be unlikely if the null was true.
A Type I error is
16. If a sample mean equals 20, we cannot conclude that the population mean is also 20. What building block
supports this statement? Explain your reasoning.
Therefore you would not expect the population mean to be 20 just because the sample mean is 20.
17. If a normally distributed population has a mean of 15 and a standard deviation of 5, then sketch the
empirical rule and determine the probability of finding a value less than 20.
Range                    Percent
μ - 3σ up to μ - 2σ      2.5%
μ - 2σ up to μ - σ       13.5%
μ - σ up to μ            34.0%
μ up to μ + σ            34.0%
μ + σ up to μ + 2σ       13.5%
μ + 2σ up to μ + 3σ      2.5%

With μ = 15 and σ = 5:

Range          Percent
0 up to 5      2.5%
5 up to 10     13.5%
10 up to 15    34.0%
15 up to 20    34.0%
20 up to 25    13.5%
25 up to 30    2.5%

[Bar chart: percent of values in each range, from 0 up to 5 through 25 up to 30]

The probability would be the sum of the values for the ranges below 20 (2.5% + 13.5% + 34.0% + 34.0%), or 84%.
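The same sum can be reproduced in a few lines. This is a sketch only: it uses the rounded empirical-rule percentages from the solution (2.5%, 13.5%, 34.0%), not exact normal probabilities, and the variable names are illustrative.

```python
# Mean and standard deviation given in the problem
mu, sigma = 15, 5

# Rounded empirical-rule percentages for the six one-sigma bands,
# from (mu - 3 sigma) up to (mu + 3 sigma), as listed in the solution
percents = [2.5, 13.5, 34.0, 34.0, 13.5, 2.5]

# Build (lower, upper, percent) for each band
bands = [
    (mu + k * sigma, mu + (k + 1) * sigma, p)
    for k, p in zip(range(-3, 3), percents)
]

for lower, upper, p in bands:
    print(f"{lower:>2} up to {upper:<2}  {p}%")

# Probability of a value less than 20: add every band that ends at or below 20
prob_below_20 = sum(p for lower, upper, p in bands if upper <= 20)
print(f"P(value < 20) is about {prob_below_20}%")
```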
18. You are trying to estimate a population mean using a very large random sample. You know nothing about the
population. What table would you use?
Since nothing is known about the population, then the population standard deviation is not known.
8.3 Estimation of population mean if the population is normal but the population standard error is
unknown
In Section 8.3, we use a t-table for this confidence interval (and you could also have mentioned that
since we do not know for sure if the population is normal, the requirements would force us to take a large
random sample).
19. If you have a confidence interval of 30 to 80 for a population mean, can you conclude that the population
mean is larger than 10? Why or why not?
If you think the population mean is in the interval 30 to 80 and since both 30 and 80 are larger than 10, then
yes you would conclude the population mean is larger than 10.
20. What table value would you use for a 99% confidence interval for μ if σ is unknown and n=16?
Using Section 8.3, we would use a t-table with degrees of freedom n-1 = 15. Using the Within header row of
the table, the last column, and row 15, the table value is 2.9467.
21. The 95% confidence intervals for the difference in average salary between two types of jobs are given below
for four cases. Each case is in the format of a sample difference plus or minus the margin of error. Case 1:
$30,000 ± $5, Case 2: $30,000 ± $45,000, Case 3: $1 ± $45,000 and Case 4: $1 ± 32 cents. In each case, indicate
(a) whether you would reject equality of means and (b) whether you would want to increase your sample size to
get a more precise estimate.
Let the difference be the mean of job 1 minus the mean of Job 2:
Case 1: You believe that the average salary of job 1 is from $29,995 to $30,005 more than job 2. I would say
job 1 is better than job 2 and the precision is very good since we are only off by $5. (Choose job 1)
Case 2: You believe that the average salary of job 1 is from $15,000 lower than to $75,000 more than job 2. I
am not sure which one is better since the estimated difference is very imprecise, being off by plus or minus
$45,000. You need more precision since in that range there are some very large differences in salary.
Case 3: This is similar to case 2 since you cannot tell which one is better and the precision is poor.
Case 4: You believe that the average salary of job 1 is from 68 cents to $1.32 more than job 2. Since the interval
does not contain zero, job 1 pays more on average, but only by about a dollar, and the precision is very good
since the largest error I would expect is 32 cents. (In this case, it does not really matter which job you choose.)
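The intervals for the four cases can be tabulated with a short script. This is a sketch under one stated assumption: part (a) is decided by the usual criterion of whether zero lies inside the interval. Whether a difference is large enough to matter, and whether more precision is worth the cost, remain judgment calls that the code does not make. The function names are illustrative.

```python
# Each case: (sample difference, margin of error), in dollars
cases = {
    1: (30000.0, 5.0),
    2: (30000.0, 45000.0),
    3: (1.0, 45000.0),
    4: (1.0, 0.32),
}

def interval(diff, margin):
    """Return (lower, upper) for diff plus or minus margin."""
    return diff - margin, diff + margin

def contains_zero(diff, margin):
    """True when a difference of zero (equal means) is inside the interval."""
    lower, upper = interval(diff, margin)
    return lower <= 0 <= upper

for case, (diff, margin) in cases.items():
    lower, upper = interval(diff, margin)
    decision = "cannot reject equality" if contains_zero(diff, margin) else "reject equality"
    print(f"Case {case}: {lower:,.2f} to {upper:,.2f} -> {decision}")
```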
22. Find the population standard deviation of the following five values: 1, 1, 2, 2, 4
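x    x - mean      (x - mean)²
1    (1-2) = -1    1
1    (1-2) = -1    1
2    (2-2) = 0     0
2    (2-2) = 0     0
4    (4-2) = 2     4
Step a: mean = 10/5 = 2
Step d: Sum of the squared deviations = 6
Step e: σ² = 6/5 = 1.2
Taking the square root gives the population standard deviation: σ = √1.2 ≈ 1.095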
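The steps can be verified with a short script. This is a sketch with illustrative variable names. It divides by the number of values (5), as for a population, and then takes the square root of the variance to get the standard deviation the question asks for. The result is cross-checked against `statistics.pstdev` from the Python standard library.

```python
import math
import statistics

values = [1, 1, 2, 2, 4]

# Step a: mean
mean = sum(values) / len(values)

# Deviations from the mean and their squares
deviations = [x - mean for x in values]
squared = [d ** 2 for d in deviations]

# Step d: sum of the squared deviations
total = sum(squared)

# Step e: population variance divides by N, not N - 1
variance = total / len(values)

# Population standard deviation is the square root of the variance
std_dev = math.sqrt(variance)

print(f"mean = {mean}, sum of squares = {total}, variance = {variance}")
print(f"population standard deviation = {std_dev:.4f}")

# Cross-check with the standard library's population standard deviation
assert math.isclose(std_dev, statistics.pstdev(values))
```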
23. According to the notes, what is one problem with using surveys?
From the Review of Statistics 1 notes: Section 2, b. iv:
Typically some people do not respond which can bias the results
Individual responses vary from day to day.
24. You run a consulting company and wish to show that the time (in days) before being paid by first time
customers exceeds 30 days on average. From a random sample of 30 companies you calculate a t-test value of
2.45. What can you conclude?
a. Determine the alternative hypothesis based on what you wish to show. H1: μ > 30
b. Based on the alternative decide what would cause you to reject the null and support
the alternative: In order to support that the mean time is greater than 30 you must
find sample means much higher than 30. Far enough as to be unlikely.
c. Determine your critical value of the t and the rejection region: Degrees of freedom equals n-1 = 30 - 1
= 29. Use the 0.05 column in > header row. From the t-table we find the critical value to be 1.6991.
Your rejection region is then Reject Ho if t > 1.6991
d. Calculate the test statistic using the formula: The t test value of 2.45 is given
e. Determine by your rule in c, if you should Reject Ho: Since 2.45 is greater than 1.6991
then reject Ho
f. Make a conclusion in context of the problem: We can show that the mean time of all
invoice payments is greater than 30 days.
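The decision rule in steps c through e can be written out as a short script. This is a sketch only: the critical value 1.6991 is typed in from the t-table as quoted in the solution, not computed, and the variable names are illustrative.

```python
# Values given in the problem
n = 30            # number of companies sampled
t_value = 2.45    # t-test value given in the problem

# c. degrees of freedom and critical value
df = n - 1
critical_value = 1.6991  # t-table, 0.05 column, row 29 (from the solution)

# e. upper-tail rule for H1: mean > 30, so reject Ho when t exceeds the critical value
reject_null = t_value > critical_value

print(f"degrees of freedom = {df}")
if reject_null:
    print(f"{t_value} > {critical_value}: reject Ho")
else:
    print(f"{t_value} <= {critical_value}: do not reject Ho")
```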
25. Which building block is used in testing to see if the status quo value of a population mean has changed?
Explain your reasoning
From Basic Building block 3 c: If the probability is low either the sample was unlikely or one of the
population values in the above ratio is not correct.