Solution To Exam 1


BSTAT 5325 – Exam I, Version 1 – Summer, 2017 – Printed Name_______________________

1. What is a requirement of all confidence intervals we have used?


2. In a medical test, the null hypothesis is that you are healthy. What would be the meaning of the power of the blood
test? Do not just give the definition. You must put it in terms of healthy and ill.
3. Suppose n = 144, the sample mean = 16, and the population standard deviation = 24. What is the 90% confidence interval
for the population mean?
4. How can you tell the difference between a confidence interval question from first level statistics and one from One-
Way ANOVA?
5. You are interested in whether there is a difference in average effectiveness among five types of ads. You have 15 people
already divided into three groups based on where they shop: 5 people who shop at Walmart, 5 at Target, and 5 at
Bestbuy. In each group you randomly assign one of the ads to one of the people in the group. If you wished to see
if there is a difference in the ads, what type of experiment is this? You must explain your reasoning.
6. An example of a statistic is _______________________ and an example of a parameter is __________________.
7. What is Dr. Eakin’s last name? (Hint: It is spelled E A K I N )
8. You are trying to estimate the difference between pairs of population average grades among three teaching methods.
What building block applies to this situation? You must explain your reasoning.
9. If you have samples of 20 from each of 5 populations, what is the rejection region when trying to show that at least two
means differ?
10. Suppose you want to compare work experience among types of MBAs. You have a random sample of students who
are enrolled in the cohort MBA, a random sample from the regular MBA, and a random sample from an online
MBA. One of the requirements of ANOVA is normality. In terms of this problem, restate the normality
assumption. Do not just give the definition but use the terms types of MBA and years of work experience.
11. When would you use the Tukey’s procedure?
12. You have three random samples of 10 each. If you have an ANOVA F test statistic value of 11.34, would you reject
equal means? Why or why not?
13. Different factors could affect a student’s test grade. If you could set up an experiment, what would be one possible
factor you would use?
14. You have a t test value of 12.34. What does the 12.34 measure?
15. The probability associated with your test statistic is called the p-value. What is another name or symbol for the
probability attached to rejection region?
16. If a sample mean equals 20, we cannot conclude that the population mean is also 20. What building block
supports this statement? Explain your reasoning.
17. If a normally distributed population has a mean of 15 and a standard deviation of 5, then sketch the empirical rule
and determine the probability of finding a value less than 20.
18. You are trying to estimate a population mean using a very large random sample. You know nothing about the
population. What table would you use?
19. If you have a confidence interval of 30 to 80 for a population mean, can you conclude that the population mean is
larger than 10? Why or why not?
20. What table value would you use for a 99% confidence interval for μ if σ is unknown and n=16?
21. The 95% confidence intervals for the difference in average salary between two types of jobs are given below for four
cases. Each case is in the format of a sample difference plus or minus the margin of error. Case 1: $30,000 ± $5,
Case 2: $30,000 ± $45,000, Case 3: $1 ± $45,000 and Case 4: $1 ± 32 cents. In each case, indicate (a) whether you
would reject equality of means and (b) whether you would want to increase your sample size to get a more precise
estimate.
22. Find the population standard deviation of the following five values: 1, 1, 2, 2, 4
23. According to the notes, what is one problem with using surveys?
24. You run a consulting company and wish to show that the time (in days) before being paid by first time customers
exceeds 30 days on average. From a random sample of 30 companies you calculate a t-test value of 2.45. What can
you conclude?
25. Which building block is used in testing to see if the status quo value of a population mean has changed? Explain
your reasoning

Solutions start on the next page.


1. What is a requirement of all confidence intervals we have used?

From the Confidence Intervals of Modules 2 and 3:

Requirements, one entry per confidence interval:

  (i)   Random sample; either (a) a large sample or (b) a normal distribution.
  (ii)  Random sample, large enough that the number of successes and the number of failures are each 5 or more.
  (iii) Random sample; either (a) a large sample or (b) a normal distribution.
  (iv)  S: All populations have the same variance.
        I: All samples are random and independently sampled, or objects are randomly assigned.
        N: All populations are normally distributed, or a large sample.

An acceptable answer would have been either (a) a random sample from each population or (b) a large sample or
sampling from a normal distribution.

2. In a medical test, the null hypothesis is that you are healthy. What would be the meaning of the power of the
blood test? Do not just give the definition. You must put it in terms of healthy and ill.

From Section 9 of the Review of Statistics 1 notes:

"Probability of rejecting the null when it is false is 1-β, also called the power of the test."

Replacing the phrase "the null " with the phrase "you are healthy" obtains the answer:

"The power is the probability of rejecting you are healthy when it is false"

And rephrasing:

"The power is the probability of concluding you are ill when in fact you are ill."

3. Suppose n = 144, the sample mean = 16, and the population standard deviation = 24. What is the 90% confidence
interval for the population mean?

From the steps at http://wweb.uta.edu/faculty/eakin/busa3321/zconint.xls

a. Find the standard error: $\sigma/\sqrt{n} = 24/\sqrt{144} = 24/12 = 2$
b. Find the table value for 90%: Z = 1.645
c. Determine the margin of error: $Z \cdot \sigma/\sqrt{n} = 1.645 \times 2 = 3.290$
d. Write the conclusion: We are 90% confident that the population mean falls between 12.71 (=16-3.29) and
19.29 (= 16+3.29)
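
The four steps above can be checked with a short Python sketch. It uses only the standard library, and the variable names are illustrative. As an assumption of this sketch, the z value is computed with `statistics.NormalDist` rather than read from the table, so it differs from 1.645 only beyond the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

n = 144            # sample size
x_bar = 16         # sample mean
sigma = 24         # population standard deviation
confidence = 0.90

# a. standard error of the mean
standard_error = sigma / sqrt(n)

# b. two-sided z value: leaves (1 - confidence) / 2 in each tail
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# c. margin of error
margin = z * standard_error

# d. interval endpoints
lower, upper = x_bar - margin, x_bar + margin
print(f"SE = {standard_error:.2f}, z = {z:.3f}, margin = {margin:.2f}")
print(f"90% CI: {lower:.2f} to {upper:.2f}")
```

Rounded to two decimals, the endpoints agree with the interval stated in step d.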
4. How can you tell the difference between a confidence interval question from first level statistics and one from
One-Way ANOVA?

From Review of Statistics 1 notes, section 8: You wish to estimate a population mean
From the One-Way ANOVA notes, section 4.5: You wish to estimate differences in population means

5. You are interested in whether there is a difference in average effectiveness among five types of ads. You have 15 people
already divided into three groups based on where they shop: 5 people who shop at Walmart, 5 at Target, and 5 at
Bestbuy. In each group you randomly assign one of the ads to one of the people in the group. If you wished to
see if there is a difference in the ads, what type of experiment is this? You must explain your reasoning.

From the One-Way ANOVA notes, section 2.2

2.2 Randomized Block Design – Randomization occurs only within subsets (blocks) of experimental units.

In the problem it says: In each group you randomly assign ….

Thus this is a randomized block design.

6. An example of a statistic is _______________________ and an example of a parameter is __________________.

From Section 1 of the Review of Statistics notes, it states:

Parameter – a number that describes some aspect of the population; e.g. the mean
Statistic – a number that describes some aspect of the sample

Thus an example of a statistic would be a sample mean while an example of a parameter would be a population
mean

7. What is Dr. Eakin’s last name? (Hint: It is spelled E A K I N )

8. You are trying to estimate the difference between pairs of population average grades among three teaching
methods. What building block applies to this situation? You must explain your reasoning.

From section 4.5 of the One-Way ANOVA notes:

"4.5.1 Estimate –the difference in the sample means. By the Second Building Block of the course we know
that this difference tends to be in error and does not equal to the difference in population means. By the
Fourth Building Block we know that the largest error in the difference we would expect with a specified
probability is called the _________________________ which is found by multiplying ___________ and
___________."

I would accept either the Second or the Fourth Building Block.


9. If you have samples of 20 from each of 5 populations, what is the rejection region when trying to show that at least
two means differ?

From Review of statistics 1, n is the sample size. From the One-Way Anova notes, section 4.3: c is the
number of averages.

From the One-Way Anova notes, Section 4.4.1: "If the F-ratio is large (above an F-table value) then reject equal
population means. The F table has two degrees of freedom: numerator degrees of freedom, c-1, and divisor
degrees of freedom, n-c."

From the problem c - 1 = 5 - 1 = 4 and n - c = (20*5) - 5 = 100 - 5 = 95. The closest table value is 2.49 = F
with 4 and 80 degrees of freedom.

Therefore using the example of section 4.4.2, the rejection region is: Reject Ho if F > 2.49
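
The degrees-of-freedom arithmetic above can be sketched in Python. The table value 2.49 is copied from the solution rather than computed, and the function name `reject_equal_means` is hypothetical, chosen only for illustration.

```python
samples_per_group = 20
c = 5                          # number of averages (populations)
n = samples_per_group * c      # total sample size across all groups

numerator_df = c - 1           # numerator degrees of freedom
divisor_df = n - c             # divisor degrees of freedom

# F-table value quoted in the solution (4 and 80 degrees of freedom,
# the closest table row to 95); it is not computed here.
f_table_value = 2.49

def reject_equal_means(f_ratio, critical=f_table_value):
    """Apply the rule 'Reject Ho if F > table value'."""
    return f_ratio > critical

print(f"numerator df = {numerator_df}, divisor df = {divisor_df}")
```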

10. Suppose you want to compare work experience among types of MBAs. You have a random sample of students
who are enrolled in the cohort MBA, a random sample from the regular MBA, and a random sample from an
online MBA. One of the requirements of ANOVA is normality. In terms of this problem, restate the
normality assumption. Do not just give the definition but use the terms types of MBA and years of work experience.

From Section 3.1 of the One-Way ANOVA notes:

Normality: The population distribution of the values of the response variable is assumed to be normal in
each population defined by the levels of the treatment.

In this problem the response variable is the work experience, with the populations being the students in each
type of MBAs.

Replacing the names response variable with work experience and population with type of MBA results in :

The work experience is assumed to be normally distributed among the students in each type of MBA.

11. When would you use the Tukey’s procedure?

From the label in section 4.5 of the One-Way ANOVA notes: When you are "Estimating Differences in
Population Means"

12. You have three random samples of 10 each. If you have an ANOVA F test statistic value of 11.34, would you
reject equal means? Why or why not?

From the examples at http://wweb.uta.edu/faculty/eakin/busa5325/OneWayAnova.xls


Reject Ho if the F test value > F(c-1, n-c) = F(2, 27) = 3.35

Since 11.34 > 3.35 reject the null hypothesis of equal means.

13. Different factors could affect a student’s test grade. If you could set up an experiment, what would be one
possible factor you would use?
From section 1 of the One-Way ANOVA notes:

Response variable, Y, – a quantitative variable that may depend on the values of a qualitative variable
(factor)

One possible answer would be teaching method. Note: Since this is an experiment, the factor must be
assignable.

Therefore you could not use factors such as gender or student year (Freshman, etc.) since you can only
measure those factors.

14. You have a t test value of 12.34. What does the 12.34 measure?

The t test value is of the form $t = (\bar{x} - \mu)/\sigma_{\bar{x}}$. The value 12.34 indicates that there are 12.34 sample
standard errors between the sample mean and the hypothesized population mean.

15. The probability associated with your test statistic is called the p-value. What is another name or symbol for
the probability attached to rejection region?

From Section 9: the rejection region is

Rejection Region: values of the test statistic that would be unlikely if the null was true.

A Type I error is

Type 1. Saying the process is out of control when it isn’t.

And the level of significance, α, is:

Probability of a type 1 error is α, also called the level of significance.

Therefore the symbol attached to the rejection region is α.

16. If a sample mean equals 20, we cannot conclude that the population mean is also 20. What building block
supports this statement? Explain your reasoning.

The second building block of the course states:

2. Sample estimates tend to be in error: e.g., sample mean – population mean ≠ 0.

Therefore you would not expect the population mean to be 20 just because the sample mean is 20.
17. If a normally distributed population has a mean of 15 and a standard deviation of 5, then sketch the
empirical rule and determine the probability of finding a value less than 20.

From section 6 of the Review of Statistics notes

Range                   Percent
μ - 3σ up to μ - 2σ      2.5%
μ - 2σ up to μ - σ      13.5%
μ - σ up to μ           34.0%
μ up to μ + σ           34.0%
μ + σ up to μ + 2σ      13.5%
μ + 2σ up to μ + 3σ      2.5%

Using the example just below this table in the notes:

Range          Percent
0 up to 5       2.5%
5 up to 10     13.5%
10 up to 15    34.0%
15 up to 20    34.0%
20 up to 25    13.5%
25 up to 30     2.5%

[Sketch: bar chart of the percentages above, one bar per range, vertical axis from 0% to 40%]

The probability is the sum of the percentages for the ranges below 20 (shown in red font in the original):
2.5% + 13.5% + 34.0% + 34.0% = 84%
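
The sum can be checked with a small Python sketch. The dictionary layout is an illustrative choice, and the comparison against the exact normal probability (via `statistics.NormalDist`) is an addition for reference; the empirical rule only approximates it.

```python
from statistics import NormalDist

mu, sigma = 15, 5

# Empirical-rule percentages from the table, keyed by (low, high) range
empirical = {
    (0, 5): 2.5, (5, 10): 13.5, (10, 15): 34.0,
    (15, 20): 34.0, (20, 25): 13.5, (25, 30): 2.5,
}

# Add up every range that lies entirely below 20
below_20 = sum(pct for (low, high), pct in empirical.items() if high <= 20)

# Exact normal probability P(X < 20), as a percentage, for comparison
exact = NormalDist(mu, sigma).cdf(20) * 100

print(f"empirical rule: {below_20:.1f}%, exact normal: {exact:.2f}%")
```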

18. You are trying to estimate a population mean using a very large random sample. You know nothing about the
population. What table would you use?

Since nothing is known about the population, then the population standard deviation is not known.

From the label of section 8.3 of the Review of Statistics notes:

8.3 Estimation of population mean if the population is normal but the population standard error is
unknown

In Section 8.3, we use a t-table for this confidence interval (and you could also have mentioned that
since we do not know for sure if the population is normal, the requirements would force us to take a large
random sample).
19. If you have a confidence interval of 30 to 80 for a population mean, can you conclude that the population
mean is larger than 10? Why or why not?

If you think the population mean is in the interval 30 to 80 and since both 30 and 80 are larger than 10, then
yes you would conclude the population mean is larger than 10.

20. What table value would you use for a 99% confidence interval for μ if σ is unknown and n=16?

Using Section 8.3, we would use a t-table with degrees of freedom n-1 = 15. Using the Within header row of
the table, the last column, and row 15, the table value is 2.9467.

21. The 95% confidence intervals for the difference in average salary between two types of jobs are given below
for four cases. Each case is in the format of a sample difference plus or minus the margin of error. Case 1:
$30,000 ± $5, Case 2: $30,000 ± $45,000, Case 3: $1 ± $45,000 and Case 4: $1 ± 32 cents. In each case, indicate
(a) whether you would reject equality of means and (b) whether you would want to increase your sample size to
get a more precise estimate.

Let the difference be the mean of job 1 minus the mean of Job 2:

Case 1: You believe that the average salary of job 1 is from $29,995 to $30,005 more than job 2. I would say
job 1 is better than job 2 and the precision is very good since we are only off by $5. (Choose job 1)

Case 2: You believe that the average salary of job 1 is from $15,000 lower than to $75,000 more than job 2. I
am not sure which one is better since the estimated difference is very imprecise, being off by plus or minus
$45,000. You need more precision since that range includes some very large differences in salary.

Case 3: This is similar to case 2 since you cannot tell which one is better and the precision is poor.

Case 4: You believe that the average salary of job 1 is from 68 cents to $1.32 more than job 2. I am not sure
which one is better but the precision is very good since the largest error I would expect is 32 cents. (In this
case, it does not really matter which job you choose.)
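
The interval endpoints for the four cases can be tabulated with a brief Python sketch. The helper names `interval` and `excludes_zero` are hypothetical. Checking whether an interval excludes zero is one common mechanical way to approach part (a); whether the precision is good enough for part (b) remains a judgment about the size of the margin of error in context.

```python
# (sample difference, margin of error) in dollars for each case
cases = {
    "Case 1": (30_000, 5),
    "Case 2": (30_000, 45_000),
    "Case 3": (1, 45_000),
    "Case 4": (1, 0.32),
}

def interval(difference, margin):
    """Return (lower, upper) for difference plus or minus margin."""
    return difference - margin, difference + margin

def excludes_zero(difference, margin):
    """True when zero (no difference in means) lies outside the interval."""
    lower, upper = interval(difference, margin)
    return lower > 0 or upper < 0

for name, (diff, moe) in cases.items():
    lower, upper = interval(diff, moe)
    print(f"{name}: {lower:,.2f} to {upper:,.2f}; "
          f"excludes zero: {excludes_zero(diff, moe)}")
```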

22. Find the population standard deviation of the following five values: 1, 1, 2, 2, 4

Using the example from Section 5 of the Review of Statistics notes:


Step a. Mean = 2

Values    Step b. Distance to Average    Step c. Square the Distances
1         (1-2) = -1                     1
1         (1-2) = -1                     1
2         (2-2) = 0                      0
2         (2-2) = 0                      0
4         (4-2) = 2                      4

Step d. Sum = 6
Step e. σ² = 6/5 = 1.2
Step f. σ = 1.095445115
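
Steps a through f can be reproduced in Python as a check. The variable names are illustrative; the final comparison uses `statistics.pstdev`, the standard library's population standard deviation, which divides by N as in Step e.

```python
from math import sqrt
from statistics import pstdev

values = [1, 1, 2, 2, 4]

mean = sum(values) / len(values)          # Step a
distances = [v - mean for v in values]    # Step b
squared = [d ** 2 for d in distances]     # Step c
total = sum(squared)                      # Step d
variance = total / len(values)            # Step e: divide by N, not N - 1
std_dev = sqrt(variance)                  # Step f

# pstdev computes the same population standard deviation directly
library_value = pstdev(values)
print(mean, total, variance, std_dev, library_value)
```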

23. According to the notes, what is one problem with using surveys?

From the Review of Statistics 1 notes: Section 2, b. iv:

• Typically some people do not respond, which can bias the results.
• Individual responses vary from day to day.

I would accept either answer

24. You run a consulting company and wish to show that the time (in days) before being paid by first time
customers exceeds 30 days on average. From a random sample of 30 companies you calculate a t-test value of
2.45. What can you conclude?

From the Review of Statistics 1 notes, Section 9, use an example from
http://wweb.uta.edu/faculty/eakin/busa3321/thyptest.xls that refers to "exceeds" or a right sided test

a. Determine the alternative hypothesis based on what you wish to show. H1: μ > 30
b. Based on the alternative decide what would cause you to reject the null and support
the alternative: In order to support that the mean time is greater than 30 you must
find sample means much higher than 30. Far enough as to be unlikely.
c. Determine your critical value of the t and the rejection region: Degrees of freedom equals n-1 = 30 - 1
= 29. Use the 0.05 column in > header row. From the t-table we find the critical value to be 1.6991.
Your rejection region is then Reject Ho if t > 1.6991
d. Calculate the test statistic using the formula: The t test value of 2.45 is given
e. Determine by your rule in c, if you should Reject Ho: Since 2.45 is greater than 1.6991
then reject Ho
f. Make a conclusion in context of the problem: We can show that the mean time of all
invoice payments is greater than 30 days.
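
The decision rule in steps c through e can be written out as a short Python sketch. The critical value 1.6991 is copied from the t-table lookup above rather than computed, and the variable names are illustrative.

```python
n = 30
degrees_of_freedom = n - 1

t_value = 2.45        # test statistic given in the problem
t_critical = 1.6991   # t-table value quoted above (0.05 column, 29 df)

# Right-sided test: reject Ho only when t falls above the critical value
reject_h0 = t_value > t_critical
print(f"df = {degrees_of_freedom}, reject Ho: {reject_h0}")
```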

25. Which building block is used in testing to see if the status quo value of a population mean has changed?
Explain your reasoning

From Basic Building block 3 c: If the probability is low either the sample was unlikely or one of the
population values in the above ratio is not correct.
