Statistics Final Review


Statistics / Review for Final Exam 2011 1

Theoretical questions:
1. Explain the steps for computing the pth percentile. Give an example using the quartiles.
2. Prove the short-cut formula for a population variance.
3. State and prove Bayes' theorem.
4. Describe the binomial distribution X=B(n,p) and prove the formula for P(X=k).
5. Let X=N(μ,σ) and Y=(X-μ)/σ. Explain how you compute P(X<a), P(X>b) and
P(a<X<b) in terms of Y. Deduce that Y=N(0,1).
6. Describe the sampling distribution of sample means, give an example. State the
central limit theorem in this case.
7. Describe the sampling distribution of sample proportion, give an example. State the
central limit theorem in this case.
8. Explain how to obtain the confidence interval for a normal population mean.
9. State the formulas for the sample size required to obtain a confidence interval with a
specified margin of error, given a significance level α, for a normal mean and for a
population proportion.
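
For question 4, the formula P(X=k) = C(n,k) p^k (1-p)^(n-k) can be checked numerically. The sketch below is an illustration only, not a proof; the function name binomial_pmf and the values n = 4, p = 0.5 are chosen here for the example and do not come from the review sheet.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X = B(n, p): count the ways to place k successes among n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Illustrative values (assumed for this example): n = 4 trials, success probability 0.5.
n, p = 4, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(pmf)  # probabilities over k = 0..n should add up to 1
print(pmf, total)
```

Summing the probabilities over all k is a quick sanity check that the formula defines a valid distribution.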

Problems:
Descriptive statistics
P1. How much do users pay for Internet service? Here are the monthly fees (in dollars) paid
by a random sample of 50 users of commercial Internet service providers in August 2000:
20 40 22 22 21 21 20 10 20 20 20 13 18 50 20 18 15 8 22 25
22 10 20 22 22 21 15 23 30 12 9 20 40 22 29 19 15 20 20 20
20 15 19 21 14 22 21 35 20 22
(a) Make a stemplot of these data. Briefly describe the pattern you see. About how much do
you think America Online and its larger competitors were charging in August 2000?
(b) Which observations are suspected outliers by the 1.5 × IQR rule?
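
The 1.5 × IQR rule in part (b) can be sketched in code. This is one possible approach, assuming quartiles are taken as the medians of the lower and upper halves of the sorted data; other quartile conventions can give slightly different fences. The helper name quartiles is hypothetical.

```python
from statistics import median

# Monthly fees from P1 (50 observations).
fees = [20, 40, 22, 22, 21, 21, 20, 10, 20, 20, 20, 13, 18, 50, 20, 18, 15, 8, 22, 25,
        22, 10, 20, 22, 22, 21, 15, 23, 30, 12, 9, 20, 40, 22, 29, 19, 15, 20, 20, 20,
        20, 15, 19, 21, 14, 22, 21, 35, 20, 22]

def quartiles(values):
    """Q1 and Q3 as the medians of the lower and upper halves of the sorted data."""
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    # For an odd sample size the overall median is left out of both halves.
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return median(lower), median(upper)

q1, q3 = quartiles(fees)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
# Observations strictly outside the fences are flagged as suspected outliers.
outliers = sorted(x for x in fees if x < low_fence or x > high_fence)
print(q1, q3, iqr, outliers)
```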

P2. Some companies grade on a bell curve to compare the performance of their managers
and professional workers. This forces the use of some low performance ratings, so that not
all workers are graded above average. Until the threat of lawsuits forced a change, Ford
Motor Company's performance management process assigned 10% A grades, 80% B
grades, and 10% C grades to the company's 18,000 managers. It isn't clear that the bell
curve of ratings is really a normal distribution. Nonetheless, suppose that Ford's
performance scores are normally distributed. One year, managers with scores less than 25
received C's and those with scores above 475 received A's. What are the mean and standard
deviation of the scores?
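
One way to approach P2 is to treat the two cutoffs as the 10th and 90th percentiles of a normal distribution and solve the two resulting equations. The sketch below assumes exactly that reading of the problem; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()               # standard normal, mean 0 and sd 1
z_low = std_normal.inv_cdf(0.10)        # 10% of scores fall below the C cutoff
z_high = std_normal.inv_cdf(0.90)       # 10% of scores fall above the A cutoff
c_cutoff, a_cutoff = 25, 475

# Solve mean + z_low * sd = 25 and mean + z_high * sd = 475 for sd, then for mean.
sd = (a_cutoff - c_cutoff) / (z_high - z_low)
mean = c_cutoff - z_low * sd
print(round(mean, 1), round(sd, 1))
```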

P3. The median of any normal distribution is the same as its mean. We can use normal
calculations to find the quartiles and related descriptive measures for normal distributions.
(a) What is the area under the standard normal curve to the left of the first quartile? Use this
to find the value of the first quartile for a standard normal distribution. Find the third
quartile similarly.

1
Lecturer: Dr. Pho Duc Tai, Department of Mathematics, Vietnam National University, E-mail: [email protected]
(b) Your work in (a) gives the z-scores for the quartiles of any normal distribution. Scores
on the Wechsler Intelligence Scale for Children (WISC) are normally distributed with mean
100 and standard deviation 15. What are the quartiles of WISC scores?
(c) What is the value of the IQR for the standard normal distribution?
(d) What percent of the observations in the standard normal distribution are suspected
outliers according to the 1.5 × IQR rule? (This percent is the same for any normal
distribution.)
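
The calculations in P3 can be checked with Python's standard library. This is a sketch under the problem's own assumptions (exact normality, WISC mean 100 and standard deviation 15); the variable names are illustrative.

```python
from statistics import NormalDist

std = NormalDist()                      # standard normal distribution
q1, q3 = std.inv_cdf(0.25), std.inv_cdf(0.75)
iqr = q3 - q1

# Fences of the 1.5 x IQR rule and the area that falls outside them.
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outlier_fraction = std.cdf(low_fence) + (1 - std.cdf(high_fence))

# The same z-scores rescaled to WISC scores.
wisc = NormalDist(mu=100, sigma=15)
wisc_q1, wisc_q3 = wisc.inv_cdf(0.25), wisc.inv_cdf(0.75)

print(round(q1, 4), round(q3, 4), round(iqr, 4))
print(round(wisc_q1, 2), round(wisc_q3, 2))
print(round(100 * outlier_fraction, 2), "percent suspected outliers")
```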

P4. Many random number generators allow users to specify the range of the random
numbers to be produced. Suppose that you specify that the outcomes are to be distributed
uniformly between 0 and 2. Then the density curve of the outcomes has constant height
between 0 and 2, and height 0 elsewhere.
(a) What is the height of the density curve between 0 and 2? Draw a graph of the density
curve.
(b) Use your graph from (a) and the fact that areas under the curve are proportions of
outcomes to find the proportion of outcomes that are less than 1.
(c) Find the proportion of outcomes that lie between 0.5 and 1.3.

P5. Here is a two-way table of all suicides committed in a recent year by sex of the victim
and method used.
Method       Male    Female
Firearms   15,802     2,367
Poison      3,262     2,233
Hanging     3,822       856
Other       1,571       571
Total      24,457     6,027

(a) What is the probability that a randomly selected suicide victim is male?
(b) What is the probability that the suicide victim used a firearm?
(c) What is the conditional probability that a suicide used a firearm, given that it was a
man? Given that it was a woman?
(d) Describe in simple language (don't use the word "probability") what your results in (a)
tell you about the difference between men and women with respect to suicide.
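
The probabilities in P5 follow directly from the table counts. A minimal sketch, with the table stored in a dictionary whose layout is chosen here for illustration:

```python
# Counts from the two-way table in P5.
counts = {
    "Firearms": {"Male": 15802, "Female": 2367},
    "Poison":   {"Male": 3262,  "Female": 2233},
    "Hanging":  {"Male": 3822,  "Female": 856},
    "Other":    {"Male": 1571,  "Female": 571},
}

total_male = sum(row["Male"] for row in counts.values())
total_female = sum(row["Female"] for row in counts.values())
total = total_male + total_female

p_male = total_male / total                                    # part (a)
p_firearm = sum(counts["Firearms"].values()) / total           # part (b)
# Conditional probabilities divide by the column total, not the grand total.
p_firearm_given_male = counts["Firearms"]["Male"] / total_male        # part (c)
p_firearm_given_female = counts["Firearms"]["Female"] / total_female  # part (c)

print(round(p_male, 4), round(p_firearm, 4))
print(round(p_firearm_given_male, 4), round(p_firearm_given_female, 4))
```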

Sampling distribution and Estimation


Problem 1. The standard error of the mean decreases if
(a) the sample size decreases.
(b) the standard deviation increases, provided that n is constant.
(c) the standard deviation decreases or if n increases.
(d) the population size decreases.
Problem 2. Which statement is incorrect? Explain.
(a) If p = .50 and n = 64 the estimated standard error of the sample proportion is .025.
(b) In a sample size calculation for estimating π it is conservative to assume π = .50.
(c) If n = 250 and p = .07 it is safe to assume normality in a confidence interval for π.
Problem 3. The owner of Limp Pines Resort wanted to know the average age of its clients.
A random sample of 25 tourists is taken. It shows a mean age of 46 years with a standard
deviation of 5 years. The width of a 98 percent confidence interval for the true mean client
age is approximately
2.06 / 2.33 / 2.49 / 2.79 (years)

Problem 4. A random sample of 16 ATM transactions at the Last National Bank of Flatrock
revealed a mean transaction time of 2.8 minutes with a standard deviation of 1.2 minutes.
The width (in minutes) of the 95% confidence interval for the true mean transaction time is
0.639 / 0.588 / 0.300 / 2.131
Problem 5. To estimate the average annual expenses of students on books and class
materials a sample of size 36 is taken. The mean is $850 and the standard deviation is $54.
A 99% confidence interval for the population mean is
(a) $823.72 to $876.28
(b) $826.82 to $873.18
(c) $831.73 to $868.27
(d) $825.48 to $874.52
Problem 6. A poll showed that 48 out of 120 randomly chosen graduates of California
medical schools last year intended to specialize in family practice. What is the width of a
90% confidence interval for the proportion that plan to specialize in family practice?
.04472 / .07357 / .08765 / .00329
Problem 7. In a random sample of 810 women employees, it is found that 81 would prefer
working for a female boss. The width of the 95% confidence interval for the proportion of
women who prefer a female boss is
.0288 / .0105 / .0196 / .0207
Problem 8. Jolly Blue Giant Health Insurance (JBGHI) is concerned about rising lab test
costs and would like to know what proportion of the positive lab tests for prostate cancer
are actually proven correct through subsequent biopsy. JBGHI demands a sample large
enough to ensure an error of 2% with 90% confidence. What is the necessary sample size?
2,401 / 1,692 / 1,604 / 609
Problem 9. A financial institution wishes to estimate the mean balances owed by its credit
card customers. The population standard deviation is estimated to be $300. If a 98 percent
confidence interval is used and an interval of $75 is desired, how many cardholders should
be sampled?
3,382 / 62 / 629 / 87
Problem 10. Landings and takeoffs at Schiphol, Holland, per month are (in 1,000s) as
follows:
26, 19, 27, 30, 18, 17, 21, 28, 18, 26, 19, 20, 23, 18, 25, 29, 30, 26, 24, 22, 31, 18, 30, 19
Assume a random sample of months. Give a 95% confidence interval for the average
monthly number of takeoffs and landings.
Answer: [21.507, 25.493]
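
The interval in Problem 10 can be reproduced with a t-based confidence interval for the mean. In this sketch the critical value 2.069 is an assumption: it is read from a standard t table (23 degrees of freedom, upper-tail area 0.025), because Python's standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Monthly landings and takeoffs (in 1,000s) from Problem 10.
flights = [26, 19, 27, 30, 18, 17, 21, 28, 18, 26, 19, 20,
           23, 18, 25, 29, 30, 26, 24, 22, 31, 18, 30, 19]

n = len(flights)
xbar = mean(flights)
s = stdev(flights)            # sample standard deviation (n - 1 in the denominator)
t_crit = 2.069                # assumed t table value for 23 df, 95% confidence

half_width = t_crit * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(round(lower, 3), round(upper, 3))
```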
Problem 11. The Java computer language, developed by Sun Microsystems, has the
advantage that its programs can run on types of hardware ranging from mainframe
computers all the way down to handheld computing devices or even smart phones. A test of
100 randomly selected programmers revealed that 71 preferred Java to their other most used
computer languages. Construct a 95% confidence interval for the proportion of all
programmers in the population from which the sample was selected who prefer Java.
Answer: [0.6211, 0.7989]
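
The answer to Problem 11 uses the large-sample (normal approximation) interval for a proportion. A minimal sketch of that calculation follows; the function name proportion_ci is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes: int, n: int, confidence: float):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

lower, upper = proportion_ci(71, 100, 0.95)
print(round(lower, 4), round(upper, 4))
```

The same function applies to Problem 14 by calling proportion_ci(368, 500, 0.95).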
Problem 12. According to the Wall Street Journal, an average of 44 tons of carbon dioxide
will be saved per year if new, more efficient lamps are used. Assume that this average is
based on a random sample of 15 test runs of the new lamps and that the sample standard
deviation was 18 tons. Give a 90% confidence interval for average annual savings.
Answer: [35.81417, 52.18583]

Problem 13. Sony's new optical disk system prototype was tested and claimed to be able to
record an average of 1.2 hours of high-definition TV. Assume n = 10 trials and σ = 0.2 hour.
Give a 90% confidence interval.
Answer: [1.0841, 1.3159]
Problem 14. FinAid is a new, free Web site that helps people obtain information on 180,000
college tuition aid awards. A random sample of 500 such awards revealed that 368 were
granted for reasons other than financial need. They were based on the applicants'
qualifications, interests, and other variables. Construct a 95% confidence interval for the
proportion of all awards on this service made for reasons other than financial need.
Answer: [0.6974, 0.7746]
Problem 15. In May 2007, a banker was arrested and charged with insider trading after
government investigators had secretly looked at a sample of nine of his many trades and
found that on these trades he had made a total of $7.5 million. Compute the average earning
per trade. Assume also that the sample standard deviation was $0.5 million and compute a
95% confidence interval for the average earning per trade for all trades made by this banker.
Use the assumption that the nine trades were randomly selected. Suppose the confidence
interval contained the value 0.00. How could the banker's attorney use this information to
defend his client?
Answer: did benefit
Problem 16. A marketing manager wishes to estimate the proportion of customers who
prefer a new packaging of a product to the old. He guesses that 60% of the customers would
prefer the new packaging. The manager wishes to estimate the proportion to within 2% with
90% confidence. What is the minimum required sample size?
Answer:
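
The sample-size calculation asked for in Problem 16 can be sketched as follows. The code uses the usual formulas n = z² p(1-p) / E² for a proportion and n = (z σ / E)² for a mean, rounding up to the next whole unit. The function names are hypothetical, and the exact z value from the normal distribution is used rather than a rounded table value, which can shift the result by one in borderline cases.

```python
from math import ceil
from statistics import NormalDist

def z_for(confidence: float) -> float:
    """Two-sided standard normal critical value for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def n_for_proportion(p_guess: float, margin: float, confidence: float) -> int:
    """Smallest n giving the requested margin of error for a proportion."""
    z = z_for(confidence)
    return ceil(z**2 * p_guess * (1 - p_guess) / margin**2)

def n_for_mean(sigma: float, margin: float, confidence: float) -> int:
    """Smallest n giving the requested margin of error for a mean, sigma known."""
    z = z_for(confidence)
    return ceil((z * sigma / margin) ** 2)

# Problem 16: planning value 60%, margin of error 2%, 90% confidence.
n_packaging = n_for_proportion(0.60, 0.02, 0.90)
print(n_packaging)
```

For a mean, a call such as n_for_mean(300, 75, 0.98) treats 75 as the margin of error on each side of the estimate, which is an assumption about how the requested interval is specified.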
Problem 17. According to Shape, on the average, 1/2 cup of edamame beans contains 6
grams of protein. If this conclusion is based on a random sample of 50 half-cups of
edamames and the sample standard deviation is 3 grams, construct a 95% confidence
interval for the average amount of protein in 1/2 cup of edamames.
Answer: [5.147, 6.853]

One-Sample Hypothesis Tests


Problem 1. Given H0: μ = 18 and H1: μ < 18, we would commit Type I error if we
(a) conclude that μ ≥ 18 when the truth is that μ < 18.
(b) conclude that μ < 18 when the truth is that μ ≥ 18.
(c) fail to reject μ ≥ 18 when the truth is that μ < 18.
Problem 2. Which is not true of p-values?
(a) When they are small, we want to reject H0.
(b) They must be specified before the sample is taken.
(c) They show the chance of Type I error if we reject H0.
Problem 3. DullCo Manufacturing claims that its alkaline batteries last at least forty hours
on average in a certain type of portable CD player. But tests on a random sample of 18
batteries from a day's large production run showed a mean battery life of only 37.8 hours
with a standard deviation of 5.4 hours. To test DullCo's hypothesis, the test statistic is
-1.980 / -1.728 / -2.101 / -1.960
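
The test statistic in Problem 3 is the one-sample t statistic, t = (x̄ - μ0) / (s / √n). A minimal sketch, with a hypothetical helper name:

```python
from math import sqrt

def t_statistic(xbar: float, mu0: float, s: float, n: int) -> float:
    """One-sample t statistic: distance of the sample mean from mu0 in standard errors."""
    return (xbar - mu0) / (s / sqrt(n))

# Problem 3: claimed mean 40 hours, sample of 18 batteries, mean 37.8, sd 5.4.
t = t_statistic(xbar=37.8, mu0=40, s=5.4, n=18)
print(round(t, 3))
```

The statistic would then be compared with a t critical value on n - 1 = 17 degrees of freedom.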
Problem 4. Last year, 10 percent of all teenagers owned an iPhone. This year, a sample of
260 randomly chosen teenagers showed that 39 owned an iPhone. (a) The test statistic to
find out whether the percent has risen is
2.687 / 2.758 / .0256 / 2.258
(b) To test whether the percent has risen, the critical value at α = .05 is
1.686 / 1.655 / 1.645 / 1.960
(c) To test whether the percent has risen, the p-value is
.0501 / .0314 / .0492 / .0036
Problem 5. Assuming that other factors remain the same, which of the following
statements is most nearly correct for a t-test of a mean?
(a) The critical value of Student's t is smaller if n is smaller.
(b) If t_test = 1.482 with n = 22, we get a clear-cut rejection in a right-tailed test at α = .05.
(c) Rejecting H0: μ = 75 in a two-tailed test implies rejection in a one-tailed test at the
same α.
(d) A calculated p-value of 0.13 would lead us to reject the null hypothesis at α = 0.10.
Problem 7. In a right-tail test, a statistician came up with a z test statistic of 1.470. What is
the p-value?
.4292 / .0708 / .0874 / .0301
Problem 8. In a right-tailed test of hypothesis for a population mean with 13 degrees of
freedom, the value of the test statistic was 1.863. The p-value is
less than .025 / between .025 and .05 / between .05 and .10 / greater than .10
Problem 9. Many recent changes have affected the real estate market. A study was
undertaken to determine customer satisfaction from real estate deals. Suppose that before
the changes, the average customer satisfaction rating, on a scale of 0 to 100, was 77. A
survey questionnaire was sent to a random sample of 50 residents who bought new plots
after the changes in the market were instituted, and the average satisfaction rating for this
sample was found to be 84; the sample standard deviation was found to be s = 28. Use an α
of your choice, and determine whether statistical evidence indicates a change in customer
satisfaction. If you determine that a change did occur, state whether you believe customer
satisfaction has improved or deteriorated.
Answer: z = 1.7678, Reject H0
Problem 10. A new chemical process is introduced by Duracell in the production of
lithium-ion batteries. For batteries produced by the old process, the average life of a battery
is 102.5 hours. To determine whether the new process affects the average life of the
batteries, the manufacturer collects a random sample of 25 batteries produced by the new
process and uses them until they run out. The sample mean life is found to be 107 hours,
and the sample standard deviation is found to be 10 hours. Are these results significant at
the α = 0.05 level? Are they significant at the α = 0.01 level? Explain. Draw your
conclusion.
Answer: t(24) = 2.25, Do not reject H0 at α = 0.01, reject at α = 0.05
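
The decision in Problem 10 can be traced step by step. In this sketch the two critical values are assumptions read from a standard t table (24 degrees of freedom, two-tailed), since the standard library does not provide the t distribution.

```python
from math import sqrt

# Problem 10: old-process mean 102.5, sample of 25 with mean 107 and sd 10.
t = (107 - 102.5) / (10 / sqrt(25))

# Assumed two-tailed t table values for 24 degrees of freedom.
critical = {0.05: 2.064, 0.01: 2.797}

# Reject H0 at a given level when |t| exceeds the critical value for that level.
decisions = {alpha: abs(t) > t_crit for alpha, t_crit in critical.items()}
print(t, decisions)
```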
Problem 11. The manufacturer of electronic components needs to inform its buyers of the
proportion of defective components in its shipments. The company has been stating that the
percentage of defectives is 12%. The company wants to test whether the proportion of all
components that are defective is as claimed. A random sample of 100 items indicates 17
defectives. Use α = 0.05 to test the hypothesis that the percentage of defective components
is 12%.
Answer: z = 1.539, Do not reject H0 (p-value = 0.1238)
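
The z statistic and p-value in Problem 11 come from the two-tailed large-sample test for a proportion, with the standard error computed under H0. A minimal sketch, using a hypothetical function name:

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes: int, n: int, p0: float):
    """Two-tailed z test of H0: population proportion equals p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)          # standard error uses the hypothesized value
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = proportion_z_test(17, 100, 0.12)
print(round(z, 3), round(p_value, 4))
```

Because the p-value exceeds 0.05, H0 is not rejected at that level.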
Problem 12. A company's market share is very sensitive to both its level of advertising and
the levels of its competitors' advertising. A firm known to have a 56% market share wants
to test whether this value is still valid in view of recent advertising campaigns of its
competitors and its own increased level of advertising. A random sample of 500 consumers
reveals that 298 use the company's product. Is there evidence to conclude that the
company's market share is no longer 56%, at the 0.01 level of significance?
Answer: z = 1.622, Do not reject H0 (p-value = 0.1048)
Problem 13. According to Money, the average amount of money that a typical person in the
United States would need to make him or her feel rich is $1.5 million. A researcher wants to
test this claim. A random sample of 100 people in the United States reveals that their mean
amount to feel rich is $2.3 million and the standard deviation is $0.5 million. Conduct the
test.
Answer: z = 16.0, Reject H0
Problem 14. Certain eggs are stated to have reduced cholesterol content, with an average of
only 2.5% cholesterol. A concerned health group wants to test whether the claim is true. The
group believes that more cholesterol may be found, on the average, in the eggs. A random
sample of 100 eggs reveals a sample average content of 5.2% cholesterol, and a sample
standard deviation of 2.8%. Does the health group have cause for action?
Answer: z = 9.643, Reject H0
Problem 15. The engine of the Volvo model S70 T-5 is stated to provide 246 horsepower.
To test this claim, believing it is too high, a competitor runs the engine n = 60 times,
randomly chosen, and gets a sample mean of 239 horsepower and standard deviation of 20
horsepower. Conduct the test, using α = 0.01.
Answer: z = -2.711, Reject H0
Problem 16. According to BusinessWeek, the Standard & Poor's 500 Index posted an
average gain of 13% for 2006. If a random sample of 50 stocks from this index reveals an
average gain of 11% and standard deviation of 6%, can you reject the magazine's claim in a
two-tailed test? What is your p-value?
Answer: z = -2.3570, Reject H0
Problem 17. The null and alternative hypotheses of a t test for the mean are
H0: μ = 1,000, H1: μ < 1,000
Other things remaining the same, which of the following will result in an increase in
the p-value?
a. Increase in the sample size.
b. Increase in the sample mean.
c. Increase in the sample standard deviation.
d. Increase in α.
Problem 18. The null and alternative hypotheses of a test for population proportion are
H0: π = 0.25, H1: π > 0.25
Other things remaining the same, which of the following will result in an increase in
the p-value?
a. Increase in sample size.
b. Increase in sample proportion.
c. Increase in α.

Two-sample Hypothesis Tests


Problem 1. Weekly sales of diet coke at each of twelve Target stores are recorded before
and after installing a new eye-catching display. To determine if the display is effective in
increasing sales, what type of statistical test would you expect to perform?
(a) Comparison of means using an independent sample t-test.
(b) Comparison of means using a paired t-test.
(c) Comparison of means using a z-test.

Problem 2. Carver Memorial Hospital's surgeons have a new procedure that they think will
decrease the time to perform an appendectomy. A sample of 8 appendectomies using the old
method had a mean of 38 minutes with a variance of 36 minutes, while a sample of 10
appendectomies using the experimental method had a mean of 29 minutes with a variance
of 16 minutes.
(a) For a right-tail test of means (assume equal variances) the critical value for α = .10 is
1.746 / 1.337 / 2.120 / 2.754
(b) For a right-tail test of means (assume equal variances) the test statistic is
2.365 / 3.814 / 3.000 / 1.895
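
The test statistic in Problem 2(b) is the pooled-variance two-sample t statistic. A minimal sketch under the problem's equal-variance assumption; the function name pooled_t is hypothetical.

```python
from math import sqrt

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic with a pooled variance estimate, plus its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# Old method: n = 8, mean 38, variance 36. New method: n = 10, mean 29, variance 16.
t, df = pooled_t(38, 36, 8, 29, 16, 10)
print(round(t, 3), df)
```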

Problem 3. In a test of a new surgical procedure, the five most respected surgeons in
FlatBroke Township were invited to Carver Hospital. Each surgeon was assigned two
patients of the same age, gender, and overall health. One patient was operated upon in the
old way, and the other in the new way. Both procedures are considered equally safe. The
time (in minutes) to complete each procedure is shown:
Surgeon    Allen   Bob   Chloe   Daphne   Edgar
Old Way       36    55      28       40      62
New Way       31    45      28       35      57

(a) In a right-tail test for a difference of means at α = .05, the critical value is
(i) 3.162, paired t-test
(ii) 2.132, paired t-test
(iii) 1.645, independent samples t-test
(iv) 2.776, independent samples t-test
(b) In a right-tailed test for a difference of means, the test statistic is
3.162 / 1.645 / 1.860 / 2.132
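
Because each surgeon operated on a matched pair of patients, Problem 3 can be handled as a paired t-test on the per-surgeon differences. A minimal sketch of that calculation:

```python
from math import sqrt
from statistics import mean, stdev

# Times in minutes for Allen, Bob, Chloe, Daphne, Edgar.
old_way = [36, 55, 28, 40, 62]
new_way = [31, 45, 28, 35, 57]

# Work with the differences (old minus new) within each pair.
diffs = [o - n for o, n in zip(old_way, new_way)]
d_bar = mean(diffs)
s_d = stdev(diffs)                       # sample standard deviation of the differences
t = d_bar / (s_d / sqrt(len(diffs)))
df = len(diffs) - 1
print(diffs, round(t, 3), df)
```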

Problem 4. The Board of Surgeons recommends a postoperative examination six months
after a prostatectomy. In a sample from the records of Cutter Memorial Hospital, follow-up
exams were given in 90 out of 200 cases. In a sample of records from Paymor Hospital,
follow-up exams were given in 110 out of 200 cases.
(a) In comparing the two proportions, can we assume normality?
(b) In a left-tailed test for equality of proportions, the test statistic is
-1.96 / -2.58 / -2.00 / -3.47
(c) In a left-tailed test for equality of proportions the p-value is
.9772 / .4772 / .0228 / .0014
Problem 5. Two movies were screen-tested at two different samples of theaters. Mystic
River was viewed at 80 theaters and was considered a success in terms of box office sales in
60 of these theaters. Swimming Pool was viewed at a random sample of 100 theaters and
was considered a success in 65. Based on these data, do you believe that one of these
movies was a greater success than the other? Explain.
Answer: z = 1.447, Do not reject H0 (p-value = 0.1478)
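
The answer to Problem 5 comes from a two-tailed z test for the equality of two proportions, using the pooled proportion in the standard error. A minimal sketch, with a hypothetical function name:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Two-tailed z test of H0: the two population proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)       # common proportion estimated under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Mystic River: 60 successes in 80 theaters. Swimming Pool: 65 in 100.
z, p_value = two_proportion_z(60, 80, 65, 100)
print(round(z, 3), round(p_value, 4))
```

The same function covers Problem 4 with two_proportion_z(90, 200, 110, 200), where only the left tail is of interest.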
Problem 6. Home loan delinquencies have recently been causing problems throughout the
American economy. According to USA Today, the percentage of homeowners falling behind
on their mortgage payments in some sections of the West has been 4.95%, while in some
areas of the South that rate was 6.79%. Assume that these numbers are derived from two
independent samples of 1,000 homeowners in each region. Test for equality of proportions
of loan default in the two regions using α = 0.05.
Answer: z = -1.7503, Do not reject H0
Problem 7. Toys are entering the virtual world, and Mattel recently developed a digital
version of its famous Barbie. The average price of the virtual doll is reported to be $60. A
competing product sells for an average of $65. Suppose both averages are sample estimates
based on independent random samples of 25 outlets selling Barbie software and 20 outlets
selling the competing virtual doll, and suppose the sample standard deviation for Barbie is
$14 and for the competing doll it is $8. Test for equality of average price using the 0.05
level of significance.
Answer: t = -1.5048, Do not reject H0
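
The magnitude of the statistic in Problem 7 matches a two-sample t statistic computed without pooling the variances, taking the difference as first sample minus second. The sketch below assumes that approach; the Welch degrees of freedom shown are one common choice and are not stated in the original answer. The function name is hypothetical.

```python
from math import sqrt

def unpooled_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic with separate variances, plus Welch degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2      # squared standard error of each sample mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Barbie: n = 25, mean 60, sd 14. Competing doll: n = 20, mean 65, sd 8.
t, df = unpooled_t(60, 14, 25, 65, 8, 20)
print(round(t, 4), round(df, 1))
```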
Problem 8. According to Fortune, there has been an average decrease of 86% in the
Atlantic cod catch over the last two decades. Suppose that two areas are monitored for catch
sizes and one of them has a daily average of 1.7 tons and a standard deviation of 0.4 ton,
while the other has an average daily catch of 1.5 tons and a standard deviation of 0.7 ton.
Both estimates are obtained from independent random samples of 25 days. Conduct a test
for equality of mean catch and report your p-value.
Answer: Do not reject H0
