Sem 1 Review


1st Semester Final Exam Review

1. Researchers looking at the relationship between the type of college attended (public or private) and
achievement gather the following data on 3265 people who graduated from college in the same year. The
variable “management level” describes their job description 20 years after graduating from college.
                              Type of College
                              Public    Private
                    High         75       107
Management level    Medium      962       794
                    Low         732       595
(a) Calculate the marginal distribution of management level in percents.

(b) Find the conditional distribution of management level for each college type, in percents.

(c) Write a brief description of what the information in (a) and (b) tells you about the relationship between
these variables.
2. Literary scholars sometimes use the distribution of word lengths in a work as a test of authenticity. Here are
the word lengths for the first 25 words on a randomly-selected page from Toni Morrison’s Song of Solomon.

2 3 4 10 2 11 2 8 4 3 7 2 7
5 3 6 4 4 2 5 8 2 3 4 4

(a) Make a dotplot of these data.

(b) Describe the overall pattern of the distribution and any possible outliers.
3. The scores of a reference population on the Wechsler Intelligence Scale for Children (WISC) are
approximately Normally distributed with µ = 100 and σ = 15.

(a) What score would represent the 50th percentile? Explain.

(b) A score in what range would represent the top 1% of the scores?

(c) What proportion of the reference population has WISC scores below 110?

(d) What proportion of the reference population has WISC scores between 80 and 110?

(e) What is the interquartile range of WISC scores for the reference population?
4. Twenty students were asked to guess the age of a man in a photograph. Here are their guesses:

44 43 48 37 44 40 33 42 43 41

50 49 43 46 46 45 43 38 39 41

Are these guesses approximately Normally distributed? Provide evidence to support your answer.
5. Of the 50 species of oaks in the United States, 28 grow on the Atlantic coast and 11 grow in California. We
are interested in the distribution of acorn volumes among oak species. Here are back-to-back stemplots on
the volumes of acorns (in cubic centimeters) for these 39 oak species:

Use the stemplots to compare the distribution of acorn sizes between Atlantic Coast and California oak
species.
6. The Candela brothers own two pizza restaurants, one on Park Street and one on Bridge Road. The
computer output below summarizes the distribution of weekly revenues at each restaurant: 36
weeks for Park Street and 40 weeks for Bridge Road.

Descriptive Statistics: Park, Bridge

Variable    N   Mean  SEMean  StDev  Minimum    Q1  Median    Q3  Maximum
Park       36   6611     597   3580      800  3600    6600  9675    14100
Bridge     40   5989     299   1794     1800  5225    6000  7625     8600

(a) One week, Park Street’s revenues were $7500, which was the 15th highest revenue recorded for
that restaurant. In the same week, Bridge Road’s revenue was $7100, the 12th highest for that
restaurant. Use percentiles and z-scores to compare how successful each restaurant was that week,
relative to their typical weekly revenue.

(b) The weekly fixed operating costs for the Park Street restaurant are $3000, which means that net
weekly profit is weekly revenue minus $3000. Find the mean, median, standard deviation, and
interquartile range for net weekly profit.
7. Below is some information about the first ten United States Presidents.
                                                Age at         Age at
Name                   Political Party          Inauguration   Death    State of Birth
George Washington      Federalist               57             67       Virginia
John Adams             Federalist               61             90       Massachusetts
Thomas Jefferson       Democratic-Republican    57             83       Virginia
James Madison          Democratic-Republican    57             85       Virginia
James Monroe           Democratic-Republican    58             73       Virginia
John Quincy Adams      Democratic-Republican    57             80       Massachusetts
Andrew Jackson         Democrat                 61             78       South Carolina
Martin Van Buren       Democrat                 54             79       New York
William H. Harrison    Whig                     68             68       Virginia
John Tyler             Whig                     51             71       Virginia

(a) What are the individuals in this data set?

(b) Identify the variables that were recorded, and indicate whether each one is categorical or quantitative.

(c) Here is a pie chart for the distribution of the variable “State of birth.” Fill in the blanks with the
appropriate values of the variable.

(d) Below is a bar graph of the number of presidents of each political party. What is wrong with the way
information is presented in this graph?
8. How much oil wells in a given field will ultimately produce is crucial information in deciding whether to drill
more wells. Here are the estimated total amounts of oil recovered from 38 wells in the Devonian Richmond
Dolomite area of the Michigan basin, in thousands of barrels. The data are listed in ascending order (reading
down each column), along with a dotplot.

3 22 35 43 49 57 70 92
13 25 35 43 50 59 70 98
15 31 37 45 50 63 74 157
19 33 37 46 53 65 80
21 35 38 48 56 66 82

(a) What measures would you use to describe the center and spread of these data? Justify your answer.

(b) Find the five-number summary for these data.

(c) Are there any outliers? Justify your answer.

(d) For the oil well data above, how can you tell, without doing any calculations, that the mean
of these data is larger than the median?
9. Below are cumulative frequency graphs for the age distributions of the populations of France and the
Philippines.

(a) Use the graphs to compare the median and interquartile range of ages in the two countries.
10. Below is a cumulative relative frequency graph for the length of time a group of 62 students spent on
a no-time-limit final exam in Algebra II.

(a) What are the median and interquartile range for the amount of time these students spent on the
exam? Draw lines on the graph to show how you arrived at your answers.

(b) According to these data, the mean time students spent on the exam was 94.1 minutes, and the
standard deviation was 24.23 minutes. Suppose the exam proctor realized after compiling these
data that he had used the wrong start time in his calculation, so that each value for time spent on
exam needs to be reduced by 15 minutes. He also wants to express the times in hours, rather than
minutes. Find the mean and standard deviation of the transformed data.

(c) What are the mean and standard deviation of the z-scores of time spent on the exam for all the
students who took this exam? Justify your answer.
11. A church group interested in promoting volunteerism in a community chooses an SRS of 200
community addresses and sends members to visit these addresses during weekday working hours to
inquire about the residents’ attitudes toward volunteer work. Sixty percent of all respondents say
that they would be willing to donate at least an hour a week to some volunteer organization. Bias is
present in this sample design. Identify the type of bias involved and state whether you think the
sample percent obtained is higher or lower than the true population percent.
12. The probability that a randomly selected person in the United States is left-handed is about 0.14.

(a) Use this probability to explain what the Law of Large Numbers says.

(b) Among the 28 students in Mr. Millar’s Calculus BC class, 8 are left-handed. Could this have
happened by chance alone? Describe how you would use a random number table to simulate the
proportion of left-handers in a class of 28 students if they were chosen randomly from a population
that is 14% left-handed. Do not perform the simulation.
13. A student wonders if there is a relationship between foot length and height. She measures herself and
six classmates and produces the following data (heights and foot lengths are in centimeters):

Height         158   160   149   169   152   150   157
Foot Length     17    20    15    18    12    13    16

(a) Make a well-labeled scatterplot of these data.

(b) Based on the scatterplot, describe the pattern, if any, in the relationship between the heights and
foot lengths of these students.

(c) Use your calculator to find the correlation r between the height and foot length. Do the data
show any evidence of a relationship between these two variables? Explain.

(d) How would r change if


• all the heights were reduced by 5 centimeters?

• heights were measured in inches rather than centimeters? (There are 2.54 centimeters in an inch.)

(e) Suppose another student with foot length 19 cm and height 167 cm were added to the data. How
would this influence r?
14. The school’s newspaper has asked you to contact 100 of the approximately 1100 students at the
school to gather information about student opinions regarding food at your school’s cafeteria.

(a) With as much precision as possible, describe the population for your study.

(b) You are pretty sure that there is a big difference between the opinions of males and females when
it comes to cafeteria food. Describe a study design that takes into account this potentially important
variable. Explain the advantage of this method.

(c) You decide to conduct a survey about the quality of food served in the school cafeteria by
randomly selecting students as they leave the cafeteria after lunch on a specific day next week.
Describe a source of bias that may result from using this method. Be sure to use the correct
terminology, and indicate the direction of the potential bias.
15. A couple has two sons and decides to have a third child. The husband says, “We’re bound to have a
daughter this time: things balance out.” The wife says, “Nonsense! Two boys in a row means we
are more likely to have another boy.” Comment on this disagreement, based on your
understanding of probability.
16. Below are some data on the relationship between the price of a certain manufacturer’s flat-panel LCD
televisions and the area of the screen. We would like to use these data to predict the price of
televisions based on size.

(a) Use your calculator to find the equation of the least-squares regression line. Write the
equation below, defining any variables you use.

(b) This manufacturer also produces a television with a screen size of 943 square inches. Would it
be reasonable to use this equation to predict the price of that television? Explain.

(c) Calculate the residual for the television that has a screen area of 437 square inches. What does
this number suggest about the cost of this television, relative to the others?
17. Agricultural scientists for a chemical company want to determine if a newly developed fertilizer
produces heavier tomatoes than the fertilizer they currently manufacture. For their first pilot study,
they have 24 healthy young tomato plants growing in individual pots, numbered from 1 to 24.
Describe the design of a completely randomized, controlled experiment to test whether the new
fertilizer produces heavier tomatoes. Your answer should address all four basic principles of
experimental design.
18. A cookie manufacturer is trying to determine how long cookies stay fresh on store shelves, and the
extent to which the type of packaging and the store’s temperature influence how long the cookies
stay fresh. He designs a completely randomized experiment involving low (64 °F) and high (75 °F)
temperatures and two types of packaging—plastic and waxed cardboard. List the experimental
units, factors, and treatments in this experiment.
19. Alana’s favorite exercise machine is a stair climber. On the “random” setting, it changes speeds at
regular intervals, so the total number of simulated “floors” she climbs varies from session to session.
She also exercises for different lengths of time each session. She decides to explore the relationship
between the number of minutes she works out on the stair climber and the number of floors it tells
her that she’s climbed. She records minutes of climbing time and number of floors climbed for six
exercise sessions. Computer output and a residual plot from a linear regression analysis of the data
are shown below.

(a) What is the equation of the least-squares line? Be sure to define any variables you use.

(b) Is a line an appropriate model for these data? Justify your answer.

(c) Interpret the value of s (S = 2.3472) in the context of this problem.


20. A university’s financial aid office wants to know how much it can expect students to earn from
summer employment. This information will be used to set the level of financial aid. The
population contains 3478 students who have completed at least one year of study but have not yet
graduated. A questionnaire will be sent to an SRS of 100 of these students, drawn from an
alphabetized list.

(a) Explain clearly how you would use your calculator to choose a sample of 100 students for this
study.

(b) Explain how you could use a random digits table to choose a sample of 100 students for this
study.
21. For each study described below, comment briefly on the extent to which results can be generalized to
some larger population, and the extent to which cause and effect has been established.

(a) A marketing executive who wants to gauge reactions to a new packaging design for a popular
brand of cookie places the new packages in 45 randomly-selected grocery stores in a large city and
compares sales of the cookies to sales of the same cookie (with the old packaging) in the previous
month.

(b) A consumer advocacy organization wants to determine if using premium gasoline in the engines
of cars improves gas mileage. They randomly select 40 makes and models of new cars and acquire
two of each. They run each car on a track for 1000 miles, one with regular gasoline, one with
premium. (Which car within each pair gets the premium gas is determined by coin flip). After
driving each car, they determine the difference in fuel consumption within each pair of cars.

(c) A high school student thinks that the longer a student has been at the school, the less they like the
food in the cafeteria. To test this theory, she gives a two-question survey to the first 100 people
who enter the cafeteria on a certain day. The first question is, “How long have you attended school
here?” The second question asks the student to rate the food in the cafeteria on a 1 to 5 scale.

22. Some days, Ramon drives to work. The rest of the time he rides his bike. Suppose we choose a
random work day. The following table gives the probabilities of several events.

Event Probability
Student participates in sports 0.20
Student participates in sports and graduates 0.18
Student graduates, given no participation in sports 0.82

(a) Find the probability that Ramon is late for work, given that he drives.

(b) Find the probability that Ramon is not late for work, given that he drives.

(c) Draw a tree diagram to summarize the given probabilities and those you determined above.

(d) Find the probability that Ramon drove to work, given that he is late.
23. Suppose a person was having two surgeries performed at the same time by different operating teams.
Assume (unrealistically) that the two operations are independent. If the chances of success for
surgery A are 85%, and the chances of success for surgery B are 90%, what is the probability that
both will fail?
24. What age groups use social networking sites? A recent study produced the following data about
768 individuals who were asked their age and which of three social networking sites they used most
often. (People who did not use such sites were excluded from the study).
                        Age Group (Years)
Web site    0 – 24   25 – 44   45 – 64   Over 65   Totals
Facebook      77       105       114        12       308
Twitter       46       110        81         7       244
LinkedIn      15        97        95         9       216
Totals       138       312       290        28       768

Suppose one subject from this study was selected at random.

(a) Find the probability that the selected subject preferred Twitter.

(b) Find the probability that the selected subject preferred Twitter, given that he or she was in the 45
– 64 age group.

(c) Are the events “preferred Twitter” and “age group 45 – 64” independent? Explain.

(d) Are the events “preferred Twitter” and “age group 45 – 64” mutually exclusive? Explain.

(e) If a random sample of two subjects were selected, what is the probability that neither preferred
Twitter?
25. The manager of a children’s puppet theatre has determined that the number of adult tickets he sells for a
Saturday afternoon show is a random variable with a mean of 28.3 tickets and a standard deviation
of 5.3 tickets. The mean number of children’s tickets he sells is 42.5, with a standard deviation of
8.1.

(a) The adult tickets sell for $10. Let A = the money he collects from adult tickets on a random
Saturday. What are the mean and standard deviation of A?

(b) The children’s tickets sell for $6. Let T = the money he collects from all ticket sales (adults and
children) on a random Saturday. Assume (unrealistically, perhaps) that the number of tickets sold
to adults is independent of the number sold to children. What are the mean and standard deviation
of T?

(c) It costs $300 for the manager to put on each puppet show. Let P = the profit from a random
Saturday’s show. What are the mean and standard deviation of P?
26. Consider the following activity: The letters in the word AARDVARK are printed on identical
plastic cards with one letter per card. The eight cards are then placed in a hat, and one card is
randomly chosen (without looking) from the hat. The chance process we are interested in is what
letter is on the selected card.

(a) List the sample space S of all possible outcomes.

(b) Make a table that shows the set of outcomes and the probability of each outcome:

Outcome
Probability

(c) Consider the following events:

V: the letter chosen is a vowel.

F: the letter chosen falls in the first half of the alphabet (that is, between A and M).

List the outcomes in each of the following events, and determine their probabilities:

V = {                    P(V) =
F = {                    P(F) =
V or F = {               P(V or F) =
F^c = {                  P(F^c) =
V and F = {              P(V and F) =
V given F = {            P(V | F) =

(d) Are the events V and F independent? Explain.

(e) Are the events V and F mutually exclusive? Explain.


27. Mr. Voss and Mr. Cull bowl every Tuesday night. Over the past few years, Mr. Voss’s scores have
been approximately Normally distributed with a mean of 212 and a standard deviation of 31.
During the same period, Mr. Cull’s scores have also been approximately Normally distributed with a
mean of 230 and a standard deviation of 40. Assuming their scores are independent, what is the
probability that Mr. Voss scores higher than Mr. Cull on a randomly-selected Tuesday night?
28. Determine whether each random variable described below satisfies the conditions for a binomial setting,
a geometric setting, or neither. Support your conclusion in each case.

(a) Draw a card from a standard deck of 52 playing cards, observe the card, return the card to the
deck, and shuffle. Count the number of times you draw a card in this manner until you observe a
jack.

(b) Joey buys a Virginia lottery ticket every week. X is the number of times in a year that he wins a
prize.
29. When a computerized generator is used to generate random digits, the probability that any particular digit
in the set {0, 1, 2, . . . , 9} is generated on any individual trial is 1/10 = 0.1. Suppose that we are
generating digits one at a time and are interested in tracking occurrences of the digit 0.

(a) Determine the probability that the first 0 occurs as the fifth random digit generated.

(b) How many random digits would you expect to have to generate in order to observe the first 0?

(c) Let X = number of digits selected until first zero is encountered. Construct a probability
distribution histogram for X = 1 through X = 5.
30. A fair coin is flipped 20 times.

(a) Determine the probability that the coin comes up tails exactly 15 times.

(b) Let X = the number of tails in the 20 flips. Find the mean and standard deviation of X.

(c) Find the probability that X takes a value within 1 standard deviation of its mean.
31. The weights of Granny Smith apples from a large orchard are Normally distributed with a mean of
380 gm and a standard deviation of 28 gm.

(a) A single apple is selected at random from this orchard. What is the probability that it weighs
more than 400 gm?

(b) Three apples are selected at random from this orchard. What is the probability that their mean
weight is greater than 400 gm?

(c) Explain why the probabilities in (a) and (b) are not equal.


32. The service department of a large automobile dealership keeps records of the odometer readings of cars
that it repairs and determines that the distribution of miles driven per year by all of its customers has
a mean of 14,000 miles and a standard deviation of 4000. The distribution is skewed to the right.
Suppose a random sample of 12 cars is taken from the hundreds of cars for which they have records,
and x̄, the mean number of miles per year, is calculated.

(a) What is the mean of the sampling distribution of x̄?

(b) Is it possible to calculate the standard deviation of x̄? If it is, do the calculation. If it isn’t,
explain why.

(c) Do you know the approximate shape of the sampling distribution of x̄? If so, describe the shape
and justify your answer. If not, explain why not.
33. A four-sided die shaped like an asymmetrical tetrahedron has the following roll probabilities.
Number on Die    1     2     3     4
Probability     0.4   0.3   0.2   0.1

Let X = the result of a single roll.

(a) Find

(b) Find

(c) Describe in words and find its value.

(d) Find the smallest value A for which

(e) If T = the sum of two rolls, find

Below is a copy of the table from the first page, showing the probability distribution of X = the
number rolled on an asymmetrical four-sided die.
X 1 2 3 4
P(X) 0.4 0.3 0.2 0.1

(f) Find and interpret the mean and standard deviation of X.


34. For each description below, identify each underlined number as a parameter or statistic. Use
appropriate notation to describe each number, e.g., .

(a) A 1993 survey conducted by the Richmond Times-Dispatch one week before election day asked
voters which candidate for the state’s attorney general they would vote for. 37% of the respondents
said they would vote for the Democratic candidate. On election day, 41% actually voted for the
Democratic candidate.

(b) The National Center for Health Statistics reports that the mean systolic blood pressure for males
35 to 44 years of age is 128 and the standard deviation is 15. The medical director of a large
company looks at the medical records of 72 executives in this age group and finds that the mean
systolic blood pressure for these executives is 126.07.

35. A large pet store that specializes in tropical fish has several thousand guppies. The store claims that
the guppies have a mean length of 5 cm and a standard deviation of 0.5 cm. You come to the store
and buy 10 randomly-selected guppies and find that the mean length of your 10 guppies is 4.8 cm.
This makes you suspect that the mean fish length is not what the store says it is. To explore this
further, you assume that the length of guppies is Normally distributed and use a computer to
simulate 200 samples of 10 guppies from the store’s claimed population. Below is a dotplot of the
means from these 200 samples.

(a) What is the population in this situation, and what population parameters have we been given?
(b) The distribution of one sample is described in the opening paragraph. What information have
we been given about this sample?

(c) Is the dotplot above a sampling distribution? Explain.

(d) Do you think the store is being honest about the length of its guppies? Justify your answer.

36. According to a poll, 22% of high school students in the United Kingdom say that Dobby is their favorite
character in the Harry Potter books. Let’s assume this is the parameter value for the entire
population of high school students in the U.K. You take a sample of 150 high school students and
record the proportion, p̂, of individuals in your sample who say Dobby is their favorite character.

(a) What are the mean and standard deviation of the sampling distribution of p̂?

(b) What is the approximate shape of the sampling distribution? Justify your answer.

(c) Suppose our sample size was 36 instead of 150. Compare the shape, center, and spread of this
sampling distribution to the one in parts (a) and (b).

(d) A small town in the U.K. has only 600 high school students. What is the largest possible
sample you can take from this town and still be able to calculate the standard deviation of the
sampling distribution of p̂ using the method presented in the textbook? Explain.
37. Power companies severely trim trees growing near their lines to avoid power failures due to falling
limbs in storms. Applying a chemical to slow the growth of the trees is cheaper than trimming, but
the chemical kills some of the trees. Suppose that one such chemical would kill 20% of sycamore
trees. The power company tests the chemical on 250 sycamores. Consider these an SRS from the
population of all sycamore trees.

(a) What are the mean and standard deviation of the proportion of trees that are killed in samples of
250 trees?

(b) Calculate the probability that at least 24% of the trees in the sample are killed.
1st Semester Final Exam Review
Answer Section
1. ANS:
(a) High: 5.6%; Medium: 53.8%; Low: 40.6%. (b) See the table of conditional distributions and
segmented bar graph below:

(c) The proportion of private college graduates in the medium and low management levels is about the same
as the proportion of public college graduates in those categories. But a higher proportion of private
college graduates than of public college graduates are in the high management level.

PTS: 1 BNK: Quiz 1.1A
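The percentages in answer 1 can be checked with a short script. This is only a sketch: the counts come from the two-way table in question 1, while the variable names and the choice to round to one decimal place are assumptions made for illustration.

```python
# Counts from the two-way table in question 1 (rows are management levels).
counts = {
    "High":   {"Public": 75,  "Private": 107},
    "Medium": {"Public": 962, "Private": 794},
    "Low":    {"Public": 732, "Private": 595},
}

grand_total = sum(sum(row.values()) for row in counts.values())

# Marginal distribution of management level: row total / grand total.
marginal = {
    level: round(100 * sum(row.values()) / grand_total, 1)
    for level, row in counts.items()
}

# Conditional distribution of management level within each college type:
# cell count / column total.
column_totals = {
    college: sum(row[college] for row in counts.values())
    for college in ("Public", "Private")
}
conditional = {
    college: {
        level: round(100 * counts[level][college] / column_totals[college], 1)
        for level in counts
    }
    for college in column_totals
}

print("Marginal (%):", marginal)
print("Conditional (%):", conditional)
```

Comparing the two conditional distributions row by row is what supports the description of the relationship in the last part of the answer.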


2. ANS:

(a)

(b) The distribution is skewed to the right, with two peaks at 2 and 4 letters in length and a range of 11 – 2 = 9
letters. There are two possible outliers of 10 letters and 11 letters in length.

PTS: 1 BNK: Quiz 1.2A


3. ANS:
(a) 50th percentile = median, which is equal to the mean in a symmetric distribution, so it’s 100.
(b) Top 1% is equivalent to the 99th percentile, which corresponds to z = 2.33. This would be a score of
2.33(15) + 100 ≈ 135 or higher. (c) z = (110 – 100)/15 = 0.67. By Table A, this has 0.7486 or 74.86% of the scores below it.
(d) z = (80 – 100)/15 = –1.33, which by Table A has 0.0918 or 9.18% of the scores below it. Using the answer to part (c), the
percentage of z-scores between –1.33 and 0.67 is 0.7486 – 0.0918 = 0.6568 or 65.68%. (e) Q1 (the
25th percentile) for a Normal distribution (from Table A) is at approximately z = –0.67. By
symmetry, Q3 is at z = 0.67. These correspond to WISC scores of 100 ± 0.67(15), or about 90 and 110. Thus the interquartile range is
approximately 20 points.

PTS: 1 BNK: Quiz 2.2C
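The Table A lookups in answer 3 can be cross-checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library. Because it does not round z to two decimal places, its results differ slightly from the Table A values quoted in the answer (for example, about 0.7475 rather than 0.7486 for the proportion below 110). The variable names are illustrative only.

```python
from statistics import NormalDist

# WISC reference population: Normal with mean 100 and standard deviation 15.
wisc = NormalDist(mu=100, sigma=15)

median_score = wisc.inv_cdf(0.50)               # (a) 50th percentile
top_1_cutoff = wisc.inv_cdf(0.99)               # (b) score that marks the top 1%
below_110 = wisc.cdf(110)                       # (c) proportion below 110
between_80_110 = wisc.cdf(110) - wisc.cdf(80)   # (d) proportion between 80 and 110
iqr = wisc.inv_cdf(0.75) - wisc.inv_cdf(0.25)   # (e) interquartile range

print(f"median = {median_score:.1f}")
print(f"top 1% cutoff = {top_1_cutoff:.1f}")
print(f"P(X < 110) = {below_110:.4f}")
print(f"P(80 < X < 110) = {between_80_110:.4f}")
print(f"IQR = {iqr:.1f}")
```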


4. ANS:
Answers may vary. Students may choose to provide a dotplot, boxplot, histogram, or Normal
probability plot. See examples below. Since the Normal probability plot is roughly linear, the data
are approximately Normal. The boxplot shows only a slight skew to the left, and there is no strong
skew in the dotplot. None of the plots suggests a strong departure from Normality.
PTS: 1 BNK: Quiz 2.2C
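As a numeric supplement to the plots in answer 4, one common check (not part of the answer key itself) is to compare the data with the 68-95-99.7 rule. The sketch below counts the share of the 20 guesses within 1, 2, and 3 sample standard deviations of the mean. The helper name `percent_within` is hypothetical.

```python
from statistics import mean, stdev

# The 20 age guesses from question 4.
guesses = [44, 43, 48, 37, 44, 40, 33, 42, 43, 41,
           50, 49, 43, 46, 46, 45, 43, 38, 39, 41]

x_bar = mean(guesses)
s = stdev(guesses)  # sample standard deviation (divides by n - 1)

def percent_within(k):
    """Percent of guesses within k standard deviations of the mean."""
    low, high = x_bar - k * s, x_bar + k * s
    inside = sum(low <= g <= high for g in guesses)
    return 100 * inside / len(guesses)

within = {k: percent_within(k) for k in (1, 2, 3)}
print(f"mean = {x_bar:.2f}, s = {s:.2f}")
print(within)
```

Percentages close to 68, 95, and 99.7 are consistent with the conclusion that the guesses are approximately Normal.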
5. ANS:
The distribution of Atlantic coast acorn volumes is strongly skewed to the right. There is a peak around
1.0–2.0 cm³ and most Atlantic coast acorn volumes are between 0.3 cm³ and about 3.6 cm³. There are three
possible outliers at 8.1, 9.1, and 10.5 cm³. California acorn volumes seem to fall into two groups. Some are
similar in size to Atlantic acorns (0.4 to 2.6 cm³). Some are larger than all but the outliers in the Atlantic group
(4.1 cm³ to 7.1 cm³). There is also one strong outlier among the California acorns at 17.1 cm³. In general,
Atlantic coast acorns are smaller than California acorns (Atlantic median is 1.7 cm³, California median is 4.1
cm³).

PTS: 1 BNK: Quiz 1.2B


6. ANS:
(a) Park Street’s 15th highest revenue is higher than 36 – 15 = 21 weeks, so the percentile is
(21/36)*100 = 58.3. The z-score for $7500 is (7500 - 6611)/3580 = 0.25. Bridge Road’s 12th
highest revenue is higher than 40 – 12 = 28 weeks, so the percentile is (28/40)*100 = 70. The
z-score for $7100 is (7100 – 5989)/1794 = 0.62. The revenue for the given week was better for
Bridge Road, both in terms of relative position in comparison to other weeks (percentile) and
number of standard deviations above mean revenue (z-score). (b) Mean = 6611 – 3000 = 3611;
median = 6600 – 3000 = 3600; standard deviation and IQR remain unchanged by adding to each
value, so s = 3580 and IQR = 6075.

PTS: 1 BNK: Quiz 2.1C
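The arithmetic in answer 6 can be reproduced directly from the summary statistics in the computer output. This is a minimal sketch. The function names are hypothetical, and the percentile is taken, as in the answer, to be the percent of weeks with lower revenue.

```python
def percentile_from_rank(n, rank_from_top):
    """Percent of the n weeks with revenue below the week ranked rank_from_top."""
    return 100 * (n - rank_from_top) / n

def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd

# Part (a): Park Street (N = 36) and Bridge Road (N = 40).
park_pct = percentile_from_rank(36, 15)
park_z = z_score(7500, 6611, 3580)
bridge_pct = percentile_from_rank(40, 12)
bridge_z = z_score(7100, 5989, 1794)

# Part (b): subtracting a constant shifts center but leaves spread unchanged.
fixed_cost = 3000
profit_mean = 6611 - fixed_cost
profit_median = 6600 - fixed_cost
profit_sd = 3580                 # unchanged by subtraction
profit_iqr = 9675 - 3600         # Q3 - Q1, unchanged by subtraction

print(f"Park: percentile {park_pct:.1f}, z = {park_z:.2f}")
print(f"Bridge: percentile {bridge_pct:.1f}, z = {bridge_z:.2f}")
print(profit_mean, profit_median, profit_sd, profit_iqr)
```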


7. ANS:
(a) Individuals are the ten presidents. (b) Categorical variables: Political Party, State of Birth. Quantitative
variables: Age at Inauguration, Age at Death. (c) Clockwise from right side of pie, the blanks are: Virginia,
Massachusetts, New York and South Carolina (the last two in either order). (d) Because the vertical scale starts just
below 2 instead of starting at 0, the differences in the number of presidents are exaggerated, making it appear that
there were many more Democratic-Republicans.

PTS: 1 BNK: Quiz 1.1B


8. ANS:
(a) Use the median and interquartile range, since the distribution is skewed, there is a strong outlier, and these
measures are resistant to outliers. (b) Min. = 3, Q1 = 35, Med. = 47, Q3 = 65, Max. = 157. (c) 1.5 × IQR
= 1.5 × 30 = 45; 35 – 45 = –10 (no low outliers); 65 + 45 = 110, so 157 is an outlier (see boxplot below).
(d) Since the mean is not resistant to the strong outlier to the right, it will be higher than the median, which is
not influenced by the outlier.

PTS: 1 BNK: Quiz 1.3A
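The five-number summary and the 1.5 × IQR check in answer 8 can be verified with the sketch below. It assumes the quartile convention the answer appears to use (Q1 and Q3 are the medians of the lower and upper halves of the ordered data), which is not the default method of every software package. The variable names are illustrative.

```python
from statistics import mean, median

# Oil recovered from the 38 wells in question 8 (thousands of barrels).
oil = [3, 13, 15, 19, 21, 22, 25, 31, 33, 35,
       35, 35, 37, 37, 38, 43, 43, 45, 46, 48,
       49, 50, 50, 53, 56, 57, 59, 63, 65, 66,
       70, 70, 74, 80, 82, 92, 98, 157]

data = sorted(oil)
n = len(data)
half = n // 2

# Quartiles as medians of the lower and upper halves
# (the middle value is excluded from both halves when n is odd).
lower = data[:half]
upper = data[half:] if n % 2 == 0 else data[half + 1:]
q1, med, q3 = median(lower), median(data), median(upper)

iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print("five-number summary:", data[0], q1, med, q3, data[-1])
print("fences:", low_fence, high_fence, "outliers:", outliers)
print(f"mean = {mean(data):.2f}, median = {med}")
```

The last line also illustrates the final part of the answer: the single large value pulls the mean above the median.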


9. ANS:
(a) The median age in France (about 37) is higher than the median age in the Philippines (about 20).
There is also more variability in ages in France, since France’s IQR is about 56 – 18 = 38 and the
Philippines’ IQR is about 35 – 8 = 27. (b) The distribution for the Philippines is strongly skewed to
the right, because the curve is much steeper at low ages (0 – 30 years) than it is at high ages (50 – 90
years).

PTS: 1 BNK: Quiz 2.1C


10. ANS:
(a) Median is approximately 92 minutes. The interquartile range is approximately 110 – 80 = 30
minutes. (b) Mean = (94.1 – 15)/60 = 1.318 hours, standard deviation = 24.23/60 = 0.404 hours.
(c) Calculating z-scores is a linear transformation: subtract the mean (1.318 hours, in this case) from
each value and divide by the standard deviation (0.404 hours). That is, for each x,
z = (x – 1.318)/0.404. Using the ideas about transformations of variables, if the mean of x is 1.318,
the mean of z is (1.318 – 1.318)/0.404 = 0, and if the standard deviation of x is 0.404, the standard
deviation of z is 0.404/0.404 = 1.

PTS: 1 BNK: Quiz 2.1B
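The transformation rules used in answer 10 can be illustrated numerically. In the sketch below, the first part applies the rules to the reported mean and standard deviation. The second part uses a small made-up list of exam times (an assumption, since the original 62 times are not given) to show that z-scores always have mean 0 and standard deviation 1.

```python
from statistics import mean, stdev

# Part (b): subtract 15 minutes, then convert minutes to hours.
# Subtracting a constant shifts the mean only; dividing by 60 rescales both.
new_mean = (94.1 - 15) / 60
new_sd = 24.23 / 60
print(f"mean = {new_mean:.3f} hours, sd = {new_sd:.3f} hours")

# Part (c): z-scores of any data set have mean 0 and standard deviation 1.
# These times are hypothetical, chosen only to demonstrate the property.
hypothetical_times = [60, 75, 90, 105, 120]
m = mean(hypothetical_times)
s = stdev(hypothetical_times)
z_scores = [(t - m) / s for t in hypothetical_times]
print(f"mean of z = {mean(z_scores):.3f}, sd of z = {stdev(z_scores):.3f}")
```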


11. ANS:
Sampling only during workday hours meant that only people without regular daytime jobs were
available to answer the door—the poll suffered from undercoverage of people who were employed.
Since those who are not employed may be more likely to have time to volunteer, the poll probably
overestimated the proportion of potential volunteers. There is also potential response bias: a person
is likely to say he or she will volunteer to look like a good person.

PTS: 1 BNK: Quiz 4.1B


12. ANS:
A. As we randomly select more and more people, the proportion of left-handed people will get
closer and closer to 0.14. B. Assign 01 through 14 to left-handers and 15 through 00 to
right-handers. Choose 28 two-digit numbers from the random digits table and count the number of
left-handers in the group. Do this many, many times. C. The number of left-handers is equal to or
greater than 8 in only 4 of 100 simulated classes of 28 students. This suggests that the number of
lefties in Mr. Millar’s class is unusual.

PTS: 1 BNK: Quiz 5.1C
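The random-digit procedure in answer 12 can be mimicked in software. This sketch replaces the random digits table with Python's pseudorandom generator: each student is a number from 1 to 100 (standing in for the labels 01 through 00), and 1 through 14 count as left-handed. The seed, the number of repetitions, and the function name are arbitrary assumptions. An exact binomial tail probability is computed alongside for comparison.

```python
import random
from math import comb

P_LEFT = 0.14      # probability that a randomly chosen person is left-handed
CLASS_SIZE = 28
OBSERVED = 8       # left-handers in Mr. Millar's class

def simulate_class(rng):
    """Count left-handers in one simulated class of CLASS_SIZE students."""
    # Labels 1-14 represent left-handers, 15-100 right-handers.
    return sum(rng.randint(1, 100) <= 14 for _ in range(CLASS_SIZE))

rng = random.Random(2024)   # arbitrary seed so the run is repeatable
reps = 10_000
hits = sum(simulate_class(rng) >= OBSERVED for _ in range(reps))
simulated_tail = hits / reps

# Exact P(X >= 8) for a binomial count with n = 28 and p = 0.14.
exact_tail = sum(
    comb(CLASS_SIZE, k) * P_LEFT**k * (1 - P_LEFT)**(CLASS_SIZE - k)
    for k in range(OBSERVED, CLASS_SIZE + 1)
)

print(f"simulated P(X >= 8) = {simulated_tail:.3f}")
print(f"exact P(X >= 8)     = {exact_tail:.3f}")
```

Both values should land near the "4 of 100" result quoted in the answer, which is why 8 left-handers in a class of 28 is described as unusual.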


13. ANS:
A. Since the student’s question is, “Is there a relationship between foot length and height?” no
explanatory-response relationship is implied. B. See graph below. C. There appears to be a
weak positive relationship between height and foot length. Whether the relationship is linear or not
is difficult to determine with so few data points. A case could be made that the observation (18,
169) is an outlier. D. r = 0.733. Since r is positive and reasonably high (close to 1), there is
evidence of a positive relationship between foot length and height. E. Subtracting the same
amount from each y value will not change the correlation, nor would multiplying each height by a
constant to convert the heights into inches. F. Adding this point to the data would reinforce the
positive trend, thereby making the correlation closer to 1.
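
The claim in part E can be verified with a short calculation. The data below are hypothetical foot lengths and heights (not the values from the problem), and pearson_r is a hand-written helper for this sketch, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: foot length (cm) and height (cm).
foot = [22, 24, 25, 26, 28]
height = [160, 165, 172, 170, 180]

r_original = pearson_r(foot, height)
r_shifted = pearson_r(foot, [h - 100 for h in height])   # subtract a constant
r_inches = pearson_r(foot, [h / 2.54 for h in height])   # convert cm to inches
```

All three values of r agree, because r is built from standardized values and is unaffected by shifting or by rescaling with a positive constant.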

PTS: 1 BNK: Quiz 3.1C


14. ANS:
(a) Population is the 1100 students in the school. (b) Take a stratified random sample, randomly
selecting males and females in proportion to their relative abundance at the school. The principal
advantage is that there will be much less variation from sample to sample, since the proportion of
boys and girls in the sample is fixed—you can’t get a sample that is all, or nearly all, one sex or the
other. (c) Two possible solutions (others are possible): 1) You may have undercoverage if
students who don’t like the food don’t come to the cafeteria at all, which would mean you would
overestimate how much people like the food. 2) You might get response bias if the day you choose to
conduct your survey is one when something particularly good or bad is served. This could
overestimate or underestimate how much people like the food, depending on what is served that day.

PTS: 1 BNK: Quiz 4.1C


15. ANS:
Neither is correct. The probability of having a male child is not influenced by the genders of
previous offspring. Only in the long run can we expect the proportion of male children to approach
the expected probability (about 51%, as it turns out).

PTS: 1 BNK: Quiz 5.1C


16. ANS:
A. ŷ = predicted price, x = screen area. B. The least-squares regression line
is the line that minimizes the sum of the squared deviations between observed prices and prices
predicted by the linear model. C. 943 sq. in. is well beyond the range of screen areas used to
produce the regression line, so this would be extrapolation. We cannot be sure that the relationship
described by this line holds outside the range of available data.
D. The residual is the observed price minus the predicted price. Since the residual is negative, the
observed value is lower than the value predicted by the regression. This suggests that this particular
television is a good buy!
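
The ideas in parts B and D can be illustrated with a small calculation. This is a sketch under stated assumptions: the screen areas and prices are hypothetical (not the data behind the regression in the problem), and fit_line and sum_squared_residuals are helper names invented for the example.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared (observed - predicted) values for the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical screen areas (sq. in.) and prices (dollars).
areas = [300, 400, 500, 600, 700]
prices = [350, 500, 560, 720, 800]

a, b = fit_line(areas, prices)
# Residual = observed price - predicted price; negative means cheaper than predicted.
residuals = [y - (a + b * x) for x, y in zip(areas, prices)]
best = sum_squared_residuals(a, b, areas, prices)
```

Nudging the slope or intercept away from the fitted values can only increase the sum of squared residuals, which is what "least-squares" means.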

PTS: 1 BNK: Quiz 3.2C


17. ANS:
Randomly assign the 24 plants to two groups. (Answers should describe a method of
randomization, such as numbering the plants, writing the numbers on slips of paper, and drawing 12
slips from a hat.) The plants in one group will receive the new fertilizer while the plants in other
group will receive the old fertilizer. In all other ways (watering, sunlight, humidity, etc.), the
plants in the two groups should be treated identically, thus controlling for other variables that might
affect tomato weight. Weigh ripe tomatoes produced by plants, and compare the difference in mean
tomato weight for the two groups. Using 12 plants in each group addresses the fourth principle,
replication.
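
The slips-of-paper method is one way to randomize; a software version might look like the sketch below. The function name assign_groups and the seed are assumptions made for illustration only.

```python
import random

def assign_groups(n_units=24, group_size=12, seed=None):
    """Randomly split units numbered 1..n_units into two treatment groups."""
    rng = random.Random(seed)
    plants = list(range(1, n_units + 1))
    # Drawing 12 labels without replacement mimics drawing 12 slips from a hat.
    new_fertilizer = sorted(rng.sample(plants, group_size))
    old_fertilizer = [p for p in plants if p not in new_fertilizer]
    return new_fertilizer, old_fertilizer

new_group, old_group = assign_groups(seed=42)
```

Every plant ends up in exactly one group, and each possible set of 12 plants is equally likely to receive the new fertilizer.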

PTS: 1 BNK: Quiz 4.2B


18. ANS:
Experimental units: packages of cookies. Factors: Temperature and packaging. Treatments: Low
temp and plastic, high temp and plastic, low temp and waxed cardboard, high temp and waxed
cardboard.
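
The treatments are all combinations of the factor levels, which can be listed mechanically. A minimal sketch, with the level labels taken from the answer above:

```python
from itertools import product

temperatures = ["low temp", "high temp"]
packaging = ["plastic", "waxed cardboard"]

# Each treatment pairs one level of temperature with one level of packaging.
treatments = list(product(temperatures, packaging))
```

With two factors at two levels each, this gives 2 x 2 = 4 treatments.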

PTS: 1 BNK: Quiz 4.2C


19. ANS:
A. x = minutes of exercise, ŷ = predicted number of floors climbed. B. Since
there is no distinctive pattern in the residuals, the linear model is a good fit. C. When using the
least-squares regression line with x = minutes of exercise to predict y = number of floors climbed,
we will typically be off by about 2.3472 floors.

PTS: 1 BNK: Quiz 3.2C


20. ANS:
(a) Number the students from 1 to 3478. Use the command randInt(1,3478) to select students until
100 different students have been selected. (b) Number the students from 0001 to 3478. Go to the
random digits table and pick a starting point. Record four-digit numbers, skipping any that aren’t
between 0001 and 3478 and any repeated numbers, until you have 100 unique numbers between
0001 and 3478.
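
The procedure in part (a) can be mimicked in Python. This is only a sketch: rng.randint plays the role of the calculator's randInt(1,3478), and the function name simple_random_sample and the seed are assumptions for the example.

```python
import random

def simple_random_sample(population_size=3478, sample_size=100, seed=None):
    """Select distinct student numbers by repeated random draws, ignoring repeats."""
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < sample_size:
        # A repeated number is simply not added again, as in the answer above.
        chosen.add(rng.randint(1, population_size))
    return sorted(chosen)

sample = simple_random_sample(seed=7)
```

The loop keeps drawing until 100 different students have been selected, so repeated draws never shrink the sample.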

PTS: 1 BNK: Quiz 4.1C


21. ANS:
(a) Since packaging type is confounded with time, cause and effect cannot be inferred: we cannot
separate the effect of packaging from differences in sales from last month to this month. We can,
however, make inferences about the population of all stores in this city, since a random sample of
stores was used. (b) Random assignment within matched pairs → cause and effect can be inferred.
Random sampling of cars → can generalize to the population of all cars. (c) No random
assignment → cause and effect cannot be inferred. No random sampling → cannot generalize beyond
the subjects of the study.

PTS: 1 BNK: Quiz 4.3C


22. ANS:
See tree diagram for event names.
A.
B.
C. Tree Diagram at right

D.

PTS: 1 BNK: Quiz 5.3B


23. ANS:

PTS: 1 BNK: Quiz 5.3C


24. ANS:
A.

B.
C. No. From parts A. and B.,
D. No. The occurrence of one event does not preclude the occurrence of the other; it’s possible
that a subject preferred Twitter and is also in the 45 – 64 age group. That is,

E. 524 subjects did not prefer Twitter, so

PTS: 1 BNK: Quiz 5.3B


25. ANS:

PTS: 1 BNK: Quiz 6.2A


26. ANS:
A. S = {A, D, K, R, V}
B.
Outcome      A    D    K    R    V
Probability  3/8  1/8  1/8  2/8  1/8
C. V = {A}, P(V) = 3/8; F = {A, D, K}, P(F) = 5/8; V or F = {A, D, K}, P(V or F) = 5/8;
F^c = {R, V}, P(F^c) = 3/8; V and F = {A}, P(V and F) = 3/8; V given F = {A}, P(V|F) = 3/5.
D. No, since

E. No, since P(V and F)

PTS: 1 BNK: Quiz 5.3C


27. ANS:
Let D = difference in scores between Mr. Cull and Mr. Voss. Then

PTS: 1 BNK: Quiz 6.2A


28. ANS:
A. This is a geometric setting: We are counting the number of cards it takes to get our first jack.
(Since we are selecting a card with replacement, the trials are independent). B. This is a binomial
setting: Binary outcomes (Joey wins or loses), Independent trials (whether he wins this week does
not influence whether he wins next week), Number of trials is fixed at 52 (weeks), and the
probability of Success—however minuscule—does not change.

PTS: 1 BNK: Quiz 6.3C


29. ANS:
A. Geometric probability:
B. Geometric mean; .
C. Histogram below

PTS: 1 BNK: Quiz 6.3C


30. ANS:
A. Binomial distribution with n = 20 and p = 0.5.
B. ;
C.

PTS: 1 BNK: Quiz 6.3C


31. ANS:
A.

B..

C. The mean weight of a random sample of three apples is less variable than the weight of a single
randomly-selected apple, so we are less likely to get a mean weight that is 20 gm above the mean
when we take a sample of three apples.

PTS: 1 BNK: Quiz 7.3A


32. ANS:
A. miles (the same as the population mean). B. Yes. It seems reasonable to assume that
the sample of 12 is less than 10% of the entire population of customers’ cars. .

C. No. The population distribution is skewed, and n = 12, which is not large enough for the
central limit theorem to apply.

PTS: 1 BNK: Quiz 7.3B


33. ANS:
A. . B. C.
The probability of rolling a 3, given that the roll is 2 or greater; .
D. so A = 3. E. The event T = 4 can happen three ways:
{1, 3}, {3, 1}, and {2, 2}. Probabilities for these events are, respectively,
. Hence the total probability is 0.25.
F. ;
. is the expected mean roll if the
die is rolled many times, or the expected long-run value of a single roll. is the typical
distance each roll is from the mean roll.

PTS: 1 BNK: Quiz 6.1C


34. ANS:
A. is a statistic; is a parameter. B. is a parameter; is a parameter;
is a statistic.

PTS: 1 BNK: Quiz 7.1A


35. ANS:
A. The population is all the guppies in the pet store. We’ve been given the population mean
cm and the population standard deviation cm. B. The sample mean is cm and the
sample size is C. No, it’s merely an approximation of a sampling distribution generated by
simulating 200 sample means. The actual sampling distribution includes the means from all
possible samples of size 10 from the population—many more than 200 values. D. 21 out of 200,
or 10.5% of the sample means in our simulation are as far or farther below 5.0 as our sample was.
Our sample is not sufficiently unusual to arouse suspicions about the store’s claim.

PTS: 1 BNK: Quiz 7.1A


36. ANS:
A. μ_p̂ = 0.22; σ_p̂ = 0.034. B. Since np = (150)(0.22) = 33 ≥ 10 and n(1 – p) = 150(0.78) = 117 ≥ 10, the
distribution is approximately Normal. C. μ_p̂ would not change, σ_p̂ would be larger (0.069), and
the distribution would be non-Normal, since np = 36(0.22) = 7.92, which is less than 10.
D. The largest sample we can take is 60, otherwise the sample would be more than 10% of the
population, and sampling without replacement would require a finite population correction to
calculate standard deviation.

PTS: 1 BNK: Quiz 7.2B


37. ANS:
(a) μ_p̂ = 0.22; σ_p̂ = 0.034
(b) Since np = (150)(0.22) = 33 ≥ 10 and n(1 – p) = 150(0.78) = 117 ≥ 10, the distribution is
approximately Normal.
(c) μ_p̂ would not change, σ_p̂ would be larger (0.069), and the distribution would be non-Normal,
since np = 36(0.22) = 7.92, which is less than 10.
(d) The largest sample we can take is 60, otherwise the sample would be more than 10% of the
population, and sampling without replacement would require a finite population correction to
calculate standard deviation.
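
The numbers in parts (a) through (c) can be reproduced with a few lines of Python. The function name p_hat_summary is an assumption for this sketch; the formulas are the standard ones for the sampling distribution of a sample proportion.

```python
import math

def p_hat_summary(n, p):
    """Mean, standard deviation, and Large Counts check for a sample proportion."""
    mean = p
    sd = math.sqrt(p * (1 - p) / n)
    # Large Counts condition: both np and n(1 - p) must be at least 10.
    large_counts = n * p >= 10 and n * (1 - p) >= 10
    return mean, sd, large_counts

mean_150, sd_150, normal_150 = p_hat_summary(150, 0.22)
mean_36, sd_36, normal_36 = p_hat_summary(36, 0.22)
```

Rounded to three decimal places, the standard deviations match the 0.034 and 0.069 quoted above, and the Large Counts condition holds for n = 150 but fails for n = 36.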

PTS: 1 BNK: Quiz 7.2B
