Ken Black QA All Odd No Chapter Solution
Chapter No.   Chapter
1    Introduction to Statistics
3    Descriptive Statistics
5    Discrete Distributions
6    Continuous Distributions
7    Sampling and Sampling Distributions
8    Statistical Inference: Estimation for Single Populations
9    Statistical Inference: Hypothesis Testing for Single Populations
10   Statistical Inferences about Two Populations
11   Analysis of Variance and Design of Experiments
13   Nonparametric Statistics
14   Simple Regression Analysis
15   Multiple Regression Analysis
16   Building Multiple Regression Models
18   Statistical Quality Control
19   Decision Analysis
Chapter 1
Introduction to Statistics
LEARNING OBJECTIVES
The primary objective of chapter 1 is to introduce you to the world of statistics, enabling you to:
1. Define statistics.
2. Be aware of a wide range of applications of statistics in business.
3. Differentiate between descriptive and inferential statistics.
4. Classify numbers by level of data and understand why doing so is important.
CHAPTER TEACHING STRATEGY
In chapter 1 it is very important to motivate business students to study statistics by
presenting them with many applications of statistics in business. The definition of statistics as a
science dealing with the collection, analysis, interpretation, and presentation of numerical data
is a very good place to start. Statistics is about dealing with data. Data are found in all areas of
business. This is a time to have the students brainstorm on the wide variety of places in
business where data are measured and gathered. It is important to define statistics for students
because they bring so many preconceptions of the meaning of the term. For this reason, several
perceptions of the word statistics are given in the chapter.
Chapter 1 sets up the paradigm of inferential statistics. The student will understand
that while there are many useful applications of descriptive statistics in business, the strength of
the application of statistics in the field of business is through inferential statistics. From this
notion, we will later introduce probability, sampling, confidence intervals, and hypothesis
testing. The process involves taking a sample from the population, computing a statistic on the
sample data, and making an inference (decision or conclusion) back to the population from
which the sample has been drawn.
In chapter 1, levels of data measurement are emphasized. Too many texts present data
to the students with no comment or discussion of how the data were gathered or the level of
data measurement. In chapter 7, there is a discussion of sampling techniques. However, in this
chapter, four levels of data are discussed. It is important for students to understand that the
statistician is often given data to analyze without input as to how it was gathered or the type of
measurement. It is incumbent upon statisticians and researchers to ascertain the level of
measurement that the data represent so that appropriate techniques can be used in analysis.
Not all techniques presented in this text can be appropriately used to analyze all data.
CHAPTER OUTLINE
1.1 Statistics in Business
Marketing
Management
Finance
Economics
Management Information Systems
1.2 Basic Statistical Concepts
1.3 Data Measurement
Nominal Level
Ordinal Level
Interval Level
Ratio Level
Comparison of the Four Levels of Data
Statistical Analysis Using the Computer: Excel and MINITAB
KEY TERMS
Census Ordinal Level Data
Descriptive Statistics Parameter
Inferential Statistics Parametric Statistics
Interval Level Data Population
Metric Data Ratio Level Data
Nominal Level Data Sample
Non-metric Data Statistic
Nonparametric Statistics Statistics
SOLUTIONS TO PROBLEMS IN CHAPTER 1
1.1 Examples of data in functional areas:
accounting - cost of goods, salary expense, depreciation, utility costs, taxes, equipment
inventory, etc.
finance - World bank bond rates, number of failed savings and loans, measured risk of
common stocks, stock dividends, foreign exchange rate, liquidity rates for a
single-family, etc.
human resources - salaries, size of engineering staff, years of experience, age of
employees, years of education, etc.
marketing - number of units sold, dollar sales volume, forecast sales, size of sales force,
market share, measurement of consumer motivation, measurement of consumer
frustration, measurement of brand preference, attitude measurement, measurement of
consumer risk, etc.
information systems - CPU time, size of memory, number of work stations, storage
capacity, percent of professionals who are connected to a computer network, dollar
assets of company computing, number of hits on the Internet, time spent on the
Internet per day, percentage of people who use the Internet, retail dollars spent in e-
commerce, etc.
production - number of production runs per day, weight of a product, assembly time,
number of defects per run, temperature in the plant, amount of inventory, turnaround
time, etc.
management - measurement of union participation, measurement of employer support,
measurement of tendency to control, number of subordinates reporting to a manager,
measurement of leadership style, etc.
1.2 Examples of data in business industries:
manufacturing - size of punched hole, number of rejects, amount of inventory, amount
of production, number of production workers, etc.
insurance - number of claims per month, average amount of life insurance per family
head, life expectancy, cost of repairs for major auto collision, average medical costs
incurred for a single female over 45 years of age, etc.
travel - cost of airfare, number of miles traveled for ground transported vacations,
number of nights away from home, size of traveling party, amount spent per day
besides lodging, etc.
retailing - inventory turnover ratio, sales volume, size of sales force, number of
competitors within 2 miles of retail outlet, area of store, number of sales people, etc.
communications - cost per minute, number of phones per office, miles of cable per
customer headquarters, minutes per day of long distance usage, number of operators,
time between calls, etc.
computing - age of company hardware, cost of software, number of CAD/CAM stations,
age of computer operators, measure to evaluate competing software packages, size of
data base, etc.
agriculture - number of farms per county, farm income, number of acres of corn per
farm, wholesale price of a gallon of milk, number of livestock, grain storage capacity,
etc.
banking - size of deposit, number of failed banks, amount loaned to foreign banks,
number of tellers per drive-in facility, average amount of withdrawal from automatic
teller machine, federal reserve discount rate, etc.
healthcare - number of patients per physician per day, average cost of hospital stay,
average daily census of hospital, time spent waiting to see a physician, patient
satisfaction, number of blood tests done per week.
1.3 Descriptive statistics in recorded music industry -
1) RCA total sales of compact discs this week, number of artists under contract to a
company at a given time.
2) Total dollars spent on advertising last month to promote an album.
3) Number of units produced in a day.
4) Number of retail outlets selling the company's products.
Inferential statistics in recorded music industry -
1) Measure the amount spent per month on recorded music for a few consumers
then use that figure to infer the amount for the population.
2) Determination of market share for rap music by randomly selecting a sample of
500 purchasers of recorded music.
3) Determination of top ten single records by sampling the number of requests at a
few radio stations.
4) Estimation of the average length of a single recording by taking a sample of
records and measuring them.
The difference between descriptive and inferential statistics lies mainly in the usage of
the data. These descriptive examples all gather data from every item in the population
about which the description is being made. For example, RCA measures the sales on all
its compact discs for a week and reports the total.
In each of the inferential statistics examples, a sample of the population is taken and the
population value is estimated or inferred from the sample. For example, it may be
practically impossible to determine the proportion of buyers who prefer rap music.
However, a random sample of buyers can be contacted and interviewed for music
preference. The results can be inferred to population market share.
1.4 Descriptive statistics in manufacturing batteries to make better decisions -
1) Total number of worker hours per plant per week - help management
understand labor costs, work allocation, productivity, etc.
2) Company sales volume of batteries in a year - help management decide if the
product is profitable, how much to advertise in the coming year, compare to
costs to determine profitability.
3) Total amount of sulfuric acid purchased per month for use in battery
production - can be used by management to study wasted inventory, scrap, etc.
Inferential Statistics in manufacturing batteries to make decisions -
1) Take a sample of batteries and test them to determine the average shelf life -
use the sample average to reach conclusions about all batteries of this type.
Management can then make labeling and advertising claims. They can compare
these figures to the shelf-life of competing batteries.
2) Take a sample of battery consumers and determine how many batteries they
purchase per year. Infer to the entire population - management can use this
information to estimate market potential and penetration.
3) Interview a random sample of production workers to determine attitude
towards company management - management can use this survey result to
ascertain employee morale and to direct efforts towards creating a more
positive working environment which, hopefully, results in greater productivity.
1.5 a) ratio
b) ratio
c) ordinal
d) nominal
e) ratio
f) ratio
g) nominal
h) ratio
1.6 a) ordinal
b) ratio
c) nominal
d) ratio
e) interval
f) interval
g) nominal
h) ordinal
1.7 a) The population for this study is the 900 electric contractors who purchased
Rathburn wire.
b) The sample is the randomly chosen group of thirty-five contractors.
c) The statistic is the average satisfaction score for the sample of thirty-five
contractors.
d) The parameter is the average satisfaction score for all 900 electric contractors in
the population.
Chapter 3
Descriptive Statistics
LEARNING OBJECTIVES
The focus of Chapter 3 is on the use of statistical techniques to describe data, thereby enabling
you to:
1. Distinguish between measures of central tendency, measures of variability, and
measures of shape.
2. Understand conceptually the meanings of mean, median, mode, quartile, percentile,
and range.
3. Compute mean, median, mode, percentile, quartile, range, variance, standard deviation,
and mean absolute deviation on ungrouped data.
4. Differentiate between sample and population variance and standard deviation.
5. Understand the meaning of standard deviation as it is applied using the empirical rule
and Chebyshev's theorem.
6. Compute the mean, median, standard deviation, and variance on grouped data.
7. Understand box and whisker plots, skewness, and kurtosis.
8. Compute a coefficient of correlation and interpret it.
CHAPTER TEACHING STRATEGY
In chapter 2, the students learned how to summarize data by constructing frequency
distributions (grouping data) and by using graphical depictions. Much of the time, statisticians
need to describe data by using single numerical measures. Chapter 3 presents a cadre of
statistical measures for describing numerically sets of data.
It can be emphasized in this chapter that there are at least two major dimensions along
which data can be described. One is the measure of central tendency with which statisticians
attempt to describe the more central portions of the data. Included here are the mean, median,
mode, percentiles, and quartiles. It is important to establish that the median is a useful device
for reporting some business data, such as income and housing costs, because it tends to ignore
the extremes. On the other hand, the mean utilizes every number of a data set in its
computation. This makes the mean an attractive tool in statistical analysis.
A second major group of descriptive statistical techniques are the measures of
variability. Students can understand that a measure of central tendency is often not enough to
fully describe data, often giving information only about the center of the distribution or key
milestones of the distribution. A measure of variability helps the researcher get a handle on the
spread of the data. An attempt is made in this text to communicate to the student that through
the use of the empirical rule and/or Chebyshev's Theorem, students can better understand the
meaning of a standard deviation. The empirical rule will be referred to quite often throughout
the course; and therefore, it is important to emphasize it as a rule of thumb. For example, in
discussing control charts in chapter 18, the upper and lower control limits are established by
using the range of ± 3 standard deviations of the statistic as limits within which 99.7% of the
data values should fall if a process is in control.
In this section of chapter 3, z scores are presented mainly to bridge the gap between the
discussion of means and standard deviations in chapter 3 and the normal curve of chapter 6.
One application of the standard deviation in business is the use of it as a measure of risk in the
financial world. For example, in tracking the price of a stock over a period of time, a financial
analyst might determine that the larger the standard deviation, the greater the risk (because of
swings in the price). However, because the size of a standard deviation is a function of the
mean and a coefficient of variation conveys the size of a standard deviation relative to its mean,
other financial researchers prefer the coefficient of variation as a measure of the risk. That is, it
can be argued that a coefficient of variation takes into account the size of the mean (in the case
of a stock, the investment) in determining the amount of risk as measured by a standard
deviation.
It should be emphasized that the calculation of measures of central tendency and
variability for grouped data is different than for ungrouped or raw data. While the principles are
the same for the two types of data, implementation of the formulas is different. Computations
of statistics from grouped data are based on class midpoints rather than raw values; and for this
reason, students should be cautioned that group statistics are often just approximations.
Measures of shape are useful in helping the researcher describe a distribution of data.
The Pearsonian coefficient of skewness is a handy tool for ascertaining the degree of skewness
in the distribution. Box and Whisker plots can be used to determine the presence of skewness in
a distribution and to locate outliers. The coefficient of correlation is introduced here instead of
chapter 14 (regression chapter) so that the student can begin to think about two-variable
relationships and analyses and view a correlation coefficient as a descriptive statistic. In
addition, when the student studies simple regression in chapter 14, there will be a foundation
upon which to build. All in all, chapter 3 is quite important because it presents some of the
building blocks for many of the later chapters.
CHAPTER OUTLINE
3.1 Measures of Central Tendency: Ungrouped Data
Mode
Median
Mean
Percentiles
Quartiles
3.2 Measures of Variability - Ungrouped Data
Range
Interquartile Range
Mean Absolute Deviation, Variance, and Standard Deviation
Mean Absolute Deviation
Variance
Standard Deviation
Meaning of Standard Deviation
Empirical Rule
Chebyshev's Theorem
Population Versus Sample Variance and Standard Deviation
Computational Formulas for Variance and Standard Deviation
Z Scores
Coefficient of Variation
3.3 Measures of Central Tendency and Variability - Grouped Data
Measures of Central Tendency
Mean
Mode
Measures of Variability
3.4 Measures of Shape
Skewness
Skewness and the Relationship of the Mean, Median, and Mode
Coefficient of Skewness
Kurtosis
Box and Whisker Plot
3.5 Measures of Association
Correlation
3.6 Descriptive Statistics on the Computer
KEY TERMS
Arithmetic Mean Measures of Shape
Bimodal Measures of Variability
Box and Whisker Plot Median
Chebyshev's Theorem Mesokurtic
Coefficient of Correlation (r) Mode
Coefficient of Skewness Multimodal
Coefficient of Variation (CV) Percentiles
Correlation Platykurtic
Deviation from the Mean Quartiles
Empirical Rule Range
Interquartile Range Skewness
Kurtosis Standard Deviation
Leptokurtic Sum of Squares of x
Mean Absolute Deviation (MAD) Variance
Measures of Central Tendency z Score
SOLUTIONS TO PROBLEMS IN CHAPTER 3
3.1 Mode
2, 2, 3, 3, 4, 4, 4, 4, 5, 6, 7, 8, 8, 8, 9
The mode = 4
4 is the most frequently occurring value
3.2 Median for values in 3.1
Arrange in ascending order:
2, 2, 3, 3, 4, 4, 4, 4, 5, 6, 7, 8, 8, 8, 9
There are 15 terms.
Since there are an odd number of terms, the median is the middle number.
The median = 4
Using the formula, the median is located at the (n + 1)/2 th term:

(15 + 1)/2 = 8th term

The 8th term = 4
3.3 Median
Arrange terms in ascending order:
73, 167, 199, 213, 243, 345, 444, 524, 609, 682
There are 10 terms.
Since there are an even number of terms, the median is the average of the
middle two terms:
Median = (243 + 345)/2 = 588/2 = 294

Using the formula, the median is located at the (n + 1)/2 th term.

n = 10, therefore (10 + 1)/2 = 11/2 = 5.5th term.

The median is located halfway between the 5th and 6th terms.

5th term = 243    6th term = 345

Halfway between 243 and 345 is the median = 294
3.4 Mean

    17.3
    44.5
    31.6
    40.0
    52.8
    38.8
    30.1
    78.5
Σx = 333.6

μ = Σx/N = 333.6/8 = 41.7

x̄ = Σx/n = 333.6/8 = 41.7

(It is not stated in the problem whether the data represent a population or a sample.)
3.5 Mean

     7
    -2
     5
     9
     0
    -3
    -6
    -7
    -4
    -5
     2
    -8
Σx = -12

μ = Σx/N = -12/12 = -1

x̄ = Σx/n = -12/12 = -1

(It is not stated in the problem whether the data represent a population or a sample.)
3.6 Rearranging the data into ascending order:
11, 13, 16, 17, 18, 19, 20, 25, 27, 28, 29, 30, 32, 33, 34
P35: i = (35/100)(15) = 5.25

P35 is located at the 5 + 1 = 6th term, P35 = 19

P55: i = (55/100)(15) = 8.25

P55 is located at the 8 + 1 = 9th term, P55 = 27

Q1 = P25, but i = (25/100)(15) = 3.75

Q1 = P25 is located at the 3 + 1 = 4th term, Q1 = 17

Q2 = Median. The median is located at the (15 + 1)/2 = 8th term.

Q2 = 25

Q3 = P75, but i = (75/100)(15) = 11.25

Q3 = P75 is located at the 11 + 1 = 12th term, Q3 = 30
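The percentile-location rule used in problems 3.6 through 3.10 can be sketched in Python. This is an illustrative sketch only; the function name is ours, not the text's. The rule: compute i = (P/100)·n, then if i is a whole number average the ith and (i+1)th ordered values, otherwise move up to the next whole-numbered term.

```python
def percentile_location(data, p):
    """Locate the Pth percentile with the textbook rule:
    i = (P/100)*n; if i is a whole number, average the i-th and
    (i+1)-th ordered values, otherwise take the next whole term."""
    x = sorted(data)
    n = len(x)
    i = (p / 100) * n
    if i == int(i):                  # whole number: average two terms
        i = int(i)
        return (x[i - 1] + x[i]) / 2
    return x[int(i)]                 # next whole-numbered term (1-based)

data = [11, 13, 16, 17, 18, 19, 20, 25, 27, 28, 29, 30, 32, 33, 34]
p35 = percentile_location(data, 35)   # i = 5.25 -> 6th term = 19
q1 = percentile_location(data, 25)    # i = 3.75 -> 4th term = 17
q3 = percentile_location(data, 75)    # i = 11.25 -> 12th term = 30
```

Note that this convention differs from the interpolation methods used by many software packages, so computer output may not match these hand answers exactly.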
3.7 Rearranging the data in ascending order:
80, 94, 97, 105, 107, 112, 116, 116, 118, 119, 120, 127,
128, 138, 138, 139, 142, 143, 144, 145, 150, 162, 171, 172
n = 24
For P20: i = (20/100)(24) = 4.8

Thus, P20 is located at the 4 + 1 = 5th term and P20 = 107

For P47: i = (47/100)(24) = 11.28

Thus, P47 is located at the 11 + 1 = 12th term and P47 = 127

For P83: i = (83/100)(24) = 19.92

Thus, P83 is located at the 19 + 1 = 20th term and P83 = 145

Q1 = P25. For P25: i = (25/100)(24) = 6

Thus, Q1 is located at the 6.5th term and Q1 = (112 + 116)/2 = 114

Q2 = Median. The median is located at the (24 + 1)/2 = 12.5th term.

Thus, Q2 = (127 + 128)/2 = 127.5

Q3 = P75. For P75: i = (75/100)(24) = 18

Thus, Q3 is located at the 18.5th term and Q3 = (143 + 144)/2 = 143.5
3.8 Mean = Σx/N = 18,245/15 = 1216.33

The mean is 1216.33.

The median is located at the (15 + 1)/2 = 8th term.

Median = 1,233

Q2 = Median = 1,233

For P63: i = (63/100)(15) = 9.45

P63 is located at the 9 + 1 = 10th term, P63 = 1,277

For P29: i = (29/100)(15) = 4.35

P29 is located at the 4 + 1 = 5th term, P29 = 1,119
3.9 The median is located at the (12 + 1)/2 = 6.5th position.

The median = (3.41 + 4.63)/2 = 4.02

For Q3 = P75: i = (75/100)(12) = 9

P75 is located halfway between the 9th and 10th terms.

Q3 = P75 = (5.70 + 7.88)/2 = 6.79

For P20: i = (20/100)(12) = 2.4

P20 is located at the 3rd term, P20 = 2.12

For P60: i = (60/100)(12) = 7.2

P60 is located at the 8th term, P60 = 5.10

For P80: i = (80/100)(12) = 9.6

P80 is located at the 10th term, P80 = 7.88

For P93: i = (93/100)(12) = 11.16

P93 is located at the 12th term, P93 = 8.97
3.10 n = 17; Mean = Σx/N = 61/17 = 3.588

The mean is 3.588.

The median is located at the (17 + 1)/2 = 9th term, Median = 4

There are eight 4s, therefore the Mode = 4

Q3 = P75: i = (75/100)(17) = 12.75

Q3 is located at the 13th term and Q3 = 4

P11: i = (11/100)(17) = 1.87

P11 is located at the 2nd term and P11 = 1

P35: i = (35/100)(17) = 5.95

P35 is located at the 6th term and P35 = 3

P58: i = (58/100)(17) = 9.86

P58 is located at the 10th term and P58 = 4

P67: i = (67/100)(17) = 11.39

P67 is located at the 12th term and P67 = 4
3.11    x       |x - μ|                 (x - μ)²
        6    6 - 4.2857 = 1.7143         2.9388
        2       2.2857                   5.2244
        4       0.2857                   0.0816
        9       4.7143                  22.2246
        1       3.2857                  10.7958
        3       1.2857                   1.6530
        5       0.7143                   0.5102
   Σx = 30   Σ|x - μ| = 14.2857     Σ(x - μ)² = 43.4284

μ = Σx/N = 30/7 = 4.2857

a.) Range = 9 - 1 = 8

b.) M.A.D. = Σ|x - μ|/N = 14.2857/7 = 2.0408

c.) σ² = Σ(x - μ)²/N = 43.4284/7 = 6.2041

d.) σ = √(Σ(x - μ)²/N) = √6.2041 = 2.4908

e.) Arranging the data in order: 1, 2, 3, 4, 5, 6, 9

Q1 = P25: i = (25/100)(7) = 1.75

Q1 is located at the 2nd term, Q1 = 2

Q3 = P75: i = (75/100)(7) = 5.25

Q3 is located at the 6th term, Q3 = 6

IQR = Q3 - Q1 = 6 - 2 = 4

f.) z = (6 - 4.2857)/2.4908 = 0.69
    z = (2 - 4.2857)/2.4908 = -0.92
    z = (4 - 4.2857)/2.4908 = -0.11
    z = (9 - 4.2857)/2.4908 = 1.89
    z = (1 - 4.2857)/2.4908 = -1.32
    z = (3 - 4.2857)/2.4908 = -0.52
    z = (5 - 4.2857)/2.4908 = 0.29
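The population calculations in 3.11 can be reproduced with a short Python sketch (illustrative only; the function name is ours). Note the divide-by-N population formulas throughout.

```python
def population_stats(x):
    """Range, MAD, variance, and standard deviation with the
    population (divide-by-N) formulas used in problem 3.11."""
    n = len(x)
    mu = sum(x) / n
    mad = sum(abs(v - mu) for v in x) / n     # mean absolute deviation
    var = sum((v - mu) ** 2 for v in x) / n   # population variance
    return max(x) - min(x), mad, var, var ** 0.5

data = [6, 2, 4, 9, 1, 3, 5]
rng, mad, var, sd = population_stats(data)
mu = sum(data) / len(data)
z = [round((v - mu) / sd, 2) for v in data]   # part (f) z-scores
```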
3.12    x    |x - x̄|    (x - x̄)²
        4       0           0
        3       1           1
        0       4          16
        5       1           1
        2       2           4
        9       5          25
        4       0           0
        5       1           1
   Σx = 32   Σ|x - x̄| = 14   Σ(x - x̄)² = 48

x̄ = Σx/n = 32/8 = 4

a) Range = 9 - 0 = 9

b) M.A.D. = Σ|x - x̄|/n = 14/8 = 1.75

c) s² = Σ(x - x̄)²/(n - 1) = 48/7 = 6.8571

d) s = √(Σ(x - x̄)²/(n - 1)) = √6.8571 = 2.6186

e) Numbers in order: 0, 2, 3, 4, 4, 5, 5, 9

Q1 = P25: i = (25/100)(8) = 2

Q1 is located at the average of the 2nd and 3rd terms, Q1 = 2.5

Q3 = P75: i = (75/100)(8) = 6

Q3 is located at the average of the 6th and 7th terms, Q3 = 5

IQR = Q3 - Q1 = 5 - 2.5 = 2.5
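The contrast between 3.11 (population, divide by N) and 3.12 (sample, divide by n - 1) can be made explicit in one function (an illustrative sketch; the name and the `sample` flag are ours):

```python
def variance(x, sample=True):
    """Variance with the n - 1 divisor for sample data (as in 3.12)
    or the N divisor for population data (as in 3.11)."""
    n = len(x)
    m = sum(x) / n
    ss = sum((v - m) ** 2 for v in x)   # sum of squared deviations
    return ss / (n - 1) if sample else ss / n

data = [4, 3, 0, 5, 2, 9, 4, 5]
s2 = variance(data)                     # sample: 48/7
sigma2 = variance(data, sample=False)   # population: 48/8
```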
3.13 a.)
    x        (x - μ)               (x - μ)²
   12    12 - 21.167 = -9.167       84.034
   23       1.833                    3.360
   19      -2.167                    4.696
   26       4.833                   23.358
   24       2.833                    8.026
   23       1.833                    3.360
Σx = 127   Σ(x - μ) = -0.002    Σ(x - μ)² = 126.834

μ = Σx/N = 127/6 = 21.167

σ = √(Σ(x - μ)²/N) = √(126.834/6) = √21.139 = 4.598    ORIGINAL FORMULA

b.)
    x     x²
   12    144
   23    529
   19    361
   26    676
   24    576
   23    529
Σx = 127   Σx² = 2815

σ = √[(Σx² - (Σx)²/N)/N] = √[(2815 - (127)²/6)/6] = √[(2815 - 2688.17)/6]
  = √(126.83/6) = √21.138 = 4.598    SHORT-CUT FORMULA

The short-cut formula is faster, but the original formula gives insight into the meaning of a standard deviation.
3.14 Σx = 1387    Σx² = 87,365    n = 25

x̄ = 55.48

s² = 433.9267

s = 20.8309
3.15 Σx = 6886    Σx² = 3,901,664    n = 16

μ = 430.375

σ² = 58,631.295

σ = 242.139
3.16
14, 15, 18, 19, 23, 24, 25, 27, 35, 37, 38, 39, 39, 40, 44,
46, 58, 59, 59, 70, 71, 73, 82, 84, 90

Q1 = P25: i = (25/100)(25) = 6.25

P25 is located at the 7th term, and therefore, Q1 = 25

Q3 = P75: i = (75/100)(25) = 18.75

P75 is located at the 19th term, and therefore, Q3 = 59

IQR = Q3 - Q1 = 59 - 25 = 34
3.17 a) 1 - 1/2² = 1 - 1/4 = 3/4 = .75

b) 1 - 1/(2.5)² = 1 - 1/6.25 = .84

c) 1 - 1/(1.6)² = 1 - 1/2.56 = .609

d) 1 - 1/(3.2)² = 1 - 1/10.24 = .902
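Chebyshev's theorem, applied four times above, reduces to a one-line function (a minimal sketch; the function name is ours):

```python
def chebyshev_min_proportion(k):
    """Chebyshev's theorem: for any distribution, at least 1 - 1/k²
    of the values lie within k standard deviations of the mean (k > 1)."""
    return 1 - 1 / k ** 2

# parts (a)-(d) of problem 3.17
props = [round(chebyshev_min_proportion(k), 3) for k in (2, 2.5, 1.6, 3.2)]
```

Unlike the empirical rule, this bound requires no assumption about the shape of the distribution.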
3.18
Set 1:

μ1 = Σx/N = 262/4 = 65.5

σ1 = √[(Σx² - (Σx)²/N)/N] = √[(17,970 - (262)²/4)/4] = 14.2215

Set 2:

μ2 = Σx/N = 570/4 = 142.5

σ2 = √[(Σx² - (Σx)²/N)/N] = √[(82,070 - (570)²/4)/4] = 14.5344

CV1 = (14.2215/65.5)(100) = 21.71%

CV2 = (14.5344/142.5)(100) = 10.20%
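The coefficient-of-variation comparison in 3.18 works entirely from the totals Σx and Σx², which a short sketch can mirror (illustrative only; the function name is ours):

```python
def cv_from_sums(sum_x, sum_x2, n):
    """Coefficient of variation (%) from the totals, using the
    computational formula sigma = sqrt((Σx² - (Σx)²/N)/N) as in 3.18."""
    mu = sum_x / n
    sigma = ((sum_x2 - sum_x ** 2 / n) / n) ** 0.5
    return sigma / mu * 100

cv1 = cv_from_sums(262, 17_970, 4)   # set 1: sigma ≈ 14.22, mu = 65.5
cv2 = cv_from_sums(570, 82_070, 4)   # set 2: sigma ≈ 14.53, mu = 142.5
```

Although the two standard deviations are nearly equal, set 2 is far less variable relative to its mean, which is exactly what the CV captures.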
3.19    x    |x - x̄|    (x - x̄)²
        7     1.833       3.361
        5     3.833      14.694
       10     1.167       1.361
       12     3.167      10.028
        9     0.167       0.028
        8     0.833       0.694
       14     5.167      26.694
        3     5.833      34.028
       11     2.167       4.694
       13     4.167      17.361
        8     0.833       0.694
        6     2.833       8.028
      106    32.000     121.665

x̄ = Σx/n = 106/12 = 8.833

a) MAD = Σ|x - x̄|/n = 32/12 = 2.667

b) s² = Σ(x - x̄)²/(n - 1) = 121.665/11 = 11.06

c) s = √11.06 = 3.326

d) Rearranging terms in order: 3 5 6 7 8 8 9 10 11 12 13 14

Q1 = P25: i = (.25)(12) = 3

Q1 = the average of the 3rd and 4th terms: Q1 = (6 + 7)/2 = 6.5

Q3 = P75: i = (.75)(12) = 9

Q3 = the average of the 9th and 10th terms: Q3 = (11 + 12)/2 = 11.5

IQR = Q3 - Q1 = 11.5 - 6.5 = 5

e.) z = (6 - 8.833)/3.326 = -0.85

f.) CV = (3.326)(100)/8.833 = 37.65%
3.20 n = 11

      x     |x - μ|
    768     475.64
    429     136.64
    323      30.64
    306      13.64
    286       6.36
    262      30.36
    215      77.36
    172     120.36
    162     130.36
    148     144.36
    145     147.36
Σx = 3216    Σ|x - μ| = 1313.08

μ = 292.36    Σx = 3216    Σx² = 1,267,252

a.) Range = 768 - 145 = 623

b.) MAD = Σ|x - μ|/N = 1313.08/11 = 119.37

c.) σ² = [Σx² - (Σx)²/N]/N = [1,267,252 - (3216)²/11]/11 = 29,728.23

d.) σ = √29,728.23 = 172.42

e.) Q1 = P25: i = .25(11) = 2.75

Q1 is located at the 3rd term and Q1 = 162

Q3 = P75: i = .75(11) = 8.25

Q3 is located at the 9th term and Q3 = 323

IQR = Q3 - Q1 = 323 - 162 = 161

f.) x(Nestlé) = 172

z = (x - μ)/σ = (172 - 292.36)/172.42 = -0.70

g.) CV = (σ/μ)(100) = (172.42/292.36)(100) = 58.98%
3.21 μ = 125    σ = 12

68% of the values fall within:

μ ± 1σ = 125 ± 1(12) = 125 ± 12

between 113 and 137

95% of the values fall within:

μ ± 2σ = 125 ± 2(12) = 125 ± 24

between 101 and 149

99.7% of the values fall within:

μ ± 3σ = 125 ± 3(12) = 125 ± 36

between 89 and 161
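The three empirical-rule intervals above follow mechanically from μ and σ, as a tiny sketch shows (illustrative; the function name is ours):

```python
def empirical_rule(mu, sigma):
    """Empirical rule for roughly bell-shaped data: about 68%, 95%,
    and 99.7% of values fall within 1, 2, and 3 sigma of the mean."""
    return {k: (mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}

intervals = empirical_rule(125, 12)   # problem 3.21
```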
3.22 μ = 38    σ = 6

between 26 and 50:

x1 - μ = 50 - 38 = 12    x2 - μ = 26 - 38 = -12

(x1 - μ)/σ = 12/6 = 2    (x2 - μ)/σ = -12/6 = -2

k = 2, and since the distribution is not normal, use Chebyshev's theorem:

1 - 1/k² = 1 - 1/2² = 1 - 1/4 = 3/4 = .75

at least 75% of the values will fall between 26 and 50

between 14 and 62?    μ = 38    σ = 6

x1 - μ = 62 - 38 = 24    x2 - μ = 14 - 38 = -24

(x1 - μ)/σ = 24/6 = 4    (x2 - μ)/σ = -24/6 = -4

k = 4

1 - 1/k² = 1 - 1/4² = 1 - 1/16 = 15/16 = .9375

at least 93.75% of the values fall between 14 and 62

between what 2 values do at least 89% of the values fall?

1 - 1/k² = .89

.11 = 1/k²

.11 k² = 1

k² = 1/.11 = 9.09

k = 3.015

With μ = 38, σ = 6, and k = 3.015, at least 89% of the values fall within:

μ ± 3.015σ = 38 ± 3.015(6) = 38 ± 18.09

between 19.91 and 56.09
3.23 1 - 1/k² = .80

1 - .80 = 1/k²

.20 = 1/k²  and  .20 k² = 1

k² = 5 and k = 2.236

2.236 standard deviations
3.24 μ = 43. 68% of the values lie within μ ± 1σ. Thus, between the mean, 43, and one of
the values, 46, is one standard deviation. Therefore,

1σ = 46 - 43 = 3

99.7% of the values lie within μ ± 3σ. Thus, between the mean, 43, and one of
the values, 51, are three standard deviations. Therefore,

3σ = 51 - 43 = 8

σ = 2.67

μ = 28 and 77% of the values lie between 24 and 32, or ± 4 from the mean:

1 - 1/k² = .77

Solving for k:

.23 = 1/k² and therefore, .23 k² = 1

k² = 4.3478

k = 2.085

2.085σ = 4

σ = 4/2.085 = 1.918
3.25 μ = 29    σ = 4

Between 21 and 37 days:

(x1 - μ)/σ = (21 - 29)/4 = -8/4 = -2 standard deviations

(x2 - μ)/σ = (37 - 29)/4 = 8/4 = 2 standard deviations

Since the distribution is normal, the empirical rule states that 95% of the values
fall within μ ± 2σ.

Exceed 37 days:

Since 95% fall between 21 and 37 days, 5% fall outside this range. Since the
normal distribution is symmetrical, 2.5% fall below 21 and 2.5% fall above 37.
Thus, 2.5% lie above the value of 37.

Exceed 41 days:

(x - μ)/σ = (41 - 29)/4 = 12/4 = 3 standard deviations

The empirical rule states that 99.7% of the values fall within μ ± 3σ = 29 ± 3(4) =
29 ± 12. That is, 99.7% of the values will fall between 17 and 41 days.
0.3% will fall outside this range, and half of this, or .15%, will lie above 41.

Less than 25:    μ = 29    σ = 4

(x - μ)/σ = (25 - 29)/4 = -4/4 = -1 standard deviation

According to the empirical rule, μ ± 1σ contains 68% of the values.

29 ± 1(4) = 29 ± 4

Therefore, 68% of the values lie between 25 and 33 days, and 32% lie outside this
range, with (1/2)(32%) = 16% less than 25.
3.26    x
        97
       109
       111
       118
       120
       130
       132
       133
       137
       137

Σx = 1224    Σx² = 151,486    n = 10    x̄ = 122.4    s = 13.615

Bordeaux: x = 137
z = (137 - 122.4)/13.615 = 1.07

Montreal: x = 130
z = (130 - 122.4)/13.615 = 0.56

Edmonton: x = 111
z = (111 - 122.4)/13.615 = -0.84

Hamilton: x = 97
z = (97 - 122.4)/13.615 = -1.87
3.27 Mean
Class f M fM
0 - 2 39 1 39
2 - 4 27 3 81
4 - 6 16 5 80
6 - 8 15 7 105
8 - 10 10 9 90
10 - 12 8 11 88
12 - 14 6 13 78
         Σf = 121    ΣfM = 561

μ = ΣfM/Σf = 561/121 = 4.64

Mode: The modal class is 0–2.
The midpoint of the modal class = the mode = 1
3.28
Class f M fM
1.2 - 1.6 220 1.4 308
1.6 - 2.0 150 1.8 270
2.0 - 2.4 90 2.2 198
2.4 - 2.8 110 2.6 286
2.8 - 3.2 280 3.0 840
         Σf = 850    ΣfM = 1902

Mean: μ = ΣfM/Σf = 1902/850 = 2.24

Mode: The modal class is 2.8–3.2.
The midpoint of the modal class is the mode = 3.0
3.29 Class f M fM
20-30 7 25 175
30-40 11 35 385
40-50 18 45 810
50-60 13 55 715
60-70 6 65 390
70-80 4 75 300
Total 59 2775
μ = ΣfM/Σf = 2775/59 = 47.034

    M - μ       (M - μ)²     f(M - μ)²
  -22.0339      485.4927     3398.449
  -12.0339      144.8147     1592.962
  - 2.0339        4.1367       74.462
    7.9661       63.4588      824.964
   17.9661      322.7808     1936.685
   27.9661      782.1028     3128.411
                    Total  10,955.933

σ² = Σf(M - μ)²/Σf = 10,955.93/59 = 185.694

σ = √185.694 = 13.627
3.30  Class     f    M    fM     fM²
      5 - 9    20    7   140      980
      9 - 13   18   11   198    2,178
     13 - 17    8   15   120    1,800
     17 - 21    6   19   114    2,166
     21 - 25    2   23    46    1,058
          Σf = 54   ΣfM = 618   ΣfM² = 8,182

s² = [ΣfM² - (ΣfM)²/n]/(n - 1) = [8182 - (618)²/54]/53 = (8182 - 7072.67)/53 = 20.931

s = √20.931 = 4.575
3.31  Class     f    M     fM       fM²
     18 - 24   17   21    357     7,497
     24 - 30   22   27    594    16,038
     30 - 36   26   33    858    28,314
     36 - 42   35   39  1,365    53,235
     42 - 48   33   45  1,485    66,825
     48 - 54   30   51  1,530    78,030
     54 - 60   32   57  1,824   103,968
     60 - 66   21   63  1,323    83,349
     66 - 72   15   69  1,035    71,415
          Σf = 231   ΣfM = 10,371   ΣfM² = 508,671

a.) Mean: x̄ = ΣfM/Σf = 10,371/231 = 44.896

b.) Mode. The modal class = 36–42. The mode is the class midpoint = 39

c.) s² = [ΣfM² - (ΣfM)²/n]/(n - 1) = [508,671 - (10,371)²/231]/230
       = 43,053.5065/230 = 187.189

d.) s = √187.2 = 13.682
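The grouped-data computations in 3.30 and 3.31 follow one recipe: replace each class with its midpoint, then apply the frequency-weighted formulas. A sketch (illustrative; names and the (lower, upper) class representation are ours):

```python
def grouped_stats(classes, freqs):
    """Grouped-data mean and sample variance from class midpoints,
    as in problem 3.31; classes are (lower, upper) pairs."""
    mids = [(lo + hi) / 2 for lo, hi in classes]
    n = sum(freqs)
    sum_fm = sum(f * m for f, m in zip(freqs, mids))
    sum_fm2 = sum(f * m * m for f, m in zip(freqs, mids))
    mean = sum_fm / n
    s2 = (sum_fm2 - sum_fm ** 2 / n) / (n - 1)   # sample variance
    return mean, s2

classes = [(18, 24), (24, 30), (30, 36), (36, 42), (42, 48),
           (48, 54), (54, 60), (60, 66), (66, 72)]
freqs = [17, 22, 26, 35, 33, 30, 32, 21, 15]
mean, s2 = grouped_stats(classes, freqs)
```

As problem 3.34 emphasizes, results computed this way are approximations, since the raw values within a class rarely average to the midpoint.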
3.32  Class    f    M     fM      fM²
      0 - 1   31   0.5   15.5     7.75
      1 - 2   57   1.5   85.5   128.25
      2 - 3   26   2.5   65.0   162.50
      3 - 4   14   3.5   49.0   171.50
      4 - 5    6   4.5   27.0   121.50
      5 - 6    3   5.5   16.5    90.75
          Σf = 137   ΣfM = 258.5   ΣfM² = 682.25

a.) Mean: μ = ΣfM/Σf = 258.5/137 = 1.887

b.) Mode: Modal class = 1–2. Mode = 1.5

c.) Variance: σ² = [ΣfM² - (ΣfM)²/N]/N = [682.25 - (258.5)²/137]/137 = 1.4197

d.) Standard Deviation: σ = √1.4197 = 1.1915
3.33  Class    f    M    fM     fM²
     20 - 30   8   25   200    5000
     30 - 40   7   35   245    8575
     40 - 50   1   45    45    2025
     50 - 60   0   55     0       0
     60 - 70   3   65   195   12675
     70 - 80   1   75    75    5625
         Σf = 20   ΣfM = 760   ΣfM² = 33,900

a.) Mean: μ = ΣfM/Σf = 760/20 = 38

b.) Mode. The modal class = 20–30.
The mode is the midpoint of this class = 25.

c.) Variance: σ² = [ΣfM² - (ΣfM)²/N]/N = [33,900 - (760)²/20]/20 = 251

d.) Standard Deviation: σ = √251 = 15.843
3.34 No. of Farms f M fM
0 - 20,000 16 10,000 160,000
20,000 - 40,000 11 30,000 330,000
40,000 - 60,000 10 50,000 500,000
60,000 - 80,000 6 70,000 420,000
80,000 - 100,000 5 90,000 450,000
100,000 - 120,000 1 110,000 110,000
Ef = 49 EfM = 1,970,000
μ = ΣfM/Σf = 1,970,000/49 = 40,204
The actual mean for the ungrouped data is 37,816. This computed group
mean, 40,204, is really just an approximation based on using the class
midpoints in the calculation. Apparently, the actual numbers of farms per
state in some categories do not average to the class midpoint and in fact
might be less than the class midpoint since the actual mean is less than the
grouped data mean.
The value for ΣfM² is 1.185 x 10¹¹.

σ² = [ΣfM² - (ΣfM)²/N]/N = [1.185 x 10¹¹ - (1,970,000)²/49]/49 = 801,999,167

σ = √801,999,167 = 28,319.59
The actual standard deviation was 29,341. The difference again is due to
the grouping of the data and the use of class midpoints to represent the
data. The class midpoints do not accurately reflect the raw data.
3.35 mean = $35
median = $33
mode = $21
The stock prices are skewed to the right. While many of the stock prices
are at the cheaper end, a few extreme prices at the higher end pull the
mean upward.
3.36 mean = 51
median = 54
mode = 59
The distribution is skewed to the left. More people are older but the most
extreme ages are younger ages.
3.37 Sk = 3(μ - Md)/σ = 3(5.51 - 3.19)/9.59 = 0.726
3.38 n = 25  Σx = 600  x̄ = 24  s = 6.6521  Md = 23

Sk = 3(x̄ - Md)/s = 3(24 - 23)/6.6521 = 0.451

There is a slight skewness to the right.
3.39 Q1 = 500.  Median = 558.5.  Q3 = 589.
IQR = 589 - 500 = 89
Inner Fences: Q1 - 1.5 IQR = 500 - 1.5(89) = 366.5
and Q3 + 1.5 IQR = 589 + 1.5(89) = 722.5

Outer Fences: Q1 - 3.0 IQR = 500 - 3(89) = 233
and Q3 + 3.0 IQR = 589 + 3(89) = 856
The distribution is negatively skewed. There are no mild or extreme outliers.
3.40 n = 18

Median: (n + 1)/2 = (18 + 1)/2 = 19/2 = 9.5th term

Median = 74

Q1 = P25: i = (25/100)(18) = 4.5

Q1 = 5th term = 66

Q3 = P75: i = (75/100)(18) = 13.5

Q3 = 14th term = 90

Therefore, IQR = Q3 - Q1 = 90 - 66 = 24
Inner Fences: Q1 - 1.5 IQR = 66 - 1.5(24) = 30
Q3 + 1.5 IQR = 90 + 1.5(24) = 126

Outer Fences: Q1 - 3.0 IQR = 66 - 3.0(24) = -6
Q3 + 3.0 IQR = 90 + 3.0(24) = 162
There are no extreme outliers. The only mild outlier is 21. The
distribution is positively skewed since the median is nearer to Q1 than to Q3.
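The fence arithmetic above follows directly from the quartiles; a small sketch assuming only Q1 = 66 and Q3 = 90 from this problem:

```python
# Quartiles found above for problem 3.40.
q1, q3 = 66, 90

iqr = q3 - q1                                 # interquartile range
inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)      # mild-outlier cutoffs
outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)      # extreme-outlier cutoffs

print(iqr, inner, outer)
```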
3.41 Σx = 80  Σx² = 1,148  Σy = 69  Σy² = 815  Σxy = 624  n = 7
r = [Σxy - (Σx Σy)/n] / √{[Σx² - (Σx)²/n][Σy² - (Σy)²/n]}

r = [624 - (80)(69)/7] / √{[1,148 - (80)²/7][815 - (69)²/7]}

r = -164.571/√[(233.714)(134.857)] = -164.571/177.533 = -0.927
3.42 Σx = 1,087  Σx² = 322,345  Σy = 2,032  Σy² = 878,686  Σxy = 507,509  n = 5

r = [Σxy - (Σx Σy)/n] / √{[Σx² - (Σx)²/n][Σy² - (Σy)²/n]}

r = [507,509 - (1,087)(2,032)/5] / √{[322,345 - (1,087)²/5][878,686 - (2,032)²/5]}

r = 65,752.2/√[(86,031.2)(52,881.2)] = 65,752.2/67,449.5 = .975
3.43 Delta (x) SW (y)
47.6 15.1
46.3 15.4
50.6 15.9
52.6 15.6
52.4 16.4
52.7 18.1
Σx = 302.2  Σy = 96.5  Σxy = 4,870.11  Σx² = 15,259.62  Σy² = 1,557.91
r = [Σxy - (Σx Σy)/n] / √{[Σx² - (Σx)²/n][Σy² - (Σy)²/n]}

r = [4,870.11 - (302.2)(96.5)/6] / √{[15,259.62 - (302.2)²/6][1,557.91 - (96.5)²/6]} = .6445
3.44 Σx = 6,087  Σx² = 6,796,149  Σy = 1,050  Σy² = 194,526  Σxy = 1,130,483  n = 9
r = [Σxy - (Σx Σy)/n] / √{[Σx² - (Σx)²/n][Σy² - (Σy)²/n]}

r = [1,130,483 - (6,087)(1,050)/9] / √{[6,796,149 - (6,087)²/9][194,526 - (1,050)²/9]}

r = 420,333/√[(2,679,308)(72,026)] = 420,333/439,294.705 = .957
3.45 Correlation between Year 1 and Year 2:

Σx = 17.09  Σx² = 58.7911  Σy = 15.12  Σy² = 41.7054  Σxy = 48.97  n = 8
r = [Σxy - (Σx Σy)/n] / √{[Σx² - (Σx)²/n][Σy² - (Σy)²/n]}

r = [48.97 - (17.09)(15.12)/8] / √{[58.7911 - (17.09)²/8][41.7054 - (15.12)²/8]}

r = 16.6699/√[(22.28259)(13.1286)] = 16.6699/17.1038 = .975
Correlation between Year 2 and Year 3:

Σx = 15.12  Σx² = 41.7054  Σy = 15.86  Σy² = 42.0396  Σxy = 41.5934  n = 8
r = [41.5934 - (15.12)(15.86)/8] / √{[41.7054 - (15.12)²/8][42.0396 - (15.86)²/8]}

r = 11.618/√[(13.1286)(10.59715)] = 11.618/11.795 = .985
Correlation between Year 1 and Year 3:

Σx = 17.09  Σx² = 58.7911  Σy = 15.86  Σy² = 42.0396  Σxy = 48.5827  n = 8
r = [48.5827 - (17.09)(15.86)/8] / √{[58.7911 - (17.09)²/8][42.0396 - (15.86)²/8]}

r = 14.702/√[(22.2826)(10.5972)] = 14.702/15.367 = .957
Years 2 and 3 are the most highly correlated, with r = .985.
3.46 Arranging the values in an ordered array:
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
3, 3, 3, 3, 3, 3, 4, 4, 5, 6, 8
Mean: x̄ = Σx/n = 75/30 = 2.5
Mode = 2 (There are eleven 2s)
Median: There are n = 30 terms.

The median is located at the (n + 1)/2 = 31/2 = 15.5th position.

The median is the average of the 15th and 16th values. However, since these are both 2, the median is 2.
Range = 8 - 1 = 7
Q1 = P25: i = (25/100)(30) = 7.5

Q1 is the 8th term = 1

Q3 = P75: i = (75/100)(30) = 22.5

Q3 is the 23rd term = 3

IQR = Q3 - Q1 = 3 - 1 = 2
3.47 P10: i = (10/100)(40) = 4

P10 = 4.5th term = 23

P80: i = (80/100)(40) = 32

P80 = 32.5th term = 49.5
Q1 = P25: i = (25/100)(40) = 10

Q1 = 10.5th term = 27.5

Q3 = P75: i = (75/100)(40) = 30

Q3 = 30.5th term = 47.5

IQR = Q3 - Q1 = 47.5 - 27.5 = 20
Range = 81 - 19 = 62
3.48 μ = Σx/N = 126,904/20 = 6345.2

The median is located at the (n + 1)/2 th value = 21/2 = 10.5th value.

The median is the average of 5414 and 5563 = 5488.5
P30: i = (.30)(20) = 6; P30 is located at the average of the 6th and 7th terms.
P30 = (4507 + 4541)/2 = 4524

P60: i = (.60)(20) = 12; P60 is located at the average of the 12th and 13th terms.
P60 = (6101 + 6498)/2 = 6299.5

P90: i = (.90)(20) = 18; P90 is located at the average of the 18th and 19th terms.
P90 = (9863 + 11,019)/2 = 10,441
Q1 = P25: i = (.25)(20) = 5; Q1 is located at the average of the 5th and 6th terms.
Q1 = (4464 + 4507)/2 = 4485.5

Q3 = P75: i = (.75)(20) = 15; Q3 is located at the average of the 15th and 16th terms.
Q3 = (6796 + 8687)/2 = 7741.5
Range = 11,388 - 3619 = 7769
IQR = Q3 - Q1 = 7741.5 - 4485.5 = 3256
3.49 n = 10  Σx = 87.95  Σx² = 1130.9027

μ = Σx/N = 87.95/10 = 8.795

σ = √{[Σx² - (Σx)²/N]/N} = √{[1130.9027 - (87.95)²/10]/10} = 5.978
3.50 a.) μ = Σx/N = 26,675/11 = 2425

Median = 1965
b.) Range = 6300 - 1092 = 5208
Q3 = 2867  Q1 = 1532  IQR = Q3 - Q1 = 1335
c.) Variance: σ² = [Σx² - (Σx)²/N]/N = [86,942,873 - (26,675)²/11]/11 = 2,023,272.55

Standard Deviation: σ = √2,023,272.55 = 1422.42
d.) Texaco: z = (x - μ)/σ = (1532 - 2425)/1422.42 = -0.63

Exxon Mobil: z = (x - μ)/σ = (6300 - 2425)/1422.42 = 2.72
e.) Skewness: Sk = 3(μ - Md)/σ = 3(2425 - 1965)/1422.42 = 0.97
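The z-scores and skewness in parts d.) and e.) can be checked in a few lines, using μ = 2425, Md = 1965, and σ = 1422.42 found above:

```python
# Population values computed above for problem 3.50.
mu, md, sigma = 2425, 1965, 1422.42

z_texaco = (1532 - mu) / sigma    # Texaco price relative to the mean
z_exxon = (6300 - mu) / sigma     # Exxon Mobil price relative to the mean
sk = 3 * (mu - md) / sigma        # Pearson's coefficient of skewness

print(round(z_texaco, 2), round(z_exxon, 2), round(sk, 2))
```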
3.51 a.) Mean: x̄ = Σx/n = 32.95/14 = 2.3536

Median: (1.79 + 2.07)/2 = 1.93

Mode: No mode
b.) Range: 4.73 - 1.20 = 3.53

Q1: (1/4)(14) = 3.5  Located at the 4th term. Q1 = 1.68

Q3: (3/4)(14) = 10.5  Located at the 11th term. Q3 = 2.87

IQR = Q3 - Q1 = 2.87 - 1.68 = 1.19
x  |x - x̄|  (x - x̄)²
4.73 2.3764 5.6473
3.64 1.2864 1.6548
3.53 1.1764 1.3839
2.87 0.5164 0.2667
2.61 0.2564 0.0657
2.59 0.2364 0.0559
2.07 0.2836 0.0804
1.79 0.5636 0.3176
1.77 0.5836 0.3406
1.69 0.6636 0.4404
1.68 0.6736 0.4537
1.41 0.9436 0.8904
1.37 0.9836 0.9675
1.20 1.1536 1.3308
Σ|x - x̄| = 11.6972  Σ(x - x̄)² = 13.8957
MAD = Σ|x - x̄|/n = 11.6972/14 = 0.8355

s² = Σ(x - x̄)²/(n - 1) = 13.8957/13 = 1.0689

s = √1.0689 = 1.0339
c.) Pearson's Coefficient of Skewness:

Sk = 3(x̄ - Md)/s = 3(2.3536 - 1.93)/1.0339 = 1.229
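Parts a.) through c.) can be reproduced directly from the 14 raw values in the deviation table above; an illustrative sketch:

```python
from math import sqrt

# The 14 mineral-production values from the table above.
data = [4.73, 3.64, 3.53, 2.87, 2.61, 2.59, 2.07,
        1.79, 1.77, 1.69, 1.68, 1.41, 1.37, 1.20]
n = len(data)

mean = sum(data) / n
mad = sum(abs(x - mean) for x in data) / n          # mean absolute deviation
s2 = sum((x - mean) ** 2 for x in data) / (n - 1)   # sample variance
s = sqrt(s2)

ordered = sorted(data)
median = (ordered[6] + ordered[7]) / 2              # average of 7th and 8th of 14
sk = 3 * (mean - median) / s                        # Pearson's coefficient of skewness

print(round(mad, 4), round(s2, 4), round(sk, 3))
```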
d.) Use Q1 = 1.68, Q2 = 1.93, Q3 = 2.87, IQR = 1.19

Extreme Points: 1.20 and 4.73

Inner Fences: Q1 - 1.5 IQR = 1.68 - 1.5(1.19) = -0.105
Q3 + 1.5 IQR = 2.87 + 1.5(1.19) = 4.655

Outer Fences: Q1 - 3.0 IQR = 1.68 - 3.0(1.19) = -1.890
Q3 + 3.0 IQR = 2.87 + 3.0(1.19) = 6.440

There is one mild outlier. The 4.73 recorded for Arizona is outside the upper
inner fence.
(Boxplot of Mineral Production omitted.)
3.52 Class f M fM fM²
15-20 9 17.5 157.5 2756.25
20-25 16 22.5 360.0 8100.00
25-30 27 27.5 742.5 20418.75
30-35 44 32.5 1430.0 46475.00
35-40 42 37.5 1575.0 59062.50
40-45 23 42.5 977.5 41543.75
45-50 7 47.5 332.5 15793.75
50-55 2 52.5 105.0 5512.50
Σf = 170  ΣfM = 5680.0  ΣfM² = 199,662.50
a.) Mean: μ = ΣfM/Σf = 5680/170 = 33.412
Mode: The Modal Class is 30-35. The class midpoint is the mode = 32.5.
b.) Variance: s² = [ΣfM² - (ΣfM)²/n]/(n - 1) = [199,662.5 - (5680)²/170]/169 = 58.483
Standard Deviation: s = √58.483 = 7.647
3.53 Class f M fM fM²
0 - 20 32 10 320 3,200
20 - 40 16 30 480 14,400
40 - 60 13 50 650 32,500
60 - 80 10 70 700 49,000
80 - 100 19 90 1,710 153,900
Σf = 90  ΣfM = 3,860  ΣfM² = 253,000
a) Mean: x̄ = ΣfM/Σf = 3,860/90 = 42.89
Mode: The Modal Class is 0-20. The midpoint of this class is the mode = 10.
b) Sample Standard Deviation:

s = √{[ΣfM² - (ΣfM)²/n]/(n - 1)} = √{[253,000 - (3860)²/90]/89} = √(87,448.9/89) = √982.572 = 31.346
3.54 Σx = 36  Σx² = 256  Σy = 44  Σy² = 300  Σxy = 188  n = 7
r = [Σxy - (Σx Σy)/n] / √{[Σx² - (Σx)²/n][Σy² - (Σy)²/n]}

r = [188 - (36)(44)/7] / √{[256 - (36)²/7][300 - (44)²/7]}

r = -38.2857/√[(70.85714)(23.42857)] = -38.2857/40.7441 = -.940
3.55 CVx = (σx/μx)(100%) = (3.45/32)(100%) = 10.78%

CVy = (σy/μy)(100%) = (5.40/84)(100%) = 6.43%

Stock X has the greater relative variability.
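A one-line check of the coefficient-of-variation comparison (σ/μ expressed as a percentage):

```python
# CV = (σ/μ)(100%) for each stock in problem 3.55.
cv_x = 3.45 / 32 * 100
cv_y = 5.40 / 84 * 100

print(round(cv_x, 2), round(cv_y, 2))
```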
3.56 μ = 7.5. Each of the numbers, 1 and 14, is 6.5 units away from the mean.

From the Empirical Rule: 99.7% of the values lie within μ ± 3σ.

3σ = 14 - 7.5 = 6.5. Solving 3σ = 6.5 for σ: σ = 2.167

Suppose that μ = 7.5 and σ = 1.7:

95% lie within μ ± 2σ = 7.5 ± 2(1.7) = 7.5 ± 3.4

Between 4.1 and 10.9 lie 95% of the values.
3.57 μ = 419, σ = 27

a.) 68%: μ ± 1σ = 419 ± 27 → 392 to 446
95%: μ ± 2σ = 419 ± 2(27) → 365 to 473
99.7%: μ ± 3σ = 419 ± 3(27) → 338 to 500
b.) Use Chebyshev's theorem:

Each of the points, 359 and 479, is a distance of 60 from the mean, μ = 419.

k = (distance from the mean)/σ = 60/27 = 2.22

Proportion = 1 - 1/k² = 1 - 1/(2.22)² = .797 = 79.7%
c.) Since x = 400, z = (400 - 419)/27 = -0.704. This worker is in the lower half of
workers but within one standard deviation of the mean.
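Part b.) is a direct application of Chebyshev's theorem; a sketch using the same rounded k as the text:

```python
# Both 359 and 479 lie 60 units, i.e., k standard deviations, from μ = 419.
k = 2.22                     # 60/27, rounded as in the text
proportion = 1 - 1 / k**2    # Chebyshev's lower bound on the proportion within ±k σ

print(round(proportion, 3))
```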
3.58 a.) x  x²
Albania 4,900 24,010,000
Bulgaria 8,200 67,240,000
Croatia 11,200 125,440,000
Czech 16,800 282,240,000
Σx = 41,100  Σx² = 498,930,000
μ = Σx/N = 41,100/4 = 10,275

σ = √{[Σx² - (Σx)²/N]/N} = √{[498,930,000 - (41,100)²/4]/4} = 4376.86
b.) x  x²
Hungary 14,900 222,010,000
Poland 12,000 144,000,000
Romania 7,700 59,290,000
Bosnia/Herz 6,500 42,250,000
Σx = 41,100  Σx² = 467,550,000
μ = Σx/N = 41,100/4 = 10,275

σ = √{[Σx² - (Σx)²/N]/N} = √{[467,550,000 - (41,100)²/4]/4} = 3363.31
c.) CV1 = (σ1/μ1)(100) = (4376.86/10,275)(100) = 42.60%

CV2 = (σ2/μ2)(100) = (3363.31/10,275)(100) = 32.73%

The first group has the larger coefficient of variation.
3.59 Mean $35,748
Median $31,369
Mode $29,500
Since these three measures are not equal, the distribution is skewed. The
distribution is skewed to the right because the mean is greater than the median. Often,
the median is preferred in reporting income data because it yields information about
the middle of the data while ignoring extremes.
3.60 Σx = 36.62  Σx² = 217.137  Σy = 57.23  Σy² = 479.3231  Σxy = 314.9091  n = 8
r = [Σxy - (Σx Σy)/n] / √{[Σx² - (Σx)²/n][Σy² - (Σy)²/n]}

r = [314.9091 - (36.62)(57.23)/8] / √{[217.137 - (36.62)²/8][479.3231 - (57.23)²/8]}

r = 52.938775/√[(49.50895)(69.91399)] = .90
There is a strong positive relationship between the inflation rate and
the thirty-year treasury yield.
3.61 a.) Q1 = P25: i = (25/100)(20) = 5

Q1 = 5.5th term = (48.3 + 49.9)/2 = 49.1

Q3 = P75: i = (75/100)(20) = 15

Q3 = 15.5th term = (77.6 + 83.8)/2 = 80.7

Median: (n + 1)/2 = (20 + 1)/2 = 10.5th term

Median = (55.9 + 61.3)/2 = 58.6
IQR = Q3 - Q1 = 80.7 - 49.1 = 31.6

1.5 IQR = 47.4; 3.0 IQR = 94.8

Inner Fences: Q1 - 1.5 IQR = 49.1 - 47.4 = 1.7
Q3 + 1.5 IQR = 80.7 + 47.4 = 128.1

Outer Fences: Q1 - 3.0 IQR = 49.1 - 94.8 = -45.7
Q3 + 3.0 IQR = 80.7 + 94.8 = 175.5
b.) and c.) There are no outliers in the lower end. There are two extreme
outliers in the upper end (South Louisiana, 198.8, and Houston,
190.9). There is one mild outlier at the upper end (New York, 145.9).
Since the median is nearer to Q1, the distribution is positively skewed.
d.) There are three dominating, large ports.
Displayed below is the MINITAB boxplot for this problem.
(Boxplot of U.S. Ports omitted.)
3.62 Paris: Since 1 - 1/k² = .53, solving for k: k = 1.459

The distance from μ = 349 to x = 381 is 32.

1.459σ = 32, so σ = 21.93

Moscow: Since 1 - 1/k² = .83, solving for k: k = 2.425

The distance from μ = 415 to x = 459 is 44.

2.425σ = 44, so σ = 18.14
Chapter 5
Discrete Distributions
LEARNING OBJECTIVES
The overall learning objective of Chapter 5 is to help you understand a category of probability
distributions that produces only discrete outcomes, thereby enabling you to:
1. Distinguish between discrete random variables and continuous random variables.
2. Know how to determine the mean and variance of a discrete distribution.
3. Identify the type of statistical experiments that can be described by the binomial
distribution and know how to work such problems.
4. Decide when to use the Poisson distribution in analyzing statistical experiments and
know how to work such problems.
5. Decide when binomial distribution problems can be approximated by the Poisson
distribution and know how to work such problems.
6. Decide when to use the hypergeometric distribution and know how to work such problems.
CHAPTER TEACHING STRATEGY
Chapters 5 and 6 introduce the student to several statistical distributions. It is
important to differentiate between the discrete distributions of chapter 5 and the continuous
distributions of chapter 6.
The approach taken in presenting the binomial distribution is to build on techniques
presented in chapter 4. It can be helpful to take the time to apply the law of multiplication for
independent events to a problem and demonstrate to students that sequence is important.
From there, the student will more easily understand that by using combinations, one can more
quickly determine the number of sequences and weigh the probability of obtaining a single
sequence by that number. In a sense, we are developing the binomial formula through an
inductive process. Thus, the binomial formula becomes more of a summary device than a
statistical "trick". The binomial tables presented in this text are non cumulative. This makes it
easier for the student to recognize that the table is but a listing of a series of binomial formula
computations. In addition, it lends itself more readily to the graphing of a binomial distribution.
It is important to differentiate applications of the Poisson distribution from binomial
distribution problems. It is often difficult for students to determine which type of distribution to
apply to a problem. The Poisson distribution applies to rare occurrences over some interval.
The parameters involved in the binomial distribution (n and p) are different from the parameter
(Lambda) of a Poisson distribution.
It is sometimes difficult for students to know how to handle Poisson problems in which
the interval for the problem is different than the stated interval for Lambda. Note that in such
problems, it is always the value of Lambda that is adjusted not the value of x. Lambda is a
long-run average that can be appropriately adjusted for various intervals. For example, if a store
is averaging 1 customer every 5 minutes, then it will also be averaging 2 customers in 10 minutes.
On the other hand, x is a one-time observation and just because x customers arrive in 5 minutes
does not mean that 2x customers will arrive in 10 minutes.
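This rescaling is worth demonstrating concretely. A brief illustrative sketch (not from the text) that adjusts λ, never x, before evaluating a Poisson probability:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with long-run average lam."""
    return lam**x * exp(-lam) / factorial(x)

# Suppose λ = 2.8 arrivals per 4 minutes. For a 2-minute question,
# halve the interval, so halve λ; x itself is never rescaled.
lam_2min = 2.8 / 2
print(round(poisson_pmf(3, lam_2min), 4))   # P(3 arrivals in 2 minutes)
```

With λ = 1.4 this reproduces the Table A.3 value .1128 used later in problem 5.18.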
Solving for the mean and standard deviation of binomial distributions prepares the
students for chapter 6 where the normal distribution is sometimes used to approximate
binomial distribution problems. Graphing binomial and Poisson distributions affords the student
the opportunity to visualize the meaning and impact of a particular set of parameters for a
distribution. In addition, it is possible to visualize how the binomial distribution approaches the
normal curve as p gets nearer to .50 and as n gets larger for other values of p. It can be useful to
demonstrate this in class along with showing how the graphs of Poisson distributions also
approach the normal curve as λ gets larger.
In this text (as in most), because of the number of variables used in its computation, only
exact probabilities are determined for the hypergeometric distribution. This, combined with the
fact that there are no hypergeometric tables given in the text, makes it cumbersome to
determine cumulative probabilities for the hypergeometric distribution. Thus, the
hypergeometric distribution can be presented as a fall-back position to be used only when the
binomial distribution should not be applied because of the non-independence of trials and size
of sample.
CHAPTER OUTLINE
5.1 Discrete Versus Continuous Distributions
5.2 Describing a Discrete Distribution
Mean, Variance, and Standard Deviation of Discrete Distributions
Mean or Expected Value
Variance and Standard Deviation of a Discrete Distribution
5.3 Binomial Distribution
Solving a Binomial Problem
Using the Binomial Table
Using the Computer to Produce a Binomial Distribution
Mean and Standard Deviation of the Binomial Distribution
Graphing Binomial Distributions
5.4 Poisson Distribution
Working Poisson Problems by Formula
Using the Poisson Tables
Mean and Standard Deviation of a Poisson Distribution
Graphing Poisson Distributions
Using the Computer to Generate Poisson Distributions
Approximating Binomial Problems by the Poisson Distribution
5.5 Hypergeometric Distribution
Using the Computer to Solve for Hypergeometric Distribution Probabilities
KEY TERMS

Binomial Distribution
Continuous Distributions
Continuous Random Variables
Discrete Distributions
Discrete Random Variables
Hypergeometric Distribution
Lambda (λ)
Mean, or Expected Value
Poisson Distribution
Random Variable
SOLUTIONS TO PROBLEMS IN CHAPTER 5
5.1 x  P(x)  xP(x)  (x-μ)²  (x-μ)²P(x)
1 .238 .238 2.775556 0.6605823
2 .290 .580 0.443556 0.1286312
3 .177 .531 0.111556 0.0197454
4 .158 .632 1.779556 0.2811700
5 .137 .685 5.447556 0.7463152
μ = Σ[xP(x)] = 2.666

σ² = Σ[(x-μ)²P(x)] = 1.836444

σ = √1.836444 = 1.355155
5.2 x  P(x)  xP(x)  (x-μ)²  (x-μ)²P(x)
0 .103 .000 7.573504 0.780071
1 .118 .118 3.069504 0.362201
2 .246 .492 0.565504 0.139114
3 .229 .687 0.061504 0.014084
4 .138 .552 1.557504 0.214936
5 .094 .470 5.053504 0.475029
6 .071 .426 10.549500 0.749015
7 .001 .007 18.045500 0.018046

μ = Σ[xP(x)] = 2.752

σ² = Σ[(x-μ)²P(x)] = 2.752496

σ = √2.752496 = 1.6591
5.3 x  P(x)  xP(x)  (x-μ)²  (x-μ)²P(x)
0 .461 .000 0.913936 0.421324
1 .285 .285 0.001936 0.000552
2 .129 .258 1.089936 0.140602
3 .087 .261 4.177936 0.363480
4 .038 .152 9.265936 0.352106
E(x) = μ = Σ[xP(x)] = 0.956

σ² = Σ[(x-μ)²P(x)] = 1.278064

σ = √1.278064 = 1.1305
5.4 x  P(x)  xP(x)  (x-μ)²  (x-μ)²P(x)
0 .262 .000 1.4424 0.37791
1 .393 .393 0.0404 0.01588
2 .246 .492 0.6384 0.15705
3 .082 .246 3.2364 0.26538
4 .015 .060 7.8344 0.11752
5 .002 .010 14.4324 0.02886
6 .000 .000 23.0304 0.00000

μ = Σ[xP(x)] = 1.201

σ² = Σ[(x-μ)²P(x)] = 0.96260

σ = √.96260 = .98112
5.5 a) n = 4  p = .10  q = .90

P(x=3) = 4C3(.10)^3(.90)^1 = 4(.001)(.90) = .0036

b) n = 7  p = .80  q = .20

P(x=4) = 7C4(.80)^4(.20)^3 = 35(.4096)(.008) = .1147
c) n = 10 p = .60 q = .40
P(x ≥ 7) = P(x=7) + P(x=8) + P(x=9) + P(x=10) =

10C7(.60)^7(.40)^3 + 10C8(.60)^8(.40)^2 + 10C9(.60)^9(.40)^1 + 10C10(.60)^10(.40)^0 =

120(.0280)(.064) + 45(.0168)(.16) + 10(.0101)(.40) + 1(.0060)(1) =

.2150 + .1209 + .0403 + .0060 = .3822
d) n = 12  p = .45  q = .55

P(5 ≤ x ≤ 7) = P(x=5) + P(x=6) + P(x=7) =

12C5(.45)^5(.55)^7 + 12C6(.45)^6(.55)^6 + 12C7(.45)^7(.55)^5 =

792(.0185)(.0152) + 924(.0083)(.0277) + 792(.0037)(.0503) =

.2225 + .2124 + .1489 = .5838
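These binomial computations can be verified with a short routine; an illustrative sketch of part c):

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial experiment with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Part c): P(x >= 7) for n = 10, p = .60.
prob = sum(binom_pmf(x, 10, 0.60) for x in range(7, 11))
print(round(prob, 4))
```

Carrying full precision gives .3823; the .3822 above reflects rounding at intermediate steps.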
5.6 By Table A.2:
a) n = 20 p = .50
P(x=12) = .120
b) n = 20 p = .30
P(x > 8) = P(x=9) + P(x=10) + P(x=11) + ...+ P(x=20) =
.065 + .031 + .012 + .004 + .001 + .000 = .113
c) n = 20 p = .70
P(x < 12) = P(x=11) + P(x=10) + P(x=9) + ... + P(x=0) =
.065 + .031 + .012 + .004 + .001 + .000 = .113
d) n = 20  p = .90

P(x ≤ 16) = P(x=16) + P(x=15) + P(x=14) + ... + P(x=0) =
e) n = 15  p = .40

P(4 ≤ x ≤ 9) =
P(x=4) + P(x=5) + P(x=6) + P(x=7) + P(x=8) + P(x=9) =
.127 + .186 + .207 + .177 + .118 + .061 = .876
f) n = 10  p = .60

P(x ≥ 7) = P(x=7) + P(x=8) + P(x=9) + P(x=10) =
.215 + .121 + .040 + .006 = .382
5.7 a) n = 20  p = .70  q = .30

μ = np = 20(.70) = 14

σ = √(npq) = √[20(.70)(.30)] = √4.2 = 2.05

b) n = 70  p = .35  q = .65

μ = np = 70(.35) = 24.5

σ = √(npq) = √[70(.35)(.65)] = √15.925 = 3.99

c) n = 100  p = .50  q = .50

μ = np = 100(.50) = 50

σ = √(npq) = √[100(.50)(.50)] = √25 = 5
5.8 a) n = 6 p = .70 x Prob
0 .001
1 .010
2 .060
3 .185
4 .324
5 .303
6 .118
b) n = 20 p = .50 x Prob
0 .000
1 .000
2 .000
3 .001
4 .005
5 .015
6 .037
7 .074
8 .120
9 .160
10 .176
11 .160
12 .120
13 .074
14 .037
15 .015
16 .005
17 .001
18 .000
19 .000
20 .000
c) n = 8 p = .80 x Prob
0 .000
1 .000
2 .001
3 .009
4 .046
5 .147
6 .294
7 .336
8 .168
5.9 a) n = 20  p = .78  x = 14

20C14(.78)^14(.22)^6 = 38,760(.030855)(.00011338) = .1356
b) n = 20  p = .75  x = 20

20C20(.75)^20(.25)^0 = (1)(.0031712)(1) = .0032
c) n = 20  p = .70  x < 12

Use Table A.2:

P(x=0) + P(x=1) + ... + P(x=11) =

.000 + .000 + .000 + .000 + .000 + .000 + .000 + .001 + .004 + .012 + .031 + .065 = .113
5.10 n = 16 p = .40
P(x ≥ 9): from Table A.2:
x Prob
9 .084
10 .039
11 .014
12 .004
13 .001
.142
P(3 ≤ x ≤ 6):
x Prob
3 .047
4 .101
5 .162
6 .198
.508
n = 13 p = .88
P(x = 10) = 13C10(.88)^10(.12)^3 = 286(.278500976)(.001728) = .1376

P(x = 13) = 13C13(.88)^13(.12)^0 = (1)(.1897906171)(1) = .1898

Expected Value = μ = np = 13(.88) = 11.44
5.11 n = 25 p = .60
a) x ≥ 15

P(x ≥ 15) = P(x = 15) + P(x = 16) + ... + P(x = 25)
Using Table A.2 n = 25, p = .60
x Prob
15 .161
16 .151
17 .120
18 .080
19 .044
20 .020
21 .007
22 .002
.585
b) x > 20
P(x > 20) = P(x = 21) + P(x = 22) + P(x = 23) + P(x = 24) + P(x = 25) =
Using Table A.2 n = 25, p = .60
.007 + .002 + .000 + .000 + .000 = .009
c) P(x < 10)
Using Table A.2 n = 25, p = .60 and x = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
x Prob.
9 .009
8 .003
7 .001
<6 .000
.013
5.12 n = 16 p = .50 x > 10
Using Table A.2, n = 16 and p = .50, P(x=11) + P(x=12) + . . . + P(x=16) =
x Prob.
11 .067
12 .028
13 .009
14 .002
15 .000
16 .000
.106
For n = 10  p = .87  x = 6:

10C6(.87)^6(.13)^4 = 210(.433626)(.00028561) = .0260
5.13 n = 15  p = .20

a) P(x = 5) = 15C5(.20)^5(.80)^10 = 3003(.00032)(.1073742) = .1032

b) P(x > 9): Using Table A.2

P(x = 10) + P(x = 11) + ... + P(x = 15) = .000 + .000 + ... + .000 = .000

c) P(x = 0) = 15C0(.20)^0(.80)^15 = (1)(1)(.035184) = .0352
d) P(4 ≤ x ≤ 7): Using Table A.2

P(x = 4) + P(x = 5) + P(x = 6) + P(x = 7) = .188 + .103 + .043 + .014 = .348
e)
5.14 n = 18
a) p = .30: μ = 18(.30) = 5.4

p = .34: μ = 18(.34) = 6.12
b) P(x > 8) n = 18 p = .30
from Table A.2
x Prob
8 .081
9 .039
10 .015
11 .005
12 .001
.141
c) n = 18  p = .34

P(2 ≤ x ≤ 4) = P(x = 2) + P(x = 3) + P(x = 4) =

18C2(.34)^2(.66)^16 + 18C3(.34)^3(.66)^15 + 18C4(.34)^4(.66)^14 =

.0229 + .0630 + .1217 = .2076
d) n = 18  p = .30  x = 0

18C0(.30)^0(.70)^18 = .00163

n = 18  p = .34  x = 0

18C0(.34)^0(.66)^18 = .00056
Since only 30% (compared to 34%) fall in the $500,000 to $1,000,000 category, it is
more likely that none of the CPA financial advisors would fall in this category.
5.15 a) P(x=5 | λ = 2.3) = (2.3^5)(e^-2.3)/5! = (64.36343)(.100259)/120 = .0538

b) P(x=2 | λ = 3.9) = (3.9^2)(e^-3.9)/2! = (15.21)(.020242)/2 = .1539

c) P(x ≤ 3 | λ = 4.1) = P(x=3) + P(x=2) + P(x=1) + P(x=0) =

(4.1^3)(e^-4.1)/3! = (68.921)(.016573)/6 = .1904

(4.1^2)(e^-4.1)/2! = (16.81)(.016573)/2 = .1393

(4.1^1)(e^-4.1)/1! = (4.1)(.016573)/1 = .0679

(4.1^0)(e^-4.1)/0! = (1)(.016573)/1 = .0166

.1904 + .1393 + .0679 + .0166 = .4142

d) P(x=0 | λ = 2.7) = (2.7^0)(e^-2.7)/0! = (1)(.06721)/1 = .0672

e) P(x=1 | λ = 5.4) = (5.4^1)(e^-5.4)/1! = (5.4)(.0045166)/1 = .0244

f) P(4 < x < 8 | λ = 4.4): P(x=5) + P(x=6) + P(x=7) =

(4.4^5)(e^-4.4)/5! + (4.4^6)(e^-4.4)/6! + (4.4^7)(e^-4.4)/7! =

(1649.1622)(.01227734)/120 + (7256.3139)(.01227734)/720 + (31,927.781)(.01227734)/5040 =

.1687 + .1237 + .0778 = .3702
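Each of these follows from the Poisson formula P(x) = (λ^x e^-λ)/x!; an illustrative sketch checking parts a) and b):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson probability P(X = x) with mean lam."""
    return lam**x * exp(-lam) / factorial(x)

print(round(poisson_pmf(5, 2.3), 4))   # part a)
print(round(poisson_pmf(2, 3.9), 4))   # part b)
```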
5.16 a) P(x=6 | λ = 3.8) = .0936

b) P(x>7 | λ = 2.9):
x Prob
8 .0068
9 .0022
10 .0006
11 .0002
12 .0000
.0098
c) P(3 ≤ x ≤ 9 | λ = 4.2) =
x Prob
3 .1852
4 .1944
5 .1633
6 .1143
7 .0686
8 .0360
9 .0168
.7786
d) P(x=0 | λ = 1.9) = .1496

e) P(x ≤ 6 | λ = 2.9) =
x Prob
0 .0550
1 .1596
2 .2314
3 .2237
4 .1622
5 .0940
6 .0455
.9714
f) P(5 < x ≤ 8 | λ = 5.7) =
x Prob
6 .1594
7 .1298
8 .0925
.3817
5.17 a) λ = 6.3  mean = 6.3  standard deviation = √6.3 = 2.51
x Prob
0 .0018
1 .0116
2 .0364
3 .0765
4 .1205
5 .1519
6 .1595
7 .1435
8 .1130
9 .0791
10 .0498
11 .0285
12 .0150
13 .0073
14 .0033
15 .0014
16 .0005
17 .0002
18 .0001
19 .0000
b) λ = 1.3  mean = 1.3  standard deviation = √1.3 = 1.14
x Prob
0 .2725
1 .3542
2 .2303
3 .0998
4 .0324
5 .0084
6 .0018
7 .0003
8 .0001
9 .0000
c) λ = 8.9  mean = 8.9  standard deviation = √8.9 = 2.98
x Prob
0 .0001
1 .0012
2 .0054
3 .0160
4 .0357
5 .0635
6 .0941
7 .1197
8 .1332
9 .1317
10 .1172
11 .0948
12 .0703
13 .0481
14 .0306
15 .0182
16 .0101
17 .0053
18 .0026
19 .0012
20 .0005
21 .0002
22 .0001
d) λ = 0.6  mean = 0.6  standard deviation = √0.6 = .775
x Prob
0 .5488
1 .3293
2 .0988
3 .0198
4 .0030
5 .0004
6 .0000
5.18 λ = 2.8 arrivals|4 minutes

a) P(x=6 | λ = 2.8): from Table A.3 = .0407

b) P(x=0 | λ = 2.8): from Table A.3 = .0608
c) Unable to meet demand if x > 4|4 minutes:
x Prob.
5 .0872
6 .0407
7 .0163
8 .0057
9 .0018
10 .0005
11 .0001
.1523
There is a .1523 probability of being unable to meet the demand.
Probability of meeting the demand = 1 - (.1523) = .8477
15.23% of the time a second window will need to be opened.
d) λ = 2.8 arrivals|4 minutes

P(x=3) arrivals|2 minutes = ??

Lambda must be changed to the same interval (half the size):

New lambda = 1.4 arrivals|2 minutes

P(x=3 | λ = 1.4) = from Table A.3 = .1128

P(x ≥ 5 | 8 minutes) = ??

Lambda must be changed to the same interval (twice the size):

New lambda = 5.6 arrivals|8 minutes
P(x ≥ 5 | λ = 5.6):
From Table A.3: x Prob.
5 .1697
6 .1584
7 .1267
8 .0887
9 .0552
10 .0309
11 .0157
12 .0073
13 .0032
14 .0013
15 .0005
16 .0002
17 .0001
.6579
5.19 λ = Σx/n = 126/36 = 3.5
Using Table A.3
a) P(x = 0) = .0302
b) P(x ≥ 6) = P(x = 6) + P(x = 7) + . . . =
.0771 + .0385 + .0169 + .0066 + .0023 +
.0007 + .0002 + .0001 = .1424
c) P(x < 4 | 10 minutes)

Double Lambda to λ = 7.0 | 10 minutes
P(x < 4) = P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3) =
.0009 + .0064 + .0223 + .0521 = .0817
d) P(3 ≤ x ≤ 6 | 10 minutes)

λ = 7.0 | 10 minutes

P(3 ≤ x ≤ 6) = P(x = 3) + P(x = 4) + P(x = 5) + P(x = 6)

= .0521 + .0912 + .1277 + .1490 = .4200
e) P(x = 8 | 15 minutes)
Change Lambda for a 15 minute interval by multiplying the original Lambda by 3.
λ = 10.5 | 15 minutes
P(x = 8 | 15 minutes) = (10.5^8)(e^-10.5)/8! = .1009
5.20 λ = 5.6 days|3 weeks

a) P(x=0 | λ = 5.6): from Table A.3 = .0037

b) P(x=6 | λ = 5.6): from Table A.3 = .1584

c) P(x ≥ 15 | λ = 5.6):

x Prob.
15 .0005
16 .0002
17 .0001
.0008
Because this probability is so low, if it actually occurred, the researcher would
question the Lambda value as too low for this period. Perhaps the value of
Lambda has changed because of an overall increase in pollution.
5.21 λ = 0.6 trips|1 year

a) P(x=0 | λ = 0.6): from Table A.3 = .5488

b) P(x=1 | λ = 0.6): from Table A.3 = .3293

c) P(x ≥ 2 | λ = 0.6): from Table A.3:

x Prob.
2 .0988
3 .0198
4 .0030
5 .0004
6 .0000
.1220
d) P(x ≤ 3 | 3-year period):

The interval length has been increased (3 times).

New Lambda = λ = 1.8 trips|3 years

P(x ≤ 3 | λ = 1.8):
from Table A.3 x Prob.
0 .1653
1 .2975
2 .2678
3 .1607
.8913
e) P(x=4 | 6 years):

The interval has been increased (6 times).

New Lambda = λ = 3.6 trips|6 years

P(x=4 | λ = 3.6): from Table A.3 = .1912
5.22 λ = 1.2 collisions|4 months

a) P(x=0 | λ = 1.2): from Table A.3 = .3012

b) P(x=2 | 2 months):

The interval has been decreased (by 1/2).

New Lambda = λ = 0.6 collisions|2 months

P(x=2 | λ = 0.6): from Table A.3 = .0988

c) P(x ≤ 1 collision | 6 months):

The interval length has been increased (by 1.5).

New Lambda = λ = 1.8 collisions|6 months

P(x ≤ 1 | λ = 1.8): from Table A.3:

x Prob.
0 .1653
1 .2975
.4628
The result is likely to happen almost half the time (46.28%). Ship channel and
weather conditions are about normal for this period. Safety awareness is
about normal for this period. There is no compelling reason to reject the
lambda value of 1.2 collisions per 4 months based on an outcome of 0 or 1
collisions per 6 months.
5.23 λ = 1.2 pens|carton

a) P(x=0 | λ = 1.2): from Table A.3 = .3012

b) P(x > 8 | λ = 1.2): from Table A.3 = .0000

c) P(x > 3 | λ = 1.2): from Table A.3:
from Table A.3 x Prob.
4 .0260
5 .0062
6 .0012
7 .0002
8 .0000
.0336
5.24 n = 100,000  p = .00004

P(x ≥ 7 | n = 100,000, p = .00004):

λ = μ = np = 100,000(.00004) = 4.0

Since n > 20 and np < 7, the Poisson approximation to this binomial problem is
close enough.

P(x ≥ 7 | λ = 4): Using Table A.3:

x Prob.
7 .0595
8 .0298
9 .0132
10 .0053
11 .0019
12 .0006
13 .0002
14 .0001
.1106

P(x > 10 | λ = 4): Using Table A.3:

x Prob.
11 .0019
12 .0006
13 .0002
14 .0001
.0028
Since getting more than 10 is a rare occurrence, this particular geographic region
appears to have a higher average rate than other regions. An investigation of
particular characteristics of this region might be warranted.
5.25 p = .009 n = 200
Use the Poisson Distribution:
λ = np = 200(.009) = 1.8
a) P(x ≥ 6) from Table A.3 =

P(x = 6) + P(x = 7) + P(x = 8) + P(x = 9) + . . . =

.0078 + .0020 + .0005 + .0001 = .0104
b) P(x > 10) = .0000
c) P(x = 0) = .1653
d) P(x < 5) = P(x = 0) + P(x = 1) + P(x = 2) + P( x = 3) + P(x = 4) =
.1653 + .2975 + .2678 + .1607 + .0723 = .9636
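The quality of the Poisson approximation in these problems can be checked against the exact binomial; an illustrative sketch comparing P(x = 0) under both models for n = 200, p = .009:

```python
from math import exp

n, p = 200, 0.009
lam = n * p                 # λ = np = 1.8

exact = (1 - p) ** n        # exact binomial P(x = 0)
approx = exp(-lam)          # Poisson approximation P(x = 0)

print(round(exact, 4), round(approx, 4))
```

The two values agree to about two decimal places (.1640 vs. .1653), consistent with using the Poisson table for this problem.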
5.26 If 99% see a doctor, then 1% do not see a doctor. Thus, p = .01 for this problem.
n = 300, p = .01, λ = np = 300(.01) = 3
a) P(x = 5):
Using λ = 3 and Table A.3 = .1008
b) P(x < 4) = P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3) =
.0498 + .1494 + .2240 + .2240 = .6472
c) The expected number = μ = λ = 3
5.27 a) P(x = 3 | N = 11, A = 8, n = 4) = (8C3)(3C1)/(11C4) = (56)(3)/330 = .5091

b) P(x < 2 | N = 15, A = 5, n = 6) = P(x = 1) + P(x = 0) =

(5C1)(10C5)/(15C6) + (5C0)(10C6)/(15C6) = (5)(252)/5005 + (1)(210)/5005 =

.2517 + .0420 = .2937

c) P(x = 0 | N = 9, A = 2, n = 3) = (2C0)(7C3)/(9C3) = (1)(35)/84 = .4167

d) P(x > 4 | N = 20, A = 5, n = 7) = P(x = 5) + P(x = 6) + P(x = 7) =

(5C5)(15C2)/(20C7) + 5C6 (impossible) + 5C7 (impossible) =

(1)(105)/77,520 = .0014
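The hypergeometric probabilities above all have the form (ACx)(N-ACn-x)/(NCn); an illustrative sketch checking parts a) and c):

```python
from math import comb

def hypergeom_pmf(x, N, A, n):
    """P(X = x): x successes in n draws, without replacement,
    from N items of which A are successes."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

print(round(hypergeom_pmf(3, 11, 8, 4), 4))   # part a)
print(round(hypergeom_pmf(0, 9, 2, 3), 4))    # part c)
```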
5.28 N = 19  n = 6

a) P(x = 1 private)  A = 11

(11C1)(8C5)/(19C6) = (11)(56)/27,132 = .0227

b) P(x = 4 private)

(11C4)(8C2)/(19C6) = (330)(28)/27,132 = .3406

c) P(x = 6 private)

(11C6)(8C0)/(19C6) = (462)(1)/27,132 = .0170

d) P(x = 0 private)

(11C0)(8C6)/(19C6) = (1)(28)/27,132 = .0010
5.29 N = 17  A = 8  n = 4

a) P(x = 0) = (8C0)(9C4)/(17C4) = (1)(126)/2380 = .0529

b) P(x = 4) = (8C4)(9C0)/(17C4) = (70)(1)/2380 = .0294

c) P(x = 2 non-computer) = (9C2)(8C2)/(17C4) = (36)(28)/2380 = .4235
5.30 N = 20  A = 16 white  N - A = 4 red  n = 5

a) P(x = 4 white) = (16C4)(4C1)/(20C5) = (1820)(4)/15,504 = .4696

b) P(x = 4 red) = (4C4)(16C1)/(20C5) = (1)(16)/15,504 = .0010

c) P(x = 5 red) = (4C5)(16C0)/(20C5) = .0000 because 4C5 is impossible to determine
The participant cannot draw 5 red beads if there are only 4 to draw from.
5.31 N = 10  n = 4

a) A = 3  x = 2

P(x = 2) = (3C2)(7C2)/(10C4) = (3)(21)/210 = .30

b) A = 5  x = 0

P(x = 0) = (5C0)(5C4)/(10C4) = (1)(5)/210 = .0238

c) A = 5  x = 3

P(x = 3) = (5C3)(5C1)/(10C4) = (10)(5)/210 = .2381
5.32 N = 16  A = 4 defective  n = 3

a) P(x = 0) = (4C0)(12C3)/(16C3) = (1)(220)/560 = .3929

b) P(x = 3) = (4C3)(12C0)/(16C3) = (4)(1)/560 = .0071

c) P(x ≥ 2) = P(x=2) + P(x=3) = (4C2)(12C1)/(16C3) + .0071 (from part b) =

(6)(12)/560 + .0071 = .1286 + .0071 = .1357

d) P(x ≤ 1) = P(x=1) + P(x=0) = (4C1)(12C2)/(16C3) + .3929 (from part a) =

(4)(66)/560 + .3929 = .4714 + .3929 = .8643
5.33 N = 18  A = 11 Hispanic  n = 5

P(x ≤ 1) = P(1) + P(0) =

(11C1)(7C4)/(18C5) + (11C0)(7C5)/(18C5) = (11)(35)/8568 + (1)(21)/8568 =

.0449 + .0025 = .0474
It is fairly unlikely that these results occur by chance. A researcher might want to
further investigate this result to determine causes. Were officers selected based on
leadership, years of service, dedication, prejudice, or some other reason?
5.34 a) P(x = 4 | n = 11 and p = .23) =

   11C4(.23)^4(.77)^7 = 330(.0028)(.1605) = .1482

b) P(x ≥ 1 | n = 6 and p = .50) =

   1 - P(x < 1) = 1 - P(x = 0) = 1 - [6C0(.50)^0(.50)^6] = 1 - [(1)(1)(.0156)] = .9844

c) P(x > 7 | n = 9 and p = .85) = P(x = 8) + P(x = 9) =

   9C8(.85)^8(.15)^1 + 9C9(.85)^9(.15)^0 =

   (9)(.2725)(.15) + (1)(.2316)(1) = .3679 + .2316 = .5995

d) P(x ≤ 3 | n = 14 and p = .70) =

   P(x = 3) + P(x = 2) + P(x = 1) + P(x = 0) =

   14C3(.70)^3(.30)^11 + 14C2(.70)^2(.30)^12 + 14C1(.70)^1(.30)^13 + 14C0(.70)^0(.30)^14 =

   (364)(.3430)(.00000177) + (91)(.49)(.000000531) +
   (14)(.70)(.00000016) + (1)(1)(.000000048) =

   .0002 + .0000 + .0000 + .0000 = .0002
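The binomial terms above can be verified the same way; a minimal sketch with `math.comb` (the name `binom_pmf` is ours):

```python
from math import comb

def binom_pmf(n, p, x):
    # P(x successes in n independent trials, success probability p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# 5.34 a): P(x = 4 | n = 11, p = .23)
print(round(binom_pmf(11, .23, 4), 4))
# 5.34 b): P(x >= 1 | n = 6, p = .50) via the complement rule
print(round(1 - binom_pmf(6, .50, 0), 4))
```

The two printed values agree with .1482 and .9844 above.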
5.35 a) P(x = 14 | n = 20 and p = .60) = .124

b) P(x < 5 | n = 10 and p = .30) =
P(x = 4) + P(x = 3) + P(x = 2) + P(x = 1) + P(x=0) =
x Prob.
0 .028
1 .121
2 .233
3 .267
4 .200
.849
c) P(x ≥ 12 | n = 15 and p = .60) =
P(x = 12) + P(x = 13) + P(x = 14) + P(x = 15)
x Prob.
12 .063
13 .022
14 .005
15 .000
.090
d) P(x > 20 |n = 25 and p = .40) = P(x = 21) + P(x = 22) +
P(x = 23) + P(x = 24) + P(x=25) =
x Prob.
21 .000
22 .000
23 .000
24 .000
25 .000
.000
5.36 a) P(x = 4 | λ = 1.25) =

   (1.25)^4 e^(-1.25)/4! = (2.4414)(.2865)/24 = .0291

b) P(x ≤ 1 | λ = 6.37) = P(x = 1) + P(x = 0) =

   (6.37)^1 e^(-6.37)/1! + (6.37)^0 e^(-6.37)/0! = (6.37)(.0017) + (1)(.0017) =

   .0109 + .0017 = .0126

c) P(x > 5 | λ = 2.4) = P(x = 6) + P(x = 7) + ... =

   (2.4)^6 e^(-2.4)/6! + (2.4)^7 e^(-2.4)/7! + (2.4)^8 e^(-2.4)/8! +
   (2.4)^9 e^(-2.4)/9! + (2.4)^10 e^(-2.4)/10! + ... =

   .0241 + .0083 + .0025 + .0007 + .0002 = .0358

   For values x ≥ 11 the probabilities are each .0000 when rounded off to 4
   decimal places.
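The Poisson formula used throughout 5.36 is easy to script; a minimal sketch (the name `poisson_pmf` is ours):

```python
from math import exp, factorial

def poisson_pmf(lam, x):
    # P(x occurrences per interval when the long-run mean is lam)
    return lam**x * exp(-lam) / factorial(x)

# 5.36 a): P(x = 4 | lambda = 1.25)
print(round(poisson_pmf(1.25, 4), 4))
# 5.36 b): P(x <= 1 | lambda = 6.37)
print(round(poisson_pmf(6.37, 1) + poisson_pmf(6.37, 0), 4))
```

The printed values match .0291 and .0126 above.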
5.37 a) P(x = 3 | λ = 1.8) = .1607

b) P(x < 5 | λ = 3.3) =
P(x = 4) + P(x = 3) + P(x = 2) + P(x = 1) + P(x = 0) =
x Prob.
0 .0369
1 .1217
2 .2008
3 .2209
4 .1823
.7626
c) P(x ≥ 3 | λ = 2.1) =
x Prob.
3 .1890
4 .0992
5 .0417
6 .0146
7 .0044
8 .0011
9 .0003
10 .0001
11 .0000
.3504
d) P(2 < x ≤ 5 | λ = 4.2):
P(x=3) + P(x=4) + P(x=5) =
x Prob.
3 .1852
4 .1944
5 .1633
.5429
5.38 a) P(x = 3 | N = 6, n = 4, A = 5) =

   (5C3)(1C1)/6C4 = (10)(1)/15 = .6667

b) P(x ≤ 1 | N = 10, n = 3, A = 5):

   P(x = 1) + P(x = 0) = (5C1)(5C2)/10C3 + (5C0)(5C3)/10C3 =

   (5)(10)/120 + (1)(10)/120 = .4167 + .0833 = .5000

c) P(x ≥ 2 | N = 13, n = 5, A = 3):

   P(x = 2) + P(x = 3)      Note: only 3 x's in the population

   (3C2)(10C3)/13C5 + (3C3)(10C2)/13C5 = (3)(120)/1,287 + (1)(45)/1,287 =

   .2797 + .0350 = .3147
5.39 n = 25   p = .20 retired

   from Table A.2: P(x = 7) = .111

   P(x ≥ 10) = P(x = 10) + P(x = 11) + . . . + P(x = 25) = .012 + .004 + .001 = .017

   Expected Value = µ = np = 25(.20) = 5

   n = 20   p = .40 mutual funds

   P(x = 8) = .180

   P(x < 6) = P(x = 0) + P(x = 1) + . . . + P(x = 5) =

   .000 + .000 + .003 + .012 + .035 + .075 = .125

   P(x = 0) = .000

   P(x ≥ 12) = P(x = 12) + P(x = 13) + . . . + P(x = 20) = .035 + .015 + .005 + .001 = .056

   x = 8

   Expected Number = µ = np = 20(.40) = 8
5.40 λ = 3.2 cars per 2 hours

a) P(x = 3 cars per 1 hour) = ??

   The interval has been decreased by ½.

   The new λ = 1.6 cars per 1 hour.

   P(x = 3 | λ = 1.6) = (from Table A.3) .1378

b) P(x = 0 cars per ½ hour) = ??

   The interval has been decreased to ¼ of the original amount.

   The new λ = 0.8 cars per ½ hour.

   P(x = 0 | λ = 0.8) = (from Table A.3) .4493

c) P(x ≥ 5 | λ = 1.6) = (from Table A.3)

   x     Prob.
   5     .0176
   6     .0047
   7     .0011
   8     .0002
         .0236

   Either a rare event occurred or perhaps the long-run average, λ, has changed
   (increased).
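The rescaling step in 5.40 — shrink the interval, shrink λ proportionally — can be checked directly; a sketch (exact values differ from Table A.3 only by rounding):

```python
from math import exp, factorial

def poisson_pmf(lam, x):
    return lam**x * exp(-lam) / factorial(x)

lam_per_2hr = 3.2
lam_1hr = lam_per_2hr / 2       # interval halved -> 1.6 cars per hour
lam_half_hr = lam_per_2hr / 4   # quarter interval -> 0.8 cars per half hour

print(round(poisson_pmf(lam_1hr, 3), 4))      # part a)
print(round(poisson_pmf(lam_half_hr, 0), 4))  # part b)
```

The printed values agree with .1378 and .4493 above.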
5.41 N = 32   A = 10   n = 12

a) P(x = 3) = (10C3)(22C9)/32C12 = (120)(497,420)/225,792,840 = .2644

b) P(x = 6) = (10C6)(22C6)/32C12 = (210)(74,613)/225,792,840 = .0694

c) P(x = 0) = (10C0)(22C12)/32C12 = (1)(646,646)/225,792,840 = .0029

d) A = 22

   P(7 ≤ x ≤ 9) = (22C7)(10C5)/32C12 + (22C8)(10C4)/32C12 + (22C9)(10C3)/32C12 =

   (170,544)(252)/225,792,840 + (319,770)(210)/225,792,840 + (497,420)(120)/225,792,840 =

   .1903 + .2974 + .2644 = .7521
5.42 λ = 1.4 defects per lot    If x > 3, buyer rejects    If x ≤ 3, buyer accepts

   P(x ≤ 3 | λ = 1.4) = (from Table A.3)

   x     Prob.
   0     .2466
   1     .3452
   2     .2417
   3     .1128
         .9463
5.43 a) n = 20 and p = .25

   The expected number = µ = np = (20)(.25) = 5.00

b) P(x ≤ 1 | n = 20 and p = .25) =

   P(x = 1) + P(x = 0) =

   20C1(.25)^1(.75)^19 + 20C0(.25)^0(.75)^20 =

   (20)(.25)(.00423) + (1)(1)(.0032) = .0212 + .0032 = .0244
Since the probability is so low, the population of your state may have a lower
percentage of chronic heart conditions than those of other states.
5.44 a) P(x > 7 | n = 10 and p = .70) = (from Table A.2):

   x     Prob.
   8     .233
   9     .121
   10    .028
         .382

   Expected number = µ = np = 10(.70) = 7

b) n = 15   p = 1/3    Expected number = µ = np = 15(1/3) = 5

   P(x = 0 | n = 15 and p = 1/3) = 15C0(1/3)^0(2/3)^15 = .0023

c) n = 7   p = .53

   P(x = 7 | n = 7 and p = .53) = 7C7(.53)^7(.47)^0 = .0117

   Probably the 53% figure is too low for this population since the probability of
   this occurrence is so low (.0117).
5.45 n = 12

a) P(x = 0 long hours):

   p = .20    12C0(.20)^0(.80)^12 = .0687

b) P(x ≥ 6 long hours):

   p = .20    Using Table A.2: .016 + .003 + .001 = .020

c) P(x = 5 good financing):

   p = .25    12C5(.25)^5(.75)^7 = .1032

d) p = .19 (good plan), expected number = µ = np = 12(.19) = 2.28
5.46 n = 100,000   p = .000014

   Worked as a Poisson: λ = np = 100,000(.000014) = 1.4

a) P(x = 5): from Table A.3 = .0111

b) P(x = 0): from Table A.3 = .2466

c) P(x > 6): (from Table A.3)

   x     Prob.
   7     .0005
   8     .0001
         .0006
5.47 P(x ≤ 3 | n = 8 and p = .60): From Table A.2:
x Prob.
0 .001
1 .008
2 .041
3 .124
.174
17.4% of the time in a sample of eight, three or fewer customers are walk-ins by
chance. Other reasons for such a low number of walk-ins might be that she is
retaining more old customers than before or perhaps a new competitor is
attracting walk-ins away from her.
5.48 n = 25 p = .20
a) P(x = 8| n = 25 and p = .20) = (from Table A.2) .062
b) P(x > 10 | n = 25 and p = .20) = (from Table A.2)

   x     Prob.
   11    .004
   12    .001
   13    .000
         .005
c) Since such a result would only occur 0.5% of the time by chance, it is likely
that the analyst's list was not representative of the entire state of Idaho or the
20% figure for the Idaho census is not correct.
5.49 λ = 0.6 flats per 2,000 miles

   P(x = 0 | λ = 0.6) = (from Table A.3) .5488

   P(x ≥ 3 | λ = 0.6) = (from Table A.3)

   x     Prob.
   3     .0198
   4     .0030
   5     .0004
         .0232

   Assume one trip is independent of the other. Let F = flat tire and NF = no flat tire.

   P(NF1 ∩ NF2) = P(NF1)·P(NF2), but P(NF) = .5488:

   P(NF1 ∩ NF2) = (.5488)(.5488) = .3012
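The last step multiplies two independent no-flat probabilities; since P(x = 0) for a Poisson is simply e^(-λ), the whole check fits in a few lines (a sketch):

```python
from math import exp

lam = 0.6                      # flats per 2,000-mile trip
p_no_flat = exp(-lam)          # P(x = 0) = e^(-0.6)
p_both_trips = p_no_flat ** 2  # independence: multiply across the two trips

print(round(p_no_flat, 4), round(p_both_trips, 4))
```

The printed values agree with .5488 and .3012 above.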
5.50 N = 25   n = 8

a) P(x = 1 in NY)   A = 4

   (4C1)(21C7)/25C8 = (4)(116,280)/1,081,575 = .4300

b) P(x = 4 in top 10)   A = 10

   (10C4)(15C4)/25C8 = (210)(1,365)/1,081,575 = .2650

c) P(x = 0 in California)   A = 5

   (5C0)(20C8)/25C8 = (1)(125,970)/1,081,575 = .1165

d) P(x = 3 with M)   A = 3

   (3C3)(22C5)/25C8 = (1)(26,334)/1,081,575 = .0243
5.51 N = 24   n = 6   A = 8

a) P(x = 6) = (8C6)(16C0)/24C6 = (28)(1)/134,596 = .0002

b) P(x = 0) = (8C0)(16C6)/24C6 = (1)(8,008)/134,596 = .0595

d) A = 16 East Side

   P(x = 3) = (16C3)(8C3)/24C6 = (560)(56)/134,596 = .2330
5.52 n = 25   p = .20    Expected Value = µ = np = 25(.20) = 5

   µ = 25(.20) = 5    σ = √(npq) = √(25(.20)(.80)) = 2

   P(x > 12): from Table A.2, P(x = 13) = .000, and likewise for larger x.

   The values for x > 12 are so far away from the expected value that they are very
   unlikely to occur.

   P(x = 14) = 25C14(.20)^14(.80)^11 = .000063, which is very unlikely.

   If this value (x = 14) actually occurred, one would doubt the validity of the
   p = .20 figure or one would have experienced a very rare event.
5.53 λ = 2.4 calls per minute

a) P(x = 0 | λ = 2.4) = (from Table A.3) .0907

b) Can handle x ≤ 5 calls    Cannot handle x > 5 calls

   P(x > 5 | λ = 2.4) = (from Table A.3)

   x     Prob.
   6     .0241
   7     .0083
   8     .0025
   9     .0007
   10    .0002
   11    .0000
         .0358

c) P(x = 3 calls per 2 minutes):

   The interval has been increased 2 times.

   New lambda: λ = 4.8 calls per 2 minutes.

   from Table A.3: .1517

d) P(x ≤ 1 call per 15 seconds):

   The interval has been decreased to ¼ of a minute.

   New lambda: λ = 0.6 calls per 15 seconds.

   P(x ≤ 1 | λ = 0.6) = (from Table A.3)

   P(x = 1) = .3293
   P(x = 0) = .5488
              .8781
5.54 n = 160   p = .01

   Working this problem as a Poisson problem:

a) Expected number = λ = np = 160(.01) = 1.6

b) P(x ≥ 8):

   Using Table A.3:   x     Prob.
                      8     .0002
                      9     .0000
                            .0002

c) P(2 ≤ x ≤ 6):

   Using Table A.3:   x     Prob.
                      2     .2584
                      3     .1378
                      4     .0551
                      5     .0176
                      6     .0047
                            .4736
5.55 p = .005   n = 1,000

   λ = np = (1,000)(.005) = 5

a) P(x < 4) = P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3) =

   .0067 + .0337 + .0842 + .1404 = .265

b) P(x > 10) = P(x = 11) + P(x = 12) + . . . =

   .0082 + .0034 + .0013 + .0005 + .0002 = .0136

c) P(x = 0) = .0067
5.56 n = 8   p = .36   x = 0 women

   8C0(.36)^0(.64)^8 = (1)(1)(.0281475) = .0281
It is unlikely that a company would randomly hire 8 physicians from the U.S. pool
and none of them would be female. If this actually happened, figures similar to these
might be used as evidence in a lawsuit.
5.57 N = 34

a) n = 5   x = 3   A = 13

   (13C3)(21C2)/34C5 = (286)(210)/278,256 = .2158

b) n = 8   x ≤ 2   A = 5

   (5C0)(29C8)/34C8 + (5C1)(29C7)/34C8 + (5C2)(29C6)/34C8 =

   (1)(4,292,145)/18,156,204 + (5)(1,560,780)/18,156,204 + (10)(475,020)/18,156,204 =

   .2364 + .4298 + .2616 = .9278

c) n = 5   x = 2   A = 3

   5C2(3/34)^2(31/34)^3 = (10)(.0077855)(.7579636) = .0590
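Part c) replaces the hypergeometric with a binomial using p = A/N = 3/34. A quick comparison against the exact hypergeometric value shows how rough that shortcut is; a sketch using `math.comb`:

```python
from math import comb

N, A, n, x = 34, 3, 5, 2
exact = comb(A, x) * comb(N - A, n - x) / comb(N, n)     # hypergeometric
approx = comb(n, x) * (A / N)**x * (1 - A / N)**(n - x)  # binomial shortcut

print(round(exact, 4), round(approx, 4))
```

Here the binomial shortcut (.0590) overstates the exact hypergeometric probability (about .0485), because sampling is without replacement and A is small.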
5.58 N = 14   n = 4

a) P(x = 4 | N = 14, n = 4, A = 10 north side)

   (10C4)(4C0)/14C4 = (210)(1)/1,001 = .2098

b) P(x = 4 | N = 14, n = 4, A = 4 west)

   (4C4)(10C0)/14C4 = (1)(1)/1,001 = .0010

c) P(x = 2 | N = 14, n = 4, A = 4 west)

   (4C2)(10C2)/14C4 = (6)(45)/1,001 = .2697
5.59 a) λ = 3.84 per 1,000

   P(x = 0) = (3.84)^0 e^(-3.84)/0! = e^(-3.84) = .0215

b) λ = 7.68 per 2,000

   P(x = 6) = (7.68)^6 e^(-7.68)/6! = (205,195.258)(.000461975)/720 = .1317

c) λ = 1.6 per 1,000, so λ = 4.8 per 3,000

   from Table A.3:

   P(x < 7) = P(x = 0) + P(x = 1) + . . . + P(x = 6) =

   .0082 + .0395 + .0948 + .1517 + .1820 + .1747 + .1398 = .7907
5.60 This is a binomial distribution with n = 15 and p = .36.

   µ = np = 15(.36) = 5.4

   σ = √(npq) = √(15(.36)(.64)) = 1.86

   The most likely values are near the mean, 5.4. Note from the printout that the
   most probable values are at x = 5 and x = 6, which are near the mean.
5.61 This printout contains the probabilities for various values of x from zero to eleven from a
   Poisson distribution with λ = 2.78. Note that the highest probabilities are at x = 2 and
   x = 3, which are near the mean. The probability is slightly higher at x = 2 than at x = 3,
   even though x = 3 is nearer to the mean, because of the piling-up effect at x = 0.
5.62 This is a binomial distribution with n = 22 and p = .64.

   The mean is np = 22(.64) = 14.08 and the standard deviation is:

   σ = √(npq) = √(22(.64)(.36)) = 2.25

   The x value with the highest peak on the graph is at x = 14, followed by x = 15
   and x = 13, which are nearest to the mean.
5.63 This is the graph of a Poisson distribution with λ = 1.784. Note the high
   probabilities at x = 1 and x = 2, which are nearest to the mean. Note also that the
   probabilities for values of x ≥ 8 are near zero because they are so far away
   from the mean or expected value.
Chapter 6
Continuous Distributions
LEARNING OBJECTIVES
The primary objective of Chapter 6 is to help you understand continuous distributions, thereby
enabling you to:
1. Understand concepts of the uniform distribution.
2. Appreciate the importance of the normal distribution.
3. Recognize normal distribution problems and know how to solve such problems.
4. Decide when to use the normal distribution to approximate binomial distribution
problems and know how to work such problems.
5. Decide when to use the exponential distribution to solve problems in business and know
how to work such problems.
CHAPTER TEACHING STRATEGY
Chapter 5 introduced the students to discrete distributions. This chapter introduces the
students to three continuous distributions: the uniform distribution, the normal distribution, and
the exponential distribution. The normal distribution is probably the most widely known and
used distribution. The text has been prepared with the notion that the student should be able
to work many varied types of normal curve problems. Examples and practice problems are
given wherein the student is asked to solve for virtually any of the four variables in the z
equation. It is very helpful for the student to get into the habit of constructing a normal curve
diagram, with a shaded portion for the desired area of concern for each problem using the
normal distribution. Many students tend to be more visual than auditory learners, and these
diagrams will be of great assistance in problem demonstration and in problem solution.
This chapter contains a section dealing with the solution of binomial distribution
problems by the normal curve. The correction for continuity is emphasized. In this text,
the correction for continuity is always used whenever a binomial distribution problem is
worked by the normal curve. Since this is often a stumbling block for students to
comprehend, the chapter has included a table (Table 6.4) with rules of thumb as to how to
apply the correction for continuity. It should be emphasized, however, that answers for
this type of problem are still only approximations. For this reason and also in an effort to
link chapters 5 & 6, the student is sometimes asked to work binomial problems both by
methods in this chapter and also by using binomial tables (A.2). This also will allow the
student to observe how good the approximation of the normal curve is to binomial
problems.
The exponential distribution can be taught as a continuous distribution, which can be
used in complement with the Poisson distribution of chapter 5 to solve inter-arrival time
problems. The student can see that while the Poisson distribution is discrete because it
describes the probabilities of whole number possibilities per some interval, the exponential
distribution describes the probabilities associated with times that are continuously distributed.
CHAPTER OUTLINE
6.1 The Uniform Distribution
Determining Probabilities in a Uniform Distribution
Using the Computer to Solve for Uniform Distribution Probabilities
6.2 Normal Distribution
History of the Normal Distribution
Probability Density Function of the Normal Distribution
Standardized Normal Distribution
Solving Normal Curve Problems
Using the Computer to Solve for Normal Distribution Probabilities
6.3 Using the Normal Curve to Approximate Binomial Distribution Problems
Correcting for Continuity
6.4 Exponential Distribution
Probabilities of the Exponential Distribution
Using the Computer to Determine Exponential Distribution Probabilities
KEY TERMS
Correction for Continuity Standardized Normal Distribution
Exponential Distribution Uniform Distribution
Normal Distribution z Distribution
Rectangular Distribution z Score
SOLUTIONS TO PROBLEMS IN CHAPTER 6
6.1 a = 200   b = 240

a) f(x) = 1/(b - a) = 1/(240 - 200) = 1/40 = .025

b) µ = (a + b)/2 = (200 + 240)/2 = 220

   σ = (b - a)/√12 = (240 - 200)/√12 = 40/√12 = 11.547

c) P(x > 230) = (240 - 230)/(240 - 200) = 10/40 = .250

d) P(205 < x < 220) = (220 - 205)/(240 - 200) = 15/40 = .375

e) P(x < 225) = (225 - 200)/(240 - 200) = 25/40 = .625
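The uniform computations in 6.1 reduce to interval lengths over (b - a); a minimal sketch (the function names are ours):

```python
def uniform_height(a, b):
    # f(x) = 1/(b - a) on [a, b]
    return 1 / (b - a)

def uniform_prob(a, b, lo, hi):
    # P(lo <= x <= hi), clipped to the support [a, b]
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) / (b - a)

a, b = 200, 240
print(uniform_height(a, b))          # 0.025
print(uniform_prob(a, b, 230, b))    # 0.25
print(uniform_prob(a, b, 205, 220))  # 0.375
print(uniform_prob(a, b, a, 225))    # 0.625
```

Clipping to the support also handles cases like 6.2 d), where the requested interval lies entirely outside [a, b] and the probability is zero.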
6.2 a = 8   b = 21

a) f(x) = 1/(b - a) = 1/(21 - 8) = 1/13 = .0769

b) µ = (a + b)/2 = (8 + 21)/2 = 29/2 = 14.5

   σ = (b - a)/√12 = (21 - 8)/√12 = 13/√12 = 3.7528

c) P(10 < x < 17) = (17 - 10)/(21 - 8) = 7/13 = .5385

d) P(x > 22) = .0000

e) P(x > 7) = 1.0000
6.3 a = 2.80   b = 3.14

   µ = (a + b)/2 = (2.80 + 3.14)/2 = 2.97

   σ = (b - a)/√12 = (3.14 - 2.80)/√12 = 0.098

   P(3.00 < x < 3.10) = (3.10 - 3.00)/(3.14 - 2.80) = 0.2941
6.4 a = 11.97   b = 12.03

   Height = 1/(b - a) = 1/(12.03 - 11.97) = 16.667

   P(x > 12.01) = (12.03 - 12.01)/(12.03 - 11.97) = .3333

   P(11.98 < x < 12.01) = (12.01 - 11.98)/(12.03 - 11.97) = .5000
6.5 µ = 2100   a = 400   b = 3800

   σ = (b - a)/√12 = (3800 - 400)/√12 = 981.5

   Height = 1/(b - a) = 1/(3800 - 400) = .000294

   P(x > 3000) = (3800 - 3000)/(3800 - 400) = 800/3400 = .2353

   P(x > 4000) = .0000

   P(700 < x < 1500) = (1500 - 700)/(3800 - 400) = 800/3400 = .2353
6.6 a) P(z > 1.96):

   Table A.5 value for z = 1.96: .4750

   P(z > 1.96) = .5000 - .4750 = .0250

b) P(z < 0.73):

   Table A.5 value for z = 0.73: .2673

   P(z < 0.73) = .5000 + .2673 = .7673

c) P(-1.46 < z < 2.84):

   Table A.5 value for z = 2.84: .4977
   Table A.5 value for z = 1.46: .4279

   P(-1.46 < z < 2.84) = .4977 + .4279 = .9256

d) P(-2.67 < z < 1.08):

   Table A.5 value for z = -2.67: .4962
   Table A.5 value for z = 1.08: .3599

   P(-2.67 < z < 1.08) = .4962 + .3599 = .8561

e) P(-2.05 < z < -.87):

   Table A.5 value for z = -2.05: .4798
   Table A.5 value for z = -0.87: .3078

   P(-2.05 < z < -.87) = .4798 - .3078 = .1720
6.7 a) P(x < 635 | µ = 604, σ = 56.8):

   z = (x - µ)/σ = (635 - 604)/56.8 = 0.55

   Table A.5 value for z = 0.55: .2088

   P(x < 635) = .2088 + .5000 = .7088

b) P(x < 20 | µ = 48, σ = 12):

   z = (x - µ)/σ = (20 - 48)/12 = -2.33

   Table A.5 value for z = -2.33: .4901

   P(x < 20) = .5000 - .4901 = .0099

c) P(100 < x < 150 | µ = 111, σ = 33.8):

   z = (150 - 111)/33.8 = 1.15

   Table A.5 value for z = 1.15: .3749

   z = (100 - 111)/33.8 = -0.33

   Table A.5 value for z = -0.33: .1293

   P(100 < x < 150) = .3749 + .1293 = .5042

d) P(250 < x < 255 | µ = 264, σ = 10.9):

   z = (250 - 264)/10.9 = -1.28

   Table A.5 value for z = -1.28: .3997

   z = (255 - 264)/10.9 = -0.83

   Table A.5 value for z = -0.83: .2967

   P(250 < x < 255) = .3997 - .2967 = .1030

e) P(x > 35 | µ = 37, σ = 4.35):

   z = (35 - 37)/4.35 = -0.46

   Table A.5 value for z = -0.46: .1772

   P(x > 35) = .1772 + .5000 = .6772

f) P(x > 170 | µ = 156, σ = 11.4):

   z = (170 - 156)/11.4 = 1.23

   Table A.5 value for z = 1.23: .3907

   P(x > 170) = .5000 - .3907 = .1093
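The Table A.5 lookups above can be reproduced with the error function in Python's standard library; a sketch (small differences from the table arise because the table rounds z to two decimals):

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    # P(X < x) for a normal distribution, via the error function
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# 6.7 a): P(x < 635 | mu = 604, sigma = 56.8); the table answer is .7088
print(round(normal_cdf(635, 604, 56.8), 4))
# 6.7 b): P(x < 20 | mu = 48, sigma = 12); the table answer is .0099
print(round(normal_cdf(20, 48, 12), 4))
```

Tail and between-value probabilities follow by complement and subtraction, e.g. P(x > 170) = 1 - normal_cdf(170, 156, 11.4).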
6.8 µ = 22   σ = 4

a) P(x > 17):

   z = (x - µ)/σ = (17 - 22)/4 = -1.25

   area between x = 17 and µ = 22 from Table A.5 is .3944

   P(x > 17) = .3944 + .5000 = .8944

b) P(x < 13):

   z = (13 - 22)/4 = -2.25

   from Table A.5, area = .4878

   P(x < 13) = .5000 - .4878 = .0122

c) P(25 < x < 31):

   z = (31 - 22)/4 = 2.25

   from Table A.5, area = .4878

   z = (25 - 22)/4 = 0.75

   from Table A.5, area = .2734

   P(25 < x < 31) = .4878 - .2734 = .2144
6.9 µ = 60   σ = 11.35

a) P(x > 85):

   z = (x - µ)/σ = (85 - 60)/11.35 = 2.20

   from Table A.5, the value for z = 2.20 is .4861

   P(x > 85) = .5000 - .4861 = .0139

b) P(45 < x < 70):

   z = (45 - 60)/11.35 = -1.32

   z = (70 - 60)/11.35 = 0.88

   from Table A.5, the value for z = -1.32 is .4066
   and for z = 0.88 is .3106

   P(45 < x < 70) = .4066 + .3106 = .7172

c) P(65 < x < 75):

   z = (65 - 60)/11.35 = 0.44

   z = (75 - 60)/11.35 = 1.32

   from Table A.5, the value for z = 0.44 is .1700
   from Table A.5, the value for z = 1.32 is .4066

   P(65 < x < 75) = .4066 - .1700 = .2366

d) P(x < 40):

   z = (40 - 60)/11.35 = -1.76

   from Table A.5, the value for z = -1.76 is .4608

   P(x < 40) = .5000 - .4608 = .0392
6.10 µ = $1332   σ = $725

a) P(x > $2000):

   z = (x - µ)/σ = (2000 - 1332)/725 = 0.92

   from Table A.5, z = 0.92 yields: .3212

   P(x > $2000) = .5000 - .3212 = .1788

b) P(owes money) = P(x < 0):

   z = (0 - 1332)/725 = -1.84

   from Table A.5, z = -1.84 yields: .4671

   P(x < 0) = .5000 - .4671 = .0329

c) P($100 < x < $700):

   z = (100 - 1332)/725 = -1.70

   from Table A.5, z = -1.70 yields: .4554

   z = (700 - 1332)/725 = -0.87

   from Table A.5, z = -0.87 yields: .3078

   P($100 < x < $700) = .4554 - .3078 = .1476
6.11 µ = $30,000   σ = $9,000

a) P($15,000 < x < $45,000):

   z = (45,000 - 30,000)/9,000 = 1.67

   From Table A.5, z = 1.67 yields: .4525

   z = (15,000 - 30,000)/9,000 = -1.67

   From Table A.5, z = -1.67 yields: .4525

   P($15,000 < x < $45,000) = .4525 + .4525 = .9050

b) P(x > $50,000):

   z = (50,000 - 30,000)/9,000 = 2.22

   From Table A.5, z = 2.22 yields: .4868

   P(x > $50,000) = .5000 - .4868 = .0132

c) P($5,000 < x < $20,000):

   z = (5,000 - 30,000)/9,000 = -2.78

   From Table A.5, z = -2.78 yields: .4973

   z = (20,000 - 30,000)/9,000 = -1.11

   From Table A.5, z = -1.11 yields: .3665

   P($5,000 < x < $20,000) = .4973 - .3665 = .1308

d) Since 90.82% of the values are greater than x = $7,000, x = $7,000 is in the
   lower half of the distribution and .9082 - .5000 = .4082 lie between x and µ.

   From Table A.5, z = -1.33 is associated with an area of .4082.

   Solving for σ:   z = (x - µ)/σ

   -1.33 = (7,000 - 30,000)/σ

   σ = 17,293.23

e) σ = $9,000. If 79.95% of the costs are less than $33,000, x = $33,000 is in
   the upper half of the distribution and .7995 - .5000 = .2995 of the values lie
   between $33,000 and the mean.

   From Table A.5, an area of .2995 is associated with z = 0.84.

   Solving for µ:   z = (x - µ)/σ

   0.84 = (33,000 - µ)/9,000

   µ = $25,440
6.12 µ = 200, σ = 47    Determine x

a) 60% of the values are greater than x:

   Since 50% of the values are greater than the mean, µ = 200, 10% or .1000 lie
   between x and the mean. From Table A.5, the z value associated with an area
   of .1000 is z = -0.25. The z value is negative since x is below the mean.

   Substituting z = -0.25, µ = 200, and σ = 47 into the formula and solving for x:

   z = (x - µ)/σ

   -0.25 = (x - 200)/47

   x = 188.25

b) x is less than 17% of the values.

   Since x is only less than 17% of the values, 33% (.5000 - .1700) or .3300 lie
   between x and the mean. Table A.5 yields a z value of 0.95 for an area of
   .3300. Using this z = 0.95, µ = 200, and σ = 47, x can be solved for:

   0.95 = (x - 200)/47

   x = 244.65

c) 22% of the values are less than x.

   Since 22% of the values lie below x, 28% lie between x and the mean
   (.5000 - .2200). Table A.5 yields a z of -0.77 for an area of .2800. Using the z
   value of -0.77, µ = 200, and σ = 47, x can be solved for:

   -0.77 = (x - 200)/47

   x = 163.81

d) x is greater than 55% of the values.

   Since x is greater than 55% of the values, 5% (.0500) lie between x and the
   mean. From Table A.5, a z value of 0.13 is associated with an area of .05.

   Using z = 0.13, µ = 200, and σ = 47, x can be solved for:

   0.13 = (x - 200)/47

   x = 206.11
6.13 σ = 625. If 73.89% of the values are greater than 1700, then 23.89% or .2389
   lie between 1700 and the mean, µ. The z value associated with .2389 is -0.64
   since 1700 is below the mean.

   Using z = -0.64, x = 1700, and σ = 625, µ can be solved for:

   z = (x - µ)/σ

   -0.64 = (1700 - µ)/625

   µ = 2100

   µ = 2258 and σ = 625. Since 31.56% are greater than x, 18.44% or .1844
   (.5000 - .3156) lie between x and µ = 2258. From Table A.5, a z value of 0.48
   is associated with .1844 area under the normal curve.

   Using µ = 2258, σ = 625, and z = 0.48, x can be solved for:

   0.48 = (x - 2258)/625

   x = 2558
6.14 µ = 22   σ = ??

   Since 72.4% of the values are greater than 18.5, then 22.4% lie between 18.5 and µ.
   x = 18.5 is below the mean. From Table A.5, z = -0.59.

   -0.59 = (18.5 - 22)/σ

   -0.59σ = -3.5

   σ = 3.5/0.59 = 5.932
6.15 P(x < 20) = .2900

   x is less than µ because of the percentage. Between x and µ is .5000 - .2900 =
   .2100 of the area. The z score associated with this area is -0.55. Solving for µ:

   z = (x - µ)/σ

   -0.55 = (20 - µ)/4

   µ = 22.20
6.16 µ = 9.7    Since 22.45% are greater than 11.6, x = 11.6 is in the upper half of the
   distribution and .2755 (.5000 - .2245) lie between x and the mean. Table A.5 yields
   z = 0.76 for an area of .2755.

   Solving for σ:   z = (x - µ)/σ

   0.76 = (11.6 - 9.7)/σ

   σ = 2.5
6.17 a) P(x ≤ 16 | n = 30 and p = .70)

   µ = np = 30(.70) = 21

   σ = √(npq) = √(30(.70)(.30)) = 2.51

   P(x < 16.5 | µ = 21 and σ = 2.51)

b) P(10 < x ≤ 20 | n = 25 and p = .50)

   µ = np = 25(.50) = 12.5

   σ = √(npq) = √(25(.50)(.50)) = 2.5

   P(10.5 < x < 20.5 | µ = 12.5 and σ = 2.5)

c) P(x = 22 | n = 40 and p = .60)

   µ = np = 40(.60) = 24

   σ = √(npq) = √(40(.60)(.40)) = 3.10

   P(21.5 < x < 22.5 | µ = 24 and σ = 3.10)

d) P(x > 14 | n = 16 and p = .45)

   µ = np = 16(.45) = 7.2

   σ = √(npq) = √(16(.45)(.55)) = 1.99

   P(x > 14.5 | µ = 7.2 and σ = 1.99)
6.18 a) n = 8 and p = .50    µ = np = 8(.50) = 4

   σ = √(npq) = √(8(.50)(.50)) = 1.414

   µ ± 3σ = 4 ± 3(1.414) = 4 ± 4.242

   (-0.242 to 8.242) does not lie between 0 and 8.
   Do not use the normal distribution to approximate this problem.

b) n = 18 and p = .80    µ = np = 18(.80) = 14.4

   σ = √(npq) = √(18(.80)(.20)) = 1.697

   µ ± 3σ = 14.4 ± 3(1.697) = 14.4 ± 5.091

   (9.309 to 19.491) does not lie between 0 and 18.
   Do not use the normal distribution to approximate this problem.

c) n = 12 and p = .30    µ = np = 12(.30) = 3.6

   σ = √(npq) = √(12(.30)(.70)) = 1.587

   µ ± 3σ = 3.6 ± 3(1.587) = 3.6 ± 4.761

   (-1.161 to 8.361) does not lie between 0 and 12.
   Do not use the normal distribution to approximate this problem.

d) n = 30 and p = .75    µ = np = 30(.75) = 22.5

   σ = √(npq) = √(30(.75)(.25)) = 2.37

   µ ± 3σ = 22.5 ± 3(2.37) = 22.5 ± 7.11

   (15.39 to 29.61) does lie between 0 and 30.
   The problem can be approximated by the normal curve.

e) n = 14 and p = .50    µ = np = 14(.50) = 7

   σ = √(npq) = √(14(.50)(.50)) = 1.87

   µ ± 3σ = 7 ± 3(1.87) = 7 ± 5.61

   (1.39 to 12.61) does lie between 0 and 14.
   The problem can be approximated by the normal curve.
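The µ ± 3σ screen applied five times in 6.18 is easy to automate; a sketch (the function name is ours):

```python
from math import sqrt

def normal_approx_ok(n, p):
    # rule of thumb: mu +/- 3*sigma must lie inside [0, n]
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return mu - 3 * sigma >= 0 and mu + 3 * sigma <= n

for n, p in [(8, .50), (18, .80), (12, .30), (30, .75), (14, .50)]:
    print(n, p, normal_approx_ok(n, p))
```

The five results come out False, False, False, True, True, matching parts a) through e).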
6.19 a) P(x = 8 | n = 25 and p = .40)    µ = np = 25(.40) = 10

   σ = √(npq) = √(25(.40)(.60)) = 2.449

   µ ± 3σ = 10 ± 3(2.449) = 10 ± 7.347

   (2.653 to 17.347) lies between 0 and 25.
   Approximation by the normal curve is sufficient.

   P(7.5 < x < 8.5 | µ = 10 and σ = 2.449):

   z = (7.5 - 10)/2.449 = -1.02    From Table A.5, area = .3461

   z = (8.5 - 10)/2.449 = -0.61    From Table A.5, area = .2291

   P(7.5 < x < 8.5) = .3461 - .2291 = .1170

   From Table A.2 (binomial tables) = .120

b) P(x ≥ 13 | n = 20 and p = .60)    µ = np = 20(.60) = 12

   σ = √(npq) = √(20(.60)(.40)) = 2.19

   µ ± 3σ = 12 ± 3(2.19) = 12 ± 6.57

   (5.43 to 18.57) lies between 0 and 20.
   Approximation by the normal curve is sufficient.

   P(x > 12.5 | µ = 12 and σ = 2.19):

   z = (12.5 - 12)/2.19 = 0.23

   From Table A.5, area = .0910

   P(x > 12.5) = .5000 - .0910 = .4090

   From Table A.2 (binomial tables) = .415

c) P(x = 7 | n = 15 and p = .50)    µ = np = 15(.50) = 7.5

   σ = √(npq) = √(15(.50)(.50)) = 1.9365

   µ ± 3σ = 7.5 ± 3(1.9365) = 7.5 ± 5.81

   (1.69 to 13.31) lies between 0 and 15.
   Approximation by the normal curve is sufficient.

   P(6.5 < x < 7.5 | µ = 7.5 and σ = 1.9365):

   z = (6.5 - 7.5)/1.9365 = -0.52

   From Table A.5, area = .1985

   From Table A.2 (binomial tables) = .196

d) P(x < 3 | n = 10 and p = .70):    µ = np = 10(.70) = 7

   σ = √(npq) = √(10(.70)(.30)) = 1.449

   µ ± 3σ = 7 ± 3(1.449) = 7 ± 4.347

   (2.653 to 11.347) does not lie between 0 and 10.
   The normal curve is not a good approximation to this problem.
6.20 P(x < 40 | n = 120 and p = .37):    µ = np = 120(.37) = 44.4

   σ = √(npq) = √(120(.37)(.63)) = 5.29

   µ ± 3σ = 28.53 to 60.27 does lie between 0 and 120.
   It is okay to use the normal distribution to approximate this problem.

   Correcting for continuity: x = 39.5

   z = (39.5 - 44.4)/5.29 = -0.93

   from Table A.5, the area for z = -0.93 is .3238

   P(x < 40) = .5000 - .3238 = .1762
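The whole 6.20 pipeline — convert to µ and σ, apply the continuity correction, then evaluate the normal probability — can be sketched as follows (erf replaces Table A.5, so the last digit can differ from the table answer):

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n, p = 120, .37
mu = n * p                     # 44.4
sigma = sqrt(n * p * (1 - p))  # about 5.29

# P(x < 40) for the binomial, corrected for continuity to x = 39.5
print(round(normal_cdf(39.5, mu, sigma), 4))  # near the table answer .1762
```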
6.21 n = 70, p = .59    P(x < 35):

   Converting to the normal dist.:

   µ = np = 70(.59) = 41.3 and σ = √(npq) = √(70(.59)(.41)) = 4.115

   Test for normalcy:

   0 ≤ µ ± 3σ ≤ n:   41.3 ± 3(4.115) gives 28.955 to 53.645, which lies
   between 0 and 70, so the test is passed.

   Correcting for continuity, use x = 34.5:

   z = (34.5 - 41.3)/4.115 = -1.65

   from Table A.5, area = .4505

   P(x < 35) = .5000 - .4505 = .0495
6.22 For parts a) and b), n = 300   p = .53

   µ = np = 300(.53) = 159

   σ = √(npq) = √(300(.53)(.47)) = 8.645

   Test: µ ± 3σ = 159 ± 3(8.645) = 133.065 to 184.935,
   which lies between 0 and 300. It is okay to use the normal distribution as an
   approximation on parts a) and b).

a) P(x > 175 transmission)

   correcting for continuity: x = 175.5

   z = (175.5 - 159)/8.645 = 1.91

   from A.5, the area for z = 1.91 is .4719

   P(x > 175) = .5000 - .4719 = .0281

b) P(165 < x < 170)

   correcting for continuity: x = 164.5; x = 170.5

   z = (170.5 - 159)/8.645 = 1.33 and z = (164.5 - 159)/8.645 = 0.64

   from A.5, the area for z = 1.33 is .4082
   the area for z = 0.64 is .2389

   P(165 < x < 170) = .4082 - .2389 = .1693

   For parts c) and d): n = 300   p = .60

   µ = np = 300(.60) = 180    σ = √(npq) = √(300(.60)(.40)) = 8.485

   Test: µ ± 3σ = 180 ± 3(8.485) = 180 ± 25.455

   154.545 to 205.455 lies between 0 and 300.
   It is okay to use the normal distribution to approximate c) and d).

c) P(155 < x < 170 personnel):

   correcting for continuity: x = 154.5; x = 170.5

   z = (170.5 - 180)/8.485 = -1.12 and z = (154.5 - 180)/8.485 = -3.01

   from A.5, the area for z = -1.12 is .3686
   the area for z = -3.01 is .4987

   P(155 < x < 170) = .4987 - .3686 = .1301

d) P(x < 200 personnel):

   correcting for continuity: x = 199.5

   z = (199.5 - 180)/8.485 = 2.30

   from A.5, the area for z = 2.30 is .4893

   P(x < 200) = .5000 + .4893 = .9893
6.23 p = .25   n = 130

   Conversion to normal dist.: µ = np = 130(.25) = 32.5

   σ = √(npq) = √(130(.25)(.75)) = 4.94

a) P(x > 36):    correct for continuity: x = 36.5

   z = (36.5 - 32.5)/4.94 = 0.81

   from Table A.5, area = .2910

   P(x > 36) = .5000 - .2910 = .2090

b) P(26 < x < 35):    correct for continuity: 25.5 to 35.5

   z = (25.5 - 32.5)/4.94 = -1.42 and z = (35.5 - 32.5)/4.94 = 0.61

   from Table A.5, area for z = -1.42 is .4222
   area for z = 0.61 is .2291

   P(26 < x < 35) = .4222 + .2291 = .6513

c) P(x < 20):    correct for continuity: x = 19.5

   z = (19.5 - 32.5)/4.94 = -2.63

   from Table A.5, area for z = -2.63 is .4957

   P(x < 20) = .5000 - .4957 = .0043

d) P(x = 30):    correct for continuity: 29.5 to 30.5

   z = (29.5 - 32.5)/4.94 = -0.61 and z = (30.5 - 32.5)/4.94 = -0.40

   from Table A.5, area for z = -0.61 is .2291
   area for z = -0.40 is .1554

   P(x = 30) = .2291 - .1554 = .0737
6.24 n = 95

a) P(44 ≤ x ≤ 52) agree with direct investments, p = .52

   By the normal distribution: µ = np = 95(.52) = 49.4

   σ = √(npq) = √(95(.52)(.48)) = 4.87

   test: µ ± 3σ = 49.4 ± 3(4.87) = 49.4 ± 14.61

   0 < 34.79 to 64.01 < 95    test passed

   z = (43.5 - 49.4)/4.87 = -1.21

   from Table A.5, area = .3869

   z = (52.5 - 49.4)/4.87 = 0.64

   from Table A.5, area = .2389

   P(44 ≤ x ≤ 52) = .3869 + .2389 = .6258

b) P(x > 56):

   correcting for continuity, x = 56.5

   z = (56.5 - 49.4)/4.87 = 1.46

   from Table A.5, area = .4279

   P(x > 56) = .5000 - .4279 = .0721

c) Joint venture: p = .70, n = 95

   By the normal dist.: µ = np = 95(.70) = 66.5

   σ = √(npq) = √(95(.70)(.30)) = 4.47

   test for normalcy: 66.5 ± 3(4.47) = 66.5 ± 13.41

   0 < 53.09 to 79.91 < 95    test passed

   P(x < 60):

   correcting for continuity: x = 59.5

   z = (59.5 - 66.5)/4.47 = -1.57

   from Table A.5, area = .4418

   P(x < 60) = .5000 - .4418 = .0582

d) P(55 ≤ x ≤ 62):

   correcting for continuity: 54.5 to 62.5

   z = (54.5 - 66.5)/4.47 = -2.68

   from Table A.5, area = .4963

   z = (62.5 - 66.5)/4.47 = -0.89

   from Table A.5, area = .3133

   P(55 ≤ x ≤ 62) = .4963 - .3133 = .1830
6.25 a) λ = 0.1

   x0    y
   0     .1000
   1     .0905
   2     .0819
   3     .0741
   4     .0670
   5     .0607
   6     .0549
   7     .0497
   8     .0449
   9     .0407
   10    .0368

b) λ = 0.3

   x0    y
   0     .3000
   1     .2222
   2     .1646
   3     .1220
   4     .0904
   5     .0669
   6     .0496
   7     .0367
   8     .0272
   9     .0202

c) λ = 0.8

   x0    y
   0     .8000
   1     .3595
   2     .1615
   3     .0726
   4     .0326
   5     .0147
   6     .0066
   7     .0030
   8     .0013
   9     .0006

d) λ = 3.0

   x0    y
   0     3.0000
   1     .1494
   2     .0074
   3     .0004
   4     .0000
   5     .0000
6.26 a) λ = 3.25

   µ = 1/λ = 1/3.25 = 0.31

   σ = 1/λ = 1/3.25 = 0.31

b) λ = 0.7

   µ = 1/λ = 1/0.7 = 1.43

   σ = 1/λ = 1/0.7 = 1.43

c) λ = 1.1

   µ = 1/λ = 1/1.1 = 0.91

   σ = 1/λ = 1/1.1 = 0.91

d) λ = 6.0

   µ = 1/λ = 1/6 = 0.17

   σ = 1/λ = 1/6 = 0.17
6.27 a) P(x ≥ 5 | λ = 1.35) =

   for x0 = 5:   P(x) = e^(-λx0) = e^(-1.35(5)) = e^(-6.75) = .0012

b) P(x < 3 | λ = 0.68) = 1 - P(x ≥ 3 | λ = .68) =

   for x0 = 3:   1 - e^(-λx0) = 1 - e^(-0.68(3)) = 1 - e^(-2.04) = 1 - .1300 = .8700

c) P(x > 4 | λ = 1.7) =

   for x0 = 4:   P(x) = e^(-λx0) = e^(-1.7(4)) = e^(-6.8) = .0011

d) P(x < 6 | λ = 0.80) = 1 - P(x ≥ 6 | λ = 0.80) =

   for x0 = 6:   P(x) = 1 - e^(-λx0) = 1 - e^(-0.80(6)) = 1 - e^(-4.8) = 1 - .0082 = .9918
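Each exponential tail probability in 6.27 reduces to a single exponentiation; a sketch (the function name is ours):

```python
from math import exp

def expon_tail(lam, x0):
    # P(x > x0) for an exponential distribution with rate lam
    return exp(-lam * x0)

print(round(expon_tail(1.35, 5), 4))      # part a), answer .0012
print(round(1 - expon_tail(0.68, 3), 4))  # part b), answer .8700
print(round(expon_tail(1.7, 4), 4))       # part c), answer .0011
print(round(1 - expon_tail(0.80, 6), 4))  # part d), answer .9918
```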
6.28 µ = 23 sec.

   λ = 1/µ = 1/23 = .0435 per second

a) P(x > 1 min | λ = .0435/sec.)

   Change λ to minutes: λ = .0435(60) = 2.61 per minute

   P(x > 1 min | λ = 2.61/min) =

   for x0 = 1:   P(x) = e^(-λx0) = e^(-2.61(1)) = .0735

b) λ = .0435/sec

   Change λ to minutes: λ = (.0435)(60) = 2.61 per minute

   P(x > 3 | λ = 2.61/min) =

   for x0 = 3:   P(x) = e^(-λx0) = e^(-2.61(3)) = e^(-7.83) = .0004
6.29 λ = 2.44/min.

a) P(x > 10 min | λ = 2.44/min) =

   Let x0 = 10:   e^(-λx0) = e^(-2.44(10)) = e^(-24.4) = .0000

b) P(x > 5 min | λ = 2.44/min) =

   Let x0 = 5:   e^(-λx0) = e^(-2.44(5)) = e^(-12.20) = .0000

c) P(x > 1 min | λ = 2.44/min) =

   Let x0 = 1:   e^(-λx0) = e^(-2.44(1)) = e^(-2.44) = .0872

d) Expected time = µ = 1/λ = 1/2.44 = .41 min.

   For λ = 3.39 per 1,000 passengers:

   µ = 1/λ = 1/3.39 = 0.295, and (0.295)(1,000) = 295

   P(x > 500):

   Let x0 = 500/1,000 passengers = .5

   e^(-λx0) = e^(-3.39(.5)) = e^(-1.695) = .1836

   P(x < 200):

   Let x0 = 200/1,000 passengers = .2

   e^(-λx0) = e^(-3.39(.2)) = e^(-.678) = .5076

   P(x < 200) = 1 - .5076 = .4924
227
6.32 μ = 20 years
λ = 1/20 = .05/year

x0     P(x > x0) = e^(-λx0)
1      .9512
2      .9048
3      .8607

If the foundation is guaranteed for 2 years, based on past history, 90.48% of the
foundations will last at least 2 years without major repair and only 9.52% will
require a major repair before 2 years.
6.33 λ = 2/month

Average time between rains = μ = 1/λ = 1/2 month = 15 days
σ = μ = 15 days

P(x < 2 days | λ = 2/month):
Change to days: λ = 2/30 = .067/day
P(x < 2 days | λ = .067/day) = 1 - P(x > 2 days | λ = .067/day)
Let x0 = 2: 1 - e^(-λx0) = 1 - e^(-.067(2)) = 1 - .8746 = .1254
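The key move in 6.33 (and in 6.28) is rescaling λ to the units of the question before applying the exponential formula. A checking sketch in Python, following the manual's rounding of λ to .067 (variable names are ours):

```python
import math

lam_per_month = 2.0
# rescale the rate to per-day, rounding to .067 as the manual does
lam_per_day = round(lam_per_month / 30, 3)
p_less_2_days = 1 - math.exp(-lam_per_day * 2)
print(round(p_less_2_days, 4))   # .1254, matching the manual
```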
6.34 a = 6    b = 14

f(x) = 1/(b - a) = 1/(14 - 6) = 1/8 = .125

μ = (a + b)/2 = (6 + 14)/2 = 10

σ = (b - a)/√12 = (14 - 6)/√12 = 8/√12 = 2.309

P(x > 11) = (14 - 11)/(14 - 6) = 3/8 = .375

P(7 < x < 12) = (12 - 7)/(14 - 6) = 5/8 = .625
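The uniform-distribution quantities in 6.34 come from three small formulas: f(x) = 1/(b - a), μ = (a + b)/2, and σ = (b - a)/√12, with interval probabilities being widths divided by (b - a). A Python sketch (the function name is ours, not part of the manual):

```python
import math

a, b = 6.0, 14.0
height = 1 / (b - a)              # f(x) = .125
mean = (a + b) / 2                # 10
sd = (b - a) / math.sqrt(12)      # 2.309

def uniform_prob(lo, hi, a, b):
    # P(lo <= X <= hi) for X ~ Uniform(a, b): overlapping width over total width
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)

print(height, mean, round(sd, 3))   # 0.125 10.0 2.309
print(uniform_prob(11, b, a, b))    # 0.375
print(uniform_prob(7, 12, a, b))    # 0.625
```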
6.35 a) P(x < 21 | μ = 25 and σ = 4):
z = (x - μ)/σ = (21 - 25)/4 = -1.00
From Table A.5, area = .3413
P(x < 21) = .5000 - .3413 = .1587

b) P(x > 77 | μ = 50 and σ = 9):
z = (77 - 50)/9 = 3.00
From Table A.5, area = .4987
P(x > 77) = .5000 - .4987 = .0013

c) P(x > 47 | μ = 50 and σ = 6):
z = (47 - 50)/6 = -0.50
From Table A.5, area = .1915
P(x > 47) = .5000 + .1915 = .6915

d) P(13 < x < 29 | μ = 23 and σ = 4):
z = (13 - 23)/4 = -2.50
From Table A.5, area = .4938
z = (29 - 23)/4 = 1.50
From Table A.5, area = .4332
P(13 < x < 29) = .4938 + .4332 = .9270

e) P(x > 105 | μ = 90 and σ = 2.86):
z = (105 - 90)/2.86 = 5.24
From Table A.5, area = .5000
P(x > 105) = .5000 - .5000 = .0000
6.36 a) P(x = 12 | n = 25 and p = .60):
μ = np = 25(.60) = 15
σ = √(npq) = √(25(.60)(.40)) = 2.45
μ ± 3σ = 15 ± 3(2.45) = 15 ± 7.35
(7.65 to 22.35) lies between 0 and 25.
The normal curve approximation is sufficient.
P(11.5 < x < 12.5 | μ = 15 and σ = 2.45):
z = (11.5 - 15)/2.45 = -1.43    From Table A.5, area = .4236
z = (12.5 - 15)/2.45 = -1.02    From Table A.5, area = .3461
P(11.5 < x < 12.5) = .4236 - .3461 = .0775
From Table A.2, P(x = 12) = .076

b) P(x > 5 | n = 15 and p = .50):
μ = np = 15(.50) = 7.5
σ = √(npq) = √(15(.50)(.50)) = 1.94
μ ± 3σ = 7.5 ± 3(1.94) = 7.5 ± 5.82
(1.68 to 13.32) lies between 0 and 15.
The normal curve approximation is sufficient.
P(x > 5.5 | μ = 7.5 and σ = 1.94):
z = (5.5 - 7.5)/1.94 = -1.03
From Table A.5, area = .3485
P(x > 5.5) = .5000 + .3485 = .8485
Using Table A.2, P(x > 5) = .849

c) P(x < 3 | n = 10 and p = .50):
μ = np = 10(.50) = 5
σ = √(npq) = √(10(.50)(.50)) = 1.58
μ ± 3σ = 5 ± 3(1.58) = 5 ± 4.74
(0.26 to 9.74) lies between 0 and 10.
The normal curve approximation is sufficient.
P(x < 3.5 | μ = 5 and σ = 1.58):
z = (3.5 - 5)/1.58 = -0.95
From Table A.5, area = .3289
P(x < 3.5) = .5000 - .3289 = .1711

d) P(x > 8 | n = 15 and p = .40):
μ = np = 15(.40) = 6
σ = √(npq) = √(15(.40)(.60)) = 1.90
μ ± 3σ = 6 ± 3(1.90) = 6 ± 5.7
(0.3 to 11.7) lies between 0 and 15.
The normal curve approximation is sufficient.
P(x > 7.5 | μ = 6 and σ = 1.9):
z = (7.5 - 6)/1.9 = 0.79
From Table A.5, area = .2852
P(x > 7.5) = .5000 - .2852 = .2148
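All four parts of 6.36 follow the same recipe: compute μ = np and σ = √(npq), check that μ ± 3σ stays inside [0, n], then apply the continuity correction. A sketch using Python's statistics.NormalDist, which gives exact normal areas rather than Table A.5 lookups, so the last digit can differ slightly from the manual; the function name is ours:

```python
import math
from statistics import NormalDist

def binom_normal_approx(n, p, lo, hi):
    # P(lo <= X <= hi) for X ~ Binomial(n, p), via the normal curve
    # with the continuity correction used in 6.36
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    nd = NormalDist(mu, sigma)
    return nd.cdf(hi + 0.5) - nd.cdf(lo - 0.5)

# 6.36 a): P(x = 12 | n = 25, p = .60); the manual gets .0775 from
# Table A.5, and the exact binomial value from Table A.2 is .076
print(round(binom_normal_approx(25, .60, 12, 12), 4))
```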
6.37 a) P(x > 3 | λ = 1.3):
Let x0 = 3
P(x > 3 | λ = 1.3) = e^(-λx0) = e^(-1.3(3)) = e^(-3.9) = .0202

b) P(x < 2 | λ = 2.0):
Let x0 = 2
P(x < 2 | λ = 2.0) = 1 - P(x > 2 | λ = 2.0) = 1 - e^(-λx0) = 1 - e^(-2(2)) = 1 - e^(-4) = 1 - .0183 = .9817

c) P(1 < x < 3 | λ = 1.65):
P(x > 1 | λ = 1.65): Let x0 = 1
e^(-λx0) = e^(-1.65(1)) = e^(-1.65) = .1920
P(x > 3 | λ = 1.65): Let x0 = 3
e^(-λx0) = e^(-1.65(3)) = e^(-4.95) = .0071
P(1 < x < 3) = P(x > 1) - P(x > 3) = .1920 - .0071 = .1849

d) P(x > 2 | λ = 0.405):
Let x0 = 2
e^(-λx0) = e^(-(.405)(2)) = e^(-.81) = .4449
6.38 μ = 43.4

12% of the values are more than 48, so x = 48.
Area between x and μ is .5000 - .1200 = .3800
The z associated with an area of .3800 is z = 1.175
Solving for σ:
z = (x - μ)/σ
1.175 = (48 - 43.4)/σ
σ = 4.6/1.175 = 3.915
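Problems like 6.38 run the z formula backwards: look up the z that corresponds to the stated tail area, then solve z = (x - μ)/σ for the unknown. The same lookup can be done with inv_cdf from Python's statistics module; this is a verification sketch, where the manual's table lookup gives z = 1.175:

```python
from statistics import NormalDist

mu, x, upper_tail = 43.4, 48.0, 0.12
# z leaving 12% in the upper tail (i.e., 88% below)
z = NormalDist().inv_cdf(1 - upper_tail)
sigma = (x - mu) / z
print(round(z, 3), round(sigma, 3))   # about 1.175 and 3.915
```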
6.39 p = 1/5 = .20    n = 150

P(x > 50):
μ = np = 150(.20) = 30
σ = √(npq) = √(150(.20)(.80)) = 4.899
z = (50.5 - 30)/4.899 = 4.18
Area associated with z = 4.18 is .5000
P(x > 50) = .5000 - .5000 = .0000
6.40 λ = 1 customer/20 minutes
μ = 1/λ = 1 twenty-minute interval = 20 minutes

a) 1-hour interval
x0 = 3 because 1 hour = 3 (20-minute intervals)
P(x > x0) = e^(-λx0) = e^(-1(3)) = e^(-3) = .0498

b) 10 to 30 minutes
x0 = .5 and x0 = 1.5
P(x > .5) = e^(-λx0) = e^(-1(.5)) = e^(-.5) = .6065
P(x > 1.5) = e^(-λx0) = e^(-1(1.5)) = e^(-1.5) = .2231
P(10 to 30 minutes) = .6065 - .2231 = .3834

c) less than 5 minutes
x0 = 5/20 = .25
P(x > .25) = e^(-λx0) = e^(-1(.25)) = e^(-.25) = .7788
P(x < .25) = 1 - .7788 = .2212
6.41 μ = 90.28    σ = 8.53

P(x < 80):
z = (80 - 90.28)/8.53 = -1.21
from Table A.5, area for z = -1.21 is .3869
P(x < 80) = .5000 - .3869 = .1131

P(x > 95):
z = (95 - 90.28)/8.53 = 0.55
from Table A.5, area for z = 0.55 is .2088
P(x > 95) = .5000 - .2088 = .2912

P(83 < x < 87):
z = (83 - 90.28)/8.53 = -0.85
z = (87 - 90.28)/8.53 = -0.38
from Table A.5, area for z = -0.85 is .3023
area for z = -0.38 is .1480
P(83 < x < 87) = .3023 - .1480 = .1543
6.42 σ = 83

Since only 3% = .0300 of the values are greater than 2,655 (million), x = 2655
lies in the upper tail of the distribution, and .5000 - .0300 = .4700 of the
values lie between 2655 and the mean.
Table A.5 yields z = 1.88 for an area of .4700.
Using z = 1.88, x = 2655, and σ = 83, μ can be solved for:
z = (x - μ)/σ
1.88 = (2655 - μ)/83
μ = 2498.96 million
6.43 a = 18    b = 65

P(25 < x < 50) = (50 - 25)/(65 - 18) = 25/47 = .5319

μ = (a + b)/2 = (18 + 65)/2 = 41.5

f(x) = 1/(b - a) = 1/(65 - 18) = 1/47 = .0213
6.44 λ = 1.8 per 15 seconds

a) μ = 1/λ = 1/1.8 = 0.556 (15-second intervals)

μ = 951    σ = 96

a) P(x > 1000):
z = (x - μ)/σ = (1000 - 951)/96 = 0.51
from Table A.5, the area for z = 0.51 is .1950
P(x > 1000) = .5000 - .1950 = .3050

b) P(900 < x < 1100):
z = (900 - 951)/96 = -0.53
z = (1100 - 951)/96 = 1.55
from Table A.5, the area for z = -0.53 is .2019
the area for z = 1.55 is .4394
P(900 < x < 1100) = .2019 + .4394 = .6413

c) P(825 < x < 925):
z = (825 - 951)/96 = -1.31
z = (925 - 951)/96 = -0.27
from Table A.5, the area for z = -1.31 is .4049
the area for z = -0.27 is .1064
P(825 < x < 925) = .4049 - .1064 = .2985

d) P(x < 700):
z = (700 - 951)/96 = -2.61
from Table A.5, the area for z = -2.61 is .4955
P(x < 700) = .5000 - .4955 = .0045
6.46 n = 60    p = .24

μ = np = 60(.24) = 14.4
σ = √(npq) = √(60(.24)(.76)) = 3.308

test: μ ± 3σ = 14.4 ± 3(3.308) = 14.4 ± 9.924 = 4.476 and 24.324
Since 4.476 to 24.324 lies between 0 and 60, the normal distribution can be used
to approximate this problem.

P(x > 17):
correcting for continuity: x = 16.5
z = (16.5 - 14.4)/3.308 = 0.63
from Table A.5, the area for z = 0.63 is .2357
P(x > 17) = .5000 - .2357 = .2643

P(x > 22):
correcting for continuity: x = 22.5
z = (22.5 - 14.4)/3.308 = 2.45
from Table A.5, the area for z = 2.45 is .4929
P(x > 22) = .5000 - .4929 = .0071

P(8 < x < 12):
correcting for continuity: x = 7.5 and x = 12.5
z = (12.5 - 14.4)/3.308 = -0.57    z = (7.5 - 14.4)/3.308 = -2.09
from Table A.5, the area for z = -0.57 is .2157
the area for z = -2.09 is .4817
P(8 < x < 12) = .4817 - .2157 = .2660
6.47 μ = 45,970    σ = 4,246

a) P(x > 50,000):
z = (50,000 - 45,970)/4,246 = 0.95
from Table A.5, the area for z = 0.95 is .3289
P(x > 50,000) = .5000 - .3289 = .1711

b) P(x < 40,000):
z = (40,000 - 45,970)/4,246 = -1.41
from Table A.5, the area for z = -1.41 is .4207
P(x < 40,000) = .5000 - .4207 = .0793

c) P(x > 35,000):
z = (35,000 - 45,970)/4,246 = -2.58
from Table A.5, the area for z = -2.58 is .4951
P(x > 35,000) = .5000 + .4951 = .9951

d) P(39,000 < x < 47,000):
z = (39,000 - 45,970)/4,246 = -1.64
z = (47,000 - 45,970)/4,246 = 0.24
from Table A.5, the area for z = -1.64 is .4495
the area for z = 0.24 is .0948
P(39,000 < x < 47,000) = .4495 + .0948 = .5443
6.48 μ = 9 minutes
λ = 1/μ = .1111/minute = .1111(60)/hour = 6.67/hour

P(x < 5 minutes | λ = .1111/minute) = 1 - P(x > 5 minutes | λ = .1111/minute):
Let x0 = 5
P(x > 5 minutes | λ = .1111/minute) = e^(-λx0) = e^(-.1111(5)) = e^(-.5555) = .5738
P(x < 5 minutes) = 1 - P(x > 5 minutes) = 1 - .5738 = .4262
6.49 μ = 88    σ = 6.4

a) P(x < 70):
z = (70 - 88)/6.4 = -2.81
From Table A.5, area = .4975
P(x < 70) = .5000 - .4975 = .0025

b) P(x > 80):
z = (80 - 88)/6.4 = -1.25
From Table A.5, area = .3944
P(x > 80) = .5000 + .3944 = .8944

c) P(90 < x < 100):
z = (100 - 88)/6.4 = 1.88
From Table A.5, area = .4699
z = (90 - 88)/6.4 = 0.31
From Table A.5, area = .1217
P(90 < x < 100) = .4699 - .1217 = .3482
6.50 n = 200, p = .81

expected number = μ = np = 200(.81) = 162
σ = √(npq) = √(200(.81)(.19)) = 5.548

μ ± 3σ = 162 ± 3(5.548) lies between 0 and 200, so the normalcy test is passed.

P(150 < x < 155):
correcting for continuity: 150.5 to 154.5
z = (150.5 - 162)/5.548 = -2.07
from Table A.5, area = .4808
z = (154.5 - 162)/5.548 = -1.35
from Table A.5, area = .4115
P(150 < x < 155) = .4808 - .4115 = .0693

P(x > 158):
correcting for continuity: x = 158.5
z = (158.5 - 162)/5.548 = -0.63
from Table A.5, area = .2357
P(x > 158) = .2357 + .5000 = .7357

P(x < 144):
correcting for continuity: x = 143.5
z = (143.5 - 162)/5.548 = -3.33
from Table A.5, area = .4996
P(x < 144) = .5000 - .4996 = .0004
6.51 n = 150    p = .75

μ = np = 150(.75) = 112.5
σ = √(npq) = √(150(.75)(.25)) = 5.3033

a) P(x < 105):
correcting for continuity: x = 104.5
z = (104.5 - 112.5)/5.3033 = -1.51
from Table A.5, the area for z = -1.51 is .4345
P(x < 105) = .5000 - .4345 = .0655

b) P(110 < x < 120):
correcting for continuity: x = 109.5 and x = 120.5
z = (109.5 - 112.5)/5.3033 = -0.57
z = (120.5 - 112.5)/5.3033 = 1.51
from Table A.5, the area for z = -0.57 is .2157
the area for z = 1.51 is .4345
P(110 < x < 120) = .2157 + .4345 = .6502

c) P(x > 95):
correcting for continuity: x = 95.5
z = (95.5 - 112.5)/5.3033 = -3.21
from Table A.5, the area for z = -3.21 is .4993
P(x > 95) = .5000 + .4993 = .9993
6.52 μ = (a + b)/2 = 2.165
a + b = 2(2.165) = 4.33
b = 4.33 - a

Height = 1/(b - a) = 0.862
1 = 0.862b - 0.862a
Substituting b from above: 1 = 0.862(4.33 - a) - 0.862a
1 = 3.73246 - 0.862a - 0.862a
1 = 3.73246 - 1.724a
1.724a = 2.73246
a = 1.585 and b = 4.33 - 1.585 = 2.745
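6.52 is two linear equations in a and b: (a + b)/2 = 2.165 and 1/(b - a) = 0.862. They can be solved directly in Python; this is a checking sketch, and the variable names are ours:

```python
mean, height = 2.165, 0.862
width = 1 / height            # b - a, from the height equation
total = 2 * mean              # a + b, from the mean equation
a = (total - width) / 2
b = (total + width) / 2
print(round(a, 3), round(b, 3))   # 1.585 2.745
```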
6.53 μ = 85,200

60% are between 75,600 and 94,800.
94,800 - 85,200 = 9,600
75,600 - 85,200 = -9,600
The 60% can be split into 30% and 30% because the two x values are an equal
distance from the mean.
The z value associated with an area of .3000 is 0.84.
z = (x - μ)/σ
.84 = (94,800 - 85,200)/σ
σ = 11,428.57
6.54 n = 75    p = .81 (prices)    p = .44 (products)

a) Expected value: μ1 = np = 75(.81) = 60.75 seeking price information
σ1 = √(npq) = √(75(.81)(.19)) = 3.3974

b) Expected value: μ2 = np = 75(.44) = 33
σ2 = √(npq) = √(75(.44)(.56)) = 4.2988

Tests: μ ± 3σ = 60.75 ± 3(3.397) = 60.75 ± 10.191
50.559 to 70.941 lies between 0 and 75. It is okay to use the normal
distribution to approximate this problem.

μ ± 3σ = 33 ± 3(4.299) = 33 ± 12.897
20.103 to 45.897 lies between 0 and 75. It is okay to use the normal
distribution to approximate this problem.

c) P(x > 67 prices):
correcting for continuity: x = 66.5
z = (66.5 - 60.75)/3.3974 = 1.69
from Table A.5, the area for z = 1.69 is .4545
P(x > 67 prices) = .5000 - .4545 = .0455

d) P(x < 23 products):
correcting for continuity: x = 22.5
z = (22.5 - 33)/4.2988 = -2.44
from Table A.5, the area for z = -2.44 is .4927
P(x < 23) = .5000 - .4927 = .0073
6.55 λ = 3 hurricanes per 5 months

P(x > 1 month | λ = 3 hurricanes per 5 months):
Since x and λ are for different intervals, change λ to
λ = 3/5 months = 0.6 per month.
P(x > 1 month | λ = 0.6 per month):
Let x0 = 1
P(x > 1) = e^(-λx0) = e^(-0.6(1)) = e^(-0.6) = .5488

P(x < 2 weeks): 2 weeks = 0.5 month.
P(x < 0.5 month | λ = 0.6 per month) = 1 - P(x > 0.5 month | λ = 0.6 per month)
P(x > 0.5 month | λ = 0.6 per month):
Let x0 = 0.5
P(x > 0.5) = e^(-λx0) = e^(-0.6(.5)) = e^(-0.30) = .7408
P(x < 0.5 month) = 1 - P(x > 0.5 month) = 1 - .7408 = .2592

Average time = expected time = μ = 1/λ = 1/0.6 = 1.67 months
6.56 n = 50    p = .80

μ = np = 50(.80) = 40
σ = √(npq) = √(50(.80)(.20)) = 2.828

Test: μ ± 3σ = 40 ± 3(2.828) = 40 ± 8.484
31.516 to 48.484 lies between 0 and 50.
It is okay to use the normal distribution to approximate this binomial problem.

P(x < 35): correcting for continuity: x = 34.5
z = (34.5 - 40)/2.828 = -1.94
from Table A.5, the area for z = -1.94 is .4738
P(x < 35) = .5000 - .4738 = .0262

The expected value = μ = 40

P(42 < x < 47):
correcting for continuity: x = 41.5 and x = 47.5
z = (41.5 - 40)/2.828 = 0.53    z = (47.5 - 40)/2.828 = 2.65
from Table A.5, the area for z = 0.53 is .2019
the area for z = 2.65 is .4960
P(42 < x < 47) = .4960 - .2019 = .2941
6.57 μ = 2087    σ = 175

If 20% are less, then 30% lie between x and μ.
z.30 = -.84
z = (x - μ)/σ
-.84 = (x - 2087)/175
x = 1940

If 65% are more, then 15% lie between x and μ.
z.15 = -0.39
z = (x - μ)/σ
-.39 = (x - 2087)/175
x = 2018.75

If x is more than 85% of the values, then 35% lie between x and μ.
z.35 = 1.04
z = (x - μ)/σ
1.04 = (x - 2087)/175
x = 2269
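Each part of 6.57 is an inverse-normal lookup: pick the z that leaves the stated fraction below (or above) and unwind x = μ + zσ. Python's statistics.NormalDist does the lookup exactly, so the results differ from the manual's two-decimal table z values by a point or two; this is a verification sketch:

```python
from statistics import NormalDist

nd = NormalDist(2087, 175)
print(round(nd.inv_cdf(0.20), 2))   # 20% below; the manual's z = -.84 gives 1940
print(round(nd.inv_cdf(0.35), 2))   # 65% above; the manual's z = -.39 gives 2018.75
print(round(nd.inv_cdf(0.85), 2))   # 85% below; the manual's z = 1.04 gives 2269
```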
6.58 λ = 0.8 person/minute

P(x > 1 minute | λ = 0.8/minute):
Let x0 = 1
P(x > 1) = e^(-λx0) = e^(-.8(1)) = e^(-.8) = .4493

P(x > 2.5 minutes | λ = 0.8/minute):
Let x0 = 2.5
P(x > 2.5) = e^(-λx0) = e^(-0.8(2.5)) = e^(-2) = .1353
6.59 μ = 2,106,774    σ = 50,940

P(x > 2,200,000):
z = (2,200,000 - 2,106,774)/50,940 = 1.83
from Table A.5, the area for z = 1.83 is .4664
P(x > 2,200,000) = .5000 - .4664 = .0336

P(x < 2,000,000):
z = (2,000,000 - 2,106,774)/50,940 = -2.10
from Table A.5, the area for z = -2.10 is .4821
P(x < 2,000,000) = .5000 - .4821 = .0179
6.60 λ = 2.2 calls/30 secs.

Expected time between calls = μ = 1/λ = 1/(2.2) = .4545(30 sec.) = 13.64 sec.

P(x > 1 min. | λ = 2.2 calls per 30 secs.):
Since λ and x are for different intervals, change λ to λ = 4.4 calls/1 min.
P(x > 1 min. | λ = 4.4 calls/1 min.):
For x0 = 1: e^(-λx0) = e^(-4.4(1)) = .0123

P(x > 2 min. | λ = 4.4 calls/1 min.):
For x0 = 2: e^(-λx0) = e^(-4.4(2)) = e^(-8.8) = .0002
6.61 This is a uniform distribution with a = 11 and b = 32.
The mean is (11 + 32)/2 = 21.5 and the standard deviation is
(32 - 11)/√12 = 6.06. Almost 81% of the time there are 28 or fewer
sales associates working. One hundred percent of the time there are 34 or fewer
sales associates working and never more than 34. About 23.8% of
the time there are 16 or fewer sales associates working. There are 21 or fewer
sales associates working about 48% of the time.
6.62 The weight of the rods is normally distributed with a mean of 227 mg and a
standard deviation of 2.3 mg. The probability that a rod weighs less than or
equal to 220 mg is .0012, less than or equal to 225 mg is .1923, less than
or equal to 227 is .5000 (since 227 is the mean), less than 231 mg is .9590, and
less than or equal to 238 mg is 1.000.
6.63 The lengths of cell phone calls are normally distributed with a mean of 2.35
minutes and a standard deviation of .11 minutes. Almost 99% of the calls are
less than or equal to 2.60 minutes, almost 82% are less than or equal to 2.45
minutes, over 32% are less than 2.3 minutes, and almost none are less than
2 minutes.
6.64 The exponential distribution has λ = 4.51 per 10 minutes and μ = 1/4.51 =
.22173 of 10 minutes, or 2.2173 minutes. The probability that there is less than
.1 (1 minute) between arrivals is .3630. The probability that there is less than
.2 (2 minutes) between arrivals is .5942. The probability that there is .5 (5
minutes) or more between arrivals is .1049. The probability that there is more
than 1 (10 minutes) between arrivals is .0110. It is almost certain that there
will be less than 2.4 (24 minutes) between arrivals.
Chapter 7
Sampling and Sampling Distributions
LEARNING OBJECTIVES
The two main objectives for Chapter 7 are to give you an appreciation for the proper application
of sampling techniques and an understanding of the sampling distributions of two statistics, thereby
enabling you to:
1. Determine when to use sampling instead of a census.
2. Distinguish between random and nonrandom sampling.
3. Decide when and how to use various sampling techniques.
4. Be aware of the different types of errors that can occur in a study.
5. Understand the impact of the central limit theorem on statistical analysis.
6. Use the sampling distributions of x̄ and p̂.
CHAPTER TEACHING STRATEGY
Virtually every analysis discussed in this text deals with sample data. It is important,
therefore, that students are exposed to the ways and means that samples are gathered. The
first portion of chapter 7 deals with sampling. Reasons for sampling versus taking a census are
given. Most of these reasons are tied to the fact that taking a census costs more than sampling
if the same measurements are being gathered. Students are then exposed to the idea of
random versus nonrandom sampling. Random sampling appeals to their concepts of fairness
and equal opportunity. This text emphasizes that nonrandom samples are nonprobability
samples and cannot be used in inferential analysis because levels of confidence and/or
probability cannot be assigned. It should be emphasized throughout the discussion of sampling
techniques that as future business managers (most students will end up as some sort of
supervisor/manager) students should be aware of where and how data are gathered for studies.
This will help to assure that they will not make poor decisions based on inaccurate and poorly
gathered data.
The central limit theorem opens up opportunities to analyze data with a host of
techniques using the normal curve. Section 7.2 begins by showing one population (randomly
generated and presented in histogram form) that is uniformly distributed and one that is
exponentially distributed. Histograms of the means for random samples of varying sizes are
presented. Note that the distributions of means pile up in the middle and begin to
approximate the normal curve shape as sample size increases. Note also by observing the
values on the bottom axis that the dispersion of means gets smaller and smaller as sample size
increases, thus underscoring the formula for the standard error of the mean (σ/√n). As the
student sees the central limit theorem unfold, he/she begins to see that if the sample size is
large enough, sample means can be analyzed using the normal curve regardless of the shape of
the population.
Chapter 7 presents formulas derived from the central limit theorem for both sample
means and sample proportions. Taking the time to introduce these techniques in this chapter
can expedite the presentation of material in chapters 8 and 9.
CHAPTER OUTLINE
7.1 Sampling
Reasons for Sampling
Reasons for Taking a Census
Frame
Random Versus Nonrandom Sampling
Random Sampling Techniques
Simple Random Sampling
Stratified Random Sampling
Systematic Sampling
Cluster or Area Sampling
Nonrandom Sampling
Convenience Sampling
Judgment Sampling
Quota Sampling
Snowball Sampling
Sampling Error
Nonsampling Errors
7.2 Sampling Distribution of x̄
Sampling from a Finite Population
7.3 Sampling Distribution of p̂
KEY TERMS

Central Limit Theorem
Cluster (or Area) Sampling
Convenience Sampling
Disproportionate Stratified Random Sampling
Finite Correction Factor
Frame
Judgment Sampling
Nonrandom Sampling
Nonrandom Sampling Techniques
Nonsampling Errors
Proportionate Stratified Random Sampling
Quota Sampling
Random Sampling
Sample Proportion
Sampling Error
Simple Random Sampling
Snowball Sampling
Standard Error of the Mean
Standard Error of the Proportion
Stratified Random Sampling
Systematic Sampling
Two-Stage Sampling
SOLUTIONS TO PROBLEMS IN CHAPTER 7
7.1 a) i. A union membership list for the company.
ii. A list of all employees of the company.
b) i. White pages of the telephone directory for Utica, New York.
ii. Utility company list of all customers.
c) i. Airline company list of phone and mail purchasers of tickets from the airline during
the past six months.
ii. A list of frequent flyer club members for the airline.
d) i. List of boat manufacturer's employees.
ii. List of members of a boat owners association.
e) i. Cable company telephone directory.
ii. Membership list of cable management association.
7.4 a) Size of motel (rooms), age of motel, geographic location.
b) Gender, age, education, social class, ethnicity.
c) Size of operation (number of bottled drinks per month), number of employees, number of
different types of drinks bottled at that location, geographic location.
d) Size of operation (sq.ft.), geographic location, age of facility, type of process used.
7.5 a) Under 21 years of age, 21 to 39 years of age, 40 to 55 years of age, over 55 years of age.
b) Under $1,000,000 sales per year, $1,000,000 to $4,999,999 sales per year, $5,000,000 to
$19,999,999 sales per year, $20,000,000 to $49,000,000 per year, $50,000,000 to
$99,999,999 per year, over $100,000,000 per year.
c) Less than 2,000 sq. ft., 2,000 to 4,999 sq. ft.,
5,000 to 9,999 sq. ft., over 10,000 sq. ft.
d) East, southeast, midwest, south, southwest, west, northwest.
e) Government worker, teacher, lawyer, physician, engineer, business person, police officer,
fire fighter, computer worker.
f) Manufacturing, finance, communications, health care, retailing, chemical, transportation.
7.6 n = N/k = 100,000/200 = 500
7.7 N = nk = 825
7.8 k = N/n = 3,500/175 = 20
Start between 0 and 20. The human resource department probably has a list of
company employees which can be used for the frame. Also, there might be a
company phone directory available.
7.9 a) i. Counties
ii. Metropolitan areas
b) i. States (beside which the oil wells lie)
ii. Companies that own the wells
c) i. States
ii. Counties
7.10 Go to the district attorney's office and observe the apparent activity of various
attorneys at work. Select some who are very busy and some who seem to be
less active. Select some men and some women. Select some who appear to
be older and some who are younger. Select attorneys with different ethnic
backgrounds.
7.11 Go to a conference where some of the Fortune 500 executives attend.
Approach those executives who appear to be friendly and approachable.
7.12 Suppose 40% of the sample is to be people who presently own a personal computer and 60% ,
people who do not. Go to a computer show at the city's conference center and start
interviewing people. Suppose you get enough people who own personal computers but not
enough interviews with those who do not. Go to a mall and start interviewing people. Screen
out personal computer owners. Interview non personal computer owners until you meet the
60% quota.
7.13 μ = 50, σ = 10, n = 64

a) P(x̄ > 52):
z = (x̄ - μ)/(σ/√n) = (52 - 50)/(10/√64) = 1.6
from Table A.5, prob. = .4452
P(x̄ > 52) = .5000 - .4452 = .0548

b) P(x̄ < 51):
z = (51 - 50)/(10/√64) = 0.80
from Table A.5, prob. = .2881
P(x̄ < 51) = .5000 + .2881 = .7881

c) P(x̄ < 47):
z = (47 - 50)/(10/√64) = -2.40
from Table A.5, prob. = .4918
P(x̄ < 47) = .5000 - .4918 = .0082

d) P(48.5 < x̄ < 52.4):
z = (48.5 - 50)/(10/√64) = -1.20
from Table A.5, prob. = .3849
z = (52.4 - 50)/(10/√64) = 1.92
from Table A.5, prob. = .4726
P(48.5 < x̄ < 52.4) = .3849 + .4726 = .8575

e) P(50.6 < x̄ < 51.3):
z = (50.6 - 50)/(10/√64) = 0.48
from Table A.5, prob. = .1844
z = (51.3 - 50)/(10/√64) = 1.04
from Table A.5, prob. = .3508
P(50.6 < x̄ < 51.3) = .3508 - .1844 = .1664
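Every part of 7.13 uses the same standardization, z = (x̄ - μ)/(σ/√n). A small Python helper makes the pattern explicit; this is a verification sketch, and the function name is ours:

```python
import math
from statistics import NormalDist

def prob_xbar_above(mu, sigma, n, x):
    # P(xbar > x): by the central limit theorem the sample mean is
    # approximately normal with mean mu and standard error sigma/sqrt(n)
    se = sigma / math.sqrt(n)
    return 1 - NormalDist(mu, se).cdf(x)

# 7.13 a): mu = 50, sigma = 10, n = 64
print(round(prob_xbar_above(50, 10, 64, 52), 4))   # .0548, matching the manual
```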
7.14 μ = 23.45    σ = 3.8

a) n = 10, P(x̄ > 22):
z = (22 - 23.45)/(3.8/√10) = -1.21
from Table A.5, prob. = .3869
P(x̄ > 22) = .3869 + .5000 = .8869

b) n = 4, P(x̄ > 26):
z = (26 - 23.45)/(3.8/√4) = 1.34
from Table A.5, prob. = .4099
P(x̄ > 26) = .5000 - .4099 = .0901
7.15 n = 36    μ = 278    P(x̄ < 280) = .86

.3600 of the area lies between x̄ = 280 and μ = 278. This probability is
associated with z = 1.08 from Table A.5. Solving for σ:
z = (x̄ - μ)/(σ/√n)
1.08 = (280 - 278)/(σ/√36)
1.08(σ/6) = 2
σ = 12/1.08 = 11.11
7.16 n = 81    σ = 12    P(x̄ > 300) = .18

.5000 - .1800 = .3200 and from Table A.5, z.3200 = 0.92
Solving for μ:
z = (x̄ - μ)/(σ/√n)
0.92 = (300 - μ)/(12/√81)
0.92(12/9) = 300 - μ
1.2267 = 300 - μ
μ = 300 - 1.2267 = 298.77
7.17 a) N = 1,000    n = 60    μ = 75    σ = 6

P(x̄ < 76.5):
z = (x̄ - μ)/[(σ/√n)·√((N - n)/(N - 1))] = (76.5 - 75)/[(6/√60)·√((1000 - 60)/999)] = 2.00
from Table A.5, prob. = .4772
P(x̄ < 76.5) = .4772 + .5000 = .9772

b) N = 90    n = 36    μ = 108    σ = 3.46
P(107 < x̄ < 107.7):
z = (107 - 108)/[(3.46/√36)·√((90 - 36)/89)] = -2.23
from Table A.5, prob. = .4871
z = (107.7 - 108)/[(3.46/√36)·√((90 - 36)/89)] = -0.67
from Table A.5, prob. = .2486
P(107 < x̄ < 107.7) = .4871 - .2486 = .2385

c) N = 250    n = 100    μ = 35.6    σ = 4.89
P(x̄ > 36):
z = (36 - 35.6)/[(4.89/√100)·√((250 - 100)/249)] = 1.05
from Table A.5, prob. = .3531
P(x̄ > 36) = .5000 - .3531 = .1469

d) N = 5000    n = 60    μ = 125    σ = 13.4
P(x̄ < 123):
z = (123 - 125)/[(13.4/√60)·√((5000 - 60)/4999)] = -1.16
from Table A.5, prob. = .3770
P(x̄ < 123) = .5000 - .3770 = .1230
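The only change in 7.17 relative to 7.13 is the finite correction factor √((N - n)/(N - 1)) multiplying the standard error. A sketch in Python (the function name is ours):

```python
import math
from statistics import NormalDist

def z_finite(x, mu, sigma, n, N):
    # z for a sample mean drawn without replacement from a
    # finite population of size N
    se = (sigma / math.sqrt(n)) * math.sqrt((N - n) / (N - 1))
    return (x - mu) / se

# 7.17 a): N = 1000, n = 60, mu = 75, sigma = 6
z = z_finite(76.5, 75, 6, 60, 1000)
print(round(z, 2), round(NormalDist().cdf(z), 4))   # about 2.00 and .9772
```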
7.18 μ = 99.9    σ = 30    n = 38

a) P(x̄ < 90):
z = (90 - 99.9)/(30/√38) = -2.03
from Table A.5, area = .4788
P(x̄ < 90) = .5000 - .4788 = .0212

b) P(98 < x̄ < 105):
z = (105 - 99.9)/(30/√38) = 1.05
from Table A.5, area = .3531
z = (98 - 99.9)/(30/√38) = -0.39
from Table A.5, area = .1517
P(98 < x̄ < 105) = .3531 + .1517 = .5048

c) P(x̄ < 112):
z = (112 - 99.9)/(30/√38) = 2.49
from Table A.5, area = .4936
P(x̄ < 112) = .5000 + .4936 = .9936

d) P(93 < x̄ < 96):
z = (93 - 99.9)/(30/√38) = -1.42
from Table A.5, area = .4222
z = (96 - 99.9)/(30/√38) = -0.80
from Table A.5, area = .2881
P(93 < x̄ < 96) = .4222 - .2881 = .1341
7.19 N = 1500    n = 100    μ = 177,000    σ = 8,500

P(x̄ > $185,000):
z = (185,000 - 177,000)/[(8,500/√100)·√((1500 - 100)/1499)] = 9.74
from Table A.5, prob. = .5000
P(x̄ > $185,000) = .5000 - .5000 = .0000
7.20 μ = $65.12    σ = $21.45    n = 45    P(x̄ > x̄0) = .2300

Prob. x̄ lies between x̄0 and μ = .5000 - .2300 = .2700
from Table A.5, z.2700 = 0.74
Solving for x̄0:
z = (x̄0 - μ)/(σ/√n)
0.74 = (x̄0 - 65.12)/(21.45/√45)
2.366 = x̄0 - 65.12
x̄0 = 65.12 + 2.366 = 67.486
7.21 μ = 50.4    σ = 11.8    n = 42

a) P(x̄ > 52):
z = (52 - 50.4)/(11.8/√42) = 0.88
from Table A.5, the area for z = 0.88 is .3106
P(x̄ > 52) = .5000 - .3106 = .1894

b) P(x̄ < 47.5):
z = (47.5 - 50.4)/(11.8/√42) = -1.59
from Table A.5, the area for z = -1.59 is .4441
P(x̄ < 47.5) = .5000 - .4441 = .0559

c) P(x̄ < 40):
z = (40 - 50.4)/(11.8/√42) = -5.71
from Table A.5, the area for z = -5.71 is .5000
P(x̄ < 40) = .5000 - .5000 = .0000

d) 71% of the values are greater than 49. Therefore, 21% are between the
sample mean of 49 and the population mean μ = 50.4.
The z value associated with 21% of the area is -0.55:
z.21 = -0.55
z = (x̄ - μ)/(σ/√n)
-0.55 = (49 - 50.4)/(σ/√42)
σ = 16.4964
7.22 p = .25

a) n = 110    P(p̂ < .21):
z = (p̂ - p)/√(p·q/n) = (.21 - .25)/√((.25)(.75)/110) = -0.97
from Table A.5, prob. = .3340
P(p̂ < .21) = .5000 - .3340 = .1660

b) n = 33    P(p̂ > .24):
z = (.24 - .25)/√((.25)(.75)/33) = -0.13
from Table A.5, prob. = .0517
P(p̂ > .24) = .5000 + .0517 = .5517

c) n = 59    P(.24 < p̂ < .27):
z = (.24 - .25)/√((.25)(.75)/59) = -0.18
from Table A.5, prob. = .0714
z = (.27 - .25)/√((.25)(.75)/59) = 0.35
from Table A.5, prob. = .1368
P(.24 < p̂ < .27) = .0714 + .1368 = .2082

d) n = 80    P(p̂ > .30):
z = (.30 - .25)/√((.25)(.75)/80) = 1.03
from Table A.5, prob. = .3485
P(p̂ > .30) = .5000 - .3485 = .1515

e) n = 800    P(p̂ > .30):
z = (.30 - .25)/√((.25)(.75)/800) = 3.27
from Table A.5, prob. = .4995
P(p̂ > .30) = .5000 - .4995 = .0005
7.23 p = .58    n = 660

a) P(p̂ > .60):
z = (.60 - .58)/√((.58)(.42)/660) = 1.04
from Table A.5, area = .3508
P(p̂ > .60) = .5000 - .3508 = .1492

b) P(.55 < p̂ < .65):
z = (.65 - .58)/√((.58)(.42)/660) = 3.64
from Table A.5, area = .4998
z = (.55 - .58)/√((.58)(.42)/660) = -1.56
from Table A.5, area = .4406
P(.55 < p̂ < .65) = .4998 + .4406 = .9404

c) P(p̂ > .57):
z = (.57 - .58)/√((.58)(.42)/660) = -0.52
from Table A.5, area = .1985
P(p̂ > .57) = .1985 + .5000 = .6985

d) P(.53 < p̂ < .56):
z = (.56 - .58)/√((.58)(.42)/660) = -1.04    z = (.53 - .58)/√((.58)(.42)/660) = -2.60
from Table A.5, area for z = -1.04 is .3508
from Table A.5, area for z = -2.60 is .4953
P(.53 < p̂ < .56) = .4953 - .3508 = .1445

e) P(p̂ < .48):
z = (.48 - .58)/√((.58)(.42)/660) = -5.21
from Table A.5, area = .5000
P(p̂ < .48) = .5000 - .5000 = .0000
7.24 p = .40    P(p̂ > .35) = .8000

P(.35 < p̂ < .40) = .8000 - .5000 = .3000
from Table A.5, z.3000 = -0.84
Solving for n:
z = (p̂ - p)/√(p·q/n)
-0.84 = (.35 - .40)/√((.40)(.60)/n) = -.05/√(.24/n)
√n = 0.84·√.24/.05 = 8.23
n = 67.73 ≈ 68
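7.24 solves the proportion z formula for the sample size. In Python (a checking sketch; the manual's table z of -0.84 gives 67.73, the exact z gives just under 68, and either way the answer rounds up to n = 68):

```python
import math
from statistics import NormalDist

p, phat, prob_above = .40, .35, .80
z = NormalDist().inv_cdf(1 - prob_above)   # about -0.84
# From z = (phat - p)/sqrt(p*q/n), solve for n:
n = p * (1 - p) * (z / (phat - p)) ** 2
print(round(n, 2), math.ceil(n))           # just under 68, so use n = 68
```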
7.25 p = .28    n = 140    P(p̂ < p̂0) = .3000

P(p̂0 < p̂ < .28) = .5000 - .3000 = .2000
from Table A.5, z.2000 = -0.52
Solving for p̂0:
z = (p̂0 - p)/√(p·q/n)
-0.52 = (p̂0 - .28)/√((.28)(.72)/140)
-.02 = p̂0 - .28
p̂0 = .28 - .02 = .26
7.26 P(x > 150):    n = 600    p = .21    x = 150

p̂ = 150/600 = .25
z = (.25 - .21)/√((.21)(.79)/600) = 2.41
from Table A.5, area = .4920
P(x > 150) = .5000 - .4920 = .0080
7.27 p = .48    n = 200

a) P(x < 90):
p̂ = 90/200 = .45
z = (.45 - .48)/√((.48)(.52)/200) = -0.85
from Table A.5, the area for z = -0.85 is .3023
P(x < 90) = .5000 - .3023 = .1977

b) P(x > 100):
p̂ = 100/200 = .50
z = (.50 - .48)/√((.48)(.52)/200) = 0.57
from Table A.5, the area for z = 0.57 is .2157
P(x > 100) = .5000 - .2157 = .2843

c) P(x > 80):
p̂ = 80/200 = .40
z = (.40 - .48)/√((.48)(.52)/200) = -2.26
from Table A.5, the area for z = -2.26 is .4881
P(x > 80) = .5000 + .4881 = .9881
7.28 p = .19    n = 950

a) P(p̂ > .25):
z = (.25 - .19)/√((.19)(.81)/950) = 4.71
from Table A.5, area = .5000
P(p̂ > .25) = .5000 - .5000 = .0000

b) P(.15 < p̂ < .20):
z = (.15 - .19)/√((.19)(.81)/950) = -3.14
z = (.20 - .19)/√((.19)(.81)/950) = 0.79
from Table A.5, area for z = -3.14 is .4992
from Table A.5, area for z = 0.79 is .2852
P(.15 < p̂ < .20) = .4992 + .2852 = .7844

c) P(133 < x < 171):
p̂1 = 133/950 = .14    p̂2 = 171/950 = .18
P(.14 < p̂ < .18):
z = (.14 - .19)/√((.19)(.81)/950) = -3.93    z = (.18 - .19)/√((.19)(.81)/950) = -0.79
from Table A.5, the area for z = -3.93 is .49997
the area for z = -0.79 is .2852
P(133 < x < 171) = .49997 - .2852 = .21477
7.29 μ = 76    σ = 14

a) n = 35, P(x̄ > 79):
z = (79 - 76)/(14/√35) = 1.27
from Table A.5, area = .3980
P(x̄ > 79) = .5000 - .3980 = .1020

b) n = 140, P(74 < x̄ < 77):
z = (74 - 76)/(14/√140) = -1.69    z = (77 - 76)/(14/√140) = 0.85
from Table A.5, area for z = -1.69 is .4545
from Table A.5, area for z = 0.85 is .3023
P(74 < x̄ < 77) = .4545 + .3023 = .7568

c) n = 219, P(x̄ < 76.5):
z = (76.5 - 76)/(14/√219) = 0.53
from Table A.5, area = .2019
P(x̄ < 76.5) = .5000 + .2019 = .7019
7.30 p = .46

a) n = 60
P(.41 < p̂ < .53):
z = (.53 - .46)/√((.46)(.54)/60) = 1.09
from Table A.5, area = .3621
z = (.41 - .46)/√((.46)(.54)/60) = -0.78
from Table A.5, area = .2823
P(.41 < p̂ < .53) = .3621 + .2823 = .6444

b) n = 458    P(p̂ < .40):
z = (.40 - .46)/√((.46)(.54)/458) = -2.58
from Table A.5, area = .4951
P(p̂ < .40) = .5000 - .4951 = .0049

c) n = 1350    P(p̂ > .49):
z = (.49 - .46)/√((.46)(.54)/1350) = 2.21
from Table A.5, area = .4864
P(p̂ > .49) = .5000 - .4864 = .0136
7.31 Under 18:   250(.22) = 55
18 to 25:        250(.18) = 45
26 to 50:        250(.36) = 90
51 to 65:        250(.10) = 25
over 65:         250(.14) = 35
n = 250
7.32 p = .55    n = 600    x = 298

p̂ = x/n = 298/600 = .497

P(p̂ < .497):
z = (.497 - .55)/√((.55)(.45)/600) = -2.61
from Table A.5, prob. = .4955
P(p̂ < .497) = .5000 - .4955 = .0045

No, the probability of obtaining these sample results by chance from a population that supports
the candidate with 55% of the vote is extremely low (.0045). This is such an unlikely chance
sample result that it would probably cause the researcher to reject her claim of 55% of the vote.
7.33 a) Roster of production employees secured from the human
resources department of the company.
b) Alpha/Beta store records kept at the headquarters of
their California division or merged files of store
records from regional offices across the state.
c) Membership list of Maine lobster catchers association.
7.34 μ = $17,755   σ = $650   n = 30   N = 120

P(x̄ < 17,500):

z = (17,500 - 17,755) / [(650/√30)·√((120 - 30)/(120 - 1))] = -2.47

from Table A.5, the area for z = -2.47 is .4932

P(x̄ < 17,500) = .5000 - .4932 = .0068
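Because n/N = 30/120 exceeds 5%, the finite population correction √((N - n)/(N - 1)) is applied to the standard error before standardizing. A small Python sketch of that computation (illustrative helper names; exact normal CDF in place of Table A.5):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def z_with_fpc(xbar, mu, sigma, n, N):
    """z score for a sample mean, applying the finite population
    correction sqrt((N - n)/(N - 1)) to the standard error."""
    se = (sigma / sqrt(n)) * sqrt((N - n) / (N - 1))
    return (xbar - mu) / se

# Problem 7.34: mu = 17,755, sigma = 650, n = 30, N = 120
z = z_with_fpc(17500, 17755, 650, 30, 120)
print(round(z, 2))        # -2.47
print(round(phi(z), 4))   # P(x-bar < 17,500), near the table answer .0068
```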
7.35 Number the employees from 0001 to 1250. Randomly sample from the random number table
until 60 different usable numbers are obtained. You cannot use numbers from 1251 to 9999.
7.36 μ = $125   n = 32   x̄ = $110   σ² = $525

P(x̄ > $110):

z = (x̄ - μ)/(√σ²/√n) = (110 - 125)/(√525/√32) = -3.70

from Table A.5, Prob. = .5000

P(x̄ > $110) = .5000 + .5000 = 1.0000

P(x̄ > $135):

z = (135 - 125)/(√525/√32) = 2.47

from Table A.5, Prob. = .4932

P(x̄ > $135) = .5000 - .4932 = .0068

P($120 < x̄ < $130):

z = (120 - 125)/(√525/√32) = -1.23     z = (130 - 125)/(√525/√32) = 1.23

from Table A.5, Prob. = .3907

P($120 < x̄ < $130) = .3907 + .3907 = .7814
7.37 n = 1100

a) x > 810,   p = .73

p̂ = x/n = 810/1100 = .7364

z = (p̂ - p)/√(p·q/n) = (.7364 - .73)/√((.73)(.27)/1100) = 0.48

from Table A.5, area = .1844

P(x > 810) = .5000 - .1844 = .3156

b) x < 1030,   p = .96

p̂ = x/n = 1030/1100 = .9364

z = (.9364 - .96)/√((.96)(.04)/1100) = -3.99

from Table A.5, area = .49997

P(x < 1030) = .5000 - .49997 = .00003

c) p = .85

P(.82 < p̂ < .84):

z = (.82 - .85)/√((.85)(.15)/1100) = -2.79

from Table A.5, area = .4974

z = (.84 - .85)/√((.85)(.15)/1100) = -0.93

from Table A.5, area = .3238

P(.82 < p̂ < .84) = .4974 - .3238 = .1736
7.38 1) The managers from some of the companies you are interested in studying do not belong to the American Managers Association.
2) The membership list of the American Managers Association is not up to date.
3) You are not interested in studying managers from some of the companies belonging to the American Managers Association.
4) The wrong questions are asked.
5) The manager incorrectly interprets a question.
6) The assistant accidentally marks the wrong answer.
7) The wrong statistical test is used to analyze the data.
8) An error is made in statistical calculations.
9) The statistical results are misinterpreted.
7.39 Divide the factories into geographic regions and select a few factories to represent those
regional areas of the country. Take a random sample of employees from each selected factory.
Do the same for distribution centers and retail outlets: divide the United States into regions or areas, select a few areas, and take a random sample from each of the selected areas' distribution centers and retail outlets.
7.40 N = 12,080 n = 300
k = N/n = 12,080/300 = 40.27
Select every 40th outlet to ensure at least n = 300 outlets.
Use a table of random numbers to select a starting point between 1 and 40.
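The systematic-sampling procedure above (compute k = N/n, pick a random start in 1..k, then take every k-th unit) can be sketched in Python; the function name is illustrative, and `random.Random` stands in for the random number table:

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Return 1-based indices of a systematic sample: random start in
    1..k, then every k-th unit, where k = floor(N/n)."""
    k = population_size // sample_size        # 12,080 // 300 = 40
    rng = random.Random(seed)
    start = rng.randint(1, k)
    return list(range(start, population_size + 1, k))

sample = systematic_sample(12080, 300, seed=1)
print(len(sample))   # rounding k down guarantees at least 300 outlets
```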
7.41 p = .54   n = 565

a) P(x > 339):

p̂ = x/n = 339/565 = .60

z = (p̂ - p)/√(p·q/n) = (.60 - .54)/√((.54)(.46)/565) = 2.86

from Table A.5, the area for z = 2.86 is .4979

P(x > 339) = .5000 - .4979 = .0021

b) P(x > 288):

p̂ = x/n = 288/565 = .5097

z = (.5097 - .54)/√((.54)(.46)/565) = -1.45

from Table A.5, the area for z = -1.45 is .4265

P(x > 288) = .5000 + .4265 = .9265

c) P(p̂ < .50):

z = (.50 - .54)/√((.54)(.46)/565) = -1.91

from Table A.5, the area for z = -1.91 is .4719

P(p̂ < .50) = .5000 - .4719 = .0281
7.42 μ = $550   n = 50   σ = $100

P(x̄ < $530):

z = (x̄ - μ)/(σ/√n) = (530 - 550)/(100/√50) = -1.41

from Table A.5, Prob. = .4207

P(x̄ < $530) = .5000 - .4207 = .0793
7.43 μ = 56.8   n = 51   σ = 12.3

a) P(x̄ > 60):

z = (x̄ - μ)/(σ/√n) = (60 - 56.8)/(12.3/√51) = 1.86

from Table A.5, Prob. = .4686

P(x̄ > 60) = .5000 - .4686 = .0314

b) P(x̄ > 58):

z = (58 - 56.8)/(12.3/√51) = 0.70

from Table A.5, Prob. = .2580

P(x̄ > 58) = .5000 - .2580 = .2420

c) P(56 < x̄ < 57):

z = (56 - 56.8)/(12.3/√51) = -0.46     z = (57 - 56.8)/(12.3/√51) = 0.12

from Table A.5, Prob. for z = -0.46 is .1772
from Table A.5, Prob. for z = 0.12 is .0478

P(56 < x̄ < 57) = .1772 + .0478 = .2250

d) P(x̄ < 55):

z = (55 - 56.8)/(12.3/√51) = -1.05

from Table A.5, Prob. = .3531

P(x̄ < 55) = .5000 - .3531 = .1469

e) P(x̄ < 50):

z = (50 - 56.8)/(12.3/√51) = -3.95

from Table A.5, Prob. = .5000

P(x̄ < 50) = .5000 - .5000 = .0000
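All five parts of 7.43 standardize x̄ against the standard error σ/√n. A Python sketch of the pattern (helper names are illustrative; exact normal CDF in place of Table A.5, so answers match the table to rounding):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_xbar_above(mu, sigma, n, a):
    """P(x-bar > a): by the Central Limit Theorem, x-bar is
    approximately normal with mean mu and std dev sigma/sqrt(n)."""
    z = (a - mu) / (sigma / sqrt(n))
    return 1.0 - phi(z)

# Problem 7.43a: mu = 56.8, sigma = 12.3, n = 51, P(x-bar > 60)
print(round(prob_xbar_above(56.8, 12.3, 51, 60), 4))
```

This returns a value near the table-based answer of .0314.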
7.45 p = .73   n = 300

a) P(210 < x < 234):

p̂₁ = x/n = 210/300 = .70     p̂₂ = x/n = 234/300 = .78

z = (.70 - .73)/√((.73)(.27)/300) = -1.17

z = (.78 - .73)/√((.73)(.27)/300) = 1.95

from Table A.5, the area for z = -1.17 is .3790
the area for z = 1.95 is .4744

P(210 < x < 234) = .3790 + .4744 = .8534

b) P(p̂ > .78):

z = (.78 - .73)/√((.73)(.27)/300) = 1.95

from Table A.5, the area for z = 1.95 is .4744

P(p̂ > .78) = .5000 - .4744 = .0256

c) p = .73   n = 800   P(p̂ > .78):

z = (.78 - .73)/√((.73)(.27)/800) = 3.19

from Table A.5, the area for z = 3.19 is .4993

P(p̂ > .78) = .5000 - .4993 = .0007
7.46 n = 140   p = .22

P(x > 35):

p̂ = 35/140 = .25

z = (p̂ - p)/√(p·q/n) = (.25 - .22)/√((.22)(.78)/140) = 0.86

from Table A.5, the area for z = 0.86 is .3051

P(x > 35) = .5000 - .3051 = .1949

P(x < 21):

p̂ = 21/140 = .15

z = (.15 - .22)/√((.22)(.78)/140) = -2.00

from Table A.5, the area for z = 2.00 is .4772

P(x < 21) = .5000 - .4772 = .0228

n = 300   p = .20

P(.18 < p̂ < .25):

z = (.18 - .20)/√((.20)(.80)/300) = -0.87

from Table A.5, the area for z = -0.87 is .3078

z = (.25 - .20)/√((.20)(.80)/300) = 2.17

from Table A.5, the area for z = 2.17 is .4850

P(.18 < p̂ < .25) = .3078 + .4850 = .7928
7.47 By taking a sample, there is potential for obtaining more detailed information.
More time can be spent with each employee. Probing questions can
be asked. There is more time for trust to be built between employee and
interviewer resulting in the potential for more honest, open answers.
With a census, data are usually more general and easier to analyze because they are in a more standard format. Decision-makers are sometimes more comfortable with a census because everyone is included and there is no sampling error. A census appears to be a better political device because the CEO can claim that everyone in the company has had input.
7.48 p = .75   n = 150   x = 120

P(p̂ > .80):

z = (.80 - .75)/√((.75)(.25)/150) = 1.41

from Table A.5, the area for z = 1.41 is .4207

P(p̂ > .80) = .5000 - .4207 = .0793
7.49 Switzerland:   n = 40   μ = $21.24   σ = $3

P(21 < x̄ < 22):

z = (21 - 21.24)/(3/√40) = -0.51     z = (22 - 21.24)/(3/√40) = 1.60

from Table A.5, the area for z = -0.51 is .1950
the area for z = 1.60 is .4452

P(21 < x̄ < 22) = .1950 + .4452 = .6402

Japan:   n = 35   μ = $22.00   σ = $3

P(x̄ > 23):

z = (23 - 22)/(3/√35) = 1.97

from Table A.5, the area for z = 1.97 is .4756

P(x̄ > 23) = .5000 - .4756 = .0244

U.S.:   n = 50   μ = $19.86   σ = $3

P(x̄ < 18.90):

z = (18.90 - 19.86)/(3/√50) = -2.26

from Table A.5, the area for z = -2.26 is .4881

P(x̄ < 18.90) = .5000 - .4881 = .0119
7.50 a) Age, Ethnicity, Religion, Geographic Region, Occupation, Urban-Suburban-Rural, Party Affiliation, Gender

b) Age, Ethnicity, Gender, Geographic Region, Economic Class

c) Age, Ethnicity, Gender, Economic Class, Education

d) Age, Ethnicity, Gender, Economic Class, Geographic Location
7.51 μ = $281   n = 65   σ = $47

P(x̄ > $273):

z = (273 - 281)/(47/√65) = -1.37

from Table A.5, the area for z = -1.37 is .4147

P(x̄ > $273) = .5000 + .4147 = .9147
Chapter 8
Statistical Inference: Estimation for Single
Populations
LEARNING OBJECTIVES
The overall learning objective of Chapter 8 is to help you understand estimating
parameters of single populations, thereby enabling you to:
1. Know the difference between point and interval estimation.
2. Estimate a population mean from a sample mean when σ is known.
3. Estimate a population mean from a sample mean when σ is unknown.
4. Estimate a population proportion from a sample proportion.
5. Estimate the population variance from a sample variance.
6. Estimate the minimum sample size necessary to achieve given statistical goals.
CHAPTER TEACHING STRATEGY
Chapter 8 is the student's introduction to interval estimation and estimation of sample
size. In this chapter, the concept of point estimate is discussed along with the notion that as each
sample changes in all likelihood so will the point estimate. From this, the student can see that an
interval estimate may be more usable as a one-time proposition than the point estimate. The
confidence interval formulas for large sample means and proportions can be presented as mere
algebraic manipulations of formulas developed in chapter 7 from the Central Limit Theorem.
It is very important that students begin to understand the difference between means and
proportions. Means can be generated by averaging some sort of measurable item such as age,
sales, volume, test score, etc. Proportions are computed by counting the number of items
containing a characteristic of interest out of the total number of items. Examples might be
proportion of people carrying a VISA card, proportion of items that are defective, proportion of
market purchasing brand A. In addition, students can begin to see that sometimes single
samples are taken and analyzed; but that other times, two samples are taken in order to
compare two brands, two techniques, two conditions, male/female, etc.
In an effort to understand the impact of variables on confidence intervals, it may be
useful to ask the students what would happen to a confidence interval if the sample size is
varied or the confidence is increased or decreased. Such consideration helps the student see in
a different light the items that make up a confidence interval. The student can see that
increasing the sample size reduces the width of the confidence interval, all other things being
constant, or that it increases confidence if other things are held constant. Business students
probably understand that increasing sample size costs more and thus there are trade-offs in the
research set-up.
In addition, it is probably worthwhile to have some discussion with students regarding
the meaning of confidence, say 95%. The idea is presented in the chapter that if 100 samples
are randomly taken from a population and 95% confidence intervals are computed on each
sample, that 95%(100) or 95 intervals should contain the parameter of estimation and
approximately 5 will not. In most cases, only one confidence interval is computed, not 100, so
the 95% confidence puts the odds in the researcher's favor. It should be pointed out, however,
that the confidence interval computed may not contain the parameter of interest.
This chapter introduces the student to the t distribution for
estimating population means when σ is unknown. Emphasize that this applies only when the
population is normally distributed because it is an assumption underlying the t test that the
population is normally distributed, albeit that this assumption is robust. The student will
observe that the t formula is essentially the same as the z formula and that it is the table that is
different. When the population is normally distributed and σ is known, the z formula can be
used even for small samples.
A formula is given in chapter 8 for estimating the population variance; and
it is here that the student is introduced to the chi-square distribution. An
assumption underlying the use of this technique is that the population is normally
distributed. The use of the chi-square statistic to estimate the population variance
is extremely sensitive to violations of this assumption. For this reason, extreme
caution should be exercised in using this technique. Because of this, some
statisticians omit this technique from consideration, presentation, and usage.
Lastly, this chapter contains a section on the estimation of sample size.
One of the more common questions asked of statisticians is: "How large of a
sample size should I take?" In this section, it should be emphasized that sample
size estimation gives the researcher a "ball park" figure as to how many to sample.
The error of estimation is a measure of the sampling error. It is also equal to
the ± error of the interval shown earlier in the chapter.
CHAPTER OUTLINE
8.1 Estimating the Population Mean Using the z Statistic (σ known)
Finite Correction Factor
Estimating the Population Mean Using the z Statistic when the
Sample Size is Small
Using the Computer to Construct z Confidence Intervals for the
Mean
8.2 Estimating the Population Mean Using the t Statistic (σ unknown)
The t Distribution
Robustness
Characteristics of the t Distribution.
Reading the t Distribution Table
Confidence Intervals to Estimate the Population Mean Using the t
Statistic
Using the Computer to Construct t Confidence Intervals for the
Mean
8.3 Estimating the Population Proportion
Using the Computer to Construct Confidence Intervals of the
Population Proportion
8.4 Estimating the Population Variance
8.5 Estimating Sample Size
Sample Size When Estimating μ
Determining Sample Size When Estimating p
KEY WORDS
Bounds Point Estimate
Chi-square Distribution Robust
Degrees of Freedom(df) Sample-Size Estimation
Error of Estimation t Distribution
Interval Estimate t Value
SOLUTIONS TO PROBLEMS IN CHAPTER 8
8.1 a) x̄ = 25   σ = 3.5   n = 60

95% Confidence   z.025 = 1.96

x̄ ± z·σ/√n = 25 ± 1.96(3.5/√60) = 25 ± 0.89 = 24.11 < μ < 25.89

b) x̄ = 119.6   σ = 23.89   n = 75

98% Confidence   z.01 = 2.33

x̄ ± z·σ/√n = 119.6 ± 2.33(23.89/√75) = 119.6 ± 6.43 = 113.17 < μ < 126.03

c) x̄ = 3.419   σ = 0.974   n = 32

90% C.I.   z.05 = 1.645

x̄ ± z·σ/√n = 3.419 ± 1.645(0.974/√32) = 3.419 ± .283 = 3.136 < μ < 3.702

d) x̄ = 56.7   σ = 12.1   N = 500   n = 47

80% C.I.   z.10 = 1.28

x̄ ± z·(σ/√n)·√((N - n)/(N - 1)) = 56.7 ± 1.28(12.1/√47)√((500 - 47)/(500 - 1))

= 56.7 ± 2.15 = 54.55 < μ < 58.85
8.2 n = 36   x̄ = 211   σ = 23

95% C.I.   z.025 = 1.96

x̄ ± z·σ/√n = 211 ± 1.96(23/√36) = 211 ± 7.51 = 203.49 < μ < 218.51
8.3 n = 81   x̄ = 47   σ = 5.89

90% C.I.   z.05 = 1.645

x̄ ± z·σ/√n = 47 ± 1.645(5.89/√81) = 47 ± 1.08 = 45.92 < μ < 48.08
8.4 n = 70   σ² = 49   x̄ = 90.4

x̄ = 90.4 Point Estimate

94% C.I.   z.03 = 1.88

x̄ ± z·√σ²/√n = 90.4 ± 1.88(√49/√70) = 90.4 ± 1.57 = 88.83 < μ < 91.97
8.5 n = 39   N = 200   x̄ = 66   σ = 11

96% C.I.   z.02 = 2.05

x̄ ± z·(σ/√n)·√((N - n)/(N - 1)) = 66 ± 2.05(11/√39)√((200 - 39)/(200 - 1))

= 66 ± 3.25 = 62.75 < μ < 69.25

x̄ = 66 Point Estimate
8.6 n = 120   x̄ = 18.72   σ = 0.8735

99% C.I.   z.005 = 2.575

x̄ = 18.72 Point Estimate

x̄ ± z·σ/√n = 18.72 ± 2.575(0.8735/√120) = 18.72 ± .21 = 18.51 < μ < 18.93
8.7 N = 1500   n = 187   x̄ = 5.3 years   σ = 1.28 years

95% C.I.   z.025 = 1.96

x̄ = 5.3 years Point Estimate

x̄ ± z·(σ/√n)·√((N - n)/(N - 1)) = 5.3 ± 1.96(1.28/√187)√((1500 - 187)/(1500 - 1))

= 5.3 ± .17 = 5.13 < μ < 5.47
8.8 n = 24   x̄ = 5.625   σ = 3.23

90% C.I.   z.05 = 1.645

x̄ ± z·σ/√n = 5.625 ± 1.645(3.23/√24) = 5.625 ± 1.085 = 4.540 < μ < 6.710
8.9 n = 36   x̄ = 3.306   σ = 1.17

98% C.I.   z.01 = 2.33

x̄ ± z·σ/√n = 3.306 ± 2.33(1.17/√36) = 3.306 ± .454 = 2.852 < μ < 3.760
8.10 n = 36   x̄ = 2.139   σ = .113

x̄ = 2.139 Point Estimate

90% C.I.   z.05 = 1.645

x̄ ± z·σ/√n = 2.139 ± 1.645(.113/√36) = 2.139 ± .031 = 2.108 < μ < 2.170
8.11 95% confidence interval   n = 45

x̄ = 24.533   σ = 5.124   z = ±1.96

x̄ ± z·σ/√n = 24.533 ± 1.96(5.124/√45) = 24.533 ± 1.497 = 23.036 < μ < 26.030
8.12 The point estimate is 0.5765.   n = 41
The assumed standard deviation is 0.14
95% level of confidence: z = ±1.96
Confidence interval: 0.533647 < μ < 0.619353
Error of the estimate: 0.619353 - 0.5765 = 0.042853
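The z confidence intervals in problems 8.1-8.12 all follow x̄ ± z·σ/√n. As an informal check (not part of the text; the helper name is illustrative), the computation can be sketched in Python:

```python
from math import sqrt

def z_confidence_interval(xbar, sigma, n, z):
    """x-bar ± z * sigma/sqrt(n); z comes from Table A.5
    (e.g., 1.96 for 95% confidence, 2.575 for 99%)."""
    e = z * sigma / sqrt(n)
    return xbar - e, xbar + e

# Problem 8.2: n = 36, x-bar = 211, sigma = 23, 95% confidence
lo, hi = z_confidence_interval(211, 23, 36, 1.96)
print(round(lo, 2), round(hi, 2))   # 203.49 218.51
```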
8.13 n = 13   x̄ = 45.62   s = 5.694   df = 13 - 1 = 12

95% Confidence Interval and α/2 = .025

t.025,12 = 2.179

x̄ ± t·s/√n = 45.62 ± 2.179(5.694/√13) = 45.62 ± 3.44 = 42.18 < μ < 49.06
8.14 n = 12   x̄ = 319.17   s = 9.104   df = 12 - 1 = 11

90% confidence interval   α/2 = .05   t.05,11 = 1.796

x̄ ± t·s/√n = 319.17 ± 1.796(9.104/√12) = 319.17 ± 4.72 = 314.45 < μ < 323.89
8.15 n = 41   x̄ = 128.4   s = 20.6   df = 41 - 1 = 40

98% Confidence Interval   α/2 = .01   t.01,40 = 2.423

x̄ ± t·s/√n = 128.4 ± 2.423(20.6/√41) = 128.4 ± 7.80 = 120.6 < μ < 136.2

x̄ = 128.4 Point Estimate
8.16 n = 15   x̄ = 2.364   s² = 0.81   df = 15 - 1 = 14

90% Confidence interval   α/2 = .05   t.05,14 = 1.761

x̄ ± t·√s²/√n = 2.364 ± 1.761(√0.81/√15) = 2.364 ± .409 = 1.955 < μ < 2.773
8.17 n = 25   x̄ = 16.088   s = .817   df = 25 - 1 = 24

99% Confidence Interval   α/2 = .005   t.005,24 = 2.797

x̄ ± t·s/√n = 16.088 ± 2.797(.817/√25) = 16.088 ± .457 = 15.631 < μ < 16.545

x̄ = 16.088 Point Estimate
8.18 n = 22   x̄ = 1,192   s = 279   df = n - 1 = 21

98% CI and α/2 = .01   t.01,21 = 2.518

x̄ ± t·s/√n = 1,192 ± 2.518(279/√22) = 1,192 ± 149.78 = 1,042.22 < μ < 1,341.78

The figure given by Runzheimer International falls within the confidence interval. Therefore, there is no reason to reject the Runzheimer figure as different from what we are getting based on this sample.
8.19 n = 20   df = 19   95% CI   t.025,19 = 2.093

x̄ = 2.36116   s = 0.19721

x̄ ± t·s/√n = 2.36116 ± 2.093(0.19721/√20) = 2.36116 ± 0.0923 = 2.26886 < μ < 2.45346

Point Estimate = 2.36116

Error = 0.0923
8.20 n = 28   x̄ = 5.335   s = 2.016   df = 28 - 1 = 27

90% Confidence Interval   α/2 = .05   t.05,27 = 1.703

x̄ ± t·s/√n = 5.335 ± 1.703(2.016/√28) = 5.335 ± .649 = 4.686 < μ < 5.984
8.21 n = 10   x̄ = 49.8   s = 18.22   df = 10 - 1 = 9

95% Confidence   α/2 = .025   t.025,9 = 2.262

x̄ ± t·s/√n = 49.8 ± 2.262(18.22/√10) = 49.8 ± 13.03 = 36.77 < μ < 62.83
8.22 n = 14   98% confidence   α/2 = .01   df = 13   t.01,13 = 2.650

from data: x̄ = 152.16   s = 14.42

confidence interval: x̄ ± t·s/√n = 152.16 ± 2.65(14.42/√14)

= 152.16 ± 10.21 = 141.95 < μ < 162.37

The point estimate is 152.16
8.23 n = 17   df = 17 - 1 = 16   99% confidence   α/2 = .005   t.005,16 = 2.921

from data: x̄ = 8.06   s = 5.07

confidence interval: x̄ ± t·s/√n = 8.06 ± 2.921(5.07/√17)

= 8.06 ± 3.59 = 4.47 < μ < 11.65
8.24 The point estimate is x̄, which is 25.4134 hours. The sample size is 26 skiffs.
The confidence level is 98%. The confidence interval is:

x̄ - t·s/√n ≤ μ ≤ x̄ + t·s/√n = 22.8124 < μ < 28.0145

The error of the confidence interval is 2.6011.
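When σ is unknown (problems 8.13-8.24), the interval becomes x̄ ± t·s/√n with the t value read from Table A.6 at n - 1 degrees of freedom. A sketch of the computation (illustrative helper name; the t quantile is passed in rather than computed, since the standard library has no t table):

```python
from math import sqrt

def t_confidence_interval(xbar, s, n, t):
    """x-bar ± t * s/sqrt(n); t comes from Table A.6 with n - 1
    degrees of freedom (sigma unknown, population ~ normal)."""
    e = t * s / sqrt(n)
    return xbar - e, xbar + e

# Problem 8.13: n = 13, x-bar = 45.62, s = 5.694, t(.025,12) = 2.179
lo, hi = t_confidence_interval(45.62, 5.694, 13, 2.179)
print(round(lo, 2), round(hi, 2))   # 42.18 49.06
```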
8.25 a) n = 44   p̂ = .51   99% C.I.   z.005 = 2.575

p̂ ± z·√(p̂·q̂/n) = .51 ± 2.575√((.51)(.49)/44) = .51 ± .194 = .316 < p < .704

b) n = 300   p̂ = .82   95% C.I.   z.025 = 1.96

p̂ ± z·√(p̂·q̂/n) = .82 ± 1.96√((.82)(.18)/300) = .82 ± .043 = .777 < p < .863

c) n = 1150   p̂ = .48   90% C.I.   z.05 = 1.645

p̂ ± z·√(p̂·q̂/n) = .48 ± 1.645√((.48)(.52)/1150) = .48 ± .024 = .456 < p < .504

d) n = 95   p̂ = .32   88% C.I.   z.06 = 1.555

p̂ ± z·√(p̂·q̂/n) = .32 ± 1.555√((.32)(.68)/95) = .32 ± .074 = .246 < p < .394
8.26 a) n = 116   x = 57   99% C.I.   z.005 = 2.575

p̂ = x/n = 57/116 = .49

p̂ ± z·√(p̂·q̂/n) = .49 ± 2.575√((.49)(.51)/116) = .49 ± .12 = .37 < p < .61

b) n = 800   x = 479   97% C.I.   z.015 = 2.17

p̂ = x/n = 479/800 = .60

p̂ ± z·√(p̂·q̂/n) = .60 ± 2.17√((.60)(.40)/800) = .60 ± .038 = .562 < p < .638

c) n = 240   x = 106   85% C.I.   z.075 = 1.44

p̂ = x/n = 106/240 = .44

p̂ ± z·√(p̂·q̂/n) = .44 ± 1.44√((.44)(.56)/240) = .44 ± .046 = .394 < p < .486

d) n = 60   x = 21   90% C.I.   z.05 = 1.645

p̂ = x/n = 21/60 = .35

p̂ ± z·√(p̂·q̂/n) = .35 ± 1.645√((.35)(.65)/60) = .35 ± .10 = .25 < p < .45
8.27 n = 85   x = 40   90% C.I.   z.05 = 1.645

p̂ = x/n = 40/85 = .47

p̂ ± z·√(p̂·q̂/n) = .47 ± 1.645√((.47)(.53)/85) = .47 ± .09 = .38 < p < .56

95% C.I.   z.025 = 1.96

p̂ ± z·√(p̂·q̂/n) = .47 ± 1.96√((.47)(.53)/85) = .47 ± .11 = .36 < p < .58

99% C.I.   z.005 = 2.575

p̂ ± z·√(p̂·q̂/n) = .47 ± 2.575√((.47)(.53)/85) = .47 ± .14 = .33 < p < .61
All other things being constant, as the confidence increased, the width of the interval increased.
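That widening can be demonstrated numerically. The following Python sketch (an illustration, not part of the text; the function name is hypothetical) recomputes the 8.27 interval at the three confidence levels:

```python
from math import sqrt

def proportion_interval(phat, n, z):
    """p-hat ± z * sqrt(p-hat * q-hat / n)."""
    e = z * sqrt(phat * (1 - phat) / n)
    return phat - e, phat + e

# Problem 8.27: p-hat = 40/85 = .47 (rounded), n = 85
for conf, z in [("90%", 1.645), ("95%", 1.96), ("99%", 2.575)]:
    lo, hi = proportion_interval(.47, 85, z)
    print(conf, round(lo, 2), "< p <", round(hi, 2))
```

With p̂ and n held fixed, the only quantity that changes is z, so the half-width grows in direct proportion to it.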
8.28 n = 1003   p̂ = .255   99% CI   z.005 = 2.575

p̂ ± z·√(p̂·q̂/n) = .255 ± 2.575√((.255)(.745)/1003) = .255 ± .035 = .220 < p < .290

n = 10,000   p̂ = .255   99% CI   z.005 = 2.575

p̂ ± z·√(p̂·q̂/n) = .255 ± 2.575√((.255)(.745)/10,000) = .255 ± .011 = .244 < p < .266
The confidence interval constructed using n = 1003 is wider than the confidence
interval constructed using n = 10,000. One might conclude that, all other things
being constant, increasing the sample size reduces the width of the confidence
interval.
8.29 n = 560   p̂ = .47   95% CI   z.025 = 1.96

p̂ ± z·√(p̂·q̂/n) = .47 ± 1.96√((.47)(.53)/560) = .47 ± .0413 = .4287 < p < .5113

n = 560   p̂ = .28   90% CI   z.05 = 1.645

p̂ ± z·√(p̂·q̂/n) = .28 ± 1.645√((.28)(.72)/560) = .28 ± .0312 = .2488 < p < .3112
8.30 n = 1250   x = 997   98% C.I.   z.01 = 2.33

p̂ = x/n = 997/1250 = .80

p̂ ± z·√(p̂·q̂/n) = .80 ± 2.33√((.80)(.20)/1250) = .80 ± .026 = .774 < p < .826
8.31 n = 3481   x = 927

p̂ = x/n = 927/3481 = .266

a) p̂ = .266 Point Estimate

b) 99% C.I.   z.005 = 2.575

p̂ ± z·√(p̂·q̂/n) = .266 ± 2.575√((.266)(.734)/3481) = .266 ± .019 = .247 < p < .285
8.32 n = 89   x = 48   85% C.I.   z.075 = 1.44

p̂ = x/n = 48/89 = .54

p̂ ± z·√(p̂·q̂/n) = .54 ± 1.44√((.54)(.46)/89) = .54 ± .076 = .464 < p < .616
8.33 p̂ = .63   n = 672   95% Confidence   z.025 = ±1.96

p̂ ± z·√(p̂·q̂/n) = .63 ± 1.96√((.63)(.37)/672) = .63 ± .0365 = .5935 < p < .6665
8.34 n = 275   x = 121   98% confidence   z.01 = 2.33

p̂ = x/n = 121/275 = .44

p̂ ± z·√(p̂·q̂/n) = .44 ± 2.33√((.44)(.56)/275) = .44 ± .07 = .37 < p < .51
8.35 a) n = 12   x̄ = 28.4   s² = 44.9   99% C.I.   df = 12 - 1 = 11

χ².995,11 = 2.60320     χ².005,11 = 26.7569

(12 - 1)(44.9)/26.7569 < σ² < (12 - 1)(44.9)/2.60320

18.46 < σ² < 189.73

b) n = 7   x̄ = 4.37   s = 1.24   s² = 1.5376   95% C.I.   df = 7 - 1 = 6

χ².975,6 = 1.23734     χ².025,6 = 14.4494

(7 - 1)(1.5376)/14.4494 < σ² < (7 - 1)(1.5376)/1.23734

0.64 < σ² < 7.46

c) n = 20   x̄ = 105   s = 32   s² = 1024   90% C.I.   df = 20 - 1 = 19

χ².95,19 = 10.11701     χ².05,19 = 30.1435

(20 - 1)(1024)/30.1435 < σ² < (20 - 1)(1024)/10.11701

645.45 < σ² < 1923.10

d) n = 17   s² = 18.56   80% C.I.   df = 17 - 1 = 16

χ².90,16 = 9.31224     χ².10,16 = 23.5418

(17 - 1)(18.56)/23.5418 < σ² < (17 - 1)(18.56)/9.31224

12.61 < σ² < 31.89
8.36 n = 16   s² = 37.1833   98% C.I.   df = 16 - 1 = 15

χ².99,15 = 5.22936     χ².01,15 = 30.5780

(16 - 1)(37.1833)/30.5780 < σ² < (16 - 1)(37.1833)/5.22936

18.24 < σ² < 106.66
8.37 n = 20   s = 4.3   s² = 18.49   98% C.I.   df = 20 - 1 = 19

χ².99,19 = 7.63270     χ².01,19 = 36.1908

(20 - 1)(18.49)/36.1908 < σ² < (20 - 1)(18.49)/7.63270

9.71 < σ² < 46.03

Point Estimate = s² = 18.49
8.38 n = 15   s² = 3.067   99% C.I.   df = 15 - 1 = 14

χ².995,14 = 4.07466     χ².005,14 = 31.3194

(15 - 1)(3.067)/31.3194 < σ² < (15 - 1)(3.067)/4.07466

1.37 < σ² < 10.54
8.39 n = 14   s² = 26,798,241.76   95% C.I.   df = 14 - 1 = 13

Point Estimate = s² = 26,798,241.76

χ².975,13 = 5.00874     χ².025,13 = 24.7356

(14 - 1)(26,798,241.76)/24.7356 < σ² < (14 - 1)(26,798,241.76)/5.00874

14,084,038.51 < σ² < 69,553,848.45
8.40 a) σ = 36   E = 5   95% Confidence   z.025 = 1.96

n = z²σ²/E² = (1.96)²(36)²/5² = 199.15

Sample 200

b) σ = 4.13   E = 1   99% Confidence   z.005 = 2.575

n = z²σ²/E² = (2.575)²(4.13)²/1² = 113.1

Sample 114

c) E = 10   Range = 500 - 80 = 420

σ ≈ 1/4 Range = (.25)(420) = 105

90% Confidence   z.05 = 1.645

n = z²σ²/E² = (1.645)²(105)²/10² = 298.3

Sample 299

d) E = 3   Range = 108 - 50 = 58

σ ≈ 1/4 Range = (.25)(58) = 14.5

88% Confidence   z.06 = 1.555

n = z²σ²/E² = (1.555)²(14.5)²/3² = 56.5

Sample 57
8.41 a) E = .02   p = .40   96% Confidence   z.02 = 2.05

n = z²·p·q/E² = (2.05)²(.40)(.60)/(.02)² = 2521.5

Sample 2522

b) E = .04   p = .50   95% Confidence   z.025 = 1.96

n = z²·p·q/E² = (1.96)²(.50)(.50)/(.04)² = 600.25

Sample 601

c) E = .05   p = .55   90% Confidence   z.05 = 1.645

n = z²·p·q/E² = (1.645)²(.55)(.45)/(.05)² = 267.9

Sample 268

d) E = .01   p = .50   99% Confidence   z.005 = 2.575

n = z²·p·q/E² = (2.575)²(.50)(.50)/(.01)² = 16,576.6

Sample 16,577
8.42 E = $200   σ = $1,000   99% Confidence   z.005 = 2.575

n = z²σ²/E² = (2.575)²(1000)²/200² = 165.77

Sample 166
8.43 E = $2   σ = $12.50   90% Confidence   z.05 = 1.645

n = z²σ²/E² = (1.645)²(12.50)²/2² = 105.7

Sample 106
8.44 E = $100   Range = $2,500 - $600 = $1,900

σ ≈ 1/4 Range = (.25)($1,900) = $475

90% Confidence   z.05 = 1.645

n = z²σ²/E² = (1.645)²(475)²/100² = 61.05

Sample 62
8.45 p = .20   q = .80   E = .02   90% Confidence   z.05 = 1.645

n = z²·p·q/E² = (1.645)²(.20)(.80)/(.02)² = 1082.41

Sample 1083
8.46 p = .50   q = .50   E = .05   95% Confidence   z.025 = 1.96

n = z²·p·q/E² = (1.96)²(.50)(.50)/(.05)² = 384.16

Sample 385
8.47 E = .10   p = .50   q = .50   95% Confidence   z.025 = 1.96

n = z²·p·q/E² = (1.96)²(.50)(.50)/(.10)² = 96.04

Sample 97
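The sample-size problems (8.40-8.47) use n = z²σ²/E² for a mean and n = z²pq/E² for a proportion, always rounding up so the error goal is met. A Python sketch with illustrative helper names:

```python
from math import ceil

def n_for_mean(z, sigma, E):
    """Minimum n to estimate mu within error E: n = z² σ² / E², rounded up."""
    return ceil((z ** 2) * (sigma ** 2) / (E ** 2))

def n_for_proportion(z, p, E):
    """Minimum n to estimate p within error E: n = z² p q / E², rounded up."""
    return ceil((z ** 2) * p * (1 - p) / (E ** 2))

# Problem 8.40a: sigma = 36, E = 5, 95% confidence (z = 1.96)
print(n_for_mean(1.96, 36, 5))           # 200
# Problem 8.46: p = .50, E = .05, 95% confidence
print(n_for_proportion(1.96, .50, .05))  # 385
```

Note that p = .50 maximizes pq, which is why it is the conservative choice when no prior estimate of p is available.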
8.48 x̄ = 45.6   σ = 7.75   n = 35

80% confidence   z.10 = 1.28

x̄ ± z·σ/√n = 45.6 ± 1.28(7.75/√35) = 45.6 ± 1.68

43.92 < μ < 47.28

94% confidence   z.03 = 1.88

x̄ ± z·σ/√n = 45.6 ± 1.88(7.75/√35) = 45.6 ± 2.46

43.14 < μ < 48.06

98% confidence   z.01 = 2.33

x̄ ± z·σ/√n = 45.6 ± 2.33(7.75/√35) = 45.6 ± 3.05

42.55 < μ < 48.65
8.49 x̄ = 12.03 (point estimate)   s = .4373   n = 10   df = 9

For 90% confidence:   α/2 = .05   t.05,9 = 1.833

x̄ ± t·s/√n = 12.03 ± 1.833(.4373/√10) = 12.03 ± .25

11.78 < μ < 12.28

For 95% confidence:   α/2 = .025   t.025,9 = 2.262

x̄ ± t·s/√n = 12.03 ± 2.262(.4373/√10) = 12.03 ± .31

11.72 < μ < 12.34

For 99% confidence:   α/2 = .005   t.005,9 = 3.25

x̄ ± t·s/√n = 12.03 ± 3.25(.4373/√10) = 12.03 ± .45

11.58 < μ < 12.48
8.50 a) n = 715   x = 329   95% confidence   z.025 = 1.96

p̂ = 329/715 = .46

p̂ ± z·√(p̂·q̂/n) = .46 ± 1.96√((.46)(.54)/715) = .46 ± .0365

.4235 < p < .4965

b) n = 284   p̂ = .71   90% confidence   z.05 = 1.645

p̂ ± z·√(p̂·q̂/n) = .71 ± 1.645√((.71)(.29)/284) = .71 ± .0443

.6657 < p < .7543

c) n = 1250   p̂ = .48   95% confidence   z.025 = 1.96

p̂ ± z·√(p̂·q̂/n) = .48 ± 1.96√((.48)(.52)/1250) = .48 ± .0277

.4523 < p < .5077

d) n = 457   x = 270   98% confidence   z.01 = 2.33

p̂ = 270/457 = .591

p̂ ± z·√(p̂·q̂/n) = .591 ± 2.33√((.591)(.409)/457) = .591 ± .0536

.5374 < p < .6446
8.51 n = 10   s = 7.40045   s² = 54.7667   df = 10 - 1 = 9

90% confidence:   α/2 = .05   1 - α/2 = .95

χ².95,9 = 3.32512     χ².05,9 = 16.9190

(10 - 1)(54.7667)/16.9190 < σ² < (10 - 1)(54.7667)/3.32512

29.133 < σ² < 148.235

95% confidence:   α/2 = .025   1 - α/2 = .975

χ².975,9 = 2.70039     χ².025,9 = 19.0228

(10 - 1)(54.7667)/19.0228 < σ² < (10 - 1)(54.7667)/2.70039

25.911 < σ² < 182.529
8.52 a) σ = 44   E = 3   95% confidence   z.025 = 1.96

n = z²σ²/E² = (1.96)²(44)²/3² = 826.4

Sample 827

b) E = 2   Range = 88 - 20 = 68

use σ ≈ 1/4 Range = (.25)(68) = 17

90% confidence   z.05 = 1.645

n = z²σ²/E² = (1.645)²(17)²/2² = 195.5

Sample 196

c) E = .04   p = .50   q = .50

98% confidence   z.01 = 2.33

n = z²·p·q/E² = (2.33)²(.50)(.50)/(.04)² = 848.3

Sample 849

d) E = .03   p = .70   q = .30

95% confidence   z.025 = 1.96

n = z²·p·q/E² = (1.96)²(.70)(.30)/(.03)² = 896.4

Sample 897
8.53 n = 17   x̄ = 10.765   s = 2.223   df = 17 - 1 = 16

99% confidence   α/2 = .005   t.005,16 = 2.921

x̄ ± t·s/√n = 10.765 ± 2.921(2.223/√17) = 10.765 ± 1.575 = 9.19 < μ < 12.34
8.54 p = .40   E = .03   90% Confidence   z.05 = 1.645

n = z²·p·q/E² = (1.645)²(.40)(.60)/(.03)² = 721.61

Sample 722
8.55 n = 17   s² = 4.941   99% C.I.   df = 17 - 1 = 16

χ².995,16 = 5.14216     χ².005,16 = 34.2671

(17 - 1)(4.941)/34.2671 < σ² < (17 - 1)(4.941)/5.14216

2.307 < σ² < 15.374
8.56 n = 45   x̄ = 213   σ = 48

98% Confidence   z.01 = 2.33

x̄ ± z·σ/√n = 213 ± 2.33(48/√45) = 213 ± 16.67 = 196.33 < μ < 229.67
8.57 n = 39   x̄ = 37.256   σ = 3.891

90% confidence   z.05 = 1.645

x̄ ± z·σ/√n = 37.256 ± 1.645(3.891/√39) = 37.256 ± 1.025 = 36.231 < μ < 38.281
8.58 σ = 6   E = 1   98% Confidence   z.01 = 2.33

n = z²σ²/E² = (2.33)²(6)²/1² = 195.44

Sample 196
8.59 n = 1,255   x = 714   95% Confidence   z.025 = 1.96

p̂ = 714/1,255 = .569

p̂ ± z·√(p̂·q̂/n) = .569 ± 1.96√((.569)(.431)/1,255) = .569 ± .027 = .542 < p < .596
8.60 n = 41   s = 21   x̄ = 128   98% C.I.   df = 41 - 1 = 40

t.01,40 = 2.423

Point Estimate = $128

x̄ ± t·s/√n = 128 ± 2.423(21/√41) = 128 ± 7.947 = 120.053 < μ < 135.947

Interval Width = 135.947 - 120.053 = 15.894
8.61 n = 60   x̄ = 6.717   σ = 3.06   N = 300

98% Confidence   z.01 = 2.33

x̄ ± z·(σ/√n)·√((N - n)/(N - 1)) = 6.717 ± 2.33(3.06/√60)√((300 - 60)/(300 - 1))

= 6.717 ± 0.825 = 5.892 < μ < 7.542
8.62 E = $20   Range = $600 - $30 = $570

σ ≈ 1/4 Range = (.25)($570) = $142.50

95% Confidence   z.025 = 1.96

n = z²σ²/E² = (1.96)²(142.50)²/20² = 195.02

Sample 196
8.63 n = 245   x = 189   90% Confidence   z.05 = 1.645

p̂ = x/n = 189/245 = .77

p̂ ± z·√(p̂·q̂/n) = .77 ± 1.645√((.77)(.23)/245) = .77 ± .044 = .726 < p < .814
8.64 n = 90   x = 30   95% Confidence   z.025 = 1.96

p̂ = x/n = 30/90 = .33

p̂ ± z·√(p̂·q̂/n) = .33 ± 1.96√((.33)(.67)/90) = .33 ± .097 = .233 < p < .427
8.65 n = 12   x̄ = 43.7   s² = 228   df = 12 - 1 = 11   95% C.I.

t.025,11 = 2.201

x̄ ± t·√s²/√n = 43.7 ± 2.201(√228/√12) = 43.7 ± 9.59 = 34.11 < μ < 53.29

χ².99,11 = 3.05350     χ².01,11 = 24.7250

(12 - 1)(228)/24.7250 < σ² < (12 - 1)(228)/3.05350

101.44 < σ² < 821.35
8.66 n = 27   x̄ = 4.82   s = 0.37   df = 26

95% CI:   t.025,26 = 2.056

x̄ ± t·s/√n = 4.82 ± 2.056(0.37/√27) = 4.82 ± .1464 = 4.6736 < μ < 4.9664

Since 4.50 is not in the interval, we are 95% confident that μ does not equal 4.50.
8.67 n = 77   x̄ = 2.48   σ = 12

95% Confidence   z.025 = 1.96

x̄ ± z·σ/√n = 2.48 ± 1.96(12/√77) = 2.48 ± 2.68 = -0.20 < μ < 5.16

The point estimate is 2.48.

The interval is inconclusive. It says that we are 95% confident that the average arrival time is somewhere between .20 of a minute (12 seconds) early and 5.16 minutes late. Since zero is in the interval, there is a possibility that, on average, the flights are on time.
8.68 n = 560   p̂ = .33

99% Confidence   z.005 = 2.575

p̂ ± z·√(p̂·q̂/n) = .33 ± 2.575√((.33)(.67)/560) = .33 ± .05 = .28 < p < .38
8.69 p = .50   E = .05   98% Confidence   z.01 = 2.33

n = z²·p·q/E² = (2.33)²(.50)(.50)/(.05)² = 542.89

Sample 543
8.70 n = 27   x̄ = 2.10   s = 0.86   df = 27 - 1 = 26

98% confidence   α/2 = .01   t.01,26 = 2.479

x̄ ± t·s/√n = 2.10 ± 2.479(0.86/√27) = 2.10 ± 0.41 = 1.69 < μ < 2.51
8.71 n = 23   df = 23 - 1 = 22   s = .0631455   90% C.I.

χ².95,22 = 12.33801     χ².05,22 = 33.9245

(23 - 1)(.0631455)²/33.9245 < σ² < (23 - 1)(.0631455)²/12.33801

.0026 < σ² < .0071
8.72 n = 39   x̄ = 1.294   σ = 0.205   99% Confidence   z.005 = 2.575

x̄ ± z·σ/√n = 1.294 ± 2.575(0.205/√39) = 1.294 ± .085 = 1.209 < μ < 1.379
8.73 The sample mean fill for the 58 cans is 11.9788 oz. with a standard deviation of .0536 oz. The 99% confidence interval for the population fill is 11.9607 oz. to 11.9969 oz., which does not include 12 oz. We are 99% confident that the population mean is not 12 oz., indicating that the machine may be underfilling the cans.
8.74 The point estimate for the average length of burn of the new bulb is 2198.217 hours. Eighty-four bulbs were included in this study. A 90% confidence interval can be constructed from the information given. The error of the confidence interval is ±27.76691. Combining this with the point estimate yields the 90% confidence interval of 2198.217 ± 27.76691 = 2170.450 < μ < 2225.984.
8.75 The point estimate for the average age of a first time buyer is 27.63 years. The
sample of 21 buyers produces a standard deviation of 6.54 years. We are 98%
confident that the actual population mean age of a first-time home buyer is
between 24.0222 years and 31.2378 years.
8.76 A poll of 781 American workers was taken. Of these, 506 drive their cars to work. Thus, the point estimate for the population proportion is 506/781 = .647887. A 95% confidence interval to estimate the population proportion shows that we are 95% confident that the actual value lies between .61324 and .681413. The error of this interval is ±.0340865.
Chapter 9
Statistical Inference:
Hypothesis Testing for Single Populations
LEARNING OBJECTIVES
The main objective of Chapter 9 is to help you to learn how to test hypotheses on
single populations, thereby enabling you to:
1. Understand the logic of hypothesis testing and know how to establish null and alternate
hypotheses.
2. Understand Type I and Type II errors and know how to solve for Type II errors.
3. Know how to implement the HTAB system to test hypotheses.
4. Test hypotheses about a single population mean when σ is known.
5. Test hypotheses about a single population mean when σ is unknown.
6. Test hypotheses about a single population proportion.
7. Test hypotheses about a single population variance.
CHAPTER TEACHING STRATEGY
For some instructors, this chapter is the cornerstone of the first statistics course.
Hypothesis testing presents the logic in which ideas, theories, etc., are scientifically
examined. The student can be made aware that much of the development of
concepts to this point including sampling, level of data measurement, descriptive tools
such as mean and standard deviation, probability, and distributions pave the way for
testing hypotheses. Often students (and instructors) will say "Why do we need to test this
hypothesis when we can make a decision by examining the data?" Sometimes it is true
that examining the data could allow hypothesis decisions to be made. However, by using
the methodology and structure of hypothesis testing even in "obvious" situations, the
researcher has added credibility and rigor to his/her findings. Some statisticians actually
report findings in a court of law as an expert witness. Others report their findings in a
journal, to the public, to the corporate board, to a client, or to their manager. In each
case, by using the hypothesis testing method rather than a "seat of the pants" judgment,
the researcher stands on a much firmer foundation by using the principles of hypothesis
testing and random sampling. Chapter 9 brings together many of the tools developed to
this point and formalizes a procedure for testing hypotheses.
The statistical hypotheses are set up so as to contain all possible decisions. The
two-tailed test always has = in the null hypothesis and ≠ in the alternative hypothesis. One-tailed
tests are presented with = in the null hypothesis and either > or < in the alternative hypothesis. If in
doubt, the researcher should use a two-tailed test. Chapter 9 begins with a two-tailed test
example. Often that which the researcher wants to demonstrate true or prove true is set up as
the alternative hypothesis. The null hypothesis is that the new theory or idea is not true, the
status quo is still true, or that there is no difference. The null hypothesis is assumed to be true
before the process begins. Some researchers liken this procedure to a court of law where the
defendant is presumed innocent (assume null is true - nothing has happened). Evidence is
brought before the judge or jury. If enough evidence is presented, then the null hypothesis
(defendant innocent) can no longer be accepted or assumed true. The null hypothesis is rejected
as not true and the alternate hypothesis is accepted as true by default. Emphasize that the
researcher needs to make a decision after examining the observed statistic.
Some of the key concepts in this chapter are one-tailed and two-tailed test and Type I
and Type II error. In order for a one-tailed test to be conducted, the problem must include some
suggestion of a direction to be tested. If the student sees such words as greater, less than, more
than, higher, younger, etc., then he/she knows to use a one-tail test. If no direction is given
(test to determine if there is a "difference"), then a two-tailed test is called for. Ultimately,
students will see that the only effect of using a one-tailed test versus a two-tailed test is on the
critical table value. A one-tailed test uses all of the value of alpha in one tail. A two-tailed test
splits alpha and uses alpha/2 in each tail thus creating a critical value that is further out in the
distribution. The result is that (all things being the same) it is more difficult to reject the null
hypothesis with a two-tailed test. Many computer packages such as MINITAB include in the
results a p-value. If you
designate that the hypothesis test is a two-tailed test, the computer will double the p-value so
that it can be compared directly to alpha.
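The doubling can be verified directly. A minimal illustrative sketch (not from the text), using only the Python standard library, with the standard normal CDF built from math.erf:

```python
from math import sqrt, erf

def norm_cdf(z):
    # cumulative probability of the standard normal distribution
    return 0.5 * (1 + erf(z / sqrt(2)))

# A one-tailed p-value is the single tail area beyond the observed z;
# for a two-tailed test, software doubles it so it compares directly to alpha.
z = 2.77
one_tail_p = 1 - norm_cdf(z)
two_tail_p = 2 * one_tail_p
print(round(one_tail_p, 4), round(two_tail_p, 4))  # 0.0028 0.0056
```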
In discussing Type I and Type II errors, there are a few things to consider. Once a
decision is made regarding the null hypothesis, there is a possibility that the decision is correct
or that an error has been made. Since the researcher virtually never knows for certain whether
the null hypothesis was actually true or not, a probability of committing one of these errors can
be computed. Emphasize with the students that a researcher can never commit a Type I error
and a Type II error at the same time. This is so because a Type I error can only be committed
when the null hypothesis is rejected and a Type II error can only be committed when the
decision is to not reject the null hypothesis. Type I and Type II errors are important concepts for
managerial students to understand even beyond the realm of statistical hypothesis testing. For
example, if a manager decides to fire or not fire an employee based on some evidence collected,
he/she could be committing a Type I or a Type II error depending on the decision. If the
production manager decides to stop the production line because of evidence of faulty raw
materials, he/she might be committing a Type I error. On the other hand, if the manager fails to
shut the production line down even when faced with evidence of faulty raw materials, he/she
might be committing a Type II error.
The student can be told that there are some widely accepted values for alpha
(probability of committing a Type I error) in the research world and that a value is usually
selected before the research begins. On the other hand, since the value of Beta (probability of
committing a Type II error) varies with every possible alternate value of the parameter being
tested, Beta is usually examined and computed over a range of possible values of that
parameter. As you can see, the concepts of hypothesis testing are difficult and represent higher
levels of learning (logic, transfer, etc.). Student understanding of these concepts will improve as
you work your way through the techniques in this chapter and in chapter 10.
CHAPTER OUTLINE
9.1 Introduction to Hypothesis Testing
Types of Hypotheses
Research Hypotheses
Statistical Hypotheses
Substantive Hypotheses
Using the HTAB System to Test Hypotheses
Rejection and Non-rejection Regions
Type I and Type II errors
9.2 Testing Hypotheses About a Population Mean Using the z Statistic (o known)
Testing the Mean with a Finite Population
Using the p-Value Method to Test Hypotheses
Using the Critical Value Method to Test Hypotheses
Using the Computer to Test Hypotheses about a Population Mean Using
the z Statistic
9.3 Testing Hypotheses About a Population Mean Using the t Statistic (o unknown)
Using the Computer to Test Hypotheses about a Population Mean Using
the t Test
9.4 Testing Hypotheses About a Proportion
Using the Computer to Test Hypotheses about a Population Proportion
9.5 Testing Hypotheses About a Variance
9.6 Solving for Type II Errors
Some Observations About Type II Errors
Operating Characteristic and Power Curves
Effect of Increasing Sample Size on the Rejection Limits
KEY TERMS
Alpha (α) One-tailed Test
Alternative Hypothesis Operating-Characteristic Curve (OC)
Beta (β) p-Value Method
Critical Value Power
Critical Value Method Power Curve
Hypothesis Rejection Region
Hypothesis Testing Research Hypothesis
Level of Significance Statistical Hypothesis
Nonrejection Region Substantive Result
Null Hypothesis Two-Tailed Test
Observed Significance Level Type I Error
Observed Value Type II Error
SOLUTIONS TO PROBLEMS IN CHAPTER 9
9.1 a) Ho: μ = 25
       Ha: μ ≠ 25
       x̄ = 28.1   n = 57   σ = 8.46   α = .01
       For two-tail, α/2 = .005   zc = ±2.575
       z = (x̄ - μ)/(σ/√n) = (28.1 - 25)/(8.46/√57) = 2.77
       Since the observed z = 2.77 > zc = 2.575, reject the null hypothesis.
    b) From Table A.5, the area between z = 0 and z = 2.77 is .4972.
       p-value = .5000 - .4972 = .0028
       Since the p-value of .0028 is less than α/2 = .005, the decision is to
       reject the null hypothesis.
    c) Critical mean values:
       zc = (x̄c - μ)/(σ/√n)
       ±2.575 = (x̄c - 25)/(8.46/√57)
       x̄c = 25 ± 2.885
       x̄c = 27.885 (upper value) and x̄c = 22.115 (lower value)
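The arithmetic in solutions like 9.1 can be checked with a short script. This is an illustrative sketch, not part of the text; it uses only the Python standard library, and the critical value 2.575 is the z.005 table value quoted in the solution:

```python
from math import sqrt, erf

def norm_cdf(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

def z_test(xbar, mu0, sigma, n):
    """Observed z statistic for a one-sample z test (sigma known)."""
    return (xbar - mu0) / (sigma / sqrt(n))

z = z_test(28.1, 25, 8.46, 57)      # problem 9.1 data
p_one_tail = 1 - norm_cdf(abs(z))   # single-tail area, as in part (b)
reject = abs(z) > 2.575             # z.005 critical value from Table A.5
print(round(z, 2), round(p_one_tail, 4), reject)  # 2.77 0.0028 True
```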
9.2 Ho: μ = 7.48
    Ha: μ < 7.48
    x̄ = 6.91   n = 24   σ = 1.21   α = .01
    For one-tail, α = .01   zc = -2.33
    z = (x̄ - μ)/(σ/√n) = (6.91 - 7.48)/(1.21/√24) = -2.31
    Since the observed z = -2.31 > zc = -2.33, fail to reject the null hypothesis.
9.3 a) Ho: μ = 1,200
       Ha: μ > 1,200
       x̄ = 1,215   n = 113   σ = 100   α = .10
       For one-tail, α = .10   zc = 1.28
       z = (x̄ - μ)/(σ/√n) = (1,215 - 1,200)/(100/√113) = 1.59
       Since the observed z = 1.59 > zc = 1.28, reject the null hypothesis.
    b) The probability of exceeding the observed z = 1.59 is .5000 - .4441 = .0559
       (the p-value), which is less than α = .10. Reject the null hypothesis.
    c) Critical mean value:
       zc = (x̄c - μ)/(σ/√n)
       1.28 = (x̄c - 1,200)/(100/√113)
       x̄c = 1,200 + 12.04 = 1,212.04
       Since the observed x̄ = 1,215 is greater than the critical x̄c = 1,212.04, the
       decision is to reject the null hypothesis.
9.4 Ho: μ = 82
    Ha: μ < 82
    x̄ = 78.125   n = 32   σ = 9.184   α = .01
    z.01 = -2.33
    z = (x̄ - μ)/(σ/√n) = (78.125 - 82)/(9.184/√32) = -2.39
    Since the observed z = -2.39 < z.01 = -2.33, reject the null hypothesis.
    Statistically, we can conclude that urban air soot is significantly lower. From a business
    and community point of view, assuming that the sample result is representative of how
    the air actually is now, is a reduction of suspended particles from 82 to 78.125 really an
    important reduction in air pollution (is it substantive)? Certainly it marks an important
    first step and perhaps a significant start. Whether or not it would really make a
    difference in the quality of life for people in the city of St. Louis remains to be seen.
    Most likely, politicians and city chamber of commerce folks would jump on such results
    as indications of improvement in city conditions.
9.5 Ho: μ = $424.20
    Ha: μ ≠ $424.20
    x̄ = $432.69   n = 54   σ = $33.90   α = .05
    Two-tailed test, α/2 = .025   z.025 = ±1.96
    z = (x̄ - μ)/(σ/√n) = (432.69 - 424.20)/(33.90/√54) = 1.84
    Since the observed z = 1.84 < z.025 = 1.96, the decision is to fail to reject the
    null hypothesis.
9.6 Ho: μ = $62,600
    Ha: μ < $62,600
    x̄ = $58,974   n = 18   σ = $7,810   α = .01
    One-tailed test, α = .01   z.01 = -2.33
    z = (x̄ - μ)/(σ/√n) = (58,974 - 62,600)/(7,810/√18) = -1.97
    Since the observed z = -1.97 > z.01 = -2.33, the decision is to fail to reject the
    null hypothesis.
9.7 Ho: μ = 5
    Ha: μ ≠ 5
    x̄ = 5.0611   n = 42   N = 650   σ = 0.2803   α = .10
    Two-tailed test, α/2 = .05   z.05 = ±1.645
    z = (x̄ - μ)/[(σ/√n)·√((N - n)/(N - 1))]
      = (5.0611 - 5)/[(0.2803/√42)·√((650 - 42)/(650 - 1))] = 1.46
    Since the observed z = 1.46 < z.05 = 1.645, the decision is to fail to reject the
    null hypothesis.
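The finite-population form of the z statistic used in 9.7 can be sketched as follows (illustrative only, standard library):

```python
from math import sqrt

def z_finite_pop(xbar, mu0, sigma, n, N):
    # z with the finite population correction: the standard error
    # sigma/sqrt(n) is multiplied by sqrt((N - n)/(N - 1))
    se = (sigma / sqrt(n)) * sqrt((N - n) / (N - 1))
    return (xbar - mu0) / se

z = z_finite_pop(5.0611, 5, 0.2803, 42, 650)  # problem 9.7 data
print(round(z, 2))  # 1.46
```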
9.8 Ho: μ = 18.2
    Ha: μ < 18.2
    x̄ = 15.6   n = 32   σ = 2.3   α = .10
    For one-tail, α = .10   z.10 = -1.28
    z = (x̄ - μ)/(σ/√n) = (15.6 - 18.2)/(2.3/√32) = -6.39
    Since the observed z = -6.39 < z.10 = -1.28, the decision is to reject the null hypothesis.
9.9 Ho: μ = $4,292
    Ha: μ < $4,292
    x̄ = $4,008   n = 55   σ = $386   α = .01
    For a one-tailed test, α = .01   z.01 = -2.33
    z = (x̄ - μ)/(σ/√n) = (4,008 - 4,292)/(386/√55) = -5.46
    Since the observed z = -5.46 < z.01 = -2.33, the decision is to reject the null hypothesis.
    The CEO could use this information as a way of discrediting the Runzheimer study and
    using her own figures in recruiting people and in discussing relocation options. In such a
    case, this could be a substantive finding. However, one must ask if the difference
    between $4,292 and $4,008 is really an important difference in monthly rental expense.
    Certainly, Paris is expensive either way. However, an almost $300 difference in monthly
    rental cost is a nontrivial amount for most people and therefore might be considered
    substantive.
9.10 Ho: μ = 123
     Ha: μ > 123
     α = .05   n = 40   x̄ = 132.36   s = 27.68
     This is a one-tailed test. Since the p-value = .016 < α = .05, we
     reject the null hypothesis.
     The average water usage per person is greater than 123 gallons.
9.11 n = 20   x̄ = 16.45   s = 3.59   df = 20 - 1 = 19   α = .05
     Ho: μ = 16
     Ha: μ ≠ 16
     For a two-tail test, α/2 = .025   critical t.025,19 = ±2.093
     t = (x̄ - μ)/(s/√n) = (16.45 - 16)/(3.59/√20) = 0.56
     Since the observed t = 0.56 < t.025,19 = 2.093, the decision is to fail to reject
     the null hypothesis.
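A one-sample t statistic such as the one in 9.11 is straightforward to compute; a hedged sketch (the critical value 2.093 is the t.025,19 table value quoted in the solution, hard-coded here rather than computed):

```python
from math import sqrt

def t_stat(xbar, mu0, s, n):
    """Observed t for a one-sample t test (sigma unknown)."""
    return (xbar - mu0) / (s / sqrt(n))

t = t_stat(16.45, 16, 3.59, 20)  # problem 9.11 data
t_crit = 2.093                   # t.025,19 from the t table
reject = abs(t) > t_crit
print(round(t, 2), reject)  # 0.56 False
```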
9.12 n = 51   x̄ = 58.42   s² = 25.68   df = 51 - 1 = 50   α = .01
     Ho: μ = 60
     Ha: μ < 60
     For a one-tail test, α = .01   critical t.01,50 = -2.403
     t = (x̄ - μ)/(s/√n) = (58.42 - 60)/(√25.68/√51) = -2.23
     Since the observed t = -2.23 > t.01,50 = -2.403, the decision is to fail to reject
     the null hypothesis.
9.13 n = 11   x̄ = 1,236.36   s = 103.81   df = 11 - 1 = 10   α = .05
     Ho: μ = 1,160
     Ha: μ > 1,160
     For a one-tail test, α = .05   critical t.05,10 = 1.812
     t = (x̄ - μ)/(s/√n) = (1,236.36 - 1,160)/(103.81/√11) = 2.44
     Since the observed t = 2.44 > t.05,10 = 1.812, the decision is to reject the null
     hypothesis.
9.14 n = 20   x̄ = 8.37   s = .1895   df = 20 - 1 = 19   α = .01
     Ho: μ = 8.3
     Ha: μ ≠ 8.3
     For a two-tail test, α/2 = .005   critical t.005,19 = ±2.861
     t = (x̄ - μ)/(s/√n) = (8.37 - 8.3)/(.1895/√20) = 1.65
     Since the observed t = 1.65 < t.005,19 = 2.861, the decision is to fail to reject
     the null hypothesis.
9.15 n = 12   x̄ = 1.85083   s = .02353   df = 12 - 1 = 11   α = .10
     Ho: μ = 1.84
     Ha: μ ≠ 1.84
     For a two-tailed test, α/2 = .05   critical t.05,11 = ±1.796
     t = (x̄ - μ)/(s/√n) = (1.85083 - 1.84)/(.02353/√12) = 1.59
     Since t = 1.59 < t.05,11 = 1.796, the decision is to fail to reject the null hypothesis.
9.16 n = 25   x̄ = 3.1948   s = .0889   df = 25 - 1 = 24   α = .01
     Ho: μ = $3.16
     Ha: μ > $3.16
     For a one-tail test, α = .01   critical t.01,24 = 2.492
     t = (x̄ - μ)/(s/√n) = (3.1948 - 3.16)/(.0889/√25) = 1.96
     Since the observed t = 1.96 < t.01,24 = 2.492, the decision is to fail to reject
     the null hypothesis.
9.17 n = 19   x̄ = $31.67   s = $1.29   df = 19 - 1 = 18   α = .05
     Ho: μ = $32.28
     Ha: μ ≠ $32.28
     Two-tailed test, α/2 = .025   t.025,18 = ±2.101
     t = (x̄ - μ)/(s/√n) = (31.67 - 32.28)/(1.29/√19) = -2.06
     Since the observed t = -2.06 > t.025,18 = -2.101, the decision is to fail to reject
     the null hypothesis.
9.18 n = 61   x̄ = 3.72   s = 0.65   df = 61 - 1 = 60   α = .01
     Ho: μ = 3.51
     Ha: μ > 3.51
     One-tailed test, α = .01   t.01,60 = 2.390
     t = (x̄ - μ)/(s/√n) = (3.72 - 3.51)/(0.65/√61) = 2.52
     Since the observed t = 2.52 > t.01,60 = 2.390, the decision is to reject the null
     hypothesis.
9.19 n = 22   x̄ = 1031.32   s = 240.37   df = 22 - 1 = 21   α = .05
     Ho: μ = 1135
     Ha: μ ≠ 1135
     Two-tailed test, α/2 = .025   t.025,21 = ±2.080
     t = (x̄ - μ)/(s/√n) = (1031.32 - 1135)/(240.37/√22) = -2.02
     Since the observed t = -2.02 > t.025,21 = -2.080, the decision is to fail to reject
     the null hypothesis.
9.20 n = 12   x̄ = 42.167   s = 9.124   df = 12 - 1 = 11   α = .01
     Ho: μ = 46
     Ha: μ < 46
     One-tailed test, α = .01   t.01,11 = -2.718
     t = (x̄ - μ)/(s/√n) = (42.167 - 46)/(9.124/√12) = -1.46
     Since the observed t = -1.46 > t.01,11 = -2.718, the decision is to fail to reject
     the null hypothesis.
9.21 n = 26   x̄ = 19.534 minutes   s = 4.100 minutes   α = .05
     Ho: μ = 19
     Ha: μ ≠ 19
     Two-tailed test, α/2 = .025   critical t value = ±2.06
     Observed t value = 0.66. Since the observed t = 0.66 < critical t = 2.06,
     the decision is to fail to reject the null hypothesis.
     Since the Excel p-value = .256 > α/2 = .025 and the MINITAB p-value = .513 > .05,
     the decision is to fail to reject the null hypothesis.
     She would not conclude that her city is any different from the ones in the
     national survey.
9.22 Ho: p = .45
     Ha: p > .45
     n = 310   p̂ = .465   α = .05
     For one-tail, α = .05   z.05 = 1.645
     z = (p̂ - p)/√(p·q/n) = (.465 - .45)/√((.45)(.55)/310) = 0.53
     Since the observed z = 0.53 < z.05 = 1.645, the decision is to fail to reject
     the null hypothesis.
9.23 Ho: p = .63
     Ha: p < .63
     n = 100   x = 55
     p̂ = x/n = 55/100 = .55
     For one-tail, α = .01   z.01 = -2.33
     z = (p̂ - p)/√(p·q/n) = (.55 - .63)/√((.63)(.37)/100) = -1.66
     Since the observed z = -1.66 > zc = -2.33, the decision is to fail to reject
     the null hypothesis.
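A proportion test such as 9.23 follows the same pattern, with the standard error computed under the null. An illustrative sketch (not part of the text):

```python
from math import sqrt

def z_prop(x, n, p0):
    """Sample proportion and observed z for a one-sample proportion test."""
    p_hat = x / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error under the null hypothesis
    return p_hat, (p_hat - p0) / se

p_hat, z = z_prop(55, 100, 0.63)  # problem 9.23 data
print(p_hat, round(z, 2))  # 0.55 -1.66
```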
9.24 Ho: p = .29
     Ha: p ≠ .29
     n = 740   x = 207   α = .05
     p̂ = x/n = 207/740 = .28
     For two-tail, α/2 = .025   z.025 = ±1.96
     z = (p̂ - p)/√(p·q/n) = (.28 - .29)/√((.29)(.71)/740) = -0.60
     Since the observed z = -0.60 > zc = -1.96, the decision is to fail to reject
     the null hypothesis.
     p-Value Method:
     For z = -0.60, from Table A.5, the area is .2257.
     The area in the tail = .5000 - .2257 = .2743, which is the p-value.
     Since the p-value = .2743 > α/2 = .025, the decision is to fail to reject the null
     hypothesis.
     Solving for critical values:
     zc = (p̂c - p)/√(p·q/n)
     ±1.96 = (p̂c - .29)/√((.29)(.71)/740)
     p̂c = .29 ± .033
     .257 and .323 are the critical values.
     Since p̂ = .28 is not outside the critical values in the tails, the decision is to fail
     to reject the null hypothesis.
9.25 Ho: p = .48
     Ha: p ≠ .48
     n = 380   x = 164   α = .01   α/2 = .005   z.005 = ±2.575
     p̂ = x/n = 164/380 = .4316
     z = (p̂ - p)/√(p·q/n) = (.4316 - .48)/√((.48)(.52)/380) = -1.89
     Since the observed z = -1.89 is greater than z.005 = -2.575, the decision is to fail
     to reject the null hypothesis. There is not enough evidence to declare that the
     proportion is any different from .48.
9.26 Ho: p = .79
     Ha: p < .79
     n = 415   x = 303   α = .01   z.01 = -2.33
     p̂ = x/n = 303/415 = .7301
     z = (p̂ - p)/√(p·q/n) = (.7301 - .79)/√((.79)(.21)/415) = -3.00
     Since the observed z = -3.00 is less than z.01 = -2.33, the decision is to reject
     the null hypothesis.
9.27 Ho: p = .31
     Ha: p ≠ .31
     n = 600   x = 200   α = .10   α/2 = .05   z.05 = ±1.645
     p̂ = x/n = 200/600 = .3333
     z = (p̂ - p)/√(p·q/n) = (.3333 - .31)/√((.31)(.69)/600) = 1.23
     Since the observed z = 1.23 is less than z.05 = 1.645, the decision is to fail to reject
     the null hypothesis. There is not enough evidence to declare that the proportion is
     any different from .31.

     Ho: p = .24
     Ha: p < .24
     n = 600   x = 130   α = .05   z.05 = -1.645
     p̂ = x/n = 130/600 = .2167
     z = (p̂ - p)/√(p·q/n) = (.2167 - .24)/√((.24)(.76)/600) = -1.34
     Since the observed z = -1.34 is greater than z.05 = -1.645, the decision is to fail to
     reject the null hypothesis. There is not enough evidence to declare that the proportion
     is less than .24.
9.28 Ho: p = .18
     Ha: p > .18
     n = 376   p̂ = .22   α = .01
     One-tailed test, z.01 = 2.33
     z = (p̂ - p)/√(p·q/n) = (.22 - .18)/√((.18)(.82)/376) = 2.02
     Since the observed z = 2.02 is less than z.01 = 2.33, the decision is to fail to reject
     the null hypothesis. There is not enough evidence to declare that the proportion is
     greater than .18.
9.29 Ho: p = .32
     Ha: p < .32
     n = 118   x = 22   α = .05
     p̂ = x/n = 22/118 = .1864
     For a one-tailed test, z.05 = -1.645
     z = (p̂ - p)/√(p·q/n) = (.1864 - .32)/√((.32)(.68)/118) = -3.11
     Since the observed z = -3.11 is less than z.05 = -1.645, the decision is to reject
     the null hypothesis.
9.30 Ho: p = .47
     Ha: p ≠ .47
     n = 67   x = 40   α = .05   α/2 = .025
     For a two-tailed test, z.025 = ±1.96
     p̂ = x/n = 40/67 = .597
     z = (p̂ - p)/√(p·q/n) = (.597 - .47)/√((.47)(.53)/67) = 2.08
     Since the observed z = 2.08 is greater than z.025 = 1.96, the decision is to reject
     the null hypothesis.
9.31 a) Ho: σ² = 20   α = .05   n = 15   df = 15 - 1 = 14   s² = 32
        Ha: σ² > 20
        χ².05,14 = 23.6848
        χ² = (n - 1)s²/σ² = (15 - 1)(32)/20 = 22.4
        Since χ² = 22.4 < χ².05,14 = 23.6848, the decision is to fail to reject the null
        hypothesis.
     b) Ho: σ² = 8.5   α = .10   α/2 = .05   n = 22   df = n - 1 = 21   s² = 17
        Ha: σ² ≠ 8.5
        χ².05,21 = 32.6706
        χ² = (n - 1)s²/σ² = (22 - 1)(17)/8.5 = 42
        Since χ² = 42 > χ².05,21 = 32.6706, the decision is to reject the null hypothesis.
     c) Ho: σ² = 45   α = .01   n = 8   df = n - 1 = 7   s = 4.12
        Ha: σ² < 45
        χ².99,7 = 1.239043
        χ² = (n - 1)s²/σ² = (8 - 1)(4.12)²/45 = 2.64
        Since χ² = 2.64 > χ².99,7 = 1.239043, the decision is to fail to reject the null
        hypothesis.
     d) Ho: σ² = 5   α = .05   α/2 = .025   n = 11   df = 11 - 1 = 10   s² = 1.2
        Ha: σ² ≠ 5
        χ².025,10 = 20.4832   χ².975,10 = 3.24696
        χ² = (n - 1)s²/σ² = (11 - 1)(1.2)/5 = 2.4
        Since χ² = 2.4 < χ².975,10 = 3.24696, the decision is to reject the null hypothesis.
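The chi-square statistic for a single variance is simple enough to check by hand or in a one-line function. A sketch mirroring 9.31(a), with the table value hard-coded as quoted in the solution:

```python
def chi_square_stat(n, s2, sigma2_0):
    # chi-square statistic for a single variance: (n - 1)·s² / σ₀²
    return (n - 1) * s2 / sigma2_0

chi2 = chi_square_stat(15, 32, 20)  # problem 9.31(a) data
chi2_crit = 23.6848                 # χ².05,14 from the chi-square table
print(chi2, chi2 > chi2_crit)       # 22.4 False -> fail to reject
```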
9.32 Ho: σ² = 14   α = .05   α/2 = .025   n = 12   df = 12 - 1 = 11   s² = 30.0833
     Ha: σ² ≠ 14
     χ².025,11 = 21.9200   χ².975,11 = 3.81574
     χ² = (n - 1)s²/σ² = (12 - 1)(30.0833)/14 = 23.64
     Since χ² = 23.64 > χ².025,11 = 21.9200, the decision is to reject the null
     hypothesis.
9.33 Ho: σ² = .001   α = .01   n = 16   df = 16 - 1 = 15   s² = .00144667
     Ha: σ² > .001
     χ².01,15 = 30.5780
     χ² = (n - 1)s²/σ² = (16 - 1)(.00144667)/.001 = 21.7
     Since χ² = 21.7 < χ².01,15 = 30.5780, the decision is to fail to reject the null
     hypothesis.
9.34 Ho: σ² = 199,996,164   α = .10   α/2 = .05   n = 13   df = 13 - 1 = 12
     Ha: σ² ≠ 199,996,164   s² = 832,089,743.7
     χ².05,12 = 21.0261   χ².95,12 = 5.22603
     χ² = (n - 1)s²/σ² = (13 - 1)(832,089,743.7)/199,996,164 = 49.93
     Since χ² = 49.93 > χ².05,12 = 21.0261, the decision is to reject the null
     hypothesis. The variance has changed.
9.35 Ho: σ² = .04   α = .01   n = 7   df = 7 - 1 = 6   s = .34   s² = .1156
     Ha: σ² > .04
     χ².01,6 = 16.8119
     χ² = (n - 1)s²/σ² = (7 - 1)(.1156)/.04 = 17.34
     Since χ² = 17.34 > χ².01,6 = 16.8119, the decision is to reject the null hypothesis.
9.36 Ho: μ = 100
     Ha: μ < 100
     n = 48   μa = 99   σ = 14
     a) α = .10   z.10 = -1.28
        zc = (x̄c - μ)/(σ/√n)
        -1.28 = (x̄c - 100)/(14/√48)
        x̄c = 97.4
        z = (x̄c - μa)/(σ/√n) = (97.4 - 99)/(14/√48) = -0.79
        From Table A.5, the area for z = -0.79 is .2852.
        β = .2852 + .5000 = .7852
     b) α = .05   z.05 = -1.645
        -1.645 = (x̄c - 100)/(14/√48)
        x̄c = 96.68
        z = (96.68 - 99)/(14/√48) = -1.15
        From Table A.5, the area for z = -1.15 is .3749.
        β = .3749 + .5000 = .8749
     c) α = .01   z.01 = -2.33
        -2.33 = (x̄c - 100)/(14/√48)
        x̄c = 95.29
        z = (95.29 - 99)/(14/√48) = -1.84
        From Table A.5, the area for z = -1.84 is .4671.
        β = .4671 + .5000 = .9671
     d) As α gets smaller (other variables remaining constant), β gets larger.
        Decreasing the probability of committing a Type I error increases the probability
        of committing a Type II error if other variables are held constant.
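The two-step β calculation in 9.36 (solve for the critical mean, then find the non-rejection probability under the alternative) can be sketched as a function. Illustrative only; exact arithmetic gives β ≈ .784 for part (a), while the text's .7852 comes from rounding x̄c to 97.4 first:

```python
from math import sqrt, erf

def norm_cdf(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

def beta_lower_tail(mu0, mu_a, sigma, n, z_alpha):
    """Type II error for a lower-tail mean test: find the critical mean,
    then the probability of falling above it when mu = mu_a."""
    se = sigma / sqrt(n)
    x_crit = mu0 + z_alpha * se  # z_alpha is negative for a lower-tail test
    return 1 - norm_cdf((x_crit - mu_a) / se)

beta = beta_lower_tail(100, 99, 14, 48, -1.28)  # problem 9.36(a)
print(round(beta, 3))
```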
9.37 α = .05   μ = 100   n = 48   σ = 14
     a) μa = 98.5   zc = -1.645
        zc = (x̄c - μ)/(σ/√n)
        -1.645 = (x̄c - 100)/(14/√48)
        x̄c = 96.68
        z = (x̄c - μa)/(σ/√n) = (96.68 - 98.5)/(14/√48) = -0.90
        From Table A.5, the area for z = -0.90 is .3159.
        β = .3159 + .5000 = .8159
     b) μa = 98   zc = -1.645   x̄c = 96.68
        z = (96.68 - 98)/(14/√48) = -0.65
        From Table A.5, the area for z = -0.65 is .2422.
        β = .2422 + .5000 = .7422
     c) μa = 97   z.05 = -1.645   x̄c = 96.68
        z = (96.68 - 97)/(14/√48) = -0.16
        From Table A.5, the area for z = -0.16 is .0636.
        β = .0636 + .5000 = .5636
     d) μa = 96   z.05 = -1.645   x̄c = 96.68
        z = (96.68 - 96)/(14/√48) = 0.34
        From Table A.5, the area for z = 0.34 is .1331.
        β = .5000 - .1331 = .3669
     e) As the alternative value gets farther from the null hypothesized value, the
        probability of committing a Type II error decreases (all other variables being
        held constant).
9.38 Ho: μ = 50
     Ha: μ ≠ 50
     μa = 53   n = 35   σ = 7   α = .01
     Since this is two-tailed, α/2 = .005   z.005 = ±2.575
     zc = (x̄c - μ)/(σ/√n)
     ±2.575 = (x̄c - 50)/(7/√35)
     x̄c = 50 ± 3.05, i.e., 46.95 and 53.05
     z = (x̄c - μa)/(σ/√n) = (53.05 - 53)/(7/√35) = 0.04
     From Table A.5, for z = 0.04 the area is .0160.
     Other end:
     z = (46.95 - 53)/(7/√35) = -5.11
     The area associated with z = -5.11 is .5000.
     β = .5000 + .0160 = .5160
9.39 a) Ho: p = .65
        Ha: p < .65
        n = 360   α = .05   pa = .60   z.05 = -1.645
        zc = (p̂c - p)/√(p·q/n)
        -1.645 = (p̂c - .65)/√((.65)(.35)/360)
        p̂c = .65 - .041 = .609
        z = (p̂c - pa)/√(pa·qa/n) = (.609 - .60)/√((.60)(.40)/360) = 0.35
        From Table A.5, the area for z = 0.35 is .1368.
        β = .5000 - .1368 = .3632
     b) pa = .55   z.05 = -1.645   p̂c = .609
        z = (.609 - .55)/√((.55)(.45)/360) = 2.25
        From Table A.5, the area for z = 2.25 is .4878.
        β = .5000 - .4878 = .0122
     c) pa = .50   z.05 = -1.645   p̂c = .609
        z = (.609 - .50)/√((.50)(.50)/360) = 4.14
        From Table A.5, the area for z = 4.14 is essentially .5000.
        β = .5000 - .5000 = .0000
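The same two-step logic applies to proportions, as in 9.39. A hedged sketch (exact arithmetic gives β ≈ .37 for part (a); the text's .3632 uses rounded intermediates):

```python
from math import sqrt, erf

def norm_cdf(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

def beta_prop_lower(p0, p_a, n, z_alpha):
    """Type II error for a lower-tail proportion test: solve for the
    critical proportion under the null, then find the probability of
    exceeding it under the alternative p_a."""
    p_crit = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)  # z_alpha is negative
    z = (p_crit - p_a) / sqrt(p_a * (1 - p_a) / n)
    return 1 - norm_cdf(z)

beta = beta_prop_lower(0.65, 0.60, 360, -1.645)  # problem 9.39(a)
print(round(beta, 2))
```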
9.40 n = 58   x̄ = 45.1   σ = 8.7   α = .05   α/2 = .025
     Ho: μ = 44
     Ha: μ ≠ 44   z.025 = ±1.96
     z = (45.1 - 44)/(8.7/√58) = 0.96
     Since z = 0.96 < zc = 1.96, the decision is to fail to reject the null hypothesis.
     ±1.96 = (x̄c - 44)/(8.7/√58)
     ±2.239 = x̄c - 44
     x̄c = 46.239 and 41.761
     For 45 years:
     z = (46.239 - 45)/(8.7/√58) = 1.08
     From Table A.5, the area for z = 1.08 is .3599.
     β = .5000 + .3599 = .8599
     Power = 1 - β = 1 - .8599 = .1401
     For 46 years:
     z = (46.239 - 46)/(8.7/√58) = 0.21
     From Table A.5, the area for z = 0.21 is .0832.
     β = .5000 + .0832 = .5832
     Power = 1 - β = 1 - .5832 = .4168
     For 47 years:
     z = (46.239 - 47)/(8.7/√58) = -0.67
     From Table A.5, the area for z = -0.67 is .2486.
     β = .5000 - .2486 = .2514
     Power = 1 - β = 1 - .2514 = .7486
     For 48 years:
     z = (46.239 - 48)/(8.7/√58) = -1.54
     From Table A.5, the area for z = -1.54 is .4382.
     β = .5000 - .4382 = .0618
     Power = 1 - β = 1 - .0618 = .9382
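The repeated power calculations in 9.40 lend themselves to a loop. An illustrative sketch, using the upper rejection limit x̄c = 46.239 from the solution (the lower limit contributes negligibly for these alternatives):

```python
from math import sqrt, erf

def norm_cdf(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

# Power of the two-tailed test in 9.40 at several alternative means.
sigma, n, x_crit = 8.7, 58, 46.239
power = {}
for mu_a in (45, 46, 47, 48):
    beta = norm_cdf((x_crit - mu_a) / (sigma / sqrt(n)))  # P(fail to reject)
    power[mu_a] = round(1 - beta, 4)
print(power)
```

Power rises toward 1 as the alternative mean moves away from the hypothesized 44, matching the pattern in the solution.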
9.41 Ho: p = .71
     Ha: p < .71
     n = 463   x = 324   p̂ = 324/463 = .6998   α = .10
     z.10 = -1.28
     z = (p̂ - p)/√(p·q/n) = (.6998 - .71)/√((.71)(.29)/463) = -0.48
     Since the observed z = -0.48 > z.10 = -1.28, the decision is to fail to reject the null
     hypothesis.
     Type II error:
     Solving for the critical proportion, p̂c:
     zc = (p̂c - p)/√(p·q/n)
     -1.28 = (p̂c - .71)/√((.71)(.29)/463)
     p̂c = .683
     For pa = .69:
     z = (.683 - .69)/√((.69)(.31)/463) = -0.33
     From Table A.5, the area for z = -0.33 is .1293.
     The probability of committing a Type II error = .1293 + .5000 = .6293
     For pa = .66:
     z = (.683 - .66)/√((.66)(.34)/463) = 1.04
     From Table A.5, the area for z = 1.04 is .3508.
     The probability of committing a Type II error = .5000 - .3508 = .1492
     For pa = .60:
     z = (.683 - .60)/√((.60)(.40)/463) = 3.65
     From Table A.5, the area for z = 3.65 is essentially .5000.
     The probability of committing a Type II error = .5000 - .5000 = .0000
9.42 HTAB steps:
     1) Ho: μ = 36
        Ha: μ ≠ 36
     2) z = (x̄ - μ)/(σ/√n)
     3) α = .01
     4) Two-tailed test, α/2 = .005, z.005 = ±2.575
        If the observed value of z is greater than 2.575 or less than -2.575, the decision
        will be to reject the null hypothesis.
     5) n = 63, x̄ = 38.4, σ = 5.93
     6) z = (38.4 - 36)/(5.93/√63) = 3.21
     7) Since the observed value of z = 3.21 is greater than z.005 = 2.575, the decision
        is to reject the null hypothesis.
     8) The mean is likely to be greater than 36.
9.43 HTAB steps:
     1) Ho: μ = 7.82
        Ha: μ < 7.82
     2) The test statistic is t = (x̄ - μ)/(s/√n)
     3) α = .05
     4) df = n - 1 = 16, t.05,16 = -1.746. If the observed value of t is less than -1.746,
        then the decision will be to reject the null hypothesis.
     5) n = 17   x̄ = 7.01   s = 1.69
     6) t = (7.01 - 7.82)/(1.69/√17) = -1.98
     7) Since the observed t = -1.98 is less than the table value of t = -1.746, the
        decision is to reject the null hypothesis.
     8) The population mean is significantly less than 7.82.
9.44 HTAB steps:
     a. 1) Ho: p = .28
           Ha: p > .28
        2) z = (p̂ - p)/√(p·q/n)
        3) α = .10
        4) This is a one-tailed test, z.10 = 1.28. If the observed value of z is greater
           than 1.28, the decision will be to reject the null hypothesis.
        5) n = 783   x = 230   p̂ = 230/783 = .2937
        6) z = (.2937 - .28)/√((.28)(.72)/783) = 0.85
        7) Since z = 0.85 is less than z.10 = 1.28, the decision is to fail to reject the
           null hypothesis.
        8) There is not enough evidence to declare that p > .28.
     b. 1) Ho: p = .61
           Ha: p ≠ .61
        2) z = (p̂ - p)/√(p·q/n)
        3) α = .05
        4) This is a two-tailed test, z.025 = ±1.96. If the observed value of z is greater
           than 1.96 or less than -1.96, then the decision will be to reject the null
           hypothesis.
        5) n = 401   p̂ = .56
        6) z = (.56 - .61)/√((.61)(.39)/401) = -2.05
        7) Since z = -2.05 is less than z.025 = -1.96, the decision is to reject the null
           hypothesis.
        8) The population proportion is not likely to be .61.
9.45 HTAB steps:
     1) Ho: σ² = 15.4
        Ha: σ² > 15.4
     2) χ² = (n - 1)s²/σ²
     3) α = .01
     4) n = 18, df = 17, one-tailed test
        χ².01,17 = 33.4087
     5) s² = 29.6
     6) χ² = (17)(29.6)/15.4 = 32.675
     7) Since the observed χ² = 32.675 is less than 33.4087, the decision is to fail
        to reject the null hypothesis.
     8) The population variance is not significantly more than 15.4.
9.46 a) Ho: μ = 130
        Ha: μ > 130
        n = 75   σ = 12   α = .01   z.01 = 2.33   μa = 135
        Solving for x̄c:
        zc = (x̄c - μ)/(σ/√n)
        2.33 = (x̄c - 130)/(12/√75)
        x̄c = 133.23
        z = (133.23 - 135)/(12/√75) = -1.28
        From Table A.5, the area for z = -1.28 is .3997.
        β = .5000 - .3997 = .1003
     b) Ho: p = .44
        Ha: p < .44
        n = 1095   α = .05   pa = .42   z.05 = -1.645
        zc = (p̂c - p)/√(p·q/n)
        -1.645 = (p̂c - .44)/√((.44)(.56)/1095)
        p̂c = .4153
        z = (.4153 - .42)/√((.42)(.58)/1095) = -0.32
        From Table A.5, the area for z = -0.32 is .1255.
        β = .5000 + .1255 = .6255
9.47 Ho: p = .32
     Ha: p > .32
     n = 80   α = .01   p̂ = .39
     z.01 = 2.33
     z = (p̂ - p)/√(p·q/n) = (.39 - .32)/√((.32)(.68)/80) = 1.34
     Since the observed z = 1.34 < z.01 = 2.33, the decision is to fail to reject the null
     hypothesis.
9.48 x̄ = 3.45   n = 64   σ² = 1.31   α = .05
     Ho: μ = 3.3
     Ha: μ ≠ 3.3
     For two-tail, α/2 = .025   zc = ±1.96
     z = (x̄ - μ)/(σ/√n) = (3.45 - 3.3)/(√1.31/√64) = 1.05
     Since the observed z = 1.05 < zc = 1.96, the decision is to fail to reject the null
     hypothesis.
9.49 n = 210   x = 93   α = .10
     p̂ = x/n = 93/210 = .443
     Ho: p = .57
     Ha: p < .57
     For one-tail, α = .10   zc = -1.28
     z = (p̂ - p)/√(p·q/n) = (.443 - .57)/√((.57)(.43)/210) = -3.72
     Since the observed z = -3.72 < zc = -1.28, the decision is to reject the null hypothesis.
9.50 Ho: σ² = 16   n = 12   α = .05   df = 12 - 1 = 11
     Ha: σ² > 16
     s = 0.4987864 ft = 5.98544 in.
     χ².05,11 = 19.6752
     χ² = (n - 1)s²/σ² = (12 - 1)(5.98544)²/16 = 24.63
     Since χ² = 24.63 > χ².05,11 = 19.6752, the decision is to reject the null hypothesis.
9.51 Ho: μ = 8.4   α = .01   α/2 = .005   n = 7   df = 7 - 1 = 6   s = 1.3
     Ha: μ ≠ 8.4
     x̄ = 5.6   t.005,6 = ±3.707
     t = (x̄ - μ)/(s/√n) = (5.6 - 8.4)/(1.3/√7) = -5.70
     Since the observed t = -5.70 < t.005,6 = -3.707, the decision is to reject the null
     hypothesis.
9.52 x̄ = $26,650   n = 100   σ = $12,000
     a) Ho: μ = $25,000
        Ha: μ > $25,000   α = .05
        For one-tail, α = .05   z.05 = 1.645
        z = (x̄ - μ)/(σ/√n) = (26,650 - 25,000)/(12,000/√100) = 1.38
        Since the observed z = 1.38 < z.05 = 1.645, the decision is to fail to reject
        the null hypothesis.
     b) μa = $30,000   zc = 1.645
        Solving for x̄c:
        zc = (x̄c - μ)/(σ/√n)
        1.645 = (x̄c - 25,000)/(12,000/√100)
        x̄c = 25,000 + 1,974 = 26,974
        z = (26,974 - 30,000)/(12,000/√100) = -2.52
        From Table A.5, the area for z = -2.52 is .4941.
        β = .5000 - .4941 = .0059
9.53 Ho: σ² = 4   n = 8   s = 7.80   α = .10   df = 8 - 1 = 7
     Ha: σ² > 4
     χ².10,7 = 12.0170
     χ² = (n - 1)s²/σ² = (8 - 1)(7.80)²/4 = 106.47
     Since the observed χ² = 106.47 > χ².10,7 = 12.017, the decision is to reject the null
     hypothesis.
9.54 Ho: p = .46
     Ha: p > .46
     n = 125   x = 66   α = .05
     p̂ = x/n = 66/125 = .528
     Using a one-tailed test, z.05 = 1.645
     z = (p̂ - p)/√(p·q/n) = (.528 - .46)/√((.46)(.54)/125) = 1.53
     Since the observed value of z = 1.53 < z.05 = 1.645, the decision is to fail to reject
     the null hypothesis.
     Solving for p̂c:
     zc = (p̂c - p)/√(p·q/n)
     1.645 = (p̂c - .46)/√((.46)(.54)/125)
     p̂c = .533
     z = (p̂c - pa)/√(pa·qa/n) = (.533 - .50)/√((.50)(.50)/125) = 0.74
     From Table A.5, the area for z = 0.74 is .2704.
     β = .5000 + .2704 = .7704
9.55 n = 16   x̄ = 175   s = 14.28286   df = 16 - 1 = 15   α = .05
     Ho: μ = 185
     Ha: μ < 185
     t.05,15 = -1.753
     t = (x̄ - μ)/(s/√n) = (175 - 185)/(14.28286/√16) = -2.80
     Since the observed t = -2.80 < t.05,15 = -1.753, the decision is to reject the null
     hypothesis.
9.56 Ho: p = .182
     Ha: p > .182
     n = 428   x = 84   α = .01
     p̂ = x/n = 84/428 = .1963
     For a one-tailed test, z.01 = 2.33
     z = (p̂ - p)/√(p·q/n) = (.1963 - .182)/√((.182)(.818)/428) = 0.77
     Since the observed z = 0.77 < z.01 = 2.33, the decision is to fail to reject the null
     hypothesis.
     The probability of committing a Type I error is .01.
     Solving for p̂c:
     zc = (p̂c - p)/√(p·q/n)
     2.33 = (p̂c - .182)/√((.182)(.818)/428)
     p̂c = .2255
     For pa = .21:
     z = (p̂c - pa)/√(pa·qa/n) = (.2255 - .21)/√((.21)(.79)/428) = 0.79
     From Table A.5, the area for z = 0.79 is .2852.
     β = .5000 + .2852 = .7852
9.57 Ho: μ = $15
     Ha: μ > $15
     x̄ = $19.34   n = 35   σ = $4.52   α = .10
     For one-tail and α = .10, zc = 1.28
     z = (x̄ - μ)/(σ/√n) = (19.34 - 15)/(4.52/√35) = 5.68
     Since the observed z = 5.68 > zc = 1.28, the decision is to reject the null hypothesis.
9.58 Ho: σ² = 16   n = 22   df = 22 - 1 = 21   s = 6   α = .05
     Ha: σ² > 16
     χ².05,21 = 32.6706
     χ² = (n - 1)s²/σ² = (22 - 1)(6)²/16 = 47.25
     Since the observed χ² = 47.25 > χ².05,21 = 32.6706, the decision is to reject the null
     hypothesis.
9.59 Ho: μ = 2.5   x̄ = 3.4   s = 0.6   α = .01   n = 9   df = 9 - 1 = 8
     Ha: μ > 2.5
     t.01,8 = 2.896
     t = (x̄ - μ)/(s/√n) = (3.4 - 2.5)/(0.6/√9) = 4.50
     Since the observed t = 4.50 > t.01,8 = 2.896, the decision is to reject the null
     hypothesis.
9.60  a)  H0: μ = 23.58
          Ha: μ ≠ 23.58

n = 95   x̄ = 22.83   σ = 5.11   α = .05

Since this is a two-tailed test and using α/2 = .025:  z.025 = ±1.96

z = (x̄ - μ)/(σ/√n) = (22.83 - 23.58)/(5.11/√95) = -1.43

Since the observed z = -1.43 > z.025 = -1.96, the decision is to fail to reject the
null hypothesis.

b)  zc = (x̄c - μ)/(σ/√n)

±1.96 = (x̄c - 23.58)/(5.11/√95)

x̄c = 23.58 ± 1.03

x̄c = 22.55, 24.61

for Ha: μ = 22.30

z = (x̄c - μa)/(σ/√n) = (22.55 - 22.30)/(5.11/√95) = 0.48

z = (x̄c - μa)/(σ/√n) = (24.61 - 22.30)/(5.11/√95) = 4.41

From Table A.5, the areas for z = 0.48 and z = 4.41 are .1844 and .5000

β = .5000 - .1844 = .3156

The upper tail has no effect on β.
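The Type II error probability in 9.60b can be reproduced without normal tables by building the standard normal CDF from math.erf. This is a sketch of the same steps, and small differences from .3156 come only from table rounding:

```python
from math import sqrt, erf

def phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

# Problem 9.60b: critical sample means 22.55 and 24.61, alternative mu = 22.30,
# sigma = 5.11, n = 95.  beta = P(22.55 < x-bar < 24.61 | mu = 22.30)
se = 5.11 / sqrt(95)
z_low = (22.55 - 22.30) / se
z_high = (24.61 - 22.30) / se
beta = phi(z_high) - phi(z_low)
print(round(beta, 4))
```

The upper critical value contributes essentially nothing, which is why the solution notes the upper tail has no effect on β.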
9.61  n = 12   x̄ = 12.333   s² = 10.424

H0: σ² = 2.5
Ha: σ² ≠ 2.5

α = .05   df = 11   two-tailed test, α/2 = .025

χ².025,11 = 21.9200
χ².975,11 = 3.81574

If the observed χ² is greater than 21.9200 or less than 3.81574, the decision is to
reject the null hypothesis.

χ² = (n - 1)s²/σ² = 11(10.424)/2.5 = 45.866

Since the observed χ² = 45.866 is greater than χ².025,11 = 21.92, the decision is to
reject the null hypothesis. The population variance is significantly more than 2.5.
9.62  H0: μ = 23     x̄ = 18.5   s = 6.91   α = .10   n = 16   df = 16 - 1 = 15
      Ha: μ < 23

t.10,15 = -1.341

t = (x̄ - μ)/(s/√n) = (18.5 - 23)/(6.91/√16) = -2.60

Since the observed t = -2.60 < t.10,15 = -1.341, the decision is to reject the null
hypothesis.
9.63  The sample size is 22.   x̄ = 3.969   s = 0.866   df = 21

The test statistic is:

t = (x̄ - μ)/(s/√n)

The observed t = -2.33. The p-value is .015.

The results are statistically significant at α = .05.
The decision is to reject the null hypothesis.
9.64  H0: p = .25
      Ha: p ≠ .25

This is a two-tailed test with α = .05. n = 384.

Since the p-value = .045 < α = .05, the decision is to reject the null hypothesis.

The sample proportion, p̂ = .205729, is less than the hypothesized p = .25.
One conclusion is that the population proportion is lower than .25.
9.65  H0: μ = 2.51
      Ha: μ > 2.51

This is a one-tailed test. The sample mean is 2.55, which is more than the
hypothesized value. The observed t value is 1.51 with an associated p-value of .072
for a one-tailed test. Because the p-value is greater than α = .05, the decision is to
fail to reject the null hypothesis. There is not enough evidence to conclude that beef
prices are higher.
9.66  H0: μ = 2747
      Ha: μ < 2747

This is a one-tailed test. Sixty-seven households were included in this study.
The sample average amount spent on home-improvement projects was 2,349.

Since z = -2.09 < z.05 = -1.645, the decision is to reject the null hypothesis at
α = .05. This is underscored by the p-value of .018, which is less than α = .05.
However, the p-value of .018 also indicates that we would not reject the null
hypothesis at α = .01.
Chapter 10
Statistical Inferences about Two Populations
LEARNING OBJECTIVES
The general focus of Chapter 10 is on testing hypotheses and constructing confidence intervals
about parameters from two populations, thereby enabling you to
1. Test hypotheses and construct confidence intervals about the difference in two
population means using the z statistic.
2. Test hypotheses and establish confidence intervals about the difference in two
population means using the t statistic.
3. Test hypotheses and construct confidence intervals about the difference in
two related populations.
4. Test hypotheses and construct confidence intervals about the difference in two
population proportions.
5. Test hypotheses and construct confidence intervals about two population variances.
CHAPTER TEACHING STRATEGY
The major emphasis of chapter 10 is on analyzing data from two samples. The student
should be ready to deal with this topic given that he/she has tested hypotheses and computed
confidence intervals in previous chapters on single sample data.
In this chapter, the approach as to whether to use a z statistic or a t statistic for
analyzing the differences in two sample means is the same as that used in chapters 8
and 9. When the population variances are known, the z statistic can be used. However, if the
population variances are unknown and sample variances are being used, then the
t test is the appropriate statistic for the analysis. It is always an assumption underlying the use
of the t statistic that the populations are normally distributed. If sample sizes are small and the
population variances are known, the z statistic can be used if the populations are normally
distributed.
In conducting a t test for the difference of two means from independent populations,
there are two different formulas given in the chapter. One version of this test uses a "pooled"
estimate of the population variance and assumes that the population variances are equal. The
other version does not assume equal population variances and is simpler to compute. In doing
hand calculations, it is generally easier to use the pooled variance formula because the
degrees of freedom formula for the unequal variance formula is quite complex. However, it is
good to expose students to both formulas since computer software packages often give you the
option of using the pooled formula that assumes equal population variances or the formula for
unequal variances.
A t test is also included for related (non-independent) samples. It is important that the
student be able to recognize when two samples are related and when they are independent.
The first portion of section 10.3 addresses this issue. To underscore the potential difference in
the outcome of the two techniques, it is sometimes valuable to analyze some related measures
data with both techniques and demonstrate that the results and conclusions are usually quite
different. You can have your students work problems like this using both techniques to help
them understand the differences between the two tests (independent and dependent t tests)
and the different outcomes they will obtain.
A z test of proportions for two samples is presented here along with an F test for two
population variances. This is a good place to introduce the student to the F distribution in
preparation for analysis of variance in Chapter 11. The student will begin to understand that the
F values have two different degrees of freedom. The F distribution tables are upper tailed only.
For this reason, formula 10.14 is given in the chapter to be used to compute lower tailed F
values for two-tailed tests.
CHAPTER OUTLINE
10.1 Hypothesis Testing and Confidence Intervals about the Difference in Two Means using
the z Statistic (Population Variances Known)
Hypothesis Testing
Confidence Intervals
Using the Computer to Test Hypotheses about the Difference in Two
Population Means Using the z Test
10.2 Hypothesis Testing and Confidence Intervals about the Difference in Two Means:
Independent Samples and Population Variances Unknown
Hypothesis Testing
Using the Computer to Test Hypotheses and Construct Confidence
Intervals about the Difference in Two Population Means Using the t
Test
Confidence Intervals
10.3 Statistical Inferences For Two Related Populations
Hypothesis Testing
Using the Computer to Make Statistical Inferences about Two Related
Populations
Confidence Intervals
10.4 Statistical Inferences About Two Population Proportions, p1 - p2
Hypothesis Testing
Confidence Intervals
Using the Computer to Analyze the Difference in Two Proportions
10.5 Testing Hypotheses About Two Population Variances
Using the Computer to Test Hypotheses about Two Population Variances
KEY TERMS
Dependent Samples
F Distribution
F Value
Independent Samples
Matched-Pairs Test
Related Measures
SOLUTIONS TO PROBLEMS IN CHAPTER 10
10.1  Sample 1             Sample 2
      x̄1 = 51.3            x̄2 = 53.2
      s1² = 52             s2² = 60
      n1 = 31              n2 = 32

a)  H0: μ1 - μ2 = 0
    Ha: μ1 - μ2 < 0

For a one-tail test, α = .10   z.10 = -1.28

z = [(x̄1 - x̄2) - (μ1 - μ2)]/√(σ1²/n1 + σ2²/n2)
  = [(51.3 - 53.2) - 0]/√(52/31 + 60/32) = -1.01

Since the observed z = -1.01 > zc = -1.28, the decision is to fail to reject the null
hypothesis.

b)  Critical value method:

zc = [(x̄1 - x̄2)c - (μ1 - μ2)]/√(σ1²/n1 + σ2²/n2)

-1.28 = [(x̄1 - x̄2)c - 0]/√(52/31 + 60/32)

(x̄1 - x̄2)c = -2.41

c)  The area for z = -1.01 using Table A.5 is .3438.

The p-value is .5000 - .3438 = .1562
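The two-sample z statistic used in 10.1a can be sketched directly from its formula; only the standard library is used, and the function name is my own:

```python
from math import sqrt

def two_sample_z(x1, x2, var1, var2, n1, n2, delta0=0.0):
    # z = [(x1-bar - x2-bar) - delta0] / sqrt(var1/n1 + var2/n2)
    return ((x1 - x2) - delta0) / sqrt(var1 / n1 + var2 / n2)

# Problem 10.1a: the sample variances stand in for the population
# variances (large samples)
z = two_sample_z(51.3, 53.2, 52, 60, 31, 32)
print(round(z, 2))  # -1.01
```

Because -1.01 does not fall below the critical -1.28, the test fails to reject the null hypothesis, as above.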
10.2  Sample 1             Sample 2
      n1 = 32              n2 = 31
      x̄1 = 70.4            x̄2 = 68.7
      σ1 = 5.76            σ2 = 6.1

For a 90% C.I., z.05 = 1.645

(x̄1 - x̄2) ± z·√(σ1²/n1 + σ2²/n2)

(70.4 - 68.7) ± 1.645·√(5.76²/32 + 6.1²/31)

1.7 ± 2.46

-.76 < μ1 - μ2 < 4.16
10.3  a)  Sample 1             Sample 2
          x̄1 = 88.23           x̄2 = 81.2
          σ1² = 22.74          σ2² = 26.65
          n1 = 30              n2 = 30

H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0

For a two-tail test, use α/2 = .01   z.01 = ±2.33

z = [(x̄1 - x̄2) - (μ1 - μ2)]/√(σ1²/n1 + σ2²/n2)
  = [(88.23 - 81.2) - 0]/√(22.74/30 + 26.65/30) = 5.48

Since the observed z = 5.48 > z.01 = 2.33, the decision is to reject the null
hypothesis.

b)  (x̄1 - x̄2) ± z·√(σ1²/n1 + σ2²/n2)

(88.23 - 81.2) ± 2.33·√(22.74/30 + 26.65/30)

7.03 ± 2.99

4.04 < μ1 - μ2 < 10.02

This supports the decision made in a) to reject the null hypothesis because zero is
not in the interval.
10.4  Computers/electronics     Food/Beverage
      x̄1 = 1.96                 x̄2 = 3.02
      σ1² = 1.0188              σ2² = .9180
      n1 = 50                   n2 = 50

H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0

For a two-tail test, α/2 = .005   z.005 = ±2.575

z = [(1.96 - 3.02) - 0]/√(1.0188/50 + .9180/50) = -5.39

Since the observed z = -5.39 < zc = -2.575, the decision is to reject the null
hypothesis.
10.5  A                    B
      n1 = 40              n2 = 37
      x̄1 = 5.3             x̄2 = 6.5
      σ1² = 1.99           σ2² = 2.36

For a 95% C.I., z.025 = 1.96

(x̄1 - x̄2) ± z·√(σ1²/n1 + σ2²/n2)

(5.3 - 6.5) ± 1.96·√(1.99/40 + 2.36/37)

-1.2 ± .66

-1.86 < μ1 - μ2 < -.54

The results indicate that we are 95% confident that, on average, Plumber B does
between 0.54 and 1.86 more jobs per day than Plumber A. Since zero does not lie in
this interval, we are confident that there is a difference between Plumber A and
Plumber B.
10.6  Managers             Specialty
      n1 = 35              n2 = 41
      x̄1 = 1.84            x̄2 = 1.99
      σ1 = .38             σ2 = .51

For a 98% C.I., z.01 = 2.33

(x̄1 - x̄2) ± z·√(σ1²/n1 + σ2²/n2)

(1.84 - 1.99) ± 2.33·√(.38²/35 + .51²/41)

-.15 ± .2384

-.3884 < μ1 - μ2 < .0884

Point Estimate = -.15

Hypothesis Test:

1) H0: μ1 - μ2 = 0
   Ha: μ1 - μ2 ≠ 0

2) z = [(x̄1 - x̄2) - (μ1 - μ2)]/√(σ1²/n1 + σ2²/n2)

3) α = .02

4) For a two-tailed test, z.01 = ±2.33. If the observed z value is greater than 2.33
   or less than -2.33, then the decision will be to reject the null hypothesis.

5) Data given above

6) z = [(1.84 - 1.99) - 0]/√(.38²/35 + .51²/41) = -1.47

7) Since z = -1.47 > z.01 = -2.33, the decision is to fail to reject the null
   hypothesis.

8) There is no significant difference in the hourly rates of the two groups.
10.7  1996                 2006
      x̄1 = 190             x̄2 = 198
      σ1 = 18.50           σ2 = 15.60
      n1 = 51              n2 = 47      α = .01

H0: μ1 - μ2 = 0
Ha: μ1 - μ2 < 0

For a one-tailed test, z.01 = -2.33

z = [(190 - 198) - 0]/√(18.50²/51 + 15.60²/47) = -2.32

Since the observed z = -2.32 > z.01 = -2.33, the decision is to fail to reject the
null hypothesis.
10.8  Seattle              Atlanta
      n1 = 31              n2 = 31
      x̄1 = 2.64            x̄2 = 2.36
      σ1² = .03            σ2² = .015

For a 99% C.I., z.005 = 2.575

(x̄1 - x̄2) ± z·√(σ1²/n1 + σ2²/n2)

(2.64 - 2.36) ± 2.575·√(.03/31 + .015/31)

.28 ± .10

.18 < μ1 - μ2 < .38

Between $.18 and $.38 difference, with Seattle being more expensive.
10.9  Canon                Pioneer
      x̄1 = 5.8             x̄2 = 5.0
      σ1 = 1.7             σ2 = 1.4
      n1 = 36              n2 = 45

H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0

For a two-tail test, α/2 = .025   z.025 = ±1.96

z = [(5.8 - 5.0) - 0]/√(1.7²/36 + 1.4²/45) = 2.27

Since the observed z = 2.27 > zc = 1.96, the decision is to reject the null
hypothesis.
10.10  A                    B
       x̄1 = 8.05            x̄2 = 7.26
       σ1 = 1.36            σ2 = 1.06
       n1 = 50              n2 = 38

H0: μ1 - μ2 = 0
Ha: μ1 - μ2 > 0

For a one-tail test, α = .10   z.10 = 1.28

z = [(8.05 - 7.26) - 0]/√(1.36²/50 + 1.06²/38) = 3.06

Since the observed z = 3.06 > zc = 1.28, the decision is to reject the null
hypothesis.
10.11  H0: μ1 - μ2 = 0     α = .01
       Ha: μ1 - μ2 < 0     df = 8 + 11 - 2 = 17

       Sample 1             Sample 2
       n1 = 8               n2 = 11
       x̄1 = 24.56           x̄2 = 26.42
       s1² = 12.4           s2² = 15.8

For a one-tail test, α = .01   Critical t.01,17 = -2.567

t = [(x̄1 - x̄2) - (μ1 - μ2)] / [√((s1²(n1 - 1) + s2²(n2 - 1))/(n1 + n2 - 2)) · √(1/n1 + 1/n2)]

t = [(24.56 - 26.42) - 0] / [√((12.4(7) + 15.8(10))/(8 + 11 - 2)) · √(1/8 + 1/11)] = -1.05

Since the observed t = -1.05 > t.01,17 = -2.567, the decision is to fail to reject the
null hypothesis.
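The pooled-variance t statistic used in 10.11 (and the problems that follow) can be sketched as a small helper; the name pooled_t is mine, not from the text:

```python
from math import sqrt

def pooled_t(x1, x2, s1_sq, s2_sq, n1, n2, delta0=0.0):
    # pooled variance: sp^2 = [s1^2(n1-1) + s2^2(n2-1)] / (n1 + n2 - 2)
    sp2 = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    # t = [(x1-bar - x2-bar) - delta0] / sqrt(sp^2 * (1/n1 + 1/n2))
    return ((x1 - x2) - delta0) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Problem 10.11: n1 = 8, n2 = 11, df = 17
t = pooled_t(24.56, 26.42, 12.4, 15.8, 8, 11)
print(round(t, 2))  # -1.05
```

Since -1.05 is above the critical -2.567, the test fails to reject the null hypothesis, matching the solution.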
10.12  a)  H0: μ1 - μ2 = 0     α = .10
           Ha: μ1 - μ2 ≠ 0     df = 20 + 20 - 2 = 38

       Sample 1             Sample 2
       n1 = 20              n2 = 20
       x̄1 = 118             x̄2 = 113
       s1 = 23.9            s2 = 21.6

For a two-tail test, α/2 = .05   Critical t.05,38 = 1.697 (used df = 30)

t = [(x̄1 - x̄2) - (μ1 - μ2)] / [√((s1²(n1 - 1) + s2²(n2 - 1))/(n1 + n2 - 2)) · √(1/n1 + 1/n2)]

t = [(118 - 113) - 0] / [√(((23.9)²(19) + (21.6)²(19))/(20 + 20 - 2)) · √(1/20 + 1/20)] = 0.69

Since the observed t = 0.69 < t.05,38 = 1.697, the decision is to fail to reject the
null hypothesis.

b)  (x̄1 - x̄2) ± t·√((s1²(n1 - 1) + s2²(n2 - 1))/(n1 + n2 - 2))·√(1/n1 + 1/n2)

(118 - 113) ± 1.697·√(((23.9)²(19) + (21.6)²(19))/38)·√(1/20 + 1/20)

5 ± 12.224

-7.224 < μ1 - μ2 < 17.224
10.13  H0: μ1 - μ2 = 0     α = .05
       Ha: μ1 - μ2 > 0     df = n1 + n2 - 2 = 10 + 10 - 2 = 18

       Sample 1             Sample 2
       n1 = 10              n2 = 10
       x̄1 = 45.38           x̄2 = 40.49
       s1 = 2.357           s2 = 2.355

For a one-tail test, α = .05   Critical t.05,18 = 1.734

t = [(45.38 - 40.49) - 0] / [√(((2.357)²(9) + (2.355)²(9))/(10 + 10 - 2)) · √(1/10 + 1/10)] = 4.64

Since the observed t = 4.64 > t.05,18 = 1.734, the decision is to reject the null
hypothesis.
10.14  H0: μ1 - μ2 = 0     α = .01
       Ha: μ1 - μ2 ≠ 0     df = 18 + 18 - 2 = 34

       Sample 1             Sample 2
       n1 = 18              n2 = 18
       x̄1 = 5.333           x̄2 = 9.444
       s1² = 12             s2² = 2.026

For a two-tail test, α/2 = .005   Critical t.005,34 = 2.75 (used df = 30)

t = [(5.333 - 9.444) - 0] / [√((12(17) + 2.026(17))/(18 + 18 - 2)) · √(1/18 + 1/18)] = -4.66

Since the observed t = -4.66 < t.005,34 = -2.75, reject the null hypothesis.

b)  For 98% confidence, t.01,30 = 2.457

(5.333 - 9.444) ± 2.457·√((12(17) + 2.026(17))/34)·√(1/18 + 1/18)

-4.111 ± 2.1689

-6.2799 < μ1 - μ2 < -1.9421
10.15  Peoria               Evansville
       n1 = 21              n2 = 26
       x̄1 = 116,900         x̄2 = 114,000
       s1 = 2,300           s2 = 1,750      df = 21 + 26 - 2 = 45

90% level of confidence, α/2 = .05   t.05,45 = 1.684 (used df = 40)

(116,900 - 114,000) ± 1.684·√(((2300)²(20) + (1750)²(25))/45)·√(1/21 + 1/26)

2,900 ± 994.62

1905.38 < μ1 - μ2 < 3894.62
10.16  H0: μ1 - μ2 = 0     α = .10
       Ha: μ1 - μ2 ≠ 0     df = 12 + 12 - 2 = 22

       Co-op                Interns
       n1 = 12              n2 = 12
       x̄1 = $15.645         x̄2 = $15.439
       s1 = $1.093          s2 = $0.958

For a two-tail test, α/2 = .05   Critical t.05,22 = 1.717

t = [(15.645 - 15.439) - 0] / [√(((1.093)²(11) + (0.958)²(11))/(12 + 12 - 2)) · √(1/12 + 1/12)] = 0.49

Since the observed t = 0.49 < t.05,22 = 1.717, the decision is to fail to reject the
null hypothesis.

90% Confidence Interval:  t.05,22 = 1.717

(15.645 - 15.439) ± 1.717·√(((1.093)²(11) + (0.958)²(11))/22)·√(1/12 + 1/12)

0.206 ± 0.7204

-0.5144 < μ1 - μ2 < 0.9264
10.17  Let Boston be group 1

1) H0: μ1 - μ2 = 0
   Ha: μ1 - μ2 > 0

2) t = [(x̄1 - x̄2) - (μ1 - μ2)] / [√((s1²(n1 - 1) + s2²(n2 - 1))/(n1 + n2 - 2)) · √(1/n1 + 1/n2)]

3) α = .01

4) For a one-tailed test and df = 8 + 9 - 2 = 15, t.01,15 = 2.602. If the observed
   value of t is greater than 2.602, the decision is to reject the null hypothesis.

5) Boston               Dallas
   n1 = 8               n2 = 9
   x̄1 = 47              x̄2 = 44
   s1 = 3               s2 = 3

6) t = [(47 - 44) - 0] / [√((3²(7) + 3²(8))/15) · √(1/8 + 1/9)] = 2.06

7) Since t = 2.06 < t.01,15 = 2.602, the decision is to fail to reject the null
   hypothesis.

8) There is no significant difference in rental rates between Boston and Dallas.
10.18  nm = 22              nno = 20
       x̄m = 112             x̄no = 122
       sm = 11              sno = 12

df = nm + nno - 2 = 22 + 20 - 2 = 40

For a 98% Confidence Interval, α/2 = .01 and t.01,40 = 2.423

(112 - 122) ± 2.423·√(((11)²(21) + (12)²(19))/40)·√(1/22 + 1/20)

-10 ± 8.60

-$18.60 < μ1 - μ2 < -$1.40

Point Estimate = -$10
10.19  H0: μ1 - μ2 = 0
       Ha: μ1 - μ2 ≠ 0

df = n1 + n2 - 2 = 11 + 11 - 2 = 20

       Toronto              Mexico City
       n1 = 11              n2 = 11
       x̄1 = $67,381.82      x̄2 = $63,481.82
       s1 = $2,067.28       s2 = $1,594.25

For a two-tail test, α/2 = .005   Critical t.005,20 = 2.845

t = [(67,381.82 - 63,481.82) - 0] / [√(((2,067.28)²(10) + (1,594.25)²(10))/20) · √(1/11 + 1/11)] = 4.95

Since the observed t = 4.95 > t.005,20 = 2.845, the decision is to reject the null
hypothesis.
10.20  H0: μ1 - μ2 = 0
       Ha: μ1 - μ2 > 0

df = n1 + n2 - 2 = 9 + 10 - 2 = 17

       Men                  Women
       n1 = 9               n2 = 10
       x̄1 = $110.92         x̄2 = $75.48
       s1 = $28.79          s2 = $30.51

This is a one-tail test, α = .01   Critical t.01,17 = 2.567

t = [(110.92 - 75.48) - 0] / [√(((28.79)²(8) + (30.51)²(9))/(9 + 10 - 2)) · √(1/9 + 1/10)] = 2.60

Since the observed t = 2.60 > t.01,17 = 2.567, the decision is to reject the null
hypothesis.
10.21  H0: D = 0
       Ha: D > 0

       Sample 1   Sample 2     d
          38         22        16
          27         28        -1
          30         21         9
          41         38         3
          36         38        -2
          38         26        12
          33         19        14
          35         31         4
          44         35         9

n = 9   d̄ = 7.11   sd = 6.45   α = .01

df = n - 1 = 9 - 1 = 8

For a one-tail test and α = .01, the critical t.01,8 = 2.896

t = (d̄ - D)/(sd/√n) = (7.11 - 0)/(6.45/√9) = 3.31

Since the observed t = 3.31 > t.01,8 = 2.896, the decision is to reject the null
hypothesis.
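The matched-pairs t statistic in 10.21 can be recomputed from the raw differences using the standard library's statistics module (stdev gives the sample standard deviation, which is what the formula needs):

```python
from math import sqrt
from statistics import mean, stdev

# Problem 10.21: paired differences d = sample1 - sample2
d = [16, -1, 9, 3, -2, 12, 14, 4, 9]
d_bar = mean(d)
s_d = stdev(d)  # sample standard deviation of the differences
t = (d_bar - 0) / (s_d / sqrt(len(d)))
print(round(t, 2))  # 3.31
```

Since 3.31 exceeds the critical 2.896, the null hypothesis is rejected, in agreement with the hand calculation.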
10.22  H0: D = 0
       Ha: D ≠ 0

       Before   After     d
        107      102       5
         99       98       1
        110      100      10
        113      108       5
         96       89       7
         98      101      -3
        100       99       1
        102      102       0
        107      105       2
        109      110      -1
        104      102       2
         99       96       3
        101      100       1

n = 13   d̄ = 2.5385   sd = 3.4789   α = .05

df = n - 1 = 13 - 1 = 12

For a two-tail test and α/2 = .025   Critical t.025,12 = 2.179

t = (d̄ - D)/(sd/√n) = (2.5385 - 0)/(3.4789/√13) = 2.63

Since the observed t = 2.63 > t.025,12 = 2.179, the decision is to reject the null
hypothesis.
10.23  n = 22   d̄ = 40.56   sd = 26.58

For a 98% Level of Confidence, α/2 = .01, and df = n - 1 = 22 - 1 = 21

t.01,21 = 2.518

d̄ ± t·sd/√n

40.56 ± (2.518)(26.58/√22)

40.56 ± 14.27

26.29 < D < 54.83
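The paired-difference confidence interval in 10.23 follows the same d̄ ± t·sd/√n pattern used throughout this section. A minimal sketch, taking the table t value as an input since the standard library has no t-quantile function (the helper name is my own):

```python
from math import sqrt

def paired_ci(d_bar, s_d, n, t_crit):
    # interval: d_bar +/- t * s_d / sqrt(n)
    margin = t_crit * s_d / sqrt(n)
    return d_bar - margin, d_bar + margin

# Problem 10.23: n = 22, d-bar = 40.56, sd = 26.58, t(.01,21) = 2.518 (Table A.6)
lo, hi = paired_ci(40.56, 26.58, 22, 2.518)
print(round(lo, 2), round(hi, 2))  # 26.29 54.83
```

The same function reproduces the intervals in 10.24, 10.25, 10.27, and 10.29 with the corresponding inputs.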
10.24  Before   After     d
         32       40      -8
         28       25       3
         35       36      -1
         32       32       0
         26       29      -3
         25       31      -6
         37       39      -2
         16       30     -14
         35       31       4

n = 9   d̄ = -3   sd = 5.6347

df = n - 1 = 9 - 1 = 8

For a 90% level of confidence and α/2 = .05, t.05,8 = 1.86

d̄ ± t·sd/√n

-3 ± (1.86)(5.6347/√9) = -3 ± 3.49

-6.49 < D < 0.49
10.25  City             Cost    Resale      d
       Atlanta          20427   25163    -4736
       Boston           27255   24625     2630
       Des Moines       22115   12600     9515
       Kansas City      23256   24588    -1332
       Louisville       21887   19267     2620
       Portland         24255   20150     4105
       Raleigh-Durham   19852   22500    -2648
       Reno             23624   16667     6957
       Ridgewood        25885   26875     -990
       San Francisco    28999   35333    -6334
       Tulsa            20836   16292     4544

d̄ = 1302.82   sd = 4938.22   n = 11,  df = 10

α = .01   α/2 = .005   t.005,10 = 3.169

d̄ ± t·sd/√n = 1302.82 ± 3.169(4938.22/√11) = 1302.82 ± 4718.42

-3415.6 < D < 6021.2
10.26  H0: D = 0
       Ha: D < 0

       Before   After     d
          2        4      -2
          4        5      -1
          1        3      -2
          3        3       0
          4        3       1
          2        5      -3
          2        6      -4
          3        4      -1
          1        5      -4

n = 9   d̄ = -1.778   sd = 1.716   α = .05   df = n - 1 = 9 - 1 = 8

For a one-tail test and α = .05, the critical t.05,8 = -1.86

t = (d̄ - D)/(sd/√n) = (-1.778 - 0)/(1.716/√9) = -3.11

Since the observed t = -3.11 < t.05,8 = -1.86, the decision is to reject the null
hypothesis.
10.27  Before   After     d
        255      197      58
        230      225       5
        290      215      75
        242      215      27
        300      240      60
        250      235      15
        215      190      25
        230      240     -10
        225      200      25
        219      203      16
        236      223      13

n = 11   d̄ = 28.09   sd = 25.813   df = n - 1 = 11 - 1 = 10

For a 98% level of confidence and α/2 = .01, t.01,10 = 2.764

d̄ ± t·sd/√n

28.09 ± (2.764)(25.813/√11) = 28.09 ± 21.51

6.58 < D < 49.60
10.28  H0: D = 0
       Ha: D > 0     n = 27   df = 27 - 1 = 26

d̄ = 3.71   sd = 5

Since α = .01, the critical t.01,26 = 2.479

t = (d̄ - D)/(sd/√n) = (3.71 - 0)/(5/√27) = 3.86

Since the observed t = 3.86 > t.01,26 = 2.479, the decision is to reject the null
hypothesis.
10.29  n = 21   d̄ = 75   sd = 30   df = 21 - 1 = 20

For a 90% confidence level, α/2 = .05 and t.05,20 = 1.725

d̄ ± t·sd/√n

75 ± 1.725(30/√21) = 75 ± 11.29

63.71 < D < 86.29
10.30  H0: D = 0
       Ha: D ≠ 0

n = 15   d̄ = -2.85   sd = 1.9   α = .01   df = 15 - 1 = 14

For a two-tail test, α/2 = .005 and the critical t.005,14 = ±2.977

t = (d̄ - D)/(sd/√n) = (-2.85 - 0)/(1.9/√15) = -5.81

Since the observed t = -5.81 < t.005,14 = -2.977, the decision is to reject the null
hypothesis.
10.31  a)  Sample 1             Sample 2
           n1 = 368             n2 = 405
           x1 = 175             x2 = 182

p̂1 = x1/n1 = 175/368 = .476            p̂2 = x2/n2 = 182/405 = .449

p̄ = (x1 + x2)/(n1 + n2) = (175 + 182)/(368 + 405) = 357/773 = .462

H0: p1 - p2 = 0
Ha: p1 - p2 ≠ 0

For two-tail, α/2 = .025 and z.025 = ±1.96

z = [(p̂1 - p̂2) - (p1 - p2)] / √(p̄·q̄·(1/n1 + 1/n2))
  = [(.476 - .449) - 0] / √((.462)(.538)(1/368 + 1/405)) = 0.75

Since the observed z = 0.75 < zc = 1.96, the decision is to fail to reject the null
hypothesis.

b)  Sample 1             Sample 2
    p̂1 = .38             p̂2 = .25
    n1 = 649             n2 = 558

p̄ = (n1·p̂1 + n2·p̂2)/(n1 + n2) = (649(.38) + 558(.25))/(649 + 558) = .32

H0: p1 - p2 = 0
Ha: p1 - p2 > 0

For a one-tail test and α = .10, z.10 = 1.28

z = [(.38 - .25) - 0] / √((.32)(.68)(1/649 + 1/558)) = 4.83

Since the observed z = 4.83 > zc = 1.28, the decision is to reject the null
hypothesis.
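The two-proportion z statistic with a pooled estimate, as in 10.31a, can be sketched as follows; because the code carries unrounded proportions, it returns z ≈ 0.73 rather than the 0.75 obtained from the rounded .476, .449, and .462 (the function name is my own):

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    # pooled estimate: p-bar = (x1 + x2) / (n1 + n2)
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)
    se = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Problem 10.31a: 175 of 368 versus 182 of 405
z = two_proportion_z(175, 368, 182, 405)
print(round(z, 2))
```

Either value is well inside ±1.96, so the fail-to-reject conclusion is unaffected by the rounding.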
10.32  a)  n1 = 85   n2 = 90   p̂1 = .75   p̂2 = .67

For a 90% Confidence Level, z.05 = 1.645

(p̂1 - p̂2) ± z·√(p̂1q̂1/n1 + p̂2q̂2/n2)

(.75 - .67) ± 1.645·√((.75)(.25)/85 + (.67)(.33)/90) = .08 ± .11

-.03 < p1 - p2 < .19

b)  n1 = 1100   n2 = 1300   p̂1 = .19   p̂2 = .17

For a 95% Confidence Level, α/2 = .025 and z.025 = 1.96

(.19 - .17) ± 1.96·√((.19)(.81)/1100 + (.17)(.83)/1300) = .02 ± .03

-.01 < p1 - p2 < .05

c)  n1 = 430   n2 = 399   x1 = 275   x2 = 275

p̂1 = x1/n1 = 275/430 = .64            p̂2 = x2/n2 = 275/399 = .69

For an 85% Confidence Level, α/2 = .075 and z.075 = 1.44

(.64 - .69) ± 1.44·√((.64)(.36)/430 + (.69)(.31)/399) = -.05 ± .047

-.097 < p1 - p2 < -.003

d)  n1 = 1500   n2 = 1500   x1 = 1050   x2 = 1100

p̂1 = 1050/1500 = .70            p̂2 = 1100/1500 = .733

For an 80% Confidence Level, α/2 = .10 and z.10 = 1.28

(.70 - .733) ± 1.28·√((.70)(.30)/1500 + (.733)(.267)/1500) = -.033 ± .02

-.053 < p1 - p2 < -.013
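All four intervals in 10.32 share one formula, (p̂1 - p̂2) ± z·√(p̂1q̂1/n1 + p̂2q̂2/n2), which uses the unpooled standard error. A minimal sketch (helper name of my own choosing):

```python
from math import sqrt

def prop_diff_ci(p1, n1, p2, n2, z_crit):
    # unpooled CI: (p1 - p2) +/- z * sqrt(p1*q1/n1 + p2*q2/n2)
    margin = z_crit * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - margin, diff + margin

# Problem 10.32a: 90% level, z = 1.645
lo, hi = prop_diff_ci(0.75, 85, 0.67, 90, 1.645)
print(round(lo, 2), round(hi, 2))  # -0.03 0.19
```

Note the contrast with the hypothesis tests in this section, which use the pooled estimate p̄ in the standard error instead.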
10.33  H0: pm - pw = 0
       Ha: pm - pw < 0

nm = 374   nw = 481   p̂m = .59   p̂w = .70

For a one-tailed test and α = .05, z.05 = -1.645

p̄ = (nm·p̂m + nw·p̂w)/(nm + nw) = (374(.59) + 481(.70))/(374 + 481) = .652

z = [(p̂m - p̂w) - (pm - pw)] / √(p̄·q̄·(1/nm + 1/nw))
  = [(.59 - .70) - 0] / √((.652)(.348)(1/374 + 1/481)) = -3.35

Since the observed z = -3.35 < z.05 = -1.645, the decision is to reject the null
hypothesis.
10.34  n1 = 210   n2 = 176   p̂1 = .24   p̂2 = .35

For a 90% Confidence Level, α/2 = .05 and z.05 = ±1.645

(p̂1 - p̂2) ± z·√(p̂1q̂1/n1 + p̂2q̂2/n2)

(.24 - .35) ± 1.645·√((.24)(.76)/210 + (.35)(.65)/176) = -.11 ± .0765

-.1865 < p1 - p2 < -.0335
10.35  Computer Firms       Banks
       p̂1 = .48             p̂2 = .56
       n1 = 56              n2 = 89

p̄ = (n1·p̂1 + n2·p̂2)/(n1 + n2) = (56(.48) + 89(.56))/(56 + 89) = .529

H0: p1 - p2 = 0
Ha: p1 - p2 ≠ 0

For a two-tail test, α/2 = .10 and zc = ±1.28

z = [(.48 - .56) - 0] / √((.529)(.471)(1/56 + 1/89)) = -0.94

Since the observed z = -0.94 > zc = -1.28, the decision is to fail to reject the null
hypothesis.
10.36  A                    B
       n1 = 35              n2 = 35
       x1 = 5               x2 = 7

p̂1 = x1/n1 = 5/35 = .14            p̂2 = x2/n2 = 7/35 = .20

For a 98% Confidence Level, α/2 = .01 and z.01 = 2.33

(p̂1 - p̂2) ± z·√(p̂1q̂1/n1 + p̂2q̂2/n2)

(.14 - .20) ± 2.33·√((.14)(.86)/35 + (.20)(.80)/35) = -.06 ± .21

-.27 < p1 - p2 < .15
10.37  H0: p1 - p2 = 0
       Ha: p1 - p2 ≠ 0

α = .10   p̂1 = .09   p̂2 = .06   n1 = 780   n2 = 915

For a two-tailed test, α/2 = .05 and z.05 = ±1.645

p̄ = (n1·p̂1 + n2·p̂2)/(n1 + n2) = (780(.09) + 915(.06))/(780 + 915) = .0738

z = [(.09 - .06) - 0] / √((.0738)(.9262)(1/780 + 1/915)) = 2.35

Since the observed z = 2.35 > z.05 = 1.645, the decision is to reject the null
hypothesis.
10.38  n1 = 850   n2 = 910   p̂1 = .60   p̂2 = .52

For a 95% Confidence Level, α/2 = .025 and z.025 = ±1.96

(p̂1 - p̂2) ± z·√(p̂1q̂1/n1 + p̂2q̂2/n2)

(.60 - .52) ± 1.96·√((.60)(.40)/850 + (.52)(.48)/910) = .08 ± .046

.034 < p1 - p2 < .126
10.39  H0: σ1² = σ2²     α = .01   n1 = 10   s1² = 562
       Ha: σ1² < σ2²     n2 = 12   s2² = 1013

dfnum = 12 - 1 = 11   dfdenom = 10 - 1 = 9

Table F.01,10,9 = 5.26

F = s2²/s1² = 1013/562 = 1.80

Since the observed F = 1.80 < F.01,10,9 = 5.26, the decision is to fail to reject the
null hypothesis.
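The F statistic in 10.39 is simply the ratio of the two sample variances, with the variance implicated by the alternative hypothesis placed in the numerator so the test stays upper-tailed:

```python
# Problem 10.39: Ha says sigma1^2 < sigma2^2, so put s2^2 over s1^2
# to keep the ratio in the upper tail of the F distribution
s1_sq, s2_sq = 562, 1013
F = s2_sq / s1_sq
print(round(F, 2))  # 1.8
```

Since 1.80 is below the table value 5.26, the test fails to reject the null hypothesis.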
10.40  H0: σ1² = σ2²     α = .05   n1 = 5    s1 = 4.68
       Ha: σ1² ≠ σ2²     n2 = 19   s2 = 2.78

dfnum = 5 - 1 = 4   dfdenom = 19 - 1 = 18

The critical table F values are:  F.025,4,18 = 3.61   F.975,18,4 = .277

F = s1²/s2² = (4.68)²/(2.78)² = 2.83

Since the observed F = 2.83 < F.025,4,18 = 3.61, the decision is to fail to reject the
null hypothesis.
10.41  City 1    City 2
       3.43      3.33
       3.40      3.42
       3.39      3.39
       3.32      3.30
       3.39      3.46
       3.38      3.39
       3.34      3.36
       3.38      3.44
       3.38      3.37
       3.28      3.38

n1 = 10   df1 = 9        n2 = 10   df2 = 9
s1² = .0018989           s2² = .0023378

H0: σ1² = σ2²     α = .10   α/2 = .05
Ha: σ1² ≠ σ2²

Upper tail critical F value = F.05,9,9 = 3.18
Lower tail critical F value = F.95,9,9 = 0.314

F = s1²/s2² = .0018989/.0023378 = 0.81

Since the observed F = 0.81 is greater than the lower tail critical value of 0.314 and
less than the upper tail critical value of 3.18, the decision is to fail to reject the
null hypothesis.
10.42  Let Houston = group 1 and Chicago = group 2

1) H0: σ1² = σ2²
   Ha: σ1² ≠ σ2²

2) F = s1²/s2²

3) α = .01

4) df1 = 12   df2 = 10   This is a two-tailed test.

   The critical table F values are:  F.005,12,10 = 5.66   F.995,10,12 = .177

   If the observed value is greater than 5.66 or less than .177, the decision will be
   to reject the null hypothesis.

5) s1² = 393.4   s2² = 702.7

6) F = 393.4/702.7 = 0.56

7) Since F = 0.56 is greater than .177 and less than 5.66, the decision is to fail to
   reject the null hypothesis.

8) There is no significant difference in the variances of number of days between
   Houston and Chicago.
10.43  H0: σ1² = σ2²     α = .05   n1 = 12   s1 = 7.52
       Ha: σ1² > σ2²     n2 = 15   s2 = 6.08

dfnum = 12 - 1 = 11   dfdenom = 15 - 1 = 14

The critical table F value is F.05,10,14 = 2.60

F = s1²/s2² = (7.52)²/(6.08)² = 1.53

Since the observed F = 1.53 < F.05,10,14 = 2.60, the decision is to fail to reject the
null hypothesis.
10.44  H0: σ1² = σ2²     α = .01   n1 = 15   s1² = 91.5
       Ha: σ1² ≠ σ2²     n2 = 15   s2² = 67.3

dfnum = 15 - 1 = 14   dfdenom = 15 - 1 = 14

The critical table F values are:  F.005,12,14 = 4.43   F.995,14,12 = .226

F = s1²/s2² = 91.5/67.3 = 1.36

Since the observed F = 1.36 < F.005,12,14 = 4.43 and > F.995,14,12 = .226, the
decision is to fail to reject the null hypothesis.
10.45  H0: μ1 - μ2 = 0
       Ha: μ1 - μ2 ≠ 0

For α = .10 and a two-tailed test, α/2 = .05 and z.05 = ±1.645

       Sample 1             Sample 2
       x̄1 = 138.4           x̄2 = 142.5
       σ1 = 6.71            σ2 = 8.92
       n1 = 48              n2 = 39

z = [(138.4 - 142.5) - 0]/√(6.71²/48 + 8.92²/39) = -2.38

Since the observed value of z = -2.38 is less than the critical value of z = -1.645,
the decision is to reject the null hypothesis. There is a significant difference in
the means of the two populations.
10.46  Sample 1             Sample 2
       x̄1 = 34.9            x̄2 = 27.6
       s1² = 2.97           s2² = 3.50
       n1 = 34              n2 = 31

For a 98% Confidence Level, z.01 = 2.33

(x̄1 - x̄2) ± z·√(s1²/n1 + s2²/n2)

(34.9 - 27.6) ± 2.33·√(2.97/34 + 3.50/31) = 7.3 ± 1.04

6.26 < μ1 - μ2 < 8.34
10.47  H0: μ1 - μ2 = 0
       Ha: μ1 - μ2 > 0

       Sample 1             Sample 2
       x̄1 = 2.06            x̄2 = 1.93
       s1² = .176           s2² = .143
       n1 = 12              n2 = 15      α = .05

This is a one-tailed test with df = 12 + 15 - 2 = 25. The critical value is
t.05,25 = 1.708. If the observed value is greater than 1.708, the decision will be to
reject the null hypothesis.

t = [(x̄1 - x̄2) - (μ1 - μ2)] / [√((s1²(n1 - 1) + s2²(n2 - 1))/(n1 + n2 - 2)) · √(1/n1 + 1/n2)]

t = [(2.06 - 1.93) - 0] / [√(((.176)(11) + (.143)(14))/25) · √(1/12 + 1/15)] = 0.85

Since the observed value of t = 0.85 is less than the critical value of t = 1.708, the
decision is to fail to reject the null hypothesis. The mean for population one is not
significantly greater than the mean for population two.
10.48  Sample 1             Sample 2
       x̄1 = 74.6            x̄2 = 70.9
       s1² = 10.5           s2² = 11.4
       n1 = 18              n2 = 19

For 95% confidence, α/2 = .025.

Using df = 18 + 19 - 2 = 35, t.025,30 = 2.042

(74.6 - 70.9) ± 2.042·√(((10.5)(17) + (11.4)(18))/(18 + 19 - 2))·√(1/18 + 1/19)

3.7 ± 2.22

1.48 < μ1 - μ2 < 5.92
10.49  H0: D = 0     α = .01
       Ha: D < 0

n = 21   df = 20   d̄ = -1.16   sd = 1.01

The critical t.01,20 = -2.528. If the observed t is less than -2.528, then the
decision will be to reject the null hypothesis.

t = (d̄ - D)/(sd/√n) = (-1.16 - 0)/(1.01/√21) = -5.26

Since the observed value of t = -5.26 is less than the critical t value of -2.528, the
decision is to reject the null hypothesis. The population difference is less than
zero.
10.50  Respondent   Before   After     d
           1          47       63     -16
           2          33       35      -2
           3          38       36       2
           4          50       56      -6
           5          39       44      -5
           6          27       29      -2
           7          35       32       3
           8          46       54      -8
           9          41       47      -6

d̄ = -4.44   sd = 5.703   df = 8

For a 99% Confidence Level, α/2 = .005 and t.005,8 = 3.355

d̄ ± t·sd/√n = -4.44 ± 3.355(5.703/√9) = -4.44 ± 6.38

-10.82 < D < 1.94
10.51  H0: p1 - p2 = 0     α = .05   α/2 = .025
       Ha: p1 - p2 ≠ 0     z.025 = ±1.96

If the observed value of z is greater than 1.96 or less than -1.96, then the decision
will be to reject the null hypothesis.

       Sample 1             Sample 2
       x1 = 345             x2 = 421
       n1 = 783             n2 = 896

p̄ = (x1 + x2)/(n1 + n2) = (345 + 421)/(783 + 896) = .4562

p̂1 = 345/783 = .4406            p̂2 = 421/896 = .4699

z = [(.4406 - .4699) - 0] / √((.4562)(.5438)(1/783 + 1/896)) = -1.20

Since the observed value of z = -1.20 is greater than -1.96, the decision is to fail
to reject the null hypothesis. There is no significant difference.
10.52  Sample 1             Sample 2
       n1 = 409             n2 = 378
       p̂1 = .71             p̂2 = .67

For a 99% Confidence Level, α/2 = .005 and z.005 = 2.575

(p̂1 - p̂2) ± z·√(p̂1q̂1/n1 + p̂2q̂2/n2)

(.71 - .67) ± 2.575·√((.71)(.29)/409 + (.67)(.33)/378) = .04 ± .085

-.045 < p1 - p2 < .125
10.53  H0: σ1² = σ2²     α = .05   n1 = 8    s1² = 46
       Ha: σ1² ≠ σ2²     n2 = 10   s2² = 37

dfnum = 8 - 1 = 7   dfdenom = 10 - 1 = 9

The critical F values are:  F.025,7,9 = 4.20   F.975,9,7 = .238

If the observed value of F is greater than 4.20 or less than .238, then the decision
will be to reject the null hypothesis.

F = s1²/s2² = 46/37 = 1.24

Since the observed F = 1.24 is less than F.025,7,9 = 4.20 and greater than
F.975,9,7 = .238, the decision is to fail to reject the null hypothesis. There is no
significant difference in the variances of the two populations.
10.54 Term Whole Life
x
t
= $75,000 x
w
= $45,000
s
t
= $22,000 s
w
= $15,500
n
t
= 27 n
w
= 29
df = 27 + 29 - 2 = 54
For a 95% Confidence Level, o/2 = .025 and t
.025,50
= 2.009 (used df=50)
(x̄1 - x̄2) ± t·√[ (s1²(n1 - 1) + s2²(n2 - 1)) / (n1 + n2 - 2) ]·√(1/n1 + 1/n2)

(75,000 - 45,000) ± 2.009·√[ ((22,000)²(26) + (15,500)²(28)) / (27 + 29 - 2) ]·√(1/27 + 1/29)

= 30,000 ± 10,160.11

19,839.89 < μ1 - μ2 < 40,160.11
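The pooled-variance interval above can be verified numerically. This is a minimal Python sketch (not part of the original solutions; `pooled_t_ci` is an illustrative name) of the pooled-variance CI for μ1 - μ2:

```python
from math import sqrt

def pooled_t_ci(x1, s1, n1, x2, s2, n2, t_crit):
    """CI for mu1 - mu2 assuming equal population variances."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # pooled variance
    margin = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
    return (x1 - x2) - margin, (x1 - x2) + margin

# t.025,50 = 2.009 (df rounded down to 50 in the table)
lo, hi = pooled_t_ci(75000, 22000, 27, 45000, 15500, 29, 2.009)
print(round(lo, 2), round(hi, 2))   # 19839.89 40160.11
```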
10.55 Morning Afternoon d
43 41 2
51 49 2
37 44 -7
24 32 -8
47 46 1
44 42 2
50 47 3
55 51 4
46 49 -3
n = 9   d̄ = -0.444   s_d = 4.447   df = 9 - 1 = 8

For a 90% Confidence Level: α/2 = .05 and t.05,8 = 1.86

d̄ ± t·s_d/√n = -0.444 ± (1.86)(4.447/√9) = -0.444 ± 2.757

-3.201 < D < 2.313
10.56 Marketing    Accountants

n1 = 400    n2 = 450
x1 = 220    x2 = 216

H0: p1 - p2 = 0
Ha: p1 - p2 > 0   α = .01

The critical table z value is: z.01 = 2.33
p̂1 = 220/400 = .55   p̂2 = 216/450 = .48

p̄ = (x1 + x2)/(n1 + n2) = (220 + 216)/(400 + 450) = .513

z = [(p̂1 - p̂2) - (p1 - p2)] / √[p̄·q̄·(1/n1 + 1/n2)]
  = [(.55 - .48) - 0] / √[(.513)(.487)(1/400 + 1/450)]

z = 2.04
Since the observed z = 2.04 is less than z.01 = 2.33, the decision is to fail to reject
the null hypothesis. There is no significant difference between marketing
managers and accountants in the proportion who keep track of obligations in their
head.
10.57 Accounting     Data Entry

n1 = 16         n2 = 14
x̄1 = 26,400     x̄2 = 25,800
s1 = 1,200      s2 = 1,050

H0: σ1² = σ2²
Ha: σ1² ≠ σ2²

α = .05 and α/2 = .025
dfnum = 16 - 1 = 15   dfdenom = 14 - 1 = 13

The critical F values are:  F.025,15,13 = 3.05   F.975,15,13 = 0.33

F = s1²/s2² = 1,440,000/1,102,500 = 1.31

Since the observed F = 1.31 is less than F.025,15,13 = 3.05 and greater than
F.975,15,13 = 0.33, the decision is to fail to reject the null hypothesis.
10.58 Men        Women

n1 = 60     n2 = 41
x̄1 = 631    x̄2 = 848
σ1 = 100    σ2 = 100

For a 95% Confidence Level, α/2 = .025 and z.025 = 1.96

(x̄1 - x̄2) ± z·√(σ1²/n1 + σ2²/n2)

(631 - 848) ± 1.96·√(100²/60 + 100²/41) = -217 ± 39.7

-256.7 < μ1 - μ2 < -177.3
10.59 H0: μ1 - μ2 = 0   α = .01
      Ha: μ1 - μ2 ≠ 0   df = 20 + 24 - 2 = 42

Detroit        Charlotte

n1 = 20        n2 = 24
x̄1 = 17.53     x̄2 = 14.89
s1 = 3.2       s2 = 2.7

For two-tail test, α/2 = .005 and the critical t.005,40 = ±2.704 (used df = 40)

t = [(x̄1 - x̄2) - (μ1 - μ2)] / { √[ (s1²(n1 - 1) + s2²(n2 - 1)) / (n1 + n2 - 2) ]·√(1/n1 + 1/n2) }

t = [(17.53 - 14.89) - 0] / { √[ ((3.2)²(19) + (2.7)²(23)) / 42 ]·√(1/20 + 1/24) } = 2.97

Since the observed t = 2.97 > t.005,40 = 2.704, the decision is to reject the null
hypothesis.
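The pooled-variance t statistic appears repeatedly in these problems. A minimal Python sketch (not from the text; `pooled_t` is an illustrative name) that reproduces the computation:

```python
from math import sqrt

def pooled_t(x1, s1, n1, x2, s2, n2):
    """Pooled-variance t statistic for H0: mu1 - mu2 = 0."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # pooled variance
    return (x1 - x2) / sqrt(sp2 * (1 / n1 + 1 / n2))

t = pooled_t(17.53, 3.2, 20, 14.89, 2.7, 24)
print(round(t, 2))   # 2.97, to compare with t.005,40 = 2.704
```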
10.60 With Fertilizer   Without Fertilizer

x̄1 = 38.4    x̄2 = 23.1
σ1 = 9.8     σ2 = 7.4
n1 = 35      n2 = 35

H0: μ1 - μ2 = 0
Ha: μ1 - μ2 > 0

For one-tail test, α = .01 and z.01 = 2.33

z = [(x̄1 - x̄2) - (μ1 - μ2)] / √(σ1²/n1 + σ2²/n2) = [(38.4 - 23.1) - 0] / √((9.8)²/35 + (7.4)²/35) = 7.37

Since the observed z = 7.37 > z.01 = 2.33, the decision is to reject the null
hypothesis.
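With known population standard deviations the test statistic is a plain z ratio. A minimal Python sketch (not from the text; `two_mean_z` is an illustrative name):

```python
from math import sqrt

def two_mean_z(x1, sigma1, n1, x2, sigma2, n2):
    """z statistic for mu1 - mu2 with known population standard deviations."""
    return (x1 - x2) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)

z = two_mean_z(38.4, 9.8, 35, 23.1, 7.4, 35)
print(round(z, 2))   # 7.37
```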
10.61 Specialty   Discount

n1 = 350    n2 = 500
p̂1 = .75    p̂2 = .52

For a 90% Confidence Level, α/2 = .05 and z.05 = 1.645

(p̂1 - p̂2) ± z·√(p̂1·q̂1/n1 + p̂2·q̂2/n2)

(.75 - .52) ± 1.645·√[(.75)(.25)/350 + (.52)(.48)/500] = .23 ± .053

.177 < p1 - p2 < .283
10.62 H0: σ1² = σ2²   α = .01   n1 = 8   n2 = 7
      Ha: σ1² ≠ σ2²   s1² = 72,909   s2² = 129,569

dfnum = 6   dfdenom = 7

The critical F values are:  F.005,6,7 = 9.16   F.995,7,6 = .11

F = s2²/s1² = 129,569/72,909 = 1.78

Since F = 1.78 < F.005,6,7 = 9.16 but also > F.995,7,6 = .11, the decision is to fail to reject the
null hypothesis. There is no difference in the variances of the shifts.
10.63 Name Brand Store Brand d
54 49 5
55 50 5
59 52 7
53 51 2
54 50 4
61 56 5
51 47 4
53 49 4
n = 8   d̄ = 4.5   s_d = 1.414   df = 8 - 1 = 7

For a 90% Confidence Level, α/2 = .05 and t.05,7 = 1.895

d̄ ± t·s_d/√n = 4.5 ± 1.895(1.414/√8) = 4.5 ± .947

3.553 < D < 5.447
10.64 H0: μ1 - μ2 = 0   α = .01
      Ha: μ1 - μ2 < 0   df = 23 + 19 - 2 = 40

Wisconsin       Tennessee

n1 = 23         n2 = 19
x̄1 = 69.652     x̄2 = 71.7368
s1² = 9.9644    s2² = 4.6491

For one-tail test, α = .01 and the critical t.01,40 = -2.423

t = [(x̄1 - x̄2) - (μ1 - μ2)] / { √[ (s1²(n1 - 1) + s2²(n2 - 1)) / (n1 + n2 - 2) ]·√(1/n1 + 1/n2) }

t = [(69.652 - 71.7368) - 0] / { √[ ((9.9644)(22) + (4.6491)(18)) / 40 ]·√(1/23 + 1/19) } = -2.44

Since the observed t = -2.44 < t.01,40 = -2.423, the decision is to reject the null
hypothesis.
10.65 Wednesday Friday d
71 53 18
56 47 9
75 52 23
68 55 13
74 58 16
n = 5   d̄ = 15.8   s_d = 5.263   df = 5 - 1 = 4
H0: D = 0   α = .05
Ha: D > 0

For one-tail test, α = .05 and the critical t.05,4 = 2.132

t = (d̄ - D)/(s_d/√n) = (15.8 - 0)/(5.263/√5) = 6.71

Since the observed t = 6.71 > t.05,4 = 2.132, the decision is to reject the null
hypothesis.
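This paired t statistic can be reproduced from the raw Wednesday/Friday scores. A minimal Python sketch (not from the text; `paired_t` is an illustrative name) that computes d̄, s_d, and t from the data:

```python
from math import sqrt

def paired_t(before, after, D0=0.0):
    """Paired t statistic computed from raw matched-pair scores."""
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    d_bar = sum(d) / n
    s_d = sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))  # sample std dev of d
    return (d_bar - D0) / (s_d / sqrt(n))

t = paired_t([71, 56, 75, 68, 74], [53, 47, 52, 55, 58])
print(round(t, 2))   # 6.71
```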
10.66 H0: p1 - p2 = 0   α = .05
      Ha: p1 - p2 ≠ 0

Machine 1    Machine 2

x1 = 38      x2 = 21
n1 = 191     n2 = 202
p̂1 = 38/191 = .199   p̂2 = 21/202 = .104

p̄ = (n1·p̂1 + n2·p̂2)/(n1 + n2) = ((.199)(191) + (.104)(202))/(191 + 202) = .15

For two-tail, α/2 = .025 and the critical z values are: z.025 = ±1.96

z = [(p̂1 - p̂2) - (p1 - p2)] / √[p̄·q̄·(1/n1 + 1/n2)]
  = [(.199 - .104) - 0] / √[(.15)(.85)(1/191 + 1/202)]

z = 2.64
Since the observed z = 2.64 > z.025 = 1.96, the decision is to reject the null
hypothesis.
10.67 Construction   Telephone Repair

n1 = 338    n2 = 281
x1 = 297    x2 = 192

p̂1 = 297/338 = .879   p̂2 = 192/281 = .683

For a 90% Confidence Level, α/2 = .05 and z.05 = 1.645

(p̂1 - p̂2) ± z·√(p̂1·q̂1/n1 + p̂2·q̂2/n2)
(.879 - .683) ± 1.645·√[(.879)(.121)/338 + (.683)(.317)/281] = .196 ± .054

.142 < p1 - p2 < .250
10.68 Aerospace   Automobile

n1 = 33      n2 = 35
x̄1 = 12.4    x̄2 = 4.6
σ1 = 2.9     σ2 = 1.8

For a 99% Confidence Level, α/2 = .005 and z.005 = 2.575

(x̄1 - x̄2) ± z·√(σ1²/n1 + σ2²/n2)

(12.4 - 4.6) ± 2.575·√((2.9)²/33 + (1.8)²/35) = 7.8 ± 1.52

6.28 < μ1 - μ2 < 9.32
10.69 Discount        Specialty

x̄1 = $47.20    x̄2 = $27.40
σ1 = $12.45    σ2 = $9.82
n1 = 60        n2 = 40

H0: μ1 - μ2 = 0   α = .01
Ha: μ1 - μ2 ≠ 0

For two-tail test, α/2 = .005 and zc = ±2.575

z = [(x̄1 - x̄2) - (μ1 - μ2)] / √(σ1²/n1 + σ2²/n2) = [(47.20 - 27.40) - 0] / √((12.45)²/60 + (9.82)²/40) = 8.86

Since the observed z = 8.86 > zc = 2.575, the decision is to reject the null
hypothesis.
10.70 Before After d
12 8 4
7 3 4
10 8 2
16 9 7
8 5 3
n = 5   d̄ = 4.0   s_d = 1.8708   df = 5 - 1 = 4

H0: D = 0   α = .01
Ha: D > 0

For one-tail test, α = .01 and the critical t.01,4 = 3.747

t = (d̄ - D)/(s_d/√n) = (4.0 - 0)/(1.8708/√5) = 4.78

Since the observed t = 4.78 > t.01,4 = 3.747, the decision is to reject the null
hypothesis.
10.71 H0: μ1 - μ2 = 0   α = .01
      Ha: μ1 - μ2 ≠ 0   df = 10 + 6 - 2 = 14

A              B
n1 = 10         n2 = 6
x̄1 = 18.3       x̄2 = 9.667
s1² = 17.122    s2² = 7.467

For two-tail test, α/2 = .005 and the critical t.005,14 = ±2.977

t = [(x̄1 - x̄2) - (μ1 - μ2)] / { √[ (s1²(n1 - 1) + s2²(n2 - 1)) / (n1 + n2 - 2) ]·√(1/n1 + 1/n2) }

t = [(18.3 - 9.667) - 0] / { √[ ((17.122)(9) + (7.467)(5)) / 14 ]·√(1/10 + 1/6) } = 4.52

Since the observed t = 4.52 > t.005,14 = 2.977, the decision is to reject the null
hypothesis.
10.72 A t test was used to test to determine if Hong Kong has significantly different
rates than Mumbai. Let group 1 be Hong Kong.
H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0

n1 = 19     n2 = 23     x̄1 = 130.4   x̄2 = 128.4
s1 = 12.9   s2 = 13.9   98% C.I. and α/2 = .01
t = 0.48 with a p-value of .634, which is not significant at α = .05. There is not
enough evidence in these data to declare that there is a difference in the average
rental rates of the two cities.
10.73 H0: D = 0
      Ha: D ≠ 0
This is a related measures before and after study. Fourteen people were involved in the
study. Before the treatment, the sample mean was 4.357 and after the treatment, the
mean was 5.214. The higher number after the treatment indicates that subjects were
more likely to blow the whistle after having been through the treatment. The
observed t value was 3.12, which was more extreme than the two-tailed table t value of ±
2.16, and as a result, the researcher rejects the null hypothesis. This is underscored by a
p-value of .0081, which is less than α = .05. The study concludes that there is a
significantly higher likelihood of blowing the whistle after the treatment.
10.74 The point estimates from the sample data indicate that in the northern city the market
share is .31078 and in the southern city the market share is .27013. The point estimate
for the difference in the two proportions of market share is .04065. Since the 99%
confidence interval ranges from -.03936 to +.12067 and zero is in the interval, any
hypothesis testing decision based on this interval would result in failure to reject the
null hypothesis. Alpha is .01 with a two-tailed test. This is underscored by an observed
z value of 1.31 which has an associated p-value of .191 which, of course, is not
significant for any of the usual values of α.
10.75 A test of differences of the variances of the populations of the two machines is being
computed. The hypotheses are:
H0: σ1² = σ2²
Ha: σ1² > σ2²
Twenty-six pipes were measured for sample one and twenty-eight pipes were measured
for sample two. The observed F = 2.0575 is significant at α = .05 for a one-tailed test
since the associated p-value is .034787. The variance of pipe lengths for machine 1 is
significantly greater than the variance of pipe lengths for machine 2.
Chapter 11
Analysis of Variance and
Design of Experiments
LEARNING OBJECTIVES
The focus of this chapter is learning about the design of experiments and the analysis of
variance thereby enabling you to:
1. Understand the differences between various experiment designs and when to use them.
2. Compute and interpret the results of a one-way ANOVA.
3. Compute and interpret the results of a random block design.
4. Compute and interpret the results of a two-way ANOVA.
5. Understand and interpret interaction.
6. Know when and how to use multiple comparison techniques.
CHAPTER TEACHING STRATEGY
This important chapter opens the door for students to a broader view of statistics than
they have seen to this time. Through the topic of experimental designs, the student begins to
understand how they can scientifically set up controlled experiments in which to test certain
hypotheses. They learn about independent and dependent variables. With the completely
randomized design, the student can see how the t test for two independent samples can be
expanded to include three or more samples by using analysis of variance. This is something
that some of the more curious students were probably wondering about in chapter 10. Through
the randomized block design and the factorial designs, the student can understand how we can
analyze not only multiple categories of one variable, but we can simultaneously analyze multiple
variables with several categories each. Thus, this chapter affords the instructor an opportunity
to help the student develop a structure for statistical analysis.
In this chapter, we emphasize that the total sum of squares in a given problem does not
change. In the completely randomized design, the total sum of squares is parceled into
between treatments sum of squares and error sum of squares. By using a blocking design when
there is significant blocking, the blocking effects are removed from the error effect, which
reduces the size of the mean square error and can potentially create a more powerful test of the
treatment. A similar thing happens in the two-way factorial design when one significant
treatment variable siphons off the sum of squares from the error term that reduces the mean
square error and creates the potential for a more powerful test of the other treatment variable.
In presenting the random block design in this chapter, the emphasis is on determining if
the F value for the treatment variable is significant or not. There is a de-emphasis on examining
the F value of the blocking effects. However, if the blocking effects are not significant, the
random block design may be a less powerful analysis of the treatment effects. If the blocking
effects are not significant, even though the error sum of squares is reduced, the mean square
error might increase because the blocking effects may reduce the degrees of freedom error in a
proportionally greater amount. This might result in a smaller treatment F value than would
occur in a completely randomized design. The repeated-measures design is shown in the
chapter as a special case of the random block design.
In factorial designs, if there are multiple values in the cells, it is possible to analyze
interaction effects. Random block designs do not have multiple values in cells and therefore
interaction effects cannot be calculated. It is emphasized in this chapter that if significant
interaction occurs, then the main effects analyses are confounded and should not be analyzed in
the usual manner. There are various philosophies about how to handle significant interaction
but these are beyond the scope of this chapter. The main factorial example problem in the chapter
was created to have no significant interaction so that the student can learn how to analyze main
effects. The demonstration problem has significant interaction and these interactions are
displayed graphically for the student to see. You might consider taking this same problem and
graphing the interactions using row effects along the x axis and graphing the column means for
the student to see.
There are a number of multiple comparison tests available. In this text, one of the more
well-known tests, Tukey's HSD, is featured in the case of equal sample sizes. When sample sizes
are unequal, a variation on Tukey's HSD, the Tukey-Kramer test, is used. MINITAB uses the
Tukey test as one of its options under multiple comparisons and uses the Tukey-Kramer test for
unequal sample sizes. Tukey's HSD is one of the more powerful multiple comparison tests but
protects less against Type I errors than some of the other tests.
CHAPTER OUTLINE
11.1 Introduction to Design of Experiments
11.2 The Completely Randomized Design (One-Way ANOVA)
One-Way Analysis of Variance
Reading the F Distribution Table
Using the Computer for One-Way ANOVA
Comparison of F and t Values
11.3 Multiple Comparison Tests
Tukey's Honestly Significant Difference (HSD) Test: The Case of Equal Sample
Sizes
Using the Computer to Do Multiple Comparisons
Tukey-Kramer Procedure: The Case of Unequal Sample Sizes
11.4 The Randomized Block Design
Using the Computer to Analyze Randomized Block Designs
11.5 A Factorial Design (Two-Way ANOVA)
Advantages of the Factorial Design
Factorial Designs with Two Treatments
Applications
Statistically Testing the Factorial Design
Interaction
Using a Computer to Do a Two-Way ANOVA
KEY TERMS
a posteriori Factors
a priori Independent Variable
Analysis of Variance (ANOVA) Interaction
Blocking Variable Levels
Classification Variables Multiple Comparisons
Classifications One-way Analysis of Variance
Completely Randomized Design Post-hoc
Concomitant Variables Randomized Block Design
Confounding Variables Repeated Measures Design
Dependent Variable Treatment Variable
Experimental Design Tukey-Kramer Procedure
F Distribution Tukey's HSD Test
F Value Two-way Analysis of Variance
Factorial Design
SOLUTIONS TO PROBLEMS IN CHAPTER 11
11.1 a) Time Period, Market Condition, Day of the Week, Season of the Year
b) Time Period - 4 P.M. to 5 P.M. and 5 P.M. to 6 P.M.
Market Condition - Bull Market and Bear Market
Day of the Week - Monday, Tuesday, Wednesday, Thursday, Friday
Season of the Year - Summer, Winter, Fall, Spring
c) Volume, Value of the Dow Jones Average, Earnings of Investment Houses
11.2 a) Type of 737, Age of the plane, Number of Landings per Week of the plane,
City that the plane is based
b) Type of 737 - Type I, Type II, Type III
Age of plane - 0-2 y, 3-5 y, 6-10 y, over 10 y
Number of Flights per Week - 0-5, 6-10, over 10
City - Dallas, Houston, Phoenix, Detroit
c) Average annual maintenance costs, Number of annual hours spent on
maintenance
11.3 a) Type of Card, Age of User, Economic Class of Cardholder, Geographic Region
b) Type of Card - Mastercard, Visa, Discover, American Express
Age of User - 21-25 y, 26-32 y, 33-40 y, 41-50 y, over 50
Economic Class - Lower, Middle, Upper
Geographic Region - NE, South, MW, West
c) Average number of card usages per person per month,
Average balance due on the card, Average per expenditure per person,
Number of cards possessed per person
11.4 Average dollar expenditure per day/night, Age of adult registering the family, Number of
days stay (consecutive)
11.5 Source df SS MS F__
Treatment 2 22.20 11.10 11.07
Error 14 14.03 1.00______
Total 16 36.24
α = .05   Critical F.05,2,14 = 3.74

Since the observed F = 11.07 > F.05,2,14 = 3.74, the decision is to reject the null
hypothesis.
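The observed F in an ANOVA table is just the ratio of the two mean squares, each a sum of squares divided by its degrees of freedom. A minimal Python sketch (not from the text; `anova_f` is an illustrative name) reproducing the F for this table:

```python
def anova_f(ss_treatment, df_treatment, ss_error, df_error):
    """F ratio from the sums of squares in a one-way ANOVA table."""
    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error
    return ms_treatment / ms_error

F = anova_f(22.20, 2, 14.03, 14)
print(round(F, 1))   # 11.1, matching the table's F = 11.07 up to rounding
```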
11.6 Source df SS MS F__
Treatment 4 93.77 23.44 15.82
Error 18 26.67 1.48______
Total 22 120.43
α = .01   Critical F.01,4,18 = 4.58

Since the observed F = 15.82 > F.01,4,18 = 4.58, the decision is to reject the null
hypothesis.
11.7 Source df SS MS F_
Treatment 3 544.2 181.4 13.00
Error 12 167.5 14.0______
Total 15 711.8
α = .01   Critical F.01,3,12 = 5.95

Since the observed F = 13.00 > F.01,3,12 = 5.95, the decision is to reject the null
hypothesis.
11.8 Source df SS MS F__
Treatment 1 64.29 64.29 17.76
Error 12 43.43 3.62______
Total 13 107.71
α = .05   Critical F.05,1,12 = 4.75

Since the observed F = 17.76 > F.05,1,12 = 4.75, the decision is to reject the null
hypothesis.
Observed t value using the t test:

1            2

n1 = 7       n2 = 7
x̄1 = 29      x̄2 = 24.71429
s1² = 3      s2² = 4.238095

t = [(29 - 24.71429) - 0] / { √[ (3(6) + (4.238095)(6)) / (7 + 7 - 2) ]·√(1/7 + 1/7) } = 4.214

Also, t = √F = √17.76 = 4.214
11.9 Source SS df MS F__
Treatment 583.39 4 145.8475 7.50
Error 972.18 50 19.4436______
Total 1,555.57 54
11.10 Source SS df MS F _
Treatment 29.64 2 14.820 3.03
Error 68.42 14 4.887___ __
Total 98.06 16
F.05,2,14 = 3.74

Since the observed F = 3.03 < F.05,2,14 = 3.74, the decision is to fail to reject
the null hypothesis.
11.11 Source df SS MS F__
Treatment 3 .007076 .002359 10.10
Error 15 .003503 .000234________
Total 18 .010579
α = .01   Critical F.01,3,15 = 5.42

Since the observed F = 10.10 > F.01,3,15 = 5.42, the decision is to reject the null
hypothesis.
11.12 Source df SS MS F__
Treatment 2 180700000 90350000 92.67
Error 12 11699999 975000_________
Total 14 192400000
α = .01   Critical F.01,2,12 = 6.93

Since the observed F = 92.67 > F.01,2,12 = 6.93, the decision is to reject the null
hypothesis.
11.13 Source df SS MS F___
Treatment 2 29.61 14.80 11.76
Error 15 18.89 1.26________
Total 17 48.50
α = .05   Critical F.05,2,15 = 3.68

Since the observed F = 11.76 > F.05,2,15 = 3.68, the decision is to reject the null
hypothesis.
11.14 Source df SS MS F__
Treatment 3 456630 152210 11.03
Error 16 220770 13798_______
Total 19 677400
α = .05   Critical F.05,3,16 = 3.24

Since the observed F = 11.03 > F.05,3,16 = 3.24, the decision is to reject the null
hypothesis.
11.15 There are 4 treatment levels. The sample sizes are 18, 15, 21, and 11. The F
value is 2.95 with a p-value of .04. There is an overall significant difference at
alpha of .05. The means are 226.73, 238.79, 232.58, and 239.82.
11.16 The independent variable for this study was plant with five classification levels
(the five plants). There were a total of 43 workers who participated in the study. The
dependent variable was number of hours worked per week. An observed F value of 3.10
was obtained with an associated p-value of .026595. With an alpha of .05, there was a
significant overall difference in the average number of hours worked per week by plant.
A cursory glance at the plant averages revealed that workers at plant 3 averaged 61.47
hours per week (highest number) while workers at plant 4 averaged 49.20 (lowest
number).
11.17 C = 6   MSE = .3352   α = .05   N = 46

q.05,6,40 = 4.23   n3 = 8   n6 = 7   x̄3 = 15.85   x̄6 = 17.21

HSD = 4.23·√[(.3352/2)(1/8 + 1/7)] = 0.896

|x̄3 - x̄6| = |15.85 - 17.21| = 1.36

Since 1.36 > 0.896, there is a significant difference between the means of
groups 3 and 6.
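Because the two group sizes differ, this HSD uses the Tukey-Kramer form. A minimal Python sketch (not from the text; `tukey_kramer_hsd` is an illustrative name) that reproduces the critical difference:

```python
from math import sqrt

def tukey_kramer_hsd(q, mse, n_r, n_s):
    """Tukey-Kramer critical difference for a pair of groups with unequal sizes."""
    return q * sqrt((mse / 2) * (1 / n_r + 1 / n_s))

# q.05,6,40 = 4.23 from the studentized range table
hsd = tukey_kramer_hsd(4.23, 0.3352, 8, 7)
print(round(hsd, 3))   # 0.896
```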
11.18 C = 4   n = 6   N = 24   dferror = N - C = 24 - 4 = 20   α = .05

MSE = 2.389   q.05,4,20 = 3.96

HSD = q·√(MSE/n) = (3.96)·√(2.389/6) = 2.50
11.19 C = 3   MSE = 1.002381   α = .05   N = 17   N - C = 14

q.05,3,14 = 3.70   n1 = 6   n2 = 5   x̄1 = 2   x̄2 = 4.6

HSD = 3.70·√[(1.002381/2)(1/6 + 1/5)] = 1.586

|x̄1 - x̄2| = |2 - 4.6| = 2.6

Since 2.6 > 1.586, there is a significant difference between the means of
groups 1 and 2.
11.20 From problem 11.6, MSE = 1.481481   C = 5   N = 23   N - C = 18

n2 = 5   n4 = 5   α = .01   q.01,5,18 = 5.38

HSD = 5.38·√[(1.481481/2)(1/5 + 1/5)] = 2.93

x̄2 = 10   x̄4 = 16

|x̄2 - x̄4| = |10 - 16| = 6

Since 6 > 2.93, there is a significant difference in the means of
groups 2 and 4.
11.21 N = 16   n = 4   C = 4   N - C = 12   MSE = 13.95833   q.01,4,12 = 5.50

HSD = q·√(MSE/n) = 5.50·√(13.95833/4) = 10.27

x̄1 = 115.25   x̄2 = 125.25   x̄3 = 131.5   x̄4 = 122.5

x̄1 and x̄3 are the only pair that are significantly different using the
HSD test.
11.22 n = 7   C = 2   MSE = 3.619048   N = 14   N - C = 14 - 2 = 12

α = .05   q.05,2,12 = 3.08

HSD = q·√(MSE/n) = 3.08·√(3.619048/7) = 2.215

x̄1 = 29 and x̄2 = 24.71429

Since x̄1 - x̄2 = 4.28571 > HSD = 2.215, the decision is to reject the null
hypothesis.
11.23 C = 4   MSE = .000234   α = .01   N = 19   N - C = 15

q.01,4,15 = 5.25   n1 = 4   n2 = 6   n3 = 5   n4 = 4

x̄1 = 4.03, x̄2 = 4.001667, x̄3 = 3.974, x̄4 = 4.005

HSD1,2 = 5.25·√[(.000234/2)(1/4 + 1/6)] = .0367

HSD1,3 = 5.25·√[(.000234/2)(1/4 + 1/5)] = .0381

HSD1,4 = 5.25·√[(.000234/2)(1/4 + 1/4)] = .0402

HSD2,3 = 5.25·√[(.000234/2)(1/6 + 1/5)] = .0344

HSD2,4 = 5.25·√[(.000234/2)(1/6 + 1/4)] = .0367

HSD3,4 = 5.25·√[(.000234/2)(1/5 + 1/4)] = .0381

|x̄1 - x̄3| = .056

This is the only pair of means that are significantly different.
11.24 α = .01   C = 3   n = 5   N = 15   N - C = 12

MSE = 975,000   q.01,3,12 = 5.04

HSD = q·√(MSE/n) = 5.04·√(975,000/5) = 2,225.6

x̄1 = 40,900   x̄2 = 49,400   x̄3 = 45,300

|x̄1 - x̄2| = 8,500
|x̄1 - x̄3| = 4,400
|x̄2 - x̄3| = 4,100

Using Tukey's HSD, all three pairwise comparisons are significantly different.
11.25 α = .05   C = 3   N = 18   N - C = 15   MSE = 1.259365

q.05,3,15 = 3.67   n1 = 5   n2 = 7   n3 = 6

x̄1 = 7.6   x̄2 = 8.8571   x̄3 = 5.8333

HSD1,2 = 3.67·√[(1.259365/2)(1/5 + 1/7)] = 1.705

HSD1,3 = 3.67·√[(1.259365/2)(1/5 + 1/6)] = 1.763

HSD2,3 = 3.67·√[(1.259365/2)(1/7 + 1/6)] = 1.620

|x̄1 - x̄3| = 1.767 (is significant)

|x̄2 - x̄3| = 3.024 (is significant)
11.26 α = .05   n = 5   C = 4   N = 20   N - C = 16   MSE = 13,798.13

x̄1 = 591   x̄2 = 350   x̄3 = 776   x̄4 = 563

HSD = q·√(MSE/n) = 4.05·√(13,798.13/5) = 212.76

|x̄1 - x̄2| = 241   |x̄1 - x̄3| = 185   |x̄1 - x̄4| = 28
|x̄2 - x̄3| = 426   |x̄2 - x̄4| = 213   |x̄3 - x̄4| = 213

Using Tukey's HSD = 212.76, means 1 and 2, means 2 and 3, means 2 and 4,
and means 3 and 4 are significantly different.
11.27 α = .05. There were five plants and ten pairwise comparisons. The MINITAB
output reveals that the only significant pairwise difference is between plant 2 and
plant 3 where the reported confidence interval (0.180 to 22.460) contains the same
sign throughout indicating that 0 is not in the interval. Since zero is not in the
interval, then we are 95% confident that there is a pairwise difference
significantly different from zero. The lower and upper values for all other
confidence intervals have different signs indicating that zero is included in the
interval. This indicates that the difference in the means for these pairs might be
zero.
11.28 H0: μ1 = μ2 = μ3 = μ4
      Ha: At least one treatment mean is different from the others
Source df SS MS F__
Treatment 3 62.95 20.9833 5.56
Blocks 4 257.50 64.3750 17.05
Error 12 45.30 3.7750______
Total 19 365.75
α = .05   Critical F.05,3,12 = 3.49 for treatments

For treatments, the observed F = 5.56 > F.05,3,12 = 3.49, and the decision is to
reject the null hypothesis.
11.29 H0: μ1 = μ2 = μ3
      Ha: At least one treatment mean is different from the others
Source df SS MS F _
Treatment 2 .001717 .000858 1.48
Blocks 3 .076867 .025622 44.13
Error 6 .003483 .000581_______
Total 11 .082067
α = .01   Critical F.01,2,6 = 10.92 for treatments

For treatments, the observed F = 1.48 < F.01,2,6 = 10.92, and the decision is to
fail to reject the null hypothesis.
11.30 Source df SS MS F__
Treatment 5 2477.53 495.506 1.91
Blocks 9 3180.48 353.387 1.36
Error 45 11661.38 259.142______
Total 59 17319.39
α = .05   Critical F.05,5,45 = 2.45 for treatments

For treatments, the observed F = 1.91 < F.05,5,45 = 2.45, and the decision is to fail to
reject the null hypothesis.
11.31 Source df SS MS F__
Treatment 3 199.48 66.493 3.90
Blocks 6 265.24 44.207 2.60
Error 18 306.59 17.033______
Total 27 771.31
α = .01   Critical F.01,3,18 = 5.09 for treatments

For treatments, the observed F = 3.90 < F.01,3,18 = 5.09, and the decision is to
fail to reject the null hypothesis.
11.32 Source df SS MS F__
Treatment 3 2302.5 767.5000 15.67
Blocks 9 5402.5 600.2778 12.26
Error 27 1322.5 48.9815____ __
Total 39 9027.5
α = .05   Critical F.05,3,27 = 2.96 for treatments

For treatments, the observed F = 15.67 > F.05,3,27 = 2.96, and the decision is to
reject the null hypothesis.
11.33 Source df SS MS F__
Treatment 2 64.5333 32.2667 15.37
Blocks 4 137.6000 34.4000 16.38
Error 8 16.8000 2.1000_ _____
Total 14 218.9300
α = .01   Critical F.01,2,8 = 8.65 for treatments

For treatments, the observed F = 15.37 > F.01,2,8 = 8.65, and the decision is to
reject the null hypothesis.
11.34 This is a randomized block design with 3 treatments (machines) and 5 block levels
(operators). The F for treatments is 6.72 with a p-value of .019. There is a significant
difference in machines at α = .05. The F for blocking effects is 0.22 with a p-value of
.807. There are no significant blocking effects. The blocking effects reduced the power
of the treatment effects since the blocking effects were not significant.
11.35 The p value for Phone Type, .00018, indicates that there is an overall
significant difference in treatment means at alpha .001. The lengths of calls differ
according to type of telephone used. The p-value for managers, .00028, indicates that
there is an overall difference in block means at alpha .001. The lengths of calls differ
according to Manager. The significant blocking effects have improved the power of the
F test for treatments.
11.36 This is a two-way factorial design with two independent variables and one dependent
variable. It is 2x4 in that there are two row treatment levels and four column treatment
levels. Since there are three measurements per cell, interaction can be analyzed.
dfrow treatment = 1   dfcolumn treatment = 3   dfinteraction = 3   dferror = 16   dftotal = 23
11.37 This is a two-way factorial design with two independent variables and one dependent
variable. It is 4x3 in that there are four row treatment levels and three column treatment
levels. Since there are two measurements per cell, interaction can be analyzed.
dfrow treatment = 3   dfcolumn treatment = 2   dfinteraction = 6   dferror = 12   dftotal = 23
11.38 Source df SS MS F__
Row 3 126.98 42.327 3.46
Column 4 37.49 9.373 0.77
Interaction 12 380.82 31.735 2.60
Error 60 733.65 12.228______
Total 79 1278.94
α = .05

Critical F.05,3,60 = 2.76 for rows. For rows, the observed F = 3.46 > F.05,3,60 = 2.76
and the decision is to reject the null hypothesis.

Critical F.05,4,60 = 2.53 for columns. For columns, the observed F = 0.77 <
F.05,4,60 = 2.53 and the decision is to fail to reject the null hypothesis.

Critical F.05,12,60 = 1.92 for interaction. For interaction, the observed F = 2.60 >
F.05,12,60 = 1.92 and the decision is to reject the null hypothesis.
Since there is significant interaction, the researcher should exercise extreme
caution in analyzing the "significant" row effects.
11.39 Source df SS MS F__
Row 1 1.047 1.047 2.40
Column 3 3.844 1.281 2.94
Interaction 3 0.773 0.258 0.59
Error 16 6.968 0.436______
Total 23 12.632
α = .05

Critical F.05,1,16 = 4.49 for rows. For rows, the observed F = 2.40 < F.05,1,16 = 4.49
and the decision is to fail to reject the null hypothesis.

Critical F.05,3,16 = 3.24 for columns. For columns, the observed F = 2.94 <
F.05,3,16 = 3.24 and the decision is to fail to reject the null hypothesis.

Critical F.05,3,16 = 3.24 for interaction. For interaction, the observed F = 0.59 <
F.05,3,16 = 3.24 and the decision is to fail to reject the null hypothesis.
11.40 Source df SS MS F___
Row 1 60.750 60.750 38.37
Column 2 14.000 7.000 4.42
Interaction 2 2.000 1.000 0.63
Error 6 9.500 1.583________
Total 11 86.250
α = .01

Critical F.01,1,6 = 13.75 for rows. For rows, the observed F = 38.37 >
F.01,1,6 = 13.75 and the decision is to reject the null hypothesis.

Critical F.01,2,6 = 10.92 for columns. For columns, the observed F = 4.42 <
F.01,2,6 = 10.92 and the decision is to fail to reject the null hypothesis.

Critical F.01,2,6 = 10.92 for interaction. For interaction, the observed F = 0.63 <
F.01,2,6 = 10.92 and the decision is to fail to reject the null hypothesis.
11.41 Source df SS MS F__
Treatment 1 1 1.24031 1.24031 63.67
Treatment 2 3 5.09844 1.69948 87.25
Interaction 3 0.12094 0.04031 2.07
Error 24 0.46750 0.01948______
Total 31 6.92719
α = .05

Critical F.05,1,24 = 4.26 for treatment 1. For treatment 1, the observed F = 63.67 >
F.05,1,24 = 4.26 and the decision is to reject the null hypothesis.

Critical F.05,3,24 = 3.01 for treatment 2. For treatment 2, the observed F = 87.25 >
F.05,3,24 = 3.01 and the decision is to reject the null hypothesis.

Critical F.05,3,24 = 3.01 for interaction. For interaction, the observed F = 2.07 <
F.05,3,24 = 3.01 and the decision is to fail to reject the null hypothesis.
11.42 Source df SS MS F__
Age 3 42.4583 14.1528 14.77
No. Children 2 49.0833 24.5417 25.61
Interaction 6 4.9167 0.8194 0.86
Error 12 11.5000 0.9583______
Total 23 107.9583
α = .05

Critical F.05,3,12 = 3.49 for Age. For Age, the observed F = 14.77 >
F.05,3,12 = 3.49 and the decision is to reject the null hypothesis.

Critical F.05,2,12 = 3.89 for No. Children. For No. Children, the observed
F = 25.61 > F.05,2,12 = 3.89 and the decision is to reject the null hypothesis.

Critical F.05,6,12 = 3.00 for interaction. For interaction, the observed F = 0.86 <
F.05,6,12 = 3.00 and the decision is to fail to reject the null hypothesis.
11.43 Source df SS MS F__
Location 2 1736.22 868.11 34.31
Competitors 3 1078.33 359.44 14.20
Interaction 6 503.33 83.89 3.32
Error 24 607.33 25.31_______
Total 35 3925.22
α = .05

Critical F.05,2,24 = 3.40 for rows. For rows, the observed F = 34.31 >
F.05,2,24 = 3.40 and the decision is to reject the null hypothesis.

Critical F.05,3,24 = 3.01 for columns. For columns, the observed F = 14.20 >
F.05,3,24 = 3.01 and the decision is to reject the null hypothesis.

Critical F.05,6,24 = 2.51 for interaction. For interaction, the observed F = 3.32 >
F.05,6,24 = 2.51 and the decision is to reject the null hypothesis.
Note: There is significant interaction in this study. This may confound the
interpretation of the main effects, Location and Number of Competitors.
11.44 This two-way design has 3 row treatments and 5 column treatments. There are 45 total
observations with 3 in each cell.
FR = MSR/MSE = 46.16/3.49 = 13.23

p-value = .000 and the decision is to reject the null hypothesis for rows.

FC = MSC/MSE = 249.70/3.49 = 71.57

p-value = .000 and the decision is to reject the null hypothesis for columns.

FI = MSI/MSE = 55.27/3.49 = 15.84

p-value = .000 and the decision is to reject the null hypothesis for interaction.
Because there is significant interaction, the analysis of main effects is confounded. The
graph of means displays the crossing patterns of the line segments indicating the
presence of interaction.
11.45 The null hypotheses are that there are no interaction effects, that there are no
significant differences in the means of the valve openings by machine, and that there
are no significant differences in the means of the valve openings by shift. Since the p-
value for interaction effects is .876, there are no significant interaction effects and that
is good since significant interaction effects would confound that study. The p-value for
columns (shifts) is .008 indicating that column effects are significant at alpha of .01.
There is a significant difference in the mean valve opening according to shift. No
multiple comparisons are given in the output. However, an examination of the shift
means indicates that the mean valve opening on shift one was the largest at 6.47
followed by shift three with 6.3 and shift two with 6.25. The p-value for rows
(machines) is .937 and that is not significant.
11.46 This two-way factorial design has 3 rows and 3 columns with three observations
per cell. The observed F value for rows is 0.19, for columns is 1.19, and for
interaction is 1.40. Using an alpha of .05, the critical F value for rows and columns
(same df) is F.05,2,18 = 3.55. Neither the observed F value for rows nor the observed F value for columns is significant. The critical F value for interaction is F.05,4,18 = 2.93.
There is no significant interaction.
11.47 Source df SS MS F__
Treatment 3 66.69 22.23 8.82
Error 12 30.25 2.52______
Total 15 96.94
α = .05     Critical F.05,3,12 = 3.49

Since the treatment F = 8.82 > F.05,3,12 = 3.49, the decision is to reject the null hypothesis.
For Tukey's HSD:
MSE = 2.52   n = 4   N = 16   C = 4   N - C = 12

q.05,4,12 = 4.20

HSD = q·√(MSE/n) = (4.20)·√(2.52/4) = 3.33

x̄1 = 12   x̄2 = 7.75   x̄3 = 13.25   x̄4 = 11.25

Using HSD of 3.33, there are significant pairwise differences between means 1 and 2, means 2 and 3, and means 2 and 4.
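The HSD computation above is a direct transcription of HSD = q·√(MSE/n); here is a minimal Python sketch, not from the text (function name illustrative):

```python
import math

def tukey_hsd(q, mse, n):
    """Tukey's HSD for equal group sizes: q * sqrt(MSE / n)."""
    return q * math.sqrt(mse / n)

# Problem 11.47: q.05,4,12 = 4.20, MSE = 2.52, n = 4 per group
print(round(tukey_hsd(4.20, 2.52, 4), 2))  # 3.33
```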
11.48 Source df SS MS F__
Treatment 6 68.19 11.365 0.87
Error 19 249.61 13.137______
Total 25 317.80
11.49 Source df SS MS F__
Treatment 5 210 42.000 2.31
Error 36 655 18.194______
Total 41 865
11.50 Source df SS MS F__
Treatment 2 150.91 75.46 16.19
Error 22 102.53 4.66________
Total 24 253.44
α = .01     Critical F.01,2,22 = 5.72

Since the observed F = 16.19 > F.01,2,22 = 5.72, the decision is to reject the null hypothesis.

x̄1 = 9.200   x̄2 = 14.250   x̄3 = 8.714286

n1 = 10   n2 = 8   n3 = 7

MSE = 4.66   C = 3   N = 25   N - C = 22

α = .01     q.01,3,22 = 4.64
HSD1,2 = 4.64·√[(4.66/2)(1/10 + 1/8)] = 3.36

HSD1,3 = 4.64·√[(4.66/2)(1/10 + 1/7)] = 3.49

HSD2,3 = 4.64·√[(4.66/2)(1/8 + 1/7)] = 3.67

|x̄1 − x̄2| = 5.05 and |x̄2 − x̄3| = 5.5357 are significantly different at α = .01
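The Tukey-Kramer values above can be sketched in Python; this is illustrative and not part of the original solution (function name is an assumption):

```python
import math

def tukey_kramer_hsd(q, mse, ni, nj):
    """Tukey-Kramer HSD for a pair of groups with unequal sizes."""
    return q * math.sqrt((mse / 2) * (1 / ni + 1 / nj))

# Problem 11.50: q.01,3,22 = 4.64, MSE = 4.66
print(round(tukey_kramer_hsd(4.64, 4.66, 10, 8), 2))  # 3.36
print(round(tukey_kramer_hsd(4.64, 4.66, 10, 7), 2))  # 3.49
print(round(tukey_kramer_hsd(4.64, 4.66, 8, 7), 2))   # 3.67
```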
11.51 This design is a repeated-measures type random block design. There is one
treatment variable with three levels. There is one blocking variable with six
people in it (six levels). The degrees of freedom treatment are two. The degrees
of freedom block are five. The error degrees of freedom are ten. The total
degrees of freedom are seventeen. There is one dependent variable.
11.52 Source df SS MS F__
Treatment 3 20,994 6998.00 5.58
Blocks 9 16,453 1828.11 1.46
Error 27 33,891 1255.22_____
Total 39 71,338
α = .05     Critical F.05,3,27 = 2.96 for treatments

Since the observed F = 5.58 > F.05,3,27 = 2.96 for treatments, the decision is to reject the null hypothesis.
11.53 Source df SS MS F__
Treatment 3 240.125 80.042 31.51
Blocks 5 548.708 109.742 43.20
Error 15 38.125 2.542______
Total 23 826.958
α = .05     Critical F.05,3,15 = 3.29 for treatments

Since for treatments the observed F = 31.51 > F.05,3,15 = 3.29, the decision is to reject the null hypothesis.
For Tukey's HSD:
Ignoring the blocking effects, the sum of squares blocking and sum of squares error are combined into a new SSerror = 548.708 + 38.125 = 586.833. Combining the degrees of freedom error and blocking yields a new dferror = 20. Using these new figures, we compute a new mean square error, MSE = 586.833/20 = 29.34165.

n = 6   C = 4   N = 24   N - C = 20   q.05,4,20 = 3.96

HSD = q·√(MSE/n) = (3.96)·√(29.34165/6) = 8.757

x̄1 = 16.667   x̄2 = 12.333   x̄3 = 12.333   x̄4 = 19.833
None of the pairs of means are significantly different using Tukey's HSD = 8.757.
This may be due in part to the fact that we compared means by folding the
blocking effects back into error and the blocking effects were highly
significant.
11.54 Source df SS MS F__
Treatment 1 4 29.13 7.2825 1.98
Treatment 2 1 12.67 12.6700 3.44
Interaction 4 73.49 18.3725 4.99
Error 30 110.38 3.6793______
Total 39 225.67
α = .05

Critical F.05,4,30 = 2.69 for treatment 1. For treatment 1, the observed F = 1.98 < F.05,4,30 = 2.69 and the decision is to fail to reject the null hypothesis.

Critical F.05,1,30 = 4.17 for treatment 2. For treatment 2, the observed F = 3.44 < F.05,1,30 = 4.17 and the decision is to fail to reject the null hypothesis.

Critical F.05,4,30 = 2.69 for interaction. For interaction, the observed F = 4.99 > F.05,4,30 = 2.69 and the decision is to reject the null hypothesis.
Since there are significant interaction effects, examination of the main effects
should not be done in the usual manner. However, in this case, there are no
significant treatment effects anyway.
11.55 Source df SS MS F___
Treatment 2 3 257.889 85.963 38.21
Treatment 1 2 1.056 0.528 0.23
Interaction 6 17.611 2.935 1.30
Error 24 54.000 2.250________
Total 35 330.556
α = .01

Critical F.01,3,24 = 4.72 for treatment 2. For the treatment 2 effects, the observed F = 38.21 > F.01,3,24 = 4.72 and the decision is to reject the null hypothesis.

Critical F.01,2,24 = 5.61 for treatment 1. For the treatment 1 effects, the observed F = 0.23 < F.01,2,24 = 5.61 and the decision is to fail to reject the null hypothesis.

Critical F.01,6,24 = 3.67 for interaction. For the interaction effects, the observed F = 1.30 < F.01,6,24 = 3.67 and the decision is to fail to reject the null hypothesis.
11.56 Source df SS MS F___
Age 2 49.3889 24.6944 38.65
Column 3 1.2222 0.4074 0.64
Interaction 6 1.2778 0.2130 0.33
Error 24 15.3333 0.6389_______
Total 35 67.2222
α = .05

Critical F.05,2,24 = 3.40 for Age. For the age effects, the observed F = 38.65 > F.05,2,24 = 3.40 and the decision is to reject the null hypothesis.

Critical F.05,3,24 = 3.01 for Region. For the region effects, the observed F = 0.64 < F.05,3,24 = 3.01 and the decision is to fail to reject the null hypothesis.

Critical F.05,6,24 = 2.51 for interaction. For the interaction effects, the observed F = 0.33 < F.05,6,24 = 2.51 and the decision is to fail to reject the null hypothesis.
There are no significant interaction effects. Only the Age effects are significant.
Computing Tukey's HSD for Age:
x̄1 = 2.667   x̄2 = 4.917   x̄3 = 2.250

n = 12   C = 3   N = 36   N - C = 33

MSE is recomputed by folding the interaction and column sums of squares and degrees of freedom into the previous error terms:

MSE = (1.2222 + 1.2778 + 15.3333)/(3 + 6 + 24) = 0.5404

q.05,3,33 = 3.49

HSD = q·√(MSE/n) = (3.49)·√(0.5404/12) = 0.7406
Using HSD, there are significant pairwise differences between means 1 and 2 and
between means 2 and 3.
Shown below is a graph of the interaction using the cell means by Age.
11.57 Source df SS MS F__
Treatment 3 90477679 30159226 7.38
Error 20 81761905 4088095_______
Total 23 172000000
α = .05     Critical F.05,3,20 = 3.10

The treatment F = 7.38 > F.05,3,20 = 3.10 and the decision is to reject the null hypothesis.
11.58 Source df SS MS F__
Treatment 2 460,353 230,176 103.70
Blocks 5 33,524 6,705 3.02
Error 10 22,197 2,220_______
Total 17 516,074
α = .01     Critical F.01,2,10 = 7.56 for treatments

Since the treatment observed F = 103.70 > F.01,2,10 = 7.56, the decision is to reject the null hypothesis.
11.59 Source df SS MS F__
Treatment 2 9.555 4.777 0.46
Error 18 185.1337 10.285_______
Total 20 194.6885
α = .05     Critical F.05,2,18 = 3.55

Since the treatment F = 0.46 < F.05,2,18 = 3.55, the decision is to fail to reject the null hypothesis.
Since there are no significant treatment effects, it would make no sense to
compute Tukey-Kramer values and do pairwise comparisons.
11.60 Source df SS MS F___
Years 2 4.875 2.437 5.16
Size 3 17.083 5.694 12.06
Interaction 6 2.292 0.382 0.81
Error 36 17.000 0.472_______
Total 47 41.250
α = .05

Critical F.05,2,36 = 3.32 for Years. For Years, the observed F = 5.16 > F.05,2,36 = 3.32 and the decision is to reject the null hypothesis.

Critical F.05,3,36 = 2.92 for Size. For Size, the observed F = 12.06 > F.05,3,36 = 2.92 and the decision is to reject the null hypothesis.

Critical F.05,6,36 = 2.42 for interaction. For interaction, the observed F = 0.81 < F.05,6,36 = 2.42 and the decision is to fail to reject the null hypothesis.
There are no significant interaction effects. There are significant row and column effects at α = .05.
11.61 Source df SS MS F___
Treatment 4 53.400 13.350 13.64
Blocks 7 17.100 2.443 2.50
Error 28 27.400 0.979________
Total 39 97.900
α = .05     Critical F.05,4,28 = 2.71 for treatments

For treatments, the observed F = 13.64 > F.05,4,28 = 2.71 and the decision is to reject the null hypothesis.
11.62 This is a one-way ANOVA with four treatment levels. There are 36 observations
in the study. The p-value of .045 indicates that there is a significant overall
difference in the means at α = .05. An examination of the mean analysis shows
that the sample sizes are different with sizes of 8, 7, 11, and 10, respectively. No
multiple comparison technique was used here to conduct pairwise comparisons.
However, a study of sample means shows that the two most extreme means are
from levels one and four. These two means would be the most likely candidates
for multiple comparison tests. Note that the confidence intervals for means one
and four (shown in the graphical output) are seemingly non-overlapping
indicating a potentially significant difference.
11.63 Excel reports that this is a two-factor design without replication indicating that
this is a random block design. Neither the row nor the column p-values are less
than .05 indicating that there are no significant treatment or blocking effects in this
study. Also displayed in the output to underscore this conclusion are the observed and
critical F values for both treatments and blocking. In both
cases, the observed value is less than the critical value.
11.64 This is a two-way ANOVA with 5 rows and 2 columns. There are 2 observations
per cell. For rows, FR = 0.98 with a p-value of .461, which is not significant. For columns, FC = 2.67 with a p-value of .134, which is not significant. For interaction, FI = 4.65 with a p-value of .022, which is significant at α = .05. Thus, there are significant interaction effects, and the row and column effects are confounded. An examination of the interaction plot reveals that most of the lines cross, verifying the finding of significant interaction.
11.65 This is a two-way ANOVA with 4 rows and 3 columns. There are 3 observations
per cell. FR = 4.30 with a p-value of .014 is significant at α = .05. The null hypothesis is rejected for rows. FC = 0.53 with a p-value of .594 is not significant. We fail to reject the null hypothesis for columns. FI = 0.99 with a p-value of .453 for interaction is not significant. We fail to reject the null hypothesis for interaction effects.
11.66 This was a random block design with 5 treatment levels and 5 blocking levels.
For both treatment and blocking effects, the critical value is F.05,4,16 = 3.01. The observed F value for treatment effects is MSC / MSE = 35.98 / 7.36 = 4.89, which is greater than the critical value. The null hypothesis for treatments is rejected, and we conclude that there is a significant difference in treatment means. No multiple comparisons have been computed in the output. The observed F value for blocking effects is MSR / MSE = 10.36 / 7.36 = 1.41, which is less than the critical value. There are no significant blocking effects. Using a random block design on this experiment may have cost a loss of power.
11.67 This one-way ANOVA has 4 treatment levels and 24 observations. The F = 3.51
yields a p-value of .034 indicating significance at α = .05. Since the sample sizes are equal, Tukey's HSD is used to make multiple comparisons. The computer
output shows that means 1 and 3 are the only pairs that are significantly different
(same signs in confidence interval). Observe on the graph that the confidence
intervals for means 1 and 3 barely overlap.
Chapter 13
Nonparametric Statistics
LEARNING OBJECTIVES
This chapter presents several nonparametric statistics that can be used to analyze data enabling
you to:
1. Recognize the advantages and disadvantages of nonparametric statistics.
2. Understand how to use the runs test to test for randomness.
3. Know when and how to use the Mann-Whitney U Test, the Wilcoxon matched-pairs
signed rank test, the Kruskal-Wallis test, and the Friedman test.
4. Learn when and how to measure correlation using Spearman's rank correlation
measurement.
CHAPTER TEACHING STRATEGY
Chapter 13 contains six new techniques for analysis. Only the first technique, the runs
test, is conceptually a different idea for the student to consider than anything presented in the
text to this point. The runs test is a mechanism for testing to determine if a string of data are
random. There is a runs test for small samples that uses Table A.12 in the appendix and a test
for large samples, which utilizes a z test.
The main portion of chapter 13 (middle part) contains nonparametric alternatives
to parametric tests presented earlier in the book. The Mann-Whitney U test is a
nonparametric alternative to the t test for independent means. The Wilcoxon matched-
pairs signed ranks test is an alternative to the t test for matched-pairs. The Kruskal-
Wallis is a nonparametric alternative to the one-way analysis of variance test. The
Friedman test is a nonparametric alternative to the randomized block design presented in
chapter 11. Each of these four tests utilizes rank analysis.
The last part of the chapter is a section on Spearman's rank correlation. This correlation
coefficient can be presented as a nonparametric alternative to the Pearson product-moment
correlation coefficient of chapter 3. Spearman's rank correlation uses either ranked data or data
that is converted to ranks. The interpretation of Spearman's rank correlation is similar to
Pearson's product-moment correlation coefficient.
CHAPTER OUTLINE
13.1 Runs Test
Small-Sample Runs Test
Large-Sample Runs Test
13.2 Mann-Whitney U Test
Small-Sample Case
Large-Sample Case
13.3 Wilcoxon Matched-Pairs Signed Rank Test
Small-Sample Case (n < 15)
Large-Sample Case (n > 15)
13.4 Kruskal-Wallis Test
13.5 Friedman Test
13.6 Spearman's Rank Correlation
KEY TERMS
Friedman Test                    Parametric Statistics
Kruskal-Wallis Test              Runs Test
Mann-Whitney U Test              Spearman's Rank Correlation
Nonparametric Statistics         Wilcoxon Matched-Pairs Signed Rank Test
SOLUTIONS TO CHAPTER 13
13.1 Ho: The observations in the sample are randomly generated.
Ha: The observations in the sample are not randomly generated.

This is a small-sample runs test since n1, n2 < 20.

α = .05. The lower-tail critical value is 6 and the upper-tail critical value is 16.

n1 = 10   n2 = 10

R = 11

Since R = 11 is between the two critical values, the decision is to fail to reject the null hypothesis. The data are random.
13.2 Ho: The observations in the sample are randomly generated.
Ha: The observations in the sample are not randomly generated.

α = .05, α/2 = .025, z.025 = ±1.96

n1 = 26   n2 = 21   n = 47

μR = 2·n1·n2/(n1 + n2) + 1 = 2(26)(21)/(26 + 21) + 1 = 24.234

σR = √[2·n1·n2·(2·n1·n2 − n1 − n2) / ((n1 + n2)²·(n1 + n2 − 1))] = √[2(26)(21)·(2(26)(21) − 26 − 21) / ((26 + 21)²·(26 + 21 − 1))] = 3.351

R = 9

z = (R − μR)/σR = (9 − 24.234)/3.351 = -4.55

Since the observed value of z = -4.55 < z.025 = -1.96, the decision is to reject the null hypothesis. The data are not randomly generated.
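The large-sample runs test computation is mechanical; here is a minimal Python sketch, not part of the original solutions (the function name is illustrative):

```python
import math

def runs_test_z(n1, n2, r):
    """Large-sample runs test: z statistic for observed runs r,
    given n1 and n2 observations of each type."""
    mu_r = (2 * n1 * n2) / (n1 + n2) + 1
    sigma_r = math.sqrt(
        (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2))
        / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    )
    return (r - mu_r) / sigma_r

# Problem 13.2: n1 = 26, n2 = 21, R = 9
print(round(runs_test_z(26, 21, 9), 2))  # -4.55
```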
13.3 n1 = 8   n2 = 52   α = .05

This is a two-tailed test and α/2 = .025. The p-value from the printout is .0264. Since the p-value is the lowest value of alpha for which the null hypothesis can be rejected, the decision is to fail to reject the null hypothesis (p-value = .0264 > .025). There is not enough evidence to reject the claim that the data are randomly generated.
13.4 The observed number of runs is 18. The mean or expected number of runs
is 14.333. The p-value for this test is .1452. Thus, the test is not significant at
alpha of .05 or .025 for a two-tailed test. The decision is to fail to reject the
null hypothesis. There is not enough evidence to declare that the data are not random.
Therefore, we conclude that the data are randomly generated.
13.5 Ho: The observations in the sample are randomly generated.
Ha: The observations in the sample are not randomly generated.

Since n1, n2 > 20, use the large-sample runs test.

α = .05. Since this is a two-tailed test, α/2 = .025 and z.025 = ±1.96. If the observed value of z is greater than 1.96 or less than -1.96, the decision is to reject the null hypothesis.

R = 27   n1 = 40   n2 = 24

μR = 2·n1·n2/(n1 + n2) + 1 = 2(40)(24)/64 + 1 = 31

σR = √[2·n1·n2·(2·n1·n2 − n1 − n2) / ((n1 + n2)²·(n1 + n2 − 1))] = √[2(40)(24)·(2(40)(24) − 40 − 24) / ((64)²(63))] = 3.716

z = (R − μR)/σR = (27 − 31)/3.716 = -1.08

Since the observed z of -1.08 is greater than the critical lower-tail z value of -1.96, the decision is to fail to reject the null hypothesis. The data are randomly generated.
13.6 Ho: The observations in the sample are randomly generated.
Ha: The observations in the sample are not randomly generated.

n1 = 5   n2 = 8   n = 13   α = .05

Since this is a two-tailed test, α/2 = .025.
From Table A.11, the lower critical value is 3.
From Table A.11, the upper critical value is 11.

R = 4

Since R = 4 is greater than the lower critical value of 3 and less than the upper critical value of 11, the decision is to fail to reject the null hypothesis. The data are randomly generated.
13.7 Ho: Group 1 is identical to Group 2
Ha: Group 1 is not identical to Group 2

Use the small-sample Mann-Whitney U test since both n1, n2 < 10, α = .05. Since this is a two-tailed test, α/2 = .025. The p-value is obtained using Table A.13.
Value Rank Group
11 1 1
13 2.5 1
13 2.5 2
14 4 2
15 5 1
17 6 1
18 7.5 1
18 7.5 2
21 9.5 1
21 9.5 2
22 11 2
23 12.5 2
23 12.5 2
24 14 2
26 15 1
29 16 1
n1 = 8   n2 = 8

W1 = 1 + 2.5 + 5 + 6 + 7.5 + 9.5 + 15 + 16 = 62.5

U = n1·n2 + n1(n1 + 1)/2 − W1 = (8)(8) + (8)(9)/2 − 62.5 = 37.5

U' = n1·n2 − U = 64 − 37.5 = 26.5

We use the smaller U, which is 26.5.

From Table A.13, the p-value for U = 27 is .3227(2) = .6454. Since this p-value is greater than α = .05, the decision is to fail to reject the null hypothesis.
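The small-sample Mann-Whitney U computation can be sketched in Python; this is illustrative and not from the text (helper names are assumptions; the value 22 is placed in group 2, consistent with the rank sum W1 = 62.5 used in the solution):

```python
def average_ranks(values):
    """Assign ranks to values, averaging ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(g1, g2):
    """U = n1*n2 + n1(n1+1)/2 - W1; return the smaller of U and U'."""
    n1, n2 = len(g1), len(g2)
    ranks = average_ranks(list(g1) + list(g2))
    w1 = sum(ranks[:n1])
    u = n1 * n2 + n1 * (n1 + 1) / 2 - w1
    return min(u, n1 * n2 - u)

# Problem 13.7 data
g1 = [11, 13, 15, 17, 18, 21, 26, 29]
g2 = [13, 14, 18, 21, 22, 23, 23, 24]
print(mann_whitney_u(g1, g2))  # 26.5
```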
13.8 Ho: Population 1 has values that are no greater than population 2
Ha: Population 1 has values that are greater than population 2
Value Rank Group
203 1 2
208 2 2
209 3 2
211 4 2
214 5 2
216 6 1
217 7 1
218 8 2
219 9 2
222 10 1
223 11 2
224 12 1
227 13 2
229 14 2
230 15.5 2
230 15.5 2
231 17 1
236 18 2
240 19 1
241 20 1
248 21 1
255 22 1
256 23 1
283 24 1
n1 = 11   n2 = 13

W1 = 6 + 7 + 10 + 12 + 17 + 19 + 20 + 21 + 22 + 23 + 24 = 181

μ = n1·n2/2 = (11)(13)/2 = 71.5

σ = √[n1·n2·(n1 + n2 + 1)/12] = √[(11)(13)(25)/12] = 17.26

U = n1·n2 + n1(n1 + 1)/2 − W1 = (11)(13) + (11)(12)/2 − 181 = 28

z = (U − μ)/σ = (28 − 71.5)/17.26 = -2.52

α = .01, z.01 = 2.33

Since |z| = 2.52 > z.01 = 2.33, the decision is to reject the null hypothesis.
13.9 Contacts Rank Group
6 1 1
8 2 1
9 3.5 1
9 3.5 2
10 5 2
11 6.5 1
11 6.5 1
12 8.5 1
12 8.5 2
13 11 1
13 11 2
13 11 2
14 13 2
15 14 2
16 15 2
17 16 2
n1 = 7   n2 = 9

W1 = 39

U1 = n1·n2 + n1(n1 + 1)/2 − W1 = (7)(9) + (7)(8)/2 − 39 = 52

U2 = n1·n2 − U1 = (7)(9) − 52 = 11

U = 11

From Table A.13, the p-value = .0156. Since this p-value is greater than α = .01, the decision is to fail to reject the null hypothesis.
13.10 Ho: Urban and rural spend the same
Ha: Urban and rural spend different amounts
Expenditure Rank Group
1950 1 U
2050 2 R
2075 3 R
2110 4 U
2175 5 U
2200 6 U
2480 7 U
2490 8 R
2540 9 U
2585 10 R
2630 11 U
2655 12 U
2685 13 R
2710 14 U
2750 15 U
2770 16 R
2790 17 R
2800 18 R
2850 19.5 U
2850 19.5 U
2975 21 R
2995 22.5 R
2995 22.5 R
3100 24 R
n1 = 12   n2 = 12

W1 = 1 + 4 + 5 + 6 + 7 + 9 + 11 + 12 + 14 + 15 + 19.5 + 19.5 = 123

μ = n1·n2/2 = (12)(12)/2 = 72

σ = √[n1·n2·(n1 + n2 + 1)/12] = √[(12)(12)(25)/12] = 17.32

U = n1·n2 + n1(n1 + 1)/2 − W1 = (12)(12) + (12)(13)/2 − 123 = 99

z = (U − μ)/σ = (99 − 72)/17.32 = 1.56

α = .05   α/2 = .025   z.025 = ±1.96

Since the observed z = 1.56 < z.025 = 1.96, the decision is to fail to reject the null hypothesis.
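The large-sample Mann-Whitney conversion from the rank sum W1 to a z statistic can be sketched as follows (illustrative, not from the text):

```python
import math

def mann_whitney_z(w1, n1, n2):
    """Large-sample Mann-Whitney U: convert the rank sum W1 of
    group 1 into a U statistic and standardize it."""
    u = n1 * n2 + n1 * (n1 + 1) / 2 - w1
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mu) / sigma

# Problem 13.10: W1 = 123, n1 = n2 = 12
print(round(mann_whitney_z(123, 12, 12), 2))  # 1.56
```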
13.11 Ho: Males do not earn more than females
Ha: Males do earn more than females
Earnings Rank Gender
$28,900 1 F
31,400 2 F
36,600 3 F
40,000 4 F
40,500 5 F
41,200 6 F
42,300 7 F
42,500 8 F
44,500 9 F
45,000 10 M
47,500 11 F
47,800 12.5 F
47,800 12.5 M
48,000 14 F
50,100 15 M
51,000 16 M
51,500 17.5 M
51,500 17.5 M
53,850 19 M
55,000 20 M
57,800 21 M
61,100 22 M
63,900 23 M
n1 = 11   n2 = 12

W1 = 10 + 12.5 + 15 + 16 + 17.5 + 17.5 + 19 + 20 + 21 + 22 + 23 = 193.5

μ = n1·n2/2 = (11)(12)/2 = 66

σ = √[n1·n2·(n1 + n2 + 1)/12] = √[(11)(12)(24)/12] = 16.25

U = n1·n2 + n1(n1 + 1)/2 − W1 = (11)(12) + (11)(12)/2 − 193.5 = 4.5

z = (U − μ)/σ = (4.5 − 66)/16.25 = -3.78

α = .01, z.01 = 2.33

Since |z| = 3.78 > z.01 = 2.33, the decision is to reject the null hypothesis.
13.12 Ho: There is no difference in the price of a single-family home in Denver and Hartford
Ha: There is a difference in the price of a single-family home in Denver and Hartford
Price Rank City
132,405 1 D
134,127 2 H
134,157 3 D
134,514 4 H
135,062 5 D
135,238 6 H
135,940 7 D
136,333 8 H
136,419 9 H
136,981 10 D
137,016 11 D
137,359 12 H
137,741 13 H
137,867 14 H
138,057 15 D
139,114 16 H
139,638 17 D
140,031 18 H
140,102 19 D
140,479 20 D
141,408 21 D
141,730 22 D
141,861 23 D
142,012 24 H
142,136 25 H
143,947 26 H
143,968 27 H
144,500 28 H
n1 = 13   n2 = 15

W1 = 1 + 3 + 5 + 7 + 10 + 11 + 15 + 17 + 19 + 20 + 21 + 22 + 23 = 174

U = n1·n2 + n1(n1 + 1)/2 − W1 = (13)(15) + (13)(14)/2 − 174 = 112

μ = n1·n2/2 = (13)(15)/2 = 97.5

σ = √[n1·n2·(n1 + n2 + 1)/12] = √[(13)(15)(29)/12] = 21.708

z = (U − μ)/σ = (112 − 97.5)/21.708 = 0.67

For α = .05 and a two-tailed test, α/2 = .025 and z.025 = ±1.96.

Since the observed z = 0.67 < z.025 = 1.96, the decision is to fail to reject the null hypothesis. There is not enough evidence to declare that there is a price difference for single-family homes in Denver and Hartford.
13.13 Ho: The population differences = 0
Ha: The population differences ≠ 0
1 2 d Rank
212 179 33 15
234 184 50 16
219 213 6 7.5
199 167 32 13.5
194 189 5 6
206 200 6 7.5
234 212 22 11
225 221 4 5
220 223 -3 - 3.5
218 217 1 1
234 208 26 12
212 215 -3 -3.5
219 187 32 13.5
196 198 -2 -2
178 189 -11 -9
213 201 12 10
n = 16

T− = 3.5 + 3.5 + 2 + 9 = 18

μ = n(n + 1)/4 = (16)(17)/4 = 68

σ = √[n(n + 1)(2n + 1)/24] = √[(16)(17)(33)/24] = 19.34

z = (T − μ)/σ = (18 − 68)/19.34 = -2.59

α = .10   α/2 = .05   z.05 = ±1.645

Since the observed z = -2.59 < z.05 = -1.645, the decision is to reject the null hypothesis.
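The large-sample Wilcoxon standardization used here can be sketched in Python (illustrative, not from the text):

```python
import math

def wilcoxon_z(t, n):
    """Large-sample Wilcoxon matched-pairs signed rank test:
    standardize the rank-sum statistic T for n non-zero differences."""
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t - mu) / sigma

# Problem 13.13: T = T- = 18, n = 16
print(round(wilcoxon_z(18, 16), 2))  # -2.59
```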
13.14 Ho: Md = 0
Ha: Md ≠ 0
Before After d Rank
49 43 6 + 9
41 29 12 +12
47 30 17 +14
39 38 1 + 1.5
53 40 13 +13
51 43 8 +10
51 46 5 + 7.5
49 40 9 +11
38 42 -4 - 5.5
54 50 4 + 5.5
46 47 -1 - 1.5
50 47 3 + 4
44 39 5 + 7.5
49 49 0
45 47 -2 - 3
n = 15 but after dropping the zero difference, n = 14

α = .05, for a two-tailed test α/2 = .025, and from Table A.14, T.025,14 = 21

T+ = 9 + 12 + 14 + 1.5 + 13 + 10 + 7.5 + 11 + 5.5 + 4 + 7.5 = 95

T− = 5.5 + 1.5 + 3 = 10

T = min(T+, T−) = min(95, 10) = 10

Since the observed value of T = 10 < T.025,14 = 21, the decision is to reject the null hypothesis. There is a significant difference between before and after.
13.15 Ho: The population differences ≥ 0
Ha: The population differences < 0
Before After d Rank
10,500 12,600 -2,100 -11
8,870 10,660 -1,790 -9
12,300 11,890 410 3
10,510 14,630 -4,120 -17
5,570 8,580 -3,010 -15
9,150 10,115 -965 -7
11,980 14,320 -2,370 -12
6,740 6,900 -160 -2
7,340 8,890 -1,550 -8
13,400 16,540 -3,140 -16
12,200 11,300 900 6
10,570 13,330 -2,760 -13
9,880 9,990 -110 -1
12,100 14,050 -1,950 -10
9,000 9,500 -500 -4
11,800 12,450 -650 -5
10,500 13,450 -2,950 -14
Since n = 17, use the large sample test
T+ = 3 + 6 = 9
T = T+ = 9

μ = n(n + 1)/4 = (17)(18)/4 = 76.5

σ = √[n(n + 1)(2n + 1)/24] = √[(17)(18)(35)/24] = 21.12

z = (T − μ)/σ = (9 − 76.5)/21.12 = -3.20

α = .05   z.05 = -1.645

Since the observed z = -3.20 < z.05 = -1.645, the decision is to reject the null hypothesis.
13.16 Ho: Md = 0
Ha: Md < 0
Manual Scanner d Rank
426 473 -47 -11
387 446 -59 -13
410 421 -11 -5.5
506 510 -4 -2
411 465 -54 -12
398 409 -11 -5.5
427 414 13 7
449 459 -10 -4
407 502 -95 -14
438 439 -1 -1
418 456 -38 -10
482 499 -17 -8
512 517 -5 -3
402 437 -35 -9
n = 14

T+ = 7

T− = 11 + 13 + 5.5 + 2 + 12 + 5.5 + 4 + 14 + 1 + 10 + 8 + 3 + 9 = 98

T = min(T+, T−) = min(7, 98) = 7

From Table A.14 with α = .05, n = 14, T.05,14 = 26

Since the observed T = 7 < T.05,14 = 26, the decision is to reject the null hypothesis.
The differences are significantly less than zero and the after scores are
significantly higher.
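The small-sample Wilcoxon T, including average ranks for tied absolute differences, can be sketched in Python (illustrative, not part of the original solution):

```python
def wilcoxon_t(differences):
    """Small-sample Wilcoxon matched-pairs signed rank T: drop zero
    differences, rank absolute values (averaging ties), and return
    T = min(T+, T-)."""
    d = [x for x in differences if x != 0]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(d):
        j = i
        while j + 1 < len(d) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    t_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    t_minus = sum(r for r, x in zip(ranks, d) if x < 0)
    return min(t_plus, t_minus)

# Problem 13.16: manual minus scanner differences
d = [-47, -59, -11, -4, -54, -11, 13, -10, -95, -1, -38, -17, -5, -35]
print(wilcoxon_t(d))  # 7.0
```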
13.17 Ho: The population differences ≥ 0
Ha: The population differences < 0
1999 2006 d Rank
49 54 -5 -7.5
27 38 -11 -15
39 38 1 2
75 80 -5 -7.5
59 53 6 11
67 68 -1 -2
22 43 -21 -20
61 67 -6 -11
58 73 -15 -18
60 55 5 7.5
72 58 14 16.5
62 57 5 7.5
49 63 -14 -16.5
48 49 -1 -2
19 39 -20 -19
32 34 -2 -4.5
60 66 -6 -11
80 90 -10 -13.5
55 57 -2 -4.5
68 58 10 13.5
n = 20

T = T+ = 2 + 11 + 7.5 + 16.5 + 7.5 + 13.5 = 58

μ = n(n + 1)/4 = (20)(21)/4 = 105

σ = √[n(n + 1)(2n + 1)/24] = √[(20)(21)(41)/24] = 26.79

z = (T − μ)/σ = (58 − 105)/26.79 = -1.75

For α = .10, z.10 = -1.28

Since the observed z = -1.75 < z.10 = -1.28, the decision is to reject the null hypothesis.
13.18 Ho: The population differences ≤ 0
Ha: The population differences > 0
April April
2002 2006 d Rank
63.1 57.1 5.7 16
67.1 66.4 0.7 3.5
65.5 61.8 3.7 12
68.0 65.3 2.7 8.5
66.6 63.5 3.1 10.5
65.7 66.4 -0.7 -3.5
69.2 64.9 4.3 14
67.0 65.2 1.8 6.5
65.2 65.1 0.1 1.5
60.7 62.2 -1.5 -5
63.4 60.3 3.1 10.5
59.2 57.4 1.8 6.5
62.9 58.2 4.7 15
69.4 65.3 4.1 13
67.3 67.2 0.1 1.5
66.8 64.1 2.7 8.5
n = 16

T = T− = 8.5

μ = n(n + 1)/4 = (16)(17)/4 = 68

σ = √[n(n + 1)(2n + 1)/24] = √[(16)(17)(33)/24] = 19.339

z = (T − μ)/σ = (8.5 − 68)/19.339 = -3.08

For α = .05, z.05 = 1.645

Since |z| = 3.08 > z.05 = 1.645, the decision is to reject the null hypothesis.
13.19 Ho: The 5 populations are identical
Ha: At least one of the 5 populations is different
1 2 3 4 5__
157 165 219 286 197
188 197 257 243 215
175 204 243 259 235
174 214 231 250 217
201 183 217 279 240
203 203 233
213
BY RANKS
1 2 3 4 5__
1 2 18 29 7.5
6 7.5 26 23.5 15
4 12 23.5 27 21
3 14 19 25 16.5
9 5 16.5 28 22
10.5 10.5 20
_ _ __ __ 13_
Tj    33.5   40.5   113.5   132.5   115
nj    6      5      6       5       7
Σ(Tj²/nj) = (33.5)²/6 + (40.5)²/5 + (113.5)²/6 + (132.5)²/5 + (115)²/7 = 8,062.67

n = 29

K = [12/(n(n + 1))]·Σ(Tj²/nj) − 3(n + 1) = [12/(29)(30)]·(8,062.67) − 3(30) = 21.21

α = .01   df = c - 1 = 5 - 1 = 4   χ².01,4 = 13.2767

Since the observed K = 21.21 > χ².01,4 = 13.2767, the decision is to reject the null hypothesis.
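The Kruskal-Wallis K computed above depends only on the rank sums Tj and group sizes nj; a minimal Python sketch (illustrative, not from the text):

```python
def kruskal_wallis_k(rank_sums, ns):
    """Kruskal-Wallis K from the rank sums Tj and group sizes nj."""
    n = sum(ns)
    s = sum(t * t / m for t, m in zip(rank_sums, ns))
    return 12 / (n * (n + 1)) * s - 3 * (n + 1)

# Problem 13.19: rank sums and group sizes from the table above
k = kruskal_wallis_k([33.5, 40.5, 113.5, 132.5, 115], [6, 5, 6, 5, 7])
print(round(k, 2))  # 21.21
```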
13.20 Ho: The 3 populations are identical
Ha: At least one of the 3 populations is different
Group 1 Group 2 Group 3
19 30 39
21 38 32
29 35 41
22 24 44
37 29 30
42 27
33
By Ranks
Group 1 Group 2 Group 3
1 8.5 15
2 14 10
6.5 12 16
3 4 18
13 6.5 8.5
17 5
__ _ 11_
Tj    42.5   45   83.5
nj    6      5    7
Page | 633
7
) 5 . 83 (
5
) 45 (
6
) 5 . 42 (
2 2 2
2
+ + =
j
j
n
T
= 1,702.08
n = 18
= +
+
= ) 19 ( 3 ) 08 . 702 , 1 (
) 19 ( 18
12
) 1 ( 3
) 1 (
12
2
n
n
T
n n
K
j
j
= 2.72
o = .05, df = c - 1 = 3 - 1 = 2
;
2
.05,2
= 5.9915
Since the observed K = 2.72 < ;
2
.05,2
= 5.9915, the decision is to fail to reject
the null hypothesis.
13.21 Ho: The 4 populations are identical
Ha: At least one of the 4 populations is different
Region 1 Region 2 Region 3 Region 4
$1,200 $225 $ 675 $1,075
450 950 500 1,050
110 100 1,100 750
800 350 310 180
375 275 660 330
200 680
425
By Ranks
Region 1 Region 2 Region 3 Region 4
23 5 15 21
12 19 13 20
2 1 22 17
18 9 7 3
10 6 14 8
4 16
_ _ _ 11
Tj    69   40   71   96
nj    6    5    5    7
Σ(Tj²/nj) = (69)²/6 + (40)²/5 + (71)²/5 + (96)²/7 = 3,438.27

n = 23

K = [12/(n(n + 1))]·Σ(Tj²/nj) − 3(n + 1) = [12/(23)(24)]·(3,438.27) − 3(24) = 2.75

α = .05   df = c - 1 = 4 - 1 = 3   χ².05,3 = 7.8147

Since the observed K = 2.75 < χ².05,3 = 7.8147, the decision is to fail to reject the null hypothesis.
13.22 Ho: The 3 populations are identical
Ha: At least one of the 3 populations is different
Small Town City Suburb
$21,800 $22,300 $22,000
22,500 21,900 22,600
21,750 21,900 22,800
22,200 22,650 22,050
21,600 21,800 21,250
22,550
By Ranks
Small Town City Suburb
4.5 11 8
12 6.5 14
3 6.5 16
10 15 9
2 4.5 1
__ __ 13
Tj    31.5   43.5   61
nj    5      5      6

Σ(Tj²/nj) = (31.5)²/5 + (43.5)²/5 + (61)²/6 = 1,197.07

n = 16

K = [12/(n(n + 1))]·Σ(Tj²/nj) − 3(n + 1) = [12/(16)(17)]·(1,197.07) − 3(17) = 1.81

α = .05   df = c - 1 = 3 - 1 = 2   χ².05,2 = 5.9915

Since the observed K = 1.81 < χ².05,2 = 5.9915, the decision is to fail to reject the null hypothesis.
13.23 Ho: The 4 populations are identical
Ha: At least one of the 4 populations is different
Amusement Parks Lake Area City National Park
0 3 2 2
1 2 2 4
1 3 3 3
0 5 2 4
2 4 3 3
1 4 2 5
0 3 3 4
5 3 4
2 1
3
By Ranks
Amusement Parks Lake Area City National Park
2 20.5 11.5 11.5
5.5 11.5 11.5 28.5
5.5 20.5 20.5 20.5
2 33 11.5 28.5
11.5 28.5 20.5 20.5
5.5 28.5 11.5 33
2 20.5 20.5 28.5
33 20.5 28.5
11.5 5.5
__ __ 20.5 ____
Tj    34   207.5   154.0   199.5
nj    7    9       10      8

Σ(Tj²/nj) = (34)²/7 + (207.5)²/9 + (154)²/10 + (199.5)²/8 = 12,295.80

n = 34

K = [12/(n(n + 1))]·Σ(Tj²/nj) − 3(n + 1) = [12/(34)(35)]·(12,295.80) − 3(35) = 18.99

α = .05   df = c - 1 = 4 - 1 = 3   χ².05,3 = 7.8147

Since the observed K = 18.99 > χ².05,3 = 7.8147, the decision is to reject the null hypothesis.
13.24 Ho: The 3 populations are identical
Ha: At least one of the 3 populations is different
Day Shift Swing Shift Graveyard Shift
52 45 41
57 48 46
53 44 39
56 51 49
55 48 42
50 54 35
51 49 52
43
By Ranks
Day Shift Swing Shift Graveyard Shift
16.5 7 3
22 9.5 8
18 6 2
21 14.5 11.5
20 9.5 4
13 19 1
14.5 11.5 16.5
_ 5 _ ___
Tj    125   82   46
nj    7     8    7

Σ(Tj²/nj) = (125)²/7 + (82)²/8 + (46)²/7 = 3,374.93

n = 22

K = [12/(n(n + 1))]·Σ(Tj²/nj) − 3(n + 1) = [12/(22)(23)]·(3,374.93) − 3(23) = 11.04

α = .05   df = c - 1 = 3 - 1 = 2   χ².05,2 = 5.9915

Since the observed K = 11.04 > χ².05,2 = 5.9915, the decision is to reject the null hypothesis.
13.25 Ho: The treatment populations are equal
Ha: At least one of the treatment populations yields larger values than at least one other treatment population.

Use the Friedman test with α = .05.

c = 5, b = 5, df = c - 1 = 4, χ².05,4 = 9.4877

If the observed value of χ²r > 9.4877, then the decision will be to reject the null hypothesis.
Shown below are the data ranked by blocks:
1 2 3 4 5
1 1 4 3 5 2
2 1 3 4 5 2
3 2.5 1 4 5 2.5
4 3 2 4 5 1
5 4 2 3 5 1
R
j
11.5 12 18 25 8.5
R
j
2
132.25 144 324 625 72.25
ER
j
2
= 1,297.5
Page | 643
) 6 )( 5 ( 3 ) 5 . 297 , 1 (
) 6 )( 5 )( 5 (
12
) 1 ( 3
) 1 (
12 2
2
= +
+
=
c b R
c bc
j r
; = 13.8
Since the observed value of ;
r
2
= 13.8 > ;
4,.05
2
= 9.4877, the decision is to
reject the null hypothesis. At least one treatment population yields larger values than
at least one other treatment population.
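As a quick numerical check (an illustrative sketch, not part of the original solution), the Friedman statistic in 13.25 can be computed from the block ranks:

```python
# Block ranks from problem 13.25 (5 blocks x 5 treatments)
block_ranks = [
    [1, 4, 3, 5, 2],
    [1, 3, 4, 5, 2],
    [2.5, 1, 4, 5, 2.5],
    [3, 2, 4, 5, 1],
    [4, 2, 3, 5, 1],
]
b = len(block_ranks)         # number of blocks
c = len(block_ranks[0])      # number of treatments

# Rank totals R_j per treatment (column sums)
R = [sum(row[j] for row in block_ranks) for j in range(c)]

# Friedman statistic: chi_r^2 = 12/(bc(c+1)) * sum(R_j^2) - 3b(c+1)
chi_r2 = 12 / (b * c * (c + 1)) * sum(r**2 for r in R) - 3 * b * (c + 1)
print(R, round(chi_r2, 1))   # [11.5, 12, 18, 25, 8.5] 13.8
```

The rank totals and χr² match the hand calculation above.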
13.26 H0: The treatment populations are equal
Ha: At least one of the treatment populations yields larger values than at least one other treatment population.

Use the Friedman test with α = .05.

c = 6, b = 9, df = c − 1 = 5, χ².05,5 = 11.0705

If the observed value of χr² > 11.0705, then the decision will be to reject the null hypothesis.

Shown below are the data ranked by blocks:

Block    1    2    3    4     5     6
  1      1    3    2    6     5     4
  2      3    5    1    6     4     2
  3      1    3    2    6     5     4
  4      1    3    4    6     5     2
  5      3    1    2    4     6     5
  6      1    3    2    6     5     4
  7      1    2    4    6     5     3
  8      3    1    2    6     5     4
  9      1    2    3    6     5     4
Rj      15   25   25   56    50    38
Rj²    225  625  625  3136  2500  1444

ΣRj² = 8,555

χr² = 12/[bc(c + 1)] · ΣRj² − 3b(c + 1) = 12/[(9)(6)(7)] · (8,555) − 3(9)(7) = 82.59

Since the observed value of χr² = 82.59 > χ².05,5 = 11.0705, the decision is to reject the null hypothesis. At least one treatment population yields larger values than at least one other treatment population.
13.27 H0: The treatment populations are equal
Ha: At least one of the treatment populations yields larger values than at least one other treatment population.

Use the Friedman test with α = .01.

c = 4, b = 6, df = c − 1 = 3, χ².01,3 = 11.3449

If the observed value of χr² > 11.3449, then the decision will be to reject the null hypothesis.

Shown below are the data ranked by blocks:

Block    1    2    3    4
  1      1    4    3    2
  2      2    3    4    1
  3      1    4    3    2
  4      1    3    4    2
  5      1    3    4    2
  6      2    3    4    1
Rj       8   20   22   10
Rj²     64  400  484  100

ΣRj² = 1,048

χr² = 12/[bc(c + 1)] · ΣRj² − 3b(c + 1) = 12/[(6)(4)(5)] · (1,048) − 3(6)(5) = 14.8

Since the observed value of χr² = 14.8 > χ².01,3 = 11.3449, the decision is to reject the null hypothesis. At least one treatment population yields larger values than at least one other treatment population.
13.28 H0: The treatment populations are equal
Ha: At least one of the treatment populations yields larger values than at least one other treatment population.

Use the Friedman test with α = .05.

c = 3, b = 10, df = c − 1 = 2, χ².05,2 = 5.9915

If the observed value of χr² > 5.9915, then the decision will be to reject the null hypothesis.

Shown below are the data ranked by blocks:

Worker   5-day   4-day   3.5-day
  1        3       2        1
  2        3       2        1
  3        3       1        2
  4        3       2        1
  5        2       3        1
  6        3       2        1
  7        3       1        2
  8        3       2        1
  9        3       2        1
 10        3       1        2
Rj        29      18       13
Rj²      841     324      169

ΣRj² = 1,334

χr² = 12/[bc(c + 1)] · ΣRj² − 3b(c + 1) = 12/[(10)(3)(4)] · (1,334) − 3(10)(4) = 13.4

Since the observed value of χr² = 13.4 > χ².05,2 = 5.9915, the decision is to reject the null hypothesis. At least one treatment population yields larger values than at least one other treatment population.
13.29 c = 4 treatments, b = 5 blocks

S = χr² = 2.04 with a p-value of .564.

Since the p-value of .564 > α = .10, .05, or .01, the decision is to fail to reject the null hypothesis. There is no significant difference in treatments.
13.30 The experimental design is a random block design that has been analyzed using a Friedman test. There are five treatment levels and seven blocks. Thus, the degrees of freedom are four. The observed value of S = 13.71 is the equivalent of χr². The p-value is .009, indicating that this test is significant at alpha = .01. The null hypothesis is rejected. That is, at least one population yields larger values than at least one other population. An examination of estimated medians shows that treatment 1 has the lowest value and treatment 3 has the highest value.
13.31   x     y    x Ranked   y Ranked     d      d²
       23   201       3          2         1      1
       41   259      10.5       11        -.5     0.25
       37   234       8          7         1      1
       29   240       6          8        -2      4
       25   231       4          6        -2      4
       17   209       1          3        -2      4
       33   229       7          5         2      4
       41   246      10.5        9         1.5    2.25
       40   248       9         10        -1      1
       28   227       5          4         1      1
       19   200       2          1         1      1

Σd² = 23.5,  n = 11

rs = 1 − 6Σd²/[n(n² − 1)] = 1 − 6(23.5)/[11(120)] = .893
13.32   x    y    d    d²
        4    6   -2    4
        5    8   -3    9
        8    7    1    1
       11   10    1    1
       10    9    1    1
        7    5    2    4
        3    2    1    1
        1    3   -2    4
        2    1    1    1
        9   11   -2    4
        6    4    2    4

Σd² = 34,  n = 11

rs = 1 − 6Σd²/[n(n² − 1)] = 1 − 6(34)/[11(120)] = .845
13.33   x     y    x Ranked   y Ranked     d     d²
       99   108       8          2         6    36
       67   139       4          5        -1     1
       82   117       6          3         3     9
       46   168       1          8        -7    49
       80   124       5          4         1     1
       57   162       3          7        -4    16
       49   145       2          6        -4    16
       91   102       7          1         6    36

Σd² = 164,  n = 8

rs = 1 − 6Σd²/[n(n² − 1)] = 1 − 6(164)/[8(63)] = -.95
13.34   x     y    x Ranked   y Ranked     d      d²
       92   9.3       8          9        -1      1
       96   9.0       9          8         1      1
       91   8.5       6.5        7        -.5     .25
       89   8.0       5          3         2      4
       91   8.3       6.5        5         1.5    2.25
       88   8.4       4          6        -2      4
       84   8.1       3          4        -1      1
       81   7.9       1          2        -1      1
       83   7.2       2          1         1      1

Σd² = 15.5,  n = 9

rs = 1 − 6Σd²/[n(n² − 1)] = 1 − 6(15.5)/[9(80)] = .871
13.35  Bank Credit   Home Equity   Cr. Cd.   Eq. Loan
          Card          Loan        Rank       Rank        d       d²
          2.51          2.07         12          1        11     121
          2.86          1.95          6.5        2         4.5    20.25
          2.33          1.66         13          6         7      49
          2.54          1.77         10          3         7      49
          2.54          1.51         10          7.5       2.5     6.25
          2.18          1.47         14         10         4      16
          3.34          1.75          3          4        -1       1
          2.86          1.73          6.5        5         1.5     2.25
          2.74          1.48          8          9        -1       1
          2.54          1.51         10          7.5       2.5     6.25
          3.18          1.25          4         14       -10     100
          3.53          1.44          1         11       -10     100
          3.51          1.38          2         12       -10     100
          3.11          1.30          5         13        -8      64

Σd² = 636,  n = 14

rs = 1 − 6Σd²/[n(n² − 1)] = 1 − 6(636)/[14(14² − 1)] = -.398

There is a very modest negative correlation between overdue payments for bank credit cards and home equity loans.
13.36         Iron    Steel
       Year   Rank    Rank     d    d²
         1     12      12      0     0
         2     11      10      1     1
         3      3       5     -2     4
         4      2       7     -5    25
         5      4       6     -2     4
         6     10      11     -1     1
         7      9       9      0     0
         8      8       8      0     0
         9      7       4      3     9
        10      1       3     -2     4
        11      6       2      4    16
        12      5       1      4    16

Σd² = 80,  n = 12

rs = 1 − 6Σd²/[n(n² − 1)] = 1 − 6(80)/[12(144 − 1)] = 0.72
13.37  No. Co.    No. Eq. Is.
       on NYSE    on AMEX     Rank NYSE   Rank AMEX     d    d²
        1774        1063         11           1         10   100
        1885        1055         10           2          8    64
        2088         943          9           5          4    16
        2361        1005          8           3          5    25
        2570         981          7           4          3     9
        2675         936          6           6          0     0
        2907         896          4           7         -3     9
        3047         893          2           8         -6    36
        3114         862          1           9         -8    64
        3025         769          3          10         -7    49
        2862         765          5          11         -6    36

Σd² = 408,  n = 11

rs = 1 − 6Σd²/[n(n² − 1)] = 1 − 6(408)/[11(11² − 1)] = -0.855

There is a strong negative correlation between the number of companies listed on the NYSE and the number of equity issues on the American Stock Exchange.
13.38 α = .05

H0: The observations in the sample are randomly generated
Ha: The observations in the sample are not randomly generated

n1 = 13, n2 = 21, R = 10

Since this is a two-tailed test, use α/2 = .025. The critical value is z.025 = ±1.96.

μR = 2n1n2/(n1 + n2) + 1 = 2(13)(21)/(13 + 21) + 1 = 17.06

σR = √{[2n1n2(2n1n2 − n1 − n2)] / [(n1 + n2)²(n1 + n2 − 1)]}
   = √{[2(13)(21)(2(13)(21) − 13 − 21)] / [(13 + 21)²(13 + 21 − 1)]} = 2.707

z = (R − μR)/σR = (10 − 17.06)/2.707 = -2.61

Since the observed z = -2.61 < z.025 = -1.96, the decision is to reject the null hypothesis. The observations in the sample are not randomly generated.
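The large-sample runs test statistic above is easy to verify in code. This is an illustrative sketch (not part of the original solution), using the n1, n2, and R values of problem 13.38:

```python
import math

# Large-sample runs test for randomness (problem 13.38)
n1, n2, R = 13, 21, 10

# Mean and standard deviation of the number of runs under H0
mu_R = 2 * n1 * n2 / (n1 + n2) + 1
sigma_R = math.sqrt(
    (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2))
    / ((n1 + n2) ** 2 * (n1 + n2 - 1))
)

z = (R - mu_R) / sigma_R
print(round(mu_R, 2), round(sigma_R, 3), round(z, 2))   # 17.06 2.707 -2.61
```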
13.39  Sample 1   Sample 2
         573        547
         532        566
         544        551
         565        538
         540        557
         548        560
         536        557
         523        547

α = .01. Since n1 = 8, n2 = 8 < 10, use the small-sample Mann-Whitney U test.

  x     Rank   Group
 523      1      1
 532      2      1
 536      3      1
 538      4      2
 540      5      1
 544      6      1
 547      7.5    2
 547      7.5    2
 548      9      1
 551     10      2
 557     11.5    2
 557     11.5    2
 560     13      2
 565     14      1
 566     15      2
 573     16      1

W1 = 1 + 2 + 3 + 5 + 6 + 9 + 14 + 16 = 56

U1 = n1n2 + n1(n1 + 1)/2 − W1 = (8)(8) + (8)(9)/2 − 56 = 44

U2 = n1n2 − U1 = 8(8) − 44 = 20

Take the smaller value: U = U2 = 20.

From Table A.13, the p-value (1-tailed) is .1172; for a 2-tailed test, the p-value is .2344. Since the p-value > α = .01, the decision is to fail to reject the null hypothesis.
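The U statistic in 13.39 can be reproduced in a few lines (an illustrative sketch, not part of the original solution), computing W1 from average ranks over the pooled samples:

```python
def avg_rank(v, pooled):
    """1-based rank of v in the pooled data, averaging over ties."""
    s = sorted(pooled)
    first = s.index(v) + 1
    last = first + s.count(v) - 1
    return (first + last) / 2

# Small-sample Mann-Whitney U (problem 13.39)
sample1 = [573, 532, 544, 565, 540, 548, 536, 523]
sample2 = [547, 566, 551, 538, 557, 560, 557, 547]
n1, n2 = len(sample1), len(sample2)

pooled = sample1 + sample2
W1 = sum(avg_rank(v, pooled) for v in sample1)   # rank sum of sample 1

U1 = n1 * n2 + n1 * (n1 + 1) / 2 - W1
U2 = n1 * n2 - U1
U = min(U1, U2)        # the statistic looked up in Table A.13
print(W1, U1, U2, U)   # 56.0 44.0 20.0 20.0
```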
13.40 α = .05, n = 9

H0: Md = 0
Ha: Md ≠ 0

Group 1   Group 2      d      Rank
  5.6       6.4      -0.8     -8.5
  1.3       1.5      -0.2     -4.0
  4.7       4.6       0.1      2.0
  3.8       4.3      -0.5     -6.5
  2.4       2.1       0.3      5.0
  5.5       6.0      -0.5     -6.5
  5.1       5.2      -0.1     -2.0
  4.6       4.5       0.1      2.0
  3.7       4.5      -0.8     -8.5

Since n = 9, from Table A.14 (2-tailed test), T.025 = 6

T+ = 2 + 5 + 2 = 9
T− = 8.5 + 4 + 6.5 + 6.5 + 2 + 8.5 = 36
T = min(T+, T−) = 9

Since the observed value of T = 9 > T.025 = 6, the decision is to fail to reject the null hypothesis. There is not enough evidence to declare that there is a difference between the two groups.
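A small script can reproduce the signed-rank sums above (an illustrative sketch, not part of the original solution):

```python
def avg_rank(v, data):
    """1-based rank of v within data, averaging over ties."""
    s = sorted(data)
    first = s.index(v) + 1
    last = first + s.count(v) - 1
    return (first + last) / 2

# Wilcoxon matched-pairs signed rank test (problem 13.40)
group1 = [5.6, 1.3, 4.7, 3.8, 2.4, 5.5, 5.1, 4.6, 3.7]
group2 = [6.4, 1.5, 4.6, 4.3, 2.1, 6.0, 5.2, 4.5, 4.5]

# Round the differences so floating-point noise does not break the ties
d = [round(a - b, 1) for a, b in zip(group1, group2)]
abs_d = [abs(v) for v in d]
ranks = [avg_rank(v, abs_d) for v in abs_d]

T_plus = sum(r for r, v in zip(ranks, d) if v > 0)
T_minus = sum(r for r, v in zip(ranks, d) if v < 0)
T = min(T_plus, T_minus)    # compared against Table A.14
print(T_plus, T_minus, T)   # 9.0 36.0 9.0
```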
13.41 nj = 7, n = 28, c = 4, df = 3

Group 1   Group 2   Group 3   Group 4
   6         4         3         1
  11        13         7         4
   8         6         7         5
  10         8         5         6
  13        12        10         9
   7         9         8         6
  10         8         5         7

By ranks:

Group 1   Group 2   Group 3   Group 4
   9.5       3.5       2         1
  25        27.5      13.5       3.5
  17.5       9.5      13.5       6
  23        17.5       6         9.5
  27.5      26        23        20.5
  13.5      20.5      17.5       9.5
  23        17.5       6        13.5
Tj 139      122       81.5      63.5

ΣTj²/nj = 2,760.14 + 2,126.29 + 948.89 + 576.04 = 6,411.36

K = 12/[n(n + 1)] · ΣTj²/nj − 3(n + 1) = 12/[28(29)] · (6,411.36) − 3(29) = 7.75

The critical value of chi-square is χ².01,3 = 11.3449.

Since K = 7.75 < χ².01,3 = 11.3449, the decision is to fail to reject the null hypothesis.
13.42 α = .05, b = 7, c = 4, df = 3

χ².05,3 = 7.8147

H0: The treatment populations are equal
Ha: At least one treatment population yields larger values than at least one other treatment population

Blocks   Group 1   Group 2   Group 3   Group 4
  1        16        14        15        17
  2         8         6         5         9
  3        19        17        13         9
  4        24        26        25        21
  5        13        10         9        11
  6        19        11        18        13
  7        21        16        14        15

By ranks:

Blocks   Group 1   Group 2   Group 3   Group 4
  1         3         1         2         4
  2         3         2         1         4
  3         4         3         2         1
  4         2         4         3         1
  5         4         2         1         3
  6         4         1         3         2
  7         4         3         1         2
Rj         24        16        13        17
Rj²       576       256       169       289

ΣRj² = 576 + 256 + 169 + 289 = 1,290

χr² = 12/[bc(c + 1)] · ΣRj² − 3b(c + 1) = 12/[(7)(4)(5)] · (1,290) − 3(7)(5) = 5.57

Since χr² = 5.57 < χ².05,3 = 7.8147, the decision is to fail to reject the null hypothesis. The treatment populations are equal.
13.43    1      2    Rank 1   Rank 2     d     d²
        101    87      1        7       -6     36
        129    89      2        8       -6     36
        133    84      3        6       -3      9
        147    79      4        5       -1      1
        156    70      5        3        2      4
        179    64      6        1        5     25
        183    67      7        2        5     25
        190    71      8        4        4     16

Σd² = 152,  n = 8

rs = 1 − 6Σd²/[n(n² − 1)] = 1 − 6(152)/[8(63)] = -.81
13.44 H0: The 3 populations are identical
Ha: At least one of the 3 populations is different

1 Gal.   5 Gal.   10 Gal.
 1.1      2.9      3.1
 1.4      2.5      2.4
 1.7      2.6      3.0
 1.3      2.2      2.3
 1.9      2.1      2.9
 1.4      2.0      1.9
 2.1      2.7

By ranks:

1 Gal.   5 Gal.   10 Gal.
  1       17.5      20
  3.5     14        13
  5       15        19
  2       11        12
  6.5      9.5      17.5
  3.5      8         6.5
  9.5     16
Tj 31      91        88
nj  7       7         6

ΣTj²/nj = (31)²/7 + (91)²/7 + (88)²/6 = 2,610.95

n = 20

K = 12/[n(n + 1)] · ΣTj²/nj − 3(n + 1) = 12/[20(21)] · (2,610.95) − 3(21) = 11.60

α = .01, df = c − 1 = 3 − 1 = 2, χ².01,2 = 9.2104

Since the observed K = 11.60 > χ².01,2 = 9.2104, the decision is to reject the null hypothesis.
13.45 N = 40, n1 = 24, n2 = 16, α = .05

Use the large-sample runs test since n1 = 24 is greater than 20.

H0: The observations are randomly generated
Ha: The observations are not randomly generated

With a two-tailed test, α/2 = .025, z.025 = ±1.96. If the observed z > 1.96 or < -1.96, the decision will be to reject the null hypothesis.

R = 19

μR = 2n1n2/(n1 + n2) + 1 = 2(24)(16)/(24 + 16) + 1 = 20.2

σR = √{[2n1n2(2n1n2 − n1 − n2)] / [(n1 + n2)²(n1 + n2 − 1)]}
   = √{[2(24)(16)(2(24)(16) − 24 − 16)] / [(40)²(39)]} = 2.993

z = (R − μR)/σR = (19 − 20.2)/2.993 = -0.40

Since z = -0.40 > z.025 = -1.96, the decision is to fail to reject the null hypothesis.
13.46 Use the Friedman test. Let α = .05.

H0: The treatment populations are equal
Ha: The treatment populations are not equal

c = 3 and b = 7

Operator   Machine 1   Machine 2   Machine 3
   1          231         229         234
   2          233         232         231
   3          229         233         230
   4          232         235         231
   5          235         228         232
   6          234         237         231
   7          236         233         230

By ranks:

Operator   Machine 1   Machine 2   Machine 3
   1           2           1           3
   2           3           2           1
   3           1           3           2
   4           2           3           1
   5           3           1           2
   6           2           3           1
   7           3           2           1
Rj            16          15          11
Rj²          256         225         121

df = c − 1 = 2, χ².05,2 = 5.99147

If the observed χr² > 5.99147, the decision will be to reject the null hypothesis.

ΣRj² = 256 + 225 + 121 = 602

χr² = 12/[bc(c + 1)] · ΣRj² − 3b(c + 1) = 12/[(7)(3)(4)] · (602) − 3(7)(4) = 2

Since χr² = 2 < χ².05,2 = 5.99147, the decision is to fail to reject the null hypothesis.
13.47 H0: EMS workers are not older
Ha: EMS workers are older

Age   Rank   Group
 21     1      1
 23     2      1
 24     3      1
 25     4      1
 27     6      1
 27     6      2
 27     6      2
 28     9      1
 28     9      2
 28     9      2
 29    11      2
 30    13      2
 30    13      2
 30    13      2
 32    15      1
 33    16.5    2
 33    16.5    2
 36    18.5    1
 36    18.5    2
 37    20      1
 39    21      2
 41    22      1

n1 = 10, n2 = 12

W1 = 1 + 2 + 3 + 4 + 6 + 9 + 15 + 18.5 + 20 + 22 = 100.5

μ = n1n2/2 = (10)(12)/2 = 60

σ = √[n1n2(n1 + n2 + 1)/12] = √[(10)(12)(23)/12] = 15.17

U = n1n2 + n1(n1 + 1)/2 − W1 = (10)(12) + (10)(11)/2 − 100.5 = 74.5

z = (U − μ)/σ = (74.5 − 60)/15.17 = 0.96

With α = .05, z.05 = 1.645.

Since the observed z = 0.96 < z.05 = 1.645, the decision is to fail to reject the null hypothesis.
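The large-sample Mann-Whitney z statistic above can be checked numerically (an illustrative sketch, not part of the original solution), using the W1, n1, and n2 values from 13.47:

```python
import math

# Large-sample Mann-Whitney U test (problem 13.47)
n1, n2 = 10, 12
W1 = 100.5                               # rank sum of group 1

U = n1 * n2 + n1 * (n1 + 1) / 2 - W1     # 74.5
mu = n1 * n2 / 2                         # mean of U under H0
sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

z = (U - mu) / sigma
print(U, mu, round(sigma, 2), round(z, 2))   # 74.5 60.0 15.17 0.96
```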
13.48 H0: The population differences = 0
Ha: The population differences ≠ 0

With   Without      d     Rank
1180    1209       -29     -6
 874     902       -28     -5
1071     862       209     18
 668     503       165     15
 889     974       -85    -12.5
 724     675        49      9
 880     821        59     10
 482     567       -85    -12.5
 796     602       194     16
1207    1097       110     14
 968     962         6      1
1027    1045       -18     -4
1158     896       262     20
 670     708       -38     -8
 849     642       207     17
 559     327       232     19
 449     483       -34     -7
 992     978        14      3
1046     973        73     11
 852     841        11      2

n = 20

T− = 6 + 5 + 12.5 + 12.5 + 4 + 8 + 7 = 55
T = 55

μ = n(n + 1)/4 = (20)(21)/4 = 105

σ = √[n(n + 1)(2n + 1)/24] = √[(20)(21)(41)/24] = 26.79

z = (T − μ)/σ = (55 − 105)/26.79 = -1.87

α = .01, α/2 = .005, z.005 = ±2.575

Since the observed z = -1.87 > z.005 = -2.575, the decision is to fail to reject the null hypothesis.
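The large-sample Wilcoxon approximation used above can be sketched as follows (illustrative only, not part of the original solution), taking n and T from 13.48:

```python
import math

# Large-sample Wilcoxon matched-pairs test (problem 13.48)
n, T = 20, 55    # number of non-zero differences and smaller rank sum

mu = n * (n + 1) / 4                                   # 105
sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)      # 26.79

z = (T - mu) / sigma
print(mu, round(sigma, 2), round(z, 2))                # 105.0 26.79 -1.87
```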
13.49 H0: There is no difference between March and June
Ha: There is a difference between March and June

GMAT   Rank   Month
 350     1      J
 430     2      M
 460     3      J
 470     4      J
 490     5      M
 500     6      M
 510     7      M
 520     8      J
 530     9.5    M
 530     9.5    J
 540    11      M
 550    12.5    M
 550    12.5    J
 560    14      M
 570    15.5    M
 570    15.5    J
 590    17      J
 600    18      M
 610    19      J
 630    20      J

n1 = 10, n2 = 10

W1 = 1 + 3 + 4 + 8 + 9.5 + 12.5 + 15.5 + 17 + 19 + 20 = 109.5

U1 = n1n2 + n1(n1 + 1)/2 − W1 = (10)(10) + (10)(11)/2 − 109.5 = 45.5

U2 = n1n2 − U1 = (10)(10) − 45.5 = 54.5

From Table A.13, the p-value for U = 45 is .3980 and for U = 44 is .3697. For a two-tailed test, double the p-value to at least .739. Using α = .10, the decision is to fail to reject the null hypothesis.
13.50 Use the Friedman test. b = 6, c = 4, df = 3, α = .05

H0: The treatment populations are equal
Ha: At least one treatment population yields larger values than at least one other treatment population

The critical value is χ².05,3 = 7.8147.

                  Location
Brand     1      2      3      4
  A     176     58    111    120
  B     156     62     98    117
  C     203     89    117    105
  D     183     73    118    113
  E     147     46    101    114
  F     190     83    113    115

By ranks:

                  Location
Brand     1      2      3      4
  A       4      1      2      3
  B       4      1      2      3
  C       4      1      3      2
  D       4      1      3      2
  E       4      1      2      3
  F       4      1      2      3
Rj       24      6     14     16
Rj²     576     36    196    256

ΣRj² = 1,064

χr² = 12/[bc(c + 1)] · ΣRj² − 3b(c + 1) = 12/[(6)(4)(5)] · (1,064) − 3(6)(5) = 16.4

Since χr² = 16.4 > χ².05,3 = 7.8147, the decision is to reject the null hypothesis. At least one treatment population yields larger values than at least one other treatment population. An examination of the data shows that location one produced the highest sales for all brands and location two produced the lowest sales of gum for all brands.
13.51 H0: The population differences = 0
Ha: The population differences ≠ 0

Box    No Box      d     Rank
185     170       15     11
109     112       -3     -3
 92      90        2      2
105      87       18     13.5
 60      51        9      7
 45      49       -4     -4.5
 25      11       14     10
 58      40       18     13.5
161     165       -4     -4.5
108      82       26     15.5
 89      94       -5     -6
123     139      -16    -12
 34      21       13      8.5
 68      55       13      8.5
 59      60       -1     -1
 78      52       26     15.5

n = 16

T− = 3 + 4.5 + 4.5 + 6 + 12 + 1 = 31
T = 31

μ = n(n + 1)/4 = (16)(17)/4 = 68

σ = √[n(n + 1)(2n + 1)/24] = √[(16)(17)(33)/24] = 19.34

z = (T − μ)/σ = (31 − 68)/19.34 = -1.91

α = .05, α/2 = .025, z.025 = ±1.96

Since the observed z = -1.91 > z.025 = -1.96, the decision is to fail to reject the null hypothesis.
13.52                  Ranked   Ranked
       Cups   Stress    Cups    Stress     d     d²
        25      80        6        8      -2     4
        41      85        9        9       0     0
        16      35        4        3       1     1
         0      45        1        5      -4    16
        11      30        3        2       1     1
        28      50        7        6       1     1
        34      65        8        7       1     1
        18      40        5        4       1     1
         5      20        2        1       1     1

Σd² = 26,  n = 9

rs = 1 − 6Σd²/[n(n² − 1)] = 1 − 6(26)/[9(80)] = .783
13.53 n1 = 15, n2 = 15. Use the small-sample runs test.

α = .05, α/2 = .025

H0: The observations in the sample were randomly generated
Ha: The observations in the sample were not randomly generated

From Table A.11, the lower-tail critical value = 10
From Table A.12, the upper-tail critical value = 22

R = 21

Since R = 21 is between the two critical values, the decision is to fail to reject the null hypothesis. The observations were randomly generated.
13.54 H0: The population differences ≥ 0
Ha: The population differences < 0

Before   After      d      Rank
 430      465      -35    -11
 485      475       10      5.5
 520      535      -15     -8.5
 360      410      -50    -12
 440      425       15      8.5
 500      505       -5     -2
 425      450      -25    -10
 470      480      -10     -5.5
 515      520       -5     -2
 430      430        0    OMIT
 450      460      -10     -5.5
 495      500       -5     -2
 540      530       10      5.5

n = 12

T+ = 5.5 + 8.5 + 5.5 = 19.5, T = 19.5

From Table A.14, using n = 12, the critical T for α = .01, one-tailed, is 10.

Since T = 19.5 is not less than or equal to the critical T = 10, the decision is to fail to reject the null hypothesis.
13.55 H0: Those with ties have no higher scores
Ha: Those with ties have higher scores

Rating   Rank   Group
  16       1      2
  17       2      2
  19       3.5    2
  19       3.5    2
  20       5      2
  21       6.5    2
  21       6.5    1
  22       9      1
  22       9      1
  22       9      2
  23      11.5    1
  23      11.5    2
  24      13      2
  25      15.5    1
  25      15.5    1
  25      15.5    1
  25      15.5    2
  26      19      1
  26      19      1
  26      19      2
  27      21      1
  28      22      1

n1 = 11, n2 = 11

W1 = 6.5 + 9 + 9 + 11.5 + 15.5 + 15.5 + 15.5 + 19 + 19 + 21 + 22 = 163.5

μ = n1n2/2 = (11)(11)/2 = 60.5

σ = √[n1n2(n1 + n2 + 1)/12] = √[(11)(11)(23)/12] = 15.23

U = n1n2 + n1(n1 + 1)/2 − W1 = (11)(11) + (11)(12)/2 − 163.5 = 23.5

z = (U − μ)/σ = (23.5 − 60.5)/15.23 = -2.43

For α = .05, z.05 = 1.645.

Since |z| = 2.43 > z.05 = 1.645, the decision is to reject the null hypothesis.
13.56 H0: Automatic dispenser is no more productive
Ha: Automatic dispenser is more productive

Sales   Rank   Type of Dispenser
  92      1       M
 105      2       M
 106      3       M
 110      4       A
 114      5       M
 117      6       M
 118      7.5     A
 118      7.5     M
 125      9       M
 126     10       M
 128     11       A
 129     12       M
 137     13       A
 143     14       A
 144     15       A
 152     16       A
 153     17       A
 168     18       A

n1 = 9, n2 = 9

W1 = 4 + 7.5 + 11 + 13 + 14 + 15 + 16 + 17 + 18 = 115.5

U1 = n1n2 + n1(n1 + 1)/2 − W1 = (9)(9) + (9)(10)/2 − 115.5 = 10.5

U2 = n1n2 − U1 = 81 − 10.5 = 70.5

The smaller of the two is U1 = 10.5.

α = .01

From Table A.13, the p-value = .0039. The decision is to reject the null hypothesis since the p-value is less than .01.
13.57 H0: The 4 populations are identical
Ha: At least one of the 4 populations is different

 45    55    70    85
216   228   219   218
215   224   220   216
218   225   221   217
216   222   223   221
219   226   224   218
214   225         217

By ranks:

 45    55    70    85
  4    23    11.5   9
  2    18.5  13     4
  9    20.5  14.5   6.5
  4    16    17    14.5
 11.5  22    18.5   9
  1    20.5         6.5
Tj 31.5 120.5 74.5 49.5
nj  6     6    5    6

ΣTj²/nj = (31.5)²/6 + (120.5)²/6 + (74.5)²/5 + (49.5)²/6 = 4,103.84

n = 23

K = 12/[n(n + 1)] · ΣTj²/nj − 3(n + 1) = 12/[23(24)] · (4,103.84) − 3(24) = 17.21

α = .01, df = c − 1 = 4 − 1 = 3, χ².01,3 = 11.3449

Since the observed K = 17.21 > χ².01,3 = 11.3449, the decision is to reject the null hypothesis.
13.58                      Ranks   Ranks
        Sales     Miles    Sales   Miles     d    d²
       150,000    1,500      1       1       0     0
       210,000    2,100      2       2       0     0
       285,000    3,200      3       7      -4    16
       301,000    2,400      4       4       0     0
       335,000    2,200      5       3       2     4
       390,000    2,500      6       5       1     1
       400,000    3,300      7       8      -1     1
       425,000    3,100      8       6       2     4
       440,000    3,600      9       9       0     0

Σd² = 26,  n = 9

rs = 1 − 6Σd²/[n(n² − 1)] = 1 − 6(26)/[9(80)] = .783
13.59 H0: The 3 populations are identical
Ha: At least one of the 3 populations is different

3-day   Quality   Mgmt. Inv.
  9       27         16
 11       38         21
 17       25         18
 10       40         28
 22       31         29
 15       19         20
  6       35         31

By ranks:

3-day   Quality   Mgmt. Inv.
  2       14          6
  4       20         11
  7       13          8
  3       21         15
 12       17.5       16
  5        9         10
  1       19         17.5
Tj 34    113.5       83.5
nj  7      7          7

ΣTj²/nj = (34)²/7 + (113.5)²/7 + (83.5)²/7 = 3,001.5

n = 21

K = 12/[n(n + 1)] · ΣTj²/nj − 3(n + 1) = 12/[21(22)] · (3,001.5) − 3(22) = 11.96

α = .10, df = c − 1 = 3 − 1 = 2, χ².10,2 = 4.6052

Since the observed K = 11.96 > χ².10,2 = 4.6052, the decision is to reject the null hypothesis.
13.60 H0: The population differences ≥ 0
Ha: The population differences < 0

Husbands   Wives      d     Rank
   27        35      -8    -12
   22        29      -7    -11
   28        30      -2     -6.5
   19        20      -1     -2.5
   28        27       1      2.5
   29        31      -2     -6.5
   18        22      -4     -9.5
   21        19       2      6.5
   25        29      -4     -9.5
   18        28     -10    -13.5
   20        21      -1     -2.5
   24        22       2      6.5
   23        33     -10    -13.5
   25        38     -13    -16.5
   22        34     -12    -15
   16        31     -15    -18
   23        36     -13    -16.5
   30        31      -1     -2.5

n = 18

T+ = 2.5 + 6.5 + 6.5 = 15.5
T = 15.5

μ = n(n + 1)/4 = (18)(19)/4 = 85.5

σ = √[n(n + 1)(2n + 1)/24] = √[(18)(19)(37)/24] = 22.96

z = (T − μ)/σ = (15.5 − 85.5)/22.96 = -3.05

α = .01, z.01 = -2.33

Since the observed z = -3.05 < z.01 = -2.33, the decision is to reject the null hypothesis.
13.61 This problem uses a random block design, which is analyzed by the Friedman nonparametric test. There are 4 treatments and 10 blocks. The value of the observed χr² (shown as S) is 12.16 (adjusted for ties) and has an associated p-value of .007 that is significant at α = .01. At least one treatment population yields larger values than at least one other treatment population. Examining the treatment medians, treatment one has an estimated median of 20.125 and treatment two has a treatment median of 25.875. These two are the farthest apart.
13.62 This is a runs test for randomness. n1 = 21, n2 = 29. Because of the size of the ns, this is a large-sample runs test. There are 28 runs, R = 28.

μR = 25.36, σR = 3.34

z = (R − μR)/σR = (28 − 25.36)/3.34 = 0.79

The p-value for this statistic is .4387 for a two-tailed test. The decision is to fail to reject the null hypothesis at α = .05.
13.63 A large-sample Mann-Whitney U test is being computed. There are 16 observations in each group. The null hypothesis is that the two populations are identical. The alternate hypothesis is that the two populations are not identical. The value of W is 191.5. The p-value for the test is .0066. The test is significant at α = .01. The decision is to reject the null hypothesis. The two populations are not identical. An examination of medians shows that the median for group two (46.5) is larger than the median for group one (37.0).

13.64 A Kruskal-Wallis test has been used to analyze the data. The null hypothesis is that the four populations are identical; the alternate hypothesis is that at least one of the four populations is different. The H statistic (same as the K statistic) is 11.28 when adjusted for ties. The p-value for this H value is .010, which indicates that there is a significant difference in the four groups at α = .05 and marginally so at α = .01. An examination of the medians reveals that all group medians are the same (35) except for group 2, which has a median of 25.50. It is likely that it is group 2 that differs from the other groups.
Chapter 14
Simple Regression Analysis
LEARNING OBJECTIVES
The overall objective of this chapter is to give you an understanding of bivariate regression
analysis, thereby enabling you to:
1. Compute the equation of a simple regression line from a sample of
data and interpret the slope and intercept of the equation.
2. Understand the usefulness of residual analysis in examining the fit of the regression line
to the data and in testing the assumptions underlying regression analysis.
3. Compute a standard error of the estimate and interpret its meaning.
4. Compute a coefficient of determination and interpret it.
5. Test hypotheses about the slope of the regression model and interpret the results.
6. Estimate values of y using the regression model.
7. Develop a linear trend line and use it to forecast.
CHAPTER TEACHING STRATEGY
This chapter is about all aspects of simple (bivariate, linear) regression. Early in the
chapter through scatter plots, the student begins to understand that the object of simple
regression is to fit a line through the points. Fairly soon in the process, the student learns how
to solve for slope and y intercept and develop the equation of the regression line. Most of the
remaining material on simple regression is to determine how good the fit of the line is and if
assumptions underlying the process are met.
The student begins to understand that by entering values of the independent variable
into the regression model, predicted values can be determined. The question then
becomes: Are the predicted values good estimates of the actual dependent
values? One rule to emphasize is that the regression model should not be used to
predict for independent variable values that are outside the range of values used to
construct the model. MINITAB issues a warning for such activity when
attempted. There are many instances where the relationship between x and y is linear over a given interval, but outside the interval the relationship becomes curvilinear or unpredictable. Of course, with this caution having been given,
many forecasters use such regression models to extrapolate to values of x outside
the domain of those used to construct the model. Such forecasts are introduced in
section 14.8, Using Regression to Develop a Forecasting Trend Line. Whether
the forecasts obtained under such conditions are any better than "seat of the pants"
or "crystal ball" estimates remains to be seen.
The concept of residual analysis is a good one to show graphically and numerically how
the model relates to the data and the fact that it more closely fits some points than others, etc.
A graphical or numerical analysis of residuals demonstrates that the regression line fits the data
in a manner analogous to the way a mean fits a set of numbers. The regression model passes
through the points such that the vertical distances from the actual y values to the predicted
values will sum to zero. The fact that the residuals sum to zero points out the need to square
the errors (residuals) in order to get a handle on total error. This leads to the sum of squares
error and then on to the standard error of the estimate. In addition, students can learn why the
process is called least squares analysis (the slope and intercept formulas are derived by calculus
such that the sum of squares of error is minimized - hence "least squares"). Students can learn
that by examining the values of se, the residuals, r², and the t ratio to test the slope, they can
begin to make a judgment about the fit of the model to the data. Many of the chapter problems
ask the student to comment on these items (se, r², etc.).
It is my view that for many of these students, the most important facet of this chapter
lies in understanding the "buzz" words of regression such as standard error of the estimate,
coefficient of determination, etc. because they may only interface regression again as some type
of computer printout to be deciphered. The concepts then may be more important than the
calculations.
CHAPTER OUTLINE
14.1 Introduction to Simple Regression Analysis
14.2 Determining the Equation of the Regression Line
14.3 Residual Analysis
Using Residuals to Test the Assumptions of the Regression Model
Using the Computer for Residual Analysis
14.4 Standard Error of the Estimate
14.5 Coefficient of Determination
Relationship Between r and r²
14.6 Hypothesis Tests for the Slope of the Regression Model and Testing the Overall
Model
Testing the Slope
Testing the Overall Model
14.7 Estimation
Confidence Intervals to Estimate the Conditional Mean of y: μy|x
Prediction Intervals to Estimate a Single Value of y
14.8 Using Regression to Develop a Forecasting Trend Line
Determining the Equation of the Trend Line
Forecasting Using the Equation of the Trend Line
Alternate Coding for Time Periods
14.9 Interpreting Computer Output
KEY TERMS

Coefficient of Determination (r²)
Confidence Interval
Dependent Variable
Deterministic Model
Heteroscedasticity
Homoscedasticity
Independent Variable
Least Squares Analysis
Outliers
Prediction Interval
Probabilistic Model
Regression Analysis
Residual
Residual Plot
Scatter Plot
Simple Regression
Standard Error of the Estimate (se)
Sum of Squares of Error (SSE)
SOLUTIONS TO CHAPTER 14
14.1    x     y
       12    17
       21    15
       28    22
        8    19
       20    24

Σx = 89, Σy = 97, Σxy = 1,767, Σx² = 1,833, Σy² = 1,935, n = 5

[Scatter plot of y versus x]

b1 = SSxy/SSx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n]
   = [1,767 − (89)(97)/5] / [1,833 − (89)²/5] = 0.162

b0 = Σy/n − b1(Σx/n) = 97/5 − 0.162(89/5) = 16.51

ŷ = 16.51 + 0.162x
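The slope and intercept computation in 14.1 can be reproduced with a short script (an illustrative sketch, not part of the original solution):

```python
# Least-squares slope and intercept for problem 14.1
x = [12, 21, 28, 8, 20]
y = [17, 15, 22, 19, 24]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)

# b1 = SSxy / SSx, b0 = ybar - b1 * xbar
SSxy = sum_xy - sum_x * sum_y / n
SSx = sum_x2 - sum_x**2 / n
b1 = SSxy / SSx
b0 = sum_y / n - b1 * sum_x / n

print(round(b1, 3), round(b0, 2))   # 0.162 16.51
```

The same pattern works for problems 14.2 through 14.7 by swapping in the corresponding x and y lists.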
14.2     x      y
        140    25
        119    29
        103    46
         91    70
         65    88
         29   112
         24   128

Σx = 571, Σy = 498, Σxy = 30,099, Σx² = 58,293, Σy² = 45,154, n = 7

b1 = SSxy/SSx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n]
   = [30,099 − (571)(498)/7] / [58,293 − (571)²/7] = -0.898

b0 = Σy/n − b1(Σx/n) = 498/7 − (-0.898)(571/7) = 144.414

ŷ = 144.414 − 0.898x
14.3  (Advertising) x   (Sales) y
           12.5            148
            3.7             55
           21.6            338
           60.0            994
           37.6            541
            6.1             89
           16.8            126
           41.2            379

Σx = 199.5, Σy = 2,670, Σxy = 107,610.4, Σx² = 7,667.15, Σy² = 1,587,328, n = 8

b1 = SSxy/SSx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n]
   = [107,610.4 − (199.5)(2,670)/8] / [7,667.15 − (199.5)²/8] = 15.240

b0 = Σy/n − b1(Σx/n) = 2,670/8 − 15.240(199.5/8) = -46.292

ŷ = -46.292 + 15.240x
14.4  (Prime) x   (Bond) y
         16          5
          6         12
          8          9
          4         15
          7          7

Σx = 41, Σy = 48, Σxy = 333, Σx² = 421, Σy² = 524, n = 5

b1 = SSxy/SSx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n]
   = [333 − (41)(48)/5] / [421 − (41)²/5] = -0.715

b0 = Σy/n − b1(Σx/n) = 48/5 − (-0.715)(41/5) = 15.460

ŷ = 15.460 − 0.715x
14.5  Bankruptcies (y)   Firm Births (x)
           34.3               58.1
           35.0               55.4
           38.5               57.0
           40.1               58.5
           35.5               57.4
           37.9               58.0

Σx = 344.4, Σy = 221.3, Σx² = 19,774.78, Σy² = 8,188.41, Σxy = 12,708.08, n = 6

b1 = SSxy/SSx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n]
   = [12,708.08 − (344.4)(221.3)/6] / [19,774.78 − (344.4)²/6] = 0.878

b0 = Σy/n − b1(Σx/n) = 221.3/6 − 0.878(344.4/6) = -13.503

ŷ = -13.503 + 0.878x
14.6  No. of Farms (x)   Avg. Size (y)
          5.65               213
          4.65               258
          3.96               297
          3.36               340
          2.95               374
          2.52               420
          2.44               426
          2.29               441
          2.15               460
          2.07               469
          2.17               434
          2.10               444

Σx = 36.31, Σy = 4,576, Σx² = 124.7931, Σy² = 1,825,028, Σxy = 12,766.71, n = 12

b1 = SSxy/SSx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n]
   = [12,766.71 − (36.31)(4,576)/12] / [124.7931 − (36.31)²/12] = -72.328

b0 = Σy/n − b1(Σx/n) = 4,576/12 − (-72.328)(36.31/12) = 600.186

ŷ = 600.186 − 72.328x
14.7  Steel   New Orders
      99.9      2.74
      97.9      2.87
      98.9      2.93
      87.9      2.87
      92.9      2.98
      97.9      3.09
     100.6      3.36
     104.9      3.61
     105.3      3.75
     108.6      3.95

Σx = 994.8, Σy = 32.15, Σx² = 99,293.28, Σy² = 104.9815, Σxy = 3,216.652, n = 10

b1 = SSxy/SSx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n]
   = [3,216.652 − (994.8)(32.15)/10] / [99,293.28 − (994.8)²/10] = 0.05557

b0 = Σy/n − b1(Σx/n) = 32.15/10 − 0.05557(994.8/10) = -2.31307

ŷ = -2.31307 + 0.05557x
14.8    x     y
       15    47
        8    36
       19    56
       12    44
        5    21

ŷ = 13.625 + 2.303x

Residuals:

  x     y       ŷ        Residual (y − ŷ)
 15    47    48.1694        -1.1694
  8    36    32.0489         3.9511
 19    56    57.3811        -1.3811
 12    44    41.2606         2.7394
  5    21    25.1401        -4.1401
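The residual table in 14.8 can be generated programmatically (an illustrative sketch, not part of the original solution); fitting the line from the data rather than the rounded equation reproduces the table's four-decimal values:

```python
# Fitted line and residuals for problem 14.8
x = [15, 8, 19, 12, 5]
y = [47, 36, 56, 44, 21]
n = len(x)

b1 = (sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n) / (
    sum(a * a for a in x) - sum(x) ** 2 / n
)
b0 = sum(y) / n - b1 * sum(x) / n        # yhat = 13.625 + 2.303x

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print([round(r, 4) for r in residuals])
# [-1.1694, 3.9511, -1.3811, 2.7394, -4.1401]
```

Note that the residuals sum to zero, as least squares guarantees.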
14.9    x     y    Predicted (ŷ)   Residual (y − ŷ)
       12    17      18.4582          -1.4582
       21    15      19.9196          -4.9196
       28    22      21.0563           0.9437
        8    19      17.8087           1.1913
       20    24      19.7572           4.2428

ŷ = 16.51 + 0.162x
14.10    x     y    Predicted (ŷ)   Residual (y − ŷ)
        140    25      18.6597          6.3403
        119    29      37.5229         -8.5229
        103    46      51.8948         -5.8948
         91    70      62.6737          7.3263
         65    88      86.0281          1.9720
         29   112     118.3648         -6.3648
         24   128     122.8561          5.1439

ŷ = 144.414 − 0.898x
14.11    x      y    Predicted (ŷ)   Residual (y − ŷ)
        12.5   148     144.2053          3.7947
         3.7    55      10.0954         44.9047
        21.6   338     282.8873         55.1127
        60.0   994     868.0945        125.9055
        37.6   541     526.7236         14.2764
         6.1    89      46.6708         42.3292
        16.8   126     209.7364        -83.7364
        41.2   379     581.5868       -202.5868

ŷ = -46.292 + 15.240x
14.12    x     y    Predicted (ŷ)   Residual (y − ŷ)
        16     5       4.0259           0.9741
         6    12      11.1722           0.8278
         8     9       9.7429          -0.7429
         4    15      12.6014           2.3986
         7     7      10.4576          -3.4576

ŷ = 15.460 − 0.715x
14.13    x      y     Predicted (ŷ)   Residual (y − ŷ)
        58.1   34.3      37.4978          -3.1978
        55.4   35.0      35.1277          -0.1277
        57.0   38.5      36.5322           1.9678
        58.5   40.1      37.8489           2.2511
        57.4   35.5      36.8833          -1.3833
        58.0   37.9      37.4100           0.4900

The residual for x = 58.1 is relatively large, but the residual for x = 55.4 is quite small.
Page | 721
14.14   x    y    Predicted (ŷ)    Residuals (y - ŷ)
5 47 42.2756 4.7244
7 38 38.9836 -0.9836
11 32 32.3997 -0.3996
12 24 30.7537 -6.7537
19 22 19.2317 2.7683
25 10 9.3558 0.6442
ŷ = 50.506 - 1.646 x
No apparent violation of assumptions
14.15   Miles (x)    Cost (y)    ŷ    (y - ŷ)
1,245 2.64 2.5376 .1024
425 2.31 2.3322 -.0222
1,346 2.45 2.5629 -.1128
973 2.52 2.4694 .0506
255 2.19 2.2896 -.0996
865 2.55 2.4424 .1076
Page | 722
1,080 2.40 2.4962 -.0962
296 2.37 2.2998 .0702
ŷ = 2.2257 + 0.00025 x
No apparent violation of assumptions
14.16
The error terms appear to be non-independent.
Page | 723
14.17
The residual plot suggests a nonlinear relationship, violating the linearity assumption.
14.18 The MINITAB Residuals vs. Fits graphic is strongly indicative of a violation of the
homoscedasticity assumption of regression. Because the residuals are very close
together for small values of x, there is little variability in the residuals at the left end of
the graph. On the other hand, for larger values of x, the graph flares out indicating a
much greater variability at the upper end. Thus, there is a lack of homogeneity of error
across the values of the independent variable.
Page | 724
14.19 SSE = Σy² - b0Σy - b1Σxy = 1,935 - (16.51)(97) - 0.1624(1767) = 46.5692

se = √(SSE/(n - 2)) = √(46.5692/3) = 3.94

Approximately 68% of the residuals should fall within ±1se.
3 out of 5, or 60%, of the actual residuals fell within ±1se.
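The SSE and standard-error computation above can be checked directly from the printed sums (a verification sketch, not part of the original solution):

```python
import math

# SSE = Σy² - b0·Σy - b1·Σxy, then se = sqrt(SSE / (n - 2))  (problem 14.19)
sse = 1935 - 16.51 * 97 - 0.1624 * 1767
se = math.sqrt(sse / (5 - 2))
print(round(sse, 4), round(se, 2))
```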
14.20 SSE = Σy² - b0Σy - b1Σxy = 45,154 - 144.414(498) - (-.89824)(30,099) = 272.0

se = √(SSE/(n - 2)) = √(272.0/5) = 7.376

6 out of 7 = 85.7% fall within ±1se
7 out of 7 = 100% fall within ±2se
14.21 SSE = Σy² - b0Σy - b1Σxy = 1,587,328 - (-46.29)(2,670) - 15.24(107,610.4) = 70,940
Page | 725
se = √(SSE/(n - 2)) = √(70,940/6) = 108.7
Six out of eight (75%) of the sales estimates are within $108.7 million.
14.22 SSE = Σy² - b0Σy - b1Σxy = 524 - 15.46(48) - (-0.71462)(333) = 19.8885

se = √(SSE/(n - 2)) = √(19.8885/3) = 2.575
Four out of five (80%) of the estimates are within 2.575 of the actual rate for
bonds. This amount of error is probably not acceptable to financial analysts.
Page | 726
14.23   x    y    Predicted (ŷ)    Residuals (y - ŷ)    (y - ŷ)²
58.1 34.3 37.4978 -3.1978 10.2259
55.4 35.0 35.1277 -0.1277 0.0163
57.0 38.5 36.5322 1.9678 3.8722
58.5 40.1 37.8489 2.2511 5.0675
57.4 35.5 36.8833 -1.3833 1.9135
58.0 37.9 37.4100 0.4900 0.2401
Σ(y - ŷ)² = 21.3355

SSE = Σ(y - ŷ)² = 21.3355

se = √(SSE/(n - 2)) = √(21.3355/4) = 2.3095
This standard error of the estimate indicates that the regression model is accurate to within ±2.3095(1,000) bankruptcies about 68% of the time. In this particular problem, 5/6 or 83.3% of the residuals are within this standard error of the estimate.
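The residual route to se used in this problem (square, sum, divide by n - 2, take the root) can be sketched as:

```python
import math

# se from the residuals of problem 14.23: SSE = Σ(y - ŷ)², se = sqrt(SSE/(n-2))
residuals = [-3.1978, -0.1277, 1.9678, 2.2511, -1.3833, 0.4900]
sse = sum(r ** 2 for r in residuals)
se = math.sqrt(sse / (len(residuals) - 2))
print(round(sse, 4), round(se, 4))
```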
14.24   (y - ŷ)    (y - ŷ)²
4.7244 22.3200
-0.9836 .9675
-0.3996 .1597
Page | 727
-6.7537 45.6125
2.7683 7.6635
0.6442 .4150
Σ(y - ŷ)² = 77.1382

SSE = Σ(y - ŷ)² = 77.1382

se = √(SSE/(n - 2)) = √(77.1382/4) = 4.391
Page | 728
14.25   (y - ŷ)    (y - ŷ)²
.1024 .0105
-.0222 .0005
-.1129 .0127
.0506 .0026
-.0996 .0099
.1076 .0116
-.0962 .0093
.0702 .0049
Σ(y - ŷ)² = .0620

SSE = Σ(y - ŷ)² = .0620

se = √(SSE/(n - 2)) = √(.0620/6) = .1017

The model produces estimates that are within ±.1017, or about 10 cents, 68% of the time. However, the range of milk costs is only 45 cents for these data.
14.26 Volume (x) Sales (y)
728.6 10.5
497.9 48.1
439.1 64.8
377.9 20.1
375.5 11.4
363.8 123.8
Page | 729
276.3 89.0
n = 7   Σx = 3059.1   Σy = 367.7   Σx² = 1,464,071.97   Σy² = 30,404.31   Σxy = 141,558.6
b1 = -.1504     b0 = 118.257

ŷ = 118.257 - .1504 x
SSE = Σy² - b0Σy - b1Σxy = 30,404.31 - (118.257)(367.7) - (-0.1504)(141,558.6) = 8211.6245

se = √(SSE/(n - 2)) = √(8211.6245/5) = 40.526
This is a relatively large standard error of the estimate given the sales values
(ranging from 10.5 to 123.8).
14.27 r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 46.6399/[1,935 - (97)²/5] = .123

This is a low value of r².
Page | 730
14.28 r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 272.121/[45,154 - (498)²/7] = .972

This is a high value of r².
14.29 r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 70,940/[1,587,328 - (2,670)²/8] = .898

This value of r² is relatively high.
14.30 r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 19.8885/[524 - (48)²/5] = .685

This is a modest value of r².

68.5% of the variation of y is accounted for by x, but 31.5% is unaccounted for.
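The coefficient-of-determination formula used in problems 14.27–14.31 can be sketched as a one-liner over the printed values (illustrative check only):

```python
# r² = 1 - SSE / (Σy² - (Σy)²/n), with the problem-14.30 numbers.
sse, sum_y, sum_y2, n = 19.8885, 48, 524, 5
r_sq = 1 - sse / (sum_y2 - sum_y ** 2 / n)
print(round(r_sq, 3))
```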
14.31 r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 21.33547/[8,188.41 - (221.3)²/6] = .183

This is a low value of r².

Only 18.3% of the variability of y is accounted for by the x values; 81.7% is unaccounted for.
14.32 CCI Median Income
116.8 37.415
91.5 36.770
68.5 35.501
61.6 35.047
65.9 34.700
90.6 34.942
100.0 35.887
104.6 36.306
125.4 37.005
Σx = 323.573   Σy = 824.9   Σx² = 11,640.93413   Σy² = 79,718.79   Σxy = 29,804.4505   n = 9
b1 = SSxy/SSxx = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n]

   = [29,804.4505 - (323.573)(824.9)/9] / [11,640.93413 - (323.573)²/9] = 19.2204
b0 = Σy/n - b1(Σx/n) = 824.9/9 - (19.2204)(323.573/9) = -599.3674

ŷ = -599.3674 + 19.2204 x
SSE = Σy² - b0Σy - b1Σxy = 79,718.79 - (-599.3674)(824.9) - 19.2204(29,804.4505) = 1283.13435

se = √(SSE/(n - 2)) = √(1283.13435/7) = 13.539

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 1283.13435/[79,718.79 - (824.9)²/9] = .688
14.33 sb = se/√(Σx² - (Σx)²/n) = 3.94/√(1,833 - (89)²/5) = .2498

b1 = 0.162

H0: β1 = 0     α = .05
Ha: β1 ≠ 0

This is a two-tail test, α/2 = .025     df = n - 2 = 5 - 2 = 3

t.025,3 = 3.182

t = (b1 - β1)/sb = (0.162 - 0)/.2498 = 0.65

Since the observed t = 0.65 < t.025,3 = 3.182, the decision is to fail to reject the null hypothesis.
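The slope t test used in problems 14.33–14.37 can be verified from the printed quantities (the table critical value 3.182 is taken as given):

```python
import math

# Problem 14.33: sb = se / sqrt(Σx² - (Σx)²/n), then t = (b1 - 0) / sb.
se, sum_x, sum_x2, n = 3.94, 89, 1833, 5
b1 = 0.162

sb = se / math.sqrt(sum_x2 - sum_x ** 2 / n)
t = (b1 - 0) / sb
print(round(sb, 4), round(t, 2))   # compare |t| with t.025,3 = 3.182
```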
14.34 sb = se/√(Σx² - (Σx)²/n) = 7.376/√(58,293 - (571)²/7) = .068145

b1 = -0.898

H0: β1 = 0     α = .01
Ha: β1 ≠ 0

Two-tail test, α/2 = .005     df = n - 2 = 7 - 2 = 5

t.005,5 = ±4.032

t = (b1 - β1)/sb = (-0.898 - 0)/.068145 = -13.18

Since the observed t = -13.18 < t.005,5 = -4.032, the decision is to reject the null hypothesis.
14.35 sb = se/√(Σx² - (Σx)²/n) = 108.7/√(7,667.15 - (199.5)²/8) = 2.095

b1 = 15.240

H0: β1 = 0     α = .10
Ha: β1 ≠ 0

For a two-tail test, α/2 = .05     df = n - 2 = 8 - 2 = 6

t.05,6 = 1.943

t = (b1 - β1)/sb = (15.240 - 0)/2.095 = 7.27

Since the observed t = 7.27 > t.05,6 = 1.943, the decision is to reject the null hypothesis.
14.36 sb = se/√(Σx² - (Σx)²/n) = 2.575/√(421 - (41)²/5) = .27963

b1 = -0.715

H0: β1 = 0     α = .05
Ha: β1 ≠ 0

For a two-tail test, α/2 = .025     df = n - 2 = 5 - 2 = 3

t.025,3 = ±3.182

t = (b1 - β1)/sb = (-0.715 - 0)/.27963 = -2.56

Since the observed t = -2.56 > t.025,3 = -3.182, the decision is to fail to reject the null hypothesis.
Page | 736
14.37 sb = se/√(Σx² - (Σx)²/n) = 2.3095/√(19,774.78 - (344.4)²/6) = 0.926025

b1 = 0.878

H0: β1 = 0     α = .05
Ha: β1 ≠ 0

For a two-tail test, α/2 = .025     df = n - 2 = 6 - 2 = 4

t.025,4 = 2.776

t = (b1 - β1)/sb = (0.878 - 0)/.926025 = 0.948

Since the observed t = 0.948 < t.025,4 = 2.776, the decision is to fail to reject the null hypothesis.
14.38 F = 8.26 with a p-value of .021. The overall model is significant at α = .05 but not at α = .01. For simple regression,

t = √F = 2.874

t.05,8 = 1.86 but t.01,8 = 2.896. The slope is significant at α = .05 but not at α = .01.
Page | 738
14.39 x0 = 25

95% confidence     α/2 = .025
df = n - 2 = 5 - 2 = 3     t.025,3 = 3.182

x̄ = Σx/n = 89/5 = 17.8

Σx = 89     Σx² = 1,833     se = 3.94

ŷ = 16.5 + 0.162(25) = 20.55

ŷ ± t(α/2,n-2) se √[1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)] =

20.55 ± 3.182(3.94)√[1/5 + (25 - 17.8)²/(1,833 - (89)²/5)] = 20.55 ± 3.182(3.94)(.63903) =

20.55 ± 8.01

12.54 < E(y25) < 28.56
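The interval half-width for E(y|x0) can be recomputed from the printed quantities (verification sketch; the critical value 3.182 is taken from the t table):

```python
import math

# Problem 14.39: CI for E(y|x0=25), half-width = t * se * sqrt(1/n + (x0-x̄)²/SSxx)
t_crit, se, n = 3.182, 3.94, 5
sum_x, sum_x2, x0 = 89, 1833, 25
x_bar = sum_x / n

y_hat = 16.5 + 0.162 * x0
half = t_crit * se * math.sqrt(1 / n + (x0 - x_bar) ** 2 / (sum_x2 - sum_x ** 2 / n))
print(round(y_hat, 2), round(y_hat - half, 2), round(y_hat + half, 2))
```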
Page | 739
14.40 x0 = 100     For 90% confidence, α/2 = .05
df = n - 2 = 7 - 2 = 5     t.05,5 = 2.015

x̄ = Σx/n = 571/7 = 81.57143

Σx = 571     Σx² = 58,293     se = 7.377

ŷ = 144.414 - .898(100) = 54.614

ŷ ± t(α/2,n-2) se √[1 + 1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)] =

54.614 ± 2.015(7.377)√[1 + 1/7 + (100 - 81.57143)²/(58,293 - (571)²/7)] =

54.614 ± 2.015(7.377)(1.08252) = 54.614 ± 16.091

38.523 < y < 70.705

For x0 = 130:     ŷ = 144.414 - 0.898(130) = 27.674

27.674 ± 2.015(7.377)√[1 + 1/7 + (130 - 81.57143)²/(58,293 - (571)²/7)] =

27.674 ± 2.015(7.377)(1.1589) = 27.674 ± 17.227

10.447 < y < 44.901

The interval of y for x0 = 130 is wider than the interval of y for x0 = 100 because x0 = 100 is nearer to x̄ = 81.57 than is x0 = 130.
14.41 x0 = 20     For 98% confidence, α/2 = .01
df = n - 2 = 8 - 2 = 6     t.01,6 = 3.143

x̄ = Σx/n = 199.5/8 = 24.9375

Σx = 199.5     Σx² = 7,667.15     se = 108.8

ŷ = -46.29 + 15.24(20) = 258.51

ŷ ± t(α/2,n-2) se √[1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)] =

258.51 ± (3.143)(108.8)√[1/8 + (20 - 24.9375)²/(7,667.15 - (199.5)²/8)] =

258.51 ± (3.143)(108.8)(0.36614) = 258.51 ± 125.20

133.31 < E(y20) < 383.71
For a single y value:

ŷ ± t(α/2,n-2) se √[1 + 1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)] =

258.51 ± (3.143)(108.8)√[1 + 1/8 + (20 - 24.9375)²/(7,667.15 - (199.5)²/8)] =

258.51 ± (3.143)(108.8)(1.06492) = 258.51 ± 364.16

-105.65 < y < 622.67
The confidence interval for the single value of y is wider than the confidence interval for the average value of y because individual values of y can vary more than the average value of y.
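The widening effect of the extra "1 +" term under the radical for a single y can be seen numerically (a check on problem 14.41, using its printed quantities):

```python
import math

# Problem 14.41 at x0 = 20: half-width for E(y|x0) vs. for a single y.
t_crit, se, n = 3.143, 108.8, 8
sum_x, sum_x2, x0 = 199.5, 7667.15, 20
x_bar = sum_x / n
core = 1 / n + (x0 - x_bar) ** 2 / (sum_x2 - sum_x ** 2 / n)

half_mean = t_crit * se * math.sqrt(core)        # confidence interval for E(y)
half_single = t_crit * se * math.sqrt(1 + core)  # prediction interval for one y
print(round(half_mean, 2), round(half_single, 2))
```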
Page | 743
14.42 x0 = 10     For 99% confidence, α/2 = .005
df = n - 2 = 5 - 2 = 3     t.005,3 = 5.841

x̄ = Σx/n = 41/5 = 8.20

Σx = 41     Σx² = 421     se = 2.575

ŷ = 15.46 - 0.715(10) = 8.31

ŷ ± t(α/2,n-2) se √[1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)] =

8.31 ± 5.841(2.575)√[1/5 + (10 - 8.2)²/(421 - (41)²/5)] = 8.31 ± 5.841(2.575)(.488065) =

8.31 ± 7.34

0.97 < E(y10) < 15.65

If the prime interest rate is 10%, we are 99% confident that the average bond rate is between 0.97% and 15.65%.
Page | 744
14.43 Year Fertilizer
2001 11.9
2002 17.9
2003 22.0
2004 21.8
2005 26.0
Σx = 10,015   Σy = 99.6   Σxy = 199,530.9   Σx² = 20,060,055   Σy² = 2097.26   n = 5
b1 = SSxy/SSxx = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n]

   = [199,530.9 - (10,015)(99.6)/5] / [20,060,055 - (10,015)²/5] = 3.21

b0 = Σy/n - b1(Σx/n) = 99.6/5 - 3.21(10,015/5) = -6,409.71

ŷ = -6,409.71 + 3.21 x

ŷ(2008) = -6,409.71 + 3.21(2008) = 35.97
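The whole trend-line fit can be reproduced from the raw (year, value) pairs rather than the precomputed sums (a verification sketch, not part of the original solution):

```python
# Problem 14.43: fit ŷ = b0 + b1*x from the data and forecast 2008.
years = [2001, 2002, 2003, 2004, 2005]
vals = [11.9, 17.9, 22.0, 21.8, 26.0]
n = len(years)

sx, sy = sum(years), sum(vals)
sxx = sum(x * x for x in years)
sxy = sum(x * y for x, y in zip(years, vals))

b1 = (sxy - sx * sy / n) / (sxx - sx ** 2 / n)
b0 = sy / n - b1 * sx / n
forecast = b0 + b1 * 2008
print(round(b1, 2), round(forecast, 2))
```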
Page | 745
14.44 Year Fertilizer
1998 5860
1999 6632
2000 7125
2001 6000
2002 4380
2003 3326
2004 2642
Σx = 14,007   Σy = 35,965   Σxy = 71,946,954   Σx² = 28,028,035   Σy² = 202,315,489   n = 7
b1 = SSxy/SSxx = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n]

   = [71,946,954 - (14,007)(35,965)/7] / [28,028,035 - (14,007)²/7] = -678.9643

b0 = Σy/n - b1(Σx/n) = 35,965/7 - (-678.9643)(14,007/7) = 1,363,745.39

ŷ = 1,363,745.39 - 678.9643 x

ŷ(2007) = 1,363,745.39 - 678.9643(2007) = 1,064.04
Page | 746
14.45 Year Quarter Cum. Quarter(x) Sales(y)
2003 1 1 11.93
2 2 12.46
3 3 13.28
4 4 15.08
2004 1 5 16.08
2 6 16.82
3 7 17.60
4 8 18.66
2005 1 9 19.73
2 10 21.11
3 11 22.21
4 12 22.94
Use the cumulative quarters as the predictor variable, x, to predict sales, y.
Σx = 78   Σy = 207.9   Σxy = 1,499.07   Σx² = 650   Σy² = 3,755.2084   n = 12
b1 = SSxy/SSxx = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n]

   = [1,499.07 - (78)(207.9)/12] / [650 - (78)²/12] = 1.033

b0 = Σy/n - b1(Σx/n) = 207.9/12 - 1.033(78/12) = 10.6105

ŷ = 10.6105 + 1.033 x
Remember, this trend line was constructed using cumulative quarters. To forecast
sales for the third quarter of year 2007, we must convert this time frame to
cumulative quarters. The third quarter of year 2007 is quarter number 19 in our
scheme.
ŷ(19) = 10.6105 + 1.033(19) = 30.2375
Page | 748
14.46 x y
5 8
7 9
3 11
16 27
12 15
9 13
Σx = 52     Σx² = 564
Σy = 83     Σy² = 1,389     b1 = 1.2853
Σxy = 865     n = 6     b0 = 2.6941
a) ŷ = 2.6941 + 1.2853 x
b) ŷ (Predicted Values)     (y - ŷ) residuals
9.1206 -1.1206
11.6912 -2.6912
6.5500 4.4500
23.2588 3.7412
18.1177 -3.1176
14.2618 -1.2618
c) (y - ŷ)²
Page | 749
1.2557
7.2426
19.8025
13.9966
9.7194
1.5921
SSE = 53.6089

se = √(SSE/(n - 2)) = √(53.6089/4) = 3.661
d) r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 53.6089/[1,389 - (83)²/6] = .777
Page | 750
e) H0: β1 = 0     α = .01
   Ha: β1 ≠ 0

   Two-tailed test, α/2 = .005     df = n - 2 = 6 - 2 = 4

   t.005,4 = 4.604

   sb = se/√(Σx² - (Σx)²/n) = 3.661/√(564 - (52)²/6) = .34389

   t = (b1 - β1)/sb = (1.2853 - 0)/.34389 = 3.74

   Since the observed t = 3.74 < t.005,4 = 4.604, the decision is to fail to reject the null hypothesis.
f) The r² = .777 is modest. There appears to be some prediction with this model. The slope of the regression line is not significantly different from zero using α = .01. However, for α = .05, the null hypothesis of a zero slope is rejected. The standard error of the estimate, se = 3.661, is not particularly small given the range of values for y (27 - 8 = 19).
Page | 751
14.47 x y
53 5
47 5
41 7
50 4
58 10
62 12
45 3
60 11
Σx = 416     Σx² = 22,032
Σy = 57     Σy² = 489     b1 = 0.355
Σxy = 3,106     n = 8     b0 = -11.335
a) ŷ = -11.335 + 0.355 x
Page | 752
b) ŷ (Predicted Values)     (y - ŷ) residuals
7.48 -2.48
5.35 -0.35
3.22 3.78
6.415 -2.415
9.255 0.745
10.675 1.325
4.64 -1.64
9.965 1.035
c) (y - ŷ)²
6.1504
0.1225
14.2884
5.8322
0.5550
1.7556
2.6896
1.0712
SSE = 32.4649
d) se = √(SSE/(n - 2)) = √(32.4649/6) = 2.3261
Page | 753
e) r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 32.4649/[489 - (57)²/8] = .608
f) H0: β1 = 0     α = .05
   Ha: β1 ≠ 0

   Two-tailed test, α/2 = .025     df = n - 2 = 8 - 2 = 6

   t.025,6 = 2.447

   sb = se/√(Σx² - (Σx)²/n) = 2.3261/√(22,032 - (416)²/8) = 0.116305

   t = (b1 - β1)/sb = (0.355 - 0)/.116305 = 3.05

   Since the observed t = 3.05 > t.025,6 = 2.447, the decision is to reject the null hypothesis. The population slope is different from zero.
g) This model produces only a modest r² = .608. Almost 40% of the variance of y is unaccounted for by x. The range of y values is 12 - 3 = 9 and the standard error of the estimate is 2.33. Given this small range, se is not small.
14.48 Σx = 1,263     Σx² = 268,295
Σy = 417     Σy² = 29,135
Σxy = 88,288     n = 6

b0 = 25.42778     b1 = 0.209369
SSE = Σy² - b0Σy - b1Σxy = 29,135 - (25.42778)(417) - (0.209369)(88,288) = 46.845468

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 46.845468/153.5 = .695

Coefficient of determination = r² = .695
14.49 a) x0 = 60

Σx = 524     Σx² = 36,224
Σy = 215     Σy² = 6,411     b1 = .5481
Σxy = 15,125     n = 8     b0 = -9.026

se = 3.201     95% confidence interval     α/2 = .025
df = n - 2 = 8 - 2 = 6     t.025,6 = 2.447

ŷ = -9.026 + 0.5481(60) = 23.86

x̄ = Σx/n = 524/8 = 65.5
ŷ ± t(α/2,n-2) se √[1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)] =

23.86 ± 2.447(3.201)√[1/8 + (60 - 65.5)²/(36,224 - (524)²/8)] =

23.86 ± 2.447(3.201)(.375372) = 23.86 ± 2.94

20.92 < E(y60) < 26.80
b) x0 = 70

ŷ70 = -9.026 + 0.5481(70) = 29.341

ŷ ± t(α/2,n-2) se √[1 + 1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)] =

29.341 ± 2.447(3.201)√[1 + 1/8 + (70 - 65.5)²/(36,224 - (524)²/8)] =

29.341 ± 2.447(3.201)(1.06567) = 29.341 ± 8.347

20.994 < y < 37.688
c) The confidence interval in (b) is much wider because part (b) is for a single value of y, which produces much greater possible variation. In actuality, x0 = 70 in part (b) is slightly closer to the mean (x̄) than x0 = 60 is. However, the width of the single-value interval is much greater than that of the interval for the average or expected y value in part (a).
Page | 757
14.50 Year Cost
1 56
2 54
3 49
4 46
5 45
Σx = 15   Σy = 250   Σxy = 720   Σx² = 55   Σy² = 12,594   n = 5
b1 = SSxy/SSxx = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n] = [720 - (15)(250)/5] / [55 - (15)²/5] = -3

b0 = Σy/n - b1(Σx/n) = 250/5 - (-3)(15/5) = 59

ŷ = 59 - 3 x

ŷ(7) = 59 - 3(7) = 38
14.51 Σy = 267     Σy² = 15,971
Σx = 21     Σx² = 101
Σxy = 1,256     n = 5

b0 = 9.234375     b1 = 10.515625
SSE = Σy² - b0Σy - b1Σxy = 15,971 - (9.234375)(267) - (10.515625)(1,256) = 297.7969

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 297.7969/1,713.2 = .826
If a regression model had been developed to predict the number of cars sold by the number of sales people, the model would have had an r² of 82.6%. The same would hold true for a model to predict the number of sales people by the number of cars sold.
14.52 n = 12     Σx = 548     Σx² = 26,592
Σy = 5940     Σy² = 3,211,546     Σxy = 287,908

b1 = 10.626383     b0 = 9.728511
ŷ = 9.728511 + 10.626383 x

SSE = Σy² - b0Σy - b1Σxy = 3,211,546 - (9.728511)(5940) - (10.626383)(287,908) = 94,337.9762

se = √(SSE/(n - 2)) = √(94,337.9762/10) = 97.1277

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 94,337.9762/271,246 = .652

t = (10.626383 - 0) / [97.1277/√(26,592 - (548)²/12)] = 4.33

If α = .01, then t.005,10 = 3.169. Since the observed t = 4.33 > t.005,10 = 3.169, the decision is to reject the null hypothesis.
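The full chain of this solution (slope, intercept, SSE, se, r², slope t) can be checked end to end from the printed sums (an illustrative sketch, not part of the original solution):

```python
import math

# Problem 14.52 from summary sums only.
n, sx, sxx = 12, 548, 26592
sy, syy, sxy = 5940, 3211546, 287908

ss_xx = sxx - sx ** 2 / n
ss_yy = syy - sy ** 2 / n
b1 = (sxy - sx * sy / n) / ss_xx
b0 = sy / n - b1 * sx / n
sse = syy - b0 * sy - b1 * sxy          # SSE = Σy² - b0Σy - b1Σxy
se = math.sqrt(sse / (n - 2))
r_sq = 1 - sse / ss_yy
t = b1 / (se / math.sqrt(ss_xx))        # slope t statistic
print(round(b1, 4), round(se, 2), round(r_sq, 3), round(t, 2))
```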
Page | 760
14.53 Sales(y) Number of Units(x)
17.1 12.4
7.9 7.5
4.8 6.8
4.7 8.7
4.6 4.6
4.0 5.1
2.9 11.2
2.7 5.1
2.7 2.9
Σy = 51.4     Σy² = 460.1     Σx = 64.3
Σx² = 538.97     Σxy = 440.46     n = 9

b1 = 0.92025     b0 = -0.863565
ŷ = -0.863565 + 0.92025 x

SSE = Σy² - b0Σy - b1Σxy = 460.1 - (-0.863565)(51.4) - (0.92025)(440.46) = 99.153926

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 99.153926/166.55 = .405
Page | 762
14.54 Year Total Employment
1995 11,152
1996 10,935
1997 11,050
1998 10,845
1999 10,776
2000 10,764
2001 10,697
2002 9,234
2003 9,223
2004 9,158
Σx = 19,995   Σy = 103,834   Σxy = 207,596,350   Σx² = 39,980,085   Σy² = 1,084,268,984   n = 10
b1 = SSxy/SSxx = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n]

   = [207,596,350 - (19,995)(103,834)/10] / [39,980,085 - (19,995)²/10] = -239.188

b0 = Σy/n - b1(Σx/n) = 103,834/10 - (-239.188)(19,995/10) = 488,639.564

ŷ = 488,639.564 - 239.188 x

ŷ(2008) = 488,639.564 - 239.188(2008) = 8,350.30
Page | 763
Page | 764
14.55 1977 2003
581 666
213 214
668 496
345 204
1476 1600
1776 6278
Σx = 5059   Σy = 9458   Σx² = 6,280,931   Σy² = 42,750,268   Σxy = 14,345,564   n = 6
b1 = SSxy/SSxx = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n]

   = [14,345,564 - (5059)(9458)/6] / [6,280,931 - (5059)²/6] = 3.1612

b0 = Σy/n - b1(Σx/n) = 9458/6 - (3.1612)(5059/6) = -1089.0712

ŷ = -1089.0712 + 3.1612 x
For x = 700:

ŷ = -1089.0712 + 3.1612(700) = 1123.77

ŷ ± t(α/2,n-2) se √[1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)]

α = .05     t.025,4 = 2.776
x0 = 700, n = 6     x̄ = 843.167

SSE = Σy² - b0Σy - b1Σxy = 42,750,268 - (-1089.0712)(9458) - (3.1612)(14,345,564) = 7,701,506.49

se = √(SSE/(n - 2)) = √(7,701,506.49/4) = 1387.58

Confidence interval:

1123.77 ± (2.776)(1387.58)√[1/6 + (700 - 843.167)²/(6,280,931 - (5059)²/6)] = 1123.77 ± 1619.81

-496.04 < E(y700) < 2743.58
Page | 766
H0: β1 = 0
Ha: β1 ≠ 0

α = .05     df = 4

Table t.025,4 = 2.776

t = (b1 - 0)/sb = (3.1612 - 0) / [1387.58/√2,015,350.833] = 3.234

Since the observed t = 3.234 > t.025,4 = 2.776, the decision is to reject the null hypothesis.
14.56 Σx = 11.902     Σx² = 25.1215
Σy = 516.8     Σy² = 61,899.06     b1 = 66.36277
Σxy = 1,202.867     n = 7     b0 = -39.0071
ŷ = -39.0071 + 66.36277 x

SSE = Σy² - b0Σy - b1Σxy = 61,899.06 - (-39.0071)(516.8) - (66.36277)(1,202.867) = 2,232.343

se = √(SSE/(n - 2)) = √(2,232.343/5) = 21.13

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 2,232.343/[61,899.06 - (516.8)²/7] = 1 - .094 = .906
14.57 Σx = 44,754   Σy = 17,314   Σx² = 167,540,610   Σy² = 24,646,062   n = 13   Σxy = 59,852,571
b1 = SSxy/SSxx = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n]

   = [59,852,571 - (44,754)(17,314)/13] / [167,540,610 - (44,754)²/13] = .01835

b0 = Σy/n - b1(Σx/n) = 17,314/13 - (.01835)(44,754/13) = 1268.685

ŷ = 1268.685 + .01835 x

r² for this model is .002858. There is no predictability in this model.

Test for slope: t = 0.18 with a p-value of 0.8623. Not significant.
Time-Series Trend Line:
Page | 768
Σx = 91   Σy = 44,754   Σxy = 304,797   Σx² = 819   Σy² = 167,540,610   n = 13
b1 = SSxy/SSxx = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n]

   = [304,797 - (91)(44,754)/13] / [819 - (91)²/13] = -46.5989

b0 = Σy/n - b1(Σx/n) = 44,754/13 - (-46.5989)(91/13) = 3,768.81

ŷ = 3,768.81 - 46.5989 x

ŷ(2007) = 3,768.81 - 46.5989(15) = 3,069.83
Page | 769
14.58 Σx = 323.3     Σy = 6765.8
Σx² = 29,629.13     Σy² = 7,583,144.64
Σxy = 339,342.76     n = 7
b1 = SSxy/SSxx = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n]

   = [339,342.76 - (323.3)(6765.8)/7] / [29,629.13 - (323.3)²/7] = 1.82751

b0 = Σy/n - b1(Σx/n) = 6765.8/7 - (1.82751)(323.3/7) = 882.138

ŷ = 882.138 + 1.82751 x

SSE = Σy² - b0Σy - b1Σxy = 7,583,144.64 - (882.138)(6765.8) - (1.82751)(339,342.76) = 994,623.07

se = √(SSE/(n - 2)) = √(994,623.07/5) = 446.01

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 994,623.07/[7,583,144.64 - (6765.8)²/7] = 1 - .953 = .047
H0: β1 = 0
Ha: β1 ≠ 0     α = .05     t.025,5 = 2.571

SSxx = Σx² - (Σx)²/n = 29,629.13 - (323.3)²/7 = 14,697.29

t = (b1 - 0)/(se/√SSxx) = (1.82751 - 0)/(446.01/√14,697.29) = 0.50

Since the observed t = 0.50 < t.025,5 = 2.571, the decision is to fail to reject the null hypothesis.
Page | 771
14.59 Let Water use = y and Temperature = x

Σx = 608     Σx² = 49,584
Σy = 1,025     Σy² = 152,711     b1 = 2.40107
Σxy = 86,006     n = 8     b0 = -54.35604
ŷ = -54.35604 + 2.40107 x

ŷ100 = -54.35604 + 2.40107(100) = 185.751

SSE = Σy² - b0Σy - b1Σxy = 152,711 - (-54.35604)(1,025) - (2.40107)(86,006) = 1919.5146

se = √(SSE/(n - 2)) = √(1,919.5146/6) = 17.886

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 1,919.5146/[152,711 - (1025)²/8] = 1 - .09 = .91
Testing the slope:

H0: β1 = 0
Ha: β1 ≠ 0     α = .01

Since this is a two-tailed test, α/2 = .005
df = n - 2 = 8 - 2 = 6

t.005,6 = 3.707

sb = se/√(Σx² - (Σx)²/n) = 17.886/√(49,584 - (608)²/8) = .30783

t = (b1 - β1)/sb = (2.40107 - 0)/.30783 = 7.80

Since the observed t = 7.80 > t.005,6 = 3.707, the decision is to reject the null hypothesis.
14.60 a) The regression equation is: ŷ = 67.2 - 0.0565 x

b) For every unit of increase in the value of x, the predicted value of y will decrease by 0.0565.

c) The t ratio for the slope is -5.50 with an associated p-value of .000. This is significant at α = .10. The t ratio is negative because the slope is negative and the numerator of the t ratio formula equals the slope minus zero.
Page | 773
d) r² is .627, so 62.7% of the variability of y is accounted for by x. This is only a modest proportion of predictability. The standard error of the estimate is 10.32. This is best interpreted in light of the data and the magnitude of the data.

e) The F value, which tests the overall predictability of the model, is 30.25. For simple regression analysis, this equals the value of t², which is (-5.50)².

f) The negative sign on r is not a surprise because the slope of the regression line is also negative, indicating an inverse relationship between x and y. In addition, taking the square root of r² = .627 yields .7906, which is the magnitude of the value of r, allowing for rounding error.
14.61 The F value for overall predictability is 7.12 with an associated p-value of .0205, which is significant at α = .05. It is not significant at α = .01. The coefficient of determination is .372 with an adjusted r² of .32. This represents very modest predictability. The standard error of the estimate is 982.219, which in units of 1,000 laborers means that about 68% of the predictions are within 982,219 of the actual figures. The regression model is:

Number of Union Members = 22,348.97 - 0.0524 Labor Force

For a labor force of 100,000 (thousand, actually 100 million), substitute x = 100,000 to get a predicted value of 17,108.97 (thousand), which is actually 17,108,970 union members.
Page | 774
14.62 The Residual Model Diagnostics from MINITAB indicate a relatively healthy set of residuals. The histogram indicates that the error terms are generally normally distributed. This is somewhat confirmed by the nearly straight-line Normal Plot of Residuals. However, the Residuals vs. Fits graph indicates that there may be some heteroscedasticity, with greater error variance for small x values.
Page | 775
Chapter 15
Multiple Regression Analysis
LEARNING OBJECTIVES
This chapter presents the potential of multiple regression analysis as a tool in business decision
making and its applications, thereby enabling you to:
1. Develop a multiple regression model.
2. Understand and apply significance tests of the regression model and its coefficients.
3. Compute and interpret residuals, the standard error of the estimate, and the coefficient
of determination.
4. Interpret multiple regression computer output.
Page | 776
CHAPTER TEACHING STRATEGY
In chapter 14, using simple regression, the groundwork was prepared for chapter 15 by presenting the regression model along with mechanisms for testing the strength of the model such as se, r², a t test of the slope, and the residuals. In this chapter, multiple regression is presented as an extension of the simple linear regression case. It is initially pointed out that any model that has at least one interaction term or a variable that represents a power of two or more is considered a multiple regression model. Multiple regression opens up the possibilities of predicting by multiple independent variables and nonlinear relationships. It is emphasized in the chapter that with both simple and multiple regression models there is only one dependent variable. Where simple regression utilizes only one independent variable, multiple regression can utilize more than one independent variable.
Page | 777
Presented early in chapter 15 are the simultaneous equations that need to be solved to develop a first-order multiple regression model using two predictors. This should help the student to see that there are three equations with three unknowns to be solved. In addition, there are eight values that need to be determined before solving the simultaneous equations (Σx1, Σx2, Σy, Σx1², . . .). Suppose there are five predictors. Six simultaneous equations must be solved, and the number of sums needed as constants in the equations becomes overwhelming. At this point, the student will begin to realize that most researchers do not want to take the time or the effort to solve for multiple regression models by hand. For this reason, much of the chapter is presented using computer printouts. The assumption is that the use of multiple regression analysis is largely from computer analysis.
Topics included in this chapter are similar to the ones in chapter 14, including tests of the slope, R², and se. In addition, an adjusted R² is introduced in chapter 15. The adjusted R² takes into account the degrees of freedom for error and the total degrees of freedom, whereas R² does not. If there is a significant discrepancy between adjusted R² and R², then the regression model may not be as strong as it appears to be with the R². The gap between R² and adjusted R² tends to increase as nonsignificant independent variables are added to the regression model and decreases with increased sample size.
Page | 778
CHAPTER OUTLINE
15.1 The Multiple Regression Model
Multiple Regression Model with Two Independent Variables (First-Order)
Determining the Multiple Regression Equation
A Multiple Regression Model
15.2 Significant Tests of the Regression Model and its Coefficients
Testing the Overall Model
Significance Tests of the Regression Coefficients
15.3 Residuals, Standard Error of the Estimate, and R²
Residuals
SSE and Standard Error of the Estimate
Coefficient of Determination (R²)
Adjusted R²
15.4 Interpreting Multiple Regression Computer Output
A Reexamination of the Multiple Regression Output
Page | 779
KEY TERMS
Adjusted R
2
R
2
Coefficient of Multiple Determination (R
2
)
Residual
Dependent Variable Response Plane
Independent Variable Response Surface
Least Squares Analysis Response Variable
Multiple Regression Standard Error of the Estimate
Outliers Sum of Squares of Error
Partial Regression Coefficient
SOLUTIONS TO PROBLEMS IN CHAPTER 15
15.1 The regression model is:

ŷ = 25.03 - 0.0497 x1 + 1.928 x2

Predicted value of y for x1 = 200 and x2 = 7 is:

ŷ = 25.03 - 0.0497(200) + 1.928(7) = 28.586
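Evaluating a fitted multiple regression equation at given predictor values is just substitution, which can be sketched as (a check on the arithmetic above, not part of the original solution):

```python
# Problem 15.1: ŷ = 25.03 - 0.0497*x1 + 1.928*x2
def predict(x1, x2):
    return 25.03 - 0.0497 * x1 + 1.928 * x2

print(round(predict(200, 7), 3))
```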
Page | 780
15.2 The regression model is:

ŷ = 118.56 - 0.0794 x1 - 0.88428 x2 + 0.3769 x3

Predicted value of y for x1 = 33, x2 = 29, and x3 = 13 is:

ŷ = 118.56 - 0.0794(33) - 0.88428(29) + 0.3769(13) = 95.19538
Page | 781
15.3 The regression model is:

ŷ = 121.62 - 0.174 x1 + 6.02 x2 + 0.00026 x3 + 0.0041 x4

There are four independent variables. If x2, x3, and x4 are held constant, the predicted y will decrease by 0.174 for every unit increase in x1. Predicted y will increase by 6.02 for every unit increase in x2 as x1, x3, and x4 are held constant. Predicted y will increase by 0.00026 for every unit increase in x3, holding x1, x2, and x4 constant. If x4 is increased by one unit, the predicted y will increase by 0.0041 if x1, x2, and x3 are held constant.
15.4 The regression model is:

ŷ = 31,409.5 + 0.08425 x1 + 289.62 x2 - 0.0947 x3

For every unit increase in x1, the predicted y increases by 0.08425 if x2 and x3 are held constant. The predicted y will increase by 289.62 for every unit increase in x2 if x1 and x3 are held constant. The predicted y will decrease by 0.0947 for every unit increase in x3 if x1 and x2 are held constant.
15.5 The regression model is:
Page | 782
Per Capita = -7,629.627 + 116.2549 Paper Consumption
- 120.0904 Fish Consumption + 45.73328 Gasoline Consumption.
For every unit increase in paper consumption, the predicted per capita consumption
increases by 116.2549 if fish and gasoline consumption are held constant. For every
unit increase in fish consumption, the predicted per capita consumption decreases by
120.0904 if paper and gasoline consumption are held constant. For every unit increase
in gasoline consumption, the predicted per capita consumption increases by 45.73328 if
paper and fish consumption are held constant.
Page | 783
15.6 The regression model is:
Insider Ownership =
17.68 - 0.0594 Debt Ratio - 0.118 Dividend Payout
The coefficients mean that for every unit of increase in debt ratio there is a predicted decrease of 0.0594 in insider ownership if dividend payout is held constant. On the other hand, if dividend payout is increased by one unit, then there is a predicted drop in insider ownership of 0.118 while debt ratio is held constant.
15.7 There are 9 predictors in this model. The F test for overall significance of the model is 1.99 with a probability of .0825. This model is not significant at α = .05. Only one of the t values is statistically significant. Predictor x1 has a t of 2.73, which has an associated probability of .011, and this is significant at α = .05.
15.8 This model contains three predictors. The F test is significant at α = .05 but not at α = .01. The t values indicate that only one of the three predictors is significant. Predictor x1 yields a t value of 3.41 with an associated probability of .005. The recommendation is to rerun the model using only x1 and then search for other variables besides x2 and x3 to include in future models.
Page | 784
15.9 The regression model is:
Per Capita Consumption = -7,629.627 + 116.2549 Paper Consumption
- 120.0904 Fish Consumption + 45.73328 Gasoline Consumption
This model yields an F = 14.319 with p-value = .0023. Thus, there is overall significance at α = .01. One of the three predictors is significant. Gasoline Consumption has a t = 2.67 with p-value of .032, which is statistically significant at α = .05. The p-values of the t statistics for the other two predictors are insignificant, indicating that a model with just Gasoline Consumption as a single predictor might be nearly as strong.
Page | 785
15.10 The regression model is:
Insider Ownership =
17.68 - 0.0594 Debt Ratio - 0.118 Dividend Payout
The overall value of F is only 0.02 with p-value of .982. This model is not significant. Neither of the t values is significant (tDebt = -0.19 with a p-value of .855 and tDividend = -0.11 with a p-value of .913).
15.11 The regression model is:

ŷ = 3.981 + 0.07322 x1 - 0.03232 x2 - 0.003886 x3

The overall F for this model is 100.47, which is significant at α = .001. Only one of the predictors, x1, has a significant t value (t = 3.50, p-value of .005). The other independent variables have nonsignificant t values (x2: t = -1.55, p-value of .15 and x3: t = -1.01, p-value of .332). Since x2 and x3 are nonsignificant predictors, the researcher should consider using a simple regression model with only x1 as a predictor. The R² would drop some, but the model would be much more parsimonious.
15.12 The regression equation for the model using both x1 and x2 is:

ŷ = 243.44 - 16.608 x1 - 0.0732 x2

The overall F = 156.89 with a p-value of .000. x1 is a significant predictor of y, as indicated by t = -16.10 and a p-value of .000.

For x2, t = -0.39 with a p-value of .702. x2 is not a significant predictor of y when included with x1. Since x2 is not a significant predictor, the researcher might want to rerun the model using just x1 as a predictor.

The regression model using only x1 as a predictor is:

ŷ = 235.143 - 16.7678 x1

There is very little change in the coefficient of x1 from model one (2 predictors) to this model. The overall F = 335.47 with a p-value of .000 is highly significant. By using the one-predictor model, we get virtually the same predictability as with the two-predictor model, and it is more parsimonious.
15.13 There are 3 predictors in this model and 15 observations.
The regression equation is:
y = 657.053 + 5.7103 x₁ - 0.4169 x₂ - 3.4715 x₃
F = 8.96 with a p-value of .0027
x₁ is significant at α = .01 (t = 3.19, p-value of .0087)
x₃ is significant at α = .05 (t = -2.41, p-value of .0349)
The model is significant overall.
15.14 The standard error of the estimate is 3.503. R² is .408 and the adjusted R² is only .203. This indicates that there are a lot of insignificant predictors in the model. That is underscored by the fact that eight of the nine predictors have nonsignificant t values.
15.15 sₑ = 9.722, R² = .515, but the adjusted R² is only .404. The difference between the two is due to the fact that two of the three predictors in the model are nonsignificant. The model fits the data only modestly. The adjusted R² indicates that 40.4% of the variance of y is accounted for by this model and 59.6% is unaccounted for by the model.
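Several of these answers compare R² with its adjusted value. The relationship is adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1) for n observations and k predictors; a minimal sketch (the numbers below are illustrative, not taken from the textbook's data):

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# With few observations and several weak predictors, the penalty is large.
print(round(adjusted_r2(0.90, 20, 9), 2))  # -> 0.81
```

The gap between the two statistics widens as weak predictors are added, which is why the answers above treat a large gap as a sign of nonsignificant predictors.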
15.16 The standard error of the estimate of 14,660.57 indicates that this model predicts Per Capita Personal Consumption to within ±14,660.57 about 68% of the time. The entire range of Per Capita Personal Consumption for the data is slightly less than 110,000. Relative to this range, the standard error of the estimate is modest. R² = .85988 and the adjusted value of R² is .799828, indicating that there are potentially some nonsignificant variables in the model. An examination of the t statistics reveals that two of the three predictors are not significant. The model has relatively good predictability.
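The "±sₑ about 68% of the time" interpretation treats the residuals as roughly normal. A small sketch of how sₑ is computed from residuals (toy numbers, not this problem's data):

```python
import math

def standard_error_of_estimate(residuals, k):
    """s_e = sqrt(SSE / (n - k - 1)) for a model with k predictors."""
    n = len(residuals)
    sse = sum(e * e for e in residuals)  # sum of squared errors
    return math.sqrt(sse / (n - k - 1))

resid = [1.0, -2.0, 0.5, 1.5, -1.0, 0.0, 2.0, -2.0]  # toy residuals
print(round(standard_error_of_estimate(resid, k=3), 4))
```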
15.17 sₑ = 6.544. R² = .005. This model has no predictability.
15.18 The value of sₑ = 0.2331, R² = .965, and adjusted R² = .955. This is a very strong regression model. However, since x₂ and x₃ are not significant predictors, the researcher should consider using a simple regression model with only x₁ as a predictor. The R² would drop some, but the model would be much more parsimonious.
15.19 For the regression equation for the model using both x₁ and x₂, sₑ = 6.333, R² = .963, and adjusted R² = .957. Overall, this is a very strong model. For the regression model using only x₁ as a predictor, the standard error of the estimate is 6.124, R² = .963, and the adjusted R² = .960. The value of R² is the same as it was with the two predictors. However, the adjusted R² is slightly higher with the one-predictor model because the nonsignificant variable has been removed. In conclusion, by using the one-predictor model, we get virtually the same predictability as with the two-predictor model, and it is more parsimonious.
15.20 R² = .710, adjusted R² = .630, sₑ = 109.43. The model is significant overall. The R² is higher than the adjusted R² by 8%. The model is moderately strong.
15.21 The histogram indicates that there may be some problem with the error terms being normally distributed. The Residuals vs. Fits plot reveals that there may be some lack of homogeneity of error variance.
15.22 There are four predictors. The equation of the regression model is:
y = -55.9 + 0.0105 x₁ - 0.107 x₂ + 0.579 x₃ - 0.870 x₄
The test for overall significance yields an F = 55.52 with a p-value of .000, which is significant at α = .001. Three of the t tests for regression coefficients are significant at α = .01, including the coefficients for x₂, x₃, and x₄. The R² value of 80.2% indicates strong predictability for the model. The value of the adjusted R² (78.8%) is close to R², and sₑ is 9.025.
15.23 There are two predictors in this model. The equation of the regression model is:
y = 203.3937 + 1.1151 x₁ - 2.2115 x₂
The F test for overall significance yields a value of 24.55 with an associated p-value of .0000013, which is significant at α = .00001. Both variables yield t values that are significant at a 5% level of significance. x₂ is significant at α = .001. The R² is a rather modest 66.3% and the standard error of the estimate is 51.761.
15.24 The regression model is:
y = 137.27 + 0.0025 x₁ + 29.206 x₂
F = 10.89 with p = .005, sₑ = 9.401, R² = .731, adjusted R² = .664. For x₁, t = 0.01 with p = .99 and for x₂, t = 4.47 with p = .002. This model has good predictability. The gap between R² and adjusted R² indicates that there may be a nonsignificant predictor in the model. The t values show that x₁ has virtually no predictability and x₂ is a significant predictor of y.
15.25 The regression model is:
y = 362.3054 - 4.745518 x₁ - 13.89972 x₂ + 1.874297 x₃
F = 16.05 with p = .001, sₑ = 37.07, R² = .858, adjusted R² = .804. For x₁, t = -4.35 with p = .002; for x₂, t = -0.73 with p = .483; for x₃, t = 1.96 with p = .086. Thus, only one of the three predictors, x₁, is a significant predictor in this model. This model has very good predictability (R² = .858). The gap between R² and adjusted R² underscores the fact that there are two nonsignificant predictors in this model.
15.26 The overall F for this model was 12.19 with a p-value of .002, which is significant at α = .01. The t test for Silver is significant at α = .01 (t = 4.94, p = .001). The t test for Aluminum yields a t = 3.03 with a p-value of .016, which is significant at α = .05. The t test for Copper was insignificant with a p-value of .939. The value of R² was 82.1% compared to an adjusted R² of 75.3%. The gap between the two indicates the presence of some insignificant predictors (Copper). The standard error of the estimate is 53.44.
15.27 The regression model was:
Employment = 71.03 + 0.4620 NavalVessels + 0.02082 Commercial
F = 1.22 with p = .386 (not significant)
R² = .379 and adjusted R² = .068
The low value of adjusted R² indicates that the model has very low predictability. Neither t value is significant (t = 0.67 with p = .541 for NavalVessels and t = 1.07 with p = .345 for Commercial). Neither predictor is a significant predictor of employment.
15.28 The regression model was:
All = -1.06 + 0.475 Food + 0.250 Shelter - 0.008 Apparel + 0.272 Fuel Oil
F = 97.98 with a p-value of .000
sₑ = 0.7472, R² = .963, and adjusted R² = .953
One of the predictor variables, Food, produces a t value that is significant at α = .001. Two others are significant at α = .05: Shelter (t = 2.48 with a p-value of .025) and Fuel Oil (t = 2.36 with a p-value of .032).
15.29 The regression model was:
Corn = -2718 + 6.26 Soybeans - 0.77 Wheat
F = 14.25 with a p-value of .003, which is significant at α = .01
sₑ = 862.4, R² = 80.3%, adjusted R² = 74.6%
One of the two predictors, Soybeans, yielded a t value that was significant at α = .01, while the other predictor, Wheat, was not significant (t = -0.75 with a p-value of .476).
15.30 The regression model was:
Grocery = 76.23 + 0.08592 Housing + 0.16767 Utility + 0.0284 Transportation - 0.0659 Healthcare
F = 2.29 with p = .095, which is not significant at α = .05.
sₑ = 4.416, R² = .315, and adjusted R² = .177.
Only one of the four predictors has a significant t ratio, and that is Utility with t = 2.57 and p = .018. The other t ratios and their respective probabilities are: t = 1.68 with p = .109 for Housing, t = 0.17 with p = .87 for Transportation, and t = -0.64 with p = .53 for Healthcare.
This model is very weak. Only the predictor Utility shows much promise in accounting for the grocery variability.
15.31 The regression equation is:
y = 87.89 - 0.256 x₁ - 2.714 x₂ + 0.0706 x₃
F = 47.57 with a p-value of .000, significant at α = .001.
sₑ = 0.8503, R² = .941, adjusted R² = .921.
All three predictors produced significant t tests, with two of them (x₂ and x₃) significant at α = .01 and the other, x₁, significant at α = .05. This is a very strong model.
15.32 Two of the diagnostic charts indicate that there may be a problem with the error terms being normally distributed. The histogram indicates that the error term distribution might be skewed to the right, and the normal probability plot is somewhat nonlinear. In addition, the residuals vs. fits chart indicates a potential heteroscedasticity problem, with residuals for middle values of x producing more variability than those for lower and higher values of x.
Chapter 16
Building Multiple Regression Models
LEARNING OBJECTIVES
This chapter presents several advanced topics in multiple regression analysis enabling you to:
1. Analyze and interpret nonlinear variables in multiple regression analysis.
2. Understand the role of qualitative variables and how to use them in multiple regression
analysis.
3. Learn how to build and evaluate multiple regression models.
4. Detect influential observations in regression analysis.
CHAPTER TEACHING STRATEGY
In chapter 15, the groundwork was prepared for chapter 16 by presenting multiple regression models along with mechanisms for testing the strength of the models such as sₑ, R², t tests of the regression coefficients, and the residuals.
The early portion of this chapter is devoted to nonlinear regression models and search
procedures. There are other exotic types of regression models that can be explored. It is hoped
that by studying section 16.1, the student will be somewhat prepared to explore other nonlinear
models on his/her own. Tukey's Ladder of Transformations can be useful in steering the
research towards particular recoding schemes that will result in better fits for the data.
Dummy or indicator variables can be useful in multiple regression analysis. Remember
to emphasize that only one dummy variable is used to represent two categories (yes/no,
male/female, union/nonunion, etc.). For c categories of a qualitative variable, only c-1 indicator
variables should be included in the multiple regression model.
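The c - 1 rule can be sketched in code; this is an illustrative encoder (the function name and category labels are invented for the example):

```python
def dummy_encode(values, categories):
    """Encode a qualitative variable with c categories as c-1 indicator
    columns. The first category is the reference level (all zeros)."""
    ref, *rest = categories
    return [[1 if v == cat else 0 for cat in rest] for v in values]

# Four occupation categories -> three dummy columns per observation.
rows = dummy_encode(["mgr", "tech", "sales", "mgr"],
                    ["mgr", "tech", "sales", "clerical"])
print(rows)  # -> [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]]
```

Using all c columns instead of c - 1 would make the columns sum to a constant and render the regression unsolvable, which is why the reference level is omitted.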
Several search procedures have been discussed in the chapter including stepwise
regression, forward selection, backward elimination, and all possible regressions. All possible
regressions is presented mainly to demonstrate to the student the large number of possible
models that can be examined. Most of the effort is spent on stepwise regression because of its
common usage. Forward selection is presented as the same as stepwise regression except that
forward selection procedures do not go back and examine variables that have been in the model
at each new step. That is, with forward selection, once a variable is in the model, it stays in the
model. Backward elimination begins with a "full" model of all predictors. Sometimes there may
not be enough observations to justify such a model.
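The number of models examined by all possible regressions grows quickly: for k candidate predictors there are 2^k - 1 nonempty subsets. A one-line sketch:

```python
def model_count(k: int) -> int:
    """Number of distinct nonempty predictor subsets (all possible regressions)."""
    return 2 ** k - 1

print([model_count(k) for k in (2, 4, 8, 16)])  # -> [3, 15, 255, 65535]
```

This explosion is the reason the chapter spends most of its effort on stepwise regression rather than exhaustive search.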
CHAPTER OUTLINE
16.1 Non Linear Models: Mathematical Transformation
Polynomial Regression
Tukey's Ladder of Transformations
Regression Models with Interaction
Model Transformation
16.2 Indicator (Dummy) Variables
16.3 Model-Building: Search Procedures
Search Procedures
All Possible Regressions
Stepwise Regression
Forward Selection
Backward Elimination
16.4 Multicollinearity
KEY TERMS
All Possible Regressions
Backward Elimination
Dummy Variable
Forward Selection
Indicator Variable
Multicollinearity
Quadratic Model
Qualitative Variable
Search Procedures
Stepwise Regression
Tukey's Four-quadrant Approach
Tukey's Ladder of Transformations
Variance Inflation Factor
SOLUTIONS TO PROBLEMS IN CHAPTER 16
16.1 Simple Regression Model:
y = -147.27 + 27.128 x
F = 229.67 with p = .000, sₑ = 27.27, R² = .97, adjusted R² = .966, and t = 15.15 with p = .000. This is a very strong simple regression model.
Quadratic Model (using both x and x²):
y = -22.01 + 3.385 x + 0.9373 x²
F = 578.76 with p = .000, sₑ = 12.3, R² = .995, adjusted R² = .993; for x: t = 0.75 with p = .483, and for x²: t = 5.33 with p = .002. The quadratic model is also very strong with an even higher R² value. However, in this model only the x² term is a significant predictor.
16.2 The model is:
y = b₀ b₁ˣ
Using logs: log y = log b₀ + x log b₁
The regression model is solved for in the computer using the values of x and the values of log y. The resulting regression equation is:
log y = 0.5797 + 0.82096 x
F = 68.83 with p = .000, sₑ = 0.1261, R² = .852, and adjusted R² = .839. This model has relatively strong predictability.
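The log transformation reduces the exponential model to a straight-line fit on (x, log y). A pure-Python sketch with toy data (the function name and data are illustrative, not from the problem):

```python
import math

def fit_exponential(xs, ys):
    """Fit y = b0 * b1**x by ordinary least squares on log y
    (base-10 logs, matching the chapter's log-transform approach)."""
    lys = [math.log10(y) for y in ys]
    n = len(xs)
    mx, my = sum(xs) / n, sum(lys) / n
    slope = (sum((x - mx) * (ly - my) for x, ly in zip(xs, lys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    # Transform back: 10**intercept estimates b0, 10**slope estimates b1.
    return 10 ** intercept, 10 ** slope

# Toy data generated from y = 2 * 3**x; the fit recovers roughly b0=2, b1=3.
b0, b1 = fit_exponential([0, 1, 2, 3], [2, 6, 18, 54])
print(round(b0, 6), round(b1, 6))
```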
16.3 Simple regression model:
y = -1456.6 + 71.017 x
R² = .928 and adjusted R² = .910. t = 7.17 with p = .002.
Quadratic regression model:
y = 1012 - 14.06 x + 0.6115 x²
R² = .947 but adjusted R² = .911. The t ratio for the x term is t = -0.17 with p = .876. The t ratio for the x² term is t = 1.03 with p = .377.
Neither predictor is significant in the quadratic model. Also, the adjusted R² for this model is virtually identical to that of the simple regression model. The quadratic model adds virtually no predictability that the simple regression model does not already have. The scatter plot of the data follows:
[Scatter plot: Ad Exp (vertical axis, 1000 to 7000) vs. Eq & Sup Exp (horizontal axis, 30 to 110)]
16.4 The model is:
y = b₀ b₁ˣ
Using logs: log y = log b₀ + x log b₁
The regression model is solved for in the computer using the values of x and the values of log y, where x is failures and y is liabilities. The resulting regression equation is:
log liabilities = 3.1256 + 0.012846 failures
F = 19.98 with p = .001, sₑ = 0.2862, R² = .666, and adjusted R² = .633. This model has modest predictability.
16.5 The regression model is:
y = -28.61 - 2.68 x₁ + 18.25 x₂ - 0.2135 x₁² - 1.533 x₂² + 1.226 x₁x₂
F = 63.43 with p = .000, significant at α = .001
sₑ = 4.669, R² = .958, and adjusted R² = .943
None of the t ratios for this model is significant. They are t(x₁) = -0.25 with p = .805, t(x₂) = 0.91 with p = .378, t(x₁²) = -0.33 with p = .745, t(x₂²) = -0.68 with p = .506, and t(x₁x₂) = 0.52 with p = .613. This model has a high R², yet none of the predictors is individually significant.
The same thing occurs when the interaction term is not in the model. None of the t tests is significant. The R² remains high at .957, indicating that the loss of the interaction term was insignificant.
16.6 The F value shows very strong overall significance with a p-value of .00000073. This is reinforced by the high R² of .910 and adjusted R² of .878. An examination of the t values reveals that only one of the regression coefficients is significant at α = .05, and that is the interaction term with a p-value of .039. Thus, this model with both variables, the square of both variables, and the interaction term contains only one significant t test, and that is for interaction.
Without interaction, the R² drops to .877 and adjusted R² to .844. With the interaction term removed, both x₂ and x₂² are significant at α = .01.
16.7 The regression equation is:
y = 13.619 - 0.01201 x₁ + 2.998 x₂
The overall F = 8.43 is significant at α = .01 (p = .009).
sₑ = 1.245, R² = .652, adjusted R² = .575
The t ratio for the x₁ variable is only t = -0.14 with p = .893. However, the t ratio for the dummy variable, x₂, is t = 3.88 with p = .004. The indicator variable is the significant predictor in this regression model, which has some predictability (adjusted R² = .575).
16.8 The indicator variable has c = 4 categories, as shown by the c - 1 = 3 indicator predictors (x₂, x₃, x₄).
The regression equation is:
y = 7.909 + 0.581 x₁ + 1.458 x₂ - 5.881 x₃ - 4.108 x₄
Overall F = 13.54, p = .000, significant at α = .001
sₑ = 1.733, R² = .806, and adjusted R² = .747
For the predictors, t = 0.56 with p = .585 for the x₁ variable (not significant); t = 1.32 with p = .208 for the first indicator variable (x₂), which is nonsignificant; t = -5.32 with p = .000 for x₃, the second indicator variable, which is significant at α = .001; and t = -3.21 with p = .007 for the third indicator variable (x₄), which is significant at α = .01. This model has strong predictability, and the only significant predictor variables are the two dummy variables, x₃ and x₄.
16.9 This regression model has relatively strong predictability as indicated by R² = .795. Of the three predictor variables, only x₁ and x₂ have significant t ratios (using α = .05). x₃ (a non-indicator variable) is not a significant predictor. x₁, the indicator variable, plays a significant role in this model along with x₂.
16.10 The regression model is:
y = 41.225 + 1.081 x₁ - 18.404 x₂
F = 8.23 with p = .0017, which is significant at α = .01. sₑ = 11.744, R² = .388, and the adjusted R² = .341.
The t-ratio for x₂ (the dummy variable) is 4.05, which has an associated p-value of .0004 and is significant at α = .001. The t-ratio of 0.80 for x₁ is not significant (p-value = .4316). With x₂ = 0, the regression model becomes y = 41.225 + 1.081 x₁. With x₂ = 1, the regression model becomes y = 22.821 + 1.081 x₁. The presence of x₂ causes the y-intercept to drop by 18.404. The graph of each of these models (without the dummy variable and with the dummy variable equal to one) is shown below:
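The intercept shift produced by the dummy variable can be seen directly by evaluating the fitted equation at x₂ = 0 and x₂ = 1. A sketch using the coefficients reported in this answer:

```python
def predict(x1: float, x2: int) -> float:
    """Fitted 16.10-style model: the 0/1 dummy x2 shifts the intercept only,
    leaving the slope on x1 unchanged (two parallel lines)."""
    return 41.225 + 1.081 * x1 - 18.404 * x2

# At any x1, the gap between the x2=0 and x2=1 lines is the dummy coefficient.
print(round(predict(10, 0) - predict(10, 1), 3))  # -> 18.404
```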
16.11 The regression equation is:
Price = 7.066 - 0.0855 Hours + 9.614 ProbSeat + 10.507 FQ
The overall F = 6.80 with p = .009, which is significant at α = .01. sₑ = 4.02, R² = .671, and adjusted R² = .573. The difference between R² and adjusted R² indicates that there are some nonsignificant predictors in the model. The t ratios of Hours and Probability of Being Seated, t = -0.56 with p = .587 and t = 1.37 with p = .202, are nonsignificant at α = .05. The only significant predictor is the dummy variable, French Quarter or not, which has a t ratio of 3.97 with p = .003, significant at α = .01. The positive coefficient on this variable indicates that being in the French Quarter adds to the price of a meal.
16.12 There will be six predictor variables in the regression analysis:
three for occupation, two for industry, and one for marital status. The dependent
variable is job satisfaction. In total, there will be seven variables in this analysis.
16.13 Stepwise Regression:
Step 1: x₂ enters the model, t = -7.35 and R² = .794
The model is y = 36.15 - 0.146 x₂
Step 2: x₃ enters the model and x₂ remains in the model.
t for x₂ is -4.60, t for x₃ is 2.93. R² = .876.
The model is y = 26.40 - 0.101 x₂ + 0.116 x₃
The variable x₁ never enters the procedure.
16.14 Stepwise Regression:
Step 1: x₄ enters the model, t = -4.20 and R² = .525
The model is y = 133.53 - 0.78 x₄
Step 2: x₂ enters the model and x₄ remains in the model.
t for x₄ is -3.22 and t for x₂ is 2.15. R² = .637
The model is y = 91.01 - 0.60 x₄ + 0.51 x₂
The variables x₁ and x₃ never enter the procedure.
16.15 The output shows that the final model had four predictor variables: x₄, x₂, x₅, and x₇. The variables x₃ and x₆ did not enter the stepwise analysis. The procedure took four steps. The final model was:
y₁ = -5.00 x₄ + 3.22 x₂ + 1.78 x₅ + 1.56 x₇
The R² for this model was .5929, and sₑ was 3.36. The t ratios were: t = 3.07 for x₄, t = 2.05 for x₂, t = 2.02 for x₅, and t = 1.98 for x₇.
16.16 The output indicates that the stepwise process only went two steps. Variable x₃ entered at step one. However, at step two, x₃ dropped out of the analysis and x₂ and x₄ entered as the predictors. x₁ was the dependent variable. x₅ never entered the procedure and, like x₃, was not included in the final model. The final regression model was:
y = 22.30 + 12.38 x₂ + 0.0047 x₄
R² = .682 and sₑ = 9.47. t = 2.64 for x₂ and t = 2.01 for x₄.
16.17 The output indicates that the procedure went through two steps. At step 1, dividends entered the process, yielding an r² of .833 by itself. The t value was 6.69 and the model was y = -11.062 + 61.1 x₁. At step 2, net income entered the procedure and dividends remained in the model. The R² for this two-predictor model was .897, which is a modest increase from the simple regression model shown in step one. The step 2 model was:
Premiums earned = -3.726 + 45.2 dividends + 3.6 net income
For step 2, t = 4.36 (p-value = .002) for dividends and t = 2.24 (p-value = .056) for net income.
Correlation matrix:
            Premiums    Income   Dividends  Gain/Loss
Premiums    1
Income      0.808236   1
Dividends   0.912515   0.682321  1
Gain/Loss  -0.40984    0.0924   -0.52241   1
16.18 This stepwise regression procedure only went one step. The only significant predictor was natural gas. No other predictors entered the model. The regression model is:
Electricity = 1.748 + 0.994 Natural Gas
For this model, R² = .9295 and sₑ = 0.490. The t value for natural gas was 11.48.
16.19         y       x₁       x₂       x₃
      y       -      -.653    -.891     .821
      x₁    -.653      -       .650    -.615
      x₂    -.891    .650       -      -.688
      x₃     .821   -.615     -.688      -
There appears to be some correlation between all pairs of the predictor variables, x₁, x₂, and x₃. All pairwise correlations between independent variables are in the .600 to .700 range in absolute value.
16.20         y       x₁       x₂       x₃       x₄
      y       -      -.241     .621     .278    -.724
      x₁    -.241      -      -.359    -.161     .325
      x₂     .621   -.359       -       .243    -.442
      x₃     .278   -.161     .243       -      -.278
      x₄    -.724    .325    -.442    -.278       -
An examination of the intercorrelations of the predictor variables reveals that the highest pairwise correlation exists between variables x₂ and x₄ (-.442). The other correlations between independent variables are less than .400 in absolute value. Multicollinearity may not be a serious problem in this regression analysis.
16.21 The stepwise regression analysis of problem 16.17 resulted in two of the three predictor variables being included in the model. The simple regression model yielded an R² of .833, jumping to .897 with the two predictors. The predictor intercorrelations are:
              Net Income   Dividends   Gain/Loss
Net Income        -           .682        .092
Dividends        .682          -         -.522
Gain/Loss        .092        -.522         -
An examination of the predictor intercorrelations reveals that Gain/Loss and Net Income have very little correlation, but Net Income and Dividends have a correlation of .682 and Dividends and Gain/Loss have a correlation of -.522. These correlations might suggest multicollinearity.
16.22 The intercorrelations of the predictor variables are:
               Natural Gas   Fuel Oil   Gasoline
Natural Gas        -           .570       .701
Fuel Oil          .570          -         .934
Gasoline          .701         .934        -
None of these intercorrelations is small. Of particular concern is the correlation between fuel oil and gasoline, which is .934. These two variables seem to be adding about the same predictability to the model. In the stepwise regression analysis, only natural gas entered the procedure. Perhaps the overlapping information among natural gas, fuel oil, and gasoline was such that fuel oil and gasoline did not have significant unique variance to add to the prediction.
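Pairwise correlations like those screened in these answers come from the Pearson formula; a short sketch (the toy vectors below are illustrative, not the metals data):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation, used here to screen predictor pairs for
    multicollinearity before model building."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Two toy predictors that move together produce a high pairwise correlation,
# a warning sign like the .934 fuel oil/gasoline correlation above.
a = [1, 2, 3, 4, 5]
b = [2.1, 3.9, 6.2, 8.0, 9.9]
print(pearson_r(a, b) > 0.99)  # -> True
```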
16.23 The regression model is:
y = 564 - 27.99 x₁ - 6.155 x₂ - 15.90 x₃
F = 11.32 with p = .003, sₑ = 42.88, R² = .809, adjusted R² = .738. For x₁, t = -0.92 with p = .384; for x₂, t = -4.34 with p = .002; for x₃, t = -0.71 with p = .497. Thus, only one of the three predictors, x₂, is a significant predictor in this model. This model has very good predictability (R² = .809). The gap between R² and adjusted R² underscores the fact that there are two nonsignificant predictors in this model. x₁ is a nonsignificant indicator variable.
16.24 The stepwise regression process included two steps. At step 1, x₁ entered the procedure, producing the model:
y = 1540 + 48.2 x₁
The R² at this step is .9112 and the t ratio is 11.55. At step 2, x₁² entered the procedure and x₁ remained in the analysis. The stepwise regression procedure stopped at this step and did not proceed. The final model was:
y = 1237 + 136.1 x₁ - 5.9 x₁²
The R² at this step was .9723, the t ratio for x₁ was 7.89, and the t ratio for x₁² was -5.14.
16.25 In this model with x₁ and the log of x₁ as predictors, only log x₁ was a significant predictor of y. The stepwise procedure only went to step 1. The regression model was:
y = -13.20 + 11.64 log x₁
R² = .9617 and the t ratio of log x₁ was 17.36. This model has very strong predictability using only the log of the x₁ variable.
16.26 The regression model is:
Grain = -4.675 + 0.4732 Oilseed + 1.18 Livestock
The value of R² was .901 and adjusted R² = .877. sₑ = 1.761. F = 36.55 with p = .000.
t = 3.74 with p = .006 for Oilseed and t = 3.78 with p = .005 for Livestock. Both predictors are significant at α = .01. This is a model with strong predictability.
16.27 The stepwise regression procedure used only two steps. At step 1, Silver was the lone predictor. The value of R² was .5244. At step 2, Aluminum entered the model and Silver remained in the model. However, the R² jumped to .8204. The final model at step 2 was:
Gold = -50.19 + 18.9 Silver + 3.59 Aluminum
The t values were: t = 5.43 for Silver and t = 3.85 for Aluminum. Copper did not enter into the process at all.
16.28 The regression model was:
Employment = 71.03 + 0.4620 NavalVessels + 0.02082 Commercial
F = 1.22 with p = .386 (not significant)
R² = .379 and adjusted R² = .068
The low value of adjusted R² indicates that the model has very low predictability. Neither t value is significant (t = 0.67 with p = .541 for NavalVessels and t = 1.07 with p = .345 for Commercial). Neither predictor is a significant predictor of employment.
16.29 There were four predictor variables. The stepwise regression procedure went three steps. The predictor apparel never entered the stepwise process. At step 1, food entered the procedure, producing a model with an R² of .84. At step 2, fuel oil entered and food remained. The R² increased to .95. At step 3, shelter entered the procedure and both fuel oil and food remained in the model. The R² at this step was .96. The final model was:
All = -1.0615 + 0.474 Food + 0.269 Fuel Oil + 0.249 Shelter
The t ratios were: t = 8.32 for food, t = 2.81 for fuel oil, and t = 2.56 for shelter.
16.30 The stepwise regression process with these two independent variables went only one step. At step 1, Soybeans entered, producing the model:
Corn = -2,962 + 5.4 Soybeans
The R² for this model was .7868. The t ratio for Soybeans was 5.43. Wheat did not enter into the analysis.
16.31 The regression model was:
Grocery = 76.23 + 0.08592 Housing + 0.16767 Utility + 0.0284 Transportation - 0.0659 Healthcare
F = 2.29 with p = .095, which is not significant at α = .05.
sₑ = 4.416, R² = .315, and adjusted R² = .177.
Only one of the four predictors has a significant t ratio, and that is Utility with t = 2.57 and p = .018. The other t ratios and their respective probabilities are: t = 1.68 with p = .109 for Housing, t = 0.17 with p = .87 for Transportation, and t = -0.64 with p = .53 for Healthcare.
This model is very weak. Only the predictor Utility shows much promise in accounting for the grocery variability.
16.32 The output suggests that the procedure only went two steps. At step 1, x₁ entered the model, yielding an R² of .7539. At step 2, x₂ entered the model and x₁ remained. The procedure stopped here with a final model of:
y = 124.5 - 43.4 x₁ + 1.36 x₂
The R² for this model was .8059, indicating relatively strong predictability with two independent variables. Since there were four predictor variables, two of the variables did not enter the stepwise process.
16.33 Of the three predictors, x₂ is an indicator variable. An examination of the stepwise regression output reveals that there were three steps and that all three predictors end up in the final model. Variable x₃ is the strongest individual predictor of y and entered at step one, resulting in an R² of .8124. At step 2, x₂ entered the process and variable x₃ remained in the model. The R² at this step was .8782. At step 3, variable x₁ entered the procedure. Variables x₃ and x₂ remained in the model. The final R² was .9407. The final model was:
y = 87.89 + 0.071 x₃ - 2.71 x₂ - 0.256 x₁
16.34 The R² for the full model is .321. After dropping out the variable x₃, the R² is still .321. Variable x₃ added virtually no information to the model. This is underscored by the fact that the p-value for the t test of the slope for x₃ is .878, indicating that there is no significance. The standard error of the estimate actually drops slightly after x₃ is removed from the model.
Chapter 18
Statistical Quality Control
LEARNING OBJECTIVES
Chapter 18 presents basic concepts in quality control, with a particular emphasis on statistical
quality control techniques, thereby enabling you to:
1. Understand the concepts of quality, quality control, and total quality management.
2. Understand the importance of statistical quality control in total quality management.
3. Learn about process analysis and some process analysis tools.
4. Learn how to construct x̄ charts, R charts, p charts, and c charts.
5. Understand the theory and application of acceptance sampling.
CHAPTER TEACHING STRATEGY
The objective of this chapter is to present the major concepts of statistical quality
control including control charts and acceptance sampling in a context of total quality
management. Too many texts focus only on a few statistical quality control procedures and fail
to provide the student with a managerial, decision-making context within which to use quality
control statistical techniques. In this text, the concepts of total quality management along with
some varying definitions of quality and some of the major theories in quality are presented.
From this, the student can formulate a backdrop for the statistical techniques presented. Some
statistics instructors argue that students are exposed to some of this material in other courses.
However, the background material on quality control is relatively brief; and at the very least,
students should be required to read over these pages before beginning the study of control
charts.
The background material helps the students understand that everyone does not agree
on what a quality product is. After all, if there is no agreement on what quality is, then it is very
difficult to ascertain or measure whether it is being accomplished. The notion of in-process quality
control helps the student understand why we generate the data that we use to construct
control charts. Once the student is in the work world, it will be incumbent upon him/her to
determine what measurements should be taken and monitored. A discussion on what types of
measurements can be garnered in a particular business setting might be worthy of some class
time. For example, if a hospital lab wants to improve quality, how would they go about it?
What measurements might be useful? How about a production line of computer chips?
The chapter contains "some important quality concepts". The attempt is to familiarize,
if only in passing, the student with some of the more well-known quality concepts. Included in
the chapter are such things as team-building, benchmarking, just-in-time, reengineering, FMEA,
Six Sigma, and Poka-Yoke, all of which can affect the types of measurements being taken and the
statistical techniques being used. It is a disservice to send students into the business world
armed with statistical techniques such as acceptance sampling and control charts but with their
heads in the sand about how the techniques fit into the total quality picture.
Chapter 18 contains a section on process analysis. Improving quality usually involves an
investigation of the process from which the product emerges. The most obvious example of a
process is a manufacturing assembly line. However, even in most service industries such as
insurance, banking, or healthcare there are processes. A useful class activity might be to
brainstorm about what kind of process is involved in a person buying gasoline for a car,
checking into a hospital, or purchasing a health club membership. Think about it from the
company's perspective. What activities must occur in order for a person to get the car filled
up?
In analyzing processes, we first discuss the construction of flowcharts. Flowcharting can
be very beneficial in identifying activities and flows that need to be studied for quality
improvement. One very important outcome of a flowchart is the identification of bottlenecks.
You may find out that all applications for employment, for example, must pass across a clerk's
desk, where they sit for several days. This backs up the system and prevents flow. Other process
techniques include fishbone diagrams, Pareto analysis, and control charts.
Page | 826
In this chapter, four types of control charts are presented. Two of the charts, the x̄
chart and the R chart, deal with measurements of product characteristics such as weight, length,
and temperature. The other two charts deal with whether or not items are in compliance
with specifications (p chart) or the number of noncompliances per item (c chart). The c chart is
less widely known and used than the other three. As part of the material on control charts, a
discussion on variation is presented. Variation is one of the main concerns of quality control. A
discussion on various types of variation that can occur in a business setting can be profitable in
helping the student understand why particular measurements are charted and controlled.
CHAPTER OUTLINE
18.1 Introduction to Quality Control
What is Quality Control?
Total Quality Management
Some Important Quality Concepts
Benchmarking
Just-in-Time Systems
Reengineering
Failure Mode and Effects Analysis (FMEA)
Poka-Yoke
Six Sigma
Design for Six Sigma
Lean Manufacturing
Team Building
18.2 Process Analysis
Flowcharts
Pareto Analysis
Cause-and-Effect (Fishbone) Diagrams
Control Charts
Check Sheets
Histogram
Scatter Chart
18.3 Control Charts
Variation
Types of Control Charts
x Chart
R Charts
p Charts
c Charts
Interpreting Control Charts
18.4 Acceptance Sampling
Single Sample Plan
Double-Sample Plan
Multiple-Sample Plan
Determining Error and OC Curves
KEY TERMS
Acceptance Sampling p Chart
After-Process Quality Control Pareto Analysis
Benchmarking Pareto Chart
c Chart Poka-Yoke
Cause-and-Effect Diagram Process
Centerline Producer's Risk
Check Sheet Product Quality
Consumer's Risk Quality
Control Chart Quality Circle
Design for Six Sigma Quality Control
Double-Sample Plan R Chart
Failure Mode and Effects Analysis Reengineering
Fishbone Diagram Scatter Chart
Flowchart Single-Sample Plan
Histogram Six Sigma
In-Process Quality Control Team Building
Ishikawa Diagram Total Quality Management
Just-in-Time Inventory Systems Transcendent Quality
Lean Manufacturing Upper Control Limit (UCL)
Lower Control Limit (LCL) User Quality
Manufacturing Quality Value Quality
Multiple-Sample Plan x Chart
Operating Characteristic (OC) Curve
SOLUTIONS TO PROBLEMS IN CHAPTER 18
18.2 Complaint Number % of Total
Busy Signal 420 56.45
Too long a Wait 184 24.73
Could not get through 85 11.42
Get Disconnected 37 4.97
Transferred to the Wrong Person 10 1.34
Poor Connection 8 1.08
Total 744 99.99
18.4 x̄1 = 27.00, x̄2 = 24.29, x̄3 = 25.29, x̄4 = 27.71, x̄5 = 25.86
R1 = 8, R2 = 8, R3 = 9, R4 = 7, R5 = 6
x̿ = 26.03, R̄ = 7.6
For x̄ Chart: Since n = 7, A2 = 0.419
Centerline: x̿ = 26.03
UCL: x̿ + A2·R̄ = 26.03 + (0.419)(7.6) = 29.21
LCL: x̿ - A2·R̄ = 26.03 - (0.419)(7.6) = 22.85
For R Chart: Since n = 7, D3 = 0.076, D4 = 1.924
Centerline: R̄ = 7.6
UCL: D4·R̄ = (1.924)(7.6) = 14.62
LCL: D3·R̄ = (0.076)(7.6) = 0.58
x Chart:
R Chart:
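The x̄- and R-chart limits in 18.4 follow a fixed recipe: x̿ ± A2·R̄ for the means chart, and D3·R̄ and D4·R̄ for the ranges chart. The sketch below is illustrative only and not part of the original solution; the variable names are mine, and A2, D3, D4 are the standard Shewhart control-chart factors for subgroups of size 7.

```python
# Verify the 18.4 x-bar and R chart limits.
A2, D3, D4 = 0.419, 0.076, 1.924        # control-chart factors for n = 7

xbars = [27.00, 24.29, 25.29, 27.71, 25.86]   # sample means
ranges = [8, 8, 9, 7, 6]                       # sample ranges

grand_mean = sum(xbars) / len(xbars)           # x-double-bar
r_bar = sum(ranges) / len(ranges)              # average range

ucl_x = grand_mean + A2 * r_bar
lcl_x = grand_mean - A2 * r_bar
ucl_r = D4 * r_bar
lcl_r = D3 * r_bar

print(round(ucl_x, 2), round(lcl_x, 2))   # 29.21 22.85
print(round(ucl_r, 2), round(lcl_r, 2))   # 14.62 0.58
```

The same few lines of arithmetic check every x̄/R problem in the chapter once the factors for the relevant subgroup size are substituted.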
18.5 x̄1 = 4.55, x̄2 = 4.10, x̄3 = 4.80, x̄4 = 4.70,
x̄5 = 4.30, x̄6 = 4.73, x̄7 = 4.38
R1 = 1.3, R2 = 1.0, R3 = 1.3, R4 = 0.2, R5 = 1.1, R6 = 0.8, R7 = 0.6
x̿ = 4.51, R̄ = 0.90
For x̄ Chart: Since n = 4, A2 = 0.729
Centerline: x̿ = 4.51
UCL: x̿ + A2·R̄ = 4.51 + (0.729)(0.90) = 5.17
LCL: x̿ - A2·R̄ = 4.51 - (0.729)(0.90) = 3.85
For R Chart: Since n = 4, D3 = 0, D4 = 2.282
Centerline: R̄ = 0.90
UCL: D4·R̄ = (2.282)(0.90) = 2.05
LCL: D3·R̄ = 0
x Chart:
R Chart:
18.6 p̂1 = .02, p̂2 = .07, p̂3 = .04, p̂4 = .03, p̂5 = .03,
p̂6 = .05, p̂7 = .02, p̂8 = .00, p̂9 = .01, p̂10 = .06
p̄ = .033
Centerline: p̄ = .033
UCL: .033 + 3·√[(.033)(.967)/100] = .033 + .054 = .087
LCL: .033 - 3·√[(.033)(.967)/100] = .033 - .054 = .000
p Chart:
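The p-chart limits in 18.6 are p̄ ± 3·√[p̄(1 − p̄)/n], with a negative lower limit truncated at zero. A minimal sketch (illustrative only; variable names are mine):

```python
import math

# Sample proportions from 18.6; each sample contains n = 100 items.
p_hats = [.02, .07, .04, .03, .03, .05, .02, .00, .01, .06]
n = 100

p_bar = sum(p_hats) / len(p_hats)               # centerline, .033
spread = 3 * math.sqrt(p_bar * (1 - p_bar) / n)

ucl = p_bar + spread
lcl = max(p_bar - spread, 0.0)   # a negative LCL is truncated to zero

print(round(ucl, 3), round(lcl, 3))   # 0.087 0.0
```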
18.7 p̂1 = .025, p̂2 = .000, p̂3 = .025, p̂4 = .075,
p̂5 = .05, p̂6 = .125, p̂7 = .05
p̄ = .050
Centerline: p̄ = .050
UCL: .05 + 3·√[(.05)(.95)/40] = .05 + .1034 = .1534
LCL: .05 - 3·√[(.05)(.95)/40] = .05 - .1034 = .000
p Chart:
18.8 c̄ = 22/35 = 0.62857
Centerline: c̄ = 0.62857
UCL: c̄ + 3·√c̄ = 0.62857 + 3·√0.62857 = 0.62857 + 2.37847 = 3.00704
LCL: c̄ - 3·√c̄ = 0.62857 - 2.37847 = .000
c Chart:
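Because the nonconformance count per item is treated as Poisson, the c-chart limits are c̄ ± 3·√c̄. A quick check of the 18.8 arithmetic (a sketch, not part of the original solution; small differences in the last digit come from when c̄ is rounded):

```python
import math

total_nonconformances = 22   # across 35 inspected items (problem 18.8)
items = 35

c_bar = total_nonconformances / items   # centerline
half = 3 * math.sqrt(c_bar)             # Poisson-based 3-sigma spread

ucl = c_bar + half
lcl = max(c_bar - half, 0.0)            # truncate a negative LCL at zero

print(round(c_bar, 5))   # 0.62857
```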
18.9 c̄ = 43/32 = 1.34375
Centerline: c̄ = 1.34375
UCL: c̄ + 3·√c̄ = 1.34375 + 3.47761 = 4.82136
LCL: c̄ - 3·√c̄ = 1.34375 - 3.47761 = 0.000
c Chart:
18.10 a.) Six or more consecutive points are decreasing. Two of three
consecutive points are in the outer one-third (near LCL). Four out of
five points are in the outer two-thirds (near LCL).
b.) This is a relatively healthy control chart with no obvious rule violations.
c.) One point is above the UCL. Two out of three consecutive points are in
the outer one-third (both near LCL and near UCL). There are six
consecutive increasing points.
18.11 While there are no points outside the limits, the first chart exhibits some
problems. The chart ends with 9 consecutive points below the centerline.
Of these 9 consecutive points, there are at least 4 out of 5 in the outer 2/3 of the lower
region. The second control chart contains no points outside the control limit. However,
near the end, there are 8 consecutive points above the centerline. The p chart contains
no points outside the upper control limit. Three times, the chart contains two out of
three points in the outer third. However, this occurs in the lower third where the
proportion of noncompliance items approaches zero and is probably not a problem to
be concerned about. Overall, this seems to display a process that is in control. One
concern might be the wide swings in the proportions at samples 15, 16 and 22 and 23.
18.12 For the first sample:
If x1 > 4, then reject
If x1 < 2, then accept
If 2 ≤ x1 ≤ 4, then take a second sample
For the second sample, c2 = 3:
If x1 + x2 ≤ c2, then accept
If x1 + x2 > c2, then reject
But x1 = 2 and x2 = 2, so x1 + x2 = 4 > 3
Reject the lot because x1 + x2 = 4 > c2 = 3
This is a double-sample acceptance plan.
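The double-sampling rules of 18.12 translate directly into code. This sketch hard-codes the plan's numbers (accept outright when the first-sample count is below 2, reject above 4, otherwise judge the combined count against c2 = 3); the function and parameter names are mine.

```python
def double_sample_decision(x1, x2=None, accept1=2, reject1=4, c2=3):
    """Apply the 18.12 double-sampling plan to observed defect counts."""
    if x1 < accept1:          # few enough defects: accept outright
        return "accept"
    if x1 > reject1:          # too many defects: reject outright
        return "reject"
    if x2 is None:            # in between: a second sample is required
        return "second sample needed"
    # The combined count is judged against the second acceptance number c2.
    return "accept" if x1 + x2 <= c2 else "reject"

print(double_sample_decision(2))      # second sample needed
print(double_sample_decision(2, 2))   # reject, since 2 + 2 = 4 > 3
```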
18.13 n = 10, c = 0, p0 = .05
P(x = 0) = 10C0(.05)^0(.95)^10 = .5987
1 - P(x = 0) = 1 - .5987 = .4013
The producer's risk is .4013
p1 = .14
P(x = 0) = 10C0(.14)^0(.86)^10 = .2213
The consumer's risk is .2213
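The producer's and consumer's risks in 18.13 are plain binomial calculations: the lot is accepted when the number of defectives found is at most c. A sketch (the helper name is mine):

```python
from math import comb

def p_accept(n, c, p):
    """P(lot accepted) = P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

n, c = 10, 0
producers_risk = 1 - p_accept(n, c, 0.05)   # good lot (p0 = .05) rejected
consumers_risk = p_accept(n, c, 0.14)       # bad lot (p1 = .14) accepted

print(round(producers_risk, 4), round(consumers_risk, 4))   # 0.4013 0.2213
```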
18.14 n = 12, c = 1, p0 = .04
Producer's Risk = 1 - [P(x = 0) + P(x = 1)] =
1 - [12C0(.04)^0(.96)^12 + 12C1(.04)^1(.96)^11] =
1 - [.6127 + .30635] = 1 - .91905 = .08095
p1 = .15
Consumer's Risk = P(x = 0) + P(x = 1) =
12C0(.15)^0(.85)^12 + 12C1(.15)^1(.85)^11 = .14224 + .30122 = .44346
18.15 n = 8, c = 0, p0 = .03, p1 = .10

p     Probability
.01     .9227
.02     .8506
.03     .7837
.04     .7214
.05     .6634
.06     .6096
.07     .5596
.08     .5132
.09     .4703
.10     .4305
.11     .3937
.12     .3596
.13     .3282
.14     .2992
.15     .2725

Producer's Risk for (p0 = .03) = 1 - .7837 = .2163
Consumer's Risk for (p1 = .10) = .4305
OC Chart:
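The OC table above is simply the acceptance probability evaluated over a grid of p values. This sketch (names mine) regenerates the 18.15 column for n = 8, c = 0:

```python
from math import comb

def p_accept(n, c, p):
    """Probability that a lot with defective rate p passes the plan."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

n, c = 8, 0
oc_curve = {round(0.01 * i, 2): round(p_accept(n, c, 0.01 * i), 4)
            for i in range(1, 16)}

print(oc_curve[0.03])   # 0.7837 -> producer's risk 1 - .7837 = .2163
print(oc_curve[0.10])   # 0.4305 -> consumer's risk
```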
18.16 n = 11, c = 1, p0 = .08, p1 = .20

p     Probability
.02     .9805
.04     .9308
.06     .8618
.08     .7819
.10     .6974
.12     .6127
.14     .5311
.16     .4547
.18     .3849
.20     .3221
.22     .2667
.24     .2186

Producer's Risk for (p0 = .08) = 1 - .7819 = .2181
Consumer's Risk for (p1 = .20) = .3221
OC Chart:
18.17 The solution is a flowchart running from Start through activities labeled A through N, with yes/no decision branches leading to Stop; the diagram does not survive in text form.
18.18 Problem   Frequency   Percent of Total
       1           673          26.96
       2            29           1.16
       3           108           4.33
       4           379          15.18
       5            73           2.92
       6           564          22.60
       7            12           0.48
       8           402          16.11
       9            54           2.16
      10           202           8.09
    Total         2496
Pareto Chart:
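Pareto analysis ranks the problem categories by frequency so attention goes to the vital few. Applied to the 18.18 data (a sketch; the category numbers are the problem labels from the table):

```python
# Frequencies by problem category, from 18.18.
freqs = {1: 673, 2: 29, 3: 108, 4: 379, 5: 73,
         6: 564, 7: 12, 8: 402, 9: 54, 10: 202}
total = sum(freqs.values())   # 2496

# Rank categories from most to least frequent (the Pareto ordering).
ranked = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)

cumulative = 0.0
for category, count in ranked[:4]:
    cumulative += 100 * count / total
    print(category, count, round(cumulative, 2))
# Categories 1, 6, 8, and 4 alone account for about 81% of all problems.
```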
18.19 Fishbone Diagram:
18.20 a) n = 13, c = 1, p0 = .05, p1 = .12
Under p0 = .05, the probability of acceptance is:
13C0(.05)^0(.95)^13 + 13C1(.05)^1(.95)^12 =
(1)(1)(.51334) + (13)(.05)(.54036) = .51334 + .35123 = .86457
The probability of being rejected = 1 - .86457 = .13543
Under p1 = .12, the probability of acceptance is:
13C0(.12)^0(.88)^13 + 13C1(.12)^1(.88)^12 =
(1)(1)(.18979) + (13)(.12)(.21567) = .18979 + .33645 = .52624
The probability of being rejected = 1 - .52624 = .47376
b) n = 20, c = 2, p0 = .03
The probability of acceptance is:
20C0(.03)^0(.97)^20 + 20C1(.03)^1(.97)^19 + 20C2(.03)^2(.97)^18 =
(1)(1)(.54379) + (20)(.03)(.56061) + (190)(.0009)(.57795) =
.54379 + .33637 + .09883 = .97899
The probability of being rejected = 1 - .97899 = .02101,
which is the producer's risk.
18.21 p̂1 = .06, p̂2 = .22, p̂3 = .14, p̂4 = .04, p̂5 = .10,
p̂6 = .16, p̂7 = .00, p̂8 = .18, p̂9 = .02, p̂10 = .12
p̄ = 52/500 = .104
Centerline: p̄ = .104
UCL: .104 + 3·√[(.104)(.896)/50] = .104 + .130 = .234
LCL: .104 - 3·√[(.104)(.896)/50] = .104 - .130 = .000
p Chart:
18.22 x̄1 = 24.022, x̄2 = 24.048, x̄3 = 23.996, x̄4 = 24.000,
x̄5 = 23.998, x̄6 = 24.018, x̄7 = 24.000, x̄8 = 24.034,
x̄9 = 24.014, x̄10 = 24.002, x̄11 = 24.012, x̄12 = 24.022
R1 = .06, R2 = .09, R3 = .08, R4 = .03, R5 = .05, R6 = .05,
R7 = .05, R8 = .08, R9 = .03, R10 = .01, R11 = .04, R12 = .05
x̿ = 24.01383, R̄ = 0.05167
For x̄ Chart: Since n = 12, A2 = .266
Centerline: x̿ = 24.01383
UCL: x̿ + A2·R̄ = 24.01383 + (0.266)(.05167) = 24.01383 + .01374 = 24.02757
LCL: x̿ - A2·R̄ = 24.01383 - (0.266)(.05167) = 24.01383 - .01374 = 24.00009
For R Chart: Since n = 12, D3 = .284, D4 = 1.716
Centerline: R̄ = .05167
UCL: D4·R̄ = (1.716)(.05167) = .08866
LCL: D3·R̄ = (.284)(.05167) = .01467
x Chart:
R Chart:
18.23 n = 15, c = 0, p0 = .02, p1 = .10

p     Probability
.01     .8601
.02     .7386
.04     .5421
.06     .3953
.08     .2863
.10     .2059
.12     .1470
.14     .1041

Producer's Risk for (p0 = .02) = 1 - .7386 = .2614
Consumer's Risk for (p1 = .10) = .2059
OC Curve:
18.24 c̄ = 77/36 = 2.13889
Centerline: c̄ = 2.13889
UCL: c̄ + 3·√c̄ = 2.13889 + 4.38748 = 6.52637
LCL: c̄ - 3·√c̄ = 2.13889 - 4.38748 = .00000
c Chart:
18.25 x̄1 = 1.2100, x̄2 = 1.2050, x̄3 = 1.1900, x̄4 = 1.1725,
x̄5 = 1.2075, x̄6 = 1.2025, x̄7 = 1.1950, x̄8 = 1.1950, x̄9 = 1.1850
R1 = .04, R2 = .02, R3 = .04, R4 = .04, R5 = .06, R6 = .02,
R7 = .07, R8 = .07, R9 = .06
x̿ = 1.19583, R̄ = 0.04667
For x̄ Chart: Since n = 4, A2 = .729
Centerline: x̿ = 1.19583
UCL: x̿ + A2·R̄ = 1.19583 + .729(.04667) = 1.19583 + .03402 = 1.22985
LCL: x̿ - A2·R̄ = 1.19583 - .729(.04667) = 1.19583 - .03402 = 1.16181
For R Chart: Since n = 9, D3 = .184, D4 = 1.816
Centerline: R̄ = .04667
UCL: D4·R̄ = (1.816)(.04667) = .08475
LCL: D3·R̄ = (.184)(.04667) = .00859
x Chart:
R chart:
18.26 x̄1 = 14.99333, x̄2 = 15.00000, x̄3 = 14.97833, x̄4 = 14.97833,
x̄5 = 15.01333, x̄6 = 15.00000, x̄7 = 15.01667, x̄8 = 14.99667
R1 = .03, R2 = .07, R3 = .05, R4 = .05,
R5 = .04, R6 = .05, R7 = .05, R8 = .06
x̿ = 14.99854, R̄ = 0.05
For x̄ Chart: Since n = 6, A2 = .483
Centerline: x̿ = 14.99854
UCL: x̿ + A2·R̄ = 14.99854 + .483(.05) = 14.99854 + .02415 = 15.02269
LCL: x̿ - A2·R̄ = 14.99854 - .483(.05) = 14.99854 - .02415 = 14.97439
For R Chart: Since n = 6, D3 = 0, D4 = 2.004
Centerline: R̄ = .05
UCL: D4·R̄ = 2.004(.05) = .1002
LCL: D3·R̄ = 0(.05) = .0000
x Chart:
R chart:
18.27 p̂1 = .12, p̂2 = .04, p̂3 = .00, p̂4 = .02667,
p̂5 = .09333, p̂6 = .18667, p̂7 = .14667, p̂8 = .10667,
p̂9 = .06667, p̂10 = .05333, p̂11 = .00000, p̂12 = .09333
p̄ = 70/900 = .07778
Centerline: p̄ = .07778
UCL: .07778 + 3·√[(.07778)(.92222)/75] = .07778 + .09278 = .17056
LCL: .07778 - 3·√[(.07778)(.92222)/75] = .07778 - .09278 = .00000
p Chart:
18.28 c̄ = 16/25 = 0.64
Centerline: c̄ = 0.64
UCL: c̄ + 3·√c̄ = 0.64 + 3·√0.64 = 0.64 + 2.4 = 3.04
LCL: c̄ - 3·√c̄ = 0.64 - 2.4 = .00000
c Chart:
18.29 n = 10, c = 2, p0 = .10, p1 = .30

p     Probability
.05     .9885
.10     .9298
.15     .8202
.20     .6778
.25     .5256
.30     .3828
.35     .2616
.40     .1673
.45     .0996
.50     .0547

Producer's Risk for (p0 = .10) = 1 - .9298 = .0702
Consumer's Risk for (p1 = .30) = .3828
18.30 c̄ = 81/40 = 2.025
Centerline: c̄ = 2.025
UCL: c̄ + 3·√c̄ = 2.025 + 4.26907 = 6.29407
LCL: c̄ - 3·√c̄ = 2.025 - 4.26907 = .00000
c Chart:
18.31 p̂1 = .05, p̂2 = .00, p̂3 = .15, p̂4 = .075, p̂5 = .025,
p̂6 = .025, p̂7 = .125, p̂8 = .00, p̂9 = .10, p̂10 = .075,
p̂11 = .05, p̂12 = .05, p̂13 = .15, p̂14 = .025, p̂15 = .000
p̄ = 36/600 = .06
Centerline: p̄ = .06
UCL: .06 + 3·√[(.06)(.94)/40] = .06 + .11265 = .17265
LCL: .06 - 3·√[(.06)(.94)/40] = .06 - .11265 = .00000
p Chart:
18.32 The process appears to be in control. Only one sample mean is beyond the outer limits
(97% of the means are within the limits). There are no more than four means in a row
on one side of the centerline. There are no more than five consecutive points
decreasing or three consecutive points increasing. About 2/3 (67%) of the points are
within the inner 1/3 of the confidence bands (± 1σx̄).
18.33 There are some items to be concerned about with this chart. Only one sample range is
above the upper control limit. However, near the beginning of the chart there are eight
sample ranges in a row below the centerline. Later in the run, there are nine sample
ranges in a row above the centerline. The quality manager or operator might want to
determine if there is some systematic reason why there is a string of ranges below the
centerline and, perhaps more importantly, why there are a string of ranges above the
centerline.
18.34 This p chart reveals that two of the sixty samples (about 3%) produce proportions that
are too large. Nine of the sixty samples (15%) produce proportions large enough to be
more than 1σp above the centerline. In general, this chart indicates a process that is
under control.
18.35 The centerline of the c chart indicates that the process is averaging 0.74
nonconformances per part. Twenty-five of the fifty sampled items have zero
nonconformances. None of the samples exceed the upper control limit for
nonconformances. However, the upper control limit is 3.321 nonconformances, which,
in and of itself, may be too many. Indeed, three of the fifty samples (6%) actually had
three nonconformances. An additional six samples (12%) had two nonconformances.
One matter of concern may be that there is a run of ten samples in which nine of the
samples exceed the centerline (samples 12 through 21). The question raised by this
phenomenon is whether or not there is a systematic flaw in the process that produces
strings of nonconforming items.
Chapter 19
Decision Analysis
LEARNING OBJECTIVES
Chapter 19 describes how to use decision analysis to improve management decisions, thereby
enabling you to:
1. Learn about decision making under certainty, under uncertainty, and under risk.
2. Learn several strategies for decision-making under uncertainty, including expected
payoff, expected opportunity loss, maximin, maximax, and minimax regret.
3. Learn how to construct and analyze decision trees.
4. Understand aspects of utility theory.
5. Learn how to revise probabilities with sample information.
CHAPTER TEACHING STRATEGY
The notion of contemporary decision making is built into the title of the text as a
statement of the importance of recognizing that statistical analysis is primarily done as a
decision-making tool. For the vast majority of students, statistics takes on importance only
insofar as it aids decision-makers in weighing alternative pathways and helps the
manager make the best possible determination. It has been an underlying theme from chapter
1 that the techniques presented should be considered in a decision-making context. This
chapter focuses on analyzing the decision-making situation and presents several alternative
techniques for analyzing decisions under varying conditions.
Early in the chapter, the concepts of decision alternatives, the states of nature, and the
payoffs are presented. It is important that decision makers spend time brainstorming about
possible decision alternatives that might be available to them. Sometimes the best alternatives
are not obvious and are not immediately considered. The international focus on foreign
companies investing in the U.S. presents a scenario in which there are several possible
alternatives available. By using cases such as the Fletcher-Terry case at the chapter's end,
students can practice enumerating possible decision alternatives.
States of nature are possible environments, over which we have no control, within which
the outcomes will occur. These include such things as the economy, the weather, the health of
the CEO, wildcat strikes, competition, changes in consumer demand, etc. While the text presents
problems with only a few states of nature in order to keep the length of solution reasonable,
students should learn to consider as many states of nature as possible in decision making.
Determining payoffs is relatively difficult but essential in the analysis of decision alternatives.
Decision-making under uncertainty is the situation in which the outcomes are not
known and there are no probabilities given as to the likelihood of them occurring. With these
techniques, the emphasis is whether or not the approach is optimistic, pessimistic, or weighted
somewhere in between.
In making decisions under risk, the probabilities of each state of nature occurring are
known or are estimated. Decision trees are introduced as an alternative mechanism for
displaying the problem. The idea of an expected monetary value is that if this decision process
were to continue with the same parameters for a long time, what would the long-run average
outcome be? Some decisions lend themselves to long-run average analysis such as gambling
outcomes or insurance actuarial analysis. Other decisions, such as building a domed stadium
downtown or drilling one oil well, tend to be one-time activities and may not lend
themselves as nicely to expected value analysis. It is important that the student understand
that expected value outcomes are long-run averages and probably will not occur in single
instance decisions.
Utility is introduced more as a concept than an analytic technique. The
idea here is to aid the decision-maker in determining whether he/she tends to be more of a
risk-taker, an EMV'er, or risk-averse. The answer might be that it depends on the matter over which the
decision is being made. One might be a risk-taker on attempting to employ a more diverse work
force and at the same time be more risk-averse in investing the company's retirement fund.
CHAPTER OUTLINE
19.1 The Decision Table and Decision Making Under Certainty
Decision Table
Decision-Making Under Certainty
19.2 Decision Making Under Uncertainty
Maximax Criterion
Maximin Criterion
Hurwicz Criterion
Minimax Regret
19.3 Decision Making Under Risk
Decision Trees
Expected Monetary Value (EMV)
Expected Value of Perfect Information
Utility
19.4 Revising Probabilities in Light of Sample Information
Expected Value of Sample Information
KEY TERMS
Decision Alternatives Hurwicz Criterion
Decision Analysis Maximax Criterion
Decision Making Under Certainty Maximin Criterion
Decision Making Under Risk Minimax Regret
Decision Making Under Uncertainty Opportunity Loss Table
Decision Table Payoffs
Decision Trees Payoff Table
EMV'er Risk-Avoider
Expected Monetary Value (EMV) Risk-Taker
Expected Value of Perfect Information States of Nature
Expected Value of Sample Information Utility
SOLUTIONS TO PROBLEMS IN CHAPTER 19
19.1        S1    S2    S3    Max   Min
     d1    250   175   -25    250   -25
     d2    110   100    70    110    70
     d3    390   140   -80    390   -80

a.) Maximax: Max {250, 110, 390} = 390. Decision: Select d3
b.) Maximin: Max {-25, 70, -80} = 70. Decision: Select d2
c.) For α = .3:
d1: .3(250) + .7(-25) = 57.5
d2: .3(110) + .7(70) = 82
d3: .3(390) + .7(-80) = 61
Decision: Select d2
For α = .8:
d1: .8(250) + .2(-25) = 195
d2: .8(110) + .2(70) = 102
d3: .8(390) + .2(-80) = 296
Decision: Select d3
Comparing the results for the two values of alpha: with a more pessimistic
point of view (α = .3), the decision is to select d2 and the payoff is 82. Selecting with a
more optimistic point of view (α = .8) results in choosing d3 with a higher payoff of 296.
d.) The opportunity loss table is:
            S1    S2    S3    Max
     d1    140     0    95    140
     d2    280    75     0    280
     d3      0    35   150    150
The minimax regret = Min {140, 280, 150} = 140
Decision: Select d1 to minimize the regret.
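The four criteria applied in 19.1 can each be expressed in a few lines. The sketch below runs them over the same payoff table; it is illustrative only, and the variable names are mine.

```python
payoffs = {"d1": [250, 175, -25],
           "d2": [110, 100, 70],
           "d3": [390, 140, -80]}   # rows: alternatives; columns: S1, S2, S3

maximax = max(payoffs, key=lambda d: max(payoffs[d]))   # best best-case
maximin = max(payoffs, key=lambda d: min(payoffs[d]))   # best worst-case

def hurwicz(alpha):
    """Weighted blend of optimism (row max) and pessimism (row min)."""
    scores = {d: alpha * max(v) + (1 - alpha) * min(v)
              for d, v in payoffs.items()}
    return max(scores, key=scores.get)

# Minimax regret: regret = column best minus the payoff actually received.
best_by_state = [max(col) for col in zip(*payoffs.values())]
max_regret = {d: max(b - v for b, v in zip(best_by_state, row))
              for d, row in payoffs.items()}
minimax_regret = min(max_regret, key=max_regret.get)

print(maximax, maximin, hurwicz(.3), hurwicz(.8), minimax_regret)
# d3 d2 d2 d3 d1
```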
19.2        S1    S2    S3    S4    Max   Min
     d1     50    70   120   110    120    50
     d2     80    20    75   100    100    20
     d3     20    45    30    60     60    20
     d4    100    85   -30   -20    100   -30
     d5      0   -10    65    80     80   -10

a.) Maximax = Max {120, 100, 60, 100, 80} = 120. Decision: Select d1
b.) Maximin = Max {50, 20, 20, -30, -10} = 50. Decision: Select d1
c.) α = .5
Max {[.5(120)+.5(50)], [.5(100)+.5(20)], [.5(60)+.5(20)],
[.5(100)+.5(-30)], [.5(80)+.5(-10)]} = Max {85, 60, 40, 35, 35} = 85
Decision: Select d1
d.) Opportunity Loss Table:
            S1    S2    S3    S4    Max
     d1     50    15     0     0     50
     d2     20    65    45    10     65
     d3     80    40    90    50     90
     d4      0     0   150   130    150
     d5    100    95    55    30    100
Min {50, 65, 90, 150, 100} = 50
Decision: Select d1
19.3        R     D     I    Max   Min
     A     60    15   -25     60   -25
     B     10    25    30     30    10
     C    -10    40    15     40   -10
     D     20    25     5     25     5

Maximax = Max {60, 30, 40, 25} = 60. Decision: Select A
Maximin = Max {-25, 10, -10, 5} = 10. Decision: Select B
19.4          Not    Somewhat   Very    Max    Min
None          -50       -50      -50    -50    -50
Few          -200       300      400    400   -200
Many         -600       100     1000   1000   -600

a.) For the Hurwicz criterion using α = .6:
Max {[.6(-50) + .4(-50)], [.6(400) + .4(-200)],
[.6(1000) + .4(-600)]} = Max {-50, 160, 360} = 360
Decision: Select "Many"
b.) Opportunity Loss Table:
              Not    Somewhat   Very    Max
None            0       350     1050   1050
Few           150         0      600    600
Many          550       200        0    550
Minimax regret = Min {1050, 600, 550} = 550
Decision: Select "Many"
19.5, 19.6 (The solutions are decision-tree diagrams, which do not reproduce in text form.)
19.7 Expected Payoff with Perfect Information =
75(.15) + 50(.25) + 20(.30) + 8(.10) + 6(.20) = 31.75
Expected Value of Perfect Information = 31.75 - 25.25 = 6.50
19.8 a.) & b.)
c.) Expected Payoff with Perfect Information =
150(.40) + 450(.35) + 700(.25) = 392.5
Expected Value of Perfect Information = 392.5 - 370 = 22.50
19.9            Down(.30)   Up(.65)   No Change(.05)   EMV
Lock-In           -150        200           0           85
No                 175       -250           0         -110

Decision: Based on the highest EMV (85), "Lock-In"
Expected Payoff with Perfect Information =
175(.30) + 200(.65) + 0(.05) = 182.5
Expected Value of Perfect Information = 182.5 - 85 = 97.5
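The 19.9 calculation generalizes directly: EMV weights each row by the state probabilities, while the payoff with perfect information takes the best entry in each state column. A sketch (names mine):

```python
probs = [.30, .65, .05]                  # P(Down), P(Up), P(No Change)
payoffs = {"Lock-In": [-150, 200, 0],
           "No":      [175, -250, 0]}

# EMV of each alternative: probability-weighted row sum.
emv = {d: sum(p * v for p, v in zip(probs, row)) for d, row in payoffs.items()}
best_emv = max(emv.values())             # 85, from "Lock-In"

# With perfect information, always take the best payoff in each state.
eppi = sum(p * max(col) for p, col in zip(probs, zip(*payoffs.values())))
evpi = eppi - best_emv

print(round(best_emv, 2), round(eppi, 2), round(evpi, 2))   # 85.0 182.5 97.5
```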
19.10                  EMV
No Layoff             -960
Layoff 1000           -320
Layoff 5000            400

Decision: Based on maximum EMV (400), Layoff 5000
Expected Payoff with Perfect Information =
100(.10) + 300(.40) + 600(.50) = 430
Expected Value of Perfect Information = 430 - 400 = 30
19.11 a.) EMV = 200,000(.5) + (-50,000)(.5) = 75,000
b.) Risk avoider, because the EMV is more than the investment (75,000 > 50,000)
c.) You would have to offer more than 75,000, which is the expected value.
Page | 888
19.12 a.)       S1(.30)   S2(.70)   EMV
       d1         350      -100      35
       d2        -200       325     167.5

Decision: Based on EMV, maximum {35, 167.5} = 167.5. Select d2
b. & c.) For Forecast S1:
State   Prior   Cond.   Joint   Revised
S1       .30     .90     .270    .6067
S2       .70     .25     .175    .3933
                P(FS1) = .445
For Forecast S2:
State   Prior   Cond.   Joint   Revised
S1       .30     .10     .030    .054
S2       .70     .75     .525    .946
                P(FS2) = .555
EMV with Sample Information = 241.63
d.) Value of Sample Information = 241.63 - 167.5 = 74.13
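The revision in 19.12 is Bayes' rule applied once per forecast, and the expected value with sample information then weights the best conditional EMV by each forecast's probability. A sketch of parts b–d (the structure and names are mine; the numbers are from the problem):

```python
priors = {"S1": .30, "S2": .70}
# P(forecast | state) for the two possible forecasts.
conditionals = {"F_S1": {"S1": .90, "S2": .25},
                "F_S2": {"S1": .10, "S2": .75}}
payoffs = {"d1": {"S1": 350, "S2": -100},
           "d2": {"S1": -200, "S2": 325}}

evwsi = 0.0
for f, likes in conditionals.items():
    joint = {s: priors[s] * likes[s] for s in priors}        # prior x cond.
    p_forecast = sum(joint.values())                         # P(forecast)
    revised = {s: j / p_forecast for s, j in joint.items()}  # Bayes' rule
    # Best decision under the revised (posterior) probabilities:
    best = max(sum(revised[s] * payoffs[d][s] for s in priors)
               for d in payoffs)
    evwsi += p_forecast * best

emv_without = 167.5              # best EMV from part a
evsi = evwsi - emv_without
print(evwsi, evsi)               # the text reports these as 241.63 and 74.13
```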
19.13           Dec(.60)   Inc(.40)   EMV
S                 -225        425      35
M                  125       -150      15
L                  350       -400      50

Decision: Based on EMV = Maximum {35, 15, 50} = 50
For Forecast (Decrease):
State      Prior   Cond.   Joint   Revised
Decrease    .60     .75     .45     .8824
Increase    .40     .15     .06     .1176
                   P(FDec) = .51
For Forecast (Increase):
State      Prior   Cond.   Joint   Revised
Decrease    .60     .25     .15     .3061
Increase    .40     .85     .34     .6939
                   P(FInc) = .49
The expected value with sampling is 244.275
EVSI = EVWS - EMV = 244.275 - 50 = 194.275
19.14           Decline(.20)   Same(.30)   Increase(.50)   EMV
Don't Plant          20            0           -40          -16
Small               -90           10           175          72.5
Large              -600         -150           800         235

Decision: Based on Maximum EMV =
Max {-16, 72.5, 235} = 235, plant a large tree farm
For forecast decrease:
State      Prior   Cond.   Joint   Revised
Decline     .20     .70     .140    .8974
Same        .30     .02     .006    .0385
Increase    .50     .02     .010    .0641
                   P(Fdec) = .156
For forecast same:
State      Prior   Cond.   Joint   Revised
Decline     .20     .25     .050    .1333
Same        .30     .95     .285    .7600
Increase    .50     .08     .040    .1067
                   P(Fsame) = .375
For forecast increase:
State      Prior   Cond.   Joint   Revised
Decline     .20     .05     .010    .0213
Same        .30     .03     .009    .0192
Increase    .50     .90     .450    .9595
                   P(Finc) = .469
The Expected Value with Sampling Information is 360.413
EVSI = EVWSI - EMV = 360.413 - 235 = 125.413
19.15           Oil(.11)   No Oil(.89)    EMV
Drill          1,000,000    -100,000    21,000
Don't Drill        0            0           0

Decision: The EMV for this problem is Max {21,000, 0} = 21,000.
The decision is to Drill.

                      Actual
                  Oil     No Oil
Forecast  Oil     .20      .10
       No Oil     .80      .90

Forecast Oil:
State    Prior   Cond.   Joint   Revised
Oil       .11     .20     .022    .1982
No Oil    .89     .10     .089    .8018
                 P(FOil) = .111
Forecast No Oil:
State    Prior   Cond.   Joint   Revised
Oil       .11     .80     .088    .0990
No Oil    .89     .90     .801    .9010
                 P(FNo Oil) = .889
The Expected Value With Sampling Information is 21,012.32
EVSI = EVWSI - EMV = 21,012.32 - 21,000 = 12.32
19.16        S1    S2    Max   Min
     d1      50   100   100    50
     d2     -75   200   200   -75
     d3      25    40    40    25
     d4      75    10    75    10

a.) Maximax: Max {100, 200, 40, 75} = 200. Decision: Select d2
b.) Maximin: Max {50, -75, 25, 10} = 50. Decision: Select d1
c.) Hurwicz with α = .6:
d1: 100(.6) + 50(.4) = 80
d2: 200(.6) + (-75)(.4) = 90
d3: 40(.6) + 25(.4) = 34
d4: 75(.6) + 10(.4) = 49
Max {80, 90, 34, 49} = 90
Decision: Select d2
d.) Opportunity Loss Table:
            S1    S2    Maximum
     d1     25   100    100
     d2    150     0    150
     d3     50   160    160
     d4      0   190    190
Min {100, 150, 160, 190} = 100
Decision: Select d1
19.17
b.) d1: 400(.3) + 250(.25) + 300(.2) + 100(.25) = 267.5
d2: 300(.3) + (-100)(.25) + 600(.2) + 200(.25) = 235
Decision: Select d1
c.) Expected Payoff of Perfect Information:
400(.3) + 250(.25) + 600(.2) + 200(.25) = 352.5
Value of Perfect Information = 352.5 - 267.5 = 85
19.18        S1(.40)   S2(.60)   EMV
     d1       200       150      170
     d2       -75       450      240
     d3       175       125      145

Decision: Based on Maximum EMV = Max {170, 240, 145} = 240. Select d2
Forecast S1:
State   Prior   Cond.   Joint   Revised
S1       .4      .9      .36     .667
S2       .6      .3      .18     .333
                P(FS1) = .54
Forecast S2:
State   Prior   Cond.   Joint   Revised
S1       .4      .1      .04     .087
S2       .6      .7      .42     .913
                P(FS2) = .46
The Expected Value With Sample Information is 285.00
EVSI = EVWSI - EMV = 285 - 240 = 45
19.19          Small   Moderate   Large    Min    Max
Small           200       250       300    200    300
Modest          100       300       600    100    600
Large          -300       400      2000   -300   2000

a.) Maximax: Max {300, 600, 2000} = 2000. Decision: Large number
Maximin: Max {200, 100, -300} = 200. Decision: Small number
b.) Opportunity Loss:
               Small   Moderate   Large    Max
Small             0       150      1700   1700
Modest          100       100      1400   1400
Large           500         0         0    500
Min {1700, 1400, 500} = 500
Decision: Large number
c.) The minimax regret criterion leads to the same decision as maximax.
19.20           No     Low    Fast    Max    Min
Low           -700    -400    1200   1200   -700
Medium        -300    -100     550    550   -300
High           100     125     150    150    100

a.) α = .1:
Low: 1200(.1) + (-700)(.9) = -510
Medium: 550(.1) + (-300)(.9) = -215
High: 150(.1) + 100(.9) = 105
Decision: Price High (105)
b.) α = .5:
Low: 1200(.5) + (-700)(.5) = 250
Medium: 550(.5) + (-300)(.5) = 125
High: 150(.5) + 100(.5) = 125
Decision: Price Low (250)
c.) α = .8:
Low: 1200(.8) + (-700)(.2) = 820
Medium: 550(.8) + (-300)(.2) = 380
High: 150(.8) + 100(.2) = 140
Decision: Price Low (820)
d.) Two of the three alpha values (.5 and .8) lead to a decision of pricing low.
An alpha of .1 suggests pricing high as a strategy. For optimists (high
alphas), pricing low is a better strategy; but for more pessimistic decision
makers, pricing high may be the best strategy.
19.21           Mild(.75)   Severe(.25)   EMV
Reg.              2000        -2500        875
Weekend           1200         -200        850
Not Open          -300          100       -200

Decision: Based on Max EMV =
Max {875, 850, -200} = 875, open regular hours.
Expected Value with Perfect Information =
2000(.75) + 100(.25) = 1525
Value of Perfect Information = 1525 - 875 = 650
19.22             Weaker(.35)   Same(.25)   Stronger(.40)   EMV
Don't Produce        -700         -200           150        -235
Produce              1800          400         -1600          90

Decision: Based on Max EMV = Max {-235, 90} = 90, select Produce.
Expected Payoff With Perfect Information =
1800(.35) + 400(.25) + 150(.40) = 790
Value of Perfect Information = 790 - 90 = 700
19.23           Red.(.15)   Con.(.35)   Inc.(.50)   EMV
Automate         -40,000     -15,000     60,000     18,750
Do Not             5,000      10,000    -30,000    -10,750

Decision: Based on Max EMV =
Max {18,750, -10,750} = 18,750, select Automate
Forecast Reduction:
State   Prior   Cond.   Joint   Revised
R        .15     .60     .090    .60
C        .35     .10     .035    .2333
I        .50     .05     .025    .1667
                P(FRed) = .150
Forecast Constant:
State   Prior   Cond.   Joint   Revised
R        .15     .30     .045    .10
C        .35     .80     .280    .6222
I        .50     .25     .125    .2778
                P(FCons) = .450
Forecast Increase:
State   Prior   Cond.   Joint   Revised
R        .15     .10     .015    .0375
C        .35     .10     .035    .0875
I        .50     .70     .350    .8750
                P(FInc) = .400
Expected Value With Sample Information = 21,425.55
EVSI = EVWSI - EMV = 21,425.55 - 18,750 = 2,675.55
19.24           Chosen(.20)   Not Chosen(.80)   EMV
Build             12,000          -8,000       -4,000
Don't             -1,000           2,000        1,400

Decision: Based on Max EMV = Max {-4,000, 1,400} = 1,400,
choose "Don't Build" as a strategy.
Forecast Chosen:
State        Prior   Cond.   Joint   Revised
Chosen        .20     .45     .090    .2195
Not Chosen    .80     .40     .320    .7805
                     P(FC) = .410
Forecast Not Chosen:
State        Prior   Cond.   Joint   Revised
Chosen        .20     .55     .110    .1864
Not Chosen    .80     .60     .480    .8136
                     P(FNC) = .590
Expected Value With Sample Information = 1,400.09
EVSI = EVWSI - EMV = 1,400.09 - 1,400 = .09