Applied Statistics in Business & Economics: David P. Doane and Lori E. Seward
Vũ Võ
Chapter 3
Describing Data Visually
Chapter Contents
3.1 Stem-and-Leaf Displays and Dot Plots
3.2 Frequency Distributions and Histograms
3.3 Effective Excel Charts
3.4 Line Charts
3.5 Column and Bar Charts
3.6 Pie Charts
3.7 Scatter Plots
3.8 Tables
3.9 Deceptive Graphs
Chapter Learning Objectives
LO3-1: Make a stem-and-leaf or dot plot.
LO3-2: Create a frequency distribution for a data set.
LO3-3: Make a histogram with appropriate bins.
LO3-4: Identify skewness, modal classes, and outliers in
a histogram.
LO3-5: Make an effective line chart.
Chapter Learning Objectives (continued)
LO3-6: Make an effective column chart or bar chart.
LO3-7: Make an effective pie chart.
LO3-8: Make and interpret a scatter plot.
LO3-9: Make simple tables and pivot tables.
LO3-10: Recognize deceptive graphing techniques.
3.1 Stem-and-Leaf Displays and Dot Plots
LO3-1: Make a stem-and-leaf or dot plot.
LO3-1: Make a stem-and-leaf or dot plot (continued).
Begin with univariate data (a set of n observations on
one variable) and consider the following (Table 3.1):
Characteristic: Interpretation
Measurement: What are the units of measurement (e.g., dollars)? Are the data integer or continuous? Any missing observations? Any concerns with accuracy or sampling methods?
Center: Where are the data values concentrated? What seem to be typical or middle data values?
Variability: How much dispersion is there in the data? How spread out are the data values? Are there unusual values?
Shape: Are the data values distributed symmetrically? Skewed? Sharply peaked? Flat? Bimodal?
LO3-1: Make a stem-and-leaf or dot plot (continued, 2).
Preliminary Assessment
• Look at the data and visualize how they were collected and
measured.
• Sorting (Example: Price/Earnings Ratios)
• Sort the data as a first step and then summarize in a
graphical display. Here are the sorted P/E ratios (values
from Table 3.2).
LO3-1: Make a stem-and-leaf or dot plot (continued, 3).
Stem-and-Leaf Plot
One simple way to visualize small data sets is a stem-
and-leaf plot. The stem-and-leaf plot is a tool of
exploratory data analysis (EDA) that seeks to reveal
essential data features in an intuitive way. A stem-and-
leaf plot is basically a frequency tally, except that we use
digits instead of tally marks. For two-digit or three-digit
integer data, the stem is the tens digit of the data, and
the leaf is the ones digit.
LO3-1: Make a stem-and-leaf or dot plot (continued, 4).
Stem-and-Leaf Plot (continued, 2)
LO3-1: Make a stem-and-leaf or dot plot (continued, 5).
Stem-and-Leaf Plot (continued, 3)
• For example, the data values in the fourth stem are 31, 37,
37, 38.
• We always use equally spaced stems (even if some stems
are empty).
• The stem-and-leaf can reveal central tendency (24 of the
44 P/E ratios were in the 10–19 stem) as well as
dispersion (the range is from 7 to 59).
• In this illustration, the leaf digits have been sorted,
although this is not necessary.
• The stem-and-leaf has the advantage that we can retrieve
the raw data by concatenating a stem digit with each of its
leaf digits. For example, the last stem has data values 50
and 59.
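The construction described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the sample mixes values quoted on the slides (7, 31, 37, 37, 38, 50, 59) with hypothetical ones, so it is not the Table 3.2 data.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return (stem, leaves) pairs for non-negative integer data.

    The stem is the value without its ones digit (value // 10);
    the leaf is the ones digit (value % 10).
    """
    leaves_by_stem = defaultdict(list)
    for v in sorted(values):  # sorting keeps the leaves in order
        leaves_by_stem[v // 10].append(v % 10)
    lo, hi = min(leaves_by_stem), max(leaves_by_stem)
    # Keep every stem from lowest to highest, even empty ones,
    # so the stems stay equally spaced.
    return [(stem, leaves_by_stem.get(stem, [])) for stem in range(lo, hi + 1)]


def format_plot(rows):
    """Render the rows as text, one stem per line."""
    return "\n".join(
        f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in rows
    )


# Partly hypothetical sample (NOT the Table 3.2 data).
sample = [7, 12, 15, 15, 18, 23, 31, 37, 37, 38, 50, 59]
rows = stem_and_leaf(sample)
print(format_plot(rows))
```

Concatenating a stem with each of its leaves recovers the raw values, and the empty 40s stem still appears in the output.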
LO3-1: Make a stem-and-leaf or dot plot (continued, 6).
Dot Plots
• A dot plot is the simplest graphical display of n individual values of
numerical data.
• Easy to understand.
• It reveals dispersion, central tendency, and the shape of the
distribution.
Steps in Making a Dot Plot
1. Make a scale that covers the data range.
2. Mark the axes and label them.
3. Plot each data value as a dot above the scale at its
approximate location.
Note: If more than one data value lies at about the same axis location,
the dots are stacked vertically.
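As a rough illustration of these steps, the sketch below builds a text-only dot plot for integer data. The function name, the "o" marker, and the sample values are all assumptions made for this example; a real dot plot would normally be drawn with charting software.

```python
from collections import Counter


def text_dot_plot(values, lo, hi):
    """Return the lines of a text dot plot on the integer scale lo..hi."""
    counts = Counter(values)
    tallest = max(counts.values())
    lines = []
    # Draw from the top row down; repeated values stack vertically.
    for level in range(tallest, 0, -1):
        lines.append(
            "".join("o" if counts[x] >= level else " " for x in range(lo, hi + 1))
        )
    lines.append("-" * (hi - lo + 1))  # the scale covering the data range
    return lines


# Hypothetical data values.
sample = [2, 3, 3, 5, 5, 5, 8]
lines = text_dot_plot(sample, 0, 9)
print("\n".join(lines))
```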
LO3-1: Make a stem-and-leaf or dot plot (continued, 7).
Below is the dot plot for the P/E Ratios.
LO3-1: Make a stem-and-leaf or dot plot (continued, 8).
Comparing Groups
• A stacked dot plot can be used to compare two or more
groups using a common X-axis scale.
3.2 Frequency Distributions and
Histograms
LO3-2: Create a frequency distribution for a data set.
Bins and Bin Limits
• A frequency distribution is a table formed by classifying n data
values into k classes (bins).
• Bin limits define the values to be included in each bin. Widths
must all be the same except when we have open-ended bins.
• For guidance, find the approximate width of each bin by dividing
the data range by the number of bins: (xmax – xmin)/k.
• Frequencies are the number of observations within each bin.
• Frequencies can also be expressed as relative frequencies (frequency
divided by the total) or percentages (relative frequency times 100).
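A minimal sketch of such a table in Python follows. It assumes each bin includes its lower limit and excludes its upper limit, a convention the slides do not state. The sample values and the function name are hypothetical.

```python
def frequency_distribution(values, limits):
    """Tabulate values into bins [limits[i], limits[i + 1]).

    Returns rows of (lower, upper, frequency, relative frequency, percent).
    Assumption: a value equal to a bin limit falls in the bin that starts there.
    """
    n = len(values)
    rows = []
    for lower, upper in zip(limits, limits[1:]):
        freq = sum(1 for v in values if lower <= v < upper)
        rel = freq / n
        rows.append((lower, upper, freq, rel, 100 * rel))
    return rows


# Hypothetical sample and equal-width bin limits.
sample = [7, 12, 15, 15, 18, 23, 31, 37, 37, 38, 50, 59]
limits = [0, 10, 20, 30, 40, 50, 60]
table = frequency_distribution(sample, limits)
for lower, upper, freq, rel, pct in table:
    print(f"{lower:>2} to <{upper:<2}  {freq:>2}  {rel:.3f}  {pct:5.1f}%")
```

The relative frequencies sum to 1 and the percentages to 100, which is a quick check that no observation was missed.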
LO3-2: Create a frequency distribution for a data set
(continued).
Constructing a Frequency Distribution
Herbert Sturges proposed the following rule:
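Sturges' rule is commonly written as k = 1 + 3.3 log10(n), where n is the sample size and k is the suggested number of bins. The short sketch below evaluates it for the n = 44 P/E ratios mentioned earlier; the function name is illustrative only.

```python
import math


def sturges_bins(n):
    """Suggested number of bins under Sturges' rule: k = 1 + 3.3 * log10(n)."""
    return 1 + 3.3 * math.log10(n)


k = sturges_bins(44)
print(round(k, 2))  # unrounded suggestion
print(round(k))     # nearest whole number of bins
```

The result is a starting point only, since the final choice of bins still calls for judgment.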
LO3-2: Create a frequency distribution for a data set
(continued, 2).
For the P/E ratio, the smallest P/E ratio was 7 and the largest P/E
ratio was 59, so if we want to use k = 6 bins, we calculate the
approximate bin width as (59 − 7)/6 = 8.67.
To obtain “nice” limits, we could round the bin width up to 10 and
choose bin limits of 0, 10, 20, 30, 40, 50, 60.
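The arithmetic above can be reproduced with a short sketch. The helper names are hypothetical, and rounding the width up to 10 remains a judgment call, so the rounded width is passed in rather than computed.

```python
import math


def approx_bin_width(xmin, xmax, k):
    """Approximate bin width: data range divided by the number of bins."""
    return (xmax - xmin) / k


def nice_limits(xmin, xmax, width):
    """Bin limits at multiples of `width` that cover xmin through xmax."""
    start = math.floor(xmin / width) * width
    limits = [start]
    while limits[-1] <= xmax:
        limits.append(limits[-1] + width)
    return limits


print(round(approx_bin_width(7, 59, 6), 2))  # approximate width for k = 6
print(nice_limits(7, 59, 10))                # limits after rounding width to 10
```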
LO3-3: Make a histogram with appropriate bins.
Histograms
• A histogram is a graphical representation of a
frequency distribution.
• A histogram is a bar chart.
• Y-axis shows frequency within each bin.
• X-axis ticks show the end points of each bin.
LO3-3: Make a histogram with appropriate bins
(continued).
Consider 3 histograms for the P/E ratio data with different bin
widths. What do they tell you?
LO3-3: Make a histogram with appropriate bins
(continued, 2).
• Choosing the number of bins and bin limits in creating
histograms requires judgment.
• One can use software programs to create histograms
with different bins. These include software such as:
• Excel
• MegaStat
• Minitab
LO3-4: Identify skewness, modal classes, and outliers
in a histogram.
Modal Class
LO3-4: Identify skewness, modal classes, and outliers
in a histogram (continued).
Shape
• A histogram may suggest the shape of the population.
• It is influenced by the number of bins and bin limits.
• Skewness – indicated by the direction of the longer
tail of the histogram.
• Left-skewed – (negatively skewed) a longer left
tail.
• Right-skewed – (positively skewed) a longer right
tail.
• Symmetric – both tail areas are the same.
3.3 Effective Excel Charts
This section describes how to use Excel to create charts.
Excel offers a vast array of charts. Refer to Figure 3.8
and to the text as well.
3.4 Line Charts
LO3-5: Make an effective line chart.
LO3-5: Make an effective line chart (continued).
Simple Line Charts
• Two-scale line chart – used to compare variables that
differ in magnitude or are measured in different units.
LO3-5: Make an effective line chart (continued, 2).
Log Scales
• Arithmetic scale – distances on the Y-axis are proportional to
the magnitude of the variable being displayed.
• Logarithmic scale – (ratio scale) equal distances represent
equal ratios.
• Use a log scale for the vertical axis when data vary over a
wide range, say, by more than an order of magnitude.
• This will reveal more detail for smaller data values.
LO3-5: Make an effective line chart (continued, 3).
Log Scales
• A log scale is useful for time series data that might be expected
to grow at a compound annual percentage rate (e.g., GDP, the
national debt, or your future income). It reveals whether the
quantity is growing at an
• increasing percent (concave upward),
• constant percent (straight line), or
• declining percent (concave downward).
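To see why constant percent growth plots as a straight line on a log scale, compare the step between consecutive logarithms of a series. The sketch below uses two hypothetical series, one growing 5 percent per period and one growing by a fixed amount; the function name is an assumption for this example.

```python
import math


def log_steps(series):
    """Differences between consecutive base-10 logs of a positive series."""
    logs = [math.log10(v) for v in series]
    return [b - a for a, b in zip(logs, logs[1:])]


# Hypothetical series: 5 percent growth per period, starting at 100.
constant_percent = [100 * 1.05 ** t for t in range(6)]
# Hypothetical series: a fixed increase of 10 per period.
constant_amount = [100 + 10 * t for t in range(6)]

print(log_steps(constant_percent))  # equal steps: straight line on a log scale
print(log_steps(constant_amount))   # shrinking steps: concave downward
```

Equal steps in the logs correspond to equal ratios between successive values, which is what a log scale displays as equal distances.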
3.5 Column and Bar Charts
LO3-6: Make an effective column chart or bar chart.
• A column chart is a vertical display of the data.
• A bar chart is a horizontal display of the data.
LO3-6: Make an effective column chart or bar chart
(continued).
Pareto Charts
• Special type of bar chart used in quality management to
display the frequency of defects or errors of different types.
• Categories are displayed in descending order of frequency.
• Focus on the significant few (i.e., the few categories that
account for most defects or errors).
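The ordering behind a Pareto chart can be sketched as follows. The defect counts are hypothetical, and the cumulative percent column is an addition that Pareto charts often show, not something stated on the slide.

```python
def pareto_table(counts):
    """Sort categories by descending frequency and add a cumulative percent."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for category, freq in ordered:
        running += freq
        rows.append((category, freq, 100 * running / total))
    return rows


# Hypothetical defect counts by type.
defects = {"Scratch": 12, "Dent": 45, "Misalignment": 8, "Paint": 30, "Other": 5}
rows = pareto_table(defects)
for category, freq, cum_pct in rows:
    print(f"{category:<13}{freq:>4}{cum_pct:>8.1f}%")
```

In this made-up example the first two categories account for 75 percent of all defects, which is the "significant few" idea.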
LO3-6: Make an effective column chart or bar chart
(continued, 2).
Stacked Column Chart
• Bar height is the sum of several subtotals. Areas may be
compared by color to show patterns in the subgroups and total.
Source: www.aamc.org
3.6 Pie Charts
Pie Chart
• A pie chart can only convey a general idea of the data.
• Pie charts should be used to portray data that sum
to a total (e.g., percent market shares).
• A pie chart should only have a few (i.e., 2 to 5) slices.
• Each slice can be labeled with data values or
percents.
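As a small illustration, the sketch below converts raw amounts into percent shares and checks the 2-to-5-slice guideline. The category names, amounts, and function names are hypothetical.

```python
def percent_shares(amounts):
    """Convert raw amounts into percent shares of the total."""
    total = sum(amounts.values())
    return {name: 100 * value / total for name, value in amounts.items()}


def fits_pie_chart(amounts, max_slices=5):
    """Rough check of the guideline that a pie chart has 2 to 5 slices."""
    return 2 <= len(amounts) <= max_slices


# Hypothetical sales by brand.
sales = {"Brand A": 50, "Brand B": 30, "Brand C": 20}
shares = percent_shares(sales)
print(shares)
print(fits_pie_chart(sales))
```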
LO3-7: Make an effective pie chart (continued).
Pie Chart
• A simple 2-D pie chart is best, as shown in Figure 3.17.
LO3-7: Make an effective pie chart (continued, 2).
Pie Chart
• The 3-D pie chart adds visual interest, but the sizes of the
pie slices are harder to assess.
LO3-7: Make an effective pie chart (continued, 3).
Bar Chart
• A simple bar chart can be used to display the same data, and
would be preferred by many statisticians.
3.7 Scatter Plots
LO3-8: Make and interpret a scatter plot.
LO3-8: Make and interpret a scatter plot (continued).
LO3-8: Make and interpret a scatter plot (continued, 2).
• Figure 3.21 shows some scatter plot patterns similar to those that
you might observe when you have a sample of (X, Y) data pairs.
• A scatter plot can convey patterns in data pairs that would not be
apparent from a table.
LO3-8: Make and interpret a scatter plot (continued, 3).
Other examples of scatter plots.
3.8 Tables
LO3-9: Make simple tables and pivot tables.
3.9 Deceptive Graphs
LO3-10: Recognize deceptive graphing techniques.
LO3-10: Recognize deceptive graphing techniques
(continued, 2).
Error 2: Elastic Graph Proportions
• Keep the aspect ratio (width/height) below 2.00 so as not to
exaggerate the graph. By default, Excel uses an aspect ratio of
1.68.
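A quick way to apply this guideline is to compute width divided by height before publishing a chart. The sketch below is only an illustration; the function names and the example dimensions are assumptions.

```python
def aspect_ratio(width, height):
    """Aspect ratio of a chart: width divided by height."""
    return width / height


def within_guideline(width, height, limit=2.00):
    """True when the aspect ratio stays below the suggested limit."""
    return aspect_ratio(width, height) < limit


# Hypothetical chart sizes in inches.
print(round(aspect_ratio(8.4, 5), 2))  # a chart 8.4 wide by 5 high
print(within_guideline(8.4, 5))
print(within_guideline(12, 4))         # a stretched chart
```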
LO3-10: Recognize deceptive graphing techniques
(continued, 3).
Error 3: Dramatic Titles and Distracting Pictures
• A dramatic title often is designed more to grab the reader's
attention than to convey the chart's content (Criminals on a
Spree, Deficit Swamps Economy).
• Sometimes the title attempts to draw your conclusion for you
(Inflation Wipes Out Savings, Imports Dwarf Exports).
• A title should be short but adequate for the purpose.
• To add visual pizzazz, artists may superimpose the chart on a
photograph (e.g., a gasoline price chart atop a photo of an oil-
drilling platform) or add colorful cartoon figures, banners, or
drawings.
LO3-10: Recognize deceptive graphing techniques
(continued, 4).
Error 3: Dramatic Titles and Distracting Pictures
(continued)
• This is mostly harmless but can distract the reader or
impart an emotional slant.
• Advertisements sometimes feature mature, attractive,
conservatively attired actors portraying scientists, doctors,
or business leaders examining scientific-looking charts.
• Because the public respects science’s reputation, such
displays impart credibility to self-serving commercial
claims.
• The medical school applications graph (see next slide)
illustrates these deceptive elements.
LO3-10: Recognize deceptive graphing techniques
(continued, 5).
Error 3: Dramatic Titles and Distracting Pictures
(continued, 2)
LO3-10: Recognize deceptive graphing techniques
(continued, 6).
Error 4: 3-D and Novelty Graphs
• Novelty charts such as the pyramid chart should be
avoided because they distort the bar volume and make it
hard to measure bar height.
Copyright ©2019 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the
prior written consent of McGraw-Hill Education.
LO3-10: Recognize deceptive graphing techniques
(continued, 7).
LO3-10: Recognize deceptive graphing techniques
(continued, 8).
Error 8: Complex Graphs
• Avoid complex graphs if possible. This example (surgery volume) combines
several errors (silly subtitle, distracting pictures, no data
labels, no definitions, vague source, too much information).
LO3-10: Recognize deceptive graphing techniques
(continued, 9).
Error 11: Area Trick
• As figure height increases, so does width, distorting the
graph.
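The distortion is easy to quantify: if a pictured figure is scaled in both height and width to show a value, its area grows with the square of the scale factor. A minimal sketch, with a hypothetical function name:

```python
def area_ratio(scale):
    """Factor by which area grows when height and width are both
    multiplied by `scale`."""
    return scale * scale


# A value that doubles is drawn twice as tall AND twice as wide.
print(area_ratio(2))    # the area is four times as large
print(area_ratio(1.5))  # a 50 percent increase more than doubles the area
```

The eye tends to judge the area, so the increase looks larger than the data justify.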