Statistics For Business and Economics: Describing Data: Graphical


Statistics for Business and Economics
6th Edition

Chapter 2
Describing Data: Graphical

Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chap 2-1
Chapter Goals
After completing this chapter, you should be able to:
 Identify types of data and levels of measurement
 Create and interpret graphs to describe categorical variables:
 frequency distribution, bar chart, pie chart, Pareto diagram
 Create a line chart to describe time-series data
 Create and interpret graphs to describe numerical variables:
 frequency distribution, histogram, ogive, stem-and-leaf display

 Construct and interpret graphs to describe relationships between variables:
 Scatter plot, cross table
 Describe appropriate and inappropriate ways to display data graphically

Types of Data

Data
  Categorical (defined categories or groups)
    Examples: Marital Status; Are you registered to vote?; Eye Color
  Numerical
    Discrete (counted items)
      Examples: Number of Children; Defects per hour
    Continuous (measured characteristics)
      Examples: Weight; Voltage

Measurement Levels
Quantitative Data
  Ratio Data: differences between measurements, true zero exists
  Interval Data: differences between measurements but no true zero
Qualitative Data
  Ordinal Data: ordered categories (rankings, order, or scaling)
  Nominal Data: categories (no ordering or direction)

Exercise.
A number of questions were posed to a random sample of visitors to a London
tourist information center. For each question below, describe the type of data
obtained.

a. Are you staying overnight in London?
b. How many times have you visited London previously?
c. Which of the following attractions have you visited?
   - Tower of London
   - Buckingham Palace
   - Big Ben
   - Covent Garden
   - Westminster Abbey
d. How likely are you to visit London again in the next 12 months: (1) unlikely, (2) likely, (3) very likely?

Graphical Presentation of Data
 Data in raw form are usually not easy to use for decision making
 Some type of organization is needed
 Table
 Graph
 The type of graph to use depends on the variable being summarized

Graphical Presentation of Data (continued)
 Techniques reviewed in this chapter:

Categorical Variables      Numerical Variables
• Frequency distribution   • Line chart
• Bar chart                • Frequency distribution
• Pie chart                • Histogram and ogive
• Pareto diagram           • Stem-and-leaf display
                           • Scatter plot

Tables and Graphs for Categorical Variables
Categorical Data
  Tabulating Data: Frequency Distribution Table
  Graphing Data: Bar Chart, Pie Chart, Pareto Diagram

The Frequency Distribution Table
Summarize data by category

Example: Hospital Patients by Unit

Hospital Unit      Number of Patients
Cardiac Care                    1,052
Emergency                       2,245
Intensive Care                    340
Maternity                         552
Surgery                         4,630

(Variables are categorical)

Bar and Pie Charts
 Bar charts and pie charts are often used for qualitative (category) data
 Height of bar or size of pie slice shows the frequency or percentage for each category

Bar Chart Example

Hospital Unit      Number of Patients
Cardiac Care                    1,052
Emergency                       2,245
Intensive Care                    340
Maternity                         552
Surgery                         4,630

[Figure: bar chart "Hospital Patients by Unit"; horizontal axis: hospital unit (Cardiac Care, Emergency, Intensive Care, Maternity, Surgery); vertical axis: number of patients per year, 0 to 5000]

Pie Chart Example

Hospital Unit      Number of Patients   % of Total
Cardiac Care                    1,052        11.93
Emergency                       2,245        25.46
Intensive Care                    340         3.86
Maternity                         552         6.26
Surgery                         4,630        52.50

[Figure: pie chart "Hospital Patients by Unit"; slices: Cardiac Care 12%, Emergency 25%, Intensive Care 4%, Maternity 6%, Surgery 53%]
(Percentages are rounded to the nearest percent)

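The % of Total column in the pie chart example can be reproduced in a few lines of Python. This is a minimal sketch that is not part of the original slides; the names `patients` and `percent` are illustrative assumptions.

```python
# Patient counts from the Hospital Patients by Unit table.
patients = {
    "Cardiac Care": 1052,
    "Emergency": 2245,
    "Intensive Care": 340,
    "Maternity": 552,
    "Surgery": 4630,
}

total = sum(patients.values())

# Each pie slice is the unit's share of all patients, in percent.
percent = {unit: 100 * count / total for unit, count in patients.items()}

for unit, count in patients.items():
    print(f"{unit:<15}{count:>7,}{percent[unit]:>8.2f}")
```

Rounding each percentage to a whole number gives the slice labels shown on the chart.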
Pareto Diagram

 Used to portray categorical data


 A bar chart, where categories are shown in
descending order of frequency
 A cumulative polygon is often shown in the
same graph
 Used to separate the “vital few” from the “trivial
many”

Pareto Diagram Example
Example: 400 defective items are examined for cause of defect:

Source of Manufacturing Error   Number of defects
Bad Weld                                       34
Poor Alignment                                223
Missing Part                                   25
Paint Flaw                                     78
Electrical Short                               19
Cracked case                                   21
Total                                         400

Pareto Diagram Example (continued)

Step 1: Sort the defect causes by number of defects, in descending order
Step 2: Determine % in each category

Source of Manufacturing Error   Number of defects   % of Total Defects
Poor Alignment                                223                55.75
Paint Flaw                                     78                19.50
Bad Weld                                       34                 8.50
Missing Part                                   25                 6.25
Cracked case                                   21                 5.25
Electrical Short                               19                 4.75
Total                                         400               100.00

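Steps 1 and 2 can be sketched in Python as follows. This is an illustrative sketch, not part of the original slides; the names `defects`, `ranked`, and `rows` are assumptions, and the running total is included because it is the quantity the cumulative line in a Pareto diagram traces.

```python
# Defect counts from the example (400 defective items in total).
defects = {
    "Bad Weld": 34,
    "Poor Alignment": 223,
    "Missing Part": 25,
    "Paint Flaw": 78,
    "Electrical Short": 19,
    "Cracked case": 21,
}

total = sum(defects.values())

# Step 1: order the causes by number of defects, largest first.
ranked = sorted(defects.items(), key=lambda item: item[1], reverse=True)

# Step 2: percentage for each cause, plus the running (cumulative) percentage.
rows = []
cumulative = 0.0
for cause, count in ranked:
    pct = 100 * count / total
    cumulative += pct
    rows.append((cause, count, pct, cumulative))

for cause, count, pct, cum in rows:
    print(f"{cause:<17}{count:>4}{pct:>8.2f}{cum:>8.2f}")
```

In this data the first two causes (Poor Alignment and Paint Flaw) account for about three quarters of all defects, which is the "vital few" pattern the diagram is meant to expose.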
Pareto Diagram Example (continued)
Step 3: Show results graphically

[Figure: "Pareto Diagram: Cause of Manufacturing Defect"; bars show % of defects in each category (left axis, 0% to 60%); a line shows cumulative % (right axis, 0% to 100%); categories from left to right: Poor Alignment, Paint Flaw, Bad Weld, Missing Part, Cracked case, Electrical Short]

Graphs for Time-Series Data
 A line chart (time-series plot) is used to show the values of a variable over time
 Time is measured on the horizontal axis
 The variable of interest is measured on the vertical axis

Line Chart Example

[Figure: line chart "Magazine Subscriptions by Year"; horizontal axis: year, 1990 to 2006; vertical axis: thousands of subscribers, 0 to 350]

Graphs to Describe Numerical Variables
Numerical Data
  Frequency Distributions and Cumulative Distributions: Histogram, Ogive
  Stem-and-Leaf Display

Frequency Distributions

What is a Frequency Distribution?


 A frequency distribution is a list or a table …
 containing class groupings (categories or
ranges within which the data fall) ...
 and the corresponding frequencies with which
data fall within each class or category

Why Use Frequency Distributions?
 A frequency distribution is a way to summarize data
 The distribution condenses the raw data into a more useful form...
 and allows for a quick visual interpretation of the data

Class Intervals and Class Boundaries
 Each class grouping has the same width
 Determine the width of each interval by

$$w = \text{interval width} = \frac{\text{largest number} - \text{smallest number}}{\text{number of desired intervals}}$$

 Use at least 5 but no more than 15-20 intervals
 Intervals never overlap
 Round up the interval width to get desirable interval endpoints

Frequency Distribution Example

Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature

24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Frequency Distribution Example (continued)
 Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
 Find range: 58 - 12 = 46
 Select number of classes: 5 (usually between 5 and 15)
 Compute interval width: 10 (46/5 = 9.2, then round up)
 Determine interval boundaries: 10 but less than 20, 20 but less than 30, . . . , 50 but less than 60
 Count observations & assign to classes

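The range, width, and boundary steps above can be sketched in Python. This is a hedged illustration rather than the textbook's procedure: the slides do not say how far to round up, so the `step` value and the choice of starting point below are assumptions made for this example.

```python
import math

# The 20 daily high temperatures from the example.
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
num_classes = 5

data_range = max(temps) - min(temps)   # largest number - smallest number
raw_width = data_range / num_classes   # width before rounding

# Assumption: "round up" means the next multiple of a convenient step.
step = 10
width = math.ceil(raw_width / step) * step

# Assumption: the first class starts at the multiple of the width
# at or just below the smallest observation.
start = (min(temps) // width) * width
boundaries = [(start + i * width, start + (i + 1) * width)
              for i in range(num_classes)]

print(data_range, raw_width, width)
for lower, upper in boundaries:
    print(f"{lower} but less than {upper}")
```

A different step (for example 5) would also be a defensible choice; the goal stated on the slide is only that the endpoints be convenient to read.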
Frequency Distribution Example (continued)
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Interval              Frequency   Relative Frequency   Percentage
10 but less than 20           3                  .15           15
20 but less than 30           6                  .30           30
30 but less than 40           5                  .25           25
40 but less than 50           4                  .20           20
50 but less than 60           2                  .10           10

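The frequency, relative frequency, and percentage columns can be checked with a short Python sketch. It is not part of the original slides, and the names `classes` and `table` are illustrative assumptions.

```python
# The 20 daily high temperatures from the example.
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
n = len(temps)

table = []
for lower, upper in classes:
    # "lower but less than upper": the lower endpoint is included,
    # the upper endpoint is not, so intervals never overlap.
    frequency = sum(lower <= t < upper for t in temps)
    relative = frequency / n
    percentage = 100 * frequency / n
    table.append((lower, upper, frequency, relative, percentage))

for lower, upper, frequency, relative, percentage in table:
    print(f"{lower} but less than {upper}: "
          f"{frequency:>2}  {relative:.2f}  {percentage:.0f}")
```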
Histogram
 A graph of the data in a frequency distribution is called a histogram
 The interval endpoints are shown on the horizontal axis
 The vertical axis is either frequency, relative frequency, or percentage
 Bars of the appropriate heights are used to represent the number of observations within each class

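To make the idea of bar height concrete, the sketch below prints a rough text histogram for the temperature frequency distribution, turned sideways so each bar runs left to right. It is an illustration only, not part of the original slides; a real histogram would be drawn with a plotting tool, and the label layout here is an arbitrary choice.

```python
# Classes and frequencies from the temperature frequency distribution.
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [3, 6, 5, 4, 2]

lines = []
for (lower, upper), f in zip(classes, frequencies):
    # One '#' per observation, so bar length equals the class frequency.
    lines.append(f"{lower:>2} to <{upper:<3}| {'#' * f} ({f})")

print("\n".join(lines))
```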
Graphs to describe Numerical Variables

Larger data sets require more classes; smaller data sets require fewer classes.

If we select too few classes, the patterns and various characteristics of the data may be
hidden. If we select too many classes, we will discover that some of our intervals may
contain no observations or have a very small frequency.

Histogram Example

Interval              Frequency
10 but less than 20           3
20 but less than 30           6
30 but less than 40           5
40 but less than 50           4
50 but less than 60           2

[Figure: histogram "Daily High Temperature"; horizontal axis: temperature in degrees, interval endpoints 0 to 70; vertical axis: frequency, 0 to 7; each bar is labelled with its frequency]
(No gaps between bars)

Questions for Grouping Data into Intervals
 1. How wide should each interval be? (How many classes should be used?)
 2. How should the endpoints of the intervals be determined?
 Often answered by trial and error, subject to user judgment
 The goal is to create a distribution that is neither too "jagged" nor too "blocky"
 Goal is to appropriately show the pattern of variation in the data

How Many Class Intervals?
 Many (narrow class intervals)
 May yield a very jagged distribution with gaps from empty classes
 Can give a poor indication of how frequency varies across classes
[Figure: histogram of temperature with narrow classes; horizontal axis: upper class endpoints 4, 8, 12, ..., 60, More; vertical axis: frequency, 0 to 3.5]

 Few (wide class intervals)
 May compress variation too much and yield a blocky distribution
 Can obscure important patterns of variation
[Figure: histogram of temperature with wide classes; horizontal axis: upper class endpoints 0, 30, 60, More; vertical axis: frequency, 0 to 12]
(X axis labels are upper class endpoints)

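The effect of class width can be seen by grouping the same 20 temperatures two ways. This is a sketch under stated assumptions: the helper `class_frequencies` is hypothetical, and it uses the "lower but less than upper" convention from the frequency distribution example, which may differ from the convention behind the figures on this slide, so the counts need not match the plotted bars exactly.

```python
# The 20 daily high temperatures from the example.
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]


def class_frequencies(data, width, start, stop):
    """Count observations in classes [lower, lower + width) from start up to stop."""
    return {
        (lower, lower + width): sum(lower <= x < lower + width for x in data)
        for lower in range(start, stop, width)
    }


narrow = class_frequencies(temps, width=4, start=0, stop=60)   # 15 classes
wide = class_frequencies(temps, width=30, start=0, stop=60)    # 2 classes

# Narrow classes leave some classes empty (a jagged picture) ...
empty_narrow = sum(1 for f in narrow.values() if f == 0)
print(len(narrow), "narrow classes,", empty_narrow, "of them empty")

# ... while wide classes lump almost everything together (a blocky picture).
print(wide)
```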
The Cumulative Frequency Distribution
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                 Frequency   Percentage   Cumulative Frequency   Cumulative Percentage
10 but less than 20           3           15                      3                      15
20 but less than 30           6           30                      9                      45
30 but less than 40           5           25                     14                      70
40 but less than 50           4           20                     18                      90
50 but less than 60           2           10                     20                     100
Total                        20          100

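The cumulative columns are running totals of the frequency column. A minimal Python sketch, not part of the original slides, using `itertools.accumulate` from the standard library:

```python
from itertools import accumulate

# Class frequencies from the temperature frequency distribution.
frequencies = [3, 6, 5, 4, 2]
n = sum(frequencies)

# Running total of observations up to and including each class.
cumulative_frequency = list(accumulate(frequencies))

# The same running total expressed as a percentage of all observations.
cumulative_percentage = [100 * c / n for c in cumulative_frequency]

print(cumulative_frequency)
print(cumulative_percentage)
```

The last cumulative frequency always equals the number of observations, and the last cumulative percentage is always 100, which is a quick check on the table.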
The Ogive
Graphing Cumulative Frequencies

Interval              Upper interval endpoint   Cumulative Percentage
Less than 10                               10                       0
10 but less than 20                        20                      15
20 but less than 30                        30                      45
30 but less than 40                        40                      70
40 but less than 50                        50                      90
50 but less than 60                        60                     100

[Figure: ogive "Daily High Temperature"; horizontal axis: interval endpoints, 10 to 60; vertical axis: cumulative percentage, 0 to 100]

Distribution Shape
 The shape of the distribution is said to be
symmetric if the observations are balanced,
or evenly distributed, about the center.

[Figure: histogram "Symmetric Distribution"; horizontal axis: values 1 to 9; vertical axis: frequency, 0 to 10]

Distribution Shape (continued)
 The shape of the distribution is said to be skewed if the observations are not symmetrically distributed around the center.

A positively skewed distribution (skewed to the right) has a tail that extends to the right in the direction of positive values.
[Figure: histogram "Positively Skewed Distribution"; horizontal axis: values 1 to 9; vertical axis: frequency, 0 to 12]

A negatively skewed distribution (skewed to the left) has a tail that extends to the left in the direction of negative values.
[Figure: histogram "Negatively Skewed Distribution"; horizontal axis: values 1 to 9; vertical axis: frequency, 0 to 12]

Stem-and-Leaf Diagram
 A simple way to see distribution details in a data set

METHOD: Separate the sorted data series into leading digits (the stem) and the trailing digits (the leaves)

Example
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41

 Here, use the 10’s digit for the stem unit:

                    Stem   Leaf
 21 is shown as      2      1
 38 is shown as      3      8

Example
(continued)
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41

 Completed stem-and-leaf diagram:


Stem Leaves
2 1 4 4 6 7 7
3 0 2 8
4 1

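The display above can be built by splitting each value into its 10's digit and its 1's digit. A minimal Python sketch, not part of the original slides; the dictionary name `stems` is an illustrative assumption.

```python
# Data from the example, with the 10's digit as the stem unit.
data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]

stems = {}
for value in sorted(data):
    # divmod(value, 10) splits 38 into stem 3 and leaf 8.
    stem, leaf = divmod(value, 10)
    stems.setdefault(stem, []).append(leaf)

print("Stem  Leaves")
for stem, leaves in stems.items():
    print(f"{stem:<5} {' '.join(str(leaf) for leaf in leaves)}")
```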
Using other stem units
 Using the 100’s digit as the stem:
 Round each value to the nearest 10 and use the 10’s digit as the leaf

                         Stem   Leaf
 613 would become         6      1
 776 would become         7      8
 ...
 1224 becomes            12      2

Using other stem units (continued)
 Using the 100’s digit as the stem:
 The completed stem-and-leaf display:

Data:
613, 632, 658, 717, 722, 750, 776, 827,
841, 859, 863, 891, 894, 906, 928, 933,
955, 982, 1034, 1047, 1056, 1140, 1169, 1224

Stem   Leaves
6      136
7      2258
8      346699
9      13368
10     356
11     47
12     2

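With the 100's digit as the stem, each value is first rounded to the nearest 10 and the resulting 10's digit becomes the leaf. The sketch below is an illustration, not part of the original slides; it assumes halves round up and uses integer arithmetic for the rounding so the result does not depend on floating-point behaviour.

```python
from collections import defaultdict

data = [613, 632, 658, 717, 722, 750, 776, 827,
        841, 859, 863, 891, 894, 906, 928, 933,
        955, 982, 1034, 1047, 1056, 1140, 1169, 1224]

display = defaultdict(str)
for value in sorted(data):
    # Round to the nearest 10, expressed in tens: 776 -> 78, 613 -> 61.
    tens = (value + 5) // 10
    # Hundreds and above form the stem; the 10's digit is the leaf.
    stem, leaf = divmod(tens, 10)
    display[stem] += str(leaf)

print("Stem  Leaves")
for stem, leaves in display.items():
    print(f"{stem:<5} {leaves}")
```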
Relationships Between Variables
 Graphs illustrated so far have involved only a single variable
 When two variables exist other techniques are used:

Categorical (Qualitative) Variables: Cross tables
Numerical (Quantitative) Variables: Scatter plots

Scatter Diagrams
 Scatter Diagrams are used for paired observations taken from two numerical variables
 The Scatter Diagram:
 one variable is measured on the vertical axis and the other variable is measured on the horizontal axis

Scatter Diagram Example

Volume per day   Cost per day
23               125
26               140
29               146
33               160
38               167
42               170
50               188
55               195
60               200

[Figure: scatter plot "Cost per Day vs. Production Volume"; horizontal axis: volume per day, 0 to 70; vertical axis: cost per day, 0 to 250]

Cross Tables
 Cross Tables (or contingency tables) list the number of observations for every combination of values for two categorical or ordinal variables
 If there are r categories for the first variable (rows) and c categories for the second variable (columns), the table is called an r x c cross table

Cross Table Example
 4 x 3 Cross Table for Investment Choices by Investor (values in $1000’s)

Investment Category   Investor A   Investor B   Investor C   Total
Stocks                      46.5           55         27.5     129
Bonds                       32.0           44         19.0      95
CD                          15.5           20         13.5      49
Savings                     16.0           28          7.0      51
Total                      110.0          147         67.0     324

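The row and column totals of the cross table can be verified with a short Python sketch. It is not part of the original slides; the names `investors` and `table` are illustrative assumptions, with one list of values per row (investment category) and one position per column (investor).

```python
investors = ["Investor A", "Investor B", "Investor C"]

# Values in $1000's, one row per investment category (r = 4, c = 3).
table = {
    "Stocks":  [46.5, 55, 27.5],
    "Bonds":   [32.0, 44, 19.0],
    "CD":      [15.5, 20, 13.5],
    "Savings": [16.0, 28, 7.0],
}

# Row totals: sum across investors within each category.
row_totals = {category: sum(values) for category, values in table.items()}

# Column totals: sum across categories for each investor.
column_totals = [sum(column) for column in zip(*table.values())]

grand_total = sum(row_totals.values())

print(row_totals)
print(dict(zip(investors, column_totals)))
print(grand_total)
```

The grand total can be reached either from the row totals or from the column totals, which gives a simple consistency check on any cross table.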
Graphing Multivariate Categorical Data (continued)
 Side by side bar charts

[Figure: side-by-side horizontal bar chart "Comparing Investors"; categories: Savings, CD, Bonds, Stocks; horizontal axis: 0 to 60; one bar each for Investor A, Investor B, Investor C]

Side-by-Side Chart Example
 Sales by quarter for three sales territories:

        1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
East       20.4      27.4      59        20.4
West       30.6      38.6      34.6      31.6
North      45.9      46.9      45        43.9

[Figure: side-by-side bar chart of sales by quarter; horizontal axis: 1st Qtr to 4th Qtr; vertical axis: 0 to 60; one bar each for East, West, North]

Data Presentation Errors

Goals for effective data presentation:


 Present data to display essential information
 Communicate complex ideas clearly and
accurately
 Avoid distortion that might convey the wrong
message

Data Presentation Errors (continued)
 Unequal histogram interval widths
 Compressing or distorting the vertical axis
 Providing no zero point on the vertical axis
 Failing to provide a relative basis in comparing data between groups

Data presentation Errors. Misleading Histograms

Chapter Summary
 Reviewed types of data and measurement levels
 Data in raw form are usually not easy to use for decision making; some type of organization is needed:
 Table
 Graph
 Techniques reviewed in this chapter:
 Frequency distribution    Line chart
 Bar chart                 Frequency distribution
 Pie chart                 Histogram and ogive
 Pareto diagram            Stem-and-leaf display
                            Scatter plot
                            Cross tables and side-by-side bar charts
