Session 2: Introduction to Business Statistics


Business Statistics

Tenth Edition
Ken Black

Chapter 2

Visualizing Data with Charts and Graphs



Copyright ©2020 John Wiley & Sons, Inc.


Learning Objectives
1. Construct a frequency distribution from a set of data.
2. Construct different types of quantitative data graphs, including
histograms, frequency polygons, ogives, dot plots, and stem-
and-leaf plots, in order to interpret the data being graphed.
3. Construct different types of qualitative data graphs, including
pie charts, bar graphs, and Pareto charts, in order to interpret
the data being graphed.
4. Construct a cross-tabulation table and recognize basic trends in
two-variable scatter plots of numerical data.
5. Construct a time-series graph and be able to visually identify
any trends in the data.



2.1 Frequency Distributions (1 of 7)
Ungrouped data
• have not been summarized in any way
• are also called raw data
Grouped data
• logical groupings of data exist
o Example: age ranges (20-29, 30-39, etc.)
• have been organized into a frequency distribution



2.1 Frequency Distributions (2 of 7)
TABLE 2.1: 60 Years of Canadian Unemployment Rates (ungrouped data)
2.3    7.0    6.3    11.3    9.6
2.8    7.1    5.6    10.6    9.1
3.6    5.9    5.4     9.7    8.3
2.4    5.5    7.1     8.8    7.6
2.9    4.7    7.1     7.8    6.8
3.0    3.9    8.0     7.5    7.2
4.6    3.6    8.4     8.1    7.7
4.4    4.1    7.5    10.3    7.6
3.4    4.8    7.5    11.2    7.2
4.6    4.7    7.6    11.4    6.8
6.9    5.9   11.0    10.4    6.3
6.0    6.4   12.0     9.5    6.0


2.1 Frequency Distributions (3 of 7)

TABLE 2.2: Frequency Distribution of 60 Years of Unemployment Data for Canada (grouped data)

Class Interval    Frequency
1–under 3         4
3–under 5         12
5–under 7         13
7–under 9         19
9–under 11        7
11–under 13       5



2.1 Frequency Distributions (4 of 7)
Frequency Distribution: summary of data presented in
the form of class intervals and frequencies
• Vary in shape and design
• Constructed according to the individual analyst’s
preferences
• Range: the difference between the largest and smallest
numbers
o The range for the Canadian unemployment example is
9.7 (12.0 – 2.3)



2.1 Frequency Distributions (5 of 7)
After calculating the range, determine the number of classes
Rule of thumb: select between 5 and 15 classes
o Too few classes may be too general to be useful
o Too many classes may not be sufficiently aggregated
• Divide the range by the number of classes
o For the Canadian unemployment example, the analyst chose 6 classes
$\frac{9.7}{6} \approx 1.62$
o Round up to the nearest whole number to get the class width (= 2)
o The classes must start at or below the lowest observation and end at or above the
highest observation
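
The steps on this slide can be sketched in Python. This is a minimal illustration, not the textbook's procedure: the variable names are made up for the example, and the starting value of 1 is taken from the first class in Table 2.2 rather than derived.

```python
import math

# Table 2.1: 60 years of Canadian unemployment rates (ungrouped data).
rates = [
    2.3, 7.0, 6.3, 11.3, 9.6,
    2.8, 7.1, 5.6, 10.6, 9.1,
    3.6, 5.9, 5.4, 9.7, 8.3,
    2.4, 5.5, 7.1, 8.8, 7.6,
    2.9, 4.7, 7.1, 7.8, 6.8,
    3.0, 3.9, 8.0, 7.5, 7.2,
    4.6, 3.6, 8.4, 8.1, 7.7,
    4.4, 4.1, 7.5, 10.3, 7.6,
    3.4, 4.8, 7.5, 11.2, 7.2,
    4.6, 4.7, 7.6, 11.4, 6.8,
    6.9, 5.9, 11.0, 10.4, 6.3,
    6.0, 6.4, 12.0, 9.5, 6.0,
]

num_classes = 6  # the analyst's choice in the example

# Range = largest value minus smallest value (rounded to hide float noise).
data_range = round(max(rates) - min(rates), 1)

# Class width = range / number of classes, rounded up to a whole number.
class_width = math.ceil(data_range / num_classes)

# Assumed start: 1, the lower endpoint of the first class in Table 2.2.
start = 1
intervals = [(start + i * class_width, start + (i + 1) * class_width)
             for i in range(num_classes)]

# "lo-under hi" means lo <= x < hi.
frequencies = [sum(lo <= x < hi for x in rates) for lo, hi in intervals]

for (lo, hi), freq in zip(intervals, frequencies):
    print(f"{lo}-under {hi}: {freq}")
```

The printed counts should match the frequencies in Table 2.2.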



2.1 Frequency Distributions (6 of 7)
Class Midpoint (or Class Mark)
• Value halfway across the class interval
o Calculated as the average of the two endpoints
Relative Frequency
• Proportion of the total frequency in any given class interval
$\text{Relative Frequency} = \frac{\text{Individual Class Frequency}}{\text{Total Frequency}}$

Cumulative Frequency
• Running total of frequencies through the classes of a frequency
distribution
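
As a rough sketch, the three definitions above can be computed from the class intervals and frequencies of Table 2.2. The variable names are illustrative only.

```python
from itertools import accumulate

# Class intervals (lower, upper) and frequencies from Table 2.2.
intervals = [(1, 3), (3, 5), (5, 7), (7, 9), (9, 11), (11, 13)]
frequencies = [4, 12, 13, 19, 7, 5]

total = sum(frequencies)

# Class midpoint: average of the two class endpoints.
midpoints = [(lo + hi) / 2 for lo, hi in intervals]

# Relative frequency: class frequency divided by total frequency.
relative = [freq / total for freq in frequencies]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

for (lo, hi), f, m, r, c in zip(intervals, frequencies, midpoints, relative, cumulative):
    print(f"{lo}-under {hi}: freq={f} midpoint={m:g} relative={r:.4f} cumulative={c}")
```

The last cumulative frequency always equals the total frequency (60 here), which is a quick check on the arithmetic.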



2.1 Frequency Distributions (7 of 7)
TABLE 2.3: Class Midpoints, Relative Frequencies, and Cumulative
Frequencies for Unemployment Data

Class Interval    Frequency    Class Midpoint    Relative Frequency    Cumulative Frequency
1–under 3         4            2                 .0667                 4
3–under 5         12           4                 .2000                 16
5–under 7         13           6                 .2167                 29
7–under 9         19           8                 .3167                 48
9–under 11        7            10                .1167                 55
11–under 13       5            12                .0833                 60
Total             60



2.2 Quantitative Data Graphs (1 of 7)
Histogram: contiguous rectangles that
represent the frequency of data in given
class intervals
• x-axis has class intervals; y-axis has
frequencies
• Useful for getting an initial
overview of the distribution of the
data
• Note that the scale of each axis can
change the shape of the histogram
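
For a quick look without a plotting library, a histogram can be approximated in plain text. The sketch below is an assumption-laden illustration: `text_histogram` is a hypothetical helper, and the bars run horizontally, so the class intervals appear down the side instead of along the x-axis.

```python
# Class intervals and frequencies from Table 2.2.
labels = ["1-under 3", "3-under 5", "5-under 7",
          "7-under 9", "9-under 11", "11-under 13"]
frequencies = [4, 12, 13, 19, 7, 5]


def text_histogram(labels, frequencies, symbol="#"):
    """Return one text bar per class; bar length equals the class frequency."""
    width = max(len(label) for label in labels)
    return [f"{label:<{width}} | {symbol * freq} ({freq})"
            for label, freq in zip(labels, frequencies)]


lines = text_histogram(labels, frequencies)
print("\n".join(lines))
```

Because each symbol stands for one observation, the longest bar (7-under 9) marks the class with the most data.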



2.2 Quantitative Data Graphs (2 of 7)

Frequency Polygon: graphical
display of class frequencies
• x-axis has class midpoints;
y-axis has frequencies
• A dot is plotted for each
class midpoint
• Note that scale of each axis
can also change the shape of
the frequency polygon



2.2 Quantitative Data Graphs (3 of 7)
Ogive: cumulative frequency polygon
• x-axis has class endpoints; y-axis
has cumulative frequencies
• A dot is plotted at the endpoint of
each class interval
• Ogives are most useful when the
analyst wants to see running totals
• Steep slopes show sharp increases
in frequencies
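
The points of an ogive can be derived from the Table 2.2 frequencies, as sketched below. Starting the curve at zero at the lower endpoint of the first class is a common convention and is assumed here; the names are illustrative.

```python
from itertools import accumulate

# Class intervals (lower, upper) and frequencies from Table 2.2.
intervals = [(1, 3), (3, 5), (5, 7), (7, 9), (9, 11), (11, 13)]
frequencies = [4, 12, 13, 19, 7, 5]

# One point per class upper endpoint, plus an assumed starting point at zero.
points = [(intervals[0][0], 0)] + [
    (hi, cum) for (_, hi), cum in zip(intervals, accumulate(frequencies))
]

# The rise between consecutive points equals that class's frequency.
# With equal class widths, the largest rise is also the steepest slope.
rises = [y2 - y1 for (_, y1), (_, y2) in zip(points, points[1:])]
steepest = intervals[rises.index(max(rises))]

print(points)
print("steepest segment spans class", steepest)
```

For the unemployment data the steepest segment spans the 7-under 9 class, which is where the frequencies increase most sharply.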



2.2 Quantitative Data Graphs (4 of 7)
Dot Plots: each data point is plotted, with identical values stacked
vertically
• Useful for observing the overall shape of the distribution while
observing where there are groupings or gaps in the data
• In the unemployment data, the distribution appears relatively
balanced, with a peak towards the center and a few gaps
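
A text approximation of a dot plot for the Table 2.1 data is sketched below. It is only an illustration: the layout is rotated (values run down the page and identical values stack to the right), and values that never occur are simply not listed, so gaps show up as jumps between consecutive lines.

```python
from collections import Counter

# Table 2.1: 60 years of Canadian unemployment rates (ungrouped data).
rates = [
    2.3, 7.0, 6.3, 11.3, 9.6,
    2.8, 7.1, 5.6, 10.6, 9.1,
    3.6, 5.9, 5.4, 9.7, 8.3,
    2.4, 5.5, 7.1, 8.8, 7.6,
    2.9, 4.7, 7.1, 7.8, 6.8,
    3.0, 3.9, 8.0, 7.5, 7.2,
    4.6, 3.6, 8.4, 8.1, 7.7,
    4.4, 4.1, 7.5, 10.3, 7.6,
    3.4, 4.8, 7.5, 11.2, 7.2,
    4.6, 4.7, 7.6, 11.4, 6.8,
    6.9, 5.9, 11.0, 10.4, 6.3,
    6.0, 6.4, 12.0, 9.5, 6.0,
]

# Count how many times each distinct value occurs.
counts = Counter(rates)

# One line per distinct value; one dot per occurrence.
for value in sorted(counts):
    print(f"{value:5.1f} {'.' * counts[value]}")
```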



2.2 Quantitative Data Graphs (5 of 7)
Stem-and-Leaf Plots: digits for each number are grouped into a
stem and a leaf
• Stems are the leftmost, higher-value digits
• Leaves are the rightmost, lower-value digits
o Useful for observing whether values are in the upper or lower end
of each bracket and seeing the spread of the values
o Retains original data rather than using class midpoints to represent
values
• For 2-digit data, left value is the stem; right value is the leaf
• For numbers with more than 2 digits, split is chosen by the analyst’s
preference



2.2 Quantitative Data Graphs (6 of 7)
TABLE 2.4: Safety Examination Scores for Plant Trainees

86    77    91    60    55
76    92    47    88    67
23    59    72    75    83
77    68    82    97    89
81    75    74    39    67
79    83    70    78    91
68    49    56    94    81

TABLE 2.5: Stem-and-Leaf Plot for Plant Safety Examination Data

Stem    Leaf
2       3
3       9
4       7 9
5       5 6 9
6       0 7 7 8 8
7       0 2 4 5 5 6 7 7 8 9
8       1 1 2 3 3 6 8 9
9       1 1 2 4 7
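
One way to build a stem-and-leaf plot from the Table 2.4 scores is sketched below. It assumes two-digit whole-number scores, so the tens digit is the stem and the units digit is the leaf; `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict

# Table 2.4: safety examination scores for plant trainees.
scores = [
    86, 77, 91, 60, 55,
    76, 92, 47, 88, 67,
    23, 59, 72, 75, 83,
    77, 68, 82, 97, 89,
    81, 75, 74, 39, 67,
    79, 83, 70, 78, 91,
    68, 49, 56, 94, 81,
]


def stem_and_leaf(values):
    """Group two-digit values by tens digit (stem); collect units digits (leaves)."""
    plot = defaultdict(list)
    for value in sorted(values):          # sorting keeps stems and leaves ordered
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


plot = stem_and_leaf(scores)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Every original score can be read back from the plot (stem 4 with leaves 7 and 9 gives 47 and 49), which is the sense in which the plot retains the original data.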



2.2 Quantitative Data Graphs (7 of 7)
Construction of the safety examination data stem-and-leaf plot:



2.3 Qualitative Data Graphs (1 of 5)
Pie Chart: circular depiction of the data where the area of the
whole pie represents 100% of the data and slices of the pie
represent percentage breakdown of the sublevels
• Shows relative magnitudes of the sublevels of the data.
• Constructed by determining the proportion of the sublevel
to the whole.
• Proportion is each figure divided by the total.
• Multiply each proportion by 360 degrees to get the angle
for each slice of the pie.



2.3 Qualitative Data Graphs (2 of 5)
TABLE 2.6: Top Five U.S. Petroleum Refining Companies by Revenue

Company               Revenue ($ millions)    Proportion    Degrees
Exxon Mobil           205,004                 .4012         144.43
Chevron               107,567                 .2105         75.78
Phillips 66           72,396                  .1417         51.01
Valero Energy         70,166                  .1373         49.43
Marathon Petroleum    55,858                  .1093         39.35
Totals                510,991                 1.0000        360.00
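
The proportion and degree columns can be recomputed from the revenues, as in the sketch below. One assumption is labeled in the code: the degrees are calculated from the proportions after rounding to four decimals, which appears to be how the table's figures were produced.

```python
# Table 2.6: revenue in $ millions.
revenues = {
    "Exxon Mobil": 205_004,
    "Chevron": 107_567,
    "Phillips 66": 72_396,
    "Valero Energy": 70_166,
    "Marathon Petroleum": 55_858,
}

total = sum(revenues.values())

# Proportion = each figure divided by the total (rounded to 4 decimals).
proportions = {name: round(rev / total, 4) for name, rev in revenues.items()}

# Angle of each slice = proportion * 360 degrees.
# Assumption: the rounded proportion is used, to match the table's values.
degrees = {name: round(p * 360, 2) for name, p in proportions.items()}

for name in revenues:
    print(f"{name:<20} {proportions[name]:.4f} {degrees[name]:7.2f}")
```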



2.3 Qualitative Data Graphs (3 of 5)



2.3 Qualitative Data Graphs (4 of 5)
Bar Graph or Chart: two or more categories along one axis and a series of bars,
one for each category, along the other axis
• Horizontal bar graphs are usually called bar charts
• Vertical bar graphs are usually called column charts

TABLE 2.7: How Much is Spent on Back-to-College Shopping by the Average Student

Category                    Amount Spent ($ US)
Electronics                 211.89
Clothing and Accessories    134.40
Dorm Furnishings            90.90
School Supplies             68.47
Misc.                       93.72



2.3 Qualitative Data Graphs (5 of 5)
Pareto Chart: a vertical bar chart
that displays the most common
types of defects, ranked in order
of occurrence from left to right
• This Pareto chart shows the
count and percentage of each
type of defect found in
electric motors. It also
includes an ogive to show
cumulative frequency.
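
The numbers behind a Pareto chart (ranked counts, percentages, and the cumulative percentages traced by the ogive) can be sketched as follows. The defect names and counts are hypothetical, made up purely for illustration; the figures from the slide's chart are not reproduced here.

```python
# Hypothetical defect counts, for illustration only (not from the slides).
defects = {"Defect A": 12, "Defect B": 40, "Defect C": 8, "Defect D": 20}

total = sum(defects.values())

# Rank the defect types from most to least frequent (left to right on the chart).
ranked = sorted(defects.items(), key=lambda item: item[1], reverse=True)

rows = []
running = 0
for name, count in ranked:
    running += count
    percent = 100 * count / total           # height of the bar
    cumulative_percent = 100 * running / total  # height of the ogive
    rows.append((name, count, percent, cumulative_percent))

for name, count, percent, cumulative_percent in rows:
    print(f"{name}: count={count} percent={percent:.1f} cumulative={cumulative_percent:.1f}")
```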



2.4 Charts and Graphs for Two Variables (1 of 2)
Cross Tabulation: process for producing a two-dimensional table that displays
the frequency count for two variables simultaneously
TABLE 2.8: Banker Data Observations by Job Satisfaction and Age

Banker    Level of Job Satisfaction    Age
1         4                            53
2         3                            37
3         1                            24
4         2                            28
5         4                            46
6         5                            62
7         3                            41
8         3                            32
9         4                            29
.
.
.
177       3                            51

TABLE 2.9: Cross Tabulation Table of Banker Data

                                Age Category:    Age Category:    Age Category:
                                Under 30         30–50            Over 50          Total
Level of Job Satisfaction: 1    7                3                0                10
Level of Job Satisfaction: 2    19               14               3                36
Level of Job Satisfaction: 3    28               17               12               57
Level of Job Satisfaction: 4    11               22               16               49
Level of Job Satisfaction: 5    2                9                14               25
Total                           67               65               45               177
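
Cross tabulation amounts to counting observations for each pair of categories. The sketch below uses only the ten observations visible in Table 2.8, so its counts are a small subset of Table 2.9. Treating ages 30 and 50 as part of the "30-50" category is an assumption, and `age_category` is a hypothetical helper.

```python
from collections import Counter

# The ten observations visible in Table 2.8: (banker, satisfaction level, age).
observations = [
    (1, 4, 53), (2, 3, 37), (3, 1, 24), (4, 2, 28), (5, 4, 46),
    (6, 5, 62), (7, 3, 41), (8, 3, 32), (9, 4, 29), (177, 3, 51),
]

AGE_CATEGORIES = ["Under 30", "30-50", "Over 50"]


def age_category(age):
    """Map an age to its category (assumed: 30 and 50 both fall in '30-50')."""
    if age < 30:
        return "Under 30"
    if age <= 50:
        return "30-50"
    return "Over 50"


# Count each (satisfaction level, age category) pair.
table = Counter((level, age_category(age)) for _, level, age in observations)

print("Level  " + "  ".join(f"{c:>8}" for c in AGE_CATEGORIES) + "     Total")
for level in range(1, 6):
    row = [table[(level, c)] for c in AGE_CATEGORIES]
    print(f"{level:<5}  " + "  ".join(f"{n:>8}" for n in row) + f"  {sum(row):>8}")
```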



2.4 Charts and Graphs for Two Variables (2 of 2)
Scatter Plot: two-dimensional graph
plot of pairs of points from two
numerical variables
• Useful for examining possible
relationships between variables
• Scatter plot here shows values of
new residential and nonresidential
buildings in the U.S. for 35 years
• Data shows a mixed relationship
• Both types of construction might be
expected to rise or fall at the same
time, but data does not confirm that



2.5 Visualizing Time Series Data (1 of 2)
Time Series Data: data gathered on a particular characteristic
over a period of time at regular intervals
• Can be any time period, such as hours, days, years, etc.
• Visualizing time series data using a line chart makes it
easier to see any trends or directions in the data

TABLE 2.11: Motor Vehicles Produced From 2003 Through 2016 in the United States

Year    Number of Motor Vehicles Produced in the U.S. (1000s)
2003    12,145
2004    12,021
2005    12,018
2006    11,351
2007    10,611
2008    8,503
2009    5,591
2010    7,632
2011    8,462
2012    10,142
2013    11,066
2014    11,661
2015    12,100
2016    12,198
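
Alongside a line chart, year-over-year changes give a numeric view of the direction of a series. The sketch below works from the Table 2.11 figures; the variable names are illustrative.

```python
# Table 2.11: motor vehicles produced in the U.S. (1000s), by year.
production = {
    2003: 12145, 2004: 12021, 2005: 12018, 2006: 11351, 2007: 10611,
    2008: 8503, 2009: 5591, 2010: 7632, 2011: 8462, 2012: 10142,
    2013: 11066, 2014: 11661, 2015: 12100, 2016: 12198,
}

years = sorted(production)

# Change from the previous year; negative values mean production fell.
changes = {year: production[year] - production[year - 1] for year in years[1:]}

# Year with the lowest production, and year with the largest one-year drop.
trough_year = min(production, key=production.get)
largest_drop_year = min(changes, key=changes.get)

for year in years[1:]:
    print(f"{year}: {changes[year]:+,}")
print("lowest production:", trough_year)
```

In these figures production falls every year from 2004 through 2009 and rises every year from 2010 through 2016, which is the pattern a line chart of the same data makes visible.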



2.5 Visualizing Time Series Data (2 of 2)



Copyright
Copyright © 2020 John Wiley & Sons, Inc.
All rights reserved. Reproduction or translation of this work beyond that permitted in
Section 117 of the 1976 United States Act without the express written permission of the
copyright owner is unlawful. Request for further information should be addressed to the
Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up
copies for his/her own use only and not for distribution or resale. The Publisher assumes
no responsibility for errors, omissions, or damages, caused by the use of these programs or
from the use of the information contained herein.
