SB 2023 Lecture2 Updated
visually
Statistics for Business
Dr. Le Anh Tuan
1
Graphical Presentation of Data
►Data in raw form are usually not easy to use for
decision making.
►Categorical Variables
►Frequency distribution
►Bar chart
►Pie chart
►Pareto diagram
►Numerical variables
►Line chart
►Frequency distribution
►Histogram and ogive
►Stem-and-leaf display
►Scatter plot
3
Graphical Presentation of Data
[Diagram: Categorical Data: Frequency Distribution, Bar chart, Pie chart, Pareto diagram]
► Bar charts and Pie charts are often used for qualitative
(category) data.
5
Graphical Presentation of Data
Summarize data by category
Major                     Number of students
Finance                   120
International Business    200
Marketing                 150
Management                50
Accounting                75
[Figure: bar chart "Number of students" by major, with bars labeled 200, 150, 120, 75, 50]
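As a rough illustration of how a bar chart encodes such a frequency summary, the sketch below draws a text-only bar chart from the student counts on this slide. The function name `text_bar_chart` and the scale of one `#` per 10 students are assumptions made for this example, not part of the lecture.

```python
# Student counts per major, taken from the slide.
students = {
    "Finance": 120,
    "International Business": 200,
    "Marketing": 150,
    "Management": 50,
    "Accounting": 75,
}


def text_bar_chart(counts, scale=10):
    """Return one text line per category, largest count first.

    Each '#' stands for `scale` units (an arbitrary choice for this sketch).
    """
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    width = max(len(name) for name in counts)
    return [
        f"{name:<{width}} {'#' * (value // scale)} {value}"
        for name, value in ordered
    ]


chart_lines = text_bar_chart(students)
print("\n".join(chart_lines))
```

The bar length is proportional to the count, which is the only property a bar chart needs to preserve.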
7
Bar Charts
8
Bar Charts
9
Pie Charts
► Pie charts are another excellent tool for comparing
proportions for categorical data.
10
Pie Charts
Major                     # of students    Percentage
Finance                   120              20
IB                        200              34
Marketing                 150              25
Management                50               8
Accounting                75               13
[Figure: pie chart "Number of students" with slices Finance 20%, International Business 34%, Marketing 25%, Management 8%, Accounting 13%]
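The percentage column can be reproduced by dividing each count by the total and rounding to a whole number. This is a minimal sketch; the variable names are chosen for this example only.

```python
# Student counts per major, taken from the slide.
students = {
    "Finance": 120,
    "IB": 200,
    "Marketing": 150,
    "Management": 50,
    "Accounting": 75,
}

total = sum(students.values())

# Share of each major as a whole-number percentage of the total.
percentages = {major: round(100 * n / total) for major, n in students.items()}

for major, pct in percentages.items():
    print(f"{major}: {pct}%")
```

Whole-number rounding happens to sum to exactly 100 here; with other data, rounded shares can add up to 99 or 101.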
11
Pareto Diagram
12
Pareto Diagram Example
► A Pareto chart combines a bar graph and a line graph: the
bars show the frequency of each category (for example, each
type of defect) and the line shows their cumulative
impact. Pareto charts help identify which defects
to prioritize in order to achieve the greatest overall
improvement.
► The most problematic categories are shown first.
► For example, suppose you collect data on customer
complaints.
Customer Complaints Frequency
Product 9
Service 7
Store 5
Price 3
Location 2
13
Pareto Diagram Example
Customer Complaints    Frequency    Percentage
Product                9            35
Service                7            27
Store                  5            19
Price                  3            11
Location               2            8
Total                  26           100
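The line in a Pareto chart is built from a running total of the sorted frequencies. The sketch below computes that cumulative percentage for the complaint counts on the slide; the variable names and the one-decimal rounding are assumptions for this example.

```python
# Complaint frequencies from the slide.
complaints = {"Product": 9, "Service": 7, "Store": 5, "Price": 3, "Location": 2}

# A Pareto chart orders categories from most to least frequent.
ordered = sorted(complaints.items(), key=lambda item: item[1], reverse=True)
total = sum(complaints.values())

running = 0
pareto_rows = []  # (category, frequency, cumulative percentage)
for category, freq in ordered:
    running += freq
    pareto_rows.append((category, freq, round(100 * running / total, 1)))

for category, freq, cum_pct in pareto_rows:
    print(f"{category:<10} {freq:>2} {cum_pct:>6}%")
```

Reading down the cumulative column shows that the first three categories account for roughly 80% of all complaints.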
14
Pareto Diagram Example
[Figure: Pareto chart of customer complaints. Categories: Product, Service, Store, Price, Location; left axis 0 to 10, right axis 0 to 120]
15
Graphical Presentation of Data
[Diagram: Numerical Data: Histogram, Ogive]
16
Frequency Distribution
17
Relative Frequency Distribution
18
Number of Classes
19
Frequency Distribution
20
Frequency Distribution
21
Class Boundaries
22
Frequency Distribution
24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27
23
Frequency Distribution
►Find range: 58 - 12 = 46
24
Frequency Distribution
Interval    Frequency    Relative Frequency    Percentage
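The steps from raw data to a frequency distribution can be sketched as follows, using the 20 values listed earlier. The class width of 10 and the first lower boundary of 10 are assumptions, chosen to match the "10 but less than 20" style of interval used in the cumulative table of this lecture.

```python
data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

# Step 1: range = largest value - smallest value.
data_range = max(data) - min(data)

# Step 2: choose classes (ASSUMED: width 10, first class starts at 10).
width = 10
start = 10

# Step 3: count the observations in each class "lower but less than upper".
n = len(data)
table = []  # (lower, upper, frequency, relative frequency, percentage)
for lower in range(start, max(data) + 1, width):
    upper = lower + width
    freq = sum(lower <= x < upper for x in data)
    table.append((lower, upper, freq, freq / n, 100 * freq / n))

print("Range:", data_range)
for lower, upper, freq, rel, pct in table:
    print(f"{lower} but less than {upper}: {freq:2d}  {rel:.2f}  {pct:.0f}%")
```

Changing `width` or `start` shows how sensitive the resulting distribution is to the choice of classes.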
25
Histograms
►A histogram is a graphical representation of a
frequency distribution.
26
Histograms
[Figure: histogram of the frequency distribution (20 observations in total), drawn with no gaps between bars. X-axis: Temperature in Degrees, 0 to 60; Y-axis: Frequency]
27
The Shapes of Histograms
28
The Consequences of Too Few or
Too Many Classes
► Wide classes result in few class intervals, which:
► Can hide important patterns.
► Give a “blocky” distribution graph.
[Figure: histogram "Weight Distribution" with three classes: [8, 51], (51, 94], (94, 137]]
29
The Consequences of Too Few or
Too Many Classes
► Too many narrow classes have consequences:
► They result in a “jagged” histogram
► Some classes may be empty
► They do not summarize the data enough
[Figure: histogram with many narrow classes. X-axis: weight, 0 to 100; Y-axis: Frequency, 0 to 4]
30
The Ogive
31
The Cumulative Frequency Distribution
Interval               Frequency    Relative Frequency    Percentage    Cumulative Frequency    Cumulative Percentage
10 but less than 20    3            0.15                  15            3                       15
20 but less than 30    6            0.30                  30            9                       45
Total                  20           1                     100
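Cumulative frequencies are running totals of the class frequencies, and an ogive plots them against the upper class boundaries. The sketch below applies this to the 20 values from the frequency distribution example; the classes of width 10 starting at 10 follow the intervals in the table, while the variable names are assumptions for this example.

```python
from itertools import accumulate

data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

lowers = list(range(10, 60, 10))  # lower boundaries 10, 20, ..., 50
freqs = [sum(lo <= x < lo + 10 for x in data) for lo in lowers]

# Running totals of the frequencies, then as percentages of n.
cum_freqs = list(accumulate(freqs))
cum_pcts = [100 * c / len(data) for c in cum_freqs]

# Ogive points: 0% at the first lower boundary, then the cumulative
# percentage reached at each upper boundary.
ogive_points = [(lowers[0], 0.0)] + [
    (lo + 10, pct) for lo, pct in zip(lowers, cum_pcts)
]

for boundary, pct in ogive_points:
    print(f"less than {boundary}: {pct:.0f}%")
```

The first two cumulative values, 15% and 45%, match the rows shown in the table.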
32
The Ogive: Graphing Cumulative Frequencies
[Figure: ogive. X-axis: 10 to 60; Y-axis: Cumulative Percentage, 0 to 100]
33
Stem-and-Leaf Diagram
34
Stem-and-Leaf Diagram
35
Stem-and-Leaf Diagram
Stem | Leaves
2    | 1 4 4 6 7 7
3    | 0 2 8
4    | 1
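A stem-and-leaf display can be built by splitting each value into a stem (tens digit) and a leaf (units digit). The sketch below does this for all 20 values from the frequency distribution example, on the assumption that the display above is drawn from the same data; it therefore lists more leaves than the partial display shown.

```python
from collections import defaultdict

data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

# Group the units digits (leaves) under their tens digit (stem).
stems = defaultdict(list)
for x in sorted(data):
    stem, leaf = divmod(x, 10)
    stems[stem].append(leaf)

display = [
    f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
    for stem, leaves in sorted(stems.items())
]
print("\n".join(display))
```

Because the data are sorted first, the leaves in each row appear in increasing order, which makes the shape of the distribution easy to read.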
36
Stem-and-Leaf Diagram
Value    Stem    Leaf
613      6       1
729      7       3
800      8       0
1221     12      2
(Values are rounded to the nearest ten; the leaf is the tens digit.)
Stem-and-Leaf Diagram
40
Dot Plots
►A dot plot is the simplest graphical display of n
individual values of numerical data.
►Easy to understand.
►It reveals dispersion, central tendency, and the
shape of the distribution.
►If more than one data value lies at about the same
axis location, the dots are stacked vertically.
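The stacking rule can be illustrated with a text-only dot plot. This sketch reuses the 20 values from the frequency distribution example and stacks dots only for exactly equal values, which is a simplification of "about the same axis location".

```python
from collections import Counter

data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

counts = Counter(data)  # a Counter returns 0 for values that never occur
lo, hi = min(data), max(data)

# One text row per stacking level, top level first; one column per integer.
rows = []
for level in range(max(counts.values()), 0, -1):
    rows.append(
        "".join("*" if counts[v] >= level else " " for v in range(lo, hi + 1))
    )

print("\n".join(rows))
print(f"{lo}{' ' * (hi - lo - 3)}{hi}")  # rough axis labels at both ends
```

Only 24 and 27 occur twice, so the top row contains exactly two dots.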
41
Dot Plots
42
Graphs for Time-Series Data
43
Graphs for Time-Series Data
► A line chart (time-series plot) is used to show the values of a
variable over time
44
Graphs for Time-Series Data
45
Relationships Between Variables
46
Cross Tables
► If there are r categories for the first variable (rows) and c categories
for the second variable (columns), the table is called an r x c cross
table
► Tools: PivotTables
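What a PivotTable does can be sketched in a few lines: group records by a row category and a column category, then aggregate. The records below are HYPOTHETICAL, invented only to produce a 4 x 3 table; they are not the lecture's investment data.

```python
from collections import defaultdict

# HYPOTHETICAL records: (investment type, investor, amount in millions VND).
records = [
    ("Savings", "A", 10), ("Savings", "B", 5),
    ("Stock market", "A", 8), ("Stock market", "C", 12),
    ("Bond market", "B", 7), ("Insurance", "C", 3),
    ("Savings", "A", 4),
]

row_labels = sorted({row for row, _, _ in records})  # r categories
col_labels = sorted({col for _, col, _ in records})  # c categories

# Sum the amounts that fall into each (row, column) cell.
cross = defaultdict(int)
for row, col, amount in records:
    cross[(row, col)] += amount

table = [[cross[(row, col)] for col in col_labels] for row in row_labels]

print(f"{'':<14}" + "".join(f"{col:>6}" for col in col_labels))
for row, cells in zip(row_labels, table):
    print(f"{row:<14}" + "".join(f"{cell:>6}" for cell in cells))
```

Replacing the sum with a count of records per cell gives a cross table of frequencies instead of amounts.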
47
Cross Tables
► 4 x 3 Cross Table for Investment Portfolios by Investor (values in
millions VND)
48
Cross Tables
[Figure: bar chart "Investment Portfolio". Categories: Savings, Stock market, Bond market, Insurance; vertical axis 0 to 45]
49
Scatter Plots
50
Scatter Plots
► Scatter plots can convey patterns in data pairs that would
not be apparent from a table.
51
Scatter Plots
Happiness Index    GDP Per Capita ($US)
9                  40,000
3                  10,230
4                  12,939
3                  9,383
6                  28,300
2                  4,000
[Figure: scatter plot "Happiness and GDP Per Capita". X-axis: Happiness Index, 0 to 12; Y-axis: GDP per capita, 0 to 70,000]
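One common way to put a number on the pattern in a scatter plot is the Pearson correlation coefficient. It is not introduced on these slides, so treat the sketch below as a supplementary illustration; it uses only the six (X, Y) pairs legible in the table, and the function name `pearson_r` is an assumption for this example.

```python
from math import sqrt

# The six data pairs listed on the slide.
happiness = [9, 3, 4, 3, 6, 2]
gdp = [40000, 10230, 12939, 9383, 28300, 4000]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(happiness, gdp)
print(f"r = {r:.3f}")
```

A value of r close to +1 indicates a strong positive linear association. It does not show that one variable causes the other.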
52
Scatter Plots
► The figure shows a scatter plot with Happiness Index on the X-axis and GDP per Capita on the Y-axis.
► In this illustration, there seems to [...] Y. [...] versa).
► No cause-and-effect relationship is implied because, in this example, both variables could be influenced by a third variable that is not mentioned (e.g., Population).
[Figure: scatter plot "Happiness and GDP Per Capita". X-axis: Happiness Index, 0 to 12; Y-axis: 0 to 70,000]
53
Scatter Plots
► A scatter plot can convey patterns in data pairs that would not be
apparent from a table.
► Some scatter plot patterns similar to those that you might observe when
you have a sample of (X, Y) data pairs.
54
Exercise
55