MTH 2210 WK2 Frequency Distributions


Frequency Distribution
Week 2
Instructor: Dr. G. Okello
Learning Objectives
At the end of this session, students should be able to:
• Summarize raw data in tabular form and present it using various graphs
• Draw a histogram, a frequency polygon, and less-than and more-than ogives
Frequency distribution
• Data with a relatively small number of distinct values can be conveniently presented in a frequency table
• A frequency distribution is a tabular summary of data showing the number (frequency) of items in each group or class
Starting Salary (x) Frequency (f)
47 4
48 1
49 3
50 5
Grouped data
• Some data sets have a large number of distinct values; it is useful to divide the values into groupings or classes
• The end points of class intervals are called class boundaries/limits
• Ungrouped data can be grouped in a table
– with class boundaries
– or without class boundaries (a discrete frequency distribution)
Grouped data
• If you group data with intervals, you may use either
– the exclusive method of classification: the last value (upper limit) of the interval is excluded from the class
– the inclusive method of classification: the last value (upper limit) of the interval is included in the class
• Always convert inclusive classes to exclusive classes before performing any analysis on the grouped data
Grouped data
Steps for constructing a frequency distribution table without intervals
• Group similar items
• Count the number of times each item appears
Grouped data
• Discrete frequency distribution
• Example: Present the starting salary data in a discrete frequency distribution table
47,47,47,47,48,49,49,49,50,50,50,50,50
Starting Salary (x) Frequency (f)
47 4
48 1
49 3
50 5
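The two steps for a table without intervals (group similar items, then count them) can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the function name `discrete_frequency_table` is an assumption chosen for this example.

```python
from collections import Counter

def discrete_frequency_table(values):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(values)          # group identical items and count them
    return sorted(counts.items())     # order the rows by data value

# Starting salary data from the example above.
salaries = [47, 47, 47, 47, 48, 49, 49, 49, 50, 50, 50, 50, 50]
table = discrete_frequency_table(salaries)

print("Starting Salary (x)  Frequency (f)")
for value, freq in table:
    print(f"{value:<20} {freq}")
```

The frequencies should add up to the number of observations (13 here), which is a quick check that no value was missed.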
Grouped data
• The other kind of frequency distribution table shows the class interval in which each value falls
• Example:
Class Interval (x) Frequency (f)
500-600 2
600-700 5
700-800 12
800-900 25
900-1000 58
Grouped data
• Considerations for classification/grouping data:
– Classes should be clearly defined – no ambiguity
– Classes should be exhaustive
– Classes should not overlap
– Classes should be of equal width
– Avoid indeterminate classes, i.e. open-ended classes ("less than" or "greater than")
– The number of classes should be neither too large nor too small (between 5 and 20 classes). Sturges' formula for determining the number of classes $k$ is $k = 1 + 3.322 \log_{10} N$, where $N$ is the total frequency
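As a rough illustration of Sturges' formula, the sketch below computes $k$ for a given total frequency $N$. Rounding the result up to a whole number of classes is an assumption made here (the formula itself gives a decimal value), and the function name `sturges_classes` is hypothetical.

```python
import math

def sturges_classes(n):
    """Number of classes k = 1 + 3.322 * log10(N), rounded up (assumed rounding rule)."""
    if n < 1:
        raise ValueError("total frequency N must be at least 1")
    return math.ceil(1 + 3.322 * math.log10(n))

# For N = 20 observations: 1 + 3.322 * log10(20) is about 5.32, so 6 classes.
k = sturges_classes(20)
print(k)
```

The result is only a guide; the "between 5 and 20 classes" guideline still applies when the formula gives a value outside that range.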
Grouped data
• Class limits
– Should be chosen in a way that mid-value of the class
interval is the average of observations in that class
– class limits are the smallest and largest observations
(data) in each class.
– each class has two limits: a lower and upper
• Class limits are obtained from the class width
• $\text{class width} = \dfrac{\text{highest value} - \text{lowest value}}{\text{number of classes}}$
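A small sketch of the class width calculation for whole-number data is given below. The data values (lowest 181, highest 258) and the function name `class_width` are hypothetical, chosen only for illustration; the result is rounded up to the next whole number so that the classes cover the full range.

```python
import math

def class_width(lowest, highest, num_classes):
    """Range divided by the number of classes, rounded up to a whole number."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return math.ceil((highest - lowest) / num_classes)

# Hypothetical data: lowest value 181, highest value 258, grouped into 4 classes.
# (258 - 181) / 4 = 19.25, which rounds up to a width of 20.
width = class_width(181, 258, 4)
print(width)
```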
Grouped data
• Class Boundaries
– They are halfway points that separate the classes
– The lower class boundary of a given class is obtained
by averaging the upper limit of the previous class and
the lower limit of the given class.
– The upper class boundary of a given class is obtained
by averaging the upper limit of the class and the lower
limit of the next class.

• Class marks are the midpoints of the classes.
– They are obtained by averaging the limits.
Grouped data

Class     Frequency   Class limits   Class boundaries   Class mark   Class size
180-199   2           180, 199       179.5, 199.5       189.5        20
200-219   5           200, 219       199.5, 219.5       209.5        20
220-239   12          220, 239       219.5, 239.5       229.5        20
240-259   6           240, 259       239.5, 259.5       249.5        20
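The boundaries, marks and sizes in the table above can be reproduced with a short sketch. It assumes inclusive class limits with the same gap between consecutive classes (here 200 - 199 = 1), so that half the gap is subtracted from each lower limit and added to each upper limit; for the first and last class this stands in for the missing neighbouring class. The function name `class_summary` is hypothetical. The same calculation converts inclusive classes to exclusive ones.

```python
def class_summary(limits):
    """For inclusive (lower, upper) limits, return (lower boundary,
    upper boundary, class mark, class size) for each class."""
    if len(limits) < 2:
        raise ValueError("need at least two classes to measure the gap")
    gap = limits[1][0] - limits[0][1]      # e.g. 200 - 199 = 1
    rows = []
    for lower, upper in limits:
        lower_boundary = lower - gap / 2   # halfway to the previous class
        upper_boundary = upper + gap / 2   # halfway to the next class
        mark = (lower + upper) / 2         # midpoint of the class limits
        size = upper_boundary - lower_boundary
        rows.append((lower_boundary, upper_boundary, mark, size))
    return rows

limits = [(180, 199), (200, 219), (220, 239), (240, 259)]
summary = class_summary(limits)
for (lower, upper), row in zip(limits, summary):
    print(f"{lower}-{upper}", row)
```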
Grouped data
• Creating a Grouped Frequency Distribution
– Find the largest and smallest values
– Compute the Range = Maximum - Minimum
– Select the number of classes desired (between 5 and 20) or use Sturges' formula
– Find the class width by dividing the range by the
number of classes and rounding up.
– Pick a suitable starting point less than or equal to the
minimum value.
– To find the upper limit of the first class, subtract one unit from the lower limit of the second class.
Grouped data
– Find the boundaries by subtracting 0.5 units from the lower limits and adding 0.5 units to the upper limits.
– Tally the data
– Find the frequencies, cumulative frequencies, relative
frequencies

• Example: Group the following data into 6 classes and construct the frequency table
• 19.7, 19.9, 20.2, 19.9, 20.0, 20.6, 19.3, 20.4, 19.9, 20.3, 20.1, 19.5, 20.9, 20.3, 20.8, 19.9, 20.0, 20.6, 19.9, 19.8
Grouped data
• Using 6 classes, we get a class width of (21.0 − 19.2)/6 = 0.3
• The frequency distribution arranged in a table will be
Data        Mid-point   Frequency
19.2-19.4   19.3        1
19.5-19.7   19.6        2
19.8-20.0   19.9        8
20.1-20.3   20.2        4
20.4-20.6   20.5        3
20.7-20.9   20.8        2
Total                   20
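The tallying step for this example can be sketched as below. To avoid floating-point rounding problems with one-decimal data, the sketch works in whole tenths (19.2 becomes 192); that choice, the `scale` parameter and the function name `grouped_frequency_table` are assumptions of this illustration, not something stated in the slides.

```python
def grouped_frequency_table(values, start, width, num_classes, scale=10):
    """Tally values into equal-width inclusive classes.

    Values are converted to integer multiples of 1/scale (tenths by default)
    so that class membership is decided with exact integer arithmetic.
    Returns rows of (lower limit, upper limit, mid-point, frequency)."""
    start_i = round(start * scale)
    width_i = round(width * scale)
    freqs = [0] * num_classes
    for v in values:
        index = (round(v * scale) - start_i) // width_i
        if not 0 <= index < num_classes:
            raise ValueError(f"value {v} falls outside the classes")
        freqs[index] += 1
    rows = []
    for i, f in enumerate(freqs):
        lower = start_i + i * width_i
        upper = lower + width_i - 1        # one unit below the next lower limit
        rows.append((lower / scale, upper / scale, (lower + upper) / (2 * scale), f))
    return rows

data = [19.7, 19.9, 20.2, 19.9, 20.0, 20.6, 19.3, 20.4, 19.9, 20.3,
        20.1, 19.5, 20.9, 20.3, 20.8, 19.9, 20.0, 20.6, 19.9, 19.8]
rows = grouped_frequency_table(data, start=19.2, width=0.3, num_classes=6)
for lower, upper, mid, freq in rows:
    print(f"{lower}-{upper}  {mid}  {freq}")
```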
Frequency distribution
• Once we have a frequency table, we can
generate some measures and charts from the
table:
– Cumulative frequency
– Relative frequency
– Percent relative frequency
– Histogram
– Bar chart
– Pie chart
– Cumulative frequency curve (ogive) etc
Cumulative frequency
• A cumulative frequency distribution shows the total number of observations in all classes up to and including a given class
Example: how to calculate cumulative frequency

Marks group (x)   No. of students (f)   Less than c.f.   More than c.f.
10-20             4                     4                35
20-30             2                     4+2=6            35-4=31
30-40             8                     6+8=14           31-2=29
40-50             15                    14+15=29         29-8=21
50-60             6                     29+6=35          21-15=6
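The running totals in the table above can be reproduced with a short sketch. The function name `cumulative_frequencies` is hypothetical; the less-than c.f. is a running sum of the frequencies, and the more-than c.f. is the total minus everything counted in earlier classes.

```python
from itertools import accumulate

def cumulative_frequencies(freqs):
    """Return (less-than c.f., more-than c.f.) lists for class frequencies."""
    less_than = list(accumulate(freqs))              # running total from the first class
    total = less_than[-1] if less_than else 0
    before = [0] + less_than[:-1]                    # observations in earlier classes
    more_than = [total - b for b in before]          # what remains from this class on
    return less_than, more_than

# Frequencies for the marks groups 10-20, 20-30, 30-40, 40-50, 50-60.
less_than, more_than = cumulative_frequencies([4, 2, 8, 15, 6])
print(less_than)
print(more_than)
```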
Relative frequency
• $\text{Relative frequency} = \dfrac{\text{Frequency of the class}}{n}$
– where $n$ is the total number of observations
• Example: Grade distribution

Marks group (x)   No. of students (f)   Less than c.f.   More than c.f.   Relative frequency
10-20             4                     4                35               4/35
20-30             2                     6                31               2/35
30-40             8                     14               29               8/35
40-50             15                    29               21               15/35
50-60             6                     35               6                6/35
Percent frequency
• Percent Frequency of the class is the relative
frequency multiplied by 100

• Example: Grade distribution
Marks group (x)   No. of students (f)   Less than c.f.   More than c.f.   Relative frequency   Percent frequency
10-20             4                     4                35               4/35=0.11            11
20-30             2                     6                31               2/35=0.06            6
30-40             8                     14               29               8/35=0.23            23
40-50             15                    29               21               15/35=0.43           43
50-60             6                     35               6                6/35=0.17            17
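The relative and percent frequency columns can be computed as in the sketch below. The function name `relative_frequencies` is hypothetical, and rounding to two decimals (and to whole percents) simply mirrors the rounding used in the table.

```python
def relative_frequencies(freqs):
    """Relative frequency of each class: class frequency divided by n."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    return [f / n for f in freqs]

freqs = [4, 2, 8, 15, 6]                       # grade distribution, n = 35
relative = relative_frequencies(freqs)
relative_rounded = [round(r, 2) for r in relative]
percent = [round(100 * r) for r in relative]   # relative frequency times 100

print(relative_rounded)
print(percent)
```

Unrounded relative frequencies always sum to 1; rounded values may be off by a small amount in other data sets.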
Bar Graph and Pie Chart
• Data from a frequency table can be presented graphically by a line graph
• A line graph plots the distinct data values on the horizontal axis and shows their frequencies by the heights of vertical lines
• Bar graph – a line graph in which the lines are given added thickness (for discrete data)
• Frequency polygon – plots the frequencies of the different data values on the vertical axis, then connects the plotted points with straight lines
Bar Graph and Pie Chart
• A pie chart is used to indicate relative
frequencies when the data are not numerical in
nature.
Type of cancer   Number of cases   Relative frequency   Percent frequency
Lung             42                42/200=0.21          21
Breast           50                50/200=0.25          25
Colon            32                32/200=0.16          16
Prostate         55                55/200=0.275         27.5
Melanoma         9                 9/200=0.045          4.5
Bladder          12                12/200=0.06          6
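To draw the pie chart by hand, each category needs a sector whose angle is its relative frequency times 360 degrees. The sketch below computes the percent frequency and sector angle for the cancer data; the angle column and the function name `pie_chart_rows` are additions for illustration and do not appear in the slides.

```python
def pie_chart_rows(counts):
    """Map each category to its relative frequency, percent frequency
    and pie-chart sector angle in degrees."""
    total = sum(counts.values())
    rows = {}
    for label, count in counts.items():
        rows[label] = {
            "relative": count / total,
            "percent": 100 * count / total,
            "angle": 360 * count / total,   # sector angle proportional to frequency
        }
    return rows

cases = {"Lung": 42, "Breast": 50, "Colon": 32,
         "Prostate": 55, "Melanoma": 9, "Bladder": 12}
pie = pie_chart_rows(cases)
for label, row in pie.items():
    print(f"{label:<10} {row['percent']:>5.1f}%  {row['angle']:>6.1f} degrees")
```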
Histogram
• A histogram is a bar graph of a frequency distribution, with classes along the horizontal axis and frequencies along the vertical axis (the classes must be continuous, with no gaps between them)
• A stem and leaf plot is obtained by first dividing each data value into two parts: its stem and its leaf
– Example: the value 62 has stem 6 and leaf 2
Stem   Leaf
6      2
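A stem and leaf plot for whole numbers can be sketched as follows, assuming the stem is everything except the last digit and the leaf is the last digit (so 62 splits into stem 6 and leaf 2, as in the example). Apart from 62, the data values are hypothetical, as is the function name `stem_and_leaf`.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group non-negative whole numbers by stem (value // 10),
    listing leaves (value % 10) in ascending order."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(plot)

# 62 is the value from the example; the other values are made up for illustration.
plot = stem_and_leaf([62, 67, 71, 58, 65])
print("Stem  Leaf")
for stem, leaves in plot.items():
    print(f"{stem:<5} {' '.join(str(leaf) for leaf in leaves)}")
```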
Ogive
• Graphical presentation of a frequency distribution: histogram and ogive
• A “less than ogive” is obtained when the less-than c.f. is plotted against the upper limits of the corresponding classes
• A “more than ogive” is obtained when the more-than c.f. is plotted against the lower limits of the corresponding classes
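The points to plot for each ogive can be listed with a short sketch. It uses the marks data from the cumulative frequency example and produces only the (x, y) coordinates, leaving the drawing to graph paper or any plotting tool; the function name `ogive_points` is hypothetical.

```python
def ogive_points(classes, freqs):
    """Return (less-than points, more-than points) for the two ogives.

    Less-than points pair each upper limit with the less-than c.f.;
    more-than points pair each lower limit with the more-than c.f."""
    less_than = []
    running = 0
    for (lower, upper), f in zip(classes, freqs):
        running += f
        less_than.append((upper, running))
    more_than = []
    remaining = running                    # total frequency
    for (lower, upper), f in zip(classes, freqs):
        more_than.append((lower, remaining))
        remaining -= f
    return less_than, more_than

classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [4, 2, 8, 15, 6]
less_pts, more_pts = ogive_points(classes, freqs)
print(less_pts)
print(more_pts)
```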
Further Reading
• Read Gupta & Kapoor Chapter 2
• Video clips on creating frequency distribution and graphs
• How to create frequency distribution:
https://www.youtube.com/watch?v=B4NYIzj5jVg
• Sturges' method:
https://www.youtube.com/watch?v=iPHGKhKfj-M
Test your knowledge
• Use grouped and ungrouped data to practise everything we have learnt in class

END OF SLIDE SHOW
THANK YOU
