Advanced Level Descriptive Statistics


DESCRIPTIVE STATISTICS 1

A measure of location is a single value which describes a position in a data set

⇒ If the single value describes the centre of the data, it is called a measure of central
tendency

⇒ You should already know how to work out the mean, median and mode:

• The mode or modal class is the value or class that occurs most often.
• The median is the middle value when the data values are put in order.
• The mean can be calculated by using the formula:

$$\bar{x} = \frac{\sum x}{n}$$

where $\bar{x}$ represents the mean (you say 'x bar'), $\sum x$ represents the sum of the data values and $n$ is the number of data values.

• For data given in a frequency table, the mean can be calculated using the formula:

$$\bar{x} = \frac{\sum fx}{\sum f}$$

EXAMPLE
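A minimal Python sketch of the frequency-table formula, using the heights table (150 cm to 158 cm, 100 pupils) that appears later in these notes. The function name `frequency_mean` is an illustrative assumption, not part of the notes.

```python
# Heights (cm) and numbers of pupils, copied from the heights table in these notes.
heights = [150, 151, 152, 153, 154, 155, 156, 157, 158]
pupils = [2, 0, 15, 29, 25, 12, 10, 4, 3]


def frequency_mean(values, freqs):
    """Return sum(f * x) / sum(f) for a frequency table."""
    total_fx = sum(f * x for x, f in zip(values, freqs))
    return total_fx / sum(freqs)


mean_height = frequency_mean(heights, pupils)
print(round(mean_height, 2))  # 15389 / 100 = 153.89
```

Each height is weighted by the number of pupils who have it, which is exactly what the product fx does in the formula.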

PREPARED BY MRS ASSUMPTA KASAMBA PROV23



⇒ The median describes the middle of the data set. It splits the data set into two equal (50%)
halves.

MEASURES OF DISPERSION

These measures describe how the scores are spread or scattered around the mean. They include the range, the interquartile range, the semi-interquartile range, percentiles, deciles, the mean deviation, the variance and the standard deviation.

Range is the difference between the highest and the lowest value in the distribution. The
range is based on the extreme values of the distribution.
$$\text{Range} = \text{highest value} - \text{lowest value}$$
Example: Each of these sets of numbers has a mean of 7, but the spread of each set is different.
(a) 7, 7, 7, 7, 7, 7
(b) 4, 6, 6.5, 7.2, 11.3
(c) −193, −46, 28, 69, 177
In (a) the range = 7 − 7 = 0
In (b) the range = 11.3 − 4 = 7.3
In (c) the range = 177 − (−193) = 370
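The comparison above can be checked with a short Python sketch. The helper name `data_range` and the dictionary layout are assumptions made for illustration; results are rounded to avoid floating-point noise.

```python
# The three data sets from the example; each should have mean 7.
data_sets = {
    "a": [7, 7, 7, 7, 7, 7],
    "b": [4, 6, 6.5, 7.2, 11.3],
    "c": [-193, -46, 28, 69, 177],
}


def data_range(values):
    """Range = highest value - lowest value."""
    return max(values) - min(values)


summary = {}
for label, values in data_sets.items():
    mean = sum(values) / len(values)
    # Store (mean, range), rounded to 4 decimal places.
    summary[label] = (round(mean, 4), round(data_range(values), 4))
    print(label, summary[label])
```

The means agree while the ranges differ widely, which is why a measure of location alone does not describe a data set.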
⇒ You can calculate other measures of dispersion such as quartiles and percentiles

QUARTILES: These divide the data or distribution into four equal parts.

⇒ Use these rules to find the upper and lower quartiles for discrete (ungrouped) data.

For ungrouped data, first arrange all the values in ascending order.

• To find the lower quartile for discrete data, divide n by 4. If this is a whole number, the lower quartile is halfway between this data point and the one above. If it is not a whole number, round up and pick this data point.
• To find the upper quartile for discrete data, find 3/4 of n. If this is a whole number, the upper quartile is halfway between this data point and the one above. If it is not a whole number, round up and pick this data point.
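Example
Below are the marks of 13 students in a weekly test: 65, 74, 91, 62, 60, 92, 67, 71, 75, 86, 85, 65, 61
Find the (i) median
(ii) lower quartile
(iii) upper quartile
Solution
Arranging the marks in ascending order: 60, 61, 62, 65, 65, 67, 71, 74, 75, 85, 86, 91, 92
(i) The median is the 7th of the 13 ordered values ∴ median = 71
(ii) Lower quartile
Position of $Q_1$ = 1/4 × 13 = 3.25, which is not a whole number, so round up to the 4th value ∴ $Q_1 = 65$
(iii) Upper quartile
Position of $Q_3$ = 3/4 × 13 = 9.75, which is not a whole number, so round up to the 10th value ∴ $Q_3 = 85$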
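The round-up rule for discrete data can be sketched in Python as follows. The function names are illustrative, and applying the same rule to the median position n/2 is an assumption (the notes state the rule only for the quartiles).

```python
import math


def value_at_position(sorted_data, position):
    """Pick the value at a 1-based position using the rule in the notes.

    Whole-number position: halfway between that value and the next one.
    Otherwise: round the position up and take that value.
    """
    if position == int(position):
        k = int(position)
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    k = math.ceil(position)
    return sorted_data[k - 1]


def quartiles(data):
    """Return (lower quartile, median, upper quartile) for ungrouped data."""
    ordered = sorted(data)
    n = len(ordered)
    return (
        value_at_position(ordered, n / 4),
        value_at_position(ordered, n / 2),
        value_at_position(ordered, 3 * n / 4),
    )


marks = [65, 74, 91, 62, 60, 92, 67, 71, 75, 86, 85, 65, 61]
print(quartiles(marks))  # positions 3.25, 6.5, 9.75 round up to 4th, 7th, 10th
```

With 13 marks the positions 3.25, 6.5 and 9.75 are not whole numbers, so the 4th, 7th and 10th ordered values are selected.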
Example
The table below shows the heights to the nearest cm of 100 pupils
Height(cm) No of pupils
150 2
151 0
152 15
153 29
154 25
155 12
156 10
157 4
158 3
Calculate the (i) median height
(ii) Lower quartile
(iii) Upper quartile
Solution

Height (cm)   No. of pupils   Cf
150            2                2
151            0                2
152           15               17
153           29               46
154           25               71
155           12               83
156           10               93
157            4               97
158            3              100

(i) Position of median = 1/2 × 100 = 50th
∴ median height is 154 cm
(ii) Position of $Q_1$ = 1/4 × 100 = 25th
∴ $Q_1$ = 153 cm
(iii) Position of $Q_3$ = 3/4 × 100 = 75th
∴ $Q_3$ = 155 cm

MEDIAN OF GROUPED DATA


Step 1: Construct the cumulative frequency distribution.
Step 2: Determine the class that contains the median. The median class is the first class whose cumulative frequency is greater than or equal to n/2.
Step 3: Find the median by using the following formula:

$$\text{median} = L_m + \left(\frac{\frac{n}{2} - Cf_b}{f_m}\right) i$$

where $L_m$ is the lower class boundary of the median class,
$n$ is the total number of items in the data,
$Cf_b$ is the cumulative frequency before that of the median class,
$i$ is the class width of the median class,
$f_m$ is the actual frequency of the median class.

QUARTILES OF GROUPED DATA


LOWER QUARTILE ($Q_1$)
is calculated using the formula

$$Q_1 = L_{Q_1} + \left(\frac{\frac{n}{4} - Cf_b}{f_{Q_1}}\right) i$$

Position of the $Q_1$ class = $\frac{1}{4}n$th value,
where $L_{Q_1}$ is the lower class boundary of the $Q_1$ class,
$n$ is the total number of items in the data,
$Cf_b$ is the cumulative frequency before that of the $Q_1$ class,
$i$ is the class width of the $Q_1$ class,
$f_{Q_1}$ is the actual frequency of the $Q_1$ class.

UPPER QUARTILE ($Q_3$)
is calculated using the formula

$$Q_3 = L_{Q_3} + \left(\frac{\frac{3n}{4} - Cf_b}{f_{Q_3}}\right) i$$
EXAMPLE
The data below shows the ages of patients who tested positive with COVID 19
Age (years) Number of patients
40 − 44 9
45 − 49 13
50 − 54 17
55 − 59 10
60 − 64 8
65 − 69 6
70 − 74 2
Calculate the (i) median
(ii) Interquartile range and hence the quartile deviation

Solution
Age (years)    f    Cf   Class boundaries
40 − 44        9     9   39.5 − 44.5
45 − 49       13    22   44.5 − 49.5
50 − 54       17    39   49.5 − 54.5
55 − 59       10    49   54.5 − 59.5
60 − 64        8    57   59.5 − 64.5
65 − 69        6    63   64.5 − 69.5
70 − 74        2    65   69.5 − 74.5

(i) Position of median class = 1/2 (65)th = 32.5th ∴ median class is 50 − 54
$L_m = 49.5$, $\frac{n}{2} = \frac{65}{2} = 32.5$, $Cf_b = 22$, $i = 5$, $f_m = 17$

$$\text{median} = L_m + \left(\frac{\frac{n}{2} - Cf_b}{f_m}\right) i = 49.5 + \frac{(32.5 - 22) \times 5}{17} = 49.5 + 3.0882 = 52.5882$$

(ii) Interquartile range = upper quartile − lower quartile

Lower quartile
Position of $Q_1$ class = 1/4 (65)th = 16.25th ∴ $Q_1$ class is 45 − 49
$L_{Q_1} = 44.5$, $\frac{n}{4} = \frac{65}{4} = 16.25$, $Cf_b = 9$, $i = 5$, $f_{Q_1} = 13$

$$Q_1 = L_{Q_1} + \left(\frac{\frac{n}{4} - Cf_b}{f_{Q_1}}\right) i = 44.5 + \frac{\left(\frac{65}{4} - 9\right) \times 5}{13} = 47.2885$$

Upper quartile
Position of $Q_3$ class = 3/4 (65)th = 48.75th ∴ $Q_3$ class is 55 − 59
$L_{Q_3} = 54.5$, $\frac{3n}{4} = \frac{3 \times 65}{4} = 48.75$, $Cf_b = 39$, $i = 5$, $f_{Q_3} = 10$

$$Q_3 = L_{Q_3} + \left(\frac{\frac{3n}{4} - Cf_b}{f_{Q_3}}\right) i = 54.5 + \frac{\left(\frac{3 \times 65}{4} - 39\right) \times 5}{10} = 59.3750$$


Interquartile range = 59.3750 − 47.2885 = 12.0865

$$\text{Quartile deviation} = \frac{12.0865}{2} = 6.04325$$
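The grouped-data formulas for the median and quartiles share one pattern, so they can be sketched as a single Python function. The name `grouped_quantile` and its `gap` parameter (the 1-year gap between stated class limits, used to form class boundaries) are assumptions made for this illustration.

```python
# Class limits and frequencies from the patients' ages example.
classes = [(40, 44), (45, 49), (50, 54), (55, 59), (60, 64), (65, 69), (70, 74)]
freqs = [9, 13, 17, 10, 8, 6, 2]


def grouped_quantile(classes, freqs, fraction, gap=1.0):
    """Estimate the value at position fraction * n by linear interpolation.

    Class boundaries are taken to lie gap / 2 outside the stated limits.
    """
    n = sum(freqs)
    position = fraction * n
    cf_before = 0
    for (low, high), f in zip(classes, freqs):
        # First class whose cumulative frequency reaches the position.
        if f > 0 and cf_before + f >= position:
            lower_boundary = low - gap / 2
            width = (high + gap / 2) - lower_boundary
            return lower_boundary + (position - cf_before) / f * width
        cf_before += f
    raise ValueError("fraction must lie between 0 and 1")


median = grouped_quantile(classes, freqs, 0.5)
q1 = grouped_quantile(classes, freqs, 0.25)
q3 = grouped_quantile(classes, freqs, 0.75)
iqr = q3 - q1
print(round(median, 4), round(q1, 4), round(q3, 4), round(iqr, 4))
```

The quartile deviation is then `iqr / 2`; small differences in the last decimal place compared with a hand calculation come from rounding intermediate values.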

DECILES
These divide the data into ten equal parts. The kth decile is denoted $D_k$, and its position is the $\frac{k}{10}n$th value.
$D_k$ is calculated using the formula

$$D_k = L_k + \left(\frac{\frac{k}{10}n - Cf_b}{f_k}\right) i$$

PERCENTILES

These divide the data into 100 equal parts. The kth percentile is denoted $P_k$, and its position is the $\frac{k}{100}n$th value.
$P_k$ is calculated using the formula

$$P_k = L_k + \left(\frac{\frac{k}{100}n - Cf_b}{f_k}\right) i$$
INTERPERCENTILE RANGE
This is also known as the percentile range.
This is the difference between the 90th percentile and 10th percentile.

Interpercentile range = $P_{90} - P_{10}$

EXAMPLE
Use the previous data to find the (i) 7th decile
(ii) percentile range.
Solution
Age (years)    f    Cf   Class boundaries
40 − 44        9     9   39.5 − 44.5
45 − 49       13    22   44.5 − 49.5
50 − 54       17    39   49.5 − 54.5
55 − 59       10    49   54.5 − 59.5
60 − 64        8    57   59.5 − 64.5
65 − 69        6    63   64.5 − 69.5
70 − 74        2    65   69.5 − 74.5

(i) Position of $D_7$ = 7/10 (65)th = 45.5th
$D_7$ class is 55 − 59, $L_{D_7} = 54.5$, $\frac{7 \times 65}{10} = 45.5$, $Cf_b = 39$, $i = 5$, $f_{D_7} = 10$

$$D_7 = 54.5 + \frac{(45.5 - 39) \times 5}{10} = 54.5 + 3.25$$

∴ $D_7 = 57.75$

(ii) Position of $P_{90}$ = 90/100 (65)th = 58.5th
$P_{90}$ class is 65 − 69, $L_{90} = 64.5$, $\frac{90 \times 65}{100} = 58.5$, $Cf_b = 57$, $i = 5$, $f_{90} = 6$

$$P_{90} = L_{90} + \left(\frac{\frac{9n}{10} - Cf_b}{f_{90}}\right) i = 64.5 + \frac{(58.5 - 57) \times 5}{6}$$

∴ $P_{90} = 65.75$

Position of $P_{10}$ = 10/100 (65)th = 6.5th
$P_{10}$ class is 40 − 44, $L_{10} = 39.5$, $\frac{10 \times 65}{100} = 6.5$, $Cf_b = 0$, $i = 5$, $f_{10} = 9$

$$P_{10} = L_{10} + \left(\frac{\frac{n}{10} - Cf_b}{f_{10}}\right) i = 39.5 + \frac{(6.5 - 0) \times 5}{9}$$

∴ $P_{10} = 43.1111$
Percentile range = $P_{90} - P_{10}$
= 65.75 − 43.1111
= 22.6389

N.B. The quartiles, deciles, and percentiles can also be obtained from the ogive.
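Reading a value off the ogive amounts to interpolating between the plotted points (upper class boundary, cumulative frequency). The Python sketch below assumes the ogive is drawn with straight line segments, in which case it reproduces the interpolation formulas; a hand-drawn smooth curve would give slightly different estimates. The function name `read_from_ogive` is illustrative.

```python
# Ogive points for the patients' ages data: boundaries and cumulative frequencies.
boundaries = [39.5, 44.5, 49.5, 54.5, 59.5, 64.5, 69.5, 74.5]
cumulative = [0, 9, 22, 39, 49, 57, 63, 65]


def read_from_ogive(position, boundaries, cum_freqs):
    """Return the x-value at which a straight-line ogive reaches `position`."""
    points = list(zip(boundaries, cum_freqs))
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= position <= c1:
            return x0 + (position - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("position lies outside the cumulative frequencies")


n = cumulative[-1]
d7 = read_from_ogive(7 * n / 10, boundaries, cumulative)
p90 = read_from_ogive(90 * n / 100, boundaries, cumulative)
p10 = read_from_ogive(10 * n / 100, boundaries, cumulative)
print(round(d7, 4), round(p90, 4), round(p10, 4), round(p90 - p10, 4))
```

Note that the 90th percentile falls in the 65 − 69 class, whose frequency in the table is 6 (the rise in cumulative frequency from 57 to 63).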

MEAN DEVIATION
This measures the deviation of each score or variable from the mean without regard to the
sign.
(i) MEAN DEVIATION OF UNGROUPED DATA
$$\text{mean deviation} = \frac{\sum |x - \bar{x}|}{n}$$
where $\bar{x}$ is the mean and $|\ |$ is the magnitude or absolute value sign, which means take the size of each deviation and ignore any negative sign.
Example
The data below are titre values obtained by students during a chemistry practical:
23.50, 24.40, 23.70, 24.20 and 23.90 cm³. Calculate the mean deviation.

Solution
$$\bar{x} = \frac{23.50 + 24.40 + 23.70 + 24.20 + 23.90}{5} = 23.94$$

x        x − x̄     |x − x̄|
23.50    −0.44     0.44
23.70    −0.24     0.24
23.90    −0.04     0.04
24.20     0.26     0.26
24.40     0.46     0.46
                   Σ|x − x̄| = 1.44

$$\text{mean deviation} = \frac{\sum |x - \bar{x}|}{n} = \frac{1.44}{5} = 0.288$$
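The same calculation as a minimal Python sketch; the function name `mean_deviation` is an assumption made for illustration.

```python
# Titre values (cm^3) from the chemistry practical example.
titres = [23.50, 24.40, 23.70, 24.20, 23.90]


def mean_deviation(values):
    """Mean of the absolute deviations from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


titre_md = mean_deviation(titres)
print(round(titre_md, 3))  # 1.44 / 5 = 0.288
```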

(ii) MEAN DEVIATION OF GROUPED DATA


$$\text{mean deviation} = \frac{\sum f|x - \bar{x}|}{\sum f}$$
Example
The ages of people in a town were as follows
Ages (years)                 0−<5   5−<15   15−<30   30−<50   50−<70   70−<90
No. of people (thousands)    4.4    8.1     10.5     14.6     9.8      4.7
Calculate the (i) mean deviation
(ii) number of people under 18 years

Solution
Ages (years)   f      Midpoint, x   fx        |x − x̄|    f|x − x̄|
0 − 5          4.4    2.5           11.0      33.5125    147.455
5 − 15         8.1    10            81.0      26.0125    210.70125
15 − 30        10.5   22.5          236.25    13.5125    141.88125
30 − 50        14.6   40            584.0     3.9875     58.2175
50 − 70        9.8    60            588.0     23.9875    235.0775
70 − 90        4.7    80            376.0     43.9875    206.74125
Σf = 52.1      Σfx = 1876.25        Σf|x − x̄| = 1000.074

$$\text{Mean} = \frac{\sum fx}{\sum f} = \frac{1876.25}{52.1} = 36.0125$$

(i) $$\text{mean deviation} = \frac{\sum f|x - \bar{x}|}{\sum f} = \frac{1000.074}{52.1} = 19.1953$$

(ii) The age of 18 lies inside the 15 − 30 class, so take all of the first two classes plus the fraction (18 − 15)/15 of the third class:
number of people under 18 years = 4.4 + 8.1 + ((18 − 15)/15 × 10.5)
= 4.4 + 8.1 + 2.1
= 14.6 thousand = 14.6 × 1000 = 14600
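A short Python sketch of the grouped calculation, dividing by Σf as the formula requires. It assumes, as the worked solution does, that each class is represented by its midpoint and that people are spread evenly within a class; variable names are illustrative.

```python
# Age classes (years) and numbers of people in thousands.
classes = [(0, 5), (5, 15), (15, 30), (30, 50), (50, 70), (70, 90)]
freqs = [4.4, 8.1, 10.5, 14.6, 9.8, 4.7]

midpoints = [(low + high) / 2 for low, high in classes]
total_f = sum(freqs)

# Mean = sum(f * x) / sum(f)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / total_f

# Mean deviation = sum(f * |x - mean|) / sum(f)
mean_dev = sum(f * abs(x - mean) for f, x in zip(freqs, midpoints)) / total_f

# People under 18: two full classes plus (18 - 15) / 15 of the 15-30 class.
low, high = classes[2]
under_18 = freqs[0] + freqs[1] + (18 - low) / (high - low) * freqs[2]

print(round(mean, 4), round(mean_dev, 4), round(under_18, 1))
```

The last figure is in thousands, so it corresponds to 14600 people.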
THE STANDARD DEVIATION, s, AND THE VARIANCE, $s^2$
The standard deviation, s, is a very important and useful measure of spread. It gives a measure of the deviations of the readings from the mean. It is calculated using all the values in the distribution.
Variance is defined as the mean of the squared deviations from the mean.
Relationship between variance and standard deviation
$$\text{standard deviation} = \sqrt{\text{variance}}$$
Standard deviation of ungrouped data
The standard deviation, s, of a set of n numbers with mean $\bar{x}$ is given by

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$

and variance = (standard deviation)² = $s^2$

$$\therefore s^2 = \frac{\sum (x - \bar{x})^2}{n}$$
Example
Two machines, A and B are used to pack biscuits. Ten packets were taken from each machine and the mass
of each packet was measured to the nearest gram and noted.
Machine A 196 198 198 199 200 200 201 201 202 205
Machine B 192 194 195 198 200 201 203 204 206 207
Calculate the standard deviation of the masses taken from each machine. Comment on your answer
Solution
Machine A: $\bar{x} = \frac{\sum x}{n} = \frac{2000}{10} = 200$
Machine B: $\bar{x} = \frac{\sum x}{n} = \frac{2000}{10} = 200$
Since the mean mass for each machine is 200, $x - \bar{x} = x - 200$.
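The remaining arithmetic can be sketched in Python, applying the formula $s = \sqrt{\sum (x - \bar{x})^2 / n}$ to each machine. The function name is illustrative, and the sketch is a check on the calculation rather than the notes' own tabulated working.

```python
import math

machine_a = [196, 198, 198, 199, 200, 200, 201, 201, 202, 205]
machine_b = [192, 194, 195, 198, 200, 201, 203, 204, 206, 207]


def standard_deviation(values):
    """Square root of the mean squared deviation from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


sd_a = standard_deviation(machine_a)  # sqrt(56 / 10)
sd_b = standard_deviation(machine_b)  # sqrt(240 / 10)
print(round(sd_a, 4), round(sd_b, 4))
```

Both machines have the same mean mass, but machine B has the larger standard deviation, so its packet masses are more variable and machine A is the more consistent of the two.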


ALTERNATIVE FORM OF THE FORMULA OF STANDARD DEVIATION

The formula $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$ is awkward to use, especially when the calculated mean is not an integer, so the following alternative form will be used from here on.

$$\therefore s = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2} = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$$
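One way to see why the two forms agree is to expand the squared deviations and use $\sum x = n\bar{x}$. This is the standard algebraic argument, sketched here in the notation of these notes:

```latex
\begin{align*}
\sum (x - \bar{x})^2 &= \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 \\
                     &= \sum x^2 - 2n\bar{x}^2 + n\bar{x}^2 \qquad \text{since } \sum x = n\bar{x} \\
                     &= \sum x^2 - n\bar{x}^2 \\
\Rightarrow \quad \frac{\sum (x - \bar{x})^2}{n} &= \frac{\sum x^2}{n} - \bar{x}^2
\end{align*}
```

Taking the square root of both sides gives the alternative form of s.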
Example
The mean of the five numbers 2, 3, 5, 6, 8 is 4.8. Calculate the standard deviation.

Solution
Method 1, using $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

x     x − x̄    (x − x̄)²
2     −2.8     7.84
3     −1.8     3.24
5      0.2     0.04
6      1.2     1.44
8      3.2     10.24
               Σ(x − x̄)² = 22.80

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{22.80}{5}} = 2.1354$$

Method 2, using $s = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}$

x     x²
2     4
3     9
5     25
6     36
8     64
      Σx² = 138

$$s = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2} = \sqrt{\frac{138}{5} - 4.8^2} = 2.1354$$

Method 2 is less cumbersome, so we shall use it more frequently.

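STANDARD DEVIATION AND VARIANCE OF A FREQUENCY DISTRIBUTION

This is calculated using the formula

$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} \quad \text{and variance, } s^2 = \frac{\sum f(x - \bar{x})^2}{\sum f}$$

Alternative form

$$s = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2} \quad \text{where } \bar{x} = \frac{\sum fx}{\sum f}, \quad \text{and variance, } s^2 = \frac{\sum fx^2}{\sum f} - \bar{x}^2$$

This is also written as

$$s = \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2} \quad \text{and variance, } s^2 = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2$$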

Example
The number of children per family is given in the frequency distribution below.
No of children per family, x 1 2 3 4 5
frequency 3 4 8 2 3
Calculate the (i) variance (ii) standard deviation
Solution
x     f     fx
1     3     3
2     4     8
3     8     24
4     2     8
5     3     15
Σf = 20     Σfx = 58

$$\text{mean, } \bar{x} = \frac{\sum fx}{\sum f} = \frac{58}{20} = 2.9$$

Method 1, using $s^2 = \frac{\sum f(x - \bar{x})^2}{\sum f}$

x     f     x − 2.9    (x − 2.9)²    f(x − 2.9)²
1     3     −1.9       3.61          10.83
2     4     −0.9       0.81          3.24
3     8      0.1       0.01          0.08
4     2      1.1       1.21          2.42
5     3      2.1       4.41          13.23
Σf = 20                Σf(x − 2.9)² = 29.80

$$\text{Variance} = s^2 = \frac{\sum f(x - \bar{x})^2}{\sum f} = \frac{29.80}{20} = 1.49$$

Standard deviation = $\sqrt{1.49} = 1.2207$

Method 2, using $s^2 = \frac{\sum fx^2}{\sum f} - \bar{x}^2$

x     f     fx     x²     fx²
1     3     3      1      3
2     4     8      4      16
3     8     24     9      72
4     2     8      16     32
5     3     15     25     75
Σf = 20     Σfx = 58      Σfx² = 198

$$s^2 = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2 = \frac{198}{20} - \left(\frac{58}{20}\right)^2 = 1.49$$

Standard deviation = $\sqrt{\text{variance}} = \sqrt{1.49} = 1.2207$
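Both methods can be compared side by side in a minimal Python sketch using the children-per-family table; the variable names are assumptions made for illustration.

```python
import math

# Number of children per family (x) and frequencies (f).
xs = [1, 2, 3, 4, 5]
fs = [3, 4, 8, 2, 3]

n = sum(fs)
mean = sum(f * x for x, f in zip(xs, fs)) / n

# Method 1: mean of the squared deviations from the mean.
variance_deviation_form = sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n

# Method 2: mean of the squares minus the square of the mean.
variance_alternative_form = sum(f * x ** 2 for x, f in zip(xs, fs)) / n - mean ** 2

sd = math.sqrt(variance_alternative_form)
print(round(variance_deviation_form, 4), round(variance_alternative_form, 4), round(sd, 4))
```

The two variances agree up to floating-point rounding, which is the numerical counterpart of the algebraic equivalence of the two formulas.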

Example
An IQ test was taken by 115 students. For each student the time taken to complete the test was recorded and tabulated as shown below.
Time in minutes   <1    <2    <3    <5    <10
Frequency         10    15    25    40    25
Calculate the (i) mean (ii) variance
Solution
Time in minutes   f     x      fx       fx²
0−<1              10    0.5    5        2.5
1−<2              15    1.5    22.5     33.75
2−<3              25    2.5    62.5     156.25
3−<5              40    4.0    160      640
5−<10             25    7.5    187.5    1406.25
Σf = 115          Σfx = 437.5  Σfx² = 2238.75

(i) $$\text{mean} = \bar{x} = \frac{\sum fx}{\sum f} = \frac{437.5}{115} = 3.8043$$

(ii) $$\text{variance} = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2 = \frac{2238.75}{115} - \left(\frac{437.5}{115}\right)^2 = 4.9943$$

STANDARD DEVIATION AND VARIANCE FORMULA USING ASSUMED MEAN

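The formulae for finding the variance and standard deviation using an assumed mean are

$$\text{Variance} = s^2 = \frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2 \quad \text{where } d = x - A \text{ and } A \text{ is the assumed mean}$$

$$\text{Standard deviation} = \sqrt{\text{variance}} = \sqrt{\frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2}$$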
Example
The data below shows the masses of goats in kg.
Mass (kg)    4.0−5.9   6.0−7.9   8.0−9.9   10.0−11.9   12.0−13.9   14.0−15.9   16.0−17.9   18.0−19.9
Frequency    1         3         7         9           9           12          8           1
Using an assumed mean of 10.95, calculate (i) the mean (ii) the standard deviation.

N.B. We construct only one table to answer all the given questions.
Solution
Mass (kg)      f     x        d = x − A    fd     fd²
4.0 − 5.9      1     4.95     −6           −6     36
6.0 − 7.9      3     6.95     −4           −12    48
8.0 − 9.9      7     8.95     −2           −14    28
10.0 − 11.9    9     10.95     0            0     0
12.0 − 13.9    9     12.95     2            18    36
14.0 − 15.9    12    14.95     4            48    192
16.0 − 17.9    8     16.95     6            48    288
18.0 − 19.9    1     18.95     8            8     64
Σf = 50        Σfd = 90        Σfd² = 692

(i) $$\text{Mean} = A + \frac{\sum fd}{\sum f} = 10.95 + \frac{90}{50} = 12.75$$

(ii) $$\text{standard deviation} = \sqrt{\frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2} = \sqrt{\frac{692}{50} - \left(\frac{90}{50}\right)^2} = 3.2558$$
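The assumed-mean method can be sketched in Python as below. The data are the class midpoints and frequencies from the goats example; the variable names are illustrative.

```python
import math

# Class midpoints (kg) and frequencies from the goats example.
midpoints = [4.95, 6.95, 8.95, 10.95, 12.95, 14.95, 16.95, 18.95]
freqs = [1, 3, 7, 9, 9, 12, 8, 1]
assumed_mean = 10.95

# Deviations from the assumed mean, d = x - A.
deviations = [x - assumed_mean for x in midpoints]

n = sum(freqs)
mean_d = sum(f * d for f, d in zip(freqs, deviations)) / n
mean_d2 = sum(f * d ** 2 for f, d in zip(freqs, deviations)) / n

mean = assumed_mean + mean_d          # A + sum(fd) / sum(f)
variance = mean_d2 - mean_d ** 2      # sum(fd^2)/sum(f) - (sum(fd)/sum(f))^2
sd = math.sqrt(variance)
print(round(mean, 4), round(sd, 4))
```

Shifting every value by A changes the mean by A but leaves the spread unchanged, which is why the variance formula needs no correction term for A.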

Example
The table below shows the marks obtained in a test by 75 students
Marks               <5     <10    <20    <30    <45    <50    <55
Frequency density   0.4    1.2    0.8    2.4    2      1      0
Calculate the (i) mean (ii) median (iii) number of students with a mark of more than 35.
Solution
Frequency = frequency density × class width, so f = f.d × i.
Marks      f.d    i     f     x       fx       cf
0 − 5      0.4    5     2     2.5     5        2
5 − 10     1.2    5     6     7.5     45       8
10 − 20    0.8    10    8     15      120      16
20 − 30    2.4    10    24    25      600      40
30 − 45    2      15    30    37.5    1125     70
45 − 50    1      5     5     47.5    237.5    75
50 − 55    0      5     0     52.5    0        75
Σf = 75    Σfx = 2132.5

(i) $$\text{mean} = \frac{2132.5}{75} = 28.4333$$

(ii) Position of median class = 1/2 × 75 = 37.5th ∴ median class is 20 − 30

$$\text{median} = 20 + \left(\frac{\frac{75}{2} - 16}{24}\right) \times 10 = 28.9583$$

(iii) $$\text{number of students with a mark more than 35} = 5 + \left(\frac{45 - 35}{15}\right) \times 30 = 25$$
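A Python sketch of the three calculations, starting from the frequency densities. It assumes marks are spread evenly within each class (the same assumption the interpolation in the solution makes); the helper names `grouped_median` and `count_above` are illustrative.

```python
# Class limits and frequency densities from the marks example.
classes = [(0, 5), (5, 10), (10, 20), (20, 30), (30, 45), (45, 50), (50, 55)]
densities = [0.4, 1.2, 0.8, 2.4, 2, 1, 0]

# Frequency = frequency density * class width.
freqs = [d * (high - low) for d, (low, high) in zip(densities, classes)]
n = sum(freqs)

mean = sum(f * (low + high) / 2 for f, (low, high) in zip(freqs, classes)) / n


def grouped_median(classes, freqs):
    """Interpolate within the first class whose cumulative frequency reaches n/2."""
    half = sum(freqs) / 2
    cf_before = 0.0
    for (low, high), f in zip(classes, freqs):
        if f > 0 and cf_before + f >= half:
            return low + (half - cf_before) / f * (high - low)
        cf_before += f


def count_above(threshold, classes, freqs):
    """Whole classes above the threshold plus a proportional share of the split class."""
    total = 0.0
    for (low, high), f in zip(classes, freqs):
        if low >= threshold:
            total += f
        elif high > threshold:
            total += (high - threshold) / (high - low) * f
    return total


median = grouped_median(classes, freqs)
above_35 = count_above(35, classes, freqs)
print(round(n), round(mean, 4), round(median, 4), round(above_35, 4))
```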



1. Calculate the mean, median and standard deviation of the set of data 2, 4, 6, 3, 1, 2, 8, 12, 16.

2. The mode of the numbers 4, 2, 1, 4, 3, 4, 2, 6, 2, x and 5 is 2. Find the (i) value of x (ii) median (iii) mean.

3. For a set of ten numbers, $\sum x = 290$ and $\sum x^2 = 8469$. Find the mean and variance.

4. For a set of nine numbers, $\sum (x - \bar{x})^2 = 234$. Find the standard deviation of these numbers.

5. For a set of nine numbers, $\sum (x - \bar{x})^2 = 60$ and $\sum x^2 = 285$. Find the mean of the numbers.

6. For a particular set of data, $n = 100$, $\sum (x - 50) = 123.5$ and $\sum (x - 50)^2 = 238.4$. Find the mean and standard deviation of x.

7. Find the variance of x if $\sum f(x - 100) = 127$, $\sum f = 20$ and $\sum f(x - 100)^2 = 2593$.

8. For a particular set of values, $\sum f = 20$, $\sum fx^2 = 16$ and $\sum fx = 563$. Find the mean and standard deviation.

9. For a given frequency distribution, $\sum f = 30$, $\sum f(x - \bar{x})^2 = 182.3$ and $\sum fx^2 = 1025$. Find the mean of the distribution.

11. The scores in an IQ test for 60 candidates are shown in the table. Find the mean score and standard deviation.
Score        Frequency
100 − 106    8
107 − 113    13
114 − 120    24
121 − 127    11
128 − 134    4

12. The table below shows the weekly wages in £ of each of 100 factory workers.
Wage (£)         Workers
200 ≤ x < 250    10
250 ≤ x < 300    16
300 ≤ x < 375    40
375 ≤ x < 400    26
400 ≤ x < 500    8
(a) Draw a histogram to illustrate this information.
(b) Calculate the mean wage and the standard deviation.

13. The marks of 40 students in a test were as follows
Marks    Frequency
30 −     8
40 −     5
50 −     12
60 −     9
70 −     6
80 −     0
Calculate the (i) mode (ii) mean (iii) $P_{84}$ (iv) median (v) quartile deviation (vi) number of students with a mark less than 58.5 (vii) variance.

14. The cumulative frequency table below shows the ages of employees of a certain company.
Age (years)    Cf
< 15           0
< 20           17
< 30           39
< 40           69
< 50           87
< 60           92
< 65           98
Calculate the mean and the standard deviation using an assumed mean of 30.

15. The following table shows the time to the nearest second recorded for a telephonist to answer calls received on a certain day.
Time       Frequency
10 − 19    20
20 − 24    20
25 − 29    15
30         14
31 − 34    16
35 − 39    10
40 − 59    10
(a) Calculate the (i) mean time (ii) standard deviation.
(b) Draw a histogram to represent the above data.

16. The table below shows the heights of pupils in a certain school.
Height       Frequency
80 − 84      10
85 − 89      15
90 − 94      35
95 − 99      40
100 − 104    28
105 − 109    15
110 − 114    4
115 − 119    3
Draw a cumulative frequency curve and use it to estimate the (i) median (ii) interquartile range (iii) number of students with a height below 102.5 cm.

17. The heights in cm were recorded as in the table below.
Height    Cf
149 −     5
153 −     22
157 −     42
161 −     67
165 −     82
169 −     88
173 −     90
177 −     90
(a) Calculate the mean height.
(b) Plot a cumulative frequency curve and use it to estimate the middle 60% height range of the candidates.

18. The table below shows the cumulative distribution of the height in cm of 400 students.
Height (cm)    Cf
< 12           0
< 13           27
< 14           85
< 16           215
< 17           320
< 18           370
< 20           395
< 21           400
(a) Plot an ogive for the data and use it to estimate the (i) 20th to 80th percentile height range (ii) number of students who are above 19 cm.
(b) Find the mean and standard deviation.
