Advanced Level Descriptive Statistics
⇒ If the single value describes the centre of the data, it is called a measure of central tendency.
⇒ You should already know how to work out the mean, median and mode:
The mode or modal class is the value or class that occurs most often
The median is the middle value when the data values are put in order
The mean can be calculated by using the formula:
$\bar{x} = \frac{\sum x}{n}$
where $\bar{x}$ represents the mean (you say 'x bar'), $\sum x$ represents the sum of the data values and $n$ is the number of data values.
For data given in a frequency table, the mean can be calculated using the formula:
$\bar{x} = \frac{\sum fx}{\sum f}$
⇒ The median describes the middle of the data set. It splits the data set into two equal (50%) halves.
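The three measures of central tendency can be checked with a few lines of Python. This is a minimal sketch: the data values and the small frequency table are hypothetical, chosen only for illustration, and `table_mean` is an assumed name.

```python
from statistics import mean, median, mode

# Hypothetical data values, chosen only for illustration.
values = [2, 3, 3, 5, 7]
print(mean(values), median(values), mode(values))

# Hypothetical frequency table: each value x occurs f times.
x_values = [1, 2, 3]
freqs = [2, 1, 2]

# Mean from a frequency table: sum of f*x divided by sum of f.
table_mean = sum(f * x for x, f in zip(x_values, freqs)) / sum(freqs)
print(table_mean)
```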
Measures of dispersion are used to find out how the scores spread or scatter around or away from the mean. They include the range, the interquartile range, the semi-interquartile range, percentiles, deciles, the mean deviation, the variance and the standard deviation.
Range is the difference between the highest and the lowest value in the distribution. The
range is based on the extreme values of the distribution.
Range = highest value − lowest value
Example: Each of these sets of numbers has a mean of 7, but the spread of each set is different.
(a) 7, 7, 7, 7, 7, 7   (b) 4, 6, 6.5, 7.2, 11.3   (c) −193, −46, 28, 69, 177
In (a) the range = 7 − 7 = 0
In (b) the range = 11.3 − 4 = 7.3
In (c) the range = 177 − (−193) = 370
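The range example can be reproduced with a short Python sketch. The helper name `data_range` is an assumption made for illustration; the three data sets are the ones in the example.

```python
# The three data sets from the example; each has a mean of 7.
datasets = {
    "a": [7, 7, 7, 7, 7, 7],
    "b": [4, 6, 6.5, 7.2, 11.3],
    "c": [-193, -46, 28, 69, 177],
}

def data_range(values):
    """Range = highest value - lowest value."""
    return max(values) - min(values)

for label, values in datasets.items():
    mean = sum(values) / len(values)
    # Round to hide tiny floating-point noise in the decimal data.
    print(label, round(mean, 4), round(data_range(values), 4))
```

The means agree while the ranges differ widely, which is why a measure of central tendency alone does not describe a data set.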
⇒ You can calculate other measures of dispersion such as quartiles and percentiles
QUARTILES: These divide the data or distribution into four equal parts.
⇒ Use these rules to find the upper and lower quartiles for discrete/ungrouped data
Height (cm)   Frequency   Cf
150           2           2
151           0           2
152           15          17
153           29          46
154           25          71
155           12          83
156           10          93
157           4           97
158           3           100
(i) Position of median = $\frac{1}{2} \times 100$ = 50th
∴ median height is 154 cm
(ii) Position of $Q_1$ = $\frac{1}{4} \times 100$ = 25th
∴ $Q_1$ = 153 cm
(iii) Position of $Q_3$ = $\frac{3}{4} \times 100$ = 75th
∴ $Q_3$ = 155 cm
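The position rule used in this example can be sketched in Python as follows. The function name `value_at_position` is an assumption; it returns the first height whose cumulative frequency reaches the required position, which is how the quartiles are read off the Cf column.

```python
heights = [150, 151, 152, 153, 154, 155, 156, 157, 158]
freqs = [2, 0, 15, 29, 25, 12, 10, 4, 3]

def value_at_position(values, freqs, position):
    """Return the first value whose cumulative frequency reaches `position`."""
    cumulative = 0
    for value, f in zip(values, freqs):
        cumulative += f
        if cumulative >= position:
            return value
    raise ValueError("position exceeds the total frequency")

n = sum(freqs)                                       # 100 observations
median = value_at_position(heights, freqs, n / 2)    # 50th value
q1 = value_at_position(heights, freqs, n / 4)        # 25th value
q3 = value_at_position(heights, freqs, 3 * n / 4)    # 75th value
print(median, q1, q3)  # 154 153 155
```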
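For grouped data, the upper quartile $Q_3$ is calculated using the formula
$Q_3 = L_{Q_3} + \frac{\left(\frac{3n}{4} - Cf_b\right) i}{f_{Q_3}}$
where $L_{Q_3}$ is the lower class boundary of the $Q_3$ class, $Cf_b$ is the cumulative frequency before that class, $i$ is the class width and $f_{Q_3}$ is the frequency of the $Q_3$ class.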
The data below shows the ages of patients who tested positive for COVID-19.
Age (years) Number of patients
40 − 44 9
45 − 49 13
50 − 54 17
55 − 59 10
60 − 64 8
65 − 69 6
70 − 74 2
Calculate the (i) median
(ii) Interquartile range and hence the quartile deviation
Age (years)   f    Cf   Class boundaries
40 − 44       9    9    39.5 − 44.5
45 − 49       13   22   44.5 − 49.5
50 − 54       17   39   49.5 − 54.5
55 − 59       10   49   54.5 − 59.5
60 − 64       8    57   59.5 − 64.5
65 − 69       6    63   64.5 − 69.5
70 − 74       2    65   69.5 − 74.5
(i) $median = L_m + \frac{\left(\frac{n}{2} - Cf_b\right) i}{f_m}$
Position of median class = $\frac{1}{2}(66)$th = 33rd ∴ median class is 50 − 54
$L_m = 49.5$, $\frac{n}{2} = \frac{65}{2} = 32.5$, $Cf_b = 22$, $i = 5$, $f_m = 17$
$median = 49.5 + \frac{(32.5 - 22)5}{17}$
= 49.5 + 3.0882 = 52.5882
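(ii) Interquartile range = upper quartile − lower quartile
$Q_3 = L_{Q_3} + \frac{\left(\frac{3n}{4} - Cf_b\right) i}{f_{Q_3}}$
$= 54.5 + \frac{\left(\frac{3 \times 65}{4} - 39\right) 5}{10} = 59.3750$
$Q_1 = L_{Q_1} + \frac{\left(\frac{n}{4} - Cf_b\right) i}{f_{Q_1}}$
$= 44.5 + \frac{\left(\frac{65}{4} - 9\right) 5}{13} = 47.2885$
Interquartile range = 59.3750 − 47.2885 = 12.0865
Quartile deviation = $\frac{12.0865}{2}$ = 6.04325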
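The interpolation formula for grouped data can be sketched in Python as below. The function name `grouped_quantile` and the tuple layout are assumptions made for illustration; the class boundaries and frequencies are those of the age table. Because the sketch keeps every value unrounded, its last decimal place may differ slightly from a hand calculation that rounds the quartiles first.

```python
# Each class: (lower class boundary, class width i, frequency f).
classes = [
    (39.5, 5, 9), (44.5, 5, 13), (49.5, 5, 17), (54.5, 5, 10),
    (59.5, 5, 8), (64.5, 5, 6), (69.5, 5, 2),
]

def grouped_quantile(classes, fraction):
    """Interpolate the value at position fraction * n in a grouped table."""
    n = sum(f for _, _, f in classes)
    position = fraction * n
    cf_before = 0  # cumulative frequency before the current class
    for lower, width, f in classes:
        if f > 0 and cf_before + f >= position:
            return lower + (position - cf_before) * width / f
        cf_before += f
    raise ValueError("fraction must lie between 0 and 1")

median = grouped_quantile(classes, 1 / 2)
q1 = grouped_quantile(classes, 1 / 4)
q3 = grouped_quantile(classes, 3 / 4)
iqr = q3 - q1
quartile_deviation = iqr / 2
print(round(median, 4), round(q1, 4), round(q3, 4), round(quartile_deviation, 4))
```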
Deciles divide the data into ten equal parts. The kth decile is denoted $D_k$, and its position is the $\frac{k}{10} n$th value.
$D_k$ is calculated using the formula
$D_k = L_k + \frac{\left(\frac{k}{10} n - Cf_b\right) i}{f_k}$
Percentiles divide the data into 100 equal parts. The kth percentile is denoted $P_k$, and its position is the $\frac{k}{100} n$th value.
$P_k$ is calculated using the formula
$P_k = L_k + \frac{\left(\frac{k}{100} n - Cf_b\right) i}{f_k}$
The percentile range is the difference between the 90th percentile and the 10th percentile:
Percentile range = $P_{90} - P_{10}$
Use the previous data to find the (i) 7th decile
(ii) percentile range.
Age (years)   f    Cf   Class boundaries
40 − 44       9    9    39.5 − 44.5
45 − 49       13   22   44.5 − 49.5
50 − 54       17   39   49.5 − 54.5
55 − 59       10   49   54.5 − 59.5
60 − 64       8    57   59.5 − 64.5
65 − 69       6    63   64.5 − 69.5
70 − 74       2    65   69.5 − 74.5
(i) Position of $D_7$ = $\frac{7}{10}(65)$th = 45.5th
$D_7$ class is 55 − 59, $L_{D_7} = 54.5$, $\frac{7 \times 65}{10} = 45.5$, $Cf_b = 39$, $i = 5$, $f_{D_7} = 10$
$D_7 = 54.5 + \frac{(45.5 - 39)5}{10}$
= 54.5 + 3.25
∴ $D_7 = 57.75$
(ii) Position of $P_{90}$ = $\frac{90}{100}(65)$th = 58.5th
$P_{90}$ class is 65 − 69, $L_{90} = 64.5$, $\frac{90 \times 65}{100} = 58.5$, $Cf_b = 57$, $i = 5$, $f_{90} = 6$
$P_{90} = L_{90} + \frac{\left(\frac{90n}{100} - Cf_b\right) i}{f_{90}}$
$= 64.5 + \frac{(58.5 - 57)5}{6}$
∴ $P_{90} = 65.75$
Position of $P_{10}$ = $\frac{10}{100}(65)$th = 6.5th
$P_{10}$ class is 40 − 44, $L_{10} = 39.5$, $\frac{10 \times 65}{100} = 6.5$, $Cf_b = 0$, $i = 5$, $f_{10} = 9$
$P_{10} = L_{10} + \frac{\left(\frac{10n}{100} - Cf_b\right) i}{f_{10}}$
$= 39.5 + \frac{(6.5 - 0)5}{9}$
∴ $P_{10} = 43.1111$
Percentile range = $P_{90} - P_{10}$
= 65.75 − 43.1111
= 22.6389
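The decile and percentile formulas share one pattern: the kth of m equal parts sits at position kn/m. A minimal Python sketch of that pattern follows; `partition_value` is an assumed name, and the sketch uses the class frequencies exactly as tabulated (for example, 6 for the 65 − 69 class).

```python
from bisect import bisect_left
from itertools import accumulate

lower_boundaries = [39.5, 44.5, 49.5, 54.5, 59.5, 64.5, 69.5]
freqs = [9, 13, 17, 10, 8, 6, 2]
width = 5                                  # every class has width i = 5
cum_freqs = list(accumulate(freqs))        # 9, 22, 39, 49, 57, 63, 65

def partition_value(k, parts):
    """kth of `parts` equal divisions: parts=10 for deciles, 100 for percentiles."""
    n = cum_freqs[-1]
    position = k * n / parts
    idx = bisect_left(cum_freqs, position)  # first class whose Cf reaches the position
    cf_before = cum_freqs[idx - 1] if idx > 0 else 0
    return lower_boundaries[idx] + (position - cf_before) * width / freqs[idx]

d7 = partition_value(7, 10)
p90 = partition_value(90, 100)
p10 = partition_value(10, 100)
percentile_range = p90 - p10
print(round(d7, 4), round(p90, 4), round(p10, 4), round(percentile_range, 4))
```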
N.B. The quartiles, deciles, and percentiles can also be obtained from the ogive.
The mean deviation measures the deviation of each score or variable from the mean without regard to the sign.
$mean\ deviation = \frac{\sum |x - \bar{x}|}{n}$
where $\bar{x}$ is the mean and | | is the magnitude or absolute value sign, which means take the positive value only, or ignore the negative sign.
The data below are titre values obtained by students during a chemistry practical:
23.50, 24.40, 23.70, 24.20 and 23.90 cm³. Calculate the mean deviation.
$\bar{x} = \frac{23.50 + 24.40 + 23.70 + 24.20 + 23.90}{5} = 23.94$
x       x − x̄    |x − x̄|
23.50   −0.44    0.44
23.70   −0.24    0.24
23.90   −0.04    0.04
24.20   0.26     0.26
24.40   0.46     0.46
$\sum |x - \bar{x}| = 1.44$
$mean\ deviation = \frac{\sum |x - \bar{x}|}{n} = \frac{1.44}{5} = 0.288$
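The titre calculation can be checked with a short Python sketch; the variable names are assumptions made for illustration.

```python
titres = [23.50, 24.40, 23.70, 24.20, 23.90]  # cm^3, from the example

mean = sum(titres) / len(titres)

# Mean deviation: average of the absolute deviations from the mean.
mean_deviation = sum(abs(x - mean) for x in titres) / len(titres)

print(round(mean, 2), round(mean_deviation, 3))  # 23.94 0.288
```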
∴ $s = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2} = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$
The mean of the five numbers 2, 3, 5, 6, 8 is 4.8. Calculate the standard deviation.
Method 1 using $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
x    x − x̄    (x − x̄)²
2    −2.8     7.84
3    −1.8     3.24
5    0.2      0.04
6    1.2      1.44
8    3.2      10.24
$\sum (x - \bar{x})^2 = 22.80$
$s = \sqrt{\frac{22.80}{5}} = 2.1354$
Method 2 using $s = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}$
x    x²
2    4
3    9
5    25
6    36
8    64
$\sum x^2 = 138$
$s = \sqrt{\frac{138}{5} - 4.8^2} = 2.1354$
Method 2 is less cumbersome, so we shall use it more frequently.
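A minimal Python sketch can confirm that the two methods agree. The names `s_method1` and `s_method2` are assumptions made for illustration.

```python
from math import sqrt

data = [2, 3, 5, 6, 8]
n = len(data)
mean = sum(data) / n  # 4.8

# Method 1: square root of the mean squared deviation from the mean.
s_method1 = sqrt(sum((x - mean) ** 2 for x in data) / n)

# Method 2: square root of (mean of the squares minus square of the mean).
s_method2 = sqrt(sum(x ** 2 for x in data) / n - mean ** 2)

print(round(s_method1, 4), round(s_method2, 4))  # 2.1354 2.1354
```

Method 2 needs only the running totals of x and x², which is why it is quicker by hand.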
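The number of children per family is given in the frequency distribution below.
No. of children per family, x   1   2   3   4   5
Frequency                       3   4   8   2   3
Calculate the (i) variance (ii) standard deviation
x    f    fx    f(x − x̄)²
1    3    3     10.83
2    4    8     3.24
3    8    24    0.08
4    2    8     2.42
5    3    15    13.23
$\sum f = 20$, $\sum fx = 58$, $\sum f(x - \bar{x})^2 = 29.80$
mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{58}{20} = 2.9$
(i) Method 1 using $s^2 = \frac{\sum f(x - \bar{x})^2}{\sum f}$
Variance = $s^2 = \frac{29.80}{20} = 1.49$
(ii) Standard deviation = $\sqrt{1.49} = 1.2207$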
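The frequency-table version of the variance can be sketched in Python as follows; the variable names are assumptions, and the data are those of the children-per-family table.

```python
from math import sqrt

x_values = [1, 2, 3, 4, 5]   # number of children per family
freqs = [3, 4, 8, 2, 3]      # number of families

sum_f = sum(freqs)
mean = sum(f * x for x, f in zip(x_values, freqs)) / sum_f

# Variance: frequency-weighted mean of the squared deviations.
variance = sum(f * (x - mean) ** 2 for x, f in zip(x_values, freqs)) / sum_f
std_dev = sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 4))
```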
An IQ test was taken by 115 students. For each student the time taken to complete the test was recorded and tabulated as shown below.
Time in minutes   < 1   < 2   < 3   < 5   < 10
Frequency         10    15    25    40    25
Calculate the (i) mean (ii) variance
Time in minutes   f     x     fx      fx²
0 −< 1            10    0.5   5       2.5
1 −< 2            15    1.5   22.5    33.75
2 −< 3            25    2.5   62.5    156.25
3 −< 5            40    4.0   160     640
5 −< 10           25    7.5   187.5   1406.25
$\sum f = 115$, $\sum fx = 437.5$, $\sum fx^2 = 2238.75$
(i) mean = $\bar{x} = \frac{\sum fx}{\sum f} = \frac{437.5}{115} = 3.8043$
(ii) variance = $\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2 = \frac{2238.75}{115} - \left(\frac{437.5}{115}\right)^2 = 4.9943$
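For grouped data the class midpoints stand in for x. The sketch below assumes, as the working table does, that the first class starts at 0; the variable names are illustrative. Because it keeps the mean unrounded, its variance may differ in the fourth decimal place from a hand calculation that squares a rounded mean.

```python
boundaries = [0, 1, 2, 3, 5, 10]   # class boundaries in minutes (first class assumed to start at 0)
freqs = [10, 15, 25, 40, 25]

# Midpoint of each class, used as the representative value x.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

sum_f = sum(freqs)
sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
sum_fx2 = sum(f * x ** 2 for x, f in zip(midpoints, freqs))

mean = sum_fx / sum_f
variance = sum_fx2 / sum_f - mean ** 2

print(sum_f, sum_fx, sum_fx2)
print(round(mean, 4), round(variance, 2))
```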
The formulae for finding the standard deviation and variance using the assumed mean are
Variance = $s^2 = \frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2$, where $d = x - A$ and A is the assumed mean
Standard deviation = $\sqrt{variance} = \sqrt{\frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2}$
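The data below shows the masses of goats in kg.
Using an assumed mean of 10.95, calculate the (i) mean (ii) standard deviation.
Mass in kg     f    x       d = x − A   fd     fd²
4.0 − 5.9      1    4.95    −6          −6     36
6.0 − 7.9      3    6.95    −4          −12    48
8.0 − 9.9      7    8.95    −2          −14    28
10.0 − 11.9    9    10.95   0           0      0
12.0 − 13.9    9    12.95   2           18     36
14.0 − 15.9    12   14.95   4           48     192
16.0 − 17.9    8    16.95   6           48     288
18.0 − 19.9    1    18.95   8           8      64
$\sum f = 50$, $\sum fd = 90$, $\sum fd^2 = 692$
(i) Mean = $A + \frac{\sum fd}{\sum f} = 10.95 + \frac{90}{50} = 12.75$
(ii) Standard deviation = $\sqrt{\frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2} = \sqrt{\frac{692}{50} - \left(\frac{90}{50}\right)^2} = 3.2558$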
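The assumed-mean method can be sketched in Python as below. The class limits, frequencies and assumed mean come from the goat data; the variable names are assumptions, and the class midpoints are taken as the average of the stated class limits.

```python
from math import sqrt

A = 10.95  # assumed mean
class_limits = [(4.0, 5.9), (6.0, 7.9), (8.0, 9.9), (10.0, 11.9),
                (12.0, 13.9), (14.0, 15.9), (16.0, 17.9), (18.0, 19.9)]
freqs = [1, 3, 7, 9, 9, 12, 8, 1]

midpoints = [(lo + hi) / 2 for lo, hi in class_limits]
deviations = [x - A for x in midpoints]          # d = x - A

sum_f = sum(freqs)
sum_fd = sum(f * d for d, f in zip(deviations, freqs))
sum_fd2 = sum(f * d ** 2 for d, f in zip(deviations, freqs))

mean = A + sum_fd / sum_f
std_dev = sqrt(sum_fd2 / sum_f - (sum_fd / sum_f) ** 2)

print(round(sum_fd, 4), round(sum_fd2, 4))
print(round(mean, 2), round(std_dev, 4))
```

Choosing A as one of the midpoints keeps the deviations d small and whole, which is the practical benefit of the method in hand calculation.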
The table below shows the marks obtained in a test by 75 students.
Marks               < 5   < 10   < 20   < 30   < 45   < 50   < 55
Frequency density   0.4   1.2    0.8    2.4    2      1      0
Calculate the (i) mean (ii) median (iii) number of students with a mark more than 35.
The tabulated values are frequency densities (f.d), so each class frequency is f = f.d × i, where i is the class width.
Marks      f.d   i    f    x      fx      cf
0 − 5      0.4   5    2    2.5    5       2
5 − 10     1.2   5    6    7.5    45      8
10 − 20    0.8   10   8    15     120     16
20 − 30    2.4   10   24   25     600     40
30 − 45    2     15   30   37.5   1125    70
45 − 50    1     5    5    47.5   237.5   75
50 − 55    0     5    0    52.5   0       75
$\sum f = 75$, $\sum fx = 2132.5$
(i) mean = $\frac{2132.5}{75} = 28.4333$
(ii) Position of median class = $\frac{1}{2} \times 76$ = 38th ∴ median class is 20 − 30
median = $20 + \left(\frac{\frac{75}{2} - 16}{24}\right) 10 = 28.9583$
(iii) number of students with a mark more than 35 = $5 + \left(\frac{45 - 35}{15}\right) \times 30 = 25$
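The three parts of this example can be sketched in Python. The sketch assumes marks are spread evenly within each class (the same assumption the interpolation in the worked solution makes); `count_above` and the other names are illustrative.

```python
boundaries = [0, 5, 10, 20, 30, 45, 50, 55]
densities = [0.4, 1.2, 0.8, 2.4, 2, 1, 0]

classes = list(zip(boundaries, boundaries[1:]))
# Frequency = frequency density x class width.
freqs = [fd * (hi - lo) for fd, (lo, hi) in zip(densities, classes)]
n = sum(freqs)

# (i) Mean from class midpoints.
mean = sum(f * (lo + hi) / 2 for f, (lo, hi) in zip(freqs, classes)) / n

# (ii) Median by interpolation at position n/2.
position, cf_before, median = n / 2, 0.0, None
for f, (lo, hi) in zip(freqs, classes):
    if f > 0 and cf_before + f >= position:
        median = lo + (position - cf_before) * (hi - lo) / f
        break
    cf_before += f

# (iii) Students above a mark, assuming an even spread within each class.
def count_above(mark):
    total = 0.0
    for f, (lo, hi) in zip(freqs, classes):
        if lo >= mark:
            total += f
        elif hi > mark:
            total += f * (hi - mark) / (hi - lo)
    return total

print(round(n), round(mean, 4), round(median, 4), round(count_above(35)))
```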
4. For a set of nine numbers $\sum (x - \bar{x})^2 = 234$. Find the standard deviation of these numbers.
5. For a set of nine numbers $\sum (x - \bar{x})^2 = 60$, $\sum x^2 = 285$. Find the mean of the numbers.
6. For a particular set of data $n = 100$, $\sum (x - 50) = 123.5$, $\sum (x - 50)^2 = 238.4$. Find the mean and standard deviation of x.
7. Find the variance of x if $\sum f(x - 100) = 127$, $\sum f = 20$ and $\sum f(x - 100)^2 = 2593$.
8. For a particular set of values $\sum f = 20$, $\sum fx^2 = 16$, $\sum fx = 563$. Find the mean and standard deviation.
9. For a given frequency distribution, $\sum f = 30$, $\sum f(x - \bar{x})^2 = 182.3$ and $\sum fx^2 = 1025$. Find the mean of the distribution.
11. The scores in an IQ test for 60 candidates are shown in the table. Find the mean score and standard deviation.
Score        Frequency
100 − 106    8
107 − 113    13
114 − 120    24
121 − 127    11
128 − 134    4
12. The table below shows the weekly wages in £ of each of 100 factory workers.
(a) Draw a histogram to illustrate this information.
(b) Calculate the mean wage and the standard deviation.
Wage (£)         Workers
200 ≤ x < 250    10
250 ≤ x < 300    16
300 ≤ x < 375    40
375 ≤ x < 400    26
400 ≤ x < 500    8
13. The marks of 40 students in a test were as follows:
Marks    Frequency
30 −     8
40 −     5
50 −     12
60 −     9
70 −     6
80 −     0
Calculate the (i) mode (ii) mean (iii) $P_{84}$ (iv) median (v) quartile deviation (vi) number of students with a mark less than 58.5 (vii) variance
14. The cumulative frequency table below shows the ages of employees of a certain company.
Age (years)   Cf
< 15          0
< 20          17
< 30          39
< 40          69
< 50          87
< 60          92
< 65          98
Calculate the mean and the standard deviation using an assumed mean of 30.
15. The following table shows the time to the nearest second recorded for a telephonist to answer calls received on a certain day.
Time       Frequency
10 − 19    20
20 − 24    20
25 − 29    15
30         14
31 − 34    16
35 − 39    10
40 − 59    10
(a) Calculate the (i) mean time (ii) standard deviation
(b) Draw a histogram to represent the above data.
16. The table below shows the heights of pupils in a certain school.
Height       Frequency
80 − 84      10
85 − 89      15
90 − 94      35
95 − 99      40
100 − 104    28
17. The heights in cm were recorded as in the table below.
Height   Cf
149 −    5
153 −    22
157 −    42
161 −    67
165 −    82
169 −    88
18. The table below shows the cumulative distribution of the height in cm of 400 students.
Height (cm)   Cf
< 12          0
< 13          27
< 14          85
< 16          215
< 17          320