Chapter - 14 Statistics
Statistics
1. Three measures of central tendency are:
i. Mean
ii. Median
iii. Mode
2. The arithmetic mean, also called the average, is the quantity obtained by adding all the
observations and then dividing by the total number of observations.
3. Arithmetic mean may be computed by any one of the following methods:
i. Direct method
ii. Short-cut method/ Assumed mean method
iii. Step-deviation method
4. Direct method of finding mean:
If a variate X takes values x1, x2, x3, ..., xn with corresponding frequencies f1, f2, f3, ..., fn
respectively, then the arithmetic mean of these values is given by:
$$\bar{X} = \frac{\sum_{i=1}^{n} f_i x_i}{N}, \quad \text{where } N = \sum_{i=1}^{n} f_i = f_1 + f_2 + \dots + f_n$$
5. Class mark = $\frac{1}{2}$ (Upper class limit + Lower class limit)
6. Short-cut method/ assumed mean method of finding mean:
Let x1, x2, ..., xn be values of a variable X with corresponding frequencies f1, f2, f3, ..., fn
respectively. Let A be the assumed mean and di = xi − A the deviation of xi from A. Then:
$$\bar{X} = A + \frac{1}{N}\left\{\sum_{i=1}^{n} f_i d_i\right\}$$
Note that in case of continuous frequency distribution, the values of x1, x2, x3 ... xn, are
taken as the mid-points or class-marks of the various classes.
7. Step-deviation method of finding mean:
Let x1, x2, ..., xn be values of a variable X with corresponding frequencies f1, f2, f3, ..., fn
respectively. Let A be the assumed mean. Then:
$$\bar{X} = A + h\left\{\frac{1}{N}\sum_{i=1}^{n} f_i u_i\right\}, \quad \text{where } u_i = \frac{x_i - A}{h} = \frac{d_i}{h}$$
Here, in case of an ungrouped frequency distribution, h is generally taken as a common factor
of the deviations. In case of a grouped frequency distribution, h is the class width.
Note that in case of continuous frequency distribution, the values of x1, x2, x3, ..., xn are
taken as the mid-points or class-marks of the various classes.
8. The step deviation method will be convenient to apply if all the deviations (d’s) have a
common factor.
9. If the class marks obtained are in decimal form, then the step deviation method is preferred
for calculating the mean.
10. Median is a measure of central tendency which gives the value of the middle observation
in the data, arranged in order. It is that value such that the number of observations
above it is equal to the number of observations below it.
11. To find the median of raw data, we arrange the given data in increasing or
decreasing order. If n is odd, then the median is the value of the $\left(\frac{n+1}{2}\right)^{\text{th}}$ observation.
If n is even, then the median is the arithmetic mean of the values of the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2}+1\right)^{\text{th}}$
observations.
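The odd/even rule above can be sketched in Python as follows. This is a minimal illustration rather than part of the original notes: the function name `median_raw` is hypothetical, and the sample data are the cricket scores and kindergarten ages quoted elsewhere in this chapter.

```python
def median_raw(values):
    """Median of raw data using the odd/even positional rule."""
    ordered = sorted(values)          # arrange the data in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("median is undefined for empty data")
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)-th observation, i.e. index (n - 1) // 2
        return ordered[(n - 1) // 2]
    # n even: mean of the (n / 2)-th and (n / 2 + 1)-th observations
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


runs = [5, 19, 42, 11, 50, 30, 21, 0, 52, 36, 27]           # n = 11 (odd)
ages = [3, 3, 4, 3, 5, 4, 3, 3, 4, 3, 3, 3, 3, 4, 3, 27]   # n = 16 (even)
print(median_raw(runs))   # 27
print(median_raw(ages))   # 3.0
```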
12. The cumulative frequency of a class is the frequency obtained by adding the frequencies
of all the classes preceding the given class to the frequency of the class.
13. In case of an ungrouped frequency distribution, we calculate the median by following
the steps given below:
Step 1: Find the cumulative frequencies (c.f.) and obtain N = ∑ fi.
Step 2: Find $\frac{N}{2}$.
Step 3: Look for the cumulative frequency (c.f.) just greater than $\frac{N}{2}$ and determine the
corresponding value of the variable. The value so obtained is the median.
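The three steps above might be coded as in the sketch below. The function name `median_ungrouped` is an assumption made for illustration, and the frequency table is a tally of the kindergarten ages listed in the Ungrouped Data example of this chapter.

```python
from itertools import accumulate


def median_ungrouped(values, freqs):
    """Median of an ungrouped frequency distribution via cumulative frequency."""
    pairs = sorted(zip(values, freqs))               # order the table by value
    cum = list(accumulate(f for _, f in pairs))      # Step 1: c.f. column, N = cum[-1]
    half = cum[-1] / 2                               # Step 2: N / 2
    for (x, _), cf in zip(pairs, cum):               # Step 3: first c.f. just greater than N / 2
        if cf > half:
            return x


ages = [3, 4, 5, 27]        # distinct values
counts = [10, 4, 1, 1]      # how often each value occurs (N = 16)
print(median_ungrouped(ages, counts))   # 3
```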
14. In case of a continuous frequency distribution, we calculate the median by following the
steps:
Step 1: Find the cumulative frequencies (c.f.) and obtain N = ∑ fi.
Step 2: Find $\frac{N}{2}$.
Step 3: Look for the cumulative frequency (c.f.) just greater than $\frac{N}{2}$ and determine the
corresponding class. This class is known as the median class. (Note that the value of the
median will lie in this class.)
Step 4: Use the following formula to find the median:
$$\text{Median} = l + \left[\frac{\frac{N}{2} - cf}{f}\right] \times h$$
l = lower limit of the median class
cf = cumulative frequency of the class preceding the median class
f = frequency of the median class
h = class size
N = ∑ fi
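As a hedged sketch, the four steps can be written as a small Python function. The name `grouped_median` is hypothetical. The class limits and frequencies below are read off the cumulative points (38, 0), (40, 3), ..., (52, 35) quoted in the weights example of the answer key, and equal class widths are assumed.

```python
from itertools import accumulate


def grouped_median(lower_limits, freqs, h):
    """Median of a continuous frequency distribution with equal class width h."""
    cum = list(accumulate(freqs))                        # Step 1: cumulative frequencies
    half = cum[-1] / 2                                   # Step 2: N / 2
    # Step 3: the median class is the first class whose c.f. exceeds N / 2
    i = next(k for k, c in enumerate(cum) if c > half)
    l = lower_limits[i]                                  # lower limit of the median class
    cf = cum[i - 1] if i > 0 else 0                      # c.f. of the preceding class
    f = freqs[i]                                         # frequency of the median class
    return l + (half - cf) / f * h                       # Step 4: the formula


lowers = [38, 40, 42, 44, 46, 48, 50]   # classes 38-40, 40-42, ..., 50-52
freqs = [3, 2, 4, 5, 14, 4, 3]          # N = 35
print(grouped_median(lowers, freqs, 2))  # 46.5
```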
15. Mode is the value of the most frequently occurring observation in the data.
16. In an ungrouped frequency distribution, mode is the value of the variable having
maximum frequency.
17. In a grouped frequency distribution, the modal class is the one with the highest frequency,
and the mode can be calculated by the following formula:
$$\text{Mode} = l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$$
l = lower limit of the modal class
h = size of the class interval
f1 = frequency of the modal class
f0 = frequency of the class preceding the modal class
f2 = frequency of the class succeeding the modal class
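The mode formula can be tried out with the sketch below. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as having frequency 0 is an assumption of this sketch, not something the notes state. The data are the weight classes implied by the ogive points in the answer key.

```python
def grouped_mode(lower_limits, freqs, h):
    """Mode of a grouped frequency distribution with equal class width h."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])   # index of the modal class
    f1 = freqs[i]                                        # frequency of the modal class
    f0 = freqs[i - 1] if i > 0 else 0                    # preceding class (assumed 0 if absent)
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0       # succeeding class (assumed 0 if absent)
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 is zero")
    return lower_limits[i] + (f1 - f0) / denominator * h


lowers = [38, 40, 42, 44, 46, 48, 50]   # classes 38-40, ..., 50-52
freqs = [3, 2, 4, 5, 14, 4, 3]          # modal class is 46-48
print(round(grouped_mode(lowers, freqs, 2), 2))   # 46.95
```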
18. The most frequently used measure of central tendency is the mean, because the mean is
calculated by taking into account all the observations of a given data. And it lies between
the smallest and the largest value of the data.
19. The biggest drawback of the mean is that it is affected by extreme values.
One very large or very small number can distort the average. In that case, the median is a
better measure of central tendency. When the most repeated value or the most wanted
one is required, the mode is used.
20. When all three measures of central tendency are equal, the distribution is called
symmetrical distribution.
21. When the values of mean, median and mode are not equal, then the distribution is
known as asymmetrical or skewed. In this case, the distribution can be positively skewed
or negatively skewed.
Negatively skewed distributions have a few extremely low scores, while positively
skewed distributions have a few extremely high scores.
i. When the data is negatively skewed, then Mean < Median < Mode
ii. When the data is positively skewed, then Mean > Median > Mode
The three measures of central tendency are connected by the following empirical relation:
3 Median = Mode + 2 Mean
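A short sketch of how this relation is used to recover the median from the other two measures is given below. The function name is hypothetical, and the inputs are the values used in the assertion and short-answer questions of this chapter. The relation is empirical, so the result is an estimate for moderately skewed data rather than an exact identity.

```python
def median_from_mode_and_mean(mode, mean):
    """Solve 3 * Median = Mode + 2 * Mean for the median."""
    return (mode + 2 * mean) / 3


print(median_from_mode_and_mean(60, 66))   # 64.0
print(median_from_mode_and_mean(8, 8))     # 8.0
```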
22. The cumulative frequency is the accumulated total, or sum, of the frequencies up to a
particular point. A table showing the cumulative frequencies is called a cumulative frequency
distribution.
23. There are two types of cumulative frequencies:
i. Less than type cumulative frequency distribution: It is found by sequentially adding
the frequencies of all the earlier classes, including the class against which it is
written. The accumulation runs from the lowest class to the highest.
ii. More than type cumulative frequency distribution: It is obtained by accumulating the
frequencies starting from the highest class and moving to the lowest.
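Both types of cumulative frequency can be built with a running sum, as in the sketch below. The variable names are illustrative, and the frequencies are those of the hurdle-race table in the Important Questions section (classes 13.8 – 14.0 up to 14.8 – 15.0).

```python
from itertools import accumulate

freqs = [2, 4, 5, 71, 48, 20]   # class frequencies, lowest class first

# Less than type: running total from the lowest class upwards.
less_than = list(accumulate(freqs))

# More than type: running total from the highest class downwards,
# then flipped back so that it lines up with the class order.
more_than = list(accumulate(reversed(freqs)))[::-1]

print(less_than)   # [2, 6, 11, 82, 130, 150]
print(more_than)   # [150, 148, 144, 139, 68, 20]
```

Here `less_than[3]` is 82, the number of athletes who finished in less than 14.6 seconds.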
24. A cumulative frequency distribution can be represented graphically by means of an
ogive.
25. There are two types of ogives:
i. 'Less than' ogive: In a less than ogive the upper limit of a class (x axis) is plotted
against its cumulative frequency (y axis) as a point on the ogive. The ‘less than ogive’ is
a rising curve.
ii. 'More than' ogive: In a ‘more than ogive’ the lower limit of a class (x axis) is plotted
against its cumulative frequency (y axis) as a point on the ogive. The ‘more than ogive’
is a falling curve.
26. The ogives can be drawn only when the given class intervals are continuous and if this is
not the case then first the class intervals are made continuous.
27. In order to determine the median from less than ogive or more than ogive, we follow
the steps given below:
Step 1: Draw the more than or less than ogive as asked in the question. Find $\frac{N}{2}$, where N is
the total number of observations.
Step 2: Locate the cumulative frequency $\frac{N}{2}$ on the y-axis.
Step 3: Draw a line parallel to x-axis through the point obtained in step 2, cutting the
cumulative frequency curve at a point P (say).
Step 4: Draw perpendicular PM from P on the x-axis. The x-coordinate of point M is the
median value.
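The graphical procedure can be imitated numerically, as sketched below. This sketch assumes the ogive is drawn with straight segments between the plotted points instead of a freehand curve, so it is only an approximation of what is read off graph paper. The function name is hypothetical, and the points are those listed in the weights example of the answer key.

```python
def median_from_less_than_ogive(points):
    """points: (upper class limit, cumulative frequency) pairs in increasing order."""
    half = points[-1][1] / 2                      # N / 2 located on the y-axis
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= half <= c1 and c1 > c0:
            # The horizontal line y = N / 2 cuts this segment at P;
            # its x-coordinate is the foot M of the perpendicular.
            return x0 + (half - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("N / 2 does not lie on the ogive")


ogive = [(38, 0), (40, 3), (42, 5), (44, 9), (46, 14), (48, 28), (50, 32), (52, 35)]
print(median_from_less_than_ogive(ogive))   # 46.5
```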
28. If we draw less than ogive and more than ogive on the same graph, then median can be
obtained by following the steps given below:
Step 1: Draw both ogives on the same graph.
Step 2: Identify the point of intersection of both ogives and mark it as Q (say).
Step 3: Draw perpendicular from Q on x-axis.
Step 4: The x-coordinate of the foot of this perpendicular on the x-axis is the median.
Ungrouped Data
Ungrouped data is data in its original or raw form. The observations are not classified into
groups.
For example, the ages of everyone present in a kindergarten classroom, including the
teacher, are as follows:
3, 3, 4, 3, 5, 4, 3, 3, 4, 3, 3, 3, 3, 4, 3, 27.
This data shows that there is one adult present in this class and that is the teacher.
Ungrouped data is easy to work with when the data set is small.
Grouped Data
In grouped data, observations are organized in groups.
For example, a class of students got different marks in a school exam. The data is tabulated
as follows:
This shows how many students got the particular mark range. Grouped data is easier to
work with when a large amount of data is present.
Frequency
Frequency is the number of times a particular observation occurs in data.
Class Interval
Data can be grouped into class intervals such that all observations in that range belong to
that class.
Class width = upper class limit – lower class limit
Mean
Finding the mean for Grouped Data when class Intervals are not given
For grouped data without class intervals,
$$\text{Mean} = \bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$
Class mark = (Upper Class Limit + Lower Class Limit)/2
Direct method of finding mean
Step 1: Classify the data into intervals and find the corresponding frequency of each class.
Step 2: Find the class mark by taking the midpoint of the upper and lower class limits.
Step 3: Tabulate the product of the class mark and its corresponding frequency for each
class. Calculate their sum (∑xifi).
Step 4: Divide the above sum by the sum of frequencies (∑fi) to get the mean.
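The four steps of the direct method could be coded as follows. The function name `mean_direct` is hypothetical, and the classes and frequencies are the weight data implied by the ogive points in the answer key (38 – 40 through 50 – 52).

```python
def mean_direct(lower_limits, upper_limits, freqs):
    """Mean of grouped data by the direct method."""
    # Step 2: class mark = midpoint of the class limits
    marks = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]
    # Step 3: sum of (class mark x frequency)
    total_fx = sum(f * x for f, x in zip(freqs, marks))
    # Step 4: divide by the sum of the frequencies
    return total_fx / sum(freqs)


lowers = [38, 40, 42, 44, 46, 48, 50]
uppers = [40, 42, 44, 46, 48, 50, 52]
freqs = [3, 2, 4, 5, 14, 4, 3]
print(round(mean_direct(lowers, uppers, freqs), 2))   # 45.8
```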
Assumed mean method of finding mean
Step 1: Classify the data into intervals and find the corresponding frequency of each class.
Step 2: Find the class mark by taking the midpoint of the upper and lower class limits.
Step 3: Take one of the xi’s (usually one in the middle) as the assumed mean and denote it
by ′a′.
Step 4: Find the deviation of each of the xi's from ′a′:
di = xi − a
Step 5: Find the mean of the deviations:
$$\bar{d} = \frac{\sum f_i d_i}{\sum f_i}$$
Step 6: Calculate the mean as:
$$\bar{x} = a + \frac{\sum f_i d_i}{\sum f_i}$$
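A minimal sketch of the assumed mean method is shown below. The function name `mean_assumed` is hypothetical, the class marks are those of the weight classes 38 – 40 through 50 – 52 from the answer key, and the assumed mean a = 45 is an arbitrary middle class mark chosen for illustration.

```python
def mean_assumed(class_marks, freqs, a):
    """Mean of grouped data by the assumed mean method."""
    deviations = [x - a for x in class_marks]                         # Step 4: di = xi - a
    d_bar = sum(f * d for f, d in zip(freqs, deviations)) / sum(freqs)  # Step 5: mean deviation
    return a + d_bar                                                  # Step 6: mean = a + d_bar


marks = [39, 41, 43, 45, 47, 49, 51]
freqs = [3, 2, 4, 5, 14, 4, 3]
print(round(mean_assumed(marks, freqs, a=45), 2))   # 45.8
```

Any other choice of `a` gives the same mean; a middle class mark only keeps the deviations small.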
The relation between the Mean of deviations and mean
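Step-deviation method of finding mean
Steps 1 to 4 are the same as in the assumed mean method.
Step 5: Divide all deviations di by the class width (h) to get the ui's:
$$u_i = \frac{x_i - a}{h}$$
Step 6: Find the mean of the ui's:
$$\bar{u} = \frac{\sum f_i u_i}{\sum f_i}$$
Step 7: Calculate the mean as:
$$\bar{x} = a + h \times \frac{\sum f_i u_i}{\sum f_i} = a + h\bar{u}$$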
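The step-deviation calculation might look like the sketch below. The function name `mean_step_deviation` is hypothetical, the class marks come from the weight classes 38 – 40 through 50 – 52 in the answer key, and a = 45 with h = 2 are illustrative choices (h is the class width of that data).

```python
def mean_step_deviation(class_marks, freqs, a, h):
    """Mean of grouped data by the step-deviation method."""
    u = [(x - a) / h for x in class_marks]                       # ui = (xi - a) / h
    u_bar = sum(f * ui for f, ui in zip(freqs, u)) / sum(freqs)  # mean of the ui's
    return a + h * u_bar                                         # mean = a + h * u_bar


marks = [39, 41, 43, 45, 47, 49, 51]
freqs = [3, 2, 4, 5, 14, 4, 3]
print(round(mean_step_deviation(marks, freqs, a=45, h=2), 2))   # 45.8
```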
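Relation between mean of step-deviations ($\bar{u}$) and mean
$$u_i = \frac{x_i - a}{h}$$
$$\bar{u} = \frac{\sum f_i \dfrac{x_i - a}{h}}{\sum f_i}$$
$$\bar{u} = \frac{1}{h} \times \frac{\sum f_i x_i - a \sum f_i}{\sum f_i}$$
$$\bar{u} = \frac{1}{h} \times (\bar{x} - a)$$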
Important relations between methods of finding mean
• All three methods of finding mean yield the same result.
• Step deviation method is easier to apply if all the deviations have a common factor.
• Assumed mean method and step deviation method are simplified versions of the
direct method.
Median
Finding the Median of Grouped Data when class Intervals are not given
Step 1: Tabulate the observations and the corresponding frequency in ascending or
descending order.
Step 2: Add the cumulative frequency column to the table by finding the cumulative
frequency up to each observation.
Step 3: If the number of observations is odd, the median is the observation whose
cumulative frequency is just greater than or equal to (n+1)/2
If the number of observations is even, the median is the average of observations whose
cumulative frequency is just greater than or equal to n/2 and (n/2)+1.
Cumulative Frequency
Cumulative frequency is obtained by adding all the frequencies up to a certain point.
Finding median for Grouped Data when class Intervals are given
Step 1: Find the cumulative frequency for all class intervals.
Step 2: The median class is the class whose cumulative frequency is greater than (and nearest
to) n/2, where n is the number of observations.
Step 3: Median = l + [(n/2 – cf)/f] × h
Where,
l = lower limit of median class,
n = number of observations,
cf = cumulative frequency of class preceding the median class,
f = frequency of median class,
h = class size (assuming class size to be equal).
Cumulative Frequency distribution of less than type
Cumulative frequency of the less than type indicates the number of observations which are
less than or equal to a particular observation.
Cumulative Frequency distribution of more than type
Cumulative frequency of more than type indicates the number of observations that are
greater than or equal to a particular observation.
Visualising formula for median graphically
corresponding to the median class.
Step 3: Draw a straight line graph joining the extremes of class and cumulative frequencies.
Step 4: Identify the point on the graph corresponding to cf = n/2
Step 5: Drop a perpendicular from this point onto the x-axis.
Ogive of less than type
The graph of a cumulative frequency distribution of the less than type is called an ‘ogive of
the less than type’.
Ogive of more than type
The graph of a cumulative frequency distribution of the more than type is called an ‘ogive
of the more than type’.
Relation between the less than and more than type curves
The point of intersection of the ogives of more than and less than types gives the median of
the grouped frequency distribution.
Mode
Finding mode for Grouped Data when class intervals are not given
In grouped data without class intervals, the observation having the largest frequency is the
mode.
Finding mode for Ungrouped Data
For ungrouped data, the mode can be found out by counting the observations and using
tally marks to construct a frequency table.
The observation having the largest frequency is the mode.
Important Questions
Multiple Choice questions-
1. Cumulative frequency curve is also called
(a) histogram
(b) ogive
(d) median
2. The relationship between mean, median and mode for a moderately skewed
distribution is
(a) is increased by 2
(b) is decreased by 2
4. Mode and mean of a data are 12k and 15k. Median of the data is
(a) 12k
(b) 14k
(c) 15k
(d) 16k
5. The times, in seconds, taken by 150 athletes to run a 110 m hurdle race are tabulated
below:
Class Frequency
13.8 – 14.0 2
14.0 – 14.2 4
14.2 – 14.4 5
14.4 – 14.6 71
14.6 – 14.8 48
14.8 – 15.0 20
The number of athletes who completed the race in less than 14.6 seconds is:
(a) 11
(b) 71
(c) 82
(d) 130
6. The abscissa of the point of intersection of the less than type and of the more than
type cumulative frequency curves of a grouped data gives its
(a) mean
(b) median
(c) mode
7. While computing mean of grouped data, we assume that the frequencies are:
8. Mean of 100 items is 49. It was discovered that three items which should have been
60, 70, 80 were wrongly read as 40, 20, 50 respectively. The correct mean is
(a) 48
(b) 49
(c) 50
(d) 60
9. While computing mean of grouped data, we assume that the frequencies are
(a) Mean
(b) Median
(c) Mode
9. If the mode of a distribution is 8 and its mean is also 8, then find median.
Short Questions :
1. If xi's are the mid-points of the class intervals of grouped data, fi's are the
corresponding frequencies and x̄ is the mean, then find Σfi(xi – x̄).
7. The following data gives the information on the observed lifetimes (in hours) of
225 electrical components:
8. The distribution below gives the weights of 30 students of a class. Find the
median weight of the students.
Long Questions :
1. The following table gives the literacy rate (in percentage) of 35 cities. Find the
mean literacy rate.
3. The mean of the following frequency distribution is 62.8. Find the missing
frequency x.
Draw a less than type ogive for the given data. Hence, obtain the median
weight from the graph and verify the result by using the formula.
of diesel filled in them.
a. If xi and fi are sufficiently small, then direct method is appropriate choice for
calculating mean.
b. If xi and fi are sufficiently large, then direct method is appropriate choice for
calculating mean.
a. 8.15 litres
b. 6 litres
c. 7 litres
d. 5.5 litres
iii. If approximately 2000 vehicles come daily to the petrol pump, then how many
litres of diesel should the pump have?
a. 16200 litres
b. 16300 litres
c. 10600 litres
d. 15000 litres
iv. The sum of upper and lower limit of median class is:
a. 22
b. 10
c. 16
d. None of these.
v. If the median of given data is 8 litres, then mode will be equal to:
a. 7.5 litres
b. 7.7 litres
c. 5.7 litres
d. 8 litres
2. A bread manufacturer wants to know the lifetime of the product. For this, he
tested the lifetime of 400 packets of bread. The following table gives the
distribution of the lifetime of 400 packets.
a. 2m + √𝑏
b. 2m + b
c. m - b
d. 2m - b
a. 341hrs
b. 300hrs
c. 340hrs
d. 301hrs
a. 347hrs
b. 340hrs
c. 346hrs
d. 342hrs
a. 340hrs
b. 341hrs
c. 348hrs
d. 349hrs
a. 346hrs
b. 341hrs
c. 340hrs
d. 347hrs
(b) Both A and R are true and R is not the correct explanation of A.
(c) A is true but R is false.
(d) Both A and R are false.
Reason: If the number of runs scored by 11 players of a cricket team of India are 5, 19,
42, 11, 50, 30, 21, 0, 52, 36, 27 then median is 30
Assertion: If the values of mode and mean are 60 and 66 respectively, then the value of the
median is 64.
Answer Key-
Multiple Choice questions-
1. (b) ogive
4. (b) 14k
5. (c) 82
6. (b) median
8. (c) 50
1. New median = 21 + 5 = 26
2.
3.
∴ Frequency of class 30 – 40 = 3
5. Maximum frequency = 48
6.
7.
3 × Median = 8 + 2 × 8
$$\text{Median} = \frac{8 + 16}{3} = \frac{24}{3} = 8$$
i.e., the mean of nth and (n + 1)th term will be the median.
Short Answer :
1.
2. Classes are not continuous, hence make them continuous by adding 0.5 to the
upper limits and subtracting 0.5 from the lower limits.
Here, $\frac{n}{2} = \frac{50}{2} = 25$
4.
5. Calculation of mean
7. Here, the maximum class frequency is 61 and the class corresponding to this
frequency is 60 – 80.
8. Calculation of median
The cumulative frequency just greater than $\frac{n}{2} = 15$ is 19, and the corresponding
class is 55 – 60.
Long Answer :
1. Here, we use step deviation method to find mean.
Now, we have
2. Let the assumed mean A = 16 and class size h = 2, here we apply step deviation
method.
Now, we have,
3. We have
⇒ 12.8x = 128
$$\therefore x = \frac{128}{12.8} = 10$$
Hence, the missing frequency is 10.
4.
5. To represent the data in the table graphically, we mark the upper limits of the
class interval on x-axis and their corresponding cumulative frequency on y-axis
choosing a convenient scale. Now, let us plot the points corresponding to the
ordered pair given by (38,0), (40,3), (42,5), (44, 9), (46, 14), (48, 28), (50, 32) and
(52, 35) on a graph paper and join them by a freehand smooth curve.
Now, locate $\frac{n}{2} = \frac{35}{2} = 17.5$ on the y-axis.
We draw a line from this point parallel to x-axis cutting the curve at a point.
From this point, draw a perpendicular line to the x-axis. The point of
intersection of this perpendicular with the x-axis gives the median of the data.
Here it is 46.5.
Let us make the following table in order to find median by using formula.
Here, n = 35, $\frac{n}{2} = \frac{35}{2} = 17.5$, the cumulative frequency just greater than $\frac{n}{2} = 17.5$ is 28 and
the corresponding class is 46 – 48. So the median class is 46 – 48.
Now, we have l = 46, $\frac{n}{2}$ = 17.5, cf = 14, f = 14, h = 2
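Substituting these values into the median formula gives the following worked check, written out here as an illustration of the verification step:

$$\text{Median} = l + \left[\frac{\frac{n}{2} - cf}{f}\right] \times h = 46 + \frac{17.5 - 14}{14} \times 2 = 46 + 0.5 = 46.5$$

This agrees with the value 46.5 read from the ogive.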
iv. (c) 16
Solution:
Also, the cumulative frequencies for the given distribution are 14, 70, 130, 216, 290, 352,
400.
∴ The c.f. just greater than 200 is 216, which corresponds to the interval 300-350.
l = 300, f = 86, c.f. = 130, h = 50
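With N/2 = 200, substituting these values into the median formula gives the following worked check (an illustrative calculation from the figures above):

$$\text{Median} = l + \left[\frac{\frac{N}{2} - c.f.}{f}\right] \times h = 300 + \frac{200 - 130}{86} \times 50 \approx 300 + 40.7 = 340.7 \text{ hrs}$$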
v. (c) 340hrs
Solution:
Since the minimum of the mean, median and mode is approximately 340hrs, the
manufacturer should claim that the lifetime of a packet is 340hrs.
Assertion Reason Answer-
(c) A is true but R is false.
(c) A is true but R is false.