Mean Mode Median - Merged

2.2 ARITHMETIC MEAN
Adding all the observations and dividing the sum by the number of observations
gives the arithmetic mean. Suppose we have the following observations:
10, 15, 30, 7, 42, 79 and 83
These are seven observations. Symbolically, the arithmetic mean, also called simply
the mean, is
$\bar{x} = \frac{\sum x}{n}$, where $\bar{x}$ is the simple mean.
$\bar{x} = \frac{10 + 15 + 30 + 7 + 42 + 79 + 83}{7} = \frac{266}{7} = 38$
It may be noted that the Greek letter μ is used to denote the mean of the population
and n to denote the total number of observations in a population. Thus the population
mean μ = ∑x/n. The formula given above is the basic formula that forms the
definition of arithmetic mean and is used in case of ungrouped data where weights are
not involved.
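The basic formula can be checked with a few lines of Python. This is only a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration and does not come from the text.

```python
# Simple (unweighted) arithmetic mean of the seven observations in the text.
observations = [10, 15, 30, 7, 42, 79, 83]


def arithmetic_mean(values):
    """Return the sum of the values divided by their count."""
    return sum(values) / len(values)


mean = arithmetic_mean(observations)
print(mean)  # 38.0
```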
2.2.1 UNGROUPED DATA-WEIGHTED AVERAGE
In case of ungrouped data where weights are involved, our approach for calculating
arithmetic mean will be different from the one used earlier.
Example 2.1: Suppose a student has secured the following marks in three tests:
Mid-term test 30
Laboratory 25
Final 20
The simple arithmetic mean will be $\frac{30 + 25 + 20}{3} = 25$.
However, this will be wrong if the three tests carry different weights on the basis of
their relative importance. Assuming that the weights assigned to the three tests are:
Mid-term test 2 points
Laboratory 3 points
Final 5 points
Solution: On the basis of this information, we can now calculate a weighted mean as
shown below:
Table 2.1: Calculation of a Weighted Mean
Type of Test Relative Weight (w) Marks (x) (wx)
Mid-term 2 30 60
Laboratory 3 25 75
Final 5 20 100
Total ∑ w = 10 235
$\bar{x} = \frac{\sum wx}{\sum w} = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3}{w_1 + w_2 + w_3} = \frac{60 + 75 + 100}{2 + 3 + 5} = \frac{235}{10} = 23.5$ marks
It will be seen that weighted mean gives a more realistic picture than the simple or
unweighted mean.
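The calculation in Table 2.1 can be sketched in Python as follows. The helper name `weighted_mean` is an assumption made for this illustration; it simply applies ∑wx/∑w.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    total_weight = sum(weights)
    return sum(w * x for w, x in zip(weights, values)) / total_weight


marks = [30, 25, 20]      # mid-term, laboratory, final
weights = [2, 3, 5]       # relative weights from Table 2.1

print(weighted_mean(marks, weights))      # 23.5
print(weighted_mean(marks, [1, 1, 1]))    # 25.0, the simple mean
```

Passing equal weights reproduces the simple mean, which is one way to see that the unweighted mean is a special case of the weighted one.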
Example 2.2: An investor is fond of investing in equity shares. During a period of
falling prices in the stock exchange, a stock is sold at Rs 120 per share on one day, Rs
105 on the next and Rs 90 on the third day. The investor has purchased 50 shares on
the first day, 80 shares on the second day and 100 shares on the third day. What
average price per share did the investor pay?
Solution:
Table 2.2: Calculation of Weighted Average Price
Day Price per Share (Rs) (x) No of Shares Purchased (w) Amount Paid (wx)
1 120 50 6000
2 105 80 8400
3 90 100 9000
Total - 230 23,400
Weighted average $= \frac{w_1 x_1 + w_2 x_2 + w_3 x_3}{w_1 + w_2 + w_3} = \frac{\sum wx}{\sum w} = \frac{6000 + 8400 + 9000}{50 + 80 + 100} = \frac{23400}{230}$ = Rs 101.7 approx.
Therefore, the investor paid an average price of Rs 101.7 per share.
It will be seen that if merely prices of the shares for the three days (regardless of the
number of shares purchased) were taken into consideration, then the average price
would be
Rs $\frac{120 + 105 + 90}{3}$ = Rs 105.
This is an unweighted or simple average and, as it ignores the quantum of shares
purchased, it fails to give a correct picture. A simple average, it may be noted, is also
a weighted average where the weight in each case is the same, that is, 1. When we
use the term average alone, we mean an unweighted or simple average.
2.2.2 GROUPED DATA-ARITHMETIC MEAN
For grouped data, arithmetic mean may be calculated by applying any of the
following methods:
(i) Direct method, (ii) Short-cut method, (iii) Step-deviation method
In the case of the direct method, the formula $\bar{x} = \frac{\sum fm}{n}$ is used. Here m is the
mid-point of each class, f is the frequency of each class and n is the total
frequency. The calculation of the arithmetic mean by the direct method is shown below.
Example 2.3: The following table gives the marks of 58 students in Statistics.
Calculate the average marks of this group.
Marks No. of Students

0-10 4
10-20 8
20-30 11
30-40 15
40-50 12
50-60 6
60-70 2
Total 58
Solution:
Table 2.3: Calculation of Arithmetic Mean by Direct Method
Marks     Mid-point (m)   No. of Students (f)    fm
0-10            5                  4              20
10-20          15                  8             120
20-30          25                 11             275
30-40          35                 15             525
40-50          45                 12             540
50-60          55                  6             330
60-70          65                  2             130
Total                             58       ∑fm = 1940
$\bar{x} = \frac{\sum fm}{n} = \frac{1940}{58} = 33.45$ marks, or 33 marks approximately.
It may be noted that the mid-point of each class is taken as a good approximation of
the mean of the values in that class. This is based on the assumption that the values are
distributed fairly evenly throughout the interval. When the frequencies are large,
this assumption is usually acceptable.
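The direct method can be sketched in Python as below. The representation of each class as a `(lower, upper, frequency)` tuple and the name `grouped_mean` are assumptions made for this illustration.

```python
# Marks of 58 students from Example 2.3, as (lower limit, upper limit, frequency).
classes = [
    (0, 10, 4), (10, 20, 8), (20, 30, 11), (30, 40, 15),
    (40, 50, 12), (50, 60, 6), (60, 70, 2),
]


def grouped_mean(classes):
    """Direct method: sum(f * m) / n, with m the mid-point of each class."""
    n = sum(f for _, _, f in classes)
    total = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total / n


print(round(grouped_mean(classes), 2))  # 33.45
```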
In the case of short-cut method, the concept of arbitrary mean is followed. The
formula for calculation of the arithmetic mean by the short-cut method is given
below:
$\bar{x} = A + \frac{\sum fd}{n}$
Where A = arbitrary or assumed mean
f = frequency
d = deviation from the arbitrary or assumed mean
When the values are extremely large and/or in fractions, the use of the direct method
would be very cumbersome. In such cases, the short-cut method is preferable. This is
because the calculation work in the short-cut method is considerably reduced
particularly for calculation of the product of values and their respective frequencies.
However, when calculations are not made manually but by a machine calculator, it
may not be necessary to resort to the short-cut method, as the use of the direct method
may not pose any problem.
As can be seen from the formula used in the short-cut method, an arbitrary or assumed
mean is used. The second term in the formula (∑fd ÷ n) is the correction factor for the
difference between the actual mean and the assumed mean. If the assumed mean turns
out to be equal to the actual mean, (∑fd ÷ n) will be zero. The use of the short-cut
method is based on the principle that the total of deviations taken from an actual mean
is equal to zero. As such, the deviations taken from any other figure will depend on
how the assumed mean is related to the actual mean. While one may choose any value
as the assumed mean, it is better to avoid extreme values (too small or too large)
in order to simplify calculations. A value apparently close to the arithmetic mean
should be chosen.
For the figures given earlier pertaining to marks obtained by 58 students, we calculate
the average marks by using the short-cut method.
Example 2.4:
Table 2.4: Calculation of Arithmetic Mean by Short-cut Method
Marks     Mid-point (m)    f      d      fd
0-10            5          4    -30    -120
10-20          15          8    -20    -160
20-30          25         11    -10    -110
30-40          35         15      0       0
40-50          45         12     10     120
50-60          55          6     20     120
60-70          65          2     30      60
Total                     58       ∑fd = -90
It may be noted that we have taken arbitrary mean as 35 and deviations from
midpoints. In other words, the arbitrary mean has been subtracted from each value of
mid-point and the resultant figure is shown in column d.
$\bar{x} = A + \frac{\sum fd}{n} = 35 + \left(\frac{-90}{58}\right) = 35 - 1.55 = 33.45$, or 33 marks approximately.
Now we take up the calculation of arithmetic mean for the same set of data using the
step-deviation method. This is shown in Table 2.5.
Table 2.5: Calculation of Arithmetic Mean by Step-deviation Method
Marks     Mid-point    f      d    d' = d/10    fd'
0-10          5        4    -30       -3        -12
10-20        15        8    -20       -2        -16
20-30        25       11    -10       -1        -11
30-40        35       15      0        0          0
40-50        45       12     10        1         12
50-60        55        6     20        2         12
60-70        65        2     30        3          6
Total                 58                  ∑fd' = -9
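$\bar{x} = A + \frac{\sum fd'}{n} \times C = 35 + \left(\frac{-9 \times 10}{58}\right) = 33.45$, or 33 marks approximately.
Here C is the common factor by which the deviations were divided (the class width, 10).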
It will be seen that the answer in each of the three cases is the same. The step-
deviation method is the most convenient on account of simplified calculations. It may
also be noted that if we select a different arbitrary mean and recalculate deviations
from that figure, we would get the same answer.
Now that we have learnt how the arithmetic mean can be calculated by using different
methods, we are in a position to handle any problem where calculation of the
arithmetic mean is involved.
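The claim that all three methods agree, whatever assumed mean is chosen, can be checked numerically. The following is a sketch only; the function names are hypothetical and the data are the mid-points and frequencies of the 58 students used above.

```python
midpoints = [5, 15, 25, 35, 45, 55, 65]
freqs = [4, 8, 11, 15, 12, 6, 2]


def direct_mean(m, f):
    """Direct method: sum(f * m) / n."""
    return sum(fi * mi for fi, mi in zip(f, m)) / sum(f)


def short_cut_mean(m, f, assumed):
    """Short-cut method: A + sum(f * d) / n, with d = m - A."""
    return assumed + sum(fi * (mi - assumed) for fi, mi in zip(f, m)) / sum(f)


def step_deviation_mean(m, f, assumed, width):
    """Step-deviation method: A + (sum(f * d') / n) * C, with d' = (m - A) / C."""
    scaled = sum(fi * (mi - assumed) / width for fi, mi in zip(f, m))
    return assumed + scaled / sum(f) * width


print(round(direct_mean(midpoints, freqs), 2))  # 33.45
for a in (35, 25, 55):
    # Every assumed mean gives the same result, 33.45.
    print(a,
          round(short_cut_mean(midpoints, freqs, a), 2),
          round(step_deviation_mean(midpoints, freqs, a, 10), 2))
```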
Example 2.6: The mean of the following frequency distribution was found to be 1.46.
No. of Accidents No. of Days (frequency)

0 46
1 ?
2 ?
3 25
4 10
5 5
Total 200 days
Calculate the missing frequencies.
Solution:
Here we are given the total number of frequencies and the arithmetic mean. We have
to determine the two frequencies that are missing. Let us assume that the frequency
against 1 accident is x and against 2 accidents is y. If we can establish two
simultaneous equations, then we can easily find the values of x and y.
$\text{Mean} = \frac{(0 \times 46) + (1 \times x) + (2 \times y) + (3 \times 25) + (4 \times 10) + (5 \times 5)}{200}$
$1.46 = \frac{x + 2y + 140}{200}$
x + 2y + 140 = (200)(1.46) = 292
x + 2y = 152 ... (i)
x + y = 200 - (46 + 25 + 10 + 5)
x + y = 200 - 86
x + y = 114 ... (ii)
Now subtracting equation (ii) from equation (i), we get
(x + 2y) - (x + y) = 152 - 114
y = 38
Substituting the value of y = 38 in equation (ii) above, x + 38 = 114
Therefore, x = 114 - 38 = 76
Hence, the missing frequencies are:
Against accident 1 : 76
Against accident 2 : 38
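The same pair of simultaneous equations can be solved in code. This is a sketch under the assumption that exactly two frequencies are missing; the function name `missing_frequencies` and its parameters are hypothetical.

```python
def missing_frequencies(known, total_n, mean, v1, v2):
    """Solve for the frequencies x (of value v1) and y (of value v2).

    known maps each value with a known frequency to that frequency.
    Uses x + y = n - sum(known f) and v1*x + v2*y = mean*n - sum(value * f).
    """
    n_rest = total_n - sum(known.values())
    sum_rest = mean * total_n - sum(v * f for v, f in known.items())
    y = (sum_rest - v1 * n_rest) / (v2 - v1)
    x = n_rest - y
    return x, y


known = {0: 46, 3: 25, 4: 10, 5: 5}   # accidents -> days, from Example 2.6
x, y = missing_frequencies(known, total_n=200, mean=1.46, v1=1, v2=2)
print(round(x), round(y))  # 76 38
```

The results are rounded because the mean 1.46 is stored as a floating-point number.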
2.2.3 CHARACTERISTICS OF THE ARITHMETIC MEAN
Some of the important characteristics of the arithmetic mean are:
1. The sum of the deviations of the individual items from the arithmetic mean is
always zero. This means $\sum (x - \bar{x}) = 0$, where x is the value of an item and
$\bar{x}$ is the arithmetic mean. Since the sum of the deviations in the positive direction
is equal to the sum of the deviations in the negative direction, the arithmetic
mean is regarded as a measure of central tendency.
2. The sum of the squared deviations of the individual items from the arithmetic
mean is always minimum. In other words, the sum of the squared deviations
taken from any value other than the arithmetic mean will be higher.
3. As the arithmetic mean is based on all the items in a series, a change in the
value of any item will lead to a change in the value of the arithmetic mean.
4. In the case of highly skewed distribution, the arithmetic mean may get
distorted on account of a few items with extreme values. In such a case, it
may cease to be the representative characteristic of the distribution.
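The first two characteristics can be verified numerically on the seven observations used at the start of the section. This is an illustrative check, not a proof; the helper names are made up for the sketch.

```python
values = [10, 15, 30, 7, 42, 79, 83]
mean = sum(values) / len(values)  # 38.0


def sum_of_deviations(values, centre):
    """Sum of (x - centre) over all items."""
    return sum(x - centre for x in values)


def sum_of_squared_deviations(values, centre):
    """Sum of (x - centre)^2 over all items."""
    return sum((x - centre) ** 2 for x in values)


print(sum_of_deviations(values, mean))  # 0.0

# Squared deviations taken from any other value are larger than from the mean.
at_mean = sum_of_squared_deviations(values, mean)
for other in (30, 37, 39, 50):
    print(other, sum_of_squared_deviations(values, other) > at_mean)  # True each time
```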
2.3 MEDIAN
Median is defined as the value of the middle item (or the mean of the values of the
two middle items) when the data are arranged in an ascending or descending order of
magnitude. Thus, in an ungrouped frequency distribution if the n values are arranged
in ascending or descending order of magnitude, the median is the middle value if n is
odd. When n is even, the median is the mean of the two middle values.
Suppose we have the following series:
15, 19, 21, 7, 10, 33, 25, 18 and 5
We first have to arrange it in either ascending or descending order. These figures are
arranged in ascending order as follows:
5, 7, 10, 15, 18, 19, 21, 25, 33
Now, as the series consists of an odd number of items, to find the value of the middle
item we use the formula $\frac{n + 1}{2}$, where n is the number of items. In this case,
n is 9, so $\frac{n + 1}{2} = 5$; that is, the size of the 5th item is the median. This
happens to be 18.
Suppose the series contains one more item, 23. We then have to include 23 in the above
series at the appropriate place, that is, between 21 and 25. Thus, the series is now
5, 7, 10, 15, 18, 19, 21, 23, 25, 33. Applying the above formula, the median is the size
of the 5.5th item. Here, we have to take the average of the values of the 5th and 6th
items. This means an average of 18 and 19, which gives the median as 18.5.
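The two cases (odd and even number of items) can be sketched in Python. The function name `median` is chosen for the illustration; Python's standard `statistics.median` behaves the same way for these inputs.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the (n + 1)/2-th item
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of the two middle items


series = [15, 19, 21, 7, 10, 33, 25, 18, 5]
print(median(series))          # 18
print(median(series + [23]))   # 18.5
```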
It may be noted that the formula $\frac{n + 1}{2}$ itself is not the formula for the
median; it merely indicates the position of the median, namely, the number of items we
have to count until we arrive at the item whose value is the median. In the case of an
even number of items in the series, we identify the two items whose values have to be
averaged to obtain the median. In the case of a grouped series, the median is
calculated by linear interpolation with the help of the following formula:
$M = l_1 + \frac{l_2 - l_1}{f}(m - c)$
Where M = the median
l1 = the lower limit of the class in which the median lies
l2 = the upper limit of the class in which the median lies
f = the frequency of the class in which the median lies
m = the position of the middle item, that is, the (n + 1)/2th item, where n stands
for the total number of items
c = the cumulative frequency of the class preceding the one in which the median lies
Example 2.7:
Monthly Wages (Rs) No. of Workers

800-1,000 18
1,000-1,200 25
1,200-1,400 30
1,400-1,600 34
1,600-1,800 26
1,800-2,000 10
Total 143
In order to calculate the median in this case, we first have to add cumulative
frequencies to the table. Thus, the table with the cumulative frequency is written as:
Monthly Wages (Rs)    Frequency    Cumulative Frequency
800-1,000                18               18
1,000-1,200              25               43
1,200-1,400              30               73
1,400-1,600              34              107
1,600-1,800              26              133
1,800-2,000              10              143
$M = l_1 + \frac{l_2 - l_1}{f}(m - c)$
$m = \frac{n + 1}{2} = \frac{143 + 1}{2} = 72$
It means the median lies in the class-interval Rs 1,200 - 1,400.
Now, $M = 1200 + \frac{1400 - 1200}{30}(72 - 43) = 1200 + \frac{200}{30}(29)$
= Rs 1,393.3
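The interpolation formula can be sketched in Python as follows. The tuple layout `(lower, upper, frequency)` and the name `interpolate_position` are assumptions made for this illustration.

```python
def interpolate_position(classes, m):
    """Value at position m by linear interpolation: l1 + (l2 - l1) / f * (m - c)."""
    c = 0  # cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        if c + f >= m:
            return lower + (upper - lower) / f * (m - c)
        c += f
    raise ValueError("position m exceeds the total frequency")


# Monthly wages from Example 2.7 as (lower limit, upper limit, no. of workers).
wages = [
    (800, 1000, 18), (1000, 1200, 25), (1200, 1400, 30),
    (1400, 1600, 34), (1600, 1800, 26), (1800, 2000, 10),
]
n = sum(f for _, _, f in wages)                  # 143
median = interpolate_position(wages, (n + 1) / 2)
print(round(median, 1))  # 1393.3
```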
At this stage, let us introduce two other concepts, namely quartiles and deciles. To
understand these, we should first know that the median belongs to a general class of
statistical descriptions called fractiles. A fractile is a value below which a given
fraction of a set of data lies. In the case of the median, this fraction is one-half (1/2).
Likewise, a quartile has a fraction of one-fourth (1/4). The three quartiles Q1, Q2 and Q3
are such that 25 percent of the data fall below Q1, 25 percent fall between Q1 and Q2,
25 percent fall between Q2 and Q3 and 25 percent fall above Q3. It will be seen that Q2
is the median. We can use the above formula for the calculation of quartiles as well.
The only difference will be in the value of m. Let us calculate both Q1 and Q3 in
respect of the table given in Example 2.7.
$Q_1 = l_1 + \frac{l_2 - l_1}{f}(m - c)$
Here, m will be $\frac{n + 1}{4} = \frac{143 + 1}{4} = 36$
$Q_1 = 1000 + \frac{1200 - 1000}{25}(36 - 18) = 1000 + \frac{200}{25}(18)$
= Rs 1,144
In the case of Q3, m will be $\frac{3(n + 1)}{4} = \frac{3 \times 144}{4} = 108$
$Q_3 = 1600 + \frac{1800 - 1600}{26}(108 - 107) = 1600 + \frac{200}{26}(1)$
= Rs 1,607.7 approx.
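Since only the position m changes from one fractile to another, a single helper can cover quartiles, deciles and percentiles. The sketch below assumes the position rule k(n + 1)/parts used in the text; the name `grouped_fractile` and its parameters are hypothetical.

```python
def grouped_fractile(classes, k, parts):
    """k-th fractile when the series is divided into `parts` equal parts.

    classes is a list of (lower, upper, frequency) in ascending order.
    The position is m = k * (n + 1) / parts, as in the text.
    """
    n = sum(f for _, _, f in classes)
    m = k * (n + 1) / parts
    c = 0
    for lower, upper, f in classes:
        if c + f >= m:
            return lower + (upper - lower) / f * (m - c)
        c += f
    raise ValueError("position exceeds the total frequency")


wages = [
    (800, 1000, 18), (1000, 1200, 25), (1200, 1400, 30),
    (1400, 1600, 34), (1600, 1800, 26), (1800, 2000, 10),
]
print(round(grouped_fractile(wages, 1, 4), 1))  # 1144.0  (Q1)
print(round(grouped_fractile(wages, 3, 4), 1))  # 1607.7  (Q3)
```

Calling it with `parts=10` or `parts=100` gives deciles and percentiles in the same way.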
In the same manner, we can calculate deciles (where the series is divided into 10
parts) and percentiles (where the series is divided into 100 parts). It may be noted that,
unlike the arithmetic mean, the median is not affected by extreme values, as it is a
positional average. As such, the median is particularly useful when a distribution
happens to be skewed. Another point in favour of the median is that it can be
computed when a distribution has open-end classes. Yet another merit of the median is
that it can be used when a distribution contains qualitative data that can be ranked;
the arithmetic mean cannot be computed for such a distribution. Let us take a couple
of examples to illustrate what has been said in favour of the median.
Example 2.8: Calculate the most suitable average for the following data:
Size of the Item    Below 50    50-100    100-150    150-200    200 and above
Frequency              15         20         36         40           10
Solution: Since the data have two open-end classes, one at the beginning (below 50) and the
other at the end (200 and above), the median should be the right choice as a measure of central
tendency.
Table 2.6: Computation of Median
Size of Item Frequency Cumulative Frequency
Below 50 15 15
50-100 20 35
100-150 36 71
150-200 40 111
200 and above 10 121
Median is the size of the $\frac{n + 1}{2}$th item $= \frac{121 + 1}{2}$ = 61st item.
Now, the 61st item lies in the 100-150 class.
Median $= l_1 + \frac{l_2 - l_1}{f}(m - c) = 100 + \frac{150 - 100}{36}(61 - 35)$
= 100 + 36.11 = 136.11 approx.
Example 2.9: The following data give the savings bank accounts balances of nine sample
households selected in a survey. The figures are in rupees.
745   2,000   1,500   68,000   461   549   3,750   1,800   4,795
(a) Find the mean and the median for these data; (b) Do these data contain an outlier? If so,
exclude this value and recalculate the mean and median. Which of these summary measures
shows a greater change when the outlier is dropped?; (c) Which of these two summary measures
is more appropriate for this series?
Solution:
(a) Mean = Rs $\frac{745 + 2{,}000 + 1{,}500 + 68{,}000 + 461 + 549 + 3{,}750 + 1{,}800 + 4{,}795}{9}$
= Rs $\frac{83{,}600}{9}$ = Rs 9,289
Median = size of the $\frac{n + 1}{2}$th item $= \frac{9 + 1}{2}$ = 5th item
Arranging the data in ascending order, we find that the median is Rs 1,800.
(b) An item of Rs 68,000 is excessively high. Such a figure is called an 'outlier'. We
exclude this figure and recalculate both the mean and the median.
Mean = Rs $\frac{83{,}600 - 68{,}000}{8}$ = Rs $\frac{15{,}600}{8}$ = Rs 1,950
Median = size of the $\frac{n + 1}{2}$th item $= \frac{8 + 1}{2}$ = 4.5th item
= Rs $\frac{1{,}500 + 1{,}800}{2}$ = Rs 1,650
It will be seen that the mean shows a far greater change than the median when the
outlier is dropped from the calculations.
(c) As far as these data are concerned, the median will be a more appropriate measure
than the mean.
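The effect of the outlier on the two measures can be reproduced with Python's standard `statistics` module. This is a small sketch; treating the largest value as the outlier is an assumption that happens to fit these data, not a general rule for detecting outliers.

```python
from statistics import mean, median

balances = [745, 2000, 1500, 68000, 461, 549, 3750, 1800, 4795]

# Assumption for this sketch: the single largest value is the outlier.
outlier = max(balances)
trimmed = [b for b in balances if b != outlier]

print(round(mean(balances)), median(balances))  # 9289 1800
print(round(mean(trimmed)), median(trimmed))    # 1950 1650.0

mean_change = mean(balances) - mean(trimmed)        # roughly 7,339
median_change = median(balances) - median(trimmed)  # 150
print(mean_change > median_change)  # True
```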
Further, we can determine the median graphically as follows:
Example 2.10: Suppose we are given the following series:
Class interval 0-10 10-20 20-30 30-40 40-50 50-60 60-70
Frequency 6 12 22 37 17 8 5
We are asked to draw both types of ogive from these data and to determine the
median.
Solution:
First of all, we transform the given data into two cumulative frequency distributions,
one based on ‘less than’ and another on ‘more than’ methods.
Table A
Class interval    Cumulative Frequency
Less than 10             6
Less than 20            18
Less than 30            40
Less than 40            77
Less than 50            94
Less than 60           102
Less than 70           107
Table B
Class interval    Cumulative Frequency
More than 0            107
More than 10           101
More than 20            89
More than 30            67
More than 40            30
More than 50            13
More than 60             5
It may be noted that the point of intersection of the two ogives gives the value of the
median. From this point of intersection A, we draw a straight line to meet the X-axis at
M. Thus, the distance from the point of origin to the point M gives the value of the
median, which comes to 34, approximately. If we calculate the median by applying the
formula, then the answer comes to 33.8, or 34, approximately. It may be pointed out that
even a single ogive can be used to determine the median. Just as we have determined the
median graphically, we can also find the values of quartiles, deciles or percentiles
graphically. For example, to determine Q3 we have to take the size of the
{3(n + 1)}/4 = 81st item. From this point on the Y-axis, we can draw a perpendicular to
meet the 'less than' ogive, from which another straight line is to be drawn to meet the
X-axis. This point will give us the value of the upper quartile. In the same manner,
Q1 and other values such as deciles and percentiles can be determined.
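The two cumulative frequency distributions, and the point where straight-line ogives drawn through them would cross, can be computed as follows. This is a sketch with hypothetical names. It relies on the fact that the 'less than' and 'more than' cumulative frequencies always add up to n, so piecewise-linear ogives cross at a height of n/2; this position differs slightly from the (n + 1)/2 used in the formula, which is why the two answers are close but not identical.

```python
from itertools import accumulate

bounds = [0, 10, 20, 30, 40, 50, 60, 70]   # class limits from Example 2.10
freqs = [6, 12, 22, 37, 17, 8, 5]

less_than = list(accumulate(freqs))                     # Table A
total = less_than[-1]
more_than = [total - c for c in [0] + less_than[:-1]]   # Table B

print(less_than)  # [6, 18, 40, 77, 94, 102, 107]
print(more_than)  # [107, 101, 89, 67, 30, 13, 5]


def ogive_intersection(bounds, freqs):
    """x-value where the 'less than' ogive reaches n / 2 (linear interpolation)."""
    half = sum(freqs) / 2
    c = 0
    for lower, upper, f in zip(bounds, bounds[1:], freqs):
        if c + f >= half:
            return lower + (upper - lower) * (half - c) / f
        c += f


print(round(ogive_intersection(bounds, freqs), 1))  # 33.6, i.e. 34 approximately
```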
2.3.1 CHARACTERISTICS OF THE MEDIAN
1. Unlike the arithmetic mean, the median can be computed from open-ended
distributions. This is because it is located in the median class-interval, which
would not be an open-ended class.
2. The median can also be determined graphically whereas the arithmetic mean
cannot be ascertained in this manner.
3. As it is not influenced by the extreme values, it is preferred in case of a
distribution having extreme values.
4. In case of the qualitative data where the items are not counted or measured but
are scored or ranked, it is the most appropriate measure of central tendency.
2.4 MODE
The mode is another measure of central tendency. It is the value at the point around
which the items are most heavily concentrated. As an example, consider the following
series: 8, 9, 11, 15, 16, 12, 15, 3, 7, 15
There are ten observations in the series, in which the figure 15 occurs the maximum
number of times, namely three. The mode is therefore 15. The series given above is a
discrete series; as such, the variable cannot take fractional values. If the series were
continuous, we could say that the mode is approximately 15, without further computation.
In the case of grouped data, mode is determined by the following formula:
$\text{Mode} = l_1 + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times i$
Where, l1 = the lower limit of the class in which the mode lies
f1 = the frequency of the class in which the mode lies
f0 = the frequency of the class preceding the modal class
f2 = the frequency of the class succeeding the modal class
i = the class-interval of the modal class
While applying the above formula, we should ensure that the class-intervals are
uniform throughout. If the class-intervals are not uniform, then they should be made
uniform on the assumption that the frequencies are evenly distributed throughout the
class. In the case of unequal class-intervals, the application of the above formula will
give misleading results.
Example 2.11: Let us take the following frequency distribution:
Class intervals (1) Frequency (2)

30-40 4
40-50 6
50-60 8
60-70 12
70-80 9
80-90 7
90-100 4
We have to calculate the mode in respect of this series.
Solution: We can see from Column (2) of the table that the maximum frequency of
12 lies in the class-interval of 60-70. This suggests that the mode lies in this class-
interval. Applying the formula given earlier, we get:
$\text{Mode} = 60 + \frac{12 - 8}{(12 - 8) + (12 - 9)} \times 10 = 60 + \frac{4}{4 + 3} \times 10$
= 65.7 approx.
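The grouped-data formula for the mode can be sketched in Python as below. The function name `grouped_mode` is hypothetical. The sketch assumes uniform class-intervals and a single class with the highest frequency, and it treats a missing neighbouring class as having frequency 0, which is an assumption of this illustration rather than something stated in the text.

```python
def grouped_mode(classes):
    """Mode = l1 + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * i for the modal class."""
    freqs = [f for _, _, f in classes]
    k = freqs.index(max(freqs))                     # modal class by inspection
    lower, upper, f1 = classes[k]
    f0 = freqs[k - 1] if k > 0 else 0               # assumed 0 if no preceding class
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0  # assumed 0 if no succeeding class
    return lower + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * (upper - lower)


# Frequency distribution from Example 2.11 as (lower, upper, frequency).
distribution = [
    (30, 40, 4), (40, 50, 6), (50, 60, 8), (60, 70, 12),
    (70, 80, 9), (80, 90, 7), (90, 100, 4),
]
print(round(grouped_mode(distribution), 1))  # 65.7
```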
In several cases, just by inspection one can identify the class-interval in which the
mode lies. One should find the highest frequency and then identify the class-interval
to which this frequency belongs. Having done this, the formula given for
calculating the mode in a grouped frequency distribution can be applied.
At times, it is not possible to identify by inspection the class where the mode lies. In
such cases, it becomes necessary to use the method of grouping. This method consists
of two parts:
(i) Preparation of a grouping table: A grouping table has six columns, the first
column showing the frequencies as given in the problem. Column 2 shows
frequencies grouped in two's, starting from the top. Leaving the first
frequency, column 3 shows frequencies grouped in two's. Column 4 shows the
frequencies of the first three items, then second to fourth item and so on.
Column 5 leaves the first frequency and groups the remaining items in three's.
Column 6 leaves the first two frequencies and then groups the remaining in
three's. Now, the maximum total in each column is marked and shown either
in a circle or in a bold type.
(ii) Preparation of an analysis table: After having prepared a grouping table, an
analysis table is prepared. On the left-hand side, provide the first column for
column numbers and on the right-hand side the different possible values of
mode. The highest values marked in the grouping table are shown here by a
bar or by simply entering 1 in the relevant cell corresponding to the values
they represent. The last row of this table will show the number of times a
particular value has occurred in the grouping table. The highest value in the
analysis table will indicate the class-interval in which the mode lies. The
procedure of preparing both the grouping and analysis tables to locate the
modal class will be clear by taking an example.
Example 2.12: The following table gives some frequency data:
Size of Item Frequency
10-20 10
20-30 18
30-40 25
40-50 26
50-60 17
60-70 4
Solution:
Grouping Table
Size of item   1       2       3       4       5       6
10-20          10
                       28
20-30          18                      53
                               43
30-40          25                              69
                       51
40-50          26                                      68
                               43
50-60          17                      47
                       21
60-70           4
Maximum total in each column: 26, 51, 43 (occurring twice), 53, 69 and 68.
Analysis Table
               Size of item
Col. No.   10-20   20-30   30-40   40-50   50-60
1                                    1
2                            1       1
3                    1       1       1       1
4            1       1       1
5                    1       1       1
6                            1       1       1
Total        1       3       5       5       2
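The grouping and analysis tables can also be produced programmatically. The sketch below is one possible reading of the procedure: the six columns are described by a group size and a starting offset, and when two groups tie for the maximum in a column (as in column 3 here) both are counted. The function name `grouping_analysis` is hypothetical.

```python
def grouping_analysis(freqs):
    """Return, for each class, how often it belongs to a column maximum."""
    n = len(freqs)
    counts = [0] * n
    # (group size, starting offset) for columns 1 to 6 of the grouping table.
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
    for size, start in columns:
        groups = [
            (sum(freqs[i:i + size]), range(i, i + size))
            for i in range(start, n - size + 1, size)
        ]
        best = max(total for total, _ in groups)
        for total, members in groups:
            if total == best:           # ties are all counted
                for i in members:
                    counts[i] += 1
    return counts


freqs = [10, 18, 25, 26, 17, 4]   # classes 10-20 ... 60-70 from Example 2.12
print(grouping_analysis(freqs))   # [1, 3, 5, 5, 2, 0]
```

The two largest totals (5 each, for 30-40 and 40-50) correspond to the last row of the analysis table.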
This is a bi-modal series as is evident from the analysis table, which shows that the
two classes 30-40 and 40-50 have occurred five times each in the grouping. In such a
situation, we may have to determine the mode indirectly by applying the following
formula:
Mode = 3 median - 2 mean
Median = size of the (n + 1)/2th item, that is, 101/2 = 50.5th item. This lies in the class
30-40. Applying the formula for the median, as given earlier, we get
$\text{Median} = 30 + \frac{40 - 30}{25}(50.5 - 28) = 30 + 9 = 39$
Now, arithmetic mean is to be calculated. This is shown in the following table.
Class-interval   Frequency (f)   Mid-point     d    d' = d/10    fd'
10-20                10              15      -20       -2        -20
20-30                18              25      -10       -1        -18
30-40                25              35        0        0          0
40-50                26              45       10        1         26
50-60                17              55       20        2         34
60-70                 4              65       30        3         12
Total               100                                            34
Deviations are taken from the arbitrary mean A = 35.
$\text{Mean} = A + \frac{\sum fd'}{n} \times i = 35 + \frac{34}{100} \times 10 = 38.4$
Mode = 3 median - 2 mean
= (3 × 39) - (2 × 38.4)
= 117 - 76.8
= 40.2
This formula, Mode = 3 Median - 2 Mean, is only an empirical formula and can
give only approximate results. As such, its frequent use should be avoided. However,
when the mode is ill-defined or the series is bimodal (as is the case in the present
example) it may be used.
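The empirical relation is a one-line computation. The sketch below uses a hypothetical function name and plugs in the median and mean obtained in Example 2.12.

```python
def empirical_mode(mean, median):
    """Approximate mode from the empirical relation Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


# Median 39 and mean 38.4 from Example 2.12.
print(round(empirical_mode(mean=38.4, median=39), 1))  # 40.2
```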
2.5 RELATIONSHIPS OF THE MEAN, MEDIAN AND MODE
Having discussed mean, median and mode, we now turn to the relationship amongst
these three measures of central tendency. We shall discuss the relationship assuming
that there is a unimodal frequency distribution.
(i) When a distribution is symmetrical, the mean, median and mode are the same,
as is shown in the following figure.
In case a distribution is skewed to the right, then mean > median > mode.
Generally, income distribution is skewed to the right, where a large number of
families have relatively low income and a small number of families have extremely
high income. In such a case, the mean is pulled up by the extremely high incomes
and the relation among these three measures is as shown in Fig. 6.3.
(ii) When a distribution is skewed to the left, then mode > median > mean. This is
because here the mean is pulled down below the median by extremely low values.
This is shown in the figure.
(iii) Given the mean and median of a unimodal distribution, we can determine
whether it is skewed to the right or left. When mean > median, it is skewed to the
right; when median > mean, it is skewed to the left. It may be noted that the
median generally lies between the mean and the mode.
2.6 THE BEST MEASURE OF CENTRAL TENDENCY
At this stage, one may ask which of these three measures of central tendency is the
best. There is no simple answer to this question, because these three measures
are based upon different concepts. The arithmetic mean is the sum of the values
divided by the total number of observations in the series. The median is the value of
the middle observation that divides the series into two equal parts. The mode is the value
around which the observations tend to concentrate. As such, the use of a particular
measure will largely depend on the purpose of the study and the nature of the data.
For example, when we are interested in knowing consumers' preferences for
different brands of television sets or different kinds of advertising, the choice should
go in favour of the mode. The use of the mean and median would not be proper. However,
the median can sometimes be used in the case of qualitative data when such data can
be arranged in an ascending or descending order. Let us take another example.
Suppose we invite applications for a certain vacancy in our company. A large number
of candidates apply for that post. We are now interested in knowing which age or
age group has the largest concentration of applicants. Here, obviously the mode will
be the most appropriate choice. The arithmetic mean may not be appropriate as it may
MEASURES OF CENTRAL TENDENCY
OBJECTIVES
After going through this unit, you will learn:
• the concept and significance of measures of central tendency
• to compute various measures of central tendency, such as arithmetic mean, median, mode and quartiles
• the relationship among various averages.
INTRODUCTION
The objective here is to find one representative value which can be used to locate and summarise the
entire set of varying values. This one value can be used to make many decisions concerning the entire
set. We can define measures of central tendency (or location) to find some central value around which the
data tend to cluster.
SIGNIFICANCE OF MEASURES OF CENTRAL TENDENCY
Measures of central tendency, i.e. condensing the mass of data into one single value, enable us to get an
idea of the entire data. For example, it is impossible to remember the individual incomes of millions of
earning people of India. But if the average income is obtained, we get one single value that represents the
entire population.
Measures of central tendency also enable us to compare two or more sets of data.
For example, the average sales figures of April may be compared with the sales figures of previous
months.
PROPERTIES OF A GOOD MEASURE OF CENTRAL TENDENCY
[1] It should be easy to understand and calculate.
[2] It should be rigidly defined.
[3] It should be based on all observations.
[4] It should be least affected by sampling fluctuation.
[5] It should be capable of further algebraic treatment.
[6] It should be least affected by extreme values.
[7] It should be calculable even when there are open-end intervals.
Following are some of the important measures of central tendency which are commonly used in business
and industry.
• Arithmetic Mean
• Weighted Arithmetic Mean
• Median
• Quantiles (quartiles, deciles and percentiles)
• Mode
• Geometric Mean
• Harmonic Mean
ARITHMETIC MEAN
The arithmetic mean (or mean or average) is the most commonly used and readily understood measure of
central tendency. In statistics, the term average refers to any of the measures of central tendency.
Ungrouped data/Raw data
The arithmetic mean is defined as being equal to the sum of the numerical values of each and every
observation divided by the total number of observations. Symbolically, it can be represented as:
$\bar{X} = \frac{\sum X}{N}$
where ∑X indicates the sum of the values of all the observations, and N is the total number of
observations.
For example, let us consider the monthly salaries (Rs.) of 10 employees of a firm:
2500, 2700, 2400, 2300, 2550, 2650, 2750, 2450, 2600, 2400
If we compute the arithmetic mean, then
∑X = 2500 + 2700 + 2400 + 2300 + 2550 + 2650 + 2750 + 2450 + 2600 + 2400 = 25300
Mean = 25300/10 = Rs. 2530.
Therefore, the average monthly salary is Rs. 2530.
Discrete data
When the observations are classified into a frequency distribution, the arithmetic mean for
discrete data is defined as
$\bar{X} = \frac{\sum fX}{N}$
where f is the frequency of the corresponding value of the variable X and N is the total frequency, i.e. N = ∑f.
X f fx
10 12 120
20 23 460
30 35 1050
40 47 1880
50 38 1900
60 29 1740
70 16 1120
Sum 200 8270
Mean=8270/200= 41.35
Continuous Data
When the observations are classified into a frequency distribution with class intervals, the
arithmetic mean for grouped data is defined as
$\bar{X} = \frac{\sum fX}{N}$
where X is the midpoint of each class, f is the frequency of the corresponding class and N is the total
frequency, i.e. N = ∑f.
This method is illustrated for the following data, which relate to the monthly sales of 200 firms.
The midpoint of the class interval is treated as the representative average value of that class.
Monthly Sales      Mid point    No. of firms     fX
(Rs. Thousand)        X              f
300-350              325             5          1625
350-400              375            14          5250
400-450              425            23          9775
450-500              475            50         23750
500-550              525            52         27300
550-600              575            25         14375
600-650              625            22         13750
650-700              675             7          4725
700-750              725             2          1450
Sum                                200        102000
Mean = 102000/200 = 510
MERITS OF MEAN
[1] It is easy to understand and calculate.

[2] It is rigidly defined.
[3] It is based on all observations.
[4] It is least affected by sampling fluctuation.
[5] It is capable of further algebraic treatment.
DEMERITS OF MEAN
1) It is highly affected by extreme values.
2) It cannot be calculated when there are open-end intervals.
MEDIAN
A second measure of central tendency is the median. Median is that value which divides the distribution
into two equal parts. Fifty per cent of the observations in the distribution are above the value of median and
other fifty per cent of the observations are below this value of median. The median is the value of the
middle observation when the series is arranged in order of size or magnitude (Ascending order).
UNGROUPED DATA
If the number of observations is odd, then the median is equal to one of the original observations (the middle one).
Median = $\left(\frac{N + 1}{2}\right)$th value
For example, if the income of seven persons in rupees is 1100, 1200, 1350, 1500, 1550, 1600, 1800, then
Median = $\left(\frac{7 + 1}{2}\right)$th value = 4th value
Median = 1500
If the number of observations is even, then the median is the arithmetic mean of the two middle
observations.
Median = $\left(\frac{N + 1}{2}\right)$th value, that is, the mean of the (N/2)th and (N/2 + 1)th values
For example, if the income of eight persons in rupees is 1100, 1200, 1350, 1500, 1550, 1600, 1800, 1850,
then the median income of eight persons would be (1500 + 1550)/2 = 1525
DISCRETE SERIES
First we find the cumulative frequencies, then locate the ((N + 1)/2)th value in the cumulative
frequency column. The corresponding value of X is the median.
X f cf
10 12 12
20 23 35
30 35 70
40 47 117
50 38 155
60 29 184
70 16 200
Sum 200
N=200
N+1/2=100.5
Median= 40
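As a rough sketch, the lookup in the cumulative frequency column can be written in Python as follows. The helper name value_at_position is hypothetical; it simply returns the first X whose cumulative frequency reaches the required position.

```python
from itertools import accumulate

xs = [10, 20, 30, 40, 50, 60, 70]
freqs = [12, 23, 35, 47, 38, 29, 16]


def value_at_position(xs, freqs, position):
    """Return the first X whose cumulative frequency is at least `position`."""
    for x, cf in zip(xs, accumulate(freqs)):
        if cf >= position:
            return x
    raise ValueError("position exceeds the total frequency")


n = sum(freqs)                                  # N = 200
median = value_at_position(xs, freqs, (n + 1) / 2)  # locate the 100.5th value
print(median)  # 40
```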
CONTINUOUS DATA
For continuous data, first find the cumulative frequencies, then locate the ((N + 1)/2)th value in the
cumulative frequency column. The corresponding class interval is the median class. The following formula
may then be used to find the value of the median:
Median = l1 + {(N/2 - cf) * h}/f
where l1 is the lower limit of the median class, cf is the cumulative frequency of the class preceding the
median class, f is the frequency of the median class and h is the width of the median class.
Consider the following data which relate to the age distribution of 1000 workers in an industrial
establishment.
The location of median value is facilitated by the use of a cumulative frequency distribution as shown
below in the table.
Age (Years)      No. of workers    Cumulative frequency
                 f                 c.f
Below 25         120               120
25-30            125               245
30-35            180               425 (cf)
35-40            160 (f)           585
40-45            150               735
45-50            140               875
50-55            100               975
55 and above     25                1000
N = 1000
Median class: ((1000 + 1)/2)th = 500.5th value, which lies in the class 35-40
l1 = 35, cf = 425, f = 160, N/2 = 500, h = 5
Median = 35 + {(500 - 425) * 5}/160 = 35 + 375/160 = 35 + 2.34 = 37.34
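The same computation can be sketched in Python. The representation below is an assumption made for the example: each class is a (lower, upper) pair and None marks an open end ("Below 25", "55 and above"). The function name grouped_median is likewise hypothetical.

```python
# Age classes and number of workers from the table above; None marks an open end.
classes = [(None, 25), (25, 30), (30, 35), (35, 40),
           (40, 45), (45, 50), (50, 55), (55, None)]
freqs = [120, 125, 180, 160, 150, 140, 100, 25]


def grouped_median(classes, freqs):
    """Median = l1 + (N/2 - cf) * h / f, applied to the median class."""
    n = sum(freqs)
    position = (n + 1) / 2          # used only to locate the median class
    cf = 0                          # cumulative frequency before the current class
    for (lower, upper), f in zip(classes, freqs):
        if cf + f >= position:
            if lower is None or upper is None:
                raise ValueError("median falls in an open-end class")
            h = upper - lower       # class width
            return lower + (n / 2 - cf) * h / f
        cf += f
    raise ValueError("empty distribution")


print(round(grouped_median(classes, freqs), 2))  # 37.34
```

Because the open-end classes are never the median class here, their missing limits do not affect the result, which is why the median remains usable for such distributions.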
MERITS OF MEDIAN

[2] It is rigidly defined.
[3] It is not affected by extreme values.
[4] It can be calculated for distributions with open-end class intervals.
[5] It can also be located graphically.
DEMERITS OF MEDIAN
[1] It is not based on all observations.

[2] It is affected by sampling fluctuation.
[3] It is not capable of further algebraic treatment.
MODE
The mode is the typical or most commonly observed value in a set of data. It is defined as the value
which occurs most often, i.e. with the greatest frequency. The dictionary meaning of the term mode is
'most usual'.
For example, in the series of numbers 3, 4, 5, 5, 6, 7, 8, 8, 8, 9, the mode is 8 because it occurs the
maximum number of times. In ungrouped data, therefore, the mode can be found by inspection alone.
DISCRETE DATA
The mode is the value of X which has the highest frequency.
For example,
X f
10 12
20 23
30 35
40 47
50 38
60 29
70 16
Sum 200
Mode=40
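Finding the mode of a discrete series amounts to picking the X with the largest f. The small Python sketch below does this; the function name discrete_mode is an assumption, and returning a list is a design choice made here so that ties (more than one value with the highest frequency) are not silently dropped.

```python
xs = [10, 20, 30, 40, 50, 60, 70]
freqs = [12, 23, 35, 47, 38, 29, 16]


def discrete_mode(xs, freqs):
    """Return every X that attains the highest frequency (all ties are kept)."""
    top = max(freqs)
    return [x for x, f in zip(xs, freqs) if f == top]


print(discrete_mode(xs, freqs))  # [40]
```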
CONTINUOUS DATA
First, find the modal class, i.e. the class with the highest frequency.
Mode = l1 + {(f1 - f0)/(2f1 - f0 - f2)} * h
where l1 is the lower limit of the modal class, f1 is the frequency of the modal class, f0 is the frequency
of the preceding class, f2 is the frequency of the succeeding class and h is the width of the modal class.
To illustrate the computation of mode, let us consider the following data.
Age (Years)      No. of workers
                 f
Below 25         120
25-30            125 (f0)
30-35            180 (f1)
35-40            160 (f2)
40-45            150
45-50            140
50-55            100
55 and above     25
Modal class = 30-35
l1 = 30, f1 = 180, f0 = 125, f2 = 160, h = 5
Mode = 30 + {(180 - 125)/(2*180 - 125 - 160)} * 5 = 30 + (55/75) * 5 = 30 + 3.67 = 33.67
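A minimal Python sketch of the modal-class formula follows. Several details are assumptions made for the example and are not stated in the notes: classes are stored as (lower, upper) pairs with None for an open end, a missing neighbouring class is treated as having frequency 0, and the first class is used if several share the highest frequency.

```python
classes = [(None, 25), (25, 30), (30, 35), (35, 40),
           (40, 45), (45, 50), (50, 55), (55, None)]
freqs = [120, 125, 180, 160, 150, 140, 100, 25]


def grouped_mode(classes, freqs):
    """Mode = l1 + (f1 - f0) / (2*f1 - f0 - f2) * h, applied to the modal class."""
    i = freqs.index(max(freqs))                      # modal class (first one on ties)
    lower, upper = classes[i]
    if lower is None or upper is None:
        raise ValueError("modal class is open-ended")
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                # assumption: 0 if no preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # assumption: 0 if no succeeding class
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined when 2*f1 - f0 - f2 = 0")
    return lower + (f1 - f0) / denominator * (upper - lower)


print(round(grouped_mode(classes, freqs), 2))  # 33.67
```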
MERITS OF MODE

[2] It is not affected by extreme values.
[3] It can be calculated for distributions with open-end class intervals.
[4] It can also be located graphically.
DEMERITS OF MODE
[1] It is not based on all observations.

[2] It is highly affected by sampling fluctuation.
[3] It is not capable of further algebraic treatment
[4] It is not rigidly defined.
RELATIONSHIP AMONG MEAN, MEDIAN AND MODE
For a moderately skewed distribution, the following empirical relationship holds approximately:
Mode = 3 Median - 2 Mean
QUARTILES
Quartiles are those values which divide the total data into four equal parts. Since three points divide the
distribution into four equal parts, we shall have three quartiles. Let us call them Q1, Q2, and Q3. The first
quartile, Q1, is the value such that 25% of the observations are smaller and 75% of the observations are
larger. The second quartile, Q2, is the median, i.e., 50% of the observations are smaller and 50% are
larger. The third quartile, Q3, is the value such that 75% of the observations are smaller and 25% of the
observations are larger.
UNGROUPED DATA
Arrange the observations in ascending order. For n observations, the first and third quartiles are located
as follows:
Q1 = ((n + 1)/4)th value
Q3 = (3(n + 1)/4)th value
For example, if the incomes of eight persons in rupees are 1100, 1200, 1350, 1500, 1550, 1600, 1800, 1850,
then
Q1 = ((8 + 1)/4)th value = 2.25th value
Q1 = 1200 + 0.25(1350 - 1200) = 1200 + 37.5 = 1237.5
Q3 = (3(8 + 1)/4)th value = 6.75th value
Q3 = 1600 + 0.75(1800 - 1600) = 1600 + 150 = 1750
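The interpolation step for a fractional position such as the 2.25th value can be sketched in Python as below, using the eight incomes listed above. The helper names value_at and quartiles are assumptions for illustration. Other position rules for quartiles exist, so results from statistical software may differ slightly from this (n + 1) rule.

```python
def value_at(data_sorted, position):
    """Value at a 1-based, possibly fractional position, by linear interpolation."""
    if not 1 <= position <= len(data_sorted):
        raise ValueError("position outside the data")
    k = int(position)            # whole part, e.g. 2 for the 2.25th value
    frac = position - k          # fractional part, e.g. 0.25
    if frac == 0:
        return data_sorted[k - 1]
    lower = data_sorted[k - 1]   # kth value
    upper = data_sorted[k]       # (k + 1)th value
    return lower + frac * (upper - lower)


def quartiles(values):
    """Q1 and Q3 located at the ((n + 1)/4)th and (3(n + 1)/4)th values."""
    data = sorted(values)
    n = len(data)
    return value_at(data, (n + 1) / 4), value_at(data, 3 * (n + 1) / 4)


incomes = [1100, 1200, 1350, 1500, 1550, 1600, 1800, 1850]
print(quartiles(incomes))  # (1237.5, 1750.0)
```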
DISCRETE SERIES
First find the cumulative frequencies, then locate the ((N + 1)/4)th and (3(N + 1)/4)th values in the
cumulative frequency column. The corresponding values of X are Q1 and Q3 respectively.
X      f      cf
10     12     12
20     23     35
30     35     70     (first quartile)
40     47     117
50     38     155    (third quartile)
60     29     184
70     16     200
Sum    200
N = 200
Q1 = ((N + 1)/4)th value = 50.25th value
Q1 = 30
Q3 = (3(N + 1)/4)th value = 150.75th value
Q3 = 50
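For a discrete series, both quartiles come from the same cumulative frequency lookup, only with different positions. A short Python sketch follows; the helper name value_at_position is hypothetical.

```python
from itertools import accumulate

xs = [10, 20, 30, 40, 50, 60, 70]
freqs = [12, 23, 35, 47, 38, 29, 16]


def value_at_position(xs, freqs, position):
    """Return the first X whose cumulative frequency is at least `position`."""
    for x, cf in zip(xs, accumulate(freqs)):
        if cf >= position:
            return x
    raise ValueError("position exceeds the total frequency")


n = sum(freqs)                                       # N = 200
q1 = value_at_position(xs, freqs, (n + 1) / 4)       # 50.25th value
q3 = value_at_position(xs, freqs, 3 * (n + 1) / 4)   # 150.75th value
print(q1, q3)  # 30 50
```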
CONTINUOUS DATA
For continuous data, first find the cumulative frequencies, then locate the ((N + 1)/4)th and
(3(N + 1)/4)th values in the cumulative frequency column. The corresponding class intervals are the first
quartile class and the third quartile class respectively. The following formulae may then be used to find
the values of the quartiles:
Q1 = l1 + {(N/4 - cf) * h}/f
where l1 is the lower limit of the first quartile class, cf is the cumulative frequency of the class
preceding the first quartile class, f is the frequency of the first quartile class and h is the width of
the first quartile class.
Q3 = l1 + {(3N/4 - cf) * h}/f
where l1 is the lower limit of the third quartile class, cf is the cumulative frequency of the class
preceding the third quartile class, f is the frequency of the third quartile class and h is the width of
the third quartile class.
Consider the following data which relate to the age distribution of 1000 workers in an industrial
establishment.
The location of quartile value is facilitated by the use of a cumulative frequency distribution as shown
below in the table.
Age (Years)      No. of workers    Cumulative frequency
                 f                 c.f
Below 25         120               120
25-30            125               245
30-35            180               425      (first quartile class)
35-40            160               585
40-45            150               735
45-50            140               875      (third quartile class)
50-55            100               975
55 and above     25                1000
N = 1000
Q1 = ((1000 + 1)/4)th = 250.25th value, which lies in the class 30-35 (the cumulative frequency first
reaches 250.25 at 425)
l1 = 30, cf = 245, f = 180, N/4 = 250, h = 5
Q1 = 30 + {(250 - 245) * 5}/180 = 30 + 25/180 = 30 + 0.14 = 30.14
Q3 = (3(1000 + 1)/4)th = 750.75th value, which lies in the class 45-50
l1 = 45, cf = 735, f = 140, 3N/4 = 750, h = 5
Q3 = 45 + {(750 - 735) * 5}/140 = 45 + 75/140 = 45 + 0.54 = 45.54
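The quartile formulae for continuous data can be sketched in Python as follows. The function name grouped_quartile and the (lower, upper) class representation with None for open ends are assumptions made for the example. The function locates the quartile class itself from the cumulative frequencies, so it also serves as a check on which class has been chosen.

```python
classes = [(None, 25), (25, 30), (30, 35), (35, 40),
           (40, 45), (45, 50), (50, 55), (55, None)]
freqs = [120, 125, 180, 160, 150, 140, 100, 25]


def grouped_quartile(classes, freqs, k):
    """kth quartile (k = 1, 2 or 3): l1 + (k*N/4 - cf) * h / f for the quartile class."""
    n = sum(freqs)
    position = k * (n + 1) / 4      # used only to locate the quartile class
    cf = 0                          # cumulative frequency before the current class
    for (lower, upper), f in zip(classes, freqs):
        if cf + f >= position:
            if lower is None or upper is None:
                raise ValueError("quartile falls in an open-end class")
            return lower + (k * n / 4 - cf) * (upper - lower) / f
        cf += f
    raise ValueError("empty distribution")


print(round(grouped_quartile(classes, freqs, 1), 2))  # 30.14
print(round(grouped_quartile(classes, freqs, 3), 2))  # 45.54
```

With k = 2 the same function reproduces the median formula.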
1] The following is the cumulative frequency distribution of the preferred length of study table, obtained
from a preference study of 50 students.
A manufacturer has to decide on the length of study table to manufacture. What length would
you recommend and why?
2] An incomplete distribution of daily sales (Rs. thousand) is given below. The data relate to 229 days.
Daily sales        No. of days    Daily sales        No. of days
(Rs. thousand)                    (Rs. thousand)
10-20              12             50-60              ?
20-30              30             60-70              25
30-40              ?              70-80              18
You are told that the median value is 46. Using the median formula, fill in the missing frequencies and
calculate the arithmetic mean of the completed data.
