Median and Mode Calculation
AND BIOSTATISTICS
MEDIAN AND MODE
Application of Data Analysis
Data analysis is mostly used in anatomy and physiology to assess what is normal in
health, as in the case of:
▪ Height
▪ Weight
▪ Blood pressure
▪ Cholesterol in blood
▪ Pulse rate
▪ RBC count etc.
Methods of Data Analysis
▪ Calculation of Averages,
▪ Percentiles,
▪ Standard Deviation,
▪ Standard Error etc.
Statistical Averages
▪ Measures of central tendency (or statistical averages) tell us the point about which items
have a tendency to cluster.
▪ Such a measure is considered the most representative figure for the entire data set and
is a key way to discuss and communicate data with graphs.
▪ Average value of a characteristic is the one central value around which all other
observations are dispersed.
▪ The term central tendency refers to the middle, or typical, value of a set of data. It is most
commonly measured by using the three m's: Mean, Median, and Mode.
Use of Statistical Averages
▪ To find that most of the normal observations lie close to the central value, while the few
that are too large or too small lie far away at both ends.
▪ To find which group is better off by comparing the average of one group with that of the
other.
e.g. one finds the average incubation period of cholera is smaller than that of
typhoid; income of pleaders is higher than that of doctors; average daily attendance
of one hospital is higher than that of another and so on.
After finding the difference, one may reason out why in one group it is more than
that in the other.
MEDIAN
The median of a series of observations is that value above which there are
as many scores as below it; that is, it divides a rank-ordered distribution
into two equal halves.
▪ This measure of central tendency is typically used when the mean value is affected by an
unusually low number or an unusually high number in the data set (outliers). Outliers
distort the mean value to the extent that the mean value no longer accurately depicts the
set of data.
Median value = 10
In this case, the mean gives a distorted result because one observation, 32, is too large, so
the mean should not be considered an appropriate measure of central tendency. To have a
better idea of the average, one should ignore unduly high observations such as 32 in the above
example. The mean of the remaining observations will be 54/6 = 9.0, which is much closer to the
median (10) than the mean of 12.3 calculated with all seven observations.
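The arithmetic above can be checked with a short sketch. The individual observations are not listed here, so the seven values below are hypothetical, chosen only to agree with the figures quoted in the text (one observation of 32, the remaining six summing to 54, and a median of 10).

```python
from statistics import mean, median

# Hypothetical observations: only the summary figures appear in the text,
# so these values are an assumption chosen to reproduce them.
observations = [7, 8, 9, 10, 10, 10, 32]

full_mean = mean(observations)        # 86 / 7, pulled upward by 32
full_median = median(observations)    # 4th of the 7 ordered values

# Drop the unduly high observation and recompute the mean.
remaining = [x for x in observations if x != 32]
remaining_mean = mean(remaining)      # 54 / 6

print(round(full_mean, 1), full_median, remaining_mean)
```

The median stays at 10 whether or not the outlier is kept, which is why it is preferred in such cases.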
Conclusion
▪ Median, therefore, is a better indicator of central value when one or more of the lowest or
the highest observations are wide apart or not so evenly distributed.
For example: If one of the houses in your neighborhood was run down and
had a low property value, then you would not want to include this property
when determining the value of your own home. However, if you are purchasing a home
in that neighborhood, you may want to include the outlier since it would drive down the
price you would have to pay.
Calculation Of Median - Ungrouped Series
▪ Arrange the observations in the series in ascending or descending order of magnitude.
The central observation of the arranged series gives the median.
▪ ESRs of seven subjects are arranged in ascending order: 3, 4, 4, (5), 5, 6, 7. The 4th
observation (5) is the median in this series. When a distribution contains an odd
number of scores, such as 4, 5, 6, 7, 8, the middle score, 6, is the median.
▪ When a distribution contains an even number of scores, the midpoint between the two
middle scores is the median, so for the series 4, 5, 6, 7, 8, 9, the median lies halfway
between 6 and 7. Therefore, the median equals 6.5.
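The procedure above (sort, then take the central observation or the midpoint of the two central observations) can be sketched as follows. The function name median_ungrouped is illustrative and not part of the original material.

```python
def median_ungrouped(values):
    """Median of an ungrouped series: sort, then take the central value(s)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty series is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single central observation is the median.
        return ordered[mid]
    # Even count: the median is halfway between the two central observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


esr = [3, 4, 4, 5, 5, 6, 7]                        # ESRs of seven subjects
print(median_ungrouped(esr))                       # odd series
print(median_ungrouped([4, 5, 6, 7, 8]))           # odd series
print(median_ungrouped([4, 5, 6, 7, 8, 9]))        # even series
```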
Formula
▪ The median is represented as M.
▪ For n observations arranged in order, the median is the value at position (n + 1)/2.
Example: odd number of values
Step 1: Organize the data, or arrange the numbers from smallest to largest.
2, 6, 8, 10, 12, 14, 16
Step 2: Since the number of data values is odd (n = 7), the median will be found at the
(7 + 1)/2 = 4th position.
Step 3: In this case, the median is the value that is found in the
fourth position of the organized data, so M = 10.
Example: even number of values
Step 1: Organize the data, or arrange the numbers from smallest to largest.
1, 1, 3, 4, 4, 6, 7, 8, 9, 11
Step 2: Since the number of data values is even (n = 10), the position (10 + 1)/2 = 5.5 falls
between two values, so the median will be the mean value of the numbers found before and
after the 5.5 position.
Step 3: The number found before the 5.5 position is 4 and the number found after the
5.5 position is 6. Now, you need to find the mean value: M = (4 + 6)/2 = 5.
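The steps above can be sketched in code, assuming the usual position rule (n + 1)/2, which is consistent with the fourth position (n = 7) and the 5.5 position (n = 10) used in the two examples. The function name median_by_position is illustrative only.

```python
import math


def median_by_position(values):
    """Return (position, median) using the 1-based position (n + 1) / 2."""
    ordered = sorted(values)            # Step 1: organize the data
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty series is undefined")
    position = (n + 1) / 2              # Step 2: locate the median position
    # Step 3: take the values at or on either side of that position.
    before = ordered[math.floor(position) - 1]
    after = ordered[math.ceil(position) - 1]
    return position, (before + after) / 2


print(median_by_position([2, 6, 8, 10, 12, 14, 16]))
print(median_by_position([1, 1, 3, 4, 4, 6, 7, 8, 9, 11]))
```

For an odd count the values before and after coincide, so the same expression covers both cases.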
The median class is the first class whose cumulative frequency is at least
n/2.
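A minimal sketch of this rule follows. It uses the five rows tabulated later in this document and reads them as (class interval, frequency) pairs, which is an assumption because that table carries no column headings. The function name median_class is illustrative.

```python
from itertools import accumulate


def median_class(classes, frequencies):
    """Return the first class whose cumulative frequency reaches n / 2."""
    n = sum(frequencies)
    for label, cumulative in zip(classes, accumulate(frequencies)):
        if n > 0 and cumulative >= n / 2:
            return label, cumulative
    raise ValueError("no observations, so there is no median class")


# Assumed reading of the table: class interval, then frequency.
classes = ["1-10", "11-20", "21-30", "31-40", "41-50"]
frequencies = [8, 14, 12, 9, 7]

print(median_class(classes, frequencies))
```

Here n = 50, so n/2 = 25; the cumulative frequencies run 8, 22, 34, and the third class is the first to reach 25.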
MODE
▪ This is the most frequently occurring observation in a series, i.e. the most
common or most fashionable value, such as 8 mm in the tuberculin test of 10 boys
given below: 3, 5, 7, 7, 8, 8, 8, 10, 11, 12.
▪ Mode is rarely used in medical studies. Of the three measures of central
tendency, the mean is better and is utilized more often because it uses all the
observations in the data and is further used in tests of significance.
Advantages
▪ Mode is the most commonly or frequently occurring value in a series. The mode in a
distribution is that item around which there is maximum concentration.
▪ Mode is the size of the item which has the maximum frequency, but at times such an
item may not be the mode on account of the effect of the frequencies of the
neighbouring items.
▪ Like the median, the mode is a positional average and is not affected by the values of
extreme items. It is, therefore, useful in all situations where we want to
eliminate the effect of extreme variations.
▪ Mode is particularly useful in the study of popular sizes.
▪ Mode can be determined for qualitative data as well as quantitative data, but
the mean and the median can only be determined for quantitative data.
Example
A manufacturer of shoes is usually interested in finding out the size most in
demand so that he may manufacture a larger quantity of that size. In other words,
he wants a modal size to be determined, for the median or mean size would not serve
his purpose. But there are certain limitations of the mode as well.
▪ If a data set has only one value that occurs most often, the set is
called unimodal.
▪ A data set that has two values that occur with the same greatest
frequency is referred to as bimodal.
▪ When a set of data has more than two values that occur with the
same greatest frequency, the set is called multimodal.
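The three definitions above can be sketched with the standard library function statistics.multimode (Python 3.8 or later), which returns every value sharing the greatest frequency. The helper name describe_modes and the sample lists are hypothetical, used only for illustration.

```python
from statistics import multimode


def describe_modes(values):
    """Return the modes and a label based on how many values tie for the top."""
    modes = multimode(values)   # all values with the greatest frequency
    if not modes:
        kind = "no mode"
    elif len(modes) == 1:
        kind = "unimodal"
    elif len(modes) == 2:
        kind = "bimodal"
    else:
        kind = "multimodal"
    return modes, kind


# Hypothetical data sets, one per case.
print(describe_modes([1, 2, 2, 3]))
print(describe_modes([1, 1, 2, 2, 3]))
print(describe_modes([1, 1, 2, 2, 3, 3, 4]))
```

Note that when every value occurs exactly once, multimode returns all of them, so this sketch would label such a set by the number of distinct values; many texts instead describe that set as having no mode.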
Example
The mode is the most frequent score in our data set. On a bar chart or histogram it is
represented by the highest bar. You can, therefore, sometimes consider the mode as being
the most popular option.
Example
Normally, the mode is used for categorical data where we wish to know which is the most
common category. In the transport data set used as an illustration, the most common form
of transport is the bus.
Example
However, one of the problems with the mode is that it is not unique, so it leaves us with
problems when we have two or more values that share the highest frequency.
Disadvantages
▪ The downside to using the mode as a measure of central tendency is that a set of data may
have no mode, or it may have more than one mode. However, the same set of data will
have only one mean and only one median.
▪ When determining the mode of a data set, calculations are not required, but keen
observation is a must. The mode is a measure of central tendency that is simple to locate,
but it is not used much in practical applications.
For example, consider measuring the weight of 30 people (to the nearest 0.1 kg).
How likely is it that we will find two or more people with exactly the same weight
(e.g., 67.4 kg)? The answer is that it is probably very unlikely: many people might be
close, but with such a small sample (30 people) and a large range of possible weights, you
are unlikely to find two people with exactly the same weight, that is, to the nearest
0.1 kg. This is why the mode is very rarely used with continuous data.
Example – Location
Another problem with the mode is that it will not provide us with a very good measure of
central tendency when the most common mark is far away from the rest of the data in the
data set.
▪ There is no need to organize the data, unless you think that it would be easier to
locate the mode if the numbers were arranged from least to greatest.
▪ In the above data set, the number 79 appears twice, but all the other numbers
appear only once.
▪ Since 79 appears with the greatest frequency, it is the mode of the data values.
Example – Qualitative Data
You begin to observe the color of clothing employees wear at a company. Your goal is to
find out what color is worn most frequently so that you can offer company shirts to your
employees.
Monday: Red, Blue, Black, Pink, Green, and Blue
Tuesday: Green, Blue, Pink, White, Blue, and Blue
Wednesday: Orange, White, White, Blue, Blue, and Red
Thursday: Brown, Black, Brown, Blue, White, and Blue
Friday: Blue, Black, Blue, Red, Red, and Pink
The color blue was worn 11 times during the week. All other colors were worn with much
less frequency in comparison to the color blue. The mode is blue.
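The tally for this example can be reproduced with a short sketch using collections.Counter from the standard library. The variable names are illustrative; the colors are copied from the five days listed above.

```python
from collections import Counter

# Colors observed on each day, as listed in the example.
week = {
    "Monday": ["Red", "Blue", "Black", "Pink", "Green", "Blue"],
    "Tuesday": ["Green", "Blue", "Pink", "White", "Blue", "Blue"],
    "Wednesday": ["Orange", "White", "White", "Blue", "Blue", "Red"],
    "Thursday": ["Brown", "Black", "Brown", "Blue", "White", "Blue"],
    "Friday": ["Blue", "Black", "Blue", "Red", "Red", "Pink"],
}

# Count every observation across the week, then take the most frequent color.
counts = Counter(color for day in week.values() for color in day)
modal_color, modal_count = counts.most_common(1)[0]

print(modal_color, modal_count)
```

Because the data are qualitative, no ordering or arithmetic is needed: counting alone identifies the mode.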
Class interval    Frequency
1 – 10            8
11 – 20           14
21 – 30           12
31 – 40           9
41 – 50           7
Mode – Histogram Method
Mode can also be obtained from a histogram.
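In a histogram the modal class is the class with the tallest bar, that is, the greatest frequency. The sketch below draws a rough text histogram and picks out that class. It assumes the five rows tabulated above are (class interval, frequency) pairs; the function name modal_class is illustrative. It identifies only the modal class, not a single interpolated mode value within it.

```python
def modal_class(classes, frequencies):
    """Return the class(es) with the greatest frequency and that frequency."""
    if not frequencies or len(classes) != len(frequencies):
        raise ValueError("classes and frequencies must be non-empty and aligned")
    top = max(frequencies)
    # More than one class is returned if several bars tie for the tallest.
    return [c for c, f in zip(classes, frequencies) if f == top], top


# Assumed reading of the table: class interval, then frequency.
classes = ["1-10", "11-20", "21-30", "31-40", "41-50"]
frequencies = [8, 14, 12, 9, 7]

# Text histogram: one '#' per observation in each class.
for label, freq in zip(classes, frequencies):
    print(f"{label:>6} | {'#' * freq}")

print(modal_class(classes, frequencies))
```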