Measures of Dispersion
The Range
In the preceding section we introduced three types of average values for a data
set: the mean, the median, and the mode. Some characteristics of a set of data may not be
evident from an examination of averages. For instance, consider a soft-drink dispensing
machine that should dispense 8 oz of your selection into a cup. Table 4.5 shows data for
two of these machines.
Table 4.5
Machine 1 (oz)    Machine 2 (oz)
10.07             7.95
5.85              8.03
8.15              8.02
x̄ = 8.0           x̄ = 8.0
The mean data value for each machine is 8 oz. However, look at the variation
in data values for machine 1. The quantity of soda dispensed is very inconsistent: in
some cases soda overflows the cup, and in other cases too little soda is dispensed. The
machine obviously needs adjustment. Machine 2, on the other hand, is working just
fine. The quantity dispensed is very consistent with little variation.
This example shows that average values do not reflect the spread or
dispersion of data. To measure the spread or dispersion of data, we must introduce
statistical values known as the range and the standard deviation.
Range
The range of a set of data values is the difference between the greatest data value
and the least data value.
Example: Find a Range
Find the range of the number of ounces dispensed by machine 1 in Table 4.5.
Solution:
The greatest number of ounces dispensed is 10.07 and the least is 5.85. The
range of the number of ounces is 10.07 - 5.85 = 4.22 oz.
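The range calculation above can be sketched in a few lines of Python. This is a minimal illustration, assuming the machine 1 values are the rows shown in Table 4.5; the function name `data_range` is a hypothetical helper, not something defined in the text.

```python
# Ounces dispensed by machine 1, taken from the rows shown in Table 4.5.
machine_1 = [10.07, 5.85, 8.15]

def data_range(values):
    """Return the greatest data value minus the least data value."""
    return max(values) - min(values)

# Round to hundredths to match the precision of the data.
print(round(data_range(machine_1), 2))  # 4.22
```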
The Standard Deviation
The range of a set of data is easy to compute, but it can be deceiving. The range
is a measure that depends only on the two most extreme values, and as such it is very
sensitive to them. A measure of dispersion that is less sensitive to extreme values is the
standard deviation. The standard deviation of a set of numerical data makes use of the
amount by which each individual data value deviates from the mean. These deviations,
represented by ( x - x̄ ), are positive when the data value x is greater than the mean x̄
and are negative when x is less than the mean x̄ . The sum of all the deviations ( x - x̄ )
is 0 for all sets of data. This is shown in Table 4.6 for the machine 2 data of Table 4.5.
Table 4.6
x       x - x̄
7.95    7.95 - 8 = -0.05
8.03    8.03 - 8 = 0.03
8.02    8.02 - 8 = 0.02
Sum of deviations = 0
Because the sum of all the deviations of the data values from the mean is always
0, we cannot use the sum of the deviations as a measure of dispersion for a set of data.
Instead, the standard deviation uses the sum of the squares of the deviations.
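The following sketch checks both claims numerically: the deviations from the mean cancel out, while their squares do not. It assumes the machine 2 values are the rows shown in Table 4.5, and it compares the sum against a small tolerance because floating-point arithmetic is not exact.

```python
# Ounces dispensed by machine 2, taken from the rows shown in Table 4.5.
machine_2 = [7.95, 8.03, 8.02]

mean = sum(machine_2) / len(machine_2)
deviations = [x - mean for x in machine_2]

# The deviations cancel, so their sum is 0 (up to rounding error).
sum_of_deviations = sum(deviations)
print(abs(sum_of_deviations) < 1e-9)  # True

# Squaring removes the signs, so the squared deviations accumulate.
sum_of_squares = sum(d ** 2 for d in deviations)
print(round(sum_of_squares, 4))  # 0.0038
```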
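If x1, x2, x3, … , xn is a population of n numbers with a mean of μ, then the standard
deviation of the population is:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}}$$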
If x1, x2, x3, … , xn is a sample of n numbers with a mean of x̄ , then the standard
deviation of the sample is:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
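The only difference between the population and sample formulas is the divisor, n versus n - 1. The sketch below makes that explicit and checks the results against Python's standard `statistics` module. The function names `population_std` and `sample_std` are hypothetical helpers, and the data are the machine 1 rows shown in Table 4.5, used here only for illustration.

```python
import math
import statistics

def population_std(values):
    """Divide the sum of squared deviations by n (population formula)."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def sample_std(values):
    """Divide the sum of squared deviations by n - 1 (sample formula)."""
    x_bar = sum(values) / len(values)
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (len(values) - 1))

# Illustrative data: the machine 1 rows shown in Table 4.5.
data = [10.07, 5.85, 8.15]

print(round(population_std(data), 3))
print(round(sample_std(data), 3))

# The standard library implements the same two definitions.
print(math.isclose(population_std(data), statistics.pstdev(data)))  # True
print(math.isclose(sample_std(data), statistics.stdev(data)))       # True
```

Because the sample formula divides by a smaller number, it always gives a slightly larger value than the population formula for the same data.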
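Example: Find the Standard Deviation
Find the standard deviation of the following numbers, which were taken from a sample:
2, 4, 7, 12, 15
Solution:
Step 1: Determine the mean of the numbers.
$$\bar{x} = \frac{2 + 4 + 7 + 12 + 15}{5} = \frac{40}{5} = 8$$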
Step 2: For each number, calculate the deviation between the number and the mean.
x x - x̄
2 2 - 8 = -6
4 4 - 8 = -4
7 7 - 8 = -1
12 12 - 8 = 4
15 15 - 8 = 7
Step 3: Calculate the square of each deviation in step 2, and find the sum of these squared
deviations.
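x     x - x̄          ( x - x̄ )^2
2     2 - 8 = -6     ( -6 )^2 = 36
4     4 - 8 = -4     ( -4 )^2 = 16
7     7 - 8 = -1     ( -1 )^2 = 1
12    12 - 8 = 4     4^2 = 16
15    15 - 8 = 7     7^2 = 49
Sum of squared deviations = 118
Step 4: Because the data are a sample of n = 5 values, divide the sum of the squared
deviations by n - 1 = 4.
$$\frac{118}{4} = 29.5$$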
Step 5: The standard deviation of the sample is s = √29.5. To the nearest hundredth, the
standard deviation is s ≈ 5.43.
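The worked example can be reproduced step by step in Python. This is a minimal sketch that mirrors the calculation for the sample 2, 4, 7, 12, 15; the variable names are chosen here for illustration only.

```python
import math

sample = [2, 4, 7, 12, 15]

# Step 1: the mean of the numbers.
mean = sum(sample) / len(sample)

# Step 2: the deviation of each number from the mean.
deviations = [x - mean for x in sample]

# Step 3: the squared deviations and their sum.
squared_deviations = [d ** 2 for d in deviations]
sum_squared = sum(squared_deviations)

# Step 4: divide by n - 1 because the data are a sample.
quotient = sum_squared / (len(sample) - 1)

# Step 5: the standard deviation is the square root of that quotient.
s = math.sqrt(quotient)

print(mean, sum_squared, quotient, round(s, 2))  # 8.0 118.0 29.5 5.43
```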
In the next example we use standard deviation to determine which company
produces batteries that are most consistent with regard to their life expectancy.
$$s = \sqrt{\frac{12.34}{7}} \approx 1.328 \text{ h}$$
The batteries from Dependable have a standard deviation of:
$$s = \sqrt{\frac{3.62}{7}} \approx 0.719 \text{ h}$$
$$s = \sqrt{\frac{5.38}{7}} \approx 0.877 \text{ h}$$
The batteries from Dependable have the smallest standard deviation. According
to these results, the Dependable company produces the most consistent batteries with
regard to life expectancy under constant use.
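The comparison can be checked with a short sketch that starts from the three sums of squared deviations and the divisor 7 quoted in the calculations above. Only Dependable is named in the surviving text, so the other two labels below are placeholders, not the companies' real names.

```python
import math

# Sums of squared deviations as quoted in the text (hours squared).
# Only "Dependable" is named there; the other two keys are placeholder labels.
sum_squared_deviations = {
    "first company (placeholder)": 12.34,
    "Dependable": 3.62,
    "third company (placeholder)": 5.38,
}
divisor = 7  # the divisor used in each of the three calculations

std_devs = {
    name: math.sqrt(total / divisor)
    for name, total in sum_squared_deviations.items()
}

for name, s in std_devs.items():
    print(f"{name}: {s:.3f} h")

# The most consistent batteries have the smallest standard deviation.
most_consistent = min(std_devs, key=std_devs.get)
print(most_consistent)  # Dependable
```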
The Variance
A statistic known as the variance is also used as a measure of dispersion. The
variance for a given set of data is the square of the standard deviation of the data. The
term variance refers to a statistical measurement of the spread between numbers in a data
set. More specifically, variance measures how far each number in the set is from the mean
and thus from every other number in the set.
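The relationship between the variance and the standard deviation can be illustrated with the sample 2, 4, 7, 12, 15 from the earlier example. The sketch below assumes the sample form of the variance (dividing by n - 1); `sample_variance` is a hypothetical helper name, and the result is checked against `statistics.variance` from Python's standard library.

```python
import math
import statistics

sample = [2, 4, 7, 12, 15]

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

variance = sample_variance(sample)
std_dev = math.sqrt(variance)

print(variance)                                              # 29.5
print(math.isclose(std_dev ** 2, variance))                  # True
print(math.isclose(variance, statistics.variance(sample)))   # True
```

Note that the variance is expressed in squared units (for example, square ounces), which is one reason the standard deviation is often easier to interpret.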
Solution:
Consider the data x1, x2, …, xn, which are arranged in ascending order. The
average, or mean, of this data is