Ch03 Statistics For Economic & MGT


CHAPTER 3

NUMERICAL DESCRIPTIVE
MEASURES
Lecturer: Mr. Abdishakur Abdi Bile, BPA, MBA
• Measures of central tendency (also called measures of location) are scores that indicate where the center of a distribution tends to be located.

• The following five measures of average, or central tendency, are in common use:

(i) Arithmetic mean (also called the arithmetic average or simple mean)
(ii) Median
(iii) Mode
(iv) Geometric mean
(v) Harmonic mean

• The arithmetic mean, geometric mean, and harmonic mean are usually called mathematical averages, while the mode and median are called positional averages.
MEASURES OF CENTRAL TENDENCY FOR UNGROUPED DATA
• Mean
• Median
• Mode
• Relationships among the Mean, Median, and Mode

Prem Mann, Introductory Statistics, 8/E


Copyright © 2013 John Wiley & Sons. All rights reserved.
Figure 3.1

Population Mean
For ungrouped data, the population mean is the sum of all the population values divided by the total number of population values:

\[ \mu = \frac{\sum x}{N} \]

Sample Mean
For ungrouped data, the sample mean is the sum of all the sample values divided by the number of sample values:

\[ \bar{x} = \frac{\sum x}{n} \]
Mean (Arithmetic Average)
The mean for ungrouped data is obtained by dividing the
sum of all values by the number of values in the data set. Thus,

Mean for population data: \( \mu = \dfrac{\sum x}{N} \)

Mean for sample data: \( \bar{x} = \dfrac{\sum x}{n} \)

where \( \sum x \) is the sum of all values; \( N \) is the population size; \( n \) is the sample size; \( \mu \) is the population mean; and \( \bar{x} \) is the sample mean.
Characteristics of the Mean

The arithmetic mean is the most widely used measure of location.
It requires the interval scale.
Major characteristics:
1. All values are used.
2. It is unique.
3. The sum of the deviations from the mean is 0.
4. It is calculated by summing the values and dividing by the number of values.
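
Characteristic 3 can be checked numerically. The following is a minimal sketch, assuming a small set of illustrative values that are not taken from the chapter; it computes each deviation from the mean and adds them up.

```python
from statistics import fmean

# Illustrative values only; they do not come from the chapter.
values = [4, 8, 15, 16, 23]

mean = fmean(values)                      # arithmetic mean
deviations = [x - mean for x in values]   # x minus the mean, for each value
total_deviation = sum(deviations)

print(mean)
print(total_deviation)  # zero, up to floating-point rounding
```

Any other list of numbers should behave the same way, because the positive and negative deviations cancel by construction.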
Example 3-1
Table 3.1 lists the total cash donations (rounded to millions of
dollars) given by eight U.S. companies during the year 2010
(Source: Based on U.S. Internal Revenue Service data
analyzed by The Chronicle of Philanthropy and USA TODAY).

Table 3.1 Cash Donations in 2010 by Eight U.S.
Companies

Find the mean of cash donations made by these eight companies.
Example 3-1: Solution

\[ \sum x = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 = 319 + 199 + 110 + 63 + 21 + 315 + 26 + 63 = 1116 \]

\[ \bar{x} = \frac{\sum x}{n} = \frac{1116}{8} = 139.5 = \$139.5 \text{ million} \]

Thus, these eight companies donated an average of $139.5 million in 2010 for charitable purposes.
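
The arithmetic in Example 3-1 can be reproduced with a few lines of Python. This is only a sketch: the variable names are illustrative, and the values are the eight donation amounts listed in the solution.

```python
# Cash donations (millions of dollars) as listed in the Example 3-1 solution.
donations = [319, 199, 110, 63, 21, 315, 26, 63]

total = sum(donations)     # sum of all values
n = len(donations)         # sample size
sample_mean = total / n    # sum divided by the number of values

print(total, n, sample_mean)
```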
Example 3-2
The following are the ages (in years) of all eight employees of a
small company:
53 32 61 27 39 44 49 57
Find the mean age of these employees.

Example 3-2: Solution
The population mean is

\[ \mu = \frac{\sum x}{N} = \frac{362}{8} = 45.25 \text{ years} \]

Thus, the mean age of all eight employees of this company is 45.25 years, or 45 years and 3 months.
Example 3-3
Table 3.2 lists the total number of homes lost to foreclosure in
seven states during 2010.

Table 3.2 Number of Homes Foreclosed in 2010

Example 3-3
Note that the number of homes foreclosed in California is
very large compared to those in the other six states.
Hence, it is an outlier. Show how the inclusion of this outlier
affects the value of the mean.

Example 3-3: Solution
• If we do not include the number of homes foreclosed in California (the outlier), the mean of the number of foreclosed homes in six states is

\[ \text{Mean without the outlier} = \frac{49{,}723 + 20{,}352 + 10{,}824 + 40{,}911 + 18{,}038 + 61{,}848}{6} = \frac{201{,}696}{6} = 33{,}616 \]
Example 3-3: Solution
• Now, to see the impact of the outlier on the value of the mean, we include the number of homes foreclosed in California and find the mean number of homes foreclosed in the seven states. This mean is

\[ \text{Mean with the outlier} = \frac{173{,}175 + 49{,}723 + 20{,}352 + 10{,}824 + 40{,}911 + 18{,}038 + 61{,}848}{7} = \frac{374{,}871}{7} = 53{,}553 \]

Thus, including California raises the mean from 33,616 to 53,553, an increase of about 59%.
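
The effect of the outlier in Example 3-3 can be sketched in Python as follows. The variable names are illustrative; the figures are the seven foreclosure counts used in the solution, with California held separately as the outlier.

```python
from statistics import fmean

# Foreclosure counts from Example 3-3; California is the outlier.
california = 173_175
other_states = [49_723, 20_352, 10_824, 40_911, 18_038, 61_848]

mean_without = fmean(other_states)                # six states
mean_with = fmean([california] + other_states)    # all seven states

print(mean_without)
print(mean_with)
print(mean_with - mean_without)   # how far one outlier shifts the mean
```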
Case Study 3-1 Average NFL Ticket Prices in the
Secondary Market

Median
• Definition
  The median is the value of the middle term in a data set that has been ranked in increasing order. Equivalently, it is the midpoint of the values after they have been ordered from the smallest to the largest, or from the largest to the smallest.
• The calculation of the median consists of the following two steps:
1. Rank the data set in increasing order.
2. Find the middle term. The value of this term is the median.
Median: Computational Procedure
• First Procedure
  - Arrange the observations in an ordered array.
  - If the number of terms is odd, the median is the middle term of the ordered array.
  - If the number of terms is even, the median is the average of the middle two terms.

• Second Procedure
  - The median's position in an ordered array is given by (n + 1)/2. When n is even, this position falls halfway between the two middle terms, so the median is their average.

Slide 3-21 © 2002 Thomson / South-Western


Example 3-4
Refer to the data on the number of homes foreclosed in seven states given in Table 3.2 of Example 3-3. Those values are listed below.

173,175 49,723 20,352 10,824 40,911 18,038 61,848

Find the median for these data.
Example 3-4: Solution
First, we rank the given data in increasing order as follows:
10,824 18,038 20,352 40,911 49,723 61,848 173,175
Since there are seven values in this data set, the middle term is the fourth term, which is 40,911.

Thus, the median number of homes foreclosed in these seven states was 40,911 in 2010.
Example 3-5
• Table 3.3 gives the total compensations (in millions of dollars) for the year 2010 of the 12 highest-paid CEOs of U.S. companies.
Table 3.3 Total Compensations of 12 Highest-Paid CEOs for the Year 2010
Find the median for these data.
Example 3-5: Solution
• First, we rank the given total compensations of the 12 CEOs as follows:

21.6 21.7 22.9 25.2 26.5 28.0 28.2 32.6 32.9 70.1 76.1 84.5

• There are 12 values in this data set. Because there is an even number of values in the data set, the median is given by the average of the two middle values.
Example 3-5: Solution
• The two middle values are the sixth and seventh in the arranged data, and these two values are 28.0 and 28.2.

\[ \text{Median} = \frac{28.0 + 28.2}{2} = \frac{56.2}{2} = 28.1 = \$28.1 \text{ million} \]

• Thus, the median for the 2010 compensations of these 12 CEOs is $28.1 million.
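
The two-step median procedure (rank, then take the middle term or average the two middle terms) can be sketched as a small Python function. The function name and variable names are illustrative; the data are the values from Examples 3-4 and 3-5, with the CEO compensations taken from the ranked list in the solution.

```python
def median(values):
    """Median of ungrouped data: rank the values, then locate the middle."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]                       # odd n: the single middle term
    return (ranked[mid - 1] + ranked[mid]) / 2   # even n: average of the two middle terms

foreclosures = [173_175, 49_723, 20_352, 10_824, 40_911, 18_038, 61_848]
ceo_pay = [21.6, 21.7, 22.9, 25.2, 26.5, 28.0, 28.2, 32.6, 32.9, 70.1, 76.1, 84.5]

print(median(foreclosures))
print(round(median(ceo_pay), 1))
```

Sorting inside the function means the caller does not have to rank the data first, which mirrors step 1 of the procedure.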
Median
The median gives the center of a histogram, with half the data
values to the left of the median and half to the right of the
median. The advantage of using the median as a measure of
central tendency is that it is not influenced by outliers.
Consequently, the median is preferred over the mean as a
measure of central tendency for data sets that contain
outliers.

Case Study 3-3 Education Pays

Mode
• Definition
  The mode is the value that occurs with the highest frequency in a data set, that is, the number that appears most frequently in a set of numbers.
Example 3-6
• The following data give the speeds (in miles per hour) of eight cars that were stopped on I-95 for speeding violations.

77 82 74 81 79 84 74 78

Find the mode.
Example 3-6: Solution
• In this data set, 74 occurs twice and each of the remaining values occurs only once. Because 74 occurs with the highest frequency, it is the mode. Therefore,

Mode = 74 miles per hour
Mode
• A major shortcoming of the mode is that a data set may have no mode or may have more than one mode, whereas it will have only one mean and only one median.
• Unimodal: A data set with only one mode.
• Bimodal: A data set with two modes.
• Multimodal: A data set with more than two modes.
Example 3-7 (Data set with no mode)
• Last year's incomes of five randomly selected families were $76,150, $95,750, $124,985, $87,490, and $53,740.

• Find the mode.
Example 3-7: Solution
• Because each value in this data set occurs only once, this data set contains no mode.
Example 3-8 (Data set with two modes)
A small company has 12 employees. Their commuting times
(rounded to the nearest minute) from home to work are 23,
36, 12, 23, 47, 32, 8, 12, 26, 31, 18, and 28, respectively.

Find the mode for these data.

Example 3-8: Solution
In the given data on the commuting times of the 12
employees, each of the values 12 and 23 occurs twice, and
each of the remaining values occurs only once. Therefore, that
data set has two modes: 12 and 23 minutes.

Example 3-9 (Data set with three modes)
The ages of 10 randomly selected students from a class are 21,
19, 27, 22, 29, 19, 25, 21, 22 and 30 years, respectively.

Find the mode.

Example 3-9: Solution
This data set has three modes: 19, 21 and 22. Each of these
three values occurs with a (highest) frequency of 2.

Mode
One advantage of the mode is that it can be calculated for
both kinds of data - quantitative and qualitative - whereas the
mean and median can be calculated for only quantitative data.

Example 3-10
• The statuses of five students who are members of the student senate at a college are senior, sophomore, senior, junior, and senior, respectively. Find the mode.
Example 3-10: Solution
• Because senior occurs more frequently than the other categories, it is the mode for this data set. We cannot calculate the mean and median for this data set.
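
The mode examples (3-6 through 3-10) can be reproduced with a short Python sketch. The function name `modes` is illustrative. It follows the convention used in these slides that a data set in which every value occurs only once has no mode; note that Python's built-in `statistics.multimode` uses a different convention in that case and returns every value.

```python
from collections import Counter

def modes(values):
    """Return all modes, sorted; an empty list means the data set has no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []   # every value occurs once: no mode (the slides' convention)
    return sorted(v for v, c in counts.items() if c == highest)

speeds = [77, 82, 74, 81, 79, 84, 74, 78]                       # Example 3-6
incomes = [76_150, 95_750, 124_985, 87_490, 53_740]             # Example 3-7
commutes = [23, 36, 12, 23, 47, 32, 8, 12, 26, 31, 18, 28]      # Example 3-8
ages = [21, 19, 27, 22, 29, 19, 25, 21, 22, 30]                 # Example 3-9
status = ["senior", "sophomore", "senior", "junior", "senior"]  # Example 3-10

for data in (speeds, incomes, commutes, ages, status):
    print(modes(data))
```

Because the function only counts occurrences, it works for qualitative data such as class status as well as for numbers.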
Relationships Among the Mean, Median, and Mode
1. For a symmetric histogram and frequency distribution with
one peak (see Figure 3.2), the values of the mean,
median, and mode are identical, and they lie at the center
of the distribution.

Figure 3.2 Mean, median, and mode for a symmetric
histogram and frequency distribution curve.

Relationships Among the Mean, Median, and Mode
2. For a histogram and a frequency distribution curve skewed
to the right (see Figure 3.3), the value of the mean is the
largest, that of the mode is the smallest, and the value of
the median lies between these two. (Notice that the mode
always occurs at the peak point.) The value of the mean is
the largest in this case because it is sensitive to outliers
that occur in the right tail. These outliers pull the mean to
the right.

Figure 3.3 Mean, median, and mode for a histogram and
frequency distribution curve skewed to the right.

Relationships Among the Mean, Median, and Mode
3. If a histogram and a frequency distribution curve are
skewed to the left (see Figure 3.4), the value of the mean
is the smallest and that of the mode is the largest, with the
value of the median lying between these two. In this case,
the outliers in the left tail pull the mean to the left.

Figure 3.4 Mean, median, and mode for a histogram and
frequency distribution curve skewed to the left.
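
The ordering of the three measures under skewness can be illustrated with a small Python sketch. The data below are made up for illustration and do not come from the chapter: one list has a long right tail, and the other is its mirror image with a long left tail.

```python
from statistics import fmean, median, mode

# Illustrative data only (not from the chapter): a long right tail.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
# Mirror image of the same data, which gives a long left tail.
left_skewed = [15 - x for x in right_skewed]

def summary(data):
    """Return (mode, median, mean) for a data set with a single mode."""
    return mode(data), median(data), fmean(data)

print(summary(right_skewed))   # mode < median < mean
print(summary(left_skewed))    # mean < median < mode
```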

Mean for Grouped Data
Grouped data are data or scores that are arranged in a frequency distribution.

A frequency distribution is an arrangement of scores into categories or classes, together with the frequency of each.

Frequency is the number of observations falling in a category.
MEAN
The mean for grouped data is computed with the midpoint method. The formula is:

\[ \bar{X} = \frac{\sum f x}{n} \]

where
\( \bar{X} \) = mean value
x = midpoint of each class or category
f = frequency in each class or category
\( \sum f x \) = sum of the products f x
n = total number of observations (the sum of the frequencies)
MEAN
Steps in Solving Mean for Grouped Data

1. Find the midpoint or class mark (x) of each class or category.

2. Multiply each frequency by the corresponding class mark to get f x.

3. Find the sum of the results in step 2.

4. Solve for the mean using the formula

\[ \bar{X} = \frac{\sum f x}{n} \]
Example:
The scores of 40 students in a science class on a 60-item test are tabulated below.

Scores     Frequency (f)   Midpoint (x)   fx
10-14      5               12             60
15-19      2               17             34
20-24      3               22             66
25-29      5               27             135
30-34      2               32             64
35-39      9               37             333
40-44      6               42             252
45-49      3               47             141
50-54      5               52             260
Total      n = 40                         Σfx = 1,345

\[ \bar{X} = \frac{\sum f x}{n} = \frac{1345}{40} = 33.625 \approx 33.63 \]
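
The midpoint method for this table can be sketched in Python as follows. The variable names are illustrative; the class limits and frequencies are those in the table above.

```python
# Class limits and frequencies from the science-score table.
classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34),
           (35, 39), (40, 44), (45, 49), (50, 54)]
freqs = [5, 2, 3, 5, 2, 9, 6, 3, 5]

midpoints = [(lower + upper) / 2 for lower, upper in classes]   # step 1: class marks x
fx = [f * x for f, x in zip(freqs, midpoints)]                  # step 2: products f*x
sum_fx = sum(fx)                                                # step 3: sum of f*x
n = sum(freqs)
grouped_mean = sum_fx / n                                       # step 4: mean

print(n, sum_fx, grouped_mean)
```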
MEAN
Analysis:

The mean performance of the 40 students in the science quiz is 33.63. Students who scored below 33.63 performed below the class average, while students who scored above 33.63 performed above it.
Solution
Classes    f     x     fx
10-14      8     12    96
15-19      12    17    204
20-24      15    22    330
25-29      20    27    540
30-34      10    32    320
Total      65          1490

\[ \bar{X} = \frac{\sum f x}{n} = \frac{1490}{65} \approx 22.92 \]
MEAN
Properties of the Mean
• The mean can be calculated for any set of numerical data, so it always exists.
• The mean is the most reliable measure of central tendency since it takes into account every item in the set of data.
• It is easily affected by extreme scores.
• It can be applied at the interval level of measurement.
• It is very easy to compute.
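MEDIAN
Median of Grouped Data
Formula:

\[ \tilde{x} = LB + \left( \frac{\frac{n}{2} - cfp}{fm} \right) \times \mathrm{c.i} \]

where
\( \tilde{x} \) = median value
MC = median class, the category containing the n/2-th score
LB = lower boundary of the median class (MC)
cfp = cumulative frequency before the median class when the scores are arranged from lowest to highest value
fm = frequency of the median class
c.i = class interval (class size)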
MEDIAN
Steps in Solving Median for Grouped Data

1. Complete the table for cf< (the "less than" cumulative frequency).

2. Compute n/2 so that you can identify the median class (MC).

3. Determine LB, cfp, fm, and c.i.

4. Solve for the median using the formula.


Example: The scores of 40 students in a science class on a 60-item test are tabulated below. The highest score is 54 and the lowest score is 10.

Scores     f          cf<
10-14      5          5
15-19      2          7
20-24      3          10
25-29      5          15
30-34      2          17 (cfp)
35-39      9 (fm)     26
40-44      6          32
45-49      3          35
50-54      5          40
           n = 40
Solution:
n/2 = 40/2 = 20
The category containing the n/2-th (20th) score is 35-39.
LL (lower limit) of the MC = 35
LB = 34.5
cfp = 17
fm = 9
c.i = 5

\[ \tilde{x} = LB + \left( \frac{\frac{n}{2} - cfp}{fm} \right) \times \mathrm{c.i} = 34.5 + \frac{20 - 17}{9} \times 5 = 34.5 + \frac{15}{9} \approx 36.17 \]
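
The grouped-median steps can be sketched in Python as follows. The function name is illustrative, and the sketch assumes integer class limits with equal class widths, so that the lower boundary is the lower limit minus 0.5 (as in LB = 34.5 for the class 35-39).

```python
def grouped_median(classes, freqs):
    """Estimate the median of grouped data: LB + ((n/2 - cfp) / fm) * c.i."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("no observations")
    half = n / 2
    cfp = 0   # cumulative frequency before the current class
    for (lower, upper), fm in zip(classes, freqs):
        if fm > 0 and cfp + fm >= half:
            lb = lower - 0.5         # assumed boundary for integer class limits
            ci = upper - lower + 1   # class interval, e.g. 35-39 covers 5 scores
            return lb + (half - cfp) / fm * ci
        cfp += fm
    raise ValueError("median class not found")

classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34),
           (35, 39), (40, 44), (45, 49), (50, 54)]
freqs = [5, 2, 3, 5, 2, 9, 6, 3, 5]

print(round(grouped_median(classes, freqs), 2))
```

The loop plays the role of the cf< column: it accumulates frequencies until the running total first reaches n/2, which identifies the median class.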
MEDIAN
Properties of the Median

• It may not be an actual observation in the data set.
• It can be applied at the ordinal level of measurement.
• It is not affected by extreme values because the median is a positional measure.
MODE
Mode for Grouped Data

To find the mode of grouped data, use the formula:

\[ \hat{X} = LB + \left( \frac{d_1}{d_1 + d_2} \right) \times \mathrm{c.i} \]

where
LB = lower boundary of the modal class
Modal class (MC) = the category containing the highest frequency
d1 = difference between the frequency of the modal class and the frequency of the class above it in the table (the next lower class), when the scores are arranged from lowest to highest
d2 = difference between the frequency of the modal class and the frequency of the class below it in the table (the next higher class), when the scores are arranged from lowest to highest
c.i = class interval (class size)
MODE
Example: The scores of 40 students in a science class on a 60-item test are tabulated below.

x          f
10-14      5
15-19      2
20-24      3
25-29      5
30-34      2
35-39      9
40-44      6
45-49      3
50-54      5
           n = 40
Solution
Modal class = 35-39
LL (lower limit) of MC = 35
LB = 34.5
d1 = 9 - 2 = 7
d2 = 9 - 6 = 3
c.i = 5

\[ \hat{X} = LB + \left( \frac{d_1}{d_1 + d_2} \right) \times \mathrm{c.i} = 34.5 + \frac{7}{7 + 3} \times 5 = 34.5 + \frac{35}{10} = 38 \]

The estimated mode of the score distribution of the 40 students is 38. Within the modal class 35-39, scores are most heavily concentrated around this value.
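
The grouped-mode formula can be sketched in Python as follows. The function name is illustrative, and the sketch assumes integer class limits with equal widths; when several classes tie for the highest frequency, it simply uses the first one, and it treats a missing neighbouring class as having frequency 0. Both choices are assumptions made for this sketch, not rules stated in the slides.

```python
def grouped_mode(classes, freqs):
    """Estimate the mode of grouped data: LB + (d1 / (d1 + d2)) * c.i."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])    # first class with the highest frequency
    lower, upper = classes[i]
    lb = lower - 0.5                                      # assumed boundary for integer limits
    ci = upper - lower + 1                                # class interval
    before = freqs[i - 1] if i > 0 else 0                 # class above it in the table
    after = freqs[i + 1] if i + 1 < len(freqs) else 0     # class below it in the table
    d1 = freqs[i] - before
    d2 = freqs[i] - after
    if d1 + d2 == 0:
        raise ValueError("formula is undefined when d1 + d2 = 0")
    return lb + d1 / (d1 + d2) * ci

classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34),
           (35, 39), (40, 44), (45, 49), (50, 54)]
freqs = [5, 2, 3, 5, 2, 9, 6, 3, 5]

print(grouped_mode(classes, freqs))
```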
MODE
Properties of the Mode

• It can be used when the data are qualitative as well as quantitative.
• It may not be unique.
• It is not affected by extreme values.
• It may not exist.
• It is used when you want to find the value which occurs most often.
Quiz: Find the Mean, Median & Mode

No. of Employees   Midpoint (x)   Frequency (f)   fx
0-4                2              6               12
5-9                7              12              84
10-14              12             7               84
15-19              17             5               85
20-24              22             0               0
Total                             n = 30          Σfx = 265


GEOMETRIC MEAN, HARMONIC MEAN AND WEIGHTED MEAN
Geometric mean
• The geometric mean is a type of average, usually used for growth rates, such as population growth or interest rates. While the arithmetic mean adds items, the geometric mean multiplies items. The geometric mean is defined only for positive numbers.
• Simple mean ≥ geometric mean: it can be shown that the arithmetic mean (i.e., the simple mean) is greater than or equal to the geometric mean if all the observations are non-negative.
• Formula for the geometric mean of two numbers A and B:

\[ \text{Geometric mean} = \sqrt{A \times B} \]

More generally, the geometric mean of n positive values is the n-th root of their product.
Example
• Find the geometric mean of the following two numbers: 36 and 64.
Solution.

\[ \text{Geometric mean} = \sqrt{36 \times 64} = \sqrt{2304} = 48 \]
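
The geometric mean can be sketched in Python as follows. The function name is illustrative and covers the general case of n positive values; the two-value square-root form from the example is shown alongside it. Python's standard library also provides `statistics.geometric_mean`, which could be used instead.

```python
from math import prod, sqrt

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    return prod(values) ** (1 / len(values))

two_value_form = sqrt(36 * 64)          # square root of A x B, as in the example
general_form = geometric_mean([36, 64])

print(two_value_form)
print(general_form)
```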
Harmonic mean
The harmonic mean is defined as the value obtained when the number of values in the data set is divided by the sum of their reciprocals. ... Also, the harmonic mean is more stable than the arithmetic mean when the data set contains large outliers. For example, consider 2, 3, 5, 7, and 60, with 5 observations.
The Basics of Harmonic Mean

The harmonic mean helps to find multiplicative or divisor relationships between fractions without worrying about common denominators. Harmonic means are often used in averaging things like rates (e.g., the average travel speed over several trips of equal distance).
• If a data set has n values x_i, with 1 ≤ i ≤ n, then the harmonic mean H can be found as:
Formula:

\[ H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} \]
Example 1
A man travels 100 miles at a rate of 30 miles per hour and then, on the freeway, travels the next 100 miles at a rate of 55 miles per hour. What is his average speed?
Solution:
Harmonic mean = 2 / ((1/30) + (1/55))
= 2 / (0.033333 + 0.018182)
= 38.8 miles/hour.
Example 2
Find the harmonic mean of:
3, 4, 5, 6.
SOLUTION:
HM = 4 / ((1/3) + (1/4) + (1/5) + (1/6))
= 4 / (57/60)
= 4 × 60/57
= 240/57
= 4.21
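
The harmonic mean can be sketched in Python as follows. The function name is illustrative; exact fractions are used so that the intermediate sums (such as 57/60) match the hand calculation. The last lines compare the harmonic and arithmetic means for the values 2, 3, 5, 7, and 60 mentioned earlier, as one way to see the effect of a large outlier. Python's standard library also provides `statistics.harmonic_mean`.

```python
from fractions import Fraction

def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals (exact)."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs at least one value, all positive")
    reciprocal_sum = sum(Fraction(1) / Fraction(v) for v in values)
    return len(values) / reciprocal_sum

speed = harmonic_mean([30, 55])        # Example 1: equal 100-mile legs
hm = harmonic_mean([3, 4, 5, 6])       # Example 2

print(speed, round(float(speed), 1))
print(hm, round(float(hm), 2))

with_outlier = [2, 3, 5, 7, 60]
print(float(harmonic_mean(with_outlier)))      # stays near the small values
print(sum(with_outlier) / len(with_outlier))   # arithmetic mean, pulled up by 60
```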
Weighted Mean
• The weighted mean is a type of mean that is calculated by multiplying the weight (or probability) associated with a particular event or outcome by its associated quantitative outcome and then summing all the products together. It is very useful when calculating a theoretically expected outcome where each outcome has a different probability of occurring, which is the key feature that distinguishes the weighted mean from the arithmetic mean.
Uses of Weighted Mean
• Weighted means are useful in a wide variety of scenarios. For example, a student may use a weighted mean to calculate his or her percentage grade in a course. In such an example, the student would multiply the weighting of each assessment item in the course (e.g., assignments, exams, projects, etc.) by the respective grade that was obtained in each of the categories. Consider the following student with the following grades:
To form a weighted mean of n numbers, we first multiply each number by the weight assigned to that number, then add up all the weighted numbers, and then divide by the sum of the weights.
Formula:

\[ \bar{X} = \frac{\sum_{i=1}^{n} x_i w_i}{\sum_{i=1}^{n} w_i} \]
Example 1
• A business has 10 workers: 3 workers are paid $50, 4 workers are paid $60, and 3 workers are paid $80. Find the weighted mean.
SOLUTION:
X = (3(50) + 4(60) + 3(80)) / (3 + 4 + 3)
= (150 + 240 + 240) / 10
= 630/10
= $63.
Example 2
• A politician has 20 staff: 5 staff are paid $200, 6 staff are paid $100, and 9 staff are paid $300. Find the weighted mean.
SOLUTION:
X = (5(200) + 6(100) + 9(300)) / (5 + 6 + 9)
= 4300/20
= $215
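
The two weighted-mean examples can be reproduced with a short Python sketch. The function name is illustrative; the pay amounts are the values and the head counts are the weights.

```python
def weighted_mean(values, weights):
    """Sum of value x weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

workers = weighted_mean([50, 60, 80], [3, 4, 3])      # Example 1
staff = weighted_mean([200, 100, 300], [5, 6, 9])     # Example 2

print(workers, staff)
```

With all weights equal, the function reduces to the ordinary arithmetic mean.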