Statistics For Economics Module Teaching
Prepared
By:
Abdella Mohammed Ahmed (M.Sc.)
JULY, 2024
CHIRO, ETHIOPIA
ODA BULTUM UNIVERSITY, DEPARTMENT OF ECONOMICS
STATISTICS FOR ECONOMICS
Introduction
Statistics is the science that deals with methods of collecting, organizing and analyzing
data and interpreting the results, which often leads to the drawing of conclusions. It
comprises not only scientific methods for collecting, organizing, summarizing, presenting and
analyzing data but also methods for drawing valid conclusions and making reasonable decisions
on the basis of such analysis. Statistics is the art of learning from data.
In a narrower sense, the term statistics is used to denote the data themselves or numerical
summaries derived from the data, such as the mean, median, mode, range, etc. Thus, we speak of
employment statistics, accident statistics, etc.
Statistics is also used to mean either statistical data or statistical method. When it is used in
the sense of statistical data, it refers to quantitative aspects of things, and is a numerical
description. The other aspect of statistics is as a body of theories and techniques employed
in analyzing the numerical information and using it to make wise decisions.
The science of statistics is very essential for research and decision making processes in all
aspects of human life. The primary purpose of statistics is to provide information for the
decision-making process. That is why statistics is called a partner in decision making.
CHAPTER ONE
INTRODUCTION
This chapter introduces basic concepts of statistics, such as the definition of statistics and
the two broad divisions of statistics: descriptive and inferential statistics. It also deals
with the importance of statistics in general and its application in economics in particular.
Basic statistical terms or concepts are also explained here so that students will be familiar
with these concepts and be able to understand the subsequent chapters without difficulty.
Objectives,
After this chapter, students will be able to
Define statistics
Identify the two divisions of statistics
Understand the meaning of some basic statistical terms or concepts
List application of statistics in the field of economics
Identify reasons why we use samples instead of complete enumeration of the entire
population-census
Sampling error: the difference between a population value (parameter) and the
corresponding sample value (statistic). Sampling error occurs when the sample
does not perfectly represent the population from which it was selected.
Sample size: the number of observations or measurements collected in the sample.
Sampling frame: the list of population elements from which the sample will be drawn.
Let us illustrate the above basic concepts using the following data so that they are easy to
understand.
Suppose there are only 10 students in group 1 (Economics department). We want to know
the academic performance of the group. The variable that we are interested in is
grade point average (GPA). We collected the following data from the whole population (10
students), that is, by census.
Students(N) 1 2 3 4 5 6 7 8 9 10
GPA(X) 2.0 2.7 3.2 1.0 2.5 3.5 1.5 3.0 2.5 2.8
Population mean: $\mu = \frac{\sum X}{N} = \frac{24.7}{10} = 2.47$ (parameter)
Because of a number of reasons we may not be able to collect data from the whole
population; in that case we collect information from a sample using appropriate sampling
techniques.
Let us take three samples: sample1, sample2 and sample3, with sample size four (n = 4)
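The idea of sampling error can be sketched in a few lines of Python. The GPA values below are those of the table above; the three samples of size four are chosen arbitrarily here for illustration only and are not necessarily the samples used in the module.

```python
from statistics import mean

# GPA of the 10 students (the whole population), taken from the table above.
gpa = {1: 2.0, 2: 2.7, 3: 3.2, 4: 1.0, 5: 2.5,
       6: 3.5, 7: 1.5, 8: 3.0, 9: 2.5, 10: 2.8}

population_mean = mean(gpa.values())  # the parameter

# Hypothetical samples of size n = 4 (student numbers picked only for illustration).
samples = {
    "sample1": [1, 2, 3, 4],
    "sample2": [5, 6, 7, 8],
    "sample3": [2, 6, 9, 10],
}

sampling_errors = {}
for name, students in samples.items():
    sample_mean = mean(gpa[s] for s in students)   # the statistic
    sampling_errors[name] = sample_mean - population_mean
    print(f"{name}: mean = {sample_mean:.3f}, "
          f"sampling error = {sampling_errors[name]:+.3f}")
```

Each sample gives a different mean, so the sampling error changes from sample to sample even though the population mean stays fixed.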
Statistics can be used to present facts in a definite form, simplify a complex mass of data,
classify numerical facts and furnish a technique of comparison. A population is treated
as the universe and a sample is a fraction or segment of that universe. A parameter is a
descriptive measure of a population whereas a statistic is a descriptive measure of a sample.
Usually there is a difference between a parameter and the corresponding statistic, and this
difference is called sampling error.
Even if we can take a census, there are reasons to take a sample instead. Some of the reasons
are: time constraints, cost, improved accuracy, and the impossibility of a census.
Exercise
1. Define statistics briefly and explain how useful is statistics for your field of study
3. Explain the reasons why in most cases we prefer sampling to census or complete
enumeration of the entire population
5. Explain the two main classification of statistics and give examples for each
8. Suppose you determined the grade point average (GPA) of some members of the
group. What would this represent: a census or a sample?
Chapter two
Probability
Introduction
We live in a world in which we are unable to forecast the future with complete certainty.
Our need to cope with uncertainty leads us to the study and use of probability theory.
Probability concepts, rules and principles can be applied in many disciplines like statistics,
economics, management, etc. Probability and statistics are related in an important way:
probability is used as a tool in statistics, and economics uses statistics as a tool.
In this chapter, we will see definition of basic probability concepts like experiments,
outcomes, events, and sample space. We will also see methods of assigning probability like
classical, relative frequency and subjective approach. How to find probability of events
using the general addition rule and multiplication rule is also treated in this chapter.
Examples are given in each sub units and finally exercises are given at the end of the
chapter for your better understanding of the chapter.
Objectives;
After this chapter the students will be able to:
Define probability and other related concepts
Assign probability to different outcomes
Find probability of events in an experiment
2. Elementary Probability
Probability is a part of our everyday lives. In personal and managerial decisions we face
uncertainty and use probability theory. We live in a world in which we are unable to
forecast the future with complete certainty. Our need to cope with uncertainty leads us to
the study and use of probability theory.
Probability as a general concept can be defined as the chance of an event occurring and its
value lies between 0 and 1.
Probability concepts, rules and principles can be applied in many disciplines like statistics,
economics, management, etc.
2.2 Definition of Basic Concepts (Experiment, Outcome, Events and Sample Space)
The first technical term we will study is experiment. We will see that most of the other
terms in probability follow quite naturally from this term.
Experiment: any activity that yields a result or an outcome is called an experiment.
Normally, there are a variety of possible outcomes of an experiment and the one that occurs
when the experiment is performed is a matter of chance.
Example: consider the experiment of tossing a coin. There are two possible outcomes: head
(H) and tail (T). The process of tossing a fair coin represents an experiment.
Sample space: the collection (or set) of all possible outcomes of an experiment, denoted by S.
Event: a collection (or set) of some of the possible outcomes from the sample space. In other
words, an event is a subset of the sample space. We say that the event occurs if, when we
perform the experiment, the outcome is one of the outcomes in the event. In the coin-tossing
experiment, the occurrence of a head on the upper face of the coin represents an event.
Examples 2.1.
1. If we toss a coin once, the possible outcomes are Head (H) or Tail (T) and
the sample space (S) is S = {H, T}
3. If we toss a die, there are six possible outcomes, therefore, the sample space
is S= {1, 2, 3, 4, 5, 6}
4. Consider the experiment of flipping two coins, the sample space is
S= {HH, HT, TH, TT}
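The sample spaces in these examples can also be listed mechanically. The following is a minimal Python sketch; the variable names are chosen here for illustration, and the event "at least one head" is an extra example that does not appear in the text.

```python
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

one_coin = set(coin)                                        # S = {H, T}
one_die = set(die)                                          # S = {1, ..., 6}
two_coins = {"".join(p) for p in product(coin, repeat=2)}   # S = {HH, HT, TH, TT}

# An event is a subset of the sample space, e.g. "at least one head".
at_least_one_head = {outcome for outcome in two_coins if "H" in outcome}

print(sorted(two_coins))
print(sorted(at_least_one_head))
```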
If A is the event that the number appearing on the top is even, then $A = \{2, 4, 6\}$, so
$n(A) = 3$ and
$$P(A) = \frac{n(A)}{n(S)} = \frac{3}{6} = \frac{1}{2}$$
2. Relative Frequency or Empirical Probability approach
The difference between classical and empirical probability is that classical probability
assumes that certain outcomes are equally likely, while empirical probability relies on
actual experiments to determine the likelihood of outcomes. In other words, under this
approach the probability of an event can be estimated only through repeated experiments. In
such experiments, the relative frequency of the event, $n/N$, can serve as its probability.
Suppose, for example, that a researcher asked 25 people if they liked the taste of a new soft
drink. The responses were classified as "Yes", "No" or "Undecided". The results were
categorized in a frequency distribution as shown.
Response frequency
Yes 15
No 8
Undecided 2
Total 25
Probability now can be computed for various categories using the relative frequency
approach, i.e., the probability of selecting a person who liked the taste is 15/25 or 3/5
Given a frequency distribution, the probability of an event being in a given class is
$$P(E) = \frac{n}{N}$$
where n is the frequency for the class (f) and N is the total frequency of the distribution.
Response frequency Relative frequency (probability)
Yes 15 0.6
No 8 0.32
Undecided 2 0.08
Total 25 1.00
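The relative frequency column can be reproduced with a short Python sketch. The counts come from the survey table above; the dictionary layout is just one convenient way to hold them.

```python
from fractions import Fraction

# Observed frequencies from the soft drink survey above.
responses = {"Yes": 15, "No": 8, "Undecided": 2}
total = sum(responses.values())          # N = 25

# P(E) = n / N for each class.
relative_frequency = {k: Fraction(n, total) for k, n in responses.items()}

for response, p in relative_frequency.items():
    print(f"{response:<10} {float(p):.2f}")
```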
3. Subjective Probability
Subjective probability is the probability of the occurrence of an event determined on the
basis of an educated guess or estimate. This guess is based on the person's professional and
life experience and evaluation of the situation.
For example, a physician might say that because of his diagnosis, there is a 30% chance
that the patient will need an operation.
A probability function is a real-valued function defined on the class of all subsets of the
sample space S; the value that is associated with a subset A is denoted by P(A). The
assignment of probability to an event must satisfy the following three rules.
1. $P(S) = 1$
2. $P(A) \ge 0$ for every event A, where A is a subset of the sample space S
3. $P(A \cup B) = P(A) + P(B)$ if $A \cap B = \emptyset$
This means that the probability of the whole sample space is 1, the probability of an event
must be non-negative, and the probability of the union of two mutually exclusive events is
the sum of their probabilities.
Example 2: A new car buyer has a choice of five body styles, two engines styles, and eight
different colors. How many different car choices does the buyer have?
Solution: there are $5 \times 2 \times 8 = 80$ different choices among the cars that could be ordered.
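a) In a row of 7 chairs
Solution: $7! = 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1 = 5040$ ways
b) Around a circular table
Solution: $(n-1)! = (7-1)! = 6! = 6 \times 5 \times 4 \times 3 \times 2 \times 1 = 720$ ways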
Example 2: How many distinct permutations can be formed from all the letters of each
word? a) them b) unusual c) sociological
Solution: a) The word "them" has four different letters; therefore, the number of
permutations is
$4! = 4 \times 3 \times 2 \times 1 = 24$ ways
b) The word "unusual" has 7 letters, of which 3 are u's; therefore the number of
permutations is
$$\frac{7!}{3!} = 840 \text{ ways}$$
c) The word "sociological" has 12 letters, with repeated letters o = 3, l = 2, i = 2
and c = 2; therefore the number of permutations is
$$\frac{12!}{3!\,2!\,2!\,2!} = 9{,}979{,}200 \text{ ways}$$
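These counts can be checked with a small Python sketch. The helper name `distinct_permutations` is introduced here for illustration; it simply divides n! by the factorial of each letter's repeat count, which is the rule used in the solutions above.

```python
from collections import Counter
from math import factorial

def distinct_permutations(word: str) -> int:
    """Number of distinct arrangements of all the letters of `word`."""
    count = factorial(len(word))
    for repeats in Counter(word).values():
        count //= factorial(repeats)   # divide out identical letters
    return count

for word in ("them", "unusual", "sociological"):
    print(word, distinct_permutations(word))
```

The same helper works for any multiset of symbols, for example a string such as "RRRYYYYBB" standing for 3 red, 4 yellow and 2 blue bulbs.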
Example 3: Two lottery tickets are drawn from 20 tickets for the first and second prizes.
Find the number of sample points in the space
Example 4: How many different ways can 3 red, 4 yellow and 2 blue bulbs be arranged in a
string of Christmas tree lights with 9 sockets?
Solution: the total number of distinct arrangements is
$$\frac{9!}{3!\,4!\,2!} = 1260 \text{ arrangements}$$
In many problems, we are interested in the number of ways of selecting r objects from n
without regard to order. These selections are called combinations.
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$$
Example 1: The probability that Abebe passes microeconomics is $\frac{2}{3}$, and the
probability that he passes statistics is $\frac{4}{9}$. If the probability of passing both
courses is $\frac{1}{4}$, what is the probability that Abebe will pass at least one of these
courses?
Solution:
If M is the event "passing microeconomics" and S the event "passing statistics", then by
the additive rule we have
$$P(M \cup S) = P(M) + P(S) - P(M \cap S) = \frac{2}{3} + \frac{4}{9} - \frac{1}{4} = \frac{31}{36}$$
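The arithmetic of the additive rule can be verified exactly with Python's `fractions` module. This is only a check of the worked example; the variable names are chosen here for illustration.

```python
from fractions import Fraction

p_m = Fraction(2, 3)      # P(M): passes the economics course
p_s = Fraction(4, 9)      # P(S): passes statistics
p_both = Fraction(1, 4)   # P(M and S): passes both

# General addition rule: P(M or S) = P(M) + P(S) - P(M and S)
p_at_least_one = p_m + p_s - p_both
print(p_at_least_one)
```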
If the events are mutually exclusive, the addition rule simplifies to
$P(A \cup B) = P(A) + P(B)$
$P(A \cup B \cup C) = P(A) + P(B) + P(C)$
This is because if the events are mutually exclusive, $P(A \cap B) = 0$. Mutually exclusive
events are events such that when one of them occurs, none of the others can occur at the
same time.
Example 1: Find the probability of getting 6, 4 or 2 on one roll of a die.
Solution:
Since the events are mutually exclusive,
$P(A \cup B \cup C) = P(A) + P(B) + P(C)$
where A is the event that the number 6 appears on the upper face of the die,
B is the event that the number 4 appears on the upper face of the die, and
C is the event that the number 2 appears on the upper face of the die.
$$P(A) = \frac{n(A)}{n(S)} = \frac{1}{6}, \quad P(B) = \frac{n(B)}{n(S)} = \frac{1}{6}, \quad P(C) = \frac{n(C)}{n(S)} = \frac{1}{6}$$
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{3}{6} = \frac{1}{2}$$
The conditional probability of A, given that B has occurred, is
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{if } P(B) \ne 0$$
The conditional probability of B, given that A has occurred, is
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} \quad \text{if } P(A) \ne 0$$
Now let us redefine independence of events in terms of conditional probability. Two events A
and B are said to be independent if and only if either
$P(A \mid B) = P(A)$ or $P(B \mid A) = P(B)$
Similarly, if A, B and C are mutually independent events, then the probability that A, B and
C all occur is
$P(A \cap B \cap C) = P(A)\,P(B)\,P(C)$
Example: The following table classifies all employees of a company by sex and by whether
they are college graduates.
               College graduate (G)   Not a college graduate (N)   Total
Males (M)               7                        20                  27
Females (F)             4                         9                  13
Total                  11                        29                  40
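$$P(F \text{ and } G) = P(F \cap G) = P(F)\,P(G \mid F) = \frac{13}{40} \times \frac{4}{13} = \frac{4}{40} = 0.10$$
In the same manner, we can compute the three other joint probabilities for the table as follows.
$$P(M \text{ and } G) = P(M \cap G) = P(M)\,P(G \mid M) = \frac{27}{40} \times \frac{7}{27} = \frac{7}{40} = 0.175$$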
$$P(M \text{ and } N) = P(M \cap N) = P(M)\,P(N \mid M) = \frac{27}{40} \times \frac{20}{27} = 0.500$$
$$P(F \text{ and } N) = P(F \cap N) = P(F)\,P(N \mid F) = \frac{13}{40} \times \frac{9}{13} = 0.225$$
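The four joint probabilities can be recomputed from the employee table with the multiplication rule. The sketch below is one possible way to organise the calculation in Python; the function names `p_sex` and `p_edu_given_sex` are hypothetical helpers introduced for this illustration.

```python
from fractions import Fraction

# Cell counts from the table: (sex, education) -> number of employees.
counts = {("M", "G"): 7, ("M", "N"): 20, ("F", "G"): 4, ("F", "N"): 9}
total = sum(counts.values())  # 40 employees

def p_sex(sex):
    """Marginal probability of a sex, e.g. P(M) = 27/40."""
    return Fraction(sum(c for (s, _), c in counts.items() if s == sex), total)

def p_edu_given_sex(edu, sex):
    """Conditional probability, e.g. P(G | M) = 7/27."""
    row_total = sum(c for (s, _), c in counts.items() if s == sex)
    return Fraction(counts[(sex, edu)], row_total)

# Multiplication rule: P(sex and edu) = P(sex) * P(edu | sex)
joint = {(s, e): p_sex(s) * p_edu_given_sex(e, s) for (s, e) in counts}

for (s, e), p in joint.items():
    print(f"P({s} and {e}) = {p} = {float(p):.3f}")
```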
Complement Rule
The complement of an event A is the event that A does not occur, that is, the event
consisting of all outcomes in the sample space that are not in A. It is denoted by A'. Since
the two events are mutually exclusive and together make up the whole sample space, the sum
of the probability of the event A and the probability of its complement is 1.
P(A) + P(A') = 1
P(A') = 1 - P(A)
Example 2.13: If the probabilities that an automobile mechanic will service 3, 4, 5, 6, 7, or
8 or more cars on any given workday are respectively 0.12, 0.19, 0.28, 0.24, 0.10 and 0.07,
what is the probability that he will service at least 5 cars on his next day of work?
Solution: let E be the event that at least 5 cars are serviced. Now P(E) = 1 - P(E'), where E'
is the event that fewer than 5 cars are serviced.
Since P(E') = 0.12 + 0.19 = 0.31, it follows that
P(E) = 1 - 0.31 = 0.69
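As a quick check, the complement rule and the direct sum should give the same answer. The sketch below assumes the key 8 stands for "8 or more cars"; that encoding is a convenience chosen here.

```python
# Probabilities from the example; the key 8 stands for "8 or more cars".
p = {3: 0.12, 4: 0.19, 5: 0.28, 6: 0.24, 7: 0.10, 8: 0.07}

p_fewer_than_5 = p[3] + p[4]              # P(E')
p_at_least_5 = 1 - p_fewer_than_5         # complement rule: P(E) = 1 - P(E')

# Direct computation for comparison: add the probabilities of 5, 6, 7, 8+.
p_direct = sum(prob for cars, prob in p.items() if cars >= 5)

print(round(p_at_least_5, 2), round(p_direct, 2))
```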
Summary
Probability is a part of our everyday lives. In personal and managerial decisions we face
uncertainty and use probability theory. We live in a world in which we are unable to
forecast the future with complete certainty. Our need to cope with uncertainty leads us to
the study and use of probability theory.
Probability as a general concept can be defined as the chance of an event occurring, and its
value lies between 0 and 1.
A permutation is an ordered arrangement of a group of objects, whereas a combination is a
selection of objects or persons from a larger group without regard to order. We can assign
probabilities to outcomes using the classical, empirical and subjective probability
approaches.
1. A news paper editor is going to assign two reporters to cover a political convention.
The assignment will be made from a pool of six women and four men. How many
groups of two reporters can be assigned if
Chapter three
3. Random Variables and Probability Distributions
3.1. Introduction
The concept of a probability space that completely describes the outcome of a random
experiment has been developed in chapter II. In this chapter, we develop the idea of a
function defined on the outcome of a random experiment, which is a very high- level
definition of a random variable. Thus, the value of a random variable is a random
phenomenon and is a numerically valued random phenomenon.
A random variable could be discrete or continuous. In this chapter, we will see probability
distribution of discrete random variable and probability density function of continuous
random variable. Finally, we will try to see the cumulative probability distribution of
discrete and continuous variable and the expected values, variance and standard deviations
of discrete and continuous random variable.
Objectives;
After the end of this chapter, the students will be able to:
Define random variable
Determine the probability distribution and cumulative probability distribution
function
Find the expected value, the variance and the standard deviation of a probability
distribution.
Generally, a single letter X instead of the function X (w) represents a random variable.
Therefore, in the remainder of the module we use X to denote a random variable. The
sample space S is called the domain of the random variable X. In addition, the collection of
all numbers that are values of X is called the range of the random variable X.
Example 3.1. Suppose that a coin is tossed twice so that the sample space is
$S = \{TT, TH, HT, HH\}$. Let X represent the number of heads that can come up. With
each sample point, we can associate a number for X as shown in the table below. Thus, in the
case of TT (i.e. 0 heads), X = 0, while for TH (1 head) X = 1, and for HH (i.e. 2 heads)
X = 2. It follows that X is a random variable.
Table 3.1
Sample Point TT TH HT HH
X 0 1 1 2
It should be noted that many other random variables could also be defined on this sample
space, for example the square of the number of heads, the number of heads minus the
number of tails, etc.
A random variable could be discrete or continuous. A random variable is discrete if the
values it can assume form a countable set; that is, this set has either a finite
number of elements or its elements are countably infinite in that they can be put into a
one-to-one correspondence with the positive integers. For instance, if the random variable X
represents the number of points obtained in the roll of a single six-sided die, then X takes
on a finite set of possible outcomes: 1, 2, 3, 4, 5, 6. But if the random variable X is defined
according to the rule "roll a single six-sided die repeatedly until a 4 appears for the first
time", then this could happen on the first roll (X = 1), on the second roll (X = 2), on the
third roll (X = 3), and so on. Clearly, there are infinitely many possibilities and thus X
assumes the countably infinite set of values 1, 2, 3, ....
$$P(X = x_k) = f(x_k), \quad k = 1, 2, 3, \ldots$$
It is convenient to introduce the probability function, also referred to as the probability
distribution, given by
$$P(X = x) = f(x)$$
In general, f(x) is a probability function if
1. $f(x) \ge 0$
2. $\sum_x f(x) = 1$
where the sum in (2) is taken over all possible values of x. A graph of f(x) is
called a probability graph.
Then
$P(X = 0) = P(TT) = 1/4$
$P(X = 1) = P(HT \cup TH) = P(HT) + P(TH) = 1/4 + 1/4 = 1/2$
$P(X = 2) = P(HH) = 1/4$
x       0     1     2
f(x)   1/4   1/2   1/4
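The probability function of the number of heads in two tosses of a fair coin can be built directly from the sample space. The following Python sketch assumes the four sample points are equally likely, as in the text; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space of two coin tosses: HH, HT, TH, TT (equally likely).
sample_space = ["".join(p) for p in product("HT", repeat=2)]

# The random variable X maps each sample point to its number of heads.
X = {point: point.count("H") for point in sample_space}

# f(x) = P(X = x): share of sample points mapped to x.
counts = Counter(X.values())
f = {x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())}

print(f)
```

Both conditions for a probability function hold here: every f(x) is non-negative and the values add up to 1.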
A probability graph can be obtained by use of a bar chart, as indicated in figure 3.1, or a
histogram, as indicated in figure 3.2. In the bar chart, the sum of the ordinates is 1, while
in the histogram the sum of the rectangular areas is 1. In the case of the histogram, we can
think of the random variable X as being made continuous, e.g. X = 1 means that it lies
between 0.5 and 1.5.
Figure 3.1 Bar chart of probability distribution of example 3.2 (f(x) against x = 0, 1, 2)
Figure 3.2 Histogram of probability distribution of example 3.2
$$P(X \le x) = F(x)$$
where x is any real number, i.e. $-\infty < x < \infty$. The distribution function can be
obtained from the probability function by noting that
$$F(x) = P(X \le x) = \sum_{u \le x} f(u)$$
where the sum on the right is taken over all values of u for which $u \le x$. Conversely, the
probability function can be obtained from the distribution function.
If X takes on only a finite number of values $x_1, x_2, \ldots, x_n$, then the distribution
function is given by
$$F(x) = \begin{cases}
0 & -\infty < x < x_1 \\
f(x_1) & x_1 \le x < x_2 \\
f(x_1) + f(x_2) & x_2 \le x < x_3 \\
\vdots & \vdots \\
f(x_1) + \cdots + f(x_n) & x_n \le x < \infty
\end{cases}$$
Example 3.3. Let a random experiment involve the rolling of a pair of fair six-sided dice
and let the random variable X be defined as the sum of the faces showing. See the
following table (table 3.3). The sample space S (the domain of X) and the range of X are
depicted in figure 3.4. The probabilities associated with the values of X and the cumulative
probability function F(X) appear in table 3.4.
Table 3.3: the sum of upturned faces in rolling a pair of fair dice
                      The second die
The first die     1     2     3     4     5     6
      1           2     3     4     5     6     7
      2           3     4     5     6     7     8
      3           4     5     6     7     8     9
      4           5     6     7     8     9    10
      5           6     7     8     9    10    11
      6           7     8     9    10    11    12
Table 3.4: the probability distribution and cumulative probability of example 3.3
X             f(X)      F(X)
X1 = 2        1/36      1/36
X2 = 3        2/36      3/36
X3 = 4        3/36      6/36
X4 = 5        4/36      10/36
X5 = 6        5/36      15/36
X6 = 7        6/36      21/36
X7 = 8        5/36      26/36
X8 = 9        4/36      30/36
X9 = 10       3/36      33/36
X10 = 11      2/36      35/36
X11 = 12      1/36      1
$\sum_{i=1}^{11} f(X_i) = 1$
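Both columns of table 3.4 can be generated by enumerating the 36 equally likely outcomes of the two dice. The sketch below is one way to do this in Python; note that `Fraction` prints values in lowest terms (for example 2/36 appears as 1/18).

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product

# Count how many of the 36 outcomes give each sum.
sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))
xs = sorted(sums)                                   # 2, 3, ..., 12

f = {x: Fraction(sums[x], 36) for x in xs}          # probability function f(X)
F = dict(zip(xs, accumulate(f[x] for x in xs)))     # cumulative function F(X)

for x in xs:
    print(x, f[x], F[x])
```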
Figure 3.3: probability distribution graph of example 3.3
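Example 3.4: a) Find the distribution function for the random variable X of example 3.2.
b) Obtain its graph.
Solution:
a) The distribution function is
$$F(x) = \begin{cases}
0 & -\infty < x < 0 \\
\frac{1}{4} & 0 \le x < 1 \\
\frac{3}{4} & 1 \le x < 2 \\
1 & 2 \le x < \infty
\end{cases}$$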
Figure 3.4: graph of the distribution function F(x) of example 3.4
The following things about the above distribution function, which are true in general,
should be noted.
1. The magnitudes of the jumps at 0, 1, 2 in fig 3.4 are 1/4, 1/2, 1/4, which are precisely
the ordinates of the probability function. This fact enables one to obtain the
probability function from the distribution function.
2. Because of the appearance of the graph in fig 3.4, it is often called a staircase
function or step function. The value of the function at an integer is obtained
from the higher step; thus the value at 1 is 3/4 and not 1/4. This is expressed
mathematically by stating that the distribution function is continuous from
the right at 0, 1, 2.
3. As we proceed from left to right (i.e. going upstairs) the distribution
function either remains the same or increases, taking on values from 0 to 1.
Because of this it is said to be a monotonically increasing function.
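The three properties can be seen by coding the distribution function of the number of heads in two coin tosses. This is a minimal Python sketch; the function name `F` mirrors the notation of the text, and the probability values are those of the two-coin example.

```python
from fractions import Fraction

# Probability function of the number of heads in two tosses of a fair coin.
f = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def F(x):
    """Distribution function: P(X <= x), the sum of f(u) over all u <= x."""
    return sum((p for u, p in f.items() if u <= x), Fraction(0))

# Right-continuity: the value at an integer is taken from the higher step.
print(F(0.99), F(1))            # 1/4, then 3/4

# The jump at each point recovers the probability function.
jump_at_1 = F(1) - F(0.99)
print(jump_at_1)                # 1/2
```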
If X is a continuous random variable the probability that X takes on any one particular value
is generally zero. Therefore, we cannot define a probability function in the same way as for
a discrete random variable. In order to arrive at a particular distribution for a continuous
random variable we note that the probability that X lies between two different values is
meaningful.
In general, for a continuous random variable X
1. $f(x) \ge 0$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
where the second is a mathematical statement of the fact that a real-valued random variable
must certainly lie between $-\infty$ and $\infty$. We then define the probability that X lies
between a and b by
$$P(a < X < b) = \int_a^b f(x)\,dx$$
A function f(x) which satisfies the above requirements is called a probability function or
probability distribution for a continuous random variable, but it is more often called a
probability density function or simply density function. Any function f(x) satisfying
properties 1 and 2 above will automatically be a density function, and required probabilities
can then be obtained from $P(a < X < b) = \int_a^b f(x)\,dx$.
Example 3.5: (a) Find the constant c such that the function
$$f(x) = \begin{cases} cx^2 & 0 < x < 3 \\ 0 & \text{otherwise} \end{cases}$$
is a density function, and
(b) compute $P(1 < X < 2)$.
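Solution:
(a) Since f(x) satisfies property 1 if $c \ge 0$, it must satisfy property 2 in order to be a
density function. Now
$$\int_{-\infty}^{\infty} f(x)\,dx = 1 \;\Rightarrow\; \int_0^3 cx^2\,dx = 1 \;\Rightarrow\; 9c = 1 \;\Rightarrow\; c = \frac{1}{9}$$
(b) $$P(1 < X < 2) = \int_1^2 \frac{1}{9}x^2\,dx = \frac{7}{27}$$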
In case f(x) is continuous, which we shall assume unless otherwise stated, the probability
that X is equal to any particular value is zero. In such a case we can replace either or both
of the signs < by ≤; thus
$$P(1 \le X \le 2) = P(1 \le X < 2) = P(1 < X \le 2) = P(1 < X < 2) = \frac{7}{27}$$
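The results c = 1/9 and P(1 < X < 2) = 7/27 can be checked numerically. The sketch below uses a simple midpoint rule written for this illustration (the helper `integrate` is not a library function), so the values are approximations rather than exact integrals.

```python
def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

c = 1 / 9

def f(x):
    """Density of example 3.5: c * x^2 on (0, 3), zero elsewhere."""
    return c * x**2 if 0 < x < 3 else 0.0

total_area = integrate(f, 0, 3)    # should be close to 1
p_1_2 = integrate(f, 1, 2)         # should be close to 7/27

print(round(total_area, 6), round(p_1_2, 6))
```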
Find (a) the density function, (b) the probability that X>2, and (c) the probability that -3 <
X < 4.
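Solution:
(a) $f(x) = \frac{d}{dx}F(x) = 2e^{-2x}$ for $x > 0$
(b) $$P(X > 2) = \int_2^{\infty} 2e^{-2u}\,du = \left. -e^{-2u} \right|_2^{\infty} = e^{-4}$$
Another method
By definition, $P(X \le 2) = F(2) = 1 - e^{-4}$. Hence,
$P(X > 2) = 1 - (1 - e^{-4}) = e^{-4}$
(c) $$P(-3 < X < 4) = \int_{-3}^{4} f(u)\,du = \int_{-3}^{0} 0\,du + \int_0^4 2e^{-2u}\,du = \left. -e^{-2u} \right|_0^4 = 1 - e^{-8}$$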
Another method
$P(-3 < X < 4) = P(X \le 4) - P(X \le -3)$
$= F(4) - F(-3)$
$= (1 - e^{-8}) - (0) = 1 - e^{-8}$
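Both answers can be confirmed numerically. The sketch below assumes the distribution function is F(x) = 1 - e^(-2x) for x >= 0 and 0 otherwise, which is inferred from the values F(2), F(4) and F(-3) used in the solution rather than quoted from the problem statement.

```python
from math import exp

def F(x):
    """Assumed distribution function: 1 - e^(-2x) for x >= 0, else 0."""
    return 1 - exp(-2 * x) if x >= 0 else 0.0

p_greater_than_2 = 1 - F(2)        # P(X > 2)
p_between = F(4) - F(-3)           # P(-3 < X < 4)

print(p_greater_than_2, exp(-4))
print(p_between, 1 - exp(-8))
```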
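Solution:
$$\int_{-\infty}^{\infty} f(x)\,dx = 1 \;\Rightarrow\; \int_{-\infty}^{0} 0\,dx + \int_0^2 A(2x - x^2)\,dx + \int_2^{\infty} 0\,dx = 1 \;\Rightarrow\; \int_0^2 A(2x - x^2)\,dx = 1$$
Thus, we obtain
$$A\left[x^2 - \frac{x^3}{3}\right]_0^2 = 1 \;\Rightarrow\; A\left(4 - \frac{8}{3}\right) = \frac{4A}{3} = 1 \;\Rightarrow\; A = \frac{3}{4}$$
b) $$P(X > 1) = \int_1^{\infty} f(x)\,dx = \frac{3}{4}\int_1^2 (2x - x^2)\,dx = \frac{3}{4}\left[x^2 - \frac{x^3}{3}\right]_1^2 = \frac{3}{4}\left(\frac{4}{3} - \frac{2}{3}\right) = \frac{1}{2}$$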
3.5. The Expected Values of Random Variable, Moment, Skewness and Kurtosis
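$$E[(x - \bar{x})^n] = \begin{cases}
\sum_x (x - \bar{x})^n f(x) & x \text{ discrete} \\
\int_{-\infty}^{\infty} (x - \bar{x})^n f(x)\,dx & x \text{ continuous}
\end{cases}$$
The central moment for the case of n = 2 is very important and carries a special name, the
variance:
$$\sigma_x^2 = E[(x - \bar{x})^2] = \begin{cases}
\sum_x (x - \bar{x})^2 f(x) & x \text{ discrete} \\
\int_{-\infty}^{\infty} (x - \bar{x})^2 f(x)\,dx & x \text{ continuous}
\end{cases}$$
The positive square root of the variance is called the standard deviation and is given by
$$\sigma_x = \sqrt{\operatorname{var}(x)} = \sqrt{E[(x - \bar{x})^2]}$$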
The variance (or standard deviation) is a measure of the dispersion or scatter of the values
of the random variable about the sample mean ($\bar{x}$) or population mean ($\mu$).
If the values tend to be concentrated near the mean, the variance (or standard deviation) is
small, while if the values tend to be distributed far from the mean, the variance is large.
Figure 3.5: normal distributions with the same mean ($\mu$) but with different standard
deviations ($\sigma$); one curve is labelled "Small Variance" and the other "Large Variance"
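Example 3.8: Find the expected value, variance and standard deviation of
$$f(x) = \begin{cases} \frac{1}{2}x & 0 < x < 2 \\ 0 & \text{otherwise} \end{cases}$$
Solution:
$$\text{Mean} = E(X) = \int_0^2 x \cdot \frac{1}{2}x\,dx = \frac{4}{3}$$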
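$$\sigma^2 = E\left[\left(x - \frac{4}{3}\right)^2\right] = \int_{-\infty}^{\infty} \left(x - \frac{4}{3}\right)^2 f(x)\,dx = \int_0^2 \left(x - \frac{4}{3}\right)^2 \frac{1}{2}x\,dx = \frac{2}{9}$$
$$\sigma = \sqrt{\frac{2}{9}} = \frac{\sqrt{2}}{3}$$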
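The mean, variance and standard deviation of the density f(x) = x/2 on (0, 2) can be approximated numerically. The sketch below uses a midpoint rule written for this illustration (`integrate` is a hypothetical helper, not a library function).

```python
from math import sqrt

def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def f(x):
    """Density of the example: x/2 on (0, 2), zero elsewhere."""
    return 0.5 * x if 0 < x < 2 else 0.0

mean = integrate(lambda x: x * f(x), 0, 2)                    # E(X)
variance = integrate(lambda x: (x - mean) ** 2 * f(x), 0, 2)  # E[(X - mean)^2]
std_dev = sqrt(variance)

print(round(mean, 4), round(variance, 4), round(std_dev, 4))
```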
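Example 3.9: Consider the random variables X1, X2 and X3 with the following probability
distributions:
$$f(x_1) = \begin{cases} \frac{1}{4} & 0 < x < 4 \\ 0 & \text{otherwise} \end{cases}$$
$$f(x_2) = \begin{cases} \frac{1}{3} & 0.5 < x < 3.5 \\ 0 & \text{otherwise} \end{cases}$$
$$f(x_3) = \begin{cases} \frac{1}{2} & 1 < x < 3 \\ 0 & \text{otherwise} \end{cases}$$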
From direct computation of the mean value of these random variables, we see that
$E(x_1) = E(x_2) = E(x_3) = 2$. However, their spreads about the mean are different; see
figure 3.6.
Figure 3.6: probability distributions with the same expected value or mean ($\mu = 2$) and
with different standard deviations ($\sigma$); the densities have heights 1/4 on (0, 4),
1/3 on (0.5, 3.5) and 1/2 on (1, 3)
In terms of variance, we therefore say that x1 has the largest variance, while x3 has the
smallest variance.
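Example 3.10: Let x be a continuous random variable with
$$f(x) = \begin{cases} \tfrac{1}{4}, & 2 < x < 6 \\ 0, & \text{otherwise} \end{cases}$$
Then $\mu_x = 4$ and the variance is
$$\sigma_x^2 = E[(x - \mu_x)^2] = \int_{-\infty}^{\infty} (x - \mu_x)^2 f(x)\,dx = \int_2^6 \frac{(x - 4)^2}{4}\,dx = \frac{4}{3}$$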
Moments
A moment of a random variable X is defined as the expected value of some particular
function of X. In general, the moments of a probability distribution amount to a collection
of descriptive measures that can be used to characterize the location and shape of the
distribution. Hence, a probability distribution can be completely specified in terms of its
moments. As we shall now see, moments of a random variable typically are defined in
terms of having either zero or the expectation of X as the reference point.
For a discrete random variable X, the rth moment about zero is
$$\mu'_r = E(X^r) = \sum_i x_i^r f(x_i)$$
(note that the first moment about zero is the mean of X, $\mu'_1 = E(X) = \mu$), and the rth moment about the mean, or rth central moment, is
$$\mu_r = E[(X - \mu)^r] = \sum_i (x_i - \mu)^r f(x_i)$$
If X is a continuous random variable with probability density f(x), then, provided the following integrals exist, we may correspondingly define
$$\mu'_r = E(X^r) = \int_{-\infty}^{\infty} x^r f(x)\,dx$$
and
$$\mu_r = E[(X - \mu)^r] = \int_{-\infty}^{\infty} (x - \mu)^r f(x)\,dx$$
The first moment about zero locates the mean or measures the central tendency of a probability distribution, and the second moment about the mean describes its shape in terms of variation or dispersion about the mean. Additional information about the shape of a probability distribution, as characterized by measures of skewness and kurtosis, is provided by the third and fourth central moments of X, respectively. In particular, we shall develop standardized (independent of units and taken relative to $\sigma$) measures of skewness and kurtosis.
Skewness
Skewness is the degree of asymmetry, or departure from symmetry, of a distribution. If a
distribution has a longer tail to the right of the central maximum than to the left, the
distribution is said to be skewed to the right, or to have positive skewness. If the reverse is
true, it is said to be skewed to the left, or to have negative skewness.
Kurtosis
Kurtosis is the degree of peakedness of a distribution, usually taken relative to a normal distribution. A distribution having a relatively high peak is called leptokurtic, while a distribution which is flat-topped is called platykurtic. The normal distribution, which is neither very peaked nor very flat-topped, is called mesokurtic.
If the peak of X's probability distribution mirrors that of a normal distribution, then the standardized fourth moment satisfies $\alpha_4 = 3$. If $\alpha_4 > 3$ (respectively, $\alpha_4 < 3$), then the peak of the probability distribution is sharper (respectively, flatter) than that of a normal distribution.
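The central moments can be written in terms of the moments about zero:
$$\mu_r = E[(X - \mu)^r] = \sum_{j=0}^{r} (-1)^j \binom{r}{j} \mu^j \mu'_{r-j}$$
From the properties of the expectation operator explained earlier, we can readily demonstrate that:
$$\mu_2 = \mu'_2 - \mu^2;$$
$$\mu_3 = \mu'_3 - 3\mu\mu'_2 + 2\mu^3;$$
$$\mu_4 = \mu'_4 - 4\mu\mu'_3 + 6\mu^2\mu'_2 - 3\mu^4.$$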
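If we standardize the random variable X to obtain $Z = (X - \mu)/\sigma$, then, since E(Z) = 0, the rth central moment of Z can be expressed in terms of the rth central moment of X as
$$\mu_r(Z) = E(Z^r) = E\left[\left(\frac{X - \mu}{\sigma}\right)^r\right] = \frac{1}{\sigma^r} E[(X - \mu)^r] = \frac{\mu_r(X)}{\sigma^r} = \frac{\mu_r(X)}{[\mu_2(X)]^{r/2}}$$
Writing $\alpha_3 = \mu_3/\mu_2^{3/2}$ and $\alpha_4 = \mu_4/\mu_2^2$ for the standardized third and fourth moments (the coefficients of skewness and kurtosis), we also have $V(Z) = \mu_2(Z) = 1$, $\alpha_3(Z) = \alpha_3(X)$ and $\alpha_4(Z) = \alpha_4(X)$. Hence standardizing a random variable X affects its mean and variance but not its standardized third and fourth moments.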
Table 4.5
X       f(x)
1       0.2
2       0.3
3       0.4
5       0.1
Total   1.0
Example 4.6.1. Given the discrete probability distribution in Table 4.5, determine and
interpret its standardized third and fourth moments or the coefficients of skewness and
kurtosis. From (4.21):
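$$\mu = E(X) = \sum_i x_i f(x_i) = 1(0.2) + 2(0.3) + 3(0.4) + 5(0.1) = 2.50,$$
$$\mu'_2 = E(X^2) = \sum_i x_i^2 f(x_i) = 1(0.2) + 4(0.3) + 9(0.4) + 25(0.1) = 7.50,$$
$$\mu'_3 = E(X^3) = \sum_i x_i^3 f(x_i) = 1(0.2) + 8(0.3) + 27(0.4) + 125(0.1) = 25.90,$$
$$\mu'_4 = E(X^4) = \sum_i x_i^4 f(x_i) = 1(0.2) + 16(0.3) + 81(0.4) + 625(0.1) = 99.9,$$
$$\mu_2 = \mu'_2 - \mu^2 = 1.25,$$
$$\mu_3 = \mu'_3 - 3\mu\mu'_2 + 2\mu^3 = 0.90,$$
$$\mu_4 = \mu'_4 - 4\mu\mu'_3 + 6\mu^2\mu'_2 - 3\mu^4 = 4.96,$$
$$\alpha_3 = \frac{\mu_3}{\mu_2^{3/2}} \approx 0.64, \qquad \alpha_4 = \frac{\mu_4}{\mu_2^2} \approx 3.18.$$
Since $\alpha_3 > 0$, this discrete probability distribution is slightly skewed to the right. Moreover, with $\alpha_4 > 3$, the distribution has a peak that is slightly sharper than that of a normal distribution.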
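The arithmetic of Example 4.6.1 can be reproduced directly from Table 4.5. The sketch below is illustrative only; the function names `raw_moment` and `central_moment` are assumptions introduced here, and the central moments are computed straight from their definition rather than from the expansion in terms of moments about zero.

```python
# Table 4.5: values of X and their probabilities
values = [1, 2, 3, 5]
probs = [0.2, 0.3, 0.4, 0.1]


def raw_moment(r):
    """rth moment about zero, E(X^r)."""
    return sum(x ** r * p for x, p in zip(values, probs))


def central_moment(r):
    """rth moment about the mean, E[(X - mu)^r]."""
    mu = raw_moment(1)
    return sum((x - mu) ** r * p for x, p in zip(values, probs))


mu = raw_moment(1)
mu2 = central_moment(2)
alpha3 = central_moment(3) / mu2 ** 1.5  # coefficient of skewness
alpha4 = central_moment(4) / mu2 ** 2    # coefficient of kurtosis

print(mu, mu2, central_moment(3), central_moment(4), alpha3, alpha4)
```

Computing the central moments from the definition gives an independent check on the values obtained through the moments about zero.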
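Example 4.6.2. Let the probability density function for a continuous random variable X be
$$f(x) = \begin{cases} 2x, & 0 < x < 1; \\ 0, & \text{elsewhere.} \end{cases}$$
Then:
$$\mu = E(X) = \int_0^1 x f(x)\,dx = \int_0^1 2x^2\,dx = \left.\tfrac{2}{3}x^3\right]_0^1 = 0.667,$$
$$\mu'_2 = E(X^2) = \int_0^1 x^2 f(x)\,dx = \int_0^1 2x^3\,dx = \left.\tfrac{1}{2}x^4\right]_0^1 = 0.500,$$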
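With $\mu'_3 = E(X^3) = \int_0^1 2x^4\,dx = 2/5$ and $\mu'_4 = E(X^4) = \int_0^1 2x^5\,dx = 1/3$, we obtain
$$\mu_2 = V(X) = \mu'_2 - \mu^2 = \tfrac{1}{18} \approx 0.056,$$
$$\mu_3 = \mu'_3 - 3\mu\mu'_2 + 2\mu^3 = -\tfrac{1}{135} \approx -0.007,$$
$$\mu_4 = \mu'_4 - 4\mu\mu'_3 + 6\mu^2\mu'_2 - 3\mu^4 = \tfrac{1}{135} \approx 0.007,$$
$$\alpha_3 = \frac{\mu_3}{\mu_2^{3/2}} \approx -0.566, \qquad \alpha_4 = \frac{\mu_4}{\mu_2^2} = 2.4.$$
With $\alpha_3 < 0$ we see that this continuous probability distribution is moderately skewed to the left, and $\alpha_4 < 3$ indicates that its peak is a bit flatter than that of a normal distribution.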
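For the density f(x) = 2x on (0, 1), the moments about zero have the closed form $E(X^r) = \int_0^1 2x^{r+1}\,dx = 2/(r+2)$, so the example can be checked in exact arithmetic. The following sketch uses the binomial expansion of the central moments in terms of moments about zero; the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def raw_moment(r):
    """E(X^r) for f(x) = 2x on (0, 1): the integral of 2x^(r+1) is 2/(r+2)."""
    return Fraction(2, r + 2)


def central_moment(r):
    """mu_r = sum over j of (-1)^j * C(r, j) * mu^j * mu'_(r-j)."""
    mu = raw_moment(1)
    return sum((-1) ** j * comb(r, j) * mu ** j * raw_moment(r - j)
               for j in range(r + 1))


mu2 = central_moment(2)
mu3 = central_moment(3)
mu4 = central_moment(4)
alpha3 = float(mu3) / float(mu2) ** 1.5  # skewness (irrational, so a float)
alpha4 = mu4 / mu2 ** 2                  # kurtosis, kept as an exact fraction

print(mu2, mu3, mu4, alpha3, alpha4)
```

Working with fractions avoids the rounding of intermediate values, which matters here because the third and fourth central moments are both very small.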
Summary
Generally, a random variable is a variable that assigns a numerical value to each outcome of a random experiment. A random variable is usually represented by a single letter such as X, and the collection of all numbers that are values of X is called the range of the random variable X.
This chapter developed the concept of functions defined on the outcomes of random phenomena. These functions, which are called random variables, can be classified into two types: discrete random variables, which have a set of possible values that is either finite or countably infinite, and continuous random variables, which can assume an uncountable set of possible values.
Associated with both types of random variables is the concept of the cumulative distribution function, which denotes the probability that a random variable X takes on a value that is less than or equal to x.
Similarly, the probability density function is a nonnegative function associated with a continuous random variable such that integrating it between two distinct values of the random variable gives the probability that the random variable takes a value that lies between these two values. Thus, the area under the curve defined by the probability density function between two values is the probability that the random variable lies between them. Because of this, the probability that a continuous random variable takes on a particular value is zero, since the area associated with a point is zero. Understanding the concept of a random variable is key to understanding the rest of the topics.
Exercises
1. Suppose that the error in the reaction temperature, in °C, for a controlled laboratory experiment is a continuous random variable X having the probability density function
$$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2 \\ 0, & \text{otherwise} \end{cases}$$
a. Is this a valid probability density function? Why?
b. Find $P(0 < X \le 1)$.
c. Find the cumulative probability distribution function.
2. The cumulative distribution function of the random variable X is defined by
$$F(x) = \begin{cases} 0, & x < 2 \\ A(x - 2), & 2 \le x < 6 \\ 1, & x \ge 6 \end{cases}$$
x 1 5 7 9
f(x) 1/6 2/6 2/6 1/6
Find
a. The expected value of x E(X)
b. The variance of x V(X)
6. Given the probability density function
$$f(x) = \begin{cases} 2(1 - x), & 0 < x < 1 \\ 0, & \text{elsewhere} \end{cases}$$
Find
a. The expected value of x, E(X)
b. The variance of x, V(X)
7. Verify that if X is a discrete or continuous random variable whose variance exists, then for constants a and b:
a. $V(a) = 0$
b. $V(a + X) = V(X)$
c. $V(a + bX) = b^2 V(X)$
d. $V(X) = E(X^2) - [E(X)]^2$
8. Let the cumulative distribution function for a random variable X be:
$$F(x) = \begin{cases} 0, & x < 0 \\ 2x - x^2, & 0 \le x < 1 \\ 1, & x \ge 1 \end{cases}$$
1
a) P( X )
4
b) $P\left(\tfrac{1}{3} < X < \tfrac{3}{4}\right)$
c) The probability density function
Chapter four
4. Special Probability Distributions and Densities
Introduction
Chapter 3 dealt with the general probability distributions of random variables. Random variables with special probability distributions are encountered in different fields of the social and natural sciences, including business and economics. The objective of this chapter is to describe some of these distributions, including their expected values and variances. These include discrete probability distributions such as the Bernoulli, binomial, hypergeometric and Poisson distributions. Two continuous distributions, the uniform distribution and the normal distribution, are also discussed. Examples are given for each distribution, and there are exercises at the end of the chapter so that you can understand each distribution well.
Objectives:
After this chapter, students will be able to:
Identify the different types of special probability distributions
Use these special probability distributions to solve problems
Find the expected value and variance of each special probability distribution
4.1.1. The Bernoulli distribution: This is perhaps one of the simplest possible discrete random variables. We say that a random variable X has the Bernoulli (p) distribution if and only if its probability mass function is given by
$$f(x) = P(X = x) = p^x (1 - p)^{1 - x} \quad \text{for } x = 0, 1$$
where 0 < p < 1. Here, p is often referred to as a parameter. In applications, we may collect dichotomous data, for example simply record whether an item is defective (x = 0) or non-defective (x = 1), whether an individual is married (x = 0) or unmarried (x = 1), or whether a vaccine works (x = 0) or does not work (x = 1), and so on. In such situations, p stands for P(X = 1) and 1 − p stands for P(X = 0). In other words, the Bernoulli distribution deals with a simple experiment that may result in either of two possible outcomes. We call an experiment with two possible outcomes a Bernoulli trial, and we label the two outcomes success (S) and failure (F). The probability of success is the complement of the probability of failure.
Where 0<P<1. Here again P is referred to as a parameter. Observe that the Bernoulli (P)
distribution is the same as the Binomial (1, P) distribution.
The Binomial (n, p) distribution arises as follows: consider repeating the Bernoulli experiment independently n times, where each time one observes the outcome (0 or 1) and p = P(X = 1) remains the same throughout.
Example 4.1: In a short multiple-choice quiz, suppose that there are ten unrelated questions, each with five suggested choices as the possible answers, and each question has exactly one correct answer. An unprepared student guessed all the answers in that quiz. Suppose that each correct (wrong) answer to a question carries one (zero) point.
a. Find the probability that the student gets 0/10
b. Find the probability that the student gets at least 8
Solution:
Let X stand for the student's quiz score.
We can postulate that X has the Binomial (n = 10, p = 1/5) distribution; then
$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$$
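a) $P(X = 0) = \binom{10}{0} \left(\tfrac{1}{5}\right)^0 \left(\tfrac{4}{5}\right)^{10} = 0.10737$ (the probability that the student gets 0/10)
b) $P(X \ge 8) = \binom{10}{8} \left(\tfrac{1}{5}\right)^8 \left(\tfrac{4}{5}\right)^2 + \binom{10}{9} \left(\tfrac{1}{5}\right)^9 \left(\tfrac{4}{5}\right)^1 + \binom{10}{10} \left(\tfrac{1}{5}\right)^{10} \left(\tfrac{4}{5}\right)^0 = 7.7926 \times 10^{-5}$ (the probability that the student's result is at least 8)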
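The two probabilities in Example 4.1 can be reproduced with a few lines of code. This is a minimal sketch using only the standard library; `binomial_pmf` is an illustrative helper name, not something defined in the module.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a Binomial(n, p) random variable."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 10, 1 / 5  # ten questions, one correct answer out of five choices

p_zero = binomial_pmf(0, n, p)
p_at_least_8 = sum(binomial_pmf(x, n, p) for x in range(8, n + 1))

print(p_zero, p_at_least_8)
```

Summing `binomial_pmf(x, n, p)` over x = 0, 1, ..., n should give 1, which is a quick way to confirm that the formula defines a valid probability distribution.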
Example 4.2: We toss a coin five times and we are interested in the number of heads in each possible outcome. Let X represent the number of heads. Find the probability distribution of X and graph the distribution.
Solution:
The probability distribution of X is
No. of heads (x)     0      1      2       3       4      5
f(x) = P(X = x)      1/32   5/32   10/32   10/32   5/32   1/32
This probability distribution can be found using the binomial formula. For example, the probabilities that the random variable X equals 0 and 1 are
$$f(0) = P(X = 0) = \binom{5}{0} \left(\tfrac{1}{2}\right)^0 \left(\tfrac{1}{2}\right)^5 = \frac{1}{32}$$
$$f(1) = P(X = 1) = \binom{5}{1} \left(\tfrac{1}{2}\right)^1 \left(\tfrac{1}{2}\right)^4 = \frac{5}{32}$$
[Graph of the distribution: bars over x = 0, 1, 2, 3, 4, 5 with heights 1/32, 5/32, 10/32, 10/32, 5/32 and 1/32]
number of elements in the population is small in relation to the sample size ($n/N > 0.05$), or if the sampling is without replacement, the probability of a success for a given trial is dependent on the outcomes of preceding trials. Then the number X of successes follows what is known as the hypergeometric probability distribution.
The formula for calculating the probability of exactly k successes in n trials is as follows. Suppose a population of size N contains M successes and N − M failures. The probability of exactly k successes in a random sample of size n is
$$P(X = k) = \frac{\binom{M}{k} \binom{N - M}{n - k}}{\binom{N}{n}}$$
The mean and variance of the hypergeometric distribution are
$$\mu = n \frac{M}{N}$$
$$\sigma^2 = n \left(\frac{M}{N}\right) \left(\frac{N - M}{N}\right) \left(\frac{N - n}{N - 1}\right)$$
Example 4.4: A case of wine has 12 bottles, 3 of which contain spoiled wine. A sample of 4 bottles is randomly selected from the case.
1. Find the probability distribution for x, the number of bottles of spoiled wine in the sample.
2. What are the mean (µ) and variance ($\sigma^2$) of x?
Solution:
$$1)\quad P(x) = \frac{\binom{3}{x} \binom{9}{4 - x}}{\binom{12}{4}}, \quad x = 0, 1, 2, 3$$
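$$2)\quad \mu = 4 \left(\frac{3}{12}\right) = 1$$
$$\sigma^2 = 4 \left(\frac{3}{12}\right) \left(\frac{9}{12}\right) \left(\frac{12 - 4}{11}\right) = 0.5455$$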
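To see the whole distribution of Example 4.4 and confirm the mean and variance, the probabilities can be tabulated in exact fractions. The sketch below is illustrative; `hypergeom_pmf` and the variable names are assumptions introduced for this example.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, N, M, n):
    """P(X = k): k successes in a sample of n drawn without replacement
    from a population of N items containing M successes."""
    return Fraction(comb(M, k) * comb(N - M, n - k), comb(N, n))


N, M, n = 12, 3, 4  # 12 bottles, 3 spoiled, sample of 4

dist = {k: hypergeom_pmf(k, N, M, n) for k in range(min(M, n) + 1)}
mean = sum(k * p for k, p in dist.items())
variance = sum((k - mean) ** 2 * p for k, p in dist.items())

for k, p in dist.items():
    print(k, p)
print(mean, variance, float(variance))
```

The mean and variance computed from the tabulated probabilities should match the values given by the closed-form expressions $nM/N$ and $n(M/N)((N-M)/N)((N-n)/(N-1))$.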
Let µ be the average number of times that an event occurs in a certain period of time or
space. The probability of K occurrences of this event is
$$P(X = k) = \frac{\mu^k e^{-\mu}}{k!}$$
for values of k = 0, 1, 2, 3, …
The mean and standard deviation of the Poisson distribution are, respectively:
Mean = $\mu$
Standard deviation = $\sqrt{\mu}$
where the symbol e ≈ 2.71828.
Example 4.5: The average number of traffic accidents on a certain section of highway (for example, from Addis to Adama) is two (µ = 2) per week.
1. Find the probability of:
a. No accidents on this section of highway during a week, P(X = 0)
b. At most three accidents on this section of highway during 2 weeks, P(X ≤ 3)
Solution:
$$P(X = k) = \frac{\mu^k e^{-\mu}}{k!}$$
$$a)\quad P(X = 0) = \frac{2^0 e^{-2}}{0!} = 0.135335$$
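The Poisson probabilities of Example 4.5 can be evaluated as follows. This is a sketch under one stated assumption: for part (b), the average rate is taken to scale with the length of the interval, so that two weeks correspond to µ = 2 × 2 = 4. The helper name `poisson_pmf` is illustrative.

```python
from math import exp, factorial


def poisson_pmf(k, mu):
    """P(X = k) for a Poisson random variable with mean mu."""
    return mu ** k * exp(-mu) / factorial(k)


# (a) no accidents in one week, mu = 2
p_none_one_week = poisson_pmf(0, 2)

# (b) at most three accidents in two weeks
# Assumption: the weekly rate scales with time, so mu = 4 for two weeks.
p_at_most_3_two_weeks = sum(poisson_pmf(k, 4) for k in range(4))

print(p_none_one_week, p_at_most_3_two_weeks)
```

The first value should agree with the 0.135335 obtained in part (a).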
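$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
It is used to model events that are equally likely to occur at any time within a given time interval.
[Graph of the uniform distribution: f(x) is constant at height 1/(b − a) between a and b]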
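$$E(x) = \int_a^b x f(x)\,dx = \int_a^b \frac{x}{b - a}\,dx = \left.\frac{x^2}{2(b - a)}\right|_a^b = \frac{b^2 - a^2}{2(b - a)} = \frac{(b - a)(b + a)}{2(b - a)} = \frac{a + b}{2}$$
$$E(x^2) = \int_a^b \frac{x^2}{b - a}\,dx = \left.\frac{x^3}{3(b - a)}\right|_a^b = \frac{b^3 - a^3}{3(b - a)} = \frac{(b - a)(b^2 + ab + a^2)}{3(b - a)} = \frac{b^2 + ab + a^2}{3}$$
$$\sigma_x^2 = E(x^2) - (E(x))^2 = \frac{b^2 + ab + a^2}{3} - \frac{b^2 + 2ab + a^2}{4} = \frac{b^2 - 2ab + a^2}{12} = \frac{(b - a)^2}{12}$$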
Example 4.3. The time that Abebe, the teaching assistant, takes to grade a paper is
uniformly distributed between 5 minutes and 10 minutes. Find the mean and variance of the
time to grade a paper.
Solution: Let X be a random variable that denotes the time it takes Abebe to grade a paper. The mean or expected value E(X) and the variance ($\sigma_x^2$) are
$$E(x) = \frac{10 + 5}{2} = 7.5$$
$$\sigma_x^2 = \frac{(10 - 5)^2}{12} = \frac{25}{12}$$
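The closed-form mean and variance of the uniform distribution are easy to wrap in two small functions. The sketch below is illustrative (the function names are assumptions); it applies the formulas to the grading-time example and, as a second check, to a uniform density on (0.5, 3.5).

```python
from fractions import Fraction


def uniform_mean(a, b):
    """Mean of a uniform distribution on [a, b]: (a + b) / 2."""
    a, b = Fraction(a), Fraction(b)
    return (a + b) / 2


def uniform_variance(a, b):
    """Variance of a uniform distribution on [a, b]: (b - a)^2 / 12."""
    a, b = Fraction(a), Fraction(b)
    return (b - a) ** 2 / 12


# Grading time uniformly distributed between 5 and 10 minutes
print(uniform_mean(5, 10), uniform_variance(5, 10))

# A uniform density on (0.5, 3.5): mean 2, variance 3/4
print(uniform_mean(0.5, 3.5), uniform_variance(0.5, 3.5))
```

Exact fractions are used so that 25/12 is reported as such instead of a rounded decimal.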
It has been observed that most business and economic variables generate continuous data
whose behaviour is often best described by a bell-shaped continuous curve. Since this is
what we normally come across in the case of most populations on these variables, a bell-
shaped curve has come to be universally known as normal curve. Accordingly, the
probability distribution described by a normal curve is called the normal (probability)
distribution.
The normal distribution has come to acquire a wide range of applications in many areas of human knowledge. It is used in almost all data-based research in the fields of agriculture, trade, business, and industry. As will be noticed a little later, much of the theory of inductive statistics, concerning estimation of unknown population parameters and testing hypotheses on the basis of sample statistics, has been developed using the concept of the normal curve.
The parameters $\mu$ and $\sigma$ are critical values in the normal distribution equation. Other terms being constant, the exponent
$$-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2 \quad \text{or} \quad -\frac{(x - \mu)^2}{2\sigma^2}$$
is the only operational part of the equation. It shows the deviation of the value of the normal variable X from its mean $\mu$. The larger these deviations, the higher the value of the standard deviation $\sigma$ (or variance $\sigma^2$), which is the denominator in the exponent.
It may be seen that $\mu$ lies at the center of the normal curve and indicates the central value of the normal distribution. The standard deviation $\sigma$ is a measure of the extent of the spread of X values from the central value $\mu$. Thus, while $\mu$ fixes the position or the level of the distribution on the X-axis, $\sigma$ determines the spread of the distribution along the X-axis on both sides of the central value.
In the light of the above, consider the following three situations:
1. A change in $\mu$, with the standard deviation $\sigma$ remaining the same, shifts the curve along the X-axis without changing its spread. This is shown in figure 4.4.
Figure 4.4 Normal curves with different means (µ1 and µ2) and the same standard deviation
2. A change in $\sigma$, with the mean $\mu$ remaining the same, changes the shape or spread of the normal curve. This is shown in figure 4.5.
Figure 4.5 Normal curves with different standard deviations ($\sigma_2 > \sigma_1$) and $\mu_1 = \mu_2$
3. An increase in $\sigma$ increases the spread of the normal curve equally on both sides of the central value. It lowers the normal curve in height, irrespective of whether or not there is any change in $\mu$. A decrease in $\sigma$, on the contrary, reduces the spread of the normal curve and increases its height. The inverse relationship between the extent of spread of the normal curve and its height at the central value can be easily grasped by observing figure 4.5. This is so because the total area under any normal curve must always be 1.
one would have to have a table of areas under the curve. In order to simplify this situation,
statisticians use what is called the standard normal distribution.
The standard normal distribution is a distribution with a mean of 0 and a standard deviation of 1. One advantage of all normally distributed variables is that they can be transformed into the standard normal distribution by using the formula for the standard score (Z):
$$Z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} \quad \text{or} \quad Z = \frac{X - \mu}{\sigma}$$
As we stated earlier, the area under the normal distribution curve is used to solve practical application problems; hence the major emphasis of this section is to show the procedures for finding the area under the normal distribution curve for any Z value. Once the X values are transformed by using the above formula, they are called Z values. The Z value is actually the number of standard deviations that a particular X value is away from its mean (i.e. below or above the mean). For example, $Z = \pm 2$ implies that the value X is 2 standard deviations above or below the mean.
[Figure 4.6: Standard normal curve with Z-values from −3 to 3 on the horizontal axis; 68.26% of the area lies within ±1, 95.44% within ±2 and 99.74% within ±3]
Figure 4.6 shows the graph of the probability density function of Z with mean equal to zero and standard deviation equal to one. This curve is symmetric, bell-shaped and centered on the mean equal to zero, and it has most of its area contained within the range ±3 (99.74%).
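The areas within one, two and three standard deviations of the mean can be computed without a table by expressing the standard normal distribution function through the error function, $\Phi(z) = \tfrac{1}{2}[1 + \operatorname{erf}(z/\sqrt{2})]$. The sketch below is illustrative; the function names are assumptions. The computed values differ slightly in the last digit from the percentages quoted with Figure 4.6, which appear to come from a four-decimal table.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def area_within(k):
    """P(-k < Z < k): area within k standard deviations of the mean."""
    return standard_normal_cdf(k) - standard_normal_cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2), "%")
```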
Example 4.7: A continuous manufacturing process produces items whose weights are
normally distributed with a mean weight of 800 g and standard deviation of 300 g. A
random sample of 16 items is to be drawn from the process.
a. What is the probability that the arithmetic mean of the sample exceeds 900 g? Interpret the results.
b. Find the values of the sample arithmetic mean within which the middle 95 percent of all sample means will fall.
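Solution: (a) We are given the following information:
µ = 800 g, $\sigma$ = 300 g, and n = 16.
Since the population is normally distributed, the distribution of the sample mean is normal with mean and standard deviation equal to
$$\mu_{\bar{x}} = \mu = 800$$
and
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{300}{\sqrt{16}} = \frac{300}{4} = 75$$
The required probability $P(\bar{x} > 900)$ is represented by the shaded area in figure 4.7 of a normal curve. Hence
$$P(\bar{x} > 900) = P\left(Z > \frac{900 - 800}{75}\right) = P(Z > 1.33) = 0.5000 - 0.4082 = 0.0918$$
Hence, 9.18 percent of all possible samples of size n = 16 will have a sample mean value greater than 900 g.
Figure 4.7 Normal curve of the sample mean centered at $\mu_{\bar{x}} = 800$, with the area 0.0918 to the right of $\bar{x} = 900$ (Z = 1.33) shaded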
(b) Since Z = ±1.96 for the middle 95 percent of the area under the normal curve, as shown in figure 4.8, using the formula for Z to solve for the values of $\bar{x}$ in terms of the known values gives:
$$\bar{x}_1 = \mu_{\bar{x}} - Z\sigma_{\bar{x}} = 800 - 1.96(75) = 653 \text{ g}$$
$$\bar{x}_2 = \mu_{\bar{x}} + Z\sigma_{\bar{x}} = 800 + 1.96(75) = 947 \text{ g}$$
Therefore, 95% of all sample means fall within [653, 947].
Figure 4.8 Normal curve of the sample mean with the middle area 0.9500 between $\bar{x}_1$ and $\bar{x}_2$
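Both parts of Example 4.7 can be verified with the error-function form of the normal distribution function. The following is a sketch with illustrative names. Because it uses the unrounded Z value 100/75 ≈ 1.333 instead of the table value 1.33, the tail probability it prints is about 0.091, slightly below the 0.0918 read from the table.

```python
from math import erf, sqrt


def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))


mu, sigma, n = 800, 300, 16

# Standard deviation of the sample mean (standard error)
se = sigma / sqrt(n)

# (a) probability that the sample mean exceeds 900 g
p_exceeds_900 = 1 - normal_cdf(900, mu, se)

# (b) limits enclosing the middle 95% of sample means
lower = mu - 1.96 * se
upper = mu + 1.96 * se

print(se, p_exceeds_900, lower, upper)
```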
Example 4.8: In a normal distribution 31 percent of the items are under 45 and 8 percent
are over 64. Find the mean and standard deviation of the distribution.
Solution: Since 31 percent of the items are under 45, the area to the left of the ordinate at X = 45 is 0.31, and the area to the right of this ordinate up to the mean is (0.5 − 0.31) = 0.19. The value of Z corresponding to this area is 0.5; since X = 45 lies below the mean, Z = −0.5. Hence
$$Z = \frac{45 - \mu}{\sigma} = -0.5 \quad \text{or} \quad \mu - 0.5\sigma = 45$$
As 8 percent of the items are above 64, the area to the right of the ordinate at 64 is 0.08. The area to the left of the ordinate at X = 64 up to the mean ordinate is (0.5 − 0.08) = 0.42, and the value of Z corresponding to this area is 1.4. Hence
$$Z = \frac{64 - \mu}{\sigma} = 1.4 \quad \text{or} \quad \mu + 1.4\sigma = 64$$
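From these two equations, we get $1.9\sigma = 19$ or $\sigma = 10$. Substituting $\sigma = 10$ in the first equation, we get $\mu - 0.5 \times 10 = 45$ or $\mu = 50$.
Thus, the mean of the distribution is 50 and the standard deviation is 10.
Figure 4.9 Normal curve with 31% of the area below 45, 19% between 45 and the mean, 42% between the mean and 64, and 8% above 64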
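The same two equations can be solved with Z values taken from the inverse of the standard normal distribution function instead of a rounded table. This sketch relies on `statistics.NormalDist` from the Python standard library (version 3.8 or later); the variable names are illustrative. With unrounded Z values the solution comes out very close to, but not exactly, µ = 50 and σ = 10.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# 31% of items lie below 45, and 8% lie above 64 (so 92% lie below 64)
z_low = standard_normal.inv_cdf(0.31)
z_high = standard_normal.inv_cdf(1 - 0.08)

# Solve (45 - mu) / sigma = z_low and (64 - mu) / sigma = z_high
sigma = (64 - 45) / (z_high - z_low)
mu = 45 - z_low * sigma

print(z_low, z_high, mu, sigma)
```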
Summary
This chapter introduced some of the many classes of random variables. The Bernoulli random variable is used to model experiments that have only two possible outcomes, which are referred to as success and failure. The random variable that is used to denote the number of successes that occur in n Bernoulli trials is the binomial random variable, and its distribution is the binomial distribution. One popular area of application of probability is quality control. Some of the items coming off a production line are good and some are bad. If we know beforehand the fraction of the items in a production batch that are good, we may want to know the probability that a sample contains a specified number of bad items. The random variable that is used to denote this number is the hypergeometric random variable, and its distribution is the hypergeometric distribution.
The Poisson random variable is used to count the number of events over a given interval, space, etc., for example the number of equipment failures over a given interval.
Two commonly used continuous random variables were also discussed. The first is the uniformly distributed random variable, which is used to denote the time of events that are equally likely to occur at any time within a given interval. The other is the normal random variable, which is used to denote events that have a higher probability of occurrence around the mean value and a smaller probability of occurrence farther away from the mean value.
Exercise
1. In a class there are 40 students taking the course Statistics for Economists. The students' results out of 100% are normally distributed with mean 70 and variance 64. The top 10% of the students are to get an "A" grade and the bottom 5% are to get an "F" grade.
a. How many students' results are within the range 62 to 78?
b. Determine the minimum mark that students should get in order to get an "A" grade.
c. For what range of marks do students get an "F" grade?
3. Suppose that a sample of households is randomly selected from all the households in the city in order to estimate the percentage in which the head of the household is unemployed. The literature indicates that the unemployment rate in the city is 10%. If a random sample of 5 households is selected from the households in the city, what is the probability that all five heads of household are employed?
4. A shipment of 8 similar microcomputers to a retail outlet contains 3 that are defective. If
a school makes a random purchases of 2, of these computers, let X be a random variable
whose value is the number of defective computers.
a. Find the probability distribution for the number of defectives?
b. Find the mean and variance of the distribution?
5. The probability that a person dies from a certain respiratory infection is 0.002. Find the probability that fewer than 5 of the next 2000 so infected will die. (0.6288)
6. Lots of 40 components each are called acceptable if they contain no more than 3
defective. The procedure for sampling the lot is to select 5 components at random
and to reject the lot if a defective is found. What is the probability that exactly 1
defective will be found in the sample if there are 3 defective in the entire lot?
(0.3011)
7. The number of mistakes counted in one hundred typed pages of a typist revealed
that she made 2.8 mistakes on average per page. Find the probability that in a page
typed by her
8. A stockist has 20 items in a lot. Out of these, 12 are non-defective and 8 are defective. A customer selects 3 at random.
a. What is the probability that all three items are non-defective? (0.193)
b. What is the probability that, out of these three items, two are non-defective and one is defective? (0.463)
9. The lifetimes of certain kinds of electric devices have a mean of 300 hours and a standard deviation of 25 hours. Assume that the distribution of these lifetimes, which are measured to the nearest hour, can be approximated closely with a normal curve.
a. Find the probability that any one of these electric devices will have a lifetime of more than 350 hours. (0.0228)
b. What percentage will have lifetimes of 300 hours or less? (50%)
c. What percentage will have lifetimes from 220 to 260 hours? (5.41%)
Chapter five
5. Joint Distributions
Introduction
We have so far been concerned with the properties of a single random variable defined on a given sample space. Sometimes we encounter problems that deal with two or more random variables defined on the same sample space. In this chapter, we consider the joint probability distribution of such multivariate random variables.
The random variable ideas discussed earlier can be easily generalized to two or more random variables. We consider the typical case of two random variables which are either both discrete or both continuous. This chapter also discusses the concepts of covariance and the correlation coefficient of two random variables. In cases where one variable is discrete and the other continuous, appropriate modifications are easily made; generalizations to more than two variables can also be made.
Objectives:
After this chapter, the students will be able to:
identify univariate and bivariate random variables
find probabilities from joint probability distributions
find the marginal probability distributions of a joint distribution
calculate the covariance and correlation coefficient of two random variables
In chapter III, we considered a single random variable (for example, X). A random variable could represent a characteristic of an item or an individual. For instance, X could represent the age of an individual. However, sometimes we may be interested in two characteristics of an item or individual, for example the level of education of an individual, which can be represented by X, and income, which can be represented by Y. In this case, we have two random variables and their distribution is said to be a joint distribution. As we have seen in chapter III, random variables can be discrete or continuous:
For j = 1, 2, ..., m, these are indicated by the entry totals in the extreme right-hand column or margin of Table 5.1. Similarly, the probability that $Y = y_k$ is obtained by adding all entries in the column corresponding to $y_k$ and is given by
$$P(Y = y_k) = f_2(y_k) = \sum_{j=1}^{m} f(x_j, y_k)$$
For k = 1, 2, ..., n, these are indicated by the entry totals in the bottom row or margin of Table 5.1.
Table 5.1 Joint probability distribution
X \ Y     y1          y2          …    yn          Totals
x1        f(x1,y1)    f(x1,y2)    …    f(x1,yn)    f1(x1)
x2        f(x2,y1)    f(x2,y2)    …    f(x2,yn)    f1(x2)
..        ..          ..          ..   ..          ..
xm        f(xm,y1)    f(xm,y2)    …    f(xm,yn)    f1(xm)
Totals    f2(y1)      f2(y2)      …    f2(yn)      1 (Grand Total)
We refer to $f_1(x_j)$ and $f_2(y_k)$ (or simply $f_1(x)$ and $f_2(y)$) as the marginal probability functions of X and Y respectively. It should also be noted that
$$\sum_{j=1}^{m} f_1(x_j) = 1, \qquad \sum_{k=1}^{n} f_2(y_k) = 1$$
which can be written as
$$\sum_{j=1}^{m} \sum_{k=1}^{n} f(x_j, y_k) = 1$$
This is simply the statement that the total probability of all entries is 1. The grand total of 1
is indicated in the lower right-hand corner of the table.
The joint distribution function of X and Y is defined by
$$F(x, y) = P(X \le x, Y \le y) = \sum_{u \le x} \sum_{v \le y} f(u, v)$$
In Table 5.1, F(x, y) is the sum of all entries for which $x_j \le x$ and $y_k \le y$.
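A joint probability table, its marginal totals and its joint distribution function can be built in a few lines. The sketch below is illustrative: it assumes a joint probability function proportional to 2x + y on x = 0, 1, 2 and y = 0, 1, 2, 3 (the same form as in Example 5.1 of this chapter), and the names `joint`, `marginal_x`, `marginal_y` and `joint_cdf` are introduced here for the example.

```python
from fractions import Fraction

xs = range(0, 3)  # possible values of X: 0, 1, 2
ys = range(0, 4)  # possible values of Y: 0, 1, 2, 3

# Unnormalized weights 2x + y; dividing by their total makes them sum to 1
weights = {(x, y): 2 * x + y for x in xs for y in ys}
total = sum(weights.values())
joint = {point: Fraction(w, total) for point, w in weights.items()}

# Marginal probability functions: row totals and column totals
marginal_x = {x: sum(joint[x, y] for y in ys) for x in xs}
marginal_y = {y: sum(joint[x, y] for x in xs) for y in ys}


def joint_cdf(x, y):
    """F(x, y) = P(X <= x, Y <= y): sum of entries with u <= x and v <= y."""
    return sum(p for (u, v), p in joint.items() if u <= x and v <= y)


print(total, joint[2, 1])
print(marginal_x)
print(marginal_y)
print(joint_cdf(1, 2))
```

The marginal totals each sum to 1, and `joint_cdf` evaluated at the largest values of x and y returns the grand total of 1.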
5.2. Continuous Case
The case where both variables are continuous is obtained easily by analogy with the discrete case on replacing sums by integrals. Thus the joint probability function for the random variables X and Y (or, as it is more commonly called, the joint density function of X and Y) is defined by
1. $f(x, y) \ge 0$
2. $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1$
The joint distribution function of X and Y in this case is
$$F(x, y) = P(X \le x, Y \le y) = \int_{u=-\infty}^{x} \int_{v=-\infty}^{y} f(u, v)\, du\, dv$$
It follows that
$$f(x, y) = \frac{\partial^2 F}{\partial x\, \partial y}$$
i.e. the density function is obtained by differentiating the distribution function with respect to x and y. From the distribution function we also obtain
$$P(X \le x) = F_1(x) = \int_{u=-\infty}^{x} \int_{v=-\infty}^{\infty} f(u, v)\, du\, dv$$
$$P(Y \le y) = F_2(y) = \int_{u=-\infty}^{\infty} \int_{v=-\infty}^{y} f(u, v)\, du\, dv$$
We call these the marginal distribution functions, or simply the distribution functions, of X and Y respectively. The derivatives with respect to x and y are then called the marginal density functions, or simply the density functions, of X and Y and are given by
$$f_1(x) = \int_{v=-\infty}^{\infty} f(x, v)\, dv, \qquad f_2(y) = \int_{u=-\infty}^{\infty} f(u, y)\, du$$
Example 5.1 The joint probability function of two discrete random variables X and Y is given by f(x, y) = c(2x + y), where x and y can assume all integers such that 0 ≤ x ≤ 2, 0 ≤ y ≤ 3, and f(x, y) = 0 otherwise.
a) Find the value of the constant c.
b) Find P(X = 2, Y = 1).
c) Find P(X ≥ 1, Y ≤ 2).
Solution:
a) The sample points (x, y) for which probabilities are different from zero are indicated in Figure 5.1. The probabilities associated with these points, given by c(2x + y), are shown in Table 5.2. Since the grand total, 42c, must equal 1, we have c = 1/42.
Table 5.2
X \ Y    0     1     2     3     Totals
0        0     c     2c    3c    6c
1        2c    3c    4c    5c    14c
2        4c    5c    6c    7c    22c
Totals   6c    9c    12c   15c   42c
[Figure 5.1: the sample points (x, y), plotted for x = 0, 1, 2 on the horizontal axis and y = 0, 1, 2, 3 on the vertical axis]
b) From Table 5.2 we see that
$$P(X = 2, Y = 1) = 5c = \frac{5}{42}$$
c) From Table 5.2 we see that
$$P(X \ge 1, Y \le 2) = \sum_{x \ge 1} \sum_{y \le 2} f(x, y) = (2c + 3c + 4c) + (4c + 5c + 6c) = 24c = \frac{24}{42} = \frac{4}{7}$$
Example 5.2 Find the marginal probability functions (a) of X and (b) of Y for the random variables of Example 5.1.
Solution:
a) The marginal probability function for X is given by P(X = x) = f1(x) and can be obtained from the margin totals in the right-hand column of Table 5.2. From these we see that
$$P(X = x) = f_1(x) = \begin{cases} 6c = 1/7 & x = 0 \\ 14c = 1/3 & x = 1 \\ 22c = 11/21 & x = 2 \end{cases}$$
Check: 1/7 + 1/3 + 11/21 = 1
b) The marginal probability function for Y is given by P(Y = y) = f2(y) and can be obtained from the margin totals in the last row of Table 5.2. From these we see that
$$P(Y = y) = f_2(y) = \begin{cases} 6c = 1/7 & y = 0 \\ 9c = 3/14 & y = 1 \\ 12c = 2/7 & y = 2 \\ 15c = 5/14 & y = 3 \end{cases}$$
Check: 1/7 + 3/14 + 2/7 + 5/14 = 1
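The constant c and the two marginal probability functions can also be checked by direct summation. The following is a minimal Python sketch, assuming the support shown in Table 5.2 (x = 0, 1, 2 and y = 0, 1, 2, 3); the variable names are chosen here for illustration only.

```python
from fractions import Fraction

# Support taken from Table 5.2: x in {0, 1, 2}, y in {0, 1, 2, 3}
xs = range(3)
ys = range(4)

# Unnormalised weights 2x + y; c is chosen so that all probabilities sum to 1
total_weight = sum(2 * x + y for x in xs for y in ys)
c = Fraction(1, total_weight)

# Joint probability function f(x, y) = c(2x + y)
f = {(x, y): c * (2 * x + y) for x in xs for y in ys}

# Marginals: row totals give f1(x), column totals give f2(y)
f1 = {x: sum(f[x, y] for y in ys) for x in xs}
f2 = {y: sum(f[x, y] for x in xs) for y in ys}

print("c  =", c)
print("f1 =", {x: str(p) for x, p in f1.items()})
print("f2 =", {y: str(p) for y, p in f2.items()})
```

Exact fractions are used instead of floating-point numbers so that the results can be compared directly with the fractions in the solution.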
If X and Y are continuous random variables, we say that they are independent random variables if the events X ≤ x and Y ≤ y are independent events for all x and y. In such a case we can write
P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y)
or equivalently
F(x, y) = F1(x) F2(y)
where F1(x) and F2(y) are the (marginal) distribution functions of X and Y respectively.
Example 5.3 Show that the random variables X and Y of Example 5.1 are dependent.
Solution:
If the random variables X and Y are independent, then we must have, for all x and y,
P(X = x, Y = y) = P(X = x) P(Y = y)
But, as seen from Examples 5.1(b) and 5.2,
$$P(X = 2, Y = 1) = \frac{5}{42}, \qquad P(X = 2) = \frac{11}{21}, \qquad P(Y = 1) = \frac{3}{14}$$
so that P(X = 2) P(Y = 1) = 11/98 and therefore
$$P(X = 2, Y = 1) \ne P(X = 2)\, P(Y = 1)$$
The result also follows from the fact that the joint probability function, (2x + y)/42, cannot be expressed as a function of x alone times a function of y alone.
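The same check can be carried out at every point of the table rather than at one point only. The sketch below is an illustration, assuming the joint probability function (2x + y)/42 on x = 0, 1, 2 and y = 0, 1, 2, 3; it lists the points at which f(x, y) differs from f1(x) f2(y).

```python
from fractions import Fraction

xs, ys = range(3), range(4)
f = {(x, y): Fraction(2 * x + y, 42) for x in xs for y in ys}

# Marginal probability functions from row and column totals
f1 = {x: sum(f[x, y] for y in ys) for x in xs}
f2 = {y: sum(f[x, y] for x in xs) for y in ys}

# Independence requires f(x, y) == f1(x) * f2(y) at every point
violations = [(x, y) for x in xs for y in ys if f[x, y] != f1[x] * f2[y]]
independent = not violations

print("independent:", independent)
print("f(2,1) =", f[2, 1], "  f1(2) * f2(1) =", f1[2] * f2[1])
```

A single violating point is enough to establish dependence, which is why the worked solution only needs the point (2, 1).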
If X and Y are discrete random variables and we have the events (A: X = x), (B: Y = y), then the above conditional distribution becomes
$$P(Y = y \mid X = x) = \frac{f(x, y)}{f_1(x)}$$
where f(x, y) = P(X = x, Y = y) is the joint probability function and f1(x) is the marginal probability function for X. We define
$$f(y \mid x) = \frac{f(x, y)}{f_1(x)}$$
and call it the conditional probability function of Y given X. Similarly, the conditional probability function of X given Y is
$$f(x \mid y) = \frac{f(x, y)}{f_2(y)}$$
We shall sometimes denote f(x | y) and f(y | x) by f1(x | y) and f2(y | x) respectively.
These ideas are easily extended to the case where X and Y are continuous random variables. For example, the conditional density function of Y given X is
$$f(y \mid x) = \frac{f(x, y)}{f_1(x)}$$
where f(x, y) is the joint density function of X and Y, and f1(x) is the marginal density function of X. We can, for example, find that the probability of Y being between c and d given that x < X < x + dx is
$$P(c < Y < d \mid x < X < x + dx) = \int_{c}^{d} f(y \mid x)\, dy$$
Example 5.4 The joint probability function of two discrete random variables X and Y is given by f(x, y) = c(2x + y), where x and y can assume all integers such that 0 ≤ x ≤ 2, 0 ≤ y ≤ 3, and f(x, y) = 0 otherwise. Find a) f(y | 2) and b) P(Y = 1 | X = 2).
Solution
a. From Examples 5.1 and 5.2, c = 1/42 and f1(2) = 11/21. Hence
$$f(y \mid x) = \frac{f(x, y)}{f_1(x)} = \frac{(2x + y)/42}{f_1(x)}$$
so that with X = 2
$$f(y \mid 2) = \frac{(4 + y)/42}{11/21} = \frac{4 + y}{22}$$
b. $P(Y = 1 \mid X = 2) = f(1 \mid 2) = \frac{5}{22}$
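A conditional probability function is simply one row of the joint table divided by its row total. The following sketch illustrates this for the joint probability function (2x + y)/42; the helper name `conditional_y_given_x` is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

xs, ys = range(3), range(4)
f = {(x, y): Fraction(2 * x + y, 42) for x in xs for y in ys}

def conditional_y_given_x(x):
    """Return {y: f(y | x)}: the row of the joint table for x divided by its total."""
    row_total = sum(f[x, y] for y in ys)  # marginal probability f1(x)
    return {y: f[x, y] / row_total for y in ys}

f_y_given_2 = conditional_y_given_x(2)
for y, p in f_y_given_2.items():
    print(f"f({y} | 2) = {p}")
```

Because each row is divided by its own total, the conditional probabilities for a fixed x always sum to 1.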
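The conditional expectation, or conditional mean, of Y given X = x is defined by
$$E(Y \mid X = x) = \int_{-\infty}^{\infty} y\, f(y \mid x)\, dy$$
We note the following properties:
1. E(Y | X = x) = E(Y) when X and Y are independent.
2. $E(Y) = \int_{-\infty}^{\infty} E(Y \mid X = x)\, f_1(x)\, dx$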
Example 5.5 The average travel time to a distant city is c hours by car or b hours by bus. A man cannot decide whether to drive or take the bus, so he tosses a coin. What is his expected travel time?
Solution:
Here we are dealing with the joint distribution of the outcome of the toss, X, and the travel time, Y, where Y = Y_car if X = 0 and Y = Y_bus if X = 1. Presumably, both Y_car and Y_bus are independent of X, so that by Property 1 above
E(Y | X = 0) = E(Y_car | X = 0) = E(Y_car) = c
and
E(Y | X = 1) = E(Y_bus | X = 1) = E(Y_bus) = b
Then Property 2 (with the integral replaced by a sum) gives, for a fair coin,
$$E(Y) = E(Y \mid X = 0)\, P(X = 0) + E(Y \mid X = 1)\, P(X = 1) = \frac{c + b}{2}$$
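In a similar manner we can define the conditional variance of Y given X as
$$E[(Y - \mu_2)^2 \mid X = x] = \int_{-\infty}^{\infty} (y - \mu_2)^2 f(y \mid x)\, dy$$
where $\mu_2 = E(Y \mid X = x)$. Also, we can define the rth conditional moment of Y given X about any value a as
$$E[(Y - a)^r \mid X = x] = \int_{-\infty}^{\infty} (y - a)^r f(y \mid x)\, dy$$
The usual theorems for variance and moments extend to conditional variance and moments.
For two continuous random variables X and Y with joint density function f(x, y), the variances are
$$\sigma_X^2 = E[(X - \mu_X)^2] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)^2 f(x, y)\, dx\, dy$$
$$\sigma_Y^2 = E[(Y - \mu_Y)^2] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (y - \mu_Y)^2 f(x, y)\, dx\, dy$$
Another quantity which arises in the case of two variables X and Y is the covariance, defined by
$$\sigma_{XY} = \mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$$
Similar remarks can be made for two discrete random variables. In such a case,
$$\mu_X = \sum_x \sum_y x\, f(x, y), \qquad \mu_Y = \sum_x \sum_y y\, f(x, y)$$
$$\sigma_{XY} = \sum_x \sum_y (x - \mu_X)(y - \mu_Y) f(x, y)$$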
where the sums are taken over all the discrete values of X and Y.
The following are some important theorems on covariance.
1. $\sigma_{XY} = E(XY) - E(X)E(Y) = E(XY) - \mu_X \mu_Y$
2. $\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) \pm 2\,\mathrm{Cov}(X, Y)$, or $\sigma_{X \pm Y}^2 = \sigma_X^2 + \sigma_Y^2 \pm 2\sigma_{XY}$
3. $|\sigma_{XY}| \le \sigma_X \sigma_Y$
Correlation coefficient
If X and Y are independent, then Cov(X, Y) = σ_XY = 0. On the other hand, if X and Y are completely dependent, for example when X = Y, then Cov(X, Y) = σ_XY = σ_X σ_Y. From this we are led to a measure of the dependence of the variables X and Y given by
$$\rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$
which is a dimensionless quantity. We call ρ the correlation coefficient or coefficient of correlation. From the inequality |σ_XY| ≤ σ_X σ_Y we see that -1 ≤ ρ ≤ 1. In the case where ρ = 0 (i.e. the covariance is zero) we call the variables X and Y uncorrelated. In such a case, however, the variables may or may not be independent.
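As an illustration of these definitions, the covariance and correlation coefficient can be computed for the joint probability function of Example 5.1. This is a sketch only; the helper `expect` is a hypothetical name for the double sum that defines an expectation.

```python
from fractions import Fraction
from math import sqrt

xs, ys = range(3), range(4)
f = {(x, y): Fraction(2 * x + y, 42) for x in xs for y in ys}

def expect(g):
    """E[g(X, Y)] as a double sum over the joint probability function."""
    return sum(g(x, y) * p for (x, y), p in f.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov_xy = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# Shortcut formula: Cov(X, Y) = E(XY) - E(X)E(Y)
cov_shortcut = expect(lambda x, y: x * y) - mu_x * mu_y

# Correlation coefficient (a float, because of the square root)
rho = float(cov_xy) / sqrt(var_x * var_y)

print("mu_X =", mu_x, " mu_Y =", mu_y)
print("Cov(X, Y) =", cov_xy, " shortcut =", cov_shortcut)
print("rho =", round(rho, 4))
```

The covariance is negative but small, consistent with Example 5.3: X and Y are dependent, yet only weakly correlated.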
Exercise
X \ Y    -3     2      4      Sum
1        0.1    0.2    0.2    0.5
3        0.3    0.1    0.1    0.5
Sum      0.4    0.3    0.3    1
d. P(3 X 4, Y 2)
e. P( X 3)
f. P( X Y ) 4
g. The joint distribution function
h. Whether X and Y are independent
4. The joint probability mass function of two random variables X and Y is given by
$$f(x, y) = \begin{cases} k(2x + y) & x = 1, 2;\ y = 1, 2 \\ 0 & \text{otherwise} \end{cases}$$
where k is a constant.
a. What is the value of k?
b. Find the marginal probability mass functions of X and Y.
c. Are X and Y independent?
5. The joint probability mass function of two random variables X and Y is given by
$$f(x, y) = \begin{cases} \frac{1}{18}(2x + y) & x = 1, 2;\ y = 1, 2 \\ 0 & \text{otherwise} \end{cases}$$
a) What is f(y | x)?
b) What is f(x | y)?
Chapter six
Sampling distribution
Introduction
The main objective of statistical analysis is to learn the actual values of the parameters of a given population. One way of finding the parameters is to conduct a census. A census means complete enumeration of the entire population to determine the value of the parameter of interest. However, in most cases a census is not feasible in practice because of cost, time, labor and other constraints. As an alternative to a census, one can use a sampling approach to determine the same thing. Sampling is the process of selecting a sample from a population. That is, random samples of a given size are taken from the population, and the characteristics of these samples are analyzed to infer the characteristics of the population.
When random samples of a certain size are repeatedly drawn from a given population to determine a sample statistic, the computed value of the sample statistic (e.g. the sample mean) will differ from sample to sample. Since a sample statistic is based on a random sample of a certain size, it is a random variable and follows a probability distribution of its own, called its sampling distribution.
A sampling distribution has its own properties, which provide the rules for generalizing about a population on the basis of a sample drawn from it. In this chapter, we will study the properties of some statistics in some depth, together with widely used sampling distributions such as the t, F, and chi-square ($\chi^2$) distributions.
Objective of the chapter
After this chapter, the student will be able to:
- be familiar with concepts such as statistic, parameter and random variable
- define a sampling distribution
- identify the distribution of different sample statistics (sample mean, proportion, variance)
- describe the properties of the sampling distribution of a sample statistic
Random variable
A variable is a random variable if its value is determined by a random experiment. If a variable X is said to be a random variable, it represents a phenomenon of interest in which the observed outcome of an activity is determined entirely by chance. It is unpredictable and varies depending upon the particular outcome of the experiment. For example, suppose you toss a die and let X be the number observed on the upper face. The variable X can take any of the six values 1, 2, 3, 4, 5 and 6, depending on the random outcome of the experiment. Since the value of X cannot be determined before the experiment, X is a random variable. X can also count occurrences of an event, such as the number of telephone calls received at random during a given time.
Statistic
A statistic is a numerical descriptive measure calculated from a sample. In other words, it is a summary measure that describes a characteristic of a sample. In most cases, it refers to the sample mean and sample variance. If X1, X2, ..., Xn is a random sample, then
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$$
is called the sample mean and
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
is called the sample variance. The values of $\bar{X}$ and $S^2$ are statistics.
For example: Consider a population consisting of five observations: 3, 6, 9, 12, and 15. If a random sample of size n = 3 is selected without replacement, find the sample mean and sample variance (the statistics for the sample drawn).
Solution: Suppose the sample drawn from the population is 3, 6, 9. Then
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{3 + 6 + 9}{3} = 6$$
Xi     Xi - X̄    (Xi - X̄)^2
3      -3         9
6       0         0
9       3         9
Sum               18
$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{18}{2} = 9$$
$$S = \sqrt{9} = 3$$
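The same calculation can be reproduced with Python's standard `statistics` module, whose `variance` and `stdev` functions use the n - 1 divisor. This is a minimal sketch for the sample 3, 6, 9 used above.

```python
from statistics import mean, variance, stdev

sample = [3, 6, 9]

x_bar = mean(sample)     # sample mean
s2 = variance(sample)    # sample variance, divisor n - 1
s = stdev(sample)        # sample standard deviation

print("sample mean     =", x_bar)
print("sample variance =", s2)
print("sample std dev  =", s)
```

For a whole population, the corresponding functions are `pvariance` and `pstdev`, which divide by N instead of n - 1.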
Parameter
A parameter is a numerical descriptive measure that characterizes a population. In other words, it is a summary measure that describes a characteristic of the population. Since it is determined from observations on the whole population, the values of parameters are usually unknown in the case of a large population. Parameters include the population mean and variance, among others. The mean and the variance of the above population are parameters of that population. That is,
$$\mu = \frac{3 + 6 + 9 + 12 + 15}{5} = 9$$
is a parameter of the population, namely the population mean. Here we can determine the population parameter since the population under study is finite.
Sampling distribution
A sampling distribution provides the basis for determining the level of confidence or reliability with which a particular value of a sample statistic can be used as an estimate of a parameter. It also serves as the necessary ground for evaluating a hypothesis stated with reference to a parameter. Both of these processes require a clear understanding of the various sampling distributions and of the properties that relate a given sample statistic to the corresponding population parameter. Therefore, let us first describe what a sampling distribution means and then examine the properties of the sampling distributions of different statistics.
As stated in the introduction, sampling is used as an alternative to a census to determine the characteristics of a population. That is, a random sample of a given size is taken from a given population. For the population 3, 6, 9, 12, 15 considered above and samples of size n = 3 drawn without replacement, there are 10 possible samples, each equally likely to be drawn with probability 1/10. These samples, along with the calculated value of $\bar{X}$, are given as follows:
Sample   Sample values   X̄ (sample mean)
1        3, 6, 9         6
2        3, 6, 12        7
3        3, 6, 15        8
4        3, 9, 12        8
5        3, 9, 15        9
6        3, 12, 15       10
7        6, 9, 12        9
8        6, 9, 15        10
9        6, 12, 15       11
10       9, 12, 15       12
Suppose all possible samples of size n are drawn from a population of size N without replacement. For each sample we compute the mean, which gives $\binom{N}{n}$ sample means, and construct a frequency distribution of these means. For the population 2, 4, 6, 8, 10 and samples of size n = 2, the samples and their means are:
Samples   Mean X̄_i
2, 4      3
2, 6      4
2, 8      5
2, 10     6
4, 6      5
4, 8      6
4, 10     7
6, 8      7
6, 10     8
8, 10     9
Then organize these sample means into a frequency distribution and a probability distribution:
Sample mean   Frequency   Relative frequency   Probability
3             1           1/10                 0.1
4             1           1/10                 0.1
5             2           2/10                 0.2
6             2           2/10                 0.2
7             2           2/10                 0.2
8             1           1/10                 0.1
9             1           1/10                 0.1
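The enumeration behind these two tables can be sketched in Python. This is a minimal illustration that assumes the population is 2, 4, 6, 8, 10 with samples of size n = 2 drawn without replacement, as the listed samples indicate; the variable names are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10]
n = 2

# All C(5, 2) = 10 equally likely samples drawn without replacement
samples = list(combinations(population, n))
means = [Fraction(sum(s), n) for s in samples]

# Frequency and probability of each distinct sample mean
freq = Counter(means)
dist = {m: Fraction(k, len(samples)) for m, k in sorted(freq.items())}

for m, p in dist.items():
    print(f"sample mean = {m}   frequency = {freq[m]}   probability = {float(p)}")

# Mean of the sampling distribution compared with the population mean
expected_mean = sum(m * p for m, p in dist.items())
population_mean = Fraction(sum(population), len(population))
print("E(sample mean) =", expected_mean, "  population mean =", population_mean)
```

The last line compares the mean of the sampling distribution with the population mean; the two coincide for this population.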
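If $\bar{X}$ is the mean of a random sample of size n from a population with mean $\mu$ and variance $\sigma^2$, then $E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \sigma^2 / n$.
Proof
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n}$$
$$E(\bar{X}) = \frac{1}{n} E(X_1 + X_2 + \cdots + X_n) = \frac{1}{n}(n\mu) = \mu$$
$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n} X_1 + \frac{1}{n} X_2 + \cdots + \frac{1}{n} X_n\right) = \sum_{i=1}^{n} \frac{1}{n^2} \sigma_i^2 = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$
(The variance of a linear combination of independent random variables is the sum of the squared coefficients times the variances of the random variables.)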
Using the above example, we can illustrate that the population mean and the mean of the sampling distribution of sample means are equal.
Sample mean (X̄)   P(X̄)
3                  0.1
4                  0.1
5                  0.2
6                  0.2
7                  0.2
8                  0.1
9                  0.1
Then,
$$E(\bar{X}) = \sum \bar{X}\, P(\bar{X}) = 3(0.1) + 4(0.1) + 5(0.2) + 6(0.2) + 7(0.2) + 8(0.1) + 9(0.1) = 6$$
and the population mean is
$$\mu = \frac{\sum_{i=1}^{5} x_i}{N} = \frac{2 + 4 + 6 + 8 + 10}{5} = \frac{30}{5} = 6$$
Thus, if samples of n random and independent observations are repeatedly and independently drawn from a population, then as the number of samples becomes large, the mean of the sample means approaches the true population mean. Moreover, the variance of the sampling distribution of $\bar{X}$, $\sigma^2 / n$, decreases as the sample size n increases. This means that as the sample size becomes larger, the sampling distribution of $\bar{X}$ becomes more concentrated around the population mean. Thus, larger samples result in greater certainty about our inference on the population mean.
In statistics, the degree of precision or reliability of an estimator of a population parameter is measured by the standard error of the estimator. In this case, the precision with which the sample mean ($\bar{X}$) estimates the population mean is measured by the standard error of the sample mean.
The standard error of an estimator is the standard deviation of the statistic used as an estimator of a population parameter. Therefore, the standard deviation of $\bar{X}$ is referred to as the standard error of the mean. If the variance of $\bar{X}$ is denoted by $\sigma^2 / n$, then the corresponding standard error of $\bar{X}$ is given as
$$SE(\bar{X}) = \sqrt{\mathrm{Var}(\bar{X})} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$$
If the sample size n is not a small fraction of the population size N, then the individual sample members are not distributed independently of one another. Since a population member cannot be included more than once in a sample, the probability of a specific population member being the second observation depends on the member chosen as the first observation. Thus, the observations are not selected independently. In this case the variance of the sample mean is
$$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$$
where $\frac{N - n}{N - 1}$ is often called the finite population correction factor. The standard error is then
$$SE(\bar{X}) = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$
When N is large relative to the sample size n, $\frac{N - n}{N - 1}$ is approximately equal to 1 and $SE(\bar{X}) \approx \frac{\sigma}{\sqrt{n}}$.
As a general rule of thumb, the finite population correction factor is used if the sample size is more than 5 percent of the population.
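The effect of the correction factor can be sketched with a small helper. The function name `standard_error` and its optional argument `N` are hypothetical and introduced only for this illustration. As a check, it is applied to the population 2, 4, 6, 8, 10 with n = 2, for which the ten possible sample means (3, 4, 5, 5, 6, 6, 7, 7, 8, 9) can be enumerated directly.

```python
from math import sqrt
from statistics import pvariance

def standard_error(sigma, n, N=None):
    """Standard error of the sample mean.

    If the population size N is given, the finite population
    correction factor sqrt((N - n) / (N - 1)) is applied.
    """
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se

population = [2, 4, 6, 8, 10]
sigma = sqrt(pvariance(population))   # population standard deviation

se_uncorrected = standard_error(sigma, n=2)
se_corrected = standard_error(sigma, n=2, N=len(population))

# Direct check: variance of the 10 possible sample means of size 2
sample_means = [3, 4, 5, 6, 5, 6, 7, 7, 8, 9]
var_of_means = pvariance(sample_means)

print("SE without correction:", se_uncorrected)
print("SE with correction   :", se_corrected)
print("variance of the sample means:", var_of_means)
```

Here the sample is 40 percent of the population, so the correction matters: only the corrected standard error, squared, agrees with the variance of the enumerated sample means.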
We have now developed expressions for the mean and standard deviation of the sampling distribution of the sample mean $\bar{X}$. However, we also have to know the form of the distribution of $\bar{X}$ to make inferences about the population parameters. So let us describe the distribution of the sample mean.
If $\bar{X}$ is the mean of a random sample of size n from a normally distributed population with mean µ and variance σ², its sampling distribution is normal with mean µ and variance $\sigma^2 / n$, regardless of the sample size.
By the central limit theorem, if $\bar{X}$ is the mean of a random sample of size n from any population with mean µ and variance σ², then for sufficiently large n its sampling distribution is approximately normal with mean µ and variance $\sigma^2 / n$:
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
Thus, for sufficiently large samples the sampling distribution of the sample mean is approximately normal. How large must the sample size n be so that the normal distribution provides a good approximation for the sampling distribution of $\bar{X}$? The answer depends on the shape of the sampled population. The greater the skewness of the sampled population distribution, the larger the sample size must be before the normal distribution is an adequate approximation for the sampling distribution of $\bar{X}$. For most sampled populations, sample sizes of n ≥ 30 will suffice for the normal approximation to be reasonable.
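A small simulation can make this concrete. The sketch below is an illustration under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a strongly skewed distribution chosen here as an example), the sample size is n = 30, and the number of repeated samples (5000) and the random seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)   # fixed seed so that the run is repeatable

n = 30           # sample size
reps = 5000      # number of repeated samples (arbitrary choice)

# Skewed population: exponential with mean 1 and standard deviation 1
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)
theoretical_se = 1 / sqrt(n)     # sigma / sqrt(n) with sigma = 1

# Share of sample means within 1.96 standard errors of the population mean;
# under a normal sampling distribution this share would be about 0.95
within = sum(abs(m - 1.0) <= 1.96 * theoretical_se for m in sample_means) / reps

print("mean of sample means:", round(mean_of_means, 3))
print("sd of sample means  :", round(sd_of_means, 3))
print("sigma / sqrt(n)     :", round(theoretical_se, 3))
print("share within 1.96 SE:", round(within, 3))
```

Even though the population is far from normal, the sample means should centre on the population mean with a spread close to σ/√n, and roughly 95 percent of them should fall within 1.96 standard errors.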
Many estimators that are used to make inferences about population parameters are sums or averages of sample measurements. Therefore, we restate the central limit theorem in a form that enables us to make statistical statements about the population parameters based on the average values of sample measurements. Thus, if X1, X2, ..., Xn is a set of n independent random variables having identical distributions with mean µ and variance σ², then the random variable Z defined as
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
is approximately normally distributed with mean 0 and variance 1 as the sample size becomes large (n → ∞).
If we are able to convert the sample mean $\bar{X}$ into a standard normal variable, it is possible to describe the behavior of $\bar{X}$ by calculating the probability of observing certain values of $\bar{X}$ in repeated sampling.
Example: A soft drink vending machine is set so that the amount of drink dispensed is a random variable with a mean of 200 milliliters and a standard deviation of 15 milliliters. What is the probability that the mean amount dispensed in a random sample of size 36 is at least 204 milliliters?
Solution: The distribution of $\bar{X}$ has mean $\mu_{\bar{X}} = 200$ and standard error
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{36}} = 2.5$$
According to the central limit theorem, the sample mean is approximately normally distributed and can be converted to a standard normal variable as
$$Z = \frac{\bar{X} - \mu}{SE(\bar{X})} = \frac{204 - 200}{2.5} = 1.6$$
The probability that the sample mean is at least 204 is $P(\bar{X} \ge 204) = P(Z \ge 1.6)$.
From the standard normal (Z) table, P(Z > 1.6) = 0.0548. We conclude that the probability that the sample mean will be at least 204 milliliters is 0.0548.
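The table lookup in this example can be reproduced with `statistics.NormalDist` from the Python standard library. This is a minimal sketch of the same three steps: standard error, Z value, and upper-tail probability.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 200, 15, 36

se = sigma / sqrt(n)              # standard error of the sample mean
z = (204 - mu) / se               # standardised value of 204 ml
p = 1 - NormalDist().cdf(z)       # upper-tail area P(Z > z)

print("standard error =", se)
print("z =", z)
print("P(sample mean >= 204) =", round(p, 4))
```

`NormalDist()` with no arguments is the standard normal distribution, so `cdf(z)` plays the role of the Z-table.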
Exercise: A bulb manufacturer claims that the life of its bulbs is normally distributed with a mean of 36,000 hours and a standard deviation of 4,000 hours. A random sample of 16 bulbs had an average life of 34,500 hours. If the manufacturer's claim is correct, what is the probability that the sample mean is smaller than 34,500 hours?
The sample proportion P is defined as
$$P = \frac{X}{n}$$
where X is the number of successes (or the number of items in the sample with the characteristic we are interested in) and n is the sample size. Similarly, the population proportion $\pi$ can be defined as
$$\pi = \frac{X}{N}$$
where N is the population size and X is the number of successes in the population. A sample of size n is taken from the population, and the proportion of elements with the feature of interest is identified to determine P. Taking repeated samples of the same size and determining the frequency, and then the probability, of each value of the sample proportion gives the sampling distribution of the sample proportion.
Example: Suppose that we have a population of five students who are asked whether or not they want to become statisticians. The answers are given below.
Student   Answer
1         Yes (Y)
2         No (N)
3         No (N)
4         Yes (Y)
5         No (N)
The number of students who want to become statisticians in this population is two. Hence N = 5 and X = 2. The population proportion of students who want to become statisticians is
$$\pi = \frac{X}{N} = \frac{2}{5} = 0.4$$
or 40 percent. Now, let us take all possible samples of size 4 from this population of size 5 and compute, for each sample, the sample proportion (p) of students who want to become statisticians. The resulting sampling distribution is:
Sample proportion (p)   Frequency   Probability
0.5                     3           3/5 = 0.6
0.25                    2           2/5 = 0.4
Total                   5
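The sampling distribution of the sample proportion in this example can be obtained by listing every possible sample. The sketch below assumes the five answers given in the table (students 1 and 4 answer Yes) and samples of size 4 drawn without replacement; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Answers of the five students in the population
answers = {1: "Y", 2: "N", 3: "N", 4: "Y", 5: "N"}
n = 4

# All C(5, 4) = 5 possible samples of students
samples = list(combinations(answers, n))

# Sample proportion of "Yes" answers in each sample
props = [Fraction(sum(answers[s] == "Y" for s in sample), n) for sample in samples]

freq = Counter(props)
dist = {p: Fraction(k, len(samples)) for p, k in sorted(freq.items())}

for p, prob in dist.items():
    print(f"p = {float(p)}   frequency = {freq[p]}   probability = {float(prob)}")

# Mean of the sampling distribution compared with the population proportion
expected_p = sum(p * prob for p, prob in dist.items())
print("E(p) =", expected_p)
```

The mean of the sampling distribution equals the population proportion 2/5, which parallels the result for the sample mean.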
Exercise: Suppose that a manufacturer of bulbs produces 6 bulbs a week, of which 2 are defective. If a random sample of 4 bulbs is taken to determine the number of bulbs which are defective,
a. determine the number of samples that can be drawn using the formula $\binom{N}{n}$
b. construct the sampling distribution of the sample proportion (of defective bulbs).
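$$\sigma_P = \sqrt{\frac{\pi(1 - \pi)}{n}}$$
As with the variance of the sample mean, we can use the finite population correction factor when the population is not large compared to the sample size:
$$\mathrm{Var}(P) = \frac{\pi(1 - \pi)}{n} \cdot \frac{N - n}{N - 1}$$
$$\sigma_P = \sqrt{\frac{\pi(1 - \pi)}{n}} \sqrt{\frac{N - n}{N - 1}}$$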
Example: 45% of all graduate students pursuing their doctorate degree at Addis Ababa University are married. If a sample of 200 graduate students is selected at random, what is the probability that the proportion of married students in this sample would be between 40% and 48%?
Solution: The proportions of married students in repeated samples of 200 graduate students each would approximately follow the normal distribution with mean equal to the population proportion π = 0.45 and standard error $\sigma_P$:
$$\sigma_P = \sqrt{\frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{0.45(0.55)}{200}} = \sqrt{0.0012375} \approx 0.035$$
To find the probability that the proportion of married students in the sample of 200 would be between 40% and 48%, we must find the area under the normal curve between 0.40 and 0.48.
The area between 0.40 and 0.45 can be found after converting the value to a standard normal variable:
$$Z_1 = \frac{P_1 - \pi}{\sigma_P} = \frac{0.40 - 0.45}{0.035} = \frac{-0.05}{0.035} = -1.43$$
From the Z-table, the area between Z = -1.43 and Z = 0 equals 0.4236.
Similarly,
$$Z_2 = \frac{P_2 - \pi}{\sigma_P} = \frac{0.48 - 0.45}{0.035} = \frac{0.03}{0.035} = 0.86$$
The area from the table for Z = 0.86 is 0.3051.
Therefore, total area = 0.4236 + 0.3051 = 0.7287
P(0.40 < P < 0.48) = 0.7287
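The same probability can be computed without a Z-table. The sketch below uses `statistics.NormalDist` with the unrounded standard error, so its result may differ slightly in the third decimal place from the table-based answer, which rounds the standard error to 0.035 and the Z values to two decimals. The name `pi_` is used only because it is the symbol for the population proportion here.

```python
from math import sqrt
from statistics import NormalDist

pi_ = 0.45      # population proportion of married students
n = 200         # sample size

se = sqrt(pi_ * (1 - pi_) / n)              # standard error of the sample proportion
sampling_dist = NormalDist(mu=pi_, sigma=se)

# P(0.40 < P < 0.48) under the normal approximation
prob = sampling_dist.cdf(0.48) - sampling_dist.cdf(0.40)

print("standard error =", round(se, 4))
print("P(0.40 < P < 0.48) =", round(prob, 4))
```

Working with the distribution of P directly avoids the intermediate step of converting each bound to a Z value.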
Exercise: According to the Internal Revenue Service, 75% of all tax returns lead to a refund. Suppose a random sample of 100 tax returns is taken.
A. What is the mean of the sample proportion of returns leading to refunds?
B. What is the variance of the sample proportion?
C. What is the standard error of the sample proportion?
D. Determine the probability that the sample proportion exceeds 0.80.
deviation.
Here we use n - 1 to find the sample standard deviation for a random sample of n observations. This is because, once the sample mean has been computed, only n - 1 of the deviations from it can vary freely; the last one is then uniquely determined.
Given the above definition of the sample variance, let us define its mean and distribution.
The mean (the expected value) of the sample variance is equal to the population variance:
$$E(S^2) = \sigma^2$$
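Proof: Recall that
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
and that, from the chapter on expectation, $E(X_i^2) = \sigma^2 + \mu^2$ and $E(\bar{X}^2) = \frac{\sigma^2}{n} + \mu^2$. Expanding the sum of squares,
$$\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum X_i^2 - 2\bar{X} \sum X_i + n\bar{X}^2 = \sum X_i^2 - 2n\bar{X}^2 + n\bar{X}^2 = \sum X_i^2 - n\bar{X}^2$$
Taking expectations,
$$E\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right] = \sum_{i=1}^{n} E(X_i^2) - nE(\bar{X}^2) = n(\sigma^2 + \mu^2) - n\left(\frac{\sigma^2}{n} + \mu^2\right) = n\sigma^2 - \sigma^2 = (n - 1)\sigma^2$$
So
$$E(S^2) = E\left[\frac{1}{n - 1} \sum (X_i - \bar{X})^2\right] = \frac{1}{n - 1} E\left[\sum (X_i - \bar{X})^2\right] = \frac{1}{n - 1}(n - 1)\sigma^2 = \sigma^2$$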
This implies that the sample variance, $S^2$, is an unbiased estimator of the population variance, $\sigma^2$. This means that, in repeated sampling, the average of all the sample estimates will equal the target parameter, $\sigma^2$.
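Unbiasedness can be checked exactly on a small case. The sketch below is an illustration under assumptions chosen here: the population 2, 4, 6, 8, 10, samples of size n = 2, and independent draws (sampling with replacement), so that all 25 ordered samples are equally likely. It compares the average of S² over all samples with the population variance, for both the n - 1 and the n divisor; `sample_variance` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8, 10]
N, n = len(population), 2

mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N   # population variance

def sample_variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample) / divisor

# Independent draws: all N**n ordered samples are equally likely
samples = list(product(population, repeat=n))

mean_s2 = sum(sample_variance(s, n - 1) for s in samples) / len(samples)
mean_s2_biased = sum(sample_variance(s, n) for s in samples) / len(samples)

print("population variance       =", sigma2)
print("average S^2 (divisor n-1) =", mean_s2)
print("average S^2 (divisor n)   =", mean_s2_biased)
```

With the n - 1 divisor the average equals the population variance, while the n divisor underestimates it by the factor (n - 1)/n.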
As we have seen in the preceding topics, identifying the sampling distribution of a sample statistic is essential for making inferences about a population parameter. Therefore, let us identify the distribution of the sample variance.
Chi-square ($\chi_v^2$) distribution
When the population mean μ is not known, a particular sample mean $\bar{X}$ based on a random sample of size n may be used as an unbiased estimate of μ. Therefore, we can define $\chi_v^2$ as
$$\chi_v^2 = \sum \left(\frac{X_i - \bar{X}}{\sigma}\right)^2 = \frac{\sum (X_i - \bar{X})^2}{\sigma^2}$$
Since
$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} \quad \text{or} \quad \sum (X_i - \bar{X})^2 = (n - 1) S^2$$
we have
$$\chi_v^2 = \frac{(n - 1) S^2}{\sigma^2}$$
The chi-square distribution has many important applications. Some of its uses are:
- Test of independence of attributes
- Test of goodness of fit
- Test for the equality of population variances and test for homogeneity
The calculated value of $\chi^2$ is compared with the critical value at a particular level of significance.
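The chi-square distribution has several important mathematical properties. Some of them are the following.
1. If X1, X2, ..., Xn are independent random variables having standard normal distributions, then
$$Y = \sum_{i=1}^{n} X_i^2$$
has the chi-square distribution with v = n degrees of freedom. (When the deviations are measured from the sample mean, as in $(n - 1)S^2 / \sigma^2$, one degree of freedom is lost and v = n - 1.)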
of freedom.
3. The mean and variance of the chi-square distribution are equal to the number of degrees of freedom and twice the number of degrees of freedom, respectively:
$$E(\chi_v^2) = v \quad \text{and} \quad \mathrm{Var}(\chi_v^2) = 2v$$
where v is the degrees of freedom. That is, for $\chi^2 = (n - 1)S^2 / \sigma^2$,
$$E(\chi^2) = E\left[\frac{(n - 1) S^2}{\sigma^2}\right] = \frac{n - 1}{\sigma^2} E(S^2), \quad \text{but } E(S^2) = \sigma^2, \text{ so } E(\chi^2) = \frac{n - 1}{\sigma^2} \sigma^2 = n - 1$$
and
$$\mathrm{Var}(\chi^2) = \mathrm{Var}\left[\frac{(n - 1) S^2}{\sigma^2}\right] = 2(n - 1)$$
For many applications involving the population variance, we need to find values of the cumulative distribution of $\chi^2$, especially in the upper and lower tails of the distribution. To make inferences about the population variance, the calculated value of $\chi^2$ is compared with the tabulated value of $\chi^2$ for the given degrees of freedom. For convenience of interpretation, the $\chi^2$ values listed under any column headed by a specific value of α may be denoted $\chi_{\alpha, v}^2$. This means that the probability is α that a random sample of size n produces a $\chi^2$ value greater than the tabulated value $\chi_{\alpha, v}^2$ for d.f. v = n - 1.
For example, the tabulated $\chi^2$ value for v = n - 1 = 10 d.f. under the column heading α = 0.05 is $\chi_{0.05, 10}^2 = 18.3$. This means that the probability is α = 0.05 that the $\chi^2$ value computed from a sample of size n = 11 is greater than 18.3.
[Figure: density f(χ²) of the chi-square distribution with v = 10; the shaded area α = 0.05 to the right of the tabulated value 18.3 is the probability that a χ² value based on a sample of size n = 11 is greater than 18.3]
In symbols,
$$P(\chi_{10}^2 > \chi_{0.05, 10}^2) = P(\chi_{10}^2 > 18.3) = 0.05$$
More generally, for upper and lower critical values K_U and K_L,
P(χ² > K_U) = 0.05 (upper tail)
P(χ² < K_L) = 0.05 (lower tail)
and for v = 10,
P(χ²_10 < 3.94) = 0.05
P(χ²_10 > 18.31) = 0.05
That is, $\chi_{\alpha, v}^2$ is such that if X is a random variable having a chi-square distribution with v degrees of freedom, then
$$P(X > \chi_{\alpha, v}^2) = \alpha$$
Example
A cement manufacturer claims that concrete prepared from his product has a relatively
stable compressive strength and that strength measured in kilograms per square centimeter
variance equal to $\bar{X} = 312$ and $s^2 = 195$. Do these data present sufficient evidence to reject the manufacturer's claim that the population variance is equal to 100?
The claim of the manufacturer can be rejected if the calculated value of chi-square exceeds the critical value $\chi_{0.05, 9}^2 = 16.919$ from the table.
$$\chi^2 = \frac{(n - 1) s^2}{\sigma^2} = \frac{9(195)}{100} = \frac{1755}{100} = 17.55$$
Since the observed chi-square value, 17.55, is greater than the critical value, we can reject the manufacturer's claim.
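The comparison with the tabulated critical value can also be expressed as an upper-tail probability. The Python standard library has no chi-square distribution, so the sketch below computes the tail area from the series expansion of the regularized lower incomplete gamma function; the function name `chi2_upper_tail` is hypothetical, and the sample size n = 10 is inferred from the 9 degrees of freedom used in the solution.

```python
from math import exp, lgamma, log

def chi2_upper_tail(x, v, tol=1e-12, max_terms=10_000):
    """P(chi-square with v degrees of freedom > x).

    Uses the series for the regularized lower incomplete gamma
    function P(a, t) with a = v / 2 and t = x / 2.
    """
    if x <= 0:
        return 1.0
    a, t = v / 2, x / 2
    term = 1 / a
    total = term
    for k in range(1, max_terms):
        term *= t / (a + k)
        total += term
        if term < tol * total:
            break
    lower = total * exp(-t + a * log(t) - lgamma(a))
    return 1 - lower

n = 10                   # inferred from the 9 degrees of freedom
s2 = 195                 # sample variance
sigma2_claimed = 100     # variance under the manufacturer's claim

chi2_stat = (n - 1) * s2 / sigma2_claimed
p_value = chi2_upper_tail(chi2_stat, n - 1)
reject = p_value < 0.05

print("chi-square statistic =", chi2_stat)
print("upper-tail probability =", round(p_value, 4))
print("reject at the 5% level:", reject)
print("check, P(chi2_9 > 16.919) =", round(chi2_upper_tail(16.919, 9), 3))
```

An upper-tail probability below 0.05 corresponds to a statistic above the 5 percent critical value, so both ways of stating the decision agree.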
variances, $s_1^2 / s_2^2$. When independent random samples are drawn from two normal populations with equal variances, $\sigma_1^2 = \sigma_2^2$, then $s_1^2 / s_2^2$ has a probability distribution in repeated sampling that is termed the F-distribution.
The F-distribution is the sampling distribution of the ratio of two independent random variables with chi-square distributions, each divided by its respective degrees of freedom. If U and V are independent random variables having chi-square distributions with v1 and v2 degrees of freedom, then
$$F = \frac{U / v_1}{V / v_2} = \frac{\chi_{v_1}^2 / v_1}{\chi_{v_2}^2 / v_2}$$
is a random variable having the F-distribution, whose values vary with every set of two samples of size n1 and n2. Substituting $\chi^2 = (n - 1)s^2 / \sigma^2$ for each sample,
$$F = \frac{\dfrac{(n_1 - 1) s_1^2}{\sigma_1^2} \Big/ (n_1 - 1)}{\dfrac{(n_2 - 1) s_2^2}{\sigma_2^2} \Big/ (n_2 - 1)} = \frac{s_1^2\, \sigma_2^2}{s_2^2\, \sigma_1^2}$$
If $s_1^2$ and $s_2^2$ are the variances of independent random samples of size n1 and n2 from normal populations with variances $\sigma_1^2$ and $\sigma_2^2$, then this F has the F-distribution with v1 = n1 - 1 and v2 = n2 - 1 degrees of freedom.
The critical values of the F-distribution are tabulated, as for the chi-square and Z distributions. $F_{\alpha, v_1, v_2}$ denotes the value for which the area to its right under the curve of the F-distribution with v1 and v2 degrees of freedom is equal to α. That is, $F_{\alpha, v_1, v_2}$ is defined by $P(F > F_{\alpha, v_1, v_2}) = \alpha$.
Example: if v1 = 10, v2 = 20 and α = 0.05, then
F_{0.05, 10, 20} = 2.35
To test whether the variances of two populations are equal or not, compare the calculated value of F with the critical value of F.
Example: A research staff working for investors wanted to determine whether there is a difference in the variance of maturities between AA-rated and CC-rated industrial bonds. A random sample of AA-rated bonds resulted in a sample variance $s_X^2 = 123.35$ and an
1. The shape of the sampling distribution of the $\bar{X}$ and/or Z statistic depends on the shape of the population sampled. We can no longer assume that the distribution of $\bar{X}$ is approximately normal, because the central limit theorem ensures normality only for samples that are sufficiently large. However, the sampling distribution of the sample mean is normal if the sampled population is normal.
2. The population standard deviation $\sigma$ is almost always unknown. Even though it is possible to estimate the population standard deviation with the sample standard deviation s, it is a poor approximation of the population standard deviation when the sample size is small.
In the case where the population standard deviation is unknown, the standard normal statistic cannot be used. It is natural to replace the unknown $\sigma$ by the sample standard deviation, s. This gives a distribution called Student's t-distribution, after Gosset, who developed the probability distribution of the statistic
$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$
Given a random sample of n observations, with mean $\bar{X}$ and standard deviation s, from a normally distributed population with mean $\mu$, the random variable t follows the Student's t distribution with (n-1) degrees of freedom. The shape of the Student's t distribution is rather similar to that of the standard normal distribution. Both distributions have mean zero, and the probability density functions of both are symmetric about their mean. However, the density function of the Student's t distribution has a larger dispersion (variability) than the standard normal distribution. The actual amount of variability in the sampling distribution of t depends on the sample size n.
As the number of degrees of freedom increases (that is, as the sample size increases), the Student's t-distribution becomes increasingly similar to the standard normal distribution. This is intuitively reasonable and follows from the fact that for a large sample size, the sample standard deviation is a very precise estimator of the population standard deviation. In particular, the smaller the degrees of freedom associated with the t-statistic, the more variable its sampling distribution will be.
If $x_i$ are the n sample values drawn from a normal population with mean $\mu$ and variance $\sigma^2$, the standard normal random variable can be defined as $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$, which follows a normal distribution with mean 0 and variance 1. For the same n sample values, the square of
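$$t = \frac{Z}{\sqrt{y/(n-1)}}, \quad \text{where } Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \text{ and } y = \sum_{i=1}^{n}\left(\frac{X_i-\bar{X}}{\sigma}\right)^2$$
$$t = \frac{\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{1}{n-1}\sum\left(\dfrac{X_i-\bar{X}}{\sigma}\right)^2}} = \frac{\sqrt{n}\,(\bar{X}-\mu)}{\sqrt{\dfrac{\sum (X_i-\bar{X})^2}{n-1}}}$$
$$t = \frac{\bar{X}-\mu}{S/\sqrt{n}}$$
By the symmetry of the t-distribution, $t_{1-\alpha} = -t_{\alpha}$.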
[Figure: Student's t density curve with tail areas of $\alpha = 0.05$ below $-1.76$ and above $1.76$]
Chapter seven
Estimation
Introduction
In the previous chapters of this module, we dealt with the concepts that underlie statistical inference, namely probability and probability distributions. Statistical inference relates sample characteristics to population characteristics in order to draw conclusions about population parameters. This is necessary because examining the entire population to determine its characteristics is usually not possible due to constraints such as capital, time and other resources. Conclusions about an unknown population parameter are drawn from sample statistics through two processes: estimation and hypothesis testing. Estimation, which is the subject of this chapter, means estimating or predicting the value of a population parameter based on sample observations. Hypothesis testing, on the other hand, means making a decision about a preconceived (hypothesized) value of the parameter based on the sample statistic.
estimated by either the sample mean $\bar{X}$ or the sample median $X_{md}$. The sample mean $\bar{X}$, used to estimate the population mean $\mu$, represents an estimator. That is, an estimator of a population
In other words, an estimator is expressed as a function of random variables, whereas an estimate is a single number.
For example: if X1 = 10, X2 = 6, X3 = 5, X4 = 8 are samples taken from a population to
a given size drawn from the population whose parameter is to be estimated. For example, a particular sample mean $\bar{X}$ is a point estimate of the population mean.
Based on sample data, two numbers are calculated to form an interval within which the parameter is expected to lie.
An interval estimate expresses an estimate as an interval or range that most likely contains the value of the population parameter. For example, the statement that the average age of university campus students lies between, say, 16 and 24 years is an interval estimate, expressed as (16 yrs < $\mu$ < 24 yrs).
The advantage of an interval estimate is that it allows us to assign a definite probability that the interval contains the parameter being estimated. It also helps indicate the magnitude of error in estimation, which serves as a measure of how accurately and precisely a parameter has been estimated.
1. Unbiasedness
An estimator of a parameter is said to be unbiased if the mean (expected value) of its sampling distribution is equal to the true value of the parameter. If the mean of the sampling distribution is not equal to the parameter, the statistic (estimator) is said to be a biased estimator of the parameter. That is, a statistic $\hat{\theta}$ is an unbiased estimator of the parameter $\theta$ if and only if $E(\hat{\theta}) = \theta$. An unbiased estimator $\hat{\theta}$ will sometimes overestimate and at other times underestimate the parameter, but it follows from the notion of expectation that if the sampling procedure is repeated many times, then on average the value obtained for an unbiased estimator will equal the population parameter. Unbiasedness, however, does not mean that the estimate we get from any particular sample is equal to the population parameter, or even very close to $\theta$. Rather, if we could indefinitely draw random samples from the population, compute an estimate each time, and then average these estimates over all random samples, we would obtain the population parameter.
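Proof
1. $E(\bar{X}) = \mu$
Let us consider a sample of n observations taken from the population; then
$$\bar{X} = \frac{\sum x_i}{n} = \frac{X_1 + X_2 + \dots + X_n}{n}$$
$$\bar{X} = \frac{1}{n}x_1 + \frac{1}{n}x_2 + \dots + \frac{1}{n}x_n$$
$$E(\bar{X}) = E\left(\frac{1}{n}x_1\right) + E\left(\frac{1}{n}x_2\right) + \dots + E\left(\frac{1}{n}x_n\right)$$
$$E(\bar{X}) = \frac{1}{n}E(x_1) + \frac{1}{n}E(x_2) + \dots + \frac{1}{n}E(x_n)$$
$$E(\bar{X}) = \frac{1}{n}\mu + \frac{1}{n}\mu + \dots + \frac{1}{n}\mu = \frac{n\mu}{n}$$
$$E(\bar{X}) = \mu$$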
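$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$$
$$E(s^2) = E\left[\frac{1}{n-1}\sum (x_i - \bar{x})^2\right] = \frac{1}{n-1}\, E\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right]$$
Since $(x_i - \bar{x})^2 = x_i^2 - 2\bar{x}x_i + \bar{x}^2$,
$$E(s^2) = \frac{1}{n-1}\, E\left[\sum x_i^2 - 2\bar{x}\sum x_i + n\bar{x}^2\right], \quad \text{but } \sum x_i = n\bar{x}$$
$$= \frac{1}{n-1}\, E\left[\sum x_i^2 - 2n\bar{x}^2 + n\bar{x}^2\right]$$
$$= \frac{1}{n-1}\, E\left[\sum x_i^2 - n\bar{x}^2\right]$$
$$E(s^2) = \frac{1}{n-1}\left[\sum E(x_i^2) - n\,E(\bar{x}^2)\right] \quad (1)$$
For a single observation, $var(x_i) = E(x_i - \mu)^2 = E(x_i^2) - \mu^2 = \sigma^2$, so that
$$E(x_i^2) = \sigma^2 + \mu^2 \quad (2)$$
Similarly, $var(\bar{x}) = E(\bar{x} - \mu)^2$ and $var(\bar{x}) = \frac{\sigma^2}{n}$, so
$$\frac{\sigma^2}{n} = E(\bar{x} - \mu)^2 = E(\bar{x}^2 - 2\bar{x}\mu + \mu^2)$$
$$= E(\bar{x}^2) - E(2\bar{x}\mu) + E(\mu^2)$$
$$= E(\bar{x}^2) - 2\mu^2 + \mu^2$$
Thus $\frac{\sigma^2}{n} = E(\bar{x}^2) - \mu^2$, and
$$E(\bar{x}^2) = \frac{\sigma^2}{n} + \mu^2 \quad (3)$$
Substitute equations (2) and (3) in equation (1):
$$E(S^2) = \frac{1}{n-1}\left[n(\sigma^2 + \mu^2) - n\left(\frac{\sigma^2}{n} + \mu^2\right)\right]$$
$$= \frac{1}{n-1}\left[n\sigma^2 - \sigma^2\right] = \frac{(n-1)\sigma^2}{n-1}$$
$E(s^2) = \sigma^2$. This implies that the sample variance is an unbiased estimator of the population variance.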
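The result $E(s^2) = \sigma^2$ can be checked numerically by listing every possible sample from a small population. The sketch below is only an illustration: the three-value population (0, 3, 12 with equal probabilities), the sample size and the function name are assumptions chosen for the check, and sampling is taken to be with replacement.

```python
from fractions import Fraction
from itertools import product


def sample_variance(sample):
    """Sample variance with divisor n - 1, computed exactly with fractions."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / (n - 1)


population = [0, 3, 12]  # assumed illustrative population, equal probabilities
n = 3                    # assumed sample size, sampling with replacement

# Population mean and variance (divisor N, since this is the whole population).
mu = Fraction(sum(population), len(population))
sigma_squared = sum((x - mu) ** 2 for x in population) / len(population)

# All 27 equally likely samples; E(s^2) is the plain average of their variances.
samples = list(product(population, repeat=n))
expected_s2 = sum(sample_variance(s) for s in samples) / len(samples)

print("population variance:", sigma_squared)
print("E(s^2):", expected_s2)
```

Exact fractions are used instead of floating-point numbers so that the two quantities can be compared for equality without rounding error.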
Illustration: consider a population consisting of the measurements 0, 3 and 12, described by the probability distribution shown below, and the sampling distributions of the sample mean and sample median for a random sample of n = 3 measurements from this population.

X       0      3      12
P(X)    1/3    1/3    1/3

The sampling distributions of the sample mean $\bar{X}$ and the sample median m are:

$\bar{X}$       0      1      2      3      4      5      6      8      9      12
P($\bar{X}$)    1/27   3/27   3/27   1/27   3/27   6/27   3/27   3/27   3/27   1/27

M       0      3      12
P(M)    7/27   13/27  7/27

Show that $\bar{X}$ is an unbiased estimator of $\mu$ and that m is a biased estimator of $\mu$.
Solution: the expected value of a discrete random variable X is defined to be $E(X) = \sum x P(x)$. Here
$\mu = E(X) = \sum x P(x) = 5$
$E(\bar{X}) = \sum \bar{X}\, p(\bar{X})$
$E(\bar{X}) = 5 = \mu$. This implies that the sample mean is an unbiased estimator of the population mean.
To show that the sample median is a biased estimator of the population mean, let us find the expected value of the sample median.
$E(m) = \sum m\, p(m) = 4.56$. Since the expected value of the sample median is not the same as $\mu$, the sample median is a biased estimator of the population mean.
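The two expected values in this illustration can be reproduced by enumerating all 27 equally likely samples of size 3 drawn with replacement from the population 0, 3, 12. This is a minimal sketch; the variable names are chosen for the example only.

```python
from fractions import Fraction
from itertools import product
from statistics import median

population = [0, 3, 12]                        # each value has probability 1/3
samples = list(product(population, repeat=3))  # 27 equally likely samples
prob = Fraction(1, len(samples))

mu = Fraction(sum(population), len(population))

# Expected value of the sample mean and of the sample median over all samples.
expected_mean = sum(Fraction(sum(s), 3) * prob for s in samples)
expected_median = sum(median(s) * prob for s in samples)

print("mu =", mu)
print("E(sample mean) =", expected_mean)
print("E(sample median) =", expected_median, "=", round(float(expected_median), 2))
```

The enumeration gives an expected sample mean equal to $\mu$, while the expected sample median differs from $\mu$, which is what biased means here.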
Exercise-1
Consider the probability distribution shown below

X       0      1      4
P(X)    1/3    1/3    1/3

a) Find the sampling distribution of the sample mean for a random sample of n = 2 measurements from the distribution.
b) Show that the sample mean is an unbiased estimator of the population mean.
Exercise-2
If $X_1, X_2, \dots, X_n$ constitute a random sample from a normal population with mean $\mu$, show that $\frac{\sum (x_i - \mu)^2}{n}$ is an unbiased estimator of $\sigma^2$.
2. Efficiency
Unbiasedness only ensures that the sampling distribution of an estimator has a mean value equal to the parameter it is supposed to estimate. This is fine, but we also need to know how spread out the distribution of an estimator is. The degree of dispersion of an estimator is measured by its efficiency.
An estimator is said to be efficient if its value remains stable from sample to sample taken randomly from the same population. It is the estimator whose distribution is most closely concentrated about the population parameter being estimated. That is, it has minimum variance compared with other estimators of the population parameter. Such estimators are reliable and give more information about the population parameter.
Suppose there are several unbiased estimators of θ. Then among the unbiased estimators of the population parameter, the estimator with minimum variance is said to be the most efficient estimator. Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of θ based on the same number of sample observations. Then $\hat{\theta}_1$ is said to be more efficient than $\hat{\theta}_2$ if var($\hat{\theta}_1$) < var($\hat{\theta}_2$). This means that the distribution of $\hat{\theta}_1$ is more tightly centered about θ than the distribution of $\hat{\theta}_2$. That is, the sampling distribution of $\hat{\theta}_2$ is considerably more variable than the sampling distribution of $\hat{\theta}_1$.
To check whether a given unbiased estimator has the smallest variance or not, we check the following condition.
If $\hat{\theta}$ is an unbiased estimator of θ and
$$var(\hat{\theta}) = \frac{1}{n\, E\left[\left(\dfrac{\partial \ln f(x)}{\partial \theta}\right)^2\right]}$$
then $\hat{\theta}$ is a minimum variance unbiased estimator of θ, where f(x) is the value of the population probability density at x and n is the size of the random sample.
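Example: show that $\bar{x}$ is a minimum variance unbiased estimator of the mean µ of a normal population.
Solution: The normal density function for the random variable x is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \quad \text{for } -\infty < x < \infty$$
It follows that
$$\ln f(x) = -\ln\left(\sigma\sqrt{2\pi}\right) - \frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2$$
so that
$$\frac{\partial \ln f(x)}{\partial \mu} = \frac{1}{\sigma}\left(\frac{x-\mu}{\sigma}\right)$$
and hence
$$E\left[\left(\frac{\partial \ln f(x)}{\partial \mu}\right)^2\right] = \frac{1}{\sigma^2}\, E\left[\left(\frac{x-\mu}{\sigma}\right)^2\right] = \frac{1}{\sigma^2} \cdot 1 = \frac{1}{\sigma^2}$$
Thus
$$\frac{1}{n\, E\left[\left(\dfrac{\partial \ln f(x)}{\partial \mu}\right)^2\right]} = \frac{1}{n \cdot \dfrac{1}{\sigma^2}} = \frac{\sigma^2}{n}$$
Since $\bar{x}$ is unbiased and var($\bar{x}$) = $\sigma^2/n$, it follows that $\bar{x}$ is a minimum variance unbiased estimator of µ.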
The efficiency of estimators can also be determined by comparing the variances of estimators of the same parameter. If $\hat{\theta}_1$ and $\hat{\theta}_2$ are two unbiased estimators of the parameter θ and var($\hat{\theta}_1$) < var($\hat{\theta}_2$), then $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$. In addition, we use the ratio $\frac{var(\hat{\theta}_1)}{var(\hat{\theta}_2)}$ as a measure of the efficiency of $\hat{\theta}_2$ relative to $\hat{\theta}_1$.
One way of comparing estimators that are not necessarily unbiased is to compute the mean square error (MSE) of the estimator. If $\hat{\theta}$ is an estimator of θ, then the MSE of the estimator is defined as MSE($\hat{\theta}$) = $E[(\hat{\theta} - \theta)^2]$. It measures the dispersion around the true value of the parameter. The estimator with the least MSE is considered the estimator with the least variability around the population parameter being estimated.
Exercise
If $\bar{x}_1$ and $\bar{x}_2$ are the means of independent random samples of size $n_1$ and $n_2$ from a normal population with mean µ and variance $\sigma^2$, show that the variance of the unbiased estimator $\omega \bar{x}_1 + (1 - \omega)\bar{x}_2$ is a minimum when $\omega = \frac{n_1}{n_1 + n_2}$.
3. Consistency
Consistency refers to the effect of sample size on the accuracy of the estimator. A statistic is said to be a consistent estimator of the population parameter if it approaches the parameter as the sample size increases. In other words, for a large sample size n, the estimator will take on values that are very close to the parameter.
Let $\hat{\theta}$ be an estimator of θ based on a sample y1, y2, ..., yn of size n. Then $\hat{\theta}$ is a consistent estimator of θ if, for every $\epsilon > 0$, the probability that the absolute value of the difference between $\hat{\theta}$ and θ is greater than $\epsilon$ approaches zero as the sample size becomes large. This is often expressed as
$$P\left(|\hat{\theta} - \theta| > \epsilon\right) \to 0 \text{ as } n \to \infty$$
If $\hat{\theta}$ is an unbiased estimator of θ and var($\hat{\theta}$) $\to 0$ as $n \to \infty$, then $\hat{\theta}$ is a consistent estimator of θ.
Alternatively, a sufficient condition for consistency is that MSE($\hat{\theta}$) tends to zero as n increases indefinitely.
Example-1: show that for a random sample from a normal population, the sample variance $S^2$ is a consistent estimator of $\sigma^2$.
Solution: Since $S^2$ is an unbiased estimator of $\sigma^2$, let us show that var($S^2$) $\to 0$ as $n \to \infty$. In the previous chapter, we have shown that for a random sample from a normal population
$$var(S^2) = \frac{2\sigma^4}{n-1}$$
It follows that var($S^2$) $\to 0$ as $n \to \infty$. This shows that $S^2$ is a consistent estimator of the variance of a normal population.
Example-2: let $x_1, x_2, \dots, x_n$ be a random sample from a distribution with mean $\mu$ and
Solution: From elementary statistics, it is known that $E(\bar{X}) = \mu$ and var($\bar{X}$) = $\frac{\sigma^2}{n}$. Since
4. Sufficiency
An estimator is said to be sufficient if it uses all the information about the population parameter contained in the sample.
For example, the sample mean uses all the sample values in its computation, while the mode and median do not. Hence, the mean is a better estimator in this sense.
The statistic $\hat{\theta}$ is a sufficient estimator of the parameter θ if and only if for each value of the
a random variable. That is, the rth moment of y is $E(y^r)$. Thus, the method of moments estimation of a parameter θ is the process of relating a random variable to its expected value in the distribution of the sample random variable. If the population random variable has a known probability distribution with unknown parameter θ, the first moment of the variable X,
$$m_1 = E(X) = g(\theta)$$
is some function of the unknown parameter. The method of moments then generates an
k = 1, 2, 3, …
Thus, if a population has r parameters, the method of moments consists of solving the system of equations
$$m_k' = \mu_k', \quad k = 1, 2, \dots, r$$
for the r parameters, where $m_k'$ is the kth sample moment and $\mu_k'$ is the kth population moment.
Example: given a random sample of size n from a uniform population with $\beta = 1$, use the method of moments to obtain a formula for estimating the parameter $\alpha$.
The first sample moment is $m_1' = \bar{x}$ and the first population moment is $\mu_1' = \frac{\alpha + \beta}{2} = \frac{\alpha + 1}{2}$. Setting them equal gives
$$\bar{x} = \frac{\alpha + 1}{2}$$
and we can write the estimate of $\alpha$ as $\hat{\alpha} = 2\bar{x} - 1$.
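The method of moments estimate for this example, twice the sample mean minus one, is easy to compute. In the sketch below the sample values are hypothetical (the text gives no data), the function name is an assumption, and the population is taken to be uniform on the interval from $\alpha$ to 1.

```python
from statistics import mean


def alpha_method_of_moments(sample):
    """Method-of-moments estimate of alpha for a uniform(alpha, 1) population.

    Solves x_bar = (alpha + 1) / 2 for alpha.
    """
    return 2 * mean(sample) - 1


# Hypothetical observations, used only to illustrate the formula.
sample = [0.62, 0.71, 0.95, 0.80, 0.67, 0.88]
alpha_hat = alpha_method_of_moments(sample)

print(f"sample mean = {mean(sample):.4f}, alpha_hat = {alpha_hat:.4f}")
```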
Generally, if $x_1, x_2, \dots, x_n$ is a random sample of a random variable X whose probability distribution depends on unknown parameters $\theta_1, \theta_2, \dots, \theta_k$, the method of moments estimators for the parameters are given by setting sample moments equal to population moments and solving the resulting equations simultaneously.
$= p_X(x_1)\, p_X(x_2) \cdots p_X(x_n)$
$= L_X(\theta)$
As with all maximization and minimization problems, the value of the estimator that maximizes the likelihood function can be found by trial and error. Alternatively, the likelihood function can be maximized by the methods of calculus. The necessary condition for maximization of the likelihood function or its log is
$$\frac{\partial \ln L(\theta;\ \text{data})}{\partial \theta} = 0$$
This equation is called the likelihood equation.
The root of the likelihood equation gives the maximum likelihood estimator.
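$$L(\mu, \sigma^2) = \prod_{i=1}^{n} n(x_i;\ \mu, \sigma)$$
$$L(\mu, \sigma^2) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2\right]$$
Find the maximum likelihood estimates of these two parameters.
Solution: Take the natural logarithm of $L(\mu, \sigma^2)$; the log-likelihood of the sample becomes:
$$\ln L(\mu, \sigma^2) = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$
To maximize the likelihood function, take the partial derivatives of the log-likelihood with respect to $\mu$ and $\sigma^2$:
$$\frac{\partial \ln L(\mu, \sigma^2)}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) \quad (1)$$
$$\frac{\partial \ln L(\mu, \sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{\sum (x_i - \mu)^2}{2(\sigma^2)^2} \quad (2)$$
Setting these two partial derivatives equal to zero gives the solution
$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}$$
Substituting this value in the second equation gives
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
This implies that $\bar{x}$ is the maximum likelihood estimator of $\mu$ and $\hat{\sigma}^2$ is the maximum likelihood estimator of $\sigma^2$.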
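The closed-form maximum likelihood estimates for a normal sample can be computed directly, which also makes visible that the variance estimate divides by n rather than n - 1. The data below are hypothetical and the function name is an assumption made for this sketch.

```python
def normal_mle(sample):
    """Maximum likelihood estimates (mu_hat, sigma2_hat) for a normal sample.

    mu_hat is the sample mean; sigma2_hat divides the sum of squared
    deviations by n, not by n - 1.
    """
    n = len(sample)
    mu_hat = sum(sample) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in sample) / n
    return mu_hat, sigma2_hat


# Hypothetical observations, used only to illustrate the formulas.
sample = [4.0, 6.0, 5.0, 9.0, 6.0]
mu_hat, sigma2_hat = normal_mle(sample)

# For comparison: the unbiased sample variance with divisor n - 1.
s_squared = sum((x - mu_hat) ** 2 for x in sample) / (len(sample) - 1)

print(f"mu_hat = {mu_hat}, sigma2_hat = {sigma2_hat}, s^2 = {s_squared}")
```

The maximum likelihood variance estimate is always a little smaller than the unbiased sample variance; the difference shrinks as n grows.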
$$L(\theta;\ X_1, \dots, X_n) = \prod_{i=1}^{n} f(X_i;\ \theta) = \left(\frac{1}{\theta}\right)^{n} e^{-\frac{1}{\theta}\sum_{i=1}^{n} x_i}$$
Take the natural log of both sides and differentiate with respect to $\theta$:
$$\ln L(\theta;\ X_i) = -n \ln \theta - \frac{1}{\theta}\sum_{i=1}^{n} x_i$$
$$\frac{d \ln L}{d\theta} = -\frac{n}{\theta} + \frac{1}{\theta^2}\sum X_i$$
Equating this derivative to zero and solving for $\theta$, we get the maximum likelihood estimate
$$\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{X}$$
Hence, the maximum likelihood estimator is $\hat{\theta} = \bar{X}$.
Exercise
Let X be a gamma random variable with parameters r and $\lambda$. Assume $X_1, X_2, \dots, X_n$ is a random sample of X and the likelihood function for the sample is
$$L_X(r, \lambda) = \prod_{i=1}^{n} f_X(x_i) = \prod_{i=1}^{n} \frac{\lambda^{r} x_i^{\,r-1}}{\Gamma(r)}\, e^{-\lambda x_i}$$
Find the maximum likelihood estimators of the population parameters.
7.3.3. Least squares estimation method
We have studied the method of moments and the maximum likelihood method for estimating unknown population parameters. Under this topic, we study a third method of estimating population parameters, called least squares estimation. It is especially applicable to models that involve the values of two or more variables. The least squares method is used to estimate unknown parameters in the assumed relationship between the variables using regression analysis. This analysis, however, is concerned with what is known as statistical dependence between variables, not a functional or deterministic relationship. In a statistical relationship between variables, we deal with random or stochastic variables, that is, variables that have probability distributions.
For example, the dependence of crop yield on temperature, rainfall and other weather conditions is statistical in nature. That is, the explanatory variables, although certainly important, will not enable an agronomist to predict crop yield exactly, because of the errors involved in measuring the variables as well as other factors (variables) that collectively affect the yield but are difficult to identify individually. Variables having such a relationship can be expressed as
$$Y_i = \beta_1 + \beta_2 X_i + u_i$$
$= \hat{Y}_i + \hat{u}_i$
$\hat{u}_i = Y_i - \hat{Y}_i$
$= Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$
Now, given n pairs of observations on Y and X, the least squares estimate of Y is determined in such a manner that it is as close as possible to the actual value. That is, we look for the estimators that minimize the sum of the residuals. However, the sum of the residuals equals zero even though the individual error terms are scattered. To avoid this problem, the sum of squared residuals is minimized instead when we use the least squares method of estimation. Therefore, the method of least squares provides us with unique estimates of $\beta_1$ and $\beta_2$ that give the smallest possible value of $\sum \hat{u}_i^2$. This
$$\min \sum \hat{u}_i^2 = \sum (Y_i - \hat{Y}_i)^2 = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2$$
Take the partial derivatives of the minimization problem with respect to $\hat{\beta}_1$ and $\hat{\beta}_2$ and set them equal to zero:
$$\frac{\partial \left(\sum \hat{u}_i^2\right)}{\partial \hat{\beta}_1} = -2\sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i) = 0$$
$$\bar{Y} = \hat{\beta}_1 + \hat{\beta}_2 \bar{X}$$
$$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$$
$$\frac{\partial \left(\sum \hat{u}_i^2\right)}{\partial \hat{\beta}_2} = -2\sum X_i (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i) = 0$$
$$\hat{\beta}_2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$
$$\hat{\beta}_2 = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2}$$
Example: Given the following data, determine the estimated regression line that relates the two variables X and Y.
Y 3.5 4.3 5.2 5.8 6.4 7.3 7.2 7.5 7.8 8.3
X 6 8 9 12 10 15 17 20 18 24
Solution:

Y       X      XY      X²     Y²
3.5     6      21.0    36     12.25
4.3     8      34.4    64     18.49
5.2     9      46.8    81     27.04
5.8     12     69.6    144    33.64
6.4     10     64.0    100    40.96
7.3     15     109.5   225    53.29
7.2     17     122.4   289    51.84
7.5     20     150.0   400    56.25
7.8     18     140.4   324    60.84
8.3     24     199.2   576    68.89
Total
63.3    139    957.3   2239   423.49

$$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$$
$$\hat{\beta}_2 = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} = \frac{10(957.3) - (139)(63.3)}{10(2239) - (139)^2} = \frac{774.3}{3069} = 0.2523 \approx 0.252$$
$$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} = \frac{63.3}{10} - \left(\frac{139}{10}\right)(0.2523) = 2.823$$
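The least squares estimates for this example can be reproduced from the ten (X, Y) pairs given in the data rows above. The following is a minimal sketch using the sum formulas; the function name is chosen for the example and is not part of the module.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line y = b1 + b2 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


# Data from the example.
Y = [3.5, 4.3, 5.2, 5.8, 6.4, 7.3, 7.2, 7.5, 7.8, 8.3]
X = [6, 8, 9, 12, 10, 15, 17, 20, 18, 24]

b1, b2 = least_squares_line(X, Y)
print(f"intercept b1 = {b1:.3f}, slope b2 = {b2:.3f}")
```

Computing the intercept from the unrounded slope avoids the small discrepancy that arises when the slope is first rounded to three decimals.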
value of $\theta$.
For some specified value of $(1-\alpha)$, the interval $L_1 \le \theta \le L_2$ is a $(1-\alpha)100\%$ confidence interval that contains the unknown value of $\theta$. The end points $L_1$ and $L_2$ are called the lower and upper confidence limits. This leads to our saying that we are $100(1-\alpha)$ percent confident that the interval contains the true population parameter value.
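size n from a normal population with mean $\mu$ and variance $\sigma^2$ is a normal distribution with $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$. Then, transforming the sampling distribution of sample means into the standard normal distribution as
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
$$Z\frac{\sigma}{\sqrt{n}} = \bar{X} - \mu \quad \Rightarrow \quad \mu = \bar{X} - Z\frac{\sigma}{\sqrt{n}}$$
Since $\mu$ falls within a range of values equidistant from $\bar{X}$, the interval estimate of the population mean with a normal distribution equals the sample mean plus or minus its standard error times the table value of Z for the indicated level of significance.
$$\bar{X} \pm Z\frac{\sigma}{\sqrt{n}}$$
If the mean of a random sample of size n from a normal population with known variance $\sigma^2$ is to be used as an estimator of the mean of the population, the probability is $1-\alpha$ that the error will be less than $Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$,
i.e. $P\left(|\bar{x} - \mu| < Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$, or
$$P\left(-Z_{\alpha/2} < Z < Z_{\alpha/2}\right) = 1 - \alpha$$
[Figure: standard normal curve with central area $1-\alpha$ between $-Z_{\alpha/2}$ and $Z_{\alpha/2}$]
$$P\left(-Z_{\alpha/2} < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < Z_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(-Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{x} - \mu < Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
$$P\left(-\bar{x} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < -\mu < -\bar{x} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
$$P\left(\bar{x} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
where $\alpha$ is the acceptable error level, $Z_{\alpha/2}$ is the Z value that leaves an area of $\alpha/2$ in each of the right and left tails of the standard normal distribution, and $(1-\alpha)$ is the level of confidence.
For a 95% level of confidence, $Z_{0.025} = 1.96$ and
$$P\left(-\frac{1.96\,\sigma}{\sqrt{n}} < \bar{x} - \mu < \frac{1.96\,\sigma}{\sqrt{n}}\right) = 0.95$$
$$P\left(\bar{x} - \frac{1.96\,\sigma}{\sqrt{n}} < \mu < \bar{x} + \frac{1.96\,\sigma}{\sqrt{n}}\right) = 0.95$$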
Example 2
Suppose that shopping times for customers at a local grocery store are normally distributed. A random sample of 16 shoppers in the local grocery store had a mean time of 25 minutes. Assume $\sigma = 6$ minutes. Find a 95% confidence interval for the population mean.
Solution: The 95% confidence interval for the population mean is given as
$$\bar{x} - \frac{1.96\,\sigma}{\sqrt{n}} < \mu < \bar{x} + \frac{1.96\,\sigma}{\sqrt{n}}$$
$$25 - \frac{1.96(6)}{\sqrt{16}} < \mu < 25 + \frac{1.96(6)}{\sqrt{16}}$$
$$25 \pm 2.94$$
$$22.06 < \mu < 27.94$$
This result can be interpreted as follows: based on 16 observations, we are 95% confident that the unknown population mean falls between about 22 minutes and 28 minutes.
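The interval in this example can be verified with a few lines of Python. This is a sketch under the example's assumptions (known $\sigma$ = 6, n = 16, table value 1.96); the function name is hypothetical.

```python
from math import sqrt


def z_confidence_interval(sample_mean, sigma, n, z):
    """Confidence interval for a mean when the population sigma is known."""
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Shopping-time example: x_bar = 25, sigma = 6, n = 16, Z_{0.025} = 1.96.
lower, upper = z_confidence_interval(25, 6, 16, 1.96)
print(f"95% CI: {lower:.2f} < mu < {upper:.2f}")
```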
Exercise: unoccupied seats on flights cause airlines to lose revenue. Suppose a large airline wants to estimate its average number of unoccupied seats per flight over the past year. To accomplish this, the records of 225 flights are randomly selected, and the number of unoccupied seats is noted for each of the sampled flights. The sample mean and standard deviation are given as follows:
$\bar{x}$ = 11.6 seats, s = 4.1 seats.
Estimate $\mu$, the mean number of unoccupied seats per flight during the past year, using a 90% confidence interval.
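Therefore, if $\bar{x}$ and s are the values of the mean and standard deviation of a random sample of size n from a normal population, then
$$\bar{x} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}$$
is a $(1-\alpha)100\%$ confidence interval for the mean of a population with unknown standard deviation and small sample size.
In other words, if $x_1, x_2, \dots, x_n$ is a random sample of a normal random variable with mean $\mu$ and variance $\sigma^2$, then the interval $(L_1, L_2)$ defined by
$$L_1 = \bar{x} - \frac{S\, t_{\alpha/2,\,n-1}}{\sqrt{n}} \quad \text{and} \quad L_2 = \bar{x} + \frac{S\, t_{\alpha/2,\,n-1}}{\sqrt{n}}$$
is a $100(1-\alpha)\%$ confidence interval for $\mu$, or it can be expressed as
$$P\left(-t_{\alpha/2} < t < t_{\alpha/2}\right) = 1 - \alpha, \quad \text{where } t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
$$P\left(\bar{x} - t_{\alpha/2}\frac{S}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{S}{\sqrt{n}}\right) = 1 - \alpha$$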
Example: Gasoline prices rose drastically during the early years of this century. Suppose that a recent study was conducted in which truck drivers with equivalent years of experience test-ran 24 trucks of a particular model over the same highway. Estimate the population mean fuel consumption for this model of truck with 90% confidence if the fuel consumption, in miles per gallon, for the 24 trucks was:
15.5 21.0 18.5 19.3 19.7 16.9 20.2 14.5
16.5 19.2 18.7 18.2 18.0 17.5 18.5 20.5
18.6 19.1 19.8 18.0 19.8 18.2 20.3 21.8
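One way to work this example is to compute the sample mean and standard deviation from the 24 readings and apply the t interval with 23 degrees of freedom. The sketch below assumes the table value $t_{0.05,\,23} = 1.714$, which is a standard t-table entry and is not quoted in the text; the variable names are chosen for the example.

```python
from math import sqrt
from statistics import mean, stdev

# Fuel consumption (miles per gallon) for the 24 trucks in the example.
mpg = [
    15.5, 21.0, 18.5, 19.3, 19.7, 16.9, 20.2, 14.5,
    16.5, 19.2, 18.7, 18.2, 18.0, 17.5, 18.5, 20.5,
    18.6, 19.1, 19.8, 18.0, 19.8, 18.2, 20.3, 21.8,
]

T_CRITICAL = 1.714  # assumed table value t_{0.05, 23} for a 90% interval

n = len(mpg)
x_bar = mean(mpg)
s = stdev(mpg)  # sample standard deviation (divisor n - 1)
margin = T_CRITICAL * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"n = {n}, mean = {x_bar:.3f}, s = {s:.3f}")
print(f"90% CI: {lower:.2f} < mu < {upper:.2f}")
```

The t value is used rather than the Z value because the population standard deviation is unknown and the sample is small.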
Exercise
1. A random sample of 64 sales invoices was taken from a large population of sales invoices. The average value was found to be Birr 2000 with a standard deviation of Birr 540. Find a 90% confidence interval for the true mean value of all the sales.
2. The quality control manager at a factory manufacturing light bulbs wants to estimate the average life of a large shipment of light bulbs. The standard deviation is known to be 100 hours. A random sample of 50 light bulbs gave a sample average life of 350 hours. Determine a 95% confidence interval estimate of the true average life of light bulbs in the shipment.
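$\bar{x}_1 = \frac{\sum x_i}{n_1}$ and $\bar{x}_2 = \frac{\sum x_i}{n_2}$, and take the difference $\bar{x}_1 - \bar{x}_2$.
So the desired confidence interval for $\mu_1 - \mu_2$ can be obtained in terms of the sampling distribution of $\bar{x}_1 - \bar{x}_2$, provided that the two populations are approximately normal, or the sample sizes $n_1$ and $n_2$ are both greater than or equal to 30. When either of these conditions is satisfied, the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normal with mean $\mu_1 - \mu_2$ and standard deviation $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$, so that
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
has a standard normal distribution. If the variances of the two populations are known, the probability that Z takes a value between $-Z_{\alpha/2}$ and $Z_{\alpha/2}$ is $1-\alpha$. That is,
$$P\left(-Z_{\alpha/2} < Z < Z_{\alpha/2}\right) = 1 - \alpha$$
$$P\left[(\bar{x}_1 - \bar{x}_2) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right] = 1 - \alpha$$
Thus a $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by the double inequality
$$(\bar{x}_1 - \bar{x}_2) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$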
Example: The strength of the wire produced by company A has a mean of 4,500 kg and a standard deviation of 200 kg. The wire of company B has a mean of 4,000 kg and a standard deviation of 300 kg. A sample of 50 wires of company A and 100 wires of company B is selected at random for testing the strength. Find 99 percent confidence limits for the difference in average strength of the populations of wires produced by the two companies.
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{40{,}000}{50} + \frac{90{,}000}{100}} = 41.23$$
The interval estimate of the difference between the two population means is expressed as:
$$(\bar{x}_1 - \bar{x}_2) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
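The numerical limits for the wire example can be obtained as follows. This sketch assumes the table value $Z_{0.005} = 2.576$ for 99 percent confidence (some tables round it to 2.58), which is not quoted in the text; the function name is hypothetical.

```python
from math import sqrt


def diff_means_ci(mean1, mean2, sigma1, sigma2, n1, n2, z):
    """Confidence interval for mu1 - mu2 with known standard deviations.

    Returns (lower, upper, standard_error).
    """
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se, se


# Company A: mean 4500, sd 200, n = 50; Company B: mean 4000, sd 300, n = 100.
Z_CRITICAL = 2.576  # assumed table value Z_{0.005}
lower, upper, se = diff_means_ci(4500, 4000, 200, 300, 50, 100, Z_CRITICAL)

print(f"standard error = {se:.2f}")
print(f"99% CI: {lower:.1f} < mu1 - mu2 < {upper:.1f}")
```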
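If the variances of the two populations are unknown, the population standard deviations are estimated by the sample standard deviations. Under such conditions, the confidence interval for the difference between the two means is obtained using the sample standard deviations $S_1$ and $S_2$ in place of $\sigma_1$ and $\sigma_2$ respectively. Thus, we have
$$(\bar{x}_1 - \bar{x}_2) - Z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + Z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$
When the standard deviations of the two populations are unknown and the sample sizes $n_1$ and $n_2$ are both small, the desired confidence interval is obtained by using the t-distribution, provided that the two populations are approximately normal.
$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\, S_P\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\, S_P\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where $S_P = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}}$ is the pooled standard deviation and t has $n_1 + n_2 - 2$ degrees of freedom.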
Example: A study has been made to compare the nicotine contents of two brands of cigarettes. Ten cigarettes of Brand A had an average nicotine content of 3.1 milligrams with a standard deviation of 0.5 milligram, while eight cigarettes of Brand B had an average nicotine content of 2.7 milligrams with a standard deviation of 0.7 milligram. Assuming the two sets of data are independent random samples from normal populations with equal variance, construct a 95% confidence interval for the difference between the mean nicotine contents of the two brands of cigarettes.
Solution: substituting $n_1 = 10$, $n_2 = 8$, $S_1 = 0.5$ and $S_2 = 0.7$ into the formula for $S_P$, we get
$$S_P = \sqrt{\frac{9(0.25) + 7(0.49)}{16}} = 0.596$$
$\bar{x}_1 = 3.1$, $\bar{x}_2 = 2.7$ and $t_{0.025,\,16} = 2.120$
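The remaining step, substituting these values into the pooled t interval, can be sketched in Python. The function name is an assumption; the inputs are those of the nicotine example, with 10 + 8 - 2 = 16 degrees of freedom and the table value 2.120.

```python
from math import sqrt


def pooled_t_interval(mean1, mean2, s1, s2, n1, n2, t_crit):
    """Small-sample CI for mu1 - mu2 assuming equal population variances.

    Returns (lower, upper, pooled_sd).
    """
    sp = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin, sp


# Brand A: mean 3.1, s = 0.5, n = 10; Brand B: mean 2.7, s = 0.7, n = 8.
lower, upper, sp = pooled_t_interval(3.1, 2.7, 0.5, 0.7, 10, 8, 2.120)

print(f"pooled sd = {sp:.3f}")
print(f"95% CI: {lower:.2f} < mu1 - mu2 < {upper:.2f}")
```

Since the resulting interval includes zero, these data alone do not establish a difference between the mean nicotine contents of the two brands.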
Exercise:
A study of two types of photocopying equipment shows that 61 failures of the first kind of equipment took on average 80.7 minutes to repair with a standard deviation of 19.4 minutes, whereas 61 failures of the second kind of equipment took on average 88.1 minutes to repair with a standard deviation of 18.8 minutes. Find the 99 percent confidence interval for the difference between the true average times it takes to repair failures of the two kinds of photocopying equipment.
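$$Z = \frac{P - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}}$$
approximates a standard normal distribution.
This result can be used to construct confidence intervals for the population proportion.
$$P\left[-Z_{\alpha/2} < Z < Z_{\alpha/2}\right] = 1 - \alpha$$
$$P\left[P - Z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}} < \pi < P + Z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}}\right] = 1 - \alpha$$
Let P denote the observed proportion of successes in a random sample of n observations from a population with a proportion $\pi$ of successes. Then, if n is large enough that $n\,\pi\,(1-\pi) > 9$, a $100(1-\alpha)\%$ confidence interval for the population proportion is given by
$$P - Z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}} < \pi < P + Z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}}$$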
Example: In a random sample, 136 of 400 persons given a five vaccine experienced some
discomfort. Construct a 95% confidence interval for the true proportion of persons who will
experience some discomfort from the vaccine.
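Solution:
$n = 400$, $p = \dfrac{136}{400} = 0.34$, $Z_{0.025} = 1.96$
$p - Z_{\alpha/2}\sqrt{\dfrac{p(1 - p)}{n}} < \pi < p + Z_{\alpha/2}\sqrt{\dfrac{p(1 - p)}{n}}$
$0.34 - 1.96\sqrt{\dfrac{(0.34)(0.66)}{400}} < \pi < 0.34 + 1.96\sqrt{\dfrac{(0.34)(0.66)}{400}}$
$0.294 < \pi < 0.386$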
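The same calculation can be checked numerically. The sketch below is one possible way to do it in Python, assuming the large-sample normal approximation used above; `proportion_interval` is an illustrative name, and the critical value is taken from the standard library's `statistics.NormalDist` rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p = successes / n                                   # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z value for alpha/2
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

lower, upper = proportion_interval(136, 400)
print(f"{lower:.3f} < proportion < {upper:.3f}")
```

With 136 successes out of 400, this reproduces the interval of roughly 0.294 to 0.386 obtained by hand.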
Exercise: Suppose we want to estimate the proportion of families in a town that have two
or more children. A random sample of 144 families shows that 48 families have two or
more children. Construct a 95% confidence interval for the population proportion of families having
two or more children.
proportions $p_1 = \dfrac{x_1}{n_1}$ and $p_2 = \dfrac{x_2}{n_2}$ based on two independent random samples of sizes $n_1$ and
$n_2$ drawn from populations with proportions $P_1$ and $P_2$, respectively. The desired confidence
interval for $P_1 - P_2$ may be obtained by using the sampling distribution of $p_1 - p_2$, which is
approximately normal with mean $P_1 - P_2$ and standard deviation estimated by
$\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$, where $q_1 = 1 - p_1$ and $q_2 = 1 - p_2$.
Defining the standard normal variate Z as
$Z = \dfrac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}}$
We make the claim that
$P\left(-Z_{\alpha/2} < Z < Z_{\alpha/2}\right) = 1 - \alpha$
Substituting for Z and solving for $(P_1 - P_2)$ in exactly the same way as we did for the
difference between means, we have
$P\left[(p_1 - p_2) - Z_{\alpha/2}\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}} < P_1 - P_2 < (p_1 - p_2) + Z_{\alpha/2}\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}\right] = 1 - \alpha$
Since
$Z = \dfrac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\dfrac{P_1(1 - P_1)}{n_1} + \dfrac{P_2(1 - P_2)}{n_2}}}$
is a random variable having approximately the standard normal distribution, a $(1 - \alpha)100\%$
confidence interval for $P_1 - P_2$ is
$(p_1 - p_2) - Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} < P_1 - P_2 < (p_1 - p_2) + Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$
Example: During a presidential election year, many forecasts are made to determine how
voters perceive a particular candidate. In a random sample of 120 registered voters in region
A, 107 support the candidate in question. In an independent random sample of 141
registered voters in region B, only 73 support the same candidate. If the respective
population proportions are denoted $P_A$ and $P_B$, find a 95% confidence interval for the
population proportion difference $(P_A - P_B)$.
Solution:
$n_A = 120$, $p_A = \dfrac{107}{120} = 0.892$; $n_B = 141$, $p_B = \dfrac{73}{141} = 0.518$
For a 95% confidence interval, $\alpha = 0.05$ and $Z_{\alpha/2} = Z_{0.025} = 1.96$.
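Substituting these values into the interval for the difference of two proportions completes the example. A minimal Python sketch of that substitution is given below, assuming the large-sample normal approximation; the function name `two_proportion_interval` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # estimated standard error
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Region A: 107 of 120 voters; region B: 73 of 141 voters
lower, upper = two_proportion_interval(107, 120, 73, 141)
print(f"{lower:.3f} < P_A - P_B < {upper:.3f}")
```

Under these assumptions the interval comes out at roughly 0.27 to 0.47. Since it lies entirely above zero, the samples suggest greater support for the candidate in region A.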
Exercise: In a study of the relationship between birth order and college success, an
investigator found that 126 in a sample of 180 college graduates were first-born children. In
a sample of 100 non-graduates of comparable age and socioeconomic background, the
number of first-born children was 54. Estimate the difference between the proportions of
first-born children in the two populations from which these samples were drawn. Use a 90%
confidence interval and interpret your results.
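It follows that the estimator $S^2$ is used to define a random variable $\chi^2$ such that
$\chi^2 = \dfrac{(n - 1)S^2}{\sigma^2}$, which follows a chi-square distribution with n-1 degrees of freedom.
We use this distribution to estimate the population variance in interval form. Given a random
sample of size n from a normal population, we can obtain a $(1 - \alpha)100\%$ confidence interval
for $\sigma^2$ by using a chi-square distribution as follows, where $\chi^2_{\alpha/2,\,n-1}$ denotes the value with an area of $\alpha/2$ to its right:
$P\left(\chi^2_{1-\alpha/2,\,n-1} < \dfrac{(n - 1)S^2}{\sigma^2} < \chi^2_{\alpha/2,\,n-1}\right) = 1 - \alpha$
$P\left(\dfrac{(n - 1)S^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \dfrac{(n - 1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}\right) = 1 - \alpha$
If $s^2$ is the value of the variance of a random sample of size n from a normal population,
then
$\dfrac{(n - 1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \dfrac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$
is a $(1 - \alpha)100\%$ confidence interval for $\sigma^2$. That is, the value of the population variance lies between
$\dfrac{(n - 1)s^2}{\chi^2_{\alpha/2,\,n-1}}$ and $\dfrac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$ with $(1 - \alpha)100\%$ confidence.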
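In general, let $x_1, x_2, \ldots, x_n$ be a random sample of a normal random variable with mean $\mu$ and variance $\sigma^2$. Then
$\dfrac{(n - 1)S^2}{\sigma^2}$ has a $\chi^2$ probability distribution with n-1
degrees of freedom. It follows that we can find a confidence interval for the population variance. Writing $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ for the tabulated values with areas $\alpha/2$ and $1 - \alpha/2$ to their left,
$P\left(\chi^2_{\alpha/2} < \dfrac{(n - 1)S^2}{\sigma^2} < \chi^2_{1-\alpha/2}\right) = 1 - \alpha$
or
$P\left(\dfrac{(n - 1)S^2}{\chi^2_{1-\alpha/2}} < \sigma^2 < \dfrac{(n - 1)S^2}{\chi^2_{\alpha/2}}\right) = 1 - \alpha$
$P(L_1 < \sigma^2 < L_2) = 1 - \alpha$, where $L_1 = \dfrac{(n - 1)S^2}{\chi^2_{1-\alpha/2}}$ and $L_2 = \dfrac{(n - 1)S^2}{\chi^2_{\alpha/2}}$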
Example: Suppose the weight of frozen food packages produced by a given manufacturer is
a normal random variable with mean $\mu$ and variance $\sigma^2$ (both unknown). A sample of 10
of these packages is selected at random and independently from those produced. Their
mean is $\bar{X} = 15.9$ and the standard deviation of the observations is S = 0.57. Compute a 95%
confidence interval for $\sigma^2$.
Solution:
n = 10, df = 9
$\sum (x_i - \bar{x})^2 = (n - 1)S^2 = 2.90$
$\chi^2_{0.005} = 1.73$, $\chi^2_{0.025} = 2.70$, $\chi^2_{0.975} = 19.0$, $\chi^2_{0.995} = 23.6$
$L_1 = \dfrac{(n - 1)S^2}{\chi^2_{1-\alpha/2}} = \dfrac{2.90}{19.0} = 0.15$
$L_2 = \dfrac{(n - 1)S^2}{\chi^2_{\alpha/2}} = \dfrac{2.90}{2.70} = 1.07$
Based on the sample, we are 95 percent confident that the interval (0.15, 1.07) contains $\sigma^2$.
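The two limits can also be computed with a few lines of code. The sketch below is an illustration only: it assumes the chi-square values are read from a table (the Python standard library has no chi-square quantile function), and `variance_interval` is a hypothetical helper name.

```python
def variance_interval(n, s_squared, chi2_lower, chi2_upper):
    """Confidence interval for a population variance from tabulated chi-square values.

    chi2_lower and chi2_upper are the table values with areas alpha/2 and
    1 - alpha/2 to their left, for n - 1 degrees of freedom.
    """
    sum_sq = (n - 1) * s_squared
    return sum_sq / chi2_upper, sum_sq / chi2_lower

# (n - 1) * S^2 = 2.90 in the frozen-food example, so S^2 = 2.90 / 9
lower, upper = variance_interval(10, 2.90 / 9, chi2_lower=2.70, chi2_upper=19.0)
print(f"({lower:.2f}, {upper:.2f})")
```

Note that the larger chi-square value gives the lower limit and the smaller value gives the upper limit, because the chi-square values appear in the denominators.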
Exercise: In 16 test runs, the gasoline consumption of an experimental engine had a standard
deviation of 2.2 gallons. Construct a 99% confidence interval for $\sigma^2$, which measures the
true variability of the gasoline consumption of the engine.
The point estimate of $\dfrac{\sigma_1^2}{\sigma_2^2}$ is given by the ratio of the two sample variances, $\dfrac{s_1^2}{s_2^2}$. So if $s_1^2$ and $s_2^2$
are the variances of independent random samples of sizes $n_1$ and $n_2$ from normal
populations, then
$F = \dfrac{\sigma_2^2\, s_1^2}{\sigma_1^2\, s_2^2}$
is a random variable having an F-distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom.
companies is normally distributed. Find a 98 percent confidence interval for $\dfrac{\sigma_1^2}{\sigma_2^2}$, where
$\sigma_1^2$ and $\sigma_2^2$ are the variances of the lives of the tubes manufactured by company X and company Y,
respectively.
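Solution:
$n_1 = 25$, $s_1^2 = (15)^2 = 225$
$n_2 = 16$, $s_2^2 = (20)^2 = 400$
$\alpha = 0.02$, $\alpha/2 = 0.01$
$f_{0.01}(24, 15) = 3.29$, $f_{0.01}(15, 24) = 2.89$
$\dfrac{225}{400} \cdot \dfrac{1}{3.29} < \dfrac{\sigma_1^2}{\sigma_2^2} < \dfrac{225}{400}(2.89)$
$0.171 < \dfrac{\sigma_1^2}{\sigma_2^2} < 1.626$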
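As a numerical check, the interval for the variance ratio can be sketched in Python as follows. This is an illustration under the assumption that both F values are read from a table; `variance_ratio_interval` and its parameter names are hypothetical.

```python
def variance_ratio_interval(s1_sq, s2_sq, f_upper_12, f_upper_21):
    """Confidence interval for the ratio of two population variances.

    f_upper_12 is the tabulated f(alpha/2) with (n1 - 1, n2 - 1) degrees of freedom;
    f_upper_21 is the tabulated f(alpha/2) with (n2 - 1, n1 - 1) degrees of freedom.
    """
    ratio = s1_sq / s2_sq
    return ratio / f_upper_12, ratio * f_upper_21

lower, upper = variance_ratio_interval(225, 400, f_upper_12=3.29, f_upper_21=2.89)
print(f"{lower:.3f} < ratio < {upper:.3f}")
```

Because the resulting interval contains 1, the samples do not rule out equal variances for the two companies at this confidence level.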
Exercise: A study has been made to compare the nicotine contents of two brands of
cigarettes. Ten cigarettes of Brand A had an average nicotine content of 3.1 milligrams with a
standard deviation of 0.5 milligram, while eight cigarettes of Brand B had an average
nicotine content of 2.7 milligrams with a standard deviation of 0.7 milligram. Assuming the two
sets of data are independent random samples from normal populations,
construct a 95% confidence interval for the ratio of the variances of the nicotine contents of the
two brands of cigarettes.
Chapter Eight
Hypothesis testing
Introduction
So far, we have discussed the first aspect of inferential statistics: estimation of population
parameters based upon the information obtained from a random sample. Now we will
consider the second aspect of statistical inference, hypothesis testing. It is the process
of making a decision about the value of a population parameter based on the information
obtained from sample results. That is, hypothesis testing involves assessing the validity of
some conjecture or claim about a population parameter using a statistic computed from
random samples. Hypothesis testing enables us to make a decision about the value of a
population parameter based on observed sample data.
In this chapter, we describe how to test hypotheses about different population
parameters based on sample evidence, starting with a description of basic concepts.
based on data collected for a random sample. The claim that is made about the relationship
between the training program and worker productivity represents a statistical hypothesis.
Null hypothesis
A statistical hypothesis about the parameter that will be maintained unless there is strong
contrary evidence. Simply put, it is a claim about a population parameter whose validity a researcher
tests based on sample information. It is denoted by H0. Using a random sample of size n, a researcher
determines a point estimate $\hat{\theta}$. Since the true value of $\theta$ is rarely known, we raise the
question: is the estimated value $\hat{\theta}$ compatible with the hypothesized value of $\theta$? The
statement about the compatibility of the estimate with the true population value, in the language of
hypothesis testing, is known as the null hypothesis. If the sample information is found not to be consistent with
H0, the null hypothesis is rejected and we conclude that it is false. On the other hand, if the
sample information is found to be consistent with H0, it is not rejected, even though we do not conclude
that it is true.
Alternative hypothesis
A hypothesis that comes to be accepted at the cost of H0 if the sample data provide
convincing evidence of its truth. In other words, it is a statement that contradicts the null hypothesis.
The alternative hypothesis is usually denoted by H1. It may be stated in different ways depending
on the nature of the problem. If the hypothesis to be tested relates to a
parameter θ whose value is predetermined or otherwise specified as θ = θ0, for example, the
null hypothesis is stated as
H0: parameter = value
H0: θ = θ0
The form in which H1 is stated depends on what the present value of θ is expected to
be. It may be stated in either of the following ways.
If our objective is to know whether the value of θ is the same as before or has changed, the
alternative hypothesis is stated as
H1: θ ≠ θ0
If our interest is to know whether the value of the parameter θ has increased or decreased from the stated
value θ0, the alternative hypothesis against which we test the null hypothesis is stated as
H1: θ > θ0 or
H1: θ < θ0
Hypothesis testing
It is the application of a set of rules for deciding whether to accept the null hypothesis or reject it
in favor of the alternative hypothesis.
For example, a pharmaceutical company plans to test the efficacy of a medicine against a
disease, based on the belief that on average 95 percent of all persons suffering from the disease are
cured. To test this belief, the company draws a random sample of 100 patients who suffered
from the disease and were treated with the medicine. Applying statistical rules to make a
decision about the effectiveness of the medicine is known as hypothesis testing.
Decision on null hypothesis | Null hypothesis is true | Null hypothesis is false
Accept (fail to reject) | Correct decision, probability = 1 - α | Type II error, probability = β
Reject | Type I error, probability = α (level of significance) | Correct decision, probability = 1 - β (power of test)
When drawing a conclusion about H0 (to reject or fail to reject), researchers never know
for sure whether an error has been committed. However, the probability of making either
a type I error or a type II error can be computed. In hypothesis testing, rules are constructed
so that the probability of committing a type I error is small, for the reliability of the
decision made. The probability of committing a type I error is denoted by α and
is known as the level of significance. Symbolically, it is expressed as α = P(Reject H0 | H0 is
true), read as the probability of rejecting H0 given that H0 is true.
Classical hypothesis testing requires that we initially specify a significance level for a test,
that is, the value of α (the tolerated probability of a type I error). Commonly used
values of α are 0.10, 0.05 and 0.01.
The complement (1 - α) of the probability of a type I error measures the probability of
not rejecting a true null hypothesis. It is also referred to as the confidence level.
The probability of committing a type II error is denoted by β. It is the probability of accepting
a false null hypothesis. The value of β varies with the actual value of the population parameter
being tested when H0 is false. The complement (1 - β) of the probability of a type II error
measures the probability of rejecting a false null hypothesis. It is also called the power of the
statistical test.
The quality of a statistical test is measured by the size of the two error probabilities, α and β.
The test should be conducted at small values of α and β. Another way of evaluating a test is
to look at the complement of a type II error, that is, rejecting H0 when H1 is true. It has the
probability 1 - β = P(reject H0 when H1 is true). It measures the ability of the test to perform
as required.
The value of β can be computed using the following steps.
1. Find the critical values of $\bar{x}$ (or of another sample statistic) that separate the
acceptance and rejection regions.
2. Using one or more values of µ consistent with the alternative hypothesis H1,
calculate the probability that the sample mean $\bar{x}$ falls in the acceptance region. This
produces the value β = P(accept H0 when µ = µa), and the power of the test is then 1 - β.
Example: The daily yield for a local chemical plant has averaged 880 tons for the last
several years. The quality control manager would like to know whether this average has
changed in recent months. She randomly selects 50 days from the computer database and
computes the average and the standard deviation of the n = 50 yields as $\bar{x}$ = 871 tons and S = 21
tons. Find β and the power of the test when µ is actually 870, if you test
the hypothesis at α = 0.05.
Solution:
The acceptance region for the test is located in the interval
$\mu_0 \pm 1.96\dfrac{s}{\sqrt{n}}$
$880 \pm 1.96\dfrac{21}{\sqrt{50}}$
(874.18, 885.82)
The probability of accepting H0, given µ = 870, is equal to the area under the sampling
distribution of the test statistic $\bar{x}$ in the interval from 874.18 to 885.82. Since $\bar{x}$ is normally
distributed with a mean of 870 and $SE = \dfrac{21}{\sqrt{50}} = 2.97$, β is equal to the area under the normal
curve with µ = 870 located between 874.18 and 885.82. Calculating the Z-values
corresponding to 874.18 and 885.82:
$Z_1 = \dfrac{\bar{x} - \mu}{S/\sqrt{n}} = \dfrac{874.18 - 870}{21/\sqrt{50}} = 1.41$
$Z_2 = \dfrac{\bar{x} - \mu}{S/\sqrt{n}} = \dfrac{885.82 - 870}{21/\sqrt{50}} = 5.33$
Hence β = P(1.41 < Z < 5.33) ≈ 1 - 0.9207 = 0.0793, and the power of the test is 1 - β = 0.9207. That is,
the probability of correctly rejecting H0, given that µ is equal to 870, is 0.9207.
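The two steps for computing β can be sketched in code as follows. This is a minimal illustration for a two-tailed Z test, assuming the sample standard deviation stands in for the population value; `beta_and_power` is a hypothetical helper, and the normal areas come from the standard library's `statistics.NormalDist` instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist

def beta_and_power(mu0, mu_actual, s, n, alpha=0.05):
    """Type II error probability and power of a two-tailed Z test of H0: mu = mu0."""
    se = s / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 1: critical values of the sample mean bounding the acceptance region
    lower, upper = mu0 - z_crit * se, mu0 + z_crit * se
    # Step 2: probability that the sample mean falls in that region when mu = mu_actual
    sampling_dist = NormalDist(mu_actual, se)
    beta = sampling_dist.cdf(upper) - sampling_dist.cdf(lower)
    return beta, 1 - beta

beta, power = beta_and_power(mu0=880, mu_actual=870, s=21, n=50)
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

For the chemical plant data this gives a power of about 0.92, in line with the table-based value. Small differences in the third decimal place arise because the code does not round the Z-values to two decimals.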
P-Value
The P-value, or observed significance level, of a statistical test is the smallest value of α for
which H0 can be rejected. It is the actual risk of committing a type I error if H0 is rejected
based on the observed value of the test statistic. The P-value measures the strength of evidence
against H0. For a right-tailed test, the P-value is the area to the right of the calculated value of the test
statistic.
A small P-value indicates that the observed value of the test statistic lies far away from the
hypothesized value of µ. This presents strong evidence that H0 is false and should be
rejected. The P-value for a statistical test is the probability, assuming H0 is true, of observing a value of the test statistic
at least as contradictory to the null hypothesis and supportive of the alternative hypothesis as the one actually observed. As a
cut-off point, if the P-value is less than a preassigned significance level α, then the null
hypothesis is rejected and you can report that the results are statistically significant
at level α. For example, H0 is rejected at the 5% level of significance if the P-value is less than
0.05.
Steps for calculating the P-value for a test of hypothesis
1. Determine the value of the test statistic corresponding to the result of the sampling
experiment.
2. a) If the test is one-tailed, the p-value is equal to the tail area beyond the test statistic
value in the same direction as the alternative hypothesis. Thus, if the alternative
hypothesis is of the form >, the p-value is the area to the right of, or above, the
observed test statistic value. Conversely, if the alternative is of the form <, the
p-value is the area to the left of, or below, the observed test statistic value.
b) If the test is two-tailed, the p-value is equal to twice the tail area beyond the observed
test statistic value in the direction of its sign. That is, if the test statistic value is
positive, the p-value is twice the area to the right of, or above, the observed test statistic
value. Conversely, if the test statistic value is negative, the p-value is twice the area to the left
of, or below, the observed test statistic value.
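The steps above can be sketched as a short Python function for a test statistic that follows the standard normal distribution. The function name `z_p_value` and the labels "greater", "less" and "two-sided" are illustrative choices, not notation from this module.

```python
from statistics import NormalDist

def z_p_value(z, alternative):
    """P-value of an observed Z statistic.

    alternative: "greater" (H1 of the form >), "less" (H1 of the form <)
    or "two-sided" (H1 of the form not equal).
    """
    std_normal = NormalDist()
    if alternative == "greater":
        return 1 - std_normal.cdf(z)             # area to the right of z
    if alternative == "less":
        return std_normal.cdf(z)                 # area to the left of z
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))  # twice the tail area beyond |z|
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

print(round(z_p_value(-3.0, "two-sided"), 4))
```

For an observed Z of -3.0 in a two-tailed test, the function returns a p-value of about 0.0027. A four-decimal Z-table gives 2(0.0013) = 0.0026; the small gap is due to table rounding.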
Example: A manufacturer of cereal wants to test the performance of one of its filling
machines. The machine is designed to discharge a mean amount of µ = 12 ounces per box,
and the manufacturer wants to detect any departure from this setting. This quality study
calls for randomly sampling 100 boxes from today's production run and determining
whether the mean fill for the run is 12 ounces per box. Determine the p-value and test at α = 0.01.
Solution:
The null and alternative hypotheses of the problem are stated as
H0: µ = 12
H1: µ ≠ 12
Test statistic: $Z = \dfrac{\bar{X} - 12}{s/\sqrt{n}}$, with s = 0.5, n = 100 and $\bar{X}$ = 11.85
$Z = \dfrac{11.85 - 12}{0.5/10} = -3.0$
P-value = P(Z < -3.0 or Z > 3.0) = P(|Z| > 3.0)
Since the test is a two-tailed test, we double P(Z < -3.0), which is equal to 0.5 - 0.4987 = 0.0013:
2P(Z < -3.0) = 2(0.0013) = 0.0026
Based on the calculated p-value, we decide whether or not to reject H0 as follows:
Choose the maximum value of α that you are willing to tolerate.
If the observed significance level (p-value) of the test is less than the chosen value of
α, reject the null hypothesis. Otherwise, do not reject the null hypothesis.
For the above hypothesis, since the p-value of the test statistic (0.0026) is less than the chosen level
α = 0.01, we reject the null hypothesis.
Exercise: In a test of the hypothesis H0: µ = 50 against H1: µ ≠ 50, a sample of 100
observations has a mean of $\bar{x}$ = 49.4 and a standard deviation of s = 3.1. Find and interpret the
p-value at α = 0.01.
Note that the form in which H1 is stated is important, as it determines the type of test to be
used. We conduct a lower-tail test when it is stated as H1: θ < θ0, an upper-tail test when it is stated as
H1: θ > θ0, and a two-tail test when it is stated as H1: θ ≠ θ0. Statistical testing requires that H0 be
stated as precisely and clearly as possible. In most cases, testing a hypothesis requires the
description of H0 in affirmative terms, since it is easier to make use of the sample data for
rejecting H0. For example, if we intend to test whether a new technique of production is
more or less efficient than the old one, H0 is stated as: the two techniques are equally
efficient.
acceptance region 1 - α = 1, so that H0 will always be accepted even when it is false.
Therefore, the value of α should be set optimally. In most cases, α is conventionally set at
0.05 or 0.01.
The results of testing are said to be significant when H0 is rejected at α = 0.05, and highly
significant when it is rejected at α = 0.01. This means that the sample result is significantly
different from the hypothesized value, and that the difference between the two is not merely
because of sampling error.
Step 3: Selecting the test statistic
In this step, we select the test statistic on which the decision to reject or accept H0 is based. While
selecting the test statistic, one should identify the sampling distribution
of the sample statistic that is used to estimate the population parameter. Based on the sampling
distribution of the statistic, the critical values that separate the regions of acceptance and
rejection of the null hypothesis are defined. If the calculated value of the statistic falls within
the acceptance region, the null hypothesis is not rejected. However, if it falls within the
rejection region, we reject the null hypothesis.
4. Check whether the value of the test statistic falls into the critical region and,
accordingly, reject the null hypothesis or accept it.
Alternatively, the P-value is used for a decision to accept or reject the null hypothesis. The P-value
corresponding to the test statistic represents the lowest level of significance at which the null
hypothesis could have been rejected. Hence, after determining the value of the test statistic and
the corresponding P-value from a sample, check whether the p-value is less than or equal to α.
Reject the null hypothesis if the p-value is less than or equal to α, and accept the null
hypothesis if the p-value of the test statistic is greater than α.
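Graphically, the critical region (acceptance and rejection regions of H0) is presented using the
following graphs.
[Figure: rejection regions under the standard normal curve. Left-tailed test: Z < -Z_α. Right-tailed test: Z > Z_α. Two-tailed test: Z < -Z_{α/2} or Z > Z_{α/2}.]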
Example: The average weekly earnings for women in managerial and professional positions
are $670. Do men in the same positions have average weekly earnings that are higher than
those for women? A random sample of n = 40 men in managerial and professional
positions showed a sample mean of x̄ = $725 and S = $102. Test the hypothesis using α = 0.01.
Solution: We would like to show that the average weekly earnings for men are higher than
$670, the women's average. Hence, if µ is the average weekly earnings in managerial and
professional positions for men, the hypotheses to be tested are:
H0: µ = 670
H1: µ > 670
The rejection region for this one-tailed test consists of large values of $\bar{X}$ or, equivalently,
values of the standardized test statistic Z in the right tail of the standard normal distribution,
with α = 0.01. The critical value is obtained from the standard normal table (Z-table) and is equal to
$Z_{\alpha} = 2.33$. The observed value of the test statistic, using s as an estimate of the population
standard deviation, is
$Z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{725 - 670}{102/\sqrt{40}} = 3.41$
Since the observed value of the test statistic falls in the rejection region ($Z_{cal} > Z_{\alpha}$), H0 is
rejected and we conclude that the average weekly earnings for men in managerial
and professional positions are significantly higher than those of women. The probability
that you have made an incorrect decision is α = 0.01.
Alternatively, we can also use the p-value to test the hypothesis that men in the same positions
have a higher average weekly income than women. In the right-tail test with observed test
statistic Z = 3.41, the smallest critical value we can use to reject H0 is Z = 3.41. For this
critical value the risk of an incorrect decision is
P (Z > 3.41) = 1 - 0.9997 (0.9997 is obtained from the Z-table for the
calculated value of the test statistic Z)
= 0.0003
This probability is the p-value of the test. It is the area to the right of the calculated value of the test
statistic. H0 is rejected if the p-value associated with the test statistic is less than the specified
significance level α. Thus, in this case H0 is rejected since the p-value equals 0.0003, which is
less than α = 0.01.
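Both decision routes, the critical value and the p-value, can be reproduced with a short Python sketch. It assumes a large sample so that Z is approximately standard normal; `z_test_mean` and its `alternative` labels are illustrative names rather than part of the module.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu0, s, n, alpha, alternative="greater"):
    """Large-sample Z test for a mean; returns (z, critical value, p-value, reject?)."""
    z = (x_bar - mu0) / (s / sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":
        crit = std_normal.inv_cdf(1 - alpha)
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":
        crit = std_normal.inv_cdf(alpha)
        p_value = std_normal.cdf(z)
    else:  # two-sided
        crit = std_normal.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, crit, p_value, p_value < alpha

# Men's weekly earnings: sample mean 725, s = 102, n = 40, tested against 670
z, crit, p_value, reject = z_test_mean(725, 670, 102, 40, alpha=0.01)
print(f"Z = {z:.2f}, critical value = {crit:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

The function compares the p-value with α, which leads to the same decision as comparing the observed Z with the critical value.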
154
_________________________________________________________________________
ODA BULTUM UNIVERSITY, DEPARTMENT OF ECONOMICS
STATISTICS FOR ECONOMICS
= 0.01
P-value =0.003
2.33 3.41
Example 2: An auto company decided to introduce a new six-cylinder car whose mean
petrol consumption is claimed to be lower than that of the existing auto engine. It was found
that the mean petrol consumption for 50 cars was 10 km per liter with a standard deviation of
3.5 km per liter. Test for the company, at the 5 percent level of significance, the claim that the
new car's petrol consumption is 9.5 km per liter on average.
Solution: Let us take as the null hypothesis H0 that there is no significant difference
between the company's claim and the sample average value, that is
H0: µ = 9.5
H1: µ ≠ 9.5
The test statistic of the problem is
$Z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{10 - 9.5}{3.5/\sqrt{50}} = \dfrac{0.5}{0.495} = 1.01$
Using α = 0.05, the rejection region consists of values Z > 1.96 and values Z < -1.96.
Since the calculated value of Z, 1.01, falls in the acceptance region ($Z_{cal} < Z_{\alpha/2}$), we accept the null
hypothesis µ = 9.5. Hence, we can conclude that the new car's petrol
consumption does not differ significantly from 9.5 km per liter.
Using the p-value approach,
P-value = P(Z > 1.01) + P(Z < -1.01)
= (1 - 0.8438) + (1 - 0.8438)
= 0.1562 + 0.1562
P-value = 0.3124
The null hypothesis is rejected only if the P-value is less than the specified level of
significance α = 0.05. Here the P-value = 0.3124 > α = 0.05.
Therefore, H0 is not rejected and the results are not statistically significant. There is not
enough evidence to indicate that the new car's petrol consumption is different from 9.5
km/liter on average.
Exercise
1. The mean lifetime of a sample of 400 fluorescent light bulbs produced by a
company is found to be 1570 hours with a standard deviation of 150 hours. Test the
hypothesis that the mean lifetime of the bulbs produced by the company is 1600
hours against the alternative hypothesis that it is greater than 1600 hours at 1 percent
level of significance.
2. The daily yield for a local chemical plant has averaged 880 tons for the last several
years. The quality control manager would like to know whether this average has
changed in recent months. She randomly selects 50 days from the computer database
and computes the average and standard deviation of the n = 50 yields as $\bar{x}$ = 871 tons
and s = 21 tons, respectively. Test the hypothesis at the α = 0.05 level of significance
using both the critical region approach based on the Z-value and the p-value approach.
When the sample size is small (i.e. less than 30), the central limit theorem does not allow us to
assume that the sampling distribution of a statistic such as the mean $\bar{x}$ or the proportion p is normal.
Consequently, when the sample size n is small, the test statistic $\dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ does not have a
normal distribution. Therefore, the critical value of Z that is used in the large-sample case
cannot be used for accepting or rejecting the null hypothesis. As we have discussed in chapter one,
if a random sample of size n less than 30 is taken from a normally distributed population, the
random variable t defined as $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$, where s is the sample standard deviation,
has a t distribution with n-1 degrees of freedom. Thus, the critical regions for testing the null
hypothesis µ = µ0 against an alternative hypothesis can be constructed using the t-distribution.
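1. Null hypothesis H0: µ = µ0
2. Alternative hypothesis
One-tail test: H1: µ > µ0 or H1: µ < µ0
Two-tail test: H1: µ ≠ µ0
3. Define the test statistic $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
4. Rejection region
For a one-tailed test, reject H0 when $t > t_{\alpha}$ (for H1: µ > µ0) or when $t < -t_{\alpha}$ (for H1: µ < µ0).
For a two-tailed test, reject H0 when $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$.
Note that the critical values $t_{\alpha}$ and $t_{\alpha/2}$ are found from the standard Student t-table for n-1 degrees
of freedom.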
to indicate that the average weight of the diamonds produced by the process is in excess of
0.5 karat?
If the level of significance is α = 0.05, the right-tailed rejection region is found using the
critical value of t obtained from the t-table. With d.f. = n-1 = 5, $t_{0.05} = 2.015$. So reject H0 if the
calculated value of t is greater than the table value of t ($t_{cal} > t_{0.05}$). Since the calculated value of the
test statistic, 1.32, is less than 2.015, it does not fall into the rejection region. This implies that we
do not reject H0.
P-value method
Unlike the Z-table, the table for t gives only the values of t corresponding to upper-tail areas equal
to 0.100, 0.050, 0.025, and 0.005. Consequently, you can only approximate the upper-tail
area that corresponds to the probability that t > 1.32. Since the statistic for this test is based
on 5 d.f., we refer to the row corresponding to d.f. = 5. The value t = 1.32 falls below $t_{0.10} = 1.476$.
Therefore, the right-tail area corresponding to the probability that t > 1.32 is greater than 0.10.
To reject the null hypothesis H0, the p-value must be less than the specified significance
level α. This implies that we cannot reject H0, since the p-value is greater than the value of
α.
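The small-sample procedure can be illustrated with a short Python sketch. The six weights below are hypothetical values assumed purely for illustration (chosen so that the statistic is close to the 1.32 discussed above); they are not taken from this module. The critical value is assumed to be read from a t-table, since the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for H0: mu = mu0."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1

# Hypothetical weights (karats) of six diamonds, assumed for illustration only
weights = [0.46, 0.61, 0.52, 0.48, 0.57, 0.54]
t_cal, df = one_sample_t(weights, mu0=0.5)
t_table = 2.015  # t(0.05) for 5 degrees of freedom, read from a t-table
print(f"t = {t_cal:.2f}, d.f. = {df}, reject H0: {t_cal > t_table}")
```

Here `stdev` computes the sample standard deviation with divisor n-1, which is the s used in the t statistic.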
Example 2: Suppose that it is known from experience that the standard deviation of the
weight of 8-ounce packages of cookies made by a certain bakery is 0.16 ounce. To check
whether its production is under control on a given day, that is to check whether the true
$t_{0.005,\,24} = 2.797$
Exercise
1. A claim is made that Adama University students have an average IQ of 120. To test this
claim, a random sample of 10 students was taken and their IQ scores were recorded as
follows: 105, 110, 120, 125, 100, 130, 120, 115, 125, 130. At the 0.05 level of
significance, test the validity of this claim.
2. Test at the 0.05 level of significance whether the mean of a random sample of size
n = 16 is significantly less than 10. Assume that the distribution from which the
sample was taken is normal, with $\bar{x}$ = 8.4 and σ = 3.2.
8.3.2 Hypothesis testing for the difference between two population means
In much applied research, we are interested in hypothesis testing concerning the difference
between the means of two populations. For example, we might want to compare the
output of two different production processes for which we do not know either
population mean. Similarly, we might want to know whether one marketing strategy results in
higher sales than another without knowing the population mean sales for either. These
questions can be handled effectively by hypothesis testing.
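Let us suppose that we are dealing with independent random samples of sizes $n_1$ and $n_2$ from
two normal populations having means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$. We can test the null
hypothesis about the difference $\mu_1 - \mu_2$.
Let the sample means for the two populations be $\bar{x}_1$ and $\bar{x}_2$; then the estimator of $\mu_1 - \mu_2$ is
$\bar{x}_1 - \bar{x}_2$.
According to the central limit theorem, if the sample sizes are large, $\bar{x}_1$ and $\bar{x}_2$ are approximately
normally distributed, and so is $\bar{x}_1 - \bar{x}_2$, with mean $\mu_1 - \mu_2$ and standard error $\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.
The normal test procedure for the difference of two means with large sample sizes is carried out as
follows.
1. Null hypothesis H0: $(\mu_1 - \mu_2) = D$, where D is some specified difference that you wish
to test. For many tests, we will hypothesize that there is no difference between $\mu_1$
and $\mu_2$, that is, D = 0.
2. Alternative hypothesis:
One-tailed test: H1: $\mu_1 - \mu_2 > D$ or H1: $\mu_1 - \mu_2 < D$
Two-tailed test: H1: $\mu_1 - \mu_2 \neq D$
3. Test statistic: $Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - D}{SE} = \dfrac{(\bar{x}_1 - \bar{x}_2) - D}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$
4. Reject H0 when
One-tailed test: $Z > Z_{\alpha}$ (for H1: $\mu_1 - \mu_2 > D$) or $Z < -Z_{\alpha}$ (for H1: $\mu_1 - \mu_2 < D$)
Two-tailed test: $Z > Z_{\alpha/2}$ or $Z < -Z_{\alpha/2}$
or when the P-value < α.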
When the sample sizes are small, we cannot no more rely on central limit theorem to ensure
that the sample means will be normal. If the original populations are normal, however, then
the sampling distribution of the difference in the sample means x1 x2 will be normal
even for small sample sizes. If both populations have the same variance (shape), 2 a
x1 x 2 ( 1 2 )
random variable has a student t – distribution.
1 1
2
n1 n2
Thus, in the case of small samples, the critical region to accept or reject the null hypothesis
about the difference between the two means of normally distributed populations is found using the
t-statistic. To compute the t-statistic, if the common population variance of the two populations is not
known, it is estimated with the pooled sample variance $S^2$, where
$$S^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
and the t-statistic becomes
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{S^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
So reject H0 when $t > t_\alpha$ for $H_1: \mu_1 - \mu_2 > D$, and when $t < -t_\alpha$ for the alternative hypothesis
$H_1: \mu_1 - \mu_2 < D$, or when the P-value is less than $\alpha$. For a two-tail test reject H0 when $t > t_{\alpha/2}$ or
$t < -t_{\alpha/2}$. Note that the tabulated values of t (t-critical) for $t_\alpha$ and $t_{\alpha/2}$ are based on $n_1 + n_2 - 2$
degrees of freedom.
the hypothesis $\mu_1 - \mu_2 = 0.20$ against the alternative hypothesis $\mu_1 - \mu_2 \ne 0.20$ at $\alpha = 5\%$.
Also use the p-value method to decide whether or not to reject the null hypothesis.
i. Critical value approach: for a two-tailed test at $\alpha = 0.05$, reject H0 if Z > 1.96
or Z < -1.96. To compare the table value of Z (1.96) with its calculated
value, first find the calculated value of the Z-statistic.
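$$Z = \frac{\bar{x}_1 - \bar{x}_2 - D}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}},$$
where, since the samples are large, the sample standard deviations $S_1$ and $S_2$ are used in place of $\sigma_1$ and $\sigma_2$.
$\bar{x}_1$ = 2.61, $n_1$ = 50
$\bar{x}_2$ = 2.38, $n_2$ = 40
$S_1$ = 0.12, $S_2$ = 0.14
$$Z = \frac{2.61 - 2.38 - 0.20}{\sqrt{\frac{(0.12)^2}{50} + \frac{(0.14)^2}{40}}} = 1.08$$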
Since the calculated value of Z (1.08) is less than the table value $Z_{0.025}$ = 1.96, the null
hypothesis cannot be rejected at the 5 percent level of significance.
ii. P-value approach: calculate the p-value, the probability that Z is greater than 1.08 plus
the probability that Z is less than -1.08.
P-value = P(Z > 1.08) + P(Z < -1.08)
Or P-value = 2P(Z > 1.08) = 2(0.5000 - 0.3599), where 0.3599 is the entry of the
Z-table for Z = 1.08.
P-value = 0.2802. Since 0.2802 exceeds the value of $\alpha$, 0.05, the null
hypothesis cannot be rejected. We can also say that the difference between the
two means is not statistically significantly different from the hypothesized value of 0.20.
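The large-sample calculation above can be checked with a short Python sketch. It is only an illustration, not part of the module's method: the function names `normal_cdf` and `two_sample_z` are hypothetical, and the normal probability is obtained from the standard library's error function rather than from a Z-table.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_sample_z(x1_bar, x2_bar, s1, s2, n1, n2, d0=0.0):
    """Large-sample Z statistic and two-tailed p-value for H0: mu1 - mu2 = d0."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)      # standard error of x1_bar - x2_bar
    z = (x1_bar - x2_bar - d0) / se
    p_two_tailed = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_two_tailed

# Figures from the example: H0: mu1 - mu2 = 0.20, alpha = 0.05
z, p = two_sample_z(2.61, 2.38, 0.12, 0.14, 50, 40, d0=0.20)
reject = abs(z) > 1.96                           # two-tailed critical value Z_0.025
print(f"Z = {z:.2f}, p-value = {p:.2f}, reject H0: {reject}")
```

The sketch gives Z of about 1.08 and a p-value of about 0.28. The p-value differs slightly in the third decimal from the table-based figure because the table lookup rounds Z to two decimals first.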
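Solution: Let $\mu_1$ be the average marks of students taught by the traditional lecture method, and
$\mu_2$ be the average marks of those taught by the case method. Thus, we can test the
hypothesis as follows:
$H_0: \mu_1 - \mu_2 = 0$, $\alpha = 0.01$
$H_1: \mu_1 - \mu_2 \ne 0$
The critical region: $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$, where $t_{\alpha/2,\, n_1 + n_2 - 2} = t_{0.005,\, 23} = 2.807$.
Test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{S^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \quad \text{for the same variance case,}$$
with pooled variance
$$S^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}.$$
In this example the pooled variance is computed as
$$S^2 = \frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2 - 2} = (4.64)^2$$
(this second form applies when the sample variances are computed with divisor n rather than n - 1), so that
$$t = \frac{64 - 67}{4.64\sqrt{\frac{1}{10} + \frac{1}{15}}} \approx -1.58$$
Since t = -1.58 is greater than $-t_{\alpha/2} = -2.807$, we cannot reject H0. That is, the data provide no
evidence at the 1 percent level that the traditional lecture method and the case method differ in effectiveness.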
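As a rough check of the small-sample procedure, the following Python sketch computes the pooled-variance t statistic. The function names `pooled_variance` and `pooled_t` are hypothetical, the pooled standard deviation 4.64 and the critical value 2.807 are taken from the worked solution, and `pooled_variance` assumes sample variances computed with divisor n - 1.

```python
import math

def pooled_variance(s1_sq, s2_sq, n1, n2):
    """Pooled estimate of a common variance from two sample variances
    (each assumed to be computed with divisor n - 1)."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def pooled_t(x1_bar, x2_bar, pooled_sd, n1, n2, d0=0.0):
    """t statistic for H0: mu1 - mu2 = d0 when both populations share one variance."""
    se = pooled_sd * math.sqrt(1.0 / n1 + 1.0 / n2)
    return (x1_bar - x2_bar - d0) / se

# Figures from the example: means 64 and 67, n1 = 10, n2 = 15, pooled S = 4.64
t = pooled_t(64, 67, 4.64, 10, 15)
t_crit = 2.807                      # t_{0.005, 23}, read from a t-table
reject = abs(t) > t_crit            # two-tailed test at alpha = 0.01
print(f"t = {t:.2f}, reject H0: {reject}")
```

With the rounded value S = 4.64 the sketch gives t of about -1.58, well inside the interval (-2.807, 2.807), so the null hypothesis is not rejected.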
Exercise
a second normal population with standard deviation $\sigma_2 = 4$ has a mean $\bar{x}_2 = 80$. Test
Test statistic: $Z = \frac{\hat{p} - P_0}{SE} = \frac{\hat{p} - P_0}{\sqrt{\frac{P_0 q_0}{n}}}$ with $\hat{p} = \frac{x}{n}$, where x is the number of
successes in n binomial trials. Note that n should be large enough so that the distribution of
$\hat{p}$ can be approximated by the normal distribution. Thus, the critical region for the test can be
stated as:
Reject H0 when $Z > Z_\alpha$ (or $Z < -Z_\alpha$ when $H_1: P < P_0$), or when the P-value is less than $\alpha$, for a
one-tail test.
Reject H0 when $Z > Z_{\alpha/2}$ or $Z < -Z_{\alpha/2}$ for a two-tail test.
Example
A student leader seeking election to the office of president of the University student
union claims that 55 percent of the votes will be polled in his favor and that he will win the
election. A sample survey of 100 students before the election revealed that only 45
students expressed the desire to vote in his favor. Verify the claim at the 0.01 level of
significance.
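Solution: $H_0: P = 0.55$
$H_1: P < 0.55$, $\alpha = 0.01$, $\hat{p} = 0.45$
Test statistic:
$$Z = \frac{\hat{p} - P_0}{\sqrt{\frac{P_0 q_0}{n}}} = \frac{0.45 - 0.55}{\sqrt{\frac{(0.55)(0.45)}{100}}} = -2.01$$
Critical region: Reject H0 if $Z < -Z_\alpha$
$-Z_\alpha = -Z_{0.01} = -2.33$ (from the Z-table)
Z = -2.01
Decision: we cannot reject H0 since $Z > -Z_\alpha$.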
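The single-proportion test can be reproduced with a few lines of Python. This is only a sketch under the normal approximation described above; the function name `one_proportion_z` is hypothetical and the critical value -2.33 is the table value quoted in the solution.

```python
import math

def one_proportion_z(x, n, p0):
    """Z statistic for H0: P = p0, using the null-hypothesis standard error."""
    p_hat = x / n                               # sample proportion
    se = math.sqrt(p0 * (1.0 - p0) / n)         # SE computed under H0
    return (p_hat - p0) / se

# Figures from the example: 45 of 100 students in favor, claimed P0 = 0.55
z = one_proportion_z(45, 100, 0.55)
z_crit = -2.33                                  # -Z_0.01, read from a Z-table
reject = z < z_crit                             # lower-tail test
print(f"Z = {z:.2f}, reject H0: {reject}")
```

The statistic comes out at about -2.0, which lies above -2.33, so the claim is not rejected at the 1 percent level.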
Exercise
1. Suppose that 10% of the fields in a given agricultural area are infested with the sweet
potato whitefly. One hundred fields in this area are randomly selected, and 25 are found
to be infested with whitefly. Assuming that the experiment satisfies the conditions of a
binomial experiment,
(SE), $SE = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$, to find the test statistic used to construct the critical region for the test.
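For large samples of size $n_1$ and $n_2$, both greater than 30, the test statistic has a standard
normal distribution. Thus, we follow the formal procedure to conduct the test as follows.
1. Null hypothesis $H_0: P_1 - P_2 = 0$, or equivalently $H_0: P_1 = P_2$
Alternative hypothesis:
One-tailed test: $H_1: P_1 - P_2 > 0$ or $H_1: P_1 - P_2 < 0$
Two-tailed test: $H_1: P_1 - P_2 \ne 0$
2. Test statistic:
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{pq}{n_1} + \frac{pq}{n_2}}}$$
where $\hat{p}_1 = \frac{x_1}{n_1}$ and $\hat{p}_2 = \frac{x_2}{n_2}$. Since the common value $P_1 = P_2 = P$ is unknown, it is estimated
by $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$, and the test statistic is
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$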
Assumption: Samples are selected in a random and independent manner from two binomial
populations, and $n_1$ and $n_2$ are large enough so that the sampling distribution of
$\hat{p}_1 - \hat{p}_2$ can be approximated by a normal distribution.
Example: The records of a hospital show that 52 men in a sample of 1000 men versus 23
women in a sample of 1000 women were admitted because of heart disease. Do these data
present sufficient evidence to indicate a higher rate of heart disease among men admitted to
the hospital? Use $\alpha = 0.05$ to test the claim, assuming the number of patients admitted for
heart disease has an approximately binomial distribution for both men and women with
parameters $p_1$ and $p_2$, respectively.
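Solution:
Null hypothesis $H_0: P_1 - P_2 = 0$
Alternative hypothesis $H_1: P_1 - P_2 > 0$
To determine the test statistic, let us find the pooled standard error using the pooled estimate
of P:
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{52 + 23}{1000 + 1000} = 0.0375$$
Test statistic:
$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.052 - 0.023}{\sqrt{(0.0375)(0.9625)\left(\frac{1}{1000} + \frac{1}{1000}\right)}}$$
Z = 3.41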
$Z_{0.05}$ = 1.645
Reject $H_0: P_1 - P_2 = 0$ since Z = 3.41 > $Z_{0.05}$ = 1.645, and conclude that the data present sufficient
evidence to indicate that the percentage of men entering the hospital because of heart
disease is higher than that of women.
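The pooled two-proportion test in the hospital example can be sketched in Python as follows. The function name `two_proportion_z` is hypothetical, and the critical value 1.645 is the table value used in the solution.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z statistic for H0: P1 - P2 = 0; also returns the pooled estimate of P."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)            # common P estimated under H0
    q_pooled = 1.0 - p_pooled
    se = math.sqrt(p_pooled * q_pooled * (1.0 / n1 + 1.0 / n2))
    return (p1_hat - p2_hat) / se, p_pooled

# Figures from the example: 52 of 1000 men and 23 of 1000 women
z, p_pooled = two_proportion_z(52, 1000, 23, 1000)
z_crit = 1.645                                  # Z_0.05, read from a Z-table
reject = z > z_crit                             # upper-tail test
print(f"pooled p = {p_pooled:.4f}, Z = {z:.2f}, reject H0: {reject}")
```

The sketch returns a pooled proportion of 0.0375 and Z of about 3.41, which exceeds 1.645, in line with the decision to reject the null hypothesis.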
Exercise
1. Independent random samples of 280 and 350 observations were selected from
binomial populations 1 and 2, respectively. Sample 1 had 132 successes and sample
2 had 178 successes. Do the data present sufficient evidence to indicate that the
proportion of successes in population 1 is smaller than the proportion in population 2?
Test the claim at the 5% level of significance.
2. An experiment was conducted to test the effect of a new drug on a viral infection.
The infection was induced in 100 mice and the mice were randomly split into two
groups of 50. The first group, the control group, received no treatment for the infection.
The second group received the drug. After a 30-day period, the proportions of
survivors, $\hat{p}_1$ and $\hat{p}_2$, in the two groups were found to be 0.36 and 0.60, respectively.
Is there sufficient evidence to indicate that the drug is effective in treating the viral
infection? Use $\alpha = 0.05$.
Case 1: Testing the uniformity of a given population. As with hypothesis tests of other
population parameters, a test about the population variance $\sigma^2$ is based on the sample variance
computed from a random sample of observations from a normally distributed population. You
might recall from chapter -, the sampling distribution of the ratio of sample variance to
population variance: $\frac{(n-1)S^2}{\sigma^2}$ has a chi-square distribution with n - 1 degrees of freedom.
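Here the null hypothesis claims that the population variance $\sigma^2$ is equal to a specified
value $\sigma_0^2$:
$H_0: \sigma^2 = \sigma_0^2$, tested against an alternative hypothesis.
To reject or not reject the null hypothesis, a test statistic is computed from the observed
values as follows:
$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$
where $\sigma_0^2$ = null hypothesis value
$s^2$ = sample variance
n = sample size
Reject H0 for a one-tailed test when
$\chi^2 > \chi^2_\alpha$ for the alternative hypothesis $H_1: \sigma^2 > \sigma_0^2$, or
$\chi^2 < \chi^2_{1-\alpha}$ for the alternative hypothesis $H_1: \sigma^2 < \sigma_0^2$.
For a two-tailed test reject H0 when $\chi^2 > \chi^2_{\alpha/2}$ or $\chi^2 < \chi^2_{1-\alpha/2}$,
or when the p-value is less than $\alpha$.
Note that $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are the upper and lower tail values of $\chi^2$ for n - 1 degrees of freedom.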
Example
A company that has manufactured radio tubes for the last 10 years finds that the life of its tubes
has a variance of 0.6 years. As a result of some qualitative improvement brought about in the
product, the company claims that the variance of the life of its tubes has decreased. Using the
0.05 level of significance, test the claim made by the company if the sample variance $s^2$
based on the observation of 9 tubes is found to be 0.45 years.
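Solution
Null hypothesis, $H_0: \sigma^2 = 0.6$
Alternative hypothesis, $H_1: \sigma^2 < 0.6$, $\alpha = 0.05$
Test statistic: $\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$, with n = 9 and $s^2$ = 0.45:
$$\chi^2 = \frac{8(0.45)}{0.6} = 6.0$$
From the chi-square table, $\chi^2_{1-\alpha} = \chi^2_{0.95} = 2.73$ for 8 degrees of freedom.
This implies $\chi^2 > \chi^2_{0.95}$; as a result we cannot reject H0. This means that the sample
does not provide sufficient evidence that the variance of tube life has decreased.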
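The chi-square statistic for a single variance is simple enough to verify directly. The following Python sketch is only illustrative: the function name `chi_square_variance_stat` is hypothetical, and the lower-tail critical value 2.73 (8 degrees of freedom, 0.05 level) is the table value quoted in the solution rather than something the code computes.

```python
def chi_square_variance_stat(n, s_sq, sigma0_sq):
    """Chi-square statistic (n - 1) * s^2 / sigma0^2 for H0: sigma^2 = sigma0^2."""
    return (n - 1) * s_sq / sigma0_sq

# Figures from the example: n = 9 tubes, s^2 = 0.45, hypothesized variance 0.6
chi2 = chi_square_variance_stat(9, 0.45, 0.6)
chi2_lower = 2.73                   # chi-square_{0.95} with 8 d.f., from a table
reject = chi2 < chi2_lower          # lower-tail test for H1: sigma^2 < 0.6
print(f"chi-square = {chi2:.1f}, reject H0: {reject}")
```

With the example's figures the expression 8(0.45)/0.6 evaluates to 6.0, which is above the lower-tail critical value, so the null hypothesis is not rejected.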
2 2
1, you will find little evidence to indicate that 1
and 2
are unequal. On the other hand, a
variance. More over it is possible to compare the variability of two population by testing the
equality of their variance.
That is test:
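$H_0: \sigma_1^2 = \sigma_2^2$
against
$H_1: \sigma_1^2 > \sigma_2^2$ or $\sigma_1^2 < \sigma_2^2$ or $\sigma_1^2 \ne \sigma_2^2$.
The decision to accept or reject H0 is made using the F-distribution, since upon repeated random
sampling the ratio $\frac{S_1^2}{S_2^2}$ has an F-distribution. When we write the ratio of the two sample variances, it
If $H_1: \sigma_1^2 > \sigma_2^2$, reject H0 when $F > F_\alpha(V_1, V_2)$.
If $H_1: \sigma_1^2 < \sigma_2^2$, reject H0 when $F < F_{1-\alpha}(V_1, V_2)$.
If $H_1: \sigma_1^2 \ne \sigma_2^2$, reject H0 when $F > F_{\alpha/2}(V_1, V_2)$ or $F < F_{1-\alpha/2}(V_1, V_2)$.
$F_\alpha(V_1, V_2)$ and $F_{\alpha/2}(V_1, V_2)$ are the table values of F leaving an area $\alpha$ and $\alpha/2$ to the right of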
where the units of measurement are 1,000 pounds per square inch. Assuming that the
measurements constitute independent random samples from two normal populations, test
the null hypothesis $\sigma_1^2 = \sigma_2^2$ against the alternative $\sigma_1^2 \ne \sigma_2^2$
at the 0.02 level of significance.
Solution
$H_0: \sigma_1^2 = \sigma_2^2$, $\alpha = 0.02$
$H_1: \sigma_1^2 \ne \sigma_2^2$
Test statistic:
$$F = \frac{S_1^2}{S_2^2} = \frac{19.2}{3.5} = 5.49$$
Chapter III
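1. a) yes, because $\int_{-1}^{2} \frac{x^2}{3}\,dx = 1$ b) 1/9
c) $F(x) = \begin{cases} 0, & x \le -1 \\ \frac{1}{9}(x^3 + 1), & -1 < x < 2 \\ 1, & x \ge 2 \end{cases}$
2. a) 1/4 b) 1/2 c) 1/2
3. a) 1/2 b) 1/3 c) 1/6
4. yes
5. a) 5.66
6. a) 1/3
8. a) 7/16 b) 5/16-5/9 c) $f(x) = \begin{cases} 2 - 2x, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}$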
Chapter IV
1. a) 27 b) 80.3 c) 56.8
3. 0.59049
4. a) 0.75
5. 0.6288
6. 0.3011
7. a) 0.061 b) 0.471
8. a) 0.193 b) 0.463
9. a) 0.0228 b) 50% c) 5.41%
Chapter V
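1. a) $f_1(x) = 4x^3$ b) $f_2(y) = 4y(1 - y^2)$ c) $f_1(x \mid y) = \begin{cases} \frac{2x}{1 - y^2}, & y < x < 1 \\ 0, & \text{other } x \end{cases}$
3. a) 1/210
b) $F_1(x) = \begin{cases} 0, & x < 2 \\ \frac{2x^2 + 5x - 18}{84}, & 2 \le x < 6 \\ 1, & x \ge 6 \end{cases}$ $F_2(y) = \begin{cases} 0, & y < 0 \\ \frac{y^2 + 16y}{105}, & 0 \le y < 5 \\ 1, & y \ge 5 \end{cases}$
d) 3/20 e) 23/28 f) 2/35 g) $F(x, y) = \frac{16y + y^2}{105}$ h) they are dependent
4. a) 1/18 b) $f_1(x) = \frac{1}{18}(4x + 3)$, x = 1, 2; $f_2(y) = \frac{1}{18}(2y + 6)$, y = 1, 2
5. a) $\frac{2x + y}{4x + 3}$ b) $\frac{2x + y}{2y + 6}$
6. a) $f(y \mid x) = \begin{cases} \frac{3 + 4xy}{3 + 2x}, & 0 < y < 1 \\ 0, & \text{other } y \end{cases}$ b) 9/16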
174
_________________________________________________________________________
ODA BULTUM UNIVERSITY, DEPARTMENT OF ECONOMICS
STATISTICS FOR ECONOMICS