Chapter 5 - Census and Sample Survey


DEBRE BERHAN UNIVERSITY

COLLEGE OF BUSINESS AND ECONOMICS


DEPARTMENT OF ECONOMICS

CHAPTER FIVE: CENSUS AND SAMPLE SURVEY

SOLOMON ESTIFANOS

FEBRUARY, 2024
DEBRE BERHAN, ETHIOPIA
Census and Sample Survey
 All items in any field of inquiry constitute a ‘universe’ or a
‘population’. A complete enumeration of all items in the
population is known as a census inquiry.
 In such an inquiry, when all items are covered, no element of
chance is left and the highest accuracy is obtained. This type of
inquiry involves a great deal of:
o Time,
o Money and
o Energy
 Government is perhaps the only institution which can get a
complete enumeration carried out, and even it does so only in
very rare cases. For example, the population census in our
country is carried out once in a decade (every 10 years).
Census Con’t . . .
 Many times undertaking a census survey is not possible.
 Sometimes it is possible to obtain sufficiently accurate results
by studying only a part of the total population, technically called
a sample.
 The process of selecting samples is called the sampling technique.
 In sampling, however, the samples selected should be as
representative of the total population as possible.
 A researcher must prepare a sample design for the study, i.e. s/he
must plan
o How a sample should be selected and
o What size such a sample would be
The Need for Sampling
Reasons for studying samples:
1. There could be resource (time, finance, manpower, etc.)
limitations which would make it difficult to study the whole
population.
2. In some cases, tests may be destructive. For example, when we
test the breaking strength of materials, we must destroy them.
A census would mean complete destruction of the materials. In
such a case, we must sample.
3. Sampling provides much quicker results than a census does.
When the time between recognizing the need for information and
having to use that information is short, sampling helps to
obtain it in time.
Need for Sampling Con’t . . .
4. Sampling is the only process possible if the population is
infinite, and often the only practical one if it is very large.
5. There is also an argument that the quality of a study is often
better with sampling than with a census. The basis of the
argument is that sampling allows:
o Better interviewing,
o More thorough investigation of missing, wrong, or
suspicious information,
o Better supervision, and
o Better processing than is possible with complete coverage.

Steps in Sampling Design
1. Type of Universe: The first step is to define the universe. The
universe can be finite or infinite.
 Finite Universe - the number of items is certain.
 Infinite Universe - the number of items is infinite, i.e. the
total number of items cannot be known.
2. Sampling Unit: A decision has to be taken concerning the sampling
unit before selecting a sample. The sampling unit may be
 A Geographical Unit - such as a district, Kebelle, village, etc., or
 A Social Unit - such as a family, school, etc., or it may be an
individual.
3. Source List: It is also known as the sampling frame, from which
the sample is to be drawn. It contains the names of all items of a
universe (for a finite universe). A source list should be
comprehensive, correct, reliable and appropriate.
4. Size of Sample: This is the number of items to be selected from the
universe. The size of the sample should be neither excessively large
nor too small. It should be optimum.
 In order to decide on the size of the sample to be selected, a
researcher must take into consideration:
o The size of the population variance,
o The size of the population,
o The parameters of interest in the research study, and
o The budgetary constraint.
5. Parameters of Interest: One must consider the question of the
specific population parameters which are of interest.

6. Budgetary Constraint: Cost considerations, from a practical
point of view, have a major impact on decisions relating to:
o Size of the sample
o Type of sample
This fact can even lead to the use of non-probability samples.

7. Sampling Procedure: There are several sample designs out
of which the researcher must choose one for the study.
o S/he must select the design which, for a given sample
size and for a given cost, has the smallest sampling error.

Criteria for Selecting a Sampling Procedure

 Two costs are involved in a sampling analysis:
1. The cost of collecting the data and
2. The cost of an incorrect inference resulting from the data.
 There are two causes of incorrect inferences, namely systematic
bias and sampling error.

1. Systematic Bias: Systematic bias results from errors in the
sampling procedures, and it cannot be reduced or eliminated by
increasing the sample size.
 Causes responsible for these errors can be detected and corrected.
 Bias enters when a sample fails to represent the population it
was intended to represent.
 Factors responsible for systematic bias:
o Inappropriate sampling frame
o Defective measuring device
o Non-respondents
o Indeterminacy principle
o Natural bias in reporting data

2. Sampling Errors: Sampling errors are the random variations in
the sample estimates around the true population parameters.
 Sampling error decreases as the size of the sample increases,
and it tends to be of a smaller magnitude in the case of a
homogeneous population.
 The measurement of sampling error is usually called the
‘precision of the sampling plan’.
 If we increase the sample size, the precision can be improved.
 But increasing the size of the sample has its own limitations.
 A large sample increases the cost of collecting data and
also enhances the systematic bias.
 Thus the effective way to increase precision is usually to select
a better sampling design which has a smaller sampling error
for a given sample size at a given cost.
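
The limits of simply enlarging the sample can be illustrated with a small sketch. It assumes simple random sampling from a large population, where the standard error of a sample mean is sigma / sqrt(n); this formula is not stated on the slide, and the value of sigma below is hypothetical.

```python
import math

def standard_error(sigma, n):
    """Standard error of a sample mean under simple random sampling."""
    return sigma / math.sqrt(n)

sigma = 10.0  # hypothetical population standard deviation (assumption)

# Quadrupling the sample size only halves the standard error.
errors = {n: standard_error(sigma, n) for n in (25, 100, 400)}
for n, se in errors.items():
    print(f"n = {n:3d}  standard error = {se:.2f}")
```

Because the error shrinks only with the square root of n while cost grows roughly in proportion to n, a better design is often the cheaper route to higher precision.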

Sampling error = Frame error + Chance error + Response error
Characteristics of a good sample design: A sample design must
 Result in a truly representative sample
 Result in a small sampling error
 Be viable in the context of the funds available for the research study
 Enable systematic bias to be controlled in a better way
 Be such that the results of the sample study can be applied, in
general, to the universe with a reasonable level of confidence

Determination of the Size

Types of Sampling Design


Sample designs are basically of two types:
o Probability Sampling - is based on the concept of random selection
o Non-Probability Sampling - is based on non-random selection
Probability Sampling Techniques
 Probability sampling is also known as Random Sampling or
Chance Sampling. Random sampling techniques can be divided
into:
 Simple random sampling and
 Complex random sampling

A. Simple Random Sampling
 Individuals are randomly drawn from the population at large.
 Under this method each unit in the universe has the same chance
of being included in the sample.
 Random sampling needs a ‘sampling frame’ (source list), i.e. a
complete and up-to-date list of all members of the population.
 For a homogeneous type of population, simple random sampling
is reliable.
Simple random sampling gives:
1. Each element in the population an equal probability of getting
into the sample, with all choices independent of one another.
2. Each possible sample combination an equal probability of
being chosen.
 We can define a simple random sample from a finite population as a
sample which is chosen in such a way that each of the NCn possible
samples has the same probability, 1/NCn, of being selected.
 For example, suppose a finite population consists of 4 elements
(a, b, c, d), i.e. N = 4, and that we want to take a sample of n = 2 from it.
 Then there are 4C2 = 6 possible distinct samples of the required size, and
they consist of the elements ab, ac, ad, bc, bd, and cd.
 If each sample has a probability of 1/6 of being chosen, we call
this a random sample.
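
The four-element example can be checked with a short Python sketch. It lists every distinct sample of size 2 and then draws one at random; the seed value is an arbitrary choice made here only so that the draw is repeatable.

```python
import itertools
import random

population = ["a", "b", "c", "d"]  # N = 4
n = 2                              # sample size

# Every distinct sample of size n, ignoring order: ab, ac, ad, bc, bd, cd.
all_samples = list(itertools.combinations(population, n))
probability_each = 1 / len(all_samples)

# random.sample draws without replacement, so each combination is equally likely.
rng = random.Random(2024)  # arbitrary seed (assumption, for repeatability)
drawn = tuple(sorted(rng.sample(population, n)))

print(len(all_samples), "possible samples, each with probability", probability_each)
print("sample drawn:", drawn)
```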
B. Complex Random Sampling

 Probability sampling under restricted sampling techniques may
result in complex random sampling designs.
 These are sometimes also called ‘Mixed Sampling Designs’, for
many such designs may represent a combination of
probability and non-probability procedures in selecting a
sample.
 Some of the popular complex random sampling designs are as
follows:
I. Systematic Sampling
II. Stratified Sampling
III. Cluster Sampling

I. Systematic Sampling
 In some instances, the most practical way of sampling is to select
every i-th item on a list.
The following steps will help:
 Assign a sequence number to each member of the population.
 Determine the skip interval by dividing the number of units in the
population by the sample size: I = P/S, where I is the skip interval, P
is the population size, and S is the sample size.
 Select a random starting point between 1 and I, for example
from a random digit table.
 Include that item in the sample and select every I-th item thereafter
until the total sample has been selected.
 For example, if we want to take a sample of 20 from a population
of 100 members, our skip interval is 5 (i.e. 100/20).
 Our starting point must be selected randomly from the interval
1 to 5. Then every fifth item will be in our sample. If our starting
point is 2, then our sample must include members with
sequence numbers 2, 7, 12, 17, 22, 27, . . . , 97.
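
The steps of systematic sampling can be sketched in Python as follows. The function name and parameters are hypothetical, and the sketch assumes the population size is an exact multiple of the sample size, as in the example of 100 members and a sample of 20.

```python
import random

def systematic_sample(population_size, sample_size, start=None, rng=None):
    """Return the 1-based sequence numbers picked by systematic sampling."""
    interval = population_size // sample_size  # skip interval I = P / S
    if start is None:
        rng = rng or random.Random()
        start = rng.randint(1, interval)  # random start between 1 and I
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the skip interval")
    # Take the starting item, then every I-th item after it.
    return [start + k * interval for k in range(sample_size)]

sample = systematic_sample(100, 20, start=2)
print(sample[:6], "...", sample[-1])
```

With a starting point of 2 and an interval of 5, the selected sequence numbers run 2, 7, 12, and so on up to 97.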

Advantages:
 The samples will spread evenly over the entire population
 It is also an easier and less costly method of sampling
 Can be conveniently used even in case of large populations

 However, if there is a hidden periodicity in the population,
systematic sampling will prove to be an inefficient method of
sampling.
II. Stratified Sampling:
 If a population from which a sample is to be drawn does not
constitute a homogeneous group, stratified sampling is generally
applied.
 Under stratified sampling, the population is divided into several
subpopulations (strata) that are individually more homogeneous than
the total population and then we select items from each stratum to
constitute a sample.
The basic steps for stratified sampling are:
 Divide the population to be surveyed into strata of similar study
units or into areas within which similar social, environmental, or
economic conditions exist.
 Make a separate and complete list of the study units in each stratum,
and from each stratum draw a separate random sample of study units
using these lists.
 A similar survey is then done on the sample of study units in each of
the strata, i.e. the same questionnaire is used.
Advantages:
(i) For the same sample size, more reliable information is obtained
from a stratified sample than from an unstratified sample of the
population as a whole.
(ii) Comparisons between strata are easy because a separate but
similar survey is done in each stratum.
The following three questions are highly relevant in the context of stratified sampling:

A. How to Form Strata?


 Strata are formed on the basis of common characteristic(s) of the items to be
put in each stratum, i.e. elements should be most homogeneous within
each stratum and most heterogeneous between the different strata.
 Strata are purposively formed, usually based on the past experience and
personal judgment of the researcher.
B. How Should Items be Selected From Each Stratum?

 For selection of items from each stratum, we may use simple
random sampling. Systematic sampling can also be used if it is
considered more appropriate in certain situations.

C. How Many Items Should be Selected From Each Stratum, or
How to Allocate the Sample Size to Each Stratum?
 The usual approach is proportional allocation, under which the sizes of
the samples from the different strata are kept proportional to
the sizes of the strata.
 That is, if Pi represents the proportion of the population included
in stratum i, and n represents the total sample size, the number
of elements selected from stratum i is n × Pi.
 For example, suppose a sample of size n = 30 is
to be drawn from a population of size N = 800 which is divided
into three strata of size N1 = 400, N2 = 240, and N3 = 160.
 The sample size for stratum with N1 = 400 is n1 = 30(400/800) = 15.
 The sample size for stratum with N2 = 240 is n2 = 30(240/800) = 9.
 The sample size for stratum with N3 = 160 is n3 = 30(160/800) = 6.
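
Proportional allocation is simple enough to sketch in a few lines of Python. The function name is hypothetical. The sketch returns unrounded values; in practice a researcher would still need to round them to whole numbers that add up to n.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate a total sample of n across strata in proportion to stratum size."""
    total = sum(stratum_sizes)
    return [n * size / total for size in stratum_sizes]

# N1 = 400, N2 = 240, N3 = 160 and n = 30, as in the example.
allocation = proportional_allocation([400, 240, 160], 30)
print(allocation)
```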

 In cases where strata differ not only in size but also in variability,
and it is considered reasonable to take larger samples from more
variable strata and smaller samples from less variable strata,
we can account for both (differences in stratum size and
differences in stratum variability) by using a disproportionate sampling
design with the formula:

ni = n × Ni σi / (N1 σ1 + N2 σ2 + . . . + Nk σk)

where Ni is the size of stratum i, σi is its standard deviation, and
ni is the sample size allocated to it.

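 For example, assume a population is divided into three strata so that
N1 = 5000, N2 = 2000, and N3 = 3000. The respective standard deviations
are σ1 = 15, σ2 = 18, and σ3 = 5 (the values used for this example in
Kothari, 2004). How should a sample of size n = 84 be allocated to the
three strata if we want optimum allocation using a disproportionate
sampling design?
 The solution will be:
n1 = 84(5000)(15) / [(5000)(15) + (2000)(18) + (3000)(5)] = 6,300,000/126,000 = 50
n2 = 84(2000)(18) / 126,000 = 3,024,000/126,000 = 24
n3 = 84(3000)(5) / 126,000 = 1,260,000/126,000 = 10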
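
A minimal Python sketch of optimum (disproportionate) allocation follows. It assumes the usual rule that each stratum's share is proportional to its size times its standard deviation. The function name is hypothetical, and the standard deviations 15, 18, and 5 are illustrative inputs assumed for this sketch.

```python
def optimum_allocation(stratum_sizes, stratum_sds, n):
    """Allocate n across strata in proportion to N_i * sigma_i."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [n * w / total for w in weights]

sizes = [5000, 2000, 3000]  # N1, N2, N3
sds = [15, 18, 5]           # assumed standard deviations (illustrative)
allocation = optimum_allocation(sizes, sds, 84)
print(allocation)
```

Note how the second stratum, although the smallest, receives more than a proportional share because it is the most variable.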

NB.
 Each stratum, in stratified sampling, is homogeneous internally and
heterogeneous with other strata.
 The more strata used, the closer you come to maximizing inter-
strata differences and minimizing intra-stratum variances.
III. Cluster Sampling

 If the total area of interest happens to be a big one, a
convenient way in which a sample can be taken is to divide the
area into a number of smaller non-overlapping areas and
then to randomly select a number of these smaller areas
(clusters).
 In cluster sampling, the total population is divided into a
number of relatively small subdivisions which are themselves
clusters of still smaller units, and then some of these clusters
are randomly selected for inclusion in the overall sample.
 Cluster sampling reduces cost by concentrating surveys in
selected clusters, but it is generally less precise than simple
random sampling.
Differences between stratified sampling and cluster sampling

Stratified Sampling
1. We divide the population into a few subgroups, each with many
elements in it. The subgroups are selected according to some
criterion that is related to the variables under study.
2. We try to secure homogeneity within subgroups and
heterogeneity between subgroups.
3. We randomly choose elements from within each subgroup.

Cluster Sampling
1. We divide the population into many subgroups, each with a
few elements in it. The subgroups are selected according to some
criterion of ease or availability in data collection.
2. We try to secure heterogeneity within subgroups and homogeneity
between subgroups, but we usually get the reverse.
3. We randomly choose a number of subgroups, which we then
typically study in toto.
✓ Area Sampling: If clusters happen to be geographic
subdivisions, cluster sampling is better known as area sampling.
• In other words, cluster designs in which the primary sampling unit
represents a cluster of units based on geographic area are
distinguished as area sampling.

✓ Multi-Stage Random Sampling: Multi-stage sampling is a further
development of the principle of cluster sampling. Suppose we want
to investigate the working efficiency of health stations in Ethiopia,
and we want to take a sample of a few health stations for this purpose.
▪ The first stage is to select large primary sampling units, such as
regional states in the country.
▪ Then we may select certain zones within them in the second stage.
▪ Then we may again select districts (woredas) and survey all
health stations in the selected districts. This would represent a
three-stage sampling design.
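
The three stages can be sketched in Python on a made-up sampling frame. Everything in the frame (region, zone, woreda, and health station labels, and the number of units at each level) is hypothetical and chosen only to keep the example small. The seed is arbitrary.

```python
import random

# Hypothetical frame: region -> zone -> woreda -> list of health stations.
frame = {
    f"Region {r}": {
        f"Zone {r}{z}": {
            f"Woreda {r}{z}{w}": [f"HS-{r}{z}{w}-{k}" for k in (1, 2)]
            for w in "ab"
        }
        for z in "12"
    }
    for r in "ABC"
}

def three_stage_sample(frame, n_regions, n_zones, n_woredas, rng):
    """Select regions, then zones, then woredas; keep all stations in each woreda."""
    stations = []
    for region in rng.sample(sorted(frame), n_regions):            # stage 1
        zones = frame[region]
        for zone in rng.sample(sorted(zones), n_zones):            # stage 2
            woredas = zones[zone]
            for woreda in rng.sample(sorted(woredas), n_woredas):  # stage 3
                stations.extend(woredas[woreda])
    return stations

rng = random.Random(5)  # arbitrary seed (assumption, for repeatability)
selected = three_stage_sample(frame, n_regions=2, n_zones=1, n_woredas=1, rng=rng)
print(selected)
```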
✓ Sampling with Probability Proportional to the Cluster Size:

• When the cluster sampling units do not have the same, or
approximately the same, number of elements, it is appropriate to
use a selection process in which the probability of each cluster
being included in the sample is proportional to the size of the cluster.
• For this purpose, we have to list the number of elements in each
cluster, irrespective of the method of ordering the clusters.
• Then we must sample systematically the appropriate number of
elements from the cumulative totals.

 For illustration, suppose the following are the numbers of
departmental stores in 10 cities: 35, 17, 10, 32, 80, 18, 26, 19, 26,
and 57. If we want to select a sample of 8 stores, using cities as
clusters and selecting within clusters proportional to size, how many
stores from each city should be chosen? (Use a starting point of 8.)
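City number   Number of departmental stores   Cumulative total   Sample
1             35                              35                 8
2             17                              52                 48
3             10                              62
4             32                              94                 88
5             80                              174                128, 168
6             18                              192
7             26                              218                208
8             19                              237
9             26                              263                248
10            57                              320                288

With 320 stores in total and a sample of 8, the skip interval is 320/8 = 40,
so the selected positions in the cumulative totals are 8, 48, 88, . . . , 288.
One store is therefore chosen from each of cities 1, 2, 4, 7, 9, and 10,
two from city 5, and none from cities 3, 6, and 8.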
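
The selection from the cumulative totals can be reproduced with a short Python sketch. The function name is hypothetical, and the sketch assumes the total number of elements is an exact multiple of the sample size, as it is here.

```python
from bisect import bisect_left
from itertools import accumulate

def pps_systematic(cluster_sizes, sample_size, start):
    """Systematic selection over cumulative totals (probability proportional to size)."""
    cumulative = list(accumulate(cluster_sizes))
    interval = cumulative[-1] // sample_size  # skip interval
    points = [start + k * interval for k in range(sample_size)]
    counts = [0] * len(cluster_sizes)
    for point in points:
        # The first cluster whose cumulative total reaches the selected point.
        counts[bisect_left(cumulative, point)] += 1
    return points, counts

stores = [35, 17, 10, 32, 80, 18, 26, 19, 26, 57]
points, counts = pps_systematic(stores, sample_size=8, start=8)
for city, count in enumerate(counts, start=1):
    print(f"city {city}: {count} store(s)")
```

Larger cities cover a wider stretch of the cumulative totals, so they are more likely to contain one or more of the selected points.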
Comparison of Probability Sampling Designs
Simple random sampling
  Random selection: Sample members are selected individually from the population.
  Other characteristics: Each population element has an equal chance of being selected.
  Disadvantages: Requires a listing of population elements; expensive and requires more time to implement.

Systematic sampling
  Random selection: The initial sample member is individually selected.
  Other characteristics: Designation of the initial sample member determines the entire sample.
  Disadvantages: Periodicity within the population may skew the sample and the results.

Stratified sampling
  Random selection: Sample members are selected individually within each of the subpopulations or strata.
  Other characteristics: All strata are represented in the sample, most frequently by proportional allocation.
  Disadvantages: Creating strata on the population is expensive.

Cluster sampling
  Random selection: Clusters of members are selected from the larger population of clusters.
  Other characteristics: All members of the selected clusters are included in the sample; not all clusters are included.
  Disadvantages: Often lower statistical efficiency (more error) due to subgroups being homogeneous rather than heterogeneous.

Non-Probability Sampling Techniques

 Non-probability sampling procedures provide only a weak
basis for generalization.
 In reality, the conclusions drawn from a study of a
non-probability sample are limited to that sample and cannot be
used for further generalization.
 In this type of sampling, items for the sample are selected
deliberately by the researcher; his or her choice concerning the items
remains supreme.
 Thus the judgment of the organizers of the study plays an
important part in this sampling design.
 The personal element has a great chance of entering into the
selection of the sample. Sampling error in this type of
sampling cannot be estimated, and the element of bias, great
or small, is always there.
Some of the non-probability sampling techniques are:
 Judgment (Purposive) Sampling - the researcher uses his or her
judgment to select people who are felt to be representative of the
population or to have particular expertise or knowledge which makes
them suitable.
 Convenience (Accidental) Sampling - the most convenient
population is chosen, which may be the researcher's friends, work
colleagues, or students from a nearby college. This method is often
used to save time and resources.
 Quota Sampling - the researcher selects a predetermined number
of individuals from different groups (e.g. based on age, gender, etc.).
 Referral Sampling - a non-probability sampling technique
which utilizes some form of referral, wherein respondents who are
initially contacted are asked to supply the names and addresses of
other members of the target population.
References in Use

 C.R. Kothari. 2004. Research Methodology: Methods and Techniques. Second Revised Edition.
