CH 3 Sampling
Malede m.(Bsc.)
1
Objectives
2
Sampling
Why sample?
3
Why sample?
Cost in terms of money, time and manpower
Accessibility
Utility, e.g. to run a diagnostic laboratory test you
don't draw all of the patient's blood.
A census is a study that covers the entire population.
Even though a census is not foolproof, it gives detailed
information about every small area of the population.
It has the following disadvantages:
Expensive
Takes a long time
Cumbersome and therefore often inaccurately done (a careful sample
can produce more accurate data than a census).
4
Sampling…..
Sampling is the process of selecting a representative sample
from a population.
It involves selecting cases (elements), or locating people (or other units of analysis),
from a target population in order to study the population.
[Figure: sampling draws a sample from the population; inference generalizes from the sample back to the population]
5
Cont’d
The process of obtaining information from a subset (sample) of a larger
group (population)
The results for the sample are then used to make estimates of the larger
group
Faster and cheaper than asking the entire population
Two keys
1. Selecting the right people
Have to be selected scientifically so that they are representative of the population
2. Selecting the right number of the right people
To minimize sampling error, i.e. choosing the wrong people by chance
6
Population Vs. Sample
[Figure: a sample drawn from the population of interest. A parameter describes the population; a statistic describes the sample.]
7
Characteristics of Good Samples
o Representation
Sample surveys are almost never conducted for the
purposes of describing the particular sample under
study. Rather they are conducted for purposes of
understanding the larger population from which the
sample was initially selected
10
Basic term cont’d….
11
Basic Terms cont’d…
12
Basic Terms cont’d…
13
Basic term cont’d….
14
Basic term cont’d….
15
Hierarchy of sampling
16
Errors in a Statistical Study
Sampling (random) errors
Non-sampling (systematic) errors
17
1. Sampling error
18
Sampling error cont’d…
19
Sampling error cont’d…
22
Non-sampling Error……
o The basic types of non-sampling error
Non-response error
Response or data error
o A non-response error occurs when units selected as part of the
sampling procedure do not respond in whole or in part
If non-respondents are not different from those that did
respond, there is no non-response error
When non-respondents constitute a significant proportion of
the sample (about 15% or more
23
Non-sampling Error…….
o A response or data error is any systematic bias
that occurs during data collection, analysis or
interpretation
Respondent error (e.g., lying, forgetting, etc.)
Interviewer bias
Recording errors
Poorly designed questionnaires
24
Non-Sampling Error cont’d …
Random error can distort the results in either direction but
tends to balance out on average
Thus, the total survey error
25
Advantage of sampling
26
Disadvantage of Sampling
If the population is very large and there are many sections and
subsections, the sampling procedure becomes very complicated
27
Characteristics Of A Good Sample Design
From what has been stated above, the characteristics
of a good sample design are:
The sample design must result in a truly representative sample.
The sample design must result in a small sampling error.
The sample design must be viable given the funds available for
the research study.
The sample design must allow systematic bias to be
controlled.
The sample should be such that the results of the sample study can be
applied, in general, to the universe with a reasonable level of
confidence.
28
Types of Sampling
29
Types of Sampling Methods
[Figure: tree of sampling methods; recoverable labels: Convenience, Multistage Random Sampling]
30
Probability Sampling Method …
31
Probability Sampling Method cont’d …
In probability sampling:
A sampling frame exists or can be compiled.
Every element of the population should have an equal, or at least a known and nonzero, chance
of being included in the sample.
Generalization is possible (from sample to population).
The probability sampling methods are:
Simple Random Sampling
Systematic Sampling
Stratified Random Sampling
Cluster Sampling
Multistage Sampling
32
1. Simple Random Sampling(SRS)
33
Simple Random Sampling cont’d …
34
Simple Random Sampling cont’d …
The lottery method is appropriate if the total population is not too
large; for a very large population it becomes very difficult to use.
In that case, a table of random numbers or computer-generated random
numbers is the feasible method.
Sampling schemes may be
o without replacement: no element can be selected more than once in the
same sample; there are $\binom{N}{n}$ possible samples.
o with replacement: an element may appear multiple times in the one sample;
there are $N^n$ possible samples.
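The two schemes can be sketched in Python. This is only an illustration: the function names, the `seed` parameter and the ten-element frame are assumptions made for the example, and the with-replacement count treats draws as ordered.

```python
import math
import random

def count_samples(N, n):
    """Return (without-replacement, with-replacement) sample counts."""
    without_replacement = math.comb(N, n)  # unordered samples, no repeats
    with_replacement = N ** n              # ordered draws, repeats allowed
    return without_replacement, with_replacement

def simple_random_sample(frame, n, replace=False, seed=None):
    """Draw n elements from the sampling frame (a list)."""
    rng = random.Random(seed)  # seed is optional, only for reproducibility
    if replace:
        return rng.choices(frame, k=n)  # an element may appear more than once
    return rng.sample(frame, n)         # every element appears at most once

frame = list(range(1, 11))   # hypothetical frame of 10 numbered units
print(count_samples(10, 3))  # (120, 1000)
```

The computer-generated draw plays the role of the lottery or the random number table when the frame is too large to handle by hand.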
35
Example
Though the population is homogeneous, a sampling frame may not be
available. So what?
36
2. Systematic Random Sampling
37
Steps in systematic sampling:
38
E.g. systematic sampling
• N = 1200, and n = 60
sampling interval k = N/n = 1200/60 = 20
• List persons from 1 to 1200
• Randomly select a number between 1 and 20
(e.g. 8)
• 1st person selected = the 8th on the list
• 2nd person = 8 + 20 = the 28th on the list, etc.
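The example above can be reproduced with a few lines of Python. This is a minimal sketch that assumes N is an exact multiple of n; the function name and the `start` and `seed` parameters are made up for the illustration.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the list positions (1-based) chosen by systematic sampling."""
    k = N // n  # step between selected units (N/n); assumes N divisible by n
    if start is None:
        # random start between 1 and k, as in the example
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(n)]

positions = systematic_sample(1200, 60, start=8)
print(positions[:3], positions[-1])  # [8, 28, 48] 1188
```

With a random start of 8 and a step of 20, the last selected person is the 1188th on the list, so all 60 positions stay within the 1200 listed persons.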
39
Systematic sampling….
40
Though a frame is available, the population may
not be homogeneous. So what?
41
3. Stratified Random Sampling
So, you divide your population into male and female members and
randomly select the required sample size within each subgroup
(or "stratum")
43
Steps involved in the stratified sampling method:
Define the population
Determine the desired sample size
Identify the variable and subgroups (strata) for which you want to guarantee
appropriate representation (either proportional or equal)
Then the total sample size will be the sum of all samples from each subgroup.
44
There are two methods to get the study subject from each subgroup,
proportional allocation or
equal allocation.
We use the proportional allocation technique when our subgroups vary dramatically in size
in our population.
Let N be the total population and N1, N2, ..., Nk be the population sizes of strata 1, 2,
..., k respectively. Moreover, let n be the total sample size and n1, n2, ..., nk be the
subsamples for strata 1, 2, ..., k respectively, so that N = N1 + N2 + ... + Nk
and n = n1 + n2 + ... + nk.
Then the subsample ni to be selected from subgroup Ni can be computed by
$$n_i = \frac{n \, N_i}{N}, \quad i = 1, 2, 3, \ldots, k$$
45
The higher the population in the subgroup, the higher the
sample size will be.
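Proportional allocation, ni = n × Ni / N, can be sketched as follows. The strata sizes (600 males, 400 females) and the total sample size of 100 are hypothetical numbers chosen for the illustration, not values from this chapter.

```python
def proportional_allocation(strata_sizes, n):
    """Allocate a total sample of size n to strata in proportion to their size."""
    N = sum(strata_sizes.values())  # total population, N = N1 + ... + Nk
    # n_i = n * N_i / N, rounded to the nearest whole subject
    return {name: round(n * N_i / N) for name, N_i in strata_sizes.items()}

strata = {"male": 600, "female": 400}  # hypothetical strata sizes
allocation = proportional_allocation(strata, n=100)
print(allocation)  # {'male': 60, 'female': 40}
```

Because each ni is rounded, the subsamples may add up to slightly more or less than n when the proportions are not whole numbers, so the total should be checked after allocation.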
46
Advantage of stratified sampling over simple random sampling
DEMERIT
Sampling frame for the entire population has to be prepared
separately for each stratum.
47
4. Cluster Random Sampling
48
Steps in cluster sampling are:
49
Consider the following graphical display:
50
5. Multistage Random Sampling
51
Non-Probability Sampling Method
52
Cont’d……….
Advantages
Cheaper and faster than probability sampling
Reasonably representative if collected in a thorough manner
53
1. Judgment Sampling/ Purposive sampling
54
2. Convenience Sampling
55
Cont’d………..
56
3. Quota sampling
57
Cont’d
58
Cont’d
In quota sampling the selection of the sample is non-random.
59
4. Snowball sampling
60
Cont’d
While this technique can dramatically lower search costs,
it comes at the expense of introducing bias because the
technique itself reduces the likelihood that the sample will
represent a good cross section from the population.
61
Sample Size Determination
The answer will depend on the aims, nature and scope of the
study and on the expected result, all of which should be
carefully considered at the planning stage.
62
Sample……
o If the sample size ("n") is large
Accuracy increases
Costly / complex
o If the sample size is small
Accuracy decreases
Less costly
o Take the optimum sample size. How?
63
Factors to determine sample size
Size of population
Resources – subjects, financial, manpower
Method of Sampling- random, stratified
Degree of difference to be detected
Variability (S.D.) – pilot study, historical
Degree of Accuracy (or errors)
- Type I error (alpha) p<0.05
- Type II error (beta) less than 0.2 (20%)
- Power of the test : more than 0.8 (80%)
Statistical Formulae
Dropout rate, non-compliance to Rx
64
o Sample size determination depending on outcome variables.
65
The third category covers continuous response variables
such as birth weight, age at first marriage, blood
pressure and serum uric acid level, for which
numerical measurements are usually made.
66
Sample Size………...
67
Sample for Single population
To estimate the sample size for a single survey using simple
or systematic random sampling, we need to know:
o Estimate of the prevalence of the outcome
o Precision desired
o Design effect
o Size of total population
o Level of confidence (commonly 95%)
68
Sample size for single population mean
69
Maximum acceptable difference (w): This is the maximum
amount of error that you are willing to accept.
Desired confidence level ($z_{\alpha/2}$): is your level of certainty that
the sample mean does not differ from the true population mean
by more than the maximum acceptable difference. Commonly
we use a 95% confidence level.
Then the sample size determination formula for single
population mean is defined by:
$$n = \frac{z_{\alpha/2}^2 \, \sigma^2}{w^2}$$
70
Sample size for single population mean cont’d…
Where
α = The level of significance, which can be obtained as 1 −
confidence level
σ = Standard deviation of the population
w = Maximum acceptable difference
z α/2 = The value from the standard normal table for the
given confidence level
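The calculation for a single population mean, n = z²σ²/w² with z taken at α/2, can be sketched in Python. The standard deviation of 15 and the maximum acceptable difference of 3 are hypothetical inputs for illustration, and the result is rounded up to the next whole subject.

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, w, confidence=0.95):
    """Sample size for estimating a single population mean."""
    alpha = 1 - confidence                   # level of significance
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z at alpha/2, about 1.96 for 95%
    return math.ceil((z * sigma / w) ** 2)   # n = z^2 * sigma^2 / w^2, rounded up

print(sample_size_mean(sigma=15, w=3))  # 97
```

Halving w roughly quadruples the required sample size, because w enters the formula squared.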
71
Sample Size for Single Population Proportion
72
Then the formula for the sample size of single population proportion is
defined as:
$$n = \frac{z_{\alpha/2}^2 \, p(1 - p)}{w^2}$$
73
Example 1
An MPH student wants to conduct research on the prevalence of ANC utilization
among mothers in Dabat district. Given that a previous study found the prevalence
to be 45.7%, what sample size should he take to address his objective?
Solution:
P = 0.457
Margin of error w = 5% = 0.05
A confidence level of 95% gives Zα/2 = 1.96.
Then, using the formula:
$$n = \frac{Z_{\alpha/2}^2 \, P(1 - P)}{w^2} = \frac{1.96^2 \times 0.457 \times (1 - 0.457)}{0.05^2} = \frac{1.96^2 \times 0.457 \times 0.543}{0.05^2} \approx 381.3$$
Rounding up, n = 382.
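The arithmetic of Example 1 can be checked with a short Python sketch. The function name is made up for the illustration; z = 1.96 is the 95% table value used in the example.

```python
import math

def sample_size_proportion(p, w, z=1.96):
    """Sample size for estimating a single population proportion."""
    # n = z^2 * p * (1 - p) / w^2, rounded up to the next whole subject
    return math.ceil(z ** 2 * p * (1 - p) / w ** 2)

print(sample_size_proportion(p=0.457, w=0.05))  # 382
```

The product p(1 − p) is largest at p = 0.5, so using 0.5 gives the most conservative (largest) sample size for a given margin of error.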
74
Some Considerations
The final sample size will be corrected for
nonresponse, loss to follow-up, lack of compliance and so on.
Consider the total size of the population (N): if N < 10000, then we
need to correct the sample size using the formula
$$n_f = \frac{n_o}{1 + \frac{n_o}{N}}$$
where n_o is the initial sample size and n_f is the corrected (final) sample size.
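The population-size correction, final n = initial n / (1 + initial n / N), can be sketched as below. The population of 5000 is a hypothetical figure, and rounding up is a choice made for this illustration.

```python
import math

def finite_population_correction(n0, N):
    """Correct an initial sample size n0 for a finite population of size N."""
    return math.ceil(n0 / (1 + n0 / N))  # n_f = n0 / (1 + n0/N), rounded up

# Initial sample size of 382 with a hypothetical population of 5000
print(finite_population_correction(382, 5000))  # 355
```

For a very large N the term n0/N approaches zero and the corrected size stays essentially equal to the initial one, which is why the correction matters only for smaller populations.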
75
Incorrect sample size will lead to
o Wrong conclusions
o Poor quality research (Errors)
o Type II error can be minimized by increasing the sample size
o Waste of resources
o Loss of money
o Ethical problems
o Delay in completion
76
Example 2
77
78