Sampling
Sampling is the process of selecting a number of individuals for a study in such a way that the individuals
represent the larger group from which they were selected.
A sample is “a smaller (but hopefully representative) collection of units from a population used to
determine truths about that population” (Field, 2005).
The sampling frame is a list of all elements or other units containing the elements in a population.
The population is the larger group from which the units or individuals are selected to take part in a study.
The target population is a set of elements larger than or different from the population sampled, to which
the researcher would like to generalize the study findings.
Quantitative Sampling
Important issues
Representation – the extent to which the sample is representative of the population
Generalization – the extent to which the results of the study can be reasonably extended from the
sample to the population
Sampling error – the chance occurrence that a randomly selected sample is not representative of the
population due to errors inherent in the sampling technique
Sampling bias – some aspect of the researcher’s sampling design creates bias in the data
Three fundamental steps
Identify a population
Define the sample size
Select the sample
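The three steps above can be sketched in a few lines of Python. This is only an illustration: the population of 500 ID numbers, the sample size of 25, and the seed are made-up values, not figures from these notes.

```python
import random

# Step 1: identify the population (hypothetical ID numbers 1..500).
population = list(range(1, 501))

# Step 2: define the sample size (an arbitrary choice for this sketch).
sample_size = 25

# Step 3: select the sample by chance, without replacement.
rng = random.Random(42)  # fixed seed only so the sketch is repeatable
sample = rng.sample(population, sample_size)

print(sorted(sample))
```

Because `random.Random.sample` draws without replacement, no individual can appear in the sample twice.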
SAMPLING TECHNIQUES
PROBABILITY SAMPLING
If data is to be used to make decisions about a population, then how the data is collected is critical. For
sample data to provide reliable information about a population of interest, the sample must be
representative of that population. Selecting samples from the population using chance provides a
mechanism to make the samples representative.
Probability sampling is the best method for achieving a representative sample.
A sample survey in which every member of the population has a known, nonzero chance of being
selected into the sample is called a probability sample. Probability samples are meant to ensure that
the segment taken is representative of the entire population.
Data collected from these probability sampling-based surveys yield estimates of characteristics of the
population that these surveys attempt to describe.
3. Systematic sampling: elements are selected from the population at a uniform interval that is measured
in time, order, or space.
4. Cluster sampling: the population is divided into groups, called clusters; a random sample of clusters is
selected; and the sampled clusters are then subjected to complete enumeration, that is, everyone in the
sampled clusters is made part of the sample.
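As a rough sketch of how systematic and cluster sampling differ in practice, the Python below draws both kinds of sample from a made-up population of 100 numbered units. The function names, the block labels, and the sizes are hypothetical choices for illustration only.

```python
import random

def systematic_sample(units, n, rng):
    """Pick every k-th unit after a random start, where k = len(units) // n."""
    k = len(units) // n
    start = rng.randrange(k)  # random starting point within the first interval
    return [units[start + i * k] for i in range(n)]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Hypothetical population: units numbered 1..100.
units = list(range(1, 101))
sys_sample = systematic_sample(units, 10, rng)

# Hypothetical clusters: ten blocks of ten units each.
clusters = {f"block_{b}": list(range(b * 10 + 1, b * 10 + 11)) for b in range(10)}
clu_sample = cluster_sample(clusters, 2, rng)

print(sys_sample)
print(clu_sample)
```

In the systematic sample, consecutive selections are always the same distance apart; in the cluster sample, every unit of each chosen block is included.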
Data from the sample are used to calculate statistics, which are estimates of the corresponding population
parameters.
The descriptive measures computed from a population are called parameters while the descriptive
measures computed from a sample are called statistics.
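The distinction between parameters and statistics can be made concrete with a small sketch. The population below is simulated (1,000 made-up values), so the numbers themselves mean nothing; the point is only which quantities are computed from the whole population and which from a sample.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed only so the sketch is repeatable

# Hypothetical population of 1,000 measurements.
population = [rng.gauss(50, 10) for _ in range(1000)]

# Parameters: descriptive measures computed from the whole population.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)  # population standard deviation

# Statistics: descriptive measures computed from a sample.
sample = rng.sample(population, 40)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

print(f"parameters: mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"statistics: x_bar = {x_bar:.2f}, s = {s:.2f}")
```

Rerunning with a different seed changes the statistics from sample to sample, while the parameters of a fixed population stay the same.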
In probability sampling each member of the population has a positive and measurable chance of inclusion
in the sample. These inclusion probabilities serve as the bridge from sample to population.
NON-PROBABILITY SAMPLING
3. Volunteer sampling: sample units consist of volunteers, in studies in which the measuring process is
painful or troublesome to the respondent.
4. Purposive sampling: an expert selects a representative sample according to his or her own subjective
judgment.
5. Quota sampling: sample units are picked for convenience, but certain quotas (such as the number of
persons to interview) are given to interviewers. This design is used especially in market research.
6. Snowball sampling: sample units are identified by asking previously picked sample units for other
people who can be added to the sample. This is usually used when the topic is not common or the
population is hard to access.
SAMPLING DISTRIBUTION OF THE SAMPLE MEANS
The number of samples of size n that can be drawn without replacement from a population of size N is given by
the combination $_{N}C_{n} = \binom{N}{n} = \frac{N!}{n!\,(N-n)!}$.
A sampling distribution of sample means is a frequency distribution of the means computed from all possible
random samples of a specific size taken from a population.
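The count of possible samples can be checked directly. The sketch below uses Python's `math.comb` and, for a small case, confirms the count by listing every sample; the values N = 5, n = 2 and N = 50, n = 5 are chosen only for illustration.

```python
import math
from itertools import combinations

# Number of samples of size n from a population of size N (order ignored,
# no repeated units).
N, n = 5, 2
count_small = math.comb(N, n)

# For a small population, the samples can be listed outright.
samples = list(combinations(range(1, N + 1), n))

# The count grows quickly with N.
count_large = math.comb(50, 5)

print(count_small, len(samples))  # 10 10
print(count_large)                # 2118760
```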
Illustrative Problem:
1. Consider a population consisting of the values 1, 2, 3, 4, and 5. Suppose samples of size 2 are drawn from
this population.
a) What are the mean and variance of the sampling distribution of the sample means?
b) Compare these values to the mean and variance of the population.
c) Draw the histogram of the sampling distribution of the sample means.
A mean computed from a sample drawn from a population is expected to take different values from one sample
to another. The sample mean is therefore a random variable whose value depends on the particular sample drawn.
Being a random variable, it has a probability distribution. The probability distribution of the sample means is also
called the sampling distribution of the sample means.
The difference between the sample mean and the population mean is called the sampling error. It is the
error due to sampling.
Sample Mean $\bar{X}$ | Frequency | Probability $P(\bar{X})$ | $\bar{X} \cdot P(\bar{X})$ | $\bar{X} - \mu$ | $(\bar{X} - \mu)^2$ | $P(\bar{X}) \cdot (\bar{X} - \mu)^2$
Total
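The illustrative problem for the population 1, 2, 3, 4, 5 can be worked by brute force, filling in the frequency, probability, and squared-deviation columns for every possible sample mean. The sketch below assumes samples of size 2 are drawn without replacement (consistent with counting samples by combinations); exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [1, 2, 3, 4, 5]
n = 2

# Population mean and variance.
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# Mean of every possible sample of size n (without replacement).
means = [Fraction(sum(s), n) for s in combinations(population, n)]
freq = Counter(means)
total = len(means)

# One row per distinct sample mean: (mean, frequency, probability, mean * probability).
rows = []
for x_bar in sorted(freq):
    p = Fraction(freq[x_bar], total)
    rows.append((x_bar, freq[x_bar], p, x_bar * p))

mean_of_means = sum(r[3] for r in rows)
var_of_means = sum(p * (x_bar - mean_of_means) ** 2 for x_bar, _, p, _ in rows)

for x_bar, f, p, xp in rows:
    print(float(x_bar), f, p, xp, p * (x_bar - mean_of_means) ** 2)
print("mean of sample means:", mean_of_means)
print("variance of sample means:", var_of_means)
print("population mean and variance:", mu, sigma2)
```

Under these assumptions the sample means have mean 3 and variance 3/4, while the population has mean 3 and variance 2: the two means agree, and the sampling distribution is less spread out than the population.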
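$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}$$
for a finite population, where $\sqrt{\frac{N-n}{N-1}}$ is the finite population correction factor, and
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
for an infinite population.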
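A minimal sketch of both formulas in Python follows. The function name `standard_error` and its optional `N` argument are hypothetical; the example reuses the population 1, 2, 3, 4, 5 from the illustrative problem, whose variance is 2.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard deviation of the sample mean.

    If the population size N is given, the finite population
    correction factor sqrt((N - n) / (N - 1)) is applied.
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

sigma = math.sqrt(2)  # population standard deviation of 1, 2, 3, 4, 5

se_finite = standard_error(sigma, n=2, N=5)  # with the correction factor
se_infinite = standard_error(sigma, n=2)     # infinite-population formula

print(se_finite ** 2)  # variance of the sample means, about 0.75
print(se_infinite)     # about 1.0
```

The finite-population result squares to about 0.75, which matches the variance obtained by listing all ten samples of size 2.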
The standard deviation ($\sigma_{\bar{X}}$) of the sampling distribution of the sample means is also known as the
standard error of the mean. It measures the degree of accuracy of the sample mean ($\bar{X}$) as an estimate of the
population mean ($\mu$).
Illustrative Problems:
1. The average time it takes a group of college students to complete a certain examination is 46.2 minutes.
The standard deviation is 8 minutes. Assume that the variable is normally distributed.
a) What is the probability that a randomly selected college student will complete the examination in
less than 43 minutes?
b) If 50 randomly selected college students take the examination, what is the probability that the
mean time it takes the group to complete the test will be less than 43 minutes?
c) Does it seem reasonable that a college student would finish the examination in less than 43
minutes?
d) Does it seem reasonable that the mean of the 50 college students could be less than 43 minutes?
2. The average number of milligrams (mg) of cholesterol in a cup of a certain brand of ice cream is 660 mg,
and the standard deviation is 35 mg. Assume the variable is normally distributed.
a) If a cup of ice cream is selected, what is the probability that the cholesterol content will be more
than 670 mg?
b) If a sample of 10 cups of ice cream is selected, what is the probability that the mean of the sample
will be larger than 670 mg?
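One way to check answers to both problems is with `statistics.NormalDist` from the Python standard library. The sketch assumes the populations are large enough that the infinite-population standard error applies, and it evaluates part (b) of Problem 2 at 670 mg, the same threshold as part (a). The helper names are hypothetical.

```python
import math
from statistics import NormalDist

def prob_individual_below(x, mu, sigma):
    """P(X < x) for a single observation from a normal population."""
    return NormalDist(mu, sigma).cdf(x)

def prob_mean_below(x, mu, sigma, n):
    """P(sample mean < x), using the standard error sigma / sqrt(n)."""
    return NormalDist(mu, sigma / math.sqrt(n)).cdf(x)

# Problem 1: examination time, mean 46.2 minutes, standard deviation 8 minutes.
p1a = prob_individual_below(43, 46.2, 8)
p1b = prob_mean_below(43, 46.2, 8, 50)

# Problem 2: cholesterol, mean 660 mg, standard deviation 35 mg.
p2a = 1 - prob_individual_below(670, 660, 35)
p2b = 1 - prob_mean_below(670, 660, 35, 10)

print(f"1a: {p1a:.4f}  1b: {p1b:.4f}")
print(f"2a: {p2a:.4f}  2b: {p2b:.4f}")
```

For Problem 1 the individual probability comes out at roughly 0.34 and the probability for the mean of 50 students at roughly 0.002. A single student finishing in under 43 minutes is therefore quite plausible, while a group mean that low would be very unusual.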
ESTIMATION OF PARAMETERS
An estimate is a value or a range of values that approximates a parameter. It is based on sample statistics
computed from sample data.
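As a hedged illustration of "a value or a range of values", the sketch below computes a single-value estimate (the sample mean) and one common range-type estimate (a 95% interval built from the standard error). The ten completion times are made-up data, and the population standard deviation of 8 minutes is assumed known; neither comes from a real study.

```python
import math
from statistics import NormalDist, fmean

# Hypothetical sample of completion times in minutes.
sample = [44.1, 47.3, 39.8, 52.0, 45.5, 48.2, 41.7, 50.4, 43.9, 46.6]
sigma = 8  # population standard deviation, assumed known for this sketch

# A single value that approximates the population mean.
x_bar = fmean(sample)

# A range of values: x_bar plus or minus z times the standard error.
z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
margin = z * sigma / math.sqrt(len(sample))
interval = (x_bar - margin, x_bar + margin)

print(f"single-value estimate: {x_bar:.2f}")
print(f"range estimate: ({interval[0]:.2f}, {interval[1]:.2f})")
```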