Rm-Module-3 What Is Sampling?: Statistical Analysis


RM-MODULE-3

What Is Sampling?

Sampling is a process in statistical analysis where researchers take a predetermined number of
observations from a larger population. Sampling allows researchers to conduct studies about a
large group by using a small portion of the population. The method of sampling depends on the
type of analysis being performed, but it may include simple random sampling or systematic
sampling. Sampling is commonly done in statistics, psychology, and the financial industry.

How Sampling Works

It can be difficult for researchers to conduct accurate studies on large populations. In some cases,
it can be impossible to study every individual in the group. That's why they often choose a small
portion to represent the entire group. This is called a sample. Samples allow researchers to use
characteristics of the small group to make estimates of the larger population.

The chosen sample should be a fair representation of the entire population. When taking a sample
from a larger population, it is important to consider how the sample is chosen. To be
representative, a sample must be drawn randomly from a frame that covers the whole population. For
example, a lottery could be used to select 10% of the student body in order to estimate the
average age of students in a university.

Population: Total of items about which information is desired. It can be classified into two
categories: finite and infinite. The population is said to be finite if it consists of a fixed number
of elements so that it is possible to enumerate it in its totality. Examples of finite populations
are the population of a city, the number of workers in a factory, etc. An infinite population is a
population in which it is theoretically impossible to observe all the elements. In an infinite
population the number of items is infinite. An example of an infinite population is the number of
stars in the sky. From practical consideration, we use the term infinite population for a
population that cannot be enumerated in a reasonable period of time.
Basic Guidelines for Research SMS Kabir

Sample: It is part of the population that represents the characteristics of the population.

Sampling: It is the process of selecting the sample for estimating the population characteristics.
In other words, it is the process of obtaining information about an entire population by
examining only a part of it.

Sampling Unit: Elementary units or group of such units which besides being clearly defined,
identifiable and observable, are convenient for purpose of sampling are called sampling units.
For instance, in a family budget enquiry, usually a family is considered as the sampling unit
since it is found to be convenient for sampling and for ascertaining the required information. In a
crop survey, a farm or a group of farms owned or operated by a household may be considered as
the sampling unit.

Sampling Frame: A list containing all sampling units is known as sampling frame. Sampling
frame consists of a list of items from which the sample is to be drawn.

Sample Survey: An investigation in which elaborate information is collected on a sample basis is
known as sample survey.

Statistic: Characteristics of the sample. For example, sample Mean, proportion, etc.
Parameter: Characteristics of the population. For example, population Mean, proportion, etc.

Target Population: A target population is the entire group about which information is desired and
conclusion is made.

Sampled Population: The population, which we actually sample, is the sampled population. It is
also called survey population.

Sampling With and Without Replacement: Sampling schemes may be without replacement
('WOR' - no element can be selected more than once in the same sample) or with replacement
('WR' - an element may appear multiple times in the one sample). For example, if we catch fish,
measure them, and immediately return them to the water before continuing with the sample, this
is a WR design, because we might end up catching and measuring the same fish more than once.
However, if we do not return the fish to the water (e.g. if we eat the fish), this becomes a WOR
design.
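
The difference between the two schemes can be sketched in a few lines of Python. This is a minimal illustration only: the pond of ten tagged fish and the function names are hypothetical, and the standard `random` module stands in for the physical act of catching fish.

```python
import random

def sample_without_replacement(population, n, seed=None):
    """WOR: no element can appear more than once in the sample."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def sample_with_replacement(population, n, seed=None):
    """WR: every draw is made from the full population, so repeats are possible."""
    rng = random.Random(seed)
    return rng.choices(population, k=n)

# Hypothetical pond of 10 tagged fish.
fish = [f"fish_{i}" for i in range(1, 11)]

wor = sample_without_replacement(fish, 5, seed=1)  # fish are kept (or eaten)
wr = sample_with_replacement(fish, 5, seed=1)      # fish are returned to the water

print("WOR:", wor)
print("WR: ", wr)
```

Note that a WR sample can be larger than the population itself, whereas a WOR sample can never contain more than the 10 fish in the pond.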
Sample Design: Sample design refers to the plans and methods to be followed in selecting a sample
from the target population, together with the estimation technique (formula) for computing the sample
statistics. These statistics are the estimates used to infer the population parameters.

Characteristics of a good sample design

Goal Orientation

A sample design should be orientated to the research aims, adapted to the survey design, and
fitted to the survey conditions. If this is done, it should have an impact on the population
selection, measurement, and sample selection procedure.

Measurability

A sample design should allow meaningful estimates of sampling variability to be computed.
In surveys, this variability is typically reported as standard error. However, this is only
achievable with probability sampling. It is impossible to know the degree of precision of
survey results in non-probability samples, such as a quota sample.
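
As an illustration of what "measurable" means in practice, the standard error of a sample mean can be computed directly from a simple random sample. The sketch below assumes simple random sampling without replacement; the function name, the eight ages, and the population size of 1,000 are made up for the example.

```python
import math
import statistics

def standard_error_of_mean(sample, population_size=None):
    """Estimated standard error of the sample mean under simple random sampling.

    If population_size is given, the finite population correction (1 - n/N)
    for sampling without replacement is applied.
    """
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    se = s / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt(1 - n / population_size)
    return se

# Hypothetical ages of 8 randomly selected students out of 1,000.
ages = [19, 21, 20, 22, 23, 20, 21, 22]
print(round(standard_error_of_mean(ages), 3))
print(round(standard_error_of_mean(ages, population_size=1000), 3))
```

No comparable calculation is available for a quota or convenience sample, because the selection probabilities behind the formula are unknown.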

Practicality

This means that the sample design can be correctly followed in the survey, as planned.
Complete, correct, practical, and unambiguous instructions must be provided to the
interviewer so that no errors occur in sampling unit selection and the final selection in the
field is consistent with the initial sample design. Practicality also relates to the design’s
simplicity, or its ability to be understood and followed in actual fieldwork operations.

Economy

Finally, economy means that the survey’s goals should be met with the least amount of
money and effort possible. Generally, survey objectives are stated in terms of precision,
which is defined as the inverse of the variance of survey estimates. The sample design
should provide the lowest cost for a given degree of precision. Alternatively, the sample
design should yield maximum precision (minimum variance) for a given per unit cost.

Sample Size Decisions

Following our examination of key sample designs, we now shift our attention to another
critical component of sampling, namely, sample size decisions. When doing a survey and
not being able to reach the complete population, the marketing researcher must first
determine how large the sample should be.

SAMPLING PROCESS

The sampling process comprises several stages:

1. Defining the population.
2. Specifying the sampling frame.
3. Specifying the sampling unit.
4. Selecting the sampling method.
5. Determining the sample size.
6. Specifying the sampling plan.
7. Selecting the sample.

Define the Population: Population must be defined in terms of elements, sampling units, extent
and time. Because there is very rarely enough time or money to gather information from everyone
or everything in a population, the goal becomes finding a representative sample (or subset) of that
population.

Sampling Frame: Because the whole population can rarely be reached directly, we seek a sampling frame which has the property that we can
identify every single element and include any in our sample. The most straightforward type of
frame is a list of elements of the population (preferably the entire population) with appropriate
contact information. A sampling frame may be a telephone book, a city directory, an employee
roster, a listing of all students attending a university, or a list of all possible phone numbers.

Sampling Unit: A sampling unit is a basic unit that contains a single element or a group of
elements of the population to be sampled. The sampling unit selected is often dependent upon the
sampling frame. If a relatively complete and accurate listing of elements is available (e.g. register
of purchasing agents) one may well want to sample them directly. If no such register is available,
one may need to sample companies as the basic sampling unit.

Sampling Method: The sampling method outlines the way in which the sample units are to be
selected. The choice of the sampling method is influenced by the objectives of the research,
availability of financial resources, time constraints, and the nature of the problem to be
investigated. All sampling methods can be grouped under two distinct heads, that is, probability
and non-probability sampling.

Sample Size: The sample size calculation depends primarily on the type of sampling design used.
However, for all sampling designs, the estimates of the expected sample characteristics (e.g.
mean, proportion or total), the desired level of certainty, and the level of precision must be clearly
specified in advance. The statement of the precision desired might be made by giving the amount
of error that we are willing to tolerate in the resulting estimates. Common levels of precision are
5% and 10%.

Sampling Plan: In this step, the specifications and decisions regarding the implementation of the
research process are outlined. As the interviewers and their co-workers will be on field duty
most of the time, a proper specification of the sampling plan makes their work easier and
means they do not have to keep referring back when operational problems arise.

Select the Sample: The final step in the sampling process is the actual selection of the sample
elements. This requires a substantial amount of office and fieldwork, particularly if personal
interviews are involved.

Types of Sampling
There are two basic approaches to sampling: Probability Sampling and Non-probability Sampling.

PROBABILITY SAMPLING

Probability sampling is also known as random sampling or chance sampling. In this, the sample is
taken in such a manner that each and every unit of the population has a known, non-zero
chance of being selected (an equal chance in the case of simple random sampling). This makes it
possible to draw a sample that represents the overall population and to quantify the sampling
error. Probability sampling is achieved by random selection of the sample from
among all the units of the population.
Major random sampling procedures are -
• Simple Random Sample
• Systematic Random Sample
• Stratified Random Sample, and
• Cluster/ Multistage Sample.

1. Simple Random Sample: For this, each member of the population is numbered. Then, a sample of a given size
is drawn with the help of a random number table. The other way is to do a lottery. Write all
the numbers on small, uniform pieces of paper, fold the papers, put them in a container and take
out the required lot in a random manner from the container, as is done at kitty parties. It is
relatively simple to implement but the final sample may miss small subgroups.

Advantages: The sample will be free from bias (i.e. it’s random!).
Disadvantages: Difficult to obtain.
Due to its very randomness, “freak” results can sometimes be obtained that are not
representative of the population. In addition, these freak results may be difficult to
spot. Increasing the sample size is the best way to reduce this problem.
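
The numbering-and-lottery procedure can be sketched as follows. This is only an illustration: the frame of 1,000 numbered students and their ages are made up, and Python's `random.sample` plays the role of the container of folded papers.

```python
import random
import statistics

def simple_random_sample(frame, n, seed=None):
    """Draw n units from the frame without replacement, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: 1,000 numbered students with made-up ages between 17 and 30.
age_rng = random.Random(7)
ages = {student_id: age_rng.randint(17, 30) for student_id in range(1, 1001)}

# Draw a 10% sample of student numbers and estimate the mean age from it.
sample_ids = simple_random_sample(list(ages), n=100, seed=42)
estimated_mean_age = statistics.fmean(ages[i] for i in sample_ids)
true_mean_age = statistics.fmean(ages.values())

print(f"estimate {estimated_mean_age:.2f} vs population {true_mean_age:.2f}")
```

Re-running with a different `seed` gives a different sample and a slightly different estimate, which is the random sampling variation that a larger sample size reduces.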

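2. Systematic Random Sample: It also requires numbering
the entire population. Then, starting from a randomly chosen point, every nth number (say every
5th or 10th number, as the case may be) is selected to
constitute the sample. It is easier to carry out than simple random sampling and, because the
selections are spread evenly through the list, more likely to represent different subgroups.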

Advantages: Can eliminate other sources of bias.
Disadvantages: Can introduce bias where the pattern used for the samples coincides with a
pattern in the population.
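
A minimal sketch of the every-nth-unit rule is given below, assuming the frame is an ordered list and that the starting point is chosen at random within the first interval. The function name and the frame of 100 numbered units are illustrative only.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n units by taking every k-th unit after a random start."""
    k = len(frame) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed the frame size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame of 100 numbered units; a sample of 10 gives interval k = 10.
frame = list(range(1, 101))
print(systematic_sample(frame, 10, seed=0))
```

If the frame itself has a cycle of length k (for example, every 10th house is a corner plot), every selected unit falls at the same point of the cycle, which is the periodicity bias noted above.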

3. Stratified Random Sample: The population is first divided
into groups or strata, each of which is homogeneous with respect to
the given characteristic feature. From each stratum, samples
are then drawn at random. This is called stratified random sampling.
For example, with respect to the level of socio-economic status, the
population may first be grouped in such strata as high, middle, low
and very low socio-economic levels as per pre-determined criteria,
and a random sample drawn from each group.
The sample size for each sub-group can be fixed to get representative sample. This way, it is
possible that different categories in the population are fairly represented in the sample, which
could have been left out otherwise in simple random sample.
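
The grouping-then-sampling steps can be sketched as below. This is a minimal illustration assuming proportional allocation (the same sampling fraction in every stratum); the function name, the 1,000 households, and their split across the four socio-economic levels are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the given fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        n_h = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical population: (household id, socio-economic level).
levels = ["high"] * 100 + ["middle"] * 300 + ["low"] * 400 + ["very low"] * 200
households = list(enumerate(levels))

sample = stratified_sample(households, stratum_of=lambda u: u[1], fraction=0.1, seed=3)
sizes = {name: len(members) for name, members in sample.items()}
print(sizes)
```

Because every stratum contributes its share, a small group such as the "high" level cannot be missed by chance, as it could be in a simple random sample.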

Advantages: Yields more accurate results than simple random sampling.
Can show different tendencies within each category (e.g. men and women).
Disadvantages: Nothing major, hence it’s used a lot.

Quota sampling: As with stratified samples, the population is broken down into different categories. However, the
size of the sample of each category does not reflect the population as a whole. The quota
sampling technique can be used where an unrepresentative sample is desirable (e.g. you might
want to interview more children than adults for a survey on computer games), or where it would
be too difficult to undertake a stratified sample.

4. Cluster/ Multistage Sample: In some cases, the selection of units may pass through various
stages before you finally reach your sample of study. For this, a State, for example, may be
divided into districts, districts into blocks, blocks into villages, and villages into identifiable
groups of people, with a random or quota sample then taken from each group. For example,
one might take a random selection of 3 out of 15 districts of a State, 6 blocks from each selected
district, 10 villages from each selected block and 20 households from each selected village,
totaling 3 × 6 × 10 × 20 = 3600 respondents. This design is used for large-scale surveys spread over large areas.

[Figure: multistage selection, from primary area to sample location, segment and housing unit]

The advantage is that it needs a detailed sampling frame for the selected clusters only rather than
for the entire target area. There are savings in travel costs and time as well. However, there is
a risk of missing important sub-groups and not having complete representation of the
target population.

Advantages: Less expensive and time consuming than a fully random sample.
Can show ‘regional’ variations.
Disadvantages: Not a simple random sample: units within a cluster tend to be alike, so results are less precise.
Can yield an unrepresentative result (especially if only a few clusters are sampled).
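
The district, block, village and household example can be sketched as a nested random selection. Only the 15 districts and the numbers taken at each stage (3, 6, 10, 20) come from the example in the text; the numbers of blocks per district, villages per block and households per village are assumptions made up so that the code can run.

```python
import random

def multistage_sample(n_districts=15, blocks_per_district=8, villages_per_block=12,
                      households_per_village=25, take=(3, 6, 10, 20), seed=None):
    """Select districts, then blocks, then villages, then households at random.

    Each selected household is returned as a (district, block, village, household)
    tuple of index numbers.
    """
    rng = random.Random(seed)
    d_take, b_take, v_take, h_take = take
    selected = []
    for d in rng.sample(range(n_districts), d_take):
        for b in rng.sample(range(blocks_per_district), b_take):
            for v in rng.sample(range(villages_per_block), v_take):
                for h in rng.sample(range(households_per_village), h_take):
                    selected.append((d, b, v, h))
    return selected

households = multistage_sample(seed=11)
print(len(households))                      # 3 * 6 * 10 * 20
print(sorted({d for d, _, _, _ in households}))
```

Lists of blocks, villages and households are needed only inside the three selected districts, which is the saving in frame construction and travel described above.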

NON-PROBABILITY SAMPLING

Non-probability sampling is any sampling method where some elements of the population have no chance
of selection (these are sometimes referred to as 'out of coverage'/'under covered'), or where the probability
of selection can't be accurately determined. It involves the selection of elements based on assumptions
regarding the population of interest, which forms the criteria for selection. Hence, because the selection of
elements is nonrandom, non-probability sampling does not allow the estimation of sampling errors.
Non-probability sampling is a non-random and subjective method of sampling where the selection of the
population elements comprising the sample depends on the personal judgment or the discretion of the
sampler. Non-probability sampling includes –

• Accidental/ Convenience/ Opportunity/ Availability/ Haphazard/ Grab Sampling
• Quota Sampling
• Judgment/ Subjective/ Purposive Sampling
• Snowball Sampling.

1. Convenience/ Accidental Sampling: Accidental sampling (sometimes known as grab, convenience
or opportunity sampling) is a type of non-probability sampling in which the sample is
drawn from the part of the population that is close to hand. That is, the sample is selected
because it is readily available and convenient. The researcher using such a sample cannot
scientifically make generalizations about the total population from this sample because it would not
be representative enough. For example, if the interviewer were to conduct such a survey at a shopping
center early in the morning on a given day, the people that s/he could interview would be limited to
those present there at that time, who would not represent the views of other members of society
in the area that would be captured if the survey were conducted at different times of day and
several times per week. This type of sampling is most useful for pilot testing.

The primary problem with availability sampling is that you can never be certain what population
the participants in the study represent. The population is unknown, the method for selecting
cases is haphazard, and the cases studied probably don't represent any population you could come up
with.
However, there are some situations in which this kind of design has advantages - for example, survey
designers often want to have some people respond to their survey before it is given out in the ‘real’
research setting as a way of making certain the questions make sense to respondents. For this purpose,
availability sampling is not a bad way to get a group to take a survey, though in this case researchers care
less about the specific responses given than whether the instrument is confusing or makes people feel bad.

2. Quota Sampling: In quota sampling, the population is first segmented into mutually exclusive
sub-groups, just as in stratified sampling. Then judgment is used to select the subjects or units
from each segment based on a specified proportion. For example, an interviewer may be told to
sample 200 females and 300 males between the ages of 45 and 60. In quota sampling the selection of
the sample is non-random. For example, interviewers might be tempted to interview those who look
most helpful. The problem is that these samples may be biased because not everyone gets a chance of
selection. This non-random element is its greatest weakness, and quota versus probability sampling
has been a matter of controversy for many years.

3. Subjective or Purposive or Judgment Sampling: In this sampling, the sample is selected with a
definite purpose in view and the choice of the sampling units depends entirely on the discretion and
judgment of the investigator.
This sampling suffers from the drawbacks of favoritism and nepotism, depending upon the beliefs and
prejudices of the investigator, and thus does not give a representative sample of the population.

This sampling method is seldom used and cannot be recommended for general use since it is often biased
due to element of subjectivity on the part of the investigator. However, if the investigator is experienced
and skilled and this sampling is carefully applied, then judgment samples may yield valuable results.

Some purposive sampling strategies that can be used in qualitative studies are given below. Each strategy
serves a particular data gathering and analysis purpose.
Extreme Case Sampling: It focuses on cases that are rich in information because they are unusual or
special in some way. e.g. the only community in a region that prohibits felling of trees.

Maximum Variation Sampling: Aims at capturing the central themes that cut across participant variations.
e.g. persons of different age, gender, religion and marital status in an area protesting against child
marriage.
Homogeneous Sampling: Picks up a small sample with similar characteristics to describe some particular
sub-group in depth. e.g. firewood cutters or snake charmers or bonded laborers.
Typical Case Sampling: Uses one or more typical cases (individuals, families / households) to provide a
local profile. The typical cases are carefully selected with the co-operation of the local people/ extension
workers.
Critical Case Sampling: Looks for critical cases that can make a point quite dramatically. e.g. farmers
who have set up an unusually high yield record of a crop.
Chain Sampling: Begins by asking people, ‘who knows a lot about ’. By asking a number of
people, you can identify specific kinds of cases e.g. critical, typical, extreme etc.
Criterion Sampling: Reviews and studies cases that meet some pre-set criterion of importance e.g.
farming households where women take the decisions.
In short, purposive sampling is best used with small numbers of individuals/groups which may well be
sufficient for understanding human perceptions, problems, needs, behaviors and contexts, which are the
main justification for a qualitative audience research.

4. Snowball Sampling: Snowball sampling is a method in which a researcher identifies one member of
some population of interest, speaks to him/her, and then asks that person to identify others in the
population that the researcher might speak to. Each person identified is then asked to refer the
researcher to yet another person, and so on. This sampling technique is used for low-incidence or
rare populations. Sampling is a big problem in this case, as a defined population list from which
the sample can be drawn is not available. Therefore, the sampling process depends on the chain
system of referrals. Although small sample sizes and low costs are the clear advantages of
snowball sampling, bias is one of its disadvantages. The people referred by those sampled in the
initial stages may be similar to those initially sampled.

Therefore, the sample may not represent a cross-section of the total population. It may also happen that
visitors to the site or interviewees may refuse to disclose the names of those whom they know.
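
The chain of referrals can be pictured as a walk through a network of acquaintances. The sketch below is only an illustration: the referral network and the names in it are made up, and every contacted person is assumed to agree to take part and to name all of their contacts.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Follow referral chains from the seed respondents until the target is reached."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < target_size:
        person = queue.popleft()
        sampled.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # do not approach the same person twice
                seen.add(contact)
                queue.append(contact)
    return sampled

# Hypothetical referral network: who each person would name.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "E": ["F"],
    "G": ["H"],  # G and H belong to the population but know nobody in A's circle
}

print(snowball_sample(referrals, seeds=["A"], target_size=5))
# → ['A', 'B', 'C', 'D', 'E']
```

However large the target, G and H can never enter a sample that starts from A, which illustrates why the result may not be a cross-section of the total population.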

Some Other Sampling Methods -

Matched Random Sampling: A method of assigning participants to groups in which pairs of participants
are first matched on some characteristic and then individually assigned randomly to groups. Matched
samples arise in two contexts: (a) two samples in which the members are clearly paired, or are
matched explicitly by the researcher, for example on IQ measurements, or pairs of identical twins;
(b) samples in which the same attribute, or variable, is measured twice on each subject under
different circumstances, commonly called repeated measures.

Mechanical Sampling: Mechanical sampling is typically used in sampling solids, liquids and gases, using
devices such as grabs, scoops, thief probes, etc. Care is needed in ensuring that the sample is
representative of the frame.
Line-intercept Sampling: Line-intercept sampling is a method of sampling elements in a region whereby
an element is sampled if a chosen line segment, called a ‘transect’, intersects the element.
Panel Sampling: Panel sampling is the method of first selecting a group of participants through a random
sampling method and then asking that group for the same information again several times over a period of
time. Therefore, each participant is given the same survey or interview at two or more time points; each
period of data collection is called a ‘wave’. This sampling methodology is often chosen for large scale or
nation-wide studies in order to gauge changes in the population with regard to any number of variables
from chronic illness to job stress to weekly food expenditures. Panel sampling can also be used to inform
researchers about within-person health changes due to age or help explain changes in continuous
dependent variables such as spousal interaction.
Rank Sampling: A non-probability sample is drawn and ranked. The highest value is chosen as the first
value of the targeted sample. Another sample is drawn and ranked, the second highest value is chosen for
the targeted sample. The process is repeated until the lowest value of the targeted sample is chosen. This
sampling method can be used in forestry to measure the average diameter of the trees.
Voluntary Sample: A voluntary sample is made up of people who self-select into the survey. Often, these
folks have a strong interest in the main topic of the survey. Suppose, for example, that a news show asks
viewers to participate in an on-line poll. This would be a volunteer sample. The sample is chosen by the
viewers, not by the survey administrator.

7.5 SAMPLING ERROR AND SURVEY BIAS

Survey results are typically subject to some error. Total errors can be classified into sampling
errors and non-sampling errors. The term ‘error’ here includes systematic biases as well as random
errors.
Sampling errors and biases: Sampling errors and biases are induced by the sample design. They
include-

1. Selection bias: When the true selection probabilities differ from those assumed in calculating the
results.
2. Random sampling error: Random variation in the results due to the elements in the sample being
selected at random.

Non-sampling error: Non-sampling errors are other errors which can impact the final survey estimates,
caused by problems in data collection, processing, or sample design. They include-
1. Over-coverage: Inclusion of data from outside of the population.
2. Under-coverage: Occurs when some members of the population are inadequately represented in the
sample. Under-coverage is often a problem with convenience samples.
3. Measurement error: When respondents misunderstand a question, or find it difficult to answer.
4. Processing error: Mistakes in data coding.
5. Non-response: Failure to obtain complete data from all selected individuals.

After sampling, a review should be held of the exact process followed in sampling, rather than that
intended, in order to study any effects that any divergences might have on subsequent analysis. A
particular problem is that of non-response.

Two major types of non-response exist: unit non-response (referring to lack of completion of any part of
the survey) and item non-response (submission or participation in survey but failing to complete one or
more components/questions of the survey). In survey sampling, many of the individuals identified as part
of the sample may be unwilling to participate, not have the time to participate (opportunity cost), or
survey administrators may not have been able to contact them. In this case, there is a risk of differences
between respondents and non-respondents, leading to biased estimates of population parameters. This is
often addressed by improving survey design, offering incentives, and conducting follow-up studies which
make a repeated attempt to contact the unresponsive and to characterize their similarities and differences
with the rest of the frame. The effects can also be mitigated by weighting the data when population
benchmarks are available or by imputing data based on answers to other questions.
Non-response is particularly a problem in internet sampling. Reasons for this problem include improperly
designed surveys, over-surveying (or survey fatigue), and the fact that potential participants hold multiple
e-mail addresses, which they don’t use anymore or don’t check regularly.
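
Weighting against population benchmarks can be sketched with a simple post-stratified mean. This is one common form of weighting, shown here as an assumption rather than as the only approach; the two age groups, the responses and the 50/50 population shares are made up.

```python
def poststratified_mean(responses, population_shares):
    """Weight each group's mean response by its known share of the population.

    responses: list of (group, value) pairs from those who answered.
    population_shares: {group: share of the population}, shares summing to 1.
    """
    totals, counts = {}, {}
    for group, value in responses:
        totals[group] = totals.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no respondents in groups: {sorted(missing)}")
    return sum(share * totals[g] / counts[g] for g, share in population_shares.items())

# Hypothetical survey: young people responded far less often than older people.
responses = [("young", 10), ("young", 20)] + [("old", 30)] * 8
unweighted = sum(value for _, value in responses) / len(responses)
weighted = poststratified_mean(responses, {"young": 0.5, "old": 0.5})

print(unweighted, weighted)
```

The adjustment only helps to the extent that respondents within a group resemble the non-respondents in that group, which is why follow-up studies of the unresponsive remain useful.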

Bias Due to Measurement Error: A poor measurement process can also lead to bias. In survey research,
the measurement process includes the environment in which the survey is conducted, the way that
questions are asked, and the state of the survey respondent. Response bias refers to the bias that results
from problems in the measurement process. Some examples of response bias are given below.
Leading questions: The wording of the question may be loaded in some way to unduly favor one response
over another. For example, a satisfaction survey may ask the respondent to indicate whether she is satisfied,
dissatisfied, or very dissatisfied. By giving the respondent one response option to express satisfaction and
two response options to express dissatisfaction, this survey question is biased toward getting a dissatisfied
response.
Social desirability: Most people like to present themselves in a favorable light, so they will be reluctant to
admit to unsavory attitudes or illegal activities in a survey, particularly if survey results are not
confidential. Instead, their responses may be biased toward what they believe is socially desirable.
Increasing the sample size tends to reduce the sampling error; that is, it makes the sample statistic less
variable. However, increasing sample size does not affect survey bias. A large sample size cannot correct
for the methodological problems (under-coverage, non-response bias, etc.) that produce survey bias.
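
A small simulation can make this contrast concrete. Everything in the sketch is assumed for illustration: a population of 20,000 values, of which 4,000 (with higher values on average) are missing from the sampling frame, so that every sample suffers from under-coverage.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population: 16,000 units covered by the frame, 4,000 not covered.
covered = [rng.gauss(50, 10) for _ in range(16_000)]
uncovered = [rng.gauss(70, 10) for _ in range(4_000)]
true_mean = statistics.fmean(covered + uncovered)

def repeated_estimates(frame, n, repeats=100):
    """Means of repeated simple random samples of size n drawn from the frame."""
    return [statistics.fmean(rng.sample(frame, n)) for _ in range(repeats)]

results = {}
for n in (50, 2000):
    estimates = repeated_estimates(covered, n)
    results[n] = {
        "spread": statistics.stdev(estimates),                # sampling error
        "bias": statistics.fmean(estimates) - true_mean,      # coverage bias
    }
    print(n, {key: round(value, 2) for key, value in results[n].items()})
```

The spread of the estimates shrinks sharply as the sample size grows, while the gap between the average estimate and the true mean stays roughly the same, because the uncovered group is never sampled at any sample size.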

• Population specification error

A population specification error occurs when researchers don’t know precisely who to survey.
For example, imagine a research study about kids’ apparel. Who is the right person to survey? It can be both parents, only the
mother, or the child. The parents make purchase decisions, but the kids may influence their choice.

• Sample frame error

Sampling frame error occurs when researchers target the sub-population wrongly while selecting the sample.
For example, picking a sampling frame from the telephone white pages book may have erroneous inclusions because
people shift their cities. Erroneous exclusions occur when people prefer to un-list their numbers. Wealthy households
may have more than one connection, thus leading to multiple inclusions.

• Selection error

Selection error occurs when respondents self-select to participate in the study, so that only interested ones respond.
You can control selection errors by going the extra step to request responses from the entire sample.

Pre-survey planning, follow-ups, and a neat and clean survey design will boost respondents’ participation rate. Also,
try survey modes like CATI (computer-assisted telephone interviewing) and in-person interviews to maximize responses.

• Sampling errors

Sampling errors occur due to a disparity in the representativeness of the respondents. This mostly happens when the
researcher does not plan the sample carefully.

These sampling errors can be controlled and reduced by creating a careful sample design, having a large enough
sample to reflect the entire population, or using an online sample or survey audiences to collect responses.

Determination of sample size

Determination of sample size is probably one of the most important phases in the sampling process.
Generally the larger the sample size, the better the estimation. But larger sample sizes cannot always be
used in view of time and budget constraints. Moreover, when a probability sample reaches a certain size
the precision of an estimator cannot be significantly increased by increasing the sample size any further.
Indeed, for a large population the precision of an estimator depends on the sample size, not on what
proportion of the population has been sampled. It can be stated that whenever a sample study is made,
there arises some sampling error which can be controlled by selecting a sample of adequate size. For
example, a researcher may like to estimate the mean of the universe within ± 3 of the true mean with 95
percent confidence. In this case, we will say that the desired precision is ± 3, i.e., if the true mean is Tk.
100, the estimated value of the mean will be no less than Tk. 97 and no more than Tk. 103. In other
words, the acceptable error, e, is equal to 3. Keeping this in view, we can now explain
the determination of sample size so that specified precision is ensured.
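For a large population, the sample size needed to estimate a mean is

$$n = \frac{Z^2 \sigma^2}{e^2}$$

where n = Sample size; e = Acceptable error; σ = Standard deviation of population; Z = Standard normal variate at a given confidence level.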
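
A short worked sketch, assuming the usual large-population formula n = (Zσ/e)² for estimating a mean. The acceptable error e = 3 and the 95 percent confidence level come from the example above; the population standard deviation σ = 15 is an assumed value chosen only to make the arithmetic concrete.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence=0.95):
    """Smallest n with Z * sigma / sqrt(n) <= e, ignoring any finite population correction."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value, about 1.96 at 95%
    return math.ceil((z * sigma / e) ** 2)

print(sample_size_for_mean(sigma=15, e=3))    # → 97
print(sample_size_for_mean(sigma=15, e=1.5))  # → 385
```

Halving the acceptable error roughly quadruples the required sample size, which is why very high precision quickly becomes expensive.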
