Ch. 7 - Sampling and Inferential Statistics
Presented by
Hussein Walid Hussein Alkhawaja
Supervised by
Vahid Nimehchisalem, PhD
Important statistical terms
Population:
a set which includes all
measurements of interest
to the researcher
(The collection of all
responses, measurements, or
counts that are of interest)
Sample:
A subset of the population
Why sampling?
Probability sampling
- By chance inclusion of subjects/elements
- Random selection
- Helps inferential statistics generalize findings from the sample to the population
Criterion: every member or element of the population has a known
probability of being chosen in the sample.
Nonprobability sampling:
- The elements are not chosen by chance procedures.
- Its success depends on the knowledge, expertise, and judgment of
the researcher.
- It is used when the application of probability sampling is not feasible.
- Its advantages are convenience and economy.
Probability samples
Random sampling
Each subject has a known probability of
being selected
Allows application of statistical sampling
theory to results to:
Generalise
Test hypotheses
Methods used in probability
samples
Stratified sampling
Multi-stage sampling
Cluster sampling
Simple random sampling
Definition
Random means “without purpose or by accident.”
Chance alone determines which elements in the population will be in the sample.
Purpose:
When random sampling is used, the researcher can employ inferential statistics to
estimate how much the population is likely to differ from the sample.
Steps
1. Define the population.
2. List all members of the population.
3. Select the sample by employing a procedure where sheer chance determines which
members on the list are drawn for the sample.
Methods:
- Drawing names from a container
- A table of random numbers (can be absolutely without bias)
- Research Randomizer (www.randomizer.org)
Table of random numbers
684257954125632140
582032154785962024
362333254789120325
985263017424503686
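The simple random sampling steps can be sketched in a few lines of Python. This is only an illustration, not a prescribed tool: the population of 500 numbered members, the sample size of 50, and the function name `simple_random_sample` are all assumptions made for the example, and Python's `random` module stands in for the container or the table of random numbers.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members from a listed population purely by chance."""
    rng = random.Random(seed)  # a seed is used only to make the example repeatable
    return rng.sample(population, n)


# Define the population and list its members (hypothetical IDs 1..500).
population = list(range(1, 501))

# Let chance alone decide which 50 members on the list are drawn.
sample = simple_random_sample(population, 50, seed=42)
print(sorted(sample))
```

Because `random.Random.sample` draws without replacement, no member can appear in the sample twice.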
Stratified Sampling
Definition
The population is divided into a number of subgroups, or strata, that may differ in the
characteristics being studied (e.g., age, neighborhood, and occupation), and a sample is
drawn from each stratum.
Purpose:
To employ inferential statistics to estimate how much the population is likely to differ
from the sample.
When the population to be sampled is not homogeneous but consists of several
subgroups, stratified sampling may give a more representative sample than simple random
sampling.
Steps
1. Identify the strata of interest
2. Randomly draw a specified number of subjects from each stratum, in the same ratio
as the stratum's share of the population
Possible bias
- Geographic
- Characteristics of the population (income, occupation, gender, age, or year in college)
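A minimal sketch of proportional stratified sampling follows, assuming each stratum should contribute in proportion to its size. The neighborhood labels, the stratum sizes (600, 300, 100), and the function name `stratified_sample` are hypothetical values chosen so the arithmetic comes out even. With other sizes, rounding can make the total differ slightly from the requested n.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(members, stratum_of, n, seed=None):
    """Proportional stratified sample of roughly n members.

    stratum_of is a function that returns the stratum label of a member.
    """
    rng = random.Random(seed)

    # Step 1: identify the strata by grouping members under their label.
    strata = defaultdict(list)
    for member in members:
        strata[stratum_of(member)].append(member)

    # Step 2: draw randomly from each stratum, in proportion to its size.
    total = len(members)
    sample = []
    for group in strata.values():
        k = round(n * len(group) / total)
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical population of 1,000 people tagged with a neighborhood type.
population = (
    [(i, "urban") for i in range(600)]
    + [(i, "suburban") for i in range(600, 900)]
    + [(i, "rural") for i in range(900, 1000)]
)

sample = stratified_sample(population, lambda m: m[1], 50, seed=1)
counts = Counter(label for _, label in sample)
print(counts)
```

Here the strata make up 60%, 30%, and 10% of the population, so a sample of 50 takes 30, 15, and 5 members respectively.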
Steps (cluster sampling)
1. Randomly select the clusters
2. Include all the members of each selected cluster in the sample
Possible problems
- Sampling error, especially when the number of clusters is small
Cluster sampling
[Figure: a population divided into five clusters, labeled Section 1 to Section 5]
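Cluster sampling can be sketched as follows: whole clusters are chosen by chance, and every member of a chosen cluster enters the sample. The five sections of 20 students each and the function name `cluster_sample` are assumptions for illustration only.

```python
import random


def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly pick whole clusters and include all of their members."""
    rng = random.Random(seed)

    # Chance selects the clusters, not the individuals.
    chosen = rng.sample(sorted(clusters), num_clusters)

    # Every member of each chosen cluster goes into the sample.
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample


# Hypothetical population: five class sections of 20 students each.
clusters = {
    f"Section {i}": [f"S{i}-{j:02d}" for j in range(1, 21)]
    for i in range(1, 6)
}

chosen, sample = cluster_sample(clusters, 2, seed=7)
print(chosen, len(sample))
```

With only two of five clusters drawn, the sample depends heavily on which sections happen to be picked, which is why sampling error tends to be larger when the number of clusters is small.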
Systematic Sampling
Definition
drawing a sample by taking every Kth case from a list of the population.
Sampling fraction: Ratio between sample size and population size
Method:
1. Randomize the order of the population list
2. Decide how many subjects you want in the sample (n)
3. Divide N (total number of members in the population) by n to determine the
sampling interval (K) to apply to the list
4. Select the first member randomly from the first K members of the list and then
select every Kth member of the population for the sample by adding the K
value each time
e.g., N=500 subjects and a desired sample size n is 50:
K = N/n = 500/50 = 10.
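The N = 500, n = 50 example can be reproduced with a short sketch. The function name `systematic_sample` and the numbered population are assumptions for illustration. The code uses integer division for K, which matches the example exactly but drops any remainder when N is not a multiple of n.

```python
import random


def systematic_sample(population, n, seed=None):
    """Take every Kth member of the list after a random start."""
    rng = random.Random(seed)
    k = len(population) // n   # sampling interval K = N / n
    start = rng.randrange(k)   # random position within the first K members
    return [population[start + i * k] for i in range(n)]


# Hypothetical list of N = 500 subjects, numbered 1..500.
population = list(range(1, 501))

sample = systematic_sample(population, 50, seed=3)
print(sample[:5])
```

Only the starting point is random. Once it is fixed, every other member of the sample is determined by the interval K = 10.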
Note: you could use cluster sampling if you were studying a very large and widely
dispersed population. At the same time, you might be interested in stratifying the
sample to answer questions regarding its different strata.
Nonprobability samples
The probability of being chosen is unknown
Cheaper, but the results cannot be generalised
Potential for bias
Sample size
Quantitative | Qualitative
$n = \dfrac{Z^2 \sigma^2}{D^2}$ | $n = \dfrac{Z^2 \pi (1 - \pi)}{D^2}$
$n = \dfrac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2}$ | $n = \dfrac{2 P (1 - P) F}{D^2}$
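As a worked illustration of the two single-sample formulas, n = Z²σ²/D² for a quantitative variable and n = Z²π(1 − π)/D² for a qualitative one, the sketch below plugs in example numbers. The slide does not define its symbols, so the code assumes that Z is the standard normal critical value for the chosen confidence level and that D is the acceptable margin of error. The inputs (σ = 15, D = 2, π = 0.5, D = 0.05) and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def z_value(confidence=0.95):
    """Two-sided standard normal critical value (about 1.96 for 95%)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def n_for_mean(sigma, d, confidence=0.95):
    """n = Z^2 * sigma^2 / D^2, rounded up to a whole subject."""
    z = z_value(confidence)
    return math.ceil(z**2 * sigma**2 / d**2)


def n_for_proportion(pi, d, confidence=0.95):
    """n = Z^2 * pi * (1 - pi) / D^2, rounded up to a whole subject."""
    z = z_value(confidence)
    return math.ceil(z**2 * pi * (1 - pi) / d**2)


print(n_for_mean(sigma=15, d=2))          # assumed SD of 15, margin of 2 points
print(n_for_proportion(pi=0.5, d=0.05))   # assumed proportion of 0.5, margin of 5%
```

Under these assumptions the first call gives 217 and the second gives 385. Halving D roughly quadruples the required sample size, because D is squared in the denominator.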
Conclusions
Ensure
Representativeness
Precision
Errors in sample
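Sampling error is the difference between a sample statistic and the corresponding
population parameter. Thus, for the mean, e = X̄ − μ. For example, if you know that the
mean intelligence score for a population of 10,000 fourth-graders is μ = 100 and a
particular random sample of 200 has a mean of X̄ = 99, then the sampling error is
X̄ − μ = 99 − 100 = −1.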
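The sampling error of a mean can also be computed on simulated data. The sketch below is an illustration only: the 10,000 scores are generated from a normal distribution with mean 100 and standard deviation 15, which is an assumption and not data from the source, and the function name `sampling_error` is hypothetical.

```python
import random
from statistics import mean


def sampling_error(population, n, rng):
    """Return e = sample mean minus population mean for one random sample."""
    sample = rng.sample(population, n)
    return mean(sample) - mean(population)


rng = random.Random(0)

# Simulated population of 10,000 scores (assumed normal, mean 100, SD 15).
population = [rng.gauss(100, 15) for _ in range(10_000)]

e = sampling_error(population, 200, rng)
print(f"sampling error for n = 200: {e:.2f}")

# Sampling the entire population leaves no sampling error.
e_full = sampling_error(population, len(population), rng)
```

Repeating the draw with different samples gives different values of e, some positive and some negative.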
Type I error
The probability of finding a difference in our sample when no real difference
exists in the population
The investigator will either retain or reject the null hypothesis. Either
decision may be correct or wrong. If the null hypothesis is true, the
investigator is correct in retaining it and in error in rejecting it. The
rejection of a true null hypothesis is labeled a Type I error.
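A small simulation can show how often a true null hypothesis is rejected. This is a sketch under stated assumptions: the null hypothesis μ = 100 is true by construction, the population standard deviation of 15 is treated as known so that a two-sided z-test applies, and the significance level of 0.05, the sample size, and the function name `type_one_error_rate` are chosen for illustration.

```python
import random
from statistics import NormalDist


def type_one_error_rate(trials=4000, n=30, mu=100.0, sigma=15.0,
                        alpha=0.05, seed=0):
    """Share of samples in which a true null hypothesis is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where the null hypothesis holds.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true null: a Type I error
    return rejections / trials


rate = type_one_error_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen significance level, since that level is the probability of a Type I error the investigator agrees to accept.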