4 Sampling Distributions Revised


Sampling Distributions & Point Estimation
Questions
• What is a sampling distribution?
• What is the standard error?
• What is the principle of maximum
likelihood?
• What is bias (in the statistical sense)?
• What is a confidence interval?
• What is the central limit theorem?
• Why is the number 1.96 a big deal?
Population
Parameter Estimation
We use statistics to estimate parameters,
e.g., effectiveness of pilot training,
effectiveness of psychotherapy.
$\bar{X} \rightarrow \mu$, $SD \rightarrow \sigma$
Sampling Distribution (1)
Example
1st  2nd  M      1st  2nd  M      1st  2nd  M
1    1    1      3    1    2      5    1    3
1    2    1.5    3    2    2.5    5    2    3.5
1    3    2      3    3    3      5    3    4
1    4    2.5    3    4    3.5    5    4    4.5
1    5    3      3    5    4      5    5    5
1    6    3.5    3    6    4.5    5    6    5.5
2    1    1.5    4    1    2.5    6    1    3.5
2    2    2      4    2    3      6    2    4
2    3    2.5    4    3    3.5    6    3    4.5
2    4    3      4    4    4      6    4    5
2    5    3.5    4    5    4.5    6    5    5.5
2    6    4      4    6    5      6    6    6

Possible Outcomes
[Figure: histogram of the sampling distribution for the mean of 2 dice.]

The mean of a single die is (1+2+3+4+5+6)/6 = 21/6 = 3.5.

There is only 1 way to get a mean of 1, but 6 ways to get a mean of 3.5.
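The counts behind the histogram can be checked by listing all 36 ordered outcomes of two dice. The sketch below is only an illustration; the variable names (`faces`, `means`, `counts`) are made up for this example and do not come from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Every ordered pair (first die, second die): 36 equally likely outcomes.
means = [Fraction(a + b, 2) for a, b in product(faces, repeat=2)]

# Count how many outcomes produce each possible mean.
counts = Counter(means)

# Text histogram: one '#' per outcome.
for m in sorted(counts):
    print(f"mean {float(m):>3}: {'#' * counts[m]} ({counts[m]}/36)")
```

The bar for a mean of 3.5 is the tallest (6 of 36 outcomes) and the bars for 1 and 6 are the shortest (1 of 36 each).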
Sampling Distribution Mean and SD
• The Mean of the sampling distribution is
defined the same way as any other
distribution (expected value).
• The SD of the sampling distribution is the
Standard Error. Important and useful.
• Variance of sampling distribution is the
expected value of the squared difference – a
mean square.
• Review: $\sigma_G^2 = E(G - \mu_G)^2$
Review

• What is a sampling distribution?


• What is the standard error of a statistic?
Statistics as Estimators

$\bar{X} \rightarrow \mu$
More Goodness (1)

$E(\bar{X}) = \mu$
Sampling Distribution of the Mean
• Unbiased: $E(\bar{X}) = \mu$
• Variance of sampling distribution of
means based on N obs: $V_M = \sigma_M^2 = \sigma^2 / N$
• Standard Error of the Mean: $\sigma_M = \sigma / \sqrt{N}$
• Law of large numbers: Large samples
produce sample estimates very close to
the parameter.
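For the two-dice example these properties can be verified exactly rather than by simulation. The sketch below assumes a fair six-sided die as the population and N = 2; the helper `mean_and_variance` is a name invented for this illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

def mean_and_variance(values):
    """Mean and variance of a list of equally likely values."""
    n = len(values)
    mu = sum(values, Fraction(0)) / n
    var = sum((v - mu) ** 2 for v in values) / n
    return mu, var

# Population: one roll of a fair die.
mu, sigma2 = mean_and_variance([Fraction(f) for f in faces])

# Sampling distribution: the mean of N = 2 rolls, over all 36 outcomes.
N = 2
sample_means = [Fraction(sum(rolls), N) for rolls in product(faces, repeat=N)]
mu_M, sigma2_M = mean_and_variance(sample_means)

print("population mean:", mu, " population variance:", sigma2)
print("mean of sample means:", mu_M, " variance of sample means:", sigma2_M)
print("sigma^2 / N:", sigma2 / N)
print("standard error of the mean:", float(sigma2_M) ** 0.5)
```

The mean of the sample means equals the population mean (unbiased), and the variance of the sample means equals the population variance divided by N.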
Unbiased Estimate of Variance
• It can be shown that: $E(S^2) = \sigma^2 - \frac{\sigma^2}{N} = \left(\frac{N-1}{N}\right)\sigma^2$
• The sample variance is too small by a
factor of (N-1)/N.
• We fix with $s^2 = \left(\frac{N}{N-1}\right) S^2 = \frac{\sum (X - \bar{X})^2}{N-1}$

• Although the corrected variance $s^2$ is unbiased, the
SD $s$ is still biased, but most inferential
work is based on the variance, not SD.
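The size of the bias can be seen exactly with the dice example. The sketch below assumes a fair die as the population and samples of N = 2 rolls; `sum_sq_dev`, `E_S2`, and `E_s2` are names chosen for this illustration only.

```python
from fractions import Fraction
from itertools import product

faces = [Fraction(f) for f in range(1, 7)]
N = 2

# Population variance of one fair die (computable here because the
# whole population is known).
mu = sum(faces) / len(faces)
sigma2 = sum((x - mu) ** 2 for x in faces) / len(faces)

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample)

# All 36 equally likely samples of size 2.
samples = list(product(faces, repeat=N))

# Expected value = average over all equally likely samples.
E_S2 = sum(sum_sq_dev(s) / N for s in samples) / len(samples)        # divide by N
E_s2 = sum(sum_sq_dev(s) / (N - 1) for s in samples) / len(samples)  # divide by N - 1

print("sigma^2            :", sigma2)
print("E(S^2), divisor N  :", E_S2, "=", Fraction(N - 1, N), "* sigma^2")
print("E(s^2), divisor N-1:", E_s2)
```

With N = 2 the uncorrected variance averages exactly half of the population variance, while dividing by N - 1 recovers it.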
Interval Estimation
• Use the standard error of the mean to create a
bracket or confidence interval to show where
good estimates of the mean are.
• The sampling distribution of the mean is
nice* when N>20. Therefore:
$p(\bar{X} - 3\sigma_M \le \mu \le \bar{X} + 3\sigma_M) \ge .95$
• Suppose M=100, SD=14, N=49. Then the standard error of the
mean is 14/sqrt(49) = 14/7 = 2, and the bracket runs from
100-6 = 94 to 100+6 = 106. The probability refers to the sample
(the bracket), not to mu.
* Unimodal and symmetric
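The bracket arithmetic is easy to script. This is a minimal sketch of the three-standard-error rule used on this slide; the function name `three_se_bracket` and its parameter `k` are made up for the example.

```python
import math

def three_se_bracket(mean, sd, n, k=3):
    """Return (lower, upper) = mean -/+ k standard errors of the mean."""
    sem = sd / math.sqrt(n)  # standard error of the mean
    return mean - k * sem, mean + k * sem

# Worked example from the slide: M = 100, SD = 14, N = 49.
lower, upper = three_se_bracket(100, 14, 49)
print(lower, upper)  # 94.0 106.0
```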
Review
• What is a confidence interval?
• Suppose M = 50, SD = 10, and N =100.
What is the confidence interval?

SEM = 10/sqrt(100) = 10/10 = 1


CI (lower) = M-3SEM = 50-3 = 47
CI (upper) = M+3SEM = 50+3 = 53
CI = 47 to 53
Central Limit Theorem

– 1. Sampling distribution of means becomes normal as N increases, regardless of shape of original distribution.
– 2. Binomial becomes normal as N increases.
– 3. Applies to other statistics as well (e.g., variance).
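Point 1 can be illustrated with a small simulation. The sketch below draws from an exponential population (an assumption chosen only because it is strongly skewed; it does not come from the slides) and tracks how the skewness of the sample means moves toward 0, the value for a normal distribution, as N grows. The seed, replication count, and helper names are arbitrary.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so the run is repeatable

def skewness(values):
    """Third standardized moment; 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def sample_mean(n):
    # Exponential population: strongly right-skewed.
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

REPS = 5000  # number of simulated samples per sample size
skew_by_n = {}
for n in (2, 10, 50):
    means = [sample_mean(n) for _ in range(REPS)]
    skew_by_n[n] = skewness(means)
    print(f"N={n:>2}  skewness of sample means = {skew_by_n[n]:.2f}")
```

The printed skewness should shrink as N increases, which is the sense in which the sampling distribution "becomes normal".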
Confidence Intervals for the Mean
• Over samples of size N, the probability is .95
for $\mu - 1.96\sigma_M \le \bar{X} \le \mu + 1.96\sigma_M$
• Similarly for sample values of the mean, the
probability is .95 that
$\bar{X} - 1.96\sigma_M \le \mu \le \bar{X} + 1.96\sigma_M$
• The population mean is likely to be within 2
standard errors of the sample mean.
• Can use the Normal to create any size
confidence interval (85, 99, etc.)
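As a rough sketch of how the normal distribution yields an interval of any size, the code below uses `statistics.NormalDist` from the Python standard library to look up the critical value. It assumes the population SD is known and reuses the earlier numbers M = 100, SD = 14, N = 49; the function names `z_for` and `mean_ci` are invented for this example.

```python
import math
from statistics import NormalDist

def z_for(confidence):
    """Two-sided critical value of the standard normal, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def mean_ci(mean, sd, n, confidence=0.95):
    """Normal-theory confidence interval for the mean, with sd treated as known."""
    sem = sd / math.sqrt(n)
    z = z_for(confidence)
    return mean - z * sem, mean + z * sem

for level in (0.85, 0.95, 0.99):
    lo, hi = mean_ci(100, 14, 49, level)
    print(f"{level:.0%} CI: z = {z_for(level):.3f}, {lo:.2f} to {hi:.2f}")
```

The 95 pct row reproduces the familiar 1.96; higher confidence gives a larger critical value and a wider interval.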
Size of the Confidence Interval
• The size of the confidence interval depends on
desired certainty (e.g., 95 vs 99 pct) and the
size of std error of mean ($\sigma_M$).
• Std err of mean is controlled by population SD
and sample size. Can control sample size.
$\sigma_M = \sigma / \sqrt{N}$
• Suppose SD = 10. If N=25, then SEM = 2 and the 95 pct CI width is
about 8. If N=100, then SEM = 1 and the CI width
is about 4. The CI shrinks as N increases, but because of the
square root each added observation narrows it less as N gets
large. Less bang for the buck as N gets big.
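The diminishing return can be tabulated directly. This sketch uses the SD of 10 from the bullet above and the 95 pct critical value 1.96; the function name `ci_width` and the extra sample sizes 400 and 1600 are additions for illustration.

```python
import math

SD = 10   # population SD from the example above
Z = 1.96  # 95 pct critical value

def ci_width(n, sd=SD, z=Z):
    """Full width of the CI: 2 * z * SD / sqrt(N)."""
    return 2 * z * sd / math.sqrt(n)

for n in (25, 100, 400, 1600):
    print(f"N={n:>4}  SEM={SD / math.sqrt(n):.2f}  CI width={ci_width(n):.2f}")
```

Each time N is multiplied by 4 the width is only cut in half, so going from 400 to 1600 observations buys the same proportional gain as going from 25 to 100.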
Review
• What is the central limit theorem?
• Why is the number 1.96 a big deal?
• Assume that scores on a curiosity scale are
normally distributed. If the sample mean is
50 based on 100 people and the population
SD is 10, find an approx 99 pct CI for the
population mean.
More Goodness (2)
• Efficiency – size of the sampling variance.
• Relative Efficiency. Relative efficiency is the
ratio of two sampling variances.
$\frac{\sigma_H^2}{\sigma_G^2}$ = efficiency of G relative to H

• More efficient statistics have smaller
sampling variances and smaller standard errors.
They are preferred because, if both statistics are
unbiased, the more efficient one is closer to the
parameter on average.
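A simulation can make the ratio concrete. The sketch below compares the sample mean (playing the role of G) with the sample median (playing the role of H) as estimators of the center of a normal population. The choice of the median, the population values (mean 50, SD 10), N = 25, the seed, and the replication count are all assumptions made for illustration and are not taken from the slides.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is repeatable

N, REPS = 25, 4000
means, medians = [], []
for _ in range(REPS):
    # One sample of N scores from a normal population (mean 50, SD 10).
    sample = [random.gauss(50, 10) for _ in range(N)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Sampling variance of each statistic, estimated across the replications.
var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)

# Efficiency of the mean (G) relative to the median (H).
rel_eff = var_median / var_mean
print(f"sampling variance of mean:   {var_mean:.2f}")
print(f"sampling variance of median: {var_median:.2f}")
print(f"efficiency of mean relative to median: {rel_eff:.2f}")
```

A ratio above 1 means the mean has the smaller sampling variance here, so for normally distributed scores it is the more efficient of the two statistics.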
