Stat 410 Tutorial Week 7
February 16th
Review
1. Comparison between stratified random sampling and simple random sampling
In general, the variance of the estimator of T from a stratified random sample will be smaller than
the variance of the estimator of T from an SRS with the same number of observations.
As an example, take a stratified random sample of size n with proportional allocation, so that $n_h / N_h = n / N$ in every stratum.
ANOVA table
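The derivation that follows relies on the usual population ANOVA quantities. As a sketch, assuming the standard definitions (with $y_{hj}$ the value of unit $j$ in stratum $h$, $\bar{Y}_h$ the mean of stratum $h$, and $\bar{Y}$ the overall population mean; this notation is an assumption, chosen to match the formulas in this section), they can be written as:

```latex
\[
\begin{aligned}
SSB  &= \sum_{h=1}^{H} N_h (\bar{Y}_h - \bar{Y})^2, \\
SSW  &= \sum_{h=1}^{H} \sum_{j=1}^{N_h} (y_{hj} - \bar{Y}_h)^2 = \sum_{h=1}^{H} (N_h - 1) S_h^2, \\
SSTO &= \sum_{h=1}^{H} \sum_{j=1}^{N_h} (y_{hj} - \bar{Y})^2 = (N - 1) S^2 = SSW + SSB.
\end{aligned}
\]
```

In particular, $\sum_h N_h S_h^2 = SSW + \sum_h S_h^2$ and $S^2 = SSTO / (N - 1)$, which are the two substitutions used in the derivation.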
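\[
\begin{aligned}
\mathrm{var}_{prop}(\hat{T}_{str}) &= \sum_{h=1}^{H} N_h^2 \left( \frac{1}{n_h} - \frac{1}{N_h} \right) S_h^2 \\
&= \sum_{h=1}^{H} N_h^2 \left( 1 - \frac{n_h}{N_h} \right) \frac{S_h^2}{n_h} \\
&= \sum_{h=1}^{H} \left( 1 - \frac{n_h}{N_h} \right) \frac{N_h}{n_h} N_h S_h^2 \\
&= \left( 1 - \frac{n}{N} \right) \frac{N}{n} \sum_{h=1}^{H} N_h S_h^2 \\
&= \left( 1 - \frac{n}{N} \right) \frac{N}{n} \left( SSW + \sum_{h=1}^{H} S_h^2 \right)
\end{aligned}
\]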
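\[
\begin{aligned}
\mathrm{var}_{SRS}(\hat{T}) &= N^2 \left( \frac{1}{n} - \frac{1}{N} \right) S^2 \\
&= \frac{N^2}{n} \left( 1 - \frac{n}{N} \right) S^2 \\
&= \left( 1 - \frac{n}{N} \right) \frac{N^2}{n} \frac{SSTO}{N - 1} \\
&= \left( 1 - \frac{n}{N} \right) \frac{N^2}{n(N - 1)} (SSW + SSB) \\
&= \mathrm{var}_{prop}(\hat{T}_{str}) + \left( 1 - \frac{n}{N} \right) \frac{N}{n(N - 1)} \left[ N(SSB) - \sum_{h=1}^{H} (N - N_h) S_h^2 \right]
\end{aligned}
\]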
The above result shows that stratification with proportional allocation always gives a smaller variance than SRS unless
\[
SSB < \sum_{h=1}^{H} \left( 1 - \frac{N_h}{N} \right) S_h^2 .
\]
This rarely happens when the $N_h$ are large: in general, large stratum population sizes will force $N_h (\bar{Y}_h - \bar{Y})^2 > S_h^2$.
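The identity and the condition above can be checked numerically. The following is a minimal sketch, not part of the tutorial: the two small populations are made up for illustration, the function name `compare_designs` is hypothetical, and proportional allocation ($n_h = n N_h / N$) is assumed.

```python
import numpy as np


def compare_designs(strata, n):
    """Compare var(T_hat) under proportional allocation and under SRS.

    strata: list of 1-D arrays holding the full population values of each
            stratum (hypothetical data, not taken from the tutorial).
    n:      total sample size; proportional allocation n_h = n * N_h / N
            is assumed.
    """
    N_h = np.array([len(s) for s in strata], dtype=float)
    N = N_h.sum()
    S2_h = np.array([np.var(s, ddof=1) for s in strata])  # S_h^2
    ybar_h = np.array([np.mean(s) for s in strata])       # stratum means
    pooled = np.concatenate(strata)
    S2 = np.var(pooled, ddof=1)                           # S^2 = SSTO / (N - 1)
    ssb = np.sum(N_h * (ybar_h - pooled.mean()) ** 2)     # SSB

    fpc = 1 - n / N
    var_prop = fpc * (N / n) * np.sum(N_h * S2_h)
    var_srs = fpc * (N ** 2 / n) * S2
    # Second term on the right-hand side of the identity linking the two.
    gap = fpc * N / (n * (N - 1)) * (N * ssb - np.sum((N - N_h) * S2_h))
    threshold = np.sum((1 - N_h / N) * S2_h)
    return {"var_prop": var_prop, "var_srs": var_srs, "gap": gap,
            "ssb": ssb, "threshold": threshold}


# Hypothetical population 1: stratum means differ, so SSB is large.
separated = [np.array([1.0, 2.0, 3.0, 4.0]),
             np.array([11.0, 12.0, 13.0, 14.0])]
# Hypothetical population 2: equal stratum means, so SSB = 0.
equal_means = [np.array([0.0, 10.0, 0.0, 10.0]),
               np.array([4.0, 6.0, 4.0, 6.0])]

for name, pop in [("separated", separated), ("equal means", equal_means)]:
    r = compare_designs(pop, n=4)
    # var_SRS = var_prop + gap should hold up to rounding error.
    assert np.isclose(r["var_srs"], r["var_prop"] + r["gap"])
    winner = "stratified" if r["var_prop"] < r["var_srs"] else "SRS"
    print(f"{name}: SSB={r['ssb']:.2f}, threshold={r['threshold']:.2f}, "
          f"smaller variance: {winner}")
```

In the second population the stratum means coincide, so SSB falls below the threshold and SRS has the smaller variance. This is the exceptional case described above.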
Remarks:
1. The more unequal the stratum means $\bar{Y}_h$ (that is, the larger SSB), the more precision you gain by using proportional allocation.
2. Compared with an SRS, stratification almost always gives a smaller sampling variance for estimators of the mean or total when the measurements within each stratum are more homogeneous (that is, when SSW is smaller).
Example (This is a rare case)
It is possible for a variance estimate from proportional allocation to be larger than that from an SRS
merely because the sample selected had large within-stratum sample variance.
A wholesale food distributor in a large city wants to know whether demand is great enough to justify
adding a new product to his stock. To aid in making his decision, he plans to add this product
to a sample of the stores he services in order to estimate average monthly sales. He services only
four large chains in the city, so he uses stratified random sampling with each chain as a stratum
and proportional allocation to set the sample size in each stratum.
Stratum   N_h   n_h   Sales in each store         ȳ_h    s_h^2
1         24    4     94, 90, 102, 110            99     78.67
2         36    6     91, 99, 93, 105, 111, 101   100    55.60
3         30    5     108, 96, 100, 93, 93        98     39.50
4         30    5     92, 110, 94, 91, 113        100    112.5
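\[
\begin{aligned}
\hat{\bar{Y}}_{str} &= \frac{1}{N} \sum_{h=1}^{H} N_h \bar{y}_h \\
&= \frac{1}{120} (24 \cdot 99 + 36 \cdot 100 + 30 \cdot 98 + 30 \cdot 100) \\
&= 99.3
\end{aligned}
\]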
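\[
\begin{aligned}
\widehat{\mathrm{var}}(\hat{\bar{Y}}_{str}) &= \frac{1}{N^2} \sum_{h=1}^{H} N_h^2 \left( \frac{1}{n_h} - \frac{1}{N_h} \right) s_h^2 \\
&= \frac{1}{120^2} \left[ 24^2 \left( \frac{1}{4} - \frac{1}{24} \right) \cdot 78.67 + 36^2 \left( \frac{1}{6} - \frac{1}{36} \right) \cdot 55.60 + 30^2 \left( \frac{1}{5} - \frac{1}{30} \right) \cdot 39.50 + 30^2 \left( \frac{1}{5} - \frac{1}{30} \right) \cdot 112.5 \right] \\
&= 2.93
\end{aligned}
\]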
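The stratified calculation can be reproduced from the raw sales figures in the table. This is a minimal sketch using NumPy; the function name `stratified_mean_and_variance` is illustrative and does not come from the tutorial.

```python
import numpy as np

# Stratum population sizes and sampled sales, copied from the table.
N_h = np.array([24, 36, 30, 30])
samples = [
    np.array([94, 90, 102, 110]),
    np.array([91, 99, 93, 105, 111, 101]),
    np.array([108, 96, 100, 93, 93]),
    np.array([92, 110, 94, 91, 113]),
]


def stratified_mean_and_variance(N_h, samples):
    """Return the stratified estimate of the mean and its estimated variance."""
    N = N_h.sum()
    n_h = np.array([len(s) for s in samples])
    ybar_h = np.array([s.mean() for s in samples])     # stratum sample means
    s2_h = np.array([s.var(ddof=1) for s in samples])  # stratum sample variances
    ybar_str = np.sum(N_h * ybar_h) / N
    var_hat = np.sum(N_h ** 2 * (1 / n_h - 1 / N_h) * s2_h) / N ** 2
    return ybar_str, var_hat


ybar_str, var_str = stratified_mean_and_variance(N_h, samples)
print(round(ybar_str, 1), round(var_str, 2))  # 99.3 2.93
```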
Suppose instead that the 20 stores constitute a simple random sample rather than a stratified random sample.
Then
\[
\hat{\bar{Y}}_{SRS} = \sum_{i \in u} y_i / n = 99.3
\]
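\[
\begin{aligned}
\widehat{\mathrm{var}}(\hat{\bar{Y}}_{SRS}) &= \left( \frac{1}{n} - \frac{1}{N} \right) s^2 \\
&= \left( \frac{1}{n} - \frac{1}{N} \right) \frac{1}{n - 1} \sum_{i \in u} (y_i - \hat{\bar{Y}}_{SRS})^2 \\
&= \left( \frac{1}{n} - \frac{1}{N} \right) \frac{1}{n - 1} \left( \sum_{i \in u} y_i^2 - n \hat{\bar{Y}}_{SRS}^2 \right) \\
&= \left( \frac{1}{20} - \frac{1}{120} \right) \cdot 59.8 \\
&= 2.49
\end{aligned}
\]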
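The SRS calculation can be checked the same way by pooling the 20 observations and ignoring the strata. This is a minimal sketch; the variable names are illustrative, and the stratified value 2.93 is taken from the calculation earlier in this example.

```python
import numpy as np

# All 20 sampled sales figures, pooled across the four chains.
sales = np.array([
    94, 90, 102, 110,
    91, 99, 93, 105, 111, 101,
    108, 96, 100, 93, 93,
    92, 110, 94, 91, 113,
])
N = 120          # total number of stores
n = len(sales)   # sample size (20)

ybar_srs = sales.mean()
s2 = sales.var(ddof=1)           # pooled sample variance s^2
var_srs = (1 / n - 1 / N) * s2   # estimated variance of the SRS mean

print(round(ybar_srs, 1), round(s2, 1), round(var_srs, 2))  # 99.3 59.8 2.49

# The stratified estimate had estimated variance 2.93, so for this
# particular sample the SRS variance estimate is the smaller one.
print(var_srs < 2.93)  # True
```

The point estimates agree because, under proportional allocation, the stratified estimator of the mean reduces to the overall sample mean. Only the variance estimates differ.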