Stat 410 Tutorial Week 7


February 16th

Review
1. Comparison between stratified random sampling and simple random sampling
In general, the variance of the estimator of T from a stratified random sample will be smaller than
the variance of the estimator of T from an SRS with the same number of observations.
Take a stratified random sample of size n with proportional allocation as an example.
ANOVA table

Source            Sum of Squares
Between strata    $SSB = \sum_{h=1}^{H}\sum_{j=1}^{N_h}(\bar{Y}_h-\bar{Y})^2 = \sum_{h=1}^{H} N_h(\bar{Y}_h-\bar{Y})^2$
Within strata     $SSW = \sum_{h=1}^{H}\sum_{j=1}^{N_h}(y_{hj}-\bar{Y}_h)^2 = \sum_{h=1}^{H}(N_h-1)S_h^2$
Total             $SSTO = \sum_{h=1}^{H}\sum_{j=1}^{N_h}(y_{hj}-\bar{Y})^2 = (N-1)S^2$
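The decomposition SSTO = SSB + SSW, which the comparison of the two variances relies on, can be checked numerically. The following is a minimal sketch, assuming a small made-up population (the stratum labels and values are hypothetical and not part of the tutorial); it uses only the Python standard library.

```python
from statistics import mean, variance

# Hypothetical population values grouped by stratum (illustrative only).
strata = {
    "A": [12.0, 15.0, 11.0, 14.0],
    "B": [22.0, 25.0, 21.0, 24.0, 23.0, 26.0],
    "C": [31.0, 35.0, 30.0],
}

all_values = [y for ys in strata.values() for y in ys]
N = len(all_values)
grand_mean = mean(all_values)

# Between strata: sum over h of N_h * (Ybar_h - Ybar)^2
ssb = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in strata.values())
# Within strata: sum over h of (N_h - 1) * S_h^2
ssw = sum((len(ys) - 1) * variance(ys) for ys in strata.values())
# Total: (N - 1) * S^2
ssto = (N - 1) * variance(all_values)

print(f"SSB = {ssb:.4f}, SSW = {ssw:.4f}")
print(f"SSB + SSW = {ssb + ssw:.4f}, SSTO = {ssto:.4f}")
```

Here `variance` is the sample variance with divisor one less than the number of values, which matches the definitions of $S_h^2$ and $S^2$ used in the table.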

Based on proportional allocation, we have


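$$
\frac{n_h}{N_h} = \frac{n}{N}
$$

$$
\begin{aligned}
\mathrm{var}_{prop}(\hat{T}_{str}) &= \sum_{h=1}^{H} N_h^2\left(\frac{1}{n_h}-\frac{1}{N_h}\right)S_h^2 \\
&= \sum_{h=1}^{H} N_h^2\left(1-\frac{n_h}{N_h}\right)\frac{S_h^2}{n_h} \\
&= \sum_{h=1}^{H} \left(1-\frac{n_h}{N_h}\right)\frac{N_h}{n_h}\,N_h S_h^2 \\
&= \left(1-\frac{n}{N}\right)\frac{N}{n}\sum_{h=1}^{H} N_h S_h^2 \\
&= \left(1-\frac{n}{N}\right)\frac{N}{n}\left(SSW+\sum_{h=1}^{H} S_h^2\right)
\end{aligned}
$$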

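$$
\begin{aligned}
\mathrm{var}_{SRS}(\hat{T}) &= N^2\left(\frac{1}{n}-\frac{1}{N}\right)S^2 \\
&= \frac{N^2}{n}\left(1-\frac{n}{N}\right)S^2 \\
&= \left(1-\frac{n}{N}\right)\frac{N^2}{n}\,\frac{SSTO}{N-1} \\
&= \left(1-\frac{n}{N}\right)\frac{N^2}{n(N-1)}\,(SSW+SSB) \\
&= \mathrm{var}_{prop}(\hat{T}_{str}) + \left(1-\frac{n}{N}\right)\frac{N}{n(N-1)}\left[N(SSB)-\sum_{h=1}^{H}(N-N_h)S_h^2\right]
\end{aligned}
$$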

The result above shows that stratification with proportional allocation gives a smaller variance than SRS unless

$$
SSB < \sum_{h=1}^{H}\left(1-\frac{N_h}{N}\right)S_h^2 .
$$

This rarely happens when the $N_h$ are large: in general, large stratum population sizes force $N_h(\bar{Y}_h-\bar{Y})^2 > S_h^2$.
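The relation between the two variances and the condition on SSB can be checked with a short script. This is only a sketch under stated assumptions: the stratum sizes, means, variances and the sample size below are hypothetical numbers chosen for illustration, and `compare` is a helper name introduced here, not something from the tutorial.

```python
import math

def compare(N_h, means, S2_h, n):
    """Return (var_prop, var_srs, gap) for the estimated total,
    where gap is the bracketed difference term in the last line of the SRS derivation."""
    N = sum(N_h)
    grand_mean = sum(Nh * m for Nh, m in zip(N_h, means)) / N
    ssb = sum(Nh * (m - grand_mean) ** 2 for Nh, m in zip(N_h, means))
    ssw = sum((Nh - 1) * s2 for Nh, s2 in zip(N_h, S2_h))
    fpc = 1 - n / N  # finite population correction (1 - n/N)
    v_prop = fpc * (N / n) * sum(Nh * s2 for Nh, s2 in zip(N_h, S2_h))
    v_srs = fpc * (N ** 2 / n) * (ssw + ssb) / (N - 1)
    weighted_s2 = sum((N - Nh) * s2 for Nh, s2 in zip(N_h, S2_h))
    gap = fpc * N / (n * (N - 1)) * (N * ssb - weighted_s2)
    return v_prop, v_srs, gap

# Hypothetical strata; n = 15 gives proportional allocation n_h = 4, 6, 5.
N_h = [40, 60, 50]
S2_h = [25.0, 16.0, 36.0]
n = 15

unequal = compare(N_h, [10.0, 20.0, 30.0], S2_h, n)  # stratum means differ
equal = compare(N_h, [20.0, 20.0, 20.0], S2_h, n)    # SSB = 0

for label, (v_prop, v_srs, gap) in [("unequal means", unequal), ("equal means", equal)]:
    print(f"{label}: var_prop = {v_prop:.1f}, var_SRS = {v_srs:.1f}, gap = {gap:.1f}")
```

With unequal stratum means the gap is positive and stratification wins; with identical stratum means SSB is zero, the condition above holds, and SRS has the slightly smaller variance.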
Remarks:
1. The more unequal the stratum means $\bar{Y}_h$ (the larger SSB is), the more precision you gain by using proportional allocation.
2. Compared with an SRS, stratification almost always gives a smaller sampling variance for estimators of the mean or total when the measurements within strata are homogeneous (SSW is small).
Example (This is a rare case)
It is possible for a variance estimate from proportional allocation to be larger than that from an SRS
merely because the sample selected had large within-stratum sample variance.
A wholesale food distributor in a large city wants to know whether demand is great enough to justify
adding a new product to his stock. To aid in making his decision, he plans to add this product
to a sample of the stores he services in order to estimate average monthly sales. He only services
four large chains in the city. He uses stratified random sampling with each chain as a stratum and
proportional allocation is used when deciding the sample sizes in each stratum.
Stratum   $N_h$   $n_h$   Sales in each store         $\bar{y}_h$   $s_h^2$
1         24      4       94, 90, 102, 110            99            78.67
2         36      6       91, 99, 93, 105, 111, 101   100           55.60
3         30      5       108, 96, 100, 93, 93        98            39.50
4         30      5       92, 110, 94, 91, 113        100           112.5

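$$
\begin{aligned}
\hat{\bar{Y}}_{str} &= \frac{1}{N}\sum_{h=1}^{H} N_h \bar{y}_h \\
&= \frac{1}{120}\,(24 \times 99 + 36 \times 100 + 30 \times 98 + 30 \times 100) \\
&= 99.3
\end{aligned}
$$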

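$$
\begin{aligned}
\widehat{\mathrm{var}}(\hat{\bar{Y}}_{str}) &= \frac{1}{N^2}\sum_{h=1}^{H} N_h^2\left(\frac{1}{n_h}-\frac{1}{N_h}\right)s_h^2 \\
&= \frac{1}{120^2}\left[24^2\left(\frac{1}{4}-\frac{1}{24}\right)\times 78.67 + 36^2\left(\frac{1}{6}-\frac{1}{36}\right)\times 55.60 + 30^2\left(\frac{1}{5}-\frac{1}{30}\right)\times 39.50 + 30^2\left(\frac{1}{5}-\frac{1}{30}\right)\times 112.5\right] \\
&= 2.93
\end{aligned}
$$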
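The stratified estimate and its estimated variance can be reproduced from the table with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are chosen here for illustration.

```python
from statistics import mean, variance

# Data from the example table: stratum -> (N_h, sampled store sales)
strata = {
    1: (24, [94, 90, 102, 110]),
    2: (36, [91, 99, 93, 105, 111, 101]),
    3: (30, [108, 96, 100, 93, 93]),
    4: (30, [92, 110, 94, 91, 113]),
}
N = sum(N_h for N_h, _ in strata.values())  # 120 stores in total

# Stratified estimate of the mean: (1/N) * sum of N_h * ybar_h
y_str = sum(N_h * mean(ys) for N_h, ys in strata.values()) / N

# Estimated variance: (1/N^2) * sum of N_h^2 * (1/n_h - 1/N_h) * s_h^2
var_str = sum(
    N_h ** 2 * (1 / len(ys) - 1 / N_h) * variance(ys)
    for N_h, ys in strata.values()
) / N ** 2

print(round(y_str, 1), round(var_str, 2))  # 99.3 2.93
```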

Suppose the 20 stores constitute a simple random sample rather than a stratified random sample. Then

$$
\hat{\bar{Y}}_{SRS} = \sum_{i\in u} y_i / n = 99.3
$$

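$$
\begin{aligned}
\widehat{\mathrm{var}}(\hat{\bar{Y}}_{SRS}) &= \left(\frac{1}{n}-\frac{1}{N}\right)s^2 \\
&= \left(\frac{1}{n}-\frac{1}{N}\right)\frac{1}{n-1}\sum_{i\in u}\left(y_i-\hat{\bar{Y}}_{SRS}\right)^2 \\
&= \left(\frac{1}{n}-\frac{1}{N}\right)\frac{1}{n-1}\left(\sum_{i\in u} y_i^2 - n\,\hat{\bar{Y}}_{SRS}^2\right) \\
&= \left(\frac{1}{20}-\frac{1}{120}\right)\times 59.8 \\
&= 2.49
\end{aligned}
$$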
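The same 20 observations, pooled and treated as a simple random sample, give the figures above. The sketch below again uses only the Python standard library, with variable names chosen for illustration.

```python
from statistics import mean, variance

# The 20 sampled store sales from the example table, pooled across chains.
sales = [
    94, 90, 102, 110,
    91, 99, 93, 105, 111, 101,
    108, 96, 100, 93, 93,
    92, 110, 94, 91, 113,
]
n, N = len(sales), 120

y_srs = mean(sales)             # sample mean
s2 = variance(sales)            # sample variance with divisor n - 1
var_srs = (1 / n - 1 / N) * s2  # estimated variance of the sample mean

print(round(y_srs, 1), round(s2, 1), round(var_srs, 2))  # 99.3 59.8 2.49
```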

Because $\widehat{\mathrm{var}}(\hat{\bar{Y}}_{str}) = 2.93 > \widehat{\mathrm{var}}(\hat{\bar{Y}}_{SRS}) = 2.49$, we conclude that simple random sampling may have been better than stratified random sampling for this problem. The experimenter did not take into account that sales vary greatly among stores within these strata. He could have obtained a smaller variance for his estimator by stratifying on amount of sales, that is, by putting stores with low monthly sales in one stratum, stores with high sales in another, and so forth.
