Stat 475 Notes 8
Reading: Lohr, Chapter 4.2-4.5
Note for Homework 2:

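For estimating a ratio $B = \bar{y}_U / \bar{x}_U$ with the estimator
$\hat{B} = \bar{y} / \bar{x}$, the standard error of $\hat{B}$ is
\[
SE(\hat{B}) = \sqrt{\left(1 - \frac{n}{N}\right) \frac{1}{n \bar{x}_U^2} \, \frac{\sum_{i \in S} (y_i - \hat{B} x_i)^2}{n - 1}}
\]
If $\bar{x}_U$ is unknown, then we substitute the sample mean $\bar{x}$ for it.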
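The formula above can be checked numerically. The following is a minimal R sketch, assuming a simple random sample drawn without replacement; the function name `ratio_se`, its arguments, and the small data set are hypothetical and not part of the notes.

```r
# Standard error of the ratio estimator B-hat = ybar / xbar.
# y, x   : sampled values (same length n)
# N      : population size
# xbar_U : population mean of x; if NULL, the sample mean of x is substituted
ratio_se <- function(y, x, N, xbar_U = NULL) {
  n <- length(y)
  stopifnot(length(x) == n, n >= 2, n <= N)
  B_hat <- mean(y) / mean(x)
  if (is.null(xbar_U)) xbar_U <- mean(x)
  # Residual variance about the fitted ratio line through the origin
  resid_var <- sum((y - B_hat * x)^2) / (n - 1)
  se <- sqrt((1 - n / N) * resid_var / (n * xbar_U^2))
  list(B_hat = B_hat, se = se)
}

# Made-up illustrative data (an assumption, not from the notes)
x <- c(10, 12, 9, 15, 11)
y <- c(21, 25, 17, 31, 22)
out <- ratio_se(y, x, N = 100)
print(out)
```

If `y` is exactly proportional to `x`, the residuals vanish and the standard error is zero, which is a quick sanity check on the implementation.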
I. Inference from a stratified sample
Suppose we take a stratified sample from $H$ strata with
$N_1, \ldots, N_H$ units in the population in the strata
($N_1 + \cdots + N_H = N$) and sample sizes in the strata of $n_1, \ldots, n_H$.
Our estimators of the population total and population mean are
\[
\hat{t}_{str} = \sum_{h=1}^{H} \hat{t}_h = \sum_{h=1}^{H} N_h \bar{y}_h
\]
\[
\bar{y}_{str} = \frac{\hat{t}_{str}}{N} = \sum_{h=1}^{H} \frac{N_h}{N} \bar{y}_h
\]
The properties of these estimators follow directly from the
properties of simple random sample estimators:
• Unbiasedness. $\bar{y}_{str}$ and $\hat{t}_{str}$ are unbiased estimators
of $\bar{y}_U$ and $t$. This is true because
\[
E\left[\sum_{h=1}^{H} \frac{N_h}{N} \bar{y}_h\right]
= \sum_{h=1}^{H} \frac{N_h}{N} E[\bar{y}_h]
= \sum_{h=1}^{H} \frac{N_h}{N} \bar{y}_{hU} = \bar{y}_U
\]
• Variance of the estimators. Since we are sampling
independently from the strata and we know $\mathrm{Var}(\hat{t}_h)$ from
simple random sampling theory, we have
\[
\mathrm{Var}(\hat{t}_{str}) = \sum_{h=1}^{H} \mathrm{Var}(\hat{t}_h)
= \sum_{h=1}^{H} \left(1 - \frac{n_h}{N_h}\right) N_h^2 \frac{S_h^2}{n_h}.
\]
• Variance estimates for stratified samples. We can obtain an
unbiased estimator of $\mathrm{Var}(\hat{t}_{str})$ by substituting the sample
estimates $s_h^2$ for the population quantities $S_h^2$. Note that to
estimate the variances, we need at least two units from each
stratum.
\[
\widehat{\mathrm{Var}}(\hat{t}_{str}) = \sum_{h=1}^{H} \left(1 - \frac{n_h}{N_h}\right) N_h^2 \frac{s_h^2}{n_h}
\]
\[
\widehat{\mathrm{Var}}(\bar{y}_{str}) = \frac{1}{N^2} \widehat{\mathrm{Var}}(\hat{t}_{str})
= \sum_{h=1}^{H} \left(1 - \frac{n_h}{N_h}\right) \left(\frac{N_h}{N}\right)^2 \frac{s_h^2}{n_h}
\]
As always, the standard error of an estimator is the square
root of the estimated variance: $SE(\bar{y}_{str}) = \sqrt{\widehat{\mathrm{Var}}(\bar{y}_{str})}$.
• If either (1) the sample sizes within each stratum are large
or (2) the sampling design has a large number of strata, an
approximate 95% confidence interval for the population
mean $\bar{y}_U$ is
\[
\bar{y}_{str} \pm 1.96 \, SE(\bar{y}_{str})
\]
Some survey researchers use the 0.975 quantile of the
t-distribution with $n - H$ degrees of freedom instead of 1.96
(this multiplier converges to 1.96 as $n - H$ gets large).
Example 1: An advertising firm, interested in determining how
much to emphasize television advertising in a certain county,
decides to conduct a sample survey to estimate the average
number of hours each week that households within the county
watch television. The county contains two towns, A and B, and
a rural area. Town A is built around a factory, and most
households contain factory workers with school-age children.
Town B is an exclusive suburb of a city in a neighboring county
and contains older residents with few children at home. There
are 155 households in town A, 62 in town B and 93 in the rural
area.
Merits of using stratified random sampling in this situation: The
population of households falls into three natural groupings, two
towns and a rural area, according to geographic location. Thus,
to use these divisions as three strata is quite natural simply for
administrative convenience in selecting the samples and
carrying out the fieldwork. In addition, each of the three groups
of households should have similar behavioral patterns among
residents within the group. We expect to see relatively small
variability in number of hours of television viewing among
households within a group, and this is precisely the situation in
which stratification produces a reduction in the variance of the
estimate of the population mean.
The advertising firm has enough time and money to interview
$n = 40$ households and decides to select random samples of size
$n_1 = 20$ from town A, $n_2 = 8$ from town B and $n_3 = 12$ from the
rural area (we will discuss the choice of sample sizes later). The
simple random samples are selected and the interviews are
conducted. The data and summaries are shown below.
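# Weekly television viewing hours for the sampled households
towna <- c(35, 43, 36, 39, 28, 28, 29, 25, 38, 27, 26, 32, 29, 40, 35, 41, 37, 31, 45, 34)
townb <- c(27, 15, 4, 41, 49, 25, 10, 30)
rural <- c(8, 14, 12, 15, 30, 32, 21, 20, 34, 7, 11, 24)
# Stratum sample means (console output shown as comments)
mean(towna)
# [1] 33.9
mean(townb)
# [1] 25.125
mean(rural)
# [1] 19
# Stratum sample standard deviations
sd(towna)
# [1] 5.94625
sd(townb)
# [1] 15.24502
sd(rural)
# [1] 9.36143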
A good way to view the key features of these samples and look
for any outliers or unusual features is to make side-by-side
boxplots.
boxplot(towna, townb, rural, names = c("Town A", "Town B", "Rural"), main = "Box plots of Television Viewing Time")
There do not appear to be any outliers or unusual features to be
concerned about.
Note that $N_1 = 155$, $N_2 = 62$, $N_3 = 93$ and $N = 155 + 62 + 93 = 310$.
Our estimate of the population mean is
\[
\bar{y}_{str} = \sum_{h=1}^{H} \frac{N_h}{N} \bar{y}_h
= \frac{1}{310} \left[ (155)(33.90) + (62)(25.12) + (93)(19) \right] = 27.7
\]
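The standard error is
\[
SE(\bar{y}_{str}) = \sqrt{\sum_{h=1}^{H} \left(1 - \frac{n_h}{N_h}\right) \left(\frac{N_h}{N}\right)^2 \frac{s_h^2}{n_h}}
\]
\[
= \sqrt{\left(1 - \frac{20}{155}\right) \left(\frac{155}{310}\right)^2 \frac{5.95^2}{20}
+ \left(1 - \frac{8}{62}\right) \left(\frac{62}{310}\right)^2 \frac{15.25^2}{8}
+ \left(1 - \frac{12}{93}\right) \left(\frac{93}{310}\right)^2 \frac{9.36^2}{12}}
= 1.40
\]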
An approximate 95% confidence interval for the population
mean is
\[
\bar{y}_{str} \pm 1.96 \, SE(\bar{y}_{str}) = 27.7 \pm 1.96 \times 1.40 = (25.0, 30.4)
\]
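The hand calculation for Example 1 can be reproduced in R. The sketch below is one possible way to organise it; the variable names are illustrative choices, and the t-based interval is included only to show the alternative multiplier mentioned earlier. Because the code keeps full precision, its results can differ from the hand calculation in the last reported digit.

```r
# Sample data from Example 1 (hours of television per week)
towna <- c(35, 43, 36, 39, 28, 28, 29, 25, 38, 27,
           26, 32, 29, 40, 35, 41, 37, 31, 45, 34)
townb <- c(27, 15, 4, 41, 49, 25, 10, 30)
rural <- c(8, 14, 12, 15, 30, 32, 21, 20, 34, 7, 11, 24)

samples <- list(towna = towna, townb = townb, rural = rural)
N_h <- c(155, 62, 93)               # stratum population sizes
N   <- sum(N_h)
n_h    <- sapply(samples, length)   # stratum sample sizes
ybar_h <- sapply(samples, mean)     # stratum sample means
s2_h   <- sapply(samples, var)      # stratum sample variances

# Stratified estimate of the mean and its estimated variance
ybar_str <- sum(N_h / N * ybar_h)
var_hat  <- sum((1 - n_h / N_h) * (N_h / N)^2 * s2_h / n_h)
se_str   <- sqrt(var_hat)

# Approximate 95% intervals: normal multiplier and t multiplier (n - H df)
ci_z   <- ybar_str + c(-1, 1) * 1.96 * se_str
t_mult <- qt(0.975, df = sum(n_h) - length(N_h))
ci_t   <- ybar_str + c(-1, 1) * t_mult * se_str

print(round(c(estimate = ybar_str, se = se_str), 3))
print(round(rbind(normal = ci_z, t = ci_t), 2))
```

With only 37 degrees of freedom the t multiplier is somewhat larger than 1.96, so the t-based interval is slightly wider.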
II. Sampling Weights
The stratified sampling estimator $\hat{t}_{str}$ can be expressed as a
weighted sum of the individual sampling units.
\[
\hat{t}_{str} = \sum_{h=1}^{H} N_h \bar{y}_h = \sum_{h=1}^{H} \sum_{j \in S_h} \frac{N_h}{n_h} y_{hj}
\]
The sampling weight $w_{hj} = N_h / n_h$ can be thought of as the
number of units in the population represented by the sample
member $(h, j)$. If the population has 1600 men and 400 women
and the stratified sample design specifies sampling 200 men and
200 women, then each man in the sample has weight 8 and each
woman has weight 2. Each woman in the sample represents
herself and 1 other woman not selected to be in the sample, and
each man represents himself and 7 other men not in the sample.
Note that the probability of selecting the jth unit in the hth
stratum to be in the sample is $\pi_{hj} = n_h / N_h$, the sampling
fraction in the hth stratum. Thus, the sampling weight is simply the
reciprocal of the probability of selection:
\[
w_{hj} = \frac{1}{\pi_{hj}}.
\]
The sum of the sampling weights equals the population size $N$;
each sampled unit “represents” a certain number of units in the
population, so the whole sample “represents” the whole
population.
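The stratified estimate of the population total may thus be
written as:
\[
\hat{t}_{str} = \sum_{h=1}^{H} \sum_{j \in S_h} w_{hj} y_{hj}
\]
and the estimate of the population mean as
\[
\bar{y}_{str} = \frac{\sum_{h=1}^{H} \sum_{j \in S_h} w_{hj} y_{hj}}{\sum_{h=1}^{H} \sum_{j \in S_h} w_{hj}}.
\]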
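The weighted forms of the estimators translate directly into code. Here is a short R sketch using the Example 1 data; the variable names are illustrative, and the point is only that attaching one weight to each sampled household reproduces the stratified estimates.

```r
# Example 1 data (hours of television per week)
towna <- c(35, 43, 36, 39, 28, 28, 29, 25, 38, 27,
           26, 32, 29, 40, 35, 41, 37, 31, 45, 34)
townb <- c(27, 15, 4, 41, 49, 25, 10, 30)
rural <- c(8, 14, 12, 15, 30, 32, 21, 20, 34, 7, 11, 24)

y   <- c(towna, townb, rural)
N_h <- c(155, 62, 93)
n_h <- c(length(towna), length(townb), length(rural))

# One weight per sampled household: w_hj = N_h / n_h
w <- rep(N_h / n_h, times = n_h)

sum_w     <- sum(w)               # should equal the population size N
t_hat_str <- sum(w * y)           # estimated population total
ybar_str  <- sum(w * y) / sum(w)  # estimated population mean

print(c(sum_w = sum_w, total = t_hat_str, mean = ybar_str))
```

This per-unit form is convenient because the same two lines work for any stratified design once each record carries its weight.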
Example 1 continued. In Example 1, the weights are
Stratum   N_h   n_h   w_hj
Town A    155    20   7.75
Town B     62     8   7.75
Rural      93    12   7.75
The sampling weights are identical for each stratum. This is an
example of proportional allocation. In proportional allocation,
so called because the number of sampled units in each stratum is
proportional to the size of the stratum, the probability of
selection $\pi_{hj} = n_h / N_h$ is the same ($= n / N$) for all strata: in a
population of 2400 men and 1600 women, proportional
allocation with a 10% sample would mean sampling 240 men
and 160 women.
For a stratified random sample with proportional allocation, the
probability that an individual will be selected in the sample,
$n / N$, is the same as in a simple random sample, but many of the
“bad” samples that could occur in a simple random sample (for
example, a sample in which all 400 persons are men) cannot be
selected in a sample with proportional allocation.
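Proportional allocation is a one-line calculation. The helper below is a hypothetical convenience function (its name and arguments are not from the notes), applied to the two situations described in the text.

```r
# Proportional allocation: n_h = n * N_h / N
prop_alloc <- function(N_h, n) {
  n * N_h / sum(N_h)
}

# 10% sample from 2400 men and 1600 women
alloc_people <- prop_alloc(c(men = 2400, women = 1600), n = 400)
print(alloc_people)

# Example 1: 40 households across town A, town B and the rural area
alloc_tv <- prop_alloc(c(towna = 155, townb = 62, rural = 93), n = 40)
print(alloc_tv)
```

In practice the results usually need rounding to whole units, which can make the realised sampling fractions differ slightly between strata.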
III. Optimal Allocation
The objective in designing a sample survey is to maximize the
information, i.e., minimize the variance of the estimator of the
desired quantity, for a fixed total cost. Let $C$ represent total cost,
$c_0$ represent overhead cost such as maintaining an office, and $c_h$
represent the cost of taking an observation in stratum $h$, so that
\[
C = c_0 + \sum_{h=1}^{H} c_h n_h.
\]
We want to allocate observations to strata in order to minimize
$\mathrm{Var}(\bar{y}_{str})$ for a given total cost $C$ or, equivalently, to minimize $C$
for a fixed $\mathrm{Var}(\bar{y}_{str})$. Suppose the costs $c_1, \ldots, c_H$ are fixed. To
minimize the variance for a fixed cost, we can prove, using
calculus, that the optimal allocation has $n_h$ proportional to
\[
\frac{N_h S_h}{\sqrt{c_h}}
\]
for each $h$. Thus, the optimal sample size in stratum $h$ is
\[
n_h = \left( \frac{N_h S_h / \sqrt{c_h}}{\sum_{l=1}^{H} N_l S_l / \sqrt{c_l}} \right) n
\]
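The calculus argument is not spelled out in these notes. One standard route, sketched here under the assumption that the $n_h$ can be treated as continuous variables, uses a Lagrange multiplier $\lambda$ (a symbol introduced only for this sketch). The term of the variance that does not involve the $n_h$ is dropped because it does not affect the minimizer.

```latex
% Minimize the n_h-dependent part of Var(ybar_str) subject to the cost constraint
\[
L(n_1, \ldots, n_H, \lambda)
= \sum_{h=1}^{H} \left(\frac{N_h}{N}\right)^2 \frac{S_h^2}{n_h}
+ \lambda \left( c_0 + \sum_{h=1}^{H} c_h n_h - C \right)
\]
% Set the partial derivative with respect to n_h to zero
\[
\frac{\partial L}{\partial n_h}
= -\left(\frac{N_h}{N}\right)^2 \frac{S_h^2}{n_h^2} + \lambda c_h = 0
\quad \Longrightarrow \quad
n_h = \frac{1}{N \sqrt{\lambda}} \cdot \frac{N_h S_h}{\sqrt{c_h}}
\]
% The factor 1 / (N sqrt(lambda)) is the same for every stratum, so
\[
n_h \propto \frac{N_h S_h}{\sqrt{c_h}},
\qquad
n_h = \left( \frac{N_h S_h / \sqrt{c_h}}{\sum_{l=1}^{H} N_l S_l / \sqrt{c_l}} \right) n .
\]
```

The last step simply rescales the proportionality so that the $n_h$ add up to the total sample size $n$.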
We thus sample heavily within a stratum if
• The stratum accounts for a large part of the population.
• The variance within the stratum is large; we sample more
heavily to compensate for the heterogeneity.
• Sampling in the stratum is inexpensive.
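The three effects listed above can be seen by varying one input at a time. The function below is a minimal sketch of the optimal allocation rule; its name, its arguments, and the artificial unit spreads and costs are assumptions made for illustration.

```r
# Optimal allocation: n_h proportional to N_h * S_h / sqrt(c_h)
optimal_alloc <- function(N_h, S_h, c_h, n) {
  stopifnot(length(N_h) == length(S_h), length(S_h) == length(c_h))
  a_h <- N_h * S_h / sqrt(c_h)
  n * a_h / sum(a_h)
}

N_h <- c(155, 62, 93)

# Equal spreads and equal costs: the rule reduces to proportional allocation
base <- optimal_alloc(N_h, S_h = c(1, 1, 1), c_h = c(1, 1, 1), n = 40)

# Doubling the standard deviation in stratum 2 pulls sample toward it
more_spread <- optimal_alloc(N_h, S_h = c(1, 2, 1), c_h = c(1, 1, 1), n = 40)

# Making stratum 3 four times as costly pushes sample away from it
more_costly <- optimal_alloc(N_h, S_h = c(1, 1, 1), c_h = c(1, 1, 4), n = 40)

print(round(rbind(base, more_spread, more_costly), 2))
```

Note that cost enters through its square root, so quadrupling the cost in a stratum halves its allocation weight rather than quartering it.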
The variance of $\bar{y}_{str}$ is
\[
\mathrm{Var}(\bar{y}_{str}) = \sum_{h=1}^{H} \left(1 - \frac{n_h}{N_h}\right) \left(\frac{N_h}{N}\right)^2 \frac{S_h^2}{n_h}
\]
If we would like to set $\mathrm{Var}(\bar{y}_{str})$ equal to some fixed value $D$
and we use the optimal allocation, then we can solve for the
value of $n$ that makes $\mathrm{Var}(\bar{y}_{str})$ equal to $D$:
\[
n = \frac{\left( \sum_{h=1}^{H} N_h S_h / \sqrt{c_h} \right) \left( \sum_{h=1}^{H} N_h S_h \sqrt{c_h} \right)}{N^2 D + \sum_{h=1}^{H} N_h S_h^2}
\]
Example 1 continued. The advertising firm finds that obtaining
an observation from a rural household costs more than obtaining
a response in town A or B. The increase is due to the costs of
traveling from one rural household to another. The cost per
observation in each town is estimated to be $9 (that is,
$c_1 = c_2 = 9$) and the cost per observation in the rural area $16
(that is, $c_3 = 16$). The stratum standard deviations (approximated
by the stratum sample standard deviations from a prior survey) are
$S_1 \approx 5$, $S_2 \approx 15$, $S_3 \approx 10$. Find the overall sample size $n$ and the
stratum sample sizes $n_1, n_2, n_3$ that allow the firm to estimate, at
minimum cost, the average television-viewing time with a
margin of error equal to 2 hours.
The margin of error is half the width of the 95% confidence
interval, which is approximately 2 standard deviations of
$\bar{y}_{str}$. Thus, for a margin of error of 2 hours we want the
standard deviation of $\bar{y}_{str}$, and hence its variance $D$, to be 1.
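We have
\[
\sum_{h=1}^{H} \frac{N_h S_h}{\sqrt{c_h}} = \frac{155(5)}{\sqrt{9}} + \frac{62(15)}{\sqrt{9}} + \frac{93(10)}{\sqrt{16}} = 800.83
\]
\[
\sum_{h=1}^{H} N_h S_h \sqrt{c_h} = 155(5)\sqrt{9} + 62(15)\sqrt{9} + 93(10)\sqrt{16} = 8835
\]
Thus,
\[
n = \frac{\left( \sum_{h=1}^{H} N_h S_h / \sqrt{c_h} \right) \left( \sum_{h=1}^{H} N_h S_h \sqrt{c_h} \right)}{N^2 D + \sum_{h=1}^{H} N_h S_h^2}
= \frac{(800.83)(8835)}{(310)^2 (1) + 27{,}125} = 57.42 \approx 58
\]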
Then,
\[
n_1 = n \left( \frac{N_1 S_1 / \sqrt{c_1}}{\sum_{h=1}^{3} N_h S_h / \sqrt{c_h}} \right)
= 58 \left( \frac{155(5)/3}{800.83} \right) = 58(0.3226) = 18.71 \approx 19
\]
Similarly,
\[
n_2 = 58 \left( \frac{62(15)/3}{800.83} \right) = 58(0.3871) = 22.45 \approx 22
\]
\[
n_3 = 58 \left( \frac{93(10)/4}{800.83} \right) = 58(0.2903) = 16.84 \approx 17
\]
Hence, we should select 19 households at random from town A,
22 from town B, and 17 from the rural area. We can then
estimate the average number of hours spent watching television
at minimum cost with a margin of error of 2 hours.
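The sample-size calculation for this example can be scripted so that the inputs are easy to change. The R sketch below uses the costs and standard deviations given in the example; the variable and function names are illustrative. Because it carries the allocation fractions at full precision rather than two decimals, the unrounded stratum sizes it prints may differ slightly from a hand calculation.

```r
N_h <- c(155, 62, 93)    # stratum sizes
S_h <- c(5, 15, 10)      # approximate stratum standard deviations
c_h <- c(9, 9, 16)       # cost per observation in each stratum
D   <- 1                 # target variance of the stratified mean
N   <- sum(N_h)

a_h   <- N_h * S_h / sqrt(c_h)
sum_a <- sum(a_h)                      # about 800.83
sum_b <- sum(N_h * S_h * sqrt(c_h))    # 8835

# Total sample size that gives Var(ybar_str) = D under optimal allocation
n_exact <- sum_a * sum_b / (N^2 * D + sum(N_h * S_h^2))
n       <- ceiling(n_exact)

# Unrounded stratum sample sizes for the rounded-up total
n_h <- n * a_h / sum_a

# Variance of the stratified mean for a given allocation
var_ybar <- function(alloc) {
  sum((1 - alloc / N_h) * (N_h / N)^2 * S_h^2 / alloc)
}

var_at_exact <- var_ybar(n_exact * a_h / sum_a)  # equals D up to rounding error
var_at_n     <- var_ybar(n_h)                    # slightly below D

print(round(c(n_exact = n_exact, n = n), 2))
print(round(n_h, 2))
print(c(var_at_exact = var_at_exact, var_at_n = var_at_n))
```

Evaluating the variance at the unrounded solution is a useful check that the sample-size formula and the variance formula agree.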
Neyman allocation is a special case of optimal allocation used
when the costs in the strata are approximately equal. Under
Neyman allocation, $n_h$ is proportional to $N_h S_h$.
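To see how Neyman allocation differs from proportional allocation, the sketch below applies both to the Example 1 strata, using the approximate standard deviations from the example and a total of 40 households. The function name `neyman_alloc` is hypothetical.

```r
# Neyman allocation: n_h proportional to N_h * S_h (equal costs assumed)
neyman_alloc <- function(N_h, S_h, n) {
  n * N_h * S_h / sum(N_h * S_h)
}

N_h <- c(towna = 155, townb = 62, rural = 93)
S_h <- c(5, 15, 10)

alloc_neyman <- neyman_alloc(N_h, S_h, n = 40)
alloc_prop   <- 40 * N_h / sum(N_h)

print(round(rbind(proportional = alloc_prop, neyman = alloc_neyman), 1))
```

Town B is small but highly variable, so Neyman allocation gives it more of the sample than its share of the population, at the expense of the more homogeneous town A.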
If all variances in strata and costs are equal, proportional
allocation is the same as optimal allocation. If we know the
variances within each stratum and they differ, optimal allocation
gives a smaller variance than proportional allocation. But
optimal allocation is a more complicated scheme; often the
simplicity and self-weighting property of proportional allocation
are worth the extra variance. In addition, the optimal allocation
will differ for each variable being measured, whereas the
proportional allocation depends only on the number of
population units in each stratum.
Variance comparisons for different designs
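Let $\bar{y}$, $\bar{y}_{str,pa}$ and $\bar{y}_{str,na}$ be, for a sample of size $n$, the mean from a
simple random sample, from a stratified sample with proportional
allocation, and from a stratified sample with Neyman allocation,
respectively. Ignoring the finite population correction,
\[
\mathrm{Var}(\bar{y}_{str,pa}) - \mathrm{Var}(\bar{y}_{str,na}) = \frac{1}{n} \sum_{h=1}^{H} \frac{N_h}{N} (S_h - \bar{S})^2,
\qquad \text{where } \bar{S} = \sum_{h=1}^{H} \frac{N_h}{N} S_h,
\]
and, approximately when the strata are large,
\[
\mathrm{Var}(\bar{y}) - \mathrm{Var}(\bar{y}_{str,pa}) \approx \frac{1}{n} \sum_{h=1}^{H} \frac{N_h}{N} (\bar{y}_{hU} - \bar{y}_U)^2.
\]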
Thus proportional allocation yields the same results as the
optimal Neyman allocation (assuming costs are the same) when
the variances of the strata are all the same, but if the variances
differ, the optimal allocation is better.
To this approximation, stratified random sampling with proportional
allocation never gives a larger variance than simple random sampling.
Comparing the equations for the variances under simple random
sampling, proportional allocation and optimal allocation,
assuming costs of all observations are equal, we see that
stratification with proportional allocation is better than simple
random sampling if the stratum means are quite variable and that
stratification with optimal allocation is even better than
stratification with proportional allocation if the stratum standard
deviations are variable.
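The two variance differences can be evaluated numerically. The sketch below ignores the finite population correction, as the comparison does, and uses the Example 1 stratum sizes and approximate standard deviations. The population stratum means are unknown, so the sample means from Example 1 stand in for them; that substitution is an assumption made purely for illustration.

```r
N_h <- c(155, 62, 93)
S_h <- c(5, 15, 10)
ybar_hU <- c(33.9, 25.125, 19)   # ASSUMPTION: sample means used as stand-ins
n   <- 40

W_h    <- N_h / sum(N_h)         # stratum population shares
S_bar  <- sum(W_h * S_h)         # weighted average standard deviation
ybar_U <- sum(W_h * ybar_hU)

var_pa <- sum(W_h * S_h^2) / n   # proportional allocation, no fpc
var_na <- S_bar^2 / n            # Neyman allocation, no fpc

# Gain of Neyman over proportional: spread of the S_h about S_bar
gain_na <- sum(W_h * (S_h - S_bar)^2) / n

# Gain of proportional over simple random sampling: spread of stratum means
gain_pa <- sum(W_h * (ybar_hU - ybar_U)^2) / n
var_srs <- var_pa + gain_pa      # implied by the approximate second identity

print(round(c(srs = var_srs, proportional = var_pa, neyman = var_na), 3))
print(round(c(gain_pa = gain_pa, gain_na = gain_na), 3))
```

In this illustration both gains are positive: the stratum means differ, which favours stratification over simple random sampling, and the stratum standard deviations differ, which favours Neyman over proportional allocation.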