When To Use This Sampling? Sampling With Probability Proportional To Size Measure (PPS)
Examples

Objective: To estimate the total number of unemployed youth in a district.
Auxiliary information: the number of households in the village.

Objective: To estimate the total number of tube wells in a certain district.
Auxiliary information: the number of tube wells in a village for a previous period; the net irrigated area of the village.

Objective: To estimate the number of job openings in a city by sampling firms in that city.
Auxiliary information: the number of employees in a firm of the city.
Total size: $X = \sum_{i=1}^{N} X_i$
1. Select a random number $i$ such that $1 \le i \le N$.
2. Select another random number (say) $j$ such that $1 \le j \le M$, where $M$ is either equal to the maximum of the sizes $\{X_i\}$, $i = 1, 2, \ldots, N$, or is more than the maximum size in the population.
3. If $j \le X_i$, the $i$th unit is selected; otherwise the pair $(i, j)$ of random numbers is rejected and a fresh pair is drawn.
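A minimal Python sketch of this rejection procedure (Lahiri's method of pps selection); the function name lahiri_pps_draw and the household counts used as sizes are illustrative assumptions, not part of the notes.

```python
import random

def lahiri_pps_draw(sizes):
    """Select one unit with probability proportional to size using
    Lahiri's rejection method (illustrative sketch, 0-based unit labels)."""
    N = len(sizes)
    M = max(sizes)                    # M must be >= the largest size X_i
    while True:
        i = random.randrange(N)       # step 1: random unit label i
        j = random.randint(1, M)      # step 2: random number j with 1 <= j <= M
        if j <= sizes[i]:             # step 3: accept unit i, else reject (i, j)
            return i

# Example: villages with household counts as the size measure
sizes = [120, 80, 200, 50]
sample = [lahiri_pps_draw(sizes) for _ in range(5)]   # pps with replacement
print(sample)
```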
Proof
$$E(\hat{Y}_{HH}) = E\left[\frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{P_i}\right] = \frac{1}{n}\sum_{i=1}^{n}E\left(\frac{y_i}{P_i}\right)$$
Now $\frac{y_i}{P_i}$ is a random variable that can take the values $\frac{Y_1}{P_1}, \frac{Y_2}{P_2}, \ldots, \frac{Y_N}{P_N}$ with probabilities $P_1, P_2, \ldots, P_N$ respectively.
$$E\left(\frac{y_i}{P_i}\right) = \sum_{i=1}^{N}\left(\frac{Y_i}{P_i}\right)P_i = \sum_{i=1}^{N}Y_i = Y$$
$$E(\hat{Y}_{HH}) = \frac{1}{n}\sum_{i=1}^{n}Y = Y$$
Alternatively,
$$E(\hat{Y}_{HH}) = E\left[\frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{P_i}\right] = E\left[\frac{1}{n}\sum_{i=1}^{N}r_i\frac{Y_i}{P_i}\right]$$
where $r_i$ is the number of times the $i$th population unit occurs in the sample. Since $E(r_i) = nP_i$,
$$E(\hat{Y}_{HH}) = \frac{1}{n}\sum_{i=1}^{N}nP_i\frac{Y_i}{P_i} = \sum_{i=1}^{N}Y_i = Y$$
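As a quick numerical illustration (the population values and selection probabilities below are hypothetical), the Monte Carlo average of $\hat{Y}_{HH}$ over repeated pps-with-replacement samples should come out close to the total $Y$:

```python
import random

# Hypothetical population values and pps selection probabilities (sum to 1)
Y = [12.0, 30.0, 28.0, 30.0]
P = [0.1, 0.4, 0.25, 0.25]
n, reps = 3, 100_000

total = sum(Y)                         # true total Y = 100
est_sum = 0.0
for _ in range(reps):
    draws = random.choices(range(len(Y)), weights=P, k=n)   # pps with replacement
    est_sum += sum(Y[i] / P[i] for i in draws) / n          # Hansen-Hurwitz estimate
print(total, est_sum / reps)           # the average should be close to 100
```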
$$V(\hat{Y}_{HH}) = \frac{1}{n}\left[\sum_{i=1}^{N}\frac{Y_i^2}{P_i} - Y^2\right] \qquad\text{or}\qquad V(\hat{Y}_{HH}) = \frac{1}{n}\sum_{i=1}^{N}\sum_{j>i}^{N}P_iP_j\left(\frac{Y_i}{P_i} - \frac{Y_j}{P_j}\right)^2$$
Proof
Method 1
$$V(\hat{Y}_{HH}) = V\left[\frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{P_i}\right] = \frac{1}{n^2}\sum_{i=1}^{n}V\left(\frac{y_i}{P_i}\right)$$
Now
$$V\left(\frac{y_i}{P_i}\right) = E\left[\left(\frac{y_i}{P_i}\right)^2\right] - \left\{E\left(\frac{y_i}{P_i}\right)\right\}^2 = \sum_{i=1}^{N}P_i\left(\frac{Y_i^2}{P_i^2}\right) - Y^2 = \sum_{i=1}^{N}\frac{Y_i^2}{P_i} - Y^2$$
$$V(\hat{Y}_{HH}) = \frac{1}{n}\left[\sum_{i=1}^{N}\frac{Y_i^2}{P_i} - Y^2\right]$$
Method 2
$$V(\hat{Y}_{HH}) = V\left[\frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{P_i}\right] = V\left[\frac{1}{n}\sum_{i=1}^{N}r_i\frac{Y_i}{P_i}\right]$$
$$= \frac{1}{n^2}\left[\sum_{i=1}^{N}\frac{Y_i^2}{P_i^2}V(r_i) + \sum_{i\neq j=1}^{N}\frac{Y_iY_j}{P_iP_j}\operatorname{Cov}(r_i, r_j)\right]$$
Since $(r_1, r_2, \ldots, r_N)$ is multinomial, $V(r_i) = nP_i(1-P_i)$ and $\operatorname{Cov}(r_i, r_j) = -nP_iP_j$; substituting these gives the same result,
$$V(\hat{Y}_{HH}) = \frac{1}{n}\left[\sum_{i=1}^{N}\frac{Y_i^2}{P_i} - Y^2\right]$$
The above expression involves unknowns (the $Y_i$'s) and therefore has to be estimated from the sample; this gives the estimated standard error (estimated sampling error). The gain of pps sampling over SRSWR for the population mean is
$$V(\hat{\bar{Y}})_{SRSWR} - V(\hat{\bar{Y}})_{pps} = \frac{1}{nN^2}\left[N\sum_{i=1}^{N}Y_i^2 - \sum_{i=1}^{N}\frac{Y_i^2}{P_i}\right] = \frac{1}{nN^2}\left[\sum_{i=1}^{N}Y_i^2\left(N - \frac{1}{P_i}\right)\right] \qquad (1)$$
Now we can estimate (1) by $\frac{1}{n^2N^2}\left[\sum_{i=1}^{n}\frac{y_i^2}{P_i}\left(N - \frac{1}{P_i}\right)\right]$ from a sample drawn with varying probabilities and with replacement, since
$$E\left\{\frac{1}{n^2N^2}\left[\sum_{i=1}^{n}\frac{y_i^2}{P_i}\left(N - \frac{1}{P_i}\right)\right]\right\} = E\left\{\frac{1}{n^2N^2}\left[\sum_{i=1}^{N}\frac{Y_i^2}{P_i}\left(N - \frac{1}{P_i}\right)\alpha_i\right]\right\}, \qquad E(\alpha_i) = nP_i$$
$$= \frac{1}{nN^2}\left[\sum_{i=1}^{N}Y_i^2\left(N - \frac{1}{P_i}\right)\right]$$
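A short Python sketch (same hypothetical population as above) that evaluates both forms of $V(\hat{Y}_{HH})$ and the gain expression (1):

```python
from itertools import combinations

# Hypothetical population values and pps selection probabilities (sum to 1)
Y = [12.0, 30.0, 28.0, 30.0]
P = [0.1, 0.4, 0.25, 0.25]
N, n = len(Y), 3
total = sum(Y)

# Form 1: V = (1/n) [ sum Y_i^2 / P_i - Y^2 ]
v_form1 = (sum(y * y / p for y, p in zip(Y, P)) - total ** 2) / n

# Form 2: V = (1/n) sum_{i<j} P_i P_j (Y_i/P_i - Y_j/P_j)^2
v_form2 = sum(P[i] * P[j] * (Y[i] / P[i] - Y[j] / P[j]) ** 2
              for i, j in combinations(range(N), 2)) / n
print(v_form1, v_form2)        # the two forms agree

# Gain of pps over SRSWR for the mean, expression (1)
gain = sum(y * y * (N - 1 / p) for y, p in zip(Y, P)) / (n * N ** 2)
print(gain)
```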
Theorem # 4: If $\pi_i > 0$ $(i = 1, 2, \ldots, N)$, then $\hat{Y}_{HT} = \sum_{i=1}^{n}\frac{y_i}{\pi_i}$ is an unbiased estimator of $Y$, with variance
$$V(\hat{Y}_{HT}) = \sum_{i=1}^{N}\frac{(1-\pi_i)}{\pi_i}y_i^2 + 2\sum_{i=1}^{N}\sum_{j>i}^{N}\frac{(\pi_{ij}-\pi_i\pi_j)}{\pi_i\pi_j}y_iy_j$$
Proof
Let $t_i$ $(i = 1, 2, \ldots, N)$ be a random variable that takes the value 1 if the $i$th unit is drawn and 0 otherwise. Then $t_i$ follows a binomial distribution with sample size 1 and probability $\pi_i$, so
$$E(t_i) = \pi_i, \qquad V(t_i) = \pi_i(1-\pi_i), \qquad \operatorname{Cov}(t_i, t_j) = E(t_it_j) - E(t_i)E(t_j) = \pi_{ij} - \pi_i\pi_j$$
$$E(\hat{Y}_{HT}) = E\left(\sum_{i=1}^{N}t_i\frac{y_i}{\pi_i}\right) = \sum_{i=1}^{N}y_i = Y$$
$$V(\hat{Y}_{HT}) = V\left(\sum_{i=1}^{N}t_i\frac{y_i}{\pi_i}\right) = \sum_{i=1}^{N}\left(\frac{y_i}{\pi_i}\right)^2V(t_i) + 2\sum_{i=1}^{N}\sum_{j>i}^{N}\frac{y_i}{\pi_i}\frac{y_j}{\pi_j}\operatorname{Cov}(t_i, t_j)$$
Alternative Expression
$$\sum_{i=1}^{N}\frac{(1-\pi_i)}{\pi_i}y_i^2 = \sum_{i=1}^{N}\frac{\pi_i(1-\pi_i)}{\pi_i^2}y_i^2$$
Now $\sum_{j\neq i}^{N}(\pi_{ij} - \pi_i\pi_j) = \sum_{j\neq i}^{N}\pi_{ij} - \pi_i\sum_{j\neq i}^{N}\pi_j = (n-1)\pi_i - \pi_i(n - \pi_i) = -\pi_i(1-\pi_i)$, so
$$\sum_{i=1}^{N}\frac{(1-\pi_i)}{\pi_i}y_i^2 = \sum_{i=1}^{N}\sum_{j\neq i}^{N}(\pi_i\pi_j - \pi_{ij})\left(\frac{y_i}{\pi_i}\right)^2 = \sum_{i=1}^{N}\sum_{j>i}^{N}(\pi_i\pi_j - \pi_{ij})\left[\left(\frac{y_i}{\pi_i}\right)^2 + \left(\frac{y_j}{\pi_j}\right)^2\right]$$
Hence
$$V(\hat{Y}_{HT})_{SYG} = \sum_{i=1}^{N}\frac{(1-\pi_i)}{\pi_i}y_i^2 + 2\sum_{i=1}^{N}\sum_{j>i}^{N}\frac{(\pi_{ij}-\pi_i\pi_j)}{\pi_i\pi_j}y_iy_j = \sum_{i=1}^{N}\sum_{j>i}^{N}(\pi_i\pi_j - \pi_{ij})\left[\left(\frac{y_i}{\pi_i}\right)^2 + \left(\frac{y_j}{\pi_j}\right)^2 - 2\frac{y_i}{\pi_i}\frac{y_j}{\pi_j}\right]$$
$$= \sum_{i=1}^{N}\sum_{j>i}^{N}(\pi_i\pi_j - \pi_{ij})\left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2$$
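A small sketch under SRSWOR, where $\pi_i = n/N$ and $\pi_{ij} = n(n-1)/(N(N-1))$, checking that the Horvitz-Thompson and Sen-Yates-Grundy variance forms agree; the population values are hypothetical.

```python
from itertools import combinations

# Hypothetical population; under SRSWOR pi_i = n/N and pi_ij = n(n-1)/(N(N-1))
Y = [12.0, 30.0, 28.0, 30.0]
N, n = len(Y), 2
pi = [n / N] * N
pi_ij = n * (n - 1) / (N * (N - 1))

# Horvitz-Thompson form of the variance
v_ht = (sum((1 - pi[i]) / pi[i] * Y[i] ** 2 for i in range(N))
        + 2 * sum((pi_ij - pi[i] * pi[j]) / (pi[i] * pi[j]) * Y[i] * Y[j]
                  for i, j in combinations(range(N), 2)))

# Sen-Yates-Grundy form of the variance
v_syg = sum((pi[i] * pi[j] - pi_ij) * (Y[i] / pi[i] - Y[j] / pi[j]) ** 2
            for i, j in combinations(range(N), 2))
print(v_ht, v_syg)             # both forms give the same value (304 here)

# Horvitz-Thompson estimate from one particular sample, e.g. units {0, 2}
print(sum(Y[i] / pi[i] for i in (0, 2)))
```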
An unbiased estimator of $V(\hat{Y}_{HT})$ is
$$v_1(\hat{Y}_{HT}) = \sum_{i=1}^{n}\frac{(1-\pi_i)}{\pi_i^2}y_i^2 + 2\sum_{i=1}^{n}\sum_{j>i}^{n}\frac{(\pi_{ij}-\pi_i\pi_j)}{\pi_i\pi_j\pi_{ij}}y_iy_j$$
Proof
The estimator of the 1st part of $V(\hat{Y}_{HT})$ will be $\sum_{i=1}^{n}\frac{(1-\pi_i)}{\pi_i}y_i^2\alpha_i$ (a linear estimator).
$$E\left[\sum_{i=1}^{n}\frac{(1-\pi_i)}{\pi_i}y_i^2\alpha_i\right] = E\left[\sum_{i=1}^{N}\frac{(1-\pi_i)}{\pi_i}y_i^2\alpha_it_i\right] = \sum_{i=1}^{N}\frac{(1-\pi_i)}{\pi_i}y_i^2\alpha_iE(t_i) = \sum_{i=1}^{N}\frac{(1-\pi_i)}{\pi_i}y_i^2\alpha_i\pi_i$$
For unbiasedness this must equal $\sum_{i=1}^{N}\frac{(1-\pi_i)}{\pi_i}y_i^2$, so
$$\alpha_i\pi_i = 1 \quad\Rightarrow\quad \alpha_i = 1/\pi_i$$
The estimator of the 2nd part of $V(\hat{Y}_{HT})$ will be $\sum_{i=1}^{n}\sum_{j>i}^{n}\frac{(\pi_{ij}-\pi_i\pi_j)}{\pi_i\pi_j}y_iy_j\alpha_{ij}$.
$$E\left[\sum_{i=1}^{n}\sum_{j>i}^{n}\frac{(\pi_{ij}-\pi_i\pi_j)}{\pi_i\pi_j}y_iy_j\alpha_{ij}\right] = E\left[\sum_{i=1}^{N}\sum_{j>i}^{N}\frac{(\pi_{ij}-\pi_i\pi_j)}{\pi_i\pi_j}y_iy_j\alpha_{ij}t_{ij}\right] = \sum_{i=1}^{N}\sum_{j>i}^{N}\frac{(\pi_{ij}-\pi_i\pi_j)}{\pi_i\pi_j}y_iy_j\alpha_{ij}\pi_{ij}$$
where $t_{ij}$ takes the value 1 if both the $i$th and $j$th units are in the sample and 0 otherwise, so that $E(t_{ij}) = \pi_{ij}$. For unbiasedness this must equal $\sum_{i=1}^{N}\sum_{j>i}^{N}\frac{(\pi_{ij}-\pi_i\pi_j)}{\pi_i\pi_j}y_iy_j$, so
$$\alpha_{ij}\pi_{ij} = 1 \quad\Rightarrow\quad \alpha_{ij} = 1/\pi_{ij}$$
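A sketch (again under SRSWOR with hypothetical values) showing that averaging $v_1(\hat{Y}_{HT})$ over all equally likely samples reproduces $V(\hat{Y}_{HT})$:

```python
from itertools import combinations

# Hypothetical population under SRSWOR: pi_i = n/N, pi_ij = n(n-1)/(N(N-1))
Y = [12.0, 30.0, 28.0, 30.0]
N, n = len(Y), 2
pi = n / N
pi_ij = n * (n - 1) / (N * (N - 1))

def v1(sample):
    """Estimated variance v1(Y_HT) computed from the sampled units only."""
    first = sum((1 - pi) / pi ** 2 * Y[i] ** 2 for i in sample)
    second = 2 * sum((pi_ij - pi * pi) / (pi * pi * pi_ij) * Y[i] * Y[j]
                     for i, j in combinations(sample, 2))
    return first + second

# Average of v1 over all C(4,2) = 6 equally likely samples equals V(Y_HT) = 304
samples = list(combinations(range(N), 2))
print(sum(v1(s) for s in samples) / len(samples))
```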
Inclusion Probabilities
Suppose a population consists of 4 units, $U = \{1, 2, 3, 4\}$, and we draw a random sample of size $n = 2$. The number of samples (wor) is $\binom{4}{2} = 6$.
The possible simple random samples are
$S_1 = \{1,2\}$, $S_2 = \{1,3\}$, $S_3 = \{1,4\}$, $S_4 = \{2,3\}$, $S_5 = \{2,4\}$, $S_6 = \{3,4\}$
The selection probability of each of the 6 samples is $p(s) = \frac{1}{6}$. Each unit appears in 3 of the 6 samples, so $\pi_i = \frac{3}{6} = \frac{1}{2}$, and each pair of units appears in exactly one sample, so $\pi_{ij} = \frac{1}{6}$. In general,
$$\sum_{i=1}^{N}\sum_{j>i}^{N}\pi_{ij} = \frac{1}{2}n(n-1)$$
For pps sampling without replacement with $n = 2$ (the two units drawn successively), the $i$th unit can be selected either at the first draw or at the second draw, so
$$\pi_i = P_i + \sum_{j\neq i=1}^{N}P_j\frac{P_i}{1-P_j} = P_i\left[1 + \sum_{j\neq i=1}^{N}\frac{P_j}{1-P_j}\right] = P_i\left[1 + \sum_{j=1}^{N}\frac{P_j}{1-P_j} - \frac{P_i}{1-P_i}\right]$$
$\pi_{ij}$ = total probability that the $i$th unit and the $j$th unit will both be selected in a sample of size 2
$$\pi_{ij} = \frac{P_iP_j}{1-P_i} + \frac{P_iP_j}{1-P_j} = P_iP_j\left[\frac{1}{1-P_i} + \frac{1}{1-P_j}\right]$$
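A sketch that computes these inclusion probabilities for a hypothetical set of $P_i$'s and checks them against an exact enumeration of the two ordered draws, together with the identities $\sum_i \pi_i = n$ and $\sum_{i<j}\pi_{ij} = \frac{1}{2}n(n-1)$:

```python
# Hypothetical initial selection probabilities P_i (sum to 1); n = 2 draws
P = [0.1, 0.2, 0.3, 0.4]
N = len(P)

# Formula-based inclusion probabilities for successive pps draws (wor, n = 2)
pi = [P[i] * (1 + sum(P[j] / (1 - P[j]) for j in range(N) if j != i))
      for i in range(N)]
pi_ij = {(i, j): P[i] * P[j] * (1 / (1 - P[i]) + 1 / (1 - P[j]))
         for i in range(N) for j in range(i + 1, N)}
print(sum(pi))                 # equals n = 2
print(sum(pi_ij.values()))     # equals n(n-1)/2 = 1

# Exact enumeration of the ordered draws: i first (prob P_i), then j (prob P_j/(1-P_i))
pi_check = [0.0] * N
for i in range(N):
    for j in range(N):
        if j != i:
            p = P[i] * P[j] / (1 - P[i])
            pi_check[i] += p
            pi_check[j] += p
print(pi_check)                # matches the formula-based pi
```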
Lahiri-Midzuno-Sen Sampling (WOR)
Step 1: The unit at the first draw is selected with unequal probability (probability proportional to size).
Step 2: The remaining units are selected by SRSWOR at all subsequent draws.
$\pi_i$ = total probability that the $i$th unit will be selected at either the first draw or one of the subsequent $(n-1)$ draws
= $P$(ith unit selected at the first draw) + $P$(ith unit not selected at the first draw but selected at one of the subsequent $n-1$ draws)
$$\pi_i = P_i + (1-P_i)\frac{(n-1)}{(N-1)} = \frac{N-n}{N-1}P_i + \frac{n-1}{N-1}$$
$\pi_{ij}$ = total probability that the $i$th unit and the $j$th unit will both be selected in the sample
= Probability that the $i$th unit is selected at the 1st draw and the $j$th unit at one of the subsequent $n-1$ draws
+ Probability that the $j$th unit is selected at the 1st draw and the $i$th unit at one of the subsequent $n-1$ draws
+ Probability that neither the $i$th nor the $j$th unit is selected at the 1st draw but both are selected among the subsequent $n-1$ draws
$$\pi_{ij} = P_i\frac{(n-1)}{(N-1)} + P_j\frac{(n-1)}{(N-1)} + (1-P_i-P_j)\frac{(n-1)}{(N-1)}\frac{(n-2)}{(N-2)} = \frac{n-1}{N-1}\left[\frac{N-n}{N-2}(P_i+P_j) + \frac{(n-2)}{(N-2)}\right]$$
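A sketch computing the Lahiri-Midzuno-Sen inclusion probabilities for hypothetical $P_i$'s, with the usual consistency checks $\sum_i \pi_i = n$ and $\sum_{i<j}\pi_{ij} = \frac{1}{2}n(n-1)$:

```python
from itertools import combinations

# Hypothetical first-draw (size-based) probabilities P_i (sum to 1)
P = [0.1, 0.2, 0.3, 0.4]
N, n = len(P), 3

# Lahiri-Midzuno-Sen inclusion probabilities
pi = [(N - n) / (N - 1) * P[i] + (n - 1) / (N - 1) for i in range(N)]
pi_ij = {(i, j): (n - 1) / (N - 1) *
                 ((N - n) / (N - 2) * (P[i] + P[j]) + (n - 2) / (N - 2))
         for i, j in combinations(range(N), 2)}

print(pi)
print(sum(pi))                 # equals n = 3
print(sum(pi_ij.values()))     # equals n(n-1)/2 = 3
```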
Additional Ideas
Another Expression for $V(\hat{Y}_{HH})$ for the Hansen-Hurwitz (HH) Estimator
$$\left(\sum_{j=1}^{N}a_j\right)\left(\sum_{k=1}^{N}b_k\right) = \sum_{j=1}^{N}\sum_{k=1}^{N}a_jb_k = \sum_{i=1}^{N}a_ib_i + \sum_{j\neq k}a_jb_k$$
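Applying the identity with $a_i = b_i = Y_i$ gives $Y^2 = \left(\sum_{i=1}^{N}Y_i\right)^2 = \sum_{i=1}^{N}Y_i^2 + \sum_{j\neq k}Y_jY_k$; the following is a sketch of how this recovers the second form of $V(\hat{Y}_{HH})$ stated earlier.
$$V(\hat{Y}_{HH}) = \frac{1}{n}\left[\sum_{i=1}^{N}\frac{Y_i^2}{P_i} - Y^2\right] = \frac{1}{n}\left[\sum_{i=1}^{N}\frac{Y_i^2}{P_i}(1-P_i) - \sum_{j\neq k}Y_jY_k\right]$$
Using $1 - P_i = \sum_{j\neq i}P_j$,
$$\sum_{i=1}^{N}\frac{Y_i^2}{P_i}(1-P_i) = \sum_{i=1}^{N}\sum_{j\neq i}P_iP_j\left(\frac{Y_i}{P_i}\right)^2, \qquad \sum_{j\neq k}Y_jY_k = \sum_{i\neq j}P_iP_j\frac{Y_i}{P_i}\frac{Y_j}{P_j}$$
so
$$V(\hat{Y}_{HH}) = \frac{1}{n}\sum_{i\neq j}P_iP_j\left[\left(\frac{Y_i}{P_i}\right)^2 - \frac{Y_i}{P_i}\frac{Y_j}{P_j}\right] = \frac{1}{2n}\sum_{i\neq j}P_iP_j\left(\frac{Y_i}{P_i} - \frac{Y_j}{P_j}\right)^2 = \frac{1}{n}\sum_{i=1}^{N}\sum_{j>i}^{N}P_iP_j\left(\frac{Y_i}{P_i} - \frac{Y_j}{P_j}\right)^2$$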