
Sampling with Probability Proportional to Size: PPS

When to use this sampling?


When the sampling units vary considerably in size, simple random sampling (SRS) is not an appropriate procedure, since it does not take into account the possible importance of the size of the units.

Examples

Objective                                           Auxiliary information
To estimate the total number of unemployed          The number of households in each village
youth in a district
To estimate the total number of tube wells          - The number of tube wells in each village in a
in a certain district                                 previous period
                                                    - The net irrigated area of each village
To estimate the number of job openings in a         The number of employees in each firm of the city
city by sampling firms in that city

Types of PPS sampling


• with replacement (WR)
• without replacement (WoR)

Procedure of Selecting a Sample (WR)


• Cumulative Total Method
• Lahiri’s Method
Cumulative Total Method
Let the size of the i-th unit be denoted by $X_i$, the total size of the N population units being $X = \sum_{i=1}^{N} X_i$. The selection procedure then consists of the following steps.

Steps involved in the cumulative total method:

1. Write down the cumulative totals of the sizes $X_i$, $i = 1, 2, \dots, N$ (Table 1.1)
2. Choose a random number $r$ such that $1 \le r \le X$
3. Select the i-th population unit if $T_{i-1} < r \le T_i$, where $T_{i-1} = X_1 + X_2 + \dots + X_{i-1}$ and $T_i = T_{i-1} + X_i$
4. Repeat steps 2-3 until n units are selected

Cumulative Total Table

Sl #    Auxiliary variable size      Cumulative total                  Abbreviation
1       $X_1$                        $X_1$                             $T_1$
2       $X_2$                        $X_1 + X_2$                       $T_2$
3       $X_3$                        $X_1 + X_2 + X_3$                 $T_3$
...
N       $X_N$                        $X_1 + X_2 + \dots + X_N$         $T_N$
Total   $X = \sum_{i=1}^{N} X_i$
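As a sketch, the steps above can be implemented in Python; the size measures below are made-up illustration values:

```python
import random

def cumulative_total_sample(sizes, n, rng=random):
    """PPS with-replacement selection by the cumulative total method."""
    # Step 1: cumulative totals T_1, ..., T_N
    totals = []
    running = 0
    for x in sizes:
        running += x
        totals.append(running)
    X = running                         # total size X
    sample = []
    for _ in range(n):
        r = rng.randint(1, X)           # Step 2: random number 1 <= r <= X
        for i, t in enumerate(totals):  # Step 3: unit i with T_{i-1} < r <= T_i
            if r <= t:
                sample.append(i + 1)    # unit labels 1..N
                break
    return sample                       # Step 4: repeated until n units drawn

sizes = [30, 10, 50, 10]                # hypothetical sizes X_i
sample = cumulative_total_sample(sizes, n=3, rng=random.Random(2))
```

Here unit 3 (size 50) has selection probability 50/100 at every draw, so over repeated draws it appears most often.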

Lahiri’s Method or Rejection Method

1. Select a random number (say) $i$ from 1 to N ($1 \le i \le N$)

2. Select another random number (say) $j$ such that $1 \le j \le M$, where $M$ is either equal to the maximum of the sizes $\{X_i\}$, $i = 1, 2, \dots, N$, or is more than the maximum size in the population

3. If $j \le X_i$, the i-th unit is selected; otherwise, the pair (i, j) of random numbers is rejected

4. Repeat steps 1-3 until n units are selected
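Lahiri's method can be sketched similarly; no cumulative totals are needed, and each accepted unit is drawn with probability proportional to its size (the sizes are again made-up values):

```python
import random

def lahiri_sample(sizes, n, rng=random):
    """PPS with-replacement selection by Lahiri's (rejection) method."""
    N = len(sizes)
    M = max(sizes)                  # M >= maximum size, as the method requires
    sample = []
    while len(sample) < n:
        i = rng.randint(1, N)       # Step 1: candidate unit label
        j = rng.randint(1, M)       # Step 2: trial value 1 <= j <= M
        if j <= sizes[i - 1]:       # Step 3: accept i if j <= X_i
            sample.append(i)        # Step 4: repeat until n units accepted
    return sample

sizes = [30, 10, 50, 10]            # hypothetical sizes X_i
sample = lahiri_sample(sizes, n=3, rng=random.Random(5))
```

At each iteration, unit i is accepted with probability (1/N)(X_i/M), i.e. proportional to X_i, which is why the rejected pairs can simply be discarded.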


Estimation: Population Total
Let $(y_i, x_i)$, $i = 1, 2, \dots, n$ denote the values of the study variable $Y$ and the auxiliary variable $X$ for the n units selected in the sample. Let $P_i = X_i / X$, where $X = \sum_{i=1}^{N} X_i$, denote the known probability of selecting the i-th population unit in the sample.

Theorem #1 An unbiased estimator of the population total $Y = \sum_{i=1}^{N} Y_i$ is given by
$$\hat{Y}_{HH} = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{P_i}$$

Proof
$$E(\hat{Y}_{HH}) = E\left[\frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{P_i}\right] = \frac{1}{n} \sum_{i=1}^{n} E\left(\frac{y_i}{P_i}\right)$$
Now $\frac{y_i}{P_i}$ is a random variable that can take the values $\frac{Y_1}{P_1}, \frac{Y_2}{P_2}, \dots, \frac{Y_N}{P_N}$ with probabilities $P_1, P_2, \dots, P_N$ respectively. Hence
$$E\left(\frac{y_i}{P_i}\right) = \sum_{i=1}^{N} \left(\frac{Y_i}{P_i}\right) P_i = \sum_{i=1}^{N} Y_i = Y$$
so that
$$E(\hat{Y}_{HH}) = \frac{1}{n} \sum_{i=1}^{n} Y = Y$$

Alternatively,
$$E(\hat{Y}_{HH}) = E\left[\frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{P_i}\right] = E\left[\frac{1}{n} \sum_{i=1}^{N} r_i \left(\frac{Y_i}{P_i}\right)\right]$$
where $r_i$ is the number of times the i-th unit is selected in the sample ($0 \le r_i \le n$). Since the n draws are independent, $r_i$ is a binomial random variable, so $E(r_i) = nP_i$. Therefore
$$E(\hat{Y}_{HH}) = \frac{1}{n} \sum_{i=1}^{N} E(r_i) \left(\frac{Y_i}{P_i}\right) = \frac{1}{n} \sum_{i=1}^{N} n P_i \frac{Y_i}{P_i} = Y$$
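Theorem #1 can be verified exactly for a tiny population by enumerating every ordered with-replacement sample of size n = 2; the Y_i and P_i values below are made up for illustration:

```python
from itertools import product

Y = [12.0, 40.0, 8.0, 20.0]   # hypothetical population values Y_i
P = [0.2, 0.5, 0.1, 0.2]      # selection probabilities P_i (sum to 1)
n = 2

def hh_estimate(sample_ids):
    """Hansen-Hurwitz estimator: (1/n) * sum over the sample of y_i / P_i."""
    return sum(Y[i] / P[i] for i in sample_ids) / len(sample_ids)

# Sample (i, j) occurs with probability P_i * P_j; averaging the estimate
# over all ordered samples gives E(Y_hat_HH) exactly.
expected = sum(P[i] * P[j] * hh_estimate((i, j))
               for i, j in product(range(len(Y)), repeat=2))
total = sum(Y)   # expected equals total, confirming unbiasedness
```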
Theorem #2 The variance of the estimator $\hat{Y}_{HH}$ is given by
$$V(\hat{Y}_{HH}) = \frac{1}{n}\left[\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y^2\right] \quad \text{or} \quad V(\hat{Y}_{HH}) = \frac{1}{n} \sum_{i=1}^{N} \sum_{j>i}^{N} P_i P_j \left(\frac{Y_i}{P_i} - \frac{Y_j}{P_j}\right)^2$$

Proof
Method 1
$$V(\hat{Y}_{HH}) = V\left[\frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{p_i}\right] = \frac{1}{n^2} \sum_{i=1}^{n} V\left(\frac{y_i}{p_i}\right)$$
Now
$$V\left(\frac{y_i}{p_i}\right) = E\left[\left(\frac{y_i}{p_i}\right)^2\right] - \left\{E\left(\frac{y_i}{p_i}\right)\right\}^2 = \sum_{i=1}^{N} P_i \left(\frac{Y_i^2}{P_i^2}\right) - Y^2 = \sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y^2$$
Hence
$$V(\hat{Y}_{HH}) = \frac{1}{n}\left[\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y^2\right]$$

Method 2
$$V(\hat{Y}_{HH}) = V\left[\frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{p_i}\right] = V\left[\frac{1}{n} \sum_{i=1}^{N} r_i \frac{Y_i}{P_i}\right]$$
where $r_i$ is the number of times the i-th unit is selected in the sample, so that $(r_1, r_2, \dots, r_N)$ is multinomial with n draws and probabilities $(P_1, \dots, P_N)$. Then
$$V(\hat{Y}_{HH}) = \frac{1}{n^2} V\left(\sum_{i=1}^{N} r_i \frac{Y_i}{P_i}\right) = \frac{1}{n^2}\left[\sum_{i=1}^{N} \frac{Y_i^2}{P_i^2} V(r_i) + \sum_{i \ne j}^{N} \frac{Y_i Y_j}{P_i P_j} \mathrm{Cov}(r_i, r_j)\right]$$
with
$$V(r_i) = n P_i (1 - P_i), \qquad \mathrm{Cov}(r_i, r_j) = E(r_i r_j) - E(r_i) E(r_j) = -n P_i P_j$$
Substituting,
$$V(\hat{Y}_{HH}) = \frac{1}{n}\left[\sum_{i=1}^{N} \frac{Y_i^2}{P_i}(1 - P_i) - \sum_{i \ne j}^{N} Y_i Y_j\right] = \frac{1}{n}\left[\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y^2\right]$$

The above expression involves the unknown $Y_i$'s, so in practice it must itself be estimated from the sample; this yields the estimated standard error (estimated sampling error).

Theorem #3 An unbiased estimator of $V(\hat{Y}_{HH})$ is given by
$$v(\hat{Y}_{HH}) = \frac{1}{n(n-1)}\left[\sum_{i=1}^{n} \frac{y_i^2}{p_i^2} - n \hat{Y}_{HH}^2\right]$$
Proof We have
$$E[v(\hat{Y}_{HH})] = \frac{1}{n-1}\left[\frac{1}{n} E\left(\sum_{i=1}^{n} \frac{y_i^2}{p_i^2}\right) - E(\hat{Y}_{HH}^2)\right]$$
Since $E(y_i^2/p_i^2) = \sum_{i=1}^{N} Y_i^2/P_i$ for each draw, and $E(\hat{Y}_{HH}^2) = V(\hat{Y}_{HH}) + \{E(\hat{Y}_{HH})\}^2$,
$$E[v(\hat{Y}_{HH})] = \frac{1}{n-1}\left[\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - \frac{1}{n}\left(\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y^2\right) - Y^2\right]$$
$$= \frac{1}{n-1}\left(1 - \frac{1}{n}\right)\left[\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y^2\right] = \frac{1}{n}\left[\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y^2\right] = V(\hat{Y}_{HH})$$
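The same enumeration idea verifies Theorems #2 and #3 for a small population: the exact variance of the estimator matches the Theorem #2 formula, and the Theorem #3 estimator averages to that variance (made-up Y_i and P_i, as before):

```python
from itertools import product

Y = [12.0, 40.0, 8.0, 20.0]   # hypothetical population values
P = [0.2, 0.5, 0.1, 0.2]
n = 2
total = sum(Y)

# Theorem #2: V(Y_hat_HH) = (1/n) [ sum_i Y_i^2 / P_i - Y^2 ]
V_true = (sum(y * y / p for y, p in zip(Y, P)) - total ** 2) / n

def hh_estimate(s):
    return sum(Y[i] / P[i] for i in s) / n

def v_hat(s):
    """Theorem #3: v = (1/(n(n-1))) [ sum y_i^2/p_i^2 - n * Y_hat^2 ]."""
    yh = hh_estimate(s)
    return (sum((Y[i] / P[i]) ** 2 for i in s) - n * yh * yh) / (n * (n - 1))

pairs = list(product(range(len(Y)), repeat=2))   # all ordered WR samples
E_hh = sum(P[i] * P[j] * hh_estimate((i, j)) for i, j in pairs)
V_enum = sum(P[i] * P[j] * (hh_estimate((i, j)) - E_hh) ** 2 for i, j in pairs)
E_vhat = sum(P[i] * P[j] * v_hat((i, j)) for i, j in pairs)
# V_enum equals V_true, and E_vhat equals V_true (unbiasedness of v)
```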
Comparison with SRSWR

In SRSWR, the variance of $\hat{\bar{Y}}$, an unbiased estimator of $\bar{Y}$, is
$$V_{SRSWR}(\hat{\bar{Y}}) = \frac{N-1}{nN} S_y^2 = \frac{N-1}{nN} \cdot \frac{1}{N-1}\left[\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\right] = \frac{1}{nN}\left[\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\right]$$
For PPS (WR) sampling,
$$V_{pps}(\hat{\bar{Y}}) = \frac{1}{nN^2}\left[\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - N^2\bar{Y}^2\right]$$
Hence
$$V_{SRSWR}(\hat{\bar{Y}}) - V_{pps}(\hat{\bar{Y}}) = \frac{1}{nN^2}\left[N \sum_{i=1}^{N} Y_i^2 - \sum_{i=1}^{N} \frac{Y_i^2}{P_i}\right] = \frac{1}{nN^2}\left[\sum_{i=1}^{N} Y_i^2 \left(N - \frac{1}{P_i}\right)\right] \quad (1)$$

Hence the gain in efficiency due to sampling with varying probability is
$$G = \frac{\dfrac{1}{nN^2}\left[\sum_{i=1}^{N} Y_i^2 \left(N - \dfrac{1}{P_i}\right)\right]}{\dfrac{1}{nN^2}\left[\sum_{i=1}^{N} \dfrac{Y_i^2}{P_i} - N^2\bar{Y}^2\right]}$$

Now (1) can be estimated by $\frac{1}{n^2 N^2}\left[\sum_{i=1}^{n} \frac{y_i^2}{P_i}\left(N - \frac{1}{P_i}\right)\right]$ from the sample drawn with varying probability and with replacement, since
$$E\left\{\frac{1}{n^2 N^2}\left[\sum_{i=1}^{n} \frac{y_i^2}{P_i}\left(N - \frac{1}{P_i}\right)\right]\right\} = E\left\{\frac{1}{n^2 N^2}\left[\sum_{i=1}^{N} \frac{Y_i^2}{P_i}\left(N - \frac{1}{P_i}\right) \alpha_i\right]\right\} = \frac{1}{nN^2}\left[\sum_{i=1}^{N} Y_i^2\left(N - \frac{1}{P_i}\right)\right]$$
where $\alpha_i$ is the number of times the i-th unit appears in the sample, so that $E(\alpha_i) = nP_i$.

Therefore, the estimate of the gain is
$$\hat{G} = \frac{\dfrac{1}{n^2 N^2}\left[\sum_{i=1}^{n} \dfrac{y_i^2}{P_i}\left(N - \dfrac{1}{P_i}\right)\right]}{\dfrac{1}{n(n-1)}\sum_{i=1}^{n}\left(\dfrac{y_i}{NP_i} - \hat{\bar{Y}}_{HH}\right)^2}$$
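For a concrete feel, the two variance formulas and the difference (1) can be evaluated for a small hypothetical population whose Y_i are roughly proportional to the P_i:

```python
Y = [12.0, 40.0, 8.0, 20.0]    # hypothetical population values
P = [0.2, 0.5, 0.1, 0.2]       # P_i = X_i / X
N, n = len(Y), 2
Ybar = sum(Y) / N

# V_SRSWR = (1/(nN)) [ sum Y_i^2 - N * Ybar^2 ]
V_srswr = (sum(y * y for y in Y) - N * Ybar ** 2) / (n * N)

# V_pps = (1/(n N^2)) [ sum Y_i^2 / P_i - N^2 * Ybar^2 ]
V_pps = (sum(y * y / p for y, p in zip(Y, P)) - (N * Ybar) ** 2) / (n * N ** 2)

# Difference formula (1): (1/(n N^2)) sum Y_i^2 (N - 1/P_i)
diff = sum(y * y * (N - 1 / p) for y, p in zip(Y, P)) / (n * N ** 2)

gain = (V_srswr - V_pps) / V_pps   # gain in efficiency G
```

Here diff reproduces V_srswr - V_pps exactly, and the gain is positive because Y_i and P_i are well correlated.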

Procedure of Selecting a Sample (WoR)

- Draw-by-draw method (a sample of size n can be selected in n steps)
- Cumulative Total Method / Lahiri’s Method (Rejection Method)
- Sen-Midzuno procedure
- Random grouping
- Rao-Hartley-Cochran procedure

Theorem #4 If $\pi_i > 0$ $(i = 1, 2, \dots, N)$, then $\hat{Y}_{HT} = \sum_{i=1}^{n} \frac{y_i}{\pi_i}$ is an unbiased estimator of $Y$, with variance
$$V(\hat{Y}_{HT}) = \sum_{i=1}^{N} \frac{(1-\pi_i)}{\pi_i} y_i^2 + 2 \sum_{i=1}^{N} \sum_{j>i}^{N} y_i y_j \frac{(\pi_{ij} - \pi_i \pi_j)}{\pi_i \pi_j}$$

Proof
Let $t_i$ $(i = 1, 2, \dots, N)$ be a random variable that takes the value 1 if the i-th unit is drawn and 0 otherwise. Then $t_i$ follows a binomial distribution with sample size 1 and probability $\pi_i$, so
$$E(t_i) = \pi_i, \qquad V(t_i) = \pi_i(1 - \pi_i), \qquad \mathrm{Cov}(t_i, t_j) = E(t_i t_j) - E(t_i)E(t_j) = \pi_{ij} - \pi_i \pi_j$$
$$E(\hat{Y}_{HT}) = E\left(\sum_{i=1}^{N} t_i \frac{y_i}{\pi_i}\right) = \sum_{i=1}^{N} y_i = Y$$
$$V(\hat{Y}_{HT}) = V\left(\sum_{i=1}^{N} t_i \frac{y_i}{\pi_i}\right) = \sum_{i=1}^{N} \left(\frac{y_i}{\pi_i}\right)^2 V(t_i) + 2 \sum_{i=1}^{N} \sum_{j>i}^{N} \frac{y_i}{\pi_i} \frac{y_j}{\pi_j} \mathrm{Cov}(t_i, t_j)$$
$$= \sum_{i=1}^{N} \frac{(1-\pi_i)}{\pi_i} y_i^2 + 2 \sum_{i=1}^{N} \sum_{j>i}^{N} y_i y_j \frac{(\pi_{ij} - \pi_i \pi_j)}{\pi_i \pi_j}$$

Alternative Expression
$$\sum_{i=1}^{N} \frac{(1-\pi_i)}{\pi_i} y_i^2 = \sum_{i=1}^{N} \frac{\pi_i(1-\pi_i)}{\pi_i^2} y_i^2$$
Now
$$\sum_{j \ne i}^{N} (\pi_{ij} - \pi_i \pi_j) = \sum_{j \ne i}^{N} \pi_{ij} - \pi_i \sum_{j \ne i}^{N} \pi_j = (n-1)\pi_i - \pi_i(n - \pi_i) = -\pi_i(1 - \pi_i)$$
so the first term can be rewritten as
$$\sum_{i=1}^{N} \frac{(1-\pi_i)}{\pi_i} y_i^2 = \sum_{i=1}^{N} \sum_{j \ne i}^{N} (\pi_i \pi_j - \pi_{ij}) \left(\frac{y_i}{\pi_i}\right)^2 = \sum_{i=1}^{N} \sum_{j>i}^{N} (\pi_i \pi_j - \pi_{ij}) \left[\left(\frac{y_i}{\pi_i}\right)^2 + \left(\frac{y_j}{\pi_j}\right)^2\right]$$
Hence
$$V(\hat{Y}_{HT})_{SYG} = \sum_{i=1}^{N} \sum_{j>i}^{N} (\pi_i \pi_j - \pi_{ij}) \left[\left(\frac{y_i}{\pi_i}\right)^2 + \left(\frac{y_j}{\pi_j}\right)^2 - 2 \frac{y_i}{\pi_i} \frac{y_j}{\pi_j}\right] = \sum_{i=1}^{N} \sum_{j>i}^{N} (\pi_i \pi_j - \pi_{ij}) \left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2$$
An unbiased estimator of $V(\hat{Y}_{HT})$ is
$$v_1(\hat{Y}_{HT}) = \sum_{i=1}^{n} \frac{(1-\pi_i)}{\pi_i^2} y_i^2 + 2 \sum_{i=1}^{n} \sum_{j>i}^{n} \frac{(\pi_{ij} - \pi_i \pi_j)}{\pi_i \pi_j \pi_{ij}} y_i y_j$$

Proof
The estimator of the first part of $V(\hat{Y}_{HT})$ is taken to be $\sum_{i=1}^{n} \frac{(1-\pi_i)}{\pi_i} y_i^2 \alpha_i$ (a linear estimator). Choose $\alpha_i$ such that
$$E\left[\sum_{i=1}^{n} \frac{(1-\pi_i)}{\pi_i} y_i^2 \alpha_i\right] = \sum_{i=1}^{N} \frac{(1-\pi_i)}{\pi_i} y_i^2$$
Writing the sample sum in terms of the indicators $t_i$,
$$E\left[\sum_{i=1}^{n} \frac{(1-\pi_i)}{\pi_i} y_i^2 \alpha_i\right] = E\left[\sum_{i=1}^{N} \frac{(1-\pi_i)}{\pi_i} y_i^2 \alpha_i t_i\right] = \sum_{i=1}^{N} \frac{(1-\pi_i)}{\pi_i} y_i^2 \alpha_i \pi_i$$
Equating the two sides gives
$$\alpha_i \pi_i = 1 \implies \alpha_i = 1/\pi_i$$
Similarly, the estimator of the second part of $V(\hat{Y}_{HT})$ is taken to be $\sum_{i=1}^{n} \sum_{j>i}^{n} \frac{(\pi_{ij} - \pi_i \pi_j)}{\pi_i \pi_j} y_i y_j \alpha_{ij}$, with $\alpha_{ij}$ chosen so that
$$E\left[\sum_{i=1}^{n} \sum_{j>i}^{n} \frac{(\pi_{ij} - \pi_i \pi_j)}{\pi_i \pi_j} y_i y_j \alpha_{ij}\right] = \sum_{i=1}^{N} \sum_{j>i}^{N} \frac{(\pi_{ij} - \pi_i \pi_j)}{\pi_i \pi_j} y_i y_j$$
With $t_{ij} = t_i t_j$ (so that $E(t_{ij}) = \pi_{ij}$),
$$E\left[\sum_{i=1}^{n} \sum_{j>i}^{n} \frac{(\pi_{ij} - \pi_i \pi_j)}{\pi_i \pi_j} y_i y_j \alpha_{ij}\right] = E\left[\sum_{i=1}^{N} \sum_{j>i}^{N} \frac{(\pi_{ij} - \pi_i \pi_j)}{\pi_i \pi_j} y_i y_j \alpha_{ij} t_{ij}\right] = \sum_{i=1}^{N} \sum_{j>i}^{N} \frac{(\pi_{ij} - \pi_i \pi_j)}{\pi_i \pi_j} y_i y_j \alpha_{ij} \pi_{ij}$$
Equating the two sides gives
$$\alpha_{ij} \pi_{ij} = 1 \implies \alpha_{ij} = 1/\pi_{ij}$$

Corollary 2 An unbiased estimator of $V(\hat{Y}_{HT})_{SYG}$ is (SYG: Sen-Yates-Grundy)
$$v_2(\hat{Y}_{HT}) = \sum_{i=1}^{n} \sum_{j>i}^{n} \frac{(\pi_i \pi_j - \pi_{ij})}{\pi_{ij}} \left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2 = \frac{1}{2} \sum_{j \ne i}^{n} \frac{(\pi_i \pi_j - \pi_{ij})}{\pi_{ij}} \left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2$$
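For SRSWOR with n = 2 from N = 4, every unit has π_i = n/N = 1/2 and every pair has π_ij = n(n-1)/(N(N-1)) = 1/6, so Theorem #4 and the SYG estimator v₂ can be checked by enumerating all six samples (the Y_i are made-up values):

```python
from itertools import combinations

Y = [12.0, 40.0, 8.0, 20.0]          # hypothetical population values
N, n = 4, 2
pi = n / N                            # pi_i for SRSWOR
pij = n * (n - 1) / (N * (N - 1))     # pi_ij for SRSWOR

def ht(s):
    """Horvitz-Thompson estimator: sum over the sample of y_i / pi_i."""
    return sum(Y[i] / pi for i in s)

def v2(s):
    """Sen-Yates-Grundy variance estimator for an n = 2 sample (i, j)."""
    i, j = s
    return (pi * pi - pij) / pij * (Y[i] / pi - Y[j] / pi) ** 2

samples = list(combinations(range(N), n))   # all 6 equally likely samples
p_s = 1 / len(samples)
E_ht = sum(p_s * ht(s) for s in samples)    # equals Y = sum(Y)
V_ht = sum(p_s * (ht(s) - E_ht) ** 2 for s in samples)
E_v2 = sum(p_s * v2(s) for s in samples)    # equals V_ht (unbiasedness)
```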

Inclusion Probabilities
Suppose a population consists of 4 units, $U = \{1, 2, 3, 4\}$, and we draw a random sample of size $n = 2$. The number of samples (WOR) is $\binom{4}{2} = 6$. The possible simple random samples are
$$S_1 = \{1,2\}, \quad S_2 = \{1,3\}, \quad S_3 = \{1,4\}, \quad S_4 = \{2,3\}, \quad S_5 = \{2,4\}, \quad S_6 = \{3,4\}$$
The selection probability of each of the 6 samples is $p(s) = \frac{1}{6}$.

$\pi_i$ = probability that the i-th unit is included in the sample $= \sum_{s \ni i} p(s)$, where $s \ni i$ runs over all samples that include element $i$.
In this example,
$$\pi_1 = \sum_{s \ni 1} p(s) = p(s_1) + p(s_2) + p(s_3) = \frac{1}{2}$$
$\pi_{ij}$ = probability that the i-th and j-th units are both included in the sample $= \sum p(s)$ over all samples containing the i-th and j-th units.
We require $\pi_i > 0$ and $\pi_{ij} > 0$.
Properties
$$\sum_{i=1}^{N} \pi_i = n, \qquad \sum_{j \ne i}^{N} \pi_{ij} = (n-1)\pi_i, \qquad \sum_{i=1}^{N} \sum_{j>i}^{N} \pi_{ij} = \frac{1}{2} n(n-1)$$
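These definitions and properties can be verified directly for the 4-unit example by summing p(s) over the six samples:

```python
from itertools import combinations

N, n = 4, 2
samples = list(combinations(range(1, N + 1), n))   # the 6 SRSWOR samples
p_s = 1 / len(samples)                             # p(s) = 1/6 each

# pi_i: sum of p(s) over samples s containing unit i
pi = {i: sum(p_s for s in samples if i in s) for i in range(1, N + 1)}

# pi_ij: sum of p(s) over samples s containing both units i and j
pij = {(i, j): sum(p_s for s in samples if i in s and j in s)
       for i, j in combinations(range(1, N + 1), 2)}
```

Each unit lies in 3 of the 6 samples, so pi[i] = 1/2, and each pair lies in exactly one sample, so pij[(i, j)] = 1/6.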

Inclusion probabilities for the Horvitz-Thompson estimator (n = 2)

Note the inclusion probabilities if we use PPSWOR (CTM or RM) sampling.
$\pi_i$ = total probability that the i-th unit is selected at either the first or the second draw
$$= P_i + \sum_{j \ne i}^{N} P_j P_{i|j}, \qquad \text{where } P_{i|j} = \frac{P_i}{1 - P_j}$$
$$= P_i + \sum_{j \ne i}^{N} P_i \frac{P_j}{1 - P_j} = P_i \left[1 + \sum_{j \ne i}^{N} \frac{P_j}{1 - P_j}\right] = P_i \left[1 + \sum_{j=1}^{N} \frac{P_j}{1 - P_j} - \frac{P_i}{1 - P_i}\right]$$

$\pi_{ij}$ = total probability that the i-th and j-th units are both selected in a sample of size 2
$$= P_i P_{j|i} + P_j P_{i|j} = \frac{P_i P_j}{1 - P_i} + \frac{P_j P_i}{1 - P_j} = P_i P_j \left[\frac{1}{1 - P_i} + \frac{1}{1 - P_j}\right]$$
Lahiri-Midzuno-Sen Sampling (WOR)
Step # 1 The unit at the first draw is selected with unequal probability ($P_i$)
Step # 2 The remaining units are selected with SRSWOR in all subsequent draws

$\pi_i$ = total probability that the i-th unit is selected at either the first draw or one of the subsequent $(n-1)$ draws
= P(i-th unit is selected at the first draw) + P(i-th unit is not selected at the first draw but is selected in one of the subsequent $n-1$ draws)
$$= P_i + (1 - P_i) \frac{n-1}{N-1} = \frac{N-n}{N-1} P_i + \frac{n-1}{N-1}$$

$\pi_{ij}$ = total probability that the i-th and j-th units are both selected in the sample
= P(i-th unit selected at the 1st draw and j-th unit selected in one of the subsequent $n-1$ draws)
+ P(j-th unit selected at the 1st draw and i-th unit selected in one of the subsequent $n-1$ draws)
+ P(neither the i-th nor the j-th unit selected at the 1st draw, but both selected in the subsequent $n-1$ draws)
$$= P_i \frac{n-1}{N-1} + P_j \frac{n-1}{N-1} + (1 - P_i - P_j) \frac{n-1}{N-1} \cdot \frac{n-2}{N-2}$$
$$= \frac{n-1}{N-1} \left[\frac{N-n}{N-2} (P_i + P_j) + \frac{n-2}{N-2}\right]$$
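The closed-form π_i and π_ij above can be checked by enumerating the Lahiri-Midzuno-Sen design for N = 4, n = 2 (so the single subsequent draw is uniform over the remaining units; the P_i are hypothetical):

```python
from itertools import combinations

P = [0.2, 0.5, 0.1, 0.2]      # hypothetical first-draw probabilities P_i
N, n = 4, 2

# Enumerate the design: unit k at the first draw with probability P_k,
# then one of the remaining N-1 units uniformly.
pi = [0.0] * N
pij = {}
for k in range(N):
    for m in range(N):
        if m == k:
            continue
        prob = P[k] / (N - 1)              # P(first = k, second = m)
        pi[k] += prob
        pi[m] += prob
        key = (min(k, m), max(k, m))
        pij[key] = pij.get(key, 0.0) + prob

# Closed forms derived above
pi_formula = [(N - n) / (N - 1) * p + (n - 1) / (N - 1) for p in P]
pij_formula = {(i, j): (n - 1) / (N - 1) *
               ((N - n) / (N - 2) * (P[i] + P[j]) + (n - 2) / (N - 2))
               for i, j in combinations(range(N), 2)}
```

With n = 2 the third term of π_ij vanishes, and the enumeration reproduces π_i = (2P_i + 1)/3 and π_ij = (P_i + P_j)/3 exactly.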
Additional Ideas
Another Expression for $V(\hat{Y})$ for the Hansen-Hurwitz (HH) Estimator

Lemma #1 If $a_1, a_2, \dots, a_r, b_1, b_2, \dots, b_r$ are any numbers, then
$$\sum_{i=1}^{r} a_i b_i = \left(\sum_{j=1}^{r} a_j\right)\left(\sum_{k=1}^{r} b_k\right) - \sum_{j \ne k} a_j b_k$$
Proof One observes that
$$\left(\sum_{j=1}^{r} a_j\right)\left(\sum_{k=1}^{r} b_k\right) = \sum_{j=1}^{r} \sum_{k=1}^{r} a_j b_k = \sum_{i=1}^{r} a_i b_i + \sum_{j \ne k} a_j b_k$$

Lemma #2 If $A_1, A_2, \dots, A_r, P_1, P_2, \dots, P_r$ are real numbers with $P_i > 0$ for $1 \le i \le r$, and if $A = \sum_{i=1}^{r} A_i$ and $\sum_{i=1}^{r} P_i = 1$, then
$$\sum_{i=1}^{r} P_i \left(\frac{A_i}{P_i} - A\right)^2 = \sum_{i<j} \left(\frac{A_i}{P_i} - \frac{A_j}{P_j}\right)^2 P_i P_j$$

On the other hand


Lemma #1
