Lecture 05


STA732

Statistical Inference
Lecture 05: Rao-Blackwell Theorem

Yuansi Chen
Spring 2023
Duke University

https://www2.stat.duke.edu/courses/Spring23/sta732.01/

Recap from Lecture 04

• 𝑉 is ancillary if its distribution does not depend on 𝜃


• Completeness + sufficiency is the ideal notion of optimal data
compression. To prove completeness, one usually argues from the
definition or identifies a (full-rank) exponential family.
• Basu’s theorem is useful for proving independence between a
complete sufficient statistic and an ancillary statistic.

Goal of Lecture 05

1. Convex loss
2. Rao-Blackwell Theorem
3. Uniformly minimum variance unbiased estimator (UMVU)

Chap. 3.6, 4.1-4.2 in Keener or Chap. 1.7, 2.1 in Lehmann and Casella

We are entering the first approach to arguing for “the best”
estimator in point estimation: restricting to a smaller class of
estimators!

Convex loss
Definition. Convex set

A set 𝒞 ⊆ ℝᵖ is convex if for any two points 𝑥, 𝑦 ∈ 𝒞 and any
𝜆 ∈ [0, 1], we have

𝜆𝑥 + (1 − 𝜆)𝑦 ∈ 𝒞

Definition. Convex function

A real-valued function 𝑓 defined on a convex set 𝒞 ⊆ ℝᵖ is a convex
function if for any two points 𝑥, 𝑦 ∈ 𝒞 and any 𝜆 ∈ [0, 1], we have

𝑓(𝜆𝑥 + (1 − 𝜆)𝑦) ≤ 𝜆𝑓(𝑥) + (1 − 𝜆)𝑓(𝑦).

It is called strictly convex if the above inequality holds strictly for
𝑥 ≠ 𝑦 and 𝜆 ∈ (0, 1).

Jensen’s inequality in finite form


For a convex function 𝑓, points 𝑥₁, …, 𝑥ₙ in its domain, and positive
weights 𝛼ᵢ with ∑ᵢ₌₁ⁿ 𝛼ᵢ = 1, we have

𝑓(∑ᵢ₌₁ⁿ 𝛼ᵢ 𝑥ᵢ) ≤ ∑ᵢ₌₁ⁿ 𝛼ᵢ 𝑓(𝑥ᵢ)

proof by induction, omitted

Jensen’s inequality in a probabilistic setting


𝑋 is an integrable real-valued random variable, 𝑓 is convex. Then

𝑓(𝔼[𝑋]) ≤ 𝔼[𝑓(𝑋)]

If 𝑓 is strictly convex, the inequality holds strictly unless 𝑋 is
almost surely constant.

proof see Thm 3.25, remark 3.26 in Keener or Wikipedia

Examples of convex functions

• 𝑥 ↦ 1/𝑥 is strictly convex on (0, ∞). Then for 𝑋 > 0, we have

1/𝔼[𝑋] ≤ 𝔼[1/𝑋]

• 𝑥 ↦ − log(𝑥) is strictly convex on (0, ∞). Then for 𝑋 > 0, we have

log(𝔼[𝑋]) ≥ 𝔼[log(𝑋)].
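Both inequalities are easy to check by Monte Carlo; below is a minimal
numpy sketch (the lognormal law for 𝑋 is an arbitrary assumption, not
from the slides; any positive 𝑋 works).

```python
import numpy as np

# Monte Carlo check of Jensen's inequality for two strictly convex functions.
rng = np.random.default_rng(0)
X = rng.lognormal(mean=0.0, sigma=1.0, size=1_000_000)  # X > 0 almost surely

print(1 / X.mean(), "<=", (1 / X).mean())        # 1/E[X] <= E[1/X]
print(np.log(X.mean()), ">=", np.log(X).mean())  # log(E[X]) >= E[log X]
```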

Convex loss penalizes extra noise to an estimator

Proposition
Suppose the loss 𝐿(𝜃, 𝑑) is convex in 𝑑. Let 𝛿(𝑋) be an estimate of
𝜃. Define 𝛿̃(𝑋) = 𝛿(𝑋) + 𝜖, where 𝜖 is a zero-mean random variable
independent of 𝑋. Then

𝑅(𝜃, 𝛿̃) ≥ 𝑅(𝜃, 𝛿),

where the risk 𝑅(𝜃, 𝛿) = 𝔼𝜃 [𝐿(𝜃, 𝛿(𝑋))].

Proof idea: condition on 𝑋 (tower property), then apply Jensen’s
inequality to the conditional expectation over 𝜖.
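A quick simulation of the proposition under squared error loss (a
sketch with assumed ingredients: 𝛿 = sample mean of Gaussian data and
Gaussian noise 𝜖, none of which come from the slides).

```python
import numpy as np

# Adding independent zero-mean noise to an estimator cannot decrease a convex risk.
rng = np.random.default_rng(0)
theta, n, reps = 1.0, 25, 200_000
X = rng.normal(theta, 1.0, size=(reps, n))
delta = X.mean(axis=1)                                 # estimator of theta
delta_tilde = delta + rng.normal(0.0, 0.5, size=reps)  # delta plus extra noise eps

print("risk of delta:      ", np.mean((delta - theta) ** 2))        # ~ 1/n = 0.04
print("risk of delta tilde:", np.mean((delta_tilde - theta) ** 2))  # ~ 1/n + 0.25
```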

Rao-Blackwell Theorem

Thm 3.28 in Keener

Let 𝑇 be a sufficient statistic for P = {𝑃𝜃 ∶ 𝜃 ∈ Ω} and let 𝛿 be an
estimator of 𝑔(𝜃). Define 𝜂(𝑇) = 𝔼[𝛿(𝑋) ∣ 𝑇], which does not depend
on 𝜃 by sufficiency, so 𝜂 is a genuine estimator. If 𝐿(𝜃, ⋅) is convex,
then

𝑅(𝜃, 𝜂) ≤ 𝑅(𝜃, 𝛿),

where the risk 𝑅(𝜃, 𝛿) = 𝔼𝜃 [𝐿(𝜃, 𝛿(𝑋))].

Furthermore, if 𝐿(𝜃, ⋅) is strictly convex, the inequality is strict
unless 𝛿(𝑋) = 𝜂(𝑇) almost surely.

Interpretation

For convex loss functions,

1. If an estimator is not a function of a sufficient statistic 𝑇
alone, we can improve it (at least weakly).
2. The step of constructing 𝜂(𝑇) = 𝔼[𝛿(𝑋) ∣ 𝑇] from 𝛿 is called
Rao-Blackwellization.
3. When discussing optimal estimators, the only estimators of
𝑔(𝜃) worth considering are functions of a sufficient statistic 𝑇.

Proof of Rao-Blackwell Theorem

See Keener Thm 3.28: condition on 𝑇 (tower property) and apply
Jensen’s inequality to 𝐿(𝜃, ⋅).
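A classic illustration of Rao-Blackwellization (my own Bernoulli
example, not from the slides): start from the unbiased but wasteful
𝛿(𝑋) = 𝑋₁ and condition on 𝑇 = ∑ᵢ𝑋ᵢ, which gives 𝜂(𝑇) = 𝑇/𝑛.

```python
import numpy as np

# Rao-Blackwellization for X_1,...,X_n iid Bernoulli(theta):
# delta(X) = X_1 is unbiased; T = sum(X_i) is sufficient and
# eta(T) = E[X_1 | T] = T/n, the sample mean.
rng = np.random.default_rng(0)
theta, n, reps = 0.3, 20, 200_000
X = rng.binomial(1, theta, size=(reps, n))
delta = X[:, 0]       # crude unbiased estimator
eta = X.mean(axis=1)  # its Rao-Blackwellization

print("risk of delta:", np.mean((delta - theta) ** 2))  # ~ theta(1-theta)
print("risk of eta:  ", np.mean((eta - theta) ** 2))    # ~ theta(1-theta)/n
```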

UMVU
Bias

• The bias of an estimate 𝛿(𝑋) is 𝔼𝜃 [𝛿(𝑋) − 𝑔(𝜃)]


• We say an estimator 𝛿 is unbiased for 𝑔(𝜃) if

𝔼𝜃 [𝛿(𝑋)] = 𝑔(𝜃), ∀𝜃 ∈ Ω.

Ex: what is an unbiased estimator of 𝜃 for 𝑋 drawn from a uniform
distribution on (0, 𝜃)?
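One natural answer is 2𝑋, since 𝔼𝜃[𝑋] = 𝜃/2; a minimal simulation
check (my sketch, not part of the slides):

```python
import numpy as np

# For a single X ~ Unif(0, theta), E[X] = theta/2, so 2X is unbiased for theta.
rng = np.random.default_rng(0)
theta = 3.0
X = rng.uniform(0, theta, size=1_000_000)
print("mean of 2X:", (2 * X).mean())  # ~ theta = 3.0
```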

Bias-variance decomposition under squared error loss

Squared error loss:

𝐿(𝜃, 𝑑) = (𝑑 − 𝑔(𝜃))²

Risk decomposition under squared error loss


Risk becomes the mean squared error 𝑅(𝜃, 𝛿) = 𝔼𝜃[(𝛿(𝑋) − 𝑔(𝜃))²]:

𝔼𝜃[(𝛿(𝑋) − 𝑔(𝜃))²]
= 𝔼𝜃[(𝛿(𝑋) − 𝔼𝜃[𝛿] + 𝔼𝜃[𝛿] − 𝑔(𝜃))²]
= 𝔼𝜃[(𝛿(𝑋) − 𝔼𝜃[𝛿])²] + (𝔼𝜃[𝛿] − 𝑔(𝜃))² + 2𝔼𝜃[(𝛿 − 𝔼𝜃[𝛿])(𝔼𝜃[𝛿] − 𝑔(𝜃))]
= Var𝜃(𝛿) + Bias(𝛿)² + 0,

where the cross term vanishes because 𝔼𝜃[𝛿] − 𝑔(𝜃) is constant and
𝔼𝜃[𝛿 − 𝔼𝜃[𝛿]] = 0.
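A numeric check of the decomposition (a sketch; the shrunk sample mean
below is an arbitrary biased estimator chosen for illustration):

```python
import numpy as np

# Check MSE = Var + Bias^2 for a deliberately biased estimator of theta.
rng = np.random.default_rng(0)
theta, n, reps = 2.0, 10, 500_000
delta = 0.9 * rng.normal(theta, 1.0, size=(reps, n)).mean(axis=1)  # shrunk mean

mse = np.mean((delta - theta) ** 2)
var = delta.var()
bias_sq = (delta.mean() - theta) ** 2
print(mse, "~=", var + bias_sq)  # the two sides agree up to Monte Carlo error
```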

UMVU

Logic: according to the bias-variance decomposition under squared
error loss, if we restrict to unbiased estimators, comparing variance
is equivalent to comparing risk.

Def. UMVU
An unbiased estimator 𝛿 is uniformly minimum variance unbiased
(UMVU) if

Var𝜃(𝛿) ≤ Var𝜃(𝛿̃), ∀𝜃 ∈ Ω,

for any competing unbiased estimator 𝛿̃.

Does UMVU always exist?
No! Even unbiased estimators might not exist.

Ex: estimate 1/𝜃² for 𝑋 drawn from Uniform(0, 𝜃)

Def. U-estimable
We say 𝑔(𝜃) is U-estimable if there exists 𝛿 such that
𝔼𝜃[𝛿] = 𝑔(𝜃), ∀𝜃 ∈ Ω.

Does a UMVU estimator exist under the U-estimable assumption?
UMVU under U-estimable and given complete sufficient statistics

Theorem 4.4 in Keener, Lehmann-Scheffé

Suppose 𝑇(𝑋) is complete sufficient for P = {𝑃𝜃 ∶ 𝜃 ∈ Ω}. For
any U-estimable 𝑔(𝜃), there is a unique (up to almost-sure equality)
UMVU estimator, and it is a function of 𝑇.

Proof of Thm 4.4

• Existence: Rao-Blackwellize any unbiased estimator 𝛿 to get
𝜂(𝑇) = 𝔼[𝛿(𝑋) ∣ 𝑇], which is still unbiased by the tower property
• Uniqueness: if 𝜂₁(𝑇) and 𝜂₂(𝑇) are both unbiased, then
𝔼𝜃[𝜂₁(𝑇) − 𝜂₂(𝑇)] = 0 for all 𝜃, and completeness forces
𝜂₁(𝑇) = 𝜂₂(𝑇) a.s.
• UMVU: any competing unbiased estimator Rao-Blackwellizes to an
unbiased function of 𝑇, which equals 𝜂(𝑇) a.s., and Rao-Blackwell
says conditioning cannot increase variance

Extension to convex loss

Extension of Thm 4.4 to convex loss

Suppose 𝑇(𝑋) is complete sufficient for P = {𝑃𝜃 ∶ 𝜃 ∈ Ω}.
Under a strictly convex loss, among all unbiased estimators there
is a unique (up to almost-sure equality) uniformly minimum risk
unbiased estimator, and it is a function of 𝑇.
Strategies for finding UMVU estimators

Two strategies for finding UMVU estimators:


• Directly find an unbiased estimator that is a function of a
complete sufficient statistic 𝑇
• Find any unbiased estimator, then Rao-Blackwellize it by
conditioning on 𝑇.

Example 1

𝑋₁, …, 𝑋ₙ i.i.d. ∼ Poisson(𝜃), 𝜃 > 0.

• Find a UMVU estimator for 𝜃
• Find a UMVU estimator for 𝜃²
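A simulation sketch of both answers (my own check, using the standard
facts that 𝑋̄ = 𝑇/𝑛 is UMVU for 𝜃 and 𝑇(𝑇 − 1)/𝑛² is UMVU for 𝜃²,
where 𝑇 = ∑ᵢ𝑋ᵢ):

```python
import numpy as np

# T = sum(X_i) ~ Poisson(n*theta) is complete sufficient.
# T/n is unbiased for theta; E[T(T-1)] = (n*theta)^2, so T(T-1)/n^2 is
# unbiased for theta^2. Both are functions of T, hence UMVU.
rng = np.random.default_rng(0)
theta, n, reps = 2.0, 10, 500_000
T = rng.poisson(theta * n, size=reps)  # same law as the sum of n Poisson(theta) draws

print("mean of T/n:       ", T.mean() / n)                 # ~ theta
print("mean of T(T-1)/n^2:", (T * (T - 1)).mean() / n**2)  # ~ theta^2
```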

Example 2

𝑋₁, …, 𝑋ₙ i.i.d. ∼ Unif(0, 𝜃), 𝜃 > 0.

• Find a UMVU estimator for 𝜃 in two ways
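A simulation sketch (my own check): 𝑇 = max𝑋ᵢ is complete sufficient
with 𝔼𝜃[𝑇] = 𝑛𝜃/(𝑛 + 1), so (1 + 1/𝑛) max𝑋ᵢ is unbiased and hence
UMVU; Rao-Blackwellizing the unbiased 2𝑋̄ leads to the same estimator.

```python
import numpy as np

# Compare the UMVU (1 + 1/n) * max(X_i) with the unbiased but wasteful 2*Xbar.
rng = np.random.default_rng(0)
theta, n, reps = 5.0, 10, 500_000
X = rng.uniform(0, theta, size=(reps, n))
umvu = (1 + 1 / n) * X.max(axis=1)
naive = 2 * X.mean(axis=1)  # unbiased, but not a function of max(X_i)

print("means:    ", umvu.mean(), naive.mean())  # both ~ theta
print("variances:", umvu.var(), naive.var())    # UMVU variance is far smaller
```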

In Example 2, is the UMVU estimator also a “good” (admissible)
estimator in terms of its overall risk?
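A hint via simulation (my sketch, not the course’s answer): under
squared error the risk of 𝑐 ⋅ max𝑋ᵢ scales with 𝜃², so comparing at
one 𝜃 compares at all 𝜃, and the shrunk choice 𝑐 = (𝑛 + 2)/(𝑛 + 1)
dominates the unbiased choice 𝑐 = (𝑛 + 1)/𝑛.

```python
import numpy as np

# MSE of c * max(X_i) is proportional to theta^2, so one theta suffices
# for a uniform comparison across the parameter space.
rng = np.random.default_rng(0)
theta, n, reps = 5.0, 10, 500_000
mx = rng.uniform(0, theta, size=(reps, n)).max(axis=1)

for name, c in [("UMVU   c=(n+1)/n    ", (n + 1) / n),
                ("shrunk c=(n+2)/(n+1)", (n + 2) / (n + 1))]:
    print(name, "MSE:", np.mean((c * mx - theta) ** 2))
```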

Summary

• Jensen’s inequality for convex functions. Convex loss allows us
to rule out estimators with extra noise
• The Rao-Blackwell theorem allows us to improve an estimator by
basing it on a sufficient statistic 𝑇
• If an unbiased estimator exists and a complete sufficient
statistic 𝑇 exists, then the UMVU estimator exists and is unique

What is next?

• Reflection on unbiasedness
• Information inequality

Thank you
