ProbStat Lec08 Mine


Lecture 8.

ESTIMATION
▪ [1] Chapter 8, Chapter 11

▪ Concept of Estimate
▪ Point Estimate
▪ Maximum Likelihood Estimate
▪ Interval Estimate
▪ For Mean
▪ For Proportion
▪ For Variance (pp. 484 – 488)

PROBABILITY & STATISTICS – Nguyen Hai Duong – NEU – www.mfe.edu.vn/nguyenhaiduong 1


Lecture 8. ESTIMATION
▪ Case study
• Surveyed 100 English and 100 American people
about their TV-watching habits.
• Same sample average: 35 hours per week
• Sample variances: 100 and 25 (hours/week)², respectively
• What is the 95% confidence interval (CI) for the average
TV-watching time of the English and of the Americans?
• What assumptions are necessary for this analysis?
Lecture 8. ESTIMATION
▪ Case study
• Observed 42 members of D.O.G. for weight loss after
one year, obtaining $\bar{x}_D = 7.2$ and $s_D = 3.7$
• Observed 47 members of E.O.G. for weight loss after
one year, obtaining $\bar{x}_E = 4.0$ and $s_E = 3.9$
• Which method is better for losing weight?
• Besides comparing the CIs of the two groups, suggest
another way to express the difference between the
two groups.
8.1. Concept of Estimate
▪ Estimation is determining the approximate value of an
unknown parameter from given data.
▪ Two types of estimate:
▪ Point estimate: a single value
▪ Interval estimate: an interval that contains the parameter
with a specified probability level

Ex. Point estimate: “The average height of Vietnamese males is 163 cm”

Interval estimate: “The average height of Vietnamese males is from 160 to 165 cm”



8.2. Point Estimate
▪ The point estimate of a parameter $\theta$ is denoted by $\hat{\theta}$
▪ Estimator: a statistic calculated from a random sample and used as an
approximation to the unknown parameter. An estimator is a
random variable.
▪ Estimate: the specific value calculated from an observed
sample. An estimate is a number.

Ex. The formula $\bar{x} = \frac{\sum x_i}{n}$ is an estimator
The value $\bar{x} = \frac{3 + 4 + 8}{3} = 5$ is an estimate
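
The distinction can be sketched in code: the estimator is a rule that can be applied to any sample, while the estimate is the number it returns for one observed sample. The function name `sample_mean` is only an illustrative choice.

```python
def sample_mean(sample):
    """Estimator: a rule that maps any sample to a number."""
    return sum(sample) / len(sample)

observed = [3, 4, 8]               # the observed sample from the slide
estimate = sample_mean(observed)   # the estimate for this particular sample
print(estimate)
```

A different observed sample would give a different estimate from the same estimator, which is why the estimator is treated as a random variable.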



Point Estimate
▪ Population parameters $\mu$, $\sigma^2$, $p$ are unknown
▪ Estimate them by statistics from the sample
▪ Sample mean $\bar{x}$ is the point estimate for $\mu$
▪ Sample variance $s^2$ is the point estimate for $\sigma^2$
▪ Sample proportion $\bar{p}$ is the point estimate for $p$



Properties of Point Estimate
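▪ Estimate $\theta$ by $\hat{\theta}$
▪ Unbiased: $\hat{\theta}$ is an unbiased estimator of $\theta$ if
$E(\hat{\theta}) = \theta$
▪ Efficient: if $\hat{\theta}_1$ and $\hat{\theta}_2$ are both unbiased, $\hat{\theta}_1$ is more efficient
than $\hat{\theta}_2$ when
$V(\hat{\theta}_1) < V(\hat{\theta}_2)$
▪ If $V(\hat{\theta}_1)$ is the smallest among all unbiased estimators, $\hat{\theta}_1$ is the most
efficient (best) estimator
▪ BUE (Best Unbiased Estimator)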



Example
Example 8.1. Population mean $\mu$, sample $(x_1, x_2, \dots, x_n)$
(a) Which of the following are unbiased estimators?
(b) Which of the unbiased estimators is more efficient?

$M_1 = \frac{1}{3}x_1 + \frac{1}{3}x_n$
$M_2 = \frac{1}{3}x_1 + \frac{2}{3}x_n$
$M_3 = \frac{1}{2}x_1 + \frac{1}{2}x_n$
$M_4 = \frac{1}{3}x_1 + \frac{1}{3}x_2 + \frac{1}{3}x_n$
$M_5 = \frac{1}{2}x_{min} + \frac{1}{2}x_{max}$
$M_6 = \frac{1}{n}(x_1 + x_2 + \dots + x_n)$
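
One way to check the fixed-weight candidates is sketched below, assuming the observations are independent with common mean $\mu$ and variance $\sigma^2$. Then $E(\sum w_i x_i) = \mu \sum w_i$ and $V(\sum w_i x_i) = \sigma^2 \sum w_i^2$, so an estimator is unbiased when its weights sum to 1. The helper name `check_linear_estimator` and the sample size `n = 10` are illustrative assumptions. $M_5$ is left out because $x_{min}$ and $x_{max}$ are order statistics, not observations with fixed weights.

```python
from fractions import Fraction as F

def check_linear_estimator(weights):
    """For M = sum(w_i * x_i) with i.i.d. x_i (mean mu, variance sigma^2):
    E[M] = mu * sum(w_i) and Var(M) = sigma^2 * sum(w_i^2).
    Returns (is_unbiased, variance as a multiple of sigma^2)."""
    total = sum(weights)
    var_factor = sum(w * w for w in weights)
    return total == 1, var_factor

n = 10  # hypothetical sample size, used only to evaluate M6
estimators = {
    "M1": [F(1, 3), F(1, 3)],
    "M2": [F(1, 3), F(2, 3)],
    "M3": [F(1, 2), F(1, 2)],
    "M4": [F(1, 3)] * 3,
    "M6": [F(1, n)] * n,
}
results = {name: check_linear_estimator(w) for name, w in estimators.items()}
for name, (unbiased, var) in results.items():
    label = "unbiased" if unbiased else "biased"
    print(f"{name}: {label}, Var = {var} * sigma^2")
```

Among the unbiased candidates, the one with the smallest variance factor is the most efficient; the factor for $M_6$ is $1/n$, which shrinks as the sample grows.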



Example
Example 8.2. The population mean is $\mu$, and there are two
samples. The first sample has size 4 and mean $\bar{x}_1$; the
second sample has size 8 and mean $\bar{x}_2$.
Which of the following are unbiased estimators, and which is
the most efficient?

$M_1 = \frac{1}{3}\bar{x}_1 + \frac{1}{3}\bar{x}_2$
$M_2 = \frac{1}{2}\bar{x}_1 + \frac{1}{2}\bar{x}_2$
$M_3 = \frac{2}{3}\bar{x}_1 + \frac{1}{3}\bar{x}_2$
$M_4 = \frac{1}{3}\bar{x}_1 + \frac{2}{3}\bar{x}_2$
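
A possible check for this example is sketched below. It assumes the two samples are independent and drawn from the same population with variance $\sigma^2$, so that $V(\bar{x}_1) = \sigma^2/4$ and $V(\bar{x}_2) = \sigma^2/8$. The helper name `combine` is illustrative.

```python
from fractions import Fraction as F

N1, N2 = 4, 8  # sample sizes given in Example 8.2

def combine(a, b):
    """M = a*xbar1 + b*xbar2, assuming independent samples, common sigma^2.
    E[M] = (a + b) * mu and Var(M) = sigma^2 * (a^2/N1 + b^2/N2).
    Returns (is_unbiased, variance as a multiple of sigma^2)."""
    return a + b == 1, a * a / N1 + b * b / N2

candidates = {
    "M1": (F(1, 3), F(1, 3)),
    "M2": (F(1, 2), F(1, 2)),
    "M3": (F(2, 3), F(1, 3)),
    "M4": (F(1, 3), F(2, 3)),
}
summary = {name: combine(a, b) for name, (a, b) in candidates.items()}
for name, (unbiased, var) in summary.items():
    label = "unbiased" if unbiased else "biased"
    print(f"{name}: {label}, Var = {var} * sigma^2")
```

Under these assumptions the smallest variance among the unbiased candidates belongs to the estimator whose weights are proportional to the sample sizes (4/12 and 8/12), which is the mean of the combined sample.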



8.3. Maximum Likelihood (ML) Estimate
Ex. The probability that a student passes an exam is $p = 0.6$.
▪ Which of the following samples is most likely to occur?
$S_1 = (Pass, Fail, Fail)$   $S_2 = (Fail, Pass, Fail)$
$S_3 = (Fail, Fail, Fail)$   $S_4 = (Pass, Pass, Pass)$
The likelihood function is the probability of observing the sample:
▪ $L(S_1) = 0.6 \times 0.4 \times 0.4 =$
▪ $L(S_2) = 0.4 \times 0.6 \times 0.4 =$
▪ $L(S_3) = 0.4 \times 0.4 \times 0.4 =$
▪ $L(S_4) = 0.6 \times 0.6 \times 0.6 =$
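
The four products can be evaluated with a short script. This is a minimal sketch that assumes the three students' outcomes are independent; the names `likelihood` and `samples` are illustrative.

```python
import math

P_PASS = 0.6  # pass probability given in the example

def likelihood(sample, p=P_PASS):
    """Probability of an ordered sample of independent Pass/Fail outcomes."""
    return math.prod(p if outcome == "Pass" else 1 - p for outcome in sample)

samples = {
    "S1": ("Pass", "Fail", "Fail"),
    "S2": ("Fail", "Pass", "Fail"),
    "S3": ("Fail", "Fail", "Fail"),
    "S4": ("Pass", "Pass", "Pass"),
}
likelihoods = {name: likelihood(s) for name, s in samples.items()}
most_likely = max(likelihoods, key=likelihoods.get)
print(likelihoods, most_likely)
```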



Maximum Likelihood (ML) Estimate
Ex. The sample is $S = (Pass, Fail, Pass, Pass)$
▪ Which of the following values of $p$ is most likely?
$p = 0.2$; $p = 0.4$; $p = 0.6$; $p = 0.8$
Likelihood function:
▪ $p = 0.2$ → $L(S) = 0.2 \times 0.8 \times 0.2 \times 0.2 =$
▪ $p = 0.4$ → $L(S) = 0.4 \times 0.6 \times 0.4 \times 0.4 =$
▪ $p = 0.6$ → $L(S) = 0.6 \times 0.4 \times 0.6 \times 0.6 =$
▪ $p = 0.8$ → $L(S) = 0.8 \times 0.2 \times 0.8 \times 0.8 =$
Find the maximum likelihood estimate.
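
As a rough sketch, the likelihood of this sample (3 passes, 1 fail) can be evaluated on the four candidate values and compared with the closed-form maximizer. For independent Pass/Fail outcomes, $L(p) = p^k (1-p)^{n-k}$ is maximized at $p = k/n$, the sample proportion. The variable names are illustrative.

```python
def likelihood(p, n_pass=3, n_fail=1):
    """L(p) for a sample with n_pass passes and n_fail fails, outcomes independent."""
    return p ** n_pass * (1 - p) ** n_fail

grid = [0.2, 0.4, 0.6, 0.8]
values = {p: likelihood(p) for p in grid}
best_on_grid = max(values, key=values.get)

# Closed-form maximum likelihood estimate: the sample proportion k / n.
p_hat = 3 / 4

print(values)
print("best candidate:", best_on_grid, "MLE:", p_hat, "L(MLE):", likelihood(p_hat))
```

The best of the four listed candidates is not the overall maximizer: the likelihood at the sample proportion is slightly higher than at any value on the grid.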



8.4. Interval Estimate for Mean
▪ If the interval $(G_1, G_2)$ is an interval estimate of $\theta$, then
$P(G_1 < \theta < G_2) = 1 - \alpha$
▪ $(G_1, G_2)$ is the confidence interval (CI)
▪ $(1 - \alpha)$ is the confidence level
▪ $w = G_2 - G_1$ is the width of the confidence interval
▪ $G_1$: Lower Limit (LL)
▪ $G_2$: Upper Limit (UL)
▪ The narrower the width $w$, the more precise
the estimate.



Confidence Interval for Mean 𝝁
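▪ Population parameters are unknown; estimate the mean $\mu$
▪ Sample of size $n$, with mean $\bar{x}$ and variance $s^2$
▪ Confidence level $(1 - \alpha)$, confidence interval:
$\bar{x} - t_{\alpha/2}^{(n-1)} \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}^{(n-1)} \frac{s}{\sqrt{n}}$
▪ or $\bar{x} \pm t_{\alpha/2}^{(n-1)} \frac{s}{\sqrt{n}}$

▪ Values of $t^{(df)}$ are in Table 2 (p. 976); for $df > 30$, $t^{(df)} \approx z$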



Confidence Interval for 𝝁
▪ Margin of error: $ME = t_{\alpha/2}^{(n-1)} \frac{s}{\sqrt{n}}$
▪ Confidence interval: $\bar{x} \pm ME$

▪ To narrow the confidence interval:
▪ Reduce the standard deviation $s$
▪ Reduce the confidence level
▪ Increase the sample size $n$:
$n = \left( \frac{s}{ME} \, t_{\alpha/2}^{(n-1)} \right)^2$
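
The sample-size formula follows from solving the margin-of-error expression for $n$. The steps are written out below, using the same symbols as above.

```latex
\begin{align*}
ME &= t_{\alpha/2}^{(n-1)} \frac{s}{\sqrt{n}} \\
\sqrt{n} &= t_{\alpha/2}^{(n-1)} \frac{s}{ME} \\
n &= \left( \frac{s}{ME} \, t_{\alpha/2}^{(n-1)} \right)^2
\end{align*}
```

Because the critical value $t_{\alpha/2}^{(n-1)}$ itself depends on $n$, one common workaround (an assumption here, not stated on the slide) is to use $z_{\alpha/2}$ in its place when $n$ is expected to exceed 30, and to round the result up to the next whole number.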



Example
Example 8.3. From a normally distributed population, two
samples are collected:
▪ Sample 1: 15, 17, 16, 20, 17
▪ Sample 2: 18, 13, 14, 20, 15, 13

With a confidence level of 95%, find the confidence interval of the
mean using the data in Sample 1, Sample 2, and the pooled
sample.



Example
▪ Example 8.3 (cont.)
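▪ Sample 1:
$\bar{x}_1 = \frac{15 + 17 + 16 + 20 + 17}{???} =$
$s_1^2 = \frac{(15 - \bar{x})^2 + (17 - \bar{x})^2 + (16 - \bar{x})^2 + (20 - \bar{x})^2 + (17 - \bar{x})^2}{???} =$
$n = 5$; $1 - \alpha = 0.95$

▪ Sample 2: $\bar{x}_2 = 15.5$, $s_2^2 = 8.3$

▪ Pooled sample: $\bar{x}_P = 16.182$, $s_P^2 = 6.164$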
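
The sample statistics and the three intervals can be reproduced with the sketch below. The critical values in `T_CRIT_95` are two-sided 95% values copied from a standard t-table and should be checked against Table 2 of the textbook. The pooled sample is taken to be the 11 observations combined, which matches the pooled mean and variance quoted above. The helper name `t_interval` is illustrative.

```python
from math import sqrt
from statistics import mean, variance

# t_{0.025}^{(df)} from a standard t-table (assumption: verify against Table 2).
T_CRIT_95 = {4: 2.776, 5: 2.571, 10: 2.228}

def t_interval(sample):
    """Return (mean, sample variance, 95% CI) using x_bar +/- t * s / sqrt(n)."""
    n = len(sample)
    x_bar = mean(sample)
    s2 = variance(sample)                  # sample variance, divisor n - 1
    me = T_CRIT_95[n - 1] * sqrt(s2 / n)   # margin of error
    return x_bar, s2, (x_bar - me, x_bar + me)

sample1 = [15, 17, 16, 20, 17]
sample2 = [18, 13, 14, 20, 15, 13]
data = {"Sample 1": sample1, "Sample 2": sample2, "Pooled": sample1 + sample2}

results = {name: t_interval(s) for name, s in data.items()}
for name, (x_bar, s2, (low, high)) in results.items():
    print(f"{name}: mean={x_bar:.3f}, s^2={s2:.3f}, CI=({low:.3f}, {high:.3f})")
```

The pooled interval is narrower than either single-sample interval mainly because the larger sample size both shrinks $s/\sqrt{n}$ and lowers the critical value.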



Example
Example 8.4. To estimate the average Maths score of students,
the scores of 40 students are collected; the sample
mean is 74.5 and the sample variance is 64. Assume that
scores are normally distributed.
(a) Find the 95% confidence interval for the average score
(b) Find the 90% confidence interval for the average score
(c) To reduce the interval width to less than 4:
(c1) At the 95% level, how many observations should be
surveyed?
(c2) With the above sample, what should the confidence level
be?
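
One way to work through this example is sketched below. Since $df = 39 > 30$, it uses the normal approximation $t \approx z$; exact t-values would give slightly wider intervals. Reading "width less than 4" as $ME < 2$ is an interpretation, and the variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

n, x_bar, s = 40, 74.5, sqrt(64)   # sample size, mean, standard deviation
Z = NormalDist()                   # standard normal distribution

def z_interval(level):
    """CI x_bar +/- z_{alpha/2} * s / sqrt(n), using z in place of t (df > 30)."""
    z = Z.inv_cdf(1 - (1 - level) / 2)
    me = z * s / sqrt(n)
    return x_bar - me, x_bar + me

ci95 = z_interval(0.95)   # part (a)
ci90 = z_interval(0.90)   # part (b)

# Part (c): width < 4 is read here as margin of error < 2.
target_me = 2.0
n_needed = ceil((Z.inv_cdf(0.975) * s / target_me) ** 2)   # (c1)
z_allowed = target_me * sqrt(n) / s                        # (c2): z giving ME = 2
level_allowed = 2 * Z.cdf(z_allowed) - 1

print(ci95, ci90, n_needed, round(level_allowed, 3))
```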



8.5. Interval Estimate for Proportion
▪ Population proportion 𝑝 is unknown
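▪ Sample of size $n$, with sample proportion $\bar{p}$
▪ Confidence level $(1 - \alpha)$, confidence interval:
$\bar{p} - z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} < p < \bar{p} + z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$
▪ Or $\bar{p} \pm ME$ with $ME = z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$
▪ Sample size: $n = \frac{z_{\alpha/2}^2 \, \bar{p}(1 - \bar{p})}{ME^2}$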



Example
Example 8.5. Of 200 observed visitors, 50 bought
goods and 40 used services. With a confidence level of
95%:
(a) Estimate the proportion of buyers among the visitors
(b) To have an interval narrower than 10%, how many
observations should be surveyed?
(c) Estimate the proportion of visitors who do not use
services
(d) Out of 4000 visitors, estimate the total number of buyers
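
A possible calculation for the four parts is sketched below. It assumes that "narrower than 10%" refers to the total interval width (so $ME < 0.05$) and that the total in part (d) is obtained by scaling the interval for the proportion by 4000. The helper name `proportion_ci` is illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)   # z_{alpha/2} for a 95% confidence level
n = 200

def proportion_ci(successes, n):
    """CI p_bar +/- z * sqrt(p_bar * (1 - p_bar) / n)."""
    p_bar = successes / n
    me = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - me, p_bar + me

buyers_ci = proportion_ci(50, n)                        # (a)

target_me = 0.05   # assumption: width < 0.10 means ME < 0.05
p_bar = 50 / n
n_needed = ceil(z ** 2 * p_bar * (1 - p_bar) / target_me ** 2)   # (b)

non_users_ci = proportion_ci(n - 40, n)                 # (c)
total_buyers_ci = tuple(4000 * b for b in buyers_ci)    # (d)

print(buyers_ci, n_needed, non_users_ci, total_buyers_ci)
```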



8.6. Interval Estimate for Variance
▪ Population variance 𝜎 2 is unknown
▪ Sample of size $n$, with sample variance $s^2$
▪ Confidence level $(1 - \alpha)$, confidence interval:
$\frac{(n - 1) s^2}{\chi^{2\,(n-1)}_{\alpha/2}} < \sigma^2 < \frac{(n - 1) s^2}{\chi^{2\,(n-1)}_{1 - \alpha/2}}$

Example 8.6. To estimate the variability of production time,
a manager randomly observed 40 workers; the
variance of their production time was 12.5 minutes². Find the
95% interval estimate of the variance of production time,
assuming that production time is normally distributed.
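
The interval for this example can be sketched as follows. The two chi-square critical values for $df = 39$ are taken from a standard table and are an assumption to be checked against the textbook's table. The function name `variance_ci` is illustrative.

```python
def variance_ci(n, s2, chi2_upper, chi2_lower):
    """CI for sigma^2: ((n-1) s^2 / chi2_{alpha/2}, (n-1) s^2 / chi2_{1-alpha/2})."""
    df = n - 1
    return df * s2 / chi2_upper, df * s2 / chi2_lower

# Table values for df = 39 (assumption: verify against your chi-square table).
CHI2_UPPER = 58.120   # upper-tail area 0.025
CHI2_LOWER = 23.654   # upper-tail area 0.975

low, high = variance_ci(n=40, s2=12.5, chi2_upper=CHI2_UPPER, chi2_lower=CHI2_LOWER)
print(round(low, 3), round(high, 3))
```

The interval is not symmetric around $s^2 = 12.5$, because the chi-square distribution is skewed.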
Key Concepts
▪ Point Estimate
▪ Unbiased, Efficient Estimator
▪ Confidence Interval
▪ Confidence Level, Margin of Error
▪ Confidence Interval for Mean
▪ Confidence Interval for Proportion
▪ Confidence Interval for Variance



Exercise
[1] Chapter 8
▪ (349) 2, 6, 8, 10
▪ (357) 13, 14, 17, 18, 21,
▪ (261) 26, 30
▪ (366) 32, 34, 36, 38, 40
▪ Case Problem 1, 3

