Topic 9 Estimation (No Answer)


Academic Year 112, Department of Business Administration, Statistics · Topic 9: Estimation · Instructor: 林晉禾

Topic 9 Estimation

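[Diagram: sampling leads from the population (parameters) to the sample (statistics); inference leads from the sample statistics back to the population parameters.]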

◎Sampling:The method or procedure to select a sample from the population.


◎Statistical Inference:A method or procedure for inferring population parameters using
sample statistics. That is, a method of estimating, predicting, and
determining the characteristics of a population based on sample.

▲ Inferences are based on data and methods; claims made without evidence are conjectures.

▲The population parameters most commonly inferred are the mean and the standard deviation.

▲Statistical inference mainly includes:


(1) Estimation: point estimation and interval estimation
(2) Hypothesis testing

▲Statistical inferences cannot be completely accurate because the sample contains less information than the population. The reliability of a statistical inference is measured by the confidence level and the significance level.

一、Introduction: Statistical Inference

二、Point Estimation

三、Interval Estimation

(1) Calculate the (1 − α)100% interval estimates

1 The interval estimate of the population mean μ

Case 1:Population variance $\sigma^2$ is known

Case 2:Population variance $\sigma^2$ is unknown

2 The interval estimate of the population proportion p




3 The interval estimate of the difference between two population means ($\mu_1 - \mu_2$)

Case 1: $\sigma_1^2$ and $\sigma_2^2$ are known; the samples are independent.

Case 2: $\sigma_1^2$ and $\sigma_2^2$ are unknown; the samples are independent and both are large.

Case 3: $\sigma_1^2$ and $\sigma_2^2$ are unknown, $\sigma_1^2 \neq \sigma_2^2$; the samples are independent and both are small.

Case 4: $\sigma_1^2$ and $\sigma_2^2$ are unknown, $\sigma_1^2 = \sigma_2^2$; the samples are independent and both are small.

4 The interval estimate of the difference between two population proportions ($p_1 - p_2$)

5 The interval estimate of the population variance $\sigma^2$

6 The interval estimate of the ratio of two population variances ($\sigma_1^2 / \sigma_2^2$)

(2) Sample size determination

1 Sample size to estimate a population mean


2 Sample size to estimate a population proportion



一、Introduction: Statistical Inference

There are two main types of statistical inference:

1. ______________________ (Topic 9)

☛The goal of estimation is to determine possible values or approximations of population parameters based on ____________________. For example, use the sample mean ($\bar X$) to estimate the population mean (μ).

☛Terminology

1 Estimation: The method or procedure for estimating population parameters using sample statistics.

2 Estimator: A formula (such as a sample statistic) that generates estimated values from sample data. The statistic used to estimate a population parameter is called an estimator.

3 Estimate: The actual value calculated by substituting the sample observations into the estimator is called the estimate.

☛The presentation methods of estimation are point estimation and interval estimation.

When making estimates, we select a random sample from the population and get the
estimates of the parameters of the population from the sample. There are two ways to
present estimates:

Point estimation:Calculate the sample statistic based on the sample, and use the sample
statistic as the estimated value of the population parameter. [point estimation as a
_____________]

Interval estimation:An interval with upper and lower bounds is estimated for the
unknown population parameter, and the reliability of the interval containing the population
parameter is indicated. [Interval estimate is an _________ on the real line]

2. ____________________ (Topic 10)

In fact, estimation and hypothesis testing use very similar concepts.


二、Point Estimation

1. Significance and Limitations of Point Estimation

Definition:A random sample $(X_1, X_2, \cdots, X_n)$ of size n is drawn from the population, and the sample statistic computed from it is used as the estimated value of the population parameter.


▶Point estimator

A point estimator $\hat\theta$ is a real-valued function of the random sample $(X_1, X_2, \cdots, X_n)$, that is, $\hat\theta = W(X_1, X_2, \cdots, X_n)$. In other words, any sample statistic is a ______________, for example, the sample mean $\bar X$, the sample proportion $\hat p$, or the sample standard deviation $s$.

▶Point estimate

Substituting the observed values of the random sample $(X_1, X_2, \cdots, X_n)$ into the function $\hat\theta = W(X_1, X_2, \cdots, X_n)$ gives a real value, which is a ______________, which is a
______________.

For example:When the sample mean $\bar X$ is used to estimate the population mean μ, the sample mean $\bar X$ is the estimator. Substituting the sample observations into the estimator gives a specific value, such as $\bar x_0 = 30$; this value 30 is the estimate.

Limitation:Although point estimation is simple, it cannot indicate how accurate the estimate is. The estimated value varies with different ____________, and some estimation error is inevitable, which makes decision makers uneasy. This is why interval estimation is used.

Explanation:

The steps of point estimation:


1 Select representative sample

2 Choose a better sample statistic as the estimator

3 Calculate the value of the sample statistic (estimate)

4 Use the value of the sample statistic (the estimate) to infer the value of the population parameter and make a decision.
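The steps above can be sketched in a few lines of Python. This is only an illustration: the price list is hypothetical data (not taken from the course), and `point_estimates` is a helper name introduced here.

```python
from statistics import mean, stdev

def point_estimates(sample):
    """Steps 2-3: use the sample mean and sample standard deviation
    as estimators and compute their values (the estimates)."""
    return {"mean": mean(sample), "std_dev": stdev(sample)}

# Step 1: a (hypothetical) random sample of n = 5 house prices
prices = [900, 950, 1000, 980, 920]

# Step 4: the estimates are then used to infer the population parameters
estimates = point_estimates(prices)
print(estimates["mean"])     # point estimate of the population mean
print(estimates["std_dev"])  # point estimate of the population standard deviation
```

Note that `stdev` uses the divisor n − 1, which matches the sample standard deviation s used in this topic.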

✎Review
Population Parameter | Sample Statistic
Population mean | Sample mean
Ex: The mean house value for all houses in Kaohsiung | Ex: The mean house value for a sample of n=200 houses in Kaohsiung
Population proportion | Sample proportion
Ex: The proportion of all houses in Kaohsiung with lead paint | Ex: The proportion of houses with lead paint in a sample of n=200 Kaohsiung houses


【EXAMPLE 1】Chinho wants to buy a 25~30 square feet house in Kaohsiung City, so he
wants to know the general (average) housing price in Kaohsiung City, what can he
do?______________

☛Is the estimate accurate? (limitation of point estimators)

A point estimate alone does not tell us how close it is to the population parameter. Because the sample is random, a statistic computed from one particular sample may or may not be close to the parameter.

Suppose we know that the true average price of 25~30 square feet houses in Kaohsiung City is 10 million NT dollars (the estimates here appear to be expressed in units of NT$10,000, so this corresponds to 1000). Then the estimate $\bar X = 935.03$ in the above example clearly misses the true value.
If another sample is selected and the resulting estimate is $\bar X = 1000$, then this estimate hits the population parameter exactly.

In general, a point estimate almost never equals the true value of the population parameter exactly. Knowing only the point estimate does not tell us the size of the estimation error (that is, the accuracy of the estimation), which is the motivation behind interval estimation.

2. Judgment Criteria for Point Estimator -What is a "good" point estimator?

In theory, there are many possible estimators for a population parameter. For example, to
estimate the population mean, possible point estimators are sample mean, sample median,
sample mode, etc.
☛How to choose better point estimator?
If a good estimator can be selected, no matter what kind of sample appears, ______________,
a good estimator can make the estimation result closer to the true value of the population
parameter or make the error smaller. The criteria for choosing good estimators are:
(1) Unbiasedness; (2) Efficiency; (3) Consistency; (4) Sufficiency.


(1) Unbiasedness

Definition:If the expected value of an estimator $\hat\theta$ (usually a statistic) equals the population parameter θ, that is, $E(\hat\theta) = \theta$, then this estimator is called an __________ estimator of that population parameter.

Explanation:

1 Examples of unbiased estimators:

• ______________:The sample mean $\bar X$ is an unbiased estimator of the population mean μ. It is worth noting that there are many possible estimators of the population mean, such as the sample maximum, sample minimum, sample mode, and sample median. Being _________ is the most basic requirement for an estimator, so even though the sample maximum is an estimator, it is not usually used to estimate the population mean because it _______ unbiased.

• ______________:The sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar X)^2$ is an unbiased estimator of the population variance $\sigma^2$.

• ______________:The sample proportion $\hat p$ is an unbiased estimator of the population proportion p.

• $E(\bar X_1 - \bar X_2) = \mu_1 - \mu_2$:The difference of sample means is an unbiased estimator of the difference of population means.

• $E(\hat p_1 - \hat p_2) = p_1 - p_2$:The difference of sample proportions is an unbiased estimator of the difference of population proportions.

2 Biased estimator:

If E(𝜃̂) ≠ θ, then (bias)= _______________________
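Unbiasedness can be checked exactly on a tiny example by enumerating every possible sample. The sketch below assumes a hypothetical population {1, 2, 3} with equally likely values and i.i.d. samples of size 2; it uses the usual definition of bias as E(θ̂) − θ, and the helper name `var_with_divisor` is introduced here only for illustration.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]   # hypothetical population, values equally likely
n = 2                    # sample size, drawn with replacement (i.i.d.)

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

def var_with_divisor(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / divisor

samples = list(product(population, repeat=n))   # all 9 equally likely samples

# Expected values over the full sampling distribution
exp_xbar = sum(Fraction(sum(s), n) for s in samples) / len(samples)
exp_s2 = sum(var_with_divisor(s, n - 1) for s in samples) / len(samples)
exp_biased = sum(var_with_divisor(s, n) for s in samples) / len(samples)

print(exp_xbar == mu)        # sample mean: expected value equals mu
print(exp_s2 == sigma2)      # divisor n - 1: expected value equals sigma^2
print(exp_biased - sigma2)   # divisor n: bias E(theta_hat) - theta is not zero
```

Using exact fractions instead of a random simulation keeps the comparison free of rounding and sampling noise.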


【EXAMPLE 2】Prove that both $\bar X = \frac{\sum X_i}{n}$ and $\frac{\sum a_i X_i}{\sum a_i}$ (the $a_i$ are constants) are unbiased estimators of the population mean μ, that is, prove $E(\bar X) = \mu$ and $E\left(\frac{\sum a_i X_i}{\sum a_i}\right) = \mu$.

【EXAMPLE 3】Prove that $E(s^2) = \sigma^2$. [see 陳, pp. 455-456]


(2) Efficiency
There may be many unbiased estimators of the same parameter (see 【EXAMPLE 2】). Therefore, in addition to unbiasedness, other criteria are needed to choose among them, and efficiency is one of them.
Definition:Let $\hat\theta$ (usually a statistic) be an estimator of the population parameter θ. If the mean square error (MSE) of $\hat\theta$, $\mathrm{MSE}(\hat\theta) = E(\hat\theta - \theta)^2$, is the __________ among all estimators, then $\hat\theta$ is said to have _______________________ in estimating θ.

If $\hat\theta_1$ and $\hat\theta_2$ are estimators of θ, and $\mathrm{MSE}(\hat\theta_1)$ is less than $\mathrm{MSE}(\hat\theta_2)$, that is, $\mathrm{eff}(\hat\theta_1, \hat\theta_2) = \frac{\mathrm{MSE}(\hat\theta_2)}{\mathrm{MSE}(\hat\theta_1)} > 1$, then $\hat\theta_1$ is said to have ________________ to $\hat\theta_2$ in estimating θ.

Explanation:

1 The efficiency of an estimator is measured by its MSE. The smaller the MSE, the higher the efficiency of the estimator. Efficiency can be absolute or relative.

2 Suppose that $\hat\theta_1$ and $\hat\theta_2$ are unbiased estimators of θ. If $\mathrm{Var}(\hat\theta_1)$ is smaller than $\mathrm{Var}(\hat\theta_2)$, that is, $\mathrm{eff}(\hat\theta_1, \hat\theta_2) = \frac{\mathrm{Var}(\hat\theta_2)}{\mathrm{Var}(\hat\theta_1)} > 1$, then $\hat\theta_1$ is relatively more efficient than $\hat\theta_2$ in estimating θ.
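The ratio in the definition is easy to compute once the two MSEs are known. The sketch below is a hypothetical illustration: it compares the sample mean of n = 10 observations (variance σ²/n) with the estimator that uses only the first observation (variance σ²), with σ² = 4 chosen arbitrarily. The function names are introduced here and are not part of the course material.

```python
def mse(variance, bias=0.0):
    """MSE = Var + bias^2; for an unbiased estimator the MSE is just the variance."""
    return variance + bias ** 2

def relative_efficiency(mse_1, mse_2):
    """eff(theta1_hat, theta2_hat) = MSE(theta2_hat) / MSE(theta1_hat)."""
    return mse_2 / mse_1

sigma2, n = 4.0, 10              # hypothetical population variance and sample size

mse_mean = mse(sigma2 / n)       # sample mean of all n observations
mse_first = mse(sigma2)          # estimator that uses only X1

eff = relative_efficiency(mse_mean, mse_first)
print(eff)   # a value above 1 means the sample mean is the more efficient estimator
```

Here the ratio equals n, so the advantage of the sample mean grows with the sample size.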


【EXAMPLE 4】If the population is normally distributed $N(\mu, \sigma^2)$, prove that the sample mean $\bar X$ and the sample median are both unbiased estimators of the population mean, and that the sample mean $\bar X$ is the relatively more efficient estimator.

【EXAMPLE 5】Let $(Y_1, Y_2, \cdots, Y_n)$ be a random sample of size n from a population with mean μ and variance $\sigma^2$. Consider $\hat\theta_1 = \frac{1}{2}(Y_1 + Y_2)$ and $\hat\theta_2 = \bar Y$. Find $\mathrm{eff}(\hat\theta_1, \hat\theta_2)$.


(3) Consistency 【A large-sample property】

Definition:If, as the sample size n approaches ∞, the probability that the absolute difference between the estimator $\hat\theta$ (usually a statistic) and the population parameter θ is less than an arbitrarily small positive number ε converges to 1, that is, $\lim_{n\to\infty} P(|\hat\theta - \theta| < \varepsilon) = 1$, or equivalently $\lim_{n\to\infty} P(|\hat\theta - \theta| \ge \varepsilon) = 0$, for any ε > 0, then $\hat\theta$ is called a consistent estimator of the population parameter θ.

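Explanation:
1 Theorem: An unbiased estimator $\hat\theta$ of θ is consistent if $\lim_{n\to\infty} \mathrm{Var}(\hat\theta) = 0$.

【unbiased + $\lim_{n\to\infty} \mathrm{Var}(\hat\theta) = 0$ → $\hat\theta$ is consistent】

Proof:
☛This proof uses Markov's inequality, $P(Y \ge b^2) \le \frac{E(Y)}{b^2}$ for a nonnegative random variable $Y$, from which Chebyshev's inequality $P(|X - E(X)| \ge b) \le \frac{\mathrm{Var}(X)}{b^2}$ follows.

$P(|\hat\theta - \theta| \ge \varepsilon) = P\left((\hat\theta - \theta)^2 \ge \varepsilon^2\right) \le \frac{E(\hat\theta - \theta)^2}{\varepsilon^2} = \frac{\mathrm{Var}(\hat\theta) + [E(\hat\theta) - \theta]^2}{\varepsilon^2}$

∵ $\lim_{n\to\infty} E(\hat\theta) = \theta$ and $\lim_{n\to\infty} \mathrm{Var}(\hat\theta) = 0$

∴ $P(|\hat\theta - \theta| \ge \varepsilon) \to 0$ as $n \to \infty$ for all $\varepsilon > 0$

2 If $\hat\theta_1$ and $\hat\theta_2$ are both unbiased and consistent estimators of θ, how do we choose a suitable estimator?

→ Choose the estimator with the smaller variance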

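3 For example:we know that

• $E(\bar X) = \mu$, $V(\bar X) = \frac{\sigma^2}{n}$

• $E(\hat p) = p$, $V(\hat p) = \frac{p(1-p)}{n}$

• $E(s^2) = \sigma^2$, $V(s^2) = \frac{2\sigma^4}{n-1}$ (for a normal population)

When the sample size n increases such that n → ∞:

• $V(\bar X) = \frac{\sigma^2}{n} \to 0$; $V(\hat p) = \frac{p(1-p)}{n} \to 0$; $V(s^2) = \frac{2\sigma^4}{n-1} \to 0$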

So $\bar X$ is a consistent estimator of μ; $\hat p$ is a consistent estimator of p; $s^2$ is a consistent estimator of $\sigma^2$.
In general, a sample statistic that is computed in the same way as the corresponding population parameter is a consistent estimator of that parameter.
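The argument above can be made concrete by watching the Chebyshev bound shrink as n grows. The sketch below assumes a hypothetical population variance σ² = 25 and tolerance ε = 1; `chebyshev_bound` is a name introduced here for illustration.

```python
def chebyshev_bound(variance, eps):
    """Upper bound on P(|theta_hat - theta| >= eps) for an unbiased estimator,
    capped at 1 because a probability cannot exceed 1."""
    return min(1.0, variance / eps ** 2)

sigma2, eps = 25.0, 1.0          # hypothetical values

# For the sample mean, Var(X-bar) = sigma^2 / n, so the bound falls as n grows
bounds = {n: chebyshev_bound(sigma2 / n, eps) for n in (10, 100, 1000, 10000)}
for n, bound in bounds.items():
    print(n, bound)
```

The bound is loose for small n (it is capped at 1), but it tends to 0 as n → ∞, which is exactly what consistency requires.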


(4) Sufficiency

Definition:Let $(X_1, X_2, \cdots, X_n)$ be a random sample from a population $f(x; \theta)$, with joint probability function $f(x_1, x_2, \cdots, x_n; \theta) = f(x_1; \theta) f(x_2; \theta) \cdots f(x_n; \theta)$. If it can be factored as $f(x_1, x_2, \cdots, x_n; \theta) = g(\hat\theta, \theta)\, h(x_1, x_2, \cdots, x_n)$, where $h(x_1, x_2, \cdots, x_n)$ does not depend on the population parameter θ, then $\hat\theta$ is called a sufficient estimator of θ; that is, estimating θ with $\hat\theta$ is sufficient.

Explanation:

A statistic is a sufficient estimator of a population parameter if it captures all the information about that parameter contained in the sample data. That is, a sufficient estimator provides all the information about the parameter θ that the sample contains, and no other statistic can provide more.

【EXAMPLE 6】Consider the outcomes of n trials of a binomial experiment, $X_1, X_2, \cdots, X_n$, where

$X_i = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } (1 - p) \end{cases}$

and $Y = \sum_{i=1}^{n} X_i \sim \mathrm{Bin}(n, p)$. Show that $Y$ is sufficient for $p$.

Solution:

$P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n \mid Y = y) = \frac{P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n, Y = y)}{P(Y = y)}$

$= \begin{cases} 0 & \text{if } \sum_{i=1}^{n} x_i \ne y \\ \frac{p^y (1-p)^{n-y}}{C_y^n\, p^y (1-p)^{n-y}} = \frac{1}{C_y^n} & \text{if } \sum_{i=1}^{n} x_i = y \end{cases}$

⟹ $\frac{1}{C_y^n}$ does not depend on p

⟹ $Y = \sum_{i=1}^{n} X_i$ is sufficient for p
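The conclusion of the solution can be verified numerically: the conditional probability of a particular outcome sequence, given its total, comes out the same for every p. The sketch below uses exact fractions; the sequence `xs` and the values of p are arbitrary choices made for illustration, and `conditional_prob` is a name introduced here.

```python
from fractions import Fraction
from math import comb

def conditional_prob(xs, p):
    """P(X1=x1, ..., Xn=xn | Y=y) with y = sum(xs), for i.i.d. Bernoulli(p) trials."""
    n, y = len(xs), sum(xs)
    joint = p ** y * (1 - p) ** (n - y)                # P(X1=x1, ..., Xn=xn)
    p_y = comb(n, y) * p ** y * (1 - p) ** (n - y)     # P(Y=y), binomial probability
    return joint / p_y

xs = (1, 0, 1, 0)   # one particular sequence with n = 4 and y = 2
results = {p: conditional_prob(xs, p)
           for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10))}

for p, value in results.items():
    print(p, value)   # the same value, 1 / C(4, 2), for every p
```

Because the conditional distribution no longer involves p, the individual outcomes carry no information about p beyond what Y already provides.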


三、Interval Estimation

1. The meaning and method of interval estimation

Definition:An interval of upper and lower bounds is estimated for the unknown population
parameter, and the reliability (probability) of the interval containing the
population parameter is indicated.

Method:Interval estimation starts from ______________, and then derives an interval with
upper and lower limits under a given ______________ to illustrate the reliability of
the interval (that is confidence interval) containing the true value of the population
parameter. The formula is:

__________________________________________

θ̂ − kσ(θ̂) is lower confidence limit

θ̂ + kσ(θ̂) is upper confidence limit

▶Level of confidence:(1 − α)
The confidence level (1 − α) measures the reliability of the confidence interval, that is, how confident we can be that the interval includes the population parameter. α is the probability of error.

__________________________________________
▶Confidence Interval
The confidence interval is an interval constituted under a given confidence level and is an
interval including upper and lower bounds composed of sample statistics and sampling errors.

▶Length of Confidence Interval


Length of Confidence Interval is Upper Limit−Lower Limit

Explanation:


1 If, according to the sampling distribution of the estimator, the probability that the interval contains the population parameter is (1 − α) [level of confidence], then this interval is called a "(1−α)100% confidence interval". For example, the interval constructed at the 95% confidence level is the 95% confidence interval.

2 The 95% confidence level means that, over repeated sampling, about 95% of the intervals constructed in this way contain the true value of the population parameter. Likewise, the 90% confidence level means that about 90% of such intervals contain the true value of the population parameter.



3 For example:Suppose we want to estimate the average summer-vacation income of students. The interval estimate might state that, with 95% confidence, the (unknown) average summer-vacation income of students is between $380 and $400.


4 The factors that affect the confidence interval are point estimate, the level of confidence, the

sample size, and the method of taking the confidence limit (one-tailed, two-tailed).
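The meaning of the confidence level can be illustrated with a small simulation: draw many samples from a population whose mean is known, build an interval from each, and count how often the interval covers the true mean. The sketch below assumes a normal population with known σ and uses the interval $\bar X \pm Z_{\alpha/2}\cdot\sigma/\sqrt{n}$; the values μ = 390, σ = 30, n = 25 and the function name `coverage` are hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist, mean

def coverage(mu, sigma, n, conf, trials, seed=0):
    """Fraction of simulated confidence intervals that contain the true mean mu."""
    rng = random.Random(seed)                       # fixed seed for reproducibility
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)    # critical value Z_{alpha/2}
    margin = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        xbar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

rate = coverage(mu=390, sigma=30, n=25, conf=0.95, trials=2000)
print(rate)   # should be close to 0.95
```

Each individual interval either contains μ or it does not; the 95% describes the long-run share of intervals that do.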


2. Interval Estimator
(1) Calculate the interval estimate of (1 − α)100%

1 The interval estimator of population mean; 𝛍


The statistic $\bar X$ is the best estimator of the population mean μ, so the sample mean $\bar X$ can be used as a point estimator of μ. For the interval estimation of the population mean μ, the following situations arise:

Case 1:Population variance 𝝈𝟐 is known

If the population variance $\sigma^2$ is known, ________________________________________ can be used.
Confidence interval for a (1 − α)100% two-tailed interval estimate of μ:

__________________________________________

→ $\bar X$, $Z_{\alpha/2}$, σ, and n must be known

▶Level of confidence:______________

▶The probability interval is $P\left[-Z_{\alpha/2} \le \frac{\bar X - \mu}{\sigma/\sqrt{n}} \le Z_{\alpha/2}\right]$ = ______________

▶Margin of error:______________

▶Length of Confidence Interval:$2 \times Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$

Explanation:
(i) The conditions for using $\bar X \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ as an interval estimator of the population mean are:【(a)+(b) or (a)+(c)】

(a) The population variance $\sigma^2$ or the population standard deviation σ is known.

(b) The population is normally distributed; then, whether the sample is large or small, the sampling distribution of $\bar X$ is normal.
(c) The population is not normally distributed, but the sample is large enough to apply the Central Limit Theorem, so the sampling distribution of $\bar X$ approaches the normal distribution.

If the population distribution is unknown, the population variance $\sigma^2$ or the population standard deviation σ is known, and the sample is small, Chebyshev's theorem is used for interval estimation.
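When conditions (a)+(b) or (a)+(c) hold, the interval $\bar X \pm Z_{\alpha/2}\cdot\sigma/\sqrt{n}$ can be computed as in the following sketch. The input values are hypothetical (they are not taken from the examples in this handout), and `z_interval` is a helper name introduced here.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf=0.95):
    """Two-tailed confidence interval for the mean when sigma is known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value Z_{alpha/2}
    margin = z * sigma / sqrt(n)              # margin of error
    return xbar - margin, xbar + margin

# Hypothetical inputs: sample mean 100, known sigma 12, sample size 36
lower, upper = z_interval(xbar=100.0, sigma=12.0, n=36, conf=0.95)
print(round(lower, 2), round(upper, 2))
print(round(upper - lower, 2))   # interval length = 2 x margin of error
```

Raising the confidence level widens the interval, while a larger n narrows it, in line with the factors listed earlier.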

(ii)Derivation of Confidence Interval:
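One way to write out the derivation, assuming $\bar X$ is normally distributed (or approximately so by the Central Limit Theorem) and σ is known, is sketched below. It starts from the probability interval for the standardized sample mean and isolates μ.

```latex
\begin{aligned}
1-\alpha
 &= P\left(-Z_{\alpha/2} \le \frac{\bar X-\mu}{\sigma/\sqrt{n}} \le Z_{\alpha/2}\right) \\
 &= P\left(-Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \bar X-\mu \le Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \\
 &= P\left(\bar X - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar X + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)
\end{aligned}
```

The second line multiplies through by $\sigma/\sqrt{n}$; the third subtracts $\bar X$ and multiplies by −1, which reverses the inequalities.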

(iv)Explanation:______________________________________________________________

【EXAMPLE 7】

(1) Determine the critical value $Z_{\alpha/2}$ that corresponds to a 95% level of confidence.

(2) Determine the critical value $Z_{\alpha/2}$ that corresponds to a 96% level of confidence.

【EXAMPLE 8】 Del Monte sets the filling operation to dispense 4.51 ounces of peaches and
gel in each cup. From historical data, Del Monte knows that 0.04 ounce is the standard
deviation of the filling process and that amount, in ounces, follows the normal distribution.
The quality control technician selects a sample of 64 cups at the start of each shift, measures
the amount in each cup, computes the mean fill amount, and then develops a 95% confidence
interval for the population mean. Using the confidence interval, is the process filling the cups
to the desired amount? This morning’s sample of 64 cups had a sample mean of 4.507 ounces.


【EXAMPLE 9】 A sample of 81 observations is taken from a normal population with a standard deviation of 5. The sample mean is 40. Determine the 95% confidence interval for the population mean.

【EXAMPLE 10】The American Management Association is studying the income of store managers in the retail industry. A random sample of 49 managers reveals a sample mean of $45,420. The standard deviation of the population is $2,050. What is the population mean? What is a reasonable range of values for the population mean? How do we interpret these results?


Case 2:Population variance 𝝈𝟐 is unknown

I. Large sample

If the population variance $\sigma^2$ is unknown, _________________________________________ can be used.

But since $\sigma^2$ is unknown, it is replaced by ___________. Therefore, the

confidence interval for the two-tailed interval estimate of μ:______________________________

→ $\bar X$, $Z_{\alpha/2}$, s, and n must be known

▶Level of confidence:______________

▶The probability interval is $P\left[-Z_{\alpha/2} \le \frac{\bar X - \mu}{s/\sqrt{n}} \le Z_{\alpha/2}\right]$ = ______________

▶Margin of error:______________________

▶Length of Confidence Interval:$2 \times Z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$

Explanation:

When the population is normally distributed, $\bar X$ is normally distributed, $\bar X \sim N\left(\mu, \frac{\sigma^2}{n}\right)$. Therefore, standardizing $\bar X$ gives the standard normal distribution $Z = \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$.

When the population is not normally distributed and the sample is large enough, according to the Central Limit Theorem (CLT), the sampling distribution of the sample mean still approaches the normal distribution. Therefore, if the population variance $\sigma^2$ is unknown and the sample is large, the standard normal (Z) distribution can still be used.

Although $\sigma^2$ is unknown, we can use the sample variance to estimate the population variance, so $\sigma^2$ is replaced by $s^2$.

【EXAMPLE 11】


II. Small sample

If the population is normally or approximately normally distributed, the population variance $\sigma^2$ is unknown, and the sample is small, the interval is constructed using ________________. Since $\sigma^2$ is unknown, it is replaced by ________, so
the confidence interval for the (1 − α)100% two-tailed interval estimate of μ is:

__________________________________________________

→ $\bar X$, $t_{\alpha/2}$, s, and n must be known

▶Level of confidence:______________

▶Probability interval:$P\left[-t_{\alpha/2} \le \frac{\bar X - \mu}{s/\sqrt{n}} \le t_{\alpha/2}\right]$ = ______________

▶Margin of error:______________________

▶Length of Confidence Interval:$2 \times t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$

Explanation:
(i) When the population is normally distributed, $\bar X$ is normally distributed, $\bar X \sim N\left(\mu, \frac{\sigma^2}{n}\right)$. Therefore, standardizing $\bar X$ gives the standard normal distribution $Z = \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$. Since $\sigma^2$ is unknown, it is replaced by $s^2$. In the case of small samples, $t = \frac{\bar X - \mu}{s/\sqrt{n}}$ follows a t distribution with (n − 1) degrees of freedom. Therefore, when the population is normal, $\sigma^2$ is unknown, and the sample is small, the confidence interval of μ should be derived using the t distribution.

☛ t distribution
• The t distribution is the probability distribution of a random variable ______________. The mean of the t distribution is ________, and its shape is determined by its ______________________________ (df = n − 1).

❖ Degrees of freedom refers to the number of random variables in a statistic that can vary freely. For example, (n − 1) random variables can vary freely in the statistic $S^2 = \frac{\sum_{i=1}^{n}(X_i - \bar X)^2}{n-1}$. Because $S^2$ contains $\bar X$, and $\bar X$ must satisfy the restriction $\bar X = \frac{\sum_{i=1}^{n} X_i}{n}$, the n variables $(X_1, \cdots, X_n)$ are subject to this restriction: only (n − 1) variables $(X_1, \cdots, X_{n-1})$ can vary freely, and the last variable $X_n$ cannot, so $S^2$ has (n − 1) degrees of freedom.

For example: with 4 observations, under the condition that the average is 25 and $X_1 = 20$, $X_2 = 23$, $X_3 = 28$, the fourth value is automatically determined to be 25×4−20−23−28=29, so there are 4−1=3 degrees of freedom.

• Characteristics of the t distribution:
➢ Like the z distribution, the t distribution is a continuous probability distribution.
➢ Like the z distribution, the t distribution is bell-shaped and symmetrical.
➢ There is not a single t distribution but a whole family. The mean of every t distribution is 0, but the standard deviation varies with the sample size n: the smaller the sample size, the larger the standard deviation.
➢ The t distribution is flatter and more spread out than the standard normal (z) distribution. As the sample size increases, the t distribution gets closer to the standard normal distribution, because the error of using s to estimate σ shrinks as the sample size increases.

➢ Since the t distribution is more spread out than the z distribution, the t value for a given confidence level is greater than the z value for the same confidence level.
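The last point can be seen by comparing the two margins of error for the same data. Python's standard library has no t quantile function, so the sketch below takes the critical value from a t table as an input: $t_{0.025}$ with 15 degrees of freedom is 2.131. The sample values (mean 50, s = 8, n = 16) are hypothetical, and `t_interval` is a name introduced here.

```python
from math import sqrt
from statistics import NormalDist

def t_interval(xbar, s, n, t_crit):
    """Two-tailed confidence interval for the mean using a t critical value
    looked up in a t table with n - 1 degrees of freedom."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical sample: n = 16 (df = 15); table value t_{0.025, 15} = 2.131
lower, upper = t_interval(xbar=50.0, s=8.0, n=16, t_crit=2.131)
print(round(lower, 3), round(upper, 3))

# The z critical value for the same 95% level is smaller,
# so the t interval is wider than the z interval would be
z_crit = NormalDist().inv_cdf(0.975)
print(round(z_crit, 3))
```

As the degrees of freedom grow, the table value of t approaches the z value and the two intervals become nearly identical.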

(ii)Explanation:______________________________________________________________


Interval estimation of μ:

Variance | Sample size | Population | CLT | Method
$\sigma^2$ is known | Large sample | Normal distribution | - | Z distribution
$\sigma^2$ is known | Large sample | Non-normal distribution | Applies | Z distribution
$\sigma^2$ is known | Small sample | Normal distribution | - | Z distribution
$\sigma^2$ is known | Small sample | Non-normal distribution | Does not apply | Chebyshev's theorem
$\sigma^2$ is unknown | Large sample | Normal distribution | - | Z distribution, replace $\sigma^2$ with $s^2$
$\sigma^2$ is unknown | Large sample | Non-normal distribution | Applies | Z distribution, replace $\sigma^2$ with $s^2$
$\sigma^2$ is unknown | Small sample | Normal distribution | - | t distribution, replace $\sigma^2$ with $s^2$
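The summary above can also be read as a decision rule. The following sketch encodes it as a small function; the function name, argument names, and return strings are all introduced here for illustration. The combination "σ² unknown, small sample, non-normal population" does not appear in the summary, so the sketch simply reports it as not covered.

```python
def choose_method(sigma_known, large_sample, population_normal):
    """Pick the distribution for the interval estimate of the mean,
    following the summary of cases for known/unknown variance."""
    if sigma_known:
        if large_sample or population_normal:
            return "z"
        return "chebyshev"            # small sample, non-normal, sigma known
    if large_sample:
        return "z with s"             # replace sigma^2 with s^2
    if population_normal:
        return "t with s"             # small sample, normal, sigma unknown
    return "not covered"              # case absent from the summary

print(choose_method(sigma_known=True, large_sample=False, population_normal=False))
print(choose_method(sigma_known=False, large_sample=False, population_normal=True))
```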

【EXAMPLE 12】 A tire manufacturer wishes to investigate the tread life of its tires. A sample
of 10 tires driven 50,000 miles revealed a sample mean of 0.32 inch of tread
remaining with a standard deviation of 0.09 inch. Construct a 95% confidence
interval for the population mean. Would it be reasonable for the manufacturer to
conclude that after 50,000 miles the population mean amount of tread remaining
is 0.30 inch?


【 EXAMPLE 13 】 The U.S. Dairy Industry wants to estimate the mean yearly milk
consumption. A sample of 16 people reveals the mean yearly consumption to be
45 gallons with a standard deviation of 20 gallons. Assume the population
distribution is normal.
a) What is the value of the population mean? What is the best estimate of this
value?
b) Explain why we need to use the t distribution. What assumption do you need
to make?
c) For a 90% confidence interval, what is the value of t?
d) Develop the 90% confidence interval for the population mean.
e) Would it be reasonable to conclude that the population mean is 48 gallons?

【EXAMPLE 14】 The owner of Britten’s Egg Farm wants to estimate the mean number of
eggs produced per chicken. A sample of 20 chickens shows they produced an
average of 20 eggs per month with a standard deviation of 2 eggs per month.
a) What is the value of the population mean? What is the best estimate of this
value?
b) Explain why we need to use the t distribution. What assumption do you need
to make?
c) For a 95% confidence interval, what is the value of t?
d) Develop the 95% confidence interval for the population mean.
e) Would it be reasonable to conclude that the population mean is 21 eggs? What
about 25 eggs?


2 Interval estimation of population proportion (𝐩)


Proportion:The fraction of a sample or population that has a certain characteristic, usually expressed as a fraction, ratio, or percentage.
For example, according to a survey of married men between the ages of 35 and 50, 63 percent believed that husbands and wives should share the cost of living.
Company representatives noted that 45 percent of Burger King's customers use the drive-thru to purchase meals.

A good estimator of the population proportion p is the sample proportion $\hat p = \frac{X}{n}$, where n is the number of trials and X is the number of successes. Interval estimation of the population proportion p varies with the sample size and the probability distribution. Here we discuss only the large-sample case.

Assume that the sample data follow a binomial distribution and ________________________________; then the z distribution can be used to construct the interval estimate of the population proportion p.

The confidence interval for the (1-α)100% two-tailed interval estimate of p:

__________________________________________
▶Level of confidence:(1 − α)

▶Probability interval _______________________________________

▶Margin of error:____________________________
▶Length of Confidence Interval:$2 \times Z_{\alpha/2} \cdot \sqrt{\frac{\hat p(1-\hat p)}{n}}$

Explanation:

(i) The interval estimation for constructing the population proportions must meet the following
conditions:

(a) Two conditions must be met;

(b)__________________________________, it means that the sample is large enough.


According to the ____________________________, the sampling distribution of the sample proportion will be close to the _______________________, with mean _______ and variance $\mathrm{Var}(\hat p) = \sigma_{\hat p}^2$ = ___________________________. The confidence interval can then be calculated using the standard normal (z) distribution.
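A minimal sketch of the large-sample interval for a proportion follows, using the margin of error $Z_{\alpha/2}\sqrt{\hat p(1-\hat p)/n}$ implied by the interval length given above. The counts (120 successes in 400 trials) are hypothetical, `proportion_interval` is a name introduced here, and the sketch does not check the large-sample condition, which must be verified separately.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, conf=0.95):
    """Large-sample two-tailed confidence interval for a population proportion."""
    p_hat = x / n                                    # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)     # critical value Z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)       # margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical data: 120 successes in 400 trials
lower, upper = proportion_interval(x=120, n=400)
print(round(lower, 4), round(upper, 4))
```

A claim about p can then be assessed by checking whether the claimed value falls inside the interval.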


(ii)Derivation of Confidence Interval:

(iii) Explanation: ______________________________________________________________


【EXAMPLE 15】The union representing the Bottle Blowers of America (BBA) is considering
a proposal to merge with the Teamsters Union. At least three-fourths of the BBA
membership must approve any merger. A random sample of 2,000 current
members reveals 1,600 plan to vote for the merger proposal. Develop a 95%
confidence interval for the population proportion. Basing your decision on this
sample information, can you conclude that the necessary proportion of BBA
members favor the merger? Why?

【EXAMPLE 16】Cliff Obermeyer is running for Congress from the 6th District of New Jersey.
Suppose 500 voters are contacted upon leaving the polls and 275 indicate they
voted for Mr. Obermeyer. We will assume that the exit poll of 500 voters is a
random sample of those voting in the 6th District. Construct a 95% confidence
interval for the population proportion. Should Mr. Obermeyer be elected?


Exercise:
1) Mileage tests were conducted on a randomly selected sample of 100 newly developed
automobile tires. The results showed that the mean tread life was 50,000 miles, with a
standard deviation of 3,500 miles. What is the best estimate of the mean tread life in miles for
the entire population of these tires?

2) A random sample of 85 supervisors revealed that they worked an average of 6.5 years
before being promoted. The population standard deviation was 1.7 years. Using the 0.95
degree of confidence, what is the confidence interval for the population mean?

3) There are 2,000 eligible voters in a precinct. A total of 500 voters are randomly selected
and asked whether they plan to vote for the Democratic incumbent or the Republican
challenger. Of the 500 surveyed, 350 said they would vote for the Democratic incumbent.
Using the 0.99 confidence coefficient, what are the confidence limits for the proportion that
plan to vote for the Democratic incumbent?

4) A random sample of 42 college graduates revealed that they worked an average of 5.5
years on the job before being promoted. The sample standard deviation was 1.1 years. Using
the 0.99 degree of confidence, what is the confidence interval for the population mean?

5) A survey of 50 retail stores revealed that the average price of a microwave was $375, with
a sample standard deviation of $20. Assuming the population is normally distributed, what is
the 95% confidence interval to estimate the true cost of the microwave?

6) What is the interpretation of a 96% confidence level?


A) There’s a 96% chance that the given interval includes the true value of the population
parameter.
B) Approximately 96 out of 100 such intervals would include the true value of the population
parameter.
C) There’s a 4% chance that the given interval does not include the true value of the
population parameter.
D) The interval contains 96% of all sample means.


3 Interval Estimation of the Difference Between Two Population Means $\mu_1 - \mu_2$

Let $\mu_1$ and $\mu_2$ be the means of the two populations. To estimate the difference of the two population means $\mu_1 - \mu_2$, we randomly select samples of sizes $n_1$ and $n_2$ from populations 1 and 2, respectively. The two samples are _________________________.
If $\bar X_1$ and $\bar X_2$ are the means of the two samples, a good (point) estimator of the difference of the two population means $\mu_1 - \mu_2$ is ______________.

For example:

Population 1: Inner-City Store Customers | Population 2: Suburban Store Customers
$\mu_1$ = mean age of inner-city store customers | $\mu_2$ = mean age of suburban store customers

$\mu_1 - \mu_2$ = difference between the mean ages

Two Independent Simple Random Samples

Random sample of $n_1$ inner-city store customers | Random sample of $n_2$ suburban store customers
$\bar X_1$ = sample mean age for the inner-city store customers | $\bar X_2$ = sample mean age for the suburban store customers

$\bar X_1 - \bar X_2$ is the point estimator of $\mu_1 - \mu_2$

If both populations are normally distributed, or if they are not but the samples are large
enough for the Central Limit Theorem to apply, the sampling distributions of $\bar{X}_1$ and
$\bar{X}_2$ will approach the ________________________, and the sampling distribution of
$\bar{X}_1 - \bar{X}_2$ will also approach the normal distribution. When the sampling
distribution of $\bar{X}_1 - \bar{X}_2$ is normal, $\bar{X}_1 - \bar{X}_2$ is used to estimate
$\mu_1 - \mu_2$. The form of the interval estimate depends on whether $\sigma_1^2$ and
$\sigma_2^2$ are known, whether the samples are large or small, and whether the samples are
independent (this course only discusses independent samples). This gives four cases, which are
described as follows:


Case 1: $\sigma_1^2$ and $\sigma_2^2$ are known, and the samples are independent.

If both populations are normally distributed, or if they are not but the samples are large
enough for the Central Limit Theorem to apply, and $\sigma_1^2$ and $\sigma_2^2$ are known and
the samples are independent, then use the standard normal distribution to estimate the
confidence interval of $\mu_1 - \mu_2$.

The confidence interval for the (1 − α)100% two-tailed interval estimate of 𝜇1 − 𝜇2 :

______________________________________________________________________
▶Level of confidence:(1 − α)

▶Probability interval:__________________________________________

▶Margin of error:____________________________

Case 2: $\sigma_1^2$ and $\sigma_2^2$ are unknown, the samples are independent, and both are
large samples.

If both populations are normally distributed, or if they are not but the samples are large
enough for the Central Limit Theorem to apply, and $\sigma_1^2$ and $\sigma_2^2$ are unknown and
the samples are independent, the confidence interval of $\mu_1 - \mu_2$ can still be estimated
using the standard normal distribution, with the sample variances $s_1^2$ and $s_2^2$ in place
of $\sigma_1^2$ and $\sigma_2^2$. In this case, the (1 − α)100% two-tailed confidence interval
for $\mu_1 - \mu_2$:

_____________________________________________________________________________


Case 3: $\sigma_1^2$ and $\sigma_2^2$ are unknown and not equal ($\sigma_1^2 \neq \sigma_2^2$),
the samples are independent, and both samples are small.

Assume that both populations are normally distributed, $\sigma_1^2$ and $\sigma_2^2$ are unknown
and not equal ($\sigma_1^2 \neq \sigma_2^2$), the samples are independent, and both samples are
small. Then use the t distribution to estimate the confidence interval of $\mu_1 - \mu_2$.

When the variances of the two normal populations are unknown and not equal,

____________________________ approximately follows a t distribution. The (1 − α)100%
two-tailed confidence interval for $\mu_1 - \mu_2$:

Case 4: $\sigma_1^2$ and $\sigma_2^2$ are unknown and equal ($\sigma_1^2 = \sigma_2^2$), the
samples are independent, and both samples are small.

Assume that both populations are normally distributed, $\sigma_1^2$ and $\sigma_2^2$ are unknown
and equal ($\sigma_1^2 = \sigma_2^2$), the samples are independent, and both samples are small.
Then use the t distribution to estimate the confidence interval of $\mu_1 - \mu_2$.

When the variances of the two normal populations are unknown and equal, that is,
$\sigma_1^2 = \sigma_2^2 = \sigma^2$,

then ______________________________________________________________________
★The common variance $\sigma^2$ is estimated by pooling the variances of the two samples:

____________________________
The (1 − α)100% two-tailed confidence interval for $\mu_1 - \mu_2$:

________________________________________________________

★Whether the variances of the two normal populations are equal can be tested with the F
statistic.
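
The pooled-variance calculation in Case 4 can be sketched as follows. This is a minimal illustration, not the handout's worked solution: the function names and the data are hypothetical, and the t critical value is passed in by hand (2.086 is the usual table value for $t_{0.025}$ with 20 degrees of freedom).

```python
from math import sqrt

def pooled_variance(n1, s1_sq, n2, s2_sq):
    """Pooled estimate of the common variance and its degrees of freedom."""
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    return sp_sq, df

def pooled_interval(xbar1, xbar2, n1, s1_sq, n2, s2_sq, t_crit):
    """Two-tailed interval for mu1 - mu2 under equal, unknown variances."""
    sp_sq, _ = pooled_variance(n1, s1_sq, n2, s2_sq)
    margin = t_crit * sqrt(sp_sq * (1 / n1 + 1 / n2))
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# Hypothetical data: n1 = 10, s1^2 = 4.0, n2 = 12, s2^2 = 6.0, sample means 20 and 18.
sp_sq, df = pooled_variance(10, 4.0, 12, 6.0)
low, high = pooled_interval(20.0, 18.0, 10, 4.0, 12, 6.0, t_crit=2.086)
print(f"sp^2 = {sp_sq:.2f}, df = {df}, 95% CI = ({low:.3f}, {high:.3f})")
```

Passing the critical value as an argument keeps the sketch within the standard library; a statistics package could supply the t quantile instead.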

【EXAMPLE 17】
                                 Inner-City Store      Suburban Store
Sample size                      $n_1 = 36$            $n_2 = 49$
Sample mean                      $\bar{x}_1 = 40$      $\bar{x}_2 = 40$
Population standard deviation    $\sigma_1 = 9$        $\sigma_2 = 10$
Suppose both populations have a normal distribution and are independent. Find the 95%
confidence interval estimate of the difference between the two population means.
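
A Case 1 calculation (known population standard deviations) can be set up as in the sketch below, using the figures given in Example 17. The function name is illustrative, and the critical value comes from Python's standard-library `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_two_means(xbar1, xbar2, sigma1, sigma2, n1, n2, conf=0.95):
    """Two-tailed z interval for mu1 - mu2 with known population standard deviations."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# Figures as listed in Example 17.
low, high = z_interval_two_means(40, 40, 9, 10, 36, 49)
print(f"95% CI for mu1 - mu2: ({low:.2f}, {high:.2f})")
```

With these inputs the point estimate is 0 and the margin of error is about 4.06, so the interval is roughly (−4.06, 4.06).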

【EXAMPLE 18】
                             Cherry Grove            Beechmont
Sample size                  $n_1 = 28$              $n_2 = 22$
Sample mean                  $\bar{x}_1 = 1025$      $\bar{x}_2 = 910$
Sample standard deviation    $s_1 = 150$             $s_2 = 125$
Suppose both populations have a normal distribution and are independent. Find the 95%
confidence interval estimate of the difference between the two population means.
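
If Example 18 is treated under Case 3 (unequal, unknown variances), the degrees of freedom come from the approximation listed in the Summary. The sketch below assumes that treatment; the function names are illustrative, and the t critical value (2.012 for $t_{0.025}$ with 47 degrees of freedom, from a t table) is supplied by hand.

```python
from math import sqrt

def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom when the two variances are not assumed equal."""
    a, b = s1**2 / n1, s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

def welch_interval(xbar1, xbar2, s1, n1, s2, n2, t_crit):
    """Two-tailed t interval for mu1 - mu2 with separate variance estimates."""
    margin = t_crit * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# Figures as listed in Example 18.
df = welch_df(150, 28, 125, 22)          # about 47.8, rounded down to 47 for the table
low, high = welch_interval(1025, 910, 150, 28, 125, 22, t_crit=2.012)
print(f"df = {df:.1f}, 95% CI = ({low:.1f}, {high:.1f})")
```

Rounding the degrees of freedom down gives a slightly larger critical value and therefore a slightly wider, more conservative interval.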


4 Interval estimation of the difference between two population proportions (𝒑𝟏 − 𝒑𝟐)

Suppose $p_1$ and $p_2$ are the proportions of the two populations that have a certain
characteristic. To estimate the difference between the two population proportions,
$p_1 - p_2$, we randomly select samples of sizes $n_1$ and $n_2$ from populations 1 and 2,
respectively. The two samples are ____________________________. If $\hat{p}_1$ and $\hat{p}_2$
are the proportions of the two samples that have the characteristic, then a good (point)
estimator of the difference between the two population proportions $p_1 - p_2$ is ______________.

From the sampling distribution of 𝑝̂1 − 𝑝̂2 , the confidence interval of 𝑝1 − 𝑝2 can be
determined. If both 𝑛1 and 𝑛2 are large samples, then according to the Central Limit
Theorem, the sampling distribution of 𝑝̂1 − 𝑝̂2 is close to the normal distribution, and the
confidence interval of 𝑝1 − 𝑝2 can be estimated by the _______________________________,
z distribution.

The confidence interval of the two-tailed interval estimates of (1 − α)100% of 𝑝1 − 𝑝2 :

_____________________________________________________________________________

【EXAMPLE 19】A tax preparation firm is interested in comparing the quality of work at two
of its regional offices. By randomly selecting samples of tax returns prepared at
each office and verifying the sample returns’ accuracy, the firm will be able to
estimate the proportion of erroneous returns prepared at each office. Of particular
interest is the difference between these proportions. The independent simple
random samples from the two offices provide the following information.
                                 Office 1        Office 2
Sample size                      $n_1 = 250$     $n_2 = 300$
Number of returns with errors    $x_1 = 35$      $x_2 = 27$
Find the 90% confidence interval for the difference between two population
proportions.
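
The interval for $p_1 - p_2$ can be sketched as below with the counts from Example 19. The function name is illustrative, and the large-sample normal approximation described above is assumed.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, conf=0.90):
    """Two-tailed z interval for p1 - p2 based on two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - margin, diff + margin

# Counts as listed in Example 19.
low, high = two_proportion_interval(35, 250, 27, 300)
print(f"90% CI for p1 - p2: ({low:.4f}, {high:.4f})")
```

Here the sample proportions are 0.14 and 0.09, so the point estimate is 0.05 and the interval is roughly (0.005, 0.095).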


Summary:
• Confidence interval limits for population mean, $\sigma^2$ is known: $\bar{X} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$

• Confidence interval limits for population mean, $\sigma^2$ is unknown, large sample size: $\bar{X} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$

• Confidence interval limits for population mean, $\sigma^2$ is unknown, small sample size: $\bar{X} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$

• Confidence interval limits for population proportion: $\hat{p} \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

• Confidence interval limits for $\mu_1 - \mu_2$, $\sigma_1^2$ and $\sigma_2^2$ are known, independent samples:

$(\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

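• Confidence interval limits for $\mu_1 - \mu_2$, $\sigma_1^2$ and $\sigma_2^2$ are unknown; independent samples; unequal variances:

$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ with $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$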

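• Confidence interval limits for $\mu_1 - \mu_2$, $\sigma_1^2$ and $\sigma_2^2$ are unknown; independent samples; equal variances:

$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \cdot \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ with $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$, $df = n_1 + n_2 - 2$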

• Confidence interval limits for $p_1 - p_2$: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$

Finite Population:
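• Confidence interval limits for population mean, $\sigma^2$ is known: $\bar{X} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$

• Confidence interval limits for population mean, $\sigma^2$ is unknown, large sample size: $\bar{X} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$

• Confidence interval limits for population mean, $\sigma^2$ is unknown, small sample size: $\bar{X} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$

• Confidence interval limits for population proportion: $\hat{p} \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \sqrt{\frac{N-n}{N-1}}$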

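• Confidence interval limits for $\mu_1 - \mu_2$, $\sigma_1^2$ and $\sigma_2^2$ are known, independent samples:

$(\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} \cdot \frac{N_1 - n_1}{N_1 - 1} + \frac{\sigma_2^2}{n_2} \cdot \frac{N_2 - n_2}{N_2 - 1}}$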

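• Confidence interval limits for $p_1 - p_2$: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} \cdot \frac{N_1 - n_1}{N_1 - 1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2} \cdot \frac{N_2 - n_2}{N_2 - 1}}$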


【EXAMPLE 20】There are 250 families residing in Scandia, Pennsylvania. A random
sample of 40 of these families revealed the mean annual church contribution was
$450 and the standard deviation of this was $75.
a) What is the population mean? What is the best estimate of the population
mean?
b) Develop a 90% confidence interval for the population mean.
c) Using the confidence interval, explain why the population mean could be
$445. Could the population mean be $425? Why?
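
The finite-population interval can be sketched as below with the figures from Example 20. This is one possible setup, not the handout's official solution: it assumes the Summary's large-sample rule (n = 40, so z is used with s in place of σ), and the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def fpc(N, n):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

def finite_population_mean_interval(xbar, s, n, N, conf=0.90):
    """Large-sample z interval for the mean of a finite population of size N."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * s / sqrt(n) * fpc(N, n)
    return xbar - margin, xbar + margin

# Figures as listed in Example 20: N = 250 families, n = 40, mean $450, s = $75.
low, high = finite_population_mean_interval(450, 75, 40, 250)
print(f"FPC = {fpc(250, 40):.4f}, 90% CI = ({low:.2f}, {high:.2f})")
```

Under these assumptions the interval is roughly ($432.09, $467.91). Using a t critical value instead of z would widen it slightly.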


Exercise:
1) A sample of 25 is selected from a known population of 100 elements. What is the finite
population correction factor?

2) A sample of 100 is selected from a known population of 350 elements. The population
standard deviation is 15. Using the finite correction factor, what is the standard error of the
sample means?

3) A survey of an urban university (population of 25,450) showed that 750 of 1,100 students
sampled attended a home football game during the season. Using the 99% level of
confidence, what is the confidence interval for the proportion of students attending a football
game?

4) The following results come from two independent random samples taken of two
populations.
                                 Sample 1               Sample 2
Sample size                      $n_1 = 50$             $n_2 = 35$
Sample mean                      $\bar{x}_1 = 13.6$     $\bar{x}_2 = 11.6$
Population standard deviation    $\sigma_1 = 2.2$       $\sigma_2 = 3.0$
If the difference of two sample means is normally distributed, find the 95% confidence
interval estimate for the difference between the two population means.

5) The following results come from two independent random samples taken of two
populations.
                             Sample 1               Sample 2
Sample size                  $n_1 = 20$             $n_2 = 30$
Sample mean                  $\bar{x}_1 = 22.5$     $\bar{x}_2 = 20.1$
Sample standard deviation    $s_1 = 2.5$            $s_2 = 4.8$
If the difference of two sample means is normally distributed, find the 95% confidence
interval estimate for the difference between the two population means.


5 Interval estimation of population variance $\sigma^2$


Suppose the population is normally distributed, $N(\mu, \sigma^2)$, where $\mu$ and $\sigma^2$
are unknown parameters, and a sample $(X_1, X_2, \cdots, X_n)$ is selected randomly from the
population. When estimating the population variance, a good point estimator is the sample
variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$, and
____________________________, it is called ______________ and follows a chi-squared
distribution with ___________ degrees of freedom; ______________ is a “chi-squared” random
variable with $(n - 1)$ degrees of freedom.

The confidence interval for 100(1 − 𝛼)% two-tailed interval estimate of $\sigma^2$:

__________________________________________
The confidence interval for 100(1 − 𝛼)% two-tailed interval estimate of σ:

__________________________________________

Explanation:

(i) $\chi^2$ distribution (chi-squared distribution)

Let $Z_1, Z_2, \cdots, Z_\upsilon$ be $\upsilon$ independent standard normal random variables
and let $W = Z_1^2 + Z_2^2 + \cdots + Z_\upsilon^2 = \sum_{i=1}^{\upsilon} Z_i^2$. Then W follows
a chi-square distribution with $\upsilon$ degrees of freedom, usually written
$W \sim \chi^2_\upsilon$ or $\chi^2(\upsilon)$.
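
The definition can be illustrated by simulation: summing squared standard normal draws should give values whose mean is close to the degrees of freedom and whose variance is close to twice the degrees of freedom (standard properties of the chi-square distribution). The sketch below is an illustration only; the function name, the seed, and the choice of 5 degrees of freedom are assumptions.

```python
import random
from statistics import fmean, pvariance

def simulate_chi_square(df, draws=20000, seed=0):
    """Simulate W = Z_1^2 + ... + Z_df^2 from independent standard normal draws."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(draws)]

w = simulate_chi_square(df=5)
print(f"simulated mean     = {fmean(w):.2f}  (theory: 5)")
print(f"simulated variance = {pvariance(w):.2f}  (theory: 10)")
```

Every simulated value is positive, and a histogram of `w` would show the long right tail described in the properties listed next.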

• The chi-square distribution is a right-skewed distribution defined only for values greater
than 0, and different degrees of freedom determine different chi-square distributions.

• The chi-square distribution has only one parameter, its degrees of freedom $\upsilon$.
• As the degrees of freedom increase, the chi-square distribution tends to be symmetrical;
as the degrees of freedom approach infinity, the standardized chi-square distribution
approaches the standard normal distribution.
• Look-up table: The $\chi^2_\alpha$ value of ____________________________ displayed in the
chi-square distribution table.
(i) υ = 5, α = 0.1 → ____________________________ (one-tailed)

(ii) υ = 17, α = 0.05 → __________________________________________ (two-tailed)


(ii) The sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ does not itself
follow a chi-square distribution. If the sample variance is multiplied by ___________, that is,
____________________________, then $\frac{(n-1)s^2}{\sigma^2}$ follows a chi-square
distribution with $n - 1$ degrees of freedom, that is, ________________________.

★Use $\frac{(n-1)s^2}{\sigma^2}$ to find the confidence interval of $\sigma^2$

(iii)Derivation of Confidence Interval:
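
One way to lay out this derivation is sketched below. It assumes that $\chi^2_{\alpha, n-1}$ denotes the value with area $\alpha$ to its right (the upper-tail convention); if the table in use is indexed by left-tail area, the two critical values swap roles.

```latex
\begin{align*}
P\!\left(\chi^2_{1-\alpha/2,\,n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2,\,n-1}\right) &= 1-\alpha \\
P\!\left(\frac{1}{\chi^2_{\alpha/2,\,n-1}} \le \frac{\sigma^2}{(n-1)s^2} \le \frac{1}{\chi^2_{1-\alpha/2,\,n-1}}\right) &= 1-\alpha \\
P\!\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right) &= 1-\alpha
\end{align*}
```

Taking reciprocals in the second step reverses the inequalities, which is why the larger critical value ends up in the lower limit. Taking square roots of both limits gives the corresponding interval for σ.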

【EXAMPLE 21】A large candy manufacturer produces, packages, and sells packs of candy
targeted to weigh 52 grams. A quality control manager working for the company
was concerned that the variation in the actual weights of the targeted 52-gram packs
was larger than acceptable. That is, he was concerned that some packs weighed
significantly less than 52 grams and some weighed significantly more than 52
grams. In an attempt to estimate $\sigma^2$, the variance of the weights of all of the
52-gram packs the manufacturer makes, he took a random sample of n = 10 packs off
of the factory line. The random sample yielded a sample variance of 4.2 (grams squared).
Use the random sample to derive a 95% confidence interval for $\sigma^2$.
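
A calculation like Example 21 can be sketched as follows. The chi-square critical values are read from a table rather than computed (19.023 and 2.700 are the usual table values for 9 degrees of freedom with upper-tail areas 0.025 and 0.975); the function and parameter names are illustrative.

```python
def variance_interval(n, s_sq, chi2_upper, chi2_lower):
    """Two-tailed interval for sigma^2 given chi-square critical values.

    chi2_upper: critical value with area alpha/2 to its right (the larger value).
    chi2_lower: critical value with area 1 - alpha/2 to its right (the smaller value).
    """
    df = n - 1
    return df * s_sq / chi2_upper, df * s_sq / chi2_lower

# Figures as listed in Example 21: n = 10, s^2 = 4.2, 95% confidence.
low, high = variance_interval(n=10, s_sq=4.2, chi2_upper=19.023, chi2_lower=2.700)
print(f"95% CI for sigma^2: ({low:.2f}, {high:.2f})")
```

With these inputs the interval is roughly (1.99, 14.0). It is not symmetric around 4.2 because the chi-square distribution is skewed.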


6 Interval Estimation of the Ratio of Two Population Variances ($\frac{\sigma_1^2}{\sigma_2^2}$)

Assume that both populations are normally distributed, $X \sim N(\mu_1, \sigma_1^2)$ and
$Y \sim N(\mu_2, \sigma_2^2)$, where $\mu_1$, $\mu_2$ and the population variances $\sigma_1^2$,
$\sigma_2^2$ are unknown parameters. A sample is randomly selected from each population, the
samples are independent, the sample sizes are $n_1$ and $n_2$, the sample variances are $s_1^2$
and $s_2^2$, and the degrees of freedom are $\upsilon_1 = n_1 - 1$ and $\upsilon_2 = n_2 - 1$.
Then the interval estimation of the ratio of the two population variances
$\frac{\sigma_1^2}{\sigma_2^2}$ can be carried out through the _____________________.

The confidence interval for 100(1 − 𝛼)% two-tailed interval estimate of $\frac{\sigma_1^2}{\sigma_2^2}$:

______________________________________________________________________

Explanation:

(i) The ratio of the two population variances can be estimated by the ________________________;

the F statistic is the ratio of two independent chi-square statistics, each divided by its
degrees of freedom.

☛ F distribution

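Assume two independent random samples $X_1, X_2, \cdots, X_{n_1} \sim N(\mu_1, \sigma_1^2)$ and
$Y_1, Y_2, \cdots, Y_{n_2} \sim N(\mu_2, \sigma_2^2)$, where $S_1^2$ and $S_2^2$ are the sample
variances. Then $\frac{(n_1 - 1)S_1^2}{\sigma_1^2} \sim \chi^2_{n_1 - 1}$ and
$\frac{(n_2 - 1)S_2^2}{\sigma_2^2} \sim \chi^2_{n_2 - 1}$.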

The formula for F distribution:

(ii) Derivation of Confidence Interval:
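
One way to lay out this derivation is sketched below. It assumes that $F_{\alpha}(\upsilon_1, \upsilon_2)$ denotes the value with area $\alpha$ to its right, and it uses the standard identity $F_{1-\alpha}(\upsilon_1, \upsilon_2) = 1 / F_{\alpha}(\upsilon_2, \upsilon_1)$.

```latex
\begin{align*}
F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} &\sim F(\upsilon_1, \upsilon_2),
  \qquad \upsilon_1 = n_1 - 1,\ \upsilon_2 = n_2 - 1 \\
P\!\left(F_{1-\alpha/2}(\upsilon_1,\upsilon_2) \le \frac{S_1^2}{S_2^2}\cdot\frac{\sigma_2^2}{\sigma_1^2}
  \le F_{\alpha/2}(\upsilon_1,\upsilon_2)\right) &= 1-\alpha \\
P\!\left(\frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{\alpha/2}(\upsilon_1,\upsilon_2)} \le \frac{\sigma_1^2}{\sigma_2^2}
  \le \frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{1-\alpha/2}(\upsilon_1,\upsilon_2)}\right) &= 1-\alpha \\
P\!\left(\frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{\alpha/2}(\upsilon_1,\upsilon_2)} \le \frac{\sigma_1^2}{\sigma_2^2}
  \le \frac{S_1^2}{S_2^2}\cdot F_{\alpha/2}(\upsilon_2,\upsilon_1)\right) &= 1-\alpha
\end{align*}
```

The last line is convenient with tables that list only upper-tail F values, since both limits then use right-tail critical values.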


【EXAMPLE 22】The following summary statistics were obtained on the size, in
millimeters, of the prey of the two species:

                   Adult Dinopis                 Adult Menneus
Sample size        $n_1 = 10$                    $n_2 = 10$
Sample mean        $\bar{x}_1 = 10.26$ mm        $\bar{x}_2 = 9.02$ mm
Sample variance    $s_1^2 = 2.51^2$              $s_2^2 = 1.90^2$
Estimate, with 95% confidence, the ratio of the two population variances.

(2) Sample size determination

Sample size is an important factor when using confidence intervals. To obtain good estimates
of population parameters, the choice of sample size is important. In general, we decide the
sample size based on the following three variables:

(i) the researcher chooses a tolerable margin of error

★There is a tradeoff between the margin of error and the sample size: a small margin of error
requires a larger sample, which costs more money and time to collect. A large margin of error
allows for a smaller sample but gives a wider confidence interval. Therefore, instead of
choosing the smallest error possible, the researcher chooses a tolerable margin of error.

(ii) The confidence level, for example, 95%;



★For a given margin of error, a higher level of confidence requires a larger sample (that is,
more time and money to collect the sample).

(iii) Degree of dispersion or variation in the population

★The greater the dispersion of the population, the larger the required sample size; the more
concentrated the population, the smaller the required sample size.

→ Goal: Find an appropriate sample size for a given margin of error, confidence level (1 − α),
and variance.

1 Sample size to estimate a population mean



Maximum likely error/margin of error/sampling error:

$E = z \cdot \frac{\sigma}{\sqrt{n}}$ → ___________________________
n: sample size
$\sigma$: population standard deviation (If the population standard deviation is unknown, the
sample standard deviation (s) is used instead)
z: the value of the standard normal distribution at the chosen level of confidence
E: maximum allowable error

Results calculated using formulas are not always integers. When the result is not an integer, we
round up to obtain ____________________________. For example, if the result is 201.21,
round up to 202 unconditionally.

2 Sample size to estimate a population proportion


$E = z \sqrt{\frac{p(1-p)}{n}}$ → ____________________________

n: sample size
$p$: population proportion (If the population proportion is unknown, use ______________ instead,
because $p(1-p)$ is the largest at $p = 0.5$)
z: the value of the standard normal distribution at the chosen level of confidence
E: maximum allowable error


【EXAMPLE 23】A student in public administration wants to estimate the mean monthly
earnings of city council members in large cities. She can tolerate a margin of error
of $100 in estimating the mean. She would also prefer to report the interval
estimate with a 95% level of confidence. The student found a report by the
Department of Labor that reported a standard deviation of $1,000. What is the
required sample size?
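
Solving $E = z\sigma/\sqrt{n}$ for n and rounding up can be sketched as below with the figures from Example 23. The function name is illustrative, and the critical value comes from the standard-library `statistics.NormalDist` rather than a table.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, conf=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin, always rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z * sigma / margin) ** 2)

# Figures as listed in Example 23: sigma = $1,000, E = $100, 95% confidence.
n = sample_size_for_mean(sigma=1000, margin=100, conf=0.95)
print(f"required sample size: {n}")
```

The unrounded result is about 384.1, so rounding up gives 385.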

【EXAMPLE 24】A student wants to estimate the proportion of cities that have private refuse
collectors. The student wants to estimate the population proportion within a
margin of error of 0.1, prefers a level of confidence of 90%, and has no estimate
for the population proportion. What is the required sample size?
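
The same rearrangement for a proportion can be sketched with the figures from Example 24. With no prior estimate, the sketch uses p = 0.5, the value at which $p(1-p)$ is largest; the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, conf=0.90, p=0.5):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= margin, always rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(z**2 * p * (1 - p) / margin**2)

# Figures as listed in Example 24: E = 0.1, 90% confidence, no estimate of p.
n = sample_size_for_proportion(margin=0.1, conf=0.90)
print(f"required sample size: {n}")
```

The unrounded result is about 67.6, so rounding up gives 68. Supplying a prior estimate of p other than 0.5 would reduce the required size.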


Exercise:
1) The mean number of travel days per year for salespeople employed by three hardware
distributors needs to be estimated with a 0.90 degree of confidence. For a small pilot study,
the mean was 150 days and the standard deviation was 14 days. If the population mean is
estimated within two days, how many salespeople should be sampled?

2) A research firm needs to estimate within 3% the proportion of junior executives leaving
large manufacturing companies within three years. A 0.95 degree of confidence is to be used.
Several years ago, a study revealed that 21% of junior executives left their company within
three years. To update this study, how many junior executives should be surveyed?

3) Assume that the number of days needed to hatch an egg of a certain type of rare lizard is
normally distributed. Using an incubator, we were able to hatch 13 eggs from different nests
separately. We have a sample mean of 18.97 weeks with a sample variance of 10.7
(i.e., $s^2 = 10.7$). What is the 90% confidence interval for the population variance?
