Topic 9 Estimation (No Answer)
Topic 9 Estimation
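[Figure: sampling leads from the population (parameters) to the sample (statistics); inference leads from the sample statistics back to the population parameters]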
▲ Inferences are based on data and methods, and those without evidence are
conjectures.
▲The most commonly inferred population parameters are the mean and the standard deviation.
II. Point Estimation
III. Interval Estimation
③ Interval estimation of the difference between two population means ($\mu_1 - \mu_2$)
Case 2: $\sigma_1^2$ and $\sigma_2^2$ are unknown, the samples are independent, and both samples are large.
Case 3: $\sigma_1^2$ and $\sigma_2^2$ are unknown and unequal ($\sigma_1^2 \ne \sigma_2^2$), the samples are independent, and both samples are small.
Case 4: $\sigma_1^2$ and $\sigma_2^2$ are unknown and equal ($\sigma_1^2 = \sigma_2^2$), the samples are independent, and both samples are small.
⑥ Interval estimation of the ratio of two population variances ($\sigma_1^2 / \sigma_2^2$)
Academic Year 112, Department of Business Administration, Statistics  Topic 9: Estimation  Instructor: 林晉禾
1. ______________________ (Topic 9)
☛Terminology
② Estimator: A formula (such as a sample statistic) that generates estimated values from sample data is called an estimator. The statistic used to estimate a population parameter is often called the estimator.
③ Estimate: The actual value calculated by substituting the sample observations into the estimator is called the estimate.
☛The presentation methods of estimation are point estimation and interval estimation.
When making estimates, we select a random sample from the population and get the
estimates of the parameters of the population from the sample. There are two ways to
present estimates:
Point estimation:Calculate the sample statistic based on the sample, and use the sample
statistic as the estimated value of the population parameter. [point estimation as a
_____________]
Interval estimation:An interval with upper and lower bounds is estimated for the
unknown population parameter, and the reliability of the interval containing the population
parameter is indicated. [Interval estimate is an _________ on the real line]
II. Point Estimation
Definition:A random sample (𝑋1 , 𝑋2 , ⋯ , 𝑋𝑛 ) of size n is drawn from the population, and the sample statistic obtained from it is used as the estimated value of the population parameter.
▶Point estimator
▶Point estimate
The observed values of the random sample (𝑋1 , 𝑋2 , ⋯ , 𝑋𝑛 ) are substituted into the function 𝜃̂ = W(𝑋1 , 𝑋2 , ⋯ , 𝑋𝑛 ); the resulting real value is a ______________, which is a
______________.
For example:When the sample mean 𝑋̅ is used to estimate the population mean μ, the sample mean 𝑋̅ is an estimator. Substituting the sample observations into the estimator gives a specific value, such as 𝑥̅0 = 30; this value 30 is the estimate.
Limitation:Although point estimation is simple, it cannot indicate the accuracy of the estimate. The estimated value varies with different ____________, and some estimation error is inevitable, which leaves decision makers uneasy. This is why interval estimation is used.
Explanation:
① Select a representative sample.
② Choose a better sample statistic as the estimator.
③ Calculate the value of the sample statistic (the estimate).
④ Use the value of the sample statistic (the estimate) to infer the value of the population parameter and make a decision.
✎Review
Population Parameter | Sample Statistic
Population mean | Sample mean
Ex: The mean house value for all houses in Kaohsiung | Ex: The mean house value for a sample of n = 200 houses in Kaohsiung
Population proportion | Sample proportion
Ex: The proportion of all houses in Kaohsiung with lead paint | Ex: The proportion of houses with lead paint in a sample of n = 200 Kaohsiung houses
【EXAMPLE 1】Chinho wants to buy a 25~30 square feet house in Kaohsiung City, so he wants to know the general (average) housing price in Kaohsiung City. What can he do? ______________
Because the sample is selected at random, a statistic computed from one particular sample may or may not be an accurate estimate of the population parameter.
Suppose we know that the true average price of 25~30 square feet houses in Kaohsiung City is 10 million NT dollars (1,000 in units of NT$10,000). Then the estimated value 𝑋̅ = 935.03 in the above example clearly misses the true value.
If another sample is selected and the resulting estimate is 𝑋̅ = 1000, then this estimate hits the population parameter exactly.
In general, a point estimate almost always differs from the true value of the population parameter. Knowing only the point estimate of a population parameter tells us nothing about the estimation error (that is, the accuracy of the estimate), which is the motivation behind interval estimation.
In theory, there are many possible estimators for a population parameter. For example, to
estimate the population mean, possible point estimators are sample mean, sample median,
sample mode, etc.
☛How to choose a better point estimator?
If a good estimator can be selected, no matter what kind of sample appears, ______________,
a good estimator can bring the estimate closer to the true value of the population parameter, or make the error smaller. The criteria for choosing good estimators are:
(1) Unbiasedness; (2) Efficiency; (3) Consistency; (4) Sufficiency.
(1) Unbiasedness
Definition:If the expected value of an estimator 𝜃̂ (usually a statistic) is equal to the population parameter θ, that is, E(𝜃̂) = θ, then this estimator is called an __________ estimator of the population parameter being estimated.
Explanation:
② Biased estimator:
【EXAMPLE 2】Prove that both $\bar{X} = \frac{\sum X_i}{n}$ and $\frac{\sum a_i X_i}{\sum a_i}$ ($a_i$ is constant) are unbiased estimators of the population mean μ, that is, prove $E(\bar{X}) = \mu$ and $E\left(\frac{\sum a_i X_i}{\sum a_i}\right) = \mu$.
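The claim in this example can also be checked numerically before proving it. The following is a minimal simulation sketch, not a proof: it assumes a normal population with μ = 10 and σ = 2, a sample size of 5, and the arbitrary weights 1 to 5 (all of these values are illustrative choices, not part of the handout). Averaging each estimator over many samples approximates its expected value.

```python
import random
import statistics

def weighted_mean(xs, weights):
    """Weighted average sum(a_i * x_i) / sum(a_i) with fixed constants a_i."""
    return sum(a * x for a, x in zip(weights, xs)) / sum(weights)

def average_estimates(mu=10.0, sigma=2.0, n=5, reps=20000, seed=0):
    """Average both estimators over many random samples (illustrative settings)."""
    rng = random.Random(seed)
    weights = [1, 2, 3, 4, 5]  # arbitrary constants a_i, chosen only for illustration
    plain, weighted = [], []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        plain.append(statistics.fmean(xs))
        weighted.append(weighted_mean(xs, weights))
    return statistics.fmean(plain), statistics.fmean(weighted)

avg_plain, avg_weighted = average_estimates()
print(avg_plain, avg_weighted)  # both averages should land close to mu = 10
```

Both long-run averages settle near μ, which is what unbiasedness predicts; the two estimators still differ in how much they vary from sample to sample.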
(2) Efficiency
There may be many unbiased estimators (such as those in 【EXAMPLE 2】). Therefore, in addition to unbiasedness, other criteria are needed to choose among them, and efficiency is one of them.
Definition:Let 𝜃̂ (usually a statistic) be an estimator of the population parameter θ. If the mean square error (MSE) of 𝜃̂, $\mathrm{MSE}(\hat\theta) = E\left[(\hat\theta - \theta)^2\right]$, is the __________
If $\hat\theta_1$ and $\hat\theta_2$ are estimators of θ, and $\mathrm{MSE}(\hat\theta_1)$ is less than $\mathrm{MSE}(\hat\theta_2)$, that is,
$\mathrm{eff}(\hat\theta_1, \hat\theta_2) = \frac{\mathrm{MSE}(\hat\theta_2)}{\mathrm{MSE}(\hat\theta_1)} > 1$, then $\hat\theta_1$ is said to have ________________ to $\hat\theta_2$ in estimating θ.
Explanation:
① The efficiency of an estimator is measured by its MSE. The smaller the MSE, the higher the efficiency of the estimator. There are absolute efficiency and relative efficiency.
estimating θ.
【EXAMPLE 4】If the population is normally distributed N(μ, 𝜎 2 ), prove that the sample mean 𝑋̅ and the sample median are both unbiased estimators of the population mean, and that the sample mean 𝑋̅ is the relatively more efficient estimator.
eff(𝜃̂1 , 𝜃̂2 ) .
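As a numerical companion to this example, the sketch below estimates the MSE of the sample mean and of the sample median by simulation and forms their ratio, the relative efficiency eff(mean, median) = MSE(median) / MSE(mean). It assumes a standard normal population and samples of size 25; these settings and the function name are illustrative assumptions only.

```python
import random
import statistics

def compare_mean_median(mu=0.0, sigma=1.0, n=25, reps=5000, seed=1):
    """Simulated MSE of the sample mean and the sample median for normal data."""
    rng = random.Random(seed)
    sq_err_mean, sq_err_median = [], []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        sq_err_mean.append((statistics.fmean(xs) - mu) ** 2)
        sq_err_median.append((statistics.median(xs) - mu) ** 2)
    return statistics.fmean(sq_err_mean), statistics.fmean(sq_err_median)

mse_mean, mse_median = compare_mean_median()
# A ratio above 1 means the sample mean is the more efficient of the two.
relative_efficiency = mse_median / mse_mean
print(mse_mean, mse_median, relative_efficiency)
```

A ratio clearly above 1 is consistent with the statement that the sample mean is relatively more efficient for normal data; the simulation illustrates the result but does not replace the proof.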
(3) Consistency
Definition:When the sample size n approaches ∞, if the probability that the absolute difference between the estimator 𝜃̂ (usually a statistic) and the population parameter θ is less than an arbitrarily small positive number ε tends to 1, that is, $\lim_{n\to\infty} P(|\hat\theta - \theta| < \varepsilon) = 1$, or equivalently $\lim_{n\to\infty} P(|\hat\theta - \theta| \ge \varepsilon) = 0$ for any ε > 0, then this estimator 𝜃̂ is called a consistent estimator of θ.
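Explanation:
① Theorem: An unbiased estimator $\hat\theta$ for θ is consistent if $\lim_{n\to\infty} \mathrm{Var}(\hat\theta) = 0$.
【unbiased + $\lim_{n\to\infty} \mathrm{Var}(\hat\theta) = 0$ → $\hat\theta$ is consistent】
Proof:For a nonnegative random variable Y, $P(Y \ge b^2) \le \frac{E(Y)}{b^2}$.
☛This proof requires the use of Chebyshev’s Inequality:$P(|X - E(X)| \ge b) \le \frac{\mathrm{Var}(X)}{b^2}$
$P(|\hat\theta - \theta| \ge \varepsilon) = P\left((\hat\theta - \theta)^2 \ge \varepsilon^2\right) \le \frac{E\left[(\hat\theta - \theta)^2\right]}{\varepsilon^2} = \frac{\mathrm{Var}(\hat\theta) + \left[E(\hat\theta) - \theta\right]^2}{\varepsilon^2}$
∵ $\lim_{n\to\infty} E(\hat\theta) = \theta$ and $\lim_{n\to\infty} \mathrm{Var}(\hat\theta) = 0$, the right-hand side tends to 0, so $\lim_{n\to\infty} P(|\hat\theta - \theta| \ge \varepsilon) = 0$.
② If $\hat\theta_1$ and $\hat\theta_2$ are both unbiased and consistent estimators of θ, how to choose a suitable estimator?
$E(\hat p) = p$, $V(\hat p) = \frac{p(1-p)}{n}$
$E(s^2) = \sigma^2$, $V(s^2) = \frac{2\sigma^4}{n-1}$
estimator of $\sigma^2$.
In general, a sample statistic that is computed in the same way as the corresponding population parameter is a consistent estimator of that parameter.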
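The definition of consistency can be illustrated by simulation. The sketch below estimates P(|𝑋̅ − μ| ≥ ε) for the sample mean at several sample sizes; it assumes a normal population with μ = 5 and σ = 3 and a tolerance ε = 0.5, all chosen only for illustration.

```python
import random
import statistics

def miss_rate(n, mu=5.0, sigma=3.0, eps=0.5, reps=2000, seed=2):
    """Fraction of simulated samples whose mean misses mu by at least eps."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(reps):
        xbar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if abs(xbar - mu) >= eps:
            misses += 1
    return misses / reps

# Estimated P(|X-bar - mu| >= eps) for increasing sample sizes.
rates = {n: miss_rate(n) for n in (10, 100, 1000)}
print(rates)
```

The estimated probabilities shrink toward 0 as n grows, which is the behaviour the definition requires of a consistent estimator.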
(4) Sufficiency
Explanation:
A statistic is a sufficient estimator of a population parameter if it provides the most information about that parameter from the sample data. That is, a sufficient estimator captures all the information about the parameter θ contained in the sample, and no other statistic can provide more information.
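$X_i = 1$ with probability p; $X_i = 0$ with probability (1 − p)
$Y = \sum_{i=1}^{n} X_i \sim \mathrm{Bin}(n, p)$
Solution:
For observed values with $\sum_{i=1}^{n} x_i = y$,
$P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n \mid Y = y) = \frac{P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)}{P(Y = y)} = \frac{p^y (1-p)^{n-y}}{C_y^n\, p^y (1-p)^{n-y}} = \frac{1}{C_y^n}$
⟹ does not depend on p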
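The conclusion of the solution can be verified with exact arithmetic. The sketch below evaluates the conditional probability for one fixed outcome (1, 0, 1, 0) under several values of p, using fractions so that no rounding is involved; the outcome and the values of p are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def conditional_prob(xs, p):
    """P(X_1 = x_1, ..., X_n = x_n | Y = y) for Bernoulli(p) draws, y = sum(xs)."""
    n, y = len(xs), sum(xs)
    joint = p ** y * (1 - p) ** (n - y)              # P(X_1 = x_1, ..., X_n = x_n)
    p_y = comb(n, y) * p ** y * (1 - p) ** (n - y)   # P(Y = y), with Y ~ Bin(n, p)
    return joint / p_y

xs = (1, 0, 1, 0)  # illustrative outcome with n = 4 and y = 2
values = {p: conditional_prob(xs, p)
          for p in (Fraction(1, 5), Fraction(1, 2), Fraction(9, 10))}
print(values)  # every entry equals 1 / C(4, 2)
```

Because the result is the same for every p, knowing Y leaves nothing further in the sample that depends on p, which is the sense in which Y is sufficient.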
III. Interval Estimation
Definition:An interval of upper and lower bounds is estimated for the unknown population
parameter, and the reliability (probability) of the interval containing the
population parameter is indicated.
Method:Interval estimation starts from ______________, and then derives an interval with upper and lower limits under a given ______________ to indicate the reliability with which the interval (that is, the confidence interval) contains the true value of the population parameter. The formula is:
__________________________________________
▶level of confidence:(1 − α)
The confidence level (1 − α) refers to the reliability (degree of confidence) with which the confidence interval includes the population parameter. α is the probability of error.
__________________________________________
▶Confidence Interval
The confidence interval is an interval constituted under a given confidence level and is an
interval including upper and lower bounds composed of sample statistics and sampling errors.
Explanation:
① If the probability that the estimated value is in this interval is calculated according to the
② A 95% confidence level means that an interval established at this confidence level has a 95% probability of containing the true value of the population parameter, in the sense that, over repeated sampling, about 95% of the intervals constructed this way contain it. Likewise, a 90% confidence level means that about 90% of such intervals contain the true value of the population parameter.
③ For example:Suppose we want to estimate the average summer-vacation income of students (which is unknown). A possible interval estimate is that, with 95% confidence, the average income lies between $380 and $400.
④ The factors that affect the confidence interval are the point estimate, the level of confidence, the sample size, and the way the confidence limits are taken (one-tailed or two-tailed).
2. Interval Estimator
(1) Calculate the interval estimate of (1 − α)100%
The statistic 𝑋̅ is the best estimator of the population mean μ, so the sample mean 𝑋̅ can be used as a point estimator of μ. For the interval estimation of the population mean μ, there are the following situations:
__________________________________________
▶Level of confidence:______________
▶The probability interval is $P\left[-Z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le Z_{\alpha/2}\right] =$ ______________
▶Margin of error:______________
▶Length of Confidence Interval:$2 \times Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
Explanation:
(i) The condition for $\bar{X} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ to serve as an interval estimator of the population mean is:【(a)+ (b) or (a)+ (c)】
(iv)Explanation:______________________________________________________________
【EXAMPLE 7】
(1) Determine the critical value $Z_{\alpha/2}$ that corresponds to a 95% level of confidence.
(2) Determine the critical value $Z_{\alpha/2}$ that corresponds to a 96% level of confidence.
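Critical values are normally read from the z table, but they can also be computed. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is an assumption made for this illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a confidence level (1 - alpha)."""
    alpha = 1 - confidence
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail of N(0, 1).
    return NormalDist().inv_cdf(1 - alpha / 2)

z95 = z_critical(0.95)
z96 = z_critical(0.96)
print(round(z95, 3), round(z96, 3))
```

The computed values agree with the table entries usually quoted to two or three decimals (about 1.96 and 2.05).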
【EXAMPLE 8】 Del Monte sets the filling operation to dispense 4.51 ounces of peaches and
gel in each cup. From historical data, Del Monte knows that 0.04 ounce is the standard
deviation of the filling process and that amount, in ounces, follows the normal distribution.
The quality control technician selects a sample of 64 cups at the start of each shift, measures
the amount in each cup, computes the mean fill amount, and then develops a 95% confidence
interval for the population mean. Using the confidence interval, is the process filling the cups
to the desired amount? This morning’s sample of 64 cups had a sample mean of 4.507 ounces.
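One way to set up the calculation for this example is sketched below. It applies $\bar{X} \pm Z_{\alpha/2} \cdot \sigma/\sqrt{n}$ with the numbers given in the problem (sample mean 4.507, σ = 0.04, n = 64); the function name `z_interval` is a hypothetical helper, and the sketch is meant as a check on hand calculation rather than as the official solution.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)  # margin of error
    return xbar - margin, xbar + margin

lower, upper = z_interval(xbar=4.507, sigma=0.04, n=64, confidence=0.95)
target_inside = lower <= 4.51 <= upper  # is the 4.51-ounce setting plausible?
print(round(lower, 4), round(upper, 4), target_inside)
```

If the target fill of 4.51 ounces lies inside the interval, the sample gives no evidence that the process is off target.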
I. Large sample
▶Level of confidence:______________
▶The probability interval is $P\left[-Z_{\alpha/2} \le \frac{\bar{X} - \mu}{s/\sqrt{n}} \le Z_{\alpha/2}\right] =$ ______________
▶Margin of error:______________________
▶Length of Confidence Interval:$2 \times Z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
Explanation:
When the population is normally distributed, $\bar{X}$ is normally distributed, $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
Therefore, standardizing $\bar{X}$ gives the standard normal distribution $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$.
When the population is not normally distributed but the sample is large enough, then according to the Central Limit Theorem (CLT) the sampling distribution of the sample mean still approaches the normal distribution. Therefore, if the population variance $\sigma^2$ is unknown and the sample is large, the standard normal (Z) distribution can still be used.
Although $\sigma^2$ is unknown, we can use the sample variance to estimate the population variance, so $\sigma^2$ is replaced by $s^2$.
【EXAMPLE 11】
__________________________________________________
▶Level of confidence:______________
▶Probability interval:$P\left[-t_{\alpha/2} \le \frac{\bar{X} - \mu}{s/\sqrt{n}} \le t_{\alpha/2}\right] =$ ______________
▶Margin of error:______________________
▶Length of Confidence Interval:$2 \times t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
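Explanation:
(i) When the population is normally distributed, $\bar{X}$ is normally distributed, $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
Therefore, standardizing $\bar{X}$ gives the standard normal distribution $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$. Since $\sigma^2$ is unknown, it is replaced by $s^2$. In the case of small samples, $t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a t distribution with (n − 1) degrees of freedom.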
☛ t distribution
t distribution is the probability distribution of a random variable ______________
The mean of t distribution is________, its shape is determined by its
______________________________( df= (n − 1) )
❖ Degrees of freedom refers to the number of random variables in a statistic that can vary freely. For example, (n − 1) random variables can vary freely in the statistic $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$. Because $S^2$ contains $\bar{X}$, and $\bar{X}$ must satisfy the restriction $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$, the n variables $(X_1, \cdots, X_n)$ are subject to this restriction. Only (n − 1) variables $(X_1, \cdots, X_{n-1})$ can vary freely, and the last variable $X_n$ is then determined, so the degrees of freedom of $S^2$ are (n − 1).
For example: with 4 observations, given that the average is 25 and $X_1 = 20$, $X_2 = 23$, $X_3 = 28$, the fourth value is automatically determined to be 25 × 4 − 20 − 23 − 28 = 29, so the degrees of freedom are 4 − 1 = 3.
Characteristics of t distribution:
➢ Like z distribution, t distribution is also a continuous probability distribution;
➢ Like z distribution, the graph of t distribution is also bell-shaped and symmetrical;
➢ There is not a single t distribution but a whole family. The mean of every t distribution is 0; however, the standard deviation varies with the sample size n. The smaller the sample size, the larger the standard deviation.
➢ The t distribution is flatter and more spread out than the standard normal (z) distribution. As the sample size increases, the t distribution gets closer to the standard normal distribution, because the error of using s to estimate σ shrinks as the sample size increases.
➢ Since the t distribution is more spread out than the z distribution, the t value for a given confidence level is greater than the z value for the same confidence level.
(ii)Explanation:______________________________________________________________
【EXAMPLE 12】 A tire manufacturer wishes to investigate the tread life of its tires. A sample
of 10 tires driven 50,000 miles revealed a sample mean of 0.32 inch of tread
remaining with a standard deviation of 0.09 inch. Construct a 95% confidence
interval for the population mean. Would it be reasonable for the manufacturer to
conclude that after 50,000 miles the population mean amount of tread remaining
is 0.30 inch?
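A possible way to organise the computation for this example is sketched below. It applies $\bar{X} \pm t_{\alpha/2} \cdot s/\sqrt{n}$ with n = 10, so df = 9. The Python standard library has no t distribution, so the critical value 2.262 is typed in from a t table; that constant and the helper name `t_interval` are assumptions of this sketch.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean using a t critical value from a table."""
    margin = t_crit * s / sqrt(n)  # margin of error
    return xbar - margin, xbar + margin

T_CRIT_95_DF9 = 2.262  # t table value for df = 9 and upper-tail area 0.025 (entered by hand)

lower, upper = t_interval(xbar=0.32, s=0.09, n=10, t_crit=T_CRIT_95_DF9)
claim_inside = lower <= 0.30 <= upper  # is a population mean of 0.30 inch plausible?
print(round(lower, 3), round(upper, 3), claim_inside)
```

Checking whether 0.30 falls inside the interval answers the final question of the example.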
【 EXAMPLE 13 】 The U.S. Dairy Industry wants to estimate the mean yearly milk
consumption. A sample of 16 people reveals the mean yearly consumption to be
45 gallons with a standard deviation of 20 gallons. Assume the population
distribution is normal.
a) What is the value of the population mean? What is the best estimate of this
value?
b) Explain why we need to use the t distribution. What assumption do you need
to make?
c) For a 90% confidence interval, what is the value of t?
d) Develop the 90% confidence interval for the population mean.
e) Would it be reasonable to conclude that the population mean is 48 gallons?
【EXAMPLE 14】 The owner of Britten’s Egg Farm wants to estimate the mean number of
eggs produced per chicken. A sample of 20 chickens shows they produced an
average of 20 eggs per month with a standard deviation of 2 eggs per month.
a) What is the value of the population mean? What is the best estimate of this
value?
b) Explain why we need to use the t distribution. What assumption do you need
to make?
c) For a 95% confidence interval, what is the value of t?
d) Develop the 95% confidence interval for the population mean.
e) Would it be reasonable to conclude that the population mean is 21 eggs? What
about 25 eggs?
A good estimator of the population proportion p is the sample proportion $\hat{p} = \frac{X}{n}$, where n is the number of trials and X is the number of successes. Interval estimates of the population proportion p vary by sample size or probability distribution. Here we only discuss the case of large samples.
Assume that the sample data follow a binomial distribution and ________________________________; then the z distribution can be used to construct the interval estimate of the population proportion p.
__________________________________________
▶Level of confidence:(1 − α)
▶Margin of error:____________________________
▶Length of Confidence Interval:$2 \times Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Explanation:
(i) The interval estimation for constructing the population proportions must meet the following
conditions:
【EXAMPLE 15】The union representing the Bottle Blowers of America (BBA) is considering
a proposal to merge with the Teamsters Union. At least three-fourths of the BBA
membership must approve any merger. A random sample of 2,000 current
members reveals 1,600 plan to vote for the merger proposal. Develop a 95%
confidence interval for the population proportion. Basing your decision on this
sample information, can you conclude that the necessary proportion of BBA
members favor the merger? Why?
【EXAMPLE 16】Cliff Obermeyer is running for Congress from the 6th District of New Jersey.
Suppose 500 voters are contacted upon leaving the polls and 275 indicate they
voted for Mr. Obermeyer. We will assume that the exit poll of 500 voters is a
random sample of those voting in the 6th District. Construct a 95% confidence
interval for the population proportion. Should Mr. Obermeyer be elected?
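The large-sample interval $\hat{p} \pm Z_{\alpha/2} \cdot \sqrt{\hat{p}(1-\hat{p})/n}$ can be sketched in a few lines. The helper name `proportion_interval` is hypothetical; the inputs are the counts stated in Examples 15 and 16, and the result is meant for checking hand calculations.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Large-sample (z) confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

bba = proportion_interval(1600, 2000)    # Example 15: merger vote
exit_poll = proportion_interval(275, 500)  # Example 16: exit poll
print([round(v, 4) for v in bba], [round(v, 4) for v in exit_poll])
```

In each example the decision rests on comparing the whole interval with the required threshold (three-fourths for the merger, one-half for the election).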
Exercise:
1) Mileage tests were conducted on a randomly selected sample of 100 newly developed
automobile tires. The results showed that the mean tread life was 50,000 miles, with a
standard deviation of 3,500 miles. What is the best estimate of the mean tread life in miles for
the entire population of these tires?
2) A random sample of 85 supervisors revealed that they worked an average of 6.5 years
before being promoted. The population standard deviation was 1.7 years. Using the 0.95
degree of confidence, what is the confidence interval for the population mean?
3) There are 2,000 eligible voters in a precinct. A total of 500 voters are randomly selected
and asked whether they plan to vote for the Democratic incumbent or the Republican
challenger. Of the 500 surveyed, 350 said they would vote for the Democratic incumbent.
Using the 0.99 confidence coefficient, what are the confidence limits for the proportion that
plan to vote for the Democratic incumbent?
4) A random sample of 42 college graduates revealed that they worked an average of 5.5
years on the job before being promoted. The sample standard deviation was 1.1 years. Using
the 0.99 degree of confidence, what is the confidence interval for the population mean?
5) A survey of 50 retail stores revealed that the average price of a microwave was $375, with
a sample standard deviation of $20. Assuming the population is normally distributed, what is
the 95% confidence interval to estimate the true cost of the microwave?
Assume that $\mu_1$ and $\mu_2$ are the means of the two populations, respectively. To estimate the difference between the two population means, $\mu_1 - \mu_2$, we randomly select samples of sizes $n_1$ and $n_2$ from populations 1 and 2, respectively. The two samples are _________________________.
If $\bar{X}_1$ and $\bar{X}_2$ are the means of the two samples, a good (point) estimator of the difference between the two population means $\mu_1 - \mu_2$ is ______________.
For example:
Population 1: Inner-City Store Customers | Population 2: Suburban Store Customers
$\mu_1$ = mean age of inner-city store customers | $\mu_2$ = mean age of suburban store customers
Random sample of $n_1$ inner-city store customers | Random sample of $n_2$ suburban store customers
$\bar{X}_1$ = sample mean age for the inner-city store customers | $\bar{X}_2$ = sample mean age for the suburban store customers
$\bar{X}_1 - \bar{X}_2$ is the point estimator of $\mu_1 - \mu_2$
If both populations are normally distributed, or are non-normal but the samples are large enough to apply the Central Limit Theorem, the sampling distributions of $\bar{X}_1$ and $\bar{X}_2$ will approach the ________________________, and the sampling distribution of $\bar{X}_1 - \bar{X}_2$ will also approach the normal distribution. When the sampling distribution of $\bar{X}_1 - \bar{X}_2$ is normal, $\bar{X}_1 - \bar{X}_2$ is used to estimate $\mu_1 - \mu_2$. The interval estimate depends on whether $\sigma_1^2$ and $\sigma_2^2$ are known, on the sample sizes, and on whether the samples are independent (this course only discusses independent samples). There are four cases, described as follows:
Case 1: $\sigma_1^2$ and $\sigma_2^2$ are known, and the samples are independent.
If both populations are normally distributed, or are non-normal but the samples are large enough to apply the Central Limit Theorem, and $\sigma_1^2$ and $\sigma_2^2$ are known and the samples are independent, then use the standard normal distribution to estimate the confidence interval of $\mu_1 - \mu_2$.
______________________________________________________________________
▶Level of confidence:(1 − α)
▶Probability interval:__________________________________________
▶Margin of error:____________________________
Case 2: $\sigma_1^2$ and $\sigma_2^2$ are unknown, the samples are independent, and both are large samples.
If both populations are normally distributed, or are non-normal but the samples are large enough to apply the Central Limit Theorem, and $\sigma_1^2$ and $\sigma_2^2$ are unknown and the samples are independent, the confidence interval of $\mu_1 - \mu_2$ can still be estimated using the standard normal distribution, but with the sample variances $s_1^2$ and $s_2^2$ in place of $\sigma_1^2$ and $\sigma_2^2$. In this case, the (1 − α)100% two-tailed confidence interval of $\mu_1 - \mu_2$ is:
_____________________________________________________________________________
Case 3: $\sigma_1^2$ and $\sigma_2^2$ are unknown and not equal ($\sigma_1^2 \ne \sigma_2^2$), the samples are independent, and both samples are small samples.
Assume that both populations are normally distributed, $\sigma_1^2$ and $\sigma_2^2$ are unknown and not equal ($\sigma_1^2 \ne \sigma_2^2$), the samples are independent, and both samples are small; then use the t distribution to estimate the confidence interval of $\mu_1 - \mu_2$.
When the variances of the two normal populations are unknown and not equal, then
Case 4: $\sigma_1^2$ and $\sigma_2^2$ are unknown and equal ($\sigma_1^2 = \sigma_2^2$), the samples are independent, and both samples are small samples.
Assume that both populations are normally distributed, $\sigma_1^2$ and $\sigma_2^2$ are unknown and equal ($\sigma_1^2 = \sigma_2^2$), the samples are independent, and both samples are small; then use the t distribution to estimate the confidence interval of $\mu_1 - \mu_2$.
When the variances of the two normal populations are unknown and equal, that is, $\sigma_1^2 = \sigma_2^2 = \sigma^2$,
then ______________________________________________________________________
★The common variance $\sigma^2$ is estimated by pooling the variances of the two samples,
____________________________
The (1 − α)100% two-tailed confidence interval of $\mu_1 - \mu_2$ is:
________________________________________________________
★Whether the variances of the two normal populations are equal can be tested with the F statistic.
【EXAMPLE 17】
 | Inner-City Store | Suburban Store
Sample size | $n_1 = 36$ | $n_2 = 49$
Sample mean | $\bar{x}_1 = 40$ | $\bar{x}_2 = 40$
Population standard deviation | $\sigma_1 = 9$ | $\sigma_2 = 10$
Suppose both populations have a normal distribution and are independent. Find the 95%
confidence interval estimate of the difference between the two population means.
【EXAMPLE 18】
 | Cherry Grove | Beechmont
Sample size | $n_1 = 28$ | $n_2 = 22$
Sample mean | $\bar{x}_1 = 1025$ | $\bar{x}_2 = 910$
Sample standard deviation | $s_1 = 150$ | $s_2 = 125$
Suppose both populations have a normal distribution and are independent. Find the 95%
confidence interval estimate of the difference between the two population means.
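The building blocks for Examples 17 and 18 can be sketched as follows. The first function covers Case 1 (known population standard deviations) and is applied to the numbers as printed in Example 17. For Example 18 the sketch only computes the pooled variance (Case 4) and the unequal-variance degrees of freedom (Case 3); the t critical value still has to be read from a table for whichever case applies. All function names are hypothetical helpers.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_z_interval(x1, x2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Case 1: known population standard deviations, independent samples."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return (x1 - x2) - margin, (x1 - x2) + margin

def pooled_variance(s1, s2, n1, n2):
    """Case 4: pooled estimate of the common variance, df = n1 + n2 - 2."""
    return ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)

def unequal_variance_df(s1, s2, n1, n2):
    """Case 3: approximate degrees of freedom when the variances differ."""
    a, b = s1 ** 2 / n1, s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

ex17 = diff_means_z_interval(40, 40, 9, 10, 36, 49)  # Example 17 as printed
sp2 = pooled_variance(150, 125, 28, 22)              # Example 18, Case 4
df_unequal = unequal_variance_df(150, 125, 28, 22)   # Example 18, Case 3
print([round(v, 2) for v in ex17], round(sp2, 2), round(df_unequal, 1))
```

With the pooled variance or the approximate degrees of freedom in hand, the interval for Example 18 follows from the corresponding t formula once the table value is looked up.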
④ Interval estimation of the difference between two population proportions ($p_1 - p_2$)
Suppose $p_1$ and $p_2$ are the proportions of the two populations that have a certain characteristic, respectively. To estimate the difference between the two population proportions, $p_1 - p_2$, we randomly select samples of sizes $n_1$ and $n_2$ from populations 1 and 2, respectively. The two samples are ____________________________. If $\hat{p}_1$ and $\hat{p}_2$ are the proportions of the two samples that have the characteristic, then a good (point) estimator of the difference between the two population proportions $p_1 - p_2$ is ______________.
From the sampling distribution of $\hat{p}_1 - \hat{p}_2$, the confidence interval of $p_1 - p_2$ can be determined. If both $n_1$ and $n_2$ are large, then according to the Central Limit Theorem the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is close to the normal distribution, and the confidence interval of $p_1 - p_2$ can be estimated by the _______________________________,
z distribution.
_____________________________________________________________________________
【EXAMPLE 19】A tax preparation firm is interested in comparing the quality of work at two
of its regional offices. By randomly selecting samples of tax returns prepared at
each office and verifying the sample returns’ accuracy, the firm will be able to
estimate the proportion of erroneous returns prepared at each office. Of particular
interest is the difference between these proportions. The independent simple
random samples from the two offices provide the following information.
 | Office 1 | Office 2
Sample size | $n_1 = 250$ | $n_2 = 300$
Number of returns with errors | $x_1 = 35$ | $x_2 = 27$
Find the 90% confidence interval for the difference between two population
proportions.
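A compact way to check the arithmetic for this example is sketched below, using the large-sample formula $(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2} \sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$. The function name is a hypothetical helper and the inputs are the counts from the table above.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportions_interval(x1, n1, x2, n2, confidence=0.90):
    """Large-sample (z) interval for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin

lower, upper = diff_proportions_interval(35, 250, 27, 300, confidence=0.90)
print(round(lower, 4), round(upper, 4))
```

Whether the interval includes 0 indicates whether the data are compatible with equal error rates at the two offices.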
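Summary:
Confidence interval limits for population mean, $\sigma^2$ is known: $\bar{X} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
Confidence interval limits for population mean, $\sigma^2$ is unknown, large sample size: $\bar{X} \pm Z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
Confidence interval limits for population mean, $\sigma^2$ is unknown, small sample size: $\bar{X} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
Confidence interval limits for population proportion: $\hat{p} \pm Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Confidence interval limits for $\mu_1 - \mu_2$, $\sigma_1^2$ and $\sigma_2^2$ are known, independent samples:
$(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Confidence interval limits for $\mu_1 - \mu_2$, $\sigma_1^2$ and $\sigma_2^2$ are unknown; independent samples; unequal variances:
$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ with $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$
Confidence interval limits for $\mu_1 - \mu_2$, $\sigma_1^2$ and $\sigma_2^2$ are unknown; independent samples; equal variances:
$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \cdot \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ with $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$, $df = n_1 + n_2 - 2$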
Confidence interval limits for $p_1 - p_2$: $(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
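Finite Population:
Confidence interval limits for population mean, $\sigma^2$ is known: $\bar{X} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
Confidence interval limits for population mean, $\sigma^2$ is unknown, large sample size: $\bar{X} \pm Z_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
Confidence interval limits for population mean, $\sigma^2$ is unknown, small sample size: $\bar{X} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
Confidence interval limits for population proportion: $\hat{p} \pm Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \sqrt{\frac{N-n}{N-1}}$
Confidence interval limits for $\mu_1 - \mu_2$, $\sigma_1^2$ and $\sigma_2^2$ are known, independent samples:
$(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} \cdot \frac{N_1 - n_1}{N_1 - 1} + \frac{\sigma_2^2}{n_2} \cdot \frac{N_2 - n_2}{N_2 - 1}}$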
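The finite population correction factor $\sqrt{(N-n)/(N-1)}$ that appears in all of the formulas above can be sketched as a small helper. The numbers used here (N = 500, n = 50, σ = 10) are illustrative assumptions, not taken from the handout.

```python
from math import sqrt

def fpc(N, n):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

def standard_error_finite(sigma, n, N):
    """Standard error of the sample mean when sampling from a finite population."""
    return sigma / sqrt(n) * fpc(N, n)

# Illustrative values only: population of 500, sample of 50, sigma = 10.
factor = fpc(500, 50)
se_infinite = 10 / sqrt(50)
se_finite = standard_error_finite(10, 50, 500)
print(round(factor, 4), round(se_infinite, 4), round(se_finite, 4))
```

The factor is always at most 1, so the correction can only narrow the interval; it matters most when the sample is a sizeable fraction of the population.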
Exercise:
1) A sample of 25 is selected from a known population of 100 elements. What is the finite
population correction factor?
2) A sample of 100 is selected from a known population of 350 elements. The population
standard deviation is 15. Using the finite correction factor, what is the standard error of the
sample means?
3) A survey of an urban university (population of 25,450) showed that 750 of 1,100 students
sampled attended a home football game during the season. Using the 99% level of
confidence, what is the confidence interval for the proportion of students attending a football
game?
4) The following results come from two independent random samples taken of two
populations.
 | Sample 1 | Sample 2
Sample size | $n_1 = 50$ | $n_2 = 35$
Sample mean | $\bar{x}_1 = 13.6$ | $\bar{x}_2 = 11.6$
Population standard deviation | $\sigma_1 = 2.2$ | $\sigma_2 = 3.0$
If the difference of two sample means is normally distributed, find the 95% confidence
interval estimate for the difference between the two population means.
5) The following results come from two independent random samples taken of two
populations.
 | Sample 1 | Sample 2
Sample size | $n_1 = 20$ | $n_2 = 30$
Sample mean | $\bar{x}_1 = 22.5$ | $\bar{x}_2 = 20.1$
Sample standard deviation | $s_1 = 2.5$ | $s_2 = 4.8$
If the difference of two sample means is normally distributed, find the 95% confidence
interval estimate for the difference between the two population means.
Suppose the population is normally distributed, $N(\mu, \sigma^2)$, where μ and $\sigma^2$ are unknown parameters, and a sample (𝑋1 , 𝑋2 , ⋯ , 𝑋𝑛 ) is selected randomly from the population. When estimating the population variance, a good point estimator is the sample variance $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$, and____________________________, it’s called
______________subject
__________________________________________
The confidence interval for 100(1 − 𝛼)% two-tailed interval estimate of σ:
__________________________________________
Explanation:
The chi-square distribution is a right-skewed distribution defined on the range greater than 0 (positive numbers), and different degrees of freedom determine different chi-square distributions.
The chi-square distribution has only one parameter, its degrees of freedom υ.
As the degrees of freedom increase, the chi-square distribution tends to be symmetrical; as the degrees of freedom approach infinity, the standardized chi-square distribution approaches the standard normal distribution.
Look-up table:The $\chi^2_\alpha$ value of ____________________________ displayed in the chi-square distribution table.
(i) υ = 5, α = 0.1 → ____________________________ (one-tailed)
variance is multiplied by ___________, that is, ____________________________, then $\frac{(n-1)s^2}{\sigma^2}$
【EXAMPLE 21】A large candy manufacturer produces, packages, and sells packs of candy
targeted to weigh 52 grams. A quality control manager working for the company
was concerned that the variation in the actual weights of the targeted 52-gram packs
was larger than acceptable. That is, he was concerned that some packs weighed
significantly less than 52 grams and some weighed significantly more than 52
grams. In an attempt to estimate $\sigma^2$, the variance of the weights of all of the
52-gram packs the manufacturer makes, he took a random sample of n = 10 packs off
the factory line. The random sample yielded a sample variance of 4.2 grams². Use
the random sample to derive a 95% confidence interval for $\sigma^2$.
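The calculation can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard textbook interval $\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2}}, \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right)$, where the subscript is the upper-tail area; the lecture's own notation may differ. The function name and argument names are illustrative, and the critical values are typed in from a chi-square table (df = 9) rather than computed.

```python
# Two-sided confidence interval for a population variance (normal population).
# Assumption: critical values are read from a printed chi-square table,
# with the subscript meaning the upper-tail (right-tail) area.

def variance_confidence_interval(n, sample_var, chi2_upper, chi2_lower):
    """Return (lower, upper) bounds for sigma^2.

    chi2_upper: table value with upper-tail area alpha/2,     df = n - 1
    chi2_lower: table value with upper-tail area 1 - alpha/2, df = n - 1
    """
    df = n - 1
    # Dividing by the larger critical value gives the lower bound.
    return df * sample_var / chi2_upper, df * sample_var / chi2_lower

# Example 21 data: n = 10, s^2 = 4.2, 95% confidence, so df = 9.
# Table values for df = 9: chi2(0.025) = 19.023 and chi2(0.975) = 2.700.
lower, upper = variance_confidence_interval(10, 4.2, 19.023, 2.700)
print(f"95% CI for sigma^2: ({lower:.3f}, {upper:.3f})")
```

Taking the square root of both bounds would give the corresponding interval for σ.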
⑥ Interval Estimation of the Ratio of Two Population Variances ($\sigma_1^2 / \sigma_2^2$)
Assume that both populations are normally distributed, $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$,
where $\mu_1$, $\mu_2$ and the population variances $\sigma_1^2$, $\sigma_2^2$ are unknown parameters. A random
sample is drawn from each population, the two samples are independent, the sample
sizes are $n_1$ and $n_2$, the sample variances are $s_1^2$ and $s_2^2$, and the degrees of freedom are
$\upsilon_1 = n_1 - 1$ and $\upsilon_2 = n_2 - 1$. Then the interval estimate of the ratio of the two population
variances $\sigma_1^2 / \sigma_2^2$ is:
______________________________________________________________________
Explanation:
☛ F distribution
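As a sketch of how the F distribution leads to an interval for the variance ratio, the derivation below assumes the common textbook convention that $F_{\alpha}(\upsilon_1, \upsilon_2)$ is the value with upper-tail area α; the lecture's own notation may differ.

```latex
\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \sim F(\upsilon_1, \upsilon_2)
\;\Longrightarrow\;
P\!\left( F_{1-\alpha/2}(\upsilon_1,\upsilon_2)
  \le \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}
  \le F_{\alpha/2}(\upsilon_1,\upsilon_2) \right) = 1-\alpha

% Solve the inequality for the ratio of the population variances:
\frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{\alpha/2}(\upsilon_1,\upsilon_2)}
  \;\le\; \frac{\sigma_1^2}{\sigma_2^2} \;\le\;
\frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{1-\alpha/2}(\upsilon_1,\upsilon_2)}
  = \frac{s_1^2}{s_2^2}\cdot F_{\alpha/2}(\upsilon_2,\upsilon_1)
```

The last equality uses the reciprocal property $F_{1-\alpha}(\upsilon_1,\upsilon_2) = 1 / F_{\alpha}(\upsilon_2,\upsilon_1)$, which is why only upper-tail values are needed from the table.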
Sample size is an important factor when using confidence intervals: obtaining good estimates
of population parameters depends on choosing it well. In general, we decide the
sample size based on the following three variables:
★ There is a tradeoff between the margin of error and the sample size: a small margin of error
requires a larger sample, which costs more money and time to collect. A large
margin of error allows for a smaller sample but gives a wider confidence interval. Therefore,
instead of choosing the smallest error possible, the researcher chooses a tolerable margin of
error.
★ A higher level of confidence requires a larger sample (that is, more time and money to
collect the sample).
★ The greater the dispersion of the population, the larger the required sample size; the more
concentrated the population, the smaller the required sample size.
→ Goal: Find an appropriate sample size for a given margin of error, confidence level (1 − α),
and variance.
$E = z\,\frac{\sigma}{\sqrt{n}}$ → ___________________________
n: sample size
𝜎: population standard deviation (If the population standard deviation is unknown, the sample
standard deviation (s) is used instead)
z: the standard normal value corresponding to the chosen level of confidence
E: maximum allowable error (margin of error)
Results calculated using the formula are not always integers. When the result is not an integer, we
round up to obtain ____________________________. For example, if the result is 201.21,
always round up to 202.
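The steps above can be sketched in Python. This is a minimal sketch, assuming the usual rearrangement $n = (z\sigma/E)^2$ of the margin-of-error formula and a two-sided interval; the function name and parameter names are illustrative. The standard library's `NormalDist` supplies z, and the usage line plugs in the figures from Example 23 (σ = 1000, E = 100, 95% confidence).

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n whose margin of error for the mean is at most `margin`."""
    alpha = 1 - confidence
    # Two-sided interval: put alpha/2 in each tail of the standard normal.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Solve E = z * sigma / sqrt(n) for n, then always round up.
    return math.ceil((z * sigma / margin) ** 2)

# Figures from Example 23: sigma = $1,000, E = $100, 95% confidence.
n_mean = sample_size_for_mean(sigma=1000, margin=100, confidence=0.95)
print(n_mean)
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the tolerated value.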
$E = z\sqrt{\frac{p(1-p)}{n}}$ → ____________________________
n: sample size
𝑝: population proportion (if the population proportion is unknown, use ______________ instead,
because 𝑝(1 − 𝑝) is the largest at 𝑝 = 0.5)
z: the standard normal value corresponding to the chosen level of confidence
E: maximum allowable error (margin of error)
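The proportion case follows the same pattern. The sketch below assumes the usual rearrangement $n = z^2 p(1-p)/E^2$ and falls back to the most conservative planning value when no prior estimate of p is available; the function name and parameter names are illustrative. The usage line plugs in the figures from Example 24 (E = 0.1, 90% confidence, no estimate of p).

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p=0.5):
    """Smallest n whose margin of error for a proportion is at most `margin`.

    p defaults to 0.5, the value that maximizes p * (1 - p), so the
    result is a safe upper bound when no prior estimate exists.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Solve E = z * sqrt(p * (1 - p) / n) for n, then always round up.
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Figures from Example 24: E = 0.1, 90% confidence, no estimate of p.
n_prop = sample_size_for_proportion(margin=0.1, confidence=0.90)
print(n_prop)
```

When a prior estimate of p is available, passing it as `p` usually gives a smaller required sample than the 0.5 default.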
【EXAMPLE 23】A student in public administration wants to estimate the mean monthly
earnings of city council members in large cities. She can tolerate a margin of error
of $100 in estimating the mean. She would also prefer to report the interval
estimate with a 95% level of confidence. The student found a report by the
Department of Labor that reported a standard deviation of $1,000. What is the
required sample size?
【EXAMPLE 24】A student wants to estimate the proportion of cities that have private refuse
collectors. The student wants to estimate the population proportion within a
margin of error of 0.1, prefers a level of confidence of 90%, and has no estimate
for the population proportion. What is the required sample size?
Exercise:
1) The mean number of travel days per year for salespeople employed by three hardware
distributors needs to be estimated with a 0.90 degree of confidence. For a small pilot study,
the mean was 150 days and the standard deviation was 14 days. If the population mean is
to be estimated within two days, how many salespeople should be sampled?
2) A research firm needs to estimate within 3% the proportion of junior executives leaving
large manufacturing companies within three years. A 0.95 degree of confidence is to be used.
Several years ago, a study revealed that 21% of junior executives left their company within
three years. To update this study, how many junior executives should be surveyed?
3) Assume that the time needed to hatch an egg of a certain type of rare lizard is
normally distributed. Using an incubator, we were able to hatch 13 eggs from different nests
separately. We have a sample mean of 18.97 weeks with a sample variance of 10.7
weeks² (i.e., $s^2 = 10.7$). What is the 90% confidence interval for the population variance?