Chapter 3 SSCM1103


Chapter 3: Estimation

DR. ADINA NAJWA KAMARUDIN


Topic
• Introduction; Inferential Statistics
• Point Estimate
• Interval Estimate
• Confidence Interval for a Single Mean
• Confidence Interval for Difference of Means
• Confidence Interval for a Single Proportion
• Confidence Interval for Difference of Proportions
• Confidence Interval for a Single Variance
• Confidence Interval for Ratio of Variances
Introduction
• In the previous chapter, we learnt about the distributions of a
population and of a random sample.
• Descriptive statistics such as the mean, variance and distribution
were discussed as ways to describe our data.
• In this chapter, inferential statistics is introduced.
• Inferential statistics: using information from a random sample to
make a generalization about the population.
• Estimation: estimating population parameters using statistics
computed from a sample.
Terminology
• A statistic computed from a random sample is an estimator of a
population parameter.
• The value assigned to a population parameter based on a sample
statistic is called an estimate.
• The two types of estimate are the point estimate and the interval
estimate. A point estimate gives a single value, while an interval
estimate provides a range of values with lower and upper limits.
• A confidence interval is an interval constructed around the point
estimate with a certain level of confidence attached.
Point Estimate
• The population mean $\mu$ is estimated using the sample mean $\bar{x}$:
$$\hat{\mu} = \bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$$
• $\bar{X}$ is the estimator of the population mean $\mu$.
• An unknown population variance $\sigma^2$ can be estimated using the
point estimator $S^2$.
• The best (most efficient) estimator should:
a) Be unbiased
b) Have minimum variance
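As a minimal sketch of the two point estimators above, the snippet below computes the sample mean and the sample variance with Python's standard `statistics` module. The five data values are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance

# Hypothetical sample of n = 5 measurements (illustrative values only).
sample = [12.1, 12.6, 12.4, 12.9, 12.5]

mu_hat = mean(sample)          # point estimate of the population mean
sigma2_hat = variance(sample)  # sample variance S^2 (divides by n - 1)

print(f"point estimate of mu      = {mu_hat:.3f}")
print(f"point estimate of sigma^2 = {sigma2_hat:.4f}")
```

`statistics.variance` divides by $n - 1$, which is what makes $S^2$ an unbiased estimator of $\sigma^2$.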
Interval Estimate
• An interval around the point estimate, bounded by a lower limit $L$
and an upper limit $U$.
• The probability that the interval contains the true parameter can be
expressed as
$$P(L \le \mu \le U) = 1 - \alpha$$
• This interval is called a $100(1-\alpha)\%$ confidence interval for the
true parameter $\mu$.
• Interpretation: We are $100(1-\alpha)\%$ confident that the true
parameter lies between the lower limit and the upper limit.
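The interpretation above can be checked with a small simulation. This is only a sketch under stated assumptions: a normal population with hypothetical values $\mu = 50$, $\sigma = 4$ and $n = 25$, and the known-variance interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$. Over many repeated samples, roughly $100(1-\alpha)\%$ of the intervals should contain $\mu$.

```python
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA, N = 50.0, 4.0, 25   # hypothetical population and sample size
ALPHA = 0.05
z = NormalDist().inv_cdf(1 - ALPHA / 2)   # critical value z_{alpha/2}
half_width = z * SIGMA / N ** 0.5

def interval_covers_mu() -> bool:
    """Draw one sample, build the interval, report whether it contains MU."""
    xbar = mean(random.gauss(MU, SIGMA) for _ in range(N))
    return xbar - half_width <= MU <= xbar + half_width

TRIALS = 2000
coverage = sum(interval_covers_mu() for _ in range(TRIALS)) / TRIALS
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

The printed fraction should sit near 0.95. Any single interval either contains $\mu$ or does not; the confidence level describes the long-run behaviour of the procedure.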
Confidence Interval for a Single Mean 𝝁
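• The best estimator for the population mean $\mu$ is the sample mean $\bar{X}$.
• As we learnt in the previous chapter, the sampling distribution of
the sample mean is
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
• Thus, we transform to the Z distribution using this formula:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
• Solve for $\mu$:
$$\mu = \bar{X} - Z\left(\frac{\sigma}{\sqrt{n}}\right)$$
• The $100(1-\alpha)\%$ confidence interval for the population mean $\mu$:
$$P(L \le \mu \le U) = 1 - \alpha, \quad L = \bar{X} - Z\left(\frac{\sigma}{\sqrt{n}}\right), \quad U = \bar{X} + Z\left(\frac{\sigma}{\sqrt{n}}\right)$$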
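• Three cases we need to consider:
1. When the population variance $\sigma^2$ is known
The $100(1-\alpha)\%$ CI for the population mean $\mu$ is
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
2. When the population variance $\sigma^2$ is unknown and $n \ge 30$
The $100(1-\alpha)\%$ CI for the population mean $\mu$ is
$$\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}$$
3. When the population variance $\sigma^2$ is unknown and $n < 30$
The $100(1-\alpha)\%$ CI for the population mean $\mu$ is
$$\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$$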
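All three cases share the form $\bar{x} \pm (\text{critical value}) \times (\text{standard deviation})/\sqrt{n}$, so one helper can cover them. The sketch below is illustrative only: `z_crit` and `mean_ci` are hypothetical helper names, the summary statistics are taken from the fuel-consumption and iron-ring exercises in this section, and the value $t_{0.005,15} = 2.947$ is read from a standard t table rather than computed.

```python
from math import sqrt
from statistics import NormalDist

def z_crit(alpha: float) -> float:
    """Upper alpha/2 critical value z_{alpha/2} of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_ci(xbar: float, sd: float, n: int, crit: float) -> tuple[float, float]:
    """Return (lower, upper) = xbar -/+ crit * sd / sqrt(n)."""
    half_width = crit * sd / sqrt(n)
    return xbar - half_width, xbar + half_width

# Case 2 (sigma unknown, n >= 30): ring sample, 99% CI uses z_{0.005}.
ring_ci = mean_ci(xbar=60.038, sd=0.15, n=35, crit=z_crit(0.01))

# Case 3 (sigma unknown, n < 30): fuel sample, 99% CI uses t_{0.005, 15}.
T_005_15 = 2.947  # read from a standard t table, not computed here
fuel_ci = mean_ci(xbar=12.5, sd=0.83, n=16, crit=T_005_15)

print("ring diameter CI (cm):", ring_ci)
print("fuel consumption CI (km per liter):", fuel_ci)
```

For case 1 the same helper applies with `sd` set to the known $\sigma$ and `crit=z_crit(alpha)`.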
Exercises
• A random sample of 16 compact cars tested for fuel consumption
gave a mean of 12.5km per liter with a standard deviation of 0.83 km
per liter. Assuming that the fuel consumption in km per liter of all
compact cars have a normal distribution, construct a 99% confidence
interval for the population mean of fuel consumption for compact
cars.
• SEA Steel Corporation produces iron rings that are supplied to ABKIA
Co Ltd. These rings are supposed to have a diameter of 60 cm. The
machine that makes these rings does not produce each ring with a
diameter of exactly 60cm. The diameter of each of the rings varies
slightly. It is known that when the machine is working properly, the
rings made on this machine have a mean diameter of 60cm. The
quality control department takes a random sample of 35 such rings
every week, calculates the mean of the diameters for these rings, and
makes a 99% confidence interval for the population mean.
• If either the lower limit of this confidence interval is less than
59.938cm or the upper limit of this confidence interval is greater than
60.063cm, the machine is stopped and adjusted. The most recent
drawn sample of 35 rings produces a mean diameter of 60.038cm
with a standard deviation of 0.15cm. Based on this sample, can you
conclude that the machine needs an adjustment?
Confidence Interval for Difference of Means
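• As we learnt in the previous chapter, the sampling distribution of the difference of
means is
$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$
• Thus, we transform to the Z distribution using this formula:
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
• Solve for $(\mu_1 - \mu_2)$:
$$(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) - Z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
• The $100(1-\alpha)\%$ confidence interval for the difference of population means $(\mu_1 - \mu_2)$:
$$P(L \le (\mu_1 - \mu_2) \le U) = 1 - \alpha$$
with
$$L = (\bar{X}_1 - \bar{X}_2) - Z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}, \quad U = (\bar{X}_1 - \bar{X}_2) + Z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$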
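• Three cases we need to consider:
1. When the population variances are known
The $100(1-\alpha)\%$ CI for the difference of population means $(\mu_1 - \mu_2)$ is
$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le (\mu_1 - \mu_2) \le (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
2. When the population variances are unknown and $n_1, n_2 \ge 30$
The $100(1-\alpha)\%$ CI for the difference of population means $(\mu_1 - \mu_2)$ is
a) $\sigma_1^2 = \sigma_2^2$
$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le (\mu_1 - \mu_2) \le (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
b) $\sigma_1^2 \ne \sigma_2^2$
$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le (\mu_1 - \mu_2) \le (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
3. When the population variances are unknown and $n_1, n_2 < 30$
The $100(1-\alpha)\%$ CI for the difference of population means $(\mu_1 - \mu_2)$ is
a) $\sigma_1^2 = \sigma_2^2$
$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le (\mu_1 - \mu_2) \le (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
b) $\sigma_1^2 \ne \sigma_2^2$
$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le (\mu_1 - \mu_2) \le (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where
$$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$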
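The pooled standard deviation $s_p$ and the degrees of freedom $\nu$ are the two quantities that are easiest to get wrong by hand. The sketch below computes both, using the summary statistics of the Adria and Wanem exercise that follows. The function names are hypothetical, and $t_{0.05,23} = 1.714$ is taken from a standard t table rather than computed.

```python
from math import sqrt

def pooled_sd(s1: float, n1: int, s2: float, n2: int) -> float:
    """Pooled standard deviation s_p for the equal-variance case."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def unequal_var_df(s1: float, n1: int, s2: float, n2: int) -> float:
    """Degrees of freedom nu for the unequal-variance case (3b)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def diff_ci(diff: float, se: float, crit: float) -> tuple[float, float]:
    """Return (lower, upper) = diff -/+ crit * se."""
    return diff - crit * se, diff + crit * se

# Sample 1 = Adria, sample 2 = Wanem (repair costs in RM).
n1, xbar1, s1 = 16, 5000.0, 800.0
n2, xbar2, s2 = 9, 7700.0, 1000.0

# Case 3a: equal variances, 90% CI, t_{0.05, 23} from a standard t table.
T_05_23 = 1.714
sp = pooled_sd(s1, n1, s2, n2)
pooled_ci = diff_ci(xbar1 - xbar2, sp * sqrt(1 / n1 + 1 / n2), T_05_23)

# Case 3b would instead use nu (rounded down when reading a t table).
nu = unequal_var_df(s1, n1, s2, n2)

print(f"s_p = {sp:.2f}, pooled 90% CI = {pooled_ci}")
print(f"nu for the unequal-variance case = {nu:.2f}")
```

Since $\nu$ is usually not an integer, a common convention when using a printed table is to round it down, which gives a slightly wider and therefore more conservative interval.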
Exercises
• A car reviewer is comparing the total repair costs incurred during the
first three years on two mid-sized cars, the Adria and the Wanem.
Random samples of 16 Adria and nine Wanem cars are taken. All 25 cars
are three years old and have similar mileages. The mean of repair
costs for the 16 Adria cars is RM5,000 for the first three years with a
standard deviation of RM800. For the nine Wanem cars, the mean is
RM7,700 with a standard deviation of RM1,000. Assume that the
repair costs follow a normal distribution with the same population
variance. Construct a 90% confidence interval for the difference
between the two population means.
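$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le (\mu_1 - \mu_2) \le (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$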
• A process engineer is comparing two different etching solutions for
removing silicon from the back of wafers. The etch rates follow a
normal distribution and have equal population variances of $0.35^2$.
Below are the observed etch rates from 10 wafers for each solution.
Solution 1    Solution 2
9.7           10.1
9.3           10.5
9.1           10.6
9.5           10.3
10.0          10.3
10.5          9.9
10.2          10.1
9.9           10.2
10.3          10.3
10.1          10.1
• Find a 90% CI for the difference in mean etch rates.
• Construct a CI for the difference in mean etch rates if we do not know
the population variances and assume that the two populations have
unequal variances.
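For raw data such as the etch rates above, the sample means have to be computed first. The sketch below does this and then applies the known-variance formula (case 1). It assumes the stated common variance is to be read as $0.35^2$, that is $\sigma_1 = \sigma_2 = 0.35$; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean

solution_1 = [9.7, 9.3, 9.1, 9.5, 10.0, 10.5, 10.2, 9.9, 10.3, 10.1]
solution_2 = [10.1, 10.5, 10.6, 10.3, 10.3, 9.9, 10.1, 10.2, 10.3, 10.1]

SIGMA = 0.35  # assumption: the stated common variance is read as 0.35^2
ALPHA = 0.10  # 90% confidence level

diff = mean(solution_1) - mean(solution_2)  # xbar_1 - xbar_2
se = sqrt(SIGMA**2 / len(solution_1) + SIGMA**2 / len(solution_2))
z = NormalDist().inv_cdf(1 - ALPHA / 2)     # z_{0.05}
ci = (diff - z * se, diff + z * se)

print(f"xbar_1 - xbar_2 = {diff:.2f}")
print(f"90% CI for mu_1 - mu_2: ({ci[0]:.4f}, {ci[1]:.4f})")
```

For the second part of the exercise, the sample standard deviations (for example via `statistics.stdev`) would replace `SIGMA`, and a t critical value with $\nu$ degrees of freedom would replace `z`.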
Confidence Interval for a Single Proportion
• The best estimator for the population proportion $\pi$ is the sample proportion $P$.
• As we learnt in the previous chapter, the sampling distribution of
the sample proportion is
$$P \sim N\left(\pi,\ \frac{\pi(1-\pi)}{n}\right)$$
• Thus, we transform to the Z distribution using this formula:
$$Z = \frac{P - \pi}{\sqrt{\frac{\pi(1-\pi)}{n}}}$$
• Solve for $\pi$:
$$\pi = P - Z\sqrt{\frac{\pi(1-\pi)}{n}}$$
Since the population proportion is unknown, replace $\pi$ with $P$ in the
standard error.
• The $100(1-\alpha)\%$ confidence interval for the population proportion $\pi$:
$$P(L \le \pi \le U) = 1 - \alpha, \quad L = P - Z\sqrt{\frac{P(1-P)}{n}}, \quad U = P + Z\sqrt{\frac{P(1-P)}{n}}$$
Thus, the $100(1-\alpha)\%$ CI for the proportion $\pi$ is
$$P - z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}} \le \pi \le P + z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}}$$
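A short sketch of this interval, using the figures from the first exercise that follows (400 components, 6.25 percent failing, 90% confidence). `proportion_ci` is a hypothetical helper name, not part of any library.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p: float, n: int, alpha: float) -> tuple[float, float]:
    """CI for a proportion: p -/+ z_{alpha/2} * sqrt(p(1-p)/n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# 6.25% of 400 sampled components failed; 90% confidence level.
lower, upper = proportion_ci(p=0.0625, n=400, alpha=0.10)
print(f"90% CI for the failure proportion: ({lower:.4f}, {upper:.4f})")
```

When only counts are given, as in the chip-defect exercise, the sample proportion is the count divided by the sample size, for example `90 / 1000`.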
Exercises
• A random sample of 400 components was tested and 6.25 percent
of the sample components failed to satisfy production specifications.
Find a 90% CI on the true proportion of components that fail to satisfy
the specifications.
• A manufacturer of computer chips inspected a random sample of
1000 chips. The following are the numbers of defects according to
their types.
Types               Number of defects
Holes too small     90
Holes too large     25
Poor connections    10
Chip oversize       2
Chip undersize      1
• What is the point estimate of the proportion of defective chips due to
holes too small?
• Construct a 90% CI for the proportion of defective chips for the
production process due to holes too small.
• What is the point estimate of the proportion of defective chips due to
poor connections?
• Construct a 90% CI for the proportion of defective chips for the
production process due to poor connections.
• If oversize and undersize chips can be classified as incorrect chip size,
what is the point estimate of the proportion of defects due to
incorrect chip size?
• Hence find a 95% interval estimate for the proportion of defective
chips due to incorrect chip size.
Confidence Interval for Difference of Proportions
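• The sampling distribution for the difference in sample proportions is
$$P_1 - P_2 \sim N\left(\pi_1 - \pi_2,\ \frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}\right)$$
• Thus, we transform to the Z distribution using this formula:
$$Z = \frac{(P_1 - P_2) - (\pi_1 - \pi_2)}{\sqrt{\frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}}}$$
• Solve for $(\pi_1 - \pi_2)$:
$$(\pi_1 - \pi_2) = (P_1 - P_2) - Z\sqrt{\frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}}$$
Since the population proportions are unknown, replace $\pi_1$ and $\pi_2$
with $P_1$ and $P_2$ in the standard error.
• The $100(1-\alpha)\%$ confidence interval for the difference of population
proportions $(\pi_1 - \pi_2)$:
$$P(L \le (\pi_1 - \pi_2) \le U) = 1 - \alpha$$
with
$$L = (P_1 - P_2) - Z\sqrt{\frac{P_1(1-P_1)}{n_1} + \frac{P_2(1-P_2)}{n_2}}, \quad U = (P_1 - P_2) + Z\sqrt{\frac{P_1(1-P_1)}{n_1} + \frac{P_2(1-P_2)}{n_2}}$$
• Thus, with $Z$ taken as the critical value $z_{\alpha/2}$,
$$(P_1 - P_2) - z_{\alpha/2}\sqrt{\frac{P_1(1-P_1)}{n_1} + \frac{P_2(1-P_2)}{n_2}} \le (\pi_1 - \pi_2) \le (P_1 - P_2) + z_{\alpha/2}\sqrt{\frac{P_1(1-P_1)}{n_1} + \frac{P_2(1-P_2)}{n_2}}$$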
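The same pattern extends to two samples. The sketch below uses the survey figures from the exercise that follows (20% of 501 school girls, 25% of 500 school boys, 90% confidence). `diff_proportion_ci` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_ci(p1: float, n1: int, p2: float, n2: int,
                       alpha: float) -> tuple[float, float]:
    """CI for pi_1 - pi_2 using the sample proportions in the standard error."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Sample 1 = school girls, sample 2 = school boys.
lower, upper = diff_proportion_ci(p1=0.20, n1=501, p2=0.25, n2=500, alpha=0.10)
print(f"90% CI for pi_1 - pi_2: ({lower:.4f}, {upper:.4f})")
```

Whether the resulting interval contains zero indicates whether the data are consistent with equal population proportions at that confidence level.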
Exercise
• A survey conducted by an independent Engineering Education Research
Unit found that among teenagers aged 17 to 19, 20% of school girls
and 25% of school boys wanted to study an engineering discipline.
Suppose that these percentages are based on random samples of 501
school girls and 500 school boys. Determine a 90% CI for the
difference between the proportions of all school girls and all school
boys who would like to study an engineering discipline.
Confidence Interval for a Single Variance
• Involves the chi-square distribution.
• The graph is skewed to the right and takes only positive values.
• $\chi^2_{\alpha,\nu}$ denotes the number along the horizontal axis that cuts off to its
right an area of $\alpha$ under the chi-square distribution with $\nu$ degrees of
freedom.
• So the probability $P(\chi^2 > \chi^2_{\alpha,\nu}) = \alpha$ can be found from the statistical
table.
• Recall that $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$ is distributed as $\chi^2$ with $n - 1$ degrees of
freedom.
• To construct the confidence interval:
$$P\left(\chi^2_{1-\alpha/2} \le \chi^2 \le \chi^2_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(\chi^2_{1-\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2}\right) = 1 - \alpha$$
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$$
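A sketch of this interval, using the bolt-diameter exercise that follows ($n = 13$, $s = 0.018$ mm, 90% confidence). The critical values $\chi^2_{0.05,12} = 21.026$ and $\chi^2_{0.95,12} = 5.226$ are read from a standard chi-square table rather than computed, and `variance_ci` is a hypothetical helper name.

```python
from math import sqrt

def variance_ci(s2: float, n: int, chi2_upper: float,
                chi2_lower: float) -> tuple[float, float]:
    """CI for sigma^2.

    chi2_upper is chi^2_{alpha/2, n-1} (the larger table value);
    chi2_lower is chi^2_{1-alpha/2, n-1} (the smaller table value).
    """
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

CHI2_05_12 = 21.026  # chi^2_{0.05, 12}, from a standard table
CHI2_95_12 = 5.226   # chi^2_{0.95, 12}, from a standard table

var_lo, var_hi = variance_ci(0.018**2, 13, CHI2_05_12, CHI2_95_12)

# A CI for the standard deviation is the square root of each variance limit.
sd_lo, sd_hi = sqrt(var_lo), sqrt(var_hi)
print(f"90% CI for sigma (mm): ({sd_lo:.4f}, {sd_hi:.4f})")
```

Note that the larger chi-square value gives the lower limit, because it appears in the denominator.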
Exercises
• A random sample of 13 bolts is selected and the inside diameter is
measured. The sample standard deviation of the bolt inside diameter
is 0.018 mm. Construct a 90% CI for the standard deviation.
• An optical firm is concerned about the variability of the refractive
index of a typical glass when its employees grind it into lenses. The
refractive index is approximately normally distributed. A random
sample of 15 glasses is drawn from a large shipment, giving a sample
variance of the refractive index of $1.5 \times 10^{-4}$. Construct a 95% CI
for the standard deviation of the refractive index of all glasses.
Confidence Interval for Ratio of Variances
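• Involves the F distribution.
• It is used in two-sample situations to draw inferences about the
population variances.
• Suppose there are two independent random variables $U$ and $V$ having
chi-square distributions with $\nu_1$ and $\nu_2$ degrees of freedom,
respectively. Then
$$F = \frac{U/\nu_1}{V/\nu_2}$$
• The random variable $F$ follows the F distribution with $\nu_1$ and $\nu_2$ degrees of
freedom, and the value $F_{\alpha,\nu_1,\nu_2}$ with $P(F > F_{\alpha,\nu_1,\nu_2}) = \alpha$ can be
found from the statistical table.
• The statistic
$$F = \frac{S_2^2/\sigma_2^2}{S_1^2/\sigma_1^2}$$
is distributed as $F$ with $n_2 - 1$ and $n_1 - 1$ degrees of freedom.
• Thus, the confidence interval:
$$P\left(f_{1-\alpha/2,\,n_2-1,\,n_1-1} \le F \le f_{\alpha/2,\,n_2-1,\,n_1-1}\right) = 1 - \alpha$$
$$P\left(f_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{S_2^2/\sigma_2^2}{S_1^2/\sigma_1^2} \le f_{\alpha/2,\,n_2-1,\,n_1-1}\right) = 1 - \alpha$$
Using $f_{1-\alpha/2,\,n_2-1,\,n_1-1} = \dfrac{1}{f_{\alpha/2,\,n_1-1,\,n_2-1}}$,
$$\frac{s_1^2}{s_2^2}\,\frac{1}{f_{\alpha/2,\,n_1-1,\,n_2-1}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}\, f_{\alpha/2,\,n_2-1,\,n_1-1}$$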
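The order of the degrees of freedom differs between the two limits, which is easy to mix up. The sketch below makes the order explicit, using the aluminum-can exercise that follows ($n_1 = 10$, $s_1 = 10.1$ kg, $n_2 = 7$, $s_2 = 11.8$ kg, 90% confidence). The values $f_{0.05,9,6} = 4.10$ and $f_{0.05,6,9} = 3.37$ are read from a standard F table rather than computed, and `variance_ratio_ci` is a hypothetical helper name.

```python
def variance_ratio_ci(s1_sq: float, s2_sq: float,
                      f_n1_n2: float, f_n2_n1: float) -> tuple[float, float]:
    """CI for sigma_1^2 / sigma_2^2.

    f_n1_n2 is f_{alpha/2, n1-1, n2-1} (used in the lower limit);
    f_n2_n1 is f_{alpha/2, n2-1, n1-1} (used in the upper limit).
    """
    ratio = s1_sq / s2_sq
    return ratio / f_n1_n2, ratio * f_n2_n1

F_05_9_6 = 4.10  # f_{0.05, 9, 6}, from a standard F table
F_05_6_9 = 3.37  # f_{0.05, 6, 9}, from a standard F table

lower, upper = variance_ratio_ci(10.1**2, 11.8**2, F_05_9_6, F_05_6_9)
print(f"90% CI for sigma_1^2 / sigma_2^2: ({lower:.4f}, {upper:.4f})")
```

If the interval contains 1, the data are consistent with equal population variances at that confidence level.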
Exercises
1. An engineer is studying the axial load of aluminum cans. It is
measured using a plate through which increasing pressure is applied
to the top of the can until it collapses. The maximum weight that the
sides of the can can support is the axial load. Two random samples of
sizes 10 and 7 aluminum cans are selected and the standard
deviations are 10.1 kg and 11.8 kg respectively. Find a 90% CI on the
ratio of variances of the loads.
2. A mechanical engineer in a car manufacturing company is
investigating two types of bumper guards. A random sample of six
guards from each type were mounted on a compact car. Each car
was then run into a concrete wall at 8 km per hour. The following
data are the costs of repairs (in RM):
Bumper Guard 1    305  420  363  485  300  360
Bumper Guard 2    405  345  336  450  400  360
a) Construct a 90% CI for the mean cost of repairs using Bumper Guard
1. State three conditions required for constructing the CI.
b) Assuming that all conditions in part (a) are satisfied, construct a
90% CI for the mean cost of repairs using Bumper Guard 2. What can
you observe from these CIs?
c) Assuming that the variances of the cost of repairs are equal, construct a
95% CI on the difference in mean cost of repairs.
d) What is the point estimate of the variance of the cost of repair for
Bumper Guard 1? Construct a 95% CI for the variance of the cost of repair
for Bumper Guard 1.
e) What is the point estimate of the standard deviation of the cost of
repair for Bumper Guard 2? Construct a 95% CI for the standard
deviation of the cost of repair for Bumper Guard 2.
f) Find a 90% CI for the ratio of the two variances of the cost of repairs.
Past Year Questions
