STAT210 FL17 LCN 6 Edited


STAT 210

Probability and Statistics


Unit 6: Statistical Estimation
Confidence Intervals for a Single Sample
Interval Estimation
The field of statistical inference consists of those methods
used to make decisions or to draw conclusions about a
population.

These methods utilize the information contained in a
sample from the population in drawing conclusions.

Statistical inference may be divided into two major areas:

Parameter estimation

Hypothesis testing

STAT220: Engineering Statistics 2


Estimation
Estimation: The assignment of value(s) to a population parameter based on the
corresponding sample statistic is called estimation.

Point Estimation: The value of a sample statistic that is used to estimate a
population parameter is called a point estimate.

• The sample proportion $\hat{p}$ is a point estimate of the population proportion $p$.
• The sample standard deviation $S$ is a point estimate of the population standard deviation $\sigma$.
• The sample mean $\bar{X}$ is a point estimator of the population mean $\mu$.



Properties of a Good Estimator
• The estimator must be an unbiased estimator. That is, the expected value (the
  mean) of the estimates obtained from samples of a given size is equal to the
  parameter being estimated.

• The estimator must be consistent. For a consistent estimator, as the sample
  size increases, the value of the estimator approaches the value of the
  parameter estimated.

• The estimator must be a relatively efficient estimator. That is, of all the
  statistics that can be used to estimate a parameter, the relatively efficient
  estimator has the smallest variance.
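The first two properties can be illustrated with a small simulation. This is only a sketch: the population (normal, mean 50, standard deviation 10), the function name `simulate_sample_means`, and the sample sizes are hypothetical choices, not part of the course material.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma) and return the sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]

# Hypothetical population: mean 50, standard deviation 10.
for n in (5, 50, 500):
    means = simulate_sample_means(mu=50, sigma=10, n=n, reps=1000)
    # Unbiased: the average of the estimates stays close to mu for every n.
    # Consistent: the spread of the estimates shrinks as n grows.
    print(n, round(statistics.fmean(means), 2), round(statistics.stdev(means), 2))
```

The printed average should stay near 50 for each sample size, while the standard deviation of the estimates should fall as n increases.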
Interval Estimation
An alternative to reporting a single value for the parameter being
estimated is to calculate and report an entire interval of plausible values,
a confidence interval (CI). A confidence level is a measure of the degree of
reliability of the interval.

• Recall that if the population under study is normal, or if the sample size n
  is large, then (exactly in the normal case, approximately otherwise)

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$



Interval Estimation: Large sample CI for population mean
• It follows that

$$P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$$

where $1 - \alpha$ represents the area under the standard normal curve
between $-z_{\alpha/2}$ and $z_{\alpha/2}$.
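Solving the two inequalities inside this probability statement for $\mu$ shows where the interval limits come from. A sketch of the algebra, using the same symbols as above:

```latex
\begin{aligned}
-z_{\alpha/2} \le \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}
&\iff -z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \bar{X}-\mu \le z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \\
&\iff \bar{X}-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X}+z_{\alpha/2}\frac{\sigma}{\sqrt{n}},
\end{aligned}
\qquad\text{so}\qquad
P\left(\bar{X}-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X}+z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1-\alpha .
```

The random quantities here are the endpoints, which depend on $\bar{X}$; the parameter $\mu$ is fixed.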



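Confidence Interval for $\mu$
• A 100(1−α)% confidence interval for the mean $\mu$ of a normal population
  when the value of $\sigma$ is known is given by

$$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$

• If $\sigma$ is unknown, but n is large (n > 30), $\sigma$ can be replaced by
  the sample standard deviation s:

$$\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$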



Interpretation of Confidence Interval

• In the long run, 100(1 − α)% of the intervals computed from successive
  random samples of size $n$ will cover the true value of $\mu$.
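The long-run interpretation can be checked by simulation. The sketch below assumes a normal population with known σ; the function name `coverage` and the values of μ, σ, n, and the number of repetitions are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, conf=0.95, reps=2000, seed=1):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        # One new random sample, summarized by its mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps

# Should be close to 0.95, but not exactly 0.95 for a finite number of repetitions.
print(coverage(mu=1000, sigma=25, n=20))
```

Any single interval either covers μ or it does not; the confidence level describes the procedure across repeated samples.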



Example
The lifetime in hours of a 75-watt light bulb is known to
be normally distributed with $\sigma = 25$ hours. A random
sample of 20 bulbs has a mean life of 1014 hours.
Construct a 95% confidence interval for the mean life.
• Here $\bar{x} = 1014$, $n = 20$, $z_{0.025} = 1.96$, $\sigma = 25$.
• Because $\sigma$ is known, a 95% CI is given by

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \Rightarrow 1014 \pm (1.96)\frac{25}{\sqrt{20}} = 1014 \pm 10.96 \Rightarrow (1003, 1025).$$
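The arithmetic in this example can be reproduced with a few lines of Python. The helper name `z_interval` is hypothetical; the critical value comes from the standard library's `statistics.NormalDist` rather than from a table, so it is 1.95996... instead of the rounded 1.96.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf=0.95):
    """Two-sided CI for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

low, high = z_interval(xbar=1014, sigma=25, n=20, conf=0.95)
print(round(low, 2), round(high, 2))  # 1003.04 1024.96
```

Rounded to whole hours this agrees with the interval (1003, 1025) above.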



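Sample size (n) determination
The half-width (margin of error) $E$ of the CI depends
on the sample size. For a given confidence level and
standard deviation, the sample size n needed to achieve
a margin of error $E$ can be calculated as

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$

where $E = (U - L)/2$ is half the distance between the upper limit U
and the lower limit L of the interval.
• If the result is not an integer, we must round it up.
• If $\sigma$ is unknown, an estimated value from previous
  studies is used in the above formula.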



Example
Suppose that in the previous example we wanted to be 95%
confident that the error in estimating the mean life is less than
five hours. What sample size should be used?

This means that E = 5, so

$$n = \left(\frac{1.96 \times 25}{5}\right)^2 = 96.04,$$

which is rounded up to n = 97.
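The same calculation as a short function, including the round-up step. The name `sample_size_for_mean` is a hypothetical helper, and the critical value is computed with `statistics.NormalDist` instead of being read from a table.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, error, conf=0.95):
    """Smallest n whose z-interval half-width is at most `error`."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    return math.ceil((z * sigma / error) ** 2)    # always round up

print(sample_size_for_mean(sigma=25, error=5, conf=0.95))  # 97
```

Halving the allowed error roughly quadruples the required sample size, because E enters the formula squared.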



Exercise
A civil engineer is analyzing the compressive strength of concrete. Compressive
strength is normally distributed with $\sigma^2 = 1000\ (\text{psi})^2$. A random
sample of 12 specimens has a mean compressive strength of 3250 psi.

a) Construct a 99% confidence interval on mean compressive strength.

b) Suppose that it is desired to estimate the compressive strength with an
   error that is less than 15 psi at 99% confidence. What sample size is
   required?
T intervals:
Small sample CIs for population mean
If the sample size is small, $\sigma$ is unknown, and the
population is normal, then the statistic

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a Student's t distribution with (n − 1) degrees
of freedom (d.f.), denoted by $t_{n-1}$. In such a case
a 100(1−α)% confidence interval for $\mu$ is given by

$$\left(\bar{x} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}\right)$$
One-Sided Confidence Bounds for $\mu$
• A 100(1−α)% upper-confidence bound for $\mu$ is given by

$$U = \bar{X} + z_{\alpha}\frac{\sigma}{\sqrt{n}}$$

• A 100(1−α)% lower-confidence bound for $\mu$ is given by

$$L = \bar{X} - z_{\alpha}\frac{\sigma}{\sqrt{n}}$$

When the value of $\sigma$ is unknown and n is large, it can be
replaced with s.
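A minimal sketch of the one-sided bounds. The function name `one_sided_bounds` is hypothetical, and reusing the light-bulb numbers (mean 1014, σ = 25, n = 20) is an illustrative choice, not something the slides do. The key difference from the two-sided interval is that the critical value is $z_{\alpha}$, not $z_{\alpha/2}$.

```python
import math
from statistics import NormalDist

def one_sided_bounds(xbar, sigma, n, conf=0.95):
    """Return (lower_bound, upper_bound); each is a separate one-sided bound."""
    z = NormalDist().inv_cdf(conf)  # z_alpha: all of alpha sits in one tail
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

lower, upper = one_sided_bounds(xbar=1014, sigma=25, n=20, conf=0.95)
print(round(lower, 1), round(upper, 1))  # roughly 1004.8 and 1023.2
```

Only one of the two values is reported in practice: the lower bound supports a statement of the form "μ is at least L", the upper bound "μ is at most U".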
Properties of t Distributions
Let $t_v$ denote the density function curve for $v$ df. Then
• Each $t_v$ curve is bell-shaped and centered at 0.
• Each $t_v$ curve is spread out more than the
  standard normal (z) curve.
• As $v$ increases, the spread of the corresponding
  $t_v$ curve decreases.
• As $v \to \infty$, the sequence of $t_v$ curves approaches
  the standard normal curve (the z curve is
  called a t curve with df = $\infty$).
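These properties can be seen numerically by evaluating the densities. The sketch below uses the standard formula for the Student t density, which the slides do not state; the function names `t_pdf` and `z_pdf` and the chosen df values are illustrative.

```python
import math

def t_pdf(x, v):
    """Density of the Student t distribution with v degrees of freedom."""
    log_c = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
             - 0.5 * math.log(v * math.pi))
    return math.exp(log_c - (v + 1) / 2 * math.log1p(x * x / v))

def z_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Peak height at 0 and tail height at 3 for increasing df, then the z curve.
for v in (1, 5, 30, 1000):
    print(v, round(t_pdf(0, v), 4), round(t_pdf(3, v), 4))
print("z", round(z_pdf(0), 4), round(z_pdf(3), 4))
```

For small df the peak is lower and the tails are heavier than for the z curve; as df grows both columns move toward the normal values.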



More on t-intervals
• A reasonable assumption in many cases is that
  the underlying distribution is approximately normal.
• A normal probability plot is used to check the
  normality of the underlying distribution.
• In a normal probability plot the data are plotted
  against a theoretical normal distribution in such a
  way that the points should form an approximate
  straight line. Large departures from this
  straight line indicate departures from normality.

• In Minitab: Graph → Probability Plot
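For readers without Minitab, the coordinates of a normal probability plot can be computed directly. This is a sketch under stated assumptions: the plotting position (i − 0.5)/n is one common convention and may differ from what Minitab uses, the function name `normal_plot_points` is hypothetical, and the data are made up for illustration.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Pair each ordered observation with a standard normal quantile."""
    ordered = sorted(data)
    n = len(ordered)
    # Assumed plotting position (i - 0.5)/n; other software may use another rule.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

# Hypothetical measurements, for illustration only.
sample = [9.8, 10.4, 10.1, 9.5, 10.9, 10.0, 9.7, 10.3]
for q, x in normal_plot_points(sample):
    print(f"{q:6.2f}  {x:5.1f}")
```

Plotting the second column against the first should give points close to a straight line when the data are approximately normal.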



Example
Suppose that we wish to estimate the mean CPU service time of a
job. A sample of 10 jobs gives a mean of 10 sec and a standard
deviation of 1.5 sec. Assuming that the CPU service times are
normally distributed, find a 95% confidence interval for the mean
CPU service time.

df = n − 1 = 10 − 1 = 9, so $t_{9,\,0.025} = 2.262$.

$$\bar{x} \pm t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} = 10 \pm 2.262\,\frac{1.5}{\sqrt{10}} = 10 \pm 1.07$$

or 8.93 to 11.07 seconds.
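The calculation can be checked in code. The Python standard library has no t quantile function, so this sketch takes the table value 2.262 as an input; the helper name `t_interval` is hypothetical.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Two-sided t-interval; t_crit is t_{n-1, alpha/2} read from a t table."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

low, high = t_interval(xbar=10, s=1.5, n=10, t_crit=2.262)
print(round(low, 2), round(high, 2))  # 8.93 11.07
```

If SciPy is installed, `scipy.stats.t.ppf(0.975, 9)` should give approximately the same critical value as the table.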



Exercise
1- A random sample of 15 metal rods used in an automobile
suspension system is selected, and the diameter is measured.
The sample mean and standard deviation are 8.23 and 0.025
mm respectively. Assuming that the distribution of rod
diameter is normal, find a 99% confidence interval on mean
rod diameter.



2- A quality control technician is checking the weights of a product.
She takes a random sample of 8 units and weighs each unit. The
observed weights are: 50, 48, 55, 52, 53, 46, 54, 50. Construct a
97% CI for the mean weight of the units.



CI for a Population Proportion (p)
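Let X be the number of successes in n independent
Bernoulli trials with success probability p, so that
X ~ Bin(n, p), and let $\hat{p} = X/n$ be the sample proportion.
Then, from the CLT (Central Limit Theorem), for large n, approximately

$$\hat{p} \sim N\left(\mu_{\hat{p}} = p,\ \sigma^2_{\hat{p}} = \frac{p(1-p)}{n}\right)$$

A 100(1−α)% CI for p is given by

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$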
Example
• A manufacturer of electronic calculators is
  interested in estimating the fraction of defective
  units produced. A random sample of 800
  calculators contains 10 defectives. Compute a
  95% confidence interval for the proportion of
  defective calculators.
• The sample proportion is $\hat{p} = 10/800 = 0.0125$.
• Therefore, a 95% CI for p is

$$0.0125 \pm 1.96\sqrt{\frac{(0.0125)(0.9875)}{800}} = 0.0125 \pm 0.0077$$

or (0.0048, 0.0202).
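The same interval computed in Python, as a sketch of the large-sample formula. The helper name `proportion_interval` is hypothetical; the critical value is taken from `statistics.NormalDist` rather than a table.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, conf=0.95):
    """Large-sample (normal approximation) CI for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

low, high = proportion_interval(successes=10, n=800)
print(round(low, 4), round(high, 4))  # 0.0048 0.0202
```

Because the formula relies on the normal approximation, it is intended for large n.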



Exercise
1. A Hi-Tech company has introduced a new flash disk. Some
dealers and customers have complained that some of the
new disks are defective. In order to investigate this the
company randomly selected 64 disks and tested them.
They found that 8 of the 64 were defective. Find a 95%
confidence interval for the proportion of all flash disks that
are defective.



Sample Size
The sample size required to estimate p so that the error will not
exceed a specified amount E is given by

$$n = \frac{z_{\alpha/2}^2\, p(1-p)}{E^2}$$

If an estimate of p from a previous study or a pilot study is
available, it can be substituted for p in the above equation.
Otherwise, p = 0.5 is used to give the most conservative
sample size, because p(1 − p) is largest at p = 0.5.
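A sketch of this formula with the conservative default p = 0.5. The function name `sample_size_for_proportion` and the inputs (E = 0.03, a prior estimate of 0.2) are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(error, conf=0.95, p=0.5):
    """Smallest n so the CI half-width for a proportion is at most `error`."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)

print(sample_size_for_proportion(error=0.03))         # conservative, p = 0.5: 1068
print(sample_size_for_proportion(error=0.03, p=0.2))  # with a prior estimate: 683
```

A prior estimate away from 0.5 reduces the required sample size, which is why the p = 0.5 choice is called conservative.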
Exercise
A study is to be conducted of the percentage of homeowners who own at
least two television sets. How large a sample is required if we wish to be
99% confident that the error in estimating this quantity is less than 0.015?



Exercise
A civil engineer wishes to determine the percentage of old buildings (20+
years) in a large city. He wishes to be 95% confident that the estimate is
within 2% of the true proportion. A recent study of 180 buildings showed
that 25% are old.

a) How large should the sample size be?

b) If no prior estimate of the proportion is available, how large
   should the sample be?
