Chapter 6 Data Analysis 2018

STATISTIC AND DATA ANALYSIS
Faculty of Engineering-Semester5
Teacher: Elsy Wehbe
2017-2018
Chapter 6: Confidence Interval Estimation
Learning Objectives
In this chapter, you learn:
• To construct and interpret confidence interval estimates for the mean and the proportion
• How to determine the sample size necessary to develop a confidence interval estimate for the mean or proportion
Chapter Outline
• Confidence Intervals for the Population Mean, μ
– when Population Standard Deviation σ is Known
– when Population Standard Deviation σ is Unknown
• Confidence Intervals for the Population Proportion, π
• Determining the Required Sample Size
Point and Interval Estimates
DCOVA
• A point estimate is a single number.
• A confidence interval provides additional information about the variability of the estimate.
[Figure: the point estimate lies between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval.]
Point Estimates
We can estimate a population parameter with a sample statistic (a point estimate):

Population Parameter      Sample Statistic (Point Estimate)
Mean         μ            $\bar{X}$
Proportion   π            p
Confidence Intervals
• How much uncertainty is associated with a point estimate of a population parameter?
• An interval estimate provides more information about a population characteristic than does a point estimate.
• Such interval estimates are called confidence intervals.
Confidence Interval Estimate
• An interval gives a range of values:
– Takes into consideration variation in sample statistics from sample to sample
– Based on observations from 1 sample
– Gives information about closeness to unknown population parameters
– Stated in terms of level of confidence
• e.g. 95% confident, 99% confident
• Can never be 100% confident
Confidence Interval Example
Cereal fill example
• Population has µ = 368 and σ = 15.
• If you take a sample of size n = 25, you know that
– $368 \pm 1.96 \cdot 15/\sqrt{25} = (362.12, 373.88)$ contains 95% of the sample means
– When you don’t know µ, you use $\bar{X}$ to estimate µ
• If $\bar{X} = 362.3$, the interval is $362.3 \pm 1.96 \cdot 15/\sqrt{25} = (356.42, 368.18)$
• Since 356.42 ≤ µ ≤ 368.18 (here µ = 368), the interval based on this sample makes a correct statement about µ.
But what about the intervals from other possible samples of size 25?
Confidence Interval Example (continued)

Sample #   $\bar{X}$   Lower Limit   Upper Limit   Contain µ?
1          362.30      356.42        368.18        Yes
2          369.50      363.62        375.38        Yes
3          360.00      354.12        365.88        No
4          362.12      356.24        368.00        Yes
5          373.88      368.00        379.76        Yes
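The five intervals in the table can be recomputed with a few lines of Python. This is only a sketch: the names `MU`, `SIGMA`, `N`, `Z`, and `rows` are illustrative, and the limits are rounded to two decimals (as on the slide) so that the boundary cases of samples 4 and 5 compare cleanly.

```python
import math

# Values taken from the cereal fill example.
MU, SIGMA, N, Z = 368.0, 15.0, 25, 1.96
half_width = Z * SIGMA / math.sqrt(N)  # 1.96 * 15 / 5 = 5.88

sample_means = [362.30, 369.50, 360.00, 362.12, 373.88]
rows = []
for xbar in sample_means:
    # Round to 2 decimals so limits that land exactly on 368.00 are not
    # misclassified because of floating-point noise.
    lower = round(xbar - half_width, 2)
    upper = round(xbar + half_width, 2)
    rows.append((xbar, lower, upper, lower <= MU <= upper))

for xbar, lower, upper, contains in rows:
    print(f"{xbar:7.2f}  {lower:7.2f}  {upper:7.2f}  {'Yes' if contains else 'No'}")
```

Only the interval built around 360.00 misses µ = 368, matching the table.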
Confidence Interval Example (continued)
• In practice you only take one sample of size n
• In practice you do not know µ so you do not know if the interval actually contains µ
• However, you do know that 95% of the intervals formed in this manner will contain µ
• Thus, based on the one sample you actually selected, you can be 95% confident your interval will contain µ (this is a 95% confidence interval)
Note: 95% confidence is based on the fact that we used Z = 1.96.
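The claim that 95% of intervals formed this way contain µ can be checked by simulation. The sketch below assumes a normally distributed population with the cereal example's µ = 368 and σ = 15 (the slides do not state the population shape); the seed and the number of trials are arbitrary choices made only for illustration.

```python
import math
import random

random.seed(0)  # arbitrary seed, used only to make the run repeatable

MU, SIGMA, N, Z = 368.0, 15.0, 25, 1.96  # cereal fill example values
TRIALS = 10_000                          # hypothetical number of repeated samples

half_width = Z * SIGMA / math.sqrt(N)
hits = 0
for _ in range(TRIALS):
    # Assumption: the population is normal with mean MU and std. dev. SIGMA.
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = sum(sample) / N
    if xbar - half_width <= MU <= xbar + half_width:
        hits += 1

coverage = hits / TRIALS
print(f"Share of intervals containing mu: {coverage:.3f}")
```

The printed share should land close to 0.95; any single interval, however, either contains µ or does not.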
Estimation Process
[Figure: a random sample is drawn from a population whose mean μ is unknown; the sample mean is $\bar{X}$ = 50, and the resulting statement is "I am 95% confident that μ is between 40 and 60."]
General Formula
• The general formula for all confidence intervals is:
Point Estimate ± (Critical Value)(Standard Error)
Where:
• Point Estimate is the sample statistic estimating the population parameter of interest
• Critical Value is a table value based on the sampling distribution of the point estimate and the desired confidence level
• Standard Error is the standard deviation of the point estimate
Confidence Level
• Confidence Level
– Confidence the interval will contain the unknown population parameter
– A percentage (less than 100%)
Confidence Level, (1 − α) (continued)
• Suppose confidence level = 95%
• Also written (1 − α) = 0.95 (so α = 0.05)
• A relative frequency interpretation:
– 95% of all the confidence intervals that can be constructed will contain the unknown true parameter
• A specific interval either will contain or will not contain the true parameter
[Figure: overview of confidence intervals. Two branches: Population Mean (σ Known, σ Unknown) and Population Proportion.]
Confidence Interval for μ (σ Known)
• Assumptions
– Population standard deviation σ is known
– Population is normally distributed
– If population is not normal, use large sample
• Confidence interval estimate:
$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
where $\bar{X}$ is the point estimate,
$Z_{\alpha/2}$ is the normal distribution critical value for a probability of α/2 in each tail, and
$\sigma/\sqrt{n}$ is the standard error
Finding the Critical Value, Zα/2
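• Consider a 95% confidence interval:
$1 - \alpha = 0.95$, so $\alpha = 0.05$ and $\alpha/2 = 0.025$ in each tail
• $Z_{\alpha/2} = \pm 1.96$
[Figure: standard normal curve with area 0.025 in each tail. In Z units the limits are −1.96 and 1.96 around 0; in X units they are the lower and upper confidence limits around the point estimate.]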
Common Levels of Confidence
• Commonly used confidence levels are 90%, 95%, and 99%

Confidence Level   Confidence Coefficient (1 − α)   Zα/2 value
80%                0.80                             1.28
90%                0.90                             1.645
95%                0.95                             1.96
98%                0.98                             2.33
99%                0.99                             2.58
99.8%              0.998                            3.09
99.9%              0.999                            3.29
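Critical values such as these can be computed instead of read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is illustrative and not part of the course material.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value Z_{alpha/2}."""
    alpha = 1 - confidence
    # inv_cdf returns the point with the given cumulative probability,
    # so 1 - alpha/2 leaves alpha/2 in the upper tail.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}  {z_critical(level):.3f}")
```

Rounding the printed values to two or three decimals gives the usual table entries, for example 1.645 for 90% and 1.96 for 95%.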
Intervals and Level of Confidence
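Sampling Distribution of the Mean
• Intervals extend from $\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ to $\bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
• (1 − α) × 100% of intervals constructed contain μ; α × 100% do not.
[Figure: sampling distribution of the mean, centered at $\mu_{\bar{X}} = \mu$, with central area 1 − α and area α/2 in each tail; intervals built around sample means such as $\bar{x}_1$ and $\bar{x}_2$ are drawn beneath it.]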
Example
• A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean resistance of the population.
Example (continued)
• A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.
• Solution:
$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11})$
$= 2.20 \pm 0.2068$
$1.9932 \le \mu \le 2.4068$
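The arithmetic of the circuit example can be reproduced as follows. The function name `z_interval` is a hypothetical helper introduced for illustration; it applies the σ-known formula with the numbers from the example.

```python
import math


def z_interval(xbar, sigma, n, z):
    """Confidence interval for the mean when sigma is known."""
    margin = z * sigma / math.sqrt(n)  # critical value times standard error
    return xbar - margin, xbar + margin


# Circuit example: n = 11, mean 2.20 ohms, sigma = 0.35 ohms, Z = 1.96.
lower, upper = z_interval(xbar=2.20, sigma=0.35, n=11, z=1.96)
print(f"{lower:.4f} <= mu <= {upper:.4f}")
```

Rounded to four decimals, the limits agree with the slide: 1.9932 and 2.4068 ohms.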
Interpretation
• We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms
• Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean
Do You Ever Truly Know σ?
• Probably not!
• In virtually all real world situations, σ is not known.
• If there is a situation where σ is known, then µ is also known (since to calculate σ you need to know µ).
• If you truly know µ there would be no need to gather a sample to estimate it.
Confidence Interval for μ (σ Unknown)
• If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S
• This introduces extra uncertainty, since S is variable from sample to sample
• So we use the t distribution instead of the normal distribution
Confidence Interval for μ (σ Unknown) (continued)
• Assumptions
– Population standard deviation is unknown
– Population is normally distributed
– If population is not normal, use large sample
• Use Student’s t Distribution
• Confidence Interval Estimate:
$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}}$
(where $t_{\alpha/2}$ is the critical value of the t distribution with n − 1 degrees of freedom and an area of α/2 in each tail)
Student’s t Distribution
• The t is a family of distributions
• The $t_{\alpha/2}$ value depends on degrees of freedom (d.f.)
– Number of observations that are free to vary after sample mean has been calculated
d.f. = n − 1
Degrees of Freedom (df)
Idea: Number of observations that are free to vary after sample mean has been calculated
Example: Suppose the mean of 3 numbers is 8.0
Let X1 = 7
Let X2 = 8
What is X3?
If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary)
Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2
(2 values can be any numbers, but the third is not free to vary for a given mean)
Student’s t Distribution
Note: t → Z as n increases
t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal
[Figure: density curves centered at 0 for the standard normal (t with df = ∞), t (df = 13), and t (df = 5).]
Student’s t Table
Upper Tail Area

df    .10     .05     .025
1     3.078   6.314   12.706
2     1.886   2.920   4.303
3     1.638   2.353   3.182

Let: n = 3, so df = n − 1 = 2
With α = 0.10, α/2 = 0.05, so $t_{\alpha/2} = 2.920$
The body of the table contains t values, not probabilities
[Figure: t curve with an upper tail area of α/2 = 0.05 to the right of t = 2.920.]
Selected t distribution values
With comparison to the Z value

Confidence Level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   Z (∞ d.f.)
0.80               1.372         1.325         1.310         1.28
0.90               1.812         1.725         1.697         1.645
0.95               2.228         2.086         2.042         1.96
0.99               3.169         2.845         2.750         2.58

Note: t → Z as n increases
Example of t distribution confidence interval
A random sample of n = 25 has $\bar{X}$ = 50 and S = 8. Form a 95% confidence interval for μ
– d.f. = n – 1 = 24, so $t_{\alpha/2} = t_{0.025} = 2.0639$
The confidence interval is
$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}} = 50 \pm (2.0639) \frac{8}{\sqrt{25}}$
46.698 ≤ μ ≤ 53.302
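The same calculation can be sketched in Python. The standard library has no t quantile function, so the critical value 2.0639 (24 degrees of freedom) is taken from the table as in the example rather than computed; `t_interval` and `T_CRIT_24DF` are illustrative names.

```python
import math


def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown."""
    margin = t_crit * s / math.sqrt(n)  # t critical value times estimated standard error
    return xbar - margin, xbar + margin


T_CRIT_24DF = 2.0639  # table value for alpha/2 = 0.025 and 24 degrees of freedom

lower, upper = t_interval(xbar=50, s=8, n=25, t_crit=T_CRIT_24DF)
print(f"{lower:.3f} <= mu <= {upper:.3f}")
```

The margin is 2.0639 × 8/5 = 3.302, which gives the limits 46.698 and 53.302.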
Example of t distribution confidence interval
(continued)
• Interpreting this interval requires the assumption that the population you are sampling from is approximately a normal distribution (especially since n is only 25).
Confidence Intervals for the
Population Proportion, π
• An interval estimate for the population proportion (π) can be calculated by adding an allowance for uncertainty to the sample proportion (p)
Confidence Intervals for the
Population Proportion, π
(continued)
• Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation
$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$
• We will estimate this with sample data:
$\sqrt{\frac{p(1-p)}{n}}$
Confidence Interval Endpoints
• Upper and lower confidence limits for the population proportion are calculated with the formula
$p \pm Z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}}$
• where
– $Z_{\alpha/2}$ is the standard normal value for the level of confidence desired
– p is the sample proportion
– n is the sample size
• Note: must have X > 5 and n – X > 5
Example
• A random sample of 100 people shows that 25 are left-handed.
• Form a 95% confidence interval for the true proportion of left-handers
Example (continued)
• A random sample of 100 people shows that 25 are left-handed. Form a 95% confidence interval for the true proportion of left-handers.
• Make sure the sample is big enough: np = 100 × 0.25 = 25 > 5 and n(1 − p) = 100 × 0.75 = 75 > 5
• Solution:
$p \pm Z_{\alpha/2} \sqrt{p(1-p)/n} = 25/100 \pm 1.96 \sqrt{0.25(0.75)/100}$
$= 0.25 \pm 1.96\,(0.0433)$
$0.1651 \le \pi \le 0.3349$
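A short Python sketch of the proportion interval, including the sample-size check stated with the formula (X > 5 and n − X > 5). The function name `proportion_interval` is hypothetical and chosen only for this illustration.

```python
import math


def proportion_interval(x, n, z):
    """Confidence interval for a proportion from x events in n trials."""
    # Check from the slides: need X > 5 and n - X > 5 for the normal approximation.
    if x <= 5 or n - x <= 5:
        raise ValueError("sample too small for the normal approximation")
    p = x / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Left-hander example: 25 of 100 people, 95% confidence (Z = 1.96).
lower, upper = proportion_interval(x=25, n=100, z=1.96)
print(f"{lower:.4f} <= pi <= {upper:.4f}")
```

The limits round to 0.1651 and 0.3349, the values obtained by hand above.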
Interpretation
• We are 95% confident that the true percentage of left-handers in the population is between 16.51% and 33.49%.
• Although the interval from 0.1651 to 0.3349 may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.
Determining Sample Size
[Figure: determining sample size, with two branches: for the mean and for the proportion.]
Sampling Error
• The required sample size can be found to obtain a desired margin of error (e) with a specified level of confidence (1 − α)
• The margin of error is also called sampling error
– the amount of imprecision in the estimate of the population parameter
– the amount added and subtracted to the point estimate to form the confidence interval
Determining Sample Size for the Mean
• In the interval $\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$, the sampling error (margin of error) is
$e = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
• Now solve for n to get
$n = \frac{Z_{\alpha/2}^2 \sigma^2}{e^2}$
Determining Sample Size for the Mean (continued)
• To determine the required sample size for the mean, you must know:
– The desired level of confidence (1 − α), which determines the critical value, $Z_{\alpha/2}$
– The acceptable sampling error, e
– The standard deviation, σ
Required Sample Size Example
If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?
$n = \frac{Z_{\alpha/2}^2 \sigma^2}{e^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19$
So the required sample size is n = 220
(Always round up)
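The "always round up" rule maps directly onto `math.ceil`. In this sketch, `sample_size_for_mean` is an illustrative helper name; it evaluates the sample size formula for the mean and rounds the result up to the next whole observation.

```python
import math


def sample_size_for_mean(z, sigma, e):
    """Smallest n whose margin of error z * sigma / sqrt(n) is at most e."""
    raw = (z * sigma / e) ** 2  # same as z^2 * sigma^2 / e^2
    return math.ceil(raw)       # always round up


# Example: sigma = 45, margin of error 5, 90% confidence (Z = 1.645).
n_required = sample_size_for_mean(z=1.645, sigma=45, e=5)
print(n_required)
```

The unrounded value is about 219.19, so the required sample size is 220.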
If σ is unknown
• If unknown, σ can be estimated when determining the required sample size
– Use a value for σ that is expected to be at least as large as the true σ
– Select a pilot sample and estimate σ with the sample standard deviation, S
Determining Sample Size for the Proportion
• The sampling error is
$e = Z_{\alpha/2} \sqrt{\frac{\pi(1-\pi)}{n}}$
• Now solve for n to get
$n = \frac{Z_{\alpha/2}^2\, \pi(1-\pi)}{e^2}$
Determining Sample Size for the Proportion (continued)
• To determine the required sample size for the proportion, you must know:
– The desired level of confidence (1 − α), which determines the critical value, $Z_{\alpha/2}$
– The acceptable sampling error, e
– The true proportion of events of interest, π
• π can be estimated with a pilot sample if necessary (or conservatively use 0.5 as an estimate of π)
Required Sample Size Example
How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?
(Assume a pilot sample yields p = 0.12)
Required Sample Size Example (continued)
Solution:
For 95% confidence, use $Z_{\alpha/2}$ = 1.96
e = 0.03
p = 0.12, so use this to estimate π
$n = \frac{Z_{\alpha/2}^2\, \pi(1-\pi)}{e^2} = \frac{(1.96)^2 (0.12)(1-0.12)}{(0.03)^2} = 450.75$
So use n = 451
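A Python sketch of the same calculation, with `sample_size_for_proportion` as a hypothetical helper name. The second call illustrates the conservative choice π = 0.5 mentioned earlier, which gives the largest sample size for a given margin of error.

```python
import math


def sample_size_for_proportion(z, pi, e):
    """Smallest n whose margin of error z * sqrt(pi * (1 - pi) / n) is at most e."""
    raw = z ** 2 * pi * (1 - pi) / e ** 2
    return math.ceil(raw)  # always round up


# Pilot estimate p = 0.12, margin of error 0.03, 95% confidence (Z = 1.96).
n_pilot = sample_size_for_proportion(z=1.96, pi=0.12, e=0.03)

# Conservative planning value pi = 0.5 (no pilot sample needed).
n_conservative = sample_size_for_proportion(z=1.96, pi=0.5, e=0.03)

print(n_pilot, n_conservative)
```

With the pilot estimate the unrounded result is about 450.75, so n = 451; the conservative value π = 0.5 more than doubles the requirement.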
Ethical Issues
• A confidence interval estimate (reflecting
sampling error) should always be included
when reporting a point estimate
• The level of confidence should always be
reported
• The sample size should be reported
• An interpretation of the confidence interval
estimate should also be provided
Chapter Summary
• Introduced the concept of confidence intervals
• Discussed point estimates
• Developed confidence interval estimates
• Created confidence interval estimates for the mean (σ known and unknown)
• Created confidence interval estimates for the proportion
• Determined required sample size for mean and proportion confidence interval estimates with a desired margin of error
• Addressed confidence interval estimation and ethical issues
