Chapter 6 Data Analysis 2018


STATISTICS AND DATA ANALYSIS

Faculty of Engineering - Semester 5
Teacher: Elsy Wehbe
2017-2018

Chapter 6: Confidence Interval Estimation
Learning Objectives
In this chapter, you learn:

• To construct and interpret confidence interval estimates for the mean and the proportion
• How to determine the sample size necessary to develop a confidence interval estimate for the mean or proportion

Chap 8-2
Chapter Outline

• Confidence Intervals for the Population Mean, μ
– when Population Standard Deviation σ is Known
– when Population Standard Deviation σ is Unknown
• Confidence Intervals for the Population Proportion, π
• Determining the Required Sample Size

Chap 8-3
Point and Interval Estimates
DCOVA

• A point estimate is a single number.
• A confidence interval provides additional information about the variability of the estimate.

[Figure: a number line marking the lower confidence limit, the point estimate, and the upper confidence limit; the distance between the two limits is the width of the confidence interval]

Chap 8-4
Point Estimates
DCOVA

We can estimate a population parameter with a sample statistic (a point estimate):

Population Parameter    Symbol    Sample Statistic (Point Estimate)
Mean                    μ         X̄
Proportion              π         p

Chap 8-5
Confidence Intervals
DCOVA

• How much uncertainty is associated with a point estimate of a population parameter?
• An interval estimate provides more information about a population characteristic than does a point estimate.
• Such interval estimates are called confidence intervals.

Chap 8-6
Confidence Interval Estimate
DCOVA

• An interval gives a range of values:
– Takes into consideration variation in sample
statistics from sample to sample
– Based on observations from 1 sample
– Gives information about closeness to unknown
population parameters
– Stated in terms of level of confidence
• e.g. 95% confident, 99% confident
• Can never be 100% confident

Chap 8-7
Confidence Interval Example
DCOVA
Cereal fill example
• Population has µ = 368 and σ = 15.
• If you take a sample of size n = 25 you know
– 368 ± 1.96 × 15/√25 = (362.12, 373.88) contains 95% of the sample means
– When you don’t know µ, you use X̄ to estimate µ
• If X̄ = 362.3 the interval is 362.3 ± 1.96 × 15/√25 = (356.42, 368.18)
• Since µ = 368 lies between 356.42 and 368.18, the interval based on this sample makes a correct statement about µ.

But what about the intervals from other possible samples of size 25?

Chap 8-8
Confidence Interval Example
DCOVA
(continued)

Sample #    X̄         Lower Limit    Upper Limit    Contains µ?
1           362.30    356.42         368.18         Yes
2           369.50    363.62         375.38         Yes
3           360.00    354.12         365.88         No
4           362.12    356.24         368.00         Yes
5           373.88    368.00         379.76         Yes
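The table can be reproduced with a few lines of Python. This is only a sketch: the function name `interval_for` and the choice to round the limits to two decimals (as the slide does, so that the boundary samples 4 and 5 count as containing µ) are assumptions made for illustration.

```python
from math import sqrt

# Values taken from the cereal-fill slide.
MU, SIGMA, N, Z = 368.0, 15.0, 25, 1.96

def interval_for(sample_mean):
    """Return (lower, upper) limits, rounded to two decimals as on the slide."""
    margin = Z * SIGMA / sqrt(N)  # 1.96 * 15 / 5 = 5.88
    return round(sample_mean - margin, 2), round(sample_mean + margin, 2)

sample_means = [362.30, 369.50, 360.00, 362.12, 373.88]
rows = []
for xbar in sample_means:
    lower, upper = interval_for(xbar)
    # The interval "contains mu" when mu lies between the two limits.
    rows.append((xbar, lower, upper, lower <= MU <= upper))

for xbar, lower, upper, contains in rows:
    print(f"{xbar:7.2f} {lower:7.2f} {upper:7.2f} {'Yes' if contains else 'No'}")
```

Only sample 3 misses µ, because its mean falls more than 5.88 away from 368.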


Chap 8-9
Confidence Interval Example
DCOVA

(continued)

• In practice you only take one sample of size n
• In practice you do not know µ so you do not
know if the interval actually contains µ
• However, you do know that 95% of the intervals
formed in this manner will contain µ
• Thus, based on the one sample you actually
selected, you can be 95% confident your interval
will contain µ (this is a 95% confidence interval)

Note: 95% confidence is based on the fact that we used Z = 1.96.
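The claim that 95% of intervals formed this way contain µ can be checked with a small simulation. The sketch below assumes a normal population with the cereal-fill values (µ = 368, σ = 15, n = 25); the function name `coverage`, the number of trials, and the seed are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(mu=368.0, sigma=15.0, n=25, level=0.95, trials=2000, seed=1):
    """Estimate the fraction of z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

print(coverage())  # expected to be close to 0.95
```

Any single interval either contains µ or does not; the 95% describes the long-run behaviour of the procedure.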


Chap 8-10
Estimation Process
DCOVA

[Figure: a random sample with mean X̄ = 50 is drawn from a population whose mean μ is unknown, leading to the statement "I am 95% confident that μ is between 40 and 60."]

Chap 8-11
General Formula
DCOVA

• The general formula for all confidence intervals is:

Point Estimate ± (Critical Value)(Standard Error)

where:
• Point Estimate is the sample statistic estimating the population parameter of interest
• Critical Value is a table value based on the sampling distribution of the point estimate and the desired confidence level
• Standard Error is the standard deviation of the point estimate

Chap 8-12
Confidence Level
DCOVA

• Confidence Level
– Confidence the interval will contain
the unknown population parameter
– A percentage (less than 100%)

Chap 8-13
Confidence Level, (1 - α)
DCOVA

(continued)

• Suppose confidence level = 95%
• Also written (1 - α) = 0.95 (so α = 0.05)
• A relative frequency interpretation:
– 95% of all the confidence intervals that can
be constructed will contain the unknown true
parameter
• A specific interval either will contain or
will not contain the true parameter

Chap 8-14
Confidence Intervals
DCOVA

[Diagram: Confidence Intervals branch into Population Mean (σ Known or σ Unknown) and Population Proportion]

Chap 8-15
Confidence Interval for μ
(σ Known) DCOVA

• Assumptions
– Population standard deviation σ is known
– Population is normally distributed
– If population is not normal, use large sample

• Confidence interval estimate:

$$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

where
– X̄ is the point estimate
– Zα/2 is the normal distribution critical value for a probability of α/2 in each tail
– σ/√n is the standard error

Chap 8-16
Finding the Critical Value, Zα/2
DCOVA

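• Consider a 95% confidence interval: 1 - α = 0.95, so α = 0.05 and α/2 = 0.025 in each tail
• The critical value is Zα/2 = 1.96

[Figure: standard normal curve with area α/2 = 0.025 in each tail. Z units: -Zα/2 = -1.96, 0, Zα/2 = 1.96. X units: lower confidence limit, point estimate, upper confidence limit]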
Chap 8-17
Common Levels of Confidence
DCOVA

• Commonly used confidence levels are 90%, 95%, and 99%

Confidence Level    Confidence Coefficient (1 - α)    Zα/2 value
80%                 0.80                              1.28
90%                 0.90                              1.645
95%                 0.95                              1.96
98%                 0.98                              2.33
99%                 0.99                              2.58
99.8%               0.998                             3.09
99.9%               0.999                             3.29
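Critical values like these can also be computed instead of read from a table. A minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an assumption for illustration.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Return Z_{alpha/2}, the value leaving alpha/2 in the upper tail."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}  {z_critical(level):.3f}")
```

Printed tables usually round these values to two or three decimals.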
Chap 8-18
Intervals and Level of Confidence
DCOVA
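Sampling Distribution of the Mean

[Figure: sampling distribution of X̄ centered at μx̄ = μ, with area 1 - α in the middle and α/2 in each tail; below it, confidence intervals built around different sample means x̄1, x̄2, ...]

• Intervals extend from X̄ - Zα/2 σ/√n to X̄ + Zα/2 σ/√n
• (1 - α) × 100% of intervals constructed contain μ; α × 100% do not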
Chap 8-19
Example
DCOVA

• A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean resistance of the population.

Chap 8-20
Example DCOVA

(continued)

• Solution:

$$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 \, (0.35/\sqrt{11}) = 2.20 \pm 0.2068$$

$$1.9932 \le \mu \le 2.4068$$
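The same calculation as a short Python sketch, assuming σ is known and the population is normal as the slide states. The function name `z_interval` is hypothetical, and the critical value is computed rather than looked up, so it is 1.95996... instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence_level=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)  # critical value times standard error
    return xbar - margin, xbar + margin

# Circuit example: n = 11, mean 2.20 ohms, sigma = 0.35 ohms.
lower, upper = z_interval(2.20, 0.35, 11)
print(f"{lower:.4f} <= mu <= {upper:.4f}")
```

To four decimals the limits agree with the hand calculation above.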
Chap 8-21
Interpretation DCOVA

• We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms
• Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean

Chap 8-22
Confidence Intervals
DCOVA

[Diagram: Confidence Intervals branch into Population Mean (σ Known or σ Unknown) and Population Proportion]

Chap 8-23
Do You Ever Truly Know σ?
• Probably not!

• In virtually all real world situations, σ is not known.
• If there is a situation where σ is known, then µ is also known (since to calculate σ you need to know µ).
• If you truly know µ there would be no need to gather a sample to estimate it.

Chap 8-24
Confidence Interval for μ
(σ Unknown) DCOVA

• If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S
• This introduces extra uncertainty, since S is variable from sample to sample
• So we use the t distribution instead of the normal distribution

Chap 8-25
Confidence Interval for μ
(σ Unknown)
(continued)
• Assumptions
– Population standard deviation is unknown
– Population is normally distributed
– If population is not normal, use large sample
• Use Student’s t Distribution
• Confidence Interval Estimate:

$$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}}$$

(where tα/2 is the critical value of the t distribution with n - 1 degrees of freedom and an area of α/2 in each tail)
Chap 8-26
Student’s t Distribution
DCOVA

• The t is a family of distributions
• The tα/2 value depends on degrees of freedom (d.f.)
– Number of observations that are free to vary after sample mean has been calculated

d.f. = n - 1

Chap 8-27
Degrees of Freedom (df)
DCOVA

Idea: Number of observations that are free to vary after sample mean has been calculated

Example: Suppose the mean of 3 numbers is 8.0

Let X1 = 7
Let X2 = 8
What is X3?

If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary)

Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2
(2 values can be any numbers, but the third is not free to vary for a given mean)
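The constraint can be written out directly: once the mean and n - 1 of the values are fixed, the last value is determined. A tiny sketch, with the helper name `last_value` chosen only for illustration:

```python
def last_value(known_values, mean, n):
    """Return the one remaining value that makes n values average to `mean`."""
    # The n values must sum to n * mean, so the last one is whatever is left.
    return n * mean - sum(known_values)

x3 = last_value([7, 8], 8.0, 3)
print(x3)  # 9.0
```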
Chap 8-28
Student’s t Distribution
DCOVA
Note: t → Z as n increases

• t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal

[Figure: density curves centered at 0 for the standard normal (t with df = ∞), t (df = 13), and t (df = 5)]
Chap 8-29
Student’s t Table
DCOVA

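Upper Tail Area

df    .10      .05      .025
1     3.078    6.314    12.706
2     1.886    2.920    4.303
3     1.638    2.353    3.182

Let: n = 3, df = n - 1 = 2, α = 0.10, α/2 = 0.05
Then tα/2 = 2.920 (row df = 2, column .05)

The body of the table contains t values, not probabilities

[Figure: t distribution curve with upper tail area α/2 = 0.05 to the right of t = 2.920]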
Chap 8-30
Selected t distribution values
DCOVA
With comparison to the Z value

Confidence Level    t (10 d.f.)    t (20 d.f.)    t (30 d.f.)    Z (∞ d.f.)
0.80                1.372          1.325          1.310          1.28
0.90                1.812          1.725          1.697          1.645
0.95                2.228          2.086          2.042          1.96
0.99                3.169          2.845          2.750          2.58

Note: t → Z as n increases

Chap 8-31
Example of t distribution confidence interval
DCOVA

A random sample of n = 25 has X̄ = 50 and S = 8. Form a 95% confidence interval for μ

– d.f. = n – 1 = 24, so tα/2 = t0.025 = 2.0639

The confidence interval is

$$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}} = 50 \pm (2.0639) \frac{8}{\sqrt{25}}$$

46.698 ≤ μ ≤ 53.302
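A sketch of the same t-based interval in Python. To my knowledge the standard library has no Student's t quantile function, so this example takes the critical value 2.0639 from the table as an input. The function name `t_interval` is hypothetical.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is the t critical value for n - 1 degrees of freedom,
    looked up in a t table (it is not computed here).
    """
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Slide example: n = 25, mean 50, S = 8, t(0.025, 24 d.f.) = 2.0639.
lower, upper = t_interval(50, 8, 25, 2.0639)
print(f"{lower:.3f} <= mu <= {upper:.3f}")
```

If a statistics package is available, the critical value could instead come from its inverse t distribution function.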

Chap 8-32
Example of t distribution confidence interval
(continued)

• Interpreting this interval requires the assumption that the population you are sampling from is approximately a normal distribution (especially since n is only 25).

Chap 8-33
Confidence Intervals
DCOVA

[Diagram: Confidence Intervals branch into Population Mean (σ Known or σ Unknown) and Population Proportion]

Chap 8-34
Confidence Intervals for the
Population Proportion, π
DCOVA

• An interval estimate for the population proportion (π) can be calculated by adding an allowance for uncertainty to the sample proportion (p)

Chap 8-35
Confidence Intervals for the
Population Proportion, π
(continued)

• Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation

$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$$

• We will estimate this with sample data

$$\sqrt{\frac{p(1 - p)}{n}}$$
Chap 8-36
Confidence Interval Endpoints
DCOVA

• Upper and lower confidence limits for the population proportion are calculated with the formula

$$p \pm Z_{\alpha/2} \sqrt{\frac{p(1 - p)}{n}}$$

• where
– Zα/2 is the standard normal value for the level of confidence desired
– p is the sample proportion
– n is the sample size
• Note: must have X > 5 and n – X > 5, where X = np is the number of items of interest in the sample
Chap 8-37
Example
DCOVA

• A random sample of 100 people shows that 25 are left-handed.
• Form a 95% confidence interval for the true proportion of left-handers

Chap 8-38
Example DCOVA

(continued)
• Make sure the sample is big enough: np = 100 × 0.25 = 25 > 5 and n(1 - p) = 100 × 0.75 = 75 > 5

$$p \pm Z_{\alpha/2} \sqrt{p(1 - p)/n} = 25/100 \pm 1.96 \sqrt{0.25(0.75)/100} = 0.25 \pm 1.96 \, (0.0433)$$

$$0.1651 \le \pi \le 0.3349$$
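The left-hander example can be sketched in Python as follows. The function name `proportion_interval` is an assumption for illustration. It applies the sample-size check from the earlier slide (X > 5 and n - X > 5) before using the normal approximation.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence_level=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    if successes <= 5 or n - successes <= 5:
        raise ValueError("sample too small for the normal approximation")
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sqrt(p * (1 - p) / n)  # critical value times standard error
    return p - margin, p + margin

# 25 left-handers in a random sample of 100 people.
lower, upper = proportion_interval(25, 100)
print(f"{lower:.4f} <= pi <= {upper:.4f}")
```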
Chap 8-39
Interpretation
DCOVA

• We are 95% confident that the true percentage of left-handers in the population is between 16.51% and 33.49%.
• Although the interval from 0.1651 to 0.3349 may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.

Chap 8-40
Determining Sample Size
DCOVA

[Diagram: Determining Sample Size branches into For the Mean and For the Proportion]

Chap 8-41
Sampling Error
DCOVA

• The required sample size can be found to obtain a desired margin of error (e) with a specified level of confidence (1 - α)
• The margin of error is also called sampling error
– the amount of imprecision in the estimate of the population parameter
– the amount added and subtracted to the point estimate to form the confidence interval

Chap 8-42
Determining Sample Size
DCOVA

Determining Sample Size: For the Mean

Confidence interval:

$$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

Sampling error (margin of error):

$$e = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

Chap 8-43
Determining Sample Size
DCOVA
(continued)
Determining Sample Size: For the Mean

$$e = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

Now solve for n to get

$$n = \frac{Z_{\alpha/2}^2 \, \sigma^2}{e^2}$$
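For readers who want the intermediate algebra, here is one way to write the "solve for n" step. It starts from the margin-of-error formula for the mean: isolate √n, then square both sides.

```latex
\begin{align*}
e &= Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \\
\sqrt{n} &= \frac{Z_{\alpha/2}\,\sigma}{e} \\
n &= \frac{Z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}}
\end{align*}
```

Because n appears under a square root, halving the margin of error e requires four times the sample size.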

Chap 8-44
Determining Sample Size
DCOVA
(continued)

• To determine the required sample size for the mean, you must know:
– The desired level of confidence (1 - α), which determines the critical value, Zα/2
– The acceptable sampling error, e
– The standard deviation, σ

Chap 8-45
Required Sample Size Example
DCOVA

If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?

$$n = \frac{Z_{\alpha/2}^2 \, \sigma^2}{e^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19$$

So the required sample size is n = 220 (always round up)
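A short Python sketch of this sample-size calculation. The function name `sample_size_for_mean` is hypothetical, and the critical value is computed (about 1.6449) rather than taken as the rounded 1.645, which does not change the rounded-up answer here.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence_level):
    """Smallest n giving margin of error at most e when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n_exact = (z ** 2) * (sigma ** 2) / (e ** 2)
    return ceil(n_exact)  # always round up to the next whole observation

n_required = sample_size_for_mean(sigma=45, e=5, confidence_level=0.90)
print(n_required)  # 220
```

Rounding up rather than to the nearest integer keeps the margin of error from exceeding e.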

Chap 8-46
If σ is unknown
DCOVA

• If unknown, σ can be estimated when determining the required sample size
– Use a value for σ that is expected to be at least as large as the true σ
– Select a pilot sample and estimate σ with the sample standard deviation, S

Chap 8-47
Determining Sample Size
(continued)
Determining Sample Size: For the Proportion

$$e = Z_{\alpha/2} \sqrt{\frac{\pi(1 - \pi)}{n}}$$

Now solve for n to get

$$n = \frac{Z_{\alpha/2}^2 \, \pi(1 - \pi)}{e^2}$$

Chap 8-48
Determining Sample Size
DCOVA

(continued)

• To determine the required sample size for the proportion, you must know:
– The desired level of confidence (1 - α), which determines the critical value, Zα/2
– The acceptable sampling error, e
– The true proportion of events of interest, π
• π can be estimated with a pilot sample if necessary (or conservatively use 0.5 as an estimate of π)

Chap 8-49
Required Sample Size Example
DCOVA

How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?
(Assume a pilot sample yields p = 0.12)

Chap 8-50
Required Sample Size Example
(continued)

Solution:

For 95% confidence, use Zα/2 = 1.96
e = 0.03
p = 0.12, so use this to estimate π

$$n = \frac{Z_{\alpha/2}^2 \, \pi(1 - \pi)}{e^2} = \frac{(1.96)^2 (0.12)(1 - 0.12)}{(0.03)^2} \approx 450.75$$

So use n = 451
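The proportion case can be sketched the same way. The function name `sample_size_for_proportion` is an assumption for illustration. The second call shows the conservative choice π = 0.5 mentioned earlier, which gives the largest possible n for a given e and confidence level.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(pi_estimate, e, confidence_level):
    """Smallest n giving margin of error at most e for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n_exact = (z ** 2) * pi_estimate * (1 - pi_estimate) / (e ** 2)
    return ceil(n_exact)  # always round up

# Pilot estimate p = 0.12, margin of error 0.03, 95% confidence.
n_pilot = sample_size_for_proportion(0.12, 0.03, 0.95)
# Conservative planning value when no pilot estimate is available.
n_conservative = sample_size_for_proportion(0.5, 0.03, 0.95)
print(n_pilot, n_conservative)  # 451 1068
```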

Chap 8-51
Ethical Issues
• A confidence interval estimate (reflecting
sampling error) should always be included
when reporting a point estimate
• The level of confidence should always be
reported
• The sample size should be reported
• An interpretation of the confidence interval
estimate should also be provided
Chap 8-52
Chapter Summary

• Introduced the concept of confidence intervals
• Discussed point estimates
• Developed confidence interval estimates
• Created confidence interval estimates for the mean (σ
known and unknown)
• Created confidence interval estimates for the proportion
• Determined required sample size for mean and
proportion confidence interval estimates with a desired
margin of error
• Addressed confidence interval estimation and ethical
issues

Chap 8-53
