Population and Sample: Parameters and Statistics


Population
Parameters: μ, σ², p

Sample
Statistics: x̄, S², p̂

Statistical Inference
• By constructing confidence intervals on population parameters
• Or by setting up a hypothesis test on a population parameter.
Quick Recap:
• Chapter 1: Ways of obtaining an appropriate sample of size n from a population of size N.
• Chapter 2 (Descriptive Statistics): Organizing the data collected from the sample, summarizing it and representing it graphically…
• Chapter 7 (Inferential Statistics): Drawing conclusions about a population parameter (μ, σ², p) by using a statistic (x̄, S², p̂) calculated from a sample.
Estimation – Ch 7
Hypothesis Tests – Ch 8
Statistical inference is the process by which we acquire
information and draw conclusions about populations from samples.
[Figure: Data → Statistics → Information]

Examples:

1) The government of a country wants to estimate the proportion of voters (p) in the country that approve of their economic policies.
2) A manufacturer of car batteries wishes to estimate the average lifetime (µ)
of their batteries.
3) A paint company is interested in estimating the variability (as measured by
the variance, σ²) in the drying time of their paints.
The quantities p, µ and σ² that are to be estimated are called
population parameters.
Recall: A sample estimate of a population parameter is called
a statistic.

The table below gives examples of some commonly used parameters together with their statistics:

Parameter    Statistic
p            p̂
μ            x̄
σ²           S²
• A point estimate of a parameter is a single value (point) that
estimates a parameter.

• An interval estimate of a parameter is a range of values from L (lower value) to U (upper value) that estimates a parameter.

We say (with some _____% certainty) that the population parameter of interest is between some lower and upper values.
Example 1:
Suppose the mean time it takes to serve customers at a supermarket checkout counter is to be estimated.
For n = 100 customers, x̄ is calculated to be 2.283 minutes. (Point estimate)

An alternative statement is: We are 95% confident that the mean service time will be from 1.637 minutes to 4.009 minutes. (Interval estimate)

We are going to focus on methods of obtaining an interval estimate for each of the parameters μ, σ² and p.
A confidence interval is a range of values from L (lower value) to U
(upper value) that estimates a population parameter θ with
(1−α)100% confidence.
• θ (pronounced “theta”) can be the parameters μ, σ² or p.
• L is the lower confidence limit.
• U is the upper confidence limit.
• The interval (L, U) is called the confidence interval.
• 1−α is called the confidence coefficient.
• (1−α)100 is called the confidence percentage. It is the percentage of
confidence that the interval will contain θ, the parameter that is being
estimated.
Example 2:
In Example 1, it was stated that we are 95% confident that the mean
service time will be from 1.637 minutes to 4.009 minutes. The
interval of values (1.637, 4.009) is an interval estimate of the
parameter μ.

• θ, the parameter that is being estimated, is the population mean μ.
• The confidence percentage is (1−α)100 = 95
• The confidence coefficient is (1−α) = 0.95
• α = 0.05
• L = 1.637 and U = 4.009
• The confidence interval is the interval (1.637, 4.009).
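
The bookkeeping in Example 2 (confidence percentage, confidence coefficient, α) can be sketched in a few lines of Python. The function name `alpha_from_confidence` is only an illustrative choice and does not come from the slides.

```python
def alpha_from_confidence(percent: float) -> float:
    """Return alpha for a (1 - alpha)100% confidence interval.

    `percent` is the confidence percentage, e.g. 95 for a 95% interval.
    """
    if not 0 < percent < 100:
        raise ValueError("confidence percentage must be between 0 and 100")
    coefficient = percent / 100   # confidence coefficient, 1 - alpha
    return 1 - coefficient


alpha = alpha_from_confidence(95)
print(f"confidence coefficient = {1 - alpha:.2f}")  # 0.95
print(f"alpha = {alpha:.2f}")                       # 0.05
```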
How do we find L and U when estimating the parameters µ, p and σ²?
From Example 1: x̄ = 2.283

This value of 2.283 is not likely to be exactly equal to μ, but it should be close. μ may be larger than x̄ = 2.283 or may be smaller than it.
• Maybe μ is smaller than 2.283 and is actually 1.283.
• Maybe μ is larger than 2.283 and is actually 3.283.

[Figure: number line marked at 1.283, 2.283 and 3.283]

Therefore, it is better to get an interval estimate for μ… e.g. (1.283 ; 3.283)

When estimating μ, there is always some error (E) involved. This is the distance between the true value of μ and its estimate, x̄.
Note: Distance is always positive.

If μ = 1.283, then
E = distance between x̄ and μ
E = 2.283 − 1.283 = 1

If μ = 3.283, then
E = distance between x̄ and μ
E = 3.283 − 2.283 = 1

Therefore, for this example, a possible interval estimate for μ could be:
(1.283 ; 3.283)
= (2.283 − 1 ; 2.283 + 1)
= (x̄ − E ; x̄ + E)
since x̄ = 2.283 and E = 1.
In general, an interval estimate/confidence interval for μ:

(x̄ − E ; x̄ + E) or x̄ ± E
i.e. (point estimate − error ; point estimate + error)
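
As a minimal sketch of the "point estimate ± error" form, the Python snippet below builds the interval from the service-time numbers used earlier (x̄ = 2.283, E = 1). The helper name `interval_from_error` is hypothetical, and how E is actually chosen is left open here.

```python
def interval_from_error(point_estimate: float, error: float) -> tuple[float, float]:
    """Return (point estimate - error, point estimate + error)."""
    if error < 0:
        raise ValueError("the error E is a distance, so it cannot be negative")
    return point_estimate - error, point_estimate + error


lower, upper = interval_from_error(2.283, 1)
print(f"({lower:.3f} ; {upper:.3f})")  # (1.283 ; 3.283)
```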

Now,
• How do we determine the value of the error (E)?
• How sure are we that the interval contains the population parameter?
Answer: We utilize the standard normal distribution and its properties…
(Note: The following slides show the theory behind the formula for an interval
estimate/confidence interval for μ; however, you do not need to know this theory.)
Not examinable

In order to answer the questions on the previous slide, we need to determine a way to obtain a range of values (from a lower value to an upper value) such that we know what the chance of a value falling into this range is.
Therefore, we can use the standard normal distribution to obtain this:

Suppose we take the middle area of 1−α, where α is some arbitrary value. This middle area is the confidence coefficient.
Then, using symmetry, an area of α/2 will lie on either side.

[Figure: standard normal curve with middle area 1−α (the confidence coefficient) and an area of α/2 in each tail]
We can now determine the z-values that correspond to the middle area of 1−α: $z_{\alpha/2}$ and $z_{1-\alpha/2}$.

Due to symmetry, these two z-values will be the same in size, where one will be positive and the other will be negative: $z_{\alpha/2} = -z_{1-\alpha/2}$.

Since the area to the right of the positive z-value is α/2, the area to the left of this value is 1 − α/2.

[Figure: standard normal curve with an area of α/2 in each tail, cut off at $z_{\alpha/2} = -z_{1-\alpha/2}$ and $z_{1-\alpha/2}$]
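
The two z-values can also be looked up numerically instead of from tables. The sketch below uses `NormalDist` from Python's standard `statistics` module; the choice α = 0.05 is only for illustration.

```python
from statistics import NormalDist

alpha = 0.05                 # illustrative value, giving a middle area of 0.95
std_normal = NormalDist()    # standard normal: mean 0, standard deviation 1

# z-value with area 1 - alpha/2 to its left (the positive one)
z_upper = std_normal.inv_cdf(1 - alpha / 2)
# z-value with area alpha/2 to its left (the negative one)
z_lower = std_normal.inv_cdf(alpha / 2)

print(f"z_upper = {z_upper:.2f}")   # 1.96
print(f"z_lower = {z_lower:.2f}")   # -1.96
```

By symmetry the two values differ only in sign, which is why a single z-multiplier is enough.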
Therefore, a (1−α)100% confidence interval (CI) for μ is obtained as follows:

$$P\left(-z_{1-\alpha/2} < Z < z_{1-\alpha/2}\right) = 1-\alpha$$

Recall from chapter 6:

$$Z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$$

Now we can solve for μ in the centre of this expression:

$$
\begin{aligned}
1-\alpha &= P\left(-z_{1-\alpha/2} < \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} < z_{1-\alpha/2}\right) \\
&= P\left(-z_{1-\alpha/2}\times\frac{\sigma}{\sqrt{n}} < \bar{x}-\mu < z_{1-\alpha/2}\times\frac{\sigma}{\sqrt{n}}\right) && \text{Multiply by } \frac{\sigma}{\sqrt{n}} \\
&= P\left(-\bar{x}-z_{1-\alpha/2}\times\frac{\sigma}{\sqrt{n}} < -\mu < -\bar{x}+z_{1-\alpha/2}\times\frac{\sigma}{\sqrt{n}}\right) && \text{Subtract } \bar{x} \\
\therefore\ 1-\alpha &= P\left(\bar{x}-z_{1-\alpha/2}\times\frac{\sigma}{\sqrt{n}} < \mu < \bar{x}+z_{1-\alpha/2}\times\frac{\sigma}{\sqrt{n}}\right) && \text{Divide by } -1 \text{ and reorder}
\end{aligned}
$$
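
One way to see what the final probability statement means is a small simulation: draw many samples from a normal population with known μ and σ, build the interval x̄ ± z·σ/√n each time, and count how often it contains μ. The sketch below does this in Python. The population values (μ = 500, σ = 5), the sample size, the number of trials and the function name `coverage` are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu: float, sigma: float, n: int, alpha: float,
             trials: int, seed: int = 0) -> float:
    """Fraction of simulated intervals x̄ ± z·σ/√n that contain mu."""
    rng = random.Random(seed)                  # seeded for repeatability
    z = NormalDist().inv_cdf(1 - alpha / 2)    # z-multiplier
    error = z * sigma / sqrt(n)                # E, the same for every sample
    hits = 0
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - error < mu < sample_mean + error:
            hits += 1
    return hits / trials


# With alpha = 0.05 the fraction should come out close to 0.95.
print(coverage(mu=500, sigma=5, n=30, alpha=0.05, trials=2000))
```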
When the value of σ² is known, we can be (1−α)100% confident that μ will lie between

$$\bar{x} - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}} \quad\text{and}\quad \bar{x} + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}.$$

On the formula sheet:

$$\left(\bar{x} - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\ ;\ \bar{x} + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$

The first value is the lower confidence limit (L) and the second is the upper confidence limit (U).

Or a (1−α)100% confidence interval for μ (σ² known) is:

$$\bar{x} \pm z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}} \qquad (\text{in the form } \bar{x} \pm E)$$

where x̄ is the point estimate for μ and $z_{1-\alpha/2}\,\sigma/\sqrt{n}$ is the error (E).

The value of the error (E) therefore depends on the confidence
percentage for the interval (1−α), the population standard
deviation σ, and the size of the sample (n) used to obtain the point
estimate x̄.
• $z_{1-\alpha/2}$ is referred to as the z-multiplier.
• $\sigma/\sqrt{n}$ is the standard error.
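
To illustrate how E depends on these three quantities, here is a short Python sketch that computes E as the z-multiplier times the standard error. The function name `margin_of_error` and the input values (σ = 5, n = 30 or 120) are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma: float, n: int, confidence: float) -> float:
    """E = z-multiplier × standard error, for a known population sigma.

    `confidence` is the confidence coefficient 1 - alpha, e.g. 0.95.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z-multiplier
    standard_error = sigma / sqrt(n)
    return z * standard_error


# A larger sample narrows the interval; a higher confidence widens it.
print(f"{margin_of_error(5, 30, 0.95):.3f}")
print(f"{margin_of_error(5, 120, 0.95):.3f}")
print(f"{margin_of_error(5, 30, 0.99):.3f}")
```

Because n enters through √n, quadrupling the sample size only halves E.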
Example 3:
The actual content of cool drink in a 500 ml bottle is known to vary. The
standard deviation is known to be 5 ml. Thirty (30) of these 500 ml bottles were
selected at random and their mean content was found to be 498.5 ml.

Calculate 95% and 99% confidence intervals for the mean content of all the
bottles.

Formula: $\bar{x} \pm z_{1-\alpha/2}\dfrac{\sigma}{\sqrt{n}}$

Given:
σ = 5
n = 30
x̄ = 498.5
$z_{1-\alpha/2}$ = ?

We are to calculate a confidence interval for μ, the mean amount of cool drink in ALL the bottles (the population mean).
(i) For a 95% confidence interval for the mean content:

(1 − α)100% = 95%
∴ 1 − α = 0.95
∴ α = 0.05
∴ α/2 = 0.05/2 = 0.025
∴ 1 − α/2 = 1 − 0.025 = 0.975
∴ $z_{1-\alpha/2} = z_{0.975} = 1.96$ (use the std normal tables)

[Figure: standard normal curve with middle area 0.95 and an area of α/2 = 0.025 in each tail]

$$\bar{x} \pm z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}} = 498.5 \pm 1.96 \times \frac{5}{\sqrt{30}} = (496.71\ ;\ 500.29)$$

We are 95% sure that μ lies in this interval ( L ; U ).
(ii) For a 99% confidence interval for the mean content:

(1 − α)100% = 99%
∴ 1 − α = 0.99
∴ α = 0.01
∴ α/2 = 0.01/2 = 0.005
∴ 1 − α/2 = 1 − 0.005 = 0.995
∴ $z_{1-\alpha/2} = z_{0.995} = 2.576$ (use the std normal tables)

[Figure: standard normal curve with middle area 0.99 and an area of α/2 = 0.005 in each tail]

$$\bar{x} \pm z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}} = 498.5 \pm 2.576 \times \frac{5}{\sqrt{30}} = (496.15\ ;\ 500.85)$$

We are 99% sure that μ lies in this interval ( L ; U ).
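
The two intervals in Example 3 can be cross-checked with a short Python sketch. It takes the z-multiplier from `statistics.NormalDist` rather than from printed tables, so the multiplier carries more decimals than 1.96 or 2.576, but the limits agree to two decimal places. The function name `z_confidence_interval` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar: float, sigma: float, n: int,
                          confidence: float) -> tuple[float, float]:
    """Confidence interval for the mean when the population sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z-multiplier
    error = z * sigma / sqrt(n)               # E = z × standard error
    return x_bar - error, x_bar + error


# Values from Example 3: x̄ = 498.5 ml, σ = 5 ml, n = 30 bottles.
for confidence in (0.95, 0.99):
    lower, upper = z_confidence_interval(498.5, 5, 30, confidence)
    print(f"{confidence:.0%}: ({lower:.2f} ; {upper:.2f})")
# 95%: (496.71 ; 500.29)
# 99%: (496.15 ; 500.85)
```

The 99% interval is wider than the 95% interval: being more confident costs precision when the sample stays the same.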
