c08 Sampling


CHAPTER 8

FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTIONS

8.1 Random Sampling

The basic idea of statistical inference is that we draw conclusions about a population based on statistics computed from sample data, so that we can infer something about the parameters and obtain more information about the population. The samples must therefore be good representatives of the population, and we must pay attention to sampling bias and variability to ensure the validity of statistical inference.

Bias
Any sampling procedure that produces inferences that consistently overestimate or consistently underestimate some characteristic of the population is said to be biased.

To eliminate any possibility of bias in the sampling procedure, it is desirable to choose a random sample in the sense that the observations are made independently and at random.

Random Sample
Let $X_1, X_2, \dots, X_n$ be $n$ independent random variables, each having the same probability distribution $f(x)$. Define $X_1, X_2, \dots, X_n$ to be a random sample of size $n$ from the population $f(x)$ and write its joint probability distribution as

\[ f(x_1, x_2, \dots, x_n) = f(x_1)\, f(x_2) \cdots f(x_n). \]

8.2 Some Important Statistics

It is important to measure the center and the variability of the population. For the purpose of inference, we study the following measures of center and variability.

8.2.1 Location Measures of a Sample

The most commonly used statistics for measuring the center of a set of data, arranged in order of magnitude, are the sample mean, sample median, and sample mode. Let $X_1, X_2, \dots, X_n$ represent $n$ random variables.

Sample Mean
To calculate the average, or mean, add all values, then divide by the number of individuals.

\[ \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{1}{n} \sum_{i=1}^{n} X_i, \]

where $\bar{X}$ is the special symbol for the sample mean and $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ denotes its value, or the realization of $\bar{X}$.

NOTE. The mean is the balance point. It is the "center of mass".

EXAMPLE 8.1. The weights of a group of students (in lbs) are given below:

135 105 118 163 172 183 122 150 121 162

Find the mean. If another student joins the group and his weight is 250 lbs, what would be the new mean?

Sample Median
The number such that half of the observations are smaller and half are larger, i.e., the midpoint of a distribution. With the observations arranged in increasing order,

\[ \tilde{x} = \begin{cases} x_{(n+1)/2} & \text{if } n \text{ is odd,} \\ \frac{1}{2}\left( x_{n/2} + x_{n/2+1} \right) & \text{if } n \text{ is even.} \end{cases} \]

EXAMPLE 8.2. The weights of a group of students (in lbs) are given below:

135 105 118 163 172 183 122 150 121 162

Find the median. If another student joins the group and his weight is 250 lbs, what would be the new median?

Sample Mode
The mode of a data set is the value that occurs most frequently.

A data set can be unimodal, bimodal, multimodal, or have no mode. The mode(s) are the value(s) whose frequencies are the largest (the peaks).

EXAMPLE 8.3. The weights of three groups of students (in lbs) are given below:

(a) 135, 105, 118, 163, 172, 183, 122, 150
(b) 135, 105, 118, 163, 172, 183, 122, 135
(c) 135, 135, 118, 118, 122, 118, 122, 135

Find the mode for each group.

8.2.2 Variability Measures of a Sample

The most commonly used statistics for measuring the variability of a set of data are the sample variance, sample standard deviation, and sample range. Let $X_1, X_2, \dots, X_n$ represent $n$ random variables.

Sample Variance
The sample variance $S^2$ is used to describe the variation around the mean. We use

\[ s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2 = \frac{1}{n-1} \left[ \sum x^2 - \frac{\left(\sum x\right)^2}{n} \right] = \frac{n \sum x^2 - \left(\sum x\right)^2}{n(n-1)} \]

to denote the realization, or the computed value, of $S^2$.

Sample Standard Deviation
The sample standard deviation is the square root of the sample variance:

\[ S = \sqrt{S^2} \quad \text{and} \quad s = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}. \]

NOTE. Properties of Standard Deviation
• s measures spread about the mean and should be used only when the mean is the measure of center.
• s = 0 only when all observations have the same value and there is no spread. Otherwise, s > 0.
• s gets larger as the observations become more spread out about their mean.
• s has the same units of measurement as the original observations.

NOTE. The standard deviation of a population is defined by $\sigma = \sqrt{\sum (x - \mu)^2 / N}$, where $N$ is the population size and $\mu$ is the population mean. Note that the denominator inside the square root is $N$, not $N - 1$.

EXAMPLE 8.4. Calculate the sample variance and the sample standard deviation of the following set of data:

0 1 −2 −3 9

Sample Range
The sample range R of a data set is defined as

\[ R = X_{\max} - X_{\min}. \]

EXAMPLE 8.5. Refer to Example 8.4. Find the sample range.
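The location measures asked for in Examples 8.1 to 8.3 can be checked numerically. The following is a minimal Python sketch using only the standard library. The helper name `modes`, and its convention of returning an empty list when no value repeats, are assumptions made for illustration and are not part of the notes.

```python
from collections import Counter
from statistics import mean, median

# Data from Examples 8.1 and 8.2 (weights in lbs).
weights = [135, 105, 118, 163, 172, 183, 122, 150, 121, 162]
with_new_student = weights + [250]  # another student weighing 250 lbs joins


def modes(data):
    """Return the most frequent value(s); an empty list means 'no mode'.

    Assumed convention: if no value occurs more than once,
    the data set is treated as having no mode.
    """
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(value for value, count in counts.items() if count == top)


# Example 8.1: the mean is pulled toward the extreme observation.
mean_before = mean(weights)
mean_after = mean(with_new_student)

# Example 8.2: the median is computed from the ordered data.
median_before = median(weights)
median_after = median(with_new_student)

# Example 8.3: three groups of weights.
group_a = [135, 105, 118, 163, 172, 183, 122, 150]
group_b = [135, 105, 118, 163, 172, 183, 122, 135]
group_c = [135, 135, 118, 118, 122, 118, 122, 135]

print("mean:", mean_before, "->", round(mean_after, 2))
print("median:", median_before, "->", median_after)
for name, group in (("a", group_a), ("b", group_b), ("c", group_c)):
    print("modes of group", name, ":", modes(group))
```

Comparing the two pairs of results shows how much more the mean reacts to a single large observation than the median does.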

STAT-3611 Lecture Notes 2015 Fall X. Li
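For the variability measures of Section 8.2.2, the data of Examples 8.4 and 8.5 give a quick check that the definition of the sample variance and its computational shortcut agree. This is a small Python sketch; the variable names are chosen here for illustration only.

```python
from math import sqrt

# Data from Example 8.4.
data = [0, 1, -2, -3, 9]
n = len(data)
x_bar = sum(data) / n

# Definition: s^2 = sum((x_i - x_bar)^2) / (n - 1)
s2_definition = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Shortcut: s^2 = (n * sum(x^2) - (sum(x))^2) / (n * (n - 1))
s2_shortcut = (n * sum(x * x for x in data) - sum(data) ** 2) / (n * (n - 1))

# Sample standard deviation and sample range (Example 8.5).
s = sqrt(s2_definition)
sample_range = max(data) - min(data)

print("s^2 (definition):", s2_definition)
print("s^2 (shortcut):  ", s2_shortcut)
print("s:", round(s, 4))
print("range:", sample_range)
```

Both forms divide by n − 1. Dividing by n instead would give the population formula from the note on σ, applied to the sample.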



8.3 Sampling Distributions

Sampling Distribution
In general, the sampling distribution of a given statistic is the distribution of the values taken by the statistic in all possible samples of the same size from the same population.

In other words, if we repeatedly collect samples of the same size from the population, compute a statistic (mean, standard deviation, proportion) for each sample, and then draw a histogram of those values, the distribution that the histogram tends toward is called the sampling distribution of that statistic.

NOTE. The statistical applets are good tools to study the sampling distribution. Check out the Rice University Applets at http://onlinestatbook.com/stat_sim/sampling_dist/index.html.

8.4 Sampling Distribution of Means and the Central Limit Theorem

8.4.1 Sampling Distribution of Sample Means from a Normal Population

Mean and Standard Deviation of a Sample Mean
Theorem. Let $\bar{X}$ be the sample mean of a random sample of size $n$ drawn from a population having mean $\mu$ and standard deviation $\sigma$. Then the mean of $\bar{X}$ is

\[ \mu_{\bar{X}} = \mu \]

and the standard deviation of $\bar{X}$ is

\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}. \]

EXAMPLE 8.6. Prove the above theorem.

NOTE.
• The sample mean $\bar{X}$ is an unbiased estimator of the population mean $\mu$ and is less variable than a single observation.
• The variation of $\bar{X}$ is much smaller than that of the population. The standard deviation of $\bar{X}$ decreases as the sample size $n$ increases.
• The above results do NOT require any assumptions on the shape of the population. However, a random sample is a must.

EXAMPLE 8.7. The mean and standard deviation of the strength of a packaging material are 55 kg and 6 kg, respectively. A quality manager takes a random sample of specimens of this material and tests their strength. If the manager wants to reduce the standard deviation of $\bar{X}$ to 1.5 kg, how many specimens should be tested?

EXAMPLE 8.8. A soft-drink machine is regulated so that the amount of drink dispensed averages 240 milliliters with a standard deviation of 15 milliliters. Periodically, the machine is checked by taking a sample of 40 drinks and computing the average content. If the mean of the 40 drinks is a value within the interval $\mu_{\bar{X}} \pm 2\sigma_{\bar{X}}$, the machine is thought to be operating satisfactorily; otherwise, adjustments are made. The company official found the mean of 40 drinks to be $\bar{x} = 236$ milliliters and concluded that the machine needed no adjustment. Was this a reasonable decision?

Sampling Distribution of Sample Means from a Normal Population
Theorem. Let $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$ be the sample mean of a random sample of size $n$ drawn from a normal population having mean $\mu$ and standard deviation $\sigma$. Then $\bar{X}$ follows an exact normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. That is,

\[ X_i \sim N(\mu, \sigma) \implies \bar{X} \sim N\left(\mu, \sigma/\sqrt{n}\right). \]

EXAMPLE 8.9. Prove the above theorem.

NOTE.
• One of the essential assumptions is a random sample.
• The distribution of $\bar{X}$ is EXACTLY normal if the random sample is from a normal population.

EXAMPLE 8.10. The contents of bottles of beer vary according to a normal distribution with mean $\mu = 341$ ml and standard deviation $\sigma = 3$ ml.

(a) What is the probability that the content of a randomly selected bottle is less than 339 ml?
(b) What is the probability that the average content of the bottles in a 12-pack of beer is less than 339 ml?
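Example 8.6 asks for a proof of the theorem on the mean and standard deviation of the sample mean. One possible derivation is sketched below. It assumes only what the theorem states (the $X_i$ are independent, each with mean $\mu$ and variance $\sigma^2$) and uses linearity of expectation together with additivity of variance for independent variables.

```latex
\begin{align*}
\mu_{\bar{X}} &= E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
              = \frac{1}{n}\sum_{i=1}^{n} E[X_i]
              = \frac{1}{n}\cdot n\mu = \mu, \\
\sigma_{\bar{X}}^2 &= \operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
              = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
              = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n},
\qquad\text{so}\qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

Applied to Example 8.7, the requirement $6/\sqrt{n} = 1.5$ gives $\sqrt{n} = 4$, that is, $n = 16$ specimens.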

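The normal-population results can be tried out on Examples 8.8 and 8.10. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the variable names are illustrative assumptions, not notation from the notes.

```python
from math import sqrt
from statistics import NormalDist

# Example 8.10: bottle contents follow N(mu = 341 ml, sigma = 3 ml).
mu, sigma = 341.0, 3.0

# (a) A single bottle: use the population distribution directly.
p_single = NormalDist(mu, sigma).cdf(339)

# (b) Mean of a 12-pack: same mean, standard deviation sigma / sqrt(n).
n = 12
p_mean = NormalDist(mu, sigma / sqrt(n)).cdf(339)

# Example 8.8: is x-bar = 236 inside mu +/- 2 * sigma / sqrt(n)?
mu_drink, sigma_drink, n_drink = 240.0, 15.0, 40
se = sigma_drink / sqrt(n_drink)
low, high = mu_drink - 2 * se, mu_drink + 2 * se
within_limits = low <= 236 <= high

print("P(X < 339)     =", round(p_single, 4))
print("P(X-bar < 339) =", round(p_mean, 4))
print("interval:", (round(low, 2), round(high, 2)), "contains 236:", within_limits)
```

The probability for the 12-pack mean is far smaller than for a single bottle (roughly 0.01 against 0.25), which is the reduced variability of the sample mean at work.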



EXAMPLE 8.11. A patient is classified as having gestational diabetes if the glucose level is above 140 milligrams per deciliter (mg/dl) one hour after a sugary drink is ingested. Sheila's measured glucose level one hour after ingesting the sugary drink varies according to the normal distribution with $\mu = 125$ mg/dl and $\sigma = 10$ mg/dl.

(a) If a single glucose measurement is made, what is the probability that Sheila is diagnosed as having gestational diabetes?
(b) If measurements are made on three separate days and the mean result is compared with the criterion 140 mg/dl, what is the probability that Sheila is diagnosed as having gestational diabetes?
(c) What is the level L such that there is probability only 5% that the mean glucose level of three test results falls above L for Sheila's glucose level distribution?

8.4.2 The Central Limit Theorem (CLT)

Theorem (Central Limit Theorem). If $\bar{X}$ is the mean of a random sample of size $n$ taken from a population with mean $\mu$ and finite variance $\sigma^2$, then

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \to n(z; 0, 1) \quad \text{as } n \to \infty. \]

In other words, if a random sample of size $n$ is selected from any population with mean $\mu$ and standard deviation $\sigma$, then

\[ \bar{X} \text{ is approximately } N\left(\mu, \sigma/\sqrt{n}\right) \]

when $n$ is sufficiently large.

NOTE. The Central Limit Theorem is important because, for reasonably large sample size, it allows us to make an approximate probability statement concerning the sample mean, without knowledge of the shape of the population distribution.
• Again, one of the essential assumptions is a random sample.
• The distribution of $\bar{X}$ is approximately normal if the random sample is from a population other than normal.
• How large a sample size? Usually, it is safe to apply the CLT if $n \geq 30$. It also depends on the population distribution, however. More observations are required if the population distribution is far from normal.

EXAMPLE 8.12. The time a family physician spends seeing a patient follows some right-skewed distribution with a mean of 15 minutes and a standard deviation of 11.6 minutes.

(a) Can you calculate the probability that the doctor spends less than 12 minutes with the next patient she sees? If so, do it. If not, explain why.
(b) What is the probability that the doctor spends an average time between 13 and 18 minutes with her 30 patients of the day?
(c) One day, 35 patients have an appointment to see the doctor. What is the probability that she will have to work overtime, beyond her 8-hour shift?

8.4.3 Sampling Distribution of the Difference between Two Means

Suppose that we have two populations, the first with mean $\mu_1$ and standard deviation $\sigma_1$, and the second with mean $\mu_2$ and standard deviation $\sigma_2$. We take a random sample of size $n_1$ from the first population and measure some variable $X_1$, and take an independent random sample of size $n_2$ from the second population and measure some variable $X_2$.

By the Central Limit Theorem, we know that, if $n_1$ and $n_2$ are sufficiently large,

\[ \bar{X}_1 \overset{\cdot}{\sim} N\left(\mu_1, \sigma_1/\sqrt{n_1}\right) \quad \text{and} \quad \bar{X}_2 \overset{\cdot}{\sim} N\left(\mu_2, \sigma_2/\sqrt{n_2}\right). \]

It can also be shown that

\[ \bar{X}_1 - \bar{X}_2 \overset{\cdot}{\sim} N\left(\mu_1 - \mu_2, \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right). \]
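Parts (b) and (c) of Example 8.12 are direct applications of the CLT approximation $\bar{X} \approx N(\mu, \sigma/\sqrt{n})$. The following Python sketch works them out with `statistics.NormalDist`. The helper name `approx_mean_dist` is hypothetical, and part (c) assumes that "overtime" means the total time for 35 patients exceeds 8 hours = 480 minutes with no gaps between patients.

```python
from math import sqrt
from statistics import NormalDist

# Population of visit times: right-skewed, mean 15 min, sd 11.6 min.
mu, sigma = 15.0, 11.6


def approx_mean_dist(n):
    """CLT approximation for the mean of n visits: N(mu, sigma / sqrt(n))."""
    return NormalDist(mu, sigma / sqrt(n))


# (b) P(13 < X-bar < 18) for n = 30 patients.
d30 = approx_mean_dist(30)
p_between = d30.cdf(18) - d30.cdf(13)

# (c) 35 patients take more than 480 minutes in total
#     exactly when their mean exceeds 480 / 35 minutes (assumption above).
d35 = approx_mean_dist(35)
p_overtime = 1 - d35.cdf(480 / 35)

print("P(13 < X-bar < 18) =", round(p_between, 3))
print("P(overtime)        =", round(p_overtime, 3))
```

Part (a) cannot be answered this way: the CLT describes the sample mean, not a single visit, and the population is only known to be right-skewed.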

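The description of a sampling distribution in Section 8.3 (repeatedly draw samples of the same size and collect the statistic) can be imitated by simulation. The sketch below is one way to do this in Python. The choice of an exponential population with mean 1 and standard deviation 1, the seed, and the number of repetitions are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(2015)  # fixed seed so that the run is repeatable

n = 30        # sample size
reps = 5000   # number of repeated samples

# Right-skewed population: exponential with rate 1 (mean 1, sd 1).
pop_mean, pop_sd = 1.0, 1.0

# Draw many samples and record the sample mean of each one.
sample_means = []
for _ in range(reps):
    sample = [rng.expovariate(1.0) for _ in range(n)]
    sample_means.append(sum(sample) / n)

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)

print("mean of sample means:", round(mean_of_means, 3), "vs mu =", pop_mean)
print("sd of sample means:  ", round(sd_of_means, 3),
      "vs sigma/sqrt(n) =", round(pop_sd / sqrt(n), 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the population is skewed, in line with the n ≥ 30 rule of thumb.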



Theorem. If independent samples of size $n_1$ and $n_2$ are drawn at random from two populations, discrete or continuous, with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, respectively, then the sampling distribution of the differences of means, $\bar{X}_1 - \bar{X}_2$, is approximately normally distributed with mean and variance given by

\[ \mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2 \quad \text{and} \quad \sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}. \]

So,

\[ Z = \frac{\left(\bar{X}_1 - \bar{X}_2\right) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \overset{\cdot}{\sim} N(0, 1). \]

NOTE. If both samples are from normal populations, the sampling distribution of $\bar{X}_1 - \bar{X}_2$ will be exactly normal, instead of approximately normal.

EXAMPLE 8.13. We take a random sample of five 10-year-old boys and four 10-year-old girls and measure their heights. Suppose that we know that heights $X_1$ of 10-year-old boys follow a normal distribution with mean 55.7 inches and standard deviation 2.9 inches, and that heights $X_2$ of 10-year-old girls follow a normal distribution with mean 54.1 inches and standard deviation 2.6 inches. What is the probability that the mean height of the girls in the sample is smaller than the mean height of the boys in the sample?

EXAMPLE 8.14. A study of bulimia among college women examines the connection between childhood sexual abuse and a measure of family cohesion (the higher the score, the greater the cohesion). Assume that sexually abused students have an average family cohesion score of 2.8 with a standard deviation of 2.1, while non-abused students have an average score of 4.8 with a standard deviation of 3.2. What is the probability that a random sample of 49 non-abused students will have an average family cohesion score that is at least 0.5 points higher than the average score of a random sample of 36 sexually abused students? What can you conclude?

8.5 Sampling Distribution of $S^2$

Distribution of $(n - 1)S^2/\sigma^2$
If $S^2$ is the variance of a random sample of size $n$ taken from a normal population having the variance $\sigma^2$, then the statistic

\[ \chi^2 = \frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{\left(X_i - \bar{X}\right)^2}{\sigma^2} \]

has a chi-squared distribution with $\nu = n - 1$ degrees of freedom.

NOTE (Degrees of Freedom). There are $n$ degrees of freedom, or independent pieces of information, in the random sample from the normal distribution. When the data (the values in the sample) are used to compute the mean (i.e., when $\mu$ is replaced by $\bar{x}$), a degree of freedom is lost in the estimation of $\mu$. Hence, there are $n - 1$ degrees of freedom remaining in the information used to estimate $\sigma^2$.

Let $\chi^2_\alpha(\nu)$ be the $\chi^2$ value above which we find an area of $\alpha$ under the curve of the chi-squared distribution with $\nu$ degrees of freedom. That is,

\[ P\left(\chi^2(\nu) > \chi^2_\alpha(\nu)\right) = \alpha. \]

We use Table A.5 to find these critical values of the chi-squared distribution with $\nu$ degrees of freedom.

EXAMPLE 8.15. Find the critical values

(a) $\chi^2_{0.95}(4)$
(b) $\chi^2_{0.75}(22)$

EXAMPLE 8.16. Find $k$ such that $P\left(\chi^2(12) < k\right) = 0.80$.

EXAMPLE 8.17. Use Table A.5 to give the best estimate of each of the following probabilities.

(a) $P\left(\chi^2(5) \geq 3\right)$
(b) $P\left(\chi^2(8) > 3.33\right)$
(c) $P\left(\chi^2(10) \leq 6.66\right)$
(d) $P\left(\chi^2(25) > 99.9\right)$

8.6 t-Distribution

We have learned that $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ (exactly or approximately) follows the standard normal distribution, where the data are from a random sample of size $n$ from the population with mean $\mu$ and standard deviation $\sigma$. It is very likely, however, that both $\mu$ and $\sigma$ are unknown parameters. In practice, it suffices that the distribution is symmetric and single-peaked unless the sample is very small.

Since most of the simple work in statistical inference focuses on the unknown population mean $\mu$, we need to deal with the unknown $\sigma$, especially when $n$ is not large. It is quite intuitive and natural to estimate the unknown population standard deviation $\sigma$ using the sample standard deviation $S$.

We have another statistic $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ as a sample analog of $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$.
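The two-sample results above can be applied to Examples 8.13 and 8.14 in a few lines. This is a Python sketch under the stated theorem; the function name `prob_diff_above` and its parameters are hypothetical. For Example 8.14 the normal distribution is only the CLT approximation, while for Example 8.13 it is exact because both populations are normal.

```python
from math import sqrt
from statistics import NormalDist


def prob_diff_above(c, mu1, sigma1, n1, mu2, sigma2, n2):
    """P(X1-bar - X2-bar > c) using the normal sampling distribution of the difference."""
    mean_diff = mu1 - mu2
    sd_diff = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return 1 - NormalDist(mean_diff, sd_diff).cdf(c)


# Example 8.13: boys (population 1) vs girls (population 2).
# "Girls' mean smaller than boys' mean" is the event X1-bar - X2-bar > 0.
sd_heights = sqrt(2.9 ** 2 / 5 + 2.6 ** 2 / 4)
p_girls_shorter = prob_diff_above(0, 55.7, 2.9, 5, 54.1, 2.6, 4)

# Example 8.14: non-abused (population 1) vs abused (population 2),
# probability that the difference of sample means is at least 0.5.
p_cohesion = prob_diff_above(0.5, 4.8, 3.2, 49, 2.8, 2.1, 36)

print("sd of difference (heights):", round(sd_heights, 4))
print("P(girls' mean < boys' mean):", round(p_girls_shorter, 3))
print("P(difference >= 0.5):", round(p_cohesion, 4))
```

Because the distribution is continuous, "at least 0.5" and "more than 0.5" have the same probability here.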

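The claim of Section 8.5 that $(n-1)S^2/\sigma^2$ has a chi-squared distribution with $n - 1$ degrees of freedom can be explored by simulation. The Python sketch below relies on two standard facts that are not stated in these notes and are taken here as assumptions: a chi-squared variable with $\nu$ degrees of freedom has mean $\nu$ and variance $2\nu$, and the tabulated critical value $\chi^2_{0.05}(4)$ is 9.488. The population parameters, seed, and repetition count are arbitrary choices.

```python
import random

rng = random.Random(8)  # fixed seed so that the run is repeatable

n, mu, sigma = 5, 10.0, 2.0   # normal population, samples of size 5
nu = n - 1                    # degrees of freedom
reps = 20000


def sample_variance(xs):
    """Sample variance with divisor n - 1, as defined in Section 8.2.2."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


# Simulate the statistic (n - 1) * S^2 / sigma^2 many times.
values = []
for _ in range(reps):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    values.append((n - 1) * sample_variance(sample) / sigma ** 2)

sim_mean = sum(values) / reps
sim_var = sample_variance(values)
upper_tail = sum(v > 9.488 for v in values) / reps  # should be near 0.05

print("simulated mean:", round(sim_mean, 3), "expected about", nu)
print("simulated variance:", round(sim_var, 3), "expected about", 2 * nu)
print("fraction above 9.488:", round(upper_tail, 4))
```

Replacing the normal population by a strongly skewed one would break the agreement, since the result is stated for normal populations only.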



Student t distribution
Let $X_1, X_2, \dots, X_n$ be independent random variables that are all normal with mean $\mu$ and standard deviation $\sigma$. Let

\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \quad \text{and} \quad S^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2. \]

Then the random variable

\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]

has a t-distribution with $\nu = n - 1$ degrees of freedom.

NOTE. When $n$ is very large, $s$ is a very good estimate of $\sigma$, and the corresponding t distributions are very close to the normal distribution. The t distributions become wider for smaller sample sizes, reflecting the lack of precision in estimating $\sigma$ from $s$.

The probability distribution of T was first published in 1908 in a paper written by W. S. Gosset. At the time, Gosset was employed by an Irish brewery that prohibited publication of research by members of its staff. To circumvent this restriction, he published his work secretly under the name "Student." Consequently, the distribution of T is usually called the Student t-distribution or simply the t-distribution. In deriving the equation of this distribution, Gosset assumed that the samples were selected from a normal population. Although this would seem to be a very restrictive assumption, it can be shown that nonnormal populations possessing nearly bell-shaped distributions will still provide values of T that approximate the t-distribution very closely.

The distribution of T is similar to the distribution of Z in that they both are symmetric about a mean of zero. Both distributions are bell shaped, but the t-distribution is more variable, owing to the fact that the T-values depend on the fluctuations of two quantities, $\bar{X}$ and $S^2$, whereas the Z-values depend only on the changes in $\bar{X}$ from sample to sample. The distribution of T differs from that of Z in that the variance of T depends on the sample size $n$ and is always greater than 1. Only when the sample size $n \to \infty$ will the two distributions become the same. Figure 8.8 shows the relationship between a standard normal distribution ($\nu = \infty$) and t-distributions with 2 and 5 degrees of freedom. The percentage points of the t-distribution are given in Table A.4.

[Figure 8.8: standard normal curve ($\nu = \infty$) and t-distribution curves for $\nu = 2$ and $\nu = 5$.]
[Figure 8.9: Symmetry property (about 0) of the t-distribution.]

Important Properties of the Student t distribution
• The t distribution is different for different sample sizes, or different degrees of freedom.
• The t distribution has the same general symmetric bell shape as the standard normal distribution, but it reflects the greater variability (with wider distributions) that is expected with small samples.
• The t distribution has a mean of t = 0.
• The standard deviation of the t distribution varies with the sample size, but it is greater than 1.
• As the sample size $n$ gets larger, the t distribution gets closer to the standard normal distribution.

Let $t_\alpha(\nu)$ be the t value above which we find an area of $\alpha$ under the curve of the t distribution with $\nu$ degrees of freedom. That is,

\[ P\left(T(\nu) > t_\alpha(\nu)\right) = \alpha. \]

Because of the symmetry property, $t_{1-\alpha} = -t_\alpha$.

We use Table A.4 to find these critical values of the t distribution with $\nu$ degrees of freedom.

NOTE. The t table, as well as the $\chi^2$ table, gives us the UPPER tail probabilities, while the z table gives the lower tail probabilities.

EXAMPLE 8.18. Find the critical values.

(a) $t_{0.005}(5)$, $t_{0.05}(5)$, $t_{0.5}(5)$, $t_{0.85}(5)$, $t_{0.975}(5)$
(b) $t_{0.10}(10)$, $t_{0.20}(20)$, $t_{0.30}(30)$, $t_{0.40}(40)$, $t_{0.60}(60)$
(c) $t_{0.90}(10)$, $t_{0.95}(15)$, $t_{0.99}(19)$

EXAMPLE 8.19. Let $T(\nu)$ denote the Student t-distribution with $\nu$ degrees of freedom. Find $k$ such that

(a) $P(T(8) > k) = 0.02$.
(b) $P(T(18) < k) = 0.80$.
(c) $P(T(28) \geq k) = 0.99$.

EXAMPLE 8.20. Use Table A.4 to give the best estimate of each of the following probabilities.

(a) $P(T(5) \geq 1.11)$
(b) $P(T(8) < 2.22)$
(c) $P(T(10) \geq 3.33)$
(d) $P(T(15) > 4.44)$

NOTE. Clearly,

\[ T = \frac{\left(\bar{X} - \mu\right)/\left(\sigma/\sqrt{n}\right)}{\sqrt{\left[(n-1)S^2/\sigma^2\right]/(n-1)}} = \frac{Z}{\sqrt{\chi^2(n-1)/(n-1)}}. \]

More generally,

Theorem. Let $Z$ be a standard normal random variable and $V$ a chi-squared random variable with $\nu$ degrees of freedom. If $Z$ and $V$ are independent, then the distribution of the random variable $T$, where

\[ T = \frac{Z}{\sqrt{V/\nu}}, \]

is given by the density function

\[ h(t) = \frac{\Gamma[(\nu+1)/2]}{\Gamma(\nu/2)\sqrt{\pi\nu}} \left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}, \quad -\infty < t < \infty. \]

This is known as the t-distribution with $\nu$ degrees of freedom.
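The density $h(t)$ of the t-distribution can be evaluated directly, and upper-tail probabilities $P(T(\nu) > t)$ can then be approximated by numerical integration in place of a table lookup. The Python sketch below is illustrative: the function names, the use of composite Simpson's rule, and the step count are choices made here. The check values 2.015 for $t_{0.05}(5)$ and 2.228 for $t_{0.025}(10)$ are standard t-table entries, assumed here because the notes do not reproduce Table A.4.

```python
from math import gamma, pi, sqrt


def t_density(t, nu):
    """Density h(t) of the t-distribution with nu degrees of freedom."""
    const = gamma((nu + 1) / 2) / (gamma(nu / 2) * sqrt(pi * nu))
    return const * (1 + t * t / nu) ** (-(nu + 1) / 2)


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def upper_tail(t, nu):
    """P(T(nu) > t), using symmetry: 0.5 minus the area between 0 and t."""
    return 0.5 - simpson(lambda u: t_density(u, nu), 0.0, t)


print("P(T(5) > 2.015)   =", round(upper_tail(2.015, 5), 4))
print("P(T(10) > 2.228)  =", round(upper_tail(2.228, 10), 4))
print("P(T(5) > -2.015)  =", round(upper_tail(-2.015, 5), 4))
print("P(T(200) > 1.96)  =", round(upper_tail(1.96, 200), 4))
```

The third line illustrates the symmetry property $t_{1-\alpha} = -t_\alpha$, and the last line shows the tail area approaching the standard normal value as the degrees of freedom grow. `math.gamma` overflows for very large arguments, so this sketch is only meant for moderate $\nu$.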