Lecture 4 Characteristics and Some

Characteristics of probability distributions
AND
Some important probability distributions
G&P Appendix B and C

Outcomes (part 1)
• At the end of this part you should be able to:
• Define/describe key terms and concepts such as
• expected value,
• variance,
• standard deviation,
• skewness and kurtosis.
• Find values for
• the expected value,
• variance,
• standard deviation,
• skewness and
• kurtosis coefficients.
Summary characteristics
• A probability distribution of a r.v. indicates the different values that
the r.v. can take, and the probability that a certain value will realise.
• Often we are not interested in the individual probabilities, but in the
characteristics of the distribution.
• These summary characteristics are the moments of a distribution.
• The different (population and sample) moments that we examine are:

• Expected value (measure of central tendency).
• Variance (measure of dispersion).
• Skewness (measure of the shape of the distribution).
• Kurtosis (measure of the shape of the distribution).
Population Expected value
o The expected value is the population counterpart of the arithmetic average (the sum of the values
divided by the number of observations).
o The expected value (population mean value) of a discrete r.v. X is
$E(X) = \sum_x x f(X)$
• It is the weighted average of the possible values of X, with the probabilities of
these values [f(X)] serving as the weights.
o Rolling a fair die: E(X) = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 3.5
• Properties of expected value:
$E(b) = b$, where b is a constant
$E(X + Y) = E(X) + E(Y)$
$E(XY) = E(X)E(Y)$, if X and Y are independent
$E(aX) = aE(X)$, where a is a constant
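As a small numerical check of the definition and two of the properties above, the following Python sketch computes the expected value of a fair six-sided die. The fair-die PMF and the names `pmf` and `expected_value` are illustrative assumptions, not taken from G&P.

```python
from fractions import Fraction

# PMF of a fair six-sided die (assumption: each face has probability 1/6).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def expected_value(pmf):
    """Weighted average of the possible values, with probabilities as weights."""
    return sum(x * p for x, p in pmf.items())


mean = expected_value(pmf)
print(float(mean))  # 3.5

# Check two properties: E(aX) = aE(X) and E(b) = b for constants a and b.
a, b = 2, 5
e_aX = sum(a * x * p for x, p in pmf.items())
e_b = sum(b * p for p in pmf.values())
print(e_aX == a * mean, e_b == b)  # True True
```

Exact fractions are used so that the equalities hold without floating-point rounding.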
POPULATION Variance
o Variance measures the dispersion of individual values around the
mean.
$\mathrm{var}(X) = \sigma_x^2 = E(X - \mu_x)^2$
o It is the expected value of the squared difference between an
individual X value and its mean/expected value ($\mu_x$).
o To calculate the variance:
• Subtract the mean/expected value of X from each individual
value of X,
• Square this difference,
• Multiply it by the probability that this individual X value will realise,
• And sum these products over all the values of X.
$\mathrm{var}(X) = \sum_x (X - \mu_x)^2 f(X)$
• The following graph shows four datasets with the same mean, but
different variances
o The properties of variance (see #1 p.439)
o The variance of a constant is zero
o The positive square root of $\sigma_x^2$ is $\sigma_x$ and is known as the standard
deviation (s.d.).
o Variance and s.d. depend on the units of measurement. To improve
comparison, use the coefficient of variation:
$V = \frac{\sigma_x}{\mu_x} \times 100$
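The calculation steps for the variance, the standard deviation and the coefficient of variation can be sketched in Python as follows. The fair-die PMF is again an illustrative assumption, and the variable names are hypothetical.

```python
import math

# PMF of a fair six-sided die (illustrative assumption).
pmf = {x: 1 / 6 for x in range(1, 7)}

# Expected value: probability-weighted average of the values.
mu = sum(x * p for x, p in pmf.items())

# Variance: squared deviation from the mean, weighted by probability, summed.
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Standard deviation: positive square root of the variance.
sd = math.sqrt(var)

# Coefficient of variation: s.d. relative to the mean, in percent.
cv = sd / mu * 100

print(round(var, 3), round(sd, 3), round(cv, 1))
```

Because the coefficient of variation is a ratio, it stays the same if X is measured in different units, which is why it helps when comparing distributions.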
Covariance
o Covariance is a characteristic of multivariate PFs
o It shows how two variables vary together around their means.
$\mathrm{cov}(X, Y) = E[(X - \mu_x)(Y - \mu_y)]$
$\mathrm{cov}(X, Y) = E(XY) - \mu_x \mu_y$
o Express the value of each variable as a deviation from its mean, and
take the expected value of the product.
o The covariance between two random variables can be positive,
negative or zero.
o Properties of covariance – see #1
o If X and Y are independent their covariance is zero
Correlation coefficient
o The correlation coefficient shows how strongly two variables are linearly related:
$\rho = \frac{\mathrm{cov}(X, Y)}{\sigma_x \sigma_y}$
o It is the covariance of two r.v.s divided by the product of their standard deviations
o Properties of correlation coefficient (see all 6 p.445 and Figure B-3)
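o Can be + or −, with the same sign as the covariance
o Measures the linear relationship only
o Lies between −1 (perfect negative relationship) and +1 (perfect positive relationship)
o Pure number devoid of units of measurement
o If X and Y are statistically independent, both the covariance and the correlation are zero; zero correlation, however, only rules out a linear relationship and does not imply independence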
o Correlation does not imply causality
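A minimal Python sketch of the covariance and correlation formulas, using a small joint PMF. The joint probabilities below are made up for illustration and do not come from G&P; `expect` is a hypothetical helper.

```python
import math

# Hypothetical joint PMF f(x, y) for two binary r.v.s (probabilities sum to 1).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def expect(g):
    """Expected value of g(X, Y) under the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)

# Definition: expected product of the deviations from the means.
cov_def = expect(lambda x, y: (x - mu_x) * (y - mu_y))
# Shortcut: E(XY) minus the product of the means.
cov_short = expect(lambda x, y: x * y) - mu_x * mu_y

sd_x = math.sqrt(expect(lambda x, y: (x - mu_x) ** 2))
sd_y = math.sqrt(expect(lambda x, y: (y - mu_y) ** 2))

# Correlation: covariance divided by the product of the standard deviations.
rho = cov_def / (sd_x * sd_y)

print(round(cov_def, 4), round(cov_short, 4), round(rho, 4))
```

Both covariance formulas give the same value, and the correlation is unit-free and lies between −1 and +1.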
Conditional expected value
o Knowledge of the outcome of one event can influence the expected value
of the other.
o E.g. what is the expected number of printers that will be sold on a
particular day, if we already know that PCs are sold on that day?
o Conditional expectation:
$E(X \mid Y = y) = \sum_x X f(X \mid Y = y)$
o Example B.9 (p.448): Conditional expected value of the number of printers
sold (Y) if we know that 2 PCs are sold (X = 2)
• E(Y|X=2) = 1f(Y=1|X=2) + 2f(Y=2|X=2) + 3f(Y=3|X=2) + 4f(Y=4|X=2)
(the Y = 0 term is zero and is omitted)
Remember, the conditional probability of Y is the joint probability of X and Y,
divided by the marginal probability of X
= 1(0.06/0.24) + 2(0.10/0.24) + 3(0.05/0.24) + 4(0.01/0.24)
= 1.875
• On average, 1.875 printers are sold on a day where 2 PCs are sold.
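The printer calculation can be reproduced with a few lines of Python. The joint probabilities and the marginal probability 0.24 are the ones quoted in the example; the variable names are hypothetical.

```python
# Joint probabilities f(X=2, Y=y) for y = 1..4, as quoted in the example.
# The y = 0 entry is left out because it contributes 0 to the sum.
joint_x2 = {1: 0.06, 2: 0.10, 3: 0.05, 4: 0.01}

# Marginal probability f(X=2), as quoted in the example.
marginal_x2 = 0.24

# Conditional PMF: joint probability divided by the marginal probability.
cond = {y: p / marginal_x2 for y, p in joint_x2.items()}

# Conditional expectation: values of Y weighted by the conditional PMF.
cond_mean = sum(y * p for y, p in cond.items())

print(round(cond_mean, 3))  # 1.875
```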
Skewness
• Skewness is based on the third moment of a probability distribution.
• Skewness is a measure of asymmetry of a PDF:
$S = \frac{E(X - \mu_x)^3}{\sigma_x^3}$
• If the S value is positive, the PDF is right or positively skewed.
• If the S value is negative, the PDF is left or negatively skewed.
Kurtosis
• Kurtosis, based on the fourth moment of a distribution, is a measure of the
tallness or flatness of a PDF:
$K = \frac{E(X - \mu_x)^4}{[E(X - \mu_x)^2]^2}$
• K<3: PDF is platykurtic (fat / short-tailed).
• K=3: PDF is mesokurtic.
• K>3: PDF is leptokurtic (slim / long-tailed).
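To make the S and K coefficients concrete, this Python sketch evaluates both for a fair six-sided die. The die PMF is an illustrative assumption and `central_moment` is a hypothetical helper name.

```python
from fractions import Fraction

# PMF of a fair six-sided die (illustrative assumption).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())


def central_moment(k):
    """k-th moment about the mean: E(X - mu)^k."""
    return sum((x - mu) ** k * p for x, p in pmf.items())


var = central_moment(2)

# Skewness: third central moment divided by the cube of the s.d.
S = float(central_moment(3)) / float(var) ** 1.5

# Kurtosis: fourth central moment divided by the squared variance.
K = float(central_moment(4) / var ** 2)

print(S, round(K, 3))
```

The die distribution is symmetric, so S is zero, and its K is below 3, which places it in the fat / short-tailed category.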
From population to sample
o Often the PMF / PDF is not known for the total population and
practical constraints limit researchers to use a representative sample
of the population.
o To learn about the population characteristics you can examine the
sample moments.
o The sample moments are estimators of the population moments.
o A summary value determined for the population is called a
parameter.
o A summary value determined from a sample is called a statistic or
estimate.
Sample mean
o The sample mean $\bar{X}$ is the estimator of the population mean E(X):
$\bar{X} = \sum_{i=1}^{n} \frac{X_i}{n}$
o Where n is the sample size.
o The numerical value taken by the estimator is called the estimate.
Sample variance
o The sample variance is an estimator of the population variance.
$S_x^2 = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n - 1}$
o The (n–1) refers to the degrees of freedom.
o The sample standard deviation is the positive square root of the
sample variance …
More sample characteristics
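o Sample covariance: $\mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
o Sample correlation coefficient: $r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) / (n - 1)}{S_x S_y}$
o Sample skewness, third sample moment: $\sum_{i=1}^{n} \frac{(X_i - \bar{X})^3}{n - 1}$ (divide by $S_x^3$ for the coefficient S)
o Sample kurtosis, fourth sample moment: $\sum_{i=1}^{n} \frac{(X_i - \bar{X})^4}{n - 1}$ (divide by $S_x^4$ for the coefficient K)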
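The sample characteristics can be computed directly from data, as in the Python sketch below. The two small data sets are invented for illustration, and `sample_moment` is a hypothetical helper. The skewness and kurtosis coefficients are obtained here by dividing the third and fourth sample moments by $S_x^3$ and $S_x^4$, mirroring the population definitions of S and K; that scaling is an assumption about how the sample moments are meant to be used.

```python
import math

# Hypothetical paired observations.
x = [2, 4, 6, 8, 10]
y = [1, 3, 2, 5, 4]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n


def sample_moment(data, mean, k):
    """k-th sample moment about the mean, with the (n - 1) divisor."""
    return sum((v - mean) ** k for v in data) / (len(data) - 1)


s2_x = sample_moment(x, x_bar, 2)  # sample variance of x
s2_y = sample_moment(y, y_bar, 2)  # sample variance of y
s_x, s_y = math.sqrt(s2_x), math.sqrt(s2_y)

# Sample covariance and correlation coefficient.
cov_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)
r = cov_xy / (s_x * s_y)

# Skewness and kurtosis coefficients of x (scaled third and fourth moments).
skew_x = sample_moment(x, x_bar, 3) / s_x ** 3
kurt_x = sample_moment(x, x_bar, 4) / s2_x ** 2

print(s2_x, s2_y, cov_xy, round(r, 3), skew_x, round(kurt_x, 3))
```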
Some important probability distributions
o At the end of this part you should be able to:
• Define/describe key terms and concepts such as i.i.d. random
variables, standard error, degrees of freedom.
• Determine probability values using the standard normal, t and χ²
distributions.
• Distinguish between population variances using an F test.
Probability distributions
o A random variable can be described by the moments of its probability
function, but this function is not always known.
o Certain r.v.s occur frequently, and their PDFs and properties are known, specifically:
• The normal distribution.
• The t distribution.
• The chi-square distribution.
• The F distribution.
o Details about these distributions allow us to draw inferences about
the true population values.
The normal distribution
o The normal distribution is a reasonably good model for a continuous
r.v. whose value depends on a number of factors, each factor exerting
a small + or – influence.
o E.g. Weight
o A normally distributed r.v. X is indicated by $X \sim N(\mu_x, \sigma_x^2)$
The normal distribution
o Properties:
• The normal distribution curve is symmetrical around its mean value.
• The probability of obtaining a value close to the mean is higher than that of obtaining a value
in the tails.
• About 68% of the area under a normal curve lies within 1 standard deviation of the
mean (about 95% within 2 and about 99.7% within 3 standard deviations).
• It is possible to determine the probability that X lies within a certain interval if the
mean and variance are known.
• Any linear combination of two or more normally distributed r.v.s is itself normally
distributed.
• A normal distribution has a skewness of 0 and a kurtosis of 3.
The standard normal distribution
o Normal distributions can differ in the means or variance or both (figure C-2)
o Any normally distributed r.v. can be converted to a standard normal variable Z, calculated by:
$Z = \frac{X - \mu_x}{\sigma_x}$
o Z has a mean of zero and a variance of 1: $Z \sim N(0, 1)$
The standard normal distribution (Table E1(a, b))
Examples
• Example C.2 – Suppose X is a bakery’s daily sale of bread and X ~ N(70,9).
• What is the probability that more than 75 loaves of bread will be sold on any given day?
• Calculate the Z-variable: Z = (75-70)/√9 ≈ 1.67
• What you want to know is the probability of obtaining a Z-value greater than 1.67.
• Look up this Z-value in Table E-1(b) in the Appendix (p.518).
• The probability of obtaining a Z-value between -3 and 1.67 is 0.9525.
• Thus P(Z>1.67) = 1 - 0.9525 = 0.0475.

• Example C.4 – Suppose X is a bakery’s daily sale of bread and X ~ N(70,9).
• What is the probability that the daily sale of bread will be between 65 and 75 loaves?
• First calculate the Z-values: Z1 = (65-70)/√9 ≈ -1.67 and Z2 = (75-70)/√9 ≈ 1.67
• From Table E1(b), P(-3.0 ≤ Z ≤ -1.67) = 1 - 0.9525 = 0.0475 and P(-3.0 ≤ Z ≤ 1.67) = 0.9525
• Thus P(-1.67 ≤ Z ≤ 1.67) = 0.9525 - 0.0475 = 0.9050
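The two bakery probabilities can be cross-checked in Python without a table, using `NormalDist` from the standard `statistics` module (Python 3.8 or later). This is only a sketch for checking the table lookups. The results differ slightly from the table-based answers because the table method rounds Z to 1.67, while the code uses the unrounded value 5/3.

```python
from statistics import NormalDist

# X ~ N(70, 9): mean 70 and variance 9, so the standard deviation is 3.
sales = NormalDist(mu=70, sigma=3)

# Example C.2: probability that more than 75 loaves are sold.
p_more_than_75 = 1 - sales.cdf(75)

# Example C.4: probability that sales lie between 65 and 75 loaves.
p_between_65_75 = sales.cdf(75) - sales.cdf(65)

# The same first probability via the standardised Z-value.
z = (75 - 70) / 3
p_via_z = 1 - NormalDist().cdf(z)

print(round(p_more_than_75, 3), round(p_between_65_75, 3))
```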
The normal distribution and sampling
o Example C.6:
• If you have a random sample from a normal population, the sample mean also follows the normal
distribution:
$\bar{X} \sim N\left(\mu_x, \frac{\sigma_x^2}{n}\right)$
• The square root of the variance of a random variable is called the standard deviation.
• The square root of the variance of an estimator is called the standard error.
The central limit theorem
Regardless of the underlying distribution, the sample mean of sufficiently large random samples
(as a rule of thumb, n≥30) will be approximately normally distributed.
Figure: (a) Samples drawn from a normal population; (b) Samples drawn from a non-normal population.
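A small simulation can illustrate the central limit theorem and the standard error of the sample mean. The sketch below draws repeated samples of size 30 from an exponential distribution with mean 1 and standard deviation 1, which is clearly non-normal. The choice of distribution, the seed and the number of repetitions are assumptions made purely for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 30       # sample size
reps = 5000  # number of repeated samples

# Sample mean of each repeated sample from an exponential population.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# Theoretical standard error of the sample mean: sigma / sqrt(n).
theoretical_se = 1.0 / math.sqrt(n)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(theoretical_se, 3))
```

The sample means centre on the population mean, and their spread is close to σ/√n, the standard error. A histogram of `sample_means` would look roughly bell-shaped even though the population is skewed.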
The t distribution
o If we draw random samples from a normal population with mean µx
and variance σx² but replace the variance by its estimator Sx², the
standardised sample mean follows the t distribution with (n−1) degrees of freedom:
$t = \frac{\bar{X} - \mu_x}{S_x / \sqrt{n}}$
o Properties:
o Symmetric
o Mean is zero
o Variance is k/(k-2) for k > 2, where k is the degrees of freedom
o The t distribution approaches the standard normal distribution as the d.f. increase
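The t ratio can be computed from a sample as in this Python sketch. The five observations and the population mean of 70 are hypothetical values chosen for illustration. Python's standard library has no t-distribution table, so the sketch stops at the t value and its degrees of freedom.

```python
import math
import statistics

# Hypothetical sample and hypothetical population mean.
sample = [68, 72, 71, 69, 75]
mu_x = 70

n = len(sample)
x_bar = statistics.mean(sample)

# Sample standard deviation (statistics.stdev uses the n - 1 divisor).
s_x = statistics.stdev(sample)

# Standard error of the sample mean, using S_x in place of sigma_x.
se = s_x / math.sqrt(n)

# t ratio and its degrees of freedom.
t = (x_bar - mu_x) / se
df = n - 1

# Variance of the t distribution, k / (k - 2), defined for k > 2.
t_variance = df / (df - 2)

print(round(t, 4), df, t_variance)
```

The t value would then be compared with the t table at 4 degrees of freedom.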
Degrees of freedom
o The number of independent observations available to compute a
statistic
o The number of observations that can vary freely
o E.g. the sample variance is calculated by dividing by (n-1)
o Lose 1 df for every piece of information extracted from a sample/ for
every parameter held constant in a sample
The χ2 distribution
o To derive the sampling distribution of the sample variance, one can use the χ²
distribution.
o The square of a standard normal variable is distributed as a χ² probability
distribution with one degree of freedom:
$Z^2 \sim \chi^2_{(1)}$
o Properties:
o Only positive values
o Skewed; the fewer the d.f., the more skewed
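The link between the standard normal and the χ² distribution can be checked by simulation. The sketch below squares standard normal draws and also sums k of them. It relies on two standard results that are not spelled out in the slides: a sum of k independent squared standard normal variables is χ² with k degrees of freedom, and the mean of a χ² variable equals its degrees of freedom. The seed, the number of draws and the name `chi2_draw` are illustrative assumptions.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable


def chi2_draw(k):
    """One draw from chi-square(k): the sum of k squared standard normals."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(k))


draws_1 = [chi2_draw(1) for _ in range(20000)]
draws_5 = [chi2_draw(5) for _ in range(20000)]

mean_1 = sum(draws_1) / len(draws_1)
mean_5 = sum(draws_5) / len(draws_5)

# Squared values can never be negative.
smallest = min(draws_1)

print(round(mean_1, 2), round(mean_5, 2), smallest >= 0)
```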
The F distribution
o The F distribution is used to compare the variances of two
populations.
$F = \frac{S_x^2}{S_y^2} = \frac{\sum_i (X_i - \bar{X})^2 / (m - 1)}{\sum_i (Y_i - \bar{Y})^2 / (n - 1)}$
where m and n are the sizes of the X and Y samples.
o The F ratio follows
the F distribution.
o Properties:
o Skewed
o Ranges between 0 and ∞
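The F ratio is the ratio of two sample variances, which the following Python sketch computes. The two samples are hypothetical. Looking up the resulting value in an F table with (m−1) numerator and (n−1) denominator degrees of freedom is left out, because the standard library has no F distribution.

```python
import statistics

# Hypothetical samples from two populations.
x = [12, 15, 11, 18, 14]      # m = 5 observations
y = [13, 14, 12, 15, 13, 14]  # n = 6 observations

m, n = len(x), len(y)

# Sample variances (statistics.variance uses the degrees-of-freedom divisor).
s2_x = statistics.variance(x)
s2_y = statistics.variance(y)

# F ratio and its numerator and denominator degrees of freedom.
F = s2_x / s2_y
df_num, df_den = m - 1, n - 1

print(round(F, 3), df_num, df_den)
```

An F value far above 1 suggests that the two population variances differ; how far is far enough depends on the table value at the chosen significance level.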
Activities
• X is the number of calories in a salad, X~N(200,25)
• Find the probability that the salad:
• Has more than 208 calories.
• Has between 190 and 200 calories.
• Exercises App B&C on eFundi

• End of Unit 4 in Study Guide
• Examples from G&P
