Lecture 4 Characteristics and Some
AND
Some important probability distributions
o Properties of expected value (a and b are constants):
$E(b) = b$
$E(X + Y) = E(X) + E(Y)$
$E(XY) = E(X)E(Y)$ if X and Y are independent
$E(aX) = aE(X)$
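The expectation rules above can be checked numerically on a small discrete example. The sketch below is illustrative only: the two PMFs and the constants a and b are hypothetical values, not taken from the lecture, and the variables are made independent by construction so that the product rule applies.

```python
from fractions import Fraction
from itertools import product

# Hypothetical PMFs for two independent discrete variables (not from the lecture).
pmf_x = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
pmf_y = {1: Fraction(1, 3), 3: Fraction(2, 3)}

def expect(pmf, g=lambda v: v):
    """E[g(V)] for a discrete variable with the given PMF."""
    return sum(g(v) * p for v, p in pmf.items())

# Independence: the joint PMF is the product of the marginals.
joint = {
    (x, y): px * py
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items())
}

def expect_joint(g):
    """E[g(X, Y)] under the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

a, b = 3, 5  # hypothetical constants
e_x, e_y = expect(pmf_x), expect(pmf_y)

check_constant = expect(pmf_x, lambda v: b) == b                 # E(b) = b
check_sum = expect_joint(lambda x, y: x + y) == e_x + e_y        # E(X+Y) = E(X)+E(Y)
check_product = expect_joint(lambda x, y: x * y) == e_x * e_y    # needs independence
check_scale = expect(pmf_x, lambda v: a * v) == a * e_x          # E(aX) = aE(X)
print(check_constant, check_sum, check_product, check_scale)
```

Exact fractions are used so that the equalities hold without floating-point tolerance.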
POPULATION Variance
o Variance measures how widely individual values are spread around the
mean.
$\operatorname{var}(X) = \sigma_x^2 = E(X - \mu_x)^2$
o Coefficient of variation: $V = \dfrac{\sigma_x}{\mu_x} \cdot 100$
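As a rough illustration of the variance and coefficient-of-variation formulas, the following sketch computes both from a discrete PMF. The PMF is a made-up example, not data from the lecture.

```python
import math

# Hypothetical PMF of a discrete random variable X (values -> probabilities).
pmf = {1: 0.2, 2: 0.5, 3: 0.3}

# Population mean: mu_x = E(X)
mean = sum(x * p for x, p in pmf.items())

# Population variance: sigma_x^2 = E(X - mu_x)^2
var = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Standard deviation and coefficient of variation V = (sigma_x / mu_x) * 100
sd = math.sqrt(var)
cv = sd / mean * 100

print(f"mean={mean:.2f} var={var:.2f} sd={sd:.2f} V={cv:.1f}")
```

Because V divides the standard deviation by the mean, it is unit-free and only meaningful when the mean is not zero.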
Covariance
o Covariance is a characteristic of multivariate PFs
o It shows how two variables vary together around their means.
$\operatorname{cov}(X, Y) = E[(X - \mu_x)(Y - \mu_y)]$
$\operatorname{cov}(X, Y) = E(XY) - \mu_x \mu_y$
o Express the value of each variable as a deviation from its mean, and
take the expected value of the product.
o The covariance between two random variables can be positive,
negative or zero.
o Properties of covariance – see #1
o If X and Y are independent their covariance is zero
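The two covariance expressions (expected product of deviations, and E(XY) minus the product of the means) give the same number. A minimal sketch, assuming a hypothetical joint PMF that is not from the lecture:

```python
import math

# Hypothetical joint PMF f(x, y) for two binary variables.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

# Means from the joint PMF.
mu_x = sum(x * p for (x, _), p in joint.items())
mu_y = sum(y * p for (_, y), p in joint.items())

# Form 1: expected product of deviations from the means.
cov_deviation = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# Form 2: E(XY) minus the product of the means.
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov_shortcut = e_xy - mu_x * mu_y

print(cov_deviation, cov_shortcut)
```

Here the covariance is positive, so X and Y tend to be above (or below) their means together.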
Correlation coefficient
o The correlation coefficient shows how strongly two variables are related.
o It measures the linear relationship: $\rho = \dfrac{\operatorname{cov}(X, Y)}{\sigma_x \sigma_y}$
o Covariance of two r.v. divided by the product of their standard deviations
o Properties of correlation coefficient (see all 6 p.445 and Figure B-3)
o Can be + or -, same sign as covariance
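o Measures the linear relationship only
o Lies between -1 (perfect negative relationship) and +1 (perfect positive relationship)
o Pure number devoid of units of measurement
o If X and Y are statistically independent, the covariance is zero and so is the correlation; zero correlation rules out only a linear relationship, not dependence in general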
o Correlation does not imply causality
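Because correlation captures only linear association, it can be zero even when one variable is completely determined by the other. The sketch below uses a hypothetical example (X uniform on -1, 0, 1 and Y = X squared), which is not from the lecture.

```python
import math
from fractions import Fraction

# Hypothetical example: X is uniform on {-1, 0, 1} and Y = X**2,
# so Y is fully determined by X (strong dependence, but not linear).
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

def expect(g):
    """E[g(X, Y)] under the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x, mu_y = expect(lambda x, y: x), expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - mu_x * mu_y
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)

# Correlation coefficient: covariance over the product of standard deviations.
rho = float(cov) / (math.sqrt(var_x) * math.sqrt(var_y))

# Dependence check: P(X=1, Y=1) differs from P(X=1) * P(Y=1).
p_joint = joint[(1, 1)]
p_product = third * Fraction(2, 3)

print(rho, p_joint == p_product)
```

The correlation is zero, yet the joint probability does not factor into the marginals, so X and Y are not independent.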
Conditional expected value
o Knowledge of the outcome of one event can influence the expected value
that the other will take on.
o E.g. what is the expected number of printers that will be sold on a
particular day, if we already know that PCs are sold on that day?
o Conditional expectation:
$E(X \mid Y = y) = \sum_{x} X f(X \mid Y = y)$
o Example B.9 (p.448): Conditional expected value of # printers
sold if we know 2 PCs are sold
• E(Y|X=2) = 1*f(Y=1|X=2) + 2*f(Y=2|X=2) + 3*f(Y=3|X=2) + 4*f(Y=4|X=2)
Remember, the conditional probability of Y is the joint probability of X and Y,
divided by the marginal probability of X:
= 1*(0.06/0.24) + 2*(0.1/0.24) + 3*(0.05/0.24) + 4*(0.01/0.24)
= 1.875
• On average, 1.875 printers are sold on a day where 2 PCs are sold.
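The arithmetic of Example B.9 can be reproduced in a few lines. The sketch uses only the joint probabilities and the marginal probability quoted in the example; the Y = 0 outcome is not listed there and is left out because it contributes nothing to the sum. Variable names are illustrative.

```python
import math

# Joint probabilities f(X=2, Y=y) quoted in Example B.9 (Y = printers sold).
# The Y=0 term is omitted: it adds 0 * f(Y=0|X=2) = 0 to the expectation.
joint_given_x2 = {1: 0.06, 2: 0.10, 3: 0.05, 4: 0.01}

# Marginal probability f(X=2) quoted in the example.
marginal_x2 = 0.24

# Conditional PMF: joint probability divided by the marginal probability.
conditional = {y: p / marginal_x2 for y, p in joint_given_x2.items()}

# Conditional expectation E(Y | X=2).
e_y_given_x2 = sum(y * p for y, p in conditional.items())

print(round(e_y_given_x2, 3))
```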
Skewness
• Skewness is based on the third moment of a probability distribution
• Skewness is a measure of asymmetry of a PDF:
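$S = \dfrac{E(X - \mu_x)^3}{\sigma_x^3}$
• Kurtosis is based on the fourth moment:
$K = \dfrac{E(X - \mu_x)^4}{[E(X - \mu_x)^2]^2}$
o Sample variance: $S_x^2 = \sum_{i=1}^{n} \dfrac{(X_i - \bar{X})^2}{n - 1}$
o Sample skewness is computed from the third sample moment: $\sum_{i=1}^{n} \dfrac{(X_i - \bar{X})^3}{n - 1}$
o Sample kurtosis is computed from the fourth sample moment: $\sum_{i=1}^{n} \dfrac{(X_i - \bar{X})^4}{n - 1}$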
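A minimal sketch of how sample skewness and kurtosis might be computed from the sample moments, mirroring the population definitions (third moment over the cubed standard deviation, fourth moment over the squared variance). The data are hypothetical, and the n - 1 divisor is an assumption taken from the sample formulas in this lecture; other texts and libraries use different divisors.

```python
import math

# Hypothetical sample (not from the lecture).
data = [2.0, 4.0, 4.0, 5.0, 10.0]
n = len(data)
mean = sum(data) / n

def sample_moment(order):
    """Sample moment about the mean, using the n - 1 divisor."""
    return sum((x - mean) ** order for x in data) / (n - 1)

s2 = sample_moment(2)   # sample variance
m3 = sample_moment(3)   # third sample moment
m4 = sample_moment(4)   # fourth sample moment

skewness = m3 / s2 ** 1.5   # third moment / (std dev)^3
kurtosis = m4 / s2 ** 2     # fourth moment / (variance)^2

print(f"S={skewness:.3f} K={kurtosis:.3f}")
```

A positive S indicates a longer right tail; K can be compared with the value 3 of a normal distribution.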
Some important probability distributions
The normal distribution
$X \sim N(\mu_x, \sigma_x^2)$
o Properties:
The normal distribution curve is symmetrical around its mean value.
The prob. of obtaining a value close to the mean is higher than that of obtaining a value
in the tails.
About 68% of the area under a normal curve lies within 1 standard deviation of the
mean (about 95% within 2 and about 99.7% within 3 std dev).
It is possible to determine the prob. that X lies within a certain interval if the
mean and variance are known.
Any linear combination of two or more normally distributed r.v. is itself normally
distributed
A normal distribution has a skewness of 0 and kurtosis of 3.
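The share of a normal distribution lying within 1, 2 and 3 standard deviations of the mean can be computed from the standard normal CDF. This sketch uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# P(-k <= Z <= k) for k = 1, 2, 3 standard deviations.
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, prob in coverage.items():
    print(f"within {k} std dev: {prob:.4f}")
```

Since any normal variable can be standardized, the same shares apply to every normal distribution regardless of its mean and variance.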
The standard normal distribution
o Normal distributions can differ in the means or variance or both (figure C-2)
o Any normally distributed r.v. can be converted to a standard normal variable Z, calculated by:
$Z = \dfrac{X - \mu_x}{\sigma_x}$
o Z has a mean of zero and a variance of 1: $Z \sim N(0, 1)$
The standard normal distribution (Table E1(a, b))
Examples
• Example C.2 – Suppose X is a bakery’s daily sale of bread and X ~N(70,9).
• What is the probability that more than 75 loaves of bread will be sold on any given day?
• Calculate the Z-variable: Z = (75-70)/√9 ≈ 1.67
• What you want to know is the probability of obtaining a Z-value greater than 1.67.
• Look up the Z-value in Table E-1(b) in the Appendix (p.518).
• The probability of obtaining a Z-value between -3 and 1.67 is 0.9525.
• Thus P(Z>1.67) = 1 - 0.9525 = 0.0475.
• Example C.4 – Suppose X is a bakery’s daily sale of bread and X ~N(70,9).
• What is the probability that the daily sale of bread will be between 65 and 75 loaves?
• First calculate the Z-values: Z1 = (65-70)/√9 = -1.67 and Z2 = (75-70)/√9 = 1.67
• From Table E1(b) P(-3.0 ≤ Z ≤ -1.67) = 1 - 0.9525 = 0.0475 and P(-3.0 ≤ Z ≤ 1.67) = 0.9525
• Thus P(-1.67 ≤ Z ≤ 1.67) = 0.9525 - 0.0475 = 0.9050
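The two bakery examples can be cross-checked in code instead of a printed table. The sketch below standardizes to Z, rounds Z to two decimals as a table lookup would, and reads the probabilities from `statistics.NormalDist`. It also computes the unrounded probability directly from N(70, 9); that value differs slightly from the table-based one because of the rounding of Z.

```python
from statistics import NormalDist

mu, sigma = 70, 9 ** 0.5   # X ~ N(70, 9): variance 9, so standard deviation 3
std_normal = NormalDist()

# Standardize and round to two decimals, as when using a Z table.
z_upper = round((75 - mu) / sigma, 2)   # Example C.2 and upper bound of C.4
z_lower = round((65 - mu) / sigma, 2)   # lower bound of C.4

# Example C.2: P(X > 75) = P(Z > z_upper)
p_more_than_75 = 1 - std_normal.cdf(z_upper)

# Example C.4: P(65 <= X <= 75) = P(z_lower <= Z <= z_upper)
p_between = std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

# Same probability as C.2 without rounding Z.
p_more_exact = 1 - NormalDist(mu, sigma).cdf(75)

print(f"{p_more_than_75:.4f} {p_between:.4f} {p_more_exact:.4f}")
```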
The normal distribution and sampling
o Example C.6:
If you have a random sample from a normal population, the sample mean also follows the normal
distribution: $\bar{X} \sim N\left(\mu_x, \dfrac{\sigma_x^2}{n}\right)$
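A small simulation can make the sampling result concrete: the means of repeated samples from a normal population centre on the population mean, with variance equal to the population variance divided by n. The population parameters reuse the bakery figures, while the sample size, number of replications and seed are arbitrary choices for illustration.

```python
import random
from statistics import fmean, variance

rng = random.Random(0)           # fixed seed so the run is reproducible
mu, sigma, n = 70.0, 3.0, 9      # population N(70, 9); sample size is hypothetical
reps = 10_000                    # number of simulated samples

# Draw many samples of size n and record each sample mean.
sample_means = [
    fmean(rng.gauss(mu, sigma) for _ in range(n))
    for _ in range(reps)
]

mean_of_means = fmean(sample_means)
var_of_means = variance(sample_means)
theoretical_var = sigma ** 2 / n   # sigma_x^2 / n

print(f"{mean_of_means:.2f} {var_of_means:.2f} {theoretical_var:.2f}")
```

The simulated variance of the sample means should be close to, but not exactly, the theoretical value.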
The t distribution
o Properties:
o Symmetric
o Mean is zero
o Variance is k/(k-2), where k is the df (k > 2)
o The t distribution approaches the standard normal distribution as df increases
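The variance formula k/(k-2) for the t distribution shows directly why it approaches the standard normal distribution: as the df k grows, the variance falls towards 1. A short sketch, with the helper name `t_variance` and the df values chosen purely for illustration:

```python
def t_variance(k):
    """Variance of the t distribution with k df, defined for k > 2."""
    if k <= 2:
        raise ValueError("the variance k/(k-2) requires k > 2")
    return k / (k - 2)

# Variance for a few df values: it shrinks towards 1, the variance of Z.
variances = {k: t_variance(k) for k in (3, 5, 10, 30, 100)}

for k, v in variances.items():
    print(f"df={k:>3}: variance={v:.4f}")
```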
Degrees of freedom
o The number of independent observations available to compute a
statistic
o The number of observations that can vary freely
o E.g. the sample variance is calculated by dividing by (n-1)
o Lose 1 df for every piece of information extracted from a sample, or for
every parameter held constant in a sample
The χ2 distribution
o To derive the sampling distribution of the sample variance, one can use the χ2
distribution.
o The square of a standard normal variable is distributed as a χ2 probability
distribution with one degree of freedom.
$Z^2 \sim \chi^2_{(1)}$
o Properties:
o Only positive values
o Skewed; the fewer the df, the more skewed
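The claim that a squared standard normal variable follows a χ2 distribution with one df can be explored by simulation. The sketch below squares simulated Z values and checks two things: the values are never negative, and their average is near 1, the mean of a χ2 variable with one df. The seed and number of draws are arbitrary.

```python
import random
from statistics import fmean

rng = random.Random(42)   # fixed seed so the run is reproducible

# Square draws from a standard normal distribution.
z_squared = [rng.gauss(0.0, 1.0) ** 2 for _ in range(50_000)]

smallest = min(z_squared)
mean_z2 = fmean(z_squared)

# Right skew: well over half of the values fall below the mean.
share_below_mean = sum(v < mean_z2 for v in z_squared) / len(z_squared)

print(f"min={smallest:.4f} mean={mean_z2:.3f} below_mean={share_below_mean:.3f}")
```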
The F distribution
o The F distribution is used to compare the variances of two
populations.
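$F = \dfrac{S_x^2}{S_y^2} = \dfrac{\sum_i (X_i - \bar{X})^2 / (m - 1)}{\sum_i (Y_i - \bar{Y})^2 / (n - 1)}$
o If both populations are normal with equal variances, the F ratio follows
the F distribution with (m-1) and (n-1) df.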
o Properties:
o Skewed
o Ranges between 0 and ∞
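A minimal sketch of the F ratio as the ratio of two sample variances, each computed with its own n - 1 style divisor. Both samples are hypothetical; `statistics.variance` from the standard library already divides by the sample size minus one.

```python
from statistics import variance

# Hypothetical samples from two populations (sizes m and n).
x = [2.0, 4.0, 4.0, 5.0, 10.0]          # m = 5
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]      # n = 6

s2_x = variance(x)   # sum of squared deviations / (m - 1)
s2_y = variance(y)   # sum of squared deviations / (n - 1)

f_ratio = s2_x / s2_y
df = (len(x) - 1, len(y) - 1)   # numerator and denominator df

print(f"F={f_ratio:.4f} with df={df}")
```

The ratio is compared with critical values of the F distribution for the numerator and denominator df; a ratio far from 1 suggests the population variances differ.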
Activities
• X is the number of calories in a salad, X~N(200,25)
• Find the probability that the salad:
• Has more than 208 calories.
• Has between 190 and 200 calories.