What Are Parameters? Why Do We Care?

• Consider some probability distributions:
  – Ber(p): θ = p
  – Poi(λ): θ = λ
  – Multinomial(p1, p2, ..., pm): θ = (p1, p2, ..., pm)
  – Uni(a, b): θ = (a, b)
  – Normal(μ, σ²): θ = (μ, σ²)
  – Etc.
• Call these "parametric models"
• Given model, parameters yield actual distribution (sketched in code below)
  – Usually refer to parameters of distribution as θ
  – Note that θ can be a vector of parameters
• In real world, don't know "true" parameters
  – But, we do get to observe data
    o E.g., number of times coin comes up heads, lifetimes of disk drives produced, number of visitors to web site per day, etc.
  – Need to estimate model parameters from data
  – "Estimator" is random variable estimating parameter
• Want "point estimate" of parameter
  – Single value for parameter as opposed to distribution
• Estimate of parameters allows:
  – Better understanding of process producing data
  – Future predictions based on model
  – Simulation of processes
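Not part of the original slides: a minimal Python sketch (assuming SciPy is available) of how fixing θ turns a parametric model into a concrete distribution you can query and sample. Note that scipy.stats.norm is parameterized by the standard deviation σ, not the variance σ².

    # Hypothetical illustration: fixing theta yields a concrete distribution.
    from scipy import stats

    coin = stats.bernoulli(p=0.3)           # Ber(p): theta = p
    arrivals = stats.poisson(mu=4.2)        # Poi(lambda): theta = lambda
    heights = stats.norm(loc=170, scale=8)  # Normal: loc = mu, scale = sigma

    print(coin.pmf(1))          # P(X = 1) = 0.3
    print(arrivals.mean())      # E[X] = lambda = 4.2
    print(heights.rvs(size=5))  # simulate 5 draws from the model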

Recall Sample Mean

• Consider n I.I.D. random variables X1, X2, ..., Xn
  – Xi have distribution F with E[Xi] = μ and Var(Xi) = σ²
  – We call sequence of Xi a sample from distribution F
  – Recall sample mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, where $E[\bar{X}] = \mu$
  – Recall variance of sample mean: $Var(\bar{X}) = \frac{\sigma^2}{n}$
  – Clearly, sample mean X̄ is a random variable

Sampling Distribution

• Note that sample mean X̄ is a random variable
  – "Sampling distribution of the mean" is the distribution of the random variable X̄
  – Central Limit Theorem tells us sampling distribution of X̄ is approximately normal when sample size, n, is large
    o Rule of thumb for "large" n: n > 30, but larger is better (> 100)
    o Can use CLT to make inference about sample mean (simulated below)
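Not from the slides: a short NumPy simulation of the sampling distribution. Even for a skewed source distribution (exponential), the sample means concentrate around μ with variance σ²/n, as the CLT predicts.

    import numpy as np

    rng = np.random.default_rng(0)
    n, trials = 100, 10_000  # sample size; number of repeated samples

    # Each row is one sample of n draws from Exp(mean=2): mu = 2, sigma^2 = 4.
    samples = rng.exponential(scale=2.0, size=(trials, n))
    means = samples.mean(axis=1)  # one sample mean per trial

    print(means.mean())  # ~ mu = 2
    print(means.var())   # ~ sigma^2 / n = 4 / 100 = 0.04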

Confidence Interval for Mean

• Consider I.I.D. random variables X1, X2, ..., Xn
  – Xi have distribution F with E[Xi] = μ and Var(Xi) = σ²
  – Let $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$
  – For large n, 100(1 – α)% confidence interval is:
      $\left( \bar{X} - z_{\alpha/2}\frac{S}{\sqrt{n}},\; \bar{X} + z_{\alpha/2}\frac{S}{\sqrt{n}} \right)$ where $\Phi(z_{\alpha/2}) = 1 - \frac{\alpha}{2}$
    o E.g.: α = 0.05, α/2 = 0.025, Φ(z_{α/2}) = 0.975, z_{α/2} = 1.96
  – Meaning: 100(1 – α)% of the time that such a confidence interval is computed from a sample, true μ is in the interval
    o Not: X̄ or μ is 100(1 – α)% likely to be in this particular interval

Example of Confidence Interval

• Idle CPUs are the bane of our existence
  – Large (unnamed) company wants to estimate average number of idle hours per CPU
  – n = 225 computers are monitored for idle hours
  – Say $\bar{X} = 11.6$ hrs., $S^2 = 16.81$ hrs.², so $S = 4.1$ hrs.
  – Estimate μ, mean idle hrs./CPU, with 90% confidence interval:
      α = 0.10, α/2 = 0.05, Φ(z_{α/2}) = 0.95, z_{α/2} = 1.645
      $\left( 11.6 - 1.645\frac{4.1}{\sqrt{225}},\; 11.6 + 1.645\frac{4.1}{\sqrt{225}} \right) \approx (11.15,\; 12.05)$
  – 90% of the time such an interval is computed, true μ is in it (arithmetic checked below)
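A quick check of that arithmetic, using only the numbers given on the slide (plain Python, no extra libraries):

    import math

    n, xbar, s = 225, 11.6, 4.1
    z = 1.645  # z_{alpha/2} for a 90% interval

    half_width = z * s / math.sqrt(n)  # 1.645 * 4.1 / 15 ~ 0.45
    print((xbar - half_width, xbar + half_width))  # ~ (11.15, 12.05)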

Method of Moments

• Recall: the k-th moment of the distribution of a variable X is $m_k = E[X^k]$
• Consider I.I.D. random variables X1, X2, ..., Xn
  – Xi have distribution F
  – Let $\hat{m}_1 = \frac{1}{n}\sum_{i=1}^{n} X_i$, $\hat{m}_2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2$, ..., $\hat{m}_k = \frac{1}{n}\sum_{i=1}^{n} X_i^k$
  – The $\hat{m}_i$ are called the "sample moments"
    o Estimates of the moments of the distribution, based on data
• Method of moments (MOM) estimators
  – Estimate model parameters by equating "true" moments to sample moments: $m_i = \hat{m}_i$

Examples of Method of Moments

• Recall the sample mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \hat{m}_1$
  – This is the method of moments estimator for E[X]
• Method of moments estimator for variance
  – Estimate second moment: $\hat{m}_2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2$
  – Since $Var(X) = E[X^2] - (E[X])^2$, estimate:
      $\widehat{Var}(X) = \hat{m}_2 - (\hat{m}_1)^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i^2 - 2 X_i \bar{X} + \bar{X}^2) = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$
  – Recall the sample variance, which divides by n − 1 instead:
      $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 \neq \hat{m}_2 - (\hat{m}_1)^2$
    o Both are computed on sample data in the sketch below

Small Samples = Problems

• What is the difference between the sample variance and the MOM estimate for variance?
  – Imagine you have a sample of size n = 1
  – What is the sample variance?
      $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{0}{0}$ = undefined
  – I.e., don't really know variability of data
  – What is the MOM estimate of variance?
      $\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{(X_1 - X_1)^2}{1} = 0$
  – I.e., have complete certainty about distribution!
    o There is no variance

Estimator Bias

• Bias of estimator $\hat{\theta}$ for θ: $E[\hat{\theta}] - \theta$
  – When bias = 0, we call the estimator "unbiased"
  – A biased estimator is not necessarily a bad thing
  – Sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is unbiased estimator
  – Sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$ is unbiased estimator
  – MOM estimator of variance $= \frac{n-1}{n} S^2$ is biased
    o Asymptotically unbiased as n → ∞
  – For large n, either sample variance or MOM estimate of variance is fine (the simulation below shows the gap at small n)
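Not from the slides: a NumPy simulation of the bias. Averaging each estimator over many small samples shows S² centered on the true variance, while the MOM estimate runs low by a factor of (n − 1)/n.

    import numpy as np

    rng = np.random.default_rng(2)
    true_var = 4.0
    n, trials = 5, 100_000  # small n makes the bias visible

    x = rng.normal(0.0, np.sqrt(true_var), size=(trials, n))
    s2 = x.var(axis=1, ddof=1)   # sample variance: divides by n - 1
    mom = x.var(axis=1, ddof=0)  # MOM variance: divides by n

    print(s2.mean())   # ~ 4.0 (unbiased)
    print(mom.mean())  # ~ (n-1)/n * 4.0 = 3.2 (biased low)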

Estimator Consistency

• Estimator is "consistent" if: $\lim_{n\to\infty} P(|\hat{\theta} - \theta| < \varepsilon) = 1$ for all ε > 0
  – As we get more data, estimate should deviate from true value by at most a small amount
  – This is actually known as "weak" consistency
  – Note similarity to weak law of large numbers:
      $\lim_{n\to\infty} P(|\bar{X} - \mu| > \varepsilon) = 0$
  – Equivalently: $\lim_{n\to\infty} P(|\bar{X} - \mu| < \varepsilon) = 1$
  – Establishes sample mean as consistent estimator for μ
  – Generally, MOM estimates are consistent

Method of Moments with Bernoulli

• Consider I.I.D. random variables X1, X2, ..., Xn
  – Xi ~ Ber(p)
• Estimate p:
    $p = E[X_i]$, so set $\hat{p} = \hat{m}_1 = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
  – Can use estimate of p for X ~ Bin(n, p) (see sketch below)
    o If you know what n is, you don't need to estimate it
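Not from the slides: the Bernoulli MOM estimate in a few lines of NumPy.

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.binomial(1, 0.3, size=2_000)  # Ber(p = 0.3) draws are Bin(1, p)

    p_hat = x.mean()  # first sample moment estimates E[X] = p
    print(p_hat)      # ~ 0.3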

Method of Moments with Poisson

• Consider I.I.D. random variables X1, X2, ..., Xn
  – Xi ~ Poi(λ)
• Estimate λ:
    $\lambda = E[X_i]$, so set $\hat{\lambda} = \hat{m}_1 = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
  – But note that for Poisson, λ = Var(Xi) as well!
  – Could also use method of moments on the variance:
      $\lambda = E[X_i^2] - (E[X_i])^2$, so set $\hat{\lambda} = \hat{m}_2 - (\hat{m}_1)^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$
  – Usually, use first moment estimate
  – More generally, use the one that's easiest to compute (both compared below)

Method of Moments with Normal

• Consider I.I.D. random variables X1, X2, ..., Xn
  – Xi ~ N(μ, σ²)
• Estimate μ:
    $\mu = E[X_i]$, so set $\hat{\mu} = \hat{m}_1 = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
• Now estimate σ²:
    $\hat{\sigma}^2 = \hat{m}_2 - (\hat{m}_1)^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$
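Not from the slides: both Poisson moment estimates and the Normal estimates in NumPy. Note x.var(ddof=0) is exactly $\hat{m}_2 - (\hat{m}_1)^2$.

    import numpy as np

    rng = np.random.default_rng(4)

    # Poisson: two different moment equations estimate the same lambda.
    x = rng.poisson(lam=3.5, size=5_000)
    print(x.mean(), x.var(ddof=0))  # both ~ 3.5

    # Normal: first two moments give mu_hat and sigma^2_hat directly.
    y = rng.normal(loc=1.0, scale=2.0, size=5_000)
    print(y.mean(), y.var(ddof=0))  # ~ mu = 1.0 and ~ sigma^2 = 4.0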

Method of Moments with Uniform

• Consider I.I.D. random variables X1, X2, ..., Xn
  – Xi ~ Uni(a, b)
  – Estimate mean:
      $\hat{\mu} = \hat{m}_1 = \frac{1}{n}\sum_{i=1}^{n} X_i$
  – Estimate variance:
      $\hat{\sigma}^2 = \hat{m}_2 - (\hat{m}_1)^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$
  – For Uni(a, b), know that: $\mu = \frac{a+b}{2}$ and $\sigma^2 = \frac{(b-a)^2}{12}$
  – Solve (two equations, two unknowns), as verified in the sketch below:
    o Set b = 2μ − a, substitute into the formula for σ², and solve:
      $\hat{a} = \bar{X} - \sqrt{3}\,\hat{\sigma}$ and $\hat{b} = \bar{X} + \sqrt{3}\,\hat{\sigma}$
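Not from the slides: the Uniform MOM estimates recovered from simulated data.

    import numpy as np

    rng = np.random.default_rng(5)
    x = rng.uniform(low=2.0, high=10.0, size=5_000)  # Uni(a=2, b=10)

    mu_hat = x.mean()
    sigma_hat = x.std(ddof=0)  # sqrt of the MOM variance estimate

    a_hat = mu_hat - np.sqrt(3) * sigma_hat
    b_hat = mu_hat + np.sqrt(3) * sigma_hat
    print(a_hat, b_hat)  # ~ 2.0 and ~ 10.0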
