This document provides an overview of key concepts in probability and causality relevant for understanding randomized experiments. It introduces potential outcomes notation to represent causal effects and discusses how random assignment addresses selection bias by making treatment status independent of potential outcomes. Randomized experiments allow for an unbiased estimate of the average causal effect, though their results may not be externally valid in other contexts. Observational data requires methods to approximate experimental conditions and reduce selection bias.
J. Angrist
Preliminaries
Reading: MHE, Chapters 1-2

Ultimately, we're interested in measuring causal relationships. Alas, we have to pay some prob & stats dues before we learn how. But causality is a big and deep concept, so we should start thinking about it now.

We make sense of causal relationships using potential outcomes. These capture "what ifs" about the world. For example,

Y1i = my health if I go to the hospital
Y0i = my health if I stay away

(Here, we're using an explicit notation for potential outcomes. Sometimes we'll keep this in the background.)

My friend Mike, who runs emergency medicine at Hartford Hospital, describes the causal effect of hospitalization like this: "People come to the ER and they want to be admitted. They figure they'll just get admitted to the hospital and we'll take over and make them better. They don't realize that the hospital can be a pretty dangerous place. Unless you're really sick, you're really better off going home."

How does Y1i compare with Y0i? We can never know for sure, so we try to look at expectations or averages, where Di = 1 indicates hospitalization and Di = 0 staying away:

E[Y1i - Y0i | Di = 1] = E[Y1i | Di = 1] - E[Y0i | Di = 1]

(In general, "E[Yi | Xi]" means the population average of a random variable, Yi, holding the random variable Xi fixed.)

The table on page 13 of MHE shows some British data on hospitalization and health. Taken at face value, these data suggest Mike is right. The problem with a causal interpretation of this table is selection bias.
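Before stating this formally, here is a minimal simulation sketch of the selection problem (not from the notes; the health index, the 0.5 hospital effect, and the selection rule are all invented for illustration). Sicker people are more likely to be hospitalized, so the raw comparison of hospitalized and non-hospitalized health can look nothing like the true average causal effect, while random assignment recovers it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Hypothetical potential outcomes: a health index without and with hospitalization.
y0 = rng.normal(loc=3.0, scale=1.0, size=n)   # Y0i: health if I stay away
y1 = y0 + 0.5                                 # Y1i: hospitalization helps everyone by 0.5

# Selection on health: the sicker you are (low y0), the likelier you go to the hospital.
d_selected = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(y0 - 3.0))).astype(int)

# Random assignment: a coin flip, independent of (Y1i, Y0i).
d_random = rng.integers(0, 2, size=n)

def observed_comparison(d, y1, y0):
    """E[y | D=1] - E[y | D=0], computed from the observed outcome y."""
    y = np.where(d == 1, y1, y0)
    return y[d == 1].mean() - y[d == 0].mean()

print("average causal effect:   ", (y1 - y0).mean())                    # 0.5 by construction
print("observational comparison:", observed_comparison(d_selected, y1, y0))
print("randomized comparison:   ", observed_comparison(d_random, y1, y0))
```

With these made-up numbers the observational comparison comes out negative even though hospitalization helps everyone in the simulation, the same face-value pattern as in the hospitalization table. The decomposition below formalizes why.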
Let yi denote the observed outcome (say, an index of health). Then we have:

E[yi | Di = 1] - E[yi | Di = 0]
  = E[Y1i - Y0i | Di = 1] + { E[Y0i | Di = 1] - E[Y0i | Di = 0] }
  = the average causal effect on the hospitalized + selection bias

Hospitalization may help you or hurt you (on average), but as a rule, it's the sick who seek treatment. Random assignment of Di fixes selection bias because Di and Y0i are then independent. Experiments are therefore said to have "internal validity," though they may not have "external validity," that is, predictive value for a time or context other than the one evaluated.

You can't randomize everything, of course (perhaps not hospitalization for routine medical complaints), but you can try to use the data you have (or collect more) in an effort to come close to the desired experiment. The details of how this is done are what most of econometrics is about.

Lecture Note 1: Probability and Distribution
Reading: Wooldridge, Appendices A and B

A. Probability

"A system for quantifying chance and making predictions about future events."

Concepts

Sample space: S = {a1, a2, a3, ..., aJ}, the basic elements of the experiment.
Example: toss two coins (J = 4). (To make this interesting, we could place bets.)

Random variable: X(a), the data. A function that assigns numerical values to events.
Example: the number of heads in two coin tosses.

Probability: a function defined over events or random variables. When defined over events, probability satisfies the axioms

0 ≤ P(A) ≤ 1
P(S) = 1
P(∪_j A_j) = Σ_j P(A_j) for disjoint events A_j

and has the properties

P(∅) = 0
P(A^c) = 1 - P(A)
A ⊂ B implies P(A) ≤ P(B)
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

When we write P(x) for a discrete r.v., this is shorthand for P(the union of all events a_j such that X(a_j) = x). For a continuous r.v., we write P(X ≤ x) to mean P(the union of all events a_j such that X(a_j) ≤ x).

But what is probability, really?
- The relative frequency of an event in many (→ ∞) repeated trials.
- A personal, subjective assessment of the likelihood of an event, where the assessment obeys the axioms of probability.

Conditional probability: P(A | B) ≡ P(A ∩ B)/P(B). Conditional probability has the properties of, and obeys the axioms of, probability.

Bayes' rule: let {C_i; i = 1, ..., I} be a partition of the sample space. Then

P(C_i | A) = P(A | C_i) P(C_i) / Σ_i P(A | C_i) P(C_i)

Proof: use P(C_i | A) = P(A | C_i) P(C_i) / P(A) and the fact that {C_i; i = 1, ..., I} is a partition. Bayes' rule is useful for reversing conditional probability statements (a numerical sketch follows just below).

Independence: A is said to be independent of B iff P(A ∩ B) = P(A) P(B). Sometimes we write A ⊥ B.
Note: A ⊥ B implies P(A | B) = P(A).
Note: r.v.'s are independent if their distribution or density functions factor (more below).
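As a quick illustration of reversing a conditional probability, here is a numerical sketch of Bayes' rule with a two-element partition. The testing scenario and all of its probabilities are made up for illustration.

```python
# Bayes' rule with the partition {C1 = sick, C2 = healthy}; A = "test positive".
p_sick = 0.01                   # P(C1), the base rate
p_healthy = 1.0 - p_sick        # P(C2)
p_pos_given_sick = 0.95         # P(A | C1)
p_pos_given_healthy = 0.05      # P(A | C2)

# Denominator: P(A) = sum_i P(A | Ci) P(Ci), since {C1, C2} is a partition.
p_pos = p_pos_given_sick * p_sick + p_pos_given_healthy * p_healthy

# Bayes' rule reverses the conditioning: P(C1 | A) = P(A | C1) P(C1) / P(A).
p_sick_given_pos = p_pos_given_sick * p_sick / p_pos
print(f"P(sick | positive) = {p_sick_given_pos:.3f}")   # about 0.16 despite the 95% hit rate
```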
B. Distribution and density functions (how we characterize r.v.'s)

For the rest of the course, our probability statements will apply directly to r.v.'s.

1. Discrete random variables
Empirical distribution functions. Example: years of schooling.
Probability mass function (pmf). Parametric examples: Bernoulli, binomial, multinomial, geometric.
Cumulative distribution function (cdf) for a discrete r.v.: obtain by summation.

2. Continuous random variables
Probability density functions (pdf). Note: P(X = x) = 0 at any single point x.
Parametric examples: uniform, exponential, normal; empirical pdfs of students' grades.
Cumulative distribution function for a continuous r.v.: obtain by integration (a numerical sketch appears at the end of this section).

P(X ≤ c) = F(c) = ∫_{-∞}^{c} f(t) dt
P(a < X ≤ b) = ∫_{a}^{b} f(t) dt = F(b) - F(a)
P(X > c) = 1 - F(c)

Relationship between cdf and pdf: F'(x) = f(x).

3. Functions of random variables
Mantra: a function of a random variable is a random variable and therefore has a distribution.

Discrete r.v.: Y = r(X) with P(X = x_j) = f(x_j); then g(y) = P[r(X) = y] = Σ_{x: r(x) = y} f(x).

Continuous r.v. examples:
(i) Y = ln X, with X ~ F. Then G(y) = P(Y ≤ y) = P(ln X ≤ y) = P(X ≤ e^y) = F(e^y), so g(y) = f(e^y) e^y.
(ii) Y = 1/X, with X > 0. Then G(y) = P(Y ≤ y) = P(1/X ≤ y) = P(X ≥ 1/y) = 1 - F(1/y), so g(y) = -f(1/y)(-y^{-2}) = f(1/y)/y^2.

In general, for Y = r(X) and X = r^{-1}(Y) ≡ s(Y), where r is continuous and invertible:
If r is increasing, G(y) = F[s(y)] and g(y) = f[s(y)] s'(y).
If r is decreasing, G(y) = 1 - F[s(y)] and g(y) = -f[s(y)] s'(y).

Important special case: Y = r(X) = a + bX with b > 0, so X = (Y - a)/b = s(Y). Then G(y) = F[(y - a)/b] and g(y) = f[(y - a)/b](1/b). Standardize the r.v. X by setting a = -E(X)/σ_X and b = 1/σ_X, so that a + bX = (X - E(X))/σ_X.
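As a numerical check of the change-of-variables formula above, the sketch below (my own choice of example, with X ~ Exponential(1) so f(x) = e^{-x}) simulates Y = ln X and compares the simulated density of Y with g(y) = f(e^y) e^y.

```python
import numpy as np

rng = np.random.default_rng(1)

# X ~ Exponential(1), so f(x) = exp(-x) for x > 0.  Y = r(X) = ln X is increasing,
# with inverse X = s(Y) = exp(Y), so the formula gives g(y) = f(exp(y)) * exp(y).
x = rng.exponential(scale=1.0, size=1_000_000)
y = np.log(x)

def g(point):
    ey = np.exp(point)
    return np.exp(-ey) * ey                      # f(e^y) * e^y

# Estimate the density of Y from the simulation and compare it with g at a few points.
hist, edges = np.histogram(y, bins=400, range=(-6.0, 3.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for point in (-2.0, -1.0, 0.0, 1.0):
    simulated = hist[np.argmin(np.abs(centers - point))]
    print(f"y = {point:5.1f}: simulated {simulated:.4f}, formula {g(point):.4f}")
```

The same recipe with s(y) = (y - a)/b gives the a + bX special case used for standardization.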
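Looking back at the cdf formulas above, here is the promised sketch contrasting summation (discrete case) and integration (continuous case). The binomial and standard normal choices, and the use of scipy, are my own and are illustrative only.

```python
import numpy as np
from scipy import stats, integrate

# Discrete case: X = number of heads in two fair coin tosses, X ~ Binomial(2, 0.5).
ks = np.arange(3)
pmf = stats.binom.pmf(ks, 2, 0.5)                       # [0.25, 0.5, 0.25]
print("discrete   P(X <= 1), summing the pmf:    ", pmf[ks <= 1].sum())
print("discrete   P(X <= 1), from the cdf:       ", stats.binom.cdf(1, 2, 0.5))

# Continuous case: X ~ N(0, 1); the cdf is the integral of the pdf.
c = 1.0
area, _ = integrate.quad(stats.norm.pdf, -np.inf, c)
print("continuous P(X <= 1), integrating the pdf:", area)
print("continuous P(X <= 1), from the cdf:       ", stats.norm.cdf(c))

# A single point carries no probability for a continuous r.v.: P(X = c) = F(c) - F(c) = 0.
print("continuous P(X in a tiny interval around 1):",
      stats.norm.cdf(c + 1e-8) - stats.norm.cdf(c - 1e-8))
```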
C. Bivariate distribution functions: how r.v.'s move together

For discrete r.v.'s: f(x, y) = P(X = x, Y = y).
For continuous r.v.'s: f(x, y) is the joint density. Probability statements for jointly continuous r.v.'s use the cdf:

F(x, y) = P(X ≤ x, Y ≤ y) = ∫_{-∞}^{x} ∫_{-∞}^{y} f(s, t) dt ds

Marginal distributions
Marginal for X: f_1(x); obtain by integrating the joint density (or summing the joint pmf) over Y.
Marginal for Y: f_2(y); obtain by integrating the joint density (or summing the joint pmf) over X.

Conditional distributions
Divide the joint density or pmf by the marginal density or pmf:
f_2(y | x) = f(x, y)/f_1(x); f_1(x | y) = f(x, y)/f_2(y)

Example: joint normal. The marginal and conditional distributions are also normal.

f(x, y) = [(2π)^2 (1 - ρ^2)]^{-1/2} exp{ -[(x - μ_x)^2 - 2ρ(x - μ_x)(y - μ_y) + (y - μ_y)^2] / [2(1 - ρ^2)] }

Here X and Y are normally distributed with means μ_x and μ_y, standard deviations equal to 1, and correlation ρ.

Example: roof distribution. f(x, y) = x + y for 0 < x < 1 and 0 < y < 1. Then
f_1(x) = x + 1/2
f_2(y) = y + 1/2
f_2(y | x) = 2(x + y)/(2x + 1)

D. Example: the effect of a wage voucher (Burtless, 1985)
Simple conditional distributions for Bernoulli outcomes in a randomized trial.
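To connect the Burtless example back to the randomized-experiment logic at the start of these notes, here is a minimal sketch with invented numbers (it does not reproduce the actual Burtless (1985) data). With a Bernoulli outcome such as employment and a randomly assigned voucher, the conditional distributions given treatment status are just two employment rates, and their difference estimates the average causal effect.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000

# Hypothetical randomized trial: Di = 1 if the job seeker is given a wage voucher.
d = rng.integers(0, 2, size=n)

# Bernoulli (employed / not employed) outcome with invented success probabilities.
p_employed_if_voucher = 0.55
p_employed_if_no_voucher = 0.60
p = np.where(d == 1, p_employed_if_voucher, p_employed_if_no_voucher)
y = (rng.uniform(size=n) < p).astype(int)

# The conditional distribution of a Bernoulli outcome given Di is summarized by a mean;
# with random assignment, the difference in means estimates the average causal effect.
rate_voucher = y[d == 1].mean()
rate_no_voucher = y[d == 0].mean()
print(f"employment rate with voucher:    {rate_voucher:.3f}")
print(f"employment rate without voucher: {rate_no_voucher:.3f}")
print(f"estimated average causal effect: {rate_voucher - rate_no_voucher:.3f}")
```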