

Stat 150 Stochastic Processes

Spring 2009

Lecture 2: Conditional Expectation


Lecturer: Jim Pitman

Some useful facts (assume all random variables here have finite mean square):

E(Y g(X) | X) = g(X) E(Y | X).

Y − E(Y|X) is orthogonal to E(Y|X), and orthogonal also to g(X) for every measurable function g. Since E(Y|X) is a measurable function of X, this characterizes E(Y|X) as the orthogonal projection of Y onto the linear space of all square-integrable random variables of the form g(X) for some measurable function g.

Put another way, g(X) = E(Y|X) minimizes the mean square prediction error E[(Y − g(X))^2] over all measurable functions g.
[Figure: Y decomposed as E(Y|X) plus the residual Y − E(Y|X), which is orthogonal to every g(X).]
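The projection facts can be illustrated numerically. The following is a minimal sketch (the joint distribution of (X, Y) below is an arbitrary choice, not from the notes): for a discrete X, E(Y|X) is estimated by averaging Y within each level of X, its mean square prediction error is compared with that of other predictors g(X), and the orthogonality of the residual is checked.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative joint law (an assumption for this sketch): X uniform on {0, 1, 2},
# and given X, Y = X**2 + standard normal noise, so E(Y|X) = X**2.
n = 200_000
X = rng.integers(0, 3, size=n)
Y = X**2 + rng.normal(size=n)

# Estimate E(Y|X) by averaging Y within each level of X.
cond_means = np.array([Y[X == x].mean() for x in range(3)])
EYgX = cond_means[X]

def mse(pred):
    """Empirical mean square prediction error E[(Y - g(X))^2]."""
    return np.mean((Y - pred) ** 2)

print("MSE of E(Y|X):      ", mse(EYgX))               # close to Var(noise) = 1
print("MSE of g(X) = X:    ", mse(X.astype(float)))    # larger
print("MSE of g(X) = E(Y): ", mse(np.full(n, Y.mean())))  # larger still

# Orthogonality: the residual Y - E(Y|X) is uncorrelated with g(X), e.g. g(X) = X.
print("E[(Y - E(Y|X)) X]:  ", np.mean((Y - EYgX) * X))  # approximately 0
```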

These facts can all be checked by computations as follows.

Check orthogonality:

E[(Y − E(Y|X)) g(X)] = E(g(X) Y − g(X) E(Y|X))
                     = E(g(X) Y) − E(g(X) E(Y|X))
                     = E(E(g(X) Y | X)) − E(g(X) E(Y|X))
                     = E(g(X) E(Y|X)) − E(g(X) E(Y|X))
                     = 0.

Recall: Var(Y) = E[(Y − E(Y))^2] and Var(Y|X) = E([Y − E(Y|X)]^2 | X).

Claim: Var(Y) = Var(E(Y|X)) + E(Var(Y|X)).
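Before the proof, here is a quick simulation check of the claim (a minimal sketch; the exponential/Poisson model is my own choice of example, not from the notes). For X ~ Exponential(1) and, given X, Y ~ Poisson(X), we have E(Y|X) = X and Var(Y|X) = X, so both sides of the identity should be close to Var(X) + E(X) = 2.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative model (assumption for this sketch): X ~ Exponential(1); given X, Y ~ Poisson(X).
# Then E(Y|X) = X and Var(Y|X) = X.
n = 1_000_000
X = rng.exponential(1.0, size=n)
Y = rng.poisson(X)

lhs = Y.var()              # Var(Y)
rhs = X.var() + X.mean()   # Var(E(Y|X)) + E(Var(Y|X)) for this model
print(lhs, rhs)            # both should be close to 2
```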


Proof: Write Y = E(Y|X) + (Y − E(Y|X)). Since the two terms are orthogonal,

E(Y^2) = E(E(Y|X)^2) + E([Y − E(Y|X)]^2) + 0
       = E(E(Y|X)^2) + E(Var(Y|X))

E(Y^2) − (E(Y))^2 = E(E(Y|X)^2) − (E(Y))^2 + E(Var(Y|X))
                  = E(E(Y|X)^2) − (E(E(Y|X)))^2 + E(Var(Y|X))

so Var(Y) = Var(E(Y|X)) + E(Var(Y|X)).

Exercise (p. 84, 4.3): T is uniform on [0,1]. Given T, U is uniform on [0, T]. What is P(U ≤ 1/2)?

P(U ≤ 1/2) = E(E[1(U ≤ 1/2) | T]) = E[P(U ≤ 1/2 | T)]
           = E[min(T, 1/2)/T]
           = ∫_0^1 (min(t, 1/2)/t) dt
           = 1/2 + ∫_{1/2}^1 (1/(2t)) dt = 1/2 + (1/2) log 2 ≈ 0.847.

Random sums: Random time T, S_n = X_1 + ··· + X_n. We want a formula for E(S_T) which allows that T might not be independent of X_1, X_2, ....

Condition: for all n = 1, 2, ..., the event (T = n) is determined by X_1, X_2, ..., X_n.
Equivalently: (T ≤ n) is determined by X_1, X_2, ..., X_n.
Equivalently: (T > n) is determined by X_1, X_2, ..., X_n.
Equivalently: (T ≥ n) is determined by X_1, X_2, ..., X_{n−1}.

Call such a T a stopping time relative to the sequence X_1, X_2, ....

Example: Let T be the first n (if any) such that S_n ≤ 0 or S_n ≥ b. Then

(T = n) = (S_1 ∈ (0, b), S_2 ∈ (0, b), ..., S_{n−1} ∈ (0, b), S_n ∉ (0, b)),

which is a function of S_1, ..., S_n.

Wald's identity: If T is a stopping time relative to X_1, X_2, ..., which are i.i.d., and S_n := X_1 + ··· + X_n, then E(S_T) = E(T) E(X_1), provided E(T) < ∞.
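Returning to Exercise 4.3 above, a short Monte Carlo check (a sketch; the sample size is arbitrary) of P(U ≤ 1/2) against the value 1/2 + (1/2) log 2:

```python
import numpy as np

rng = np.random.default_rng(2)

# Exercise 4.3: T ~ Uniform[0, 1]; given T, U ~ Uniform[0, T].
n = 1_000_000
T = rng.uniform(0.0, 1.0, size=n)
U = rng.uniform(0.0, T)          # componentwise uniform on [0, T]

estimate = np.mean(U <= 0.5)
exact = 0.5 + 0.5 * np.log(2.0)  # the integral computed above, about 0.847
print(estimate, exact)
```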


Sketch of proof:

S_T = X_1 + ··· + X_T = X_1 1(T ≥ 1) + X_2 1(T ≥ 2) + ···

E(S_T) = E(X_1 1(T ≥ 1)) + E(X_2 1(T ≥ 2)) + ···
       = E(X_1) + E(X_2) P(T ≥ 2) + E(X_3) P(T ≥ 3) + ···
       = E(X_1) Σ_{n=1}^∞ P(T ≥ n)
       = E(X_1) E(T).

Key point is that for each n the event (T ≥ n) is determined by X_1, X_2, ..., X_{n−1}, hence is independent of X_n. This justifies the factorization E(X_n 1(T ≥ n)) = E(X_n) E(1(T ≥ n)) = E(X_1) P(T ≥ n). It is also necessary to justify the swap of E and Σ. This is where E(T) < ∞ must be used in a more careful argument. Note that if X_i ≥ 0 the swap is justified by monotone convergence.

Example: Hitting probabilities for simple symmetric random walk.

S_n = X_1 + ··· + X_n, where the X_i = ±1 with probability 1/2, 1/2. Let T be the first n such that S_n = +C or S_n = −B. It is easy to see that E(T) < ∞: consider successive blocks of indices of length B + C, and wait until the first block in which X_i = +1 for every i in the block; by then the walk must have hit +C, if it has not already left (−B, C). The index of that block has a geometric distribution, so this upper bound on T has finite mean, and E(T) < ∞.


Let p+ = P(S_T = +C) and p− := P(S_T = −B). Then

E(S_T) = E(T) E(X_1) = E(T) · 0 = 0

0 = p+ C − p− B
1 = p+ + p−

hence p+ = B/(B + C) and p− = C/(B + C).
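A simulation sketch of this example (the values of B and C below are arbitrary): run the walk until it hits +C or −B, estimate p+, and compare with B/(B + C); the same runs also illustrate Wald's identity, since the average of S_T should be near 0.

```python
import numpy as np

rng = np.random.default_rng(3)

def run_walk(B, C):
    """Simple symmetric ±1 walk started at 0, run until it hits +C or -B; returns S_T."""
    s = 0
    while -B < s < C:
        s += 1 if rng.integers(0, 2) else -1
    return s

B, C = 3, 7                    # arbitrary illustration values
trials = 20_000
ST = np.array([run_walk(B, C) for _ in range(trials)])

print("estimated p+    :", np.mean(ST == C), " vs  B/(B+C) =", B / (B + C))
print("estimated E(S_T):", ST.mean(), " (Wald: E(T) * E(X_1) = 0)")
```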
