ELEC5300 Lecture 2 (2020)
Affine Function of a Single RV
If X is a continuous RV with density $f_X(x)$, what is the density of $Y = aX + b$?
Graphical Interpretation

$$f_Y(y) = \frac{1}{|a|}\, f_X\!\left(\frac{y-b}{a}\right)$$

[Figure: Y = aX + b and the resulting density $f_Y(y)$ for three cases: a < 1, b = 0; a < 1, b > 0; a > 1, b = 0. Scaling by a widens or narrows the density and rescales its height; b shifts it.]
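This formula follows from the CDF method; a short derivation for a > 0 (for a < 0 the inequality flips, which produces the |a|):
$$F_Y(y) = P[aX + b \le y] = P\!\left[X \le \frac{y-b}{a}\right] = F_X\!\left(\frac{y-b}{a}\right)
\quad\Rightarrow\quad
f_Y(y) = \frac{dF_Y(y)}{dy} = \frac{1}{a}\, f_X\!\left(\frac{y-b}{a}\right)$$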
Affine Function of a Gaussian RV
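Substituting the Gaussian density $f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-m)^2/(2\sigma^2)}$ into the 1-D change-of-variables formula gives
$$f_Y(y) = \frac{1}{\sqrt{2\pi}\,|a|\sigma} \exp\!\left(-\frac{\big(y - (am+b)\big)^2}{2a^2\sigma^2}\right)$$
so Y is again Gaussian, with mean $am + b$ and variance $a^2\sigma^2$.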
Affine Function of Multiple RVs

If a random vector X has density $f_X(x)$ and $Y = AX + b$ with A invertible, then
$$f_Y(y) = \frac{1}{|\det A|}\, f_X\!\left(A^{-1}(y - b)\right)$$
Compare this with the 1-D result: $f_Y(y) = \frac{1}{|a|}\, f_X\!\left(\frac{y-b}{a}\right)$.
Graphical Interpretation

[Figure: the map Y = AX + b carries regions of the x-plane to regions of the y-plane; the density is rescaled by 1/|det A|.]
Mean of Affine Transformations

Consider an affine transformation $Y = AX + b$, where $Y \in \mathbb{R}^m$ and $X \in \mathbb{R}^n$ are random vectors and $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$ are constant. For example, with m = 2 and n = 3:
$$\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} =
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix} +
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$
The mean of Y is
$$E[Y] = A\,E[X] + b$$
This is a direct consequence of the linearity of expectation.
Variance of Affine Transformations

The variance (covariance matrix) of Y is given by $C_Y = A C_X A^T$.

Proof:
$$\begin{aligned}
C_Y &= E\big[(Y - E[Y])(Y - E[Y])^T\big]\\
&= E\big[(AX + b - A\,E[X] - b)(AX + b - A\,E[X] - b)^T\big]\\
&= E\big[(AX - A\,E[X])(AX - A\,E[X])^T\big]\\
&= E\big[A(X - E[X])(X - E[X])^T A^T\big]\\
&= A\, E\big[(X - E[X])(X - E[X])^T\big]\, A^T\\
&= A C_X A^T
\end{aligned}$$
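These two identities are easy to check by simulation; here is a minimal NumPy sketch (not part of the original slides; the particular A, b, m_X, C_X are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative parameters (assumed, not from the slides).
m_x = np.array([1.0, -2.0, 0.5])
C_x = np.array([[4.0, 1.0, 0.0],
                [1.0, 3.0, 1.0],
                [0.0, 1.0, 2.0]])
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 3.0]])
b = np.array([1.0, -1.0])

# Sample X (Gaussian for convenience; the identities hold for any
# distribution with this mean and covariance) and apply Y = AX + b.
X = rng.multivariate_normal(m_x, C_x, size=200_000)
Y = X @ A.T + b

print(Y.mean(axis=0), A @ m_x + b)   # empirical vs. E[Y] = A E[X] + b
print(np.cov(Y.T), A @ C_x @ A.T)    # empirical vs. C_Y = A C_X A^T
```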
Exercise: find the variance of Z = X + Y.

Solution

Express Z as
$$Z = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \end{bmatrix}$$
Then
$$\operatorname{Var}[Z] = \begin{bmatrix} 1 & 1 \end{bmatrix}
\begin{bmatrix} \operatorname{Var}[X] & \operatorname{Cov}(X, Y) \\ \operatorname{Cov}(X, Y) & \operatorname{Var}[Y] \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \operatorname{Var}[X] + 2\operatorname{Cov}(X, Y) + \operatorname{Var}[Y]$$
Jointly Gaussian (Normal) Random Variables

The RVs X1, X2, …, Xn are said to be jointly Gaussian (or jointly normal) if their joint pdf is given by
$$f_X(x) = \frac{1}{(2\pi)^{n/2}\, |C|^{1/2}} \exp\!\left(-\tfrac{1}{2}(x - m)^T C^{-1}(x - m)\right)$$
where
$$X = \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix}, \qquad x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$
and m = E[X] and C is the covariance matrix of X.
Exercise: X is a jointly Gaussian random vector with mean $m = E[X] = [5, 3, 2]^T$ and correlation matrix $R = E[XX^T]$ given below. Find the joint density.

Solution:

The density is
$$f_X(x) = \frac{1}{(2\pi)^{n/2}\, |C|^{1/2}} \exp\!\left(-\tfrac{1}{2}(x - m)^T C^{-1}(x - m)\right)$$
where $m = E[X] = \begin{bmatrix} 5 \\ 3 \\ 2 \end{bmatrix}$ and
$$C = R - E[X]\,E[X]^T = \begin{bmatrix} 31 & 16 & 13 \\ 16 & 13 & 8 \\ 13 & 8 & 11 \end{bmatrix} - \begin{bmatrix} 25 & 15 & 10 \\ 15 & 9 & 6 \\ 10 & 6 & 4 \end{bmatrix} = \begin{bmatrix} 6 & 1 & 3 \\ 1 & 4 & 2 \\ 3 & 2 & 7 \end{bmatrix}$$
Two Jointly Gaussian RVs

If X1 and X2 are jointly Gaussian, their pdf is given by
$$f_{X_1 X_2}(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\!\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(x_1 - m_1)^2}{\sigma_1^2} - \frac{2\rho(x_1 - m_1)(x_2 - m_2)}{\sigma_1\sigma_2} + \frac{(x_2 - m_2)^2}{\sigma_2^2}\right]\right\}$$
where $m_i = E[X_i]$, $\sigma_i^2 = \operatorname{Var}(X_i)$, and ρ is the correlation coefficient.
Justification

Since $\operatorname{Cov}(X_1, X_2) = \rho\sigma_1\sigma_2$,
$$C = \begin{bmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix}
\quad\text{and}\quad
C^{-1} = \frac{1}{\sigma_1^2\sigma_2^2(1-\rho^2)} \begin{bmatrix} \sigma_2^2 & -\rho\sigma_1\sigma_2 \\ -\rho\sigma_1\sigma_2 & \sigma_1^2 \end{bmatrix}$$
Thus, $|C| = \sigma_1^2\sigma_2^2(1-\rho^2)$, and
$$\begin{aligned}
\tfrac{1}{2}(x - m)^T C^{-1}(x - m)
&= \frac{1}{2\sigma_1^2\sigma_2^2(1-\rho^2)} \begin{bmatrix} x_1 - m_1 & x_2 - m_2 \end{bmatrix}
\begin{bmatrix} \sigma_2^2 & -\rho\sigma_1\sigma_2 \\ -\rho\sigma_1\sigma_2 & \sigma_1^2 \end{bmatrix}
\begin{bmatrix} x_1 - m_1 \\ x_2 - m_2 \end{bmatrix}\\
&= \frac{1}{2\sigma_1^2\sigma_2^2(1-\rho^2)} \begin{bmatrix} x_1 - m_1 & x_2 - m_2 \end{bmatrix}
\begin{bmatrix} \sigma_2^2(x_1 - m_1) - \rho\sigma_1\sigma_2(x_2 - m_2) \\ -\rho\sigma_1\sigma_2(x_1 - m_1) + \sigma_1^2(x_2 - m_2) \end{bmatrix}\\
&= \frac{1}{2(1-\rho^2)}\left[\frac{(x_1 - m_1)^2}{\sigma_1^2} - \frac{2\rho(x_1 - m_1)(x_2 - m_2)}{\sigma_1\sigma_2} + \frac{(x_2 - m_2)^2}{\sigma_2^2}\right]
\end{aligned}$$
which, together with $(2\pi)^{2/2}|C|^{1/2} = 2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}$, gives the pdf on the previous slide.
PDF of Two Jointly Gaussian RVs

[Figure: surface plot of the joint pdf for $m_1 = m_2 = 0$, $\sigma_1^2 = \sigma_2^2 = 1$, $\rho = 0$.]
Contour diagrams of the pdf

[Figure: contour diagrams of the joint pdf for three combinations of ρ, σ1 and σ2; the contours are ellipses centered at (m1, m2), tilted when ρ ≠ 0.]

If the components are uncorrelated, C is diagonal with entries $\sigma_i^2$, so $|C| = \prod_i \sigma_i^2$ and the quadratic form reduces to $\sum_i (x_i - m_i)^2/\sigma_i^2$. The joint density then factors into a product of marginals:
$$f_X(x) = \frac{1}{(2\pi)^{n/2}|C|^{1/2}} \exp\!\left(-\tfrac{1}{2}(x - m)^T C^{-1}(x - m)\right)
= \frac{1}{(2\pi)^{n/2} \prod_i \sigma_i} \exp\!\left(-\sum_i \frac{(x_i - m_i)^2}{2\sigma_i^2}\right)
= \prod_i \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\!\left(-\frac{(x_i - m_i)^2}{2\sigma_i^2}\right)$$
Thus uncorrelated jointly Gaussian RVs are independent.
Affine transformations of Gaussians are Gaussian

If $Y = AX + b$, then
$$f_Y(y) = \frac{1}{|\det A|}\, f_X\!\left(A^{-1}(y - b)\right)$$
Substituting the Gaussian density (next slide) shows that Y is Gaussian with
$$m_Y = Am + b \quad\text{and}\quad C_Y = ACA^T$$
Affine transformations (II)

Thus, using $A^{-1}(y - b) - m = A^{-1}(y - m_Y)$ and $(A^{-1})^T C^{-1} A^{-1} = (ACA^T)^{-1} = C_Y^{-1}$,
$$\begin{aligned}
f_Y(y) &= \frac{1}{(2\pi)^{n/2}\, |\det A|\, |C|^{1/2}} \exp\!\left(-\tfrac{1}{2}(y - m_Y)^T C_Y^{-1}(y - m_Y)\right)\\
&= \frac{1}{(2\pi)^{n/2}\, |C_Y|^{1/2}} \exp\!\left(-\tfrac{1}{2}(y - m_Y)^T C_Y^{-1}(y - m_Y)\right)
\end{aligned}$$
since $|C_Y| = |ACA^T| = |\det A|\cdot|C|\cdot|\det A^T| = (\det A)^2\,|C|$, so $|C_Y|^{1/2} = |\det A|\,|C|^{1/2}$.
Example of Affine Transformation

Suppose X1, X2 and X3 are jointly Gaussian with the mean and covariance shown:
$$m_X = E[X] = \begin{bmatrix} 5 \\ 3 \\ 2 \end{bmatrix}, \qquad C_X = \begin{bmatrix} 6 & 1 & 3 \\ 1 & 4 & 2 \\ 3 & 2 & 7 \end{bmatrix}$$
Let $Y_1 = X_1 - X_3 + 1$ and $Y_2 = X_1 - 2X_2 + 1$. Find the joint density of Y1 and Y2.

Solution:
Expressing the equations above as
$$\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & -1 \\ 1 & -2 & 0 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
we can read off A and b. Then
$$m_Y = A\,E[X] + b = \begin{bmatrix} 3 \\ -1 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 0 \end{bmatrix}$$
$$C_Y = A C_X A^T = \begin{bmatrix} 3 & -1 & -4 \\ 4 & -7 & -1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & -2 \\ -1 & 0 \end{bmatrix} = \begin{bmatrix} 7 & 5 \\ 5 & 18 \end{bmatrix}$$
which we can substitute into the equation for the Gaussian density.
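A quick NumPy check of the arithmetic in this example (a sketch, not part of the slides):

```python
import numpy as np

m_x = np.array([5.0, 3.0, 2.0])
C_x = np.array([[6.0, 1.0, 3.0],
                [1.0, 4.0, 2.0],
                [3.0, 2.0, 7.0]])
A = np.array([[1.0, 0.0, -1.0],
              [1.0, -2.0, 0.0]])
b = np.array([1.0, 1.0])

print(A @ m_x + b)     # m_Y = [4, 0]
print(A @ C_x @ A.T)   # C_Y = [[7, 5], [5, 18]]
```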
Subsets of joint Gaussians are jointly Gaussian

Suppose that X1, X2 and X3 are jointly Gaussian, and we are interested in the joint density of X1 and X3. Define
$$X = \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix} \quad\text{and}\quad Y = \begin{bmatrix} X_1 \\ X_3 \end{bmatrix}$$
Since
$$Y = \begin{bmatrix} X_1 \\ X_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix} = AX$$
and we know that affine functions of Gaussians are Gaussian, we know the subset is Gaussian.
Computing the marginal density for two RVs (I)

Suppose X1 and X2 are jointly Gaussian. The marginal density of X1 is
$$f_{X_1}(x_1) = \int_{-\infty}^{\infty} f_{X_1 X_2}(x_1, x_2)\, dx_2$$
To evaluate the integral, we make the change of variables
$$y = \frac{x_2 - m_2}{\sigma_2}, \quad\text{which implies that}\quad dy = \frac{dx_2}{\sigma_2}$$
Computing the marginal density for two RVs (II)

Completing the square in the exponent and integrating out y gives
$$f_{X_1}(x_1) = \frac{1}{2\pi\sigma_1\sqrt{1-\rho^2}} \int_{-\infty}^{\infty} \exp(\cdots)\, dy = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\!\left(-\frac{(x_1 - m_1)^2}{2\sigma_1^2}\right)$$
so the marginal density of X1 is Gaussian with mean m1 and variance σ1².
Example

Suppose X1, X2 and X3 are jointly Gaussian with the mean and covariance matrix shown:
$$m = E[X] = \begin{bmatrix} 5 \\ 3 \\ 2 \end{bmatrix}, \qquad C = \begin{bmatrix} 6 & 1 & 3 \\ 1 & 4 & 2 \\ 3 & 2 & 7 \end{bmatrix}$$
Find the joint density of X1 and X3.

Solution:
We know that X1 and X3 are jointly Gaussian, thus we need only find the corresponding mean vector and covariance matrix, and substitute into the equation:
$$f_X(x) = \frac{1}{(2\pi)^{n/2}\,|C|^{1/2}} \exp\!\left(-\tfrac{1}{2}(x - m)^T C^{-1}(x - m)\right)$$
The mean vector and covariance matrix are subsets of the mean/covariance above:
$$m = E\begin{bmatrix} X_1 \\ X_3 \end{bmatrix} = \begin{bmatrix} 5 \\ 2 \end{bmatrix}, \qquad C = \begin{bmatrix} 6 & 3 \\ 3 & 7 \end{bmatrix}$$
Conditional density of two Gaussians (I)

Suppose X and Y are jointly Gaussian. Then
$$f_{X|Y}(x \mid y) = \frac{f_{XY}(x, y)}{f_Y(y)}
= \frac{\dfrac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \exp\!\left\{-\dfrac{1}{2(1-\rho^2)}\left[\dfrac{(x - m_X)^2}{\sigma_X^2} - \dfrac{2\rho(x - m_X)(y - m_Y)}{\sigma_X\sigma_Y} + \dfrac{(y - m_Y)^2}{\sigma_Y^2}\right]\right\}}{\dfrac{1}{\sqrt{2\pi}\,\sigma_Y} \exp\!\left\{-\dfrac{(y - m_Y)^2}{2\sigma_Y^2}\right\}}$$
$$= \frac{1}{\sqrt{2\pi}\,\sigma_X\sqrt{1-\rho^2}} \exp\!\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(x - m_X)^2}{\sigma_X^2} - \frac{2\rho(x - m_X)(y - m_Y)}{\sigma_X\sigma_Y} + \frac{(y - m_Y)^2}{\sigma_Y^2}\right] + \frac{(y - m_Y)^2}{2\sigma_Y^2}\right\}$$
where we have made use of the fact that the marginal of Y is Gaussian.
Conditional density of two Gaussians (II)

We can rearrange the term in the exponential as follows:
$$\begin{aligned}
&-\frac{1}{2(1-\rho^2)}\left[\frac{(x - m_X)^2}{\sigma_X^2} - \frac{2\rho(x - m_X)(y - m_Y)}{\sigma_X\sigma_Y} + \frac{(y - m_Y)^2}{\sigma_Y^2}\right] + \frac{(y - m_Y)^2}{2\sigma_Y^2}\\
&= -\frac{1}{2(1-\rho^2)}\left[\frac{(x - m_X)^2}{\sigma_X^2} - \frac{2\rho(x - m_X)(y - m_Y)}{\sigma_X\sigma_Y} + \frac{(y - m_Y)^2}{\sigma_Y^2} - \frac{(1-\rho^2)(y - m_Y)^2}{\sigma_Y^2}\right]\\
&= -\frac{1}{2(1-\rho^2)}\left[\frac{(x - m_X)^2}{\sigma_X^2} - \frac{2\rho(x - m_X)(y - m_Y)}{\sigma_X\sigma_Y} + \frac{\rho^2(y - m_Y)^2}{\sigma_Y^2}\right]\\
&= -\frac{1}{2(1-\rho^2)}\left[\frac{x - m_X}{\sigma_X} - \rho\,\frac{y - m_Y}{\sigma_Y}\right]^2\\
&= -\frac{\left[x - m_X - \rho\frac{\sigma_X}{\sigma_Y}(y - m_Y)\right]^2}{2\sigma_X^2(1-\rho^2)}
\end{aligned}$$
Conditional density of two Gaussians (III)

Substituting the new expression into the exponential:
$$f_{X|Y}(x \mid y) = \frac{1}{\sqrt{2\pi}\,\sigma_X\sqrt{1-\rho^2}} \exp\!\left\{-\frac{\left[x - m_X - \rho\frac{\sigma_X}{\sigma_Y}(y - m_Y)\right]^2}{2\sigma_X^2(1-\rho^2)}\right\}
= \frac{1}{\sqrt{2\pi}\,\sigma_{X|y}} \exp\!\left\{-\frac{(x - m_{X|y})^2}{2\sigma_{X|y}^2}\right\}$$
where
$$m_{X|y} = m_X + \rho\,\frac{\sigma_X}{\sigma_Y}(y - m_Y) = E[X \mid y]$$
(the conditional mean of X given Y = y) and
$$\sigma_{X|y}^2 = \sigma_X^2(1 - \rho^2) = \operatorname{Var}[X \mid y]$$
(the conditional variance of X given Y = y).
Minimum Mean Squared Error Estimation

The conditional expected value of X given Y, E[X|Y], is the "best" estimate of the value of X if we observe Y.

Proof:
Take any guess of X based on Y, g(Y). Define the mean squared error of the estimate by
$$\text{MSE} = E\big[(X - g(Y))^2\big]$$
To find the function g(·) that results in the minimum MSE, we write
$$\text{MSE} = E\Big[E\big[(X - g(Y))^2 \mid Y\big]\Big] \quad\text{(remove the conditioning by expectation)}$$
$$= \int E\big[(X - g(y))^2 \mid y\big]\, f_Y(y)\, dy$$
Since $f_Y(y) \ge 0$, the MSE is minimized by minimizing $E[(X - g(y))^2 \mid y]$ separately for each y, which is achieved by $g(y) = E[X \mid y]$.
Interpretation of Formulas

Conditional expected value of X given Y = y:
$$m_{X|y} = m_X + \rho\,\frac{\sigma_X}{\sigma_Y}\,(y - m_Y)$$
Here $m_X$ is the best constant estimator of X; ρ captures the dependency between X and Y; the ratio $\sigma_X/\sigma_Y$ performs a "units conversion" between Y and X; and $(y - m_Y)$ is the difference between the observed and expected values of Y.

Conditional variance of X given Y = y:
$$\sigma_{X|y}^2 = \sigma_X^2\,(1 - \rho^2)$$
The MSE of the estimator of X given Y = y is the MSE of the best constant estimator of X, $\sigma_X^2$, reduced by the factor $(1 - \rho^2)$ due to the dependency between X and Y.
Example

[Figure: the joint density of X and Y, the conditional density f(y|x = 1), and the conditional density f(x|y = 1), for $m_X = 1$, $m_Y = 0$, $\sigma_X^2 = 1$, $\sigma_Y^2 = 1$, $\rho_{XY} = 0.5$.]
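A small sketch (not from the slides) that evaluates the conditional mean and variance formulas for this example's parameters; the helper function name is mine:

```python
# Parameters from the example.
m_x, m_y = 1.0, 0.0
sigma_x, sigma_y = 1.0, 1.0
rho = 0.5

def conditional_stats(y):
    """Mean and variance of X given Y = y for jointly Gaussian X, Y."""
    m_cond = m_x + rho * (sigma_x / sigma_y) * (y - m_y)
    var_cond = sigma_x**2 * (1.0 - rho**2)
    return m_cond, var_cond

print(conditional_stats(1.0))  # (1.5, 0.75): observing Y = 1 shifts the
                               # estimate of X and shrinks its MSE by 25%
```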
Definition of a Random Process

Definition: A random process or stochastic process maps a probability space S to a set of functions X(t, ζ). It assigns to every outcome ζ ∈ S a time function X(t, ζ) for t ∈ I, where I is a discrete or continuous index set.

If I is discrete, X(t, ζ) is a discrete-time random process.
If I is continuous, X(t, ζ) is a continuous-time random process.

[Figure: sample functions X(t, ζ) of a random process, with sample times t1, t2 and t = n indicated.]
Example

Suppose that ζ is selected at random from S = [0,1] and consider $X(t, \zeta) = \cos(t\zeta)$ for all t.

[Figure: the sample functions corresponding to ζ = 0, 1/3, 2/3 and 1; each outcome of the experiment selects an entire function of t, which is where the randomness enters.]
Random sequences

A random sequence is simply a discrete-time random process. For every outcome of the experiment, a sequence is generated.
Example: Random Binary Sequence

Let ζ be a number selected at random from the interval S = [0,1] and let b1, b2, b3, … be the binary expansion of ζ:
$$\zeta = \sum_{i=1}^{\infty} b_i 2^{-i} = b_1\frac{1}{2} + b_2\frac{1}{4} + b_3\frac{1}{8} + \cdots$$
Define the random sequence $X(n, \zeta) = b_n$, $n = 1, 2, \ldots$

It can be shown that the sequence generated in this way is equivalent to the sequence generated by flipping a fair coin at every time step.
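A minimal sketch of this construction (not from the slides; note that a double-precision float only carries about 52 meaningful bits):

```python
import numpy as np

rng = np.random.default_rng(1)

def binary_expansion(zeta, n_bits):
    """First n_bits of the binary expansion of zeta in [0, 1)."""
    bits = []
    for _ in range(n_bits):
        zeta *= 2.0
        bit = int(zeta)   # integer part is the next bit
        bits.append(bit)
        zeta -= bit       # keep the fractional part
    return bits

# One outcome of the underlying experiment ...
zeta = rng.random()
# ... yields an entire sample sequence of the fair-coin process.
print(binary_expansion(zeta, 20))
```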
Convergence of sequences

A sequence of real numbers $x_n$ is said to converge to a limit x if for every ε > 0, there exists an N > 0 such that
$$|x_n - x| < \varepsilon \quad\text{for all } n > N$$
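For example (a standard illustration, not from the slides): the sequence $x_n = 1/n$ converges to $x = 0$, since given ε > 0 we can take $N = 1/\varepsilon$; then for all $n > N$,
$$|x_n - x| = \frac{1}{n} < \frac{1}{N} = \varepsilon$$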
Convergence of Random Sequences

Let Xn(ζ) be a random sequence and X(ζ) a random variable, defined on the same probability space. The standard notions of convergence are:

- Xn → X surely if Xn(ζ) → X(ζ) for every outcome ζ ∈ S.
- Xn → X almost surely if P({ζ : Xn(ζ) → X(ζ)}) = 1.
- Xn → X in mean square if E[|Xn − X|²] → 0 as n → ∞.
- Xn → X in probability if, for every ε > 0, P[|Xn − X| > ε] → 0 as n → ∞.
Relationship between convergence types

[Diagram: sure convergence implies almost-sure convergence, which implies convergence in probability; mean-square convergence also implies convergence in probability.]
Example of sure convergence

Let ζ be the outcome of flipping a coin, and define
Example of almost-sure convergence

Let ζ be chosen at random from [0,1] (i.e., a uniform distribution). Define
Proof: almost surely ⇒ in probability

If $X_n(\zeta) \to X(\zeta)$ almost surely, then P(C) = 1, where $C = \{\zeta : X_n(\zeta) \to X(\zeta)\}$.

Define $A(\varepsilon) = \{\zeta : |X_n(\zeta) - X(\zeta)| > \varepsilon \text{ for infinitely many } n\}$, and note that $C \subseteq A(\varepsilon)^c$, since $X_n(\zeta) \to X(\zeta)$ implies $\zeta \notin A(\varepsilon)$. Thus, if P(C) = 1, then P(A(ε)) = 0 for all ε > 0.

Define $B_n(\varepsilon) = \bigcup_{m \ge n} A_m(\varepsilon)$, where $A_m(\varepsilon) = \{\zeta : |X_m(\zeta) - X(\zeta)| > \varepsilon\}$, and note that $B_n(\varepsilon)$ is a decreasing sequence of events with limit A(ε). Thus, P(B_n(ε)) → 0 as n → ∞.

Finally, since $A_n(\varepsilon) \subseteq B_n(\varepsilon)$, $P(A_n(\varepsilon)) \le P(B_n(\varepsilon))$, which implies that
$$P[|X_n - X| > \varepsilon] = P(A_n(\varepsilon)) \to 0 \;\text{ as } n \to \infty \;\text{ for all } \varepsilon > 0$$
Example: in probability but not almost surely

Consider a sequence of independent random variables Xn where

Xn → 0 in probability, since $E[|X_n - X|^2] \to 0$ as $n \to \infty$ and, for every ε > 0,
$$P[|X_n - X| > \varepsilon] = P[|X_n - X|^2 > \varepsilon^2] \le \frac{E[|X_n - X|^2]}{\varepsilon^2} \to 0 \;\text{ as } n \to \infty$$
(applying the Markov inequality, proved on the next slide, to $|X_n - X|^2$; here X = 0).
Markov Inequality

Suppose that Z is a non-negative random variable. The Markov Inequality states that
$$P[Z \ge a] \le \frac{E[Z]}{a}$$
Proof:
$$\begin{aligned}
E[Z] &= \int_0^a z f_Z(z)\, dz + \int_a^\infty z f_Z(z)\, dz\\
&\ge \int_a^\infty z f_Z(z)\, dz \quad\text{(throw away the first term)}\\
&\ge \int_a^\infty a f_Z(z)\, dz \quad\text{(since } z \ge a \text{ on the region of integration)}\\
&= a \int_a^\infty f_Z(z)\, dz = a\, P[Z \ge a]
\end{aligned}$$
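As a quick check (not in the slides): if Z is exponential with $f_Z(z) = e^{-z}$ for $z \ge 0$, then $E[Z] = 1$ and
$$P[Z \ge a] = e^{-a} \le \frac{1}{a} = \frac{E[Z]}{a} \quad\text{for all } a > 0,$$
since $a e^{-a}$ attains its maximum value $e^{-1} < 1$ at $a = 1$.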
Example: in probability, but not mean square

Consider a sequence of independent random variables Xn:

Xn → 0 in probability:
Example: in mean square but not almost surely

Consider a sequence of independent random variables Xn where
Example: almost surely, but not mean square

Let ζ be chosen at random in [0,1]. Define
Events involving random processes

In general, events of interest for a random process concern the value of the random process at specific instants in time. For example:
$$A = \{X(0, \zeta) \le 1\}$$
$$B = \{X(0, \zeta) \le 0,\; 1 \le X(1, \zeta) \le 2\}$$
Example: Events involving random binary sequence

Find the following probabilities for the random binary sequence $X(n, \zeta) = b_n$:
P[X(1, ζ) = 0] and P[X(1, ζ) = 0 and X(2, ζ) = 1].

Solution
$$P[X(1, \zeta) = 0] = P\big[0 \le \zeta < \tfrac{1}{2}\big] = \tfrac{1}{2}$$
$$P[X(1, \zeta) = 0 \text{ and } X(2, \zeta) = 1] = P\big[\tfrac{1}{4} \le \zeta < \tfrac{1}{2}\big] = \tfrac{1}{4}$$
In general, any particular sequence of k bits corresponds to an interval of length $2^{-k}$. Thus, its probability is $2^{-k}$.
Specifying random processes

A random process is uniquely specified by the collection of all n-th order distribution or density functions.

The first order distribution of X(t, ζ) is
$$F(x_1; t_1) = P(\{X(t_1, \zeta) \le x_1\})$$
Higher Order Distributions and Densities

The second order distribution of X(t, ζ) is
$$F(x_1, x_2; t_1, t_2) = P(\{X(t_1) \le x_1,\; X(t_2) \le x_2\})$$
The second order density of X(t, ζ) is
$$f(x_1, x_2; t_1, t_2) = \frac{\partial^2}{\partial x_1 \partial x_2} F(x_1, x_2; t_1, t_2)$$
Similarly, the n-th order distribution is defined as
$$F(x_1, \ldots, x_n; t_1, \ldots, t_n) = P(\{X(t_1) \le x_1, \ldots, X(t_n) \le x_n\})$$
Example: Random pulse

Let X(t) be a unit-amplitude, unit-length pulse delayed by a random time T that is exponentially distributed:
$$f_T(t) = \begin{cases} e^{-t} & t \ge 0 \\ 0 & t < 0 \end{cases}$$

Graphical interpretation: [Figure: X(t) = 1 for T ≤ t ≤ T + 1 and 0 otherwise.]
First order probability mass function (random pulse)

For 0 ≤ t ≤ 1, X(t) = 1 if and only if T ≤ t (the pulse has started and, since t ≤ 1 ≤ T + 1, cannot yet have ended), so
$$P[X(t) = 1] = P[T \le t] = 1 - e^{-t}, \qquad P[X(t) = 0] = e^{-t}$$
Second order probability mass function

There are many possible cases. We consider only one to illustrate the idea.
I.I.D. Process

Definition: A discrete-time process X(n) is said to be independent and identically distributed (i.i.d.) if the random variables formed by any finite number of samples of the process are independent and identically distributed.
Bernoulli Random Process

Definition: A Bernoulli random process is simply a binary-alphabet i.i.d. random process. Thus, at each time n it assumes the value 1 or 0 with probability p or q = 1 − p, respectively.

Example:
$$P(\{X(0) = 1, X(1) = 0, X(5) = 0, X(6) = 1\}) = p(1-p)(1-p)p = p^2(1-p)^2 = \left(\tfrac{1}{2}\right)^4 = \tfrac{1}{16} \;\text{ for } p = \tfrac{1}{2}$$

Intuitively, the Bernoulli process is an infinite sequence of independent coin flips, each with probability p of heads (1). However, there is only one experiment. The random binary sequence example shows how a Bernoulli random process with p = 0.5 can be generated from a single experiment (picking a number between 0 and 1).
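A short simulation sketch (not from the slides) that checks the pattern probability above by relative frequency:

```python
import numpy as np

rng = np.random.default_rng(4)
p = 0.5

# Many realizations of the first seven samples X(0..6) of a Bernoulli process.
X = rng.random((500_000, 7)) < p

event = (X[:, 0] == 1) & (X[:, 1] == 0) & (X[:, 5] == 0) & (X[:, 6] == 1)
print(event.mean(), p**2 * (1 - p)**2)   # both ~ 1/16 for p = 0.5
```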
Mean, Autocorrelation and Autocovariance

Mean:
$$m_X(t) = E[X(t)] = \int_{-\infty}^{\infty} x f(x; t)\, dx$$
Autocorrelation:
$$R_X(t_1, t_2) = E[X(t_1) X(t_2)]$$
Autocovariance:
$$C_X(t_1, t_2) = E\big[(X(t_1) - m_X(t_1))(X(t_2) - m_X(t_2))\big] = R_X(t_1, t_2) - m_X(t_1)\, m_X(t_2)$$
Example: Random phase process

Let $X(t) = \cos(t + \Theta)$, where Θ is uniform on [−π, π]. Find $m_X(t)$, $R_X(t_1, t_2)$ and $C_X(t_1, t_2)$.

Solution:
$$m_X(t) = E[\cos(t + \Theta)] = \frac{1}{2\pi} \int_{-\pi}^{\pi} \cos(t + \theta)\, d\theta = 0$$
$$\begin{aligned}
C_X(t_1, t_2) = R_X(t_1, t_2) &= E[\cos(t_1 + \Theta)\cos(t_2 + \Theta)]\\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} \cos(t_1 + \theta)\cos(t_2 + \theta)\, d\theta\\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} \tfrac{1}{2}\{\cos(t_1 - t_2) + \cos(t_1 + t_2 + 2\theta)\}\, d\theta\\
&= \tfrac{1}{2}\cos(t_1 - t_2)
\end{aligned}$$
using the identity $\cos(a)\cos(b) = \tfrac{1}{2}(\cos(a - b) + \cos(a + b))$.
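A Monte Carlo check of this result (a sketch, not from the slides; the sample times t1, t2 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)

# Random phase process X(t) = cos(t + Theta), Theta ~ Uniform[-pi, pi].
theta = rng.uniform(-np.pi, np.pi, size=1_000_000)

t1, t2 = 0.7, 2.0  # arbitrary sample times
x1 = np.cos(t1 + theta)
x2 = np.cos(t2 + theta)

print(x1.mean())          # ~ 0, the mean m_X(t1)
print((x1 * x2).mean())   # ~ 0.5 * cos(t1 - t2)
print(0.5 * np.cos(t1 - t2))
```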
Mean and Covariance Function of an I.I.D. Process

The mean of an i.i.d. process is constant: $m_X(n) = m$. Its autocovariance is
$$C_X(n_1, n_2) = \sigma^2 \delta(n_1, n_2), \quad\text{where}\quad \delta(n_1, n_2) = \begin{cases} 1 & n_1 = n_2 \\ 0 & n_1 \ne n_2 \end{cases}$$
Derivation of Covariance Fxn of I.I.D. Process

$$C_X(n_1, n_2) = R_X(n_1, n_2) - m_X(n_1)\, m_X(n_2) = R_X(n_1, n_2) - m^2$$
where $R_X(n_1, n_2) = E[X(n_1) X(n_2)] = \iint xy\, f(x, y; n_1, n_2)\, dx\, dy$. For $n_1 \ne n_2$, independence factors this into
$$\left(\int x f(x)\, dx\right)\left(\int y f(y)\, dy\right) = m^2$$
while for $n_1 = n_2$, $R_X(n_1, n_1) = E[X(n_1)^2] = \sigma^2 + m^2$. Thus,
$$R_X(n_1, n_2) = \sigma^2 \delta(n_1, n_2) + m^2 \quad\Rightarrow\quad C_X(n_1, n_2) = \sigma^2 \delta(n_1, n_2)$$
Gaussian Random Process

Definition: A random process is said to be Gaussian if all finite order distributions are jointly Gaussian, i.e., for all $k < \infty$ and any set of sample times $n_i \in I$, $i \in \{1, \ldots, k\}$,
$$f(x_1, \ldots, x_k; n_1, \ldots, n_k) = \frac{1}{(2\pi)^{k/2}\, |C|^{1/2}}\, e^{-\frac{1}{2}(x - m)^T C^{-1}(x - m)}$$
where
$$m = \begin{bmatrix} m_X(n_1) \\ \vdots \\ m_X(n_k) \end{bmatrix} \quad\text{and}\quad C = \begin{bmatrix} C(n_1, n_1) & \cdots & C(n_1, n_k) \\ \vdots & \ddots & \vdots \\ C(n_k, n_1) & \cdots & C(n_k, n_k) \end{bmatrix}$$
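A short sketch (not from the slides) that samples a discrete-time Gaussian process at k times by building m and C from a mean function and covariance function; the exponential covariance is an arbitrary illustrative choice, not one from the lecture:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative (assumed) mean and covariance functions.
def m_X(n):
    return 0.0

def C_X(n1, n2):
    return np.exp(-abs(n1 - n2))  # example kernel; any PSD kernel works

times = np.arange(10)
m = np.array([m_X(n) for n in times])
C = np.array([[C_X(n1, n2) for n2 in times] for n1 in times])

# Any finite set of samples of a Gaussian process is jointly Gaussian,
# so sample paths can be drawn directly from N(m, C).
paths = rng.multivariate_normal(m, C, size=5)
print(paths.shape)  # (5, 10): five sample paths at ten time points
```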