ECE 562, Spring 2006
Lecture 1: January 26
Lecturer: Sergio D. Servetto    Scribe: Roberto Torralbas
1.1 Shannon's Information Measures
Claude E. Shannon, in his paper "A Mathematical Theory of Communication," laid out the foundation for the field of information theory. In it Shannon introduced two fundamental concepts about information: the first is that information is uncertainty, and the second is that information to be transmitted is digital. Shannon also introduced a very important theorem known as the source coding theorem, which establishes entropy as the fundamental measure of information. Entropy characterizes the minimum rate of a source code that represents an information source essentially free of error. Information measures refer to entropy, conditional entropy, mutual information, and conditional mutual information. In the following sections these measures are introduced along with their properties. The physical meaning of these measures will be discussed in subsequent lectures.
1.2 Entropy
Definition 1.1. The entropy H(X) of a random variable X is defined by

    H(X) = -\sum_{x} p(x) \log_a p(x).                                  (1.1)

Entropy is the first measure of information: the average amount of uncertainty about the outcome of a random variable. In the above definition p(x) is the pmf of the random variable X. In subsequent lectures it will be shown that a, the base of the logarithm, is usually taken to be the size of the code alphabet.

This definition can be written in a different form using the definition of the expectation of a random variable. Let g: \mathcal{X} \to \mathbb{R}, where X is a random variable on \mathcal{X} = \{1, \ldots, n\}. Then the expectation of Y = g(X) is

    E(Y) = E(g(X)) = \sum_{x} p(x) g(x).                                (1.2)

Taking g(x) = -\log_a p(x) gives the equivalent form

    H(X) = -E(\log_a p(X)).                                             (1.3)
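As an illustrative aside (not part of the original lecture), the entropy can be computed directly from a pmf. The short Python sketch below assumes the pmf is given as a dictionary of probabilities; the uniform four-symbol pmf is an arbitrary example.

    import math

    def entropy(pmf, base=2):
        # H(X) = -sum_x p(x) log_a p(x); terms with p(x) = 0 contribute nothing.
        return -sum(p * math.log(p, base) for p in pmf.values() if p > 0)

    # Assumed example: a uniform pmf on four symbols.
    p_X = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
    print(entropy(p_X))  # 2.0 bits, since log_2(4) = 2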
Example 1.1. Take 0 \le p \le 1 and define the binary entropy function, the entropy of a binary random variable that takes one of its two values with probability p, as follows:

    h_b(p) = \begin{cases} -p \log_2 p - (1-p) \log_2 (1-p), & 0 < p < 1 \\ 0, & \text{otherwise.} \end{cases}      (1.4)

Notice that uncertainty reflects our ability to predict the outcome: h_b(p) is maximized at p = 1/2, which corresponds to a fair coin toss.
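A quick numerical sanity check of the binary entropy function, sketched in Python (the sample values of p are arbitrary):

    import math

    def h_b(p):
        # Binary entropy in bits; by convention h_b(0) = h_b(1) = 0.
        if p <= 0 or p >= 1:
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    print(h_b(0.5))   # 1.0: maximum uncertainty, the fair coin toss
    print(h_b(0.11))  # ~0.5: a biased coin is easier to predict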
1.3 Joint Entropy
Definition 1.2. The joint entropy H(X,Y) of a pair of random variables X and Y is defined by

    H(X,Y) = -\sum_{x,y} p(x,y) \log_a p(x,y) = -E(\log_a p(X,Y)).      (1.5)

The joint entropy is a measure of the total amount of uncertainty the two random variables contain.
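The same kind of sketch extends to the joint entropy by summing over pairs; the joint pmf below is an assumed example, not taken from the lecture.

    import math

    def joint_entropy(p_XY, base=2):
        # H(X,Y) = -sum_{x,y} p(x,y) log_a p(x,y)
        return -sum(p * math.log(p, base) for p in p_XY.values() if p > 0)

    # Assumed joint pmf on {0,1} x {a,b} for illustration.
    p_XY = {(0, 'a'): 0.25, (0, 'b'): 0.25, (1, 'a'): 0.5, (1, 'b'): 0.0}
    print(joint_entropy(p_XY))  # 1.5 bits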
1.4 Conditional Entropy
Definition 1.3. The conditional entropy H(X|Y) of a pair of random variables X and Y is defined by

    H(X|Y) = -\sum_{x,y} p(x,y) \log_a p(x|y) = -E(\log_a p(X|Y)).      (1.6)

Writing p(x,y) = p(y) p(x|y), the conditional entropy can be expanded as

    H(X|Y) = -\sum_{x,y} p(x,y) \log_a p(x|y)
           = \sum_{y} p(y) \left( -\sum_{x} p(x|y) \log_a p(x|y) \right)
           = \sum_{y} p(y) H(X|Y=y),

that is, H(X|Y) = E_Y(H(X|Y=y)).

Example 1.2. Given the table below, calculate H(Y|X=0), H(Y|X=1), and H(Y|X).

    x \ y    a      b
    0        1/4    1/4
    1        1/2    0

If X = 0 we have the binary entropy discussed previously, and therefore H(Y|X=0) = 1 since the probability of event a equals that of event b. Similarly H(Y|X=1) = 0, since there is no uncertainty: event b never occurs. Finally, for H(Y|X) we get:

    H(Y|X) = P(X=0) H(Y|X=0) + P(X=1) H(Y|X=1) = (1/2)(1) + (1/2)(0) = 1/2.

There are some special properties relating the entropy, joint entropy, and conditional entropy of two random variables:

    H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y).
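As an illustrative check of Example 1.2 (not part of the original notes), the sketch below hard-codes the table's marginal and conditional probabilities and computes H(Y|X) as the weighted average of per-row binary entropies:

    import math

    def h_b(p):
        # Binary entropy in bits.
        return 0.0 if p <= 0 or p >= 1 else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    # From the table: rows x in {0, 1}, columns y in {a, b}.
    p_X = {0: 0.5, 1: 0.5}              # marginal of X (row sums)
    p_b_given_X = {0: 0.5, 1: 0.0}      # P(Y = b | X = x) for each row

    H_Y_given_X = sum(p_X[x] * h_b(p_b_given_X[x]) for x in p_X)
    print(H_Y_given_X)  # 0.5 bits, matching (1/2)(1) + (1/2)(0)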
Proof of the chain rule H(X,Y) = H(X) + H(Y|X):

    H(X,Y) = -E(\log_a p(X,Y))
           = -E(\log_a p(X) p(Y|X))
           = -E(\log_a p(X)) - E(\log_a p(Y|X))
           = H(X) + H(Y|X).
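A numerical sanity check of the chain rule, sketched in Python and reusing the joint pmf of Example 1.2 (assumed representation, not from the lecture):

    import math

    p_XY = {(0, 'a'): 0.25, (0, 'b'): 0.25, (1, 'a'): 0.5, (1, 'b'): 0.0}

    H_XY = -sum(p * math.log2(p) for p in p_XY.values() if p > 0)  # joint entropy: 1.5 bits
    H_X = -sum(p * math.log2(p) for p in (0.5, 0.5))               # entropy of the X marginal: 1.0 bit
    H_Y_given_X = 0.5                                              # computed in Example 1.2

    print(abs(H_XY - (H_X + H_Y_given_X)) < 1e-12)  # True: H(X,Y) = H(X) + H(Y|X)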
1.5 Mutual Information
Definition 1.4. The mutual information I(X,Y) of a pair of random variables X and Y is defined by:

    I(X,Y) = \sum_{x,y} p(x,y) \log_a \frac{p(x,y)}{p(x) p(y)} = E\left(\log_a \frac{p(X,Y)}{p(X) p(Y)}\right).      (1.7)

- Notice that if X and Y are independent, then the mutual information between the two random variables is zero: I(X,Y) = 0 whenever X \perp Y.
- If P(X=Y) = 1, then I(X,Y) = H(X) = H(Y).

Proof:

    I(X,X) = E\left(\log_a \frac{p(X)}{p(X) p(X)}\right) = E\left(\log_a \frac{1}{p(X)}\right) = H(X).
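As a final sketch (again not part of the original notes), the mutual information can be computed from a joint pmf by first forming the marginals; the helper function name and the example pmfs below are assumed for illustration:

    import math

    def mutual_information(p_XY, base=2):
        # I(X,Y) = sum_{x,y} p(x,y) log_a( p(x,y) / (p(x) p(y)) )
        p_X, p_Y = {}, {}
        for (x, y), p in p_XY.items():
            p_X[x] = p_X.get(x, 0.0) + p
            p_Y[y] = p_Y.get(y, 0.0) + p
        return sum(p * math.log(p / (p_X[x] * p_Y[y]), base)
                   for (x, y), p in p_XY.items() if p > 0)

    # Dependent pair (the Example 1.2 table): positive mutual information.
    p_XY = {(0, 'a'): 0.25, (0, 'b'): 0.25, (1, 'a'): 0.5, (1, 'b'): 0.0}
    print(mutual_information(p_XY))  # ~0.3113 bits

    # Independent pair (product pmf): mutual information is zero.
    p_indep = {(x, y): 0.25 for x in (0, 1) for y in ('a', 'b')}
    print(mutual_information(p_indep))  # 0.0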
Figure 1.2. Venn Diagram Relating the Mutual Information, Entropy, and Conditional Entropy of Two Random Variables