Information Theory: Dr. Muhammad Imran Farid


Information Theory

Dr. Muhammad Imran Farid


Topics
▪ Introduction
▪ Uniquely decodable and prefix codes
▪ Huffman code

Typical Communication Block Diagram
[Block diagram: Information Source → Source Encoding (u) → Encryption (v) → Channel Encoding → Digital Modulation → Channel → Digital Demodulation (r) → Channel Decoding → Decryption → Source Decoding (û) → Sink]
Source Coding
▪ Objective: To represent the source at a lower rate.

[Figure 1: Typical communication system — the source signal x(t) or x[n] passes through the channel and reaches the destination as y(t) or y[n]]

Why Source coding?

▪ x(t) / x[n] may have redundant, undesired, or unimportant extra information
▪ Example: Telephone-quality speech can be compressed to as little as 1-2 kbps.
  ▪ One cannot perceive the difference in voice quality.

▪ x(t) / x[n] may have redundant, undesired, or unimportant extra information
▪ Example: A band-limited continuous-time signal x(t) with bandwidth W, sampled at a rate ≥ 2W to give x[n]
  ▪ No loss of information
  ▪ The intermediate values are redundant


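This claim can be checked numerically. Below is a small sketch (the signal, the bandwidth W = 4 Hz, and the sampling window are hypothetical choices): two tones below W are sampled at fs = 2W and rebuilt by sinc (Whittaker-Shannon) interpolation; the tiny residual error comes only from truncating the interpolation sum.

```python
import numpy as np

W = 4.0                  # assumed bandwidth (Hz): all signal content lies below W
fs = 2 * W               # sampling rate >= 2W (Nyquist)
T = 1.0 / fs

def x(t):
    # Band-limited test signal: tones at 1 Hz and 3 Hz, both below W
    return np.sin(2 * np.pi * 1.0 * t) + 0.5 * np.cos(2 * np.pi * 3.0 * t)

n = np.arange(-400, 401)              # finite window of sample indices
xn = x(n * T)                         # x[n] = x(nT)

# Reconstruct x(t) from the samples by sinc interpolation
t = np.linspace(-5, 5, 501)
x_rec = np.array([np.sum(xn * np.sinc((ti - n * T) / T)) for ti in t])

print(np.max(np.abs(x_rec - x(t))))   # small: the intermediate values carry no new information
```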
▪ The channel may not have the “capacity” to
communicate at the source information rate
▪ Therefore, the source must be represented at a lower
rate, with some loss of information
▪ Example: A real valued sample has infinite
information. It needs to be quantized before
transmission through a discrete memoryless
channel (DMC)

▪ Non-uniform distribution of the source may allow
efficient representation of the signal at a lower rate
▪ Example: For binary transmission of English text,
frequent letters may be represented by shorter bit
sequences, e.g. Morse Code

E: .
Z: _ _ . .

▪ Morse code of the English alphabet

▪ Morse Code Generator

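A minimal sketch of such a generator (only a handful of letters from the full Morse table are included here for illustration):

```python
# Partial Morse table: the frequent letters E and T get the shortest sequences
MORSE = {
    'E': '.',   'T': '-',   'A': '.-',   'N': '-.',
    'I': '..',  'S': '...', 'O': '---',  'Z': '--..',
}

def to_morse(text):
    # Encode each known letter, separating letters with a space
    return ' '.join(MORSE[c] for c in text.upper() if c in MORSE)

print(to_morse("SEAT"))   # "... . .- -"
```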
Types of Source Coding
▪ Lossless
▪ No loss of information
▪ Used by various compression schemes for file
storage (e.g. compress in Linux, WinZip in Windows)
▪ Lossy
▪ With loss of information, e.g. quantization, vector
quantization, sub-band coding
▪ Used by standards like JPEG, MPEG etc.

Lossless Source Coding for Discrete Memoryless Sources

Discrete Memoryless Source

X 0 , X 1 , X 2 ........
Source
i.i.d

X i   x0 , x1 ,..., xM 1

P  X i  x0  ,P  X i  x1 ,..., P  X i  xM 1 known)


Example: an analog source sampled and quantized

13
Independent and identically distributed
random variables

▪ In probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent. (Wikipedia)

• Without source coding, what is the obvious way of coding a
discrete memoryless source?

If M ≤ 2^b, then we can represent the symbols by binary fixed-length
codes as:

Symbol x_i    Code C(x_i)    Length l_i    Probability P_i
x_0           0…000          l_0 = b       P_0
x_1           0…001          l_1 = b       P_1
x_2           0…010          l_2 = b       P_2
⋮             ⋮              ⋮             ⋮

This requires b bits per symbol.
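A small sketch of this fixed-length assignment (symbol names are placeholders): b = ⌈log2 M⌉ bits are enough for M symbols.

```python
import math

def fixed_length_code(symbols):
    # Assign each of the M symbols a b-bit index, b = ceil(log2(M))
    M = len(symbols)
    b = math.ceil(math.log2(M))
    return {s: format(i, f'0{b}b') for i, s in enumerate(symbols)}

print(fixed_length_code(['x0', 'x1', 'x2', 'x3', 'x4']))
# 5 symbols -> b = 3: {'x0': '000', 'x1': '001', 'x2': '010', 'x3': '011', 'x4': '100'}
```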
Something better is possible
Example:
symbol: x0 x1 x2 x3
probability: 0.5 0.25 0.125 0.125
fixed length: 00 01 10 11
variable length: 0 10 110 111

• Average length for fixed length code = 2


• For variable length code = 1.75

• Entropy, H(X) = 1.75 bits
• By Shannon's source coding theorem, entropy is the minimum average
number of bits per symbol attainable when coding the random variable
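A quick numerical check of these three numbers, as a sketch using the example above:

```python
import math

probs  = {'x0': 0.5, 'x1': 0.25, 'x2': 0.125, 'x3': 0.125}
fixed  = {'x0': '00', 'x1': '01', 'x2': '10',  'x3': '11'}
varlen = {'x0': '0',  'x1': '10', 'x2': '110', 'x3': '111'}

def avg_length(code):
    # Expected code length: sum_i P_i * l_i
    return sum(probs[s] * len(code[s]) for s in probs)

H = sum(p * math.log2(1 / p) for p in probs.values())   # entropy in bits

print(avg_length(fixed), avg_length(varlen), H)          # 2.0  1.75  1.75
```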
Entropy
▪ What is the overall uncertainty of an information source
whose outcomes have unequal probabilities?
▪ This measure is called entropy.
▪ Entropy is the average amount of information that we
receive per symbol from an information source.
▪ NOTE: Entropy is NOT information and information is not
entropy.

Entropy cont..
▪ Entropy is a MEASUREMENT of our average
UNCERTAINTY when we don’t know the outcome of an
information source. That means it’s a measurement of
how much information we DON’T have.
▪ However, this also means it’s the average amount of
information that we WILL have when we receive an
outcome from an information source.
H(X) = Σ_{i=1}^{M} P_i log2(1/P_i)
Entropy examples
 1   1 
▪ For fair coin: H  X   0.5log 2   0.5log 2   1
 0.5   0.5 

▪ For unfair coin:


 1   1 
H  X   0.75log 2   0.25log 2    0.811bits
 0.75   0.25 

▪ For unbalanced die of value {0.1, 0.1, 0.1, 0.5, 0.1, 0.1}
▪ H  X   2.161bits and for balance its log 2   bits
19
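These values can be reproduced with a small helper function (a sketch; zero-probability outcomes are skipped since they contribute nothing to the sum):

```python
import math

def entropy(p):
    # H(X) = sum_i P_i * log2(1/P_i), in bits
    return sum(pi * math.log2(1 / pi) for pi in p if pi > 0)

print(entropy([0.5, 0.5]))                       # fair coin: 1.0
print(entropy([0.75, 0.25]))                     # unfair coin: ~0.811
print(entropy([0.1, 0.1, 0.1, 0.5, 0.1, 0.1]))   # unbalanced die: ~2.161
print(math.log2(6))                              # balanced die: ~2.585
```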
Desired properties of source codes
▪ Non-singular: x_i ≠ x_j ⇒ C(x_i) ≠ C(x_j)
  ▪ All code words are different from each other

▪ Example: {0, 010, 01, 10}

▪ Singular codes are not useful

Desired properties of source codes
▪ Uniquely decodable: for any two distinct finite sequences of source symbols,
  x_1 x_2 ... x_m ≠ x'_1 x'_2 ... x'_n
  implies C(x_1) C(x_2) ... C(x_m) ≠ C(x'_1) C(x'_2) ... C(x'_n)

▪ The decoder may need to observe future bits

▪ Non-example: {0, 010, 01, 10}
  ▪ 01010 = 0|10|10 = 01|0|10 = 01|010 = 010|10 (ambiguous)
▪ Example: {10, 00, 11, 110}
Desired properties of source codes
▪ Prefix code or instantaneous code: no code word is
a prefix of any other code word
▪ can be decoded instantaneously
▪ Non-example: {10, 00, 11, 110} is not prefix-free. If
we receive 1100010, then we can decode the first
symbol 110 only after receiving 110001.

▪ Example: 0,,110,1110

▪ NOTE: When code come at the receiver the whole code


will appear as given in the examples above.
22
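The prefix condition is easy to test mechanically; a one-function sketch:

```python
def is_prefix_free(code):
    # Instantaneous (prefix) code iff no code word is a prefix of another code word
    return not any(a != b and b.startswith(a) for a in code for b in code)

print(is_prefix_free({'10', '00', '11', '110'}))   # False: '11' is a prefix of '110'
print(is_prefix_free({'0', '10', '110', '1110'}))  # True
```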
Example: Classes of codes

X    Singular    Non-singular,    UD, but        Prefix
                 but not UD       not prefix
1    0           0                10             0
2    0           010              00             10
3    0           01               11             110
4    0           10               110            111

For the non-singular code, 01010 can be parsed as 0|10|10, 01|0|10,
01|010, or 010|10, so it is not uniquely decodable.
Exercise
• Is the code {00, 11, 0101, 111, 1010, 100100, 0110}
uniquely decodable?

Exercise
• Is the code {00, 11, 0101, 111, 1010, 100100, 0110}
uniquely decodable?

Answer: No. Consider the sequence: 111010111.

It can be decoded as either

11 1010 111
or
111 0101 11
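Unique decodability can also be tested mechanically with the Sardinas-Patterson algorithm; a compact sketch:

```python
def is_uniquely_decodable(code):
    # Sardinas-Patterson test: the code is UD iff no dangling suffix is itself a code word
    C = set(code)

    def dangling(A, B):
        # Suffixes left over when a word of A is a proper prefix of a word of B
        return {b[len(a):] for a in A for b in B if len(b) > len(a) and b.startswith(a)}

    S, seen = dangling(C, C), set()
    while S:
        if S & C:          # a dangling suffix equals a code word -> an ambiguous parsing exists
            return False
        if S <= seen:      # no new suffixes can ever appear -> uniquely decodable
            return True
        seen |= S
        S = dangling(C, S) | dangling(S, C)
    return True

print(is_uniquely_decodable({'00', '11', '0101', '111', '1010', '100100', '0110'}))  # False
print(is_uniquely_decodable({'10', '00', '11', '110'}))                              # True
```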
Huffman code:

X    Code word    Probability
1                 0.20
2                 0.15
3                 0.25
4                 0.25
5                 0.15
Huffman code:

X    Code word    Probability
3                 0.25
4                 0.25
1                 0.20
2                 0.15
5                 0.15
Huffman Code:
▪ Arrange the symbols in decreasing order of their
probability
▪ Assign 0 and 1 as the last bit of the two least probable symbols
▪ Combine these two symbols into one new symbol and assign the sum
of their probabilities to it
▪ Repeat this process on the new set of symbols again and
again. While assigning a bit to a derived symbol, the bit is
prepended to the code words of all the contributing
symbols (see the sketch below).
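A minimal sketch of this procedure in Python, using heapq for the repeated "take the two least probable symbols" step and the probabilities from the table two slides back:

```python
import heapq
from itertools import count

def huffman_code(probs):
    # probs: {symbol: probability}. Returns {symbol: binary code word}.
    tie = count()                                      # tie-breaker so the heap never compares symbol lists
    heap = [(p, next(tie), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    code = {s: '' for s in probs}

    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)            # the two least probable (groups of) symbols
        p1, _, group1 = heapq.heappop(heap)
        for s in group0:                               # prepend the new bit to every contributing symbol
            code[s] = '0' + code[s]
        for s in group1:
            code[s] = '1' + code[s]
        heapq.heappush(heap, (p0 + p1, next(tie), group0 + group1))
    return code

probs = {3: 0.25, 4: 0.25, 1: 0.20, 2: 0.15, 5: 0.15}
code = huffman_code(probs)
print(code)
print(sum(probs[s] * len(code[s]) for s in probs))     # average length = 2.3 bits/symbol
```

Ties may be broken differently from the construction drawn on the slides, so individual code words can differ, but any valid Huffman code gives the same minimum average length.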
Huffman Code:
▪ Properties:
▪ Huffman code is a prefix code
▪ Huffman code is optimal: no uniquely decodable code achieves a
smaller average length, i.e. the Huffman code gives the
minimum average code length for the given source probabilities.

Exercise:
▪ Consider the random variable
  X:    x1     x2     x3     x4     x5     x6     x7
  P(X): 0.49   0.26   0.12   0.04   0.04   0.03   0.02
a) Find a binary Huffman code for X.


b) Find the expected code length for this encoding.
c) What is the minimum length of any fixed length
code for this random variable?
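For checking the answers, the huffman_code sketch given after the algorithm description can be reused (a hypothetical helper, not part of the original exercise):

```python
import math

probs = {'x1': 0.49, 'x2': 0.26, 'x3': 0.12, 'x4': 0.04,
         'x5': 0.04, 'x6': 0.03, 'x7': 0.02}

code = huffman_code(probs)                              # helper sketched earlier
L = sum(p * len(code[s]) for s, p in probs.items())     # expected code length
print(code)
print(L)                                                # 2.02 bits/symbol
print(math.ceil(math.log2(len(probs))))                 # any fixed-length code needs 3 bits/symbol
```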
Exercise:
▪ Which of these codes cannot be a Huffman code for
any probability assignment?
a) {0, 10, 11}.
b) {00, 01, 10, 110}.
c) {01, 10}.

Exercise:
▪ Answer:
a) {0, 10, 11}: Yes, it is a Huffman code, e.g. for the
probabilities {0.5, 0.25, 0.25}.
b) {00, 01, 10, 110}: No. There cannot be a single longest
code word in a Huffman code; 110 could be shortened to 11
without losing the prefix property.
c) {01, 10}: No. The Huffman code for any two-symbol source
is {0, 1}; {01, 10} uses an unnecessary extra bit.

