Information Theory: Dr. Muhammad Imran Farid
Typical Communication Block Diagram
[Block diagram] Information Source → u → Source Encoding → v → Encryption → Channel Encoding → Digital Modulation → Channel
▪ x(t) / x[n] may contain redundant or unimportant extra information
▪ Example: A telephone-quality speech signal can be compressed to as little as 1-2 kbps
▪ One cannot perceive the difference in voice quality
▪ Example: A band-limited continuous-time signal x(t) of bandwidth W, sampled at a rate >= 2W, yields x[n] with no loss of information (the sampling theorem)
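A minimal numerical sketch of this claim (the toy signal, the rate 2.5W, and all names here are my assumptions, not from the slides): sample above the Nyquist rate 2W and reconstruct by sinc interpolation.

import numpy as np

W = 7.0                    # highest frequency in the toy signal (Hz)
fs = 2.5 * W               # sampling rate, above the Nyquist rate 2W

def x(t):                  # band-limited test signal, bandwidth W = 7 Hz
    return np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 7 * t)

n = np.arange(-200, 201)   # sample indices
samples = x(n / fs)        # x[n] = x(n / fs)

# Ideal reconstruction: x(t) = sum_n x[n] * sinc(fs * t - n)
t = np.linspace(-1.0, 1.0, 1000)
x_rec = np.array([np.sum(samples * np.sinc(fs * ti - n)) for ti in t])

print(np.max(np.abs(x_rec - x(t))))   # tiny residual, only from truncating the sum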
▪ A non-uniform source distribution may allow a more efficient representation of the signal at a lower rate
▪ Example: For binary transmission of English text, frequent letters may be represented by shorter bit sequences, as in Morse code:
E: .
Z: _ _ . .
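A two-entry toy encoder along these lines (the mapping contains just the slide's two examples; a real table covers the whole alphabet):

MORSE = {"E": ".", "Z": "--.."}                # frequent E gets the shortest code word
message = "EZE"                                # hypothetical message
print(" ".join(MORSE[ch] for ch in message))   # . --.. .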
▪ Morse code of the English alphabet [table not reproduced]
▪ Morse code generator [figure not reproduced]
Types of Source Coding
▪ Lossless
▪ No loss of information
▪ Used by various compression schemes for file storage (e.g. compress on Linux, WinZip on Windows)
▪ Lossy
▪ Some loss of information, e.g. quantization, vector quantization, sub-band coding
▪ Used by standards like JPEG, MPEG, etc.
Lossless Source Coding for Discrete Memoryless Sources
Discrete Memoryless Source
The source emits a sequence X_0, X_1, X_2, ... of i.i.d. random variables, each X_i taking values in the alphabet {x_0, x_1, ..., x_{M-1}}
Independent and identically distributed (i.i.d.) random variables
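A DMS is easy to simulate with i.i.d. draws; the alphabet and probabilities below are arbitrary assumptions for illustration:

import random

alphabet = ["x0", "x1", "x2"]
probs = [0.5, 0.3, 0.2]                                    # must sum to 1
# Each draw is independent and uses the same distribution, i.e. i.i.d.
sequence = random.choices(alphabet, weights=probs, k=10)   # X_0, ..., X_9
print(sequence)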
• Without source coding, what is the obvious way of coding a discrete memoryless source?
• If M <= 2^b, then we can represent the symbols by binary fixed-length codes of b bits each
• For the slide's example source, entropy H(X) = 1.75 bits
• By Shannon's source coding theorem, entropy is the minimum average number of bits per symbol needed to code the random variable
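The 1.75-bit figure is consistent with, for example, a four-symbol source with probabilities {1/2, 1/4, 1/8, 1/8}; this particular distribution is an assumption, since the slide's symbol table is not reproduced here:

H(X) = (1/2) log2 2 + (1/4) log2 4 + (1/8) log2 8 + (1/8) log2 8
     = 0.5 + 0.5 + 0.375 + 0.375 = 1.75 bits

A fixed-length code for M = 4 symbols needs b = log2 4 = 2 bits per symbol, so source coding can save 0.25 bits per symbol on average.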
Entropy
▪ What is the overall uncertainty of an information source whose outcomes have unequal probabilities?
▪ This measure of uncertainty is called entropy.
▪ Entropy is the average amount of information that we receive per symbol from an information source.
▪ NOTE: Entropy is NOT information and information is NOT entropy.
Entropy cont..
▪ Entropy is a MEASUREMENT of our average
UNCERTAINTY when we don’t know the outcome of an
information source. That means it’s a measurement of
how much information we DON’T have.
▪ However, this also means it’s the average amount of
information that we WILL have when we receive an
outcome from an information source.
H(X) = Σ_{i=1}^{M} P_i log2(1/P_i)
Entropy examples
▪ For a fair coin: H(X) = 0.5 log2(1/0.5) + 0.5 log2(1/0.5) = 1 bit
▪ For an unbalanced die with probabilities {0.1, 0.1, 0.1, 0.5, 0.1, 0.1}:
H(X) = 2.161 bits, while for a balanced die it is log2 6 ≈ 2.585 bits
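Both values can be checked with a few lines of Python (a sketch; the function name entropy is mine):

import math

def entropy(probs):
    # H(X) = sum over i of P_i * log2(1 / P_i), skipping zero-probability outcomes
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))                         # fair coin: 1.0 bit
print(entropy([0.1, 0.1, 0.1, 0.5, 0.1, 0.1]))     # unbalanced die: ~2.161 bits
print(entropy([1/6] * 6))                          # balanced die: log2(6) ~ 2.585 bits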
Desired properties of source codes
▪ Non-singular: x_i ≠ x_j ⇒ C(x_i) ≠ C(x_j)
▪ All code words are distinct from each other
▪ Example: {0, 010, 01, 10}
Desired properties of source codes
▪ Uniquely decodable: for any two distinct finite source sequences
x_1 x_2 ... x_m ≠ x_1' x_2' ... x_n'
the encoded sequences also differ:
C(x_1)C(x_2)...C(x_m) ≠ C(x_1')C(x_2')...C(x_n')
▪ Example: {0, 10, 110, 1110}
e.g. 01010 decodes only as 0 | 10 | 10
Exercise
• Is the code {00, 11, 0101, 111, 1010, 100100, 0110}
uniquely decodable?
Exercise cont..
• No, the code is not uniquely decodable: the string 111010111 can be parsed in two ways:
11 | 1010 | 111
or
111 | 0101 | 11
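Such double parsings can be hunted for systematically with the Sardinas-Patterson test; the slides do not give this algorithm, so the sketch below is an illustration only:

def is_uniquely_decodable(code):
    # Sardinas-Patterson test: track "dangling suffixes"; the code is uniquely
    # decodable iff no dangling suffix is itself a code word.
    code = set(code)
    suffixes = {b[len(a):] for a in code for b in code
                if a != b and b.startswith(a)}
    seen = set()
    while suffixes:
        if suffixes & code:          # a dangling suffix equals a code word,
            return False             # so some string has two parsings
        seen |= suffixes
        nxt = set()
        for s in suffixes:
            for c in code:
                if c.startswith(s) and len(c) > len(s):
                    nxt.add(c[len(s):])
                if s.startswith(c) and len(s) > len(c):
                    nxt.add(s[len(c):])
        suffixes = nxt - seen
    return True

print(is_uniquely_decodable(["0", "10", "110", "1110"]))   # True
print(is_uniquely_decodable(["00", "11", "0101", "111",
                             "1010", "100100", "0110"]))   # False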
Huffman code:
[worked examples not reproduced]
Huffman Code:
▪ Arrange the symbols in decreasing order of their probability
▪ Assign 0 and 1 as the last bit of the code words of the two least probable symbols
▪ Combine those two symbols into one and assign the sum of their probabilities to the new symbol
▪ Repeat this process on the new set of symbols again and again. While assigning a bit to a derived symbol, the bit is prepended to the code words of all the contributing symbols (a runnable sketch follows this list).
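A minimal runnable sketch of this procedure for binary Huffman coding (the function name huffman and the use of a heap are my choices, not the slides'):

import heapq
import itertools

def huffman(probs):
    # Build a binary Huffman code for {symbol: probability};
    # returns {symbol: code word}.
    tie = itertools.count()   # tie-breaker so equal probabilities never compare dicts
    heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)   # the two least probable symbols
        p2, _, group2 = heapq.heappop(heap)
        # Prepend 0/1 to the code words of all contributing symbols
        merged = {s: "0" + w for s, w in group1.items()}
        merged.update({s: "1" + w for s, w in group2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

print(huffman({"a": 0.5, "b": 0.25, "c": 0.25}))   # {'a': '0', 'b': '10', 'c': '11'}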
Huffman Code:
▪ Properties:
▪ Huffman code is a prefix code
▪ Huffman code is optimal: for any random variable, no uniquely decodable code achieves a smaller average code-word length than the Huffman code.
Exercise:
▪ Consider the random variable
X:     x1    x2    x3    x4    x5    x6    x7
P(X):  0.49  0.26  0.12  0.04  0.04  0.03  0.02
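Reusing the huffman() sketch from above, one way to check a hand construction for this source:

probs = {"x1": 0.49, "x2": 0.26, "x3": 0.12, "x4": 0.04,
         "x5": 0.04, "x6": 0.03, "x7": 0.02}
code = huffman(probs)
print(code)
print(sum(probs[s] * len(code[s]) for s in probs))   # average length: 2.02 bits/symbol
# Compare with the entropy of this source, about 2.01 bits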
Exercise:
▪ Answer:
a) {0, 10, 11}. Huffman code for {0.5, 0.25, 0.25}
b) {00, 01, 10, 110}. No. There cannot be a single
longest code word in a Huffman code.
c) {01, 10}. No. Huffman code for any binary source
is {0, 1}.