Kraft's and McMillan's Inequalities

For the purpose of compression, the only interesting property of a code besides being uniquely decodable is the lengths of its codewords. Kraft's inequality gives an exact condition for the existence of a prefix code in terms of the codeword lengths.

Theorem 1.11: (Kraft's Inequality) There exists a binary prefix code with codeword lengths ℓ_1, ℓ_2, ..., ℓ_σ if and only if

    ∑_{i=1}^{σ} 2^{-ℓ_i} ≤ 1 .

Proof. Consider a binary search on the real interval [0, 1). In each step, the current interval is split into two halves and one of the halves is chosen as the new interval. We can associate a search of ℓ steps with a binary string of length ℓ: zero corresponds to choosing the left half and one to choosing the right half. For any binary string B, let I(B) be the final interval of the associated search. For example, B = 1011 corresponds to the search sequence [0, 1), [1/2, 2/2), [2/4, 3/4), [5/8, 6/8), [11/16, 12/16), and I(B) = [11/16, 12/16).
Now consider the set {I(w) | w ∈ {0, 1}^ℓ} of all intervals corresponding to binary strings of length ℓ. Clearly, an interval [l, r) belongs to this set if and only if
- the length r - l of the interval is 2^{-ℓ}, and
- both end points l and r are multiples of 2^{-ℓ}.

Such intervals are called dyadic. For any two binary strings w and w′, the intervals I(w) and I(w′) overlap if and only if one of the strings is a prefix of the other.

Now we are ready to prove the theorem, starting with the "only if" part. Let C be a prefix code with the codewords w_1, w_2, ..., w_σ of lengths ℓ_1, ℓ_2, ..., ℓ_σ. Since C is a prefix code, the intervals I(w_i), i ∈ [1..σ], do not overlap with each other, and their total length is at most 1. Thus

    ∑_{i=1}^{σ} |I(w_i)| = ∑_{i=1}^{σ} 2^{-ℓ_i} ≤ 1 .

Finally, we prove the "if" part. Let ℓ_1, ℓ_2, ..., ℓ_σ be arbitrary integers satisfying the inequality. We will construct a prefix code with those codeword lengths. We start by sorting the lengths in nondecreasing order, i.e., we assume that ℓ_1 ≤ ℓ_2 ≤ ... ≤ ℓ_σ. For i ∈ [1..σ], let

    [l_i, r_i) = [ ∑_{j=1}^{i-1} 2^{-ℓ_j} , ∑_{j=1}^{i} 2^{-ℓ_j} ) .

For example, the lengths 1, 2, 3 would result in the intervals [0, 1/2), [1/2, 3/4), [3/4, 7/8). The intervals [l_i, r_i) are dyadic:
- The length of the interval [l_i, r_i) is 2^{-ℓ_i}.
- Both end points l_i and r_i are multiples of 2^{-ℓ_i}. This is because, for all j ≤ i, ℓ_j ≤ ℓ_i and thus 2^{-ℓ_j} is a multiple of 2^{-ℓ_i}.

Thus [l_i, r_i) = I(w_i) for some binary string w_i of length ℓ_i. Since the intervals do not overlap, the binary strings are not prefixes of each other. Thus w_1, w_2, ..., w_σ are valid codewords for a prefix code with the required lengths.
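The "if"-part construction above can be sketched in a few lines of Python: lay the dyadic intervals side by side, shortest lengths first, and read each codeword off the binary expansion of the interval's left end point. The function name is illustrative, not from the text.

```python
from fractions import Fraction

def prefix_code_from_lengths(lengths):
    """Construct binary codewords with the given lengths, as in the proof
    of Kraft's inequality; fails if the lengths violate the inequality."""
    assert sum(Fraction(1, 2**l) for l in lengths) <= 1, \
        "lengths violate Kraft's inequality"
    # Process codewords in order of nondecreasing length.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    codewords = [None] * len(lengths)
    left = Fraction(0)  # left end point l_i of the next interval
    for i in order:
        l = lengths[i]
        # left is a multiple of 2^-l, so left * 2^l is an integer;
        # its l-bit binary representation is the codeword w_i.
        codewords[i] = format(int(left * 2**l), 'b').zfill(l)
        left += Fraction(1, 2**l)
    return codewords
```

For the lengths 1, 2, 3 of the example this yields the codewords 0, 10, 110, corresponding exactly to the intervals [0, 1/2), [1/2, 3/4), [3/4, 7/8).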


Prefix codes are a subclass of uniquely decodable codes. McMillan's inequality generalizes the result to all uniquely decodable codes.

Theorem 1.12: (McMillan's Inequality) There exists a uniquely decodable binary code with codeword lengths ℓ_1, ℓ_2, ..., ℓ_σ if and only if

    ∑_{i=1}^{σ} 2^{-ℓ_i} ≤ 1 .

Proof. Omitted.

The Huffman algorithm and most other methods for constructing codes are
restricted to prefix codes. Together the two inequalities show that this is
not a significant restriction with respect to compression.

Corollary 1.13: For any uniquely decodable binary code, there exists a binary prefix code with the same codeword lengths.

In particular, the Huffman code is optimal among all uniquely decodable codes.

Using Kraft's inequality, we can also characterize redundancy in prefix codes.

Definition 1.14: A prefix code satisfying Kraft's inequality with strict inequality (∑ 2^{-ℓ_i} < 1) is called redundant. A prefix code satisfying Kraft's inequality with equality (∑ 2^{-ℓ_i} = 1) is called complete.

Corollary 1.15: For any redundant prefix code with codeword lengths ℓ_1, ℓ_2, ..., ℓ_σ, there exists a complete prefix code with codeword lengths ℓ′_1, ℓ′_2, ..., ℓ′_σ such that ℓ′_i ≤ ℓ_i for all i ∈ [1..σ].

Proof. Assume ℓ is the longest codeword length in the redundant code. For all i ∈ [1..σ], 2^{-ℓ_i} is a multiple of 2^{-ℓ}. Thus the redundancy gap 1 − ∑ 2^{-ℓ_i} is a multiple of 2^{-ℓ} too. If we reduce the length of one codeword of length ℓ by one, we increase the sum ∑ 2^{-ℓ_i} by 2^{-ℓ} and still satisfy Kraft's inequality. If the resulting code is still redundant, we repeat the procedure until it becomes complete.
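The shortening procedure of this proof can be sketched on the level of codeword lengths alone; a minimal sketch with an illustrative function name:

```python
from fractions import Fraction

def complete_lengths(lengths):
    """Shorten one longest codeword by one bit at a time until the
    Kraft sum reaches 1, i.e., until the code becomes complete."""
    lengths = sorted(lengths)
    while sum(Fraction(1, 2**l) for l in lengths) < 1:
        # The gap 1 - sum is a multiple of 2^-max(lengths), so shortening
        # a longest codeword adds exactly 2^-max(lengths) to the sum
        # without overshooting 1.
        lengths[-1] -= 1
        lengths.sort()
    return lengths
```

For example, the redundant lengths 2, 2, 3 (Kraft sum 5/8) become the complete lengths 1, 2, 2 (Kraft sum 1).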


Integer codes

Next we look at some variable-length codes for integers. We want to have a prefix code when the source alphabet is the set ℕ of natural numbers.

We cannot use the Huffman algorithm or anything similar on an infinite alphabet. Often there is an upper bound on the integers, but even then the alphabet can be too large to compute and store the code efficiently.

Instead, we have a few fixed codes for all natural numbers. With each of
them, the codeword length is a non-decreasing function of the integer value.
The codes differ in how fast the growth is.

The codes described here are for all non-negative integers including zero. If the minimum integer value n_min is larger than zero, we can shift the codes, i.e., use the code for n − n_min to encode n. The case n_min = 1 is common in the literature.

Let us first define the standard binary codes that are used as a building
block in the other codes. This is not a prefix code for natural numbers but
a family of fixed-length codes for finite integer ranges.

Definition 1.16: (Binary code) For any k ≥ 0 and n ∈ [0..2^k − 1],

    binary_k(n) = the k-bit binary representation of n.

Our first proper prefix code is the unary code.

Definition 1.17: (Unary code) For any n ∈ ℕ,

    unary(n) = 1^n 0 .

The length of the unary code for n is n + 1.

Example 1.18: binary_4(6) = 0110 and unary(6) = 1111110.
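Both building blocks are one-liners in Python; a minimal sketch (the function names are mine, not from the text):

```python
def binary_code(k, n):
    """binary_k(n): the k-bit binary representation of n, 0 <= n < 2^k."""
    assert 0 <= n < 2 ** k
    # For k = 0 the only representable value is n = 0, coded as the
    # empty string.
    return format(n, '0{}b'.format(k)) if k > 0 else ''

def unary(n):
    """unary(n) = 1^n 0, a prefix code of length n + 1."""
    return '1' * n + '0'
```

These reproduce Example 1.18: binary_code(4, 6) is '0110' and unary(6) is '1111110'.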

The unary code lengths grow too fast for many purposes. The Elias γ and δ codes achieve logarithmic growth rates.

Definition 1.19: (Elias γ and δ codes) For any n ∈ ℕ,

    γ(n) = unary(k) · binary_k(n − 2^k + 1)
    δ(n) = γ(k) · binary_k(n − 2^k + 1)

where k = ⌊log(n + 1)⌋.

The lengths of the codes are

    |γ(n)| = 2⌊log(n + 1)⌋ + 1
    |δ(n)| = ⌈log(n + 2)⌉ + 2⌊log⌈log(n + 2)⌉⌋ .

Example 1.20: If n = 6, then k = 2 and γ(6) = 110 11 and δ(6) = 101 11.
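The definitions translate directly into code. The sketch below is self-contained (it repeats the unary and binary_k helpers) and computes k = ⌊log₂(n + 1)⌋ via bit_length; the function names are mine:

```python
def binary_code(k, n):
    # binary_k(n): the k-bit binary representation of n
    return format(n, '0{}b'.format(k)) if k > 0 else ''

def unary(n):
    # unary(n) = 1^n 0
    return '1' * n + '0'

def elias_gamma(n):
    """γ(n) = unary(k) binary_k(n - 2^k + 1) with k = floor(log2(n + 1))."""
    k = (n + 1).bit_length() - 1
    return unary(k) + binary_code(k, n - 2 ** k + 1)

def elias_delta(n):
    """δ(n) = γ(k) binary_k(n - 2^k + 1) with k = floor(log2(n + 1))."""
    k = (n + 1).bit_length() - 1
    return elias_gamma(k) + binary_code(k, n - 2 ** k + 1)
```

For n = 6 this gives elias_gamma(6) = '11011' and elias_delta(6) = '10111', matching Example 1.20.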

Golomb–Rice codes are a family of prefix codes GR_0, GR_1, GR_2, ... with a codeword length growth rate between the unary code and the Elias codes.

Definition 1.21: (Golomb–Rice codes) For any k ≥ 0 and any n ∈ ℕ,

    GR_k(n) = unary(q) · binary_k(n − q·2^k)

where q = ⌊n/2^k⌋.

Note that GR_k for a fixed k is itself a prefix code for all of ℕ; in particular, GR_0 is the same as the unary code.

The codeword lengths are

    |GR_k(n)| = ⌊n/2^k⌋ + k + 1 .

Example 1.22: GR_1(6) = 1110 0, GR_2(6) = 10 10, GR_3(6) = 0 110.
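The encoder is again a direct transcription of the definition; a self-contained sketch with repeated helpers and illustrative names:

```python
def binary_code(k, n):
    # binary_k(n): the k-bit binary representation of n
    return format(n, '0{}b'.format(k)) if k > 0 else ''

def unary(n):
    # unary(n) = 1^n 0
    return '1' * n + '0'

def golomb_rice(k, n):
    """GR_k(n) = unary(q) binary_k(n - q 2^k) with q = floor(n / 2^k)."""
    q = n >> k  # q = floor(n / 2^k)
    return unary(q) + binary_code(k, n - (q << k))
```

This reproduces Example 1.22, e.g. golomb_rice(2, 6) = '1010'; with k = 0 the quotient is n itself and the code reduces to unary(n).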

There are many possible variants and extensions of the codes we have seen.

Here are some examples of the codewords.

     n   unary(n)     γ(n)          δ(n)          GR_2(n)          GR_4(n)
     0   0            0             0             000              00000
     1   10           100           1000          001              00001
     2   110          101           1001          010              00010
     3   1110         11000         10100         011              00011
     4   11110        11001         10101         1000             00100
     5   111110       11010         10110         1001             00101
     6   1111110      11011         10111         1010             00110
     7   11111110     1110000       11000000      1011             00111
     8   111111110    1110001       11000001      11000            01000
     9   1111111110   1110010       11000010      11001            01001
    10   111...10     1110011       11000011      11010            01010
    20   111...10     111100101     110010101     11111000         100100
    30   111...10     111101111     110011111     1111111010       101110
    40   111...10     11111001001   1101001001    1111111111000    1101000
    50   111...10     11111010011   1101010011    111...1010       11100010
    60   111...10     11111011101   1101011101    111...1000       11101100

All of these integer codes are complete prefix codes. For example,

    ∑_{n∈ℕ} 2^{-|γ(n)|} = 1 .
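We can check this claim for the γ code numerically: the 2^k integers n with ⌊log(n + 1)⌋ = k all have |γ(n)| = 2k + 1, so the Kraft sum over n ∈ [0..2^m − 2] is exactly 1 − 2^{−m}, which tends to 1. A small sketch (function names are mine):

```python
from fractions import Fraction

def gamma_length(n):
    """|γ(n)| = 2 * floor(log2(n + 1)) + 1."""
    return 2 * ((n + 1).bit_length() - 1) + 1

def gamma_kraft_partial(m):
    """Exact Kraft sum of the γ code over n in [0 .. 2^m - 2]."""
    return sum(Fraction(1, 2 ** gamma_length(n)) for n in range(2 ** m - 1))
```

For instance, gamma_kraft_partial(10) evaluates to 1023/1024 = 1 − 2^{−10}.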

Example 1.23: The intervals corresponding to the codewords of length at most 7 (as in the proof of Kraft's inequality) for the unary, γ, and GR_2 codes.

[Figure omitted: interval diagrams for the unary, γ, and GR_2 codes.]

The unused part at the end of each diagram is the code space available for all the longer codewords.