Huffman Tree

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 9

Huffman Coding-

 Huffman Coding is a famous Greedy Algorithm.


 It is used for the lossless compression of data.
 It uses variable length encoding.
 It assigns variable length code to all the characters.
 The code length of a character depends on how frequently it occurs in the given text.
 The character which occurs most frequently gets the smallest code.
 The character which occurs least frequently gets the largest code.
 It is also known as Huffman Encoding.

Prefix Rule-

 Huffman Coding implements a rule known as a prefix rule.


 This is to prevent the ambiguities while decoding.
 It ensures that the code assigned to any character is not a prefix of the code assigned to
any other character.
PRACTICE PROBLEM BASED ON HUFFMAN CODING-

Problem-

A file contains the following characters with the frequencies as shown. If Huffman Coding is
used for data compression, determine-
1. Huffman Code for each character
2. Average code length
3. Length of Huffman encoded message (in bits)

Characters Frequencies

a 10

e 15
i 12

o 3

u 4

s 13

t 1

Input : Structure/ Information

Char S T U C R E
Freq. 1 2 2 1 2 1
Information
Char I N F O R M A T
Freq. 2 2 1 2 1 1 1 1

Solution-

First let us construct the Huffman Tree.


Huffman Tree is constructed in the following steps-

Step-01:
Step-02:

Step-03:

Step-04:
Step-05:

Step-06:
Step-07:
Now,
 We assign weight to all the edges of the constructed Huffman Tree.
 Let us assign weight ‘0’ to the left edges and weight ‘1’ to the right edges.

Rule
 If you assign weight ‘0’ to the left edges, then assign weight ‘1’ to the right edges.
 If you assign weight ‘1’ to the left edges, then assign weight ‘0’ to the right edges.
 Any of the above two conventions may be followed.
 But follow the same convention at the time of decoding that is adopted at the time of
encoding.

After assigning weight to all the edges, the modified Huffman Tree is-

Now, let us answer each part of the given problem one by one-
1. Huffman Code For Characters-

To write Huffman Code for any character, traverse the Huffman Tree from root node to the
leaf node of that character.
Following this rule, the Huffman Code for each character is-
 a = 111
 e = 10
 i = 00
 o = 11001
 u = 1101
 s = 01
 t = 11000

From here, we can observe-


 Characters occurring less frequently in the text are assigned the larger code.
 Characters occurring more frequently in the text are assigned the smaller code.

2. Average Code Length-

Using formula-01, we have-


Average code length
= ∑ ( frequencyi x code lengthi ) / ∑ ( frequencyi )
= { (10 x 3) + (15 x 2) + (12 x 2) + (3 x 5) + (4 x 4) + (13 x 2) + (1 x 5) } / (10 + 15 + 12 + 3 +
4 + 13 + 1)
= 2.52

3. Length of Huffman Encoded Message-

Using formula-02, we have-


Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
= 58 x 2.52
= 146.16
≅ 147 bits

You might also like