
Introduction to Algorithms

Chapter 16: Greedy Algorithms


Greedy Algorithms
 Similar to dynamic programming, but a simpler approach
 Also used for optimization problems
 Idea: when we have a choice to make, make the one that looks best right now
 Make a locally optimal choice in the hope of getting a globally optimal solution
 Greedy algorithms don’t always yield an optimal solution
Fractional Knapsack Problem
 Knapsack capacity: W
 There are n items: the i-th item has value vi and weight wi
 Goal: find fractions xi, with 0 ≤ xi ≤ 1 for i = 1, 2, ..., n, such that
 Σ xiwi ≤ W and
 Σ xivi is maximized
Fractional Knapsack - Example
 E.g.: knapsack capacity W = 50
 Item 1: weight 10, value $60 ($6/pound)
 Item 2: weight 20, value $100 ($5/pound)
 Item 3: weight 30, value $120 ($4/pound)
 Optimal: take all of item 1, all of item 2, and 20 of the 30 pounds of item 3:
$60 + $100 + $80 = $240

Fractional Knapsack Problem
 Greedy strategy 1:
 Pick the item with the maximum value
 E.g.:
 W=1
 w1 = 100, v1 = 2
 w2 = 1, v2 = 1
 Taking from the item with the maximum value:
total value taken = v1/w1 = 2/100
 Smaller than what the thief can take by choosing the other item:
total value (choose item 2) = v2/w2 = 1
Fractional Knapsack Problem
 Greedy strategy 2:
 Pick the item with the maximum value per pound vi/wi
 If the supply of that item is exhausted and the thief can carry more, take as much as possible from the item with the next greatest value per pound
 It is good to order items by value per pound:
v1/w1 ≥ v2/w2 ≥ ... ≥ vn/wn
Fractional Knapsack Problem
Alg.: Fractional-Knapsack (W, v[n], w[n])
1. while w > 0 and there are items remaining
2.    pick item i with maximum vi/wi
3.    xi ← min(1, w/wi)
4.    remove item i from list
5.    w ← w – xiwi
 w – the amount of space remaining in the knapsack (initially w = W)
 Running time: Θ(n) if items are already ordered by vi/wi; otherwise Θ(n lg n). A Python sketch follows.
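
Below is a minimal Python sketch of the Fractional-Knapsack pseudocode above. The function name and the item representation (a list of (value, weight) pairs) are illustrative choices, not from the original slides; the sort step corresponds to ordering items by value per pound.

```python
def fractional_knapsack(W, items):
    """Greedy fractional knapsack.

    items: list of (value, weight) pairs.
    Returns the total value taken and the fraction x_i taken of each item.
    """
    n = len(items)
    # Order items by value per pound, greatest first: O(n lg n).
    order = sorted(range(n), key=lambda i: items[i][0] / items[i][1], reverse=True)
    x = [0.0] * n
    total = 0.0
    w = W                          # w = space remaining in the knapsack, initially W
    for i in order:
        if w <= 0:                 # knapsack is full
            break
        v_i, w_i = items[i]
        x[i] = min(1.0, w / w_i)   # x_i <- min(1, w / w_i)
        total += x[i] * v_i
        w -= x[i] * w_i            # w <- w - x_i * w_i
    return total, x

# The example above: capacity 50; items of weight 10, 20, 30 and
# value $60, $100, $120. Expected optimum: 60 + 100 + (20/30)*120 = 240.
print(fractional_knapsack(50, [(60, 10), (100, 20), (120, 30)]))
```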
Huffman Code Problem
 Huffman’s algorithm achieves data compression by finding the best variable-length binary encoding scheme for the symbols that occur in the file to be compressed.
Huffman Code Problem
 The more frequently a symbol occurs, the shorter the Huffman codeword representing it should be.
 The Huffman code is a prefix-free code:
 No codeword is a prefix of another codeword.
Overview
 Huffman codes: compressing data (savings of 20% to 90%)
 Huffman’s greedy algorithm uses a table of the frequencies of occurrence of each character to build up an optimal way of representing each character as a binary string
 C: the alphabet of characters
Example
 Assume we are given a data file that contains only 6 symbols, namely a, b, c, d, e, f, with the following frequency table:
 Find a variable-length prefix-free encoding scheme that compresses this data file as much as possible.
Huffman Code Problem
 The left tree represents a fixed-length encoding scheme
 The right tree represents a Huffman encoding scheme
Constructing A Huffman Code
Alg.: HUFFMAN(C)                      // C is a set of n characters
1. n ← |C|
2. Q ← C                              // Q is implemented as a binary min-heap: O(n)
3. for i ← 1 to n – 1
4.    allocate a new node z
5.    left[z] ← x ← EXTRACT-MIN(Q)    // O(lg n)
6.    right[z] ← y ← EXTRACT-MIN(Q)   // O(lg n)
7.    f[z] ← f[x] + f[y]
8.    INSERT(Q, z)                    // O(lg n)
9. return EXTRACT-MIN(Q)

Total computation time = O(n lg n)
Cost of a Tree T
 For each character c in the alphabet C:
 let f(c) be the frequency of c in the file
 let dT(c) be the depth of c in the tree
 It is also the length of the codeword. Why?
 Let B(T) be the number of bits required to encode the file (called the cost of T):

B(T) = Σc∈C f(c)·dT(c)
Huffman Code Problem
In the HUFFMAN pseudocode above:
 we assume that C is a set of n characters and that each character c ∈ C is an object with a defined frequency f[c].
 The algorithm builds the tree T corresponding to the optimal code.
 A min-priority queue Q is used to identify the two least-frequent objects to merge together.
 The result of the merger of two objects is a new object whose frequency is the sum of the frequencies of the two objects that were merged.
Running time of Huffman's algorithm
 The analysis of Huffman's algorithm assumes that Q is implemented as a binary min-heap.
 For a set C of n characters, the initialization of Q in line 2 can be performed in O(n) time using the BUILD-MIN-HEAP procedure.
 The for loop in lines 3-8 is executed exactly n - 1 times, and since each heap operation requires time O(lg n), the loop contributes O(n lg n) to the running time. Thus, the total running time of HUFFMAN on a set of n characters is O(n lg n).
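
The following runnable Python sketch mirrors the HUFFMAN pseudocode, using the standard-library heapq module as the binary min-heap Q. The nested-tuple tree representation and the code-extraction walk at the end are illustrative additions, not part of the original pseudocode.

```python
import heapq
import itertools

def huffman(freqs):
    """Build a Huffman code from a {symbol: frequency} table.

    heapq gives O(n) heapify and O(lg n) extract/insert, so the
    n - 1 merges cost O(n lg n) total, as discussed above.
    """
    tie = itertools.count()  # tie-breaker so equal frequencies never compare trees
    heap = [(f, next(tie), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)                       # initialization of Q (line 2): O(n)
    for _ in range(len(freqs) - 1):           # the for loop (lines 3-8): n - 1 times
        fx, _, x = heapq.heappop(heap)        # x = EXTRACT-MIN(Q)
        fy, _, y = heapq.heappop(heap)        # y = EXTRACT-MIN(Q)
        heapq.heappush(heap, (fx + fy, next(tie), (x, y)))  # f[z] = f[x] + f[y]
    root = heap[0][2]                         # the remaining node is the tree T

    # Walk the tree to read off codewords: left edge = '0', right edge = '1'.
    codes = {}
    def assign(node, prefix):
        if isinstance(node, tuple):           # internal node: recurse on children
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"       # leaf: record the codeword
    assign(root, "")
    return codes

# Illustrative frequencies (assumed, not necessarily the slide's table):
print(huffman({'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}))
```

More frequent symbols end up nearer the root and thus receive shorter codewords, exactly as the problem statement requires.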
Prefix Code
 Prefix(-free) code: no codeword is also a prefix of some other codeword (unambiguous)
 An optimal data compression achievable by a character code can always be achieved with a prefix code
 Simplifies the encoding (compression) and decoding
 Encoding: abc → 0 · 101 · 100 = 0101100
 Decoding: 001011101 = 0 · 0 · 101 · 1101 → aabe (a decoding sketch follows this slide)
 Use a binary tree to represent prefix codes for easy decoding
 An optimal code is always represented by a full binary tree, in which every non-leaf node has two children
 |C| leaves and |C|-1 internal nodes
 Cost: B(T) = Σc∈C f(c)·dT(c), where f(c) is the frequency of c and dT(c) is the depth of c (the length of its codeword)
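
Because the code is prefix-free, decoding needs only a single left-to-right scan: the moment the buffered bits match a codeword, that match is unambiguous. Here is a small sketch using the codewords from the example above (a = 0, b = 101, c = 100, e = 1101); the function name is illustrative, and scanning a dictionary like this is equivalent to walking the binary tree from the root.

```python
def decode(bits, codes):
    """Decode a bit string using a prefix-free code {symbol: codeword}."""
    inverse = {cw: sym for sym, cw in codes.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:        # prefix-free: the first full match is the symbol
            out.append(inverse[buf])
            buf = ""
    assert buf == "", "input ended in the middle of a codeword"
    return "".join(out)

# Codewords from the example above.
codes = {'a': '0', 'b': '101', 'c': '100', 'e': '1101'}
print(decode("001011101", codes))   # -> 'aabe'
```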
Huffman Code
 Reduces the size of data by 20%-90% in general
 If no characters occur more frequently than others, then there is no advantage over ASCII
 Encoding:
 Given the characters and their frequencies, run the algorithm to generate a code; write the characters using the code
 Decoding:
 Given the Huffman tree, figure out what each character is (possible because of the prefix property)
Application of Huffman code
 Both the .mp3 and .jpg file formats use Huffman coding at one stage of the compression process
Dynamic Programming vs. Greedy Algorithms
 Dynamic programming
 We make a choice at each step
 The choice depends on solutions to subproblems
 Bottom-up solution, from smaller to larger subproblems
 Greedy algorithm
 Make the greedy choice and THEN solve the subproblem arising after the choice is made
 The choice we make may depend on previous choices, but not on solutions to subproblems
 Top-down solution, problems decrease in size
