Vivekananda College of Engineering & Technology
Nehru Nagar Post, Puttur, D.K. 574203
Lecture Notes on
15CS43
Design and Analysis of
Algorithms
(CBCS Scheme)
Prepared by
Mr. Harivinod N
Dept. of Computer Science and Engineering,
VCET Puttur
Jan 2017
Contents
Course website: www.techjourney.in
Module-1: Introduction
1. Introduction
1.1. What is an Algorithm?
An algorithm is a finite set of instructions to solve a particular problem. In addition, all algorithms must satisfy the following criteria:
a. Input. Zero or more quantities are externally supplied.
b. Output. At least one quantity is produced.
c. Definiteness. Each instruction is clear and unambiguous. It must be perfectly clear what should be done.
d. Finiteness. If we trace out the instructions of an algorithm, then for all cases, the algorithm terminates after a finite number of steps.
e. Effectiveness. Every instruction must be very basic so that it can be carried out, in principle, by a person using only pencil and paper. It is not enough that each operation be definite as in criterion c; it also must be feasible.
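As a quick illustration of these criteria, here is a minimal Python sketch (not part of the original notes; the function name is illustrative) of Euclid's algorithm for gcd(m, n). It takes two inputs, produces one output, and every step is definite, effective, and terminating.

def gcd(m, n):
    """Euclid's algorithm: gcd of nonnegative integers m and n, not both zero."""
    while n != 0:          # finiteness: the second argument strictly decreases
        m, n = n, m % n    # definiteness: one clear, basic operation per step
    return m               # output: at least one quantity is produced

print(gcd(60, 24))  # 12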
Algorithm design and analysis process - We now briefly discuss a sequence of steps one typically goes through in designing and analyzing an algorithm.
• Understanding the Problem - From a practical perspective, the first thing you need to do before designing an algorithm is to understand completely the problem given. An input to an algorithm specifies an instance of the problem the algorithm solves. It is very important to specify exactly the set of instances the algorithm needs to handle.
• Ascertaining the Capabilities of the Computational Device - Once you completely understand a problem, you need to ascertain the capabilities of the computational device the algorithm is intended for. Select an appropriate model from the sequential or parallel programming models.
• Choosing between Exact and Approximate Problem Solving - The next principal decision is to choose between solving the problem exactly and solving it approximately. This is because there are important problems that simply cannot be solved exactly for most of their instances, and some of the available algorithms for solving a problem exactly can be unacceptably slow because of the problem's intrinsic complexity.
• Algorithm Design Techniques - An algorithm design technique (or "strategy" or "paradigm") is a general approach to solving problems algorithmically that is applicable to a variety of problems from different areas of computing. They provide guidance for designing algorithms for new problems, i.e., problems for which there is no known satisfactory algorithm.
• Designing an Algorithm and Data Structures - One should pay close attention to choosing data structures appropriate for the operations performed by the algorithm. For example, the sieve of Eratosthenes would run longer if we used a linked list instead of an array in its implementation. Algorithms + Data Structures = Programs.
• Methods of Specifying an Algorithm - Once you have designed an algorithm, you need to specify it in some fashion. These are the two options most widely used nowadays for specifying algorithms. Using a natural language has an obvious appeal; however, the inherent ambiguity of any natural language makes a concise and clear description of algorithms surprisingly difficult. Pseudocode is a mixture of a natural language and programming-language-like constructs. Pseudocode is usually more precise than natural language, and its usage often yields more succinct algorithm descriptions.
• Proving an Algorithm's Correctness - Once an algorithm has been specified, you have to prove its correctness. That is, you have to prove that the algorithm yields a required result for every legitimate input in a finite amount of time. For some algorithms, a proof of correctness is quite easy; for others, it can be quite complex. A common technique for proving correctness is to use mathematical induction, because an algorithm's iterations provide a natural sequence of steps needed for such proofs.
• Analyzing an Algorithm - After correctness, by far the most important quality is efficiency. In fact, there are two kinds of algorithm efficiency: time efficiency, indicating how fast the algorithm runs, and space efficiency, indicating how much extra memory it uses. Another desirable characteristic of an algorithm is simplicity. Unlike efficiency, which can be precisely defined and investigated with mathematical rigor, simplicity, like beauty, is to a considerable degree in the eye of the beholder.
• Coding an Algorithm - Most algorithms are destined to be ultimately implemented as computer programs. Implementing an algorithm correctly is necessary but not sufficient: you would not like to diminish your algorithm's power by an inefficient implementation. Modern compilers do provide a certain safety net in this regard, especially when they are used in their code optimization mode.
Recursive algorithms
An algorithm is said to be direct recursive if the same algorithm is invoked in its body. Algorithm A is said to be indirect recursive if it calls another algorithm which in turn calls A.
Example 1: Factorial computation n! = n * (n−1)!
Example 2: Binomial coefficient computation
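A minimal Python sketch of direct recursion for both examples (the function names are illustrative, not from the notes; the binomial recurrence C(n, k) = C(n−1, k−1) + C(n−1, k) is the standard one):

def factorial(n):
    """Direct recursion: the algorithm invokes itself. n! = n * (n-1)!, 0! = 1."""
    return 1 if n == 0 else n * factorial(n - 1)

def binomial(n, k):
    """C(n, k) = C(n-1, k-1) + C(n-1, k), with C(n, 0) = C(n, n) = 1."""
    if k == 0 or k == n:
        return 1
    return binomial(n - 1, k - 1) + binomial(n - 1, k)

print(factorial(5), binomial(5, 2))  # 120 10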
Unless n is extremely large or very small, the formula can give a reasonable estimate of the algorithm's running time. It is for these reasons that the efficiency analysis framework ignores multiplicative constants and concentrates on the count's order of growth to within a constant multiple for large-size inputs.
Orders of Growth
Why this emphasis on the count's order of growth for large input sizes? Because for large values of n, it is the function's order of growth that counts: just look at the table below, which contains values of a few functions particularly important for analysis of algorithms.
Table: Values of several functions important for analysis of algorithms
2. Performance Analysis
The above method is both excessively difficult and, usually, unnecessary. The thing to do is to identify the most important operation of the algorithm, called the basic operation, the operation contributing the most to the total running time, and compute the number of times the basic operation is executed.
Trade-off
There is often a time-space tradeoff involved in a problem; that is, it cannot be solved with both low computing time and low memory consumption. One has to make a compromise and exchange computing time for memory consumption or vice versa, depending on which algorithm one chooses and how one parameterizes it.
3. Asymptotic Notations
The efficiency analysis framework concentrates on the order of growth of an algorithm's basic operation count as the principal indicator of the algorithm's efficiency. To compare and rank such orders of growth, computer scientists use the notations O (big oh), Ω (big omega), Θ (big theta) and o (little oh).
Example: n² + 5n + 7 = Θ(n²), since lim n→∞ (n² + 5n + 7)/n² = 1, a finite positive constant. (In general, if lim n→∞ t(n)/g(n) = 0, then t(n) ∈ o(g(n)).)
Example:
Algorithm
Here the basic operation is comparison. The maximum number of comparisons happens in the worst case (i.e., all the elements in the array are distinct and the algorithm returns true). The total number of basic operations (comparisons) in the worst case is
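The algorithm listing itself is not reproduced above; it appears to be the element-uniqueness check. A hedged Python sketch that makes the basic operation explicit:

def unique_elements(a):
    """Return True if all elements of a are distinct, plus the comparison count."""
    comparisons = 0
    n = len(a)
    for i in range(n - 1):
        for j in range(i + 1, n):
            comparisons += 1       # the basic operation: one key comparison
            if a[i] == a[j]:
                return False, comparisons
    return True, comparisons       # worst case: n(n-1)/2 comparisons

print(unique_elements([3, 1, 4, 2, 5]))  # (True, 10) for n = 5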
Example-1
Algorithm
….
Example-2: Tower of Hanoi puzzle. In this puzzle, there are n disks of different sizes that can slide onto any of three pegs. Initially, all the disks are on the first peg in order of size, the largest on the bottom and the smallest on top. The goal is to move all the disks to the third peg, using the second one as an auxiliary, if necessary. We can move only one disk at a time, and it is forbidden to place a larger disk on top of a smaller one.
The problem has an elegant recursive solution, which is illustrated in the figure.
• To move n > 1 disks from peg 1 to peg 3 (with peg 2 as auxiliary),
o we first move recursively n − 1 disks from peg 1 to peg 2 (with peg 3 as auxiliary),
o then move the largest disk directly from peg 1 to peg 3, and,
o finally, move recursively n − 1 disks from peg 2 to peg 3 (using peg 1 as auxiliary).
• If n = 1, we move the single disk directly from the source peg to the destination peg.
Figure: Recursive solution to the Tower of Hanoi puzzle
The number of moves M(n) depends only on n. The recurrence equation is
M(n) = 2M(n − 1) + 1 for n > 1, M(1) = 1,
whose solution is M(n) = 2ⁿ − 1.
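A direct Python transcription of this recursive plan (a minimal sketch; the peg labels are arbitrary):

def hanoi(n, source, target, auxiliary, moves):
    """Move n disks from source to target using auxiliary; record every move."""
    if n == 1:
        moves.append((source, target))              # single disk moves directly
        return
    hanoi(n - 1, source, auxiliary, target, moves)  # clear n-1 disks out of the way
    moves.append((source, target))                  # move the largest disk
    hanoi(n - 1, auxiliary, target, source, moves)  # bring n-1 disks onto it

moves = []
hanoi(3, 1, 3, 2, moves)
print(len(moves), moves)  # 7 moves, matching M(3) = 2^3 - 1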
4. Important Problem Types
In this section, we are going to introduce the most important problem types: sorting, searching, string processing, graph problems, and combinatorial problems.
4.1. Sorting
The sorting problem is to rear
earrange the items of a given list in non-decre
creasing order. As a
practical matter, we usuallyy need
n to sort lists of numbers, characters from
fro an alphabet or
character strings.
Although some algorithms are indeed better than others, there is no algorithm that would be the best solution in all situations. Some of the algorithms are simple but relatively slow, while others are faster but more complex; some work better on randomly ordered inputs, while others do better on almost-sorted lists; some are suitable only for lists residing in fast memory, while others can be adapted for sorting large files stored on a disk; and so on.
Two properties of sorting algorithms deserve special mention. A sorting algorithm is called stable if it preserves the relative order of any two equal elements in its input. The second notable feature of a sorting algorithm is the amount of extra memory the algorithm requires. An algorithm is said to be in-place if it does not require extra memory, except, possibly, for a few memory units.
4.2. Searching
The searching problem deals with finding a given value, called a search key, in a given set (or a multiset, which permits several elements to have the same value). There are plenty of searching algorithms to choose from. They range from the straightforward sequential search to the spectacularly efficient but limited binary search, and algorithms based on representing the underlying set in a different form more conducive to searching. The latter algorithms are of particular importance for real-world applications because they are indispensable for storing and retrieving information from large databases.
4.3. String Processing
In recent decades, the rapid proliferation of applications dealing with non-numerical data has intensified the interest of researchers and computing practitioners in string-handling algorithms. A string is a sequence of characters from an alphabet. String-processing algorithms have been important for computer science in conjunction with computer languages and compiling issues.
4.4. Graph Problems
One of the oldest and most interesting areas in algorithmics is graph algorithms. Informally, a graph can be thought of as a collection of points called vertices, some of which are connected by line segments called edges. Graphs can be used for modeling a wide variety of applications, including transportation, communication, social and economic networks, project scheduling, and games. Studying different technical and social aspects of the Internet in particular is one of the active areas of current research.
5. Fundamental Data Structures
Since the vast majority of algorithms of interest operate on data, particular ways of organizing data play a critical role in the design and analysis of algorithms. A data structure can be defined as a particular scheme of organizing related data items.
5.1. Linear Data Structures
The two most important elementary data structures are the array and the linked list.
A (one-dimensional) array is a sequence of n items of the same data type that are stored contiguously in computer memory and made accessible by specifying a value of the array's index.
A linked list is a sequence of zero or more elements called nodes, each containing two kinds of information: some data and one or more links called pointers to other nodes of the linked list. In a singly linked list, each node except the last one contains a single pointer to the next element. Another extension is the structure called the doubly linked list, in which every node, except the first and the last, contains pointers to both its successor and its predecessor.
Weighted Graphs: A weighted graph (or weighted digraph) is a graph (or digraph) with numbers assigned to its edges. These numbers are called weights or costs.
Among the many properties of graphs, two are important for a great number of applications: connectivity and acyclicity. Both are based on the notion of a path. A path from vertex u to vertex v of a graph G can be defined as a sequence of adjacent (connected by an edge) vertices that starts with u and ends with v.
5.3. Trees
A tree (more accurately, a free tree) is a connected acyclic graph. A graph that has no cycles but is not necessarily connected is called a forest: each of its connected components is a tree. Trees have several important properties other graphs do not have. In particular, the number of edges in a tree is always one less than the number of its vertices: |E| = |V| − 1.
The depth of a vertex v is the length of the simple path from the root to v. The height of a tree is the length of the longest simple path from the root to a leaf.
Ordered Trees - An ordered tree is a rooted tree in which all the children of each vertex are ordered. It is convenient to assume that in a tree's diagram, all the children are ordered left to right. A binary tree can be defined as an ordered tree in which every vertex has no more than two children and each child is designated as either a left child or a right child of its parent; a binary tree may also be empty.
If a number assigned to each parental vertex is larger than all the numbers in its left subtree and smaller than all the numbers in its right subtree, such a tree is called a binary search tree. Binary trees and binary search trees have a wide variety of applications in computer science.
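A minimal Python sketch of a binary search tree with insert and search, illustrating the ordering property just described (the names and structure are illustrative, not from the notes):

class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Insert key so that left subtree < vertex < right subtree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def search(root, key):
    """Follow the ordering property; takes O(height) comparisons."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root is not None

root = None
for k in [6, 3, 9, 1, 5]:
    root = insert(root, k)
print(search(root, 5), search(root, 7))  # True False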
*****
Lecture Notes
on
15CS43
Design and Analysis of
Algorithms
(CBCS Scheme)
Prepared by
Mr. Harivinod N
Assistant Professor,
Dept. of Computer Science and Engineering,
VCET Puttur
Feb 2017
Module-2: Divide and Conquer
Contents
1. General method
2. Recurrence equation
3. Algorithm: Binary search
4. Algorithm: Finding the maximum and minimum
5. Algorithm: Merge sort
6. Algorithm: Quick sort
7. Algorithm: Strassen's matrix multiplication
8. Advantages and Disadvantages
9. Decrease and Conquer Approach
10. Algorithm: Topological Sort
Course website: www.techjourney.in
1. General method:
Divide and Conquer is one of the best-known general algorithm design techniques. It works according to the following general plan:
• Given a function to compute on 'n' inputs, the divide-and-conquer strategy suggests splitting the inputs into 'k' distinct subsets, 1 < k ≤ n, yielding 'k' sub problems.
• These sub problems must be solved, and then a method must be found to combine sub solutions into a solution of the whole.
• If the sub problems are still relatively large, then the divide-and-conquer strategy can possibly be reapplied.
• Often the sub problems resulting from a divide-and-conquer design are of the same type as the original problem. For those cases the reapplication of the divide-and-conquer principle is naturally expressed by a recursive algorithm.
Figure: A problem of size n is divided into two sub problems of size n/2; the solutions to the two sub problems are combined to obtain the solution to the original problem.
2. Recurrence equation for Divide and Conquer:
If the size of problem 'p' is n and the sizes of the 'k' sub problems are n1, n2, ..., nk, respectively, then the computing time of divide and conquer is described by the recurrence relation
T(n) = g(n) when n is small enough to be solved directly,
T(n) = T(n1) + T(n2) + ... + T(nk) + f(n) otherwise.
Where,
• T(n) is the time for the divide and conquer method on any input of size n,
• g(n) is the time to compute the answer directly for small inputs, and
• the function f(n) is the time for dividing the problem 'p' and combining the solutions to sub problems.
For divide and conquer based algorithms that produce sub problems of the same type as the original problem, it is very natural to first describe them by using recursion.
More generally, an instance of size n can be divided into b instances of size n/b, with a of them needing to be solved. (Here, a and b are constants; a ≥ 1 and b > 1.) Assuming that size n is a power of b (i.e., n = b^k), to simplify our analysis, we get the following recurrence for the running time T(n):
T(n) = aT(n/b) + f(n)   ..... (1)
Problems on Substitution method & Master theorem to solve the recurrence relation
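As a hedged reference (not reproduced from the notes), the master theorem in the form commonly paired with recurrence (1), together with one worked case, assuming f(n) ∈ Θ(n^d) with d ≥ 0:

\[
T(n) = aT(n/b) + f(n), \qquad f(n) \in \Theta(n^d),\ d \ge 0
\]
\[
T(n) \in
\begin{cases}
\Theta(n^d) & \text{if } a < b^d, \\
\Theta(n^d \log n) & \text{if } a = b^d, \\
\Theta(n^{\log_b a}) & \text{if } a > b^d.
\end{cases}
\]
% Worked example: T(n) = 2T(n/2) + n has a = 2, b = 2, d = 1,
% so a = b^d and therefore T(n) \in \Theta(n \log n).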
3. Binary Search
Problem definition: Let ai, 1 ≤ i ≤ n, be a list of elements that are sorted in non-decreasing order. The problem is to find whether a given element x is present in the list or not. If x is present we have to determine a value j (the element's position) such that aj = x. If x is not in the list, then j is set to zero.
Solution: Let P = (n, ai...al, x) denote an arbitrary instance of the search problem, where n is the number of elements in the list, ai...al is the list of elements, and x is the key element to be searched for in the given list. Binary search on the list is done as follows:
Step 1: Pick an index q in the middle of the range [i, l], i.e., q = ⌊(i + l)/2⌋, and compare x with aq.
Step 2: if x = aq, i.e., the key element is equal to the mid element, the problem is immediately solved.
Step 3: if x < aq, then x has to be searched for only in the sub-list ai, ai+1, ..., aq−1. Therefore the problem reduces to (q−i, ai...aq−1, x).
Step 4: if x > aq, then x has to be searched for only in the sub-list aq+1, ..., al. Therefore the problem reduces to (l−q, aq+1...al, x).
For the above solution procedure, the algorithm can be implemented as a recursive or non-recursive algorithm.
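A recursive Python sketch of the procedure above (1-based positions, returning 0 when x is absent, matching the convention of setting j to zero):

def binary_search(a, x, i, l):
    """Search the sorted sub-list at 1-based positions i..l for x."""
    if i > l:
        return 0                    # x is not in the list: j = 0
    q = (i + l) // 2                # middle index, q = floor((i + l) / 2)
    if x == a[q - 1]:
        return q                    # key equals the mid element
    if x < a[q - 1]:
        return binary_search(a, x, i, q - 1)   # search the left sub-list
    return binary_search(a, x, q + 1, l)       # search the right sub-list

a = [10, 20, 30, 40, 50]
print(binary_search(a, 40, 1, len(a)))  # 4
print(binary_search(a, 35, 1, len(a)))  # 0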
Analysis
In binary search the basic operation is key comparison. Binary search can be analyzed with the best, worst, and average case number of comparisons. The numbers of comparisons for the recursive and iterative versions of binary search are the same, if comparison counting is relaxed slightly. For recursive binary search, count each pass through the if-then-else block as one comparison. For iterative binary search, count each pass through the while block as one comparison. Let us find out how many such key comparisons the algorithm makes on an array of n elements.
Best case – Θ(1). In the best case, the key is the middle element of the array. A constant number of comparisons (actually just 1) is required.
Worst case - Θ(log₂ n). In the worst case, the key does not exist in the array at all. Through each recursion or iteration of binary search, the size of the admissible range is halved. This halving can be done ⌈log₂ n⌉ times; thus, ⌈log₂ n⌉ comparisons are required. Sometimes, in the case of a successful search, it may also take this maximum number of comparisons, ⌈log₂ n⌉. So the worst case complexity of successful binary search is Θ(log₂ n).
Average case - Θ(log₂ n). To find the average case, take the sum of the products of the number of comparisons required to find each element and the probability of searching for that element. To simplify the analysis, assume that no item which is not in the array will be searched for, and that the probabilities of searching for each element are uniform.
Explanation:
• StraightMaxMin requires 2(n−1) comparisons in the best, average and worst cases.
• By realizing that the comparison a[i] > max is false half the time, an improvement in the algorithm can be made. Hence we can replace the contents of the for loop by:
if (a[i] > max) then max := a[i]; else if (a[i] < min) then min := a[i];
• On average, a[i] is greater than max half the time. So the average number of comparisons is 3n/2 − 1.
Example:
Space Complexity
Compared to the straightforward method, the MaxMin method requires extra stack space for i, j, max, min, max1 and min1. Given n elements, there will be ⌈log₂ n⌉ + 1 levels of recursion, and we need to save seven values for each recursive call (6 + 1 for the return address).
5. Merge Sort
Merge sort is a perfect example of a successful application of the divide-and-conquer technique. It sorts a given array A[0 .. n−1] by dividing it into two halves A[0 .. ⌊n/2⌋−1] and A[⌊n/2⌋ .. n−1], sorting each of them recursively, and then merging the two smaller sorted arrays into a single sorted one.
Example: The operation of the algorithm on the list 8, 3, 2, 9, 7, 1, 5, 4 is illustrated in the figure.
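A minimal Python sketch of this plan (not the notes' pseudocode):

def merge_sort(a):
    """Sort list a by splitting into halves, sorting each, and merging."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # merge the two sorted halves
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:       # <= keeps the sort stable
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:]); merged.extend(right[j:])
    return merged

print(merge_sort([8, 3, 2, 9, 7, 1, 5, 4]))  # [1, 2, 3, 4, 5, 7, 8, 9]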
Analysis
Worst case: During key comparisons, neither of the two arrays becomes empty before the other one contains just one element; this leads to the worst case of merge sort. Assuming for simplicity that n is a power of 2, the worst-case number of key comparisons satisfies the recurrence C(n) = 2C(n/2) + n − 1 for n > 1, C(1) = 0, whose solution is C(n) = n log₂ n − n + 1 ∈ Θ(n log n).
Advantages:
• The number of comparisons performed is nearly optimal.
• For large n, the number of comparisons made by this algorithm in the average case turns out to be about 0.25n less and hence is also in Θ(n log n).
• Merge sort will never degrade to O(n²).
• Another advantage of merge sort over quicksort and heapsort is its stability. (A sorting algorithm is said to be stable if two objects with equal keys appear in the same order in the sorted output as they appear in the input array to be sorted.)
Limitations:
• The principal shortcoming of merge sort is the linear amount [O(n)] of extra storage the algorithm requires. Though merging can be done in place, the resulting algorithm is quite complicated and of theoretical interest only.
6. Quick sort
Quicksort is the other important sorting algorithm that is based on the divide-and-conquer approach. Unlike mergesort, which divides its input elements according to their position in the array, quicksort divides (or partitions) them according to their value.
A partition is an arrangement of the array's elements so that all the elements to the left of some element A[s] are less than or equal to A[s], and all the elements to the right of A[s] are greater than or equal to it.
Obviously, after a partition is achieved, A[s] will be in its final position in the sorted array, and we can continue sorting the two subarrays to the left and to the right of A[s] independently (e.g., by the same method).
In quicksort, the entire work happens in the division stage, with no work required to combine the solutions to the subproblems.
Partitioning
We start by selecting a pivot, an element with respect to whose value we are going to divide the subarray. There are several different strategies for selecting a pivot. We use the sophisticated method suggested by C.A.R. Hoare, the prominent British computer scientist who invented quicksort.
Select the subarray's first element: p = A[l]. Now scan the subarray from both ends, comparing the subarray's elements to the pivot.
• The left-to-right scan, denoted below by index pointer i, starts with the second element. Since we want elements smaller than the pivot to be in the left part of the subarray, this scan skips over elements that are smaller than the pivot and stops upon encountering the first element greater than or equal to the pivot.
• The right-to-left scan, denoted below by index pointer j, starts with the last element of the subarray. Since we want elements larger than the pivot to be in the right part of the subarray, this scan skips over elements that are larger than the pivot and stops upon encountering the first element smaller than or equal to the pivot.
Note that index i can go out of the subarray’s bounds in this pseudocode.
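A Python sketch of Hoare-style partitioning and quicksort as described; unlike the pseudocode referred to above, the scans below include explicit bounds checks so the indices stay inside the subarray:

def hoare_partition(a, l, r):
    """Partition a[l..r] around pivot p = a[l]; return the split position s."""
    p = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i < r and a[i] < p:     # left-to-right scan stops at element >= pivot
            i += 1
        j -= 1
        while a[j] > p:               # right-to-left scan stops at element <= pivot
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]       # exchange out-of-place elements
    a[l], a[j] = a[j], a[l]           # put the pivot in its final position s = j
    return j

def quicksort(a, l, r):
    if l < r:
        s = hoare_partition(a, l, r)
        quicksort(a, l, s - 1)        # sort the left subarray independently
        quicksort(a, s + 1, r)        # sort the right subarray independently

arr = [5, 3, 1, 9, 8, 2, 4, 7]
quicksort(arr, 0, len(arr) - 1)
print(arr)  # [1, 2, 3, 4, 5, 7, 8, 9]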
Analysis
Best Case - Here the basic operation is key comparison. The number of key comparisons made before a partition is achieved is n + 1 if the scanning indices cross over and n if they coincide. If all the splits happen in the middle of the corresponding subarrays, we will have the best case. The number of key comparisons in the best case satisfies the recurrence
Cbest(n) = 2Cbest(n/2) + n for n > 1, Cbest(1) = 0.
Average Case - Let Cavg(n) be the average number of key comparisons made by quicksort on a randomly ordered array of size n. A partition can happen in any position s (0 ≤ s ≤ n−1) after n + 1 comparisons are made to achieve the partition. After the partition, the left and right subarrays will have s and n − 1 − s elements, respectively. Assuming that the partition split can happen in each position s with the same probability 1/n, we get the following recurrence relation:
Cavg(n) = (1/n) Σ (s = 0 to n−1) [(n + 1) + Cavg(s) + Cavg(n − 1 − s)], Cavg(0) = 0, Cavg(1) = 0.
Its solution turns out to be Cavg(n) ≈ 2n ln n ≈ 1.39 n log₂ n.
Limitations
• It is not stable.
• It requires a stack to store parameters of subarrays that are yet to be sorted.
• Its performance on randomly ordered arrays is known to be sensitive not only to implementation details of the algorithm but also to both computer architecture and data type.
The number of multiplications satisfies the recurrence M(n) = 7M(n/2) for n > 1, M(1) = 1, where n is a power of 2. This implies M(n) = 7^(log₂ n) = n^(log₂ 7) ∈ Θ(n^2.807), which is smaller than the n³ required by the brute-force algorithm.
Disadvantages
• One of the most common issues with this sort of algorithm is the fact that the recursion is slow, which in some cases outweighs any advantages of the divide and conquer process.
• Another concern is the fact that sometimes it can become more complicated than a basic iterative approach, especially in cases with a large n. In other words, if someone wanted to add a large amount of numbers together, just creating a simple loop to add them together would turn out to be a much simpler approach than dividing the numbers up into two groups, adding these groups recursively, and then adding the sums of the two groups together.
• Another downfall is that sometimes once the problem is broken down into sub problems, the same sub problem can occur many times and is solved again. In cases like these, it can often be easier to identify and save the solution to the repeated sub problem, a technique commonly referred to as memoization.
Figure: Decrease-and-conquer - a problem of size n is reduced to a sub problem of size n−1 (decrease by a constant) or to a sub problem of size n/2 (decrease by a constant factor), whose solution is extended to a solution of the original problem.
Lecture Notes on
15CS43
Design and Analysis of
Algorithms
(CBCS Scheme)
Prepared by
Mr. Harivinod N
Dept. of Computer Science and Engineering,
VCET Puttur
Mar 2017
Module-3: Greedy Method
Contents
1. Introduction to Greedy method
1.1 General method
The greedy method is a straightforward design technique applicable to a variety of applications.
The greedy approach suggests constructing a solution through a sequence of steps, each expanding a partially constructed solution obtained so far, until a complete solution to the problem is reached. On each step the choice made must be:
• feasible, i.e., it has to satisfy the problem's constraints
• locally optimal, i.e., it has to be the best local choice among all feasible choices available on that step
• irrevocable, i.e., once made, it cannot be changed on subsequent steps of the algorithm
As a rule, greedy algorithms are both intuitively appealing and simple. Given an optimization problem, it is usually easy to figure out how to proceed in a greedy manner, possibly after considering a few small instances of the problem. What is usually more difficult is to prove that a greedy algorithm yields an optimal solution (when it does).
The solution to the coin change problem using a greedy algorithm is very intuitive and is called the cashier's algorithm. The basic principle is: at every iteration, in the search for a coin, take the largest coin which can fit into the amount remaining to be changed at that particular time. At the end you will have the optimal solution.
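A Python sketch of the cashier's algorithm (assuming a canonical coin system, for which the greedy choice is in fact optimal; the function name is illustrative):

def cashiers_algorithm(amount, denominations):
    """Greedy change-making: repeatedly take the largest coin that fits."""
    change = []
    for coin in sorted(denominations, reverse=True):  # largest coins first
        while amount >= coin:        # keep taking this coin while it fits
            amount -= coin
            change.append(coin)
    return change if amount == 0 else None  # None if exact change is impossible

print(cashiers_algorithm(63, [1, 2, 5, 10, 20, 50]))  # [50, 10, 2, 1]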
Analysis:
Disregarding the time to initially sort the objects, each of the above strategies uses O(n) time.
Note: The greedy approach to solve this problem does not necessarily yield an optimal solution.
High level description of the job sequencing algorithm
Analysis:
Correctness
Prim's algorithm always yields a minimum spanning tree.
Analysis of Efficiency
The efficiency of Prim's algorithm depends on the data structures chosen for the graph itself and for the priority queue of the set V − VT whose vertex priorities are the distances to the nearest tree vertices.
1. If a graph is represented by its weight matrix and the priority queue is implemented as an unordered array, the algorithm's running time will be in Θ(|V|²). Indeed, on each of the |V| − 1 iterations, the array implementing the priority queue is traversed to find and delete the minimum and then to update, if necessary, the priorities of the remaining vertices.
We can implement the priority queue as a min-heap. (A min-heap is a complete binary tree in which every element is less than or equal to its children.) Deletion of the smallest element from and insertion of a new element into a min-heap of size n are O(log n) operations.
2. If a graph is represented by its adjacency lists and the priority queue is implemented as a min-heap, the running time of the algorithm is in O(|E| log |V|).
This is because the algorithm performs |V| − 1 deletions of the smallest element and makes |E| verifications and, possibly, changes of an element's priority in a min-heap of size not exceeding |V|. Each of these operations, as noted earlier, is an O(log |V|) operation. Hence, the running time of this implementation of Prim's algorithm is in (|V| − 1 + |E|) O(log |V|) = O(|E| log |V|) because, in a connected graph, |V| − 1 ≤ |E|.
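A Python sketch of this adjacency-list, min-heap version of Prim's algorithm (the standard-library heapq stands in for the priority queue, using lazy deletion of stale entries instead of a decrease-key operation; the sample graph is illustrative):

import heapq

def prim(graph, start):
    """graph: {vertex: [(weight, neighbor), ...]}. Returns the MST edge list."""
    visited = {start}
    heap = [(w, start, v) for w, v in graph[start]]   # fringe edges
    heapq.heapify(heap)
    mst = []
    while heap and len(visited) < len(graph):
        w, u, v = heapq.heappop(heap)          # cheapest edge to the fringe
        if v in visited:
            continue                           # stale entry (lazy deletion)
        visited.add(v)
        mst.append((u, v, w))
        for w2, x in graph[v]:
            if x not in visited:
                heapq.heappush(heap, (w2, v, x))
    return mst

g = {'a': [(3, 'b'), (1, 'c')],
     'b': [(3, 'a'), (2, 'c'), (4, 'd')],
     'c': [(1, 'a'), (2, 'b'), (5, 'd')],
     'd': [(4, 'b'), (5, 'c')]}
print(prim(g, 'a'))  # [('a', 'c', 1), ('c', 'b', 2), ('b', 'd', 4)]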
Working
The algorithm begins by sorting the graph's edges in non-decreasing order of their weights. Then, starting with the empty subgraph, it scans this sorted list, adding the next edge on the list to the current subgraph if such an inclusion does not create a cycle and simply skipping the edge otherwise.
We can consider the algorithm's operations as a progression through a series of forests containing all the vertices of a given graph and some of its edges. The initial forest consists of |V| trivial trees, each comprising a single vertex of the graph. The final forest consists of a single tree, which is a minimum spanning tree of the graph. On each iteration, the algorithm takes the next edge (u, v) from the sorted list of the graph's edges, finds the trees containing the vertices u and v, and, if these trees are not the same, unites them in a larger tree by adding the edge (u, v).
Analysis of Efficiency
The crucial check of whether two vertices belong to the same tree can be carried out using union-find algorithms.
The efficiency of Kruskal's algorithm is based on the time needed for sorting the edge weights of a given graph. Hence, with an efficient sorting algorithm, the time efficiency of Kruskal's algorithm will be in O(|E| log |E|).
Illustration
An example of Kruskal's algorithm is shown below. The selected edges are shown in bold.
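In addition to the worked illustration, here is a Python sketch of Kruskal's algorithm with a simple union-find structure (path compression only) performing the same-tree check; the vertex numbering and edge list are illustrative:

def kruskal(n, edges):
    """n vertices 0..n-1; edges as (weight, u, v). Returns the MST edge list."""
    parent = list(range(n))

    def find(x):                       # root of x's tree, with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):      # scan edges in non-decreasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                   # different trees: no cycle is created
            parent[ru] = rv            # unite the two trees
            mst.append((u, v, w))
    return mst

edges = [(3, 0, 1), (1, 0, 2), (2, 1, 2), (4, 1, 3), (5, 2, 3)]
print(kruskal(4, edges))  # [(0, 2, 1), (1, 2, 2), (1, 3, 4)]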
The pseudocode of Dijkstra's algorithm is given below. Note that in the following pseudocode, VT contains a given source vertex and the fringe contains the vertices adjacent to it after iteration 0 is completed.
Analysis:
The time efficiency of Dijkstra's algorithm depends on the data structures used for implementing the priority queue and for representing the input graph itself. For graphs represented by their adjacency lists and the priority queue implemented as a min-heap, it is in O(|E| log |V|).
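A matching Python sketch of Dijkstra's algorithm with the same heapq-based priority queue (nonnegative edge weights assumed; the sample graph is illustrative):

import heapq

def dijkstra(graph, source):
    """graph: {u: [(weight, v), ...]}. Returns shortest distances from source."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                       # stale entry (lazy deletion)
        for w, v in graph[u]:
            nd = d + w                     # relax edge (u, v)
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

g = {'a': [(3, 'b'), (7, 'c')], 'b': [(2, 'c'), (4, 'd')],
     'c': [(1, 'd')], 'd': []}
print(dijkstra(g, 'a'))  # {'a': 0, 'b': 3, 'c': 5, 'd': 6}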
Applications
• Transportation planning and packet routing in communication networks, including the Internet.
• Finding shortest paths in social networks, speech recognition, document formatting, robotics, compilers, and airline crew scheduling.
Properties of Heap
1. There exists exactly one essentially complete binary tree with n nodes. Its height is equal to ⌊log₂ n⌋.
2. The root of a heap always contains its largest element.
Illustration
Bottom-up construction of a heap for the list 2, 9, 7, 6, 5, 8. The double-headed arrows show key comparisons verifying the parental dominance.
Illustration
Analysis of efficiency:
Since we already know that the heap construction stage of the algorithm is in O(n), we have to investigate just the time efficiency of the second stage. For the number of key comparisons, C(n), needed for eliminating the root keys from the heaps of diminishing sizes from n to 2, we get the following inequality:
C(n) ≤ 2⌊log₂(n − 1)⌋ + 2⌊log₂(n − 2)⌋ + ... + 2⌊log₂ 1⌋ ≤ 2(n − 1) log₂(n − 1) ∈ O(n log n).
For both stages, we get O(n) + O(n log n) = O(n log n).
A more detailed analysis shows that the time efficiency of heapsort is, in fact, in Θ(n log n) in both the worst and average cases. Thus, heapsort's time efficiency falls in the same class as that of mergesort. Unlike the latter, heapsort is in-place, i.e., it does not require any extra storage. Timing experiments on random files show that heapsort runs more slowly than quicksort but can be competitive with mergesort.
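A Python sketch of heapsort's two stages on the list from the illustration (bottom-up max-heap construction, then n − 1 root deletions; a minimal rendering, not the notes' pseudocode):

def heapsort(a):
    n = len(a)

    def sift_down(i, size):
        """Restore parental dominance for the subtree rooted at i."""
        while 2 * i + 1 < size:
            child = 2 * i + 1
            if child + 1 < size and a[child + 1] > a[child]:
                child += 1                 # pick the larger child
            if a[i] >= a[child]:
                break
            a[i], a[child] = a[child], a[i]
            i = child

    # Stage 1: bottom-up heap construction, O(n)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)
    # Stage 2: n - 1 root deletions, O(n log n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]        # move the current max to its final place
        sift_down(0, end)

arr = [2, 9, 7, 6, 5, 8]
heapsort(arr)
print(arr)  # [2, 5, 6, 7, 8, 9]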
*****
Lecture Notes on
15CS43
Design and Analysis of
Algorithms
(CBCS Scheme)
Prepared by
Harivinod N
Dept. of Computer Science and Engineering,
VCET Puttur
April 2017
Contents
Course Website
www.TechJourney.in
Module-4: Dynamic Programming
Example 2
Example 3
Example 4
Backward Approach
(a) Digraph. (b) Its adjacency matrix. (c) Its transitive closure.
We can generate the transitive closure of a digraph with the help of depth-first search or breadth-first search. Performing either traversal starting at the ith vertex gives the information about the vertices reachable from it and hence the columns that contain 1's in the ith row of the transitive closure. Thus, doing such a traversal for every vertex as a starting point yields the transitive closure in its entirety.
Since this method traverses the same digraph several times, we can use a better algorithm
called Warshall's algorithm. Warshall's algorithm constructs the transitive closure through
a series of n × n boolean matrices:
Each of these matrices provides certain information about directed paths in the digraph.
Specifically, the element in the ith row and jth column of matrix R(k) (i, j = 1, 2, . . . , n, k
= 0, 1, . . . , n) is equal to 1 if and only if there exists a directed path of a positive length from
the ith vertex to the jth vertex with each intermediate vertex, if any, numbered not higher than
k.
Thus, the series starts with R(0) , which does not allow any intermediate vertices in its paths;
hence, R(0) is nothing other than the adjacency matrix of the digraph.
The next matrix, R(1), contains the information about paths that can use the first vertex as intermediate. The last matrix in the series, R(n), reflects paths that can use all n vertices of the digraph as intermediate and hence is nothing other than the digraph's transitive closure.
This means that there exists a path from the ith vertex vi to the jth vertex vj with each
intermediate vertex numbered not higher than k:
vi, a list of intermediate vertices each numbered not higher than k, vj . --- (*)
Two situations regarding this path are possible.
1. In the first, the list of its intermediate vertices does not contain the kth vertex. Then this path from vi to vj has intermediate vertices numbered not higher than k − 1, i.e., R(k−1)[i, j] = 1.
2. The second possibility is that path (*) does contain the kth vertex vk among the intermediate vertices. Then path (*) can be rewritten as:
vi, vertices numbered ≤ k − 1, vk, vertices numbered ≤ k − 1, vj.
i.e., R(k−1)[i, k] = 1 and R(k−1)[k, j] = 1.
Thus, we have the following formula for generating the elements of matrix R(k) from the elements of matrix R(k−1):
R(k)[i, j] = R(k−1)[i, j] or (R(k−1)[i, k] and R(k−1)[k, j])
As an example, the application of Warshall’s algorithm to the digraph is shown below. New
1’s are in bold.
Analysis
Its time efficiency is Θ(n³). We can make the algorithm run faster by treating matrix rows as bit strings and employing the bitwise or operation available in most modern computer languages.
Space efficiency: Although separate matrices for recording intermediate results of the
algorithm are used, that can be avoided.
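A Python sketch of Warshall's algorithm implementing the formula above (the sample digraph is a small illustrative one, not necessarily the figure's):

def warshall(adj):
    """adj: boolean adjacency matrix. Returns the transitive closure matrix."""
    n = len(adj)
    r = [row[:] for row in adj]            # R(0) is the adjacency matrix
    for k in range(n):                     # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                r[i][j] = r[i][j] or (r[i][k] and r[k][j])
    return r

adj = [[0, 1, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0],
       [1, 0, 1, 0]]
print(warshall(adj))
# [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]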
(a) Digraph. (b) Its weight matrix. (c) Its distance matrix
We can generate the distance matrix with an algorithm that is very similar to Warshall’s
algorithm. It is called Floyd’s algorithm.
Floyd’s algorithm computes the distance matrix of a weighted graph with n vertices through a
series of n × n matrices:
The element in the ith row and the jth column of matrix D(k) (i, j = 1, 2, . . . , n, k = 0, 1, . . . , n) is equal to the length of the shortest path among all paths from the ith vertex to the jth vertex with each intermediate vertex, if any, numbered not higher than k.
As in Warshall's algorithm, we can compute all the elements of each matrix D(k) from its immediate predecessor D(k−1). Taking into account the lengths of the shortest paths in both subsets leads to the following recurrence:
D(k)[i, j] = min{ D(k−1)[i, j], D(k−1)[i, k] + D(k−1)[k, j] } for k ≥ 1, with D(0) being the weight matrix.
Application of Floyd's algorithm to the digraph is shown below. Updated elements are shown in bold.
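A Python sketch of Floyd's algorithm implementing this recurrence (the weight matrix below is an illustrative four-vertex example, not necessarily the one from the figure):

INF = float('inf')

def floyd(w):
    """w: weight matrix with INF for missing edges. Returns the distance matrix."""
    n = len(w)
    d = [row[:] for row in w]              # D(0) is the weight matrix
    for k in range(n):                     # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

w = [[0, INF, 3, INF],
     [2, 0, INF, INF],
     [INF, 7, 0, 1],
     [6, INF, INF, 0]]
for row in floyd(w):
    print(row)
# [0, 10, 3, 4]
# [2, 0, 5, 6]
# [7, 7, 0, 1]
# [6, 16, 9, 0]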
The average number of comparisons in a successful search in the first of these trees is 0.1 * 1 + 0.2 * 2 + 0.4 * 3 + 0.3 * 4 = 2.9, and for the second one it is 0.1 * 2 + 0.2 * 1 + 0.4 * 2 + 0.3 * 3 = 2.1. Neither of these two trees is, in fact, optimal.
For our tiny example, we could find the optimal tree by generating all 14 binary search trees with these keys. As a general algorithm, this exhaustive-search approach is unrealistic: the total number of binary search trees with n keys is equal to the nth Catalan number, c(n) = C(2n, n)/(n + 1).
If we count tree levels starting with 1 to make the comparison numbers equal the keys’ levels,
the following recurrence relation is obtained:
The two-dimensional table in Figure 8.9 shows the values needed for computing C(i, j) by
formula (8.8): they are in row i and the columns to the left of column j and in column j and
the rows below row i. The arrows point to the pairs of entries whose sums are computed in
order to find the smallest one to be recorded as the value of C(i, j). This suggests filling the
table along its diagonals, starting with all zeros on the main diagonal and given probabilities
pi, 1≤ i ≤ n, right above it and moving toward the upper right corner.
The algorithm we just sketched computes C(1, n), the average number of comparisons for successful searches in the optimal binary tree. If we also want to get the optimal tree itself, we need to maintain another two-dimensional table to record the value of k for which the minimum in (8.8) is achieved. The table has the same shape as the table in Figure 8.9 and is filled in the same manner, starting with entries R(i, i) = i for 1 ≤ i ≤ n. When the table is filled,
its entries indicate indices of the roots of the optimal subtrees, which makes it possible to
reconstruct an optimal tree for the entire set given.
Thus, out of two possible binary trees containing the first two keys, A and B, the root of the
optimal tree has index 2 (i.e., it contains B), and the average number of comparisons in a
successful search in this tree is 0.4. On finishing the computations we get the following final
tables:
5. Knapsack problem
We start this section with designing a dynamic programming algorithm for the knapsack
problem: given n items of known weights w1, . . . , wn and values v1, . . . , vn and a knapsack
of capacity W, find the most valuable subset of the items that fit into the knapsack.
To design a dynamic programming algorithm, we need to derive a recurrence relation that
expresses a solution to an instance of the knapsack problem in terms of solutions to its
smaller subinstances.
Let us consider an instance defined by the first i items, 1 ≤ i ≤ n, with weights w1, . . . , wi, values v1, . . . , vi, and knapsack capacity j, 1 ≤ j ≤ W. Let F(i, j) be the value of an optimal solution to this instance. We can divide all the subsets of the first i items that fit the knapsack of capacity j into two categories: those that do not include the ith item and those that do. Note the following:
i. Among the subsets that do not include the ith item, the value of an optimal subset is,
by definition, F(i − 1, j).
ii. Among the subsets that do include the ith item (hence, j − wi ≥ 0), an optimal subset is made up of this item and an optimal subset of the first i − 1 items that fits into the knapsack of capacity j − wi. The value of such an optimal subset is vi + F(i − 1, j − wi).
Thus, the value of an optimal solution among all feasible subsets of the first i items is the maximum of these two values:
F(i, j) = max{ F(i − 1, j), vi + F(i − 1, j − wi) } if j − wi ≥ 0, and F(i, j) = F(i − 1, j) if j − wi < 0.
We can find the composition of an optimal subset by backtracing the computations of this entry in the table. Since F(4, 5) > F(3, 5), item 4 has to be included in an optimal solution along with an optimal subset for filling 5 − 2 = 3 remaining units of the knapsack capacity. The value of the latter is F(3, 3). Since F(3, 3) = F(2, 3), item 3 need not be in an optimal subset. Since F(2, 3) > F(1, 3), item 2 is a part of an optimal selection, which leaves element F(1, 3 − 1) to specify its remaining composition. Similarly, since F(1, 2) > F(0, 2), item 1 is the final part of the optimal solution {item 1, item 2, item 4}.
Analysis
The time efficiency and space efficiency of this algorithm are both in Θ(nW). The time needed to find the composition of an optimal solution is in O(n).
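A bottom-up Python sketch of this recurrence; the instance in the print statement (weights 2, 1, 3, 2; values 12, 10, 20, 15; W = 5) is assumed from the backtracing example above:

def knapsack(weights, values, W):
    """Bottom-up DP: F[i][j] = best value using the first i items, capacity j."""
    n = len(weights)
    F = [[0] * (W + 1) for _ in range(n + 1)]   # row 0 and column 0 are zeros
    for i in range(1, n + 1):
        for j in range(1, W + 1):
            if j >= weights[i - 1]:              # item i fits: take max of both cases
                F[i][j] = max(F[i - 1][j],
                              values[i - 1] + F[i - 1][j - weights[i - 1]])
            else:                                # item i does not fit
                F[i][j] = F[i - 1][j]
    return F[n][W]

print(knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))  # 37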
Memory Functions
The direct top-down approach to finding a solution to such a recurrence leads to an algorithm that solves common subproblems more than once and hence is very inefficient.
The classic dynamic programming approach, on the other hand, works bottom up: it fills a table with solutions to all smaller subproblems, but each of them is solved only once. An unsatisfying aspect of this approach is that solutions to some of these smaller subproblems are often not necessary for getting a solution to the problem given. Since this drawback is not present in the top-down approach, it is natural to try to combine the strengths of the top-down and bottom-up approaches. The goal is to get a method that solves only subproblems that are necessary and does so only once. Such a method exists; it is based on using memory functions.
This method solves a given problem in the top-down manner but, in addition, maintains a table of the kind that would have been used by a bottom-up dynamic programming algorithm.
Initially, all the table’s entries are initialized with a special “null” symbol to indicate that they
have not yet been calculated. Thereafter, whenever a new value needs to be calculated, the
method checks the corresponding entry in the table first: if this entry is not “null,” it is simply
retrieved from the table; otherwise, it is computed by the recursive call whose result is then
recorded in the table.
The following algorithm implements this idea for the knapsack problem. After initializing the table, the recursive function needs to be called with i = n (the number of items) and j = W (the knapsack capacity).
Algorithm MFKnapsack(i, j)
//Implements the memory function method for the knapsack problem
//Input: A nonnegative integer i indicating the number of the first items being
//       considered and a nonnegative integer j indicating the knapsack capacity
//Output: The value of an optimal feasible subset of the first i items
//Note: Uses as global variables input arrays Weights[1..n], Values[1..n], and
//      table F[0..n, 0..W] whose entries are initialized with −1's except for
//      row 0 and column 0 initialized with 0's
if F[i, j] < 0
    if j < Weights[i]
        value ← MFKnapsack(i − 1, j)
    else
        value ← max(MFKnapsack(i − 1, j),
                    Values[i] + MFKnapsack(i − 1, j − Weights[i]))
    F[i, j] ← value
return F[i, j]
Example-2: Let us apply the memory function method to the instance considered in Example 1. The table in the figure given below gives the results. Only 11 out of 20 nontrivial values (i.e., not those in row 0 or in column 0) have been computed. Just one nontrivial entry, V(1, 2), is retrieved rather than being recomputed. For larger instances, the proportion of such entries can be significantly larger.
Figure: Example of solving an instance of the knapsack problem by the memory function algorithm
6. Bellman-Ford Algorithm (Single source shortest path with −ve weights)
Problem definition
Single source shortest path - Given a graph and a source vertex s in the graph, find the shortest paths from s to all vertices in the given graph. The graph may contain negative weight edges.
Note that we have discussed Dijkstra's algorithm for the single source shortest path problem. Dijkstra's algorithm is a greedy algorithm, and its time complexity is O(|E| log |V|) with a min-heap. But Dijkstra's algorithm doesn't work for graphs with negative weight edges.
Bellman-Ford works for such graphs. Bellman-Ford is also simpler than Dijkstra and suits well for distributed systems. But the time complexity of Bellman-Ford is O(VE), which is more than that of Dijkstra.
How it works?
Like other dynamic programming problems, the algorithm calculates shortest paths in a bottom-up manner. It first calculates the shortest distances for the shortest paths which have at most one edge in the path. Then, it calculates the shortest paths with at most 2 edges, and so on. After the ith iteration of the outer loop, the shortest paths with at most i edges are calculated. There can be a maximum of |V| − 1 edges in any simple path; that is why the outer loop runs |V| − 1 times. The idea is, assuming that there is no negative weight cycle, if we have calculated shortest paths with at most i edges, then an iteration over all edges guarantees to give the shortest paths with at most (i + 1) edges.
Bellman-Ford algorithm to compute shortest paths
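Since the pseudocode itself is not reproduced here, the following is a hedged Python sketch of Bellman-Ford as just described (|V| − 1 relaxation passes over all edges, plus one extra pass to detect a negative cycle):

def bellman_ford(n, edges, src):
    """n vertices 0..n-1; edges as (u, v, w). Returns dist, or None on a negative cycle."""
    INF = float('inf')
    dist = [INF] * n
    dist[src] = 0
    for _ in range(n - 1):                 # pass i fixes paths of at most i+1 edges
        for u, v, w in edges:
            if dist[u] != INF and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w      # relax edge (u, v)
    for u, v, w in edges:                  # one extra pass: any improvement means
        if dist[u] != INF and dist[u] + w < dist[v]:
            return None                    # a negative weight cycle exists
    return dist

edges = [(0, 1, 4), (0, 2, 5), (1, 3, 7), (2, 1, -3), (3, 2, 2)]
print(bellman_ford(4, edges, 0))  # [0, 2, 5, 9]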
8. Reliability design
Lecture Notes on
15CS43
Design and Analysis of
Algorithms
(CBCS Scheme)
Prepared by
Harivinod N
Dept. of Computer Science and Engineering,
VCET Puttur
May 2017
Module-5: Backtracking
Contents
Course Website
www.TechJourney.in
1. Backtracking
Some problems can be solved by exhaustive search. The exhaustive-search technique suggests generating all candidate solutions and then identifying the one (or the ones) with a desired property.
Backtracking is a more intelligent variation of this approach. The principal idea is to
construct solutions one component at a time and evaluate such partially constructed
candidates as follows. If a partially constructed solution can be developed further without
violating the problem’s constraints, it is done by taking the first remaining legitimate option
for the next component. If there is no legitimate option for the next component, no
alternatives for any remaining component need to be considered. In this case, the algorithm
backtracks to replace the last component of the partially constructed solution with its next
option.
It is convenient to implement this kind of processing by constructing a tree of choices being made, called the state-space tree. Its root represents an initial state before the search for a solution begins. The nodes of the first level in the tree represent the choices made for the first component of a solution; the nodes of the second level represent the choices for the second component, and so on. A node in a state-space tree is said to be promising if it corresponds to a partially constructed solution that may still lead to a complete solution; otherwise, it is called non-promising. Leaves represent either non-promising dead ends or complete solutions found by the algorithm.
In the majority of cases, a state-space tree for a backtracking algorithm is constructed in the manner of depth-first search. If the current node is promising, its child is generated by adding the first remaining legitimate option for the next component of a solution, and the processing moves to this child. If the current node turns out to be non-promising, the algorithm backtracks to the node's parent to consider the next possible option for its last component; if there is no such option, it backtracks one more level up the tree, and so on. Finally, if the algorithm reaches a complete solution to the problem, it either stops (if just one solution is required) or continues searching for other possible solutions.
We start with the empty board and then place queen 1 in the first possible position of its row, which is in column 1 of row 1. Then we place queen 2, after trying unsuccessfully columns 1 and 2, in the first acceptable position for it, which is square (2, 3), the square in row 2 and column 3. This proves to be a dead end because there is no acceptable position for queen 3. So, the algorithm backtracks and puts queen 2 in the next possible position at (2, 4). Then queen 3 is placed at (3, 2), which proves to be another dead end. The algorithm then backtracks all the way to queen 1 and moves it to (1, 2). Queen 2 then goes to (2, 4), queen 3 to (3, 1), and queen 4 to (4, 3), which is a solution to the problem. The state-space tree of this search is shown in the figure.
Figure: State-space tree of solving the four-queens problem by backtracking.
× denotes an unsuccessful attempt to place a queen in the indicated column. The
numbers above the nodes indicate the order in which the nodes are generated.
If other solutions need to be found, the algorithm can simply resume its operations at the leaf
at which it stopped. Alternatively, we can use the board’s symmetry for this purpose.
Finally, it should be pointed out that a single solution to the n-queens problem for any n ≥ 4 can be found in linear time.
Note: The algorithm NQueens() is not in the syllabus. It is given here for interested learners. The algorithm is referred from textbook T2.
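Since NQueens() itself is not reproduced here, the following Python sketch shows the same backtracking idea (not the textbook's listing), placing one queen per row and backing up on dead ends:

def solve_n_queens(n):
    """Return the first solution as a list of column positions, one per row."""
    cols = []                              # cols[r] = column of the queen in row r

    def safe(row, col):
        for r, c in enumerate(cols):       # check against earlier queens
            if c == col or abs(c - col) == abs(r - row):
                return False               # same column or same diagonal
        return True

    def place(row):
        if row == n:
            return True                    # all queens placed: solution found
        for col in range(n):
            if safe(row, col):
                cols.append(col)           # extend the partial solution
                if place(row + 1):
                    return True
                cols.pop()                 # dead end: backtrack
        return False

    return cols if place(0) else None

print(solve_n_queens(4))  # [1, 3, 0, 2]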
1.3 Sum of subsets problem
Problem definition: Find a subset of a given set A = {a1, . . . , an } of n positive integers
whose sum is equal to a given positive integer d.
For example, for A = {1, 2, 5, 6, 8} and d = 9, there are two solutions: {1, 2, 6} and {1, 8}.
Of course, some instances of this problem may have no solutions.
It is convenient to sort the set’s elements in increasing order. So, we will assume that
a1< a2 < . . . < an.
The state-space tree can be constructed as a binary tree like that in the figure shown below for the instance A = {3, 5, 6, 7} and d = 15.
The number inside a node is the sum of the elements already included in the subsets
represented by the node. The inequality below a leaf indicates the reason for its termination.
Example: Apply backtracking to solve the following instance of the subset sum problem: A
= {1, 3, 4, 5} and d = 11.
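A Python sketch of backtracking for subset sum (elements assumed sorted in increasing order, as above; demonstrated on the first instance, A = {1, 2, 5, 6, 8} with d = 9):

def subset_sum(a, d):
    """Return the first subset of a (sorted ascending) summing to d, or None."""
    result = []

    def backtrack(i, remaining):
        if remaining == 0:
            return True                     # complete solution found
        if i == len(a) or a[i] > remaining:
            return False                    # non-promising: prune this branch
        result.append(a[i])                 # include a[i]
        if backtrack(i + 1, remaining - a[i]):
            return True
        result.pop()                        # exclude a[i] and try the next element
        return backtrack(i + 1, remaining)

    return result if backtrack(0, d) else None

print(subset_sum([1, 2, 5, 6, 8], 9))  # [1, 2, 6] (the other solution is [1, 8])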
Analysis
How can we find a lower bound on the cost of an optimal selection without actually solving
the problem?
We can do this by several methods. For example, it is clear that the cost of any solution, including an optimal one, cannot be smaller than the sum of the smallest elements in each of the matrix's rows. For the instance here, this sum is 2 + 3 + 1 + 4 = 10. We can and will apply the same thinking to partially constructed solutions. For example, for any legitimate selection that selects 9 from the first row, the lower bound will be 9 + 3 + 1 + 4 = 17.
Rather than generating a single child of the last promising node as we did in backtracking, we will generate all the children of the most promising node among non-terminated leaves in the current tree. (Non-terminated, i.e., still promising, leaves are also called live.) How can we tell which of the nodes is most promising? We can do this by comparing the lower bounds of the live nodes. It is sensible to consider a node with the best bound as most promising, although this does not, of course, preclude the possibility that an optimal solution will ultimately belong to a different branch of the state-space tree. This variation of the strategy is called the best-first branch-and-bound.
We start with the root that corresponds to no elements selected from the cost matrix. The lower-bound value for the root, denoted lb, is 10. The nodes on the first level of the tree correspond to selections of an element in the first row of the matrix, i.e., a job for person a. See the figure given below.
But there is a less obvious and more informative lower bound for instances with symmetric matrix D, which does not require a lot of work to compute. We can compute a lower bound on the length l of any tour as follows. For each city i, 1 ≤ i ≤ n, find the sum si of the distances from city i to the two nearest cities; compute the sum s of these n numbers, divide the result by 2, and, if all the distances are integers, round up the result to the nearest integer:
lb = ⌈s/2⌉ ... (1)
For example, for the instance in Figure 2.2a, formula (1) yields
Moreover, for any subset of tours that must include particular edges of a given graph, we can modify lower bound (formula 1) accordingly. For example, for all the Hamiltonian circuits of the graph in Figure 2.2a that must include edge (a, d), we get the following lower bound by summing up the lengths of the two shortest edges incident with each of the vertices, with the required inclusion of edges (a, d) and (d, a):
We now apply the branch-and-bound algorithm, with the bounding function given by formula (1), to find the shortest Hamiltonian circuit for the graph in Figure 2.2a.
To reduce the amount of potential work, we take advantage of two observations.
1. First, without loss of generality, we can consider only tours that start at a.
2. Second, because our graph is undirected, we can generate only tours in which b is visited before c. (Refer to the note at the end of section 2.2 for more details.)
In addition, after visiting n − 1 = 4 cities, a tour has no choice but to visit the remaining unvisited city and return to the starting one. The state-space tree tracing the algorithm's application is given in Figure 2.2b.
Figure: Solution to a small instance of the traveling salesman problem by exhaustive search.
Discussion
The strengths and weaknesses of backtracking are applicable to branch-and-bound as well. The state-space tree technique enables us to solve many large instances of difficult combinatorial problems. As a rule, however, it is virtually impossible to predict which instances will be solvable in a realistic amount of time and which will not.
In contrast to backtracking, solving a problem by branch-and-bound has both the challenge and opportunity of choosing the order of node generation and finding a good bounding function. Though the best-first rule we used above is a sensible approach, it may or may not lead to a solution faster than other strategies. (Artificial intelligence researchers are particularly interested in different strategies for developing state-space trees.)
Finding a good bounding function is usually not a simple task. On the one hand, we want this function to be easy to compute. On the other hand, it cannot be too simplistic; otherwise, it would fail in its principal task to prune as many branches of a state-space tree as soon as possible. Striking a proper balance between these two competing requirements may require intensive experimentation with a wide variety of instances of the problem in question.
3. 0/1 Knapsack problem
Note: For this topic, as per the syllabus, both textbooks T1 & T2 are suggested. Here we discuss the concepts from T1 first and then those from T2.
Topic from T1 (Levitin)
Let us now discuss how we can apply the branch-and-bound technique to solving the knapsack problem. Given n items of known weights wi and values vi, i = 1, 2, . . . , n, and a knapsack of capacity W, find the most valuable subset of the items that fit in the knapsack:
maximize Σ vi xi subject to Σ wi xi ≤ W, where xi ∈ {0, 1}, i = 1, . . . , n.
It is convenient to order the items of a given instance in descending order by their value-to-weight ratios.
Each node on the ith level of the state-space tree, 0 ≤ i ≤ n, represents all the subsets of n items
that include a particular selection made from the first i ordered items. This particular
selection is uniquely determined by the path from the root to the node: a branch going to the
left indicates the inclusion of the next item, and a branch going to the right indicates its
exclusion.
We record the total weight w and the total value v of this selection in the node, along with
some upper bound ub on the value of any subset that can be obtained by adding zero or more
items to this selection. A simple way to compute the upper bound ub is to add to v, the total
value of the items already selected, the product of the remaining capacity of the knapsack
W − w and the best per unit payoff among the remaining items, which is vi+1/wi+1:
ub = v + (W − w)(vi+1/wi+1).
Example: Consider the following problem. The items are already ordered in descending order of their value-to-weight ratios.
preceding subsection.) For the knapsack problem, however, every node of the tree represents a subset of the items given. We can use this fact to update the information about the best subset seen so far after generating each new node in the tree. If we had done this for the instance investigated above, we could have terminated nodes 2 and 6 before node 8 was generated, because they both are inferior to the subset of value 65 of node 5.
Conclusion
4. NP-Complete and NP-Hard problems
4.1 Basic concepts
For many of the problems we know and study, the best algorithms for their solution have computing times that can be clustered into two groups:
1. Solutions are bounded by a polynomial - Examples include binary search O(log n), linear search O(n), sorting algorithms like merge sort O(n log n), bubble sort O(n²) and matrix multiplication O(n³), or in general O(n^k) where k is a constant.
2. Solutions are bounded by a non-polynomial - Examples include the travelling salesman problem O(n²2ⁿ) and the knapsack problem O(2^(n/2)). As the time increases exponentially, even moderate size problems cannot be solved.
So far, no one has been able to devise an algorithm which is bounded by a polynomial for the problems belonging to the non-polynomial group. However, the impossibility of such an algorithm is not proved.
print(‘0’); failure
Example-2: Checking whether n integers are sorted or not
procedure NSORT(A,n);
//sort n positive integers//
var integer A(n), B(n), n, i, j;
begin
    B := 0; //B is initialized to zero//
    for i := 1 to n do
    begin
        j := choice(1:n);
        if B(j) <> 0 then failure;
        B(j) := A(i);
    end;
    for i := 1 to n-1 do //verify that B is in order//
        if B(i) > B(i+1) then failure;
    print(B);
    success;
end;
Decision vs Optimization algorithms
An optimization problem tries to find an optimal solution.
A decision problem tries to answer a yes/no question. Most of the problems can be specified
in decision and optimization versions.
For example, the Traveling salesman problem can be stated in two ways:
• Optimization - find hamiltonian cycle of minimum weight,
• Decision - is there a hamiltonian cycle of weight ≤ k?
For graph coloring problem,
• Optimization – find the minimum number of colors needed to color the vertices of a
graph so that no two adjacent vertices are colored the same color
• Decision - whether there exists such a coloring of the graph's vertices with no more than m colors?
Many optimization problems can be recast into decision problems with the property that the decision algorithm can be solved in polynomial time if and only if the corresponding optimization problem can.
4.3 P, NP, NP-Complete and NP-Hard classes
But there are some problems which are known to be in NP but we don't know if they're in P. The traditional example is the decision-problem version of the Travelling Salesman Problem (decision-TSP). It's not known whether decision-TSP is in P: there's no known poly-time solution, but there's no proof such a solution doesn't exist.
There are problems that are known to be in neither P nor NP; a simple example is to enumerate all the bit vectors of length n. No matter what, that takes 2^n steps.
Now, one more concept: given decision problems P and Q, if an algorithm can transform a solution for P into a solution for Q in polynomial time, it's said that Q is poly-time reducible (or just reducible) to P.
The most famous unsolved problem in computer science is: whether P = NP or P ≠ NP?
Figure: Commonly believed relationship between P and NP. Figure: Commonly believed relationship between P, NP, NP-Complete and NP-Hard problems.
NP-Complete problems have the property that any one of them can be solved in polynomial time if all other NP-Complete problems can be solved in polynomial time, i.e., if anyone ever finds a poly-time solution to one NP-Complete problem, they've automatically got one for all the NP-Complete problems; that will also mean that P = NP.
An example of an NP-Complete problem is the CNF-satisfiability problem. The CNF-satisfiability problem deals with boolean expressions. This was given by Cook in 1971. The CNF-satisfiability problem asks whether or not one can assign the values true and false to the variables of a given boolean expression in its CNF form to make the entire expression true.
Over the years many problems in NP have been proved to be in P (like primality testing). Still, there are many problems in NP not proved to be in P, i.e., the question still remains whether P = NP. NP-Complete problems help in resolving this question. They are a subset of NP problems with the property that all other NP problems can be reduced to any of them in polynomial time. So, they are the hardest problems in NP in terms of running time. If it can be shown that any NP-Complete problem is in P, then all problems in NP will be in P (because of the NP-Complete definition), and hence P = NP = NPC.
NP-Hard Problems - These problems need not have any bound on their running time. If any NP-Complete problem is polynomial time reducible to a problem X, that problem X belongs to the NP-Hard class. Hence, all NP-Complete problems are also NP-Hard. In other words, if an NP-Hard problem is non-deterministic polynomial time solvable, it is an NP-Complete problem. An example of an NP-Hard problem that is not NP-Complete is the Halting Problem.
If an NP-Hard problem can be solved in polynomial time, then all NP-Complete problems can be solved in polynomial time.
"All NP-Complete problems are NP-Hard, but not all NP-Hard problems are NP-Complete." NP-Complete problems are a subclass of NP-Hard problems.
The more conventional optimization version of the Traveling Salesman Problem for finding the shortest route is NP-Hard, not strictly NP-Complete.
*****