Daa Combined
Daa Combined
Daa Combined
Reference Books:
1. Levitin A., Introduction to the design and analysis of
algorithms, Pearson Education (2008) 2nd ed.
1
Objectives
• Greedy Method
• Applications
– Optimal Storage on Tapes
– Job Sequencing with Deadlines
– Knapsack Problem
– Single Source Shortest Path
• Previous Gate Questions
2
Greedy Method
• The Greedy method is a most straight forward design
technique which can be applied to a wide variety of
problems.
• This algorithm works in steps. In each step it selects the
best available options until all options are finished.
• Most of the problems have n inputs and require us to
obtain a subset that satisfies some constraints.
• Any subset that satisfies these constraints is called as a
feasible solution.
• A feasible solution that either minimizes or maximizes
a given objective function is called as Optimal
Solution. 3
Greedy Algorithm
• The Greedy method suggest that one can devise an
algorithm that work in stages, considering one input at a
time.
• At each stage, a decision is made regarding whether a
particular input is an optimal solution or not.
• This is done by considering the inputs in an order
determined by some selection procedure.
• If the inclusion of the next input into the partially
constructed optimal solution results sub-
optimal/infeasible solution, then that input is not added to
the partial solution. Otherwise, it is added. The selection
procedure itself is based on some optimization measures.
4
Control Abstraction for Greedy Method
Select selects an input from a[] and removes it. The selected input’s value is
assigned to x.
Feasible is a Boolean-valued function that determines whether x can be included
into the solution vector or not.
Union combines x with the solution and updates the objective function.
5
Types of Greedy Problems
• Subset Paradigm
• To solve a problem (or possibly find the
optimal/best solution), greedy approach generate
subset by selecting one or more available choices.
Eg. includes Knapsack problem, job sequencing
with deadlines. In both of the problems greedy
create a subset of items or jobs which satisfies all
the constraints.
• Ordering Paradigm
• In this, greedy approach generate some
arrangement/order to get the best solution. Eg.
includes: Minimum Spanning tree
6
Applications
• Fractional knapsack algorithm
• Optimal Storage on tapes
• Job sequencing with deadline
• Single source shortest path
– Dijkstra's SSSP algorithm
• Activity Selection Problem
• Minimum Cost Spanning Tree
• …
7
Optimal Storage on Tapes
8
Description
• There are n programs that are to be stored on a computer
tape of length L.
• Associated with each program i is a length l;
• Clearly, all programs can be stored on the tape if and only if
the sum of the lengths of the programs is at most L.
• We shall assume that whenever a program is to be
retrieved from this tape, the tape is initially positioned at
the front.
• Hence' if the programs are stored in the order I=i1, i2’ i3 …. in
the time tj needed to retrieve program ij is proportional to
lik .
• If all programs are retrieved equally often then the
expected or mean retrieval time (MRT) is
9
Example
10
Method
• The greedy method simply requires us to store
the programs in non-decreasing order of their
lengths.
• This ordering (sorting) can be carried out in
O(n log n) time using an efficient sorting
algorithm
11
Algorithm for multiple tapes
13
• Maximize (A)
• Subject to (B)
•
• And 0 ≤ Xi ≤ 1, 1 ≤i ≤n (C)
• The profit and weights are the positive numbers.
• Here, A feasible solution is any set (X1, X2, …,
Xn) satisfying above rules (B) and (C).
• And an optimal solution is feasible solution for
which rule (A) is maximized.
14
Here, N=3, M=20, (P1, P2, P3)=(25, 24, 15) and (W1, W2, W3)=(18, 15, 10)
Different feasible solutions are:
(X1, X2, X3)
15
The knapsack algorithm
• The greedy algorithm:
Step 1: Sort pi/wi into nonincreasing order.
Step 2: Put the objects into the knapsack according
to the sorted sequence as possible as we can.
• e. g. p[i]/w[i]≥p[i+1]≥w[i+1]
n = 3, M = 20, (p1, p2, p3) = (25, 24, 15)
(w1, w2, w3) = (18, 15, 10)
Sol: p1/w1 = 25/18 = 1.39
p2/w2 = 24/15 = 1.6
p3/w3 = 15/10 = 1.5
Optimal solution: x1 = 0, x2 = 1, x3 = 1/2
total profit = 24 + 7.5 = 31.5
16
Algorithm
• Algorithm GreedyKnapsack(m,n)
//order the n objects such that p[i]/w[i]≥p[i+1]≥w[i+1]
{
for i:=1 to n do x[i]:=0.0;
U:=m;
for i:=1 to n do
{
if(w[i] > U) then break;
x[i]:=1.0; U:=U-w[i];
}
if( i ≤ n) then x[i]:=U/w[i];
}
17
Example problem
• N=7
• M=15(15-1=14-2=12-4=8-5=3-1=2)
• (p1,p2…p7)=(10,5,15,7,6,18,3)
• (w1,w2..w7)=(2,3,5,7,1,4,1)
• The solution vector is (1,2/3,1,0,1,1,1)
• P1/w1=5 p2/w2=1.66 p3/w3=3 p4/w4=1
p5/w5=6 p6/w6=4.5 p7/w7=3
• x5,x1,x6,x3,x7,x2,x4
• Weight is 1+2+4+5+1+2/3*3+0=13+2/3*3=15
• Profit is 6+10+18+15+3+2/3*5+0=55.34
(1,2/3,1,0,1,1,1) 18
Job Sequencing with Deadlines
• We are given a set of n jobs.
• Di is a deadline given to complete ithjob for profit Pi where Pi>0 &
Di ≥ 0.
• For any job i profit Pi is earned iff the job is completed within its
deadline.
• To complete a job, one has to process the job on a machine for one
unit of time.
• Only one machine is available for processing jobs.
• A feasible solution for this problem is a subset J of jobs such that
each job in this subset can be completed by its deadline.
• The value of a feasible solution J is the sum of the profits of the job
in J.
• An optimal solution is a feasible solution with maximum value.
19
• Example –Suppose on a single machine four jobs with
profit values (100, 10, 15 and 27) and their respective
deadline unit values (2, 1, 2, 1) are given. Calculate the
different feasible solutions to complete the jobs with
optimal solution.
• Solution:
• Number of Jobs (n)= 4
• Profit values of four jobs (P1, P2, P3, P4)=(100, 10, 15, 27)
• Process deadlines for respective jobs are
(D1, D2, D3, D4)=(2, 1, 2, 1)
20
(P1, P2, P3, P4)=(100, 10, 15, 27) (D1, D2, D3, D4) =(2, 1, 2, 1)
The feasible solutions and their values are –
Feasible sol. subset Processing Sequence Values
(1, 2) (2, 1) 110
(1, 3) (1, 3 or 3, 1) 115
(1, 4) (4, 1) 127
(3, 2) (2, 3) 25
(3, 4) (4, 3) 42
(1) (1) 100
(2) (2) 10
(3) (3) 15
(4) (4) 27
Here, Solution 3 is optimal solution. In this solution only job 1 and 4
are processed and the profit value is 127. These jobs must be
processed in the order Job 4 followed by job 1. Thus the processing
of job 4 begins at time zero and that of job 1 is completed at time 2.
21
Algorithm
23
Single-Source Shortest Path Problem
24
Dijkstra's algorithm
Dijkstra's algorithm - is a solution to the single-source
shortest path problem in graph theory.
Approach: Greedy
26
Dijkstra Animated Example
27
Dijkstra Animated Example
28
Dijkstra Animated Example
29
Dijkstra Animated Example
30
Dijkstra Animated Example
31
Dijkstra Animated Example
32
Dijkstra Animated Example
33
Dijkstra Animated Example
34
Dijkstra Animated Example
35
Dijkstra Animated Example
36
Implementations and Running Times
The simplest implementation is to store vertices in an array
or linked list. This will produce a running time of
O(|V|^2 + |E|)
Where |E| is for search needed to find the minimum
distance node. //A node may be connected to a maximum
of E nodes
For sparse graphs, or graphs with very few edges and
many nodes, it can be implemented more efficiently storing
the graph in an adjacency list using a binary heap or
priority queue. This will produce a running time of
O((|E|+|V|) log |V|) //inner loop is replaced by a heap
37
DIJKSTRA'S ALGORITHM - WHY USE IT?
38
Example
1 2 3 4 5 6 7
0 inf inf inf inf inf Inf
25 inf inf inf inf inf
35 39 inf inf inf
39 inf 51 Inf
inf 51 inf
93 65
93
39
Previous Gate Questions
40
Q. No. 1
41
Q. No. 2
Answer is 16
16
60/10 =6
28/7 =4
20/4 =5
24/2 =12
43
44
Q. No. 4
46
Q. No. 6
47
• 7 Given below are some algorithms, and some algorithm design paradigms.
Q. No.
List-I
A. Dijkstra’s Shortest Path 3
B. Floyd-Warshall algorithm to compute all pairs shortest path 2
C. Binary search on a sorted array 1
D. Backtracking search on a graph 4
List-II
1. Divide and Conquer
2. Dynamic Programming
3. Greedy design
4. Depth-first search
5. Breadth-first search
Match the above algorithms on the left to the corresponding design paradigm they follow Codes:
ABCD
(a) 1 3 1 5
(b) 3 3 1 5
(c) 3 2 1 4
(d) 3 2 1 5
Options:
A) a
B) b
C) c
D) D Answer C 48
Q. No. 9
49
Q. No. 10
50
Graphs
Introduction
• Generalization of a tree.
• Collection of vertices (or nodes) and connections
between them.
• No restriction on
– The number of vertices.
– The number of connections between the two
vertices.
• Have several real life applications.
Definition
• A graph G = (V,E) consists of a
– Non-empty set V of vertices and
– Possibly empty set E of edges.
• |V| denotes number of vertices.
• |E| denotes number of edges.
• An edge (or arc) is a pair of vertices (vi,vj) from V.
– Simple or undirected graph (vi,vj) = (vj,vi).
– Digraph or directed graph (vi,vj) ≠ (vj,vi).
• An edge has an associated weight or cost as well.
Undirected
Contd… Graph
v1 v4
v1 v4 Directed v2
Graph
v2 v5
v3 v6
v5
v3 v6
v1 v2
v1 v4
v2 v3
v4
v5 Complete
v3
Multigraph Graph
Representation – I
• Adjacency matrix
– Adjacency matrix for a graph G = (V, E) is a two
dimensional matrix of size |V| x |V| such that
each entry of this matrix
a[i][j] = 1 (or weight), if an edge (vi,vj) exists.
0, otherwise.
Non-Eulerian
graph as only
Eulerian graph as Euler path exists.
Euler circuit exists.
Fleury's algorithm
1. Start at a vertex of odd degree (Euler path), or, if
the graph has none, start with an arbitrarily
chosen vertex (Euler circuit).
2. Choose the next edge in the path to be one whose
deletion would not disconnect the graph. (Always
choose the non-bridge in between a bridge and a
non-bridge.)
3. Add that edge to the circuit, and delete it from the
graph.
4. Continue until the circuit is complete.
Example – 1
Vertex Degree
v1 2
v2 2
v3 4
v4 2
v5 2
v6 2
v6 e6 v3 e5 v5
v1 v2 v4 v2 v1v2
e3 or
e7 e2 e4
e1
v6 e6 v3 e5 v5
v1 v4 v3 v1v2v3
e3 or
e7 e4
e1e2
v6 e6 v3 e5 v5
Contd…
Graph Current Vertex Trail
v1 v4 v4 v1v2v3v4
v3v6 or e6 is a or
e7 e4
bridge and can’t e1e2e3
v6 e6 v3 e5 v5 be selected.
v1 v5 v1v2v3v4v5
or
e7
e1e2e3e4
v6 e6 v3 e5 v5
v1 v3 v1v2v3v4v5v3
or
e7
e1e2e3e4e5
v6 e6 v3
Contd…
Graph Current Vertex Trail
v1 v6 v1v2v3v4v5v3v6
or
e7
e1e2e3e4e5e6
v6
NULL v1 v1v2v3v4v5v3v6v1
or
e1e2e3e4e5e6e7
• Required Trail:
v1v2v3v4v5v3v6v1
or e1e2e3e4e5e6e7
Example – 2 Vertex Degree
v1 2
v2 2
v1 v2
v3 4
v4 3
v4 v5
v3 v5 2
v6 2
v6 v7 v8 v7 3
v8 2
v2 v4v2
v4v6 is a bridge
and can’t be
selected.
v1 v4v2v1
Contd…
Graph Current Vertex Trail
v3 v4v2v1v3
v4 v4v2v1v3v4
v6 v4v2v1v3v4v6
v7 v4v2v1v3v4v6v7
Contd…
Graph Current Vertex Trail
v5 v4v2v1v3v4v6v7v5
v8 v4v2v1v3v4v6v7v5v8
NULL v7 v4v2v1v3v4v6v7v5v8v7
v1 v2
• Required Trail:
v4 v5
v3
v4v2v1v3v4v6v7v5v8v7
v6 v7 v8
Example – 3
• Find Euler circuit using Fleury's algorithm. Start at
vertex A.
https://www.youtube.com/watch?v=vvP4Fg4r-Ns
Contd…
• In Fleury's algorithm, the graph traversal is linear in
the number of edges, i.e. O(|E|), excluding the
complexity of detecting bridges.
v6 v3 v5
v4 v1v2v3v4v5v3v6v1
v3 v5
v1v2v3v4v5v3v6v1
Example – 2
• Find Euler circuit using Hierholzer’s algorithm. Start
at vertex A.
• Solution:
– ABCDEA
– AGFABCDEA
– AGFABEBCDEA
Contd…
• If doubly linked list is used to maintain
– The set of unused edges incident to each vertex,
– The list of vertices on the current tour that have unused
edges, and
– The tour itself.
• The individual operations of finding
– Unused edges exiting each vertex,
– A new starting vertex for a tour, and
– Connecting two tours that share a vertex.
• May be performed in constant time, so the
algorithm takes linear time, O(|E|).
Hamiltonian Graphs
• A path passing through all the vertices of a graph is
called a Hamiltonian path.
– A graph containing a Hamiltonian path is said to be
traceable.
• A cycle passing through all the vertices of a graph is
called a Hamiltonian cycle (or Hamiltonian circuit).
• A graph containing a Hamiltonian cycle is called a
Hamiltonian graph.
• Hamiltonian path problem: Determine the
existence of Hamiltonian paths and cycles in graphs
(NP-complete).
Example
• Not Hamiltonian graphs.
Example
•
• Hamiltonian graph
– 128765431
graph[1][4] = 1.
Thus Hamiltonian graph.
1 2 3 4
4 4 1 0 1 0 1
3 3 2 1 0 1 1
2 2 3 0 1 0 1
path[] 1 1 graph[][] 4 1 1 1 0
Applications
• Painting road lines,
• Plowing roads after a snowstorm,
• Checking meters along roads,
• Garbage pickup routes, etc.
Planar Graphs
• If a graph G can be drawn on a plane (or a sphere) so that
the edges only intersect at vertices, then it is planar.
• Such a drawing of a planar graph is a planar embedding of
the graph.
• Planar Nonplanar
Contd…
8 – 11 + 5 = 2
Graph Coloring
• Special case of labeling graph elements subject to
certain constraints.
– Traditionally, "colors" are used as labels.
• Several forms
– Vertex coloring. Color vertices of a graph such that
no two adjacent vertices share the same color.
– Edge coloring. Color edges such that no two adjacent
edges share the same color.
– Face coloring of a planar graph. Assign color to each
face or region so that no two faces that share a
boundary have the same color.
Contd…
v5 v6 v7 v8
Order of vertices
v1 v2 v3 v4 v5 v6 v7 v8
c1 c1 c2 c1 c2 c2 c3 c4
Number of colors required ≤ Maximum degree + 1
≤ max{deg(v1), deg(v2), deg(v3), deg(v4), deg(v5),
deg(v6), deg(v7), deg(v8)} + 1
≤ max{3, 3, 2, 2, 2, 4, 5, 3} + 1
≤5+1
≤ 6.
Welsh–Powell algorithm
• If vertices are ordered according to their degrees
(decreasing), then
v1 v2 v3 v4
v5 v6 v7 v8
v7 v6 v1 v2 v8 v3 v4 v5
c1 c2 c3 c1 c3 c2 c3 c2
m-Coloring Problem
• Given an undirected graph and a number m, determine if
the graph can be colored with at most m colors such that
no two adjacent vertices have the same color.
• Input:
– An adjacency matrix, graph[V][V] where V is the
number of vertices in the graph.
– An integer m which is maximum number of colors that
can be used.
• Output:
– An array color[V] containing colors assigned to all the
vertices in the range 1 to m. False will be returned, if
the graph cannot be colored with m colors.
Backtracking
1. If all colors are assigned,
2. Print vertex assigned colors.
3. Else
4. Trying all possible colors, assign a color to the
vertex.
5. If color assignment is possible, recursively assign
colors to next vertices.
6. If color assignment is not possible, de-assign
color, return False.
Example – Check for 3-colorability
v2 v3 v5 v6
7 c2 c1 c2
6 c3 c1 c2 c3 v7 v1 v0 v4
5 c2 c1 c2
0 1 2 3 4 5 6 7
4 c3 c1 c2 c3 0 0 1 1 0 1 1 1 0
3 c1 1 1 0 1 0 1 0 1 0
2 1 1 0 0 0 0 0 1
2 c3 c1 c2 c3 3 0 0 0 0 1 1 0 1
1 c2 c1 c2 4 1 1 0 1 0 0 0 0
5 1 0 0 1 0 0 0 0
0 c1 6 1 1 0 0 0 0 0 0
7 0 0 1 1 0 0 0 0
Applications
• Scheduling.
• Frequency Assignment.
• Register Allocation.
• Bipartite Graphs.
• Map Coloring, etc. v2 v3 v5 v6
v7 v1 v0 v4
Definitions
• Diagraph or directed graph.
• A cycle in a diagraph G is a set of edges, {(v1, v2), (v2, v3),
..., (vr −1, vr)} where v1 = vr .
• A diagraph is acyclic if it has no cycles. Also known as
directed acyclic graph (DAG).
• DAGs are used in many applications to indicate
precedence among events. For example,
– Inheritance between classes.
– Prerequisites between courses of a degree program.
– Scheduling constraints between the tasks of a projects.
Example – Prerequisites between courses
142 → 143 → 378 → 370 → 321 → 341 → 322 →
326 → 421 → 401
Example – Scheduling
Topological Sort
• Let, G = (V,E) be a DAG
• Linear ordering of all its
vertices such that if G contains
an edge (u,v), then u appears
before v in the ordering.
• Can also be viewed as an
ordering of vertices
along a horizontal line so
that all directed edges
go from left to right.
Kahn's algorithm
• L ← Empty list that will contain the sorted elements
• S ← Set of all nodes with no incoming edge
1. while S is non-empty do
2. remove a node n from S
3. add n to tail of L
4. for each node m with an edge e from n to m do
5. remove edge e from the graph
6. if m has no other incoming edges then
7. insert m into S
8. if graph has edges then
9. return error (graph has at least one cycle)
10. else
11. return L (a topologically sorted order)
Example – 1
B C
A F
D E
A B D Vertex In-degree
A 0
B C
B 1
C D E C 1
D E D 2
E 2
E
F 0
F
A B D
Contd… B C
L: Ø C D E
S: A → F
D E
F
L: A
S: F → B
L: A → F
S: B
A B D
Contd… B C
L: A → F → B C D E
S: C
D E
F
L: A → F → B → C
S: D
L: A → F → B → C → D
Required
S: E topological
sort is
AFBCDE
L: A → F → B → C → D → E NULL
S: Ø
Topological Sort using DFS
• Perform DFS. Sort vertices in decreasing order of
their finishing time.
1 2 3 4
5 6 7 8
9 10 11 12
Contd…
Starting Finishing
Vertex
Time Time
A 1 10
B 6 9
C 7 8
D 2 5
E 3 4
F 11 12
0 1 2 3 4 5
5 0 1 1 1 1 1 1 0 0 0 0 0 0 0
4 0 0 0 0 0 0 1 1 0 0 0 0 0 0
3 0 0 0 0 1 1 1 2 0 0 0 1 0 0
2 0 0 0 1 1 1 1 3 0 1 0 0 0 0
1 0 0 0 0 0 1 1 4 1 1 0 0 0 0
• Visited[] 0 0 0 1 1 1 1 1 5 1 0 1 0 0 0
Example
• Topological sort:
0→1→2→3→4→5
Isomorphism
• Two graphs are isomorphic when the vertices of one
can be re-labeled to match the vertices of the other
in a way that preserves adjacency.
• Formally,
Graphs G1 and G2 are isomorphic if there exists
a one-to-one function, called an isomorphism,
f: V(G1) → V(G2)
such that uv is an element of E(G1) if and only
if f(u)f(v) is an element of E(G2).
Example
G1 G2 G3
• G1 and G2
f(a) = J, f(b) = K, f(c) = M, and f(d) = L.
• G1 and G3
f(a) = a, f(b) = d, f(c) = c, and f(d) = b.
Example
• f(a) = 1
• f(b) = 6
• f(c) = 8
• f(d) = 3
• f(g) = 5
• f(h) = 2
• f(i) = 4
• f(j) = 7
Graph Isomorphism Problem
• Determining whether two finite graphs are isomorphic or
not?
• It is neither an NP-complete problem nor a P-problem,
although this has not been proved (Skiena 1990).
• There is a famous complexity class called graph isomorphism
complete which is thought to be entirely disjoint from
both NP-complete and from P.
• Applications:
– Cheminformatics,
– Mathematical chemistry (identification of chemical compounds),
and
– Electronic design automation (verification of equivalence of
various representations of the design of an electronic circuit).
Connected and Disconnected Graph
• A graph is connected if all the vertices are
connected to each other, i.e.
– A path exists between every pair of vertices.
– No unreachable vertices.
• {e1, e4}, {e6, e7}, {e1, e2, e3}, {e8}, {e3, e4, e5, e6}, {e2,
e5, e7}, {e2, e5, e6} and {e2, e3, e4}.
Cut
• A cut is a partition of the vertices of a graph into
two disjoint subsets.
• Formally,
In a graph G = (V,E), a pair of subsets V1 and V2 of V
satisfying
V = V1 ∪ V2, V1 ∩ V2 = ∅, V1 ≠ ∅, V2 ≠ ∅
is called a cut (or a partition) of G. Denoted as (V1, V2).
• It is also defined as an edge set.
cut (V1, V2) = {those edges with one end vertex in V1
and the other end vertex in V2}.
Contd…
• In an unweighted undirected Min-cut
graph, the size or weight of
a cut is the number of edges
crossing the cut.
of this vertex is 0.
Flow in a network is an integer-valued function f
defined on the edges of G satisfying 0 ≤ f (u,v) ≤
c(u,v), for every edge (u,v) in E.
Contd…
| f | f ( s, v) f (v, t )
vV vV
f (s, v) f (v, t )
vV vV
f
Cut
10 15 15 10
4
s 5 3 8 6 10 t
4 6 15 10
15
4 30 7
Capacity of Cut (S,T)
c( S , T ) c(u, v)
uS ,vT
2 9 5
Cut
10 15 15 10
4
s 5 3 8 6 10 t
4 6 15 10
15
4 30 7
Capacity = 30
Min Cut
s 5 3 8 6 10 t
4 6 15 10
15
4 30 7 Capacity = 28
Flow of Min Cut
FORD-FULKERSON-METHOD(G,s,t)
1. initialize flow f to 0
4. return f
Residual Network
Path: s → v2 → v1 → v3 → t.
Residual capacity: min(13, 4, 8, 20) = 4.
All edges in path exists in E, so add 4 to the flow for
all the edges in the path.
Flow = 4 + 4
Contd…
| f | = 23.
c({s, v1,v2,v4},{v3,t}) = 23.
4
Example 2 2
1
5
3 1 1
2 4 2
4 s t
2 5 3 2
1 1
3 1 1
2 3
2 4
s t
3 2 4
1 2 5
1
3 3 1 1
2/2 4 2/2
s t
3 2
Flow = 2 1
3
4
Contd… 2
1
5
3 1 1
2/2 4 2/2
4 s t
2 5 3 1 2
1
3 1 1
2 2
3
4
s t
3 2 4
1 2 5
1
3 3 1 1
2/2 4 2/2
s t
2/3 1 2/2
Flow = 2 + 2
3
4
Contd… 2
1
5
3 1 1
2/2 4 2/2
4 s t
2 5 2/3 1 2/2
1
3 1 1
2
3
4
s t
1 1/4
1 2 2 5
2 1
1/3 1 1/1
3
2/2 4 2/2
s t
2/3 1 2/2
Flow = 2 + 2 + 1
3
1/4
Contd… 2
1
5
1/3 1 1/1
3 2/2 4 2/2
s t
2 1 5 2/3 1 2/2
2 1 1
1 1
2 3
4
s t
1 2
1
2 | f | = 5.
3
c({s,2,3,4,5},{t}) = 5.
Flow = 2 + 2 + 1 = 5
Analysis
• Matrix multiplication
• Matrix chain multiplication
• Deriving formula using DP and How to use
the formula
• Example of matrix chain multiplication
THANKS !
Optimal Binary Search Tree (OBST)
• Binary Search Tree
• Possible no. of BST for given ‘n’
• Cost of BST
• Recursive formula for constructing
OBST
• Example of Constructing OBST using
DP
1 Backtracking
1. incremental,
2. divide-and-conquer,
3. dynamic programming,
4. greedy.
Consider the problem of navigating a maze. In general, one does not know
the structure of the maze when beginning. In fact, a maze may be con-
structed not to have any particular structure. So there is no obvious way to
break down the problem into smaller pieces.
One way to solve a maze is to try every potential solution in the search
space, i.e. every possible sequence of steps from the maze entrance to its
exit. In order to narrow down and systematize the search, we will do the
following: if we hit a dead end, then we back up until the last choice that
we made and make another choice.
1.2 A definition
1
• backing up to a smaller partial solution when the options for extending
a partial solution are exhausted.
2 n-Queens problem
Input: An integer n ≥ 4.
Recall that a queen may move any length along a diagonal or any length
along a row or column. So, for example, if n = 4, a solution is (1, 3), (2, 1), (3, 4)
and (4, 2).
In order to use backtracking for this problem, we need to describe what a
potential solution looks like.
Note that an exhaustive search will check (n2 )n possible solutions, one for
each set of n coordinates. Let us use backtracking to narrow down the
search space. To do this, we need to develop tests that check whether a
partial solutions cannot be extended to a complete solution.
We can make several observations that eliminate some of the possible partial
solutions:
2
1. There must be exactly one queen per row. This is because two queens
in a row would allow them to attack each other, while fewer than one
queen per row would not allow n queens to be located on the board.
This means that the solutions must have the form:
2. There must be exactly one queen per column, for the same reason
as above. This means that (y1 , y2 , ..., yn ) must be a permutation of
(1, 2, ..., n).
The search space has thus been decreases to ”only” n! permutations.
Step i: Given the positions of the first i − 1 queens (1, y1 ), (2, y2 ), ..., (i −
1, yi−1 ), choose the position for (i, yi ). If you cannot complete Step
i for some choice of previous positions (1, y1 ), (2, y2 ), ..., (i − 1, yi−1 ),
then backtrack and re-choose the position for (i − 1, yi−1 ).
Step 1: Place a queen in the 1st row. The choices are (1, 1), (1, 2), (1, 3), (1, 4).
We first choose (1, 1)
3
Step 2: Place a queen in the 2nd row. Given our choise of (1, 1), we
cannot choose either (2, 1) [same column] or (2, 2) [same diagonal].
The remaining choices are (2, 3), (2, 4).
Step 3: Given our choices of (1, 1), (2, 3) we cannot choose (3, 1) [same
column], (3, 2) [same diagonal], (3, 3) [same column], or (3, 4) [same
diagonal]. So we eliminate choice (1, 1), (2, 3).
Step 2: Given our choice of (1, 1), the remaining choice in row 2 is (2, 4).
Step 3: Given the choice (1, 1), (2, 4), we cannot choose (3, 1) [same col-
umn], (3, 3) [same diagonal], (3, 4) [same column]. Our only choice is
(3, 2).
Step 4: Given our partial solution (1, 1), (2, 4), (3, 2), we cannot choose
any of (4, 1) [same column], (4, 2) [same diagonal], (4, 3) [same diago-
nal] or (4, 4) [same column]. Since we have no more choices in step 3
and step 2, we backtrack to step 1 and eliminate partial solution (1, 1).
Step 2: given our choice of (1, 2), we cannot place a queen in (2, 1) [same
diagonal], (2, 2) [same column] or (2, 3) [same diagonal]. That leaves
only (2, 4) as a choice.
Step 3: Given choice (1, 2), (2, 4), we cannot place a queen in (3, 2) [same
column], (3, 3) [same diagonal] or (3, 4) [same column]. This leaves
only choice (3, 1).
Step 4: Given choice (1, 2), (2, 4), (3, 1), we cannot place a queen in (4, 1)
[same column], (4, 2) [same column], or (4, 4) [same column]. This
leaves only choice (4, 3).
A solution is thus: (1, 2), (2, 4), (3, 1), (4, 3). Draw the tree representing our
backtracking steps as an exercise!
2.4 Algorithms
4
// extends a partial solution of size k-1 to one of size k,
// unless the partial solution is a complete solution, in
// which case it is output; the initial call is NQueens(Q,1,N)
NQueens(Q,k,n)
if k=n+1 then
output Q[1..n]
quit // if only one solution is needed
for i = 1 to n
Q[k] = i
if Feasible(Q, k)
then NQueens(Q, k+1, n)
Feasible(Q, k)
for i = 1 to k-1
if Q[k] = Q[i] or |Q[i] - Q[k]| = |i-k| then
return false
return true
Backtrack(partial solution)
if partial solution is a solution then
output solution
return (or quit)
for all possible extensions of partial solution
if extension is feasible then
Backtrack(extension)
4 Graph Coloring
5
Output: An assignment of one of k colors to each node, such that adjacent
nodes get different colors.
For example, let G = (V, E) where V = {1, 2, 3, 4} and E = {(1, 2), (2, 3), (2, 4), (3, 4)}
and suppose that k = 3. A valid coloring c of G is: c(1) = R, c(2) =
G, c(3) = B, c(4) = R.
node p q r s t u
1 R R G G B B
2 G B B R R G
3 B G R B G R
4 R R G G B B
Note that all these colorings are sort of equivalent. They all share the
following structure:
• The same color is used for both node 1 and node 4. For colorings p
and q, it is R, for colorings r and s, it is G, and for colorings t and u,
it is B.
6
• Nodes 2 and 3 must have distinct colors different from each other and
from the color used for nodes 1 and 4.
We will use the following strategy to find all valid colorings of a graph
G = (V, E):
4. If there is no possible color for the i-th node given the previous choices,
backtrack to a previous solution.
Consider the graph G given earlier, where E = {(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)}
and k = 3.
Step 2: Given partial coloring (R), we choose a color for node 2. It can
be one of: G or B. Say we choose G.
Step 3: Given partial coloring (R, G), we choose a color for node 3. It
cannot be either R or G, so it must be B since k = 3.
Step 4: Given partial coloring (R, G, B), we choose a color for node 4. It
cannot be B or G, so it must be R. This gives the coloring (R, G, B, R)
which is coloring p.
We have no more choices of colors in step 4, and in step 3. We have
one choice in step 2.
Step 2: Given partial coloring (R), we choose a different color for node 2.
We choose B for node 2.
7
Step 3: Given partial coloring (R, B), we choose a color for node 3. It
cannot be R nor B, so it must be G.
Step 4: Given partial coloring (R, B, G), we choose a color for node 4. It
cannot be B or G, so it must be R. This gives the coloring (R, B, G, R)
which is coloring q.
We have no more choices of colors in steps 4, 3 and 2. We go back to
step 1.
Step 1: We choose a different color for node 1, say B. This will produce
a branch in the tree equivalent to the first branch where R and B are
switched. Thus we will get the colorings t and u.
If we choose G for node 1 then, we will again get a branch equivalent
to the first one with R and G swapped. This will produce the colorings
r and s.
4.3 Algorithm
We now give a recursive version of the graph coloring algorithm. Let C[1...j−
1] be a partial coloring for the first j − 1 nodes.
Color(C,j,k,n)
if j = n+1 then
output C
return or quit
for i = 1 to k
C[j] = i
if valid(C,j,n) then
Color(C,j+1,k,n)
where
Valid(C,j,n)
for all neighbors v of j with v < j
if C[v] = C[j] then
return false
return true
8
4.4 Pruning
If we are simply looking for a single solution, we can cut off the equivalent
branches of the tree to save time. For example, the three main branches of
the backtracking tree obtained in section 4.2 all gave equivalent solutions.
We need only consider the first branch if we want a single solution.
The following algorithm prunes the tree to remove equivalent branches. It
uses the following strategy:
ColorP(C,j,last,k,n)
if j = n+1 then
output C
return or quit
for i = 1 to last //try old colors first
C[j] = i
if valid(C,j,n) then
ColorP(C,j+1,last,k,n)
if last < k then
C[j] = last + 1
ColorP(C,j+1,last+1,k,n)
5 Subset sum
You solved this problem using dynamic programming. Let us apply back-
tracking to it.
9
5.1 Potential solutions
k
X
xi ti > M
i=1
k
X n
X
xi ti + ti < M.
i=1 i=k+1
10
Step 2: Choose 7 or not?
Suppose we don’t and get partial solution (0, 0).
Step 3: Suppose we choose 6 and get partial solution (0, 1, 1). Since 7+6 =
13 > 10, this is not a solution.
We have no more choices to make in steps 3 and 2.
11
Step 3: Choose 6 or not?
Suppose we don’t and get partial solution (1, 0, 0). Since 4 + 3 + 1 =
8 < 10, this partial solution won’t lead to a solution, so we prune.
Suppose we choose 6, and get solution (1, 0, 1, 0, 0, 0).
We have no more choices to make in step 3.
Step 2: Suppose we choose 7 and get partial solution (1, 1). Since 4 + 7 =
11 > 10, this is not a solution.
SubsetSum(S,T,k,n)
if k = n+1 then
output S
return
S[k] = 0
if InRange(S,T,k,n) then
SubsetSum(S, k+1, n)
S[k] = 1
if InRange(S,T,k,n) then
SubsetSum(S, k+1, n)
where
InRange(S,T,k,n)
S1 = Sum_{i=1}^{k} T[i]S[i]
if S1 > M then
return false
S2 = Sum_{i=k+1}^{n} T[i]
if S1 + S2 < M then
return false
return true
12
String Matching
T 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
mod 13
8
9
3
A few calculations…
• For a pattern P [1..m], let p denote its corresponding
value in radix-d notation.
• Using Horner’s rule, p can be computed in time ϴ(m).
p = P [m] + d(P [m – 1] + d(P [m – 2] + … + d(P [2] + dP[1]) …)).
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Step 1: s = 0, t0 = 8, p = 7.
p == t0 → No.
s < 14 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 13)
t1 = 10 (8 – 3 (2)) + 2 (mod 13)
t1 = 10 (2) + 2 (mod 13) = 22 (mod 13) = 9.
Contd…
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Step 2: s = 1, t1 = 9, p = 7.
p == t1 → No.
s < 14 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 13)
t2 = 10 (9 – 3 (3)) + 3 (mod 13)
t2 = 10 (0) + 3 (mod 13) = 3 (mod 13) = 3.
Contd…
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Step 3: s = 2, t2 = 3, p = 7.
p == t2 → No.
s < 14 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 13)
t3 = 10 (3 – 3 (5)) + 1 (mod 13) = 10 (-12) + 1 (mod 13)
Because -12 mod 13 = 1 t3 = 10 (1) + 1 (mod 13) = 11 (mod 13) = 11.
Contd…
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Step 4: s = 3, t3 = 11, p = 7.
p == t3 → No.
s < 14 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 13)
t4 = 10 (11 – 3 (9)) + 4 (mod 13) = 10 (-16) + 4 (mod 13)
Because -16 mod 13 = 10 t4 = 10 (10) + 4 (mod 13) = 104 (mod 13) = 0.
Contd…
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Step 5: s = 4, t4 = 0, p = 7.
p == t4 → No.
s < 14 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 13)
t5 = 10 (0 – 3 (0)) + 1 (mod 13)
t5 = 1 (mod 13) = 1.
Contd…
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Step 6: s = 5, t5 = 1, p = 7.
p == t5 → No.
s < 14 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 13)
t6 = 10 (1 – 3 (2)) + 5 (mod 13) = 10 (-5) + 5 (mod 13)
Because -5 mod 13 = 8 t6 = 10 (8) + 5 (mod 13) = 85 (mod 13) = 7.
Contd…
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Step 7: s = 6, t6 = 7, p = 7.
p == t6 → Yes.
Character by character matching p[1..5] == T[7..11].
{3 1 4 1 5} == {3 1 4 1 5}. Match, hence s = 6 is a valid shift.
s < 14 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 13)
t7 = 10 (7 – 3 (3)) + 2 (mod 13) = 10 (-2) + 2 (mod 13)
Because -2 mod 13 = 11 t = 10 (11) + 2 (mod 13) = 112 (mod 13) = 8.
7
Contd…
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Step 8: s = 7, t7 = 8, p = 7.
p == t7 → No.
s < 14 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 13)
t8 = 10 (8 – 3 (1)) + 6 (mod 13)
t8 = 10 (5) + 6 (mod 13) = 56 (mod 13) = 4.
Contd…
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Step 9: s = 8, t8 = 4, p = 7.
p == t8 → No.
s < 14 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 13)
t9 = 10 (4 – 3 (4)) + 7 (mod 13) = 10 (-8) + 7 (mod 13)
Because -8 mod 13 = 5 t9 = 10 (5) + 7 (mod 13) = 57 (mod 13) = 5.
Contd…
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Step 10: s = 9, t9 = 5, p = 7.
p == t9 → No.
s < 14 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 13)
t10 = 10 (5 – 3 (1)) + 3 (mod 13)
t10 = 10 (2) + 3 (mod 13) = 23 (mod 13) = 10.
Contd…
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Step 13: s = 12, t12 = 7, p = 7.
p == t12 → Yes.
Character by character matching p[1..5] == T[13..17].
{3 1 4 1 5} == {6 7 3 9 9}. Mismatch occurs at first character.
s < 14 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 13)
t13 = 10 (7 – 3 (6)) + 2 (mod 13) = 10 (-11) + 2 (mod 13)
Because -11 mod 13 = 2 t
13 = 10 (2) + 2 (mod 13) = 22 (mod 13) = 9.
Contd…
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Text (T) 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Pattern: 3 1 4 1 5
Σ = {0, 1, …, 9} d = |Σ| = 10 q = 13
• n = 19, m = 5, n – m = 14.
• h = 105-1 mod 13 = 104 mod 13 = 3.
• p = (3 × 104 + 1 × 103 + 4 × 102 + 1 × 101 + 5 × 100) mod 13 = 7.
• t0 = (2 × 104 + 3 × 103 + 5 × 102 + 9 × 101 + 0 × 100) mod 13 = 8.
Text (T) a
2
a
3
b
4
b
5
c
6
a
7
b
8
a
ASCII 97 97 98 98 99 97 98 97
• n = 8, m = 3, n – m = 5. Pattern: c a b
• h = 263-1 mod 3
= 262 mod 3 = 1.
Σ = {a, b, …, z} d = |Σ| = 26 q=3
• p = (99 × 262 + 97 × 261 + 98 × 260) mod 3 = 1.
• t0 = (97 × 262 + 97 × 261 + 98 × 260) mod 3 = 2.
Step 1: s = 0, t0 = 2, p = 1.
p == t0 → No.
s < 5 → Yes.
ts +1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 3)
t1 = 26 (2 – 1 (97)) + 98 (mod 3)
t1 = 26 (2 – 1 (1)) + 2 (mod 3)
(because 97 mod 3 = 1 and 98 mod 3 = 2)
= 26 (1) + 2 (mod 3)
= 28 (mod 3) = 1.
Index
Contd… 1
Text (T) a
2
a
3
b
4
b
5
c
6
a
7
b
8
a
ASCII 97 97 98 98 99 97 98 97
Pattern: c a b
Step 2: s = 1, t1 = 1, p = 1.
p == t1 → Yes.
Character by character matching p[1..3] == T[2..4].
{c a b} == {a b b}. Mismatch occurs at first character.
s < 5 → Yes.
ts + 1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 3)
t2 = 26 (1 – 1 (97)) + 99 (mod 3)
t2 = 26 (1 – 1 (1)) + 0 (mod 3)
(because 97 mod 3 = 1 and 99 mod 3 = 0)
= 26 (0) (mod 3)
= 0.
Index
Contd… 1
Text (T) a
2
a
3
b
4
b
5
c
6
a
7
b
8
a
ASCII 97 97 98 98 99 97 98 97
Pattern: c a b
Step 3: s = 2, t2 = 0, p = 1.
p == t2 → No.
s < 5 → Yes.
ts + 1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 3)
t3 = 26 (0 – 1 (98)) + 97 (mod 3)
t3 = 26 (0 – 1 (2)) + 1 (mod 3)
(because 97 mod 3 = 1 and 98 mod 3 = 2)
= 26 (0 – 2) + 1 (mod 3)
= 26 (0 + 1) + 1 (mod 3)
(because 3's complement of -2 = 1)
= 26 (1) + 1 (mod 3) = 27 (mod 3) = 0.
Index
Contd… 1
Text (T) a
2
a b
3 4
b
5
c
6
a
7
b
8
a
ASCII 97 97 98 98 99 97 98 97
Pattern: c a b
Step 4: s = 3, t3 = 0, p = 1.
p == t3 → No.
s < 5 → Yes.
ts + 1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 3)
t4 = 26 (0 – 1 (98)) + 98 (mod 3)
t4 = 26 (0 – 1 (2)) + 2 (mod 3)
(because 98 mod 3 = 2)
= 26 (0 – 2) + 2 (mod 3)
= 26 (0 + 1) + 2 (mod 3)
(because 3's complement of -2 = 1)
= 26 (1) + 2 (mod 3) = 28 (mod 3) = 1.
Index
Contd… 1
Text (T) a
2
a
3
b
4
b
5
c
6
a
7
b
8
a
ASCII 97 97 98 98 99 97 98 97
Pattern: c a b
Step 5: s = 4, t4 = 1, p = 1.
p == t4 → Yes.
Character by character matching p[1..3] == T[5..7].
{c a b} == {c a b}. Match, hence s = 4 is a valid shift.
s < 5 → Yes.
ts + 1 = d (ts – hT [s + 1]) + T [s + m + 1] (mod 3)
t5 = 26 (1 – 1 (99)) + 97 (mod 3)
t5 = 26 (1 – 1 (0)) + 1 (mod 3)
(because 97 mod 3 = 1 and 99 mod 3 = 0)
= 26 (1 – 0) + 1 (mod 3)
= 26 (1) + 1 (mod 3) = 27 (mod 3) = 0.
Index
Contd… 1
Text (T) a
2
a
3
b
4
b
5
c
6
a
7
b
8
a
ASCII 97 97 98 98 99 97 98 97
Pattern: c a b
Step 6: s = 5, t5 = 0, p = 1.
p == t5 → No.
s < 5 → No.
Step 7: s = 6.
Loop terminates.
Text (T) a a b b c a b a
2 1 0 0 1 0
Complexity
• Takes ϴ(m) preprocessing time.
• Worst-case running time is O(m (n – m + 1)).
– Example: P = am and T = an, each of the [n – m +
1] possible shifts is valid.
• In many applications, there are a few valid shifts
(say some constant c). In such applications, the
expected matching time is only O(n – m + 1) + cm),
plus the time required to process spurious hits.
Contd…
• Probabilistic analysis
– The probability of a false positive hit for a random
input is 1/q.
– The expected number of false positive hits is O(n/q).
– The expected run time is O(n) + O(m(v + n/q))), if v is
the number of valid shifts.
• Choosing q ≥ m and having only a constant number of
hits, then the expected matching time is O(n + m).
• Since m ≤ n, this expected matching time is O(n).
Knuth-Morris-Pratt Algorithm
• Based on the concept of prefix function for a
pattern.
– Encapsulates knowledge about how the pattern
matches against shifts of itself.
– This information can be used to avoid testing of
invalid shifts.
• T: b a c b a b a b a a b c b a b
• P: ababaca
Contd…
Contd…
k→
• P: a b a b a c a
a b a b a c a
Contd… n = 15
m=7
b a c b a b a b a b a c a a b
a b a b a c a
Contd… n = 15
m=7
a b a b a c a
a b a b a c a
Contd… n = 15
m=7
b a c b a b a b a b a c a a b
a b a b a c a
Contd… n = 15
m=7
b a c b a b a b a b a c a a b
a b a b a c a
Contd… n = 15
m=7
b a c b a b a b a b a c a a b
a b a b a c a
Contd… n = 15
m=7
b a c b a b a b a b a c a a b
a b a b a c a
Contd… n = 15
m=7
b a c b a b a b a b a c a a b
a b a b a c a
Contd… n = 15
m=7
b a c b a b a b a b a c a a b
a b a b a c a
Contd… n = 15
m=7
a b a b a c a
(if) P[4] == T[10]. Match. q++ = 4
b a c b a b a b a b a c a a b
Shifts skipped = 1 and first three
characters are not compared.
a b a b a c a
Comparison starts from P[4].
Contd… n = 15
m=7
b a c b a b a b a b a c a a b
a b a b a c a
Contd… n = 15
m=7
b a c b a b a b a b a c a a b
a b a b a c a
Contd… n = 15
m=7
a b a b a c a
• q == m. Yes.
– Pattern occurs with shift i – m =13 – 7 = 6.
– q = Π[q] = 1.
Contd… n = 15
m=7
a b a b a c a
Contd… n = 15
m=7
a b a b a c a
Step 16: i = 16
Loop terminates.
Boyer-Moore algorithm
• An efficient string searching algorithm that has
been the standard benchmark for practical string
search literature.
• Developed by Robert S. Boyer and J Strother
Moore in 1977
• A simplified version of it or the entire algorithm is
often implemented in text editors for the “search”
and “substitute” commands.
• The reason is that it works the fastest when the
alphabet is large and the pattern is relatively long.
Contd…
• The execution time of the Boyer-Moore algorithm
can be sub-linear.
– It does not need to check every character of the
text, but rather skips over some of them.
• The BM algorithm gets faster as the pattern
becomes longer.
– It utilizes the information gained from each
unsuccessful attempt to rule out as many
positions as possible of the text where the strings
cannot match.
The main ideas…
• Scan the pattern backwards (right to left).
– The strings are matched from the end of P to the start
of P, i.e. P[m], P[m-1], …P[1].
• The bad character shift rule.
– Avoids repeating unsuccessful comparisons against a
target character (from the text).
• The good suffix shift rule.
– Aligns only matching pattern characters against target
characters already successfully matched.
• Either rule works alone, but they are more effective
together.
Backward Scan
• Align the start of P with the start of T.
• Characters in P and T are then compared starting at
index m in P and k in T, moving backward.
• The comparisons continue until either the
beginning of P is reached (which means there is a
match) or a mismatch occurs.
• Example:
– T: This picture shows a nice view of the park.
– P: future
Contd…
• T: This picture shows a nice view of the park.
• P: future
1. P[6] == T[12] = e, 2. P[5] == T[11] = r,
3. P[4] == T[10] = u, 4. P[3] == T[9] = t,
5. P[2] == T[8] → u ≠ c
– Mismatch. Shift P to the right (relative to the text)
and start comparisons again from the right end.
• The length of shift is given by the two rules:
– The bad character shift rule.
– The good suffix shift rule.
Bad Character Shift Rule
• Upon mismatch, skip alignments until
– Mismatch becomes a match, or
– P moves past the mismatched character.
1. T: G C T T C T G C T A C C T T T T G C G C G C G C G C G G A A
P: C C T T T T G C (can be a match as C exists at position 2)
2. T: G C T T C T G C T A C C T T T T G C G C G C G C G C G G A A
P: Gs =C3T C C T T T T G C (A does not exist in pattern)
3. T: G C T T C T G C T A C C T T T T G C G C G C G C G C G G A A
s = 10
P: G C T T C TGCTTCCTTTTGC
Contd…
• Assume that a mismatch occurs between characters
P[i] (= a) and T[i + j] (= b) during an attempt at shift j.
– That is, P[i + 1 .. m] = T[i + j + 1 .. j + m] = u and P[i] ≠ T[i + j].
If the right-most
occurrence of b in P[1..m-
1] is at k > i, the pattern is
shifted to the right by one.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
T: G C T T C T G C T A C C T T T T G C G C G C G C G C G G A A
P: C C T T T T G C
j = 0. Shift = 0 + max {1, i - R(b)} = 0 + max {1, 5 – R(C)}
= 0 + max { 1, 5 – 2} = 0 + 3 = 3.
T: G C T T C T G C T A C C T T T T G C G C G C G C G C G G A A
P: CC T T T TGC
j = 3. Shift = 3 + max {1, i - R(b)} = 3 + max {1, 7 – R(A)}
= 3 + max { 1, 7 – 0} = 3 + 7 = 10.
T: G C T T C T G C T A C C T T T T G C G C G C G C G C G G A A
P: CC T T T TGC
Good Suffix Shift Rule
1. T: C G T G C C T A C T T A C T T A C T T A C T T A C G C G A A
P: C T T A C T T A C
(TAC can be aligned with the rightmost occurrence TAC)
2. T: C G T G C C T A C T T A C T T A C T T A C T T A C G C G A A
s=4
P: CTTACTTAC
(TACTTAC has no rightmost occurrence, go for longest prefix-
suffix match alignment in between CTTACTTAC and TACTTAC)
3. T: C G T G C C T A C T T A C T T A C T T A C T T A C G C G A A
s=8
P: CTTACTTAC
Contd…
• Let P[i .. m] = T[i + j .. j + m] = u and P[i - 1] ≠ T[i + j - 1].
• The good-suffix shift consists in aligning the segment u = T[i
+ j .. j + m] (= P[i .. m]) with its rightmost occurrence in
pattern (which is not a suffix) that is preceded by a
character different from P[i – 1].
• The original weaker version of this rule did not require that
the character preceding the rightmost occurrence of u
should not be P[i – 1].
Contd…
• If there exists no such segment, the shift consists in
aligning the longest suffix v of T[i + j .. j + m] with a
matching prefix of P.
Contd…
• If no such shift is possible, i.e., the longest suffix v
of T[i + j .. j + m] which matches a prefix of P, is
empty, then shift P to the right by P.length (= m).
Example – 1
1. T: C G T G C C T A C T T A C T T A C T T A C T T A C G C G A A
P: C T T A C T T A C
2. T: C G T G C C T A C T T A C T T A C T T A C T T A C G C G A A
P: CTTACTTAC
Example – 2
1...3456789……
• In this case,
– The character sequence "led" is not appearing
again in pattern.
– The longest suffix matching also results in an
empty string.
• So shift the pattern completely.
Preprocessing (Compute L(i))
• For i = 2,.., m + 1, compute L(i) as follows:
– L(i) is the largest position, less than m, such that P[i..m]
matches a suffix of P[1..L(i)] and, moreover, this suffix is
not preceded by character P[i-1].
• If no such copy of P[i..m] exists, then let L(i) = 0.
Contd… (Compute L(i))
• Since P[m + 1..m] = ε, L(m + 1) is the right-most
position j such that P[j] ≠ P[m] (or 0 if all characters
are equal).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
T: G T T A T A G C T G A T C G C G G C G T A G C G G C G A A
P: G T A G C G G C G
Step 1: k = 9. Mismatch at i = 9.
Shift using BC = max{ 1, i – R(T)} = max(1,9-2) = 7.
Shift using GS = 1 (first character mismatched)
Maximum shift = 7.
k = k + 7 = 9 + 7 = 16
A C G T 2 3 4 5 6 7 8 9 10
Contd… R: 3 8 7 2 L: 0 0 0 0 0 6 0 7 8
I: 1 1 1 1 1 1 1 1 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
T: G T T A T A G C T G A T C G C G G C G T A G C G G C G A A
P: G T AG C GG C G
1 2 3 4 5 6 7 8 9 i
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
T: G T T A T A G C T G A T C G C G G C G T A G C G G C G A A
P: G T AG C GG C G
1 2 3 4 5 6 7 8 9 i
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
T: G T T A T A G C T G A T C G C G G C G T A G C G G C G A A
P: G T AG C GG C G
1 2 3 4 5 6 7 8 9 i
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: G C A T C G C A G A G A G T A T A C A G T A C G
P: G C A G A G A G
Step 1: k = 8. Mismatch at i = 8.
Shift using BC = max{ 1, i – R(A)} = max(1,8-7) = 1.
Shift using GS = 1 (first character mismatched)
Maximum shift = 1.
k=k+1=8+1=9
A C G T 2 3 4 5 6 7 8 9
Contd… R: 7 2 6 0 L: 0 0 0 6 0 4 1 7
I: 1 1 1 1 1 1 1 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: G C A T C G C A G A G A G T A T A C A G T A C G
P: G C AGAGAG
1 2 3 4 5 6 7 8 i
Step 2: k = 9. Mismatch at i = 6.
Shift using BC = max{ 1, i – R(C)} = max(1,6-2) = 4.
Shift using GS:
As L(i + 1) = L(7) = 4 ≠ 0.
Thus, shift = m – L(i + 1) = 8 – 4 = 4.
Maximum shift = 4.
k = k + 4 = 9 + 4 = 13
A C G T 2 3 4 5 6 7 8 9
Contd… R: 7 2 6 0 L: 0 0 0 6 0 4 1 7
I: 1 1 1 1 1 1 1 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: G C A T C G C A G A G A G T A T A C A G T A C G
P: G C AGAGAG
1 2 3 4 5 6 7 8 i
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: G C A T C G C A G A G A G T A T A C A G T A C G
P: G C AGAGAG
1 2 3 4 5 6 7 8 i
Step 4: k = 20. i = 6.
Shift using BC = max{ 1, i – R(C)} = max(1,6-2) = 4.
Shift using GS:
As L(i + 1) = L(7) = 4 ≠ 0.
Thus, shift = m – L(i + 1) = 8 – 4 = 4.
Maximum shift = 4.
k = k + 4 = 20 + 4 = 24
A C G T 2 3 4 5 6 7 8 9
Contd… R: 7 2 6 0 L: 0 0 0 6 0 4 1 7
I: 1 1 1 1 1 1 1 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: G C A T C G C A G A G A G T A T A C A G T A C G
P: G C AGAGAG
1 2 3 4 5 6 7 8 i
Step 5: k = 24. i = 7.
Shift using BC = max{ 1, i – R(C)} = max(1,7-2) = 5.
Shift using GS:
As L(i + 1) = L(8) = 1 ≠ 0.
Thus, shift = m – L(i + 1) = 8 – 1 = 7.
Maximum shift = 7.
k = k + 7 = 24 + 7 = 31
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
• For Rabin-Karp
– Σ = {0, 1}, q = 13, X = 0, Y=1
Naïve approach
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
P: X Y X Y Y X Y X Y X X
P: X Y X Y Y X Y X Y X X
P: X Y X Y Y X Y X Y X X
P: X Y X Y Y X Y X Y X X
P: X Y X Y Y X Y X Y X X
P: X Y X Y Y X Y X Y X X
P: X Y X Y Y X Y X Y X X
P: Total shifts X Y X Y Y X Y X Y X X
= 24 – 11 + 1 = 14
P: X Y X Y Y X Y X Y X X
P: Total character comparisons X Y X Y Y X Y X Y X X
P: = 4 + 1 + 2 + 5 + 1 + 11 + 1 X Y X Y Y X Y X Y X X
+ 3 + 1 + 1 + 5 + 1 + 11 + 1
P: X Y X Y Y X Y X Y X X
= 48
P: X Y X Y Y X Y X Y X X
Σ = {0, 1}, q = 13, X = 0, Y=1
Rabin–Karp h = 211-1 mod 13 = 10
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
0 1 0 0 1 0 1 0 1 1 0 1 0 1 0 1 1 0 1 0 1 0 0 1
P: X Y X Y Y X Y X Y X X • p = (29 + 27 + 26 + 24+ 22) mod 13 = 9
0 1 0 1 1 0 1 0 1 0 0 • T[1..11] = t0 = (29 + 26 + 24 + 22+ 21) mod 13 = 0
• T[2..12] = t1 = (2(0 – 10.0) + 1) mod 13 = 1
• T[3..13] = t2 = (2(1 – 10.1) + 0) mod 13 = 8
• p = 9.
• T[4..14] = t3 = (2(8 – 10.0) + 1) mod 13 = 4
• t7 = 9. Compare T[8..18] with P[1..11].
• T[5..15] = t4 = (2(4 – 10.0) + 0) mod 13 = 8
– T[8] = P[1] = X
– T[9] = P[2] = Y • T[6..16] = t5 = (2(8 – 10.1) + 1) mod 13 = 10
– T[10] = Y and P[3] = X. Mismatch • T[7..17] = t6 = (2(10 – 10.0) + 1) mod 13 = 8
• t12 = 9. Compare T[13..23] with P[1..11]. • T[8..18] = t7 = (2(8 – 10.1) + 0) mod 13 = 9
– Match. • T[9..19] = t8 = (2(9 – 10.0) + 1) mod 13 = 6
• T[10..20] = t9 = (2(6 – 10.1) + 0) mod 13 = 5
Total shifts = 24 – 11 + 1 = 14 • T[11..21] = t10 = (2(5 – 10.1) + 1) mod 13 = 4
• T[12..22] = t11 = (2(4 – 10.0) + 0) mod 13 = 8
Total character comparisons = 3 + 11 = 14 • T[13..23] = t12 = (2(8 – 10.1) + 0) mod 13 = 9
• T[14..24] = t13 = (2(9 – 10.0) + 1) mod 13 = 6
KMP
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
Π: 0 0 1 2 0 1 2 3 4 3 1
• i = 1, q = 0. T[1] = P[1] (Shift = 0). q = 1.
• i = 2, q = 1. T[2] = P[2] (Shift = 0). q = 2.
• i = 3, q = 2. T[3] = P[3] (Shift = 0). q = 3.
• i = 4, q = 3. T[4] ≠ P[4] (Shift = 0). q = Π[3] = 1.
T[4] ≠ P[2] (Shift = 2). q = Π[1] = 0.
T[4] = P[1] (Shift = 3). q = 1.
• i = 5, q = 1. T[5] = P[2] (Shift = 3). q = 2.
• i = 6, q = 2. T[6] = P[3] (Shift = 3). q = 3.
• i = 7, q = 3. T[7] = P[4] (Shift = 3). q = 4.
• i = 8, q = 4. T[8] ≠ P[5] (Shift = 3). q = Π[4] = 2.
T[8] = P[3] (Shift = 5). q = 3.
• i = 9, q = 3. T[9] = P[4] (Shift = 5). q = 4.
• i = 10, q = 4. T[10] = P[5] (Shift = 5). q = 5.
Contd…
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
Π: 0 0 1 2 0 1 2 3 4 3 1 • i = 23, q = 10.
• i = 11, q = 5. T[11] = P[6] (Shift = 5). q = 6. T[23] = P[11] (Shift = 12). q = 11.
• i = 12, q = 6. T[12] = P[7] (Shift = 5). q = 7. Pattern found at shift 12. q = Π[11] = 1.
• i = 13, q = 7. T[13] = P[8] (Shift = 5). q = 8. • i = 24, q = 1.
• i = 14, q = 8. T[14] = P[9] (Shift = 5). q = 9. T[24] = P[2] (Shift = 22). q = 2.
• i = 15, q = 9. T[15] = P[10] (Shift = 5). q = 10.
• i = 16, q = 10. T[16] ≠ P[11] (Shift = 5). q = Π[10] = 3.
T[16] = P[4] (Shift = 12). q = 4.
• i = 17, q = 4. T[17] = P[5] (Shift = 12). q = 5.
• i = 18, q = 5. T[18] = P[6] (Shift = 12). q = 6. Total shifts = 6
• i = 19, q = 6. T[19] = P[7] (Shift = 12). q = 7.
• i = 20, q = 7. T[20] = P[8] (Shift = 12). q = 8. Total character comparisons = 28
• i = 21, q = 8. T[21] = P[9] (Shift = 12). q = 9.
• i = 22, q = 9. T[22] = P[10] (Shift = 12). q = 10.
2 3 4 5 6 7 8 9 10 11 12
Boyer–Moore L: 0 0 0 0 0 0 0 0 0 10 9
X Y I: 1 1 1 1 1 1 1 1 1 1 0
R: 10 9
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
• Step 1: k = 11. Shift = 0.
BC = max(1,10 – 9) = 1; GS = 11 – 10 = 1;
k = 11 + 1 = 12. Shift = 0 + 1 = 1.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
• Step 2: k = 12. Shift = 1.
BC = max(1,11 – 9) = 2; GS = 1;
k = 12 + 2 = 14. Shift = 1 + 2 = 3.
X Y 2 3 4 5 6 7 8 9 10 11 12
Contd… R: 10 9 L: 0 0 0 0 0 0 0 0 0 10 9
I: 1 1 1 1 1 1 1 1 1 1 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
• Step 3: k = 14. Shift = 3.
BC = max(1,11 – 9) = 2; GS = 1;
k = 14 + 2 = 16. Shift = 3 + 2 = 5.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
• Step 4: k = 16. Shift = 5.
BC = max(1,11 – 9) = 2; GS = 1;
k = 16 + 2 = 18. Shift = 5 + 2 = 7.
X Y 2 3 4 5 6 7 8 9 10 11 12
Contd… R: 10 9 L: 0 0 0 0 0 0 0 0 0 10 9
I: 1 1 1 1 1 1 1 1 1 1 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
• Step 5: k = 18. Shift = 7.
BC = max(1,10 – 9) = 1; GS = 11 – 10 = 1;
k = 18 + 1 = 19. Shift = 7 + 1 = 8.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
• Step 6: k = 19. Shift = 8.
BC = max(1,11 – 9) = 2; GS = 1;
k = 19 + 2 = 21. Shift = 8 + 2 = 10.
X Y 2 3 4 5 6 7 8 9 10 11 12
Contd… R: 10 9 L: 0 0 0 0 0 0 0 0 0 10 9
I: 1 1 1 1 1 1 1 1 1 1 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
• Step 7: k = 21. Shift = 10.
BC = max(1,11 – 9) = 2; GS = 1;
k = 21 + 2 = 23. Shift = 10 + 2 = 12.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T: X Y X X Y X Y X Y Y X Y X Y X Y Y X Y X Y X X Y
P: X Y X Y Y X Y X Y X X
• Step 8: k = 23. Shift = 12.
Pattern found. Total shifts = 8
k = 21 + 11 - 1 = 31. Shift = 12 + 10 = 22.
Total character
comparisons = 20