06DynamicProgrammingII 2x2
DYNAMIC PROGRAMMING II

SECTION 6.6

Lecture slides by Kevin Wayne
Copyright © 2005 Pearson-Addison Wesley
http://www.cs.princeton.edu/~wayne/kleinberg-tardos
Q. How similar are two strings?

Edit distance. [Levenshtein 1966, Needleman–Wunsch 1970]
・Gap penalty δ; mismatch penalty αpq.
・Cost = sum of gap and mismatch penalties.

Ex. ocurrance and occurrence.

o c u r r a n c e –        o c – u r r a n c e
o c c u r r e n c e        o c c u r r e n c e

Ex. An alignment of two DNA sequences:

C T – G A C C T A C G
C T G G A C G A A C G
[Figure: BLOSUM matrix for proteins.]

Dynamic programming: quiz 1

A. 1
B. 2
C. 3
D. 4
E. 5

P A L E T T E
P A L – A T E
1 mismatch, 1 gap
Goal. Given two strings x1 x2 ... xm and y1 y2 ... yn, find a min-cost alignment.

Def. An alignment M is a set of ordered pairs xi–yj such that each character appears in at most one pair and no crossings. (xi–yj and xi′–yj′ cross if i < i′ but j > j′.)

Def. The cost of an alignment M is:

cost(M) = Σ_{(xi, yj) ∈ M} α_{xi yj}  [mismatch]  +  Σ_{i : xi unmatched} δ + Σ_{j : yj unmatched} δ  [gap]

[Figure: an example alignment of x1 x2 x3 x4 x5 x6, shown as C T A C C – G with one gap.]

Def. OPT(i, j) = min cost of aligning prefix strings x1 x2 ... xi and y1 y2 ... yj.
Goal. OPT(m, n).

Case 1. OPT(i, j) matches xi–yj.
Pay mismatch for xi–yj + min cost of aligning x1 x2 ... xi–1 and y1 y2 ... yj–1.

Case 2a. OPT(i, j) leaves xi unmatched.
Pay gap for xi + min cost of aligning x1 x2 ... xi–1 and y1 y2 ... yj.

Case 2b. OPT(i, j) leaves yj unmatched.
Pay gap for yj + min cost of aligning x1 x2 ... xi and y1 y2 ... yj–1.
Sequence alignment: bottom-up algorithm

SEQUENCE-ALIGNMENT(m, n, x1, …, xm, y1, …, yn, δ, α)
FOR i = 0 TO m
    M[i, 0] ← i δ.
FOR j = 0 TO n
    M[0, j] ← j δ.
FOR i = 1 TO m
    FOR j = 1 TO n
        M[i, j] ← min { αxi yj + M[i – 1, j – 1],
                        δ + M[i – 1, j],              ← already computed
                        δ + M[i, j – 1] }.

Sequence alignment: traceback

Aligning PALETTE (rows) with PALATE (columns), with gap penalty δ = 2 and mismatch penalty 1:

      ε   P   A   L   A   T   E
  ε   0   2   4   6   8  10  12
  P   2   0   2   4   6   8  10
  A   4   2   0   2   4   6   8
  L   6   4   2   0   2   4   6
  E   8   6   4   2   1   3   4
  T  10   8   6   4   3   1   3
  T  12  10   8   6   5   3   2
  E  14  12  10   8   7   5   3

P A L E T T E
P A L – A T E
1 mismatch, 1 gap
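As a concrete sketch (not part of the original deck), the bottom-up recurrence above translates directly into Python; the penalties δ = 2 and mismatch cost 1 below reproduce the PALETTE/PALATE table.

```python
def sequence_alignment(x, y, delta, alpha):
    """Bottom-up DP: M[i][j] = OPT(i, j) = min cost of aligning x[:i] with y[:j]."""
    m, n = len(x), len(y)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        M[i][0] = i * delta                  # align x[:i] against the empty string
    for j in range(n + 1):
        M[0][j] = j * delta
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            M[i][j] = min(alpha(x[i-1], y[j-1]) + M[i-1][j-1],  # match/mismatch
                          delta + M[i-1][j],                     # leave x[i] unmatched
                          delta + M[i][j-1])                     # leave y[j] unmatched
    return M

# gap penalty 2, mismatch penalty 1 (matches cost 0), as in the table above
M = sequence_alignment("PALETTE", "PALATE", 2, lambda p, q: 0 if p == q else 1)
print(M[7][6])  # → 3  (1 mismatch + 1 gap)
```

A traceback from M[m][n] recovers the alignment itself, exactly as in the table.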
Theorem. The DP algorithm computes the edit distance (and an optimal alignment) of two strings of lengths m and n in Θ(mn) time and space.
Pf.
・Algorithm computes edit distance.
・Can trace back to extract optimal alignment itself. ▪

It is easy to modify the DP algorithm for edit distance to…
A. Compute edit distance in O(mn) time and O(m + n) space.
B. Compute an optimal alignment in O(mn) time and O(m + n) space.
C. Both A and B.
D. Neither A nor B.
Abstract

The edit distance (a.k.a. the Levenshtein distance) between two strings is defined as the minimum number of insertions, deletions or substitutions of symbols needed to transform one string into another. The problem of computing the edit distance between two strings is a classical computational task, with a well-known algorithm based on dynamic programming.

Sequence alignment in linear space
Hirschberg's algorithm

[Screenshot: D. S. Hirschberg, "A Linear Space Algorithm for Computing Maximal Common Subsequences," Communications of the ACM, 1975 (Programming Techniques, G. Manacher, Editor). The paper presents an algorithm that solves the longest-common-subsequence problem in quadratic time and linear space: assuming coefficients of 2 microseconds and 10 bytes, strings of length 1,000 require about 2 seconds and 10K bytes, and strings of length 10,000 a little over 3 minutes and 100K bytes.]

Edit distance graph.
・Let f(i, j) denote the length of a shortest path from (0, 0) to (i, j).
・Lemma: f(i, j) = OPT(i, j) for all i and j.
・Pf of Lemma. [by strong induction on i + j] The three edges entering node i–j have lengths αxi yj (diagonal), δ (down), and δ (right), so
  f(i, j) = min { αxi yj + f(i − 1, j − 1), δ + f(i − 1, j), δ + f(i, j − 1) }
          = min { αxi yj + OPT(i − 1, j − 1), δ + OPT(i − 1, j), δ + OPT(i, j − 1) }   [inductive hypothesis]
which is the Bellman equation for OPT(i, j). ▪
Hirschberg's algorithm

[Figures: the edit distance graph drawn as a grid with rows ε, x1, x2, x3 and columns ε, y1, …, y6; source node 0–0, destination node m–n, a generic interior node i–j, and edges of length δ (down and right) and αxi+1 yj+1 (diagonal).]
Hirschberg’s algorithm Hirschberg’s algorithm
Observation 1. The length of a shortest path that uses (i, j) is f (i, j) + g(i, j). Observation 2. let q be an index that minimizes f(q, n / 2) + g (q, n / 2).
Then, there exists a shortest path from (0, 0) to (m, n) that uses (q, n / 2).
n/2
ε y1 y2 y3 y4 y5 y6 ε y1 y2 y3 y4 y5 y6
ε 0–0 ε 0–0
x1 i–j x1 i–j q
x2 x2
x3 m–n x3 m–n
21 22
Divide. Find an index q that minimizes f(q, n / 2) + g(q, n / 2); save node (q, n / 2) as part of the solution.
Conquer. Recursively compute an optimal alignment in each piece.

Theorem. Hirschberg's algorithm uses Θ(m + n) space.
Pf.
・Each recursive call uses Θ(m) space to compute f(·, n / 2) and g(·, n / 2).
・Only Θ(1) space needs to be maintained per recursive call.
・Number of recursive calls ≤ n. ▪
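A sketch of the divide-and-conquer in Python (function names and the small-case fallback are my own; following the slides, the split is at column n / 2). `score` computes one row of f or g in linear space; the recursion glues the two optimal halves back together.

```python
def score(a, b, delta, alpha):
    """Cost of aligning all of a with each prefix of b, in O(len(b)) space."""
    prev = [j * delta for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * delta] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cur[j] = min(alpha(a[i-1], b[j-1]) + prev[j-1],
                         delta + prev[j],
                         delta + cur[j-1])
        prev = cur
    return prev

def nw_traceback(x, y, delta, alpha):
    """Quadratic-space alignment with traceback; used only for tiny base cases."""
    m, n = len(x), len(y)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1): M[i][0] = i * delta
    for j in range(n + 1): M[0][j] = j * delta
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            M[i][j] = min(alpha(x[i-1], y[j-1]) + M[i-1][j-1],
                          delta + M[i-1][j], delta + M[i][j-1])
    pairs, i, j = [], m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and M[i][j] == alpha(x[i-1], y[j-1]) + M[i-1][j-1]:
            pairs.append((x[i-1], y[j-1])); i -= 1; j -= 1
        elif i > 0 and M[i][j] == delta + M[i-1][j]:
            pairs.append((x[i-1], '-')); i -= 1
        else:
            pairs.append(('-', y[j-1])); j -= 1
    return pairs[::-1]

def hirschberg(x, y, delta, alpha):
    """Optimal alignment in O(mn) time and O(m + n) working space."""
    if len(x) <= 1 or len(y) <= 1:
        return nw_traceback(x, y, delta, alpha)
    n2 = len(y) // 2
    # f[q] = cost of aligning x[:q] with y[:n2]; g[q] = cost of x[q:] with y[n2:]
    f = score(y[:n2], x, delta, alpha)
    g = score(y[n2:][::-1], x[::-1], delta, alpha)[::-1]
    q = min(range(len(x) + 1), key=lambda i: f[i] + g[i])   # Observation 2
    return (hirschberg(x[:q], y[:n2], delta, alpha) +
            hirschberg(x[q:], y[n2:], delta, alpha))

pairs = hirschberg("PALETTE", "PALATE", 2, lambda p, q: 0 if p == q else 1)
```

The cost of the returned alignment matches the full quadratic-space DP (3 for PALETTE/PALATE with δ = 2, mismatch 1), while only O(m + n) cells are live at any time.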
Dynamic programming: quiz 4

What is the worst-case running time of Hirschberg's algorithm?
A. O(mn)
B. O(mn log m)
C. O(mn log n)
D. O(mn log m log n)

Hirschberg's algorithm: running time analysis warmup

Theorem. Let T(m, n) = max running time of Hirschberg's algorithm on strings of lengths at most m and n. Then, T(m, n) = O(mn log n).
Pf.
・T(m, n) is monotone nondecreasing in both m and n.
・T(m, n) ≤ 2 T(m, n / 2) + O(mn)  ⇒  T(m, n) = O(mn log n). ▪
Longest common subsequence

Solution 1. Dynamic programming.
Def. OPT(i, j) = length of LCS of prefix strings x1 x2 ... xi and y1 y2 ... yj.
Goal. OPT(m, n).

Case 1. xi = yj.
・1 + length of LCS of x1 x2 ... xi–1 and y1 y2 ... yj–1.
  [optimal substructure property (proof via exchange argument)]

Case 2. xi ≠ yj.
・Delete xi: length of LCS of x1 x2 ... xi–1 and y1 y2 ... yj.
・Delete yj: length of LCS of x1 x2 ... xi and y1 y2 ... yj–1.

Bellman equation.

OPT(i, j) = 0                                      if i = 0 or j = 0
          = 1 + OPT(i − 1, j − 1)                  if xi = yj
          = max { OPT(i − 1, j), OPT(i, j − 1) }   if xi ≠ yj

Solution 2. Reduce to finding a min-cost alignment of x and y with
・Gap penalty δ = 1.
・Mismatch penalty αpq = 0 if p = q, ∞ if p ≠ q.
・Edit distance = # gaps = number of characters deleted from x and y.
・Length of LCS = (m + n − edit distance) / 2.

Analysis. O(mn) time and O(m + n) space.

Lower bound. No O(n2−ε) algorithm unless SETH is false.

[Screenshot: Amir Abboud (Stanford University), Arturs Backurs (MIT), Virginia Vassilevska Williams (Stanford University), "Tight Hardness Results for LCS and other Sequence Similarity Measures."]
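Both solutions can be sketched in a few lines of Python (function names are mine). The first implements the Bellman equation directly; the second computes the insert/delete-only edit distance (gap penalty 1, mismatches forbidden) and recovers the LCS length via the identity above.

```python
def lcs_length(x, y):
    """OPT(i, j) = length of LCS of x[:i] and y[:j] (Bellman equation above)."""
    m, n = len(x), len(y)
    opt = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i-1] == y[j-1]:
                opt[i][j] = 1 + opt[i-1][j-1]
            else:
                opt[i][j] = max(opt[i-1][j], opt[i][j-1])
    return opt[m][n]

def edit_distance_indel(x, y):
    """Insertions/deletions only: gap penalty 1, matches free, mismatches forbidden."""
    m, n = len(x), len(y)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1): M[i][0] = i
    for j in range(n + 1): M[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            best = 1 + min(M[i-1][j], M[i][j-1])      # delete from x or from y
            if x[i-1] == y[j-1]:
                best = min(best, M[i-1][j-1])         # free match
            M[i][j] = best
    return M[m][n]

x, y = "PALETTE", "PALATE"
d = edit_distance_indel(x, y)
print(lcs_length(x, y), d)  # → 5 3
# Length of LCS = (m + n − edit distance) / 2 = (7 + 6 − 3) / 2 = 5
```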
Abstract

Two important similarity measures between sequences are the longest common subsequence (LCS) and the dynamic time warping distance (DTWD). The computations of these measures for two given sequences are central tasks in a variety of applications. Simple dynamic programming algorithms solve these tasks in O(n2) time, and despite an extensive amount of research, no algorithms with significantly better worst case upper bounds are known.
6. DYNAMIC PROGRAMMING II
‣ sequence alignment
‣ Hirschberg's algorithm
‣ Bellman–Ford–Moore algorithm
‣ distance-vector protocols
‣ negative cycles

SECTION 6.8

[Figure: an example digraph with source s, destination t, and edge lengths including −2, 5, 4, 12, −5, 8, 17, 7, 9, −1, −6, 11, −3, 13, 10.]
Dijkstra. May not produce shortest paths when edge lengths are negative.

Reweighting. Adding a constant to every edge length does not necessarily make Dijkstra's algorithm produce shortest paths.

[Figures: a small digraph on nodes s, v, w, t with edge lengths 6, 3, −3 on which Dijkstra fails, and the same digraph reweighted to lengths 14, 11, 5.]

Def. A negative cycle is a directed cycle for which the sum of its edge lengths is negative.

[Figure: a directed cycle with edge lengths 5, −3, −3, 4, −4.]
Shortest paths and negative cycles

Lemma 1. If some v↝t path contains a negative cycle, then there does not exist a shortest v↝t path.
Pf. If there exists such a cycle W, then can build a v↝t path of arbitrarily negative length by detouring around W as many times as desired. ▪

Lemma 2. If G has no negative cycles, then there exists a shortest v↝t path that is simple (and has ≤ n – 1 edges).
Pf.
・Among all shortest v↝t paths, consider one that uses the fewest edges.
・If that path P contains a directed cycle W, can remove the portion of P corresponding to W without increasing its length. ▪
Shortest-paths and negative-cycle problems

Single-destination shortest-paths problem. Given a digraph G = (V, E) with edge lengths ℓvw (but no negative cycles) and a distinguished node t, find a shortest v↝t path for every node v.

Negative-cycle problem. Given a digraph G = (V, E) with edge lengths ℓvw, find a negative cycle (if one exists).

Dynamic programming: quiz 5

Which subproblems to find shortest v↝t paths for every node v?
A. OPT(i, v) = length of shortest v↝t path that uses exactly i edges.
B. OPT(i, v) = length of shortest v↝t path that uses at most i edges.
C. Neither A nor B.

[Figure: an example digraph with destination t and edge lengths 1, 5, 4, 2, −3, 5, −3, −3, 4, −4.]
Shortest paths with negative weights: dynamic programming

Def. OPT(i, v) = length of a shortest v↝t path that uses ≤ i edges.

Bellman equation.

OPT(0, v) = 0 if v = t, ∞ otherwise.
OPT(i, v) = min { OPT(i − 1, v),  min over edges (v, w) ∈ E of { OPT(i − 1, w) + ℓvw } }.

By Lemma 2, if G has no negative cycles, OPT(n − 1, v) = length of a shortest v↝t path.

Shortest paths with negative weights: implementation

Theorem 1. Given a digraph G = (V, E) with no negative cycles, the DP algorithm computes the length of a shortest v↝t path for every node v in Θ(mn) time and Θ(n2) space.

It is easy to modify the DP algorithm for shortest paths to…
A. Compute lengths of shortest paths in O(mn) time and O(m + n) space.
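The Θ(n2)-space DP of Theorem 1 might be sketched as follows; the graph, node names, and dictionary representation are illustrative, not from the deck.

```python
import math

def shortest_paths_dp(nodes, edges, t):
    """table[i][v] = OPT(i, v) = length of a shortest v~>t path using <= i edges.

    edges: dict mapping (v, w) -> length l_vw.
    """
    n = len(nodes)
    opt = {v: math.inf for v in nodes}
    opt[t] = 0                                 # OPT(0, t) = 0, else infinity
    table = [dict(opt)]
    for i in range(1, n):                      # n - 1 rounds suffice (Lemma 2)
        prev = table[-1]
        cur = dict(prev)                       # OPT(i, v) <= OPT(i - 1, v)
        for (v, w), l in edges.items():
            if prev[w] + l < cur[v]:
                cur[v] = prev[w] + l
        table.append(cur)
    return table[-1]

# hypothetical example graph (not the one from the slides)
nodes = ["a", "b", "c", "t"]
edges = {("a", "b"): 1, ("b", "t"): -2, ("a", "t"): 5, ("c", "a"): 2}
dist = shortest_paths_dp(nodes, edges, "t")
print(dist)  # → {'a': -1, 'b': -2, 'c': 1, 't': 0}
```

Keeping the whole table costs Θ(n2) space; the next slide's implementation drops all but the current pass.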
Shortest paths with negative weights: practical improvements

Performance optimization. If d[w] was not updated in iteration i – 1, then no reason to consider edges entering w in iteration i.

Bellman–Ford–Moore: efficient implementation

BELLMAN–FORD–MOORE(V, E, ℓ, t)
FOREACH node v ∈ V :
    d[v] ← ∞.
    successor[v] ← null.
d[t] ← 0.
FOR i = 1 TO n – 1
    FOREACH node w ∈ V :                        ← pass i takes O(m) time
        IF (d[w] was updated in previous pass)
            FOREACH edge (v, w) ∈ E :
                IF (d[v] > d[w] + ℓvw)
                    d[v] ← d[w] + ℓvw.
                    successor[v] ← w.
    IF (no d[·] value changed in pass i) STOP.
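A Python sketch of the pseudocode above, including the "only relax edges into nodes updated last pass" optimization and early termination; the example graph is made up for illustration.

```python
import math

def bellman_ford_moore(nodes, edges, t):
    """Single-destination shortest paths with successor pointers.

    edges: dict mapping (v, w) -> length l_vw. Returns (d, successor).
    """
    d = {v: math.inf for v in nodes}
    successor = {v: None for v in nodes}
    d[t] = 0
    updated = {t}                                  # nodes whose d[] changed last pass
    for _ in range(len(nodes) - 1):                # at most n - 1 passes (Lemma 2)
        if not updated:
            break                                  # early termination: converged
        updated_now = set()
        for (v, w), l in edges.items():
            if w in updated and d[w] + l < d[v]:   # skip edges into unchanged nodes
                d[v] = d[w] + l
                successor[v] = w
                updated_now.add(v)
        updated = updated_now
    return d, successor

# hypothetical example graph (not the one from the slides)
nodes = ["a", "b", "c", "t"]
edges = {("a", "b"): 1, ("b", "t"): -2, ("a", "t"): 5, ("c", "a"): 2}
d, succ = bellman_ford_moore(nodes, edges, "t")
# following successors from c: c -> a -> b -> t, total length 2 + 1 - 2 = 1
```

Only d[·] and successor[·] are stored, i.e. Θ(n) extra space.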
Dynamic programming: quiz 7

Which properties must hold after pass i of Bellman–Ford–Moore?

Assuming no negative cycles, which properties must hold throughout Bellman–Ford–Moore?
C. Both A and B.
D. Neither A nor B.

Bellman–Ford–Moore: analysis

Lemma 3. For each node v : d[v] is the length of some v↝t path.
Lemma 4. For each node v : d[v] is monotone non-increasing.

Theorem 2. Assuming no negative cycles, Bellman–Ford–Moore computes the lengths of the shortest v↝t paths in O(mn) time and Θ(n) extra space.
Pf. Lemma 2 + Lemma 5. ▪

Remark. Bellman–Ford–Moore is typically faster in practice.
・Edge (v, w) considered in pass i + 1 only if d[w] updated in pass i.
・If shortest path has k edges, then algorithm finds it after ≤ k passes.
Bellman–Ford–Moore: analysis

Claim. Throughout Bellman–Ford–Moore, following the successor[v] pointers gives a directed path from v to t of length d[v].

[Figures: two small examples (edge lengths 1, 1 and 5, 4, −8, with successor[3] = t and d[3] = 1) examining the claim.]

Lemma 6. Any directed cycle W in the successor graph is a negative cycle.
Pf.
・If successor[v] = w, we must have d[v] ≥ d[w] + ℓvw.
  (LHS and RHS are equal when successor[v] is set; d[w] can only decrease;
  d[v] decreases only when successor[v] is reset)
・Let v1 → v2 → … → vk → v1 be the sequence of nodes in a directed cycle W.
・Assume that (vk, v1) is the last edge in W added to the successor graph.
・Just prior to that:
  d[v1] ≥ d[v2] + ℓ(v1, v2)
  d[v2] ≥ d[v3] + ℓ(v2, v3)
  ⋮
  d[vk–1] ≥ d[vk] + ℓ(vk–1, vk)
  d[vk] > d[v1] + ℓ(vk, v1)      [strict, since successor[vk] is about to be reset]
・Adding inequalities yields ℓ(v1, v2) + ℓ(v2, v3) + … + ℓ(vk–1, vk) + ℓ(vk, v1) < 0. ▪

Theorem 3. Assuming no negative cycles, Bellman–Ford–Moore finds shortest v↝t paths for every node v in O(mn) time and Θ(n) extra space.
Pf.
・The successor graph cannot have a directed cycle. [Lemma 6]
・Thus, following the successor pointers from v yields a directed path to t.
・Let v = v1 → v2 → … → vk = t be the nodes along this path P.
・Upon termination, if successor[v] = w, we must have d[v] = d[w] + ℓvw.
  (LHS and RHS are equal when successor[v] is set; d[·] did not change
  since the algorithm terminated)
・Thus:
  d[v1] = d[v2] + ℓ(v1, v2)
  d[v2] = d[v3] + ℓ(v2, v3)
  ⋮
・Adding equations yields d[v] = d[t] + ℓ(v1, v2) + ℓ(v2, v3) + … + ℓ(vk–1, vk). ▪
[Table fragment: history of shortest-paths algorithms with negative weights — 1993, Goldberg: O(m n1/2 log W).]
Distance-vector routing protocols

[Figure: network s —1— v —1— t, illustrating the "counting to infinity" problem.]

Link-state routing protocols.
・Each router stores the whole path (or network topology).
・Based on Dijkstra's algorithm.
・Avoids "counting-to-infinity" problem and related difficulties.
・Requires significantly more storage.

Ex. Border Gateway Protocol (BGP), Open Shortest Path First (OSPF).

6. DYNAMIC PROGRAMMING II
‣ sequence alignment
‣ Hirschberg's algorithm
‣ Bellman–Ford–Moore algorithm
‣ distance-vector protocols
‣ negative cycles

SECTION 6.10
Detecting negative cycles

Negative cycle detection problem. Given a digraph G = (V, E), with edge lengths ℓvw, find a negative cycle (if one exists).

[Figure: an example digraph with edge lengths −3, 2, 4, −3, −3, 4, −4.]

Detecting negative cycles: application

Currency conversion. Given n currencies and exchange rates between pairs of currencies, is there an arbitrage opportunity?

[Figure: exchange-rate graph on currencies including USD, GBP, CAD, and CHF, with rates such as 0.657, 1.521, 1.049, and 0.953; one cycle of conversions forms an arbitrage opportunity.]

An arbitrage opportunity
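The standard reduction (not spelled out on the surviving slide text) assigns each conversion edge the weight −log(rate): a cycle whose rates multiply to more than 1 is exactly a cycle whose weights sum to less than 0. A sketch, with a made-up rate table rather than the slide's figures:

```python
import math

def has_arbitrage(currencies, rates):
    """Arbitrage cycle exists  <=>  product of rates > 1  <=>  sum of -log(rate) < 0.

    rates: dict mapping (from, to) -> exchange rate. Runs n passes of
    Bellman-Ford relaxations; an update in pass n implies a negative cycle.
    """
    weight = {e: -math.log(r) for e, r in rates.items()}
    d = {c: 0.0 for c in currencies}       # all-zero init = virtual 0-length source
    for _ in range(len(currencies)):
        changed = False
        for (v, w), l in weight.items():
            if d[v] + l < d[w] - 1e-12:    # small tolerance for float round-off
                d[w] = d[v] + l
                changed = True
        if not changed:
            return False                   # converged: no negative cycle
    return True                            # still improving after n passes

# hypothetical rate table (made up for illustration)
rates = {("USD", "EUR"): 0.9, ("EUR", "GBP"): 0.9, ("GBP", "USD"): 1.3}
print(has_arbitrage(["USD", "EUR", "GBP"], rates))  # → True: 0.9 * 0.9 * 1.3 > 1
```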
Detecting negative cycles

Lemma 7. If OPT(n, v) = OPT(n – 1, v) for every node v, then no negative cycles.
Pf. The OPT(n, v) values have converged ⇒ shortest v↝t path exists. ▪

Lemma 8. If OPT(n, v) < OPT(n – 1, v) for some node v, then (any) shortest v↝t path of length ≤ n contains a cycle W. Moreover, W is a negative cycle.
Pf. [by contradiction]
・Since OPT(n, v) < OPT(n – 1, v), we know that a shortest v↝t path P has exactly n edges.
・By pigeonhole principle, the path P must contain a repeated node x.
・Let W be any cycle in P.
・Deleting W yields a v↝t path with < n edges ⇒ W is a negative cycle. ▪

[Figure: a v↝t path P containing a cycle W through repeated node x, with c(W) < 0.]

Theorem 4. Can find a negative cycle in Θ(mn) time and Θ(n2) space.
Pf.
・Add new sink node t and connect all nodes to t with 0-length edge.
・G has a negative cycle iff G′ has a negative cycle.
  (cycle cannot contain t since no edge leaves t)
・Case 1. [ OPT(n, v) = OPT(n – 1, v) for every node v ]
  By Lemma 7, no negative cycles.
・Case 2. [ OPT(n, v) < OPT(n – 1, v) for some node v ]
  Using proof of Lemma 8, can extract negative cycle from v↝t path. ▪

[Figure: G′ — the example digraph (edge lengths −3, 2, 4, −3, −3, 4, −4, 5, 6) with a new sink t and 0-length edges into t.]
Detecting negative cycles

Theorem 5. Can find a negative cycle in O(mn) time and O(n) extra space.
Pf.
・Run Bellman–Ford–Moore on G′ for n′ = n + 1 passes (instead of n′ – 1).
・If no d[v] values updated in pass n′, then no negative cycles.
・Otherwise, suppose d[s] updated in pass n′.
・Define pass(v) = last pass in which d[v] was updated.
・Observe pass(s) = n′ and pass(successor[v]) ≥ pass(v) – 1 for each v.
・Following successor pointers, we must eventually repeat a node.
・By Lemma 6, the resulting directed cycle is a negative cycle. ▪

Remark. See p. 304 for improved version and early termination rule (Tarjan's subtree disassembly trick).

Dynamic programming: quiz 9

How difficult to find a negative cycle in an undirected graph?
A. O(m log n)
B. O(mn)
C. O(mn + n2 log n)
D. O(n2.38)

[Screenshot: Harold N. Gabow, "Data Structures for Weighted Matching and Nearest Common Ancestors with Linking," Chapter 46.]
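The proof of Theorem 5 is constructive; a Python sketch follows (names and the tiny example graph are mine). It builds G′ with a fresh sink, runs n′ = n + 1 passes, and, if the last pass still updates some d[v], walks successor pointers until a node repeats; by Lemma 6 that cycle is negative.

```python
import math

def find_negative_cycle(nodes, edges):
    """Extract a negative cycle via successor pointers (O(mn) time, O(n) space).

    edges: dict mapping (v, w) -> length. Returns a list of cycle nodes, or None.
    """
    sink = object()                      # the new sink node t of G'
    all_edges = dict(edges)
    for v in nodes:
        all_edges[(v, sink)] = 0         # 0-length edge from every node to t
    d = {v: math.inf for v in nodes}
    d[sink] = 0
    successor = {}
    last_updated = []
    for _ in range(len(nodes) + 1):      # n' = n + 1 passes
        last_updated = []
        for (v, w), l in all_edges.items():
            if d[w] + l < d[v]:
                d[v] = d[w] + l
                successor[v] = w
                last_updated.append(v)
        if not last_updated:
            return None                  # converged: no negative cycle
    # still updating after n' passes: follow successors until a node repeats
    seen, v = [], last_updated[0]
    while v not in seen:
        seen.append(v)
        v = successor.get(v)
        if v is None:                    # defensive; cannot happen per Theorem 5
            return None
    return seen[seen.index(v):]          # the repeated node starts the cycle

# hypothetical graph with negative cycle a -> b -> c -> a of total length -2
cycle = find_negative_cycle(["a", "b", "c"],
                            {("a", "b"): 1, ("b", "c"): -2, ("c", "a"): -1})
```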
[Abstract, from the screenshot: "This paper shows that the weighted matching problem on general graphs can be solved in time O(n(m + n log n)), for n and m the number of vertices and edges, respectively. This was previously known only for bipartite graphs. It also shows that a sequence of m nca and link operations on n nodes can be processed on-line in time O(mα(m, n) + n). This was previously known only for a restricted type of link operation."]