Path Compression
Path Compression
Path Compression
P[42]
Before Path Compression
n F (n)
0 1
1 2
2 4
42 3 16
4 65536
5 2 × 1019728
P[42]
The function G is essentially the inverse of F . More
formally, G(n) is the smallest integer k such that
∗ Copyright c Ian Parberry, 1992–2003.
F (k) ≥ n. G(n) ≤ 5 for all “practical” values of n.
1
With path compression, a series of Θ(n) union and If w is made a descendant of v at some time during
find operations can be carried out in time O(n · the execution of σ with path compression, then the
G(n)). That is, the algorithm takes O(G(n)) amor- rank of w is less than the rank of v.
tized time.
Proof: If w is made a descendant of v with path
We will measure time by counting one time unit for compression, then it’s also a descendant of v without
each node visited when traversing paths to the root path compression (just lower down). This means the
during union and find operations. rank of w is less than the rank of v.
Lemma 1
We will charge one unit of time each time we visit a
vertex. At the end of σ we will sum up all of these
If path compression is not used, then a tree of height units.
h will have at least 2h vertices.
(Note: Ignore the fact that we visit each vertex twice
Proof: This is the same argument that we made during path-compression.)
in the earlier lecture: the intuition is that making
the root of the smaller tree point to the root of the The trick is that we will use a funky book-keeping
larger tree makes short, wide trees that look a little scheme. We charge some of the units of time to
like complete binary trees at least in terms of their the vertex, and some to the operation that caused
height/size ratio. the visit. At the end we add up all the time units
charged to both operations and vertices.
Lemma 2
The Rules
There are at most n/2r vertices of rank r.
Let v be a vertex. Each time we pass through it we
Proof: By Lemma 1, each vertex of rank r has at
will charge as follows:
least 2r descendants. Since the descendants of ver-
tices at the same height must be distinct, having 1. If the parent of v is in a different rank group
more than n/2r vertices of rank r would give us more than v or v is the root, then charge one unit of
than 2r · n/2r = n vertices in total, which is impos- time to the operation.
sible.
2. If both v and its parent are in the same rank
(Note: The only way that two vertices can share group, then charge one unit of time to v.
descendants is if one is an ancestor of the other —
which would give them different heights.)
Corollary: No vertex can have rank larger than Charges Under Rule 1
log n.
By Lemma 3, vertices going up a path are monoton-
Lemma 3
ically increasing in rank. There are at most G(n)
2
different rank groups, 0 through G(n) − 1. There- total of n · G(n) by Rule 1.
fore we cross a rank group boundary at most G(n)
times. There are n vertices charged a total of n · G(n) by
Rule 2.
We charge one time unit to the operation each time
we cross a rank group boundary. So no operation is Grand total of O(n · G(n)) for an amortized time of
charged more than G(n) time units under Rule 1. O(G(n)) per operation.
Analysis