Binary Search Trees: Guaranteeing Performance
Binary Search Trees: Guaranteeing Performance
Binary Search Trees: Guaranteeing Performance
Princeton University • COS 226 • Algorithms and Data Structures • Spring 2004 • Kevin Wayne • http://www.Princeton.EDU/~cos226 2
Binary search tree: binary tree in symmetric order. A BST is a reference to a node.
51
25 43 64 84 99
51
3 4
Tree Shape BST Skeleton
Tree shape.
public class SymbolTable {
n Many BSTs correspond to same input data. private Node root;
n Have different tree shapes.
private static class Node {
n Performance depends on shape. Comparable key;
Object value;
Node left, right;
25
Node(Comparable key, Object value) {
this.key = key;
51 14 72
this.value = value;
}
} helper inner class
14 43 97 helper functions
72
private static boolean less(Comparable k1, Comparable k2) { }
private static boolean equals(Comparable k1, Comparable k2) { }
33 53 97 33 53 84 99
5 6
Search for specified key and return corresponding value or null. Insert key-value pair.
n Code follows from BST definition. n Code follows from BST definition.
n Use helper function to search for key in subtree rooted at h. n Search, then insert.
n Simple (but tricky) recursive code.
n Duplicates allowed.
7 8
BST Construction BST Analysis
Insert the following keys into BST: A S E R C H I N G X M P L Cost of search and insert BST.
n Proportional to depth of node.
n 1-1 correspondence between BST and quicksort partitioning.
n Height of node corresponds to number of function calls on stack
when node is partitioned.
9 10
To delete a node:
n Case 1 (zero children): just remove it.
Worst Case Average Case n Case 2 (one child): pass the child up.
Implementation Search Insert Delete Search Insert Delete n Case 3 (two children): find the next largest node using right-left*
Sorted array log N N N log N N/2 N/2 or left-right*, swap with next largest, remove as in Case 1 or 2.
Unsorted list N 1 1 N/2 1 1
Hashing N 1 N 1* 1* 1*
BST N N N log N log N ???
11 12
Symbol Table: Implementations Cost Summary Right Rotate, Left Rotate
* assumes our hash function can generate random values for all keys
† if delete allowed, insert/search become sqrt(N) too y x
A y = left(x) C
13 14
Fundamental operation to rearrange nodes in a tree. Root insertion: insert a node and make it the new root.
n Easier done than said. n Insert the node using standard BST.
n Use rotations to bring it up to the root.
Why bother?
private Node rotL(Node h) { n Faster if searches are for recently inserted keys.
Node x = h.right; n Basis for advanced algorithms.
h.right = x.left;
x.left = h; Node insertT(Node h, Comparable key, Object value) {
return x; if (h == null) return new Node(key, value);
} if (less(key, h.key)) {
h.left = insertT(h.left, key, value);
private Node rotR(Node h) { h = rotR(h);
Node x = h.left; }
h.left = x.right; else {
x.right = h; h.right = insertT(h.right, key, value);
return x; h = rotL(h);
} }
return h;
}
15 16
BST Construction: Root Insertion Randomized BST
Idea. When inserting a new node, make it the root (via root insertion)
with probability 1/(N+1) and do it recursively.
19 20
Randomized BST: Delete Symbol Table: Implementations Cost Summary
Join. Merge two disjoint symbol tables A (of size M) and B (of size N),
assuming all keys in A are less than all keys in B. Worst Case Average Case
Use A as root with probability M / (M + N), and recursively join
Implementation Search Insert Delete Search Insert Delete
n
n Delete the node. Randomized BST log N ‡ log N ‡ log N ‡ log N log N log N
n Join two broken subtrees as above.
* assumes our hash function can generate random values for all keys
Theorem. Tree still random after delete. † if delete allowed, insert/search become sqrt(N)
‡ assumes system can generate random numbers
21 22
Sort. Traverse tree in ascending order. Ceiling. Given key k, return smallest element that is at least as big as k.
n Inorder traversal.
n Same comparisons as quicksort, but pay space for extra links. Best-fit bin packing heuristic. Insert the item in the bin with the least
remaining space among those that can store the item.
Range search. Find all items whose keys are between k1 and k2.
M
n Add subtree size to each node. 10 – original proof of this result was over 70 pages of analysis!
n Takes time proportional to height of tree.
E T
4 5
private class Node {
Comparable key; B H P W
Object value; 1 2 1 3
Node left, right;
int N; F V Y
} subtree size 1 1 1
23 24
Symbol Table: Implementations Cost Summary
* assumes our hash function can generate random values for all keys
‡ assumes system can generate random numbers
25