Lecture 7


David Luebke 1 1/31/2012

CS 332: Algorithms
Quicksort
Homework 2
Assigned today, due next Wednesday
Will be on web page shortly after class
Go over now
Review: Quicksort
Sorts in place
Sorts in O(n lg n) time in the average case
Sorts in O(n^2) time in the worst case
But in practice, it's quick
And the worst case doesn't happen often (but more on this later)
Quicksort
Another divide-and-conquer algorithm
The array A[p..r] is partitioned into two non-empty subarrays A[p..q] and A[q+1..r]
Invariant: All elements in A[p..q] are less than or equal to all elements in A[q+1..r]
The subarrays are recursively sorted by calls to
quicksort
Unlike merge sort, no combining step: two
subarrays form an already-sorted array
Quicksort Code
Quicksort(A, p, r)
{
    if (p < r)
    {
        q = Partition(A, p, r);
        Quicksort(A, p, q);
        Quicksort(A, q+1, r);
    }
}
Partition
Clearly, all the action takes place in the
partition() function
Rearranges the subarray in place
End result:
Two subarrays
All values in first subarray ≤ all values in second
Returns the index of the pivot element
separating the two subarrays
How do you suppose we implement this?
Partition In Words
Partition(A, p, r):
Select an element to act as the pivot (which?)
Grow two regions, A[p..i] and A[j..r]
All elements in A[p..i] <= pivot
All elements in A[j..r] >= pivot
Increment i until A[i] >= pivot
Decrement j until A[j] <= pivot
Swap A[i] and A[j]
Repeat until i >= j
Return j
Note: slightly different from the book's partition()
Partition Code
Partition(A, p, r)
{
    x = A[p];
    i = p - 1;
    j = r + 1;
    while (TRUE)
    {
        repeat
            j--;
        until A[j] <= x;
        repeat
            i++;
        until A[i] >= x;
        if (i < j)
            Swap(A, i, j);
        else
            return j;
    }
}
Illustrate on
A = {5, 3, 2, 6, 4, 1, 3, 7};
What is the running time of
partition()?
partition() runs in O(n) time
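The Quicksort and Partition pseudocode can be tried out directly. The following is a minimal Python sketch, assuming 0-based indexing and inclusive bounds p and r. The lowercase function names and the default arguments are choices made for this illustration, and each repeat/until loop is rewritten as a decrement (or increment) followed by a while loop. It runs partition once on the example array from the slide, then sorts it.

```python
def partition(a, p, r):
    """Partition a[p..r] (inclusive) around the pivot x = a[p]; return j."""
    x = a[p]
    i = p - 1
    j = r + 1
    while True:
        # repeat j-- until a[j] <= x
        j -= 1
        while a[j] > x:
            j -= 1
        # repeat i++ until a[i] >= x
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place; with no bounds given, sort the whole list."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q)
        quicksort(a, q + 1, r)


A = [5, 3, 2, 6, 4, 1, 3, 7]
q = partition(A, 0, len(A) - 1)
print(q, A)   # every element of A[0..q] is <= every element of A[q+1..]
quicksort(A)
print(A)
```

Each index only moves toward the other and the loop stops once they cross, so a single call touches each element a constant number of times, which matches the O(n) bound.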
Analyzing Quicksort
What will be the worst case for the algorithm?
Partition is always unbalanced
What will be the best case for the algorithm?
Partition is perfectly balanced
Which is more likely?
The latter, by far, except...
Will any particular input elicit the worst case?
Yes: Already-sorted input
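To see why already-sorted input is the bad case, one can count element comparisons. The sketch below is an illustration under assumptions: it uses the first-element pivot from the Partition slide, a hypothetical counter list to tally comparisons, a small n so that the deep worst-case recursion stays within Python's default recursion limit, and an arbitrary fixed seed for the shuffled input.

```python
import random


def partition(a, p, r, counter):
    """Partition a[p..r] around a[p]; counter[0] tallies element comparisons."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        while True:  # repeat j-- until a[j] <= x
            j -= 1
            counter[0] += 1
            if a[j] <= x:
                break
        while True:  # repeat i++ until a[i] >= x
            i += 1
            counter[0] += 1
            if a[i] >= x:
                break
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def count_comparisons(values):
    """Quicksort a copy of values and return how many comparisons were made."""
    a = list(values)
    counter = [0]

    def sort(p, r):
        if p < r:
            q = partition(a, p, r, counter)
            sort(p, q)
            sort(q + 1, r)

    sort(0, len(a) - 1)
    return counter[0]


n = 200  # small on purpose: sorted input recurses about n levels deep
already_sorted = list(range(n))
shuffled = already_sorted[:]
random.Random(0).shuffle(shuffled)  # fixed seed, chosen arbitrarily

sorted_cost = count_comparisons(already_sorted)
shuffled_cost = count_comparisons(shuffled)
print(sorted_cost, shuffled_cost)
```

On sorted input every call splits the subarray 1 : n-1, so the count grows quadratically, while the shuffled input stays near n lg n.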
Analyzing Quicksort
In the worst case:
T(1) = O(1)
T(n) = T(n - 1) + O(n)
Works out to
T(n) = O(n^2)
Analyzing Quicksort
In the best case:
T(n) = 2T(n/2) + O(n)
What does this work out to?
T(n) = O(n lg n)
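Both recurrences can be checked by evaluating them numerically. This is a sketch under simplifying assumptions that are not on the slides: the O(n) term is taken to be exactly n, T(1) is taken to be 1, and the best case is evaluated only for n a power of two.

```python
from functools import lru_cache


def worst_case(n):
    """T(1) = 1, T(n) = T(n - 1) + n, unrolled iteratively."""
    total = 1
    for m in range(2, n + 1):
        total += m
    return total


@lru_cache(maxsize=None)
def best_case(n):
    """T(1) = 1, T(n) = 2 T(n / 2) + n, for n a power of two."""
    if n == 1:
        return 1
    return 2 * best_case(n // 2) + n


for n in (16, 256, 1024):
    print(n, worst_case(n), best_case(n))
```

Under these assumptions the worst case sums to n(n + 1)/2 and the best case to n lg n + n, which is where the O(n^2) and O(n lg n) bounds come from.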
Improving Quicksort
The real liability of quicksort is that it runs in O(n^2) on already-sorted input
Book discusses two solutions:
Randomize the input array, OR
Pick a random pivot element
How will these solve the problem?
By ensuring that no particular input can be chosen to make quicksort run in O(n^2) time
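The second solution, picking a random pivot, can be sketched as follows. This is one possible way to do it, not necessarily the book's exact procedure: a uniformly random element is swapped into position p, and then the same partition as before is applied. The names randomized_partition and randomized_quicksort and the rng parameter are hypothetical, introduced for this illustration.

```python
import random


def partition(a, p, r):
    """Partition a[p..r] around the pivot a[p]; return the split index j."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def randomized_partition(a, p, r, rng):
    """Swap a uniformly random element of a[p..r] into a[p], then partition."""
    k = rng.randint(p, r)  # randint includes both endpoints
    a[p], a[k] = a[k], a[p]
    return partition(a, p, r)


def randomized_quicksort(a, p=0, r=None, rng=None):
    """Sort a[p..r] in place using random pivots."""
    if rng is None:
        rng = random.Random()
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r, rng)
        randomized_quicksort(a, p, q, rng)
        randomized_quicksort(a, q + 1, r, rng)


data = list(range(1000))  # already sorted: the bad case for a fixed first-element pivot
randomized_quicksort(data)
print(data[:5], data[-5:])
```

Because the pivot no longer depends on where elements sit in the input, sorted input is no more likely than any other input to produce unbalanced splits.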
Analyzing Quicksort: Average Case
Assuming random input, average-case running time is much closer to O(n lg n) than O(n^2)
First, a more intuitive explanation/example:
Suppose that partition() always produces a 9-to-1
split. This looks quite unbalanced!
The recurrence is thus:
T(n) = T(9n/10) + T(n/10) + n
How deep will the recursion go? (draw it)
Use n instead of O(n)
for convenience (how?)
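The depth question for the 9-to-1 split can be explored numerically. The sketch below follows only the larger (9n/10) side of each split, since that branch is the deepest, and compares the number of levels against the logarithm base 10/9 of n. The function name and the choice n = 10^6 are assumptions made for illustration.

```python
import math


def depth_of_long_branch(n, fraction=0.9):
    """Count levels until the larger side of every split has size <= 1."""
    depth = 0
    size = float(n)
    while size > 1:
        size *= fraction
        depth += 1
    return depth


n = 10**6
depth = depth_of_long_branch(n)
predicted = math.log(n) / math.log(10 / 9)  # log base 10/9 of n
print(depth, predicted, math.log2(n))
```

The depth is a constant multiple of lg n, and each level of the recursion tree costs at most n, so even this lopsided split gives O(n lg n) overall.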
Analyzing Quicksort: Average Case
Intuitively, a real-life run of quicksort will
produce a mix of bad and good splits
Randomly distributed among the recursion tree
Pretend for intuition that they alternate between
best-case (n/2 : n/2) and worst-case (n-1 : 1)
What happens if we bad-split root node, then
good-split the resulting size (n-1) node?
We fail English
We end up with three subarrays, size 1, (n-1)/2, (n-1)/2
Combined cost of splits = n + (n - 1) = 2n - 1 = O(n)
No worse than if we had good-split the root node!
Analyzing Quicksort: Average Case
Intuitively, the O(n) cost of a bad split
(or 2 or 3 bad splits) can be absorbed
into the O(n) cost of each good split
Thus running time of alternating bad and good
splits is still O(n lg n), with slightly higher
constants
How can we be more rigorous?
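Before the rigorous argument, the alternating-split intuition can be checked numerically. The sketch below is an illustration under assumptions not stated on the slides: splitting a subarray of size n costs exactly n, subarrays of size 0 or 1 cost nothing, and odd sizes are split as evenly as possible. The function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def best(n):
    """Every split is balanced: cost n plus the cost of the two halves."""
    if n <= 1:
        return 0
    return n + best(n // 2) + best(n - n // 2)


@lru_cache(maxsize=None)
def bad_then_good(n):
    """A worst-case (1 : n-1) split, then a balanced split of the n-1 part."""
    if n <= 1:
        return 0
    m = n - 1  # what remains after the bad split; the size-1 part costs nothing
    if m <= 1:
        return n
    # bad split costs n, good split of the remainder costs m, then alternate again
    return n + m + bad_then_good(m // 2) + bad_then_good(m - m // 2)


for n in (10, 100, 1000, 10000):
    print(n, best(n), bad_then_good(n), bad_then_good(n) / best(n))
```

In this model the alternating cost stays within a factor of two of the all-good-splits cost, which is the "slightly higher constants" claim made concrete.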
Analyzing Quicksort: Average Case
For simplicity, assume:
All inputs distinct (no repeats)
Slightly different partition() procedure
partition around a random element, which is not
included in subarrays
all splits (0:n-1, 1:n-2, 2:n-3, …, n-1:0) equally likely
What is the probability of a particular split
happening?
Answer: 1/n
Analyzing Quicksort: Average Case
So partition generates splits
(0:n-1, 1:n-2, 2:n-3, …, n-2:1, n-1:0)
each with probability 1/n
If T(n) is the expected running time,
$$
T(n) = \frac{1}{n}\sum_{k=0}^{n-1}\bigl[T(k) + T(n-1-k)\bigr] + O(n)
$$
What is each term under the summation for?
What is the O(n) term for?
Analyzing Quicksort: Average Case
So
$$
T(n) = \frac{1}{n}\sum_{k=0}^{n-1}\bigl[T(k) + T(n-1-k)\bigr] + O(n)
     = \frac{2}{n}\sum_{k=0}^{n-1} T(k) + O(n)
$$
Note: this is just like the book's recurrence (p166), except that the summation starts with k=0
We'll take care of that in a second
Write it on the board
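The simplified recurrence can be tabulated to see how it grows before solving it formally. This is a sketch under assumptions that are not on the slides: the O(n) term is replaced by exactly n and T(0) is taken to be 0. A running prefix sum keeps the tabulation linear in n_max.

```python
import math


def expected_cost(n_max):
    """Tabulate T(n) = (2/n) * (T(0) + ... + T(n-1)) + n, with T(0) = 0."""
    t = [0.0] * (n_max + 1)
    prefix = 0.0  # running sum T(0) + ... + T(n-1)
    for n in range(1, n_max + 1):
        t[n] = 2.0 * prefix / n + n
        prefix += t[n]
    return t


T = expected_cost(1000)
for n in (10, 100, 1000):
    print(n, T[n], T[n] / (n * math.log2(n)))
```

The ratio T(n) / (n lg n) stays bounded by a small constant over this range, consistent with the guess T(n) = O(n lg n).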
Analyzing Quicksort: Average Case
We can solve this recurrence using the dreaded substitution method
Guess the answer
What's the answer? T(n) = O(n lg n)
Assume that the inductive hypothesis holds
What's the inductive hypothesis? T(n) ≤ an lg n + b for some constants a and b
Substitute it in for some value < n
What value? The value k in the recurrence
Prove that it follows for n
Grind through it
Analyzing Quicksort: Average Case
$$
\begin{aligned}
T(n) &= \frac{2}{n}\sum_{k=0}^{n-1} T(k) + O(n)
  && \text{(the recurrence to be solved)} \\
&\le \frac{2}{n}\sum_{k=0}^{n-1} (ak \lg k + b) + O(n)
  && \text{(plug in inductive hypothesis)} \\
&\le \frac{2}{n}\Bigl[\,b + \sum_{k=1}^{n-1} (ak \lg k + b)\Bigr] + O(n)
  && \text{(expand out the k=0 case)} \\
&= \frac{2}{n}\sum_{k=1}^{n-1} (ak \lg k + b) + \frac{2b}{n} + O(n) \\
&= \frac{2}{n}\sum_{k=1}^{n-1} (ak \lg k + b) + O(n)
  && \text{(2b/n is just a constant, so fold it into O(n))}
\end{aligned}
$$
Note: leaving the same recurrence as the book
Analyzing Quicksort: Average Case
$$
\begin{aligned}
T(n) &\le \frac{2}{n}\sum_{k=1}^{n-1} (ak \lg k + b) + O(n)
  && \text{(the recurrence to be solved)} \\
&= \frac{2}{n}\sum_{k=1}^{n-1} ak \lg k + \frac{2}{n}\sum_{k=1}^{n-1} b + O(n)
  && \text{(distribute the summation)} \\
&= \frac{2a}{n}\sum_{k=1}^{n-1} k \lg k + \frac{2b}{n}(n-1) + O(n)
  && \text{(evaluate the summation: } b + b + \dots + b = b(n-1)\text{)} \\
&\le \frac{2a}{n}\sum_{k=1}^{n-1} k \lg k + 2b + O(n)
  && \text{(since } n-1 < n,\ 2b(n-1)/n < 2b\text{)}
\end{aligned}
$$
This summation gets its own set of slides later
Analyzing Quicksort: Average Case
$$
\begin{aligned}
T(n) &\le \frac{2a}{n}\sum_{k=1}^{n-1} k \lg k + 2b + O(n)
  && \text{(the recurrence to be solved)} \\
&\le \frac{2a}{n}\Bigl(\frac{1}{2}n^2 \lg n - \frac{1}{8}n^2\Bigr) + 2b + O(n)
  && \text{(What the hell? We'll prove this later)} \\
&= an \lg n - \frac{a}{4}n + 2b + O(n)
  && \text{(distribute the (2a/n) term)} \\
&= an \lg n + b + \Bigl(O(n) + b - \frac{a}{4}n\Bigr)
  && \text{(remember, our goal is to get } T(n) \le an \lg n + b\text{)} \\
&\le an \lg n + b
  && \text{(pick a large enough that an/4 dominates O(n)+b)}
\end{aligned}
$$
Analyzing Quicksort: Average Case
So T(n) ≤ an lg n + b for certain a and b
Thus the induction holds
Thus T(n) = O(n lg n)
Thus quicksort runs in O(n lg n) time on average
(phew!)
Oh yeah, the summation
Tightly Bounding The Key Summation
$$
\begin{aligned}
\sum_{k=1}^{n-1} k \lg k
&= \sum_{k=1}^{\lceil n/2\rceil - 1} k \lg k + \sum_{k=\lceil n/2\rceil}^{n-1} k \lg k
  && \text{(split the summation for a tighter bound)} \\
&\le \sum_{k=1}^{\lceil n/2\rceil - 1} k \lg k + \sum_{k=\lceil n/2\rceil}^{n-1} k \lg n
  && \text{(the lg k in the second term is bounded by lg n)} \\
&= \sum_{k=1}^{\lceil n/2\rceil - 1} k \lg k + \lg n \sum_{k=\lceil n/2\rceil}^{n-1} k
  && \text{(move the lg n outside the summation)}
\end{aligned}
$$
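Tightly Bounding The Key Summation
$$
\begin{aligned}
\sum_{k=1}^{n-1} k \lg k
&\le \sum_{k=1}^{\lceil n/2\rceil - 1} k \lg k + \lg n \sum_{k=\lceil n/2\rceil}^{n-1} k
  && \text{(the summation bound so far)} \\
&\le \sum_{k=1}^{\lceil n/2\rceil - 1} k \lg(n/2) + \lg n \sum_{k=\lceil n/2\rceil}^{n-1} k
  && \text{(the lg k in the first term is bounded by lg n/2)} \\
&= \sum_{k=1}^{\lceil n/2\rceil - 1} k (\lg n - 1) + \lg n \sum_{k=\lceil n/2\rceil}^{n-1} k
  && \text{(lg n/2 = lg n - 1)} \\
&= (\lg n - 1) \sum_{k=1}^{\lceil n/2\rceil - 1} k + \lg n \sum_{k=\lceil n/2\rceil}^{n-1} k
  && \text{(move (lg n - 1) outside the summation)}
\end{aligned}
$$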
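Tightly Bounding The Key Summation
$$
\begin{aligned}
\sum_{k=1}^{n-1} k \lg k
&\le (\lg n - 1) \sum_{k=1}^{\lceil n/2\rceil - 1} k + \lg n \sum_{k=\lceil n/2\rceil}^{n-1} k
  && \text{(the summation bound so far)} \\
&= \lg n \sum_{k=1}^{\lceil n/2\rceil - 1} k - \sum_{k=1}^{\lceil n/2\rceil - 1} k + \lg n \sum_{k=\lceil n/2\rceil}^{n-1} k
  && \text{(distribute the (lg n - 1))} \\
&= \lg n \sum_{k=1}^{n-1} k - \sum_{k=1}^{\lceil n/2\rceil - 1} k
  && \text{(the two lg n summations cover adjacent ranges; combine them)} \\
&= \lg n \,\frac{(n-1)\,n}{2} - \sum_{k=1}^{\lceil n/2\rceil - 1} k
  && \text{(the Gaussian series)}
\end{aligned}
$$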
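Tightly Bounding The Key Summation
$$
\begin{aligned}
\sum_{k=1}^{n-1} k \lg k
&\le \frac{(n-1)\,n}{2} \lg n - \sum_{k=1}^{\lceil n/2\rceil - 1} k
  && \text{(the summation bound so far)} \\
&\le \frac{1}{2}\bigl[n(n-1)\bigr] \lg n - \sum_{k=1}^{n/2 - 1} k
  && \text{(rearrange first term, place upper bound on second)} \\
&= \frac{1}{2}\bigl[n(n-1)\bigr] \lg n - \frac{1}{2}\Bigl(\frac{n}{2}\Bigr)\Bigl(\frac{n}{2} - 1\Bigr)
  && \text{(the Gaussian series)} \\
&= \frac{1}{2}\bigl(n^2 \lg n - n \lg n\bigr) - \frac{1}{8}n^2 + \frac{n}{4}
  && \text{(multiply it all out)}
\end{aligned}
$$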
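Tightly Bounding The Key Summation
$$
\begin{aligned}
\sum_{k=1}^{n-1} k \lg k
&\le \frac{1}{2}\bigl(n^2 \lg n - n \lg n\bigr) - \frac{1}{8}n^2 + \frac{n}{4} \\
&\le \frac{1}{2}n^2 \lg n - \frac{1}{8}n^2 \quad \text{when } n \ge 2
\end{aligned}
$$
Done!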
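The final bound on the key summation can be sanity-checked numerically. The sketch below simply evaluates both sides for a range of n; it supports the derivation but does not replace it. The function names are chosen for this illustration.

```python
import math


def key_sum(n):
    """Left-hand side: sum of k * lg k for k = 1 .. n-1."""
    return sum(k * math.log2(k) for k in range(1, n))


def bound(n):
    """Right-hand side: (1/2) n^2 lg n - (1/8) n^2."""
    return 0.5 * n * n * math.log2(n) - n * n / 8.0


for n in (2, 10, 100, 1000):
    print(n, key_sum(n), bound(n))
```

The left-hand side stays below the bound for every n tried, starting from n = 2 as the last step of the derivation requires.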
