03 Sorting
03 Sorting
03 Sorting
Sorting
• Sorting is a process that organizes a collection of data
into either ascending or descending order.
• Formally
• Input: A sequence of n numbers <a1,a2,…,an>
• Output: A reordering <a’1,a’2,…,a’n> of the sequence such
that a’1 ≤ a’2 ≤ … ≤ a’n
• Given the input <6, 3, 1, 7>, the algorithm should produce
<1, 3, 6, 7>
• Called an instance of the problem
Sorting
• Sorting is a process that organizes a collection of data
into either ascending or descending order.
• We encounter sorting almost everywhere:
– Sorting prices from lowest to highest
– Sorting flights from earliest to latest
– Sorting grades from highest to lowest
– Sorting songs based on artist, album, playlist,
etc.
Sorting Algorithms
• There are many sorting algorithms (as of
27.10.2014 there are 44 Wikipedia entries)
• In this class we will learn:
– Selection Sort
– Insertion Sort
– Bubble Sort
– Merge Sort
– Quick Sort
– Bogo sort and Sleep sort are some bad/slow algorithms
• These are among the most fundamental sorting
algorithms
Sorting Algorithms
• As we learnt in the Analysis lecture (time
complexity), a stupid approach uses up
computing power faster than you might think.
• Sorting a million numbers:
• https://youtu.be/kPRA0W1kECg
Selection Sort
• Partition the input list into a sorted and unsorted
part (initially sorted part is empty)
• Select the smallest element and put it to the end of
the sorted part
• Increase the size of the sorted part by one
• Repeat this n-1 times to sort a list of n elements
Sorted Unsorted
23 78 45 8 32 56 Original List
After pass 1
8 78 45 23 32 56
After pass 2
8 23 45 78 32 56
After pass 3
8 23 32 78 45 56
After pass 4
8 23 32 45 78 56
After pass 5
8 23 32 45 56 78
Selection Sort (cont.)
template <class Item>
void selectionSort(Item a[], int n) {
for (int i = 0; i < n-1; i++) {
int min = i;
for (int j = i+1; j < n; j++)
if (a[j] < a[min]) min = j;
swap(a[i], a[min]);
}
}
23 78 45 8 32 56 Original List
After pass 1
23 78 45 8 32 56
After pass 2
23 45 78 8 32 56
After pass 3
8 23 45 78 32 56
After pass 4
8 23 32 45 78 56
After pass 5
8 23 32 45 56 78
Insertion Sort Algorithm
template <class Item>
void insertionSort(Item a[], int n)
{
for (int i = 1; i < n; i++)
{
Item tmp = a[i]; // the element to be inserted
int j;
for (j=i; j>0 && tmp < a[j-1]; j--)
a[j] = a[j-1]; // shift elements
a[j] = tmp; // insert
}
}
Insertion Sort – Analysis
• Running time depends on not only the size of the array but also
the contents of the array.
• Best-case: O(n)
– Array is already sorted in ascending order.
• Worst-case: O(n2)
– Array is in reverse order:
• Average-case: O(n2)
– We have to look at all possible initial data organizations.
Analysis of insertion sort
• Which running time will be used to characterize this
algorithm?
– Best, worst or average?
• Worst:
– Longest running time (this is the upper limit for the algorithm)
– It is guaranteed that the algorithm will not be worse than this.
• Sometimes we are interested in average case. But there are
some problems with the average case.
– It is difficult to figure out the average case. i.e. what is average
input?
– Are we going to assume all possible inputs are equally likely?
– In fact for most algorithms average case is same as the worst case.
Bubble Sort
• Repeatedly swap adjacent elements that are out of
order.
• https://youtu.be/vxENKlcs2Tw
insertionSortRec(A, n)
{
if (n>1)
insertionSortRec(A, n-1)
insertKeyIntoSubarray(A[n], A[1..n-1]) //trivial; can
//be done
in //O(n)
}
Recursive Insertion Sort
• To sort A[1..n], we recursively sort A[1..n-1] and then insert A[n]
into the sorted array A[1..n-1]
• A=524613
insertKeyIntoSubarray
12456 3
2456 1
245 6
25 4
5 2
Recursive Insertion Sort
• To sort A[1..n], we recursively sort A[1..n-1] and then insert A[n]
into the sorted array A[1..n-1]
• T(n) = T(n-1) + O(n) is the time complexity of this sorting
Recursive Insertion Sort
• To sort A[1..n], we recursively sort A[1..n-1] and then insert A[n]
into the sorted array A[1..n-1]
• T(n) = T(n-1) + O(n)
• Reduce the problem size to n-1 w/ O(n) extra work
• Seems to be doing O(n) work n-1 times; so O(n2) guess?
• T(n) ≤ cn2 //assume holds
T(n) ≤ c(n-1)2 + dn
= cn2 -2cn + c + dn
= cn2 –c(2n -1) + dn
≤ cn2 //because large values of c dominates d
Recurrences
• How about the complexity of T(n) = T(n/2) + O(1)
• Reduce the problem size to half w/ O(1) extra work
• Seems to be doing O(1) work logn times; so O(logn) guess?
• T(n) ≤ clogn //assume holds
T(n) ≤ c(log(n/2)) + d
= clogn – clog2 + d
= clogn – e //c can always selected to be > constant d
≤ clogn
Recurrences
• How about the complexity of T(n) = 2T(n/2) + O(n)
• Reduce the problem to 2 half-sized problems w/ n extra work
• Seems to be doing O(n) work logn times; so O(nlogn) guess?
• T(n) ≤ cnlogn //assume holds
T(n) ≤ 2c(n/2 log(n/2)) + dn
= cnlogn – cnlog2 + dn
= cnlogn – n(c’ + d) //c’ = clog2
≤ cnlogn
Recurrences
• How about the complexity of T(n) = T(n-1) + T(n-2)
• Reduce? the problem to a twice bigger one w/ no extra work
• T(n-1) + T(n-2) > 2T(n-2) n replaced by 2n-4 (doubled)
• Or, n-size problem replaced with 2n-3 size problem (doubled)
• Seems to be doubling the problem size; so O(2n) is good guess
Fast Algorithm
• Describe an algorithm that, given a set S of n integers and
another integer x, determines whether or not there exists 2
elements in S whose sum is exactly x
• O(n2) //brute force
Fast Algorithm
• Describe an algorithm that, given a set S of n integers and
another integer x, determines whether or not there exists 2
elements in S whose sum is exactly x
• O(nlogn) //sort followed by binary search
Mergesort
• Mergesort algorithm is one of two important divide-and-conquer
sorting algorithms (the other one is quicksort).
• It is a recursive algorithm.
– Divides the list into halves,
– Sort each halve separately (recursively), and
– Then merge the sorted halves into one sorted array.
• https://youtu.be/es2T6KY45cA
Mergesort - Example
6 3 9divide
15472
6 3 9 1 5 4 7 2
divide divide
6 3 9 1 5 4 7 2
3 6 1 9 4 5 2 7
merge merge
1 3 6 9 2 4 5 7
merge
12345679
Merge
const int MAX_SIZE = maximum-number-of-items-in-array;
void merge(DataType theArray[], int first, int mid, int last)
{
DataType tempArray[MAX_SIZE]; // temporary array
int first1 = first; // beginning of first subarray
int last1 = mid; // end of first subarray
int first2 = mid + 1; // beginning of second subarray
int last2 = last; // end of second subarray
int index = first1; // next available location in tempArray
• Arranging the array elements around the pivot p generates two smaller sorting
problems.
– sort the left section of the array, and sort the right section of the array.
– when these two smaller sorting problems are solved recursively, our bigger
sorting problem is solved.
Pivot Selection
• Which array item should be selected as pivot?
– Somehow we have to select a pivot, and we hope that we will
get a good partitioning.
– If the items in the array arranged randomly, we choose a pivot
randomly.
– We can choose the first or last element as the pivot (it may not
give a good partitioning).
– We can choose the middle element as the pivot
– We can use a combination of the above to select the pivot (in
each recursive call a different technique can be used)
Partitioning
int pivotIndex;
if (first < last) {
• Given unsorted a0, a1, a2, .., an, construct set of points in the
plane via ai=ai2 parabola as shown at left
• Every point must be on the convex hull ‘cos parabolas are convex
• Convex hull of this parabola = Ordered hull vertices = Sorting