Programming Assignment 4: Divide-and-Conquer
Revision: August 21, 2021
Introduction
In this programming assignment, you will practice implementing divide-and-conquer solutions.
Learning Outcomes
Upon completing this programming assignment you will be able to:
1. Apply the divide-and-conquer technique to solve various computational problems efficiently. This will
usually require you to design an algorithm that solves a problem by splitting it into several disjoint
subproblems, solving them recursively, and then combining their results to get an answer for the initial
problem.
2. Design and implement efficient algorithms for the following computational problems:
(a) searching sorted data for a key;
(b) finding a majority element in a sequence;
(c) improving the quick sort algorithm;
(d) checking how close a sequence is to being sorted;
(e) organizing a lottery;
(f) finding the closest pair of points.
Contents
1 Binary Search
2 Majority Element
3 Improving Quick Sort
4 Number of Inversions
5 Organizing a Lottery
6 Closest Points
1 Binary Search
Problem Introduction
In this problem, you will implement the binary search algorithm, which allows you to search
even huge lists very efficiently, provided that the list is sorted.
Problem Description
Task. The goal in this code problem is to implement the binary search algorithm.
Input Format. The first line of the input contains an integer 𝑛 and a sequence 𝑎0 < 𝑎1 < . . . < 𝑎𝑛−1
of 𝑛 pairwise distinct positive integers in increasing order. The next line contains an integer 𝑘 and 𝑘
positive integers 𝑏0 , 𝑏1 , . . . , 𝑏𝑘−1 .
Constraints. 1 ≤ 𝑛, 𝑘 ≤ 10⁴; 1 ≤ 𝑎𝑖 ≤ 10⁹ for all 0 ≤ 𝑖 < 𝑛; 1 ≤ 𝑏𝑗 ≤ 10⁹ for all 0 ≤ 𝑗 < 𝑘.
Output Format. For all 𝑖 from 0 to 𝑘 − 1, output an index 0 ≤ 𝑗 ≤ 𝑛 − 1 such that 𝑎𝑗 = 𝑏𝑖 or −1 if there
is no such index.
Sample 1.
Input:
5 1 5 8 12 13
5 8 1 23 1 11
Output:
2 0 -1 0 -1
In this sample, we are given an increasing sequence 𝑎0 = 1, 𝑎1 = 5, 𝑎2 = 8, 𝑎3 = 12, 𝑎4 = 13 of length
five and five keys to search: 8, 1, 23, 1, 11. We see that 𝑎2 = 8 and 𝑎0 = 1, but the keys 23 and 11 do
not appear in the sequence 𝑎. For this reason, we output a sequence 2, 0, −1, 0, −1.
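The search described above can be sketched as follows. This is a minimal iterative version, not the reference solution; the function name `binary_search` and the variable names are assumptions made for illustration.

```python
def binary_search(a, key):
    """Return an index j with a[j] == key, or -1 if key is absent.

    Assumes a is sorted in increasing order, as the input format guarantees.
    """
    low, high = 0, len(a) - 1
    while low <= high:
        mid = low + (high - low) // 2
        if a[mid] == key:
            return mid
        if a[mid] < key:
            low = mid + 1   # key can only be to the right of mid
        else:
            high = mid - 1  # key can only be to the left of mid
    return -1


# Data from Sample 1.
a = [1, 5, 8, 12, 13]
queries = [8, 1, 23, 1, 11]
print(*(binary_search(a, b) for b in queries))  # 2 0 -1 0 -1
```

Each query halves the search range at every step, so it takes 𝑂(log 𝑛) comparisons.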
Need Help?
Ask a question or see the questions asked by other learners at this forum thread.
2 Majority Element
Problem Introduction
Majority rule is a decision rule that selects the alternative which has a majority,
that is, more than half the votes.
Given a sequence of elements 𝑎1 , 𝑎2 , . . . , 𝑎𝑛 , you would like to check whether
it contains an element that appears more than 𝑛/2 times. A naive way to do
this is the following.
MajorityElement(𝑎1 , 𝑎2 , . . . , 𝑎𝑛 ):
  for 𝑖 from 1 to 𝑛:
    currentElement ← 𝑎𝑖
    count ← 0
    for 𝑗 from 1 to 𝑛:
      if 𝑎𝑗 = currentElement:
        count ← count + 1
    if count > 𝑛/2:
      return 𝑎𝑖
  return “no majority element”
The running time of this algorithm is quadratic. Your goal is to use the divide-and-conquer technique to
design an 𝑂(𝑛 log 𝑛) algorithm.
Problem Description
Task. The goal in this code problem is to check whether an input sequence contains a majority element.
Input Format. The first line contains an integer 𝑛, the next one contains a sequence of 𝑛 non-negative
integers 𝑎0 , 𝑎1 , . . . , 𝑎𝑛−1 .
Output Format. Output 1 if the sequence contains an element that appears strictly more than 𝑛/2 times,
and 0 otherwise.
Sample 1.
Input:
5
2 3 9 2 2
Output:
1
2 is the majority element.
Sample 2.
Input:
4
1 2 3 4
Output:
0
There is no majority element in this sequence.
Sample 3.
Input:
4
1 2 3 1
Output:
0
This sequence also does not have a majority element (note that the element 1 appears twice and hence
is not a majority element).
What To Do
As you might have already guessed, this problem can be solved by the divide-and-conquer algorithm in time
𝑂(𝑛 log 𝑛). Indeed, if a sequence of length 𝑛 contains a majority element, then the same element is also
a majority element for one of its halves. Thus, to solve this problem you first split a given sequence into
halves and make two recursive calls. Do you see how to combine the results of two recursive calls?
It is interesting to note that this problem can also be solved in 𝑂(𝑛) time by a more advanced
(non-divide-and-conquer) algorithm that just scans the given sequence twice.
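One possible way to combine the two recursive calls is sketched below. It relies on the observation above: a majority element of the whole range must be the majority element of at least one half, so only the (at most two) candidates returned by the halves need to be counted. The names `majority` and `has_majority` are assumptions for illustration, and this is one sketch rather than the intended solution.

```python
def majority(a, left, right):
    """Return the majority element of a[left:right], or None if there is none."""
    if right - left == 1:
        return a[left]
    mid = (left + right) // 2
    # Only the majority elements of the halves can be majority elements here.
    candidates = {majority(a, left, mid), majority(a, mid, right)}
    for c in candidates:
        if c is None:
            continue
        count = sum(1 for i in range(left, right) if a[i] == c)
        # For integers, count > length // 2 is equivalent to count > length / 2.
        if count > (right - left) // 2:
            return c
    return None


def has_majority(a):
    """Return 1 if a contains a majority element, 0 otherwise."""
    return 1 if a and majority(a, 0, len(a)) is not None else 0


print(has_majority([2, 3, 9, 2, 2]))  # 1
print(has_majority([1, 2, 3, 1]))     # 0
```

Each level of the recursion does linear work for counting, which gives the 𝑂(𝑛 log 𝑛) bound.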
Need Help?
Ask a question or see the questions asked by other learners at this forum thread.
3 Improving Quick Sort
Problem Introduction
The goal in this problem is to redesign a given implementation of the randomized
quick sort algorithm so that it works fast even on sequences containing
many equal elements.
Problem Description
Task. To force the given implementation of the quick sort algorithm to efficiently process sequences with
few unique elements, your goal is to replace the 2-way partition with a 3-way partition. That is, your new
partition procedure should partition the array into three parts: < 𝑥 part, = 𝑥 part, and > 𝑥 part.
Input Format. The first line of the input contains an integer 𝑛. The next line contains a sequence of 𝑛
integers 𝑎0 , 𝑎1 , . . . , 𝑎𝑛−1 .
Constraints. 1 ≤ 𝑛 ≤ 10⁵; 1 ≤ 𝑎𝑖 ≤ 10⁹ for all 0 ≤ 𝑖 < 𝑛.
Output Format. Output this sequence sorted in non-decreasing order.
Sample 1.
Input:
5
2 3 9 2 2
Output:
2 2 2 3 9
Starter Files
In the starter files, you are given an implementation of the randomized quick sort algorithm using a 2-way
partition procedure. This procedure partitions the given array into two parts with respect to a pivot 𝑥: ≤ 𝑥
part and > 𝑥 part. As discussed in the video lectures, such an implementation has Θ(𝑛²) running time on
sequences containing a single unique element. Indeed, the partition procedure in this case splits the array
into two parts, one of which is empty and the other one contains 𝑛 − 1 elements. It spends 𝑐𝑛 time on this.
The overall running time is then
𝑐𝑛 + 𝑐(𝑛 − 1) + 𝑐(𝑛 − 2) + . . . + 𝑐 = Θ(𝑛²).
What To Do
Implement a 3-way partition procedure and then replace a call to the 2-way partition procedure by a call to
the 3-way partition procedure.
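A 3-way partition can be sketched as follows. The starter files are not reproduced in this document, so the function names, the choice of the leftmost element as the pivot position, and the loop that recurses only into the smaller part are all assumptions of this sketch rather than the structure of the provided code.

```python
import random


def partition3(a, left, right):
    """Partition a[left..right] around the pivot x = a[left].

    Returns (lt, gt) such that a[left..lt-1] < x, a[lt..gt] == x,
    and a[gt+1..right] > x.
    """
    x = a[left]
    lt, i, gt = left, left + 1, right
    while i <= gt:
        if a[i] < x:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > x:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1  # the element swapped in is unexamined, so i stays put
        else:
            i += 1
    return lt, gt


def randomized_quick_sort(a, left, right):
    """Sort a[left..right] in place."""
    while left < right:
        k = random.randint(left, right)
        a[left], a[k] = a[k], a[left]
        lt, gt = partition3(a, left, right)
        # Recurse into the smaller part and loop on the larger one,
        # which keeps the recursion depth logarithmic.
        if lt - left < right - gt:
            randomized_quick_sort(a, left, lt - 1)
            left = gt + 1
        else:
            randomized_quick_sort(a, gt + 1, right)
            right = lt - 1


data = [2, 3, 9, 2, 2]
randomized_quick_sort(data, 0, len(data) - 1)
print(*data)  # 2 2 2 3 9
```

On a sequence with a single unique element, the = 𝑥 part covers the whole array after one partition, so the sort finishes in linear time.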
Need Help?
Ask a question or see the questions asked by other learners at this forum thread.
4 Number of Inversions
Problem Introduction
An inversion of a sequence 𝑎0 , 𝑎1 , . . . , 𝑎𝑛−1 is a pair of indices 0 ≤ 𝑖 < 𝑗 < 𝑛 such
that 𝑎𝑖 > 𝑎𝑗 . The number of inversions of a sequence in some sense measures how
close the sequence is to being sorted. For example, a sorted (in non-descending
order) sequence contains no inversions at all, while in a sequence sorted in
descending order any two elements constitute an inversion (for a total of 𝑛(𝑛 − 1)/2
inversions).
Problem Description
Task. The goal in this problem is to count the number of inversions of a given sequence.
Input Format. The first line contains an integer 𝑛, the next one contains a sequence of integers
𝑎0 , 𝑎1 , . . . , 𝑎𝑛−1 .
Sample 1.
Input:
5
2 3 9 2 9
Output:
2
The two inversions here are (1, 3) (𝑎1 = 3 > 2 = 𝑎3 ) and (2, 3) (𝑎2 = 9 > 2 = 𝑎3 ).
What To Do
This problem can be solved by modifying the merge sort algorithm. For this, we change both the Merge and
MergeSort procedures as follows:
• Merge(𝐵, 𝐶) returns the resulting sorted array and the number of pairs (𝑏, 𝑐) such that 𝑏 ∈ 𝐵, 𝑐 ∈ 𝐶,
and 𝑏 > 𝑐;
• MergeSort(𝐴) returns a sorted array 𝐴 and the number of inversions in 𝐴.
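The two modified procedures can be sketched as follows. The lowercase names `merge` and `merge_sort` mirror Merge and MergeSort above; returning a `(sorted_list, count)` tuple is an assumption of this sketch.

```python
def merge(b, c):
    """Merge sorted lists b and c.

    Returns the merged list and the number of pairs (x, y)
    with x in b, y in c, and x > y.
    """
    result, inversions = [], 0
    i = j = 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            result.append(b[i])
            i += 1
        else:
            # c[j] is smaller than every remaining element of b.
            inversions += len(b) - i
            result.append(c[j])
            j += 1
    result.extend(b[i:])
    result.extend(c[j:])
    return result, inversions


def merge_sort(a):
    """Return a sorted copy of a and the number of inversions in a."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    b, inv_b = merge_sort(a[:mid])
    c, inv_c = merge_sort(a[mid:])
    merged, inv_split = merge(b, c)
    return merged, inv_b + inv_c + inv_split


print(merge_sort([2, 3, 9, 2, 9])[1])  # 2
```

Using `<=` in the comparison matters: equal elements do not form an inversion, so they must be taken from 𝐵 first.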
Need Help?
Ask a question or see the questions asked by other learners at this forum thread.
5 Organizing a Lottery
Problem Introduction
You are organizing an online lottery. To participate, a person bets on a single
integer. You then draw several ranges of consecutive integers at random.
A participant’s payoff then is proportional to the number of ranges that
contain the participant’s number minus the number of ranges that do not
contain it. You need an efficient algorithm for computing the payoffs for all
participants. A naive way to do this is to simply scan, for all participants, the
list of all ranges. However, your lottery is very popular: you have thousands
of participants and thousands of ranges. For this reason, you cannot afford
a slow naive algorithm.
Problem Description
Task. You are given a set of points on a line and a set of segments on a line. The goal is to compute, for
each point, the number of segments that contain this point.
Input Format. The first line contains two non-negative integers 𝑠 and 𝑝 defining the number of segments
and the number of points on a line, respectively. Each of the next 𝑠 lines contains two integers 𝑎𝑖 , 𝑏𝑖
defining the 𝑖-th segment [𝑎𝑖 , 𝑏𝑖 ]. The next line contains 𝑝 integers defining points 𝑥0 , 𝑥1 , . . . , 𝑥𝑝−1 .
Constraints. 1 ≤ 𝑠, 𝑝 ≤ 50000; −10⁸ ≤ 𝑎𝑖 ≤ 𝑏𝑖 ≤ 10⁸ for all 0 ≤ 𝑖 < 𝑠; −10⁸ ≤ 𝑥𝑗 ≤ 10⁸ for all 0 ≤ 𝑗 < 𝑝.
Output Format. Output 𝑝 non-negative integers 𝑘0 , 𝑘1 , . . . , 𝑘𝑝−1 where 𝑘𝑖 is the number of segments which
contain 𝑥𝑖 . More formally,
𝑘𝑖 = |{𝑗 : 𝑎𝑗 ≤ 𝑥𝑖 ≤ 𝑏𝑗 }| .
Sample 1.
Input:
2 3
0 5
7 10
1 6 11
Output:
1 0 0
Here, we have two segments and three points. The first point lies only in the first segment while the
remaining two points are outside of all the given segments.
Sample 2.
Input:
1 3
-10 10
-100 100 0
Output:
0 0 1
Sample 3.
Input:
3 2
0 5
-3 2
7 10
1 6
Output:
2 0
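The handout does not prescribe a method for this problem. One possible approach, sketched below under that caveat, sorts the segment starts and ends separately and uses binary search: the number of segments containing 𝑥 equals the number of starts that are ≤ 𝑥 minus the number of ends that are < 𝑥. The function name `count_segments` is an assumption for illustration.

```python
from bisect import bisect_left, bisect_right


def count_segments(segments, points):
    """For each point x, count the segments [a, b] with a <= x <= b."""
    starts = sorted(a for a, _ in segments)
    ends = sorted(b for _, b in segments)
    result = []
    for x in points:
        started = bisect_right(starts, x)  # segments with a <= x
        finished = bisect_left(ends, x)    # segments with b < x
        result.append(started - finished)
    return result


# Data from Sample 3.
print(*count_segments([(0, 5), (-3, 2), (7, 10)], [1, 6]))  # 2 0
```

Sorting costs 𝑂(𝑠 log 𝑠) and each point costs 𝑂(log 𝑠), compared with 𝑂(𝑠 · 𝑝) for the naive scan.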
Need Help?
Ask a question or see the questions asked by other learners at this forum thread.
6 Closest Points
Problem Introduction
In this problem, your goal is to find the closest pair of points among the given 𝑛
points. This is a basic primitive in computational geometry with applications in,
for example, graphics, computer vision, and traffic-control systems.
Problem Description
Task. Given 𝑛 points on a plane, find the smallest distance between a pair of two (different) points. Recall
that the distance between points (𝑥1 , 𝑦1 ) and (𝑥2 , 𝑦2 ) is equal to √((𝑥1 − 𝑥2 )² + (𝑦1 − 𝑦2 )²).
Input Format. The first line contains the number 𝑛 of points. Each of the following 𝑛 lines defines a point
(𝑥𝑖 , 𝑦𝑖 ).
Constraints. 2 ≤ 𝑛 ≤ 10⁵; −10⁹ ≤ 𝑥𝑖 , 𝑦𝑖 ≤ 10⁹ are integers.
Output Format. Output the minimum distance. The absolute value of the difference between the answer
of your program and the optimal value should be at most 10⁻³. To ensure this, output your answer
with at least four digits after the decimal point (otherwise your answer, while being computed correctly,
can turn out to be wrong because of rounding issues).
Sample 1.
Input:
2
0 0
3 4
Output:
5.0
There are only two points here. The distance between them is 5.
Sample 2.
Input:
4
7 7
1 100
4 8
7 7
Output:
0.0
There are two coinciding points among the four given points. Thus, the minimum distance is zero.
Sample 3.
Input:
11
4 4
-2 -2
-3 -4
-1 3
2 3
-4 0
1 1
-1 -1
3 -1
-4 2
-2 4
Output:
1.414213
The smallest distance is √2. There are two pairs of points at this distance: (−1, −1) and (−2, −2);
(−2, 4) and (−1, 3).
What To Do
This computational geometry problem has many applications in computer graphics and vision. A naive
algorithm with quadratic running time iterates through all pairs of points to find the closest pair. Your goal
is to design an 𝑂(𝑛 log 𝑛) time divide-and-conquer algorithm.
To solve this problem in time 𝑂(𝑛 log 𝑛), let’s first split the given 𝑛 points by an appropriately chosen
vertical line into two halves 𝑆1 and 𝑆2 of size 𝑛/2 (assume for simplicity that all 𝑥-coordinates of the input
points are different). By making two recursive calls for the sets 𝑆1 and 𝑆2 , we find the minimum distances
𝑑1 and 𝑑2 in these subsets. Let 𝑑 = min{𝑑1 , 𝑑2 }.
[Figure: the points split by a vertical line into two halves, with the minimum distances 𝑑1 and 𝑑2 marked.]
It remains to check whether there exist points 𝑝1 ∈ 𝑆1 and 𝑝2 ∈ 𝑆2 such that the distance between them
is smaller than 𝑑. We cannot afford to check all possible such pairs since there are 𝑛/2 · 𝑛/2 = Θ(𝑛²) of them.
To check this faster, we first discard all points from 𝑆1 and 𝑆2 whose 𝑥-distance to the middle line is greater
than 𝑑. That is, we focus on the following strip:
[Figure: the strip of points whose 𝑥-distance to the middle line is at most 𝑑.]
Stop and think: Why can we narrow the search to this strip?
Now, let’s sort the points of the strip by their 𝑦-coordinates and denote the resulting sorted list by
𝑃 = [𝑝1 , . . . , 𝑝𝑘 ]. It turns out that if |𝑖 − 𝑗| > 7, then the distance between points 𝑝𝑖 and 𝑝𝑗 is
greater than 𝑑 for sure. This follows from the Exercise Break below.
Exercise break: Partition the strip into 𝑑 × 𝑑 squares as shown below and show that each such square
contains at most four input points.
[Figure: the strip partitioned into 𝑑 × 𝑑 squares.]
This results in the following algorithm. We first sort the given 𝑛 points by their 𝑥-coordinates and then
split the resulting sorted list into two halves 𝑆1 and 𝑆2 of size 𝑛/2. By making a recursive call for each of the
sets 𝑆1 and 𝑆2 , we find the minimum distances 𝑑1 and 𝑑2 in them. Let 𝑑 = min{𝑑1 , 𝑑2 }. However, we are not
done yet as we also need to find the minimum distance between points from different sets (i.e., a point from
𝑆1 and a point from 𝑆2 ) and check whether it is smaller than 𝑑. To perform such a check, we filter the initial
point set and keep only those points whose 𝑥-distance to the middle line does not exceed 𝑑. Afterwards, we
sort the set of points in the resulting strip by their 𝑦-coordinates and scan the resulting list of points. For
each point, we compute its distance to the seven subsequent points in this list and compute 𝑑′ , the minimum
distance that we encountered during this scan. Afterwards, we return min{𝑑, 𝑑′ }.
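The algorithm just described can be sketched as follows. This version sorts the strip at every recursive call. The function names, the brute-force base case for three or fewer points, and the use of the first point of the right half as the middle line are assumptions of this sketch. It uses `math.dist`, which requires Python 3.8 or later.

```python
import math


def closest_in_sorted(points):
    """Return the minimum pairwise distance; points must be sorted by x."""
    n = len(points)
    if n <= 3:
        # Base case (assumed): compare all pairs directly.
        return min(math.dist(points[i], points[j])
                   for i in range(n) for j in range(i + 1, n))
    mid = n // 2
    mid_x = points[mid][0]  # x-coordinate of the middle line
    d = min(closest_in_sorted(points[:mid]), closest_in_sorted(points[mid:]))
    # Keep only points whose x-distance to the middle line does not exceed d,
    # sorted by y-coordinate.
    strip = sorted((p for p in points if abs(p[0] - mid_x) <= d),
                   key=lambda p: p[1])
    for i, p in enumerate(strip):
        # Compare each point with the seven subsequent points in the strip.
        for q in strip[i + 1:i + 8]:
            d = min(d, math.dist(p, q))
    return d


def minimum_distance(points):
    """Return the smallest distance between two of the given points (n >= 2)."""
    return closest_in_sorted(sorted(points))


# Data from Sample 1, printed with four digits after the decimal point.
print(f"{minimum_distance([(0, 0), (3, 4)]):.4f}")  # 5.0000
```

Because the list is split by index after sorting, every subproblem contains at least two points, so the base case never sees an empty set of pairs.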
The running time of the algorithm satisfies the recurrence relation
𝑇(𝑛) = 2 · 𝑇(𝑛/2) + 𝑂(𝑛 log 𝑛).
The 𝑂(𝑛 log 𝑛) term comes from sorting the points in the strip by their 𝑦-coordinates at every recursive call.
Exercise break: Prove that 𝑇(𝑛) = 𝑂(𝑛 log² 𝑛) by analyzing the recursion tree of the algorithm.
Exercise break: Show how to bring the running time down to 𝑂(𝑛 log 𝑛) by avoiding sorting at each
recursive call.
Need Help?
Ask a question or see the questions asked by other learners at this forum thread.