0% found this document useful (0 votes)
6 views9 pages

Bubble Sort

Download as pdf or txt
Download as pdf or txt
Download as pdf or txt
You are on page 1/ 9

CHAPTER 4

Speeding Up Your Code with Big O


Big O Notation is a great tool for comparing competing algorithms, as it gives
an objective way to measure them. We’ve already been able to use it to
quantify the difference between binary search vs. linear search, as binary
search is O(log N)—a much faster algorithm than linear search, which is O(N).

However, there may not always be two clear alternatives when writing everyday
code. Like most programmers, you probably use whatever approach pops into
your head first. With Big O, you have the opportunity to compare your algo-
rithm to general algorithms out there in the world, and you can say to yourself,
“Is this a fast or slow algorithm as far as algorithms generally go?”

If you find that Big O labels your algorithm as a “slow” one, you can now take
a step back and try to figure out if there’s a way to optimize it by trying to get
it to fall under a faster category of Big O. This may not always be possible, of
course, but it’s certainly worth thinking about before concluding that it’s not.

In this chapter, we’ll write some code to solve a practical problem, and then
measure our algorithm using Big O. We’ll then see if we might be able to modify
the algorithm in order to give it a nice efficiency bump. (Spoiler: we will.)

Bubble Sort
Before jumping into our practical problem, though, we need to first learn about
a new category of algorithmic efficiency in the world of Big O. To demonstrate
it, we’ll get to use one of the classic algorithms of computer science lore.

Sorting algorithms have been the subject of extensive research in computer


science, and tens of such algorithms have been developed over the years.
They all solve the following problem:

Given an array of unsorted numbers, how can we sort them so that they end
up in ascending order?
In this and the following chapters, we’re going to encounter a number of these
sorting algorithms. Some of the first ones we’ll be learning about are known
as “simple sorts,” in that they are easier to understand, but are not as efficient
as some of the faster sorting algorithms out there.

Bubble Sort is a very basic sorting algorithm, and follows these steps:

1. Point to two consecutive items in the array. (Initially, we start at the very
beginning of the array and point to its first two items.) Compare the first
item with the second one:

2. If the two items are out of order (in other words, the left value is greater
than the right value), swap them:

(If they already happen to be in the correct order, do nothing for this step.)

3. Move the “pointers” one cell to the right:

Repeat steps 1 and 2 until we reach the end of the array or any items
that have already been sorted.

4. Repeat steps 1 through 3 until we have a round in which we didn’t have


to make any swaps. This means that the array is in order.

Each time we repeat steps 1 through 3 is known as a passthrough. That


is, we “passed through” the primary steps of the algorithm, and will repeat
the same process until the array is fully sorted.

Bubble Sort in Action


Let’s walk through a complete example. Assume that we wanted to sort the
array [4, 2, 7, 1, 3]. It’s currently out of order, and we want to produce an array
containing the same values in the correct, ascending order.
Let’s begin Passthrough #1:

This is our starting array:

Step #1: First, we compare the 4 and the 2. They’re out of order:

Step #2: So we swap them:

Step #3: Next, we compare the 4 and the 7:

They’re in the correct order, so we don’t need to perform any swaps.

Step #4: We now compare the 7 and the 1:

Step #5: They’re out of order, so we swap them:

Step #6: We compare the 7 and the 3:


Step #7: They’re out of order, so we swap them:

We now know for a fact that the 7 is in its correct position within the array,
because we kept moving it along to the right until it reached its proper place.
We’ve put little lines surrounding it to indicate this fact.

This is actually the reason that this algorithm is called Bubble Sort: in each
passthrough, the highest unsorted value “bubbles” up to its correct position.

Since we made at least one swap during this passthrough, we need to conduct
another one.

We begin Passthrough #2:

The 7 is already in the correct position:

Step #8: We begin by comparing the 2 and the 4:

They’re in the correct order, so we can move on.

Step #9: We compare the 4 and the 1:

Step #10: They’re out of order, so we swap them:


Step #11: We compare the 4 and the 3:

Step #12: They’re out of order, so we swap them:

We don’t have to compare the 4 and the 7 because we know that the 7 is
already in its correct position from the previous passthrough. And now we
also know that the 4 is bubbled up to its correct position as well. This con-
cludes our second passthrough. Since we made at least one swap during this
passthrough, we need to conduct another one.

We now begin Passthrough #3:

Step #13: We compare the 2 and the 1:

Step #14: They’re out of order, so we swap them:

Step #15: We compare the 2 and the 3:

They’re in the correct order, so we don’t need to swap them.

We now know that the 3 has bubbled up to its correct spot. Since we made
at least one swap during this passthrough, we need to perform another one.
And so begins Passthrough #4:

Step #16: We compare the 1 and the 2:

Since they’re in order, we don’t need to swap. We can end this passthrough,
since all the remaining values are already correctly sorted.

Now that we’ve made a passthrough that didn’t require any swaps, we know
that our array is completely sorted:

Bubble Sort Implemented


Here’s an implementation of Bubble Sort in Python:
def bubble_sort(list):
unsorted_until_index = len(list) - 1
sorted = False

while not sorted:


sorted = True
for i in range(unsorted_until_index):
if list[i] > list[i+1]:
sorted = False
list[i], list[i+1] = list[i+1], list[i]
unsorted_until_index = unsorted_until_index - 1

list = [65, 55, 45, 35, 25, 15, 10]


bubble_sort(list)
print list

Let’s break this down line by line. We’ll first present the line of code, followed
by its explanation.
unsorted_until_index = len(list) - 1

We keep track of up to which index is still unsorted with the unsorted_until_index


variable. At the beginning, the array is totally unsorted, so we initialize this
variable to be the final index in the array.
sorted = False
We also create a sorted variable that will allow us to keep track whether the
array is fully sorted. Of course, when our code first runs, it isn’t.
while not sorted:
sorted = True

We begin a while loop that will last as long as the array is not sorted. Next, we
preliminarily establish sorted to be True. We’ll change this back to False as soon
as we have to make any swaps. If we get through an entire passthrough
without having to make any swaps, we’ll know that the array is completely
sorted.
for i in range(unsorted_until_index):
if list[i] > list[i+1]:
sorted = False
list[i], list[i+1] = list[i+1], list[i]

Within the while loop, we begin a for loop that starts from the beginning of the
array and goes until the index that has not yet been sorted. Within this loop,
we compare every pair of adjacent values, and swap them if they’re out of
order. We also change sorted to False if we have to make a swap.
unsorted_until_index = unsorted_until_index - 1

By this line of code, we’ve completed another passthrough, and can safely
assume that the value we’ve bubbled up to the right is now in its correct
position. Because of this, we decrement the unsorted_until_index by 1, since the
index it was already pointing to is now sorted.

Each round of the while loop represents another passthrough, and we run it
until we know that our array is fully sorted.

The Efficiency of Bubble Sort


The Bubble Sort algorithm contains two kinds of steps:

• Comparisons: two numbers are compared with one another to determine


which is greater.

• Swaps: two numbers are swapped with one another in order to sort them.

Let’s start by determining how many comparisons take place in Bubble Sort.

Our example array has five elements. Looking back, you can see that in our
first passthrough, we had to make four comparisons between sets of two
numbers.
In our second passthrough, we had to make only three comparisons. This is
because we didn’t have to compare the final two numbers, since we knew
that the final number was in the correct spot due to the first passthrough.

In our third passthrough, we made two comparisons, and in our fourth


passthrough, we made just one comparison.

So, that’s:

4 + 3 + 2 + 1 = 10 comparisons.

To put it more generally, we’d say that for N elements, we make

(N - 1) + (N - 2) + (N - 3) … + 1 comparisons.

Now that we’ve analyzed the number of comparisons that take place in Bubble
Sort, let’s analyze the swaps.

In a worst-case scenario, where the array is not just randomly shuffled, but
sorted in descending order (the exact opposite of what we want), we’d actually
need a swap for each comparison. So we’d have ten comparisons and ten
swaps in such a scenario for a grand total of twenty steps.

So let’s look at the complete picture. With an array containing ten elements
in reverse order, we’d have:

9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 = 45 comparisons, and another forty-five


swaps. That’s a total of ninety steps.

With an array containing twenty elements, we’d have:

19 + 18 + 17 + 16 + 15 + 14 + 13 + 12 + 11 + 10 + 9 + 8 + 7 + 6 + 5 + 4 + 3
+ 2 + 1 = 190 comparisons, and approximately 190 swaps, for a total of 380
steps.

Notice the inefficiency here. As the number of elements increase, the


number of steps grows exponentially. We can see this clearly with the fol-
lowing table:

N data elements Max # of steps


5 20
10 90
20 380
40 1560
80 6320
If you look precisely at the growth of steps as N increases, you’ll see that it’s
growing by approximately N2.

N data elements # of Bubble Sort steps N2


5 20 25
10 90 100
20 380 400
40 1560 1600
80 6320 6400

Therefore, in Big O Notation, we would say that Bubble Sort has an efficiency
of O(N2).

Said more officially: in an O(N2) algorithm, for N data elements, there are
roughly N2 steps.

O(N2) is considered to be a relatively inefficient algorithm, since as the data


increases, the steps increase dramatically. Look at this graph:

O(n2)
Number of Steps

O(n)

Number of Elements

Note how O(N2) curves sharply upwards in terms of number of steps as the data
grows. Compare this with O(N), which plots along a simple, diagonal line.

One last note: O(N2) is also referred to as quadratic time.

A Quadratic Problem
Let’s say you’re writing a JavaScript application that requires you to check
whether an array contains any duplicate values.

One of the first approaches that may come to mind is the use of nested for
loops, as follows:

You might also like