Pydon'ts: Write Beautiful Python Code
11-04-2021
Contents
Foreword
Deep unpacking
Zip up
Enumerate me
Foreword
This book is available at https://mathspp.com/blog/pydonts, where you can read
its contents for free.
If you would like to share your feedback, let me know of any typos or errors you
found, or otherwise just want to get in touch, feel free to drop a line to
rodrigo@mathspp.com.
— Rodrigo, https://mathspp.com
Unlike style guides, which tell you how you should format your code, how to name
your variables, etc., the Zen of Python provides the guidelines that you should
follow when thinking about (Python) code and when designing a program.
Zen of Python
You can read the Zen of Python by executing import this in your REPL, which
should print the following text:
The Zen of Python, by Tim Peters
References
• PEP 20 – The Zen of Python, https://www.python.org/dev/peps/pep-0020/
• “The Way of Python” mailing thread, https://groups.google.com/g/comp.lang.python/c/B_VxeTBClM0/m/L8W9KlsiriUJ
• Tim Peters (software engineer), Wikipedia, https://en.wikipedia.org/wiki/Tim_Peters_(software_engineer)
Introduction
In this Pydon’t we will go over the chaining of comparison operators:
• how they work;
• useful usages; and
• weird cases to avoid.
Pitfalls
Even though this feature looks very sensible, there are a couple of pitfalls you
have to look out for.
Non-transitive operators
We saw above that we can use a == b == c to check if a, b and c are all the
same. How would you check if they are all different?
If you thought of a != b != c, then you just fell into the first pitfall!
Look at this code:
>>> a = c = 1
>>> b = 2
>>> if a != b != c:
...     print("a, b, and c all different:", a, b, c)
...
a, b, and c all different: 1 2 1
The problem here is that a != b != c is a != b and b != c, which checks
that b is different from a and from c, but says nothing about how a and c relate.
From the mathematical point of view, != isn’t transitive, i.e., knowing how a
relates to b and knowing how b relates to c doesn’t tell you how a relates to c.
As for a transitive example, you can take ==: if a == b and b == c then it is also
true that a == c.
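As a quick sketch of the pitfall, alongside two checks that actually do what was intended (using the values from the session above):

```python
a = c = 1
b = 2

# The chain a != b != c means a != b and b != c: it never compares a and c.
print(a != b != c)  # True, even though a == c

# To check that a, b, and c are all pairwise different, compare every pair:
print(a != b and b != c and a != c)  # False

# Or, for hashable values, count the distinct elements:
print(len({a, b, c}) == 3)  # False
```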
Ugly chains
This feature looks really natural, but some particular cases aren’t so great. This
is a fairly subjective matter, but I personally don’t love chains where the
operators aren’t “aligned”, so chains like
• a == b == c
• a < b <= c
• a <= b < c
look really good, but in my opinion chains like
• a < b > c
• a <= b > c
• a < b >= c
don’t look that good. One can argue, for example, that a < b > c reads nicely
as “check if b is larger than both a and c”, but you could also write max(a, c)
< b or b > max(a, c).
There are also other chains that are just plain confusing, and those are best avoided.
Examples in code
Inequality chain
Having a simple utility function that ensures that a given value is between two
bounds becomes really simple, e.g.
def ensure_within(value, bounds):
    return bounds[0] <= value <= bounds[1]
or if you want to be a little bit more explicit, while also ensuring bounds is a
vector with exactly two items, you can also write
def ensure_within(value, bounds):
    m, M = bounds
    return m <= value <= M
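A quick sanity check of this helper (repeating the definition so the snippet is self-contained):

```python
def ensure_within(value, bounds):
    # Unpack the bounds, implicitly checking that there are exactly two.
    m, M = bounds
    return m <= value <= M

print(ensure_within(5, (0, 10)))   # True
print(ensure_within(-3, (0, 10)))  # False
print(ensure_within(0, (0, 10)))   # True: the bounds are inclusive
```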
Equality chain
Straight from Python’s enum module, we can find a helper function (that is not
exposed to the user), that reads as follows:
def _is_dunder(name):
    """Returns True if a __dunder__ name, False otherwise."""
    return (len(name) > 4 and
            name[:2] == name[-2:] == '__' and
            name[2] != '_' and
            name[-3] != '_')
Conclusion
Here’s the main takeaway of this article, for you, on a silver platter:
“Chaining comparison operators feels so natural, you don’t even notice
it is a feature. However, some chains might throw you off if you
overlook them.”
This Pydon’t showed you that:
• you can chain comparisons, and do so arbitrarily many times;
• chains with expressions that have side-effects or with non-deterministic
outputs are not equivalent to the extended version; and
• some chains using is or in can look really misleading.
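As a small illustration of that last point, here is a sketch of an in chain that parses as two comparisons joined by and, not the way it reads:

```python
# This chain means (1 in [1, 2]) and ([1, 2] == True),
# not (1 in [1, 2]) == True:
print(1 in [1, 2] == True)    # False
print((1 in [1, 2]) == True)  # True
```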
References
• Python 3 Documentation, The Python Language Reference, https://docs.python.org/3/reference/expressions.html#comparisons;
• Python 3 Documentation, The Python Standard Library, enum, https://docs.python.org/3/library/enum.html;
• Reddit, comment on “If they did make a python 4, what changes from python 3 would you like to see?”, https://www.reddit.com/r/Python/comments/ltaf3y/if_they_did_make_a_python_4_what_changes_from/gowuau5?utm_source=share&utm_medium=web2x&context=3.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'b' is an invalid keyword argument for print()
>>> print(b := 3)
3
>>> b
3
As shown in PEP 572, a good usage of assignment expressions can help write
better code: code that is clearer and/or runs faster.
Assignment expressions should be avoided when they make the code too convoluted,
even if they save you a couple of lines of code. You don’t want to disrespect
the Zen of Python, and the Zen of Python recommends writing readable code.
The snippet of code below features what is, in my opinion, a fairly unreadable
usage of an assignment expression:
import sys

if (i := input())[0] == "q" or i == "exit":
    sys.exit()

Compare it with this alternative:

import sys

i = input()
if i[0] == "q" or i == "exit":
    sys.exit()
The second alternative (without :=) is much easier to read than the first one,
even though using := saved one line of code.
However, good uses of assignment expressions can
• make your code faster,
• make it more readable/expressive, and
• make your code shorter.
Examples in code
Here are a couple of examples of good usages of assignment expressions.
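The original examples are not reproduced here, but a minimal sketch of the kind of usage that pays off is reusing an expensive computation inside a list comprehension (slow_double is a hypothetical stand-in for an expensive function):

```python
def slow_double(x):
    # Hypothetical stand-in for an expensive computation.
    return 2 * x

# Without the walrus operator, slow_double runs twice per kept element:
doubles = [slow_double(x) for x in range(5) if slow_double(x) > 4]

# With it, each result is computed once and bound to y on the fly:
doubles = [y for x in range(5) if (y := slow_double(x)) > 4]
print(doubles)  # [6, 8]
```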
Conclusion
Assignment expressions allow the binding of a name to a part of an expression,
which can be used to great benefit in clarifying the flow of some programs or
saving time on expensive computations, for example. Bad usages of assignment
expressions, however, can make code very unreadable, so it is crucial to judge
whether or not an assignment expression is a good fit for a particular task.
References
• Python 3 Documentation, What’s New in Python, What’s new in Python 3.8
- Assignment expressions, https://docs.python.org/3/whatsnew/3.8.htm
l#assignment-expressions.
• PEP 572 – Assignment Expressions, https://www.python.org/dev/peps/pe
p-0572.
• Real Python, “Assignment Expressions: The Walrus Operator”, https://real
python.com/lessons/assignment-expressions/.
Online references consulted on the 26th of January of 2021.
“Any object can be tested for truth value, for use in an if or while
condition or as operand of the Boolean operations below [or, and
and not].”
What does that mean? It means that we can use any Python object we want
whenever a boolean value is expected. Boolean values (True and False) are
used in conditions, which pop up in if statements and while statements, as
well as in expressions that make use of the Boolean operators or, and and not.
As a very basic example, consider this Python session:
>>> if True:
...     print("Hello, World!")
...
Hello, World!
>>> if False:
...     print("Go away!")
...
>>>
This piece of code should not surprise you, as it is very standard Python code:
there are a couple of if statements that make use of explicit Boolean values.
The next step is using an expression that evaluates to a Boolean value:
>>> 5 > 3
True
>>> if 5 > 3:
...     print("Hello, World!")
...
Hello, World!
The next step is using an object that is not a Boolean value, which is what this
blog post is all about:
>>> l = [1, 2, 3]
>>> if l:
...     print(l)
...
[1, 2, 3]
This is the part that could be surprising if you have never encountered it. The
reason this if statement is getting executed is because the list [1, 2, 3] is
Truthy, that is, the list [1, 2, 3] can be interpreted as True in a Boolean context.
How can you know if an object is “Truthy” or “Falsy”? The simplest way is to use
the built-in bool function that converts any Python object to a Boolean:
>>> bool(l)
True
The way this works is really simple! There are a couple of rules that specify the
Truthiness of each type of object, but these rules can be condensed into a simpler
rule of thumb: an object is Falsy when it represents an “empty” or “zero” value of
its type, and Truthy otherwise.
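For instance, applying the built-in bool to a handful of objects shows the pattern:

```python
# "Empty" or "zero" objects are Falsy...
print(bool(0), bool(0.0), bool(""), bool([]), bool({}), bool(None))
# False False False False False False

# ...and everything else is Truthy (note that [0] is a non-empty list):
print(bool(73), bool("hello"), bool([0]), bool(-1))
# True True True True
```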
Remarks
Now a couple of remarks about the functioning of Truthy and Falsy values.
Examples in code
Now I will show you some examples of places where using the Truthy and Falsy
values of Python objects allows you to write more Pythonic code.
2D point
Let us implement a simple class to represent points in a 2D plane, which could
be an image, a plot or something else. Retrieving what we already had in the
article about __str__ and __repr__, we can add a __bool__ method so that
the origin (the point Point2D(0, 0)) is Falsy and all other points are Truthy:
# Retrieved from https://mathspp.com/blog/pydonts/pydont-confuse-str-and-repr
class Point2D:
    """A class to represent points in a 2D space."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        """Provide a good-looking representation of the object."""
        return f"({self.x}, {self.y})"

    def __repr__(self):
        """Provide an unambiguous way of rebuilding this object."""
        return f"Point2D({repr(self.x)}, {repr(self.y)})"

    def __bool__(self):
        """The origin is Falsy and all other points are Truthy."""
        return bool(self.x or self.y)
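A quick check of this behaviour (using a stripped-down version of the class, with just the constructor and __bool__, so the snippet is self-contained):

```python
class Point2D:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __bool__(self):
        # The origin is Falsy; every other point is Truthy.
        return bool(self.x or self.y)

print(bool(Point2D(0, 0)))  # False
print(bool(Point2D(0, 3)))  # True
if not Point2D(0, 0):
    print("The origin behaves as Falsy in a condition.")
```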
Processing data
It is also very common to use Truthy and Falsy values to measure if there is still
data to be processed.
For example, when I talked about the walrus operator :=, we saw a while loop
vaguely similar to this one:
input_lines = []
while (s := input()):
    input_lines.append(s)
# No more lines to read.
print(len(input_lines))
This while loop essentially reads input lines while there are lines to be read. As
soon as the user inputs an empty line "", the loop stops and we print the number
of lines we read:
>>> print(len(input_lines))
2
Another common pattern is when you have a list that contains some data that
you have to process, and such that the list itself gets modified as you process
the data.
Consider the following example:
import pathlib
def print_file_sizes(dir):
    """Print file sizes in a directory, recursing into subdirectories."""
    paths_to_process = [dir]
    while paths_to_process:
        path, *paths_to_process = paths_to_process
        path_obj = pathlib.Path(path)
        if path_obj.is_file():
            print(path, path_obj.stat().st_size)
        else:
            paths_to_process += path_obj.glob("*")
This is not necessarily the way to go about doing this, but notice the while
statement, and then the if: ... else: ... block that either prints something,
or extends the paths_to_process list.
Conclusion
• Python’s Truthy and Falsy values allow you to rewrite common conditions
in a way that is more readable and, therefore, Pythonic.
• You can implement your own Truthy and Falsy values in custom classes by
implementing the __bool__ dunder method.
• You should also be careful when checking if a given variable is None or not,
and avoid using the Falsy value of None in those particular cases.
Introduction
In this Pydon’t we will go over deep unpacking:
• what it is;
• how it works;
• how to use it to improve code readability; and
• how to use it to help debug your code.
Learning about deep unpacking will be very helpful in order to pave the road
for structural matching, a feature to be introduced in Python 3.10.
Assignments
Before showing you how deep unpacking works, let’s have a quick look at two
other nice features about Python’s assignments.
Multiple assignment
In Python, multiple assignment is what allows you to write things like
>>> x = 3
>>> y = "hey"
>>> x, y = y, x # Multiple assignment to swap variables.
>>> x
'hey'
>>> y
3
or
>>> rgb_values = (45, 124, 183)
>>> r, g, b = rgb_values # Multiple assignment unpacks the tuple.
>>> g
124
With multiple assignment you can assign, well, multiple variables at the same
time, provided the right-hand side has as many items as the left-hand side ex-
pects.
Starred assignment
Starred assignment, that I covered in depth in this Pydon’t, allows you to write
things like
>>> l = [0, 1, 2, 3, 4]
>>> head, *body = l
>>> print(head)
0
>>> print(body)
[1, 2, 3, 4]
>>> *body, tail = l
>>> print(tail)
4
>>> head, *body, tail = l
>>> print(body)
[1, 2, 3]
With starred assignment you can tell Python that you are not sure how many
items the right-hand side will have, but all of them can be stored in a single
starred variable.
Deep unpacking
Deep unpacking, or nested unpacking, is similar to multiple assignment in a
sense. Multiple assignment allows you to match the length of an iterable, on
the right-hand side of an assignment, and get each element into a variable. In
a similar fashion, deep unpacking allows you to match the shape of what is on
the right-hand side of an assignment; in particular, if there are nested iterables,
you can unpack those iterables at once.
For example, using multiple assignment twice in a row, you could do this:
>>> colour_info = ("AliceBlue", (240, 248, 255))
>>> name, rgb_values = colour_info
>>> name
'AliceBlue'
>>> r, g, b = rgb_values
>>> g
248
But if you already know you want to get to the separate RGB values, you could
use deep unpacking:
>>> colour_info = ("AliceBlue", (240, 248, 255))
>>> name, (r, g, b) = colour_info
>>> name
'AliceBlue'
>>> g
248
Notice how we group the r, g, and b variables with () to create a tuple, mimicking
the shape of the colour_info variable. If we had simply written name, r, g, b
= colour_info then Python would think we are trying to do multiple assignment,
and would expect colour_info to have four items to unpack:
>>> colour_info = ("AliceBlue", (240, 248, 255))
>>> name, r, g, b = colour_info
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected 4, got 2)
Our use of parentheses in (r, g, b) tells Python we actually want to go into the
nested structure of colour_info.
This might be clearer if we actually include the outer set of parentheses that is
usually omitted:
>>> colour_info = ("AliceBlue", (240, 248, 255))
>>> (name, (r, g, b)) = colour_info
In loops
Deep unpacking can also be used in the implicit assignments of for loops; it
doesn’t have to be in explicit assignments with an equals sign! The examples
below will show you that.
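As a minimal sketch (with made-up colour data), deep unpacking in the implicit assignment of a for loop looks like this:

```python
colours = [
    ("AliceBlue", (240, 248, 255)),
    ("DarkCyan", (0, 139, 139)),
]

# Each tuple is unpacked into a name and its three RGB components at once.
for name, (r, g, b) in colours:
    print(f"{name}: R={r}, G={g}, B={b}")
```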
Deep unpacking, when used well, can improve the readability of your code – by
removing indexing clutter and by making the intent more explicit – and can help
you test your code for some errors and bugs.
Nothing better than showing you some code, so you can see for yourself.
Examples in code
Increasing expressiveness
Given the RGB values of a colour, you can apply a basic formula to convert it to
greyscale, which weighs the R, G, and B components differently. We could write
a function that takes the colour information like we have been using, and then
computes its greyscale value:
def greyscale(colour_info):
    return 0.2126*colour_info[1][0] + 0.7152*colour_info[1][1] + 0.0722*colour_info[1][2]
(The formula we are using, 0.2126R + 0.7152G + 0.0722B, is usually the first
step of a slightly more involved formula, but it will be good enough for our
purposes.)
Now you can use your function:
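For example (repeating the index-based definition from above and feeding it the colour tuple we have been working with):

```python
def greyscale(colour_info):
    # Weigh the R, G, and B components of the nested tuple differently.
    return 0.2126*colour_info[1][0] + 0.7152*colour_info[1][1] + 0.0722*colour_info[1][2]

print(round(greyscale(("AliceBlue", (240, 248, 255))), 4))  # 246.8046
```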
Catching bugs
I said earlier that deep unpacking can also help you find bugs in your code.
It is not hard to believe that the colours list of the previous example could
have come from some other function, for example a function that scrapes the
webpage I have been checking and creates those tuples with colour information.
colours = [
    ("AliceBlue", (240, 248, 255, 127, 255, 212)),
    ("DarkCyan", (0, 139, 139)),
]
print(greyscale(colours[0])) # 246.8046
However, if you were to use the function that uses deep unpacking, then this
would happen:
def greyscale(colour_info):
    name, (r, g, b) = colour_info
    return 0.2126*r + 0.7152*g + 0.0722*b

colours = [
    ("AliceBlue", (240, 248, 255, 127, 255, 212)),
    ("DarkCyan", (0, 139, 139)),
]
print(greyscale(colours[0]))  # ValueError: too many values to unpack (expected 3)
Conclusion
Here’s the main takeaway of this article, for you, on a silver platter:
References
• PEP 634 – Structural Pattern Matching: Specification, https://www.python.org/dev/peps/pep-0634/;
• PEP 3113 – Removal of Tuple Parameter Unpacking, https://www.python.org/dev/peps/pep-3113/;
• Multiple assignment and tuple unpacking improve Python code readability, https://treyhunner.com/2018/03/tuple-unpacking-improves-python-code-readability/#Using_a_list-like_syntax;
• Unpacking Nested Data Structures in Python, https://dbader.org/blog/python-nested-unpacking;
• W3Schools, HTML Color Names, https://www.w3schools.com/colors/colors_names.asp;
• Wikipedia, Grayscale, Converting color to grayscale, https://en.wikipedia.org/wiki/Grayscale#Converting_color_to_grayscale.
Online references last consulted on the 23rd of February of 2021.
Starred Assignment
It is fairly common to have a list or another iterable that you want to split into
its first element and the rest. You can do this by using slicing in Python, but
the most explicit way is with starred assignment.
This feature was introduced in PEP 3132 – Extended Iterable Unpacking and
allows for the following:
>>> l = [1, 2, 3, 4, 5]
>>> head, *tail = l
>>> head
1
>>> tail
[2, 3, 4, 5]
This starred assignment is done by placing one * to the left of a variable name in
a multiple assignment, and by having any iterable on the right of the assignment.
All variable names get a single element and the variable name with the “star”
(the asterisk *) gets all other elements as a list:
>>> string = "Hello!"
>>> *start, last = string
>>> start
['H', 'e', 'l', 'l', 'o']
>>> last
'!'
You can have more than two variable names on the left, but only one asterisk:
>>> a, b, *c, d = range(5) # any iterable works
>>> a
0
>>> b
1
>>> c
[2, 3]
>>> d
4
When you use the starred assignment, the starred name might get an empty list,
>>> a, *b = [1]
>>> a
1
>>> b
[]
and an error is issued if there are not enough items to assign to the names that
are not starred:
>>> a, *b = []
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected at least 1, got 0)
def reduce(function, list_):
    """Reduce the elements of the list by the binary function."""
    if not list_:
        raise TypeError("Cannot reduce empty list.")
    value = list_[0]
    list_ = list_[1:]
    while list_:
        value = function(value, list_[0])
        list_ = list_[1:]
    return value
And here is an equivalent implementation using starred assignment:
def reduce(function, list_):
    """Reduce the elements of the list by the binary function."""
    if not list_:
        raise TypeError("Cannot reduce empty list.")
    value, *list_ = list_
    while list_:
        val, *list_ = list_
        value = function(value, val)
    return value
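A quick usage example of this reduce (with two simple binary functions):

```python
def reduce(function, list_):
    """Reduce the elements of the list by the binary function."""
    if not list_:
        raise TypeError("Cannot reduce empty list.")
    # Split off the first element; the rest is processed one item at a time.
    value, *list_ = list_
    while list_:
        val, *list_ = list_
        value = function(value, val)
    return value

print(reduce(lambda a, b: a + b, [1, 2, 3, 4]))  # 10
print(reduce(max, [3, 1, 4, 1, 5]))              # 5
```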
The usage of the starred assignment here makes it abundantly clear that we
wish to unpack the list into an item to be used now and the rest to be used later.
Another similar example, but with the starred name in the beginning, follows.
weight = 2
acc = 0
for digit in reversed(digits[:-1]):
    value = digit * weight
    acc += (value // 10) + (value % 10)
    weight = 3 - weight  # 2 -> 1 and 1 -> 2
return (9 * acc % 10) == digits[-1]
print("Type a positive integer (defaults to 1):")
s = input(" >> ")
if s.isnumeric():
    n = int(s)
else:
    n = 1
(In the code above, we use the method str.isnumeric to check if the string
is a valid integer. Try running print(str.isnumeric.__doc__) in your Python
REPL.)
With EAFP, you first try to perform whatever operation it is you want to do, and
then use a try block to capture an eventual exception that your operation might
throw in case it is not successful. In our example, this means we simply try to
convert s into an integer and in case a ValueError exception is raised, we set
the default value:
print("Type a positive integer (defaults to 1):")
s = input(" >> ")
try:
    n = int(s)
except ValueError:
    n = 1
We use except ValueError because a ValueError is the exception that is raised
when you try to convert a string that doesn’t contain an integer:
>>> int("345")
345
>>> int("3.4")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '3.4'
>>> int("asdf")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'asdf'
Avoid redundancy
Sometimes, coding with EAFP in mind allows you to avoid redundancy in your
code. Imagine you have a dictionary from which you want to extract a value
associated with a key, but that key might not exist.
With LBYL, you would do something like:
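The original snippet is not reproduced here, but a minimal sketch of the two styles for a dictionary lookup (with hypothetical d, key, and default names) could read:

```python
d = {"name": "Rodrigo"}
key, default = "age", 0

# LBYL: check before acting -- the key is looked up twice.
if key in d:
    value = d[key]
else:
    value = default

# EAFP: act first and handle the failure -- a single lookup.
try:
    value = d[key]
except KeyError:
    value = default

print(value)  # 0
```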
Conclusion
EAFP code is a very good alternative to LBYL code, and it is even superior in
various situations, like the ones I mentioned above. When writing code, try to
weigh the different pros and cons of the several approaches you can take, and
don’t forget to consider writing EAFP code!
EAFP is not the absolute best way to go in every single situation, but EAFP code
can be very readable and performant!
References
• PEP 463 – Exception-catching expressions, https://www.python.org/dev/peps/pep-0463/;
• Python 3 Documentation, The Python Standard Library, Debugging and Profiling, timeit, https://docs.python.org/3/library/timeit.html;
• Python 3 Documentation, The Python Tutorial, Errors and Exceptions, https://docs.python.org/3/tutorial/errors.html;
• Microsoft Devblogs, Idiomatic Python: EAFP versus LBYL, https://devblogs.microsoft.com/python/idiomatic-python-eafp-versus-lbyl/;
• Stack Overflow, “What is the EAFP principle in Python?”, https://stackoverflow.com/questions/11360858/what-is-the-eafp-principle-in-python.
Introduction
One of the things I appreciate most about Python, when compared to other pro-
gramming languages, is its for loops. Python allows you to write very expressive
loops, and part of that is because of the built-in zip function.
In this article you will
• see what zip does;
• get to know a new feature of zip that is coming in Python 3.10;
• learn how to use zip to create dictionaries; and
• see some nice usage examples of zip.
How zip works
In a simple for loop, you generally have an iterator it and you just write some-
thing like
for elem in it:
    # Do something with elem
    print(elem)
An “iterator” is something that can be traversed linearly, of which a list is the
simplest example. Another very common iterator used in Python’s for loops is
a range:
for n in range(10):
    print(n**2)
Sometimes you will have two or more iterators that contain related information,
and you need to loop over those iterators to do something with the different bits
of information you got.
In the example below, we have a list of first and last names of people and we
want to print the full names. The naïve solution would be to use a range to
traverse all the indices and then index into the lists:
>>> firsts = ["Anna", "Bob", "Charles"]
>>> lasts = ["Smith", "Doe", "Evans"]
>>> for i in range(len(firsts)):
...     print(f"'{firsts[i]} {lasts[i]}'")
...
'Anna Smith'
'Bob Doe'
'Charles Evans'
This does the job, but a for loop like this only hints at the fact that you are
probably going to access the values in firsts, because you wrote
range(len(firsts)),
but it turns out you also want to access the items in lasts. This is what zip is for:
you use it to pair up iterables that you wanted to traverse at the same time:
>>> firsts = ["Anna", "Bob", "Charles"]
>>> lasts = ["Smith", "Doe", "Evans"]
>>> for first, last in zip(firsts, lasts):
...     print(f"'{first} {last}'")
...
'Anna Smith'
'Bob Doe'
'Charles Evans'
Zip is lazy
One thing to keep in mind is that zip doesn’t create the tuples immediately. zip
is lazy, and that means it will only compute the tuples when you ask for them, for
example when you iterate over them in a for loop (like in the examples above)
or when you convert the zip object into a list:
>>> firsts = ["Anna", "Bob", "Charles"]
>>> lasts = ["Smith", "Doe", "Evans", "Rivers"]
>>> z = zip(firsts, lasts)
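Continuing the snippet above, a sketch of what that laziness implies:

```python
firsts = ["Anna", "Bob", "Charles"]
lasts = ["Smith", "Doe", "Evans", "Rivers"]
z = zip(firsts, lasts)

# No tuples exist yet; they are produced on demand when we iterate:
print(list(z))  # [('Anna', 'Smith'), ('Bob', 'Doe'), ('Charles', 'Evans')]
# A zip object can only be traversed once, so it is now exhausted:
print(list(z))  # []
```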
Three is a crowd
We have seen zip with two arguments, but zip can take an arbitrary number of
iterators and will produce a tuple of the appropriate size:
>>> firsts = ["Anna", "Bob", "Charles"]
>>> middles = ["Z.", "A.", "G."]
>>> lasts = ["Smith", "Doe", "Evans"]
>>> for z in zip(firsts, middles, lasts):
...     print(z)
...
('Anna', 'Z.', 'Smith')
('Bob', 'A.', 'Doe')
('Charles', 'G.', 'Evans')
Mismatched lengths
zip will always return a tuple with as many elements as the arguments it received,
so what happens if one of the iterators is shorter than the others?
If zip’s arguments have unequal lengths, then zip will keep going until it ex-
hausts one of the iterators. As soon as one iterator ends, zip stops producing
tuples:
>>> firsts = ["Anna", "Bob"]
>>> lasts = ["Smith", "Doe", "Evans"]
>>> list(zip(firsts, lasts))
[('Anna', 'Smith'), ('Bob', 'Doe')]
Snake game
A friend of mine is learning Python and he started creating a replica of the game
of Snake. There is a certain point in the game where he has a menu and he wants
to display thumbnails of the “maps” that can be played on, and he has those
images in a list called lvlpictures. At the same time, he has the positions of
where those images should go in a list called self.buttons. In order to display
the thumbnails in the correct positions, he has to call a function called blit,
which expects the image and the position the image should go to.
Here is the loop he wrote before knowing about zip:
for i in range(len(lvlpictures)):
    self.surface.blit(lvlpictures[i], (self.buttons[i][0]+2, self.buttons[i][1]+2))
And here is the loop he wrote after knowing about zip:
for pic, btn in zip(lvlpictures, self.buttons):
    self.surface.blit(pic, (btn[0] + 2, btn[1] + 2))
Notice that using zip makes your code shorter and also makes the intent of
processing the pictures and the buttons together clearer. Finally, when Python
3.10 is released, he may even add the strict=True keyword argument, because
he expects lvlpictures and self.buttons to have the same length.
Matching paths
If you are not aware of it, then you might be interested in knowing that Python
has a module named pathlib that provides facilities to deal with filesystem
paths.
When you create a path, you can then check if it matches a given pattern:
>>> from pathlib import PurePath
>>> PurePath('a/b.py').match('*.py')
True
>>> PurePath('/a/b/c.py').match('b/*.py')
True
>>> PurePath('/a/b/c.py').match('a/*.py')
False
If you take a look at this match function, you find this:
class PurePath(object):
    # ...
import csv

with open('names.csv', 'w', newline='') as csvfile:
    fieldnames = ['first_name', 'last_name']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

    writer.writeheader()
    writer.writerow({'first_name': 'Baked', 'last_name': 'Beans'})
    writer.writerow({'first_name': 'Lovely', 'last_name': 'Spam'})
    writer.writerow({'first_name': 'Wonderful', 'last_name': 'Spam'})
The fieldnames variable will establish the header of the CSV file and is then
used by the writerow method to know the order in which the values of the
dictionary should be written in the file.
The writeheader function is the function that writes the header of the CSV file,
and here is what it looks like:
class DictWriter:
    # ...

    def writeheader(self):
        header = dict(zip(self.fieldnames, self.fieldnames))
        return self.writerow(header)
Basically, what this function is doing is using zip to transform the header names
into a dictionary where the keys and the values are the same, pretending that
the header is just a regular data row:
>>> fieldnames = ['first_name', 'last_name']
>>> dict(zip(fieldnames, fieldnames))
{'first_name': 'first_name', 'last_name': 'last_name'}
Conclusion
Here’s the main takeaway of this article, for you, on a silver platter:
“zip is your friend whenever you need to traverse two or more iter-
ables at the same time.”
This Pydon’t showed you that:
• zip can be used to traverse several iterables at the same time;
• zip by itself returns a zip object which must then be iterated or converted
explicitly to a list if you want the tuples it produces;
• if the arguments to zip have uneven lengths, zip will stop as soon as one
of the iterators is exhausted;
• starting with Python 3.10, you can use the keyword argument strict=True
to tell zip to error if the arguments to zip have different lengths; and
• zip can provide for a really simple way to create dictionaries.
References
• Python 3 Documentation, The Python Standard Library, zip, https://docs.python.org/3/library/functions.html#zip [last accessed 30-03-2021];
• Python 3.10 Documentation, The Python Standard Library, zip, https://docs.python.org/3.10/library/functions.html#zip [last accessed 30-03-2021];
• Python 3 Documentation, The Python Standard Library, csv, https://docs.python.org/3/library/csv.html [last accessed 30-03-2021];
• Python 3 Documentation, The Python Standard Library, pathlib, https://docs.python.org/3/library/pathlib.html [last accessed 30-03-2021].
Introduction
Following up on last week’s Pydon’t about zip, today we are talking about
enumerate.
One of the things I appreciate most about Python, when compared to other pro-
gramming languages, is its for loops. Python allows you to write very expressive
loops, and some of that expressiveness comes from the built-in enumerate func-
tion.
In this article you will
• see what enumerate does;
• take a look at its underrated optional start argument;
• learn a couple of neat use cases for enumerate;
• see some nice examples of code using enumerate.
Deep unpacking
Things can get even more interesting when you use enumerate, for example, on
a zip:
>>> chapter_pages = [5, 17, 31, 50] # book ends at page 50
>>> for i, (start, end) in enumerate(zip(chapter_pages, chapter_pages[1:]), start=1):
...     print(f"'{i}: {end-start} pages long.'")
...
'1: 12 pages long.'
'2: 14 pages long.'
'3: 19 pages long.'
(Here I explicitly named the start= argument in the enumerate so that it was
visually easier to separate it from the argument to zip.)
This code snippet takes a list of pages where chapters of a book start and prints
the length of each chapter. Notice how enumerate returns tuples with indices
and values, but those values are extracted from a zip, which itself returns tuples:
>>> chapter_pages = [5, 17, 31, 50] # book ends at page 50
>>> for tup in enumerate(zip(chapter_pages, chapter_pages[1:]), start=1):
...     print(tup)
...
(1, (5, 17))
(2, (17, 31))
(3, (31, 50))
What we do is use deep unpacking to access all these values directly:
Examples in code
Now you will see some usages of enumerate in real Python code.
Vanilla enumerate
I took a look at the Python Standard Library and by and large the most common
usage of enumerate is just a vanilla enumerate(iter) to access iterable values
and indices at the same time. Let me share a textbook example with you:
The doctest module allows you to write simple tests for your code inside the
docstrings for your functions, classes, etc. The way you write these tests is
in the form of an interactive session in the REPL. doctest then locates those
“interactive sessions” in your docstrings and plays them to see if the actual
output of the code matches what your docstring showed.
If you open your Python REPL, you will see that it starts with the prompt >>>,
which has a blank space after the triple >. You cannot delete that blank space;
it is part of the prompt. When parsing a docstring to extract the actual tests, the
parser performs a check to see if the prompts have that leading blank space or
not, and here is the code that does it:
# from Lib/doctest.py in Python 3.9
class DocTestParser:
    # ...
def sum_nats(n):
    """Sum the first n natural numbers.

    >>> sum_nats(1)
    1
    >>> sum_nats(10)
    55
    >>>sum_nats(100)
    5050
    """
    return int(n*(n+1)/2)

if __name__ == "__main__":
    import doctest
    doctest.testmod()
Notice how I intentionally wrote the third example without a space between >>>
and sum_nats(100). Running this script should throw a ValueError in your face,
which should go away when you put a blank space there.
Conclusion
Here’s the main takeaway of this article, for you, on a silver platter:
“enumerate is your best friend if you need to traverse an iterator
to deal with its data and also need access to information about its
index.”
This Pydon’t showed you that:
• enumerate gives you access to an iterable’s elements and indices at the
same time;
• enumerate by itself returns a lazy enumerate object that must be then
iterated or converted explicitly to a list (or something else that suits
your needs) if you want its values;
• enumerate takes a second argument to set an offset for the indexing;
– and, in particular, that argument can be a negative integer;
• the result of enumerate can be fed directly to dict to create a dictionary
whose keys are the indices;
• using enumerate we get a nice idiom to find the indices of an iterable that
point to the elements that satisfy a given condition; and
• coupling zip, enumerate, and deep unpacking allows you to loop over several
iterables elegantly.
and the built-in repr function.
There is often confusion about the difference between these two built-ins, but
it is actually quite simple: the str class is used when you want to convert
something to the string type, or when you need a readable representation of
your object. On the other hand, the repr function is used to create an
unambiguous representation of its argument.
End users generally use str because they want to print readable and good-looking
text, whereas developers may use repr because they need to debug code
and need to make sure they know what they are looking at. For example, take a
look at the following interactive session:
>>> print(3)
3
>>> print("3")
3
>>> 3
3
>>> "3"
'3'
The print function calls str on its argument and then displays it, so both the
integer 3 and the string "3" get printed the same way: you have no way to tell
if the original object is an integer or a string. After that, you see that simply
writing the integer 3 and the string "3" in the REPL returns an unambiguous
representation of the object: you can tell integers and strings apart, because
the REPL is using repr under the hood to show objects. repr is also used when
your object is inside a container, like a list or a dictionary, because containers
usually defer their str behaviour to repr, as you can see by looking at PEP 3140
and at the following session:
>>> [3, "3"]
[3, '3']
>>> print([3, "3"])
[3, '3']
>>> str([3, "3"]) == repr([3, "3"])
True
2D point
An example custom usage of the __str__ and __repr__ dunder methods could
come into play if you were to implement a simple class that represents 2D
points, for example because you have to deal with images or a game or maps,
or whatever your use case is.
Ignoring all other methods you would certainly implement, your class could look
like this:
class Point2D:
    """A class to represent points in a 2D space."""

    def __str__(self):
        """Provide a good-looking representation of the object."""
        return f"({self.x}, {self.y})"
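To see both representations side by side, here is a self-contained sketch; the `__init__` is a minimal assumption of mine, since the snippet above omits all other methods:

```python
class Point2D:
    """A class to represent points in a 2D space."""

    def __init__(self, x, y):
        # Minimal constructor, assumed for this demo.
        self.x = x
        self.y = y

    def __str__(self):
        """Provide a good-looking representation of the object."""
        return f"({self.x}, {self.y})"

    def __repr__(self):
        """Provide an unambiguous way of rebuilding this object."""
        return f"Point2D({repr(self.x)}, {repr(self.y)})"

p = Point2D(1, 2)
print(p)          # uses __str__: (1, 2)
print(repr(p))    # uses __repr__: Point2D(1, 2)
print([p])        # containers defer to __repr__: [Point2D(1, 2)]
```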
Conclusion
When implementing custom classes you will probably want to give a custom
implementation of the __repr__ dunder method, and also a __str__ if you need
your instances to look good when printed to the end user. __str__ and str are
used when you need good-looking strings, while the purpose of __repr__ and
repr is to create unambiguous representations of your objects.
References
• Python 3 Documentation, The Python Language Reference, Data model,
repr and str, https://docs.python.org/3/reference/datamodel.html#objec
t.__repr__.
• Python 3 Documentation, The Python Standard Library, Built-in Functions,
https://docs.python.org/3/library/functions.html.
• Python 3 Documentation, The Python Standard Library, Built-in Types, str,
https://docs.python.org/3/library/stdtypes.html#str.
• PEP 3140 – str(container) should call str(item), not repr(item), https://ww
w.python.org/dev/peps/pep-3140/.
• Stack Overflow, “Purpose of Python’s repr”, https://stackoverflow.com/qu
estions/1984162/purpose-of-pythons-repr.
• dbader.org, “Python String Conversion 101: Why Every Class Needs a
“repr””, https://dbader.org/blog/python-repr-vs-str.
Online references last consulted on the 2nd of February of 2021.
Introduction
Structural pattern matching is coming to Python, and while it may look like a
plain switch statement like many other languages have, Python’s match statement
was not introduced to serve as a simple switch statement.
PEPs 634, 635, and 636 have plenty of information on what structural pattern
matching is bringing to Python, how to use it, the rationale for adding it to Python,
etc. In this article I will try to focus on using this new feature to write beautiful
code.
! At the time of writing, Python 3.10 is still a pre-release, so you have to
look in the right place if you want to download Python 3.10 and play with it.
match colour:
    case (r, g, b):
        name = ""
        a = 0
    case (r, g, b, a):
        name = ""
    case (name, (r, g, b)):
        a = 0
    case (name, (r, g, b, a)):
        pass
if len(colour) == 3:
    r, g, b = colour
    name = ""
    a = 0
elif len(colour) == 4:
    r, g, b, a = colour
    name = ""
elif len(colour) != 2:
    raise ValueError("Unknown colour info.")
else:
    name, values = colour
    if not isinstance(values, (list, tuple)) or len(values) not in [3, 4]:
        raise ValueError("Unknown colour info.")
    elif len(values) == 3:
        r, g, b = values
        a = 0
    else:
        r, g, b, a = values
return (name, (r, g, b, a))
I tried writing a decent piece of code equivalent to the one using structural
pattern matching, but it doesn’t look that good. Someone else suggested, in
the comments, another alternative that also doesn’t use match. That suggestion
reads better than mine, but it is still more complex and longer than the
alternative with match.
match colour:
    case (int(r), int(g), int(b)):
        name = ""
        a = 0
    case (int(r), int(g), int(b), int(a)):
        name = ""
    case (str(name), (int(r), int(g), int(b))):
        a = 0
    case (str(name), (int(r), int(g), int(b), int(a))):
        pass
    case _:
        raise ValueError("Unknown colour info.")
return (name, (r, g, b, a))
def __str__(self):
    """Provide a good-looking representation of the object."""
    return f"({self.x}, {self.y})"

def __repr__(self):
    """Provide an unambiguous way of rebuilding this object."""
    return f"Point2D({repr(self.x)}, {repr(self.y)})"
Imagine we now want to write a little function that takes a Point2D and writes
a human-readable description of the point’s position:

def describe_point(point):
    """Write a human-readable description of the point position."""
    match point:
        case Point2D(x=0, y=0):
            desc = "at the origin"
        case Point2D(x=0, y=y):
            desc = f"in the vertical axis, at y = {y}"
        case Point2D(x=x, y=0):
            desc = f"in the horizontal axis, at x = {x}"
        case Point2D(x=x, y=y) if x == y:
            desc = f"along the x = y line, with x = y = {x}"
        case Point2D(x=x, y=y) if x == -y:
            desc = f"along the x = -y line, with x = {x} and y = {y}"
        case Point2D(x=x, y=y):
            desc = f"at {point}"
__match_args__
Now, I don’t know if you noticed, but didn’t all the x= and y= in the code snippet
above annoy you? Every time I wrote a new pattern for a Point2D instance, I
had to specify what argument was x and what was y. For classes where this
order is not arbitrary, we can use __match_args__ to tell Python how we would
like match to match the attributes of our object.
Here is a shorter version of the example above, making use of __match_args__
to let Python know the order in which arguments to Point2D should match:
class Point2D:
    """A class to represent points in a 2D space."""
    __match_args__ = ("x", "y")
    # ...

def describe_point(point):
    """Write a human-readable description of the point position."""
    match point:
        case Point2D(0, 0):
            desc = "at the origin"
        case Point2D(0, y):
            desc = f"in the vertical axis, at y = {y}"
        case Point2D(x, 0):
            desc = f"in the horizontal axis, at x = {x}"
        case Point2D(x, y):
            desc = f"at {point}"
Wildcards
Another cool thing you can do when matching things is to use wildcards.
Asterisk *
Much like you can do things like
>>> head, *body, tail = range(10)
>>> print(head, body, tail)
0 [1, 2, 3, 4, 5, 6, 7, 8] 9
where the *body tells Python to put in body whatever does not go into head or
tail, you can use * and ** wildcards in your patterns. You can use * with lists
and tuples to match the remainder of the sequence:
def rule_substitution(seq):
    new_seq = []
    while seq:
        match seq:
            case [x, y, z, *tail] if x == y == z:
                new_seq.extend(["3", x])
            case [x, y, *tail] if x == y:
                new_seq.extend(["2", x])
            case [x, *tail]:
                new_seq.extend(["1", x])
        seq = tail
    return new_seq

seq = ["1"]
print(seq[0])
for _ in range(10):
    seq = rule_substitution(seq)
    print("".join(seq))
"""
Prints:
1
11
21
1211
111221
312211
13112221
1113213211
31131211131221
13211311123113112211
11131221133112132113212221
"""
This builds the sequence I showed above, where each number is derived from
the previous one by looking at its digits and describing what you are looking at.
For example, when you find three equal digits in a row, like "222", you rewrite
that as "32" because you are seeing three twos. With the match statement this
becomes much cleaner. In the case statements above, the *tail part of the
pattern matches the remainder of the sequence, as we are only using x, y, and
z to match at the beginning of the sequence.
Double asterisk **
When matching dictionaries, keys that your pattern does not mention are simply
ignored. However, if you want to know what the original dictionary had that was
not specified in the match, you can use a ** wildcard:
d = {0: "oi", 1: "uno"}
match d:
    case {0: "oi", **remainder}:
        print(remainder)
# prints {1: 'uno'}
Finally, you can use this to your advantage if you want to match a dictionary that
contains only what you specified:
d = {0: "oi", 1: "uno"}
match d:
    case {0: "oi", **remainder} if not remainder:
        print("Single key in the dictionary")
    case {0: "oi"}:
        print("Has key 0 and extra stuff.")
# Has key 0 and extra stuff.
You can also use variables to match the values of given keys:
d = {0: "oi", 1: "uno"}
match d:
    case {0: zero_val, 1: one_val}:
        print(f"0 mapped to {zero_val} and 1 to {one_val}")
# 0 mapped to oi and 1 to uno
Naming sub-patterns
Sometimes you may want to match against a more structured pattern, but then
give a name to a part of the pattern, or to the whole thing, so that you have
a way to refer back to it. This may happen especially when your pattern has
alternatives, which you add with |:
def go(direction):
    match direction:
        case "North" | "East" | "South" | "West":
            return "Alright, I'm going!"
        case _:
            return "I can't go that way..."
def prefix(tree):
    match tree:
        case ast.Expression(expr):
            return prefix(expr)
        case ast.Constant(value=v):
            return str(v)
        case ast.BinOp(lhs, op, rhs):
            match op:
                case ast.Add():
                    sop = "+"
                case ast.Sub():
                    sop = "-"
                case ast.Mult():
                    sop = "*"
                case ast.Div():
                    sop = "/"
                case _:
                    raise NotImplementedError()
            return f"{sop} {prefix(lhs)} {prefix(rhs)}"
        case _:
            raise NotImplementedError()
def op_to_str(op):
    ops = {
        ast.Add: "+",
        ast.Sub: "-",
        ast.Mult: "*",
        ast.Div: "/",
    }
    return ops.get(op.__class__, None)

def prefix(tree):
    match tree:
        case ast.Expression(expr):
            return prefix(expr)
        case ast.Constant(value=v):
            return str(v)
        case ast.BinOp(lhs, op, rhs):
            sop = op_to_str(op)
            if sop is None:
                raise NotImplementedError()
            return f"{sop} {prefix(lhs)} {prefix(rhs)}"
        case _:
            raise NotImplementedError()
Conclusion
Here’s the main takeaway of this article, for you, on a silver platter:
“Structural pattern matching introduces a feature that can simplify
and increase the readability of Python code in many cases, but it will
not be the go-to solution in every single situation.”
This Pydon’t showed you that:
• structural pattern matching with the match statement greatly extends the
power of the already-existing starred assignment and structural assignment
features;
References
• PEP 622 – Structural Pattern Matching, https://www.python.org/dev/peps/
pep-0622/;
• PEP 634 – Structural Pattern Matching: Specification, https://www.python
.org/dev/peps/pep-0634/;
• PEP 635 – Structural Pattern Matching: Motivation and Rationale, https:
//www.python.org/dev/peps/pep-0635/;
• PEP 636 – Structural Pattern Matching: Tutorial, https://www.python.org
/dev/peps/pep-0636/;
• Dynamic Pattern Matching with Python, https://gvanrossum.github.io/doc
s/PyPatternMatching.pdf;
• Python 3.10 Pattern Matching in Action, YouTube video by “Big Python”,
https://www.youtube.com/watch?v=SYTVSeTgL3s.
Introduction
Structural pattern matching is coming to Python, and while it may look like a
plain switch statement like many other languages have, Python’s match statement
was not introduced to serve as a simple switch statement.
In a previous article I explored plenty of use cases for the new match
statement, and in this blog post I will try to explore some use cases for which
a match is not the answer. This article will assume you know how structural
pattern matching works in Python, so if you are unsure how that works, feel
free to read my “Pattern matching tutorial for Pythonic code”.
! At the time of writing, Python 3.10 is still a pre-release, so you have to
look in the right place if you want to download Python 3.10 and play with it.
Be smart(er)
Sometimes you will feel like you have to list a series of cases and corresponding
values, so that you can map one to the other. However, it might be the case that
you could make your life much simpler by looking for an alternative algorithm or
formula and implementing that instead. I’ll show you an example.
In case you have never heard of it, Rule 30 is an “elementary cellular
automaton”. You can think of it as a rule that receives three bits (three
zeroes/ones) and produces a new bit, depending on the three bits it received.
Automata are really, really interesting, but discussing them is beyond the
scope of this article. Let us just look at a possible implementation of the
“Rule 30” automaton:
def rule30(bits):
    match bits:
        case 0, 0, 0:
            return 0
        case 0, 0, 1:
            return 1
        case 0, 1, 0:
            return 1
        # ... and so on, for all eight three-bit patterns.
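As the prose above hints, sometimes a formula beats listing cases: Rule 30’s output bit happens to be exactly `left XOR (centre OR right)`, so the whole case listing collapses into a single expression. Here is a sketch of mine (not the article’s own code):

```python
def rule30(bits):
    left, centre, right = bits
    # Rule 30 in closed form: the new bit is left XOR (centre OR right).
    return left ^ (centre | right)

# The formula reproduces the automaton's full truth table.
for pattern in [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
                (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]:
    print(pattern, "->", rule30(pattern))
```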
Basic mappings
Getting from dictionaries
There are many cases in which you just want to take a value in and map it to
something else. As an example, take this piece of code that takes an expression
and writes it in prefix notation:
def prefix(tree):
    match tree:
        case ast.Expression(expr):
            return prefix(expr)
        case ast.Constant(value=v):
            return str(v)
        case ast.BinOp(lhs, op, rhs):
            match op:
                case ast.Add():
                    sop = "+"
                case ast.Sub():
                    sop = "-"
                case ast.Mult():
                    sop = "*"
                case ast.Div():
                    sop = "/"
                case _:
                    raise NotImplementedError()
            return f"{sop} {prefix(lhs)} {prefix(rhs)}"
        case _:
            raise NotImplementedError()
def op_to_str(op):
    match op:
        case ast.Add():
            sop = "+"
        case ast.Sub():
            sop = "-"
        case ast.Mult():
            sop = "*"
        case ast.Div():
            sop = "/"
        case _:
            raise NotImplementedError()
    return sop
def prefix(tree):
    match tree:
        case ast.Expression(expr):
            return prefix(expr)
        case ast.Constant(value=v):
            return str(v)
        case ast.BinOp(lhs, op, rhs):
            return f"{op_to_str(op)} {prefix(lhs)} {prefix(rhs)}"
        case _:
            raise NotImplementedError()
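The dictionary lookup can be exercised on its own; note that `dict.get` quietly returns `None` for operators we did not list (here `ast.Pow` stands in for an unsupported one):

```python
import ast

ops = {
    ast.Add: "+",
    ast.Sub: "-",
    ast.Mult: "*",
    ast.Div: "/",
}

def op_to_str(op):
    # Look up the node's class; None signals "not supported".
    return ops.get(op.__class__, None)

print(op_to_str(ast.Add()))    # +
print(op_to_str(ast.Pow()))    # None
```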
getattr
Another useful mechanism that we have available is the getattr function, which
is part of a trio of Python built-in functions: hasattr, getattr and setattr.
! I will be writing about this trio in a future Pydon’t; be sure to subscribe
to the Pydon’t newsletter so you don’t miss it! For now, I’ll just show you
briefly how getattr works, with a simple class Foo that stores two attributes,
a and b:

class Foo:
    def __init__(self, a, b):
        self.a = a
        self.b = b

foo = Foo(3, 4)
print(getattr(foo, "a"))  # prints 3
bar = Foo(10, ";")
print(getattr(bar, "b"))  # prints ';'
This goes to show that it is always nice to know the tools you have at your
disposal. Not everything has very broad use cases, but that also means that
the more specialised tools are the ones that make the most difference when
they are brought in.
Speaking of knowing your tools, the last use case in this article for which match
is a bad alternative is related to calling different functions when your data has
different types.
import functools

@functools.singledispatch
def pretty_print(arg):
    print(arg)

@pretty_print.register(complex)
def _(arg):
    print(f"{arg.real} + {arg.imag}i")

@pretty_print.register(list)
@pretty_print.register(tuple)
def _(arg):
    for i, elem in enumerate(arg):
        print(i, elem)

@pretty_print.register(dict)
def _(arg):
    for key, value in arg.items():
        print(f"{key}: {value}")
And this can then be used exactly like the original function:
>>> pretty_print(3)
3
Conclusion
Here’s the main takeaway of this article, for you, on a silver platter:
“The new match statement is great, but that does not mean the match
statement will be the best alternative always and, in particular, the
match statement is generally being misused if you use it as a simple
switch.”
This Pydon’t showed you that:
• match isn’t necessarily always the best way to implement control flow;
• short and basic match statements could be vanilla if statements;
• sometimes there is a way to compute what you need, instead of having to
list many different cases and their respective values;
• built-in tools like dict.get and getattr can also be used to fetch different
values depending on the matching key; and
• you can use functools.singledispatch when you need to execute different
subfunctions when the input has different types.
References
• PEP 622 – Structural Pattern Matching, https://www.python.org/dev/peps/
pep-0622/;
• PEP 634 – Structural Pattern Matching: Specification, https://www.python
.org/dev/peps/pep-0634/;
• PEP 635 – Structural Pattern Matching: Motivation and Rationale, https:
//www.python.org/dev/peps/pep-0635/;
• PEP 636 – Structural Pattern Matching: Tutorial, https://www.python.org
/dev/peps/pep-0636/;
• PEP 443 – Single-dispatch generic functions, https://www.python.org/dev
/peps/pep-0443/;
• Python 3 Documentation, The Python Standard Library, getattr, https:
//docs.python.org/3/library/functions.html#getattr;
Introduction
In this Pydon’t I am going to talk a little bit about when and why recursion
might not be the best strategy to solve a problem. This discussion will entail
some particularities of Python, but will also cover broader topics and concepts
that encompass many programming languages. After this brief discussion, I
will show you some examples of recursive Python code and its non-recursive
counterparts.
Despite what I just said I’ll do, don’t get me wrong: the purpose of this
Pydon’t is not to make you dislike recursion or to claim that recursion is bad.
I really like recursion and I find it very elegant.
RecursionError
The first thing we will discuss is the infamous recursion depth limit that Python
enforces.
If you have no idea what I am talking about, then either:
• you never wrote a recursive function in your life; or
• you are really, really good and never made a mistake in your recursive
function definitions.
The recursion depth limit is something that makes your code raise a
RecursionError if you make too many recursive calls. To see what I am
talking about, just do the following in your REPL:
>>> def f():
... return f()
...
>>> f()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in f
File "<stdin>", line 2, in f
File "<stdin>", line 2, in f
[Previous line repeated 996 more times]
RecursionError: maximum recursion depth exceeded
>>>
In many cases, this limit helps: it exposes recursive functions for which you
did not define the base case properly.
There are, however, cases in which 1000 recursive calls isn’t enough to finish
your computations. A classical example is that of the factorial function:
>>> def fact(n):
...     if n == 0:
...         return 1
...     return n*fact(n-1)
...
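To see the limit bite, here is a small sketch of mine; the exact limit printed depends on your interpreter, but it is commonly 1000:

```python
import sys

def fact(n):
    if n == 0:
        return 1
    return n * fact(n - 1)

print(sys.getrecursionlimit())    # commonly 1000

try:
    fact(5000)                    # far deeper than the default limit
except RecursionError:
    print("RecursionError: maximum recursion depth exceeded")
```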
Branching overlap
Another thing to take into account when considering a recursive solution to a
problem is: is there going to be much overlap in the recursive calls?
If your recursive function branches in its recursive calls and the recursive calls
overlap, then you may be wasting plenty of time recalculating the same values
over and over again. More often than not this can be fixed easily, but just
because a problem probably has a simple solution, it doesn’t mean you can
outright ignore it.
A classical example of recursion that leads to plenty of wasted computations is
the Fibonacci sequence example:
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)
A simple modification to this function shows that there are many recursive calls
being made:
call_count = 0

def fibonacci(n):
    global call_count
    call_count += 1
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)
Examples in code
I will now show some recursive code that can run into some of the problems
mentioned above, and will also share non-recursive versions of those same
pieces of code.
Factorials
The toy example of the factorial is great because it lends itself to countless
different implementations, and the ideas that these implementations exhibit
can then be adapted to more complex recursions.
The main characteristic here is that the recursion of the factorial is a
“linear” recursion, where each call only performs a single recursive call, and
each recursive call is for a simpler problem.
The vanilla recursion follows:

def factorial(n):
    if n == 0:
        return 1
    return n*factorial(n-1)

whereas an equivalent, iterative version builds the product with a loop:

def factorial(n):
    fact = 1
    for i in range(1, n+1):
        fact *= i
    return fact
If you are solving a problem and come up with different solutions, don’t be afraid
to try them out.
List sum
You can implement your own sum recursively:
def sum(l):
    if not l:
        return 0
    return l[0] + sum(l[1:])
If you carry a partial sum down the recursive calls, you can make this tail
recursive:
def sum(l, partial=0):
    if not l:
        return partial
    return sum(l[1:], l[0] + partial)
From the tail recursive function to the while solution is simple:
def sum(l):
    partial = 0
    while l:
        l, partial = l[1:], l[0] + partial
    return partial
Notice what happened:
• the default value of the auxiliary variable becomes the first statement of
the function;
• you write a while loop whose condition is the complement of the base case
condition;
• you update your variables just like you did in the tail recursive call,
except now you assign them explicitly; and
• after the while you return the auxiliary variable.
Of course there are simpler implementations for the sum, the point here is that
this transformation is generic and always works.
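To drive the recipe home, here is the same four-step transformation applied to a tail-recursive factorial (a sketch of mine, not the book’s code):

```python
# Tail recursive: the partial product rides along in an argument.
def factorial(n, partial=1):
    if n == 0:
        return partial
    return factorial(n - 1, n * partial)

# Mechanical translation: default value first, loop on the complement
# of the base case, assign the updated variables, return the accumulator.
def factorial_iter(n):
    partial = 1
    while n != 0:
        n, partial = n - 1, n * partial
    return partial

print(factorial(5), factorial_iter(5))   # 120 120
```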
Sorting a list
Here is another example where we sort a list with selection sort. First, “regular”
recursion:
def selection_sort(l):
    if not l:
        return []
    m = min(l)
    idx = l.index(m)
    # Prepend the minimum and sort what remains of the list.
    return [m] + selection_sort(l[:idx] + l[idx+1:])
Traversing (a directory)
The Depth-first versus Breadth-first distinction is more likely to pop up when
you have to traverse something.
In this example, we will traverse a full directory, printing file names and file sizes.
A simple, purely recursive solution follows:
import pathlib

def print_file_sizes(path):
    """Print file sizes in a directory."""
    path_obj = pathlib.Path(path)
    if path_obj.is_file():
        print(path, path_obj.stat().st_size)
    else:
        for path in path_obj.glob("*"):
            print_file_sizes(path)
def print_file_sizes(dir):
    """Print file sizes in a directory, recursing into subdirectories."""
    paths_to_process = [dir]
    while paths_to_process:
        path, *paths_to_process = paths_to_process
        path_obj = pathlib.Path(path)
        if path_obj.is_file():
            print(path, path_obj.stat().st_size)
        else:
            paths_to_process += path_obj.glob("*")
This example that I took from my “Truthy, Falsy, and bool” Pydon’t uses the
paths_to_process list to keep track of the, well, paths that still have to be
processed, which mimics recursion without actually having to recurse.
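One efficiency note on a variation of mine: the `path, *paths_to_process = paths_to_process` line copies the whole list at every step, whereas `collections.deque` pops from the left in constant time:

```python
import pathlib
from collections import deque

def print_file_sizes(dir):
    """Print file sizes in a directory, without recursion or list copies."""
    paths_to_process = deque([dir])
    while paths_to_process:
        path = paths_to_process.popleft()   # O(1), unlike list unpacking
        path_obj = pathlib.Path(path)
        if path_obj.is_file():
            print(path, path_obj.stat().st_size)
        else:
            paths_to_process.extend(path_obj.glob("*"))
```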
call_count = 0
fibonacci_values = {0: 0, 1: 1}

def fibonacci(n):
    global call_count
    call_count += 1
    try:
        return fibonacci_values[n]
    except KeyError:
        fib = fibonacci(n-1) + fibonacci(n-2)
        fibonacci_values[n] = fib
        return fib

print(fibonacci(10))
print(call_count) # 19
Notice that this reduced the recursive calls from 177 to 19. We can even count
the number of times we have to perform calculations:
computation_count = 0
fibonacci_values = {0: 0, 1: 1}

def fibonacci(n):
    try:
        return fibonacci_values[n]
    except KeyError:
        global computation_count
        computation_count += 1
        fib = fibonacci(n-1) + fibonacci(n-2)
        fibonacci_values[n] = fib
        return fib

print(fibonacci(10))
print(computation_count) # 9
This shows that saving partial results can really pay off!
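The standard library can do this bookkeeping for us: `functools.lru_cache` memoises a function with a single decorator. A sketch of mine, not the book’s code:

```python
from functools import lru_cache

@lru_cache(maxsize=None)      # cache the result for every distinct argument
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(10))                    # 55
print(fibonacci.cache_info().misses)    # calls actually computed: 11
```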
def merge_sort(l):
    """Sort a list recursively with the merge sort algorithm."""
    # Base case.
    if len(l) <= 1:
        return l
    # Sort the first and second halves.
    m = len(l)//2
    l1, l2 = merge_sort(l[:m]), merge_sort(l[m:])
    # Now put them together.
    return merge(l1, l2)
If you don’t want to have all this recursive branching, you can use a generic list
to keep track of all the sublists that are still to be sorted:
def merge(l1, l2):
    """Merge two lists in order."""
    result = []
    while l1 and l2:
        if l1[0] < l2[0]:
            h, *l1 = l1
        else:
            h, *l2 = l2
        result.append(h)
    # One of the two lists is empty, the other contains the larger elements.
    result.extend(l1)
    result.extend(l2)
    return result

def merge_sort(l):
    """Sort a list with the merge sort algorithm."""
    # Keep track of the sublists that still need sorting.
    to_sort = [l]
    # Save the sublists that are already sorted.
    already_sorted = []
    while to_sort:
        lst, *to_sort = to_sort
        if len(lst) <= 1:
            already_sorted.append(lst)
        else:
            m = len(lst)//2
            to_sort.extend([lst[:m], lst[m:]])
    # Merge the sorted sublists until a single one remains.
    while len(already_sorted) > 1:
        l1, l2, *already_sorted = already_sorted
        already_sorted.append(merge(l1, l2))
    return already_sorted[0]
! If you don’t really know what the h, *l1 = l1, h, *l2 = l2, ! lst, *to_sort
= to_sort and l1, l2, *already_sorted = already_sorted lines ! are do-
ing, you might want to have a look at ! this Pydon’t about unpacking with starred
assignments.
In this particular example, my translation of the merge sort to a non-recursive
solution ended up being noticeably larger than the recursive one. This just goes
to show that you need to judge all situations by yourself: would this be worth it?
Is there an imperative implementation that is better than this direct translation?
The answers to these questions will always depend on the programmer and the
context they are in.
This also shows that the way you think about the problem has an effect on the
way the code looks: even though this last implementation is imperative, it is a
direct translation of a recursive implementation and so it may not look as good
as it could!
Conclusion
Here’s the main takeaway of this article, for you, on a silver platter:
“Pydon’t recurse mindlessly.”
This Pydon’t showed you that:
• Python has a hard limit on the number of recursive calls you can make and
raises a RecursionError if you cross that limit;
• Python does not optimise tail recursive calls, and probably never will;
References
• Stack Overflow, “What is the maximum recursion depth in Python, and how
to increase it?”, https://stackoverflow.com/questions/3323001/what-is-
the-maximum-recursion-depth-in-python-and-how-to-increase-it.
• Stack Overflow, “Does Python optimize tail recursion?”, https://stackove
rflow.com/questions/13591970/does-python-optimize-tail-recursion.
• Neopythonic, Tail Recursion Elimination, http://neopythonic.blogspot.com
/2009/04/tail-recursion-elimination.html.
• Neopythonic, Final Words on Tail Calls, http://neopythonic.blogspot.com
/2009/04/final-words-on-tail-calls.html.
• Documentation, The Python Standard Library, Functional Programming
Modules, operator, https://docs.python.org/3/library/operator.html.
Online references last consulted on the 16th of February of 2021.