Math 236: David. J. Erwin and Henda C. Swart School of Mathematical Sciences University of Kwazulu-Natal


MATH 236

DISCRETE MATHEMATICS
WITH APPLICATIONS

by

David. J. Erwin and Henda C. Swart


School of Mathematical Sciences
University of KwaZulu-Natal
Contents

1 Sets, mappings, equivalence relations 7


1.1 Review of sets . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2 Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4 Equivalence relations . . . . . . . . . . . . . . . . . . . . . . . 13
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2 How to count 21
2.1 Basic counting principles . . . . . . . . . . . . . . . . . . . . . 21
2.2 The Pigeonhole Principle . . . . . . . . . . . . . . . . . . . . . 25
2.3 One-to-one functions and permutations . . . . . . . . . . . . . 28
2.4 Counting permutations . . . . . . . . . . . . . . . . . . . . . . 30
2.5 Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.6 The Inclusion-Exclusion Principle . . . . . . . . . . . . . . . . 39
2.7 Infinite sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

3 Elementary number theory 53


3.1 The Division Algorithm . . . . . . . . . . . . . . . . . . . . . 53
3.2 Multiplicative inverses in Z_m . . . . . . . . . . . . . . . . . . 56
3.3 Exponentiation in Z_m: square and multiply . . . . . . . . . . 58
3.4 Prime numbers . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.5 The Euler φ-function . . . . . . . . . . . . . . . . . . . . . . . 63
3.6 The theorems of Fermat and Euler . . . . . . . . . . . . . . . 68
3.7 Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76


4 Fundamentals of cryptology 81
4.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.2 Monoalphabetic and polyalphabetic ciphers . . . . . . . . . . . 83
4.2.1 Monoalphabetic ciphers . . . . . . . . . . . . . . . . . 84
4.2.2 Polyalphabetic ciphers . . . . . . . . . . . . . . . . . . 86
4.2.3 Modular arithmetic . . . . . . . . . . . . . . . . . . . . 88
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

5 Public-key cryptography 93
5.1 One-way functions . . . . . . . . . . . . . . . . . . . . . . . . 93
5.1.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.1.2 The password problem . . . . . . . . . . . . . . . . . . 94
5.1.3 Trapdoor one-way functions . . . . . . . . . . . . . . . 95
5.2 The key distribution problem . . . . . . . . . . . . . . . . . . 96
5.3 Diffie-Hellman key exchange . . . . . . . . . . . . . . . . . . . 97
5.4 The birth of public-key cryptography . . . . . . . . . . . . . . 99
5.5 The RSA cryptosystem . . . . . . . . . . . . . . . . . . . . . . 100
5.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 100
5.5.2 The mechanics of RSA: key generation . . . . . . . . . 102
5.5.3 The mechanics of RSA: encryption and decryption . . . 103
5.5.4 Key size in the RSA system . . . . . . . . . . . . . . . 106
5.5.5 Digital signatures with RSA . . . . . . . . . . . . . . . 107
5.5.6 The mathematics of RSA . . . . . . . . . . . . . . . . . 111
5.6 The Discrete Logarithm Problem . . . . . . . . . . . . . . . . 112
5.7 The El Gamal public-key cryptosystem . . . . . . . . . . . . . 113
5.7.1 El Gamal: key generation . . . . . . . . . . . . . . . . 113
5.7.2 El Gamal: encryption and decryption . . . . . . . . . . 114
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

6 Product cryptosystems. DES and AES 121


6.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
6.2 ASCII . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
6.3 Feistel ciphers . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
6.4 An overview of the Data Encryption Standard . . . . . . . . . 127
6.4.1 The DES algorithm . . . . . . . . . . . . . . . . . . . . 127
6.5 DES Stages 1 and 3: the permutation IP . . . . . . . . . . . . 128
6.6 DES Stage 2: the Feistel cipher . . . . . . . . . . . . . . . . . 129
6.6.1 Generating the round keys . . . . . . . . . . . . . . . . 129

6.6.2 The round function, f_{K_i} . . . . . . . . . . . . . . . . 134


6.7 The security of DES . . . . . . . . . . . . . . . . . . . . . . . 139
6.7.1 Triple-DES . . . . . . . . . . . . . . . . . . . . . . . . 140
6.7.2 Modes of operation . . . . . . . . . . . . . . . . . . . . 141
6.8 AES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

7 An Introduction to Graphs 149


7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
7.2 What is a graph? . . . . . . . . . . . . . . . . . . . . . . . . . 150
7.3 Examples of graphs . . . . . . . . . . . . . . . . . . . . . . . . 154
7.4 Operations on graphs . . . . . . . . . . . . . . . . . . . . . . . 156
7.5 The degree of a vertex . . . . . . . . . . . . . . . . . . . . . . 157
7.6 Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
7.6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 164
7.6.2 Connected graphs . . . . . . . . . . . . . . . . . . . . . 164
7.6.3 Distance in graphs . . . . . . . . . . . . . . . . . . . . 168

8 The Shortest Path Algorithm 173


8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.2 Distance in weighted graphs . . . . . . . . . . . . . . . . . . . 173
8.3 Dijkstra’s algorithm . . . . . . . . . . . . . . . . . . . . . . . . 175

9 Maximum Flows in Networks 183


9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
9.2 Digraphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
9.3 An introduction to networks . . . . . . . . . . . . . . . . . . . 186
9.4 The max-flow min-cut theorem . . . . . . . . . . . . . . . . . 192
9.5 The max-flow min-cut algorithm . . . . . . . . . . . . . . . . . 194
Chapter 1

Sets, mappings, equivalence relations

1.1 Review of sets


A set is a collection of objects. The objects that make up a set are called
elements. If x is an element of the set S, we write x ∈ S. The empty set, ∅,
contains no elements. The number of elements in a set S is its cardinality,
denoted |S|. If every element of the set S is also an element of the set T,
then S is a subset of T, written S ⊆ T. Every set is a subset of itself, i.e.,
S ⊆ S. If S ⊆ T and S ≠ T, then S is a proper subset of T, denoted S ⊂ T.
We can combine two sets S and T in several ways:
• The union of S and T is the set S ∪ T = {x : x ∈ S or x ∈ T}.
• The intersection of S and T is the set S ∩ T = {x : x ∈ S and x ∈ T}.
• The difference of S and T is the set S − T = {x : x ∈ S and x ∉ T}.
• The symmetric difference of S and T is the set S ∆ T = (S − T) ∪ (T − S) = (S ∪ T) − (S ∩ T).
• The cartesian product of S and T is the set S × T = {(s, t) : s ∈ S and t ∈ T}. Each element of S × T is an ordered pair. The cartesian product S × S of S with itself is usually written S².
Sets are often considered as subsets of a universal set. If S is a set and U its
universal set, then the complement of S is the set $\overline{S} = U - S$.

8 MATH236 Discrete Mathematics with Applications 2009

Example 1.1.1 Let S = {0, 2, 5} and T = {0, 5, 6, 9}. Then S ∪ T = {0, 2, 5, 6, 9}, S ∩ T = {0, 5}, S − T = {2}, T − S = {6, 9}, S ∆ T = {2, 6, 9},
and
S × T = {(0, 0), (0, 5), (0, 6), (0, 9),
(2, 0), (2, 5), (2, 6), (2, 9),
(5, 0), (5, 5), (5, 6), (5, 9)}.
Lastly, if S is a subset of the universal set {0, 1, . . . , 10}, then $\overline{S} = \{1, 3, 4, 6, 7, 8, 9, 10\}$.
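
The set operations above correspond closely to the operators of Python's built-in set type. The following is only a sketch that recomputes Example 1.1.1; the choice of Python and the variable names (U for the universal set, and so on) are our own assumptions and are not part of the notes.

```python
from itertools import product

S = {0, 2, 5}
T = {0, 5, 6, 9}
U = set(range(11))                # universal set {0, 1, ..., 10}

union = S | T                     # S ∪ T
intersection = S & T              # S ∩ T
difference = S - T                # S − T
symmetric_difference = S ^ T      # S ∆ T
cartesian = set(product(S, T))    # S × T, a set of ordered pairs (s, t)
complement = U - S                # complement of S relative to U

print(sorted(union), sorted(intersection), sorted(difference))
print(sorted(symmetric_difference), len(cartesian), sorted(complement))
```

Note that the complement only makes sense once a universal set has been fixed, which is why U is passed in explicitly.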

A number of sets occur frequently enough in mathematics that they are given
special names or symbols:
• The positive integers or natural numbers: N = {1, 2, 3, . . .}.
• The integers: Z = {. . . , −2, −1, 0, 1, 2, . . .}.
• The real numbers: R.
• The rational numbers: Q, the set of all real numbers that can be written
in the form a/b, where a, b ∈ Z and b ≠ 0.
• The irrational numbers: the set of all real numbers that are not rational.


Example 1.1.2 Hence −4 ∈ Z − N and √2 ∈ R − Q. Also, every positive integer
is an integer, every integer is in turn rational, and every rational number is real,
so N ⊆ Z ⊆ Q ⊆ R.

Theorem 1 (De Morgan’s Law). If S1, S2, . . . , Sn are sets, then

$$\overline{S_1 \cup S_2 \cup \cdots \cup S_n} = \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n}.$$

Proof. Let
$$A = \overline{S_1 \cup S_2 \cup \cdots \cup S_n}$$
and
$$B = \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n}.$$
To prove that A = B, we shall prove two things:
1. That A ⊆ B, and,
2. That B ⊆ A.
Firstly, if x is in $\overline{S_1 \cup S_2 \cup \cdots \cup S_n}$, then it is not a member of any of the
sets S1, S2, . . . , Sn. Hence x is a member of $\overline{S_i}$ for all i ∈ {1, 2, . . . , n}, and
hence $x \in \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n}$. Thus A ⊆ B.
Suppose now that $x \in \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n}$. Then x ∉ S1 and x ∉ S2
and, in general, x ∉ Si for any i. Consequently, x ∉ S1 ∪ S2 ∪ · · · ∪ Sn,
so $x \in \overline{S_1 \cup S_2 \cup \cdots \cup S_n}$. This shows that B ⊆ A, which completes the
proof.

Example 1.1.3 Consider the sets S1 = {1, 2, 3, 4} and S2 = {3, 4, 5, 6}, which are both subsets of the universal set {1, 2, . . . , 8}. Then S1 ∪ S2 =
{1, 2, 3, 4, 5, 6}, so $\overline{S_1 \cup S_2} = \{7, 8\}$. On the other hand, $\overline{S_1} = \{5, 6, 7, 8\}$,
while $\overline{S_2} = \{1, 2, 7, 8\}$, so that $\overline{S_1} \cap \overline{S_2} = \{7, 8\}$.
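
De Morgan's Law can also be checked mechanically on small examples. The sketch below does so for the sets of Example 1.1.3; the helper name complement and the list sets are illustrative choices of ours, and a check on one example is of course no substitute for the proof.

```python
U = set(range(1, 9))                      # universal set {1, 2, ..., 8}
sets = [{1, 2, 3, 4}, {3, 4, 5, 6}]       # S1 and S2

def complement(s, universe):
    """Complement of s relative to the given universal set."""
    return universe - s

# Left-hand side: complement of the union.
lhs = complement(set().union(*sets), U)
# Right-hand side: intersection of the complements.
rhs = set.intersection(*(complement(s, U) for s in sets))

print(sorted(lhs), sorted(rhs), lhs == rhs)
```

Because sets is a list, the same check works unchanged for any number n of sets.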

1.2 Partitions
If S ∩ T = ∅, then S and T are disjoint. A collection {S1, S2, . . . , Sk} of
subsets of a set S is called pairwise disjoint if every two distinct subsets Si
and Sj are disjoint, i.e., i ≠ j implies that Si ∩ Sj = ∅.
A partition of S is a pairwise disjoint collection of nonempty subsets whose
union is S. In other words, a partition of S is a collection $\mathcal{S}$ = {S1, S2, . . . , Sk}
of subsets of S satisfying all three of the following criteria:
1. For all i ∈ {1, 2, . . . , k}, Si ≠ ∅.
2. For all i, j ∈ {1, 2, . . . , k}, if i ≠ j, then Si ∩ Sj = ∅.
3. $\bigcup_{i=1}^{k} S_i = S$.

Each of the subsets Si is called a part of the partition $\mathcal{S}$. Conditions 2 and
3 show that every element x of S is in exactly one part of the partition:
condition 3 places x in at least one part, and condition 2 in at most one.

Example 1.2.1

• Let S = {1, 2, . . . , 10} and S1 = {2, 3, 8}, S2 = {1}, S3 = {4, 5, 9, 10},
S4 = {6, 7}. Then {S1, S2, S3, S4} is a partition of S. So are {{1}, {2}, . . . , {10}}
and {{1, 2, . . . , 10}}. On the other hand, let S1′ = {1, 2, 3, 8}. Then
{S1′, S2, S3, S4} is not a partition of S, since the element 1 belongs to both S1′ and S2.

• Let I be the set of irrational numbers. Then {Q, I} is a partition of R.

• For each i ∈ {0, 1, 2}, let Si be the set of all integers whose remainder
when divided by 3 is i. In other words,

S0 = {x ∈ Z : x = 3k for some k ∈ Z} = {. . . , −6, −3, 0, 3, 6, . . .},


S1 = {x ∈ Z : x = 3k + 1 for some k ∈ Z} = {. . . , −5, −2, 1, 4, 7, . . .},
S2 = {x ∈ Z : x = 3k + 2 for some k ∈ Z} = {. . . , −4, −1, 2, 5, 8, . . .}.

Then {S0 , S1 , S2 } is a partition of Z.
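
The three criteria for a partition translate directly into a test on finite collections. The function below is a minimal sketch under that finiteness assumption; the name is_partition is hypothetical, and the last check looks only at a finite window of Z, so it illustrates rather than proves the third bullet above.

```python
def is_partition(parts, S):
    """Check the three partition criteria for a finite collection of subsets of S."""
    parts = [set(p) for p in parts]
    # Criterion 1: no part is empty.
    if any(len(p) == 0 for p in parts):
        return False
    # Criterion 2: distinct parts are disjoint.
    for i in range(len(parts)):
        for j in range(i + 1, len(parts)):
            if parts[i] & parts[j]:
                return False
    # Criterion 3: the union of the parts is S.
    return set().union(*parts) == set(S)

S = set(range(1, 11))
print(is_partition([{2, 3, 8}, {1}, {4, 5, 9, 10}, {6, 7}], S))      # parts of Example 1.2.1
print(is_partition([{1, 2, 3, 8}, {1}, {4, 5, 9, 10}, {6, 7}], S))   # 1 lies in two parts

# Remainders modulo 3, restricted to the finite window {-9, ..., 9}.
window = range(-9, 10)
residues = [{x for x in window if x % 3 == i} for i in range(3)]
print(is_partition(residues, window))
```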

1.3 Relations
Let S and T be nonempty sets. A relation R from S to T is a subset of
S × T , i.e., R is a set of ordered pairs (s, t), where each s is in S and each
t is a member of T. If (s, t) ∈ R, we say that s is related to t under R, and
we write s R t. If, on the other hand, (s, t) ∉ R, then we write $s \not R\, t$. The
domain and range of a relation R are defined as:

dom R = {s ∈ S : s R t for some t ∈ T }, and,


ran R = {t ∈ T : s R t for some s ∈ S}.

A relation in which each element of the domain is related to exactly one
element of the range is a function.

Example 1.3.1 Let S = {1, 2, 3, 4}, T = {a, b, c, d, e, f }, and

R = {(1, a), (1, c), (2, a), (2, b), (3, c)}.

Then R is a relation from S to T. The element 1 is related to both a and c
under R, so we write 1 R a and 1 R c. Also,
dom R = {1, 2, 3}
ran R = {a, b, c}.
Since the element 1 of dom R is related to both a and c, R is not a function.
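
A finite relation can be stored as a set of ordered pairs, and its domain and range read off directly. The following sketch reworks Example 1.3.1; the function names domain, range_of and is_function are our own and purely illustrative.

```python
S = {1, 2, 3, 4}
T = {"a", "b", "c", "d", "e", "f"}
R = {(1, "a"), (1, "c"), (2, "a"), (2, "b"), (3, "c")}

def domain(rel):
    """All first coordinates that occur in the relation."""
    return {s for s, _ in rel}

def range_of(rel):
    """All second coordinates that occur in the relation."""
    return {t for _, t in rel}

def is_function(rel):
    """True if every element of the domain is related to exactly one element."""
    firsts = [s for s, _ in rel]
    return len(firsts) == len(set(firsts))

print(sorted(domain(R)), sorted(range_of(R)), is_function(R))
```

Since rel is a set, no pair is listed twice, so a repeated first coordinate can only mean that one element is related to two different elements.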

If S = T, i.e., if R is a relation from S to S, then R is called a relation
on S.

Example 1.3.2
• The set R1 = {(x, x²) : x ∈ R} is a relation on R.
• The set R2 = {(x, y) ∈ Z² : x < y} is a relation on Z. Some of the
ordered pairs in R2 are: (1, 5), (−3, 200), (10, 11), (−43, −29), (0, 2). In
fact, R2 is the familiar relation ‘is less than’, and we say that ‘< is a
relation on Z’.
• Similarly, ≥ is a relation on R (and on Z, and on Q, . . . ). It consists of
all those ordered pairs (x, y) of real numbers in which x is at least as large
as y.
• Consider the set P of all sixteen subsets of {1, 2, 3, 4}. Then ⊆ is a relation
on P, as is ⊂. For example, {1, 3} and {1, 3, 4} are both members of P,
and {1, 3} ⊆ {1, 3, 4}, so {1, 3} is related to {1, 3, 4} under the relation
⊆.
• The set R3 = {(x, y) ∈ Z² : 2 | (x − y)}¹ is a relation on Z. Some of the
ordered pairs in R3 are: (0, 2), (2, 0), (4, 26), (98, −22), (−1, −1), (5, 29), (−9, 1).
• The set R4 = {(x, y) ∈ R² : |x − y| < 1} is a relation on R. Some of the
ordered pairs in R4 are: (5, 5), (−3, −3.2), (10.424, 11.403), (−0.25, 0.68).

¹ The notation ‘a | b’ is read ‘a divides b’ and means that there is some integer t such
that b = a · t.

Let R be a relation on S. Then R might have one or more of the following
interesting properties:


• If x R x for all x ∈ S, then R is reflexive.

• If there is no x ∈ S for which x R x, then R is irreflexive.

• If, for all x, y ∈ S, x R y implies that y R x, then R is symmetric.

• If, for all x, y ∈ S, x R y and y R x implies that x = y, then R is antisymmetric.

• If, for all x, y, z ∈ S, x R y and y R z implies that x R z, then R is transitive.

As an example, we now consider the properties of each of the relations in
Example 1.3.2:

Example 1.3.3

• Since (2, 2) ∉ R1, R1 is not reflexive. On the other hand, (1, 1) ∈ R1,
so R1 is not irreflexive, either. R1 is not symmetric: (2, 4) ∈ R1 but
(4, 2) ∉ R1. R1 is not transitive, since (−2, 4) ∈ R1 and (4, 16) ∈
R1, but (−2, 16) ∉ R1. Finally, consider the question of whether R1 is
antisymmetric. Suppose that (x, y) and (y, x) are both elements of R1.
Since (x, y) ∈ R1, we have y = x², and, since (y, x) ∈ R1, we have
x = y². Solving these two equations simultaneously, we find that (x, y) = (0, 0)
and (x, y) = (1, 1) are the only (real) solutions, and in both cases x = y.
Thus R1 is antisymmetric.

• No number is less than itself, so < is not reflexive; for the same reason,
< is irreflexive. If x < y, then certainly y ≮ x, hence < is not symmetric.
Moreover, if x < y and y < z, then clearly x < z. It follows that < is
transitive. We are thus left to consider the question: Is < antisymmetric?
The reasoning here is a little subtle: < is antisymmetric if the implication
‘x < y and y < x implies that x = y’ is always true. Since it’s not possible
that x < y and y < x, the left-hand side of this implication is always
false. Recalling the truth table for the implication (an implication with a
false hypothesis is true), we find that the implication is always true.
Consequently, < is antisymmetric.

• The relation ≥ is reflexive, not irreflexive, not symmetric, and transitive,
all for reasons similar to those previously discussed. If x ≥ y and y ≥ x,
then x ≤ y ≤ x, so x = y; hence, ≥ is antisymmetric.

• Every set is a subset of itself, so ⊆ is reflexive and not irreflexive. Notice
that {1} ⊆ {1, 2} but {1, 2} ⊈ {1}; thus, ⊆ is not symmetric. Clearly,
if A is a subset of B and B is a subset of C, then A is a subset of C.
Hence ⊆ is transitive. Finally, if A ⊆ B and B ⊆ A, then every element of A
is an element of B and every element of B is an element of A.
Thus A = B, showing that ⊆ is antisymmetric.
• For every x ∈ Z, x − x = 0 and 0 is exactly divisible by 2, so R3 is
reflexive. It’s also symmetric: If 2 | (x − y), then there exists an integer
t such that x − y = 2t. Consequently, y − x = 2(−t), so 2 | (y − x)
and (y, x) ∈ R3. The relation is not antisymmetric: (2, 10) ∈ R3 and
(10, 2) ∈ R3, but 2 ≠ 10. Finally, let (x, y), (y, z) ∈ R3. Then there
exist integers t1 , t2 such that x − y = 2t1 and y − z = 2t2 . But then
x − z = (x − y) + (y − z) = 2t1 + 2t2 = 2(t1 + t2 ), so (x, z) ∈ R3 , proving
that R3 is transitive.
• For each x ∈ R, certainly |x − x| = 0 < 1, so R4 is reflexive. R4 is
also symmetric (since |x − y| = |y − x|) and not antisymmetric (for example,
(0, 0.5) and (0.5, 0) are both in R4). Finally, R4 is not transitive: for
example, (1, 1.7) ∈ R4 and (1.7, 2.2) ∈ R4, but (1, 2.2) ∉ R4.
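
For a relation on a finite set, each of the five properties can be tested by brute force. The sketch below checks ⊆ on the sixteen subsets of {1, 2, 3, 4} and a finite restriction of R3; the function names and the window {−6, . . . , 6} are assumptions of ours, and a check on a finite window is evidence for, not a proof of, the statements about R3 on all of Z.

```python
from itertools import combinations

def is_reflexive(R, S):
    return all((x, x) in R for x in S)

def is_irreflexive(R, S):
    return all((x, x) not in R for x in S)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_antisymmetric(R):
    return all(x == y for (x, y) in R if (y, x) in R)

def is_transitive(R):
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)

# P: the sixteen subsets of {1, 2, 3, 4}, stored as frozensets so they can sit inside pairs.
P = [frozenset(c) for r in range(5) for c in combinations([1, 2, 3, 4], r)]
subset_rel = {(A, B) for A in P for B in P if A <= B}

# R3 restricted to the finite window {-6, ..., 6}.
W = range(-6, 7)
R3 = {(x, y) for x in W for y in W if (x - y) % 2 == 0}

print(is_reflexive(subset_rel, P), is_symmetric(subset_rel),
      is_antisymmetric(subset_rel), is_transitive(subset_rel))
print(is_reflexive(R3, W), is_symmetric(R3),
      is_antisymmetric(R3), is_transitive(R3))
```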

1.4 Equivalence relations


A relation R on a set S is called an equivalence relation if it is reflexive,
symmetric, and transitive.

Example 1.4.1
• Of the relations considered in Examples 1.3.2 and 1.3.3, only R3 is an
equivalence relation.
• Denote by L the set of all lines in the plane and define a relation R on L
as follows. For L1 , L2 ∈ L, L1 is related to L2 under R if (i) L1 and L2
coincide, or, (ii) L1 is parallel to L2 . Since every line coincides with itself,
R is reflexive. Moreover, if L1 is parallel to L2 , then L2 is parallel to L1 ,
so R is symmetric. Finally, if L1 is parallel to L2 and L2 is parallel to L3 ,
then clearly L1 is parallel to L3 , showing that R is transitive.

• Let S be a set of integers and R5 the relation defined on S by x R5 y
if and only if 3 | (x + 2y). Let x, y, z ∈ S. Then x + 2x = 3x, which
is divisible by 3, so xR5 x and R5 is reflexive. Suppose now that xR5 y.
Then there is some integer t such that x + 2y = 3t, so x = 3t − 2y, so
y + 2x = y + 2(3t − 2y) = 6t − 3y, which is divisible by 3, so yR5 x and
R5 is symmetric. Lastly, suppose that xR5 y and yR5 z. Then there exist
integers t1 , t2 such that x + 2y = 3t1 and y + 2z = 3t2 . Adding these
two equations together, we have x + 2y + y + 2z = 3t1 + 3t2 , whence
x + 2z = 3t1 + 3t2 − 3y, which is divisible by 3. Hence, R5 is transitive,
and thus R5 is an equivalence relation.

Let R be an equivalence relation on a set S and let x ∈ S. Then the set

[x] = {y ∈ S : x R y}

is called the equivalence class containing x. It is a subset of the set S on
which the relation R is defined. Note that since R is reflexive, x R x, so it is
always the case that x ∈ [x]. The set of all equivalence classes induced on a
set S by an equivalence relation R is usually denoted S/R.

Example 1.4.2

• Consider the relation R3 of Examples 1.3.2 and 1.3.3. Which integers are
in [5]? From the definition of the relation R3 , we know that x ∈ [5] if
and only if 5 − x = 2t for some integer t. Thus [5] = {5 + 2t : t ∈ Z}.
Clearly, then [5] is the set of all odd integers. Notice that [5] = [3] =
[7] = [1] = [−3] = [51] = · · · . Similarly, one may reason that [6] is the
set of even integers, and that [6] = [4] = [2] = [20] = [−10] = · · · . Thus
Z/R3 = {[0], [1]}.

• Consider the relation R on L from Example 1.4.1. Which lines are in the
equivalence class containing the line L1 : y = 2x + 3? Clearly, L1 R L2 if
and only if the slope of L2 is 2. Hence, [L1 ] = {all lines in the plane that can
be written in the form y = 2x + c : c ∈ R}.

• Consider the relation R5 from Example 1.4.1 and let S = {−6, −5, −2, −1, 0, 1, 3, 5, 7}.
What elements is −6 related to? We already know that R5 is an equivalence
relation, so certainly −6 R5 −6. Since −6 + 2(−5) = −16, which is
not divisible by 3, −6 is not related to −5. Similarly, −6 + 2(−2) = −10,
which is not divisible by 3. Continuing in this vein, we find that
−6 + 2(−1) = −8
−6 + 2(0) = −6
−6 + 2(1) = −4
−6 + 2(3) = 0
−6 + 2(5) = 4
−6 + 2(7) = 8
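Of these numbers, only −6 and 0 are divisible by 3; they come from the
elements 0 and 3, respectively. Together with −6 itself, this gives
[−6] = {−6, 0, 3}.
Since R5 is symmetric and transitive, 0 and 3 are related to exactly the
same elements as −6, so
[0] = {−6, 0, 3}
[3] = {−6, 0, 3}.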
By applying the same logic to the other elements of S, we find that
[−5] = {−5, −2, 1, 7} = [−2] = [1] = [7]
[−1] = {−1, 5} = [5]
In fact, it can be shown that x R5 y if and only if 3|(x − y). We leave this
as an exercise.
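
The class-by-class computation above can be automated for any equivalence relation on a finite set. The following is a sketch only: the function equivalence_classes is a hypothetical helper, and it relies on the relation really being an equivalence relation, since it compares each element with just one representative of every class found so far.

```python
def equivalence_classes(S, related):
    """Group the elements of a finite set S into classes of the equivalence relation `related`."""
    classes = []
    for x in sorted(S):
        for cls in classes:
            representative = next(iter(cls))
            # By Theorem 2, comparing with a single representative is enough.
            if related(representative, x):
                cls.add(x)
                break
        else:
            classes.append({x})      # x starts a new class
    return classes

S = {-6, -5, -2, -1, 0, 1, 3, 5, 7}
classes = equivalence_classes(S, lambda x, y: (x + 2 * y) % 3 == 0)
for cls in classes:
    print(sorted(cls))
```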

Theorem 2. Let R be an equivalence relation on S and x, y ∈ S. Then
[x] = [y] if and only if x R y.
Proof. Firstly, if [x] = [y], then y ∈ [y] = [x], so that x R y. This completes
the first direction of the proof. To prove the converse, suppose that x R y.
Let z ∈ [x]. Then x R z. However, since R is symmetric, y R x, and
thus, since R is transitive, y R x and x R z implies that y R z. Hence, every
element of [x] is also an element of [y], proving that [x] ⊆ [y]. By an identical
proof, one can show that [y] ⊆ [x]. Combining these two results, we find that
[x] = [y], as required.

The next theorem establishes one of the reasons why equivalence relations
are important: Every equivalence relation induces a partition.

Theorem 3. Let R be an equivalence relation on S. Then S/R is a partition
of S.

Proof. For every x ∈ S, we have x ∈ [x]; hence, every element of S belongs
to some equivalence class and no equivalence class is empty. This proves that
S/R satisfies conditions 1 and 3 of a partition (Section 1.2). It remains to establish
condition 2, i.e., to prove that the equivalence classes are pairwise disjoint. Suppose
that [x] and [y] are distinct equivalence classes under R (that is, [x] ≠ [y]).
Then from Theorem 2 we know that $x \not R\, y$. We claim that [x] ∩ [y] = ∅.
Suppose, to the contrary, that there is some element z ∈ [x] ∩ [y]. Since
z ∈ [x], x R z. Since z ∈ [y], y R z, and, since R is symmetric, z R y. Finally, since R is
transitive, x R z and z R y implies that x R y, a contradiction. Thus there
is no such element z and [x] ∩ [y] = ∅.

The next example is important for what follows and it is suggested that
you study it closely.

Example 1.4.3 We generalize the relation R3 introduced in Example 1.3.2. Let
n be a positive integer and x, y ∈ Z. We say that x and y are congruent modulo
n, written
x ≡ y (mod n),

if n | (x − y). Congruence modulo n is an equivalence relation:

Reflexivity: For all x ∈ Z, x − x = 0 · n, so x ≡ x (mod n).

Symmetry: Let x, y ∈ Z. If x − y = t · n, for some integer t, then y − x =
(−t) · n. Thus x ≡ y (mod n) if and only if y ≡ x (mod n).

Transitivity: Suppose x ≡ y (mod n) and y ≡ z (mod n). Then there exist
integers t1, t2 such that x − y = t1 · n and y − z = t2 · n. Consequently,
x − z = (x − y) + (y − z) = (t1 + t2) · n, so x ≡ z (mod n).

What do the equivalence classes under this relation look like? Clearly,
[0] = {. . . , −3n, −2n, −n, 0, n, 2n, 3n, . . .}
[1] = {. . . , −3n + 1, −2n + 1, −n + 1, 1, n + 1, 2n + 1, 3n + 1, . . .}
[2] = {. . . , −3n + 2, −2n + 2, −n + 2, 2, n + 2, 2n + 2, 3n + 2, . . .}
[3] = {. . . , −3n + 3, −2n + 3, −n + 3, 3, n + 3, 2n + 3, 3n + 3, . . .}
···
[n − 1] = {. . . , −2n − 1, −n − 1, −1, n − 1, 2n − 1, 3n − 1, 4n − 1, . . .}

Thus, for example, the equivalence class [2] contains all those integers whose
remainder when divided by n is 2. In general (for 0 ≤ k < n), the equivalence
class [k] consists of all those integers whose remainder when divided by n is k.
As a more concrete example, consider the relation congruence modulo 3,
which we denote by ∼ for convenience. The set of equivalence classes under ∼
is Z/ ∼ = {[0], [1], [2]}, where
[0] = {. . . , −9, −6, −3, 0, 3, 6, 9, . . .}
[1] = {. . . , −8, −5, −2, 1, 4, 7, 10, . . .}
[2] = {. . . , −7, −4, −1, 2, 5, 8, 11, . . .}.
Notice that, as promised by Theorem 3, the equivalence classes partition Z.
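
The classes of congruence modulo n can be listed on any finite window of integers by grouping according to remainder. This is a small sketch; the function name residue_classes and the window {−9, . . . , 11} are our own choices, made to match the portion of each class displayed above.

```python
def residue_classes(n, lo, hi):
    """Group the integers lo..hi (inclusive) by their remainder on division by n."""
    classes = {k: [] for k in range(n)}
    for x in range(lo, hi + 1):
        # For n > 0, Python's % always returns a remainder in {0, ..., n-1},
        # even when x is negative.
        classes[x % n].append(x)
    return classes

classes = residue_classes(3, -9, 11)
for k, members in classes.items():
    print(k, members)
```

Every integer in the window lands in exactly one list, which is the finite shadow of the statement that the classes partition Z.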
Not only does every equivalence relation give rise to a partition, but the
reverse is true as well:
Theorem 4. Let P be a partition of a set S. Then there exists an equivalence
relation R on S for which S/R = P, i.e., the equivalence classes of R are
the parts of P.
Proof. Define R as follows:
For x, y ∈ S, x R y if y is in the same part of P as x.
We prove that R is an equivalence relation. Every x is in the same part as
itself, so R is reflexive. Also, if y is in the same part as x, then x is clearly
in the same part as y, showing that R is symmetric. Finally, if y is in the
same part as x and z is in the same part as y, then z is in the same part as
x, establishing that R is transitive. Hence R is an equivalence relation and,
clearly from the definition of R, the equivalence classes of R coincide with
the parts of P.
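
The construction in this proof is easy to carry out for a finite partition. The sketch below builds the relation ‘lies in the same part as’ from the partition of Example 1.2.1 and recovers the parts as its equivalence classes; the function names are hypothetical and the example partition is our choice.

```python
def relation_from_partition(parts):
    """Relate x to y exactly when x and y lie in the same part."""
    return {(x, y) for part in parts for x in part for y in part}

def classes_of(relation):
    """Collect the class [x] = {y : x R y} for every x occurring in the relation."""
    elements = {x for x, _ in relation}
    return {frozenset(y for (a, y) in relation if a == x) for x in elements}

P = [{2, 3, 8}, {1}, {4, 5, 9, 10}, {6, 7}]
R = relation_from_partition(P)
print(classes_of(R) == {frozenset(part) for part in P})
```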

Exercises
1.1 Let A = {1, 2, 3} and B = {2, 3, 4, 5}. Compute each of the following:

(a) A ∪ B.
(b) A ∩ B.
(c) A − B.
(d) B − A.
(e) A ∆ B.
(f) A × B.
(g) B × A.

1.2 Write down the elements of the relation R on {1, 2, 3, 4, 5} determined by
a R b if |a − b| ≤ 1.

1.3 Determine which of the properties reflexive, irreflexive, symmetric,
antisymmetric, and transitive each of the following relations satisfies:

(a) R1 = {(1, 1), (1, 2), (2, 1), (3, 4), (4, 3)}.
(b) R2 = {(x, y) ∈ R² : |x − y| ≥ 2}.
(c) R3 = {(x, y) ∈ Z² : 2 | xy}.

1.4 Prove that the relation x R y iff 2 | (3x − 5y) is an equivalence relation. If
R is defined on {−5, −4, −2, 0, 1, 2, 3, 7, 9}, determine the equivalence
classes.

Solutions
1.1 (a) {1, 2, 3, 4, 5}
(b) {2, 3}
(c) {1}
(d) {4, 5}
(e) {1, 4, 5}
(f) {(1, 2), (1, 3), (1, 4), (1, 5), (2, 2), (2, 3), (2, 4), (2, 5), (3, 2), (3, 3), (3, 4), (3, 5)}
(g) {(2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3), (4, 1), (4, 2), (4, 3), (5, 1), (5, 2), (5, 3)}

1.2 {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3), (4, 5), (5, 4)}

1.3 (a) symmetric
(b) irreflexive, symmetric
(c) symmetric

1.4 S/R = {{−5, 1, 3, 7, 9}, {−4, −2, 0, 2}}.


Chapter 2

How to count

2.1 Basic counting principles


Perhaps the most elementary counting principle is the following: If you count
something in two different ways, the two answers must be equal. A proof that
uses this principle is frequently called a combinatorial proof. We shall now
see an example.

Theorem 5. If A and B are finite sets, then

|A ∪ B| = |A| + |B| − |A ∩ B|.

Proof. We shall actually prove that

|A ∪ B| + |A ∩ B| = |A| + |B|

which is equivalent to what we are trying to establish. As mentioned previously,
we shall use a combinatorial proof. Imagine the following: Line the
elements of A up in front of a turnstile. As each passes through, give it a
slip of paper. Now line the elements of B up in front of the turnstile. Once
again, as each passes through, give it a slip of paper. Consider the question:
How many slips of paper in total did we give out? We’ll count the slips in
two different ways:

1. We gave one slip to each member of A and then one slip to each member
of B, so clearly we gave out |A| + |B| slips of paper.


2. Each element of A ∪ B got either one slip of paper, if it’s in one of A
and B (but not both), or two slips of paper, if it’s in A ∩ B. Clearly,
then, we gave out 1 · (|A ∪ B| − |A ∩ B|) + 2 · |A ∩ B| = |A ∪ B| + |A ∩ B|
slips of paper.

We’ve just counted the total number of slips of paper in two different ways,
but both ways must give the same answer. Therefore, |A| + |B| = |A ∪ B| +
|A ∩ B|, as required.
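
The turnstile argument can be simulated directly: hand out slips and count them both ways. The sketch below uses the sets of Example 1.1.1 as A and B, which is our own choice, and a Counter to record how many slips each element holds.

```python
from collections import Counter

A = {0, 2, 5}
B = {0, 5, 6, 9}

# Hand out the slips: one per element of A, then one per element of B.
slips = Counter()
slips.update(A)
slips.update(B)

# First count: one slip per member of A plus one per member of B.
first_count = len(A) + len(B)

# Second count: elements in exactly one set hold 1 slip, elements in both hold 2.
one_slip = [x for x in slips if slips[x] == 1]
two_slips = [x for x in slips if slips[x] == 2]
second_count = 1 * len(one_slip) + 2 * len(two_slips)

print(first_count, second_count, sum(slips.values()))
print(len(A | B) == len(A) + len(B) - len(A & B))
```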

An immediate consequence of Theorem 5 is the following:

Corollary 6. If A and B are disjoint finite sets, then

|A ∪ B| = |A| + |B|.

Corollary 6 can quickly be generalized:

Theorem 7 (Addition Principle). If {A1, A2, . . . , Ak} is a pairwise disjoint
collection of finite sets, then
$$\left| \bigcup_{i=1}^{k} A_i \right| = \sum_{i=1}^{k} |A_i|.$$

We can apply Theorem 7 to prove the following:

Theorem 8. If A and B are finite sets, then

|A × B| = |A| · |B|.

Proof. Let $A = \{a_1, a_2, \ldots, a_{|A|}\}$. For each i ∈ {1, 2, . . . , |A|}, let Ai be the
set of all ordered pairs whose first element is ai, i.e., Ai = {(ai, b) : b ∈ B}.
Clearly, $A \times B = \bigcup_{i=1}^{|A|} A_i$, and |Ai| = |B|. Also, $\{A_1, A_2, \ldots, A_{|A|}\}$ is
a pairwise disjoint collection of sets (so $\{A_1, A_2, \ldots, A_{|A|}\}$ is a partition of
A × B). Thus, by Theorem 7,
$$|A \times B| = |A_1 \cup A_2 \cup \cdots \cup A_{|A|}| = \sum_{i=1}^{|A|} |A_i| = |A| \cdot |B|.$$


Example 2.1.1 Referring to Example 1.1.1, the sets S and T have |S| = 3
and |T | = 4. Thus |S × T | = 3 · 4 = 12. You can check, by referring to that
example, that S × T does in fact have 12 elements.
Theorem 8 is one form of the Multiplication Principle:
Let S be a set of ordered pairs (s1 , s2 ) of objects in which the
first object s1 comes from a set of size n1 , and for each choice
of object s1 there are n2 choices for object s2 . Then the set S
contains n1 n2 ordered pairs.

Example 2.1.2 How many 2-digit numbers¹ contain no repeated digits? A
2-digit number ab can be thought of as the ordered pair (a, b). The number a,
the first digit, can be anything except 0, so n1 = 9. Once we’ve chosen the
first digit a, the second digit can be any number except the one we chose for
a — so, whatever the choice of a, there are 9 choices for b. It follows from
the Multiplication Principle that the total number of 2-digit numbers with no
repeated digits is 9 · 9 = 81.
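
A count this small can be confirmed by listing every case. The one-line check below is a sketch of that idea; comparing the number of distinct digits with 2 is simply our way of expressing ‘no repeated digits’ for 2-digit numbers.

```python
# Count the 2-digit numbers (10 to 99) whose two digits are different.
no_repeats = [n for n in range(10, 100) if len(set(str(n))) == 2]

print(len(no_repeats), 9 * 9)
```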
The Multiplication Principle can be generalized. By a k-tuple we mean an
ordered list (s1 , s2 , . . . , sk ) of k objects. Thus, a 2-tuple is the same as an
ordered pair.
Theorem 9 (Multiplication Principle). Let S be a set of k-tuples (s1 , s2 , . . . , sk )
of objects in which:
• the first object s1 comes from a set of size n1 ,
• for each choice of s1 there are n2 choices for object s2 ,
• for each choice of s2 there are n3 choices for object s3 ,
• for each choice of s3 there are n4 choices for object s4 ,
• and, in general, for each choice of $s_i$, 1 ≤ i ≤ k − 1, there are $n_{i+1}$
choices for object $s_{i+1}$.

Then the number of k-tuples in the set S is n1 n2 · · · nk.

¹ When we talk about k-digit numbers, we’ll always assume that the most significant
(left-most) digit is not 0.

Example 2.1.3 How many 4-digit odd numbers are there? We consider each
number abcd as the 4-tuple (a, b, c, d). Such a number is odd if and only if the
last digit, d, is in the set {1, 3, 5, 7, 9}. There are no other restrictions on the
digits. Thus: n1 = 9 (since the first digit cannot be 0), n2 = n3 = 10, and
n4 = 5. It follows that the number of 4-digit odd numbers is 9 · 10 · 10 · 5 = 4500.

Example 2.1.4 How many odd numbers less than 10,000 are there? Here,
we’ll use the Addition Principle together with the Multiplication Principle. For
i ∈ {1, 2, 3, 4}, let Ni be the set of all i-digit odd numbers. Notice that
{N1 , N2 , N3 , N4 } is a pairwise disjoint collection of sets. Then by the Addition
Principle, the number of odd numbers less than 10,000 is |N1 |+|N2 |+|N3 |+|N4 |.
Using the Multiplication Principle as in the previous example, we find that
|N1 | = 5, |N2 | = 45, |N3 | = 450, and we already know that |N4 | = 4500.
Thus the answer is 4500 + 450 + 45 + 5 = 5000.
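
Examples 2.1.3 and 2.1.4 can be cross-checked by comparing the Multiplication Principle formula with a direct count. This is only a sketch; the function names are hypothetical, and the brute-force loop is practical here only because the numbers involved are below 10,000.

```python
def odd_count_by_formula(i):
    """Number of i-digit odd numbers via the Multiplication Principle."""
    if i == 1:
        return 5                      # 1, 3, 5, 7, 9
    return 9 * 10 ** (i - 2) * 5      # first digit, middle digits, last digit

def odd_count_by_listing(i):
    """Number of i-digit odd numbers, counted one by one."""
    return sum(1 for n in range(10 ** (i - 1), 10 ** i) if n % 2 == 1)

sizes = [odd_count_by_listing(i) for i in range(1, 5)]
print(sizes, [odd_count_by_formula(i) for i in range(1, 5)])
print(sum(sizes))                     # Addition Principle: the sets N1, ..., N4 are disjoint
```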

Example 2.1.5 How many k-tuples can be chosen from a set of n elements if
repetition is allowed? We seek the number of k-tuples (s1 , s2 , . . . , sk ) in which
each si comes from a fixed set of n elements and in which it’s possible that
si = sj . For each position in the k-tuple, we can choose any one of n different
elements. Hence, by the Multiplication Principle, there are

$$\underbrace{n \cdot n \cdots n}_{k \text{ terms}} = n^k$$

such k-tuples.
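
Python's itertools.product generates exactly these k-tuples with repetition allowed, so the formula can be checked on a small case. The element names and the values n = 3, k = 4 below are arbitrary choices for illustration.

```python
from itertools import product

elements = ["x", "y", "z"]     # a set of n = 3 elements
k = 4

# All k-tuples with entries from `elements`, repetition allowed.
tuples = list(product(elements, repeat=k))

print(len(tuples), len(elements) ** k)
```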
If S is a set, then we let $2^S$ denote the set of all subsets of S, sometimes
called the power set of S.

Example 2.1.6 If S = {a, b, c}, then

$2^S$ = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}.

The Multiplication Principle enables us to prove a formula for the number
of subsets of a set:

Theorem 10. If S is a finite set, then
$$\left| 2^S \right| = 2^{|S|}.$$

Proof. Let S = {x1 , x2 , . . . , x|S| }. With each subset A ⊆ S, associate an


|S|-tuple (a1 , a2 , . . . , a|S| ), where for i ∈ {1, 2, . . . , |S|} we set
(
0 if xi 6∈ A
ai =
1 if xi ∈ A.

Clearly, each subset corresponds to exactly one |S|-tuple and each |S|-tuple
corresponds to one subset. It follows that the number of subsets is equal to
the number of such |S|-tuples. Since each position in the |S|-tuple is chosen
from a set ({0, 1}) of size 2, we are asking the question: How many |S|-tuples
can be chosen from a set of 2 elements? By Example 2.1.5, the answer is
$2^{|S|}$.

Example 2.1.7 Continuing from the previous example: If we let x1 = a, x2 = b,


and x3 = c, then the subset {a, c} corresponds to the 3-tuple (1, 0, 1), while the
subset {b} corresponds to the 3-tuple (0, 1, 0). According to Theorem 10, the
number of subsets of S is
\[ \left| 2^S \right| = 2^{|S|} = 2^3 = 8. \]

You can verify this by counting the subsets that we found in the previous exam-
ple.
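The correspondence between subsets and 0/1 tuples used in the proof of Theorem 10 can be sketched in code. This is an illustrative example only; the function names `subset_from_tuple` and `tuple_from_subset` are assumptions introduced here, not notation from the notes.

```python
from itertools import product

def subset_from_tuple(elements, bits):
    """Return the subset whose indicator tuple is `bits` (1 means 'included')."""
    return {x for x, b in zip(elements, bits) if b == 1}

def tuple_from_subset(elements, subset):
    """Return the indicator tuple (a_1, ..., a_|S|) of `subset`."""
    return tuple(1 if x in subset else 0 for x in elements)

elements = ["a", "b", "c"]  # x1 = a, x2 = b, x3 = c

# Every 0/1 tuple of length |S| gives one subset, so there are 2^|S| subsets.
all_subsets = [
    subset_from_tuple(elements, bits)
    for bits in product((0, 1), repeat=len(elements))
]

print(tuple_from_subset(elements, {"a", "c"}))
print(len(all_subsets))
```

Because the two helper functions undo each other, listing all 0/1 tuples lists every subset exactly once.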

2.2 The Pigeonhole Principle


The Pigeonhole Principle in its simplest form states simply that
26 MATH236 Discrete Mathematics with Applications 2009

Theorem 11 (Pigeonhole Principle). If n+1 objects are placed into n boxes,


then at least one box contains at least two objects.

Proof. If this is not the case, then each box contains at most one object,
implying that the number of objects is at most n. This is a contradiction.

Example 2.2.1 If we choose 13 people, then there are two who have their
birthday in the same month.

Example 2.2.2 Suppose we have n married couples. How many of the 2n


people must be selected to guarantee that we have chosen at least one married
couple? Construct n boxes, each corresponding to one married couple. Each
time we choose someone, place them into the box corresponding to the couple
they are a member of. The Pigeonhole Principle says that once we’ve chosen
n+1 people, at least one box must contain 2 people, i.e., we’ve chosen a married
couple.

Example 2.2.3 We choose 101 of the integers 1, 2, . . . , 200. Show that among
the integers chosen, there are two having the property that one is divisible by the
other. Each integer in {1, 2, . . . , 200} can be written in the form $n \cdot 2^k$, where n is
an odd number between 1 and 199. There are 100 odd integers in {1, 2, . . . , 200},
so if we choose 101 numbers, by the Pigeonhole Principle, two of the numbers
we’ve chosen must have the same odd n, i.e., two of the numbers we’ve chosen
are of the form $n \cdot 2^{k_1}$ and $n \cdot 2^{k_2}$. Assuming without loss of generality that $k_1 \ge k_2$,
the number $n \cdot 2^{k_2}$ divides the number $n \cdot 2^{k_1}$, as desired.
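The argument of Example 2.2.3 is constructive, and can be sketched as a short program. This is a hedged illustration; the names `odd_part` and `divisible_pair` are hypothetical helpers. Each chosen number is placed in the box labelled by its odd part, and the first box to receive two numbers yields the required pair.

```python
def odd_part(m: int) -> int:
    """Return the odd n such that m = n * 2**k."""
    while m % 2 == 0:
        m //= 2
    return m

def divisible_pair(chosen):
    """Return (a, b) with a < b and a dividing b, or None if no box fills twice."""
    boxes = {}  # odd part -> smallest chosen number with that odd part
    for m in sorted(chosen):
        n = odd_part(m)
        if n in boxes:
            # Same odd part and boxes[n] < m, so boxes[n] divides m.
            return boxes[n], m
        boxes[n] = m
    return None

print(divisible_pair(range(100, 201)))  # 101 numbers: a pair must exist
print(divisible_pair(range(101, 201)))  # 100 numbers: no pair is guaranteed
```

The second call shows that 101 cannot be lowered: the 100 integers 101, 102, . . . , 200 contain no pair in which one divides the other.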
We now give a stronger form of the Pigeonhole Principle:

Theorem 12 (Strong Pigeonhole Principle). Let n1 , n2 , . . . , nk be positive


integers. If
n1 + n2 + · · · + nk − k + 1
objects are placed into k boxes, then there is an integer i ∈ {1, 2, . . . , k} such
that the ith box contains at least ni objects.

Proof. Assume, to the contrary, that this is not the case. Then for each
i ∈ {1, 2, . . . , k}, the ith box contains at most ni − 1 objects. Thus the total
number of objects is at most

(n1 − 1) + (n2 − 1) + · · · + (nk − 1) = n1 + n2 + · · · + nk − k,

which is a contradiction.

Example 2.2.4 A basket of fruit is to be made up from apples, bananas,


litchis, and mangos. How many pieces of fruit must we place in the basket to
be guaranteed that there are at least three apples or at least two bananas or at
least ten litchis or at least five mangos? From the Strong Pigeonhole Principle,
the answer is 3+2+10+5-4+1=17.

Example 2.2.5 Suppose that we choose $n^2 + 1$ integers from the integers
1, 2, . . . , n (repetition being allowed). Think of this as picking $n^2 + 1$ integers, putting each into a box
marked from 1 to n depending on its value. Since

\[ n^2 + 1 = \underbrace{(n + 1) + (n + 1) + \cdots + (n + 1)}_{n \text{ terms}} - n + 1, \]

the Strong Pigeonhole Principle guarantees that at least one box contains at
least n + 1 objects, i.e., at least one of the integers 1, 2, . . . , n is chosen at least
n + 1 times.
Let $a_1, a_2, \ldots, a_k$ be a sequence of real numbers. Recall that a subsequence
is a sequence of the form $a_{i_1}, a_{i_2}, \ldots, a_{i_t}$, where $i_1 < i_2 < \cdots < i_t$. The
sequence $a_1, a_2, \ldots, a_k$ is increasing if $a_1 \le a_2 \le \cdots \le a_k$, and decreasing if
$a_1 \ge a_2 \ge \cdots \ge a_k$.

Example 2.2.6 Consider the sequence 8, 1, 3, 5, 9, 2, 6, 4, 7. Then 1, 5, 2, 7 is


a subsequence, but 1, 2, 3, 4 is not. 1, 3, 5, 9 is an increasing subsequence, and
8, 5, 2 is a decreasing subsequence.
The Strong Pigeonhole Principle enables us to prove the following interesting
result, first established by Erdős and Szekeres.

Theorem 13 (Erdős-Szekeres). Every sequence $a_1, a_2, \ldots, a_{n^2+1}$ of $n^2 + 1$ real
numbers contains an increasing subsequence of length n + 1 or a decreasing
subsequence of length n + 1.
Proof. Suppose that there is no increasing subsequence of length n + 1; we
shall show that there must as a consequence be a decreasing subsequence of
length n + 1. For each $k \in \{1, 2, \ldots, n^2 + 1\}$, let $\ell_k$ be the length of a longest
increasing subsequence beginning with $a_k$. By assumption, each $\ell_k$ is one of
the numbers 1, 2, . . . , n. It follows from the Strong Pigeonhole Principle (see
Example 2.2.5) that n + 1 of these numbers, say $\ell_{k_1}, \ell_{k_2}, \ldots, \ell_{k_{n+1}}$,
are equal. We assume, without loss of generality, that $k_1 < k_2 < \cdots < k_{n+1}$.
We claim that
\[ a_{k_1} \ge a_{k_2} \ge \cdots \ge a_{k_{n+1}}. \]
Suppose, to the contrary, that there is some integer $i \in \{1, 2, \ldots, n\}$ such
that $a_{k_i} < a_{k_{i+1}}$. But then by beginning with $a_{k_i}$ and then taking a longest
increasing subsequence starting at $a_{k_{i+1}}$, we obtain an increasing subsequence
beginning at $a_{k_i}$ of length $\ell_{k_{i+1}} + 1$, implying that $\ell_{k_i} > \ell_{k_{i+1}}$, a
contradiction. Hence, $a_{k_1} \ge a_{k_2} \ge \cdots \ge a_{k_{n+1}}$ and $a_{k_1}, a_{k_2}, \ldots, a_{k_{n+1}}$ is a
decreasing subsequence of length n + 1, as required.
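The quantities used in this proof are easy to compute. The sketch below is an illustration only, with a function name of our own choosing. It computes, for each position k, the length of a longest increasing (and decreasing) subsequence starting there, using the non-strict definitions given above, and then checks the theorem exhaustively for n = 2.

```python
from itertools import permutations

def longest_monotone_lengths(seq):
    """Return (longest increasing, longest decreasing) subsequence lengths."""
    size = len(seq)
    inc = [1] * size  # inc[k]: longest increasing subsequence starting at seq[k]
    dec = [1] * size  # dec[k]: longest decreasing subsequence starting at seq[k]
    for k in range(size - 1, -1, -1):
        for j in range(k + 1, size):
            if seq[k] <= seq[j]:
                inc[k] = max(inc[k], 1 + inc[j])
            if seq[k] >= seq[j]:
                dec[k] = max(dec[k], 1 + dec[j])
    return max(inc), max(dec)

# The sequence of Example 2.2.6.
print(longest_monotone_lengths([8, 1, 3, 5, 9, 2, 6, 4, 7]))

# Theorem 13 with n = 2: every arrangement of 2^2 + 1 = 5 distinct numbers
# has a monotone subsequence of length at least 3.
holds_for_n_2 = all(
    max(longest_monotone_lengths(p)) >= 3 for p in permutations(range(5))
)
print(holds_for_n_2)
```

The exhaustive check covers only sequences of five distinct values; it illustrates the statement and is not a substitute for the proof.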

2.3 One-to-one functions and permutations


As we mentioned before, a function is a relation in which each element of
the domain is related to exactly one element of the range. If a function f
has the additional property that no two elements of its domain are related
to the same element of its range, i.e., x ≠ y implies that f(x) ≠ f(y), then
f is called a one-to-one function or an injection. If f is one-to-one, then the
relation
\[ f^{-1} = \{(b, a) : (a, b) \in f\} \]
is also a function; conversely, if f is not one-to-one, then $f^{-1}$ is not a function.
If f is an injection, then $f^{-1}$ is called the inverse function of f.

Example 2.3.1
• Let f1 = {(1, a), (2, c), (3, b)} and f2 = {(1, a), (2, b), (3, b)}. Both f1
and f2 are functions. The function f1 is an injection, while f2 is not (since
the element b of the range is related to both 2 and 3). The inverse of f1
is $f_1^{-1} = \{(a, 1), (c, 2), (b, 3)\}$.

• The relation $f_3 = \{(x, x^2) : x \in \mathbb{R}\}$ is a function. However, both (2, 4)
and (−2, 4) are members of $f_3$, so $f_3$ is not one-to-one.

• The relation $f_4 = \{(x, 5x + 6) : x \in \mathbb{R}\}$ is a one-to-one function. Its
inverse is the function $f_4^{-1} = \{(x, (x - 6)/5) : x \in \mathbb{R}\}$.

• The relation $f_5 = \{(x, \tan(\pi(x - 1/2))) : x \in (0, 1)\}$ is a one-to-one
function. Its inverse is the function $f_5^{-1} = \{(x, (1/\pi) \arctan x + 1/2) : x \in (-\infty, \infty)\}$.

Let S be a finite nonempty set. A permutation of S is a one-to-one function


whose domain and range are both S.
The following example is important for what follows and it is suggested
that you study it closely.

Example 2.3.2 Let S = {1, 2, 3}. One permutation of S is the function f with
f(1) = 1, f(2) = 3, and f(3) = 2. This permutation can be denoted in the
following fashion:
\[ f = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}. \]
The top row of the matrix is read as the domain and the bottom row as the
range. Thus, in general, this notation has this form:
\[ f = \begin{pmatrix} 1 & 2 & 3 \\ f(1) & f(2) & f(3) \end{pmatrix}. \]

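There are six permutations of a set of three elements. The six permutations of
{1, 2, 3} are:
\[ f_1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix} \quad f_2 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \quad f_3 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} \]
\[ f_4 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix} \quad f_5 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} \quad f_6 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} \]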

Written this way, it’s easy to find the inverse of a permutation: interchange
the top and bottom rows of the matrix and then (if necessary) re-order the
columns by the elements in the (new) first row. For example, the inverse of
\[ f_2 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \]
is
\[ \begin{pmatrix} 2 & 3 & 1 \\ 1 & 2 & 3 \end{pmatrix} \]
which, after re-sorting the columns so that the top row is in ascending order, becomes the permutation
\[ \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} = f_3. \]

In other words, $f_2^{-1} = f_3$. Similarly:

\[ f_1^{-1} = f_1, \quad f_3^{-1} = f_2, \quad f_4^{-1} = f_4, \quad f_5^{-1} = f_5, \quad f_6^{-1} = f_6. \]

If $f = f^{-1}$, as is the case with $f_1$, $f_4$, $f_5$, and $f_6$, then f is called an involution.


Another way of writing permutations is as follows: for the permutation
\[ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 1 & 3 \end{pmatrix} \]
we simply write the bottom row, (2413). Thus, we could write $f_1 = (123)$,
$f_6 = (213)$, etc.
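The row-swapping recipe for inverses translates directly into code. In the sketch below (an illustration, with a permutation stored as its bottom row, so that `perm[i - 1]` is f(i); the function name `inverse` is our own) we swap the rows, re-sort by the new top row, and read off the new bottom row.

```python
from itertools import permutations

def inverse(perm):
    """Invert a permutation of {1, ..., n} given as its bottom row."""
    # Swap the rows: each column (i, f(i)) becomes (f(i), i); then sort columns.
    swapped = sorted((image, i) for i, image in enumerate(perm, start=1))
    return tuple(i for _, i in swapped)

f2 = (2, 3, 1)
f3 = (3, 1, 2)
print(inverse(f2))

# The involutions of {1, 2, 3} are the permutations equal to their own inverse.
involutions = [p for p in permutations((1, 2, 3)) if inverse(p) == p]
print(involutions)
```

The list of involutions found this way consists of the four permutations written (123), (132), (213), and (321) in the one-row notation.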

2.4 Counting permutations


We now consider how to count permutations and similar mathematical ob-
jects.

Example 2.4.1 How many permutations of the set S = {a, b, c} are there?

As mentioned in the last example, we can think of each permutation on S as a


3-tuple (f (a), f (b), f (c)). The question then becomes: How many such 3-tuples
are there? The value f (a) may be any of three things: a, b, and c. Once we’ve
assigned a value to f (a), we must assign one to f (b). Since f is one-to-one,
f (b) cannot have the same value as f (a). This means that while there were
three choices for f (a), for each of these, there are only two possibilities for f (b).
Lastly, once we’ve chosen f (b), we must specify f (c), which can be neither f (a)
nor f (b). It follows that, whatever the values of f (a) and f (b), there’s only
one possible value for f (c). Finally, from the Multiplication Principle, the total
number of such 3-tuples is 3 · 2 · 1 = 6.
Recall that n factorial is the function with rule 0! = 1 and, for n ≥ 1,

n! = n · (n − 1) · (n − 2) · · · 3 · 2 · 1,

e.g.,

0! = 1
1! = 1
2! = 2
3! = 6
4! = 24
5! = 120.

This gives us a compact way to write certain expressions down. Recall that
an n-set is a set containing n elements:

Example 2.4.2 How many permutations of an n-set are there? Suppose that
the elements of the n-set are x1 , x2 , . . . , xn . As in the previous example, we
can consider each permutation to be an n-tuple (f (x1 ), f (x2 ), . . . , f (xn )). As
before, we have n choices for f (x1 ). Once we’ve chosen f (x1 ), there are (n − 1)
possible choices for f (x2 ), and, once we’ve chosen f (x2 ), there are (n − 2) pos-
sible choices for f (x3 ), and so on. So we see that the number of permutations
of an n-set is n(n − 1)(n − 2) · · · 2 · 1 = n!.

Example 2.4.3 Continuing the previous example: there are 6! = 720 permuta-
tions of the set {a, b, c, d, e, f }.

Example 2.4.4 How many k-tuples can be chosen from an n-set if repetition
of elements is not allowed? If repetition of elements is allowed, we already know
the answer to this question (Example 2.1.5). Suppose then that repetition is not
allowed. We may choose any of n elements for the first position in the k-tuple.
Once we’ve done that, we may choose any element for the second position ex-
cept the one we chose for the first, so for each choice for the first position, there
are n − 1 choices for the second position. The pattern is hopefully clear: once
we’ve chosen the second position, there are (n − 2) possibilities for the third
position, and so on, until we reach the kth position, for which there are n − k + 1
possibilities. From the Multiplication Principle, the number of such k-tuples is
thus
n(n − 1)(n − 2) · · · (n − k + 2)(n − k + 1). (2.1)

Example 2.4.5 Continuing the previous example: we can choose 7 · 6 · 5 = 210


3-tuples from a set of seven elements.
The expression we saw in equation (2.1) occurs frequently, and for this reason
it has a shorthand:
\[ (n)_k = n(n - 1) \cdots (n - k + 1) = \frac{n!}{(n - k)!}. \]
The quantity $(n)_k$ is called a falling factorial.

Example 2.4.6 How many six-letter words (including nonsense words) can be formed from the letters
a,b,c,d,e,f,g,h,i (each letter can be used only once)? Each word corresponds to
a 6-tuple. The question is then: How many 6-tuples can be formed from a set
of 9 elements? The answer is
\[ (9)_6 = \frac{9!}{3!} = 60480. \]

Example 2.4.7 In how many ways can a President, Vice President, and Secretary
be elected from a group of fifteen people? We need to know how many 3-tuples of
the form (name of President, name of Vice-President, name of Secretary) there
are. The answer is $(15)_3 = 15 \cdot 14 \cdot 13 = 2730$.
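The falling factorial is simple to compute, and its values in the preceding examples can be checked against a direct enumeration. This is a small sketch; `falling_factorial` is a name chosen here for illustration, and the standard library function `math.perm` computes the same quantity.

```python
import math
from itertools import permutations

def falling_factorial(n: int, k: int) -> int:
    """Return (n)_k = n(n - 1)...(n - k + 1), the number of k-tuples without repetition."""
    result = 1
    for i in range(k):
        result *= n - i
    return result

print(falling_factorial(7, 3))   # Example 2.4.5
print(falling_factorial(9, 6))   # Example 2.4.6
print(falling_factorial(15, 3))  # Example 2.4.7

# Enumerate the 3-tuples without repetition from a 7-set and count them.
enumerated = sum(1 for _ in permutations(range(7), 3))
print(enumerated == falling_factorial(7, 3) == math.perm(7, 3))
```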

Example 2.4.8 In how many different ways can the 26 letters of the alphabet
be arranged so that no two of the vowels a,e,i,o,u occur in consecutive positions?
First of all, there are 21 consonants, so, ignoring the vowels for a moment, there
are 21! ways of arranging the consonants. Once we’ve arranged them, we must
place the vowels in the 22 ‘holes’ between, before, and after the consonants, and
whichever hole we place one vowel in, we cannot place any of the others in the
same hole. It follows that, once the consonants have been arranged, there are
$(22)_5$ ways of distributing the vowels. Finally, by the Multiplication Principle, the
number of ways of arranging all the letters is
\[ 21! \cdot (22)_5 = 21! \cdot \frac{22!}{17!} = 161451464537975567155200000. \]
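The same "holes" argument can be tested on a small alphabet where brute force is feasible. The sketch below is illustrative only; the function names and the toy alphabet with consonants b, c, d and vowels a, e are assumptions made for the check. It compares the formula with an exhaustive count and also evaluates the formula for the full alphabet.

```python
import math
from itertools import permutations

def count_by_formula(num_consonants: int, num_vowels: int) -> int:
    """Arrange the consonants, then place the vowels in distinct holes."""
    holes = num_consonants + 1  # between, before, and after the consonants
    return math.factorial(num_consonants) * math.perm(holes, num_vowels)

def count_by_brute_force(consonants: str, vowels: str) -> int:
    """Count arrangements in which no two vowels are adjacent."""
    count = 0
    for word in permutations(consonants + vowels):
        adjacent = any(a in vowels and b in vowels for a, b in zip(word, word[1:]))
        if not adjacent:
            count += 1
    return count

print(count_by_brute_force("bcd", "ae"), count_by_formula(3, 2))
print(count_by_formula(21, 5))
```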

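You might have noticed that n! grows quickly, but how quickly exactly? A
commonly used approximation for n! is Stirling’s Approximation:
\[ n! \approx \sqrt{2\pi n} \left( \frac{n}{e} \right)^n. \]
This shows that n! grows faster than any exponential function $c^n$ with c fixed,
since the base n/e itself grows with n.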
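To get a feel for the quality of Stirling’s Approximation, one can compare it with the exact factorial for a few values of n. This is a minimal numerical sketch; the function name `stirling` is our own.

```python
import math

def stirling(n: int) -> float:
    """Stirling's Approximation sqrt(2*pi*n) * (n/e)^n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

# Ratio of the approximation to the exact value; it approaches 1 as n grows.
ratios = {n: stirling(n) / math.factorial(n) for n in (1, 5, 10, 20)}
for n, ratio in ratios.items():
    print(n, round(ratio, 4))
```

In these cases the approximation slightly underestimates n!, and the relative error shrinks as n increases.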

2.5 Combinations
Let n, k be nonnegative integers with k ≤ n. We define n choose k to be the
quantity
\[ \binom{n}{k} = \frac{n!}{(n - k)!\,k!} = \frac{(n)_k}{k!}. \]
The quantity $\binom{n}{k}$ is also called a binomial coefficient.

Example 2.5.1
\[ \binom{7}{4} = \frac{7!}{3!\,4!} = 35. \]

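Some binomial coefficients have simple forms. For example:
\begin{align*}
\binom{n}{0} &= \frac{n!}{n!\,0!} = 1 \\
\binom{n}{1} &= \frac{n!}{(n - 1)!\,1!} = n \\
\binom{n}{2} &= \frac{n!}{(n - 2)!\,2!} = \frac{n(n - 1)}{2} \\
\binom{n}{n-1} &= \frac{n!}{1!\,(n - 1)!} = n \\
\binom{n}{n} &= \frac{n!}{0!\,n!} = 1.
\end{align*}
Notice that $\binom{n}{0} = \binom{n}{n}$, and $\binom{n}{1} = \binom{n}{n-1}$. These are instances of a more
general relationship: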

Theorem 14. Let n, k be nonnegative integers with k ≤ n. Then
\[ \binom{n}{k} = \binom{n}{n-k}. \]

Proof.
\begin{align*}
\binom{n}{k} &= \frac{n!}{(n - k)!\,k!} = \frac{n!}{(n - k)!\,(n - (n - k))!} \\
&= \frac{n!}{(n - (n - k))!\,(n - k)!} \\
&= \binom{n}{n-k}.
\end{align*}

Binomial coefficients play several important roles in mathematics. We
now give some of them. Recall that a k-subset is a subset of cardinality k.
For fixed n, k with k ≤ n, let

• T(n, k) be the set of all k-tuples of distinct elements chosen from the set {1, 2, . . . , n} (that is, repetition is not allowed).

• S(n, k) be the set of all k-subsets of {1, 2, . . . , n}.

Define the function f : T(n, k) → S(n, k) by

\[ f((x_1, x_2, \ldots, x_k)) = \{x_1, x_2, \ldots, x_k\}. \]

Example 2.5.2

f ((1, 2, 4)) = {1, 2, 4}


f ((1, 4, 2)) = {1, 2, 4}
f ((2, 1, 4)) = {1, 2, 4}
f ((2, 4, 1)) = {1, 2, 4}
f ((4, 1, 2)) = {1, 2, 4}
f ((4, 2, 1)) = {1, 2, 4}

Theorem 15. Let n, k be nonnegative integers with k ≤ n. The number of
k-subsets of an n-set is $\binom{n}{k}$.

Proof. Notice that for each k-subset S in S(n, k), there are k! k-tuples T in T(n, k)
with f(T) = S. It follows that

\[ |T(n, k)| = k!\,|S(n, k)|. \]

But from Example 2.4.4, we know that $|T(n, k)| = (n)_k$. Hence $|S(n, k)| = (n)_k / k! = \binom{n}{k}$.
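The counting argument in this proof can be observed directly for small n and k. The sketch below is an illustration (the choice n = 5, k = 3 and the variable names are ours). It groups the k-tuples without repetition by the subset they map to under f and checks that every subset has exactly k! preimages.

```python
from itertools import combinations, permutations
from math import comb, factorial

n, k = 5, 3
ground_set = range(1, n + 1)

tuples = list(permutations(ground_set, k))   # T(n, k): k-tuples, no repetition
subsets = list(combinations(ground_set, k))  # S(n, k): k-subsets

# Group the tuples by their image under f((x1, ..., xk)) = {x1, ..., xk}.
preimages = {}
for t in tuples:
    preimages.setdefault(frozenset(t), []).append(t)

print(len(tuples), len(subsets), factorial(k))
print(all(len(group) == factorial(k) for group in preimages.values()))
```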

Example 2.5.3 In how many different ways can three representatives be chosen
from a group of fifteen people? Notice that we’re not asking how many
3-tuples can be chosen from a set of fifteen elements, because the 3-tuple
(Adam, Mary, Lisa) is the same set of three representatives as (Mary, Lisa, Adam)
(contrast this with Example 2.4.7). The three representatives are unordered, and
hence a subset rather than a k-tuple. Thus the answer we require is $\binom{15}{3} = 455$.

Example 2.5.4 How many hands of five cards can be drawn from a standard
deck of 52 cards? Once again, the order the cards are drawn in is unimportant.
The answer is $\binom{52}{5} = 2598960$.
Two well-known identities involving binomial coefficients are the following:

Theorem 16 (Pascal’s Identity). Let n, k be positive integers with k ≤ n − 1.
Then
\[ \binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}. \]
Proof. We’ll use a combinatorial proof here. Let N be the number of k-subsets
of the set S = {1, 2, . . . , n}. What is N? Firstly, from Theorem 15,
\[ N = \binom{n}{k}. \]
Here’s another way to find N. Let $N_1$ be the number of k-subsets of S that
include the number 1, and let $N_2$ be the number of k-subsets of S that do
not include the number 1. Clearly,

\[ N = N_1 + N_2. \]

Let’s determine $N_1$ and $N_2$. Each set that contributes to $N_1$ contains the
number 1 and k − 1 other elements chosen from the (n − 1)-set {2, 3, . . . , n}.
Hence, $N_1 = \binom{n-1}{k-1}$. Each set that contributes to $N_2$ does not contain 1,
so it’s a set of k elements chosen from the (n − 1)-set {2, 3, . . . , n}. Thus,
$N_2 = \binom{n-1}{k}$. Finally,
\[ N = \binom{n-1}{k-1} + \binom{n-1}{k}, \]
which completes the proof.

Example 2.5.5 From first principles,
\[ \binom{6}{4} = \frac{6!}{2!\,4!} = 15. \]
We can also calculate this using Pascal’s Identity:
\begin{align*}
\binom{6}{4} &= \binom{5}{3} + \binom{5}{4} \\
&= \binom{4}{2} + \binom{4}{3} + \binom{5}{4} \\
&= 6 + 4 + 5 \\
&= 15.
\end{align*}
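Pascal’s Identity gives a way to compute binomial coefficients using only addition. The following is a minimal sketch of that idea; the function name `choose` is ours, and memoisation via `functools.lru_cache` is an implementation convenience rather than part of the identity.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def choose(n: int, k: int) -> int:
    """Compute n choose k using only Pascal's Identity and the boundary values."""
    if k < 0 or k > n:
        return 0
    if k == 0 or k == n:
        return 1  # boundary cases: n choose 0 = n choose n = 1
    return choose(n - 1, k - 1) + choose(n - 1, k)

print(choose(6, 4))
print(choose(5, 3), choose(5, 4))
print(choose(52, 5))
```

Without the cache the recursion would recompute the same coefficients many times; with it, each pair (n, k) is evaluated once.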

Theorem 17. Let n be a nonnegative integer. Then
\[ \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n} = 2^n. \]
Proof. If n = 0, then both sides equal 1, so we’ll assume that n ≥ 1. We’ll
use a combinatorial proof again. Consider the set S = {1, 2, . . . , n} and let
N be the number of subsets of S. What is N? From Theorem 10,
\[ N = 2^n. \]
But we can also determine N in a different way. For each k ∈ {0, 1, . . . , n},
denote by $N_k$ the number of k-subsets of S. Then
\[ N = N_0 + N_1 + \cdots + N_n. \]
Moreover, from Theorem 15, we know that $N_k = \binom{n}{k}$. Thus,
\[ N = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n}, \]
which completes the proof.
We complete our study of binomial coefficients by stating and proving
the theorem from which they derive their name:

Theorem 18 (Binomial Theorem). For all real numbers x, y and every nonnegative
integer n,
\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}. \]

Proof. If n = 0, then the left hand side and the right hand side are both equal to 1, so
we assume that n ≥ 1. Consider the left hand side:
\[ (x + y)^n = \underbrace{(x + y)(x + y) \cdots (x + y)}_{n \text{ factors}}. \tag{2.2} \]

When we expand the right hand side of equation (2.2), we obtain terms of
the form $x^k y^j$. Since there are n factors in the product on the right hand
side of (2.2) and each term of the form $x^k y^j$ is obtained by choosing in each
factor (x + y) either the x or the y, it must be the case that k + j = n, i.e., each
term is of the form $x^k y^{n-k}$. It remains to determine, for a fixed k, how many
terms of the form $x^k y^{n-k}$ occur when we expand the right hand side of (2.2).
The term $x^k y^{n-k}$ occurs when, in multiplying through the right hand side
of (2.2), we choose x in k of the factors (x + y) (and, consequently, choose y
in the remaining n − k factors), so the number of terms of the form $x^k y^{n-k}$
is equal to the number of ways we can choose k of the n factors of the
form (x + y). By Theorem 15, this is the number $\binom{n}{k}$, thus completing the proof.
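The expansion described in this proof can be carried out mechanically: multiply by (x + y) one factor at a time, and at each step either keep the power of x (choosing y) or raise it by one (choosing x). The sketch below is an illustration with a function name of our own; it stores the coefficient of $x^k y^{n-k}$ at index k.

```python
from math import comb

def expand_power(n: int) -> list:
    """Coefficients of (x + y)^n, where index k holds the coefficient of x^k y^(n-k)."""
    coeffs = [1]  # (x + y)^0 = 1
    for _ in range(n):
        new = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k] += c      # choose y in the new factor: power of x unchanged
            new[k + 1] += c  # choose x in the new factor: power of x goes up by 1
        coeffs = new
    return coeffs

print(expand_power(3))
print(expand_power(6) == [comb(6, k) for k in range(7)])
```

Summing the coefficients with alternating signs gives 0 for every n ≥ 1, which corresponds to setting x = −1 and y = 1 in the theorem.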

Example 2.5.6
\begin{align*}
(x + y)^3 &= \sum_{k=0}^{3} \binom{3}{k} x^k y^{3-k} \\
&= \binom{3}{0} y^3 + \binom{3}{1} x y^2 + \binom{3}{2} x^2 y + \binom{3}{3} x^3 \\
&= y^3 + 3xy^2 + 3x^2 y + x^3.
\end{align*}

Corollary 19. Let n be a positive integer. Then
\[ \binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \cdots + (-1)^n \binom{n}{n} = 0. \]
Proof. In the Binomial Theorem, set x = −1 and y = 1. The left hand side
is then 0. The right hand side becomes
\[ \binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \cdots + (-1)^n \binom{n}{n}. \]

2.6 The Inclusion-Exclusion Principle


Recall that the Addition Principle gives a formula for the number of objects in
a union of sets provided that the sets do not overlap. The Inclusion-Exclusion
Principle gives a more general formula that can be applied to collections of
sets with nonempty intersections.
Let S be a set, each of whose objects may have one or both of two
properties, P1 and P2. Suppose we wish to count the number of objects of
S that have neither property P1 nor property P2. We can do this by taking
the number of objects in the set S, subtracting from that the number of
objects with property P1 and the number of objects with property P2, and
then noting that those objects which have both property P1 and property P2
were subtracted twice, so we add back in that number of objects. In other
words, if $S_1$, $S_2$ are the sets of all objects with properties P1, P2, respectively,
and $\overline{S_i}$ denotes the complement of $S_i$ in S, then
\[ |\overline{S_1} \cap \overline{S_2}| = |S| - |S_1| - |S_2| + |S_1 \cap S_2|. \]
This observation can be generalized.

Theorem 20. The number of objects of S that have none of the properties
P1, P2, . . . , Pn is given by
\[ |\overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n}| = |S| - \sum |S_i| + \sum |S_i \cap S_j| - \sum |S_i \cap S_j \cap S_k| + \cdots + (-1)^n |S_1 \cap S_2 \cap \cdots \cap S_n|. \]

Proof. We prove this result by establishing that (i) every element of
$\overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n}$ makes a net contribution of 1 to the right hand side, while
(ii) every other element makes a net contribution of 0 to the right hand side.
Consider first an element $x \in \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n}$. Then x ∈ S but for all i,
the element $x \notin S_i$. Thus, the contribution that x makes to the right hand
side is
\[ 1 - 0 + 0 - 0 + \cdots + (-1)^n 0 = 1. \]
Now consider an element $y \notin \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n}$; specifically, suppose that y
satisfies exactly t ≥ 1 of the properties P1, P2, . . . , Pn. Then y makes a contribution
of $1 = \binom{t}{0}$ to |S|. It contributes 1 to $\binom{t}{1}$ of the quantities $|S_i|$, and 1 to $\binom{t}{2}$
of the quantities $|S_i \cap S_j|$, and 1 to $\binom{t}{3}$ of the quantities $|S_i \cap S_j \cap S_k|$, and
so on. Thus, its net contribution to the right hand side is
\[ \binom{t}{0} - \binom{t}{1} + \binom{t}{2} - \binom{t}{3} + \cdots + (-1)^t \binom{t}{t}. \]
But, according to Corollary 19, this is equal to 0. This establishes the result.

Example 2.6.1 How many permutations of the letters


i,n,t,h,e,d,o,c,k
are such that none of the words in, the, and dock occur as consecutive letters?
Let S be the set of all permutations of the letters i,n,t,h,e,d,o,c,k. Let P1
be the property that the word in occurs, P2 that the word the occurs, and P3
that the word dock occurs. Then every permutation in S1 is a permutation of
the eight symbols
in,t,h,e,d,o,c,k
(the word in is treated as one symbol). Thus, |S1 | = 8!. Similarly, every
permutation in S2 is a permutation of the seven symbols
the,i,n,d,o,c,k
and so |S2 | = 7!. Similarly, |S3 | = 6!. Now, S1 ∩ S2 consists of all permutations
of the six symbols
in,the,d,o,c,k,
so |S1 ∩S2 | = 6!. Similarly, |S1 ∩S3 | = 5! and |S2 ∩S3 | = 4!. Finally, S1 ∩S2 ∩S3
consists of all permutations of the three symbols
in,the,dock,
implying that $|S_1 \cap S_2 \cap S_3| = 3!$. Putting all of this together using Theorem
20, we find that
\begin{align*}
|\overline{S_1} \cap \overline{S_2} \cap \overline{S_3}| &= |S| - |S_1| - |S_2| - |S_3| + |S_1 \cap S_2| + |S_1 \cap S_3| + |S_2 \cap S_3| - |S_1 \cap S_2 \cap S_3| \\
&= 9! - 8! - 7! - 6! + 6! + 5! + 4! - 3! \\
&= 317658.
\end{align*}
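Since there are only 9! = 362880 permutations of these nine letters, the count in Example 2.6.1 can also be confirmed by brute force. The following sketch is illustrative only (the variable names are ours). It enumerates every arrangement and rejects those containing any of the three words as consecutive letters.

```python
from itertools import permutations
from math import factorial

letters = "inthedock"            # the nine distinct letters i,n,t,h,e,d,o,c,k
words = ("in", "the", "dock")

# Count arrangements in which none of the words appears as a substring.
by_brute_force = 0
for p in permutations(letters):
    arrangement = "".join(p)
    if not any(w in arrangement for w in words):
        by_brute_force += 1

# The inclusion-exclusion count from Theorem 20.
by_formula = (
    factorial(9)
    - factorial(8) - factorial(7) - factorial(6)
    + factorial(6) + factorial(5) + factorial(4)
    - factorial(3)
)

print(by_brute_force, by_formula)
```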

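The result in Theorem 20 can be rewritten in another form:

Corollary 21 (Inclusion-Exclusion Principle). The number of objects of S
that have at least one of the properties P1, P2, . . . , Pn is given by
\[ |S_1 \cup S_2 \cup \cdots \cup S_n| = \sum |S_i| - \sum |S_i \cap S_j| + \sum |S_i \cap S_j \cap S_k| - \cdots + (-1)^{n+1} |S_1 \cap S_2 \cap \cdots \cap S_n|. \]

Proof. Recall that (Theorem 1)

\[ \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n} = \overline{S_1 \cup S_2 \cup \cdots \cup S_n}. \]

Moreover,
\[ S = (S_1 \cup S_2 \cup \cdots \cup S_n) \cup \overline{(S_1 \cup S_2 \cup \cdots \cup S_n)}, \]
and the two sets in this union are disjoint, so
$|S_1 \cup S_2 \cup \cdots \cup S_n| = |S| - |\overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n}|$.
The result follows immediately from algebraically manipulating the formula
in Theorem 20.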

Example 2.6.2 How many numbers between 0 and 100 (inclusive) are divisible by 2, 3, or
5? Let P2 be the property that a number is divisible by 2, P3 that it is divisible
by 3, and P5 that it is divisible by 5. Then we need to know $|S_2 \cup S_3 \cup S_5|$. There
are 51 numbers between 0 and 100 with property P2 (namely 0, 2, 4, . . . , 100), 34 with property P3, and
21 with property P5. A number is divisible by both 2 and 3 iff it’s divisible by 6,
and there are 17 of these numbers, so $|S_2 \cap S_3| = 17$. Similarly, $|S_2 \cap S_5| = 11$,
and $|S_3 \cap S_5| = 7$. Lastly, $|S_2 \cap S_3 \cap S_5| = 4$. Then from the Inclusion-Exclusion
Principle we have
\begin{align*}
|S_2 \cup S_3 \cup S_5| &= |S_2| + |S_3| + |S_5| - |S_2 \cap S_3| - |S_2 \cap S_5| - |S_3 \cap S_5| + |S_2 \cap S_3 \cap S_5| \\
&= 51 + 34 + 21 - 17 - 11 - 7 + 4 \\
&= 75.
\end{align*}
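The numbers in Example 2.6.2 can be reproduced with a few lines of code. This is a sketch under the reading that the range is 0, 1, . . . , 100 with both endpoints included (so that 0 counts as a multiple of every number); the function name `count_multiples` is ours.

```python
def count_multiples(d: int, limit: int = 100) -> int:
    """Number of multiples of d in {0, 1, ..., limit}; 0 is counted as a multiple."""
    return limit // d + 1

# Inclusion-Exclusion: singles, minus pairs (lcm 6, 10, 15), plus the triple (lcm 30).
by_formula = (
    count_multiples(2) + count_multiples(3) + count_multiples(5)
    - count_multiples(6) - count_multiples(10) - count_multiples(15)
    + count_multiples(30)
)

# Direct enumeration for comparison.
by_brute_force = sum(
    1 for m in range(0, 101) if m % 2 == 0 or m % 3 == 0 or m % 5 == 0
)

print(by_formula, by_brute_force)
```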

2.7 Infinite sets


In this section we briefly consider how to count infinite sets, and we see that
some infinities are ‘bigger’ than others.
Recall that a function f : X → Y is onto if for every y ∈ Y there exists
x ∈ X such that f (x) = y, or, equivalently, if ran f = Y . If a function is
both one-to-one and onto, it’s called a bijection.

Example 2.7.1 Consider the function f : R → R with rule f (x) = 2x + 3.


This function is linear with nonzero slope and hence one-to-one. To see that it
is onto, let y ∈ R; we must find x ∈ R such that f (x) = y. But if we choose
x = (y − 3)/2, then
f (x) = 2(y − 3)/2 + 3 = y
as desired. Thus f is a bijection.

When we count, we establish a bijection between the set we’re counting


and a subset of the integers. What do we mean by that? Suppose you’re
counting penguins as they march through a gate. As the first one passes,
you say out loud, “one”. As the second passes, you say, “two”. As the third
passes, “three”. And so on. Each penguin is associated with a specific integer
and, conversely, each integer (less than or equal to the number of penguins)
is associated with a specific penguin. If, as the last penguin passes, we
say “twenty seven”, then there are twenty seven penguins in total, and we
established this by constructing a bijection between the set of penguins and
the subset {1, 2, . . . , 27} of the integers.
This leads us to the following definition: Two sets A and B have the same
cardinality, written |A| = |B|, if there is a bijection between them.

Example 2.7.2 Let A = {3, 4, 5, 6, 7} and B = {5, 6, 7, 8, 9}. Then the func-
tion f : A → B with rule f (x) = x + 2 is a bijection, so |A| = |B|.

Example 2.7.3 Let A = {3, 4, 5, 6, 7} and B = {1, 2, 3}. Then no bijection


can be constructed between these two sets, because any one-to-one function
with domain A must have a range with cardinality at least 5, and B has only 3
elements.

These two examples are both hopefully straightforward. Once the sets un-
der consideration become infinite, things become both more complicated and
more interesting.

Example 2.7.4 Let A = N (the positive integers) and B = {2, 4, 6, 8, . . .}.


Notice that B is a proper subset of A. However, the function f : A → B with
rule f (x) = 2x is a bijection from A to B. Thus, even though B is completely
contained in A, the sets A and B have the same (infinite) cardinality.

Example 2.7.5 Consider the function f : (0, 1) → R with rule f (x) =


tan(π(x − 1/2)). This function is a bijection between (0, 1) and R. Conse-
quently, |(0, 1)| = |R|.
We’ve just seen two examples in which infinite sets that appeared to have
different sizes actually had the same cardinality. We now see an example
of two infinite sets of different cardinality. First, we introduce some new
notation; we shall let

|N| = ℵ0 , and
|R| = ℵ1

(the funny symbol is pronounced ‘aleph’). A set with cardinality ℵ0 is denu-


merable. A set that is either finite or denumerable is countable. A set that
is not countable is uncountable.

Theorem 22. $\aleph_0 \neq \aleph_1$.

Proof. From Example 2.7.5, we know that $|(0, 1)| = \aleph_1$. We shall thus prove
the result by showing that it is impossible to construct a bijection from N to
(0, 1); the proof we shall use is called Cantor’s Diagonal Argument.
Suppose, to the contrary, that there exists a bijection f : N → (0, 1).
Then for each x ∈ N, f(x) is a number in (0, 1), i.e., a number that can be
written in the form $0.x_1 x_2 x_3 x_4 x_5 \ldots$. Suppose that
\begin{align*}
f(1) &= 0.x_{11} x_{12} x_{13} x_{14} x_{15} \ldots \\
f(2) &= 0.x_{21} x_{22} x_{23} x_{24} x_{25} \ldots \\
f(3) &= 0.x_{31} x_{32} x_{33} x_{34} x_{35} \ldots \\
f(4) &= 0.x_{41} x_{42} x_{43} x_{44} x_{45} \ldots \\
f(5) &= 0.x_{51} x_{52} x_{53} x_{54} x_{55} \ldots \\
&\ \ \vdots \\
f(n) &= 0.x_{n1} x_{n2} x_{n3} x_{n4} x_{n5} \ldots
\end{align*}

Define a number $y = 0.y_1 y_2 y_3 y_4 y_5 \ldots$ as follows:

\[ y_i = \begin{cases} 1 & \text{if } x_{ii} \neq 1, \\ 2 & \text{if } x_{ii} = 1. \end{cases} \]

Notice the following: (i) the number y is in (0, 1), and (ii) there is no n ∈ N
such that f(n) = y, because the nth digits of f(n) and y are not equal.
Hence f is not onto, which contradicts our assumption that f is a bijection.
It follows that no such bijection exists and $\aleph_0 \neq \aleph_1$.
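The diagonal construction can be illustrated on a finite list of digit strings. The sketch below is only an illustration of the rule for choosing the digits of y; the function name and the sample digit strings are made up, and a finite list of course cannot represent the whole argument.

```python
def diagonal_digits(rows):
    """Given digit strings rows[0], rows[1], ..., return the first len(rows) digits of y.

    The i-th digit of y is 2 if the i-th digit of the i-th row is 1, and 1 otherwise.
    """
    return "".join("2" if row[i] == "1" else "1" for i, row in enumerate(rows))

# Hypothetical first five digits of f(1), ..., f(5).
rows = ["14159", "71828", "41421", "11111", "00000"]
y = diagonal_digits(rows)
print("0." + y)

# y differs from the i-th row in the i-th digit, so it matches none of the rows.
print(all(y[i] != rows[i][i] for i in range(len(rows))))
```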

Since the integers are a subset of the real numbers, ℵ0 ≤ ℵ1 . Theorem 22


shows that in fact ℵ0 < ℵ1 , i.e., the integers have a smaller cardinality than
the real numbers, or equivalently, the real numbers are uncountable. One
might ask where the sets ‘between’ the integers and the real numbers fit in,
e.g., how ‘big’ are the rational numbers?

Theorem 23. |Q| = ℵ0 .

Proof. We first show that the positive rational numbers $\mathbb{Q}^+$ are countable.
Construct a table in which the entry in row i and column j is the rational
number i/j.

        1     2     3     4    ···
  1    1/1   1/2   1/3   1/4   ···
  2    2/1   2/2   2/3   2/4   ···
  3    3/1   3/2   3/3   3/4   ···
  4    4/1   4/2   4/3   4/4   ···
  ⋮    ···   ···   ···   ···   ···
Notice that every positive rational number occurs somewhere in this table.
We shall traverse the table in the order indicated by the arrows:

[Figure: the same table, with arrows marking the traversal order; the path begins 1/1, 1/2, 2/2, 2/1, . . .]

We can now define a bijection f from N to $\mathbb{Q}^+$. Let f(1) = 1/1. Then follow
the arrows on the table from 1/1, e.g., f(2) = 1/2. There’s one thing we need
to be careful of. Each rational number occurs more than once, e.g., 1 occurs
as 1/1 and as 2/2 and as 3/3 (and as 576/576, etc.). So while we’re tracing our way
through the table, when we come across a rational number we’ve already
met, we omit it. For example, we let f(3) = 2/1 rather than 2/2 because we’ve
already met the value of 2/2 (as 1/1).

Thus we have a bijection f between N and Q+ . It remains to show how


to establish a bijection between N and Q. This is quite simple — we define
a function g as follows:

g(1) = 0
g(2) = f (1)
g(3) = −f (1)
g(4) = f (2)
g(5) = −f (2)
..
.

This is clearly a bijection between N and Q, and this establishes the result.
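The construction in this proof can be turned into a program that lists the rational numbers one by one. The sketch below is an illustration under one stated assumption: it walks the table along its anti-diagonals (all entries i/j with i + j fixed), which is a different but equally valid traversal order from the one in the notes. Repeated values are skipped exactly as described above, and the positive rationals are then interleaved with 0 and the negatives as in the definition of g. The generator names are ours.

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Yield every positive rational exactly once (anti-diagonal order, assumed here)."""
    seen = set()
    for s in count(2):           # s = i + j indexes the anti-diagonal
        for i in range(1, s):    # row i, column j = s - i
            q = Fraction(i, s - i)
            if q not in seen:    # omit values we have already met, e.g. 2/2 = 1/1
                seen.add(q)
                yield q

def all_rationals():
    """Yield g(1), g(2), ...: 0, then each positive rational followed by its negative."""
    yield Fraction(0)
    for q in positive_rationals():
        yield q
        yield -q

first_terms = list(islice(all_rationals(), 7))
print([str(q) for q in first_terms])
```

Every rational i/j is reached after finitely many steps, because it lies on the anti-diagonal i + j and each anti-diagonal is finite.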

We have thus proved that there are ‘as many rational numbers as there
are positive integers’ !

Exercises
2.1 Chalk comes in 3 different lengths, 8 different colors, and 4 different
diameters. How many different kinds of chalk are there?

2.2 A psychology professor asks for volunteers from a nine-person class to


participate in a perception experiment. How many different groups
could the professor get?

2.3 How many two-digit numbers have distinct and nonzero digits?

2.4 How many odd numbers between 1000 and 9999 have distinct digits?

2.5 How many different five-digit numbers can be constructed from the
digits 1,2,3,3,3?

2.6 Show that if n + 1 integers are chosen from the set {1, 2, . . . , 2n}, then
there are always two which differ by 1.

2.7 A bag contains 100 apples, 100 bananas, 100 oranges, and 100 pears.
If I pick one piece of fruit out of the bag every minute, how long will it
be before I am assured of having picked at least a dozen pieces of fruit
of the same kind?

2.8 (a) Determine the number of permutations of {a, b, c, d}.


(b) Write down all permutations of the set {a, b, c, d}.
(c) For each permutation you found in (b), determine its inverse.

2.9 A committee of 5 — a President, Secretary, and 3 Committee Members


— is to be chosen from a group of 17 people. In how many ways can
the committee be chosen?

2.10 A committee of 4 is to be chosen from a club whose membership com-


prises 10 men and 12 women.

(a) In how many ways can the committee be chosen?


(b) In how many ways can the committee be chosen if it must include
at least 2 women?
2.11 Use Pascal’s Identity to determine $\binom{9}{4}$.

2.12 Use the Binomial Theorem to write out the expansion of $(x + y)^6$.

2.13 Use the Binomial Theorem to write out the expansion of $(2x - 3)^4$.

2.14 Prove the formula $\binom{n}{m}\binom{n-m}{k} = \binom{n}{k}\binom{n-k}{m}$ without using the formula for
$\binom{n}{k}$. Hint: consider pairs of subsets.

2.15 A used car dealer has 18 cars. Nine of them have automatic transmis-
sions, twelve have power steering, and eight have power brakes. Seven
have automatic transmissions and power steering, four have automatic
transmissions and power brakes, and five have power steering and power
brakes. Three cars have automatic transmission and power steering and
power brakes. How many cars do not have automatic transmissions,
power steering or power brakes?

2.16 Find the number of integers between 1 and 10000, inclusive, which are
divisible by none of 4,5, and 6.

2.17 Let
\[ S = \left\{ x \in \mathbb{R} : x = \frac{n^2 + \sqrt{2}}{n},\ n \in \mathbb{N} \right\}. \]

Define $f : \mathbb{N} \to S$ by $f(n) = \frac{n^2 + \sqrt{2}}{n}$.

(a) List three elements that belong to S.


(b) Show that f is one-to-one.
(c) Show that f is onto.
(d) Is S countable? Explain.

2.18 How do the cardinalities of the sets [0, 1] and [1, 3] compare? Justify
your answer.

Solutions
2.1 96

2.2 512

2.3 72

2.4 2240

2.5 20

2.7 45

2.8 (a) 24

2.9 $(17)_2 \binom{15}{3} = 123760$

2.10 (a) $\binom{22}{4} = 7315$
(b) $\binom{12}{2}\binom{10}{2} + \binom{12}{3}\binom{10}{1} + \binom{12}{4}\binom{10}{0} = 5665$

2.11 $\binom{8}{4} + \binom{8}{3} = 126$

2.12 $\binom{6}{0}x^6 + \binom{6}{1}x^5y + \binom{6}{2}x^4y^2 + \binom{6}{3}x^3y^3 + \binom{6}{4}x^2y^4 + \binom{6}{5}xy^5 + \binom{6}{6}y^6 = x^6 + 6x^5y + 15x^4y^2 + 20x^3y^3 + 15x^2y^4 + 6xy^5 + y^6$

2.15 2

2.16 5334
Chapter 3

Elementary number theory

Before we begin discussing Public Key Cryptology, we need to strengthen
our knowledge of the mathematics behind it.
Recall that an integer b is divisible by an integer a if there exists an integer
d such that b = ad. When this is the case, we say a divides b, written a|b,
and call b a multiple of a and a a divisor of b.
Consider | as a relation. Clearly, | is reflexive and transitive. Suppose
that a|b and b|a. Then there exist integers $d_1$, $d_2$ such that $b = ad_1$ and
$a = bd_2$, whence it follows that $b = bd_1d_2$. If b = 0, then a = 0 as well; otherwise
$d_1d_2 = 1$, so $d_1 = d_2 = \pm 1$. Consequently, a = ±b.
An integer p is a prime if |p| ≥ 2 and the only divisors of p are ±1 and
±p.

3.1 The Division Algorithm


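Let a and b be integers with a > 0. Then there exist unique integers q and r, 0 ≤ r < a,
such that b = aq + r. Clearly, r = 0 if and only if a|b. The greatest common divisor
gcd(a, b) of two integers a and b, not both zero, is the largest integer that divides both.
If gcd(a, b) = 1, then a and b are relatively prime.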

Example 3.1.1 4 and 9 are relatively prime integers.


Suppose that 0 < a < b and we wish to find gcd(a, b). As before, write
b = aq + r. Notice that if d is a number that divides both a and b, then
d also divides r = b − aq. Similarly, if d divides a and r, then certainly d
divides b = aq + r. Hence gcd(a, b) = gcd(a, r). These observations suggest
the following algorithm (also called Euclid’s Algorithm):

Division Algorithm (Given: integers a, b with 0 < a < b)

Start with $p_0 = b$, $q_0 = a$, i = 0
while $q_i$ does not divide $p_i$ do
    Let $r_i$ be the remainder when $p_i$ is divided by $q_i$
    Let $p_{i+1} = q_i$ and $q_{i+1} = r_i$
    Add 1 to i
Finally, gcd(a, b) = $q_i$

Example 3.1.2 We’ll use the Division Algorithm to find the GCD of 112 and
268. We begin by setting $p_0 = 268$, $q_0 = 112$, and i = 0. Since 268 = 2 · 112 + 44,
i.e., $q_0$ does not divide $p_0$, we set $r_0 = 44$, then let $p_1 = 112$, $q_1 = 44$, and
i = 1. We now repeat the process with $p_1$ and $q_1$: since 112 = 2 · 44 + 24, we
let $r_1 = 24$, $p_2 = 44$, and $q_2 = 24$. The process is recorded in this table:

i    $p_i$    $q_i$    $r_i$
0    268    112    44
1    112     44    24
2     44     24    20
3     24     20     4
4     20      4     0

Since $q_4$ divides $p_4$, we conclude that gcd(112, 268) = $q_4$ = 4.
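The loop above translates almost line for line into code. The following is a minimal sketch in Python (the function name `gcd_by_division` is our own choice, not part of the notes); the variables `p` and `q` play the roles of $p_i$ and $q_i$.

```python
def gcd_by_division(a: int, b: int) -> int:
    """Return gcd(a, b) for integers 0 < a < b, mirroring the loop in the text."""
    if not 0 < a < b:
        raise ValueError("expected 0 < a < b")
    p, q = b, a                 # p_0 = b, q_0 = a
    while p % q != 0:           # while q_i does not divide p_i
        p, q = q, p % q         # p_{i+1} = q_i and q_{i+1} = r_i
    return q                    # the last divisor is the gcd


print(gcd_by_division(112, 268))  # the pair from Example 3.1.2
```

Running it on the pair from Example 3.1.2 reproduces the value found in the table.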


Before continuing our study of GCDs, we recall the Well-Ordering Axiom:

Every nonempty set of positive integers has a smallest element.

Let S be a set of elements. If an operation ◦ on S is such that a, b ∈ S
implies that a ◦ b ∈ S, then S is said to be closed under ◦, and ◦ is called a
binary operation.

Example 3.1.3 The set Z is closed under addition, multiplication, and subtraction,
but not under division. N is closed under addition and multiplication, but
not under subtraction.

Lemma 24. Let S be a nonempty set of integers that is closed under addition
and subtraction. Then exactly one of the following two statements is true:

1. S = {0}.

2. S = {0, ±d, ±2d, ±3d, . . .}, where d is the smallest positive integer in
S.

Proof. Certainly the set {0} is closed under addition and subtraction. Suppose
that S ≠ {0}. Then S contains an element a ≠ 0. Since S is closed
under subtraction, S contains 0 = a − a and hence −a = 0 − a. Exactly one of a and −a is
positive; thus S contains a positive number and thus, by the Well-Ordering
Axiom, S contains a smallest positive number, d. Since S is closed under
addition and subtraction, we therefore have that S ⊇ {0, ±d, ±2d, ±3d, . . .}.
It remains to prove that S ⊆ {0, ±d, ±2d, ±3d, . . .}. Let z ∈ S. Then, by
the Division Algorithm, there are integers q and r with 0 ≤ r < d so that
z = qd + r. Thus, r = z − qd. Since d, z ∈ S and S is closed under addition
and subtraction, we therefore have r ∈ S. Finally, since r < d and d is
the smallest positive integer in S, the only possibility is that r = 0, whence
z = qd, implying that S ⊆ {0, ±d, ±2d, ±3d, . . .}.

Lemma 24 enables us to prove the following useful result.

Theorem 25. Let a and b be nonzero integers. Then gcd(a, b) can be ex-
pressed as a linear combination of a and b with integer coefficients; that is,
gcd(a, b) = sa + tb for some integers s and t.

Proof. Let
S = {sa + tb : s, t ∈ Z}.
Clearly, S is closed under addition and subtraction. Consequently, by Lemma
24, S = {0, ±d, ±2d, ±3d, . . .}, where d is the smallest positive integer in S.
We claim that d = gcd(a, b). Since a = 1 · a + 0 · b and b = 0 · a + 1 · b, both a and b
belong to S = {0, ±d, ±2d, ±3d, . . .}, so d is a factor of both a and b. It remains to show
that d is the largest factor of both a and b. Since d ∈ S, there exist integers $s_0$ and $t_0$
such that $d = s_0a + t_0b$. Hence any positive number p that divides both a and b must
also divide d, implying that p ≤ d, and hence that d = gcd(a, b).

Example 3.1.4 Continuing from Example 3.1.2, let’s find integers s and t such
that
4 = 268s + 112t.
The procedure is, more or less, to ‘reverse’ the Division Algorithm. To do this,
we begin by rewriting gcd(112, 268) = $r_3$ = 4 = 24 − 20. Now we replace
$r_2$ = 20 with (from the previous step in the Division Algorithm) 44 − 24:

4 = 24 − 20 = 24 − (44 − 24).

And now we replace $r_1$ = 24 with 112 − 2 · 44:

4 = 2 · 24 − 44 = 2(112 − 2 · 44) − 44.

Finally, we replace $r_0$ = 44 with 268 − 2 · 112:

4 = 2 · 112 − 5 · 44 = 2 · 112 − 5(268 − 2 · 112).

Consequently,
4 = 12 · 112 + (−5) · 268,
so we may take s = −5 and t = 12.
The procedure that we followed in this example is sometimes called the Extended
Division Algorithm.
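Back-substitution by hand is error-prone, so it can help to see the same bookkeeping done mechanically. The sketch below is one common iterative formulation, not the reverse substitution used in the example: it carries the coefficients forward alongside the remainders instead of unwinding them afterwards. The function name `extended_gcd` is an assumption made for illustration.

```python
def extended_gcd(a: int, b: int) -> tuple:
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b, for a, b >= 0 not both zero."""
    # Invariants kept by the loop:
    #   old_r == old_s * a + old_t * b   and   r == s * a + t * b
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


g, s, t = extended_gcd(268, 112)
print(g, s, t)                   # coefficients for 4 = 268s + 112t
assert g == s * 268 + t * 112    # the linear combination really equals the gcd
```

For the pair (268, 112) this yields the same coefficients as Example 3.1.4. Other coefficient pairs also work, since s and t are not unique.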

3.2 Multiplicative inverses in Zm


Let $a \in \mathbb{Z}_m = \{0, 1, \ldots, m - 1\}$. The multiplicative inverse of a in $\mathbb{Z}_m$ is a
number, denoted $a^{-1}$, in $\mathbb{Z}_m$, such that

\[ a a^{-1} \equiv a^{-1} a \equiv 1 \pmod{m}. \]

Given a number $a \in \mathbb{Z}_m$, it is not certain that $a^{-1}$ exists.

Example 3.2.1 Consider the number $2 \in \mathbb{Z}_4 = \{0, 1, 2, 3\}$. Working modulo 4,

2 × 0 = 0
2 × 1 = 2
2 × 2 = 0
2 × 3 = 2

Thus, in $\mathbb{Z}_4$, the number 2 has no multiplicative inverse. On the other hand, in
$\mathbb{Z}_5$, 2 × 3 = 6 ≡ 1 (mod 5), so in $\mathbb{Z}_5$, $2^{-1} = 3$.
We now consider when a number a has a multiplicative inverse.
Theorem 26. A number $a \in \mathbb{Z}_m$ has a multiplicative inverse in $\mathbb{Z}_m$ if and
only if a and m are relatively prime.
Proof. Suppose first that a and m are relatively prime. Since gcd(a, m) = 1,
by Theorem 25, there exist integers s, t with sa + tm = 1. Consequently,

sa = 1 − tm ≡ 1 (mod m).

Suppose s = qm + r, where r is the remainder when s is divided by m.
Then $r \in \mathbb{Z}_m$ and we claim that ar ≡ 1 (mod m). Indeed, ar = a(s − qm) =
(−t − aq)m + 1, and thus ar ≡ 1 (mod m), proving that r is the multiplicative
inverse in $\mathbb{Z}_m$ of a.
Suppose now that a has a multiplicative inverse $a^{-1} \in \mathbb{Z}_m$ and, to the
contrary, that a and m are not relatively prime. Then there exist an integer
t ≥ 2 and integers a′ ≥ 0 and m′ with 1 ≤ m′ ≤ m − 1 such that a = a′t and m = m′t.
Since $aa^{-1} \equiv 1 \pmod{m}$, we have $a'ta^{-1} \equiv 1 \pmod{m}$. Multiplying both sides
of this congruence by m′, we obtain
\[ a' t a^{-1} m' \equiv m' \pmod{m} \tag{3.1} \]
However, the left hand side of (3.1) can be rewritten as $a'a^{-1}(m't) = a'a^{-1}m$,
which is clearly congruent to 0 (mod m). Since 1 ≤ m′ ≤ m − 1, the right hand side
of (3.1) is not congruent to 0 (mod m), and this is a contradiction.
Theorem 26 can be used to determine the multiplicative inverse of a ∈ Zm
in the following way:

Example 3.2.2 Suppose we wish to find $4^{-1} \in \mathbb{Z}_{21}$. Firstly, we check that
gcd(4, 21) = 1, which means that the inverse exists. Using the extended Division
Algorithm, we find that
1 = 21 − 5 · 4
from which it follows that −5 · 4 ≡ 1 (mod 21). From this, it looks like the
inverse of 4 is −5, but the inverse of 4 has to be a number in [0, 20]. Remembering
that −5 ≡ 16 (mod 21), we therefore have that in $\mathbb{Z}_{21}$, $4^{-1} = 16$. To check:
4 · 16 = 64 = 3 · 21 + 1 ≡ 1 (mod 21).
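The recipe in Example 3.2.2 (run the extended algorithm, then reduce the coefficient of a into the range [0, m − 1]) can be sketched as follows. The helper name `mod_inverse` is hypothetical; the loop tracks only the coefficient of a, since the coefficient of m is not needed for the inverse.

```python
def mod_inverse(a: int, m: int) -> int:
    """Return the inverse of a in Z_m, or raise ValueError if gcd(a, m) != 1."""
    # Invariant: old_r is congruent to old_s * a modulo m (and r to s * a).
    old_r, r = a % m, m
    old_s, s = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:                      # gcd(a, m) > 1: no inverse (Theorem 26)
        raise ValueError(f"{a} has no inverse modulo {m}")
    return old_s % m                    # e.g. -5 becomes 16 when m = 21


print(mod_inverse(4, 21))
print(mod_inverse(2, 5))
```

In Python 3.8 and later, the built-in call `pow(a, -1, m)` computes the same value and raises `ValueError` when no inverse exists.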

Theorem 27. If a has a multiplicative inverse in $\mathbb{Z}_m$, then this inverse is
unique, i.e., if $x, y \in \mathbb{Z}_m$ are such that ax ≡ 1 (mod m) and ay ≡ 1 (mod m),
then x = y.
Proof. Suppose that $x, y \in \mathbb{Z}_m$ are such that ax ≡ 1 (mod m) and ay ≡ 1
(mod m). Then there exist integers s, t such that ax = sm + 1 and ay =
tm + 1. Consequently, a(x − y) = (s − t)m. By assumption, a has at least
one inverse $a^{-1}$; multiplying both sides by it gives
$x - y \equiv a^{-1}(s - t)m \equiv 0 \pmod{m}$. Since $x, y \in \mathbb{Z}_m$, we have |x − y| < m,
so the only possibility is that x = y.

3.3 Exponentiation in Zm: square and multiply

A lot of the cryptosystems that we’ll deal with later require you to be able
to compute things like
\[ 2135^{1563} \pmod{3172} \tag{3.2} \]
Even with relatively small numbers like these (in practice, these cryptosystems
use numbers that are many hundreds of digits in length), you might
run into a problem computing something like (3.2). The solution is to use
the square and multiply algorithm, which can be implemented very efficiently
on a computer. How does the algorithm work? Write the exponent, in this
case 1563, as the sum of powers of 2. (If you’re using a calculator that can do
binary arithmetic, you can quickly see which powers of 2 the number 1563 is
the sum of by punching 1563 in as a decimal number and then converting it
to binary.)

\[ 1563 = 2^{10} + 2^9 + 2^4 + 2^3 + 2 + 1 \]

Hence:
\[
\begin{aligned}
2135^{1563} &= 2135^{2^{10} + 2^9 + 2^4 + 2^3 + 2 + 1} \\
&= 2135^{2^{10}} \cdot 2135^{2^9} \cdot 2135^{2^4} \cdot 2135^{2^3} \cdot 2135^{2} \cdot 2135
\end{aligned}
\]

Now
\[ 2135^{2^{10}} = 2135^{2 \cdot 2^9} = (2135^2)^{2^9} \]
so
\[ 2135^{1563} = (2135^2)^{2^9} \cdot 2135^{2^9} \cdot 2135^{2^4} \cdot 2135^{2^3} \cdot 2135^{2} \cdot 2135 \]

Now, $2135^2$ is relatively easy to compute: it’s 4558225, which is 61 (mod 3172),
so
\[
\begin{aligned}
2135^{1563} &\equiv 61^{2^9} \cdot 2135^{2^9} \cdot 2135^{2^4} \cdot 2135^{2^3} \cdot 2135^{2} \cdot 2135 \pmod{3172} \\
&\equiv (61 \cdot 2135)^{2^9} \cdot 2135^{2^4} \cdot 2135^{2^3} \cdot 2135^{2} \cdot 2135 \pmod{3172}
\end{aligned}
\]

We calculate 61 · 2135 ≡ 183 (mod 3172), and then begin the whole process
over again, first rewriting the term
$(61 \cdot 2135)^{2^9} \equiv 183^{2^9} \equiv (183^2)^{2^8} \pmod{3172}$,
and then continuing in this fashion until we’ve found the answer. Along the way we use
$183^2 \equiv 1769$ and $1769^2 \equiv 1769 \pmod{3172}$:
\[
\begin{aligned}
2135^{1563} &\equiv (183^2)^{2^8} \cdot 2135^{2^4} \cdot 2135^{2^3} \cdot 2135^{2} \cdot 2135 \pmod{3172} \\
&\equiv 1769^{2^8} \cdot 2135^{2^4} \cdot 2135^{2^3} \cdot 2135^{2} \cdot 2135 \pmod{3172} \\
&\equiv (1769^2)^{2^7} \cdot 2135^{2^4} \cdot 2135^{2^3} \cdot 2135^{2} \cdot 2135 \pmod{3172} \\
&\equiv 1769^{2^7} \cdot 2135^{2^4} \cdot 2135^{2^3} \cdot 2135^{2} \cdot 2135 \pmod{3172} \\
&\equiv 1769^{2^6} \cdot 2135^{2^4} \cdot 2135^{2^3} \cdot 2135^{2} \cdot 2135 \pmod{3172} \\
&\equiv 1769^{2^5} \cdot 2135^{2^4} \cdot 2135^{2^3} \cdot 2135^{2} \cdot 2135 \pmod{3172} \\
&\equiv (1769 \cdot 2135)^{2^4} \cdot 2135^{2^3} \cdot 2135^{2} \cdot 2135 \pmod{3172} \\
&\equiv (61 \cdot 2135)^{2^3} \cdot 2135^{2} \cdot 2135 \pmod{3172} \\
&\equiv 183^{2^3} \cdot 2135^{2} \cdot 2135 \pmod{3172} \\
&\equiv (1769 \cdot 2135)^{2} \cdot 2135 \pmod{3172} \\
&\equiv 61 \cdot 2135 \pmod{3172} \\
&\equiv 183 \pmod{3172}
\end{aligned}
\]

Thus,
\[ 2135^{1563} \equiv 183 \pmod{3172}. \]
Several other explicit examples of the use of the square and multiply algo-
rithm are given later in the notes.
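On a computer the same idea is usually organised bit by bit rather than by regrouping terms as in the hand calculation above. The sketch below is one such variant, scanning the binary digits of the exponent from least to most significant; the function name `square_and_multiply` is our own, and the built-in three-argument `pow` is used only as an independent check.

```python
def square_and_multiply(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent mod modulus for exponent >= 0."""
    result = 1 % modulus
    square = base % modulus            # holds base^(2^j) mod modulus at step j
    while exponent > 0:
        if exponent & 1:               # 2^j appears in the binary expansion
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1                 # move on to the next binary digit
    return result


print(bin(1563))                       # binary digits of the exponent
print(square_and_multiply(2135, 1563, 3172))
assert square_and_multiply(2135, 1563, 3172) == pow(2135, 1563, 3172)
```

Every intermediate value stays below the square of the modulus, which is why the method remains practical even when the exponent has hundreds of digits.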

3.4 Prime numbers


As we mentioned previously, an integer p ≥ 2 is prime if 1 and p are its
only positive divisors. Every integer n ≥ 2 can be written uniquely (up to the
order of the factors) in the form $n = \prod_{i=1}^{k} p_i^{e_i}$, where $p_1, p_2, \ldots, p_k$
are distinct prime numbers (called the prime factors of n) and $e_1, e_2, \ldots, e_k$
are positive integers (called exponents).

It’s useful to remember that two positive integers have a greatest common
divisor greater than 1 if and only if they have a prime factor in common.
Prime numbers will play an important role throughout the rest of this course,
particularly when we consider public-key cryptography. Since it will be useful
to have a list of small prime numbers, we give one now. Here are the primes
p satisfying 2 ≤ p ≤ 7000:

2 3 5 7 11 13 17 19 23 29
31 37 41 43 47 53 59 61 67 71
73 79 83 89 97 101 103 107 109 113
127 131 137 139 149 151 157 163 167 173
179 181 191 193 197 199 211 223 227 229
233 239 241 251 257 263 269 271 277 281
283 293 307 311 313 317 331 337 347 349
353 359 367 373 379 383 389 397 401 409
419 421 431 433 439 443 449 457 461 463
467 479 487 491 499 503 509 521 523 541
547 557 563 569 571 577 587 593 599 601
607 613 617 619 631 641 643 647 653 659
661 673 677 683 691 701 709 719 727 733
739 743 751 757 761 769 773 787 797 809
811 821 823 827 829 839 853 857 859 863
877 881 883 887 907 911 919 929 937 941
947 953 967 971 977 983 991 997 1009 1013
1019 1021 1031 1033 1039 1049 1051 1061 1063 1069
1087 1091 1093 1097 1103 1109 1117 1123 1129 1151
1153 1163 1171 1181 1187 1193 1201 1213 1217 1223
1229 1231 1237 1249 1259 1277 1279 1283 1289 1291
1297 1301 1303 1307 1319 1321 1327 1361 1367 1373
1381 1399 1409 1423 1427 1429 1433 1439 1447 1451
1453 1459 1471 1481 1483 1487 1489 1493 1499 1511
1523 1531 1543 1549 1553 1559 1567 1571 1579 1583
1597 1601 1607 1609 1613 1619 1621 1627 1637 1657
1663 1667 1669 1693 1697 1699 1709 1721 1723 1733
1741 1747 1753 1759 1777 1783 1787 1789 1801 1811
1823 1831 1847 1861 1867 1871 1873 1877 1879 1889
1901 1907 1913 1931 1933 1949 1951 1973 1979 1987
1993 1997 1999 2003 2011 2017 2027 2029 2039 2053
2063 2069 2081 2083 2087 2089 2099 2111 2113 2129
2131 2137 2141 2143 2153 2161 2179 2203 2207 2213
2221 2237 2239 2243 2251 2267 2269 2273 2281 2287
2293 2297 2309 2311 2333 2339 2341 2347 2351 2357
2371 2377 2381 2383 2389 2393 2399 2411 2417 2423
2437 2441 2447 2459 2467 2473 2477 2503 2521 2531
2539 2543 2549 2551 2557 2579 2591 2593 2609 2617
2621 2633 2647 2657 2659 2663 2671 2677 2683 2687
2689 2693 2699 2707 2711 2713 2719 2729 2731 2741
2749 2753 2767 2777 2789 2791 2797 2801 2803 2819
2833 2837 2843 2851 2857 2861 2879 2887 2897 2903
2909 2917 2927 2939 2953 2957 2963 2969 2971 2999
3001 3011 3019 3023 3037 3041 3049 3061 3067 3079
3083 3089 3109 3119 3121 3137 3163 3167 3169 3181
3187 3191 3203 3209 3217 3221 3229 3251 3253 3257
3259 3271 3299 3301 3307 3313 3319 3323 3329 3331
3343 3347 3359 3361 3371 3373 3389 3391 3407 3413
3433 3449 3457 3461 3463 3467 3469 3491 3499 3511
3517 3527 3529 3533 3539 3541 3547 3557 3559 3571
3581 3583 3593 3607 3613 3617 3623 3631 3637 3643
3659 3671 3673 3677 3691 3697 3701 3709 3719 3727
3733 3739 3761 3767 3769 3779 3793 3797 3803 3821
3823 3833 3847 3851 3853 3863 3877 3881 3889 3907
3911 3917 3919 3923 3929 3931 3943 3947 3967 3989
4001 4003 4007 4013 4019 4021 4027 4049 4051 4057
4073 4079 4091 4093 4099 4111 4127 4129 4133 4139
4153 4157 4159 4177 4201 4211 4217 4219 4229 4231
4241 4243 4253 4259 4261 4271 4273 4283 4289 4297
4327 4337 4339 4349 4357 4363 4373 4391 4397 4409
4421 4423 4441 4447 4451 4457 4463 4481 4483 4493
4507 4513 4517 4519 4523 4547 4549 4561 4567 4583
4591 4597 4603 4621 4637 4639 4643 4649 4651 4657
4663 4673 4679 4691 4703 4721 4723 4729 4733 4751
4759 4783 4787 4789 4793 4799 4801 4813 4817 4831
4861 4871 4877 4889 4903 4909 4919 4931 4933 4937
4943 4951 4957 4967 4969 4973 4987 4993 4999 5003
5009 5011 5021 5023 5039 5051 5059 5077 5081 5087
5099 5101 5107 5113 5119 5147 5153 5167 5171 5179
5189 5197 5209 5227 5231 5233 5237 5261 5273 5279
5281 5297 5303 5309 5323 5333 5347 5351 5381 5387
5393 5399 5407 5413 5417 5419 5431 5437 5441 5443
5449 5471 5477 5479 5483 5501 5503 5507 5519 5521
5527 5531 5557 5563 5569 5573 5581 5591 5623 5639
5641 5647 5651 5653 5657 5659 5669 5683 5689 5693
5701 5711 5717 5737 5741 5743 5749 5779 5783 5791
5801 5807 5813 5821 5827 5839 5843 5849 5851 5857
5861 5867 5869 5879 5881 5897 5903 5923 5927 5939
5953 5981 5987 6007 6011 6029 6037 6043 6047 6053
6067 6073 6079 6089 6091 6101 6113 6121 6131 6133
6143 6151 6163 6173 6197 6199 6203 6211 6217 6221
6229 6247 6257 6263 6269 6271 6277 6287 6299 6301
6311 6317 6323 6329 6337 6343 6353 6359 6361 6367
6373 6379 6389 6397 6421 6427 6449 6451 6469 6473
6481 6491 6521 6529 6547 6551 6553 6563 6569 6571
6577 6581 6599 6607 6619 6637 6653 6659 6661 6673
6679 6689 6691 6701 6703 6709 6719 6733 6737 6761
6763 6779 6781 6791 6793 6803 6823 6827 6829 6833
6841 6857 6863 6869 6871 6883 6899 6907 6911 6917
6947 6949 6959 6961 6967 6971 6977 6983 6991 6997
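
A table like the one above need not be typed in by hand. The notes do not say how the list was produced; the sketch below uses the Sieve of Eratosthenes, a standard method, and the function name `primes_up_to` is an assumption made for illustration.

```python
def primes_up_to(limit: int) -> list:
    """Sieve of Eratosthenes: return all primes p with 2 <= p <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Every multiple of p below p*p was already crossed out
            # by a smaller prime, so start at p*p.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n in range(2, limit + 1) if is_prime[n]]


primes = primes_up_to(7000)
print(len(primes), primes[:10], primes[-1])
```

The length of the resulting list is the value π(7000) of the prime counting function defined later in this section.
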
There are, in fact, infinitely many prime numbers, which we now prove:
Theorem 28. There are infinitely many prime numbers.
Proof. Suppose, to the contrary, that there are finitely many prime numbers.
Specifically, suppose that there are k prime numbers, $p_1, p_2, \ldots, p_k$. Consider
the number $n = p_1p_2 \cdots p_k + 1$, which is certainly larger than any $p_i$, and
hence not equal to any of them. Since n gives a remainder of 1 when divided
by any $p_i$, n is not divisible by any prime $p_i$. But n ≥ 2, so the smallest divisor
of n that is greater than 1 is a prime number, and it is none of $p_1, p_2, \ldots, p_k$.
This contradicts the assumption that there are only k such numbers.
The prime counting function π(n) is the number of prime numbers less
than or equal to n.

Example 3.4.1 π(2) = 1, π(3) = π(4) = 2, π(5) = π(6) = 3, π(7) = π(8) =
π(9) = π(10) = 4.
While we shall not prove it, we mention the well-known Prime Number Theorem:

Theorem 29 (Prime Number Theorem). For large n,

\[ \pi(n) \approx \frac{n}{\ln(n)}. \]

Example 3.4.2 How many primes are there between 1 and $10^{100}$? According
to Theorem 29, the answer is approximately

\[ \frac{10^{100}}{\ln(10^{100})} = \frac{10^{98}}{\ln(10)} \approx 4.3 \times 10^{97}. \]
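
To get a feel for how good the approximation is, one can compare π(n) with n/ln(n) for a few small n. The sketch below counts primes by plain trial division, which is only sensible for small inputs; `prime_count` is a hypothetical helper name. The last line evaluates the estimate from Example 3.4.2 numerically.

```python
import math


def prime_count(n: int) -> int:
    """pi(n): the number of primes <= n, by trial division (small n only)."""
    count = 0
    for candidate in range(2, n + 1):
        # candidate is prime if no d in [2, sqrt(candidate)] divides it
        if all(candidate % d for d in range(2, math.isqrt(candidate) + 1)):
            count += 1
    return count


for n in (10, 100, 1000, 10000):
    print(n, prime_count(n), round(n / math.log(n), 1))

# Estimate from Example 3.4.2: 10^100 / ln(10^100) = 10^98 / ln(10)
print(10**98 / math.log(10))
```

For these small n the estimate undershoots the true count somewhat; the theorem only asserts that the ratio of the two quantities tends to 1 as n grows.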

3.5 The Euler φ-function


For a positive integer n, we denote by φ(n) the number of positive integers
a for which 1 ≤ a ≤ n and gcd(n, a) = 1. The function φ is the Euler φ-function.

Example 3.5.1 Since 16 is relatively prime to 1, 3, 5, 7, 9, 11, 13, and 15, φ(16) =
8. Similarly, we can calculate φ(n) for the first few positive integers n:

n     integers in [1, n] relatively prime to n     φ(n)
1     1                                            1
2     1                                            1
3     1, 2                                         2
4     1, 3                                         2
5     1, 2, 3, 4                                   4
6     1, 5                                         2
7     1, 2, 3, 4, 5, 6                             6
8     1, 3, 5, 7                                   4
9     1, 2, 4, 5, 7, 8                             6
10    1, 3, 7, 9                                   4
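
The definition of φ can be checked directly by counting. The following sketch (with the made-up name `phi_by_counting`) is deliberately naive: it tests every a in [1, n] with `math.gcd`, exactly as the table does.

```python
from math import gcd


def phi_by_counting(n: int) -> int:
    """Euler phi straight from the definition: count a in [1, n] with gcd(n, a) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(n, a) == 1)


for n in range(1, 11):
    coprime = [a for a in range(1, n + 1) if gcd(n, a) == 1]
    print(n, coprime, phi_by_counting(n))
```

This takes on the order of n gcd computations, so it is useful for checking small cases but not for the large moduli used in cryptography.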

Theorem 30. If p is prime, then φ(p) = p − 1.
Proof. If p is prime, then p is relatively prime to each of the numbers
1, 2, . . . , p − 1, but not to p itself.
Theorem 31. If p and q are distinct primes, then φ(pq) = (p − 1)(q − 1).
Proof. Since p and q are distinct primes, the only positive integers n for which n ≤
pq and gcd(n, pq) > 1 are those in the set S = {p, 2p, . . . , (q − 1)p, qp} ∪
{q, 2q, . . . , (p − 1)q, pq}. Since the two sets that comprise this union have
only one element, pq, in common, there are p + q − 1 such integers n. Thus,
the number of positive integers n for which n ≤ pq and gcd(n, pq) = 1 is
pq − (p + q − 1) = (p − 1)(q − 1).
Theorem 32. If p is prime and k is a positive integer, then
\[ \varphi(p^k) = p^k \left(1 - \frac{1}{p}\right). \]
Proof. There are $p^k$ integers in $S = \{1, 2, \ldots, p^k\}$. The only integers in S
that are not relatively prime to $p^k$ are the multiples of p:
\[ p,\ 2p,\ 3p,\ \ldots,\ (p^{k-1} - 1)p,\ p^{k-1}p. \]
There are $p^{k-1}$ such multiples; it follows that $\varphi(p^k) = p^k - p^{k-1}$, which after
a trivial algebraic manipulation becomes the desired expression.

Example 3.5.2
\[ \varphi(32) = \varphi(2^5) = 2^5\left(1 - \frac{1}{2}\right) = 16 \]
Also,
\[ \varphi(27) = \varphi(3^3) = 3^3 - 3^2 = 18 \]

Theorem 33. If $n = \prod_{i=1}^{k} p_i^{e_i}$, where $p_1, p_2, \ldots, p_k$ are distinct primes and
$e_1, e_2, \ldots, e_k$ are positive integers, then
\[ \varphi(n) = n \prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right). \]

To prove Theorem 33 we first prove the following key lemma:

Lemma 34. For k ≥ 1, let $p_1, \ldots, p_k$ be distinct primes and let n be a
positive integer that is divisible by $p_i$ for all i = 1, . . . , k. Then the number
of integers between 1 and n that are multiples of at least one of the integers
$p_i$, 1 ≤ i ≤ k, is
\[ n - n \prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right). \]

Proof. We proceed by induction on k. When k = 1, the integers between 1
and n that are multiples of $p_1$ are
\[ p_1,\ 2p_1,\ 3p_1,\ \ldots,\ \left(\frac{n}{p_1}\right)p_1. \]
Hence the number of integers between 1 and n that are multiples of $p_1$ is
\[ \frac{n}{p_1} = n - n\left(1 - \frac{1}{p_1}\right) = n - n\prod_{i=1}^{k}\left(1 - \frac{1}{p_i}\right). \]

This establishes the base case. Assume, then, that k ≥ 2 and that the result
is true for all positive integers that are divisible by k − 1 distinct primes. Let
n be a positive integer that is divisible by k distinct primes, $p_1, p_2, \ldots, p_k$.
We wish to count the set S of all integers between 1 and n that are multiples
of at least one of the integers $p_i$, 1 ≤ i ≤ k. To do this we use the Addition
Rule.
Let X be the set of all integers between 1 and n that are multiples of
at least one of the integers $p_i$, 1 ≤ i ≤ k − 1, and let Y be the set of all
integers between 1 and n that are multiples of $p_k$ but not of $p_i$ for any i
where 1 ≤ i ≤ k − 1. By the Addition Rule, |S| = |X| + |Y|. By the
inductive hypothesis,
\[ |X| = n - n\prod_{i=1}^{k-1}\left(1 - \frac{1}{p_i}\right). \tag{3.3} \]
Next we determine |Y|. The integers between 1 and n that are
multiples of $p_k$ are
\[ p_k,\ 2p_k,\ 3p_k,\ \ldots,\ \left(\frac{n}{p_k}\right)p_k. \]

Hence the number of integers between 1 and n that are multiples of $p_k$ is
$n/p_k$. Since for 1 ≤ i < j ≤ k we have gcd($p_i$, $p_j$) = 1, the multiples of $p_k$
that are not multiples of $p_i$ for any i where 1 ≤ i ≤ k − 1 are those whose
coefficients (which are integers between 1 and $n/p_k$) are not multiples of $p_i$
for any i where 1 ≤ i ≤ k − 1. By the inductive hypothesis, since $n/p_k$ is
divisible by $p_i$ for all i = 1, . . . , k − 1, the number of integers between 1 and
$n/p_k$ that are multiples of at least one of the integers $p_i$, 1 ≤ i ≤ k − 1, is
\[ \frac{n}{p_k} - \frac{n}{p_k}\prod_{i=1}^{k-1}\left(1 - \frac{1}{p_i}\right). \]

By the Difference Rule,
\[ |Y| = \frac{n}{p_k} - \left(\frac{n}{p_k} - \frac{n}{p_k}\prod_{i=1}^{k-1}\left(1 - \frac{1}{p_i}\right)\right) = \frac{n}{p_k}\prod_{i=1}^{k-1}\left(1 - \frac{1}{p_i}\right). \tag{3.4} \]

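Thus, by Equations (3.3) and (3.4),
\[
\begin{aligned}
|S| &= |X| + |Y| \\
&= n - n\prod_{i=1}^{k-1}\left(1 - \frac{1}{p_i}\right) + \frac{n}{p_k}\prod_{i=1}^{k-1}\left(1 - \frac{1}{p_i}\right) \\
&= n - \left[n\prod_{i=1}^{k-1}\left(1 - \frac{1}{p_i}\right)\right]\left(1 - \frac{1}{p_k}\right) \\
&= n - n\prod_{i=1}^{k}\left(1 - \frac{1}{p_i}\right).
\end{aligned}
\]

This completes the proof of Lemma 34.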

We are now in a position to prove Theorem 33.


Proof of Theorem 33. Since $n = \prod_{i=1}^{k} p_i^{e_i}$, where $p_1, p_2, \ldots, p_k$ are distinct
primes and $e_1, e_2, \ldots, e_k$ are positive integers, the only integers between 1
and n that are not relatively prime to n are those integers between 1 and
n that are multiples of at least one of the integers $p_i$, 1 ≤ i ≤ k. By our Key
Lemma (Lemma 34), the number of such integers is
\[ n - n\prod_{i=1}^{k}\left(1 - \frac{1}{p_i}\right). \]

The desired result now follows from the Difference Rule:
\[ \varphi(n) = n - \left(n - n\prod_{i=1}^{k}\left(1 - \frac{1}{p_i}\right)\right) = n\prod_{i=1}^{k}\left(1 - \frac{1}{p_i}\right). \]

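Example 3.5.3 $\varphi(60) = \varphi(2^2 \cdot 3 \cdot 5) = 60\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{5}\right) = 60 \cdot \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{4}{5} = 16$.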

Theorem 35. If m and n are positive integers and gcd(m, n) = 1, then

φ(mn) = φ(m)φ(n).
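Proof. Let $m = \prod_{i=1}^{k} p_i^{e_i}$ and $n = \prod_{j=1}^{k'} (p'_j)^{e'_j}$, where $p_1, p_2, \ldots, p_k, p'_1, p'_2, \ldots, p'_{k'}$
are (since gcd(m, n) = 1) distinct primes. Then from Theorem 33,
\[
\begin{aligned}
\varphi(mn) &= mn \prod_{i=1}^{k}\left(1 - \frac{1}{p_i}\right) \prod_{j=1}^{k'}\left(1 - \frac{1}{p'_j}\right) \\
&= m\prod_{i=1}^{k}\left(1 - \frac{1}{p_i}\right) \cdot n\prod_{j=1}^{k'}\left(1 - \frac{1}{p'_j}\right) \\
&= \varphi(m)\varphi(n).
\end{aligned}
\]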

Example 3.5.4 From Example 3.5.2, we have

φ(864) = φ(32 · 27) = φ(32)φ(27) = 16 · 18 = 288.
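
The product formula of Theorem 33 gives a much faster way to compute φ than counting. The sketch below finds the distinct prime factors by trial division and applies one factor (1 − 1/p) at a time using integer arithmetic only; the function name `phi` is our own choice.

```python
def phi(n: int) -> int:
    """Euler phi via n * product over primes p | n of (1 - 1/p)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = n
    remaining = n
    p = 2
    while p * p <= remaining:
        if remaining % p == 0:
            while remaining % p == 0:      # strip every copy of the prime p
                remaining //= p
            result -= result // p          # multiply result by (1 - 1/p)
        p += 1
    if remaining > 1:                      # a single prime factor > sqrt is left
        result -= result // remaining
    return result


print(phi(60), phi(32), phi(27), phi(864))
assert phi(32 * 27) == phi(32) * phi(27)   # Theorem 35, since gcd(32, 27) = 1
```

Each subtraction is exact because `result` is still divisible by p at the moment the factor (1 − 1/p) is applied.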



3.6 The theorems of Fermat and Euler


Theorem 36 (Fermat). If a is a positive integer and p is a prime number,
then
\[ a^p \equiv a \pmod{p}. \]
Proof. Our proof will be by induction on a. If a = 1, then $a^p = 1^p = 1 = a$,
as required. Assume then that for some integer a ≥ 1, it is the case that
$a^p \equiv a \pmod{p}$. We shall prove that $(a + 1)^p \equiv a + 1 \pmod{p}$. Consider
the binomial expansion of $(a + 1)^p$:
\[ (a + 1)^p = \sum_{i=0}^{p} \binom{p}{i} a^i = 1 + pa + \cdots + pa^{p-1} + a^p \]

Every term, except the first and last, is divisible by p (because p divides
$\binom{p}{i}$ whenever 0 < i < p), so $(a + 1)^p \equiv a^p + 1 \pmod{p}$.
However, by the Inductive Hypothesis, $a^p \equiv a \pmod{p}$, from which
we obtain $(a + 1)^p \equiv a + 1 \pmod{p}$, as required.
Theorem 37 (Euler). Let a, m be integers with m ≥ 2 and gcd(a, m) = 1.
Then
\[ a^{\varphi(m)} \equiv 1 \pmod{m} \]
Proof. Let $s_1, s_2, \ldots, s_{\varphi(m)}$ be the φ(m) integers in {1, 2, . . . , m − 1} that are
relatively prime to m. For each i with 1 ≤ i ≤ φ(m), let $as_i = q_im + r_i$, where
$0 \le r_i < m$. We claim that $\{s_1, s_2, \ldots, s_{\varphi(m)}\} = \{r_1, r_2, \ldots, r_{\varphi(m)}\}$. Since
each $r_i$ is in {0, 1, . . . , m − 1} and there are exactly φ(m) integers (namely,
$s_1, s_2, \ldots, s_{\varphi(m)}$) in {0, 1, . . . , m − 1} that are relatively prime to m, it suffices
to prove two things: (i) that the numbers $r_1, r_2, \ldots, r_{\varphi(m)}$ are all distinct,
and, (ii) that for each i, gcd($r_i$, m) = 1.
(i) Suppose, to the contrary, that there is a pair i, j of distinct integers for
which $r_i = r_j$ and (without loss of generality) $s_i > s_j$. Then $as_i - as_j =
(q_i - q_j)m$. Since gcd(a, m) = 1, we know from Theorem 26 that a (reduced
modulo m) has a multiplicative inverse $a^{-1}$ in $\mathbb{Z}_m$. Multiplying by it gives
$s_i - s_j \equiv a^{-1}(q_i - q_j)m \equiv 0 \pmod{m}$. Thus
$m \mid (s_i - s_j)$, but $0 < s_j < s_i < m$, so $1 \le s_i - s_j < m$, a contradiction.
Hence, i ≠ j implies that $r_i \ne r_j$, as promised.

(ii) Suppose, to the contrary, that there is some integer i and a prime number
p such that $p \mid r_i$ and $p \mid m$. Then $p \mid (q_im + r_i)$, hence $p \mid as_i$, which means
that p divides a or p divides $s_i$. Thus, there is a prime p that divides
m and at least one of a and $s_i$. This contradicts our assumption that
a and $s_i$ are both relatively prime to m.

We have thus proved that $\{s_1, s_2, \ldots, s_{\varphi(m)}\} = \{r_1, r_2, \ldots, r_{\varphi(m)}\}$. Finally,
\[
\begin{aligned}
a^{\varphi(m)} s_1 s_2 \cdots s_{\varphi(m)} &= as_1 \cdot as_2 \cdots as_{\varphi(m)} \\
&\equiv r_1 r_2 \cdots r_{\varphi(m)} \pmod{m} \\
&\equiv s_1 s_2 \cdots s_{\varphi(m)} \pmod{m}.
\end{aligned}
\]

Since each $s_i$ is relatively prime to m, each $s_i$ has by Theorem 26 a multiplicative
inverse $s_i^{-1}$. Multiplying both sides of the last congruence by $s_1^{-1} s_2^{-1} \cdots s_{\varphi(m)}^{-1}$,
we find
\[ a^{\varphi(m)} \equiv 1 \pmod{m} \]
which finishes the proof.

Corollary 38. If p is prime and gcd(a, p) = 1, then

\[ a^{p-1} \equiv 1 \pmod{p}. \]

Proof. If p is prime, then φ(p) = p − 1. The result now follows directly from
Theorem 37.

Example 3.6.1 Suppose we wish to solve the congruence

\[ x \equiv 3^{201} \pmod{11}. \]

Corollary 38 implies that $3^{10} \equiv 1 \pmod{11}$. Hence:

\[ 3^{201} = (3^{10})^{20} \cdot 3 \equiv 1^{20} \cdot 3 \equiv 3 \pmod{11}. \]

Thus, $x = 3 \in \mathbb{Z}_{11}$.
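
The trick in Example 3.6.1 is to reduce the exponent modulo p − 1 before doing any multiplication. A minimal sketch of this, assuming p is prime and does not divide a (the function name `power_mod_prime` is hypothetical, and primality of p is not checked):

```python
def power_mod_prime(a: int, exponent: int, p: int) -> int:
    """a**exponent mod p for a prime p not dividing a, via Corollary 38."""
    if a % p == 0:
        raise ValueError("a must not be divisible by p")
    reduced = exponent % (p - 1)       # a^(p-1) is congruent to 1 modulo p
    return pow(a, reduced, p)


print(201 % 10)                        # the reduced exponent for 3^201 mod 11
print(power_mod_prime(3, 201, 11))
assert power_mod_prime(3, 201, 11) == pow(3, 201, 11)
```

The final assertion compares the shortcut against direct modular exponentiation with the original exponent.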
Another useful consequence of Theorem 37 is the following:

Corollary 39. If a, m ∈ Z, m ≥ 2, and gcd(a, m) = 1, then $a^{-1} = a^{\varphi(m)-1}$
(reduced modulo m) is the multiplicative inverse in $\mathbb{Z}_m$ of a.

Proof. From Theorem 37, $a \cdot a^{\varphi(m)-1} = a^{\varphi(m)} \equiv 1 \pmod{m}$.


Example 3.6.2 Suppose we wish to find $2^{-1}$ in $\mathbb{Z}_9$. Since gcd(2, 9) = 1 and φ(9) = 6, from
Corollary 39 we have
\[ 2^{-1} = 2^{\varphi(9)-1} = 2^5 = 32 \equiv 5 \pmod{9}. \]
It’s easy to check this: 2 × 5 = 10 ≡ 1 (mod 9), so in $\mathbb{Z}_9$, $2^{-1} = 5$.

Example 3.6.3 Another application of Theorem 37 is the solution of linear
congruences of the form
\[ ax \equiv b \pmod{m} \]
where gcd(a, m) = 1. Since $a^{\varphi(m)} \equiv 1 \pmod{m}$, we have
\[
\begin{aligned}
x &\equiv a^{\varphi(m)} x \pmod{m} \\
&\equiv a^{\varphi(m)-1} ax \pmod{m} \\
&\equiv a^{\varphi(m)-1} b \pmod{m}
\end{aligned}
\]
For example, suppose we wish to solve the congruence
\[ 4x \equiv 6 \pmod{9} \]
Since gcd(4, 9) = 1, the solution is
\[
\begin{aligned}
x &\equiv 4^{\varphi(9)-1} \cdot 6 \pmod{9} \\
&\equiv 4^5 \cdot 6 \pmod{9} \\
&\equiv 1024 \cdot 6 \pmod{9} \\
&\equiv 6 \pmod{9}
\end{aligned}
\]
Once again, we can check: 4 · 6 = 24 ≡ 6 (mod 9), so x = 6 is a solution of
this congruence.
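
The method of Example 3.6.3 can be sketched in a few lines. To keep the block self-contained, φ(m) is computed here by direct counting, which is adequate only for small m; the function name `solve_linear_congruence` is an assumption made for illustration.

```python
from math import gcd


def solve_linear_congruence(a: int, b: int, m: int) -> int:
    """Return the x in Z_m with a*x congruent to b mod m, assuming gcd(a, m) = 1."""
    if gcd(a, m) != 1:
        raise ValueError("this method requires gcd(a, m) = 1")
    # phi(m) by direct counting; fine for small m only.
    phi_m = sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)
    inverse = pow(a, phi_m - 1, m)     # Corollary 39: a^(phi(m) - 1) inverts a
    return (inverse * b) % m


x = solve_linear_congruence(4, 6, 9)
print(x, (4 * x) % 9)
```

The second printed value is the left-hand side 4x reduced modulo 9, which should agree with b = 6.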

3.7 Groups
A group is an ordered pair (S, ◦), where S is a nonempty set and ◦ a binary
operation on S, such that the following conditions hold:

1. S is closed under ◦ (technically, this follows from our choice of ◦ as a
binary operation, but it doesn’t hurt to be reminded of this).

2. The operation ◦ is associative, i.e., for all x, y, z ∈ S, (x ◦ y) ◦ z =
x ◦ (y ◦ z).

3. There is a unique element e ∈ S such that for all x ∈ S, x ◦ e = e ◦ x = x.
The element e is called the group identity.

4. For every x ∈ S, there is a unique element $x^{-1} \in S$ such that
$x \circ x^{-1} = x^{-1} \circ x = e$. The element $x^{-1}$ is the inverse of x.

If, in addition, x ◦ y = y ◦ x for all x, y ∈ S, then (S, ◦) is an abelian group.

Example 3.7.1 (Z, +) is a group (where by + we mean simply addition). The
identity element e is the number 0, since if x is any integer, then x + 0 = 0 + x = x.
The inverse of an integer x is the integer −x, since x + (−x) = 0. For example,
the inverse of the integer 51 is −51. Since x + y = y + x for all integers x, y,
this is an abelian group.

Example 3.7.2 (Z, ·) is not a group (by · we mean simply multiplication). The
number 1 is the identity: if x is any integer, then x·1 = 1·x = x. However, most
of the integers do not have inverses under ·. For example, there is no integer y
such that 3 · y = 1.

Example 3.7.3 Denote by $GL_n$ the set of all invertible n × n matrices with
entries from R. Then $GL_n$ together with the operation of matrix multiplication
is a group, the general linear group. The identity element is the n × n identity
matrix and the group inverse of a matrix A is its matrix inverse, $A^{-1}$. Similarly,
if we let $SL_n$ be the subset of $GL_n$ consisting of all those n × n matrices
with determinant 1, then $SL_n$ is the special linear group.

Example 3.7.4 ($\mathbb{Z}_n$, +), where + denotes addition modulo n, is a group. The
identity element is the number 0. The inverse of $x \in \mathbb{Z}_n$ is the unique number
$y \in \mathbb{Z}_n$ such that x + y ≡ 0 (mod n). For example, in $\mathbb{Z}_{26}$, the inverse of 5 is
21.
We now define a group that will play an important role in what follows. For
a positive integer n, the multiplicative group of $\mathbb{Z}_n$ is
\[ \mathbb{Z}_n^* = \{a \in \mathbb{Z}_n : \gcd(a, n) = 1\}; \]
the group operation is multiplication modulo n. The identity in $\mathbb{Z}_n^*$ is the
number 1. By Theorem 26, every element of $\mathbb{Z}_n^*$ has an inverse, so (as the
name suggests), $\mathbb{Z}_n^*$ is a group under multiplication modulo n. It is immediate
that $|\mathbb{Z}_n^*| = \varphi(n)$. Notice that if p is prime, then $\mathbb{Z}_p^* = \{1, 2, \ldots, p - 1\} = \mathbb{Z}_p - \{0\}$.
Let $a \in \mathbb{Z}_n^*$. The order of a is the smallest positive integer k such that
\[ a^k \equiv 1 \pmod{n}. \]
The order of an element a is denoted |a|.

Example 3.7.5 Consider the group $\mathbb{Z}_{10}^*$. The elements of $\mathbb{Z}_{10}^*$ are the integers
in $\mathbb{Z}_{10}$ that are relatively prime to 10. Hence
\[ \mathbb{Z}_{10}^* = \{1, 3, 7, 9\} \]
(and notice that $|\mathbb{Z}_{10}^*| = 4 = \varphi(10)$). Consider the element $7 \in \mathbb{Z}_{10}^*$. We find
the order of 7 by finding the least positive integer k such that $7^k \equiv 1 \pmod{10}$:

k    $7^k$ mod 10
1    7
2    9
3    3
4    1

Thus, |7| = 4. Notice that if we rewrite $7^4 \equiv 1 \pmod{10}$ as $7 \cdot 7^3 \equiv 1 \pmod{10}$,
it becomes clear that $7^3 \equiv 7^{-1} \equiv 3 \pmod{10}$.
In the same way, we can construct a table listing the order of
each element of $\mathbb{Z}_{10}^*$:

a    |a|
1    1
3    4
7    4
9    2

Notice that for each $a \in \mathbb{Z}_{10}^*$, the number |a| is a divisor of $4 = |\mathbb{Z}_{10}^*| = \varphi(10)$.
This is not a coincidence.
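
Orders of elements can be found by repeated multiplication, exactly as in the first table of Example 3.7.5. A small sketch (the function name `order` is our own; it assumes n ≥ 2):

```python
from math import gcd


def order(a: int, n: int) -> int:
    """Smallest k >= 1 with a^k congruent to 1 mod n, for a in the group Z_n^*."""
    if n < 2 or gcd(a, n) != 1:
        raise ValueError("need n >= 2 and gcd(a, n) = 1")
    k, power = 1, a % n
    while power != 1:                  # terminates because a is invertible mod n
        power = (power * a) % n
        k += 1
    return k


for a in (1, 3, 7, 9):
    print(a, order(a, 10))
```

The loop runs at most φ(n) times, so this is only practical when the group is small.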
If $\alpha \in \mathbb{Z}_n^*$ is such that $|\alpha| = |\mathbb{Z}_n^*| = \varphi(n)$, then α is a generator (or
primitive element) of $\mathbb{Z}_n^*$. There are many values of n for which the group
$\mathbb{Z}_n^*$ does not have a generator; for example, $\mathbb{Z}_8^*$ does not have a generator
(you should verify this by hand). A group that has a generator is called cyclic,
and a cyclic group of order m has φ(m) generators. The group $\mathbb{Z}_n^*$ has order
φ(n), so it follows that, if $\mathbb{Z}_n^*$ is cyclic, then $\mathbb{Z}_n^*$ has φ(φ(n)) generators.
For example, $\mathbb{Z}_{10}^*$ has φ(φ(10)) = φ(4) = 2 generators, namely, 3 and 7.
However, we shall be most interested in $\mathbb{Z}_p^*$ for p a prime number. When p is
prime, $\mathbb{Z}_p^*$ always has at least one generator.

Example 3.7.6 The numbers 3 and 7 are both generators of $\mathbb{Z}_{10}^*$, while 1 and
9 are not. Intuitively, 7 is a generator because the powers of 7 generate all the
elements of the group (while, on the other hand, 9 does not generate $\mathbb{Z}_{10}^*$ because
there is no integer k such that $9^k \equiv 3 \pmod{10}$ or $9^k \equiv 7 \pmod{10}$).
We shall need to be able to find generators of $\mathbb{Z}_p^*$ quickly; hence, we present
the following result (without proof):

Theorem 40. Suppose p is prime and $\alpha \in \mathbb{Z}_p^*$. Then α is a generator of $\mathbb{Z}_p^*$
if and only if
\[ \alpha^{(p-1)/q} \not\equiv 1 \pmod{p} \]
for all primes q such that q|(p − 1).

Example 3.7.7 Consider the group $\mathbb{Z}_{31}^* = \{1, 2, \ldots, 30\}$. Here, p − 1 = 30,
which is divisible by exactly three primes: 2, 3, and 5. So for each $\alpha \in \mathbb{Z}_{31}^*$, we
compute

• $\alpha^{30/2} \bmod 31 = \alpha^{15} \bmod 31$,

• $\alpha^{30/3} \bmod 31 = \alpha^{10} \bmod 31$, and,

• $\alpha^{30/5} \bmod 31 = \alpha^{6} \bmod 31$.

The results are shown in the following table:

α   α^6 mod 31   α^10 mod 31   α^15 mod 31
1 1 1 1
2 2 1 1
3 16 25 30
4 4 1 1
5 1 5 1
6 1 25 30
7 4 25 1
8 8 1 1
9 8 5 1
10 2 5 1
11 4 5 30
12 2 25 30
13 16 5 30
14 8 25 1
15 16 1 30
16 16 1 1
17 8 25 30
18 16 5 1
19 2 25 1
20 4 5 1
21 2 5 30
22 8 5 30
23 8 1 30
24 4 25 30
25 1 25 1
26 1 5 30
27 4 1 30
28 16 25 1
29 2 1 30
30 1 1 30

From this we see that 3, 11, 12, 13, 17, 21, 22, and 24 are generators of Z∗31 .
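
The same list can be obtained by applying the test of Theorem 40 to every element of Z∗31. The sketch below is illustrative only: the names prime_factors and is_generator are our own, and the trial-division factoring is suitable only for small values of p − 1.

```python
def prime_factors(m):
    """Distinct prime factors of m, found by trial division (small m only)."""
    factors, d = [], 2
    while d * d <= m:
        if m % d == 0:
            factors.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        factors.append(m)
    return factors

def is_generator(alpha, p):
    """Test of Theorem 40; assumes p is prime and 1 <= alpha <= p - 1."""
    return all(pow(alpha, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))

generators_31 = [a for a in range(1, 31) if is_generator(a, 31)]
print(generators_31)  # [3, 11, 12, 13, 17, 21, 22, 24]
```

The built-in pow(a, e, p) performs modular exponentiation, so no large intermediate powers are formed.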
Of course, in general we don’t need a list of all the generators of Z∗p . We
just need one. Theorem 40 suggests the following procedure for finding a
generator of Z∗p :
Elementary number theory 75

Finding a generator of Z∗p (Given: a prime number p)

1 Choose a random element α ∈ {1, 2, . . . , p − 1}
2 for each prime q that divides p − 1
3 do
4     if α^((p−1)/q) ≡ 1 (mod p), then go back to Step 1
5
6 The number α generates Z∗p
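
The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: the names prime_factors and find_generator are our own, p is assumed to be prime (this is not checked), and p − 1 is factored by trial division, which is adequate only for small p.

```python
import random

def prime_factors(m):
    """Distinct prime factors of m, found by trial division (small m only)."""
    factors, d = [], 2
    while d * d <= m:
        if m % d == 0:
            factors.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        factors.append(m)
    return factors

def find_generator(p):
    """Return a generator of Z*_p. Assumes p is prime; this is not checked."""
    qs = prime_factors(p - 1)
    while True:
        alpha = random.randrange(1, p)   # Step 1: random element of {1, ..., p-1}
        # Steps 2-4: reject alpha if alpha^((p-1)/q) ≡ 1 (mod p) for some prime q | p-1
        if all(pow(alpha, (p - 1) // q, p) != 1 for q in qs):
            return alpha                 # Step 6: alpha generates Z*_p

g = find_generator(1553)
print(g)  # the value varies from run to run
```

Because α is chosen at random, repeated runs will usually return different generators.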

Example 3.7.8 Suppose we wish to find a generator of Z∗1553. Then p = 1553
(which you can check is prime), so p − 1 = 1552 = 2^4 · 97. Thus the prime factors q
of p − 1 are 2 and 97. Following the algorithm, we now choose a random integer
α ∈ [1, 1552]. Suppose we choose α = 5. We now consider each prime factor
q in turn. First, when q = 2, we compute 5^(1552/2) mod 1553 = 5^776 mod 1553 = 1552.
Since this is not 1, we continue with the second prime factor, q = 97. In this case, we
get 5^(1552/97) mod 1553 = 5^16 mod 1553 = 129. Once again, this is not 1. We’ve now
completed checking the prime factors of p − 1. Since none of them, when substituted into
α^((p−1)/q), gave a result congruent to 1 (mod p), the number 5 is a generator. Note
that if 5 had failed the test for either of the two prime factors, we would have had to
pick another number α at random and try again. In theory, we might have to do
this several times before finding a generator.

Exercises
3.1 Determine which of the following pairs of integers are relatively prime
(give reasons for your answers):

(a) 12 and 32
(b) 21 and 40
(c) 24 and 84

3.2 Use the Division Algorithm to compute the GCD of each of the follow-
ing pairs of integers:

(a) 5 and 21
(b) 82 and 248
(c) 240 and 2805
(d) 2160 and 99225

3.3 For each pair a, b of integers given below, use the Extended Division
Algorithm to find a pair s, t of integers such that gcd(a, b) = sa + tb.

(a) 5 and 21
(b) 82 and 248
(c) 240 and 2805
(d) 2160 and 99225

3.4 Find each of the following. Check each answer:

(a) 3^(−1) ∈ Z20
(b) 18^(−1) ∈ Z839
(c) 439^(−1) ∈ Z2999

3.5 Use the square and multiply algorithm to find each of the following:

(a) 6^13 (mod 17)
(b) 47^53 (mod 71)

3.6 Calculate each of the following:

(a) φ(11), φ(12), . . . , φ(20).


(b) φ(125)
(c) φ(140)
(d) φ(437)
(e) φ(4410)

3.7 Find the remainder when

(a) 5^100 is divided by 7
(b) 3^(999,999,999) is divided by 7

3.8 Use Euler’s Theorem to find

(a) 4^(−1) ∈ Z13.
(b) x ∈ Z6 such that 5x ≡ 3 (mod 6).

3.9 For each value of n below: (i) determine the elements of Z∗n , (ii) write
down the multiplication table for Z∗n , (iii) find the order of each element,
and (iv) determine which elements are generators.

(a) n = 5
(b) n = 6
(c) n = 9
(d) n = 15
(e) n = 16
(f) n = 17

3.10 Use the algorithm on page 75 to find a generator for each of the fol-
lowing:

(a) Z∗19
(b) Z∗157
(c) Z∗887
(d) Z∗9949

Solutions
3.1 (a) Not relatively prime (2 is a common factor).
(b) Relatively prime (no nontrivial common factor).
(c) Not relatively prime (6 is a common factor).

3.2 (a) 1
(b) 2
(c) 15
(d) 135

3.3 (a) 1 = 1 · 21 + (−4) · 5
(b) 2 = 1 · 248 + (−3) · 82
(c) 15 = 3 · 2805 + (−35) · 240
(d) 135 = 46 · 2160 + (−1) · 99225

3.4 (a) 7
(b) 606
(c) 608

3.5 (a) 10
(b) 33

3.6 (a) φ(11) = 10, φ(12) = 4, φ(13) = 12, φ(14) = 6, φ(15) = 8, φ(16) =
8, φ(17) = 16, φ(18) = 6, φ(19) = 18, φ(20) = 8
(b) 100
(c) 48
(d) 396
(e) 1008

3.7 (a) 2
(b) 6

3.8 (a) 10
(b) 3

3.9 (a) (i) Z∗5 = {1, 2, 3, 4}
(ii)
    · | 1 2 3 4
    --+--------
    1 | 1 2 3 4
    2 | 2 4 1 3
    3 | 3 1 4 2
    4 | 4 3 2 1
(iii) |1| = 1, |2| = 4, |3| = 4, |4| = 2
(iv) 2 and 3 are generators
(c) (i) Z∗9 = {1, 2, 4, 5, 7, 8}
(ii)
    · | 1 2 4 5 7 8
    --+------------
    1 | 1 2 4 5 7 8
    2 | 2 4 8 1 5 7
    4 | 4 8 7 2 1 5
    5 | 5 1 2 7 8 4
    7 | 7 5 1 8 4 2
    8 | 8 7 5 4 2 1
(iii) |1| = 1, |2| = 6, |4| = 3, |5| = 6, |7| = 3, |8| = 2
(iv) 2 and 5 are generators
Chapter 4

Fundamentals of cryptology

Note: While studying for this course, you might wish to download
the Handbook of Applied Cryptography, by Menezes, van Oorschot,
and Vanstone, published by CRC Press, which has very kindly
made it available for free on the web. You can download it as a
set of PDFs from http://www.cacr.math.uwaterloo.ca/hac.

Cryptology is usually understood to consist of two related disciplines:
cryptography and cryptanalysis.

Cryptography is the study of mathematical techniques to provide information
security. Menezes, van Oorschot, and Vanstone identify four cryptographic
goals:

1. Confidentiality: Ensuring that only the intended recipient of a
message is able to understand it.
2. Data integrity: Preventing the unauthorized alteration of data.
3. Authentication: Providing assurance that (i) both sender and recipient
are who they say they are, and (ii) the message comes
from where it’s supposed to and goes where it’s supposed to.
4. Non-repudiation: Preventing parties from denying previously made
commitments.

Cryptanalysis is the study of mathematical techniques to defeat information
security.

While the word cryptology [1] was used for the first time by John Wilkins
in 1641, the subject itself is much older. A very primitive form of cryptology
was used by the Egyptians 4000 years ago. The Spartans used cryptographic
devices in approximately 400 BC and, about forty years later, Aeneas Tacticus
included in his military manual a chapter headed On secret messages. For
most of its history, cryptology has been the province of governments and the
military, but with the increasing use of computer networks in the last forty
years, commercial entities have become strongly interested in information
security. Today, the feasibility of internet commerce rests on the ability to
conduct secure electronic transactions.

4.1 Definitions
We wish to send a message in such a way that it is unintelligible to all
unauthorized persons, but can be understood by the intended recipient.
The plaintext (or message) M is a finite string of symbols from a finite
alphabet [2] Σ. M is converted, by the process of encryption (or enciphering),
into an enciphered text called the ciphertext (or cryptogram), C. The person
who enciphers M is called the sender or encipherer and uses a set of rules
(or algorithm) to encrypt M. He sends the ciphertext, C, to the (intended)
recipient (or receiver). Normally the operation of the algorithm involves the
use of a key K which is known to both the encipherer and the receiver. The
receiver uses an algorithm (involving the key) to obtain M from C; this is
known as decryption (or deciphering). Note that the ciphertext C and the key
K must determine the plaintext M uniquely.

We shall adopt the convention that plaintext is written lowercase and ci-
phertext uppercase. For example, we might encrypt the word goodbye as
AHYEKVA.
Any person who intercepts the message is called an interceptor [3]. In general,
an interceptor will not know the key and will (we hope!) be unable to
decipher C to obtain M. The methods used in the encryption/decryption
above form the subject of cryptography. The methods used by the interceptor
to derive M from C without having access to the key are studied in
cryptanalysis.

[Figure 4.1: Communication using encryption. The sender encrypts the plaintext; the ciphertext travels over an unsecured channel, where the interceptor may read it; the recipient decrypts it to recover the plaintext.]

[1] From the Greek words krypte (‘to hide’) and logos (‘word’).
[2] An alphabet is a set of symbols which we use to write down our messages. Two
important examples are (i) the Latin alphabet {a, b, . . . , z}, and (ii) the binary alphabet
{0, 1}.
[3] Or: adversary, enemy, attacker, opponent, tapper, eavesdropper, intruder, interloper.

4.2 Monoalphabetic and polyalphabetic ciphers


We now study two classes of encryption schemes:

• In a monoalphabetic cipher, each letter in the plaintext alphabet is
always encrypted as the same letter in the ciphertext alphabet. For
example, if in the word banana the first a is encrypted as F, then the
second and third letters a will be encrypted as F as well.

• In a polyalphabetic cipher, a letter in the plaintext alphabet might be
encrypted as several different letters in the ciphertext alphabet. For
example, the first a in the word banana might be encrypted as F while
the second and third letters a are encrypted as Z and B.

Monoalphabetic ciphers are cryptographically weak because they preserve
the relative frequency with which each letter occurs in the plaintext language.
So, for example, if the plaintext language is English, an interceptor
could guess that whichever letter occurs the most frequently in the ciphertext
corresponds to the letter e in the plaintext.

4.2.1 Monoalphabetic ciphers


Simple substitution ciphers
In a simple substitution cipher, we replace each letter of the alphabet by
another. In other words, a simple substitution cipher is a permutation of the
letters of the alphabet.

Example 4.2.1 Suppose that the following set of substitutions (the key) is
used. Both the encipherer and the decipherer have a copy of this key, which is
simply a permutation of the letters of the alphabet:
Plaintext a b c d e f ··· t u v w ···
Ciphertext D X W E G A ··· B F R C ···
Then cat is enciphered as WDB and AGC is deciphered as few.
In a simple substitution cipher like this one, the re-ordered alphabet (D X W
E G A · · · B F R C · · · ) is called the substitution alphabet.

This is a poor system: in many cases it can be cryptanalyzed, since it preserves
letter frequencies. The key is also awkward to manage. Memorizing it is difficult;
on the other hand, if the key is kept for reference, it can be lost or stolen.

Shift ciphers
In the Gallic wars Julius Caesar used a cipher in which each of the letters
a, b, . . . , z is replaced by the letter which occurs three places after it in the
alphabet [4]. We can represent this with the following permutation:
Plaintext  a b c d e ··· w x y z
Ciphertext D E F G H ··· Z A B C
We call this a shift cipher or additive cipher or translation cipher with shift
(or key) 3.
More generally, in a shift cipher with shift d, each letter in the plaintext
alphabet is encrypted as the letter that occurs d places further on in the
alphabet and decrypted by replacing each letter by one that occurs d places
earlier on in the alphabet (or 26 − d places further on). As before, z is
followed by a b c · · · . Note that a shift cipher is just a special case of the
simple substitution cipher.

[4] Of course, there is no letter three places after x, y, or z, but this is easily remedied:
We encrypt x as A, y as B, and z as C.

Encryption can be done mechanically by means of a simple device con-
sisting of a large disc on which there is a smaller disc (with the same centre)
which can be rotated d places forward for encryption or d places back for
decryption.
The key is easily remembered, but the cipher is so insecure that it is of
no practical use, as an interceptor has to test at most 25 possible values of d
to find the key.

Example 4.2.2 Suppose that the adversary, who knows that a shift cipher is
being employed, intercepts the following ciphertext:

AOPZ TLZZHNL PZ H MHRL.

He tests values of d on the word MHRL and finds that d = 7 yields a plaintext
of fake, while d = 19 yields toys. All other shifts (values of d) result in
unintelligible plaintext. He now turns his attention to the other words in the
ciphertext. If he decrypts PZ with d = 7, he gets is, while d = 19 yields wg. So
he chooses d = 7 and decrypts the ciphertext to find the plaintext message [5].
Alternatively, if he had thought for a moment, he might have spotted that H
probably represents a or i (corresponding to d = 7 or d = 25, respectively) and,
testing d = 7, he would have decrypted the message with ease.
In this example, the words fake and toys are translates of one another. We
know of no pairs of English words of length six or more that are translates of
each other, and only a few of length four or five.
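
The adversary's exhaustive search is easy to mechanize. The following is a small Python sketch, not part of the original notes; the helper name shift_decrypt is our own, and only the word MHRL is decrypted so that the full message remains an exercise.

```python
import string

ALPHABET = string.ascii_lowercase

def shift_decrypt(ciphertext, d):
    """Move each letter of an uppercase ciphertext back d places (mod 26)."""
    return "".join(ALPHABET[(ord(c) - ord("A") - d) % 26] for c in ciphertext)

# Try every possible shift on the word MHRL.
candidates = {d: shift_decrypt("MHRL", d) for d in range(1, 26)}
for d, plaintext in candidates.items():
    print(d, plaintext)

print(shift_decrypt("PZ", 7), shift_decrypt("PZ", 19))  # is wg
```

Scanning the 25 printed candidates by eye picks out d = 7 (fake) and d = 19 (toys), and the second word settles the matter.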

As we have mentioned, monoalphabetic ciphers are vulnerable to attack
by a frequency analysis of letters, pairs of letters (digrams), triples of letters
(trigrams), and so on. Hence, if we seek a system which is secure against
attack, it must be polyalphabetic.

[5] The content of which is left for you to discover as an exercise!

4.2.2 Polyalphabetic ciphers

With a polyalphabetic cipher, a specific ciphertext letter can represent more
than one plaintext letter (and, conversely, each plaintext letter can be encrypted
in more than one way). There are several ways to do this, but we
must be sure that, whatever we do, we can still decipher the message, i.e.,
each ciphertext should be decipherable to a unique plaintext.

n-gram substitution

A single letter is a 1-gram, a sequence of two letters (e.g., th) is a 2-gram
(also called a digram), a sequence of three letters (e.g., fud) is a 3-gram
(or trigram). In general, an n-gram is a sequence of n letters. When we
studied simple substitution ciphers, we replaced each letter (i.e., each 1-gram)
of plaintext with a letter of ciphertext. In digram substitution, we
instead replace each digram of plaintext with a digram of ciphertext. We
can go further than this, of course: in n-gram substitution, we replace each
n-gram of plaintext with an n-gram of ciphertext.
If we use our Latin alphabet of 26 letters, there are 26^2 digrams [6]. So
digram substitution uses a key which is a permutation of 26^2 elements (each
a digram). This permutation may be represented by a 26 × 26 array in
which the rows correspond to the first letter of the plaintext digram and the
columns to the second letter of the plaintext digram. The entries of the array
are the ciphertext digrams that replace the plaintext digrams.

[6] And 26^3 trigrams, and, in general, 26^n n-grams.

Example 4.2.3 Suppose that part of the key for a digram encryption scheme
looks like this:

       a    b   ···   x    y    z
  ⋮
  c    MZ   BQ        JA   DD   FK
  d    IA   DT        TB   AT   ZS
  e    LP   SX        AM   EO   BR
  ⋮
  k    BA   AC        QP   MN   LA
  l    WF   EH        GO   BJ   RE
  m    CT   MB        CW   HP   IS
  ⋮

Then the word lady would be encrypted as WFAT.
An array for decryption can easily be obtained from the encryption array.

Permutation ciphers
A block cipher is an encryption scheme in which the plaintext message is
broken up into blocks of a fixed length d, each of which is then encrypted
separately. All of the encryption schemes that we have seen so far have been
block ciphers, e.g., in a digram substitution scheme, each block has length
d = 2. We now describe another block cipher: the permutation cipher.
Let d be a positive integer. Divide the message M into blocks of length
d. Then take a permutation π of 1, 2, 3, . . . , d and apply π to each block.
Specifically, if the plaintext block is x_1 x_2 · · · x_d, then the corresponding
ciphertext block is x_{π(1)} x_{π(2)} · · · x_{π(d)}.

Example 4.2.4 Let d = 4 and π = (2413). Suppose that the message we want
to encrypt is
he is a great mathematician.
We make sure (i) we remember which symbols are spaces, and, (ii) that we pad
the message so that its length is a multiple of the block length, 4:
he-is-a-great-mathematician-.
Next we divide the message up into blocks of length 4:
he-i s-a- grea t-ma them atic ian-.
We now apply the permutation (2413) to each block. This means that the first
letter in the ciphertext block is the second letter in the plaintext block, the second
letter in the ciphertext block is the fourth letter in the plaintext block, and so
on:
plaintext he-i s-a- grea t-ma them atic ian-
ciphertext EIH- --SA RAGE -ATM HMTE TCAI A-IN
To decrypt, we apply the inverse

    π^(−1) = ( 1 2 3 4 )
             ( 3 1 4 2 )

to each block of the ciphertext to recover the original message:
he-i s-a- grea t-ma them atic ian-.
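
The example above can be checked with a short Python sketch. It is illustrative only: the function names are our own, permutations are written as tuples (π(1), . . . , π(d)), and the message is assumed to have been padded already to a multiple of the block length.

```python
def permute_block(block, perm):
    """Position i of the output receives position perm[i] of the input (1-based)."""
    return "".join(block[j - 1] for j in perm)

def inverse(perm):
    """Inverse of a permutation given as the tuple (pi(1), ..., pi(d))."""
    inv = [0] * len(perm)
    for i, j in enumerate(perm, start=1):
        inv[j - 1] = i
    return tuple(inv)

def apply_cipher(text, perm):
    """Apply perm to each block of len(perm) symbols; text must already be padded."""
    d = len(perm)
    if len(text) % d != 0:
        raise ValueError("pad the message to a multiple of the block length")
    return "".join(permute_block(text[i:i + d], perm) for i in range(0, len(text), d))

pi = (2, 4, 1, 3)
message = "he-is-a-great-mathematician-"
ciphertext = apply_cipher(message, pi).upper()
recovered = apply_cipher(ciphertext, inverse(pi)).lower()
print(ciphertext)   # EIH---SARAGE-ATMHMTETCAIA-IN
print(inverse(pi))  # (3, 1, 4, 2)
print(recovered)    # he-is-a-great-mathematician-
```

Decryption is the same operation as encryption, carried out with the inverse permutation.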

Permutation ciphers are more secure than simple substitution ciphers,
but are still vulnerable to attack.

4.2.3 Modular arithmetic


Before we continue our discussion of polyalphabetic ciphers, we shall simplify
matters by representing the letters of the alphabet by numbers. For this to
be truly useful, we first need to discuss modular arithmetic.
If a is a nonnegative integer and n a positive integer, then we define
a mod n as the remainder when a is divided by n.

Example 4.2.5 11 mod 3 = 2, 15 mod 7 = 1, 6 mod 2 = 0.


If a is negative, then we define a mod n in the following way. Let k be the
largest multiple of n that is less than or equal to a. Then a mod n = a − k.

Example 4.2.6 The largest multiple of 5 which is less than or equal to −7 is
−10 (which is −2 times 5). Therefore, −7 mod 5 = −7 − (−10) = 3. Similarly,
−4 mod 2 = 0, −13 mod 22 = 9, and −20 mod 3 = 1.
Notice that, whatever the value of a, the number a mod n is always in
{0, 1, . . . , n − 1}.
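
The definition translates directly into code. In the Python sketch below (the function name mod is our own), floor division supplies the largest multiple of n that does not exceed a. Python's built-in % operator agrees with this definition for positive n, although the remainder operator in some other languages does not when a is negative.

```python
def mod(a, n):
    """a mod n as defined above, for any integer a and positive integer n."""
    k = (a // n) * n   # largest multiple of n that is <= a (floor division)
    return a - k

for a, n in [(11, 3), (15, 7), (6, 2), (-7, 5), (-4, 2), (-13, 22), (-20, 3)]:
    print(a, "mod", n, "=", mod(a, n))
```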

letter number letter number letter number letter number


a 0 h 7 o 14 v 21
b 1 i 8 p 15 w 22
c 2 j 9 q 16 x 23
d 3 k 10 r 17 y 24
e 4 l 11 s 18 z 25
f 5 m 12 t 19
g 6 n 13 u 20

Table 4.1: Representing the alphabet with numbers.

Once we represent the alphabet using numbers (Table 4.1), we can
‘add’ two letters together using arithmetic modulo 26, frequently abbreviated
as arithmetic (mod 26), which we now define. Let Z26 = {0, 1, . . . , 25},
and for two numbers x, y ∈ Z26 define x + y mod 26 to be the number
(x + y) mod 26, i.e., we add the two numbers x and y together and then find
the remainder when the result is divided by 26.

Example 4.2.7 13 + 19 mod 26 = 32 mod 26 = 6. Similarly, 21 + 15 mod 26 =
36 mod 26 = 10.
Similarly, we define x − y mod 26 to be the number (x − y) mod 26.

Example 4.2.8 5 − 17 mod 26 = −12 mod 26 = 14.

Clearly, we can do other arithmetic operations modulo 26 as well, e.g., multiplication
modulo 26. And we can, of course, define arithmetic modulo n
(where n is an arbitrary positive integer [7]) analogously to arithmetic modulo
26.
We can use modular arithmetic to implement the shift cipher: we simply
convert the letters in the plaintext to numbers, as per Table 4.1, then add
the shift or key (modulo 26) to each number. To decrypt, we convert the
ciphertext to numbers, then subtract the shift or key (modulo 26) from each
number.

[7] The number n is called the modulus.

Example 4.2.9 For a shift of 7, let’s encrypt the message

penguin of death

Plaintext (letters)    p  e  n  g  u  i  n  o  f  d  e  a  t  h
Plaintext (numbers)   15  4 13  6 20  8 13 14  5  3  4  0 19  7
Ciphertext (numbers)  22 11 20 13  1 15 20 21 12 10 11  7  0 14
Ciphertext (letters)   W  L  U  N  B  P  U  V  M  K  L  H  A  O

We decipher by subtracting 7 from each number in the ciphertext (modulo 26).
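
The conversion between letters and numbers, and the shift itself, can be sketched in Python as follows. The helper names to_numbers, to_letters and shift are our own; spaces and punctuation are simply dropped, which is an assumption of this sketch.

```python
import string

LETTERS = string.ascii_lowercase   # a = 0, b = 1, ..., z = 25

def to_numbers(text):
    """Convert letters to numbers, ignoring anything that is not a letter a-z."""
    return [LETTERS.index(c) for c in text.lower() if c in LETTERS]

def to_letters(numbers):
    return "".join(LETTERS[x] for x in numbers)

def shift(numbers, key):
    """Add key to each number modulo 26; use a negative key to decrypt."""
    return [(x + key) % 26 for x in numbers]

plain = to_numbers("penguin of death")
cipher = shift(plain, 7)
print(cipher)                         # [22, 11, 20, 13, 1, 15, 20, 21, 12, 10, 11, 7, 0, 14]
print(to_letters(cipher).upper())     # WLUNBPUVMKLHAO
print(to_letters(shift(cipher, -7)))  # penguinofdeath
```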


Arithmetic modulo n can be used to define an important relation that we
shall use frequently later in the course. If a, b are integers and n is a positive
integer, then we say that a is congruent to b modulo n, written

    a ≡ b (mod n),

if a mod n = b mod n. Notice that this definition is equivalent to the
one given in Example 1.4.3, and recall that we proved in that example that
congruence modulo n is an equivalence relation.

Exercises
4.1 Encrypt the message

When shall we three meet again


In thunder, lightning, or in rain?

using

(a) a shift cipher with key 15.


(b) a simple substitution cipher with key
Plaintext   abcdefghijklmnopqrstuvwxyz
Ciphertext  JMBXIZDLVOSGHFAKCENYWQUTPR

(c) a permutation cipher with key (143652).

4.2 Indicate how you would decrypt (not attack!) each of the ciphertexts
you generated in Question 4.1.

Solutions
4.1 (a) LWTCHWPAALTIWGTTBTTIPVPXC. . .
(b) ULIFNLJGGUIYLEIIHIIYJDJVF. . .
(c) WNEHSHAWLTELHEEEMREGAIAT. . .

4.2 (a) Subtract 15 (modulo 26) from each letter.


(b) Use the inverse permutation.
(c) Use the inverse permutation.
(d) Subtract the word david from each block of length 5.
Chapter 5

Public-key cryptography

To this point, we have dealt with symmetric-key cryptosystems, i.e., cryptosystems
in which someone knowing the key K_E used for encryption can
easily deduce from this the key K_D used for decryption. We now consider
a second kind of cryptosystem, in which knowing K_E provides no useful
information about K_D.

5.1 One-way functions


5.1.1 Definitions
Let S, T be sets. A one-way function f : S → T is a function for which

1. For each x ∈ S, the value f(x) is easy to compute, but,

2. For almost every y ∈ T, it is computationally infeasible to find x ∈ S
such that y = f(x).

Example 5.1.1 Let p be a large prime number and f(x) a polynomial of high
degree, where f : Zp → Zp. It is easy to calculate f(x) for all x ∈ Zp, but
usually hard to solve f(x) = y for x. The function f is a one-way function. For
example, if we choose f(x) = 2x^1553 + 3x^1471 − x^101 − 1 and p = 17957, it is
very difficult to find x ∈ Z17957 for which f(x) ≡ 12202 (mod 17957).

Example 5.1.2 Choose N_1 and N_2 to be large prime numbers and let S consist
of all ordered pairs (p, q) of prime numbers with N_1 ≤ p ≤ q ≤ N_2. Define
f : S → Z by the rule
f (p, q) = pq.
It is easy to calculate f (p, q) for all (p, q) ∈ S. However, if we are given the
number pq (without being told the factors p and q), then it is (in general)
computationally infeasible to find p and q. For example, the number 26547259
is the product of two (relatively small) 4-digit primes p and q. If you wish to
gauge the difficulty of finding p and q, try to factor 26547259. You will find
this easy to do on a computer (especially with a computer algebra system like
Mathematica), but extremely time-consuming by hand.
Now suppose that, instead of using two 4-digit prime numbers, we use two
300-digit prime numbers. Their product will be a number that even a computer
will have tremendous difficulty factoring. This difficulty is the basis of the RSA
cryptosystem, which we shall shortly encounter.
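
To get a feel for the computer's side of this, here is a minimal trial-division sketch in Python; the function name smallest_factor is our own. It factors the 8-digit example instantly, but the same loop applied to a product of two 300-digit primes would need on the order of 10^300 trial divisions, which is hopeless.

```python
def smallest_factor(n):
    """Smallest prime factor of an integer n >= 2, found by trial division."""
    if n % 2 == 0:
        return 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return d
        d += 2   # even candidates above 2 cannot be prime
    return n     # no divisor up to sqrt(n), so n itself is prime

n = 26547259
p = smallest_factor(n)
q = n // p
print(p, q)  # 2879 9221
```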
Suppose we are given a one-way function f . One way in which we might try
to ‘get around’ the second property, that it is in general infeasible to solve the
equation y = f (x) for x, is by creating a table of all the pairs (f (x), x): we
work our way through the set S, calculating f (x) for each x ∈ S, and then re-
order the results by increasing f (x). Now, using this table, we can in theory
quickly look up an x for any given f (x). However, when S is large enough,
this approach is not feasible. It requires too much memory to store the table.
For example, suppose that S consists of all 200-digit binary numbers; then
|S| = 2^200. If we write the table in binary and use one molecule to store each
bit (0 or 1) of information, then we still require more molecules than are in
our solar system.

5.1.2 The password problem


One of the first applications of the idea of a one-way function was to solving
the problem of the security of computer passwords. Suppose a group of users
has access to a computer. Each user logs in to the computer by supplying
a user name u and password p(u). The computer then checks the entered
password against the one it has on file for the user u to determine whether
u should be allowed to login or not. However, it is dangerous to store a
list {(ui , p(ui ))} of user names with their passwords in unencrypted form
in a file on the computer, for a hacker who gains access to the system can
make a copy of the file and thus gain access to all of the user accounts.
Similarly, it is dangerous to encrypt the list of passwords using a symmetric-
key cryptosystem. The key must be stored somewhere on the computer and
a hacker gaining access to the computer can use the key to decrypt the list
of usernames and passwords.
As an alternative, consider the following. Let f be a one-way bijection
whose domain is the set of all possible passwords and suppose that
instead of storing a list of pairs (u, p(u)), the computer stores the list of pairs
(u, f(p(u))). Each time a user logs in:

1. The user enters a name u and a password p′.

2. The computer calculates f(p′), which (since f is a one-way function) is
computationally easy.

3. The computer checks the entered name-password pair, (u, f(p′)), against
the stored name-password pair (u, f(p(u))). If f(p′) = f(p(u)), then,
since f is a bijection, p′ = p(u) and the computer allows the user to
log in. Otherwise, the computer denies the user access.

An intruder who gains access to the list of pairs (u, f (p(u))) obtains no useful
information. To login as a user u, they must know the password p(u). How-
ever, all they know is the encrypted form of the password, f (p(u)); since f is
a one-way function, it is (for all practical purposes) impossible to determine
p(u) from f (p(u)).
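
The login procedure above can be sketched in Python. Everything here is an illustrative assumption rather than the scheme of any real system: SHA-256 from the standard hashlib module stands in for f (it is not a bijection, but finding two passwords with the same hash is believed to be infeasible), and the user name and password are made up. Real systems also add a per-user salt and use a deliberately slow hash, which this sketch omits.

```python
import hashlib

def f(password):
    """Stand-in for the one-way function f: SHA-256 of the UTF-8 encoded password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# The computer stores pairs (u, f(p(u))), never the passwords themselves.
stored = {"angus": f("correct horse")}   # hypothetical user and password

def login(user, entered_password):
    """Steps 1-3: hash the entered password and compare with the stored value."""
    return user in stored and stored[user] == f(entered_password)

print(login("angus", "correct horse"))  # True
print(login("angus", "wrong guess"))    # False
```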

Example 5.1.3 Suppose that each password is a pair (p, q) of large prime num-
bers and that the one-way function is f (p, q) = pq. Then for the user angus with
password (2879, 9221), the computer would create the entry (angus, 26547259).
The primes 2879 and 9221 (the factors of 26547259) would not be stored on
the computer. An intruder who hacked into the system and wanted to login as
angus would be faced with the task of factoring the number 26547259.

5.1.3 Trapdoor one-way functions


A trapdoor one-way function is a one-way function f : S → T with the
additional property that, given some extra information (called the trapdoor
information), it becomes feasible to find, for any y ∈ T, an x ∈ S such that
f(x) = y.
The existence of a trapdoor one-way function has never been proved, but
we shall soon consider functions that are believed to be trapdoor one-way
functions.

5.2 The key distribution problem


Suppose that Alice wants to send Bob a message. The message is confidential,
so she encrypts it. Of course, in order for Bob to decrypt the message, he
needs the key she used to encrypt it. She must somehow get this key to
him without anyone else finding out what the key is. What does she do?
If she encrypts the key itself, then she needs to send Bob the second key
that she used to encrypt the first key, which means she hasn’t achieved
anything by encrypting the first key, because she must now find some way
to (confidentially) communicate the second key to him. She could agree to
meet him in person and exchange the key, but this can be time-consuming.
It’s also not always possible. What if Alice is a computer in Cape Town and
Bob is a computer in Redmond, Washington?
Things get even worse if Alice is one of a group of n people (or entities,
like computers or companies), each pair of whom would like to be able to
exchange secret messages. Since the content of each message must remain
unknown to everyone except the sender and recipient, each pair of people
must agree on a (different) secret key to be used in their communications,
so a total of n(n − 1)/2 keys must be generated. And since each key must
be known by both members of the pair to which it is assigned, the key must
be communicated to or exchanged between them via a secure channel (e.g.,
a trusted courier). Each of the n people must store n − 1 keys, one for each
of the people they might want to exchange messages with. If n is, say, 1,000,
then a total of 499,500 keys must be generated and stored! Clearly, there are
logistical difficulties associated with key exchange. Indeed, as Simon Singh
notes in The Code Book:
In the 1970s, banks attempted to distribute keys by employing
special dispatch riders who had been vetted and who were among
the company’s most trusted employees. These dispatch riders
would race across the world with padlocked briefcases, personally
distributing keys to everyone who would receive messages from
the bank over the next week. As business networks grew in size,
as more messages were sent, and as more keys had to be deliv-
ered, the banks found that this distribution process became a
horrendous logistical nightmare, and the overhead costs became
prohibitive.

Another possibility for organizing key exchange is to use a key centre C, a
trusted third party who (in theory) has no interest in the content of any
message exchanged by users of the system. Each user A exchanges a secret
key K_A with C. If A wants to send a message to B, then:

1. A encrypts the message with the key K_A and sends it to C.

2. C decrypts the message using the key K_A, then re-encrypts it with the
key K_B.

3. C sends the re-encrypted message to B, who decrypts it with the key
K_B.

With this setup, only n keys need to be exchanged and stored. But there are
some problems: the key centre C knows the contents of every message sent
between the other parties; even if C is deemed trustworthy [1], an opponent
who gains access to C’s list of keys will be able to read every message on the
network.
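
The relay through the key centre, and the saving in the number of keys, can be illustrated with a toy Python sketch. The shift cipher stands in for a real symmetric cipher, the key values are made up, and the names shift, relay and pairwise_keys are our own; messages are assumed to consist of lowercase letters only.

```python
def shift(text, k):
    """Toy symmetric cipher: shift lowercase letters by k places (mod 26)."""
    return "".join(chr((ord(c) - ord("a") + k) % 26 + ord("a")) for c in text)

# Hypothetical secret keys K_A and K_B that users A and B share with the centre C.
centre_keys = {"A": 3, "B": 11}

def relay(message, sender, recipient):
    to_centre = shift(message, centre_keys[sender])               # 1. sender encrypts
    read_by_centre = shift(to_centre, -centre_keys[sender])       # 2. C decrypts ...
    to_recipient = shift(read_by_centre, centre_keys[recipient])  #    ... and re-encrypts
    received = shift(to_recipient, -centre_keys[recipient])       # 3. recipient decrypts
    return read_by_centre, received

def pairwise_keys(n):
    """Number of keys needed when every pair of n users shares its own key."""
    return n * (n - 1) // 2

print(relay("hello", "A", "B"))  # ('hello', 'hello')
print(pairwise_keys(1000), "pairwise keys versus", 1000, "keys with a key centre")
```

The first component of the result makes the weakness visible: the centre sees the plaintext of every message it relays.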

[1] People don’t like other people knowing their keys. In the 1990s, the US government,
worried that criminals were encrypting their messages, tried to set up a key escrow system.
The idea was that the federal government would be given copies of everyone’s keys, but
that it would only be allowed to use that knowledge with appropriate oversight. Anyone
doing business with the federal government was forced to use the American Escrowed
Encryption Standard. Despite that, no-one apart from the feds liked the system, and it
was eventually scrapped.

5.3 Diffie-Hellman key exchange

Until the mid 1970s, the consensus amongst cryptographers was that the
difficulties associated with key exchange were an undesirable but unavoidable
part of cryptography. However, in June 1976 a Stanford researcher named
Whitfield Diffie demonstrated at a conference how two parties could agree
on a secret key without actually exchanging that key. His idea, the result of
joint research with Martin Hellman, another Stanford researcher, is simple
but ingenious. It works in the following fashion [2]. Suppose that Alice and
Bob want to agree on a secret key to be used in communication between
them. Then:

1. They first pick two positive integers, Y and P. There is no need (as you
will see) for them to keep this information secret, so they could do this
on the telephone or by email or by placing a succession of page-length
ads in the New York Times.

2. Alice chooses a secret number, A, and calculates α = Y^A mod P.

3. Bob also chooses a secret number, B, and calculates β = Y^B mod P.

4. Alice sends α to Bob and Bob sends β to Alice.

5. Alice now calculates α′ = β^A mod P and Bob calculates β′ = α^B mod P.

Since β^A mod P = Y^(BA) mod P = Y^(AB) mod P = α^B mod P, we have
α′ = β′. This number, α′, is the secret key they have agreed on!

Example 5.3.1 Suppose that Alice calls Bob one morning and they agree, over
the phone, that $Y = 13$ and $P = 29$. She hangs up. While she is choosing her
secret number, $A = 12$, Bob, after some thought, decides that $B = 17$. Alice now
computes $\alpha = Y^A \bmod P = 13^{12} \bmod 29 = 23298085122481 \bmod 29 = 23$.
Similarly, Bob determines $\beta = 13^{17} \bmod 29 = 8650415919381337933 \bmod 29 = 22$.
Alice calls Bob again. She tells him that $\alpha = 23$; he replies that $\beta = 22$.
She hangs up again. She now computes $\alpha' = \beta^A \bmod P = 22^{12} \bmod 29 =
12855002631049216 \bmod 29 = 16$. In the meantime, Bob (after finishing
his morning cup of coffee) works out $\beta' = \alpha^B \bmod P = 23^{17} \bmod 29 =
141050039560662968926103 \bmod 29 = 16$. Both of them now know the secret
key to be used between them is 16.
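The exchange in Example 5.3.1 can be replayed in a few lines of Python. This is only a sketch: the function names `dh_public` and `dh_shared` are our own, and the built-in three-argument `pow` stands in for the modular exponentiation described in Chapter 3.

```python
# Public parameters and secret exponents taken from Example 5.3.1.
Y, P = 13, 29          # agreed in the open
A, B = 12, 17          # Alice's and Bob's secret numbers

def dh_public(base, secret, modulus):
    """Value a party sends over the open channel: base^secret mod modulus."""
    return pow(base, secret, modulus)

def dh_shared(received, secret, modulus):
    """Shared key computed from the other party's public value."""
    return pow(received, secret, modulus)

alpha = dh_public(Y, A, P)          # Alice sends this to Bob
beta = dh_public(Y, B, P)           # Bob sends this to Alice
key_alice = dh_shared(beta, A, P)   # Alice's copy of the key
key_bob = dh_shared(alpha, B, P)    # Bob's copy of the key
print(alpha, beta, key_alice, key_bob)
```

Neither secret exponent is ever transmitted; only `alpha` and `beta` cross the open channel.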
The security of Diffie-Hellman key exchange rests on the intractability of the
Discrete Logarithm Problem, about which we shall have more to say later.
² Actually, it’s a little more complicated than this, but not much. The details can be
found in the Handbook of Applied Cryptography.
Public-key cryptography 99

5.4 The birth of public-key cryptography


The talk that Diffie gave in 1976 was followed in the same year by a pa-
per, New directions in cryptography, that was published in the journal IEEE
Transactions on Information Theory. In this article, which caused great ex-
citement in the cryptographic community, Diffie and Hellman proposed the
idea of a public-key cryptosystem. Such a system would work in the following
fashion:
• A public-key cryptosystem is an asymmetric-key cryptosystem, i.e., it
is not computationally feasible to compute a decryption key $K_D$ from
the corresponding encryption key $K_E$.
• Suppose Alice wants to use a (hypothetical) public-key cryptosystem.
She begins by generating a pair of keys, which we shall denote pri(Alice)
and pub(Alice). She keeps the key pri(Alice), her private key, secret
from everyone else, but publishes her public key, pub(Alice), in a direc-
tory available to all users of the system.
• Anyone who wants to send her a message can encrypt it using her public
key. But only Alice, using her private key, can decrypt such a message.
For example, suppose that Bob, another user of the system, wants to
send Alice an encrypted message:
1. Bob looks up Alice’s public key, pub(Alice).
2. Bob encrypts the message using pub(Alice) and sends the en-
crypted message to Alice using an open channel (e.g., by email).
It does not matter if someone intercepts the message since all they
have access to is Alice’s public key, and that information provides
no help in decrypting the message.
3. Alice decrypts the encrypted message using her private key pri(Alice).
We can think of the encryption function in a public-key cryptosystem as a
trapdoor one-way function, with the trapdoor information being the private
key. Someone knowing pri(Alice) can easily decrypt a message encrypted
using pub(Alice), but without this knowledge the task is computationally
infeasible.
The security of a public-key cryptosystem depends on the functions E
and D used for encryption and decryption, respectively. They should have
the following properties:
1. For a given plaintext $P$ and public key pub(A), it should be easy to
compute the corresponding ciphertext $C = E_{\mathrm{pub}(A)}(P)$.

2. If only the ciphertext $C$ is known, it should be computationally
infeasible to find the plaintext $P$.

3. If the ciphertext $C$ and the private key pri(A) are known, it should be
easy to compute the plaintext $P = D_{\mathrm{pri}(A)}(C)$.

4. It should be easy to generate pairs (pub(A), pri(A)) of public and
private keys so that too many such pairs exist for an enemy to construct
a look-up table.

While Diffie and Hellman were the first to publicly³ propose the idea of a
public-key cryptosystem, in order to actually construct a public-key cryp-
tosystem, they needed a trapdoor one-way function and, at the time their
paper was published, they had been unable to find one. It would be a year
later when three researchers on the East coast constructed the first public-key
cryptosystem.

5.5 The RSA cryptosystem


5.5.1 Introduction
‘I walked into Ron Rivest’s office,’ recalls Leonard Adleman, ‘and
Ron had this paper in his hands. He started saying, “These
Stanford guys have this really blah blah blah.” And I remember
thinking, “That’s nice, Ron, but I have something else I want to
talk about.” I was entirely unaware of the history of cryptography
and I was distinctly uninterested in what he was saying.’⁴

³ The idea had been floated seven years earlier by a British cryptographer named James
Ellis. However, Ellis worked for GCHQ, the British government’s cryptographic agency,
and his work remained classified until 1997.
⁴ From The Code Book, by Simon Singh.

In 1976, the year New directions in cryptography appeared, Ronald Rivest,
Adi Shamir, and Leonard Adleman were researchers at MIT’s Laboratory
for Computer Science in Cambridge, Massachusetts. Trying to fill in the gap
in Diffie and Hellman’s paper, they began to look for a trapdoor one-way
function that could be used to construct a public-key cryptosystem. The
breakthrough came the next year. The tale is told in The Code Book, by
Simon Singh:

In April 1977, Rivest, Shamir and Adleman spent Passover at
the house of a student, and had consumed significant amounts
of Manischewitz wine before returning to their respective homes
some time around midnight. Rivest, unable to sleep, lay on his
couch reading a mathematics textbook. He began mulling over
the question that had been puzzling him for weeks — is it pos-
sible to build an asymmetric cipher? Is it possible to find a one-
way function that can be reversed only if the receiver has some
special information? Suddenly, the mists began to clear and he
had a revelation. He spent the rest of the night formalizing his
idea, effectively writing a complete scientific paper before day-
break. Rivest had made a breakthrough, but it had grown out
of a year-long collaboration with Shamir and Adleman, and it
would not have been possible without them. Rivest finished off
the paper by listing the three authors alphabetically; Adleman,
Rivest, Shamir.
The next morning, Rivest handed the paper to Adleman, who
went through his usual process of trying to tear it apart, but this
time he could find no faults. His only criticism was with the list
of authors. ‘I told Ron to take my name off the paper,’ recalls
Adleman. ‘I told him that it was his invention, not mine. But
Ron refused and we got into a discussion about it. We agreed that
I would go home and contemplate it for one night, and consider
what I wanted to do. I went back the next day and suggested to
Ron that I be the third author. I recall thinking that this paper
would be the least interesting paper that I will ever be on.’

Thus Adleman, Rivest, and Shamir became Rivest, Shamir, and Adleman,
and their cryptosystem became known by the acronym RSA. Almost thirty
years later, RSA is the most widely used public-key cryptosystem in the
world⁵. The security of the system rests on the belief that there is no efficient
way of factoring a number that is the product of two large⁶ primes.

⁵ Interestingly, while Rivest, Shamir, and Adleman are credited as the inventors of
RSA, they were not the first to come up with the idea. In 1973, Clifford Cocks, a British
mathematician with a background in Number Theory, had just joined GCHQ when one of
his colleagues happened to mention to him Ellis’s notions about public-key cryptography.
Later that afternoon, Cocks sat down to think about the problem of coming up with
a trapdoor one-way function; in the process, he devised what would become the RSA
cryptosystem. He recalls that ‘From start to finish, it took me no more than half an hour
... I thought, “Ooh, that’s nice. I’ve been given a problem, and I’ve solved it” ’. With very
little background in cryptography, he had no idea of the significance of his discovery. His
colleagues at GCHQ had been wrestling with the problem for some years, and after Diffie
and Hellman publicly posed the problem three years later, it would take Rivest, Shamir,
and Adleman a year of work to duplicate Cocks’ almost accidental discovery.
⁶ The definition of ‘large’ has changed as computers have become more powerful.

Before we present the RSA system, we observe that although public-key
cryptosystems have many advantages, they are not widely used for
general-purpose encryption and decryption of long messages, because the
processes of encryption and decryption in public-key cryptosystems are
considerably slower than the corresponding operations in a symmetric-key
cryptosystem. However, public-key cryptosystems are widely used to encrypt
keys for symmetric-key cryptosystems (like DES, which we discuss later in
the course), which are then used to encrypt the actual messages.

5.5.2 The mechanics of RSA: key generation

Suppose that Alice wants to use the RSA system. She must first generate a
key-pair, pri(Alice) and pub(Alice). This is accomplished as follows:
1. Alice picks two large primes, $p$ and $q$, of roughly the same size. She
computes their product $n = pq$ (the number $n$ is commonly referred to
as the public modulus) and also the Euler function $\phi(n) = (p-1)(q-1)$.
2. Alice then selects a random integer $e$, $1 < e < \phi(n)$, such that
$\gcd(e, \phi(n)) = 1$, i.e., $e$ is relatively prime to $\phi(n)$.
3. She computes the multiplicative inverse $d$ of $e$ in $\mathbb{Z}_{\phi(n)}$, i.e., the unique
integer $d$ such that $1 < d < \phi(n)$ and $ed \equiv 1 \pmod{\phi(n)}$.
4. Finally, she sets
pri(Alice) = (n, d)
(which she keeps secret) and publishes her public key
pub(Alice) = (n, e).
Since the $n$ in her private key is the same as the $n$ in her public key,
we shall frequently think of the private key as just the number $d$ and
write pri(Alice) = d.

Example 5.5.1 Suppose⁷ that Alice chooses $p = 47$ and $q = 59$. Then

$n = pq = 2773$

and

$\phi(n) = (p-1)(q-1) = 2668$.

Alice now needs a number $e$ such that $1 < e < 2668$ and $\gcd(e, 2668) = 1$. A
good choice is a relatively small prime number; she chooses $e = 17$ and quickly
checks that $17 \nmid 2668$ (since 17 is prime, this guarantees $\gcd(17, 2668) = 1$).
The next step is to find her private key $d$, which satisfies $17d \equiv 1 \pmod{2668}$,
i.e., $d$ is the multiplicative inverse of 17 in $\mathbb{Z}_{2668}$. Using the Extended Division
Algorithm from the previous chapter we find that $d = 157$. Thus

pub(Alice) = (2773, 17)
pri(Alice) = 157.

She publishes her public key and keeps her private key a secret.
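The key generation steps can be sketched in Python as follows. The helper names `extended_gcd` and `rsa_keypair` are illustrative choices, and the extended Euclidean routine below is one standard way of finding the inverse; it is not necessarily laid out the same way as the Extended Division Algorithm of the earlier chapter.

```python
from math import gcd

def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y

def rsa_keypair(p, q, e):
    """Build ((n, e), (n, d)) from primes p, q and a chosen exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to phi(n)")
    _, x, _ = extended_gcd(e, phi)
    d = x % phi                      # inverse of e in Z_phi(n)
    return (n, e), (n, d)

# The numbers from Example 5.5.1.
public_key, private_key = rsa_keypair(47, 59, 17)
print(public_key, private_key)
```

Primality of p and q is not checked here; in practice the primes would be generated and tested by a separate routine.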

⁷ We are, more or less, using the example given in the original RSA paper,
A Method for Obtaining Digital Signatures and Public-Key Cryptosystems.
The paper, which is quite straightforward, is available for download from
http://theory.lcs.mit.edu/~rivest/rsapaper.pdf

5.5.3 The mechanics of RSA: encryption and decryption

Suppose now that Bob wants to send an encrypted message to Alice.
1. Bob looks up pub(Alice) = (n, e).
2. Bob represents his message, $M$, as an integer in the interval $[0, n-1]$.
If $M$ is too large, he divides it into blocks and then encrypts each block
separately.
3. Bob encrypts $M$ into ciphertext $C$ by the rule

$C = M^e \bmod n$.

4. To decrypt the ciphertext $C$ she receives from Bob, Alice uses her
private key pri(Alice) = d to find

$M = C^d \bmod n$.

letter  number    letter  number    letter  number    letter  number
space   00        g       07        n       14        u       21
a       01        h       08        o       15        v       22
b       02        i       09        p       16        w       23
c       03        j       10        q       17        x       24
d       04        k       11        r       18        y       25
e       05        l       12        s       19        z       26
f       06        m       13        t       20
Table 5.1: An alphabet with spaces.

Example 5.5.2 Suppose that Bob wants to send the message


my hovercraft is full of eels
to Alice. He must first encode the message as numbers; wishing to keep the
spaces, he adopts the coding scheme given in Table 5.1.

The message is therefore encoded as


1325000815220518031801062000091900062112120015060005051219
Alice’s public key is (n, e) = (2773, 17). In order to use the RSA system, Bob
has to break the message up into blocks so that each block is an element of
$\mathbb{Z}_{2773}$, so Bob breaks the message up into blocks of length 4:
1325 0008 1522 0518 0318 0106 2000 0919 0006 2112 1200
1506 0005 0512 1900
(notice that we’ve padded the last block). Now Bob encrypts each block $M$
separately, each time producing an encrypted block $C$ according to the rule

\[
C = M^e \bmod n = M^{17} \bmod 2773.
\]

The first block is 1325, so we compute

\[
\begin{aligned}
1325^{17} \bmod 2773 &= 1325^{2^4+1} \bmod 2773 \\
&= (1325)^{2 \cdot 2^3} \cdot 1325 \bmod 2773 \\
&= (316)^{2^3} \cdot 1325 \bmod 2773 \\
&= (28)^{2^2} \cdot 1325 \bmod 2773 \\
&= 784^2 \cdot 1325 \bmod 2773 \\
&= 192.
\end{aligned}
\]

The first block is thus encrypted as 0192. The process is repeated for the other
fourteen blocks in the message, yielding the ciphertext

0192 0596 2024 1787 0578 0195 0317 2244 2751 0844 1444
0882 0508 1278 2342

which Bob then sends to Alice.
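The repeated squaring used for the first block can be written as a short routine. The sketch below is a right-to-left binary variant (it scans the exponent from its lowest bit), so its intermediate values need not match the hand computation line by line; the function name is our own.

```python
def square_and_multiply(base, exponent, modulus):
    """Compute base^exponent mod modulus by scanning the exponent's bits."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                     # low bit set: multiply it in
            result = (result * base) % modulus
        base = (base * base) % modulus       # square for the next bit
        exponent >>= 1
    return result

# First block of Example 5.5.2: 1325^17 mod 2773.
print(square_and_multiply(1325, 17, 2773))
```

Because every product is reduced modulo n straight away, no intermediate number exceeds n².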


After receiving the message from Bob, Alice uses her private key, $d = 157$,
to decrypt. For each block $C$, she computes

\[
M = C^d \bmod n = C^{157} \bmod 2773.
\]

For example, the encrypted first block 0192 would be decrypted as follows:

\[
\begin{aligned}
192^{157} \bmod 2773 &= 192^{2^7} \cdot 192^{2^4} \cdot 192^{2^3} \cdot 192^{2^2} \cdot 192 \bmod 2773 \\
&= 815^{2^6} \cdot 192^{2^4} \cdot 192^{2^3} \cdot 192^{2^2} \cdot 192 \bmod 2773 \\
&= 1478^{2^5} \cdot 192^{2^4} \cdot 192^{2^3} \cdot 192^{2^2} \cdot 192 \bmod 2773 \\
&= 2133^{2^4} \cdot 192^{2^4} \cdot 192^{2^3} \cdot 192^{2^2} \cdot 192 \bmod 2773 \\
&= (2133 \cdot 192)^{2^4} \cdot 192^{2^3} \cdot 192^{2^2} \cdot 192 \bmod 2773 \\
&= 1905^{2^4} \cdot 192^{2^3} \cdot 192^{2^2} \cdot 192 \bmod 2773 \\
&= (1941 \cdot 192)^{2^3} \cdot 192^{2^2} \cdot 192 \bmod 2773 \\
&= 1090^{2^3} \cdot 192^{2^2} \cdot 192 \bmod 2773 \\
&= (1256 \cdot 192)^{2^2} \cdot 192 \bmod 2773 \\
&= 2674^{2^2} \cdot 192 \bmod 2773 \\
&= 1482^{2} \cdot 192 \bmod 2773 \\
&= 1325.
\end{aligned}
\]

Notice that we have successfully recovered the first block of Bob’s message! You
should now repeat the procedure to decrypt the other fourteen blocks in the
message.
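For readers who would rather not decrypt the remaining blocks by hand, the whole of Example 5.5.2 can be sketched in Python. The `encode` and `decode` helpers are our own illustration of Table 5.1 (a space is 00, a to z are 01 to 26), and padding with zeros is assumed to mean padding with trailing spaces.

```python
ALPHABET = " abcdefghijklmnopqrstuvwxyz"   # position = code in Table 5.1

def encode(text, block_len=4):
    """Turn text into integer blocks of block_len digits, zero-padded at the end."""
    digits = "".join(f"{ALPHABET.index(ch):02d}" for ch in text)
    if len(digits) % block_len:
        digits += "0" * (block_len - len(digits) % block_len)
    return [int(digits[i:i + block_len]) for i in range(0, len(digits), block_len)]

def decode(blocks, block_len=4):
    """Inverse of encode; trailing padding spaces are stripped."""
    digits = "".join(f"{b:0{block_len}d}" for b in blocks)
    chars = [ALPHABET[int(digits[i:i + 2])] for i in range(0, len(digits), 2)]
    return "".join(chars).rstrip()

n, e, d = 2773, 17, 157                     # Alice's keys from Example 5.5.1
plain_blocks = encode("my hovercraft is full of eels")
cipher_blocks = [pow(m, e, n) for m in plain_blocks]    # Bob encrypts
recovered = [pow(c, d, n) for c in cipher_blocks]       # Alice decrypts
print(" ".join(f"{c:04d}" for c in cipher_blocks))
print(decode(recovered))
```

Note that `decode` would also strip genuine trailing spaces from a message; a real scheme would need an unambiguous padding rule.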
We shall return later to the problem of proving that the RSA system works
as advertised.

5.5.4 Key size in the RSA system


The security of the RSA system is dependent on the difficulty of factoring
large numbers⁸. If an opponent can factor the public modulus $n$ (i.e.,
determine the two primes $p$ and $q$), then he can easily find the private key $d$ and
hence decrypt all of Alice’s encrypted messages. The larger $n$ is, the more
difficult this is to do. The key size of an RSA cryptosystem is the size of
the public modulus $n$. While for the example we just considered, our value
of $n$ was relatively small (a four digit number like 2773 can be factored in a
fraction of a second on a desktop computer), in the real world the key size is
usually a lot larger. Of course, the larger $n$ is, the more time encryption and
decryption take, so the desire for computational security must be weighed
against the need for efficient encryption and decryption.
⁸ A detailed discussion can be found in the original RSA paper; see
http://theory.lcs.mit.edu/~rivest/rsapaper.pdf.
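To see how little protection a four-digit modulus offers, the following sketch factors n = 2773 by trial division and then rebuilds Alice's private key from her public key alone. The function name is illustrative, and `pow(e, -1, phi)` for the modular inverse assumes Python 3.8 or later.

```python
def factor_semiprime(n):
    """Trial division: return (p, q) with p <= q and p * q == n, or None."""
    candidate = 2
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate, n // candidate
        candidate += 1
    return None

n, e = 2773, 17                  # Alice's public key from Example 5.5.1
p, q = factor_semiprime(n)       # the opponent recovers the two primes
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)              # modular inverse (Python 3.8+)
print(p, q, d)
```

Trial division needs roughly √n steps, so the same attack is hopeless against a modulus of the sizes discussed below.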
NIST, the National Institute of Standards and Technology, is part of the
US Department of Commerce and keeps track of recent developments, like
new algorithms and faster computers, that affect how easy it is to factor a
number of a given size. Based on their observations, they make recommen-
dations to other federal agencies on how large the public modulus n should
be to ensure security. As of May 2006, NIST suggests⁹ a minimum key size
of 1024 bits, though they recommend increasing this to 2048 bits for data
that must remain secure through 2030 and to 3072 bits for data that must
remain impregnable beyond 2030.
Another authority on the RSA cryptosystem is the company that admin-
istered the RSA patent (now expired):

RSA Laboratories currently¹⁰ recommends key sizes of 1024 bits
for corporate use and 2048 bits for extremely valuable keys like
the root key pair used by a certifying authority . . . Several recent
standards specify a 1024-bit minimum for corporate use. Less
valuable information may well be encrypted using a 768-bit key,
as such a key is still beyond the reach of all known key breaking
algorithms¹¹.

RSA Laboratories also recommends that the two primes $p$ and $q$ that
comprise the modulus $n$ should be of roughly the same length (so for a 1024 bit
modulus $n$, $p$ and $q$ should be about 512 bits each) and that $p$ and $q$ should
not be extremely close to one another¹².

⁹ NIST Special Publication 800-57: Recommendation for Key Management Part 1:
General. Revised (May, 2006).
¹⁰ April 2005.
¹¹ RSA Laboratories’ Frequently Asked Questions About Today’s Cryptography 4.1,
available from http://www.rsasecurity.com/rsalabs/node.asp?id=2218
¹² Why?

5.5.5 Digital signatures with RSA

Besides being capable of encryption and decryption of messages, RSA can
also be used to create digital signatures. A digital signature is analogous to
a handwritten signature: it’s a way of signing a message so that someone
reading the message will know with certainty that the message was created
by the signer.
Before we explain the mechanics, we introduce some new terminology.
To concatenate means to place end to end. If we concatenate message A
with message B, we denote the result A||B. For example, if A = cat and
B = dog, then A||B = catdog and B||A = dogcat.
Suppose that Alice wants to send Bob a message M in such a fashion
that he is certain that she is the one who sent it, e.g., she might be sending
her bank manager instructions to transfer money out of one of her accounts.
She uses an RSA key pair pub(Alice) = (n, e) and pri(Alice) = d.
1. She begins by representing the message $M$ as an integer in the interval
$[0, n-1]$ (or breaks it into blocks if it is too long).
2. Using her private key she computes the message signature,
$M_{\mathrm{pri(Alice)}} = M^d \bmod n$, i.e., she encrypts the message $M$ with her private key.
3. Alice concatenates $M$ with $M_{\mathrm{pri(Alice)}}$ to produce $M \| M_{\mathrm{pri(Alice)}}$.
4. Assuming that the message $M$ is confidential, she encrypts $M \| M_{\mathrm{pri(Alice)}}$
with Bob’s public key, pub(Bob); she then sends $(M \| M_{\mathrm{pri(Alice)}})_{\mathrm{pub(Bob)}}$
to Bob.
We now examine events from Bob’s perspective:
1. Bob receives a message $(M \| M')_{\mathrm{pub(Bob)}}$ from someone claiming to be
Alice. He begins by using his private key, pri(Bob), to remove the outer
layer of encryption, recovering $M \| M'$, which he separates into $M$ and
$M'$.
2. Bob now encrypts $M'$ with Alice’s public key, pub(Alice), i.e., he finds
$M'_{\mathrm{pub(Alice)}} = (M')^e \bmod n$ and compares it with $M$, the first half of the
concatenated message he received. There are two possibilities:
• $M'_{\mathrm{pub(Alice)}} = M$. Then Bob knows that $M'$ was encrypted with
Alice’s private key. Since Alice is the only one who knows Alice’s
private key, this proves that the message $M$ is from Alice.
• $M'_{\mathrm{pub(Alice)}} \neq M$. Therefore, either the message $M'$ was not
encrypted with Alice’s private key, or some malicious third party
altered the text $M$ after Alice added her signature; in either case,
Bob knows that the message was not authorized by Alice.

Example 5.5.3 Suppose that Alice wishes to send the signed and encrypted
message
give henda money
to Bob, and that

pub(Alice) = (2773, 17)


pri(Alice) = 157

and

pub(Bob) = (3233, 19)


pri(Bob) = 2299

She encodes the message as

M = 0709 2205 0008 0514 0401 0013 1514 0525

and encrypts each block $B$ of length 4 by the rule $C = B^{157} \bmod 2773$ to
produce the message signature

$M_{\mathrm{pri(Alice)}}$ = 1889 2059 1249 1016 1557 1291 0453 2706

so that

$M \| M_{\mathrm{pri(Alice)}}$ = 0709 2205 0008 0514 0401 0013 1514 0525
1889 2059 1249 1016 1557 1291 0453 2706.

Since each of the numbers in these blocks is in the range [0, 3232], blocks of
length 4 will work nicely with Bob’s public modulus. Alice now encrypts each
block $B$ with Bob’s public key, using the rule $C = B^{19} \bmod 3233$ to obtain

$(M \| M_{\mathrm{pri(Alice)}})_{\mathrm{pub(Bob)}}$ = 2920 0156 1304 1312 1189 1477 1719 1131
0373 0127 1984 0997 3079 2118 2410 2578

This is the message that she sends to Bob.



Example 5.5.4 Suppose Bob receives the ciphertext


Y = 2920 0156 1304 1312 1189 1477 1719 1131
0373 0127 1984 0997 3079 2118 2410 2578
He sets to work decrypting it, first using his private key $d = 2299$ with the rule
$C = B^{2299} \bmod 3233$ to discover that the underlying message is
X = 0709 2205 0008 0514 0401 0013 1514 0525
1889 2059 1249 1016 1557 1291 0453 2706.
which he decodes to
give henda money||1889 2059 1249 1016 1557 1291 0453 2706.
Bob must now verify the signature: if the second half, 1889 2059 1249 1016
1557 1291 0453 2706, of the message, when decrypted with Alice’s public key
(2773, 17) and the rule $C = B^{17} \bmod 2773$, matches the first half, give henda
money, then he knows that the message came from Alice¹³. So he decrypts 1889
2059 1249 1016 1557 1291 0453 2706 and, indeed, it yields the text give
henda money. He is thus assured that the message give henda money was
authorized by Alice.
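The sign-then-encrypt procedure of Examples 5.5.3 and 5.5.4 can be sketched as below. The helper `rsa_apply` is our own name for applying one RSA exponent block by block; the keys and the encoded message are the ones given in the examples.

```python
ALICE_PUB, ALICE_PRI = (2773, 17), 157
BOB_PUB, BOB_PRI = (3233, 19), 2299

def rsa_apply(blocks, exponent, modulus):
    """Raise every block to the given exponent modulo the given modulus."""
    return [pow(b, exponent, modulus) for b in blocks]

# "give henda money" encoded with Table 5.1 in blocks of length 4.
message = [709, 2205, 8, 514, 401, 13, 1514, 525]

# Alice: sign with her private key, concatenate, encrypt with Bob's public key.
signature = rsa_apply(message, ALICE_PRI, ALICE_PUB[0])
sent = rsa_apply(message + signature, BOB_PUB[1], BOB_PUB[0])

# Bob: strip the outer layer with his private key and split the two halves.
received = rsa_apply(sent, BOB_PRI, BOB_PUB[0])
half = len(received) // 2
text, sig = received[:half], received[half:]

# Bob: the signature is valid if Alice's public key maps it back to the text.
valid = rsa_apply(sig, ALICE_PUB[1], ALICE_PUB[0]) == text
print(valid)
```

The outer layer only works because every block, signature blocks included, is smaller than Bob's modulus 3233, which is the point made in Example 5.5.3.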
There are a number of different ways to produce digital signatures; we’ve
described one of them. A digital signature scheme should have the following
properties:
1. The signature can be appended to any message the signatory wants to
identify as hers.
2. To prevent someone from appending a signature to a message the sig-
natory did not authorize, or altering the message, the signature must
be message dependent. If the message is altered after being signed, the
signature should not correspond to the altered message.
3. It should be computationally infeasible to forge the signature.
4. Signatures should be easy to check by anybody wishing to do so, e.g.,
with the RSA digital signature scheme, anyone wishing to verify Alice’s
signature $M'$ on a message $M$ can quickly look up pub(Alice) and then
calculate $M'_{\mathrm{pub(Alice)}}$.
¹³ And, if he is on good terms with Alice, he will then give Henda money.

5.5.6 The mathematics of RSA


To complete our study of RSA, we shall prove that it works. That is, if we
follow the instructions for generating a key and encrypting a message with
that key, then, when we attempt to decrypt the ciphertext, we will recover
the original message.
Lemma 41. Let p and q be distinct prime numbers and let a and b be
non-negative integers. If a ≡ b (mod p) and a ≡ b (mod q), then a ≡ b
(mod pq).
Proof. Since a−b is divisible by both p and q, and p and q are distinct primes,
a − b must be divisible by their product, pq. Hence a ≡ b (mod pq).
We now prove that RSA works. Our proof is more or less the same as
the one given in the original RSA paper¹⁴.
¹⁴ Which can be downloaded from http://theory.lcs.mit.edu/~rivest/rsapaper.pdf
Theorem 42 (The RSA Theorem). Let $(n, e)$ be a public key for the RSA
cryptosystem and $(n, d)$ the corresponding private key, and let $E(M) =
M^e \bmod n$ and $D(C) = C^d \bmod n$ be the encryption and decryption rules,
respectively. Then
\[
D(E(M)) \equiv M \pmod{n}.
\]
Proof. Since $ed \equiv 1 \pmod{\phi(n)}$, there is some integer $k$ such that $ed =
1 + k\phi(n)$. Hence,
\[
\begin{aligned}
D(E(M)) &\equiv (M^e)^d \pmod{n} \\
&\equiv M^{ed} \pmod{n} \\
&\equiv M^{k\phi(n)+1} \pmod{n}.
\end{aligned}
\]
Let $p$ and $q$ be the prime numbers that comprise the public modulus, i.e.,
$n = pq$. From Corollary 38, if $p$ does not divide $M$, then
\[
M^{p-1} \equiv 1 \pmod{p}.
\]
Raising both sides to the power $k(q-1)$, we have
\[
M^{k(p-1)(q-1)} \equiv 1^{k(q-1)} \equiv 1 \pmod{p}
\]
so that
\[
M^{k\phi(n)} \cdot M \equiv 1 \cdot M \pmod{p}
\]
which gives, finally,
\[
M^{k\phi(n)+1} \equiv M \pmod{p}. \tag{5.1}
\]
Notice that if $p \mid M$, then (5.1) is trivially true, since both sides are
congruent to 0 modulo $p$, and hence (5.1) is true for all $M$. Similarly,
\[
M^{k\phi(n)+1} \equiv M \pmod{q} \tag{5.2}
\]
holds for all $M$. Combining (5.1) and (5.2) with Lemma 41, we therefore
have that
\[
D(E(M)) \equiv M^{k\phi(n)+1} \equiv M \pmod{n}
\]
as required.
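For a modulus as small as the one in Example 5.5.1, the theorem can also be confirmed by brute force. The sketch below is a sanity check on that single key pair, not a substitute for the proof; it also counts the messages divisible by p or q, which are exactly the cases the proof treats separately.

```python
p, q, e, d = 47, 59, 17, 157      # key material from Example 5.5.1
n = p * q
phi = (p - 1) * (q - 1)
assert (e * d) % phi == 1         # ed = 1 + k*phi(n) for some integer k

# Every M in [0, n-1] should survive encryption followed by decryption.
failures = [M for M in range(n) if pow(pow(M, e, n), d, n) != M]

# Messages sharing a factor with n (the "p divides M" case of the proof).
shared_factor = [M for M in range(n) if M % p == 0 or M % q == 0]
print(len(failures), len(shared_factor))
```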

5.6 The Discrete Logarithm Problem


The security of the RSA cryptosystem is based on the difficulty of factor-
ing large integers. There are many other mathematical problems that are
believed to be inherently difficult, and some of these have been used to con-
struct other kinds of public-key cryptosystems.
One example is the Discrete Logarithm Problem: Given $a, b \in \mathbb{Z}_n$, find a
number $x \in \mathbb{Z}_n$ (if one exists) such that

\[
a^x \equiv b \pmod{n}.
\]

If we consider an equivalent problem in the reals, i.e., we are given $c, d \in \mathbb{R}$,
and we wish to find $x \in \mathbb{R}$ such that $c^x = d$, then the solution (if there is one)
is simply $x = \log_c d$, which is easily computed. However, when we restrict
ourselves to $\mathbb{Z}_n$ for a large value of $n$, the problem becomes very difficult.
Partially, this is because of the apparently random behavior of the function
$a^x$ when reduced modulo $n$.

Example 5.6.1 Consider the function

\[
f(x) = 17^x
\]

in $\mathbb{Z}_{2773}$. If we tabulate $f(x)$ for a few consecutive values of $x$,

x       f(x)
1130    1482
1131     237
1132    1256
1133    1941
1134    2494
1135     803

we notice immediately the lack of any perceivable pattern in the values of $f(x)$.
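Values of $f(x) = 17^x \bmod 2773$ for any range of $x$ can be tabulated with a couple of lines of Python. The function name `f` and its default arguments are just a convenient way of mirroring the example.

```python
def f(x, base=17, modulus=2773):
    """f(x) = base^x reduced modulo the given modulus."""
    return pow(base, x, modulus)

for x in range(1130, 1136):
    print(x, f(x))
```

Each value is 17 times the previous one reduced modulo 2773, yet the reduction makes the sequence look erratic, which is the behaviour the example illustrates.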

We have already noted that the security of the Diffie-Hellman key exchange
system rests on the difficulty of the Discrete Logarithm Problem:

Example 5.6.2 Suppose that an opponent has discovered a method by which he
can efficiently compute discrete logarithms, i.e., given $a, b \in \mathbb{Z}_n$, the opponent
can quickly find $x$ such that $a^x \equiv b \pmod{n}$. By listening in to the conversation
between Alice and Bob in Example 5.3.1, he discovers that $Y = 13$, $P = 29$,
$13^A \equiv 23 \pmod{29}$, and $13^B \equiv 22 \pmod{29}$. Using his efficient algorithm
for discrete logarithms, he quickly determines that $A = 12$ and $B = 17$ (or other
exponents that satisfy the same congruences), then computes $Y^{AB} \equiv 16 \pmod{29}$.
He has found Alice and Bob’s shared secret key and can
now decrypt any messages they exchange.
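With a modulus as small as 29 the opponent does not even need a clever algorithm: exhaustive search works. The sketch below, with an illustrative function name, finds the smallest exponent for each intercepted value. For Bob's value it returns 3 rather than 17, because 13 has order 14 modulo 29, but any exponent satisfying the congruence leads to the same shared key.

```python
def discrete_log(base, target, modulus):
    """Smallest x >= 1 with base^x = target (mod modulus), or None if none exists."""
    value = 1
    for x in range(1, modulus):
        value = (value * base) % modulus
        if value == target:
            return x
    return None

Y, P = 13, 29
alpha, beta = 23, 22                 # the values sent in Example 5.3.1

a = discrete_log(Y, alpha, P)        # an exponent that works for Alice
b = discrete_log(Y, beta, P)         # an exponent that works for Bob
key = pow(beta, a, P)                # the opponent's copy of the shared key
print(a, b, key)
```

The search takes up to P − 1 multiplications, which is why the scheme depends on P being far too large for this to be practical.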

5.7 The El Gamal public-key cryptosystem


This system was first published by Taher El Gamal in 1985. Like Diffie-
Hellman key exchange, its security rests on the apparent intractability of the
Discrete Logarithm Problem.

5.7.1 El Gamal: key generation


Suppose that Alice wants to use the El Gamal system. As with RSA, she must
first generate a key-pair — pri(Alice) and pub(Alice). This is accomplished
as follows:

1. Alice chooses a large random prime $p$ and a generator $\alpha$ of the
multiplicative group $\mathbb{Z}_p^*$.

2. She chooses a random integer $a \in \{2, 3, \ldots, p-2\}$ and computes
$\alpha^a \bmod p$.

3. She sets

pub(Alice) = $(p, \alpha, \alpha^a)$
pri(Alice) = $a$

Example 5.7.1 Suppose Alice chooses the prime $p = 149$ (which is far too
small, but this is just an example). She must now find a generator $\alpha$ of $\mathbb{Z}_{149}^*$.
She decides to try $\alpha = 5$. To determine whether 5 is a generator of $\mathbb{Z}_{149}^*$,
she uses Theorem 40 (or, equivalently, the algorithm that follows it). The prime
factorization of 148 is $2^2 \cdot 37$, so she must compute $5^{148/2} \bmod 149 = 5^{74} \bmod 149$
and $5^{148/37} \bmod 149 = 5^4 \bmod 149$. She finds that $5^{74} \equiv 1 \pmod{149}$, so
$\alpha = 5$ is not a generator of $\mathbb{Z}_{149}^*$. She tries $\alpha = 12$ and finds that $12^{74} \equiv 148
\pmod{149}$ and $12^4 \equiv 25 \pmod{149}$, so she chooses $\alpha = 12$ as her generator
for $\mathbb{Z}_{149}^*$. She picks $a = 37$ and calculates $\alpha^a = 12^{37} \equiv 105 \pmod{149}$. Thus,

pub(Alice) = (149, 12, 105)
pri(Alice) = 37

Notice that someone wishing to deduce Alice’s private key from her public
key must be able to find the number $a \in \{1, 2, \ldots, 147\}$ such that $12^a \equiv 105
\pmod{149}$, i.e., solve an instance of the Discrete Logarithm Problem.
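The generator test used in Example 5.7.1 is easy to automate once the prime factors of $p - 1$ are known. The function below is a sketch with a name of our choosing; it assumes the caller supplies the distinct prime factors of $p - 1$ rather than computing them.

```python
def is_generator(alpha, p, prime_factors):
    """True if alpha generates Z_p^*, given the distinct prime factors of p - 1.

    alpha is rejected as soon as alpha^((p-1)/r) = 1 (mod p) for some prime r.
    """
    return all(pow(alpha, (p - 1) // r, p) != 1 for r in prime_factors)

# Example 5.7.1: 148 = 2^2 * 37.
print(is_generator(5, 149, [2, 37]))     # 5 fails the test
print(is_generator(12, 149, [2, 37]))    # 12 passes
print(pow(12, 37, 149))                  # third component of Alice's public key
```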

5.7.2 El Gamal: encryption and decryption


Suppose that Bob wants to send Alice a message using the El Gamal cryp-
tosystem:

1. Bob first looks up Alice’s public key pub(Alice) = $(p, \alpha, \alpha^a)$.

2. He then represents the message as an integer $M$ in the range $\{0, 1, \ldots, p-1\}$
(or, as usual, breaks the message up into blocks, each satisfying the
preceding requirement).

3. Bob selects a random integer $k \in \{1, 2, \ldots, p-2\}$.

4. He then computes

$\gamma = \alpha^k \bmod p$

and

$\delta = M \cdot (\alpha^a)^k \bmod p$.

5. Finally, Bob sends $(\gamma, \delta)$ to Alice.

To decrypt the message that Bob sends her, Alice follows a two-step procedure:
1. She uses her private key pri(Alice) = $a$ to compute

$\gamma^{p-1-a} \bmod p$.

Why? By Corollary 38, $\gamma^{p-1} \equiv 1 \pmod{p}$, so $\gamma^{p-1-a} \equiv \gamma^{-a} \equiv \alpha^{-ak} \pmod{p}$.

2. Now she can recover the message $M$ by finding

$\delta \cdot \gamma^{p-1-a} \equiv M \cdot \alpha^{ak} \alpha^{-ak} \equiv M \pmod{p}$.

Example 5.7.2
Key generation: Suppose that Alice chooses $p = 2579$; she writes 2578 as the
product of 2 and 1289 (which is prime). She tries $\alpha = 2$, and computes
$\alpha^2 \bmod 2579 = 4$ and $\alpha^{1289} \bmod 2579 = 2578$; neither is 1, so $\alpha$ generates $\mathbb{Z}_{2579}^*$.
She picks $a = 956$, finds $2^{956} \bmod 2579 = 1272$, and so publishes the
information
pub(Alice) = (2579, 2, 1272)
while keeping
pri(Alice) = 956
to herself.

Encryption: Bob decides to send the message

nuts

to Alice. He encodes nuts, in the usual fashion, as 14212019. He looks
up Alice’s public key and determines that $p = 2579$, so he decides to split
the message up into two blocks of length 4:

1421 2019

and encrypt each block separately. For additional security, he will select
a different value of $k$ for each block. For the first block, $M_1 = 1421$,
he picks $k_1 = 318$, while for the second block, $M_2 = 2019$, he will use
$k_2 = 1905$. Thus, for the first block, he finds
$\gamma_1 = \alpha^{k_1} \bmod p = 2^{318} \bmod 2579 = 792$
$\delta_1 = M_1 (\alpha^a)^{k_1} \bmod p = 1421 \cdot 1272^{318} \bmod 2579 = 590$
and, for the second block,
$\gamma_2 = \alpha^{k_2} \bmod p = 2^{1905} \bmod 2579 = 1035$
$\delta_2 = M_2 (\alpha^a)^{k_2} \bmod p = 2019 \cdot 1272^{1905} \bmod 2579 = 1684$
Bob sends Alice the message
(792, 590), (1035, 1684)
or, equivalently, he could concatenate everything and send Alice the number
0792059010351684.

Decryption: Alice receives the message 0792059010351684 from Bob, and
recovers
$\gamma_1 = 792$
$\delta_1 = 590$
$\gamma_2 = 1035$
$\delta_2 = 1684$
Using her private key pri(Alice) = $a = 956$, she sets to work decrypting
the first block:
$\gamma_1^{p-1-a} \bmod p = 792^{2579-1-956} \bmod 2579 = 1187$,
and then

$M_1 = \delta_1 \gamma_1^{p-1-a} \bmod p = 590 \cdot 1187 \bmod 2579 = 1421$.

Similarly,

$M_2 = \delta_2 \gamma_2^{p-1-a} \bmod p = 1684 \cdot 1035^{1622} \bmod 2579 = 1684 \cdot 1427 \bmod 2579 = 2019$.

The encoded message is thus 14212019, which she decodes to produce

nuts

Notice that since k is chosen randomly for each encryption, the same message
M would probably be encrypted differently if it was sent from Bob to Alice
a second time. This is an advantage that the El Gamal system has over RSA
encryption. On the other hand, ciphertext generated with the El Gamal
system is twice as long as the plaintext.
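The encryption and decryption rules of Section 5.7.2 can be sketched as two small functions. The names `elgamal_encrypt` and `elgamal_decrypt` are our own, and for reproducibility the values of $k$ are passed in explicitly (here the ones from Example 5.7.2) instead of being drawn at random as the scheme requires.

```python
def elgamal_encrypt(M, public_key, k):
    """Return (gamma, delta) for message block M and per-message exponent k."""
    p, alpha, alpha_a = public_key
    gamma = pow(alpha, k, p)
    delta = (M * pow(alpha_a, k, p)) % p
    return gamma, delta

def elgamal_decrypt(ciphertext, p, a):
    """Recover M as delta * gamma^(p-1-a) mod p using the private key a."""
    gamma, delta = ciphertext
    return (delta * pow(gamma, p - 1 - a, p)) % p

# Parameters from Example 5.7.2.
p, alpha, a = 2579, 2, 956
public_key = (p, alpha, pow(alpha, a, p))
blocks = [1421, 2019]                 # "nuts" encoded with Table 5.1
ks = [318, 1905]                      # Bob's choices of k for each block

ciphertext = [elgamal_encrypt(M, public_key, k) for M, k in zip(blocks, ks)]
recovered = [elgamal_decrypt(c, p, a) for c in ciphertext]
print(ciphertext, recovered)
```

Each plaintext block produces a pair of numbers, which is the doubling in length mentioned above; the printed pairs can be compared against those in Example 5.7.2.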

Exercises
5.1 Alice and Bob agree to use Diffie-Hellman key-exchange to agree on a
secret key. They decide that Y = 8 and P = 11. Alice chooses A = 5
and Bob chooses B = 3. Find the shared secret key.

5.2 Alice wants to set up an RSA key-pair. Suppose she chooses p = 73,
q = 89, and e = 23.

(a) Find her public modulus, n. What is her public key?


(b) Determine φ(n).
(c) Hence find her private key, d.
(d) Encrypt the message
sweet home alabama
with her public key, using Table 5.1 (on p104) and blocks of length
4.
(e) Alice receives the following message from Bob:
043563195026194150055823333953415799
Decrypt it.

5.3 Chuck and Delia have the following RSA key-pairs:

pub(Chuck) = (3599, 19)


pri(Chuck) = 1099
pub(Delia) = (3551, 41)
pri(Delia) = 2009

(a) Chuck wants to send Delia the signed and encrypted message
I will buy your penguin
using Table 5.1 (on p104) and blocks of length 4. Determine the
message he sends her.
(b) Delia receives the signed and encrypted message

1377109002613383178412301758231404471734023320262869
0856005906532902012710442102087319412215324119490143
from someone claiming to be Chuck. Decrypt it and verify the
signature.

5.4 Choose a pair p, q of three-digit prime numbers from Section 3.4 and
generate your own RSA key-pair.

(a) What is your public key?


(b) What is your private key?
(c) Encrypt the message
formation insecurity
with your public key (use Table 5.1 (on p104) to encode your
message and choose an appropriate block size).

5.5 Alice wants to set up an El-Gamal key-pair. Suppose she chooses $p = 2777$,
$\alpha = 3$, and $a = 9$.

(a) Verify that $\alpha$ generates $\mathbb{Z}_{2777}^*$.


(b) What is her public key?
(c) What is her private key?
(d) Encrypt the message
tropical penguin
with her public key, using the value k = 1 for the first block,
k = 2 for the second block, k = 3 for the third block, etc., and
using Table 5.1 (on p104) and blocks of length 4.
(e) Alice receives the following message from Bob (encrypted as per
the previous question):
15570046 18942441 01281194 03840551 11522215 06792117
20371217
Decrypt it.

Solutions
5.1 10

5.2 (a) n = 6497. pub(Alice) = (6497, 23).


(b) 6336
(c) 551
(d) 4038 3834 0848 2434 3512 0001 2803 3541 4128
(e) i hate quotations
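
The answers to parts (a) to (c) can be checked mechanically. The following is a minimal sketch in Python (the variable names are our own, and the built-in three-argument pow with exponent −1 assumes Python 3.8 or later); it covers only the key generation, not the encoding of messages via Table 5.1.

```python
from math import gcd

# Alice's choices from Exercise 5.2
p, q, e = 73, 89, 23

n = p * q                    # public modulus
phi = (p - 1) * (q - 1)      # phi(n) for a product of two distinct primes
assert gcd(e, phi) == 1      # e must have an inverse modulo phi(n)
d = pow(e, -1, phi)          # modular inverse; needs Python 3.8 or later

print(n, phi, d)             # 6497 6336 551
```

The final check e · d ≡ 1 (mod φ(n)) is exactly the condition that makes decryption undo encryption.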

5.3 (a) 1377 1090 0261 2424 2711 1230 1758 3459 1603 1809 1870 3337
0856 0059 0653 1690 2280 1044 2102 1682 0376 0480 1833 1861
(b) the message is i will pay you sixty rand

5.5 (a) The prime divisors of 2776 are 2 and 347. Since α^(2776/2) mod 2777 =
2776 ≠ 1 and α^(2776/347) mod 2777 = 1007 ≠ 1, α generates Z∗2777 .
(b) pub(Alice) = (2777, 3, 244)
(c) pri(Alice) = 9
(d) The ciphertext is 00030863 00091299 00271599 00812022 02432643
07292124 21871395 10070208
(e) The plaintext is no such thing
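
The generator test in part (a) and the public key in part (b) can be reproduced with a few lines of Python. This is a sketch under our own naming (prime_factors is a helper written for this illustration, using plain trial division, which is adequate only for small numbers).

```python
p, alpha, a = 2777, 3, 9     # Alice's choices from the exercise

def prime_factors(m):
    """Return the set of prime divisors of m, found by trial division."""
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors

order = p - 1                # the size of the multiplicative group Z*_p
# alpha generates Z*_p exactly when alpha^(order/r) != 1 for every prime r | order
is_generator = all(pow(alpha, order // r, p) != 1 for r in prime_factors(order))
beta = pow(alpha, a, p)      # third component of the public key

print(sorted(prime_factors(order)), is_generator, beta)   # [2, 347] True 244
```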
Chapter 6

Product cryptosystems. DES and AES

Two of the first kinds of cryptosystems that we considered were simple sub-
stitution ciphers and permutation ciphers. Each of them quickly proved
vulnerable to attack. We now consider a new kind of cryptosystem that is
based on them but which is considerably more difficult to attack; so difficult,
in fact, that most modern cryptosystems are of the type we now consider. A
product cryptosystem is a block cipher that repeatedly performs substitutions
and permutations, one after the other, to produce ciphertext.

6.1 Background
In 1973, the US’s National Bureau of Standards1 called for the development
of an “algorithm to be implemented in electronic hardware devices, to be
used for the cryptographic protection of computer data”.
After rigorous testing, the proposal submitted by IBM was adopted in
1977 as the Data Encryption Standard, or DES. DES is based on an earlier
cryptosystem called Lucifer that was developed at IBM by Horst Feistel.
DES became the most widely used cryptosystem in the world. It is used,
for example, to protect PINs (Personal Identification Numbers) of banking
clients, telephonic transfers of money, and (of course) communication over
the internet.
1
Now the NIST.

Approximately every five years, the NBS/NIST reviewed DES to decide
whether or not it was still considered to be safe against attack. It was
expected that DES would be replaced by a more secure system within 10–
15 years; however, due to its success, it was to be twenty years before the
process of choosing a successor gathered much momentum.
In August 1998, NIST whittled an initial field of twenty one submissions
down to a list of fifteen possible replacements for DES. The specifications
of each of these cryptosystems were published on the Internet and each was
assessed by experts all over the world. One year later, a short list of 5
was drawn up, and these received even more scrutiny. At last, in October
2000, a block cipher system called Rijndael was chosen as the Advanced
Encryption Standard, or AES; after some minor modifications, it was adopted
as a standard in November 2001.
Despite the fact that DES has been superseded by AES, DES is still in
heavy use, especially the variant of DES called triple-DES.

6.2 ASCII
Both DES and AES are designed to operate on the binary alphabet Z2 =
{0, 1}, so any message we wish to encrypt with them must first be encoded
as such. There are many ways to accomplish this. Perhaps the most widely
used convention is ASCII, the American Standard Code for Information In-
terchange, which assigns an 8-digit binary number (a byte) to each symbol
we wish to encrypt (see Table 6.1).

Except where otherwise indicated, we’ll assume from this point on that all
of the cryptosystems we consider operate on the binary alphabet Z2 = {0, 1}.
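
As an illustration of the encoding step, here is a short Python sketch that converts text to a string of bits and back, one byte per character. The function names to_bits and from_bits are our own; the sketch assumes every character has a code below 256.

```python
def to_bits(message):
    """Encode each character as an 8-bit binary string and concatenate."""
    return "".join(format(ord(ch), "08b") for ch in message)

def from_bits(bits):
    """Decode a bit string whose length is a multiple of 8 back into text."""
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

print(to_bits("Ok"))                    # 0100111101101011
print(from_bits("0100111101101011"))    # Ok
```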

6.3 Feistel ciphers


DES is a modified Feistel cipher. For this reason, we first describe Feistel
cryptosystems. A Feistel cryptosystem consists of the following:

• A block length, t.

• A number of rounds, r.

symbol dec binary symbol dec binary symbol dec binary


space 32 00100000 @ 64 01000000 ‘ 96 01100000
! 33 00100001 A 65 01000001 a 97 01100001
" 34 00100010 B 66 01000010 b 98 01100010
# 35 00100011 C 67 01000011 c 99 01100011
$ 36 00100100 D 68 01000100 d 100 01100100
% 37 00100101 E 69 01000101 e 101 01100101
& 38 00100110 F 70 01000110 f 102 01100110
’ 39 00100111 G 71 01000111 g 103 01100111
( 40 00101000 H 72 01001000 h 104 01101000
) 41 00101001 I 73 01001001 i 105 01101001
* 42 00101010 J 74 01001010 j 106 01101010
+ 43 00101011 K 75 01001011 k 107 01101011
, 44 00101100 L 76 01001100 l 108 01101100
- 45 00101101 M 77 01001101 m 109 01101101
. 46 00101110 N 78 01001110 n 110 01101110
/ 47 00101111 O 79 01001111 o 111 01101111
0 48 00110000 P 80 01010000 p 112 01110000
1 49 00110001 Q 81 01010001 q 113 01110001
2 50 00110010 R 82 01010010 r 114 01110010
3 51 00110011 S 83 01010011 s 115 01110011
4 52 00110100 T 84 01010100 t 116 01110100
5 53 00110101 U 85 01010101 u 117 01110101
6 54 00110110 V 86 01010110 v 118 01110110
7 55 00110111 W 87 01010111 w 119 01110111
8 56 00111000 X 88 01011000 x 120 01111000
9 57 00111001 Y 89 01011001 y 121 01111001
: 58 00111010 Z 90 01011010 z 122 01111010
; 59 00111011 [ 91 01011011 { 123 01111011
< 60 00111100 \ 92 01011100 | 124 01111100
= 61 00111101 ] 93 01011101 } 125 01111101
> 62 00111110 ^ 94 01011110 ~ 126 01111110
? 63 00111111 _ 95 01011111

Table 6.1: Part of ASCII



• A method that generates from a key K a number of round keys, K1 , K2 , . . . , Kr .

• A round function, fKi , that takes binary vectors of length t as input


and produces binary vectors of length t as output. The function fKi
does not have to be invertible, and should be chosen to make the cipher
difficult to attack.

• An encryption function EK that operates on blocks of length 2t. En-


cryption is performed as follows:

1. Let P = L0 ||R0 be a block of plaintext consisting of a left half,


L0 , of length t, and a right half, R0 , also of length t.
2. For each i with 1 ≤ i ≤ r, we now calculate

Li = Ri−1

and

Ri = Li−1 ⊕ fKi (Ri−1 ).

(each such pair of calculations is a round of encryption).


3. Finally, we set

EK (P ) = EK (L0 ||R0 ) = Rr ||Lr

(notice that the last term is not Lr ||Rr ).

• A decryption function DK that reverses the encryption process. In


a Feistel cipher, the decryption process is identical to the encryption
process with the single exception that the key schedule is reversed, i.e.,
instead of using

– K1 in round 1,
– K2 in round 2, and, in general,
– Ki in round i,

we use

– Kr in round 1,
– Kr−1 in round 2, and, in general,


– Kr+1−i in round i.
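
The description above translates almost line for line into code. The following Python sketch is one possible rendering, not a definitive implementation: blocks are represented as strings of '0' and '1', and the round function f and the round keys shown are hypothetical, chosen only to exercise the code. Note that decryption calls the very same routine with the key schedule reversed, and that f is never inverted.

```python
def xor(a, b):
    """Bitwise XOR of two bit strings of equal length."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

def feistel(block, round_keys, f):
    """Apply len(round_keys) Feistel rounds to a block of length 2t."""
    t = len(block) // 2
    left, right = block[:t], block[t:]
    for key in round_keys:
        # L_i = R_{i-1} and R_i = L_{i-1} XOR f_{K_i}(R_{i-1})
        left, right = right, xor(left, f(key, right))
    return right + left              # the output is R_r || L_r

def encrypt(block, round_keys, f):
    return feistel(block, round_keys, f)

def decrypt(block, round_keys, f):
    return feistel(block, round_keys[::-1], f)   # reversed key schedule

# Hypothetical round function and round keys (t = 4, r = 3)
def f(key, right):
    return xor(key, right[::-1])

keys = ["1010", "0111", "1100"]
ciphertext = encrypt("01001111", keys, f)
assert decrypt(ciphertext, keys, f) == "01001111"
```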

Example 6.3.1 Suppose that Alice wishes to send the message

Ok

to Bob using a Feistel cipher with t = 4, r = 3, and round function

f (b1 , b2 , b3 , b4 ) = (b2 , b1 ⊕ b3 , 0, b1 ⊕ b2 )

(for simplicity, we have chosen a round function that has no explicit dependence
on a key).
She begins by encoding the message in ASCII as

0100111101101011.

Her Feistel cipher will operate on blocks of length 2t = 8, so she breaks the
message into two bytes2 ; she will encrypt each separately. Consider the first
byte: 01001111. She has

L0 = 0100
R0 = 1111

Thus,

L1 = R0 = 1111
R1 = L0 ⊕ f (R0 ) = 0100 ⊕ f (1111) = 0100 ⊕ 1000 = 1100

Proceeding in this fashion, she constructs the following table:

i Li Ri
0 0100 1111
1 1111 1100
2 1100 0011
3 0011 1000
2
A byte is eight bits.

Finally, E(01001111) = R3 ||L3 = 10000011. Similarly3 , E(01101011) =
10100110. She thus sends to Bob the ciphertext
1000001110100110.

Example 6.3.2 Continuing from the preceding example, suppose Bob receives
the ciphertext
1000001110100110.
from Alice. To decipher it, he performs the same operations as the encryption
process, but reverses the order in which the round keys are used. In this example,
there are no round keys, so the encryption and decryption processes are identical.
As with encryption, he first breaks the ciphertext up into blocks of length
2t = 8, recovering the two bytes 10000011 and 10100110. Consider the first
one, 10000011. For this byte,

L0 = 1000
R0 = 0011

Then,

L1 = R0 = 0011
R1 = L0 ⊕ f (R0 ) = 1000 ⊕ f (0011) = 1100

Bob continues in this way and fills in the following table:


i Li Ri
0 1000 0011
1 0011 1100
2 1100 1111
3 1111 0100
The first byte of plaintext is, therefore, R3 ||L3 = 01001111. In the same
fashion, he decrypts the second byte 10100110 to obtain the plaintext 01101011.
Referring to the ASCII table, Bob is now able to read Alice’s message:
3
We leave you to fill in the details.

Ok
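
The tables in Examples 6.3.1 and 6.3.2 can be regenerated with the short Python sketch below. It hard-codes the key-free round function f of Example 6.3.1; the helper names feistel_trace and E are our own.

```python
def xor(a, b):
    """Bitwise XOR of two bit strings of equal length."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

def f(bits):
    """The round function f(b1, b2, b3, b4) = (b2, b1+b3, 0, b1+b2) over Z2."""
    b1, b2, b3, b4 = (int(b) for b in bits)
    return "".join(str(v) for v in (b2, b1 ^ b3, 0, b1 ^ b2))

def feistel_trace(block, rounds=3):
    """Return the list of pairs (L_i, R_i) for i = 0, 1, ..., rounds."""
    t = len(block) // 2
    rows = [(block[:t], block[t:])]
    for _ in range(rounds):
        left, right = rows[-1]
        rows.append((right, xor(left, f(right))))
    return rows

def E(block):
    """Encrypt one 8-bit block; the result is R_3 || L_3."""
    left, right = feistel_trace(block)[-1]
    return right + left

for i, (left, right) in enumerate(feistel_trace("01001111")):
    print(i, left, right)
# 0 0100 1111
# 1 1111 1100
# 2 1100 0011
# 3 0011 1000
print(E("01001111") + E("01101011"))   # 1000001110100110
```

Because this cipher has no round keys, applying E to a ciphertext block recovers the plaintext block, which is what Bob does in Example 6.3.2.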

6.4 An overview of the Data Encryption Standard
In this section we use a very broad brush to paint a picture of how DES
works. In the sections that follow, we’ll discuss the parts of DES in more
detail. As we mentioned previously, DES is a federal standard that is man-
aged by NIST. It’s fully described in FIPS46-3 (FIPS is an acronym for
Federal Information Processing Standard), which can be downloaded from
http://www.itl.nist.gov/fipspubs/.
DES is a block cipher in which each block is a 64-bit, or 8-byte, string.
A DES key is also eight bytes. However, the last bit in each byte in a DES
key is a check bit or parity bit: its value is chosen so that the byte has odd
parity, i.e., the number of 1s in the byte is odd.

Example 6.4.1 Suppose the first seven bits of a byte in a DES key are 0111011.
There are five 1s, so we would set the last bit to be 0.
Checkbits (and, more generally, checkdigits) allow for the detection of trans-
mission errors. If a DES key is transmitted to a recipient who finds that one
or more of the bytes has even parity, then he knows that a transmission error
has occurred and can request that the key be resent4 .
It follows that in a DES key we are free to decide the value of fifty six of
the sixty four bits. This means there are 2^56 = 72057594037927936 ≈ 7 × 10^16
DES keys.
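
The parity rule is easy to express in code. The sketch below (with function names of our own choosing) appends the check bit to seven key bits and tests a byte for odd parity; it also confirms the count of DES keys.

```python
def add_parity(seven_bits):
    """Append the check bit that gives the resulting byte odd parity."""
    ones = seven_bits.count("1")
    return seven_bits + ("0" if ones % 2 == 1 else "1")

def has_odd_parity(byte):
    """True if the number of 1s in the byte is odd."""
    return byte.count("1") % 2 == 1

print(add_parity("0111011"))        # 01110110, as in Example 6.4.1
print(has_odd_parity("01110111"))   # False: a transmission error is detected
print(2 ** 56)                      # 72057594037927936
```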

6.4.1 The DES algorithm


The algorithm that DES uses is called the Data Encryption Algorithm, or
DEA. DEA works in the following way. We begin with a 64-bit plaintext, P ,
and a 64-bit DES key, K. Encryption consists of three stages:
4
The converse is not true, i.e., it’s possible that a byte with odd parity still contains
errors.

Stage 1: The bits of P are permuted according to a fixed initial permutation,


IP, to produce IP(P ). The same initial permutation is used in every
DES encryption (see below).

Stage 2: A 16-round Feistel cipher is applied to IP(P ). The DES key, K,


is used to generate round keys for the Feistel cipher.

Stage 3: The inverse of the initial permutation, IP−1 , is applied to the result
of Stage 2 to produce the final ciphertext.

DES decryption follows exactly the same procedure, except that the round
keys are used in reverse order.
We shall now consider the three stages of DES in more detail.

6.5 DES Stages 1 and 3: the permutation IP


The initial permutation IP permutes the plaintext P to produce IP(P ). The
permutation IP is frequently represented in the following form:
 
       58 50 42 34 26 18 10  2
       60 52 44 36 28 20 12  4
       62 54 46 38 30 22 14  6
IP =   64 56 48 40 32 24 16  8
       57 49 41 33 25 17  9  1
       59 51 43 35 27 19 11  3
       61 53 45 37 29 21 13  5
       63 55 47 39 31 23 15  7

This notation should be interpreted in the following fashion:

• the 1st bit of IP(P ) is the 58th bit of the plaintext P ,

• the 2nd bit of IP(P ) is the 50th bit of P ,

• the 9th bit of IP(P ) is the 60th bit of P ,

• the 64th bit of IP(P ) is the 7th bit of P ,

• and so on.

The inverse, IP−1 , of the permutation IP, which is applied in the 3rd stage
of encryption, can easily be determined from the preceding.

Example 6.5.1 Suppose that the plaintext is


Go home!
We encode this in ASCII as
P = 01000111 01101111 00100000 01101000
01101111 01101101 01100101 00100001.
(we’ve left spaces between the bytes to make it easier to read). We thus find
that
IP(P ) = 01111011 00000000 01110011 11110011
00000000 11111110 00111010 00010011.
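
All of the DES permutation tables are read in the same way, so one small routine handles them all. The Python sketch below (function names are our own) applies IP to the plaintext of Example 6.5.1 and also constructs the table for the inverse permutation.

```python
IP = [58, 50, 42, 34, 26, 18, 10, 2,
      60, 52, 44, 36, 28, 20, 12, 4,
      62, 54, 46, 38, 30, 22, 14, 6,
      64, 56, 48, 40, 32, 24, 16, 8,
      57, 49, 41, 33, 25, 17,  9, 1,
      59, 51, 43, 35, 27, 19, 11, 3,
      61, 53, 45, 37, 29, 21, 13, 5,
      63, 55, 47, 39, 31, 23, 15, 7]

def permute(bits, table):
    """Bit j of the output is bit table[j] of the input (positions count from 1)."""
    return "".join(bits[i - 1] for i in table)

def inverse(table):
    """Table of the inverse permutation (only for tables that are bijections)."""
    inv = [0] * len(table)
    for out_pos, in_pos in enumerate(table, start=1):
        inv[in_pos - 1] = out_pos
    return inv

P = "".join(format(ord(ch), "08b") for ch in "Go home!")
Q = permute(P, IP)
print(" ".join(Q[i:i + 8] for i in range(0, 64, 8)))
# 01111011 00000000 01110011 11110011 00000000 11111110 00111010 00010011
assert permute(Q, inverse(IP)) == P
```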

6.6 DES Stage 2: the Feistel cipher


In Stage 2, a 16-round Feistel cipher is applied to
IP(P ) = L0 ||R0
to obtain R16 ||L16 . In each round, we use a round function, fKi , also called
the DES internal block cipher. The round keys K1 , K2 , . . . , K16 are derived
from the 64-bit DES key, K. We shall first state how to determine the round
keys, then describe the round function.

6.6.1 Generating the round keys


The round keys K1 , K2 , . . . , K16 (collectively called the key schedule) are
found in the following way from the 64-bit DES key K.
1. The (fifty six non-parity) bits of K are permuted and split up, using
the permutations C and D (see below), to form two 28-bit subkeys,
C0 = C(K) and D0 = D(K).
2. For each i with 1 ≤ i ≤ 16, the subkeys Ci and Di are determined from
Ci−1 and Di−1 by cyclically shifting the bits in Ci−1 and Di−1 left by a
predetermined amount (see Table 6.2).

3. For each i with 1 ≤ i ≤ 16, the round key Ki is found by feeding Ci ||Di
to the permutation PC2 (see below).

The procedure is depicted in Figure 6.1. We now describe in more detail


each of the preceding steps.

PC-1: the permutations C and D

The permutations C and D are collectively given the name PC-1, for per-
muted choice 1.
 
      57 49 41 33 25 17  9
C =    1 58 50 42 34 26 18
      10  2 59 51 43 35 27
      19 11  3 60 52 44 36

      63 55 47 39 31 23 15
D =    7 62 54 46 38 30 22
      14  6 61 53 45 37 29
      21 13  5 28 20 12  4

This notation should be read in the same way as that used for the initial
permutation, IP, i.e., the first bit of C0 = C(K) is the 57th bit of K, the
2nd bit of C0 is the 49th bit of K, the 8th bit of D0 = D(K) is the 7th bit
of K, and so on.
Notice that PC-1 does not contain any numbers that are congruent to
0 modulo 8. This is because, as previously discussed, the bits in positions
8,16,24,32,40,48,56, and 64 are parity bits and do not form part of the actual
key.

Example 6.6.1 Suppose that

K = 00100101 01100111 11001101 10110011


11111101 11001110 01000000 00101010
[Figure 6.1: The DES key schedule. The key is fed to Permuted Choice 1 to
produce C0 and D0 . In round i, Ci−1 and Di−1 are each shifted left (by 1, 1,
2, . . . positions) to produce Ci and Di , and Permuted Choice 2 is applied
to Ci ||Di to produce the round key Ki .]


(notice that each byte of this DES key has odd parity). Then

C0 = C(K) = 00111100 01110110 10011011 0001


D0 = D(K) = 10101010 00110111 10110100 1000
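
The values of C0 and D0 in Example 6.6.1 can be checked with the following Python sketch. The tables are those of PC-1; the helper permute is our own and simply reads off the listed bit positions.

```python
C = [57, 49, 41, 33, 25, 17,  9,
      1, 58, 50, 42, 34, 26, 18,
     10,  2, 59, 51, 43, 35, 27,
     19, 11,  3, 60, 52, 44, 36]

D = [63, 55, 47, 39, 31, 23, 15,
      7, 62, 54, 46, 38, 30, 22,
     14,  6, 61, 53, 45, 37, 29,
     21, 13,  5, 28, 20, 12,  4]

def permute(bits, table):
    """Bit j of the output is bit table[j] of the input (positions count from 1)."""
    return "".join(bits[i - 1] for i in table)

# The key K of Example 6.6.1, with the spaces between bytes removed
K = ("00100101 01100111 11001101 10110011 "
     "11111101 11001110 01000000 00101010").replace(" ", "")

C0, D0 = permute(K, C), permute(K, D)
print(C0)   # 0011110001110110100110110001
print(D0)   # 1010101000110111101101001000

# No parity position (a multiple of 8) is ever selected by PC-1
assert all(i % 8 != 0 for i in C + D)
```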

Finding the subkeys C1 , C2 , . . . , C16 and D1 , D2 , . . . , D16


Let X be a string and ℓ a positive integer. Denote by L(X, ℓ) the string
obtained from X by shifting each character of X left ℓ positions; the ℓ leftmost
characters of X are ‘cycled around’ so that they reappear (in the same order)
on the right hand side.

Example 6.6.2 If X1 = abcde, then L(X1 , 2) = cdeab. Similarly, if X2 =
00110111, then L(X2 , 3) = 10111001.
Once we have determined C0 and D0 , the other subkeys C1 , C2 , . . . , C16 and
D1 , D2 , . . . , D16 are found from them using the formula

Ci = L(Ci−1 , ℓi )
Di = L(Di−1 , ℓi )

where the values ℓi , 1 ≤ i ≤ 16, are given in Table 6.2.

Example 6.6.3 We continue from Example 6.6.1. We must first find C1 ; from
Table 6.2, we see that

C1 = L(C0 , ℓ1 ) = L(C0 , 1).

Since C0 = 0011110001110110100110110001, we have

C1 = 0111100011101101001101100010.

Similarly, D1 = L(D0 , 1); since D0 = 1010101000110111101101001000,

D1 = 0101010001101111011010010001.

Continuing in this fashion, we find the following:


 i  ℓi     i  ℓi
 1   1     9   1
 2   1    10   2
 3   2    11   2
 4   2    12   2
 5   2    13   2
 6   2    14   2
 7   2    15   2
 8   2    16   1

Table 6.2: The left shifts ℓi .

i Ci Di
0 0011110001110110100110110001 1010101000110111101101001000
1 0111100011101101001101100010 0101010001101111011010010001
2 1111000111011010011011000100 1010100011011110110100100010
3 1100011101101001101100010011 1010001101111011010010001010
4 0001110110100110110001001111 1000110111101101001000101010
5 0111011010011011000100111100 0011011110110100100010101010
6 1101101001101100010011110001 1101111011010010001010101000
7 0110100110110001001111000111 0111101101001000101010100011
8 1010011011000100111100011101 1110110100100010101010001101
9 0100110110001001111000111011 1101101001000101010100011011
10 0011011000100111100011101101 0110100100010101010001101111
11 1101100010011110001110110100 1010010001010101000110111101
12 0110001001111000111011010011 1001000101010100011011110110
13 1000100111100011101101001101 0100010101010001101111011010
14 0010011110001110110100110110 0001010101000110111101101001
15 1001111000111011010011011000 0101010100011011110110100100
16 0011110001110110100110110001 1010101000110111101101001000
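
The whole table can be generated by repeated cyclic shifts. The Python sketch below uses our own names rotate_left (for the operation L) and subkeys; the list SHIFTS holds the shift amounts of Table 6.2.

```python
SHIFTS = [1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]

def rotate_left(x, n):
    """Cyclically shift the string x left by n positions."""
    return x[n:] + x[:n]

def subkeys(c0, d0):
    """Return the lists [C0, ..., C16] and [D0, ..., D16]."""
    cs, ds = [c0], [d0]
    for n in SHIFTS:
        cs.append(rotate_left(cs[-1], n))
        ds.append(rotate_left(ds[-1], n))
    return cs, ds

print(rotate_left("abcde", 2))      # cdeab
print(rotate_left("00110111", 3))   # 10111001

cs, ds = subkeys("0011110001110110100110110001",
                 "1010101000110111101101001000")
print(cs[1])   # 0111100011101101001101100010
print(ds[1])   # 0101010001101111011010010001
```

Since the sixteen shift amounts sum to 28, which is the length of each subkey, C16 = C0 and D16 = D0 , as the last row of the table shows.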

Finding the round keys Ki from the subkeys Ci , Di


Having generated the subkeys C1 , C2 , . . . , C16 and D1 , D2 , . . . , D16 , we can
now determine the round keys K1 , K2 , . . . , K16 . Consider the following
permutation:5
         14 17 11 24  1  5
          3 28 15  6 21 10
         23 19 12  4 26  8
PC2 =    16  7 27 20 13  2
         41 52 31 37 47 55
         30 40 51 45 33 48
         44 49 39 56 34 53
         46 42 50 36 29 32
The notation is the same as that used for the other DES permutations (so
the first bit of PC2(X) is the 14th bit of X, and so on). Then the round
keys K1 , K2 , . . . , K16 are given by
Ki = PC2(Ci ||Di ).
Note that since Ci and Di are twenty eight bits each, Ci ||Di is fifty six bits.
The key Ki is, however, only forty eight bits.

Example 6.6.4 Continuing our example: The round key K3 is equal to PC2(C3 ||D3 ).
Since
C3 ||D3 = 11000111011010011011000100111010001101111011010010001010,
we find
K3 = 011110010101010001111111101001010000111001100110.
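
A round key can be computed from the subkeys with the same table-lookup routine used for the other permutations. The following Python sketch (helper names are our own) computes K3 from the values of C3 and D3 found earlier and prints it in 6-bit groups, one group per row of PC2.

```python
PC2 = [14, 17, 11, 24,  1,  5,
        3, 28, 15,  6, 21, 10,
       23, 19, 12,  4, 26,  8,
       16,  7, 27, 20, 13,  2,
       41, 52, 31, 37, 47, 55,
       30, 40, 51, 45, 33, 48,
       44, 49, 39, 56, 34, 53,
       46, 42, 50, 36, 29, 32]

def permute(bits, table):
    """Bit j of the output is bit table[j] of the input (positions count from 1)."""
    return "".join(bits[i - 1] for i in table)

C3 = "1100011101101001101100010011"
D3 = "1010001101111011010010001010"
K3 = permute(C3 + D3, PC2)

print(len(K3))                                        # 48
print(" ".join(K3[i:i + 6] for i in range(0, 48, 6)))
```

Eight of the fifty six positions (9, 18, 22, 25, 35, 38, 43, and 54) do not appear in PC2, which is why Ki is shorter than Ci ||Di .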

6.6.2 The round function, fKi


Recall that the round function, fKi , is used in a 16-round Feistel cipher. In
each round, fKi takes as input the 32-bit string Ri−1 and the 48-bit round
key Ki , and produces a 32-bit string fKi (Ri−1 ). The details are as follows:
5
The conventional name for this permutation is PC-2, for permuted choice 2.

1. The 32-bit string Ri−1 is expanded to a 48-bit string using the expansion
function, E (see below).

2. The resulting 48-bit string, E(Ri−1 ), is XORed with the round key, Ki ,
and the result is split into eight 6-bit strings B1 , B2 , . . . , B8 , i.e.,

B1 B2 · · · B8 = E(Ri−1 ) ⊕ Ki .

3. Each 6-bit string Bi is sent through an S-box6 . There are eight S-boxes,
S1 , S2 , . . . , S8 (see below), one for each Bi . The S-box Si operates on
the 6-bit string Bi to return a 4-bit string Si (Bi ).

4. Finally, the eight 4-bit strings S1 (B1 ), S2 (B2 ), . . . , S8 (B8 ) are concate-
nated to produce a 32-bit string which is then permuted using (yet
another) predetermined permutation, P (see below), to produce the
final result,

fKi (Ri−1 ) = P ( S1 (B1 )||S2 (B2 )|| · · · ||S8 (B8 ) ).

The procedure is represented in Figure 6.2. As before, we now give the details
of each of the steps in calculating fKi (Ri−1 ).

The expansion function, E


The expansion function is
 
       32  1  2  3  4  5
        4  5  6  7  8  9
        8  9 10 11 12 13
E =    12 13 14 15 16 17
       16 17 18 19 20 21
       20 21 22 23 24 25
       24 25 26 27 28 29
       28 29 30 31 32  1

The notation is the same as for the other DES built-in functions.
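
The sketch below applies E in Python. Each row of the table yields one 6-bit group: four consecutive bits of the input together with the bit on either side (wrapping around at the ends). The function name expand is our own, and the input used is the right half of IP(P) from Example 6.5.1.

```python
E = [32,  1,  2,  3,  4,  5,
      4,  5,  6,  7,  8,  9,
      8,  9, 10, 11, 12, 13,
     12, 13, 14, 15, 16, 17,
     16, 17, 18, 19, 20, 21,
     20, 21, 22, 23, 24, 25,
     24, 25, 26, 27, 28, 29,
     28, 29, 30, 31, 32,  1]

def expand(r):
    """Expand a 32-bit string to 48 bits by reading off the positions in E."""
    assert len(r) == 32
    return "".join(r[i - 1] for i in E)

R0 = "00000000111111100011101000010011"
groups = [expand(R0)[i:i + 6] for i in range(0, 48, 6)]
print(" ".join(groups))

# The middle four bits of group j are bits 4j+1, ..., 4j+4 of the input
assert all(groups[j][1:5] == R0[4 * j:4 * j + 4] for j in range(8))
```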
6
For ‘substitution box’.
[Figure 6.2: The DES round function, fKi . The input Ri−1 (32 bits) is
expanded by E to E(Ri−1 ) (48 bits) and XORed with Ki (48 bits); the result
passes through the S-boxes S1 , S2 , . . . , S8 and then the permutation P to
give fKi (Ri−1 ) (32 bits).]


The S-boxes S1 , S2 , . . . , S8

The eight S-boxes are:

S1
14 4 13 1 2 15 11 8 3 10 6 12 5 9 0 7
0 15 7 4 14 2 13 1 10 6 12 11 9 5 3 8
4 1 14 8 13 6 2 11 15 12 9 7 3 10 5 0
15 12 8 2 4 9 1 7 5 11 3 14 10 0 6 13
S2
15 1 8 14 6 11 3 4 9 7 2 13 12 0 5 10
3 13 4 7 15 2 8 14 12 0 1 10 6 9 11 5
0 14 7 11 10 4 13 1 5 8 12 6 9 3 2 15
13 8 10 1 3 15 4 2 11 6 7 12 0 5 14 9
S3
10 0 9 14 6 3 15 5 1 13 12 7 11 4 2 8
13 7 0 9 3 4 6 10 2 8 5 14 12 11 15 1
13 6 4 9 8 15 3 0 11 1 2 12 5 10 14 7
1 10 13 0 6 9 8 7 4 15 14 3 11 5 2 12
S4
7 13 14 3 0 6 9 10 1 2 8 5 11 12 4 15
13 8 11 5 6 15 0 3 4 7 2 12 1 10 14 9
10 6 9 0 12 11 7 13 15 1 3 14 5 2 8 4
3 15 0 6 10 1 13 8 9 4 5 11 12 7 2 14
S5
2 12 4 1 7 10 11 6 8 5 3 15 13 0 14 9
14 11 2 12 4 7 13 1 5 0 15 10 3 9 8 6
4 2 1 11 10 13 7 8 15 9 12 5 6 3 0 14
11 8 12 7 1 14 2 13 6 15 0 9 10 4 5 3
S6
12 1 10 15 9 2 6 8 0 13 3 4 14 7 5 11
10 15 4 2 7 12 9 5 6 1 13 14 0 11 3 8
9 14 15 5 2 8 12 3 7 0 4 10 1 13 11 6
4 3 2 12 9 5 15 10 11 14 1 7 6 0 8 13

S7
4 11 2 14 15 0 8 13 3 12 9 7 5 10 6 1
13 0 11 7 4 9 1 10 14 3 5 12 2 15 8 6
1 4 11 13 12 3 7 14 10 15 6 8 0 5 9 2
6 11 13 8 1 4 10 7 9 5 0 15 14 2 3 12
S8
13 2 8 4 6 15 11 1 10 9 3 14 5 0 12 7
1 15 13 8 10 3 7 4 12 5 6 11 0 14 9 2
7 11 4 1 9 12 14 2 0 6 10 13 15 3 5 8
2 1 14 7 4 10 8 13 15 12 9 0 3 5 6 11
Each S-box consists of four rows and sixteen columns. The rows are numbered
0,1,2, and 3 and the columns are numbered 0,1,2, . . . , 15. The input to an
S-box Si is a 6-bit string, Bi , but the output is a 4-bit string, Si (Bi ). The
notation used for the S-boxes is different to the notation used for the other
DES functions we’ve encountered. Here’s how to find the output of an S-box,
given an input Bi = b1 b2 b3 b4 b5 b6 :
1. The binary number b1 ||b6 formed by concatenating the first and last
bits of Bi represents a decimal number r between 0 and 3.
2. The binary number b2 ||b3 ||b4 ||b5 similarly represents a decimal number
c between 0 and 15.
3. The number in row r and column c of Si is a decimal number between
0 and 15, which can be written as a 4-digit binary number, d. The
number d is the output Si (Bi ) of the S-box.

Example 6.6.5 Suppose that B4 = 110110. Then we must use the S-box S4 .
Looking at the string B4 , we have b1 ||b6 = 10, which is the decimal number 2.
Furthermore, b2 b3 b4 b5 = 1011, which is the decimal number 11. We thus find
the number in row 2, column 11 of S4 , which is 14. The decimal number 14 is
the 4-bit binary number 1110. Thus,
S4 (B4 ) = S4 (110110) = 1110.
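
The lookup procedure can be written as a three-line function. The Python sketch below stores S4 as a list of rows and reproduces Example 6.6.5; the function name sbox is our own.

```python
S4 = [[ 7, 13, 14,  3,  0,  6,  9, 10,  1,  2,  8,  5, 11, 12,  4, 15],
      [13,  8, 11,  5,  6, 15,  0,  3,  4,  7,  2, 12,  1, 10, 14,  9],
      [10,  6,  9,  0, 12, 11,  7, 13, 15,  1,  3, 14,  5,  2,  8,  4],
      [ 3, 15,  0,  6, 10,  1, 13,  8,  9,  4,  5, 11, 12,  7,  2, 14]]

def sbox(table, b):
    """Map a 6-bit string b to the 4-bit string given by the S-box table."""
    row = int(b[0] + b[5], 2)     # first and last bits select the row
    col = int(b[1:5], 2)          # middle four bits select the column
    return format(table[row][col], "04b")

print(sbox(S4, "110110"))   # 1110
```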

The permutation P
The 32-bit string S1 (B1 )||S2 (B2 )|| · · · ||S8 (B8 ) is permuted in the following
fashion to produce fKi (Ri−1 ):
 
      16  7 20 21 29 12 28 17
P =    1 15 23 26  5 18 31 10
       2  8 24 14 32 27  3  9
      19 13 30  6 22 11  4 25

The notation for P is what you would expect, i.e., if we let X = S1 (B1 )||S2 (B2 )|| · · · ||S8 (B8 ),
then

• the first bit of P (X) is the 16th bit of X,

• the second bit of P (X) is the 7th bit of X,

• and so on.

6.7 The security of DES


DES has always been viewed with some suspicion by those who believe that
the NSA7 either designed some backdoor into the S-boxes or deliberately
weakened them, the intent in either case being to allow their cryptanalysts
to crack messages encrypted with DES. So far, no-one has proven either of
these assertions.
DES keys should be changed frequently to prevent attacks that require
analysis of a large amount of data. Typically, DES keys are exchanged using
a public-key cryptosystem like RSA. You might ask why, since RSA seems
a great deal simpler and is still considered secure, we use DES at all. The
answer to this question can be given in one word: speed. DES is usually ‘at
least 100 times as fast in software and between 1,000 and 10,000 times as
fast in hardware’8 .
Despite a great deal of effort, no-one has come up with an efficient way to
attack DES. However, advances in computer technology have made an attack
by exhaustive key search increasingly feasible. In 1998, it was estimated that
a specialized machine could be constructed for $1 million that would allow
7
The National Security Agency, the United States’ cryptographic agency.
8
RSA Labs Security FAQ, http://www.rsasecurity.com/rsalabs/node.asp?id=2215.
a key to be found, on average, in thirty five minutes! In the same year, a
machine built for only $250,000 found a DES key in just fifty six hours. The
following year, a distributed attack by a group of 100,000 computers found a
DES key in just over twenty two hours. Because of its increasing vulnerability
to brute force methods, DES is no longer considered to be secure.
This does not mean that DES is no longer used, however. One easy way
to increase the security of DES is to encrypt the data three times rather than
once. The resulting algorithm is called the Triple Data Encryption Algorithm
(TDEA), an ANSI standard; the associated cryptosystem is triple-DES.

6.7.1 Triple-DES
If we denote by EK and DK the (standard) DES encryption and decryption
functions using key K, then triple-DES encrypts a 64-bit block P of plaintext
to a 64-bit block C of ciphertext according to the rule9
C = EK3 (DK2 (EK1 (P ))).
There are three keying options defined for triple-DES:
1. The keys K1 , K2 , and K3 are independent.
2. K1 and K2 are independent, but K1 = K3
3. K1 = K2 = K3
The second option, in which K1 = K3 , is widely used. The resulting 128-bit
triple-DES key consists of the two 64-bit DES keys K1 and K2 . Since K1
and K2 have fifty-six non-parity bits, a 128-bit triple-DES key has 112 data
bits. The keyspace of triple-DES with this keying option (which contains
2^112 ≈ 5 × 10^33 keys) is so large that it has been estimated10 that a triple-
DES exhaustive key search on a $10 million machine would take roughly 30
billion years. Thus, while DES is no longer considered secure, triple-DES
should be secure for quite some time11 .
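
The structure of the encrypt-decrypt-encrypt rule can be seen in the Python sketch below. The functions E and D here are NOT DES: they are a toy invertible cipher on 64-bit integers (addition of the key modulo 2^64), introduced only so that the composition can be run. The sketch illustrates one consequence of the rule that holds for any cipher: with the third keying option, K1 = K2 = K3 , the inner decryption cancels the first encryption and triple-DES reduces to a single encryption.

```python
MASK = (1 << 64) - 1            # work with 64-bit blocks

def E(k, p):
    """Toy stand-in for single encryption (not DES)."""
    return (p + k) & MASK

def D(k, c):
    """Inverse of the toy encryption E."""
    return (c - k) & MASK

def ede_encrypt(k1, k2, k3, p):
    """C = E_K3(D_K2(E_K1(P)))."""
    return E(k3, D(k2, E(k1, p)))

def ede_decrypt(k1, k2, k3, c):
    """Undo the three steps in the opposite order."""
    return D(k1, E(k2, D(k3, c)))

p, k1, k2 = 0x0123456789ABCDEF, 0x1111, 0xDEADBEEF   # arbitrary test values
c = ede_encrypt(k1, k2, k1, p)                       # second keying option
assert ede_decrypt(k1, k2, k1, c) == p
assert ede_encrypt(k1, k1, k1, p) == E(k1, p)        # third keying option
print(2 ** 112)   # 5192296858534827628530496329220096
```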
Still another way to make DES more secure is to use one of the approved
modes of operation.
9
Actually, there’s more than one way to do this. What we describe here is called
DES-EDE. A variant is DES-EEE, in which C = EK3 (EK2 (EK1 (P ))).
10
In 2003.
11
As of April 2005, triple-DES and AES are the two NIST-approved symmetric-key
cryptosystems, though federal agencies using triple-DES are required to use the first
keying option (three independent keys), i.e., a 192-bit triple-DES key.

6.7.2 Modes of operation


With a block cipher algorithm, the same plaintext block will al-
ways encrypt to the same ciphertext block whenever the same key
is used. If the multiple blocks in a typical message are encrypted
separately, an adversary can easily substitute individual blocks,
possibly without detection. Furthermore, certain kinds of data
patterns in the plaintext, such as repeated blocks, are apparent
in the ciphertext.
Cryptographic modes of operation have been defined to allevi-
ate this problem by combining the basic cryptographic algorithm
with variable initialization vectors and some sort of feedback of
the information derived from the cryptographic operation.12
While the modes of operation we mention here can be applied to other
cryptosystems as well, we shall restrict the discussion to how they work with
DES. There are four DES modes of operation: ECB, CBC, CFB, and OFB.
They comprise another NIST standard13 . We describe each mode in turn.
Electronic Codebook mode (ECB): Each plaintext block of the message
is encrypted independently to produce a ciphertext block. In other
words, this is ‘standard’ DES. ECB has no message extension (nothing
beyond the ciphertext blocks themselves needs to be transmitted) and
a transmission error in a block will affect only the decryption of that
block. On the other hand, ECB suffers from some of the problems
previously mentioned: some patterns in the plaintext will be visible
in the ciphertext and an adversary can replace blocks of the message
with blocks of his own without affecting the content of the rest of the
message.
Cipher Block Chaining mode (CBC): Each plaintext block is XORed
with the previous ciphertext block and then encrypted. Thus, if we
denote by Pi the ith plaintext block and by Ci the corresponding ci-
phertext block, then
Ci = EK (Pi ⊕ Ci−1 )
Pi = DK (Ci ) ⊕ Ci−1
12
NIST Special Publication 800-57: Recommendation for Key Management Part 1:
General. DRAFT (April, 2005).
13
FIPS81.
[Figure 6.3: CBC mode. The plaintext block Pi is XORed with the previous
ciphertext block Ci−1 , and the result is encrypted with EK to give Ci .]

Notice that C1 = EK (P1 ⊕ C0 ); the block C0 is not part of the message


but rather an initialization vector. Perhaps surprisingly, C0 does not
have to be encrypted14 , but can be transmitted ‘in the clear’ with
the ciphertext. Since each ciphertext block depends on the preceding
blocks, CBC in general encrypts equal plaintext blocks differently. An
adversary wishing to manipulate the content of the message is unable
to do so except by removing blocks from the beginning or the end of
the ciphertext. On the other hand, CBC has message extension equal
to the block size, as C0 has to be transmitted as well as the ciphertext.
In addition, if block Ci is transmitted incorrectly, then both Pi and
Pi+1 will decrypt incorrectly. CBC mode is shown in Figure 6.3.
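
The difference between ECB and CBC is easy to observe experimentally. In the Python sketch below, E and D are NOT DES: they are a made-up invertible map on 8-bit blocks (represented as integers from 0 to 255) that stands in for the block cipher, and the key, initialization vector, and messages are arbitrary. The sketch shows that repeated plaintext blocks are visible under ECB but not under CBC, and that corrupting one ciphertext block in CBC damages exactly two plaintext blocks.

```python
def E(k, x):
    """Toy 8-bit block cipher (a stand-in, not DES)."""
    return ((x ^ k) * 5 + 17) % 256

def D(k, y):
    """Inverse of E; 205 is the inverse of 5 modulo 256."""
    return (((y - 17) * 205) % 256) ^ k

def ecb_encrypt(k, blocks):
    return [E(k, p) for p in blocks]

def cbc_encrypt(k, iv, blocks):
    out, prev = [], iv
    for p in blocks:
        prev = E(k, p ^ prev)        # C_i = E_K(P_i XOR C_{i-1})
        out.append(prev)
    return out

def cbc_decrypt(k, iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(D(k, c) ^ prev)   # P_i = D_K(C_i) XOR C_{i-1}
        prev = c
    return out

key, iv = 90, 60
repeated = [7, 7, 7, 7]
print(ecb_encrypt(key, repeated))        # four equal blocks
print(cbc_encrypt(key, iv, repeated))    # begins 246, 104, ...

message = [1, 2, 3, 4, 5]
damaged = cbc_encrypt(key, iv, message)
damaged[1] ^= 1                          # transmission error in C_2
print(cbc_decrypt(key, iv, damaged))     # only P_2 and P_3 are wrong
```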
Cipher Feedback mode (CFB): The previous ciphertext block is encrypted
and then XORed with the plaintext block, i.e.,
Ci = Pi ⊕ EK (Ci−1 ).
Once again, an initialization vector C0 is used to begin the process.
It’s possible to modify CFB so that its feedback is less than one full
ciphertext block. As with CBC mode, CFB has a nonzero message
extension, and an error in transmission of the ith block affects both
that block and the following block. CFB mode is depicted in Figure
6.4.
Output Feedback mode (OFB): In OFB mode, DES functions like a
synchronous stream cipher. We use an initialization vector z0 of length
equal to the block size, then define for i ≥ 1
zi = EK (zi−1 ).
14
Why?
[Figure 6.4: CFB mode. The previous ciphertext block Ci−1 is encrypted with
EK , and the result is XORed with the plaintext block Pi to give Ci .]

To encrypt, we let
Ci = Pi ⊕ zi .
Since we must also send the initialization vector, OFB has a nonzero
message extension. If we transmit a ciphertext block Ci incorrectly,
then there will be an error in the decryption of that block, but no
other blocks will be affected. On the other hand, if we transmit z0
incorrectly, then none of the blocks of the message will decrypt cor-
rectly. The initialization vector can be transmitted in the clear, and if
the parties are communicating simultaneously, some time can be saved
because both can generate the keystream without having to wait for
the message.
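
The two feedback modes can be sketched in the same way. As before, E below is NOT DES but a made-up map on 8-bit blocks (integers from 0 to 255) used as a stand-in, and the key, initialization vector, and message are arbitrary. Notice that neither mode ever needs the decryption function DK , and that OFB encryption and decryption are the same operation.

```python
def E(k, x):
    """Toy 8-bit block cipher (a stand-in, not DES)."""
    return ((x ^ k) * 5 + 17) % 256

def cfb_encrypt(k, iv, blocks):
    out, prev = [], iv
    for p in blocks:
        prev = p ^ E(k, prev)        # C_i = P_i XOR E_K(C_{i-1})
        out.append(prev)
    return out

def cfb_decrypt(k, iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(c ^ E(k, prev))   # P_i = C_i XOR E_K(C_{i-1})
        prev = c
    return out

def ofb(k, iv, blocks):
    """Encrypts or decrypts: XOR with the keystream z_i = E_K(z_{i-1})."""
    out, z = [], iv
    for b in blocks:
        z = E(k, z)
        out.append(b ^ z)
    return out

key, iv = 90, 60
message = [10, 20, 30, 40, 50]
assert cfb_decrypt(key, iv, cfb_encrypt(key, iv, message)) == message
assert ofb(key, iv, ofb(key, iv, message)) == message

damaged = ofb(key, iv, message)
damaged[2] ^= 4                      # transmission error in C_3
print(ofb(key, iv, damaged))         # only the third block is wrong
```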

6.8 AES
As has been mentioned previously, AES was published as FIPS 197 in 2001.
AES is the successor to DES, and it shares many of the same characteristics.
AES is a block cipher, but has a block length of 128 bits, as compared to
DES’s 64 bits. A DES key is always 64 bits, but an AES key can be 128,
192, or 256 bits. Like DES, AES performs a number of rounds of encryption.
The number of rounds, Nr , depends on the key size. If

• the key is 128 bits, then Nr = 10.

• the key is 192 bits, then Nr = 12.

• the key is 256 bits, then Nr = 14.

This is from FIPS 197:


. . . the AES algorithm’s operations are performed on a two-dimensional
array of bytes called the State.
For both its Cipher and Inverse Cipher, the AES algorithm uses
a round function that is composed of four different byte-oriented
transformations: 1) byte substitution using a substitution table
(S-box), 2) shifting rows of the State array by different offsets,
3) mixing the data within each column of the State array, and 4)
adding a Round Key to the State.

Exercises
6.1 Consider the following 5-round Feistel cipher:

• The blocklength is t = 4.
• The cipher has a 20-bit key, K.
• For each integer i ∈ {1, 2, 3, 4, 5}, we obtain the round key Ki from
K by concatenating the bits of K whose position is congruent to
i (mod 5), i.e., if K = k1 k2 k3 · · · k20 , then

K1 = k1 k6 k11 k16
K2 = k2 k7 k12 k17
K3 = k3 k8 k13 k18
K4 = k4 k9 k14 k19
K5 = k5 k10 k15 k20

• Let Ki = x1 x2 x3 x4 . The round function is

fKi (b1 , b2 , b3 , b4 ) = (x1 ⊕ b1 , x2 ⊕ b1 ⊕ b2 , x3 ⊕ b3 , x4 ⊕ b1 ⊕ b4 )

Encrypt the message

Yes!

using this Feistel cipher with key K = 01100 10011 11010 01101.

6.2 In this question we consider a DES variant which works exactly the
same way as DES, but with only 3 rounds of encryption instead of the
full 16.

(a) Find the missing bits in the DES key

K = 11001 10 1000100 1111 100 0001101


001 0010 1 000101 000 1000 0110100

(b) Using the key K from part (a), find the round keys K1 , K2 , and
K3 .

(c) Finally, use the round keys K1 , K2 , and K3 from part (b) to
encrypt the message
Last tut
with a 3-round DES cipher.

Solutions
6.1 The first block is the letter Y = 01011001, which encrypts to 11100000.
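The answer to Exercise 6.1 can be checked mechanically. The following Python sketch is not part of the original notes. It assumes that an 8-bit block is split into two 4-bit halves (L0, R0), that round i computes Li = Ri−1 and Ri = Li−1 ⊕ fKi(Ri−1), and that the two halves are swapped after the last round. The function names are ours.

```python
def bits(text):
    """Turn a string such as '0110 1' into a list of 0/1 integers."""
    return [int(c) for c in text if c in "01"]


def round_keys(key):
    """K_i consists of the key bits whose position is congruent to i (mod 5)."""
    return [key[i::5] for i in range(5)]


def f(k, b):
    """The round function of Exercise 6.1."""
    x1, x2, x3, x4 = k
    b1, b2, b3, b4 = b
    return [x1 ^ b1, x2 ^ b1 ^ b2, x3 ^ b3, x4 ^ b1 ^ b4]


def encrypt_block(block, keys):
    """Encrypt one 8-bit block with the Feistel structure assumed above."""
    left, right = block[:4], block[4:]
    for k in keys:
        left, right = right, [l ^ t for l, t in zip(left, f(k, right))]
    # Assumption: the halves are swapped after the final round.
    return right + left


key = bits("01100 10011 11010 01101")
y = bits("01011001")  # the letter Y
cipher = "".join(str(b) for b in encrypt_block(y, round_keys(key)))
print(cipher)  # prints 11100000
```

Under these assumptions the output agrees with the ciphertext 11100000 given in the solution.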

6.2 (a) K = 11001110 10001001 11110100 00001101 00110010 10000101
00001000 01101000
(b) The first round key is K1 = 000001 001110 010001 111011 101000 011101
100001 001000. Then during the first round of encryption we
have

E(R0 ) = 000100 000000 001110 101000 001110 101010 101110 101000,

so S1(B1) = 0111, S2(B2) = 0100, S3(B3) = 0001, S4(B4) = 0111,
S5(B5) = 1011, S6(B6) = 0111, S7(B7) = 1010, S8(B8) = 0111.
Thus the result of the permutation P is

10100101 01100010 10111110 10111011

which implies that

R1 = 11101001 00000011 11001101 11001111

The second and third rounds proceed similarly.


Chapter 7

An Introduction to Graphs

7.1 Introduction
Recent years have seen increased demand for application of mathematics.
Graph theory has proven to be particularly useful to a large number of rather
diverse fields. The exciting and rapidly growing area of graph theory is rich
in theoretical results as well as applications to real-world problems. With the
increasing importance of the computer, there has been a significant movement
away from the traditional calculus courses and toward courses on discrete
mathematics, including graph theory.
One of the primary activities of an applied mathematician is modelling.
In a mathematical model, we represent or identify a real-life situation or
problem with a mathematical system. One of the best-known examples of
this representation is plane Euclidean geometry which gives useful results for
describing small regions, such as measuring distances.
For a graph theorist, the modelling process involves formulating a problem
in such a way that it can be attacked by techniques of graph theory. This
is not always easy; the way in which the modelling is carried out, and the
degree to which the mathematical model accurately represents the original
problem, varies considerably from problem to problem.
Many real-life situations can be described by means of a diagram con-
sisting of a set of points with lines joining certain pairs of points. Loosely
speaking (we shall be more precise shortly), such a diagram is what we mean
by a graph. Graph theory is one area of mathematics that is particularly
well-suited to model building. The two main themes of this course are the
development of graph theory as a subject in its own right, and the modelling
of problems.
Instances of graphs abound: for example, the points might represent
cities with lines representing direct flights between certain pairs of these
cities in some airline system, or perhaps the lines represent pipelines be-
tween certain pairs of these cities in an oil network. On the other hand, the
points might represent factories with lines representing communication links
between them. Electrical networks, multiprocessor computers or switching
circuits may clearly be represented by graphs. In chemistry, graphs may
sometimes be helpful in describing the structure of molecules. A chemical
molecule can be represented as a graph whose 'points' correspond to the
atoms and whose 'lines' correspond to the chemical bonds connecting them.
The problem of counting the alkanes (paraffins) Cn H2n+2 is essentially a
tree-counting problem. (Trees are a class of graphs which we will study in
Chapter 3.) For the sociologist, a graph may depict the manner in which
people (or groups of people) exert influence over one another. Graphs can
arise in several seemingly unrelated contexts, such as genetics, ecology, ar-
chaeology, music, and the phasing of traffic lights. In order to investigate
such relationships more deeply, we need to study the theory of graphs in
greater detail.
An important part of learning graph theory is problem solving, and for
this reason a number of problems have been included at the end of most
sections. Many of these are routine exercises, designed to test understanding
of the material in the text, but some are more challenging and less routine.
The large variety of proofs used in this course can help strengthen our use of
mathematical techniques.

7.2 What is a graph?


The common feature in all the preceding examples is that in each case we have
a collection of 'objects' (cities, people, factories, atoms) which are interrelated
in some way. These ideas are easily abstracted to produce the concept of a
graph.

Definitions. A graph G is a finite nonempty set of objects, called
vertices (the singular is vertex), together with a (possibly empty) set
of unordered pairs of distinct vertices, called edges. The set of vertices
of the graph G is called the vertex set of G, denoted by V (G), and
the set of edges is called the edge set of G, denoted by E(G). The
edge e = {u, v} is said to join the vertices u and v. If e = {u, v} is
an edge of G, then u and v are adjacent vertices, while u and e are
incident, as are v and e. Furthermore, if e1 and e2 are distinct edges of
G incident with a common vertex, then e1 and e2 are adjacent edges.
It is convenient to henceforth denote an edge by uv or vu rather than
by {u, v}. The cardinality of the vertex set of a graph G is called the
order of G and is denoted by n(G), or more simply by n when the graph
under consideration is clear, while the cardinality of its edge set is the
size of G, denoted by m(G) or m. An (n, m) graph has order n and size
m. The graph of order n = 1 is called the trivial graph. A nontrivial
graph has at least two vertices.
It is customary to define or describe a graph by means of a diagram in
which each vertex is represented by a point (which we draw as a small circle)
and each edge e = uv is represented by a line segment or curve joining the
points corresponding to u and v.
Figure 7.1. A graph G.


To illustrate these definitions, consider the graph G defined by the sets
V (G) = {v1 , v2 , v3 , v4 } and E(G) = {v1 v2 , v2 v3 , v2 v4 , v3 v4 }. A diagram of this
graph is shown in Figure 7.1. This graph has order 4 and size 4. Furthermore,
observe that v1 and v2 are adjacent, but v1 and v3 are not adjacent. The
vertex v2 is incident to the edge v2 v3 , but v4 is not incident to v2 v3 . The
edges v1 v2 and v2 v3 are adjacent, but v1 v2 and v3 v4 are not adjacent.
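These definitions translate directly into code. The following Python sketch (our own illustration, not part of the notes) stores the graph G defined above as a vertex set and a set of unordered pairs, and checks the statements just made. The helper names adjacent, incident and adjacent_edges are ours.

```python
# The graph G: a vertex set and a set of unordered pairs of distinct vertices.
V = {"v1", "v2", "v3", "v4"}
E = {frozenset(e) for e in [("v1", "v2"), ("v2", "v3"), ("v2", "v4"), ("v3", "v4")]}

order = len(V)  # n(G)
size = len(E)   # m(G)


def adjacent(u, v):
    """Two vertices are adjacent if uv is an edge of G."""
    return frozenset((u, v)) in E


def incident(v, e):
    """A vertex v and an edge e are incident if v is an end of e."""
    return v in e


def adjacent_edges(e1, e2):
    """Two distinct edges are adjacent if they share a common vertex."""
    return e1 != e2 and bool(e1 & e2)


print(order, size)           # 4 4
print(adjacent("v1", "v2"))  # True
print(adjacent("v1", "v3"))  # False
```

Using frozenset for an edge reflects the fact that uv and vu denote the same edge.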
We also need the concept of a subgraph of a graph. It is a common feature
of mathematics that we study complicated objects by looking at simpler
objects of the same type contained inside them, and these smaller objects
are often indicated by the prefix "sub". For example, we study subsets of sets,
subsystems of systems, and so on. In graph theory we make the following
definition.

Definitions. A subgraph of a graph G is a graph all of whose vertices
belong to V (G) and all of whose edges belong to E(G). If H is a
subgraph of G, then we write H ⊂ G. If a subgraph H of G contains
all the vertices of G, then H is called a spanning subgraph of G.

For example, if G is the graph shown in Figure 7.1, then the graphs H1,
H2 and H3 shown in Figure 7.2 are all subgraphs of G. (Note that G is
regarded as a subgraph of itself.)
Figure 7.2. Subgraphs H1, H2 and H3 of the graph G.

An important type of subgraph of a graph that we shall encounter is the
induced subgraph.

Definitions. If W is a nonempty subset of vertices of a graph G, then
the subgraph G[W ] of G induced by W is the graph having vertex set
W and whose edge set consists of all those edges of G incident with
two vertices in W . A subgraph H of G is called a vertex-induced
subgraph, or simply an induced subgraph, of G if H = G[W ] for some subset
W of V (G). Hence if H is an induced subgraph of G, then every edge of
G incident with two vertices in V (H) belongs to E(H) (so two vertices
are adjacent in H if and only if they are adjacent in G). Similarly, if
F is a nonempty subset of edges of G, then the subgraph G[F ] induced
by F is the graph whose vertex set consists of all those vertices of G
incident with an edge in F and whose edge set is F . A subgraph H of
G is called an edge-induced subgraph of G if H = G[F ] for some subset
F of E(G).

For example, if G is the graph shown in Figure 7.1, and H1 and H2 are
the subgraphs of G shown in Figure 7.2, then H1 = G[F ] and H2 = G[W ],
where F = {v1 v2 , v3 v4 } and W = {v2 , v3 , v4 }.
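The two kinds of induced subgraph can be computed straight from the definitions. The Python sketch below is our own illustration; the function names vertex_induced and edge_induced are hypothetical, and a graph is represented simply as a pair (vertex set, edge set).

```python
def edge(u, v):
    """An edge is an unordered pair of vertices."""
    return frozenset((u, v))


# The graph G of Figure 7.1.
V = {"v1", "v2", "v3", "v4"}
E = {edge("v1", "v2"), edge("v2", "v3"), edge("v2", "v4"), edge("v3", "v4")}


def vertex_induced(E, W):
    """G[W]: vertex set W, together with every edge of G having both ends in W."""
    W = set(W)
    return W, {e for e in E if e <= W}


def edge_induced(F):
    """G[F]: edge set F, together with every vertex incident with an edge of F."""
    F = set(F)
    return set().union(*F), F


H2 = vertex_induced(E, {"v2", "v3", "v4"})
H1 = edge_induced({edge("v1", "v2"), edge("v3", "v4")})
print(len(H2[1]))      # 3: the edges v2v3, v2v4 and v3v4
print(sorted(H1[0]))   # ['v1', 'v2', 'v3', 'v4']
```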

Exercises

7.1 Draw the graph with vertex set V = {v1 , v2 , v3 , v4 , v5 } and edge set
E = {v1 v2 , v1 v4 , v1 v5 , v2 v3 , v3 v5 , v4 v5 }.
7.2 Write down the vertex set and the edge set of the following graph:
[Figure: the graph G, drawn with vertices a, b, c in the top row and e, d, f in the bottom row.]
7.3 Which of the following graphs are subgraphs of the graph G in Exer-
cise 7.2?
[Figure: three graphs, labelled (a), (b) and (c).]

7.4 Six university teams A,B,C,D,E and F belong to the same league and
have played a number of matches: A has played C, D and E; B has
played C and E; C has played A, B and D; D has played A, C and E;
E has played A, B and D; F has not yet played. Let G be a graph with
vertex set V = {A, B, C, D, E, F } and let two vertices of G be adjacent
if and only if the corresponding teams have played a match.

(a) Write down E(G); (b) Draw the graph G.


7.5 What is the maximum possible size of a graph of (a) order 3; (b) order 4;
(c) order 5; (d) order n, where n is a positive integer?
7.6 For each of the properties listed below, find a graph G that has the
given property:
(a) Every vertex of G is adjacent to two vertices and every edge of G
is adjacent to two edges.
(b) Every two vertices of G are adjacent and every two edges of G are
adjacent.
(c) Every vertex of G is incident with an edge, but no two edges of G
are adjacent.

7.3 Examples of graphs


There are certain classes of graphs that occur so often that they deserve
special mention and in some cases, special notation.
Complete graphs.
A complete graph or clique is a graph in which every two distinct vertices are
adjacent. The complete graph of order n is denoted by Kn and is called an
n-clique.
Figure 7.3. Complete graphs K1, K2, K3, K4, K5.

Empty graphs.
The empty graph is a graph containing no edges. The empty graph of order n
is denoted by K̄n .
Figure 7.4. Empty graphs K̄1, K̄2, K̄3, K̄4, K̄5.

Bipartite graphs.
Of particular importance in applications are bipartite graphs. A bipartite
graph is a graph whose vertex set can be partitioned into two sets V1 and V2
(called partite sets) in such a way that each edge of the graph joins a vertex
of V1 to a vertex of V2 . We can distinguish the vertices in V1 from those in
V2 by drawing the former in black and the latter in white, so each edge is
incident with a black vertex and a white vertex. Some examples are shown
below.
Figure 7.5. Bipartite graphs.

Complete bipartite graphs.


A complete bipartite graph is a bipartite graph with partite sets V1 and V2
having the added property that every vertex of V1 is adjacent to every vertex
of V2 (so each black vertex is joined to each white vertex). If |V1 | = r and
|V2 | = s, then this graph is denoted by K(r, s) or more commonly Kr,s . A
complete bipartite graph of the form K(1, s) is called a star graph. Some
examples of complete bipartite graphs are shown below.
Figure 7.6. Complete bipartite graphs K1,4, K2,2, K2,4, K3,3.

A graph G is k-partite, k ≥ 2, if its vertex set V (G) can be partitioned into
k subsets V1, V2, . . . , Vk (called partite sets) in such a way that each edge of
G joins a vertex of Vi to a vertex of Vj, i ≠ j. A complete k-partite graph is
a k-partite graph with partite sets V1, V2, . . . , Vk having the added property
that every vertex of Vi is adjacent to every vertex of Vj for 1 ≤ i < j ≤ k.
If |Vi| = ni, then this graph is denoted by K(n1, n2, . . . , nk). If ni = n for
all i, then we denote this graph simply by Kk(n). A graph is a complete
multipartite graph if it is a complete k-partite graph for some k ≥ 2.

Exercises

7.7 Draw the following graphs: (a) K6 (b) K4,4 (c) K2,5

7.8 How many edges does a complete bipartite graph Km,s have?

7.4 Operations on graphs


There are several operations we can perform on graphs in order to form
new ones. The simplest of these is to form their union. In the following
definitions, we assume that G1 and G2 are two graphs with disjoint vertex
sets.

Definition. The union G = G1 ∪ G2 has V (G) = V (G1) ∪ V (G2) and
E(G) = E(G1) ∪ E(G2). If a graph G consists of k (≥ 2) disjoint copies
of a graph H, then we write G = kH.

The graph 2K1 ∪ 3K2 ∪ K1,3 is shown in Figure 7.7.

Figure 7.7. The union of graphs.

Definition. The join G = G1 + G2 has V (G) = V (G1) ∪ V (G2) and
E(G) = E(G1) ∪ E(G2) ∪ {uv | u ∈ V (G1) and v ∈ V (G2)}.

Using the join operation, we see that Kr,s = K̄r + K̄s, the join of two
empty graphs. Another illustration is given in Figure 7.8.
Figure 7.8. The join of two graphs.



Definition. If G is a graph, we form its complement Ḡ by taking the
vertex set of G and joining two vertices by an edge whenever they are
not joined in G.

For example, the complement of the complete graph Kn is the empty graph K̄n.
Figure 7.10 shows a graph G and its complement Ḡ. Note that if we take the
complement of Ḡ, then we get back the original graph G.
Figure 7.10. A graph and its complement.
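The three operations of this section are easy to experiment with. The following Python sketch is our own illustration (all function names are ours); a graph is a pair (vertex set, edge set), and the two graphs passed to union or join are assumed to have disjoint vertex sets, as in the definitions above.

```python
from itertools import combinations


def edge(u, v):
    return frozenset((u, v))


def complete(vertices):
    """The complete graph on the given vertices."""
    vs = set(vertices)
    return vs, {edge(u, v) for u, v in combinations(sorted(vs), 2)}


def empty(vertices):
    """The empty graph (no edges) on the given vertices."""
    return set(vertices), set()


def union(g1, g2):
    (v1, e1), (v2, e2) = g1, g2
    assert not (v1 & v2), "the vertex sets are assumed to be disjoint"
    return v1 | v2, e1 | e2


def join(g1, g2):
    """The union, plus every edge uv with u in G1 and v in G2."""
    (v1, _), (v2, _) = g1, g2
    v, e = union(g1, g2)
    return v, e | {edge(u, w) for u in v1 for w in v2}


def complement(g):
    """Same vertex set; two vertices are joined exactly when they are not joined in g."""
    v, e = g
    return v, complete(v)[1] - e


# The join of the empty graphs of orders 2 and 3 is the complete bipartite graph K2,3.
k23 = join(empty(["a1", "a2"]), empty(["b1", "b2", "b3"]))
print(len(k23[1]))                          # 6
print(complement(complement(k23)) == k23)   # True
```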

Exercises

7.9 Draw the following graphs: (a) 3K3 (b) K2,2 + K2 (c) K̄2,3 .

7.5 The degree of a vertex


We have already introduced two numbers associated with a graph G, namely
the order and the size of G. Now we define a collection of numbers associated
with G.

Definitions. Let v be a vertex of a graph G. The degree of v is the
number of edges of G incident with v. The degree of v is denoted by
dG(v), or simply d(v) if G is clear from the context. The minimum degree
of G is the minimum degree among the vertices of G and is denoted
δ(G), while the maximum degree of G is the maximum degree among
the vertices of G and is denoted ∆(G).

In Figure 7.11, a graph G is shown together with the degrees of its vertices.
Figure 7.11. The degrees of the vertices of a graph (as drawn: 2, 3, 4, 1 in the top row and 1, 1, 3, 5, 2, 0 in the bottom row).

Definitions. A vertex is called odd or even depending on whether its
degree is odd or even. A vertex of degree 0 in G is called an isolated
vertex and a vertex of degree 1 is an end-vertex of G. A vertex adjacent
to an end-vertex is called a remote vertex.

Regular graphs.
We say that a graph is regular if all its vertices have the same degree. In
particular, if the degree of each vertex is r, then the graph is regular of
degree r or is r-regular. Figure 7.12 illustrates some examples of graphs
which are regular of degree r, for various values of r.
Figure 7.12. Regular graphs of degree r, for r = 1, 2, 3.

Observe that for the graph G shown in Figure 7.11, n(G) = 10 and
m(G) = 11, while the sum of the degrees of its ten vertices is 22. The fact
that this last number equals 2m(G) is no mere coincidence, as we now show.

Theorem 7.1 In any graph, the sum of all the vertex degrees is equal to
twice the number of edges.

Proof. Every edge is incident with two vertices; hence, when the degrees of
the vertices are summed, each edge is counted twice. □

The above theorem is sometimes called the Handshaking Lemma. This
name arises from the fact that a graph can be used to represent a group of
people shaking hands at a party. In such a graph, the people are represented
by the vertices, and an edge is included whenever the corresponding people
have shaken hands. With this interpretation, the number of edges represents
the total number of handshakes, the degree of a vertex is the number of
hands shaken by the corresponding person, and the sum of the degrees is
the total number of hands shaken. So the handshaking lemma states simply
that the total number of hands shaken is equal to twice the number of
handshakes, the reason being, of course, that exactly two hands are involved in
each handshake. There are some important consequences of the handshaking
lemma.

Corollary 7.2 In any graph, there is an even number of odd vertices.

Proof. Let G be an (n, m) graph. If G has no odd vertices, then the result
follows immediately. Suppose that G contains k (≥ 1) odd vertices v1, v2, . . . , vk.
If G contains even vertices as well, then denote these by vk+1, . . . , vn. By
Theorem 7.1,

    \sum_{i=1}^{k} d(v_i) + \sum_{i=k+1}^{n} d(v_i) = 2m.

Since each of the numbers d(vk+1), . . . , d(vn) is even, \sum_{i=k+1}^{n} d(v_i) is even, so
we have that

    \sum_{i=1}^{k} d(v_i) = 2m − \sum_{i=k+1}^{n} d(v_i)

is even. However, each of the numbers d(v1), d(v2), . . . , d(vk) is odd. Since
the sum of an odd number of odd numbers is odd, it follows that k must
be even; that is, G has an even number of odd vertices. If G has no even
vertices, then we have d(v1) + d(v2) + · · · + d(vk) = 2m, from which we again
conclude that k is even. □
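Both results are easy to confirm numerically. The Python sketch below is our own illustration: it computes the degrees of the graph G of Figure 7.1 from its edge list, and then repeats the two checks on the list of degrees displayed in Figure 7.11, for which the text gives m(G) = 11.

```python
def degrees(vertices, edges):
    """Return a dictionary giving the degree of each vertex."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


# The graph G of Figure 7.1.
V = ["v1", "v2", "v3", "v4"]
E = [("v1", "v2"), ("v2", "v3"), ("v2", "v4"), ("v3", "v4")]
deg = degrees(V, E)

print(sum(deg.values()), 2 * len(E))          # 8 8   (Theorem 7.1)
print(sum(d % 2 for d in deg.values()))       # 2     (an even number of odd vertices)

# The degrees displayed in Figure 7.11.
fig_7_11 = [2, 3, 4, 1, 1, 1, 3, 5, 2, 0]
print(sum(fig_7_11))                          # 22 = 2 * 11
print(sum(d % 2 for d in fig_7_11))           # 6
```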

It is often convenient to list the degrees of the vertices in a graph; this is
usually done by writing them in nonincreasing order (that is, in decreasing
order, but allowing 'repeats' where necessary). The resulting list is called the
degree sequence of a graph. Given a graph G, the degree sequence of G can be
easily determined (we simply find the degree of each of its vertices). For
example, the graph G of Figure 7.11 has degree sequence 5, 4, 3, 3, 2, 2, 1, 1, 1, 0.
Several interesting questions about degree sequences now come to mind.
The first question you might think of is, can we reverse this process? By
this we mean, given a degree sequence s, can we determine a graph with s
as degree sequence? Perhaps a better question is: Can we determine when
a sequence of integers represents the degree sequence of a graph? This leads
us to the following definition.

Definition. A sequence of integers is said to be graphical if it is the
degree sequence of some graph. A graph G with degree sequence s is
called a realization of s.

Let’s begin with the question: When is a sequence s : d1, d2, . . . , dn of
integers the degree sequence of some graph? Certain conditions are clearly
important. First, degrees are nonnegative integers, so di ≥ 0 for all i. Next,
di ≤ n − 1, because no vertex in a graph of order n can be adjacent to more
than n − 1 other vertices. Furthermore, Theorem 7.1 tells us that the sum of
the degrees of the vertices in any graph must be an even number, so the sum
d1 + d2 + · · · + dn is even. The above three conditions are all necessary for a
sequence to be graphical, but these conditions are not sufficient. The sequence
3, 3, 3, 1, for example, satisfies all three conditions but is not graphical: in a
graph of order 4, each of the three vertices of degree 3 would be adjacent to
the fourth vertex, forcing its degree to be 3. A necessary and sufficient
condition for a sequence to be graphical was found by Havel [25] (1955) and
later rediscovered by Hakimi [24] (1962).

Theorem 7.3 (Havel - Hakimi) A nonincreasing sequence of nonnegative
integers s : d1, d2, . . . , dn (n ≥ 2) is graphical if and only if the sequence
s1 : d2 − 1, d3 − 1, . . . , d_{d1+1} − 1, d_{d1+2}, . . . , dn is graphical.

Proof. Suppose that the sequence s1 is graphical. Let G1 be a graph of order
n − 1 with degree sequence s1. Then the vertices of G1 can be labelled as
v2, v3, . . . , vn in such a way that d(vi) = di − 1 if 2 ≤ i ≤ d1 + 1 and d(vi) = di
if d1 + 2 ≤ i ≤ n. We can now construct a new graph G from G1 by adding
a new vertex v1 and then joining v1 with an edge to each of v2, v3, . . . , v_{d1+1}.
The degree of v1 is d1, and the degrees of the other vertices are the remaining
values of s. Thus, we have constructed a graph with degree sequence s, and
so s is graphical.
We show next that if s is graphical, then s1 is graphical. Assume that s
is a graphical sequence. Therefore, there are one or more graphs of order n
with degree sequence s. Among all such graphs, let G be one such that
V (G) = {v1 , v2 , . . . , vn }, where d(vi ) = di for 1 ≤ i ≤ n, and the sum of the
degrees of the vertices adjacent with v1 is maximum.
We show that in G, the vertex v1 must be adjacent with vertices having
degrees d2, d3, . . . , d_{d1+1}. If this is not the case, then there must exist two
vertices vj and vk with dj > dk such that v1 is adjacent to vk, but not to
vj. Since the degree of vj exceeds that of vk, there must be some vertex vℓ
such that vℓ is adjacent to vj, but not to vk. Removing the edges v1 vk and
vj vℓ and adding the edges v1 vj and vk vℓ produces a new graph G′ that also
has degree sequence s. However, in G′ the sum of the degrees of the vertices
adjacent with v1 is larger than that in G, contradicting our choice of G. This
contradiction argument verifies that our initial assumption (namely, that v1
is not adjacent with vertices having degrees d2, d3, . . . , d_{d1+1}) was false.
Thus, as claimed, v1 must be adjacent with vertices having degrees d2, d3,
. . . , d_{d1+1}. Hence, the graph obtained from G by removing v1, together with
all the edges incident with v1, produces a graph with degree sequence s1, so
s1 is graphical. □

With the aid of Theorem 7.3, we may now present a procedure, or algo-
rithm, that allows us to determine whether a finite sequence of nonnegative
integers is graphical.

Algorithm 7.4 Given a sequence of n (≥ 1) nonnegative integers:

1. If some integer in the sequence exceeds n − 1, then the sequence is not
graphical.

2. If all the integers in the sequence are 0, then the sequence is graphical.

3. If the sequence contains a negative integer, then the sequence is not
graphical.

4. Reorder the sequence (if necessary) so that it is nonincreasing.

5. Delete the first number, say t, from the sequence and subtract 1 from
the next t numbers in the sequence to form a new sequence. Return to
Step 2.
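The five steps above can be sketched in Python as follows. This is our own transcription rather than code from the notes, and the function name is_graphical is hypothetical; the comments indicate which step each statement corresponds to.

```python
def is_graphical(sequence):
    """Decide whether a finite sequence of integers is graphical (Algorithm 7.4)."""
    s = list(sequence)
    n = len(s)

    # Step 1: no term may exceed n - 1.
    if any(d > n - 1 for d in s):
        return False

    while True:
        # Step 2: a sequence of zeros is the degree sequence of an empty graph.
        if all(d == 0 for d in s):
            return True
        # Step 3: a negative term cannot be a degree.
        if any(d < 0 for d in s):
            return False
        # Step 4: reorder so that the sequence is nonincreasing.
        s.sort(reverse=True)
        # Step 5: delete the first number t and subtract 1 from the next t numbers.
        t = s.pop(0)
        for i in range(t):
            s[i] -= 1


print(is_graphical([4, 4, 4, 3, 3, 2]))  # True
print(is_graphical([3, 3, 3, 1]))        # False
```

The loop in Step 5 never runs past the end of the list: if the largest term t of a sequence of length n is at most n − 1, then the shortened sequence has length n − 1 and largest term at most n − 2, so the bound checked in Step 1 is preserved.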

To illustrate Algorithm 7.4, consider the sequence

s : 4, 4, 4, 3, 3, 2.

The sequence passes the test in Step 1, and we begin the loop of Steps 2 to 5.
The tests in Steps 2 and 3 do not immediately halt us. Proceeding to Step 5,
we get

s1 = s′1 : 3, 3, 2, 2, 2.

Continuing to apply Algorithm 7.4, we have

s′2 : 2, 1, 1, 2
s2 : 2, 2, 1, 1 (reordering)

s′3 : 1, 0, 1
s3 : 1, 1, 0 (reordering)

s4 = s′4 : 0, 0.

Algorithm 7.4 therefore shows that s is graphical, since 0, 0 is the degree
sequence of the empty graph K̄2 on two vertices.
If we can observe that some sequence prior to s4 is graphical, then we can
conclude from Theorem 7.3 that s is graphical. For example, the sequence
s2 is easily seen to be graphical since it is the degree sequence of the graph
G2 of Figure 7.13. By Theorem 7.3, each of the sequences s1 and s is in
turn graphical. To construct a graph with degree sequence s1, we proceed in
reverse from s′2 to s1, observing that a vertex should be added to G2 so that
it is adjacent to one vertex of degree 2 and two vertices of degree 1 in G2.
We thus obtain a graph G1 with degree sequence s1 (or s′1). Proceeding from
s′1 to s, we again add a new vertex, joining it to two vertices of degree 3 and
two vertices of degree 2 in G1. This gives a graph G with degree sequence s.
Figure 7.13. Construction of a graph G with a given degree sequence: G2 with s2 : 2, 2, 1, 1; G1 with s1 : 3, 3, 2, 2, 2; G with s : 4, 4, 4, 3, 3, 2.
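A realization can also be produced by a program. The Python sketch below is our own illustration, with the hypothetical function name realize. Instead of rebuilding the graph in reverse as described above, it assigns edges during the forward reduction: the vertex of largest remaining degree t is joined to the t vertices of next largest remaining degree. By Theorem 7.3 this succeeds whenever the sequence is graphical, though the graph obtained need not be the graph G of Figure 7.13.

```python
def realize(sequence):
    """Return a list of edges on the vertices 0, 1, ..., n-1 whose degrees are
    given by the sequence, or None if the sequence is not graphical."""
    remaining = [[d, v] for v, d in enumerate(sequence)]  # [degree still needed, vertex]
    edges = []
    while remaining:
        remaining.sort(key=lambda pair: -pair[0])  # nonincreasing order
        t, v = remaining.pop(0)
        if t < 0 or t > len(remaining):
            return None
        for pair in remaining[:t]:
            if pair[0] <= 0:
                return None  # this vertex cannot accept another edge
            pair[0] -= 1
            edges.append((v, pair[1]))
    return edges


edges = realize([4, 4, 4, 3, 3, 2])
print(len(edges))              # 10, which is half the sum of the degrees
print(realize([3, 3, 3, 1]))   # None
```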

It should be pointed out that the graph G in Figure 7.13 is not the only
graph with degree sequence s. The graphs G and H of Figure 7.14 both have
the same degree sequence, namely 2, 2, 2, 2, 2, 2. So degree sequences do not
always provide enough information to uniquely describe a graph.
Figure 7.14. Two graphs G and H with the same degree sequence 2, 2, 2, 2, 2, 2.

Exercises

7.10 For the graph G in the accompanying figure, write down the degrees of
all the vertices and the degree sequence of G. Verify the handshaking
lemma for the graph G.
[Figure: the graph G, with vertices v1, v2, . . . , v8.]
7.11 Suppose we know the degrees of the vertices of a graph G. Is it possible
to determine the order and size of G? Explain.

7.12 Show that in any group of at least two people there must be two who
have exactly the same number of acquaintances in the group.

7.13 Suppose you and your partner attended a party with three other cou-
ples. Several handshakes took place. No one shook hands with himself
(or herself) or his (or her) partner, and no one shook hands with the
same person more than once. After all the handshaking was com-
pleted, suppose you asked each person, including your partner, how
many hands he or she had shaken. Each person gave a different an-
swer.

(a) How many hands did you shake?


(b) How many hands did your partner shake?

7.14 Prove that if G is a regular bipartite graph with partite sets V1 and V2 ,
then |V1 | = |V2 |.

7.15 Determine whether the following sequences are graphical. If so, con-
struct a graph with the appropriate degree sequence.

(a) 5, 5, 5, 3, 3, 2, 2, 2, 2, 2
(b) 4, 4, 3, 2, 1, 0

(c) 3, 3, 2, 2, 2, 2, 1, 1
(d) 7, 4, 3, 3, 2, 2, 2, 1, 1, 1

7.16 Show that the sequence d1, d2, . . . , dn is graphical if and only if the
sequence n − d1 − 1, n − d2 − 1, . . . , n − dn − 1 is graphical. (Hint:
Consider a graph and its complement.)

7.6 Connectivity
7.6.1 Introduction
Many of the applications of graph theory involve getting from one vertex to
another in a graph. For example, how can you find the shortest route between
one London Underground station and another? How do you determine the
best route for postal deliveries? In order to formulate general methods for
solving such problems, we need to investigate the concept of connectedness
in a graph; that is, a property of a graph which enables us to proceed from
one vertex to another by a sequence of edges.

7.6.2 Connected graphs


We start this section by defining a walk in a graph.

Definitions. Let u and v be two (not necessarily distinct) vertices of
a graph G. A u-v walk in G is a finite, alternating sequence of vertices
and edges that begins with the vertex u and ends with the vertex v and
in which each edge of the sequence joins the vertex that precedes it in
the sequence to the vertex that follows it in the sequence. The number
of edges in the walk is called the length of the walk.

Note that we do not require all the edges or vertices in a walk to be
different. Often only the vertices of a walk are listed, for the edges present
are then obvious. For example, v3, v3 v2, v2, v2 v6, v6, v6 v3, v3, v3 v4, v4, v4 v5,
v5, v5 v4, v4 is a v3-v4 walk in the graph G of Figure 7.14. The walk just
described can be expressed more simply as v3, v2, v6, v3, v4, v5, v4. This walk
has length 6.
Figure 7.14. The graph G, with vertices v1, v2, . . . , v6.

Definitions. If all the edges (but not necessarily all the vertices) of
a walk are different, then the walk is called a trail. If, in addition, all
the vertices are different, then the trail is called a path. We consider a
single vertex as a trivial path (walk or trail).

The v3 -v4 walk described above is not a trail (since the edge v4 v5 occurs
twice); however, v3 , v2 , v6 , v3 , v4 is a v3 -v4 trail in the graph G of Figure 7.14.
This trail is not a path (since the vertex v3 is repeated). However, v3 , v5 , v4
is a v3 -v4 path. It is also useful to have special terms for those walks or trails
which start and finish at the same vertex.
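The distinction between walks, trails and paths depends only on which edges and vertices are repeated, so it can be tested from the list of vertices alone. The Python sketch below is our own illustration (the function name classify is hypothetical); it assumes that the list really is a walk in the graph, that is, that consecutive vertices are adjacent, and does not check this.

```python
def classify(walk):
    """Return 'path', 'trail' or 'walk' for a walk given as a list of vertices."""
    edges = [frozenset(pair) for pair in zip(walk, walk[1:])]
    if len(set(edges)) < len(edges):
        return "walk"   # some edge is repeated
    if len(set(walk)) < len(walk):
        return "trail"  # edges are distinct, but some vertex is repeated
    return "path"       # all vertices, and hence all edges, are distinct


w = ["v3", "v2", "v6", "v3", "v4", "v5", "v4"]
print(len(w) - 1)                                # 6, the length of the walk
print(classify(w))                               # walk
print(classify(["v3", "v2", "v6", "v3", "v4"]))  # trail
print(classify(["v3", "v5", "v4"]))              # path
```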

Definitions. A u-v walk is closed if u = v and open otherwise. A
closed walk in which all the edges are different is a closed trail. A
closed trail which contains at least three vertices is called a circuit. A
circuit which does not repeat any vertices (except the first and last)
is called a cycle. An acyclic graph has no cycles. The length of a cycle
(or circuit) is the number of edges in the cycle (or circuit). A cycle of
length n is an n-cycle. A 3-cycle is also called a triangle. A cycle is
even if its length is even; otherwise it is odd.

In the graph G of Figure 7.15, v1, v2, v3, v4, v5, v2, v6, v1 is a circuit (of
length 7) that is not a cycle, while v2, v4, v3, v5, v2 is a cycle (as well as a
circuit) of length 4.
Figure 7.15. The graph G, with vertices v1, v2, . . . , v6.

By definition, every path is a trail and every trail is a walk. Although the
converse of each of these statements fails to hold, we do have a result that
relates walks and paths.

Theorem 7.4 Every u-v walk in a graph contains a u-v path.



Proof. Let W be a u-v walk in a graph G. If W is closed, the result is
easy; we simply use the trivial path u. Thus, assume W is an open walk, say
W : u = u0, u1, u2, . . . , uk = v. Note that a vertex may have received more
than one label if it occurs more than once in W. If no vertex is repeated,
then W is already a path. Otherwise, there are vertices of G that occur
in W twice or more. Let i and j be distinct integers with i < j such that
ui = uj. That is, the vertex ui is repeated as uj. If we now delete the
vertices ui, ui+1, . . . , uj−1 from W, we obtain a u-v walk W1 which is shorter
than W and has fewer repeated vertices. If W1 is a path, we are done; if not,
we continue this process until finally we reach a stage where no vertices are
repeated and a u-v path is obtained. □
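The proof is constructive, and the deletion process it describes can be sketched in Python as follows. This is our own illustration (the function name walk_to_path is hypothetical); the walk is given by its list of vertices, and at each stage the first repeated vertex found is used.

```python
def walk_to_path(walk):
    """Shorten a u-v walk (a list of vertices) to a u-v path contained in it."""
    path = list(walk)
    while True:
        first_seen = {}
        for j, vertex in enumerate(path):
            if vertex in first_seen:
                i = first_seen[vertex]
                # u_i = u_j: delete the vertices u_i, u_{i+1}, ..., u_{j-1}.
                path = path[:i] + path[j:]
                break
            first_seen[vertex] = j
        else:
            return path  # no vertex is repeated, so we have a path


print(walk_to_path(["v3", "v2", "v6", "v3", "v4", "v5", "v4"]))  # ['v3', 'v4']
print(walk_to_path(["u", "a", "b", "u"]))                        # ['u']
```

For a closed walk the procedure returns the trivial path, in agreement with the first case of the proof.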

Before proceeding further, we mention two special classes of graphs.

Path graphs.
A path graph is a graph consisting of a single path. The path graph of order n
is denoted by Pn .
Figure 7.16. Path graphs P1, P2, P3, P4, P5.

Cycle graphs.
A cycle graph is a graph consisting of a single cycle. The cycle graph of
order n is denoted by Cn (n ≥ 3).
Figure 7.17. Cycle graphs C3, C4, C5, C6.

Definitions. A graph G is connected if there exists a path in G between
any two of its vertices, and is disconnected otherwise. Every
disconnected graph can be split up into a number of connected subgraphs,
called components. A component of a graph G is a maximal connected
subgraph. Two vertices u and v in a graph G are connected if u = v, or
if u ≠ v and a u-v path exists in G. The number of components of G
is denoted by k(G); of course, k(G) = 1 if and only if G is connected.

For the graph G of Figure 7.18, k(G) = 6.


Figure 7.18. A graph G with six components.
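
As a rough illustration of how k(G) might be computed, the following Python
sketch collects the components of a graph by breadth-first search. The
adjacency-dictionary representation, the function name components and the
small example graph are assumptions made for this sketch; the example is not
the graph of Figure 7.18.

```python
from collections import deque


def components(adj):
    """Return the vertex sets of the components of a graph.

    adj maps each vertex to the set of its neighbours.
    """
    seen = set()
    result = []
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search collects every vertex connected to start.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    component.add(v)
                    queue.append(v)
        result.append(component)
    return result


# Hypothetical graph: a triangle, a single edge and an isolated vertex.
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2},
    4: {5}, 5: {4},
    6: set(),
}
k = len(components(adj))  # number of components k(G)
```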

Exercises
7.17 Give an example of a disconnected graph with four components where
each component is complete.
7.18 Give an example of a disconnected graph with three components where
every two components are isomorphic.
7.19 In the accompanying graph G, give an example of:
(a) a circuit which is not a cycle;
(b) a trail which is not a path;
(c) a path;
(d) a cycle.
[Figure: the graph G, with vertices v1, v2, v3, v4, v5, v6.]

7.20 Show that if G is a graph with minimum degree δ ≥ 2, then G contains
a cycle of length at least δ + 1. (Hint: Consider a longest path P in G.
Let u be an end-vertex of P and consider the vertices adjacent to u in
G.)
7.21 Show that if G is a bipartite graph with minimum degree δ ≥ 1, then
G contains a path of order at least 2δ.
7.22 Prove that a graph and its complement cannot both be disconnected.
7.23 A graph is self-complementary if it is isomorphic to its complement.
(a) Show that C5 is self-complementary.
(b) Prove that a self-complementary graph has 4k or 4k + 1 vertices,
for some integer k (that is, if G is self-complementary of order n, then
n ≡ 0 (mod 4) or n ≡ 1 (mod 4)).

7.24 Let G be a graph of even order n (i.e., n = 2s for some positive integer s)
such that G has two complete components. Prove that the minimum
size possible for G is m = (n^2 − 2n)/4. (Hint: Try a calculus argument.)
If G has this size, what does G look like?

7.25 Let G be a graph of order n such that d(v) ≥ (n − 1)/2 for every v ∈
V (G). Prove that G is connected. (Hint: Try a proof by contradiction:
Assume, to the contrary, that G is disconnected. This means that G
has two or more components. What can be said about the number of
vertices in each component?)

7.26 Let G be a graph of order n ≥ 2 such that d(v) ≥ (n − 2)/2 for every
v ∈ V (G). Show that G need not be connected if n is even.

7.27 Suppose that G is a graph having no vertex of degree 0 and no induced
subgraph with exactly two edges. Prove firstly that G is connected.
Hence prove that G is a complete graph.

7.6.3 Distance in graphs


The distance between two vertices in a graph is defined as follows.

Definition. For a connected graph G, we define the distance d(u, v)
between two vertices u and v as the minimum of the lengths of the u-v
paths of G. If G is a disconnected graph, then the distance between two
vertices in the same component of G is defined as above. However, if
u and v belong to different components of G, then d(u, v) is undefined
(or we could define d(u, v) = ∞).
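
In an unweighted graph, distances from a fixed vertex can be found by
breadth-first search. The sketch below is illustrative only; it adopts the
convention d(u, v) = ∞ for vertices in different components, and the function
name and the small example graph are assumptions rather than part of the
notes.

```python
import math
from collections import deque


def distances_from(adj, source):
    """Return d(source, v) for every vertex v, using breadth-first search.

    adj maps each vertex to the set of its neighbours. Vertices in a
    different component from source keep the distance math.inf.
    """
    dist = {v: math.inf for v in adj}
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == math.inf:  # v has not been reached before
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Hypothetical graph: the path a, b, c, d together with an isolated vertex e.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}, "e": set()}
d_from_a = distances_from(adj, "a")
```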

For the graph G of Figure 7.19, d(x, u) = 2, d(x, w) = 3 while d(x, v) = 5.


Figure 7.19.

The distance function d(u, v) on pairs of vertices of a graph is a metric,
that is, it satisfies the following fundamental properties:

Theorem 7.5 Let G be a graph and let u, v, w be any three vertices of G.
Then
(i) d(u, v) ≥ 0 and d(u, v) = 0 if and only if u = v;
(ii) d(u, v) = d(v, u) (symmetric property);
(iii) d(u, v) ≤ d(u, w) + d(w, v) (triangle inequality).

Proof. (i) Since d(u, v) is ∞ or equals the number of edges on a shortest
u–v path, d(u, v) ≥ 0. If d(u, v) = 0, then a shortest u–v path has no edges
and is thus the trivial path. Hence, u = v. Conversely, if u = v, then the
trivial path u is a u–v path with no edges, so d(u, v) = 0.
(ii) If d(u, v) = ∞, then there is no u–v path, and so d(v, u) = ∞.
Suppose then that d(u, v) ≠ ∞. By reversing the order of the vertices of a
shortest u–v path, we obtain a v–u path, and so d(v, u) ≤ d(u, v). On the
other hand, by reversing the order of the vertices of a shortest v–u path, we
obtain a u–v path, and so d(u, v) ≤ d(v, u). Consequently, d(u, v) = d(v, u).
(iii) If d(u, w) = ∞ or d(w, v) = ∞, the inequality holds trivially, so
suppose both are finite. Let P be a shortest u–w path and let Q be a
shortest w–v path. Then P followed by Q is a u–v walk having length
d(u, w) + d(w, v). By Theorem 7.4, this walk contains a u–v path, whose
length is at most that of the walk. It follows that d(u, v) ≤
d(u, w) + d(w, v). □

Applications of distance in graphs abound. For example, suppose a town
planning committee wishes to locate certain facilities, such as a fire or police
station, in such a way as to minimize the response time between the facility
and the location of a possible emergency. How would the committee deter-
mine the most convenient sites for these facilities? To attempt to answer
such questions, we introduce the following concepts.

Definitions. The eccentricity e(v) of a vertex v in a connected graph
G is the maximum distance of a vertex from v; that is, e(v) is the
distance from v to a vertex furthest from v. The radius rad(G) of
G is the minimum eccentricity among the vertices of G, while the
diameter diam(G) is the maximum eccentricity. A vertex v is a central
vertex if e(v) = rad(G), while v is a peripheral vertex if e(v) = diam(G).

Notice that the diameter of G is the maximum distance between any two
vertices in G. Furthermore, if rad(G) = 1, then G contains a vertex that
is adjacent to every other vertex. In Figure 7.20, the vertices of the graph
G are labelled with their eccentricities. For this graph G, rad(G) = 3 and
diam(G) = 5.
Figure 7.20. Eccentricities of vertices.

The next result shows that the diameter of a graph is at most twice its
radius.

Theorem 7.6 For every connected graph G, rad(G) ≤ diam(G) ≤ 2 rad(G).

Proof. The inequality rad(G) ≤ diam(G) follows directly from the defini-
tions. To verify the inequality diam(G) ≤ 2 rad(G), let u and v be vertices
in G satisfying d(u, v) = diam(G). Furthermore, let w be a central vertex
of G. Since the distance function satisfies the triangle inequality (see Prop-
erty (iii) of Theorem 7.5), diam(G) = d(u, v) ≤ d(u, w) + d(w, v) ≤ 2 e(w) =
2 rad(G). □
The lower bound of Theorem 7.6 is sharp; that is, there exists a graph
G for which rad(G) = diam(G). In fact, the family Kn , n ≥ 2, of complete
graphs satisfies this equation, as does the family of all cycles. The upper
bound is also sharp. To see this, let G belong to the family of paths of
odd order; that is, G ≅ P2k+1, k ≥ 1. Then rad(G) = k and diam(G) = 2k.
Hence for each positive integer k, there exists a graph G such that diam(G) =
2 rad(G).
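
The sharpness claims above can be checked numerically for small cases. The
following Python sketch is illustrative only: the helper names and the
adjacency-dictionary representation are assumptions, and every graph passed
in is assumed to be connected.

```python
from collections import deque


def eccentricity(adj, v):
    """Largest breadth-first distance from v (graph assumed connected)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        a = queue.popleft()
        for b in adj[a]:
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return max(dist.values())


def radius_and_diameter(adj):
    """Return the pair (rad(G), diam(G))."""
    ecc = [eccentricity(adj, v) for v in adj]
    return min(ecc), max(ecc)


def complete_graph(n):
    return {i: {j for j in range(n) if j != i} for i in range(n)}


def cycle_graph(n):
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def path_graph(n):
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


k = 3
lower_sharp_complete = radius_and_diameter(complete_graph(5))  # rad = diam
lower_sharp_cycle = radius_and_diameter(cycle_graph(7))        # rad = diam
upper_sharp = radius_and_diameter(path_graph(2 * k + 1))       # diam = 2 rad
```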

Definition. The centre C(G) of G is the subgraph of G induced by its
central vertices.

Returning to our facility location problem, a possible location for the
emergency facility is a vertex in the centre of the graph that models the
street system.
The centre of a graph need not consist of a single vertex. For example,
the centre of the graph G of Figure 7.20 is the subgraph induced by the
three vertices of eccentricity 3, i.e., C(G) ≅ P3. Thus P3 is the centre of
some graph. In fact, Hedetniemi (see [11]) showed that every graph is the
centre of some graph.

Theorem 7.7 Every graph is the centre of some connected graph.


Proof. Let H be a given graph. Let G be the graph constructed from H
by adding four new vertices v1, v2, w1, w2 and, for i = 1, 2, joining vi to every
vertex of H and to wi. For each vertex v in H, e(v) = 2, while for i = 1, 2,
e(vi) = 3 and e(wi) = 4. Thus rad(G) = 2 and the centre of G is the subgraph
induced by the vertices of H; that is, C(G) = H. □
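
The construction in this proof is easy to test on a small example. The Python
sketch below is an illustration under stated assumptions: the function names,
the tuple labels ("v", i) and ("w", i) for the four new vertices, and the
sample graph H (deliberately chosen to be disconnected) are all hypothetical.

```python
from collections import deque


def eccentricities(adj):
    """Return e(v) for every vertex v of a connected graph."""
    ecc = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            a = queue.popleft()
            for b in adj[a]:
                if b not in dist:
                    dist[b] = dist[a] + 1
                    queue.append(b)
        ecc[s] = max(dist.values())
    return ecc


def extend(h_adj):
    """Add v1, v2, w1, w2 to H as in the proof of Theorem 7.7."""
    g = {x: set(nbrs) for x, nbrs in h_adj.items()}
    for i in (1, 2):
        v, w = ("v", i), ("w", i)
        g[v] = set(h_adj) | {w}  # vi is joined to every vertex of H and to wi
        g[w] = {v}
        for x in h_adj:
            g[x].add(v)
    return g


# Hypothetical H: one edge ab together with an isolated vertex c.
H = {"a": {"b"}, "b": {"a"}, "c": set()}
G = extend(H)
ecc = eccentricities(G)
rad = min(ecc.values())
centre = {x for x in G if ecc[x] == rad}
```

Since no edges are added between vertices of H, the subgraph of G induced by
the centre is H itself.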

Exercises

7.28 For the given graph G, find
(a) the eccentricity of each vertex;
(b) the radius and diameter of G;
(c) the centre of G.
[Figure: the graph G, with vertices v1, v2, ..., v8.]

7.29 If u and v are adjacent vertices in a graph G, then show that |e(u) −
e(v)| ≤ 1.

7.30 Given positive integers r and d with r ≤ d ≤ 2r, construct a graph G
with rad(G) = r and diam(G) = d. (Hint: construct such a graph with
one cycle.)

Suggestions for further reading


In this chapter, we introduced many of the basic concepts of graph the-
ory. Many textbooks have been written about graph theory, such as books
by Bollobás [6], Bondy and Murty [9], Chartrand [12], Chartrand and Les-
niak [14], Chartrand and Oellermann [15], Diestel [16], West [27], and Wilson
and Watkins [28].
Chapter 8

The Shortest Path Algorithm

8.1 Introduction
Consider the following problem. A traveller wishes to drive from Knysna
in the Cape Province to Graskop in the Eastern Transvaal. Given that the
traveller has a map of South Africa that shows the distance between partic-
ular pairs of cities or towns, how does the traveller determine the shortest
possible route?
In this particular example it may not be too difficult to find the solution
by intelligent guesswork, but such an approach is less likely to succeed as
the road network becomes more and more complicated. In this section we
describe an efficient algorithm which can be used to find the shortest path
between any two vertices in a graph.

8.2 Distance in weighted graphs


Before introducing the shortest path algorithm, we will need to generalize
the concept of distance in a graph.

Definitions. A weighted graph is a graph in which each edge e is
assigned a positive real number, called the weight of e, and denoted
by w(e). The length of a path P in a weighted graph G is the sum of
the weights of the edges of P. For connected vertices u and v of the
weighted graph G, the distance d(u, v) between u and v is the minimum
of the lengths of the u-v paths of G. A u-v path of minimum weight in
G is called a shortest u-v path for G (so that a path containing three
edges, each of weight one, is "shorter" than a path with two edges, each
of weight two).

Consider the weighted graph G of Figure 8.1. The path P : v2, v1, v3, v4 is a
shortest v2-v4 path (of minimum weight 4). Notice that the path Q : v2, v3, v4
(of weight 5) is not a shortest v2 -v4 path for G, even though Q contains fewer
edges than P .
For example, an airline service can be represented by a weighted graph
with the vertices representing the cities serviced, edges representing direct
flying routes, and with the weight of an edge being the cost of a direct
flight. Finding the distance between two vertices in this weighted graph
corresponds to finding the minimum cost of flying from one city to another,
while a shortest path is a flying route between two cities with the lowest cost.

The Weight Matrix. Let G be a weighted graph with vertex set
V(G) = {v1, v2, ..., vn}. It is convenient to represent G by means of
a weight matrix W(G) defined as follows: W(G) = (wij) is an n × n
matrix, where

    wij = w(vi vj)   if vi vj ∈ E(G),
    wij = ∞          if vi vj ∉ E(G).

(By "∞" we shall mean a number larger than any weight actually occurring in
our calculations.) For example, a graph and its weight matrix are shown in
Figure 8.1, where the weights of the edges are as indicated in the diagram.

G has edges v1v2 (weight 1), v1v3 (weight 2), v1v4 (weight 5),
v2v3 (weight 4) and v3v4 (weight 1).

            [ 0  1  2  5 ]
    W(G) =  [ 1  0  4  ∞ ]
            [ 2  4  0  1 ]
            [ 5  ∞  1  0 ]

Figure 8.1. A weighted graph G and its weight matrix.
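
One way to hold the weight matrix of Figure 8.1 in a program is as a list of
rows, with math.inf standing for ∞. The sketch below is illustrative only;
the names W and path_length, and the choice of 0-based indices (so that vertex
vi is stored at index i − 1), are assumptions made here.

```python
import math

INF = math.inf

# Weight matrix of Figure 8.1; row and column i - 1 correspond to vertex vi.
W = [
    [0,   1,   2,   5],
    [1,   0,   4,   INF],
    [2,   4,   0,   1],
    [5,   INF, 1,   0],
]


def path_length(W, path):
    """Sum of the edge weights along a path given as a list of indices."""
    return sum(W[a][b] for a, b in zip(path, path[1:]))


length_P = path_length(W, [1, 0, 2, 3])  # P : v2, v1, v3, v4
length_Q = path_length(W, [1, 2, 3])     # Q : v2, v3, v4
```

Here length_P and length_Q reproduce the weights 4 and 5 quoted for the paths
P and Q in the text.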



8.3 Dijkstra’s algorithm


We next introduce an algorithm, due to Dijkstra (1959), which determines,
for a fixed vertex x in a connected weighted graph G, the distance d(x, v)
from x to each vertex v of G, as well as a shortest x-v path in G. We shall
write w(e) = ∞ if e ∉ E(G), i.e., w(uv) = ∞ if u and v are nonadjacent
vertices of G.
The idea of Dijkstra’s algorithm is to modify labels ℓ(v) attached to the
vertices, where ℓ(v) represents the length of the shortest x-v path presently
known. As we proceed, we reduce these labels as shorter paths are found from
x to v. Initially, x is labelled ℓ(x) = 0 and all other vertices are labelled ∞.
At any stage in the algorithm, for any vertex v ≠ x, let the variable parent(v)
denote the vertex that precedes v on a shortest x-v path detected thus far.
Furthermore, at each stage of the algorithm, we look at those vertices with a
temporary label that are adjacent to the current vertex. To each such vertex
v, we assign a new temporary label ℓ(v) representing the length of the
shortest x-v path considered until now. The vertex with the smallest
temporary label then becomes the current vertex and its label becomes
permanent. Whenever ℓ(v) is updated, a shorter x-v path has been
found, and at this point parent(v) is also updated to indicate the vertex that
now precedes v on this shorter x-v path. Eventually each vertex acquires a
permanent label which represents the shortest distance from x to that vertex.
For v (≠ x), the label ℓ(v) changes (perhaps several times) from ∞ to d(x, v)
as this distance is determined. When v acquires its permanent label ℓ(v) we
produce a shortest x-v path

    P : x = w0, w1, w2, ..., wt = v

where wi−1 = parent(wi) for i = 1, 2, ..., t. We are now prepared to present
Dijkstra’s algorithm.

Algorithm 8.1 (Dijkstra) Given a connected weighted graph G of order n
and a vertex x of G:

1. Set ℓ(x) = 0, set ℓ(v) = ∞ for all v ≠ x, and set S = V(G).

2. If |S| = 1, then stop; otherwise, continue.

3. Among all the vertices in S, let u be one of minimum label ℓ(u).

4. For each v ∈ S, if uv ∈ E(G) and ℓ(v) > ℓ(u) + w(uv), then

   • replace ℓ(v) by ℓ(u) + w(uv), and
   • assign to parent(v) the vertex u.

5. Remove u from S, and return to Step 2.
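
The five steps above translate almost line for line into Python. What follows
is a minimal sketch rather than an efficient implementation: it assumes the
graph is given by its weight matrix as a list of rows with math.inf for ∞ and
0-based vertex indices, and the names dijkstra and shortest_path are chosen
here for illustration.

```python
import math


def dijkstra(W, x):
    """Follow Algorithm 8.1 on the weight matrix W from vertex index x.

    Returns (label, parent), where label[v] is the final label of v and
    parent[v] is the vertex preceding v on a shortest x-v path.
    """
    n = len(W)
    label = [math.inf] * n                    # Step 1
    label[x] = 0
    parent = [None] * n
    S = set(range(n))
    while len(S) > 1:                         # Step 2
        u = min(S, key=lambda v: label[v])    # Step 3
        for v in S:                           # Step 4
            if v != u and W[u][v] < math.inf and label[v] > label[u] + W[u][v]:
                label[v] = label[u] + W[u][v]
                parent[v] = u
        S.remove(u)                           # Step 5
    return label, parent


def shortest_path(parent, v):
    """Backtrack through the parent pointers from v to the start vertex."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]


INF = math.inf
# Weight matrix of Figure 8.1; index i - 1 stands for vertex vi.
W = [
    [0,   1,   2,   5],
    [1,   0,   4,   INF],
    [2,   4,   0,   1],
    [5,   INF, 1,   0],
]
label, parent = dijkstra(W, 0)
path_to_v4 = shortest_path(parent, 3)
```

Selecting the minimum by scanning S takes time proportional to n at each
stage; this mirrors Step 3 directly, whereas practical implementations
usually replace the scan with a priority queue.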

As an example, we apply Dijkstra’s algorithm to the graph G of Figure 8.1
to determine the distance from the vertex x = v1 to every other vertex of
G and to find a shortest x-vi path for i = 2, 3, 4. Table 8.1 given below
has three columns corresponding to the vertices v2, v3, v4. The ordered pairs
in the column corresponding to the vertex vi indicate (ℓ(vi), parent(vi)) for
i = 2, 3, 4 at a given point in the algorithm.

    ℓ(x)   v2        v3        v4        removed from S   S

    0      (∞, −)    (∞, −)    (∞, −)    −                {x, v2, v3, v4}
           (1, x)    (2, x)    (5, x)    x                {v2, v3, v4}
                     (2, x)    (5, x)    v2               {v3, v4}
                               (3, v3)   v3               {v4}

Table 8.1.

To understand how we obtained Table 8.1, let us apply Dijkstra’s algorithm
to the graph G of Figure 8.1. We start by assigning to x = v1 the label
ℓ(x) = 0, and to every other vertex vi the label ℓ(vi) = ∞. Initially, we set
S = V(G) = {x, v2, v3, v4}.
We select x, since it has the smallest label among all vertices in S. The
label ℓ(x) = 0 is now the permanent label of x. Next we examine the vertices
in S adjacent to x, namely v2, v3, v4. Since ℓ(v2) = ∞ > 1 = ℓ(x) + w(xv2),
we replace ℓ(v2) by 1 and assign to parent(v2) the vertex x. Furthermore,
we replace ℓ(v3) by ℓ(x) + w(xv3) = 2 and assign to parent(v3) the vertex
x, and we replace ℓ(v4) by ℓ(x) + w(xv4) = 5 and assign to parent(v4) the
vertex x. We then remove x from S to obtain S = {v2, v3, v4} and return to
Step 2 of the algorithm.
We select v2, since it has the smallest label among all vertices in S. The
label ℓ(v2) = 1 is now the permanent label of v2. Next we examine the vertices
in S adjacent to v2. Since ℓ(v3) = 2 ≤ 5 = ℓ(v2) + w(v2v3), the label ℓ(v3)
remains unchanged. We then remove v2 from S to obtain S = {v3, v4} and
return to Step 2 of the algorithm.
We select v3, since it has the smallest label among all vertices in S. The
label ℓ(v3) = 2 is now the permanent label of v3. Next we examine the vertices
in S adjacent to v3. Since ℓ(v4) = 5 > 3 = ℓ(v3) + w(v3v4), we replace ℓ(v4)
by 3 and assign to parent(v4) the vertex v3. We then remove v3 from S to
obtain S = {v4} and return to Step 2 of the algorithm. Since |S| = 1, the
algorithm terminates.
Our results may be summarized as shown in Table 8.2. Using Table 8.1
we obtain, for each i = 2, 3, 4, the distance from x = v1 to vi and a shortest
x-vi path Qi .

    v     d(x, v)      Qi
    v2    ℓ(v2) = 1    x, v2
    v3    ℓ(v3) = 2    x, v3
    v4    ℓ(v4) = 3    x, v3, v4

Table 8.2.

As a second example, we apply Dijkstra’s algorithm to the graph G of
Figure 8.2 to determine the distance from the vertex x to every other vertex
of G and to find a shortest x-vi path for i = 1, 2, ..., 5.

Figure 8.2. A weighted graph G.

Table 8.3 has five columns corresponding to the vertices v1, v2, ..., v5. As
in the previous example, the ordered pairs in the column corresponding to
the vertex vi indicate (ℓ(vi), parent(vi)) for i = 1, 2, ..., 5 at a given point in
the algorithm.

    ℓ(x)  v1        v2       v3        v4        v5       removed from S  S

    0     (∞, −)    (∞, −)   (∞, −)    (∞, −)    (∞, −)   −               V(G)
          (∞, −)    (13, x)  (∞, −)    (16, x)   (8, x)   x               {v1, v2, ..., v5}
          (18, v5)  (13, x)  (25, v5)  (15, v5)           v5              {v1, v2, v3, v4}
          (18, v5)           (25, v5)  (15, v5)           v2              {v1, v3, v4}
          (18, v5)           (20, v4)                     v4              {v1, v3}
                             (19, v1)                     v1              {v3}

Table 8.3.

Our results may be summarised as shown in Table 8.4. Using Table 8.3
we obtain, for each i = 1, 2, . . . , 5, the distance from x to vi and a shortest
x-vi path Qi .

    v     d(x, v)       Qi
    v1    ℓ(v1) = 18    x, v5, v1
    v2    ℓ(v2) = 13    x, v2
    v3    ℓ(v3) = 19    x, v5, v1, v3
    v4    ℓ(v4) = 15    x, v5, v4
    v5    ℓ(v5) = 8     x, v5

Table 8.4.

We now verify that on completion, Dijkstra’s algorithm has labelled the
vertices with their proper distances from x. To do this, we first prove the
following two lemmas.

Lemma 8.2 At the termination of Dijkstra’s algorithm, ℓ(v) is finite for all
v ∈ V(G).

Proof. Assume, to the contrary, that on completion of Dijkstra’s algorithm
not all vertices have finite labels. Let w be the first vertex selected from S
with an infinite label. Since w was chosen as a vertex of S with minimum
label, all vertices of S when w was selected have infinite labels, while all
vertices already deleted from S have finite labels. But then there is no edge
from V(G) − S to S (since if such an edge e = uv with u ∈ V(G) − S and
v ∈ S did exist, then when u was selected as the current vertex in Step 3 of the
algorithm, the label ℓ(v) of v would have dropped from ∞ to ℓ(u) + w(uv),
which is finite). Thus there is no path from a vertex of S to a vertex of
V(G) − S. Hence G is disconnected, which produces a contradiction. □

Lemma 8.3 At the termination of Dijkstra’s algorithm, d(x, v) ≤ ℓ(v) for
all v ∈ V(G).

Proof. If v = x, then d(x, v) = 0 = ℓ(v). If v ≠ x, then by Lemma 8.2,
ℓ(v) is finite. Consider the x-v path P : x = w0, w1, w2, ..., wk = v, where
wi−1 = parent(wi) for i = 1, 2, ..., k. The path P has weight

    w(P) = Σ_{i=1}^{k} w(wi−1 wi).

Since wk−1 is the vertex used to label wk, we know that
ℓ(v) = ℓ(wk) = ℓ(wk−1) + w(wk−1 wk). After this labelling, the vertex wk−1 is
removed from S (in Step 5), and hence its label can never change again.
Repeating this backtracking search, we have ℓ(wk−1) = ℓ(wk−2) + w(wk−2 wk−1),
so ℓ(v) = ℓ(wk−2) + w(wk−2 wk−1) + w(wk−1 wk). Eventually we will backtrack
to x. Thus ℓ(v) = ℓ(w0) + w(w0 w1) + w(w1 w2) + · · · + w(wk−1 wk) = w(P),
since ℓ(x) = ℓ(w0) = 0. Thus P has weight ℓ(v). Since d(x, v) is the length
of a shortest x-v path, d(x, v) ≤ ℓ(v). □
We are now in a position to verify Dijkstra’s algorithm.

Theorem 8.4 At the termination of Dijkstra’s algorithm, ℓ(v) = d(x, v) for
all v ∈ V(G).

Proof. We proceed by induction on the order in which we delete vertices
from S. The result is certainly true for the first vertex, namely x, deleted from
S since ℓ(x) = 0 = d(x, x). Assume that ℓ(u) = d(x, u) for all vertices u
deleted from S before v. Let P : x = v0, v1, v2, ..., vk = v be a shortest x-v
path of length d(x, v). Then the x-vi subsection of P must be a shortest
x-vi path for i = 1, 2, ..., k (otherwise we could find a shorter x-v path than
P, which is impossible). Suppose vi is the vertex of highest subscript on P
deleted from S before v. By the inductive hypothesis, ℓ(vi) = d(x, vi) =
w(v0 v1) + w(v1 v2) + · · · + w(vi−1 vi). When vi was chosen from S as the
current vertex, we compared the current label of vi+1 with ℓ(vi) + w(vi vi+1)
in Step 4 of the algorithm. Hence after vi is deleted from S in Step 5,
ℓ(vi+1) ≤ ℓ(vi) + w(vi vi+1). If vi+1 ≠ v, then, since Step 4 can only decrease
labels, ℓ(vi+1) still satisfies this inequality when v is chosen. Thus,

    ℓ(vi+1) ≤ ℓ(vi) + w(vi vi+1)
            = d(x, vi) + w(vi vi+1)
            = d(x, vi+1)
            < d(x, v)      (since vi+1 precedes v on P)
            ≤ ℓ(v)         (by Lemma 8.3).

This, however, contradicts the fact that v was chosen before vi+1, which
requires ℓ(v) ≤ ℓ(vi+1). Hence vi+1 = v and so

    ℓ(v) ≤ ℓ(vi) + w(vi v)
         = d(x, vi) + w(vi v)
         = d(x, v)
         ≤ ℓ(v)      (by Lemma 8.3).

Hence we must have ℓ(v) = d(x, v). □

Exercises

8.1 A company has branches in each of six cities C1, C2, ..., C6. The fare
for a direct flight from Ci to Cj is given by the (i, j)th entry in the
matrix C (where ∞ indicates that there is no direct flight). Find the
cheapest route from C1 to all other cities.

         [   0  500    ∞  400  250  100 ]
         [ 500    0  150  200    ∞  250 ]
    C =  [   ∞  150    0  100  200    ∞ ]
         [ 400  200  100    0  100  250 ]
         [ 250    ∞  200  100    0  550 ]
         [ 100  250    ∞  250  550    0 ]

8.2 Let G be the weighted graph in the accompanying figure. Use Dijkstra’s
algorithm to compute d(x, v) for each v ∈ V (G) and to determine a
shortest x-v6 path.
[Figure: the weighted graph G, with vertices x, v1, v2, ..., v7.]

8.3 Let G be the weighted graph with V(G) = {v1, v2, ..., v7} and weight
matrix W(G) as shown.
(i) Draw the weighted graph G.
(ii) Find the distance from v3 to every other vertex of G.
(iii) Find a shortest v3-v2 path, and a shortest v3-v7 path.

            [ 0  8  1  4  2  7  ∞ ]
            [ 8  0  7  1  ∞  ∞  4 ]
            [ 1  7  0  ∞  2  ∞  ∞ ]
    W(G) =  [ 4  1  ∞  0  ∞  3  6 ]
            [ 2  ∞  2  ∞  0  5  ∞ ]
            [ 7  ∞  ∞  3  5  0  4 ]
            [ ∞  4  ∞  6  ∞  4  0 ]

8.4 Let G be the weighted graph in the accompanying figure. Use Dijkstra’s
algorithm to compute d(v1 , vi ) for each vi ∈ V (G) and to determine a
shortest v1 -vi path.
[Figure: the weighted graph G, with vertices v1, v2, ..., v8.]

Suggestions for further reading


A more detailed discussion on Dijkstra’s algorithm, including a discussion of
the complexity of the algorithm, can be found in the books by Chartrand
and Oellermann [15] and Gibbons [22].
Chapter 9

Maximum Flows in Networks

9.1 Introduction
You are designing an oil pipeline. Oil is to be pumped from an unloading
point s to a refinery t. To minimize the possibility that pipe repairs will
completely halt the flow of oil, there will be several routes. There will be
four pumping stations u, v, x and y along the pipelines, which are linked as
shown in Figure 9.1. The arrows indicate the directions in which oil can flow
and the capacities of the sections of pipeline are indicated (in thousands of
barrels of oil per hour). The question we are faced with is: what is the
maximum volume of oil that can be pumped into the system (through s) per
hour? In this section, we shall develop a method of solving problems of this
type.
Figure 9.1.

9.2 Digraphs
Although many problems lend themselves to a graph-theoretic formulation,
the concept of a graph is sometimes not quite adequate. When dealing with
problems of traffic flow, for example, it is necessary to know which roads are
one-way, and in which direction traffic is permitted. We may also deal with
problems involving the flow of information or water, the transport of some
commodity, and so on. To deal with such problems, we need a graph in which
every edge has been assigned a direction: a "digraph". The terminology used
for digraphs is quite similar to that used for graphs.

Definitions. A digraph (or directed graph) D is a finite nonempty
set of objects, called vertices, together with a (possibly empty) set of
ordered pairs of distinct vertices of D, called arcs (or directed edges).
As with graphs, the vertex set of D is denoted by V(D) and its arc set
by E(D). If D has vertex set V and arc set E, we write D = (V, E).
The cardinality of the vertex set of D is called the order of D and is
denoted by n(D), or simply by n, while the cardinality of its arc set is
called its size, denoted by m(D), or simply by m.

As with graphs, digraphs can be represented by diagrams. The vertices
of a digraph D are indicated by small circles, and an arc (u, v) of D is
represented by a curve or line segment directed by an arrow-head from vertex
u to vertex v. Since (u, v) and (v, u) are distinct arcs, two vertices can be
joined by two arcs if they have opposite direction.

Definition. With each digraph D, we can associate a graph G (on the
same vertex set) called the underlying graph of D that is obtained from
D by deleting all directions from the arcs of D (equivalently, replacing
each arc (u, v) by the edge uv) and deleting an edge from a pair of
multiple edges if multiple edges should be produced.

A digraph D with V (D) = {u, v, w, x} and arc set E(D) = {(u, v), (v, u),
(v, w), (x, v), (x, w)} is shown in Figure 9.2 along with the underlying graph
G of D.
Figure 9.2. A digraph D and its underlying graph G.

Definitions. If (u, v) is an arc of D, then we say that u is adjacent
to v, and v is adjacent from u. Further, the arc (u, v) is incident from
u and incident to v. The outdegree, denoted od(v), of a vertex v in D
is the number of vertices adjacent from v, and the indegree, denoted
id(v), of v is the number of vertices adjacent to v. The degree, denoted
dD(v) or simply d(v) if the digraph D is clear from context, of v is
defined by d(v) = od(v) + id(v).

The outdegrees, indegrees, and degrees of the vertices of the digraph D
of Figure 9.2 are given below.

    Vertex   Outdegree   Indegree   Degree
    u        1           1          2
    v        2           2          4
    w        0           2          2
    x        2           0          2

Definitions. For a vertex v in a digraph D, we define the out-neighborhood
of v by the set N+(v) = {u ∈ V(D) | (v, u) ∈ E(D)}. The
in-neighborhood of v is defined by N−(v) = {u ∈ V(D) | (u, v) ∈ E(D)}.
Hence, od(v) = |N+(v)|, while id(v) = |N−(v)|.

The First Theorem of Digraph Theory is analogous to the First Theorem
of Graph Theory.

Theorem 9.1 If D is a digraph of size m with vertex set V, then

    Σ_{v∈V} od(v) = Σ_{v∈V} id(v) = m.

Proof. When the outdegrees of the vertices are summed, each arc is counted
exactly once, since every arc is incident from exactly one vertex. Similarly,
when the indegrees are summed, each arc is counted just once since every
arc is incident to exactly one vertex. □
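
For the digraph D of Figure 9.2, whose arc set is listed in the text, the
neighborhoods, degrees and the identity of Theorem 9.1 can be checked with a
few lines of Python. This is only an illustrative sketch; the variable names
are assumptions made here.

```python
# Vertex set and arc set of the digraph D of Figure 9.2, as given in the text.
vertices = ["u", "v", "w", "x"]
arcs = [("u", "v"), ("v", "u"), ("v", "w"), ("x", "v"), ("x", "w")]

out_nbrs = {a: set() for a in vertices}  # out-neighborhoods N+(v)
in_nbrs = {a: set() for a in vertices}   # in-neighborhoods N-(v)
for tail, head in arcs:
    out_nbrs[tail].add(head)
    in_nbrs[head].add(tail)

od = {a: len(out_nbrs[a]) for a in vertices}    # outdegrees
idg = {a: len(in_nbrs[a]) for a in vertices}    # indegrees
degree = {a: od[a] + idg[a] for a in vertices}

m = len(arcs)  # the size of D
sum_od = sum(od.values())
sum_id = sum(idg.values())
```

Both sums equal m = 5, as Theorem 9.1 predicts, and the dictionaries
reproduce the table of outdegrees, indegrees and degrees above.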

Definitions. Let u and v be two vertices of a digraph D. A u-v walk
in D is a finite, alternating sequence

    u = u0, a1, u1, a2, ..., uk−1, ak, uk = v

of vertices and arcs that begins with the vertex u and ends with the
vertex v and such that ai = (ui−1, ui) for i = 1, 2, ..., k. The number of
arcs k in the walk is called the length of the walk. The concepts of trail,
path, cycle, and circuit in digraphs are defined analogously to those
in graphs, except that in digraphs, we always proceed in the direction
of the arcs. Note that cycles of length 2 are possible in digraphs.

Definitions. A term that is unique to digraph theory is that of a
semiwalk. Let u and v be two vertices of a digraph D. A u-v semiwalk
in D is a finite, alternating sequence

    u = u0, a1, u1, a2, ..., uk−1, ak, uk = v

of vertices and arcs that begins with the vertex u and ends with the
vertex v and such that either ai = (ui−1, ui) or ai = (ui, ui−1) for i =
1, 2, ..., k. If ai = (ui−1, ui), then we call ai a forward arc; otherwise,
we call ai a backward arc. The number of arcs k in the semiwalk is called
the length of the semiwalk. If the vertices u0, u1, ..., uk are distinct,
then the u-v semiwalk is called a u-v semipath.

9.3 An introduction to networks


Before we attempt to solve the problem stated in Section 9.1, we make several
fairly natural restrictions. First, no pipe can carry more oil than its capacity
allows and, secondly, all intermediate stations pump out as much oil as they
receive and oil cannot accumulate at these stations. We now formulate our
problem in more mathematical terms.

Definitions. A network N is a digraph D together with a function c,
called the capacity function on N, that assigns a nonnegative capacity
c(e) to each arc e, and two distinguished vertices s and t called the
source and sink, respectively. The digraph D is called the underlying
digraph of the network N.

Intuitively, the capacity c(x, y) of an arc (x, y) may be thought of as the
maximum amount of some material that can be transported from x to y per
unit time. For example, the capacity of the arc (x, y) may represent the
number of seats available on a direct flight from city x to city y in some
airline system. On the other hand, this capacity might be the capacity of a
pipeline from city x to city y in an oil network. The problem, in general, is
to maximize the "flow" from the source s to the sink t without exceeding the
capacities of the arcs.
A network may be represented by drawing its underlying digraph D and
labelling each arc of D with its capacity. For example, Figure 9.1 shows a
network with c(u, v) = 5 and c(x, y) = 2.

Definitions. Let N be a network with underlying digraph D, source
s, sink t and capacity function c. A flow f in N is an integer-valued
function on E(D) that satisfies the

1. (capacity constraint) 0 ≤ f(e) ≤ c(e) for each arc e ∈ E(D),

and the

2. (conservation constraint) f+(v) = f−(v) for every vertex v ∈
V(D) − {s, t},

where f+(v) denotes the total flow on arcs exiting v and f−(v) denotes
the total flow on arcs entering v, i.e.,

    f+(v) = Σ_{w∈N+(v)} f(v, w)   and   f−(v) = Σ_{w∈N−(v)} f(w, v).

For a vertex v ∈ V(D), the net flow out of v is defined as f+(v) − f−(v),
while the net flow into v is defined as f−(v) − f+(v). The value f(N)
of a flow in N is the net flow

    f(N) = f+(s) − f−(s)

out of the source s. A maximum flow is a flow of maximum value.

A flow is a mapping that describes the movement (or flow) of material
along the arcs of the network, while the capacity is a mapping that describes
the maximum amount that can move along the arcs. The capacity constraint
states that the flow in an arc can never exceed its capacity, and the
conservation constraint states that all that flows into a vertex, other than
the source s and sink t, also flows out of that vertex.

Examples of a flow. The zero flow assigns flow 0 to each arc in the network.
Figure 9.3 shows a flow f in a network N . Associated with each arc, we write
an ordered pair where the first number denotes the capacity of the arc and
the second number the flow in the arc. For example, f (u, v) = 1 while
c(u, v) = 5. The value of this flow is f(N) = f+(s) − f−(s) = 8 − 0 = 8.
Figure 9.3.
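
The two constraints and the value of a flow can be checked mechanically. The
Python sketch below encodes the network and flow of Figure 9.3 as
dictionaries keyed by arcs. It is illustrative only: the function names are
assumptions, and the assignment of the labels (6, 3) to the arc (u, y) and
(4, 2) to the arc (x, v) is inferred here from the conservation constraint at
u and x rather than read directly from the diagram.

```python
# Capacities c(e) and flows f(e) for the network of Figure 9.3.
# The entries for (u, y) and (x, v) are an inference (see the text above).
capacity = {
    ("s", "u"): 4, ("s", "x"): 5, ("u", "v"): 5, ("u", "y"): 6,
    ("x", "v"): 4, ("x", "y"): 2, ("v", "t"): 3, ("y", "t"): 8,
}
flow = {
    ("s", "u"): 4, ("s", "x"): 4, ("u", "v"): 1, ("u", "y"): 3,
    ("x", "v"): 2, ("x", "y"): 2, ("v", "t"): 3, ("y", "t"): 5,
}


def out_flow(f, v):
    """Total flow f+(v) on arcs exiting v."""
    return sum(val for (a, b), val in f.items() if a == v)


def in_flow(f, v):
    """Total flow f-(v) on arcs entering v."""
    return sum(val for (a, b), val in f.items() if b == v)


def is_flow(f, c, s, t):
    """Check the capacity and conservation constraints."""
    vertices = {x for arc in c for x in arc}
    capacity_ok = all(0 <= f[e] <= c[e] for e in c)
    conservation_ok = all(
        out_flow(f, v) == in_flow(f, v) for v in vertices - {s, t}
    )
    return capacity_ok and conservation_ok


valid = is_flow(flow, capacity, "s", "t")
value = out_flow(flow, "s") - in_flow(flow, "s")  # f(N)
```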

Before presenting our first result on networks, we introduce some notation.
Let D = (V, E) be a digraph, and let X, Y ⊆ V with X, Y ≠ ∅. We
write

    (X, Y) = {(x, y) ∈ E | x ∈ X, y ∈ Y}

to denote the set of all arcs directed from some vertex in X to some vertex
in Y. For example, in Figure 9.3, if X = {u, v, x} and Y = {x, y, t}, then
(X, Y) = {(u, y), (v, t), (x, y)}. For a flow f and a capacity function c in a
network N, we define

    f(X, Y) = Σ_{e∈(X,Y)} f(e)   and   c(X, Y) = Σ_{e∈(X,Y)} c(e),

where f(X, Y) = c(X, Y) = 0 if (X, Y) = ∅. For a subset X ⊆ V, we denote
the complement V \ X of X by X̄.

Definitions. Let $N$ be a network with underlying digraph $D = (V, E)$,
source $s$, sink $t$, capacity function $c$ and flow $f$. If $S \subseteq V$ is such that
$s \in S$ and $t \in \overline{S}$, we call the pair $(S, \overline{S})$ a cut in $N$, and we call $c(S, \overline{S})$
the capacity of the cut. A minimum cut is a cut of minimum capacity. We
call $f(S, \overline{S})$ the flow from $S$ to $\overline{S}$ and we call $f(\overline{S}, S)$ the flow from $\overline{S}$
to $S$.

Example of a cut. Suppose $f$ is the flow in the network of Figure 9.3
and that $S = \{s, v, y\}$. Then $\overline{S} = \{u, x, t\}$ and the cut $(S, \overline{S})$ is given by
$(S, \overline{S}) = \{(s, u), (s, x), (v, t), (y, t)\}$. Thus, $c(S, \overline{S}) = 4 + 5 + 3 + 8 = 20$ and
$f(S, \overline{S}) = 4 + 4 + 3 + 5 = 16$.
Further, $(\overline{S}, S) = \{(u, v), (u, y), (x, v), (x, y)\}$, and so $f(\overline{S}, S) = 1 + 3 + 2 + 2 = 8$.
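The arc sets and sums in this example can be reproduced with a few lines of Python. This is only a sketch: the dictionary `ARCS` (arc to (capacity, flow), read off Figure 9.3) and the helper names `arcs_between`, `f` and `c` are assumptions made for illustration.

```python
# Arcs of Figure 9.3, stored as arc -> (capacity, flow).
ARCS = {
    ("s", "u"): (4, 4), ("s", "x"): (5, 4),
    ("u", "v"): (5, 1), ("u", "y"): (6, 3),
    ("x", "v"): (4, 2), ("x", "y"): (2, 2),
    ("v", "t"): (3, 3), ("y", "t"): (8, 5),
}
VERTICES = {v for arc in ARCS for v in arc}

def arcs_between(X, Y):
    """(X, Y): all arcs directed from a vertex of X to a vertex of Y."""
    return {(a, b) for (a, b) in ARCS if a in X and b in Y}

def f(X, Y):
    """f(X, Y): total flow on the arcs of (X, Y); 0 if (X, Y) is empty."""
    return sum(ARCS[e][1] for e in arcs_between(X, Y))

def c(X, Y):
    """c(X, Y): total capacity of the arcs of (X, Y); 0 if (X, Y) is empty."""
    return sum(ARCS[e][0] for e in arcs_between(X, Y))

S = {"s", "v", "y"}
S_bar = VERTICES - S          # the complement {u, x, t}

print(sorted(arcs_between(S, S_bar)))
# [('s', 'u'), ('s', 'x'), ('v', 't'), ('y', 't')]
print(c(S, S_bar), f(S, S_bar), f(S_bar, S))  # 20 16 8
```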
We are now in a position to present our first result on networks.

Theorem 9.2 Let $N$ be a network and $f$ a flow in $N$. If $(S, \overline{S})$ is a cut of
$N$, then
$$f(N) = f(S, \overline{S}) - f(\overline{S}, S).$$

Proof. By definition, $f(N) = f^+(s) - f^-(s)$. By the flow conservation
constraint (2), we have $f^+(v) = f^-(v)$ for all $v \in S - \{s\}$. Hence we can
write
$$f(N) = \sum_{v \in S} \bigl( f^+(v) - f^-(v) \bigr).$$
Since $S \cup \overline{S} = V(D)$ where $D$ is the underlying digraph of $N$,
$$\sum_{v \in S} f^+(v) = \sum_{(v,u) \in (S,S)} f(v, u) + \sum_{(v,u) \in (S,\overline{S})} f(v, u) = f(S, S) + f(S, \overline{S}),$$
while
$$\sum_{v \in S} f^-(v) = \sum_{(u,v) \in (S,S)} f(u, v) + \sum_{(u,v) \in (\overline{S},S)} f(u, v) = f(S, S) + f(\overline{S}, S).$$
Thus,
$$f(N) = \sum_{v \in S} f^+(v) - \sum_{v \in S} f^-(v) = f(S, \overline{S}) - f(\overline{S}, S). \qquad \Box$$
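Theorem 9.2 can be sanity-checked on a small network by brute force. The sketch below, under the assumption that the flow of Figure 9.3 is stored as arc to (capacity, flow), enumerates all 16 cuts and confirms that the net flow across each one equals the value of the flow; it also checks that this value never exceeds the capacity of the cut. The names `ARCS`, `INNER` and `total` are illustrative only.

```python
from itertools import combinations

# Arcs of Figure 9.3, stored as arc -> (capacity, flow).
ARCS = {
    ("s", "u"): (4, 4), ("s", "x"): (5, 4),
    ("u", "v"): (5, 1), ("u", "y"): (6, 3),
    ("x", "v"): (4, 2), ("x", "y"): (2, 2),
    ("v", "t"): (3, 3), ("y", "t"): (8, 5),
}
INNER = ["u", "v", "x", "y"]            # vertices other than s and t
VERTICES = {"s", "t", *INNER}

def total(X, Y, index):
    """Sum capacities (index 0) or flows (index 1) over the arcs from X to Y."""
    return sum(lab[index] for (a, b), lab in ARCS.items() if a in X and b in Y)

# f(N) = f^+(s) - f^-(s)
value = total({"s"}, VERTICES, 1) - total(VERTICES, {"s"}, 1)

cuts = []
for k in range(len(INNER) + 1):
    for extra in combinations(INNER, k):
        S = {"s", *extra}               # s in S, t in the complement
        S_bar = VERTICES - S
        net = total(S, S_bar, 1) - total(S_bar, S, 1)
        cap = total(S, S_bar, 0)
        assert net == value             # the identity of Theorem 9.2
        assert value <= cap             # value is bounded by every cut capacity
        cuts.append((cap, sorted(S)))

print(len(cuts), value, min(cuts))      # 16 8 (9, ['s'])
```

The smallest cut capacity found is 9, so the flow of value 8 in Figure 9.3 leaves room for improvement.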

As an immediate consequence of Theorem 9.2, we have the following
corollaries.

Corollary 9.3 If $f$ is a flow in a network $N$, then the value of the flow is
the net flow into the sink, i.e.,
$$f(N) = f^-(t) - f^+(t).$$

Proof. Let $D$ be the underlying digraph of $N$ and let $S = V(D) - \{t\}$. Since
$\overline{S} = \{t\}$,
$$f(S, \overline{S}) = \sum_{w \in N^-(t)} f(w, t) = f^-(t),$$
while
$$f(\overline{S}, S) = \sum_{w \in N^+(t)} f(t, w) = f^+(t).$$
Thus, by Theorem 9.2, $f(N) = f(S, \overline{S}) - f(\overline{S}, S) = f^-(t) - f^+(t)$. □

Corollary 9.4 Let $f$ be a flow in a network $N$ and let $(S, \overline{S})$ be a cut of $N$.
Then,
$$f(N) \le c(S, \overline{S}).$$

Proof. By Theorem 9.2, the value of the flow $f(N)$ equals the net flow out
of $S$. Thus, $f(N) = f(S, \overline{S}) - f(\overline{S}, S) \le f(S, \overline{S})$ since $f(\overline{S}, S) \ge 0$. However,
by the capacity constraint (1), $f(S, \overline{S}) \le c(S, \overline{S})$, whence $f(N) \le c(S, \overline{S})$. □

Corollary 9.5 Let $f$ be a flow in a network $N$ and let $(S, \overline{S})$ be a cut of $N$.
If $f(N) = c(S, \overline{S})$, then $f$ is a maximum flow and $(S, \overline{S})$ is a minimum cut.

Proof. Let $f^*$ be a maximum flow in $N$, and let $(X, \overline{X})$ be a minimum cut of
$N$. By Corollary 9.4, $f^*(N) \le c(X, \overline{X})$. However, $f^*$ is a maximum flow, and
so $f(N) \le f^*(N)$, while $(X, \overline{X})$ is a minimum cut, and so $c(X, \overline{X}) \le c(S, \overline{S})$.
Thus, $c(S, \overline{S}) = f(N) \le f^*(N) \le c(X, \overline{X}) \le c(S, \overline{S})$, and so we must have
equality throughout this inequality chain. In particular, $f(N) = f^*(N)$ and
$c(X, \overline{X}) = c(S, \overline{S})$. Hence, $f$ is a maximum flow and $(S, \overline{S})$ is a minimum
cut. □
Example. Consider the network $N$ shown in Figure 9.4, where as before
the first number associated with an arc $a$ is its capacity $c(a)$ and the second
number its flow $f(a)$. The value of this flow is $f(N) = f^+(s) - f^-(s) = 10$. If
$S = \{s, x\}$, then $(S, \overline{S})$ is a cut in $N$ and $c(S, \overline{S}) = c(s, u) + c(x, v) + c(x, y) =
4 + 4 + 2 = 10$. Thus, by Corollary 9.5, $f$ is a maximum flow and $(S, \overline{S})$ is a
minimum cut.
[Figure: network on the vertices s, u, v, x, y, t; each arc is labelled "capacity, flow".
  (s, u): 4, 4    (s, x): 8, 6    (u, v): 5, 1    (u, y): 6, 3
  (x, v): 4, 4    (x, y): 2, 2    (v, t): 5, 5    (y, t): 8, 5]

Figure 9.4.
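Corollary 9.5 gives a simple optimality certificate: a flow together with a cut whose capacity equals the value of the flow. A short Python sketch of that test, assuming the data of Figure 9.4 is stored as arc to (capacity, flow) and using the hypothetical helper name `certifies_optimality`, might look as follows.

```python
# Arcs of Figure 9.4, stored as arc -> (capacity, flow).
ARCS = {
    ("s", "u"): (4, 4), ("s", "x"): (8, 6),
    ("u", "v"): (5, 1), ("u", "y"): (6, 3),
    ("x", "v"): (4, 4), ("x", "y"): (2, 2),
    ("v", "t"): (5, 5), ("y", "t"): (8, 5),
}

def certifies_optimality(arcs, s, S):
    """True if f(N) equals the capacity of the cut defined by S (Corollary 9.5)."""
    value = (sum(f for (a, _), (_, f) in arcs.items() if a == s)
             - sum(f for (_, b), (_, f) in arcs.items() if b == s))
    capacity = sum(c for (a, b), (c, _) in arcs.items()
                   if a in S and b not in S)
    return value == capacity

print(certifies_optimality(ARCS, "s", {"s", "x"}))  # True
print(certifies_optimality(ARCS, "s", {"s"}))       # False
```

A `False` answer for a particular cut does not mean the flow fails to be maximum; it only means that this cut does not certify it.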

Exercises
9.1 Let $N$ be the network shown in Figure 9.1, where each arc is labelled
with its capacity. A function $f$ is defined on the arcs as follows:
    $f(s, u) = 2$   $f(s, x) = 2$   $f(u, v) = 1$   $f(u, y) = 2$
    $f(v, t) = 2$   $f(x, v) = 1$   $f(x, y) = 1$   $f(y, t) = 3$
Is $f$ a flow? Explain.
9.2 For the network shown below, associated with each arc is an ordered
pair where the first number denotes the capacity of the arc and the
second number the flow in the arc. Determine the values of the flows
$a$, $b$ and $c$.

[Figure: network on the vertices s, u, v, x, y, t; each arc is labelled "capacity, flow".
  (s, u): 4, 3    (s, x): 5, 3    (u, v): 5, a    (u, y): 6, b
  (x, v): 4, 2    (x, y): 2, 1    (v, t): 3, 3    (y, t): 8, c]

9.3 Consider the network $N$ shown below, where as before the first number
associated with an arc $a$ is its capacity $c(a)$ and the second number its
flow $f(a)$. Use Corollary 9.5 to show that the given flow is a maximum
flow and find the corresponding minimum cut.
[Figure: network on the vertices s, u, v, x, y, t; each arc is labelled "capacity, flow".
  (s, u): 4, 4    (s, x): 9, 5    (u, v): 5, 0    (u, y): 6, 4
  (x, v): 4, 3    (x, y): 2, 2    (v, t): 3, 3    (y, t): 8, 6]

9.4 The max-flow min-cut theorem


It follows from Corollary 9.4 that the value of a flow in a network is
never larger than the smallest capacity of a cut. Our aim in this subsection is
to present the so-called max-flow min-cut theorem due to Ford and Fulkerson
(1956), which shows that this upper bound is always attained by some flow.
The proof that we present is algorithmic in nature and is based on the idea
of improving the current flow along some $s$–$t$ semipath that is not being
"optimally" used. Once this is done, we repeat the process in the network
with its modified flow until we can find no such $s$–$t$ semipath whose flow can
be improved. This technique has come to be known as the augmenting path
technique.

Definition. Let $N$ be a network with source $s$, sink $t$, capacity function
$c$ and flow $f$. Let $P$ be an $s$–$v$ semipath. If every forward arc $e$ on $P$ has
excess capacity (meaning $f(e) < c(e)$) and every backward arc $e$ on $P$
has nonzero flow (meaning $f(e) > 0$), then we call $P$ an $f$-unsaturated
$s$–$v$ semipath. If $v = t$, then we call $P$ an $f$-augmenting
path (even though it may not actually be a path).

For each arc $e$ of $P$, we define $\epsilon(e)$ to be $c(e) - f(e)$ if $e$ is a forward
arc and $f(e)$ if $e$ is a backward arc. We define the leeway of $P$ by
$\epsilon(P) = \min\{\epsilon(e)\}$, where the minimum is taken over all arcs $e$ on $P$.

Intuitively, if a forward arc $e$ on $P$ has excess capacity, then we can
increase the flow along $e$, while if a backward arc $e$ on $P$ has nonzero flow,
then we can "push back" flow along the arc $e$.

Example. If $f$ is the flow given in Figure 9.3, then

    $P : s, (s, x), x, (x, v), v, (u, v), u, (u, y), y, (y, t), t$

is an $f$-augmenting path, in which $(u, v)$ is the only backward arc. Further,
$\epsilon(s, x) = 1$, $\epsilon(x, v) = 2$, $\epsilon(u, v) = 1$,
$\epsilon(u, y) = 3$, and $\epsilon(y, t) = 3$. Thus, $\epsilon(P) = 1$.
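The leeway computation in this example is easy to mechanise. In the sketch below a semipath is described by its vertex sequence together with its arc sequence, and an arc counts as forward when it points in the direction of travel; this representation and the name `arc_leeways` are assumptions made for illustration.

```python
# Arcs of Figure 9.3, stored as arc -> (capacity, flow).
ARCS = {
    ("s", "u"): (4, 4), ("s", "x"): (5, 4),
    ("u", "v"): (5, 1), ("u", "y"): (6, 3),
    ("x", "v"): (4, 2), ("x", "y"): (2, 2),
    ("v", "t"): (3, 3), ("y", "t"): (8, 5),
}

def arc_leeways(arcs, vertices, path_arcs):
    """Leeway of each arc of a semipath given by its vertex and arc sequences."""
    leeways = []
    for prev, nxt, arc in zip(vertices, vertices[1:], path_arcs):
        c, f = arcs[arc]
        if arc == (prev, nxt):        # forward arc: spare capacity c(e) - f(e)
            leeways.append(c - f)
        elif arc == (nxt, prev):      # backward arc: flow f(e) that can be pushed back
            leeways.append(f)
        else:
            raise ValueError(f"{arc} does not join {prev} and {nxt}")
    return leeways

P_vertices = ["s", "x", "v", "u", "y", "t"]
P_arcs = [("s", "x"), ("x", "v"), ("u", "v"), ("u", "y"), ("y", "t")]

eps = arc_leeways(ARCS, P_vertices, P_arcs)
print(eps, min(eps))  # [1, 2, 1, 3, 3] 1
```

The semipath is f-augmenting exactly when every entry of the list is positive, and the leeway of P is the smallest entry.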

We are now in a position to present the max-flow min-cut theorem due
to Ford and Fulkerson.

Theorem 9.6 (Ford and Fulkerson, 1956) In every network, the value of a
maximum flow equals the capacity of a minimum cut.

Proof. Let $N$ be a network with underlying digraph $D$, source $s$, sink $t$, and
capacity function $c$. We define a sequence $f_0, f_1, \ldots$ of flows in $N$ of strictly
increasing value, i.e., with $f_0(N) < f_1(N) < f_2(N) < \cdots$, as follows. We
start with the zero flow, and so $f_0(e) = 0$ for all arcs $e$ of $D$ and $f_0(N) = 0$.
For each flow $f_i$, $i \ge 0$, we denote by $S_i$ the set of all vertices $v$ such that
there is an $f_i$-unsaturated $s$–$v$ semipath.
Suppose $t \in S_i$. Then there is an $f_i$-augmenting path $P$ in $N$. Let $f_{i+1}$ be
the function obtained from $f_i$ by increasing the flow by $\epsilon(P)$ along the forward
arcs of $P$, decreasing the flow by $\epsilon(P)$ along the backward arcs of $P$, and leaving
the flows on all remaining arcs unchanged. By the definition of the leeway $\epsilon(P)$,
we have $0 \le f_{i+1}(e) \le c(e)$ for every arc $e$, and so the capacity constraint (1) holds.
To verify the conservation constraint (2), we need only consider vertices of
$P$ since the flow along all arcs not on $P$ is unchanged. However, the net flow
out of each vertex $v$ on $P$ different from $s$ and $t$ remains 0 (irrespective of the
direction of the two arcs incident with $v$ on $P$). Hence, $f_{i+1}$ is indeed a flow
in $N$. Further, the net flow out of the source $s$ is $\epsilon(P)$ larger in $f_{i+1}$ than in
$f_i$, and so $f_{i+1}(N) = f_i(N) + \epsilon(P)$. Thus, $f_{i+1}(N) > f_i(N)$, as desired.
Since a flow is an integer-valued function on $E(D)$, $f_i(N) + 1 \le f_{i+1}(N)$
for all $i$. By Corollary 9.4, the value of any flow in $N$ is bounded above by
the capacity of any cut in $N$, and so our sequence $f_0, f_1, \ldots$ of flows in $N$
must terminate with some flow $f_n$. Hence in $f_n$, $t \notin S_n$. Let $S = S_n$. Since
$s \in S$ and $t \in \overline{S}$, the pair $(S, \overline{S})$ is a cut, which we now consider.
Suppose $(u, v) \in (S, \overline{S})$ and $f_n(u, v) < c(u, v)$. Then
by taking an $f_n$-unsaturated $s$–$u$ semipath, and then proceeding
to $v$ along the arc $(u, v)$, we produce an $f_n$-unsaturated $s$–$v$ semipath,
contradicting the fact that $v \in \overline{S}$. Hence if $(u, v) \in (S, \overline{S})$, then
$f_n(u, v) = c(u, v)$. Thus, $f_n(S, \overline{S}) = c(S, \overline{S})$. Further, if $(v, u) \in (\overline{S}, S)$, then
$f_n(v, u) = 0$, for otherwise, if $f_n(v, u) > 0$, we could extend an $f_n$-unsaturated
$s$–$u$ semipath along the backward arc $(v, u)$ to an $f_n$-unsaturated $s$–$v$
semipath, a contradiction. Hence, $f_n(\overline{S}, S) = 0$. Consequently,
by Theorem 9.2,
$$f_n(N) = f_n(S, \overline{S}) - f_n(\overline{S}, S) = c(S, \overline{S}).$$
Thus, by Corollary 9.5, $f_n$ is a maximum flow and $(S, \overline{S})$ is a minimum cut. □
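The augmentation step used in the proof can be illustrated on the flow of Figure 9.3 and the $f$-augmenting path $s, x, v, u, y, t$. The sketch below is one possible way to write the step in Python; the dictionaries `CAPACITY` and `FLOW` and the helpers `augment` and `net_out` are hypothetical names chosen for this illustration.

```python
# Capacities and flow of Figure 9.3, stored per arc.
CAPACITY = {("s", "u"): 4, ("s", "x"): 5, ("u", "v"): 5, ("u", "y"): 6,
            ("x", "v"): 4, ("x", "y"): 2, ("v", "t"): 3, ("y", "t"): 8}
FLOW = {("s", "u"): 4, ("s", "x"): 4, ("u", "v"): 1, ("u", "y"): 3,
        ("x", "v"): 2, ("x", "y"): 2, ("v", "t"): 3, ("y", "t"): 5}

def augment(capacity, flow, vertices, path_arcs):
    """Return (new flow, leeway) after augmenting along the given semipath."""
    directed = []
    for prev, nxt, arc in zip(vertices, vertices[1:], path_arcs):
        assert arc in ((prev, nxt), (nxt, prev)), "arc must join consecutive vertices"
        directed.append((arc, arc == (prev, nxt)))      # True marks a forward arc
    eps = min(capacity[a] - flow[a] if fwd else flow[a] for a, fwd in directed)
    new_flow = dict(flow)
    for a, fwd in directed:
        new_flow[a] += eps if fwd else -eps             # push forward or push back
    return new_flow, eps

def net_out(flow, v):
    """Net flow out of v, i.e. f^+(v) - f^-(v)."""
    return (sum(x for (a, _), x in flow.items() if a == v)
            - sum(x for (_, b), x in flow.items() if b == v))

new_flow, eps = augment(
    CAPACITY, FLOW,
    ["s", "x", "v", "u", "y", "t"],
    [("s", "x"), ("x", "v"), ("u", "v"), ("u", "y"), ("y", "t")],
)
print(eps, net_out(FLOW, "s"), net_out(new_flow, "s"))           # 1 8 9
print(all(net_out(new_flow, v) == 0 for v in "uvxy"))            # True
print(all(0 <= new_flow[a] <= CAPACITY[a] for a in CAPACITY))    # True
```

The value rises by exactly the leeway, and both constraints still hold, as the proof asserts.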

9.5 The max-flow min-cut algorithm


As remarked earlier, the proof of the max-flow min-cut theorem due to Ford
and Fulkerson is essentially algorithmic in nature since it searches for an
augmenting path to increase the flow value. If it does not find such a path,
then the proof produces a minimum cut and maximum flow.
In this subsection, we present an algorithm due to Edmonds and Karp
which gives a systematic method for finding an $f$-augmenting path in a
network $N$ with flow $f$, if such a path exists. We begin the procedure with
a given flow $f$ in $N$ (perhaps the zero flow). As we attempt to find an
$f$-augmenting path, we label the vertices of $N$ as we examine them. Initially,
we label the source $s$, and then we label every vertex $v$ for which we can find
an $f$-unsaturated $s$–$v$ semipath $P$. The label assigned to $v$ is an ordered pair.
If $u$ is the vertex immediately preceding $v$ on $P$, then the first component
of the label is $u^+$ or $u^-$ depending on whether the arc preceding $v$ on $P$ is
a forward arc $(u, v)$ or a backward arc $(v, u)$. The second component of the
label is a positive integer reflecting the potential change in $f$ along $P$. When
we finally label the sink $t$, we have found an $f$-augmenting path. This path
is then used to increase the total flow in $N$, and the process is begun again.
If at some point we cannot find any vertex to label, then no $f$-augmenting
path exists and, as shown in the proof of Theorem 9.6, the present value of
the flow is a maximum.

Algorithm 9.7 (Edmonds and Karp) Given a network $N$ with underlying
digraph $D = (V, E)$, source $s$, sink $t$, and capacity function $c$.

1. Assign values of an initial flow $f$ to the arcs of $D$.

2. Label $s$ with $(-, \infty)$ and add $s$ to $L$, the list of labelled and unscanned
   vertices.

3. If $L$ is empty, then stop. Otherwise, select and remove the first element
   of $L$, say $u$, with label $(x^+, \epsilon(u))$ or $(x^-, \epsilon(u))$.

   (a) To all vertices $v$ that are unlabelled and such that $(u, v) \in E$ and
       $f(u, v) < c(u, v)$, assign the label $(u^+, \epsilon(v))$, where
       $$\epsilon(v) = \min\{\epsilon(u),\ c(u, v) - f(u, v)\},$$
       and add $v$ to the end of $L$.

   (b) To all vertices $v$ that are unlabelled and such that $(v, u) \in E$ and
       $f(v, u) > 0$, assign the label $(u^-, \epsilon(v))$, where
       $$\epsilon(v) = \min\{\epsilon(u),\ f(v, u)\},$$
       and add $v$ to the end of $L$.

4. If $t$ has been labelled, go to Step 5; otherwise, go to Step 3.

5. The labels describe an $f$-augmenting path
   $$s = u_0, a_1, u_1, a_2, \ldots, u_{n-1}, a_n, u_n = t.$$
   For $i = 1, \ldots, n$, replace $f(a_i)$ by $f(a_i) + \epsilon(t)$ if $a_i$ is a forward arc and
   by $f(a_i) - \epsilon(t)$ if $a_i$ is a backward arc.

6. Discard all labels, remove all vertices from $L$, and go to Step 2.

As shown in the proof of the max-flow min-cut theorem due to Ford and
Fulkerson (Theorem 9.6), Algorithm 9.7 terminates with a maximum flow $f$
in $N$. Furthermore, if $S$ is the set of labelled vertices upon termination, then
$(S, \overline{S})$ is a minimum cut.
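The steps of Algorithm 9.7 can be sketched in Python as follows. This is a minimal illustration rather than a tuned implementation: the network is assumed to be given as a dictionary from arcs to integer capacities, a label is stored as a triple (predecessor, sign, epsilon), and the function name `edmonds_karp` is our own choice. The capacities in the usage lines are those of the network in Figure 9.5.

```python
from collections import deque
from math import inf

def edmonds_karp(capacity, s, t, flow=None):
    """Return (maximum flow, set S of vertices labelled at termination)."""
    # Step 1: start from the given flow, or from the zero flow.
    f = dict.fromkeys(capacity, 0) if flow is None else dict(flow)
    while True:
        # Step 2: label s with (-, infinity); L holds labelled, unscanned vertices.
        label = {s: (None, None, inf)}
        L = deque([s])
        # Steps 3 and 4: scan on a first-labelled first-scanned basis.
        while L and t not in label:
            u = L.popleft()
            eps_u = label[u][2]
            for (a, b), c in capacity.items():      # Step 3(a): forward arcs (u, v)
                if a == u and b not in label and f[a, b] < c:
                    label[b] = (u, "+", min(eps_u, c - f[a, b]))
                    L.append(b)
            for (a, b) in capacity:                 # Step 3(b): backward arcs (v, u)
                if b == u and a not in label and f[a, b] > 0:
                    label[a] = (u, "-", min(eps_u, f[a, b]))
                    L.append(a)
        if t not in label:                          # L is empty: no augmenting path
            return f, set(label)
        # Step 5: trace the labels back from t and change the flow by eps(t).
        eps, v = label[t][2], t
        while v != s:
            u, sign, _ = label[v]
            if sign == "+":
                f[u, v] += eps
            else:
                f[v, u] -= eps
            v = u
        # Step 6: the labels are discarded when the outer loop restarts.

CAPACITY = {("s", "u"): 4, ("s", "x"): 5, ("u", "v"): 5, ("u", "y"): 6,
            ("x", "v"): 4, ("x", "y"): 2, ("v", "t"): 3, ("y", "t"): 8}
max_flow, S = edmonds_karp(CAPACITY, "s", "t")
value = sum(x for (a, _), x in max_flow.items() if a == "s")
print(value, sorted(S))  # 9 ['s']
```

Scanning every arc for each vertex keeps the sketch short; an adjacency-list representation would avoid the repeated passes over `capacity`.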

Example. In this example, we shall use the Edmonds-Karp Algorithm to
find a maximum flow and minimum cut for the network shown in Figure 9.1.
The labels on each arc $a$ are the capacity $c(a)$ of $a$ and the flow $f(a)$ in $a$,
respectively. We begin with the zero flow shown in Figure 9.5.
[Figure: network on the vertices s, u, v, x, y, t; each arc is labelled "capacity, flow".
  (s, u): 4, 0    (s, x): 5, 0    (u, v): 5, 0    (u, y): 6, 0
  (x, v): 4, 0    (x, y): 2, 0    (v, t): 3, 0    (y, t): 8, 0]

Figure 9.5.

Initially, $s$ is labelled $(-, \infty)$, and $L$ consists only of $s$. As the
algorithm proceeds through Step 3 for the first time, we remove $s$ from $L$.
Since $u$ is unlabelled, and $(s, u) \in E$ satisfies $f(s, u) < c(s, u)$, we assign
the label $(s^+, 4)$ to the vertex $u$, and add $u$ to the end of $L$. (Note that
$\epsilon(u) = \min\{\epsilon(s), c(s, u) - f(s, u)\} = \min\{\infty, 4\} = 4$.) Furthermore, since $x$
is unlabelled, and $(s, x) \in E$ satisfies $f(s, x) < c(s, x)$, we assign the label
$(s^+, 5)$ to the vertex $x$, and add $x$ to the end of $L$. At this stage, $L = \{u, x\}$.
Since $t$ has not been labelled, we return to Step 3. We remove the first element
of $L$, namely $u$. We then assign the label $(u^+, 4)$ to the vertex $v$, and add $v$ to
the end of $L$. (Note that $\epsilon(v) = \min\{\epsilon(u), c(u, v) - f(u, v)\} = \min\{4, 5\} = 4$.)
We then assign to $y$ the label $(u^+, 4)$, and add $y$ to the end of $L$. At this
stage, $L = \{x, v, y\}$. Since $t$ has not been labelled, we return to Step 3.
We remove the first element of $L$, namely $x$. Since there are no unlabelled
vertices that can be labelled from $x$, we continue through Step 3
again (with $L = \{v, y\}$). We remove the first element of $L$, namely $v$,
and then assign the label $(v^+, 3)$ to the vertex $t$, and add $t$ to the end of
$L$. (Note that $\epsilon(t) = \min\{\epsilon(v), c(v, t) - f(v, t)\} = \min\{4, 3\} = 3$.) We
now reach Step 4 and proceed to Step 5 to obtain the $f$-augmenting path
$P : s, (s, u), u, (u, v), v, (v, t), t$. (See Figure 9.6.)
[Figure: the network of Figure 9.5 with the zero flow; each arc is labelled "capacity, flow".
  (s, u): 4, 0    (s, x): 5, 0    (u, v): 5, 0    (u, y): 6, 0
  (x, v): 4, 0    (x, y): 2, 0    (v, t): 3, 0    (y, t): 8, 0
 Vertex labels: s: (−, ∞)   u: (s+, 4)   x: (s+, 5)   v: (u+, 4)   y: (u+, 4)   t: (v+, 3)]

Figure 9.6.

We then increase the flow by $\epsilon(t) = 3$ along the forward arcs of $P$ (here $P$ has
no backward arcs, along which the flow would be decreased by $\epsilon(t) = 3$) and
leave the flows on all remaining arcs unchanged. The resulting flow is shown
in Figure 9.7.
[Figure: network on the vertices s, u, v, x, y, t; each arc is labelled "capacity, flow".
  (s, u): 4, 3    (s, x): 5, 0    (u, v): 5, 3    (u, y): 6, 0
  (x, v): 4, 0    (x, y): 2, 0    (v, t): 3, 3    (y, t): 8, 0
 Vertex labels: s: (−, ∞)   u: (s+, 1)   x: (s+, 5)   v: (u+, 1)   y: (u+, 1)   t: (y+, 1)]

Figure 9.7.

Proceeding through Step 2 of the Edmonds-Karp algorithm, we assign to
the vertices the labels shown in Figure 9.7. Using the resulting $f$-augmenting
path $s, (s, u), u, (u, y), y, (y, t), t$ with leeway $\epsilon(t) = 1$, we increase the flow
in each of the arcs in this path by $\epsilon(t) = 1$. The resulting flow $f$ is shown in
Figure 9.8.
[Figure: network on the vertices s, u, v, x, y, t; each arc is labelled "capacity, flow".
  (s, u): 4, 4    (s, x): 5, 0    (u, v): 5, 3    (u, y): 6, 1
  (x, v): 4, 0    (x, y): 2, 0    (v, t): 3, 3    (y, t): 8, 1
 Vertex labels: s: (−, ∞)   x: (s+, 5)   v: (x+, 4)   y: (x+, 2)   u: (v−, 3)   t: (y+, 2)]

Figure 9.8.

Proceeding through Step 2 of the Edmonds-Karp algorithm, we assign to
the vertices the labels shown in Figure 9.8. Using the resulting $f$-augmenting
path $s, (s, x), x, (x, y), y, (y, t), t$ with leeway $\epsilon(t) = 2$, we increase the flow
in each of the arcs in this path by $\epsilon(t) = 2$. The resulting flow $f$ is shown in
Figure 9.9.
[Figure: network on the vertices s, u, v, x, y, t; each arc is labelled "capacity, flow".
  (s, u): 4, 4    (s, x): 5, 2    (u, v): 5, 3    (u, y): 6, 1
  (x, v): 4, 0    (x, y): 2, 2    (v, t): 3, 3    (y, t): 8, 3
 Vertex labels: s: (−, ∞)   x: (s+, 3)   v: (x+, 3)   u: (v−, 3)   y: (u+, 3)   t: (y+, 3)]

Figure 9.9.

Proceeding through Step 2 of the Edmonds-Karp algorithm, we assign to
the vertices the labels shown in Figure 9.9. Using the resulting $f$-augmenting
path $s, (s, x), x, (x, v), v, (u, v), u, (u, y), y, (y, t), t$ with leeway $\epsilon(t) = 3$, we
increase the flow in each of the forward arcs in this path by $\epsilon(t) = 3$ and
decrease the flow by $\epsilon(t) = 3$ along the backward arc $(u, v)$ in this path. The
resulting flow $f$ is shown in Figure 9.10.
Proceeding through Step 2 of the Edmonds-Karp algorithm, we are only
able to label the source $s$ and no other vertex (including the sink $t$). Thus the
given flow $f$ is a maximum flow, and the corresponding minimum
cut is $(S, \overline{S})$, where $S = \{s\}$. The value of the flow $f$ is
$f(N) = f(S, \overline{S}) - f(\overline{S}, S) = 9 - 0 = 9$, while $c(S, \overline{S}) = 4 + 5 = 9$.

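[Figure: network on the vertices s, u, v, x, y, t; each arc is labelled "capacity, flow".
  (s, u): 4, 4    (s, x): 5, 5    (u, v): 5, 0    (u, y): 6, 4
  (x, v): 4, 3    (x, y): 2, 2    (v, t): 3, 3    (y, t): 8, 6
 Vertex labels: s: (−, ∞); no other vertex is labelled.]

Figure 9.10.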

Remark: When a vertex receives a label in the Edmonds-Karp algorithm, it
is added to the bottom of the "labelled but unscanned" list $L$. These vertices
are scanned on a "first-labelled first-scanned" basis, which ensures that a
shortest $f$-augmenting path is selected. The importance of taking a shortest
$f$-augmenting path is illustrated by the network shown in Figure 9.11. If the
$f$-augmenting path is taken alternately along $s, (s, x), x, (x, y), y, (y, t), t$
and $s, (s, y), y, (x, y), x, (x, t), t$, then each step increases the value of the flow
by only 1, and we will need $2 \times 10^6$ steps before we
obtain a maximum flow of $2 \times 10^6$, whereas the Edmonds-Karp algorithm
obtains a maximum flow in two steps.

[Figure: network on the vertices s, x, y, t; each arc is labelled "capacity, flow".
  (s, x): 10^6, 0    (x, t): 10^6, 0    (x, y): 1, 0
  (s, y): 10^6, 0    (y, t): 10^6, 0]

Figure 9.11.
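The effect described in the remark can be simulated directly. In the sketch below the capacity $10^6$ is replaced by a parameter `M` (set to 1000 so that the run is quick), and an augmenting strategy is modelled simply as a fixed rotation of semipaths, each written as a list of (arc, is_forward) pairs; these conventions and the names `leeway`, `run`, `ALTERNATING` and `SHORTEST` are assumptions made for the illustration.

```python
def leeway(cap, f, path):
    """Leeway of a semipath given as a list of (arc, is_forward) pairs."""
    return min(cap[a] - f[a] if forward else f[a] for a, forward in path)

def run(M, paths):
    """Augment along the given semipaths in rotation, starting from the zero flow.

    Stops as soon as the next semipath in the rotation has leeway 0; for the two
    rotations below this happens exactly when the flow is maximum.
    Returns (number of augmentation steps, value of the final flow).
    """
    cap = {("s", "x"): M, ("x", "t"): M, ("x", "y"): 1,
           ("s", "y"): M, ("y", "t"): M}
    f = dict.fromkeys(cap, 0)
    steps = 0
    while True:
        path = paths[steps % len(paths)]
        eps = leeway(cap, f, path)
        if eps == 0:
            break
        for a, forward in path:
            f[a] += eps if forward else -eps
        steps += 1
    return steps, f["s", "x"] + f["s", "y"]

# The two semipaths of the remark: s, x, y, t and s, y, x, t (arc (x, y) backward).
ALTERNATING = [
    [(("s", "x"), True), (("x", "y"), True), (("y", "t"), True)],
    [(("s", "y"), True), (("x", "y"), False), (("x", "t"), True)],
]
# The shortest augmenting paths, as found by first-labelled first-scanned search.
SHORTEST = [
    [(("s", "x"), True), (("x", "t"), True)],
    [(("s", "y"), True), (("y", "t"), True)],
]

print(run(1000, ALTERNATING))  # (2000, 2000)
print(run(1000, SHORTEST))     # (2, 2000)
```

With `M` equal to $10^6$ the same rotation would need $2 \times 10^6$ steps, while the shortest paths still need only two.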

Exercises

9.4 For the networks shown below, associated with each arc is an ordered
pair where the first number denotes the capacity of the arc and the
second number the flow in the arc. Starting with the given flow, use
the Edmonds-Karp algorithm to find a maximum flow and a minimum
cut for these networks.

[Figure (i): network on the vertices s, u, v, x, y, t; each arc is labelled "capacity, flow".
  (s, u): 5, 2    (s, x): 6, 0    (u, v): 6, 2    (x, y): 1, 0
  (v, t): 2, 2    (y, t): 7, 0
 The two diagonal arcs are labelled 3, 0 and 8, 0.]

[Figure (ii): network with source s, sink t and internal vertices including u, x, y, w and z;
 each arc is labelled "capacity, flow". Arc labels appearing in the figure:
 5, 1;  6, 2;  9, 5;  6, 3;  9, 6;  6, 2;  6, 3;  4, 2;  6, 4;  6, 2;  6, 2;  7, 0;  6, 2.]

Suggestions for further reading


A more extensive treatment of network flow theory is given in Chartrand and
Oellermann [15], Diestel [16] and West [27]. Books on network flows have been
written by Ford and Fulkerson [20] and more recently by Ahuja, Magnanti
and Orlin [1]. An excellent chapter on theoretical aspects of network flows
has been written by Frank [21].
Bibliography

[1] Ahuja R. K., T. L. Magnanti and J. B. Orlin, Network Flows: Theory,
    Algorithms, and Applications, Prentice-Hall, Englewood Cliffs, NJ (1993).

[2] Berge C., Two theorems on graph theory. Proc. Nat. Acad. Sci. USA 43
    (1957), 842–844.

[3] Berge C., Theory of Graphs and its Applications, Methuen, London
    (1962), 40–51.

[4] Berge C., Graphs and Hypergraphs, North-Holland, Amsterdam (1973).

[5] Biggs N. L., E. K. Lloyd, and R. J. Wilson, Graph Theory 1736–1936,
    Clarendon Press, Oxford, England (1976).

[6] Bollobás B., Graph Theory: An Introductory Course, Springer, New
    York (1979).

[7] Bollobás B., Extremal graph theory, in Handbook of Combinatorics
    (R. L. Graham, M. Grötschel, and L. Lovász, eds), Elsevier Science B.V.,
    Amsterdam (1995), 1231–1292.

[8] Bondy J. A. and V. Chvátal, A method in graph theory. Discrete Math.
    15 (1976), 111–136.

[9] Bondy J. A. and U. S. R. Murty, Graph Theory with Applications,
    North-Holland, New York (1976).

[10] Buckley F. and F. Harary, Distance in Graphs, Addison-Wesley,
     Reading, MA (1990).


[11] Buckley F., Z. Miller, and P. J. Slater, On graphs containing a given
     graph as center. J. Graph Theory 5 (1981), 427–434.

[12] Chartrand G., Introductory Graph Theory, Dover Publications, New
     York (1985).

[13] Chartrand G. and F. Harary, Graphs with prescribed connectivities,
     in Theory of Graphs: Proceedings of the Colloquium Held at Tihany,
     Hungary. Budapest (1968), 61–63.

[14] Chartrand G. and L. Lesniak, Graphs & Digraphs: Third Edition,
     Chapman & Hall, London (1996).

[15] Chartrand G. and O. R. Oellermann, Applied and Algorithmic Graph
     Theory, McGraw-Hill, New York (1993).

[16] Diestel R., Graph Theory, Springer, New York (1997).

[17] Dijkstra E. W., A note on two problems in connection with graphs.
     Numerische Math. 1 (1959), 269–271.

[18] Ford L. R. and D. R. Fulkerson, Maximal flow through a network. Canad.
     J. Math. 8 (1956), 399–404.

[19] Ford L. R. and D. R. Fulkerson, A simple algorithm for finding maximal
     network flows and an application to the Hitchcock problem. Canad. J.
     Math. 9 (1957), 210–218.

[20] Ford L. R. and D. R. Fulkerson, Flows in Networks, Princeton University
     Press, NJ (1962).

[21] Frank A., Connectivity and network flows, Chapter 2 in the Handbook of
     Combinatorics (R. L. Graham, M. Grötschel, and L. Lovász, eds), North
     Holland (1993), 111–178.

[22] Gibbons A., Algorithmic Graph Theory, Cambridge University Press,
     Cambridge (1985).

[23] Graham R. L. and P. Hell, On the history of the minimum spanning
     tree problem. Ann. History Comput. 7 (1985), 43–57.

[24] Hakimi S. L., On the realizability of a set of integers as degrees of the
     vertices of a graph. J. SIAM Appl. Math. 10 (1962), 496–506.

[25] Havel V., A remark on the existence of finite graphs (Czech). Časopis
     Pěst. Mat. 80 (1955), 189–194.

[26] Kruskal J. B., On the shortest spanning subtree of a graph and the
     traveling salesman problem. Proc. Amer. Math. Soc. 7 (1956), 48–50.

[27] West D. B., Introduction to Graph Theory, Prentice-Hall, Upper Saddle
     River, NJ (1996).

[28] Wilson R. J. and J. J. Watkins, Graphs: An Introductory Approach,
     John Wiley & Sons, New York (1990).
