Differential Equations
and
Discrete Mathematics
Simon Salamon
Mathematical Institute
University of Oxford
October 1996
© October 1996
Mathematical Institute
24–29 St Giles’, Oxford, OX1 3LB
Preface
These notes accompany the first-year Oxford course of the same title. They are not, of course, meant to substitute for the lectures themselves, which are likely to provide a less theoretical approach to the subject, with more emphasis on simple applications and problem-solving. Material from the course is covered in roughly the order of the lecture synopsis, though a number of subsections (at least those starred) could be omitted on a first reading.
The notes are designed to be read at leisure during the course of the entire academic
year, and some of the explanations will make more sense after exposure to other series
of lectures. For example, no effort has been made to elaborate on the concept of a
limit, which recurs at various points. The notes also touch on a number of major topics
which will be more properly covered elsewhere, such as partial differential equations and
probability. Moreover, Euclid’s algorithm is described in the Institute lecture notes [9].
There is an initial temptation to regard the course as split into two utterly distinct
parts, with calculus (§§1–6) forming part one, and combinatorics and algorithms (§§7–
12) part two. I believe this to be mistaken, since there is a definite interchange of ideas
between these two parts, and this has been emphasized here. The designers of the course
were well aware of links between difference equations and generating functions, the algo-
rithmic nature of Euler’s method for approximating solutions to differential equations,
and so on. On the other hand, for learning and revision, the contents might profitably
be sliced into quarters (§§1–3, §§4–6, §§7–9, §§10–12).
Much of the material provides an ideal setting for experimenting with computer
packages and, conversely, a greater understanding of the theory can be gained with the
help of a machine. For this reason, most exercise sections include one or two problems
with Maple, although the latter does not form part of the course and (with one exception
in §12.2) has been excluded from the main body of text. For best results, precede each
exercise with the command restart; to wipe out earlier definitions. Most of the graphics
survived from earlier years of the course, and were plotted with Mathematica.
I am grateful to Richard Bird for some helpful comments, and to Chris Prior for his
patience in reading and correcting parts of an earlier draft.
S.M.S.
September 1996
Contents
1 Elementary Calculus
1.1 Differentiation
1.2 Higher derivatives
1.3 Integration
1.4 Definite integrals
1.5 Exercises
2 Introduction to Differential Equations
3 Constant Coefficients
3.1 The characteristic equation
3.2 Undetermined coefficients
* 3.3 Further techniques
3.4 Exercises
4 Difference Equations
4.1 Analogies with ODE's
4.2 Fibonacci type equations
4.3 Worked problems
4.4 Exercises
5 Numerical Solutions
5.1 Euler's method
5.2 Theoretical examples
* 5.3 An improvement
5.4 Exercises
6 Partial Derivatives
6.1 Functions of two variables
6.2 The chain rule
6.3 Homogeneous functions
* 6.4 Some partial differential equations
6.5 Exercises
7 Binomial Coefficients
7.1 Pascal's triangle
7.2 Probabilities
7.3 Generalized binomial coefficients
7.4 Infinite series
7.5 Exercises
8 Generating Functions
8.1 Closed forms for sequences
8.2 Derangements
8.3 The inclusion-exclusion principle
8.4 Difference equations revisited
8.5 Exercises
9 Asymptotic Notation
9.1 'O' terminology
9.2 Rates of convergence
* 9.3 Power series estimates
9.4 Stirling's formula
9.5 Exercises
10 Euclid's Algorithm
10.1 Integer division
10.2 Computing greatest common divisors
* 10.3 Prime numbers
10.4 Polynomial division
10.5 Exercises
11 Graphical Optimization
11.1 Graphs
11.2 Kruskal's algorithm
11.3 Prim's algorithm
* 11.4 Other problems
11.5 Exercises
1 Elementary Calculus
1.1 Differentiation
Let $y = y(x)$ be a function expressing $y$ in terms of $x$. Its derivative, written $\frac{dy}{dx}$ or $y'$, is the new function whose value at $x = a$ equals the gradient of the graph of $y$ at $a$. This value, written $\frac{dy}{dx}(a)$ or $\frac{dy}{dx}\big|_a$ or $y'(a)$, can be expressed by a limit:
$$y'(a) = \lim_{x \to a} \frac{y(x) - y(a)}{x - a}. \eqno(1.1)$$
This formula can be used to work out the derivatives of some simple functions from first principles. For example, if $y = \sqrt{x}$ then
$$y'(a) = \lim_{x \to a} \frac{\sqrt{x} - \sqrt{a}}{x - a} = \lim_{x \to a} \frac{1}{\sqrt{x} + \sqrt{a}} = \frac{1}{2\sqrt{a}}.$$
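This first-principles computation is easy to check by machine; a minimal Maple sketch, with the concrete value $a = 4$ (our choice, not part of the course text):

limit((sqrt(x) - 2)/(x - 4), x = 4);     # returns 1/4, in agreement with 1/(2*sqrt(a))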
Understanding when limits exist and what properties they enjoy is accomplished in
another course, but since limits will occur several times in these notes, we include a
precise definition.
Let $a$ be a fixed number. If $f(x)$ equals $(y(x) - y(a))/(x - a)$, or indeed any other function, the limit of $f$ as $x$ tends to $a$ is said to exist and equal $\ell \in \mathbb{R}$ if for any number $\varepsilon > 0$ (however small it may be) there exists $\delta > 0$ such that
$$0 < |x - a| < \delta \;\Rightarrow\; |f(x) - \ell| < \varepsilon.$$
It is important to note that in this test x is not allowed to equal a; indeed in many
situations (including the one at hand) f (a) is not even defined. On the other hand, if
$f(a)$ is defined and equal to the limit $\ell$ then the function $f$ is said to be continuous at $a$. For example, $(\sin x)/x$ is undetermined when $x = 0$, though
$$\lim_{x \to 0} \frac{\sin x}{x} = \sin'(0) \eqno(1.2)$$
is known to equal $\cos 0 = 1$. If we define $f(x)$ to equal $(\sin x)/x$ for $x \neq 0$ and set $f(0) = 1$ then $f$ is continuous at all points of $\mathbb{R}$, as indicated by the unbroken graph in
Figure 1.
There are a number of rules that enable one to differentiate complicated functions
quickly. The first is so obvious that it is often not stated explicitly, namely that the
derivative of a sum of two functions is the sum of the individual derivatives. A slightly
more general version of this rule is the linearity rule, that $(au + bv)' = au' + bv'$ for any constants $a, b$.
Figure 1: Extending $(\sin x)/x$ to $x = 0$
The function that assigns to $x$ the value $v(u(x))$ is called the composition of $u$ with $v$, and is sometimes written $v \circ u$, so that (1.4) translates into the neat identity $(v \circ u)' = (v' \circ u)\,u'$.
The chain rule will take on further significance in the discussion of partial derivatives in
§6.2.
The above rules are powerful in combination. They are all needed, for example, to compute higher derivatives of the function $e^{x^2}$, given that $e^x$ is a function equal to its own derivative:
$$\frac{d^2}{dx^2}(e^{x^2}) = \frac{d}{dx}(e^{x^2}\cdot 2x) = \frac{d}{dx}(e^{x^2})\cdot 2x + e^{x^2}\cdot\frac{d}{dx}(2x) = 2(2x^2 + 1)e^{x^2}.$$
Thus, a polynomial of degree $n$ is completely determined by the values of itself and its first $n$ derivatives at 0. For functions that are not polynomials though, higher derivatives
tend to become more and more complicated.
The expression $(1 + x)^n$ clearly defines a monic polynomial of degree $n$ in $x$. Evaluating its higher derivatives using the chain rule and (1.5) yields the binomial theorem:
$$(1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k.$$
Well-known extensions of this result to the case in which the exponent n is no longer a
positive integer will be discussed in §7.4.
1.3 Integration
The word 'integration' is often used simply to mean the reverse of differentiation, and a first exposure to the subject generally consists of taking a function $f$ and trying to find a function $y$ such that $f(x) = y'(x)$. If one is confident that $y$ exists one writes
$$y(x) = \int f(x)\,dx + c.$$
Repeated integration produces chains of functions, each the integral of the one before (an arrow points from a function to an antiderivative):
$$\cdots \leftarrow -\frac{6}{x^4} \leftarrow \frac{2}{x^3} \leftarrow -\frac{1}{x^2} \leftarrow \frac{1}{x} \leftarrow \ln x \leftarrow x\ln x - x \leftarrow \cdots$$
$$0 \leftarrow 1 \leftarrow x \leftarrow \tfrac12 x^2 \leftarrow \tfrac16 x^3 \leftarrow \cdots$$
Like 1/x, the function ln x is not defined for x = 0. In fact both ln x and ln(−x) are
valid integrals of 1/x, though only one will be real depending on the sign of x. For this
reason, one often writes
$$\int \frac{1}{x}\,dx = \ln|x| + c, \eqno(1.9)$$
though at times the absolute value signs can be a hindrance (see Example 2.1.2).
The rules given in §1.1 give rise to corresponding ones for computing integrals. If a, b
are constants and u, v functions then certainly
$$\int (au(x) + bv(x))\,dx = a\int u(x)\,dx + b\int v(x)\,dx.$$
Integration by parts is a restatement of the product rule. If $u, v$ are functions of $x$ then
$$\int \frac{du}{dx}\,v(x)\,dx = u(x)v(x) - \int u(x)\,\frac{dv}{dx}\,dx.$$
Setting $du/dx = y$ and abbreviating the notation gives the more practical version
$$\int yv = \Big(\int y\Big)v - \int\Big(\int y\Big)v'.$$
For example, taking $y = \sin x$ and $v = x$ gives
$$\int x\sin x\,dx = \sin x - x\cos x + c. \eqno(1.10)$$
This last function is then the 'general solution' of the equation $y'(x) = x\sin x$.
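Maple's integrator returns the same antiderivative, a quick check (Maple omits the constant $c$):

int(x*sin(x), x);     # returns sin(x) - x*cos(x)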
Z
Problem 1.3.1 Use the substitution u(x) = tan( 21 x) to evaluate csc xdx, where
csc x = cosec x = 1/(sin x).
Solution. Standard trigonometric identities imply that
2u 1 − u2
sin x = , cos x = .
1 + u2 1 + u2
Using the first of these, and the fact that
du
= 1
2
sec2 ( 12 x) = 21 (1 + u2 )
dx
gives Z Z Z
1 du 1
csc xdx = dx = du
u dx u
= ln |u| + c
= ln | tan( 12 x)| + c.
2
5
1.4 Definite integrals
To avoid the nuisance of constants of integration, one sets
$$y(x) = \int_a^x f(t)\,dt \eqno(1.11)$$
to represent the unique function whose derivative is f (x) and which satisfies the extra
condition that y(a) = 0. For fixed x, the right-hand side of (1.11) is a ‘definite integral’
which may be interpreted geometrically as the area lying under the graph of f between
a and x.
Figure 2: the graph of $f$, showing a strip of width $h$ between $a$ and $a + h$
To see why computing such areas acts as the inverse of differentiation, first apply
(1.1) to obtain
$$\frac{d}{dx}\bigg|_{x=a}\int_a^x f(t)\,dt = \lim_{\varepsilon \to 0}\frac{1}{\varepsilon}\int_a^{a+\varepsilon} f(t)\,dt.$$
The second integral is interpreted as the area of a tiny strip of width ε and height
roughly f (a) (see Figure 2), so the right-hand limit equals f (a). But this must also be
the value of the left-hand side if differentiation is to be the inverse of integration. (Strictly
speaking, these arguments are only valid if f is a continuous function at x = a.)
Taking the limit of both sides as $x \to \infty$ gives
$$\int_1^\infty e^{-t^2}\,dt \;\le\; \lim_{x\to\infty}(e^{-1} - e^{-x}) = e^{-1} = 0.367\ldots,$$
where the 'infinite integral' represents the total area under the graph of $e^{-t^2}$ to the right of the line $t = 1$ (see Figure 3). On the other hand, it is known that
$$\int_0^\infty e^{-x^2}\,dx = \tfrac12\sqrt{\pi} = 0.886\ldots$$
Figure 3: the graphs of $e^{-t^2}$, $e^{-t}$ and $e^{-t}/t$
Finding indefinite integrals is much harder than finding derivatives, in the sense that
it is impossible to express the integrals of many simple functions in terms of familiar
functions. A similar example of this type is the integral
$$\int_1^x \frac{e^{-t}}{t}\,dt, \eqno(1.14)$$
whose value is sandwiched between each side of the inequality in (1.12).
1.5 Exercises
1. (i) Prove that (1.3) holds when $n$ is replaced by a positive fraction $p/q$ by setting $y^q = x^p$.
(ii) Deduce the quotient rule $(u/v)' = (u'v - uv')/v^2$ from the chain and product rules and (1.3) for $n = -1$.
2. (i) Prove that the derivative of $\ln(\csc x + \cot x)$ is $-\csc x$. How do you reconcile this with the answer in Problem 1.3.1? Find $\int \sec x\,dx$.
(ii) Simplify the function $\arctan\dfrac{\sin x}{1 + \cos x}$ by differentiation, or otherwise.
3. Let $x > 0$. Find the derivative of $y = x^x$, by first taking logs, or otherwise. The minimum value of $x^x$ occurs when $x$ is the unique solution $a$ of the equation $y'(x) = 0$; find $a$ and sketch the graph of $x^x$.
6. Tables of standard derivatives and integrals are stored in Maple. Check to see that
the following give expected results:
D(tan); int(tan(x),x);
D(sec); int(sec(x),x);
D(csc); int(csc(x),x);
D(ln); int(ln(x),x);
D(arcsin); int(arcsin(x),x);
D(arctan); int(arctan(x),x);
D(arctanh); int(arctanh(x),x);
diff(exp(x),x); int(x*exp(x),x);
diff(ln(ln(x)),x); int(ln(ln(x)),x);
2 Introduction to Differential Equations
(i) $y' = \cot x$,
(ii) $y' + 2y = y^2$,
Example 2.1.1
$$\frac{d^2y}{dx^2} + \sin x = 0 \quad\text{or}\quad y''(x) = -\sin x.$$
Integrating gives
$$y'(x) = \cos x + c_1 \;\Rightarrow\; y(x) = \sin x + c_1x + c_2.$$
Here $c_1$ and $c_2$ are constants, included to give the most general solution; in fact $c_1 = y'(0) - 1$ and $c_2 = y(0)$. $\Box$
In this example, we might interpret $x$ as time and $y$ as the distance travelled by some particle. Then $y' = \dot y$ represents velocity and $y'' = \ddot y$ acceleration, the latter perhaps specified by some applied sinusoidal force. (Differentiations with respect to time are conventionally denoted by dots, following Newton.) If the particle has an initial velocity of 5 units, then the appropriate 'initial conditions' are
$$y(0) = 0, \qquad \dot y(0) = 5,$$
giving $c_1 = 4$, $c_2 = 0$ and the particular solution $y(x) = \sin x + 4x$.
Example 2.1.2
$$y' + 2y = 0 \quad\text{or}\quad \frac{dy}{dx} = -2y.$$
One approach to solving this is to separate the variables and plonk down integral signs
without thinking what one is doing, so as to give
$$\int \frac{1}{y}\,dy = -2\int dx.$$
One deduces that
$$\ln y = -2x + c, \quad\text{or}\quad y(x) = be^{-2x},$$
where $c$ and $b = e^c$ are constants. In fact, $be^{-2x}$ is the general solution in the sense that any solution of the differential equation must equal this for some value of $b$. On the other hand, the use of the logarithm is a bit artificial, and it is not completely obvious that dividing by $y$ does not eliminate a solution. Also, the constant $b$ can certainly be negative even though $c$ must then be a complex number; this explains why the absolute value signs of (1.9) are inappropriate. $\Box$
A more convincing way of obtaining the general solution in Example 2.1.2 is to spot that the positive function $e^{-2x}$ is one solution, and then suppose that $y(x) = u(x)e^{-2x}$ is another. The equation becomes
$$0 = y' + 2y = (u' - 2u)e^{-2x} + 2ue^{-2x} = u'e^{-2x},$$
and is equivalent to $u' \equiv 0$, which means that $u$ is a constant. Thus $be^{-2x}$ is indeed
the general solution, and we shall use this technique again and again in cases where one
solution of an equation is already known. One should note that the implication
$$u' \equiv 0 \;\Rightarrow\; u \text{ constant} \eqno(2.1)$$
actually underlies all the integration steps we have already made; although it appears obvious it is, strictly speaking, a consequence of the Mean Value Theorem for differentiable functions. This states that for any $a, b$, the 'average' rate of change $(u(b) - u(a))/(b - a)$ equals $u'(c)$ for at least one point $c$ between $a$ and $b$.
Proposition 2.2.2 If $y_1, y_2$ are solutions of a linear homogeneous ODE then $c_1y_1 + c_2y_2$ is also a solution for any constants $c_1, c_2$. $\Box$
and solutions are obtained provided one can evaluate the two integrals. However, such
solutions may be ‘implicit’, that is to say they relate x and y but do not express y
directly as a function of x.
(i) $(1 + y^2)y' = 1 - x^2$,
(ii) $(1 - x^2)y' = 1 + y^2$.
Equation (i) gives $\int(1 + y^2)\,dy = \int(1 - x^2)\,dx$, that is,
$$y + \tfrac13y^3 = x - \tfrac13x^3 + c.$$
The last line is quite acceptable as a conclusion; in theory it could be used to express $y$ directly in terms of $x$ using square and cube roots but this might not help much in
y directly in terms of x using square and cube roots but this might not help much in
practice. By contrast, the separable equation (ii) leads to
$$\int \frac{1}{1 + y^2}\,dy = \int \frac{1}{1 - x^2}\,dx + c,$$
which, at least for $|x| < 1$, has an explicit solution
$$y = \tan\left(\ln\sqrt{\frac{1 + x}{1 - x}} + c\right) = \tan(\operatorname{arctanh}x + c).$$
Consider now the first-order linear equations
$$y' + f(x)y = 0, \eqno(2.4)$$
$$y' + f(x)y = g(x). \eqno(2.5)$$
In both cases, the coefficient of $y'$ equals 1 and, in the notation of Definition 2.2.1, $f_0 = -f$ and $f_1 \equiv -1$. This is no real restriction, as one may in general divide through by the coefficient function of $y'$, provided one does not get upset if $f$ or $g$ are then undefined at particular points (see consequences of this in Problem 2.3.3).
The first equation is said to be the homogeneous equation associated with the second,
and is obviously separable. It implies that
$$\ln y = \int \frac{1}{y}\,dy = -\int f(x)\,dx + a.$$
Whence
Proposition 2.3.1 The general solution of (2.4) is
$$y(x) = ce^{-\int f(x)\,dx},$$
where $c$ is a constant.
Next, consider the non-homogeneous equation (2.5). Guided by the argument at the end of §2.1, we seek a solution in the form $y = ue^{-\int f}$ for some function $u$ (we have suppressed the $x$'s to save energy). Using all the calculus rules from §1.1,
$$u' = (y\,e^{\int f})' = y'e^{\int f} + y\,(e^{\int f}\cdot f) = e^{\int f}\cdot(y' + fy) = e^{\int f}\cdot g.$$
It is convenient to give a name to the integrating factor
$$I(x) = e^{\int f(x)\,dx}. \eqno(2.6)$$
For this converts the non-homogeneous equation into (in fuller notation)
$$\frac{d}{dx}\big(I(x)y(x)\big) = I(x)g(x) \;\Rightarrow\; I(x)y(x) = \int I(x)g(x)\,dx + c \;\Rightarrow\; y(x) = I(x)^{-1}\int I(x)g(x)\,dx + cI(x)^{-1}. \eqno(2.7)$$
Here the first term $I(x)^{-1}\int I(x)g(x)\,dx$ is a particular solution of (2.5), whereas $cI(x)^{-1}$ is the general solution of (2.4). In fact if $y_1$ and $y_2$ are both solutions of (2.5), then $y_1 - y_2$ is a solution of (2.4); conversely, $y_1 + ce^{-\int f}$ will solve (2.5) for any constant $c$. The same argument shows that

Proposition 2.3.2 The general solution of (2.5) equals any particular solution plus the general solution of (2.4). $\Box$
Problem 2.3.3 Solve the initial value problem (IVP for short)
$$xy' + 2y = \sin x, \qquad y(\pi) = 0.$$
Solution. First we put the equation into standard form by dividing through by $x$:
$$y' + \frac{2}{x}\,y = \frac{\sin x}{x}. \eqno(2.8)$$
The integrating factor is
$$I(x) = e^{\int (2/x)\,dx} = e^{2\ln x} = x^2,$$
giving
$$(x^2y)' = x^2y' + 2xy = x\sin x.$$
One might have spotted that multiplying the original equation by $x$ was all that was needed to transform the left-hand side into an exact derivative, though it is sometimes quicker to compute (2.6) methodically without thinking too hard. From (1.10),
$$x^2y = \sin x - x\cos x + c,$$
and the initial condition $y(\pi) = 0$ forces $c = -\pi$.
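The same IVP can be fed straight to Maple as a check; a minimal sketch (the output may be arranged differently):

dsolve({x*diff(y(x),x) + 2*y(x) = sin(x), y(Pi) = 0}, y(x));
# equivalent to y(x) = (sin(x) - x*cos(x) - Pi)/x^2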
Figure 4: Solutions for $c = -3\pi, -2\pi, -\pi, 0, \pi, 2\pi, 3\pi$
rule implies that this limit is 0, so that the corresponding graph passes through the
origin.
We shall return to the graphical interpretation of ODE’s in §5.1.
We now consider the second-order linear equations
$$y'' + f_1(x)y' + f_0(x)y = 0, \eqno(2.9)$$
$$y'' + f_1(x)y' + f_0(x)y = g(x), \eqno(2.10)$$
and attempt an analysis based on Proposition 2.3.2. This time there is no systematic way to solve the homogeneous equation (2.9), but if a solution of it is known, a familiar technique can be used to find a solution of the non-homogeneous equation (2.10).
Lemma 2.4.1 Let $y_1$ be a solution of (2.9) and $u$ a function such that $y_2 = uy_1$ is a solution of (2.10). Then $v = u'$ satisfies the first-order linear equation
$$y_1v' + (2y_1' + f_1y_1)v = g.$$
Proof. One has
$$y_2' = u'y_1 + uy_1', \qquad y_2'' = u''y_1 + 2u'y_1' + uy_1'',$$
and by assumption $y_1'' + f_1y_1' + f_0y_1 = 0$. Substituting into (2.10) gives the result. $\Box$
Example 2.4.2
$$y'' + y = \cot x.$$
A solution of the associated homogeneous equation is obviously given by $\sin x$, so let $y(x) = u(x)\sin x$, and $v(x) = u'(x)$. The lemma implies that
$$(\sin x)v' + (2\cos x)v = \cot x,$$
which has integrating factor $\sin^2x$, so that $(v\sin^2x)' = \cos x$ and $v\sin^2x = \sin x + c$. Setting $c = 0$ gives $v = \csc x$ and
$$u(x) = \int \csc x\,dx = \ln\tan(\tfrac12 x) = -\ln(\csc x + \cot x)$$
(see Problem 1.3.1), so that for $x > 0$ a solution is $\ln(\tan(\tfrac12 x))\sin x$. $\Box$
The lemma can also be applied to obtain a second solution of (2.9), given a first:
$$v(x) = cxe^{-3x} \;\Rightarrow\; u(x) = -\tfrac13 cxe^{-3x} + \tfrac13 c\int e^{-3x}\,dx = -\tfrac13 c\,(x + \tfrac13)e^{-3x}.$$
Given two distinct solutions $y_1, y_2$ to the homogeneous second-order linear equation (2.9), it is important to know if these are proportional or 'linearly dependent' in the sense that
$$ay_1 + by_2 \equiv 0, \eqno(2.11)$$
for some constants $a, b$, not both zero. For example there are many identities involving trigonometric functions that might obscure such a relationship. Differentiating (2.11) gives $ay_1' + by_2' \equiv 0$, and it follows that
$$y_1y_2' - y_2y_1' \equiv 0.$$
The function on the left, denoted $W(y_1, y_2)$, is called the Wronskian of $y_1, y_2$, and often assumes a simple form even if the individual solutions are complicated.
Conversely, it can be shown that if $y_1, y_2$ are solutions of (2.9) that are not proportional, then $W(y_1, y_2)$ is actually nowhere zero on any interval in which the functions $f_0, f_1$ are continuous [6, §2.7]. The latter condition is not satisfied by the equation in Problem 2.4.3 at $x = 0$, which tallies with the fact that the Wronskian computed there vanishes when $x = 0$.
2.5 Exercises
1. Verify that the given function is a solution of the corresponding ODE for $x > 0$:
(i) $\tfrac14 x^2(3 - 2\ln x)$, $\;y'' + \ln x = 0$;
(ii) $\tfrac14 x^2 - 3$, $\;y' = \sqrt{y + 3}$;
(iii) $e^{-x}$, $\;y^{(iv)} = y + y' + y''$.
In each case, by inspection or otherwise, find a solution of the same equation not equal to the one given.
(iii) $y' = \dfrac{x^2 + 3y^2}{2xy}$;
8. (i) Let $k \in \mathbb{R}$. Show that the substitution $u = y^{1-k}$ reduces the ODE $y' + y = y^k$ to a linear equation, and solve it.
(ii) Investigate the equation
eq:= D(y)(x)+y(x)^j=y(x)^k:
by assigning integers to j and k then solving, such as
j:=2: k:=3:
dsolve(eq,y(x));
3 Constant Coefficients
$$y'' + py' + qy = 0, \eqno(3.1)$$
in which $p, q$ are real constants. In the next section, we shall return to the non-homogeneous version in which the right-hand side is replaced by an assigned function $g(x)$.
We can express (3.1) in the somewhat abbreviated form
$$(D^2 + pD + q)y = 0,$$
where the symbol D denotes d/dx. The expression in parentheses can now be factored
by treating D as an ordinary variable, and the outcome determines the type of solutions.
Definition 3.1.1 The auxiliary or characteristic equation associated with (3.1) is the equation $\lambda^2 + p\lambda + q = 0$.
This is a quadratic equation in $\lambda$, with 'characteristic roots'
$$\lambda_1 = \frac{-p + \sqrt{p^2 - 4q}}{2}, \qquad \lambda_2 = \frac{-p - \sqrt{p^2 - 4q}}{2},$$
which coincide iff $p^2 = 4q$ and are non-real iff $p^2 < 4q$. The characteristic equation therefore takes the form $(\lambda - \lambda_1)(\lambda - \lambda_2) = 0$, so that
$$\lambda_1 + \lambda_2 = -p, \qquad \lambda_1\lambda_2 = q.$$
Using the normal rules for expanding brackets, (3.1) can now be written as
$$(D - \lambda_1)(D - \lambda_2)y = 0;$$
the last step only works because $\lambda_2$ is constant; this allows one to say $D(\lambda_2y) = \lambda_2Dy$.
Proposition 3.1.2 The general solution of (3.1) is $c_1e^{\lambda_1 x} + c_2e^{\lambda_2 x}$ if $\lambda_1 \neq \lambda_2$, and $(c_1 + c_2x)e^{\lambda_1 x}$ if $\lambda_1 = \lambda_2$, where $c_1, c_2$ are arbitrary constants.

Proof. Let $u$ denote the function $(D - \lambda_2)y$, so that
$$(D - \lambda_1)u = 0, \quad\text{or}\quad u' - \lambda_1u = 0;$$
this is a first-order linear homogeneous equation with general solution $u(x) = ae^{\lambda_1 x}$, where $a$ is a constant. From the definition of $u$,
$$y' - \lambda_2y = ae^{\lambda_1 x},$$
which has integrating factor $I(x) = e^{-\lambda_2 x}$ and, from (2.7), general solution
$$y(x) = I(x)^{-1}\int I(x)\,ae^{\lambda_1 x}\,dx + bI(x)^{-1} = ae^{\lambda_2 x}\int e^{(\lambda_1 - \lambda_2)x}\,dx + be^{\lambda_2 x}.$$
The result follows from integrating the last line, in which one must distinguish the two cases $\lambda_1 \neq \lambda_2$ and $\lambda_1 = \lambda_2$. $\Box$
Note the consistency between Propositions 2.2.2 and 3.1.2: if $y_1$ and $y_2$ are solutions of (3.1) then so is $c_1y_1 + c_2y_2$.
We next illustrate what happens when the roots of the characteristic equation are not real. With the roots $\lambda_1, \lambda_2 = -1 \pm 2i$, the general solution becomes
$$e^{-x}\big(c_1(\cos 2x + i\sin 2x) + c_2(\cos 2x - i\sin 2x)\big) = e^{-x}(a_1\cos 2x + a_2\sin 2x), \eqno(3.2)$$
where $a_1 = c_1 + c_2$ and $a_2 = i(c_1 - c_2)$. Given the nature of the problem, we must restrict the 'general' solution to be real by taking $a_1, a_2 \in \mathbb{R}$ (and thereby allowing $c_1, c_2$ to be complex conjugates).
The coefficient of $y'$ in the original equation arises from viscosity. If it were zero, the exponential term in the solution would be absent and the motion would be purely sinusoidal with continuing oscillations of equal magnitude as in (2.2). As it is, the $e^{-x}$ factor causes the magnitude of the oscillations to decay, and this phenomenon is referred to as 'damping'. $\Box$
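Since $p = -(\lambda_1 + \lambda_2)$ and $q = \lambda_1\lambda_2$, the roots $-1 \pm 2i$ used in (3.2) belong to the equation $y'' + 2y' + 5y = 0$; a one-line Maple check:

dsolve(diff(y(x),x,x) + 2*diff(y(x),x) + 5*y(x) = 0, y(x));
# returns y(x) = _C1*exp(-x)*sin(2*x) + _C2*exp(-x)*cos(2*x)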
Figure 5: Underdamped motion
We shall always suppose that the constants $p, q$ defining the ODE (3.1) are real. The behaviour of solutions then depends crucially on the sign of $\Delta = p^2 - 4q$. To explain this, suppose that $p > 0$. If $\Delta > 0$, no oscillations occur, and any motion represented by the solutions is said to be 'overdamped'. The 'critical' case $\Delta = 0$ corresponding to $\lambda_1 = \lambda_2$ is best regarded as a limiting case of overdamped motion. By contrast, if $\Delta < 0$ then the roots are $-\tfrac12 p \pm ir$, where $r = \tfrac12\sqrt{4q - p^2}$ is a real number, and the general solution of (3.1) is
$$e^{-px/2}(a_1\cos rx + a_2\sin rx).$$
The motion is said to be 'underdamped'; this is illustrated in Figure 5 by the solution (3.2) with $a_1 = 0$, $a_2 = 1$.
3.2 Undetermined coefficients

Consider next the non-homogeneous equation
$$y'' + py' + qy = g(x), \eqno(3.3)$$
in which $p, q$ are once again constants, and $g$ is an assigned function. Proposition 2.2.2 applies to the pair of equations (3.1) and (3.3), so to fully solve (3.3) we need only find
a particular solution of it. For many functions g , the most efficient way to do this is to
make an informed guess involving constants (the so-called ‘undetermined coefficients’)
which are subsequently evaluated by substitution.
Provided that the coefficients on the left-hand side are constant, this technique pro-
vides an easier alternative to the one based on Lemma 2.4.1, and will be illustrated in
the series of problems below.
Problem 3.2.1 Solve the IVP
$$y'' - 4y' + 4y = \sin x, \qquad y(0) = 0 = y'(0).$$
Solution. Step 1: Solve the associated homogeneous equation. Its characteristic equation $\lambda^2 - 4\lambda + 4 = (\lambda - 2)^2 = 0$ has the repeated root 2, so by Proposition 3.1.2 the general solution of the homogeneous equation is $(c_1x + c_2)e^{2x}$.
Step 2: Seek a particular solution y1 of the given non-homogeneous ODE. Given that
repeated derivatives of sin x are multiples of itself or cos x, it is natural to try y1 =
A sin x + B cos x, where A, B are constants to be determined from the equation
$$(-A\sin x - B\cos x) - 4(A\cos x - B\sin x) + 4(A\sin x + B\cos x) = \sin x$$
$$\Rightarrow\; (3A + 4B)\sin x + (-4A + 3B)\cos x = \sin x.$$
Since this must hold for all $x$, we get (for example, by taking $x$ to equal $\tfrac{\pi}{2}$, $0$ respectively)
$$3A + 4B = 1, \quad -4A + 3B = 0 \;\Rightarrow\; A = \tfrac{3}{25}, \; B = \tfrac{4}{25}.$$
Step 3: Find the values of $c_1, c_2$ that satisfy the initial conditions. In fact,
$$y(0) = \tfrac{4}{25} + c_2 \;\Rightarrow\; c_2 = -\tfrac{4}{25}, \qquad y'(0) = \tfrac{3}{25} + c_1 + 2c_2 \;\Rightarrow\; c_1 = \tfrac15,$$
so that $y(x) = \big(3\sin x + 4\cos x + (5x - 4)e^{2x}\big)/25$.
The graph of this function in Figure 6 shows how the exponential part dominates when
$x > 0$ and the sinusoidal part when $x < 0$. $\Box$
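Maple reproduces the whole solution of the IVP in one command; a minimal sketch:

dsolve({diff(y(x),x,x) - 4*diff(y(x),x) + 4*y(x) = sin(x), y(0) = 0, D(y)(0) = 0}, y(x));
# equivalent to y(x) = 3/25*sin(x) + 4/25*cos(x) + (x/5 - 4/25)*exp(2*x)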
We have already used the fact that $D = d/dx$ is a linear operator in the sense that
$$D(c_1y_1 + c_2y_2) = c_1Dy_1 + c_2Dy_2$$
whenever $c_1, c_2$ are constants and $y_1, y_2$ are functions. The same applies to the differential operator
$$L = D^2 + pD + q.$$
The function $u(x) = e^{\alpha x}$ ($\alpha$ a constant) satisfies
$$Du = \alpha u,$$
which says that $D$ transforms $u$ to a multiple of itself. This is a very special situation, and one refers to $u$ as an eigenvector of the linear operator $D$. An analogous situation characterizes the equations
$$R_1(v_1) = v_1, \qquad R_2(v_2) = -v_2,$$
in which $R_1$ is a rotation of 3-dimensional space with axis parallel to a vector $v_1$ and $R_2$ is a reflection in a plane perpendicular to a vector $v_2$. Note that $D^2$ acts by multiplication by $\alpha^2$ on $u$, so
$$Lu = (\alpha^2 + p\alpha + q)\,u,$$
which tells us that $e^{\alpha x}/(\alpha^2 + p\alpha + q) = e^{\alpha x}/((\alpha - \lambda_1)(\alpha - \lambda_2))$ is a particular solution of the non-homogeneous equation
$$y'' + py' + qy = e^{\alpha x}.$$
This works provided that $\alpha$ is different from $\lambda_1$ and $\lambda_2$; for the other cases it is easily verified that $Ly = e^{\alpha x}$ has a particular solution
$$\frac{1}{\alpha - \lambda_2}\,xe^{\alpha x} \;\text{ if }\; \alpha = \lambda_1 \neq \lambda_2, \qquad \tfrac12 x^2e^{\alpha x} \;\text{ if }\; \alpha = \lambda_1 = \lambda_2. \eqno(3.4)$$
Problem 3.2.2 Solve the equation
$$y'' - 5y' + 6y = e^x + e^{2x} + e^{3x}.$$
Figure 6: $(3\sin x + 4\cos x + (5x - 4)e^{2x})/25$
The following result can save time in determining coefficients of particular solutions.
(ii) For $x^2\sin x$, try $(A + Bx + Cx^2)\sin x + (D + Ex + Fx^2)\cos x$. In general, a polynomial of degree $n$ times $\sin$ or $\cos$ necessitates trying a linear combination of $\sin$ and $\cos$ with coefficients that are also polynomials of degree $n$. A tedious calculation gives $A = \tfrac{29}{250}$, $B = -\tfrac{7}{25}$, $C = \tfrac15$, $D = \tfrac{28}{250}$, $E = \tfrac{1}{25}$ and $F = -\tfrac{1}{10}$.
(iii) For $e^{-x}\cos 2x$, note that $y = e^{-x}(A\sin 2x + B\cos 2x)$ is no good as it is a solution of the associated homogeneous equation. Instead we try $xy$, and use the lemma:
$$L(xy) = 2e^{-x}(2A\cos 2x - 2B\sin 2x) + (-2 + 2)e^{-x}(A\sin 2x + B\cos 2x) = e^{-x}\cos 2x$$
$$\Rightarrow\; A = \tfrac14, \; B = 0.$$
$$(D - \lambda_1)(D - \lambda_2)\cdots(D - \lambda_n)y = 0, \eqno(3.5)$$
where $\lambda_i \in \mathbb{C}$ are the roots (in any order) of the associated monic polynomial with coefficients $p_i$. Any function $y_i$ satisfying $(D - \lambda_i)y_i = 0$ will therefore be a solution of (3.5), and the general solution will in fact be a linear combination of these $y_i$, plus any 'extra' solutions of the form $x^ky_i$ in the case of repeated roots.
Example 3.3.3
$$y^{(iv)} - 2y''' + 2y' - y = e^x.$$
The characteristic polynomial is
$$\lambda^4 - 2\lambda^3 + 2\lambda - 1 = (\lambda - 1)^3(\lambda + 1),$$
so the roots are 1 (repeated thrice) and $-1$. In line with Proposition 3.1.2, the general solution of the homogeneous equation is $(c_1 + c_2x + c_3x^2)e^x + c_4e^{-x}$; since $\alpha = 1$ is a triple root, there is a particular solution of the form $Cx^3e^x$, and substitution gives $C = \tfrac{1}{12}$. The general solution is therefore
$$(c_1 + c_2x + c_3x^2 + \tfrac{1}{12}x^3)e^x + c_4e^{-x}.$$
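A quick Maple check of this example (Maple names the four arbitrary constants _C1, ..., _C4):

dsolve(diff(y(x),x,x,x,x) - 2*diff(y(x),x,x,x) + 2*diff(y(x),x) - y(x) = exp(x), y(x));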
With hindsight, it was obvious that $e^{\lambda x}$ is a solution of the constant coefficient equation $y'' + py' + qy = 0$ when $\lambda^2 + p\lambda + q = 0$. There is an analogous family of second-order linear ODE's for which $x^\lambda$ is equally obviously a solution for suitable $\lambda$. These are the Euler-Cauchy equations
$$x^2y'' + pxy' + qy = 0, \eqno(3.6)$$
where $p, q$ are again constants. Setting $y(x) = x^\lambda$ gives the 'new' characteristic equation
$$\lambda^2 + (p - 1)\lambda + q = 0. \eqno(3.7)$$
The question arises as to what happens when (3.7) has repeated roots, as for instance in the equation
$$x^2y'' + 3xy' + y = 0,$$
which has $1/x$ as one solution. We could find an extra solution of the form $u(x)x^\lambda$ with the aid of Lemma 2.4.1, but here we shall follow a much quicker route. If (3.7) has two roots $\lambda_1 = \lambda$ and $\lambda_2 = \lambda + \varepsilon$ then we might expect
$$\lim_{\varepsilon \to 0}\frac{x^{\lambda + \varepsilon} - x^\lambda}{\varepsilon} = x^\lambda\ln x$$
to be a solution, given that the quotient is, for any $\varepsilon \neq 0$. (The limit is evaluated by differentiating $x^\lambda = e^{\lambda\ln x}$ with respect to $\lambda$.) It is easily verified that this is correct, and we have the following analogue of Proposition 3.1.2: for $x > 0$, the general solution of (3.6) is $c_1x^{\lambda_1} + c_2x^{\lambda_2}$ if $\lambda_1 \neq \lambda_2$, and $(c_1 + c_2\ln x)x^\lambda$ if (3.7) has the repeated root $\lambda$.
3.4 Exercises
1. The equations
(i) $y'' - 2y' + 5y = e^{2x}\sin x$;
(ii) $y'' - 2y' + 5y = x\sin x$;
(iii) $y'' - 2y' + 5y = xe^x\sin 2x$
all have the same characteristic roots. Find a particular solution in each case. Why is your answer to (iii) simpler than might have been expected?
given that the equation has a particular solution of the form $(Ax + B)x^2e^x$.
6. Without determining the coefficients, write down the general form of a particular
solution of the ODE
y 00 + 3y 0 − 4y = g(x)
when g(x) equals (i) (1 + x)3 , (ii) x3 ex , and (iii) x3 ex sin x. Check your answers starting
with
eq:= (D@@2)(y)(x)+3*D(y)(x)-4*y=(1+x)^3:
dsolve(eq,y(x));
4 Difference Equations
0 1 4 9 16 25 36 49 ......
1 3 5 7 9 11 13
2 2 2 2 2 2
0 0 0 0 0
formed by listing perfect squares and taking successive differences. Let $y_n$ denote the $n$th term in the top row, starting from 0 (so that $y_0 = 0$, $y_1 = 1, \ldots$). The fact that the third row is all 2's tells us that
$$y_{n+2} - 2y_{n+1} + y_n = 2 \quad\text{for all } n \ge 0. \eqno(4.1)$$
Conversely, any sequence satisfying this relation has the form
$$y_n = n^2 + c_1n + c_2, \eqno(4.2)$$
where $c_1, c_2$ are arbitrary constants. The solution of a difference equation like (4.1) is a
sequence of real numbers $(y_0, y_1, y_2, \ldots)$, that we may think of as a function defined on the set $\mathbb{N} = \{0, 1, 2, 3, \ldots\}$ of natural numbers (including 0), so that
$$f : \mathbb{N} \longrightarrow \mathbb{R}, \qquad n \mapsto y_n = y(n).$$
The ‘graph’ of this function would consist of the points (n, yn ) for n = 0, 1, 2, . . .. (It
is important to understand the distinction between a sequence (y0 , y1 , y2 , . . .), and its
underlying set {y0 , y1 , y2 , . . .} in which the order is irrelevant and any repetitions are
ignored; thus a set does not define a function.)
Taking differences is in some ways analogous to differentiation, and one might define the 'derivative' of a sequence $y_n$ to be the new sequence defined by $y_n' = y_{n+1} - y_n$. However, for our purposes, given the original sequence
$$y = (y_0, y_1, y_2, y_3, \ldots),$$
it is more convenient to form
$$Sy = (y_1, y_2, y_3, y_4, \ldots), \qquad S^2y = (y_2, y_3, y_4, y_5, \ldots), \qquad \cdots$$
by means of the operator S which replaces each term of y by its Successor, or equivalently
Shifts the sequence one place to the left. Our original equation (4.1) becomes
$$S^2y - 2Sy + y = g, \quad\text{or}\quad (S - 1)^2y = g,$$
where g stands for the constant sequence (2, 2, 2, . . .) (whose underlying set is {2}), and
we are adding sequences and multiplying them by constants in an obvious term-wise
fashion.
More generally, let $p, q$ be real numbers and consider the difference equations
$$y_{n+2} + py_{n+1} + qy_n = 0, \eqno(4.3)$$
$$y_{n+2} + py_{n+1} + qy_n = g_n, \eqno(4.4)$$
where $(g_0, g_1, g_2, \ldots)$ is an assigned sequence, and the problem is to find $y_n$. In the first example, $g_n$ was equal to 2 for all $n$; more interesting might be the sequence defined by
$$g_n = 3 + (-1)^n = \begin{cases} 4, & n \text{ even},\\ 2, & n \text{ odd}. \end{cases} \eqno(4.5)$$
The equation (4.3) is the homogeneous equation associated to (4.4). The linearity property of the operators $S$ and $S^2 + pS + q$ ensures that, just as for differential equations,
Proposition 4.1.1 The general solution of (4.4) equals any particular solution plus the
general solution of (4.3).
The general solution (4.2) has precisely this form. Had we guessed that $An^2$ was a particular solution of (4.1) for some undetermined coefficient $A$, we could have substituted this to obtain
$$y_{n+2} - 2y_{n+1} + y_n = A(n + 2)^2 - 2A(n + 1)^2 + An^2 = 2A,$$
confirming that $A = 1$.
Proposition 4.2.1 Let $\lambda_1, \lambda_2$ be the roots of the characteristic equation $\lambda^2 + p\lambda + q = 0$. The general solution of (4.3) is $y_n = c_1\lambda_1^n + c_2\lambda_2^n$ if $\lambda_1 \neq \lambda_2$, and $y_n = (c_1n + c_2)\lambda^n$ if $\lambda_1 = \lambda_2 = \lambda$.

Proof. Briefly, if $\lambda_1$ is a root of the characteristic equation, then $\lambda_1^{n+2} + p\lambda_1^{n+1} + q\lambda_1^n = 0$, and $\lambda_1^n$ solves (4.3). If $\lambda$ is a repeated root then $2\lambda^{n+2} + p\lambda^{n+1}$ vanishes, and this implies that $n\lambda^n$ is also a solution. By linearity, the sequences $(y_n)$ defined above solve (4.3) for all values of $c_1, c_2$. Given any other solution $(\tilde y_n)$ of (4.3), one may choose the constants $c_1, c_2$ such that $(x_n = y_n - \tilde y_n)$ is a solution with $x_0 = x_1 = 0$, and it follows easily that $x_n = 0$ for all $n$. $\Box$
A more honest proof that works without prior knowledge of the solutions is given at
the end of this section. Consider next two simple examples:
(i) The homogeneous equation $y_{n+2} - 2y_{n+1} + y_n = 0$ associated with the one in §4.1 has characteristic equation $(\lambda - 1)^2 = 0$ and therefore general solution
$$(c_1n + c_2)1^n = c_1n + c_2,$$
as claimed.
(ii) It is obvious that the general solution of $y_{n+2} + y_n = 0$ has the form
$$(a, b, -a, -b, a, b, -a, -b, \ldots) \;\Rightarrow\; y_{2n} = (-1)^na, \quad y_{2n+1} = (-1)^nb,$$
where $y_0 = a$ and $y_1 = b$ are arbitrary. Alternatively, the roots of the characteristic equation are $\pm i$, and so according to Proposition 4.2.1, $y_n = c_1i^n + c_2(-i)^n$, which amounts to the same thing if we set $a = c_1 + c_2$ and $b = i(c_1 - c_2)$. In general, if $\lambda_1, \lambda_2 = \rho e^{\pm i\theta}$ are complex (assuming always that $p, q$ are real), the solution of (4.3) is better expressed in the form
$$\rho^n(a_1\cos n\theta + a_2\sin n\theta),$$
in analogy to (3.2).
A celebrated solution of a difference equation is the Fibonacci sequence
(0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . .)
which solves the ‘initial value problem’
$$F_{n+2} - F_{n+1} - F_n = 0, \qquad F_0 = 0, \; F_1 = 1. \eqno(4.6)$$
Thus, Fn denotes the nth Fibonacci number with the convention that F1 = 1 = F2 .
The characteristic equation is
$$\lambda^2 - \lambda - 1 = 0,$$
and has roots
$$\phi = \frac{1 + \sqrt5}{2}, \qquad \hat\phi = \frac{1 - \sqrt5}{2}.$$
The positive root $\phi = 1.61803\ldots$ is the so-called golden ratio, and $\hat\phi = 1 - \phi$. Many of the graphs of these notes (such as Figure 7) are automatically framed by an imaginary rectangle whose sides are in the proportion $\phi : 1$, as this is meant to be an especially pleasing shape. A rectangle of size $\phi \times 1$ can be divided into a smaller rectangle of the same shape plus a unit square. $\Box$
The previous proposition yields

Corollary 4.2.2 $F_n = \dfrac{1}{\sqrt5}\,(\phi^n - \hat\phi^n)$.

Proof. To derive the Fibonacci numbers from the general solution $c_1\phi^n + c_2\hat\phi^n$, we need to satisfy the initial conditions
$$0 = F_0 = c_1 + c_2 \;\Rightarrow\; c_2 = -c_1, \qquad 1 = F_1 = c_1(\phi - \hat\phi) \;\Rightarrow\; c_1 = 1/\sqrt5. \qquad\Box$$
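Maple's rsolve command reproduces this closed form; a minimal sketch:

rsolve({F(n+2) = F(n+1) + F(n), F(0) = 0, F(1) = 1}, F(n));
simplify(subs(n = 10, %));     # the radicals cancel, leaving F_10 = 55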
This corollary has some interesting consequences. Firstly, since $|\hat\phi^n/\sqrt5| < 0.5$ for all $n \ge 0$, we have
$$F_n = \frac{1}{\sqrt5}\,\phi^n \quad\text{to the nearest integer.}$$
Also
$$\lim_{n\to\infty}\frac{F_n}{F_{n-1}} = \lim_{n\to\infty}\frac{\phi^n - \hat\phi^n}{\phi^{n-1} - \hat\phi^{n-1}} = \lim_{n\to\infty}\frac{1 - (\hat\phi/\phi)^n}{1 - (\hat\phi/\phi)^{n-1}}\cdot\phi = \phi, \eqno(4.7)$$
as $|\hat\phi/\phi| < 1$. This result is illustrated in Figure 7, which plots the points
$$\{(F_{n-1}, F_n) : 1 \le n \le 10\},$$
and shows that the ratio of successive Fibonacci numbers converges very quickly to $\phi$.
Figure 7: $y = \phi x$
Algebraic proof of Proposition 4.2.1. The homogeneous difference equation (4.3) can be written $(S^2 + pS + q)y = 0$, and as in §3.1, we split this into two first-order equations
$$(S - \lambda_1)u = 0, \quad\text{where}\quad u = (S - \lambda_2)y.$$
The general solution of the first is determined by writing
$$u_n = \lambda_1u_{n-1} = \lambda_1^2u_{n-2} = \cdots = \lambda_1^nu_0.$$
Since $(S - \lambda_2)y = u$, we now get
$$y_n = \lambda_2y_{n-1} + u_0\lambda_1^{n-1} = \lambda_2(\lambda_2y_{n-2} + u_0\lambda_1^{n-2}) + u_0\lambda_1^{n-1} = \lambda_2^2y_{n-2} + u_0(\lambda_1^{n-1} + \lambda_1^{n-2}\lambda_2) = \cdots = \lambda_2^ny_0 + u_0(\lambda_1^{n-1} + \lambda_1^{n-2}\lambda_2 + \cdots + \lambda_2^{n-1}).$$
This is in the familiar form of the general solution of a first-order homogeneous equation, plus a particular solution of a non-homogeneous equation which we must now sum. There are two cases. If $\lambda_1 = \lambda_2$, we get
$$y_n = \lambda_2^ny_0 + u_0\,n\lambda_1^{n-1} = (c_1n + c_2)\lambda_1^n,$$
where $c_1, c_2$ are constants. If $\lambda_1 \neq \lambda_2$ then we can use the identity
$$x^n - 1 = (x - 1)(x^{n-1} + x^{n-2} + \cdots + x + 1) \eqno(4.8)$$
with $x = \lambda_1/\lambda_2$ to get
$$y_n = \lambda_2^ny_0 + u_0\,\frac{\lambda_1^n - \lambda_2^n}{\lambda_1 - \lambda_2} = c_1\lambda_1^n + c_2\lambda_2^n,$$
with different constants, but as required. $\Box$
We now return to applications of Proposition 4.2.1. The following should be compared
to (3.4); it can be proved by straightforward verification.
Lemma 4.3.2 Let $L = S^2 + pS + q$, and let $g$ denote the sequence with $n$th term $g_n = \alpha^n$. Then $Ly = g$ has a particular solution
$$y_n = \begin{cases} \dfrac{1}{(\alpha - \lambda_1)(\alpha - \lambda_2)}\,\alpha^n, & \text{if } \lambda_1 \neq \alpha \neq \lambda_2,\\[2ex] \dfrac{1}{\alpha - \lambda_2}\,n\alpha^{n-1}, & \text{if } \alpha = \lambda_1 \neq \lambda_2,\\[2ex] \tfrac12\,n^2\alpha^{n-2}, & \text{if } \alpha = \lambda_1 = \lambda_2. \end{cases} \qquad\Box$$
Problem 4.3.3 Solve the difference equation
$$y_{n+2} - 5y_{n+1} + 6y_n = 1 + 2^n + 3^n, \qquad y_0 = 0 = y_1.$$
Solution. The equation is analogous to the ODE of Problem 3.2.2, and its characteristic equation has roots 2 and 3. We use the lemma above:
(i) To get the term $1 = 1^n$ on the right, take $y_1 = \frac{1}{(1-2)(1-3)} = \tfrac12$;
(ii) to get $2^n$ take $y_2 = \frac{1}{2-3}\,n2^{n-1}$;
(iii) to get $3^n$ take $y_3 = \frac{1}{3-2}\,n3^{n-1}$.
The general solution of the non-homogeneous equation is therefore
$$y_n = y_1 + y_2 + y_3 + (\text{general solution of } Ly = 0) = \tfrac12 - n2^{n-1} + n3^{n-1} + c_22^n + c_33^n.$$
Finally,
$$y_0 = \tfrac12 + c_2 + c_3 \;\Rightarrow\; c_3 = -c_2 - \tfrac12, \qquad y_1 = \tfrac12 - 1 + 1 + 2c_2 + 3c_3 \;\Rightarrow\; c_2 = -1, \; c_3 = \tfrac12.$$
Therefore,
$$y_n = \tfrac12 - n2^{n-1} + n3^{n-1} - 2^n + \tfrac12\cdot3^n = \tfrac12 - (n + 2)2^{n-1} + (n + \tfrac32)3^{n-1}. \eqno(4.10)$$
Proceeding exactly as in (4.7),
$$\frac{y_{n+1}}{y_n} = \frac{\tfrac12(\tfrac13)^{n-1} - 2(n + 3)(\tfrac23)^{n-1} + 3(n + \tfrac52)}{\tfrac12(\tfrac13)^{n-1} - (n + 2)(\tfrac23)^{n-1} + (n + \tfrac32)}.$$
Since $n\lambda^n \to 0$ whenever $|\lambda| < 1$, we obtain
$$\lim_{n\to\infty}\frac{y_{n+1}}{y_n} = \lim_{n\to\infty}\frac{1 + \tfrac{5}{2n}}{1 + \tfrac{3}{2n}}\cdot 3 = 3,$$
which equals the characteristic root of greatest magnitude. $\Box$
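Alternatively, rsolve handles the whole problem at once; a sketch using the data as reconstructed above:

rsolve({y(n+2) - 5*y(n+1) + 6*y(n) = 1 + 2^n + 3^n, y(0) = 0, y(1) = 0}, y(n));
# the output can be rearranged into the form (4.10)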
Problem 4.3.4 Find an expression for the general term of the sequence defined by
$$y_{n+2} + 2y_{n+1} + y_n = \begin{cases} 4, & n \text{ even},\\ 2, & n \text{ odd}, \end{cases}$$
and satisfying $y_0 = y_1 = y_2$.
Solution. First spot that the right-hand side of the equation equals $g_n$, in the notation of (4.5). The characteristic roots are both $-1$, so a general solution of the homogeneous equation is given by $(a_1 + a_2n)(-1)^n$. Moreover,
(i) a particular solution of $y_{n+2} + 2y_{n+1} + y_n = 3$ is (obviously) $\tfrac34$, and
(ii) a particular solution of $y_{n+2} + 2y_{n+1} + y_n = (-1)^n$ is (from Lemma 4.3.2) $\tfrac12 n^2(-1)^n$.
The general solution of the non-homogeneous equation is therefore
$$\tfrac34 + (a_1 + a_2n + \tfrac12 n^2)(-1)^n,$$
and the conditions $y_0 = y_1 = y_2$ force $a_1 = \tfrac14$ and $a_2 = -1$, giving
$$y_n = \tfrac34 + (\tfrac14 - n + \tfrac12 n^2)(-1)^n. \eqno(4.11)$$
Having found the general formulae (4.10),(4.11) in the previous problems, one might
be tempted to determine the yn that satisfy the respective difference equations with n
a negative integer. This amounts to ‘completing’ the solution backwards from y0 , and
shows more clearly how the magnitude of the roots of the characteristic equation affects
the behaviour of solutions. For the first problem, we obtain a solution
$$(\ldots, \tfrac{2147}{3888}, \tfrac{235}{432}, \tfrac{13}{27}, \tfrac{11}{36}, 0, 0, 3, 21, 101, 415, 1567, \ldots)$$
in which the negative terms are all fractional. By contrast, the second problem gives rise
to a doubly-infinite sequence symmetric about the term y1 :
(. . . , 25, −17, 13, −7, 5, −1, 1, 1, 1, −1, 5, −7, 13, −17, 25, −31, 41, . . .).
Another difference between the two solutions is that the integer function (4.10) can be
extended to a function of a real variable by merely replacing n by x. Finding such an
explicit function assuming the values (4.11) at integer points is much harder.
4.4 Exercises
1. Make a list of the ‘backwards Fibonacci numbers’ Fn that satisfy (4.6) for n ≤ −1.
State and prove a formula that relates these to the ordinary Fibonacci numbers.
(i) $y_{n+2} - 5y_{n+1} + 6y_n = 1$;
(ii) $y_{n+2} - 4y_{n+1} + 4y_n = 2^nn$.
In each case, determine also the unique solution satisfying y0 = 0 = y1 .
6. Let $\delta$ be the length of the indicated diagonals of a regular pentagon with side 1 unit. Show that (i) $\delta\cos(\tfrac25\pi) = \tfrac12$, and (ii) $2\delta^2(1 - \cos(\tfrac15\pi)) = 1$. Deduce that $\delta = \phi$. What is the length of each side of the smaller pentagon?
8. Find the general solution of the equation yn+3 + 2yn+2 − yn+1 − 2yn = n using the
routine
eq:= y(n+3)+2*y(n+2)-y(n+1)-2*y(n)=n:
ic:= y(0)=a,y(1)=b,y(2)=c:
rsolve({eq,ic},y);
What is the simplest particular solution you can find?
5 Numerical Solutions
Example 5.1.1
$$\frac{dy}{dx} = x^2 + y^2. \eqno(5.1)$$
If the term '$x^2$' were missing, the equation would be very easy and would admit solutions $y = 1/(c - x)$ where $c$ is a constant. As it is, (5.1) is a difficult non-linear equation, though it does have a simple interpretation. The equation is asserting that the slope of
any solution passing through the point (x, y) equals the square of the distance of that
point from the origin. An accurate sketch then enables one to make a reasonable guess
at solutions, at least those near the origin. For example, Figure 8 indicates that the
solution y(x) to (5.1) satisfying the initial condition y(0) = 0 has y(1.5) equal, at least
roughly, to 1.5. Below, we shall explain a method that allows y(1.5) to be approximated
more accurately. 2
Consider the general initial value problem
$$y' = f(x, y), \qquad y(x_0) = y_0,$$
where $f$ is a given function and $x_0, y_0$ are given numbers. The solution is the curve passing through $(x_0, y_0)$ whose slope at any point $(x, y)$ equals $f(x, y)$. Choose a small distance $h$ (called the 'step size') and let $x_1 = x_0 + h$. If we use the tangent line to the curve at $(x_0, y_0)$ to approximate the curve, then for small $h$
$$y_1 = y_0 + hf(x_0, y_0)$$
should be close to the true value $y(x_1)$ of the solution at $x = x_1$ (see Figure 9).
Figure 9: one Euler step, from $(x_0, y_0)$ to $(x_1, y_1)$ along the tangent line
Euler’s method repeats the above idea with n steps, and calculates yn recursively.
To approximate the true value $y(x_0 + \ell)$ of the solution at $x = x_0 + \ell$ in $n$ steps, one sets $h = \ell/n$ and puts into action the scheme
Start with x0 y0
Define x 1 = x0 + h y1 = y0 + hf (x0 , y0 )
x2 = x 1 + h y2 = y1 + hf (x1 , y1 )
······
xi = xi−1 + h yi = yi−1 + hf (xi−1 , yi−1 )
······
xn = xn−1 + h yn = yn−1 + hf (xn−1 , yn−1 )
Applying the scheme to (5.1) with initial condition $y(0) = 0$ and step size $h = 0.1$ gives:
Start with x0 = 0 y0 = 0
Define x1 = 0.1 y1 = 0 + 0.1(0 + 0) = 0
x2 = 0.2 y2 = 0 + 0.1(0.01 + 0) = 0.001
x3 = 0.3 y3 = 0.001 + 0.1(0.04 + 0.000001) = 0.005 . . .
x4 = 0.4 y4 = 0.014 . . .
······
x10 = 1.0 y10 = 0.292 . . .
······
x15 = 1.5 y15 = 1.213 . . .
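The computation is easily automated; a minimal Maple sketch (the procedure name euler is ours) that reproduces $y_{15} = 1.213\ldots$:

euler := proc(f, x0, y0, h, n)
  local x, y, i;
  x := evalf(x0); y := evalf(y0);
  for i from 1 to n do
    y := y + h*f(x, y);     # one Euler step
    x := x + h;
  end do;
  y;
end proc:

euler((x, y) -> x^2 + y^2, 0, 0, 0.1, 15);     # about 1.213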
Data in the table below was obtained using the Maple program given in §5.4, though
it took an office machine a few minutes to report back y1500 .
n 1 15 150 1500 15000
an 0 1.213 . . . 1.479 . . . 1.513 . . . 1.517 . . .
Actually, the true solution satisfies 1.5174 < y(1.5) < 1.5175. As it happens, solutions of
(5.1) can be expressed in terms of so-called Bessel functions, although numerical methods
are still needed to evaluate these. The rather naïve plot in Figure 8 conceals a much
more complicated picture further out from the origin. In fact, the solution with y(0) = 0
tends to infinity as x approaches about 2, and solutions of the ODE to the right of this
become increasingly steep and rapidly divergent (see §5.4).
5.2 Theoretical examples

Suppose first that the right-hand side does not involve $y$, so that the problem is
$$y' = f(x), \qquad y(a) = 0,$$
where $a$ is a constant chosen to suit the definition of the function $f$. In this case, one is merely attempting to approximate the definite integral (1.11). With step size $h$, we obtain
$$x_i = a + ih, \qquad y_i = y_{i-1} + hf(a + (i - 1)h),$$
and with a total of $n$ steps each of size $h = \ell/n$,
$$y_n = h\sum_{i=0}^{n-1} f(a + ih).$$
This amounts to approximating the area under the graph of f by means of n rectangles
of width h. Although a silly application of Euler’s method, proceeding blindly can give
some intriguing summation formulae.
For example, taking $f(x) = 1/x$, $a = 1$, $\ell = 1$ gives
$$y_n = \sum_{i=n}^{2n-1}\frac{1}{i} \quad\text{as an approximation to}\quad y(2) = \ln 2. \eqno(5.3)$$
For example, $y_{10} = \tfrac{1}{10} + \tfrac{1}{11} + \tfrac{1}{12} + \cdots + \tfrac{1}{19} = 0.718\ldots$, compared to the true value of $\ln 2 = 0.6931\ldots$. It is well known that the infinite series $\sum_{i=1}^\infty \tfrac1i$ diverges, which means that given any number $M$ (however large) the finite sum starting from $i = 1$,
$$H_n = \sum_{i=1}^n \frac{1}{i},$$
is greater than $M$ for some $n$. (Infinite series are discussed a bit more in §7.4.) The sum $H_n$ is called the $n$th harmonic number and it is easy to show that
$$\frac{1}{n} < H_n - \ln n < 1, \qquad n \ge 2 \eqno(5.4)$$
(see §5.4). This means that Hn , though unbounded, grows extremely slowly; incredibly
for example, H1000 < 8. The validity of (5.3) is an easy consequence of the following
much stronger statement that is beyond the scope of the course:
Theorem 5.2.2 The limit $\lim_{n\to\infty}(H_n - \ln n)$ exists and equals a number $\gamma = 0.5772\ldots$ (called Euler's constant). $\Box$
Consider next the IVP $y' = y$, $y(0) = 1$, which has the exact solution $y(x) = e^x$. Similar techniques can be successfully applied to non-homogeneous ODE's like $y' + y = e^{\alpha x}$. Euler's method with $f(x, y) = y$ gives
$$y_n = (1 + h)y_{n-1} = (1 + h)^2y_{n-2} = \cdots = (1 + h)^ny_0;$$
with $y_0 = 1$ and $h = 1/n$, $n$ steps therefore approximate $y(1) = e$ by $(1 + 1/n)^n$.
This time, provided we quote some results from calculus, it is relatively easy to prove
that the method works:
Proposition 5.2.4 Let $a_n = \Big(1 + \dfrac{1}{n}\Big)^n$. Then $\lim_{n\to\infty} a_n = e$.

Proof. We have
$$\ln a_n = n\ln\Big(1 + \frac{1}{n}\Big) = \frac{\ln(1 + \varepsilon) - \ln 1}{\varepsilon},$$
where $\varepsilon = 1/n$. This means that
$$\lim_{n\to\infty}\ln a_n = \ln'(1) = \frac{1}{1} = 1$$
(using the definition of the derivative, or a special case of l'Hôpital's rule). Hence
$$\lim_{n\to\infty} a_n = \lim_{n\to\infty}\exp(\ln a_n) = \exp\big(\lim_{n\to\infty}\ln a_n\big) = e.$$
This last step is valid because the function $x \mapsto e^x$ is continuous (see §1.1). $\Box$
Like the previous approximation (5.3), this is inaccurate unless we take $n$ to be very large. The table in the next section shows that $n$ needs to be about $10^4$ to get $a_n$ correct to only 3 decimal places.
* 5.3 An improvement
The relation between Euler's method and approximating definite integrals extends to the general case. Integrating the equation $y' = f(x, y)$ between $x_{i-1}$ and $x_i = x_{i-1} + h$ gives
$$y(x_i) - y(x_{i-1}) = \int_{x_{i-1}}^{x_i}\frac{dy}{dx}\,dx = \int_{x_{i-1}}^{x_i}f(x, y(x))\,dx, \eqno(5.6)$$
where y(x) is the true solution. The right-hand integral in (5.6) is approximated by
the area hf (xi−1 , y(xi−1 )) of the rectangle of width h and height equal to the value of
the integrand at xi−1 . However, we do not know y(xi−1 ) and need to replace it by the
previous approximation yi−1 to give the new approximation
$$y(x_i) - y_{i-1} \approx hf(x_{i-1}, y_{i-1}).$$
This turns out to be the Euler formula.
Taking a trapezium instead of a rectangle in an attempt to evaluate (5.6) gives a seemingly better approximation
$$y(x_i) - y_{i-1} \approx \tfrac12h\,\big(f(x_{i-1}, y_{i-1}) + f(x_i, y(x_i))\big).$$
But it is $y(x_i)$ we are after, and if we are unable to solve for it explicitly, it makes sense to replace its occurrence on the right by the old Euler approximation $y_{i-1} + hf(x_{i-1}, y_{i-1})$. This yields

Heun's formula
$$y_i = y_{i-1} + \tfrac12h(s_1 + s_2), \quad\text{where}\quad s_1 = f(x_{i-1}, y_{i-1}), \;\; s_2 = f(x_i,\, y_{i-1} + hs_1). \eqno(5.7)$$
Figure 10 shows the first step graphically. The old approximation is first applied for the
whole step in order to find the value of f at its endpoint. This value is then used to
change the slope of the tangent line for the second half of the step.
Figure 10: Improved first step
For the problem $y' = y$, $y(0) = 1$, one step of (5.7) multiplies $y_{i-1}$ by $1 + h + \tfrac12h^2$, so that $n$ steps with $h = 1/n$ produce the improved approximation $b_n = (1 + h + \tfrac12h^2)^n$ to $e$:

n        a_n             b_n
1        2               2.5
5        2.48832         2.702708...
10       2.593742...     2.714080...
100      2.704813...     2.718236...
1000     2.716923        2.718281376
10000    2.718145        2.718281824
This is apparent in the table, in which the correct digits of $e$ are underlined; if $n$ is made 10 times as big, $a_n$ generally improves by 1 decimal place, but $b_n$ improves by 2.
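Both columns are easy to regenerate; a one-line Maple sketch for $n = 10$:

evalf((1 + 0.1)^10);              # a_10 = 2.5937...
evalf((1 + 0.1 + 0.005)^10);      # b_10 = 2.7140...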
The improved Euler method is still too inaccurate for serious use, although it is the
prototype of more complicated ‘predictor-corrector’ methods. Most computer packages
perform numerical integration with the so-called Runge-Kutta method, which is a re-
finement yielding quartic rather than quadratic accuracy [6, §20.1]. One version of this
is defined below.
5.4 Exercises
1. The function $e^{-5x}$ is the unique solution of the equation $y' + 5y = 0$ satisfying $y(0) = 1$. Approximate $e^{-5}$ by applying Euler's method to this equation on the interval $0 \le x \le 1$; take step size equal to (i) 0.5, (ii) 0.2, (iii) 0.1, and comment on the outcomes.
2. Apply Euler's method with 10 steps to the ODE $y' = (\ln x)y$, in order to approximate the value $y(2)$ of the solution satisfying $y(1) = 1$. Compare your answer with the true value.
4. Find the exact value of y(2) in the previous question by first substituting y = u + x
and then applying a technique from §2.5.
5. (i) By considering the area under the graph of $1/x$, prove that
$$\sum_{j=2}^n\frac{1}{j} \;<\; \int_1^n\frac{1}{x}\,dx \;<\; \sum_{j=1}^{n-1}\frac{1}{j}, \qquad n \ge 2,$$
f:= (x,y) -> x^2 + y^2:    # initialization lines assumed, for (5.1) with y(0) = 0
n:= 15: h:= 1.5/n:
x(0):= 0: y(0):= 0:
for i from 0 to n-1 do
  x(i+1):= x(i)+h:
  y(i+1):= y(i)+h*f(x(i),y(i))
od;
7. Let $y(x)$ denote the solution to (5.1) satisfying $y(0) = 0$, plotted in Figure 8. By comparing $y(x)$ to $1/(\tfrac52 - x)$, and given that $y(1.5) > 1$, show that $y$ becomes infinite for $x < \tfrac52$. Investigate the solution starting with
eq:= D(y)(x)=x^2+y^2:
dsolve({eq,y(0)=0},y(x)):
assign("):
plot(y(x),x=0..4);
6 Partial Derivatives
6.1 Functions of two variables

Consider a function $f(x, y)$ of two variables; the computations below refer to the example $f(x, y) = y/x$. The partial derivative of $f$ with respect to $x$ at $(a, b)$ is
$$\frac{\partial f}{\partial x}\bigg|_{(a,b)} = f_x(a, b) = \lim_{x\to a}\frac{f(x, b) - f(a, b)}{x - a}$$
(equal in the example to $-b/a^2$). Similarly, the partial derivative with respect to $y$ is given by
$$\frac{\partial f}{\partial y}\bigg|_{(a,b)} = f_y(a, b) = \lim_{y\to b}\frac{f(a, y) - f(a, b)}{y - b}$$
(equal to $1/a$).
We may also define various higher partial derivatives:
$$\frac{\partial}{\partial x}\Big(\frac{\partial f}{\partial x}\Big) = \frac{\partial^2 f}{\partial x^2} = (f_x)_x = f_{xx} = \frac{2y}{x^3},$$
$$\frac{\partial}{\partial y}\Big(\frac{\partial f}{\partial x}\Big) = \frac{\partial^2 f}{\partial y\,\partial x} = (f_x)_y = f_{xy} = -\frac{1}{x^2},$$
$$\frac{\partial}{\partial x}\Big(\frac{\partial f}{\partial y}\Big) = \frac{\partial^2 f}{\partial x\,\partial y} = (f_y)_x = f_{yx} = -\frac{1}{x^2},$$
$$\frac{\partial^2 f}{\partial y^2} = f_{yy} = 0.$$
This illustrates an important result, namely that for all common functions one has
$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}, \quad\text{or}\quad f_{xy} = f_{yx};$$
i.e., the ‘mixed’ second-order partial derivatives are equal. The precise statement is
Theorem 6.1.2 Let $f(x, y)$ be a function of two variables such that $f, f_x, f_y, f_{xy}, f_{yx}$ are defined and continuous on some rectangle with centre $(a, b)$. Then $f_{xy}(a, b) = f_{yx}(a, b)$.
We shall not prove this result which relies on the concept of continuity in the context of
$\mathbb{R}^2$. Instead, we remark that the function
$$g(x, y) = \begin{cases} \dfrac{x^3y - y^3x}{x^2 + y^2}, & \text{if } (x, y) \neq (0, 0),\\[1.5ex] 0, & \text{if } x = 0 = y \end{cases} \eqno(6.1)$$
fails the conclusion at the origin, where $g_{xy}(0, 0) \neq g_{yx}(0, 0)$ (see Exercise 1 of §6.5).
by applying the mean value theorem to the function $t \mapsto f(t, b + d\varepsilon)$ ($\lambda_1$ depends on $\varepsilon$). Therefore
$$h'(0) = \lim_{\varepsilon\to 0}\,[f_x(a + \lambda_1c\varepsilon,\, b + d\varepsilon)c + f_y(a,\, b + \lambda_2d\varepsilon)d], \qquad \lambda_2 \in (0, 1),$$
which equals $f_x(a, b)c + f_y(a, b)d$ by the assumed continuity of $f_x$ and $f_y$.
The following more complicated situation can be omitted on a first reading. Suppose
now that x and y both depend themselves on three variables s, t, u, so that
h(s, t, u) = f (x(s, t, u), y(s, t, u)).
Then in the language of partial derivatives, Proposition 6.2.1 gives
$$h_s = f_xx_s + f_yy_s, \qquad h_t = f_xx_t + f_yy_t, \qquad h_u = f_xx_u + f_yy_u.$$
These equations are best interpreted using a matrix scheme, regarding the function $h$ as a composition of mappings from right to left:
$$\mathbb{R} \xleftarrow{\;f\;} \mathbb{R}^2 \longleftarrow \mathbb{R}^3, \qquad h(s, t, u) \;\leftarrow\; \begin{pmatrix} x(s, t, u)\\ y(s, t, u) \end{pmatrix} \;\leftarrow\; \begin{pmatrix} s\\ t\\ u \end{pmatrix}.$$
The chain rule then expresses the partial derivatives of $h$ as the matrix product
$$(h_s, h_t, h_u) = (f_x, f_y)\begin{pmatrix} x_s & x_t & x_u\\ y_s & y_t & y_u \end{pmatrix}.$$
Proposition 6.3.1 If $h(x, y) = g(f(x, y))$ then
$$h_x = g'(f(x, y))\,f_x, \qquad h_y = g'(f(x, y))\,f_y.$$
In contrast to Proposition 6.2.1, no proof is required beyond that of the ordinary chain
rule in §1.1.
These formulae are special cases of what is appropriately called Euler's formula:
$$xf_x + yf_y = wf.$$
Proof. Fix $c, d$ and let $h(t) = f(tc, td) = t^wf(c, d)$. From Proposition 6.2.1,
$$cf_x(tc, td) + df_y(tc, td) = h'(t) = wt^{w-1}f(c, d),$$
and setting $t = 1$ gives $cf_x(c, d) + df_y(c, d) = wf(c, d)$. This is a more precise statement of Euler's formula which avoids the confusion between subscripts and variables. $\Box$
determined by the single number $f(x, y, z)$. Using the dot product for vectors, Euler's equation for a homogeneous function of 3 variables can be expressed in the neat form
$$(x, y, z)\cdot\nabla f = wf, \eqno(6.5)$$
where
$$\nabla f = \Big(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\Big) \eqno(6.6)$$
is the gradient of $f$ [6, §8.9]. The left-hand side of (6.5) represents a 'directional derivative' of $f$ in the radial direction.
* 6.4 Some partial differential equations

The equation
$$c^2h_{xx} = h_{yy} \eqno(6.8)$$
is the 1-dimensional wave equation, with $y$ representing time and $x$ displacement. It is easy to prove that its general solution has the form
$$h(x, y) = g_1(x + cy) + g_2(x - cy) \eqno(6.9)$$
(see §6.5). Taking for example $g_1 = g_2 = \sin$ gives a 'stationary wave' $2\sin x\cos(cy)$. Similar solutions can also be found by substituting $h(x, y) = h_1(x)h_2(y)$ into (6.8); the variables 'separate', leaving the two ODE's
$$h_1'' = \kappa h_1, \qquad h_2'' = \kappa c^2h_2,$$
for some constant $\kappa$. Solutions of Laplace's equation
$$h_{xx} + h_{yy} = 0 \eqno(6.10)$$
can be sought in the form $h(x, y) = g(x + iy)$, provided the derivative $g'(\zeta)$ makes sense with $\zeta$ a complex variable, and the method works when $g$ is one of many standard functions:
Proposition 6.4.1 If $\zeta = x + iy$, the real and imaginary parts of $\zeta^n$ (for any $n \in \mathbb{Z}$), $e^\zeta$ and $\ln\zeta$ all satisfy (6.10). $\Box$
We do not justify this theoretically, since it is an easy matter to verify the solutions in each case. For example, the real and imaginary parts of $\zeta^2$ are $x^2 - y^2$ and $2xy$, and obviously solve (6.10). For $\zeta^n$ with $n > 2$, see §6.5. Also, the real part of $\ln\zeta$ is
$$\ln r = \tfrac12\ln(x^2 + y^2),$$
which does solve (6.10). The quickest way to see this is to note (e.g. by differentiating the equation $r^2 = x^2 + y^2$ partially with respect to $x$) that $r_x = x/r$. Then $(\ln r)_x = x/r^2$ and
$$(\ln r)_{xx} + (\ln r)_{yy} = -\frac{2x^2}{r^4} + \frac{1}{r^2} - \frac{2y^2}{r^4} + \frac{1}{r^2} = 0.$$
An extension of the last calculation concerns a problem relevant to the theory of
gravitation:
Problem 6.4.2 Find a function $g$ such that if $r$ now denotes $\sqrt{x^2 + y^2 + z^2}$ then $h(x, y, z) = g(r)$ solves the 3-dimensional Laplace equation $h_{xx} + h_{yy} + h_{zz} = 0$.
Solution. To start, we have
$$h_x = g'(r)\,r_x = g'\,\frac{x}{r},$$
and so
$$h_{xx} = (g')_x\frac{x}{r} + g'\,\frac{1}{r} + g'x\Big(-\frac{r_x}{r^2}\Big) = g''\frac{x^2}{r^2} + g'\Big(\frac{1}{r} - \frac{x^2}{r^3}\Big),$$
with similar expressions for $h_{yy}$ and $h_{zz}$. Adding all three,
$$h_{xx} + h_{yy} + h_{zz} = g'' + g'\Big(\frac{3}{r} - \frac{1}{r}\Big) = g'' + \frac{2}{r}\,g';$$
the '3' here really arises as the dimension of the space we are considering. To solve Laplace's equation, we need
$$0 = r^2\Big(g'' + \frac{2}{r}\,g'\Big) = (r^2g')' \;\Rightarrow\; g' = \frac{c_1}{r^2} \;\Rightarrow\; g(r) = -\frac{c_1}{r} + c_2.$$
In particular $1/r$ is a solution. $\Box$
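The verification can also be left to Maple; a two-line sketch:

r := sqrt(x^2 + y^2 + z^2):
simplify(diff(1/r, x, x) + diff(1/r, y, y) + diff(1/r, z, z));     # returns 0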
6.5 Exercises
1. (i) Compute the partial derivatives $f_x, f_y, f_{xx}, f_{xy}, f_{yx}, f_{yy}$ of the function $f(x, y) = e^x\sin y$, and verify that $f_{xy} = f_{yx}$.
(ii) Compute the partial derivatives $g_x, g_y$ of (6.1), being careful to adopt the correct definition to find their values at $(0, 0)$. Then prove that $g_{xy}(0, 0) = -1$ and $g_{yx}(0, 0) = 1$.
3. The coordinates of a particle moving in the plane are $x(t) = 2\cos t$ and $y(t) = \sin t$, where $t$ denotes time. Verify that the particle moves along the ellipse $f \equiv 0$ where $f(x, y) = x^2 + 4y^2 - 4$, and sketch this curve.
Compute the partial derivatives $f_x, f_y$ and verify that $f_xx'(t) + f_yy'(t) = 0$ for points on the curve. Evaluate the vectors $(x'(t), y'(t))$ and $(f_x, f_y)$ when (i) $t = 0$, (ii) $t = \pi/4$, and represent them as arrows emanating from the corresponding point $(x(t), y(t))$ on your sketch.
4. Let u = x + cy and v = x − cy . Show that (6.8) transforms into huv = 0, and deduce
the general solution (6.9).
5. Let ζ = x + iy. Show that the real part of ζ^n equals

  \sum_{k=0}^{⌊n/2⌋} (−1)^k \binom{n}{2k} x^{n−2k} y^{2k},

where ⌊n/2⌋ means the largest integer less than or equal to n/2. Deduce that this function is a solution of Laplace's equation (6.10).
6. (i) Let f(x, t) = y(x/(2√t)), where y is defined by (1.13). Prove that f satisfies the heat equation

  fxx = ft.
If f is defined as before, but with y arbitrary, what ODE needs to be satisfied by y for
f to satisfy the same PDE?
(ii) Find solutions to the heat equation in the form h1 (x)h2 (t).
8. Figure 11 exhibits Dirac’s electron equation in an unusual setting. It actually rep-
resents a system of four first-order PDE’s, as ψ stands for complex-valued functions
ψi (x1 , x2 , x3 , t), i = 1, 2, 3, 4, each depending on spatial coordinates x1 , x2 , x3 and time
t. The symbols σ, ρ1 , ρ3 denote 4 × 4 matrices, m is a constant, and ∇ encodes the
operators ∂i = ∂/∂xi as in (6.6). Written out in full the equations become
  i ∂ψ1/∂t      ( 0          0          ∂3         ∂1 − i∂2 ) (ψ1)     ( m  0   0   0 ) (ψ1)
  i ∂ψ2/∂t  =  i( 0          0          ∂1 + i∂2   −∂3      ) (ψ2)  +  ( 0  m   0   0 ) (ψ2)
  i ∂ψ3/∂t      ( ∂3         ∂1 − i∂2   0          0        ) (ψ3)     ( 0  0  −m   0 ) (ψ3)
  i ∂ψ4/∂t      ( ∂1 + i∂2   −∂3        0          0        ) (ψ4)     ( 0  0   0  −m ) (ψ4)

Figure 11
7 Binomial Coefficients
The product

  (1 + x1)(1 + x2)(1 + x3) = 1 + x1 + x2 + x3 + x2x3 + x3x1 + x1x2 + x1x2x3

consists of 2^3 terms, each corresponding to a subset of X = {x1, x2, x3}: the initial 1 corresponds to ∅, the terms xi to the singletons {xi}, the terms xixj to the pairs {xi, xj}, and the final term to X itself. Putting x1 = x2 = x3 = x gives the familiar identity

  (1 + x)^3 = 1 + 3x + 3x^2 + x^3.
Proposition 7.1.2

  (x + y)^n = \sum_{k=0}^{n} n!/((n − k)! k!) x^k y^{n−k}.    (7.2)

Proof. First observe that when n = 0, (7.2) becomes the equation 1 = 0!/(0! 0!). We deem this to be true by defining

  0! = 1 and 0^0 = 1    (7.3)
Figure 12:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1
1 10 45 120 210 252 210 120 45 10 1
(the latter is needed in case x + y = 0). Now suppose that (7.2) is true for a particular
value of n, and consider
  (x + y)^{n+1} = \sum_{k=0}^{n} n!/((n − k)! k!) x^k y^{n−k} (x + y)

    = \sum_{j=1}^{n} [ n!/((n − j)! j!) + n!/((n − j + 1)! (j − 1)!) ] x^j y^{n+1−j} + x^{n+1} + y^{n+1}

    = n! \sum_{j=1}^{n} (n − j + 1 + j)/((n − j + 1)! j!) x^j y^{n+1−j} + x^{n+1} + y^{n+1}

    = \sum_{j=0}^{n+1} (n + 1)!/((n + 1 − j)! j!) x^j y^{n+1−j}.
The calculations are consistent with (7.3), and the last sum is simply (7.2) with n
replaced by n+1. Starting with n = 0 and proceeding by induction therefore establishes
the validity of (7.2) for all non-negative integers n. 2
Many properties of the binomial coefficients are apparent from their triangular display in Figure 12 (discussed by Pascal in about 1650). The significance of the underlined and boxed numbers is explained in §7.4 and §7.5 respectively.
Proof. Each identity can be justified in two ways, depending on whether the binomial coefficient is defined by factorials, or using Lemma 7.1.1.
(i) is obvious from the symmetry in the factorial formula. From the point of view of subsets, the result reflects the fact that we may associate to a subset S of X = {1, 2, . . . , n} of size k its complement S^c = X \ S of size n − k.
(ii) is the addition formula that formed the crucial step in the proof of Proposition 7.1.2, and is extended by the convention that \binom{n}{k} = 0 whenever k < 0 or k > n. It corresponds to the fact that when we choose k elements from the set

  {1, 2, 3, . . . , n + 1} = X ∪ {n + 1},

either n + 1 is one of them, leaving k − 1 elements to be chosen from X, or all k come from X; these two cases account for the respective terms \binom{n}{k−1} and \binom{n}{k}. 2
7.2 Probabilities
Motivated by Lemma 7.1.3(iii), suppose now that x + y = 1, so that
  \sum_{k=0}^{n} \binom{n}{k} x^k y^{n−k} = 1.    (7.4)
(7,∗,∗,7,7,. . . ,7,∗)
with k 7’s and n − k totals different from 7 (indicated by the symbol ∗).
The probability that there will be a total of exactly k 7’s amongst the n throws (let
us call this eventuality Ek) equals \binom{n}{k} x^k y^{n−k}, since the binomial coefficient counts the
number of ways of choosing k from n. Seen in this light, the binomial formula (7.4)
simply asserts that the probabilities of the ‘mutually exclusive’ events Ek sum to 1.
A simpler case is that in which x = y = 1/2, as with tossing an unbiased coin. The
rows of Pascal's triangle confirm that, naturally enough, the most likely of the events Ek
are now those with k about half of n. As n gets large this favouring of middle values of
k is represented by Figure 13 in which the points
  {(k/n, \binom{n}{k}/2^n) : 0 ≤ k ≤ n}
have been plotted for 1 ≤ n ≤ 100.
Problem 7.2.1 A poker player is dealt a hand of 5 cards from a conventional pack of
52, and the order of the 5 cards is immaterial. How many of the possible hands are
(i) flushes (meaning that all 5 cards belong to only one of the 4 suits)?
(ii) straights or runs (meaning that the cards are 5 consecutive ones in the list A,2,3,4,5,6,
7,8,9,10,J,Q,K,A)? Estimate the respective probabilities.
Solution. (i) To count the number of different (hands that are) flushes, we choose the
suit and then the 5 cards from a possible 13. So the number is
  4 · \binom{13}{5} = 4 · (13 · 12 · 11 · 10 · 9)/(1 · 2 · 3 · 4 · 5) = 5148,

and the probability

  5148/\binom{52}{5} = 5148/2598960 = 0.00198 . . .    (7.5)
is about 1 in 500. Alternatively, whatever the first card is, the next card must be one of
12 in the remaining 51, and so on; this gives directly the probability of a flush as
  (12/51) · (11/50) · (10/49) · (9/48) = (4 · 13! · 47!)/(52! · 8!),
which gives the same answer.
(ii) To count the number of runs, we choose a lowest card, which can be any one of
A,2,3,. . . ,10, and then suits for each of the 5. This gives a probability of
  (10 · 4^5)/\binom{52}{5} = 10240/2598960 = 0.00394 . . .
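These counts are easy to check in Maple, whose binomial function evaluates the coefficients directly (a quick numerical check rather than part of the solution):

evalf(4*binomial(13,5)/binomial(52,5));   # probability of a flush, 0.001980...
evalf(10*4^5/binomial(52,5));             # probability of a run, 0.003940...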
Example 7.2.2 Let bd(k) be the probability that no two persons in a group of k share the same birthday. Then bd(k) = 365^{\underline{k}}/365^k.

This formula (that uses the notation (1.6)) is based on the ideal assumption that birthdays are equally distributed throughout the year and that none occur on 29 February. If the persons are placed in order and asked to call out their birthday one at a time, the total number of outcomes conceivable is 365^k. The number of these needed to prevent a common birthday is

  365 · 364 · 363 · · · (365 − k + 1) = 365^{\underline{k}}.
  \binom{n}{k} = n^{\underline{k}}/k! = n(n − 1) · · · (n − k + 1)/k!    (7.6)

whenever 1 ≤ k ≤ n ∈ N. Given that the definition of the numerator n^{\underline{k}} makes sense when n is replaced by an arbitrary real number r, we make the definition

  \binom{r}{k} = r^{\underline{k}}/k! = r(r − 1) · · · (r − k + 1)/k!.
These may be called ‘binomial coefficients with real exponent r’. For instance,
  \binom{4}{6} = (4 · 3 · 2 · 1 · 0 · (−1))/(6 · 5 · 4 · 3 · 2 · 1) = 0,
  \binom{1/2}{4} = ((1/2)(−1/2)(−3/2)(−5/2))/(4 · 3 · 2 · 1) = −5/128.
Note that \binom{r}{0} equals 1 for all r by definition; this is consistent with (7.3).
Many of the familiar identities involving ordinary binomial coefficients have their
counterparts in which the exponent is negative. This is often thanks to
Lemma 7.3.2 \binom{−n}{k} = (−1)^k \binom{n+k−1}{k}.

Proof. A straightforward application of the definition:

  ((−n)(−n − 1) · · · (−n − k + 1))/k! = (−1)^k ((n + k − 1)(n + k − 2) · · · n)/k!.    2
  | | | × × × × |

The answer is \binom{n+k−1}{n−1} = \binom{n+k−1}{k}. 2
Somewhat more surprising is the way in which the generalized coefficients with r =
−1/2 reduce to the ‘middle’ binomial coefficients:
Lemma 7.3.4 (−4)^k \binom{−1/2}{k} = \binom{2k}{k}.

Proof.

  \binom{−1/2}{k} = ((−1/2)(−3/2) · · · ((−2k + 1)/2))/k!

    = (−1)^k (1 · 3 · 5 · · · (2k − 1))/(2^k k!) · (2 · 4 · 6 · · · 2k)/((1 · 2 · 3 · · · k) 2^k)

    = (−1)^k (2k)!/(2^{2k} (k!)^2)

    = (−1/4)^k \binom{2k}{k}.    2
7.4 Infinite series
First, a digression. Let a = (a0, a1, a2, . . .) be a sequence of real numbers, and let s_n = \sum_{i=0}^{n} a_i be the so-called partial sums. Then s = (s0, s1, s2, . . .) is a new sequence. If s converges to ℓ, which simply means that lim_{n→∞} s_n exists and equals ℓ, one writes

  \sum_{i=0}^{∞} a_i = ℓ,
and says that the infinite series converges to `. As an example, suppose that |x| < 1 so
that xn → 0 as n → ∞. Taking ai = xi and using the formula
  1 + x + x^2 + · · · + x^n = (1 − x^{n+1})/(1 − x) = 1/(1 − x) − x^{n+1}/(1 − x),
that follows from (4.8), yields
Lemma 7.4.1 \sum_{i=0}^{∞} x^i = 1/(1 − x) for |x| < 1. 2
and as x approaches 1 the logarithm becomes infinite. Incidentally, one can also differ-
entiate the lemma to obtain (after multiplying both sides by x) the useful formula
  \sum_{i=1}^{∞} i x^i = x/(1 − x)^2.    (7.7)
Lemmas from §7.3 show that (−1)^k \binom{−n}{k} is the coefficient of x^k in the expansion

  (1 + x + x^2 + x^3 + · · · + x^k + · · ·)^n = (\sum_{i≥0} x^i)^n = (1/(1 − x))^n.
  (1 + x)^n = \sum_{k=0}^{∞} \binom{n}{k} x^k, for all n ∈ Z,

  (1 + x)^r = \sum_{k=0}^{∞} \binom{r}{k} x^k.
Referring back to Figure 12 shows what has been accomplished in this section. The binomial coefficients with which we began specified the row entries of the triangle, whereas the most recent formulae provide (in the language of §8.1) generating functions for columns and diagonals.
7.5 Exercises
(ii) By first differentiating, show that \sum_{j=1}^{n} (−1)^j j \binom{n}{j} = 0.

(iii) Is it true that \sum_{j=1}^{n} (−1)^j j^2 \binom{n}{j} = 0?
2. Further to Problem 7.2.1, how many hands are running flushes (both (i) and (ii))? How
many hands are ‘full houses’, meaning three cards are of one value and the remaining
two of another (such as K9KK9)?
3. The National Lottery consists in individuals choosing 6 numbers between 1 and 49,
after which 7 numbers in the same range are drawn at random. A first prize is given for
having chosen (in any order) the first 6 numbers drawn, and a second prize for having
chosen (in any order) the 7th number drawn and any 5 of the first 6. Find the respective
probabilities p1 , p2 of success.
4. (i) Let k, n ∈ N. Prove the 'negative' version of Lemma 7.1.3(ii), namely that

  \binom{−n}{k} = \binom{−n−1}{k} + \binom{−n−1}{k−1}.

(ii) Applying Lemma 7.3.2 (with m = n + k) and (i) repeatedly, show that

  \sum_{j=0}^{k} (−1)^j \binom{m}{j} = (−1)^k \binom{m−1}{k}.
5. A construction toy consists of 100 bricks of the same size but with varying numbers
of the colours black, red and yellow. Show that there are 4851 different ways of putting
together such a pack so that there is at least one brick of each colour. (It may help to
know that 4851 is a binomial coefficient.) How many packs are there if it is no longer
necessary that all colours be represented?
7. A total of n straight lines are drawn in the plane such that no two are parallel, and
no three meet in a single point. Find the resulting number (i) a of points of intersection,
(ii) b of line segments (finite or infinite), and (harder) (iii) c of regions (finite or infinite)
the plane is divided into. Deduce that a − b + c = 1.
8 Generating Functions
Given a sequence (a0, a1, a2, . . .), the aim is to express the sum of the power series a0 + a1x + a2x^2 + · · · as a function g(x) in 'closed form'. If this is possible, the series is likely to converge for |x| sufficiently small, though in any case g(x) is called the generating function (GF) for the sequence. The theory of such generating functions is concerned more with formal manipulations than with questions of convergence, which we shall return to in §9.3.
The concept is illustrated by some common GF’s that arise from §7.4:
  1/(1 − x) is the GF of (1, 1, 1, 1, 1, . . .),

  1/(1 + λx) is the GF of (1, −λ, λ^2, −λ^3, λ^4, . . .),

  1/√(1 − 4x) is the GF of (1, 2, 6, 20, 70, 252, . . .).
It is often convenient to write e^x as exp x. The equation exp′ = exp reflects the fact that the sequence (1, 1, 1/2!, 1/3!, 1/4!, . . .) is unchanged by the 'differentiation' operation described in the lemma. Rewriting Proposition 7.1.2 in the form

  (1/n!) (x + y)^n = \sum_{k=0}^{n} (x^k/k!) (y^{n−k}/(n − k)!)    (8.1)
corresponds to the equation
exp(x + y) = exp x · exp y.
Problem 8.1.2 Use generating functions to show that \sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}.
Solution. We have

  (1 + x)^n = \sum_{j=0}^{n} \binom{n}{j} x^j,    (1 + 1/x)^n = \sum_{k=0}^{n} \binom{n}{k} x^{−k}.

It follows that \sum_{k=0}^{n} \binom{n}{k}^2 equals the constant term in

  (1 + x)^n (1 + 1/x)^n = (1/x^n) (1 + x)^{2n},
which gives the required answer. 2
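The identity is easily confirmed numerically for any particular n, say n = 10 (a Maple check, not a proof):

sum(binomial(10,k)^2, k=0..10), binomial(20,10);   # both equal 184756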
Just how ‘closed’ a GF needs to be is a matter for debate, and therein lies the
usefulness of the concept. Consider for example the sum
  g(x) = \sum_{j=1}^{∞} x^j/(1 − x^j)    (8.2)
Example 8.1.3 The exponential GF for the sequence (B0 , B1 , B2 , B3 , . . .) of Bernoulli
numbers is x/(ex − 1).
This defines the kth Bernoulli number by the formula

  x/(e^x − 1) = \sum_{k=0}^{∞} (B_k/k!) x^k.    (8.3)
Now,

  e^x − 1 = x(1 + (1/2!)x + (1/3!)x^2 + (1/4!)x^3 + · · ·),

and so the left-hand side of (8.3) may be expanded as

  (1 + (1/2!)x + (1/3!)x^2 + (1/4!)x^3 + · · ·)^{−1} = 1 + \sum_{j=1}^{∞} (−1)^j ((1/2!)x + (1/3!)x^2 + · · ·)^j

    = 1 − (1/2)x + (1/12)x^2 − (1/720)x^4 + · · ·
8.2 Derangements
A permutation of X = {1, 2, . . . , n} is a reordering of its elements, i.e. a bijective map f from the set to itself. An element i ∈ X is called a fixed point of f if f(i) = i. A permutation with no fixed points is called a derangement, and dn denotes the number of derangements of X.
Proposition 8.2.2 The exponential GF of the sequence (d0 , d1 , d2 , . . .) is e−x /(1 − x).
Proof. Let 0 ≤ k ≤ n. The number of permutations of {1, 2, . . . , n} with exactly k fixed points equals \binom{n}{k} d_{n−k}, though for this to work for k = n we need to define d0 = 1. Therefore

  n! = dn + n d_{n−1} + \binom{n}{2} d_{n−2} + · · · + \binom{n}{n−2} d2 + 0 + 1

     = \sum_{k=0}^{n} \binom{n}{n−k} d_k

  ⇒ 1 = \sum_{k=0}^{n} (d_k/k!) · (1/(n − k)!).
The right-hand side equals the coefficient of x^n in

  (\sum_{k=0}^{∞} (d_k/k!) x^k)(\sum_{j=0}^{∞} (1/j!) x^j) = h(x) e^x,

where h(x) is the exponential GF of (d0, d1, d2, . . .). Since the equation holds for every n, h(x)e^x is the GF of (1, 1, 1, . . .), i.e. h(x)e^x = 1/(1 − x), whence h(x) = e^{−x}/(1 − x). 2

Multiplying the series for e^{−x} by that of 1/(1 − x) and comparing coefficients of x^n gives

  dn/n! = 1 − 1 + 1/2! − 1/3! + · · · + (−1)^n (1/n!) = \sum_{k=2}^{n} (−1)^k/k!.    (8.5)
For example,
  d5/120 = 1/2 − 1/6 + 1/24 − 1/120  ⇒  d5 = 60 − 20 + 5 − 1 = 44,
and in this way we may fill in some more values:
n dn dn /n!
0 1 1
1 0 0
2 1 0.5
3 2 0.333333 . . .
4 9 0.375
5 44 0.366666 . . .
6 265 0.368055 . . .
7 1854 0.367857
8 14833 0.367881
9 133496 0.367879
10 1334961 0.367879
Remark. In the proof of Proposition 8.2.2, we used the following useful fact:
g(x) is the GF of a = (a0, a1, a2, . . .) and h(x) is the GF of b = (b0, b1, b2, . . .)

  ⇒ g(x)h(x) is the GF of c = (c0, c1, c2, . . .), where c_n = \sum_{k=0}^{n} a_k b_{n−k}.
A Venn diagram of the three overlapping sets X1, X2, X3 shows that
|X1 ∪ X2 ∪ X3 | = |X1 | + |X2 | + |X3 | − |X23 | − |X31 | − |X12 | + |X123 |.
This equation generalizes in an obvious way:
Proposition 8.3.1

  |X1 ∪ · · · ∪ Xn| = \sum_{i} |X_i| − \sum_{i<j} |X_{ij}| + \sum_{i<j<k} |X_{ijk}| − · · · − (−1)^n |X_{12···n}|.
Proof. We must check that each element x in the union X1 ∪ · · · ∪ Xn is counted exactly
once on the right-hand side. This is correct for the following reason. Suppose, for the sake of argument, that

  x belongs to exactly 5 of the X_i
  ⇒ x belongs to exactly \binom{5}{2} of the X_{ij}
  ⇒ x belongs to exactly \binom{5}{3} of the X_{ijk}
  · · · · · ·

Then on the right-hand side of the equation we are trying to prove, the number of times x is counted equals

  \binom{5}{1} − \binom{5}{2} + \binom{5}{3} − \binom{5}{4} + \binom{5}{5}
    = 1 − [\binom{5}{0} − \binom{5}{1} + \binom{5}{2} − \binom{5}{3} + \binom{5}{4} − \binom{5}{5}]
    = 1 − (1 − 1)^5 = 1.    2
In practice, X_i may denote the set of objects of X having a particular property 'P_i'. Then X_{ij} denotes the set of objects possessing at least the properties P_i and P_j, and so on. The number of objects in the set having none of the n properties equals |X| − |X1 ∪ · · · ∪ Xn|, which can be calculated by means of the proposition.
As an application, let Xi denote the set of permutations f of {1, 2, . . . , n} such that
f (i) = i. Then
|Xi | = (n − 1)!, |Xij | = (n − 2)!, etc.
The number of non-derangements of {1, 2, . . . , n} is then
  n! − dn = |X1 ∪ · · · ∪ Xn| = \binom{n}{1}(n − 1)! − \binom{n}{2}(n − 2)! + \binom{n}{3}(n − 3)! − · · · ∓ 1

    = n! (1 − 1/2! + 1/3! − · · · ∓ 1/n!),

in agreement with (8.5). Comparing the sums (8.5) for n and n − 1 gives

  dn = n d_{n−1} + (−1)^n.    (8.6)
An instance of the latter is clearly visible near the bottom of the table in §8.2.
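The recurrence (8.6) translates into a short recursive Maple procedure in the style of the machine exercises elsewhere in these notes (a sketch; the name d is ours), reproducing the table above:

d := proc(n)
  if n = 0 then 1 else n*d(n-1) + (-1)^n fi
end:
seq(d(n), n=0..10);   # 1, 0, 1, 2, 9, 44, 265, 1854, 14833, 133496, 1334961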
Problem 8.3.2 How many anagrams of ABCDEFGH contain at least one of the pairs
AB,CD,EF,GH (such as BACHDEFG)?
Solution. To proceed with the help of Proposition 8.3.1, let
In this section, we shall find the generating function F (x) for the Fibonacci sequence
(F0 , F1 , F2 , F3 , . . .) = (0, 1, 1, 2, 3, 5, 8, . . .)
defined by the difference equation Fn+2 − Fn+1 − Fn = 0, or
S 2 F − SF − F = 0, (8.7)
where SF and S 2 F are the shifted sequences which start with F1 and F2 respectively
(see §4.1).
Since F(x) = F0 + F1x + F2x^2 + · · ·, with F0 = 0 and F1 = 1, the GF's of SF and S^2F are respectively

  (1/x) F(x) = F1 + F2x + F3x^2 + · · ·,

  (1/x^2) F(x) − 1/x = F2 + F3x + F4x^2 + · · ·

Thus (8.7) becomes

  (1/x^2) F(x) − 1/x − (1/x) F(x) − F(x) = 0
  ⇒ F(x) − x − xF(x) − x^2 F(x) = 0,

and so

  F(x) = x/(1 − x − x^2).    (8.8)
We may check this answer with the aid of Lemma 7.4.1, in the same sort of way that
we treated (8.2). For
  F(x) = x/(1 − (1 + x)x) = x + (1 + x)x^2 + (1 + x)^2 x^3 + (1 + x)^3 x^4 + · · ·
       = x + x^2 + 2x^3 + 3x^4 + 5x^5 + · · ·
as expected. Notice that the x on top in (8.8) is not terribly important; it merely ensures
that F2 equals 1 rather than 2.
Equation (8.8) also leads to an independent proof of Corollary 4.2.2, since
  x/(1 − x − x^2) = x/((1 − φx)(1 − φ̂x)) = (1/√5) (1/(1 − φx) − 1/(1 − φ̂x)),
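Maple will happily expand (8.8) as a power series, recovering the Fibonacci numbers (a quick check):

series(x/(1 - x - x^2), x = 0, 10);
# x + x^2 + 2x^3 + 3x^4 + 5x^5 + 8x^6 + 13x^7 + 21x^8 + 34x^9 + O(x^10)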
Problem 8.4.1 A positive integer consisting of e even digits and o odd digits is assigned
the 'value' 2e + o. Find the generating function v(x) = \sum_{k=1}^{∞} v_k x^k, where v_k denotes the number of positive integers with value k.
Solution. Since there are 5 even digits and 5 odd ones, the number of k -digit ‘integers’
with value n equals the coefficient of xn in (5x + 5x2 )k (which is of course zero if n < k ).
But this includes numbers which start with the digit 0 which are not permitted. To
correct for this, we need to subtract x2 times the coefficient of xn−2 in (5x + 5x2 )k−1 .
Hence,

  v(x) = \sum_{k=1}^{∞} (5x + 5x^2)^k − x^2 \sum_{k=0}^{∞} (5x + 5x^2)^k = (5x + 4x^2)/(1 − 5x − 5x^2),
by Lemma 7.4.1. 2
8.5 Exercises
1. Let n ≥ 2. By multiplying both sides of the defining equation (8.3) for the Bernoulli numbers by e^x − 1, prove that \sum_{j=0}^{n−1} \binom{n}{j} B_j = 0. Use this to check that B4 = −1/30 and to find B6 and B8.
2. If the decks in §8.2 each contain only 10 cards to start with (e.g. A to 10 of clubs), is
the chance of at least one snap less or greater than with the 52-card decks?
3. Observe that (8.6) implies the formula dn = (n − 1)(d_{n−1} + d_{n−2}). Prove this directly by considering the effect of a permutation f of {1, 2, . . . , n} on the last element n, distinguishing two cases: (i) f(f(n)) = n, (ii) f(f(n)) ≠ n.
4. Let n, p, q be positive integers with p, q prime. Show that the number of integers
between 1 and n inclusive that are not divisible by p or q equals
  n − ⌊n/p⌋ − ⌊n/q⌋ + ⌊n/(pq)⌋,
where bxc denotes the largest integer less than or equal to x. Does this formula work if
p and q are not both prime? Use Proposition 8.3.1 to extend the formula to the case of
three primes, and count the number of integers between 1 and 1000 that are divisible by
none of 5,7,11.
5. A partition of n ∈ N is an expression

  n = a1 + a2 + · · · + a_k,    1 ≤ k ≤ n,

where a_i ∈ N and 1 ≤ a1 ≤ a2 ≤ · · · ≤ a_k. Let p_n denote the number of distinct partitions of n, and let p0 = 1.
(i) Show that p_n equals the number of solutions (b1, b2, . . . , bn) to the equation \sum_{j=1}^{n} j b_j = n, where each b_j is a non-negative integer.
(ii) Deduce that the generating function of (p0, p1, p2, . . .) equals \prod_{k=1}^{∞} (1 − x^k)^{−1}.
9 Asymptotic Notation
Definition 9.1.1 an = O(bn ) means that there exists C > 0 and N ∈ N such that
|an | ≤ C|bn | for all n ≥ N , or more informally,
‘|an |/|bn | is bounded as n → ∞’.
This implies that

  |a_n| ≤ C′|b_n| for all n ≥ 0, where C′ = max{C, |a0|/|b0|, . . . , |a_{N−1}|/|b_{N−1}|},

provided no b_i is zero, but C′ may be much less convenient to find than C and N. For example n^{20}/2^n → 0 as n → ∞ since 20 ln n − n ln 2 → −∞, so n^{20} = O(2^n). Indeed,
Definition 9.1.2 an = Ω(bn ) means the same as ‘bn = O(an )’, and an = Θ(bn ) means
the same as ‘an = O(bn ) and an = Ω(bn )’.
It follows that a_n = Θ(b_n) if and only if there exist c, C > 0 and N ∈ N such that c|b_n| ≤ |a_n| ≤ C|b_n| for n ≥ N.
Recall that, if a > 1, logarithms to base a are defined by
  x = log_a y ⇐⇒ y = a^x.    (9.1)
so a_n = O(n^3). More careful examination before applying the absolute value signs reveals that a_n/n^3 < 4 for n ≥ 1. Moreover, a_n/n^3 → 4 as n → ∞, so actually a_n ∼ 4n^3. A less simple example follows from Theorem 5.2.2, which implies that (H_n − ln n)/ln n → 0 and

  H_n ∼ ln n.
Examples 9.2.2 (i) Taking x = 1/n (for n ∈ N) in the lemma gives (ln n)/n → 0 as n → ∞. This also gives the well-known result that n^{1/n} = e^{(ln n)/n} → 1 as n → ∞. Now (e^y − 1)/y tends to exp′(0) = 1 as y → 0, and taking y = (ln n)/n gives

  n^{1/n} − 1 = O((ln n)/n).

  |b_n − e| = O(1/n^2).    (9.3)
(iii) Suppose that the mapping f : [0, 1] → [0, 1] is a contraction, i.e. there exists a constant k < 1 such that |f(x) − f(y)| ≤ k|x − y| for all x, y ∈ [0, 1].
The contraction mapping theorem (or rather its proof) states that in these circumstances,
if x0 ∈ [0, 1] is chosen and xn is defined inductively by xn = f (xn−1 ) for all n ≥ 1, then
xn converges to the unique point s ∈ [0, 1] such that f (s) = s. It is also known that
  |x_n − s| ≤ (k^n/(1 − k)) |x1 − x0|,
and we can extract the essence of this formula by stating that
  |x_n − s| = O(k^n).
Each of the boxed equations gives some estimate on how quickly the relevant sequence
approaches its limit, and we may now compare these different rates of convergence. For
example, 1/n^2 = O((ln n)/n) because 1/(n ln n) is bounded (indeed it tends to 0) as n → ∞. Dealing with k^n is slightly harder, but k^n n^2 = n^2 e^{−n|ln k|} is bounded as n → ∞, and so k^n = O(1/n^2). These relations can be summarized by the single line

  O(k^n) = O(1/n^2) = O((ln n)/n),    (9.4)
in which we have begun to widen the scope of (some would say abuse) the ‘O’ notation.
In (9.4), the expression O(b_n) is best understood as representing the set {(a_n) : a_n = O(b_n)} of sequences which are 'smaller or comparable' to b_n.
understood as really standing for the subset symbol ‘⊆’. As a slight extension of this
idea, we can rewrite (9.3) as
  b_n = e + O(1/n^2),    (9.5)

by interpreting the equality as roughly '∈' and

  e + O(1/n^2) = {(e + c_n) : |c_n| n^2 is bounded as n → ∞}.
The syntax (9.5) is very common and leads to no confusion, provided one remembers the
secret meanings of ‘=’ and never swaps the left and right-hand sides of an equation.
We can fill in (9.4) so as to form a whole scale of sequences that converge to 0. For
example, in the array
  O(1/n^2) = O(1/(n ln n)) = O(1/n) = O((ln n)/n) = O(1/ln n) = O(1/ln ln n),
the further to the right a sequence is, the more slowly it converges to zero.
The ‘O’ notation can be applied equally well to a continuous variable x in place of
n, to describe the behaviour of a function g(x) as x approaches a fixed value or ∞. All
the previous conventions apply except that it is essential to include the limiting value of
x, in order to avoid ambiguity.
In the following result, O(xp+1 ) means any function f (x) (or set of functions) such
that f (x)/xp+1 is bounded as x → 0.
Proposition 9.3.1 Suppose that \sum_{n=0}^{∞} a_n x^n converges to a function g(x) whenever |x| is sufficiently small. Then for any p ∈ N,

  g(x) = \sum_{k=0}^{p} a_k x^k + O(x^{p+1}) as x → 0.
Proof. For this we need to know that a power series possesses a radius of convergence
R with the property that it converges when |x| < R and does not converge for |x| > R.
The value of R is determined by the limiting behaviour of the coefficients of the series
[6, §14.2]. In our case,
  (1/x^{p+1}) (g(x) − \sum_{k=0}^{p} a_k x^k) = \sum_{j=0}^{∞} a_{j+p+1} x^j,
and the right-hand side converges, as it has the same radius of convergence as the original
sum. 2
For example,

  e^x = 1 + x + (1/2!)x^2 + · · · + (1/p!)x^p + O(x^{p+1}) as x → 0,    (9.6)
though the corresponding statement would be blatantly false for x → ∞. The equation
(9.6) is equivalent to asserting that
  (e^x − 1 − x − · · · − (1/p!)x^p)/x^{p+1} = O(1),
i.e. that the left-hand side is bounded. This can also be deduced by applying l’Hôpital’s
rule p + 1 times; each time the numerator is differentiated it remains zero when x = 0.
More generally, the hypotheses of the proposition imply that
  a_k = g^{(k)}(0)/k!
in analogy to (1.8), and what results is the so-called Maclaurin series of g(x). We may
revert to a discrete variable by replacing x by 1/n to give
  g(1/n) = \sum_{k=0}^{p} a_k/n^k + O(1/n^{p+1}) as n → ∞.    (9.7)
We next use this as the basis for some more advanced manipulation that displays the
power of the ‘O’ notation.
  ln(1 + 1/n + 1/(2n^2)) = (1/n + 1/(2n^2)) − (1/2)(1/n + 1/(2n^2))^2 + (1/3)(1/n + 1/(2n^2))^3 + O(1/n^4)

    = 1/n − 1/(6n^3) + O(1/n^4).

Substituting this into (9.8) gives

  b_n = e · exp(−1/(6n^2) + O(1/n^3)),

and the result follows from (9.6). 2
In all the asymptotic expansions above, the left-hand side approaches a finite limit
as n → ∞; below we shall meet some examples where this is no longer true.
  n! = O(n^n).
The following result gives much more precise information about the behaviour of n! for
large n.
Theorem 9.4.1 n! ∼ √(2πn) (n/e)^n as n → ∞. 2

Let

  S_n = √(2πn) (n/e)^n

denote Stirling's 'approximation' to n!. This is a bit of a misnomer, as both S_n and n! are unbounded as n → ∞, though the theorem asserts that

  |S_n/n! − 1| = |S_n − n!|/n! → 0 as n → ∞.    (9.9)
In other words, the relative error tends to zero; this is shown in the table. By contrast,
the absolute error |Sn − n!| definitely does not tend to 0; it is in fact unbounded.
n n! Sn |Sn − n!|/n!
1 1 0.92 . . . 0.077 . . .
2 2 1.9 . . . 0.040 . . .
3 6 5.8 . . . 0.027 . . .
4 24 23 . . . 0.020
5 120 118 0.016
6 720 710 0.013
7 5040 4980 0.011
8 40320 39902 0.010
9 362880 359536 0.009
10 3628800 3598695 0.008
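The entries of the table are easily reproduced in Maple (a sketch; S is our own name for Stirling's approximation):

S := n -> sqrt(2*Pi*n)*(n/exp(1))^n:
evalf(S(10));                    # 3598695.6...
evalf(abs(S(10) - 10!)/10!);     # relative error 0.0083..., as in the table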
Moreover,

  ln(S_n/n!) = ln S_n − ln(n!) = (1/2) ln(2π) + (1/2) ln n + n ln n − n − ln(n!) → 0

by (9.9). Dividing by n ln n, and noting that each of the terms (1/2)ln(2π), (1/2)ln n and n is negligible in comparison with n ln n, gives

  1 − ln(n!)/(n ln n) → 0,

whence

Corollary 9.4.2 ln(n!) ∼ n ln n as n → ∞.
It is a false step to exponentiate both sides of the corollary to conclude that n! ∼ n^n. In fact, Theorem 9.4.1 implies that n!/n^n → 0 as n → ∞. 2
Some motivation helps to relate Stirling’s formula to a more elementary topic, namely
the relatively well-known identities
  \sum_{k=1}^{n} k = (1/2) n(n + 1),    \sum_{k=1}^{n} k^2 = (1/6) n(2n + 1)(n + 1).
valid for any p ≥ 1, where Bk are the Bernoulli numbers defined in §8.1. Analogous
summation formulae exist for sums of other functions, and Stirling’s approximation can
be deduced from such a formula for
n
X
ln(n!) = ln k.
k=2
In fact,

  ln S_n − ln(n!) = −\sum_{k=1}^{p} B_{2k}/(2k(2k − 1) n^{2k−1}) + O(1/n^{2p+1}),    (9.10)
an equation also valid for any p ≥ 1 [4, §9.6].
9.5 Exercises
1. Show that, as n → ∞,
(i) n^{1/n} = 1 + (ln n)/n + O(((ln n)/n)^2), and
(ii) n(n^{1/n} − 1) − ln n → 0.
2. Explain why a_n ∼ b_n does not in general imply that e^{a_n} ∼ e^{b_n}. Does it imply that √(a_n + 1) ∼ √(b_n + 1)?
10 Euclid’s Algorithm
Definition 10.1.1 Let a, b ∈ N. Then d ∈ N is called the highest common factor or
greatest common divisor (written d = gcd(a, b)) of a, b if
(i) d|a and d|b;
(ii) e|a, e|b ⇒ e|d.
Because condition (ii) reads e|d and not e ≤ d, it is not completely obvious that any
pair (a, b) has such a d. On the other hand, if d1 , d2 ∈ N both satisfy (i) and (ii) then
d1 |d2 and d2 |d1 , so d1 = d2 and the definite article ‘the’ is justified – greatest common
divisors are unique.
In Figure 14, the gcd of two numbers occurs when paths downwards from them first
meet; for instance 3 = gcd(12, 15). The row of numbers above 1 are the ‘primes’:
Definition 10.1.2 Let p ∈ N and p ≥ 2. Then p is a prime number if and only if the
only factors of p in N are 1 and p.
Thus, if p is prime then for any a ∈ N we must have gcd(p, a) = 1 or gcd(p, a) = p.
It is a well-known fact (Theorem 10.3.1 below) that any positive integer can be uniquely
factored as a product of prime numbers, and given such factorizations of a and b it may
be easy to spot gcd(a, b). For example
  5432 = 2^3 · 7 · 97,  2345 = 5 · 7 · 67  ⇒  gcd(5432, 2345) = 7.
However the precise steps in this implication need further analysis, and we do not wish
to assume the factorization theorem at this stage, as its use does not provide an effective
procedure for computing the gcd of large numbers.
Let us add that
gcd(a, 0) = a for all a ∈ N. (10.1)
This follows immediately from Definition 10.1.1, since a|a and a|0, and if e|a and e|0
then it is a tautology that e|a.
Lemma 10.1.3 Given a, b ∈ N with b > 0, there exist unique q, r ∈ N such that
a = qb + r, with 0 ≤ r < b.
Proof. We shall regard this result as fairly self evident, though a precise recipe for a
computer to determine the quotient q is called the ‘division algorithm’ (see [9]). It can
be reduced to the task of finding ⌊x⌋ for x ∈ R, where ⌊x⌋ denotes the largest integer less than or equal to x and is read 'the floor of x':

  ⌊x⌋ ≤ x < ⌊x⌋ + 1.    (10.2)

Given a, b, we may take q = ⌊a/b⌋ and r = a − qb. The fact that 0 ≤ r < b follows from setting x = a/b in (10.2) and multiplying both sides by b. Given two solutions (q, r), (q′, r′) to the problem,

  (q − q′)b = r − r′  ⇒  b | (r − r′),

and since |r − r′| < b this forces r = r′, and then q = q′. 2
If a and a′ have the same remainder r = r′ when they are divided by b, then one says that a is congruent to a′ modulo b, and this is written

  a ≡ a′ mod b.    (10.3)

For fixed b, (10.3) defines an equivalence relation on Z. If the equivalence class containing a is denoted [a], it is easy to see that [x] + [y] = [x + y] and [x] · [y] = [xy]. These equations allow 'modular' addition and multiplication to be defined on the set Z_b = {[0], [1], [2], [3], . . . , [b − 1]} of equivalence classes.
The next result is a consequence of the previous lemma.
Corollary 10.1.4 Let a, b ∈ N. Then gcd(a, b) exists and equals the least positive
integer in the set {ua + vb : u, v ∈ Z}.
Proof. Let d denote the least positive integer of the form ua + vb with u, v ∈ Z. We claim that d|a. For the lemma allows us to write

  a = qd + r,  0 ≤ r < d,

which implies that r too can be expressed in the form ua + vb. Since 0 ≤ r < d, the only possibility is that r = 0, as required. Similarly d|b. Finally, if e|a and e|b then e must divide any number of the form ua + vb, and d in particular. 2
Proposition 10.2.1 If a = qb + r, then gcd(a, b) = gcd(b, r).

Proof. Let d = gcd(a, b). First note that not only is d|b true, but also d|r because d|a
and r = a − qb; thus d is at least a common divisor of b and r. To show that it is the
greatest, suppose that e|b and e|r. Then e|(qb + r) so e is a common divisor of a and
b; by definition of d we have e|d. Therefore d = gcd(b, r). 2
This simple result provides the key step that enables one to compute the gcd of two
numbers by repeatedly applying Lemma 10.1.3. The result is Euclid’s algorithm which
we now illustrate.
Problem 10.2.2 Compute the greatest common divisor of 5432 and 2345.
Solution. We begin with a = 5432, b = 2345, find r and then repeat the same process
having first replaced a by b and b by r. The work can be laid out as follows, in which
diagonal arrows represent the replacements:
  a       q·b       r
  5432 = 2·2345  +  742      gcd(5432, 2345)
  2345 = 3·742   +  119    = gcd(2345, 742)
  742  = 6·119   +  28     = gcd(742, 119)
  119  = 4·28    +  7      = gcd(119, 28)
  28   = 4·7     +  0      = gcd(28, 7) = 7

The process stops as soon as the remainder becomes 0, since in that line b must divide a and gcd(a, b) = gcd(b, 0) = b. Tracing the equalities of the last column back we obtain gcd(5432, 2345) = 7. 2
Remark. Notice that if one interchanges the two numbers in the above example, the
process is ‘self-correcting’ in the sense that it resorts to the above after one extra line.
This expresses the obvious fact that gcd(a, b) = gcd(b, a):
  2345 = 0·5432 + 2345      gcd(2345, 5432)
  5432 = 2·2345 + 742     = gcd(5432, 2345)
The above algorithm is the world's oldest, dating from 300 BC. It can be described by a flow diagram (see §12.1) that illustrates the key features of an algorithm mentioned in §5.1. However, only in the middle of this century, with the advent of electronic computers, was the study of algorithms taken seriously and developed further. Nowadays this study is a vital area on the borderline between mathematics and computation.
The existence of u, v ∈ Z such that 7 = 5432u + 2345v , guaranteed by Corollary
10.1.4, can also be inferred by tracing back the calculations that led to 7 as the gcd.
Indeed, we see that
7 + 4.28 − 119 =0
− 4(28 + 6.119 − 742) =0
25(119 + 3.742 − 2345) =0
−79(742 + 2.2345 − 5432) = 0
7 −183.2345 + 79.5432 = 0
Thus one solution is u = −79 and v = 183. A more systematic way of finding u and v can be incorporated into the algorithm itself; the result is the so-called 'extended Euclid algorithm'. We shall omit the details, which can be found in the Maple manual [11] and in [9].
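By way of illustration only — the official version is the one described in [11] — the back-substitution can be built into a recursive procedure in the style of gcd1 of §10.5 (a sketch; the name gcdex1 and the list format [gcd, u, v] are ours):

gcdex1 := proc(a, b)
  local q, g;
  if b = 0 then [a, 1, 0]              # [gcd, u, v] with u*a + v*b = gcd
  else
    q := trunc(a/b);
    g := gcdex1(b, a - q*b);
    [g[1], g[3], g[2] - q*g[3]]        # unwind one step of the algorithm
  fi
end:
gcdex1(5432, 2345);                    # [7, -79, 183]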
Turning Figure 14 upside down leads to the following concept that is ‘dual’ to the
gcd:
Proposition 10.2.4 lcm(a, b) = ab/gcd(a, b).
Proof. Given a, b ∈ N, let d = gcd(a, b) and let m = ab/d. Since a/d and b/d are whole
numbers, we have a|m and b|m so m is at least a common multiple. Now recall that
d = ua + vb for some u, v ∈ Z, and suppose that a|n and b|n. Setting ac1 = n = bc2
gives
dn = (ua + vb)n = ab(uc2 + vc1 ),
and so m divides n. Therefore m = lcm(a, b). 2
Proof. If p is not prime then p = ab where a, b > 1, and certainly (10.4) is false. So let p be prime, and suppose that p|(ab). If gcd(p, a) = p then p|a. Otherwise gcd(p, a) = 1 and there exist u, v ∈ Z such that

  up + va = 1  ⇒  (ub)p + v(ab) = b.

Since p divides both terms on the left-hand side, p|b. 2
Proposition 10.3.2 Suppose that a, b, c are integers with a, b coprime and b > 0. Then
the congruence xa ≡ c mod b has an integer solution x with 0 ≤ x < b.
Proof. We need to find x, k ∈ Z such that xa = c + kb. By assumption, there exist
u, v ∈ Z such that
ua + vb = 1 ⇒ (cu)a = c − (cv)b,
and we may take x to equal cu minus a suitable multiple of b.
Theorem 10.3.3 Any integer a ≥ 2 can be written in exactly one way as
  a = p1^{r1} p2^{r2} · · · pk^{rk},    (10.5)

where p1 < p2 < · · · < pk are primes and r_i ≥ 1.
Proof. The existence of the factorization (10.5) is established by induction on a. It is
obvious if a = 2. In general, if a is not itself prime it can be factored into the product
of two smaller numbers for which factorization may be hypothesized. But then a itself
is factored.
To prove the uniqueness of the factorization, we need property (10.4). For suppose
that

  p1^{r1} p2^{r2} · · · pk^{rk} = q1^{s1} q2^{s2} · · · qℓ^{sℓ},

where p1 < p2 < · · · < pk and q1 < q2 < · · · < qℓ. Since p1 divides the left-hand side,
it must divide the right and writing the latter as a product in various ways one sees
eventually that p1 must divide at least one of q1 , . . . , q` . If p1 |qi then since qi is prime it
must be that p1 = qi . But then the corresponding equation q1 = pj is impossible unless
i = 1. Hence p1 = q1 , and the argument may be completed by induction. 2
Figure 15: the first 50 primes plotted against the graph of x ln x
A corollary to Theorem 10.3.3 is the well-known fact that there are infinitely many
prime numbers. For if {P1 , P2 , . . . , Pn } were a complete list of primes, the number
  1 + \prod_{i=1}^{n} P_i
would have no prime factors. In any case, let Pn denote the nth prime number, so that
P1 = 2. It is a much more difficult problem to understand how fast the prime numbers
grow, and thereby ‘approximate’ Pn in analogy to Stirling’s formula in §9.4. This is
accomplished by the so-called prime number theorem [5, §1.8], one version of which is
Theorem 10.3.4 Pn ∼ n ln n.
This is illustrated by Figure 15; the plots of primes and the graph of x ln x diverge sufficiently slowly that the ratio P_n/(n ln n) approaches 1. The function n ln n will play an important role in the study of algorithms in §12.2. Observe also the first big 'gap': there are no prime numbers between P30 = 113 and P31 = 127.
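Maple's ithprime command makes it easy to experiment with this behaviour (a quick numerical check):

ithprime(30), ithprime(31);                 # 113, 127: the first big gap
evalf(ithprime(1000)/(1000*ln(1000)));      # 1.146..., slowly approaching 1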
  A = A(x) = A0 + A1x + A2x^2 + · · · + Anx^n
in which the coefficients Ai are real or complex numbers. Recall (cf. (1.7)) that A has
degree n if An 6= 0, and is monic if An = 1. Polynomials of degree 0 are just numbers
or constants, and polynomials of degree 1 are called linear.
as required. Uniqueness of Q, R follows from the fact that A and B are monic. 2
A(x) = Q(x)(x − λ) + R,
where n = deg A, though the roots λi need not of course be distinct. Using (10.6) and
taking complex conjugates, it is easy to see that a polynomial with real coefficients can
always be factored into linear and quadratic polynomials with real coefficients. It follows
that there are no ‘prime’ (the correct word is irreducible) real polynomials of degree
greater than 2, though the situation is very different if one restricts to polynomials with
integer coefficients (see §10.5).
A version of Euclid’s algorithm works for real or complex polynomials, and also monic
polynomials with integer coefficients. Strictly speaking one should first discuss each case
separately, though for simplicity we restrict now to the case of real coefficients. One
may then define the greatest common divisor of A, B following Definition 10.1.1, but if
D1 , D2 satisfy (i) and (ii) then all one may say is that D1 = aD2 for some a ∈ R. We
shall therefore write gcd(A, B) = D only if D satisfies the additional property of being
monic.
The proof of Proposition 10.4.1 shows that

  gcd(A, B) = gcd(B, R), where A = QB + R.

This is a simpler analogue of Proposition 10.2.1 than one might expect, and gives rise to
a painless method for computing the greatest common divisor of two polynomials that
we now illustrate.
10.5 Exercises

1. (i) Use Euclid's algorithm to compute gcd(682, 545).
(ii) Retrace your steps in (i) so as to find integers u, v such that 682u + 545v = 1.
3. Let p be a prime number, and k a positive integer. Prove that
(i) \binom{p}{k} is divisible by p provided 1 ≤ k ≤ p − 1;
(ii) \binom{kp}{p} is divisible by p if and only if k is.
4. Suppose that a, b ∈ N and that d = gcd(a, b) = ua + vb. Show that the general
solution of the equation xa + yb = d with x, y ∈ Z is
  x = u + (m/a) k,    y = v − (m/b) k,
where k is an arbitrary integer and m = lcm(a, b).
5. Given a, b, c ∈ N, consider d = gcd(gcd(a, b), c). Explain why this number can reasonably be called the greatest common divisor of a, b and c. Determine its value when a = 5432, b = 8730 and c = 460.
6. Euclid’s algorithm is already encoded into Maple using the command gcd. Nonethe-
less, try out the homemade version
gcd1:= proc(a,b)
if b=0 then a
else print(b); gcd1(b,a-b*trunc(a/b)) fi
end:
Compute gcd1(100!+1,100^100-1); and modify the program to find out how many
times b is printed.
7. Explain why the following algorithm also computes gcd’s of positive integers:
gcd2:= proc(a,b)
if a<b then gcd2(b,a)
elif a=b then a
else gcd2(a-b,b) fi
end:
Is it as effective as gcd1?
11 Graphical Optimization
11.1 Graphs
Graphs are basically networks of points joined by lines. Whilst this course is not
concerned with graph theory as such, we need some basic notions to understand the
algorithms discussed below and the following will suit our purposes.
Figure 16: a graph with vertex set V = {u, v, w, x, y, z}
|V| = 6 and |E| = 10. The sequence (y, v, z, y, u) is not a path, since it passes through
an intermediate vertex twice (it’s called a trail!), whereas (u, x, w, z, v, y, u) is a cycle.
(iii) A graph is connected if there exists a path between any two vertices. A tree is a
connected graph with no cycles.
Proposition 11.1.2 A connected graph G has |E| ≥ |V| − 1, and if G is a tree then
|E| = |V| − 1.
Proof. Regard the edges of G as matchsticks or pieces of wire that are all separated,
so that their endpoints form a total of 2|E| vertices. Rebuild G by choosing an edge to
start with, and adding back one edge at a time, so that the partially-constructed graph
is connected at each stage. Each time one of the |E| − 1 edges is added back, the total
number of vertices drops by 1 or 2. Thus

  |V| ≥ 2|E| − 2(|E| − 1), i.e. |E| ≥ |V| − 1.    (11.1)
If in the process of rebuilding, both ends of some edge need to be attached to the
partially-constructed graph, then G must have a cycle. This cannot happen if G is a
tree, so in that case there must be equality in (11.1). 2
Any two vertices of a tree are connected by a unique path, because two such paths
would together give rise to a cycle. Trees crop up in many instances in computing;
consider for example a file directory structure. On the other hand, a human family tree
need not be a tree in the above sense; for example, the parents of King George V were
both descendants of George II, providing an Anglo-Danish cycle.
There are many possible interpretations of weighted graphs. The most obvious are those
in which the vertices represent geographical locations, and the edges correspond to ex-
isting transport routes, with w the time or distance associated to each. A variant is that
in which edges represent potential routes under construction, with w the corresponding
cost.
Example 11.1.4 The graph in Figure 17 is weighted by the first 24 digits of π (which
conveniently contain no zeros) in a snake-like fashion, commencing bottom left. This is
somewhat artificial but it will serve well to illustrate the problems below. The square
formed by E′ = {AC, CE, EB, BA} has w(E′) = 10, whereas w(E) = 115. 2
where the weights of each edge are indicated. Edges CE, F H, DI, N P, HM are discarded
as they form cycles, and by the time F G is added all the vertices have been connected.
The solution is illustrated in Figure 18, and has w(EK) = 51. 2
The original graph has |V| = 16, |E| = 24. It is an example of a ‘planar graph’
(meaning that when drawn on paper every intersection of edges is a vertex unlike in
Figure 16), and in this case one always has
|E| − |V| = (number of regions including the outside) − 2. (11.2)
By contrast, the MST has |EK | = 15 = |V| − 1 as predicted by Proposition 11.1.2. The
solution is not unique – we could replace the edge N L of weight 6 by P N , though N L
happened to come first in our ordering.
Figure 18: A minimum spanning tree E_K (total weight 51)
Figure 19: the edges e = e_uv and e′ = e′_xy (edges of E′ shown solid, edges of E_K dashed)
Choose an edge e = e_uv in E_K \ E′ with w(e) as small as possible. The unique path in E′ from u to v must contain an edge e′ = e′_xy in E′ \ E_K. (See Figure 19, where edges in E′ are shown solid, and ones in E_K dashed.) Similarly, the path in E_K from x to y contains an edge e″ in E_K \ E′ (conceivably e″ = e).
Then w(e′) ≥ w(e), for otherwise w(e′) < w(e) ≤ w(e″), and e′ would have been added before e in the construction of E_K (as e″ is absent at that stage of the algorithm). Exchanging e′ for e,

  (E′ \ {e′}) ∪ {e}

is a spanning tree with total weight no greater than w(E′), and so an MST with more edges in common with E_K. This is a contradiction. 2
Theorem 11.3.1 Let G be a weighted graph. Choose any vertex to start, and succes-
sively add edges of least weight joining a vertex already in the tree to one ‘outside’. The
result defines an MST for G . 2
The procedure is effectively the same as for Kruskal’s algorithm, except that one restricts
attention to edges connected to ones already chosen. Because of this, Prim’s algorithm
has the big advantage that it can be performed from the adjacency matrix, without
visualizing the graph. The adjacency matrix of a graph with n vertices is the symmetric
n × n matrix with entries w(eij ) where eij is the edge joining vertex i to vertex j . By
convention, the entry 0 means i ≁ j (though it might be more accurate to use ∞ for this purpose).
Figure 20: a weighted graph on the vertices A, B, C, D, E, F, H, I, L (its weights are recorded in the adjacency matrix below)
A B C D E F H I L
A 0 1 3 0 0 0 0 0 0
B 1 0 0 0 1 4 0 0 0
C 3 0 0 9 5 0 0 0 0
D 0 0 9 0 0 0 0 6 0
E 0 1 5 0 0 0 3 5 0
F 0 4 0 0 0 0 5 0 0
H 0 0 0 0 3 5 0 0 9
I 0 0 0 6 5 0 0 0 3
L 0 0 0 0 0 0 9 3 0
Delete column D
Circle a smallest positive number in row D
Store the corresponding edge DI
Delete column I
Circle a smallest positive number remaining in rows D,I
Store the corresponding edge IL
Delete column L
Circle a smallest positive number remaining in rows D,I,L
Store the corresponding edge IE
......
We have not justified Prim’s algorithm, which again exploits the greedy principle.
The success of the latter can be extended to a wider class of mathematical objects called
‘matroids’. This theory builds on the analogy between subsets of E (of a given graph) for
which there are no cycles, and subsets of linearly independent vectors in a vector space
[4, §17.4]. Both types of subset exhibit the ‘exchange’ property that is used in linear
algebra to prove that any maximal set of linearly independent vectors has the same size,
namely the dimension of the space. The analogue for a connected graph is that any MST
has |V| − 1 edges.
The general procedure for Prim’s algorithm can be represented by a flow diagram, in
which V 0 and E 0 are the vertex and edge subsets of the tree at each stage of its growth.
To make it methodical, a strategy is once again required to select and extract a suitable
edge of least weight at each stage.
  Input: V′ = {v0}, E′ = ∅, adjacency matrix
      ↓
  Does V′ equal V?  —YES→  Output E′
      ↓ NO
  Select an edge e_uv with u ∈ V′, v ∈ V \ V′ and w(e_uv) minimal
      ↓
  Replace V′ by V′ ∪ {v}, replace E′ by E′ ∪ {e_uv}, and return to the test above.
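The flow diagram above translates into a workable, if naive, Maple procedure operating directly on the adjacency matrix (a sketch under our own conventions: W is an n × n array with W[i,j] = 0 when i ≁ j, and the tree is grown from vertex 1):

prim := proc(W, n)
  local V1, E1, best, i, j, bi, bj;
  V1 := {1};  E1 := {};                 # grow the tree from vertex 1
  while nops(V1) < n do
    best := infinity;
    for i in V1 do                      # scan for a lightest edge leaving V1
      for j to n do
        if not member(j, V1) and W[i,j] > 0 and W[i,j] < best then
          best := W[i,j]; bi := i; bj := j
        fi
      od
    od;
    V1 := V1 union {bj};
    E1 := E1 union {{bi,bj}};
  od;
  E1
end:
# for the matrix above, prim(W, 9) should return 8 edges of total weight 26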
Drawing by numbers
One of the most natural of all graphical optimization tasks is the Shortest Path
Problem. Namely, given a weighted graph containing vertices u, v , find a path from one
to the other of least total weight (also called length). In symbols,
  Find {u = v0, v1, v2, . . . , vn = v} with \sum_{i=0}^{n−1} w(v_i v_{i+1}) minimal.    (11.3)
As an example, the shortest path from A to P in Figure 17 has a length of 18, and is
illustrated in Figure 21 together with the shortest paths (and their lengths) from A to
all other vertices.
Given u, (11.3) is solved simultaneously for every v by Dijkstra’s algorithm, which
proceeds by assigning temporary labels to vertices indicating the lengths of shortest paths
using only some of the edges. The algorithm starts by labelling u with 0 and all other
vertices with ∞. At each intermediate stage, one ‘examines’ the vertex (or one of the
vertices) x with the least label, n, of those that have not already been examined. The
label of each vertex y such that x ∼ y is then replaced by n + w(exy ) if this is smaller.
When all the vertices have been examined, every vertex z is automatically labelled with
the length of the shortest path from u to z .
The shortest paths themselves are found by including every edge exy for which
w(exy ) = |`(x) − `(y)|, where `(x) denotes the final label assigned to x. (Hence the
title of this subsection.) More explanation and verification can be found in [6, §22.3] and
[12, §8.2].
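Dijkstra's algorithm admits a similar matrix sketch (again our own names and conventions; W[x,i] = 0 is read as 'no edge', and the graph is assumed connected):

dijkstra := proc(W, n, u)
  local lab, done, best, i, x;
  lab := array(1..n);
  for i to n do lab[i] := infinity od;
  lab[u] := 0;  done := {};
  while nops(done) < n do
    best := infinity;                   # find the unexamined vertex of least label
    for i to n do
      if not member(i, done) and lab[i] < best then best := lab[i]; x := i fi
    od;
    done := done union {x};
    for i to n do                      # reduce the labels of the neighbours of x
      if W[x,i] > 0 and lab[x] + W[x,i] < lab[i] then lab[i] := lab[x] + W[x,i] fi
    od;
  od;
  [seq(lab[i], i=1..n)]                # the list of shortest distances from u
end:
# with W as in §11.3, dijkstra(W, 9, 1) should give [0, 1, 3, 12, 2, 5, 5, 7, 10]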
Figure 21: Shortest paths from A. The lengths of the shortest paths to the sixteen vertices are, row by row from the top:

  13 12 14 18
   5  5 10 14
   1  2  7  9
   A  3 12 12
Selling by cycles
Given a connected weighted graph G , the Travelling Salesperson Problem consists
of finding a cycle that passes through every vertex exactly once (assuming that such a
‘Hamiltonian’ cycle exists) of least total weight or length. This is an example of a problem
for which there is no known algorithm, other than checking through a list of all potential
solutions (that would be prohibitive for large graphs). However, an estimate on the
length of a winning cycle can be derived from knowledge of minimal spanning trees of
certain subgraphs as follows.
Let C = (v0, v1, . . . , v_{n−1}, v0) be a solution; thus n ≥ 3 and we have distinguished one of the vertices, namely v0, on the cycle. Let G′ be the graph formed from G by removing v0 and all edges joining v0. Then (v1, . . . , v_{n−1}) is a spanning tree for G′, so
Figure 22: a shortest cycle through every vertex once (length 65)
If v0 = G in our original example, then from Figure 18 it is clear that G′ has an MST of weight 51 − 8 = 43. So without knowing C we can predict that

  w(C) ≥ 43 + 8 + 9 = 60,
which is not far off the mark. Actually, Figure 17 possesses exactly 6 cycles that pass
through every vertex once and only once (to see this, first note that all four ‘corners’
must be included). The shortest one is shown in Figure 22 and has length 65.
Working by zeros
A rather different class of problems can be illustrated by a weighted graph G in which the set of vertices is the union of two disjoint subsets V1, V2 with |V1| = |V2| = n,
and each element of V1 is joined by exactly one edge to each element of V2 . (Ignoring
the edge from y to z and adding weights, the graph in Figure 16 is of this type.) The
problem is then to find a subset E′ of n edges providing a bijective correspondence between V1 and V2, and with w(E′) minimal. This becomes a matter of optimal ‘job
assignment’ if V1 represents a set of jobs to be done, V2 a set of workers available, and
w(eux ) measures the unsuitability of giving job u to worker x.
A successful algorithm for determining E′ can be carried out in matrix form, in the
same spirit as Prim’s algorithm. The relevant part of the adjacency matrix of G is the
block in which the rows are labelled by elements of V1 and the columns by elements of
V2. As an example, let V1 = {F, B, A, C} and V2 = {H, L, I, D}, and let w(e_ux) equal the length of the shortest path from u to x in Figure 20. The new
adjacency block is then
H L I D
F 5 13 10 16
B 4 9 6 12
A 5 10 7 12
C 8 13 10 9
The first step consists of subtracting the least element of a row from all the elements
in that row. After this has been done for each row, one carries out the same procedure
on the columns, subtracting the least element in each column from all the elements in
that column. This reduces the above matrix to
  H L I D
F 0 3 3 10
B 0 0 0 7
A 0 0 0 6
C 0 0 0 0
Because subtracting a constant from a given row or column does not affect the solution, any choice of four 0's such that no two are in the same row and no two in the same column (these are called 'independent zeros') now determines the four edges whose total weight is least. There are two solutions above, namely

  E′ = {FH, BL, AI, CD} and E′ = {FH, BI, AL, CD},

both with the smallest value w(E′) = 30, which happens to be the trace of the first matrix.
In general, there is no guarantee that after the row and column subtractions, the
resulting matrix M = (mij ) possesses n independent zeros. In this case, further steps
are required to complete the algorithm. Suppose that all the 0’s lie in a union of rows and
columns determined by respective subsets I and J of {1, 2, . . . , n} with |I| + |J| < n.
The set of entries of M is then the disjoint union

  F_+ ∪ F_0 ∪ F_−,

where m_{ij} ∈ F_+ iff both i ∈ I and j ∈ J, and m_{ij} ∈ F_− iff both i ∉ I and j ∉ J. A matrix M′ = (m′_{ij}) is formed by subtracting the least element f of F_− from all elements of F_− and adding f to all elements of F_+, and it is easy to see that

  \sum_{i,j} m′_{ij} < \sum_{i,j} m_{ij}.    (11.4)
If M′ has n independent zeros, their position gives the solution; if not, the process can be repeated, and (11.4) guarantees success after a finite number of steps. The correctness of this algorithm is established in [1, §3.3].
11.5 Exercises
3. Consider the following weighted graph. Its vertices are labelled BS, E, GP, H, KC, LS, NHG, OC, PC, SK, V, and correspond to 11 particular stations on intersections of the Bakerloo, Central, Circle, Piccadilly, and Victoria underground lines in London. Its
edges are the segments of these lines that join two vertices without passing through a
third, and the weight of an edge is assigned by counting each passage between adjacent
stations as 1 unit (e.g. H ↔ LS is 4). Find an MST using (i) Kruskal’s algorithm, and
(ii) Prim’s algorithm.
4. In the previous question, find the lengths of a shortest path from BS to each of the
other vertices. What is the length of a shortest cycle that passes through each vertex
exactly once?
6. Is it a coincidence that the configuration of paths in Figure 21 forms a tree? Find the
configuration when Figure 17 is weighted by the digits 577215664901532860606512 of γ
(and deleting the edges corresponding to ‘0’).
7. (i) Determine by trial and error the 16 shortest paths in Figure 17 between each
vertex in V1 = {G, F, B, A} and each vertex in V2 = {P, O, K, J}. Write down the
corresponding lengths as a 4 × 4 matrix M.
(ii) Let G be the weighted graph with vertex set V1 ∪V2 and 16 edges joining the elements
of V1 and V2 and adjacency matrix M as in §11.4. Find an optimal assignment E 0 .
12 Algorithm Analysis and Sorting
  Input: a, b
      ↓
  Is b zero?  —YES→  Output a
      ↓ NO
  Find r = a − b⌊a/b⌋
      ↓
  Replace a by b, replace b by r, and return to the test above.
On the other hand, t(a, b) can be deceptively small even when a and b are enormous;
for instance (using gcd1 from §10.5)
Proposition 12.1.1
t(a, b) = O(lg b).
Proof. We may suppose that a > b > 0, for if a < b one extra loop reverses the roles of
a, b as explained in §10.2. We first show by induction on n that
This means that there exists C > 0 such that t(a, b) ≤ C lg b for all b ≥ 2 (in fact C = 5
works) and the result follows from Definition 9.1.1. 2
Nevertheless, the above proposition confirms that Euclid’s algorithm is remarkably ef-
fective at computing the greatest common divisor of a pair of large numbers.
The relative performance or ‘complexity’ of algorithms is often assessed in terms of
such an asymptotic upper bound on the number of basic operations needed to process
data of ‘size’ n. Strictly speaking, such a bound gives no information since the constant
C in Definition 9.1.1 could be enormous; in fact the choice of an optimal algorithm in
given circumstances will often depend on the data size. However, in many cases the
value of n dwarfs any hidden constant, and on an asymptotic scale we shall see that it
is too much to expect of other algorithms that they need ‘only’ O(lg n) operations.
12.2 The sorting problem
In attempting to solve optimization problems of the type discussed in §11.3, one of
the most basic tasks required in practice is the ability to reorder a sequence of numbers
numerically. Indeed, to apply Prim’s algorithm on a computer one needs to take one step
in this direction to find a smallest number in rows of the adjacency matrix. In everyday
life, one often takes for granted the convenience of having data organized numerically,
and the effect of an operation that achieves this can even be seen on BBC at about
(currently) 8pm each Saturday night.
More precisely, the sorting problem consists in finding an algorithm or mechanical
procedure to arrange a sequence (a1 , a2 , . . . , an ) of n real numbers into ascending nu-
merical order. This amounts to finding a permutation π of the set {1, 2, . . . , n} such
that
aπ(1) ≤ aπ(2) ≤ · · · ≤ aπ(n) .
It is not a good idea though to embark on a list of all possible n! permutations, because
n! is so large in comparison to n (and each permutation would require up to n − 1
pairwise comparisons to check its validity). Instead, a brief study of some practical
sorting algorithms will illustrate the quest for more efficiency.
Example 12.2.1 The following program is run in Maple, after first defining y to be a
sequence of 10 real numbers:
for j from 1 to 9 do
  for k from 10-j to 9 do
    if y[k]>y[k+1] then y:=subsop(k=y[k+1],k+1=y[k],y) fi
  od:
  print(y)
od;
The rather complicated-looking command following then on the third line merely redefines y to be the new sequence obtained by swapping the kth and (k + 1)st terms y[k], y[k+1] of the old one.
If we input the sequence
y:=[8,3,5,9,2,6,5,1,7,4]:
and then the program, Maple produces the lines
[8, 3, 5, 9, 2, 6, 5, 1, 4, 7]
[8, 3, 5, 9, 2, 6, 5, 1, 4, 7]
[8, 3, 5, 9, 2, 6, 1, 4, 5, 7]
[8, 3, 5, 9, 2, 1, 4, 5, 6, 7]
[8, 3, 5, 9, 1, 2, 4, 5, 6, 7]
[8, 3, 5, 1, 2, 4, 5, 6, 7, 9]
[8, 3, 1, 2, 4, 5, 5, 6, 7, 9]
[8, 1, 2, 3, 4, 5, 5, 6, 7, 9]
[1, 2, 3, 4, 5, 5, 6, 7, 8, 9]
The effect of the program is clear from examining the output. Each line is produced
from the previous one by moving an appropriate digit as far to the right as needed so
that there remains one less digit of the original sequence unsorted. 2
Figure 23: a decision tree for sorting three numbers a1, a2, a3; each leaf (shown by a square) is one of the possible sorted orderings, such as (a1, a2, a3), (a1, a3, a2), (a3, a1, a2) and (a2, a1, a3).
Given a sorting algorithm for n numbers that proceeds by repeatedly comparing two
numbers at a time, let us denote the maximum number of comparisons needed by h(n).
Proposition 12.2.2
h(n) = Ω(n lg n).
Proof. The procedure can always be represented by a decision tree as in Figure 23, but
with the maximum number of edges from top to bottom equal to h(n); thus there are
h(n) levels instead of 3. Even if all the branches of the tree are present the total number
of outputs cannot exceed 2h(n) . Each of the possible n! orderings may conceivably be
output in more than one place, but in any case the key inequality is

  n! ≤ 2^{h(n)}.

Taking logarithms to base 2 gives h(n) ≥ lg(n!), and lg(n!) = ln(n!)/ln 2 ∼ n lg n by Corollary 9.4.2. 2
This result gives a theoretical asymptotic lower bound on h(n). Roughly speaking, it
says that however clever an algorithm we design, the best we can hope for is that the
time taken to sort n numbers will be proportional to n lg n. The function n lg n lies
between n and n^2 (see the graphs in §12.4), but is 'closer' to n in the sense that if one is prepared to wait 10^6 milliseconds ∼ 20 minutes for a result, one can probably wait 10^6 lg(10^6) milliseconds ∼ 5½ hours, but not 10^12 milliseconds ∼ 32 years!
In this section we shall show that there do exist sorting algorithms that achieve the
lower bound of Proposition 12.2.2. In other words, it is possible to sort n numbers in
such a way that the maximum number of comparisons needed is Θ(n lg n). This estimate
fits in snugly between those of Example 12.2.1 and Euclid’s algorithm.
Example 12.3.1 The ‘merging’ operation, denoted µ, combines two ordered sequences of respective lengths m, n into a single ordered one of length m + n. For example,

  µ((2, 3, 5, 8, 9), (1, 4, 5, 6, 7)) = (1, 2, 3, 4, 5, 5, 6, 7, 8, 9).    (12.3)

The MergeSort function, denoted M, is then defined recursively by

  M(a1, . . . , an) = µ(M(a1, . . . , a_{⌊n/2⌋}), M(a_{⌊n/2⌋+1}, . . . , an)), n ≥ 2,

and M(a1) = (a1) for n = 1.
The operation µ can easily be carried out methodically. The right-hand side of
(12.3) could have been arrived at by ‘stripping away’ numbers from the beginning of the
sequences on the left, repeatedly comparing the two initial numbers remaining at each
stage. After the half-way point
µ((. . . , 8, 9), (. . . , 5, 6, 7)) = (1, 2, 3, 4, 5, . . .),
only the three comparisons between 8 and 5, 8 and 6, 8 and 7 are left to complete the
job. In general, using this method, one may merge sequences of length m, n using a
maximum of m + n − 1 comparisons.
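Both µ and M fit into a few recursive lines of Maple (a sketch of Example 12.3.1 using lists; the splitting differs from the scheme below only in bookkeeping):

mu := proc(s, t)                # merge two ordered lists
  if s = [] then t
  elif t = [] then s
  elif s[1] <= t[1] then [s[1], op(mu(subsop(1 = NULL, s), t))]
  else [t[1], op(mu(s, subsop(1 = NULL, t)))]
  fi
end:
M := proc(s)
  local m;
  if nops(s) <= 1 then s
  else m := floor(nops(s)/2);
    mu(M([op(1..m, s)]), M([op(m+1..nops(s), s)]))
  fi
end:
M([8,3,5,9,2,6,5,1,7,4]);       # [1, 2, 3, 4, 5, 5, 6, 7, 8, 9]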
The following scheme shows what happens when MergeSort is applied to the sequence
(8, 3, 5, 9, 2, 6, 5, 1, 7, 4) of Example 12.2.1; it is designed merely as a visual aid and is not
an accurate reflection of the order in which a computer might carry out the steps. The sequence is
first broken up into groups until all the subsequences have size 1, achieved in the middle
row. Then the operation µ is applied to recombine the groups in the same way.
(8 3 5 9 2 6 5 1 7 4)
(8 3 5 9 2) (6 5 1 7 4)
(8 3) (5 9 2) (6 5) (1 7 4)
(8) (3) (5) (9 2) (6) (5) (1) (7 4)
(8) (3) (5) (9) (2) (6) (5) (1) (7) (4)
(3 8) (5) (2 9) (5 6) (1) (4 7)
(3 8) (2 5 9) (5 6) (1 4 7)
(2 3 5 8 9) (1 4 5 6 7)
(1 2 3 4 5 5 6 7 8 9)
□
MergeSort uses the ‘divide and conquer’ principle common to many other algorithms;
it breaks the problem up into smaller parts which can be tackled more easily. Related to
this is the concept of ‘modular’ design, whereby an algorithm is built up from subroutines
that can be inserted in various orders to form a whole. The merging operation µ, itself
accomplished by an algorithm described above, forms the building block in this case.
Our final example has been included to illustrate an ingenious way of organizing data
in the implementation of an algorithm. This is based on the concept of a binary tree
which formalizes the structure of Figure 23, and has widespread applications. Further
details can be found in [3], from which the following summary is taken.
A binary tree is not strictly speaking a tree in the sense of §11.1. Rather it is a tree
with a finite number of vertices (also called nodes) and edges, plus some extra structure.
First and foremost, a binary tree has a special node, called the root and (despite its
name) conventionally placed at the top of the diagram. The root is attached to 0, 1 or 2
edges which (provided at least one is present) are distinguished by the names ‘left’ and
‘right’. Each of these edges is then attached to the root of another binary tree, and in
this way the structure is defined inductively.
The existence of the root r allows one to define the ancestors of a given node x of
a binary tree as all the nodes that lie on the unique path from r to x. Moreover, x is
a child of y if y is an ancestor connected by a single edge to x, and each node has at
most a left child and a right child. Two binary trees are deemed to be different if the
same node has only a left child in one and a right child in the other (see §12.4). A node
with no children is called a leaf (shown by a square in Figure 23). Given a node x, we
shall denote by h_x the number of ‘levels’ below x, i.e. the greatest number of edges on
a path from x to a leaf.
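The quantity h_x is easy to compute when, as in exercise 7 of §12.4, a sequence of n
terms is stored as a binary tree with the children of node i in positions 2i and 2i + 1.
The following procedure is a sketch of ours, not part of the course text:
levels:= proc(i,n)
# number of levels below node i, when the children of node i
# sit at positions 2*i and 2*i+1 (cf. exercise 7 of Section 12.4)
if 2*i>n then 0
elif 2*i+1>n then 1+levels(2*i,n)
else 1+max(levels(2*i,n),levels(2*i+1,n))
fi
end:
For example, levels(1,10) returns 3, matching the tree of Figure 24.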
[Figure 24: Initial data — the sequence (8, 3, 5, 9, 2, 6, 5, 1, 7, 4) drawn as a binary
tree: root 8 with children 3 and 5; below them 9, 2, 6, 5; and leaves 1, 7, 4.]
[Figure 25: the data of Figure 24 rearranged into a heap — root 9 with children 8 and
6; below them 7, 4, 5, 5; and leaves 1, 3, 2.]
[Figure 26: Output — the heap reduced to root 1 with children 4 and 5, and leaves 3
and 2, once the sorted tail . . . , 5, 6, 7, 8, 9 has been extracted.]
(In part I of HeapSort the data are rearranged into a heap, in which each node is at
least as large as its children, as in Figures 24 and 25; in part II the root, the largest
element remaining, is repeatedly swapped to the end of the sequence and the smaller
heap restored, as in Figure 26.)
Let h = h_r denote the number of levels below node 1, the root, so that 2^h ≤ n and
hence h ≤ lg n. The number of comparisons needed to carry out part I of HeapSort is
certainly no greater than ⌊n/2⌋ · 2h = O(n lg n), since each of the ⌊n/2⌋ nodes heapified
has no more than h levels below it, and heapifying makes at most two comparisons per
level. The number of comparisons needed for part II is no more than (n − 1) · 2h,
so the overall bound for completing HeapSort is O(n lg n), just as for MergeSort.
The first part of this estimate can be improved: part I can in fact be completed using
only O(n) comparisons.
This result is expressed by saying that a heap can be built in ‘linear time’. It has
applications to the graphical algorithms discussed earlier, since the problem of selecting
an edge of least weight can be solved by organizing relevant data into a heap. With
this technique, it can be shown (see [3, §23]) that Prim’s algorithm can be implemented
with complexity O(|E| lg |V|), where |E| is the number of edges and |V| the number of
vertices of the graph.
12.4 Exercises
2. Let G be a weighted graph with n vertices. Show that the number of comparisons of
pairs of numbers needed to complete Prim’s algorithm from the adjacency matrix of G
is O(n^2).
3. Let C_n denote the number of binary trees with n nodes, and let f(x) = Σ_{k=0}^∞ C_k x^k be
its GF (with C_0 = 1). The picture shows that C_1 = 1, C_2 = 2 and C_3 = 5. Verify that
C_4 = 14. By considering the root and its two subtrees, prove that x f(x)^2 − f(x) + 1 = 0.
(A related example is carried out in [1, §4.1], but without the distinction between left
and right that exists for binary trees.)
4. Solve the quadratic equation for f(x) in the previous question. Expand the square
root, and prove that C_n = \frac{1}{n+1}\binom{2n}{n}. Deduce from §9.4 that C_n = O(2^{2n}).
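As a check with Maple (this anticipates part of the answer, since it uses the root with
the minus sign):
series((1-sqrt(1-4*x))/(2*x), x=0, 6);
The expansion begins 1 + x + 2x^2 + 5x^3 + 14x^4 + · · · , reproducing C_0, . . . , C_4.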
[Figure 27: presumably the picture referred to in exercise 3, showing the binary trees
with 1, 2 and 3 nodes.]
7. Match up the following description of HeapSort in Maple with the vaguer English
description in the text. Heapifying node i of a binary tree representing the n terms of a
sequence x is given by the procedure
heapify:= proc(i,n,x)
local m,y: y:=x:
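# let m index the largest of node i and its children 2*i and 2*i+1 (when present)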
if 2*i<=n and y[2*i]>y[i] then m:=2*i else m:=i fi:
if 2*i+1<=n and y[2*i+1]>y[m] then m:=2*i+1 fi:
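# if a child is larger, swap it into node i and heapify the subtree it came from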
if m>i then y:=subsop(i=y[m],m=y[i],y); y:=heapify(m,n,y) fi:
y
end:
Part I is then accomplished by
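# build the heap, heapifying upwards from the last node with a child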
for i from trunc(n/2) by -1 to 1 do
x:=heapify(i,n,x)
od;
and part II by
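# swap the current largest to position i, shrink the heap and re-heapify from the root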
for i from n by -1 to 2 do
x:=subsop(1=x[i],i=x[1],x):
n:=n-1: x:=heapify(1,n,x)
od;
Try this out on the usual suspects:
n:=10; x:=[8,3,5,9,2,6,5,1,7,4]:
8. Maple has its own built-in sorting command, which is in fact based on a variant of
MergeSort. Read ?sort to find out what this is capable of.
Bibliography
The following texts are quoted for reference purposes only. Many of them contain
material that extends far beyond the Moderations course, but the chapters indicated
incorporate valuable treatments of relevant topics.