(Ebook - PDF) Mathematics - Advanced Calculus and Analysis
Department of
Mathematical Sciences
Ian Craw
DSN: mth200-101982-8
Foreword
These Notes
The notes contain the material that I use when preparing lectures for a course I gave from
the mid 1980’s until 1994; in that sense they are my lecture notes.
“Lectures were once useful, but now when all can read, and books are so numerous,
lectures are unnecessary.” Samuel Johnson, 1799.
Lecture notes have been around for centuries, either informally, as handwritten notes,
or formally as textbooks. Recently improvements in typesetting have made it easier to
produce “personalised” printed notes as here, but there has been no fundamental change.
Experience shows that very few people are able to use lecture notes as a substitute for
lectures; if it were otherwise, lecturing, as a profession, would have died out by now.
These notes have a long history; a “first course in analysis” rather like this has been
given within the Mathematics Department for at least 30 years. During that time many
people have taught the course and all have left their mark on it; clarifying points that have
proved difficult, selecting the “right” examples and so on. I certainly benefited from the
notes that Dr Stuart Dagger had written when I took over the course from him, and this
version builds on that foundation, itself heavily influenced by (Spivak 1967), which was the
recommended textbook for most of the time these notes were used.
The notes are written in LaTeX, which allows a higher level view of the text, and simplifies
the preparation of such things as the index on page 101 and numbered equations. You
will find that most equations are not numbered, or are numbered symbolically. However
sometimes I want to refer back to an equation, and in that case it is numbered within the
section. Thus Equation (1.1) refers to the first numbered equation in Chapter 1 and so on.
Acknowledgements
These notes, in their printed form, have been seen by many students in Aberdeen since
they were first written. I thank those (now) anonymous students who helped to improve
their quality by pointing out stupidities, repetitions, misprints and so on.
Since the notes have gone on the web, others, mainly in the USA, have contributed
to this gradual improvement by taking the trouble to let me know of difficulties, either
in content or presentation. As a way of thanking those who provided such corrections,
I endeavour to incorporate the corrections in the text almost immediately. At one point
this was no longer possible; the diagrams had been done in a program that had been
subsequently “upgraded” so much that they were no longer usable. For this reason I
had to withdraw the notes. However all the diagrams have now been redrawn in “public
domain” tools, usually xfig and gnuplot. I thus expect to be able to maintain them in
future, and would again welcome corrections.
Ian Craw
Department of Mathematical Sciences
Room 344, Meston Building
email: [email protected]
www: http://www.maths.abdn.ac.uk/~igc
April 13, 2000
Contents
Foreword iii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
1 Introduction. 1
1.1 The Need for Good Foundations . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 The Real Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.5 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.6 Neighbourhoods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.7 Absolute Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.8 The Binomial Theorem and other Algebra . . . . . . . . . . . . . . . . . . . 8
2 Sequences 11
2.1 Definition and Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.1 Examples of sequences . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Direct Consequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Sums, Products and Quotients . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4 Squeezing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.5 Bounded sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.6 Infinite Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3 Monotone Convergence 21
3.1 Three Hard Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2 Boundedness Again . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2.1 Monotone Convergence . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2.2 The Fibonacci Sequence . . . . . . . . . . . . . . . . . . . . . . . . . 26
5 Differentiability 41
5.1 Definition and Basic Properties . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.2 Simple Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.3 Rolle and the Mean Value Theorem . . . . . . . . . . . . . . . . . . . . . . 44
5.4 l’Hôpital revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
5.5 Infinite limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.5.1 (Rates of growth) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.6 Taylor’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
6 Infinite Series 55
6.1 Arithmetic and Geometric Series . . . . . . . . . . . . . . . . . . . . . . . . 55
6.2 Convergent Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
6.3 The Comparison Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.4 Absolute and Conditional Convergence . . . . . . . . . . . . . . . . . . . . . 61
6.5 An Estimation Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
7 Power Series 67
7.1 Power Series and the Radius of Convergence . . . . . . . . . . . . . . . . . . 67
7.2 Representing Functions by Power Series . . . . . . . . . . . . . . . . . . . . 69
7.3 Other Power Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
7.4 Power Series or Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
7.5 Applications* . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
7.5.1 The function $e^x$ grows faster than any power of x . . . . . . . . . . . 73
7.5.2 The function log x grows more slowly than any power of x . . . . . . 73
7.5.3 The probability integral $\int_0^\alpha e^{-x^2}\,dx$ . . . . . . . . . . . . . . . . . . 73
7.5.4 The number e is irrational . . . . . . . . . . . . . . . . . . . . . . . . 74
9 Multiple Integrals 93
9.1 Integrating functions of several variables . . . . . . . . . . . . . . . . . . . . 93
9.2 Repeated Integrals and Fubini’s Theorem . . . . . . . . . . . . . . . . . . . 93
9.3 Change of Variable — the Jacobian . . . . . . . . . . . . . . . . . . . . . . . 97
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.1 Graph of the function $(x^2 - 4)/(x - 2)$. The automatic graphing routine does
not even notice the singularity at x = 2. . . . . . . . . . . . . . . . . . . . . 31
4.2 Graph of the function sin(x)/x. Again the automatic graphing routine does
not even notice the singularity at x = 0. . . . . . . . . . . . . . . . . . . . . 32
4.3 The function which is 0 when x < 0 and 1 when x ≥ 0; it has a jump
discontinuity at x = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.4 Graph of the function sin(1/x). Here it is easy to see the problem at x = 0;
the plotting routine gives up near this singularity. . . . . . . . . . . . . . . . 33
4.5 Graph of the function x sin(1/x). You can probably see how the discontinuity
of sin(1/x) gets absorbed. The lines y = x and y = −x are also
plotted. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.1 If f crosses the axis twice, somewhere between the two crossings, the function
is flat. The accurate statement of this “obvious” observation is Rolle’s
Theorem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.2 Somewhere inside a chord, the tangent to f will be parallel to the chord.
The accurate statement of this common-sense observation is the Mean Value
Theorem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.1 Comparing the area under the curve $y = 1/x^2$ with the area of the rectangles
below the curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
6.2 Comparing the area under the curve y = 1/x with the area of the rectangles
above the curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.3 An upper and lower approximation to the area under the curve . . . . . . . 64
Introduction.
This chapter contains reference material which you should have met before. It is here both
to remind you that you have, and to collect it in one place, so you can easily look back and
check things when you are in doubt.
You are aware by now of just how sequential a subject mathematics is. If you don’t
understand something when you first meet it, you usually get a second chance. Indeed you
will find there are a number of ideas here which it is essential you now understand, because
you will be using them all the time. So another aim of this chapter is to repeat the ideas.
It makes for a boring chapter, and perhaps should have been headed “all the things you
hoped never to see again”. However I am only emphasising things that you will be using
in context later on.
If there is material here with which you are not familiar, don’t panic; any of the books
mentioned in the book list can give you more information, and the first tutorial sheet is
designed to give you practice. And ask in tutorial if you don’t understand something here.
\[ \frac{df}{dx} = \frac{df}{dy}\,\frac{dy}{dx}, \]
and the “quick” form of the proof of the chain rule (cancel the dy’s) seems helpful. However,
if we consider the following result, in which the pressure P, volume V and temperature
T of an enclosed gas are related, we have
\[ \frac{\partial P}{\partial V}\,\frac{\partial V}{\partial T}\,\frac{\partial T}{\partial P} = -1, \tag{1.1} \]
a result which certainly does not appear “obvious”, even though it is in fact true, and we
shall prove it towards the end of the course.
Another example comes when we deal with infinite series. We shall see later on that
the series
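\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \frac{1}{8} + \frac{1}{9} - \frac{1}{10} + \dots \]
adds up to log 2. However, an apparently simple re-arrangement gives
\[ 1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} - \frac{1}{8} + \frac{1}{5} - \frac{1}{10} - \dots \]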
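A quick numerical experiment suggests what is at stake here. The sketch below is in Python, and the function names are illustrative rather than taken from the notes. It sums the alternating series term by term, and sums the re-arranged series in blocks of three terms (one positive term followed by two negative ones). The partial sums of the first settle near log 2, while those of the second settle near half that value. Floating-point sums can only suggest a limit; they prove nothing.

```python
import math


def alternating_partial_sum(terms: int) -> float:
    """Sum of the first `terms` terms of 1 - 1/2 + 1/3 - 1/4 + ..."""
    return sum((-1) ** (k + 1) / k for k in range(1, terms + 1))


def rearranged_partial_sum(blocks: int) -> float:
    """Sum of the first `blocks` groups 1/(2k-1) - 1/(4k-2) - 1/(4k)."""
    return sum(1 / (2 * k - 1) - 1 / (4 * k - 2) - 1 / (4 * k)
               for k in range(1, blocks + 1))


print(alternating_partial_sum(100_000), math.log(2))
print(rearranged_partial_sum(100_000), math.log(2) / 2)
```

Each block simplifies to half of 1/(2k − 1) − 1/(2k), which is one way to see why the value is halved.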
N, the Natural numbers, are defined as the set {0, 1, 2, . . . , n, . . . }. Contrast these
with the positive integers; the same set without 0.
Z, the Integers, are defined as the set {0, ±1, ±2, . . . , ±n, . . . }.
Q, the Rational numbers, are defined as the set {p/q : p, q ∈ Z, q ≠ 0}.
R, the Reals, are defined in a much more complicated way. In this course you will start
to see why this complication is necessary, as you use the distinction between R and Q.
One point of this course is to illustrate the difference between Q and R. It is subtle:
for example when computing, it can be ignored, because a computer always works with
a rational approximation to any number, and as such can’t distinguish between the two
sets. We hope to show that the complication of introducing the “extra” reals such as √2
is worthwhile because it gives simpler results.
Properties of R
Addition: We can add and subtract real numbers exactly as we expect, and the usual
rules of arithmetic hold — such results as x + y = y + x.
Multiplication: In the same way, multiplication and division behave as we expect, and
interact with addition and subtraction in the usual way. So we have rules such as
a(b + c) = ab + ac. Note that we can divide by any number except 0. We make no
attempt to make sense of a/0, even in the “funny” case when a = 0, so for us 0/0
is meaningless. Formally these two properties say that (algebraically) R is a field,
although it is not essential at this stage to know the terminology.
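Order: As well as the algebraic properties, R has an ordering on it, usually written as
“>” or “≥”. There are three parts to the property:
Trichotomy: for any a ∈ R, exactly one of a > 0, a = 0 or a < 0 holds;
Addition: if a > 0 and b > 0, then a + b > 0, so the sum of positives is positive;
Multiplication: if a > 0 and b > 0, then ab > 0, so the product of positives is positive.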
Completion: The set R has an additional property, which in contrast is much more mysterious:
it is complete. It is this property that distinguishes it from Q. Its effect is
that there are always “enough” numbers to do what we want. Thus there are enough
to solve equations like $x^2 = 2$, which can’t be solved in Q.
In fact there are (uncountably many) more: all the numbers like π, certainly not
rational, but in fact not even algebraic, are also in R. We explore this
property during the course.
One reason for looking carefully at the properties of R is to note possible errors in manipulation.
One aim of the course is to emphasise accurate explanation. Normal algebraic
manipulations can be done without comment, but two cases arise when more care is needed:
Never divide by a number without checking first that it is non-zero.
Of course we know that 2 is non-zero, so you don’t need to justify dividing by 2, but
if you divide by x, you should always say, at least the first time, why x ≠ 0. If you don’t
know whether x = 0 or not, the rest of your argument may need to be split into the two
cases when x = 0 and x ≠ 0.
Never multiply an inequality by a number without checking first that the number
is positive.
Here it is even possible to make the mistake with numbers; although it is perfectly
sensible to multiply an equality by a constant, the same is not true of an inequality. If
x > y, then of course 2x > 2y. However, we have (−2)x < (−2)y. If multiplying by an
expression, then again it may be necessary to consider different cases separately.
1.1. Example. Show that if a > 0 then −a < 0; and if a < 0 then −a > 0.
Solution. This is not very interesting, but is here to show how to use the properties
formally.
Assume the result is false; then by trichotomy, −a = 0 (which is false because we know
a > 0), or (−a) > 0. If this latter holds, then a + (−a) is the sum of two positives and
so is positive. But a + (−a) = 0, and by trichotomy 0 > 0 is false. So the only consistent
possibility is that −a < 0. The other part is essentially the same argument.
1.2. Example. Show that if a > b and c < 0, then ac < bc.
Solution. This also isn’t very interesting; and is here to remind you that the order in which
questions are asked can be helpful. The hard bit about doing this is in Example 1.1. This is
an idea you will find a lot in example sheets, where the next question uses the result of the
previous one. It may dissuade you from dipping into a sheet; try rather to work through
systematically.
Applying Example 1.1 to c, which is negative, we see that −c > 0; also a − b > 0, since a > b. Thus
using the multiplication rule, we have (a − b)(−c) > 0, and so bc − ac > 0 or bc > ac as
required.
1.3 Inequalities
One aim of this course is to get a useful understanding of the behaviour of systems. Think
of it as trying to see the wood, when our detailed calculations tell us about individual trees.
For example, we may want to know roughly how a function behaves; can we perhaps ignore
a term because it is small and simplify things? In order to do this we need to estimate, that is,
replace the term by something bigger which is easier to handle, and so we have to deal with
inequalities. It often turns out that we can “give something away” and still get a useful
result, whereas calculating directly can prove either impossible, or at best unhelpful. We
have just looked at the rules for manipulating the order relation. This section is probably
all revision; it is here to emphasise the need for care.
1.4. Example. Find {x ∈ R : (x − 2)(x + 3) > 0}.
Solution. Suppose (x − 2)(x + 3) > 0. Note that if the product of two numbers is positive
then either both are positive or both are negative. So either x − 2 > 0 and x + 3 > 0, in
which case both x > 2 and x > −3, so x > 2; or x − 2 < 0 and x + 3 < 0, in which case
both x < 2 and x < −3, so x < −3. Thus
{x ∈ R : (x − 2)(x + 3) > 0} = {x : x > 2} ∪ {x : x < −3}.
\[ a^2 - 2ab + b^2 \ge 0, \]
\[ a^2 + 2ab + b^2 \ge 4ab, \]
\[ (a + b)^2 \ge 4ab. \]
Since a ≥ 0 and b ≥ 0, taking square roots, we have $\frac{a + b}{2} \ge \sqrt{ab}$. This is the
arithmetic-geometric mean inequality. We study further work with inequalities in section 1.7.
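As a sanity check on the arithmetic-geometric mean inequality, here is a minimal Python sketch (the helper name `am_gm_gap` and the choice of random test values are assumptions for illustration). It evaluates (a + b)/2 − √(ab) for non-negative a and b and confirms that the gap is never negative, up to rounding.

```python
import math
import random


def am_gm_gap(a: float, b: float) -> float:
    """Return (a + b)/2 - sqrt(a*b) for a, b >= 0; the inequality says this is >= 0."""
    return (a + b) / 2 - math.sqrt(a * b)


random.seed(0)  # fixed seed so the experiment is repeatable
samples = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(1000)]
smallest_gap = min(am_gm_gap(a, b) for a, b in samples)
print(smallest_gap)
print(am_gm_gap(4, 9))  # 6.5 - 6 = 0.5
```

The gap equals (√a − √b)²/2, so it vanishes exactly when a = b.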
1.4 Intervals
We need to be able to talk easily about certain subsets of R. We say that I ⊂ R is an open
interval if
I = (a, b) = {x ∈ R : a < x < b}.
Thus an open interval excludes its end points, but contains all the points in between. In
contrast a closed interval contains both its end points, and is of the form
I = [a, b] = {x ∈ R : a ≤ x ≤ b}.
It is also sometimes useful to have half-open intervals like (a, b] and [a, b). It is trivial
that [a, b] = (a, b) ∪ {a} ∪ {b}.
The two end points a and b are points in R . It is sometimes convenient to
allow also the possibility a = −∞ and b = +∞; it should be clear from the
context whether this is being allowed. If these extensions are being excluded,
the interval is sometimes called a finite interval, just for emphasis.
Of course we can easily get to more general subsets of R. So (1, 2) ∪ [2, 3] = (1, 3] shows
that the union of two intervals may be an interval, while the example (1, 2) ∪ (3, 4) shows
that the union of two intervals need not be an interval.
1.7. Exercise. Write down a pair of intervals $I_1$ and $I_2$ such that $1 \in I_1$, $2 \in I_2$ and
$I_1 \cap I_2 = \emptyset$.
Can you still do this, if you require in addition that $I_1$ is centred on 1, $I_2$ is centred on
2 and that $I_1$ and $I_2$ have the same (positive) length? What happens if you replace 1 and
2 by any two numbers l and m with l ≠ m?
1.8. Exercise. Write down an interval I with 2 ∈ I such that 1 ∉ I and 3 ∉ I. Can you
find the largest such interval? Is there a largest such interval if you also require that I is
closed?
Given l and m with l ≠ m, show there is always an interval I with l ∈ I and m ∉ I.
1.5 Functions
Recall that f : D ⊂ R → T is a function if f (x) is a well defined value in T for each x ∈ D.
We say that D is the domain of the function, T is the target space and f (D) = {f (x) :
x ∈ D} is the range of f .
Note first that the definition says nothing about a formula; just that the result must be
properly defined. And the definition can be complicated; for example
\[ f(x) = \begin{cases} 0 & \text{if } x \le a \text{ or } x \ge b; \\ 1 & \text{if } a < x < b. \end{cases} \]
defines a function on the whole of R, which has the value 1 on the open interval (a, b), and
is zero elsewhere [and is usually called the characteristic function of the interval (a, b).]
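A function given by cases is just as well defined as one given by a formula. As an illustration only (the name `characteristic` is an assumption, not notation from these notes), the characteristic function of an open interval (a, b) can be sketched in Python as follows.

```python
def characteristic(a: float, b: float):
    """Return the characteristic function of the open interval (a, b)."""
    def f(x: float) -> int:
        # 1 strictly inside the interval, 0 at the end points and outside.
        return 1 if a < x < b else 0
    return f


chi = characteristic(0.0, 1.0)
print([chi(x) for x in (-1.0, 0.0, 0.5, 1.0, 2.0)])  # [0, 0, 1, 0, 0]
```

Note that the end points 0 and 1 are sent to 0, because the interval is open.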
In the simplest examples, like $f(x) = x^2$, the domain of f is the whole of R, but even
for relatively simple cases, such as $f(x) = \sqrt{x}$, we need to restrict to a smaller domain; in
this case the domain D is {x : x ≥ 0}, since we cannot define the square root of a negative
number, at least if we want the function to have real values, so that T ⊂ R.
Note that the domain is part of the definition of a function, so changing the domain
technically gives a different function. This distinction will start to be important in this
course. So $f_1 : \mathbb{R} \to \mathbb{R}$ defined by $f_1(x) = x^2$ and $f_2 : [-2, 2] \to \mathbb{R}$ defined by $f_2(x) = x^2$
are formally different functions, even though they both are “$x^2$”. Note also that the range
of $f_2$ is [0, 4]. This illustrates our first use of intervals. Given f : R → R, we can always
restrict the domain of f to an interval I to get a new function. Mostly this is trivial, but
sometimes it is useful.
Another natural situation in which we need to be careful of the domain of a function
occurs when taking quotients, to avoid dividing by zero. Thus the function
\[ f(x) = \frac{1}{x - 3} \quad\text{has domain } \{x \in \mathbb{R} : x \ne 3\}. \]
The point we have excluded, in the above case 3, is sometimes called a singularity of f.
1.9. Exercise. Write down the natural domain of definition of each of the functions:
\[ f(x) = \frac{x - 2}{x^2 - 5x + 6}, \qquad g(x) = \frac{1}{\sin x}. \]
Where do these functions have singularities?
It is often of interest to investigate the behaviour of a function near a singularity. For
example if
\[ f(x) = \frac{x - a}{x^2 - a^2} = \frac{x - a}{(x - a)(x + a)} \quad\text{for } x \ne a, \]
then since x ≠ a we can cancel to get $f(x) = (x + a)^{-1}$. This is of course a different
representation of the function, and provides an indication as to how f may be extended
through the singularity at a: by giving it the value $(2a)^{-1}$.
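To see numerically why $(2a)^{-1}$ is the natural value at the singularity, one can evaluate the original quotient at points close to a. The Python sketch below does this for the arbitrarily chosen value a = 3; the function name and the step sizes are assumptions for illustration.

```python
def f(x: float, a: float) -> float:
    """(x - a)/(x^2 - a^2), defined only for x != a and x != -a."""
    return (x - a) / (x * x - a * a)


a = 3.0
for h in (1e-1, 1e-3, 1e-5):
    # Evaluate just to the right and just to the left of the singularity.
    print(h, f(a + h, a), f(a - h, a))
print("suggested value at a:", 1 / (2 * a))
```

The values on both sides approach 1/6, in agreement with the cancelled form 1/(x + a).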
1.6 Neighbourhoods
This situation often occurs. We need to be able to talk about a function near a point: in
the above example, we don’t want to worry about the singularity at x = −a when we are
discussing the one at x = a (which is actually much better behaved). If we only look at the
points at distance less than d from a, we are really looking at an interval (a − d, a + d); we call
such an interval a neighbourhood of a. For traditional reasons, we usually replace the
distance d by its Greek equivalent, and speak of a distance δ. If δ > 0 we call the interval
(a − δ, a + δ) a neighbourhood (sometimes a δ-neighbourhood) of a. The significance of a
neighbourhood is that it is an interval in which we can look at the behaviour of a function
without being distracted by other irrelevant behaviours. It usually doesn’t matter whether
δ is very big or not. To see this, consider:
1.10. Exercise. Show that an open interval contains a neighbourhood of each of its points.
We can rephrase the result of Ex 1.7 in this language; given l ≠ m there is some
(sufficiently small) δ such that we can find disjoint δ-neighbourhoods of l and m. We use
this result in Prop 2.6.
(a − δ, a + δ) = {x ∈ R : |x − a| < δ},
|x + y| ≤ |x| + |y|.
Proof. Since −|x| ≤ x ≤ |x|, and the same holds for y, combining these we have
−(|x| + |y|) ≤ x + y ≤ |x| + |y|, which is the same as the required result.
|x| = |x − y + y| ≤ |x − y| + |y|
and so |x| − |y| ≤ |x − y|. Interchanging the rôles of x and y, and noting that |x| = | − x|,
gives |y| − |x| ≤ |x − y|. Multiplying this inequality by −1 and combining these we have
−|x − y| ≤ |x| − |y| ≤ |x − y|, that is, | |x| − |y| | ≤ |x − y|.
Proof. Unwrapping the modulus, we have either 5x − 3 < −4, or 5x − 3 > 4. From one
inequality we get 5x < −4 + 3, or x < −1/5; the other inequality gives 5x > 4 + 3, or
x > 7/5. Thus
{x : |5x − 3| > 4} = {x : x < −1/5} ∪ {x : x > 7/5}.
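An answer of this kind is easy to spot-check. The Python sketch below (function names are illustrative assumptions) compares the condition |5x − 3| > 4 with the two inequalities x < −1/5 or x > 7/5 on a grid of rational points, using exact fractions so that the end points −1/5 and 7/5 are tested without rounding error. Agreement on a grid is of course only evidence, not a proof.

```python
from fractions import Fraction


def in_set(x: Fraction) -> bool:
    """Membership test taken directly from the condition |5x - 3| > 4."""
    return abs(5 * x - 3) > 4


def in_claimed_union(x: Fraction) -> bool:
    """Membership test for the derived answer: x < -1/5 or x > 7/5."""
    return x < Fraction(-1, 5) or x > Fraction(7, 5)


grid = [Fraction(k, 100) for k in range(-500, 501)]
mismatches = [x for x in grid if in_set(x) != in_claimed_union(x)]
print(mismatches)  # []
```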
\[ (1 + x)^3 = 1 + 3x + 3x^2 + x^3, \]
\[ (1 + x)^4 = 1 + 4x + 6x^2 + 4x^3 + x^4, \]
\[ (1 + x)^5 = 1 + 5x + 10x^2 + 10x^3 + 5x^4 + x^5. \]
\[ a^2 - b^2 = (a - b)(a + b), \]
\[ a^3 - b^3 = (a - b)(a^2 + ab + b^2), \]
\[ a^4 - b^4 = (a - b)(a^3 + a^2 b + ab^2 + b^3). \]
Note that we made use of this result when discussing the function after Ex 1.9.
And of course you remember the usual “completing the square” trick:
\[ ax^2 + bx + c = a\left(x^2 + \frac{b}{a}x + \frac{b^2}{4a^2}\right) + c - \frac{b^2}{4a}
 = a\left(x + \frac{b}{2a}\right)^2 + \left(c - \frac{b^2}{4a}\right). \]
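The completed-square form can be checked mechanically. The following Python sketch uses exact rational arithmetic; the function name `complete_square` and the sample coefficients are assumptions for illustration. It returns the shift h = b/(2a) and the constant k = c − b²/(4a), so that ax² + bx + c = a(x + h)² + k.

```python
from fractions import Fraction


def complete_square(a, b, c):
    """Return (h, k) with a*x**2 + b*x + c == a*(x + h)**2 + k, for a != 0."""
    if a == 0:
        raise ValueError("a must be non-zero")
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    h = b / (2 * a)
    k = c - b * b / (4 * a)
    return h, k


h, k = complete_square(2, -12, 7)
print(h, k)  # -3 -11, i.e. 2x^2 - 12x + 7 = 2(x - 3)^2 - 11
```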
Chapter 2
Sequences
A sequence doesn’t have to be defined by a sensible “formula”. Here is a sequence you may
recognise:-
3, 3.1, 3.14, 3.141, 3.1415, 3.14159, 3.141592 . . .
Usually we are interested in what happens to a sequence “in the long run”, or what
happens “when it settles down”. So we are usually interested in what happens when n → ∞,
or in the limit of the sequence. In the examples above this was fairly easy to see.
Sequences, and interest in their limits, arise naturally in many situations. One such
occurs when trying to solve equations numerically; in Newton’s method, we use the standard
calculus approximation, that
\[ f(a + h) \approx f(a) + h f'(a). \]
If now we almost have a solution, so f(a) ≈ 0, we can try to perturb it to a + h, which is a
true solution, so that f(a + h) = 0. In that case, we have
\[ 0 = f(a + h) \approx f(a) + h f'(a) \quad\text{and so}\quad h \approx -\frac{f(a)}{f'(a)}. \]
Thus a better approximation than a to the root is a + h = a − f(a)/f′(a).
If we take $f(x) = x^3 - 2$, finding a root of this equation is solving the equation $x^3 = 2$,
in other words, finding $\sqrt[3]{2}$. In this case, we get the sequence defined as follows:
\[ a_1 = 1 \quad\text{while}\quad a_{n+1} = \frac{2}{3}a_n + \frac{2}{3a_n^2} \quad\text{if } n \ge 1. \tag{2.1} \]
Note that this makes sense: $a_1 = 1$, $a_2 = \frac{2}{3}\cdot 1 + \frac{2}{3\cdot 1^2}$ etc. Calculating, we get $a_2 = 1.333$,
$a_3 = 1.2639$, $a_4 = 1.2599$ and $a_5 = 1.2599$. In fact the sequence does converge to $\sqrt[3]{2}$; by
taking enough terms we can get an approximation that is as accurate as we need. [You can
check that $a_5^3 = 2$ to 6 decimal places.]
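The calculation is easy to repeat by machine. Here is a minimal Python sketch of the recurrence, starting from 1 and applying the Newton step for x³ − 2 each time; the function name is an assumption for illustration, and floating-point arithmetic is used throughout.

```python
def newton_cube_root_of_two(steps: int) -> list[float]:
    """Return [a_1, ..., a_steps] for a_1 = 1, a_{n+1} = (2/3) a_n + 2/(3 a_n^2)."""
    terms = [1.0]
    while len(terms) < steps:
        a = terms[-1]
        terms.append(2 * a / 3 + 2 / (3 * a * a))
    return terms


terms = newton_cube_root_of_two(5)
for n, a in enumerate(terms, start=1):
    # Print each term and its cube, which should approach 2.
    print(f"a_{n} = {a:.6f}   a_{n}^3 = {a ** 3:.6f}")
```

The printed cubes make the claim about the fifth term easy to verify by eye.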
Note also that we need to specify the accuracy needed. There is no single approximation
to $\sqrt[3]{2}$ or π which will always work, whether we are measuring a flower bed or navigating a
satellite to a planet. In order to use such a sequence of approximations, it is first necessary
to specify an acceptable accuracy. Often we do this by specifying a neighbourhood of the
limit, and we then often speak of an ε-neighbourhood, where we use ε (for error), rather
than δ (for distance).
2.2. Definition. Say that a sequence $\{a_n\}$ converges to a limit l if and only if, given
ε > 0 there is some N such that
\[ |a_n - l| < \varepsilon \quad\text{whenever } n \ge N. \]
2.4. Definition. Say a property P (n) holds eventually iff ∃N such that P (n) holds for
all n ≥ N . It holds frequently iff given N , there is some n ≥ N such that P (n) holds.
We call the n a witness; it witnesses the fact that the property is true somewhere at
least as far along the sequence as N . Some examples using the language are worthwhile. The
sequence {−2, −1, 0, 1, 2, . . . } is eventually positive. The sequence sin(n!π/17) is eventually
zero; the sequence of natural numbers is frequently prime.
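The claim about sin(n!π/17) can be examined exactly, without trusting floating-point sines: sin(kπ/17) is zero precisely when the integer k is a multiple of 17. The Python sketch below (the helper name is an assumption) finds the first n for which 17 divides n! and checks a finite stretch of later terms. A finite check illustrates “eventually” but cannot prove it; the proof is that n! contains the factor 17 for every n ≥ 17.

```python
from math import factorial


def is_zero_term(n: int) -> bool:
    """sin(n! * pi / 17) is exactly 0 precisely when 17 divides n!."""
    return factorial(n) % 17 == 0


first_zero = next(n for n in range(1, 100) if is_zero_term(n))
print(first_zero)  # 17
print(all(is_zero_term(n) for n in range(first_zero, 100)))  # True
```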
It may help you to understand this language if you think of the sequence of days in
the future [1]. You will find, according to the definitions, that it will frequently be Friday,
frequently be raining (or sunny), and even frequently February 29. In contrast, eventually
it will be 1994, and eventually you will die. A more difficult one is whether Newton’s work
will eventually be forgotten!
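Using this language, we can rephrase the definition of convergence as: $a_n \to l$ as $n \to \infty$
iff, given any ε > 0, eventually $|a_n - l| < \varepsilon$.
Another version may make the content of the definition even clearer; this time we use
the idea of neighbourhood: $a_n \to l$ as $n \to \infty$ iff, given any ε > 0, eventually $a_n$ lies in
the ε-neighbourhood of l.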
It is important to note that the definition of the limit of a sequence doesn’t have a
simpler form. If you think you have an easier version, talk it over with a tutor; you may
find that it picks up as convergent some sequences you don’t want to be convergent. In
Fig 2.2, we give a picture that may help. The ε-neighbourhood of the (potential) limit l is
represented by the shaded strip, while individual members $a_n$ of the sequence are shown as
blobs. The definition then says the sequence is convergent to the number we have shown
as a potential limit, provided the sequence is eventually in the shaded strip: and this must
be true even if we redraw the shaded strip to be narrower, as long as it is still centred on
the potential limit.
[1] I need to assume the sequence is infinite; you can decide for yourself whether this is a philosophical
statement, a statement about the future of the universe, or just plain optimism!
[Figure 2.2. Label on the figure: Potential Limit.]
• Let $a_n = 1/n$. Then $a_n \to 0$ as $n \to \infty$. To check this, pick ε > 0 and then choose N
with N > 1/ε. Now suppose that n ≥ N. We have
\[ 0 \le \frac{1}{n} \le \frac{1}{N} < \varepsilon \quad\text{by choice of } N. \]
• The sequence $a_n = n - 1$ is divergent; for if not, then there is some l such that
$a_n \to l$ as $n \to \infty$. Taking ε = 1, we see that eventually (say after N), we have
−1 < (n − 1) − l < 1, and in particular, that (n − 1) − l < 1 for all n ≥ N. Thus
n < l + 2 for all n ≥ N, which is a contradiction.
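The first of these arguments can be acted out numerically. Given an error ε, the Python sketch below picks an integer N > 1/ε, as in the argument for 1/n, and then spot-checks the inequality 1/n < ε for a finite range of n ≥ N. The function name `witness_N` and the sample values of ε are assumptions for illustration, and the finite check is no substitute for the proof.

```python
import math


def witness_N(eps: float) -> int:
    """Return an integer N with N > 1/eps, as in the proof that 1/n -> 0."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.floor(1 / eps) + 1


for eps in (0.5, 0.01, 1e-6):
    N = witness_N(eps)
    # Spot-check |1/n - 0| < eps on a finite range of n >= N.
    assert all(1 / n < eps for n in range(N, N + 1000))
    print(eps, N)
```

Smaller values of ε force larger values of N, which is exactly the dependence the definition allows.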
2.5. Exercise. Show that the sequence $a_n = 1/\sqrt{n} \to 0$ as $n \to \infty$.
Although we can work directly from the definition in these simple cases, most of the
time it is too hard. So rather than always working directly, we also use the definition to
prove some general tools, and then use the tools to tell us about convergence or divergence.
Here is a simple tool (or Proposition).
We can argue this directly (so this is another version of this proof). Pick ε = |l − m|/2.
Then eventually $|a_n - l| < \varepsilon$, so this holds e.g. for $n \ge N_1$. Also, eventually $|a_n - m| < \varepsilon$,
so this holds e.g. for $n \ge N_2$. Now let $N = \max(N_1, N_2)$, and choose n ≥ N. Then both
inequalities hold, and
\begin{align*}
|l - m| &= |l - a_n + a_n - m| \\
        &\le |l - a_n| + |a_n - m| \quad\text{by the triangle inequality} \\
        &< \varepsilon + \varepsilon = |l - m|,
\end{align*}
so |l − m| < |l − m|, which is impossible.
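Proof. Remember what this means; we are guaranteed that from some point onwards, we
never have $a_n = 0$. The proof is a variant of “if $a_n \to 2$ as $n \to \infty$ then eventually $a_n > 1$.”
One way is just to repeat that argument in the two cases where l > 0 and then l < 0. But
we can do it all in one:
Take ε = |l|/2, and apply the definition of “$a_n \to l$ as $n \to \infty$”. Then there is some N
such that
\[ |a_n - l| < |l|/2 \quad\text{for all } n \ge N. \]
For such n the triangle inequality gives $|l| \le |l - a_n| + |a_n| < |l|/2 + |a_n|$, so $|a_n| > |l|/2 > 0$.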
2.8. Exercise. Let $a_n \to l \ne 0$ as $n \to \infty$, and assume that l > 0. Show that eventually
$a_n > 0$. In other words, use the first method suggested above for l > 0.
\[ a_n = \frac{n + 2}{n + 3} = \frac{1 + 2/n}{1 + 3/n}. \]
Sums: $a_n + b_n \to l + m$ as $n \to \infty$;
Products: $a_n b_n \to lm$ as $n \to \infty$; and
Quotients: $a_n / b_n \to l/m$ as $n \to \infty$, provided that m ≠ 0.
The other two results are proved in the same way, but are technically more difficult.
Proofs can be found in (Spivak 1967).
2.11. Example. Let $a_n = \dfrac{4 - 7n^2}{n^2 + 3n}$. Show that $a_n \to -7$ as $n \to \infty$.
Solution. A helpful manipulation is easy. We choose to divide both top and bottom by
the highest power of n around. This gives:
\[ a_n = \frac{4 - 7n^2}{n^2 + 3n} = \frac{\frac{4}{n^2} - 7}{1 + \frac{3}{n}}. \]
We now show each term behaves as we expect. Since $1/n^2 = (1/n)\cdot(1/n)$ and 1/n → 0 as
n → ∞, we see that $1/n^2 \to 0$ as n → ∞, using “product of convergents is convergent”.
Using the corresponding result for sums shows that $\frac{4}{n^2} - 7 \to 0 - 7$ as n → ∞. In the same
way, the denominator → 1 as n → ∞. Thus by the “limit of quotients” result, since the
limit of the denominator is 1 ≠ 0, the quotient → −7 as n → ∞.
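It is reassuring to watch this limit emerge from actual values. The short Python sketch below (the function name `a` simply mirrors the notation for the terms, and the sample values of n are arbitrary) evaluates the quotient (4 − 7n²)/(n² + 3n) for increasing n.

```python
def a(n: int) -> float:
    """The n-th term (4 - 7 n^2) / (n^2 + 3 n)."""
    return (4 - 7 * n * n) / (n * n + 3 * n)


for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, a(n))
```

The values drift towards −7 quite slowly, because of the 3/n term in the denominator.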
2.12. Example. In equation 2.1 we derived a sequence (which we claimed converged to $\sqrt[3]{2}$)
from Newton’s method. We can now show that provided the limit exists and is non-zero,
the limit is indeed $\sqrt[3]{2}$.
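One way the argument might run is sketched below, assuming as stated that $a_n \to l$ with $l \ne 0$, and using the recurrence $a_{n+1} = \frac{2}{3}a_n + \frac{2}{3a_n^2}$ together with the results on sums, products and quotients. The only extra observation needed is that $a_{n+1} \to l$ as well, since it is the same sequence with the first term dropped.

```latex
\begin{align*}
l &= \lim_{n\to\infty} a_{n+1}
   = \lim_{n\to\infty}\Bigl(\tfrac{2}{3}a_n + \frac{2}{3a_n^2}\Bigr)
   = \tfrac{2}{3}\,l + \frac{2}{3l^2},\\
\tfrac{1}{3}\,l &= \frac{2}{3l^2}
   \quad\Longrightarrow\quad l^3 = 2
   \quad\Longrightarrow\quad l = \sqrt[3]{2}.
\end{align*}
```

The hypothesis $l \ne 0$ is what allows the quotient rule to be applied to $2/(3a_n^2)$.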
2.13. Exercise. Define the sequence $\{a_n\}$ by $a_1 = 1$, $a_{n+1} = (4a_n + 2)/(a_n + 3)$ for n ≥ 1.
Assuming that $\{a_n\}$ is convergent, find its limit.
2.14. Exercise. Define the sequence $\{a_n\}$ by $a_1 = 1$, $a_{n+1} = (2a_n + 2)$ for n ≥ 1. Assuming
that $\{a_n\}$ is convergent, find its limit. Is the sequence convergent?
2.15. Example. Let $a_n = \sqrt{n + 1} - \sqrt{n}$. Show that $a_n \to 0$ as $n \to \infty$.
2.4 Squeezing
Actually, we can’t take the last step yet. It is true and looks sensible, but it is another case
where we need more results getting new convergent sequences from old. We really want a
good dictionary of convergent sequences.
The next results show that order behaves quite well under taking limits, but also show
why we need the dictionary. The first one is fairly routine to prove, but you may still find
these techniques hard; if so, note the result, and come back to the proof later.
2.16. Exercise. Given that $a_n \to l$ and $b_n \to m$ as $n \to \infty$, and that $a_n \le b_n$ for each n,
show that l ≤ m.
Compare this with the next result, where we can also deduce convergence.
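\[ -\varepsilon < a_n - l \le b_n - l \le c_n - l < \varepsilon, \]
and using only the middle and outer terms, this gives
\[ -\varepsilon < b_n - l < \varepsilon, \quad\text{that is,}\quad |b_n - l| < \varepsilon. \]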
Note: Having seen the proof, it is clear we can state an “eventually” form of this result.
We don’t need the inequalities to hold all the time, only eventually.
2.18. Example. Let $a_n = \dfrac{\sin(n)}{n^2}$. Show that $a_n \to 0$ as $n \to \infty$.
Solution. Note that, whatever the value of sin(n), we always have −1 ≤ sin(n) ≤ 1. We
use the squeezing lemma:
\[ -\frac{1}{n^2} \le a_n \le \frac{1}{n^2}. \quad\text{Now } \frac{1}{n^2} \to 0, \text{ so } \frac{\sin(n)}{n^2} \to 0 \text{ as } n \to \infty. \]
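The squeeze can be observed directly. The following Python sketch (the helper name `squeezed` and the range of n tested are assumptions) checks, for a finite range of n, that sin(n)/n² lies between −1/n² and 1/n², and prints how small both the term and its bound have become by n = 10000.

```python
import math


def squeezed(n: int) -> bool:
    """Check -1/n^2 <= sin(n)/n^2 <= 1/n^2 for one value of n."""
    bound = 1 / (n * n)
    return -bound <= math.sin(n) / (n * n) <= bound


print(all(squeezed(n) for n in range(1, 10_001)))
print(abs(math.sin(10_000) / 10_000 ** 2), 1 / 10_000 ** 2)
```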
2.19. Exercise. Show that $\sqrt{1 + \frac{1}{n}} \to 1$ as $n \to \infty$.
n
Note: We can now do a bit more with the $\sqrt{n + 1} - \sqrt{n}$ example. We have
\[ 0 \le \frac{1}{\sqrt{n + 1} + \sqrt{n}} \le \frac{1}{2\sqrt{n}}, \]
so we have our result since we showed in Exercise 2.5 that $(1/\sqrt{n}) \to 0$ as $n \to \infty$.
This illustrates the need for a good bank of convergent sequences. In fact we don’t have
to use ad-hoc methods here; we can get such results in much more generality. We need the
next section to prove it, but here is the result.
Note: This is another example of the “new convergent sequences from old” idea. The
application is that $f(x) = x^{1/2}$ is continuous everywhere on its domain, which is {x ∈ R :
x ≥ 0}, so since $n^{-1} \to 0$ as n → ∞, we have $n^{-1/2} \to 0$ as n → ∞; the result we proved
in Exercise 2.5.
2.21. Exercise. What do you deduce about the sequence an = exp (1/n) if you apply this
result to the continuous function $f(x) = e^x$?
2.22. Example. Let $a_n = \dfrac{1}{n \log n}$ for n ≥ 2. Show that an → 0 as n → ∞.
Solution. Note that 1 ≤ log n ≤ n if n ≥ 3, because log(e) = 1, log is monotone increasing,
and 2 < e < 3. Thus $n < n \log n < n^2$ when n ≥ 3, and
$$\frac{1}{n^2} < \frac{1}{n \log n} < \frac{1}{n}.$$
Now $1/n \to 0$ and $1/n^2 \to 0$, so $1/(n \log n) \to 0$ as n → ∞.
2.23. Exercise. Let $a_n = \dfrac{1}{\sqrt{n}\,\log n}$ for n ≥ 2. Show that an → 0 as n → ∞.
Thus the sequence {an } is eventually bounded, and so is bounded by Prop 2.27.
And here is another result on which to practice working from the definition. In order to
tackle simple proofs like this, you should start by writing down, using the definitions, the
information you are given. Then write out (in symbols) what you wish to prove. Finally
see how you can get from the known information to what you need. Remember that if a
definition contains a variable (like ε in the definition of convergence), then the definition is
true whatever value you give to it, even if you use ε/2 (as we did in 2.10) or ε/K, for any
constant K. Now try:
2.29. Exercise. Let an → 0 as n → ∞ and let {bn } be a bounded sequence. Show that
an bn → 0 as n → ∞. [If an → l ≠ 0 as n → ∞, and {bn } is a bounded sequence, then in
general {an bn } is not convergent. Give an example to show this.]
Monotone Convergence
Stirling’s Formula: Let $a_n = \dfrac{n!}{\sqrt{n}\,(n/e)^n}$; then $a_n \to \sqrt{2\pi}$ as n → ∞.
We already saw in 2.12 that knowing a sequence has a limit can help to find the limit.
As another example of this, consider the third sequence above. We have
$$\frac{p_{n+2}}{p_{n+1}} = 1 + \frac{p_n}{p_{n+1}} \quad\text{and so}\quad a_{n+1} = 1 + \frac{1}{a_n}. \tag{3.1}$$
3.1. Definition. A sequence {an } is bounded above if there is some K such that an ≤ K
for all n. We say that K is an upper bound for the sequence.
• The number K may not be the best or smallest upper bound. All we know from the
definition is that it will be an upper bound.
Nevertheless, monotone sequences do happen in real life. For example, the sequence
3, 3.1, 3.14, 3.141, 3.1415, . . .
is how we often describe the decimal expansion of π. Monotone sequences are important
because we can say something useful about them which is not true of more general sequences.
3.6. Example. Show that the sequence $a_n = \dfrac{n}{2n+1}$ is increasing.
Solution. One way to check that a sequence is increasing is to show an+1 − an ≥ 0, a
second way is to compute an+1 /an , and we will see more ways later. Here,
$$\begin{aligned}
a_{n+1} - a_n &= \frac{n+1}{2(n+1)+1} - \frac{n}{2n+1} \\
&= \frac{(2n^2 + 3n + 1) - (2n^2 + 3n)}{(2n+3)(2n+1)} \\
&= \frac{1}{(2n+3)(2n+1)} > 0 \quad \text{for all } n,
\end{aligned}$$
and the sequence is increasing.
3.7. Exercise. Show that the sequence $a_n = \dfrac{1}{n} - \dfrac{1}{n^2}$ is decreasing when n > 1.
If a sequence is increasing, it is an interesting question whether or not it is bounded
above. If an upper bound does exist, then it seems as though the sequence can’t help
converging — there is nowhere else for it to go.
Figure 3.1: A monotone (increasing) sequence which is bounded above seems to converge
because it has nowhere else to go! (Plot of the terms against n, with the upper bound marked.)
3.9. Theorem (The monotone convergence principle). Let {an } be an increasing sequence
which is bounded above; then {an } is a convergent sequence. Let {an } be a decreasing
sequence which is bounded below; then {an } is a convergent sequence.
Proof. To prove this we need to appeal to the completeness of R, as described in Section 1.2.
Details will be given in third year, or you can look in (Spivak 1967) for an accurate deduction
from the appropriate axioms for R .
This is a very important result. It is the first time we have seen a way of deducing
the convergence of a sequence without first knowing what the limit is. And we saw in 2.12
that just knowing a limit exists is sometimes enough to be able to find its value. Note
that the theorem only deduces an “eventually” property of the sequence; we can change a
finite number of terms in a sequence without changing the value of the limit. This means
that the result must still be true if we only know the sequence is eventually increasing and
bounded above. What happens if we relax the requirement that the sequence is bounded
above, to be just eventually bounded above? (Compare Proposition 2.27).
3.10. Example. Let a be fixed. Then $a^n \to 0$ as n → ∞ if |a| < 1, while if a > 1, $a^n \to \infty$
as n → ∞.
Solution. Write $a_n = a^n$; then $a_{n+1} = a \cdot a_n$. If a > 1 then $a_{n+1} > a_n$, while if 0 < a < 1
then $a_{n+1} < a_n$; in either case the sequence is monotone.
Case 1 0 < a < 1; the sequence {an } is bounded below by 0, so by the monotone
convergence theorem, an → l as n → ∞. As before note that an+1 → l as n → ∞. Then
applying 2.10 to the equation an+1 = a.an , we have l = a.l, and since a ≠ 1, we must have
l = 0.
Case 2 a > 1; the sequence {an } is increasing. Suppose it is bounded above; then
as in Case 1, it converges to a limit l satisfying l = a.l, so that l = 0. But the sequence
is increasing, so l ≥ a1 = a > 1. This contradiction shows the
sequence is not bounded above. Since the sequence is monotone, it must tend to infinity
(as described in 3.9).
Case 3 |a| < 1; then since $-|a^n| \le a^n \le |a^n|$, and since $|a^n| = |a|^n \to 0$ as n → ∞,
by squeezing, since each outer limit → 0 by case 1, we have $a^n \to 0$ as n → ∞ whenever
|a| < 1.
3.11. Example. Evaluate $\lim_{n\to\infty} \left(\dfrac{2}{3}\right)^n$ and $\lim_{n\to\infty} \left[\left(\dfrac{-2}{3}\right)^n + \left(\dfrac{4}{5}\right)^n\right]$.
Solution. Using the result that if |a| < 1, then $a^n \to 0$ as n → ∞, we deduce that
$(-2/3)^n \to 0$ as n → ∞, that $(4/5)^n \to 0$ as n → ∞, and using 2.10, that the second limit
is also 0.
3.12. Exercise. Given that k > 1 is a fixed constant, evaluate $\lim_{n\to\infty} \left(1 - \dfrac{1}{k}\right)^n$. How does
your result compare with the sequence definition of e given in Sect 3.1?
3.13. Example. Let a1 = 1, and for n ≥ 1 define $a_{n+1} = \dfrac{6(1 + a_n)}{7 + a_n}$. Show that {an } is
convergent, and find its limit.
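Solution. We can calculate the first few terms of the sequence as follows:
a1 = 1, a2 = 1.5, a3 ≈ 1.76, a4 ≈ 1.89, a5 ≈ 1.95.
It seems the sequence is increasing. Write $f(x) = \dfrac{6(1+x)}{7+x}$, so that an+1 = f (an ); then
$f'(x) = \dfrac{36}{(7+x)^2} > 0$ for x > −7.
Recall from elementary calculus that if f′(x) > 0, then f is increasing; in other words, if
b > a then f (b) > f (a). We thus see that f is increasing.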
Since a2 > a1 , we have f (a2 ) > f (a1 ); in other words, a3 > a2 . Applying this argument
inductively, we see that an+1 > an for all n, and the sequence {an } is increasing.
If x is large, f (x) ≈ 6, so perhaps f (x) < 6 for all x.
$$6 - f(x) = \frac{6(7+x) - 6 - 6x}{7+x} = \frac{36}{7+x} > 0 \quad \text{if } x > -7.$$
In particular, we see that f (x) ≤ 6 whenever x ≥ 0. Clearly an ≥ 0 for all n, so f (an ) =
an+1 ≤ 6 for all n, and the sequence {an } is increasing and bounded above. Hence {an } is
convergent, with limit l (say).
Since also an+1 → l as n → ∞, applying 2.10 to the defining equation gives $l = \dfrac{6(1+l)}{7+l}$,
or $l^2 + 7l = 6 + 6l$. Thus $l^2 + l - 6 = 0$ or (l + 3)(l − 2) = 0. Thus we can only have limits
2 or −3, and since an ≥ 0 for all n, necessarily l ≥ 0. Hence l = 2.
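The behaviour established in Example 3.13 can be observed by simply iterating the defining equation. This is a minimal numerical sketch, assuming ordinary floating point arithmetic; the helper names `f` and `iterate` are chosen here for illustration.

```python
def f(x):
    """The map from Example 3.13, so that a_{n+1} = f(a_n)."""
    return 6 * (1 + x) / (7 + x)


def iterate(a1, count):
    """Return the list [a_1, ..., a_count] generated by a_{n+1} = f(a_n)."""
    terms = [a1]
    for _ in range(count - 1):
        terms.append(f(terms[-1]))
    return terms


if __name__ == "__main__":
    # The printed terms increase and stay below 6 (indeed below 2).
    for n, a_n in enumerate(iterate(1.0, 10), start=1):
        print(n, a_n)
```

Starting instead from a value above 2, such as `iterate(5.0, 10)`, gives a decreasing sequence, which illustrates the warning that follows: f being increasing only tells us the sequence carries on the way it starts.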
Warning: There is a difference between showing that f is increasing, and showing that
the sequence is increasing. There is of course a relationship between the function f and the
sequence an ; it is precisely that f (an ) = an+1 . What we know is that if f is increasing,
then the sequence carries on going the way it starts; if it starts by increasing, as in the
above example, then it continues to increase. In the same way, if it starts by decreasing,
the sequence will continue to decrease. Show this for yourself.
3.14. Exercise. Define the sequence {an } by a1 = 1, an+1 = (4an + 2)/(an + 3) for n ≥ 1.
Show that {an } is convergent, and find its limit.
3.15. Proposition. Let {an } be an increasing sequence which is convergent to l (In other
words it is necessarily bounded above). Then l is an upper bound for the sequence {an }.
Proof. We argue by contradiction. If l is not an upper bound for the sequence, there is
some aN which witnesses this; i.e. aN > l. Since the sequence is increasing, if n ≥ N , we
have an ≥ aN > l. Now apply 2.16 to deduce that l ≥ aN > l which is a contradiction.
3.16. Example. Let $a_n = \left(1 + \dfrac{1}{n}\right)^n$; then {an } is convergent.
Solution. We show we have an increasing sequence which is bounded above. By the
binomial theorem,
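$$\begin{aligned}
\left(1 + \frac{1}{n}\right)^n &= 1 + \frac{n}{n} + \frac{n(n-1)}{2!}\cdot\frac{1}{n^2} + \frac{n(n-1)(n-2)}{3!}\cdot\frac{1}{n^3} + \cdots + \frac{1}{n^n} \\
&\le 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots \\
&\le 1 + 1 + \frac{1}{2} + \frac{1}{2\cdot 2} + \frac{1}{2\cdot 2\cdot 2} + \cdots \le 3.
\end{aligned}$$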
from which it is clear that an increases with n. Thus we have an increasing sequence which is
bounded above, and so is convergent by the Monotone Convergence Theorem 3.9. Another
method, in which we show the limit is actually e, is given on tutorial sheet 3.
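The two properties used in Example 3.16, that the sequence increases and is bounded above by 3, can be checked numerically for the first few hundred terms. This is only a sanity check under floating point arithmetic, not a proof; the function name `a` is an illustrative choice.

```python
import math


def a(n):
    """The n-th term (1 + 1/n)^n of the sequence in Example 3.16."""
    return (1 + 1 / n) ** n


if __name__ == "__main__":
    # Terms increase with n, stay below 3, and approach e.
    for n in (1, 2, 5, 10, 100, 1000, 10**6):
        print(n, a(n), math.e - a(n))
```

The last column shrinks slowly, roughly like e/(2n), so this sequence is a poor way of computing e in practice even though it converges.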
n     1    2    3     4      5     6       7      8      Formula
pn    1    1    2     3      5     8       13     21     pn+2 = pn + pn+1
an    1    2    1.5   1.67   1.6   1.625   1.61   1.62   an = pn+1 /pn
On the basis of this evidence, we make the following guesses, and then try to prove them:
• an is convergent.
Note we are really behaving like proper mathematicians here; the aim is simply to use
proof to see if earlier guesses were correct. The method we use could be very like that in
the previous example; in fact we choose to use induction more.
Either method can be used on either example, and you should become familiar
with both techniques.
$$a_{n+1} = \frac{p_{n+2}}{p_{n+1}} = \frac{p_{n+1} + p_n}{p_{n+1}} = 1 + \frac{1}{a_n}. \tag{3.3}$$
The next stage is to look at the “every other one” subsequences. First we get a relationship
like equation 3.3 for these. (We hope these subsequences are going to be better behaved
than the sequence itself).
$$a_{n+2} = 1 + \frac{1}{a_{n+1}} = 1 + \frac{1}{1 + \frac{1}{a_n}} = 1 + \frac{a_n}{1 + a_n}. \tag{3.4}$$
We use this to compute how the difference between successive terms in the sequence behaves.
Remember we are interested in the “every other one” subsequence. Computing,
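$$\begin{aligned}
a_{n+2} - a_n &= \frac{a_n}{1 + a_n} - \frac{a_{n-2}}{1 + a_{n-2}} \\
&= \frac{a_n + a_n a_{n-2} - a_{n-2} - a_n a_{n-2}}{(1 + a_n)(1 + a_{n-2})} \\
&= \frac{a_n - a_{n-2}}{(1 + a_n)(1 + a_{n-2})}.
\end{aligned}$$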
In the above, we already know that the denominator is positive (and in fact is at least 4
and at most 9). This means that an+2 − an has the same sign as an − an−2 ; we can now use
this information on each subsequence. Since a4 < a2 = 2, we have a6 < a4 and so on; by
induction, $a_{2n}$ forms a decreasing sequence, bounded below by 1, and hence is convergent
to some limit α. Similarly, since a3 > a1 = 1, we have a5 > a3 and so on; by induction
$a_{2n+1}$ forms an increasing sequence, bounded above by 2, and hence is convergent to some
limit β.
Remember that adjacent terms of both of these sequences satisfied equation 3.4, so as
usual, the limit satisfies this equation as well. Thus
$$\alpha = 1 + \frac{\alpha}{1+\alpha}, \quad \alpha + \alpha^2 = 1 + 2\alpha, \quad \alpha^2 - \alpha - 1 = 0 \quad\text{and}\quad \alpha = \frac{1 \pm \sqrt{1+4}}{2}.$$
Since all the terms are positive, we can ignore the negative root, and get $\alpha = (1 + \sqrt{5})/2$.
In exactly the same way, we have $\beta = (1 + \sqrt{5})/2$, and both subsequences converge to the
same limit. It is now an easy deduction (which is left for those who are interested - ask if
you want to see the details) that the sequence {an } of ratios of successive Fibonacci numbers
itself converges to this limit.
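The pattern proved above, with the even-numbered ratios decreasing and the odd-numbered ratios increasing towards the same limit, is easy to see numerically. The sketch below is illustrative only; the function name `fibonacci_ratios` is introduced here and is not from the notes.

```python
import math


def fibonacci_ratios(count):
    """Return [a_1, ..., a_count] where a_n = p_{n+1}/p_n and p_1 = p_2 = 1."""
    ratios = []
    p, q = 1, 1  # p holds p_n, q holds p_{n+1}
    for _ in range(count):
        ratios.append(q / p)
        p, q = q, p + q  # advance using p_{n+2} = p_n + p_{n+1}
    return ratios


if __name__ == "__main__":
    golden = (1 + math.sqrt(5)) / 2
    ratios = fibonacci_ratios(20)
    print("a_2, a_4, ... :", ratios[1::2])  # decreasing subsequence
    print("a_1, a_3, ... :", ratios[0::2])  # increasing subsequence
    print("distance of a_20 from (1 + sqrt 5)/2:", abs(ratios[-1] - golden))
```

Slicing with `ratios[1::2]` and `ratios[0::2]` picks out exactly the two “every other one” subsequences discussed in the text.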
Chapter 4
Limits and Continuity
It turns out that there are a number of “good” classes of functions which are worth
studying. In this chapter and the next ones, we study functions which have steadily more
and more restrictions on them. This means the behaviour steadily improves and, at the
same time, the number of examples steadily decreases. A perfectly general function has
essentially nothing useful that can be said about it; so we start by studying continuous
functions, the first class that gives us much theory.
In order to discuss functions sensibly, we often insist that we can “get a good look” at
the behaviour of the function at a given point, so typically we restrict the domain of the
function to be well behaved.
4.1. Definition. A subset U of R is open if given a ∈ U , there is some δ > 0 such that
(a − δ, a + δ) ⊆ U .
In fact this is the same as saying that given a ∈ U , there is some open interval containing
a which lies in U . In other words, a set is open if it contains a neighbourhood of each of its
points. We saw in 1.10 that an open interval is an open set. This definition has the effect
that if a function is defined on an open set we can look at its behaviour near the point a of
interest, from both sides.
We illustrate the behaviour of the function for the case when a = 2 in Fig 4.1.
Figure 4.1: Graph of the function $(x^2 - 4)/(x - 2)$. The automatic graphing routine does
not even notice the singularity at x = 2.
In this example, we can argue that the use of $(x^2 - a^2)/(x - a)$ was perverse; there
was a more natural definition of the function which gave the “right” answer. But in the
case of sin x/x, example 4, there was no such definition; we are forced to make the two part
definition, in order to define the function “properly” everywhere. So we again have to be
careful near a particular point; in this case, near x = 0. The function is graphed in Fig 4.2,
and again we see that the graph shows no evidence of a difficulty at x = 0.
Considering example 5 shows that these limits need not always exist; we describe this
by saying that the limit from the left and from the right both exist, but differ, and the
function has a jump discontinuity at 0. We sketch the function in Fig 4.3.
In fact this is not the worst that can happen, as can be seen by considering example 6.
Sketching the graph, we note that the limit at 0 does not even exist. We prove this in
more detail later in 4.23.
The crucial property we have been studying, that of having a definition at a point which
is the “right” definition, given how the function behaves near the point, is the property of
continuity. It is closely connected with the existence of limits, which have an accurate
definition, very like the “sequence” ones, and with very similar properties.
4.3. Definition. Say that f (x) tends to l as x → a iff given ε > 0, there is some δ > 0
such that whenever 0 < |x − a| < δ, then |f (x) − l| < ε.
Note that we exclude the possibility that x = a when we consider a limit; we are only
interested in the behaviour of f near a, but not at a. In fact this is very similar to the
definition we used for sequences. Our main interest in this definition is that we can now
describe continuity accurately.
4.4. Definition. Say that f is continuous at a if limx→a f (x) = f (a). Equivalently, f
is continuous at a iff given ε > 0, there is some δ > 0 such that whenever |x − a| < δ, then
|f (x) − f (a)| < ε.
Figure 4.2: Graph of the function sin(x)/x. Again the automatic graphing routine does
not even notice the singularity at x = 0.
Figure 4.3: The function which is 0 when x < 0 and 1 when x ≥ 0; it has a jump
discontinuity at x = 0.
Note that in the “epsilon - delta” definition, we no longer need exclude the case when
x = a. Note also there is a good geometrical meaning to the definition. Given an error
ε, there is some neighbourhood of a such that if we stay in that neighbourhood, then f is
trapped within ε of its value f (a).
We shall not insist on this definition in the same way that the definition of the convergence
of a sequence was emphasised. However, all our work on limits and continuity of
functions can be traced back to this definition, just as in our work on sequences, everything
could be traced back to the definition of a convergent sequence. Rather than do this, we
shall state without proof a number of results which enable continuous functions both to be
recognised and manipulated. So you are expected to know the definition, and a few simple
ε–δ proofs, but you can apply (correctly, and always after checking that any needed
conditions are satisfied) the standard results we are about to quote in order to do required
manipulations.
Figure 4.4: Graph of the function sin(1/x). Here it is easy to see the problem at x = 0;
the plotting routine gives up near this singularity.
a ∈ U.
Note: This is important. The function f (x) = 1/x is defined on {x : x ≠ 0}, and
is a continuous function. We cannot usefully define it on a larger domain, and so, by the
definition, it is continuous. This is an example where the naive “can draw it without taking
the pencil from the paper” definition of continuity is not helpful.
4.6. Example. Let $f(x) = \dfrac{x^3 - 8}{x - 2}$ for x ≠ 2. Show how to define f (2) in order to make f
a continuous function at 2.
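Solution. We have
$$\frac{x^3 - 8}{x - 2} = \frac{(x-2)(x^2 + 2x + 4)}{x - 2} = x^2 + 2x + 4 \quad \text{for } x \ne 2.$$
Thus f (x) → 12 as x → 2, so defining f (2) = 12 makes f continuous at 2.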
Solution. We prove this directly from the definition, using the fact that, for all x, we have
|sin(x)| ≤ 1. Pick ε > 0 and choose δ = ε [We know the answer, but δ = ε/2, or any value
of δ with 0 < δ ≤ ε will do]. Then if |x| < δ,
$$\left| x \sin\frac{1}{x} - 0 \right| = \left| x \sin\frac{1}{x} \right| = |x| \cdot \left| \sin\frac{1}{x} \right| \le |x| < \delta \le \epsilon$$
as required. Note that this is an example where the product of a continuous and a discon-
tinuous function is continuous. The graph of this function is shown in Fig 4.5.
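The choice δ = ε in this solution can be spot-checked numerically. The sketch below samples points with 0 < |x| < δ and tests the inequality |f(x) − 0| < ε; it assumes the value f(0) = 0 at the origin, and the names `f` and `delta_works` are illustrative. Sampling can expose a bad choice of δ, but it can never replace the proof.

```python
import math


def f(x):
    """x*sin(1/x) away from 0, with the value 0 assumed at x = 0."""
    return x * math.sin(1 / x) if x != 0 else 0.0


def delta_works(eps, delta, samples=10000):
    """Check |f(x) - 0| < eps at sample points x with 0 < |x| < delta."""
    for k in range(1, samples + 1):
        x = delta * k / (samples + 1)
        if not (abs(f(x)) < eps and abs(f(-x)) < eps):
            return False
    return True


if __name__ == "__main__":
    print(delta_works(1e-3, 1e-3))  # delta = eps, as in the solution
    print(delta_works(0.01, 1.0))   # delta far too large for this eps
```

The second call shows what a failed choice looks like: with δ = 1 the sample points include values of x where |x sin(1/x)| is much larger than 0.01.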
Figure 4.5: Graph of the function $x \sin(1/x)$. You can probably see how the discontinuity
of sin(1/x) gets absorbed. The lines y = x and y = −x are also plotted.
4.9. Definition. Say that limx→a− f (x) = l, or that f has a limit from the left iff given
ε > 0, there is some δ > 0 such that whenever a − δ < x < a, then |f (x) − l| < ε.
There is a similar definition of “limit from the right”, written as limx→a+ f (x) = l.
4.11. Proposition. If limx→a f (x) exists, then both one sided limits exist and are equal.
Conversely, if both one sided limits exist and are equal, then limx→a f (x) exists.
This splits the problem of finding whether a limit exists into two parts; we can look on
either side, and check first that we have a limit, and second, that we get the same answer.
For example, in 4.2, example 5, both 1-sided limits exist, but are not equal. There is now
an obvious way of checking continuity.
4.12. Proposition. (Continuity Test) The function f is continuous at a iff both one
sided limits exist and are equal to f (a).
4.13. Example. Let $f(x) = \begin{cases} x^2 & \text{for } x \le 1, \\ x & \text{for } x \ge 1. \end{cases}$ Show that f is continuous at 1. [In fact f
is continuous everywhere].
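Solution. We use the above criterion. Note that f (1) = 1. Also
$$\lim_{x\to 1-} f(x) = \lim_{x\to 1-} x^2 = 1 \quad\text{and}\quad \lim_{x\to 1+} f(x) = \lim_{x\to 1+} x = 1,$$
so f is continuous at 1.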
4.14. Exercise. Let $f(x) = \begin{cases} \dfrac{\sin x}{x} & \text{for } x < 0, \\ \cos x & \text{for } x \ge 0. \end{cases}$ Show that f is continuous at 0. [In fact
f is continuous everywhere]. [Recall the result of 4.2, example 4]
4.15. Example. Let f (x) = |x|. Then f is continuous in R .
Solution. Note that if x < 0 then |x| = −x and so is continuous, while if x > 0, then
|x| = x and so also is continuous. It remains to examine the function at 0. From these
identifications, we see that limx→0− |x| = 0+, while limx→0+ |x| = 0+. Since 0+ = 0− =
0 = |0|, by 4.12, |x| is continuous at 0.
4.16. Proposition. Let f and g be continuous at a, and let k be a constant. Then k.f ,
f + g and f g are continuous at a. Also, if g(a) ≠ 0, then f /g is continuous at a.
Pick ε > 0; then there is some δ1 such that if |x−a| < δ1 , then |f (x)−f (a)| < ε/2. Similarly
there is some δ2 such that if |x − a| < δ2 , then |g(x) − g(a)| < ε/2. Let δ = min(δ1 , δ2 ), and
pick x with |x − a| < δ. Then
|f (x) + g(x) − (f (a) + g(a))| ≤ |f (x) − f (a)| + |g(x) − g(a)| < ε/2 + ε/2 = ε.
This gives the result. The other results are similar, but rather harder; see (Spivak 1967)
for proofs.
Note: Just as when dealing with sequences, we need to know that f /g is defined in some
neighbourhood of a. This can be shown using a very similar proof to the corresponding
result for sequences.
Proof. Pick ε > 0. We must find δ > 0 such that if |x − a| < δ, then |g(f (x)) − g(f (a))| < ε.
We find δ using the given properties of f and g. Since g is continuous at f (a), there is
some δ1 > 0 such that if |y − f (a)| < δ1 , then |g(y) − g(f (a))| < ε. Now use the fact that f
is continuous at a, so there is some δ > 0 such that if |x − a| < δ, then |f (x) − f (a)| < δ1 .
Combining these results gives the required inequality.
4.18. Example. The function in Example 4.8 is continuous everywhere. When we first
studied it, we showed it was continuous at the “difficult” point x = 0. Now we can deduce
that it is continuous everywhere else.
4.19. Example. The function $f : x \mapsto \sin^3 x$ is continuous.
Solution. Write g(x) = sin(x) and $h(x) = x^3$. Note that each of g and h is continuous,
and that f = h ◦ g. Thus f is continuous.
4.20. Example. Let $f(x) = \tan\left(\dfrac{x^2 - a^2}{x^2 + a^2}\right)$. Show that f is continuous at every point of its
domain.
Solution. Let $g(x) = \dfrac{x^2 - a^2}{x^2 + a^2}$, where we assume a ≠ 0. Since −1 ≤ g(x) < 1 and 1 < π/2, the function is properly defined
for all values of x (whilst tan x is undefined when x = (2k + 1)π/2), and the quotient is
continuous, since each term is, and since $x^2 + a^2 \ne 0$ for any x. Thus f is continuous, since
f = tan ◦ g.
4.21. Exercise. Let $f(x) = \exp\left(\dfrac{1 + x^2}{1 - x^2}\right)$. Write down the domain of f , and show that f
is continuous at every point of its domain.
As another example of the use of the definitions, we can give a proof of Proposition 2.20
4.27. Definition. Say that limx→∞ f (x) = ∞ iff given L > 0, there is some K such that
whenever x > K, then f (x) > L.
The reason for working on proofs from the definition is to be able to check which results
of this type are trivially true without having to find them in a book. For example:
4.28. Proposition. Let g(x) = 1/f (x). Then g(x) → 0+ as x → ∞ iff f (x) → ∞ as
x → ∞. Let y = 1/x. Then y → 0+ as x → ∞; conversely, y → ∞ as x → 0+
Proof. Pick ε > 0. We show there is some K such that if x > K, then 0 < y < ε; indeed,
simply take K = 1/ε. The converse is equally trivial.
4.29. Definition. Say that f is continuous on [a, b] iff f is continuous on (a, b), and if, in
addition, limx→a+ f (x) = f (a), and limx→b− f (x) = f (b).
Proof. We make no attempt to prove this formally, but sketch a proof with a pair of
sequences and a repeated bisection argument. It is also noted that each hypothesis is
necessary.
4.31. Example. Show there is at least one root of the equation x − cos x = 0 in the interval
[0, 1].
Proof. Apply the Intermediate Value Theorem to f (x) = x − cos x on the closed interval [0, 1]. The
function is continuous on that interval, and f (0) = −1, while f (1) = 1 − cos(1) > 0. Thus there
is some point c ∈ (0, 1) such that f (c) = 0 as required.
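The repeated bisection idea mentioned in the sketch proof of the Intermediate Value Theorem also gives a practical way of locating the root in Example 4.31. The following is a minimal sketch, assuming f is continuous with f(a) < 0 < f(b); the function name `bisect_root` and the tolerance parameter are illustrative choices, not part of the notes.

```python
import math


def bisect_root(f, a, b, tol=1e-10):
    """Approximate a root of f in [a, b], assuming f continuous and f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    while b - a > tol:
        mid = (a + b) / 2
        # Keep the half-interval on which f still changes sign.
        if f(mid) < 0:
            a = mid
        else:
            b = mid
    return (a + b) / 2


if __name__ == "__main__":
    root = bisect_root(lambda x: x - math.cos(x), 0.0, 1.0)
    print(root, root - math.cos(root))
```

Each pass halves the interval while preserving the sign change, so the two end points form an increasing and a decreasing sequence trapping the root, which is exactly the pair of sequences used in the proof sketch.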
4.32. Exercise. Show there is at least one root of the equation $x - e^{-x} = 0$ in the interval
(0, 1).
4.33. Corollary. Let f be continuous on the compact interval [a, b], and assume there is
some constant h such that f (a) < h and f (b) > h. Then there is a point c with a < c < b
such that f (c) = h.
Proof. Apply the Intermediate Value Theorem to f − h on the closed interval [a, b]. Note
that by considering −f + h we can cope with the case when f (a) > h and f (b) < h.
Note: This theorem is the reason why continuity is often loosely described as a function
you can draw without taking your pen from the paper. As we have seen with y = 1/x, this
can give an inaccurate idea; it is in fact more akin to connectedness.
4.34. Theorem. (Boundedness) Let f be continuous on the compact interval [a, b].
Then there are constants M and m such that m < f (x) < M for all x ∈ [a, b]. In other
words, we are guaranteed that the graph of f is bounded both above and below.
Proof. This again uses the completeness of R , and again no proof is offered. Note also that
the hypotheses are all needed.
4.35. Theorem. (Boundedness) Let f be continuous on the compact interval [a, b].
Then there are points x0 and x1 in [a, b] such that f (x0 ) ≤ f (x) ≤ f (x1 ) for all x ∈ [a, b]. In
other words, we are guaranteed that the graph of f is bounded both above and below, and
that these global extrema are actually attained.
Proof. This uses the completeness of R, and follows in part from the previous result. Let M be
the least upper bound of the values of f on [a, b]; it exists since f is bounded above by theorem 4.34.
If there is some point y at which f (y) = M , there is nothing to
prove. Otherwise the function $g(x) = (M - f(x))^{-1}$ is defined everywhere on [a, b], and is
continuous, and hence bounded, say g(x) ≤ K. But then f (x) ≤ M − 1/K for all x, which
contradicts the fact that M was the least upper bound for f . The point x0 is found in the same way.
Note also that the hypotheses are all needed.
Chapter 5
Differentiability
Note that the Newton quotient is not defined when x = a, nor need it be for the
definition to make sense. But the Newton quotient, if it exists, can be extended to be
a continuous function at a by defining its value at that point to be f′(a). Note also
the emphasis on the existence of the limit. Differentiation is as much about showing the
existence of the derivative, as calculating the value of the derivative.
5.2. Example. Let $f(x) = x^3$. Show, directly from the definition, that $f'(a) = 3a^2$. Compare
this result with 4.6.
Solution. This is just another way of asking about particular limits, like 4.2; we must
compute
$$\lim_{x\to a} \frac{x^3 - a^3}{x - a} = \lim_{x\to a} \frac{(x-a)(x^2 + xa + a^2)}{x - a} = \lim_{x\to a} (x^2 + xa + a^2) = 3a^2.$$
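The limit in Example 5.2 can be watched numerically by evaluating the Newton quotient at points closer and closer to a. This is a small illustrative sketch with a = 2, where the limit should be 3a² = 12; the names `newton_quotient` and `cube` are chosen here for illustration.

```python
def newton_quotient(f, a, x):
    """The Newton quotient (f(x) - f(a)) / (x - a), defined only for x != a."""
    return (f(x) - f(a)) / (x - a)


def cube(x):
    return x ** 3


if __name__ == "__main__":
    a = 2.0
    for h in (1e-1, 1e-2, 1e-3, 1e-4):
        x = a + h
        # The quotient agrees with x^2 + x*a + a^2, which tends to 3a^2 = 12.
        print(h, newton_quotient(cube, a, x), x * x + x * a + a * a)
```

Taking h much smaller than about 1e-8 would let rounding error in the subtraction f(x) − f(a) dominate, which is one practical reason the algebraic simplification in the solution is preferable to brute-force evaluation.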
5.3. Exercise. Let $f(x) = \sqrt{x}$. Show, directly from the definition, that $f'(a) = 1/(2\sqrt{a})$
when a ≠ 0. What function do you have to consider in the particular case when a = 4?
Just as with continuity, it is impractical to use this definition every time to compute
derivatives; we need results showing how to differentiate the usual class of functions, and
we assume these are known from last year. Thus we assume the rules for differentiation
of sums products and compositions of functions, together with the known derivatives of
elementary functions such as sin, cos and tan; their reciprocals sec, cosec and cot; and exp
and log.
5.4. Proposition. Let f and g be differentiable at a, and let k be a constant. Then k.f ,
f + g and f g are differentiable at a. Also, if g(a) ≠ 0, then f /g is differentiable at a. Let
f be differentiable at a, and let g be differentiable at f (a). Then g ◦ f is differentiable at a.
In addition, the usual rules for calculating these derivatives apply.
5.5. Example. Let $f(x) = \tan\left(\dfrac{x^2 - a^2}{x^2 + a^2}\right)$ for a ≠ 0. Show that f is differentiable at every
point of its domain, and calculate the derivative at each such point.
Solution. This is the same example we considered in 4.20. There we showed the domain
was the whole of R, and that the function was continuous everywhere. Let $g(x) = \dfrac{x^2 - a^2}{x^2 + a^2}$.
Then g is properly defined for all values of x, and the quotient is differentiable, since each
term is, and since $x^2 + a^2 \ne 0$ for any x since a ≠ 0. Thus f is differentiable using the
chain rule since f = tan ◦ g, and we are assuming known that the elementary functions like
tan are differentiable.
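Finally, to actually calculate the derivative, we have
$$\begin{aligned}
f'(x) &= \sec^2\left(\frac{x^2 - a^2}{x^2 + a^2}\right) \cdot \frac{(x^2 + a^2)\cdot 2x - (x^2 - a^2)\cdot 2x}{(x^2 + a^2)^2} \\
&= \frac{4a^2 x}{(x^2 + a^2)^2} \cdot \sec^2\left(\frac{x^2 - a^2}{x^2 + a^2}\right).
\end{aligned}$$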
5.6. Exercise. Let $f(x) = \exp\left(\dfrac{1 + x^2}{1 - x^2}\right)$. Show that f is differentiable at every point of its
domain, and calculate the derivative at each such point.
The first point in our study of differentiable functions is that it is more restrictive for
a function to be differentiable, than simply to be continuous.
Proof. To establish continuity, we must prove that limx→a f (x) = f (a). Since the Newton
quotient is known to converge, we have for x ≠ a,
$$f(x) - f(a) = \frac{f(x) - f(a)}{x - a} \cdot (x - a) \to f'(a) \cdot 0 = 0 \quad \text{as } x \to a.$$
Hence f is continuous at a.
5.8. Example. Let f (x) = |x|; then f is continuous everywhere, but not differentiable at 0.
Solution. We already know from Example 4.15 that |x| is continuous. We compute the
Newton quotient directly; recall that |x| = x if x ≥ 0, while |x| = −x if x < 0. Thus
$$\lim_{x\to 0+} \frac{f(x) - f(0)}{x - 0} = \lim_{x\to 0+} \frac{x - 0}{x - 0} = 1, \quad\text{while}\quad \lim_{x\to 0-} \frac{f(x) - f(0)}{x - 0} = \lim_{x\to 0-} \frac{-x - 0}{x - 0} = -1.$$
Thus both of the one-sided limits exist, but are unequal, so the limit of the Newton quotient
does not exist.
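$$\lim_{x\to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}.$$
Proof. Since f (a) = g(a) = 0, provided x ≠ a, we have
$$\frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{(f(x) - f(a))/(x - a)}{(g(x) - g(a))/(x - a)} \to \frac{f'(a)}{g'(a)} \quad \text{as } x \to a.$$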
5.13. Exercise. Evaluate $\lim_{x\to 0} \dfrac{\log(1 + x)}{\sin x}$.
5.14. Example. (Spurious, but helps to remember!) Show that $\lim_{x\to 0} \dfrac{\sin x}{x} = 1$.
Solution. This is spurious because we need the limit to calculate the derivative in the first
place, but applying l’Hôpital certainly gives the result.
5.15. Theorem (Rolle’s Theorem). Let f be continuous on [a, b], and differentiable on
(a, b), and suppose that f (a) = f (b). Then there is some c with a < c < b such that
f′(c) = 0.
Note: The theorem guarantees that the point c exists somewhere. It gives no indication
of how to find c. Here is the diagram to make the point geometrically:
Figure 5.1: If f crosses the axis twice, somewhere between the two crossings, the function
is flat. The accurate statement of this “obvious” observation is Rolle’s Theorem. (The graph
marks a point c between a and b at which f′(c) = 0.)
Proof. Since f is continuous on the compact interval [a, b], it has both a global maximum
and a global minimum. Assume first that the global maximum occurs at an interior point
c ∈ (a, b). In what follows, we pick h small enough so that c + h always lies in (a, b). Then
f (c + h) ≤ f (c), since the global maximum occurs at c.
If h > 0, $\dfrac{f(c+h) - f(c)}{h} \le 0$, and so $\lim_{h\to 0+} \dfrac{f(c+h) - f(c)}{h} \le 0$, since we know the limit
exists.
Similarly, if h < 0, $\dfrac{f(c+h) - f(c)}{h} \ge 0$, and so $\lim_{h\to 0-} \dfrac{f(c+h) - f(c)}{h} \ge 0$, since we
know the limit exists. Combining these, we see that f′(c) = 0, and we have the result in
this case.
A similar argument applies if, instead, the global minimum occurs at the interior point
c. The remaining situation occurs if both the global maximum and global minimum occur
at end points; since f (a) = f (b), it follows that f is constant, and any c ∈ (a, b) will do.
Solution. Since $p'(x) = 3(x^2 + 1) > 0$ for all x ∈ R, we see that p has at most one root;
for if it had two (or more) roots there would be a root of p′(x) = 0 between them by Rolle.
Since p(0) = 1, while p(−1) = −3, there is at least one root by the Intermediate Value
Theorem. Hence p has exactly one root.
We have $q'(x) = 3(x^2 - 1) = 0$ when x = ±1. Since q(−1) = 3 and q(1) = −1, there is
a root of q between −1 and 1 by the Intermediate Value Theorem. Looking as x → ∞ and
as x → −∞ shows there are three roots of q.
5.17. Exercise. Show that the equation $x - e^{-x} = 0$ has exactly one root in the interval
(0, 1).
Our version of Rolle’s theorem is valuable as far as it goes, but the requirement that
f (a) = f (b) is sufficiently strong that it can be quite hard to apply sometimes. Fortunately
the geometrical description of the result, that somewhere the tangent is parallel to the
axis, does have a more general restatement.
5.18. Theorem (The Mean Value Theorem). Let f be continuous on [a, b], and differentiable
on (a, b). Then there is some c with a < c < b such that
$$\frac{f(b) - f(a)}{b - a} = f'(c) \quad\text{or equivalently}\quad f(b) = f(a) + (b - a) f'(c).$$
Proof. Let
$$h(x) = f(b) - f(x) - \frac{f(b) - f(a)}{b - a}\,(b - x).$$
Then h is continuous on the interval [a, b], since f is, and in the same way, it is differentiable
on the open interval (a, b). Also, h(b) = 0 and h(a) = 0. We can thus apply Rolle’s theorem
to h to deduce there is some point c with a < c < b such that h′(c) = 0. Thus we have
$$0 = h'(c) = -f'(c) + \frac{f(b) - f(a)}{b - a},$$
which is the required result.
Figure 5.2: Somewhere inside a chord, the tangent to f will be parallel to the chord. The
accurate statement of this common-sense observation is the Mean Value Theorem.
5.19. Example. The function f satisfies $f'(x) = \dfrac{1}{5 - x^2}$ and f (0) = 2. Use the Mean Value
theorem to estimate f (1).
Solution. We first estimate the given derivative for values of $x$ satisfying $0 < x < 1$. Since
for such $x$, we have $0 < x^2 < 1$, and so $4 < 5 - x^2 < 5$. Inverting we see that
\[ \frac{1}{5} < f'(x) < \frac{1}{4} \quad\text{when } 0 < x < 1. \]
Now apply the Mean Value theorem to $f$ on the interval $[0, 1]$ to obtain some $c$ with $0 < c < 1$
such that $f(1) - f(0) = f'(c)$. From the given value of $f(0)$, we see that $2.2 < f(1) < 2.25$.
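As a check on the estimate, we can compute $f(1) = f(0) + \int_0^1 f'(x)\,dx$ numerically. This is a sketch only; it uses a simple midpoint rule (the function name `midpoint_integral` and the step count are illustrative choices, not part of the notes) rather than the Mean Value Theorem itself.

```python
def f_prime(x):
    # The derivative given in Example 5.19.
    return 1.0 / (5.0 - x * x)


def midpoint_integral(g, a, b, steps=10_000):
    """Approximate the integral of g over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


# f(1) = f(0) + integral of f' over [0, 1], with f(0) = 2.
f1 = 2.0 + midpoint_integral(f_prime, 0.0, 1.0)
print(f1)
```

The computed value lies between the bounds $2.2$ and $2.25$ obtained from the Mean Value Theorem.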
5.20. Exercise. The function $f$ satisfies $f'(x) = \dfrac{1}{5 + \sin x}$ and $f(0) = 0$. Use the Mean
Value theorem to estimate $f(\pi/2)$.
Note the “common sense” description of what we have done. If the derivative doesn't
change much, the function will behave linearly. Note also that this gives meaning to the
approximation
\[ f(a + h) \approx f(a) + h f'(a). \]
We now see that the accurate version of this replaces $f'(a)$ by $f'(c)$ for some $c$ between $a$
and $a + h$.
5.21. Theorem (The Cauchy Mean Value Theorem). Let $f$ and $g$ be both continuous
on $[a, b]$ and differentiable on $(a, b)$. Then there is some point $c$ with $a < c < b$ such that
\[ g'(c)\bigl(f(b) - f(a)\bigr) = f'(c)\bigl(g(b) - g(a)\bigr). \]
5.4. L’HÔPITAL REVISITED 47
Thus
\[ f'(c)\bigl(g(b) - g(a)\bigr) = g'(c)\bigl(f(b) - f(a)\bigr). \]
This is one form of the Cauchy Mean Value Theorem for $f$ and $g$. If $g'(c) \ne 0$ for any
possible $c$, then the Mean Value theorem shows that $g(b) - g(a) \ne 0$, and so we can divide
the above result to get
\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)}, \]
giving a second form of the result.
5.23. Example. Show that $\displaystyle \lim_{x \to 0} \frac{1 - \cos x}{x^2} = \frac{1}{2}$.
Solution. We have
\[ \lim_{x \to 0} \frac{1 - \cos x}{x^2} = \lim_{x \to 0} \frac{\sin x}{2x} = \frac{1}{2}, \]
where the use of l'Hôpital is justified since the second limit exists. Note that you can't
differentiate top and bottom again, and still expect to get the correct answer; one of the
hypotheses of l'Hôpital is that the first quotient is of the 0/0 form.
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)}. \]
5.6. TAYLOR’S THEOREM 49
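Proof. (Sketch for interest; not part of the course.) Write $l = \lim_{x \to \infty} \dfrac{f'(x)}{g'(x)}$. Pick $\epsilon > 0$ and choose $a$ such that
\[ \left| \frac{f'(x)}{g'(x)} - l \right| < \epsilon \quad\text{for all } x > a. \]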
\[ f(a + h) = f(a) + h f'(c) \]
where $c$ is some point between $a$ and $a + h$. [By writing the definition of $c$ in this way,
we have a statement that works whether $h > 0$ or $h < 0$.] We have already met the
approximation
\[ f(a + h) \sim f(a) + h f'(a) \]
when we studied the Newton-Raphson method for solving an equation, and have already
observed that the Mean Value Theorem provides a more accurate version of this. Now
consider what happens when f is a polynomial of degree n
and $f''(0) = 2a_2$. After the next differentiation, we get $f'''(0) = 3!\,a_3$, while after $k$
differentiations, we get $f^{(k)}(0) = k!\,a_k$, provided $k \le n$. Thus we can rewrite the polynomial,
using its value, and the value of its derivatives at 0, as
where $P_n(x)$, the Taylor polynomial of degree $n$ about $a$, and $R_n(x)$, the corresponding
remainder, are given by
• the theorem is also true for $x < a$; just restate it for the interval $[x, a]$ etc;
• if $n = 0$, we have $f(x) = f(a) + (x - a) f'(c)$ for some $c$ between $a$ and $x$; this is a
restatement of the Mean Value Theorem;
• if $n = 1$, we have
\[ f(x) = f(a) + (x - a) f'(a) + \frac{f''(c)}{2!}\,(x - a)^2 \]
for some $c$ between $a$ and $x$; this is often called the Second Mean Value Theorem;
• the special case in which $a = 0$ has a special name; it is called Maclaurin's Theorem;
• just as with Rolle, or the Mean Value Theorem, there is no useful information about
the point $c$.
We now explore the meaning and content of the theorem with a number of examples.
5.30. Example. Find the Taylor polynomial of order $n$ about 0 for $f(x) = e^x$, and write
down the corresponding remainder term.
Solution. There is no difficulty here in calculating derivatives: clearly $f^{(k)}(x) = e^x$ for
all $k$, and so $f^{(k)}(0) = 1$. Thus by Taylor's theorem,
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \frac{x^{n+1}}{(n+1)!}\,e^c \]
for some point $c$ between 0 and $x$. In particular,
\[ P_n(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} \quad\text{and}\quad R_n(x) = \frac{x^{n+1}}{(n+1)!}\,e^c. \]
We can actually say a little more about this example if we recall that $x$ is fixed. We
have
\[ e^x = P_n(x) + R_n(x) = P_n(x) + \frac{x^{n+1}}{(n+1)!}\,e^c. \]
We show that $R_n(x) \to 0$ as $n \to \infty$, so that (again for fixed $x$), the sequence $P_n(x) \to e^x$
as $n \to \infty$. If $x < 0$, then $c < 0$ and $e^c < 1$, while if $x \ge 0$, then since $c \le x$, we have $e^c \le e^x$. Thus
\[ |R_n(x)| = \left| \frac{x^{n+1}}{(n+1)!}\,e^c \right| \le \frac{|x|^{n+1}}{(n+1)!}\,\max(e^x, 1) \to 0 \quad\text{as } n \to \infty. \]
We think of the limit of the polynomial as forming a series, the Taylor series for ex . We
study series (and then Taylor series) in Section 7.
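The bound on the remainder for $e^x$ is easy to test numerically. The following is a minimal sketch; the function names `taylor_exp` and `remainder_bound` are illustrative only, and the choice $x = 2$ is arbitrary.

```python
import math


def taylor_exp(x, n):
    """Taylor polynomial P_n(x) = sum of x^k / k! for k = 0..n."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))


def remainder_bound(x, n):
    """The bound max(e^x, 1) * |x|^(n+1) / (n+1)! on |R_n(x)|."""
    return max(math.exp(x), 1.0) * abs(x) ** (n + 1) / math.factorial(n + 1)


x = 2.0
for n in (2, 5, 10):
    error = abs(math.exp(x) - taylor_exp(x, n))
    print(n, error, remainder_bound(x, n))
```

For each $n$ the actual error is no larger than the bound, and both shrink rapidly as $n$ grows.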
5.31. Example. Find the Taylor polynomial of order 1 about $a$ for $f(x) = e^x$, and write
down the corresponding remainder term.
Solution. Using the derivatives computed above, by Taylor's theorem,
\[ e^x = e^a + (x - a)\,e^a + \frac{(x - a)^2}{2!}\,e^c \]
for some point $c$ between $a$ and $x$. In particular,
\[ P_1(x) = e^a + (x - a)\,e^a \quad\text{and}\quad R_1(x) = \frac{(x - a)^2}{2!}\,e^c. \]
5.32. Example. Find the Maclaurin polynomial of order $n > 3$ about 0 for $f(x) = (1 + x)^3$,
and write down the corresponding remainder term.
Solution. We have
\[
\begin{array}{ll}
f(x) = (1 + x)^3, & f(0) = 1, \\
f'(x) = 3(1 + x)^2, & f'(0) = 3, \\
f''(x) = 6(1 + x), & f''(0) = 6, \\
f'''(x) = 6, & f'''(0) = 6, \\
f^{(n)}(x) = 0 & \text{if } n > 3.
\end{array}
\]
5.33. Example. Find the Taylor polynomial of order $n$ about 0 for $f(x) = \sin x$, and write
down the corresponding remainder term.
Solution. There is no difficulty here in calculating derivatives; we have
\[
\begin{array}{ll}
f(x) = \sin x, & f(0) = 0, \\
f'(x) = \cos x, & f'(0) = 1, \\
f''(x) = -\sin x, & f''(0) = 0, \\
f'''(x) = -\cos x, & f'''(0) = -1, \\
f^{(4)}(x) = \sin x & \text{and so on.}
\end{array}
\]
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + (-1)^n \frac{x^{2n+1}}{(2n+1)!} + \cdots \]
Writing down the remainder term isn't particularly useful, but the important point is that
\[ |R_{2n+1}(x)| \le \frac{|x|^{2n+3}}{(2n+3)!} \to 0 \quad\text{as } n \to \infty. \]
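5.34. Exercise. Recall that $\cosh x = \dfrac{e^x + e^{-x}}{2}$, and that $\sinh x = \dfrac{e^x - e^{-x}}{2}$. Now check the
shape of the following Taylor polynomials:
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + \cdots \]
\[ \sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots + \frac{x^{2n+1}}{(2n+1)!} + \cdots \]
\[ \cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots + \frac{x^{2n}}{(2n)!} + \cdots \]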
5.35. Example. Find the maximum error in the approximation
\[ \sin(x) \sim x - \frac{x^3}{3!} \]
given that $|x| < 1/2$.
Solution. We use the Taylor polynomial for $\sin x$ of order 4 about 0, together with the
corresponding remainder. Thus
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!}\cos c \]
for some $c$ with $0 < c < 1/2$ or $-1/2 < c < 0$. In any case, since $|x| < 1/2$,
\[ \left| \frac{x^5}{5!}\cos c \right| \le \frac{|x|^5}{5!} \le \frac{1}{2^5 \cdot 5!} = \frac{1}{120 \cdot 32}. \]
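The error bound $1/(2^5 \cdot 5!) = 1/3840$ can be compared with the worst error actually observed on a grid of points in $[-1/2, 1/2]$. This is only a sketch; the grid of 2001 sample points is an arbitrary choice.

```python
import math

# The bound from the Taylor remainder: 1 / (2^5 * 5!) = 1/3840.
bound = 1 / (2**5 * math.factorial(5))

# Worst observed error of sin x ~ x - x^3/3! on a grid covering [-1/2, 1/2].
worst = max(
    abs(math.sin(x) - (x - x**3 / 6))
    for x in (i / 2000 - 0.5 for i in range(2001))
)
print(bound, worst)
```

The observed error stays below the bound, and is close to it at the endpoints, so the estimate is a sharp one.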
5.36. Example. Find the Taylor polynomial of order $n$ about 0 for $f(x) = (1 + x)^\alpha$, and note
that this gives a derivation of the binomial theorem. In fact, the remainder $|R_n(x)| \to 0$ as
$n \to \infty$, provided $|x| < 1$.
Solution. There is again no difficulty here in calculating derivatives; we have
Infinite Series
In this section we return to study a particular kind of sequence, those built by adding up
more and more from a given collection of terms. One motivation comes from section 5.6,
in which we obtained a sequence of approximating polynomials {Pn } to a given function.
It is more natural to think of adding additional terms to the polynomial, and as such we
are studying series. However, there is a closely related sequence; the sequence of partial
sums.
a + (a + r) + (a + 2r) + · · · + (a + nr)
Let $S_n = a + ar + ar^2 + \cdots + ar^n$. Then
\[ rS_n = ar + ar^2 + ar^3 + \cdots + ar^{n+1}, \]
so $(1 - r)S_n = a(1 - r^{n+1})$, and
\[ S_n = \frac{a(1 - r^{n+1})}{1 - r} \quad\text{if } r \ne 1. \]
Note that if $|r| < 1$, then $S_n \to \dfrac{a}{1 - r}$ as $n \to \infty$.
56 CHAPTER 6. INFINITE SERIES
If $r > 1$, say $r = 1 + k$ with $k > 0$, then
\[ S_n = a\,\frac{(1 + k)^{n+1} - 1}{k} > \frac{a}{k}\bigl(1 + (n + 1)k - 1\bigr) = a(n + 1) \to \infty \quad\text{if } a > 0. \]
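6.1. Example. Find $\displaystyle \sum_{n=1}^{\infty} \left( \frac{1}{2^n} + \frac{1}{3^n} \right)$.
Solution.
\begin{align*}
\sum \left( \frac{1}{2^n} + \frac{1}{3^n} \right) = \sum \frac{1}{2^n} + \sum \frac{1}{3^n}
&= \left( \frac12 + \frac14 + \frac18 + \cdots \right) + \left( \frac13 + \frac19 + \cdots \right) \\
&= 1 + \frac13 \cdot \frac{1}{1 - 1/3} = \frac32.
\end{align*}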
6.2. Exercise. Find $\displaystyle \sum_{n=1}^{\infty} \left( 7\,\frac{1}{3^n} - 4\,\frac{1}{2^n} \right)$.
Thus a series is convergent if and only if its sequence of partial sums is convergent.
The limit of the sequence of partial sums is the sum of the series. A series which is not
convergent is a divergent series.
6.4. Example. The series $\sum r^n$ is convergent with sum $1/(1 - r)$, provided that $|r| < 1$.
For other values of $r$, the series is divergent; in particular, the series $\sum (-1)^n$ is divergent.
Solution. We noted above that when $|r| < 1$, $S_n \to a/(1 - r)$ as $n \to \infty$; note particular
cases:
\[ \sum_{n=1}^{\infty} \frac{1}{2^n} = 1 \quad\text{or equivalently,}\quad \frac12 + \frac14 + \frac18 + \cdots = 1. \]
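6.5. Example. The sum $\displaystyle \sum \frac{1}{n(n + 1)}$ is convergent with sum 1.
Solution. We can compute the partial sums explicitly:
\[ S_n = \sum_{k=1}^{n} \frac{1}{k(k + 1)} = \sum_{k=1}^{n} \left( \frac{1}{k} - \frac{1}{k + 1} \right) = 1 - \frac{1}{n + 1} \to 1 \quad\text{as } n \to \infty. \]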
6.2. CONVERGENT SERIES 57
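6.6. Example. The sum $\displaystyle \sum \frac{1}{n}$ is divergent.
Solution. We estimate the partial sums:
\begin{align*}
S_n &= \frac11 + \frac12 + \left( \frac13 + \frac14 \right) + \left( \frac15 + \cdots + \frac18 \right) + \left( \frac19 + \cdots + \frac1{15} \right) + \cdots + \frac1n \\
&> 1 + \frac12 + \frac24 + \frac48 = 2\tfrac12 \quad\text{if } n \ge 15 \\
&> 1 + \frac12 + \frac24 + \frac48 + \frac8{16} = 3 \quad\text{if } n \ge 31 \\
&\to \infty \quad\text{as } n \to \infty.
\end{align*}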
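The grouping argument shows that the partial sum up to $n = 2^m$ is at least $1 + m/2$. The following minimal sketch (the function name `harmonic` is illustrative) prints a few partial sums next to this lower bound, which makes the slow but unbounded growth visible.

```python
def harmonic(n):
    """Partial sum S_n = 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(1.0 / k for k in range(1, n + 1))


# Each block of 2^(m-1) terms ending at 1/2^m contributes at least 1/2,
# so S_(2^m) >= 1 + m/2.
for m in range(1, 11):
    n = 2**m
    print(n, harmonic(n), 1 + m / 2)
```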
6.7. Example. The sum $\displaystyle \sum \frac{1}{n^2}$ is convergent. [Actually the sum is $\pi^2/6$, but this is much
harder.]
Figure 6.1: Comparing the area under the curve $y = 1/x^2$ with the area of the rectangles
below the curve
Solution. We estimate the partial sums. Since $1/n^2 > 0$, clearly $\{S_n\}$ is an increasing
sequence. We show it is bounded above, whence by the Monotone Convergence Theorem (3.9),
it is convergent. From the diagram,
\[ \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2} < \int_1^n \frac{dx}{x^2}, \quad\text{and so}\quad S_n < 1 + \left[ -\frac{1}{x} \right]_1^n = 2 - \frac{1}{n}. \]
Thus $S_n < 2$ for all $n$, the sequence of partial sums is bounded above, and the series is
convergent.
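6.8. Proposition. Let $\sum a_n$ be convergent. Then $a_n \to 0$ as $n \to \infty$.
Proof. Write $l = \lim_{n \to \infty} S_n$, and recall from our work on limits of sequences that $S_{n-1} \to l$
as $n \to \infty$. Then
\[ a_n = S_n - S_{n-1} \to l - l = 0 \quad\text{as } n \to \infty. \]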
Figure 6.2: Comparing the area under the curve $y = 1/x$ with the area of the rectangles
above the curve
\[ S_n = a_1 + a_2 + \cdots + a_n, \qquad T_n = b_1 + b_2 + \cdots + b_n. \]
6.3. THE COMPARISON TEST 59
6.13. Example. Let $a_n = \dfrac{2n}{3n^3 - 1}$ and let $b_n = \dfrac{1}{n^2}$. Then $\sum a_n$ is convergent.
Solution. For $n \ge 1$, $n^3 \ge 1$, so $3n^3 - 1 \ge 2n^3$. Thus
\[ a_n = \frac{2n}{3n^3 - 1} \le \frac{2n}{2n^3} = b_n. \]
Since we know that $\sum b_n$ is convergent, so is $\sum a_n$.
6.14. Remark. The conclusions of Theorem 6.12 remain true even if we only have $a_n \le b_n$
eventually; for if it holds for $n \ge N$, we replace the inequality by
\[ S_n \le T_n + a_1 + a_2 + \cdots + a_N. \]
6.16. Corollary (Limiting form of the Comparison Test). Suppose that $a_n > 0$ and
$b_n > 0$, and that there is some constant $k$ such that $\displaystyle \lim_{n \to \infty} \frac{a_n}{b_n} = k > 0$. Then $\sum a_n$ is
convergent iff $\sum b_n$ is convergent.
Proof. Assume first that $\sum b_n$ is convergent. Since $a_n/b_n \to k$ as $n \to \infty$, eventually (take
$\epsilon = k > 0$), we have $a_n \le 2k b_n$, and $\sum 2k b_n$ is convergent by 6.11. Hence $\sum a_n$ is convergent
by 6.12. To get the converse, note that $b_n/a_n \to 1/k$ as $n \to \infty$, so we can use the same
argument with $a_n$ and $b_n$ interchanged.
6.17. Example. Let $a_n = \dfrac{n}{n^2 + 1}$ and let $b_n = \dfrac{1}{n}$. Then $\sum a_n$ is divergent by the limiting
form of the comparison test.
Solution. Note that the terms are all positive, so we try to apply the limiting form of the
comparison test directly.
\[ \frac{a_n}{b_n} = \frac{n}{n^2 + 1} \cdot \frac{n}{1} = \frac{n^2}{n^2 + 1} \to 1 \quad\text{as } n \to \infty. \]
6.20. Exercise. Let $a_n = \dfrac{n}{\sqrt{n^5 + n + 1}}$ and let $b_n = \dfrac{1}{n^{3/2}}$. Use the limiting form of the
comparison test to show that $\sum a_n$ is convergent.
We can consider the method of comparing with integrals as an “integral test” for the
convergence of a series; rather than state it formally, note the method we have used.
6.21. Theorem (The Ratio Test). Let $\sum a_n$ be a series, and assume that $\dfrac{|a_{n+1}|}{|a_n|} \to r$
as $n \to \infty$. Then if $r < 1$, the series is convergent, if $r > 1$, the series is divergent, while
if $r = 1$, the test gives no information.
Proof. A proof follows by comparing with the corresponding geometric series with ratio r.
Details will be given in full in the third year course.
6.22. Example. Let $a_n = \dfrac{2^n (n!)^2}{(2n)!}$. Then $\sum a_n$ is convergent.
6.4. ABSOLUTE AND CONDITIONAL CONVERGENCE 61
Solution. We look at the ratio of adjacent terms in the series (of positive terms).
Since the ratio of adjacent terms in the series tends to a limit which is < 1, the series
converges by the ratio test.
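For this example the ratio of adjacent terms simplifies to $a_{n+1}/a_n = \dfrac{2(n+1)^2}{(2n+2)(2n+1)} = \dfrac{n+1}{2n+1}$, which tends to $1/2$. The sketch below checks this simplification with exact rational arithmetic; the function name `a` and the range of $n$ are illustrative choices.

```python
from fractions import Fraction
from math import factorial


def a(n):
    """The term a_n = 2^n (n!)^2 / (2n)! as an exact fraction."""
    return Fraction(2**n * factorial(n) ** 2, factorial(2 * n))


# Ratios a_(n+1) / a_n for n = 1, ..., 199.
ratios = [a(n + 1) / a(n) for n in range(1, 200)]
print(float(ratios[0]), float(ratios[-1]))
```

The ratios decrease towards $1/2$, which is less than 1, in agreement with the ratio test.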
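Proof. Assume that $\sum_{n=1}^{\infty} a_n$ is absolutely convergent, and define
\[
a_n^+ = \begin{cases} a_n & \text{if } a_n > 0, \\ 0 & \text{if } a_n \le 0, \end{cases}
\qquad\text{and}\qquad
a_n^- = \begin{cases} |a_n| & \text{if } a_n < 0, \\ 0 & \text{if } a_n \ge 0. \end{cases}
\]
\[ 0 \le a_n^+ \le |a_n| \quad\text{and}\quad 0 \le a_n^- \le |a_n| \quad\text{for all } n, \tag{6.1} \]
\[ |a_n| = a_n^+ + a_n^- \quad\text{and}\quad a_n = a_n^+ - a_n^-. \]
Using equation 6.1 to compare with the convergent series $\sum_{n=1}^{\infty} |a_n|$, we see that each of
\[ \sum_{n=1}^{\infty} a_n^+ \quad\text{and}\quad \sum_{n=1}^{\infty} a_n^- \]
is convergent, and hence so is $\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} (a_n^+ - a_n^-)$.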
This gives one way of proving that a series is convergent even if the terms are not
all positive, and so we can’t use the comparison test directly. There is essentially only
one other way, which is a very special, but useful case known as Leibniz theorem, or the
theorem on alternating signs, or the alternating series test. We give the proof because
the argument is so like the proof of the convergence of the ratio of adjacent terms in the
Fibonacci series 3.1.
Warning: Note how we usually talk about the “Fibonacci series”, even though it is a
sequence rather than a series. Try not to be confused by this popular but inaccurate usage.
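6.29. Theorem (Leibniz Theorem). Let $\{a_n\}$ be a decreasing sequence of positive terms
such that $a_n \to 0$ as $n \to \infty$. Then the series
\[ \sum_{n=1}^{\infty} (-1)^{n+1} a_n \quad\text{is convergent.} \]
Proof. Write $s_n$ for the $n$th partial sum of the series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$. We show this sequence
has the same type of oscillating behaviour as the corresponding sequence of partial sums
in the Fibonacci example. By definition, we have
\[ s_{2n+1} = s_{2n-1} - a_{2n} + a_{2n+1}. \]
Since $\{a_n\}$ is a decreasing sequence, $a_{2n} > a_{2n+1}$ and so $s_{2n+1} < s_{2n-1}$. Thus we have a
decreasing sequence
\[ s_1 > s_3 > s_5 > \cdots. \]
Also
\[ s_{2n} = s_{2n-2} + a_{2n-1} - a_{2n} > s_{2n-2} \quad\text{and}\quad s_{2n+1} = s_{2n} + a_{2n+1} > s_{2n}. \]
Thus
\[ s_2 < s_4 < s_6 < \cdots < s_{2n-2} < s_{2n} < s_{2n+1} < s_{2n-1} < \cdots < s_5 < s_3 < s_1. \]
The decreasing sequence $\{s_{2n+1}\}$ is bounded below, so it converges, to $\alpha$ say, and the
increasing sequence $\{s_{2n}\}$ is bounded above, so it converges, to $\beta$ say. Now
$s_{2n+1} - s_{2n} = a_{2n+1}$, and so letting $n \to \infty$
\[ \alpha - \beta = 0. \]
So $\alpha = \beta$, and all the partial sums are tending to $\alpha$, so the series converges.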
6.30. Example. Show that the series $\displaystyle \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}$ is conditionally convergent.
Solution. We have $\displaystyle \sum_{n=1}^{\infty} \left| \frac{(-1)^{n+1}}{n} \right| = \sum_{n=1}^{\infty} \frac{1}{n}$ and this is divergent by 6.19; thus the series is
not absolutely convergent. We show using 6.29 that this series is still convergent, and so is
conditionally convergent.
Write $a_n = 1/n$, so $a_n > 0$, $a_{n+1} < a_n$ and $a_n \to 0$ as $n \to \infty$. Thus all the conditions
of Leibniz's theorem are satisfied, and so the series $\displaystyle \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}$ is convergent.
6.31. Proposition (Re-arranging an Absolutely convergent Series). Let $\sum_{n=1}^{\infty} a_n$
be an absolutely convergent series and suppose that $\{b_n\}$ is a re-arrangement of $\{a_n\}$. Then
$\sum_{n=1}^{\infty} b_n$ is convergent, and
\[ \sum_{n=1}^{\infty} b_n = \sum_{n=1}^{\infty} a_n. \]
Proof. See next year, or (Spivak 1967); the point here is that we need absolute convergence
before series behave in a reasonable way.
Find how accurate the approximation obtained by just using the first ten terms
is, to $\displaystyle \sum_{n=1}^{\infty} \frac{1}{n^2}$.
Again we are going to use geometrical methods for this. Our geometric statement follows
from the diagram, and is the assertion that the area of the rectangles below the curve is less
than the area under the curve, which is less than the area of the rectangles which contain
the curve.
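Figure 6.3: An upper and lower approximation to the area under the curve $y = 1/x^2$
\[ \sum_{n=N+1}^{M} \frac{1}{n^2} \le \int_N^M \frac{dx}{x^2} \le \sum_{n=N}^{M-1} \frac{1}{n^2} \]
\[ S = \sum_{n=1}^{\infty} \frac{1}{n^2} \quad\text{and}\quad S_N = \sum_{n=1}^{N} \frac{1}{n^2}. \]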
Our conclusion is that although S10 is not a very good approximation, we can describe the
error well enough to get a much better approximation.
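To see the conclusion in numbers, let $M \to \infty$ in the inequality above: the left-hand part gives $S - S_N \le 1/N$, and the right-hand part (applied with $N + 1$ in place of $N$) gives $S - S_N \ge 1/(N+1)$. The sketch below evaluates these bounds for $N = 10$; the comparison with $\pi^2/6$ uses the value of the sum quoted in Example 6.7, and the function name `partial_sum` is illustrative.

```python
import math


def partial_sum(N):
    """S_N = sum of 1/n^2 for n = 1..N."""
    return sum(1.0 / n**2 for n in range(1, N + 1))


N = 10
S_N = partial_sum(N)
# From the integral comparison: 1/(N+1) <= S - S_N <= 1/N.
lower, upper = S_N + 1.0 / (N + 1), S_N + 1.0 / N
print(S_N, lower, upper, math.pi**2 / 6)
```

So although $S_{10}$ itself is off by almost $0.1$, the corrected interval pins $S$ down to within about $0.01$.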
Chapter 7
Power Series
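7.1. Proposition. The following series converge for all values of $x$ to the functions shown:
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots \]
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + (-1)^n \frac{x^{2n+1}}{(2n+1)!} + \cdots \]
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + \cdots \]
\[ \sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots + \frac{x^{2n+1}}{(2n+1)!} + \cdots \]
\[ \cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots + \frac{x^{2n}}{(2n)!} + \cdots \]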
These are all examples of the subject of this section; they are real power series, which
we can use to define functions. The corresponding functions are the best behaved of all the
classes of functions we meet in this course; indeed are as well behaved as could possibly
be expected. We shall see in this section that this class of functions are really just “grown
up polynomials”, and that almost any manipulation valid for polynomials remains valid for
this larger class of function.
7.2. Definition. A real power series is a series of the form $\sum a_n x^n$, where the $a_n$ are
real numbers, and $x$ is a real variable.
We are thus dealing with a whole collection of series, one for each different value of x.
Our hope is that there is some coherence; that the behaviour of series for different values
of x are related in some sensible way.
7.3. Example. The geometric series $\displaystyle \sum_{n=0}^{\infty} x^n$ is another example of a power series we have
already met. We saw this series is convergent for all $x$ with $|x| < 1$.
It turns out that a power series is usually best investigated using the ratio test,
Theorem 6.21. And the behaviour of power series is in fact very coherent.
7.4. Theorem (Radius of Convergence). Suppose $\sum a_n x^n$ is a power series. Then
one of the following happens:
• $\sum a_n x^n$ converges only when $x = 0$; or
• $\sum a_n x^n$ converges absolutely for all $x$; or
• there is some number $R > 0$ such that $\sum a_n x^n$ converges absolutely for all $x$ with
$|x| < R$, and diverges for all $x$ with $|x| > R$.
No statement is made in the third case about what happens when $|x| = R$.
7.5. Definition. The number R described above is called the radius of convergence of
the power series. By allowing R = 0 and R = ∞, we can consider every power series to
have a radius of convergence.
Thus every power series has a radius of convergence. We sometimes call the interval
(−R, R), where the power series is guaranteed to converge, the interval of convergence.
It is characterised by the fact that the series converges (absolutely) inside this interval and
diverges outside the interval.
• The word “radius” is used, because in fact the same result is true for complex series,
and then we have a genuine circle of convergence, with convergence for all (complex)
z with |z| < R, and guaranteed divergence whenever |z| > R.
• Note the power of the result; we are guaranteed that when $|x| > R$, the series diverges;
it can't even converge “accidentally” for a few $x$'s with $|x| > R$. Only on the circle
of convergence is there ambiguity.
Thus the given series diverges if |x| > 2 and converges absolutely (and so of course con-
verges) if |x| < 2. Hence it has radius of convergence 2.
7.7. Example. Find the radius of convergence of the series $\displaystyle \sum \frac{(-1)^n n!\,x^n}{n^n}$.
Solution. This one is a little more subtle than it looks, although we have met the limit
before. Again we look at the ratio of the moduli of adjacent terms.
Here we have of course used the result about $e$ given in Section 3.1 to note that
\[ \left( 1 + \frac{1}{n} \right)^n \to e \quad\text{as } n \to \infty. \]
Thus the given series diverges if |x| > e and converges absolutely (and so of course converges)
if |x| < e. Hence it has radius of convergence e.
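For this series the ratio of the moduli of adjacent coefficients simplifies to $\dfrac{(n+1)!\,n^n}{(n+1)^{n+1}\,n!} = \left( \dfrac{n}{n+1} \right)^n = \dfrac{1}{(1 + 1/n)^n}$, so the ratio of adjacent terms tends to $|x|/e$. The sketch below (the function name `coeff_ratio` is illustrative) shows the reciprocal of this coefficient ratio approaching $e$.

```python
import math


def coeff_ratio(n):
    """|a_(n+1)| / |a_n| for a_n = (-1)^n n! / n^n, simplified to (n/(n+1))^n."""
    return (n / (n + 1)) ** n


# The reciprocal of the limiting ratio is the radius of convergence.
for n in (10, 100, 10_000):
    print(n, 1 / coeff_ratio(n), math.e)
```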
7.8. Exercise. Find the radius of convergence of the series $\displaystyle \sum_{n=0}^{\infty} \frac{x^n}{n^2 + 1}$.
We noted that the theorem gives no information about what happens when $|x| = R$,
i.e. on the circle of convergence. There is a good reason for this: it is quite hard to
predict what happens. Consider the following power series, all of which have radius of
convergence 2.
\[ \sum_{n=1}^{\infty} \frac{x^n}{2^n}, \qquad \sum_{n=1}^{\infty} \frac{x^n}{n 2^n}, \qquad \sum_{n=1}^{\infty} \frac{x^n}{n^2 2^n}. \]
The first is divergent when $x = 2$ and when $x = -2$, the second converges when $x = -2$,
and diverges when $x = 2$, while the third converges both when $x = 2$ and when $x = -2$.
These results are all easy to check by direct substitution, and using Theorem 6.29.
It turns out that this is the last, and best behaved of the classes of functions we study
in this course.
In fact all of what we say below remains true when $R = \infty$, provided we interpret the
open interval $I$ as $\mathbb{R}$.
7.9. Theorem. Let $\sum a_n x^n$ be a power series with radius of convergence $R > 0$. Let $I$
be the open interval $(-R, R)$, and define $f(x) = \sum a_n x^n$ for $x \in I$. Then $\sum n a_n x^{n-1}$ has
radius of convergence $R$, $f$ is differentiable on $I$, and
\[ f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1} \quad\text{for } x \in I. \]
We summarise this result by saying that we can differentiate a power series term-by-term
everywhere inside the circle of convergence. If $R = \infty$, then this can be done for all $x$.
Proof. Quite a lot harder than it looks; we need to be able to re-arrange power series, and
then use the Mean Value Theorem to estimate differences, and show that even when we
add an infinite number of errors, they don’t add up to too much. It can be found e.g. in
(Spivak 1967).
7.10. Corollary. Let $f$ and $I$ be defined as in 7.9. Then $f$ has an indefinite integral defined
on $I$, given by
\[ G(x) = \sum_{n=0}^{\infty} \frac{a_n}{n + 1}\,x^{n+1} \quad\text{for } x \in I. \]
Proof. Apply 7.9 to $G$ to see that $G'(x) = f(x)$, which is the required result.
Note: It is easy to get this result directly from the Taylor Series. The next one is not
quite so easy.
We return to equation 7.1, and replace $x$ by $-x^2$ to get
\[ \frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots + (-1)^n x^{2n} + \cdots \quad\text{for } |x| < 1. \]
Again integrating both sides, we have
\[ \arctan(x) = K + x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots + (-1)^n \frac{x^{2n+1}}{2n + 1} + \cdots \quad\text{for } |x| < 1, \]
where $K$ is a constant of integration. Again putting $x = 0$ shows that $K = 0$, and so
\[ \arctan(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{2n + 1} \quad\text{valid for } |x| < 1. \tag{7.3} \]
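Equation (7.3) can be compared with a library value of the arctangent. The following is a minimal sketch; the function name `arctan_series`, the number of terms and the sample points are all illustrative choices.

```python
import math


def arctan_series(x, terms):
    """Partial sum of (7.3): sum of (-1)^n x^(2n+1) / (2n+1) for n < terms."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(terms))


for x in (0.1, 0.5, 0.9):
    print(x, arctan_series(x, 50), math.atan(x))
```

Convergence is fast for small $|x|$ and noticeably slower as $|x|$ approaches the radius of convergence 1.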
Thus the new series has radius of convergence 1. Denote its sum by $f(x)$, defined for $|x| < 1$.
Inside the circle of convergence, it is permissible to differentiate term-by-term, and thus
$f'(x) = \log(1 + x)$ for $|x| < 1$, since they have the same power series. Hence
\begin{align}
f(x) = \int \log(1 + x)\,dx &= K + x\log(1 + x) - \int \frac{x + 1 - 1}{1 + x}\,dx \tag{7.4} \\
&= K + x\log(1 + x) - x + \log(1 + x). \tag{7.5}
\end{align}
Putting $x = 0$ shows that $K = 0$ and so $f(x) = (1 + x)\log(1 + x) - x$.
We have now been able to derive a power series representation for a function without
working directly from the Taylor series, and doing the differentiations — which can often
prove very awkward. Nevertheless, we have still found the Taylor coefficients.
7.12. Proposition. Let $\sum a_n x^n$ be a power series, with radius of convergence $R > 0$, and
define
\[ f(x) = \sum_{n=0}^{\infty} a_n x^n \quad\text{for } |x| < R. \]
Then $a_n = \dfrac{f^{(n)}(0)}{n!}$, so the given series is the Taylor (or Maclaurin) series for $f$.
Proof. We can differentiate $n$ times by 7.9 and we still get a series with the same radius
of convergence. Also, calculating exactly as in the start of Section 5.6, we see that the
derivatives satisfy $f^{(n)}(0) = n!\,a_n$, giving the uniqueness result.
7.13. Example. Let $f(x) = \dfrac{1}{1 - x^3}$. Calculate $f^{(n)}(0)$.
Solution. We use the binomial theorem to get a power series expansion about 0,
\[ \frac{1}{1 - x^3} = 1 + x^3 + x^6 + x^9 + \cdots + x^{3n} + \cdots \quad\text{valid for } |x| < 1. \]
We now read off the various derivatives. Clearly $f^{(n)}(0) = 0$ unless $n$ is a multiple of 3,
while $f^{(3k)}(0) = (3k)!$ by 7.12.
7.5 Applications*
This section will not be formally examined, but is here to show how we can get more
interesting results from power series.
Given $\alpha$, we can always find some $N$ such that $N - \alpha \ge 1$. Next note that provided $x > 0$,
each term in the series for $x^{-\alpha} e^x$ is positive, and hence the sum is greater than any given
term. So in particular,
\[ x^{-\alpha} e^x > \frac{x^{N - \alpha}}{N!} \ge \frac{x}{N!} \quad\text{if } x > 1. \]
In particular, since $N$ is fixed, $\dfrac{x}{N!} \to \infty$ as $x \to \infty$, giving the result claimed.
7.5.2 The function log x grows more slowly than any power of x
Specifically, we claim that for any $\beta > 0$, $\lim_{x \to \infty} x^{-\beta} \log x = 0$.
We are interested in what happens when $x \to \infty$, so we will restrict to the situation
when $x > 0$. Put $y = \beta \log x$, noting this is possible since $x > 0$. Thus $y/\beta = \log x$, or
equivalently, $x = e^{y/\beta}$. Thus we have $x^\beta = e^y$, and so
\[ x^{-\beta} \log x = \frac{y}{\beta}\,e^{-y} \quad\text{when } y = \beta \log x. \tag{7.6} \]
It turns out that this integral cannot be evaluated using the usual tricks — substitution,
integration by parts etc. But a power series representation and Corollary 7.10 can help.
Thus
\[ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}, \]
\[ e^{-x^2} = \sum_{n=0}^{\infty} \frac{(-x^2)^n}{n!} = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{n!}. \]
The partial sums of the power series on the right can be computed, and converge quite
quickly, so we have a practical method of evaluating the integral, even though we can't
“do” the integral.
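As an illustration, integrating the series for $e^{-t^2}$ term by term (Corollary 7.10) gives $\sum_{n=0}^{\infty} (-1)^n \dfrac{x^{2n+1}}{(2n+1)\,n!}$. The sketch below assumes the integral in question is $\int_0^x e^{-t^2}\,dt$, which is not stated explicitly in this passage, and compares the partial sums with the library error function via the standard identity $\int_0^x e^{-t^2}\,dt = \tfrac{\sqrt{\pi}}{2}\,\mathrm{erf}(x)$. The function name `integral_series` is illustrative.

```python
import math


def integral_series(x, terms=30):
    """Partial sum of the term-by-term integral of exp(-t^2) from 0 to x."""
    return sum(
        (-1) ** n * x ** (2 * n + 1) / ((2 * n + 1) * math.factorial(n))
        for n in range(terms)
    )


x = 1.0
print(integral_series(x), math.sqrt(math.pi) / 2 * math.erf(x))
```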
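Thus, we have
\begin{align*}
b(a - 1)! &= a! - \frac{a!}{1!} + \frac{a!}{2!} - \frac{a!}{3!} + \cdots \\
&= a! - \frac{a!}{1!} + \frac{a!}{2!} - \frac{a!}{3!} + \cdots + \frac{(-1)^a a!}{a!} \\
&\quad + (-1)^{a+1} \left( \frac{1}{a + 1} - \frac{1}{(a + 1)(a + 2)} + \frac{1}{(a + 1)(a + 2)(a + 3)} - \cdots \right).
\end{align*}
The left hand side is an integer, as is each of the terms in the sum
\[ a! - \frac{a!}{1!} + \frac{a!}{2!} - \frac{a!}{3!} + \cdots + \frac{(-1)^a a!}{a!}; \]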
Differentiation of Functions of
Several Variables
We conclude with two chapters which are really left over from last year’s calculus course,
and which should help to remind you of the techniques you met then. We shall mainly be
concerned with differentiation and integration of functions of more than one variable. We
describe
In this chapter we concentrate on differentiation, and in the last one, move on to integration.
why we started with it last year. And just as last year, we shall usually have a “standard”
function name; instead of y = f (x), we often work with z = f (x, y), since most of the extra
complications occur when we have two, rather than one (independent) variable, and we
don’t need to consider more general cases like w = f (x, y, z), or even y = f (x1 , x2 , . . . , xn ).
[Figure: sketch on x, y, z axes with the points (4, 0, 0), (0, 3, 0) and (0, 0, 2) marked.]
difficulties when dealing with more complicated functions, which make sketching and visu-
alisation rather harder than for functions of one variable. And if there are three or more
independent variables, there is really no good way of visualising the behaviour of the func-
tion directly. But for just two independent variables, there are some tricks.
Solution. We can represent the surface directly by drawing it as shown in Fig. 8.3.
[Figure 8.3: plot of the surface.]
Such a representation is easy to create using suitable software and Fig. 8.3 shows the
resulting surface. We now describe how to look at similar examples without such a
program. One approach is to draw a contour map of the surface, and then use the usual
tricks to visualise the surface.
For the surface $z = x^2 - y^2$, the points where $z = 0$ lie on $x^2 = y^2$, so form the lines
$y = x$ and $y = -x$. We can continue in this way, and look at the points where $z = 1$; so
$x^2 - y^2 = 1$. This is one of the hyperbolae shown in Fig. 8.4; indeed, fixing $z$ at different values
shows the contours (lines of constant height or $z$ value) are all the same shape, but with
different constants. We thus get the alternative representation as a contour map shown in
Fig. 8.4.
A final way to confirm that you have the right view of the surface is to section it in
different planes. So far we have looked at the intersection of the planes $z = k$ with the
surface $z = x^2 - y^2$ for different values of the constant $k$. If instead we fix $x$, at the value $a$,
then $z = a^2 - y^2$. Each of these curves is a parabola with its vertex upwards, at the point
$y = 0$, $z = a^2$.
8.3. Exercise. By looking at the curves where $z$ is constant, or otherwise, sketch the surface
given by $z = \sqrt{x^2 + y^2}$.
Figure 8.4: Contour plot of the surface $z = x^2 - y^2$. The missing points near the $x$-axis
are an artifact of the plotting program.
Continuity
As you might expect, we say that a function $f$ of two variables is continuous at $(x_0, y_0)$ if
\[ \lim_{x \to x_0,\, y \to y_0} f(x, y) = f(x_0, y_0). \]
The only complication comes when we realise that there are many different ways in which
$x \to x_0$ and $y \to y_0$. We illustrate with a simple example.
8.4. Example. Investigate the continuity of $f(x, y) = \dfrac{2xy}{x^2 + y^2}$ at the point $(0, 0)$.
Solution. Consider first the case when $x \to 0$ along the $x$-axis, so that throughout the
process, $y = 0$. We have
\[ f(x, 0) = \frac{2x \cdot 0}{x^2 + 0} = 0 \to 0 \quad\text{as } x \to 0. \]
Next consider the case when $x \to 0$ and $y \to 0$ on the line $y = x$, so we are looking at the
special case when $x = y$. We have
\[ f(x, x) = \frac{2x^2}{x^2 + x^2} = 1 \to 1 \quad\text{as } x \to 0. \]
Of course $f$ is only continuous if it has the same limit however $x \to 0$ and $y \to 0$, and we
have now seen that it doesn't; so $f$ is not continuous at $(0, 0)$.
Although we won’t go into it, the usual “putting together” theorems show that f is
continuous everywhere else.
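The dependence on the path of approach is easy to see numerically. The following minimal sketch evaluates $f$ along the $x$-axis, along $y = x$, and (as an additional illustration not used in the text) along $y = 2x$, where $f(x, 2x) = 4x^2/(5x^2) = 4/5$.

```python
def f(x, y):
    """The function of Example 8.4, defined away from the origin."""
    return 2 * x * y / (x * x + y * y)


# Approach (0, 0) along three different lines.
for t in (0.1, 0.01, 0.001):
    print(t, f(t, 0.0), f(t, t), f(t, 2 * t))
```

Each column is constant, but the three constants differ, so no single limit exists at the origin.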
8.2. PARTIAL DIFFERENTIATION 81
8.6. Exercise. Let $z = \log(x/y)$. Show that $x\dfrac{\partial z}{\partial x} + y\dfrac{\partial z}{\partial y} = 0$. The fact that the last two
functions satisfy the same differential equation is not a coincidence. With our next result,
we can see that for any suitably differentiable function $f$, the function $z(x, y) = f(x/y)$
satisfies this partial differential equation.
8.7. Exercise. Let $z = f(x/y)$, where $f$ is suitably differentiable. Show that $x\dfrac{\partial z}{\partial x} + y\dfrac{\partial z}{\partial y} = 0$.
Because the definitions are really just versions of the 1-variable result, these examples
are quite typical; most of the usual rules for differentiation apply in the obvious way to
partial derivatives exactly as you would expect. But there are variants. Here is how we
differentiate compositions.
8.8. Theorem. Assume that $f$ and all its partial derivatives $f_x$ and $f_y$ are continuous,
and that $x = x(t)$ and $y = y(t)$ are themselves differentiable functions of $t$. Let
\[ F(t) = f(x(t), y(t)). \]
Then $F$ is differentiable and
\[ \frac{dF}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}. \]
Proof. Write $x = x(t)$, $x_0 = x(t_0)$ etc. Then we calculate the Newton quotient for $F$.
\begin{align*}
F(t) - F(t_0) &= f(x, y) - f(x_0, y_0) \\
&= f(x, y) - f(x_0, y) + f(x_0, y) - f(x_0, y_0) \\
&= \frac{\partial f}{\partial x}(\xi, y)(x - x_0) + \frac{\partial f}{\partial y}(x_0, \eta)(y - y_0).
\end{align*}
Here we have used the Mean Value Theorem (5.18) to write
\[ f(x, y) - f(x_0, y) = \frac{\partial f}{\partial x}(\xi, y)(x - x_0) \]
for some point $\xi$ between $x$ and $x_0$, and have argued similarly for the other part. Note that
$\xi$, pronounced “Xi” is the Greek letter “x”; in the same way $\eta$, pronounced “Eta” is the
Greek letter “y”. Thus
\[ \frac{F(t) - F(t_0)}{t - t_0} = \frac{\partial f}{\partial x}(\xi, y)\,\frac{x - x_0}{t - t_0} + \frac{\partial f}{\partial y}(x_0, \eta)\,\frac{y - y_0}{t - t_0}. \]
Now let $t \to t_0$, and note that in this case, $x \to x_0$ and $y \to y_0$; and since $\xi$ and $\eta$ are
trapped between $x$ and $x_0$, and $y$ and $y_0$ respectively, then also $\xi \to x_0$ and $\eta \to y_0$. The
result then follows from the continuity of the partial derivatives.
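8.9. Example. Let $f(x, y) = xy$, and let $x = \cos t$, $y = \sin t$. Compute $\dfrac{df}{dt}$ when $t = \pi/2$.
Solution. From the chain rule,
\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = -y(t)\sin t + x(t)\cos t, \]
which at $t = \pi/2$ gives $-1 \cdot \sin(\pi/2) = -1$.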
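The chain rule answer can be compared with a direct numerical derivative of $F(t) = f(\cos t, \sin t)$. This is a sketch only; the central difference and the step size $h = 10^{-6}$ are illustrative choices, and the names `F` and `chain_rule` are not from the notes.

```python
import math


def F(t):
    """F(t) = f(x(t), y(t)) with f(x, y) = xy, x = cos t, y = sin t."""
    x, y = math.cos(t), math.sin(t)
    return x * y


def chain_rule(t):
    """(df/dx)(dx/dt) + (df/dy)(dy/dt) with df/dx = y and df/dy = x."""
    x, y = math.cos(t), math.sin(t)
    return y * (-math.sin(t)) + x * math.cos(t)


t0, h = math.pi / 2, 1e-6
numeric = (F(t0 + h) - F(t0 - h)) / (2 * h)   # central difference
print(chain_rule(t0), numeric)
```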
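The chain rule easily extends to the several variable case; only the notation is complicated.
We state a typical example.
8.10. Proposition. Let $x = x(u, v)$, $y = y(u, v)$ and $z = z(u, v)$, and let $f$ be a function
defined on a subset $U \subseteq \mathbb{R}^3$, and suppose that all the partial derivatives of $f$ are continuous.
Write
\[ F(u, v) = f(x(u, v), y(u, v), z(u, v)). \]
Then
\[ \frac{\partial F}{\partial u} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial u} \quad\text{and}\quad \frac{\partial F}{\partial v} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial v}. \]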
The introduction of the domain of f above, simply to show it was a function of three
variables is clumsy. We often do it more quickly by saying
Let f (x, y, z) have continuous partial derivatives
This has the advantage that you are reminded of the names of the variables on which f acts,
although strictly speaking, these names are not bound to the corresponding places. This is
an example where we adopt the notation which is very common in engineering maths. But
note the confusion if you ever want to talk about the value f (y, z, x), perhaps to define a
new function g(x, y, z).
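8.11. Example. Assume that $f(u, v, w)$ has continuous partial derivatives, and that
\[ u = x - y, \qquad v = y - z, \qquad w = z - x. \]
Let
\[ F(x, y, z) = f(u(x, y, z), v(x, y, z), w(x, y, z)). \]
Show that
\[ \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y} + \frac{\partial F}{\partial z} = 0. \]
Solution. We apply the chain rule, noting first that from the change of variable formulae,
we have
\[
\begin{array}{lll}
\dfrac{\partial u}{\partial x} = 1, & \dfrac{\partial v}{\partial x} = 0, & \dfrac{\partial w}{\partial x} = -1, \\[2ex]
\dfrac{\partial u}{\partial y} = -1, & \dfrac{\partial v}{\partial y} = 1, & \dfrac{\partial w}{\partial y} = 0, \\[2ex]
\dfrac{\partial u}{\partial z} = 0, & \dfrac{\partial v}{\partial z} = -1, & \dfrac{\partial w}{\partial z} = 1.
\end{array}
\]
Then
\begin{align*}
\frac{\partial F}{\partial x} &= \frac{\partial f}{\partial u} \cdot 1 + \frac{\partial f}{\partial v} \cdot 0 + \frac{\partial f}{\partial w} \cdot (-1) = \frac{\partial f}{\partial u} - \frac{\partial f}{\partial w}, \\
\frac{\partial F}{\partial y} &= \frac{\partial f}{\partial u} \cdot (-1) + \frac{\partial f}{\partial v} \cdot 1 + \frac{\partial f}{\partial w} \cdot 0 = -\frac{\partial f}{\partial u} + \frac{\partial f}{\partial v}, \\
\frac{\partial F}{\partial z} &= \frac{\partial f}{\partial u} \cdot 0 + \frac{\partial f}{\partial v} \cdot (-1) + \frac{\partial f}{\partial w} \cdot 1 = -\frac{\partial f}{\partial v} + \frac{\partial f}{\partial w}.
\end{align*}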
\[ \frac{\partial^2 f}{\partial t^2} = c^2\,\frac{\partial^2 f}{\partial x^2}. \]
Figure 8.5: A string displaced from the equilibrium position
\[ \frac{\partial^2 F}{\partial u\,\partial v} = 0. \]
Solution. Such a function is easy to integrate, because the two variables appear independently.
So $\dfrac{\partial F}{\partial v} = g_1(v)$, where $g_1$ is an arbitrary (differentiable) function, since when
differentiated with respect to $u$ we are given that $\dfrac{\partial^2 F}{\partial u\,\partial v} = 0$. Thus we can integrate with
respect to $v$ to get
\[ F(u, v) = \int g_1(v)\,dv + h(u) = g(v) + h(u), \]
8.15. Example. Rewrite the wave equation using co-ordinates u = x − ct and v = x + ct.
Solution. Write f(x, t) = F(u, v) and now in principle confuse F with f, so we can tell
them apart only by the names of their arguments. In practice we use different symbols to
help the learning process; but note that in a practical case, all the F's that appear below
would normally be written as f's. By the chain rule
\[
\frac{\partial}{\partial x} = \frac{\partial}{\partial u}\cdot 1 + \frac{\partial}{\partial v}\cdot 1
\quad\text{and}\quad
\frac{\partial}{\partial t} = \frac{\partial}{\partial u}\cdot(-c) + \frac{\partial}{\partial v}\cdot c.
\]
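Differentiating again, and using the operator form of the chain rule as well,
\[
\begin{aligned}
\frac{\partial^2 f}{\partial t^2}
&= \left(c\frac{\partial}{\partial v} - c\frac{\partial}{\partial u}\right)\left(c\frac{\partial F}{\partial v} - c\frac{\partial F}{\partial u}\right) \\
&= c^2\frac{\partial^2 F}{\partial v^2} - c^2\frac{\partial^2 F}{\partial u\,\partial v} - c^2\frac{\partial^2 F}{\partial v\,\partial u} + c^2\frac{\partial^2 F}{\partial u^2} \\
&= c^2\left(\frac{\partial^2 F}{\partial v^2} + \frac{\partial^2 F}{\partial u^2}\right) - 2c^2\frac{\partial^2 F}{\partial u\,\partial v}
\end{aligned}
\]
and similarly
\[ \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 F}{\partial v^2} + \frac{\partial^2 F}{\partial u^2} + 2\frac{\partial^2 F}{\partial u\,\partial v}. \]
Substituting these into the wave equation $\frac{\partial^2 f}{\partial t^2} = c^2 \frac{\partial^2 f}{\partial x^2}$ and simplifying, the wave equation becomes
\[ 4c^2 \frac{\partial^2 F}{\partial u\,\partial v} = 0, \]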
an equation which we have already solved. Thus solutions to the wave equation are of the
form h(u) + g(v) = h(x − ct) + g(x + ct) for any (suitably differentiable) functions g and h.
For example we may have sin(x − ct). Note that this is not just any function of x and t; it
must be constant along each line x − ct = constant.
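As a numerical illustration of this conclusion, the sketch below checks that a function of the form h(x − ct) + g(x + ct) satisfies the wave equation at a sample point. The particular choices h = sin and g(s) = s³, the wave speed `C` and the helper `second_partial` (a central second difference) are assumptions made for the example only.

```python
import math

C = 2.0  # wave speed c; an arbitrary choice for this check

def f(x, t):
    # h(x - ct) + g(x + ct) with the hypothetical choices h = sin and g(s) = s**3.
    return math.sin(x - C * t) + (x + C * t) ** 3

def second_partial(func, x, t, wrt, h=1e-3):
    # Central second difference in x (wrt="x") or in t (any other value of wrt).
    if wrt == "x":
        return (func(x + h, t) - 2 * func(x, t) + func(x - h, t)) / h ** 2
    return (func(x, t + h) - 2 * func(x, t) + func(x, t - h)) / h ** 2

x0, t0 = 0.4, 0.3  # arbitrary sample point
residual = second_partial(f, x0, t0, "t") - C ** 2 * second_partial(f, x0, t0, "x")
print(abs(residual) < 1e-5)  # f_tt - c^2 f_xx vanishes up to discretisation error
```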
8.16. Exercise. Let F(x, t) = log(2x + 2ct) for x > −ct, where c is a fixed constant. Show
that
\[ \frac{\partial^2 F}{\partial t^2} - c^2 \frac{\partial^2 F}{\partial x^2} = 0. \]
Note that this is simply checking a particular case of the result we have just proved.
8.17. Definition. Say that f(x, y) has a critical point at (a, b) if and only if
\[ \frac{\partial f}{\partial x}(a, b) = \frac{\partial f}{\partial y}(a, b) = 0. \]
It is clear by comparison with the single variable result, that a necessary condition that
f have a local extremum at (a, b) is that it have a critical point there, although that is not
a sufficient condition. We refer to this as the first derivative test.
8.5. MAXIMA AND MINIMA 87
We can get more information by looking at the second derivative. Recall that we gave
a number of different notations for partial derivatives, and in what follows we use $f_x$ rather
than the more cumbersome $\frac{\partial f}{\partial x}$ etc. This idea extends to higher derivatives; we shall use
$f_{xx}$ instead of $\frac{\partial^2 f}{\partial x^2}$, and $f_{xy}$ instead of $\frac{\partial^2 f}{\partial x\,\partial y}$ etc.
8.18. Theorem (Second Derivative Test). Assume that (a, b) is a critical point for f.
Then
• If, at (a, b), we have $f_{xx} < 0$ and $f_{xx} f_{yy} - f_{xy}^2 > 0$, then f has a local maximum at (a, b).
• If, at (a, b), we have $f_{xx} > 0$ and $f_{xx} f_{yy} - f_{xy}^2 > 0$, then f has a local minimum at (a, b).
• If, at (a, b), we have $f_{xx} f_{yy} - f_{xy}^2 < 0$, then f has a saddle point at (a, b).
The test is inconclusive at (a, b) if $f_{xx} f_{yy} - f_{xy}^2 = 0$, and the investigation has to be
continued some other way.
Note that the discriminant is easily remembered as
\[ \Delta = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix} = f_{xx} f_{yy} - f_{xy}^2. \]
A number of very simple examples can help to remember this. After all, the result of the
test should work on things where we can do the calculation anyway!
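The test is mechanical enough to be written as a short function. The following is a minimal sketch, assuming the three second partial derivatives have already been evaluated at a critical point; the name `classify_critical_point` and the returned strings are our own choices.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply the second derivative test to second partials evaluated at a critical point."""
    delta = fxx * fyy - fxy ** 2  # the discriminant
    if delta > 0:
        # delta > 0 forces fxx to be non-zero, so its sign decides the case.
        return "local maximum" if fxx < 0 else "local minimum"
    if delta < 0:
        return "saddle point"
    return "inconclusive"

print(classify_critical_point(2, 2, 0))  # second partials of x^2 + y^2 at (0, 0)
print(classify_critical_point(0, 0, 1))  # second partials of xy at (0, 0)
```

The two calls print "local minimum" and "saddle point" respectively, matching what we know directly about these two functions.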
8.19. Example. Show that $f(x, y) = x^2 + y^2$ has a minimum at (0, 0).
Of course we know it has a global minimum there, but here goes with the test:
Solution. We have $f_x = 2x$; $f_y = 2y$, so $f_x = f_y = 0$ precisely when x = y = 0, and this is the
only critical point. We have $f_{xx} = f_{yy} = 2$; $f_{xy} = 0$, so $\Delta = f_{xx} f_{yy} - f_{xy}^2 = 4 > 0$, and
since $f_{xx} > 0$ there is a local minimum at (0, 0).
8.20. Exercise. Let f(x, y) = xy. Show that there is a unique critical point, which is a saddle
point.
Proof. We give an indication of how the theorem can be derived, or if necessary how it
can be remembered. We start with the two dimensional version of Taylor's theorem, see
section 5.6. We have
\[
f(a + h, b + k) \sim f(a, b) + h\frac{\partial f}{\partial x}(a, b) + k\frac{\partial f}{\partial y}(a, b)
+ \frac{1}{2}\left(h^2\frac{\partial^2 f}{\partial x^2} + 2kh\frac{\partial^2 f}{\partial x\,\partial y} + k^2\frac{\partial^2 f}{\partial y^2}\right)
\]
where we have actually taken an expansion to second order and assumed the corresponding
remainder is small.
We are looking at a critical point, so for any pair (h, k), we have
$h\frac{\partial f}{\partial x}(a, b) + k\frac{\partial f}{\partial y}(a, b) = 0$ and everything hinges on the behaviour of the second order terms. It is thus enough to
study the behaviour of the quadratic $Ah^2 + 2Bhk + Ck^2$, where we have written
\[ A = \frac{\partial^2 f}{\partial x^2}, \qquad B = \frac{\partial^2 f}{\partial x\,\partial y}, \qquad\text{and}\qquad C = \frac{\partial^2 f}{\partial y^2}. \]
Assuming $A \neq 0$, we complete the square in h to get
\[ Ah^2 + 2Bhk + Ck^2 = A\left(h + \frac{B}{A}k\right)^2 + \frac{\Delta}{A}k^2, \]
where we write $\Delta = CA - B^2$ for the discriminant. We have thus expressed the quadratic
as the sum of two squares. It is thus clear that
• if A > 0 and ∆ > 0 we have a local minimum;
• if A < 0 and ∆ > 0 we have a local maximum;
• if ∆ < 0 then the coefficients of the two squared terms have opposite signs, so by
going out in two different directions, the quadratic may be made either to increase or
to decrease.
Note also that we could have completed the square in the same way, but starting from the
k term, rather than the h term; so the result could just as easily be stated in terms of C
instead of A.
8.21. Example. Let $f(x, y) = 2x^3 - 6x^2 - 3y^2 - 6xy$. Find and classify the critical points of
f. By considering f(x, 0), or otherwise, show that f does not achieve a global maximum.
Solution. We have $f_x = 6x^2 - 12x - 6y$ and $f_y = -6y - 6x$. Thus critical points occur when
y = −x and $x^2 - x = 0$, and so at (0, 0) and (1, −1). Differentiating again, $f_{xx} = 12x - 12$,
$f_{yy} = -6$ and $f_{xy} = -6$. Thus the discriminant is $\Delta = -6(12x - 12) - 36$. When x = 0,
∆ = 36 > 0 and since $f_{xx} = -12$, we have a local maximum at (0, 0). When x = 1,
∆ = −36 < 0, so there is a saddle at (1, −1).
To see there is no global maximum, note that $f(x, 0) = 2x^3(1 - 3/x) \to \infty$ as $x \to \infty$,
since $x^3 \to \infty$ as $x \to \infty$.
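The arithmetic in Example 8.21 is easy to confirm by machine. The sketch below simply evaluates the first partial derivatives and the discriminant, as computed by hand above, at the two critical points; the function names `grad` and `discriminant` are our own.

```python
def grad(x, y):
    # First partial derivatives of f(x, y) = 2x^3 - 6x^2 - 3y^2 - 6xy.
    return (6 * x ** 2 - 12 * x - 6 * y, -6 * y - 6 * x)

def discriminant(x, y):
    # Second partial derivatives of the same f, combined as f_xx f_yy - f_xy^2.
    fxx, fyy, fxy = 12 * x - 12, -6, -6
    return fxx * fyy - fxy ** 2

# Map each critical point to (gradient, discriminant).
results = {p: (grad(*p), discriminant(*p)) for p in [(0, 0), (1, -1)]}
print(results)
```

Both gradients come out as (0, 0), with discriminant 36 at (0, 0) and −36 at (1, −1), in agreement with the classification above.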
Substituting, we have
\[ S = 2(x + y)\bigl(30 - 2(x + y)\bigr) + xy = 60(x + y) - 4(x + y)^2 + xy, \]
Figure 8.6: A dimensioned box
\[ f(x, y) = 2 + 2x + 2y - x^2 - y^2 \]
on the triangular plate in the first quadrant bounded by the lines x = 0, y = 0 and y = 9 − x.
Solution. We know there is a global maximum, because the function is continuous on a
closed bounded subset of $\mathbb{R}^2$. Thus the absolute maximum will occur either in the interior, at
a critical point, or on the boundary. If y = 0, investigate $f(x, 0) = 2 + 2x - x^2$, while if
x = 0, investigate $f(0, y) = 2 + 2y - y^2$. If y = 9 − x, investigate
\[ f(x, 9 - x) = 2 + 2x + 2(9 - x) - x^2 - (9 - x)^2 = -61 + 18x - 2x^2 \]
for an absolute maximum. In fact extrema on the boundary may occur when (x, y) = (0, 1) or (1, 0) or (0, 0)
or (9, 0) or (0, 9), or (9/2, 9/2). At these points, f takes the values 3, 3, 2, −61, −61 and −41/2 respectively.
Next we seek critical points in the interior of the plate, where
\[ f_x = 2 - 2x = 0 \quad\text{and}\quad f_y = 2 - 2y = 0, \]
so (x, y) = (1, 1) and f(1, 1) = 4, so this must be the global maximum. We can also check,
using the second derivative test, that it is a local maximum.
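The comparison of candidate points can be tabulated directly. This sketch evaluates f at the boundary candidates and the interior critical point listed in the solution, using exact rational arithmetic; the variable names are our own.

```python
from fractions import Fraction

def f(x, y):
    return 2 + 2 * x + 2 * y - x ** 2 - y ** 2

half = Fraction(9, 2)
# Boundary candidates from the solution, followed by the interior critical point (1, 1).
candidates = [(0, 1), (1, 0), (0, 0), (9, 0), (0, 9), (half, half), (1, 1)]
values = {p: f(*p) for p in candidates}

best = max(values, key=values.get)   # candidate with the largest value of f
worst = min(values.values())         # smallest value of f among the candidates
print(best, values[best], worst)
```

The largest value is 4 at (1, 1), and among these candidates the smallest is −61, attained at the corners (9, 0) and (0, 9).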
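8.25. Theorem. The tangent plane to the surface F(x, y, z) = c at the point $(x_0, y_0, z_0)$ is given
by
\[ \frac{\partial F}{\partial x}(x - x_0) + \frac{\partial F}{\partial y}(y - y_0) + \frac{\partial F}{\partial z}(z - z_0) = 0, \]
where the partial derivatives are evaluated at $(x_0, y_0, z_0)$.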
Proof. This is a simple example of the use of vector geometry. Given that $(x_0, y_0, z_0)$ lies
on the surface, and so in the tangent plane, then for any other point (x, y, z) in the tangent plane,
the vector $(x - x_0, y - y_0, z - z_0)$ must lie in the tangent plane, and so must be normal to the
normal to the surface (i.e. to ∇F). Thus $(x - x_0, y - y_0, z - z_0)$ and ∇F are perpendicular,
and that requirement is the equation which gives the tangent plane.
8.26. Example. Find the equation of the tangent plane to the surface
\[ F(x, y, z) = x^2 + y^2 + z - 9 = 0 \]
at the point (1, 2, 4).
Solution. We have $\nabla F|_{(1,2,4)} = (2, 4, 1)$, and the equation of the tangent plane is
\[ 2(x - 1) + 4(y - 2) + (z - 4) = 0. \]
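The recipe of Theorem 8.25 can be sketched in a few lines. The gradient is entered by hand for this particular F, and the helper `tangent_plane` (our own name) returns the plane in the form ax + by + cz = d.

```python
def F(x, y, z):
    return x ** 2 + y ** 2 + z - 9

def grad_F(x, y, z):
    # Gradient of F, computed by hand: (2x, 2y, 1).
    return (2 * x, 2 * y, 1)

def tangent_plane(p):
    # Coefficients (a, b, c, d) of the plane a*x + b*y + c*z = d through p with normal grad F(p).
    n = grad_F(*p)
    d = sum(ni * pi for ni, pi in zip(n, p))
    return (*n, d)

plane = tangent_plane((1, 2, 4))
print(F(1, 2, 4), plane)
```

The point lies on the surface, since F(1, 2, 4) = 0, and the plane is 2x + 4y + z = 14, which is the equation found in Example 8.26 after expanding the brackets.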
8.27. Exercise. Show that the tangent plane to the surface $z = 3xy - x^3 - y^3$ is horizontal
only at (0, 0, 0) and (1, 1, 1).
8.29. Example. A cone is measured. The radius has a measurement error of 3%, and the
height an error of 2%. What is the error in measuring the volume?
Solution. The volume V of a cone is given by $V = \pi r^2 h/3$, where r is the radius of the
cone, and h is the height. Taking dr = 0.03r and dh = 0.02h, we have
\[
dV = \frac{2}{3}\pi r h\,dr + \frac{1}{3}\pi r^2\,dh
= 2(0.03)\frac{\pi r^2 h}{3} + \frac{\pi r^2 h}{3}(0.02) = V(0.06 + 0.02).
\]
Thus there is approximately an 8% error in measuring the volume.
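The figure obtained from the differential is a first order estimate, and it is instructive to compare it with the exact relative change. The sketch below assumes both measurements err on the high side; the base values of r and h are arbitrary, since the relative error does not depend on them.

```python
import math

def cone_volume(r, h):
    return math.pi * r ** 2 * h / 3

r, h = 1.0, 1.0  # arbitrary base measurements

# First order estimate from the differential: dV/V = 2 dr/r + dh/h.
linear_estimate = 2 * 0.03 + 0.02

# Exact relative change when r is 3% too large and h is 2% too large.
exact = cone_volume(1.03 * r, 1.02 * h) / cone_volume(r, h) - 1

print(linear_estimate, exact)
```

The exact relative change is (1.03)²(1.02) − 1, about 8.2%, so the linear estimate is accurate to within about a quarter of a percentage point here.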
8.30. Exercise. The volume of a cylindrical oil tank is to be calculated from measured
values of r and h. What is the percentage error in the volume, if r is measured with an
accuracy of 2%, and h is measured with an accuracy of 0.5%?
Multiple Integrals
where the first sum is thought of as a limiting case, adding up the areas of a number of
rectangles each of height $f(x_i)$, and width $dx_i$. This leads to the natural generalisation to
several variables: we think of the function z = f(x, y) as representing the height of f at the
point (x, y) in the plane, and interpret the integral as the sum of the volumes of a number
of small boxes of height z = f(x, y) and base area $dx_i\,dy_j$. Thus the volume of the solid of height
z = f(x, y) lying above a certain region R in the plane leads to integrals of the form
\[ \iint_R f(x, y)\,dx\,dy = \lim \sum_{i=1}^{n}\sum_{j=1}^{m} f(x_i, y_j)\,dx_i\,dy_j = \lim S_{mn}. \]
We write such a double integral as $\iint_R f(x, y)\,dA$.
Our result that the order in which we add up the volume of the small boxes doesn’t matter
is the following, which also formally shows that we evaluate a double integral as any of the
possible repeated integrals.
9.1. Theorem (Fubini's theorem for Rectangles). Let f(x, y) be continuous on the
rectangular region R : a ≤ x ≤ b; c ≤ y ≤ d. Then
\[
\iint_R f(x, y)\,dA = \int_c^d \left(\int_a^b f(x, y)\,dx\right) dy = \int_a^b \left(\int_c^d f(x, y)\,dy\right) dx.
\]
Note that this is something like an inverse of partial differentiation. In doing the
first inner (or repeated) integral, we keep y constant, and integrate with respect to x.
Then we integrate with respect to y. Of course if f is a particularly simple function, say
f(x, y) = g(x)h(y), then it doesn't matter in which order we do the integration, since
\[ \iint_R f(x, y)\,dA = \left(\int_a^b g(x)\,dx\right)\left(\int_c^d h(y)\,dy\right). \]
We use the Fubini theorem to actually evaluate integrals, since we have no direct way
of calculating a double (as opposed to a repeated) integral.
9.2. Example. Integrate z = 4 − x − y over the region 0 ≤ x ≤ 2 and 0 ≤ y ≤ 1. Hence
calculate the volume under the plane z = 4 − x − y above the given region.
Solution. We calculate the integral as a repeated integral, using Fubini's theorem.
\[
V = \int_{x=0}^{2}\int_{y=0}^{1} (4 - x - y)\,dy\,dx
= \int_{x=0}^{2} \Bigl[4y - xy - y^2/2\Bigr]_{y=0}^{1}\,dx
= \int_{x=0}^{2} (4 - x - 1/2)\,dx \quad\text{etc.}
\]
From our interpretation of the integral as a volume, we recognise V as the volume under the
plane z = 4 − x − y which lies above {(x, y) | 0 ≤ x ≤ 2; 0 ≤ y ≤ 1}.
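The interpretation of the double integral as a limit of sums of box volumes can be tried out directly on Example 9.2. The following is a minimal sketch using a midpoint Riemann sum; the function name `double_riemann_sum` and the grid sizes are our own choices.

```python
def double_riemann_sum(f, a, b, c, d, n, m):
    # Midpoint Riemann sum of f over the rectangle [a, b] x [c, d] using n x m small boxes.
    dx, dy = (b - a) / n, (d - c) / m
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(m):
            y = c + (j + 0.5) * dy
            total += f(x, y) * dx * dy  # volume of one small box
    return total

V = double_riemann_sum(lambda x, y: 4 - x - y, 0, 2, 0, 1, 200, 100)
print(V)
```

Completing the repeated integral by hand gives V = 5, and the sum agrees with this up to rounding error, because the midpoint rule is exact for a linear integrand.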
9.3. Exercise. Evaluate $\int_0^3\int_0^2 (4 - y^2)\,dy\,dx$, and sketch the region of integration.
In fact Fubini’s theorem is valid for more general regions than rectangles. Here is a pair
of statements which extend its validity.
Proof. We give no proof, but the reduction to the earlier case is in principle simple; we just
extend the function to be defined on a rectangle by making it zero on the extra bits. The
problem with this as it stands is that the extended function is not continuous. However,
the difficulty can be fixed.
This last form enables us to evaluate double integrals over more complicated regions by
passing to one of the repeated integrals.
9.5. Example. Evaluate the integral
\[ \int_1^2 \left(\int_x^2 \frac{y^2}{x^2}\,dy\right) dx. \]
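A repeated integral with variable inner limits, such as the one in Example 9.5, can be approximated by nesting two one dimensional midpoint rules. This is only a numerical sketch; the function name `iterated_integral` and the number of subdivisions are our own choices.

```python
def iterated_integral(f, a, b, lower, upper, n=400):
    # Midpoint rule for the repeated integral: x runs from a to b,
    # and for each x the inner variable y runs from lower(x) to upper(x).
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        dy = (upper(x) - lower(x)) / n
        inner = sum(f(x, lower(x) + (j + 0.5) * dy) for j in range(n)) * dy
        total += inner * dx
    return total

value = iterated_integral(lambda x, y: y ** 2 / x ** 2, 1, 2, lambda x: x, lambda x: 2)
print(value)
```

Doing the inner integral by hand gives (8 − x³)/(3x²), and integrating this from 1 to 2 gives 5/6, which the numerical value approximates closely.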
9.6. Example. Find the area of the region bounded by the curve $x^2 + y^2 = 1$, and above
the line x + y = 1.
Solution. We recognise an area as numerically equal to the volume of a solid of height 1,
so if R is the region described, the area is
\[
\iint_R 1\,dx\,dy = \int_0^1 \left(\int_{y=1-x}^{y=\sqrt{1-x^2}} dy\right) dx
= \int_0^1 \left(\sqrt{1 - x^2} - (1 - x)\right) dx = \ldots.
\]
And we also find that Fubini provides a method for actually calculating integrals;
sometimes one way of doing a repeated integral is much easier than the other.
9.7. Example. Sketch the region of integration for
\[ \int_0^1 \left(\int_y^1 x^2 e^{xy}\,dx\right) dy. \]
dA = dx dy = r dr dθ
Solution. Let V be the required volume. The ball is the set $\{(x, y, z) \mid x^2 + y^2 + z^2 \le 1\}$.
Its volume can be thought of as twice the volume enclosed by a hemisphere of radius 1 in the upper
half space, and so
\[ V = 2\iint_D \sqrt{1 - x^2 - y^2}\,dx\,dy \]
where the region of integration D consists of the unit disc $\{(x, y) \mid x^2 + y^2 \le 1\}$. Although
we can try to do this integration directly, the natural co-ordinates to use are plane polars,
and so we instead do a change of variable first. As in 9.9, if we write x = r cos θ, y = r sin θ,
we have dx dy = r dr dθ. Thus
\[
\begin{aligned}
V &= 2\iint_D \sqrt{1 - x^2 - y^2}\,dx\,dy = 2\iint_D \sqrt{1 - r^2}\;r\,dr\,d\theta \\
&= 2\int_0^{2\pi} d\theta \int_0^1 r\sqrt{1 - r^2}\,dr \\
&= 4\pi\left[-\frac{(1 - r^2)^{3/2}}{3}\right]_0^1 \\
&= \frac{4\pi}{3}.
\end{aligned}
\]
Note that after the change of variables, the integrand is a product, so we are able to do the
dr and dθ parts of the integral at the same time.
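The polar form of the integral is easy to check numerically. In the sketch below the θ integral is done exactly, contributing a factor 2π, and the radial integral of r√(1 − r²) is approximated by a midpoint rule; the function name `ball_volume` and the number of subdivisions are our own choices.

```python
import math

def ball_volume(n=200000):
    # Midpoint rule for the radial integral of sqrt(1 - r^2) * r over 0 <= r <= 1,
    # where the extra factor r comes from dx dy = r dr dtheta.
    dr = 1.0 / n
    radial = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        radial += math.sqrt(1 - r * r) * r * dr
    # Factor 2 for the two hemispheres, factor 2*pi from the theta integral.
    return 2 * (2 * math.pi) * radial

V = ball_volume()
print(V, 4 * math.pi / 3)
```

Leaving out the factor r in the integrand would give π²/2 rather than 4π/3, which is a quick way to see that the Jacobian factor matters.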
And finally, we show that the same ideas work in 3 dimensions. There are (at least) two
co-ordinate systems in R 3 which are useful when cylindrical or spherical symmetry arises.
One of these, cylindrical polars is given by the transformation
x = r cos θ, y = r sin θ, z = z,
dV = r 2 sin φ dr dφ dθ = dx dy dz.
9.11. Example. The moment of inertia of a solid occupying the region R, when rotated
about the z-axis, is given by the formula
\[ I = \iiint_R (x^2 + y^2)\rho\,dV. \]
Calculate the moment of inertia about the z-axis of the solid of unit density which lies
outside the cylinder of radius a, inside the sphere of radius 2a, and above the x-y plane.
9.3. CHANGE OF VARIABLE — THE JACOBIAN 99
Figure 9.4: Cross section of the right hand half of the solid outside a cylinder of radius a
and inside the sphere of radius 2a
Solution. Let I be the moment of inertia of the given solid about the z-axis. A diagram
of a cross section of the solid is shown in Fig 9.4.
We use cylindrical polar co-ordinates (r, θ, z); the Jacobian gives dx dy dz = r dr dθ dz,
so
\[
\begin{aligned}
I &= \int_0^{2\pi} d\theta \int_a^{2a} r\,dr \int_0^{\sqrt{4a^2 - r^2}} r^2\,dz \\
&= 2\pi \int_a^{2a} r^3\sqrt{4a^2 - r^2}\,dr.
\end{aligned}
\]
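The remaining one dimensional integral can be estimated numerically. The sketch below uses a midpoint rule; the function name `moment_of_inertia` and the choice a = 1 are our own. The closed form it is compared with, 22√3πa⁵/5, is our own evaluation using the substitution s = 4a² − r², and is offered as a check rather than as the working of the notes.

```python
import math

def moment_of_inertia(a, n=200000):
    # Midpoint rule for 2*pi times the integral of r^3 * sqrt(4a^2 - r^2) over a <= r <= 2a.
    dr = a / n
    radial = 0.0
    for i in range(n):
        r = a + (i + 0.5) * dr
        radial += r ** 3 * math.sqrt(4 * a * a - r * r) * dr
    return 2 * math.pi * radial

I_numeric = moment_of_inertia(1.0)
I_closed = 22 * math.sqrt(3) * math.pi / 5  # value from the substitution s = 4a^2 - r^2, with a = 1
print(I_numeric, I_closed)
```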