
Part IA — Analysis I

Based on lectures by W. T. Gowers


Notes taken by Dexter Chua

Lent 2015

These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.

Limits and convergence


Sequences and series in R and C. Sums, products and quotients. Absolute convergence;
absolute convergence implies convergence. The Bolzano-Weierstrass theorem and
applications (the General Principle of Convergence). Comparison and ratio tests,
alternating series test. [6]

Continuity
Continuity of real- and complex-valued functions defined on subsets of R and C. The
intermediate value theorem. A continuous function on a closed bounded interval is
bounded and attains its bounds. [3]

Differentiability
Differentiability of functions from R to R. Derivative of sums and products. The chain
rule. Derivative of the inverse function. Rolle’s theorem; the mean value theorem.
One-dimensional version of the inverse function theorem. Taylor’s theorem from R to
R; Lagrange’s form of the remainder. Complex differentiation. [5]

Power series
Complex power series and radius of convergence. Exponential, trigonometric and
hyperbolic functions, and relations between them. *Direct proof of the differentiability
of a power series within its circle of convergence*. [4]

Integration
Definition and basic properties of the Riemann integral. A non-integrable function.
Integrability of monotonic functions. Integrability of piecewise-continuous functions.
The fundamental theorem of calculus. Differentiation of indefinite integrals. Integration
by parts. The integral form of the remainder in Taylor’s theorem. Improper integrals. [6]


Contents

0 Introduction

1 The real number system

2 Convergence of sequences
2.1 Definitions
2.2 Sums, products and quotients
2.3 Monotone-sequences property
2.4 Cauchy sequences
2.5 Limit supremum and infimum

3 Convergence of infinite sums
3.1 Infinite sums
3.2 Absolute convergence
3.3 Convergence tests
3.4 Complex versions

4 Continuous functions
4.1 Continuous functions
4.2 Continuous induction*

5 Differentiability
5.1 Limits
5.2 Differentiation
5.3 Differentiation theorems
5.4 Complex differentiation

6 Complex power series
6.1 Exponential and trigonometric functions
6.2 Differentiating power series
6.3 Hyperbolic trigonometric functions

7 The Riemann Integral
7.1 Riemann Integral
7.2 Improper integrals


0 Introduction
In IA Differential Equations, we studied calculus in a non-rigorous setting. While
we did define differentiation (properly) as a limit, we did not define what a limit
was. We didn’t even have a proper definition of integral, and just mumbled
something about it being an infinite sum.
In Analysis, one of our main goals is to put calculus on a rigorous foundation.
We will provide unambiguous definitions of what it means to take a limit, a
derivative and an integral. Based on these definitions, we will prove the results
we’ve had such as the product rule, quotient rule and chain rule. We will also
rigorously prove the fundamental theorem of calculus, which states that the
integral is the inverse operation to the derivative.
However, this is not all Analysis is about. We will study all sorts of limiting
(“infinite”) processes. We can see integration as an infinite sum, and differen-
tiation as dividing two infinitesimal quantities. In Analysis, we will also study
infinite series such as 1 + 1/4 + 1/9 + 1/16 + · · · , as well as limits of infinite sequences.
Another important concept in Analysis is continuous functions. In some
sense, continuous functions are functions that preserve limit processes. While
their role in this course is just being “nice” functions to work with, they will be
given great importance when we study metric and topological spaces.
This course is continued in IB Analysis II, where we study uniform convergence
(a stronger condition for convergence), calculus of multiple variables and metric
spaces.


1 The real number system


We are all familiar with real numbers — they are “decimals” consisting of
infinitely many digits. When we really want to study real numbers, while this
definition is technically valid, it is not a convenient definition to work with.
Instead, we take an axiomatic approach. We define the real numbers to be a set
that satisfies certain properties, and, if we really want to, show that the decimals
satisfy these properties. In particular, we define real numbers to be an ordered
field with the least upper bound property.
We’ll now define what it means to be an “ordered field with the least upper
bound property”.
Definition (Field). A field is a set F with two binary operations + and × that
satisfies all the familiar properties satisfied by addition and multiplication in Q,
namely

(i) (∀a, b, c) a + (b + c) = (a + b) + c and a × (b × c) = (a × b) × c (associativity)


(ii) (∀a, b) a + b = b + a and a × b = b × a (commutativity)
(iii) ∃0, 1 such that (∀a)a + 0 = a and a × 1 = a (identity)
(iv) ∀a ∃(−a) such that a + (−a) = 0. If a ≠ 0, then ∃a^{−1} such that a × a^{−1} = 1. (inverses)
(v) (∀a, b, c) a × (b + c) = (a × b) + (a × c) (distributivity)

Example. Q, R, C, integers mod p, {a + b√2 : a, b ∈ Q} are all fields.
Definition (Totally ordered set). An (totally) ordered set is a set X with a
relation < that satisfies
(i) If x, y, z ∈ X, x < y and y < z, then x < z (transitivity)
(ii) If x, y ∈ X, exactly one of x < y, x = y, y < x holds (trichotomy)
We call it a totally ordered set, as opposed to simply an ordered set, when
we want to emphasize the order is total, i.e. every pair of elements are related to
each other, as opposed to a partial order, where “exactly one” in trichotomy is
replaced with “at most one”.
Now to define an ordered field, we don’t simply ask for a field with an order.
The order has to interact nicely with the field operations.
Definition (Ordered field). An ordered field is a field F with a relation < that
makes F into an ordered set such that
(i) if x, y, z ∈ F and x < y, then x + z < y + z
(ii) if x, y, z ∈ F, x < y and z > 0, then xz < yz

Lemma. Let F be an ordered field and x ∈ F. Then x2 ≥ 0.


Proof. By trichotomy, either x < 0, x = 0 or x > 0. If x = 0, then x2 = 0. So
x2 ≥ 0. If x > 0, then x2 > 0 × x = 0. If x < 0, then x − x < 0 − x. So 0 < −x.
But then x2 = (−x)2 > 0.

4
1 The real number system IA Analysis I

Definition (Least upper bound). Let X be an ordered set and let A ⊆ X. An


upper bound for A is an element x ∈ X such that (∀a ∈ A) a ≤ x. If A has an
upper bound, then we say that A is bounded above.
An upper bound x for A is a least upper bound or supremum if nothing
smaller that x is an upper bound. That is, we need
(i) (∀a ∈ A) a ≤ x

(ii) (∀y < x)(∃a ∈ A) a ≥ y


We usually write sup A for the supremum of A when it exists. If sup A ∈ A,
then we call it max A, the maximum of A.
Example. Let X = Q. Then the supremum of (0, 1) is 1. The set {x : x2 < 2}
is bounded above by 2, but has no supremum (even though √2 seems like a
supremum, we are in Q and √2 is non-existent!).
max[0, 1] = 1 but (0, 1) has no maximum because the supremum 1 is not in
(0, 1).
We can think of the supremum as a point we can get arbitrarily close to in
the set but cannot pass through.

Definition (Least upper bound property). An ordered set X has the least upper
bound property if every non-empty subset of X that is bounded above has a
supremum.
Obvious modifications give rise to definitions of lower bound, greatest lower
bound (or infimum) etc. It is simple to check that an ordered field with the least
upper bound property has the greatest lower bound property.
Definition (Real numbers). The real numbers is an ordered field with the least
upper bound property.
Of course, it is very important to show that such a thing exists, or else we
will be studying nothing. It is also nice to show that such a field is unique (up
to isomorphism). However, we will not prove these in the course.
In a field, we can define the “natural numbers” to be 2 = 1 + 1, 3 = 1 + 2, etc.
Then an important property of the real numbers is
Lemma (Archimedean property v1). Let F be an ordered field with the least
upper bound property. Then the set {1, 2, 3, · · · } is not bounded above.
Proof. If it is bounded above, then it has a supremum x. But then x − 1 is not
an upper bound. So we can find n ∈ {1, 2, 3, · · · } such that n > x − 1. But then
n + 1 > x, but x is supposed to be an upper bound.
Is the least upper bound property required to prove the Archimedean prop-
erty? It seems like any ordered field should satisfy this even if they do not have
the least upper bound property. However, it turns out there are ordered fields in
which the integers are bounded above.
Consider the field of rational functions, i.e. functions of the form P (x)/Q(x)
with P (x), Q(x) being polynomials, under the usual addition and multiplication.
We order two functions P (x)/Q(x), R(x)/S(x) as follows: these two functions intersect only


finitely many times because P (x)S(x) = R(x)Q(x) has only finitely many roots.
After the last intersection, the function whose value is greater counts as the
greater function. It can be checked that these form an ordered field.
In this field, the integers are the constant functions 1, 2, 3, · · · , but they are
bounded above, since the function x is greater than all of them.


2 Convergence of sequences
Having defined real numbers, the first thing we will study is sequences. We
will want to study what it means for a sequence to converge. Intuitively, we
would like to say that 1, 1/2, 1/3, 1/4, · · · converges to 0, while 1, 2, 3, 4, · · · diverges.
However, the actual formal definition of convergence is rather hard to get right,
and historically there have been failed attempts that produced spurious results.

2.1 Definitions
Definition (Sequence). A sequence is, formally, a function a : N → R (or C).
Usually (i.e. always), we write an instead of a(n). Instead of a, we usually write
it as (an ), (an )^∞_1 or (an )^∞_{n=1} to indicate it is a sequence.

Definition (Convergence of sequence). Let (an ) be a sequence and ℓ ∈ R. Then
an converges to ℓ, tends to ℓ, or an → ℓ, if for all ε > 0, there is some N ∈ N
such that whenever n ≥ N , we have |an − ℓ| < ε. In symbols, this says

(∀ε > 0)(∃N )(∀n ≥ N ) |an − ℓ| < ε.

We say ℓ is the limit of (an ).


One can think of (∃N )(∀n ≥ N ) as saying “eventually always”, or as “from
some point on”. So the definition means: if an → ℓ, then given any ε, eventually
everything in the sequence is within ε of ℓ.
We’ll now provide an alternative form of the Archimedean property. This is
the form that is actually useful.
Lemma (Archimedean property v2). 1/n → 0.
Proof. Let ε > 0. We want to find an N such that |1/N − 0| = 1/N < ε. So
pick N such that N > 1/ε. Such an N exists by the Archimedean property
v1. Then for all n ≥ N , we have 0 < 1/n ≤ 1/N < ε. So |1/n − 0| < ε.
Note how each part of the proof corresponds to a part of the definition of
convergence. This is generally how we prove convergence from first principles.
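The ε–N bookkeeping above can be checked numerically. Below is a minimal sketch (the helper name N_for_epsilon is ours, not from the lectures): given ε, it picks N > 1/ε exactly as in the proof, and verifies that all later terms of (1/n) lie within ε of 0.

```python
import math

def N_for_epsilon(eps):
    """Pick N with N > 1/eps, mirroring the proof of the Archimedean property v2."""
    return math.ceil(1 / eps) + 1

for eps in [0.5, 0.01, 0.001]:
    N = N_for_epsilon(eps)
    # every n >= N satisfies |1/n - 0| <= 1/N < eps
    assert all(abs(1 / n - 0) < eps for n in range(N, N + 1000))
    print(eps, N)
```

This is only an illustration of the quantifier structure, of course: a finite check can never replace the "for all n ≥ N" in the proof itself.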
Definition (Bounded sequence). A sequence (an ) is bounded if

(∃C)(∀n) |an | ≤ C.

A sequence is eventually bounded if

(∃C)(∃N )(∀n ≥ N ) |an | ≤ C.

The definition of an eventually bounded sequence seems a bit daft. Clearly


every eventually bounded sequence is bounded! Indeed it is:
Lemma. Every eventually bounded sequence is bounded.
Proof. Let C and N be such that (∀n ≥ N ) |an | ≤ C. Then ∀n ∈ N, |an | ≤
max{|a1 |, · · · , |aN −1 |, C}.
The proof is rather trivial. However, most of the time it is simpler to prove
that a sequence is eventually bounded, and this lemma saves us from writing
that long line every time.


2.2 Sums, products and quotients


Here we prove the things that we think are obviously true, e.g. sums and products
of convergent sequences are convergent.
Lemma (Sums of sequences). If an → a and bn → b, then an + bn → a + b.
Proof. Let ε > 0. We want to find a clever N such that for all n ≥ N ,
|an + bn − (a + b)| < ε. Intuitively, we know that an is very close to a and bn is
very close to b. So their sum must be very close to a + b.
Formally, since an → a and bn → b, we can find N1 , N2 such that ∀n ≥ N1 ,
|an − a| < ε/2 and ∀n ≥ N2 , |bn − b| < ε/2.
Now let N = max{N1 , N2 }. Then by the triangle inequality, when n ≥ N ,

|(an + bn ) − (a + b)| ≤ |an − a| + |bn − b| < ε.

We want to prove that the product of convergent sequences is convergent.


However, we will not do it in one go. Instead, we separate it into many smaller
parts.
Lemma (Scalar multiplication of sequences). Let an → a and λ ∈ R. Then
λan → λa.
Proof. If λ = 0, then the result is trivial.
Otherwise, let ε > 0. Then ∃N such that ∀n ≥ N , |an − a| < ε/|λ|. So
|λan − λa| < ε.
Lemma. Let (an ) be bounded and bn → 0. Then an bn → 0.
Proof. Let C ≠ 0 be such that (∀n) |an | ≤ C. Let ε > 0. Then ∃N such that
(∀n ≥ N ) |bn | < ε/C. Then |an bn | < ε.
Lemma. Every convergent sequence is bounded.
Proof. Let an → l. Then there is an N such that ∀n ≥ N , |an − l| ≤ 1. So
|an | ≤ |l| + 1. So an is eventually bounded, and therefore bounded.
Lemma (Product of sequences). Let an → a and bn → b. Then an bn → ab.
Proof. Let an = a + εn . Then an bn = (a + εn )bn = abn + εn bn .
Since bn → b, abn → ab. Since εn → 0 and bn is bounded, εn bn → 0. So
an bn → ab.
Proof. (alternative) Observe that an bn −ab = (an −a)bn +(bn −b)a. We know that
an − a → 0 and bn − b → 0. Since (bn ) is bounded, (an − a)bn + (bn − b)a → 0.
So an bn → ab.
Note that in this proof, we no longer write “Let ε > 0”. In the beginning, we
have no lemmas proven. So we must prove everything from first principles and
use the definition. However, after we have proven the lemmas, we can simply
use them instead of using first principles. This is similar to in calculus, where
we use first principles to prove the product rule and chain rule, then no longer
use first principles afterwards.


Lemma (Quotient of sequences). Let (an ) be a sequence such that (∀n) an ≠ 0.
Suppose that an → a and a ≠ 0. Then 1/an → 1/a.
Proof. We have
1/an − 1/a = (a − an )/(a an ).
We want to show that this → 0. Since a − an → 0, it suffices to show that 1/(a an )
is bounded.
Since an → a, ∃N such that ∀n ≥ N , |an − a| ≤ |a|/2. Then ∀n ≥ N , |an | ≥
|a|/2. Then |1/(an a)| ≤ 2/|a|^2 . So 1/(an a) is bounded. So (a − an )/(a an ) → 0
and the result follows.
Corollary. If an → a, bn → b, and bn , b ≠ 0, then an /bn → a/b.
Proof. We know that 1/bn → 1/b. So the result follows by the product rule.
Lemma (Sandwich rule). Let (an ) and (bn ) be sequences that both converge to
a limit x. Suppose that an ≤ cn ≤ bn for every n. Then cn → x.
Proof. Let ε > 0. We can find N such that ∀n ≥ N , |an −x| < ε and |bn −x| < ε.
Then ∀n ≥ N , we have x − ε < an ≤ cn ≤ bn < x + ε. So |cn − x| < ε.
Example. 1/2^n → 0. For every n, n < 2^n . So 0 < 1/2^n < 1/n. The result
follows from the sandwich rule.
Example. We want to show that
(n^2 + 3)/((n + 5)(2n − 1)) → 1/2.
We can obtain this by
(n^2 + 3)/((n + 5)(2n − 1)) = (1 + 3/n^2 )/((1 + 5/n)(2 − 1/n)) → 1/2,
by sum rule, sandwich rule, Archimedean property, product rule and quotient
rule.
Example. Let k ∈ N and let δ > 0. Then
n^k /(1 + δ)^n → 0.
This can be summarized as “exponential growth beats polynomial growth even-
tually”.
By the binomial theorem,
(1 + δ)^n ≥ (n choose k + 1) δ^{k+1} .
Also, for n ≥ 2k,
(n choose k + 1) = n(n − 1) · · · (n − k)/(k + 1)! ≥ (n/2)^{k+1} /(k + 1)!.
So for sufficiently large n,
n^k /(1 + δ)^n ≤ n^k 2^{k+1} (k + 1)!/(δ^{k+1} n^{k+1} ) = (2^{k+1} (k + 1)!/δ^{k+1} ) · (1/n) → 0.
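The explicit bound in this argument can be sanity-checked numerically. A sketch (the values of k and δ are our own choices, not from the notes): for n ≥ 2k the term n^k/(1+δ)^n should sit below C/n with C = 2^{k+1}(k+1)!/δ^{k+1}.

```python
import math

k, delta = 3, 0.1
# the constant from the derivation above
C = 2 ** (k + 1) * math.factorial(k + 1) / delta ** (k + 1)

for n in [10, 100, 1000]:
    term = n ** k / (1 + delta) ** n
    bound = C / n
    assert term <= bound          # the comparison derived above (valid once n >= 2k)
    print(n, term, bound)
```

Since C/n → 0 by the Archimedean property, the sandwich rule then gives the claimed limit.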


2.3 Monotone-sequences property


Recall that we characterized the real numbers by the least upper bound property.
It turns out that there is an alternative characterization of the real numbers using
sequences, known
as the monotone-sequences property. In this section, we will show that the two
characterizations are equivalent, and use the monotone-sequences property to
deduce some useful results.
Definition (Monotone sequence). A sequence (an ) is increasing if an ≤ an+1
for all n.
It is strictly increasing if an < an+1 for all n. (Strictly) decreasing sequences
are defined analogously.
A sequence is (strictly) monotone if it is (strictly) increasing or (strictly)
decreasing.
Definition (Monotone-sequences property). An ordered field has the monotone
sequences property if every increasing sequence that is bounded above converges.
We want to show that the monotone sequences property is equivalent to the
least upper bound property.
Lemma. Least upper bound property ⇒ monotone-sequences property.
Proof. Let (an ) be an increasing sequence and let C be an upper bound for (an ).
Then C is an upper bound for the set {an : n ∈ N}. By the least upper bound
property, it has a supremum s. We want to show that this is the limit of (an ).
Let ε > 0. Since s = sup{an : n ∈ N}, there exists an N such that aN > s − ε.
Then since (an ) is increasing, ∀n ≥ N , we have s − ε < aN ≤ an ≤ s. So
|an − s| < ε.
We first prove a handy lemma.
Lemma. Let (an ) be a sequence and suppose that an → a. If (∀n) an ≤ x, then
a ≤ x.
Proof. If a > x, then set ε = a − x. Then we can find N such that aN > x.
Contradiction.
Before showing the other way implication, we will need the following:
Lemma. Monotone-sequences property ⇒ Archimedean property.
Proof. We prove version 2, i.e. that 1/n → 0.
Since 1/n > 0 and is decreasing, by MSP, it converges. Let δ be the limit.
By the previous lemma, we must have δ ≥ 0.
If δ > 0, then since 1/n → δ, we can find N such that 1/N < 2δ. But then for
all n ≥ 4N , we have 1/n ≤ 1/(4N ) < δ/2. By the previous lemma, δ ≤ δ/2 < δ.
Contradiction. Therefore δ = 0.
Lemma. Monotone-sequences property ⇒ least upper bound property.
Proof. Let A be a non-empty set that’s bounded above. Pick u0 , v0 such that
u0 is not an upper bound for A and v0 is an upper bound. Now do a repeated
bisection: having chosen un and vn such that un is not an upper bound and vn
is, if (un + vn )/2 is an upper bound, then let un+1 = un , vn+1 = (un + vn )/2.
Otherwise, let un+1 = (un + vn )/2, vn+1 = vn .


Then u0 ≤ u1 ≤ u2 ≤ · · · and v0 ≥ v1 ≥ v2 ≥ · · · . We also have
vn − un = (v0 − u0 )/2^n → 0.
By the monotone sequences property, un → s (since (un ) is bounded above by
v0 ). Since vn − un → 0, vn → s. We now show that s = sup A.
If s is not an upper bound, then there exists a ∈ A such that a > s. Since
vn → s, then there exists m such that vm < a, contradicting the fact that vm is
an upper bound.
To show it is the least upper bound, let t < s. Then since un → s, we can
find m such that um > t. So t is not an upper bound. Therefore s is the least
upper bound.
Why do we need to prove the Archimedean property first? In the proof above,
we secretly used it. When showing that vn − un → 0, we required the fact
that 1/2^n → 0. To prove this, we sandwiched it with 1/n. But to show 1/n → 0, we
need the Archimedean property.
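The repeated bisection in the proof above is effectively an algorithm. As a numerical illustration (our own code, not from the notes), take A = {x : x² < 2}, so that sup A = √2; the bisection squeezes a non-upper-bound u and an upper bound v together onto the supremum.

```python
def is_upper_bound(v):
    # v is an upper bound for A = {x : x^2 < 2} iff v >= 0 and v^2 >= 2
    return v >= 0 and v * v >= 2

u, v = 0.0, 2.0            # u is not an upper bound; v is
for _ in range(50):
    m = (u + v) / 2
    if is_upper_bound(m):
        v = m              # keep v an upper bound
    else:
        u = m              # keep u a non-upper-bound
print(u, v)                # both tend to sup A = sqrt(2) = 1.414...

assert abs(v - 2 ** 0.5) < 1e-10
```

Each step preserves the invariant of the proof (u fails to be an upper bound, v succeeds) while halving v − u, which is exactly why both endpoints converge to sup A.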
Lemma. A sequence can have at most 1 limit.
Proof. Let (an ) be a sequence, and suppose an → x and an → y. Let ε > 0
and pick N such that ∀n ≥ N , |an − x| < ε/2 and |an − y| < ε/2. Then
|x − y| ≤ |x − aN | + |aN − y| < ε/2 + ε/2 = ε. Since ε was arbitrary, x must
equal y.
Lemma (Nested intervals property). Let F be an ordered field with the monotone
sequences property. Let I1 ⊇ I2 ⊇ · · · be closed bounded non-empty intervals.
Then ∩^∞_{n=1} In ≠ ∅.
Proof. Let In = [an , bn ] for each n. Then a1 ≤ a2 ≤ · · · and b1 ≥ b2 ≥ · · · .
For each n, an ≤ bn ≤ b1 . So the sequence an is bounded above. So by the
monotone sequences property, it has a limit a. For each n, we must have an ≤ a.
Otherwise, say an > a. Then for all m ≥ n, we have am ≥ an > a. This implies
that a ≥ an > a, which is nonsense.
Also, for each fixed n, we have that ∀m ≥ n, am ≤ bm ≤ bn . So a ≤ bn .
Thus, for all n, an ≤ a ≤ bn . So a ∈ In . So a ∈ ∩^∞_{n=1} In .
We can use this to prove that the reals are uncountable:
Proposition. R is uncountable.
Proof. Suppose the contrary. Let x1 , x2 , · · · be a list of all real numbers. Find
an interval that does not contain x1 . Within that interval, find an interval that
does not contain x2 . Continue ad infinitum. Then the intersection of all these
intervals is non-empty, but the elements in the intersection are not in the list.
Contradiction.
A powerful consequence of this is the Bolzano-Weierstrass theorem. This is
formulated in terms of subsequences:
Definition (Subsequence). Let (an ) be a sequence. A subsequence of (an ) is a
sequence of the form an1 , an2 , · · · , where n1 < n2 < · · · .
Example. 1, 1/4, 1/9, 1/16, · · · is a subsequence of 1, 1/2, 1/3, · · · .


Theorem (Bolzano-Weierstrass theorem). Let F be an ordered field with the


monotone sequences property (i.e. F = R).
Then every bounded sequence has a convergent subsequence.
Proof. Let u0 and v0 be a lower and upper bound, respectively, for a sequence
(an ). By repeated bisection, we can find a sequence of intervals [u0 , v0 ] ⊇
[u1 , v1 ] ⊇ [u2 , v2 ] ⊇ · · · such that vn − un = (v0 − u0 )/2^n , and such that each
[un , vn ] contains infinitely many terms of (an ).
By the nested intervals property, ∩^∞_{n=1} [un , vn ] ≠ ∅. Let x belong to the
intersection. Now pick a subsequence an1 , an2 , · · · such that ank ∈ [uk , vk ]. We
can do this because [uk , vk ] contains infinitely many an , and we have only picked
finitely many of them. We will show that ank → x.
Let ε > 0. By the Archimedean property, we can find K such that vK − uK =
(v0 − u0 )/2^K ≤ ε. This implies that [uK , vK ] ⊆ (x − ε, x + ε), since x ∈ [uK , vK ].
Then ∀k ≥ K, ank ∈ [uk , vk ] ⊆ [uK , vK ] ⊆ (x − ε, x + ε). So |ank − x| < ε.
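The bisection proof can be run on a concrete bounded sequence; below is an illustrative sketch (all names are ours). The sequence (−1)^n(1 + 1/n) does not converge, but the halving procedure homes in on a cluster point. With only finitely many terms we cannot test "contains infinitely many", so the code keeps the half containing more of the remaining terms as a stand-in.

```python
a = [(-1) ** n * (1 + 1 / n) for n in range(1, 20001)]  # bounded, divergent

u, v = -2.0, 2.0
last = -1                     # index of the last chosen subsequence term
sub = []
for _ in range(12):
    m = (u + v) / 2
    left = [i for i in range(last + 1, len(a)) if u <= a[i] <= m]
    right = [i for i in range(last + 1, len(a)) if m <= a[i] <= v]
    if len(left) >= len(right):
        v, pool = m, left     # keep [u, m]
    else:
        u, pool = m, right    # keep [m, v]
    last = pool[0]            # next subsequence term lies in the kept half
    sub.append(a[last])

print(sub[-1], (u + v) / 2)   # the subsequence approaches the cluster point -1
assert abs(sub[-1] - (-1)) < 0.01
```

The indices chosen strictly increase (we only look past `last`), so `sub` really is a subsequence, and its terms are trapped in intervals shrinking onto −1.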

2.4 Cauchy sequences


The third characterization of real numbers is in terms of Cauchy sequences.
Cauchy convergence is an alternative way of defining convergent sequences
without needing to mention the actual limit of the sequence. This allows us to
say 3, 3.1, 3.14, 3.141, 3.1415, · · · is Cauchy convergent in Q even though the
limit π is not in Q.
Definition (Cauchy sequence). A sequence (an ) is Cauchy if for all ε > 0, there is
some N ∈ N such that whenever p, q ≥ N , we have |ap − aq | < ε. In symbols,
we have
(∀ε > 0)(∃N )(∀p, q ≥ N ) |ap − aq | < ε.
Roughly, a sequence is Cauchy if all terms are eventually close to each other
(as opposed to close to a limit).
Lemma. Every convergent sequence is Cauchy.
Proof. Let an → a. Let ε > 0. Then ∃N such that ∀n ≥ N , |an − a| < ε/2.
Then ∀p, q ≥ N , |ap − aq | ≤ |ap − a| + |a − aq | < ε/2 + ε/2 = ε.
Lemma. Let (an ) be a Cauchy sequence with a subsequence (ank ) that converges
to a. Then an → a.
Proof. Let ε > 0. Pick N such that ∀p, q ≥ N , |ap − aq | < ε/2. Then pick K
such that nK ≥ N and |anK − a| < ε/2.
Then ∀n ≥ N , we have
ε ε
|an − a| ≤ |an − anK | + |anK − a| < + = ε.
2 2

An important result we have is that in R, Cauchy convergence and regular


convergence are equivalent.
Theorem (The general principle of convergence). Let F be an ordered field with
the monotone-sequence property. Then every Cauchy sequence of F converges.


Proof. Let (an ) be a Cauchy sequence. Then it is eventually bounded, since ∃N ,


∀n ≥ N , |an − aN | ≤ 1 by the Cauchy condition. So it is bounded. Hence by
Bolzano-Weierstrass, it has a convergent subsequence. Then (an ) converges to
the same limit.
Definition (Complete ordered field). An ordered field in which every Cauchy
sequence converges is called complete.
Hence we say that R is a complete ordered field.
However, not every complete ordered field is (isomorphic to) R. For example,
we can take the rational functions as before, then take the Cauchy completion
of it (i.e. add all the limits we need). Then it is already too large to be the reals
(it still doesn’t have the Archimedean property) but is a complete ordered field.
To show that completeness implies the monotone-sequences property, we
need an additional condition: the Archimedean property.
Lemma. Let F be an ordered field with the Archimedean property such that
every Cauchy sequence converges. Then F satisfies the monotone-sequences
property.
Proof. Instead of showing that every increasing sequence that is bounded above
converges, we will show the equivalent statement that every increasing sequence
that is not Cauchy is not bounded above.
Let (an ) be an increasing sequence. If (an ) is not Cauchy, then
(∃ε > 0)(∀N )(∃p, q > N ) |ap − aq | ≥ ε.
WLOG let p > q. Then
ap ≥ aq + ε ≥ aN + ε.
So for any N , we can find a p > N such that
ap ≥ aN + ε.
Then we can construct a subsequence an1 , an2 , · · · such that
ank+1 ≥ ank + ε.
Therefore
ank ≥ an1 + (k − 1)ε.
So by the Archimedean property, (ank ), and hence (an ), is unbounded.
Note that the definition of a convergent sequence is
(∃l)(∀ε > 0)(∃N )(∀n ≥ N ) |an − l| < ε,
while that of Cauchy convergence is
(∀ε > 0)(∃N )(∀p, q ≥ N ) |ap − aq | < ε.
In the first definition, l quantifies over all real numbers, which is uncountable.
However, in the second definition, we only have to quantify over natural numbers,
which is countable (by the Archimedean property, we only have to consider the
cases ε = 1/n).
Since they are equivalent in R, the second definition is sometimes preferred
when we care about logical simplicity.


2.5 Limit supremum and infimum


Here we will define the limit supremum and infimum. While these are technically
not part of the course, eventually some lecturers will magically assume students
know this definition. So we might as well learn it here.
Definition (Limit supremum/infimum). Let (an ) be a bounded sequence. We
define the limit supremum as
lim sup_{n→∞} an = lim_{n→∞} ( sup_{m≥n} am ).
To see that this exists, set bn = sup_{m≥n} am . Then (bn ) is decreasing since we
are taking the supremum of fewer and fewer things, and is bounded below by
any lower bound for (an ) since bn ≥ an . So it converges.
Similarly, we define the limit infimum as
lim inf_{n→∞} an = lim_{n→∞} ( inf_{m≥n} am ).

Example. Take the sequence
2, −1, 3/2, −1/2, 4/3, −1/3, · · ·
Then the limit supremum is 1 and the limit infimum is 0.
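This example can be checked numerically; a short sketch (ours, not from the notes), writing b_n = sup_{m≥n} a_m and c_n = inf_{m≥n} a_m over a long finite sample of the sequence:

```python
a = []
for k in range(1, 201):
    a += [(k + 1) / k, -1 / k]   # 2, -1, 3/2, -1/2, 4/3, -1/3, ...

def b(n):                        # b_n = sup over the sampled tail
    return max(a[n:])

def c(n):                        # c_n = inf over the sampled tail
    return min(a[n:])

print(b(10), c(10), b(300), c(300))  # b decreases toward 1, c increases toward 0
```

As in the definition, the tail suprema decrease (toward 1) and the tail infima increase (toward 0); a finite sample can only suggest the limits, not prove them.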
Lemma. Let (an ) be a sequence. The following two statements are equivalent:

– an → a
– lim sup an = lim inf an = a.
Proof. If an → a, then let ε > 0. Then we can find an n such that

a − ε ≤ am ≤ a + ε for all m ≥ n

It follows that
a − ε ≤ inf_{m≥n} am ≤ sup_{m≥n} am ≤ a + ε.

Since ε was arbitrary, it follows that

lim inf an = lim sup an = a.

Conversely, if lim inf an = lim sup an = a, then let ε > 0. Then we can find n
such that
inf_{m≥n} am > a − ε and sup_{m≥n} am < a + ε.

It follows that ∀m ≥ n, we have |am − a| < ε.


3 Convergence of infinite sums


In this chapter, we investigate which infinite sums, as opposed to sequences,
converge. We would like to say 1 + 1/2 + 1/4 + 1/8 + · · · = 2, while 1 + 1 + 1 + 1 + · · ·
does not converge. The majority of the chapter is coming up with different tests
to figure out if infinite sums converge.

3.1 Infinite sums


Definition (Convergence of infinite sums and partial sums). Let (an ) be a real
sequence. For each N , define
SN = Σ^N_{n=1} an .
If the sequence (SN ) converges to some limit s, then we say that
Σ^∞_{n=1} an = s,
and we say that the series Σ^∞_{n=1} an converges.
We call SN the N th partial sum.
There is an immediate necessary condition for a series to converge.
Lemma. If Σ^∞_{n=1} an converges, then an → 0.
Proof. Let Σ^∞_{n=1} an = s. Then Sn → s and Sn−1 → s. Then an = Sn − Sn−1 →
0.
However, the converse is false!
Example (Harmonic series). If an = 1/n, then an → 0 but Σ an = ∞.
We can prove this as follows:
S_{2^n} − S_{2^{n−1}} = 1/(2^{n−1} + 1) + · · · + 1/2^n ≥ 2^{n−1} /2^n = 1/2.
Therefore S_{2^n} ≥ S1 + n/2. So the partial sums are unbounded.
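The grouping argument is easy to verify numerically; a quick check (ours, not part of the notes) that S_{2^n} ≥ S_1 + n/2 = 1 + n/2:

```python
def S(N):                      # N-th partial sum of the harmonic series
    return sum(1 / n for n in range(1, N + 1))

for n in range(1, 15):
    assert S(2 ** n) >= 1 + n / 2
print(S(2 ** 14))              # unbounded growth, roughly logarithmic in N
```

The partial sums grow without bound, but painfully slowly, which is exactly why the condition an → 0 alone cannot guarantee convergence.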
Example (Geometric series). Let |ρ| < 1. Then
Σ^∞_{n=0} ρ^n = 1/(1 − ρ).
We can prove this by considering the partial sums:
Σ^N_{n=0} ρ^n = (1 − ρ^{N +1} )/(1 − ρ).
Since ρ^{N +1} → 0, this tends to 1/(1 − ρ).

15
3 Convergence of infinite sums IA Analysis I


Example. Σ^∞_{n=2} 1/(n(n − 1)) converges. This is since
1/(n(n − 1)) = 1/(n − 1) − 1/n.
So
Σ^N_{n=2} 1/(n(n − 1)) = 1 − 1/N → 1.
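Both examples above can be confirmed by computing partial sums directly; a sketch (ours), with ρ = 1/3 for the geometric series:

```python
rho = 1 / 3
geom = sum(rho ** n for n in range(0, 51))      # partial sum up to N = 50
assert abs(geom - 1 / (1 - rho)) < 1e-12        # tends to 1/(1 - rho) = 3/2

N = 1000
tele = sum(1 / (n * (n - 1)) for n in range(2, N + 1))
assert abs(tele - (1 - 1 / N)) < 1e-10          # telescopes to exactly 1 - 1/N
print(geom, tele)
```

The telescoping check is especially clean: the computed partial sum matches the closed form 1 − 1/N up to floating-point rounding.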

Lemma. Suppose that an ≥ 0 for every n and the partial sums Sn are bounded
above. Then Σ^∞_{n=1} an converges.
Proof. The sequence (Sn ) is increasing and bounded above. So the result follows
from the monotone sequences property.
The simplest convergence test we have is the comparison test. Roughly
speaking, it says that if 0 ≤ an ≤ bn for all n and Σ bn converges, then Σ an
converges. However, we will prove a much more general form here for convenience.
Lemma (Comparison test). Let (an ) and (bn ) be non-negative sequences, and
suppose that ∃C, N such that ∀n ≥ N , an ≤ Cbn . Then if Σ bn converges, then
so does Σ an .
Proof. Let M > N . Also for each R, let SR = Σ^R_{n=1} an and TR = Σ^R_{n=1} bn . We
want SR to be bounded above.
SM − SN = Σ^M_{n=N+1} an ≤ C Σ^M_{n=N+1} bn ≤ C Σ^∞_{n=N+1} bn .
So ∀M ≥ N , SM ≤ SN + C Σ^∞_{n=N+1} bn . Since the SM are increasing and
bounded, they must converge.
Example.
(i) Σ 1/(n2^n ) converges, since Σ 1/2^n converges.
(ii) Σ n/2^n converges.
If n ≥ 4, then n ≤ 2^{n/2} . That’s because 4 = 2^{4/2} and for n ≥ 4,
(n + 1)/n < √2, so when we increase n, we multiply the right side by a
greater number than the left. Hence by the comparison test, it is sufficient
to show that Σ 2^{n/2} /2^n = Σ 2^{−n/2} converges, which it does (geometric
series).
(iii) Σ 1/√n diverges, since 1/√n ≥ 1/n. So if it converged, then so would Σ 1/n, but
Σ 1/n diverges.
(iv) Σ 1/n^2 converges, since for n ≥ 2, 1/n^2 ≤ 1/(n(n − 1)), and we have proven that
the latter converges.



X n+5
(v) Consider . We show this converges by noting
n=1
n3 − 7n2 /2

7n2
 
7
n3 − = n2 n − .
2 2
So if n ≥ 8, then
7n2 n3
n3 − ≥ .
2 2
Also, n + 5 ≤ 2n. So
n+5
≤ 4/n2 .
n3 − 7n2 /2
So it converges by the comparison test.
(vi) If α > 1, then Σ 1/n^α converges.
Let SN = Σ^N_{n=1} 1/n^α . Then
S_{2^n} − S_{2^{n−1}} = 1/(2^{n−1} + 1)^α + · · · + 1/(2^n )^α
≤ 2^{n−1} /(2^{n−1} )^α
= (2^{n−1} )^{1−α}
= (2^{1−α} )^{n−1} .

But 2^{1−α} < 1. So

S_{2^n} = (S_{2^n} − S_{2^{n−1}} ) + (S_{2^{n−1}} − S_{2^{n−2}} ) + · · · + (S2 − S1 ) + S1

and is bounded above by comparison with the geometric series 1 + 2^{1−α} +
(2^{1−α} )^2 + · · ·

3.2 Absolute convergence


Here we’ll consider two stronger conditions for convergence — absolute conver-
gence and unconditional convergence. We’ll prove that these two conditions are
in fact equivalent.
Definition (Absolute convergence). A series Σ a_n converges absolutely if the
series Σ |a_n| converges.
Example. The series Σ (−1)^{n+1}/n = 1 − 1/2 + 1/3 − 1/4 + · · · converges, but
not absolutely.
To see the convergence, note that

    a_{2n−1} + a_{2n} = 1/(2n − 1) − 1/(2n) = 1/(2n(2n − 1)).

It is easy to compare with Σ 1/n² to get that the partial sums S_{2n} converge.
But S_{2n+1} − S_{2n} = 1/(2n + 1) → 0, so the S_{2n+1} converge to the same
limit.
It does not converge absolutely, because the sum of the absolute values is
the harmonic series.
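This contrast is easy to see numerically; a small Python sketch (the sum of the alternating series is known to be ln 2, a fact not needed in the argument above):

```python
import math

def S(N):
    """Partial sum of the alternating harmonic series up to N."""
    return sum((-1) ** (n + 1) / n for n in range(1, N + 1))

# Paired terms give a_{2n-1} + a_{2n} = 1/(2n(2n-1)) > 0, so S_{2n} increases.
evens = [S(2 * n) for n in range(1, 200)]
assert all(x < y for x, y in zip(evens, evens[1:]))

# The partial sums settle near ln 2 ...
assert abs(S(10000) - math.log(2)) < 1e-3
# ... while the harmonic partial sums exceed any fixed bound, e.g. 9:
assert sum(1 / n for n in range(1, 10001)) > 9
```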

Lemma. Let Σ a_n converge absolutely. Then Σ a_n converges.

Proof. We know that Σ |a_n| converges. Let S_N = Σ_{n=1}^N a_n and
T_N = Σ_{n=1}^N |a_n|.
We know two ways to show sequences converge without knowing what they
converge to, namely monotone sequences and Cauchy sequences. Since S_N is
not monotone, we shall try Cauchy sequences.
If p > q, then

    |S_p − S_q| = |Σ_{n=q+1}^p a_n| ≤ Σ_{n=q+1}^p |a_n| = T_p − T_q.

But the sequence T_p converges. So ∀ε > 0, we can find N such that for all
p > q ≥ N, we have T_p − T_q < ε, which implies |S_p − S_q| < ε.
Definition (Unconditional convergence). A series Σ a_n converges unconditionally
if the series Σ_{n=1}^∞ a_{π(n)} converges for every bijection π : N → N, i.e. no
matter how we re-order the elements of (a_n), the sum still converges.
Theorem. If Σ a_n converges absolutely, then it converges unconditionally.

Proof. Let S_N = Σ_{n=1}^N a_{π(n)}. Then if p > q,

    |S_p − S_q| = |Σ_{n=q+1}^p a_{π(n)}| ≤ Σ_{n=q+1}^∞ |a_{π(n)}|.

Let ε > 0. Since Σ |a_n| converges, pick M such that Σ_{n=M+1}^∞ |a_n| < ε.
Pick N large enough that {1, · · · , M} ⊆ {π(1), · · · , π(N)}.
Then if n > N, we have π(n) > M. Therefore if p > q ≥ N, then

    |S_p − S_q| ≤ Σ_{n=q+1}^p |a_{π(n)}| ≤ Σ_{n=M+1}^∞ |a_n| < ε.

Therefore the sequence of partial sums is Cauchy.


The converse is also true.
Theorem. If Σ a_n converges unconditionally, then it converges absolutely.

Proof. We will prove the contrapositive: if it doesn't converge absolutely, it
doesn't converge unconditionally.
Suppose that Σ |a_n| = ∞. Let (b_n) be the subsequence of non-negative
terms of (a_n), and (c_n) be the subsequence of negative terms. Then Σ b_n and
Σ c_n cannot both converge, or else Σ |a_n| converges.
wlog, Σ b_n = ∞. Now construct a sequence 0 = n_0 < n_1 < n_2 < · · · such
that ∀k,

    b_{n_{k−1}+1} + b_{n_{k−1}+2} + · · · + b_{n_k} + c_k ≥ 1.

This is possible because the partial sums of the b_n are unbounded, so we can
make each block as large as we want.
Let π be the rearrangement

    b_1, b_2, · · · , b_{n_1}, c_1, b_{n_1+1}, · · · , b_{n_2}, c_2, b_{n_2+1}, · · · , b_{n_3}, c_3, · · ·

So the sum up to c_k is at least k. So the partial sums tend to infinity.
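The construction can be carried out concretely for the alternating harmonic series, whose positive terms are b_n = 1/(2n−1) and negative terms are c_k = −1/(2k); the packing routine below is my own illustration of the proof:

```python
def rearranged_partial_sums(num_blocks):
    """Partial sums of the rearrangement after each block of the proof:
    each block is one negative term c_k plus enough positive terms b_i
    to make the block sum at least 1."""
    sums, total, i = [], 0.0, 1   # i indexes the next unused positive term
    for k in range(1, num_blocks + 1):
        block = -1.0 / (2 * k)           # the single negative term c_k
        while block < 1.0:
            block += 1.0 / (2 * i - 1)   # append positive terms b_i
            i += 1
        total += block
        sums.append(total)
    return sums

sums = rearranged_partial_sums(5)
# After k blocks the rearranged partial sum is at least k, so the
# rearranged series diverges even though the original converges.
assert all(s >= k for k, s in enumerate(sums, start=1))
```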

We can prove an even stronger result:


Lemma. Let Σ a_n be a series that converges absolutely. Then for any bijection
π : N → N,

    Σ_{n=1}^∞ a_n = Σ_{n=1}^∞ a_{π(n)}.

Proof. Let ε > 0. We know that both Σ |a_n| and Σ |a_{π(n)}| converge. So let M
be such that Σ_{n>M} |a_n| < ε/2 and Σ_{n>M} |a_{π(n)}| < ε/2.
Now let N be large enough such that

    {1, · · · , M} ⊆ {π(1), · · · , π(N)},

and

    {π(1), · · · , π(M)} ⊆ {1, · · · , N}.

Then for every K ≥ N,

    |Σ_{n=1}^K a_n − Σ_{n=1}^K a_{π(n)}| ≤ Σ_{n=M+1}^K |a_n| + Σ_{n=M+1}^K |a_{π(n)}| < ε/2 + ε/2 = ε.

We have the first inequality since, given our choice of M and N, the first M
terms of the Σ a_n and Σ a_{π(n)} sums are cancelled by some term in the huge
sum.
So ∀K ≥ N, the partial sums up to K differ by at most ε. So
|Σ a_n − Σ a_{π(n)}| ≤ ε.
Since this is true for all ε, we must have Σ a_n = Σ a_{π(n)}.

3.3 Convergence tests


We’ll now come up with a lot of convergence tests.

Lemma (Alternating sequence test). Let (a_n) be a decreasing sequence of
non-negative reals, and suppose that a_n → 0. Then Σ_{n=1}^∞ (−1)^{n+1} a_n
converges, i.e. a_1 − a_2 + a_3 − a_4 + · · · converges.

Proof. Let S_N = Σ_{n=1}^N (−1)^{n+1} a_n. Then

    S_{2n} = (a_1 − a_2) + (a_3 − a_4) + · · · + (a_{2n−1} − a_{2n}) ≥ 0,

and (S_{2n}) is an increasing sequence.
Also,

    S_{2n+1} = a_1 − (a_2 − a_3) − (a_4 − a_5) − · · · − (a_{2n} − a_{2n+1}),

and (S_{2n+1}) is a decreasing sequence. Also S_{2n+1} − S_{2n} = a_{2n+1} ≥ 0.
Hence we obtain the bounds 0 ≤ S_{2n} ≤ S_{2n+1} ≤ a_1. It follows from the
monotone sequences property that (S_{2n}) and (S_{2n+1}) converge.
Since S_{2n+1} − S_{2n} = a_{2n+1} → 0, they converge to the same limit.
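A quick numerical sanity check of these bounds, taking a_n = n^(−1/3) (the sequence in the example below); the code is illustrative only:

```python
a = lambda n: n ** (-1 / 3)   # decreasing, non-negative, tends to 0

def S(N):
    """Partial sum of the alternating series sum (-1)^(n+1) a_n."""
    return sum((-1) ** (n + 1) * a(n) for n in range(1, N + 1))

for n in range(1, 100):
    # the nesting bounds from the proof: 0 <= S_{2n} <= S_{2n+1} <= a_1
    assert 0 <= S(2 * n) <= S(2 * n + 1) <= a(1)
    assert S(2 * n) <= S(2 * n + 2)       # (S_{2n}) increasing
    assert S(2 * n + 3) <= S(2 * n + 1)   # (S_{2n+1}) decreasing
```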

Example. 1 − 1/∛2 + 1/∛3 − 1/∛4 + · · · converges.
Lemma (Ratio test). We have three versions:

(i) If ∃c < 1 such that

        |a_{n+1}|/|a_n| ≤ c

    for all n, then Σ a_n converges.

(ii) If ∃c < 1 and ∃N such that

        (∀n ≥ N) |a_{n+1}|/|a_n| ≤ c,

    then Σ a_n converges. Note that just because the ratio is always less than
    1, it doesn't necessarily converge. It has to be always less than a fixed
    number c. Otherwise the test will say that Σ 1/n converges.

(iii) If ∃ρ ∈ (−1, 1) such that

        a_{n+1}/a_n → ρ,

    then Σ a_n converges. Note that we have the open interval (−1, 1). If
    |a_{n+1}|/|a_n| → 1, then the test is inconclusive!

Proof.

(i) |a_n| ≤ c^{n−1} |a_1|. Since Σ c^n converges, so does Σ |a_n| by the
    comparison test. So Σ a_n converges absolutely, so it converges.

(ii) For all k ≥ 0, we have |a_{N+k}| ≤ c^k |a_N|. So the series Σ |a_{N+k}|
    converges, and therefore so does Σ |a_k|.

(iii) If a_{n+1}/a_n → ρ, then |a_{n+1}|/|a_n| → |ρ|. So (setting ε = (1 − |ρ|)/2)
    there exists N such that ∀n ≥ N, |a_{n+1}|/|a_n| ≤ (1 + |ρ|)/2 < 1. So the
    result follows from (ii).
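The ratio test is easy to mechanise. The sketch below is a rough numerical stand-in for version (iii), with illustrative thresholds; the divergence branch (ratio tending to a limit greater than 1) is a standard companion fact, not proved above:

```python
def ratio_test(a, N=1000):
    """Return 'converges', 'diverges' or 'inconclusive' for sum a(n),
    using a single tail ratio as a crude stand-in for the limit rho."""
    rho = abs(a(N + 1) / a(N))
    if rho < 0.99:
        return "converges"
    if rho > 1.01:
        return "diverges"
    return "inconclusive"     # ratio near 1: the test says nothing

assert ratio_test(lambda n: n * 0.5 ** n) == "converges"
assert ratio_test(lambda n: 2.0 ** n / n) == "diverges"
assert ratio_test(lambda n: 1 / n) == "inconclusive"   # ratio -> 1
```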

Example. If |b| < 1, then Σ n b^n converges, since

    a_{n+1}/a_n = (n + 1)b^{n+1}/(n b^n) = (1 + 1/n) b → b,

and |b| < 1. So it converges by the ratio test.
We can also evaluate this directly by considering Σ_{i=1}^∞ Σ_{n=i}^∞ b^n.

The following two tests were taught at the end of the course, but are included
here for the sake of completeness.

Theorem (Condensation test). Let (a_n) be a decreasing non-negative sequence.
Then Σ_{n=1}^∞ a_n < ∞ if and only if

    Σ_{k=1}^∞ 2^k a_{2^k} < ∞.

Proof. This is basically the proof that Σ 1/n diverges and Σ 1/n^α converges for
α > 1, but written in a more general way.
We have

    a_1 + a_2 + (a_3 + a_4) + (a_5 + · · · + a_8) + (a_9 + · · · + a_{16}) + · · ·
      ≥ a_1 + a_2 + 2a_4 + 4a_8 + 8a_{16} + · · ·

So if Σ 2^k a_{2^k} diverges, Σ a_n diverges.
To prove the other way round, simply group as

    a_1 + (a_2 + a_3) + (a_4 + · · · + a_7) + · · ·
      ≤ a_1 + 2a_2 + 4a_4 + · · ·

Example. If a_n = 1/n, then 2^k a_{2^k} = 1. So Σ_{k=1}^∞ 2^k a_{2^k} = ∞. So
Σ_{n=1}^∞ 1/n = ∞.
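The condensation step is easy to see numerically; a small sketch (helper names mine) contrasting the harmonic series with Σ 1/n²:

```python
def condensed(a, K):
    """Partial sum of the condensed series sum 2^k a(2^k), k = 1..K."""
    return sum(2 ** k * a(2 ** k) for k in range(1, K + 1))

# For a_n = 1/n every condensed term is exactly 1, so the condensed sums
# grow without bound -- matching the divergence of the harmonic series.
harmonic = lambda n: 1 / n
assert condensed(harmonic, 20) == 20

# For a_n = 1/n^2 the condensed series is sum 2^(-k), geometric and < 1.
sq = lambda n: 1 / n ** 2
assert condensed(sq, 50) < 1
```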
After we formally define integrals, we will prove the integral test:

Theorem (Integral test). Let f : [1, ∞) → R be a decreasing non-negative
function. Then Σ_{n=1}^∞ f(n) converges iff ∫_1^∞ f(x) dx < ∞.

3.4 Complex versions


Most definitions in the course so far carry over unchanged to the complex
numbers. e.g. zn → z iff (∀ε > 0)(∃N )(∀n ≥ N ) |zn − z| < ε.
Two exceptions are least upper bound and monotone sequences, because the
complex numbers do not have an ordering! (It cannot be made into an ordered
field because the square of every number in an ordered field has to be positive)
Fortunately, Cauchy sequences still work.
We can prove the complex versions of most theorems so far by looking at the
real and imaginary parts.
Example. Let (zn ) be a Cauchy sequence in C. Let zn = xn + iyn . Then (xn )
and (yn ) are Cauchy. So they converge, from which it follows that zn = xn + iyn
converges.
Also, the Bolzano-Weierstrass theorem still holds: if (z_n) is bounded, write
z_n = x_n + iy_n; then (x_n) and (y_n) are bounded. First find a subsequence
(x_{n_k}) that converges, then find a subsequence of (y_{n_k}) that converges.
The nested-intervals property has a “nested-box” property as a complex
analogue.
Finally, the proof that absolutely convergent sequences converge still works.
It follows that the ratio test still works.
Example. If |z| < 1, then Σ n z^n converges. The proof is the same as above.

However, we do have an extra test for complex sums.

Lemma (Abel's test). Let a_1 ≥ a_2 ≥ · · · ≥ 0, and suppose that a_n → 0. Let
z ∈ C be such that |z| = 1 and z ≠ 1. Then Σ a_n z^n converges.

Proof. We prove that the partial sums are Cauchy. We have

    Σ_{n=M}^N a_n z^n = Σ_{n=M}^N a_n (z^{n+1} − z^n)/(z − 1)
                      = 1/(z − 1) Σ_{n=M}^N a_n (z^{n+1} − z^n)
                      = 1/(z − 1) (Σ_{n=M}^N a_n z^{n+1} − Σ_{n=M}^N a_n z^n)
                      = 1/(z − 1) (Σ_{n=M}^N a_n z^{n+1} − Σ_{n=M−1}^{N−1} a_{n+1} z^{n+1})
                      = 1/(z − 1) (a_N z^{N+1} − a_M z^M + Σ_{n=M}^{N−1} (a_n − a_{n+1}) z^{n+1}).

We now take the absolute value of everything to obtain

    |Σ_{n=M}^N a_n z^n| ≤ 1/|z − 1| (a_N + a_M + Σ_{n=M}^{N−1} (a_n − a_{n+1}))
                        = 1/|z − 1| (a_N + a_M + (a_M − a_{M+1}) + · · · + (a_{N−1} − a_N))
                        = 2a_M/|z − 1| → 0 as M → ∞.

So the partial sums are Cauchy, and the series converges.
Note that here we transformed the sum Σ a_n (z^{n+1} − z^n) into
a_N z^{N+1} − a_M z^M + Σ (a_n − a_{n+1}) z^{n+1}. What we have effectively done
is a discrete analogue of integrating by parts.
Example. The series Σ z^n/n converges if |z| < 1 or if |z| = 1 and z ≠ 1, and
it diverges if z = 1 or |z| > 1.
The cases |z| < 1 and |z| > 1 follow from the ratio test, and Abel's test
handles the case |z| = 1, z ≠ 1.
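The summation-by-parts identity in the proof can be verified numerically; a small Python check with a_n = 1/n and z = i (choices illustrative):

```python
a = lambda n: 1 / n
z, M, N = 1j, 5, 40    # |z| = 1, z != 1

# Left side: a partial block of the series.
lhs = sum(a(n) * z ** n for n in range(M, N + 1))

# Right side: the summation-by-parts form from the proof.
rhs = (a(N) * z ** (N + 1) - a(M) * z ** M
       + sum((a(n) - a(n + 1)) * z ** (n + 1) for n in range(M, N))) / (z - 1)
assert abs(lhs - rhs) < 1e-12

# The resulting bound 2 a_M / |z - 1| on the block, which -> 0 as M grows:
assert abs(lhs) <= 2 * a(M) / abs(z - 1) + 1e-12
```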

4 Continuous functions
4.1 Continuous functions
Definition (Continuous function). Let A ⊆ R, a ∈ A, and f : A → R. Then f
is continuous at a if for any ε > 0, there is some δ > 0 such that if y ∈ A is such
that |y − a| < δ, then |f (y) − f (a)| < ε. In symbols, we have

(∀ε > 0)(∃δ > 0)(∀y ∈ A) |y − a| < δ ⇒ |f (y) − f (a)| < ε.

f is continuous if it is continuous at every a ∈ A. In symbols, we have

(∀a ∈ A)(∀ε > 0)(∃δ > 0)(∀y ∈ A) |y − a| < δ ⇒ |f (y) − f (a)| < ε.

Intuitively, f is continuous at a if we can obtain f (a) as accurately as we


wish by using more accurate values of a (the definition says that if we want to
approximate f (a) by f (y) to within accuracy ε, we just have to get our y to
within δ of a for some δ).
For example, suppose we have the function

    f(x) = { 0   if x ≤ π,
           { 1   if x > π.

Suppose that we don’t know what the function actually is, but we have a
computer program that computes this function. We want to know what f (π)
is. Since we cannot input π (it has infinitely many digits), we can try 3, and it
gives 0. Then we try 3.14, and it gives 0 again. If we try 3.1416, it gives 1 (since
π = 3.1415926 · · · < 3.1416). We keep giving more and more digits of π, but the
result keeps oscillating between 0 and 1. We have no hope of what f (π) might
be, even approximately. So this f is discontinuous at π.
However, if we have the function g(x) = x2 , then we can find the (approx-
imate) value of g(π). We can first try g(3) and obtain 9. Then we can try
g(3.14) = 9.8596, g(3.1416) = 9.86965056 etc. We can keep trying and obtain
more and more accurate values of g(π). So g is continuous at π.
Example.
– Constant functions are continuous.

– The function f (x) = x is continuous (take δ = ε).


The definition of continuity of a function looks rather like the definition of
convergence. In fact, they are related by the following lemma:
Lemma. The following two statements are equivalent for a function f : A → R.

(i) f is continuous.

(ii) If (a_n) is a sequence in A with a_n → a, then f(a_n) → f(a).

Proof. (i)⇒(ii) Let ε > 0. Since f is continuous at a,

    (∃δ > 0)(∀y ∈ A) |y − a| < δ ⇒ |f(y) − f(a)| < ε.

We want N such that ∀n ≥ N , |f (an ) − f (a)| < ε. By continuity, it is enough


to find N such that ∀n ≥ N , |an − a| < δ. Since an → a, such an N exists.
(ii)⇒(i) We prove the contrapositive: Suppose f is not continuous at a. Then

(∃ε > 0)(∀δ > 0)(∃y ∈ A) |y − a| < δ and |f (y) − f (a)| ≥ ε.

For each n, we can therefore pick a_n ∈ A such that |a_n − a| < 1/n and
|f(a_n) − f(a)| ≥ ε. But then a_n → a (by the Archimedean property), but
f(a_n) ↛ f(a).

Example.

(i) Let

        f(x) = { −1   if x < 0,
               { 1    if x ≥ 0.

    Then f is not continuous, because −1/n → 0 but f(−1/n) → −1 ≠ f(0).

(ii) Let f : Q → R with

        f(x) = { 1   if x² > 2,
               { 0   if x² < 2.

    Then f is continuous. For every a ∈ Q, we can find an interval about a on
    which f is constant. So f is continuous at a.

(iii) Let

        f(x) = { sin(1/x)   if x ≠ 0,
               { 0          if x = 0.

    Then f is discontinuous at 0. For example, let a_n = 1/((2n + 0.5)π). Then
    a_n → 0 and f(a_n) → 1 ≠ f(0).
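The sequence criterion makes example (iii) easy to check numerically (a_n exactly as in the text):

```python
import math

# f(x) = sin(1/x) for x != 0, f(0) = 0
f = lambda x: math.sin(1 / x) if x != 0 else 0.0

for n in range(1, 50):
    a_n = 1 / ((2 * n + 0.5) * math.pi)
    assert a_n < 1 / n                   # a_n -> 0
    # f(a_n) = sin((2n + 0.5) pi) = 1, which never approaches f(0) = 0
    assert abs(f(a_n) - 1.0) < 1e-9
```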
We can use this sequence definition as the definition for continuous functions.
This has the advantage of being cleaner to write and easier to work with. In
particular, we can reuse a lot of our sequence theorems to prove the analogous
results for continuous functions.
Lemma. Let A ⊆ R and f, g : A → R be continuous functions. Then
(i) f + g is continuous

(ii) f g is continuous
(iii) if g never vanishes, then f /g is continuous.
Proof.
(i) Let a ∈ A and let (an ) be a sequence in A with an → a. Then

(f + g)(an ) = f (an ) + g(an ).

But f (an ) → f (a) and g(an ) → g(a). So

f (an ) + g(an ) → f (a) + g(a) = (f + g)(a).

(ii) and (iii) are proved in exactly the same way.

With this lemma, from the fact that constant functions and f (x) = x are
continuous, we know that all polynomials are continuous. Similarly, rational
functions P (x)/Q(x) are continuous except when Q(x) = 0.
Lemma. Let A, B ⊆ R and f : A → B, g : B → R. Then if f and g are
continuous, g ◦ f : A → R is continuous.
Proof. We offer two proofs:

(i) Let (an ) be a sequence in A with an → a ∈ A. Then f (an ) → f (a) since


f is continuous. Then g(f (an )) → g(f (a)) since g is continuous. So g ◦ f
is continuous.
(ii) Let a ∈ A and ε > 0. Since g is continuous at f (a), there exists η > 0 such
that ∀z ∈ B, |z − f (a)| < η ⇒ |g(z) − g(f (a))| < ε.
Since f is continuous at a, ∃δ > 0 such that ∀y ∈ A, |y − a| < δ ⇒
|f (y) − f (a)| < η. Therefore |y − a| < δ ⇒ |g(f (y)) − g(f (a))| < ε.

There are two important theorems regarding continuous functions — the


maximum value theorem and the intermediate value theorem.
Theorem (Maximum value theorem). Let [a, b] be a closed interval in R and
let f : [a, b] → R be continuous. Then f is bounded and attains its bounds, i.e.
f(x) = sup f for some x, and f(y) = inf f for some y.
Proof. If f is not bounded above, then for each n, we can find xn ∈ [a, b] such
that f (xn ) ≥ n for all n.
By Bolzano-Weierstrass, since xn ∈ [a, b] and is bounded, the sequence
(xn ) has a convergent subsequence (xnk ). Let x be its limit. Then since f is
continuous, f (xnk ) → f (x). But f (xnk ) ≥ nk → ∞. So this is a contradiction.
Now let C = sup{f(x) : x ∈ [a, b]}. Then for every n, we can find x_n
such that f(x_n) ≥ C − 1/n. So by Bolzano-Weierstrass, (x_n) has a convergent
subsequence (x_{n_k}). Since C − 1/n_k ≤ f(x_{n_k}) ≤ C, f(x_{n_k}) → C.
Therefore if x = lim x_{n_k}, then f(x) = C.
A similar argument applies if f is unbounded below.
Theorem (Intermediate value theorem). Let a < b ∈ R and let f : [a, b] → R
be continuous. Suppose that f (a) < 0 < f (b). Then there exists an x ∈ (a, b)
such that f (x) = 0.

Proof. We have several proofs:


(i) Let A = {x : f(x) < 0} and let s = sup A. We shall show that f(s) = 0
    (this is similar to the proof that √2 exists in Numbers and Sets). If
    f(s) < 0, then setting ε = |f(s)| in the definition of continuity, we can find
    δ > 0 such that ∀y, |y − s| < δ ⇒ f(y) < 0. Then s + δ/2 ∈ A, so s is not
    an upper bound. Contradiction.
    If f(s) > 0, by the same argument, we can find δ > 0 such that ∀y,
    |y − s| < δ ⇒ f(y) > 0. So s − δ/2 is a smaller upper bound.

(ii) Let a_0 = a, b_0 = b. By repeated bisection, construct nested intervals
    [a_n, b_n] such that b_n − a_n = (b_0 − a_0)/2^n and f(a_n) < 0 ≤ f(b_n).
    Then by the nested intervals property, we can find x ∈ ∩_{n=0}^∞ [a_n, b_n].
    Since b_n − a_n → 0, we have a_n, b_n → x.
    Since f(a_n) < 0 for every n, f(x) ≤ 0. Similarly, since f(b_n) ≥ 0 for every
    n, f(x) ≥ 0. So f(x) = 0.

It is easy to generalize this to get that, if f (a) < c < f (b), then ∃x ∈ (a, b)
such that f (x) = c, by applying the result to f (x) − c. Also, we can assume
instead that f (b) < c < f (a) and obtain the same result by looking at −f (x).
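Proof (ii) above is essentially the bisection method from numerical analysis; a minimal runnable sketch (the example function and step count are illustrative):

```python
def bisect_root(f, a, b, steps=60):
    """Assuming f(a) < 0 < f(b) and f continuous, return x with f(x) ~ 0."""
    for _ in range(steps):
        m = (a + b) / 2
        if f(m) < 0:
            a = m          # keep f(a_n) < 0
        else:
            b = m          # keep f(b_n) >= 0
    return (a + b) / 2     # the nested intervals shrink down to a root

x = bisect_root(lambda t: t * t - 2, 0.0, 2.0)
assert abs(x * x - 2) < 1e-9   # x approximates sqrt(2)
```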
Corollary. Let f : [a, b] → [c, d] be a continuous strictly increasing function
with f (a) = c, f (b) = d. Then f is invertible and its inverse is continuous.
Proof. Since f is strictly increasing, it is an injection (suppose x ≠ y; wlog
x < y; then f(x) < f(y) and so f(x) ≠ f(y)). Now let y ∈ (c, d). By the
intermediate value theorem, there exists x ∈ (a, b) such that f (x) = y. So f is a
surjection. So it is a bijection and hence invertible.
Let g be the inverse. Let y ∈ [c, d] and let ε > 0. Let x = g(y). So f (x) = y.
Let u = f (x − ε) and v = f (x + ε) (if y = c or d, make the obvious adjustments).
Then u < y < v. So we can find δ > 0 such that (y − δ, y + δ) ⊆ (u, v). Then
|z − y| < δ ⇒ g(z) ∈ (x − ε, x + ε) ⇒ |g(z) − g(y)| < ε.

With this corollary, we can create more continuous functions, e.g. √x.

4.2 Continuous induction*


Continuous induction is a generalization of induction on natural numbers. It
provides an alternative mechanism to prove certain results we have shown.

Proposition (Continuous induction v1). Let a < b and let A ⊆ [a, b] have the
following properties:
(i) a ∈ A
(ii) If x ∈ A and x ≠ b, then ∃y ∈ A with y > x.

(iii) If ∀ε > 0, ∃y ∈ A : y ∈ (x − ε, x], then x ∈ A.


Then b ∈ A.
Proof. Since a ∈ A, A 6= ∅. A is also bounded above by b. So let s = sup A.
Then ∀ε > 0, ∃y ∈ A such that y > s − ε. Therefore, by (iii), s ∈ A.
If s ≠ b, then by (ii), we can find y ∈ A such that y > s, contradicting the
fact that s is an upper bound. So s = b ∈ A.

It can also be formulated as follows:


Proposition (Continuous induction v2). Let A ⊆ [a, b] and suppose that
(i) a ∈ A

(ii) If [a, x] ⊆ A and x ≠ b, then there exists y > x such that [a, y] ⊆ A.

(iii) If [a, x) ⊆ A, then [a, x] ⊆ A.


Then A = [a, b].
Proof. We prove that version 1 ⇒ version 2. Suppose A satisfies the conditions
of v2. Let A′ = {x ∈ [a, b] : [a, x] ⊆ A}.
Then a ∈ A′. If x ∈ A′ with x ≠ b, then [a, x] ⊆ A. So ∃y > x such that
[a, y] ⊆ A. So ∃y > x such that y ∈ A′.
If ∀ε > 0, ∃y ∈ (x − ε, x] such that [a, y] ⊆ A, then [a, x) ⊆ A. So by (iii),
[a, x] ⊆ A, so x ∈ A′. So A′ satisfies properties (i) to (iii) of version 1.
Therefore b ∈ A′. So [a, b] ⊆ A. So A = [a, b].
We reprove intermediate value theorem here:
Theorem (Intermediate value theorem). Let a < b ∈ R and let f : [a, b] → R
be continuous. Suppose that f (a) < 0 < f (b). Then there exists an x ∈ (a, b)
such that f (x) = 0.
Proof. Assume that f is continuous. Suppose f(a) < 0 < f(b). Assume that
(∀x) f(x) ≠ 0, and derive a contradiction.
Let A = {x : f(x) < 0}. Then a ∈ A. If x ∈ A, then f(x) < 0, and by
continuity, we can find δ > 0 such that |y − x| < δ ⇒ f(y) < 0. So if x ≠ b,
then we can find y ∈ A such that y > x.
We prove the contrapositive of the last condition, i.e.

    x ∉ A ⇒ (∃δ > 0)(∀y ∈ A) y ∉ (x − δ, x].

If x ∉ A, then f(x) > 0 (we assume that f is never zero; if not, we're done).
Then by continuity, ∃δ > 0 such that |y − x| < δ ⇒ f(y) > 0. So y ∉ A.
Hence by continuous induction, b ∈ A. Contradiction.
Now we prove that continuous functions in closed intervals are bounded.
Theorem. Let [a, b] be a closed interval in R and let f : [a, b] → R be continuous.
Then f is bounded.
Proof. Let f : [a, b] → R be continuous. Let A = {x : f is bounded on [a, x]}. Then
a ∈ A. If x ∈ A, x 6= b, then ∃δ > 0 such that |y − x| < δ ⇒ |f (y) − f (x)| < 1.
So ∃y > x (e.g. take min{x + δ/2, b}) such that f is bounded on [a, y], which
implies that y ∈ A.
Now suppose that ∀ε > 0, ∃y ∈ (x − ε, x] such that y ∈ A. Again, we can
find δ > 0 such that f is bounded on (x − δ, x + δ), and in particular on (x − δ, x].
Pick y such that f is bounded on [a, y] and y > x − δ. Then f is bounded on
[a, x]. So x ∈ A.
So we are done by continuous induction.
Finally, we can prove a theorem that we have not yet proven.
Definition (Cover of a set). Let A ⊆ R. A cover of A by open intervals is a
set {I_γ : γ ∈ Γ} where each I_γ is an open interval and A ⊆ ∪_{γ∈Γ} I_γ.
A finite subcover is a finite subset {Iγ1 , · · · , Iγn } of the cover that is still a
cover.
Not every cover has a finite subcover. For example, the cover {( n1 , 1) : n ∈ N}
of (0, 1) has no finite subcover.

Theorem (Heine-Borel*). Every cover of a closed, bounded interval [a, b] by


open intervals has a finite subcover. We say closed intervals are compact (cf.
Metric and Topological Spaces).
Proof. Let {Iγ : γ ∈ Γ} be a cover of [a, b] by open intervals. Let A = {x : [a, x]
can be covered by finitely many of the Iγ }.
Then a ∈ A since a must belong to some Iγ .
If x ∈ A, then pick γ such that x ∈ Iγ. Then if x ≠ b, since Iγ is an open
interval, it contains [x, y] for some y > x. Then [a, y] can be covered by finitely
many Iγ , by taking a finite cover for [a, x] and the Iγ that contains x.
Now suppose that ∀ε > 0, ∃y ∈ A such that y ∈ (x − ε, x].
Let Iγ be an open interval containing x. Then it contains (x − ε, x] for some
ε > 0. Pick y ∈ A such that y ∈ (x−ε, x]. Now combine Iγ with a finite subcover
of [a, y] to get a finite subcover of [a, x]. So x ∈ A.
Then done by continuous induction.
We can use Heine-Borel to prove that continuous functions on [a, b] are
bounded.
Theorem. Let [a, b] be a closed interval in R and let f : [a, b] → R be continuous.
Then f is bounded and attains its bounds, i.e. f(x) = sup f for some x, and
f(y) = inf f for some y.
Proof. Let f : [a, b] → R be continuous. Then by continuity,

(∀x ∈ [a, b])(∃δx > 0)(∀y) |y − x| < δx ⇒ |f (y) − f (x)| < 1.

For each x ∈ [a, b], let I_x = (x − δ_x, x + δ_x). So by Heine-Borel, we can
find x_1, · · · , x_n such that [a, b] ⊆ ∪_{i=1}^n (x_i − δ_{x_i}, x_i + δ_{x_i}).
But f is bounded in each interval (x_i − δ_{x_i}, x_i + δ_{x_i}) by |f(x_i)| + 1.
So it is bounded on [a, b] by max_i |f(x_i)| + 1.

5 Differentiability
In the remainder of the course, we will properly develop calculus, and put
differentiation and integration on a rigorous foundation. Every notion will be
given a proper definition which we will use to prove results like the product and
quotient rule.

5.1 Limits
First of all, we need the notion of limits. Recall that we’ve previously had limits
for sequences. Now, we will define limits for functions.
Definition (Limit of functions). Let A ⊆ R and let f : A → R. We say

    lim_{x→a} f(x) = ℓ,

or "f(x) → ℓ as x → a", if

    (∀ε > 0)(∃δ > 0)(∀x ∈ A) 0 < |x − a| < δ ⇒ |f(x) − ℓ| < ε.

We couldn’t care less what happens when x = a, hence the strict inequality
0 < |x − a|. In fact, f doesn’t even have to be defined at x = a.

Example. Let

    f(x) = { x   if x ≠ 2,
           { 3   if x = 2.

Then lim_{x→2} f(x) = 2, even though f(2) = 3.

Example. Let f(x) = (sin x)/x. Then f(0) is not defined, but lim_{x→0} f(x) = 1.
We will see a proof later, after we define what sin means.

We notice that the definition of the limit is suspiciously similar to that of


continuity. In fact, if we define

    g(x) = { f(x)   if x ≠ a,
           { ℓ      if x = a,

then f(x) → ℓ as x → a iff g is continuous at a.


Alternatively, f is continuous at a if f(x) → f(a) as x → a. It follows also
that f(x) → ℓ as x → a iff f(x_n) → ℓ for every sequence (x_n) in A with x_n → a.
The previous limit theorems for sequences apply here as well.

Proposition. If f(x) → ℓ and g(x) → m as x → a, then f(x) + g(x) → ℓ + m,
f(x)g(x) → ℓm, and f(x)/g(x) → ℓ/m if g and m don't vanish.

5.2 Differentiation
Similar to what we did in IA Differential Equations, we define the derivative as
a limit.

Definition (Differentiable function). f is differentiable at a with derivative λ if

    lim_{x→a} (f(x) − f(a))/(x − a) = λ.

Equivalently, if

    lim_{h→0} (f(a + h) − f(a))/h = λ.

We write λ = f′(a).
Here we see why, in the definition of the limit, we say that we don't care
what happens when x = a. In our definition here, the quotient is 0/0 when
x = a, and we can't make any sense out of what happens when x = a.
Alternatively, we write the definition of differentiation as

    (f(x + h) − f(x))/h = f′(x) + ε(h),

where ε(h) → 0 as h → 0. Rearranging, we can deduce that

    f(x + h) = f(x) + hf′(x) + hε(h).

Note that by the definition of the limit, we don't have to care what value ε takes
when h = 0. It can be 0, π or 10^{10^{10}}. However, we usually take ε(0) = 0
so that ε is continuous.
Using the small-o notation, we usually write o(h) for a function that satisfies
o(h)/h → 0 as h → 0. Hence we have

Proposition.

    f(x + h) = f(x) + hf′(x) + o(h).

We can interpret this as an approximation of f(x + h): the term f(x) + hf′(x)
is a linear approximation, and o(h) is the error term. Differentiability shows
that this is a very good approximation, with small o(h) error.
Conversely, we have
Proposition. If f (x + h) = f (x) + hf 0 (x) + o(h), then f is differentiable at x
with derivative f 0 (x).
Proof.

    (f(x + h) − f(x))/h = f′(x) + o(h)/h → f′(x).
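A numerical illustration of the o(h) error term (the example function is mine): for f(x) = x³, the difference between the difference quotient and f′(x) shrinks with h.

```python
f = lambda x: x ** 3
fprime = lambda x: 3 * x ** 2
x = 2.0

def eps(h):
    """The error term (f(x+h) - f(x))/h - f'(x); here it is 3x*h + h^2."""
    return (f(x + h) - f(x)) / h - fprime(x)

# eps(h) -> 0 as h -> 0, so f(x + h) = f(x) + h f'(x) + o(h).
assert abs(eps(1e-3)) < 1e-2
assert abs(eps(1e-6)) < 1e-5
```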

We can take derivatives multiple times, and get multiple derivatives.


Definition (Multiple derivatives). This is defined recursively: f is (n + 1)-
times differentiable if it is n-times differentiable and its nth derivative f (n) is
differentiable. We write f (n+1) for the derivative of f (n) , i.e. the (n + 1)th
derivative of f .
Informally, we will say f is n-times differentiable if we can differentiate it n
times, and the nth derivative is f (n) .

We can prove the usual rules of differentiation using the small o-notation. It
can also be proven by considering limits directly, but the notation will become a
bit more daunting.
Lemma (Sum and product rule). Let f, g be differentiable at x. Then f + g
and fg are differentiable at x, with

    (f + g)′(x) = f′(x) + g′(x)
    (fg)′(x) = f′(x)g(x) + f(x)g′(x)

Proof.

    (f + g)(x + h) = f(x + h) + g(x + h)
                   = f(x) + hf′(x) + o(h) + g(x) + hg′(x) + o(h)
                   = (f + g)(x) + h(f′(x) + g′(x)) + o(h)

    fg(x + h) = f(x + h)g(x + h)
              = [f(x) + hf′(x) + o(h)][g(x) + hg′(x) + o(h)]
              = f(x)g(x) + h[f′(x)g(x) + f(x)g′(x)]
                + o(h)[g(x) + f(x) + hf′(x) + hg′(x) + o(h)] + h²f′(x)g′(x),

where everything after the first two terms is the error term. By limit theorems,
the error term is o(h). So we can write this as

    fg(x + h) = f(x)g(x) + h(f′(x)g(x) + f(x)g′(x)) + o(h).

Lemma (Chain rule). If f is differentiable at x and g is differentiable at f(x),
then g ∘ f is differentiable at x with derivative g′(f(x))f′(x).

Proof. In this proof, we use hε(h) instead of o(h). Writing f(x + h) =
f(x) + hf′(x) + hε₁(h), we have

    (g ∘ f)(x + h) = g(f(x + h))
                   = g[f(x) + hf′(x) + hε₁(h)]
                   = g(f(x)) + [hf′(x) + hε₁(h)] g′(f(x))
                     + [hf′(x) + hε₁(h)] ε₂(hf′(x) + hε₁(h))
                   = g ∘ f(x) + hg′(f(x))f′(x)
                     + h[ε₁(h)g′(f(x)) + (f′(x) + ε₁(h)) ε₂(hf′(x) + hε₁(h))],

where the bracketed term in the last line is the error term. We want to show
that the error term is o(h), i.e. it divided by h tends to 0 as h → 0.
But ε₁(h)g′(f(x)) → 0, f′(x) + ε₁(h) is bounded, and ε₂(hf′(x) + hε₁(h)) → 0
because hf′(x) + hε₁(h) → 0 and ε₂(0) = 0. So our error term is o(h).
We usually don’t write out the error terms so explicitly, and just use heuristics
like f (x + o(h)) = f (x) + o(h); o(h) + o(h) = o(h); and g(x) · o(h) = o(h) for any
(bounded) function g.
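These heuristics can be sanity-checked numerically; e.g. for g ∘ f with f(x) = x² and g = sin (illustrative choices), the difference quotient approaches g′(f(x))·f′(x):

```python
import math

f, fp = lambda x: x * x, lambda x: 2 * x   # f and f'
g, gp = math.sin, math.cos                  # g and g'
x = 1.0

expected = gp(f(x)) * fp(x)   # chain rule: g'(f(x)) * f'(x)
for h in (1e-3, 1e-5):
    quotient = (g(f(x + h)) - g(f(x))) / h
    # the difference quotient agrees with the chain rule up to O(h)
    assert abs(quotient - expected) < 10 * h
```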

Example.

(i) Constant functions are differentiable with derivative 0.


(ii) f (x) = λx is differentiable with derivative λ.
(iii) Using the product rule, we can show that xn is differentiable with derivative
nxn−1 by induction.

(iv) Hence all polynomials are differentiable.


Example. Let f(x) = 1/x. If x ≠ 0, then

    (f(x + h) − f(x))/h = (1/(x + h) − 1/x)/h = (−h/(x(x + h)))/h
                        = −1/(x(x + h)) → −1/x²

by limit theorems.
Lemma (Quotient rule). If f and g are differentiable at x, and g(x) ≠ 0, then
f/g is differentiable at x with derivative

    (f/g)′(x) = (f′(x)g(x) − g′(x)f(x))/g(x)².

Proof. First note that 1/g(x) = h(g(x)) where h(y) = 1/y. So 1/g(x) is
differentiable at x with derivative −g′(x)/g(x)² by the chain rule.
By the product rule, f/g is differentiable at x with derivative

    f′(x)/g(x) − f(x) g′(x)/g(x)² = (f′(x)g(x) − f(x)g′(x))/g(x)².

Lemma. If f is differentiable at x, then it is continuous at x.

Proof. As y → x, (f(y) − f(x))/(y − x) → f′(x). Since y − x → 0, we get
f(y) − f(x) → 0 by the product theorem of limits. So f(y) → f(x). So f is
continuous at x.
Theorem. Let f : [a, b] → [c, d] be differentiable on (a, b), continuous on [a, b],
and strictly increasing. Suppose that f 0 (x) never vanishes. Suppose further that
f (a) = c and f (b) = d. Then f has an inverse g and for each y ∈ (c, d), g is
differentiable at y with derivative 1/f 0 (g(y)).
In human language, this states that if f is invertible, then the derivative of
f −1 is 1/f 0 .
Note that the conditions will (almost) always require f to be differentiable
on open interval (a, b), continuous on closed interval [a, b]. This is because it
doesn’t make sense to talk about differentiability at a or b since the definition of
f 0 (a) requires f to be defined on both sides of a.

Proof. g exists by an earlier theorem about inverses of continuous functions.
Let y, y + k ∈ (c, d). Let x = g(y), x + h = g(y + k).
Since g(y + k) = x + h, we have y + k = f(x + h). So k = f(x + h) − y =
f(x + h) − f(x). So

    (g(y + k) − g(y))/k = ((x + h) − x)/(f(x + h) − f(x))
                        = [(f(x + h) − f(x))/h]⁻¹.

As k → 0, since g is continuous, g(y + k) → g(y). So h → 0. So

    (g(y + k) − g(y))/k → [f′(x)]⁻¹ = [f′(g(y))]⁻¹.

Example. Let f(x) = x^{1/2} for x > 0. Then f is the inverse of g(x) = x². So

    f′(x) = 1/g′(f(x)) = 1/(2x^{1/2}) = (1/2)x^{−1/2}.

Similarly, we can show that the derivative of x^{1/q} is (1/q)x^{1/q−1}.
Then let's take x^{p/q} = (x^{1/q})^p. By the chain rule, its derivative is

    p(x^{1/q})^{p−1} · (1/q)x^{1/q−1} = (p/q) x^{(p−1)/q + 1/q − 1} = (p/q) x^{p/q − 1}.
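The inverse-derivative formula is easy to check numerically; a small sketch with g(x) = x² (helper names mine):

```python
import math

def f_prime_via_inverse(x):
    """Derivative of f(x) = sqrt(x) computed as 1/g'(f(x)) for g(t) = t^2."""
    g_prime = lambda t: 2 * t
    return 1 / g_prime(math.sqrt(x))

# Agrees with the direct formula f'(x) = (1/2) x^(-1/2).
for x in (0.25, 1.0, 9.0):
    assert abs(f_prime_via_inverse(x) - 0.5 * x ** (-0.5)) < 1e-12
```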

5.3 Differentiation theorems


Everything we’ve had so far is something we already know. It’s just that now we
can prove them rigorously. In this section, we will come up with genuinely new
theorems, including but not limited to Taylor’s theorem, which gives us Taylor’s
series.
Theorem (Rolle’s theorem). Let f be continuous on a closed interval [a, b]
(with a < b) and differentiable on (a, b). Suppose that f (a) = f (b). Then there
exists x ∈ (a, b) such that f 0 (x) = 0.

It is intuitively obvious: if you move up and down, and finally return to the
same point, then you must have changed direction some time. Then f 0 (x) = 0
at that time.
Proof. If f is constant, then we're done.
Otherwise, there exists u such that f(u) ≠ f(a). wlog, f(u) > f(a). Since f
is continuous, it has a maximum, and since f(u) > f(a) = f(b), the maximum
is not attained at a or b.
Suppose the maximum is attained at x ∈ (a, b). Then for any h ≠ 0, we have

    (f(x + h) − f(x))/h  { ≤ 0   if h > 0,
                         { ≥ 0   if h < 0,

since f(x + h) − f(x) ≤ 0 by maximality of f(x). By considering both sides as
we take the limit h → 0, we know that f′(x) ≤ 0 and f′(x) ≥ 0. So f′(x) = 0.

Corollary (Mean value theorem). Let f be continuous on [a, b] (a < b), and
differentiable on (a, b). Then there exists x ∈ (a, b) such that

    f′(x) = (f(b) − f(a))/(b − a).

Note that (f(b) − f(a))/(b − a) is the slope of the line joining (a, f(a)) and
(b, f(b)).

[Figure: graph of f with the chord from (a, f(a)) to (b, f(b)); at some point
between a and b the tangent is parallel to the chord]

The mean value theorem is sometimes described as "rotate your head and
apply Rolle's". However, if we actually rotate it, we might end up with a
non-function. What we actually want is a shear.

Proof. Let

    g(x) = f(x) − ((f(b) − f(a))/(b − a)) x.

Then

    g(b) − g(a) = f(b) − f(a) − ((f(b) − f(a))/(b − a)) (b − a) = 0.

So by Rolle's theorem, we can find x ∈ (a, b) such that g′(x) = 0. So

    f′(x) = (f(b) − f(a))/(b − a),

as required.
We’ve always assumed that if a function has a positive derivative everywhere,
then the function is increasing. However, it turns out that this is really hard to
prove directly. It does, however, follow quite immediately from the mean value
theorem.
Example. Suppose f′(x) > 0 for every x ∈ (a, b). Then for u < v in [a, b], we
can find w ∈ (u, v) such that

    (f(v) − f(u))/(v − u) = f′(w) > 0.

It follows that f(v) > f(u). So f is strictly increasing.
Similarly, if f′(x) ≥ 2 for every x and f(0) = 0, then f(1) ≥ 2, since we can
find x ∈ (0, 1) such that

    2 ≤ f′(x) = (f(1) − f(0))/(1 − 0) = f(1).

Theorem (Local version of inverse function theorem). Let f be a function with
continuous derivative on (a, b).
Let x ∈ (a, b) and suppose that f'(x) ≠ 0. Then there is an open interval
(u, v) containing x on which f is invertible (as a function from (u, v) to f((u, v))).
Moreover, if g is the inverse, then g'(f(z)) = 1/f'(z) for every z ∈ (u, v).
This says that if f has a non-zero derivative, then it has an inverse locally,
and the derivative of the inverse is 1/f'.
Note that this not only requires f to be differentiable, but the derivative
itself also has to be continuous.
Proof. wlog, f'(x) > 0. By the continuity of f', we can find δ > 0 such that
f'(z) > 0 for every z ∈ (x − δ, x + δ). By the mean value theorem, f is strictly
increasing on (x − δ, x + δ), hence injective. Also, f is continuous on (x − δ, x + δ)
by differentiability.
Then we are done by the inverse function theorem.
Finally, we are going to prove Taylor’s theorem. To do so, we will first need
some lemmas.
Theorem (Higher-order Rolle's theorem). Let f be continuous on [a, b] (a < b)
and n-times differentiable on an open interval containing [a, b]. Suppose that
\[
f(a) = f'(a) = f^{(2)}(a) = \cdots = f^{(n-1)}(a) = f(b) = 0.
\]
Then ∃x ∈ (a, b) such that f^{(n)}(x) = 0.
Proof. Induct on n. The n = 1 base case is just Rolle's theorem.
Suppose we have k < n and x_k ∈ (a, b) such that f^{(k)}(x_k) = 0. Since
f^{(k)}(a) = 0, we can find x_{k+1} ∈ (a, x_k) such that f^{(k+1)}(x_{k+1}) = 0 by Rolle's
theorem.
So the result follows by induction.
Corollary. Suppose that f and g are both n-times differentiable on an open interval
containing [a, b] and that f^{(k)}(a) = g^{(k)}(a) for k = 0, 1, ..., n − 1, and also
f(b) = g(b). Then there exists x ∈ (a, b) such that f^{(n)}(x) = g^{(n)}(x).
Proof. Apply generalised Rolle's to f − g.
Now we shall show that for any f, we can find a polynomial p of degree at
most n that satisfies the conditions for g, i.e. a p such that p^{(k)}(a) = f^{(k)}(a) for
k = 0, 1, ..., n − 1 and p(b) = f(b).
A useful ingredient is the observation that if
\[
Q_k(x) = \frac{(x - a)^k}{k!},
\]
then
\[
Q_k^{(j)}(a) =
\begin{cases}
1 & j = k\\
0 & j \ne k.
\end{cases}
\]
Therefore, if
\[
Q(x) = \sum_{k=0}^{n-1} f^{(k)}(a)\, Q_k(x),
\]


then
\[
Q^{(j)}(a) = f^{(j)}(a)
\]
for j = 0, 1, ..., n − 1. To get p(b) = f(b), we use our nth degree polynomial
term:
\[
p(x) = Q(x) + \frac{(x - a)^n}{(b - a)^n}\bigl(f(b) - Q(b)\bigr).
\]
Then our final term does not mess up our first n − 1 derivatives, and gives
p(b) = f(b).
By the previous corollary, we can find x ∈ (a, b) such that
\[
f^{(n)}(x) = p^{(n)}(x).
\]
That is,
\[
f^{(n)}(x) = \frac{n!}{(b - a)^n}\bigl(f(b) - Q(b)\bigr).
\]
Therefore
\[
f(b) = Q(b) + \frac{(b - a)^n}{n!} f^{(n)}(x).
\]
Alternatively,
\[
f(b) = f(a) + (b - a)f'(a) + \cdots + \frac{(b - a)^{n-1}}{(n - 1)!} f^{(n-1)}(a) + \frac{(b - a)^n}{n!} f^{(n)}(x).
\]
Setting b = a + h, we can rewrite this as
Theorem (Taylor's theorem with the Lagrange form of remainder).
\[
f(a + h) = \underbrace{f(a) + hf'(a) + \cdots + \frac{h^{n-1}}{(n - 1)!} f^{(n-1)}(a)}_{(n-1)\text{-degree approximation to } f \text{ near } a} + \underbrace{\frac{h^n}{n!} f^{(n)}(x)}_{\text{error term}}
\]
for some x ∈ (a, a + h).
Strictly speaking, we only proved it for the case when h > 0, but we can
easily show it holds for h < 0 too by considering g(x) = f(−x).
Note that the remainder term is not necessarily small, but this often gives
us the best (n − 1)-degree approximation to f near a. For example, if f^{(n)} is
bounded by C near a, then
\[
\left|\frac{h^n}{n!} f^{(n)}(x)\right| \le \frac{C}{n!} |h|^n = o(h^{n-1}).
\]

Example. Let f : R → R be a differentiable function such that f(0) = 1 and
f'(x) = f(x) for every x (intuitively, we know it is e^x, but we haven't defined
that yet!). Then for every x, we have
\[
f(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!}.
\]
While it seems like we can prove this works by differentiating it and seeing that
f'(x) = f(x), the sum rule only applies to finite sums. We don't know we can
differentiate a sum term by term. So we have to use Taylor's theorem.


Since f'(x) = f(x), it follows that all derivatives exist. By Taylor's theorem,
\[
f(x) = f(0) + f'(0)x + \frac{f^{(2)}(0)}{2!} x^2 + \cdots + \frac{f^{(n-1)}(0)}{(n - 1)!} x^{n-1} + \frac{f^{(n)}(u)}{n!} x^n
\]
for some u between 0 and x. That is,
\[
f(x) = \sum_{k=0}^{n-1} \frac{x^k}{k!} + \frac{f^{(n)}(u)}{n!} x^n.
\]
We must show that the remainder term \frac{f^{(n)}(u)}{n!} x^n → 0 as n → ∞. Note here
that x is fixed, but u can depend on n.
But we know that f^{(n)}(u) = f(u), and since f is differentiable, it is continuous,
hence bounded on [0, x]. Suppose |f(u)| ≤ C on [0, x]. Then
\[
\left|\frac{f^{(n)}(u)}{n!} x^n\right| \le \frac{C}{n!} |x|^n \to 0
\]
from limit theorems. So it follows that
\[
f(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!}.
\]
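To see the convergence numerically (our own sketch; the sample point x = 2.5 and the 30-term cutoff are arbitrary choices), compare the partial sums of Σ xⁿ/n! with the library exponential:

```python
import math

# Partial sums of sum_{n<N} x^n / n!, built incrementally: term = x^k / k!.
def exp_partial(x, n_terms):
    total, term = 0.0, 1.0
    for k in range(n_terms):
        total += term
        term *= x / (k + 1)  # x^k/k!  ->  x^(k+1)/(k+1)!
    return total

x = 2.5
approx = exp_partial(x, 30)
print(approx, math.exp(x))  # the two values agree to many decimal places
```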

5.4 Complex differentiation


Definition (Complex differentiability). Let f : C → C. Then f is differentiable
at z with derivative f'(z) if
\[
\lim_{h \to 0} \frac{f(z + h) - f(z)}{h} \text{ exists and equals } f'(z).
\]
Equivalently,
\[
f(z + h) = f(z) + hf'(z) + o(h).
\]
This is exactly the same definition as for real differentiation, but it has very
different properties!
All the usual rules — chain rule, product rule etc. also apply (with the same
proofs). Also the derivatives of polynomials are what you expect. However, there
are some more interesting cases.
Example. f(z) = z̄ is not differentiable:
\[
\frac{\overline{z + h} - \bar{z}}{h} = \frac{\bar{h}}{h} =
\begin{cases}
1 & h \text{ is real}\\
-1 & h \text{ is purely imaginary.}
\end{cases}
\]

If this seems weird, this is because we often think of C as R², but they are
not the same. For example, reflection is a linear map in R², but not in C. A
linear map in C is something of the form x ↦ bx, which can only be a dilation
or rotation, not a reflection or other weird things.
Example. f(z) = |z| is also not differentiable. If it were, then |z|² would be as
well (by the product rule). So would |z|²/z = z̄ when z ≠ 0, by the quotient rule.
At z = 0, it is certainly not differentiable, since |x| is not even differentiable on R.


6 Complex power series


Before we move on to integration, we first have a look at complex power series.
This will allow us to define the familiar exponential and trigonometric functions.

Definition (Complex power series). A complex power series is a series of the
form
\[
\sum_{n=0}^{\infty} a_n z^n,
\]
where z ∈ C and a_n ∈ C for all n. When it converges, it is a function of z.


When considering complex power series, a very important concept is the
radius of convergence. To make sense of this concept, we first need the following
lemma:
Lemma. Suppose that Σ a_n z^n converges and |w| < |z|. Then Σ a_n w^n converges
(absolutely).
Proof. We know that
\[
|a_n w^n| = |a_n z^n| \cdot \left|\frac{w}{z}\right|^n.
\]
Since Σ a_n z^n converges, the terms a_n z^n are bounded. So pick C such that
\[
|a_n z^n| \le C
\]
for every n. Then
\[
0 \le \sum_{n=0}^{\infty} |a_n w^n| \le \sum_{n=0}^{\infty} C \left|\frac{w}{z}\right|^n,
\]
which converges (geometric series, since |w/z| < 1). So by the comparison test,
Σ a_n w^n converges absolutely.

It follows that if Σ a_n z^n does not converge and |w| > |z|, then Σ a_n w^n does
not converge.
Now let R = sup{|z| : Σ a_n z^n converges} (R may be infinite). If |z| < R,
then we can find z₀ with |z₀| ∈ (|z|, R] such that Σ a_n z₀^n converges. So by the
lemma above, Σ a_n z^n converges. If |z| > R, then Σ a_n z^n diverges by the definition
of R.

Definition (Radius of convergence). The radius of convergence of a power series
Σ a_n z^n is
\[
R = \sup\left\{|z| : \sum a_n z^n \text{ converges}\right\}.
\]
{z : |z| < R} is called the circle of convergence (note to pedants: yes, it is a
disc, not a circle).
If |z| < R, then Σ a_n z^n converges. If |z| > R, then Σ a_n z^n diverges. When
|z| = R, the series can converge at some points and not at others.
Example. Σ_{n=0}^∞ z^n has radius of convergence 1. When |z| = 1, it diverges
(since the terms do not tend to 0).



Example. Σ_{n=1}^∞ z^n/n has radius of convergence 1, since the ratio of the (n + 1)th
term to the nth is
\[
\frac{z^{n+1}/(n + 1)}{z^n/n} = z \cdot \frac{n}{n + 1} \to z.
\]
So if |z| < 1, then the series converges by the ratio test. If |z| > 1, then eventually
the terms are increasing in modulus.
If z = 1, then it diverges (harmonic series). If |z| = 1 and z ≠ 1, it converges
by Abel's test.

Example. The series Σ_{n=1}^∞ z^n/n² converges for |z| ≤ 1 and diverges for |z| > 1.

As evidenced by the above examples, the ratio test can be used to find the
radius of convergence. We also have an alternative test based on the nth root.
Lemma. The radius of convergence of a power series Σ a_n z^n is
\[
R = \frac{1}{\limsup \sqrt[n]{|a_n|}}.
\]
Often \sqrt[n]{|a_n|} converges, so we only have to find the limit.
Proof. Suppose |z| < 1/\limsup \sqrt[n]{|a_n|}. Then |z| \limsup \sqrt[n]{|a_n|} < 1. Therefore
there exist N and ε > 0 such that
\[
\sup_{n \ge N} |z| \sqrt[n]{|a_n|} \le 1 - \varepsilon
\]
by the definition of lim sup. Therefore
\[
|a_n z^n| \le (1 - \varepsilon)^n
\]
for every n ≥ N, which implies (by comparison with a geometric series) that
Σ a_n z^n converges absolutely.
On the other hand, if |z| \limsup \sqrt[n]{|a_n|} > 1, it follows that |z| \sqrt[n]{|a_n|} ≥ 1 for
infinitely many n. Therefore |a_n z^n| ≥ 1 for infinitely many n. So Σ a_n z^n does
not converge.
Example. The radius of convergence of Σ z^n/2^n is 2, because \sqrt[n]{|a_n|} = 1/2 for every
n. So \limsup \sqrt[n]{|a_n|} = 1/2, and 1/\limsup \sqrt[n]{|a_n|} = 2.
But often it is easier to find the radius of convergence by elementary methods
such as the ratio test, e.g. for Σ n² z^n.
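A numerical sketch of the root test (our own illustration): for a_n = n², the quantity |a_n|^{1/n} = n^{2/n} tends to 1, so Σ n² zⁿ has radius of convergence 1. The cutoff N below is an arbitrary choice.

```python
# Estimate |a_N|^(1/N) for a_N = N^2 at a single large N.
N = 10_000
root = (N ** 2) ** (1.0 / N)  # n^(2/n) -> 1 as n -> infinity
print(root)  # slightly above 1
```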

6.1 Exponential and trigonometric functions


Definition (Exponential function). The exponential function is
\[
e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}.
\]
By the ratio test, this converges on all of C.


A fundamental property of this function is that
\[
e^{z+w} = e^z e^w.
\]
Once we have this property, we can say that
Proposition. The derivative of e^z is e^z.
Proof.
\[
\frac{e^{z+h} - e^z}{h} = e^z \left(\frac{e^h - 1}{h}\right) = e^z \left(1 + \frac{h}{2!} + \frac{h^2}{3!} + \cdots\right).
\]
But
\[
\left|\frac{h}{2!} + \frac{h^2}{3!} + \frac{h^3}{4!} + \cdots\right| \le \frac{|h|}{2} + \frac{|h|^2}{4} + \frac{|h|^3}{8} + \cdots = \frac{|h|/2}{1 - |h|/2} \to 0.
\]
So
\[
\frac{e^{z+h} - e^z}{h} \to e^z.
\]

But we must still prove that e^{z+w} = e^z e^w.
Consider two sequences (a_n), (b_n). Their convolution is the sequence (c_n)
defined by
\[
c_n = a_0 b_n + a_1 b_{n-1} + a_2 b_{n-2} + \cdots + a_n b_0.
\]
The relevance of this is that if you take
\[
\left(\sum_{n=0}^{N} a_n z^n\right)\left(\sum_{n=0}^{N} b_n z^n\right) \text{ and } \sum_{n=0}^{N} c_n z^n,
\]
and equate coefficients of z^n, you get
\[
c_n = a_0 b_n + a_1 b_{n-1} + a_2 b_{n-2} + \cdots + a_n b_0.
\]
Theorem. Let Σ_{n=0}^∞ a_n and Σ_{n=0}^∞ b_n be two absolutely convergent series, and
let (c_n) be the convolution of the sequences (a_n) and (b_n). Then Σ_{n=0}^∞ c_n
converges (absolutely), and
\[
\sum_{n=0}^{\infty} c_n = \left(\sum_{n=0}^{\infty} a_n\right)\left(\sum_{n=0}^{\infty} b_n\right).
\]
Proof. We first show that a rearrangement of Σ c_n converges absolutely. Hence
it converges unconditionally, and we can rearrange it back to Σ c_n.
Consider the series
\[
(a_0 b_0) + (a_0 b_1 + a_1 b_1 + a_1 b_0) + (a_0 b_2 + a_1 b_2 + a_2 b_2 + a_2 b_1 + a_2 b_0) + \cdots \tag{$*$}
\]
Let
\[
S_N = \sum_{n=0}^{N} a_n, \quad T_N = \sum_{n=0}^{N} b_n, \quad U_N = \sum_{n=0}^{N} |a_n|, \quad V_N = \sum_{n=0}^{N} |b_n|.
\]


Also let S_N → S, T_N → T, U_N → U, V_N → V (these exist since Σ a_n and Σ b_n
converge absolutely).
If we take the modulus of the terms of (∗) and consider the first (N + 1)²
terms (i.e. the first N + 1 brackets), the sum is U_N V_N. Hence the series of
moduli converges to UV, so (∗) converges absolutely.
The partial sum up to (N + 1)² terms of the series (∗) itself is S_N T_N, which
converges to ST. So the whole series converges to ST.
Since it converges absolutely, it converges unconditionally. Now consider a
rearrangement:
\[
a_0 b_0 + (a_0 b_1 + a_1 b_0) + (a_0 b_2 + a_1 b_1 + a_2 b_0) + \cdots
\]
Then this converges to ST as well. But the partial sum of the first 1 + 2 + ··· + (N + 1)
terms is c_0 + c_1 + ··· + c_N. So
\[
\sum_{n=0}^{N} c_n \to ST = \left(\sum_{n=0}^{\infty} a_n\right)\left(\sum_{n=0}^{\infty} b_n\right).
\]
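The theorem can be checked numerically (our own sketch): take the absolutely convergent geometric series a_n = 2⁻ⁿ and b_n = 3⁻ⁿ, whose sums are 2 and 3/2, and compare the sum of the convolution with the product 3. The truncation length N is an arbitrary choice.

```python
# Cauchy product check for a_n = 2^-n, b_n = 3^-n (both absolutely convergent).
N = 60
a = [2.0 ** -n for n in range(N)]
b = [3.0 ** -n for n in range(N)]
# c_n = a_0 b_n + a_1 b_{n-1} + ... + a_n b_0
c = [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(N)]

lhs = sum(c)           # sum of the convolution
rhs = sum(a) * sum(b)  # product of the sums; both are close to 2 * 3/2 = 3
print(lhs, rhs)
```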

Corollary.
\[
e^z e^w = e^{z+w}.
\]
Proof. By the theorem above (and the definition of e^z),
\[
e^z e^w = \sum_{n=0}^{\infty} \left(1 \cdot \frac{w^n}{n!} + \frac{z}{1!} \cdot \frac{w^{n-1}}{(n - 1)!} + \frac{z^2}{2!} \cdot \frac{w^{n-2}}{(n - 2)!} + \cdots + \frac{z^n}{n!} \cdot 1\right)
\]
\[
= \sum_{n=0}^{\infty} \frac{1}{n!} \left(w^n + \binom{n}{1} z w^{n-1} + \binom{n}{2} z^2 w^{n-2} + \cdots + \binom{n}{n} z^n\right)
\]
\[
= \sum_{n=0}^{\infty} \frac{(z + w)^n}{n!} \qquad \text{by the binomial theorem}
\]
\[
= e^{z+w}.
\]

Note that if (c_n) is the convolution of (a_n) and (b_n), then the convolution
of (a_n z^n) and (b_n z^n) is (c_n z^n). Therefore if both Σ a_n z^n and Σ b_n z^n converge
absolutely, then their product is Σ c_n z^n.
Note that we have now completed the proof that the derivative of e^z is e^z.
Now we define sin z and cos z:
Definition (Sine and cosine).
\[
\sin z = \frac{e^{iz} - e^{-iz}}{2i} = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \frac{z^7}{7!} + \cdots
\]
\[
\cos z = \frac{e^{iz} + e^{-iz}}{2} = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \frac{z^6}{6!} + \cdots
\]
We now prove certain basic properties of sin and cos, using known properties
of ez .


Proposition.
\[
\frac{d}{dz} \sin z = \frac{ie^{iz} + ie^{-iz}}{2i} = \cos z
\]
\[
\frac{d}{dz} \cos z = \frac{ie^{iz} - ie^{-iz}}{2} = -\sin z
\]
\[
\sin^2 z + \cos^2 z = \frac{e^{2iz} - 2 + e^{-2iz}}{-4} + \frac{e^{2iz} + 2 + e^{-2iz}}{4} = 1.
\]
It follows that if x is real, then |cos x| and |sin x| are at most 1.
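These identities can be spot-checked at an arbitrary complex point using the exponential definitions (our own sketch; the sample z is an arbitrary choice):

```python
import cmath

z = 1.3 + 0.7j
s = (cmath.exp(1j * z) - cmath.exp(-1j * z)) / 2j  # sin z from its definition
c = (cmath.exp(1j * z) + cmath.exp(-1j * z)) / 2   # cos z from its definition

print(s ** 2 + c ** 2)  # 1, up to rounding
```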
Proposition.
\[
\cos(z + w) = \cos z \cos w - \sin z \sin w
\]
\[
\sin(z + w) = \sin z \cos w + \cos z \sin w
\]
Proof.
\[
\cos z \cos w - \sin z \sin w = \frac{(e^{iz} + e^{-iz})(e^{iw} + e^{-iw})}{4} + \frac{(e^{iz} - e^{-iz})(e^{iw} - e^{-iw})}{4}
\]
\[
= \frac{e^{i(z+w)} + e^{-i(z+w)}}{2} = \cos(z + w).
\]
Differentiating both sides with respect to z gives
\[
-\sin z \cos w - \cos z \sin w = -\sin(z + w).
\]
So
\[
\sin(z + w) = \sin z \cos w + \cos z \sin w.
\]

When x is real, we know that cos x ≤ 1. Also sin 0 = 0, and (d/dx) sin x =
cos x ≤ 1. So for x ≥ 0, sin x ≤ x, “by the mean value theorem”. Also, cos 0 = 1,
and (d/dx) cos x = −sin x, which, for x ≥ 0, is greater than −x. From this, it follows
that when x ≥ 0, cos x ≥ 1 − x²/2 (the 1 − x²/2 comes from “integrating” −x, or
finding a thing whose derivative is −x).
Continuing in this way, we get that for x ≥ 0, if you truncate the power
series for sin x or cos x, the truncation will be ≥ sin x, cos x if you stop at a
positive term, and ≤ if you stop at a negative term. For example,
\[
\sin x \ge x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} - \frac{x^{11}}{11!}.
\]
In particular,
\[
\cos 2 \le 1 - \frac{2^2}{2!} + \frac{2^4}{4!} = 1 - 2 + \frac{2}{3} < 0.
\]
Since cos 0 = 1, it follows by the intermediate value theorem that there exists
some x ∈ (0, 2) such that cos x = 0. Since cos x ≥ 1 − x²/2, we can further deduce
that x > 1.
Definition (Pi). Define the smallest x such that cos x = 0 to be π/2.
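The definition can be made concrete numerically (our own sketch): cos is strictly decreasing on (0, 2) (since sin > 0 there), so bisection between cos 0 > 0 and cos 2 < 0 locates the smallest zero, which should agree with the library value of π/2.

```python
import math

lo, hi = 0.0, 2.0  # cos(0) = 1 > 0 and cos(2) < 0
for _ in range(60):
    mid = (lo + hi) / 2
    if math.cos(mid) > 0:
        lo = mid  # still before the first zero
    else:
        hi = mid
half_pi = (lo + hi) / 2

print(half_pi, math.pi / 2)  # the two values agree
```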


Since sin² z + cos² z = 1, it follows that sin(π/2) = ±1. Since cos x > 0 on [0, π/2),
sin(π/2) ≥ 0 by the mean value theorem. So sin(π/2) = 1.
Thus
Proposition.
\[
\cos\left(z + \frac{\pi}{2}\right) = -\sin z
\]
\[
\sin\left(z + \frac{\pi}{2}\right) = \cos z
\]
\[
\cos(z + \pi) = -\cos z
\]
\[
\sin(z + \pi) = -\sin z
\]
\[
\cos(z + 2\pi) = \cos z
\]
\[
\sin(z + 2\pi) = \sin z
\]
Proof.
\[
\cos\left(z + \frac{\pi}{2}\right) = \cos z \cos\frac{\pi}{2} - \sin z \sin\frac{\pi}{2} = -\sin z \sin\frac{\pi}{2} = -\sin z,
\]
and similarly for the others.

6.2 Differentiating power series


We shall show that inside the circle of convergence, the derivative of Σ_{n=0}^∞ a_n z^n is
given by the obvious formula Σ_{n=1}^∞ n a_n z^{n−1}.
We first prove some (seemingly arbitrary and random) lemmas to build up
the proof of the above statement. They are done so that the final proof will not
be full of tedious algebra.
Lemma. Let a and b be complex numbers. Then
\[
b^n - a^n - n(b - a)a^{n-1} = (b - a)^2 \bigl(b^{n-2} + 2ab^{n-3} + 3a^2 b^{n-4} + \cdots + (n - 1)a^{n-2}\bigr).
\]
Proof. If b = a, we are done. Otherwise,
\[
\frac{b^n - a^n}{b - a} = b^{n-1} + ab^{n-2} + a^2 b^{n-3} + \cdots + a^{n-1}.
\]
Differentiate both sides with respect to a. Then
\[
\frac{-na^{n-1}(b - a) + b^n - a^n}{(b - a)^2} = b^{n-2} + 2ab^{n-3} + \cdots + (n - 1)a^{n-2}.
\]
Rearranging gives the result.
Alternatively, we can write
\[
b^n - a^n = (b - a)(b^{n-1} + ab^{n-2} + \cdots + a^{n-1}),
\]
subtract n(b − a)a^{n−1} to obtain
\[
(b - a)\bigl[b^{n-1} - a^{n-1} + a(b^{n-2} - a^{n-2}) + a^2(b^{n-3} - a^{n-3}) + \cdots\bigr],
\]
and simplify.


This implies that
\[
(z + h)^n - z^n - nhz^{n-1} = h^2 \bigl((z + h)^{n-2} + 2z(z + h)^{n-3} + \cdots + (n - 1)z^{n-2}\bigr),
\]
which is actually the form we need.


Lemma. Let Σ a_n z^n have radius of convergence R, and let |z| < R. Then
Σ n a_n z^{n−1} converges (absolutely).
Proof. Pick r such that |z| < r < R. Then Σ |a_n| r^n converges, so the terms
|a_n| r^n are bounded above by, say, C. Now
\[
\sum n|a_n z^{n-1}| = \sum n|a_n| r^{n-1} \left(\frac{|z|}{r}\right)^{n-1} \le \frac{C}{r} \sum n\left(\frac{|z|}{r}\right)^{n-1}.
\]
The series Σ n(|z|/r)^{n−1} converges by the ratio test. So Σ n|a_n z^{n−1}| converges
by the comparison test.
Corollary. Under the same conditions,
\[
\sum_{n=2}^{\infty} \binom{n}{2} a_n z^{n-2}
\]
converges absolutely.
Proof. Apply the lemma above again and divide by 2.
Theorem. Let Σ a_n z^n be a power series with radius of convergence R. For
|z| < R, let
\[
f(z) = \sum_{n=0}^{\infty} a_n z^n \quad \text{and} \quad g(z) = \sum_{n=1}^{\infty} n a_n z^{n-1}.
\]
Then f is differentiable with derivative g.
Proof. We want f(z + h) − f(z) − hg(z) to be o(h). We have
\[
f(z + h) - f(z) - hg(z) = \sum_{n=2}^{\infty} a_n \bigl((z + h)^n - z^n - hnz^{n-1}\bigr).
\]
We started summing from n = 2 since the n = 0 and n = 1 terms are 0. Using
our first lemma, we are left with
\[
h^2 \sum_{n=2}^{\infty} a_n \bigl((z + h)^{n-2} + 2z(z + h)^{n-3} + \cdots + (n - 1)z^{n-2}\bigr).
\]
We want the huge infinite series to be bounded, and then the whole thing is a
bounded thing times h², which is definitely o(h).
Pick r such that |z| < r < R. If h is small enough that |z + h| ≤ r, then the
last infinite series is bounded above (in modulus) by
\[
\sum_{n=2}^{\infty} |a_n|\bigl(r^{n-2} + 2r^{n-2} + \cdots + (n - 1)r^{n-2}\bigr) = \sum_{n=2}^{\infty} |a_n| \binom{n}{2} r^{n-2},
\]
which is bounded (by the corollary above). So done.


In IB Analysis II, we will prove the same result using the idea of uniform
convergence, which gives a much nicer proof.
Example. The derivative of
\[
e^z = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots
\]
is
\[
1 + z + \frac{z^2}{2!} + \cdots = e^z.
\]
So we have another proof of this fact.
Similarly, the derivatives of sin z and cos z work out as cos z and −sin z.
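A finite-difference sanity check of this example (our own sketch; the truncation N, the point z and the step h are arbitrary choices): the difference quotient of the partial sum of Σ zⁿ/n! is close to the partial sum itself, as term-by-term differentiation predicts.

```python
# Partial sum of the exponential series; its term-by-term derivative is the
# same series (shifted by one term), so the difference quotient should be
# close to f(z) itself.
def f(z, N=40):
    total, term = 0.0, 1.0
    for n in range(N):
        total += term
        term *= z / (n + 1)
    return total

z, h = 0.8, 1e-6
diff_quot = (f(z + h) - f(z)) / h
print(diff_quot, f(z))  # both approximately e^0.8
```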

6.3 Hyperbolic trigonometric functions


Definition (Hyperbolic sine and cosine). We define
\[
\cosh z = \frac{e^z + e^{-z}}{2} = 1 + \frac{z^2}{2!} + \frac{z^4}{4!} + \frac{z^6}{6!} + \cdots
\]
\[
\sinh z = \frac{e^z - e^{-z}}{2} = z + \frac{z^3}{3!} + \frac{z^5}{5!} + \frac{z^7}{7!} + \cdots
\]
Either from the definition or from differentiating the power series, we get that
Proposition.
\[
\frac{d}{dz} \cosh z = \sinh z, \qquad \frac{d}{dz} \sinh z = \cosh z.
\]
Also, by definition, we have
Proposition.
\[
\cosh iz = \cos z, \qquad \sinh iz = i \sin z.
\]
Also,
Proposition.
\[
\cosh^2 z - \sinh^2 z = 1.
\]
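These identities can likewise be spot-checked from the exponential definitions (our own sketch; the sample z is arbitrary):

```python
import cmath

z = 0.9 - 0.4j
ch = (cmath.exp(z) + cmath.exp(-z)) / 2  # cosh z
sh = (cmath.exp(z) - cmath.exp(-z)) / 2  # sinh z

print(ch ** 2 - sh ** 2)                 # 1
print(cmath.cosh(1j * z), cmath.cos(z))  # equal: cosh iz = cos z
```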


7 The Riemann Integral


Finally, we can get to integrals. There are many ways to define an integral,
which can have some subtle differences. The definition we will use here is the
Riemann integral, which is the simplest definition, but is also the weakest one, in
the sense that many functions are not Riemann integrable but integrable under
other definitions.
Still, the definition of the Riemann integral is not too straightforward, and
requires a lot of preliminary definitions.

7.1 Riemann Integral


Definition (Dissections). Let [a, b] be a closed interval. A dissection of [a, b] is
a sequence a = x₀ < x₁ < x₂ < ··· < xₙ = b.
Definition (Upper and lower sums). Given a dissection D, the upper sum and
lower sum are defined by the formulae
\[
U_D(f) = \sum_{i=1}^{n} (x_i - x_{i-1}) \sup_{x \in [x_{i-1}, x_i]} f(x)
\]
\[
L_D(f) = \sum_{i=1}^{n} (x_i - x_{i-1}) \inf_{x \in [x_{i-1}, x_i]} f(x)
\]
Sometimes we use the shorthand
\[
M_i = \sup_{x \in [x_{i-1}, x_i]} f(x), \qquad m_i = \inf_{x \in [x_{i-1}, x_i]} f(x).
\]

The upper sum is the total area of the red rectangles, while the lower sum is
the total area of the black rectangles:

[Figure: upper and lower sums of f over a dissection a = x₀ < x₁ < ··· < xₙ = b, drawn as rectangles over each interval.]

Definition (Refining dissections). If D1 and D2 are dissections of [a, b], we say


that D2 refines D1 if every point of D1 is a point of D2 .
Lemma. If D₂ refines D₁, then
\[
U_{D_2} f \le U_{D_1} f \quad \text{and} \quad L_{D_2} f \ge L_{D_1} f.
\]


Using the picture above, this is because if we cut up the dissections into
smaller pieces, the red rectangles can only get chopped into shorter pieces and
the black rectangles can only get chopped into taller pieces.

Proof. Let D₁ be x₀ < x₁ < ··· < xₙ. Let D₂ be obtained from D₁ by the
addition of one point z. If z ∈ (x_{i−1}, x_i), then
\[
U_{D_2} f - U_{D_1} f = (z - x_{i-1}) \sup_{x \in [x_{i-1}, z]} f(x) + (x_i - z) \sup_{x \in [z, x_i]} f(x) - (x_i - x_{i-1}) M_i.
\]
But sup_{x∈[x_{i−1},z]} f(x) and sup_{x∈[z,x_i]} f(x) are both at most M_i. So this is at
most M_i(z − x_{i−1} + x_i − z − (x_i − x_{i−1})) = 0. So
\[
U_{D_2} f \le U_{D_1} f.
\]
By induction, the result is true whenever D₂ refines D₁.
A very similar argument shows that L_{D_2} f ≥ L_{D_1} f.

Definition (Least common refinement). Let D₁ and D₂ be dissections of [a, b].
Then the least common refinement of D₁ and D₂ is the dissection made out of
the points of D₁ and D₂.
Corollary. Let D₁ and D₂ be two dissections of [a, b]. Then
\[
U_{D_1} f \ge L_{D_2} f.
\]
Proof. Let D be the least common refinement (or indeed any common refinement).
Then by the lemma above (and by definition),
\[
U_{D_1} f \ge U_D f \ge L_D f \ge L_{D_2} f.
\]
Finally, we can define the integral.


Definition (Upper, lower, and Riemann integral). The upper integral is
\[
\overline{\int_a^b} f(x)\,dx = \inf_D U_D f.
\]


The lower integral is
\[
\underline{\int_a^b} f(x)\,dx = \sup_D L_D f.
\]
If these are equal, then we call their common value the Riemann integral of f,
and it is denoted ∫_a^b f(x) dx.
If this exists, we say f is Riemann integrable.
We will later prove the fundamental theorem of calculus, which says that
integration is the reverse of differentiation. But why don't we simply define
integration as anti-differentiation, and prove that it computes the area under
the curve? There are functions whose anti-derivative we cannot find (in closed
form), like e^{−x²}. In these cases, we wouldn't want to say the integral doesn't
exist: it surely does according to this definition!
There is an immediate necessary condition for Riemann integrability: boundedness.
If f is unbounded above on [a, b], then for any dissection D, there must
be some i such that f is unbounded on [x_{i−1}, x_i]. So M_i = ∞, and hence U_D f = ∞.
Similarly, if f is unbounded below, then L_D f = −∞. So unbounded functions
are not Riemann integrable.
Example. Let f(x) = x on [a, b]. Intuitively, we know that the integral is
(b² − a²)/2, and we will show this using the definition above. Let D = x₀ < x₁ <
··· < xₙ be a dissection. Then
\[
U_D f = \sum_{i=1}^{n} (x_i - x_{i-1}) x_i.
\]
We know that the integral is (b² − a²)/2. So we put each term of the sum into the
form (x_i² − x_{i−1}²)/2 plus some error terms:
\[
U_D f = \sum_{i=1}^{n} \left(\frac{x_i^2}{2} - \frac{x_{i-1}^2}{2} + \frac{x_i^2}{2} - x_{i-1} x_i + \frac{x_{i-1}^2}{2}\right)
= \frac{1}{2} \sum_{i=1}^{n} \bigl(x_i^2 - x_{i-1}^2 + (x_i - x_{i-1})^2\bigr)
= \frac{1}{2}(b^2 - a^2) + \frac{1}{2} \sum_{i=1}^{n} (x_i - x_{i-1})^2.
\]
Definition (Mesh). The mesh of a dissection D is max_i (x_i − x_{i−1}).
Then if the mesh is < δ, then
\[
\frac{1}{2} \sum_{i=1}^{n} (x_i - x_{i-1})^2 \le \frac{\delta}{2} \sum_{i=1}^{n} (x_i - x_{i-1}) = \frac{\delta}{2}(b - a).
\]
So by making δ small enough, we can show that for any ε > 0,
\[
\overline{\int_a^b} x\,dx < \frac{1}{2}(b^2 - a^2) + \varepsilon.
\]


Similarly,
\[
\underline{\int_a^b} x\,dx > \frac{1}{2}(b^2 - a^2) - \varepsilon.
\]
So
\[
\int_a^b x\,dx = \frac{1}{2}(b^2 - a^2).
\]
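The computation above can be replayed numerically (our own sketch, with [a, b] = [0, 1] and a uniform dissection; both choices are arbitrary): since f(x) = x is increasing, the sup on [x_{i−1}, x_i] is x_i and the inf is x_{i−1}, and the upper and lower sums squeeze the integral 1/2.

```python
# Upper and lower sums of f(x) = x on [0, 1] for the uniform dissection
# 0 < 1/n < 2/n < ... < 1 (mesh 1/n).
def sums(n):
    xs = [i / n for i in range(n + 1)]
    upper = sum((xs[i] - xs[i - 1]) * xs[i] for i in range(1, n + 1))
    lower = sum((xs[i] - xs[i - 1]) * xs[i - 1] for i in range(1, n + 1))
    return upper, lower

U, L = sums(1000)
print(L, U)  # brackets 1/2; the gap U - L equals the mesh 1/1000
```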
Example. Define f : [0, 1] → R by
\[
f(x) =
\begin{cases}
1 & x \in \mathbb{Q}\\
0 & x \notin \mathbb{Q}.
\end{cases}
\]
Let x₀ < x₁ < ··· < xₙ be a dissection. Then for every i, we have m_i = 0 (since
there is an irrational in every interval), and M_i = 1 (since there is a rational in
every interval). So
\[
U_D f = \sum_{i=1}^{n} M_i (x_i - x_{i-1}) = \sum_{i=1}^{n} (x_i - x_{i-1}) = 1.
\]
Similarly, L_D f = 0. Since D was arbitrary, we have
\[
\overline{\int_0^1} f(x)\,dx = 1, \qquad \underline{\int_0^1} f(x)\,dx = 0.
\]
So f is not Riemann integrable.


Of course, this function is not interesting at all. The whole point of its
existence is to show undergraduates that there are some functions that are not
integrable!
Note that it is important to say that the function is not Riemann integrable.
There are other notions for integration in which this function is integrable. For
example, this function is Lebesgue-integrable.
Using the definition to show integrability is often cumbersome. Most of
the time, we use Riemann's integrability criterion, which follows rather
immediately from the definition, but is much nicer to work with.
Proposition (Riemann's integrability criterion). This is sometimes known as
Cauchy's integrability criterion.
Let f : [a, b] → R. Then f is Riemann integrable if and only if for every
ε > 0, there exists a dissection D such that
\[
U_D f - L_D f < \varepsilon.
\]

Proof. (⇒) Suppose that f is integrable. Then (by definition of Riemann
integrability), there exist D₁ and D₂ such that
\[
U_{D_1} f < \int_a^b f(x)\,dx + \frac{\varepsilon}{2}
\]
and
\[
L_{D_2} f > \int_a^b f(x)\,dx - \frac{\varepsilon}{2}.
\]


Let D be a common refinement of D₁ and D₂. Then
\[
U_D f - L_D f \le U_{D_1} f - L_{D_2} f < \varepsilon.
\]
(⇐) Conversely, suppose that for every ε > 0 there exists D such that
\[
U_D f - L_D f < \varepsilon.
\]
Then
\[
\inf_D U_D f - \sup_D L_D f < \varepsilon,
\]
which is, by definition,
\[
\overline{\int_a^b} f(x)\,dx - \underline{\int_a^b} f(x)\,dx < \varepsilon.
\]
Since ε > 0 is arbitrary, this gives us
\[
\overline{\int_a^b} f(x)\,dx = \underline{\int_a^b} f(x)\,dx.
\]
So f is integrable.
The next big result we want to prove is that integration is linear, i.e.
\[
\int_a^b (\lambda f(x) + \mu g(x))\,dx = \lambda \int_a^b f(x)\,dx + \mu \int_a^b g(x)\,dx.
\]
We do this step by step:


Proposition. Let f : [a, b] → R be integrable, and λ ≥ 0. Then λf is integrable,
and
\[
\int_a^b \lambda f(x)\,dx = \lambda \int_a^b f(x)\,dx.
\]
Proof. If λ = 0 the result is trivial, so assume λ > 0. Let D be a dissection of
[a, b]. Since
\[
\sup_{x \in [x_{i-1}, x_i]} \lambda f(x) = \lambda \sup_{x \in [x_{i-1}, x_i]} f(x),
\]
and similarly for inf, we have
\[
U_D(\lambda f) = \lambda U_D f, \qquad L_D(\lambda f) = \lambda L_D f.
\]
So if we choose D such that U_D f − L_D f < ε/λ, then U_D(λf) − L_D(λf) < ε. So
the result follows from Riemann's integrability criterion.
Proposition. Let f : [a, b] → R be integrable. Then −f is integrable, and
\[
\int_a^b -f(x)\,dx = -\int_a^b f(x)\,dx.
\]


Proof. Let D be a dissection. Then
\[
\sup_{x \in [x_{i-1}, x_i]} (-f(x)) = -\inf_{x \in [x_{i-1}, x_i]} f(x), \qquad \inf_{x \in [x_{i-1}, x_i]} (-f(x)) = -\sup_{x \in [x_{i-1}, x_i]} f(x).
\]
Therefore
\[
U_D(-f) = \sum_{i=1}^{n} (x_i - x_{i-1})(-m_i) = -L_D(f).
\]
Similarly,
\[
L_D(-f) = -U_D f.
\]
So
\[
U_D(-f) - L_D(-f) = U_D f - L_D f.
\]
Hence if f is integrable, then −f is integrable by the Riemann integrability
criterion. Moreover, the upper integral of −f is inf_D U_D(−f) = −sup_D L_D f,
which gives the claimed value −∫_a^b f(x) dx.
Proposition. Let f, g : [a, b] → R be integrable. Then f + g is integrable, and
\[
\int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.
\]
Proof. Let D be a dissection. Then
\[
U_D(f + g) = \sum_{i=1}^{n} (x_i - x_{i-1}) \sup_{x \in [x_{i-1}, x_i]} (f(x) + g(x))
\le \sum_{i=1}^{n} (x_i - x_{i-1}) \left(\sup_{u \in [x_{i-1}, x_i]} f(u) + \sup_{v \in [x_{i-1}, x_i]} g(v)\right)
= U_D f + U_D g.
\]
Therefore,
\[
\overline{\int_a^b} (f(x) + g(x))\,dx \le \overline{\int_a^b} f(x)\,dx + \overline{\int_a^b} g(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.
\]
Similarly,
\[
\underline{\int_a^b} (f(x) + g(x))\,dx \ge \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.
\]
So the upper and lower integrals are equal, and the result follows.
So we now have that
\[
\int_a^b (\lambda f(x) + \mu g(x))\,dx = \lambda \int_a^b f(x)\,dx + \mu \int_a^b g(x)\,dx.
\]

We will prove more “obvious” results.


Proposition. Let f, g : [a, b] → R be integrable, and suppose that f(x) ≤ g(x)
for every x. Then
\[
\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.
\]


Proof. Follows immediately from the definition.


Proposition. Let f : [a, b] → R be integrable. Then |f| is integrable.
Proof. Note that we can write
\[
\sup_{x \in [x_{i-1}, x_i]} f(x) - \inf_{x \in [x_{i-1}, x_i]} f(x) = \sup_{u, v \in [x_{i-1}, x_i]} |f(u) - f(v)|.
\]
Similarly,
\[
\sup_{x \in [x_{i-1}, x_i]} |f(x)| - \inf_{x \in [x_{i-1}, x_i]} |f(x)| = \sup_{u, v \in [x_{i-1}, x_i]} \bigl||f(u)| - |f(v)|\bigr|.
\]
For any pair of real numbers x, y, we have ||x| − |y|| ≤ |x − y| by the
triangle inequality. Then for any u, v ∈ [x_{i−1}, x_i], we have
\[
\bigl||f(u)| - |f(v)|\bigr| \le |f(u) - f(v)|.
\]
Hence we have
\[
\sup_{x \in [x_{i-1}, x_i]} |f(x)| - \inf_{x \in [x_{i-1}, x_i]} |f(x)| \le \sup_{x \in [x_{i-1}, x_i]} f(x) - \inf_{x \in [x_{i-1}, x_i]} f(x).
\]
So for any dissection D, we have
\[
U_D(|f|) - L_D(|f|) \le U_D(f) - L_D(f).
\]
So the result follows from Riemann's integrability criterion.


Combining these two propositions, we get that if
\[
|f(x) - g(x)| \le C
\]
for every x ∈ [a, b], then
\[
\left|\int_a^b f(x)\,dx - \int_a^b g(x)\,dx\right| \le C(b - a).
\]

Proposition (Additivity property). Let f : [a, c] → R be integrable, and let
b ∈ (a, c). Then the restrictions of f to [a, b] and [b, c] are Riemann integrable,
and
\[
\int_a^b f(x)\,dx + \int_b^c f(x)\,dx = \int_a^c f(x)\,dx.
\]
Similarly, if f is integrable on [a, b] and [b, c], then it is integrable on [a, c] and
the above equation also holds.
Proof. Let ε > 0, and let a = x₀ < x₁ < ··· < xₙ = c be a dissection D of
[a, c] such that
\[
U_D(f) \le \int_a^c f(x)\,dx + \varepsilon \quad \text{and} \quad L_D(f) \ge \int_a^c f(x)\,dx - \varepsilon.
\]


Let D₀ be the dissection made of D plus the point b. Let D₁ be the dissection of
[a, b] made of the points of D₀ from a to b, and D₂ be the dissection of [b, c] made of
the points of D₀ from b to c. Then
\[
U_{D_1}(f) + U_{D_2}(f) = U_{D_0}(f) \le U_D(f),
\]
and
\[
L_{D_1}(f) + L_{D_2}(f) = L_{D_0}(f) \ge L_D(f).
\]
Since U_D(f) − L_D(f) < 2ε, and both U_{D_1}(f) − L_{D_1}(f) and U_{D_2}(f) − L_{D_2}(f)
are non-negative, each of them is less than 2ε. Since ε is arbitrary, it follows that
the restrictions of f to [a, b] and [b, c] are both Riemann integrable. Furthermore,
\[
\int_a^b f(x)\,dx + \int_b^c f(x)\,dx \le U_{D_1}(f) + U_{D_2}(f) = U_{D_0}(f) \le U_D(f) \le \int_a^c f(x)\,dx + \varepsilon.
\]
Similarly,
\[
\int_a^b f(x)\,dx + \int_b^c f(x)\,dx \ge L_{D_1}(f) + L_{D_2}(f) = L_{D_0}(f) \ge L_D(f) \ge \int_a^c f(x)\,dx - \varepsilon.
\]
Since ε is arbitrary, it follows that
\[
\int_a^b f(x)\,dx + \int_b^c f(x)\,dx = \int_a^c f(x)\,dx.
\]
The other direction is left as an (easy) exercise.


Proposition. Let f, g : [a, b] → R be integrable. Then fg is integrable.
Proof. Let C be such that |f(x)|, |g(x)| ≤ C for every x ∈ [a, b]. Write L_i and
ℓ_i for the sup and inf of g on [x_{i−1}, x_i]. Now let D be a dissection, and for each
i, let u_i and v_i be two points in [x_{i−1}, x_i].
We will pretend that u_i and v_i are the maximum and minimum when we
write the proof, but we cannot assert that they are, since fg need not have
maxima and minima. We will then note that since our results hold for arbitrary
u_i and v_i, they must hold when fg is at its supremum and infimum.
We find what we pretend is the difference between the upper and lower sums:
\[
\sum_{i=1}^{n} (x_i - x_{i-1})\bigl(f(v_i)g(v_i) - f(u_i)g(u_i)\bigr)
= \sum_{i=1}^{n} (x_i - x_{i-1})\bigl(f(v_i)(g(v_i) - g(u_i)) + (f(v_i) - f(u_i))g(u_i)\bigr)
\]
\[
\le \sum_{i=1}^{n} (x_i - x_{i-1})\bigl(C(L_i - \ell_i) + (M_i - m_i)C\bigr)
= C(U_D g - L_D g + U_D f - L_D f).
\]
Since u_i and v_i are arbitrary, it follows that
\[
U_D(fg) - L_D(fg) \le C(U_D f - L_D f + U_D g - L_D g).
\]
Since C is fixed, and we can make U_D f − L_D f and U_D g − L_D g arbitrarily small
(since f and g are integrable), we can make U_D(fg) − L_D(fg) arbitrarily small. So
the result follows.


Theorem. Every continuous function f on a closed bounded interval [a, b] is
Riemann integrable.
Proof. wlog assume [a, b] = [0, 1].
Suppose the contrary, i.e. that f is not integrable. This means that there exists
some ε such that for every dissection D, U_D f − L_D f > ε. In particular, for every
n, let D_n be the dissection 0, 1/n, 2/n, ..., n/n.
Since U_{D_n} f − L_{D_n} f > ε, there exists some interval [k/n, (k+1)/n] on which
sup f − inf f > ε. Suppose the supremum and infimum are attained at x_n and y_n
respectively. Then we have |x_n − y_n| < 1/n and f(x_n) − f(y_n) > ε.
By Bolzano-Weierstrass, (x_n) has a convergent subsequence, say (x_{n_i}). Say
x_{n_i} → x. Since |x_n − y_n| < 1/n → 0, we must have y_{n_i} → x. By continuity, we
must have f(x_{n_i}) → f(x) and f(y_{n_i}) → f(x), but f(x_{n_i}) and f(y_{n_i}) are always
at least ε apart. Contradiction.
With this result, we know that a lot of things are integrable, e.g. e^{−x²}.
To prove this, we secretly used the property of uniform continuity:
Definition (Uniform continuity*). Let A ⊆ R and let f : A → R. Then f is
uniformly continuous if
\[
(\forall \varepsilon > 0)(\exists \delta > 0)(\forall x)(\forall y)\ |x - y| < \delta \Rightarrow |f(x) - f(y)| \le \varepsilon.
\]
This is different from regular continuity. Regular continuity says that at any
point x, we can find a δ that works for this point. Uniform continuity says that
we can find a δ that works for every x simultaneously.
It is easy to show that a uniformly continuous function is integrable: as long
as the mesh of a dissection is sufficiently small, the difference between the upper
sum and the lower sum can be made arbitrarily small, by uniform continuity.
Thus to prove the above theorem, we just have to show that continuous functions
on a closed bounded interval are uniformly continuous.
Theorem (non-examinable). Let a < b and let f : [a, b] → R be continuous.
Then f is uniformly continuous.
Proof. Suppose that f is not uniformly continuous. Then
\[
(\exists \varepsilon > 0)(\forall \delta > 0)(\exists x)(\exists y)\ |x - y| < \delta \text{ and } |f(x) - f(y)| \ge \varepsilon.
\]
Therefore, we can find sequences (x_n), (y_n) such that for every n, we have
\[
|x_n - y_n| \le \frac{1}{n} \quad \text{and} \quad |f(x_n) - f(y_n)| \ge \varepsilon.
\]
Then by the Bolzano-Weierstrass theorem, we can find a subsequence (x_{n_k}) converging
to some x. Since |x_{n_k} − y_{n_k}| ≤ 1/n_k, y_{n_k} → x as well. But |f(x_{n_k}) − f(y_{n_k})| ≥ ε
for every k. So f(x_{n_k}) and f(y_{n_k}) cannot both converge to the same limit. So f
is not continuous at x.
This proof is very similar to the proof that continuous functions are integrable.
In fact, the proof that continuous functions are integrable is just a fusion of this
proof and the (simple) proof that uniformly continuous functions are integrable.
Theorem. Let f : [a, b] → R be monotone. Then f is Riemann integrable.

54
7 The Riemann Integral IA Analysis I

Note that monotone functions need not be “nice”. They can even have infinitely many discontinuities. For example, take f : [0, 1] → R mapping x to 1/n, where n is the position of the first non-zero digit in the binary expansion of x, with f(0) = 0.
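To see what such a monotone function looks like concretely, here is a small numerical sketch. This assumes the reading above, that f(x) = 1/n where n is the position of the first non-zero binary digit of x; the function name and sample grid are ours, not from the lectures.

```python
import math

def f(x):
    """f(x) = 1/n, where n is the position of the first non-zero binary
    digit of x, i.e. f(x) = 1/n on [2^-n, 2^-(n-1)); f(0) = 0."""
    if x == 0:
        return 0.0
    # the first non-zero binary digit of x sits in position ceil(-log2(x))
    return 1.0 / max(1, math.ceil(-math.log2(x)))

# f is increasing, yet it jumps at every point 2^-n: infinitely many
# discontinuities, but still Riemann integrable by the theorem above.
xs = [k / 1000 for k in range(1, 1001)]
ys = [f(x) for x in xs]
assert all(a <= b for a, b in zip(ys, ys[1:]))   # monotone on a sample grid
```

For instance f(0.3) = 1/2, since 0.3 = 0.0100… in binary has its first 1 in position 2.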
Proof. We may assume f is increasing and f(b) > f(a) (if f(b) = f(a) then f is constant and the result is clear; the decreasing case is similar). Let ε > 0, and let D be a dissection of mesh less than ε/(f(b) − f(a)). Then

    U_D f − L_D f = Σ_{i=1}^n (x_i − x_{i−1})(f(x_i) − f(x_{i−1}))
                  ≤ ε/(f(b) − f(a)) · Σ_{i=1}^n (f(x_i) − f(x_{i−1}))
                  = ε.

Pictorially, we see that the difference between the upper and lower sums is the total area of the red rectangles.

[Figure: graph of an increasing f, with a red rectangle of width x_i − x_{i−1} and height f(x_i) − f(x_{i−1}) over each subinterval.]

To calculate the total area, we can stack the red areas together to get something of width at most ε/(f(b) − f(a)) and height f(b) − f(a). So the total area is at most ε.
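The bound in the proof can be checked numerically. A minimal sketch (the helper below is our own, assuming an increasing f and an equally spaced dissection):

```python
def upper_lower_gap(f, a, b, n):
    """U_D f - L_D f for an increasing f on the equally spaced dissection
    with n parts: the sum of (x_i - x_{i-1}) * (f(x_i) - f(x_{i-1}))."""
    xs = [a + i * (b - a) / n for i in range(n + 1)]
    return sum((xs[i] - xs[i - 1]) * (f(xs[i]) - f(xs[i - 1]))
               for i in range(1, n + 1))

square = lambda x: x * x        # increasing on [0, 2]; f(b) - f(a) = 4
eps = 0.01
# mesh (b - a)/n below eps/(f(b) - f(a)) forces the gap below eps
n = int(2 * 4 / eps) + 1        # n = 801
gap = upper_lower_gap(square, 0.0, 2.0, n)
assert gap < eps
```

The sum telescopes exactly as in the proof, so the gap is mesh × (f(b) − f(a)) = 8/801 here, just under ε.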

Lemma. Let a < b and let f : [a, b] → R be a bounded function that is continuous on (a, b). Then f is integrable.

An example where this would apply is ∫_0^1 sin(1/x) dx. The integrand gets nasty near x = 0, but its “nastiness” is confined to x = 0 only. So as long as the nastiness is sufficiently contained, the function is still integrable.

The idea of the proof is to integrate from a point x_1 very near a up to a point z very close to b. Since f is bounded, the regions [a, x_1] and [z, b] are small enough not to cause trouble.
Proof. Let ε > 0. Suppose that |f(x)| ≤ C for every x ∈ [a, b]. Let x_0 = a and pick x_1 such that x_1 − x_0 < ε/(8C). Also choose z between x_1 and b such that b − z < ε/(8C).


Then f is continuous on [x_1, z], and therefore integrable on [x_1, z]. So we can find a dissection D' with points x_1 < x_2 < · · · < x_{n−1} = z such that

    U_{D'} f − L_{D'} f < ε/2.

Let D be the dissection a = x_0 < x_1 < · · · < x_n = b. Then

    U_D f − L_D f < ε/(8C) · 2C + ε/2 + ε/(8C) · 2C = ε.

So we are done by the Riemann integrability criterion.
Example.
– f(x) = sin(1/x) for x ≠ 0, with f(0) = 0, defined on [−1, 1], is integrable.
– g(x) = x for x ≤ 1, and g(x) = x² + 1 for x > 1, defined on [0, 2], is integrable.

Corollary. Every piecewise continuous and bounded function on [a, b] is integrable.

Proof. Partition [a, b] into intervals I_1, · · · , I_k, on each of which f is (bounded and) continuous. Hence for every I_j with endpoints x_{j−1}, x_j, f is integrable on [x_{j−1}, x_j] (which may not equal I_j, e.g. I_j could be [x_{j−1}, x_j)). But then by the additivity property of integration, we get that f is integrable on [a, b].
We defined Riemann integration in a very general way — we allowed arbitrary
dissections, and took the extrema over all possible dissections. Is it possible to
just consider some particular nice dissections instead? Perhaps unsurprisingly,
yes! It’s just that we opt to define it the general way so that we can easily talk
about things like least common refinements.
Lemma. Let f : [a, b] → R be Riemann integrable, and for each n, let D_n be the dissection a = x_0 < x_1 < · · · < x_n = b, where x_i = a + i(b − a)/n for each i. Then

    U_{D_n} f → ∫_a^b f(x) dx  and  L_{D_n} f → ∫_a^b f(x) dx.

Proof. Let ε > 0. We need to find an N. The only thing we know is that f is Riemann integrable, so we use it: since f is integrable, there is a dissection D, say u_0 < u_1 < · · · < u_m, such that

    U_D f − ∫_a^b f(x) dx < ε/2.
We also know that f is bounded. Let C be such that |f(x)| ≤ C. For any n, let D' be the least common refinement of D_n and D. Then

    U_{D'} f ≤ U_D f.

Also, the sums U_{D_n} f and U_{D'} f are the same, except that at most m of the subintervals [x_{i−1}, x_i] are subdivided in D'. For each interval that gets chopped up, the upper sum decreases by at most (b − a)/n · 2C. Therefore

    U_{D_n} f − U_{D'} f ≤ 2C · m · (b − a)/n.

Pick n such that 2Cm(b − a)/n < ε/2. Then, since U_{D'} f ≤ U_D f,

    U_{D_n} f − U_D f < ε/2.

So

    U_{D_n} f − ∫_a^b f(x) dx < ε.

This is true whenever n > 4C(b − a)m/ε. Since we also have U_{D_n} f ≥ ∫_a^b f(x) dx, it follows that

    U_{D_n} f → ∫_a^b f(x) dx.

The proof for lower sums is similar.
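We can watch this convergence happen for f(x) = x² on [0, 1], where the integral is 1/3. A quick sketch (using the fact that for an increasing f the sup and inf on each subinterval sit at the endpoints; the helper name is ours):

```python
def equal_dissection_sums(f, a, b, n):
    """(U_{D_n} f, L_{D_n} f) for an increasing f: the sup on each
    subinterval is attained at the right endpoint, the inf at the left."""
    w = (b - a) / n
    xs = [a + i * w for i in range(n + 1)]
    upper = sum(w * f(xs[i]) for i in range(1, n + 1))
    lower = sum(w * f(xs[i - 1]) for i in range(1, n + 1))
    return upper, lower

U, L = equal_dissection_sums(lambda x: x * x, 0.0, 1.0, 10_000)
assert L <= 1 / 3 <= U        # the integral is sandwiched
assert U - L < 1e-3           # and the gap shrinks like 1/n
```

Here U − L is exactly (f(1) − f(0))/n = 10⁻⁴, matching the monotone-function bound from earlier.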
For convenience, we define the following:
Notation. If b > a, we define

    ∫_b^a f(x) dx = −∫_a^b f(x) dx.
We now prove the fundamental theorem of calculus, which says that integration is the reverse of differentiation.
Theorem (Fundamental theorem of calculus, part 1). Let f : [a, b] → R be continuous, and for x ∈ [a, b], define

    F(x) = ∫_a^x f(t) dt.

Then F is differentiable and F'(x) = f(x) for every x.
Proof. We have

    (F(x + h) − F(x))/h = (1/h) ∫_x^{x+h} f(t) dt.

Let ε > 0. Since f is continuous at x, there exists δ > 0 such that |y − x| < δ implies |f(y) − f(x)| < ε. If 0 < |h| < δ, then

    |(1/h) ∫_x^{x+h} f(t) dt − f(x)| = |(1/h) ∫_x^{x+h} (f(t) − f(x)) dt|
                                     ≤ (1/|h|) |∫_x^{x+h} |f(t) − f(x)| dt|
                                     ≤ ε|h|/|h|
                                     = ε.

So (F(x + h) − F(x))/h → f(x) as h → 0, i.e. F'(x) = f(x).
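The theorem can be sanity-checked numerically: approximate F by a Riemann sum and compare a difference quotient of F with f. A sketch (midpoint sums are our choice of approximation, not part of the proof):

```python
import math

def F(f, x, n=20_000):
    """Approximate F(x) = ∫_0^x f(t) dt by a midpoint Riemann sum."""
    w = x / n
    return sum(w * f((i + 0.5) * w) for i in range(n))

x, h = 0.7, 1e-3
quotient = (F(math.cos, x + h) - F(math.cos, x)) / h
# FTC part 1 predicts F'(x) = f(x); here F(x) is exactly sin(x)
assert abs(quotient - math.cos(x)) < 1e-2
```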


Corollary. If f is continuously differentiable on [a, b], then

    ∫_a^b f'(t) dt = f(b) − f(a).

Proof. Let

    g(x) = ∫_a^x f'(t) dt.

Then

    g'(x) = f'(x) = d/dx (f(x) − f(a)).

Since g'(x) − f'(x) = 0, g(x) − f(x) must be a constant function by the mean value theorem. We also know that

    g(a) = 0 = f(a) − f(a).

So we must have g(x) = f(x) − f(a) for every x, and in particular, for x = b.
Theorem (Fundamental theorem of calculus, part 2). Let f : [a, b] → R be a differentiable function, and suppose that f' is integrable. Then

    ∫_a^b f'(t) dt = f(b) − f(a).

Note that this is a stronger result than the corollary above, since it does not
require that f 0 is continuous.
Proof. Let D be a dissection x_0 < x_1 < · · · < x_n. We want to make use of this dissection. So write

    f(b) − f(a) = Σ_{i=1}^n (f(x_i) − f(x_{i−1})).

For each i, there exists u_i ∈ (x_{i−1}, x_i) such that f(x_i) − f(x_{i−1}) = (x_i − x_{i−1}) f'(u_i) by the mean value theorem. So

    f(b) − f(a) = Σ_{i=1}^n (x_i − x_{i−1}) f'(u_i).

By definition, f'(u_i) lies between inf_{x ∈ [x_{i−1}, x_i]} f'(x) and sup_{x ∈ [x_{i−1}, x_i]} f'(x). Therefore

    L_D f' ≤ f(b) − f(a) ≤ U_D f'.

Since f' is integrable and D was arbitrary, L_D f' and U_D f' can both get arbitrarily close to ∫_a^b f'(t) dt. So

    f(b) − f(a) = ∫_a^b f'(t) dt.


Note that the condition that f' is integrable is essential. It is possible to find a differentiable function whose derivative is not integrable! You will be asked to find one in the example sheet.
Using the fundamental theorem of calculus, we can easily prove integration
by parts:
Theorem (Integration by parts). Let f, g : [a, b] → R be integrable such that everything below exists. Then

    ∫_a^b f(x) g'(x) dx = f(b)g(b) − f(a)g(a) − ∫_a^b f'(x) g(x) dx.

Proof. By the fundamental theorem of calculus,

    ∫_a^b (f(x) g'(x) + f'(x) g(x)) dx = ∫_a^b (fg)'(x) dx = f(b)g(b) − f(a)g(a).

The result follows after rearrangement.
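As a numeric check of the formula, take f(x) = x and g(x) = eˣ on [0, 1] (our own example; the quadrature helper is an illustrative midpoint sum):

```python
import math

def integral(h, a, b, n=10_000):
    """Midpoint Riemann sum approximation of ∫_a^b h."""
    w = (b - a) / n
    return sum(w * h(a + (i + 0.5) * w) for i in range(n))

a, b = 0.0, 1.0
lhs = integral(lambda x: x * math.exp(x), a, b)     # ∫ f g'
rhs = b * math.exp(b) - a * math.exp(a) \
      - integral(math.exp, a, b)                    # [f g] - ∫ f' g
assert abs(lhs - rhs) < 1e-6
assert abs(lhs - 1.0) < 1e-6   # exactly, ∫_0^1 x e^x dx = e - (e - 1) = 1
```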


Recall that when we first had Taylor’s theorem, we said it had the Lagrange
form of the remainder. There are many other forms of the remainder term. Here
we will look at the integral form:
Theorem (Taylor’s theorem with the integral form of the remainder). Let f be n + 1 times differentiable on [a, b] with f^(n+1) continuous. Then

    f(b) = f(a) + (b − a) f'(a) + (b − a)²/2! · f^(2)(a) + · · ·
           + (b − a)^n/n! · f^(n)(a) + ∫_a^b (b − t)^n/n! · f^(n+1)(t) dt.

Proof. Induction on n.
When n = 0, the theorem says

    f(b) − f(a) = ∫_a^b f'(t) dt,

which is true by the fundamental theorem of calculus.
Now observe that

    ∫_a^b (b − t)^n/n! · f^(n+1)(t) dt = [−(b − t)^(n+1)/(n + 1)! · f^(n+1)(t)]_a^b
                                         + ∫_a^b (b − t)^(n+1)/(n + 1)! · f^(n+2)(t) dt
                                       = (b − a)^(n+1)/(n + 1)! · f^(n+1)(a)
                                         + ∫_a^b (b − t)^(n+1)/(n + 1)! · f^(n+2)(t) dt.

So the result follows by induction.


Note that the form of the integral remainder is rather weird and unexpected. How could we have come up with it? We might start with the fundamental theorem of calculus and integrate by parts. The first attempt would be to integrate 1 to t and differentiate f'(t) to f^(2)(t). So we have
    f(b) = f(a) + ∫_a^b f'(t) dt
         = f(a) + [t f'(t)]_a^b − ∫_a^b t f^(2)(t) dt
         = f(a) + b f'(b) − a f'(a) − ∫_a^b t f^(2)(t) dt.

We want something in the form (b − a) f'(a), so we take that out and see what we are left with:

    f(b) = f(a) + (b − a) f'(a) + b(f'(b) − f'(a)) − ∫_a^b t f^(2)(t) dt.

Then we note that f'(b) − f'(a) = ∫_a^b f^(2)(t) dt. So we have

    f(b) = f(a) + (b − a) f'(a) + ∫_a^b (b − t) f^(2)(t) dt.

Then we can see that the right thing to integrate is (b − t), and we continue to obtain the result.
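The integral remainder can also be verified numerically, e.g. for f = exp with a = 0, b = 1 and n = 4, where every derivative at 0 equals 1 (the quadrature helper below is ours, an illustrative midpoint sum):

```python
import math

def integral(h, a, b, steps=10_000):
    """Midpoint Riemann sum approximation of ∫_a^b h."""
    w = (b - a) / steps
    return sum(w * h(a + (i + 0.5) * w) for i in range(steps))

a, b, n = 0.0, 1.0, 4
# Taylor polynomial of exp about 0: all derivatives there are 1
poly = sum((b - a) ** k / math.factorial(k) for k in range(n + 1))
remainder = integral(lambda t: (b - t) ** n / math.factorial(n) * math.exp(t),
                     a, b)
# f(b) should equal polynomial + integral-form remainder
assert abs(math.exp(b) - (poly + remainder)) < 1e-6
```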
Theorem (Integration by substitution). Let f : [a, b] → R be continuous. Let g : [u, v] → R be continuously differentiable, and suppose that g(u) = a, g(v) = b, and f is defined everywhere on g([u, v]) (and still continuous). Then

    ∫_a^b f(x) dx = ∫_u^v f(g(t)) g'(t) dt.

Proof. By the fundamental theorem of calculus, f has an anti-derivative F defined on g([u, v]). Then

    ∫_u^v f(g(t)) g'(t) dt = ∫_u^v F'(g(t)) g'(t) dt
                           = ∫_u^v (F ∘ g)'(t) dt
                           = F ∘ g(v) − F ∘ g(u)
                           = F(b) − F(a)
                           = ∫_a^b f(x) dx.
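A numeric sanity check of the substitution formula, with g(t) = t² on [0, 1] and f = cos, so both sides should equal sin 1 (helper and example are ours):

```python
import math

def integral(h, a, b, n=10_000):
    """Midpoint Riemann sum approximation of ∫_a^b h."""
    w = (b - a) / n
    return sum(w * h(a + (i + 0.5) * w) for i in range(n))

lhs = integral(math.cos, 0.0, 1.0)                           # ∫_a^b f(x) dx
rhs = integral(lambda t: math.cos(t * t) * 2 * t, 0.0, 1.0)  # ∫_u^v f(g(t)) g'(t) dt
assert abs(lhs - rhs) < 1e-6
assert abs(lhs - math.sin(1.0)) < 1e-6
```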

We can think of “integration by parts” as what you get by integrating the


product rule, and “integration by substitution” as what you get by integrating
the chain rule.


7.2 Improper integrals


It is sometimes sensible to talk about integrals of unbounded functions or
integrating to infinity. But we have to be careful and write things down nicely.
Definition (Improper integral). Suppose that we have a function f : [a, b] → R such that, for every ε > 0, f is integrable on [a + ε, b] and lim_{ε→0} ∫_{a+ε}^b f(x) dx exists. Then we define the improper integral

    ∫_a^b f(x) dx  to be  lim_{ε→0} ∫_{a+ε}^b f(x) dx,

even if the Riemann integral does not exist.


We can do similarly for [a, b − ε], or integrals to infinity:

    ∫_a^∞ f(x) dx = lim_{b→∞} ∫_a^b f(x) dx,

when it exists.
Example. We have

    ∫_ε^1 x^(−1/2) dx = [2x^(1/2)]_ε^1 = 2 − 2ε^(1/2) → 2.

So

    ∫_0^1 x^(−1/2) dx = 2,

even though x^(−1/2) is unbounded on [0, 1].


Note that officially we are required to make f (x) = x−1/2 a function with
domain [0, 1]. So we can assign f (0) = π, or any number, since it doesn’t matter.
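Numerically, the truncated integrals indeed approach 2. A sketch with our own midpoint-sum helper; the integrand blows up at 0, so we only ever integrate on [ε, 1]:

```python
def integral(h, a, b, n=200_000):
    """Midpoint Riemann sum approximation of ∫_a^b h."""
    w = (b - a) / n
    return sum(w * h(a + (i + 0.5) * w) for i in range(n))

eps_values = (1e-2, 1e-3, 1e-4)
vals = [integral(lambda x: x ** -0.5, e, 1.0) for e in eps_values]
# exact value on [eps, 1] is 2 - 2*sqrt(eps), which tends to 2
for v, e in zip(vals, eps_values):
    assert abs(v - (2 - 2 * e ** 0.5)) < 1e-4
assert vals[0] < vals[1] < vals[2] < 2
```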
Example. By the fundamental theorem of calculus,

    ∫_1^x 1/t² dt = [−1/t]_1^x = 1 − 1/x → 1 as x → ∞.

So

    ∫_1^∞ 1/x² dx = 1.
Finally, we can prove the integral test, whose proof we omitted when we first
began.

Theorem (Integral test). Let f : [1, ∞) → R be a decreasing non-negative function. Then Σ_{n=1}^∞ f(n) converges iff ∫_1^∞ f(x) dx < ∞.

Proof. We have

    ∫_n^{n+1} f(x) dx ≤ f(n) ≤ ∫_{n−1}^n f(x) dx,

since f is decreasing (the right-hand inequality is valid only for n ≥ 2). It follows that

    ∫_1^{N+1} f(x) dx ≤ Σ_{n=1}^N f(n) ≤ ∫_1^N f(x) dx + f(1).


So if the integral exists, then the partial sums Σ_{n=1}^N f(n) are increasing and bounded above by f(1) + ∫_1^∞ f(x) dx, so the series converges.
If the integral does not exist, then ∫_1^N f(x) dx is unbounded. Then Σ_{n=1}^N f(n) is unbounded, hence the series does not converge.

Example. Since ∫_1^∞ 1/t² dt < ∞, it follows that Σ_{n=1}^∞ 1/n² converges.
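The sandwich from the proof can be checked directly for f(x) = 1/x², using ∫_1^x t⁻² dt = 1 − 1/x (the variable names below are ours):

```python
import math

N = 1000
partial = sum(1.0 / (n * n) for n in range(1, N + 1))
lower = 1 - 1 / (N + 1)        # ∫_1^{N+1} f(x) dx
upper = (1 - 1 / N) + 1.0      # ∫_1^N f(x) dx + f(1)
assert lower <= partial <= upper
# the series in fact converges (to π²/6, by a result well beyond this course)
assert abs(partial - math.pi ** 2 / 6) < 2e-3
```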
