Real Analysis
Real Analysis
Real Analysis
Venn diagrams show all possible logical relations between a finite collection of sets.
Definition. If any element x of a set A is also an element of another set B, then the
set A is called a subset of the set B, denoted A B or B A. If A B and A 6= B,
then A is called a proper subset of B, denoted A $ B.
When A B, then B contains A, or A is contained in B. The Venn diagrams representing
A B and A 6 B are as follow:
B
AB
A 6 B
It is also denoted 2 .
P(A) = {X : X A}.
Solution. We have
P() = {},
P({a}) = {, {a}},
AB
AB
A\B
AB
AB
A\B
When A B = , then
A B = {3, 4},
A C = .
Example 7. Let A be the set of all even numbers, B the set of all multiples of 3, and
C the set of all odd numbers. Prove that
(a) A B = {6n : n Z},
(b) B C = {3k : k Z is odd}.
Solution. The sets A, B, C can be written as
A = {2k : k Z},
B = {3n : n Z},
C = {2m + 1 : m Z}.
Example 8. Let A = {1, 2, 3, 4, 5}, B = {2, 4, 6, 8, 10} and C = {1, 3, 5}, then
A \ B = {1, 3, 5}, B \ A = {6, 8, 10},
A \ C = {2, 4},
C \ A = .
Theorem.
(i) (Idempotent laws) A A = A, A A = A.
(ii) (Commutative laws) A B = B A, A B = B A.
(iii) (Associative laws) (A B) C = A (B C), (A B) C = A (B C).
(iv) (Distributive laws) A(B C) = (AB)(AC), A(B C) = (AB)(AC).
(v) (Identity laws) A = A, A = .
(vi) (Complement laws) A Ac = , (Ac )c = A.
(vii) (De Morgans laws) (A B)c = Ac B c , (A B)c = Ac B c .
(viii) (Absorption laws) A B A B = B, A B A B = A.
(ix) (Transitive law) A B and B C A C.
Proof. (iii)
x, x (A B) C x A B or x C
(x A or x B) or x C
x A or (x B or x C)
x A or x B C
x A (B C)
The Venn diagrams depicting AB in the non-disjoint and disjoint cases are as follow:
A B 6=
AB =
B \ A = {5},
Solution. By definition of AB, the distributive laws and the De Morgans law, we have
AB = (A \ B) (B \ A)
= (A B c ) (B Ac )
= (A B) (A Ac ) (B c B) (B c Ac )
= (A B) (B c Ac )
= (A B) (A B)c
= (A B) \ (A B).
Definition. A set having finitely many elements is called a finite set; otherwise, it is
called an infinite set. Denote by n(S) the number of elements of a finite set S, also called
the cardinality of S. Other commonly used notations denoting the number of elements of
a finite set include |A|, #A, card A.
Theorem. (Principle of inclusion-exclusion) For any finite sets A and B,
n(A B) = n(A) + n(B) n(A B).
i=1
16i<j6m
+ + (1)
m1
16i<j<k6m
n(Ai Aj Ak )
n(A1 A2 Am ).
Example 12. In a class of 40 students, every student need to take Chemistry or Mathematics. If there are 35 students taking Chemistry and 20 students taking Mathematics,
how many of them are taking Chemistry and Mathematics at the same time?
Solution. Let
C = {Students taking Chemistry},
then n(C) = 35, n(M ) = 20 and n(C M ) = 40. So, there are
A B = (E \ A) (E \ B) =
P (x, y)
P (y, x)
x
O
Definition. The ordered pair of x and y, (x, y), is defined as the set {{x}, {x, y}}, where
x and y are called the first coordinate and the second coordinate of (x, y), respectively.
Theorem. (x, y) = (s, t) if and only if x = s and y = t.
Proof. Let X = {x}, Y = {x, y}, S = {s} and T = {s, t}. By definition, we have
(x, y) = {X, Y } and (s, t) = {S, T }.
Since x = s and y = t, X = S and Y = T so that (x, y) = {X, Y } = {S, T } = (s, t).
Since (x, y) = (s, t), we have {X, Y } = {S, T }. There are two cases to consider.
Case 1. When X = S and Y = T , then x = s and {x, y} = {s, t}, hence x = s and y = t.
Case 2. When X = T and Y = S, then x = s = t and x = y = s, hence x = s and
y = t.
Definition. The Cartesian product of the two sets A and B is defined as
A B = {(a, b) : a A and b B}.
Generally speaking, A B 6= B A.
A B = {(1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5)},
B A = {(4, 1), (4, 2), (4, 3), (5, 1), (5, 2), (5, 3)}.
10
S = {(a, 2), (b, 2), (c, 3), (d, 4), (e, 1)}
Some authors call Dom(R) the first projection pr1 (R), and Im(R) the second projection
pr2 (R), of the relation R.
Example 16. The domain and image of the relation R in example 2 are
Dom(R) = {a, b, c},
11
B
Im(R)
R
R1
S and T are the relations from A to B, and from B to C, respectively. The composite
relation of S and T is
T S = {(a, x), (a, y), (b, x), (b, y), (c, z), (d, y)}.
Theorem. Let A, B, C, D be sets, R, S, T be relations from A to B, from B to C, and
from C to D, respectively. Then
(a) (R1 )1 = R,
(b) (T S) R = T (S R),
(c) (S R)1 = R1 S 1 .
Proof. (a) Obviously (R1 )1 is a relation from A to B. Since
R1 = {(b, a) B A : (a, b) R},
(R1 )1 = {(a, b) A B : (b, a) R1 }
= {(a, b) A B : (a, b) R}
= R.
12
(c)
= (T S) R.
Definition. A mapping from a set A to a set B, also called a function, is a relation f
from A to B satisfying the following conditions:
(i) Dom(f ) = A;
(ii) (x, y) f and (x, y ) f y = y .
We write f : A B to mean f being a mapping from A into B. Sets A and B are called
the domain and the codomain of f , respectively.
Condition (i) states that f (x) is defined for each x A; condition (ii) states that f (x)
is well defined for each x A, i.e., f is single-valued at each x A.
Example 19. Let A = {1, 2, 3, 4} and B = {a, b, c, d, e}. The following are mappings
from A to B:
(a) f1 = {(1, a), (2, b), (3, b), (4, c)},
(b) f2 = {(1, d), (2, c), (3, b), (4, a)}.
The following are not mappings from A to B:
(c) g1 = {(1, a), (2, b), (3, c)} (g1 (4) is not defined),
(d) g2 = {(1, a), (1, b), (2, a), (3, a), (4, c)} (g2 (1) has two images in B),
(e) g3 = {(1, d), (2, e), (3, f ), (4, h)} (g3 (3) = f 6 B and g3 (4) = h 6 B).
13
g
f (x)
g(f (x))
Example 20.
(a) Let f : R R and g : R R be defined by f (x) = 2x and g(x) = cos x, with
x R. Obviously, g f : R R and
g f (x) = g(f (x)) = cos 2x,
x R.
x R+ .
is called the direct image of X under f . The direct image of the domain A under f , f [A],
is called the range of f . The set
f 1 [Y ] = {x A : f (x) Y }
14
Note. Whether the inverse mapping f 1 exists or not, the inverse image f 1 [Y ] always
exists.
Example 21. Let A = {1, 2, 3, 4, 5, 6}, B = {a, b, c, d, e, f } and g : A B be defined as
g(1) = a,
g(2) = g(5) = b,
g(4) = e,
g(6) = f.
g(3) = d,
g 1 [Y ] = {2, 5, 6}.
Let a, b R such that a 6 b. Certain intervals of real numbers are defined as follows:
(a, b) = {x R : a < x < b},
[a, b] = {x R : a 6 x 6 b},
[a, ) = {x R : a 6 x < },
(, a] = {x R : < x 6 a},
(, ) = {x R : < x < } = R .
Example 22. Let A = [0, 2], B = [1, 1], and f : A B be the mapping defined by
f (x) = sin 2x for all x A. Let X = [0, 3 ] and Y = ( 21 , 0). Find f [X] and f 1 [Y ].
Theorem. Let f : A B be a mapping, X, X be subsets of A, and Y, Y be subsets of
B. Then
(i) X X f [X] f [X ].
(ii) f [X X ] = f [X] f [X ].
(iii) f [X X ] f [X] f [X ].
(iv) Y Y f 1 [Y ] f 1 [Y ].
(v) f 1 [Y Y ] = f 1 [Y ] f 1 [Y ].
(vi) f 1 [Y Y ] = f 1 [Y ] f 1 [Y ].
(vii) f 1 [B \ Y ] = A \ f 1 [Y ].
(viii) X f 1 [f [X]].
(ix) Y f [f 1 [Y ]].
Proof. (i) Since
y f [X] y = f (x) for some x X
f [X] f [X ] follows.
y f [X ],
( X X )
15
(ii) y,
Thus, f [X X ] = f [X] f [X ].
(iv) x,
x f 1 [Y ] f (x) Y
f (x) Y
f 1 [Y ] f 1 [Y ] follows.
( Y Y )
x f 1 [Y ].
f [X]
b
A
f 1 [Y ]
B
f [A]
Y
y
16
C = {x R : 0 6 x 6 2}.
B = {x R : 0 6 x 6 1},
then
(a) f : A B is surjective but not injective (because f (1) = f (1) = 1);
(b) f : B C is injective but not surjective (because 2 C has no pre-image);
(c) f : A C is neither injective (f (1) = f (1) = 1) nor surjective (2 C has no
pre-image);
(d) f : B B in a bijective.
Example 24. Let f : A B and g : B C be surjective mappings. Prove that g f
is also surjective.
Solution. It is obvious that g f : A C. Let z C. Since g is surjective, there exists
y B such that g(y) = z. Since f is surjective, there exists x A such that f (x) = y.
So, z = g(y) = g(f (x)) = g f (x). Since z C is arbitrary, g f is surjective.
Example 25.
(a) Prove that for any a, b R, a2 + ab + b2 + 3 > 0.
(b) Let f : R R be the mapping defined by f (x) = x3 + 3x 2, for all x R. Prove
that f is injective.
Solution. (a) By completing the square, we have for any a, b R,
3b2
b
+ 3 > 0.
a2 + ab + b2 + 3 = (a + )2 +
2
4
(b) Let x1 , x2 R be such that f (x1 ) = f (x2 ). Then
Since x21 + x1 x2 + x32 + 3 > 0 by (a), we must have x1 x2 = 0, i.e., x1 = x2 , thus proving
the injectivity of f .
17
x
,
1+x
where
(g f )(a) = (g f )(a )
a = a
( g f is injective)
18
p
q
r
s
a
b
c
A
p
q
r
s
a
b
c
Considering now the mapping f : A B, where A = {a, b, c, d}, B = {p, q, r}, and f be
defined as follows:
f
A
B
a
b
c
d
p
q
r
19
A
a
b
c
d
p
q
r
20
whose elements are obtained by swapping the coordinates of the corresponding points in
the graph of f . Geometrically speaking, the graph of f 1 is readily obtained by reflecting
the graph of f with respect to the diagonal line y = x.
Example 30. The inverse of f : R R+ defined by f (x) = ex/2 for x R+ , is the
mapping g : R+ R defined by g(x) = 2 ln x, where x R+ ; the image of g can be
obtained from the image of f by reflection with respect to the diagonal line y = x:
y
y=x
y=f (x)
y=g(x)
x
Exercise. By considering the graph of y = sin(x), sketch the graph of y = sin1 (x).
21
Equivalence relations
Definition. Let A be a set, R A A a relation from A to A. The relation R is called
(E1) reflexive for any a A, (a, a) R;
(E2) symmetric for any (a, b) R, (b, a) R;
(E3) transitive for any (a, b), (b, c) R, (a, c) R.
If the relation R is reflexive, symmetric and transitive, it is an equivalence relation.
Definition. The diagonal of a set A is a relation DA defined as
DA = {(a, a) : a A}.
The reflexivity, symmetry and transitivity can also be expressed as:
(E1) R is reflexive DA R.
(E2) R is symmetric R = R1 .
(E3) R is transitive R R R.
(E1)(E3) can be represented by the R image:
(E1)
(E2)
(E3)
(b,a)
(a,a)
A
DA
(b,b)
(a,b)
(b,c)
(a,c)
(a,b)
22
Theorem. Let R be a equivalence relation defined on set A. Its quotient set A/R is a
partition of the set A.
S
Proof. Since aA a/R = A, (P1) holds. Since distinct equivalence classes are disjoint,
(P2) holds.
From the above theorem, every equivalence relation R defined on a set A gives rise to a
partition of A. The converse of the above theorem is also true.
Theorem. Let P = {S1 , S2 , . . . , Sk } be a partition of the set A. There exists an equivalence relation on A such that A/ = P.
Proof. Let a, b A. Declare a b a, b belong to the same block, i.e., a, b Si , for
some 1 6 i 6 k. For any a, b, c A,
S
(E1) since A = ki=1 Si , a, a Si for some i so that a a.
(E2) if a b, then a, b Si for some i so that b, a Si , i.e., b a.
(E3) if a b and b c, then a, b Si and b, c Sj for some i, j; necessarily i = j so
that a, c Si , i.e., a c.
Thus, is an equivalence relation and the quotient set A/ = P.
Exercise. Let L be the set of all straight lines in the xy-plane. For any 1 , 2 L ,
define 1 R2 1 //2 .
(a) Prove that R is an equivalence relation.
(b) Illustrate the equivalence class containing the straight line y = 2x + 1.
23
Example 31. Let N0 as the set of all non-negative integers. A relation R is defined in
N0 as follows: For any (m, n), (p, q) N0 N0 , (m, n)R(p, q) m + q = n + p.
(a) Prove that R is a equivalence relation.
(b) Illustrate the equivalence class containing (m, n).
Solution. (a) Let (m, n), (p, q), (r, s) N0 N0 .
(E1) Since m + n = n + m, (m, n)R(m, n) so that S is reflexive.
(E2) If (m, n)R(p, q), then
m + q = n + p p + n = q + m (p, q)R(m, n)
so that R is symmetric.
(E3) If (m, n)R(p, q) and (p, q)R(r, s), then m + q = n + p and p + s = q + r so that
(m + s) + q = (m + q) + s = (n + p) + s
= n + (p + s) = n + (q + r)
= (n + r) + q.
Cancelling q from both sides, we get m + s = n + r (m, n)R(r, s). Hence, R
is transitive.
Thus, R is an equivalence relation.
(b) If (p, q) (m, n)/R, then (m, n)R(p, q) m + q = n + p m n = p q. One
can identify (p, q) to the difference p q. Thus, the equivalence class (m, n)/R containing
(m, n) contains all differences which are equivalent to m n.
Example 31 shows how to construct any integer from non-negative integers as equivalence
classes of the relation R.
Example 32. Let Z = Z \{0}. A relation S is defined in Z Z as follow: To any
(m, n), (p, q) Z Z , (m, n)S(p, q) mq = np.
(a) Prove that S is a equivalence relation.
(b) Illustrate the equivalence class containing (m, n).
Solution. (a) Let (m, n), (p, q), (r, s) Z Z .
(E1) Since mn = nm, (m, n)S(m, n) so that S is reflexive.
(E2) If (m, n)S(p, q), then mq = np pn = qm (p, q)S(m, n) so that S is
symmetric.
(E3) If (m, n)S(p, q) and (p, q)S(r, s), then mq = np and ps = qr so that
(ms)q = (mq)s = (np)s = n(ps) = n(qr) = (nr)q.
Since q 6= 0, cancelling q from both sides, we get ms = nr (m, n)S(r, s).
Hence, S is transitive.
= pq . One can
(b) If (p, q) (m, n)/S, then (m, n)S(p, q) mq = np m
n
identify (p, q) to the fraction pq . Thus, the equivalence class (m, n)/S containing (m, n)
contains all fractions which are equivalent to m
.
n
Example 32 shows how to construct rational numbers from integers.
24
25
(1)
is not satisfied by any p Q. If there were one such p, we may assume that p = m
, where
n
m, n Z are relatively prime (i.e., they have no common factor). Substituting p = m
n
into (1), we have
m2 = 2n2 .
(2)
2
Since m is even, m must be even. Writing m = 2k for some k Z. Then
4k 2 = (2k)2 = 2n2
so that after cancelling 2 from both sides, we get 2k 2 = n2 , which forces n to be even.
But this contradicts m and n being relatively prime.
Now let A = {p Q+ : p2 < 2} and B = {p Q+ : p2 > 2}. We next show that the
set A contains no largest element and the set B contains no smallest element. For every
p Q+ , let
p2 2
2p + 2
q =p
=
.
(3)
p+2
p+2
Then
2(p2 2)
q2 2 =
.
(4)
(p + 2)2
If p A, then p2 2 < 0, (3) shows that q > p, and (4) shows that q 2 < 2. Thus, q A.
If p B, then p2 2 > 0, (3) shows that 0 < q < p, and (4) shows that q 2 > 2. Thus,
q B.
1.2 Remark. The purpose of Example 1.1 is to show that Q has certain gaps, in spite of
the fact that in between any two rational numbers there is another: if r, s Q and r < s,
Q satisfies r < r+s
< s. The real numbers fill these gaps.
then r+s
2
2
26
Ordered sets
1.5 Definition. Let S be an nonempty set. A partial order on S is a relation, denoted
by , satisfying the following properties:
(i) (Antisymmetry) If x, y S are such that x y and y x, then x = y.
(ii) (Transitivity) If x, y, z S such that x y and y z, then x z.
Example. On the set of natural numbers N = {1, 2, 3, . . .}, define by
x y x|y,
x, y N .
1.6 Definition. The nonempty set S together with the order is call a partially ordered
set, abbreviated as poset. Write (S, ) to mean the set S whose partial order is . When
the partial order is understood, we simply write S instead of (S, ).
When any two elements x, y of a poset S is comparable, i.e., either x y or y x is true,
then S is said to be totally ordered. We then say that S is an (totally) ordered set.
The Law of trichotomy. Let S be an ordered set and let x, y S. Then exactly one
of the following holds: x < y, x = y, or x > y.
Since the set of all real numbers R is totally ordered, the Law of Trichotomy holds in R.
That is, given two real numbers, we can always determine whether they are equal, or one
is larger than the other.
1.7 Definition. A subset E of an ordered set S is bounded above (resp., bounded below)
if there exists S (resp., S) such that x 6 (resp., 6 x) for all x E; such a
(resp., ) is called an upper bound (resp., lower bound) of E.
1.8 Definition. Suppose S is an ordered set, E S, and E is bounded above. Suppose
there exists an S with the following properties:
(i) is an upper bound of E.
(ii) If < , then is not an upper bound of E.
Then is called the least upper bound (or supremum) of E and we write
= sup E.
The greatest lower bound (or infimum) of a set E which is bounded below is defined in a
similar manner: The statement
= inf E
means that is a lower bound of E and that no with > is a lower bound of E.
27
1.9 Examples.
(a) Consider the sets A = {p Q+ : p2 < 2} and B = {p Q+ : p2 > 2} of Example
1.1 as subsets of the ordered set Q. The set A is bounded above. In fact, the
upper bounds of A are exactly the members of B. Since B contains no smallest
element, A has no least upper bound in Q. Similarly, B is bounded below: The
set of all lower bounds of B is A {r Q : r 6 0}. Since A has no largest element,
B has no greatest lower bound in Q.
(b) If = sup E exists, then may or may not be a member of E. For instance, let
E1 = {r Q : r < 0} and let E2 = {r Q : r 6 0}. Then
sup E1 = sup E2 = 0,
and 0 6 E1 , 0 E2 .
(c) Let E = { n1 : n = 1, 2, 3, . . .}. Then sup E = 1 E, and inf E = 0 6 E.
1.10 Definition. An ordered set S is said to have the least-upper-bound property if the
following is true: If 6= E S and E is bounded above, then sup E exists in S.
Example 1.9(a) shows that Q does not have the least-upper-bound property.
We shall now show that there is a close relation between greatest lower bounds and least
upper bounds, and that every ordered set with the least-upper-bound property also has
the greatest-lower-bound property.
1.11 Theorem. Suppose S is an ordered set with the least-upper-bound property, 6=
B S, and B is bounded below. Let L = {m : m 6 x, x B}, the set of all lower
bounds of B. Then
= sup L
exists in S, and = inf B. In particular, inf exists in S.
Proof. Since B is bounded below, L 6= . Since L consists of exactly those y S such
that y 6 x for every x B, we see that every x B is an upper bound of L. Thus, L is
bounded above. Our hypothesis about S implies therefore that L has a supremum in S;
call it .
If < then is not an upper bound of L, hence 6 B. It follows that 6 x for
every x B. Thus, L.
If < then 6 L, since is an upper bound of L. We have shown that L but
6 L if > . In other words, is a lower bound of B, but is not if > . This
means that = inf B.
28
Fields
1.12 Definition. A field is a set F with two operations, called addition and multiplication, which satisfy the following axioms: for any x, y, z F ,
(A1) (Closed under addition) x + y F .
(A2) (Commutative law for addition) x + y = y + x.
(A3) (Associative law for addition) (x + y) + z = x + (y + z).
(A4) (Existence of additive identity) 0 F such that 0 + x = x.
(A5) (Existence of additive inverse) To every x F corresponds an element x F
such that x + (x) = 0; x is the additive inverse of x.
(M1) (Closed under multiplication) If x, y F , then xy F .
(M2) (Commutative law for multiplication) xy = yx.
(M3) (Associative law for multiplication) (xy)z = x(yz).
(M4) (Existence of multiplicative identity) 0 6= 1 F such that 1x = x for every x F .
(M5) (Existence of multiplicative inverse) If 0 6= x F , then there exists x1 F such
that x( x1 ) = 1; x1 is the multiplicative inverse of x 6= 0.
(D1) (Distributive law) x(y + z) = xy + xz.
Simply speaking, a field is a set of numbers in which one can add, subtract, multiply and
divide (by nonzero) elements of the set, and the results are still elements of the set.
Our usual experience suggests that the set of all real numbers R, the set of all complex
numbers C, and the set of all rational numbers Q, are fields.
1.13 Remarks.
(a) One usually writes (in any field)
x y,
x
,
y
x + y + z, xyz, x2 , x3 , 2x, 3, . . .
in instead of
x + (y), x( y1 ), (x + y) + z, (xy)z, xx, xxx, x + x, x + x + x, . . .
(b) The field axioms clearly hold in Q, the set of all rational numbers, if addition and
multiplication have their customary meaning. Thus Q is a field.
1.14 Proposition. The axioms for addition imply the following statements.
(a) (Cancellation law) If x + y = x + z then y = z.
(b) (Uniqueness of the additive identity) If x + y = x then y = 0.
(c) (Uniqueness of the additive inverse) If x + y = 0 then y = x.
(d) (x) = x.
Proof. (a) If x + y = x + z, the axioms for addition give
y = 0 + y = (x + x) + y = x + (x + y) = x + (x + z) = (x + x) + z = 0 + z = z.
29
1.15 Proposition. The axioms for multiplication imply the following statements.
(a) If x 6= 0 and xy = xz then y = z.
(b) If x 6= 0 and xy = x then y = 1.
(c) If x 6= 0 and xy = 1 then y = x1 .
1
(d) If x 6= 0 then 1/x
= x.
Proof. Exercise.
1.16 Proposition. The field axioms imply the following statements, for any x, y F ,
(a) 0x = 0.
(b) If x 6= 0 and y 6= 0 then xy 6= 0.
(c) (x)y = (xy) = x(y).
(d) (x)(y) = xy.
Proof. (a) Since 0x + 0x = (0 + 0)x = 0x, Proposition 1.14(b) then implies that 0x = 0.
(b) Suppose on the contrary that xy = 0. Then by (a) and (M5), we have
1
1
1
1
xy =
0 = 0,
1=
y
x
y
x
which is a contradiction.
(c) Since
(x)y + xy = (x + x)y = 0y = 0,
Proposition 1.14(c) then implies (x)y = (xy). The remaining assertion is proved
similarly.
(d) By (c) and Proposition 1.14(d), we have
(x)(y) = [x(y)] = [(xy)] = xy.
1.17 Definition. An ordered field is a field F which is also an ordered set, such that
(i) x + y < x + z if x, y, z F and y < z,
(ii) xy > 0 if x, y F and x, y > 0.
If x > 0, we call x positive; if x < 0, x is negative.
For example, Q is an ordered field.
1.18 Proposition. The following statements are true in every ordered field.
(a) If x > 0 then x < 0, and vice versa.
(b) If x > 0 and y < z then xy < xz.
(c) If x < 0 and y < z then xy > xz.
(d) If x 6= 0 then x2 > 0. In particular, 1 > 0.
(e) If 0 < x < y then 0 < y1 < x1 .
30
31
The second statement means that Q R and that the operations of addition and multiplication in R, when applied to the elements of Q, coincide with the usual operations on
rational numbers. The elements of R are called real numbers.
1.20 Theorem.
(a) (The archimedean property of R) If x, y R and x > 0, then there is a positive
integer n such that nx > y.
(b) If x, y R and x < y, then there exists a p Q such that x < p < y.
Proof. (a) Let A = {nx : n Z+ }. If (a) were false, then y would be an upper bound of
A. But then A has a least upper bound in R. Put = sup A. Since x > 0, x < so
that x is not an upper bound of A. Hence x < mx for some m Z+ . But then
< (m + 1)x A, which is a contradiction.
(b) Since x < y, we have y x > 0, and (a) furnishes a positive integer n such that
n(y x) > 1.
Apply (a) again, to obtain positive integers m1 and m2 such that m1 > nx, m2 > nx.
Then
m2 < nx < m1 .
Hence there is an integer m (with m2 6 m 6 m1 ) such that
m 1 6 nx < m.
m
.
n
1.21 Theorem. (Existence of nth roots of positive reals) For every real x > 0 and every
integer n >0, there is one and only one positive real y such that y n = x. This number y
is written n x or x1/n .
Proof. (Uniqueness) Let y1 and y2 are both nth roots of x. If y1 6= y2 , then we may
assume that 0 < y1 < y2 . Hence,
x = y1n < y2n = x,
which is a contradiction. Consequently, y1 = y2 .
x
(Existence) Let E = {t R+ : tn < x}. If t = 1+x
then 0 < t < 1 so that tn < t < x.
Thus, t E and E 6= .
If t > 1 + x then tn > t > x so that t 6 E. Thus 1 + x is an upper bound of E.
32
To prove that y = x we will show that each of the inequalities y n < x and y n > x leads
to a contradiction.
The identity bn an = (b a)(bn1 + bn2 a + + an1 ) yields the inequality
bn an < (b a)nbn1
when 0 < a < b. Assume y n < x. Choose h so that 0 < h < 1 and
x yn
h<
.
n(y + 1)n1
Put a = y, b = y + h. Then
(y + h)n y n < hn(y + h)n1 < hn(y + 1)n1 < x y n .
Thus (y + h)n < x and y + h E. Since y + h > y, this contradicts the fact that y is an
upper bound of E.
Assume y n > x. Put
yn x
k=
.
ny n1
Then 0 < k < y. If t > y k, we conclude that
y n tn 6 y n (y k)n < kny n1 = y n x.
(6)
33
Conversely, for any infinite decimal (6) the set E of numbers (5) is bounded above, and
(6) is the decimal expansion of sup E.
34
(a) z + w = z + w,
(b) zw = zw,
35
(b) |
z | = |a ib| = a2 + (b)2 = a2 + b2 = |z|.
(c) Since zw = (ac bd) + i(ad + bc),
(d) This follows from | Re(z)| = |a| = a2 6 a2 + b2 = |z|; the equality holds b =
0 z is real.
(e)
|z + w|2 = (z + w)(z + w)
= (z + w)(
z + w)
= z z + z w + zw + ww
= |z|2 + z w + z w + |w|2
= |z|2 + 2 Re(z w)
+ |w|2
6 |z|2 + 2|z w|
+ |w|2
= |z|2 + 2|z||w|
+ |w|2
= (|z| + |w|)2 .
i=1
z1
w1
z2
w2
i=1
zn
.
wn
=
= =
The equality holds if and only if
Pn
P
P
Proof. Let A = i=1 |zi |2 , B = ni=1 |wi |2 and C = ni=1 zi wi . If B = 0, then w1 = =
wn = 0 and the inequality is trivial. So, suppose B > 0. By Theorem 1.31, we have
n
X
i=1
|Bzi Cwi |2 =
n
X
i=1
= B2
n
X
i=1
Pn
|zi |2 B C
2
= B(AB |C| ).
n
X
i=1
zi wi BC
n
X
i=1
zi wi + |C|2
n
X
i=1
|wi |2
Since i=1 |Bzi Cwi |2 > 0 and B > 0, it follows that AB > |C|2 . The equality holds
C
|Bz1 Cw1 | = = |Bzn Cwn | = 0 wz11 = = wznn = B
.
36
Euclidean spaces
1.36 Definitions. For each positive integer k, let Rk be the set of all ordered k-tuples
x = (x1 , x2 , . . . , xk )T ,
where x1 , . . . , xk R, called the coordinates of x. The elements of Rk are called points,
or vectors, especially when k > 1. If y = (y1 , . . . , yk )T and if R, put
x + y = (x1 + y1 , . . . , xk + yk )T
and x = (x1 , . . . , xk )T
so that x+y Rk and x Rk . This defines addition of vectors and scalar multiplication
of a vector by a real scalar. These two operations satisfy the commutative, associative,
and distributive laws and make Rk into a vector space over the real field. The zero element
of Rk (called the zero vector) is the point 0 all of whose coordinates are 0. We also define
the usual inner product (or scalar product) of x and by
xy =
and the norm of x by
kxk2 = (x x)
k
X
1/2
xi yi
i=1
X
k
x2i
i=1
1/2
The structure defined (the vector space Rk with the above inner product and norm) is
called the k-dimensional Euclidean space.
1.37 Theorem. Suppose x, y, z Rk , and R. Then
(a) kxk2 > 0; the equality holds if and only if x = 0.
(b) kxk2 = ||kxk2 ;
(c) |x y| 6 kxk2 kyk2 ;
(d) kx + yk2 6 kxk2 + kyk2 .
Proof. (a) It is clear that
kxk22
k
X
x2i > 0;
i=1
i=1
i=1
i.e., (x y)2 6 kxk22 kyk22 . The desired inequality then follows upon taking square roots of
both sides.
37
= x x + 2(x y) + y y
6 kxk22 + 2|x y| + kyk22
= (kxk2 + kyk2 )2 .
1.38 Remarks. Theorem 1.37 (a) and (d) will allow us to regard Rk as a metric space.
R1 (the set of all real numbers) is the real line; R2 is the xy-plane, or the complex plane.
In these two cases the norm is just the absolute value of the corresponding real or complex
number.
38
Basic Topology
2.3 Definition. Let A and B be sets. If there exists a bijective mapping f : A B,
then A and B have the same cardinal number. This means that if A and B are finite sets,
then A and B have the same number of elements; if A and B are infinite, then they have
the same level of infinity.
Proposition. Declare two sets A and B are equivalent, written A B, if there exists a
bijection f : A B. The relation between sets is an equivalence relation.
Proof. Exercise.
2.4 Definition. For any positive integer n, let Jn be the set whose elements are the
integers 1, 2, . . . , n; let N be the set of all positive integers. For any set A,
(a) A is finite if A Jn for some n (the empty set is also considered to be finite).
(b) A is infinite if A is not finite.
(c) A is countable if A N.
(d) A is uncountable if A is neither finite nor countable.
(e) A is at most countable if A is finite or countable.
Countable sets are sometimes called enumerable, or denumerable.
2.5 Example. Let Z be the set of all integers. Then Z is countable. For, consider the
following arrangement of the sets Z and N:
Z : 0, 1, 1, 2, 2, 3, 3, . . .
N : 1, 2, 3, 4, 5, 6, 7, . . .
An explicit formula for a bijection f : N Z can be given:
(
n
if n even,
f (n) = 2 n1
if n is odd.
2
2.6 Remark. A finite set cannot be equivalent to one of its proper subsets. This is,
however, possible for infinite sets, as is shown by Example 2.5, in which N is a proper
subset of Z.
In fact, we could replace Definition 2.4(b) by the statement: A is infinite if A is equivalent
to one of its proper subsets.
39
2.7 Definition. By a sequence, we mean a function f defined on the set N of all positive
integers. If f (n) = xn for n N, it is customary to denote the sequence f by the symbol
{xn }, or sometimes by x1 , x2 , x3 , . . .. The values of f , that is, the elements xn , are called
the terms of the sequence. If A is a set and if xn A for all n N, then {xn } is said to
be a sequence in A, or a sequence of elements of A.
Note that the terms x1 , x2 , x3 , . . . of a sequence need not be distinct.
Since every countable set is the range of a injection defined on N, we may regard every
countable set as the range of a sequence of distinct terms.
Speaking more loosely, we may say that the elements of any countable set can be arranged
in a sequence. Sometimes it is convenient to replace N in this definition by the set of all
nonnegative integers N0 = {0, 1, 2, . . .}.
2.8 Theorem. Every infinite subset of a countable set A is countable.
Proof. Let E be an infinite subset of A. Arrange the elements x of A in a sequence {xn }
of distinct terms. Construct a sequence {nk } as follows: Let n1 be the smallest positive
integer such that xn1 E. Having chosen n1 , . . . , nk1 (k = 2, 3, 4, . . .), let nk be the
smallest integer greater than nk1 such that xnk E.
Defining f (k) = xnk (k = 1, 2, 3, . . .), we obtain a bijection between E and N.
The theorem shows that, roughly speaking, countable sets represent the smallest infinity: No uncountable set can be a subset of a countable set.
2.9 Definition. Let A and be sets, and suppose that with each element of A there
is associated a subset of which we denote by E .
The set whose elements are the sets E will be denoted by {E }. Instead of speaking
of sets of sets, we shall sometimes speak of a collection of sets, or a family of sets.
2.10 Examples. Let A = {x R : 0 < x 6 1}. For every x A, let Ex = {y R : 0 <
y < x}. Then
(i) E
Sx Ez if and only if 0 < x 6 z 6 1;
(ii) TxA Ex = E1 ;
(iii) xA Ex = .
(i) and
T (ii) are clear. To prove (iii), we note that for every y > 0, y 6 Ex if x < y. Hence,
y 6 xA Ex .
40
[
S=
En .
(15)
n=1
Then S is countable.
Proof. Let every set En be arranged in a sequence {xnk }, k = 1, 2, 3, . . ., and consider the
infinite array
x11
x12
x13
x14 . . .
x21
x22
x23
x24 . . .
(16)
x31
x32
x33
x34 . . .
x41
x42
x43
x44 . . .
... ... ... ... ... ... ... ...
in which the elements of En form the nth row. The array contains all elements of S, which
can be arranged in a sequence
x11 ; x21 , x12 ; x31 , x22 , x13 ; x41 , x32 , x23 , x14 ; . . .
(17)
If any two of the sets En have elements in common, these repeated elements will occur in
(17). Hence, there is a subset T of the set of all positive integers such that S T , which
shows that S is at most countable (Theorem 2.8). Since E1 S, and E1 is infinite, S is
infinite, and thus countable.
Corollary. Suppose A is at most countable, and, for every A, B is at most
countable. Put
[
T =
B .
A
2.13 Theorem. Let A be a countable set. The set of all ordered n-tuples
is countable.
Bn = {(a1 , . . . , an ) : a1 , . . . , an A}
(18)
For every fixed b, the set of pairs (b, a) is equivalent to A, and hence countable. Thus Bn
is the union of a countable set of countable sets. By Theorem 2.12, Bn is countable. The
theorem follows by induction.
41
42
Metric spaces
2.15 Definition. A set X, whose elements we shall call points, is said to be a metric
space if with any two points p, q X there is associated a real number d(p, q), called the
distance from p to q, such that
(a) (positive definiteness) d(p, q) > 0, with the equality holds if and only if p = q;
(b) (symmetry) d(p, q) = d(q, p);
(c) (triangle inequality) d(p, q) 6 d(p, r) + d(r, q), for any r X.
Any function satisfying these three properties is called a distance function, or a metric.
2.16 Examples. The most important examples of metric spaces are the Euclidean spaces
Rk , especially R1 (the real line) and R2 (the complex plane); the distance in Rk is defined
by
d(x, y) = kx yk2 (x, y Rk ).
(19)
By Theorem 1.37, the conditions of Definition 2.15 are satisfied by (19).
It is important to observe that every subset Y of a metric space X is a metric space in
its own right, with the same distance function. For it is clear that if conditions (a)(c)
of Definition 2.15 hold for p, q, r X, they also hold if p, q, r are restricted to lie in Y .
Thus, every subset of a Euclidean space is a metric space.
2.17 Definition.
Segment: (a, b) = {x R : a < x < b}; interval: [a, b] = {x : a 6 x 6 b};
half-open intervals: [a, b) = {x R : a 6 x < b} and (a, b] = {x R : a < x 6 b}.
If ai < bi for i = 1, . . . , k, the set {(x1 , . . . , xk ) : ai 6 xi 6 bi , 1 6 i 6 k} is called a k-cell.
Thus a 1-cell is an interval, a 2-cell is a rectangle, etc.
If x Rk and r > 0, the open (resp., closed) ball B with center at x and radius r is defined
as Br (x) = {y Rk : ky xk2 < r} (resp., ky xk2 6 r).
We call a set E Rk convex if
x + (1 )y E
whenever x, y E, and 0 < < 1. For example, balls are convex. For if ky xk2 < r,
kz xk2 < r, and 0 < < 1, we have
ky + (1 )z x| = k(y x) + (1 )(z x)k2
6 ky xk2 + (1 )kz xk2
< r + (1 )r
= r.
The same proof applies to closed balls. It is also easy to see that k-cells are convex.
43
2.18 Definition. Let X be a metric space. All points and sets mentioned below are
understood to be elements and subsets of X.
(a) A neighborhood of p is a set Nr (p) = {q X : d(p, q) < r} for some r > 0. The
number r is called the radius of Nr (p).
(b) A point p is a limit point of the set E if every neighborhood of p contains a point
q 6= p such that q E.
(c) If p E and p is not a limit point of E, then p is called an isolated point of E.
(d) E is closed if every limit point of E is a point of E.
(e) A point p is an interior point of E if there is a neighborhood N of p such that
N E.
(f) E is open if every point of E is an interior point of E.
(g) The complement of E (denoted by E c ) is the set {p X : p 6 E}.
(h) E is perfect if E is closed and if every point of E is a limit point of E.
(i) E is bounded if there is a real number M and a point q X such that d(p, q) < M
for all p E.
(j) E is dense in X if every point of X is a limit point of E, or a point of E (or both).
Let us note that in R1 neighborhoods are segments, whereas in R2 neighborhoods are
interiors of circles.
Example. The empty set is open and closed. Why?
2.19 Theorem. Every neighborhood is an open set.
Proof. Consider a neighborhood E = Nr (p), and let q E. Then there exists h > 0 such
that
d(p, q) = r h.
For all points s such that d(q, s) < h, we then have
d(p, s) 6 d(p, q) + d(q, s) < r h + h = r,
Then r > 0. The neighborhood Nr (p) contains no point q of E such that q 6= p, which
contradicts p being a limit point of E.
Corollary. A finite point set has no limit points.
44
45
T
and Fc is open, by Theorem 2.23. Hence (a) implies that (21) is open so that F is
closed.
T
Next, put H = ni=1 Gi . For any x H, there exist neighborhoods Ni of x, with radii
ri such that Ni Gi (i = 1, 2, . . . , n). Put
r = min(r1 , . . . , rn ),
i=1
2.25 Examples. In parts (c) and (d) of the preceding theorem, the finiteness of the
collections is essential.
For let Gn = ( n1 , n1 ), (n = 1, 2, 3, . . .), which is an open subset
T
of R. Put G = n=1 Gn . Then G = {0}, which is not open in R. Thus the intersection
of an infinite collection of open sets need not be open. Similarly, the union of an infinite
collection of closed sets need not be closed.
2.26 Definition. If X is a metric space, if E X, and if E denotes the set of all limit
points of E in X, then the closure of E is the set E = E E .
2.27 Theorem. If X is a metric space and E X, then
(a) E is closed,
(b) E = E if and only if E is closed,
(c) E F for every closed set F X such that E F .
By (a) and (c), E is the smallest closed subset of X containing E.
then p is neither a point of E nor a limit point of E. Hence
Proof. (a) If p X and p 6 E,
The complement of E is therefore open.
p has a neighborhood which does not intersect E.
Hence E is closed.
(a) implies that E is closed. If E is closed, then E E, hence E = E.
(b) If E = E,
point of E. Hence y E.
46
47
Compact sets
2.31 Definition. By an open cover of a set
S E in a metric space X we mean a collection
{G } of open subsets of X such that E G .
Let X be a metric space and E a subset of X. Open covers of E always exists. For a
fixed r > 0, {Nr (x) : x E} is an open cover of E because for each x E, {x} Nr (x)
so that
[
[
E=
{x}
Nr (x).
xE
xE
(22)
K V 1 V n .
(23)
48
which is a contradiction.
K1 K1 Kn = ,
Corollary. If {Kn } T
is a sequence of nonempty compact sets such that Kn Kn+1
(n = 1, 2, 3, . . .), then
n=1 Kn is not empty.
2.37 Theorem. If E is an infinite subset of a compact set K, then E has a limit point
in K.
Proof. If no point of K were a limit point of E, then each q K would have a neighborhood Vq which contains at most one point of E (namely, q, if q E). It is clear that no
finite subcollection of {Vq } can cover E; and the same is true of K, since E K. This
contradicts the compactness of K.
2.38 Theorem. (Nested interval property)
If {In } is a sequence of intervals in R such
T
that In In+1 (n = 1, 2, 3, . . .), then
I
n=1 n is not empty.
49
(1 6 j 6 k; n = 1, 2, 3, . . .).
Then kx yk2 6 , if x I, y I.
Suppose, on the contrary, that there exists an open cover {G } of I which contains no
a +b
finite subcover of I. Put cj = j 2 j . The intervals [aj , cj ] and [cj , bj ] then determine 2k
k-cells Qi whose union is I.
At least one of these sets Qi , call it I1 , cannot be covered by any finite subcollection of
{G } (otherwise I could be so covered). We next subdivide I1 and continue the process.
We obtain a sequence {In } with the following properties:
(a) I I1 I2 I3 ;
(b) In is not covered by any finite subcollection of {G };
(c) if x In and y In , then kx yk2 6 2n .
By (a) and Theorem 2.39, there is a point x which lies in every In . For some , x G .
Since G is open, there exists r > 0 such that ky x k2 < r implies that y G . If n is
so large that 2n < r, then (c) implies that In G , which contradicts (b).
This completes the proof.
50
The equivalence of (a) and (b) in the next theorem is known as the Heine-Borel theorem.
2.41 Theorem. If a set E in Rk has one of the following three properties, then it has
the other two:
(a) E is closed and bounded.
(b) E is compact.
(c) Every infinite subset of E has a limit point in E.
Proof. If (a) holds, then E I for some k-cell I, and (b) follows from Theorems 2.40 and
2.35. Theorem 2.37 shows that (b) implies (c). It remains to show that (c) implies (a).
If E is not bounded, then E contains points xn such that
kxn k2 > n (n = 1, 2, 3, . . .).
The set S = {xn : n = 1, 2, 3, . . .} is infinite and clearly has no limit point in Rk , hence
has none in E. Thus (c) implies that E is bounded.
If E is not closed, then there is a point x0 Rk which is a limit point of E but not a
point of E. For n = 1, 2, 3, . . ., there are points xn E such that kxn x0 k2 < n1 . Let
S = {xn : n = 1, 2, 3, . . .}. Then S is infinite (otherwise kxn x0 k2 would have a constant
positive value, for infinitely many n), S has x0 as the only limit point in Rk . For if y Rk ,
y 6= x0 , then
kxn yk2 > kx0 yk2 kxn x0 k2
1
> kx0 yk2
n
1
> kx0 yk2
2
for all but finitely many n; this shows that y is not a limit point of S (Theorem 2.20).
Thus, S has no limit point in E; hence E must be closed if (c) holds.
Remark. (b) and (c) are equivalent in any metric space. If the underlying metric space
is finite dimensional (e.g., Rk ), then (a) and (b) are equivalent. In general, a subset of a
metric space is compact if and only if it is complete and totally bounded; this equivalence
holds should the dimension of the metric space be finite or not.
2.42 Theorem (Weierstrass). Every bounded infinite subset of Rk has a limit point
in Rk .
Proof. Let E be a bounded infinite subset of Rk . Then there exists a k-cell I Rk such
that E I. By Theorem 2.40, I is compact, and so E has a limit point in I by Theorem
2.37.
51
Perfect sets
2.43 Theorem. Let P be a nonempty perfect set in Rk . Then P is uncountable.
Proof. Since P has limit points, P must be infinite. Suppose P is countable, and denote
the points of P by x1 , x2 , x3 , . . .. We shall construct a sequence {Vn } of neighborhoods,
as follows.
Let
V1 = {y Rk : ky x1 k2 < r}, and V1 = {y Rk : ky x1 k2 6 r}.
Then V1 is a neighborhood of x1 , and V1 is the closure of V1 .
Suppose Vn has been constructed, so that Vn P is not empty. Since every point of P is
a limit point of P , there is a neighborhood Vn+1 such that (i) Vn+1 Vn , (ii) xn 6 Vn+1 ,
(iii) Vn+1 P is not empty. By (iii), Vn+1 satisfies our induction hypothesis, and the
construction can proceed.
Put Kn = Vn P . Since
T Vn is closed and bounded, Vn is compact.
T Since xn 6 Kn+1 ,
no point of P lies in n=1 Kn . Since Kn P , this implies that n=1 Kn is empty. But
each Kn is nonempty, by (iii), and Kn Kn+1 , by (i); this contradicts the Corollary to
Theorem 2.36.
Corollary. Every interval [a, b] (a < b) is uncountable. In particular, the set of all real
numbers is uncountable.
2.44 The Cantor set. The set which we are now going to construct shows that there
exist perfect sets in R which contain no segment.
Let E0 = [0, 1]. Remove the segment ( 13 , 32 ), and let E1 be the union of the intervals
1
2
[0, ], [ , 1].
3
3
Remove the middle thirds of these intervals, and let E2 be the union of the intervals
1
2 3
6 7
8
[0, ], [ , ], [ , ], [ , 1].
9
9 9
9 9
9
Continuing in this way, we obtain a sequence of compact sets En , such that
(a) E1 E2 E3 ;
(b) En is the union of 2n intervals, each of length 3n .
52
The set
P =
En
n=1
is called the Cantor set. P is clearly compact, and Theorem 2.36 shows that P is not
empty.
No segment of the form
3k + 1 3k + 2
),
(24)
( m ,
3
3m
where k and m are positive integers, has a point in common with P . Since every segment
(, ) contains a segment of the form (24), if
,
3m <
6
P contains no segment.
To show that P is perfect, it is enough to show that P contains no isolated point. Let
x P , and let S be any segment containing x. Let In be that interval of En which
contains x. Choose n large enough, so that In S. Let xn be an endpoint of In such
that xn 6= x.
It follows from the construction of P that xn P . Hence, x is a limit point of P , and P
is perfect.
One of the most interesting properties of the Cantor set is that it provides us with an
example of an uncountable set of measure zero.
P
P
1
1
2 n
n1 1
The length of the removed middle-thirds =
( 3n ) = 13
n=1 2
n=0 ( 3 ) = ( 3 )( 1 2 ) = 1.
3
Thus, the Lebesgue measure (the length) of the Cantor set = 1 1 = 0.
53
Connected sets
2.45 Definition. Two subsets A and B of a metric space X are said to be separated if
= and A B = . A set E X is said to be connected if E is not a union
both A B
of two nonempty separated sets.
2.46 Remark. Separated sets are disjoint, but disjoint sets need not be separated. For
example, the interval [0, 1] and the segment (1, 2) are not separated, since 1 is a limit
point of (1, 2). However, the segments (0, 1) and (1, 2) are separated.
Connected subsets of the line have a particularly simple structure:
2.47 Theorem. A subset E of R1 is connected if and only if it has the following property:
If x, y E, and x < z < y, then z E.
54
(b)
(c)
(d)
(e)
If
If
If
If
sn
sn
sn
sn
Proof. (a) Let V be a neighborhood of p. There exists > 0 such that for q X
and d(q, p) < , q V . There exists N such that n > N implies d(pn , p) < . Thus,
n > N implies pn V .
(b) Let > 0 be given. There exist integers N, N such that n > N implies d(pn , p) <
and n > N implies d(pn , p ) < 2 . Hence, if n = max(N, N ), we have
d(p, p ) 6 d(p, pn ) + d(pn , p ) < + = .
2 2
55
(c) Since pn p, there is an integer N such that n > N implies d(pn , p) < 1. Put
r = max(1, d(p1 , p), . . . , d(pN 1 , p)).
1
n sn
(d) lim
Proof. (a) Given > 0. There exist integers N1 , N2 such that n > N1 implies |sn s| < 2
and n > N2 implies |tn t| < 2 . If N = max(N1 , N2 ), then n > N implies
|(sn + tn ) (s + t)| 6 |sn s| + |tn t| < + = .
2 2
(b) Given > 0. There exists an integer N such that n > N implies |sn s| < |c|+1
,
hence
< .
|csn cs| = |c||sn s| 6 |c|
|c| + 1
The convergence c + sn c + s follows from (a) by setting tn = c for all n.
(1)
and
so that lim (sn s)(tn t) = 0. We now apply (a) and (b) to (1), and conclude that
n
(d) Choose m such that |sn s| < 12 |s| if n > m. By the triangle inequality, we have
1
1
|s| |sn | 6 |sn s| < |s| |s| < |sn | when n > m.
2
2
Given > 0. There is an integer N > m such that n > N implies
1
|sn s| < |s|2 .
2
Hence, for n > N ,
1
1
= |sn s| < 2|sn s| < .
sn s
|sn ||s|
|s|2
56
3.4 Theorem.
(a) Suppose xn = (x1,n , . . . , xk,n )T Rk (n = 1, 2, 3, . . .). Then {xn } converges to
x = (x1 , . . . , xk )T if and only if
lim xj,n = xj
(1 6 j 6 k).
(2)
(b) Suppose {xn }, {yn } are sequences in Rk , {n } is a sequence of real numbers, and
xn x, yn y, n . Then
lim (xn + yn ) = x + y,
lim xn yn = x y,
lim n xn = x.
Since xn x, for j = 1, 2, . . . , k,
|xj,n xj | 6 kxn xk2 0 as n ,
i.e., (2) holds.
To each > 0 there corresponds an integer N such that n > N implies
(1 6 j 6 k).
|xj,n xj | <
k
Hence n > N implies
kxn xk2 =
X
k
j=1
|xj,n xj |
1/2
<
X
k
j=1
2 1/2
= ,
so that xn x.
(b) follows from (a) and Theorem 3.3.
Subsequences
3.5 Definition. Given a sequence {pn }, consider a sequence {nk } of positive integers,
such that n1 < n2 < n3 < . Then the sequence {pni } is called a subsequence of {pn }.
If {pni } converges, its limit is called a subsequential limit of {pn }.
It is clear that {pn } converges to p every subsequence of {pn } converges to p.
3.6 Theorem.
(a) If {pn } is a sequence in a compact metric space X, then some subsequence of {pn }
converges to a point of X.
(b) Every bounded sequence in Rk contains a convergent subsequence.
57
Proof. (a) Let E be the set of all terms of {pn }. If E is finite then there is a p E and a
sequence {ni } with n1 < n2 < n3 < such that
pn1 = pn2 = pn3 = .
If E is infinite, Theorem 2.37 shows that E has a limit point p X. Choose n1 so that
d(p, pn1 ) < 1. Having chosen n1 , . . . , ni1 we see from Theorem 2.20 that there is an
integer ni > ni1 such that d(p, pni ) < 1i . Then {pni } converges to p.
(b) This follows from (a), since Theorem 2.41 implies that every bounded subset of Rk
lies in a compact subset of Rk .
3.7 Theorem. The subsequential limits of a sequence {pn } in a metric space X form a
closed subset of X.
Proof. Let E be the set of all subsequential limits of {pn } and let q be a limit point of
E . We have to show that q E . Choose n1 so that pn1 6= q. (If no such n1 exists,
then E has only one point, and there is nothing to prove.) Put = d(q, pn1 ). Suppose
n1 , . . . , ni1 are chosen. Since q is a limit point of E , there is an x E with d(x, q) < 2i .
Since x E , there is an ni > ni1 such that d(x, pni ) < 2i . Thus
d(q, pni ) 6
2i1
Cauchy sequences
3.8 Definition. A sequence {pn } in a metric space X is said to be a Cauchy sequence if
for every > 0, there is an integer N such that d(pn , pm ) < if m, n > N .
3.9 Definition. Let E be a nonempty subset of a metric space X, and let
S = {d(p, q) : p, q E}.
3.10 Theorem.
(a) Let E be a subset of a metric space X, then diam E = diam E.
(b) If Kn is a sequence of compact sets in X such that Kn Kn+1 (n = 1, 2, 3, . . .)
and if
lim diam Kn = 0,
n
T
then n=1 Kn consists of exactly one point.
58
it is clear that
Proof. (a) Since E E,
diam E 6 diam E.
By the definition of E,
there are points p , q E such
Fix > 0, and choose p, q E.
that d(p, p ) < , d(q, q ) < . Hence
d(p, q) < d(p, p ) + d(p , q ) + d(q , q)
< + d(p , q ) + 6 2 + diam E.
It follows that
diam E 6 2 + diam E,
and since was arbitrary, (a) is proved.
T
(b) Put K =
n=1 Kn . By Theorem 2.36, K is not empty. If K contains more than one
point, then diam K > 0. But for each n, Kn K so that diam Kn > diam K. This
contradicts the assumption that diam Kn 0.
3.11 Theorem.
(a) In any metric space X, every convergent sequence is a Cauchy sequence.
(b) If X is a compact metric space and if {pn } is a Cauchy sequence in X, then {pn }
converges to some point of X.
(c) In Rk , every Cauchy sequence converges.
The fact (contained in Theorem 3.11) that a sequence converges in Rk if and only if it is
a Cauchy sequence is usually called the Cauchy criterion for convergence.
Proof. (a) Let pn p and let , there is an integer N such that d(p, pn ) < for all n > N .
Hence,
d(pn , pm ) 6 d(pn , p) + d(p, pm ) < 2
as soon as m, n > N . Thus {pn } is a Cauchy sequence.
(b) Let {pn } be a Cauchy sequence in the compact space X. For N = 1, 2, 3, . . ., let
EN = {pN , pN +1 , pN +2 , . . .}. Then
lim diam EN = 0
(3)
N
by Definition 3.9 and Theorem 3.10(a). Being a closed subset of the compact space X,
each EN is compact (Theorem 2.35). Also EN EN +1 so that EN EN +1 .
Theorem 3.10(b) shows now that there is a unique p X which lies in every EN . Let
> 0 be given. By (3) there is an integer N0 such that diam EN < if N > N0 . Since
p EN , it follows that d(p, q) < for every q EN , hence for every q EN . In other
words, d(p, pn ) < if n > N0 . This says precisely that pn p.
(c) Let {xn } be a Cauchy sequence in Rk . Define EN as in (b), with xi in place of pi . For
some N , diam EN < 1. The set of all terms of {xn } is the union EN {x1 , . . . , xN 1 }.
Hence {xn } is bounded. Since every bounded subset of Rk has compact closure in Rk
(Theorem 2.41), (c) follows from (b).
59
3.12 Definition. A metric space in which every Cauchy sequence converges is said to be
complete.
Theorem 3.11 says that all compact metric spaces and all Euclidean spaces are complete.
Theorem 3.11 implies also that every closed subset E of a complete metric space X is
complete. (Every Cauchy sequence in E is a Cauchy sequence in X, hence it converges
to some p X, and actually p E since E is closed.)
An example of a metric space which is not complete is the space of all rational numbers,
with d(x, y) = |x y|.
3.13 Definition. A sequence {sn } of real numbers is said to be
(a) monotonically increasing if sn 6 sn+1 (n = 1, 2, 3, . . .);
(b) monotonically decreasing if sn > sn+1 (n = 1, 2, 3, . . .).
The class of monotonic sequences consists of the increasing and the decreasing sequences.
3.14 Theorem. Suppose that {sn } is monotone. Then {sn } converges if and only if it is
bounded.
Proof. Suppose sn 6 sn+1 (the proof is analogous in the other case). Let E be the set of
all terms of {sn }.
This follows from Theorem 3.2(c).
Let s = sup E. Then
sn 6 s (n = 1, 2, 3, . . .).
For every > 0, there is an integer N such that
Since {sn } is increasing, for n > N ,
s < sN 6 s.
a, yn =
a
y2 y1 = a + a a = p
> 0.
a+ a+ a
It is true for n = 1.
Assume that the result is true for n = k, i.e., yk+1 yk > 0. Then
yk+1 yk
> 0.
yk+2 yk+1 = a + yk+1 a + yk =
a + yk+1 + a + yk
a + yn1
60
1 + 1 + 4a
1 1 + 4a
6 yn 6
.
2
2
Since {yn } is increasing and bounded above, {yn } is convergent. Let lim yn = . Then
n
lim
yn2
= lim (a + yn1 )
n
=a+
0 = 2 a
1 + 1 + 4a
=
2
since {yn } is a sequence of positive numbers.
Example. Given two positive real numbers a and b such that a < b. Let {xn } and {yn }
be two sequences satisfying x1 = a, y1 = b and
p
xn + yn
xn+1 = xn yn , yn+1 =
2
for all positive integers n. Show that both {xn } and {yn } converge and lim xn = lim yn .
n
n
so that {xn } is bounded above and {yn } is bounded below. Thus lim xn = A and
n
lim yn = B exist and
n
xn1 + yn1
n
2
A+B
B=
2
A = B.
lim yn = lim
61
Similarly, if for every real M there is an integer N such that n > N implies sn 6 M , we
write
sn .
3.16 Definition. Let {sn } be a sequence of real numbers. Let
: sn x for some subsequence {sn }}.
E = {x R
k
k
This set E contains all subsequential limits as defined in Definition 3.5, plus possibly the
numbers +, . Put
s = sup E, s = inf E,
called the limit superior and limit inferior of {sn }; we use the notation
lim sup sn = s ,
n
lim inf sn = s .
n
Simply speaking, lim sup sn and lim inf sn are, respectively, the largest limit point and the
smallest limit point of {sn }. Alternatively, lim sup sn and lim inf sn can be defined by
lim sup sn = inf sup sm ,
n
n m>n
m>n
3.17 Theorem. Let {sn } be a sequence of real numbers. Let E and s be defined as in
Definition 3.16. Then s has the following two properties:
(a) s E.
(b) If x > s , there is an integer N such that n > N implies sn < x.
Moreover, s is the only number with the properties (a) and (b).
Proof. (a) If s = +, then E is not bounded above; hence {sn } is not bounded above,
and there is a subsequence {snk } such that snk +.
If s is real, then E is bounded above, and at least one subsequential limit exists, so
that (a) follows from Theorems 3.7 and 2.28. If s = , then E = {}, and there
is no subsequential limit. Hence, for any real M , sn > M for at most a finite number of
values of n, so that sn .
This establishes (a) in all cases.
(b) Suppose there is a number x > s such that sn > x for infinitely many values of n.
These sn form a subsequence of {sn } such that each sn > x. The limit of this subsequence
y E is such that y > x > s , contradicting the definition of s (what definition?).
Thus s satisfies (a) and (b).
To show the uniqueness, suppose there are two numbers, p and q, which satisfy (a) and
(b), and suppose p < q. Choose x such that p < x < q. Since p satisfies (b), we have
sn < x for n > N . But then q cannot satisfy (a).
62
Moreover, s is the only number with the properties (a) and (b).
Proof. Exercise.
3.18 Examples.
(a) Let {sn } be a sequence containing all rationals. Then every real number is a
subsequential limit, and
lim sup sn = +,
lim inf sn = .
n
(b) Let sn =
(1)n
1 .
1+ n
Then
lim sup sn = 1,
lim inf sn = 1.
n
Proof. To illustrate the idea of the proof, we assume that all the lim sup and lim inf exist.
For m > n > N , we have
sm 6 tm 6 sup tm .
m>n
Since the right side is independent of m, taking sup of the left side over all m > n, we get
sup sm 6 sup tm .
m>n
m>n
n m>n
m>n
Since the left side is independent of n, taking the inf of the right side over all n, we finally
obtain
lim sup sn 6 inf sup tm = lim sup tn .
n
n m>n
63
Series
3.21
P Definition. Given a sequence {an }. The nth partial sum of the infinite series
n=1 an is
n
X
sn =
ak = a1 + a2 + + an .
k=1
X
X
ak = s.
an = lim
n=1
k=1
The number s is called the sum of the series, which is the nth partial sum.
If {sn } diverges, the series is said to diverge.
The Cauchy criterion (Theorem 3.11) can be restated in the following form:
3.22 Theorem.
that
if m > n > N .
3.23 Theorem. If
The condition lim an = 0 is necessary but not sufficient for the convergence of
n P
1
1
instance, the series
n=1 n diverges despite that lim n = 0.
n
an . For
3.24 Theorem. A series of non-negative terms converges if and only if its partial sums
form a bounded sequence.
P
Proof. Let
n=1 an be a series of non-negative terms. Then by Theorem 3.14,
X
n=1
n
X
ak is bounded.
k=1
64
Proof. Given > 0, there exists N > N0 such that m > n > N implies
m
X
ck < ,
k=n
k=n
k=n
an converges, so must
dn .
X
X
X
1
n2 + 2
4n2 + n
(a)
.
,
(b)
,
(c)
3 7
3n + n
n4 + 5
n + n3
n=1
n=1
n=1
Solution. (a) Since for n > 1,
the series
X
n=1
3n
1
is convergent by the comparison test.
+n
the series
X
n2 + 2
n=1
1
1
< n and
+n
3
1
X 1
1
3
< ,
=
1 =
n
3
2
1 3
n=1
3n
n4 + 5
n2 + 2
n2 + 2
<
and by the p-test,
n4 + 5
n4
X
X
n2 + 2 X 1
2
=
+
< ,
n4
n2 n=1 n4
n=1
n=1
is convergent by the comparison test.
4n2 + n
4n2
4
X
4 X 1
4
= 3
= +,
3
2n1/3
2 n=1 n1/3
n=1
X
4n2 + n
the series
is divergent by the comparison test.
3 7
n + n3
n=1
65
Proof. If < 1, we can choose so that < < 1, and an integer N such that
p
n
|an | <
for n > N [by Theorem 3.11(b)]. That is, n > N implies
|an | < n .
P n
P
Since 0 < < 1,
converges. Convergence of
an follows now from the comparison
test.
If > 1, then, again by Theorem 3.17, there is a sequence {nk } such that
q
nk
|ank | .
X
X
1
1
,
.
n
n2
n=1
n=1
For each of these series = 1, but the first diverges, the second converges.
P
3.34 Theorem (Ratio Test). The series
an
an+1
< 1,
(a) converges if lim sup
a
n
n
> 1 for all n > n0 , where n0 is some fixed integer.
(b) diverges if an+1
an
an+1
< 1 holds, we can find < 1, and an integer N , such that
Proof. If lim sup
an
n
an+1
an <
for n > N . In particular,
|aN +1 | < |aN |,
|aN +2 | < |aN +1 | < 2 |aN |,
..
.
That is,
66
an , as exemplified by
1
n
3.35 Examples.
(a) Consider the series
1 1
1
1
1
1
1
1
+ + 2 + 2 + 3 + 3 + 4 + 4 + ,
2 3 2
3
2
3
2
3
for which
n
2
an+1
= lim
= 0,
lim inf
n 3
n
an
r
1
2n 1
n
1
2n 1
lim sup n an = lim
= ,
n
n
2
n
2
n
1 3
an+1
= lim
= +.
lim sup
n 2 2
an
n
The root test indicates convergence; the ratio test does not apply.
(b) The same is true for the series
X
n=1
where
an =
1
1 1
1
1
1
1
+1+ + +
+
+
+
+ ,
2
8 4 32 16 128 64
lim inf
n
1
an+1
= ,
an
8
but
lim
and
lim sup
n
an+1
= 2,
an
1
an = .
2
cn+1
lim inf
6 lim inf n cn ,
n
n
cn
cn+1
lim sup n cn 6 lim sup
.
cn
n
n
Proof. We shall prove the second inequality; the proof of the first is quite similar. Put
cn+1
.
= lim sup
cn
n
If = +, there is nothing to prove. If is finite, choose > . There is an integer N
such that
cn+1
6
cn
for n > N . In particular, for any p > 0,
cN +p 6 cN +k
(k = 0, 1, . . . , p 1).
67
cn 6
so that
lim sup
(n > N ).
p
n
cN N
cn 6 ,
(18)
lim sup n cn 6 .
It follows from Theorem 3.37 that if the ratio test works, then the root test works; if the
ratio test fails, the root test may work. It is in this sense that the root test is superior
than the ratio test.
Integral test
Let
n=1
an =
n=1
N
1
X
an +
n=1
an ,
n=N
P
where the first sum on the right
is
finite,
a
converges
if
and
only
if
n
n=1
n=N an
P
converges, where N > 0. For
a
to
converge,
necessarily
lim
a
=
0.
So, an
n
n=1 n
n
must decrease to zero from some N onwards. Let f : N R be the function such that
an = f (n) for n = 1, 2, 3, . . ..
y
y = f (x)
aN
N +1
N +2
N +3
...
n+1
68
Interpreting aN as the area of the first rectangle, etc., the following inequalities are immediate:
Z
X
X
an 6
f (x) dx 6
an .
n=N +1
n=N
Thus,
n=N +1
an converges
1
n=1 np
f (x) dx converges.
Solution. Since f (x) = x1p > 0 for all x > 1 and is decreasing, by the integral test, we
have
Z
X
1
1
converges
dx converges.
p
p
n
x
1
n=1
Since
we have
1
dx = lim
R
xp
R
1
1
xp
R
1
p+1
p+1 R
R
1
x
1
,
= lim
dx = lim
R p + 1
R p + 1
xp
p + 1
1
1
k=3 k2 3k+2 .
dx
= lim
R 3
x2 x1
h x 2 iR
= lim ln
R
x1 3
32
R2
ln
= lim ln
R
R1
31
= ln 2 < ,
P
1
the series
k=3 k2 3k+2 converges by the integral test.
1
k=3 k2 3k+2
is a
Power series
3.38 Definition. Given a sequence {cn } of complex numbers, the series
cn z n
(19)
n=0
is called a power series. The numbers cn are called the coefficients of the series; z is a
complex number.
69
Every power series there is associated a circle, the circle of convergence, such that (19)
converges if z is in the interior of the circle and diverges if z is in the exterior.
cn z n , put
p
1
= lim sup n |cn |, R = .
n
P
(If = 0, then R = +; if = +, R = 0.) Then
cn z n converges if |z| < R, and
diverges if |z| > R.
3.39 Theorem. Given the power series
3.40 Examples.
P
n n
(a) The series P
n=1 nn z has R = 0.
z
(b) The series
n=0 n! has R = +. (In this case the ratio test is easier to apply
than the root
P test.)n
n
(c) The series
n=0 z has R = 1. If |z| = 1, the series diverges, since z 6 0 as
n . P
zn
(d) The series
n=1 n2 has R = 1. It converges for all z with |z| = 1, by the comparison test.
Absolute convergence
P
P
Definition. The series
an is said to converge absolutely if the series
|an | converges.
P
P
3.45 Theorem. If
an converges absolutely, then
an converges.
Proof. The assertion follows from the inequality
m
m
X X
ak 6
|ak |
k=n
k=n
3.46 Remarks. For series of positive terms, absolute convergence is the same as convergence.
P
P
P
If
an converges, but
|an | diverges, we say that
an converges non-nonabsolutely or
conditionally. For instance, the series
X
(1)n
n=1
70
Proof. Let
An =
n
X
ak
and Bn =
k=1
Then
n
X
bk .
k=1
An + Bn =
n
X
(ak + bk ).
k=1
lim (An + Bn ) = A + B,
and
Thus, two convergent series may be added term by term, and the resulting series converges
to the sum of the two series.
P
P
3.48 Definition. Given
an and
bn , we put
n
X
cn =
ak bnk (n = 0, 1, 2, . . .)
k=0
and call
P
This
definition
may
be
motivated
as
follows.
If
we
take
two
power
series
an z n and
P
n
bn z , multiply them term by term, and collect terms containing the same power of z,
we get
X
X
n
an z
bn z n = (a0 + a1 z + a2 z 2 + )(b0 + b1 z + b2 z 2 + )
n=1
n=1
= a0 b0 + (a0 b1 + a1 b0 )z + (a0 b2 + a1 b1 + a2 b0 )z 2 +
= c0 + c1 z + c2 z 2 + .
71
Continuity
Reference: W. Rudin, Principles of Mathematical Analysis, 3rd edition, McGraw-Hill,
c1976, Chapter 4.
4.1 Definition. Let (X, dX ) and (Y, dY ) be metric spaces; suppose E X, f maps E
into Y , and p is a limit point of E. Write
lim f (x) = q
(1)
xp
if there is a point q Y with the following property: For every > 0, there exists a > 0
such that
dY (f (x), q) <
(2)
for all points x E for which
(3)
It should be noted that p X, but that p need not be a point of E in the above definition.
Moreover, even if p E, we may very well have f (p) 6= lim f (x).
xp
(4)
lim f (pn ) = q
(5)
xp
if and only if
n
lim pn = p.
(6)
Proof. Let {pn } be a sequence in E satisfying (6). Let > 0 be given. Then there
exists > 0 such that dY (f (x), q) < if x E and 0 < dX (x, p) < . Also, there exists N
such that n > N implies 0 < dX (pn , p) < . Thus, for n > N , we have dY (f (pn ), q) < ,
which shows that (5) holds.
Conversely, suppose (4) is false. Then there exists some > 0 such that for every
> 0 there exists a point x E (depending on ), for which dY (f (x), q) > but
0 < dX (x, p) < . Taking = n1 (n = 1, 2, 3, . . .), we thus find a sequence {xn } in E
satisfying (6) for which (5) is false.
for all n, and pn p, lim f (pn ) = q. The uniqueness of q then follows from Theorems
n
3.2(b).
72
1
1
1
1
lim sin
= lim sin(2n ) = 1 6= 1 = lim sin(2n + ) = lim sin ,
n
n
n
pn n
2
2
qn
by Theorem 4.2, lim sin x1 does not exist.
x0
f (x)
f
provided g(x) 6= 0.
( )(x) =
g
g(x)
If f (x) = c for all x E, where c C is fixed, then f is called a constant function, and
write f = c. If f, g : E R and f (x) > g(x) for all x E, then write f > g.
Similarly, if f , g : E Rk , then define f + g and f g by
(f + g)(x) = f (x) + g(x),
xp
lim g(x) = B.
xp
Then
(a) lim (f + g)(x) = A + B;
xp
A
,
B
if B 6= 0.
Proof. In view of Theorem 4.2, these assertions follow immediately from the analogous
properties of sequences (Theorem 3.3).
Remark. If f , g : E Rk , then (a) remains true, and (b) becomes
(b ) lim (f g)(x) = A B.
xp
73
Continuous functions
4.5 Definition. Let (X, dX ) and (Y, dY ) be metric spaces, E X, p E, and f : E Y .
Then f is said to be continuous at p if for every > 0, there exists a > 0 such that
dY (f (x), f (p)) <
for all x E such that dX (x, p) < . If f is continuous at every point of E, then f is said
to be continuous on E.
Note that f has to be defined at the point p in order to be continuous at p.
f is continuous at every isolated point p of E because there exists > 0 such that
N (p) = {q E : dX (p, q) < } = {p}; x E and dX (x, p) < x = p so that
dY (f (x), f (p)) = 0 <
The next theorem states that a continuous function of a continuous function is continuous.
4.7 Theorem. Suppose X, Y, Z are metric spaces, E X, f : E Y , g : f (E) Z,
and h : E Z defined by
h(x) = g(f (x)) (x E).
If f is continuous at p E and g is continuous at f (p), then h is continuous at p. The
function h is called the composition or the composite of f and g, denoted by
h = g f.
Proof. Let > 0 be given. Since g is continuous at f (p), there exists > 0 such that
dZ (g(y), g(f (p))) < if dY (y, f (p)) < and y f (E).
74
Suppose f 1 (V ) is open in X for every open set V in Y . Fix p X and > 0, let
V = {y Y : dY (y, f (p)) < }. Then V is open; hence f 1 (V ) is open; hence there exists
> 0 such that x f 1 (V ) as soon as dX (p, x) < . But if x f 1 (V ) then f (x) V ,
so that dY (f (x), f (p)) < .
4.10 Theorem.
(a) Let (X, dX ) be a metric space, f1 , . . . , fk : X R, and let f : X Rk defined by
f (x) = (f1 (x), . . . , fk (x)) (x X);
then f is continuous if and only if each of the functions f1 , . . . , fk is continuous.
(b) If f , g : X Rk , then f + g : X Rk and f g : X R are continuous on X.
The functions f1 , . . . , fk are called the components of f .
Proof. Given > 0.
(a) For j = 1, 2, . . . , k, by virtue of the following inequalities
X
1/2
k
2
|fj (x) fj (y)| 6 kf (x) f (y)k2 =
|fi (x) fi (y)|
,
i=1
there exists > 0 such that dX (x, y) < implies kf (x) f (y)k2 < . Thus, when
dX (x, y) < then |fj (x) fj (y)| < .
75
For j = 1, 2, . . . , k, there exist j > 0 such that dX (x, y) < j |fj (x) fj (y)| <
. Choose = min(1 , . . . , k ) > 0. Then dX (x, y) < |fj (x) fj (y)| < for
k
k
j = 1, 2, . . . , k so that
X
1/2 X
1/2
k
k
2
2
kf (x) f (y)k2 =
|fi (x) fi (y)|
<
( )
= .
k
i=1
i=1
4.11 Examples.
(a) If x1 , . . . , xk are the coordinates of the point x Rk , the functions i defined by
i (x) = xi
(x Rk )
(8)
shows that we may take = in Definition 4.5. The functions i are sometimes
called the coordinate functions.
(b) Repeated application of Theorem 4.9 then shows that every monomial
xn1 1 xn2 2 xnk k
(9)
76
i.e., {f (V )} is an open cover of X. Since X is compact, there are finitely many indices
1 , . . . , n such that
X f 1 (V1 ) f 1 (Vn ).
(12)
1
Since f (f (E)) E for every E Y , (12) implies that
f (X) f (f 1 (V1 ) f 1 (Vn )) V1 Vn .
(13)
tX
(14)
Then there exist points p, q X such that f (p) = M and f (q) = m. In other words, f
attains its maximum (at p) and its minimum (at q).
Proof. By Theorem 4.15, f (X) is a closed and bounded set of real numbers; hence f (X)
contains
M = sup f (X) and m = inf f (X),
by Theorem 2.28.
4.17 Theorem. Suppose f is a continuous 1-1 mapping of a compact metric space X
onto a metric space Y . Then the inverse mapping f 1 defined on Y by
f 1 (f (x)) = x (x X)
Proof. Applying Theorem 4.8 to in place of f 1 , we see that it suffices to prove that f (V )
is an open set in Y for every open set V in X. Fix such a set V . The complement V c of
V is closed in X, hence compact (Theorem 2.35); hence f (V c ) is a compact subset of Y
(Theorem 4.14) and so is closed in Y (Theorem 2.34). Since f is bijective, f (V ) is the
complement of f (V c ). Hence f (V ) is open.
77
(15)
We put
(18)
1
min((p1 ), . . . , (pn )).
(19)
2
Then > 0. Now let p, q X such that dX (p, q) < . By (18), there is an integer m,
1 6 m 6 n, such that p J(pm ); hence
1
dX (p, pm ) < (pm ),
(20)
2
and we also have
1
dX (q, pm ) 6 dX (p, q) + dX (p, pm ) < + (pm ) 6 (pm ).
2
Finally, (16) shows that therefore
=
78
The next theorem shows that compactness is essential in the hypotheses of Theorems
4.14, 4.15, 4.16, and 4.19.
4.20 Theorem. Let E be a noncompact set in R. Then
(a) there exists a continuous function on E which is not bounded;
(b) there exists a continuous and bounded function on E which has no maximum.
If, in addition, E is bounded, then
(c) there exists a continuous function on E which is not uniformly continuous.
Proof. Suppose first that E is bounded, so that there exists a limit point x0 of E which
is not a point of E.
Consider
1
(x E).
(21)
x x0
This is continuous on E (Theorem 4.9), but evidently unbounded. To see that (21) is
not uniformly continuous, let > 0 and > 0 be arbitrary, and choose a point x E
such that |x x0 | < . Taking t close enough to x0 , we can then make the difference
|f (t)f (x)| > , although |xt| < . Since this is true for every > 0, f is not uniformly
continuous on E. This proves (a) when E is bounded.
f (x) =
1
(x E)
1 + (x x0 )2
is continuous on E, and is bounded, since 0 < g(x) < 1. It is clear that
g(x) =
(22)
sup g(x) = 1,
xE
whereas g(x) < 1 for all x E. Thus g has no maximum on E. This proves (b) when E
is bounded.
Having proved the theorem for bounded sets E, let us now suppose that E is unbounded.
Then f (x) = x establishes (a), whereas
h(x) =
x2
1 + x2
(x E)
(23)
79
Otherwise, define h(x) = f (x) x, where x [a, b]. Then h : [a, b] [a, b] is continuous.
Since h(a) = f (a) a > 0 and h(b) = f (b) b < 0. By the Intermediate-value property
of continuous functions, there exists x0 (a, b) such that h(x0 ) = f (x0 ) x0 = 0, i.e.,
f (x0 ) = x0 .
80
Discontinuities
If x is a point in the domain of the function f at which f is not continuous, we say that
f is discontinuous at x, or that f has a discontinuity at x.
4.25 Definition. Let f be defined on (a, b). Consider any point x (a, b). We write the
right-hand limit
f (x+) = q
if f (tn ) q as n , for all sequences {tn } in (x, b) such that tn x. To obtain the
definition of the left-hand limit f (x), we restrict ourselves to sequences {tn } in (a, x).
It is clear that for any x (a, b), lim f (t) exists if and only if
tx
Otherwise, the discontinuity is said to be of the second kind, i.e., not both f (x+) and
f (x) exist.
4.27 Examples.
(a) Define
f (x) =
1 if x Q,
0 if x R \ Q.
Then f has a discontinuity of the second kind at every point x, since neither f (x+)
nor f (x) exists.
(b) Define
(
x if x Q,
f (x) =
0 if x R \ Q.
Then f is continuous at x = 0 and has a discontinuity of the second kind at every
other point (i.e., x 6= 0).
(c) Define
if 3 < x < 2,
x + 2
f (x) = x 2 if 2 6 x < 0,
x + 2
if 0 6 x < 1.
Then f has a simple discontinuity at x = 0 and is continuous at every other point
of (3, 1).
(d) Define
(
sin x1 if x 6= 0,
f (x) =
0
if x = 0.
Since neither f (0+) nor f (0) exists, f has a discontinuity of the second kind at
x = 0. Assuming that sin x is a continuous function, then Theorem 4.7 implies
that f is continuous at every point x 6= 0.
81
for x < 1,
x
f (x) = 0
for x = 1,
2 x for x > 1.
We clearly have
x1
x1
x1+
x1+
Thus, lim f (x) = 1. Since lim f (x) 6= f (1), f has a simple discontinuity at x = 1. f can
x1
x1
xx0
xx0 +
g(x) =
f (x) for x 6= x0 ,
,
L
for x = x0
Solution. Let a (0, 1) Q. Then a = pq for some p, q relatively prime so that f (a) = 1q .
There exists {xn } (0, 1)\Q such that xn a as n . But lim f (xn ) = 0 < 1q = f (a),
n
proving f being discontinuous at x = a.
Given > 0. Let now a (0, 1) \ Q. Let q N be the smallest such that 0 < 1q < .
There exist finitely many p N, say, p1 , . . . , pk , which are relatively prime to q. Choose
= min |a pqi | > 0. If x (0, 1) \ Q and |x a| < , then |f (x) f (a)| = 0 < ; if
16i6k
1
q
82
Solution. Let = 21 . For every x Q and any > 0, there exists y B (x) \ Q such that
|f (x) f (y)| = 1 > 21 = . Thus, f is discontinuous at x; also, for x R \ Q, there exists
y B (x) Q such that |f (x) f (y)| = 1 > 21 = . Consequently, f is discontinous at
every x R.
Monotone functions
4.28 Definition. Let f : (a, b) R. Then f is said to be monotonically increasing (resp.,
decreasing) on (a, b) if a < x < y < b implies f (x) 6 f (y) (resp., f (x) > f (y)). The class
of monotonic functions consists of both the increasing and the decreasing functions.
4.29 Theorem. Let f be monotonically increasing on (a, b). Then f (x+) and f (x)
exist at every x (a, b). More precisely,
sup f (t) = f (x) 6 f (x) 6 f (x+) = inf f (t).
x<t<b
a<t<x
(25)
(26)
(27)
(28)
The second half of (25) is proved in precisely the same way (Exercise).
Next, if a < x < y < b, we see from (25) that
f (x+) = inf f (t) = inf f (t).
x<t<b
x<t<y
(29)
The last equality is obtained by applying (25) to (a, y) in place of (a, b). Similarly,
f (y) = sup f (t) = sup f (t).
a<t<y
(30)
x<t<y
83
Since x1 < x2 implies f (x1 +) 6 f (x2 ), we see that r(x1 ) 6= r(x2 ) if x1 6= x2 . We have
thus established a 1-1 correspondence between the set E and a subset of Q. The latter is
countable.
Infinite limits and limits at infinity
4.32 Definition. For any c R, (c, +) = {x R : x > c} is called a neighborhood of
+. Similarly, (, c) = {x R : x < c} is a neighborhood of .
4.33 Definition. Let E R and let f : E R. We say that
f (t) A as t x,
g(t) B
as t x.
Then
(a) f (t) A implies A = A.
(b) (f + g)(t) A + B,
(c) (f g)(t) AB,
(d) (f /g)(t) A/B,
provided the right members of (b), (c), and (d) are defined.
Note that , 0 , /, A/0 are not defined.
84
Differentiation
5.1 Definition. Let f : [a, b] R. For any x [a, b] form the quotient
f (t) f (x)
(a < t < b, t 6= x),
(t) =
tx
and define
f (x) = lim (t)
tx
(1)
(2)
called the right-hand derivative and left-hand derivative of f at x, respectively. In particular, at the endpoints a and b, the derivative, if it exists, is a right-hand or left-hand
derivative, respectively.
If f is defined on a segment (a, b) and if a < x < b, then f (x) is defined by (1) and (2),
as above. But f (a) and f (b) are not defined in this case.
5.2 Theorem. Let f be defined on [a, b]. If f is differentiable at a point x [a, b], then
f is continuous at x.
Proof. Since f (x) exists, we have, by Theorem 4.4,
f (t) f (x)
lim(f (t) f (x)) = lim
(t x)
tx
tx
tx
f (t) f (x)
= lim
lim(t x)
tx
tx
tx
= f (x) 0 = 0.
The converse of this theorem is not true. A standard counterexample is f (x) = |x|, which
is continuous at every x R but not differentiable at x = 0.
85
5.3 Theorem. Suppose that f, g : [a, b] R are differentiable at x [a, b]. Then f + g,
f g, f /g are differentiable at x and
(a) (f + g) (x) = f (x) + g (x);
(b) (f
= f (x)g(x) + f (x)g (x);
g)(x)
(f + g)(t) (f + g)(x)
tx
tx
f (t) + g(t) f (x) g(x)
= lim
tx
tx
g(t) g(x)
f (t) f (x)
+ lim
= lim
tx
tx
tx
tx
= f (x) + g (x).
(f + g) (x) = lim
86
The next theorem states that a differentiable function of a differentiable function is differentiable.
5.5 Theorem (Chain rule) Suppose f is continuous on [a, b], f (x) exists at some point
x [a, b], g is defined on an interval I containing f ([a, b]), and g is differentiable at the
point f (x). If
h(t) = g(f (t)) (a 6 t 6 b),
then h is differentiable at x, and
h (x) = g (f (x))f (x).
(3)
= g (f (x))f (x).
h (x) = lim
tx
5.6 Examples.
(a) Let f be defined by
f (x) =
For x 6= 0,
x sin x1
0
f (x) = sin
when x 6= 0,
when x = 0.
1 1
1
cos .
x x
x
(8)
At x = 0,
f (t) f (0)
1
= lim sin ,
t0
t0
t0
t
1
1
cos .
x
x
(9)
(9)
(10)
87
At x = 0, since
f (t) f (0)
1
= lim t sin 6 lim |t| = 0,
lim
t0
t 0 t0
t t0
we see that
f (0) = 0.
(11)
f (t) f (x)
> 0.
tx
Letting t x, we see that f (x) > 0. If x < t < x + , then
f (t) f (x)
6 0.
tx
which shows that f (x) 6 0. Hence f (x) = 0.
5.9 Theorem. (Cauchys mean value theorem) If f and g are continuous real functions
on [a, b] which are differentiable in (a, b), then there is a point x0 (a, b) at which
Proof. Put
(12)
To prove the theorem, we have to show that h (x0 ) = 0 for some x0 (a, b). If h is
constant, this holds for every x (a, b). If h(t) > h(a) for some t (a, b), let x0 [a, b]
at which h attains its maximum (Theorem 4.16). By (12), x0 (a, b), and Theorem 5.8
shows that h (x0 ) = 0. If h(t) < h(a) for some t (a, b), the same argument applies if we
choose for x0 [a, b] where h attains its minimum.
88
x0
Solution. Let f (t) = sin t, where t R. Then f (t) is continuous on [y, x] and differentiable
in (y, x). By mean value theorem, there exists c (y, x) such that
sin x sin y
= f (c) = cos c 6 1.
xy
Hence, sin x sin y 6 x y since x y > 0.
5.11 Theorem. Suppose f is differentiable in (a, b).
(a) If f (x) > 0 for all x (a, b), then f is monotonically increasing.
(b) If f (x) = 0 for all x (a, b), then f is constant.
(c) If f (x) 6 0 for all x (a, b), then f is monotonically decreasing.
Proof. Let x1 , x2 (a, b) be such that x1 < x2 . Then there exists x (x1 , x2 ) such that
6 0 if f (x) 6 0.
89
1/g(x)
0
indeterminate form then rewriting it as
we have an indeterminate form
1/f (x)
0
to which lHopitals rule applies.
90
ln x
.
x1
Solution. The given limit is an indeterminate form 00 . Applying lHopitals rule yields
x1
1/x
ln x
= lim
= 1.
x1 1
x1 x 1
lim
ln sin x
.
x0 ln tan x
. Applying lHopitals rule yields
Solution. The given limit is an indeterminate form
lim+
x0
ln sin x
cos x/ sin x
= lim+
= lim cos2 x = 1.
ln tan x x0 sec2 x/ tan x x0+
x ex1
.
x1 (x 1)2
Solution. The given limit is an indeterminate form 00 . Applying lHopitals rule yields
1 ex1
ex1
1
x ex1
=
lim
=
lim
= .
x1 2(x 1)
x1
x1 (x 1)2
2
2
lim
f (x)
cannot be evaluated, it does not necessarily imply that
xa g (x)
f (x)
does not exist. What we can conclude is that lHopitals rule cannot be apg(x)
x + sin x
1 + cos x
x + sin x
= lim
which cannot
that an application of lHopitals rule yields lim
x
x
x
1
be determined. However, since sin x is bounded, we have
sin x
x + sin x
= 1.
= lim 1 +
lim
x
x
x
x
lim
xa
) tan x.
2
= lim
= lim ( sin2 x) = 1.
lim (x ) tan x = lim
x 2 cot x
x 2 csc2 x
x 2
x 2
2
Solution. Since lim tan x = , lim (x
91
cos x
1 sin x
= lim
= lim cot x = 0,
x/2 sin x
x/2
x/2
cos x
x/2
Solution. The given limit is an indeterminate form 1 . Let y = (cos x)csc x . Then
ln y = (csc x) ln cos x which is an indeterminate form 0 as x 0.
ln cos x
sin x/ cos x
lim ln y = lim (csc x) ln cos x = lim
= lim
= 0,
x0
x0
x0 sin x
x0
cos x
from which 0 = lim ln y = ln lim y since ln y is continuous. Thus, lim y = 1.
x0
x0
x0
Solution. The given limit is an indeterminate form 00 . Let y = (1 cos x)1/ ln x . Then
ln(1 cos x)
which is an indeterminate form as x 0+ .
ln y =
ln x
ln(1 cos x)
sin x/(1 cos x)
x sin x
lim+ ln y = lim+
= lim+
= lim+
x0
x0
x0
x0 1 cos x
ln x
1/x
cos x x sin x + cos x
x cos x + sin x
= lim+
= 2.
= lim+
x0
x0
sin x
cos x
Since ln y is continuous, 2 = lim+ ln y = ln lim+ y lim+ y = e2 .
x0
x0
x0
Solution. The given limit is an indeterminate form 0 . Letting y = x1/x and applying
lHopitals rule, we get
ln x
1/x
lim ln y = lim
= lim
= 0.
x
x x
x 1
Since ln y is continuous, 0 = lim ln y = ln lim y , hence lim y = 1.
x
92
n = 1, 2, 3, . . . .
f (n) is called the nth derivative, or the derivative of order n, of f . In order for f (n) (x)
to exist at a point x, f (n1) (t) must exist in a neighborhood of x (or in a one-sided
neighborhood, if x is an endpoint of the interval on which f is defined), and f (n1) must
be differentiable at x.
Theorem. Let f (x) and g(x) be n times differentiable functions. Then
dn
dn
kf (x) = k n f (x), where k is a constant.
(a)
dxn
dx
dn
dn
dn
[f
(x)
g(x)]
=
f
(x)
g(x).
(b)
dxn
dxn
dxn
Proof. Exercise.
x
and n be an integer greater than 1. Find f (n) (0).
1 x2
.
f (x) =
2 1x 1+x
so that n f (x) =
dx
2 (1 x)n+1 (1 + x)n+1
(
n!
0 if n is even,
f (n) (0) = [1 (1)n ] =
2
n! if n is odd.
Theorem (Leibniz). Let f (x) and g(x) be n times differentiable functions of x. Then
n
X
n (r)
(n)
f (x)g (nr) (x).
(f (x)g(x)) =
r
r=0
93
Proof. Induction on n. By the product rule, (f g) (x) = f (x)g(x) + f (x)g (x) so that it
is true for n = 1. Assume that it is true for n = k, i.e.,
k
X
k (r)
(k)
(f (x)g(x)) =
f (x)g (kr) (x).
r
r=0
1
r
r=1
r=0
k
X
k
k
(k+1)
= f (x)g
(x) +
(
+
)f (r) (x)g (k+1r) (x) + f (k+1) (x)g(x)
r1
r
r=1
k
X
k + 1 (r)
(k+1)
f (x)g (k+1r) (x) + f (k+1) (x)g(x)
= f (x)g
(x) +
r
r=1
k+1
X
k + 1 (r)
=
f (x)g (k+1r) (x),
r
r=0
k
. Thus, it is also true for n = k + 1.
+ kr = k+1
since r1
r
By the principle of mathematical induction it is true for all n > 1.
Example. Let y = xeax , where a R. Find y (20) .
Taylors theorem
5.15 Theorem. Suppose f is a real function on [a, b], n is a positive integer, f (n1) is
continuous on [a, b], f (n) (x) exists for every x (a, b). Let , x [a, b], 6= x, and define
P (x) =
n1 (k)
X
f ()
k=0
k!
(x )k .
(23)
f (n) ()
(x )n .
n!
(24)
94
For n = 1, this is just the mean value theorem. In general, the theorem shows that f can
be approximated by a polynomial of degree n 1, and that (24) allows us to estimate the
error, if we know bounds on |f (n) (x)|.
Proof. Let M be the number defined by
and put
f (x) = P (x) + M (x )n
(25)
(27)
Hence the proof will be complete if we can show that g (n) () = 0 for some between
and x. Since P (k) () = f (k) () for k = 0, 1, . . . , n 1, we have
g() = g () = = g (n1) () = 0.
(28)
Our choice of M shows that g(x) = 0, so that g (x1 ) = 0 for some x1 between and x, by
the mean value theorem. Since g () = 0, we conclude similarly that g (x2 ) = 0 for some
x2 between and x1 . After n steps we arrive at the conclusion that g (n) (xn ) = 0 for some
xn between and xn1 , that is, between and . Thus, = xn satisfies g (n) () = 0.
95
Rn =
f (n) ()
(x )n is called the remainder after n terms.
n!
If lim Rn = 0 for all x I, where I is some interval of real numbers containing , then
n
f ()(x )2
f (n1) ()(x )n1
+ +
.
2!
(n 1)!
T5 (x)
-4
-2
T7 (x)
-2
T3 (x)
-4
x2
xn
ex0 xn+1
x
+
+ +
+
1!
2!
n! (n + 1)!
96
ex0 xn+1
n (n+1)!
= 0. There
exists an integer k > 0 such that |x| < k2 . Then for n > k,
x n+1
x0 k n+1k
e|x| |x|k
e 0x
e
x
x
1
(n + 1)! = k!(k + 1)(k + 2) (n + 1) 6 k!
2n+1k
0
as n . Thus, ex = 1 + x +
x2
2!
x3
3!
+ follows.
Taylor series
(a) gives series representations of functions,
(b) provides a source of polynomial approximations of functions.
The usefulness of the latter depends on the estimation of the remainder term.
Example. Consider the logarithmic function f (x) = ln(1 + x), where 1 < x < 1.
Taking successive derivatives, we have
1
,
1+x
(1)3 (3!)
,
(1+x)4
f (x) =
f (4) (x) =
f (x) =
1
,
(1+x)2
...
(1)2 (2!)
,
(1+x)3
(1)n1 (n1)!
(1+x)n
f (x) =
f (n) (x) =
so that f (n) (0) = (1)n1 (n 1)!. Also, f (0) = ln 1 = 0. The Taylor expansion of f
about a = 0 is
(1)n1 xn
(1)n xn+1
x2 x3
+
+
+
ln(1 + x) = x
2
3
n
(n + 1)(1 + x0 )n+1
for some x0 lying in between x and 0.
Suppose now that we want to estimate ln( 23 ) = ln(1 + 12 ) with error less than 106 .
In this case 0 < x0 <
1
2
( 21 )n+1
n+1
( 1 )15
1
1 ( 1 )2 ( 1 )3
ln(1 + ) 2 + 2 + 2
2
2
2
3
15
59848147
= 0.405466 (correct to 6 decimal places).
=
147603456
In fact, ln(1 + 12 ) = 0.405465 correct to 6 decimal places.
97
Some important Taylor series together with their intervals of convergence are as follows:
x2 x3 x4
x
+
+
+
< x < ,
e =1+x+
2!
3!
4!
x3 x5 x7
sin x = x
+
+
< x < ,
3!
5!
7!
x2 x4 x6
cos x = 1
+
+
< x < ,
2!
4!
6!
x2 x3 x4
+
+
1 < x 6 1,
ln(1 + x) = x
2
3
4
x3 x5 x7
tan1 x = x
+
+
1 6 x 6 1,
3
5
7
X
( 1) ( n + 1)xn
1 < x < 1,
(1 + x) = 1 +
0 6= R .
n!
n=1
98
We write
xi = xi xi1 (i = 1, 2, . . . , n).
Now suppose that f is a bounded real function defined on [a, b]. Corresponding to each
partition P of [a, b] we put
Mi =
sup
f (x),
mi =
xi1 6x6xi
U (P, f ) =
n
X
Mi xi ,
inf
xi1 6x6xi
L(P, f ) =
i=1
and finally
n
X
f (x),
mi xi
i=1
Z ab
f d = inf U (P, f ),
(1)
f d = sup L(P, f ),
(2)
where the inf and the sup are taken over all partitions P of [a, b]. The left members of
(1) and (2) are called the upper and lower Riemann integrals of f over [a, b], respectively.
If the upper and lower integrals are equal, we say that f is Riemann-integrable on [a, b],
we write f R (here, R denotes the set of Riemann-integrable functions), and we denote
the common value of (1) and (2) by
Z
or by
f dx,
(3)
f (x) dx.
(4)
b
a
This is the Riemann integral of f over [a, b]. Since f is bounded, m = inf f (x) and
a6x6b
m 6 f (x) 6 M
(a 6 x 6 b).
so that the numbers L(P, f ) and U (P, f ) form a bounded set. This shows that the upper
and lower integrals are defined for every bounded function f . The question of their
equality, and hence the question of the integrability of f , is a more delicate one.
99
R1
Example. Consider the Riemann integral 0 x2 dx. Let Pn = {x0 , x1 , . . . , xn } be the
uniform partition of [0, 1], where xi = ni for i = 0, 1, . . . , n. Then xi = n1 for all i. Since
f (x) = x2 is increasing, we have for i = 1, 2, . . . , n,
Mi =
i
x2 = x2i = ( )2
n
xi1 6x6xi
sup
and
mi =
inf
xi1 6x6xi
x2 = x2i1 = (
i1 2
).
n
n
n
X
(n 1)(2n 1)
1 X
i1 2 1
(i 1)2 =
) ( )= 3
,
L(P, f ) =
(
n
n
n i=1
6n2
i=1
P
where we have used the fact that ni=1 i2 = n(n+1)(2n+1)
. Letting n , we have
6
Z 1
1
lim U (P, f ) = lim L(P, f ) = =
x2 dx.
n
n
3
0
6.2 Definition. Let be a monotonically increasing function on [a, b] (since (a) and
(b) are finite, it follows that is bounded on [a, b]). Corresponding to each partition
P = {a = x0 < x1 < < xn = b} of [a, b], we write
i = (xi ) (xi1 ),
i = 1, 2, . . . , n.
It is clear that i > 0. For any real function f which is bounded on [a, b], we put
U (P, f, ) =
n
X
Mi i ,
L(P, f, ) =
i=1
n
X
mi i ,
i=1
inf
xi1 6x6xi
sup
f (x),
xi1 6x6xi
f d = inf U (P, f, ),
(5)
f d = sup L(P, f, ),
(6)
a
b
the inf and sup again being taken over all partitions. If
common value is denoted by
Z b
f d,
Rb
a
f d =
Rb
a
f d, then their
(7)
or sometimes by
f (x) d(x),
a
and is called the Riemann-Stieljes integral of f with respect to , over [a, b].
(8)
100
R2
Example. Consider the Riemann-Stieljes integral I = 0 x2 dx, where (x) = x is
the largest integer 6 x. The largest integer function x is right continuous at integers
with left limits. To compute I, we consider the partition Pn = {x0 , x1 , x2 , x3 , x4 }, where
x0 = 0, x1 = 1 n1 , x2 = 1, x3 = 2 n1 and x4 = 2. Then
1
0 = 0 0 = 0,
n
1
2 = 1 1 = 1 0 = 1,
n
1
3 = 2 1 = 1 1 = 0,
n
1
4 = 2 2 = 2 1 = 1.
n
The upper and lower Riemann-Stieljes sums are then
1
1
U (Pn , x2 , x) = (1 )2 (0) + (1)2 (1) + (2 )2 (0) + (2)2 (1) = 5,
n
n
1
1
1
1
L(Pn , x2 , x) = (0)2 (0) + (1 )2 (1) + (1)2 (0) + (2 )2 (1) = (1 )2 + (2 )2 .
n
n
n
n
Letting n ,
1 = 1
Consequently, the Riemann-Stieljes integral I = 5. Here, all we need are sufficient number
of points in the partition P to capture the jumps in f and .
If (7) exists, i.e., if (5) and (6) are equal, we say that f is integrable with respect to , in
the Riemann sense, and write f R().
By taking (x) = x, the Riemann integral is seen to be a special case of the RiemannStieltjes integral. Let us mention explicitly, however, that in the general case need not
even be continuous.
Regarding the notation, we prefer (7) to (8), since the letter x, the so-called variable of
integration in (8), is a dummy variable, it does not matter which letter we use to represent
Rb
the integral. For instance, (8) is the same as a f (y) d(y). The integral depends on f, ,
a and b, but not on the variable of integration, which may as well be omitted.
6.3 Definition. We say that the partition P is a refinement of P if P P (that is,
if every point of P is a point of P ). Given two partitions, P1 and P2 , we say that P is
their common refinement if P = P1 P2 .
6.4 Theorem. If P is a refinement of P , then
L(P, f, ) 6 L(P , f, )
(9)
U (P , f, ) 6 U (P, f, ).
(10)
and
101
Proof. To prove (9), suppose first that P contains just one point more than P =
{x0 , x1 , . . . , xn }. Let this extra point be x , and suppose xi1 < x < xi for some i.
Put
w2 = inf f (x).
w1 =
inf f (x),
xi1 6x6x
x 6x6xi
inf
xi1 6x6xi
f (x). Hence
If P contains k points more than P , we repeat this reasoning k times, and arrive at (9).
The proof of (10) is analogous.
6.5 Theorem.
Rb
a
f d 6
Rb
a
f d.
Proof. Let P be the common refinement of two partitions P1 and P2 . By Theorem 6.4,
L(P1 , f, ) 6 L(P , f, ) 6 U (P , L, ) 6 U (P2 , f, ).
Hence
L(P1 , f, ) 6 U (P2 , f, ).
If P2 is fixed and the sup is taken over all P1 , (11) gives
Z b
f d 6 U (P2 , f, ).
(11)
(12)
6.6 Theorem. f R() on [a, b] if and only if for every > 0 there exists a partition
P such that
U (P, f, ) L(P, f, ) < .
(13)
Proof. For every partition P of [a, b] we have
Z b
Z b
f d 6
f d 6 U (P, f, ).
L(P, f, ) 6
a
06
b
a
f d
f d < .
a
Suppose f R(), and let > 0 be given. Then there exist partitions P1 and P2
such that
Z b
(14)
U (P2 , f, ) <
f d + ,
2
a
102
b
(15)
f d < L(P1 , f, ) + .
2
a
We choose P to be the common refinement of P1 and P2 . Then Theorem 6.4, together
with (14) and (15), shows that
Z b
Theorem 6.6 furnishes a convenient criterion for integrability. Before we apply it, we state
some closely related facts.
6.7 Theorem.
(a) If (13) holds for some P and some , then (13) holds (with the same ) for every
refinement of P .
(b) If (13) holds for P = {x0 , x1 , . . . , xn } and if si , ti are arbitrary points in [xi1 , xi ],
then
n
X
|f (si ) f (ti )|i < .
i=1
implies
U (P , f, ) L(P , f, ) 6 U (P, f, ) L(P, f, ) < .
(b) Both f (si ), f (ti ) [mi , Mi ] so that |f (si ) f (ti )| 6 Mi mi . Thus
n
X
|f (si ) f (ti )|i 6 U (P, f, ) L(P, f, ) < .
i=1
103
Since f is uniformly continuous on [a, b], there exists a > 0 such that
|f (x) f (t)| <
(16)
if x, t [a, b] and |x t| < . If P is any partition of [a, b] such that xi < for all i,
then (16) implies that
Mi mi 6 (i = 1, 2, . . . , n)
(17)
and therefore
n
X
U (P, f, ) L(P, f, ) =
(Mi mi )i
i=1
n
X
i=1
6.9 Theorem. If f is monotonic on [a, b], and if is continuous and monotone increasing
on [a, b], then f R().
Proof. Let > 0 be given. For any positive integer n, choose a partition such that
(b) (a)
(i = 1, 2, . . . , n).
i =
n
This is possible since is continuous (Theorem 4.23). We suppose that f is monotonically
increasing. Then
Mi = f (xi ), mi = f (xi1 ) (i = 1, 2, . . . , n),
so that
n
(b) (a) X
U (P, f, ) L(P, f, ) =
[f (xi ) f (xi1 )]
n
i=1
(b) (a)
[f (b) f (a)] <
n
if n is taken large enough. By Theorem 6.6, f R().
=
104
6.10 Theorem. Suppose f is bounded on [a, b], f has only finitely many points of
discontinuity on [a, b], and is continuous at every point at which f is discontinuous.
Then f R().
Proof. Let > 0 be given. Put M = sup |f (x)|, let E be the set of points at which f is
discontinuous. Since E is finite and is continuous at every point of E, we can cover E
by finitely many disjoint intervals [uj , vj ] [a, b] such that the sum of the corresponding
differences (vj ) (uj ) is less than . Furthermore, we can place these intervals in such
a way that every point of E (a, b) lies in the interior of some [uj , vj ].
Remove the segments (uj , vj ) from [a, b]. The remaining set K is compact. Hence f is
uniformly continuous on K, and there exists > 0 such that |f (s) f (t)| < if s, t K
and |s t| < .
Now form a partition P = {x0 , x1 , . . . , xn } of [a, b] as follows: For all j, uj , vj P . No
point of any segment (uj , vj ) occurs in P . If xi1 6= uj for all j, then xi < . Note that
Mi mi < 2M for every i, and that Mi mi < unless xi1 = uj for some j. Hence, as
in the proof of Theorem 6.8,
U (P, f, ) L(P, f, ) 6 [(b) (a)] + 2M .
Note. If f and have a common point of discontinuity, then f need not be in R().
6.11 Theorem. Suppose f R() on [a, b], m 6 f 6 M , is continuous on [m, M ],
and h(x) = (f (x)) on [a, b]. Then h R() on [a, b].
Proof. Choose > 0. Since is uniformly continuous on [m, M ], there exists > 0 such
that < and |(s) (t)| < if |s t| 6 and s, t [m, M ].
Since f R(), there is a partition P = {x0 , x1 , . . . , xn } of [a, b] such that
U (P, f, ) L(P, f, ) < 2 .
(18)
Let Mi , mi have the same meaning as in Definition 6.1, and let Mi , mi be the analogous
numbers for h. Divide the numbers 1, 2, . . . , n into two classes: i A if Mi mi < ,
i B if Mi mi > .
For i A, our choice of shows that Mi mi 6 .
For i B, Mi mi 6 2K, where K = sup |(t)|, m 6 t 6 M . By (18), we have
X
X
i 6
(Mi mi )i < 2
iB
iB
(19)
105
so that
iA
iB
Remark. This theorem suggests the question: Just what functions are Riemann-integrable?
The answer is given by Theorem 11.33(b).
6.12 Theorem. (Properties of the integral)
(a) If f, f1 , f2 R() on [a, b], then f1 + f2 R() and cf R() for every constant
c, and
Z b
Z b
Z b
Z b
Z b
f d.
cf d = c
f2 d,
f1 d +
(f1 + f2 ) d =
a
f1 d 6
f2 d.
a
(c) If f R() on [a, b] and if a < c < b, then f R() on [a, c] and on [c, b], and
Z b
Z b
Z c
f d.
f d =
f d +
a
(20)
These inequalities persist if P1 and P2 are replaced by their common refinement P . Then
(20) implies
U (P, f, ) L(P, f, ) < 2,
106
f d 6 U (P, f, ) <
f1 d +
f2 d + 2.
(21)
is proved. The proofs of the other assertions of Theorem 6.12 are so similar that we omit
the details. In part (c) the point is that (by passing to refinements)
we may restrict
R
ourselves to partitions which contain the point c, in approximating f d.
6.13 Theorem. If f, g R() on [a, b], then
(a) f g R();
R
R
b
b
(b) |f | R() and a f d 6 a |f | d.
Proof. (a) If we take (t) = t2 , Theorem 6.11 shows that (f g)2 R(). The identity
4f g = (f + g)2 (f g)2
107
6.15 Theorem. If a < s < b, f is bounded on [a, b], f is continuous at s, and (x) =
I(x s), then
Z b
f d = f (s).
a
X
n=1
n cn
cn I(x sn ).
(22)
X
f (x) d =
cn f (sn ).
a
(23)
n=1
X
cn < .
n=N +1
Put
1 (x) =
N
X
n=1
cn I(x sn ),
2 (x) =
n=N +1
Z
Since 2 (b) 2 (a) < ,
f d1 =
a
N
X
cn I(x sn ).
cn f (sn ).
(24)
n=1
Z b
f d2 6 M ,
(25)
where M = sup |f (x)|. Since = 1 + 2 , it follows from (24) and (25) that
Z
N
b
X
f
d
c
f
(s
)
n
n < M .
a
n=1
108
Proof. Let > 0 be given and apply Theorem 6.6. There is a partition P = {x0 , x1 , . . . , xn }
of [a, b] such that
U (P, ) L(P, ) < .
(28)
f (si )i =
i=1
n
X
i=1
i=1
In particular,
n
X
f (si )i 6 U (P, f ) + M ,
i=1
U (P, f, ) 6 U (P, f ) + M .
(31)
Now note that (28) remains true if P is replaced by any refinement. Hence (31) also
remains true. We conclude that
Z b
Z b
f d
f (x) (x) dx 6 M .
a
f d =
a
(38)
for any bounded f . The equality of the lower integrals follows from (30) in exactly the
same way. The theorem follows.
109
6.18 Remark. The two preceding theorems illustrate the generality and flexibility which
are inherent in the Stieltjes process of integration. If is a pure step function of the form
(22), the integral reduces to a finite or infinite series. If has an integrable derivative,
the integral reduces to an ordinary Riemann integral. This makes it possible in many
cases to study series and integrals simultaneously, rather than separately.
g(y) = f ((y)).
g d =
A
(36)
f d.
(37)
L(Q, g, ) = L(P, f, ).
(38)
Since
f R(), P can be chosen so that both U (P, f, ) and L(P, f, ) are close to
R
f d. Hence (38), combined with Theorem 6.6, shows that g R() and that (37)
holds. This completes the proof.
Let us note the following special case: Take (x) = x. Then = . Assume R on
[A, B]. If Theorem 6.17 is applied to the left side of (37), we obtain
Z B
Z b
f ((y)) (y) dy.
f (x) dx =
a
Then F is continuous on [a, b]; furthermore, if f is continuous at a point x0 of [a, b], then
F is differentiable at x0 , and F (x0 ) = f (x0 ).
110
x0 < s 6 x0 6 t < x0 +
and a 6 s < t 6 b,
f
(x
)]
du
f
(x
)
0
0
t s
ts
s
It follows that F (x0 ) = f (x0 ).
111
(**) is also known as the Newton-Leibniz formula. This is exactly the way in which we
calculate definite integral: first find an antiderivative of f , then substitute the limits of
integration.
6.22 Theorem (Integration by parts). Suppose f and g are differentiable functions
on [a, b], f R, and g R. Then
Z b
Z b
f (x)g(x) dx.
f (x)g (x) dx = f (b)g(b) f (a)g(a)
a
Proof. Put h(x) = f (x)g(x) and apply Theorem 6.21 to h and its derivative. Note that
h R, by Theorem 6.13.
The rule of thumb for applying integration by parts can be stated as follows:
1. RThe part selected as g must be readilyRintegrable.
2. f g must not be more complex than f g .
Example. Find
x3 ex dx.
2
112
(1)
X
n=1
fn (x) (x E),
(2)
fn .
Here, the main problem is to determine whether important properties of fn are preserved
under the limit operations (1) and (2).
For example, if the functions fn are continuous, is the limit function f continuous? If the
limit function f is continuous at a point x, then
lim lim fn (t) = lim f (t) = f (x) = lim fn (x) = lim lim fn (t),
tx n
tx
n tx
i.e., the limit processes can be interchanged. The same question can be asked if fn are
differentiable or integrable.
7.2 Example. For m, n = 1, 2, 3, . . ., let
m
.
m+n
Then for fixed n, lim sm,n = 1 so that lim lim sm,n = 1. On the other hand, for every
m
n m
fixed m, lim sm,n = 0 so that lim lim sm,n = 0. In this example, the order in which the
n
m n
limits are evaluated is important.
sm,n =
x2
(1 + x2 )n
fn (x) =
n=0
f (x) =
X
n=0
0
1 + x2
x2
.
(1 + x2 )n
if x = 0
if x =
6 0
113
When m!x is an integer, fm (x) = 1. For all other values of x, fm (x) = 0. Now let
f (x) = lim fm (x).
m
sin nx
fn (x) =
n
as n , whereas f (0) = 0.
n +
fn (x) = n2 x(1 x2 )n .
(10)
lim fn (x) = 0
1
0
x(1 x2 )n dx =
1
.
2n + 2
n2
+
2n + 2
0
as n . If in (10), n2 is replaced by n, (11) still holds, but we now have
Z 1
1
n
= ,
lim
fn (x) dx = lim
n 0
n 2n + 2
2
whereas
Z
Z
(11)
fn (x) dx =
lim fn (x) dx = 0.
Thus, the limit of the integral need not be equal to the integral of the limit, even if both
are finite.
114
n
= 0.
Theorem 3.20(d) states that if p > 0 and is real, then lim
n (1 + p)n
Uniform convergence
7.7 Definition. A sequence of functions {fn }, n = 1, 2, 3, . . ., converges uniformly on E
to a function f if for every > 0, there is an integer N such that n > N implies
|fn (x) f (x)| <
(12)
for all x E.
It is clear that every uniformly convergent sequence is pointwise convergent. If {fn }
converges pointwise on E, then there exists a function f such that for every > 0 and
for every x E, there exists an integer N , depending on and x, such that (12) holds if
n > N ; if {fn } converges uniformly on E, then it is possible, for every > 0, to find one
integer N that will do for all x E.
The series
by
fi (x) = sn (x)
i=1
converges uniformly on E.
7.8 Theorem. (Cauchy criterion for uniform convergence) The sequence of functions
{fn }, defined on E, converges uniformly on E if and only if for every > 0 there exists
an integer N such that m, n > N , x E, implies
|fn (x) fm (x)| < .
(13)
+ =
2 2
if m, n > N and x E.
The sequence {fn (x)} converges, for every x E, to the limit function f (x).
Let > 0 be given and choose N such that (13) holds. Fix n and let m in (13).
Since fm (x) f (x) as m , this gives
|fn (x) f (x)| 6
for every n > N and every x E, which completes the proof.
(14)
115
7.9 Theorem. Suppose lim fn (x) = f (x), where x E. Put Mn = sup |fn (x) f (x)|.
n
xE
i=n
provided m < n are large enough. Uniform convergence now follows from Theorem
7.8.
Uniform convergence and continuity
tx
(n = 1, 2, 3, . . .).
(15)
(16)
(17)
tx
i.e.,
tx n
n tx
Proof. Let > 0 be given. By the uniform convergence of {fn }, there exists N such that
n, m > N and t E imply
|fn (t) fm (t)| 6 .
(18)
Letting t x in (18), we obtain
|An Am | 6
for n, m > N ,
i.e., {An } is a Cauchy sequence and therefore converges, say to A. Choose n large enough
so that
(20)
|f (t) fn (t)| 6
3
for all t E (this is possible by the uniform convergence) and that
|An A| 6 .
(21)
3
Then, for this n, we choose a neighborhood V of x such that
(22)
|fn (t) An | 6
3
if t V E and t 6= x.
116
7.14 Definition. Let X be a metric space. Denote by C (X) the set of all complexvalued, continuous, bounded functions with domain X. Associate with each f C (X)
its supremum norm
kf k = sup |f (x)|.
xX
117
d (f, g) = kf gk.
Theorem 7.9 can be rephrased as: A sequence {fn } converges to f in (C (X), d ) if and
only if fn f uniformly on X.
Accordingly, closed subsets of C (X) are sometimes called uniformly closed, the closure of
a set A C (X) is called its uniform closure, and so on.
7.15 Theorem. (C (X), d ) is a complete metric space.
Proof. Let {fn } be a Cauchy sequence in C (X).
For any given > 0, there exists N such that kfn fm k < if n, m > N .
It follows that there exists a function f with domain X such that fn f uniformly.
Since f is the uniform limit of a sequence of continuous functions, f is continuous.
Moreover, f is bounded because there exists an n such that for all x,
|f (x) fn (x)| < 1 |f (x)| < 1 + |fn (x)| 6 1 + Mn ,
118
(24)
a6x6b
Then
fn n 6 f 6 fn + n ,
so that the upper and lower integrals of f satisfy
Z b
Z b
Z b
Z b
f d 6
f d 6
(fn + n ) d.
(fn n ) d 6
a
Hence,
06
b
a
f d
(25)
b
a
f d 6 2n [(b) (a)].
Since n 0 as n , the upper and lower integrals of f are equal, i.e., f R().
Another application of (25) now yields
Z b
Z b
(26)
f d
fn d 6 n [(b) (a)].
a
X
f (x) =
fn (x) (a 6 x 6 b),
n=1
n=1
119
(27)
(28)
|fn (x0 ) fm (x0 )| <
2
and
|fn (t) fm
(t)| <
(a 6 t 6 b).
(29)
2(b a)
Applying the mean value theorem to fn fm , (29) shows that
|x t|
6
(30)
|fn (x) fm (x) fn (t) + fm (t)| 6
2(b a)
2
for any x, t [a, b] if m, n > N . By (28) and (30), we have for x [a, b] and m, n > N ,
|fn (x) fm (x)| 6 |fn (x) fm (x) fn (x0 ) + fm (x0 )| + |fn (x0 ) fm (x0 )|
< +
2 2
=
so that {fn } converges uniformly on [a, b]. Let
(t) =
f (t) f (x)
tx
tx
(31)
(32)
6
(m, n > N )
|t x|
2(b a)
so that {n } converges uniformly for t 6= x. Since {fn } converges to f , we conclude from
(31) that
lim n (t) = (t)
|m (t) n (t)| =
uniformly for t [a, b] \ {x}. Applying now Theorem 7.11 to {n }, (32) and (33) show
that
lim (t) = lim fn (x),
tx
which is (27).
120
7.18 Theorem. There exists a real continuous function on R which is nowhere differentiable.
Proof. Define
(x) = |x|
(1 6 x 6 1)
and extend the definition of (x) to all real x by requiring that
(x + 2) = (x).
(34)
(35)
X
3
f (x) =
( )n (4n x).
4
n=0
(36)
(37)
Since 0 6 6 1, Theorem 7.10 shows that the series (37) converges uniformly on R. By
Theorem 7.12, f is continuous on R.
Now fix an x R and a positive integer m. Put
1
,
(38)
m =
2 4m
where the sign is so chosen that no integer lies between 4m x and 4m (x + m ). This can be
done, since 4m |m | = 21 . Define
(4n (x + m )) (4n )
.
(39)
m
When n > m, then 4n m is an even integer so that n = 0. When 0 6 n 6 m, (36) implies
that |n | 6 4n . Since m = 4m , we conclude that
m
m1
X
f (x + m ) f (x) X
3m + 1
3
n
= ( )n n > 3m
3
=
.
m
4
2
n =
n=0
n=0
121
For n = 0, 1, 2 . . . , n, define fn (x) = (4n x). The graphs of y = f0 (x) and y = f1 (x) are
as follows:
y
1
y = f0 (x)
1
y
1
4
14
y = f1 (x)
1
4
As n increases,
fn (x) increases. Since
P 3 nthe number of saw teeth in the graph of y =
k
f (x) = n=0 ( 4 ) fn (x) and the set of locations of saw teeth { 4n : k Z, n N {0}} is
dense in R, the nowhere differentiability of f follows.
122
(x E, n = 1, 2, 3, . . .).
hence
lim (sin nk x sin nk+1 x)2 = 0 (0 6 x 6 2).
(40)
(41)
123
x2
x2 + (1 nx)2
(0 6 x 6 1, n = 1, 2, 3, . . .).
Then |fn (x)| 6 1, so that {fn } is uniformly bounded on [0, 1]. Also
lim fn (x) = 0 (0 6 x 6 1),
but
1
fn ( ) = 1 (n = 1, 2, 3, . . .),
n
so that no subsequence can converge uniformly on [0, 1].
The concept which is needed in this connection is that of equicontinuity; it is given in the
following definition.
7.22 Definition. A family F of complex functions f defined on a set E in a metric
space X is said to be equicontinuous on E if for every > 0 there exists a > 0 such that
|f (x) f (y)| <
whenever d(x, y) < , x, E, and f F . Here d denotes the metric of X.
It is clear that every member of an equicontinuous family is uniformly continuous.
The sequence of Example 7.21 is not equicontinuous.
Theorems 7.24 and 7.25 will show that there is a very close relation between equicontinuity,
on the one hand, and uniform convergence of sequences of continuous functions, on the
other. But first we describe a selection process which has nothing to do with continuity.
7.23 Theorem. If {fn } is a pointwise bounded sequence of complex functions on a
countable set E, then {fn } has a subsequence {fnk } such that {fnk (x)} converges for
every x E.
Proof. Let {xi }, i = 1, 2, 3, . . ., be the points of E, arranged in a sequence. Since
{fn (x1 )} is bounded, there exists a subsequence, which we shall denote by {f1,k }, such
that {f1,k (x1 )} converges as k .
Let us now consider sequences S1 , S2 , S3 , . . ., which we represent by the array
S1 :
S2 :
S3 :
f1,1
f2,1
f3,1
f1,2
f2,2
f3,2
f1,3
f2,3
f3,3
f1,4
f2,4
f3,4
124
(c) The order in which the functions appear is the same in each subsequence; i.e., if
one function precedes another in S1 , they are in the same relation in every Sn ,
until one or the other is deleted. Hence, when going from one row in the above
array to the next below, functions may move to the left but never to the right.
We now go down the diagonal of the array; i.e., we consider the sequence
S : f1,1 , f2,2 , f3,3 , f4,4 , .
(b) Let E be a countable dense subset of K. Theorem 7.23 shows that {fn } has a
subsequence {fni } such that {fni (x)} converges for every x E. Put fni = gi to simplify
the notation. We shall prove that {gi } converges uniformly on K.
Let > 0, and pick > 0 as in the beginning of this proof. Let
V (x) = {y K : d(x, y) < }.
125
(46)
|gi (x) gj (x)| 6 |gi (x) gi (xs )| + |gi (xs ) gj (xs )| + |gj (xs ) gj (x)| < 3.