Introduction To Real Analysis Fall 2014 Lecture Notes: Vern I. Paulsen November 6, 2014
Contents

1 Metric Spaces
  1.1 Definition and Examples
    1.1.1 Vector Spaces, Norms and Metrics
  1.2 Open Sets
    1.2.1 Uniformly Equivalent Metrics
  1.3 Closed Sets
  1.4 Convergent Sequences
    1.4.1 Further results on convergence in Rk
    1.4.2 Subsequences
  1.5 Interiors, Closures, Boundaries of Sets
  1.6 Completeness
  1.7 Compact Sets
Chapter 1
Metric Spaces
These notes accompany the Fall 2011 Introduction to Real Analysis course.

1.1 Definition and Examples
We verify (1)-(4). (1) and (3) are obvious. For (2): d(x, y) = 0 iff
|x1 − y1| + |x2 − y2| = 0. But since both terms in the sum are non-negative,
for the sum to be 0, each one must be 0. So d(x, y) = 0 iff |x1 − y1| = 0
AND |x2 − y2| = 0 iff x1 = y1 and x2 = y2 iff x = (x1, x2) = (y1, y2) = y.
Finally, to see (4):
d(x, y) = |x1 − y1| + |x2 − y2| = |x1 − z1 + z1 − y1| + |x2 − z2 + z2 − y2|
≤ |x1 − z1| + |z1 − y1| + |x2 − z2| + |z2 − y2| = d(x, z) + d(z, y).
We often denote the taxi cab metric by d1 (x, y).
Example 1.4. A different metric on R2. For x = (x1, x2), y = (y1, y2) set
d∞(x, y) = max{|x1 − y1|, |x2 − y2|}. So the distance between two points is
the larger of these two numbers.
We only check the triangle inequality. Let z = (z1, z2) be another point.
We have two cases to check: either d∞(x, y) = |x1 − y1| or d∞(x, y) = |x2 − y2|.
Case 1: d∞(x, y) = |x1 − y1|.
Now notice that |x1 − z1| ≤ max{|x1 − z1|, |x2 − z2|} = d∞(x, z). Similarly,
|z1 − y1| ≤ max{|z1 − y1|, |z2 − y2|} = d∞(z, y). Hence,
d∞(x, y) = |x1 − y1| = |x1 − z1 + z1 − y1| ≤ |x1 − z1| + |z1 − y1| ≤ d∞(x, z) + d∞(z, y).
Case 2: d∞(x, y) = |x2 − y2|. Now use |x2 − z2| ≤ max{|x1 − z1|, |x2 − z2|} =
d∞(x, z). Similarly, |z2 − y2| ≤ max{|z1 − y1|, |z2 − y2|} = d∞(z, y). Hence,
d∞(x, y) = |x2 − y2| = |x2 − z2 + z2 − y2| ≤ |x2 − z2| + |z2 − y2| ≤ d∞(x, z) + d∞(z, y).
So in each case the triangle inequality holds.
Example 1.5. In this case we let X be the set of all continuous real-valued
functions on [0, 1]. We use three facts from Math 3333:
1. if f and g are continuous on [0,1], then f − g is continuous on [0,1],
2. if f is continuous on [0,1], then |f| is continuous on [0,1],
3. if h is continuous on [0,1], then there is a point 0 ≤ t0 ≤ 1, so that
h(t) ≤ h(t0) for every 0 ≤ t ≤ 1. That is, h(t0) = max{h(t) : 0 ≤ t ≤ 1}.

then b² ≤ 4ac.
Proposition 1.8 (Cauchy-Schwarz Inequality). Let a1, ..., an, b1, ..., bn be
real numbers. Then
|a1 b1 + · · · + an bn| ≤ √(a1² + · · · + an²) √(b1² + · · · + bn²).
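The inequality is easy to test numerically; the following sketch is not part of the original notes, and the helper name is only an illustrative choice.

```python
# Check the Cauchy-Schwarz inequality on a few random vectors.
# This only illustrates the statement above; it is not a proof.
import math
import random

def cauchy_schwarz_gap(a, b):
    """Return sqrt(sum a_i^2) * sqrt(sum b_i^2) - |sum a_i b_i|, which should be >= 0."""
    lhs = abs(sum(x * y for x, y in zip(a, b)))
    rhs = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return rhs - lhs

random.seed(0)
for _ in range(5):
    n = random.randint(1, 6)
    a = [random.uniform(-10, 10) for _ in range(n)]
    b = [random.uniform(-10, 10) for _ in range(n)]
    assert cauchy_schwarz_gap(a, b) >= -1e-9  # holds up to rounding error
print("Cauchy-Schwarz held on all sampled vectors.")
```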
1.1.1 Vector Spaces, Norms and Metrics

√(a1² + · · · + an²)
1.2 Open Sets
Definition 1.21. Let (X,d) be a metric space, fix x ∈ X and r > 0. The
open ball of radius r centered at x is the set
B(x; r) = {y ∈ X : d(x, y) < r}.
Example 1.22. In R with the usual metric, B(x; r) = {y : |x − y| < r} =
{y : x − r < y < x + r} = (x − r, x + r).
Example 1.23. In R2 with the Euclidean metric, x = (x1, x2), then B(x; r) =
{(y1, y2) : (x1 − y1)² + (x2 − y2)² < r²}, which is a disk of radius r centered
at x.
Example 1.24. In R3 with the Euclidean metric, B(x; r) really is an open
ball of radius r. This example is where the name comes from.
Example 1.25. In (R2, d∞) we have
B(x; r) = {(y1, y2) : |x1 − y1| < r and |x2 − y2| < r} =
{(y1, y2) : x1 − r < y1 < x1 + r and x2 − r < y2 < x2 + r}.
So now an open ball is actually an open square, centered at x with sides
of length 2r.
Example 1.26. In (R2, d1) we have B((0, 0); 1) = {(x, y) : |x − 0| + |y − 0| <
1}, which can be seen to be the diamond with corners at (1,0), (0,1), (-1,0),
(0,-1).
Example 1.27. When X = {f : [0, 1] → R | f is continuous} and d(f, g) =
max{|f(t) − g(t)| : 0 ≤ t ≤ 1}, then B(f; r) = {g : g is continuous and f(t) −
r < g(t) < f(t) + r for all t}. This can be pictured as all continuous functions g
whose graphs lie in a band of width r about the graph of f.
Example 1.28. If we let R have the usual metric and let Y = [0, 1] ⊆ R
be the subspace, then when we look at the metric space Y we have that
B(0; 1/2) = [0, 1/2) = (−1/2, 1/2) ∩ Y.
Example 1.29. When we let X be a set with the discrete metric and x ∈ X,
then B(x; r) = {x} when r ≤ 1. When r > 1, then B(x; r) = X.
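The difference between the balls in the examples above can be seen by testing a single point against each metric. The sketch below is not part of the original notes; the point p and the radius are arbitrary choices.

```python
# Membership of one point in the ball B(x; r) under the three metrics on R^2.
import math

def d1(x, y):    # taxi cab metric
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

def d2(x, y):    # Euclidean metric
    return math.hypot(x[0] - y[0], x[1] - y[1])

def dinf(x, y):  # max metric
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

center, r = (0.0, 0.0), 1.0
p = (0.8, 0.8)
for name, d in [("d1", d1), ("d2", d2), ("d_inf", dinf)]:
    inside = d(center, p) < r
    print(f"{name}(0, p) = {d(center, p):.3f}, p in B(0; 1): {inside}")
# The point lies in the d_inf ball (the square) but not in the
# Euclidean disk or in the d1 diamond.
```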
Definition 1.30. Given a metric space (X,d) a subset O ⊆ X is called open
provided that whenever x ∈ O, then there is an r > 0 such that B(x; r) ⊆ O.
Showing that sets are open really requires proof, so we do a few examples.
Example 1.31. In R with the usual metric, an interval of the form (a, b) =
{x : a < x < b} is an open set.
Proof. Given x ∈ (a, b), let r = min{x − a, b − x}. If y ∈ B(x; r) then x − r <
y < x + r. Since y < x + r ≤ x + (b − x) = b and y > x − r ≥ x − (x − a) = a,
we have a < y < b. So B(x; r) ⊆ (a, b). Thus, (a, b) is open.
So open intervals really are open sets.
Example 1.32. In R2 with the Euclidean metric, a rectangle of the form
R = {(y1 , y2 ) : a < y1 < b, c < y2 < d} is an open set.
Proof. Given x = (x1, x2) ∈ R, let r = min{x1 − a, b − x1, x2 − c, d − x2}.
We claim that B(x; r) ⊆ R. To see this let y = (y1, y2) ∈ B(x; r), then
(y1 − x1)² + (y2 − x2)² < r².
Looking at just one term in the sum, we see that (y1 − x1)² < r² and so
|y1 − x1| < r. This implies that x1 − r < y1 < x1 + r. As before we see that
x1 + r ≤ b and x1 − r ≥ a, so a < y1 < b.
Similarly, we get that c < y2 < d, so y ∈ R.
Example 1.33. In R with the usual metric an interval of the form [a, b) is
not open.
Proof. Consider the point a ∈ [a, b). Any ball about this point is of the form
B(a; r) = (a − r, a + r) and this contains points that are less than a and so
not in the set [a, b).
CAREFUL! If we let Y = [0, 1) ⊆ R be the subspace, then in Y the set
O = [0, 1/2) is open! (Explain why.)
The next result justifies us calling B(x; r) an open ball.
Proposition 1.34. Let (X,d) be a metric space, fix x ∈ X and r > 0. Then
B(x; r) is an open set.
Theorem 1.35. Let (X,d) be a metric space. Then
1. the empty set is open,
2. X is open,
3. the union of any collection of open sets is open,
4. the intersection of finitely many open sets is open.
1.2.1 Uniformly Equivalent Metrics
The definition of open set really depends on the metric. For example, on
R if instead of the usual metric we used the discrete metric, then every set
would be open. But we have seen that when R has the usual metric, then
not every set is open. For example, [a, b) is not an open subset of R in the
usual metric. Thus, whether a set is open or not really can depend on the
metric that we are using.
For this reason, if a given set X has two metrics, d and ρ, and we say
that a set is open, we generally need to specify which metric we mean.
Consequently, we will say that a set is open with respect to d or open
in (X, d) when we want to specify that it is open when we use the metric
d. In this case it may or may not be open with respect to ρ.
In the case of R2, we already have three metrics, the Euclidean metric
d, the taxi cab metric d1 and the metric d∞. So when we say that a set is
open in R2, we could potentially mean three different things. On the other
hand it could be the case that all three of these metrics give rise to the same
collection of open sets.
In fact, these three metrics do give rise to the same collections of open
sets and the following definition and result explains why.
Definition 1.41. Let X be a set and let d and ρ be two metrics on X. We
say that these metrics are uniformly equivalent provided that there are
constants A and B such that for every x, y ∈ X,
ρ(x, y) ≤ A d(x, y) and d(x, y) ≤ B ρ(x, y).
Theorem 1.45. Let X be a set and let d and ρ be metrics on X that are
uniformly equivalent. Then a set is open with respect to d if and only if it
is open with respect to ρ.
Proof. Let E ⊆ X be a set that is open with respect to d. We must prove
that E is open with respect to ρ.
Since E is open with respect to d, given x ∈ E there is an r > 0 so that
Bd(x; r) ⊆ E. By the lemma, Bρ(x; r/B) ⊆ Bd(x; r). Thus, Bρ(x; r/B) ⊆ E
and we have shown that given an arbitrary point in E there is an open
ball in the ρ metric about that point that is contained in E. Hence, E is
open with respect to ρ.
The proof that a set that is open with respect to ρ is open with respect
to d is similar and uses the other containment given in the lemma.
Problem 1.46. Let X be a set with three metrics, d, ρ, and τ. Prove that
if d and ρ are uniformly equivalent and ρ and τ are uniformly equivalent,
then d and τ are uniformly equivalent.
Problem 1.47. Prove that on Rn the Euclidean metric, the metric d1 and
the metric d∞ are all uniformly equivalent.
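The following sketch is not part of the notes; it checks numerically the standard inequalities d∞ ≤ d2 ≤ d1 ≤ n·d∞ on Rⁿ, which supply constants A and B as in Definition 1.41 for the three metrics of Problem 1.47.

```python
# Sample-based check of   d_inf <= d2 <= d1 <= n * d_inf   on R^n.
import math
import random

def d1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def d2(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def dinf(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

random.seed(1)
n = 5
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(n)]
    y = [random.uniform(-5, 5) for _ in range(n)]
    assert dinf(x, y) <= d2(x, y) + 1e-12
    assert d2(x, y) <= d1(x, y) + 1e-12
    assert d1(x, y) <= n * dinf(x, y) + 1e-12
print("All three inequalities held on the sample.")
```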
1.3 Closed Sets
Example 1.51. In R with the usual metric, we have that (b, ∞) is open
and (−∞, a) is open. So when a < b we have that O = (−∞, a) ∪ (b, +∞)
is open. Hence, O^c = [a, b] is closed.
This shows that our old calculus definition of a closed interval really
is a closed set in this sense.
Definition 1.52. Let (X, d) be a metric space, let x ∈ X and let r > 0. The
closed ball with center x and radius r is the set
B̄(x; r) = {y ∈ X : d(x, y) ≤ r}.
The following result explains this notation.
Proposition 1.53. Let (X, d) be a metric space, let x ∈ X and let r > 0.
Then B̄(x; r) is a closed set.
Proof. Let O = (B̄(x; r))^c denote the complement. We must prove that O
is open. Note that O = {y ∈ X : d(x, y) > r}.
For p ∈ O set r1 = d(x, p) − r > 0. We claim that B(p; r1) ⊆ O. If
we can prove this claim, then we will have shown that O is open. So let
q ∈ B(p; r1), then d(x, p) ≤ d(x, q) + d(q, p) < d(x, q) + r1. Subtracting r1
from both sides of this last inequality yields d(x, p) − r1 < d(x, q). But
d(x, p) − r1 = r. Hence, r < d(x, q) and so q ∈ O. Since q was an arbitrary
element of B(p; r1), we have that B(p; r1) ⊆ O and our proof is complete.
Because the definition of closed sets involves complements, it is useful
to recall DeMorgan's Laws. Given subsets Ei ⊆ X where i belongs to some
set I, we have
∪_{i∈I} Ei = {x ∈ X : there exists i ∈ I with x ∈ Ei}
and
∩_{i∈I} Ei = {x ∈ X : x ∈ Ei for every i ∈ I}.
DeMorgan's Laws state that
(∪_{i∈I} Ei)^c = ∩_{i∈I} Ei^c and (∩_{i∈I} Ei)^c = ∪_{i∈I} Ei^c.
The following theorem about closed sets follows from DeMorgan's Laws
and Theorem 1.31.
1.4 Convergent Sequences

Let (X, d) be a metric space, let {pn} ⊆ X be a sequence and let p ∈ X. We
say that lim_{n→+∞} pn = p
provided that for every ε > 0, there is a real number N so that when n > N,
then d(p, pn) < ε.
Often out of sheer laziness, I will write lim_n pn = p for lim_{n→+∞} pn = p.
Example 1.60. When X = R and d is the usual metric, then d(p, pn) =
|p − pn| and this definition is identical to the definition used when we studied
convergent sequences of real numbers in Math 3333.
For a quick review, we will look at a couple of examples of sequences and
recall how we would prove that they converge.
For a first example, consider the sequence given by pn = (3n+1)/(5n−2). For any
natural number n, the denominator of this fraction is non-zero. So this
formula defines a sequence of points. For large n, the +1 in the numerator
and the −2 in the denominator are small in relation to the 3n and 5n so we
expect that this sequence has limit p = 3/5.
Now for some scrap work. To prove this we would need
d(p, pn) = |p − pn| = |3/5 − (3n+1)/(5n−2)| < ε.
Simplifying this fraction leads to the condition 11/(25n−10) < ε. Solving for n we
see that this is true provided 11/ε < 25n − 10, i.e., 11/(25ε) + 2/5 < n. So it looks
like we should choose N = 11/(25ε) + 2/5. This is not a proof, just our scrap work.
Now for the proof:
Given ε > 0, define N = 11/(25ε) + 2/5. For any n > N, we have that 25n − 10 >
25N − 10 = 11/ε, and so d(p, pn) = |3/5 − (3n+1)/(5n−2)| = 11/(25n−10) < 11/(11/ε) = ε. Hence,
lim_{n→+∞} pn = p.
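A quick numerical companion to the proof above (not from the notes): it checks that for the N found in the scrap work, a term with index just beyond N is within ε of the limit.

```python
# Numerical check of p_n = (3n+1)/(5n-2) -> 3/5 with the N from the scrap work.
p = 3 / 5

def p_n(n):
    return (3 * n + 1) / (5 * n - 2)

eps = 1e-4
N = 11 / (25 * eps) + 2 / 5
n = int(N) + 1                  # any integer n > N
print(abs(p - p_n(n)) < eps)    # True: this tail term is within eps of the limit
```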
Now that we have recalled what this definition means in R and have seen
how to prove a few examples, we want to look at what this concept means
in some of our other favorite metric spaces.
Theorem 1.61. Let Rk be endowed with the Euclidean metric d, let {pn} ⊆
Rk be a sequence with pn = (a1,n, a2,n, . . . , ak,n) and let p = (a1, a2, . . . , ak) ∈
Rk. Then lim_{n→+∞} pn = p in (Rk, d) if and only if for each j, 1 ≤ j ≤ k, we
have lim_{n→+∞} aj,n = aj in R with the usual metric.
Proof. First we assume that lim_{n→+∞} pn = p. Given ε > 0, there is an N so
that for n > N, we have that
d(p, pn) = √((a1 − a1,n)² + · · · + (ak − ak,n)²) < ε.
Now fix j. We have that |aj − aj,n| = √((aj − aj,n)²) is smaller than the above
square root, since it is just one of the terms. Thus, for the same N, we have
that n > N implies that |aj − aj,n| ≤ d(p, pn) < ε. Hence, for each ε, we
have found an N and so lim_n aj,n = aj. This proves the first implication.
Now conversely, suppose that for each j, we have that lim_n aj,n = aj.
Given an ε > 0, we must produce an N so that when n > N, we have
d(p, pn) < ε.
For each j, taking ε1 = ε/√k we can find an Nj so that for n > Nj we have
that |aj − aj,n| < ε1. (Why did I call it Nj?) Now let N = max{N1, . . . , Nk},
then for n > N, we have that (aj − aj,n)² < ε1² for every j, 1 ≤ j ≤ k.
Hence, d(p, pn) = √((a1 − a1,n)² + · · · + (ak − ak,n)²) < √(ε1² + · · · + ε1²) =
√(k ε1²) = ε. This proves that lim_n pn = p in the Euclidean metric.
So the crux of the above theorem is that a sequence of points in Rk
converges if and only if each of their components converge.
If we combine our first two examples with the above theorem, we see that
if we define a sequence of points in R2 by setting pn = ((3n+1)/(5n−2), √(n² + 8n) − n),
then in the Euclidean metric these points converge to the point p = (3/5, 4).
What if we had used the taxi cab metric or the d∞ metric on R2 instead of
the Euclidean metric, would these points still converge to the same point?
The answer is yes and the following result explains why and saves us having
to prove separate theorems for each of these metrics.
Proposition 1.62. Let X be a set with metrics d and ρ that are uniformly
equivalent, let {pn} ⊆ X be a sequence and let p ∈ X be a point. Then
{pn} converges to p in the metric d if and only if {pn} converges to p in the
metric ρ.
1.4.1 Further results on convergence in Rk

Two other useful connections to notice between the Euclidean metric and
the vector space operations are that
d(p, q) = √((a1 − b1)² + · · · + (ak − bk)²) = d(0, p − q)
and
d(rp, rq) = √((ra1 − rb1)² + · · · + (rak − rbk)²) = |r| d(p, q).
1.4.2 Subsequences

1.5 Interiors, Closures, Boundaries of Sets

Before giving any examples, it will be easiest to have some other descriptions of S̄.
Proposition 1.81. S̄ is closed and if C is any closed set with S ⊆ C, then
S̄ ⊆ C.
Proof. S̄ is closed because intersections of closed sets are closed. If C is
closed and S ⊆ C, then C is one of the sets that is in the intersection
defining S̄ and so S̄ ⊆ C.
Thus, S̄ is the smallest closed set containing S.
Theorem 1.82. Let p ∈ X, then p ∈ S̄ if and only if for every r > 0,
B(p; r) ∩ S is non-empty.
Proof. Suppose that there was an r > 0 so that B(p; r) ∩ S was empty. This
would imply that S ⊆ B(p; r)^c, which is a closed set. Thus, we have a closed
set containing S and p ∉ B(p; r)^c. This implies that p ∉ S̄. (This is the
contrapositive.)
Conversely, if B(p; r) ∩ S is non-empty for every r > 0, then for each n,
we may choose a point pn ∈ B(p; 1/n) ∩ S. This gives a sequence {pn} ⊆ S,
and it is easy to see that lim_n pn = p. Now if C is any closed set with S ⊆ C,
then {pn} ⊆ C, and so by Theorem 1.61, p ∈ C. Hence, p is in every closed
set containing S and so p ∈ S̄.
Corollary 1.83. Let p ∈ X, then p ∈ S̄ if and only if there is a sequence
{pn} ⊆ S with lim_n pn = p.
Example 1.84. Let X = R with the usual metric. Then the closure of (a, b)
is [a, b] and Q̄ = R.
Example 1.85. If X = {x ∈ R : x > 0} with the usual metric and S = {x :
0 < x < 1} then S̄ = (0, 1].
Definition 1.86. Let (X,d) be a metric space and let S ⊆ X. Then the
boundary of S, denoted ∂S, is the set ∂S = S̄ ∩ (S^c)‾.
Proposition 1.87. Let p ∈ X, then p ∈ ∂S if and only if for every r > 0,
B(p; r) ∩ S is non-empty and B(p; r) ∩ S^c is non-empty.
Proof. Apply the characterization of the closure of sets to S and S^c.
Corollary 1.88. Let p ∈ X. p ∈ ∂S if and only if there is a sequence
{pn} ⊆ S and a sequence {qn} ⊆ S^c with lim_n pn = lim_n qn = p.
Example 1.89. Let X = R with the usual metric and let S = (0, 1]. Then
∂S = {0, 1} and ∂Q = R.
Problem 1.90. Prove Corollary 1.83.
Problem 1.91. Prove Corollary 1.88.
Problem 1.92. Let X = R2 with the Euclidean metric and let S = {(x1, x2) :
x1² + x2² < 1}. Prove that S̄ = {(x1, x2) : x1² + x2² ≤ 1} and that ∂S =
{(x1, x2) : x1² + x2² = 1}.
Definition 1.93. Let (X,d) be a metric space and S ⊆ X. A point p ∈ X is
called a cluster point or accumulation point of S, provided that for every
r > 0, B(p; r) ∩ S has infinitely many points.
Problem 1.94. Prove that p is a cluster point of S if and only if there is
a sequence {pn} ⊆ S of distinct points, i.e., pi ≠ pj for i ≠ j, such that
lim_n pn = p.
Problem 1.95. Let (X,d) be a discrete metric space and let S ⊆ X be any
non-empty set. Find int(S), S̄, ∂S and give the cluster points of S.
1.6 Completeness

One weakness of convergence is that when we want to prove that a sequence
{pn} converges, we need the point p that it converges to before
we can prove that it converges. But often in math, one doesn't know yet
that a problem has a solution and we can only produce a sequence {pn}
that somehow is a better and better approximate solution, and we want to
claim that necessarily a point exists that is the limit of this sequence. It
is for these reasons that mathematicians introduced the concepts of Cauchy
sequences and complete metric spaces.
Definition 1.96. Let (X,d) be a metric space. A sequence {pn} ⊆ X is
called Cauchy provided that for each ε > 0 there exists N so that whenever
m, n > N, then d(pn, pm) < ε.
We look at a few properties of Cauchy sequences.
Proposition 1.97. Let (X,d) be a metric space and let {pn} ⊆ X be a
sequence. If {pn} is a convergent sequence, then {pn} is a Cauchy sequence.
Proof. Let p = lim_n pn. Given ε > 0 there is N so that n > N implies that
d(p, pn) < ε/2. Now let m, n > N, then d(pn, pm) ≤ d(pn, p) + d(p, pm) <
ε/2 + ε/2 = ε.
Proposition 1.98. Let (X,d) be a metric space, {pn} ⊆ X a sequence and
{pnk} a subsequence. If {pn} is Cauchy, then {pnk} is Cauchy, i.e., every
subsequence of a Cauchy sequence is also Cauchy.
Proof. Given ε > 0, there is N so that n, m > N implies d(pn, pm) < ε. Let
K = N and recall that nk ≥ k. Thus, if k, j > K then nk ≥ k > N and
nj ≥ j > N, and so d(pnk, pnj) < ε.
Proposition 1.99. Let (X,d) be a metric space and let {pn} ⊆ X be a
Cauchy sequence. Then {pn} is bounded.
Proof. Set ε = 1 and choose N so that for n, m > N, d(pn, pm) < 1. Fix any
n1 > N; we will show that there is a radius r so that {pn} ⊆ B̄(pn1; r).
To this end, set r = max{1, d(pn1, pn) : 1 ≤ n ≤ N}. This is a finite set of
numbers so it does have a maximum. Now given any n, either 1 ≤ n ≤ N,
in which case d(pn1, pn) ≤ r, since it is one of the numbers occurring in the
max, or n > N, and then d(pn1, pn) < 1 ≤ r.
Definition 1.100. Let (X,d) be a metric space. If for each Cauchy sequence
in (X,d), there is a point in X that the sequence converges to, then (X,d) is
called a complete metric space.
Example 1.101. Let X = (0, 1] ⊆ R be endowed with the usual metric
d(x, y) = |x − y|. Then X is not complete, since {1/n} ⊆ X is a Cauchy
sequence with no point in X that it can converge to.
Example 1.102. Let Q denote the rational numbers with metric d(x, y) =
|x − y|. We can take a sequence of rational numbers converging to √2, which
we know is irrational. Then that sequence will be Cauchy, but not have a
limit in Q. Thus, (Q, d) is not complete.
In Math 3333, we proved that R with the usual metric has the property
that every Cauchy sequence converges, that is, (R, d) is a complete metric
space. This fact is so important that we repeat the proof here. First, we
need to recall a few important facts and definitions.
Recall that a set S ⊆ R is called bounded above if there is a number
b ∈ R such that s ∈ S implies that s ≤ b. Such a number b is called an upper
bound for S. An upper bound for S that is smaller than every other upper
bound of S is called a least upper bound for S and denoted lub(S) in
Rosenlicht's book. In the book that we used for Math 3333, a least upper
bound for S was called a supremum for S and denoted sup(S).
Similarly, a set S ⊆ R is called bounded below if there is a number
a ∈ R such that s ∈ S implies that a ≤ s. Such a number a is called a
lower bound for S. A lower bound for S that is larger than every other
lower bound is called a greatest lower bound for S, and denoted glb(S)
in Rosenlicht's book. In the book that we used for Math 3333, a greatest
lower bound was called an infimum for S and denoted inf(S).
A key property of R is that every set that is bounded above has a least
upper bound and that every set that is bounded below has a greatest lower
bound.
Proposition 1.103. Let {an} be a sequence of real numbers such that the
set S = {an : n ≥ 1} is bounded above and such that an ≤ an+1 for every
n. Then {an} converges and lim_n an = lub(S). If {bn} is a sequence of real
numbers such that S = {bn : n ≥ 1} is bounded below and bn ≥ bn+1, then
{bn} converges and lim_n bn = glb(S).
Proof. Let a = lub(S), then an ≤ a for every n. Given ε > 0, since a − ε < a,
we have that a − ε is not an upper bound for S and so there is an N so
that a − ε < aN. If n > N, then a − ε < aN ≤ an ≤ a < a + ε. Thus,
a − ε < an < a + ε and so |an − a| < ε for every n > N. This proves that
lim_n an = a.
The proof of the other case is similar.
Theorem 1.104. Let (R, d) denote the real numbers with the usual metric. Then (R, d) is complete, i.e., every Cauchy sequence of real numbers
converges.
Proof. Let {pn} be a Cauchy sequence of real numbers. By the above results
S1 = {pj : j ≥ 1} is a bounded set of real numbers, so it is bounded above
and bounded below. For each n ≥ 1, let Sn = {pk : k ≥ n}. Note that
S1 ⊇ S2 ⊇ . . . , so that each set Sn is a bounded set of real numbers.
Let an = glb(Sn) and bn = lub(Sn). Since Sn+1 ⊆ Sn we have that an ≤
an+1 and bn+1 ≤ bn. Since pn ∈ Sn, an ≤ pn ≤ bn. In particular, an ≤ b1
for every n, and so {an} converges and a = lim_n an = lub({an : n ≥ 1}).
Similarly, a1 ≤ bn for every n and so {bn} converges and b = lim_n bn =
glb({bn : n ≥ 1}). Also, since an ≤ bn we have that a ≤ b.
Now given any ε > 0, there is N so that when j, n > N, |pn − pj| < ε.
Hence, for any j ≥ n, we have that pn − ε < pj < pn + ε. This shows that
pn − ε is a lower bound for Sn and pn + ε is an upper bound for Sn. Thus,
pn − ε ≤ an ≤ a ≤ b ≤ bn ≤ pn + ε.
This shows that a and b are trapped in an interval of length 2ε. Since ε
was arbitrary, by the vanishing principle, |a − b| = 0, so a = b. Thus,
lim_n an = lim_n bn, but an ≤ pn ≤ bn, so by the Squeeze Theorem for
limits, lim_n an = lim_n pn = lim_n bn.
Another set of important examples of complete metric spaces are the
Euclidean spaces.
Theorem 1.105. Let (Rk , d) denote k-dimensional Euclidean space. Then
(Rk , d) is complete.
Proof. To simplify notation, we will only do the case k = 2. So let pn =
(an, bn) be a Cauchy sequence of points in R2 with the Euclidean metric
d. Given ε > 0, there is N so that n, m > N implies that d(pn, pm) < ε.
But |an − am| ≤ √(|an − am|² + |bn − bm|²) = d(pn, pm), so for n, m > N,
|an − am| < ε, and so {an} is a Cauchy sequence of real numbers. Hence
{an} converges; let a = lim_n an.
Similarly, |bn − bm| < ε for n, m > N and so {bn} converges; let
b = lim_n bn.
Since lim_n an = a and lim_n bn = b, by Theorem 1.57, lim_n pn = (a, b).
Hence, this arbitrary Cauchy sequence has a limit and the proof is done.
Problem 1.106. Let X be a set and let d and ρ be two uniformly equivalent
metrics on X. Prove that (X, d) is a complete metric space if and only if
(X, ρ) is complete.
By the above theorem and problem, (Rk, d1) and (Rk, d∞) are also complete metric spaces.
Example 1.107. Let X = (0, 1]. Recall that X is not complete in the usual
metric d(x, y) = |x − y|. Given x, y ∈ X we set ρ(x, y) = |1/x − 1/y|. It is easy
to check that ρ is a metric on X. We claim that (X, ρ) is complete! Thus,
d and ρ are not uniformly equivalent.
We sketch the proof. To prove this we must show that if {xn} ⊆ X is
Cauchy in the ρ metric, then it converges to a point in X. Given ε > 0,
suppose that for n, m > N, ρ(xn, xm) < ε. Since ρ(xn, xm) = |1/xn − 1/xm| =
|(xn − xm)/(xn xm)| we have that |xn − xm| < ε xn xm ≤ ε. Thus, {xn} is Cauchy
in the usual metric. Let x = lim_n xn. Since 0 < xn ≤ 1 we have that
0 ≤ x ≤ 1. We claim that x ≠ 0. Because if x = 0, then for any N, when
n, m > N, if we fix m and we let n → +∞, then xn → 0 and 1/xn → +∞. Thus,
ρ(xn, xm) = |1/xn − 1/xm| → +∞. This prevents us making ρ(xn, xm) < ε and
so violates the Cauchy condition.

1.7 Compact Sets
Before we can give many more examples of compact sets, we need some
theorems.
Proposition 1.112. Let (X,d) be a metric space. If K ⊆ X is compact,
then K is closed.
Proof. We will prove that if K is not closed, then K is not compact. Since
K is not closed, K̄ ≠ K. Hence, there is p ∈ K̄, p ∉ K. Since p ∈ K̄, then
for every r > 0, B(p; r) ∩ K is non-empty.
Let Un = (B̄(p; 1/n))^c = {q ∈ X : d(p, q) > 1/n}. Then each of these
sets is open and ∪_{n∈N} Un = {q ∈ X : d(p, q) > 0} = {p}^c. Since p ∉ K,
we have that this is an open cover of K. Note that n < m implies that
Un ⊆ Um, so as in the proof of Example 1.106, Un1 ∪ · · · ∪ UnL = UN
where N = max{n1, ..., nL}. So if K was contained in a finite subcover, then
there would be N with K ⊆ UN. But this would imply that for q ∈ K,
d(p, q) > 1/N, and so K ∩ B(p; 1/N) would be empty, a contradiction.
Proposition 1.113. Let (X,d) be a metric space, K ⊆ X a compact set
and let S ⊆ K be a closed subset. Then S is compact.
Proof. Let {Uα}α∈A be an open cover of S and let V = S^c so that V is open.
Check that {V, Uα : α ∈ A} is an open cover of K. So finitely many of these
sets cover K and the same finite collection covers S since S is a subset of
K.
The next few results give a better understanding of the structure of
compact sets.
Proposition 1.114. Let (X,d) be a metric space, K ⊆ X a compact set
and S ⊆ K an infinite subset. Then S has a cluster point in K.
Proof. We prove the contrapositive. Suppose S ⊆ K is a set with no cluster
point in K. Then for each point p ∈ K, since it is not a cluster point of
S, there would exist an r > 0 such that B(p; r) contains only finitely many
points from S. The set of all such balls is an open cover of K. Now because
K is compact, some finite collection of these balls cover K. But each of
these balls has only finitely many points from S so their union contains only
finitely many points from S. Since their union contains K, it contains all of
S and so S has only finitely many points.
Definition 1.115. Let (X,d) be a metric space, K ⊆ X. Then K is called
sequentially compact if every sequence in K has a subsequence that converges to a point in K.

with d(pn, pm) > ε for all n ≠ m. But for a sequence to converge to a point,
it must be Cauchy. Since no subsequence of this sequence could be Cauchy,
no subsequence converges. Hence, K is not sequentially compact.
Now we come to the main theorem.
Theorem 1.120. Let (X,d) be a metric space and K X. Then the following are equivalent:
1. K is compact,
2. K is sequentially compact,
3. K is totally bounded and complete.
Proof. We have already shown that 1) implies 2) and that 2) implies 3). So
it will be enough to show that 3) implies 1). To this end assume that 3)
holds but that K is not compact. Then there is some open cover {Uα} of K
with no finite subcover.
For ε = 1/2 take a finite 1/2-net, {p1, ..., pL}, so that K ⊆ ∪_{j=1}^{L} B(pj; 1/2).

Chapter 2
Thus, S2 = {(1, 1)}, S3 = {(1, 2), (2, 1)}, S4 = {(1, 3), (2, 2), (3, 1)}. It is not
hard to see that Sn has n − 1 elements. Now to list these points, we will first
list the points in S2 then S3 etc. Within each set, we will list the elements in
order of their first coordinate, so for example, (1, 2) comes before (2, 1). This
time the formula for the one-to-one onto function f : N → N × N is difficult
to write down. But equivalently, we could define its inverse, i.e., a one-to-one
onto function g : N × N → N. Now if i + j = n then all the elements of the sets
S1, ..., Sn−1 come before. These sets have 1 + 2 + ... + (n − 2) = (n−2)(n−1)/2
elements, and the order after that is determined by whether i = 1, 2, ...
Thus, the function g is defined by
g((i, j)) = (i + j − 2)(i + j − 1)/2 + i.
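The formula for g can be tested directly; the short sketch below is not part of the notes. It lists the values of g over the diagonal sets S2, S3, ... and checks that they enumerate 1, 2, 3, ... without repetition.

```python
# Check that g((i, j)) = (i + j - 2)(i + j - 1)/2 + i lists N x N in the
# diagonal order described above, hitting 1, 2, 3, ... exactly once each.
def g(i, j):
    return (i + j - 2) * (i + j - 1) // 2 + i

# Walk the diagonals S_2, S_3, ..., S_9 in order of the first coordinate.
values = [g(i, n - i) for n in range(2, 10) for i in range(1, n)]
print(values[:10])                             # 1, 2, 3, 4, 5, 6, ...
assert values == list(range(1, len(values) + 1))
```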
unique. Now let
r1 = 0.a1,1 a1,2 a1,3 . . .
r2 = 0.a2,1 a2,2 a2,3 . . .
etc.,
where each of the ai,j is an integer between 0 and 9. Note that ai,j is the
j-th decimal entry of the i-th number.
Now define a sequence of integers bn as follows: if 0 ≤ an,n ≤ 4 let
bn = an,n + 1, and if 5 ≤ an,n ≤ 9 let bn = an,n − 1.
Note that 1 ≤ bn ≤ 8. Hence, there is a real number r, 0 < r < 1, where
r has decimal expansion
r = 0.b1 b2 . . .
Note that by the way that we have defined the bn's, there are no nines. Since
b1 ≠ a1,1, r ≠ r1, and since b2 ≠ a2,2, r ≠ r2, etc. Thus, r was not included
in the list! Contradiction.
This method of proof, of listing things, somehow getting a doubly indexed array and then altering the (n, n)-entries, is often referred to as Cantor's diagonalization method.
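A finite illustration of the diagonal construction (not from the notes): given any finite list of decimal digit strings, the same +1/−1 rule produces a number that differs from the k-th entry in its k-th digit. The sample list is an arbitrary choice.

```python
# Build a decimal that differs from the k-th listed number in its k-th digit.
def diagonal(digits_list):
    """digits_list[k] is a string of decimal digits of the k-th listed number."""
    new_digits = []
    for k, digits in enumerate(digits_list):
        d = int(digits[k])
        new_digits.append(str(d + 1 if d <= 4 else d - 1))  # the +1 / -1 rule
    return "0." + "".join(new_digits)

listed = ["3333333", "1415926", "7182818", "5000000"]
r = diagonal(listed)
print(r)   # differs from the k-th number in the k-th decimal place
for k, digits in enumerate(listed):
    assert r[2 + k] != digits[k]
```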
Chapter 3
Continuous Functions
Continuity plays an important role for functions on the real line. Intuitively,
a function is continuous provided that it sends "close" points to "close"
points. When we say "close" we naturally have a notion of distance between
points, both in the domain and the range. Thus, we see that continuity is
really a property for functions between metric spaces. In this chapter, we
define what we mean by continuous functions between metric spaces, then
study the properties of continuous functions. We will see that our notions
of open, closed and compact sets all play an important role.
Definition 3.1. Let (X, d) and (Y, ρ) be metric spaces and let p0 ∈ X.
We say that a function f : X → Y is continuous at p0 provided that for
every ε > 0 there is δ > 0 such that whenever p ∈ X and d(p0, p) < δ then
ρ(f(p0), f(p)) < ε.
Note that in this definition the value of δ really depends on the ε; for this
reason some authors write δ(ε).
Two quick examples.
Example 3.2. Let X = Y = R both endowed with the usual metric and let
f : R → R be defined by
f(x) = 0 if x ≤ 0, and f(x) = 1 if x > 0.
Then f is not continuous at p0 = 0.
To see that f is not continuous at 0, take ε = 1/2. For any δ > 0,
the point p = δ/2 satisfies d(p0, p) = |p0 − p| = |0 − δ/2| = δ/2, but
Example 3.8. Let X = Y = R with the usual metric and let f(p) = 5p + 7,
then f is uniformly continuous.
Given ε > 0, let δ = ε/5; then when d(p, q) = |p − q| < δ we have that
ρ(f(p), f(q)) = |f(p) − f(q)| = 5|p − q| < 5δ = ε.
Example 3.9. Let X = Y = R with the usual metric and let f(p) = p².
Then f is not uniformly continuous.
To prove this it will be enough to take ε = 1, and show that for this value
of ε, we can find no corresponding δ > 0. By way of contradiction, suppose
that there was a δ such that |p − q| < δ implied that |f(p) − f(q)| < 1. Set
pn = n, qn = n + δ/2. Then |pn − qn| < δ, but |f(pn) − f(qn)| = qn² − pn² =
nδ + δ²/4 > nδ > 1, for n sufficiently large, in fact for n > 1/δ.
Example 3.10. Let X = [−M, +M], Y = R with usual metrics and let
f : X → Y with f(p) = p². Then f is uniformly continuous.
To see this, given ε > 0, set δ = ε/(2M). Then |p − q| < δ implies that
|p² − q²| = |p + q||p − q| ≤ 2M|p − q| < 2Mδ = ε.
We now look at a characterization of continuity in terms of open and
closed sets.
Recall that if f : X → Y and S ⊆ Y, then the preimage of S is the subset
of X given by
f⁻¹(S) = {x ∈ X : f(x) ∈ S}.
metric spaces, we will use a subscript to help indicate which metric. Let
p0 ∈ f⁻¹(U), so that f(p0) ∈ U. Since U is open there is ε > 0, so that
Bρ(f(p0); ε) ⊆ U. Because f is continuous at p0, there is a δ > 0, so that
d(p0, p) < δ implies that ρ(f(p0), f(p)) < ε.
But this last statement shows that Bd(p0; δ) ⊆ f⁻¹(U), because p ∈
Bd(p0; δ) implies that d(p0, p) < δ, implies that ρ(f(p0), f(p)) < ε, implies
that f(p) ∈ Bρ(f(p0); ε), implies that f(p) ∈ U.
Hence, f⁻¹(U) is open in X.
Conversely, assume that 2) holds and we will prove 1). Given p0 ∈ X
and an ε > 0, we have that U = Bρ(f(p0); ε) is open. Hence, by assumption, f⁻¹(U) is open and p0 ∈ f⁻¹(U). Thus, we may pick δ > 0
so that Bd(p0; δ) ⊆ f⁻¹(U). But this implies that if d(p0, p) < δ, then
p ∈ f⁻¹(U), which implies that f(p) ∈ U = Bρ(f(p0); ε), which implies that
ρ(f(p0), f(p)) < ε.
Hence, f is continuous at p0, and since this was an arbitrary point, we
have that f is continuous at every point in X.
Proposition 3.12. Let (X, d), (Y, ρ) and (Z, η) be metric spaces and let
f : X → Y and g : Y → Z be continuous functions. Then g ∘ f : X → Z is
continuous.
Proof. First note that if S ⊆ Z, then x ∈ (g ∘ f)⁻¹(S) iff (g ∘ f)(x) ∈ S
iff g(f(x)) ∈ S iff f(x) ∈ g⁻¹(S) iff x ∈ f⁻¹(g⁻¹(S)). Thus, we have that
(g ∘ f)⁻¹(S) = f⁻¹(g⁻¹(S)).
We use the above theorem. If U ⊆ Z is open in Z, then g⁻¹(U) is open
in Y and hence f⁻¹(g⁻¹(U)) is open in X. Thus, when U is open in Z, then
(g ∘ f)⁻¹(U) is open in X. Hence, g ∘ f is continuous.
Problem 3.13. Let f : R → R be defined by f(x) = x³. Prove that f is
continuous.
Problem 3.14. Prove that f(x) = x³ is not uniformly continuous on R.
Problem 3.15. Prove that f : [−M, +M] → R with f(x) = x³ is uniformly
continuous.
Problem 3.16. Let (X, d) and (Y, ρ) be metric spaces. A function f : X →
Y is called Lipschitz continuous provided that there is a constant K > 0, so
that ρ(f(p), f(q)) ≤ K d(p, q). Prove that every Lipschitz continuous function
is uniformly continuous.
Given a function f : X → Y and S ⊆ X, the image of S is the subset of
Y given by f(S) = {f(p) : p ∈ S}.
3.1

f/g is continuous at p0.
Proof. To prove the first statement, given ε > 0, there is δ1 > 0, so that
d(p0, p) < δ1 implies that |f(p0) − f(p)| < ε/2. Also there is δ2 > 0, so that
d(p0, p) < δ2 implies that |g(p0) − g(p)| < ε/2. Let δ = min{δ1, δ2}.
To prove the second, given ε > 0, first pick δ1 > 0 and δ2 > 0, so
that d(p0, p) < δ1 implies |f(p0) − f(p)| < 1 and d(p0, p) < δ2 implies that
|g(p0) − g(p)| < 1. Now pick δ3 > 0 and δ4 > 0 so that d(p0, p) < δ3 implies
that |f(p0) − f(p)| < ε/(2(|g(p0)| + 1)) and d(p0, p) < δ4 implies that |g(p0) − g(p)| <
ε/(2(|f(p0)| + 1)). Let δ = min{δ1, ..., δ4}, then for d(p0, p) < δ we have that
|f(p0)g(p0) − f(p)g(p)| ≤ |f(p0) − f(p)||g(p0)| + |g(p0) − g(p)||f(p)| <
[ε/(2(|g(p0)| + 1))] |g(p0)| + [ε/(2(|f(p0)| + 1))] (|f(p0)| + 1) < ε.
Statement 3) follows from 2) and the fact that constants are continuous.
Statement 4) will follow from 2), if we prove that the function 1/g is
continuous at p0. Given ε > 0, we first pick δ1 > 0, so that d(p0, p) < δ1
implies that |g(p0) − g(p)| < |g(p0)|/2. This guarantees that |g(p)| > |g(p0)|/2. We
now pick δ2 > 0 so that d(p0, p) < δ2 implies that |g(p0) − g(p)| < ε|g(p0)|²/2.
Now if we let δ = min{δ1, δ2}, then when d(p0, p) < δ, we have
|1/g(p0) − 1/g(p)| = |g(p) − g(p0)| / |g(p0)g(p)| ≤ |g(p) − g(p0)| / (|g(p0)|²/2) <
[ε|g(p0)|²/2] · [2/|g(p0)|²] = ε.
Corollary 3.20. Let (X, d) be a metric space, let R have the usual metric
and let f, g : X → R be functions. If f and g are both continuous, then:
1. f + g is continuous,
2. f g is continuous,
3. for any constant c, cf is continuous,
4. if g(p) ≠ 0 for all p, then f/g is continuous.

3.2
Proof. Given p = (a1, ..., ak) and q = (b1, ..., bk), two points in Rk, we have
that d(xi(p), xi(q)) = |xi(p) − xi(q)| = |ai − bi| ≤ d2(p, q). This shows that
each coordinate function is a Lipschitz continuous function from (Rk, d2) to
R. The remainder of the proof follows as in the case of ordinary polynomials.
First, one does induction to show that every monomial function is continuous
and then does an induction on the number of terms in the polynomial.
Problem 3.26. Prove that the function g : [0, +∞) → R defined by g(x) =
√x is continuous.
Problem 3.27. Prove or disprove that the above function is uniformly continuous.
3.3
In this section we generalize the concept of limit and establish the connections between limits and continuity. The first result gives a sequential test
for continuity.
Theorem 3.28. Let (X, d) and (Y, ρ) be metric spaces, let f : X → Y be a
function and let p0 ∈ X. Then f is continuous at p0 if and only if whenever
{pn} ⊆ X is a sequence that converges to p0, the sequence {f(pn)} ⊆ Y
converges to f(p0).
Proof. Assume that f is continuous at p0 and let ε > 0. Then there exists
δ > 0, such that d(p0, p) < δ implies that ρ(f(p0), f(p)) < ε. Since lim_n pn =
p0, there is N so that n > N implies that d(p0, pn) < δ. Hence, if n > N,
then ρ(f(p0), f(pn)) < ε. This proves that lim_n f(pn) = f(p0).
To prove the converse, we show its contrapositive. So assume that f is
not continuous at p0. Then there exists ε > 0, for which we can obtain no
δ that satisfies the definition of continuity. The fact that δ = 1/n fails to
satisfy the definition means that there exists a point pn, with d(p0, pn) <
1/n, but ρ(f(p0), f(pn)) ≥ ε.
In this manner we obtain a sequence {pn} with d(p0, pn) < 1/n and
hence lim_n pn = p0. Yet ρ(f(p0), f(pn)) ≥ ε for every n, and so f(p0) cannot
be the limit of the sequence {f(pn)}.
Corollary 3.29. Let (X, d) and (Y, ρ) be metric spaces and let f : X → Y.
Then f is continuous if and only if whenever {pn} ⊆ X is a convergent
sequence we have lim_n f(pn) = f(lim_n pn).
We say that lim_{p→p0} f(p) = q0
provided that for every ε > 0 there is a δ > 0 such that whenever d(p0, p) < δ
and p ≠ p0, then ρ(q0, f(p)) < ε.
Note that we need p0 to be a cluster point to guarantee that the set of
p's satisfying d(p0, p) < δ and p ≠ p0 is non-empty. Often, when using this
definition, the function f is actually defined on all of X, in which case we
simply ignore its value at p0.
We leave the proof of the following result as an exercise.
Theorem 3.31. Let (X, d) and (Y, ρ) be metric spaces, let f : X → Y be
a function and let p0 ∈ X be a cluster point. Then f is continuous at p0 if
and only if lim_{p→p0} f(p) = f(p0).
Lemma 3.32. Let (X, d) and (Y, ρ) be metric spaces and let f : X → Y be
any function. If p0 ∈ X is not a cluster point, then f is continuous at p0.
Proof. Since p0 is not a cluster point, there exists r > 0 so that d(p0, p) < r
implies that p = p0. Thus, given any ε > 0, if we let δ = r, then d(p0, p) < δ
implies that ρ(f(p0), f(p)) = 0 < ε.
Corollary 3.33. Let (X, d) and (Y, ρ) be metric spaces, and let f : X → Y
be a function. Then f is continuous if and only if for every cluster point
p0 ∈ X, we have that lim_{p→p0} f(p) = f(p0).
Problem 3.34. Prove Theorem 3.31.
Problem 3.35. Prove Corollary 3.33.
Problem 3.36. Let (X, d) and (Y, ρ) be metric spaces, let f : X → Y be
a function and let p0 ∈ X be a cluster point. Prove that lim_{p→p0} f(p) = q0
if and only if for every sequence of points {pn} ⊆ X such that pn ≠ p0 and
lim_n pn = p0, we have lim_n f(pn) = q0.

3.4
Theorem 3.40. Let (X, d) and (Y, ρ) be metric spaces, and let f : X → Y
be a continuous function. If X is compact, then f is uniformly continuous.
Proof. Given ε > 0, since f is continuous at p, there is a δ(p) > 0, such that
d(p, q) < δ(p) implies that ρ(f(p), f(q)) < ε/2.
The collection of sets {B(p; δ(p)/2)}_{p∈X} is an open cover of X. Hence, there
exists a finite collection of points, {p1, ..., pn} so that
X ⊆ B(p1; δ(p1)/2) ∪ · · · ∪ B(pn; δ(pn)/2).
3.5
We will see in this section that the intermediate value theorem from calculus
is really a consequence of the fact that an interval of real numbers is a
connected set. First we need to define this concept.
Definition 3.42. A metric space (X, d) is connected if the only subsets
of X that are both open and closed are X and the empty set. A subset S
of X is called connected provided that the subspace (S, d) is a connected
metric space. If S is not connected then we say that S is disconnected or
separated.
Proposition 3.43. A metric space (X, d) is disconnected if and only if X
can be written as a union of two disjoint, non-empty open sets.
Chapter 4

Proof. First, we show that the inductively defined sequence does converge.
To see this, since X is complete, it will be enough to show that the sequence
is Cauchy.
Let A = d(x2, x1). Then we have that d(x3, x2) = d(f(x2), f(x1)) ≤
r d(x2, x1) = rA. Similarly, d(x4, x3) = d(f(x3), f(x2)) ≤ r d(x3, x2) ≤ r²A.
By induction, we prove that d(xn+1, xn) ≤ r^(n−1) A.
Now, if m > n, then d(xm, xn) ≤ d(xm, xm−1) + · · · + d(xn+1, xn) ≤
Σ_{k=n}^{m−1} r^(k−1) A ≤ A r^(n−1)/(1 − r). Given any ε > 0, we may choose an integer N such
that A r^N/(1 − r) < ε. Then if m, n > N, we have that d(xm, xn) < ε. Thus, {xn}_{n∈N}
is Cauchy.
Let x0 = lim_n xn. Then since f is continuous, f(x0) = lim_n f(xn) =
lim_n xn+1 = x0.
Now if we fix any n, then d(x0, xn) = lim_m d(xm, xn) ≤ A r^(n−1)/(1 − r), by the
above estimate, which proves 3).
Thus, we know that there is a point x0, with f(x0) = x0, and that this
sequence converges to one such point. To complete the proof of the theorem
it will be enough to show that if f(x0) = x0 and f(p0) = p0, then p0 = x0.
To see this last fact, note that d(p0, x0) = d(f(p0), f(x0)) ≤ r d(p0, x0). Since
r < 1, this implies that d(p0, x0) = 0.
Thus, the contraction mapping principle not only guarantees us the existence of a fixed point, but shows that there is a unique fixed point and gives
us a method for approximating the fixed point, together with an estimate of
how close the sequence xn is to the fixed point! This is a remarkable amount
of information.
Here's a typical application of this theorem. Let f(x) = cos(x) and
note that since 1 < π/2, for 0 ≤ x ≤ 1 we have that 0 ≤ f(x) ≤ 1. Thus,
f : [0, 1] → [0, 1] and is continuous. Also, by the mean value theorem, for
0 ≤ x ≤ y ≤ 1, there is c, x ≤ c ≤ y, with
f(y) − f(x) = f′(c)(y − x) = −sin(c)(y − x).
Hence, |f(y) − f(x)| ≤ sin(c)|y − x| ≤ sin(1)|y − x|. Thus, f(x) = cos(x) is
a contraction mapping with r = sin(1) < 1.
Using the fact that [0, 1] is complete, by the contraction mapping principle, there is a unique point, 0 ≤ x0 ≤ 1, such that cos(x0) = x0. Moreover,
we can obtain this point (or at least approximate it) by choosing any number
x1, 0 ≤ x1 ≤ 1, and forming the inductive sequence xn+1 = cos(xn).
The third part of the theorem gives us an estimate of the distance between our approximate fixed point xn and the true fixed point.
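A numerical companion to this example (not part of the notes): it iterates xn+1 = cos(xn) and compares the actual error with the a priori bound A r^(n−1)/(1 − r) from the theorem, where A = d(x2, x1) and r = sin(1). The starting point is an arbitrary choice.

```python
# Fixed-point iteration x_{n+1} = cos(x_n) on [0, 1] and the a priori error bound.
import math

r = math.sin(1.0)          # contraction constant from the mean value theorem
x = 0.2                    # x_1: any starting point in [0, 1]
iterates = [x]             # iterates[0] is x_1, iterates[1] is x_2, ...
for _ in range(60):
    x = math.cos(x)
    iterates.append(x)

x_star = iterates[-1]                    # essentially the fixed point cos(x*) = x*
A = abs(iterates[1] - iterates[0])       # A = d(x_2, x_1)
for n in (5, 10, 20):
    error = abs(x_star - iterates[n - 1])    # distance from x_n to the fixed point
    bound = A * r ** (n - 1) / (1 - r)       # the estimate from part 3) of the theorem
    print(f"n={n:2d}  error={error:.2e}  bound={bound:.2e}")
    assert error <= bound + 1e-12
```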
4.1

f(x1)/f′(x1).

Proof. Fix an r, 0 < r < 1. Let g be as above; then g′(x) = f″(x)f(x)/(f′(x))², which is
continuous in a neighborhood of x0. Since g′(x0) = 0, there is some M > 0,
4.2

In this section we show how the contraction mapping principle can be used
to deduce that solutions exist and are unique for some very complicated
ordinary differential equations.
Starting with a continuous function h(x, y) and an initial value y0, we
wish to solve
dy/dx = h(x, y), y(a) = y0.
That is, we seek a function f(x) so that f′(x) = h(x, f(x)) for a < x < b
and f(a) = y0. Such a problem is often called an initial value problem (IVP).
For example, when h(x, y) = x²y³, then we are trying to solve dy/dx = x²y³,
which can be done by separation of variables. But our function could be
h(x, y) = sin(xy), in which case the differential equation becomes dy/dx =
sin(xy), which cannot be solved by elementary means.
For this application, we will use a number of things that we have not yet
developed fully. But again, we stress that we are seeking motivation.
Using that fn is continuous, pick δ > 0, so that |x0 − x| < δ implies that
|fn(x0) − fn(x)| < ε/3. Then for |x0 − x| < δ we have that
|f(x0) − f(x)| = |f(x0) − fn(x0) + fn(x0) − fn(x) + fn(x) − f(x)|
≤ |f(x0) − fn(x0)| + |fn(x0) − fn(x)| + |fn(x) − f(x)| < ε.
Thus, we have that f is continuous at x0 and since x0 was arbitrary, f is
continuous.
Finally, we must show that lim_n d(f, fn) = 0. But to see this, given
ε > 0, take N as before and recall that we have already shown that |f(x) −
fn(x)| ≤ ε/3, for any n > N. Hence, for n > N, we have that d(f, fn) ≤
sup{|f(x) − fn(x)| : a ≤ x ≤ b} ≤ ε/3 < ε.
We now have all the tools at our disposal needed to solve some IVPs.
Theorem 4.8. Let h(x, y) be a continuous function on [a, b] × R and assume
that |h(x, y1) − h(x, y2)| ≤ K|y1 − y2| with r = K(b − a) < 1. Then for any
y0, the initial value problem f′(x) = h(x, f(x)), f(a) = y0, has a unique
solution on the interval [a, b].
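The standard proof of such results applies the contraction mapping principle to the integral operator f ↦ y0 + ∫_a^x h(t, f(t)) dt. The sketch below is not from the notes: it imitates that Picard iteration numerically, approximating the integral with the trapezoid rule. The sample equation y′ = sin(xy), y(0) = 1 on [0, 0.5], the grid size, and the function names are all illustrative choices.

```python
# Rough numerical Picard iteration:  f_{n+1}(x) = y0 + integral_a^x h(t, f_n(t)) dt.
import math

def picard(h, a, b, y0, num_points=201, iterations=20):
    xs = [a + (b - a) * k / (num_points - 1) for k in range(num_points)]
    f = [y0] * num_points                  # f_1: the constant function y0
    for _ in range(iterations):
        g = [y0]                           # next iterate, built by cumulative trapezoid sums
        for k in range(1, num_points):
            dx = xs[k] - xs[k - 1]
            g.append(g[-1] + 0.5 * dx * (h(xs[k - 1], f[k - 1]) + h(xs[k], f[k])))
        f = g
    return xs, f

xs, f = picard(lambda x, y: math.sin(x * y), a=0.0, b=0.5, y0=1.0)
print(f[-1])   # approximate value of the solution at x = 0.5
```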
Chapter 5
Riemann and Riemann-Stieltjes Integration

In this chapter we develop the theory of the Riemann integral, which is
the type of integration used in your calculus courses, and we also introduce
Riemann-Stieltjes integration, which is widely used in probability, statistics
and financial mathematics.
Given a closed interval [a, b], by a partition of [a, b] we mean a set
P = {x0, x1, ..., xn−1, xn} with a = x0 < x1 < ... < xn = b. The norm or
width of the partition is
‖P‖ = max{xi − xi−1 : 1 ≤ i ≤ n}.
Given two partitions P1 and P2 we say that P2 is a refinement of P1 or
P2 refines P1 provided that as sets P1 ⊆ P2. Note that if P2 refines P1,
then ‖P2‖ ≤ ‖P1‖.
Given a bounded function f : [a, b] → R and a partition P = {x0, ..., xn}
of [a, b], for i = 1, ..., n we set
Mi = sup{f(x) : xi−1 ≤ x ≤ xi}
and
mi = inf{f(x) : xi−1 ≤ x ≤ xi}.
The upper Riemann sum of f given the partition P is the real number
U(f, P) = Σ_{i=1}^{n} Mi (xi − xi−1),
and the lower Riemann sum of f given the partition P is
L(f, P) = Σ_{i=1}^{n} mi (xi − xi−1).
A general Riemann sum of f given the partition P is any number of the form
Σ_{i=1}^{n} f(x′i)(xi − xi−1),
where x′i is any choice of points satisfying xi−1 ≤ x′i ≤ xi, for i = 1, ..., n.
Since mi ≤ f(x′i) ≤ Mi, the upper and lower Riemann sums give an upper and
lower bound for general Riemann sums, i.e.,
L(f, P) ≤ Σ_{i=1}^{n} f(x′i)(xi − xi−1) ≤ U(f, P).
In fact, since we can choose the points x′i so that f(x′i) is arbitrarily close
to Mi, we see that U(f, P) is actually the supremum of all general Riemann
sums of f given P. Similarly, by choosing the points x′i so that f(x′i) is
arbitrarily close to mi, we see that L(f, P) is the infimum of all general
Riemann sums.
Thus, if we want all general Riemann sums of a function to be close
to a value that we wish to think of as the integral of f, then it will be
enough to study the extreme cases of the upper and lower sums.
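A small numerical illustration (not part of the notes) of the upper and lower sums: for f(x) = x² on [0, 1] with uniform partitions, U(f, P) and L(f, P) squeeze toward the same value as the partition is refined. The choice of f and of the partitions is arbitrary.

```python
# Upper and lower Riemann sums of f(x) = x^2 on [0, 1] over uniform partitions.
def upper_lower_sums(f, partition):
    U = L = 0.0
    for left, right in zip(partition, partition[1:]):
        # For this monotone f the sup and inf on each subinterval occur at the endpoints.
        M = max(f(left), f(right))
        m = min(f(left), f(right))
        U += M * (right - left)
        L += m * (right - left)
    return U, L

f = lambda x: x * x
for n in (10, 100, 1000):
    P = [i / n for i in range(n + 1)]
    U, L = upper_lower_sums(f, P)
    print(f"n={n:5d}  L(f,P)={L:.6f}  U(f,P)={U:.6f}  gap={U - L:.6f}")
# Both sums squeeze toward 1/3 and the gap U - L = 1/n tends to 0.
```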
Proposition 5.1. Let f : [a, b] → R be a bounded function and let P1 and
P2 be partitions of [a, b] with P2 a refinement of P1. Then
L(f, P1) ≤ L(f, P2) ≤ U(f, P2) ≤ U(f, P1).
Proof. We have already observed that for any partition, L(f, P) ≤ U(f, P).
So we only need to consider the other two inequalities.
It is enough to prove these inequalities in the case that P2 is obtained
from P1 by adding just one extra point. Because if we do this, then P2 can
be obtained from P1 by successively adding a point at a time, and in each
case the inequalities move in the right direction.
So let us set P1 = {x0, ..., xn} so that for some j, P2 = {x0, ..., xj−1, x̃, xj, ..., xn},
where xj−1 < x̃ < xj.
We start with the case of the upper sums. Note that most of the terms
in the sum for U(f, P1) and U(f, P2) will be equal, except that the one term
Mj(xj − xj−1) in the sum for U(f, P1) will be replaced by two terms in the
sum for U(f, P2). These two new terms are M′j(x̃ − xj−1) + M″j(xj − x̃),
where
M′j = sup{f(x) : xj−1 ≤ x ≤ x̃} and M″j = sup{f(x) : x̃ ≤ x ≤ xj}.
Since [xj−1, x̃] ⊆ [xj−1, xj] and [x̃, xj] ⊆ [xj−1, xj], we have that M′j ≤
Mj and M″j ≤ Mj. Hence, the two terms appearing in U(f, P2) satisfy
M′j(x̃ − xj−1) + M″j(xj − x̃) ≤ Mj(x̃ − xj−1) + Mj(xj − x̃) = Mj(xj − xj−1),
which is the one term appearing in U(f, P1). Thus, U(f, P2) ≤ U(f, P1).
The proof for the lower sums is similar, except that since we are dealing
with infima, we will have that
m′j = inf{f(x) : xj−1 ≤ x ≤ x̃} ≥ mj and m″j = inf{f(x) : x̃ ≤ x ≤ xj} ≥ mj.
We say that f is Riemann integrable when these two numbers are equal,
and in this case we define the Riemann integral of f to be the common value
of the upper and lower Riemann integrals, denoted ∫_a^b f(x) dx.

Here the upper Riemann integral is b − a and the lower Riemann integral is 0.

and so L(f, P) ≤ ∫_a^b f(x) dx.
which implies that
0 ≤ (upper Riemann integral of f) − (lower Riemann integral of f) < ε.
Since ε > 0 was arbitrary, the lower and upper Riemann integrals must be
equal.
Conversely, assume that f is Riemann integrable and that ε > 0 is given.
Since the upper integral is an infimum, we may choose a partition P1 so that
U(f, P1) < ∫_a^b f(x) dx + ε/2,
and since the lower integral is a supremum, we may choose a partition P2 so that
L(f, P2) > ∫_a^b f(x) dx − ε/2.
Let P3 = P1 ∪ P2, then
U(f, P3) − L(f, P3) ≤ U(f, P1) − L(f, P2) <
(∫_a^b f(x) dx + ε/2) − (∫_a^b f(x) dx − ε/2) = ε,
since the upper and lower integrals are equal. Thus, we have found a partition satisfying our criteria.
Definition 5.5. We say that a bounded function f : [a, b] → R satisfies
the Riemann integrability criterion provided that for every ε > 0, there
exists a partition P, such that U(f, P) − L(f, P) < ε.
Thus, the above theorem says that a bounded function is Riemann integrable if and only if it satisfies the Riemann integrability criterion.
Theorem 5.6. Let f : [a, b] → R be a continuous function. Then f is
Riemann integrable.
Proof. Since f is continuous and [a, b] is compact, we have that f is bounded
and f is uniformly continuous. We will prove that f meets the Riemann
integrability criterion.
Given any ε > 0, since f is uniformly continuous, there is a δ > 0, so
that whenever |x − y| < δ then |f(x) − f(y)| < ε/(b − a). Now let P = {a =
x0, ..., xn = b} be any partition with ‖P‖ < δ. Then Mi − mi ≤ ε/(b − a) for
each i, and so
U(f, P) − L(f, P) = Σ_{i=1}^{n} (Mi − mi)(xi − xi−1) ≤ Σ_{i=1}^{n} [ε/(b − a)](xi − xi−1) = ε.
Σ_{j=1}^{n} j = n(n + 1)/2.
Problem 5.7. Let f(x) = x. Compute U(f, Pn) and L(f, Pn). Use these
formulas and the Riemann integrability criterion to prove that f is Riemann
integrable on [0, 1] and to prove that ∫_0^1 x dx = 1/2.
Problem 5.8. Let a ≤ c < d ≤ b and let f : [a, b] → R be the function
defined by
f(x) = 0 for a ≤ x ≤ c, f(x) = 1 for c < x < d, f(x) = 0 for d ≤ x ≤ b.
Prove that f is Riemann integrable on [a, b] and that ∫_a^b f(x) dx = (d − c).
5.1

∫_a^b f dα,
and it is designed to also define a signed area under the graph of f, but
now if we want the area of a rectangle to be the length of the base times the
height, then a rectangle from xi−1 to xi of height h should have area
h(α(xi) − α(xi−1)).
Thus, given a bounded function f, an increasing function α, a partition
P = {a = x0, ...., xn = b}, the numbers Mi = sup{f(x) : xi−1 ≤ x ≤ xi} and
mi = inf{f(x) : xi−1 ≤ x ≤ xi}, we are led to define the upper Riemann-Stieltjes sum as
U(f, P, α) = Σ_{i=1}^{n} Mi (α(xi) − α(xi−1))
and the lower Riemann-Stieltjes sum as
L(f, P, α) = Σ_{i=1}^{n} mi (α(xi) − α(xi−1)).
When the upper and lower Riemann-Stieltjes integrals agree, we say that f is
Riemann-Stieltjes integrable with respect to α and write
∫_a^b f dα
for the common value.
Prob(X ≤ x) = 0 for x < 1, Prob(X ≤ x) = p for 1 ≤ x < 3, and Prob(X ≤ x) = 1 for 3 ≤ x.
So our cumulative distribution function has two jump discontinuities, the
first of size p occurring at 1 and the second of size 1 − p occurring at 3.
More generally, if one has a probability space, a random variable X,
and we let α(x) = Prob(X ≤ x), then α will be an increasing function,
i.e., one for which we can consider Riemann-Stieltjes integrals. Much work
in probability and its applications, e.g., financial math, is concerned with
computing these Riemann-Stieltjes integrals for various functions f in the
case that α is the cumulative distribution function of a random variable.
Returning to our example, if for any c we let
Jc(x) = 0 for x < c, and Jc(x) = 1 for c ≤ x,
then we can also write Prob(X ≤ x) = p J1(x) + (1 − p) J3(x).
These simple jump functions are useful for expressing the cumulative
distributions of many random variables. For example, if we consider one roll
of a fair die, so that each side has probability 1/6 of facing up, and we let
X be the random variable that simply gives the number that is facing up,
then
Prob(X ≤ x) = 1/6 [J1(x) + J2(x) + J3(x) + J4(x) + J5(x) + J6(x)].
For this reason we want to take a careful look at what Riemann-Stieltjes
integration is when α(x) = h Jc(x), for some real number c and h > 0.
Definition 5.14. Let f : [a, b] → R and let a < c ≤ b. If there is a number
L so that for every ε > 0, there exists δ > 0, so that when c − δ < x < c,
Proof. First we assume that f is continuous from the left at c and prove
that f satisfies the Riemann-Stieltjes integrability criterion. So let ε > 0
be given. Since lim_{x→c−} f(x) = f(c), there is δ > 0, so that c − δ < x < c
implies that |f(x) − f(c)| < ε/(3h).
Now consider the partition P = {a = x0, x1 = c − δ/2, x3 = c, x4 = b}.
Then we have that
U(f, P, α) − L(f, P, α) = Σ_{i=1}^{4} (Mi − mi)(α(xi) − α(xi−1))

5.2
∫_a^b cf dα = c ∫_a^b f dα,
∫_a^b (f + g) dα = ∫_a^b f dα + ∫_a^b g dα,

(lower) ∫_a^b cf dα = sup{L(cf, P, α)} = sup{c L(f, P, α)} = c sup{L(f, P, α)} = c ∫_a^b f dα
= c inf{U(f, P, α)} = inf{c U(f, P, α)} = inf{U(cf, P, α)} = (upper) ∫_a^b cf dα,
which shows that the upper and lower integrals are equal for cf, so that cf
is Riemann-Stieltjes integrable with respect to α and shows that they are
both equal to c ∫_a^b f dα. Thus, 1) is proven in the case that c ≥ 0.
When c < 0, recall that for any bounded set S of real numbers, sup{cs :
s ∈ S} = c inf{s : s ∈ S} and inf{cs : s ∈ S} = c sup{s : s ∈ S}. Thus,
for any interval, sup{cf(x) : xi−1 ≤ x ≤ xi} = c inf{f(x) : xi−1 ≤ x ≤ xi},

(lower) ∫_a^b cf dα = sup{L(cf, P, α)} = sup{c U(f, P, α)} = c inf{U(f, P, α)} = c ∫_a^b f dα
= c sup{L(f, P, α)} = inf{c L(f, P, α)} = inf{U(cf, P, α)} = (upper) ∫_a^b cf dα,
∫_a^b f dα + ∫_a^b g dα ≤ (lower) ∫_a^b (f + g) dα ≤ (upper) ∫_a^b (f + g) dα ≤ ∫_a^b f dα + ∫_a^b g dα.
Since the upper bound and lower bound are both equal to ∫_a^b f dα + ∫_a^b g dα,
we have that these inequalities must be equalities and 2) follows.
Before proving 3) we first make a number of observations to reduce the
work. First suppose that we only prove that if a function h is Riemann-Stieltjes
integrable with respect to α, then h² is Riemann-Stieltjes integrable.
Then, since f and g are both integrable, by part 2), f + g is integrable and
hence we would have that (f + g)² = f² + 2fg + g², f² and g² are all integrable.
Applying 1) and 2) again, we would have that (f + g)² − f² − g² = 2fg
is integrable and hence by 1), fg is integrable. Thus, it will be enough to
prove that h integrable implies that h² is integrable.
Next, we claim that to show that squares of integrable functions are
integrable, it will be enough to show that squares of non-negative integrable
functions are integrable. To see this, given any h that is integrable, we know that
h is bounded, so there is some constant c > 0 so that h + c ≥ 0. Since
constants are continuous functions and continuous functions are integrable,
h + c is integrable. Thus, if we know that squares of non-negative integrable
functions are integrable, then we will have that (h + c)² = h² + 2ch + c² is
integrable. But 2ch + c² is integrable by 1) and 2), so h² = (h + c)² − 2ch − c²
is integrable.
Thus, it remains to show that if h ≥ 0 is Riemann-Stieltjes integrable
with respect to α, then h² is also Riemann-Stieltjes integrable with respect
to α. To this end we show that h² satisfies the Riemann-Stieltjes integrability
criterion. So let ε > 0 be given and let K = sup{h(x) : a ≤ x ≤ b}. If h = 0,
then the result is clearly true, so we may assume that h ≠ 0 and so K > 0.
Since h is Riemann-Stieltjes integrable, it satisfies the criterion and hence
there exists a partition P, so that U(h, P, α) − L(h, P, α) < ε/2K. Let
P = {a = x0, ..., xn = b}, let Mi = sup{h(x) : xi−1 ≤ x ≤ xi} and let
mi = inf{h(x) : xi−1 ≤ x ≤ xi}. Since h(x) ≥ 0, we have that sup{h(x)² :
xi−1 ≤ x ≤ xi} = Mi² and inf{h(x)² : xi−1 ≤ x ≤ xi} = mi². Thus, we have
that
U(h², P, α) − L(h², P, α) = Σ_{i=1}^{n} (Mi² − mi²)(α(xi) − α(xi−1)) =
Σ_{i=1}^{n} (Mi + mi)(Mi − mi)(α(xi) − α(xi−1)) ≤ 2K Σ_{i=1}^{n} (Mi − mi)(α(xi) − α(xi−1)) =
2K [U(h, P, α) − L(h, P, α)] < ε.

and
∫_a^b f d(cα1 + α2) = c ∫_a^b f dα1 + ∫_a^b f dα2.
Proof. First note that for any partition P = {a = x0, ..., xn = b}, we have
that
U(f, P, cα1 + α2) = Σ_{i=1}^{n} Mi ((cα1 + α2)(xi) − (cα1 + α2)(xi−1)) =
c U(f, P, α1) + U(f, P, α2).
Thus, given ε > 0, we may choose partitions P1 and P2 so that
U(f, P1, α1) < ∫_a^b f dα1 + ε/(2c)
and
U(f, P2, α2) < ∫_a^b f dα2 + ε/2.
Let P3 = P1 ∪ P2, then
∫_a^b f d(cα1 + α2) ≤ U(f, P3, cα1 + α2) =
c U(f, P3, α1) + U(f, P3, α2) ≤ c U(f, P1, α1) + U(f, P2, α2) <
c [∫_a^b f dα1 + ε/(2c)] + [∫_a^b f dα2 + ε/2],
and these two inequalities for the upper Riemann-Stieltjes integrals prove equality
of these integrals. The proof for the lower Riemann-Stieltjes integrals is
similar.
Proposition 5.22. Let α1, α2 : [a, b] → R be increasing, let c > 0 and let
f : [a, b] → R be a bounded function. If f is Riemann-Stieltjes integrable
with respect to α1 and α2, then f is Riemann-Stieltjes integrable with respect
to cα1 + α2 and
∫_a^b f d(cα1 + α2) = c ∫_a^b f dα1 + ∫_a^b f dα2.
Proof. The previous result shows that the upper and lower Riemann-Stieltjes
integrals of f with respect to cα1 + α2 are equal and are given by the above formula.
Theorem 5.23. Let a < c1 < .... < cn ≤ b, let hi > 0, i = 1, ..., n, and
let α = h1 Jc1 + · · · + hn Jcn. If f : [a, b] → R is a bounded function that is
continuous from the left at each ci, i = 1, ..., n, then f is Riemann-Stieltjes
integrable with respect to α and
∫_a^b f dα = h1 f(c1) + · · · + hn f(cn).
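A numerical companion to Theorem 5.23 (not from the notes): for a step function α = h1 J_{c1} + · · · + hn J_{cn} and a continuous f, a Riemann-Stieltjes sum over a fine partition approaches h1 f(c1) + · · · + hn f(cn). The jump locations, heights, and the choice f(x) = x² below are arbitrary.

```python
# Riemann-Stieltjes sums against a step function approach the sum of weighted jumps.
def J(c):
    return lambda x: 0.0 if x < c else 1.0   # the jump function J_c from the text

def rs_sum(f, alpha, partition):
    """Riemann-Stieltjes sum using left endpoints as the sample points."""
    return sum(f(l) * (alpha(r) - alpha(l)) for l, r in zip(partition, partition[1:]))

cs, hs = [0.5, 1.5, 2.25], [2.0, 1.0, 3.0]
alpha = lambda x: sum(h * J(c)(x) for c, h in zip(cs, hs))
f = lambda x: x * x

a, b, n = 0.0, 3.0, 30000
P = [a + (b - a) * k / n for k in range(n + 1)]
print(rs_sum(f, alpha, P))                    # close to the exact value below
print(sum(h * f(c) for c, h in zip(cs, hs)))  # h1*f(c1) + h2*f(c2) + h3*f(c3)
```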
Problem 5.27. Let α : [−1, 3] → R be defined by α(x) = 0 for x < 0,
α(x) = 1 for 0 ≤ x < 1, α(x) = 1 + x for 1 ≤ x < 2, and α(x) = 7 for 2 ≤ x ≤ 3.
Compute ∫_{−1}^{3} t dα(t).
Problem 5.28. Let α : [a, b] → R be an increasing function, let a < c < b
and let f : [a, b] → R be a bounded function. Prove that the upper and
lower Riemann-Stieltjes integrals satisfy
∫_a^b f dα = ∫_a^c f dα + ∫_c^b f dα,
when all three are interpreted as upper integrals, and likewise when all three
are interpreted as lower integrals.
Problem 5.29. Let α : [a, b] → R, let a < c < b and let f : [a, b] → R be
Riemann-Stieltjes integrable. Prove that f is Riemann-Stieltjes integrable
on [a, c] and on [c, b] and that
∫_a^b f dα = ∫_a^c f dα + ∫_c^b f dα.

5.3
Before proceeding it will be useful to have a few more facts about the
Riemann-Stieltjes integral.

∫_a^b f dα = f(c)(α(b) − α(a)).
Proof. When α(b) − α(a) = 0, then both sides of the equation are 0 and any
point c will suffice. So assume that α(b) − α(a) ≠ 0.
By Problem 5.18,
m ≤ [∫_a^b f dα] / [α(b) − α(a)] ≤ M,
where m = f(x0) and M = f(x1) are the minimum and maximum of f on
[a, b]. Thus, by the intermediate value theorem, there is a point c between
x0 and x1 such that f(c) is equal to this fraction.
Proposition 5.31. Let a ≤ c < d ≤ b, let α : [a, b] → R be an increasing
function and let f : [a, b] → R be a bounded function that is Riemann-Stieltjes
integrable with respect to α on [a, b]. Then f is Riemann-Stieltjes
integrable with respect to α on [c, d].
Proof. Given ε > 0 there is a partition P of [a, b] with U(f, P, α) − L(f, P, α) <
ε. Let P1 = P ∪ {c, d}, then P1 is a refinement of P and so U(f, P1, α) −
L(f, P1, α) ≤ U(f, P, α) − L(f, P, α) < ε. Now if we only consider the points
in P1 that lie in [c, d] then that will be a partition of [c, d] and the terms in the
above sum corresponding to those subintervals will be even smaller. Thus,
we obtain a partition of [c, d] that satisfies the condition in the Riemann-Stieltjes
integrability criterion.
In the case that α(x) = x the following theorem reduces to the classic
fundamental theorem of calculus.
Theorem 5.32 (Fundamental Theorem of Calculus). Let α : [a, b] → R be
an increasing function that is differentiable on (a, b) and let f : [a, b] → R
be continuous. If F(x) = ∫_a^x f(t) dα(t), then F is differentiable on (a, b) and
F′(x) = f(x)α′(x).
Proof. Let a < x < x + h < b, then by the mean value theorem, there is
c, x ≤ c ≤ x + h, so that
[F(x + h) − F(x)] / h = (1/h) ∫_x^{x+h} f(t) dα(t) = f(c)[α(x + h) − α(x)] / h.