Real Analysis-Final
Real Analysis
M.A. (Previous)
Developed & Produced by EXCEL BOOKS PVT. LTD., A-45 Naraina, Phase 1, New Delhi-110028
Contents
Chapter 1 Sequences and Series of Functions
1
SEQUENCES AND SERIES OF FUNCTIONS
1.1. The object of this chapter is to consider sequences whose terms are functions rather than
real numbers. These sequences are useful in obtaining approximations to a given function. We
shall study two different notions of convergence for a sequence of functions: pointwise
convergence and uniform convergence.
Pointwise and Uniform Convergence of Sequences of functions
Definition. Let A ⊆ R and suppose that for each n∈N there is a function fn : A→R. Then <fn> is
called a sequence of functions on A. For each x ∈A, this sequence gives rise to a sequence of
real numbers, namely the sequence < fn(x) > .
Definition. Let A ⊆ R and let < fn > be a sequence of functions on A. Let A0 ⊆ A and
suppose f : A0 → R. Then the sequence <fn> is said to converge on A0 to f if for each x∈A0, the
sequence <fn(x)> converges to f(x) in R.
In such a case f is called the limit function on A0 of the sequence <fn>.
When such a function f exists, we say that the sequence <fn> is convergent on A0 or that
<fn> converges pointwise on A0 to f and we write
$f(x) = \lim_{n\to\infty} f_n(x)$, x∈A0.
Similarly, if Σfn(x) converges for every x∈A0, and if
$f(x) = \sum_{n=1}^{\infty} f_n(x)$, x∈A0,
then f is called the sum of the series Σfn.
Thus our question of continuity reduces to “can we interchange the limit symbols in (1.1.2)?” or
“Is the order in which limit processes are carried out immaterial”. The following examples show
that the limit symbols cannot in general be interchanged.
Example. A sequence of functions for which the limit of the integral is not equal to the integral of
the limit: Let
$f_n(x) = n^2 x (1-x)^n$, x∈R, n = 1, 2, ….
If 0 ≤ x ≤ 1, then
$f(x) = \lim_{n\to\infty} f_n(x) = 0$
and so
$\int_0^1 f(x)\,dx = 0.$
But
$\int_0^1 f_n(x)\,dx = n^2 \int_0^1 x(1-x)^n\,dx = \frac{n^2}{n+1} - \frac{n^2}{n+2} = \frac{n^2}{(n+1)(n+2)}$
and so
$\lim_{n\to\infty} \int_0^1 f_n(x)\,dx = 1.$
Hence
$\lim_{n\to\infty} \int_0^1 f_n(x)\,dx \ne \int_0^1 \left( \lim_{n\to\infty} f_n(x) \right) dx.$
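The computation above can be checked numerically. The following is a minimal sketch, assuming a simple midpoint rule is accurate enough for illustration; the helper names `integral_fn` and `midpoint_integral` are hypothetical and not part of the text.

```python
from fractions import Fraction


def integral_fn(n: int) -> Fraction:
    """Closed form n^2 / ((n+1)(n+2)) for the integral of n^2 x (1-x)^n over [0, 1]."""
    return Fraction(n * n, (n + 1) * (n + 2))


def midpoint_integral(n: int, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the same integral (illustrative only)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += n * n * x * (1.0 - x) ** n
    return total * h


# The integrals approach 1 although the pointwise limit integrates to 0.
for n in (1, 10, 100):
    print(n, float(integral_fn(n)), midpoint_integral(n))
```

The closed form and the quadrature should agree closely, and both drift toward 1 as n grows, while the integral of the limit function is 0.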
Example. A sequence of differentiable functions {fn} with limit 0 for which {fn′} diverges:
Let
$f_n(x) = \frac{\sin nx}{\sqrt{n}}$, x∈R, n = 1, 2, ….
Then
$\lim_{n\to\infty} f_n(x) = 0$ for all x.
converges uniformly on E.
Examples. (1) Consider the sequence <Sn> defined by $S_n(x) = \frac{1}{x+n}$ in any interval [a, b], a > 0.
Then
$S(x) = \lim_{n\to\infty} S_n(x) = \lim_{n\to\infty} \frac{1}{x+n} = 0.$
For convergence we must have
(1.1.4) $|S_n(x) - S(x)| < \varepsilon$, n > n0,
or $\left| \frac{1}{x+n} - 0 \right| < \varepsilon$, n > n0,
or $\frac{1}{x+n} < \varepsilon$
or $x + n > \frac{1}{\varepsilon}$
or $n > \frac{1}{\varepsilon} - x.$
If we select n0 as the integer next higher to $1/\varepsilon$, then (1.1.4) is satisfied for every integer n greater than $1/\varepsilon$,
which does not depend on x∈[a, b]. Hence the sequence <Sn> is uniformly convergent to S(x) in
[a, b].
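2. Consider the sequence <fn> defined by
$f_n(x) = \frac{x}{1+nx}$, x ≥ 0.
Then
$f(x) = \lim_{n\to\infty} \frac{x}{1+nx} = 0$ for all x ≥ 0.
Thus <fn> converges pointwise to 0 for all x ≥ 0. Let ε > 0; then for convergence we must have
$|f_n(x) - f(x)| < \varepsilon$, n > n0,
or $\left| \frac{x}{1+nx} - 0 \right| < \varepsilon$, n > n0,
or $\frac{x}{1+nx} < \varepsilon$
or $x < \varepsilon + nx\varepsilon$
or $nx\varepsilon > x - \varepsilon$
or $n > \frac{x-\varepsilon}{x\varepsilon}$ (for x > 0),
which certainly holds if
$n > \frac{x}{x\varepsilon} = \frac{1}{\varepsilon}.$
If n0 is taken as an integer greater than $1/\varepsilon$, then
$|f_n(x) - f(x)| < \varepsilon$ for all n > n0 and all x∈[0, ∞).
Hence <fn> converges uniformly to f on [0, ∞).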
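3. Consider the sequence <fn> defined by
$f_n(x) = x^n$, 0 ≤ x ≤ 1.
Then
$f(x) = \lim_{n\to\infty} x^n = \begin{cases} 0 & \text{if } 0 \le x < 1 \\ 1 & \text{if } x = 1. \end{cases}$
For 0 < x < 1 and 0 < ε < 1, convergence requires $|f_n(x) - f(x)| < \varepsilon$ for n > n0,
or $x^n < \varepsilon$
or $\left( \frac{1}{x} \right)^n > \frac{1}{\varepsilon}$
or $n > \frac{\log (1/\varepsilon)}{\log (1/x)}.$
Thus we should take n0 to be the integer next higher to $\log(1/\varepsilon) / \log(1/x)$. As x → 1 this quantity
tends to infinity, so no single n0 works for all x. Thus the sequence in question is not uniformly convergent to f in any interval
which contains 1.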
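The dependence of n0 on x in this example can be tabulated directly. This is a minimal sketch, assuming 0 < x < 1 and 0 < ε < 1; the function name `n0` is chosen here only to mirror the notation of the text.

```python
import math


def n0(x: float, eps: float) -> int:
    """An integer n0 with x**n < eps for every n > n0, taken as the next
    integer at or above log(1/eps) / log(1/x), for 0 < x < 1 and 0 < eps < 1."""
    return math.ceil(math.log(1.0 / eps) / math.log(1.0 / x))


# The required n0 grows without bound as x approaches 1.
for x in (0.5, 0.9, 0.99, 0.999):
    print(x, n0(x, 0.01))
```

The printed thresholds increase rapidly as x approaches 1, which is the numerical face of the failure of uniform convergence on intervals reaching up to 1.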
4. Consider the sequence <fn> defined by
$f_n(x) = \frac{nx}{1+n^2x^2}$, 0 ≤ x ≤ a.
If x = 0, then fn(x) = 0
and so $f(x) = \lim_{n\to\infty} f_n(x) = 0.$
If x ≠ 0, then
$f(x) = \lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \frac{nx}{1+n^2x^2} = 0.$
Thus f is continuous at x = 0. For convergence we must have
$|f_n(x) - f(x)| < \varepsilon$, n > n0,
or $\frac{nx}{1+n^2x^2} < \varepsilon$
or $1 + n^2x^2 - \frac{nx}{\varepsilon} > 0$
or $nx > \frac{1}{2\varepsilon} + \frac{1}{2} \sqrt{\frac{1}{\varepsilon^2} - 4}.$
Thus we can find an upper bound for n in any interval 0 < a ≤ x ≤ b, but the upper bound is
infinite if the interval includes 0. Hence the given sequence is non-uniformly convergent in any
interval which includes the origin. So 0 is the point of non-uniform convergence for this
sequence.
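The role of the origin can be seen by estimating the supremum of fn on a grid. This is a rough sketch, assuming a uniform grid is fine enough for illustration; `f` and `sup_on_grid` are hypothetical helper names.

```python
def f(n: int, x: float) -> float:
    """f_n(x) = n x / (1 + n^2 x^2)."""
    return n * x / (1.0 + n * n * x * x)


def sup_on_grid(n: int, lo: float, hi: float, points: int = 10_001) -> float:
    """Largest value of f_n on a uniform grid over [lo, hi]."""
    step = (hi - lo) / (points - 1)
    return max(f(n, lo + i * step) for i in range(points))


for n in (10, 100, 1000):
    # On [0, 1] the supremum stays near 1/2 (attained at x = 1/n);
    # on [0.5, 1], away from the origin, it shrinks with n.
    print(n, sup_on_grid(n, 0.0, 1.0), sup_on_grid(n, 0.5, 1.0))
```

Since the supremum over [0, 1] does not tend to 0, the convergence cannot be uniform there, whereas on [0.5, 1] the supremum does go to 0.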
5. Consider the sequence <fn> defined by
$f_n(x) = \tan^{-1} nx$, 0 ≤ x ≤ a.
Then
$f(x) = \lim_{n\to\infty} f_n(x) = \begin{cases} \pi/2 & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$
Hence to each ∈>0, there exists an integer N such that n>N, x ε E imply
|fn(x) −f(x)| < ∈
Hence fn→f uniformly on E.
Weierstrass contributed a very convenient test for the uniform convergence of infinite series of
functions.
Theorem 3. (Weierstrass M-test). Let <fn> be a sequence of functions defined on E and suppose
|fn(x)| ≤ Mn (xεE, n = 1, 2, 3,…),
where Mn is independent of x. Then Σ fn converges uniformly as well as absolutely on E if Σ Mn
converges.
Proof. Absolute convergence follows immediately from comparison test.
To prove uniform convergence, we note that
$|S_m(x) - S_n(x)| = \left| \sum_{i=1}^{m} f_i(x) - \sum_{i=1}^{n} f_i(x) \right|$
and so $A < \frac{\varepsilon}{M}$ for all n > N (independent of x) and for all x ∈ E. Also, since {vn(x)} is
decreasing, vn(x) ≤ v1(x) < M, since v1(x) is bounded for all x ∈ E.
Hence
$|u_n(x) v_n(x) + u_{n+1}(x) v_{n+1}(x) + \dots + u_m(x) v_m(x)| < \varepsilon$
for n > N and all x ∈ E, and so $\sum_{n=1}^{\infty} u_n(x) v_n(x)$ is uniformly convergent.
Theorem. (Dirichlet’s test). The series $\sum_{n=1}^{\infty} u_n(x) v_n(x)$ converges uniformly on E if
(i) {vn(x)} is a positive decreasing sequence for all values of x∈E, which tends to zero
uniformly on E
(ii) Σ un(x) oscillates or converges in such a way that the moduli of its limits of
oscillation remain less than a fixed number M for all x ∈ E.
Proof. Consider the series $\sum_{n=1}^{\infty} u_n(x) v_n(x)$, where {vn(x)} is a positive decreasing sequence
tending to zero uniformly on E. By Abel’s Lemma
$|u_n(x) v_n(x) + u_{n+1}(x) v_{n+1}(x) + \dots + u_m(x) v_m(x)| \le A\, v_n(x),$
where A is the greatest of the magnitudes
$|u_n(x)|,\ |u_n(x) + u_{n+1}(x)|,\ \dots,\ |u_n(x) + u_{n+1}(x) + \dots + u_m(x)|$
and is a function of x.
Since Σ un(x) converges or oscillates finitely in such a way that $\left| \sum_{n=r}^{s} u_n(x) \right| < M$ for all x ∈ E,
A is less than M. Furthermore, since vn(x)→0 uniformly as n→∞, to each ε > 0 there
exists an integer N such that
$v_n(x) < \frac{\varepsilon}{M}$ for all n > N and all x ∈ E.
Hence
$|u_n(x) v_n(x) + u_{n+1}(x) v_{n+1}(x) + \dots + u_m(x) v_m(x)| < M \cdot \frac{\varepsilon}{M} = \varepsilon$
for all n > N and x ∈ E, and so $\sum_{n=1}^{\infty} u_n(x) v_n(x)$ is uniformly convergent on E.
Examples. 1. Consider the series $\sum_{n=1}^{\infty} \frac{\cos n\theta}{n^p}$. We observe that
$\left| \frac{\cos n\theta}{n^p} \right| \le \frac{1}{n^p}.$
Also, we know that
$\sum_{n=1}^{\infty} \frac{1}{n^p}$
is convergent if p > 1. Hence, by Weierstrass M-Test, the series $\sum \frac{\cos n\theta}{n^p}$ converges absolutely
and uniformly for all real values of θ if p > 1.
Similarly, the series $\sum_{n=1}^{\infty} \frac{\sin n\theta}{n^p}$ converges absolutely and uniformly by Weierstrass’s M-Test.
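The M-test bound in this example can be observed numerically: the difference of two partial sums is dominated, for every θ at once, by the corresponding block of the majorant series Σ 1/n^p. The sketch below assumes p = 2; the names `partial_sum` and `majorant_block` are illustrative only.

```python
import math


def partial_sum(theta: float, p: float, n_terms: int) -> float:
    """Partial sum of cos(n*theta) / n**p for n = 1..n_terms."""
    return sum(math.cos(n * theta) / n ** p for n in range(1, n_terms + 1))


def majorant_block(p: float, start: int, stop: int) -> float:
    """Sum of M_n = 1 / n**p for n = start+1..stop."""
    return sum(1.0 / n ** p for n in range(start + 1, stop + 1))


bound = majorant_block(2.0, 20, 200)
for theta in (0.0, 0.5, 1.0, 3.0):
    gap = abs(partial_sum(theta, 2.0, 200) - partial_sum(theta, 2.0, 20))
    print(theta, gap, bound)  # gap never exceeds the theta-independent bound
```

The bound does not depend on θ, which is exactly what makes the convergence uniform.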
2. Taking $M_n = r^n$, 0 < r < 1, it can be shown by Weierstrass’s M-Test that the series
$\sum r^n \cos n\theta$, $\sum r^n \sin n\theta$, $\sum r^n \cos^2 n\theta$, $\sum r^n \sin^2 n\theta$ converge uniformly and
absolutely.
3. Consider $\sum_{n=1}^{\infty} \frac{x}{n(1+nx^2)}$, x∈R.
We assume that x is +ve, for if x is negative, we can change signs of all the terms. We have
$f_n(x) = \frac{x}{n(1+nx^2)}$
and fn′(x) = 0
implies $nx^2 = 1$. Thus the maximum value of fn(x) is $\frac{1}{2n^{3/2}}$.
Hence $f_n(x) \le \frac{1}{2n^{3/2}}.$
Since $\sum \frac{1}{n^{3/2}}$ is convergent, Weierstrass’s M-Test implies that $\sum_{n=1}^{\infty} \frac{x}{n(1+nx^2)}$ is uniformly
convergent for all x ∈ R.
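The maximum value used as Mn in this example can be sanity-checked with a grid search. This is a rough sketch, assuming the maximum over x ≥ 0 lies in [0, 5] for the values of n tried; `term` and `grid_max` are hypothetical names.

```python
def term(n: int, x: float) -> float:
    """f_n(x) = x / (n (1 + n x^2))."""
    return x / (n * (1.0 + n * x * x))


def grid_max(n: int, upper: float = 5.0, steps: int = 5000) -> float:
    """Largest value of f_n on a uniform grid over [0, upper]."""
    return max(term(n, upper * i / steps) for i in range(steps + 1))


for n in (1, 4, 25):
    # Compare the grid maximum with the claimed value 1 / (2 n^(3/2)).
    print(n, grid_max(n), 1.0 / (2.0 * n ** 1.5))
```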
4. Consider the series $\sum_{n=1}^{\infty} \frac{x}{(n+x^2)^2}$, x ∈ R. We have
$f_n(x) = \frac{x}{(n+x^2)^2}$
and so
$f_n'(x) = \frac{(n+x^2)^2 - 2x(n+x^2)\,2x}{(n+x^2)^4}.$
Thus fn′(x) = 0 gives
$x^4 + n^2 + 2nx^2 - 4nx^2 - 4x^4 = 0$
$-3x^4 - 2nx^2 + n^2 = 0$
or $3x^4 + 2nx^2 - n^2 = 0$
or $x^2 = \frac{n}{3}$ or $x = \sqrt{\frac{n}{3}}.$
Also fn′′(x) is −ve. Hence the maximum value of fn(x) is $\frac{3\sqrt{3}}{16\, n^{3/2}}$. Since $\sum \frac{1}{n^{3/2}}$ is convergent, it follows
by Weierstrass’s M-Test that the given series is uniformly convergent.
5. The series $\sum_{n=1}^{\infty} \frac{a_n x^n}{1+x^{2n}}$ and $\sum_{n=1}^{\infty} \frac{a_n x^{2n}}{1+x^{2n}}$
converge uniformly for all real values of x if Σ an is absolutely convergent. The solutions follow
the same line as for example 4.
6. Consider the series
$\sum_{n=1}^{\infty} \frac{(-1)^n}{n^p} \cdot \frac{x^{2n}}{1+x^{2n}}.$
We note that if p > 1, then $\sum \frac{(-1)^n}{n^p}$ is absolutely convergent and is independent of x. Since
$\frac{x^{2n}}{1+x^{2n}} < 1$, each term is bounded in modulus by $\frac{1}{n^p}$. Hence, by
Weierstrass’s M-Test, the given series is uniformly convergent for all x ∈ R.
If 0 < p ≤ 1, the series $\sum \frac{(-1)^n}{n^p}$ is convergent but not absolutely. Let
$v_n(x) = \frac{x^{2n}}{1+x^{2n}}.$
Then <vn(x)> is a monotonically decreasing sequence for |x| < 1 because
$v_n(x) - v_{n+1}(x) = \frac{x^{2n}}{1+x^{2n}} - \frac{x^{2n+2}}{1+x^{2n+2}} = \frac{x^{2n}(1-x^2)}{(1+x^{2n})(1+x^{2n+2})}$ (+ve).
Also $v_1(x) = \frac{x^2}{1+x^2} < 1.$
Hence, by Abel’s Test, the series $\sum_{n=1}^{\infty} \frac{(-1)^n}{n^p} \cdot \frac{x^{2n}}{1+x^{2n}}$ is uniformly convergent for 0 < p ≤ 1 and |x|
< 1.
6. Consider the series
$\sum a_n \cdot \frac{x^n}{1+x^{2n}},$
under the condition that Σ an is convergent. Let
$v_n(x) = \frac{x^n}{1+x^{2n}}.$
Then
$\frac{v_n(x)}{v_{n+1}(x)} = \frac{1+x^{2n+2}}{x(1+x^{2n})}$
and so
$\frac{v_n(x)}{v_{n+1}(x)} - 1 = \frac{(1-x)(1-x^{2n+1})}{x(1+x^{2n})},$
which is positive if 0 < x < 1. Hence
$v_n > v_{n+1}$
and so <vn(x)> is monotonically decreasing and positive. Also $v_1(x) = \frac{x}{1+x^2}$ is bounded. Hence,
by Abel’s test, the series $\sum a_n \cdot \frac{x^n}{1+x^{2n}}$ is uniformly convergent in (0, 1) if Σ an is convergent.
7. Consider the series $\sum a_n \frac{n x^{n-1}(1-x)}{1-x^n}$ under the condition that Σ an is convergent. We have
$v_n(x) = \frac{n x^{n-1}(1-x)}{1-x^n}.$
Then
$\frac{v_n(x)}{v_{n+1}(x)} = \frac{n}{(n+1)x} \cdot \frac{1-x^{n+1}}{1-x^n}.$
Since $\frac{n}{n+1} \to 1$ as n→∞ and $\frac{1}{x} > 1$ for 0 < x < 1, taking n sufficiently large,
$\frac{v_n(x)}{v_{n+1}(x)} > \frac{1-x^{n+1}}{1-x^n} > 1$ if 0 < x < 1.
Hence <vn(x)> is monotonically decreasing and positive. Hence, by Abel’s Test, the given
series converges uniformly in (0, 1).
1.3. Uniform Convergence and Continuity.
We know that if f and g are continuous functions, then f + g is also continuous, and this result
holds for the sum of a finite number of functions. The question arises: “Is the sum of an infinite
number of continuous functions a continuous function?” The answer is: not necessarily. The aim
of this section is to obtain a sufficient condition for the sum function of an infinite series of
continuous functions to be continuous.
Theorem. 6. Let <fn> be a sequence of continuous functions on a set E ⊆ R and suppose that
<fn> converges uniformly on E to a function f : E→R. Then the limit function f is continuous.
Proof. Let c ∈ E be an arbitrary point. If c is an isolated point of E, then f is automatically
continuous at c. So suppose that c is an accumulation point of E. We shall show that f is
continuous at c. Since fn→f uniformly, for every ε > 0 there is an integer M such that n ≥ M
implies
$|f_n(x) - f(x)| < \frac{\varepsilon}{3}$ for all x∈E.
Since fM is continuous at c, there is a neighbourhood Sδ(c) such that x ∈ Sδ(c) ∩ E (since c is a
limit point) implies
$|f_M(x) - f_M(c)| < \varepsilon/3.$
By the triangle inequality, we have
$|f(x) - f(c)| = |f(x) - f_M(x) + f_M(x) - f_M(c) + f_M(c) - f(c)|$
$\le |f(x) - f_M(x)| + |f_M(x) - f_M(c)| + |f_M(c) - f(c)|$
$< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$
Hence
$|f(x) - f(c)| < \varepsilon$, x ∈ Sδ(c) ∩ E,
which proves the continuity of f at the arbitrary point c ∈ E.
Remark. Uniform convergence of <fn> in the above theorem is sufficient but not necessary to
transmit continuity from the individual terms to the limit function. For example, let fn : [0, 1]
→R be defined for n ≥ 2 by
$f_n(x) = \begin{cases} n^2 x & \text{for } 0 \le x \le \frac{1}{n} \\ -n^2 \left( x - \frac{2}{n} \right) & \text{for } \frac{1}{n} \le x \le \frac{2}{n} \\ 0 & \text{for } \frac{2}{n} \le x \le 1. \end{cases}$
Each of the functions fn is continuous on [0, 1]. Also fn(x) → 0 as n→∞ for all x ∈ [0, 1]. Hence
the limit function f vanishes identically and is continuous. But the convergence fn→f is non-uniform.
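The behaviour of this piecewise linear sequence can be explored with a few lines of code. This is an illustrative sketch only; the function name `tent` is an assumption made here, not notation from the text.

```python
def tent(n: int, x: float) -> float:
    """The piecewise linear function f_n of the remark, for n >= 2 and 0 <= x <= 1."""
    if x <= 1.0 / n:
        return n * n * x
    if x <= 2.0 / n:
        return -n * n * (x - 2.0 / n)
    return 0.0


# At a fixed point the values are eventually 0 ...
print([tent(n, 0.3) for n in (2, 5, 7, 50)])
# ... but the peak value f_n(1/n) = n grows, so sup |f_n - 0| does not tend to 0.
print([tent(n, 1.0 / n) for n in (2, 5, 7, 50)])
```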
The series version of Theorem 6 is the following:
Theorem. 7. If the series Σ fn(x) of continuous functions is uniformly convergent to a function f
on [a, b] , then the sum function f is also continuous on [a, b].
Proof. Let $S_n(x) = \sum_{i=1}^{n} f_i(x)$, n∈N, and let ε > 0. Since Σ fn converges uniformly to f on [a, b],
there exists a positive integer N such that
(1.3.1) $|S_n(x) - f(x)| < \frac{\varepsilon}{3}$ for all n ≥ N and x ∈ [a, b].
Let c be any point of [a, b]; then (1.3.1) implies
(1.3.2) $|S_n(c) - f(c)| < \frac{\varepsilon}{3}$ for all n ≥ N.
Since fn is continuous on [a, b] for each n, the partial sum
$S_n(x) = f_1(x) + f_2(x) + \dots + f_n(x)$
is also continuous on [a, b] for all n. Hence to each ε > 0 there exists a δ > 0 such that
(1.3.3) $|S_n(x) - S_n(c)| < \frac{\varepsilon}{3}$ whenever |x−c| < δ.
Now, by the triangle inequality, and using (1.3.1), (1.3.2) and (1.3.3), we have
$|f(x) - f(c)| = |f(x) - S_n(x) + S_n(x) - S_n(c) + S_n(c) - f(c)|$
$\le |f(x) - S_n(x)| + |S_n(x) - S_n(c)| + |S_n(c) - f(c)|$
(1.4.1) $\eta\,[\alpha(b) - \alpha(a)] \le \frac{\varepsilon}{3}.$
This is possible since α is monotonically increasing. Since fn→f uniformly on [a, b], to each η>0
there exists an integer n such that
(1.4.2) $|f_n(x) - f(x)| \le \eta$, x ∈ [a, b].
Since fn ∈ R(α), we choose a partition P of [a, b] such that
(1.4.3) $U(P, f_n, \alpha) - L(P, f_n, \alpha) < \frac{\varepsilon}{3}.$
The expression (1.4.2) implies
$f_n(x) - \eta \le f(x) \le f_n(x) + \eta.$
Now f(x) ≤ fn(x) + η implies, by (1.4.1), that
(1.4.4) $U(P, f, \alpha) \le U(P, f_n, \alpha) + \frac{\varepsilon}{3}.$
Similarly, f(x) ≥ fn(x) − η implies
(1.4.5) $L(P, f, \alpha) \ge L(P, f_n, \alpha) - \frac{\varepsilon}{3}.$
Combining (1.4.3), (1.4.4) and (1.4.5), we get
$U(P, f, \alpha) - L(P, f, \alpha) < \varepsilon.$
Hence f ∈ R(α) on [a, b].
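Further, uniform convergence implies that to each ε > 0 there exists an integer N such that for n ≥ N
$|f_n(x) - f(x)| < \frac{\varepsilon}{\alpha(b) - \alpha(a)}$, x ∈ [a, b].
Then for n > N,
$\left| \int_a^b f\,d\alpha - \int_a^b f_n\,d\alpha \right| = \left| \int_a^b (f - f_n)\,d\alpha \right| \le \int_a^b |f - f_n|\,d\alpha$
$\le \frac{\varepsilon}{\alpha(b) - \alpha(a)} \int_a^b d\alpha(x)$
$= \frac{\varepsilon\,[\alpha(b) - \alpha(a)]}{\alpha(b) - \alpha(a)}$
$= \varepsilon.$
Hence
$\int_a^b f\,d\alpha = \lim_{n\to\infty} \int_a^b f_n\,d\alpha$
and the result follows.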
The series version of Theorem 9 is:
Theorem. 10. Let fn ∈ R(α), n = 1, 2, …. If Σ fn converges uniformly to f on [a, b], then f ∈ R(α) and
$\int_a^b f(x)\,d\alpha = \sum_{n=1}^{\infty} \int_a^b f_n(x)\,d\alpha.$
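$\int_a^b f(x)\,d\alpha = \lim_{n\to\infty} \int_a^b S_n(x)\,d\alpha.$
But
$\int_a^b S_n(x)\,d\alpha = \int_a^b f_1(x)\,d\alpha + \int_a^b f_2(x)\,d\alpha + \dots + \int_a^b f_n(x)\,d\alpha = \sum_{i=1}^{n} \int_a^b f_i(x)\,d\alpha.$
Hence
$\int_a^b f(x)\,d\alpha = \lim_{n\to\infty} \sum_{i=1}^{n} \int_a^b f_i(x)\,d\alpha = \sum_{i=1}^{\infty} \int_a^b f_i(x)\,d\alpha,$
and the proof of the theorem is complete.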
Example. 1. Consider the sequence <fn> for which $f_n(x) = nx\,e^{-nx^2}$, n ∈ N, x ∈ [0, 1]. We note
that
$f(x) = \lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \frac{nx}{1 + \frac{nx^2}{1!} + \frac{n^2x^4}{2!} + \dots} = 0$, x ∈ (0, 1],
and f(0) = 0. Then
$\int_0^1 f(x)\,dx = 0$
and
$\int_0^1 f_n(x)\,dx = \int_0^1 nx\,e^{-nx^2}\,dx = \frac{1}{2} \int_0^n e^{-t}\,dt$ (t = nx²) $= \frac{1}{2}\,[1 - e^{-n}].$
Therefore
$\lim_{n\to\infty} \int_0^1 f_n(x)\,dx = \lim_{n\to\infty} \frac{1}{2}\,[1 - e^{-n}] = \frac{1}{2}.$
If <fn> were uniformly convergent, then $\int_0^1 f(x)\,dx$ should have been equal to $\lim_{n\to\infty} \int_0^1 f_n(x)\,dx$.
But it is not the case. Hence the given sequence is not uniformly convergent to f. In fact, x = 0 is
the point of non-uniform convergence.
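2. Consider the series $\sum_{n=1}^{\infty} \frac{x}{(n+x^2)^2}$. This series is uniformly convergent and so is integrable
term by term. Thus
$\int_0^1 \left( \sum_{n=1}^{\infty} \frac{x}{(n+x^2)^2} \right) dx = \lim_{m\to\infty} \sum_{n=1}^{m} \int_0^1 \frac{x}{(n+x^2)^2}\,dx$
$= \lim_{m\to\infty} \sum_{n=1}^{m} \int_0^1 x\,(n+x^2)^{-2}\,dx$
$= \lim_{m\to\infty} \sum_{n=1}^{m} \left[ \frac{(n+x^2)^{-1}}{-2} \right]_0^1$
$= \lim_{m\to\infty} \sum_{n=1}^{m} \frac{1}{2} \left( \frac{1}{n} - \frac{1}{n+1} \right)$
$= \lim_{m\to\infty} \frac{1}{2} \left[ 1 - \frac{1}{2} + \frac{1}{2} - \frac{1}{3} + \dots + \frac{1}{m} - \frac{1}{m+1} \right]$
$= \lim_{m\to\infty} \frac{1}{2} \left( 1 - \frac{1}{m+1} \right) = \frac{1}{2}.$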
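The telescoping step in this computation can be verified with exact rational arithmetic. The sketch below is illustrative; the function names are hypothetical.

```python
from fractions import Fraction


def integrated_partial_sum(m: int) -> Fraction:
    """Sum over n = 1..m of the integrated terms (1/2)(1/n - 1/(n+1))."""
    return sum(
        (Fraction(1, 2) * (Fraction(1, n) - Fraction(1, n + 1)) for n in range(1, m + 1)),
        Fraction(0),
    )


def closed_form(m: int) -> Fraction:
    """Telescoped value (1/2)(1 - 1/(m+1))."""
    return Fraction(1, 2) * (1 - Fraction(1, m + 1))


for m in (1, 10, 1000):
    print(m, integrated_partial_sum(m), float(closed_form(m)))
```

The two expressions agree exactly for every m tried, and the values approach 1/2.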
3. Consider the series $\sum_{n=1}^{\infty} \left[ \frac{nx}{1+n^2x^2} - \frac{(n-1)x}{1+(n-1)^2x^2} \right]$, 0 ≤ x ≤ 1.
Let Sn(x) denote the partial sum of the series. Then
$S_n(x) = \frac{nx}{1+n^2x^2}$
and so $f(x) = \lim_{n\to\infty} S_n(x) = 0$ for all x ∈ [0, 1].
As we know that 0 is a point of non-uniform convergence of the sequence <Sn(x)>, the given
series is not uniformly convergent on [0, 1]. But
$\int_0^1 f(x)\,dx = \int_0^1 0\,dx = 0$
and
$\int_0^1 S_n(x)\,dx = \int_0^1 \frac{nx}{1+n^2x^2}\,dx = \frac{1}{2n} \int_0^1 \frac{2n^2x}{1+n^2x^2}\,dx = \frac{1}{2n} \left[ \log(1+n^2x^2) \right]_0^1 = \frac{1}{2n} \log(1+n^2).$
Hence
$\lim_{n\to\infty} \int_0^1 S_n(x)\,dx = \lim_{n\to\infty} \frac{1}{2n} \log(1+n^2)$ (form ∞/∞)
$= \lim_{n\to\infty} \frac{n}{1+n^2}$ (form ∞/∞)
$= \lim_{n\to\infty} \frac{1}{2n} = 0.$
Thus
$\int_0^1 f(x)\,dx = \lim_{n\to\infty} \int_0^1 S_n(x)\,dx,$
and so the series is integrable term by term although 0 is a point of non-uniform convergence.
Theorem. 11. Let {gn} be a sequence of functions of bounded variation on [a, b] such that gn(a) =
0, and suppose that there is a function g such that
$\lim_{n\to\infty} V(g - g_n) = 0$
and g(a) = 0. Then for every continuous function f on [a, b], we have
$\lim_{n\to\infty} \int_a^b f\,dg_n = \int_a^b f\,dg$
and gn→g uniformly on [a, b].
Proof. If V denotes the total variation on [a, b], then
$V(g) \le V(g_n) + V(g - g_n).$
Since gn is of bounded variation and $\lim_{n\to\infty} V(g - g_n) = 0$, it follows that the total variation of g is finite
and so g is of bounded variation on [a, b]. Thus the integrals in the assertion of the theorem exist.
Suppose |f(x)| ≤ M on [a, b]. Then
$\left| \int_a^b f\,dg - \int_a^b f\,dg_n \right| = \left| \int_a^b f\,d(g - g_n) \right| \le M\,V(g - g_n).$
Since V(g−gn)→0 as n→∞, it follows that
$\int_a^b f\,dg = \lim_{n\to\infty} \int_a^b f\,dg_n.$
Furthermore,
$|g(x) - g_n(x)| \le V(g - g_n)$, a ≤ x ≤ b.
Therefore, as n→∞, we have
gn → g uniformly.
1.5. Uniform Convergence and Differentiation
If f and g are derivable, then
$\frac{d}{dx}\,[f(x) + g(x)] = \frac{d}{dx} f(x) + \frac{d}{dx} g(x)$
and this can be extended to a finite number of derivable functions. In this section, we shall
extend this phenomenon, under some suitable conditions, to an infinite number of functions.
Theorem. 12. Suppose {fn} is a sequence of functions, differentiable on [a, b] and such that
{fn(x0)} converges for some point x0 on [a, b]. If {fn′} converges uniformly on [a, b], then {fn}
converges uniformly on [a, b] to a function f, and
$f'(x) = \lim_{n\to\infty} f_n'(x)$ (a ≤ x ≤ b).
$\int_a^t g(x)\,dx = \lim_{n\to\infty} [f_n(t) - f_n(a)].$
Since <fn> converges to f on [a, b], we have
$\lim_{n\to\infty} f_n(t) = f(t)$ and $\lim_{n\to\infty} f_n(a) = f(a).$
Hence
$\int_a^t g(x)\,dx = f(t) - f(a)$
and so
$\frac{d}{dt} \left( \int_a^t g(x)\,dx \right) = f'(t)$
or g(t) = f′(t), t ∈ [a, b].
This completes the proof of the theorem.
The series version of theorem 13 is
Theorem. 14. If a series Σ fn converges to f on [a, b] and
(i) each fn is differentiable on [a, b]
(ii) each fn′ is continuous on [a, b]
(iii) the series Σ fn′ converges uniformly to g on [a, b]
then f is differentiable on [a, b] and f ′(x) = g(x) for all x ε [a, b].
Proof. Let <Sn> be the sequence of partial sums of the series $\sum_{n=1}^{\infty} f_n$. Since Σfn converges to f on
[a, b], the sequence <Sn> converges to f on [a, b]. Further, since Σfn′ converges uniformly to g on
[a, b], the sequence <Sn′> of partial sums converges uniformly to g on [a, b]. Hence, Theorem 13
is applicable and we have
f ′(x) = g(x) for all x ∈ [a, b].
Examples. 1. Consider the series $\sum_{n=1}^{\infty} \left[ \frac{nx}{1+n^2x^2} - \frac{(n-1)x}{1+(n-1)^2x^2} \right].$
For this series, we have
$S_n(x) = \frac{nx}{1+n^2x^2}$, 0 ≤ x ≤ 1.
We have seen that 0 is a point of non-uniform convergence for this sequence. We have
$f(x) = \lim_{n\to\infty} S_n(x) = \lim_{n\to\infty} \frac{nx}{1+n^2x^2} = 0$ for 0 ≤ x ≤ 1.
Therefore
f ′(0) = 0, while
$S_n'(0) = \lim_{h\to 0} \frac{S_n(0+h) - S_n(0)}{h} = \lim_{h\to 0} \frac{n}{1+n^2h^2} = n.$
Hence
$\lim_{n\to\infty} S_n'(0) = \infty.$
Then
$f'(0) \ne \lim_{n\to\infty} S_n'(0).$
2. Consider the series $\sum_{n=1}^{\infty} \frac{\sin nx}{n^3}$, x ∈ R. We have
$f_n(x) = \frac{\sin nx}{n^3}$, $f_n'(x) = \frac{\cos nx}{n^2}.$
Thus
$\sum f_n'(x) = \sum \frac{\cos nx}{n^2}.$
Since $\left| \frac{\cos nx}{n^2} \right| \le \frac{1}{n^2}$ and $\sum \frac{1}{n^2}$ is convergent, therefore, by Weierstrass’s M-test the series Σfn′(x)
is uniformly as well as absolutely convergent for all x ∈ R and so Σfn can be differentiated term
by term.
Hence $\left( \sum_{n=1}^{\infty} f_n \right)' = \sum_{n=1}^{\infty} f_n'$
or $\left( \sum_{n=1}^{\infty} \frac{\sin nx}{n^3} \right)' = \sum_{n=1}^{\infty} \frac{\cos nx}{n^2}.$
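Term-by-term differentiation of this series can be compared against a finite-difference derivative of a long partial sum. The following is a numerical sketch only, assuming 500 terms and a step h = 1e-4 are adequate for illustration; all function names are hypothetical.

```python
import math


def series(x: float, terms: int = 500) -> float:
    """Partial sum of sin(n x) / n^3."""
    return sum(math.sin(n * x) / n ** 3 for n in range(1, terms + 1))


def derivative_series(x: float, terms: int = 500) -> float:
    """Partial sum of the differentiated series cos(n x) / n^2."""
    return sum(math.cos(n * x) / n ** 2 for n in range(1, terms + 1))


def central_difference(x: float, h: float = 1e-4, terms: int = 500) -> float:
    """Symmetric difference quotient of the partial sum of the original series."""
    return (series(x + h, terms) - series(x - h, terms)) / (2.0 * h)


for x in (0.0, 1.0, 2.5):
    print(x, central_difference(x), derivative_series(x))
```

The two columns should agree to several decimal places, in line with the conclusion that the derivative of the sum equals the sum of the derivatives.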
1.6. Weierstrass’s Approximation Theorem.
Weierstrass proved an important result regarding approximation of continuous functions which
has many applications in Numerical Methods and other branches of mathematics.
The following computation shall be required for the proof of Weierstrass’s Approximation
Theorem.
For any p, q ∈ R, we have, by the Binomial Theorem,
(1.6.1) $\sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} = (p+q)^n$, n ∈ I,
where
$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$
Differentiating with respect to p, we obtain
$\sum_{k=0}^{n} \binom{n}{k} k\,p^{k-1} q^{n-k} = n\,(p+q)^{n-1},$
which implies
(1.6.2) $\sum_{k=0}^{n} \frac{k}{n} \binom{n}{k} p^k q^{n-k} = p\,(p+q)^{n-1}$, n ∈ I.
Differentiating once more with respect to p, we obtain
$\sum_{k=0}^{n} \frac{k^2}{n} \binom{n}{k} p^{k-1} q^{n-k} = p\,(n-1)(p+q)^{n-2} + (p+q)^{n-1}$
and so
(1.6.3) $\sum_{k=0}^{n} \frac{k^2}{n^2} \binom{n}{k} p^k q^{n-k} = p^2 \left( 1 - \frac{1}{n} \right) (p+q)^{n-2} + \frac{p}{n}\,(p+q)^{n-1}.$
Now if x∈[0, 1], take p = x and q = 1 − x. Then (1.6.1), (1.6.2) and (1.6.3) yield
(1.6.4)
$\sum_{k=0}^{n} \binom{n}{k} x^k (1-x)^{n-k} = 1,$
$\sum_{k=0}^{n} \frac{k}{n} \binom{n}{k} x^k (1-x)^{n-k} = x,$
$\sum_{k=0}^{n} \frac{k^2}{n^2} \binom{n}{k} x^k (1-x)^{n-k} = x^2 \left( 1 - \frac{1}{n} \right) + \frac{x}{n}.$
On expanding $\left( \frac{k}{n} - x \right)^2$, it follows from (1.6.4) that
(1.6.5) $\sum_{k=0}^{n} \left( \frac{k}{n} - x \right)^2 \binom{n}{k} x^k (1-x)^{n-k} = \frac{x(1-x)}{n}$ (0 ≤ x ≤ 1).
(1.6.6) $B_n(x) = \sum_{k=0}^{n} \binom{n}{k} x^k (1-x)^{n-k} f\!\left( \frac{k}{n} \right)$, 0 ≤ x ≤ 1, n ∈ I.
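Since f is continuous over the compact set [0, 1], it is uniformly continuous on [0, 1]. Hence given ε
> 0 there exists δ > 0 such that
$|f(x) - f(y)| < \frac{\varepsilon}{2}$, (|x−y| < δ; x, y ∈ [0, 1]).
Suppose N ∈ I is such that
(1.6.9) $\frac{1}{\sqrt[4]{N}} < \delta$
and such that
(1.6.10) $\frac{1}{\sqrt{N}} < \frac{\varepsilon}{4\|f\|}$ (‖f‖ > 0),
where ‖f‖ denotes the maximum of |f| on [0, 1].
Fix x ∈ [0, 1]. Multiplying the first identity in (1.6.4) by f(x) and subtracting (1.6.6), we obtain for
any n ∈ I,
(1.6.11) $f(x) - B_n(x) = \sum_{k=0}^{n} \left[ f(x) - f\!\left( \frac{k}{n} \right) \right] \binom{n}{k} x^k (1-x)^{n-k} = \Sigma_1 + \Sigma_2$, say,
where Σ1 is the sum over those values of k such that
(1.6.12) $\left| \frac{k}{n} - x \right| < \frac{1}{\sqrt[4]{n}},$
while Σ2 is the sum over the other values of k. If k does not satisfy (1.6.12), that is, if $\left| \frac{k}{n} - x \right| \ge \frac{1}{\sqrt[4]{n}}$,
then
$(k - nx)^2 = n^2 \left( \frac{k}{n} - x \right)^2 \ge \sqrt{n^3}.$
Hence
$|\Sigma_2| = \left| \Sigma_2 \left[ f(x) - f\!\left( \frac{k}{n} \right) \right] \binom{n}{k} x^k (1-x)^{n-k} \right|$
$\le \Sigma_2 \left[ |f(x)| + \left| f\!\left( \frac{k}{n} \right) \right| \right] \binom{n}{k} x^k (1-x)^{n-k}$
$\le 2\|f\|\ \Sigma_2 \binom{n}{k} x^k (1-x)^{n-k}$
$\le \frac{2\|f\|}{\sqrt{n^3}}\ \Sigma_2\, (k-nx)^2 \binom{n}{k} x^k (1-x)^{n-k}$
$\le \frac{2\|f\|}{\sqrt{n^3}} \sum_{k=0}^{n} (k-nx)^2 \binom{n}{k} x^k (1-x)^{n-k}.$
Here, by (1.6.5), the last sum equals n x(1−x), so that
$|\Sigma_2| \le \frac{2\|f\|}{\sqrt{n^3}}\, n\,x(1-x) \le \frac{2\|f\|}{\sqrt{n}}.$
If n ≥ N, it follows from (1.6.10) that $\frac{1}{\sqrt{n}} < \frac{\varepsilon}{4\|f\|}$ and so
$|\Sigma_2| < \varepsilon/2.$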
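Moreover, if n ≥ N and k satisfies (1.6.12), then by (1.6.9) and (1.6.12), $\left| \frac{k}{n} - x \right| < \delta$ and so
$\left| f(x) - f\!\left( \frac{k}{n} \right) \right| < \varepsilon/2.$
Hence
$|\Sigma_1| \le \Sigma_1 \left| f(x) - f\!\left( \frac{k}{n} \right) \right| \binom{n}{k} x^k (1-x)^{n-k} \le \frac{\varepsilon}{2}\ \Sigma_1 \binom{n}{k} x^k (1-x)^{n-k}$
and so by the first identity of (1.6.4), we have
$|\Sigma_1| \le \frac{\varepsilon}{2}.$
Thus, (1.6.11) yields
$|f(x) - B_n(x)| \le |\Sigma_1| + |\Sigma_2| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$
Since x was an arbitrary point in [0, 1] and n any integer with n ≥ N, this shows that
$|f(x) - B_n(x)| < \varepsilon$, 0 ≤ x ≤ 1, n ≥ N.
This completes the proof of the theorem.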
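The polynomials Bn of (1.6.6) are easy to evaluate, so the approximation can be observed numerically. The sketch below assumes the test function f(x) = |x − 1/2|, chosen here only for illustration, and also checks identity (1.6.5); the helper names are hypothetical.

```python
from math import comb


def bernstein(f, n: int, x: float) -> float:
    """B_n(x) = sum over k of C(n, k) x^k (1-x)^(n-k) f(k/n), as in (1.6.6)."""
    return sum(comb(n, k) * x ** k * (1.0 - x) ** (n - k) * f(k / n) for k in range(n + 1))


def max_error(f, n: int, grid: int = 201) -> float:
    """Largest |f(x) - B_n(x)| over a uniform grid on [0, 1]."""
    return max(abs(f(i / (grid - 1)) - bernstein(f, n, i / (grid - 1))) for i in range(grid))


def second_moment(n: int, x: float) -> float:
    """Left-hand side of (1.6.5)."""
    return sum((k / n - x) ** 2 * comb(n, k) * x ** k * (1.0 - x) ** (n - k) for k in range(n + 1))


def sample(x: float) -> float:
    """Illustrative continuous test function |x - 1/2| (an assumption of this sketch)."""
    return abs(x - 0.5)


for n in (10, 100, 400):
    print(n, max_error(sample, n))
print(second_moment(20, 0.3), 0.3 * 0.7 / 20)
```

The maximum error decreases as n grows, though slowly for a function with a corner, which is consistent with the 1/√n type estimate used for Σ2 in the proof.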
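Example. If f is continuous on [0, 1] and if
$\int_0^1 x^n f(x)\,dx = 0$ for n = 0, 1, 2, …,
use Weierstrass’s Approximation Theorem to prove that f(x) = 0 on [0, 1].
Solution. The given hypothesis is that the integral of the product of f with any polynomial is
zero. We shall show that $\int_0^1 f^2 = 0$. Given ε > 0, by Weierstrass’s Approximation Theorem there is a
polynomial P with |f(x) − P(x)| < ε on [0, 1]. Since $\int_0^1 P(x) f(x)\,dx = 0$, we have
$\int_0^1 f^2 = \int_0^1 f\,(f - P) + \int_0^1 P(x) f(x)\,dx \le \varepsilon \int_0^1 |f|.$
As ε is arbitrary, $\int_0^1 f^2 = 0$, and since f is continuous,
f = 0.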
1.7. Power Series
In this section we shall consider power series with real coefficients and study their properties.
Definition. A series of the form $\sum_{n=0}^{\infty} a_n x^n$ is called a power series.
Applying Cauchy’s root test, we observe that the power series $\sum_{n=0}^{\infty} a_n x^n$ is convergent if
$|x| < \frac{1}{l},$
where
$l = \limsup_{n\to\infty} |a_n|^{1/n}.$
The series is divergent if $|x| > \frac{1}{l}$.
Taking
$r = \frac{1}{\limsup_{n\to\infty} |a_n|^{1/n}},$
we see that the power series is absolutely convergent if |x| < r and divergent if |x| > r. If a0, a1,…
are all real and if x is real, we get an interval −r < x < r inside which the series is convergent.
$f(x) = \sum_{n=0}^{\infty} a_n x^n.$
Then by the above theorem
$f'(x) = \sum_{n=1}^{\infty} n\,a_n x^{n-1}.$
Now applying the theorem 16 to f ′(x), we have
$f''(x) = \sum_{n=2}^{\infty} n(n-1)\,a_n x^{n-2}$
…………………………
$f^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1)(n-2)\dots(n-k+1)\,a_n x^{n-k}.$
Clearly $f^{(k)}(0) = k!\,a_k$; the other terms vanish at x = 0.
Remark. If the coefficients of a power series are known, the values of the derivatives of f at the
centre of the interval of convergence can be found from the relation
$f^{(k)}(0) = k!\,a_k.$
Also we can find the coefficients from the values at the origin of f, f ′, f ′′,…
Theorem. 18. (Uniqueness Theorem). If $\sum a_n x^n$ and $\sum b_n x^n$ converge on some interval (−r, r),
r > 0, to the same function f, then
an = bn for all n∈N.
Proof. Under the given condition, the function f has derivatives of all orders in (−r, r) given by
$f^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1)(n-2)\dots(n-k+1)\,a_n x^{n-k}.$
Putting x = 0, this yields
$f^{(k)}(0) = k!\,a_k$ and $f^{(k)}(0) = k!\,b_k$
for all k ∈N. Hence
ak = bk for all k∈N.
This completes the proof of the theorem.
Theorem. 19. (Abel). Let (−r, r) be the interval of convergence of the power series
$f(x) = \sum_{n=0}^{\infty} a_n x^n.$
If the series is convergent when x = r, then
$\lim_{x\to r-0} f(x) = f(r).$
A similar result holds if the series is convergent when x = −r.
Putting x = ry, we obtain the power series
$\sum_{n=0}^{\infty} a_n r^n y^n = \sum_{n=0}^{\infty} b_n y^n$, say,
whose interval of convergence is (−1, 1). It is therefore sufficient to prove the theorem for r = 1.
Hence we shall prove the following.
Theorem. 19. (Abel). Let (−1, 1) be the interval of convergence for the power series $\sum a_n x^n$. If
$\sum_{n=0}^{\infty} a_n = S$, then $\lim_{x\to 1-0} \sum_{n=0}^{\infty} a_n x^n = S.$
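Proof. Let $S_n = a_0 + a_1 + \dots + a_n$ and $S_{-1} = 0$. Then
$\sum_{n=0}^{m} a_n x^n = \sum_{n=0}^{m} (S_n - S_{n-1})\,x^n$
$= \sum_{n=0}^{m} S_n x^n - \sum_{n=0}^{m} S_{n-1} x^n$
$= \sum_{n=0}^{m-1} S_n x^n + S_m x^m - \sum_{n=0}^{m} S_{n-1} x^n$
$= \sum_{n=0}^{m-1} S_n x^n - x \sum_{n=0}^{m} S_{n-1} x^{n-1} + S_m x^m$
$= (1-x) \sum_{n=0}^{m-1} S_n x^n + S_m x^m.$
For |x| < 1, let m→∞ and obtain
(1.7.1) $f(x) = (1-x) \sum_{n=0}^{\infty} S_n x^n.$
Since Σ an = S, Sn→S as n→∞. So to each ε > 0, there exists an integer N such that n > N implies
$|S - S_n| < \varepsilon/2.$
Also we know that
$(1-x) \sum_{n=0}^{\infty} x^n = 1$ (|x| < 1)
or
(1.7.2) $S = (1-x) \sum_{n=0}^{\infty} S\,x^n$ (|x| < 1).
Then (1.7.1) and (1.7.2) yield, for 0 < x < 1,
$|f(x) - S| = \left| (1-x) \sum_{n=0}^{\infty} (S_n - S)\,x^n \right|$
$\le (1-x) \sum_{n=0}^{N} |S_n - S|\,|x|^n + (1-x) \sum_{n=N+1}^{\infty} |S_n - S|\,|x|^n$
$\le (1-x) \sum_{n=0}^{N} |S_n - S|\,|x|^n + \frac{\varepsilon}{2}.$
But for a fixed N, $(1-x) \sum_{n=0}^{N} |S_n - S|\,|x|^n$ is a non-negative continuous function of x having zero value
at x = 1. Therefore there exists δ > 0 such that for 1 − δ < x < 1, $(1-x) \sum_{n=0}^{N} |S_n - S|\,|x|^n$ is less
than $\frac{\varepsilon}{2}$. Hence
$|f(x) - S| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$, 1−δ < x < 1,
and so
$\lim_{x\to 1-} f(x) = S = \sum_{n=0}^{\infty} a_n.$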
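Abel’s theorem can be illustrated with a series that converges at the endpoint. The sketch below assumes the coefficients an = (−1)^n/(n+1), for which Σ an = log 2 (the alternating harmonic series); this choice and the function name `power_series` are assumptions made for illustration and do not come from the text.

```python
import math


def power_series(x: float, terms: int = 20_000) -> float:
    """Partial sum of sum_{n>=0} (-1)^n x^n / (n + 1), built up iteratively."""
    total, power, sign = 0.0, 1.0, 1.0
    for n in range(terms):
        total += sign * power / (n + 1)
        power *= x
        sign = -sign
    return total


target = math.log(2.0)  # value of the series at x = 1
for x in (0.5, 0.9, 0.99, 0.999):
    print(x, power_series(x), abs(power_series(x) - target))
```

As x increases toward 1 from the left, the values of f(x) move steadily closer to the sum of the series at x = 1.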
Tauber’s Theorem. The converse of Abel’s theorem proved above is false in general. If f is
given by
$f(x) = \sum_{n=0}^{\infty} a_n x^n$, −r < x < r,
the limit f(r−) may exist but yet the series $\sum a_n r^n$ may fail to converge. For example, if
$a_n = (-1)^n$, then
$f(x) = \frac{1}{1+x}$, −1 < x < 1,
and $f(x) \to \frac{1}{2}$ as x→1−. However $\sum (-1)^n$ is not convergent. Tauber showed that the converse of
Abel’s theorem can be obtained by imposing an additional condition on the coefficients an. A large
number of such results are known nowadays as Tauberian Theorems. We present here only
Tauber’s first theorem.
Theorem. 20. (Tauber). Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ for −1 < x < 1 and suppose that $\lim_{n\to\infty} n\,a_n = 0$. If f(x)
→ S as x→1−, then $\sum_{n=0}^{\infty} a_n$ converges and has the sum S.
Proof. Let $n\sigma_n = \sum_{k=0}^{n} k\,|a_k|$. Then σn→ 0 as n→∞. Also, $\lim_{n\to\infty} f(x_n) = S$ if $x_n = 1 - \frac{1}{n}$. Therefore
to each ε > 0, we can choose an integer N such that n ≥ N implies
$|f(x_n) - S| < \frac{\varepsilon}{3}$, $\sigma_n < \frac{\varepsilon}{3}$, $n\,|a_n| < \frac{\varepsilon}{3}.$
Let $S_n = \sum_{k=0}^{n} a_k$. Then for −1 < x < 1, we have
$S_n - S = f(x) - S + \sum_{k=0}^{n} a_k (1 - x^k) - \sum_{k=n+1}^{\infty} a_k x^k.$
Let x ∈ (0, 1). Then
$(1 - x^k) = (1-x)(1 + x + \dots + x^{k-1}) \le k\,(1-x)$
for each k. Therefore, if n ≥ N and 0 < x < 1, we have
$|S_n - S| \le |f(x) - S| + (1-x) \sum_{k=0}^{n} k\,|a_k| + \frac{\varepsilon}{3n(1-x)}.$
Putting $x = x_n = 1 - \frac{1}{n}$, we find that
$|S_n - S| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon,$
which completes the proof.
2
FUNCTIONS OF SEVERAL VARIABLES
2.1. In this chapter, we shall study derivatives and partial derivatives of functions of several
variables along with their properties.
2.2. Linear Transformations
Definition. A mapping f of a vector space X into a vector space Y is said to be a linear
transformation if
f(x1 + x2) = f (x1) + f (x2),
f(cx) = cf(x)
for all x, x1, x2 ∈ X and all scalars c.
Clearly, if f is linear transformation, then f (0) = 0.
A linear transformation of a vector space X into X is called linear operator on X.
If a linear operator T on a vector space X is one-to-one and onto, then T is invertible and
its inverse is denoted by T−1. Clearly T−1 (Tx) = x for all x ∈X. Also, if T is linear, then T−1 is
also linear.
Theorem 1. A linear operator T on a finite dimensional vector space X is one-to-one if and
only if the range of T is the whole of X.
Proof. Let R(T) denote the range of T. Let {x1, x2, …, xn} be a basis of X. Since T is linear the set
{Tx1, Tx2,…, Txn} spans R(T). The range of T will be the whole of X if and only if {Tx1, Tx2,…,
Txn} is linearly independent.
So, suppose first that T is one-to-one. We shall prove that {Tx1, Tx2,…, Txn} is linearly
independent. Hence, let
c1Tx1 + c2Tx2 +…+ cnTxn = 0.
Since T is linear, this yields
T(c1x1 + c2x2 +…+ cnxn) = 0
and so, T being one-to-one, c1x1 + c2x2 +…+ cnxn = 0.
Since {x1, x2,…, xn} is linearly independent, we have
c1 = c2 = … = cn = 0.
Thus {Tx1, Tx2,…, Txn} is linearly independent and so R(T) = X if T is one-to-one.
Conversely, suppose {Tx1, Tx2,…, Txn} is linearly independent, so that
(2.2.1) c1Tx1 + c2Tx2 +…+ cnTxn = 0
implies c1 = c2 =…= cn = 0. Since T is linear, (2.2.1) reads
T(c1x1 + …+ cnxn) = 0,
and c1 = c2 =…= cn = 0 then gives
c1x1 +…+ cnxn = 0.
Thus T(x) = 0 only if x = 0. Now
T(x) = T(y) ⟹ T(x−y) = 0 ⟹ x−y = 0 ⟹ x = y
and so T is one-to-one. This completes the proof of the theorem.
Definition. Let L(X, Y) be the set of all linear transformations of the vector space X into the
vector space Y. If T1, T2 ∈ L(X, Y) and if c1, c2 are scalars, then c1 T1 + c2 T2 is defined by
Hence d is a metric.
Theorem. 3. If T ∈ L(Rn, Rm) and S ∈ L(Rm, Rk), then
||ST|| ≤ ||S|| ||T||
Proof. We have
|(ST) x| = |S(Tx)| ≤ ||S|| |Tx|
≤ ||S|| ||T|| |x|
Taking the supremum over x with |x| ≤ 1, we get
||ST|| ≤ ||S|| ||T||.
In Theorem 2, we have seen that the set of linear transformations forms a metric space. Hence the
concepts of convergence, continuity, open sets, etc. make sense in L(Rn, Rm).
Theorem. 4. Let C be the collection of all invertible linear operators on Rn.
(a) If T ∈ C, $\|T^{-1}\| = \frac{1}{\alpha}$, S ∈ L(Rn, Rn) and ||S − T|| = β < α, then S ∈ C.
(b) C is an open subset of L(Rn, Rn) and the mapping T → T−1 is continuous on C.
Proof. We note that
$|x| = |T^{-1} T x| \le \|T^{-1}\|\,|Tx| = \frac{1}{\alpha}\,|Tx|$ for all x ∈ Rn
and so
(2.2.2) (α − β) |x| = α|x| − β|x|
≤ |Tx| − β|x|
≤ |Tx| − |(S − T) x|
≤ |Sx| for all x ∈ Rn.
Thus the kernel of S consists of 0 only. Hence S is one-to-one. Then Theorem 1 implies that S is also
onto. Hence S is invertible and so S ∈ C. But this holds for all S satisfying ||S − T|| < α. Hence
every point of C is an interior point and so C is open.
Replacing x by S−1y in (2.2.2), we have
$(\alpha - \beta)\,|S^{-1} y| \le |S S^{-1} y| = |y|$
or $|S^{-1} y| \le \frac{|y|}{\alpha - \beta}$
and so $\|S^{-1}\| \le \frac{1}{\alpha - \beta}.$
Since $S^{-1} - T^{-1} = S^{-1} (T - S)\,T^{-1}$, we have
(2.2.3) $\|S^{-1} - T^{-1}\| \le \|S^{-1}\|\,\|T - S\|\,\|T^{-1}\| \le \frac{\beta}{\alpha(\alpha - \beta)}.$
Thus if f is the mapping which maps T → T−1, then (2.2.3) implies
$\|f(S) - f(T)\| \le \frac{\|S - T\|}{\alpha(\alpha - \beta)}.$
Hence, if ||S − T||→0 then f(S) → f(T) and so f is continuous. This completes the proof of the
theorem.
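The estimates of Theorem 4 can be tried on a small example. The following is a minimal sketch for 2×2 real matrices only, assuming the norm is the operator norm induced by the Euclidean norm; the matrices T and S and all helper names are hypothetical choices made for illustration.

```python
import math


def sub(A, B):
    """Entrywise difference of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(A[i][j] - B[i][j] for j in range(2)) for i in range(2))


def inverse(A):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def op_norm(A):
    """Operator norm of a 2x2 matrix: square root of the largest eigenvalue of A^T A."""
    (a, b), (c, d) = A
    p = a * a + c * c  # entries of the symmetric matrix A^T A
    q = a * b + c * d
    r = b * b + d * d
    largest = (p + r) / 2.0 + math.sqrt(((p - r) / 2.0) ** 2 + q * q)
    return math.sqrt(largest)


T = ((2.0, 0.0), (0.0, 1.0))
S = ((2.0, 0.1), (0.1, 1.0))           # a small perturbation of T
alpha = 1.0 / op_norm(inverse(T))      # so that ||T^-1|| = 1 / alpha
beta = op_norm(sub(S, T))              # ||S - T||
lhs = op_norm(sub(inverse(S), inverse(T)))
bound = beta / (alpha * (alpha - beta))
print(alpha, beta, lhs, bound)         # lhs should not exceed bound
```

In this instance β < α, so part (a) predicts that S is invertible, and the printed quantities can be compared with the inequality (2.2.3).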
In particular, let f be a real valued function of three variables x, y, z say. Then f is differentiable
at the point (x, y, z) if it possesses a determinate value in the neighbourhood of this point and if
∆f = f(x + ∆x, y + ∆y, z + ∆z) − f(x, y, z) = A∆x + B∆y + C∆z + ερ, where ρ = |∆x| + |∆y| +
|∆z|, ε→0 as ρ→0 and A, B, C are independent of ∆x, ∆y, ∆z. In this case A∆x + B∆y + C∆z is
called the differential of f at (x, y, z).
Theorem 5. (Uniqueness of Derivative of a function). Let E be an open set in Rn, let f map E
into Rm and let x∈E. Suppose h ∈Rn is small enough such that x + h ∈E. Then the derivative of f at x,
if it exists, is unique.
Proof. If possible, let there be two derivatives A1 and A2. Therefore
$\lim_{h\to 0} \frac{|f(x+h) - f(x) - A_1 h|}{|h|} = 0$
and $\lim_{h\to 0} \frac{|f(x+h) - f(x) - A_2 h|}{|h|} = 0.$
Consider B = A1 − A2. Then
Bh = A1h − A2h
= f(x+h) − f(x) + f(x) − f(x+h) + A1h − A2h
= [f(x+h) − f(x) − A2h] − [f(x+h) − f(x) − A1h]
and so |Bh| ≤ |f(x+h) − f(x) − A1h| + |f(x+h) − f(x) − A2h|,
which implies
$\lim_{h\to 0} \frac{|Bh|}{|h|} \le \lim_{h\to 0} \left[ \frac{|f(x+h) - f(x) - A_1 h|}{|h|} + \frac{|f(x+h) - f(x) - A_2 h|}{|h|} \right] = 0.$
For fixed h ≠ 0, it follows that
(2.3.5) $\frac{|B(th)|}{|th|} \to 0$ as t→0.
The linearity of B shows that the L.H.S. of (2.3.5) is independent of t. Thus Bh = 0 for all h ∈Rn.
Hence B = 0, that is, A1 = A2, which proves uniqueness of the derivative.
The following theorem, known as chain rule, tells us how to compute the total derivatives of the
composition of two functions.
Theorem. 6. (Chain rule). Suppose E is an open set in Rn , f maps E into Rm, f is differentiable
at x0 will total derivative f ′(x0), g maps on open set containing f (E) into Rk and g is
differentiable at f (x0) with total derivative g′ (f(x0)). Then the composition map F = f o g
mapping E into Rk and defined by F(x) = g(f(x)) is differentiable at x0 and has the derivative
F ′(x0) = g′ (f(x0)) f ′(x0)
Proof. Take
y0 = f(x0), A = f ′ (x0), B = g′(y0)
and define
r1(x) = f(x) − f(x0) − A(x −x0)
r2(y) = g(y) − g(y0) − B(y −y0)
r(x) = F(x) − F(x0) − BA(x −x0)
To prove the theorem it is sufficient to show that
F′(x0) = BA,
that is,
$$\frac{r(x)}{|x-x_0|}\to 0 \quad\text{as } x\to x_0. \tag{2.3.6}$$
But, in terms of the definition of F (x), we have
r(x) = g(f(x)) − g(y0) − B(f(x) − f(x0)) + B(f(x) − f(x0) − A(x − x0))
so that
(2.3.7) r(x) = r2 (f(x)) + B r1 (x)
If ε > 0, it follows from the definitions of A and B that there exist η > 0 and δ > 0 such that
$$\frac{|r_2(y)|}{|y-y_0|}\le\epsilon \quad\text{if } |y-y_0|<\eta,$$
or |r2(y)| ≤ ε |y − y0| if |y − y0| < η, i.e. if |f(x) − f(x0)| < η
$$|(D_jf)(y)-(D_jf)(x)|<\frac{\epsilon}{n},\qquad y\in B,\ 1\le j\le n. \tag{2.4.4}$$
Suppose h = Σ hj ej, |h| < r, and take v0 = 0
and vk = h1 e1 + h2 e2 +…+ hk ek for 1 ≤ k ≤ n.
Then
$$f(x+h)-f(x)=\sum_{j=1}^{n}\,[f(x+v_j)-f(x+v_{j-1})]. \tag{2.4.5}$$
Since |vk| < r for 1 ≤ k ≤ n and since S is convex, the segments with end points x + vj−1 and
x + vj lie in S. Further, since
vj = vj−1 + hj ej,
the Mean Value Theorem implies
$$f(x+v_j)-f(x+v_{j-1}) = f(x+v_{j-1}+h_je_j)-f(x+v_{j-1}) = h_j\,(D_jf)(x+v_{j-1}+\theta_jh_je_j) \tag{2.4.6}$$
for some θj ∈(0, 1), and by (2.4.4) this differs from hj (Dj f) (x) by less than |hj| ε/n. Hence (2.4.5)
gives
$$\left|f(x+h)-f(x)-\sum_{j=1}^{n}h_j\,(D_jf)(x)\right|\le\frac{1}{n}\sum_{j=1}^{n}|h_j|\,\epsilon\le|h|\,\epsilon$$
for all h satisfying |h| < r.
Hence f is differentiable at x and f ′(x) is the linear function which assigns the number Σ hj (Dj f)(x)
to the vector h = Σ hj ej. The matrix [f ′ (x)] consists of the row (D1 f ) (x),…, (Dn f) (x). Since
D1 f, D2 f,…, Dn f are continuous functions on E, it follows that f ′ is continuous and hence
f ∈C′ (E).
2.5. Classical Theory for Functions of more than one Variable
Consider a variable u connected with the three independent variables x, y and z by the functional
relation
u = u (x, y, z)
If arbitrary increments ∆x, ∆y, ∆z are given to the independent variables, the corresponding
increment ∆u of the dependent variable of course depends upon the three increments assigned to
x, y and z.
Definition. A function u = u(x, y, z) is said to be differentiable at the point (x, y, z) if it
possesses a determinate value in the neighbourhood of this point and if
$$\Delta u = A\,\Delta x + B\,\Delta y + C\,\Delta z + \epsilon\rho,$$
where ρ = |∆x| + |∆y| + |∆z|, ε→0 as ρ→0 and A, B, C are independent of ∆x, ∆y, ∆z.
In the above definition ρ may always be replaced by η, where
$$\eta=\sqrt{\Delta x^2+\Delta y^2+\Delta z^2}.$$
Definition. If the increment ratio
$$\frac{u(x+\Delta x,\,y,\,z)-u(x,\,y,\,z)}{\Delta x}$$
tends to a unique limit as ∆x tends to zero, this limit is called the partial derivative of u with
respect to x and is written $\frac{\partial u}{\partial x}$ or $u_x$.
Similarly $\frac{\partial u}{\partial y}$ and $\frac{\partial u}{\partial z}$ can be defined.
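The differential coefficients: If in the equation
$$\Delta u = A\,\Delta x + B\,\Delta y + C\,\Delta z + \epsilon\rho$$
we suppose that ∆y = ∆z = 0, then, on the assumption that u is differentiable at the point (x, y, z),
$$\Delta u = u(x+\Delta x,\,y,\,z)-u(x,\,y,\,z) = A\,\Delta x+\epsilon\,|\Delta x|$$
and dividing by ∆x,
$$\frac{u(x+\Delta x,\,y,\,z)-u(x,\,y,\,z)}{\Delta x}=A\pm\epsilon$$
and by taking the limit as ∆x→0, since ε→0 as ∆x→0, we get $\frac{\partial u}{\partial x}=A$.
Similarly $\frac{\partial u}{\partial y}=B$ and $\frac{\partial u}{\partial z}=C$.
Hence, when the function u = u(x, y, z) is differentiable, the partial derivatives
$\frac{\partial u}{\partial x}$, $\frac{\partial u}{\partial y}$, $\frac{\partial u}{\partial z}$ are
respectively the differential coefficients A, B, C and so
$$\Delta u=\frac{\partial u}{\partial x}\Delta x+\frac{\partial u}{\partial y}\Delta y+\frac{\partial u}{\partial z}\Delta z+\epsilon\rho.$$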
The differential of the dependent variable du is defined to be the principal part of ∆u so that the
above expression may be written as
∆u = du + ερ.
Now as in the case of functions of one variable, the differentials of the independent variables are
identical with the arbitrary increments of these variables. If we write u = x, u = y, u = z
respectively, it follows that
dx = ∆x, dy = ∆y, dz = ∆z
Therefore, the expression for du reduces to
$$du=\frac{\partial u}{\partial x}dx+\frac{\partial u}{\partial y}dy+\frac{\partial u}{\partial z}dz$$
The distinction between derivatives and differential coefficients
We know that the necessary and sufficient condition that the function y = f(x) should be
differentiable at the point x is that it possesses a finite definite derivative at that point. Thus for
functions of one variable, the existence of the derivative f ′(x) implies the differentiability of f(x)
at any given point.
For functions of more than one variable this is not true. If the function u = u (x, y, z) is
differentiable at the point (x, y, z), the partial derivatives of u with respect to x, y and z certainly
exist and are finite at this point, for then they are identical with differential coefficients A, B and
C respectively. The partial derivatives, however, may exist at a point when the function is not
differentiable at that point. In other words, the partial derivatives need not always be differential
coefficients.
Example 1. Let f be a function defined by
$$f(x,y)=\frac{x^3-y^3}{x^2+y^2},$$
where x and y are not simultaneously zero, f(0, 0) = 0.
If this function is differentiable at the origin, then, by definition,
$$f(h,k)-f(0,0)=Ah+Bk+\epsilon\eta, \tag{2.5.1}$$
where $\eta=\sqrt{h^2+k^2}$ and ε→0 as η→0.
Putting h = η cos θ, k = η sin θ in (2.5.1) and dividing through by η, we get
$$\cos^3\theta-\sin^3\theta=A\cos\theta+B\sin\theta+\epsilon.$$
Since ε→0 as η→0, we get, by taking the limit as η→0,
$$\cos^3\theta-\sin^3\theta=A\cos\theta+B\sin\theta$$
which is impossible, since θ is arbitrary.
The function is therefore not differentiable at (0, 0). The partial derivatives exist, however, for
$$f_x(0,0)=\lim_{h\to0}\frac{f(h,0)-f(0,0)}{h}=\lim_{h\to0}\frac{h-0}{h}=1,$$
$$f_y(0,0)=\lim_{k\to0}\frac{f(0,k)-f(0,0)}{k}=\lim_{k\to0}\frac{-k}{k}=-1.$$
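Remark (numerical illustration). The argument of Example 1 can be followed numerically. The sketch below, with helper names chosen here for illustration, evaluates the quotient [f(h, k) − f(0, 0)]/η along a direction θ and compares it with the value A cos θ + B sin θ that differentiability would require, taking A and B to be the partial derivatives found along the axes.

```python
import math

def f(x, y):
    """Example 1: (x^3 - y^3)/(x^2 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return (x**3 - y**3) / (x**2 + y**2)

def directional_quotient(theta, eta=1e-6):
    """[f(h, k) - f(0, 0)] / eta with h = eta*cos(theta), k = eta*sin(theta)."""
    return f(eta * math.cos(theta), eta * math.sin(theta)) / eta

# Partial derivatives at the origin, read off along the two axes.
fx0 = directional_quotient(0.0)
fy0 = directional_quotient(math.pi / 2)

# Along theta = 3*pi/4 the quotient is cos^3 - sin^3, while a differentiable
# function would give fx0*cos(theta) + fy0*sin(theta).
theta = 3 * math.pi / 4
actual = directional_quotient(theta)
linear_prediction = fx0 * math.cos(theta) + fy0 * math.sin(theta)
gap = abs(actual - linear_prediction)
print(fx0, fy0, actual, linear_prediction, gap)
```

The gap does not shrink as η is reduced, because f is homogeneous of degree one; this is the numerical counterpart of the contradiction obtained above.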
Example 2. Let
$$f(x,y)=\begin{cases}\dfrac{xy}{\sqrt{x^2+y^2}} & \text{if } x^2+y^2\ne 0\\[2mm] 0 & \text{if } x=y=0\end{cases}$$
Then fx(0, 0) = 0 = fy(0, 0)
and so the partial derivatives exist. If f is differentiable at the origin, then
∆f = f (h, k) − f(0, 0) = Ah + Bk + εη, where A = fx(0, 0), B = fy (0, 0), $\eta=\sqrt{h^2+k^2}$.
This yields
$$\frac{hk}{\sqrt{h^2+k^2}}=\epsilon\sqrt{h^2+k^2}$$
or hk = ε(h2 + k2).
Putting k = mh we get
mh2 = εh2 (1+m2)
or $\frac{m}{1+m^2}=\epsilon$.
Hence
$$\lim_{h\to0}\frac{m}{1+m^2}=0,$$
which is impossible, since m is arbitrary. Hence the function is not differentiable at the origin.
Remarks:
1. Thus the information given by the existence of the two first partial derivatives is limited. The
values of fx(x, y) and of fy(x, y) depend only on the values of f(x, y) along two lines through the
point (x, y) respectively parallel to the axes of x and y. This information is incomplete, and tells
us nothing at all about the behaviour of the function f(x, y) as the point (x, y) is approached along
a line which is inclined to the axis of x at any given angle θ which is not equal to 0 or π/2.
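2. Partial derivatives are also in general functions of x, y and z which may possess partial
derivatives with respect to each of the three independent variables. We have the definitions
$$\text{(i)}\quad \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\right)=\lim_{\Delta x\to0}\frac{u_x(x+\Delta x,\,y,\,z)-u_x(x,\,y,\,z)}{\Delta x}$$
$$\text{(ii)}\quad \frac{\partial}{\partial y}\left(\frac{\partial u}{\partial x}\right)=\lim_{\Delta y\to0}\frac{u_x(x,\,y+\Delta y,\,z)-u_x(x,\,y,\,z)}{\Delta y}$$
$$\text{(iii)}\quad \frac{\partial}{\partial z}\left(\frac{\partial u}{\partial x}\right)=\lim_{\Delta z\to0}\frac{u_x(x,\,y,\,z+\Delta z)-u_x(x,\,y,\,z)}{\Delta z}$$
provided that each of these limits exists. We shall denote the second order partial derivatives by
$\frac{\partial^2u}{\partial x^2}$ or $u_{xx}$, $\frac{\partial^2u}{\partial y\,\partial x}$ or $u_{yx}$ and $\frac{\partial^2u}{\partial z\,\partial x}$ or $u_{zx}$.
Similarly we may define higher order partial derivatives of $\frac{\partial u}{\partial y}$ and $\frac{\partial u}{\partial z}$.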
The following example shows that certain second partial derivatives of a function may exist at a
point at which the function is not continuous.
Example 3. Let
$$\varphi(x,y)=\frac{x^3+y^3}{x-y}\ \text{ when } x\ne y,\qquad \varphi(x,y)=0\ \text{ when } x=y.$$
This function is discontinuous at the origin. To show this it suffices to prove that if the origin is
approached along different paths, φ(x, y) does not tend to the same definite limit. For, if φ(x, y)
were continuous at (0, 0), φ(x, y) would tend to zero (the value of the function at the origin) by
whatever path the origin were approached.
Let the origin be approached along the three curves
(i) y = x − x2, (ii) y = x − x3, (iii) y = x − x4 ;
then we have
$$\text{(i)}\quad \varphi(x,y)=\frac{2x^3+O(x^4)}{x^2}\to0 \ \text{ as } x\to0$$
$$\text{(ii)}\quad \varphi(x,y)=\frac{2x^3+O(x^4)}{x^3}\to2 \ \text{ as } x\to0$$
$$\text{(iii)}\quad \varphi(x,y)=\frac{2x^3+O(x^4)}{x^4}\to\infty \ \text{ as } x\to0$$
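Remark (numerical illustration). The three limits can be observed by evaluating φ along the curves y = x − x^p for p = 2, 3, 4. This is a sketch only; the helper name `along` is introduced here for illustration.

```python
def phi(x, y):
    """Example 3: (x^3 + y^3)/(x - y) for x != y, and 0 for x == y."""
    return 0.0 if x == y else (x**3 + y**3) / (x - y)

def along(power, x):
    """Value of phi on the curve y = x - x**power."""
    return phi(x, x - x**power)

xs = [1e-1, 1e-2, 1e-3]
for p in (2, 3, 4):
    # p = 2: values shrink to 0; p = 3: values approach 2; p = 4: values grow.
    print(p, [along(p, x) for x in xs])
```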
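Certain partial derivatives, however, exist at (0, 0), for if $\varphi_{xx}$ denotes $\frac{\partial}{\partial x}\left(\frac{\partial\varphi}{\partial x}\right)$ we have, for
example,
$$\varphi_x(0,0)=\lim_{h\to0}\frac{\varphi(h,0)-\varphi(0,0)}{h}=\lim_{h\to0}\frac{h^2}{h}=0,$$
$$\varphi_{xx}(0,0)=\lim_{h\to0}\frac{\varphi_x(h,0)-\varphi_x(0,0)}{h}=\lim_{h\to0}\frac{2h}{h}=2,$$
since φ(x, 0) = x2, φx(x, 0) = 2x when x ≠ 0.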
The following example shows that uxy is not always equal to uyx.
Example 4. Let
$$f(x,y)=\frac{xy(x^2-y^2)}{x^2+y^2}\ \text{ for } (x,y)\ne(0,0),\qquad f(0,0)=0.$$
When the point (x, y) is not the origin, then
$$\frac{\partial f}{\partial x}=y\left[\frac{x^2-y^2}{x^2+y^2}+\frac{4x^2y^2}{(x^2+y^2)^2}\right] \tag{2.5.2}$$
$$\frac{\partial f}{\partial y}=x\left[\frac{x^2-y^2}{x^2+y^2}-\frac{4x^2y^2}{(x^2+y^2)^2}\right] \tag{2.5.3}$$
while at the origin,
$$f_x(0,0)=\lim_{h\to0}\frac{f(h,0)-f(0,0)}{h}=0 \tag{2.5.4}$$
and similarly fy(0, 0) = 0.
From (2.5.2) and (2.5.3) we see that
(2.5.5) fx(0, y) = −y (y ≠ 0)
(2.5.6) fy(x, 0) = x (x ≠ 0)
Now we have, using (2.5.4), (2.5.5) and (2.5.6),
$$f_{xy}(0,0)=\lim_{h\to0}\frac{f_y(h,0)-f_y(0,0)}{h}=\lim_{h\to0}\frac{h}{h}=1,$$
$$f_{yx}(0,0)=\lim_{k\to0}\frac{f_x(0,k)-f_x(0,0)}{k}=\lim_{k\to0}\frac{-k}{k}=-1,$$
and so fxy (0, 0) ≠ fyx(0, 0).
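Remark (numerical illustration). The two mixed derivatives of Example 4 can be estimated by nested difference quotients. The sketch below uses the convention of the text, in which fxy is the x-derivative of fy and fyx is the y-derivative of fx; the step sizes are assumptions made here, not part of the example.

```python
def f(x, y):
    """Example 4: xy(x^2 - y^2)/(x^2 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x**2 - y**2) / (x**2 + y**2)

def fx(x, y, d=1e-6):
    """Central-difference estimate of the partial derivative in x."""
    return (f(x + d, y) - f(x - d, y)) / (2 * d)

def fy(x, y, d=1e-6):
    """Central-difference estimate of the partial derivative in y."""
    return (f(x, y + d) - f(x, y - d)) / (2 * d)

step = 1e-3
f_xy = (fy(step, 0.0) - fy(0.0, 0.0)) / step   # x-derivative of f_y at the origin
f_yx = (fx(0.0, step) - fx(0.0, 0.0)) / step   # y-derivative of f_x at the origin
print(f_xy, f_yx)
```

The inner step is taken much smaller than the outer one so that the inner quotients approximate fy(h, 0) and fx(0, k) before the outer limit is taken.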
Example 5. Prove that the function
f(x, y) = (|xy|)1/2
is not differentiable at the point (0, 0), but that $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ both exist at the origin and have the
value zero.
Hence deduce that these two partial derivatives are continuous except at the origin.
Solution. We have
$$\frac{\partial f}{\partial x}(0,0)=\lim_{h\to0}\frac{f(h,0)-f(0,0)}{h}=0,$$
$$f_y(0,0)=\lim_{k\to0}\frac{f(0,k)-f(0,0)}{k}=0.$$
If f(x, y) is differentiable at (0, 0), then we must have
$$f(h,k)=0\cdot h+0\cdot k+\epsilon\sqrt{h^2+k^2},$$
where ε→0 as $\sqrt{h^2+k^2}\to0$. Now
$$\epsilon=\frac{|hk|^{1/2}}{\sqrt{h^2+k^2}}.$$
Putting h = ρ cos θ, k = ρ sin θ, we get
$$\epsilon=|\sin\theta\cos\theta|^{1/2},$$
$$\therefore\quad \lim_{\rho\to0}\epsilon=|\cos\theta\sin\theta|^{1/2}=0,$$
which is impossible for arbitrary θ.
Hence, f is not differentiable.
Theorem. 9. (Young) If (i) fx and fy exist in the neighbourhood of the point (a, b) and (ii) fx and
fy are differentiable at (a, b); then
fxy = fyx
Proof. We shall prove this theorem by taking equal increments h both for x and y and calculating
∆2 f in two different ways, where
∆2 f = f(a + h, b + h) − f (a +h, b) − f(a, b+h) + f(a, b).
Let
H(x) = f(x, b+h) − f(x, b)
Then, we have
∆2 f = H(a +h) − H(a)
Since fx exists in the neighbourhood of (a, b), the function H(x) is derivable in (a, a+h).
Applying Mean Value Theorem to H(x) for 0 < θ < 1, we obtain
H(a + h) − H(a) = h H′(a + θh).
Therefore
(2.5.7) ∆2f = hH′(a + θh)
= h[fx (a + θh, b + h) − fx (a + θh, b)]
By hypothesis (ii) of the theorem, fx(x, y) is differentiable at (a, b) so that
fx(a + θh, b+h) − fx(a, b) = θh fxx (a, b) + h fyx (a, b) + ε′ h
and
fx (a + θh, b) − fx(a, b) = θh fxx (a, b) + ε′′ h ,
where ε′ and ε′′ tend to zero as h→0. Thus, we get (on subtracting)
fx(a + θh, b+h) − fx (a + θh, b) = h fyx (a, b) + h (ε′ − ε′′)
Putting this value in (2.5.7), we obtain
(2.5.8) ∆2f = h2 fyx (a, b) + ε1 h2,
where ε1 = ε′ − ε′′,
so that ε1 tends to zero with h .
Similarly, if we take
K(y) = f(a + h, y) − f(a, y),
Then we can show that
$$f(x,y)=\begin{cases}\dfrac{x^2y^2}{x^2+y^2}, & (x,y)\ne(0,0)\\[2mm] 0, & x=y=0\end{cases}$$
We have
$$f_x(0,0)=\lim_{h\to0}\frac{f(h,0)-f(0,0)}{h}=0,\qquad f_y(0,0)=\lim_{k\to0}\frac{f(0,k)-f(0,0)}{k}=0.$$
Also, for (x, y) ≠ (0, 0), we have
$$f_x(x,y)=\frac{(x^2+y^2)\,2xy^2-x^2y^2\cdot2x}{(x^2+y^2)^2}=\frac{2xy^4}{(x^2+y^2)^2},$$
$$f_y(x,y)=\frac{2x^4y}{(x^2+y^2)^2}.$$
Again
$$f_{yx}(0,0)=\lim_{k\to0}\frac{f_x(0,k)-f_x(0,0)}{k}=0 \quad\text{and}\quad f_{xy}(0,0)=0,$$
so that fyx (0, 0) = fxy (0, 0).
For (x, y) ≠ (0, 0), we have
$$f_{yx}(x,y)=\frac{8xy^3(x^2+y^2)^2-2xy^4\cdot4y(x^2+y^2)}{(x^2+y^2)^4}=\frac{8x^3y^3}{(x^2+y^2)^3}.$$
Putting y = mx, we get fyx = 8m3/(1 + m2)3 for x ≠ 0, which depends on m and is not zero when m ≠ 0. Hence
$$\lim_{(x,y)\to(0,0)}f_{yx}\ne0=f_{yx}(0,0)$$
so that fyx is not continuous at (0, 0). Thus the condition of Schwarz’s theorem is not satisfied.
To see that the conditions of Young’s Theorem are also not satisfied, we notice that
$$f_{xx}(0,0)=\lim_{h\to0}\frac{f_x(h,0)-f_x(0,0)}{h}=0.$$
If fx is differentiable at (0, 0) we should have
fx(h, k) − fx(0, 0) = fxx(0, 0). h + fyx(0, 0). k + εη,
that is,
$$\frac{2hk^4}{(h^2+k^2)^2}=\epsilon\eta,\quad\text{where } \epsilon\to0 \text{ as } \eta\to0.$$
Put h = ρ cos θ, k = ρ sin θ, then $\eta=\sqrt{h^2+k^2}=\rho$
so we have
$$\frac{2\rho^5\cos\theta\,\sin^4\theta}{\rho^4}=\epsilon\rho$$
$$\varphi(t)=\varphi(0)+t\varphi'(0)+\frac{t^2}{2!}\varphi''(0)+\dots+\frac{t^n}{n!}\varphi^{(n)}(\theta t),$$
where 0 < θ < 1. Now put t = 1 and observe that
φ(1) = f (a + h, b + k), φ(0) = f (a, b), φ′(0) = d f(a, b),
φ′′(0) = d2f (a, b),…, φ(n) (θ) = dn f (a + θh, b + θk).
It follows immediately that
$$f(a+h,b+k)=f(a,b)+df(a,b)+\frac{1}{2!}d^2f(a,b)+\dots+\frac{1}{(n-1)!}d^{n-1}f(a,b)+R_n \tag{2.5.11}$$
where $R_n=\frac{1}{n!}d^nf(a+\theta h,\,b+\theta k)$, 0 < θ < 1.
We have assumed here that all the partial derivatives of order n are continuous in the domain in
question. Taylor expansion does not necessarily hold if these derivatives are not continuous.
Maclaurin’s theorem. If we put a = b = 0, h = x, k = y in (2.5.11), we get at once
$$f(x,y)=f(0,0)+df(0,0)+\frac{1}{2!}d^2f(0,0)+\dots+\frac{1}{(n-1)!}d^{n-1}f(0,0)+R_n$$
where $R_n=\frac{1}{n!}d^nf(\theta x,\,\theta y)$, 0 < θ < 1.
The theorem easily extends to any number of variables.
Example 6. If f (x, y) = (|x y|)1/2, prove that Taylor’s expansion about the point (x, x) is not valid
in any domain which includes the origin. Give reasons.
Solution. If a Taylor expansion were possible (n = 1)
f(x + h, x + h) = f(x, x) + h {fx(ξ, ξ) + fy(ξ, ξ)}
where x < ξ < x + h. This is not valid for all x, h for it implies that
|x + h| = |x| + h, ξ > 0
= |x| − h, ξ < 0
= |x|, ξ = 0
(The reason is that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are not continuous at the origin).
2.7. Implicit functions
Let F(x1, x2,…, xn, u) = 0 (1)
be a functional relation between the n + 1 variables x1,…, xn, u and let x1 = a1, x2 = a2,…, xn = an
be a set of values such that the equation
F(a1,…, an, u) = 0 (2)
is satisfied for at least one value of u, that is the equation (2) in u has at least one root. We may
consider u as a function of the x’s : u = φ(x1, x2,…, xn) defined in a certain domain, where
φ(x1, x2,…, xn) has assigned to it at any point (x1, x2,…, xn) the roots u of the equation (1) at this
point. We say that u is the implicit function defined by (1). It is, in general, a many valued
function.
More generally, consider the set of equations
Fp (x1,…, xn, u1,…, um) = 0 (p = 1, 2,…, m) (3)
between the n +m variables x1,…, xn, u1,…, um and suppose that the set of equations (3) are such
that there are points (x1, x2,…, xn) for which these m equations are satisfied for at least one set of
values u1, u2,…, um. We may consider the u’s as functions of the x’s,
up = φp (x1, x2,…, xn) ( p = 1, 2,…, m)
where the functions φ have assigned to them at the point (x1, x2,…, xn) the values of the roots u1,
u2,…, um at this point. We say that u1, u2,…, um constitute a system of implicit functions
defined by the set of equations (3). These functions are in general many valued.
Theorem. 12 (Existence Theorem). Let F(u, x, y) be a continuous function of the variables u, x,
y. Suppose that
(i) F(u0, a, b) = 0
(ii) F(u, x, y) is differentiable at (u0, a, b)
(iii) The partial derivative $\frac{\partial F}{\partial u}$ (u0, a, b) ≠ 0
Then there exists at least one function u = u (x, y) reducing to u0 at the point (a, b) and which, in
the neighbourhood of this point, satisfies the equation F (u, x, y) = 0 identically.
Also, every function u which possesses these two properties is continuous and differentiable at
the point (a, b).
Proof. Since F(u0, a, b) = 0 and $\frac{\partial F}{\partial u}$ (u0, a, b) ≠ 0, the function F is either an increasing or
decreasing function of u when u = u0. Thus there exists a positive number δ such that F(u0 − δ, a,
b) and F(u0 + δ, a,b) have opposite signs. Since F is given to be continuous, a positive number η
can be found so that the functions
F(u0 − δ, x, y) and F(u0 + δ, x, y)
the values of which may be as near as we please to
F(u0 − δ, a, b) and F(u0 + δ, a, b)
will also have opposite signs so long as |x − a| < η and |y −b| < η.
Let x, y be any two values satisfying the above conditions. Then F(u, x, y) is a continuous
function of u which changes sign between u0 −δ and u0 + δ and so vanishes somewhere in this
interval. Thus for these x and y there is a u in [u0 − δ, u0 + δ] for which F(u, x, y) = 0. This u is a
function of x and y, say u (x, y) which reduces to u0 at the point (a, b).
Suppose that ∆u, ∆x, ∆y are the increments of such a function u and of the variables x and y
measured from the point (a, b). Since F is differentiable at (u0, a, b), we have
∆F = [Fu(u0, a, b) + ε] ∆u + [Fx(u0, a, b) + ε′] ∆x
+ [Fy (u0, a, b) + ε′′] ∆y = 0
since ∆F = 0 because of F = 0. The numbers ε, ε′, ε′′ tend to zero with ∆u, ∆x and ∆y and can
be made as small as we please with δ and η. Let δ and η be so small that the numbers ε, ε′, ε′′
are all less than $\frac{1}{2}$|Fu(u0, a, b)|, which is not zero by our hypothesis. The above equation then
shows that ∆u→0 as ∆x→0 and ∆y→0, which means that the function u = u(x, y) is continuous at
(a, b).
Moreover, we have
$$\Delta u=-\frac{[F_x(u_0,a,b)+\epsilon']\,\Delta x+[F_y(u_0,a,b)+\epsilon'']\,\Delta y}{F_u(u_0,a,b)+\epsilon}$$
$$=-\frac{F_x(u_0,a,b)}{F_u(u_0,a,b)}\,\Delta x-\frac{F_y(u_0,a,b)}{F_u(u_0,a,b)}\,\Delta y+\epsilon_1\,\Delta x+\epsilon_2\,\Delta y,$$
ε1 and ε2 tending to zero as ∆x and ∆y tend to zero.
Hence u is differentiable at (a, b).
Cor. 1. If $\frac{\partial F}{\partial u}$ exists and is not zero in the neighbourhood of the point (u0, a, b), the solution u of
the equation F = 0 is unique.
Suppose that there are two solutions u1 and u2. Then we should have, by the Mean Value Theorem,
for u1 < u′ < u2
0 = F(u1, x, y) − F(u2, x, y) = (u1 − u2) Fu(u′, x, y) ,
and so Fu (u, x, y) would vanish at some point in the neighbourhood of (u0, a, b), which is contrary
to our hypothesis.
Cor. 2. If F(u, x, y) is differentiable in the neighbourhood of (u0, a, b), the function u = u(x, y) is
differentiable in the neighbourhood of the point (a, b).
This is immediate, because the preceding proof is then applicable at every point (u, x, y) in that
neighbourhood.
Corollary 1 is of great importance, for by considering a function of two variables only,
F(u, x) = 0, and taking F(u, x) = f(u) − x, we can enunciate the fundamental theorem on inverse
functions as follows.
Theorem. 13 (Inverse Function Theorem). If, in the neighbourhood of u = u0, the function f(u)
is a continuous function of u, and if (i) f(u0) = a, (ii) f ′(u) ≠ 0 in the neighbourhood of the point u
= u0, then there exists a unique continuous function u = φ(x), which is equal to u0 when x = a,
and which satisfies identically the equation
f(u) − x = 0,
in the neighbourhood of the point x = a.
The function u = φ(x) thus defined is called the inverse function of x = f(u).
2.8. Extreme Values
Definition. A function f(x, y, z) of several independent variables x, y, z,… is said to have an
extreme value at the point (a, b, c,…) if the increment
∆f = f(a + h, b + k, c +l ) −f (a, b, c)
preserves the same sign for all values of h, k, l, whose moduli do not exceed a sufficiently small
positive number η.
If ∆f is negative, then the extreme value is a maximum and if ∆f is positive it is a minimum.
Now we find necessary and sufficient conditions for extreme values. We will consider a function
of two independent variables.
By Taylor’s theorem we have
$$f(x+h,\,y+k,\dots)-f(x,\,y,\dots)=h\frac{\partial f}{\partial x}+k\frac{\partial f}{\partial y}+\dots+\text{terms of the second and higher orders.}$$
Now by taking h, k, l,… sufficiently small, the first degree terms can be made to govern the sign of
the right hand side, and therefore of the left hand side also, of the above equation; therefore by
changing the sign of h, k, l,… the sign of the left hand member would be changed. Hence as a first
condition for the extreme value we must have
$$h\frac{\partial f}{\partial x}+k\frac{\partial f}{\partial y}+l\frac{\partial f}{\partial z}+\dots=0,$$
and since these arbitrary increments are independent of each other, we must have
$$\frac{\partial f}{\partial x}=0,\quad\frac{\partial f}{\partial y}=0,\quad\frac{\partial f}{\partial z}=0,\ \dots$$
which are necessary conditions for extreme points. These conditions are not sufficient for
extreme points.
To find sufficient conditions we will consider only the case of two variables.
Let f be a real valued function of two variables. Let (a, b) be an interior point of the domain of f
such that f admits continuous second order partial derivatives in a neighbourhood of (a, b). We
suppose that fx(a, b) = 0 = fy(a, b).
We write r, s, t for the values of $\frac{\partial^2f}{\partial x^2}$, $\frac{\partial^2f}{\partial x\,\partial y}$, $\frac{\partial^2f}{\partial y^2}$ respectively when x = a and y = b. That is,
fxx(a, b) = r, fxy(a, b) = s, fyy(a, b) = t
If (a + h, b + k) is any point of the neighbourhood of (a, b), then by Taylor’s theorem we have
$$f(a+h,\,b+k)-f(a,b)=h\,f_x(a,b)+k\,f_y(a,b)+\frac12\left[h^2f_{xx}(a,b)+2hk\,f_{xy}(a,b)+k^2f_{yy}(a,b)\right]+R_3$$
$$=\frac12\left[rh^2+2shk+tk^2\right]+R_3,\qquad\text{since } \frac{\partial f}{\partial x}(a,b)=\frac{\partial f}{\partial y}(a,b)=0,$$
where R3 consists of terms of the third and higher orders of small quantities, and by taking h and
k sufficiently small the second degree terms now can be made to govern the sign of the right
hand side and therefore of the left hand side also. If these terms are of permanent sign for all
such values of h and k, we shall have a maximum or minimum for f(x, y) according as that
sign is negative or positive.
Now the condition for the invariable sign of (rh2 + 2shk + tk2) is that rt − s2 shall be positive, and
the sign will be that of r. If rt − s2 is positive, it is clear that r and t must have the same sign.
Thus, if rt − s2 is positive we have a maximum or minimum according as r and t are both
negative or both positive.
This condition was first pointed out by Lagrange and is known as Lagrange’s condition.
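Remark (illustration). Lagrange's condition can be written as a small decision procedure on the three numbers r, s, t. The sketch below is an illustration only; the function name `classify` is introduced here, and the branch for rt − s2 < 0 uses the standard fact, not spelled out in the paragraph above, that the quadratic terms then take both signs.

```python
def classify(r, s, t):
    """Classify a critical point from r = f_xx, s = f_xy, t = f_yy."""
    d = r * t - s * s
    if d > 0:
        # r and t share a sign: both positive gives a minimum,
        # both negative gives a maximum.
        return "minimum" if r > 0 else "maximum"
    if d < 0:
        # The quadratic terms change sign, so there is no extreme value.
        return "no extreme value"
    # rt = s^2: higher order terms must be examined.
    return "doubtful"

print(classify(2, 0, 2))     # x^2 + y^2 at the origin
print(classify(-2, 0, -2))   # -(x^2 + y^2) at the origin
print(classify(0, 1, 0))     # xy at the origin
print(classify(0, 0, 2))     # r = s = 0, t = 2
```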
If, however, rt = s2, the quadratic terms
$$rh^2+2shk+tk^2 \ \text{ become }\ \frac{1}{r}(hr+ks)^2 \tag{*}$$
and are therefore of the same sign as r or t unless
$$\frac{h}{k}=-\frac{s}{r}=\beta,\ \text{say},$$
for which (*) vanishes.
In this case we must consider terms of higher degree in the expansion of f(a + h, b + k) − f(a, b).
The cubic terms must vanish collectively when h/k = β; otherwise, by changing the sign of both h
and k we could change the sign of f(a + h, b + k) − f(a, b). And the biquadratic terms must
collectively be of the same sign as r and t when h/k = β.
If r = 0, s ≠ 0, the quadratic terms change sign with k and there is no extreme value. If r = 0 = s,
they do not change sign but they vanish when k = 0 (without h = 0). This is a doubtful case.
In the case in which r, s, t are each of them zero, the quadratic terms are altogether absent, and
the cubic terms would change sign with h and k and therefore all the differential coefficients of
the third order must vanish separately when x = a and y = b and the biquadratic terms must be
such that they retain the same sign for all sufficiently small values of h, k.
Example 9. Let
$$f(x,y)=2x^4-3x^2y+y^2.$$
Then
$$\frac{\partial f}{\partial x}=8x^3-6xy,\quad\frac{\partial f}{\partial x}(0,0)=0;\qquad\frac{\partial f}{\partial y}=-3x^2+2y,\quad\frac{\partial f}{\partial y}(0,0)=0,$$
$$r=\frac{\partial^2f}{\partial x^2}=24x^2-6y=0\ \text{at}\ (0,0),\qquad s=\frac{\partial^2f}{\partial x\,\partial y}=-6x=0\ \text{at}\ (0,0),\qquad t=\frac{\partial^2f}{\partial y^2}=2.$$
Thus rt − s2 = 0, so it is a doubtful case.
However, we can write f(x, y) = (x2 − y) (2x2 − y), f(0, 0) = 0, and
f(x, y) − f(0, 0) = (x2 − y) (2x2 − y) > 0 for y < 0 or x2 > y > 0
< 0 for y > x2 > y/2 > 0
Thus ∆f does not keep the same sign near (0, 0). Therefore f does not have a maximum or
minimum at (0, 0).
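Remark (numerical illustration). The change of sign of ∆f near the origin can be seen by evaluating f on two parabolas through (0, 0), one lying between y = x2 and y = 2x2 and one lying below y = x2. The particular parabolas y = 1.5x2 and y = 0.5x2 are choices made here for illustration.

```python
def f(x, y):
    """Example 9: 2x^4 - 3x^2 y + y^2 = (x^2 - y)(2x^2 - y)."""
    return 2 * x**4 - 3 * x**2 * y + y**2

points = (1e-1, 1e-2, 1e-3)
between = [f(x, 1.5 * x**2) for x in points]   # x^2 < y < 2x^2, expected negative
below = [f(x, 0.5 * x**2) for x in points]     # 0 < y < x^2, expected positive
print(between)
print(below)
```

Both families of points approach the origin, yet f stays negative on the first and positive on the second.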
2.9. Lagrange’s Method of Undetermined Multipliers
Let u = φ (x1, x2,…, xn) be a function of n variables which are connected by m equations
f1(x1, x2,…, xn) = 0, f2 (x1, x2,…, xn) = 0, …, fm (x1, x2,…, xn) = 0,
so that only n−m of the variables are independent.
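When u is a maximum or minimum
$$du=\frac{\partial u}{\partial x_1}dx_1+\frac{\partial u}{\partial x_2}dx_2+\frac{\partial u}{\partial x_3}dx_3+\dots+\frac{\partial u}{\partial x_n}dx_n=0.$$
Also
$$df_1=\frac{\partial f_1}{\partial x_1}dx_1+\frac{\partial f_1}{\partial x_2}dx_2+\frac{\partial f_1}{\partial x_3}dx_3+\dots+\frac{\partial f_1}{\partial x_n}dx_n=0$$
$$df_2=\frac{\partial f_2}{\partial x_1}dx_1+\frac{\partial f_2}{\partial x_2}dx_2+\frac{\partial f_2}{\partial x_3}dx_3+\dots+\frac{\partial f_2}{\partial x_n}dx_n=0$$
……………………………………………………….
$$df_m=\frac{\partial f_m}{\partial x_1}dx_1+\frac{\partial f_m}{\partial x_2}dx_2+\frac{\partial f_m}{\partial x_3}dx_3+\dots+\frac{\partial f_m}{\partial x_n}dx_n=0$$
Multiplying these lines respectively by 1, λ1, λ2,…, λm and adding, we get a result which may be
written
P1 dx1 + P2 dx2 + P3 dx3 +…+ Pn dxn = 0 ,
where
$$P_r=\frac{\partial u}{\partial x_r}+\lambda_1\frac{\partial f_1}{\partial x_r}+\lambda_2\frac{\partial f_2}{\partial x_r}+\dots+\lambda_m\frac{\partial f_m}{\partial x_r}.$$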
The m quantities λ1, λ2,…, λm are at our choice. Let us choose them so as to satisfy the m linear
equations
P1 = P2 = P3 = … = Pm = 0
The above equation is now reduced to
Pm+1 dxm+1 + Pm+2 dxm+2 +…+ Pn dxn = 0
It is indifferent which n − m of the n variables are regarded as independent. Let them be xm+1,
xm+2, …, xn. Then since n−m quantities dxm+1, dxm+2,…, dxn are all independent their coefficients
must be separately zero. Thus we obtain the additional n−m equations
Pm+1 = Pm+2 =…= Pn = 0
$$(x^2+y^2+z^2)+\lambda_1\left(\frac{x^2}{a^2}+\frac{y^2}{b^2}+\frac{z^2}{c^2}\right)+\lambda_2\,(lx+my+nz)=0$$
or r2 + λ1 = 0, so that λ1 = −r2.
Hence from (4), (5) and (6) we have
$$x=\frac{\lambda_2\,l}{\dfrac{r^2}{a^2}-1},\qquad y=\frac{\lambda_2\,m}{\dfrac{r^2}{b^2}-1},\qquad z=\frac{\lambda_2\,n}{\dfrac{r^2}{c^2}-1}.$$
But lx + my + nz = 0 gives
$$\lambda_2\left[\frac{l^2a^2}{r^2-a^2}+\frac{m^2b^2}{r^2-b^2}+\frac{n^2c^2}{r^2-c^2}\right]=0$$
and since λ2 ≠ 0, the equation giving the values of r2, which are the squares of the lengths of the
semi-axes required, is the quadratic in r2
$$\frac{l^2a^2}{r^2-a^2}+\frac{m^2b^2}{r^2-b^2}+\frac{n^2c^2}{r^2-c^2}=0.$$
Example 11. Investigate the maximum and minimum radii vectors of the section of the “surface of
elasticity” (x2 + y2 + z2)2 = a2 x2 + b2 y2 + c2 z2 made by the plane lx + my + nz = 0.
Solution. We have
x dx + y dy + z dz = 0 (1)
a2x dx + b2y dy + c2z dz = 0 (2)
and l dx + m dy + n dz = 0 (3)
Multiplying these equations by 1, λ1, λ2 respectively and adding we get
x + a2xλ1 + lλ2 = 0 (4)
y + b2yλ1 + mλ2 = 0 (5)
z + c2zλ1 + nλ2 = 0 (6)
Multiplying by x, y, z respectively and adding we get
(x2 + y2 + z2) + (a2 x2 + b2 y2 + c2 z2) λ1 + (lx + my + nz) λ2 = 0,
that is,
$$r^2+\lambda_1r^4=0,\quad\text{so that}\quad\lambda_1=-\frac{1}{r^2}.$$
Hence from (4), (5) and (6)
$$x=\frac{\lambda_2\,l\,r^2}{a^2-r^2},\qquad y=\frac{\lambda_2\,m\,r^2}{b^2-r^2},\qquad z=\frac{\lambda_2\,n\,r^2}{c^2-r^2}.$$
Then lx + my + nz = 0 gives
$$\lambda_2r^2\left[\frac{l^2}{a^2-r^2}+\frac{m^2}{b^2-r^2}+\frac{n^2}{c^2-r^2}\right]=0,$$
that is,
$$\frac{l^2}{a^2-r^2}+\frac{m^2}{b^2-r^2}+\frac{n^2}{c^2-r^2}=0.$$
It is a quadratic in r2 and gives the required values.
Example 12. Prove that the volume of the greatest rectangular parallelepiped that can be
inscribed in the ellipsoid $\frac{x^2}{a^2}+\frac{y^2}{b^2}+\frac{z^2}{c^2}=1$ is $\frac{8abc}{3\sqrt3}$.
Solution. The volume of the parallelepiped is 8xyz. Its maximum value is to be found under the
condition that it is inscribed in the ellipsoid $\frac{x^2}{a^2}+\frac{y^2}{b^2}+\frac{z^2}{c^2}=1$. We have
u = 8xyz
$$f_1=\frac{x^2}{a^2}+\frac{y^2}{b^2}+\frac{z^2}{c^2}=1.$$
Therefore
du = 8yz dx + 8xz dy + 8xy dz = 0 (1)
and, after dividing by 2,
$$\frac{x}{a^2}dx+\frac{y}{b^2}dy+\frac{z}{c^2}dz=0 \tag{2}$$
Dividing (1) by 8, multiplying (2) by λ and adding we get
$$yz+\frac{x}{a^2}\lambda=0 \tag{3}$$
$$zx+\frac{y}{b^2}\lambda=0 \tag{4}$$
$$xy+\frac{z}{c^2}\lambda=0 \tag{5}$$
From (3), (4) and (5) we get
$$-\lambda=\frac{a^2yz}{x}=\frac{b^2zx}{y}=\frac{c^2xy}{z}$$
and so
$$\frac{a^2yz}{x}=\frac{b^2zx}{y}=\frac{c^2xy}{z}.$$
Dividing throughout by xyz we get
$$\frac{a^2}{x^2}=\frac{b^2}{y^2}=\frac{c^2}{z^2}.$$
Hence, from the equation of the ellipsoid, $\frac{3x^2}{a^2}=1$ or $x=\frac{a}{\sqrt3}$. Similarly $y=\frac{b}{\sqrt3}$, $z=\frac{c}{\sqrt3}$.
It follows therefore that
$$u=8xyz=\frac{8abc}{3\sqrt3}.$$
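Remark (numerical illustration). The value 8abc/(3√3) can be compared with a direct search over the part of the ellipsoid lying in the first octant. The sketch below is a crude grid search and not Lagrange's method; the function name, the grid size n and the semi-axes used in the check are assumptions made here.

```python
import math

def max_box_volume(a, b, c, n=300):
    """Grid search for the maximum of 8xyz on the first-octant part of the ellipsoid."""
    best = 0.0
    for i in range(1, n):
        s = i / n                          # s = x/a
        for j in range(1, n):
            t = j / n                      # t = y/b
            rest = 1.0 - s * s - t * t     # (z/c)^2 forced by the ellipsoid
            if rest <= 0.0:
                continue
            vol = 8.0 * (a * s) * (b * t) * (c * math.sqrt(rest))
            best = max(best, vol)
    return best

a, b, c = 1.0, 2.0, 3.0
exact = 8.0 * a * b * c / (3.0 * math.sqrt(3.0))
approx = max_box_volume(a, b, c)
print(approx, exact)
```

Every grid point lies on the ellipsoid, so the search can only approach the exact maximum from below.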
Example 13. Find the points of the circle x2 + y2 + z2 = k2, lx + my + nz = 0 at which the function
u = ax2 + by2 + cz2 + 2fyz + 2gzx + 2hxy attains its greatest and its least value.
Solution. We have
u = ax2 + by2 + cz2 + 2fyz + 2gzx + 2hxy
f1 = lx + my + nz = 0
f2 = x2 + y2 + z2 = k2
Then ax dx + by dy + czdz + fy dz + fz dy + gz dx + gx dz + hx dy + hydx = 0
l dx + mdy + ndz = 0
x dx + ydy + zdz = 0
Multiplying by 1, λ1, λ2 respectively and adding
$$x^2=\frac{\mu}{2a(\mu+a)},\qquad y^2=\frac{\mu}{2b(\mu+b)},\qquad z^2=\frac{\mu}{2c(\mu+c)},$$
where µ is the +ve root of the cubic
µ3 − (bc + ca + ab) µ − 2abc = 0
Solution. We have
$$u=\frac{a^2x^2+b^2y^2+c^2z^2}{x^2y^2z^2} \tag{1}$$
ax2 + by2 + cz2 = 1 (2)
Differentiating (1), we get
$$\sum\frac{1}{x^3}\left(\frac{b^2}{z^2}+\frac{c^2}{y^2}\right)dx=0$$
which on multiplication by x2 y2 z2 yields
$$\sum\frac{1}{x}\,(b^2y^2+c^2z^2)\,dx=0 \tag{3}$$
Differentiating (2) we have
Σ ax dx = 0 (4)
$$x^2=\frac{\mu}{2a(a+\mu)}$$
Similarly
$$y^2=\frac{\mu}{2b(b+\mu)}\quad\text{and}\quad z^2=\frac{\mu}{2c(\mu+c)}.$$
Substituting these values of x2, y2 and z2 in (2) we obtain
$$\frac{\mu}{2(a+\mu)}+\frac{\mu}{2(b+\mu)}+\frac{\mu}{2(c+\mu)}=1$$
which is equivalent to
µ3 − (bc + ca + ab) µ − 2abc = 0 (8)
Since a, b, c are +ve, any one of (5), (6), (7) shows that µ must be +ve. Hence µ is the +ve root
of (8).
2.10. Jacobians
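If u1, u2,…, un be n functions of the n variables x1, x2, x3,…, xn the determinant
$$\begin{vmatrix}
\dfrac{\partial u_1}{\partial x_1} & \dfrac{\partial u_1}{\partial x_2} & \cdots & \dfrac{\partial u_1}{\partial x_n}\\[2mm]
\dfrac{\partial u_2}{\partial x_1} & \dfrac{\partial u_2}{\partial x_2} & \cdots & \dfrac{\partial u_2}{\partial x_n}\\
\vdots & \vdots & & \vdots\\
\dfrac{\partial u_n}{\partial x_1} & \dfrac{\partial u_n}{\partial x_2} & \cdots & \dfrac{\partial u_n}{\partial x_n}
\end{vmatrix}$$
is called the Jacobian of u1, u2,…, un with regard to x1, x2,…, xn. This determinant is often
denoted by
$$\frac{\partial(u_1,u_2,\dots,u_n)}{\partial(x_1,x_2,\dots,x_n)}\quad\text{or}\quad J(u_1,u_2,\dots,u_n).$$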
$$\frac{\partial(u_1,u_2,\dots,u_n)}{\partial(x_1,x_2,\dots,x_n)}=0$$
which establishes the theorem.
Theorem 15. If u1, u2,…, un are n functions of the n variables x1, x2,…, xn, say um = fm(x1, x2,…,
xn), (m = 1, 2,…, n), and if $\frac{\partial(u_1,u_2,\dots,u_n)}{\partial(x_1,x_2,\dots,x_n)}=0$, then if all the differential coefficients concerned
are continuous, there exists a functional relation connecting some or all of the variables u1, u2,…,
un which is independent of x1, x2,…, xn.
Proof. First we prove the theorem when n = 2. We have u = f(x, y), v = g (x, y) and
$$\begin{vmatrix}\dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y}\\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y}\end{vmatrix}=0.$$
If v does not depend on y, then $\frac{\partial v}{\partial y}=0$ and so either $\frac{\partial u}{\partial y}=0$ or else $\frac{\partial v}{\partial x}=0$. In the former case
u and v are functions of x only, and the functional relation sought is obtained from
u = f(x), v = g(x)
by regarding x as a function of v and substituting in u = f(x). In the latter case v is a constant,
and the functional relation is
v = a
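If v does depend on y, since $\frac{\partial v}{\partial y}\ne0$ the equation v = g(x, y) defines y as a function of x and v,
say
y = ψ (x, v),
and on substituting in the other equation we get an equation of the form
u = F(x, v).
(The function F [x, g (x, y)] is the same function of x and y as f(x, y).)
Then
$$0=\begin{vmatrix}\dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y}\\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y}\end{vmatrix}
=\begin{vmatrix}\dfrac{\partial F}{\partial x}+\dfrac{\partial F}{\partial v}\dfrac{\partial v}{\partial x} & \dfrac{\partial F}{\partial v}\dfrac{\partial v}{\partial y}\\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y}\end{vmatrix}
=\begin{vmatrix}\dfrac{\partial F}{\partial x} & 0\\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y}\end{vmatrix}
=\frac{\partial F}{\partial x}\,\frac{\partial v}{\partial y}$$
(obtained on multiplying the second row by $\frac{\partial F}{\partial v}$ and subtracting from the first) and so, either
$\frac{\partial v}{\partial y}=0$, which is contrary to hypothesis, or else $\frac{\partial F}{\partial x}=0$, so that F is a function of v only; hence
the functional relation is
u = F(v)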
Now assume that the theorem holds for n−1.
Now un must involve one of the variables at least, for if not there is a functional relation un = a.
Let one such variable be called xn. Since $\frac{\partial u_n}{\partial x_n}\ne0$ we can solve the equation
un = fn (x1, x2,…, xn)
for xn in terms of x1, x2,…, xn−1 and un, and on substituting this value in each of the other
equations we get n−1 equations of the form
(2.10.5) ur = gr (x1, x2,…, xn−1, un), (r = 1, 2,…, n−1)
If now we substitute fn (x1, x2,…, xn) for un the functions gr (x1, x2,…, xn−1, un) become
fr (x1, x2,…, xn−1, xn), (r = 1, 2,…, n−1)
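Then
$$0=\begin{vmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n}\\[2mm]
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n}\\
\vdots & \vdots & & \vdots\\
\dfrac{\partial f_n}{\partial x_1} & \dfrac{\partial f_n}{\partial x_2} & \cdots & \dfrac{\partial f_n}{\partial x_n}
\end{vmatrix}
=\begin{vmatrix}
\dfrac{\partial g_1}{\partial x_1} & \cdots & \dfrac{\partial g_1}{\partial x_{n-1}} & 0\\[2mm]
\dfrac{\partial g_2}{\partial x_1} & \cdots & \dfrac{\partial g_2}{\partial x_{n-1}} & 0\\
\vdots & & \vdots & \vdots\\
\dfrac{\partial u_n}{\partial x_1} & \cdots & \dfrac{\partial u_n}{\partial x_{n-1}} & \dfrac{\partial u_n}{\partial x_n}
\end{vmatrix}$$
by subtracting the elements of the last row multiplied by
$$\frac{\partial g_1}{\partial u_n},\ \frac{\partial g_2}{\partial u_n},\ \dots,\ \frac{\partial g_{n-1}}{\partial u_n}$$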
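$$\frac{\partial(U,V)}{\partial(u,v)}\cdot\frac{\partial(u,v)}{\partial(x,y)}
=\begin{vmatrix}\dfrac{\partial U}{\partial u} & \dfrac{\partial U}{\partial v}\\[2mm] \dfrac{\partial V}{\partial u} & \dfrac{\partial V}{\partial v}\end{vmatrix}
\times\begin{vmatrix}\dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y}\\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y}\end{vmatrix}$$
$$=\begin{vmatrix}
\dfrac{\partial U}{\partial u}\dfrac{\partial u}{\partial x}+\dfrac{\partial U}{\partial v}\dfrac{\partial v}{\partial x} & \dfrac{\partial U}{\partial u}\dfrac{\partial u}{\partial y}+\dfrac{\partial U}{\partial v}\dfrac{\partial v}{\partial y}\\[2mm]
\dfrac{\partial V}{\partial u}\dfrac{\partial u}{\partial x}+\dfrac{\partial V}{\partial v}\dfrac{\partial v}{\partial x} & \dfrac{\partial V}{\partial u}\dfrac{\partial u}{\partial y}+\dfrac{\partial V}{\partial v}\dfrac{\partial v}{\partial y}
\end{vmatrix}
=\begin{vmatrix}\dfrac{\partial U}{\partial x} & \dfrac{\partial U}{\partial y}\\[2mm] \dfrac{\partial V}{\partial x} & \dfrac{\partial V}{\partial y}\end{vmatrix}
=\frac{\partial(U,V)}{\partial(x,y)}$$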
The same method of proof applies if there are several functions and the same number of
variables.
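Lemma. If J is the Jacobian of the system u, v with regard to x, y and J′ the Jacobian of x, y with
regard to u, v, then J J′ = 1.
Proof. Let u = f(x, y) and v = F(x, y), and suppose that these are solved for x and y giving
x = φ(u, v) and y = ψ(u, v).
We then have, on differentiating u = f(x, y) with respect to u and v,
$$1=\frac{\partial u}{\partial x}\frac{\partial x}{\partial u}+\frac{\partial u}{\partial y}\frac{\partial y}{\partial u},\qquad
0=\frac{\partial u}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial u}{\partial y}\frac{\partial y}{\partial v},$$
and, on differentiating v = F(x, y) with respect to u and v,
$$0=\frac{\partial v}{\partial x}\frac{\partial x}{\partial u}+\frac{\partial v}{\partial y}\frac{\partial y}{\partial u},\qquad
1=\frac{\partial v}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial v}{\partial y}\frac{\partial y}{\partial v}.$$
Also, multiplying the determinants row by row,
$$JJ'=\begin{vmatrix}\dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y}\\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y}\end{vmatrix}
\times\begin{vmatrix}\dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u}\\[2mm] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v}\end{vmatrix}
=\begin{vmatrix}
\dfrac{\partial u}{\partial x}\dfrac{\partial x}{\partial u}+\dfrac{\partial u}{\partial y}\dfrac{\partial y}{\partial u} & \dfrac{\partial u}{\partial x}\dfrac{\partial x}{\partial v}+\dfrac{\partial u}{\partial y}\dfrac{\partial y}{\partial v}\\[2mm]
\dfrac{\partial v}{\partial x}\dfrac{\partial x}{\partial u}+\dfrac{\partial v}{\partial y}\dfrac{\partial y}{\partial u} & \dfrac{\partial v}{\partial x}\dfrac{\partial x}{\partial v}+\dfrac{\partial v}{\partial y}\dfrac{\partial y}{\partial v}
\end{vmatrix}
=\begin{vmatrix}1 & 0\\ 0 & 1\end{vmatrix}=1.$$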
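Remark (numerical illustration). The relation J J′ = 1 can be checked on polar coordinates, taking u = √(x2 + y2), v = arctan(y/x) and the inverse pair x = u cos v, y = u sin v. This choice of transformation, the helper names and the sample point are assumptions made here; both Jacobians are estimated by central differences.

```python
import math

def to_polar(x, y):
    """(x, y) -> (u, v) with u the radius and v the angle."""
    return (math.hypot(x, y), math.atan2(y, x))

def from_polar(u, v):
    """(u, v) -> (x, y), the inverse of to_polar away from the origin."""
    return (u * math.cos(v), u * math.sin(v))

def jacobian_2x2(mapping, p, q, d=1e-6):
    """Central-difference estimate of the Jacobian determinant of mapping at (p, q)."""
    a_p1, b_p1 = mapping(p + d, q)
    a_p0, b_p0 = mapping(p - d, q)
    a_q1, b_q1 = mapping(p, q + d)
    a_q0, b_q0 = mapping(p, q - d)
    da_dp = (a_p1 - a_p0) / (2 * d)
    db_dp = (b_p1 - b_p0) / (2 * d)
    da_dq = (a_q1 - a_q0) / (2 * d)
    db_dq = (b_q1 - b_q0) / (2 * d)
    return da_dp * db_dq - da_dq * db_dp

x, y = 1.2, 0.7
u, v = to_polar(x, y)
J = jacobian_2x2(to_polar, x, y)          # Jacobian of (u, v) with regard to (x, y)
J_prime = jacobian_2x2(from_polar, u, v)  # Jacobian of (x, y) with regard to (u, v)
product = J * J_prime
print(J, J_prime, product)
```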
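Example 15. If u = x + 2y + z, v = x − 2y + 3z,
w = 2xy − xz + 4yz − 2z²,
prove that
\[
\frac{\partial(u,v,w)}{\partial(x,y,z)} = 0,
\]
and find a relation between u, v, w.
Solution. We have
\[
\frac{\partial(u,v,w)}{\partial(x,y,z)}
= \begin{vmatrix}
\dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} & \dfrac{\partial u}{\partial z} \\
\dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} & \dfrac{\partial v}{\partial z} \\
\dfrac{\partial w}{\partial x} & \dfrac{\partial w}{\partial y} & \dfrac{\partial w}{\partial z}
\end{vmatrix}
= \begin{vmatrix}
1 & 2 & 1 \\
1 & -2 & 3 \\
2y - z & 2x + 4z & -x + 4y - 4z
\end{vmatrix}
\]
\[
= \begin{vmatrix}
1 & 0 & 0 \\
1 & -4 & 2 \\
2y - z & 2x + 6z - 4y & -x + 2y - 3z
\end{vmatrix}
\qquad \text{(performing } c_2 - 2c_1 \text{ and } c_3 - c_1\text{)}
\]
\[
= \begin{vmatrix}
-4 & 2 \\
2x + 6z - 4y & -x + 2y - 3z
\end{vmatrix}
= \begin{vmatrix}
0 & 2 \\
0 & -x + 2y - 3z
\end{vmatrix}
\qquad \text{(performing } c_1 + 2c_2\text{)}
\]
\[
= 0.
\]
Hence a relation between u, v and w exists.
Now,
u + v = 2x + 4z
u − v = 4y − 2z
w = x(2y − z) + 2z(2y − z)
= (x + 2z)(2y − z)
4w = (u + v)(u − v)
4w = u² − v²
which is the required relation.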
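As a quick sanity check of Example 15, the following sketch evaluates the Jacobian determinant and the relation 4w = u² − v² at a few integer points, where the arithmetic is exact. The helper names and the choice of sample points are illustrative assumptions.

```python
import random

def uvw(x, y, z):
    """The three functions of Example 15."""
    u = x + 2 * y + z
    v = x - 2 * y + 3 * z
    w = 2 * x * y - x * z + 4 * y * z - 2 * z ** 2
    return u, v, w

def jacobian_det(x, y, z):
    """Determinant of the matrix of partial derivatives of (u, v, w)."""
    a, b, c = 1, 2, 1
    d, e, f = 1, -2, 3
    g, h, i = 2 * y - z, 2 * x + 4 * z, -x + 4 * y - 4 * z
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

random.seed(0)
points = [tuple(random.randint(-9, 9) for _ in range(3)) for _ in range(20)]

dets = [jacobian_det(*p) for p in points]
residuals = []
for p in points:
    u, v, w = uvw(*p)
    residuals.append(4 * w - (u * u - v * v))  # zero when 4w = u^2 - v^2
```

Every entry of dets and residuals is exactly zero, in agreement with the vanishing Jacobian and the relation found above.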
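Example 16. Find the condition that the expressions px + qy + rz, p′x + q′y + r′z are connected
with the expression ax² + by² + cz² + 2fyz + 2gzx + 2hxy by a functional relation.
Solution. Let
u = px + qy + rz
v = p′x + q′y + r′z
w = ax² + by² + cz² + 2fyz + 2gzx + 2hxy
We know that the required condition is
\[
\frac{\partial(u,v,w)}{\partial(x,y,z)} = 0,
\]
that is,
\[
\begin{vmatrix}
\dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} & \dfrac{\partial u}{\partial z} \\
\dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} & \dfrac{\partial v}{\partial z} \\
\dfrac{\partial w}{\partial x} & \dfrac{\partial w}{\partial y} & \dfrac{\partial w}{\partial z}
\end{vmatrix} = 0.
\]
But
\[
\frac{\partial u}{\partial x} = p,\quad \frac{\partial u}{\partial y} = q,\quad \frac{\partial u}{\partial z} = r,
\qquad
\frac{\partial v}{\partial x} = p',\quad \frac{\partial v}{\partial y} = q',\quad \frac{\partial v}{\partial z} = r',
\]
\[
\frac{\partial w}{\partial x} = 2ax + 2hy + 2gz,\quad
\frac{\partial w}{\partial y} = 2hx + 2by + 2fz,\quad
\frac{\partial w}{\partial z} = 2gx + 2fy + 2cz.
\]
Therefore
\[
\begin{vmatrix}
p & q & r \\
p' & q' & r' \\
2ax + 2hy + 2gz & 2hx + 2by + 2fz & 2gx + 2fy + 2cz
\end{vmatrix} = 0.
\]
Since this must hold for all x, y, z, the coefficients of x, y and z vanish separately, so that
\[
\begin{vmatrix} p & q & r \\ p' & q' & r' \\ a & h & g \end{vmatrix} = 0,\qquad
\begin{vmatrix} p & q & r \\ p' & q' & r' \\ h & b & f \end{vmatrix} = 0,\qquad
\begin{vmatrix} p & q & r \\ p' & q' & r' \\ g & f & c \end{vmatrix} = 0,
\]
which is the required condition.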
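Example 17. Prove that if f(0) = 0 and f′(x) = 1/(1 + x²), then
\[
f(x) + f(y) = f\!\left(\frac{x+y}{1-xy}\right).
\]
Solution. Suppose that
\[
u = f(x) + f(y), \qquad v = \frac{x+y}{1-xy}.
\]
Now
\[
J(u, v) = \begin{vmatrix}
\dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\
\dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y}
\end{vmatrix}
= \begin{vmatrix}
\dfrac{1}{1+x^2} & \dfrac{1}{1+y^2} \\
\dfrac{1+y^2}{(1-xy)^2} & \dfrac{1+x^2}{(1-xy)^2}
\end{vmatrix} = 0.
\]
Therefore u and v are connected by a functional relation.
Let u = φ(v), that is,
\[
f(x) + f(y) = \varphi\!\left(\frac{x+y}{1-xy}\right).
\]
Putting y = 0, we get
f(x) + f(0) = φ(x)
f(x) + 0 = φ(x), since f(0) = 0.
Hence φ = f and
\[
f(x) + f(y) = f\!\left(\frac{x+y}{1-xy}\right).
\]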
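The function with f(0) = 0 and f′(x) = 1/(1 + x²) is the inverse tangent, so the result of Example 17 is the addition formula for arctan. The sketch below checks it numerically at a few sample points; the points are an arbitrary choice made here and are restricted to xy < 1, where the formula holds without a correction term of ±π.

```python
import math

def f(x):
    # f(0) = 0 and f'(x) = 1 / (1 + x^2), i.e. f = arctan
    return math.atan(x)

def lhs(x, y):
    return f(x) + f(y)

def rhs(x, y):
    return f((x + y) / (1 - x * y))

# sample points with x * y < 1 (assumed domain for this check)
samples = [(0.2, 0.3), (-0.5, 0.4), (0.9, 0.9), (-0.7, -0.6), (0.0, 0.8)]
max_gap = max(abs(lhs(x, y) - rhs(x, y)) for x, y in samples)
```

The largest discrepancy max_gap is at the level of floating-point rounding.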
Example. 18. The roots of the equation in λ
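\[
\gamma+\alpha = -(b+c)\frac{\partial x}{\partial \beta} - (c+a)\frac{\partial y}{\partial \beta} - (a+b)\frac{\partial z}{\partial \beta}
\]
\[
\alpha+\beta = -(b+c)\frac{\partial x}{\partial \gamma} - (c+a)\frac{\partial y}{\partial \gamma} - (a+b)\frac{\partial z}{\partial \gamma}
\]
\[
\beta\gamma = bc\,\frac{\partial x}{\partial \alpha} + ca\,\frac{\partial y}{\partial \alpha} + ab\,\frac{\partial z}{\partial \alpha}
\]
\[
\gamma\alpha = bc\,\frac{\partial x}{\partial \beta} + ca\,\frac{\partial y}{\partial \beta} + ab\,\frac{\partial z}{\partial \beta}
\]
\[
\alpha\beta = bc\,\frac{\partial x}{\partial \gamma} + ca\,\frac{\partial y}{\partial \gamma} + ab\,\frac{\partial z}{\partial \gamma}
\]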
∂x ∂y ∂z
∂ ∂ ∂ 1 1 1
∂x ∂y ∂z
Now, − ( b + c) − ( c + a ) − (a + b )
∂ ∂ ∂
∂x ∂y ∂z bc ca ab
∂ ∂ ∂
1 1 1
= + + +
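Hence
\[
\frac{\partial(x,y,z)}{\partial(\alpha,\beta,\gamma)}\,(b-c)(c-a)(a-b) = -(\alpha-\beta)(\beta-\gamma)(\gamma-\alpha),
\]
\[
\frac{\partial(x,y,z)}{\partial(\alpha,\beta,\gamma)} = -\frac{(\alpha-\beta)(\beta-\gamma)(\gamma-\alpha)}{(b-c)(c-a)(a-b)}.
\]
Second Method. After the step (*) let
a + b + c − (x + y + z) = ξ, ab + bc + ca − x(b + c) − y(c + a) − z(a + b) = η,
abc − bcx − cay − abz = ζ (1)
α + β + γ = −ξ, αβ + βγ + γα = η, αβγ = −ζ (2)
then
\[
\frac{\partial(\xi,\eta,\zeta)}{\partial(x,y,z)} =
\begin{vmatrix}
-1 & -1 & -1 \\
-(b+c) & -(c+a) & -(a+b) \\
-bc & -ca & -ab
\end{vmatrix}
\quad\text{and}\quad
\frac{\partial(\xi,\eta,\zeta)}{\partial(\alpha,\beta,\gamma)} =
\begin{vmatrix}
-1 & -1 & -1 \\
\beta+\gamma & \gamma+\alpha & \alpha+\beta \\
-\beta\gamma & -\gamma\alpha & -\alpha\beta
\end{vmatrix}
\]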
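Solution. Here
\[
\frac{\partial(U,V,W)}{\partial(x,y,z)}
= \begin{vmatrix}
\dfrac{\partial U}{\partial x} & \dfrac{\partial U}{\partial y} & \dfrac{\partial U}{\partial z} \\
\dfrac{\partial V}{\partial x} & \dfrac{\partial V}{\partial y} & \dfrac{\partial V}{\partial z} \\
\dfrac{\partial W}{\partial x} & \dfrac{\partial W}{\partial y} & \dfrac{\partial W}{\partial z}
\end{vmatrix}
= \begin{vmatrix}
1 & 1 & -1 \\
1 & -1 & 1 \\
2x & 2(y-z) & 2(z-y)
\end{vmatrix}
= \begin{vmatrix}
1 & 1 & 0 \\
1 & -1 & 0 \\
2x & 2(y-z) & 0
\end{vmatrix} = 0.
\]
Hence there exists some functional relation between U, V and W.
Moreover,
U + V = 2x
U − V = 2(y − z)
and (U + V)² + (U − V)² = 4(x² + y² + z² − 2yz)
= 4W
which is the required functional relation.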
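Example 21. Let V be a function of the two variables x and y. Transform the expression
\[
\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2}
\]
by the formulae of plane polar transformation
x = r cos θ, y = r sin θ.
Solution. We are given a function V of x and y, which is therefore also a function of
r and θ. From x = r cos θ and y = r sin θ, we have
\[
r = \sqrt{x^2 + y^2}, \qquad \theta = \tan^{-1}(y/x).
\]
Now
\[
\frac{\partial V}{\partial x} = \frac{\partial V}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial V}{\partial \theta}\frac{\partial \theta}{\partial x}
= \cos\theta\,\frac{\partial V}{\partial r} - \frac{\sin\theta}{r}\frac{\partial V}{\partial \theta}
\qquad \left(\text{since } \frac{\partial r}{\partial x} = \cos\theta,\ \frac{\partial \theta}{\partial x} = -\frac{\sin\theta}{r}\right)
\]
and
\[
\frac{\partial V}{\partial y} = \frac{\partial V}{\partial r}\frac{\partial r}{\partial y} + \frac{\partial V}{\partial \theta}\frac{\partial \theta}{\partial y}
= \sin\theta\,\frac{\partial V}{\partial r} + \frac{\cos\theta}{r}\frac{\partial V}{\partial \theta}
\qquad \left(\text{since } \frac{\partial r}{\partial y} = \sin\theta,\ \frac{\partial \theta}{\partial y} = \frac{\cos\theta}{r}\right).
\]
Therefore
\[
\frac{\partial}{\partial x} = \cos\theta\,\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\frac{\partial}{\partial \theta},
\qquad
\frac{\partial}{\partial y} = \sin\theta\,\frac{\partial}{\partial r} + \frac{\cos\theta}{r}\frac{\partial}{\partial \theta}.
\]
Hence
\[
\frac{\partial^2 V}{\partial x^2}
= \left(\cos\theta\,\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\frac{\partial}{\partial \theta}\right)
\left(\cos\theta\,\frac{\partial V}{\partial r} - \frac{\sin\theta}{r}\frac{\partial V}{\partial \theta}\right)
\]
\[
= \cos\theta\,\frac{\partial}{\partial r}\left(\cos\theta\,\frac{\partial V}{\partial r} - \frac{\sin\theta}{r}\frac{\partial V}{\partial \theta}\right)
- \frac{\sin\theta}{r}\frac{\partial}{\partial \theta}\left(\cos\theta\,\frac{\partial V}{\partial r} - \frac{\sin\theta}{r}\frac{\partial V}{\partial \theta}\right)
\]
\[
= \cos\theta\left[\cos\theta\,\frac{\partial^2 V}{\partial r^2} + \frac{\sin\theta}{r^2}\frac{\partial V}{\partial \theta} - \frac{\sin\theta}{r}\frac{\partial^2 V}{\partial r\,\partial\theta}\right]
- \frac{\sin\theta}{r}\left[-\sin\theta\,\frac{\partial V}{\partial r} + \cos\theta\,\frac{\partial^2 V}{\partial r\,\partial\theta} - \frac{\cos\theta}{r}\frac{\partial V}{\partial \theta} - \frac{\sin\theta}{r}\frac{\partial^2 V}{\partial \theta^2}\right]
\]
\[
= \cos^2\theta\,\frac{\partial^2 V}{\partial r^2} - \frac{2\sin\theta\cos\theta}{r}\frac{\partial^2 V}{\partial r\,\partial\theta} + \frac{\sin^2\theta}{r^2}\frac{\partial^2 V}{\partial \theta^2}
+ \frac{\sin^2\theta}{r}\frac{\partial V}{\partial r} + \frac{2\sin\theta\cos\theta}{r^2}\frac{\partial V}{\partial \theta} \qquad (1)
\]
and
\[
\frac{\partial^2 V}{\partial y^2}
= \left(\sin\theta\,\frac{\partial}{\partial r} + \frac{\cos\theta}{r}\frac{\partial}{\partial \theta}\right)
\left(\sin\theta\,\frac{\partial V}{\partial r} + \frac{\cos\theta}{r}\frac{\partial V}{\partial \theta}\right)
\]
\[
= \sin\theta\left[\sin\theta\,\frac{\partial^2 V}{\partial r^2} - \frac{\cos\theta}{r^2}\frac{\partial V}{\partial \theta} + \frac{\cos\theta}{r}\frac{\partial^2 V}{\partial r\,\partial\theta}\right]
+ \frac{\cos\theta}{r}\left[\cos\theta\,\frac{\partial V}{\partial r} + \sin\theta\,\frac{\partial^2 V}{\partial r\,\partial\theta} - \frac{\sin\theta}{r}\frac{\partial V}{\partial \theta} + \frac{\cos\theta}{r}\frac{\partial^2 V}{\partial \theta^2}\right]
\]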
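The computation in Example 21 leads to the standard polar form of the expression, V_rr + (1/r) V_r + (1/r²) V_θθ. The sketch below is a numerical sanity check of that standard identity, not a derivation. The test function V and the point (r, θ) are arbitrary choices made here, and all derivatives are approximated by central differences.

```python
import math

def V(x, y):
    # arbitrary smooth test function (an assumption of this sketch)
    return x ** 3 * y + math.exp(x) * math.sin(y)

def W(r, t):
    # the same function expressed in polar coordinates
    return V(r * math.cos(t), r * math.sin(t))

def d1(g, a, h):
    """Central first difference of g at a."""
    return (g(a + h) - g(a - h)) / (2 * h)

def d2(g, a, h):
    """Central second difference of g at a."""
    return (g(a + h) - 2 * g(a) + g(a - h)) / h ** 2

def laplacian_cartesian(x, y, h=1e-4):
    return d2(lambda s: V(s, y), x, h) + d2(lambda s: V(x, s), y, h)

def laplacian_polar(r, t, h=1e-4):
    W_rr = d2(lambda s: W(s, t), r, h)
    W_r = d1(lambda s: W(s, t), r, h)
    W_tt = d2(lambda s: W(r, s), t, h)
    return W_rr + W_r / r + W_tt / r ** 2

r0, t0 = 1.3, 0.7
x0, y0 = r0 * math.cos(t0), r0 * math.sin(t0)
cart = laplacian_cartesian(x0, y0)
polar = laplacian_polar(r0, t0)
exact = 6 * x0 * y0  # exp(x) sin(y) is harmonic, so only x^3 y contributes
```

The two estimates cart and polar agree with each other and with the exact value 6xy up to the finite-difference error.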
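\[
r\frac{\partial V}{\partial r} = x\frac{\partial V}{\partial x} + y\frac{\partial V}{\partial y}
= \left(x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}\right)V,
\]
\[
r\frac{\partial}{\partial r} = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}.
\]
Similarly
\[
\frac{\partial}{\partial \theta} = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}.
\]
Now
\[
\frac{\partial Z}{\partial x} = \frac{\partial Z}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial Z}{\partial \theta}\frac{\partial \theta}{\partial x}
= \cos\theta\,\frac{\partial Z}{\partial r} - \frac{\sin\theta}{r}\frac{\partial Z}{\partial \theta} \qquad (1)
\]
\[
\frac{\partial Z}{\partial y} = \sin\theta\,\frac{\partial Z}{\partial r} + \frac{\cos\theta}{r}\frac{\partial Z}{\partial \theta}. \qquad (2)
\]
Therefore
\[
\left(\frac{\partial Z}{\partial x}\right)^2 + \left(\frac{\partial Z}{\partial y}\right)^2
= \left(\frac{\partial Z}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial Z}{\partial \theta}\right)^2
\]
and the given expression is equal to
\[
r^2\left(\frac{\partial Z}{\partial r}\right)^2 + (a^2 - r^2)\left[\left(\frac{\partial Z}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial Z}{\partial \theta}\right)^2\right]
= a^2\left(\frac{\partial Z}{\partial r}\right)^2 + \left(\frac{a^2}{r^2} - 1\right)\left(\frac{\partial Z}{\partial \theta}\right)^2.
\]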
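\[
(x^2 - y^2)\left(\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2}\right) + 4xy\,\frac{\partial^2 u}{\partial x\,\partial y}
= r^2\frac{\partial^2 u}{\partial r^2} - r\frac{\partial u}{\partial r} - \frac{\partial^2 u}{\partial \theta^2}
\]
\[
\frac{\partial u}{\partial r} = \frac{\partial u}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial r}
= \cos\theta\,\frac{\partial u}{\partial x} + \sin\theta\,\frac{\partial u}{\partial y}
= \frac{x}{r}\frac{\partial u}{\partial x} + \frac{y}{r}\frac{\partial u}{\partial y}
\]
\[
r\frac{\partial u}{\partial r} = x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y} \qquad (1)
\]
Therefore
\[
r\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right)
= \left(x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}\right)\left(x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y}\right)
\]
\[
= x\frac{\partial}{\partial x}\left(x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y}\right)
+ y\frac{\partial}{\partial y}\left(x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y}\right)
\]
\[
= x^2\frac{\partial^2 u}{\partial x^2} + xy\frac{\partial^2 u}{\partial x\,\partial y} + xy\frac{\partial^2 u}{\partial y\,\partial x} + y^2\frac{\partial^2 u}{\partial y^2}
+ x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y}.
\]
Therefore
\[
r^2\frac{\partial^2 u}{\partial r^2} + r\frac{\partial u}{\partial r}
= x^2\frac{\partial^2 u}{\partial x^2} + 2xy\frac{\partial^2 u}{\partial x\,\partial y} + y^2\frac{\partial^2 u}{\partial y^2}
+ x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y} \qquad (2)
\]
\[
r^2\frac{\partial^2 u}{\partial r^2}
= x^2\frac{\partial^2 u}{\partial x^2} + 2xy\frac{\partial^2 u}{\partial x\,\partial y} + y^2\frac{\partial^2 u}{\partial y^2}
\qquad \text{using (1).}
\]
Again,
\[
\frac{\partial u}{\partial \theta} = \frac{\partial u}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial \theta}
= x\frac{\partial u}{\partial y} - y\frac{\partial u}{\partial x}.
\]
Therefore
\[
\frac{\partial^2 u}{\partial \theta^2}
= \left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)\left(x\frac{\partial u}{\partial y} - y\frac{\partial u}{\partial x}\right)
= x\frac{\partial}{\partial y}\left(x\frac{\partial u}{\partial y} - y\frac{\partial u}{\partial x}\right)
- y\frac{\partial}{\partial x}\left(x\frac{\partial u}{\partial y} - y\frac{\partial u}{\partial x}\right)
\]
\[
= x^2\frac{\partial^2 u}{\partial y^2} - 2xy\frac{\partial^2 u}{\partial x\,\partial y} + y^2\frac{\partial^2 u}{\partial x^2}
- x\frac{\partial u}{\partial x} - y\frac{\partial u}{\partial y} \qquad (3)
\]
From (1), (2) and (3) we get the required result.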
Example 24. If x = r cos θ, y = r sin θ, show that
\[
\frac{\partial^2 \theta}{\partial x\,\partial y} = -r^{-2}\cos 2\theta.
\]
Solution. We have
x = r cos θ, y = r sin θ
3
where m_i and M_i are the bounds of f defined above. The sums L(P, f, α) and U(P, f, α) are
respectively called the Lower Stieltjes sum and the Upper Stieltjes sum corresponding to the partition P.
We further define
\[
\underline{\int_a^b} f\,d\alpha = \operatorname{lub} L(P, f, \alpha),
\qquad
\overline{\int_a^b} f\,d\alpha = \operatorname{glb} U(P, f, \alpha),
\]
where lub and glb are taken over all possible partitions P of [a, b]. Then $\underline{\int_a^b} f\,d\alpha$ and $\overline{\int_a^b} f\,d\alpha$ are
respectively called the Lower integral and the Upper integral of f with respect to α.
If the lower and upper integrals are equal, then their common value, denoted by $\int_a^b f\,d\alpha$, is called
the Riemann-Stieltjes integral of f with respect to α over [a, b], and in that case we say that f
is integrable with respect to α in the Riemann sense and we write f ∈ ℜ(α).
The functions f and α are known as the integrand and the integrator respectively.
In the special case when α(x) = x, the Riemann-Stieltjes integral reduces to the Riemann
integral. In such a case we write L(P, f), U(P, f), $\underline{\int_a^b} f$, $\overline{\int_a^b} f$ and f ∈ ℜ respectively in place of
L(P, f, α), U(P, f, α), $\underline{\int_a^b} f\,d\alpha$, $\overline{\int_a^b} f\,d\alpha$ and f ∈ ℜ(α).
Clearly, the numerical value of $\int_a^b f\,d\alpha$ depends only on f, α, a and b and does not depend on
the symbol x. In fact x is a "dummy variable" and may be replaced by any other convenient
symbol.
3.3. In this section, we shall study characterization of upper and lower Stieltjes sums, and
upper and lower Stieltjes integrals.
The next theorem shows that for increasing function α, refinement of the partition increases the
lower sums and decreases the upper sums.
Theorem 1. If P* is a refinement of a partition P of [a, b], then
L(P, f, α) ≤ L(P*, f, α) and
U(P*, f, α) ≤ U(P, f, α).
Proof. Suppose first that P* contains exactly one more point than the partition P of [a, b]. Let
this point be x* and let it lie in the subinterval [x_{i−1}, x_i]. Let
w1 = glb f(x) (x_{i−1} ≤ x ≤ x*)
w2 = glb f(x) (x* ≤ x ≤ x_i)
Then w1 ≥ m_i and w2 ≥ m_i, where
m_i = glb f(x) (x_{i−1} ≤ x ≤ x_i)
Hence
L(P*, f, α) − L(P, f, α) = w1[α(x*) − α(x_{i−1})] + w2[α(x_i) − α(x*)] − m_i[α(x_i) − α(x_{i−1})]
= (w1 − m_i)[α(x*) − α(x_{i−1})] + (w2 − m_i)[α(x_i) − α(x*)]
≥ 0,
since α is increasing. Hence L(P*, f, α) ≥ L(P, f, α).
If P* contains k points more than P, we repeat the above reasoning k times.
The proof for U(P*, f, α) ≤ U(P, f, α) is analogous.
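The effect of refinement described in Theorem 1 can be observed on a small example. The sketch below is illustrative only: it assumes an increasing integrand, so that m_i = f(x_{i−1}) and M_i = f(x_i), and uses the sample pair f(x) = x², α(x) = x³ on [0, 1], which is a choice made here. For this pair the integral equals ∫ 3x⁴ dx = 3/5.

```python
def stieltjes_sums(f, alpha, partition):
    """Lower and upper Stieltjes sums for an INCREASING f on the given partition."""
    lower = 0.0
    upper = 0.0
    for left, right in zip(partition, partition[1:]):
        d_alpha = alpha(right) - alpha(left)
        lower += f(left) * d_alpha    # m_i = f(x_{i-1}) because f is increasing
        upper += f(right) * d_alpha   # M_i = f(x_i)
    return lower, upper

def uniform_partition(a, b, n):
    return [a + (b - a) * i / n for i in range(n + 1)]

f = lambda x: x * x
alpha = lambda x: x ** 3

P = uniform_partition(0.0, 1.0, 10)
P_star = uniform_partition(0.0, 1.0, 20)   # contains every point of P

L_P, U_P = stieltjes_sums(f, alpha, P)
L_star, U_star = stieltjes_sums(f, alpha, P_star)
```

Passing from P to the refinement P* raises the lower sum and lowers the upper sum, and both pairs enclose the value 3/5.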
Theorem 2. If α is monotonically increasing on [a, b], then for any two partitions P1 and P2, we
have
L(P1, f, α) ≤ U(P2, f, α)
Proof. Let P be the common refinement of P1 and P2, that is, P = P1 ∪ P2. Then we have, using
Theorem 1,
L(P1, f, α) ≤ L(P, f, α) ≤ U(P, f, α) ≤ U(P2, f, α).
Remark. It also follows from this theorem that
m[α(b) − α(a)] ≤ L(P1, f, α) ≤ U(P2, f, α) ≤ M[α(b) − α(a)],
where m and M are, as usual, the inf and sup of f on [a, b].
Theorem 3. If α is increasing on [a, b], then
\[
\underline{\int_a^b} f\,d\alpha \le \overline{\int_a^b} f\,d\alpha.
\]
Proof. Let P* be the common refinement of two partitions P1 and P2. Then, by Theorem 1,
L(P1, f, α) ≤ L(P*, f, α) ≤ U(P*, f, α) ≤ U(P2, f, α)
Hence
L(P1, f, α) ≤ U(P2, f, α)
We keep P2 fixed, and take lub over all P1. We obtain
\[
\underline{\int_a^b} f\,d\alpha \le U(P_2, f, \alpha).
\]
Taking glb over all P2, we get
\[
\underline{\int_a^b} f\,d\alpha \le \overline{\int_a^b} f\,d\alpha.
\]
\[
= 0,
\]
\[
U(P, f, \alpha) = \sum_{i=1}^{n} M_i\,\Delta x_i = \sum_{i=1}^{n} (x_i - x_{i-1}) = x_n - x_0 = 1 - 0 = 1.
\]
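Theorem 4. Let α be monotonically increasing on [a, b]. Then f ∈ ℜ(α) if and only if for every ε > 0
there exists a partition P such that
U(P, f, α) − L(P, f, α) < ε.
Proof. Suppose first that for every ε > 0 there is a partition P with
U(P, f, α) − L(P, f, α) < ε.
This gives us
\[
\Big[U(P, f, \alpha) - \overline{\int_a^b} f\,d\alpha\Big] + \Big[\overline{\int_a^b} f\,d\alpha - \underline{\int_a^b} f\,d\alpha\Big] + \Big[\underline{\int_a^b} f\,d\alpha - L(P, f, \alpha)\Big] < \varepsilon.
\]
Since each one of the three numbers
\[
U(P, f, \alpha) - \overline{\int_a^b} f\,d\alpha,\qquad \overline{\int_a^b} f\,d\alpha - \underline{\int_a^b} f\,d\alpha,\qquad \underline{\int_a^b} f\,d\alpha - L(P, f, \alpha)
\]
is non-negative, we have
\[
0 \le \overline{\int_a^b} f\,d\alpha - \underline{\int_a^b} f\,d\alpha < \varepsilon.
\]
Since ε is an arbitrary positive number, the non-negative number $\overline{\int_a^b} f\,d\alpha - \underline{\int_a^b} f\,d\alpha$ is
less than every positive number and hence must be zero, which yields
\[
\underline{\int_a^b} f\,d\alpha = \overline{\int_a^b} f\,d\alpha
\]
and so f ∈ ℜ(α).
Conversely, suppose that f ∈ ℜ(α) and let ε > 0 be given. Then
\[
\underline{\int_a^b} f\,d\alpha = \overline{\int_a^b} f\,d\alpha = \int_a^b f\,d\alpha.
\]
Since the upper integral is the glb of the upper sums and the lower integral is the lub of the lower sums,
and since refinement does not increase upper sums or decrease lower sums (Theorem 1), there is a
partition P such that
\[
(3.3.3)\qquad U(P, f, \alpha) < \int_a^b f\,d\alpha + \frac{\varepsilon}{2}
\]
\[
(3.3.4)\qquad L(P, f, \alpha) > \int_a^b f\,d\alpha - \frac{\varepsilon}{2}
\]
Combining (3.3.3) and (3.3.4), we obtain
\[
\int_a^b f\,d\alpha - \frac{\varepsilon}{2} < L(P, f, \alpha) \le U(P, f, \alpha) < \int_a^b f\,d\alpha + \frac{\varepsilon}{2}
\]
which yields
U(P, f, α) − L(P, f, α) < ε.
This completes the proof of the theorem.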
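The criterion of Theorem 4 is constructive in simple cases: one can search for a partition that makes U − L small. The sketch below assumes an increasing integrand, so that M_i − m_i = f(x_i) − f(x_{i−1}), and doubles the number of points of a uniform partition until the gap drops below ε. The functions f(x) = x², α(x) = x³ and the helper names are illustrative choices.

```python
def gap(f, alpha, a, b, n):
    """U(P, f, alpha) - L(P, f, alpha) on the uniform partition with n parts (f increasing)."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum((f(r) - f(l)) * (alpha(r) - alpha(l)) for l, r in zip(xs, xs[1:]))

def partition_size_for(eps, f, alpha, a, b):
    """Smallest power of two n whose uniform partition has gap below eps."""
    n = 1
    while gap(f, alpha, a, b, n) >= eps:
        n *= 2
    return n

f = lambda x: x * x
alpha = lambda x: x ** 3
eps = 1e-3
n_found = partition_size_for(eps, f, alpha, 0.0, 1.0)
final_gap = gap(f, alpha, 0.0, 1.0, n_found)
```

The search stops because the gap is bounded by the largest oscillation of f on a subinterval times α(b) − α(a), and that bound tends to zero with the mesh.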
3.4. In this section, we shall discuss integrability of continuous and monotonic functions
along with properties of Riemann-Stieltjes integrals.
Theorem 5. If f is continuous on [a, b] and α is monotonically increasing on [a, b], then
(i) f ∈ ℜ(α)
(ii) to every ε > 0 there corresponds a δ > 0 such that
\[
\Big|\sum_{i=1}^{n} f(t_i)\,\Delta\alpha_i - \int_a^b f\,d\alpha\Big| < \varepsilon
\]
for every partition P of [a, b] with |P| < δ and for all t_i ∈ [x_{i−1}, x_i].
Proof. (i) Let ε > 0 and select η > 0 such that
(3.4.1) [α(b) − α(a)] η < ε
which is possible by monotonicity of α on [a, b]. Also f is continuous on the compact set [a, b].
Hence f is uniformly continuous on [a, b]. Therefore there exists a δ > 0 such that
(3.4.2) |f(x) − f(t)| < η whenever |x − t| < δ for all x ∈ [a, b], t ∈ [a, b].
Choose a partition P with |P| < δ. Then (3.4.2) implies
M_i − m_i ≤ η (i = 1, 2, ..., n)
Hence
\[
U(P, f, \alpha) - L(P, f, \alpha) = \sum_{i=1}^{n} M_i\,\Delta\alpha_i - \sum_{i=1}^{n} m_i\,\Delta\alpha_i
= \sum_{i=1}^{n} (M_i - m_i)\,\Delta\alpha_i \le \eta \sum_{i=1}^{n} \Delta\alpha_i
\]
\[
= \eta \sum_{i=1}^{n} [\alpha(x_i) - \alpha(x_{i-1})] = \eta\,[\alpha(b) - \alpha(a)] < \varepsilon \qquad \text{by (3.4.1)},
\]
which is a necessary and sufficient condition for f ∈ ℜ(α).
(ii) We have
\[
L(P, f, \alpha) \le \sum_{i=1}^{n} f(t_i)\,\Delta\alpha_i \le U(P, f, \alpha)
\]
and
\[
L(P, f, \alpha) \le \int_a^b f\,d\alpha \le U(P, f, \alpha).
\]
By the proof of (i), for each ε > 0 there exists δ > 0 such that for all partitions P with |P| < δ, we
have
U(P, f, α) − L(P, f, α) < ε
Thus
\[
\Big|\sum_{i=1}^{n} f(t_i)\,\Delta\alpha_i - \int_a^b f\,d\alpha\Big| \le U(P, f, \alpha) - L(P, f, \alpha) < \varepsilon.
\]
Thus for continuous functions f, $\lim_{|P|\to 0} \sum_{i=1}^{n} f(t_i)\,\Delta\alpha_i$ exists and is equal to $\int_a^b f\,d\alpha$.
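Part (ii) says that, for continuous f, the sums Σ f(t_i) Δα_i approach the integral no matter how the points t_i are chosen. The sketch below illustrates this with randomly placed t_i. The pair f = cos, α(x) = x² on [0, 1] is an assumption of this example, and the reference value 2(sin 1 + cos 1 − 1) comes from evaluating ∫ 2x cos x dx by parts.

```python
import math
import random

def rs_sum(f, alpha, partition, tags):
    """Riemann-Stieltjes sum: sum of f(t_i) * (alpha(x_i) - alpha(x_{i-1}))."""
    return sum(
        f(t) * (alpha(r) - alpha(l))
        for l, r, t in zip(partition, partition[1:], tags)
    )

random.seed(1)
n = 2000
xs = [i / n for i in range(n + 1)]
# one arbitrary point t_i in each subinterval [x_{i-1}, x_i]
tags = [random.uniform(l, r) for l, r in zip(xs, xs[1:])]

approx = rs_sum(math.cos, lambda x: x * x, xs, tags)
exact = 2 * (math.sin(1) + math.cos(1) - 1)
error = abs(approx - exact)
```

The error is bounded by U(P, f, α) − L(P, f, α), which here is at most 1/n because cos varies by at most 1/n on each subinterval and α(1) − α(0) = 1.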
Theorem 6. If f is monotonic on [a, b] and if α is both monotonic and continuous on [a, b], then
f ∈ ℜ(α).
Proof. Let ε be a given positive number. For any positive integer n, choose a partition P of [a, b]
such that
\[
\Delta\alpha_i = \frac{\alpha(b) - \alpha(a)}{n} \qquad (i = 1, 2, \ldots, n).
\]
This is possible since α is continuous and monotonic on [a, b] and so assumes every value
between its bounds α(a) and α(b). It is sufficient to prove the result for a monotonically increasing
function f, the proof for a monotonically decreasing function being analogous. The bounds of f in
[x_{i−1}, x_i] are then
m_i = f(x_{i−1}), M_i = f(x_i), i = 1, 2, ..., n.
Hence
\[
U(P, f, \alpha) - L(P, f, \alpha) = \sum_{i=1}^{n} (M_i - m_i)\,\Delta\alpha_i
= \frac{\alpha(b) - \alpha(a)}{n} \sum_{i=1}^{n} (M_i - m_i)
\]
\[
= \frac{\alpha(b) - \alpha(a)}{n} \sum_{i=1}^{n} [f(x_i) - f(x_{i-1})]
= \frac{\alpha(b) - \alpha(a)}{n}\,[f(b) - f(a)]
< \varepsilon \quad \text{for large } n.
\]
Hence f ∈ ℜ(α).
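Example. Let f be a function defined by
f(x*) = 1 and f(x) = 0 for x ≠ x*, a ≤ x* ≤ b.
Suppose α is increasing on [a, b] and is continuous at x*. Then f ∈ ℜ(α) over [a, b] and
$\int_a^b f\,d\alpha = 0$.
Solution. Let ε > 0. Since α is continuous at x*, there exists δ > 0 such that
\[
|\alpha(x) - \alpha(x^*)| < \frac{\varepsilon}{2} \quad \text{whenever } |x - x^*| < \delta.
\]
Again, since α is an increasing function,
\[
\alpha(x) - \alpha(x^*) < \frac{\varepsilon}{2} \quad \text{for } 0 < x - x^* < \delta
\]
and
\[
\alpha(x^*) - \alpha(x) < \frac{\varepsilon}{2} \quad \text{for } 0 < x^* - x < \delta.
\]
Let P = {x_0, x_1, ..., x_n} be a partition of [a, b] with |P| < δ and let x* ∈ [x_{i−1}, x_i]. Then
\[
\Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1})
= [\alpha(x_i) - \alpha(x^*)] + [\alpha(x^*) - \alpha(x_{i-1})]
< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\]
Therefore
\[
\sum_{i=1}^{n} f(t_i)\,\Delta\alpha_i =
\begin{cases}
0 & \text{if } t_i \ne x^*, \\
\Delta\alpha_i & \text{if } t_i = x^*,
\end{cases}
\]
that is,
\[
\Big|\sum_{i=1}^{n} f(t_i)\,\Delta\alpha_i - 0\Big| < \varepsilon.
\]
Hence
\[
\lim_{|P|\to 0} \sum_{i=1}^{n} f(t_i)\,\Delta\alpha_i = \int_a^b f\,d\alpha = 0,
\]
and so f ∈ ℜ(α) and $\int_a^b f\,d\alpha = 0$.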
Theorem 7. Let f1 ∈ ℜ(α) and f2 ∈ ℜ(α) on [a, b]. Then (f1 + f2) ∈ ℜ(α) and
\[
\int_a^b (f_1 + f_2)\,d\alpha = \int_a^b f_1\,d\alpha + \int_a^b f_2\,d\alpha.
\]
Proof. Let P = {a = x_0, x_1, ..., x_n = b} be any partition of [a, b]. Suppose further that M_i′, m_i′,
M_i″, m_i″ and M_i, m_i are the bounds of f1, f2 and f1 + f2 respectively in the subinterval [x_{i−1}, x_i]. If ξ1,
ξ2 ∈ [x_{i−1}, x_i], then
[f1(ξ2) + f2(ξ2)] − [f1(ξ1) + f2(ξ1)]
≤ |f1(ξ2) − f1(ξ1)| + |f2(ξ2) − f2(ξ1)|
≤ (M_i′ − m_i′) + (M_i″ − m_i″)
Therefore, since this holds for all ξ1, ξ2 ∈ [x_{i−1}, x_i], we have
(3.4.3) M_i − m_i ≤ (M_i′ − m_i′) + (M_i″ − m_i″)
Since f1, f2 ∈ ℜ(α), there exist partitions P1 and P2 of [a, b] such that
\[
(3.4.4)\qquad
U(P_1, f_1, \alpha) - L(P_1, f_1, \alpha) < \frac{\varepsilon}{2},
\qquad
U(P_2, f_2, \alpha) - L(P_2, f_2, \alpha) < \frac{\varepsilon}{2}.
\]
These inequalities hold if P1 and P2 are replaced by their common refinement P.
Thus using (3.4.3), we have for f = f1 + f2,
\[
U(P, f, \alpha) - L(P, f, \alpha) = \sum_{i=1}^{n} (M_i - m_i)\,\Delta\alpha_i
\le \sum_{i=1}^{n} (M_i' - m_i')\,\Delta\alpha_i + \sum_{i=1}^{n} (M_i'' - m_i'')\,\Delta\alpha_i
< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \quad \text{(using 3.4.4)}.
\]
Hence f = f1 + f2 ∈ ℜ(α).
Further, we note that
m_i′ + m_i″ ≤ m_i ≤ M_i ≤ M_i′ + M_i″
Multiplying by Δα_i and adding for i = 1, 2, ..., n, we get
(3.4.5) L(P, f1, α) + L(P, f2, α) ≤ L(P, f, α) ≤ U(P, f, α)
≤ U(P, f1, α) + U(P, f2, α)
Also
\[
(3.4.6)\qquad U(P, f_1, \alpha) < \int_a^b f_1\,d\alpha + \frac{\varepsilon}{2}
\]
\[
(3.4.7)\qquad U(P, f_2, \alpha) < \int_a^b f_2\,d\alpha + \frac{\varepsilon}{2}
\]
Combining (3.4.5), (3.4.6) and (3.4.7), we have
\[
\int_a^b f\,d\alpha \le U(P, f, \alpha) \le U(P, f_1, \alpha) + U(P, f_2, \alpha)
< \int_a^b f_1\,d\alpha + \int_a^b f_2\,d\alpha + \frac{\varepsilon}{2} + \frac{\varepsilon}{2}.
\]
Since ε is an arbitrary positive number, we have
\[
(3.4.8)\qquad \int_a^b f\,d\alpha \le \int_a^b f_1\,d\alpha + \int_a^b f_2\,d\alpha.
\]
Applying (3.4.8) to −f1 and −f2 gives the reverse inequality, and hence equality holds.
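Proof. Since f ∈ ℜ(α) and f ∈ ℜ(β), there exist partitions P1 and P2 such that
\[
U(P_1, f, \alpha) - L(P_1, f, \alpha) < \frac{\varepsilon}{2},
\qquad
U(P_2, f, \beta) - L(P_2, f, \beta) < \frac{\varepsilon}{2}.
\]
These inequalities hold if P1 and P2 are replaced by their common refinement P.
Also
Δ(α_i + β_i) = [α(x_i) − α(x_{i−1})] + [β(x_i) − β(x_{i−1})]
Hence, if M_i and m_i are the bounds of f in [x_{i−1}, x_i],
\[
U(P, f, \alpha+\beta) - L(P, f, \alpha+\beta) = \sum_{i=1}^{n} (M_i - m_i)\,\Delta(\alpha_i + \beta_i)
= \sum_{i=1}^{n} (M_i - m_i)\,\Delta\alpha_i + \sum_{i=1}^{n} (M_i - m_i)\,\Delta\beta_i
< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\]
Hence f ∈ ℜ(α + β).
Further
\[
U(P, f, \alpha) < \int_a^b f\,d\alpha + \frac{\varepsilon}{2},
\qquad
U(P, f, \beta) < \int_a^b f\,d\beta + \frac{\varepsilon}{2}
\]
and
\[
U(P, f, \alpha+\beta) = \sum M_i\,\Delta\alpha_i + \sum M_i\,\Delta\beta_i.
\]
Also, then
\[
\int_a^b f\,d(\alpha+\beta) \le U(P, f, \alpha+\beta) = U(P, f, \alpha) + U(P, f, \beta)
< \int_a^b f\,d\alpha + \frac{\varepsilon}{2} + \int_a^b f\,d\beta + \frac{\varepsilon}{2}
= \int_a^b f\,d\alpha + \int_a^b f\,d\beta + \varepsilon.
\]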
Theorem 9. If f ∈ ℜ(α) on [a, b], then f ∈ ℜ(α) on [a, c] and f ∈ ℜ(α) on [c, b], where c is a
point of [a, b], and
\[
\int_a^b f\,d\alpha = \int_a^c f\,d\alpha + \int_c^b f\,d\alpha.
\]
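\[
= c \sum_{i=1}^{n} M_i\,\Delta\alpha_i = c\,U(P, f, \alpha).
\]
Similarly
L(P, g, α) = c L(P, f, α)
Since f ∈ ℜ(α), for every ε > 0 there exists a partition P such that
\[
U(P, f, \alpha) - L(P, f, \alpha) < \frac{\varepsilon}{c}.
\]
Hence
\[
U(P, g, \alpha) - L(P, g, \alpha) = c\,[U(P, f, \alpha) - L(P, f, \alpha)] < c \cdot \frac{\varepsilon}{c} = \varepsilon.
\]
Hence g = cf ∈ ℜ(α).
Further, since $U(P, f, \alpha) < \int_a^b f\,d\alpha + \frac{\varepsilon}{2}$,
\[
\int_a^b g\,d\alpha \le U(P, g, \alpha) = c\,U(P, f, \alpha) < c\Big(\int_a^b f\,d\alpha + \frac{\varepsilon}{2}\Big).
\]
Since ε is arbitrary,
\[
\int_a^b g\,d\alpha \le c \int_a^b f\,d\alpha.
\]
Replacing f by −f, we get
\[
\int_a^b g\,d\alpha \ge c \int_a^b f\,d\alpha.
\]
Hence
\[
\int_a^b (cf)\,d\alpha = c \int_a^b f\,d\alpha.
\]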
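(ii) If M and m are bounds of f ∈ ℜ(α) on [a, b], then it follows that
\[
(3.4.16)\qquad m[\alpha(b) - \alpha(a)] \le \int_a^b f\,d\alpha \le M[\alpha(b) - \alpha(a)] \quad \text{for } b \ge a.
\]
In fact, if a = b, then (3.4.16) is trivial. If b > a, then for any partition P, we have
\[
m[\alpha(b) - \alpha(a)] \le \sum_{i=1}^{n} m_i\,\Delta\alpha_i = L(P, f, \alpha)
\le \int_a^b f\,d\alpha
\le U(P, f, \alpha) = \sum_{i=1}^{n} M_i\,\Delta\alpha_i
\le M[\alpha(b) - \alpha(a)]
\]
which yields
\[
(3.4.17)\qquad m[\alpha(b) - \alpha(a)] \le \int_a^b f\,d\alpha \le M[\alpha(b) - \alpha(a)].
\]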
Theorem 11. Suppose f ∈ ℜ(α) on [a, b], m ≤ f ≤ M, φ is continuous on [m, M] and h(x) =
φ[f(x)] on [a, b]. Then h ∈ ℜ(α) on [a, b].
Proof. Let ε > 0. Since φ is continuous on the closed and bounded interval [m, M], it is uniformly
continuous on [m, M]. Therefore there exists δ > 0 such that δ < ε and
|φ(s) − φ(t)| < ε if |s − t| ≤ δ, s, t ∈ [m, M].
Since f ∈ ℜ(α), there is a partition P = {x_0, x_1, ..., x_n} of [a, b] such that
(3.4.18) U(P, f, α) − L(P, f, α) < δ².
Let M_i, m_i and M_i*, m_i* be the lub and glb of f(x) and h(x) respectively in [x_{i−1}, x_i]. Divide the
numbers 1, 2, ..., n into two classes:
i ∈ A if M_i − m_i < δ
and
i ∈ B if M_i − m_i ≥ δ.
For i ∈ A, our choice of δ implies that M_i* − m_i* ≤ ε. Also, for i ∈ B, M_i* − m_i* ≤ 2k where
k = lub |φ(t)|, t ∈ [m, M]. Hence, using (3.4.18), we have
\[
(3.4.19)\qquad \delta \sum_{i\in B} \Delta\alpha_i \le \sum_{i\in B} (M_i - m_i)\,\Delta\alpha_i < \delta^2,
\]
so that $\sum_{i\in B} \Delta\alpha_i < \delta$. Therefore
\[
U(P, h, \alpha) - L(P, h, \alpha) = \sum_{i\in A} (M_i^* - m_i^*)\,\Delta\alpha_i + \sum_{i\in B} (M_i^* - m_i^*)\,\Delta\alpha_i
\le \varepsilon[\alpha(b) - \alpha(a)] + 2k\delta
< \varepsilon\,[\alpha(b) - \alpha(a) + 2k].
\]
Since ε was arbitrary, for every ε* > 0 there is a partition P with
U(P, h, α) − L(P, h, α) < ε*.
Hence h ∈ ℜ(α).
Theorem 12. If f ∈ ℜ(α) and g ∈ ℜ(α) on [a, b], then fg ∈ ℜ(α), |f| ∈ ℜ(α) and
\[
\Big|\int_a^b f\,d\alpha\Big| \le \int_a^b |f|\,d\alpha.
\]
Proof. Let φ be defined by φ(t) = t². Then h(x) = φ[f(x)] = f² ∈ ℜ(α) by Theorem 11.
Also
\[
fg = \frac{1}{4}\,[(f + g)^2 - (f - g)^2].
\]
Since f, g ∈ ℜ(α), f + g ∈ ℜ(α) and f − g ∈ ℜ(α). Then (f + g)² and (f − g)² ∈ ℜ(α), and so their
difference multiplied by 1/4 also belongs to ℜ(α), proving that fg ∈ ℜ(α).
If we take φ(t) = |t|, again Theorem 11 implies that |f| ∈ ℜ(α). We choose c = ±1 so that
\[
c \int_a^b f\,d\alpha \ge 0.
\]
Then
\[
\Big|\int_a^b f\,d\alpha\Big| = c \int_a^b f\,d\alpha = \int_a^b cf\,d\alpha \le \int_a^b |f|\,d\alpha
\]
because cf ≤ |f|.
3.5. Riemann-Stieltjes integral as limit of sums. In this section, we shall show that the Riemann-
Stieltjes integral ∫ f dα can be considered as the limit of a sequence of sums in which M_i, m_i
involved in the definition of ∫ f dα are replaced by values of f.
Definition. Let P = {a = x_0, x_1, ..., x_n = b} be a partition of [a, b] and let points t_1, t_2, ..., t_n be
such that t_i ∈ [x_{i−1}, x_i]. Then the sum
\[
S(P, f, \alpha) = \sum_{i=1}^{n} f(t_i)\,\Delta\alpha_i
\]
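Proof. Suppose $\lim_{|P|\to 0} S(P, f, \alpha)$ exists and is equal to A. Then given ε > 0 there exists a δ > 0
such that |P| < δ implies
\[
|S(P, f, \alpha) - A| < \frac{\varepsilon}{2}
\]
or
\[
(3.5.1)\qquad A - \frac{\varepsilon}{2} < S(P, f, \alpha) < A + \frac{\varepsilon}{2}.
\]
If we choose a partition P satisfying |P| < δ and if we allow the points t_i to range over [x_{i−1}, x_i],
taking lub and glb of the numbers S(P, f, α) obtained in this way, the relation (3.5.1) gives
\[
A - \frac{\varepsilon}{2} \le L(P, f, \alpha) \le U(P, f, \alpha) \le A + \frac{\varepsilon}{2}
\]
and so
\[
U(P, f, \alpha) - L(P, f, \alpha) \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,
\]
\[
A - \frac{\varepsilon}{2} \le \int_a^b f\,d\alpha \le A + \frac{\varepsilon}{2}.
\]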
Theorem 14. If
(ii) Let f ∈ ℜ(α), let α be continuous and let ε > 0. Then there exists a partition P* such that
\[
(3.5.2)\qquad U(P^*, f, \alpha) < \int_a^b f\,d\alpha + \frac{\varepsilon}{4}.
\]
Now, α being uniformly continuous, there exists δ1 > 0 such that for any partition P of [a, b] with
|P| < δ1, we have
\[
\Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}) < \frac{\varepsilon}{4Mn} \quad \text{for all } i,
\]
where M is a bound for |f| on [a, b] and n is the number of intervals into which [a, b] is divided
by P*. Consider the sum U(P, f, α). Those intervals of P which contain a point of P* in their
interior contribute no more than
\[
(3.5.3)\qquad (n - 1)\,\max \Delta\alpha_i \cdot M < \frac{(n-1)\,\varepsilon M}{4Mn} < \frac{\varepsilon}{4}.
\]
Then (3.5.2) and (3.5.3) yield
\[
(3.5.4)\qquad U(P, f, \alpha) < \int_a^b f\,d\alpha + \frac{\varepsilon}{2}
\]
for all P with |P| < δ1.
Similarly, we can show that there exists a δ2 > 0 such that
\[
(3.5.5)\qquad L(P, f, \alpha) > \int_a^b f\,d\alpha - \frac{\varepsilon}{2}
\]
for all P with |P| < δ2.
Taking δ = min(δ1, δ2), it follows that (3.5.4) and (3.5.5) hold for every P such that |P| < δ.
Since
L(P, f, α) ≤ S(P, f, α) ≤ U(P, f, α),
(3.5.4) and (3.5.5) yield
\[
S(P, f, \alpha) < \int_a^b f\,d\alpha + \frac{\varepsilon}{2}
\]
and
\[
S(P, f, \alpha) > \int_a^b f\,d\alpha - \frac{\varepsilon}{2}.
\]
Hence
\[
\Big|S(P, f, \alpha) - \int_a^b f\,d\alpha\Big| < \frac{\varepsilon}{2}
\]
for all P such that |P| < δ, and so
\[
\lim_{|P|\to 0} S(P, f, \alpha) = \int_a^b f\,d\alpha.
\]
3.6. Integration and Differentiation. In this section, we show that integration and
differentiation are inverse operations.
Definition. If f ∈ ℜ on [a, b], then the function F defined by
\[
F(t) = \int_a^t f(x)\,dx, \qquad t \in [a, b],
\]
is called the "Integral Function" of the function f.
Theorem 15. If f ∈ ℜ on [a, b], then the integral function F of f is continuous on [a, b].
Proof. We have
\[
F(t) = \int_a^t f(x)\,dx.
\]
Since f ∈ ℜ, it is bounded and therefore there exists a number M > 0 such that for all x in [a, b],
|f(x)| ≤ M.
Let ε be any positive number and c any point of [a, b]. Then
\[
F(c) = \int_a^c f(x)\,dx, \qquad F(c + h) = \int_a^{c+h} f(x)\,dx.
\]
Therefore
\[
|F(c + h) - F(c)| = \Big|\int_a^{c+h} f(x)\,dx - \int_a^c f(x)\,dx\Big|
= \Big|\int_c^{c+h} f(x)\,dx\Big|
\le M|h|
< \varepsilon \quad \text{if } |h| < \frac{\varepsilon}{M}.
\]
Thus |(c + h) − c| < δ = ε/M implies |F(c + h) − F(c)| < ε. Hence F is continuous at any point c
∈ [a, b] and so is continuous in the interval [a, b].
Theorem 16. If f is continuous on [a, b], then the integral function F is differentiable and
F′(x0) = f(x0), x0 ∈ [a, b].
Proof. Let f be continuous at x0 in [a, b]. Then for every ε > 0 there exists δ > 0 such that
(3.6.1) |f(t) − f(x0)| < ε
whenever |t − x0| < δ. Let x0 − δ < s ≤ x0 ≤ t < x0 + δ and a ≤ s < t ≤ b. Then
\[
\Big|\frac{F(t) - F(s)}{t - s} - f(x_0)\Big|
= \Big|\frac{1}{t - s}\int_s^t f(x)\,dx - f(x_0)\Big|
= \Big|\frac{1}{t - s}\int_s^t f(x)\,dx - \frac{1}{t - s}\int_s^t f(x_0)\,dx\Big|
\]
\[
= \Big|\frac{1}{t - s}\int_s^t [f(x) - f(x_0)]\,dx\Big|
\le \frac{1}{t - s}\int_s^t |f(x) - f(x_0)|\,dx < \varepsilon
\qquad \text{(using (3.6.1))}.
\]
Hence F′(x0) = f(x0). This completes the proof of the theorem.
Definition. A derivable function F such that F′ is equal to a given function f in [a, b] is called a
Primitive of f.
Thus the above theorem asserts that "Every continuous function f possesses a Primitive, viz. the
integral function $\int_a^t f(x)\,dx$".
Furthermore, the continuity of a function is not necessary for the existence of a primitive. In other
words, functions possessing a primitive are not necessarily continuous. For example, consider the
function f on [0, 1] defined by
\[
f(x) = \begin{cases}
2x \sin\dfrac{1}{x} - \cos\dfrac{1}{x}, & x \ne 0, \\
0, & x = 0.
\end{cases}
\]
It has the primitive
\[
F(x) = \begin{cases}
x^2 \sin\dfrac{1}{x}, & x \ne 0, \\
0, & x = 0.
\end{cases}
\]
Clearly F′(x) = f(x), but f(x) is not continuous at x = 0, i.e., f is not continuous in [0, 1].
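The behaviour of this example near 0 can be seen numerically. The sketch below is an illustration only: it evaluates the difference quotient of F at 0, which is h sin(1/h) and so tends to 0 = f(0), and then evaluates f along the points x_k = 1/(2πk), where cos(1/x_k) = 1 and f stays near −1. The particular values of h and k are arbitrary choices.

```python
import math

def F(x):
    return x * x * math.sin(1 / x) if x != 0 else 0.0

def f(x):
    return 2 * x * math.sin(1 / x) - math.cos(1 / x) if x != 0 else 0.0

# difference quotients (F(h) - F(0)) / h = h * sin(1/h), bounded by |h|
steps = (1e-2, 1e-4, 1e-6)
quotients = [(F(h) - F(0.0)) / h for h in steps]

# f along x_k = 1 / (2 pi k): the values stay near -1 although x_k -> 0
along_sequence = [f(1 / (2 * math.pi * k)) for k in (10, 100, 1000)]
```

So F′(0) exists and equals 0, while f(x_k) does not approach f(0) = 0, which is the discontinuity claimed above.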
Theorem 17 (Fundamental Theorem of the Integral Calculus). If f ∈ ℜ on [a, b] and if there
is a differentiable function F on [a, b] such that F′ = f, then
\[
\int_a^b f(x)\,dx = F(b) - F(a).
\]
Proof. Let P be a partition of [a, b]. By Lagrange's Mean Value Theorem, there exist points
t_i (i = 1, 2, ..., n) with x_{i−1} ≤ t_i ≤ x_i such that
F(x_i) − F(x_{i−1}) = (x_i − x_{i−1}) F′(t_i) = (x_i − x_{i−1}) f(t_i) (since F′ = f).
Further
\[
F(b) - F(a) = \sum_{i=1}^{n} [F(x_i) - F(x_{i-1})]
= \sum_{i=1}^{n} f(t_i)\,(x_i - x_{i-1})
= \sum_{i=1}^{n} f(t_i)\,\Delta x_i
\]
and the last sum tends to $\int_a^b f(x)\,dx$ as |P| → 0, by Theorem 13 taking α(x) = x. Hence
\[
\int_a^b f(x)\,dx = F(b) - F(a).
\]
The next theorem tells us that the symbol dα(x) can be replaced by α′(x) dx in the Riemann-
Stieltjes integral $\int_a^b f(x)\,d\alpha(x)$. This is the situation in which the Riemann-Stieltjes integral
reduces to the Riemann integral.
Theorem 18. If f ∈ ℜ and α′ ∈ ℜ on [a, b], then f ∈ ℜ(α) and
\[
\int_a^b f\,d\alpha = \int_a^b f(x)\,\alpha'(x)\,dx.
\]
Proof. Since f ∈ ℜ and α′ ∈ ℜ, it follows that their product fα′ ∈ ℜ. Let ε > 0 be given. Choose
M such that |f| ≤ M. Since fα′ ∈ ℜ and α′ ∈ ℜ, using Theorem 14(ii) with integrator x, we
have
\[
(3.6.2)\qquad \Big|\sum f(t_i)\,\alpha'(t_i)\,\Delta x_i - \int f\alpha'\Big| < \varepsilon
\]
if |P| < δ1 and x_{i−1} ≤ t_i ≤ x_i, and
\[
(3.6.3)\qquad \Big|\sum \alpha'(t_i)\,\Delta x_i - \int \alpha'\Big| < \varepsilon
\]
if |P| < δ2 and x_{i−1} ≤ t_i ≤ x_i. Letting t_i vary in (3.6.3), we have
\[
(3.6.4)\qquad \Big|\sum \alpha'(s_i)\,\Delta x_i - \int \alpha'\Big| < \varepsilon
\]
if |P| < δ2 and x_{i−1} ≤ s_i ≤ x_i. From (3.6.3) and (3.6.4) it follows that
\[
\Big|\sum \alpha'(t_i)\,\Delta x_i - \int \alpha' + \int \alpha' - \sum \alpha'(s_i)\,\Delta x_i\Big|
\le \Big|\sum \alpha'(t_i)\,\Delta x_i - \int \alpha'\Big| + \Big|\sum \alpha'(s_i)\,\Delta x_i - \int \alpha'\Big|
< \varepsilon + \varepsilon = 2\varepsilon.
\]
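\[
= 2\left[\frac{x^4}{4}\right]_0^2 = 8. \quad \text{Ans.}
\]
and
\[
\int_0^2 [x]\,d(x^2) = \int_0^2 [x]\,2x\,dx
= \int_0^1 [x]\,2x\,dx + \int_1^2 [x]\,2x\,dx
= 0 + \int_1^2 2x\,dx = 0 + 2\left[\frac{x^2}{2}\right]_1^2
= 0 + 3 = 3. \quad \text{Ans.}
\]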
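The second answer can be cross-checked directly from the definition, without passing to α′(x) dx. The sketch below forms a Riemann-Stieltjes sum for the integrand [x] and integrator x² on [0, 2]. Evaluating the integrand at midpoints and using 1000 equal subintervals are choices made for this illustration; with an even number of subintervals the jump of [x] at x = 1 falls on a partition point.

```python
import math

def rs_sum_midpoint(f, alpha, a, b, n):
    """Riemann-Stieltjes sum on n equal subintervals, f evaluated at midpoints."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(
        f((l + r) / 2) * (alpha(r) - alpha(l))
        for l, r in zip(xs, xs[1:])
    )

# integrand [x] (greatest integer not exceeding x), integrator x^2, interval [0, 2]
value = rs_sum_midpoint(math.floor, lambda x: x * x, 0.0, 2.0, 1000)
```

On [0, 1) the integrand is 0 and on (1, 2) it is 1, so the sum telescopes to α(2) − α(1) = 4 − 1 = 3, matching the answer above.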
We now establish a connection between the integrand and the integrator in a Riemann-Stieltjes
integral. We shall show that existence of ∫ f dα implies the existence of ∫ α df.
We recall that Abel's transformation (Partial Summation Formula) for sequences reads as
follows:
"Let <a_n> and <b_n> be two sequences and let A_n = a_0 + a_1 + ... + a_n (A_{−1} = 0). Then
\[
(3.6.7)\qquad \sum_{n=p}^{q} a_n b_n = \sum_{n=p}^{q-1} A_n (b_n - b_{n+1}) + A_q b_q - A_{p-1} b_p."
\]
Theorem (Integration by parts). If f ∈ ℜ(α) on [a, b], then α ∈ ℜ(f) on [a, b] and
\[
\int_a^b f(x)\,d\alpha(x) = f(b)\,\alpha(b) - f(a)\,\alpha(a) - \int_a^b \alpha(x)\,df(x).
\]
(Due to analogy with (3.6.7), the above expression is also known as the Partial Integration
Formula.)
Proof. Let P = {a = x_0, x_1, ..., x_n = b} be a partition of [a, b]. Choose t_1, t_2, ..., t_n such that
x_{i−1} ≤ t_i ≤ x_i and take t_0 = a, t_{n+1} = b. Suppose Q is the partition {t_0, t_1, ..., t_{n+1}} of [a, b]. By partial
summation, we have
\[
S(P, f, \alpha) = \sum_{i=1}^{n} f(t_i)\,[\alpha(x_i) - \alpha(x_{i-1})]
= f(b)\,\alpha(b) - f(a)\,\alpha(a) - \sum_{i=1}^{n+1} \alpha(x_{i-1})\,[f(t_i) - f(t_{i-1})].
\]
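The partial summation step is a purely algebraic identity, so it can be verified exactly (up to rounding) for any choice of partition and intermediate points. The sketch below does this for f = exp and α = sin on [0, 1] with randomly chosen points t_i; these functions, the partition and the helper name are illustrative assumptions.

```python
import math
import random

def both_sides(f, alpha, xs, tags):
    """Return S(P, f, alpha) and the right-hand side obtained by partial summation."""
    n = len(xs) - 1
    a, b = xs[0], xs[-1]
    t = [a] + list(tags) + [b]   # t_0 = a, t_1 .. t_n, t_{n+1} = b
    lhs = sum(f(t[i]) * (alpha(xs[i]) - alpha(xs[i - 1])) for i in range(1, n + 1))
    rhs = f(b) * alpha(b) - f(a) * alpha(a) - sum(
        alpha(xs[i - 1]) * (f(t[i]) - f(t[i - 1])) for i in range(1, n + 2)
    )
    return lhs, rhs

random.seed(2)
n = 50
xs = [i / n for i in range(n + 1)]
tags = [random.uniform(l, r) for l, r in zip(xs, xs[1:])]
lhs, rhs = both_sides(math.exp, math.sin, xs, tags)
```

The second sum on the right is a Riemann-Stieltjes sum S(Q, α, f) for the partition Q, with the points x_{i−1} playing the role of intermediate points.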
3.7. Mean Value Theorems For Riemann-Stieltjes Integrals. In this section, we establish
Mean Value Theorems, which are used to estimate the value of an integral rather than to find its
exact value.
Theorem 19 (First Mean Value Theorem for Riemann-Stieltjes Integral). If f is continuous
and real valued and α is monotonically increasing on [a, b], then there exists a point x in [a, b]
such that
\[
\int_a^b f\,d\alpha = f(x)\,[\alpha(b) - \alpha(a)].
\]
Proof. If α(a) = α(b), the theorem holds trivially, both sides being 0 in that case (α becomes
constant and so dα = 0). Hence we assume that α(a) < α(b). Let
M = lub f(x), m = glb f(x), a ≤ x ≤ b.
Then
m ≤ f(x) ≤ M
and so
\[
m[\alpha(b) - \alpha(a)] \le \int_a^b f\,d\alpha \le M[\alpha(b) - \alpha(a)].
\]
Hence there exists some c satisfying m ≤ c ≤ M such that
\[
\int_a^b f\,d\alpha = c\,[\alpha(b) - \alpha(a)].
\]
Since f is continuous, there is a point x ∈ [a, b] such that f(x) = c, and so we have
\[
\int_a^b f\,d\alpha = f(x)\,[\alpha(b) - \alpha(a)].
\]
This completes the proof of the theorem.
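The point x promised by Theorem 19 can be located numerically in a simple case. The sketch below assumes f(x) = x² and α(x) = x³ on [0, 1], for which Theorem 18 gives ∫ f dα = ∫ 3x⁴ dx = 3/5. Because this f is increasing, bisection finds the point where f takes the value c. The helper name and the tolerance are assumptions of the sketch.

```python
def bisect_increasing(f, target, lo, hi, tol=1e-12):
    """Find x in [lo, hi] with f(x) = target, assuming f is continuous and increasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

f = lambda x: x * x
alpha = lambda x: x ** 3
a, b = 0.0, 1.0

integral = 3 / 5                          # value of the integral of f d(alpha)
c = integral / (alpha(b) - alpha(a))      # the number c between m = 0 and M = 1
x_point = bisect_increasing(f, c, a, b)   # a point with f(x_point) = c
```

Here c = 0.6 lies between m = 0 and M = 1, and x_point is approximately 0.7746, the square root of 0.6.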
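Theorem 20 (Second Mean Value Theorem for Riemann-Stieltjes Integral). Let f be
monotonic and α real and continuous on [a, b]. Then there is a point x ∈ [a, b] such that
\[
\int_a^b f\,d\alpha = f(a)\,[\alpha(x) - \alpha(a)] + f(b)\,[\alpha(b) - \alpha(x)].
\]
Proof. By the Partial Integration Formula, we have
\[
\int_a^b f\,d\alpha = f(b)\,\alpha(b) - f(a)\,\alpha(a) - \int_a^b \alpha\,df.
\]
The use of the First Mean Value Theorem for the Riemann-Stieltjes integral yields that there is x in
[a, b] such that
\[
\int_a^b \alpha\,df = \alpha(x)\,[f(b) - f(a)].
\]
Substituting this into the Partial Integration Formula gives
\[
\int_a^b f\,d\alpha = f(b)\,\alpha(b) - f(a)\,\alpha(a) - \alpha(x)\,[f(b) - f(a)]
= f(a)\,[\alpha(x) - \alpha(a)] + f(b)\,[\alpha(b) - \alpha(x)].
\]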
Δx_i = x_i − x_{i−1}
= φ(y_i) − φ(y_{i−1})
= Δφ_i
Let c_i ∈ Δx_i and d_i ∈ Δy_i be any points with c_i = φ(d_i). Putting g(y) = f[φ(y)], we have
\[
(3.8.1)\qquad S(P, f) = \sum_{i=1}^{n} f(c_i)\,\Delta x_i
= \sum_{i} f(\varphi(d_i))\,\Delta\varphi_i
= \sum_{i} g(d_i)\,\Delta\varphi_i
= S(Q, g, \varphi).
\]
Continuity of f implies that $S(P, f) \to \int_a^b f(x)\,dx$ as |P| → 0, and continuity of g implies that
$S(Q, g, \varphi) \to \int_a^b g(y)\,d\varphi$ as |Q| → 0.
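Theorem 23. If f maps [a, b] into R^k and if f ∈ R(α) for some monotonically increasing
function α on [a, b], then |f| ∈ R(α) and
\[
\Big|\int_a^b f\,d\alpha\Big| \le \int_a^b |f|\,d\alpha.
\]
Proof. Let
f = (f_1, ..., f_k).
Then
|f| = (f_1² + ... + f_k²)^{1/2}
Since each f_i ∈ R(α), the function f_i² ∈ R(α) and so their sum f_1² + ... + f_k² ∈ R(α). Since the
square root function is continuous on [0, M] for every real M > 0, Theorem 11 shows that
|f| ∈ R(α).
To prove the inequality, put y = (y_1, ..., y_k), where y_i = ∫ f_i dα. Then
\[
y = \int f\,d\alpha
\]
and
\[
|y|^2 = \sum_i y_i^2 = \sum_i y_i \int f_i\,d\alpha = \int \Big(\sum_i y_i f_i\Big)\,d\alpha.
\]
By the Schwarz inequality, Σ y_i f_i(t) ≤ |y| |f(t)| for a ≤ t ≤ b, and hence
\[
(3.9.1)\qquad |y|^2 \le |y| \int |f|\,d\alpha.
\]
If y = 0 the result is trivial. Otherwise, dividing by |y|, we get
\[
|y| \le \int |f|\,d\alpha
\qquad\text{or}\qquad
\Big|\int_a^b f\,d\alpha\Big| \le \int_a^b |f|\,d\alpha.
\]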
3.10. Rectifiable Curves. The aim of this section is to consider application of the results studied in
this chapter to geometry.
Definition. A continuous mapping γ of an interval [a, b] into R^k is called a curve in R^k.
If γ : [a, b] → R^k is continuous and one-to-one, then it is called an arc.
If for a curve γ : [a, b] → R^k,
γ(a) = γ(b)
but
γ(t1) ≠ γ(t2)
for every other pair of distinct points t1, t2 in [a, b], then the curve γ is called a simple closed
curve.
Definition. Let f : [a, b] → R^k be a map. If P = {x_0, x_1, ..., x_n} is a partition of [a, b], then
\[
V(f, a, b) = \operatorname{lub} \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})|,
\]
where the lub is taken over all possible partitions of [a, b], is called the total variation of f on [a, b].
The function f is said to be of bounded variation on [a, b] if V(f, a, b) < +∞.
The ith term |γ(x_i) − γ(x_{i−1})| in this sum is the distance in R^k between the points γ(x_{i−1}) and γ(x_i).
Further, $\sum_{i=1}^{n} |\gamma(x_i) - \gamma(x_{i-1})|$ is the length of a polygon whose vertices are at the points γ(x_0),
γ(x_1), ..., γ(x_n). As the norm of our partition tends to zero, those polygons approach the range of
γ more and more closely.
Theorem 24. Let γ be a curve in R^k. If γ′ is continuous on [a, b], then γ is rectifiable and has
length
\[
\int_a^b |\gamma'(t)|\,dt.
\]
Proof. It is sufficient to show that ∫|γ′| = V(γ, a, b). So, let {x_0, ..., x_n} be a partition of [a, b].
Then
\[
\sum_{i=1}^{n} |\gamma(x_i) - \gamma(x_{i-1})|
= \sum_{i=1}^{n} \Big|\int_{x_{i-1}}^{x_i} \gamma'(t)\,dt\Big|
\le \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} |\gamma'(t)|\,dt
= \int_a^b |\gamma'(t)|\,dt.
\]
Thus
(3.10.1) V(γ, a, b) ≤ ∫|γ′|.
To prove the reverse inequality, let ε be a positive number. Since γ′ is uniformly continuous on
[a, b], there exists δ > 0 such that
|γ′(s) − γ′(t)| < ε, if |s − t| < δ.
If the mesh (norm) of the partition P is less than δ and x_{i−1} ≤ t ≤ x_i, then we have
|γ′(t)| ≤ |γ′(x_i)| + ε,
so that
\[
\int_{x_{i-1}}^{x_i} |\gamma'(t)|\,dt - \varepsilon\,\Delta x_i \le |\gamma'(x_i)|\,\Delta x_i
= \Big|\int_{x_{i-1}}^{x_i} [\gamma'(t) + \gamma'(x_i) - \gamma'(t)]\,dt\Big|
\le |\gamma(x_i) - \gamma(x_{i-1})| + \varepsilon\,\Delta x_i.
\]
Adding these inequalities for i = 1, 2, ..., n, we get
\[
\int_a^b |\gamma'(t)|\,dt \le \sum_{i=1}^{n} |\gamma(x_i) - \gamma(x_{i-1})| + 2\varepsilon(b - a)
\le V(\gamma, a, b) + 2\varepsilon(b - a).
\]
Since ε is arbitrary, it follows that
\[
(3.10.2)\qquad \int_a^b |\gamma'(t)|\,dt \le V(\gamma, a, b).
\]
Combining (3.10.1) and (3.10.2), we have
\[
\int_a^b |\gamma'(t)|\,dt = V(\gamma, a, b).
\]
Hence the length of γ is $\int_a^b |\gamma'(t)|\,dt$.
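Theorem 24 can be illustrated by comparing inscribed polygons with the integral. The sketch below uses the helix γ(t) = (cos t, sin t, t) on [0, 2π], an example chosen here because |γ′(t)| = √2 is constant, so the length is 2π√2. The polygon lengths are computed on uniform partitions with 10, 100 and 1000 subintervals.

```python
import math

def polygon_length(gamma, a, b, n):
    """Length of the polygon with vertices gamma(x_0), ..., gamma(x_n), uniform partition."""
    ts = [a + (b - a) * i / n for i in range(n + 1)]
    pts = [gamma(t) for t in ts]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def helix(t):
    return (math.cos(t), math.sin(t), t)

a, b = 0.0, 2 * math.pi
exact_length = 2 * math.pi * math.sqrt(2)   # integral of |gamma'(t)| = sqrt(2) over [a, b]

lengths = [polygon_length(helix, a, b, n) for n in (10, 100, 1000)]
```

Each polygon length stays below the integral, in line with (3.10.1), and the values increase towards it as the partition is refined.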
Definition. An extended real-valued set function µ defined on a class E of sets is called
countably additive if for every disjoint sequence {A_n} of sets in E whose union is also in E, we
have
\[
\mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} \mu(A_i).
\]
Definition. The length of an open set is defined to be the sum of the lengths of the open intervals of
which it is composed. Thus, if G is an open set, then
\[
l(G) = \sum_{n} l(I_n),
\]
where
\[
G = \bigcup_{n} I_n, \qquad I_{n_1} \cap I_{n_2} = \varphi \ \text{ if } n_1 \ne n_2.
\]
Definition. The Lebesgue Outer Measure, or simply the outer measure, m* of a set A is defined
as
\[
m^*(A) = \inf_{A \subseteq \cup I_n} \sum l(I_n),
\]
where the infimum is taken over all finite or countable collections of intervals {I_n} such that
A ⊆ ∪ I_n.
Since the lengths are positive numbers, it follows from the definition of m* that m*(A) ≥ 0 and
that m* φ = 0.
Further, if A is a singleton, then m* A = 0, and also if A ⊆ B, then m* A ≤ m* B.
Theorem 25. Outer measure is translation invariant.
Proof. Let A be a set. We shall show that m*(A) = m*(A + x),
where A + x = {y + x : y ∈ A}.
By the definition of outer measure, for ε > 0 there is a collection {I_n} of intervals such that
A ⊆ ∪ I_n and
\[
(3.11.1)\qquad m^*(A) \ge \sum l(I_n) - \varepsilon.
\]
Since a ∈ I, there exists an open interval (a_1, b_1) from the above mentioned finite number of
intervals such that a_1 < a < b_1. If b_1 ≤ b, then b_1 ∈ I. Since b_1 is not covered by the open interval
(a_1, b_1), there is an open interval (a_2, b_2) in the finite collection J_1, ..., J_p with a_2 < b_1 < b_2.
Continuing in this fashion we obtain a sequence
(a_1, b_1), (a_2, b_2), ..., (a_n, b_n)
in the collection J_1, J_2, ..., J_p satisfying
a_i < b_{i−1} < b_i
for every i = 2, ..., n.
Since the collection is finite, our process must terminate with an (a_n, b_n) satisfying b ∈ (a_n, b_n).
Then we have
\[
\sum_{n} l(I_n) \ge \sum_{i=1}^{n} l(a_i, b_i)
= (b_n - a_n) + (b_{n-1} - a_{n-1}) + \cdots + (b_2 - a_2) + (b_1 - a_1)
\]
\[
= b_n - (a_n - b_{n-1}) - \cdots - (a_2 - b_1) - a_1.
\]
Since each expression in the brackets is negative, it follows that
\[
\sum_{n} l(I_n) > b_n - a_1 > b - a.
\]
Hence the theorem is proved in this case.
Next, let I be any bounded interval with end points a and b. For every positive real number ∈, we
have
[a + ∈, b − ∈] ⊂ I ⊂ [a, b]
Therefore
b – a − 2∈ ≤ m* [a + ∈, b − ∈]
≤ m* I ≤ m* [a, b]
= b – a.
Since this holds for every ∈ > 0, we must have
m* I = b – a = l(I)
Finally, let I be unbounded. Then, for every real number r, I contains a bounded interval H of
length
l (H) ≥ r.
Therefore by the above result
m* I ≥ m* H = l(H) ≥ r.
Since this holds for every r ∈ R, we must have
m* I = ∞ = l(I)
This completes the proof of the theorem.
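Theorem 27. Let {A_n} be a countable collection of sets of real numbers. Then
\[
m^*\Big(\bigcup A_n\Big) \le \sum m^* A_n.
\]
Proof. If one of the sets A_n has infinite outer measure, the inequality holds trivially. So suppose
each m* A_n is finite. Then, given ε > 0, there exists a countable collection {I_{n,i}} of open intervals
such that A_n ⊂ ∪_i I_{n,i} and
\[
\sum_{i} l(I_{n,i}) < m^* A_n + \frac{\varepsilon}{2^n},
\]
by the definition of m* A_n.
Now the collection {I_{n,i}}_{n,i} = ∪_n {I_{n,i}}_i is countable, being the union of a countable number of
countable collections, and it covers ∪ A_n. Thus
\[
m^*\Big(\bigcup A_n\Big) \le \sum_{n,i} l(I_{n,i}) = \sum_{n} \sum_{i} l(I_{n,i})
< \sum_{n} \Big(m^* A_n + \frac{\varepsilon}{2^n}\Big)
= \sum_{n} m^* A_n + \sum_{n} \frac{\varepsilon}{2^n}
\]
\[
= \sum_{n} m^* A_n + \varepsilon \sum_{n} \frac{1}{2^n}
= \sum_{n} m^* A_n + \varepsilon.
\]
Since ε > 0 is arbitrary, the result follows.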
Cor 1. If A is countable, m* A = 0.
Proof. We know that a countable set is the union of a countable family of singletons. Therefore
A = ∪ {x_n}, which yields
m* A = m* [∪ {x_n}] ≤ Σ m* {x_n} (by the above theorem)
But, as already pointed out, the outer measure of a singleton is zero. Therefore it follows that
m* A ≤ 0
Since outer measure is always a non-negative real number, m* A = 0.
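The ε/2ⁿ device used in Theorem 27 gives a direct picture of Cor 1: the n-th point of a countable set can be covered by an interval of length ε/2ⁿ, and the total length stays below ε. The sketch below carries this out with exact rational arithmetic for finitely many rationals in [0, 1]. The list of points, the value of ε and the helper name are choices made for this illustration.

```python
from fractions import Fraction

def cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an open interval of length eps / 2**n."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = eps / 2 ** (n + 1)      # half of the length eps / 2**n
        intervals.append((x - half, x + half))
    return intervals

# an enumeration of some rationals in [0, 1] (repetitions are harmless)
points = [Fraction(p, q) for q in range(1, 8) for p in range(0, q + 1)]
eps = Fraction(1, 100)

intervals = cover(points, eps)
total_length = sum(hi - lo for lo, hi in intervals)
```

The total length equals ε(1 − 2^(−N)) for N points, which is less than ε however many points are enumerated, so the outer measure of the set is at most ε for every ε > 0.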
Cor 2. An interval of positive length is not countable.
Proof. We know that the outer measure of an interval I is equal to its length. Therefore it follows
from Cor 1 that an interval of positive length is not countable.
Cor 3. If m* A = 0, then m*(A ∪ B) = m* B.
Proof. Using the above theorem,
m*(A ∪ B) ≤ m* A + m* B
= 0 + m* B (i)
Also B ⊂ A ∪ B
Therefore m* B ≤ m*(A ∪ B) (ii)
From (i) and (ii) it follows that
m* B = m*(A ∪ B)
Note: Because of the property m*(∪ A_n) ≤ Σ m* A_n, the function m* is said to be countably
subadditive. It would be much better if m* were also countably additive, that is, if
m*(∪ A_n) = Σ m* A_n
for every countable collection {A_n} of disjoint sets of real numbers. If we insist on countable
additivity, we have to restrict the domain of the function m* to some subset m of the set 2^R of all
subsets of R. The members of m are called the measurable subsets of R. That is, to do so we
suitably reduce the family of sets on which m* is defined. This is done by using the following
definition due to Carathéodory.
Definition. A set E of real numbers is said to be m* measurable if for every set A ⊂ R we have
m* A = m*(A ∩ E) + m*(A ∩ E^c).
Since
A = (A ∩ E) ∪ (A ∩ E^c),
it follows from the subadditivity of m* that
m* A = m*[(A ∩ E) ∪ (A ∩ E^c)] ≤ m*(A ∩ E) + m*(A ∩ E^c).
Hence, the above definition reduces to:
A set E ⊂ R is measurable if and only if for every set A ⊂ R we have
m* A ≥ m*(A ∩ E) + m*(A ∩ E^c).
For example, the empty set φ is measurable.
Theorem 28. If m* E = 0, then E is measurable.
Proof. Let A be any set. Then A ∩ E ⊂ E and so
m* (A ∩ E) ≤ m* E = 0 (i)
Also A ⊃ A ∩ Ec , and so
m* A ≥ m* (A ∩ Ec) = m* (A ∩ Ec) + m* (A ∩ E)
as m* (A ∩ E) = 0 by (i)
Hence E is measurable.
Theorem 29. If a set E is measurable, then so is its complement E^c.
Proof. We shall prove this lemma by induction on n. The lemma is trivial for n = 1. Let n > 1
and suppose that the lemma holds for n – 1 measurable sets Ei.
Since En is measurable, we have
m*(X) = m*(X ∩ En) + m*(X ∩ En^c)
for every set X ⊂ R. In particular we may take
X = A ∩ [∪_{i=1}^{n} Ei].
Since E1, E2, …, En are disjoint, we have
X ∩ En = A ∩ [∪_{i=1}^{n} Ei] ∩ En = A ∩ En
and
X ∩ En^c = A ∩ [∪_{i=1}^{n} Ei] ∩ En^c = A ∩ [∪_{i=1}^{n−1} Ei].
Hence we obtain
m* X = m*(A ∩ En) + m*(A ∩ [∪_{i=1}^{n−1} Ei]). (i)
But since the lemma holds for n − 1, we have
m*(A ∩ [∪_{i=1}^{n−1} Ei]) = Σ_{i=1}^{n−1} m*(A ∩ Ei).
Therefore (i) reduces to
m* X = m*(A ∩ En) + Σ_{i=1}^{n−1} m*(A ∩ Ei)
= Σ_{i=1}^{n} m*(A ∩ Ei).
Hence the lemma.
Lemma 2. Let A be an algebra of subsets and {Ei | i ∈ N} a sequence of sets in A. Then there
exists a sequence {Di | i ∈ N} of disjoint members of A such that
Di ⊂ Ei (i ∈ N) and ∪_{i∈N} Di = ∪_{i∈N} Ei.
∪_{i∈N} Di ⊂ ∪_{i∈N} Ei
∪_{i∈N} Di ⊃ ∪_{i∈N} Ei.
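The statement of Lemma 2 can be illustrated on finite sets. The sketch below is only an illustration under an assumption: it uses the standard choice Di = Ei − (E1 ∪ … ∪ Ei−1), which may or may not be the construction used in the proof of the text, and the function name disjointify is hypothetical.

```python
def disjointify(sets):
    """Given E_1, E_2, ..., return D_i = E_i minus (E_1 ∪ ... ∪ E_{i-1})."""
    seen = set()      # union of the sets processed so far
    result = []
    for e in sets:
        result.append(set(e) - seen)
        seen |= set(e)
    return result

E = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4, 5}]
D = disjointify(E)
print(D)
```

Each Di is formed from the Ei by finitely many unions and differences, which is why the Di stay inside the algebra.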
Σ l(In) ≤ m* A + ∈ (i)
Let In′ = In ∩ (a, ∞) and In″ = In ∩ (−∞, a]. Then In′ and In″ are intervals (or empty) and
l(In) = l(In′) + l(In″) = m*(In′) + m*(In″) (ii)
Since A1 ⊂ ∪ In′, we have
m* A1 ≤ m*(∪ In′) ≤ Σ m* In′, (iii)
and, since A2 ⊂ ∪ In″, similarly m* A2 ≤ Σ m* In″. Adding these and using (ii), we get
m* A1 + m* A2 ≤ Σ m* In′ + Σ m* In″ = Σ l(In)
≤ m* A + ∈ [by (i)].
But ∈ was an arbitrary positive number and so we must have
m* A1 + m* A2 ≤ m* A.
Definition. The collection of Borel sets is the smallest σ - algebra which contains all of the
open sets.
Theorem 32. Every Borel set is measurable. In particular each open set and each closed set is
measurable.
Proof. We have already proved that (a, ∞) is measurable. So its complement
(a, ∞)^c = (−∞, a] is measurable.
Since (−∞, b) = ∪_{n=1}^{∞} (−∞, b − 1/n] and we know that a countable union of measurable sets is
measurable, (−∞, b) is also measurable. Hence each open interval
(a, b) = (−∞, b) ∩ (a, ∞) is measurable, being the intersection of two measurable sets.
But each open set is the union of a countable number of open intervals and so must be measurable.
(The measurability of closed sets follows because the complement of each measurable set is
measurable.)
Let M denote the collection of measurable sets and C the collection of open sets. Then C ⊂ M.
Hence the collection B of Borel sets is also a subset of M, since M is a σ-algebra and B is the smallest
σ-algebra containing C. So each element of B is measurable. Hence each Borel set is measurable.
Definition. If E is a measurable set, then the outer measure of E is called the Lebesgue measure
of E and is denoted by mE.
Thus, m is the set function obtained by restricting the set function m* to the family M of
measurable sets. Two important properties of Lebesgue measure are summarized by the
following theorem.
Theorem 33. Let {En} be a sequence of measurable sets. Then
m(∪ Ei) ≤ Σ m Ei.
If the sets En are pairwise disjoint, then
m(∪ Ei) = Σ m Ei.
∪_{i=1}^{∞} Ei ⊃ ∪_{i=1}^{n} Ei,
and so
m(∪_{i=1}^{∞} Ei) ≥ m(∪_{i=1}^{n} Ei) = Σ_{i=1}^{n} m Ei.
Since the left hand side of this inequality is independent of n, we have
m(∪_{i=1}^{∞} Ei) ≥ Σ_{i=1}^{∞} m Ei.
The reverse inequality follows from countable subadditivity and we have
m(∪_{i=1}^{∞} Ei) = Σ_{i=1}^{∞} m Ei.
Hence the theorem is proved.
Theorem 34. Let {En} be an infinite sequence of measurable sets such that En+1 ⊂ En for each
n. Let mE1 < ∞. Then
m(∩_{n=1}^{∞} En) = lim_{n→∞} mEn.
Proof. Let E = ∩_{i=1}^{∞} Ei and let Fi = Ei − Ei+1. Then, since {En} is a decreasing sequence, we have
Fi ∩ Fj = φ for i ≠ j.
Also we know that if A and B are measurable sets then their difference A – B = A ∩ Bc is also
measurable. Therefore each Fi is measurable. Thus {Fi} is a sequence of measurable pairwise
disjoint sets.
Now
∪_{i=1}^{∞} Fi = ∪_{i=1}^{∞} (Ei − Ei+1)
= ∪_{i=1}^{∞} (Ei ∩ (Ei+1)^c)
= E1 ∩ (∪_{i=1}^{∞} Ei^c)
= E1 ∩ (∩_{i=1}^{∞} Ei)^c
= E1 ∩ E^c
= E1 − E.
Hence
m(∪_{i=1}^{∞} Fi) = m(E1 − E),
that is, since the sets Fi are pairwise disjoint,
Σ_{i=1}^{∞} m Fi = m(E1 − E),
or
Σ_{i=1}^{∞} m(Ei − Ei+1) = m(E1 − E). (i)
Theorem 35. Let {En} be an increasing sequence of measurable sets, that is, a sequence with En
⊂ En+1 for each n. Let mE1 be finite. Then
m(∪_{i=1}^{∞} Ei) = lim_{n→∞} m En.
Proof. The sets E1, E2 – E1, E3 – E2, ….,En – En-1, are measurable and are pairwise disjoint.
Hence
E1 ∪(E2 – E1) ∪…∪(En – En-1)∪….
is measurable and
Moreover,
Σ_{i=2}^{n} m(Ei − Ei−1) = Σ_{i=2}^{n} (m Ei − m Ei−1)
= (m E2 − m E1) + (m E3 − m E2) + … + (m En − m En−1)
= m En − m E1.
Thus we have
m[∪_{i=1}^{∞} Ei] = m E1 + lim_{n→∞} [m En − m E1]
= lim_{n→∞} m En.
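The telescoping step in this proof can be checked on a concrete increasing sequence. The sketch below is an illustration only; the choice En = [0, n/(n+1)) and the names m_E and measure_via_rings are assumptions made for the example, not taken from the text.

```python
from fractions import Fraction

def m_E(n):
    """Lebesgue measure of E_n = [0, n/(n+1)), an increasing sequence of intervals."""
    return Fraction(n, n + 1)

def measure_via_rings(n):
    """m E_1 plus the measures of the disjoint rings E_i - E_{i-1}, i = 2, ..., n."""
    return m_E(1) + sum(m_E(i) - m_E(i - 1) for i in range(2, n + 1))

# The union of all E_n is [0, 1), of measure 1; the gap to 1 shrinks like 1/(n+1).
gaps = [1 - measure_via_rings(n) for n in (1, 10, 100, 1000)]
print(gaps)
```

The partial sums collapse to m En, so their limit is both the measure of the union and lim m En, as the theorem asserts.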
Definition. The symmetric difference of the sets A and B is the union of the sets A – B and B –
A. It is denoted by A ∆ B
Theorem 36. If m (E1 ∆ E2) = 0 and E1 is measurable, then E2 is measurable. Moreover
m E2 = m E1.
Proof. We have
E2 = [ E1 ∪(E2 – E1)] – (E1 – E2) (i)
By hypothesis, both E2 – E1 and E1 – E2 are measurable and have measure zero. Since E1 and
E2 – E1 are disjoint, E1 ∪(E2 – E1) is measurable and m[E1 ∪(E2 – E1)] = m E1 + 0 = m E1. But,
since
E1 – E2 ⊂ [E1 ∪(E2 – E1)],
it follows from (i) that E2 is measurable and
m E2 = m[E1 ∪(E2 – E1)] - m(E1 – E2)
= m E1 – 0 = m E1.
This completes the proof.
Definition. Let x and y be real numbers in [0, 1). Then the sum modulo 1 of x and y, denoted by
x ⊕ y, is defined by
x ⊕ y = x + y if x + y < 1,
x ⊕ y = x + y − 1 if x + y ≥ 1.
It can be seen that ⊕ is a commutative and associative operation which takes pairs of numbers
in [0, 1) into numbers in [0, 1).
If we assign to each x ∈ [0, 1) the angle 2π x then addition modulo 1 corresponds to the addition
of angles.
If E is a subset of [0, 1), we define the translate modulo 1 of E by y to be the set
E ⊕ y = {z | z = x ⊕ y for some x ∈ E},
where ⊕ denotes addition modulo 1.
If we consider addition modulo 1 as addition of angles, translation modulo 1 by y corresponds to
rotation through an angle of 2πy.
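Addition modulo 1 and the translate of a set can be tried out directly. The following is a small sketch with exact rational arithmetic; the function names add_mod1 and translate_mod1 are hypothetical, and only finite point sets are handled.

```python
from fractions import Fraction

def add_mod1(x, y):
    """Sum modulo 1 of x and y, both assumed to lie in [0, 1)."""
    s = x + y
    return s if s < 1 else s - 1

def translate_mod1(points, y):
    """Translate modulo 1 by y of a finite set of points of [0, 1)."""
    return {add_mod1(x, y) for x in points}

x, y, z = Fraction(3, 4), Fraction(1, 2), Fraction(2, 3)
print(add_mod1(x, y))                                            # 3/4 + 1/2 wraps around
print(add_mod1(add_mod1(x, y), z) == add_mod1(x, add_mod1(y, z)))  # associativity on a sample
```

The wrap-around in add_mod1 is exactly the second case of the definition, and it corresponds to passing the full angle 2π when the numbers are read as angles.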
We shall now show that Lebesgue measure is invariant under translation modulo 1.
Lemma. Let E ⊂ [0, 1) be a measurable set. Then for each y ∈ [0, 1) the set E ⊕ y (the translate of E
modulo 1 by y) is measurable and m(E ⊕ y) = m E.
Proof. Let E1 = E ∩ [0, 1 − y) and E2 = E ∩ [1 − y, 1). Then E1 and E2 are disjoint measurable
sets whose union is E, and so
m E = m E1 + m E2.
We observe that
E1 ⊕ y = {x ⊕ y : x ∈ E1}.
But for x ∈ E1, we have x + y < 1 and so
E1 ⊕ y = {x + y : x ∈ E1} = E1 + y,
and hence E1 ⊕ y is measurable. Thus
m(E1 ⊕ y) = m(E1 + y) = m(E1),
since m is translation invariant. Also E2 ⊕ y = E2 + (y − 1) and so E2 ⊕ y is measurable and
m(E2 ⊕ y) = m E2. But
E ⊕ y = (E1 ⊕ y) ∪ (E2 ⊕ y)
and the sets (E1 ⊕ y) and (E2 ⊕ y) are disjoint measurable sets. Hence E ⊕ y is measurable and
m(E ⊕ y) = m[(E1 ⊕ y) ∪ (E2 ⊕ y)]
= m(E1 ⊕ y) + m(E2 ⊕ y)
= m(E1) + m(E2)
= m(E).
This completes the proof of the lemma.
Construction of a non-measurable set. If x − y is a rational number, we say that x and y are
equivalent and write x ~ y. It is clear that x ~ x; x ~ y ⇒ y ~ x; and x ~ y, y ~ z ⇒ x ~ z. Thus ‘~’
is an equivalence relation and hence partitions [0, 1) into equivalence classes, that is, classes
such that any two elements of one class differ by a rational number, while any two elements of
different classes differ by an irrational number. By the axiom of choice (Let C be any collection
of non-empty sets. Then there is a function F defined on C which assigns to each set A ∈ C an
element F(A) in A.) there is a set P which contains exactly one element from each equivalence
class. Let <ri>, i = 0, 1, 2, …, be an enumeration of the rational numbers in [0, 1) with r0 = 0 and define
Pi = P ⊕ ri (the translate modulo 1 of P by ri, with ⊕ denoting addition modulo 1).
Then P0 = P. Let x ∈ Pi ∩ Pj. Then
x = pi ⊕ ri = pj ⊕ rj
with pi and pj belonging to P. But pi − pj differs from rj − ri by an integer and so is a rational number, whence pi ~ pj. Since P has
only one element from each equivalence class, we must have pi = pj and hence i = j. This implies that if i ≠ j,
Pi ∩ Pj = φ, that is, that <Pi> is a pairwise disjoint sequence of sets. On the other hand, each real
x ∈ ∪_{n≥m} En for all m
⇒ for every m, x ∈ En for some n ≥ m
⇒ x ∈ En for infinitely many values of n
⇒ x ∈ lim sup En.
Thus
∩_{m=1}^{∞} ∪_{n≥m} En ⊂ lim sup En.
Hence lim sup En = ∩_{m=1}^{∞} ∪_{n≥m} En.
Definition. The set of those elements which belong to all En except a finite number of them is
called the lim inf of the sequence of sets {En}. We denote it by lim inf En.
It may be proved that lim inf En = ∪_{m=1}^{∞} ∩_{n≥m} En. For, let
x ∈ ∪_{m=1}^{∞} ∩_{n≥m} En ⇒ x ∈ ∩_{n≥m} En for some m
⇒ x ∈ En for all n ≥ m, that is, for all but finitely many n
⇒ x ∈ lim inf En.
Similarly, we can show that if x ∈ lim inf En then x ∈ ∪_{m=1}^{∞} ∩_{n≥m} En.
Theorem 38. If {En} is a sequence of measurable sets, then lim sup En and lim inf En are also
measurable.
Proof. We have shown above that
lim sup En = ∩_{m=1}^{∞} ∪_{n≥m} En.
Since {En} is a sequence of measurable sets, the right hand side is measurable and so lim sup En is
measurable.
Similarly, since
lim inf En = ∪_{m=1}^{∞} ∩_{n≥m} En,
lim inf En is also measurable.
Proof. There exists a countable collection {In} of open intervals such that E ⊂ ∪_n In and
Σ_{n=1}^{∞} l(In) < m* E + ∈.
Put O = ∪_{n=1}^{∞} In.
Theorem 40. Let E be a measurable set. Given ∈ > 0, there is an open set O ⊃ E such that
m*(O\E) < ∈.
Proof. Suppose first that m E < ∞. Then by the above theorem there is an open set O ⊃ E such
that
m* O < m* E + ∈
Since the sets O and E are measurable, we have
m*(O\E) = m* O – m* E < ∈.
Consider now the case when m E = ∞. Write the set R of real numbers as a union of disjoint
finite intervals; that is, R = ∪_{n=1}^{∞} In. Then, if En = E ∩ In, m(En) < ∞. We can, thus, find open sets
On ⊃ En such that
m*(On − En) < ∈/2^n.
Define O = ∪_{n=1}^{∞} On. Clearly O is an open set such that O ⊃ E and satisfies
O − E = ∪_{n=1}^{∞} On − ∪_{n=1}^{∞} En ⊂ ∪_{n=1}^{∞} (On − En).
Hence
m*(O − E) ≤ Σ_{n=1}^{∞} m*(On − En) < ∈.
114 MEASURABLE FUNCTIONS AND LEBESGUE INTEGRAL
4
MEASURABLE FUNCTIONS AND LEBESGUE INTEGRAL
Since {x | f(x) ≥ α } and { x | f(x) ≤ α} are measurable by conditions (b) and (d), the set {x | f(x)
= α } is measurable being the intersection of measurable sets.
Suppose α = +∞. Then
{x | f(x) = ∞} = ∩_{n=1}^{∞} {x | f(x) ≥ n},
which is measurable by the condition (b) and the fact that a countable intersection of measurable sets is
measurable.
Similarly, when α = −∞,
{x | f(x) = −∞} = ∩_{n=1}^{∞} {x | f(x) ≤ −n},
which is again measurable by condition (d).
Hence the result follows.
Second definition of Measurable functions
We see that
{x | f(x) > α}
is the inverse image of (α, ∞]. Similarly the sets
{x | f(x) ≥ α}, {x | f(x) < α}, {x | f(x) ≤ α} are the inverse images of [α, ∞], [−∞, α) and
[−∞, α] respectively.
Hence we can also define a measurable function as follows.
A function f defined on a measurable set E is said to be measurable if, for every real α, any one of
the following four conditions is satisfied:
(a) The inverse image f⁻¹(α, ∞] of the half-open interval (α, ∞] is measurable.
(b) The inverse image f⁻¹[α, ∞] of the closed interval [α, ∞] is measurable.
(c) The inverse image f⁻¹[−∞, α) of the half-open interval [−∞, α) is measurable.
(d) The inverse image f⁻¹[−∞, α] of the closed interval [−∞, α] is measurable.
Remark 1. It is immediate that a necessary and sufficient condition for measurability is that {x |
a ≤ f(x) ≤ b} should be measurable for all a, b [including the case a = −∞, b = +∞], for any set of
this form can be written as the intersection of two sets
{x | f(x) ≥ a } ∩ { x | f(x) ≤ b },
if f is measurable, each of these is measurable and so is {x | a ≤ f(x) ≤ b}. Conversely any set of
the form occurring in the definition can easily be expressed in terms of the sets of the form {x | a
≤ f(x) ≤ b}.
Remark 2. Since (α, ∞) is an open set, we may define a measurable function as follows: “A function f
defined on a measurable set E is said to be measurable if for every open set G in the real number
system, f⁻¹(G) is a measurable set.”
Definition. The characteristic function of a set E is defined by
χE(x) = 1 if x ∈ E,
χE(x) = 0 if x ∉ E.
This is also known as the indicator function of E.
Example of a Measurable function
Let E be the set of rationals in [0, 1]. Then the characteristic function χE(x) is measurable.
We next consider the function cf. In case c = 0, cf is the constant function 0 and hence is
measurable, since every constant function is continuous and so measurable. In case c > 0 we
have
{x | cf(x) > α} = {x | f(x) > α/c} = f⁻¹(α/c, ∞],
and so cf is measurable.
In case c < 0, we have
{x | cf(x) > α} = {x | f(x) < α/c} = f⁻¹[−∞, α/c),
and so cf is measurable.
Now if f and g are two measurable real valued functions defined on the same domain, we shall
show that f + g is measurable. To show this, it is sufficient to show that the set {x | f(x) + g(x) > α}
is measurable.
If f(x) + g(x) > α, then f(x) > α − g(x) and by the Cor. of the axiom of Archimedes there is a
rational number r such that
α − g(x) < r < f(x).
Since the functions f and g are measurable, the sets
{x | f(x) > r} and {x | g(x) > α − r}
are measurable. Therefore, their intersection
Sr = {x | f(x) > r} ∩ {x | g(x) > α − r}
is also measurable.
It can be shown that
{x | f(x) + g(x) > α} = ∪ {Sr | r is a rational}.
Since the set of rationals is countable and a countable union of measurable sets is measurable, the
set ∪ {Sr | r is a rational}, and hence {x | f(x) + g(x) > α}, is measurable, which proves that f + g
is measurable.
From this part it follows that f − g = f + (−g) is also measurable, since when g is measurable (−g) is
also measurable.
Next we consider fg.
The measurability of fg follows from the identity
fg = (1/2)[(f + g)² − f² − g²],
if we prove that f² is measurable when f is measurable. For this it is sufficient to prove that
{x ∈ E | f²(x) > α}, α a real number,
is measurable.
Let α be a negative real number. Then it is clear that the set {x | f²(x) > α} = E (the domain of the
measurable function f). But E is measurable by the definition of f. Hence {x | f²(x) > α} is
measurable when α < 0.
Now let α ≥ 0. Then
{x | f²(x) > α} = {x | f(x) > √α} ∪ {x | f(x) < −√α}.
Since f is measurable, it follows from this equality that
{x | f²(x) > α}
is measurable for α ≥ 0.
(b) Write E = ∪_{i=1}^{∞} Ei. Clearly E, being a union of measurable sets, is measurable. The result
now follows, since for each real α we have {x ∈ E | f(x) > α} = ∪_{i=1}^{∞} {x ∈ Ei | f(x) > α}.
Theorem 7. Let f and g be any two functions which are equal almost everywhere in E. If f is
measurable so is g.
Proof. Since f is measurable, for any real α the set {x | f(x) > α } is measurable. We shall show
that the set {x | g(x) > α } is measurable. To do so we put
E1 = {x | f(x) > α }
and
E2 = {x | g(x) > α }
Consider the sets
E1− E2 and E2− E1
Since f = g almost everywhere, both of these sets are contained in the set {x | f(x) ≠ g(x)}, which has
measure zero. Hence their outer measures are zero, and so both of these sets are
measurable. Now
E2 = [E1 ∪ (E2 − E1)] − (E1 − E2)
= [E1 ∪ (E2 − E1)] ∩ (E1 − E2)^c.
Since E1, E2 − E1 and (E1 − E2)^c are measurable, it follows that E2 is measurable. Hence
the theorem is proved.
Cor. Let {fn} be a sequence of measurable functions such that lim_{n→∞} fn = f almost everywhere.
Then f is a measurable function.
Proof. We have already proved that if {fn} is a sequence of measurable functions then lim_{n→∞} fn is
measurable. Also it is given that lim_{n→∞} fn = f a.e. Therefore, using the above theorem, it follows
that f is measurable.
Theorem 8. Characteristic function χA is measurable if and only if A is measurable.
Proof. Let A be measurable. By definition,
χA(x) = 1 if x ∈ A,
χA(x) = 0 if x ∉ A, i.e. x ∈ A^c.
Hence the domain of χA is A ∪ A^c, which is measurable due to the
measurability of A. Therefore, if we prove that the set {x | χA(x) > α} is measurable for any real
α, we are through.
Let 0 ≤ α < 1. Then
{x | χA(x) > α} = {x | χA(x) = 1}
= A (by the definition of characteristic function),
which is measurable by hypothesis. If α ≥ 1, then {x | χA(x) > α} = φ, which is measurable.
Now let us take α < 0. Then
{x | χA(x) > α} = A ∪ A^c,
which has been shown to be measurable. Hence if A is measurable, then χA is also measurable.
Conversely, let us suppose that χA is measurable. That is, the set {x | χA(x) > α} is
measurable for any real α. Taking α = 0, we get
{x | χA(x) > 0} = {x | χA(x) = 1} = A.
Therefore A is measurable.
Remark. With the help of the above result, the existence of a non-measurable function can be
demonstrated. In fact, if A is a non-measurable set then χA cannot be measurable.
Theorem 9. If a function f is continuous almost everywhere in E, then f is measurable.
Proof. Since f is continuous almost everywhere in E, there exists a subset D of E with m*D = 0
such that f is continuous at every point of the set C = E−D. To prove that f is measurable, let α
denote any given real number. It suffices to prove that the inverse image
B = f⁻¹(α, ∞) = {x ∈ E | f(x) > α}
of the interval (α, ∞) is measurable.
For this purpose, let x denote an arbitrary point in B ∩ C. Then f(x) > α and f is continuous at x.
Hence there exists an open interval Ux containing x such that f(y) > α holds for every point y of
E ∩ Ux. Let
U = ∪_{x ∈ B∩C} Ux.
Since x ∈ E ∩ Ux ⊂ B holds for every x ∈ B ∩ C, we have
B ∩ C ⊂ E ∩ U ⊂ B.
This implies
B = (E ∩ U) ∪ (B ∩ D).
As an open subset of R, U is measurable. Hence E ∩ U is measurable. On the other hand, since
m*(B ∩ D) ≤ m* D = 0,
B ∩ D is also measurable. This implies that B is measurable. This completes the proof of the
theorem.
Definition. A function φ, defined on a measurable set E, is called simple if there is a finite
disjoint class {E1, E2, …, En} of measurable sets and a finite set {α1, α2, …, αn} of real numbers
such that
φ(x) = αi if x ∈ Ei, i = 1, 2, ..., n,
φ(x) = 0 if x ∉ E1 ∪ E2 ∪ ... ∪ En.
Thus, a function is simple if it is measurable and takes only a finite number of different values.
The simplest example of a simple function is the characteristic function χE of a measurable set E.
Definition. A function f is said to be a step function if
f(x) = Ci , ξi−1 < x < ξi
for some subdivision of [a, b] and some constants Ci .
Clearly, a step function is a simple function.
Theorem 10. Every simple function φ on E is a linear combination of characteristic functions
of measurable subsets of E.
Proof. Let φ be a simple function and let c1, c2, …, cn denote the non-zero real numbers in its image
φ(E). For each i = 1, 2, …, n, let
Ai = {x ∈ E : φ(x) = ci}.
Then we have
φ = Σ_{i=1}^{n} ci χ_{Ai}.
On the other hand, if φ(E) contains no non-zero real number, then φ = 0 and is the characteristic
function χφ of the empty subset of E.
It follows from Theorem 10 that every simple function, being a finite sum of measurable functions, is
measurable.
Also, by the definition, if f and g are simple functions and c is a constant, then f + c, cf, f + g and
fg are simple.
Theorem 10 (Approximation Theorem). For every non-negative measurable function f on E, there
exists a non-negative non-decreasing sequence {fn} of simple functions such that
lim_{n→∞} fn(x) = f(x), x ∈ E.
In the general case, if we do not assume non-negativeness of f, then we say:
For every measurable function f, there exists a sequence {fn}, n ∈ N, of simple functions which
converges (pointwise) to f,
i.e. “Every measurable function can be approximated by a sequence of simple functions.”
Proof. Let us assume that f(x) ≥ 0 for x ∈ E. Construct a sequence
fn(x) = (i − 1)/2^n for (i − 1)/2^n ≤ f(x) < i/2^n, i = 1, 2, ..., n2^n,
fn(x) = n for f(x) ≥ n,
for every n ∈ N.
If we take n = 1, then
f1(x) = (i − 1)/2 for (i − 1)/2 ≤ f(x) < i/2, i = 1, 2,
f1(x) = 1 for f(x) ≥ 1.
That is,
f1(x) = 0 for 0 ≤ f(x) < 1/2,
f1(x) = 1/2 for 1/2 ≤ f(x) < 1,
f1(x) = 1 for f(x) ≥ 1.
Similarly, taking n = 2, we have
f2(x) = 0 for 0 ≤ f(x) < 1/4,
f2(x) = 1/4 for 1/4 ≤ f(x) < 1/2,
............................
f2(x) = 7/4 for 7/4 ≤ f(x) < 2,
f2(x) = 2 for f(x) ≥ 2.
Similarly we can write f3(x) and so on. Clearly all fn are non-negative whenever f is non-negative and also
it is clear that fn ≤ fn+1. Moreover fn takes only a finite number of values. Therefore {fn} is a
sequence of non-negative, nondecreasing functions which assume only a finite number of values.
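The dyadic construction above is easy to evaluate at a single point. The sketch below is an illustration, not part of the original text: the function name f_n and its argument y (standing for the value f(x) at some fixed x) are assumptions, and exact fractions are used so that the comparisons are not affected by rounding.

```python
import math
from fractions import Fraction

def f_n(n, y):
    """Value of the n-th simple approximant at a point where f takes the value y >= 0."""
    if y >= n:
        return Fraction(n)
    # y lies in [(i-1)/2^n, i/2^n) with i - 1 = floor(y * 2^n)
    return Fraction(math.floor(y * 2**n), 2**n)

y = Fraction(13, 10)
values = [f_n(n, y) for n in range(1, 6)]
print(values)
```

For n larger than f(x) the error f(x) − fn(x) is less than 1/2^n, which is why the sequence converges to f(x) at every point where f is finite.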
Let us denote
Eni = f⁻¹[(i − 1)/2^n, i/2^n) = {x ∈ E | (i − 1)/2^n ≤ f(x) < i/2^n}
and
En = f⁻¹[n, ∞] = {x ∈ E | f(x) ≥ n}.
Both of them are measurable. Let
fn = Σ_{i=1}^{n2^n} ((i − 1)/2^n) χ_{Eni} + n χ_{En}
for every n ∈ N.
Now Σ_{i=1}^{n2^n} ((i − 1)/2^n) χ_{Eni} is measurable, since Eni has been shown to be measurable and characteristic
When we do not assume non-negativeness of the function, then since we know that f⁺ and f⁻
are both non-negative, we have by what we have proved above
f⁺ = lim_{n→∞} φ′n(x) … (i)
f⁻ = lim_{n→∞} φ″n(x) … (ii)
where φ′n(x) and φ″n(x) are simple functions. Also we have proved already that
f = f⁺ − f⁻.
Now from (i) and (ii) we have
f = f⁺ − f⁻ = lim_{n→∞} φ′n(x) − lim_{n→∞} φ″n(x)
= lim_{n→∞} (φ′n(x) − φ″n(x))
= lim_{n→∞} φn(x), where φn = φ′n − φ″n
(since the difference of two simple functions is again a simple function). Hence the theorem.
Littlewood’s three principles of measurability
The following three principles concerning measure are due to Littlewood.
First Principle. Every measurable set is almost a finite union of intervals.
Second Principle. Every measurable function is almost a continuous function.
Third Principle. If {fn} is a sequence of measurable functions defined on a set E of finite
measure and if fn(x) → f(x) on E, then fn(x) converges almost uniformly on E.
First of all we consider the third principle. We shall prove Egoroff’s theorem, which is a slight
modification of Littlewood’s third principle.
Theorem 11 (Egoroff’s Theorem). Let {fn} be a sequence of measurable functions defined on a
set E of finite measure such that fn(x) → f(x) almost everywhere. Then to each ∈ > 0 there
corresponds a measurable subset E0 of E such that m(E0^c) < ∈ and fn(x) converges to f(x)
uniformly on E0. (Here E0^c denotes the complement E − E0 of E0 in E.)
Proof. Since fn(x) → f(x) almost everywhere and {fn} is a sequence of measurable functions,
f(x) is also a measurable function. Let
H = {x | lim_{n→∞} fn(x) = f(x)}.
Clearly the measure of E − H is zero.
For each pair (k, n) of positive integers, let us define the set
Ekn = ∩_{m=n}^{∞} {x | |fm(x) − f(x)| < 1/k}.
(Since each fm − f is a measurable function, the sets Ekn are measurable.)
Then for each k, if we put
E′ = ∪_{n=1}^{∞} Ekn,
it is clear that
E′ = ∪_{n=1}^{∞} Ekn ⊃ H.
In fact, if x ∈ H then, since fm(x) → f(x), there is an n with |fm(x) − f(x)| < 1/k for all m ≥ n, so
x ∈ Ekn ⊂ E′; thus H ⊂ E′.
We have also
Ek(n+1) = ∩_{m=n+1}^{∞} {x | |fm(x) − f(x)| < 1/k}.
Clearly
Ekn = Ek(n+1) ∩ {x | |fn(x) − f(x)| < 1/k}.
Hence Ek(n+1) cannot be a proper subset of Ekn. That is,
Ekn ⊂ Ek(n+1).
Thus for each k the sequence {Ekn} is an expanding sequence of measurable sets. Therefore
lim_{n→∞} m(Ekn) = m(∪_{n=1}^{∞} Ekn)
≥ m(H) = m(E),
whence, since m(E) is finite,
lim_{n→∞} m((Ekn)^c) = 0. (i)
Thus, given ∈ > 0, we have that for each k there is a positive integer nk such that
|m((Ekn)^c) − 0| < ∈/2^k, n ≥ nk,
i.e. m((Ekn)^c) < ∈/2^k, n ≥ nk. (ii)
Let
E0 = ∩_{k=1}^{∞} E_{k nk};
then E0 is measurable and
m(E0^c) = m((∩_{k=1}^{∞} E_{k nk})^c)
= m(∪_{k=1}^{∞} (E_{k nk})^c)
≤ Σ_{k=1}^{∞} m((E_{k nk})^c)
< Σ_{k=1}^{∞} ∈/2^k (using (ii))
= ∈ Σ_{k=1}^{∞} 1/2^k
= ∈.
It follows from the definition of Ekn that for all m ≥ nk,
|fm(x) − f(x)| < 1/k (iii)
for every x ∈ E_{k nk}. Since E0 ⊂ E_{k nk} for every k, the condition m ≥ nk yields (iii) for every x ∈
E0. Hence fn(x) → f(x) uniformly on E0. This completes the proof of the theorem.
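A standard illustration of Egoroff’s theorem, not taken from the text, is fn(x) = x^n on E = [0, 1]: it converges to 0 almost everywhere (at every x < 1), but not uniformly on [0, 1). Removing a set of measure ∈/2 restores uniform convergence. The Python sketch below checks this numerically; the choice E0 = [0, 1 − ∈/2] and the names sup_on_E0 and witness are assumptions made for the example.

```python
def sup_on_E0(n, eps):
    """sup of |x**n - 0| over E0 = [0, 1 - eps/2]; it is attained at the right endpoint."""
    return (1 - eps / 2) ** n

def witness(n):
    """A point x in [0, 1) with x**n equal to 1/2 up to rounding,
    showing that the sup of x**n over [0, 1) never drops below 1/2."""
    return 0.5 ** (1.0 / n)

eps = 0.1
sups = [sup_on_E0(n, eps) for n in (1, 10, 100, 1000)]
print(sups)                       # decreases towards 0: uniform convergence on E0
print(witness(1000) ** 1000)      # stays near 1/2: no uniform convergence on [0, 1)
```

Here the excluded set (1 − ∈/2, 1] has measure ∈/2 < ∈, as the theorem requires.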
Now we pass to the second principle of Littlewood. This is nothing but approximation of
measurable functions by continuous functions. In this connection we shall prove the following
theorem known as Lusin Theorem after the name of a Russian Mathematician Lusin, N.N.
Theorem 12 (Lusin’s Theorem). Let f be a measurable function defined on [a, b]. Then to
each ∈ > 0, there corresponds a measurable subset E0 of [a, b] such that m(E0^c) < ∈ and f is
continuous on E0.
Proof. Let f be a measurable function defined on [a, b]. We know that every measurable
function is the limit of a sequence {φn(x)} of simple functions whose points of discontinuity
form a set of measure zero. Thus we have
lim_{n→∞} φn(x) = f(x), x ∈ [a, b].
By Egoroff’s theorem, to each ∈ > 0 there exists a subset E0 of [a, b] such that m(E0^c) < ∈ and
φn(x) converges to f(x) uniformly on E0. But we know that if {φn(x)} is a sequence of continuous
functions converging uniformly to a function f(x), then f(x) is continuous. Therefore f(x) is
continuous on E0. This completes the proof of the theorem.
Theorem 13. Let f be a measurable function defined on [a, b] and assume that f takes values
±∞ only on a set of measure zero. Then given ∈ > 0 we can find a continuous function g and a step
function h such that
|f − g| < ∈, |f − h| < ∈,
except on a set of measure less than ∈.
Proof. Let H be the subset of [a, b] where f(x) is not ±∞. Then by the hypothesis of the theorem
mH = m([a, b]). We know that every measurable function can be expressed as an almost
everywhere limit of a sequence of step functions which are continuous except on a set of measure zero.
That is, we can find a sequence {φn} of step functions such that
lim_{n→∞} φn(x) = f(x) a.e. on H.
Let F ⊂ H be such that φn(x) → f(x) and each φn is continuous everywhere on F.
By Egoroff’s theorem, for a given ∈ > 0 we can find a subset F′ ⊂ F such that φn(x) → f(x)
uniformly on F′ and
m(F − F′) < ∈.
But we know that if {fn} is a sequence of continuous functions converging uniformly to a
function f(x), then f(x) is continuous. Therefore f(x) is continuous on F′.
Define a continuous function g(x) on [a, b] such that
g(x) = 0 if x ∉ F′,
g(x) = f(x) if x ∈ F′.
Therefore on F′ we have
|f − g| < ∈.
We have already shown that
m([a, b] − F′) < ∈.
Also, φn(x) → f(x) uniformly on F′, where {φn} is a sequence of step functions; hence for n
sufficiently large the step function h = φn satisfies |f − h| < ∈ on F′. Hence the theorem.
In order to prove the first principle of Littlewood we prove two theorems on approximations of
measurable sets.
Theorem 14. A set E in R is measurable if and only if to each ∈ > 0, there corresponds a pair of
sets F, G such that F ⊂ E ⊂ G, F is closed, G is open and m(G−F) < ∈ .
Proof. Sufficiency :- Taking ∈ = 1/n, let the corresponding pair of sets be Fn, Gn with
m(Gn − Fn) < 1/n.
Let
X = ∪_n Fn, Y = ∩_n Gn.
It follows that X ⊂ E ⊂ Y, that Y − X ⊂ Gn − Fn for every n, and
m(Y − X) ≤ m(Gn − Fn) < 1/n,
so that
m(Y − X) = 0.
Since
E − X ⊂ Y − X,
we have
m*(E − X) = 0.
Therefore E − X is measurable.
But E = (E − X) ∪ X. Therefore E is measurable, since X is measurable and E − X is measurable.
Necessity. We now assume that E is measurable. We first prove this part under the assumption
that E is bounded. Since E is measurable and bounded, we can choose an open set G ⊃ E such
that
m(G) < m(E) + ∈/2. (i)
Choose a compact (closed and bounded) set S ⊃ E, and then choose an open set V such that
S − E ⊂ V and
m(V) < m(S − E) + ∈/2. (ii)
Let F = S − V. Then F is closed (since S − V = S ∩ V^c, which is closed, being the intersection of
closed sets) and F ⊂ E. We have
m(F) = m(S) − m(S ∩ V)
≥ m(S) − m(V)
> m(S) − m(S − E) − ∈/2 (using (ii))
= m(E) − ∈/2. (iii)
Then
m(G − F) = m(G) − m(F)
= m(G) − m(E) + m(E) − m(F)
< ∈/2 + ∈/2 = ∈ (using (i) and (iii)).
This finishes the proof for the case in which E is bounded.
Now, let E be measurable but unbounded. Let
Sn = {x | |x| ≤ n}, n ∈ N,
E1 = E ∩ S1,
En = E ∩ (Sn − Sn−1), n ≥ 2.
Then
E = ∪_n En,
where each En is bounded and measurable.
Using what has already been established, let Fn, Gn be a pair of sets such that Fn ⊂ En ⊂ Gn, Fn is
closed, Gn is open, and m(Gn − Fn) < ∈/2^n. Let F = ∪_n Fn, G = ∪_n Gn. Then G − F ⊂ ∪_n (Gn − Fn)
and so
m(G − F) ≤ m{∪_n (Gn − Fn)}
≤ Σ_n m(Gn − Fn)
< Σ_n ∈/2^n
= ∈ Σ_n 1/2^n
= ∈.
We see that G is open and that F ⊂ E ⊂ G, so all that remains to prove is that F is closed.
Suppose {xi} is a convergent sequence (say xi → x) with xi ∈ F for each i. Then {xi} is bounded
and so is contained in SN for a certain N. Now Fn ⊂ Sn − SN if n > N. Therefore xi ∈ ∪_{n=1}^{N} Fn for
each i. But then the limit x is in ∪_{n=1}^{N} Fn, for this last set is closed. Therefore F is closed. This
finishes the proof.
Definition. If A and B are two sets, then
A ∆ B = (A−B) ∪ (B−A) .
Theorem 15. If E is a measurable set of finite measure in R and if ∈ > 0, there is a set G of the
form G = ∪_{n=1}^{N} In, where I1, I2, …, IN are open intervals, such that m(E ∆ G) < ∈.
Proof. Let us assume at first that E is bounded. Let X be an open interval such that E ⊂ X.
There exist Lebesgue coverings {In} and {Jn} of E and X − E respectively such that
Σ_n |In| < m(E) + ∈/3,
Σ_n |Jn| < m(X − E) + ∈/3,
and such that each In and Jn is contained in X. Choose N so that Σ_{n>N} |In| < ∈/3 and define sets G,
H, K as follows:
G = ∪_{n=1}^{N} In, H = ∪_{n>N} In, K = G ∩ ∪_n Jn.
Observe that E − G ⊂ H and G − E ⊂ K, so that E ∆ G ⊂ H ∪ K and therefore
m(E ∆ G) ≤ m(H ∪ K) ≤ m(H) + m(K).
X = [∪_n In] ∪ [∪_n (Jn − G)],
whence
m(X) ≤ m[∪_n In] + m[∪_n (Jn − G)]
≤ Σ_n |In| + Σ_n m(Jn − G).
We also have
Σ_n |In| + Σ_n |Jn| < m(E) + m(X − E) + 2∈/3
= m(X) + 2∈/3,
whence
Σ_n |In| + Σ_n |Jn| < Σ_n |In| + Σ_n m(Jn − G) + 2∈/3
and therefore, since Jn = (Jn − G) ∪ (Jn ∩ G),
m(K) ≤ Σ_n m(G ∩ Jn) = Σ_n m(Jn) − Σ_n m(Jn − G)
< 2∈/3.
Hence, when E is bounded,
m(E ∆ G) < ∈/3 + 2∈/3 = ∈.
For the general case, let
Sn = {x | |x| ≤ n},
T1 = S1,
Tn = Sn − Sn−1, n ≥ 2.
Let En = E ∩ Sn. Then
E = ∪_{i=1}^{∞} (E ∩ Ti),
E − En = ∪_{i=n+1}^{∞} (E ∩ Ti).
Example. An example of a sequence <fn> which converges to zero in measure on [0, 1] but
such that <fn(x)> does not converge for any x in [0, 1] can be constructed as follows:
Let n = k + 2^v, 0 ≤ k < 2^v, and set fn(x) = 1 if x ∈ [k2^{−v}, (k+1)2^{−v}] and fn(x) = 0
otherwise. Then
m{x | |fn(x)| > ∈} ≤ 2^{−v} < 2/n,
and so fn → 0 in measure, although for any x ∈ [0, 1], the sequence <fn(x)> has the value 1 for
arbitrarily large values of n (and the value 0 for arbitrarily large values of n) and so does not converge.
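The example can be explored numerically. The following sketch is illustrative only; the helper names block, f and support_measure are assumptions. It decomposes n as k + 2^v, evaluates fn at the sample point x = 1/3, and checks the bound 2/n on the measure of the set where fn is non-zero.

```python
from fractions import Fraction

def block(n):
    """Write n = k + 2**v with 0 <= k < 2**v and return (k, v)."""
    v = n.bit_length() - 1
    return n - 2**v, v

def f(n, x):
    """f_n(x) = 1 on [k 2^-v, (k+1) 2^-v] and 0 otherwise."""
    k, v = block(n)
    return 1 if Fraction(k, 2**v) <= x <= Fraction(k + 1, 2**v) else 0

def support_measure(n):
    """Length of the interval on which f_n equals 1."""
    _, v = block(n)
    return Fraction(1, 2**v)

x = Fraction(1, 3)
hits = [n for n in range(1, 64) if f(n, x) == 1]
print(hits)
```

In every block 2^v ≤ n < 2^(v+1) the point x is hit at least once, while the measure of the support shrinks to zero; this is the mechanism behind convergence in measure without pointwise convergence.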
Definition. A sequence {fn} of a.e. finite valued measurable functions is said to be fundamental
in measure, if for every ∈ > 0,
m({x : |fn(x) − fm(x)| ≥ ∈}) → 0 as n and m→∞.
Definition. A sequence {fn} of real valued functions is said to be fundamental a.e. if there
exists a set E0 of measure zero such that, if x ∉ E0 and ∈ > 0, then an integer n0 = n0(x, ∈) can
be found with the property that
|fn(x) − fm(x)| < ∈, whenever n ≥ n0 and m ≥ n0.
Definition. A sequence {fn} of a.e. finite valued measurable functions will be said to converge to
the measurable function f almost uniformly if, for every ∈ > 0 , there exists a measurable set F
such that m(F) < ∈ and such that the sequence {fn} converges to f uniformly on Fc.
In this language, Egoroff’s Theorem asserts that on a set of finite measure convergence a.e.
implies almost uniform convergence.
The following result goes in the converse direction.
Theorem 18. If {fn} is a sequence of measurable functions which converges to f almost
uniformly, then {fn} converges to f a.e.
Proof. Let Fn be a measurable set such that m(Fn) < 1/n and such that the sequence {fn} converges
to f uniformly on Fn^c, n = 1, 2, … If F = ∩_{n=1}^{∞} Fn, then
m(F) ≤ m(Fn) < 1/n for every n,
so that m(F) = 0, and it is clear that, for x ∈ F^c, {fn(x)} converges to f(x).
Theorem 19. Almost uniform convergence implies convergence in measure.
Proof. If {fn} converges to f almost uniformly, then for any two positive numbers ∈ and δ there
exists a measurable set F such that m(F) < δ and such that |fn(x) − f(x)| < ∈, whenever x belongs
to Fc and n is sufficiently large.
Theorem. If {fn} converges in measure to f, then {fn} is fundamental in measure. If also {fn}
converges in measure to g, then f = g .a.e.
Proof. The first assertion of the theorem follows from the relation
$$\{x : |f_n(x) - f_m(x)| \ge \epsilon\} \subset \{x : |f_n(x) - f(x)| \ge \tfrac{\epsilon}{2}\} \cup \{x : |f_m(x) - f(x)| \ge \tfrac{\epsilon}{2}\}.$$
To prove the second assertion, we have
$$\{x : |f(x) - g(x)| \ge \epsilon\} \subset \{x : |f_n(x) - f(x)| \ge \tfrac{\epsilon}{2}\} \cup \{x : |f_n(x) - g(x)| \ge \tfrac{\epsilon}{2}\}.$$
Since by proper choice of n, the measure of both sets on the right can be made arbitrarily small,
we have
$$m(\{x : |f(x) - g(x)| \ge \epsilon\}) = 0$$
for every ε > 0, which implies that f = g a.e.
Theorem 20. If {fn} is a sequence of measurable functions which is fundamental in measure,
then some subsequence $\{f_{n_k}\}$ is almost uniformly fundamental.
Proof. For any positive integer k we may find an integer n(k) such that if n ≥ n(k) and m ≥ n(k), then
$$m\left(\left\{x : |f_n(x) - f_m(x)| \ge \frac{1}{2^k}\right\}\right) < \frac{1}{2^k}.$$
We write
$$n_1 = n(1), \quad n_2 = \max(n_1 + 1,\, n(2)), \quad n_3 = \max(n_2 + 1,\, n(3)), \ \dots;$$
then $n_1 < n_2 < n_3 < \cdots$, so that the sequence $\{f_{n_k}\}$ is indeed a subsequence of {fn}. If
$$E_k = \left\{x : |f_{n_k}(x) - f_{n_{k+1}}(x)| \ge \frac{1}{2^k}\right\}$$
and k ≤ i ≤ j, then, for every x which does not belong to $E_k \cup E_{k+1} \cup E_{k+2} \cup \cdots$, we have
$$|f_{n_i}(x) - f_{n_j}(x)| \le \sum_{m=i}^{\infty} |f_{n_m}(x) - f_{n_{m+1}}(x)| < \sum_{m=i}^{\infty} \frac{1}{2^m} = \frac{1}{2^{i-1}},$$
so that, in other words, the sequence $\{f_{n_i}\}$ is uniformly fundamental on
$E \setminus (E_k \cup E_{k+1} \cup \cdots)$. Since
$$m(E_k \cup E_{k+1} \cup \cdots) \le \sum_{m=k}^{\infty} m(E_m) < \frac{1}{2^{k-1}},$$
which can be made arbitrarily small by taking k large, this completes the proof of the theorem.
Theorem 21. If {fn} is a sequence of measurable functions which is fundamental in measure,
then there exists a measurable function f such that {fn} converges in measure to f.
Proof. By the above theorem we can find a subsequence $\{f_{n_k}\}$ which is almost uniformly
fundamental and therefore fundamental a.e. We write $f(x) = \lim_{k \to \infty} f_{n_k}(x)$ for every x for which
the limit exists. We observe that, for every ε > 0,
$$\{x : |f_n(x) - f(x)| \ge \epsilon\} \subset \{x : |f_n(x) - f_{n_k}(x)| \ge \tfrac{\epsilon}{2}\} \cup \{x : |f_{n_k}(x) - f(x)| \ge \tfrac{\epsilon}{2}\}.$$
The measure of the first term on the right is by hypothesis arbitrarily small if n and nk are
sufficiently large, and the measure of the second term also approaches 0 (as k→∞), since almost
uniform convergence implies convergence in measure. Hence the theorem follows.
Remark. Convergence in measure does not necessarily imply pointwise convergence at any
point. Let
$$E_{r,k} = \left[\frac{r-1}{2^k}, \frac{r}{2^k}\right] \qquad (r = 1, 2, \dots, 2^k;\ k = 1, 2, \dots),$$
and arrange these intervals as a single sequence of sets {Fn} by taking first those for which k = 1,
then those with k = 2, etc. If m denotes Lebesgue measure on [0,1], and fn(x) is the indicator
function of Fn, then for 0 < ε < 1,
$$\{x : |f_n(x)| \ge \epsilon\} = F_n,$$
so that, for any ε > 0, $m\{x : |f_n(x)| \ge \epsilon\} \le m(F_n) \to 0$. This means that fn → 0 in measure in
[0, 1]. However, at no point x ∈ [0, 1] does fn(x) → 0; in fact, since every x is in infinitely many
of the sets Fn and in infinitely many of the sets [0, 1] − Fn, we have
$$\liminf f_n(x) = 0, \qquad \limsup f_n(x) = 1 \quad \text{for all } x \in [0, 1].$$
$$\int_E \varphi = \int \varphi \cdot \chi_E .$$
It is often convenient to use representations which are not canonical, and the following lemma is
useful.
Lemma. If E1, E2, …, En are disjoint measurable subsets of E, then every linear combination
$$\varphi = \sum_{i=1}^{n} c_i \chi_{E_i}$$
with real coefficients c1, c2, …, cn is a simple function and
$$\int \varphi = \sum_{i=1}^{n} c_i\, mE_i .$$
Proof. It is clear that φ is a simple function. Let $a_1, a_2, \dots, a_p$ denote the distinct non-zero real numbers in
φ(E). For each j = 1, 2, …, p let
$$A_j = \bigcup_{c_i = a_j} E_i .$$
Then we have
$$A_j = \varphi^{-1}(a_j) = \{x : \varphi(x) = a_j\}$$
and the canonical representation
$$\varphi = \sum_{j=1}^{p} a_j \chi_{A_j} .$$
Consequently, we obtain
$$\int \varphi = \sum_{j=1}^{p} a_j\, mA_j = \sum_{j=1}^{p} a_j\, m\Big[\bigcup_{c_i = a_j} E_i\Big] = \sum_{j=1}^{p} a_j \sum_{c_i = a_j} mE_i = \sum_{i=1}^{n} c_i\, mE_i ,$$
where the third equality uses the additivity of the measure on the disjoint sets Ei, and the last one uses that terms with ci = 0 contribute nothing.
This completes the proof of the lemma.
Theorem 22. Let φ and ψ be simple functions which vanish outside a set of finite measure.
Then
$$\int (a\varphi + b\psi) = a \int \varphi + b \int \psi ,$$
and, if φ ≥ ψ a.e., then
$$\int \varphi \ge \int \psi .$$
Proof. Let {Ai} and {Bi} be the sets which occur in the canonical representations of φ and ψ.
Let A0 and B0 be the sets where φ and ψ are zero. Then the sets Ek obtained by taking all the
intersections Ai ∩ Bj form a finite disjoint collection of measurable sets, and we may write
$$\varphi = \sum_{k=1}^{N} a_k \chi_{E_k}, \qquad \psi = \sum_{k=1}^{N} b_k \chi_{E_k},$$
and so
$$a\varphi + b\psi = a \sum_{k=1}^{N} a_k \chi_{E_k} + b \sum_{k=1}^{N} b_k \chi_{E_k} = \sum_{k=1}^{N} (a a_k + b b_k) \chi_{E_k} .$$
Therefore
$$\int (a\varphi + b\psi) = \sum_{k=1}^{N} (a a_k + b b_k)\, mE_k = a \sum_{k=1}^{N} a_k\, mE_k + b \sum_{k=1}^{N} b_k\, mE_k = a \int \varphi + b \int \psi .$$
To prove the second statement, we note that
$$\int \varphi - \int \psi = \int (\varphi - \psi) \ge 0 ,$$
since the integral of a simple function which is greater than or equal to zero almost
everywhere is non-negative by the definition of the integral.
Remark. We know that any simple function φ can be written as
$$\varphi = \sum_{i=1}^{N} a_i \chi_{E_i} .$$
Suppose that this representation is not canonical and that the sets Ei are not disjoint. Then, using the
fact that characteristic functions are simple functions together with the linearity proved in Theorem 22, we observe that
$$\int \varphi = \int (a_1 \chi_{E_1} + a_2 \chi_{E_2} + \cdots + a_N \chi_{E_N}) = a_1 \int \chi_{E_1} + a_2 \int \chi_{E_2} + \cdots + a_N \int \chi_{E_N} = a_1 mE_1 + a_2 mE_2 + \cdots + a_N mE_N .$$
Hence for any representation of φ, we have
$$\int \varphi = \sum_{i=1}^{N} a_i\, mE_i .$$
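The remark can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions: each set Ei is taken to be a closed interval, and the coefficients and intervals are made-up sample data. It computes the integral once from the overlapping representation and once after cutting the line into disjoint pieces on which φ is constant; the function names are hypothetical.

```python
from fractions import Fraction as Fr

def integral_from_representation(terms):
    """Sum of a_i * m(E_i) for terms (a_i, (left, right)), E_i = [left, right]."""
    return sum(a * (right - left) for a, (left, right) in terms)

def integral_on_disjoint_pieces(terms):
    """Cut at every endpoint; phi is constant on each open piece between cuts."""
    cuts = sorted({p for _, (l, r) in terms for p in (l, r)})
    total = Fr(0)
    for left, right in zip(cuts, cuts[1:]):
        mid = (left + right) / 2
        # Value of phi on this piece: add the coefficients of all E_i containing it.
        value = sum(a for a, (l, r) in terms if l <= mid <= r)
        total += value * (right - left)
    return total

# Sample data (an assumption for illustration): three overlapping intervals.
terms = [
    (Fr(2), (Fr(0), Fr(1))),
    (Fr(-3), (Fr(1, 2), Fr(2))),
    (Fr(5), (Fr(1, 4), Fr(3, 4))),
]

if __name__ == "__main__":
    print(integral_from_representation(terms), integral_on_disjoint_pieces(terms))
```

Both computations agree, as the remark predicts; endpoints are ignored because they form a set of measure zero.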
Let f be a bounded real-valued function and E a measurable set of finite measure. By analogy
with the Riemann integral we consider for simple functions φ and ψ the numbers
$$\inf_{\psi \ge f} \int_E \psi$$
and
$$\sup_{\varphi \le f} \int_E \varphi ,$$
and ask when these two numbers are equal. The answer is given by the following proposition :
Theorem 23. Let f be defined and bounded on a measurable set E with mE finite. In order that
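$$\sum_{k=-n}^{n} mE_k = mE .$$
The simple functions defined by
$$\psi_n(x) = \frac{M}{n} \sum_{k=-n}^{n} k\, \chi_{E_k}(x)$$
and
$$\varphi_n(x) = \frac{M}{n} \sum_{k=-n}^{n} (k-1)\, \chi_{E_k}(x)$$
satisfy
$$\varphi_n(x) \le f(x) \le \psi_n(x) .$$
Thus
$$\inf \int_E \psi(x)\,dx \le \int_E \psi_n(x)\,dx = \frac{M}{n} \sum_{k=-n}^{n} k\, mE_k$$
and
$$\sup \int_E \varphi(x)\,dx \ge \int_E \varphi_n(x)\,dx = \frac{M}{n} \sum_{k=-n}^{n} (k-1)\, mE_k ,$$
whence
$$0 \le \inf \int_E \psi(x)\,dx - \sup \int_E \varphi(x)\,dx \le \frac{M}{n} \sum_{k=-n}^{n} mE_k = \frac{M}{n}\, mE .$$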
Since n is arbitrary we have
But each $\Delta_\nu$ is contained in the set $\{x : \varphi_n(x) < \psi_n(x) - \frac{1}{\nu}\}$, and this latter set by (4.3.1) has
measure less than $\frac{\nu}{n}$. Since n is arbitrary, $m\Delta_\nu = 0$ and so $m\Delta = 0$. Thus φ* = ψ* except on a
set of measure zero, and φ* = f except on a set of measure zero. Thus f is measurable and the
condition is also necessary.
Definition. If f is a bounded measurable function defined on a measurable set E with mE finite,
we define the Lebesgue integral of f over E by
We then define the Lebesgue upper and lower integrals of a bounded function f on E by
$$\inf_P U[f; P] \quad \text{and} \quad \sup_P L[f; P]$$
respectively, taken over all measurable partitions P of E. We denote them respectively by
$$\overline{\int_E} f \quad \text{and} \quad \underline{\int_E} f .$$
Definition. We say that a bounded function f on E is Lebesgue integrable on E if
$$\overline{\int_E} f = \underline{\int_E} f .$$
Also we know that if ψ is a simple function, then
$$\int_E \psi = \sum_{k=1}^{n} a_k\, mE_k .$$
Keeping this in mind, we see that
$$\overline{\int_E} f = \inf \int_E \psi(x)\,dx$$
for all simple functions ψ(x) ≥ f(x). Similarly
$$\underline{\int_E} f = \sup \int_E \varphi(x)\,dx$$
for all simple functions φ(x) ≤ f(x).
Now we use the theorem :
“Let f be defined and bounded on a measurable set E with mE finite. In order that
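$$R\underline{\int_a^b} f(x)\,dx = \sup_{\psi \le f} \int_a^b \psi(x)\,dx$$
for all step functions φ and ψ and then
$$R\underline{\int_a^b} f(x)\,dx = R\overline{\int_a^b} f(x)\,dx \implies \inf_{\varphi \ge f} \int_a^b \varphi(x)\,dx = \sup_{\psi \le f} \int_a^b \psi(x)\,dx \qquad \text{(i)}$$
Since every step function is a simple function, we have
$$R\underline{\int_a^b} f(x)\,dx = \sup_{\psi \le f} \int_a^b \psi(x)\,dx \le \inf_{\varphi \ge f} \int_a^b \varphi(x)\,dx \le R\overline{\int_a^b} f(x)\,dx .$$
Then (i) implies that
$$\sup_{\psi \le f} \int_a^b \psi(x)\,dx = \inf_{\varphi \ge f} \int_a^b \varphi(x)\,dx ,$$
and this implies that f is measurable also.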
Comparison of Lebesgue and Riemann integration
(1) The most obvious difference is that in Lebesgue’s definition we divide up the interval
into measurable subsets, while in Riemann’s we divide it into subintervals.
(2) In both Riemann’s and Lebesgue’s definitions we have upper and lower sums which tend
to limits. In the Riemann case the two integrals are not necessarily the same and the
function is integrable only if they are the same. In the Lebesgue case the two integrals
are necessarily the same, their equality being a consequence of the assumption that the
function is measurable.
(3) Lebesgue’s definition is more general than Riemann’s. We know that if a function is
R-integrable then it is Lebesgue integrable also, but the converse need not be true. For
example, the characteristic function of the set of irrational points has a Lebesgue integral
but is not R-integrable.
Let χ be the characteristic function of the irrational numbers in [0,1]. Let E1 be the set of
irrational numbers in [0,1], and let E2 be the set of rational numbers in [0,1]. Then P = {E1, E2} is
a measurable partition of [0, 1]. Moreover, χ is identically 1 on E1 and identically 0 on E2.
Hence M[χ, E1] = m[χ, E1] = 1, while M[χ, E2] = m[χ, E2] = 0. Since mE1 = 1 and mE2 = 0, we get
U[χ, P] = 1·mE1 + 0·mE2 = 1. Similarly L[χ, P] = 1·mE1 + 0·mE2 = 1. Therefore U[χ, P] = L[χ, P],
and so χ is Lebesgue integrable.
For Riemann integration
M[χ, J] = 1, m[χ, J] = 0
for any interval J ⊂ [0, 1]
∴ U[χ, J] = 1, L[χ, J] = 0 .
∴ The function is not Riemann-integrable.
Theorem 25. If f and g are bounded measurable functions defined on a set E of finite measure,
then
(i) $\int_E af = a \int_E f$
(ii) $\int_E (f + g) = \int_E f + \int_E g$
(iii) If f ≤ g a.e., then
$$\int_E f \le \int_E g$$
(iv) If f = g a.e., then
$$\int_E f = \int_E g$$
(v) If A ≤ f(x) ≤ B, then
$$A\,mE \le \int_E f \le B\,mE .$$
(vi) If A and B are disjoint measurable sets of finite measure, then
$$\int_{A \cup B} f = \int_A f + \int_B f$$
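Proof. We know that if ψ is a simple function then so is aψ. Hence
$$\int_E af = \inf_{\psi \ge f} \int_E a\psi = a \inf_{\psi \ge f} \int_E \psi = a \int_E f ,$$
which proves (i).
To prove (ii) let ε denote any positive real number. There are simple functions φ ≤ f, ψ ≥ f, ξ ≤ g
and η ≥ g satisfying
$$\int_E \varphi > \int_E f - \epsilon, \quad \int_E \psi < \int_E f + \epsilon, \quad \int_E \xi > \int_E g - \epsilon, \quad \int_E \eta < \int_E g + \epsilon .$$
Then
$$\int_E (f + g) \ge \int_E (\varphi + \xi) = \int_E \varphi + \int_E \xi > \int_E f + \int_E g - 2\epsilon ,$$
$$\int_E (f + g) \le \int_E (\psi + \eta) = \int_E \psi + \int_E \eta < \int_E f + \int_E g + 2\epsilon .$$
Since these hold for every ε > 0, we have
$$\int_E (f + g) = \int_E f + \int_E g .$$
To prove (iii) it suffices to establish
$$\int_E (g - f) \ge 0 .$$
For every simple function ψ ≥ g − f, we have ψ ≥ 0 almost everywhere in E. This means that
$$\int_E \psi \ge 0 .$$
Hence we obtain
$$\int_E (g - f) = \inf_{\psi \ge g - f} \int_E \psi \ge 0 .$$
For (v), since f(x) ≤ B, part (iii) gives
$$\int_E f(x)\,dx \le \int_E B\,dx = B \int_E dx = B\,mE .$$
That is,
$$\int_E f \le B\,mE ,$$
and the lower bound follows in the same way. For (vi),
$$\int_{A \cup B} f = \int_{A \cup B} \chi_{A \cup B}\, f = \int_{A \cup B} f(\chi_A + \chi_B) = \int_{A \cup B} f\chi_A + \int_{A \cup B} f\chi_B = \int_A f + \int_B f ,$$
which proves the theorem.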
Theorem 26 (Lebesgue Bounded Convergence Theorem). Let < fn > be a sequence of
measurable functions defined on a set E of finite measure and suppose that <fn > is uniformly
bounded, that is, there exists a real number M such that |fn(x)| ≤ M for all n ε N and all x ε E. If
$\lim_{n \to \infty} f_n(x) = f(x)$ for each x in E, then
$$\int_E f = \lim_{n \to \infty} \int_E f_n .$$
Proof. We shall apply Egoroff’s theorem to prove this theorem. Accordingly, for a given ε > 0,
there is an N and a measurable set E0 ⊂ E such that $mE_0^c < \frac{\epsilon}{4M}$ and for n ≥ N and x ∈ E0 we
have
$$|f_n(x) - f(x)| < \frac{\epsilon}{2m(E)} .$$
Then we have
$$\left|\int_E f_n - \int_E f\right| = \left|\int_E (f_n - f)\right| \le \int_E |f_n - f| = \int_{E_0} |f_n - f| + \int_{E_0^c} |f_n - f| < \frac{\epsilon}{2m(E)} \cdot m(E_0) + \frac{\epsilon}{4M} \cdot 2M \le \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon .$$
Hence
$$\int_E f_n \to \int_E f .$$
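As an illustration of the theorem (not taken from the text), consider fn(x) = x^n on E = [0, 1]. These functions are uniformly bounded by M = 1 and converge pointwise to the function f that is 0 on [0, 1) and 1 at x = 1, so the integral of f is 0. The sketch below, with hypothetical helper names, compares the exact value 1/(n+1) of the integral of fn with a midpoint-rule approximation and shows the integrals tending to 0.

```python
from fractions import Fraction

def exact_integral(n):
    """Exact integral of x**n over [0, 1], namely 1/(n+1)."""
    return Fraction(1, n + 1)

def midpoint_integral(n, pieces=1000):
    """Midpoint-rule approximation of the integral of x**n over [0, 1]."""
    h = 1.0 / pieces
    return sum(((i + 0.5) * h) ** n for i in range(pieces)) * h

if __name__ == "__main__":
    for n in (1, 5, 50, 500):
        print(n, float(exact_integral(n)), midpoint_integral(n))
```

The convergence here is not uniform on [0, 1], so the conclusion does not follow from uniform convergence alone; it is the uniform bound that makes the interchange of limit and integral legitimate.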
The integral of a non-negative function
Definition. If f is a non-negative measurable function defined on a measurable set E, we define
$$\int_E f = \sup_{h \le f} \int_E h ,$$
(i) $\int_E cf = c \int_E f$, c > 0
(ii) $\int_E (f + g) = \int_E f + \int_E g$
and
(iii) If f ≤ g a.e., then
$$\int_E f \le \int_E g .$$
Proof. The proofs of (i) and (iii) follow directly from the theorem concerning properties of the
integrals of bounded functions.
We prove (ii) in detail.
If h(x) ≤ f(x) and k(x) ≤ g(x), we have h(x) + k(x) ≤ f(x) + g(x), and so
$$\int_E (h + k) \le \int_E (f + g) ,$$
i.e.
$$\int_E h + \int_E k \le \int_E (f + g) .$$
Taking suprema, we have
$$\text{(iv)} \qquad \int_E f + \int_E g \le \int_E (f + g) .$$
On the other hand, let l be a bounded measurable function which vanishes outside a set of
finite measure and which is not greater than f + g. Then we define the functions h and k by
setting
h(x) = min(f(x), l(x))
and
k(x) = l(x) − h(x).
We have
h(x) ≤ f(x),
k(x) ≤ g(x),
while h and k are bounded by the bound for l and vanish where l vanishes. Hence
$$\int_E l = \int_E h + \int_E k \le \int_E f + \int_E g ,$$
and so, taking the supremum, we have
$$\sup_{l \le f + g} \int_E l \le \int_E f + \int_E g ,$$
that is,
$$\text{(v)} \qquad \int_E f + \int_E g \ge \int_E (f + g) .$$
From (iv) and (v), we have
$$\int_E (f + g) = \int_E f + \int_E g .$$
Fatou’s Lemma. If < fn > is a sequence of non-negative measurable functions and fn(x) → f(x)
almost everywhere on a set E, then
$$\int_E f \le \liminf_{n \to \infty} \int_E f_n .$$
Proof. Let h be a bounded measurable function which is not greater than f and which vanishes
outside a set E′ of finite measure. Define a function hn by setting
hn(x) = min{h(x), fn(x)}.
Then hn is bounded by the bounds for h and vanishes outside E′. Now hn(x) → h(x) for almost every x in
E′.
Therefore by the Bounded Convergence Theorem we have
$$\int_E h = \int_{E'} h = \lim_{n \to \infty} \int_{E'} h_n \le \liminf_{n \to \infty} \int_E f_n .$$
Taking the supremum over h, we get
$$\int_E f \le \liminf_{n \to \infty} \int_E f_n .$$
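The inequality in Fatou's Lemma can be strict. A standard illustration, which is not part of the text, takes fn = n·χ(0, 1/n) on E = [0, 1]: every fn has integral 1, while fn(x) → 0 at each point, so the integral of the limit is 0. The sketch below uses hypothetical helper names and exact rational arithmetic.

```python
from fractions import Fraction

def f(n, x):
    """f_n = n on the open interval (0, 1/n), and 0 elsewhere."""
    return n if 0 < x < Fraction(1, n) else 0

def integral(n):
    """Integral of the simple function n * chi_(0, 1/n): coefficient times measure."""
    return n * Fraction(1, n)

if __name__ == "__main__":
    x = Fraction(1, 10)
    print([f(n, x) for n in range(1, 15)])   # eventually 0 at the fixed point x
    print([integral(n) for n in range(1, 6)])
```

Here the left side of the lemma is 0 and the right side is 1, so no equality can be expected without a further hypothesis such as monotonicity or domination.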
$$\limsup_{n \to \infty} \int f_n \le \int f .$$
Hence
$$\int f = \lim_{n \to \infty} \int f_n .$$
Definition. A non-negative measurable function f is called integrable over the measurable set E
if
$$\int_E f < \infty .$$
Theorem 29. Let f and g be two non-negative measurable functions. If f is integrable over E
and g(x) < f(x) on E, then g is also integrable on E, and
$$\int_E (f - g) = \int_E f - \int_E g .$$
Proof. Since
$$\int_E f = \int_E (f - g) + \int_E g$$
and the left hand side is finite, the terms on the right must also be finite and so g is integrable.
Theorem 30. Let f be a non-negative function which is integrable over a set E. Then given ∈ >
0 there is a δ > 0 such that for every set A ⊂ E with mA < δ we have
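$$\int_A f < \epsilon .$$
Proof. If f is bounded, say |f| ≤ K, then
$$\int_A f \le \int_A K = K\,mA .$$
Set $\delta < \frac{\epsilon}{K}$. Then, for mA < δ,
$$\int_A f < K \cdot \frac{\epsilon}{K} = \epsilon .$$
In the general case, set fn(x) = f(x) if f(x) ≤ n and fn(x) = n otherwise. Then each fn is bounded and fn converges to f
at each point. By the monotone convergence theorem there is an N such that $\int_E f_N > \int_E f - \frac{\epsilon}{2}$,
and so $\int_E (f - f_N) < \frac{\epsilon}{2}$. Choose $\delta < \frac{\epsilon}{2N}$. If mA < δ, we have
$$\int_A f = \int_A (f - f_N) + \int_A f_N \le \int_E (f - f_N) + N\,mA < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon .$$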
We have already defined the positive part $f^+$ and negative part $f^-$ of a function as
$$f^+ = \max(f, 0), \qquad f^- = \max(-f, 0) .$$
Also it was shown that
$$f = f^+ - f^-, \qquad |f| = f^+ + f^- .$$
With these notions in mind, we make the following definition.
Definition. A measurable function f is said to be integrable over E if $f^+$ and $f^-$ are both
integrable over E. In this case we define
$$\int_E f = \int_E f^+ - \int_E f^- .$$
$$\int_E (f + g) = \int_E f + \int_E g$$
$$\int_{A \cup B} f = \int_A f + \int_B f$$
Proof. By definition, the functions $f^+, f^-, g^+, g^-$ are all integrable. If h = f + g, then $h = (f^+ - f^-) + (g^+ - g^-)$
and hence $h = (f^+ + g^+) - (f^- + g^-)$. Since $f^+ + g^+$ and $f^- + g^-$ are integrable,
their difference is also integrable. Thus h is integrable.
We then have
$$\int_E h = \int_E [(f^+ + g^+) - (f^- + g^-)] = \int_E (f^+ + g^+) - \int_E (f^- + g^-) = \int_E f^+ + \int_E g^+ - \int_E f^- - \int_E g^- = \Big(\int_E f^+ - \int_E f^-\Big) + \Big(\int_E g^+ - \int_E g^-\Big) .$$
That is,
$$\int_E (f + g) = \int_E f + \int_E g .$$
Proof of (ii) follows from part (i) and the fact that the integral of a non-negative integrable
function is non-negative.
$$\int_{A \cup B} f = \int f \chi_{A \cup B} = \int f \chi_A + \int f \chi_B = \int_A f + \int_B f .$$
* It should be noted that f+g is not defined at points where f = ∞ and g = −∞ and where f = −∞
and g = ∞. However, the set of such points must have measure zero, since f and g are integrable.
Hence the integrability and the value of (f+g) is independent of the choice of values in these
ambiguous cases.
Theorem 32. Let f be a measurable function over E. Then f is integrable over E iff |f| is
integrable over E. Moreover, if f is integrable, then
$$\left|\int_E f\right| \le \int_E |f| .$$
Proof. If f is integrable then both $f^+$ and $f^-$ are integrable. But $|f| = f^+ + f^-$. Hence
integrability of $f^+$ and $f^-$ implies the integrability of |f|.
Moreover, if f is integrable, then since f(x) ≤ |f(x)| = |f|(x), the property which states that if f ≤ g
a.e., then $\int f \le \int g$ implies that
$$\int f \le \int |f| \qquad \text{(i)}$$
On the other hand, since −f(x) ≤ |f(x)|, we have
$$-\int f \le \int |f| \qquad \text{(ii)}$$
From (i) and (ii) we have
$$\left|\int f\right| \le \int |f| .$$
Conversely, suppose f is measurable and suppose |f| is integrable. Since
0 ≤ f +(x) ≤ |f(x)|
it follows that f + is integrable. Similarly f − is also integrable and hence f is integrable.
Lemma. Let f be integrable. Then given ε > 0 there exists δ > 0 such that $\left|\int_A f\right| < \epsilon$ whenever
A is a measurable subset of E with mA < δ.
Proof. When f is non-negative, the lemma has been proved already. Now for an arbitrary
measurable function f we have $f = f^+ - f^-$. So by what we have proved already, given ε > 0,
there exists δ1 > 0 such that
$$\int_A f^+ < \frac{\epsilon}{2}$$
when mA < δ1. Similarly there exists δ2 > 0 such that
$$\int_A f^- < \frac{\epsilon}{2}$$
when mA < δ2. Thus if mA < δ = min(δ1, δ2), we have
$$\left|\int_A f\right| \le \int_A |f| = \int_A f^+ + \int_A f^- < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon .$$
This completes the proof.
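$$\int_E f = \lim_{n \to \infty} \int_E f_n .$$
Proof. Since |fn| ≤ g for every n ∈ N and f(x) = lim fn(x), we have |f| ≤ g. Hence fn and f are
integrable. The function g − fn is non-negative, therefore by Fatou’s Lemma we have
$$\int_E g - \int_E f = \int_E (g - f) \le \liminf_{n \to \infty} \int_E (g - f_n) = \int_E g - \limsup_{n \to \infty} \int_E f_n ,$$
whence
$$\int_E f \ge \limsup_{n \to \infty} \int_E f_n .$$
Similarly, considering g + fn we get
$$\int_E f \le \liminf_{n \to \infty} \int_E f_n .$$
Consequently, we have
$$\int_E f = \lim_{n \to \infty} \int_E f_n .$$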
5
Proof. It suffices to prove the lemma in the case that each interval in C is closed, for otherwise
we replace each interval by its closure and observe that the set of endpoints of I1, I2,…, IN has
measure zero.
Let O be an open set of finite measure containing E. Since C is a Vitali covering of E, we may
suppose without loss of generality that each I of C is contained in O. We choose a sequence <
In> of disjoint intervals of C by induction as follows :
Let I1 be any interval in C and suppose I1, …, In have already been chosen. Let kn be the
supremum of the lengths of the intervals of C which do not meet any of the intervals I1, …, In.
Since each I is contained in O, we have kn ≤ mO < ∞. Unless $E \subset \bigcup_{i=1}^{n} I_i$, we can find In+1 in C
with $l(I_{n+1}) > \frac{1}{2} k_n$ and In+1 disjoint from I1, I2, …, In.
Thus we have a sequence < In > of disjoint intervals of C, and since ∪ In ⊂ O, we have Σ l(In) ≤
mO < ∞. Hence we can find an integer N such that
$$\sum_{n=N+1}^{\infty} l(I_n) < \frac{\epsilon}{5} .$$
Let
$$R = E - \bigcup_{n=1}^{N} I_n .$$
It remains to prove that m*R < ε.
Let x be an arbitrary point of R. Since $\bigcup_{n=1}^{N} I_n$ is a closed set not containing x, we can find an
interval I in C which contains x and whose length is so small that I does not meet any of the
intervals I1, I2, …, IN. If now I ∩ Ii = ∅ for i ≤ n, we must have l(I) ≤ kn < 2l(In+1). Since lim
l(In) = 0, the interval I must meet at least one of the intervals In. Let n be the smallest integer
such that I meets In. We have n > N, and l(I) ≤ kn−1 ≤ 2l(In). Since x is in I, and I has a point in
common with In, it follows that the distance from x to the midpoint of In is at most
$$l(I) + \frac{1}{2} l(I_n) \le \frac{5}{2} l(I_n) .$$
Let Jm denote the interval which has the same midpoint as Im and five times the length of
Im. Then we have x ∈ Jn. This proves
$$R \subset \bigcup_{n=N+1}^{\infty} J_n .$$
Hence
$$m^*R \le \sum_{n=N+1}^{\infty} l(J_n) = 5 \sum_{n=N+1}^{\infty} l(I_n) < \epsilon .$$
$$D^+ f(x) = \limsup_{h \to 0+} \frac{f(x+h) - f(x)}{h}, \qquad D^- f(x) = \limsup_{h \to 0+} \frac{f(x) - f(x-h)}{h},$$
$$D_+ f(x) = \liminf_{h \to 0+} \frac{f(x+h) - f(x)}{h}, \qquad D_- f(x) = \liminf_{h \to 0+} \frac{f(x) - f(x-h)}{h}$$
always exist. These derivatives are known as the Dini derivatives of the function f.
$D^+ f(x)$ and $D_+ f(x)$ are called the upper and lower derivatives on the right, and $D^- f(x)$ and $D_- f(x)$
are called the upper and lower derivatives on the left. Clearly we have $D^+ f(x) \ge D_+ f(x)$ and
$D^- f(x) \ge D_- f(x)$. If $D^+ f(x) = D_+ f(x)$, the function f is said to have a right hand derivative, and
if $D^- f(x) = D_- f(x)$, the function is said to have a left hand derivative.
If
$$D^+ f(x) = D_+ f(x) = D^- f(x) = D_- f(x) \ne \pm\infty ,$$
we say that f is differentiable at x and define
f ′(x) to be the common value of the derivatives at x.
Theorem 1. Every non-decreasing function f defined on the interval [a, b] is differentiable
almost everywhere in [a, b]. The derivative f ′ is measurable and
$$\int_a^b f'(x)\,dx \le f(b) - f(a) .$$
Proof. We shall show first that the points x of the open interval (a, b) at which not all of the four
Dini-derivatives of f are equal form a subset of measure zero. It suffices to show that the
following four subsets of (a, b) are of measure zero :
$$A = \{x \in (a, b) : D_- f(x) < D^+ f(x)\}, \qquad B = \{x \in (a, b) : D_+ f(x) < D^- f(x)\},$$
$$C = \{x \in (a, b) : D_- f(x) < D^- f(x)\}, \qquad D = \{x \in (a, b) : D_+ f(x) < D^+ f(x)\} .$$
To prove m*A = 0, consider the subsets
$$A_{u,v} = \{x \in (a, b) : D_- f(x) < u < v < D^+ f(x)\}$$
of A for all rational numbers u and v satisfying u < v. Since A is the union of this countable
family $\{A_{u,v}\}$, it is sufficient to prove $m^*(A_{u,v}) = 0$ for all pairs u, v with u < v.
For this purpose, denote $\alpha = m^*(A_{u,v})$ and let ε be any positive real number. Choose an open
set $U \supset A_{u,v}$ with m*U < α + ε. Let x be any point of $A_{u,v}$. Since $D_- f(x) < u$, there are
arbitrarily small closed intervals of the form [x − h, x] contained in U such that
f(x) − f(x−h) < uh.
Do this for all $x \in A_{u,v}$ and obtain a Vitali cover C of $A_{u,v}$. Then by the Vitali covering theorem
there is a finite subcollection {J1, J2, …, Jn} of disjoint intervals in C, with Ji = [xi − hi, xi], such that
$$m^*\Big(A_{u,v} - \bigcup_{i=1}^{n} J_i\Big) < \epsilon .$$
Summing over these n intervals, we obtain
$$\sum_{i=1}^{n} [f(x_i) - f(x_i - h_i)] < u \sum_{i=1}^{n} h_i < u\, m^*U < u(\alpha + \epsilon) .$$
Suppose that the interiors of the intervals J1, J2, …, Jn cover a subset F of $A_{u,v}$. Now since $D^+ f(y) > v$,
there are arbitrarily small closed intervals of the form [y, y+k] contained in some of the
intervals Ji (i = 1, 2, …, n) such that
f(y+k) − f(y) > vk.
Do this for all y ∈ F and obtain a Vitali cover D of F. Then again by the Vitali covering lemma we
can select a finite subcollection {K1, K2, …, Km} of disjoint intervals in D, with Ki = [yi, yi + ki], such that
$$m^*\Big[F - \bigcup_{i=1}^{m} K_i\Big] < \epsilon .$$
Since m*F > α − ε, it follows that the measure of the subset H of F which is covered by the
intervals Ki is greater than α − 2ε. Summing over these intervals and keeping in mind that each Ki
is contained in some Jn and that f is non-decreasing, we have
$$\sum_{i=1}^{n} \{f(x_i) - f(x_i - h_i)\} \ge \sum_{i=1}^{m} [f(y_i + k_i) - f(y_i)] > v \sum_{i=1}^{m} k_i > v(\alpha - 2\epsilon) ,$$
so that
$$v(\alpha - 2\epsilon) < u(\alpha + \epsilon) .$$
Since this is true for every ε > 0, we must have vα ≤ uα. Since u < v, this implies that α = 0.
Hence m*A = 0. Similarly, we can prove that m*B = 0, m*C = 0 and m*D = 0.
This shows that
$$g(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$
is defined almost everywhere and that f is differentiable whenever g is finite. Put
$$g_n(x) = n\left[f\left(x + \tfrac{1}{n}\right) - f(x)\right] \quad \text{for } x \in [a, b] ,$$
where we re-define f(x) = f(b) for x ≥ b. Then gn(x) → g(x) for almost all x and so g is
measurable since every gn is measurable. Since f is non-decreasing, we have gn ≥ 0. Hence, by
Fatou’s lemma,
$$\int_a^b g \le \liminf_{n \to \infty} \int_a^b g_n = \liminf_{n \to \infty}\, n \int_a^b \left[f\left(x + \tfrac{1}{n}\right) - f(x)\right] dx$$
$$= \liminf_{n \to \infty}\, n \left[\int_{a + 1/n}^{b + 1/n} f(x)\,dx - \int_a^b f(x)\,dx\right]$$
$$= \liminf_{n \to \infty}\, n \left[\int_a^b f(x)\,dx + \int_b^{b + 1/n} f(x)\,dx - \int_a^{a + 1/n} f(x)\,dx - \int_a^b f(x)\,dx\right]$$
$$= \liminf_{n \to \infty}\, n \left[\int_b^{b + 1/n} f(x)\,dx - \int_a^{a + 1/n} f(x)\,dx\right] \le f(b) - f(a)$$
(using f(x) = f(b) for x ≥ b in the first integral and the fact that f is non-decreasing in the second).
This shows that g is integrable and hence finite almost everywhere. Thus f is differentiable
almost everywhere and g(x) = f ′(x) almost everywhere. This proves the theorem.
and then
$$V_a^b(f) = \sup\{V(f, P) : P \text{ a partition of } [a, b]\} = \sup_P \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})|$$
is called the total variation of f over the interval [a,b]. If $V_a^b(f) < \infty$, then we say that f is a
function of bounded variation and we write f ∈ BV.
Lemma 2. Every non-decreasing function f defined on the interval [a,b] is of bounded variation
with total variation
$$V_a^b(f) = f(b) - f(a) .$$
Proof. For every partition P = {x0, x1, …, xn} of [a,b] we have
$$V(f, P) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| = \sum_{i=1}^{n} [f(x_i) - f(x_{i-1})] = f(b) - f(a) .$$
This implies the lemma.
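The quantity V(f, P) is easy to compute for a given partition. The following minimal sketch (function names are illustrative, not from the text) evaluates it with exact rational arithmetic. For a non-decreasing function the sum telescopes to f(b) − f(a) for every partition, as in Lemma 2; for a function that is not monotone, the value depends on the partition and only the supremum gives the total variation.

```python
from fractions import Fraction

def variation(f, partition):
    """V(f, P): sum of |f(x_i) - f(x_{i-1})| over consecutive partition points."""
    return sum(abs(f(b) - f(a)) for a, b in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n equal subintervals."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def cube(x):
    return x ** 3            # non-decreasing on [0, 2]

def bump(x):
    return (x - 1) ** 2      # decreases on [0, 1], increases on [1, 2]

if __name__ == "__main__":
    P = uniform_partition(Fraction(0), Fraction(2), 7)
    print(variation(cube, P))                        # equals cube(2) - cube(0)
    print(variation(bump, [Fraction(0), Fraction(2)]))
    print(variation(bump, [Fraction(0), Fraction(1), Fraction(2)]))
```

The last two lines show why the supremum over partitions is needed: the coarse partition {0, 2} misses the turning point at x = 1 entirely.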
Theorem 2 (Jordan Decomposition Theorem). A function f: [a,b] → R is of bounded variation
if and only if it is the difference of two non-decreasing functions.
Proof. Let f = g − h on [a,b] with g and h non-decreasing. Then for any subdivision we have
$$\sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| \le \sum_{i=1}^{n} [g(x_i) - g(x_{i-1})] + \sum_{i=1}^{n} [h(x_i) - h(x_{i-1})] = g(b) - g(a) + h(b) - h(a) .$$
Hence
$$V_a^b(f) \le g(b) + h(b) - g(a) - h(a) ,$$
which proves that f is of bounded variation.
On the other hand, let f be of bounded variation. Define two functions g, h : [a, b] → R by
taking
$$g(x) = V_a^x(f), \qquad h(x) = V_a^x(f) - f(x)$$
for every x ∈ [a, b]. Then f(x) = g(x) − h(x).
The function g is clearly non-decreasing. On the other hand, for any two real numbers x and y in
[a, b] with x ≤ y, we have
Now the integrability of f implies integrability of |f| over [a,b]. Therefore, given ε > 0 there is a
δ > 0 such that for every measurable set A ⊂ [a, b] with measure less than δ, we have $\int_A |f| < \epsilon$.
Hence
|F(x) − F(x0)| < ε whenever |x − x0| < δ,
and so F is continuous.
To show that F is of bounded variation, let a = x0 < x1 < … < xn = b be any partition of [a,b].
Then
$$\sum_{i=1}^{n} |F(x_i) - F(x_{i-1})| = \sum_{i=1}^{n} \left|\int_a^{x_i} f(t)\,dt - \int_a^{x_{i-1}} f(t)\,dt\right| = \sum_{i=1}^{n} \left|\int_{x_{i-1}}^{x_i} f(t)\,dt\right| \le \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} |f(t)|\,dt = \int_a^b |f(t)|\,dt .$$
Thus
$$V_a^b F \le \int_a^b |f(t)|\,dt < \infty .$$
Hence F is of bounded variation.
Lemma 9. If f is integrable on [a, b] and
$$\int_a^x f(t)\,dt = 0$$
for all x ∈ [a,b], then f = 0 almost everywhere in [a,b].
Proof. Suppose f > 0 on a set E of positive measure. Then there is a closed set F ⊂ E with
mF > 0. Let O be the open set such that
O = (a, b) − F.
Then either $\int_a^b f \ne 0$ or else
$$0 = \int_a^b f = \int_F f + \int_O f = \int_F f + \sum_{n=1}^{\infty} \int_{a_n}^{b_n} f(t)\,dt , \qquad (1)$$
where the (an, bn) are the disjoint open intervals whose union is O.
$$\int_F f = 0$$
$$= \frac{1}{h} \int_x^{x+h} f(t)\,dt ,$$
$$|f_n(x)| = \left|\frac{1}{h} \int_x^{x+h} f(t)\,dt\right| \le \frac{1}{h} \int_x^{x+h} |f(t)|\,dt \le \frac{1}{h} \int_x^{x+h} K\,dt = \frac{K}{h} \cdot h = K .$$
Moreover,
fn(x) → F ′(x) a.e.
Hence by the theorem of bounded convergence, we have
$$\int_a^c F'(x)\,dx = \lim \int_a^c f_n(x)\,dx = \lim_{h \to 0} \frac{1}{h} \int_a^c [F(x+h) - F(x)]\,dx$$
$$= \lim_{h \to 0} \left[\frac{1}{h} \int_{a+h}^{c+h} F(x)\,dx - \frac{1}{h} \int_a^c F(x)\,dx\right]$$
$$= \lim_{h \to 0} \left[\frac{1}{h} \int_c^{c+h} F(x)\,dx - \frac{1}{h} \int_a^{a+h} F(x)\,dx\right]$$
$$= F(c) - F(a) \quad \text{(since F is continuous)}$$
$$= \int_a^c f(x)\,dx .$$
Hence
$$\int_a^c [F'(x) - f(x)]\,dx = 0$$
for all c ∈ [a,b], and so
F′(x) = f(x) a.e.
by using the previous lemma.
Now we extend the above lemma to unbounded functions.
Theorem 3. Let f be an integrable function on [a,b] and suppose that
$$F(x) = F(a) + \int_a^x f(t)\,dt .$$
Then F′(x) = f(x) for almost all x in [a, b].
Proof. Without loss of generality we may assume that f ≥ 0 (by the definition of the integral
it is sufficient to prove the theorem in this case).
Let fn be defined by fn(x) = f(x) if f(x) ≤ n and fn(x) = n if f(x) > n. Then f − fn ≥ 0 and so
$$G_n(x) = \int_a^x (f - f_n)$$
is an increasing function of x, which must have a derivative almost everywhere, and this
derivative will be non-negative. Also by the above lemma, since fn is bounded (by n), we have
$$\frac{d}{dx} \int_a^x f_n = f_n(x) \ \text{a.e.} \qquad \text{(i)}$$
Therefore,
$$F'(x) = \frac{d}{dx} \int_a^x f = \frac{d}{dx}\Big(G_n + \int_a^x f_n\Big) = \frac{d}{dx} G_n + \frac{d}{dx} \int_a^x f_n \ge f_n(x) \ \text{a.e.} \quad \text{(using (i))}$$
Since n is arbitrary, making n → ∞ we see that
F′(x) ≥ f(x) a.e.
Consequently,
$$\int_a^b F'(x)\,dx \ge \int_a^b f(x)\,dx = F(b) - F(a) \quad \text{(using the hypothesis of the theorem).}$$
Also, since F(x) is an increasing real valued function on the interval [a,b], we have
$$\int_a^b F'(x)\,dx \le F(b) - F(a) = \int_a^b f(x)\,dx .$$
Hence
$$\int_a^b F'(x)\,dx = F(b) - F(a) = \int_a^b f(x)\,dx ,$$
that is,
$$\int_a^b [F'(x) - f(x)]\,dx = 0 .$$
Since F′(x) − f(x) ≥ 0, this implies that F′(x) − f(x) = 0 a.e. and so F′(x) = f(x) a.e.
Absolute Continuity
Definition. A real-valued function f defined on [a,b] is said to be absolutely continuous on
[a,b] if, given ε > 0, there is a δ > 0 such that
$$\sum_{i=1}^{n} |f(x_i') - f(x_i)| < \epsilon$$
for every finite collection {(xi, xi′)} of non-overlapping intervals with
$$\sum_{i=1}^{n} |x_i' - x_i| < \delta .$$
An absolutely continuous function is continuous, since we can take the above sum to consist of
one term only.
Moreover, if
$$F(x) = \int_a^x f(t)\,dt ,$$
then
$$\sum_{i=1}^{n} |F(x_i') - F(x_i)| = \sum_{i=1}^{n} \left|\int_a^{x_i'} f(t)\,dt - \int_a^{x_i} f(t)\,dt\right| = \sum_{i=1}^{n} \left|\int_{x_i}^{x_i'} f(t)\,dt\right| \le \sum_{i=1}^{n} \int_{x_i}^{x_i'} |f(t)|\,dt = \int_E |f(t)|\,dt ,$$
where E is the union of the intervals (xi, xi′), and
$$\int_E |f(t)|\,dt \to 0 \quad \text{as} \quad \sum_{i=1}^{n} |x_i' - x_i| \to 0 .$$
The last step is a consequence of the following result:
“Let ε > 0. Then there is a δ > 0 such that for every measurable set E ⊂ [a, b] with mE < δ, we
have $\int_E |f| < \epsilon$.”
Hence every indefinite integral is absolutely continuous.
Lemma 6. If f is absolutely continuous on [a,b], then it is of bounded variation on [a,b].
Proof. Let δ be a positive real number which satisfies the condition in the definition for ε = 1.
Select a natural number
$$n > \frac{b - a}{\delta} .$$
Consider the partition π = {x0, x1, …, xn} of [a,b] defined by
$$x_i = x_0 + \frac{i(b - a)}{n}$$
for every i = 0, 1, …, n. Since |xi − xi−1| < δ, it follows that
$$V_{x_{i-1}}^{x_i}(f) \le 1 .$$
This implies
$$V_a^b(f) = \sum_{i=1}^{n} V_{x_{i-1}}^{x_i}(f) \le n .$$
Hence f is of bounded variation.
Cor. If f is absolutely continuous, then f has a derivative almost everywhere.
Lemma 7. If f is absolutely continuous on [a,b] and f′(x) = 0 a.e., then f is constant.
Proof. We wish to show that f(a) = f(c) for any c ∈ [a,b].
Let E ⊂ (a,c) be the set of measure c − a in which f ′(x) = 0, and let ε and η be arbitrary positive
numbers. To each x in E there is an arbitrarily small interval [x, x+h] contained in [a,c] such that
|f(x+h) − f(x)| < ηh.
By the Vitali Lemma we can find a finite collection {[xk, yk]} of non-overlapping intervals of this
sort which cover all of E except for a set of measure less than δ, where δ is the positive number
corresponding to ε in the definition of the absolute continuity of f. If we label the xk so that
xk ≤ xk+1, we have
$$a = y_0 \le x_1 < y_1 \le x_2 < \cdots < y_n \le x_{n+1} = c$$
and
$$\sum_{k=0}^{n} |x_{k+1} - y_k| < \delta .$$
Now
$$\sum_{k=1}^{n} |f(y_k) - f(x_k)| \le \eta \sum_{k=1}^{n} (y_k - x_k) < \eta(c - a)$$
by the way the intervals {[xk, yk]} were constructed, and
$$\sum_{k=0}^{n} |f(x_{k+1}) - f(y_k)| < \epsilon$$
by the absolute continuity of f. Thus
$$|f(c) - f(a)| = \left|\sum_{k=0}^{n} [f(x_{k+1}) - f(y_k)] + \sum_{k=1}^{n} [f(y_k) - f(x_k)]\right| \le \epsilon + \eta(c - a) .$$
Since ε and η are arbitrary positive numbers, f(c) − f(a) = 0 and so f(c) = f(a). Hence f is
constant.
Theorem 4. A function F is an indefinite integral if and only if it is absolutely continuous.
Convex Functions
Definition. A function φ defined on an open interval (a, b) is said to be convex if for each x, y ∈
(a, b) and λ, µ such that λ, µ ≥ 0 and λ + µ = 1, we have
φ(λx + µy) ≤ λφ(x) + µφ(y).
The end points a, b can take the values −∞, ∞ respectively.
If we take µ = 1 − λ, 0 ≤ λ ≤ 1, then λ + µ = 1 and so φ will be convex if
(5.1.1) φ(λx + (1−λ)y) ≤ λφ(x) + (1−λ)φ(y).
If we take a < s < t < u < b and
$$\lambda = \frac{t - s}{u - s}, \qquad \mu = \frac{u - t}{u - s}, \qquad x = u, \quad y = s ,$$
then
$$\lambda + \mu = \frac{t - s + u - t}{u - s} = \frac{u - s}{u - s} = 1$$
and so (5.1.1) reduces to
$$\varphi\left(\frac{t - s}{u - s}\, u + \frac{u - t}{u - s}\, s\right) \le \frac{t - s}{u - s}\, \varphi(u) + \frac{u - t}{u - s}\, \varphi(s)$$
or
$$\text{(5.1.2)} \qquad \varphi(t) \le \frac{t - s}{u - s}\, \varphi(u) + \frac{u - t}{u - s}\, \varphi(s) .$$
Thus the segment joining (s, φ(s)) and (u, φ(u)) is never below the graph of φ.
A function φ is sometimes said to be convex on (a,b) if for all x, y ∈ (a, b),
$$\varphi\left(\frac{x + y}{2}\right) \le \frac{1}{2} \varphi(x) + \frac{1}{2} \varphi(y) .$$
(Clearly this condition is a consequence of the main definition, taking λ = µ = 1/2.)
If for all positive numbers λ, µ satisfying λ + µ = 1 and all x ≠ y, we have
φ(λx + µy) < λφ(x) + µφ(y),
then φ is said to be strictly convex.
Theorem 5. Let φ be convex on (a,b) and a < s < t < u < b. Then
$$\frac{\varphi(t) - \varphi(s)}{t - s} \le \frac{\varphi(u) - \varphi(s)}{u - s} \le \frac{\varphi(u) - \varphi(t)}{u - t} .$$
If φ is strictly convex, equality will not occur.
Proof. Let a < s < t < u < b and suppose φ is convex on (a,b). Since
$$\frac{t - s}{u - s} + \frac{u - t}{u - s} = \frac{t - s + u - t}{u - s} = \frac{u - s}{u - s} = 1 ,$$
convexity of φ yields
$$\varphi\left(\frac{t - s}{u - s}\, u + \frac{u - t}{u - s}\, s\right) \le \frac{t - s}{u - s}\, \varphi(u) + \frac{u - t}{u - s}\, \varphi(s)$$
or
$$\text{(5.1.3)} \qquad \varphi(t) \le \frac{t - s}{u - s}\, \varphi(u) + \frac{u - t}{u - s}\, \varphi(s)$$
or
(u−s) φ(t) ≤ (t−s) φ(u) + (u−t) φ(s)
or
(u−s) (φ(t) − φ(s)) ≤ (t−s) φ(u) + uφ(s) − tφ(s) − uφ(s) + sφ(s)
or
(u−s) (φ(t) − φ(s)) ≤ (t−s) (φ(u) − φ(s))
or
$$\text{(5.1.4)} \qquad \frac{\varphi(t) - \varphi(s)}{t - s} \le \frac{\varphi(u) - \varphi(s)}{u - s} .$$
This proves the first inequality. The second inequality can be proved similarly.
If φ is strictly convex, equality cannot hold in (5.1.3) and so it cannot hold in (5.1.4). This
completes the proof of the theorem.
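The chain of slopes in Theorem 5 can be checked on sample functions. The sketch below is an illustration only: the function names are hypothetical and the test points are an arbitrary grid of rationals. It computes the three difference quotients for every triple s < t < u of grid points and reports whether the chain of inequalities holds.

```python
from fractions import Fraction
from itertools import combinations

def slope(phi, p, q):
    """Difference quotient (phi(q) - phi(p)) / (q - p)."""
    return (phi(q) - phi(p)) / (q - p)

def three_slopes(phi, s, t, u):
    """The three quotients of Theorem 5 for s < t < u, in the order of the theorem."""
    return slope(phi, s, t), slope(phi, s, u), slope(phi, t, u)

def satisfies_three_slopes(phi, points):
    """True if the chain of inequalities holds for every triple s < t < u of points."""
    for s, t, u in combinations(sorted(points), 3):
        first, second, third = three_slopes(phi, s, t, u)
        if not (first <= second <= third):
            return False
    return True

def square(x):
    return x * x

grid = [Fraction(i, 2) for i in range(-6, 7)]

if __name__ == "__main__":
    print(satisfies_three_slopes(square, grid))              # convex
    print(satisfies_three_slopes(abs, grid))                 # convex, not strictly
    print(satisfies_three_slopes(lambda x: -x * x, grid))    # concave
```

For φ(x) = x² the three quotients are s + t, s + u and t + u, so the inequalities are strict, in line with the remark on strict convexity.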
Theorem 6. A differentiable function φ is convex on (a,b) if and only if φ′ is a monotonically
increasing function. If φ′′ exists on (a,b), then φ is convex if and only if φ′′ ≥ 0 on (a, b), and
strictly convex if φ′′ > 0 on (a,b).
Proof. Suppose first that φ is differentiable and convex and let a < s < t < u < v < b. Then
applying Theorem 5 to a < s < t < u, we get
through (x0, φ(x0)) is called a Supporting Line at x0 if it always lies on or below the graph of φ, that is,
if
(5.1.7) φ(x) ≥ m(x−x0) + φ(x0)
The line (5.1.6) is a supporting line if and only if its slope m lies between the left and right hand
derivatives at x0. Thus, in particular, there is at least one supporting line at each point.
Theorem 9 (Jensen Inequality). Let φ be a convex function on (−∞, ∞) and let f be an
integrable function on [0,1]. Then
$$\int_0^1 \varphi(f(t))\,dt \ge \varphi\left(\int_0^1 f(t)\,dt\right)$$
Proof. Put
$$\alpha = \int_0^1 f(t)\,dt$$
Let y = m(x−α) + φ(α) be the equation of supporting line at α. Then (by (….) above),
φ(f(t)) ≥ m(f(t)−α) + φ(α)
Integrating both sides with respect to t over [0, 1], we have
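$$\begin{aligned}
\int_0^1 \varphi(f(t))\,dt &\ge m\left[\int_0^1 f(t)\,dt - \int_0^1 f(t)\,dt\right] + \int_0^1 \varphi(\alpha)\,dt \\
&= 0 + \varphi(\alpha)\int_0^1 dt \\
&= \varphi(\alpha) = \varphi\left(\int_0^1 f(t)\,dt\right).
\end{aligned}$$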
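Jensen's inequality can be illustrated numerically. The sketch below assumes the convex function φ = exp and the integrand f(t) = t on [0, 1], and approximates both integrals with a midpoint rule; the function name `midpoint_integral` and the choice of φ and f are assumptions made only for this illustration.

```python
import math

def midpoint_integral(g, n=10_000):
    """Approximate the integral of g over [0, 1] by the midpoint rule."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

phi = math.exp          # convex on the whole real line

def f(t):
    """Illustrative integrable function on [0, 1]."""
    return t

# Left side: integral of phi(f(t)); right side: phi of the integral of f.
lhs = midpoint_integral(lambda t: phi(f(t)))
rhs = phi(midpoint_integral(f))
print(lhs >= rhs)
```

Here the left side approximates e − 1 and the right side approximates exp(1/2), so the inequality holds with room to spare.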
Lp-space
Let p be a positive real number. A measurable function f defined on [0,1] is said to belong to the
space Lp if $\int_0^1 |f|^p < \infty$.
Thus L1 consists precisely of Lebesgue integrable functions on [0,1]. Since
$$|f+g|^p \le 2^p\,(|f|^p + |g|^p),$$
we have
$$\int_0^1 |f+g|^p \le 2^p \int_0^1 |f|^p + 2^p \int_0^1 |g|^p$$
and so if f, g ∈ Lp, it follows that f+g ∈ Lp. Further, if α is a scalar and f ∈ Lp, then clearly αf
belongs to Lp. Hence αf + βg ∈ Lp whenever f, g ∈ Lp and α, β are scalars.
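The pointwise inequality behind the closure of Lp under addition follows from |a + b| ≤ 2 max(|a|, |b|). As a rough sanity check, the sketch below evaluates the difference between the two sides on a small grid of real numbers and exponents; the grid values and the helper name `gap` are arbitrary choices for illustration.

```python
import itertools

def gap(a, b, p):
    """Right side minus left side of |a+b|^p <= 2^p (|a|^p + |b|^p)."""
    return 2**p * (abs(a)**p + abs(b)**p) - abs(a + b)**p

# Hypothetical sample values and exponents p > 0.
values = [-3.0, -1.0, -0.25, 0.0, 0.5, 2.0, 7.0]
exponents = [0.5, 1.0, 2.0, 3.5]

smallest_gap = min(
    gap(a, b, p)
    for a, b in itertools.product(values, repeat=2)
    for p in exponents
)
print(smallest_gap >= 0)
```

A finite grid proves nothing by itself; it only shows the inequality behaving as the one-line argument predicts.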
We shall study these spaces in detail in Course On Functional Analysis.
Proof. Let
$$E = \bigcap_{i=1}^{\infty} E_i$$
Then
$$E_1 = E \cup \bigcup_{i=1}^{\infty} (E_i - E_{i+1}),$$
and this is a disjoint union. Hence
$$\mu(E_1) = \mu(E) + \sum_{i=1}^{\infty} \mu(E_i - E_{i+1})$$
Since
Ei = Ei+1 ∪ (Ei−Ei+1)
is a disjoint union, we have
Proof. Let
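$$G_n = E_n - \bigcup_{i=1}^{n-1} E_i$$
Then Gn ⊂ En and the sets Gn are disjoint. Hence
$$\mu(G_n) \le \mu(E_n),$$
while
$$\mu\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{n=1}^{\infty} \mu(G_n) \le \sum_{n=1}^{\infty} \mu(E_n)$$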
Proof. Let
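$$E = \bigcup_{i=1}^{\infty} E_i$$
Then
$$E = E_1 \cup (E_2 - E_1) \cup (E_3 - E_2) \cup \cdots = E_1 \cup \bigcup_{i=1}^{\infty} (E_{i+1} - E_i)$$
and this is a disjoint union. Hence
$$\begin{aligned}
\mu(E) &= \mu(E_1) + \sum_{i=1}^{\infty} \mu(E_{i+1} - E_i) \\
&= \mu(E_1) + \lim_{n\to\infty} \sum_{i=1}^{n-1} \mu(E_{i+1} - E_i) \\
&= \lim_{n\to\infty} \mu(E_n).
\end{aligned}$$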
Definition. A measure µ is called finite if µ(X) < ∞. It is called σ-finite if there exists a
sequence (Xn) of sets in β such that
$$X = \bigcup_{n=1}^{\infty} X_n$$
and µ(Xn) < ∞ for each n.
By virtue of a lemma proved earlier in Chapter 3, we may always take {Xn} to be a disjoint
sequence of sets. Lebesgue measure on [0,1] is an example of a finite measure while Lebesgue
measure on (−∞, ∞) is an example of a σ-finite measure.
Definition. A set E is said to be of finite measure if E ∈ β and µE < ∞.
A set E is said to be of σ-finite measure if E is the union of a countable collection of measurable
sets of finite measure.
Any measurable set contained in a set of σ-finite measure is itself of σ-finite measure, and the
union of a countable collection of sets of σ-finite measure is again of σ-finite measure.
Definition. A measure space (X, β, µ) is said to be complete if β contains all subsets of sets of
measure zero, that is, if B ∈ β, µB = 0 and A ⊂ B imply A ∈ β.
For example Lebesgue measure is complete, while Lebesgue measure restricted to the σ-algebra
of Borel sets is not complete.
Definition. If (X, β, µ) is a measure space, we say that a subset E of X is locally measurable if
E ∩ B ∈ β for each B ∈ β with µB < ∞.
The collection C of all locally measurable sets is a σ-algebra containing β.
The measure µ is called saturated if every locally measurable set is measurable, i.e., is in β.
For example every σ-finite measure is saturated.
Example. Show that µ(E1 ∆ E2) = 0 implies µE1 = µE2 provided that E1, E2 ∈ β.
Solution. Since E1, E2 ∈ β, we have E1 \ E2 and E2 \ E1 in β and so E1 ∆ E2 ∈ β. Moreover,
µ(E1 ∆ E2) = µ [(E1 \ E2) ∪ (E2 \ E1)]
= µ (E1 \ E2) + µ(E2 \ E1)
But, by hypothesis, µ (E1 ∆ E2) = 0. Therefore,
µ(E1 \ E2) = 0 and µ(E2 \ E1) = 0 .
Also, we can write
E2 = [E1 ∪ (E2−E1)] − (E1−E2)
Then
µE2 = µE1 + 0 − 0 = µE1 .
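The example can be made concrete on a finite set. The following sketch assumes a measure on the four-point set {a, b, c, d} given by hypothetical point masses, two of which are zero; the names `weights` and `mu` are introduced only for this illustration.

```python
# Hypothetical point masses; "a" and "c" carry no mass.
weights = {"a": 0, "b": 2, "c": 0, "d": 5}

def mu(E):
    """Measure of a subset E: the sum of the point masses of its elements."""
    return sum(weights[x] for x in E)

E1 = {"a", "b", "d"}
E2 = {"b", "c", "d"}
sym_diff = E1 ^ E2  # symmetric difference (E1 \ E2) ∪ (E2 \ E1)

print(mu(sym_diff), mu(E1), mu(E2))  # 0 7 7
```

The two sets differ only on points of measure zero, so their measures agree, as the solution shows in general.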
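(iii) $E \subset \bigcup_{i=1}^{\infty} E_i \;\Rightarrow\; \mu^* E \le \sum_{i=1}^{\infty} \mu^* E_i$ (subadditivity)
Because of (ii), property (iii) can be replaced by
(iii)′ $E = \bigcup_{i=1}^{\infty} E_i$, $E_i$ disjoint $\;\Rightarrow\; \mu^* E \le \sum_{i=1}^{\infty} \mu^* E_i$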
The outer measure µ* is called finite if µ* X < ∞ .
By analogy with the case of Lebesgue measure we define a set E to be measurable with respect
to µ* if for every set A we have
µ*A = µ*(A ∩ E) + µ* (A ∩ Ec)
Since µ* is subadditive, it is only necessary to show that
µ* A ≥ µ*(A ∩ E) + µ* (A ∩ Ec)
for every A in order to show that E is measurable.
This inequality is trivially true when µ* A = ∞ and so we need only establish it for sets A with
µ*A finite.
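The measurability condition can be tested exhaustively when the underlying set is finite. The sketch below assumes a three-point set X and a hypothetical outer measure that is 0 on the empty set, 2 on X and 1 on every other subset (this function is monotone and subadditive); it then lists the sets E that split every test set A additively. The names `outer`, `subsets` and `is_measurable` are illustrative.

```python
from itertools import combinations

X = frozenset({0, 1, 2})

def subsets(S):
    """All subsets of S, ordered by size."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def outer(A):
    """Hypothetical outer measure: 0 on the empty set, 2 on X, 1 otherwise."""
    if not A:
        return 0
    return 2 if A == X else 1

def is_measurable(E):
    """Check mu*(A) = mu*(A ∩ E) + mu*(A ∩ E^c) for every subset A of X."""
    return all(outer(A) == outer(A & E) + outer(A - E) for A in subsets(X))

measurable = [E for E in subsets(X) if is_measurable(E)]
print(measurable)
```

In this example only the empty set and X pass the test: a singleton such as {0} fails with A = {0, 1}, since 1 ≠ 1 + 1. The measurable sets still form a σ-algebra, although a very small one.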
Theorem 14. The class β of µ*-measurable sets is a σ-algebra. If µ denotes the restriction of µ* to β, then µ
is a complete measure on β.
Proof. It is obvious that the empty set is measurable. The symmetry of the definition of
measurability in E and Ec shows that Ec is measurable whenever E is measurable.
Let E1 and E2 be measurable sets. From the measurability of E2 ,
µ*A = µ* (A ∩ E2) + µ*(A ∩ E2c)
and
µ*A = µ* (A∩ E2) + µ*(A ∩ E2c ∩ E1) + µ*(A ∩ E1c ∩ E2c)
by the measurability of E1. Since
A ∩ [E1 ∪ E2] = [A ∩ E2] ∪ [A ∩ E1 ∩ E2c]
we have
µ*(A∩ [E1 ∪ E2] ) ≤ µ*(A ∩ E2) + µ* (A ∩ E2c ∩ E1)
by subadditivity, and so
µ*A ≥ µ* (A∩ [E1 ∪ E2] ) + µ* (A ∩ E1c ∩ E2c)
This means that E1 ∪ E2 is measurable. Thus the union of two measurable sets is measurable,
and by induction the union of any finite number of measurable sets is measurable, showing that β
is an algebra of sets.
Assume that E = ∪ Ei, where < Ei > is a disjoint sequence of measurable sets, and set
$$G_n = \bigcup_{i=1}^{n} E_i$$
Then (by what we have proved above) Gn is measurable, and
$$\mu^* A = \mu^*(A \cap G_n) + \mu^*(A \cap G_n^c) \ge \mu^*(A \cap G_n) + \mu^*(A \cap E^c)$$
because $E^c \subset G_n^c$.
Now $G_n \cap E_n = E_n$ and $G_n \cap E_n^c = G_{n-1}$, and by the measurability of En, we have
$$\mu^*(A \cap G_n) = \mu^*(A \cap E_n) + \mu^*(A \cap G_{n-1})$$
By induction (as above, $\mu^*(A \cap G_{n-1}) = \mu^*(A \cap E_{n-1}) + \mu^*(A \cap G_{n-2})$, and so on)
$$\mu^*(A \cap G_n) = \sum_{i=1}^{n} \mu^*(A \cap E_i)$$
and so, letting n → ∞,
$$\begin{aligned}
\mu^* A &\ge \mu^*(A \cap E^c) + \sum_{i=1}^{\infty} \mu^*(A \cap E_i) \\
&\ge \mu^*(A \cap E^c) + \mu^*(A \cap E),
\end{aligned}$$
since $A \cap E \subset \bigcup_{i=1}^{\infty} (A \cap E_i)$.
Thus E is measurable. Since the union of any sequence of sets in an algebra can be replaced by a
disjoint union of sets in the algebra, it follows that β is a σ-algebra.
We now show that µ is finitely additive. Let E1 and E2 be disjoint measurable sets. Then the
measurability of E2 implies that
µ (E1 ∪ E2) = µ* (E1 ∪ E2)
= µ* ( [ E1 ∪ E2] ∩ E2) + µ* ( [ E1 ∪ E2 ] ∩ E2c)
= µ* E2 + µ* E1
Finite additivity now follows by induction.
If E is the disjoint union of the measurable sets {Ei}, then
$$\mu E \ge \mu\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} \mu E_i$$
and so
$$\mu E \ge \sum_{i=1}^{\infty} \mu E_i$$
But $\mu E \le \sum_{i=1}^{\infty} \mu E_i$, by the subadditivity of µ*. Hence µ is countably additive and
hence a measure since it is non-negative and µ(∅) = µ*(∅) = 0.
Measure on an Algebra
By a measure on an algebra we mean a non-negative extended real valued set function µ
defined on an algebra A of sets such that
(i) µ(∅) = 0
(ii) If < Ai > is a disjoint sequence of sets in A whose union is also in A, then
$$\mu\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mu A_i$$
Thus a measure on an algebra A is a measure ⇔ A is a σ-algebra.
We construct an outer measure µ* and show that the measure µ induced by µ* is an extension
of µ (measure defined on an algebra).
We define
$$\mu^* E = \inf \sum_{i=1}^{\infty} \mu A_i,$$
where < Ai > ranges over all sequences from A such that
$$E \subset \bigcup_{i=1}^{\infty} A_i.$$
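Lemma 8. If $A \in \mathcal{A}$ and if < Ai > is any sequence of sets in $\mathcal{A}$ such that $A \subset \bigcup_{i=1}^{\infty} A_i$,
then $\mu A \le \sum_{i=1}^{\infty} \mu A_i$.
Proof. Set
$$B_n = A \cap A_n \cap A_{n-1}^c \cap \cdots \cap A_1^c$$
Then $B_n \in \mathcal{A}$ and $B_n \subset A_n$. But A is the disjoint union of the sequence < Bn > and so by
countable additivity
$$\mu A = \sum_{n=1}^{\infty} \mu B_n \le \sum_{n=1}^{\infty} \mu A_n$$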
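Corollary. If $A \in \mathcal{A}$, then µ*A = µA.
In fact, given ε > 0, the definition of µ* provides a sequence < An > from $\mathcal{A}$ with $A \subset \bigcup A_n$ for which, by Lemma 8,
$$\mu A \le \sum_{n=1}^{\infty} \mu A_n < \mu^* A + \varepsilon,$$
that is,
µA ≤ µ*A + ε
Since ε is arbitrary, we have
µA ≤ µ*A
Also, by definition,
µ*A ≤ µA
Hence
µ*A = µA .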
Lemma 9. The set function µ* is an outer measure.
Proof. µ*, by definition, is a monotone non-negative set function defined for all sets and
µ*(∅) = 0. We have only to show that it is countably subadditive. Let $E \subset \bigcup_{i=1}^{\infty} E_i$. If µ*Ei = ∞
for any i, we have µ*E ≤ Σ µ*Ei = ∞. If not, given ε > 0, there is for each i a sequence
$\langle A_{ij} \rangle_{j=1}^{\infty}$ of sets in $\mathcal{A}$ such that $E_i \subset \bigcup_{j=1}^{\infty} A_{ij}$ and
$$\sum_{j=1}^{\infty} \mu A_{ij} < \mu^* E_i + \frac{\varepsilon}{2^i}$$
Then
$$\mu^* E \le \sum_{i,j} \mu A_{ij} < \sum_{i=1}^{\infty} \mu^* E_i + \varepsilon$$
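Proof. Let E be an arbitrary set of finite outer measure and ε a positive number. Then there is a
sequence < Ai > from $\mathcal{A}$ such that E ⊂ ∪ Ai and
$$\sum \mu A_i < \mu^* E + \varepsilon$$
By the additivity of µ on $\mathcal{A}$, we have
$$\mu(A_i) = \mu(A_i \cap A) + \mu(A_i \cap A^c)$$
Hence
$$\begin{aligned}
\mu^* E + \varepsilon &> \sum_{i=1}^{\infty} \mu(A_i \cap A) + \sum_{i=1}^{\infty} \mu(A_i \cap A^c) \\
&\ge \mu^*(E \cap A) + \mu^*(E \cap A^c)
\end{aligned}$$
because
$$E \cap A \subset \bigcup (A_i \cap A)$$
and
$$E \cap A^c \subset \bigcup (A_i \cap A^c)$$
Since ε was an arbitrary positive number,
$$\mu^* E \ge \mu^*(E \cap A) + \mu^*(E \cap A^c)$$
and thus A is µ*-measurable.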
Remark. The outer measure µ* which we have defined above is called the outer measure
induced by µ.
Notation. For a given algebra A of sets we use Aσ to denote those sets which are countable
unions of sets of A and use Aσδ to denote those sets which are countable intersections of sets in
Aσ.
Theorem 15. Let µ be a measure on an algebra A, µ* the outer measure induced by µ, and E
any set. Then for ε > 0, there is a set A ∈ Aσ with E ⊂ A and
µ*A ≤ µ*E + ε
There is also a set B ∈ Aσδ with E ⊂ B and µ*E = µ*B.
Proof. By the definition of µ* there is a sequence < Ai > from A such that E ⊂ ∪ Ai and
$$\sum_{i=1}^{\infty} \mu A_i \le \mu^* E + \varepsilon \tag{1}$$
Set
A = ∪ Ai
Then
$$\mu^* A \le \sum \mu^* A_i = \sum \mu A_i \tag{2}$$
because µ* and µ agree on members of A by the corollary.
Hence (1) and (2) imply
µ*A ≤ µ*E + ε
which proves the first part.
To prove the second statement, we note that for each positive integer n there is a set An in Aσ
such that E ⊂ An and
$$\mu^* A_n < \mu^* E + \frac{1}{n}$$
(from the first part proved above).
Let B = ∩ An. Then B ∈ Aσδ and E ⊂ B. Since B ⊂ An,
$$\mu^* B \le \mu^* A_n < \mu^* E + \frac{1}{n}$$
Now considering the sets Ei from inequality (2), $F = \bigcup_{i=1}^{\infty} E_i \in \beta$ and so F is µ*-measurable. Since
A ⊂ F,
µ(F) = µ(A) + µ(F−A)
or
µ (F \ A) = µ (F) −µ(A) < ε (from (2))
Since µ1(E) = µ(E) for each E ∈ A, we have µ1(F) = µ (F). Then
µ (A) ≤ µ(F) = µ1(F) = µ1(A) + µ1 (F \ A)
≤ µ (A) + µ (F \ A)
(by inequality (3) because (3) is true if A is replaced by any set in β with finite µ-measure).
The relation (4) then yields
µ(A) ≤ µ1(A) + ε
Since this is true for all ε > 0, we have
µ (A) ≤ µ1(A) (5)
The relations (3) and (5) then yield
µA = µ1(A)
which completes the proof of the theorem.
Definition. Let f be a non-negative extended real valued measurable function on the measure
space (X, β, µ). Then $\int f\,d\mu$ is the supremum of the integrals $\int \varphi\,d\mu$ as φ ranges over all simple
functions with 0 ≤ φ ≤ f .
Lemma 1 (Fatou’s Lemma). Let < fn > be a sequence of non-negative measurable functions
which converge almost everywhere on a set E to a function f. Then
$$\int_E f \le \liminf_{n\to\infty} \int_E f_n$$
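Proof. Without loss of generality we may assume that fn(x) → f(x) for each x ∈ E. From the
definition of $\int_E f$ it is sufficient to show that, if φ is any non-negative simple function with φ ≤ f,
then
$$\int_E \varphi \le \liminf_{n\to\infty} \int_E f_n.$$
When $\int_E \varphi = \infty$, we have
$$\liminf_{n\to\infty} \int_E f_n = \infty = \int_E \varphi.$$
If $\int_E \varphi < \infty$, then there is a measurable set A ⊂ E with µA < ∞ such that φ vanishes identically on
E \ A. Let M be the maximum of φ, let ε be a given positive number, and set
$$A_n = \{\, x \in E : f_k(x) > (1-\varepsilon)\,\varphi(x) \text{ for all } k \ge n \,\}$$
Then < An > is an increasing sequence of sets whose union contains A, and so < A \ An > is a
decreasing sequence of sets whose intersection is empty. Therefore (by a proposition proved
already) lim µ(A \ An) = 0, and so we can find an n such that µ(A \ Ak) < ε for all k ≥ n.
Thus for k ≥ n
$$\begin{aligned}
\int_E f_k \ge \int_{A_k} f_k &\ge (1-\varepsilon) \int_{A_k} \varphi \\
&\ge (1-\varepsilon) \int_A \varphi - \int_{A - A_k} \varphi \\
&\ge (1-\varepsilon) \int_E \varphi - \int_{A - A_k} \varphi \\
&\ge \int_E \varphi - \varepsilon \int_E \varphi - \varepsilon M
\end{aligned}$$
Since ε is arbitrary,
$$\liminf_{k\to\infty} \int_E f_k \ge \int_E \varphi.$$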
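The inequality in Fatou's Lemma can be strict. A standard illustration, sketched below under the assumption that E = [0, 1] with Lebesgue measure, takes f_n equal to n on the open interval (0, 1/n) and zero elsewhere; the names `f` and `integral` are chosen for this sketch. Each f_n has integral 1, while f_n(x) → 0 for every x, so the integral of the limit is 0 and the lower limit of the integrals is 1.

```python
from fractions import Fraction

def f(n, x):
    """Hypothetical f_n: height n on the open interval (0, 1/n), zero elsewhere."""
    return n if 0 < x < Fraction(1, n) else 0

def integral(n):
    """Exact integral of f_n over [0, 1]: height times interval length."""
    return Fraction(n) * Fraction(1, n)

# Every integral equals 1, so the lower limit of the integrals is 1.
integrals = [integral(n) for n in range(1, 101)]

# At a fixed point x the values f_n(x) vanish once 1/n <= x.
x = Fraction(1, 10)
tail_values = [f(n, x) for n in range(10, 101)]

print(set(integrals), set(tail_values))
```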