Linear Differential Operators
Cornelius Lanczos
SIAM's Classics in Applied Mathematics series consists of books that were
previously allowed to go out of print. These books are republished by SIAM as a
professional service because they continue to be important resources for
mathematical scientists.
Editor-in-Chief
Gene H. Golub, Stanford University
Editorial Board
Richard A. Brualdi, University of Wisconsin-Madison
Herbert B. Keller, California Institute of Technology
Ingram Olkin, Stanford University
Robert E. O'Malley, Jr., University of Washington
SIAM would like to thank Bart Childs, Texas A&M University, for suggesting
the corrections made to this edition.
All rights reserved. Printed in the United States of America. No part of this
book may be reproduced, stored, or transmitted in any manner without the
written permission of the publisher. For information, write to the Society for
Industrial and Applied Mathematics, 3600 University City Science Center,
Philadelphia, PA 19104-2688.
The royalties from the sales of this book are being placed in a fund to help
students attend SIAM meetings and other SIAM related activities. This fund is
administered by SIAM and qualified individuals are encouraged to write
directly to SIAM for guidelines.
SIAM is a registered trademark.
To the apostle of universal humanity, Father Pire,
whose charity knows no limits.
CONTENTS
PAGE
PREFACE xiii
BIBLIOGRAPHY xvii
1. INTERPOLATION
1. Introduction 1
2. The Taylor expansion 2
3. The finite Taylor series with the remainder term 3
4. Interpolation by polynomials 5
5. The remainder of the Lagrangian interpolation formula 6
6. Equidistant interpolation 8
7. Local and global interpolation 11
8. Interpolation by central differences 13
9. Interpolation around the midpoint of the range 16
10. The Laguerre polynomials 17
11. Binomial expansions 21
12. The decisive integral transform 24
13. Binomial expansions of the hypergeometric type 26
14. Recurrence relations 27
15. The Laplace transform 29
16. The Stirling expansion 32
17. Operations with the Stirling functions 34
18. An integral transform of the Fourier type 35
19. Recurrence relations associated with the Stirling series 37
20. Interpolation of the Fourier transform 40
21. The general integral transform associated with the Stirling
series 42
22. Interpolation of the Bessel functions 45
2. HARMONIC ANALYSIS
1. Introduction 49
2. The Fourier series for differentiable functions 50
3. The remainder of the finite Fourier expansion 53
4. Functions of higher differentiability 56
5. An alternative method of estimation 58
6. The Gibbs oscillations of the finite Fourier series 60
7. The method of the Green's function 66
8. Non-differentiable functions. Dirac's delta function 68
9. Smoothing of the Gibbs oscillations by Fejer's method 71
10. The remainder of the arithmetic mean method 72
11. Differentiation of the Fourier series 74
12. The method of the sigma factors 75
13. Local smoothing by integration 76
14. Smoothing of the Gibbs oscillations by the sigma method 78
15. Expansion of the delta function 80
16. The triangular pulse 81
17. Extension of the class of expandable functions 83
18. Asymptotic relations for the sigma factors 84
19. The method of trigonometric interpolation 89
20. Error bounds for the trigonometric interpolation method 91
21. Relation between equidistant trigonometric and polynomial
interpolations 93
22. The Fourier series in curve fitting 98
3. MATRIX CALCULUS
1. Introduction 100
2. Rectangular matrices 102
3. The basic rules of matrix calculus 103
4. Principal axis transformation of a symmetric matrix 106
5. Decomposition of a symmetric matrix 111
6. Self-adjoint systems 113
7. Arbitrary n x m systems 115
8. Solvability of the general n x m system 118
9. The fundamental decomposition theorem 120
10. The natural inverse of a matrix 124
11. General analysis of linear systems 127
12. Error analysis of linear systems 129
13. Classification of linear systems 134
14. Solution of incomplete systems 139
15. Over-determined systems 141
16. The method of orthogonalisation 142
17. The use of over-determined systems 144
18. The method of successive orthogonalisation 148
19. The bilinear identity 152
20. Minimum property of the smallest eigenvalue 158
4. THE FUNCTION SPACE
1. Introduction 163
2. The viewpoint of pure and applied mathematics 164
3. The language of geometry 165
4. Metrical spaces of infinitely many dimensions 166
5. The function as a vector 167
6. The differential operator as a matrix 170
7. The length of a vector 173
8. The scalar product of two vectors 175
9. The closeness of the algebraic approximation 175
10. The adjoint operator 179
11. The bilinear identity 181
12. The extended Green's identity 182
13. The adjoint boundary conditions 184
14. Incomplete systems 187
15. Over-determined systems 190
16. Compatibility under inhomogeneous boundary conditions 192
17. Green's identity in the realm of partial differential operators 195
18. The fundamental field operations of vector analysis 198
19. Solution of incomplete systems 201
5. THE GREEN'S FUNCTION
1. Introduction 206
2. The role of the adjoint equation 207
3. The role of Green's identity 208
4. The delta function δ(x, ξ) 208
5. The existence of the Green's function 211
6. Inhomogeneous boundary conditions 217
7. The Green's vector 220
8. Self-adjoint systems 225
9. The calculus of variations 229
10. The canonical equations of Hamilton 230
11. The Hamiltonisation of partial operators 237
12. The reciprocity theorem 239
13. Self-adjoint problems. Symmetry of the Green's function 241
14. Reciprocity of the Green's vector 241
15. The superposition principle of linear operators 244
16. The Green's function in the realm of ordinary differential
operators 247
17. The change of boundary conditions 255
18. The remainder of the Taylor series 256
19. The remainder of the Lagrangian interpolation formula 258
20. Lagrangian interpolation with double points 263
21. Construction of the Green's vector 266
22. The constrained Green's function 270
23. Legendre's differential equation 275
24. Inhomogeneous boundary conditions 278
25. The method of over-determination 281
26. Orthogonal expansions 286
27. The bilinear expansion 291
28. Hermitian problems 299
29. The completion of linear operators 308
6. COMMUNICATION PROBLEMS
1. Introduction 315
2. The step function and related functions 315
3. The step function response and higher order responses 320
4. The input-output relation of a galvanometer 323
5. The fidelity problem of the galvanometer response 325
6. Fidelity damping 327
7. The error of the galvanometer recording 328
8. The input-output relation of linear communication devices 330
9. Frequency analysis 334
10. The Laplace transform 336
11. The memory time 337
12. Steady state analysis of music and speech 339
13. Transient analysis of noise phenomena 342
7. STURM-LIOUVILLE PROBLEMS
1. Introduction 348
2. Differential equations of fundamental significance 349
3. The weighted Green's identity 352
4. Second order operators in self-adjoint form 356
5. Transformation of the dependent variable 359
6. The Green's function of the general second order differential
equation 364
7. Normalisation of second order problems 368
8. Riccati's differential equation 370
9. Periodic solutions 371
10. Approximate solution of a differential equation of second
order 374
11. The joining of regions 376
12. Bessel functions and the hypergeometric series 378
13. Asymptotic properties of Jv(z) in the complex domain 380
14. Asymptotic expression of Jp(x) for large values of x 382
15. Behaviour of Jp(z) along the imaginary axis 384
16. The Bessel functions of the order 1/3 385
17. Jump conditions for the transition "exponential-periodic" 387
18. Jump conditions for the transition "periodic-exponential" 388
19. Amplitude and phase in the periodic domain 389
20. Eigenvalue problems 390
21. Hermite's differential equation 391
22. Bessel's differential equation 394
23. The substitute functions in the transitory range 400
24. Tabulation of the four substitute functions 404
25. Increased accuracy in the transition domain 405
26. Eigensolutions reducible to the hypergeometric series 409
27. The ultraspherical polynomials 410
28. The Legendre polynomials 412
29. The Laguerre polynomials 418
30. The exact amplitude equation 420
31. Sturm-Liouville problems and the calculus of variations 425
8. BOUNDARY VALUE PROBLEMS
1. Introduction 432
2. Inhomogeneous boundary conditions 435
3. The method of the "separation of variables" 438
4. The potential equation of the plane 439
5. The potential equation in three dimensions 448
6. Vibration problems 454
7. The problem of the vibrating string 456
8. The analytical nature of hyperbolic differential operators 464
9. The heat flow equation 469
10. Minimum problems with constraints 472
11. Integral equations in the service of boundary value problems 476
12. The conservation laws of mechanics 479
13. Unconventional boundary value problems 486
14. The eigenvalue λ = 0 as a limit point 487
15. Variational motivation of the parasitic spectrum 494
16. Examples for the parasitic spectrum 498
17. Physical boundary conditions 504
18. A universal approach to the theory of boundary value
problems 508
9. NUMERICAL SOLUTION OF TRAJECTORY PROBLEMS
1. Introduction 512
2. Differential equations in normal form 513
3. Trajectory problems 514
4. Local expansions 515
5. The method of undetermined coefficients 517
6. Lagrangian interpolation in terms of double points 520
7. Extrapolations of maximum efficiency 521
8. Extrapolations of minimum round-off 521
9. Estimation of the truncation error 524
10. End-point extrapolation 526
11. Mid-point interpolations 527
12. The problem of starting values 529
13. The accumulation of truncation errors 531
14. The method of Gaussian quadrature 534
15. Global integration by Chebyshev polynomials 536
16. Numerical aspects of the method of global integration 540
17. The method of global correction 546
Appendix 551
Index 555
PREFACE
off at a tangent", if necessary. At the same time they force him to develop
those manipulative skills, without which the successful study of mathematics
is not conceivable.)
It is the author's hope that his book will stimulate discussions and
research at the graduate level. Although the scope of the book is restricted
to certain fundamental aspects of the theory of linear differential operators,
the thorough and comprehensive study of these aspects seemed to him well
worth pursuing. By a peculiar quirk of historical development the brilliant
researches of Fredholm and Hilbert in the field of integral equations over-
shadowed the importance of differential operators, and the tendency is
widespread to transform a given differential equation immediately into an
integral equation, and particularly an integral equation of the Fredholm
type which in algebraic language is automatically equivalent to the n x n
type of matrices. This is a tendency which completely overlooks the true
nature of partial differential operators. The present book departs sharply
from the preconceived notion of "well-posed" problems and puts the
general—that is arbitrarily over-determined or under-determined—case in
the focus of interest. The properties of differential operators are thus
examined on an unbiased basis and a theory is developed which submerges
the "well-posed" type of problems in a much more comprehensive framework.
The author apologises to the purist and the modernist that his language
is that of classical mathematics to which he is bound by tradition and
conviction. In his opinion the classical methods can go a long way in the
investigation of the fundamental problems which arise in the field of
differential operators. This is not meant, however, as a slight on those
who with more powerful tools may reach much more sweeping results.
Yet, there was still another viewpoint which militated against an overly
"modernistic" treatment. This book is written primarily for the natural
scientist and engineer to whom a problem in ordinary or partial differential
equations is not a problem of logical acrobatism, but a problem in the
exploration of the physical universe. To get an explicit solution of a given
boundary value problem is in this age of large electronic computers no
longer a basic question. The problem can be coded for the machine and
the numerical answer obtained. But of what value is the numerical answer
if the scientist does not understand the peculiar analytical properties and
idiosyncrasies of the given operator? The author hopes that this book will
help him in this task by telling him something about the manifold aspects
of a fascinating field which is still far from being properly explored.
Acknowledgements. In the Winter Semester 1957-58 the author had the
privilege to give a course on "Selected Topics of Applied Analysis" in the
Graduate Seminar of Professor A. Lonseth, Oregon State College, Corvallis,
Oregon. The lecture notes of that course form the basic core from which the
present book took its start.
By the generous invitation of Professor R. E. Langer the excellent
research facilities and stimulating associations of the Mathematics Research
Center of the U.S. Army in Madison, Wis., were opened to the author, in
CHAPTER 1
INTERPOLATION
1.1. Introduction
The art of interpolation goes back to the early Hindu algebraists. The
idea of "linear interpolation" was in fact known by the early Egyptians and
Babylonians and belongs to the earliest arithmetic experiences of mankind.
But the science of interpolation in its more intricate forms starts with the
time of Newton and Wallis. The art of table-making brought into the
foreground the idea of obtaining some intermediate values of the tabulated
function in terms of the calculated tabular values, and the aim was to
achieve an accuracy which could match the accuracy of the basic values.
Since these values were often obtained with a large number of significant
figures, the art of interpolation had to be explored with great circumspection.
And thus we see the contemporaries of Newton, particularly Gregory,
Stirling, and Newton himself, developing the fundamental tools of the
calculus of interpolation.
The unsettled question remained, to what extent can we trust the
convergence of the various interpolation formulas. This question could not
be settled without the evolution of that exact "limit concept" which came
about in the beginning of the 19th century, through the efforts of Cauchy
and Gauss. But the true nature of equidistant interpolation was discovered
even later, around 1900, through the investigations of Runge and Borel.
Our aim in the present chapter will be to discuss some of the fundamental
aspects of the theory of interpolation, in particular those features of the
theory which can be put to good use in the later study of differential
equations. As a general introduction to the processes of higher analysis
one could hardly find a more suitable subject than the theory of interpolation.
than an infinitesimal element of the function and yet we can predict what
the value of f(x) will be outside of the point x = a, within a circle of the
complex plane whose centre is at x = a. The Taylor series is thus not an
interpolating but an extrapolating series.
Problem 1. Given the following function:
Find the convergence radius of the Taylor series if the centre of expansion is
at x = π.
[Answer: r = 2π]
Problem 2. The mere existence of all the derivatives f^{(n)}(a) on the real axis
is not sufficient for the existence of the Taylor series. Show that the function
The unfortunate feature of this symbolism is that the equality sign is used
for an infinite process in which in fact equality never occurs.
Instead of operating with the infinite Taylor series we may prefer the
use of the finite series
We shall see later (cf. Chapter 5.18) that on the basis of the general theory of
differential operators we can derive a very definite expression for ηn(x) in the
form of the following definite integral:
which holds if p(x) does not change its sign in the interval [a, b] and f(x) is
continuous; x is some unknown point of the interval [a, b]. These conditions
are satisfied in the case of (5) if we identify f(ξ) with f^{(n)}(a + ξ) and
p(ξ) with (t − ξ)^{n−1}. Hence we obtain the estimation
for x > 0. Show that for any n > 4 the remainder of the series is smaller than
the first neglected term.
Problem 6. The infinite Taylor series of the function (8) converges only up to
x = 1. Let us assume that we want to obtain f(2). How many terms of the
series shall we employ for maximum accuracy, and what error bound do we
obtain for it? Demonstrate by the discussion of the error term (4) that the
error can be greatly diminished by adding the first neglected term with the
weight ½.
[Answer:
f9(2) = 46.8984, corrected by ½ of next term: f9*(2) = 46.7617
correct value: f(2) = 46.7654]
This problem was solved with great ingenuity by Lagrange, who proceeded
as follows.
We construct the "fundamental polynomial" Fn(x) by multiplying all
the root factors:
and
These auxiliary polynomials Pk(x) have the property that they give zero at
all the root points xi, except at xk where they give the value 1:
[Answer:
Problem 8. Show that if the xk are evenly distributed around the origin (i.e.
every xk appears with + and − signs), the interpolating polynomial contains
only even powers if f(x) is an even function: f(x) = f(−x), and only odd powers
if f(x) is an odd function: f(x) = −f(−x). Show that this is not the case if the
xk are not evenly distributed.
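Lagrange's construction lends itself to a few lines of code. The following sketch is my own illustration (the function names are not from the text): it builds the auxiliary polynomials Pk(x) from the root factors (x − xi)/(xk − xi) and forms the interpolating sum Σk f(xk) Pk(x), checking it on symmetric nodes as in Problem 8.

```python
# Illustrative sketch of Lagrange's construction: the auxiliary
# polynomial P_k(x) equals 1 at x_k and 0 at every other root point x_i.

def lagrange_interpolant(xs, fs):
    """Return p(x) = sum_k fs[k] * P_k(x) for distinct nodes xs."""
    def P(k, x):
        # Product of the root factors (x - x_i)/(x_k - x_i), i != k
        val = 1.0
        for i, xi in enumerate(xs):
            if i != k:
                val *= (x - xi) / (xs[k] - xi)
        return val
    return lambda x: sum(f * P(k, x) for k, f in enumerate(fs))

# Nodes evenly distributed around the origin, with even data (Problem 8):
xs = [-2.0, -1.0, 1.0, 2.0]
fs = [x * x for x in xs]          # f(x) = x^2 is even
p = lagrange_interpolant(xs, fs)
print(round(p(0.5), 10))           # 0.25: the even polynomial is recovered exactly
```

Since the data come from a polynomial of degree 2 and four nodes are used, the interpolant reproduces the function exactly, and only even powers survive.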
We can consider this differential equation as the defining equation for ηn(x),
although a differential equation of nth order cannot have a unique solution
without adding n "boundary conditions". These conditions are provided
by the added information that ηn(x) vanishes at the n points of interpolation
x = xk:
Although these are inside conditions rather than boundary conditions, they
make our problem uniquely determined.
At this point we anticipate something that will be fully proved in Chapter 5.
The solution of our problem (3) (with the given auxiliary conditions) can be
obtained with the help of an auxiliary function called the "Green's function
G(x, ξ)" which is constructed according to definite rules. It is quite
independent of the given "right side" of the differential equation (2). The
solution appears in the form of a definite integral:
where x is some unknown point of the interval [x1, xn]. The second factor
does not depend on ξ any more, but is a pure function of x, which is
independent of f(x). Hence we can evaluate it by choosing any f(x) we like.
We will choose for f(x) the special function
where Fn(x) is the fundamental polynomial (4.2). This function has the
property that it vanishes at all the points of interpolation and thus the
interpolating polynomial Pn−1(x) vanishes identically. Hence ηn(x) becomes
f(x) itself. Moreover, this choice has the advantage that it eliminates the
unknown position of x since here the nth derivative of f(x) is simply 1
throughout the range. Hence we obtain from (7):
The second factor of the right side of (7) is now determined and we obtain
the estimation
If a function f(x) is tabulated, we shall almost always give the values of f(x)
in equidistant arguments (1). Furthermore, if a function is observed by
physical measurements, our measuring instruments (for example clock
mechanisms) will almost exclusively provide us with functional values which
belong to equidistant intervals. Hence the interpolation between equi-
distant arguments was from the beginning of interpolation theory treated
as the most important special case of Lagrangian interpolation. Here we
need not operate with the general formulae of Lagrangian interpolation
(although in Chapter 2.21 we shall discover a particularly interesting property
of equidistant interpolation exactly on the basis of the general formula of
Lagrangian interpolation) but can develop a specific solution of our problem
by a certain operational approach which uses the Taylor series as its model
and translates the operational properties of this series into the realm of
difference calculus.
In the calculus of finite differences it is customary to normalise the given
and so on.
Now let us start with the truncated Taylor series, choosing the centre of
expansion as the point x = 0:
while
together with the same boundary conditions (7) and (8), then we can translate
the Taylor series into the calculus of finite differences by putting
and proving that ηn(x) vanishes at x = 0, together with its first, second,
. . ., (n − 1)st differences. But this means that the polynomial
coincides with f(x) in all its differences up to the order n − 1 and since
these differences are formed in terms of the functional values
shows that these functions indeed satisfy the functional equation (9):
or in symbolic notation
with the understanding that in the expansion on the right side we replace
f^m by f(m).
* Cf. A. A., p. 308.
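The translation of the Taylor series into the difference calculus can also be sketched numerically. In this illustrative fragment (the names are my own), a forward difference table supplies the leading differences Δ^k f(0), and the truncated Gregory-Newton series Σk C(x, k) Δ^k f(0) is evaluated through the generalised binomial coefficient; for data taken from a polynomial the series reproduces the function exactly.

```python
def forward_differences(fs):
    """Leading forward differences Δ^k f(0) of equidistant data f(0), f(1), ..."""
    table = [list(fs)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return [row[0] for row in table]

def gregory_newton(fs, x):
    """Evaluate the finite Gregory-Newton series  sum_k C(x,k) Δ^k f(0)."""
    total, binom = 0.0, 1.0              # C(x,0) = 1
    for k, d in enumerate(forward_differences(fs)):
        total += binom * d
        binom *= (x - k) / (k + 1)       # C(x,k+1) = C(x,k)(x-k)/(k+1)
    return total

fs = [1, 4, 9, 16, 25]                   # f(m) = (m+1)^2 at m = 0, ..., 4
print(gregory_newton(fs, 2.0))           # 9.0  (coincides with the data)
print(gregory_newton(fs, 1.5))           # 6.25 (exact, since f is a polynomial)
```

For f(m) = (m+1)² the differences terminate (Δ³f = 0), so the series is finite and reproduces (x+1)² at every x.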
Here we have connected the functional values f(0) and f(1) by a straight
line. We may want greater accuracy, obtainable by laying a parabola
through the points f(0), f(1), f(2). We now get the "quadratic interpolation"
0.961538 0.961538
0.107466 1.069004
-0.009898 1.059106
0.002681 1.061788
-0.001251 1.060537
0.000851 1.061388
-0.000739 1.060649
0.000568 1.061217
0.000995 1.062212
[correct: f(9.5) = 1.061008]
We observe that in the beginning the error fluctuates with alternating sign
and has the tendency to decrease. After 8 steps a minimum is reached and
from then on the terms increase again and have no tendency to converge.
In fact the differences of high order become enormous. In view of the
change of sign of the seventh and eighth terms we can estimate that the
correct value will lie between 1.061388 and 1.060649. The arithmetic mean
of these two bounds: 1.061018, approaches in fact the correct functional
value 1.061008 with a high degree of accuracy. Beyond that, however, we
cannot go.
Runge's discovery was that this pattern of the error behaviour cannot be
remedied by adding more and more data in between, thus reducing Δx to
smaller and smaller values. No matter how dense our data are, the
interpolation for some x-value in between will show the same general character:
reduction of the error to a certain finite minimum, which cannot be surpassed
since afterwards the errors increase again and in fact become exceedingly
large.
In the present problem our aim has been to obtain the functional value f(x)
at a given point x. This is the problem of local interpolation. We can use
our judgement how far we should go with the interpolating series, that is
how many of our data we should actually use for a minimisation of our
error. We may have, however, quite a different problem. We may want
an analytical expression which should fit the function y = f(x) with reason-
able accuracy in an extended range of x, for example in the entire range
[ — 10, +10] of our data. Here we can no longer stop with the interpolation
formula at a judiciously chosen point. For example in our previous
procedure, where we wanted to obtain f(9.5), we decided to stop with n = 6 or 7.
This means that we used a polynomial of 5th or 6th order which fits our
data between x = 4 (or 5) and 10. But this polynomial would completely
fail in the representation of f(x) for values which are between —10 and 0.
On the other hand, if we use a polynomial of the order 20 in order to include
all our data, we would get for f(9.5) a completely absurd value because now
we would have to engage that portion of the Gregory-Newton interpolation
formula which does not converge at all. We thus come to the conclusion
that interpolation in the large by means of high order polynomials is not
obtainable by Lagrangian interpolation of equidistant data. If we fit our
data exactly by a Lagrangian polynomial of high order we shall generally
encounter exceedingly large error oscillations around the end of the range.
In order to obtain a truly well fitting polynomial of high order, we have to
make systematic errors in the data points. We will return to this puzzling
behaviour of equidistant polynomial interpolation when we can elucidate it
from an entirely different angle (cf. Chapter 2.21).
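Runge's discovery is easy to reproduce. The sketch below is my own illustration and uses Runge's classical example f(x) = 1/(1 + x²) on the interval [−5, 5], not the tabulated function of this section: it shows that doubling the density of equidistant data makes the error near the end of the range worse, not better.

```python
def lagrange_eval(xs, fs, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, fs) at x."""
    total = 0.0
    for k, (xk, fk) in enumerate(zip(xs, fs)):
        w = 1.0
        for i, xi in enumerate(xs):
            if i != k:
                w *= (x - xi) / (xk - xi)
        total += fk * w
    return total

def runge_error(n_nodes, x):
    """|error| at x when f(t)=1/(1+t^2) is interpolated at n equidistant
    nodes on [-5, 5]."""
    f = lambda t: 1.0 / (1.0 + t * t)
    xs = [-5.0 + 10.0 * i / (n_nodes - 1) for i in range(n_nodes)]
    fs = [f(t) for t in xs]
    return abs(lagrange_eval(xs, fs, x) - f(x))

# Doubling the density of the data increases the end-of-range error:
print(runge_error(11, 4.75) < runge_error(21, 4.75))   # True
```

The error near x = ±4.75 grows with the order of the polynomial, exactly the behaviour described above: a finite minimum error that cannot be surpassed by refining the equidistant grid.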
The even part of the function f(x) can be expanded in the even Stirling series:
The odd part of f(x) can be made even by multiplication by x and thus
expanding the function xf(x) according to (2). The final result is
expressible in the form of the following expansion:
See A. A., pp. 309, 310.
The formula (4) shows that the odd Stirling functions S2k+1(x) have to be
defined as follows:
but at the same time g(x) = xf(x) is even and permits the expansion
This means that the coefficients of the odd terms are obtainable in terms of
the δ operation alone if this operation is applied to the function xf(x), and
the final result divided by 2k + 2.
Here again the direct construction of a central difference table can be
avoided in favour of a direct weighting of the functional values, in analogy
to (6.18). We now obtain
For example:
Problem 13. The exponential function y = e^{−x} is given at the following points:
f(0) = 1, f(1) = 0.367879, f(2) = 0.135335, f(3) = 0.049787, f(4) = 0.018316
Obtain /(0.5) by the Gregory-Newton formula of five terms. Then, adding
the data
f(−1) = 2.718282, f(−2) = 7.38906
obtain f(0.5) by the Stirling formula of five terms (omitting f(3) and f(4)).
[Answer: Gregory-Newton: f(0.5) = 0.61197
Stirling: f(0.5) = 0.61873
correct value: f(0.5) = 0.60653
The value obtained by the Stirling interpolation is here less accurate than the
G.-N. value.]
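Problem 13 can be checked numerically. Since the five-term Gregory-Newton and five-term Stirling formulas each truncate to the unique fourth-degree polynomial through their five nodes, both can be evaluated as Lagrange interpolants; this shortcut is mine, the text of course proceeds through the difference tables.

```python
def lagrange_eval(xs, fs, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, fs) at x."""
    total = 0.0
    for k, (xk, fk) in enumerate(zip(xs, fs)):
        w = 1.0
        for i, xi in enumerate(xs):
            if i != k:
                w *= (x - xi) / (xk - xi)
        total += fk * w
    return total

# Tabulated values of e^{-x} from Problem 13
f = {-2: 7.38906, -1: 2.718282, 0: 1.0, 1: 0.367879,
     2: 0.135335, 3: 0.049787, 4: 0.018316}

# Gregory-Newton uses the one-sided nodes 0..4, Stirling the central nodes -2..2:
gn = lagrange_eval([0, 1, 2, 3, 4], [f[m] for m in range(0, 5)], 0.5)
st = lagrange_eval([-2, -1, 0, 1, 2], [f[m] for m in range(-2, 3)], 0.5)

print(round(gn, 5))   # 0.61197
print(round(st, 5))   # 0.61873  (less accurate; the correct value is 0.60653)
```

The Stirling value is indeed the worse of the two here, as the answer to Problem 13 states.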
get the impression that we are much nearer to the correct value than this is
actually the case. The successive convergents are given in the following
table (the notation f*(0.5) refers to the interpolated values, obtained on the
basis of 1, 3, 5, ...21 data, going from the centre to the periphery, and
taking into account the data to the right and to the left in pairs):
f*(0.5)
1 25 25
0.5 -2.5 23.75
-0.125 0.9375 23.63281
0.0625 -0.54086 23.59901
-0.03906 0.37860 23.58422
0.02734 -0.29375 23.57619
-0.02051 0.24234 23.57122
0.01611 -0.20805 23.56786
-0.01309 0.18357 23.56546
0.01091 -0.16521 23.56366
-0.00927 0.15092 23.56226
The great distance of f*21(0.5) from the correct value makes it doubtful
whether the addition of the data f(±11), f(±12), . . . out to infinity, would
be able to bridge the gap. A closer analysis corroborates this impression.
The series (8.2) remains convergent as n tends to infinity but the limit does
not coincide with f(x) at the point x = 0.5 (see Chapter 2.21).
What can we say about the behaviour of this series? Will it converge and
if so, will it converge to f(x)? (We will ask the corresponding problem for
the interval [−∞, +∞] and the use of central differences somewhat later,
in Section 16.)
This problem is closely related to the properties of a remarkable set of
polynomials, called the "Laguerre polynomials". We will thus begin our
study with the exposition of the basic properties of these polynomials
shaped to the aims of the interpolation theory.
We define our function y = f(x) in the interval [0, ∞] in the following
specific manner:
and so on. These polynomials have the remarkable property that they are
orthogonal to each other in the interval [0, ∞], with respect to the weight
factor e^{−t}, while their norm is 1:
* The notation f*(x) in the sense of an "approximation of f(x)" seems rather
ill-chosen, in view of the convention that the asterisk denotes in algebra the
"complex conjugate" of a complex number. An ambiguity need not be feared,
however, because in all instances when this notation occurs, f(x) is a real
function of x.
† The customary notation Ln(t) refers to the present Ln(t), multiplied by n!
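The orthonormality claimed above can be verified exactly in rational arithmetic. The sketch below is my own check, using the standard explicit form Ln(t) = Σj (−1)^j C(n, j) t^j/j! (which carries the norm-1 normalisation of the present text; the customary polynomials of the footnote are n! times these) together with the moment integral ∫₀^∞ e^{−t} t^k dt = k!.

```python
from fractions import Fraction
from math import comb, factorial

def laguerre_coeffs(n):
    """Coefficients of L_n(t) = sum_j (-1)^j C(n,j) t^j / j!  (norm-1 form)."""
    return [Fraction((-1) ** j * comb(n, j), factorial(j)) for j in range(n + 1)]

def inner(n, m):
    """Exact value of the integral of e^{-t} L_n(t) L_m(t) over [0, inf),
    computed termwise from the moments  ∫_0^∞ e^{-t} t^k dt = k!."""
    a, b = laguerre_coeffs(n), laguerre_coeffs(m)
    return sum(ai * bj * factorial(i + j)
               for i, ai in enumerate(a) for j, bj in enumerate(b))

print(inner(3, 3))   # 1  -- unit norm
print(inner(3, 5))   # 0  -- orthogonality
```

Because the computation is done with exact fractions, the result is the integral itself, not a numerical approximation.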
We do not know yet whether this series will converge or not—we will give
the proof in the next section—but for any integer value x = m the series
terminates after m + 1 terms and the question of convergence does not arise.
For such values the series (8) is an algebraic identity, no matter how the
key-values f(j) (j = 0, 1, 2, . . . , m) may be prescribed.
Let us apply this expansion to the function (4), obtaining
This leads, by exactly the same reasoning as before in the case of (12):
This means
Problem 15. Show that all these relations remain valid if the condition p > 0
is generalised to p > −1.
Problem 16. The hypergeometric series
is convergent for all (real or complex) |x| < 1. Put x = z/β and let β go to
infinity. Show that the new series, called the "confluent hypergeometric
series", convergent for all z, becomes
Show that the Laguerre polynomials L_n^{(p)}(t) are special cases of this series, namely
expands the function f(z) into a power series, in terms of the value of f(z)
and all its derivatives at the centre of expansion z = 0. We obtain a
counterpart of the infinite Taylor series, replacing differentiation d/dx by
differencing Δ, in the form of the Gregory-Newton series
demonstrated that we cannot expect the convergence of the series (2) even
under completely analytical conditions. What we can expect, however, is
that there may exist a definite class of functions which will allow representa-
tion with the help of the infinite expansion (2).
In order to find this class, we are going to make use of the orthogonality
and completeness of the Laguerre polynomials in the range [0, ∞]. Let us
assume that f(t) is a function which is absolutely integrable in any finite
range of the interval [0, ∞] while its behaviour in infinity is such that
Then the function f(t)e^{−t/2} can be expanded into the orthogonal Laguerre
functions Lk(t)e^{−t/2}, which leaves us with an expansion of f(t) itself into the
Laguerre polynomials:
where
which we will now evaluate. For this purpose we imagine that we replace
Lk(t) by the actual power series, integrating term by term:
and the only remaining uncertainty is the constant C. But we know from
the definition of Lk(t) that the coefficient of t^k is 1/k!. Therefore, if we
let x go to infinity, the coefficient of x^k must become 1/k!. This determines
the constant C and we obtain
in agreement with our earlier formula (10.13). The expansion (4) thus
becomes
We have now found a special case of a function which permits the infinite
Gregory-Newton expansion and thus the interpolation by powers of the
functional values f(m), given between 0 and ∞.
We will draw two conclusions from this result. First of all, let us put
t = 1. Then we obtain the expansion
The factorial x! itself goes by far too rapidly to infinity to allow the
Gregory-Newton type of interpolation. But the reciprocal of the factorial is
amenable to such an interpolation. If we let x go toward zero, we obtain
in the limit an interesting approximation of the celebrated "Euler's constant"
Problem 18. By an argument quite similar to that used in the proof of (11),
but now applied to the generalised Laguerre polynomials L_k^{(p)}(t), show the
validity of the following relation
But
and in view of the fact that Lk(t) can be conceived as the kth difference of
the function t^ξ/ξ! (at ξ = 0), we can replace Lk(t) by this function, integrate,
and then take the kth difference (considering ξ as a variable), and finally
replacing ξ by 0. The result of the integration becomes
Substituting this value of g_k in (3) we obtain the infinite binomial expansion
which shows that the integral transform (1) defines a class of functions which
allows the infinite binomial expansion of Gregory-Newton.
On the other hand, let us assume that we have a Gregory-Newton ex-
pansion which is convergent:
and obtain f(x) by constructing the integral transform (1). Hence we see
that the integral transform (1) is sufficiently general to characterise the
entire class of functions which allow the Gregory-Newton type of interpolation
in the infinite interval [0, ∞].
The analytical form (1) of the function f(x) shows that it is in fact an
analytical function of x, throughout the right complex plane R(x) > 0.
Moreover, the interpolation formula (7) remains valid not only on the positive
real axis but everywhere in the right complex z-plane R(z) > 0. Hence we
have obtained an expansion which not only interpolates properly the discrete
functional values f(m) to the values f(x) between the given data, but also
extrapolates f(z) properly at every point z of the right complex half plane.
Problem 19. Carry through the procedure with respect to the generalised
integral transform
expanding g(t) in the polynomials L_k^{(p)}(t). Show that f_p(x) allows the expansion
(7) throughout the right complex half plane. The expansion may also be
written in the form
with
Problem 20. Comparing the integral transforms (1) and (10), demonstrate the
following theorem: If a function allows the infinite Gregory-Newton expansion, it
allows that expansion also if the centre of expansion is shifted by an arbitrary
amount to the right.
Problem 22. Employ the same function in the integral transform (10) and
derive the following binomial expansion:
Problem 23. Show that the right sides of (2) and (3) are in the following relation
to the hypergeometric series (10.23):
SEC. 1.14 RECURRENCE RELATIONS
Problem 25. Doing the same in the expansion (5) obtain the following
generalisation of (7):
Hence the operation Δ on the function has the effect that the coefficient g_k
is changed to g_{k+1}. This operation can be repeated, of course, any number
of times, obtaining each time a jump in the index of g_k by one. Particularly
important is the operation
important is the operation
and consequently
Hence
Here the expansion coefficients g_k become L_k(t). The function on the left
satisfies the following simple functional equation:
that is:
According to the rules (3) and (8) we can write this equation in the form
which yields the following recurrence relation for the Laguerre polynomials
L_k(t):
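The displayed recurrence is lost in this scan; for the standard normalization L_k(t) = Σ_j C(k,j)(−t)^j/j! (Lanczos's convention, in which the coefficient of t^k is 1/k!, may differ from this by a sign) the classical three-term recurrence can at least be verified directly, as a hedged sketch:

```python
import math

def laguerre(k, t):
    """L_k(t) by its terminating series sum_j C(k, j) (-t)^j / j!
    (standard normalization; the book's convention may differ in sign)."""
    return sum(math.comb(k, j) * (-t) ** j / math.factorial(j) for j in range(k + 1))

# classical three-term recurrence:
#   (k + 1) L_{k+1}(t) = (2k + 1 - t) L_k(t) - k L_{k-1}(t)
for k in range(1, 8):
    for t in (0.3, 1.0, 4.5):
        lhs = (k + 1) * laguerre(k + 1, t)
        rhs = (2 * k + 1 - t) * laguerre(k, t) - k * laguerre(k - 1, t)
        assert abs(lhs - rhs) < 1e-9
```

Such a recurrence lets the whole sequence L_0, L_1, . . . be generated from the first two members without evaluating any series.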
Problem 26. Show that the operations Δ and Γ are not commutative: ΓΔ ≠ ΔΓ.
SEC. 1.15 THE LAPLACE TRANSFORM
Problem 27. In the expansion (11.20) the function fp(x) satisfies the following
functional equation:
Translate this equation into the realm of the expansion coefficients and obtain
the following recurrence relation for the generalised Laguerre polynomials L_k^{(p)}(t):
Problem 28. The left side of the expansion (13.2) satisfies the functional equation
Find the corresponding recurrence relation for the expansion coefficients and
verify its validity.
[Answer:
with
This expansion converges for all values of the complex variable a whose
real part is greater than zero.
Sometimes our aim is to obtain the input function g(t) from a known
Laplace transform f(a). In this case the expansion of g(t) in Laguerre
polynomials would not be feasible, since this expansion goes beyond all
bounds as t goes to infinity. But if it so happens that g(t) is quadratically
integrable without the weight factor e^{-t} (that is, ∫ g²(t) dt remains finite),
then we can expand g(t) into the orthonormal Laguerre functions, obtained by
multiplying the Laguerre polynomials by e^{-t/2}. In this case:
and
the coefficients of this series yield directly the coefficients of the series (12).
This procedure is frequently satisfactory even from the numerical stand-
point.*
Does the Laplace transform permit the Gregory-Newton type of inter-
polation? This is indeed the case, as we can see if we consider that the
function e^{-xt} allows a binomial expansion, on account of Newton's formula:
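Newton's formula here is the generalized binomial expansion (1 + q)^x with q = e^{-t} − 1 (my notation; the displayed equation is lost in this scan), convergent whenever |q| < 1. A quick numerical sketch for t = 1:

```python
import math

def binomial_exp(x, terms=60):
    """Newton's binomial series (1 + q)^x with q = e^-1 - 1, so the sum is e^-x."""
    q = math.exp(-1.0) - 1.0          # |q| ~ 0.632 < 1, so the series converges
    s, binom = 0.0, 1.0               # binom carries C(x, k)
    for k in range(terms):
        s += binom * q ** k
        binom *= (x - k) / (k + 1)
    return s

print(binomial_exp(2.5))  # ~ e^-2.5
```

Convergence is geometric with ratio |q| ≈ 0.632, so about sixty terms already give full machine accuracy.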
Problem 31. Show that the condition (2) yields for the convergence of the
binomial expansion (5) the condition
Problem 32. Choose the input function of the integral transform (12.10) in the
form (1) and deduce the following relation:
e³ = 20.08554
we will once more turn to the hypergeometric series and consider two
particular cases, characterised by the following choice of the parameters:
and
where Φ_{2k}(x) and Φ_{2k+1}(x) are the Stirling functions, encountered earlier in
(8.3) and (8.6).
The hypergeometric functions represented by these expansions are
obtainable in closed form. Let us consider the differential equation of
Gauss which defines the hypergeometric function
Show that in the new variable the differential equation (6) becomes
If we adopt the new angle variable θ for the expansions (3) and (4), we observe
that the functions F(−x, x, ½; sin² θ/2) and F(−x + 1, x + 1, 3/2; sin² θ/2) are
even functions of θ. Hence in the general solution of (9):
y = A cos xθ + B sin xθ
the sine-part must drop out, while the constant A must be chosen as 1, since
for θ = 0 the right side is reduced to 1. We thus obtain
Hence
Let us perform these operations on the left sides of the series (14) and (15):
If we apply the same operation term by term on the right sides, we obtain
the following operational equations (considering t = sin² θ/2 as a variable
and equating powers of t):
we must have
and
and
This establishes the two hypergeometric series (16.14) and (16.15) as infinite
Stirling expansions.
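The closed form obtained above, cos xθ = F(−x, x, ½; sin² θ/2), can be confirmed numerically by summing the Gauss series directly (a minimal sketch; the function name and the choice of 80 terms are my own):

```python
import math

def hyp2f1(a, b, c, t, terms=80):
    """Partial sum of the Gauss hypergeometric series F(a, b; c; t), |t| < 1."""
    s, term = 0.0, 1.0
    for n in range(terms):
        s += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * t
    return s

x, theta = 1.7, 1.0
lhs = hyp2f1(-x, x, 0.5, math.sin(theta / 2) ** 2)
print(lhs, math.cos(x * theta))  # the two agree
```

Since t = sin² θ/2 < 1 on the whole range −π < θ < π, the series converges geometrically and a modest number of terms suffices.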
1.18. An integral transform of the Fourier type
On the basis of our previous results we can now establish a particular but
important class of functions which allow the infinite Stirling expansion.
First of all we will combine the two series (16.14) and (16.15) in the following
complex form:
with
Since the hypergeometric series converges for all |t| = |sin² θ/2| < 1, we can
make use of this series for any θ which varies between −π and +π. If we
now multiply by an absolutely integrable function φ(θ) and integrate between
the limits −π, +π, we obtain the following integral transform:
This f(x) allows the infinite Stirling expansion which means that f(x) is
uniquely determined if it is given at all integer values x = ± m. We can
now form the successive central differences δ^k f(0) and γδ^k f(0)—also obtain-
able according to (8.11) and (8.12) by a binomial weighting of the functional
values f(±m) themselves—and expand f(x) in an infinite series:
However, the formulae (2-4) show that the coefficients g_k are also obtainable
by evaluating the following definite integrals:
for obtaining sin πx at all points. [Hint: Consider the Stirling expansion of
sin πx/(πx) and derive the following series:
Problem 41. Assume the input function φ(θ) of the integral transform (4) in
the form
The values of f(x) at integer points are all zero, except at x = m where the
function assumes the value (−1)^m. Hence the binomial weighting of the
functional values is particularly simple. Derive the expansion
which, if written in the general form (17.11) possesses the following expansion
coefficients:
The same coefficients are obtainable, however, on the basis of the integrals (7)
and (8). Hence obtain the following formulae:
(The second integral is reducible to the first. Show the consistency of the
two expressions.)
Problem 42. Show that the first 2m — 1 terms of the Stirling expansion (12)
drop out, because their coefficients are zero.
δf(x). Hence, e.g., the equation δg_k = g_{k+2} shall signify that in consequence
of the operation δf(x) the coefficient g_k of the expansion is to be replaced by
g_{k+2}. With this convention we obtain from the operational equations
(17.5, 6, 9, 10):
Now the two operations γ and δ can be combined and repeated any number
of times.
Since by definition
we obtain
and hence we can obtain an arbitrary f(x ± m) with the help of the two
operations γ and δ. But we still need another operation we possessed in the
case of simple differences, namely the multiplication by x (cf. 14.8). This
operation is obtainable by the differentiation of the series (16.14) and (16.15).
For this purpose we return to our original variable t (cf. 16.8), but multiplied
by −4:
and now, differentiating the first series with respect to τ and subtracting the
second series, after multiplying it by x/2, we obtain
Let us now differentiate the second series with respect to τ and multiply
it by sin² θ = −τ(1 + τ/4). This gives
Accordingly, we can extend the rules (1-3) by the two additional rules:
We see that any linear recurrence relation which may exist between the
functional values f(x + m), with coefficients which are polynomials of x, can be
translated into a linear recurrence relation for the coefficients of the Stirling
expansion.
Problem 43. Show that the two operations δ and γ are commutative:
Problem 44. Find a recurrence relation for the expansion coefficients (18.13) on
the basis of a recurrence relation for the function (18.11). Verify this relation.
[Answer:
and show that both the g_{2k} (representing the even part of (18.11)), and the
g_{2k+1} (representing the odd part of (18.11)) satisfy the appropriate relation.
[Answer:
The coefficients c_k of the expansion (1) are the Fourier coefficients
This series is very different from the Stirling series since the functions of
interpolation are not polynomials in x but the trigonometric functions
If we separate the even and the odd parts of the function f(x), the
expansion (4) will appear in the following form:
(The prime in the first sum refers to the convention that the term k = 0
should be taken with half weight.)
The function φ(θ) of the transform (18.4) may be chosen in the following
extreme fashion: φ(θ) vanishes everywhere, except in the infinitesimal
neighbourhood of the point θ = θ₁. With this choice of φ(θ) we see that
e^{-iθ₁x} itself may be considered as a Fourier transform which permits the
expansion (6), provided that θ₁ is smaller than π.
Problem 48. The limiting value θ₁ = π is still permissible for the expansion of
cos θ₁x. Obtain the series
Problem 50. Show that the integral transform (18.4) allows the Stirling expansion
also in the key-values x = ±βm, where β is any positive number between 0 and 1.
1.21. The general integral transform associated with the Stirling series
The two series (16.3) and (16.4) had been of great value for the derivation
of the fundamental operational properties of the Stirling functions. The
same series were used in the construction of the integral transform (18.4)
which characterised a large class of functions which permitted the Stirling
kind of interpolation (and extrapolation) in an infinite domain. We will
now generalise our construction to an integral transform which shall include
the entire class of functions to which the infinite Stirling expansion is
applicable. We first consider the even part of the function: ½[f(x) + f(−x)],
which can be expanded with the help of the even Stirling functions Φ_{2k}(x).
Once more we use the special series (16.14), but without abandoning the
original variable t which shall now be considered as a complex variable —z
whose absolute value is smaller than 1:
with
which converges. Then we define the function g(d) by the infinite Fourier
series
Then on the unit circle z = e^{iθ} we obtain g(θ), and if the series converges at
|z| = 1, it will certainly converge also for |z| > 1. Hence the integral
transform (4) may also be written in the form
with the understanding that the range of integration is any closed curve on
which G(z) is analytical, and which includes all singularities of G(z), but
excludes the point z = −1 (which is the point of singularity of the function
(1)). The function G(z) is analytical everywhere outside the unit circle and
will frequently remain analytical even inside, except for certain singular
points.
As to the odd part ½[f(x) − f(−x)] of the function f(x), we have seen that
the Stirling expansion of an odd function is formally identical with the
Stirling expansion of an even function (with the absolute term zero) divided
by x (cf. Section 8). Hence the general representation of the class of
functions which permits the Stirling kind of interpolation in an infinite
domain, may be given in the form of the following integral transform:
where G₁(z), G₂(z) are arbitrary functions, analytical outside and on the
unit circle, and satisfying the auxiliary condition
Problem 51. Let the function f(x) be defined as one of the Stirling functions
Φ_{2k}(x), respectively Φ_{2k−1}(x). Find the corresponding generating functions
G₁(z), G₂(z).
[Answer:
Problem 52. Show that, if f(x) is an even polynomial of the order 2n, the
generating function G₁(z) is a polynomial of the order n + 1 in z^{-1}, while
G₂(z) = 0. If f(x) is an odd polynomial of the order 2n − 1, the same is true
of G₂(z) (with the term z^{-1} missing), while G₁(z) = 0.
Problem 53. Find the generating functions of the functions (16.12) and (16.13).
[Answer:
Problem 54. Find the generating functions of the integral transform (18.4).
We see that the Bessel function J_n(x) is an entire function of x which has the
form of the Fourier transform (18.4) if cos φ is introduced as a new variable
θ. Consequently the conditions for the applicability of the interpolation
in central differences are fulfilled.
Quite different is the situation with respect to the order p of the Bessel
functions. If p is not an integer, the definition (1) does not hold, but has
to be replaced by the following definition:
where
Now the function cos (x sin φ/2), considered as a function of φ, is an even
function of φ and it is periodic, with respect to the period 2π. Such a
function can be expanded into a Fourier cosine series:
46 INTERPOLATION CHAP. 1
where
as we can see from the definition (1) of the Bessel functions, for n = 2k.
Hence we obtain the series
and thus, going back to the original J_p(x) according to (2), we obtain the
following interpolation of an arbitrary J_p(x) in terms of the Bessel functions
of even order:
which is a special case of (10), for p = 1. The series on the right terminates
for any integer value of p and expresses the function J_n(x)x^{-n} as a certain
weighted mean of the Bessel functions of even order, up to J_{2n}(x), with
coefficients which are independent of x.
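The cosine expansion described here is closely related to the classical Jacobi-Anger identity cos(x sin φ) = J₀(x) + 2 Σ J_{2k}(x) cos 2kφ (the book's displayed equations are lost in this scan, so this standard form is used for the check). A sketch, computing J_n from its power series:

```python
import math

def bessel_j(n, x, terms=40):
    """J_n(x) from its power series (n a nonnegative integer)."""
    term = (x / 2) ** n / math.factorial(n)
    s = term
    for m in range(1, terms):
        term *= -(x / 2) ** 2 / (m * (n + m))   # ratio of consecutive series terms
        s += term
    return s

# Jacobi-Anger: cos(x sin phi) = J_0(x) + 2 * sum_k J_{2k}(x) cos(2k phi)
x, phi = 3.5, 0.7
lhs = math.cos(x * math.sin(phi))
rhs = bessel_j(0, x) + 2 * sum(bessel_j(2 * k, x) * math.cos(2 * k * phi)
                               for k in range(1, 13))
print(lhs, rhs)  # the two agree
```

The even-order coefficients J_{2k}(x) fall off factorially in k, so a dozen terms already reproduce the left side to machine accuracy.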
SEC. 1.22 INTERPOLATION OF THE BESSEL FUNCTIONS
Problem 56. What is the maximum tabulation interval Δx = β for the key-
values J_n(βm) to allow convergence of the Stirling interpolation? What is the
same interval for interpolation by simple differences?
[Answer:
Problem 57. Answer the same questions if the tabulated function is ex.
[Answer:
Problem 58. The Harvard Tables* give the following values of the Bessel
functions of even order at the point x = 3.5:
Obtain J_{3.5}(3.5) by interpolation, and compare the result with the correct value.
The Bessel functions of half-order are expressible in closed form in terms of
elementary functions, in particular:
[Answer: 0.293783539
Correct Value: 0.293783454]
Another aspect of the interpolation properties of the Bessel functions
reveals itself if we write the formula (2), (3) in the following form:
[Answer: 0.2941956626 (observe the very slow convergence, compared with the
result in Problem 58)]
Problem 60. Riemann's zeta-function can be defined by the following definite
integral, valid for all z > 0:
CHAPTER 2
HARMONIC ANALYSIS
2.1. Introduction
In the first chapter we studied the properties of polynomial approximations
and came to the conclusion that the powers of x are not well suited to the
approximation of equidistant data. A function tabulated or observed at
equidistant points does not lend itself easily to polynomial interpolation,
even if the points are closely spaced. We have no guarantee that the error
oscillations between the points of interpolation will decrease with an increase
of the order of the interpolating polynomial. To the contrary, only a very
restricted class of functions allows unlimited approximation by powers. If
the function does not belong to this special class of functions, the error
oscillations will decrease up to a certain point and then increase again.
In marked contrast to the powers are the trigonometric functions which we
will study in the present chapter. These functions show a remarkable
flexibility in their ability to interpolate even under adverse conditions. At
the same time they have no "extrapolating" faculty. The validity of the
approximation is strictly limited to the real range.
The approximations obtainable by trigonometric functions fall into two
categories: we may have the function f(x) given in a finite range and our
aim may be to find a close approximation—and in the limit representation—
with the help of a trigonometric series; or we may have f(x) given in a
discrete number of equidistant points and our aim is to construct a well-
approximating trigonometric series, in terms of the given discrete data.
In the first case the theory of the Fourier series is involved; in the second
case, the theory of trigonometric interpolation.
for granted. We assume that this series is valid in the range [−π, +π],
and is uniformly convergent in that range. If we multiply on both sides
by cos kx, respectively sin kx, and integrate term by term, we obtain, in
view of the orthogonality of the Fourier functions, the well-known expressions
Then the coefficients (2) (omitting the constant term ½a₀) are expressible
with the help of f′(x), by using the method of integration by parts:
SEC. 2.2 THE FOURIER SERIES FOR DIFFERENTIABLE FUNCTIONS
alone:
the function f(x) beyond the original range [−π, +π]. We do that by
defining f(x) as a periodic function of the period 2π:
By this law f(x) is now uniquely determined everywhere. The integral
(8) can now be put in the following form, introducing ξ − x = θ as a new
integration variable and realising that the integral over a full period can
always be normalised to the limits −π, +π:
In the second term G₁′(θ) can be replaced by the constant 1/2π. In the
first term, in view of the discontinuity of G₁(θ), we have to take the boundary
term between −π and 0⁻ and again between 0⁺ and π. In view of the
periodicity of the boundary term the contribution from the two boundaries
at ±π vanishes and what remains becomes
Hence
which converges everywhere inside and on the unit circle, excluding the point
z = 1. Put z = e^{iθ} and obtain the infinite sums
[Answer:
This integral can now be used for estimation purposes, by replacing the
integrand by its absolute value:
The second factor is quite independent of the function f(x) and a mere
numerical constant for every n. Hence we can put
and obtain the following estimation of the remainder at any point of the
range:
Our problem is thus reduced to the evaluation of the infinite sum (1).
We shall have frequent occasion to find the sum of terms which appear as
the product of a periodic function times another function which changes
slowly as we go from term to term. For example the change of 1/x is slow
if we go from 1/(n + k) to 1/(n + k + 1), assuming that n is large. Let us
assume that we have to obtain a sum of the following general character:
Applying this procedure to the series (1) we obtain (in good approximation),
for θ > 0:
SEC. 2.3 THE REMAINDER OF THE FINITE FOURIER EXPANSION
and
Problem 64. If the summation on the left side of (3.9) extends only to k = n,
the upper limit of the integral becomes n + ½. Derive by this integration
method the following trigonometric identities, and check them by the sum
formula of a geometrical series:
Problem 66. Evaluate the mean square error of the Fourier series (2.7) and
prove that, while the maximum of the local error η_n(x) remains constant,
the mean square error converges to zero with n^{-1/2}.
[Answer:
(We omit k = 0 since it is always understood that our f(x) is the modified
function f(x) − ½a₀ which has no area.) We can write even the entire
Fourier series in complex form, namely
SEC. 2.4 FUNCTIONS OF HIGHER DIFFERENTIABILITY
with the understanding that we keep only the real part of this expression.
With this convention we can once more put
where
Once more our aim will be to obtain an error bound for the finite expansion
(3.2) and for that purpose we can again put
where g_{nm}(θ) is now defined by the real part of the infinite sum
The method of changing this sum to an integral is once more available and
we obtain
Again we argue that with the exception of a very small range around θ = 0
the asymptotic stage is quickly reached and here we can put
But then, repeating the argument of the previous section, we get for not
too small n:
and
which avoids the use of the absolute value. Applying this fundamental
inequality to the integral (7) we can make use of the orthogonality of the
Fourier functions and obtain the simple expression (which holds exactly
for all n):
and thus we can deduce the following estimation for the local error η_n(x)
at any point of the range:
Problem 67. Show that the approximation (15) is "safe" for estimation purposes
because the sum on the left side is always smaller than the result of the integration
given on the right side.
Problem 68. Prove the following inequalities: for any f(x) which satisfies the
boundary conditions (4.2) and whose total area is zero:
The B_{2m} are the Bernoulli numbers: 1/6, −1/30, 1/42, −1/30, . . . (starting with m = 1).
The first integral is small because the range of integration is small. The
second integral is small on account of the √n in the denominator.
The estimation (4) is, of course, more powerful than the previous estimation
(3.14), although the numerical factor Cn was smaller in the previous case.
Even a jump in the function f(x) is now permitted which would make f'(x)
infinite at the point of the jump, but the integral
remains finite. We see, however, that in the immediate vicinity of the jump
we cannot expect a small error, on account of the first term which remains
finite in this vicinity. We also see that under such conditions the estimated
error decreases very slowly with n.
We fare better in such a case if we first remove the jump in the function
by adding to f(x) the special function G₁(x − x₁), multiplied by a proper
factor a. Since the function aG₁(x − x₁) makes the jump −a at the point
x = x₁, we can compensate for the jump a of the function f(x) at x = x₁
and reduce f(x) to a new function φ(x) which is free of any discontinuities.
For the new function the more efficient estimation of Section 3 (cf. 3.14)
can be employed, while the remainder of the special function aG₁(x − x₁)
is explicitly at our disposal and can be considered separately.
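The jump-removal device can be sketched numerically. Assuming, purely for illustration, f(x) = e^x on (−π, π): its 2π-periodic extension jumps at ±π, and subtracting the sawtooth c·x with c = sinh(π)/π removes that jump, after which the sine coefficients fall off like 1/k³ instead of 1/k:

```python
import math

def sine_coeff(f, k, n=20000):
    """b_k of f on (-pi, pi) by a Riemann sum over one full period."""
    h = 2 * math.pi / n
    s = sum(f(-math.pi + j * h) * math.sin(k * (-math.pi + j * h)) for j in range(n))
    return s * h / math.pi

f = math.exp                               # periodic extension jumps at x = +-pi
c = math.sinh(math.pi) / math.pi           # scale of the compensating sawtooth x
phi = lambda x: math.exp(x) - c * x        # continuous periodic extension

print(sine_coeff(f, 40), sine_coeff(phi, 40))  # ~1/k decay vs ~1/k^3 decay
```

At k = 40 the corrected coefficient is smaller by a factor of order k², which is exactly the gain in the error estimate promised by the smoother function.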
2.6. The Gibbs oscillations of the finite Fourier series
If f(x) is a truly periodic and analytical function, it can be differentiated
any number of times. But it happens much more frequently that the
Fourier series is applied to the representation of a function f(x) which is
given only between −π and +π, and which is made artificially periodic by
extending it beyond the original range. Then we have to insure the boundary
conditions (4.2) by artificial means and usually we do not succeed beyond a
certain m. This means that we have constructed a periodic function which
is m times differentiable but the mth derivative becomes discontinuous at
the point x = x₁. Under such conditions we can put this lack of continuity
to good advantage for an efficient estimation of the remainder of the finite
series and obtain a very definite picture of the manner in which the truncated
series f_n(x) approximates the true function f(x).
We will accordingly assume that f(x) possesses all the derivatives up to
the order m, but f^{(m)}(x) becomes discontinuous at a certain point x = x₁ of
the range (if the same occurs in several points, we repeat our procedure for
each point separately and obtain the resulting error oscillations by super-
position). Now the formula (4.7) shows that it is not f^{(m)}(x) in itself, but
the integral over f^{(m)}(x) which determines the remainder η_n(x) of the truncated
series. Hence, instead of stopping with the mth derivative, we could proceed
to the (m + 1)st derivative and consider the jump in the mth derivative as a
jump in the integral of the (m + 1)st derivative. This has the consequence
that the major part of the integral which determines η_n(x) is reducible to
the immediate neighbourhood of the point x = x₁. The same will happen
in the case of a function whose (m + 1)st derivative does not necessarily become
infinite but is merely very large, if compared with the values in the rest of
the range.
Since our function f(x) became periodic by extending it beyond the
original range of definition, we can shift the origin of the period to any
point x = xi. Hence we do not lose in generality but gain in simplicity if
we place the point of infinity of the (m + 1)st derivative into the point x = 0.
The integration in the immediate vicinity of the point ξ = 0 gives (cf. (4.7)):
The second factor is the jump A of the mth derivative at the point x = 0:
The last term in the numerator is very nearly −1/(n′ + ½), on account of
the largeness of n′. We fail only in the domain of very small x but even
there the loss is not too serious if we exclude the case m = 0 which we will
consider separately. But then the effect of this substitution is that the m
of the previous term changes to m + 1, with the consequence that now
numerator and denominator cancel out and the second factor becomes 1.
The resulting expression is now exactly the integrand of (4). We have thus
62 HAEMONIC ANALYSIS CHAP. 2
succeeded with the integration and have merely to substitute the limits 0
and ∞, obtaining
Only the real part of this expression must be taken, for positive x. The
transition to negative x occurs according to the following rules:
where
On the other hand, if m is even: m = 2s, the real part of (8) yields the odd
function
The formulae (11) and (13) demonstrate in explicit form the remarkable
manner in which the truncated series f_n(x) (which terminates with the terms
sin nx, cos nx) approximates the true function f(x). The approximation
winds itself around the true function in the form of high frequency oscilla-
tions (frequently referred to as the "Gibbs oscillations"), which are super-
imposed on the smooth course of f(x) [by definition f_n(x) = f(x) − η_n(x)].
These oscillations appear as of the angular frequency (n + ½), with slowly
changing phase and amplitude. The phase α starts with the value 0 at
x = 0 and quickly increases to nearly π/2, if n is not too small. Accordingly,
the nodal points of the sine-oscillations and the maxima-minima of the
cosine oscillations are near to the points
These points divide the interval between 0 and π into n + 1 nearly equal
sections.
where A is the discontinuity of the mth derivative (cf. (2)). Only in the
immediate vicinity of the point x = 0 are the oscillations of slightly larger
amplitude.
We see that the general phenomenon of the Gibbs oscillations is inde-
pendent of the order m of the derivative in which the discontinuity occurs.
Only the magnitude of the oscillations is strongly diminished as m increases.
But the slow change in amplitude and phase remains of the same character,
whatever m is, provided that n is not too small relative to m.
The phase-shift between even and odd m—sine vibrations in the first case,
cosine vibrations in the second—is also open to a closer analysis. Let us
write the function f(x) as the arithmetic mean of the even function
g(x) = f(x) + f( — x) and the odd function h(x) = f(x) — f( — x). Now the
Fourier functions of an even function are pure cosines, those of an odd
function pure sines. Hence the remainder η_n(x) shares with the function
its even or odd character. Furthermore, the behaviour of an even,
respectively odd function in the neighbourhood of x = 0 is such that if an
even derivative becomes discontinuous at x = 0, the discontinuity must
belong to h(x). On the other hand, if an odd derivative becomes dis-
continuous at x = 0, that discontinuity must belong to g(x). In the first
case g(x) is smooth compared with h(x), in the second h(x) is smooth com-
pared with g(x). Hence in the first case the cosine oscillations of the
remainder are negligible compared with the sine oscillations, while in the
second case the reverse is true. And thus the discontinuity in an even
derivative at x = 0 makes the error oscillations an odd function, the dis-
continuity in an odd derivative an even function.
We can draw a further conclusion from the formulae (11) and (13). If s
is even and thus m of the form 4μ + 1, the error oscillations will start with
a minimum at x = 0, while if s is odd and thus m of the form 4μ + 3, with
a maximum. Consequently the arrow which goes from f(0) to f_n(0) points
in the direction of the break if that break occurs in the first, fifth, ninth,
. . . derivative, and away from the break if it occurs in the third, seventh,
eleventh, . . . derivative. Similarly, if the break occurs in the second, sixth,
tenth, . . . derivative, the tangent of f_n(x) at x = 0 is directed towards the
break; if it occurs in the fourth, eighth, twelfth, . . . derivative, away from
the break (see Figure).
The case m = 0. If the discontinuity occurs in the function itself, we
have the case m = 0. Here the formula (8) loses its significance in the realm
of small x. On the other hand, we have obtained the g_{n1}(x) of formula
and thus
serious change of f_n(x) if we shift the nodal points of its error oscillations into
exactly equidistant points. Then we no longer have our original truncated
Fourier series f_n(x) but another trigonometric series with slightly modified
coefficients whose sum, however, does not give anything very different from
what we had before. Now this new series has a very definite significance.
It is obtainable by the process of trigonometric interpolation because we can
interpret the new series as a trigonometric expansion of n terms which has
zero error in n equidistantly prescribed points, in other words, which fits
exactly the functional values of f(x) at the "points of interpolation". This
is no longer a problem in infinite series but the algebraic problem of solving
n linear equations for n unknowns; it is solvable by mere summation,
without any integration. In particular we may prescribe the points of
interpolation very near to the nodal points of the Gibbs oscillations if we
agree that the odd part of the function shall be given in the points
where on the right side we have the given mth derivative of the function.
This, however, is not enough for a unique characterisation of f(x) since a
differential equation of mth order demands m additional boundary conditions
to make the problem unique. In the Lagrangian case we succeeded in
characterising the remainder itself by the differential equation (1) and the
added boundary conditions—they were in fact inside conditions—followed
from the added information that the remainder vanishes at the points of
interpolation. In our present problem we will not proceed immediately to
the remainder of the Fourier series but stay first with the function f(x) itself.
We add as boundary conditions the periodicity conditions (4.2) which are
demanded by the nature of the Fourier series. With these added conditions
the function f(x) is now uniquely determined, except for an additional
constant which is left undetermined. We eliminate this freedom by adding
one more condition, namely
(This condition is justified since we can always replace f(x) by f(x) − ½a0,
which indeed satisfies the condition (2).)
Now under these conditions—as we will prove later—we can solve the
given problem in terms of the "Green's function" G(x, ξ):
could be obtained in closed analytical form. This is not the case now
because in the case of the Fourier series the remainder of the Green's
function has an oscillatory character. In order to estimate the value of (4)
on the basis of the maximum of f^(m)(ξ) we have to replace gn(ξ − x) by
|gn(ξ − x)|, which makes the evaluation of the numerical factor in the
remainder formula (4.12) difficult but does not interfere with the fact that
an effective estimation of the error is possible. Moreover, the Lagrangian
remainder formula (1.3.7) operates specifically with the mth derivative while
in our case the order of the derivative m and the order n of the approximating
Fourier series are independent of each other.
Problem 71. Derive the Green's function G2(θ) of (4.6) (for the case m = 2)
from the G1(θ) of Section 2 (cf. 2.9) on the basis that the infinite sum (4.6)
which defines G2(θ) is the negative integral of the sum (2.7), together with the
validity of the condition (2).
[Answer:
Fourier series (3.2), which does not give f(x) but merely defines a certain
fn(x) associated with f(x), can be written in the following form
where
as its Fourier expansion. Then we could replace the infinite sum (4) by
this function and interpret K(ξ − x) not merely as an operator but as an
actual function.
Now the general law (2.2) of the Fourier coefficients tells us that this
hypothetical function δ(ξ, x) must satisfy the following conditions:
Moreover, this function is everywhere zero, with the only exception of the
infinitesimal neighbourhood ±ε of the point θ = 0. Then the expansion
coefficients ak, bk of this function become:
and now letting ε go towards zero, the point ξ becomes in the limit equal to x
and we have actually obtained the desired expansion coefficients (7). The
function thus constructed is Dirac's celebrated "delta function" δ(x, ξ) =
δ(ξ, x) = δ(ξ − x). It is comparable to an infinitely sharp needle which
pinpoints one definite value f(x) of the function, if used under the integral
sign as an operator:
Here the previous equation (5), which was an elegant operational method
of writing the Fourier series of a function f(x) (provided that the series
converges), now appears in consequence of the definition of the delta
function. But we come back to the previous equation (5) if we replace
the delta function by its Fourier expansion (6).
Neither the "delta function", nor its Fourier expansion (6) are legitimate
concepts if we divest them of their significance as operators. The delta
function is not a legitimate function because we cannot define a function by
a limit process which does not possess a limit. Nor is the infinite series (6)
a legitimate Fourier series because an infinite sum which does not converge
to a limit is not a legitimate series. Yet this is entirely immaterial if these
constructions are used under the integral sign, since it suffices that the
limits of the performed operations shall exist.
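The operational legitimacy described here can be checked numerically. The following sketch (illustrative, not part of the original text; the function f and the widths ε are arbitrary choices) approximates the delta function by a square pulse of width 2ε and height 1/2ε, and shows that its integral against f reproduces the pinpointed value f(x) ever more closely as ε shrinks:

```python
import math

def pulse_average(f, x, eps, m=1000):
    # integral of f(xi) * delta_eps(xi - x) d(xi), where delta_eps is the
    # square pulse of width 2*eps and height 1/(2*eps); midpoint rule
    dx = 2*eps/m
    return sum(f(x - eps + (j + 0.5)*dx) for j in range(m)) * dx / (2*eps)

f = lambda t: math.cos(t) + t**2   # any smooth test function
x = 0.8
for eps in (0.5, 0.05, 0.005):
    print(abs(pulse_average(f, x, eps) - f(x)))  # shrinks as eps -> 0
```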
2.9. Smoothing of the Gibbs oscillations by Fejér's method
We have mentioned before that the Gibbs oscillations of the Dirichlet
kernel (8.3) interfere with an efficient estimation of the remainder ηn(x) of
the finite Fourier series. Dirichlet succeeded in proving the convergence
of the Fourier series if certain restricting conditions called the "Dirichlet
conditions", are demanded of f(x). But a much more sweeping result was
obtained by Fejér who succeeded in extending the validity of the Fourier
series to a much larger class of functions than those which satisfy the
Dirichlet conditions. This generalisation became possible by a modification
of the summation procedure by which the Fourier series is obtained.
The straightforward method by which the coefficients (2.2) of the Fourier
series are derived may lead us to believe that this is the only way by which
a trigonometric series can be constructed. And yet this is by no means so.
What we have proved is only the following: assuming that we possess a
never ending sequence of terms with definite coefficients ak, bk whose sum shall
converge to f(x), then these coefficients can be nothing but the Fourier
coefficients. This, however, does not interfere with the possibility that for
a certain finite n we may find much more suitable expansion coefficients
since here we are interested in making the error small for that particular n,
and not in constructing an infinite series with rigid coefficients which in the
limit must give us f(x). We may gain greatly in the efficiency of our
approximating series if we constantly modify the expansion coefficients as
n increases to larger and larger values, instead of operating with a fixed set
of coefficients. And in fact this gives us the possibility by which a much
larger class of functions becomes expandable than if we operate with fixed
coefficients.
Fejér's method of increasing the convergence of a Fourier series consists
in the following device. Instead of merely terminating the series after n
terms (the terms with ak and bk always act together, hence we will unite
them as one term) and being satisfied with their sum fn(x), we will construct
a new sequence by taking the arithmetic means of the original sequence:
This new Sn(x) (the construction of which does not demand the knowledge
of the coefficients ak, bk beyond k = n) has better convergence properties
than the original fn(x). But this Sn(x) may be preferable to fn(x) quite
apart from the question of convergence. The truncated Fourier series fn(x)
has the property that it oscillates around the true course of f(x). These
"Gibbs oscillations" sometimes interfere with an efficient operation of the
Fourier series. Fejér's arithmetic mean method has an excellent influence
on these oscillations, by reducing their amplitude and frequently even
eliminating them altogether, making the approach to f(x) entirely smooth,
without any oscillations.
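The smoothing effect of the arithmetic means can be checked numerically. The following sketch (illustrative, not part of the original text; the square wave and the value N = 40 are arbitrary choices) compares the Gibbs overshoot of the truncated Fourier series of a square wave with the corresponding Fejér mean:

```python
import math

def partial_sum(x, N):
    # Fourier partial sum (harmonics below N) of the square wave sign(x) on (-pi, pi)
    return (4/math.pi)*sum(math.sin(j*x)/j for j in range(1, N, 2))

def fejer_mean(x, N):
    # arithmetic mean of the partial sums s_0, ..., s_{N-1}; equivalently the
    # partial sum with each harmonic j damped by the factor (1 - j/N)
    return (4/math.pi)*sum((1 - j/N)*math.sin(j*x)/j for j in range(1, N, 2))

xs = [i*math.pi/400 for i in range(1, 400)]
overshoot_f = max(partial_sum(x, 40) for x in xs)  # Gibbs overshoot, ~1.18
overshoot_S = max(fejer_mean(x, 40) for x in xs)   # stays below the true value 1
print(overshoot_f, overshoot_S)
```

The truncated series overshoots the jump by the familiar Gibbs amount regardless of N, while the arithmetic mean never exceeds the true value of the function.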
Problem 72. Apply the arithmetic mean method to the Dirichlet kernel (8.3)
and show that it becomes transformed into the new kernel (cf. 3.18):
with
This is now the nth approximation itself and not the remainder of that
approximation. In order to come to the remainder, let us write f(x + θ)
as follows:
SEC. 2.10 THE REMAINDER OF THE ARITHMETIC MEAN METHOD 73
Substituting we obtain
Now the area under the kernel Φn(θ) is 1 because Φn(θ) is a weighted sum
of cosines (see 9.4), but each one gives the area zero, except the absolute
term 1/2π which has not been changed by weighting (the corresponding k
being zero) and still gives the area 1. Hence the first term is f(x) and thus
the second term has now to be interpreted as −ηn(x). In this second term
we divide the range of integration into two parts: very small θ and larger θ.
For the realm of larger θ we can again obtain an expression like the last
term of (5.4), although |f′(θ)| is now to be replaced by |f(θ) − f(x)| and
we have to choose the limiting value of θ, which separates the two domains,
not proportional to 1/√n but proportional to 1/∛n. In order that this
term shall go to zero with increasing n it is only necessary that
Fejér's method demands solely the absolute integrability of f(x), without any
further conditions.
Now we come to the central region and here we cannot use the estimation
based on the maximum of gn(θ) (which in our earlier case became ½)
because Φn(θ) grows out of bounds at θ = 0, as n increases to infinity. But
in this region we can interchange the role of the two factors and take the
maximum value of |f1(x, θ)| multiplied by the integral over the absolute
value of Φn(θ), which is certainly less than 1 (Fejér's kernel is everywhere
positive and needs no change on account of the "absolute value" demand).
And thus in the inner domain we have a contribution which is less than the
maximum of the absolute value of f1(x, θ).
Here we cannot argue that this contribution will be small on account of
the small range of integration. But we can argue that we are very near to
the point θ = 0 and have to examine the maximum of the quantity
Now the continuity of f(x) is not demanded for the fulfilment of the condition
(4). But if f(x) is not continuous at the point x, then the quantity (5)
will not be small and the arithmetic mean method will not converge to f(x).
If, however, we are at a point where f(x) is continuous, then by the very
definition of continuity (without demanding differentiability), the quantity
(5) becomes arbitrarily small as the domain of θ shrinks to zero. And thus
we have proved that the arithmetic mean method converges to the proper
f(x) at any point in which f(x) is continuous, the only restricting condition
on the class of admissible f(x) being the absolute integrability (4) of the
function.
Problem 74. Carry through the same argument for the case that the continuity
of f(x) holds separately to the right of x and to the left of x but f(x+) and f(x−)
have two different values. Show that in this case the series of Fejér converges
to
(with the understanding that only the real part of this expression is to be
taken). Instead of the usual differentiation we will now introduce a
"curly 𝒟 process", defined as follows:
The operation 𝒟n applied to the functions cos nx and sin nx has the
following effect
and we see that the functions cos nx and sin nx behave like constants with
respect to the operation 𝒟.
Let us apply this operation to the remainder (1) of the Fourier series.
Neglecting quantities of the order n⁻⁴ we obtain
We can express the result in the following more striking form. We introduce
the following set of factors, called the "sigma factors":
Notice that this operation leaves the coefficient ½a0 unchanged (since
σ0 = 1) while the last terms with an, bn drop out, because σn = 0.
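The displayed definition of the sigma factors did not survive reproduction here; the factors in question are the well-known Lanczos sigma factors σk = sin(kπ/n)/(kπ/n). A short sketch (illustrative, not part of the original text) confirms the two properties just noted, σ0 = 1 and σn = 0:

```python
import math

def sigma(k, n):
    # Lanczos sigma factor: sigma_0 = 1, sigma_k = sin(k*pi/n)/(k*pi/n)
    if k == 0:
        return 1.0
    t = k * math.pi / n
    return math.sin(t) / t

n = 10
print(sigma(0, n))   # 1.0: the coefficient a0/2 is left unchanged
print(sigma(n, n))   # 0.0 up to rounding: the last terms a_n, b_n drop out
```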
Problem 75. Apply the 𝒟n process to the Dirichlet kernel (8.3), assuming that
n is large.
[Answer:
where F(x) is the indefinite integral of f(x). The meaning of this operation
SBC. 2.13 LOCAL SMOOTHING BY INTEGRATION 77
is that we replace the value of f(x) by the arithmetic mean of all the values
in the neighbourhood of f(x), between the limits ±π/n.
The operation In f(x) may be expressed in the following way:
where the "kernel" δn(ξ − x) is defined as the "square pulse" of the width
2π/n:
A comparison with the equation (8.11) shows that Dirac's "delta function"
can be conceived as the limit of the function δn(ξ, x), since the equation
(8.11) can now be written (at all points x in which f(x) is continuous) in the
form:
The effect of the operation In on the Fourier coefficients ak, bk is that they
become multiplied by the σ factors, according to (12.8); and vice versa: the
operation of multiplying the Fourier coefficients by the σ factors is equivalent
to submitting f(x) to the In operation.
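That the In operation multiplies each Fourier coefficient by the corresponding σ factor can be verified directly: averaging sin kx over the window (x − π/n, x + π/n) yields σk sin kx. A numerical sketch (illustrative, not part of the original text; the values of k, n and x are arbitrary):

```python
import math

def local_smooth(f, x, n, m=2000):
    # I_n f(x): arithmetic mean of f over the window [x - pi/n, x + pi/n]
    # (numerical midpoint rule with m subintervals)
    h = math.pi / n
    dx = 2*h/m
    return sum(f(x - h + (j + 0.5)*dx) for j in range(m)) * dx / (2*h)

n, k, x = 10, 3, 0.7
sigma_k = math.sin(k*math.pi/n) / (k*math.pi/n)
smoothed = local_smooth(lambda t: math.sin(k*t), x, n)
print(smoothed, sigma_k * math.sin(k*x))  # the two values agree
```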
Problem 76. Show that local smoothing leaves a straight line portion of f(x)
unchanged. What is the effect of local smoothing on the parabola f(x) = x²?
Express the result in terms of the second derivative.
[Answer:
Problem 77. What is the effect of local smoothing on the amplitude, phase
and frequency of the oscillation
Frequency and phase remain unchanged.] Show the validity of (6) for small ω.
Problem 78a. Show that at a point of discontinuity f̄(x) approaches in the limit
the arithmetic mean of the two limiting ordinates.
Problem 78b. Show directly from the definition (12.2) of the 𝒟 process the
validity of the operational equation
Let us assume that the remainder of the truncated series is once more
given in the form (12.1):
Now by definition the application of the sigma factors has the following
effect on the remainder (see 12.5):
On the other hand, the differential equation (4) may be solved without
integration for sufficiently large n asymptotically, by expanding into
reciprocal powers of n:
Comparison with the original Gibbs oscillations (2) shows the following
changes: The phase of the oscillations has changed by π/2; the amplitude of
the oscillations has decreased by the factor n, but coupled with a change
of the law of decrease which is no longer γn(x) but γ′n(x).
The modified remainder η̄n(x) can be conceived as the true Fourier
remainder of a modified function f̄(x) = In f(x), obtained by the process of
local smoothing:
Problem 79. Apply the sigma method to the function (2.9). Show that the
jump at x = 0 is changed to a steep but finite slope of the magnitude −n/2π.
Show that the Gibbs oscillations (3.10) now decrease with 1/n², instead of 1/n.
Find the position and magnitude of the first two maxima of η̄n(θ). (The
asymptotic procedure (6) is here not applicable, since we are near to the singular
point at θ = 0; but cf. (3.10).)
[Answer: with nθ = t;
Position of extremum determined by condition
Expression of η̄n(t):
Numerical solution:
Minimum between at
This kernel has the same advantageous properties as Fejér's kernel. The
same reasoning we employed in Section 10 for proving that Fejér's method
insures the convergence of the Fourier series at all points where f(x) exists,
and for all functions which are absolutely integrable, is once more applicable.
Hence we obtain the result that the application of the sigma factors makes
the Fourier series of any absolutely integrable function convergent at all points
in which f(x) approaches a definite limit, at least in the sense of f(x+) and
f(x−). At points where these two limits are different, the series approaches
the arithmetic mean of the two limiting ordinates.
The operator In can be repeated, of course, which means that now the
coefficients ak, bk will become multiplied by σk². At each step the convergence
becomes stronger by the factor n. We must not forget, however,
that the operation of local smoothing distorts the function and we obtain
quicker convergence not to the original but to the modified function. From
the standpoint of going to the limit n → ∞ all these series converge eventually
to f(x). But from the standpoint of the finite series of n terms we have to
compromise between the decrease of the Gibbs oscillations and the modification
of the given function due to smoothing. It is an advantage to cut down
on the error oscillations, but the price we have to pay is that the basic
function to which these oscillations refer is no longer f(x) but In f(x),
respectively In^k f(x), if we multiply by σ^k. The proper optimum will
depend on the nature of the given problem.
Problem 80. Show that the function Sn(θ) is obtainable by applying the
operation In to the function G1(θ) (cf. 2.9), taken with a negative sign. Obtain
the Gibbs oscillations of the series (1) and the position and magnitude of the
maximum amplitude.
[Answer:
Maximum at θ = 0:
Compare these oscillations with those of the Dirichlet kernel (8.3) and the
Fejér kernel (9.2).
Problem 81. Obtain the doubly smoothed Dirichlet kernel Kn(θ) by applying
the operator In to (6). Find again the maximum amplitude of the Gibbs
oscillations.
[Answer:
SEC. 2.17 EXTENSION OF THE CLASS OF EXPANDABLE FUNCTIONS 83
With the sole exception of the point θ = ±π, this series diverges everywhere.
Nor can we expect a Fourier series for the function cot θ/2, which is no
longer integrable since the area under the curve goes logarithmically to
infinity. Hence the Fourier coefficients cannot be evaluated. The application
of the σk factors, however, makes the series convergent:
which is now an even series and which again converges to the proper value
at every point of the range, excluding the origin θ = 0.
We see that the sigma factors provide us with a tool for extending the
Problem 82. Show that the remainder of the series (3) becomes asymptotically
Demonstrate the formula numerically for n = 10, at the point θ = π/4 (remembering
that η̄n(θ) is not f(θ) − fn(θ) but f̄(θ) − fn(θ), cf. (14.8). For a table of the
sigma factors see Appendix.)
[Answer: predicted
actual
which hold likewise for all powers of the σk. Additional asymptotic relations
can be derived from the series (17.3) by substituting for θ the values π/2,
π/3, 2π/3, π/4:
We have replaced cot θ/2 by 2/θ, which is permissible for small θ. A more
accurate treatment would proceed as follows. We put
and make use of the Taylor series of the second factor, on the basis of the
series
where n = 2ν is even. The integral of the function (15) yields the "square
wave" of the constant value π/4 for x > 0 and −π/4 for x < 0, with a point
* A series of this kind converges for sufficiently large n up to a certain point, although
it diverges later on. The more descriptive term "semi-convergent" is unfortunately
not common in English mathematical literature.
SEC. 2.18 ASYMPTOTIC RELATIONS FOR THE SIGMA FACTORS 87
Applying the operator (11) to this expansion we notice first of all that
only the even powers of D have to be considered (since we focus our attention
on the point θ = 0). The result is that the new remainder becomes
This yields the following correction of the slowly convergent Leibniz series:
The new error is only 3.6 units in the sixth decimal place.
We now come to the application of the sigma factors. This means that
the operations (11) and (12) have to be combined, with the following result:
We see that here the sigma smoothing reduced the Gibbs oscillations
quadratically, instead of linearly, in n. The reason is that before smoothing
the point x = π/2 was a point of maximum amplitude. The shift by 90°
changes this maximum to a nodal point, with the result that the term with
n⁻² drops out and the error becomes of third order in 1/n. We thus obtain,
up to quantities of the order n⁻⁴
A second smoothing causes a second phase shift by 90° and the maximum
amplitude is once more restored. The reduction by the factor n² will
cause an error of the order n⁻³, as we had it in the case of simple smoothing
(since here we do not profit by the privileged position of the point x = π/2).
The result of the operation is
and thus
Compared with simple smoothing we have not gained more than the factor
2. (As a numerical check, let us apply our formulas to the sigma-weighted
Leibniz series, for ν = 5, n = 10. The formula gives (π/4) − 0.0009822 =
0.7844160, against the actual value of 0.7844133, while the calculated value
of the doubly smoothed series yields (π/4) − 0.0004322 = 0.7849660, against
the actual value of 0.7849681.)
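The quoted "actual value" of the sigma-weighted Leibniz series is easy to reproduce. A sketch (illustrative, not part of the original text) that damps the first ν = 5 terms of the Leibniz series with the factors σj = sin(jπ/n)/(jπ/n), n = 10:

```python
import math

def sigma_leibniz(nu, n):
    # sigma-weighted Leibniz series for pi/4: each term (-1)^k/(2k+1)
    # is damped by the Lanczos factor sigma_j with j = 2k+1
    total = 0.0
    for k in range(nu):
        j = 2*k + 1
        t = j * math.pi / n
        total += (math.sin(t)/t) * (-1)**k / j
    return total

print(round(sigma_leibniz(5, 10), 7))   # close to the quoted 0.7844133
print(round(math.pi/4, 7))              # 0.7853982
```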
While in this example we started from a point of maximum amplitude
and thus the sigma smoothing gained two powers of n (due to the shift to
a nodal point), it may equally happen that we start from a nodal point,
in which case the sigma smoothing will not decrease but possibly even
increase the local error at that particular point. An example of this kind
is encountered in the formulae (31, 32) of Problem 84.
Problem 83. Show the following exact relations to be valid for the σ-factors:
SEC. 2.19 THE METHOD OF TRIGONOMETRIC INTERPOLATION 89
Numerical check:
Problem 85. Explain why the third of the asymptotic relations (3) will hold
with increasing accuracy as m increases from 1 to n, but ceases to hold if m
becomes larger than n. Demonstrate the situation numerically for n = 6.
[Answer:
between 0 and π, once defined as an odd, and once as an even function. (In the
first case the function is discontinuous at x = π where we define it as f(π) = 0.)
Expand this function by interpolation in a sine and cosine series for n = 9
(β = 0) and compare the resulting Gibbs oscillations with the Gibbs oscillations
of the truncated Fourier series with n = 8.
Now let us assume that we have examined the special function Gm(x, ξ),
considered as a function of x, and determined the remainder ηn(x) for this
special function
Then again we can make use of "Cauchy's inequality" (4.13) and obtain
Now the second factor is once more the square of the "norm" of f^(m)(x).
In the first factor we encounter conditions which are very similar to those
encountered before, when dealing with the Fourier series (cf. 4.14), and in
fact the result of the analysis is that the error bound (4.16), found before for
the Fourier series, remains valid for the case of trigonometric interpolation.
This result proves once more that the method of trigonometric interpolation
is not inferior to the Fourier series of a comparable number of terms. The
actual coefficients of the two series may differ considerably but the closeness
of approximation is nearly the same in both cases.
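The earlier claim that trigonometric interpolation is "solvable by mere summation, without any integration" can be illustrated as follows (a sketch, not part of the original text; the sample function and N = 8 are arbitrary choices). The coefficients are plain sums over the data, and the resulting series has zero error at the equidistant points of interpolation:

```python
import math

def trig_interp_coeffs(y):
    # coefficients by mere summation over the data points x_j = 2*pi*j/N,
    # a discrete analogue of the Fourier coefficient integrals
    N = len(y)
    m = N // 2
    a = [2.0/N * sum(y[j]*math.cos(k*2*math.pi*j/N) for j in range(N)) for k in range(m+1)]
    b = [2.0/N * sum(y[j]*math.sin(k*2*math.pi*j/N) for j in range(N)) for k in range(m+1)]
    return a, b

def trig_interp_eval(a, b, x, N):
    m = N // 2
    s = a[0]/2 + sum(a[k]*math.cos(k*x) + b[k]*math.sin(k*x) for k in range(1, m))
    if N % 2 == 0:
        s += a[m]/2 * math.cos(m*x)   # the highest harmonic carries weight 1/2 for even N
    else:
        s += a[m]*math.cos(m*x) + b[m]*math.sin(m*x)
    return s

N = 8
f = lambda x: math.exp(math.sin(x))
ys = [f(2*math.pi*j/N) for j in range(N)]
a, b = trig_interp_coeffs(ys)
errs = [abs(trig_interp_eval(a, b, 2*math.pi*j/N, N) - ys[j]) for j in range(N)]
print(max(errs))  # zero error at the points of interpolation, up to rounding
```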
We can proceed still differently in our problem of comparing the remainder
of the trigonometric interpolation with the remainder of the corresponding
Fourier series. We will write f(x) in the form
where ηn−1(x) is the remainder of the truncated series fn−1(x). Let us now
apply the method of trigonometric interpolation to f(x). This can be done
by interpolating fn−1(x) and ηn−1(x) and forming the sum. But the interpolation
of fn−1(x) must coincide with fn−1(x) itself, as we can see from the
fact that fn−1(x) is already a finite trigonometric series of the form (19.4),
and the uniqueness of the coefficients (19.5) of trigonometric interpolation
demonstrates that only one such series can exist. Hence it suffices to
interpolate the remainder ηn−1(x). Since this remainder is small relative
to fn−1(x), we would be inclined to believe that this in itself is enough
to demonstrate that the series obtained by trigonometric interpolation
cannot differ from either fn−1(x) or f(x) by more than a negligibly small
amount.
That this argument is deceptive, is shown by the example of equidistant
polynomial interpolation, considered earlier in Chapter 1. If the zeros of
interpolation are chosen in a definite non-equidistant fashion—namely as the
SEC. 2.21 TRIGONOMETRIC AND POLYNOMIAL INTERPOLATIONS 93
Now we will make use of a fundamental theorem in the theory of the gamma
function [compare the two expansions (1.13.2) (putting μ = −x) and
(1.18.9)]:
The function Qn(x) does not vanish in the given interval. It is the last
factor of F(x) which vanishes at the points x = ±k. Hence
We obtain
Then
which agrees with (9), except for the limits of summation which do not go
beyond ± n, in view of the fact that all the later f(k) vanish.
The relation here established between equidistant polynomial and equi-
distant trigonometric interpolation permits us to make use of the theory of
trigonometric interpolation for the discussion of the error oscillations of the
polynomial interpolations of high order. Moreover, the interpolation
formula (9) is in fact even numerically much simpler than the original
Lagrangian formula and may be preferable to it in some cases.
We can now find a new interpretation for the very large error oscillations
of high order polynomial approximations. As far as the transformed
function φ(x) goes, the Gibbs oscillations remain throughout the range of
practically constant amplitude. However, when we return to the original
function f(x), we have to multiply by the function Qn(x) defined by (4).
This multiplies also the remainder ηn(x). Now Qn(x) can be closely
estimated by Stirling's formula:
which shows that Qn(x) is very nearly the nth power of a universal function
of x/n:
where
and
The general trend of Qn(x) is nearly e^(2x²/n), which shows the very strong
increase of Qn(x) with increasing x, until the maximum 4ⁿ is reached at
x = n. It is this exponential magnification of the fairly uniform Gibbs
oscillations which renders high order polynomial interpolation so inefficient
if we leave the central range of interpolation.
The transformed series (9) can be of great help if our aim is to obtain the
limit of an infinite Stirling series. In Chapter 1.9 we have encountered an
interpolation problem in which the successive terms seemed to converge as
more and more data were taken into account, but it seemed questionable
that the limit thus obtained would coincide with the desired functional
value. In the original form of the Stirling series it is by no means easy to
see what happens as more and more terms of the series are taken into account.
We fare much better by transforming the series into the form (9) and then
making the transition to the limit n → ∞. It is true that this procedure
demands a transition from the function f(x) to the new function φ(x). But
this transformation becomes particularly simple if n is very large and
converges to infinity. The relation between f(x) and φ(x), as given by (8),
requires that we should divide f(x) by Qn(x) which for any finite x and very
large n becomes
We see that for any finite point x the functions f(x) and φ(x) coincide in the
limit, as n grows to infinity. This does not absolve us from the obligation
to investigate the possible contribution of the points at infinity. But if the
nature of the function f(x) is such that we know in advance that the
contribution of the points at infinity converges to zero, then it suffices to
find the limit of the infinite sum
In the problem of Chapter 1.9 the given equidistant values had the form
(cf. 1.7.3)
This shows that the interpolated value f*(x) does not coincide with f(x) at
any point, except at the integer points x = k which provided the key-values
of the interpolation procedure. The infinite Stirling expansion of our
problem does approach a limit, but it is not the desired limit. In our specific
problem we have a = 2, and we have interpolated at the point x = ½.
Since
we obtain
The difference is small and yet significant. It would easily escape our
attention if we were to trust the numerical procedure blindly, without
backing it up by the power of a thorough analytical study.
and consider the range [−1, +1]. On the boundaries we find that the
conditions
are automatically satisfied and thus at least function and first derivative
can be conceived as continuous. The first break will occur in the second
derivative.
Under these circumstances we will use the sine series
for the representation of our data (adding in the end the linear correction to
come back to f(x)). The coefficients bk are evaluated according to the
formula (19.5) (replacing, however, sin kxₐ by sin πkxₐ). The expression
(6.8) shows that the amplitude of the error oscillations will be of the order
of magnitude n⁻³, except near x = 0 and x = π where a larger error of the
SEC. 2.22 THE FOURIER SERIES IN CURVE FITTING 99
Fit these data according to the method of Section 22. Study the Gibbs
oscillations of the interpolation obtained.
Problem 90. Let f(x) be given at n equidistant points between 0 and 1 and let
also f′(0) and f′(1) be known. What method of curve fitting could we use
under these circumstances?
[Answer: define
BIBLIOGRAPHY
[1] Churchill, R. V., Fourier Series and Boundary Value Problems (McGraw-Hill,
New York, 1941)
[2] Franklin, Ph., Fourier Methods (McGraw-Hill, New York, 1949)
[3] Jackson, D., Fourier Series and Orthogonal Polynomials (Math. Association
of America, Oberlin, 1941)
[4] Sneddon, I. N., Fourier Transforms (McGraw-Hill, 1951)
* See also A. A., Chapter 5.11, 12.
CHAPTER 3
MATRIX CALCULUS
3.1. Introduction
It was around the middle of the last century that Cayley introduced the
matrix as an algebraic operator. This concept has become so universal in
the meantime that we often forget its great philosophical significance.
What Cayley did here parallels the algebraisation of arithmetic processes by
the Hindus. While in arithmetic we are interested in getting the answer to
a given arithmetic operation, in algebra we are no longer interested in the
individual problem and its solution but start to investigate the properties
of these operations and their effect on the given numbers. In a similar
way, before Cayley's revolutionary innovation one was merely interested in
the actual numerical solution of a given set of algebraic equations, without
paying much attention to the general algebraic properties of the solution.
Now came Cayley who said: "Let us write down the scheme of coefficients
which appear in a set of linear equations and consider this scheme as one
unity":
To call this scheme by the letter A was much more than a matter of notation.
It had the significance that we are no longer interested in the numerical
values of the coefficients a11 ... ann. In fact, these numerical values are
without any significance in themselves. Their significance becomes established
only in the moment when this scheme operates on something. The
matrix A was thus divested of its arithmetic significance and became an
algebraic operator, similar to a complex number a + ib, although character-
ised by a much larger number of components. A large set of linear equations
could be written down in the simple form
where y and b are no longer simple numbers but a set of numbers, called a
"vector". That one could operate with sets of numbers in a similar way
as with single numbers was the great discovery of Cayley's algebraisation of
a matrix and the subsequent development of "matrix calculus".
This development had great repercussions for the field of differential
equations. The problems of mathematical physics, and later the constantly
expanding industrial research demanded the solution of certain linear differ-
ential equations, with given boundary conditions. One could concentrate
on these particular equations and develop methods which led to their
solution, either in closed form, or in the form of some infinite expansions.
But with the advent of the big electronic computers the task of finding the
numerical solution of a given boundary value problem is taken over by
the machine. We can thus turn to the wider problem of investigating the
general analytical properties of the differential operator itself, instead of
trying to find the answer to a given individual problem. If we understand
these properties, then we can hope that we may develop methods for the
given individual case which will finally lead to the desired numerical answer.
In this search for "properties" the methods of matrix calculus can serve
as our guiding light. A linear differential equation does not differ fundamentally
from a set of ordinary algebraic equations. The masters of 18th
century analysis, Euler and Lagrange, again and again drew exceedingly
valuable inspiration from the fact that a differential quotient is no more
than a difference coefficient whose Δx can be made as small as we wish.
This means that a linear differential equation can be approximated to any
degree of accuracy by a set of ordinary linear algebraic equations. But
these equations fall in the domain of matrix calculus. The "matrix" of
these equations is determined by the differential operator itself. And thus
the study of linear differential operators and the study of matrices as
algebraic operators is in the most intimate relation to one another. The
present chapter deals with those aspects of matrix calculus which are of
particular importance for the study of linear differential operators. One of
102 MATRIX CALCULUS CHAP. 3
the basic things we have to remember in this connection is that the trans-
formation of a differential equation into an algebraic set of equations demands
a limit process in which the number of equations goes to infinity. Hence we
can use only those features of matrix calculus which retain their significance
if the order of the matrix increases to infinity.
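The limit process described above can be made concrete in modern numerical terms. The following sketch (not part of the original text; it assumes Python with NumPy) approximates the operator y″ on [0, 1] with y(0) = y(1) = 0 by a tridiagonal matrix of second difference coefficients, and shows the algebraic surrogate converging to the solution of the boundary value problem:

```python
import numpy as np

# Illustrative sketch: replace the differential quotient y'' by the
# difference coefficient (y[i-1] - 2y[i] + y[i+1]) / h**2.
n = 99                        # number of interior grid points
h = 1.0 / (n + 1)             # mesh width, the "Delta x" of the text
x = np.linspace(h, 1.0 - h, n)

# Tridiagonal matrix representing the second difference quotient
A = (np.diag(np.full(n, -2.0)) +
     np.diag(np.ones(n - 1), 1) +
     np.diag(np.ones(n - 1), -1)) / h**2

f = -np.pi**2 * np.sin(np.pi * x)   # right side of y'' = f
y = np.linalg.solve(A, f)           # the algebraic surrogate of the problem

err = np.max(np.abs(y - np.sin(np.pi * x)))   # shrinks like h**2
```

Halving the mesh width quarters the error, in accordance with the limit process by which the algebraic system approaches the differential equation.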
3.2. Rectangular matrices
The scheme (1.1) pictures the matrix of a linear set of equations in which
the number of equations is n and the number of unknowns likewise n.
Hence we have here an "n x n matrix". From the standpoint of solving
a set of equations it seems natural enough to demand that we shall have
just as many equations as unknowns. If the number of equations is smaller
than the number of unknowns, our data are not sufficient for a unique
characterisation of the solution. On the other hand, if the number of
equations is larger than the number of unknowns, we do not have enough
quantities to satisfy all the given data and our equations are generally not
solvable. For this reason we consider in the matrix calculus of linear
algebraic systems almost exclusively only square matrices. However, for
the general study of differential operators this restriction is a severe handicap.
A differential operator such as y" for example requires the addition of two
"boundary conditions" in order to make the associated differential equation
well determined. But we may want to study the differential operator y"
itself, without any additional conditions. In this case we have to deal with
a system of equations in which the number of unknowns exceeds the number
of equations by 2. In the realm of partial differential operators the dis-
crepancy is even more pronounced. We might have to deal with the
operation "divergence" which associates a scalar field with a given vector
field:
A row vector times a column vector (of the same number of elements) gives
a scalar (a 1 x 1 matrix), called the "scalar product" of the two vectors:
where x is n x 1, A is n x m, and y is m x 1.
Two fundamental matrices of special significance: the "zero matrix"
whose elements are all zero, and the "unit matrix", defined by
A triangular matrix is defined by the property that all its elements above
the main diagonal are zero.
SEC. 3.3 THE BASIC RULES OF MATRIX CALCULUS 105
the "eigenvalue problem" associated with A. The scalars λ₁, λ₂, . . . , λₙ for
which the equation is solvable, are called the "eigenvalues" (or "characteristic
values") of A, while the vectors x₁, x₂, . . . , xₙ are called the
"eigenvectors" (or "principal axes") of A. The eigenvalues λᵢ satisfy the
characteristic equation
This algebraic equation of nth order has always n generally complex roots.
If they are all distinct, the eigenvalue problem (13) yields n distinct eigen-
vectors, whose length can be normalised to 1 by the condition
If some of the eigenvalues coincide, the equation (13) may or may not have
n linearly independent solutions. If the number of independent solutions is
less than n, the matrix is "defective" in certain eigenvectors and is thus
called a "defective matrix".
Any square matrix satisfies its own characteristic equation (the
"Hamilton-Cayley identity"):
Moreover, this is the identity of lowest order satisfied by A, if the λᵢ are all
distinct. If, however, only p of the eigenvalues are distinct, the identity
of lowest order in the case of a non-defective matrix becomes
Defective matrices, however, demand that some of the root factors shall
appear in higher than first power. The difference between the lowest order
at which the identity appears and p gives the number of eigenvectors in
which the matrix is defective.
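The Hamilton-Cayley identity is easy to verify numerically. A minimal sketch (modern NumPy notation, not part of the original text): substitute a sample matrix into its own characteristic polynomial and observe that the result vanishes up to roundoff.

```python
import numpy as np

# Sample symmetric 3 x 3 matrix; its characteristic polynomial is
# lambda^3 - 7 lambda^2 + 14 lambda - 8 = (lambda-1)(lambda-2)(lambda-4).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

c = np.poly(A)            # characteristic polynomial coefficients, leading 1
n = A.shape[0]

# Evaluate the polynomial with A substituted for lambda (Horner scheme).
P = np.zeros_like(A)
for coeff in c:
    P = P @ A + coeff * np.eye(n)

residual = np.max(np.abs(P))   # vanishes up to machine roundoff
```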
Problem 91. Let an n x n matrix M have the property that it commutes with
any n x n matrix. Show that M must be of the form M = αI.
Problem 92. Show that if A is an eigenvalue of the problem (13), it is also an
eigenvalue of the "adjoint" problem
Problem 93. Let the eigenvalues λ₁, λ₂, . . ., λₙ of A be all distinct. Show that
the matrix
Problem 94. Show that the eigenvalues of Aᵐ are the mth power of the original
eigenvalues, while the eigenvectors remain unchanged.
Problem 95. Show that the (complex) eigenvalues of an orthogonal matrix (10)
must lie on the unit circle |z| = 1.
Problem 96. Show that the following properties of a square matrix A remain
unchanged by squaring, cubing, . . . , of the matrix:
a) symmetry
b) orthogonality
c) triangular quality.
Problem 97. Show that if two non-defective matrices A and B coincide in
eigenvalues and eigenvectors, they coincide altogether: A − B = 0.
Problem 98. Show that two defective matrices A and B which have the same
eigenvalues and eigenvectors, need not coincide. (Hint: operate with two
triangular matrices whose diagonal elements are all equal.)
Problem 99. Show that Aᵐ = 0 does not imply A = 0. Show that if A is an
n x n matrix which does not vanish identically, it can happen that A² = 0,
or A³ = 0, . . . , or Aⁿ = 0 without any of the lower powers being zero.
Problem 100. Investigate the eigenvalue problem of a triangular matrix whose
diagonal elements are all equal. Show that by a small modification of the
diagonal elements all the eigenvalues can be made distinct, and that the eigen-
vectors thus created are very near to each other in magnitude and direction,
collapsing into one as the perturbation goes to zero.
The equation (1) can be conceived as the equation of a second order surface
in an n-dimensional space. The eigenvalue problem
characterises those directions in space in which the radius vector and the
normal to the surface become parallel. Moreover in consequence of (1) we
obtain
While in the case of a general matrix A the eigenvalues λᵢ are generally
complex numbers and we cannot guarantee even the existence of n
eigenvectors—they may all collapse into one vector—here we can make much
more definite predictions. The eigenvalues λᵢ are always real and the
eigenvectors are always present to the full number n. Moreover, they are
in the case of distinct eigenvalues automatically orthogonal to each other,
while in the case of multiple roots they can be orthogonalised—with an
arbitrary rotation remaining free in a definite μ-dimensional subspace if μ
is the multiplicity of the eigenvalue λᵢ. Furthermore, the length of the
eigenvectors can be normalised to 1, in which case U becomes an orthogonal
matrix:
But then we can introduce a new reference system in which the eigenvectors
—that is the columns of U—are introduced as a new set of coordinate axes
(called the "principal axes"). This means the transformation
Introducing this transformation in the equation (1) we see that the same
equation formulated in the new (primed) reference system becomes
where
This means that in the new reference system (the system of the principal
axes), the matrix S is reduced to a diagonal matrix and the equation of the
second order surface becomes
Now we can make use of the fact that the λᵢ are invariants of an orthogonal
transformation. Since the coefficients of an algebraic equation are
expressible in terms of the roots λᵢ, we see that the entire characteristic
equation (3.14) is an invariant of an orthogonal transformation. This means
that we obtain n invariants associated with an orthogonal transformation
because the coefficient of every power of λ is an invariant. The most
important of these invariants are the coefficient of (−λ)⁰ and the coefficient
of (−λ)ⁿ⁻¹. The former is obtainable by putting λ = 0 and this gives the
determinant of the coefficients of S, simply called the "determinant of S"
SEC. 3.4 PRINCIPAL AXIS TRANSFORMATION OF A SYMMETRIC MATRIX 109
and denoted by ||S||. The latter is called the "spur" of the matrix and is
equal to the sum of the diagonal terms:
Spur S = s₁₁ + s₂₂ + ⋯ + sₙₙ.
But in the reference system of the principal axes the determinant of S′
becomes the product of all the λᵢ, and thus
||S|| = λ₁λ₂ ⋯ λₙ,
while the "spur" of S′ is equal to the sum of the λᵢ, and thus
Spur S = λ₁ + λ₂ + ⋯ + λₙ.
Problem 101. Show that the invariance of (5) demands that U satisfy the
orthogonality conditions (3.10).
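These two invariants lend themselves to a numerical check. A minimal sketch (modern NumPy notation, not part of the original text): the determinant of a symmetric matrix equals the product of its eigenvalues, and the spur equals their sum.

```python
import numpy as np

S = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])

lam = np.linalg.eigvalsh(S)     # the real eigenvalues of the symmetric S

det_S  = np.linalg.det(S)       # the invariant belonging to (-lambda)^0
spur_S = np.trace(S)            # the invariant belonging to (-lambda)^(n-1)
```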
Problem 102. Show that by considering the principal axis transformation of
S, S², S³, . . . , Sⁿ, we can obtain all the n invariants of S by taking the spur
of these matrices.
Investigate in particular the case k = 2 and show that this invariant is equal to
the sum of the squares of the absolute values of all the elements of the matrix S:
Problem 103. Show that the following properties of a matrix are invariants of
an arbitrary rotation (orthogonal transformation):
a) symmetry
b) anti-symmetry
c) orthogonality
d) the matrices 0 and I
e) the scalar product xy of two vectors.
Problem 104. Show that for the invariance of the determinant and the spur
the symmetry of the matrix is not demanded: they are invariants of an
orthogonal transformation for any matrix.
Problem 105. Show that the eigenvalues of a real anti-symmetric matrix
(Ã = −A) are purely imaginary and come in pairs: λᵢ = ±iβᵢ. Show that one
of the eigenvalues of an anti-symmetric matrix of odd order is always zero.
Problem 106. Show that if all the eigenvalues of a symmetric matrix S collapse
into one: λᵢ = α, that matrix must become S = αI.
Problem 107. Find the eigenvalues and principal axes of the following matrix
and demonstrate explicitly the transformation theorem (16), together with the
validity of the spur equations (21):
[Answer:
Problem 108. Find the eigenvalues and principal axes of the following Hermitian
matrix and demonstrate once more the validity of the three spur equations (21):
[Answer:
Problem 111. Show that the following class of n x n matrices are simultaneously
symmetric and orthogonal (cf. Section 2.19):
SEC. 3.5 DECOMPOSITION OF A SYMMETRIC MATRIX 111
Show that for all even n the multiplicity of the eigenvalues ± 1 is even, while for
odd n the multiplicity of +1 surpasses the multiplicity of −1 by one unit.
Problem 112. Construct another class of n x n symmetric and orthogonal
matrices by writing down the elements
along the lower horizontal and the right vertical.
Problem 113. Consider the cases n = 2 and 3. Show that here the sine and
cosine matrices coincide. Obtain the principal axes for these cases.
[Answer:
This shows that an arbitrary symmetric matrix can be obtained as the product
of three factors: the orthogonal matrix U, the diagonal matrix Λ, and the
transposed orthogonal matrix Ũ:
A further important fact comes into evidence if it so happens that one
or more of the eigenvalues λᵢ are zero. Let us then separate the zero
eigenvalues from the non-zero eigenvalues:
Problem 115. Demonstrate the validity of the decomposition theorem (1) for
the matrix (4.23) of Problem 107.
Problem 116. Demonstrate the validity of the decomposition theorem (5) for
the matrix (4.25) of Problem 108.
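The three-factor decomposition can be demonstrated with any symmetric matrix. A sketch in modern NumPy notation (not part of the original text; `eigh` returns the normalised principal axes as the columns of U):

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, U = np.linalg.eigh(S)     # eigenvalues in ascending order: 1, 3
Lambda = np.diag(lam)

# orthogonal matrix * diagonal matrix * transposed orthogonal matrix
S_rebuilt = U @ Lambda @ U.T
```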
where
Hence in the new reference system the linear system (1) appears in the
form
Since Λ is a mere diagonal matrix, our equations are now separated and
immediately solvable—provided that they are in fact solvable. This is
certainly the case if none of the eigenvalues of A is zero. In that case the
This shows that a self-adjoint linear system whose matrix is free of zero
eigenvalues is always solvable and the solution is unique.
But what happens if some of the eigenvalues λᵢ vanish? Since any
number multiplied by zero gives zero, the equation
has the geometrical significance that the vector b is orthogonal to the ith
principal axis. That principal axis was defined by the eigenvalue equation
has more than one linearly independent solution. And since the condition
(11) has to hold for every vanishing eigenvalue, while on the other hand
these are all the conditions demanded for the solvability of the linear system
(1), we obtain the fundamental result that the necessary and sufficient
condition for the solvability of a self-adjoint linear system is that the right side
is orthogonal to every linearly independent solution of the homogeneous equation
Ay = 0.
Coupled with these "compatibility conditions" (11) goes a further
peculiarity of a zero eigenvalue. The equation
is solvable for any arbitrary y′ᵢ. The solution of a linear system with a
vanishing eigenvalue is no longer unique. But the appearance of the free
component y′ᵢ in the solution means from the standpoint of the original
reference system that the product y′ᵢuᵢ can be added to any valid solution
of the given linear system. In the case of several principal axes of zero
eigenvalue an arbitrary linear combination of these axes can be added and
we still have a solution of our linear system. On the other hand, this is
all the freedom left in the solution. But "an arbitrary linear combination
SEC. 3.7 ARBITRARY N X M SYSTEMS 115
of the zero axes" means, on the other hand, an arbitrary solution of the
homogeneous equation (14). And thus we obtain another fundamental
result: " The general solution of a compatible self-adjoint system is obtained by
adding to an arbitrary particular solution of the system an arbitrary solution
of the homogeneous equation Ay = 0."
Problem 117. Show that the last result holds for any distributive operator A.
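Both fundamental results admit a numerical illustration. The sketch below (modern NumPy, not part of the original text) uses a singular symmetric matrix whose homogeneous equation Ay = 0 is solved by z = (1, 1, 1): a right side becomes compatible only after it is made orthogonal to z, and an arbitrary multiple of z may then be added to any particular solution.

```python
import numpy as np

# Singular symmetric matrix: every row sums to zero, so A z = 0 for z = (1,1,1).
A = np.array([[ 2.0, -1.0, -1.0],
              [-1.0,  2.0, -1.0],
              [-1.0, -1.0,  2.0]])
z = np.array([1.0, 1.0, 1.0])       # solution of the homogeneous equation

b_bad  = np.array([1.0, 0.0, 0.0])          # not orthogonal to z: incompatible
b_good = b_bad - z * (b_bad @ z) / (z @ z)  # projected: now z . b_good = 0

# least squares yields a particular solution of the compatible system
y_part, *_ = np.linalg.lstsq(A, b_good, rcond=None)

# adding any multiple of z still solves the system
y_general = y_part + 0.7 * z
```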
where the matrix A has n rows and m columns and transforms the column
vector y of m components into the column vector b of n components. Such
a matrix is obviously associated with two spaces, the one of n, the other
of m dimensions. We will briefly call them the N-space and the M-space.
These two spaces are in a duality relation to each other. If the vector y
of the M-space is given, the operator A operates on it and transplants it
into the N-space. On the other hand, if our aim is to solve the linear
system (1), we are given the vector b of the N-space and our task is to find
the vector y of the M-space which has generated it through the operator A.
However, in the present section we shall not be concerned with any
method of solving the system (1) but rather with a general investigation of
the basic properties of such systems. Our investigation will not be based
on the determinant approach that Kronecker and Frobenius employed in
their algebraic treatment of linear systems, but on an approach which
carries over without difficulty into the field of continuous linear operators.
The central idea which will be basic for all our discussions of the behaviour
of linear operators is the following. We will not consider the linear system
(1) in isolation but enlarge it by the adjoint m x n system
The matrix Ã has m rows and n columns and accordingly the vectors x
and c are in a reciprocity relation to the vectors y and b, x and b being
vectors of the N-space, y and c vectors of the M-space.
The addition of the system (2) has no effect on the system (1) since the
vectors x and c are entirely independent of the vectors y and b, and vice
versa. But the addition of the system (2) to (1) enlarges our viewpoint
and has profound consequences for the deeper understanding of the properties
of linear systems.
We combine the systems (1) and (2) into the larger scheme
The vectors (x, y) combine into the single vector z from the standpoint of
the larger system, just as the vectors (b, c) combine into the larger vector a.
However, for our present purposes we shall prefer to maintain the
individuality of the vectors (x, y) and formulate all our results in vector
pairs, although they are derived from the properties of the unified system (5).
Since the unified system has a symmetric matrix, we can immediately
apply all the results we have found in Sections 4, 5, and 6. First of all,
we shall be interested in the principal axis transformation of the matrix S.
For this purpose we have to establish the fundamental eigenvalue equation
which in view of the specific character of our matrix (4) appears in the
following form, putting w = (u, v):
We will call this pair of equations the "shifted eigenvalue problem", since
on the right side the vectors u and v are in shifted position, compared with
the more familiar eigenvalue problem (3.13), (3.18). It is of interest to
observe that the customary eigenvalue problem loses its meaning for n x m
matrices, due to the heterogeneous spaces to which u and v belong, while
the shifted eigenvalue problem (7) is always meaningful. We know in
advance that it must be meaningful and yield real eigenvalues since it is
merely the formulation of the standard eigenvalue problem associated with
a symmetric matrix, which is always a meaningful and completely solvable
problem. We also know in advance that we shall obtain n + m mutually
orthogonal eigenvectors, belonging to n + m independent eigenvalues,
although the eigenvalues may not be all distinct (the characteristic equation
can have multiple roots).
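In present-day terminology the shifted eigenvalue problem is the singular value decomposition, and the λᵢ are the singular values of A. A sketch (modern NumPy, not part of the original text) verifying the shifted pair Av = λu, Ãu = λv for an n x m matrix:

```python
import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 1.0]])     # an n x m matrix with n = 2, m = 3

U, lam, Vt = np.linalg.svd(A)
V = Vt.T

# residuals of the shifted pair  A v = lambda u,  A~ u = lambda v
res = max(np.linalg.norm(A @ V[:, i] - lam[i] * U[:, i]) +
          np.linalg.norm(A.T @ U[:, i] - lam[i] * V[:, i])
          for i in range(len(lam)))
```

The u-vectors and v-vectors returned by `svd` are exactly the paired principal axes of the text, normalised to length 1.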
The orthogonality of two eigenvectors wᵢ now takes the form
which yields
result of (12). These vectors can serve as an orthogonal set of base vectors
which span the entire N-, respectively, M-space.
We will picture these two spaces by their base vectors which are arranged
in successive columns. We thus obtain two square matrices, namely the
U-matrix, formed out of the n vectors u₁, u₂, . . . , uₙ, and the V-matrix,
formed out of the m vectors v₁, v₂, . . . , vₘ. While these two spaces are
quite independent of each other, yet the two matrices U and V are related
by the coupling which exists between them due to the original eigenvalue
problem (7) which may also be formulated in terms of the matrix equations
This coupling must exist for every non-zero eigenvalue λᵢ while for a zero
eigenvalue the two equations
which, together with the 2p "paired" axes actually generate the demanded
m + n principal axes of the full matrix (4).
Problem 118. By applying the orthogonality condition (8) to the pair (uᵢ, vᵢ; λᵢ),
(uᵢ, −vᵢ; −λᵢ), (λᵢ ≠ 0), demonstrate that the normalisation of the length of uᵢ
to 1 automatically normalises the length of the associated vᵢ to 1 (or vice versa).
actually yields a self-adjoint system and thus the results of our previous
investigation become directly applicable. In particular we can state
explicitly what are the compatibility conditions to be satisfied by the right
side (b, c) of the unified system (7.5) which will make a solution possible.
This condition appeared in the form (6.11) and thus demands the generation
of the eigenvectors (uᵢ, vᵢ) associated with the eigenvalue zero. The
necessary and sufficient condition for the solvability of the system (7.5)
will thus appear in the general form
where (uᵢ, vᵢ) is any principal axis associated with the eigenvalue λ = 0:
But now we have seen that these equations fall apart into the two inde-
pendent sets of solutions
and
Problem 122. Show that the number p must lie between 1 and the smaller of
the two numbers n and m:
Problem 123. Prove the following theorem: "The sum of the squares of the
absolute values of all the elements of an arbitrary (real) n x m matrix is
equal to the sum of the squares of the eigenvalues λ₁, λ₂, . . . , λₚ."
Problem 124. Given the following 4 x 5 matrix:
0.43935348 -0.01703335 -2 6 14
0.33790963 -0.77644357 3 -3 -1
-0.01687774 0.28690800 8 0 0
0.40559800 0.55678265 0 -4 0
0.72662988 0.06724708 0 0 -8
(Note that the zero axes have not been orthogonalised and normalised.) ]
The fundamental matrix decomposition theorem (4) appears now in the form
and reveals the remarkable fact that the operator A can be generated without
any knowledge of the principal axes associated with the zero eigenvalue, that
is without any knowledge of the solutions of the homogeneous equations
Av = 0 and Au = 0. These solutions gave us vital information concerning
the compatibility and deficiency of the linear system Ay = b, but exactly
these solutions are completely ignored by the operator A.
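This remarkable fact is easy to exhibit numerically. The sketch below (modern NumPy, not part of the original text) rebuilds a rank-deficient matrix from its p essential axes alone; the zero axes never enter.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0, 1.0],
              [2.0, 4.0, 0.0, 2.0]])    # second row = 2 * first row: rank p = 1

U, lam, Vt = np.linalg.svd(A)
p = int(np.sum(lam > 1e-12))            # number of non-zero eigenvalues

# Rebuild A from the p essential axes alone.
A_rebuilt = sum(lam[i] * np.outer(U[:, i], Vt[i]) for i in range(p))
```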
SEC. 3.9 THE FUNDAMENTAL DECOMPOSITION THEOREM 123
[Answer:
[Answer
Problem 127. Construct the 4 x 5 matrix (8.6) with the help of the two non-
zero eigensolutions, belonging to λ₁ and λ₂, of the table (8.9).
[Answer: Carry out numerically the row-by-row operation
(which are always valid because U and V are always semi-orthogonal) are
reversible:
But these products become I only in the case that the relations (3) hold
and that is only true if m = n = p.
Has the matrix B any significance in the general case in which p is not
equal to m and n? Indeed, this is the case and we have good reasons to
consider the matrix B as the natural inverse of A, even in the general case.
Let us namely take into consideration that the operation of the matrix A is
restricted to the spaces spanned by the matrices U and V. The spaces U₀
and V₀ do not exist as far as the operator A is concerned. Now the unit
matrix I has the property that, operating on any arbitrary vector u or v, it
leaves that vector unchanged. Since, however, the concept of an "arbitrary
vector" is meaningless in relation to the operator A—whose operation is
restricted to the eigen-spaces U, V—it is entirely sufficient and adequate to
replace the unit matrix I by a less demanding matrix which leaves any
vector belonging to the subspaces U and V unchanged.
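The "natural inverse" is known today as the Moore-Penrose pseudoinverse. A sketch (modern NumPy, not part of the original text) showing that B = pinv(A) reproduces A, that A reproduces B, and that BA leaves any vector of the V-space unchanged:

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])       # 2 x 3 matrix of full rank p = 2

B = np.linalg.pinv(A)                 # the "natural inverse" of the text

ok1 = np.allclose(A @ B @ A, A)       # B ignores exactly what A ignores
ok2 = np.allclose(B @ A @ B, B)

# BA acts as the identity on the V-space (here: the row space of A)
v = A.T @ np.array([0.3, -1.2])       # an arbitrary vector of the V-space
ok3 = np.allclose(B @ A @ v, v)
```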
The product AB, being an n x n matrix, can only operate on a vector
of the N-space and if we want this vector to belong to the subspace U,
we have to set it up in the form
and once more we have demonstrated that the product BA has actually the
property to leave any vector belonging to the V-space unchanged.
The matrix B is thus the natural substitute for the non-existent "strict
inverse", defined by (1), and may aptly be called the "natural inverse" of
the matrix A. It is an operator which is uniquely associated with A and
whose domain of operation coincides with that of A. It ignores completely
the fields U₀ and V₀. If B operates on any vector of the subspace U₀, it
by
Have we found the solution of our equation? Substitution in (12) yields the
condition
Obtain the normalised least square solution (13) of this system, without making
use of the complete eigenvalue analysis contained in the table (8.9). Then
check the result by constructing the matrix B with the help of the two essential
axes belonging to λ₁ and λ₂.
[Hint: Make the right side b orthogonal to u₃ and u₄. Make the solution y
orthogonal to v₃, v₄, v₅, thus reducing the system to a 2 x 2 system which has a
unique solution.]
[Answer:
Show that for this system the compatibility condition (9.7) is automatically
fulfilled. Show also that for an over-determined system which is free of
deficiencies the solution (13), constructed with the help of the B matrix,
coincides with the solution of the system (19).
Here we have the full counterpart of the equation (6.6) which we have en-
countered earlier in the study of n x n linear systems whose matrix was
symmetric. Now we have succeeded in generalising the procedure to arbitrary
non-symmetric matrices of the general n x m type.
But let us notice the peculiar fact that the new equation (4) is a p x p
system while the original system was an n x m system. How did this
reduction come about?
We understand the nature of this reduction if we study more closely the
nature of the two orthogonal transformations (2) and (3). Since generally
the U and V matrices are not full orthogonal matrices but n x p, respectively
m x p, matrices, the transformations (2) and (3) put a definite bias on the
vectors b and y. We can interpret these two equations as saying that b is
inside the U-space, y inside the V-space. The first statement is not
necessarily true but if it is not true, then our system is incompatible and
allows no solution. The second statement again is not necessarily true
since our system may be incomplete in which case the general solution
appears in the form (9.8) which shows that the solution y can have an
arbitrary projection into V₀. However, we take this deficiency for granted
and are satisfied if we find a particular solution of our system which can be
later augmented by an arbitrary solution of the homogeneous equation.
We distinguish this particular solution by the condition that it stays entirely
within the V-space. This condition makes our solution unique.
Since the subspaces U and V are both p-dimensional, it is now
understandable that our problem was reducible from the original n x m system
to a p x p system. Moreover, the equations of the new system are separated
and are solvable at once:
and thus
SEC. 3.12 ERROR ANALYSIS OF LINEAR SYSTEMS 129
We have thus obtained exactly the same solution that we have encountered
before in (10.13) when we were studying the properties of the "natural
inverse" of a matrix.
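The reduction to a separated p x p system can be carried out explicitly. A sketch (modern NumPy, not part of the original text): transform the right side to the principal axes, divide by the p non-zero eigenvalues, transform back, and compare with the natural-inverse solution.

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 2.0, 1.0]])       # rank p = 2: row 3 = row 1 + row 2

U, lam, Vt = np.linalg.svd(A)
p = int(np.sum(lam > 1e-12))

b = A @ np.array([1.0, 2.0, 3.0])     # compatible right side by construction

b_prime = U[:, :p].T @ b              # transform to the principal axes
y_prime = b_prime / lam[:p]           # separated equations y'_i = b'_i / lambda_i
y = Vt[:p].T @ y_prime                # back to the original reference system

y_natural = np.linalg.pinv(A) @ b     # solution via the natural inverse
```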
We should well remember, however, the circumstances which brought this
unique solution into existence:
1. We took it for granted that the compatibility conditions of the system
are satisfied. This demands that the right side b shall lie inside the
p-dimensional subspace U of the full N-space.
2. We placed the solution in the eigenspace of the matrix A, and that is
the p-dimensional subspace V of the full M-space.
Problem 130. Show that for the solution of the adjoint system (7.2) the role
of the spaces U and V is exactly reversed. Obtain the reduction of this m x n
system to the p x p system of equation (4).
and according to the rules of algebra our system must have one and only one
solution. The problem is entrusted to a big computing outfit which carries
through the calculations and comes back with the answer. The engineer
looks at the solution and shakes his head. A number which he knows to be
positive came out as negative. Something seems to be wrong also with
the decimal point since certain components of y go into thousands when
he knows that they cannot exceed 20. All this is very provoking and he
tells the computer that he must have made a mistake. The computer
points out that he has checked the solution, and the equations checked with
an accuracy which goes far beyond that of the data. The meeting breaks
up in mutual disgust.
What happened here? It is certainly true that the ideal case (2)
guarantees a unique and finite answer. It is also true that with our present-
day electronic facilities that answer is obtainable with an accuracy which
goes far beyond the demands of the engineer or the physicist. Then how
could anything go wrong?
The vital point in the mathematical analysis of our problem is that the
data of our problem are not mere numbers, obtainable with any accuracy
we like, but the results of measurements, obtainable only with a limited
accuracy, let us say an accuracy of 0.1%. On the other hand, the engineer
is quite satisfied if he gets the solution with a 10% accuracy, and why
should that be difficult with data which are 100 times as good?
The objection is well excusable. The peculiar paradoxes of linear systems
have not penetrated yet to the practical engineer whose hands are full with
other matters and who argues on the basis of experiences which hold good
in many situations but fail in the present instance. We are in the fortunate
position that we can completely analyse the problem and trace the failure
of that solution to its origins, showing the engineer point by point how the
mishap occurred.
We assume the frequent occurrence that the matrix A itself is known with
a high degree of accuracy while the right side b is the result of measurements.
Then the correct equation (1) is actually not at our disposal but rather the
modified equation
where
the given right side, differs from the "true" right side b by the "error
vector" β. From the known performance of our measuring instruments we
can definitely tell that the length of β cannot be more than a small percentage
of the measured vector b—let us say 0.1%.
Now the quantity the computer obtains from the data given by the
engineer is the vector
since he has solved the equation (3) instead of the correct equation (1). By
substituting (5) in (3) we obtain for the error vector η of the solution the
following determining equation:
The question now is whether the relative smallness of β will have the relative
smallness of η in its wake, and this is exactly the point which has to be
answered by "no".
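The engineer's mishap can be reproduced in miniature. The sketch below (modern NumPy, not part of the original text) solves a notoriously ill-conditioned system—the 10 x 10 Hilbert matrix—with a right side contaminated by a 0.1% error vector, and the relative error of the solution explodes:

```python
import numpy as np

n = 10
H = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)  # Hilbert matrix

y_true = np.ones(n)
b = H @ y_true

rng = np.random.default_rng(0)
beta = rng.standard_normal(n)
beta *= 1e-3 * np.linalg.norm(b) / np.linalg.norm(beta)  # a 0.1% error vector

y_hat = np.linalg.solve(H, b + beta)                     # "exact" solution of (3)
rel_err = np.linalg.norm(y_hat - y_true) / np.linalg.norm(y_true)

cond = np.linalg.cond(H)     # of the order 1e13: the amplification factor
```

The equations still "check" to high accuracy, exactly as in the story above; it is the solution, not the residual, that is ruined.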
As in the previous section, we can once more carry through our analysis
most conveniently in a frame of reference which will separate our equations.
This popular expression is not without its dangers since it creates the
impression that an "ill-conditioned" matrix is merely in a certain mathe-
matical "condition" which could be remedied by the proper know-how. In
actual fact we should recognise the general principle that a lack of information
cannot be remedied by any mathematical trickery. If we ponder on our
problem a little longer, we discover that it is actually the lack of information
that causes the difficulty. In order to understand what a small eigenvalue
means, let us first consider what a zero eigenvalue means. If in one of the
equations of the system (11.4), for example the ith equation, we let λᵢ converge
to zero, this means that the component y′ᵢ appears in our linear system with
the weight zero. We can trace back this component to the original vector
y, on account of the equation
there will be three linear combinations of the 5 unknowns, which are a priori
* See the author's paper on "Iterative solution of large-scale linear systems" in the
Journal of SIAM 6, 91 (1958).
unobtainable because they are simply not represented in the system. They
are:
here the axes u₃ and u₄ come into operation and we see that the two
combinations
are a priori undeterminable (or any linear aggregate of these two expressions).
Now we also understand what a very small eigenvalue means. A certain
linear combination of the unknowns, which can be determined in advance,
does not drop out completely but is very weakly represented in our system.
If the data of our system could be trusted with absolute accuracy, then the
degree of weakness would be quite immaterial. As long as that combination
is present at all, be it ever so weakly, we can solve our system and it is
merely a question of numerical skill to obtain the solution with any degree
of accuracy. But the situation is very different if our data are of limited
accuracy. Then the very meagre information that our numerical system
gives with respect to certain linear combinations of the unknowns is not
only unreliable—because the errors of the data do not permit us to make
any statement concerning their magnitude—but the indiscriminate handling
of these axes ruins our solution even with respect to that information that
we could otherwise usefully employ. If we took the values of these weak
combinations from some other information—for example by casting a
horoscope or by clairvoyance or some other tool of para-psychology—we
should probably fare much better because we might be right at least in the
order of magnitude, while the mathematically correct solution has no hope
of being adequate even in roughest approximation.
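The remedy suggested by this analysis—discard the axes that are weaker than the accuracy of the data—can be sketched numerically (modern NumPy, not part of the original text; the cut-off 10⁻³ mirrors the 0.1% data accuracy assumed above):

```python
import numpy as np

n = 10
H = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)  # Hilbert matrix
y_true = np.ones(n)

rng = np.random.default_rng(1)
beta = rng.standard_normal(n)
b = H @ y_true
b_noisy = b + 1e-3 * np.linalg.norm(b) / np.linalg.norm(beta) * beta

U, lam, Vt = np.linalg.svd(H)
keep = lam > 1e-3 * lam[0]    # reject eigenvalues weaker than the data accuracy
y_trunc = Vt[keep].T @ ((U[:, keep].T @ b_noisy) / lam[keep])

err_full  = np.linalg.norm(np.linalg.solve(H, b_noisy) - y_true)
err_trunc = np.linalg.norm(y_trunc - y_true)
```

The truncated solution gives up the weak combinations entirely, and in exchange recovers the strong ones to within the accuracy of the data.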
This analysis shows how important it is to get a reliable estimate
concerning the "condition number" of our system and to reject linear
systems whose condition number (11) surpasses a certain danger point,
depending on the accuracy of our data. If we admit such systems at all,
we should be aware of the fact that they are only theoretically square
systems. In reality they are n x m systems (n < m) which are deficient in
certain combinations of the unknowns and which are useful only for those
combinations of the unknowns which belong to eigenvalues which do not go
The only thing we can be sure of is that a linear system can have no
unique solution if the number of equations is less than the number of
unknowns. Beyond that, however, we can come to definite conclusions
only if in our analysis we pay attention to three numbers associated with a
matrix:
1. The number of equations: n
2. The number of unknowns: m
3. The rank of the matrix: p.
It is the relation of p to n and m which decides the general character of a
given linear system.
The "rank" p can be decided by studying the totality of linearly
independent solutions of the homogeneous equation Av = 0 or of the
adjoint homogeneous equation Ãu = 0.
The analysis of Section 9 has shown that these two numbers are not
independent of each other. If we have found that the first equation has μ
independent solutions, then we know at once the rank of the matrix, since
p = m − μ, and thus the adjoint equation must have exactly n − p
independent solutions.
These two viewpoints give rise to four different classes of linear systems:
1. Free and complete. The right side can be chosen freely and the solution
is unique. In this case the eigen-space of the operator includes the entire
M and N spaces and we have the ideal case
3. Free and incomplete. The right side is not subjected to any conditions
but the solution is not unique. The operator now includes the entire
N-space but the M-space extends beyond the confines of the eigen-space V
of the operator. Here we have the case
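The fourfold classification can be carried out mechanically from the three numbers n, m, p. A small numpy sketch (the function name and the rank tolerance are ours):

```python
import numpy as np

def classify(A, tol=1e-10):
    """Classify the linear system Ay = b by the relation of the rank p
    to the number of equations n and of unknowns m."""
    n, m = A.shape
    p = np.linalg.matrix_rank(A, tol=tol)
    free = (p == n)        # the right side can be chosen freely
    complete = (p == m)    # the solution, when it exists, is unique
    return {(True, True):   "free and complete",
            (True, False):  "free and incomplete",
            (False, True):  "constrained and complete",
            (False, False): "constrained and incomplete"}[(free, complete)]
```

For instance `classify(np.eye(2))` reports the ideal case, while a rank-one square matrix such as `np.ones((2, 2))` is both constrained and incomplete.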
[Answer:
Problem 134. Show that the following system is constrained and complete:
Solution:
Problem 135. Show that the following system is free and incomplete:
[Answer:
Problem 136. Show that the following system is constrained and incomplete:
[Answer:
Compatibility condition:
Solution:
But then the vector Ãw—no matter what w may be—is automatically of
the form Vq, that is, we have a vector which lies completely within the
activated field of the operator. The deficiency is thus eliminated and we
obtain exactly the solution which we desire to get. The auxiliary vector w
may not be unique but the solution y becomes unique.
We can thus eliminate the deficiency of any linear system and obtain the
"natural solution" of that system by adopting the following method of
solution:
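In matrix terms the method consists in substituting y = Ãw, so that Ay = b becomes AÃw = b; any w obtained from this system yields one and the same y, which coincides with the minimum-norm (pseudoinverse) solution. A numerical sketch with an invented 2 x 3 system of full row rank:

```python
import numpy as np

# a free but incomplete system: two equations, three unknowns
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 3.0])

# substitute y = A^T w: the system becomes (A A^T) w = b
w = np.linalg.solve(A @ A.T, b)
y = A.T @ w          # the natural solution, free of any deficiency
```

The vector y solves the original system and agrees with the solution produced by the pseudoinverse.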
Hence the problem (1) becomes uniquely solvable because now we obtain
which is Poisson's equation and which has (under the condition (6)) a
unique solution. This is in fact the traditional method of solving the
problem (1). But we now see the deeper significance of the method: we
gave a unique solution in all those dimensions of the function space in
which the operator is activated and ignored all the other dimensions.
Problem 137. Obtain the solution of the incomplete system (13.20) by the
method (2) and show that we obtain a unique solution which is orthogonal to
the zero-vectors of the system.
[Answer:
Problem 138. Do the same for the 4 x 4 system (13.22) and show that the
deficiency of w has no influence on the solution y.
[Answer:
If the system
happens to be compatible, then the minimum of (1) is zero and we obtain the
correct solution of the system (2). Hence we have not lost anything by
replacing the system (2) by the minimisation of (1) which yields
We have gained, however, by the fact that the new system is always
solvable, no matter how incompatible the original system might have been.
The reason is that the decomposition (14.3) of A shows that the new right
side
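A small numpy illustration of this least-square reformulation (the 3 x 2 system is invented; the new system ÃAy = Ãb is solved and checked against the standard least-squares routine):

```python
import numpy as np

# an over-determined, incompatible system: three equations, two unknowns
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 3.0])      # incompatible: 1 + 1 != 3

# multiplying both sides by the transpose yields the always-solvable
# system (A^T A) y = A^T b, whose solution minimises |Ay - b|^2
y = np.linalg.solve(A.T @ A, A.T @ b)
```

The residual Ay − b is orthogonal to the columns of A, the hallmark of the least-square solution.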
which transforms a scalar field φ into the vector field F. Let us apply the
method (3) to this over-determined system. In Section 14 we have
encountered the operator "div" and mentioned that its transpose is the
operator "−grad". Accordingly the transpose of the operator "grad" is
the operator "−div". Hence the least-square reformulation of the original
equation (5) becomes
and once more we arrive at Poisson's equation. Here again the procedure
agrees with the customary method of solving the field equation (5) but we
get a deeper insight into the significance of this procedure by seeing that
we have applied the least-square reformulation of the original problem.
If we survey the results of the last two sections, we see that we have
found the proper remedy against both under-determination and over-determination.
In both cases the transposed operator Ã played a vital role.
We have eliminated under-determination by transforming the original y
into the new unknown w by the transformation y = Ãw, and we have
eliminated over-determination (and possibly incompatibility) by the method
of multiplying both sides of the given equation by Ã. The unique solution
thus obtained coincides with the solution (10.13), generated with the help
of the "natural inverse" B.
Problem 139. Two quantities ξ and η are measured in such a way that their
sum is measured μ times, their difference ν times. Find the most probable
values of ξ and η. Solve the same problem with the help of the matrix B and
show the agreement of the two solutions.
[Answer: Let the arithmetic mean of the sum measurements be α, the arithmetic
mean of the difference measurements be β. Then
Problem 140. Form the product Ãb for the system of Problem 128 and show
that the vector thus obtained is orthogonal to the zero-vectors
(cf. Problem 124).
3.16. The method of orthogonalisation
We can give still another formulation of the problem of removing
deficiencies and constraints from our system. The characteristic feature of
the solution (14.2) is that the solution is made orthogonal to the field V₀
which is composed of the zero-axes of the M-field. On the other hand the
characteristic feature of the least-square solution (15.3) is that the right
side b is made orthogonal to the field U₀ which is composed of the zero-axes
of the N-field. If we possess all the zero-axes—that is, if we know all the
solutions of the homogeneous equations Av = 0 and Ãu = 0—then we can remove
the deficiencies and constraints of our system and transform it into a
uniquely solvable system by carrying through the demanded orthogonalisation
in direct fashion.
1. Removal of the deficiency of the solution. Let y₀ be a particular solution
of our problem
and having obtained q, we substitute in (2) and obtain the desired normalised
solution.
2. Removal of the incompatibility of the right side. We can proceed
similarly with the orthogonalisation of the right side b of a constrained
system. We must not change anything on the projection of b into the field
U but we have to subtract the projection into UQ. Hence we can put
Find the least square solution of this system (the arithmetic mean of the right
sides) by orthogonalisation.
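The orthogonalisation can be sketched numerically. Since the displayed system is lost in this copy, we assume the simplest constrained system of this kind: one unknown measured three times, whose least-square solution is the arithmetic mean of the right sides:

```python
import numpy as np

# three measurements of a single unknown: Ay = b with A a column of ones
A = np.ones((3, 1))
b = np.array([1.0, 2.0, 6.0])

# the zero-axes of the N-field are the solutions of A^T u = 0, i.e. the
# plane orthogonal to (1, 1, 1); subtracting from b its projection into
# that plane leaves only the compatible part of the right side
u = A[:, 0] / np.linalg.norm(A[:, 0])   # the single activated axis
b_compatible = (b @ u) * u              # projection into U, with U0 removed
y = np.linalg.lstsq(A, b_compatible, rcond=None)[0]
```

The orthogonalised right side is (3, 3, 3) and y is the arithmetic mean of the measurements.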
The "given data" are in this case the boundary values f(ζ) along the
boundary curve C, and the relation (1) may be conceived as the solution of
the Cauchy-Riemann differential equations under the boundary condition
that f(z) assumes the values f(ζ) on the boundary. These "given values",
however, are by no means freely choosable. It is shown in the theory of
analytical functions that giving f(ζ) along an arbitrarily small section of C
is sufficient to determine f(z) everywhere inside the domain enclosed by C.
Hence the values f(ζ) along the curve C are by no means independent of
each other. They have in fact to satisfy the compatibility conditions
thus our system is infinitely over-determined. And yet the relation (1) is
one of the most useful theorems in the theory of analytical functions.
Another example is provided by the theory of Newtonian potential,
which satisfies the Laplace equation
where
R_rS being the distance between the fixed point r inside the domain and the
point S of the boundary surface.
The given data here are the functional values φ(S) along the boundary
surface S, and the values of the normal derivative ∂φ/∂n along the boundary
surface. This is in fact too much, since φ(S) alone, or (∂φ/∂n)(S) alone,
would suffice to determine φ(r) everywhere inside the domain. And thus
our problem is once more infinitely over-determined. The given data have
to satisfy the compatibility conditions
where g(r) is any potential function which satisfies the Laplace equation (3)
everywhere inside and on S, free of singularities. There are infinitely
many such functions.
On the surface this abundance of data seems superfluous, handicapped as it
is by the constraints to which the data are submitted. But the great
advantage of the method is that we can operate with such a simple function
as the reciprocal distance between two points as our auxiliary function
G(r, S). If we want to succeed with the minimum of data, we have first to
construct
where S′ is a point outside of the boundary C but very near to it. The
sharp increase of this function near the point S′ makes it possible to put
the spotlight rather strongly on that particular value (∂φ/∂n)(S) which
belongs to an S directly opposite to S′ (see Figure). Although we did not
succeed in separating (∂φ/∂n)(S), yet we have a well-conditioned linear
system for its evaluation which can be solved with the help of the large
electronic computers.
We will demonstrate the value of over-determination by an example
within the realm of algebraic equations. Let us assume that we have to
solve the system Ay = b which shall be of the order 2n, i.e. we consider A
as a 2n x 2n matrix. We now separate our 2n unknowns (y₁, y₂, . . ., y₂ₙ)
into two groups:
All the columns associated with the second group are carried over to the
right side, which means that we write our equation in the form
There are altogether n such solutions, which can be combined into the
2n x n matrix U₀. The compatibility conditions become
which gives us n equations for the determination of η. This requires the
inversion of the n x n matrix Ũ₀Y. Then, having obtained η, we go back
to (10), but omitting all equations beyond n, and obtain ξ by inverting
the matrix X₁, where X₁ is an n x n matrix composed of the first n rows of X.
The full 2n x n matrix U₀ shall also be split into the two n x n matrices
U₁ and U₂, writing U₂ below U₁. We have complete freedom in choosing
the matrix U₂, as long as its determinant is not zero. We will identify it
with the unit matrix I. Then
and
This requires the inversion of the n x n matrix Ũ₁Y₁ + Y₂. Then, having
solved this equation, we return to the first half of (10) and obtain
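The whole partitioned procedure can be sketched in a few numpy lines (the random 2n x 2n system and the seed are assumed for illustration; here U₀ is obtained from a singular-value decomposition of X rather than normalised so that U₂ = I):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((2 * n, 2 * n))
y_true = rng.standard_normal(2 * n)
b = A @ y_true

X, Y = A[:, :n], A[:, n:]           # the columns of the two groups
# U0 spans the solutions of the adjoint homogeneous equation X^T u = 0
U0 = np.linalg.svd(X)[0][:, n:]     # last n left singular vectors of X
# compatibility of X xi = b - Y eta gives n equations for eta
eta = np.linalg.solve(U0.T @ Y, U0.T @ b)
# the first n equations then determine xi
xi = np.linalg.solve(X[:n], (b - Y @ eta)[:n])
y = np.concatenate([xi, eta])
```

The recombined vector y agrees with the solution of the full 2n x 2n system, although only n x n matrices were ever inverted.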
[Answer:
or
since the principal axes associated with the zero eigenvalue do not participate
in the generation of the matrix. If we possess all the p "essential axes",
associated with the non-zero eigenvalues, we possess everything for the
solution of the linear system
into a more useful set. We start with vₚ₊₁, which we keep unchanged
except that we divide it by the length of the vector:
Moreover, the condition that the length of v′ₚ₊₂ shall become 1 yields
which determines the remaining coefficient (except for the sign). The
process can be continued. At the kth step we have
After m − p steps the entire set (4) is replaced by a new, orthogonal and
normalised set of vectors v′ₚ₊₁, . . ., v′ₘ. But we need not stop here.
We can continue by choosing p more vectors in any way we like, as long as
they are linearly independent of the previous set. They too can be
orthogonalised, giving us the p additional vectors
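The successive orthogonalisation described here is the Gram–Schmidt process. A sketch in numpy (the function name, the tolerance, and the random choice of the additional vectors are ours):

```python
import numpy as np

def gram_schmidt_extend(V, k):
    """Orthogonalise the columns of V and extend the set by up to k
    further orthonormal vectors chosen at random."""
    m = V.shape[0]
    rng = np.random.default_rng(1)
    candidates = list(V.T) + list(rng.standard_normal((k, m)))
    basis = []
    for v in candidates:
        for u in basis:
            v = v - (v @ u) * u        # subtract the projections
        norm = np.linalg.norm(v)
        if norm > 1e-12:               # skip linearly dependent candidates
            basis.append(v / norm)
    return np.array(basis).T
```

Starting from two independent columns in three dimensions and asking for more vectors completes an orthonormal basis of the whole space; further candidates are rejected as dependent.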
If we put
In the realm of continuous operators, where the matrices grow beyond all
bounds, the original definition of Ã loses its meaning. But the identity (1)
maintains its meaning and can serve for the purpose of defining Ã.
The next fundamental application of (1) is the derivation of the
compatibility conditions of the system Ay = b. We will extend this
system—as we have done before in Section 7—by the adjoint system,
considering the complete system
We can now ask: "Can we prescribe the right sides b and c freely?" The
application of the bilinear identity yields the relation
which holds, whatever the vectors b and c may be. We have the right to
specify our vectors in any way we like. Let us choose c = 0. Then we
no longer have an identity but an equation which holds for a special case,
namely:
The result means that the right side of the system (3) must be orthogonal to
any solution of the transposed (adjoint) homogeneous equation. The same
result can be derived for the vector c by making b equal to zero.
We have seen in the general theory of linear systems that these are the
only compatibility conditions that the right side b has to satisfy. In the
special case that the adjoint homogeneous system (7) has no non-vanishing
solution, the system (3) becomes unconstrained (the vector b can be chosen
freely). These results, obtained before by different tools, follow at once
from the bilinear identity (1), which is in fact the only identity that can be
established between the matrix A and its transpose Ã.
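Both uses of the bilinear identity are easy to verify numerically on a random over-determined operator (sizes and seed are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))    # an over-determined operator
y = rng.standard_normal(3)
x = rng.standard_normal(4)

# the bilinear identity (Ay, x) = (y, A^T x)
lhs = (A @ y) @ x
rhs = y @ (A.T @ x)

# compatibility: a right side of the form b = Ay is orthogonal to every
# solution u of the adjoint homogeneous equation A^T u = 0
u = np.linalg.svd(A)[0][:, -1]     # spans the null space of A.T
```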
We will now go one step further and consider a linear system which is
either well-determined or over-determined but not under-determined:
yᵢ. For this purpose we add to our previous equation (3) one more equation,
considering the complete system
This means that we consider the value of yᵢ as one of our data. This, of
course, cannot be done freely, otherwise our problem would not have a
unique solution. But the system (10) is a legitimate over-determined system
and it is now the compatibility condition which will provide the solution,
by giving us a linear relation between the components of the right side,
that is, a linear relation between the vector b and the added component,
which is in fact yᵢ.
This method is of great interest because it brings into evidence a general
feature of solving linear systems which we will encounter again and again
in the study of linear differential equations. There it is called the "method
of the Green's function". It consists in constructing an auxiliary function
which has nothing to do with the data but is in fact entirely determined by
the operator itself. Moreover, this auxiliary function is obtained by solving
a certain homogeneous equation.
According to the general theory the compatibility of the system (10)
demands in the usual way that we solve the adjoint homogeneous equation
for the solvability of our system. But in our case the matrix A has been
extended by an additional row. This row has all zeros, except the single
element 1 in the ith place. Considering this row as a row-vector, the
geometrical significance of such a vector is that it points in the direction of
the ith coordinate axis. Hence we will call it the ith "base vector" and
denote it by eᵢ, in harmony with our general custom of considering every
vector as a column vector:
while the compatibility condition (12), applied to our system (14), becomes
and
But then the new value of yᵢ, obtained on the basis of a g which is different
from gᵢ, becomes
However, the second term vanishes since b satisfies the compatibility
conditions of the original system. And thus the insensitivity of yᵢ relative
to the freedom of choosing any solution of (18) is demonstrated.
In Section 17 we have encountered Cauchy's integral theorem (17.1)
which was the prototype of a fundamentally important over-determined
system. Here the "auxiliary vector g" is taken over by the auxiliary function
where z is the fixed point at which f(z) shall be obtained. But the conditions
demanded of G(ζ, z) are much less strict than to yield the particular function
(23). In fact we could add to this special function any function g(ζ) which
remains analytical within the domain bounded by C. But the contribution
generated by this additional function is
and this quantity is zero according to (17.2), due to the nature of the
admissible boundary values f(ζ). Quite similar is the situation concerning
the boundary value problem (17.4), where again the function G(r, S) need
not be chosen according to (17.5), but we could add any solution of the
Laplace equation (17.3) which remains everywhere regular in the given
domain. It is exactly this great freedom in solving the under-determined
equation (21) which renders the over-determined systems so valuable from
the standpoint of obtaining explicit solutions. If the system is well-determined,
the equation (18) becomes likewise well-determined and we have
to obtain a unique, highly specified vector g. In the realm of partial
differential equations this is frequently a difficult task.
We return to our original matrix problem (14). Since the solution of
the system (16) changes with i—which assumes in succession the values
1, 2, . . . , m, if our aim is to obtain the entire y vector—we should indicate
this dependence by the subscript i. Instead of one single equation (18) we
now obtain m equations which can be solved in succession
Moreover, the base vectors e₁, e₂, . . ., eₘ, arranged as columns of a matrix,
yield the m x m unit matrix I:
The m defining equations (25) for the vectors g₁, g₂, . . ., gₘ can be united
in the single matrix equation
But the sequence of the factors in (29) is absolutely essential and cannot be
changed to
The solvability of this equation for all i would demand that the adjoint
homogeneous equation
has no non-vanishing solution. But this is not the case if our system is
over-determined (n > m).
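The method of this section can be sketched for a small invented system: each gᵢ is obtained by solving the under-determined adjoint system Ãgᵢ = eᵢ, the vectors gᵢ then reproduce the solution through yᵢ = (gᵢ, b), and the resulting matrix is a left inverse but not a right inverse:

```python
import numpy as np

# a unique but over-determined system (n = 3 equations, m = 2 unknowns)
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y_true = np.array([2.0, 5.0])
b = A @ y_true                      # a compatible right side

# for each i solve the under-determined adjoint system A^T g_i = e_i;
# lstsq picks one of the many solutions, and the freedom in that choice
# does not matter for a compatible b
m = A.shape[1]
G = np.column_stack([np.linalg.lstsq(A.T, e, rcond=None)[0]
                     for e in np.eye(m)])
y = G.T @ b                         # y_i = (g_i, b)
```

G^T A reproduces the unit matrix, while A G^T (a 3 x 3 matrix of rank two) cannot.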
Problem 147. Assume that we have found a matrix C such that for all x and y
Problem 148. Show with the help of the natural inverse (10.4) that in the case
of a unique but over-determined linear system the "left-inverse" (30) exists
but the "right-inverse" (32) does not exist.
Problem 149. Apply the solution method of this section to the solution of the
over-determined (but complete) system (13.18) and demonstrate numerically
that the freedom in the construction of G has no effect on the solution.
Problem 150. Do the same for the over-determined but incomplete system
(13.22) of Problem 136, after removing the deficiency by adding as a fifth
equation the condition
with the added condition that we choose among the possible solutions of
this problem the absolutely smallest λ = λₘ. In consequence of this
minimum problem we have for any arbitrary choice of y:
or
we obtain the equations (4) with the added condition that we choose among
all possible solutions the one belonging to the absolutely largest λ = λ₁. In
consequence of this maximum property we obtain for an arbitrary choice
of y:
And thus we can establish an upper and lower bound for the ratio (1):
or also
Let us apply the inequality (9) to this particular vector y = gᵢ, keeping in
mind that the replacement of A by Ã has no effect on the eigenvalues λ₁
and λₘ, in view of the symmetry of the shifted eigenvalue problem (4):
Since by definition the base vector eᵢ has the length 1 (cf. 19.13), we obtain
for the ith row of the inverse matrix the two bounds
This means that the sum of the squares of the elements of any row of A⁻¹ is
included between the lower bound λ₁⁻² and the upper bound λₘ⁻². Exactly
the same holds for the sum of the squares of the elements of any column,
as we can see by replacing A by Ã.
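These bounds are easy to check numerically; in the sketch below the λ's are taken as the singular values of a random matrix (size and seed assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
Ainv = np.linalg.inv(A)

# the eigenvalues of the shifted (symmetrised) problem are the singular
# values of A; lam1 is the largest, lamm the smallest
sv = np.linalg.svd(A, compute_uv=False)
lam1, lamm = sv[0], sv[-1]

row_sq = (Ainv ** 2).sum(axis=1)   # squared length of each row of A^-1
col_sq = (Ainv ** 2).sum(axis=0)   # and of each column
# every entry of row_sq and col_sq lies between lam1**-2 and lamm**-2
```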
Similar bounds can be established for A itself. By considering A as the
inverse of A⁻¹ we obtain for the square of any row or any column of A the
inequalities
Then the ith row of the natural inverse (10.4) of the matrix can be
characterised by the following equation:
where v_iα denotes the ith component of the vector v_α. By adding the
correction term on the right side we have blotted out the projection of the
base vector eᵢ into the non-activated portion of the M-space, without
changing in the least the projection into the activated portion.
The equation (17) has a unique solution if we add the further condition
that gᵢ must be made orthogonal to all the n − p independent solutions of
the homogeneous equation
It is this vector gᵢ which provides us with the ith row of the inverse B. By
letting i assume the values 1, 2, . . ., m we obtain in succession all the rows
of B.
Now we will once more apply the inequality (11) to this vector gᵢ and
once more an upper and a lower bound can be obtained for the length of the
vector. The difference is only that now the denominator, which appears in
(11), is no longer 1 but the square of the right side of (17) for which we obtain
Here again we can extend this inequality to the columns and we can
likewise return to the original matrix A which differs from A⁻¹ only in
having λᵢ instead of λᵢ⁻¹ as eigenvalues, which reverses the role of λ₁ and λₘ.
We obtain a particularly adequate general scheme if we arrange the
(orthogonalised and normalised) zero-solutions, together with the given
matrix A, in the following fashion.
where
where
CHAPTER 4
THE FUNCTION SPACE
4.1. Introduction
The close relation which exists between the solution of differential
equations and systems of algebraic equations was recognised by the early
masters of calculus. Daniel Bernoulli solved the problem of the completely
flexible chain by considering the equilibrium problem of a chain which was
composed of a large number of rigid rods of small lengths. Lagrange solved
the problem of the vibrating string by considering the motion of discrete
masses of finite size, separated by small but finite intervals. It seemed
self-evident that this algebraic approach to the problem of the continuum
must lead to the right results. In particular, the solution of a problem in
partial differential equations seemed obtained if the following conditions
prevailed:
1. We replace the continuum of functional values by a dense set of
discontinuous values.
2. The partial derivatives are replaced by the corresponding difference
coefficients, taken between points which can approach each other as much
as we like.
3. We solve the resulting algebraic system and study the behaviour of
the solution as the discrete set of points becomes denser and denser, thus
approaching the continuum as much as we wish.
4. We observe that under these conditions the solution of the algebraic
system approaches a definite limit.
5. Then this limit is automatically the desired solution of our original
problem.
The constantly increasing demands on rigour have invalidated some of
the assumptions which seemed self-evident even a century ago. To give an
exact existence theorem for the solution of a complicated boundary value
problem in partial differential equations can easily tax our mathematical
faculties to the utmost. In the realm of ordinary differential equations
Cauchy succeeded with the proof that the limit of the substitute algebraic
problem actually yields the solution of the original continuous problem.
But the method of Cauchy does not carry over into the realm of partial
differential operators, and even relatively simple partial differential equations
require a thorough investigation if a rigorous proof is required of the kind
of boundary conditions which can guarantee a solution. We do not possess
any sweeping methods which would be applicable to all partial differential
equations, even if we restrict ourselves to the realm of linear differential
operators.
Hence, while on the one hand we have no right to claim that a certain
mathematically formulated boundary value problem must have a solution
"for physical reasons", we can, on the other hand, dispense with the
rigorous existence proofs of pure mathematics, in favour of a more flexible
approach which proves the existence of solutions of certain boundary value
problems under simplified conditions. Pure mathematics would like to
extend these results to much more extreme conditions and the value of such
investigations cannot be doubted. From the applied standpoint, however,
we are satisfied if we succeed with the solution of a fairly general class of
problems with data which are not too irregular.
The present book is written from the applied angle and is thus not
concerned with the establishment of existence proofs. Our aim is not the
solution of a given differential equation but rather the exploration of the
general properties of linear differential operators. The solution of a given
differential equation is of more accidental significance. But we can hardly
doubt that the study of the properties of linear differential operators can
be of considerable value if we are confronted with the task of solving a
given differential equation because we shall be able to tell in advance what
we may and may not expect. Moreover, certain results of these purely
symptomatic studies can give us clues which may be even of practical help
in the actual construction of the solution.
For example, in the interval between 0 and 1 we might have chosen 2001
equidistant values of x and tabulated the corresponding functional values of
y = e^x. In that case the xᵢ values are defined by xᵢ = (i − 1)·0.0005,
while the y-values are the 2001 tabulated values of the exponential function,
starting from y = 1 and ending with y = 2.718281828.
We will now associate with this tabulation the following geometrical
picture. We imagine that we have at our disposal a space of 2001
dimensions. We assign the successive dimensions of this space to the
x-values 0, 0.0005, 0.001, . . ., i.e. we set up 2001 mutually orthogonal
coordinate axes which we may denote as the axes X₁, X₂, . . ., Xₙ. Along
these axes we plot the functional values yᵢ = f(xᵢ), evaluated at the points
x = xᵢ. These y₁, y₂, . . ., yₙ can be conceived as the coordinates of a
certain point Y of an n-dimensional space. We may likewise connect the
point Y with the origin O by a straight line and arrive at the picture of the
vector OY. The "components" or "projections" of this vector on the
successive axes give the successive functional values y₁, y₂, . . ., yₙ.
At first sight it seems that the independent variable x has dropped
completely out of this picture. We have plotted the functional values yᵢ
as the components of the vector OY, but where are the values x₁, x₂, . . ., xₙ?
In fact, these values are present in latent form. The role of the independent
variable x is that it provides an ordering principle for the cataloguing of the
functional values. If we want to know for example what the value f(0.25)
is, we have to identify this x = 0.25 with one of our axes. Suppose we find
that x = 0.25 belongs to the axis X₅₀₁; then we single out that particular
axis and see what the projection of the vector OY is on that axis. Our
construction is actually isomorphic with every detail of the original
tabulation and repeats that tabulation in a new geometrical interpretation.
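The tabulation and the look-up can be transcribed directly (Python with numpy; the zero-based indexing of the axes is the only departure from the text's counting):

```python
import numpy as np

# tabulate y = e^x at 2001 equidistant points of [0, 1]; the whole table
# becomes a single vector of a 2001-dimensional space
n = 2001
x = np.linspace(0.0, 1.0, n)       # x_i = (i - 1) * 0.0005
Y = np.exp(x)                      # the vector OY

# looking up f(0.25) means singling out the axis to which x = 0.25
# belongs (the 501st) and reading off the projection of OY on that axis
i = int(round(0.25 / 0.0005))      # zero-based index 500, axis X_501
value = Y[i]
```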
We can now proceed even further and include in our construction functions
which depend on more than one variable. Let a function f(x, y) depend on
two variables x and y. We tabulate this function in certain intervals, for
example in similar intervals as before, but now proceeding in equal intervals
Δx and Δy independently. If before we needed 2000 entries to cover the
interval [0, 1], we may now need 4 million entries to cover the square
0 ≤ x ≤ 1, 0 ≤ y ≤ 1. But in principle the manner of tabulation has not
changed. The independent variables x, y serve merely as an ordering
principle for the arrangement of the tabular values. We can make a
catalogue in which we enumerate all the possible combinations of x, y
values in which our function has been tabulated, starting the enumeration
with 1 and ending with, let us say, 4 million. Then we imagine a space of
4 million dimensions and again we plot the successive functional values of
u = f(x, y) as components of a vector. This one vector is again a perfect
substitute for our table of 4 million entries.
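The cataloguing of a two-variable table as a single vector is, in modern terms, nothing but the row-major flattening of a grid. A sketch (the function and the grid size are assumed for illustration):

```python
import numpy as np

# tabulating a function of two variables: the catalogue that lines the
# two-dimensional table up as one sequence of cells is simply the
# row-major ordering of the grid
def f(x, y):
    return np.sin(x) * np.cos(y)

n = 201
x = np.linspace(0.0, 1.0, n)
u = f(x[:, None], x[None, :])      # the n x n table
vec = u.ravel()                    # one vector with n * n components

# the catalogue number of the cell (i, j) is i * n + j
i, j = 17, 102
```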
We observe that the dimensionality of our original problem is of no
immediate concern for the resulting vector picture. The fact that we have
replaced a continuum by a discrete set of values abolishes the fundamental
difference between functions of one or more variables. No matter how many
independent variables we had, as soon as we begin to tabulate, we automatically
begin to atomise the continuum and by this process we can line up any
number of dimensions as a one-dimensional sequence of values.
Our table may become very bulky but in principle our procedure never
changes. We need two things: a catalogue which associates a definite
cardinal number
with the various "cells" into which our continuum has been broken up, and a
table which associates a definite functional value with these cardinal numbers,
from 1 to n, where n may be a tremendously large number. Now we take
all these functional values and construct a definite vector of the n-dimensional
space which is a perfect representation of our function. Another function
belonging to the same domain will find its representation in the same
n-dimensional space, but will be represented by another vector because the
functional values, which are the components of the new vector along the
various axes, are different from what they were before.
This concept of a function as a vector looks strange and artificial at the
first moment and yet it is an eminently useful tool in the study of differential
and integral operators. We can understand the inner necessity of this
concept if we approach the problem in the same way as Bernoulli and
Euler and Lagrange approached the solution of differential equations.
Since the derivative is defined as the limit of a difference coefficient, the
replacement of a differential equation by a difference equation involves a
certain error which, however, can be reduced to as little as we like by
making the Δx between the arguments sufficiently small. But it is this
replacement of a differential equation by a difference equation which has a
profound effect on the nature of our problem. So far as the solution is
concerned, we know that we have modified the solution of our problem by a
negligibly small amount. But ideologically it makes a very great difference
to be confronted by a new problem in which everything is formulated in
algebraic terms. The unknown is no longer a continuous function of the
variables. We have selected a discrete set of points in which we want to
obtain the values of the function and thus we have transformed a problem
of infinitely many degrees of freedom to a problem of a finite number of
degrees of freedom. The same occurs with partial differential equations
in which the independent variables form a more than one-dimensional
manifold. In the problem of a vibrating membrane for example we should
find the displacement of an elastic membrane which depends on the three
variables x, y, and t. But if we assume that the material particles of the
membrane are strictly speaking not distributed continuously over a surface
but actually lumped in a large number of "mass-points" which exist in
isolated spots, then we have the right picture which corresponds to the
concepts of the "function space". Because now the displacement of the
membrane is no longer a continuous function of x, y, t but a displacement
which exists only in a large but finite number of grid-points, namely the
points in which the mass-points are concentrated. The new problem is
mathematically completely different from the original problem. We are no
longer confronted with a partial differential equation but with a large
number of ordinary differential equations, because we have to describe the
elastic vibrations that the n mass-points describe under the influence of the
elastic forces which act between them. But now we can go one step still
further. We can carry through the idea of atomisation not only with
respect to space but also with respect to time. If we atomise the time variable,
We have chosen for the sake of simplicity an equidistant set of points, which
is not demanded since generally speaking our Δx = ε could change from
point to point. But a constant Δx is simpler and serves our aims equally well.
Now the function y(x) will also be atomised. We are no longer interested
in the infinity of values y(x) but only in the values of y(x) at the selected
points x_i:
The same happened with the given right side b(x) of the differential equation.
This b(x) too disappeared in its original entity and re-appeared on the
platform as the vector
Closer inspection reveals that strictly speaking these two vectors do not
belong to the same space. Our algebraic system (5) is in fact not an
n × n system but an (n − 2) × n system. The number of unknowns
surpasses the number of equations by 2. Here we observe already a
characteristic feature of differential operators: They represent in themselves
without further data, an incomplete system of equations which cannot have
a unique solution. In order to remove the deficiency, we have to give some
further data and we usually do that by adding some proper boundary
conditions, that is certain data concerning the behaviour of the solution
at the boundaries. For example, we could prescribe the values of y(0) and
y(1). But we may also give the values of y'(x) at the two endpoints, which in
our algebraic transcription means the two values
There are many other possibilities and we may give two conditions at the
point x = 0 without any conditions at x = 1, or perhaps two conditions at
the point x = 1 without any conditions at x = 0, or two conditions which
involve both endpoints simultaneously. The important point is that the
differential equation alone, without boundary conditions, cannot give a unique
solution. This is caused by the fact that a linear differential operator of the
order r represents a linear relation between r + 1 functional values.
Hence, letting the operator operate at every point x = x_i, the operator
would make use of r additional functional values which go beyond the limits
of the given interval. Hence we have to cross out r of the equations, which
makes the algebraic transcription of an rth order linear differential operator
a deficient system of n − r equations between n unknowns. The
In this case y_1 and y_n disappear on the left side since they are no longer
unknowns. We now obtain a system of n − 2 rows and columns and the
resulting matrix is a square matrix which can be written out as follows, if
we take out the common factor 1/ε²:
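The matrix described here can be sketched numerically. The following fragment (illustrative; the function name and the grid size are our own choices, not the book's) builds the (n − 2) × (n − 2) second-difference matrix with the common factor 1/ε² taken out, after the boundary values y_1 and y_n have been moved to the right side:

```python
import numpy as np

def second_difference_matrix(n):
    """(n-2) x (n-2) matrix of second differences at the interior
    grid points; the common factor 1/eps**2 is taken out, so the
    transcribed operator is this matrix divided by eps**2."""
    m = n - 2
    A = -2.0 * np.eye(m) + np.eye(m, k=1) + np.eye(m, k=-1)
    return A

A = second_difference_matrix(7)
# The matrix is symmetric, which is why this transcription of y''
# with prescribed end values turns out to be self-adjoint.
print(A)
```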
It must be our aim to save this valuable feature of a metrical space in relation
(the equality of all Δx_i = x_{i+1} − x_i is not demanded). If we now form the
scalar product (1), we actually get something very valuable, namely
and this quantity approaches a very definite limit as Δx_i decreases to zero,
namely the definite integral
This definition has once more the great value that in the limit it leads to a
very definite invariant associated with the multi-dimensional function
Φ(x, y, z, …), namely the definite integral
where dr is the volume element of the region and the integration is extended
over the complete range of all the variables.
In fact, this generalisation to the multi-dimensional case is so natural
that we often prefer to cover the general case with the same symbolism,
denoting by x an arbitrary point of the given multi-dimensional region and
by dx the volume element of that region. The formula (7) may then be
written in the form
in full analogy to the formula (5), although the symbol x refers now to a
much more complicated domain and the integration is extended over a
multi-dimensional region.
4.8. The scalar product of two vectors
If two vectors / and g are placed in a Euclidean space, their mutual
position gives rise to a particularly important invariant, the "scalar
product" of these two vectors, expressible in matrix language by the product
In particular, if this product is zero, the two vectors are orthogonal to each
other. We can expect that the same operation applied to the space of
functions will give us a particularly valuable quantity which will be of
fundamental significance in the study of differential operators. If we
return to Section 7, where we have found the proper definition of the vector
components in the space of functions, we obtain—by the same reasoning
that gave us the "norm" (length square) of a function—that the "scalar
product" of the two functions f(x) and g(x) has the following significance:
The same holds in the multi-dimensional case if we interpret the point "x"
in the sense of the formula (7.8) and dx as the volume-element of the domain:
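This scalar product of two functions can be checked numerically. In the sketch below (the particular functions f and g are our own illustrative choices, not taken from the text) the finite sums Σ f(x_i)g(x_i)Δx_i approach the integral of f(x)g(x) over (0, 1) as the subdivision is refined:

```python
import numpy as np

def scalar_product(f, g, n):
    # equidistant points with equal weights dx_i = 1/n
    # (left-endpoint sum over the interval (0, 1))
    x = np.arange(n) / n
    return np.sum(f(x) * g(x)) / n

f = lambda x: np.sin(np.pi * x)
g = lambda x: x
exact = 1.0 / np.pi          # the integral of x*sin(pi*x) over (0,1)
for n in (10, 100, 1000):
    print(n, abs(scalar_product(f, g, n) - exact))
```

The printed errors shrink as n grows, in agreement with the limit statement of the text.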
The vector y of the algebraic equation should represent y(x) at the selected
points x = x_i, but in actual fact y cannot be more than an approximation
of y(x_i). Hence we will replace y by the approximation ȳ and write the
algebraic system (3) in the form
The error vector δ can be estimated on the basis of the given differential
operator D and the right side b(x). Assuming continuity of b(x) and
excluding any infinities in the coefficients of the operator Dy(x), we can
establish an error bound for the value of the component δ_i at the point
x = x_i.
lengths of the vectors would grow to infinity and we should not be able to
obtain any finite limits as N grows larger and larger. We are in fact
interested in a definite point x = (x_1, x_2, …, x_s) of the continuum, although
its algebraic labelling x_i changes all the time, in view of the constantly
increasing number of points at which the equation is applied.
We avoid the difficulty by a somewhat more flexible formulation of the
bilinear identity that we have discussed earlier in Section 3.19. In that
development the "scalar product" of two vectors x and y was defined on
the basis of
(omitting the asterisk since we want to stay in the real domain). However,
the bilinear identity remains unaffected if we agree that this definition shall
be generalised as follows :
where the weight factors p_1, p_2, …, p_N are freely at our disposal, although
we will restrict them in advance by the condition that we will admit only
positive numbers as weights.
Now in the earlier treatment we made use of the bilinear identity for the
purpose of solving the algebraic system (3). We constructed a solution of
the equation
But now, if we agree that the scalar product shall be defined on the basis
of (11), we will once more obtain the solution with the help of (14), but the
definition of g_i has to occur on the basis of
Now the discrete point x_i was connected with a definite cell of our
continuum of s dimensions. That cell had the volume
while the total volume τ of the continuum is the sum total of all the
elementary cells:
With this definition the lengths of our vectors do not increase to infinity
any more, in spite of the infinity of N. For example the square of the
error vector δ on the right side of (9) now becomes
if we define h² by
This h is a quantity which cannot become larger than the largest of all h_i;
hence it goes to zero with ever increasing N.
Let us now solve the system (9) for the i-th component, on the basis of (14):
As far as the second sum goes, we have just obtained the bound (20). In
order to bound the first sum we follow the procedure of Section 3.20, with
the only modification that now the ratio (3.20.1) should be defined in
harmony with our extended definition of the scalar product:
However, the problem of minimising this ratio yields once more exactly
the same solution as before, namely the eigenvalue problem (3.20.4), to be
solved for the smallest eigenvalue λ_m. And thus we obtain, as before in
(3.20.6):
We see that the difference between the algebraic and the continuous
solution converges to zero as N increases to infinity, provided that λ_m remains
bounded away from zero with increasing N. If it so happens that λ_m converges to zero as N
increases to infinity, the convergence of the algebraic solution to the correct
solution can no longer be established. The behaviour of the smallest
eigenvalue of the matrix A with increasing order N of the matrix is thus of
vital importance.*
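This behaviour can be watched directly in a simple case. The sketch below (our own illustration, not a computation carried out in the text) takes the second-difference transcription of −y″ on (0, 1) with prescribed end values and prints its smallest eigenvalue for growing N; it settles near π² ≈ 9.87 instead of collapsing to zero, which is the favourable situation described above:

```python
import numpy as np

def smallest_eigenvalue(n):
    """Smallest eigenvalue of -(second difference)/eps**2 with n
    interior points on (0, 1), grid spacing eps = 1/(n + 1)."""
    eps = 1.0 / (n + 1)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / eps**2
    return float(np.linalg.eigvalsh(A)[0])   # eigvalsh sorts ascending

for n in (10, 40, 160):
    print(n, smallest_eigenvalue(n))   # tends to pi**2 = 9.8696...
```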
4.10. The adjoint operator
Throughout our treatment of linear systems in Chapter 3 we have pointed
out the fundamental importance of the transposed matrix Ã and the
associated transposed equation Ãu = 0. Now we deal with the matrix
aspects of linear differential operators and the question of the significance
of Ã has to be raised. Since the differential operator itself played the role
of the matrix A, we have to expect that the transposed matrix Ã has to
be interpreted as another linear differential operator which is somehow
uniquely associated with it.
In order to find this operator, we could proceed in the following fashion.
By the method of atomisation we transcribe the given differential equation
into a finite system of algebraic equations. Now we abstract from these
equations the matrix A itself. We transpose A by exchanging rows and
columns. This gives rise to a new linear system and now we watch what
happens as e goes to zero. In the limit we obtain a new differential
equation and the operator of this equation will give the adjoint differential
operator (the word "adjoint" taking the place of "transposed").
While this process is rather cumbersome, it actually works and in
principle we could obtain in this fashion the adjoint of any given linear
differential operator. In practice we can achieve our aim much more
simply by a method which we will discuss in the following section. It is,
however, of interest to construct the associated operator by actual matrix
transposition.
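As a small illustration of this matrix-transposition route (a toy example of our own, using the first-order operator D = d/dx rather than one of the book's worked problems): transcribe D by forward differences, transpose the matrix, and read off the interior rows of the transpose. They apply backward differences with reversed sign, i.e. the adjoint of d/dx is −d/dx:

```python
import numpy as np

n, eps = 8, 0.1

# Forward-difference transcription of Dv = dv/dx:
# row i carries (v_{i+1} - v_i)/eps
A = (np.eye(n, k=1) - np.eye(n)) / eps
A = A[:-1, :]            # (n-1) x n: one equation short, as in the text

At = A.T                 # exchange rows and columns
# An interior row j of At applies (u_{j-1} - u_j)/eps to u,
# i.e. the transcription of -du/dx.
j = 3
row = At[j, :]
print(row[j - 1], row[j])   # 1/eps and -1/eps
```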
The following general observations should be added. The "transposed"
operator is called "adjoint" (instead of transposed). Moreover, it is
important to observe that the matrix of the transcribed algebraic system is
decisively influenced by the boundary conditions of the problem. The same
differential operator with different boundary conditions yields a different
matrix. For example the problem (6.1) with the boundary conditions (6.10)
yields the matrix (6.11) which is symmetric. Hence the transposed matrix
coincides with the original one and our problem is "self-adjoint". But the
* It should be pointed out that our conclusion does not prove the existence of the
solution of the original differential equation (2). What we have proved is only that
the algebraic solution converges to the desired solution, provided that that solution exists.
That the original boundary conditions (6.10) had the non-zero values β_1, β_n
on the right side is operationally immaterial, since given numerical values
can be no parts of an operator. What is prescribed is the decisive question;
the accidental numerical values on the right side are immaterial.
Problem 154. Denoting the adjoint operator by D̃u(x), find the adjoints of the
following problems:
1.
[Answer:
2.
[Answer:
3.
[Answer:
4. no boundary conditions
[Answer:
which means
u(x), v(x) which satisfy the demanded differentiability conditions but are
not subjected to any specific boundary conditions:
The result of the integration is no longer zero but something that depends
solely on the values of u(x), v(x)—and some of their derivatives—taken on
the boundary of the region. This is the meaning of the expression " boundary
term" on the right side of (4). The fundamental identity (4) is called the
"extended Green's identity".
In order to see the significance of this fundamental theorem let us first
restrict ourselves to the case of a single independent variable x. The given
operator Dv(x) is now an ordinary differential operator, involving the
derivatives of v(x) with respect to x. Let us assume that we succeed in
showing the validity of the following bilinear relation:
where on the right side F(u, v) is an abbreviation for some bilinear function
of u(x) and v(x), and their derivatives. If we are able to prove (5), we shall
at once have (4) because, integrating with respect to x between the limits a
and b, we obtain
and this equation is exactly of the form (4). Let us then concentrate on the
proof of (5).
The operator Dv(x) is generally of the form
and we see that we have obtained the adjoint operator associated with the
term p_k(x)v^(k)(x):
If we repeat the same procedure with every term, the entire operator D̃u(x)
will be constructed.
We have thus obtained a simple and powerful mechanism by which to
any given Dv(x) the corresponding D̃u(x) can be obtained. The process
requires no integrations but only differentiations and combinations of terms.
We can even write down explicitly the adjoint of the operator (7):
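The rule just described—the adjoint of the term p_k(x)v^(k)(x) is (−1)^k (p_k(x)u(x))^(k)—can be verified symbolically. The sketch below (using SymPy; the boundary function F is written out by hand for the second-order case, as an assumption to be checked) confirms that u·Dv − v·D̃u is a perfect derivative, which is exactly the bilinear relation (5):

```python
import sympy as sp

x = sp.symbols('x')
u, v, p0, p1, p2 = (sp.Function(s)(x) for s in 'u v p0 p1 p2'.split())

# Second-order operator Dv = p2 v'' + p1 v' + p0 v
Dv = p2 * v.diff(x, 2) + p1 * v.diff(x) + p0 * v
# Adjoint built term by term from (-1)^k (p_k u)^(k):
Du = (p2 * u).diff(x, 2) - (p1 * u).diff(x) + p0 * u

# Candidate bilinear boundary function F(u, v):
F = p2 * u * v.diff(x) - v * (p2 * u).diff(x) + p1 * u * v

# u*Dv - v*Du = dF/dx identically, i.e. relation (5):
assert sp.simplify(u * Dv - v * Du - F.diff(x)) == 0
```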
The adjoint boundary conditions, however, have not yet been obtained.
Problem 155. Consider the most general linear differential operator of second
order:
Problem 156. Find the most general linear differential operator of the second
order which is self-adjoint.
[Answer:
[Answer:
(x has the significance of the time t). The boundary conditions are that
at x = 0 the displacement v(x) and the velocity v'(x) are zero:
We investigate the boundary term of the right side. The given boundary
conditions are such that the contribution at the lower limit x = 0 becomes
zero while at the upper limit x = l we have
Nothing is said about v(l) and v'(l). Hence the vanishing of the boundary
term on the right side of (5) demands
Now the boundary term of (5) vanishes automatically and we do not get any
boundary conditions for u(x). The adjoint problem D̃u is now characterised
by the operator u"(x) alone, without any boundary conditions.
Problem 158. The results of Problem 154 were obtained by direct matrix
transposition. Obtain the same results now on the basis of Green's identity.
Problem 159. Consider the following differential operator:
Obtain the adjoint operator D̃u(x) and the adjoint boundary conditions under
the following circumstances:
[Answer:
Find the adjoint boundary conditions. When will the system become
self-adjoint?
[Answer:
Condition of self-adjointness:
give us a good substitute for the direct definition of n, m, and p. The
number of independent solutions of the system (1) is always m − p, that of
the system (2) n — p. These solutions tell us some fundamental facts about
the given system, even before we proceed to the task of actually finding the
solution. The system (1) decides the unique or not unique character of
the solution while the system (2) yields the compatibility conditions of the
system.
The role of the system (1) was: "Add to a particular solution an arbitrary
solution of the system (1), in order to obtain the general solution of the given
system."
The role of the system (2) was: "The given system is solvable if and only
if the right side is orthogonal to every independent solution of (2)."
These results are immediately applicable to the problem of solving linear
differential equations or systems of such equations. Before we attempt a
solution, there are two questions which we will want to decide in advance:
1. Will the solution be unique? 2. Are the given data such that a
solution is possible?
Let us first discuss the question of the uniqueness of the solution. We
have seen that an rth order differential equation alone, without additional
boundary conditions, represents an (m − r) × m system and is thus r
equations short of a square matrix. But even the addition of r boundary
conditions need not necessarily guarantee that the solution will be unique.
For example the system
represents a second order system with two boundary conditions and thus
we would assume that the problem is well-determined. And yet this is not
so because the homogeneous problem
Hence we could have added one more condition to the system, in order to
make it uniquely determined, e.g.
(v_1(x) is the deflection of the bar, I(x) is the inertial moment of the generally
variable cross section, β(x) the load density.) We assume that the bar
extends from x = 0 to x = l. We will also assume that the bar is not
supported at the two endpoints but at points between, and we include the
forces of support as part of the load distribution, considering them as
negative loads.
Now the boundary conditions of a bar which is free at the two endpoints
are
under the boundary conditions (10) demands only the vanishing of v_2(x), while
for v_1(x) we obtain two independent solutions
The physical significance of these two solutions is that the bar may be
translated as a whole and also rotated rigidly as a whole. These two degrees
of freedom are in the nature of the problem and not artificially imposed
from outside. We can eliminate this uncertainty by adding the two
orthogonality conditions
Problem 161. Show that the added conditions a) or b) are permissible, the
conditions c) not permissible for the elimination of the deficiency of the system
(9), (10):
Problem 162. Find the adjoint system of the problem (9), (10).
[Answer:
Boundary conditions:
The orthogonality of the right side of (14.9) to these solutions demands the
following two compatibility conditions:
with a mechanism which brought the mass back to the origin with zero
velocity at the time moment x = 1. Hence we could add the two surplus
conditions (13.10) with the result that the adjoint homogeneous system
became
They have the following significance. The forces employed by our
mechanism satisfy of necessity the two conditions that the time integral of the
force and the first moment of this integral vanish.
Problem 163. Find the compatibility conditions of the following system
[Answer:
Problem 164. In Problem 161 the addition of the conditions (14.17) was
considered not permissible for the removal of the deficiency of the system.
What new compatibility condition is generated by the addition of these boundary
conditions? (Assume I(x) = const.)
[Answer:
if
But let us now assume that the prescribed boundary conditions for v(x) are
of the inhomogeneous type. In that case we have to change over to the
extended Green's identity (12.6). The boundary term on the right side will
now be different from zero but it will be expressible in terms of the given
boundary values and the boundary values of the auxiliary function u(x)
which we found by solving the adjoint homogeneous equation:
As an example let us consider once more the problem of the free elastic
bar, introduced before in Section 14 (cf. 14.9-10). Let us change the
boundary conditions (14.10) as follows:
and let us see what change will occur in the two compatibility conditions
(16.3-4). For this purpose we have to investigate the boundary term of the
extended Green's identity which in our case becomes
The first two terms drop out on account of the adjoint boundary conditions.
The last two terms dropped out earlier on account of the given homogeneous
boundary conditions (14.10) while at present we get
and thus the compatibility conditions (15.3) and (15.4) must now be extended
as follows:
which express the mechanical principles that the equilibrium of the bar
demands that the sum of all loads and the sum of the moments of all loads
must be zero. The loads which we have added to the previous loads are
(9) of all the forces, extending the integration from 0 to l + ε, and defining
the added load distribution by the following two conditions:
What we have here is a single force at the end of the bar and a force couple
acting at the end of the bar, to balance out the sum of the forces and the
moments of the forces distributed along the bar. This means in physical
interpretation that the bar is free at the left end x = 0 but clamped at the
right end x = l, since the clamping can provide that single force and force
couple which is needed to establish equilibrium. Support without clamping
can provide only a point force which is not enough for equilibrium except
if a second supporting force is applied at some other point, for instance at
x = 0, and that was the case in our previous example.
What we have seen here is quite typical of the behaviour of inhomogeneous
boundary conditions. Such conditions can always be interpreted
as extreme distributions of the right side of the differential equation, taking
recourse to point loads and possibly force couples of first and higher order
which have the same physical effect as the given inhomogeneous boundary
conditions.
Problem 166. Extend the compatibility conditions (15.6) to the case that the
boundary conditions (13.3) and (13.10) of the problem (13.2) are changed as
follows:
[Answer:
Obtain the compatibility condition of this system and show that it is identical
with the Taylor series with the remainder term.
Problem 168. Consider once more the system of the previous problem, but
changing the boundary condition at b to
Let us assume that we have succeeded with the task of constructing the
adjoint operator D̃u(x) on this basis. Then we can immediately multiply
by the volume-element dx on both sides and integrate over the given domain.
On the right side we apply the Gaussian integral transformation:
where ν_α is the outside normal (of the length 1) of the boundary surface S.
We thus obtain once more the extended Green's theorem in the form
The " boundary term " appears now as an integral extended over the boundary
surface. From here we continue exactly as we have done before: we impose
the minimum number of conditions on u(x) which are necessary and sufficient
to make the boundary integral on the right side of (4) vanish. This provides
us with the adjoint boundary conditions. We have thus obtained the
differential operator D̃u(x) and the proper boundary conditions which
together form the adjoint operator.
Let us then examine the equation (2). An arbitrary linear differential
operator Dv(x) is composed of terms which contain v(x) and its derivatives
linearly. Let us pick out a typical term which we may write in the form
where A(x) is a given function of the x_j, while w stands for some partial
derivative of v with respect to any number of the x_j. We now multiply by
u(x) and use the method of "integrating by parts":
The result of the process is that v is finally liberated and the roles of u and
v are exchanged: originally u was multiplied by a derivative of v, now v is
multiplied by a derivative of u. Hence the adjoint operator has been
obtained as follows:
while the vector F_α(u, v) of the general relation (2) has in our case the
following significance:
once in the sequence x, y and once in the sequence y, x. Show that the equality
of the resulting boundary integral can be established as a consequence of the
following identity:
Prove this identity by changing it to a volume integral with the help of Gauss's
theorem (3).
2. The scalar
3. The vector
Boundary term:
Boundary term:
Boundary term:
Consider for example the problem of obtaining the scalar Φ from the
vector field
What can we say concerning the deficiency of this equation? The only
solution of the homogeneous equation
Hence the function Φ will be obtainable from (16), except for an additive
constant.
What can we say concerning the compatibility of the equation (16)? For
this purpose we have to solve the adjoint homogeneous problem. The
boundary term (9) yields the boundary condition
with the added condition (19). Now the equation (20) is solvable by putting
where B is a freely choosable vector field, except for the boundary condition
(19) which demands that the normal component of curl B vanishes at all
points of the boundary surface S:
The compatibility of our problem demands that the right side of (16) is
orthogonal to every solution of the adjoint homogeneous system. This means
and, since the vector B is freely choosable inside the domain T, we obtain
the condition
Show that in all these cases the problem is self-adjoint. Historically this
problem is particularly interesting since "Green's identity" was in fact estab-
lished for this particular problem (George Green, 1793-1841).
Problem 171. Investigate the deficiency and compatibility of the following
problem:
Boundary condition: V = V_0 on S.
[Answer:
V uniquely determined. Compatibility conditions:
What boundary conditions are demanded in order that V shall become the
gradient of a scalar?
[Answer: Only V_ν can be prescribed on S, with the condition
We will consider a few characteristic examples for this procedure from the
realm of partial differential operators.
Let us consider the equation
If we put
with the added boundary condition (3) and this equation has indeed a
unique solution.
As a second example let us consider the system
with the boundary condition (9) and this problem has a unique solution.
Let us furthermore put
Hence we obtain the adjoint operator D̃u of our system in the following form:
The proper field quantities of the electromagnetic field are not E and H
but iE and H and the proper way of writing the Maxwellian equations is
as follows:
Problem 173. Making use of the Hermitian definition of the adjoint operator
according to (12.2) show that the following operator in the four variables x, y, z,
t is self-adjoint:
[Answer:
BIBLIOGRAPHY
[1] Halmos, P. R., Finite Dimensional Vector Spaces (Princeton University
Press, 1942)
[2] Synge, J. L., The Hypercircle in Mathematical Physics (Cambridge University
Press, 1957)
CHAPTER 5
THE GREEN'S FUNCTION
5.1. Introduction
In the domain of simultaneous linear algebraic equations we possess
methods by which in a finite number of steps an explicit solution of the
given system can be obtained in all cases when such a solution exists. In
the domain of linear differential operators we are not in a similarly fortunate
position. We have various methods by which we can approximate the
solution of a given problem in the realm of ordinary or partial differential
equations. But the actual numerical solution of such a specific problem—
although perhaps of great importance for the solution of a certain problem
of physics or industry—may tell us very little about the interesting analytical
properties of the given problem. From the analytical standpoint we are
not interested in the numerical answer of an accidentally encountered
problem but in the general properties of the solution. The analytical tools
by which a solution is obtained may be of little practical significance but of
great importance if our aim is to arrive at a deeper understanding of the
nature of linear differential operators and the theoretical conclusions we can
(where we assume that the first equation includes the given boundary
conditions) is a perfectly legitimate linear system whose compatibility we
can investigate. For this purpose we have to form the adjoint homogeneous
equation
and express the compatibility of the system (2) by making the right side
orthogonal to the solution u(x) of the system (3) (we know in advance that
the adjoint system (3) must have a solution since otherwise the right side
of (2) would be freely choosable and that would mean that p is not
determined). But then the orthogonality of the right side of (2) to the solution
u(x) will give a relation of the following form:
∫ β(x)u(x) dx + p·u_0 = 0        (5.2.4)
and this means that we can obtain p in terms of the given β(x) by the relation
p = −(1/u_0) ∫ β(x)u(x) dx        (5.2.5)
This is the basic idea which leads to the construction of the Green's
function and we see the close relation of the construction to the solution of
the adjoint homogeneous equation.
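This basic idea can be seen at work numerically. In the sketch below (our own illustration, for the problem −v″ = β with v(0) = v(1) = 0, which is not an example worked in the text) the Green's function appears, up to the cell size ε, as a column of the inverse of the atomised operator's matrix; for this particular operator the scaled inverse even reproduces the classical G(x, ξ) = x(1 − ξ) for x ≤ ξ, ξ(1 − x) for x ≥ ξ:

```python
import numpy as np

n = 99
eps = 1.0 / (n + 1)
x = np.arange(1, n + 1) * eps            # interior grid points

# Atomised operator A v = -v'' with v(0) = v(1) = 0
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / eps**2

G_num = np.linalg.inv(A) / eps           # column j plays G(x, x_j)
j = 49                                   # source point xi = x_j = 0.5
xi = x[j]
G_exact = np.where(x <= xi, x * (1.0 - xi), xi * (1.0 - x))
print(np.max(np.abs(G_num[:, j] - G_exact)))
```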
x = x_1, then our assumption that v(x) assumes the value p at the point
x = x_1 can be extended to the immediate neighbourhood of x = x_1, with
an error which can be made as small as we wish. We will thus extend
our system (2) in the following sense. The equation
in the sense that now x_1 is not a definite point but a point with its
neighbourhood, which extends over an arbitrarily small ν-dimensional domain
of the small but finite volume ε that surrounds the point x = x_1. Accordingly
in Green's identity (3.1) we can exempt D̃u(x) from being zero not
only at the point x = x_1 but also in its immediate neighbourhood. This
statement we will write down in the following form
The function δ_ε(x_1, x) has the property that it vanishes everywhere outside
of the small neighbourhood ε of the point x = x_1. Inside of the small
neighbourhood ε we will assume that δ_ε(x_1, x) does not change its sign: it is
either positive or zero.
Then Green's identity yields the compatibility condition of our system
(1), (2) in the following form:
The integration extends only over the neighbourhood ε of the point x = x_1,
since outside of this neighbourhood the integrand vanishes.
Now we will proceed as follows. In (4) we can replace p(x) by v(x), since
the significance of p(x) is in fact v(x). Then we can make use of the mean
value theorem of integral calculus, namely that in the second integral we
can take out v(x) in front of the integral sign, replacing x by a certain x̄,
where x̄ is some unknown point inside of the domain ε. Then we obtain
The uncertainty caused by the fact that x̄ does not coincide with x can
now be eliminated by studying the limit approached as ε goes to zero.
This limit may exist even if G_ε(x, ξ) does not approach any definite limit,
since the integration has a smoothing effect; it is conceivable that
G_ε(x, ξ), considered as a function of ξ, would not approach a definite limit
at any point of the domain of integration and yet the integral (8) may
approach a definite limit at every point of the domain. In actual fact,
however, in the majority of the boundary value problems encountered in
mathematical physics the function G_ε(x, ξ) itself approaches a definite limit
(possibly with the exception of certain singular points where G_ε(x, ξ) may go
to infinity). This means that in the limit process
the function G_ε(x, ξ) with decreasing ε approaches a definite limit, called the
"Green's function":
but what we actually mean is a limit process, in which the delta function
never appears as an actual entity but only in the legitimate form δ_ε(x, ξ),
ε converging to zero. The truth of the statement (14) does not demand
that the limit of δ_ε(x, ξ) shall exist. The same is true of all equations in
which the symbol δ(x, ξ) appears.
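The limit process can be imitated numerically. In this sketch (our own illustration; the particular spike shape and test function are assumptions) δ_ε(x_1, x) is a non-negative spike of width ε and unit integral, and the integral of δ_ε(x_1, x)f(x) approaches f(x_1) as ε shrinks, even though δ_ε itself approaches no limit:

```python
import numpy as np

def delta_eps(x, x1, eps):
    """Vanishes outside the eps-neighbourhood of x1; constant 1/eps
    inside, so that its integral is 1 (one admissible choice)."""
    return np.where(np.abs(x - x1) < eps / 2.0, 1.0 / eps, 0.0)

f = lambda t: np.cos(t)
x1 = 0.7
x, dx = np.linspace(0.0, 1.0, 2_000_001, retstep=True)
for eps in (0.1, 0.01, 0.001):
    val = np.sum(delta_eps(x, x1, eps) * f(x)) * dx
    print(eps, val)        # tends to f(0.7) = cos(0.7)
```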
shall have no non-zero solution. And that again means that the problem
under the given boundary conditions, should not have more than one solution.
Hence the given linear system must belong to the "complete" category,
that is, the V-space must be completely filled out by the given operator.
There is no condition involved concerning the U-space. Our problem can
be arbitrarily over-determined. Only under-determination has to be avoided.
The reason for this condition follows from the general idea of a Green's
function. To find the solution of a given differential equation (with the
proper boundary conditions) means that we establish a linear relation
between v(x) and the given data. But if our system is incomplete, then
such a relation does not exist because the value of v(x) (at the special point x)
may be added freely to the given data. Under such circumstances the
existence of the Green's function cannot be expected.
A good example is provided by the strongly over-determined system
Let us now formulate the problem of the Green's function. The adjoint
operator now is
(ν being the normal at the boundary point considered). Now the differential
equation
certainly demands very little of the vector field U since only a scalar
condition is prescribed at every point of the field and we have a vector at
our disposal to satisfy that condition. And yet the equation (8) has no
while the definition of the delta function demands that the same integral
shall have the value 1.
Let us, however, complete the differential equation (3) by the added
condition
In this case the adjoint operator (6) ceases to be valid at the point ξ = 0
and we have to modify the defining equation of the Green's function as
follows
The condition that the integral over the right side must be zero determines
α as −1 and we obtain finally the determining equation for the Green's
function of our problem in the following form:
(on the right side the product F·G refers to the scalar product of the two
vector fields).
One particularly simple solution of the differential equation (12) can be
obtained as follows: We choose a narrow tube of constant cross-section q
214 THE GREEN'S FUNCTION CHAP. 5
which shall connect the points ξ = 0 and ξ = x. We define the vector
field G(x, ξ) as zero everywhere outside of this tube while inside of the tube
we assume that G(ξ) is everywhere perpendicular to the cross-section of the
tube and of constant length. We have no difficulty in showing that the
divergence of the vector field thus constructed vanishes everywhere, except
in the neighbourhood of the points ξ = 0 and ξ = x. But these are exactly
the neighbourhoods where we do not want div G(ξ) to vanish since the right
side of (12) is not zero in the small neighbourhood ε of these points. The
infinitesimal volume ε in which the delta function δε(x, ξ) is not zero is
given by the product qh. In this volume we can assign to δε(x, ξ) the
constant value
The vector field G(ξ) starts at the lower end of the tube with zero, grows
linearly with h and attains the value 1/q at the end of the shaded volume.
Then it maintains this length throughout the tube T, arrives with this
constant value at the upper shaded volume, diminishes again linearly with
h and becomes zero at the upper end of the tube.
Let us now form the integral (13). This integral is reduced to the very
narrow tube in which G(ξ) is different from zero. If we introduce the
line-element ds of the central line of the tube and consider it as an
infinitesimal vector ds whose length is ds while its direction is tangential to
the line, then the product G(ξ)dξ is replaceable by:
and this is the well-known elementary solution of our problem, but here
obtained quite systematically on the basis of the general theory of the
Green's function.
Our problem is interesting from more than one aspect. It demonstrates
the correctness of the Green's function method in a case where the Green's
function itself evaporates into nothingness. The function Gε(x, ξ) is a
perfectly legitimate function of ξ as long as ε is finite but it does not
approach any limit as ε approaches zero. It assumes in itself the nature of
a delta function. But this is in fact immaterial. The decisive question is
not whether the limit of Gε(x, ξ) exists as ε goes to zero but whether the
integral (13) approaches a definite limit as ε goes to zero and in our example
this is the case, although the limit of Gε(x, ξ) itself does not exist.
and now our problem is to obtain v(x, t) at the previous time moments
between 0 and T.
In harmony with the general procedure we are going to construct the
adjoint problem with its boundary conditions, replacing any given in-
homogeneous boundary conditions by the corresponding homogeneous
reasonable and which in the case of properly given data has a solution and
in fact a unique solution. That solution, however, cannot be given in
terms of an auxiliary function which satisfies a given non-homogeneous
differential equation. The concept of the Green's function has to be
extended to a more general operator in order to include the solution of such
problems, as we shall see later in Chapter 8.
Problem 176. Remove the over-determination of the problem (3), (10), by the
least square method (application of D̃ on both sides of the equation) and
characterise the Green's function of the new problem.
[Answer:
Problem 177. Define the Green's function for the Laplacian operator Δv, if the
given boundary conditions are
[Answer:
no boundary conditions.]
Now the extended Green's identity—which does not demand any boundary
conditions of u and v—becomes in our case:
The integration (5) on the left side yields minus v(x), because Δv = 0, and
Δu is reduced to the delta function which exists only in the ε-neighbourhood
of the point ξ = x, putting in the limit the spotlight on v(x). On the right
side the first term drops out due to the boundary condition (7) and the
final result becomes
with the inhomogeneous boundary condition (2), then we obtain the solution
SEC. 5.6 INHOMOGENEOUS BOUNDARY CONDITIONS 219
in the form of a sum of two integrals, the one extended over the given volume
T, the other over the boundary surface S:
This form of the solution shows that the inhomogeneous boundary values
v(S) can be interpreted as equivalent to a double layer of surface charges
placed on the boundary surface S.
Problem 178. Obtain the solution of the boundary value problem (5.28), with
[Answer:
Problem 179. Obtain the solution of the heat conduction problem (5.19),
(5.20), but replacing the end condition (5.21) by the initial condition
[Answer:
where the Green's function G(x, t; ξ, τ) satisfies the differential equation (5.25)
with the boundary conditions (5.23) and the end-condition
Problem 180. Consider the problem of the elastic bar (4.14.9), with the load
distribution β(x) = 0 and the inhomogeneous boundary conditions
Problem 181. Solve the same problem with the boundary conditions
[Answer:
This system of four equations for the four unknowns (v1, v2, v3, v4) is
equivalent to the equation (2), as we can see if we substitute for v2, v3, v4
their values into the last equation. But we can equally consider the given
system as a simultaneous system of four equations in four unknowns. In
the latter case the focus of interest is no longer on v1 alone. We may
equally consider v2, v3, v4 as unknowns. And thus we are confronted with
a new situation to which our previous discussions have to be extended.
How are we going to construct in this case the "Green's function" which will
serve as the auxiliary function for the generation of the solution?
We return once more to the fundamental idea which led to the concept
of the "function space" (see Section 4.5). The continuous variables
x1, x2, . . . , xs—briefly denoted by the symbol x—were replaced by a set of
discrete values in which the function v(x, y, z) was tabulated. Each one of
these tabulated values opened up a new dimension in that abstract "function
space" in which the function v(x) became represented by a single vector.
Now, if we have not merely one such function but a set of functions
we can absorb all these functions in our function space, without giving up
the idea of a single vector as the representation of a function. We could
think, of course, of the μ functions v1, v2, . . . , vμ as μ different vectors of
SEC. 5.7 THE GREEN'S VECTOR 221
the function space. But we can also do something else. We can add to
our variable x a new variable j which can only assume the values 1, 2, . . . , μ,
and is thus automatically a discrete variable. Here then it is unnecessary
to add a limit process by constantly increasing the density of points in
which the function is tabulated. The variable j automatically conforms to
the demands of matrix algebra since it is from the very beginning an algebraic
quantity. If we replace vj by the symbol v(x, j) we have only introduced a
new notation but this notation suggests a new interpretation. We now
extend the previous function space by added dimensions along which we
plot the values of v1, v2, . . . , vμ at all tabulated points and the entire set
of vectors is once more represented by one single vector. Whether we write
the given operator in the form of Dvj(x), or Dv(x, j), we know at once what
we mean: we are going to operate with the dimensions (1, 2, . . . , μ) as if they
belonged to a surplus variable j which can only assume the μ discrete values
1, 2, . . . , μ.
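In present-day array terms the surplus variable j is simply an additional array index; a minimal sketch (the functions and numbers are illustrative choices, not taken from the text):

```python
import numpy as np

npoints, mu = 50, 3              # tabulation points and number of functions
x = np.linspace(0.0, 1.0, npoints)

# three functions v_1, v_2, v_3 tabulated at the same points x
v = np.stack([np.sin(x), np.cos(x), x**2], axis=1)   # shape (npoints, mu)

# v(x, j): the discrete variable j = 0..mu-1 opens extra dimensions, and the
# whole set of functions is once more represented by one single vector
single_vector = v.reshape(-1)    # npoints * mu components

# component (i, j) of the set is component i*mu + j of the single vector
assert single_vector[7 * mu + 1] == v[7, 1]
```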
Let us see what consequences this viewpoint has for the construction of
the Green's function. We have first of all to transcribe to our extended
problem the solution (4.11) of a differential equation, in terms of the Green's
function. For this purpose we shall write down the system (3) in more
adequate form. On the right side of this system we have (0, 0, 0, β). We
can conceive this set of values as accidental and will replace them by the
more general set (β1, β2, β3, β4). This means that we will consider a general
system of differential equations (ordinary or partial) in the symbolic form
Yet this is not enough. We do not want to exclude from our considerations
over-determined systems* of the type
where the unknown is a scalar function Φ while the right side is a vector of
s components. In order to cover the general case we have to introduce
two discontinuous variables k and j, k for the function v(x) and j for the
right side β(x):
This adjoint operator is obtainable with the help of the Green's identity
which we can now write down in more definite terms:
On the right side all reference to the subscript j disappears. The delta function
of the right side represents a pure right-vector, at both points x and ξ. The
subscripts k and κ run through the same set of values 1, 2, . . . , μ. The
definition of the delta-function on the right side of (16) is given as follows:
with the following interpretation of the right side. We denote with δk(x, ξ)
a right side which is composed of zeros, except one single equation, namely
the kth equation which has the delta function δ(x, ξ) as its right side. In
order to construct the entire vector Gk(x, ξ)j, k = 1, 2, . . . , μ, we have to
solve a system of μ simultaneous differential equations for ν functions
(μ < ν), not once but μ times. We let the δ(x, ξ) function on the right
side glide down gradually from the first to the μth equation and thus obtain
in succession the components G1(x, ξ)j, G2(x, ξ)j, . . . , Gμ(x, ξ)j of the
complete Green's vector Gk(x, ξ)j (which is in fact a "vector" in a double
sense: μ-dimensional in x and ν-dimensional in ξ).
In frequent applications a system of differential equations originates
from one single differential equation of higher order which is transformed
into a system of first order equations by the method of surplus variables.
In such cases the "given right side" βj(ξ) of the system consists of one
single function β(ξ) in the jth equation while the right sides of the remaining
equations are all zero. In this case the sum on the right side of (7.15) is
reduced to one single term and we obtain
Problem 182. Discuss the solution of the problem (7.6)—with the added
condition Φ(0) = 0—from the standpoint of the "Green's vector" and compare
it with the solution obtained in section 5.
Problem 183. Write the problem of the elastic bar (4.14.9) in the form of a
first order system:
and solve this system for v4(x) under the boundary conditions
where α(x) and β(x) are determined by the boundary conditions prescribed for
G(x, ξ) (active variable ξ):
Problem 184. Solve the same problem for the function v1(x) (the elastic
deflection).
[Answer:
The boundary conditions (7.21) (the bar clamped on both ends) demand for
the adjoint system
and we observe that the new system is self-adjoint, inasmuch as the adjoint
operator and the adjoint boundary conditions coincide with the original
operator and its boundary conditions.
The self-adjointness of a certain problem in linear differential equations
is a very valuable property which corresponds to the symmetry Ã = A of
the associated matrix problem. Such a symmetry, however, can be destroyed
if the equations of the system Ay = b are not written down in the proper
order, or even if they are multiplied by wrong factors. Hence it is under-
standable that the system (7.20) was not self-adjoint, although in proper
formulation the problem is in fact self-adjoint. We have to find a method
by which we can guarantee in advance that the self-adjoint character of a
system will not be destroyed by a false ordering of the equations.
The majority of the differential equations encountered in mathematical
physics belong to the self-adjoint variety. The reason is that all the
equations of mathematical physics which do not involve any energy losses
are deducible from a "principle of least action", that is the principle of
making a certain scalar quantity a minimum or maximum. All the linear
differential equations which are deducible from minimising or maximising
a certain quantity, are automatically self-adjoint and vice versa: all
differential equations which are self-adjoint, are deducible from a minimum-
maximum principle.
In order to study these problems, we will first investigate their algebraic
counterpart. Let A be a (real) symmetric matrix
The second term can be transformed in view of the bilinear identity (3.3.6)
which in the real case reads
we obtain
point in which the condition (13) is satisfied. Such a summit in the local
sense is called a "stationary value", in order to distinguish it from a true
maximum or minimum. The technique of finding such a stationary value of
the scalar quantity s is that we put the infinitesimal variation of s caused by
a free infinitesimal variation of y, equal to zero.
It is of interest to see what happens to the scalar s in the case of a general
(non-symmetric) matrix A. In that case
But the second term vanishes identically, due to the bilinear identity
This explains why the variational method automatically ignores the anti-
symmetric part of the matrix A. Exactly the same results remain valid in
the complex case, if we replace "symmetric" by "Hermitian" and "anti-
symmetric" by "anti-Hermitian". We have to remember, of course, that
in the complex case the scalar
(which is real for the Hermitian case Ã* = A) comes about by changing in the
first factor every i to −i since this change is included in the operation ȳ.
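Both facts—that the quadratic form sees only the symmetric part of A, and that the vanishing first variation of s = ½ y·Ay − b·y yields the equation with the symmetrised matrix—can be checked by direct computation; a small numerical sketch (the random matrix and vectors are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))          # a general (non-symmetric) matrix
A_sym  = 0.5 * (A + A.T)                 # symmetric part
A_anti = 0.5 * (A - A.T)                 # anti-symmetric part
y = rng.standard_normal(n)
b = rng.standard_normal(n)

# the quadratic form ignores the anti-symmetric part: y . A_anti . y = 0
assert abs(y @ A_anti @ y) < 1e-12

# the first variation of s(y) = 1/2 y.A.y - b.y is A_sym y - b,
# computed here as a numerical (central-difference) gradient
def s(y):
    return 0.5 * y @ A @ y - b @ y

eps = 1e-6
grad = np.array([(s(y + eps * e) - s(y - eps * e)) / (2 * eps)
                 for e in np.eye(n)])
assert np.allclose(grad, A_sym @ y - b, atol=1e-6)
```

The stationary condition thus reads A_sym y = b, never involving A_anti, which is exactly why the variational method ignores the anti-symmetric part.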
These relations can be re-interpreted for the case of differential operators.
If Dv is a self-adjoint operator, the differential equation
But now in the first term an integration by parts is possible by which the
order of differentiation can be reduced. The first term is replaceable
(apart from a boundary term which is variationally irrelevant) by
The original fourth order operator which appeared in (24), could be replaced
by a second order operator. The quadratic dependence on y(x) has not
changed. Generally an operator of the order 2n can be gradually reduced
to an operator of the order n.
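For the bar operator the reduction in question is a single integration by parts; in standard notation, with the bar on 0 ≤ x ≤ l and the boundary terms set aside since they are variationally irrelevant (a sketch, not the book's displayed formula):

```latex
\int_0^l y\,y''''\,dx
  \;=\; \bigl[\,y\,y''' - y'\,y''\,\bigr]_0^l \;+\; \int_0^l (y'')^2\,dx ,
```

so the fourth-order integrand y y⁗ is replaced by the second-order (y″)², still quadratic in y; repeating the step carries an operator of order 2n down to order n.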
The integral (26) represents the "elastic energy" of the bar which in the
state of equilibrium becomes a minimum. The additional term in Q,
caused by the load distribution β(x):
[Answer:
[Answer:
SEC. 5.9 THE CALCULUS OF VARIATIONS 229
Problem 188. Show that the following variational integral yields no differential
equation but only boundary conditions:
* Cf., e.g., the author's book [6], quoted in the Bibliography of this chapter.
depends on a certain variable w which is present in L, without any derivatives.
Such a variable can be eliminated in advance, by solving for w the equation
[Answer:
Problem 190. Derive the differential equation and boundary conditions of the
elastic bar which is free at the two ends (i.e. no imposed boundary conditions)
by minimizing the integral
[Answer:
Problem 191. Do the same for the supported bar; this imposes the boundary
conditions
Problem 192. Show that all linear differential equations which are deducible
from a variational principle, are automatically self-adjoint in both operator and
boundary conditions.
[Hint: Replace δv(x) by u(x) and make use of Green's identity.]
We can make this function purely algebraic by introducing the first and the
second derivatives of v(x) as new variables. We do that by putting
But in the new formulation we have to add the two auxiliary conditions
(The variables were employed in the sequence p1 . . . pn; q1 . . . qn.) The
boundary term has the form
and the entire canonical system may be written in the form of one unified
scheme:
if we agree that the notation p2n+1 shall have the following significance:
which can be added without any harm since it has no effect on the solution
of the equation (21).
To obtain the resulting canonical scheme we proceed as follows. By the
method of surplus variables we introduce the first, second, . . . , (n − 1)st
derivative as new variables. We will illustrate the operation of the
principle by considering the general linear differential equation of third
order:
A1(x)v‴(x) + A2(x)v″(x) + A3(x)v′(x) + A4(x)v(x) = β(x)        (23)
We denote v by p4 and introduce two new variables p5 and p6 by putting
Hence A1v″ = p6. The first term of (23) may be written in a slightly
modified form:
and thus the given differential equation (23) may now be formulated as the
following first order system:
If now we multiply by the undetermined factors p1, p2, p3 and apply in
the first term the usual integration by parts technique, we obtain the
adjoint system in the form
The two systems (26) and (27) can be combined into the single system
This is a special case of the general canonical scheme (16), with a matrix
(17) in which the n x n matrices P and Q are missing and thus the operators
D and D fall apart, without any coupling between them. But the canonical
equations of Hamilton are once more valid.
The procedure we followed here has one disadvantage. In the case that
the given differential operator is self-adjoint, we may destroy the self-adjoint
nature of our equation and thus unnecessarily double the number of equations,
in order to obtain the canonical scheme. An example was given in the
discussion of the problem of the elastic bar (cf. Section 7), there we used the
method of surplus variables and succeeded in reducing the given fourth
order equation into a system of first order equations (7.20), which, however,
were not self-adjoint, although the system (8.2) demonstrated that the same
system can also be given in self-adjoint form. This form was deducible if
we knew the action integral from which the problem originated by the
process of variation. Then the Hamiltonian method gave us the canonical
system (8.2). But let us assume that we do not know in advance that our
problem can be put in self-adjoint form. Our equations are given in the
non-self-adjoint form (7.20). Is there a way of transforming this system into
the proper canonical form of four instead of eight equations?
SEC. 5.10 THE CANONICAL EQUATIONS OF HAMILTON 235
We can now go through with the Hamiltonian scheme and finally, after
obtaining the resulting self-adjoint system, compare it with the given
system and see whether the two systems are in fact equivalent or not.
In our problem (7.20) we have given our differential equation in the form
But now v4 is purely algebraic and putting the partial derivative with respect
to v4 equal to zero we obtain the condition
with
The equivalence of this canonical system with the original system (31) and
(32) is easily established if we make the following identifications:
Show the self-adjoint character of the system (14), (15), denoting the adjoint
variables by p̄i, q̄i. Do the same for the boundary conditions
SEC. 5.11 THE HAMILTONISATION OF PARTIAL OPERATORS 237
with
Since φ1, φ2, φ3 are purely algebraic variables (their derivatives do not
appear in L), they can be eliminated, obtaining:
The Hamiltonian system for the four variables p1, p2, p3, φ becomes:
Problem 195. According to Problem 170 (Chapter 4.18) the differential equation
shall have no solution other than the trivial solution v(x) = 0. If the
adjoint homogeneous system
possessed solutions which did not vanish identically, the Green's function
method did not lose its significance. It was merely necessary that the
given right side should be orthogonal to every independent solution of the
equation (2):
The number of equations, the number of unknowns and the rank of the
matrix all coincide. In that case we have Hadamard's "well-posed"
problem: the solution is unique and the given right side can be chosen freely.
Under such conditions the Green's function possesses a special property
which leads to important consequences. Let us consider, together with the
problem
We have chosen the notation G̃(x, ξ) to indicate that the Green's function
of the adjoint problem is meant. The defining equation of this function is
(in view of the fact that the adjoint of the adjoint is the original operator):
whose solution is obtainable with the help of the Green's function G(x, ξ),
although we have now to replace x by ξ and consequently choose another
symbol, say σ, for the integration variable:
If we now identify β(σ) with δ(x, σ) in order to apply the general solution
method (10) to the solution of the special equation (8), the integration over
σ is reduced to the immediate neighbourhood of the point σ = x and we
obtain in the limit (as ε converges to zero), on the right side of (10):
while the left side v(ξ) is by definition G̃(x, ξ). This gives the fundamental
result
which has the following significance: The solution of the adjoint problem (6)
can be given with the help of the same Green's function G(x, £) which solved the
original problem. All we have to do is to exchange the role of "fixed point"
and "variable point".
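In the matrix analogue the "Green's matrix" of A v = b is simply A⁻¹, and the theorem reduces to the elementary identity (Aᵀ)⁻¹ = (A⁻¹)ᵀ; a brief numerical sketch (the random matrix is an illustrative choice, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
# a well-conditioned, deliberately non-symmetric matrix
A = rng.standard_normal((n, n)) + n * np.eye(n)

G = np.linalg.inv(A)         # "Green's matrix" of the original problem A v = b
G_adj = np.linalg.inv(A.T)   # "Green's matrix" of the adjoint problem

# the analogue of the reciprocity theorem: fixed point and variable point
# exchanged, i.e. the adjoint Green's matrix is the transpose of the original
assert np.allclose(G_adj, G.T)
```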
We will once more assume that neither the given homogeneous system, nor
the adjoint homogeneous system has non-vanishing solutions. We can now
investigate the role of the reciprocity theorem (12.12), if the Green's function
G(x, ξ) is changed to the Green's vector Gk(x, ξ)j. This question can be
answered without further discussion since we have seen that the subscripts
k and j can be conceived as extended dimensions of the variables x and ξ.
Accordingly the theorem (12.12) will now take the form
where the active variable is now x—together with the subscript k—while ξ
and j are constants during the process of integration.
This means that the same Green's vector Gk(x, ξ)j can be obtained in two
different ways. In the first case we write down the adjoint system, putting
the delta function in the kth equation and solving the system for uj(ξ). In
the second case we write down the given system, putting the delta function
in the jth equation and solving the system for vk(x). The general theory
shows that both definitions yield the same function Gk(x, ξ)j. But the
two definitions do not coincide in a trivial fashion since in one case ξ, in the
other x is the active variable. In a special case it may not even be simple
to demonstrate that the two definitions coincide without actually con-
structing the explicit solution.
If our system is self-adjoint, the second and the first system of equations
become identical and we obtain the symmetry condition of the Green's
vector in the form:
But in the original interpretation G1(x, ξ)2 was denoted by G(x, ξ), while
G2(x, ξ)1 was denoted by G̃(x, ξ). And thus the symmetry relation (14.9)
expresses in fact the reciprocity theorem
obtained earlier (in Section 12) on a different basis. By the same reasoning
the generalised reciprocity theorem (3) can be conceived as an application
of the symmetry relation (6), if again we complement the given vectorial
system by the adjoint vectorial system, which makes the resultant system
self-adjoint.
Problem 198. Define the Green's function G(x, ξ) of Problem 183 (cf. (7.20)), by
considering x as the active variable.
[Answer:
Then
We can extend this method to the case that p goes to infinity. Let us
assume that
The more dimensional case is quite similar and the equation (11) expresses
the generation of f(x) by a superposition of pulses if we omit the limits a
and b and replace them by the convention that our integral is to be
extended over the entire given domain of our problem, dξ denoting the
volume-element of the domain.
Let us now return to our equation (3). We will generate the right side
β(x) in terms of pulses, according to the equation
The previous ai corresponds to β(ξ), the previous fi(x) to δ(x, ξ). Accordingly
the equation (5) now becomes
replacing the notation vi(x) by v(x, ξ). In order to bring into evidence that
we have constructed a special auxiliary function which depends not only on
x but also on the position of the point ξ (which is a mere constant from the
standpoint of solving the differential equation (13)), we will replace the
notation v(x, ξ) by G(x, ξ):
where again δ(x, ξ)j denotes that the delta function δ(x, ξ) is put in the
jth equation (while all the other right sides are zero). Once more the result
agrees with the corresponding result (14.5) of our previous discussion, but
here again obtained on the basis of the superposition principle.
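The superposition principle can be imitated on a grid. Assuming, purely for illustration, the equation −v″ = β with v(0) = v(1) = 0 (not necessarily the equation of Problem 200), the columns of the inverted difference matrix play the role of G(x, ξ), and summing them against β(ξ) reproduces the direct solution:

```python
import numpy as np

# discretise -v''(x) = beta(x), v(0) = v(1) = 0 on an interior grid
n = 200
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)

# second-difference matrix for -d^2/dx^2 with vanishing boundary values
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2

# each column of G is the response to a discrete unit pulse (weight 1/h)
G = np.linalg.inv(A) / h          # G[i, j] ~ G(x_i, xi_j)

beta = np.sin(np.pi * x)
v_superposed = G @ beta * h       # v(x) = integral of G(x, xi) beta(xi) dxi
v_direct = np.linalg.solve(A, beta)

assert np.allclose(v_superposed, v_direct)
# and both agree with the exact solution sin(pi x)/pi^2 of -v'' = sin(pi x)
assert np.max(np.abs(v_direct - np.sin(np.pi * x) / np.pi**2)) < 1e-3
```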
Problem 200. On the basis of the superposition principle find the solution of the
following boundary value problems:
Find the compatibility condition between α and β and explain the situation
found above (why is p = k forbidden?).
[Answer: By Green's identity:
Since the delta function is zero everywhere to the left of x, up to the point
ξ = x − (ε/2), y(ξ) must be a constant C in this region. But the delta
function is zero also to the right of x, beyond the point ξ = x + (ε/2). Hence
y(ξ) must again be a constant in this region. But will it be the same
constant C we had on the left side? No, because the presence of the pulse
in the region between ξ = x − (ε/2) and x + (ε/2) changes the course of
the function y(ξ). In this region we get
the left, and y(ξ) = C + 1 coming from the right, with a point of
discontinuity at the point ξ = x. The magnitude of the jump is 1.
Now let us change our differential equation to
Then y(ξ) is no longer a constant on the two sides of the point x but a
function of ξ, obtainable by solving the homogeneous equation
But in the narrow region between ξ = x − (ε/2) and ξ = x + (ε/2) we can
write our equation in the form
and we find that now the increase of y(ξ) in this narrow region is not exactly
1, in view of the presence of the second term, but the additional term is
proportional to ε and goes to zero with ε going to zero. Hence the previous
result that y(ξ) suffers a jump of the magnitude 1 at the point ξ = x remains
once more true. Nor is any change encountered if we add on the left side
terms which are still smoother than y(ξ), namely y^(−1)(ξ), y^(−2)(ξ), . . . , where
these functions are the first, second, . . . integrals of y(ξ). As in (9), all
these terms can be transferred to the right side and their contribution to
the jump of y(ξ) at ξ = x is of the order ε², ε³, . . . , with the limit zero as ε
goes to zero.
Now we can go one step further still and assume that the coefficient of
y′(ξ) in the equation (7) is not 1 but a certain continuous function p(ξ)
which we will assume to remain of the same sign throughout the range [a, b].
In the neighbourhood of the point ξ = x this function can be replaced by
the constant value p(x). By dividing the equation by this constant we
change the height of the delta function by the factor p⁻¹(x). Accordingly
the jump of y(ξ) at the point ξ = x will no longer be 1 but 1/p(x).
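The jump law can be observed numerically: integrating p(ξ)y′(ξ) + q(ξ)y(ξ) = δε(x, ξ) for concrete p and q (illustrative choices of mine, not taken from the text) shows y jumping by very nearly 1/p(x):

```python
import numpy as np

# p(xi) y'(xi) + q(xi) y(xi) = delta_eps(x, xi),  y(a) = 0
p = lambda xi: 2.0 + np.sin(xi)     # coefficient of the highest derivative
q = lambda xi: np.cos(xi)           # a smoother added term
x, eps = 1.0, 1e-3
a, b, m = 0.0, 2.0, 200000
xi = np.linspace(a, b, m + 1)
h = (b - a) / m

y = np.zeros(m + 1)
for i in range(m):                  # explicit Euler integration
    pulse = 1.0 / eps if abs(xi[i] - x) <= eps / 2 else 0.0
    y[i + 1] = y[i] + h * (pulse - q(xi[i]) * y[i]) / p(xi[i])

# y is smooth away from x but jumps by about 1/p(x) across the pulse
jump = y[np.searchsorted(xi, x + 0.01)] - y[np.searchsorted(xi, x - 0.01)]
assert abs(jump - 1.0 / p(x)) < 0.02
```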
Let us now consider an arbitrary ordinary linear differential operator
D̃u(ξ) of nth order. This equation is exactly of the type considered before,
if we identify u^(n)(ξ) with y′(ξ), that is y(ξ) with u^(n−1)(ξ). Translating our
previous result to the new situation we arrive at the following result: The
presence of the delta function at the point ξ = x has the consequence that u(ξ),
SEC. 5.16 GREEN'S FUNCTION OF ORDINARY DIFFERENTIAL EQUATIONS 251
u′(ξ), u″(ξ), . . . , up to u^(n−2)(ξ) remain continuous at the point ξ = x, while
u^(n−1)(ξ) suffers a jump of the magnitude 1/p(x) if p(ξ) is the coefficient of the
highest derivative u^(n)(ξ) of the given differential operator.
In view of this result we can dispense with the delta function in the case
of ordinary differential equations, and characterise the Green's function
G(x, ξ) by the homogeneous differential equation
to the left and to the right from the point £ = x if we complement this
equation by the aforementioned continuity and discontinuity conditions.
We can now return to our previous only partially solved problem of
obtaining the 2n undetermined constants Ai, Bi of the system (4). The
given boundary conditions yield n homogeneous algebraic equations. Now
we add n further conditions. The condition that u(x), u′(x), . . . , u^(n−2)(x)
must remain continuous, whether we come from the left or from the right,
yields n − 1 homogeneous algebraic equations between the Ai and the Bi.
The last condition is that u^(n−1)(x) is not continuous but makes a jump of
the magnitude
p(ξ) being the coefficient of u^(n)(ξ) of the adjoint operator D̃u(ξ). This last
condition is the only non-homogeneous equation of our algebraic system of 2n
unknowns Ai, Bi.
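A still simpler illustration of this counting than the vibrating spring (an example of my own, not the book's): for u″(ξ) = δ(x, ξ) on [0, 1] with u(0) = u(1) = 0 we have n = 2, hence four constants, fixed by two boundary conditions, one continuity condition and the unit jump of u′:

```python
import numpy as np

# G(xi) = a0 + a1 xi for xi <= x,  b0 + b1 xi for xi >= x  (u'' = 0 on each side)
x = 0.3
M = np.array([
    [1.0, 0.0,  0.0, 0.0],     # a0 = 0                (left boundary condition)
    [0.0, 0.0,  1.0, 1.0],     # b0 + b1 = 0           (right boundary condition)
    [1.0,   x, -1.0,  -x],     # continuity of G at xi = x
    [0.0, -1.0, 0.0, 1.0],     # b1 - a1 = 1           (unit jump of G', p = 1)
])
rhs = np.array([0.0, 0.0, 0.0, 1.0])
a0, a1, b0, b1 = np.linalg.solve(M, rhs)

# closed form: G(xi) = (x - 1) xi on the left,  x (xi - 1) on the right
assert np.allclose([a0, a1, b0, b1], [0.0, x - 1.0, -x, x])
```

Note that the result is symmetric under an exchange of x and ξ, as the self-adjointness of u″ with these boundary conditions requires.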
As an example let us construct the Green's function of the differential
equation of the vibrating spring
These four equations determine the four constants A1, A2, B1, B2 of our
solution as follows:
and thus
which must hold in view of the self-adjoint character of our problem. This
does not mean that the expressions (20) must remain unchanged for an
exchange of x and ξ. An exchange of x and ξ causes the point ξ to come
to the right of x if it was originally to the left and vice versa. What is
demanded then is that an exchange of x and ξ changes the left Green's
function to the right Green's function, and vice versa:
This, however, is actually the case in our problem since the second expression
of (20) may also be written in the form
We also observe that the Green's function ceases to exist if the constant
p assumes the values
because then the denominator becomes zero. But then we have violated
the general condition always required for the existence of the Green's
function, namely that the homogeneous equation (under the given boundary
conditions) must have no solutions which do not vanish identically. If the
condition (24) is satisfied, then the homogeneous solutions (14) satisfy the
boundary conditions (13) and the homogeneous problem now has non-zero
solutions. The modification of our treatment to problems of this kind will
be studied in a later section (see Section 22).
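The breakdown can be seen in the discretised picture as well. Assuming the boundary conditions v(0) = v(l) = 0 (the conditions (13) are not reproduced here), the homogeneous equation v″ + p²v = 0 has the non-trivial solution sin(pξ) exactly when p = kπ/l, and the difference matrix becomes nearly singular there; a sketch with l = 1:

```python
import numpy as np

def spring_matrix(p, n=400, l=1.0):
    # central-difference discretisation of v'' + p^2 v with v(0) = v(l) = 0
    h = l / (n + 1)
    main = (-2.0 / h**2 + p**2) * np.ones(n)
    off = np.ones(n - 1) / h**2
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

# away from the critical values p = k*pi/l the matrix is comfortably invertible
smin_regular = np.linalg.svd(spring_matrix(1.0), compute_uv=False).min()

# at p ~ pi/l the homogeneous problem admits sin(pi x / l) and the smallest
# singular value collapses: no Green's function exists in the limit
smin_critical = np.linalg.svd(spring_matrix(np.pi), compute_uv=False).min()

assert smin_critical < 1e-1 * smin_regular
```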
Problem 202. Find the Green's function for the problem
[Answer:
Problem 203. Do the same for (25) with the boundary conditions
[Answer:
Problem 204. Find the Green's function for the problem of the vibrating
spring excited by an external force:
Obtain the result by considering once ξ and once x as the active variable.
[Answer:
Problem 205. Find the Green's function for the motion of the "ballistic
galvanometer"
[Answer:
Problem 206. If the constant γ in the previous problem becomes very large, the
first term of the differential equation becomes practically negligible and we obtain
Demonstrate this result by solving (30) with the help of the Green's function.
Problem 207. Solve the differential equation of the vibrating spring in a resisting
medium:
with the help of the Green's function and discuss particularly the case of
"critical damping" p = α.
[Answer:
Problem 208. Solve with the help of the Green's function the differential
equation
and compare the solution with that obtained by the "variation of the constants"
method.
[Answer:
Problem 209. The differential equation of the loaded elastic bar of uniform
cross-section [cf. Section 4.14, with I(x) = 1] is given by
We thus see that we can come from the one Green's function to the other
by adding some solution of the homogeneous equation. The coefficients of
this solution have to be adjusted in such a way that the new boundary
conditions shall become satisfied.
For example the Green's function of the clamped-free uniform bar came
out in the form (16.38). Let us now obtain the Green's function of the
clamped-clamped bar:
For this purpose we will add to the previous G(x, £) an arbitrary solution
of the homogeneous equation
that is
Problem 210. Obtain by this method the expression (16.27) from (16.26).
Problem 211. The boundary conditions of a bar simply supported on both
ends are
Obtain the Green's function of the simply supported bar from (16.38).
[Answer: Added term:
Problem 212. Obtain from the Green's function (16.29) the Green's function
of the same problem but modifying the boundary conditions to
By terminating the series after n terms we cannot expect that f*(x) shall
coincide with f(x). But we can introduce the difference between f(x) and
f*(x) as the "remainder" of the series. Let us call it v(x):
We will now find the Green's function of our differential equation (3)
and accordingly obtain the solution in the range [0, x] by the integral
But then the boundary conditions (4) make every one of these coefficients
equal to zero and thus
Now we come to the range x > ξ. Here again the homogeneous equation
has to be solved and we can write our polynomial in the form
The conditions of continuity demand that v(ξ), v'(ξ), . . . , v^(n-2)(ξ) shall be
zero since the function on the left side vanishes identically. This makes
b0 = b1 = b2 = . . . = b(n-2) = 0, and what remains is
or
which gives
and this is the Lagrangian remainder which, as we have seen in Chapter 1.3,
can be used for an estimation of the error of the truncated Taylor series.
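The result just described can be written out explicitly; the following is a sketch reconstructed from the surrounding discussion (differentiation order n, range [0, x]), not a verbatim restoration of the printed formula:

```latex
v(x) = \frac{1}{(n-1)!}\int_0^x (x-\xi)^{n-1} f^{(n)}(\xi)\,d\xi
     = \frac{f^{(n)}(\theta x)}{n!}\,x^n, \qquad 0 < \theta < 1,
```

the second form following from the mean value theorem, since the kernel (x − ξ)^(n-1) does not change its sign on [0, x].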
Problem 213. Carry through the same treatment on the basis of the adjoint
equation, operating with G(x, ξ).
we first of all notice that at the points of interpolation (1) f(x) and f*(x)
coincide and thus
These conditions take the place of the previous boundary conditions (18.4).
Furthermore, if we differentiate (4) n times and consider that the nth
derivative of f*(x) (being a polynomial of the order n − 1) vanishes, we
once more obtain the differential equation
We thus have to solve the differential equation (6) with the inside conditions
is put out of action at these points, just as we have earlier conceived the
appearance of the delta function at the point ξ = x as a consequence of the
fact that we have added v(x) to the data of our problem.
The full adjoint problem can thus be described as follows. The delta
function δ(x, ξ) appears not only at the point ξ = x but also at the points
ξ = x1, x2, . . . , xn. The strength with which the delta function appears at
these points has to be left free. The completed adjoint differential equation
of our problem will thus become
We have gained the n new constants α1, α2, . . . , αn which remove the
over-determination since we now have these constants at our disposal,
together with the n constants of integration associated with a differential
equation of the nth order. They suffice to take care of the 2n boundary
conditions (10).
First we will satisfy the n boundary conditions
This is the problem of the Green's function G(x, ξ), solved in the previous
section as a function of x, but now considered as a function of ξ. We can
take over the previous solution:
We will combine these two solutions into one single analytical expression
by using the symbol [t^(n-1)] for a function which is equal to t^(n-1) for all
positive values of t but zero for all negative values of t:
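In modern notation this truncated power function (a sketch consistent with the verbal definition above) is

```latex
[t^{n-1}] =
\begin{cases}
t^{n-1}, & t > 0,\\
0,       & t \le 0.
\end{cases}
```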
must vanish identically. We thus see that the Green's function G(x, ξ) will
not extend from a to b but only from x1 to xn if the point x is inside the
interval [x1, xn], or from x to xn if x is to the left of x1, and from x1 to x if
x is to the right of xn.
The vanishing of Qn-1(ξ) demands the fulfilment of the following n
equations:
We will now choose for Pn-1(x) the Lagrangian polynomials φk(x), defined
by (3). Then the factor of ak becomes 1, the factor of all the other aj zero,
and we obtain
Hence we see that the strength with which the delta functions are
represented at the points of interpolation xk depends on the position of the
variable point x. Moreover, this strength is given by exactly the same
interpolation coefficients, taken with a negative sign, which appear in the
Lagrangian interpolation formula.
We have now constructed our Green's function in explicit form:
(We have assumed that x is an inside point of the interval [x1, xn]; if x is
outside of this interval and to the left of x1, the lower limit of integration
becomes x instead of x1; if x is outside of [x1, xn] and to the right of xn, the
upper limit of integration becomes x instead of xn.)
It will hardly be possible to actually evaluate this integral. We can use
it, however, for an estimation of the error η(x) of the Lagrangian interpolation
(this η(x) is now our v(x)). In particular, we have seen in the first chapter
that we can deduce the Lagrangian error formula (1.5.10) if we can show
that the function G(x, ξ), taken as a function of ξ, does not change its sign
throughout the interval of its existence, which in our case will be between
x1 and xn, although the cases x to xn or x1 to x can be handled quite
analogously.
First of all we know that G(x, ξ) = u(ξ) vanishes at the limiting points
ξ = x1 and ξ = xn with all its derivatives up to the order n − 2; we cannot
go beyond n − 2 because the (n − 1)st derivative makes a jump at x1
and xn, due to the presence of δ(x1, ξ) and δ(xn, ξ) in the nth derivative
(cf. (12)), and thus starts and ends with a finite value. Now u(ξ), being a
continuous function of ξ and starting and ending with the value zero, must
have at least one maximum or minimum in the given interval [x1, xn]. But
if u(ξ) were to change its sign and thus pass through zero, the number of
extremum values would be at least two. Accordingly the derivative u'(ξ)
must vanish at least once and if we can show that it vanishes in fact only
once, we have established the non-vanishing of u(ξ) inside the critical
interval. Continuing this reasoning we can say that u^(k)(ξ) must vanish
inside the critical interval at least k times and if we can show that the
number of zeros is indeed exactly k and not more, the non-vanishing of u(ξ)
is once more established.
Now let us proceed up to the (n − 2)nd derivative and investigate its
behaviour. Since the nth derivative of u(ξ) is composed of delta functions,
the (n − 1)st derivative is composed of step functions. Hence it is the
(n − 2)nd derivative where we first encounter a continuous function composed
of straight zig-zag lines, drawn between the n + 1 points x1, x2, . . . , x,
. . . , xn. The number of intervals is n and since no crossings occur in the
first and in the last interval, the number of zeros cannot exceed n − 2.
Hence u^(n-2)(ξ) cannot vanish more than n − 2 times while our previous
reasoning has shown that it cannot vanish less than n − 2 times. This
establishes the number of zeros as exactly n − 2 which again has the consequence
that an arbitrary kth derivative has exactly k zeros within the given
interval while G(x, ξ) itself does not change its sign as ξ varies between x1
and xn. The theorem on which the estimation of Lagrange was based is
thus established. Furthermore, the formula (23) puts us in the position to
construct G(x, ξ) explicitly, on the basis of Lagrange's interpolation formula.
The Lagrangian interpolation coefficients φk(x) ordinarily multiplied by
SEC. 5.20 LAGRANGIAN INTERPOLATION WITH DOUBLE POINTS 263
f(xk) are now multiplied by the functional values of the special function
[(x − ξ)^(n-1)]/(n − 1)! taken at the points x = xk (considering ξ as a mere
parameter).
We see that the Green's function of Lagrangian interpolation can be
conceived as the remainder of the Lagrangian interpolation of the special
function [(x − ξ)^(n-1)]/(n − 1)!.
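Collecting the statements of the last few paragraphs—the term [(x − ξ)^(n-1)]/(n − 1)! for ξ < x, together with the delta functions of strength −φk(x) at the interpolation points—the explicit form referred to as (23) can be sketched as

```latex
G(x,\xi) = \frac{1}{(n-1)!}\Bigl\{\,[(x-\xi)^{n-1}]
           \;-\; \sum_{k=1}^{n} \varphi_k(x)\,[(x_k-\xi)^{n-1}]\Bigr\},
\qquad
\eta(x) = \int_{x_1}^{x_n} G(x,\xi)\, f^{(n)}(\xi)\, d\xi
```

(for x inside [x1, xn]; the limits of integration are modified as described above when x lies outside). This is a reconstruction consistent with the text, not the printed equation itself.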
Problem 214. Show that if x > xn and ξ is a point between xn and x:
Problem 215. Show from the definition (23) that G(x, ξ) vanishes at all points
x = xk.
Problem 216. Show from the definition (23) that G(x, ξ) vanishes at all values of
ξ which are outside the realm of the n + 1 points [x1, x2, . . . , xn, x].
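The explicit construction just described can also be checked numerically. The following sketch (nodes, test function, and evaluation point chosen freely for the illustration) compares the direct interpolation error of f(x) = sin x with the integral of the Green's function against f'''(ξ):

```python
import math

# Illustration (values chosen freely): quadratic interpolation (n = 3)
# of f(x) = sin x, error checked against the Green's-function integral.
nodes = [0.0, 0.5, 1.0]           # interpolation points x1, x2, x3
n = len(nodes)
x = 0.25                          # point where the error is evaluated

def phi(k, t):
    """Lagrangian coefficient function phi_k(t)."""
    p = 1.0
    for j, xj in enumerate(nodes):
        if j != k:
            p *= (t - xj) / (nodes[k] - xj)
    return p

def bracket(t, m):
    """The truncated power [t^m]: t^m for positive t, zero otherwise."""
    return t**m if t > 0 else 0.0

def green(x, xi):
    """Green's function of Lagrangian interpolation: the interpolation
    remainder of the special function [(x - xi)^(n-1)]/(n-1)!."""
    s = bracket(x - xi, n - 1)
    for k in range(n):
        s -= phi(k, x) * bracket(nodes[k] - xi, n - 1)
    return s / math.factorial(n - 1)

# Direct interpolation error eta(x) = f(x) - f*(x)
eta_direct = math.sin(x) - sum(phi(k, x) * math.sin(nodes[k]) for k in range(n))

# Error via the Green's function: integral of G(x, xi) f'''(xi) d(xi)
# over [x1, xn]; here f''' = -cos.  Composite midpoint rule.
N = 4000
h = (nodes[-1] - nodes[0]) / N
eta_green = h * sum(green(x, nodes[0] + (i + 0.5) * h)
                    * (-math.cos(nodes[0] + (i + 0.5) * h))
                    for i in range(N))

assert abs(eta_direct - eta_green) < 1e-6
```

The agreement is limited only by the accuracy of the quadrature, since the two expressions are mathematically identical.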
contains all the root factors once and only once. But just as in algebra
a root may become a multiple root through the collapsing of several single
roots, something similar may happen in the process of interpolation. Let
us assume that the functional value f(x) is prescribed in two points
x = xk ± ε which are very close together, due to the smallness of ε. Then
we can put
(Φ(x) is composed of all the other root factors.) The two terms associated
with the two critical points become
The sign ± shall mean that we should take the sum of two expressions,
the one with the upper, the other with the lower sign. The factor of f(xk)
becomes
and this becomes the factor of f'(xk). On the other hand, the expression (5)
can now be written in the form
For the estimation of an error bound the Lagrangian formula (1.5.10) holds
again, counting every root factor with the proper multiplicity.
Problem 217. By analysing the expression (11) demonstrate that the
interpolating polynomial assumes the value f(xk) at the point x = xk, while its
derivative assumes the value f'(xk) at the point x = xk.
Problem 218. Obtain a polynomial approximation of the order 4 by fitting
f(x) at the points x = ±1, 0, and f'(x) at the points x = ±1.
[Answer:
Problem 219. Apply this formula to an approximation of sin (π/2)x and cos (π/2)x
and estimate the maximum error at any point of the range [−1, 1].
[Answer:
Problem 220. Explain why the error bound for the cosine function can in
fact be reduced to
Problem 223. Apply this interpolation once more to the functions sin (π/2)x and
cos (π/2)x and estimate the maximum errors.
[Answer:
Problem 224. Obtain the Green's function for the remainder of this
interpolation.
[Answer:
Problem 225. Show that this Green's function is characterised by exactly the
same conditions as the Green's function of the clamped bar, considered before
in Section 17, except that the new domain extends from −1 to +1 while the
range of the bar was normalised to [0, 1]. Replacing x by x + 1, ξ by ξ + 1,
show that the expression (19) is in fact equivalent to (17.8), if we put l = 2.
we need first of all the explicit construction of the Green's vector Gk(x, ξ).
This means that—considering ξ as the active variable—we should put the
delta function in the kth equation
while all the other equations have zero on the right side.
This again means that the homogeneous equations
SEC. 5.21 CONSTRUCTION OF THE GREEN'S VECTOR 267
are satisfied in both regions ξ < x and ξ > x. Assuming that we possess
the homogeneous solution with all its 2n constants of integration, we can
set up the solution for ξ < x with one set of constants and the solution for
ξ > x with another set of constants, exactly as we have done in Section 16.
The 2n boundary conditions of our problem provide us with 2n linear algebraic
relations between the 4n free constants. Now we come to the joining of the
two regions at the point ξ = x. In view of the fact that the delta function
exists solely in the kth equation, we obtain continuity in all components pi(x),
with the only exception of the component p(n+k) where we get a jump of 1 in
going from the left to the right:
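In formulas, the joining conditions just stated can be sketched, in the notation of this section, as

```latex
G_k(x,\xi)_i\Big|_{\xi=x-0}^{\xi=x+0} = 0 \quad (i \ne n+k),
\qquad
G_k(x,\xi)_{n+k}\Big|_{\xi=x-0}^{\xi=x+0} = 1 .
```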
(The previous v is now p2.) The two fundamental solutions of the homo-
geneous system are
Accordingly we establish the two separate solutions to the left and to the
right from the point ξ = x in the form
The boundary conditions (7) yield the following two relations between the
four constants A1, A2, B1, B2:
Let us first obtain the two components G1(x, ξ)1,2. Then p1 is continuous
at ξ = x while p2(x) makes a jump of the magnitude 1. This yields the two
further relations
Hence we have now obtained the following two components of the full
Green's vector:
with the convention that the upper sign holds for ξ < x, the lower sign for
ξ > x.
We now come to the construction of the remaining two components
G2(x, ξ)1,2, characterised by the condition that now p2 remains continuous
while −p1 makes a jump of 1 at the point ξ = x. The equations (9) remain
unchanged but the equations (10) have to be modified as follows:
Explain these relations on the basis of the differential equation (21.6), assuming
the right sides in the form γ(x), β(x), instead of 0, β(x).
Problem 227. Consider the canonical system (8.2) for the clamped elastic bar
(boundary conditions (7.21)). In (7.24) the Green's function for the solution
v0(x) was defined and the determining equations (7.26-26) deduced, while later
the application of the reciprocity theorem gave the determining equations
(14.11-12), considering x as the active variable. From the standpoint of the
Green's vector the component G3(x, ξ)1 is demanded, that is, we should evaluate
u1(ξ), putting the delta function in the third equation (which means a jump of
1 at ξ = x in the function v2(ξ)). We can equally operate with G1(x, ξ)3, that is,
evaluate u3(ξ), putting the delta function in the first equation (which means a
jump of −1 at ξ = x in u4(ξ)). In the latter method we have to exchange in the
end x and ξ. Having obtained the result—for the sake of simplicity assume a bar
of uniform cross-section, i.e., put I(ξ) = const. = I—we can verify the previously
obtained properties of the Green's function, deduced on the basis of the defining
differential equation but without explicit construction:
1a). Considered as a function of ξ, G(x, ξ) and its first derivative must vanish
at the two endpoints ξ = 0 and ξ = l.
1b). The coefficients of ξ^2 and ξ^3 must remain continuous at the point ξ = x.
1c). The function G(x, ξ) must pass through the point ξ = x continuously
but the first derivative must make a jump of 1 at the point ξ = x.
2a). Considered as a function of x the dependence can only be linear in x,
with a jump of 1 in the first derivative at the point x = ξ.
2b). If we integrate this function twice from the point x = 0, we must wind
up at the endpoint x = l with the value zero for the integral and its first
derivative.
[Answer:
both conditions are included in the statement that the homogeneous system
and likewise
Under these conditions our equation is once more solvable and the solution
is unique. Hence we can expect that the solution will again be obtainable
with the help of a Green's function G(x, £), called the "constrained Green's
function":
The question arises how to define these functions. This definition will
occur exactly along the principles we have applied before, if only we
remember that our operations are now restricted to the activated subspace
of the function space. Accordingly we cannot simply put the delta function
on the right side of the equation. We have to put something on the right
side which excludes any components in the direction of the homogeneous
solutions, although keeping everything unchanged in the activated dimensions.
We will thus put once more the delta function on the right side of the
defining equation, but subtracting its projection into the unwanted dimensions.
This means the following type of equation:
and likewise
Now we have to find the undetermined constants pj, ak. This is simple
if we assume that the homogeneous solutions vk(ξ), respectively wj(ξ), have
been orthogonalised and normalised, i.e. we use such linear combinations of
the homogeneous solutions that the resulting solutions shall satisfy the
orthogonality and normalisation conditions
These equations are now solvable (under the proper boundary conditions)
but the solution will generally not be unique. The uniqueness is restored,
however, by submitting the solution to the orthogonality conditions (6).
As an example we return to the problem we have discussed in Section 16.
We have seen that the solution went out of order if the constant p satisfied
the condition (16.24). These were exactly the values which led to the
homogeneous solutions
These two solutions are already orthogonal and thus we can leave them as
they are, except for the normalisation condition which implies in our case
the factor √(2/l):
Hence for the exceptional values (16.24) the defining equation for the
Green's function now becomes
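If, as the two orthogonal solutions suggest, both sin px and cos px satisfy the boundary conditions on the range [0, l] at the exceptional values (16.24), the modified defining equation can be sketched as

```latex
\frac{\partial^2 G(x,\xi)}{\partial \xi^2} + p^2 G(x,\xi)
 = \delta(x,\xi) - \frac{2}{l}\bigl(\sin px \sin p\xi + \cos px \cos p\xi\bigr)
 = \delta(x,\xi) - \frac{2}{l}\cos p(x-\xi).
```

This is a reconstruction under the stated assumption, not the printed formula.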
assuming that ε is small. The difficulty arises only for ε = 0; for any
finite ε the problem is solvable. Now we let ε go to zero. Then there is a
term which is independent of ε and a term which goes with 1/ε to infinity.
We omit the latter term, while we keep the constant term. It is this constant
term which automatically yields the Green's function of the constrained problem.
For example in our problem we can put
in which case we know already the solution of the equation (17) since we
are in the possession of the Green's function for arbitrary values of p (cf.
(16.20); we neglect the negligible powers of ε):
Here then is the Green's function of our problem obtained by a limit process
from a slightly modified problem which is unconditioned and hence subjected
to the usual treatment, without any modification of the delta function on the
right side. This function would go to infinity without the proper precautions.
By modifying the right side in the sense of (16) we counteract
the effect of the term which goes to infinity and obtain a finite result. The
symmetry of the Green's function:
Show that the solution satisfies the given boundary conditions, the defining
differential equation (13), the symmetry condition, and the orthogonality to the
homogeneous solution.
[Answer:
Compatibility condition:
The variable x ranges between −1 and +1, and the coefficient of the highest
derivative, 1 − x^2, vanishes at the two endpoints of the range. This has a
peculiar consequence. Suppose we do not prescribe any boundary conditions
for v(x). Then we would expect that the adjoint problem will be over-determined
by having to demand 4 boundary conditions. Yet this is not
the case. The boundary term in our case becomes
The boundary term vanishes automatically, due to the vanishing of the first
factor. Hence we are confronted with the puzzling situation that the
adjoint equation remains likewise without boundary conditions which makes
our problem self-adjoint since both differential operator and boundary
conditions remain the same for the given and the adjoint problem.
In actual fact the lack of boundary conditions is only apparent. The
vanishing of the highest coefficient of a differential operator at a certain
point makes that point a singular point of the differential equation, where
the solution will generally go out of bounds. By demanding finiteness (but
not vanishing) of the solution we have already imposed a restriction on our
solution which is equivalent to a boundary condition. Since the same occurs
at the other end point, we have in fact imposed two boundary conditions on
our problem by demanding finiteness of the solution at the two points
x = ± 1.
Our aim is now to find the Green's function of our problem. Since our
differential equation is self-adjoint, we know in advance that the Green's
function G(x, ξ) will become symmetric in x and ξ. There is, however, the
further complication that the homogeneous equation has the solution
Accordingly we have to make use of the extended definition (22.8) for the
Green's function. The normalised homogeneous solution becomes
which yields
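Assuming, as the coefficient 1 − x² and the later appearance of log (1 − ξ) suggest, that the operator is the Legendre-type operator [(1 − x²)v′]′, the situation can be sketched as follows (a reconstruction, not the printed formulas): the homogeneous solution finite at x = ±1 is the constant, its normalised form on [−1, 1] is u₀ = 1/√2, and the extended defining equation (22.8) becomes

```latex
\frac{\partial}{\partial \xi}\Bigl[(1-\xi^2)\,\frac{\partial G(x,\xi)}{\partial \xi}\Bigr]
 = \delta(x,\xi) - u_0(x)\,u_0(\xi) = \delta(x,\xi) - \tfrac{1}{2}.
```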
By the same reasoning, if we set up our solution on the right side, ξ > x:
we obtain
because it is now the function log (1 − ξ) which goes to infinity and which
thus has to be omitted.
So far we have obtained
But this is exactly what the delta function on the right side demands: the
jump of the (n − 1)st derivative must be 1 divided by the coefficient of the
highest derivative at the point ξ = x (which in our problem is 1 − x^2).
We have now satisfied the differential equation (5) and the boundary
conditions (since our solution remains finite at both points ξ = ±1). And
yet we did not get a complete solution because the two constants B1 and B2
have to satisfy the single condition (14) only. We can put
Then
Problem 229. Assuming that β(x) is an even function: β(x) = β(−x), the
integration is reducible to the range [0, 1]. The same holds if β(x) is odd:
β(x) = −β(−x). In the first case we get the boundary condition at x = 0:
Carry through the process for the half range with the new boundary conditions,
(at x = 1 the previous finiteness condition remains) and show that the new
result agrees with the result obtained above.
[Answer:
Compatibility condition:
(which expresses the fact that the system constantly repeats its motion under
the influence of the periodic exciting force) it is necessary and sufficient
that the orthogonality conditions
are satisfied. But let us assume that these conditions are not satisfied.
Then instead of saying that now our given problem is unsolvable, we can
go through the regular routine of the solution exactly as before. But
finally, when checking the compatibility conditions, we find that we have to
make allowances in the boundary conditions in order to make our problem
solvable. The problem of the vibrating spring, kept in motion by a periodic
external force, is a very real physical problem in which we know in advance
that the solution exists. But in the case of resonance the return of the
system to the original position cannot be expected and that means that the
conditions (1) will no longer hold.
Now in the construction of the adjoint system we went through the
following moves. We multiplied the given operator Dv(x) by an undetermined
factor u(x) and succeeded in "liberating" v(x), which was now
multiplied by a new operator D̃u(x). In this process a boundary term
appeared on the right side. For example in the problem of the vibrating
spring:
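Assuming the spring operator Dv = v″ + p²v on the range [0, l], the Green's identity referred to here can be sketched as

```latex
\int_0^l u\,(v'' + p^2 v)\,dx
 = \int_0^l v\,(u'' + p^2 u)\,dx + \bigl[u\,v' - v\,u'\bigr]_0^l .
```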
Now under the homogeneous boundary conditions (1) the right side dropped
out and we obtained the compatibility condition
for the case that the homogeneous equation under the given homogeneous
boundary conditions possessed non-zero solutions. But if we allow that
the given boundary conditions (1) have something on the right side, let us
say p1, p2, then the compatibility condition (4) has to be modified as follows:
which gives
But now the solution with the help of the Green's function (22.20) has
also to be modified because the inhomogeneous boundary values p1 and p2
will contribute something to the solution. The Green's identity (3) comes
once more into operation and we find that we have to add to our previous
solution the right side of (3), but with a negative sign:
SEC. 5.25 THE METHOD OF OVER-DETERMINATION 281
Problem 231. The boundary conditions of an elastic bar, free on both ends,
can be given according to (4.14.10) in the form:
where
and
obtain the point load and the point torque demanded at the point x = 0, to
keep the bar—which is free at the other endpoint x = l—in equilibrium.
[Answer:
does not occur any more. We need not modify the right side in order to
make the equation solvable. In fact, another strange phenomenon is now
encountered. The method of the Green's identity now shows that the
adjoint equation becomes
The solution (8) does not coincide with the earlier solution (22.20)—
complemented by (24.8)—because we followed a different policy in normalising
the free homogeneous solutions. But the simpler Green's function
(7)—obtained along the usual lines of constructing the Green's function
without any modification of the right side—is just as good from the stand-
point of solving the originally given problem as the more elaborate function
(22.20). If we want to normalise the final solution in a different way, we
can still add the homogeneous solution
This can be done since the auxiliary function G1(x, ξ) puts us in the position
to solve the inhomogeneous equation (14) (remembering, however, that we
have to consider x as the integration variable; this x should preferably
be called x1 in order to distinguish it from the previous x, which is a mere
constant during the integration process. The ξ on the right side of (14)
becomes likewise x1). In our problem we obtain:
But this is not yet the final answer. We still have the uncertainty of the
homogeneous solution
where the symbol [ ] shall again indicate (cf. (19.16)) that all negative values
(with different constants A and B). The sum of (20) plus (21) has to satisfy
the condition of orthogonality, integrating with respect to the range
ξ = [0, l], that is
This gives
where the upper sign holds for ξ < x, the lower sign for ξ > x. The
symmetry in x, ξ is evident. Moreover, a comparison with the earlier
expression (22.20) shows perfect agreement.
Problem 232. Apply the method of over-determination to Problem 228 by
adding the boundary condition
v(0) = 0
Find the Green's function of this problem and construct with its help the
constrained Green's function (22.24).
[Answer:
Problem 233. Consider the problem of the free elastic bar of constant
cross-section, putting I(x) = 1 (cf. Chapter 4.14 and Problem 231), with the added
boundary conditions
Obtain the Green's function of this problem and construct with its help the
constrained Green's function G(x, ξ) of the free elastic bar.
[Answer:
which we will arrange in increasing order, starting with the smallest eigenvalue
λ1, and continuing with the larger eigenvalues which eventually
become arbitrarily large. In harmony with our previous policy we will omit
all the negative λi and also all the zero eigenvalues, which in the case of
partial operators may be present with infinite multiplicity.
Now the corresponding eigenfunctions
and
or
The omission of the zero-axes is a property of the operator itself and not
the fault of the (generally incomplete) orthogonal system (4)—which,
however, must not omit any of the eigenfunctions which belong to a positive
λi, including possible multiplicities on account of two or more eigenvalues
collapsing into one, in which case the associated eigenfunctions have to be
properly ortho-normalised.
The eigenfunctions ui(x) are present in sufficient number to allow an
expansion of β(x) into these functions, in the form of an infinite convergent
series:
We want to assume that β(x) belongs to a class of functions for which the
expansion converges (if β(x) is everywhere in the given domain finite,
sectionally continuous, and of bounded variation, this condition is certainly
satisfied. It likewise suffices that β(x) shall be sectionally differentiable).
We now multiply this expansion by a certain uk(x) and integrate over the
given domain term by term. This is permissible, as we know from the
theory of convergent infinite series. Then on the right side every term
except the kth drops out, in consequence of the orthogonality conditions (5),
while in the kth term we get βk. And thus
On the other hand, the unknown function v(x) can likewise be expanded,
but here we have to use the functions vi(x) (which may belong to a completely
SEC. 5.26 ORTHOGONAL EXPANSIONS 289
different functional domain; for example v(x) may be a scalar, β(x) a vector,
cf. Section 5, and Problem 236):
from which
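The chain of relations described in the last two paragraphs can be sketched compactly (in the notation of the shifted eigenvalue problem of this section, D vᵢ = λᵢ uᵢ, D̃ uᵢ = λᵢ vᵢ):

```latex
\beta(x) = \sum_i \beta_i\,u_i(x), \qquad
\beta_k = \int \beta(x)\,u_k(x)\,dx, \qquad
v(x) = \sum_i \frac{\beta_i}{\lambda_i}\,v_i(x),
```

the last form following from Dv = Σᵢ γᵢ λᵢ uᵢ = Σᵢ βᵢ uᵢ when v = Σᵢ γᵢ vᵢ.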
We will assume that the eigenvalue spectrum (3) starts with a definite finite
smallest eigenvalue λ1; this is not self-evident since the eigenvalue spectrum
may have a "condensation point" or "limit point" at λ = 0, in which case
we have an infinity of eigenvalues which come arbitrarily near to λ = 0
and a minimum does not exist. Such problems will be our concern in a
later chapter. For the present we exclude the possibility of a limit point
at λ = 0. Then the infinite series
is demanded by the compatibility of the system: the right side must have
no components in those dimensions of the function space which are not
included by the operator. We have to test the given function β(x) as to
the validity of these conditions because, if these conditions are not fulfilled,
we know in advance that the given problem is not solvable.
Problem 234. Given the partial differential equation
[Answer:
Problem 236. Formulate the eigenvalue problem for the scalar-vector problem
(5.3) and obtain the orthogonal expansions associated with it.
[Answer:
where
where
where
the auxiliary function Gn(x, ξ) being defined as follows:
once more omitting the solutions for λ = 0 but keeping all the positive and
negative λi for which a solution is possible. The transition to the "shifted
eigenvalue problems" (26.2) permits us to generalise the usual self-adjoint
expansion to a much wider class of operators which includes not only the
"well-posed", although not self-adjoint problems but even the case of
arbitrarily over-determined or under-determined problems. Hence the
functions vi(x), ui(ξ) need not belong to the same domain of the function
space but may operate in completely different domains.
SEC. 5.27 THE BILINEAR EXPANSION 293
For example in our Problem 236 the functions vi(x) are the scalar functions
φi(x), while the functions ui(ξ) are the vectorial functions grad φi(ξ). The
Green's function of our problem—which is a scalar with respect to the
point x and a vector with respect to the point ξ—is obtainable with the help
of the following infinite expansion:
where the pi(x) can be chosen freely as any functions of x. This additional
sum will not contribute anything to the solution v(x) since the right side
satisfies the compatibility conditions
(cf. 26.17) and thus automatically annuls the contribution from the added
sum (13). As we have seen in Section 5, over-determined systems possess
the great advantage that we can choose our Green's function much more
liberally than we can in a well-determined problem n = m = p, where in
fact the Green's function is uniquely defined.
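The core of the bilinear expansion under discussion can be sketched, in the notation of Section 26, as

```latex
G(x,\xi) = \sum_i \frac{v_i(x)\,u_i(\xi)}{\lambda_i},
\qquad
v(x) = \int G(x,\xi)\,\beta(\xi)\,d\xi = \sum_i \frac{\beta_i}{\lambda_i}\,v_i(x),
```

the second relation showing that this G reproduces the orthogonal-expansion solution term by term.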
Another interesting conclusion can be drawn from the bilinear expansion
concerning the reciprocity theorem of the Green's function, encountered
earlier in Section 12. Let us assume that we want to solve the adjoint
equation (26.7). Then our shifted eigenvalue problem (26.2) shows at once
that in this case we get exactly the same eigenvalues and eigenfunctions, with
the only difference that the role of the functions ui(x) and vi(x) is now
exchanged. Hence the bilinear expansion of the new Green's function
G̃(x, ξ) becomes:
but this is exactly the previous expansion (7), except that the points x and
ξ are exchanged; and thus
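In this notation the reciprocity relation just derived reads (a sketch):

```latex
\tilde G(x,\xi) = \sum_i \frac{u_i(x)\,v_i(\xi)}{\lambda_i} = G(\xi, x).
```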
All these results hold equally for ordinary and for partial differential
operators since they express a basic behaviour which is common to all linear
operators. But in the case of ordinary differential equations a further
result can be obtained. We have mentioned that generally the convergence
of the bilinear expansion (7) cannot be guaranteed without the proper
modifications. The difficulty arises from the fact that the Green's function
of an arbitrary differential operator need not be a very smooth function.
If we study the character of the bilinear expansion, we notice that we can
conceive it as an ordinary orthogonal expansion into the ortho-normal system
vi(x), if we consider x as a variable and keep the point ξ fixed, or another
orthogonal expansion into the eigenfunctions ui(ξ), if we consider ξ as the
variable and x as a fixed point. The expandability of G(x, ξ) into a convergent
bilinear series will then depend on whether or not the function
G(x, ξ) belongs to that class of functions which allow an orthogonal expansion
into a complete system of ortho-normal functions. This "completeness" is
at present of a restricted kind since the functions vi(x) and ui(x) are generally
complete only with respect to a certain subspace of the function space.
However, this subspace coincides with the space in which the constrained
Green's function finds its place. Hence we have no difficulty on account of
the completeness of our functions. The difficulty arises from the fact that
G(x, £) may not be quadratically integrable or may be for other reasons too
unsmooth to allow an orthogonal expansion.
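The situation can be made concrete by a numerical sketch. The example below is not taken from the text: it uses the operator D = −d²/dx² on [0, 1] with v(0) = v(1) = 0, whose orthonormal eigenfunctions √2 sin kπx and eigenvalues (kπ)² are classical, and checks the bilinear expansion against the closed-form Green's function; the symmetry of the partial sums also illustrates the reciprocity theorem.

```python
import math

def green_exact(x, xi):
    # closed-form Green's function of D = -d^2/dx^2 on [0,1], v(0) = v(1) = 0
    return x*(1 - xi) if x <= xi else xi*(1 - x)

def green_bilinear(x, xi, n_terms):
    # bilinear expansion: sum over k of u_k(x) u_k(xi) / lambda_k,
    # with u_k(x) = sqrt(2) sin(k pi x) and lambda_k = (k pi)^2
    return sum(2.0*math.sin(k*math.pi*x)*math.sin(k*math.pi*xi) / (k*math.pi)**2
               for k in range(1, n_terms + 1))

x, xi = 0.3, 0.7
print(green_exact(x, xi), green_bilinear(x, xi, 1000))  # the series converges to 0.09
print(green_bilinear(xi, x, 1000))                      # symmetry: the reciprocity theorem
```

The 1/k² decay of the terms is what makes the series converge here; for a less smooth Green's function such convergence could not be taken for granted, as the text emphasises.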
In the domain of ordinary differential equations, however, such an
unsmoothness is excluded by the fact that the Green's function, considered
as a function of x, satisfies the homogeneous differential equation Dv(x) = 0
with the only exception of the point x = ξ. Hence G(x, ξ) is automatically
SEC. 5.27 THE BILINEAR EXPANSION 295
They have the common characteristic that they assume at x = 1 their maximum
value 1.
On the basis of the results of Section 23 obtain the following infinite
expansions:
Then the operator on the left side of (23.1) becomes −D̃Dv, which shows that
the eigenvalues of the shifted eigenvalue problem (26.2) associated with (28)
are equal to
Obtain the Green's function of the operator (28) for the range [0, 1], with the
boundary condition
Problem 239. Show that at the point of discontinuity ξ = 0 the series (32)
yields the arithmetic mean of the two limiting ordinates, and thus:
Problem 240. Obtain the Green's function and its bilinear expansion for the
following operator:
[Answer:
Problem 241. Solve the same problem for the boundary condition
[Answer:
Problem 242. Solve the same problem for the boundary condition
[Answer:
SEC. 5.28 HERMITIAN PROBLEMS 299
Problem 243. Consider the same problem with the boundary condition
This condition would be self-adjoint in the algebraic sense but is not self-
adjoint in the Hermitian sense since the adjoint boundary condition becomes
There is no change here since the imaginary unit does not occur anywhere.
However, the adjoint boundary condition—obtained in the usual fashion,
with the help of the extended Green's identity—becomes
and we see that our problem loses its self-adjoint character. Without this
change of i to −i, however, our eigenvalue problem would lose its significance
by not yielding real eigenvalues or possibly not yielding any eigensolutions
at all. On the other hand, we know in advance from the general analytical
theory that the shifted eigenvalue problem with the proper boundary
conditions will yield an infinity of real eigenvalues and a corresponding set
of eigenfunctions which, although complex in themselves, form an ortho-
normal set of functions in the sense that
In this section we will study the nature of such problems with the help
of an over-simplified model which is nevertheless instructive by demonstrating
and the λk must again become positive real numbers. Moreover, the shifted
eigenvalue problem yields
The two boundary conditions (27.56) and (8) give the following two conditions:
and thus
We see that in spite of the complex nature of α the eigenvalues λk become
real. In fact, we obtain once more the same system of eigenvalues as before
in (27.58):
where the upper sign holds for λk = 2kπ + λ0 and the lower sign for
λk = 2kπ − λ0. The amplitude factor Ak follows from the condition
with
The boundary conditions (12) establish the following relation between λ0,
γ and the original complex constant α:
(For the sake of formal simplicity we have departed from our usual convention
of consistently positive eigenvalues. If we want to operate with
consistently positive λk, we have to change the sign of λ, λ0, and γ for the
second group of eigenvalues which belong to the negative sign of the formula
(21).)
The Green's function can again be constructed along the usual lines.
However, in view of the complex elements of the operator (which in our
problem come into evidence only in the boundary conditions) some character-
istic modifications have to be observed. First of all, the Green's function
corresponds to the inverse operator and this feature remains unaltered even
in the presence of complex elements. Since the proper algebraic adjoint
is not the Hermitian adjoint D̃* but D̃, the definition of the Green's function
—considered as a function of ξ—must occur once more in terms of D̃:
while in the case of the adjoint Green's function G̃(x, ξ) the corresponding
equation becomes:
(Notice that the asterisk appears consistently in connection with the variable
ξ.) The bilinear expansion (27.7) becomes now modified as follows:
The procedure is similar with respect to the expansion of v(x) into the
ortho-normal eigenfunctions vi(x).
We will apply these formulae to our problem (8-9). Let us first construct
the Green's function associated with the given operator Dv = v'. The rule
304 THE GREEN'S FUNCTION CHAP. 5
(24) demonstrates that we obtain once more the result of Problem 243
(cf. 27.59), although the constant α is now complex:
Imaginary part:
If now we divide by A and form the sum, we notice that the result is
expressible in terms of two functions f(t) and g(t):
In order to identify the two functions f(t) and g(t), we will make use of
the fact that both systems vk(x) and uk(x) represent a complete ortho-normal
function system, suitable for the representation of arbitrary sectionally
continuous and differentiable functions. Let us choose the function
where
Since the imaginary part of the left side of (48) must vanish, we get
and taking out the constants cos θ0 and sin θ0 in the trigonometric sums
(49), we finally obtain for the two sums (41) and (42):
Now we return to our formulae (43), (44), substituting the proper values
for f(x) and g(x). Assuming that ξ < x (and x + ξ < 1), we obtain for (43):
Combining real and imaginary parts into one complex quantity we finally
obtain
in full accordance with the value of G(x, ξ) for ξ < x (cf. 34). If ξ > x,
the only change is that the first term of (53) changes its sign and we obtain
the correct value of G(x, ξ) for ξ > x.
Problem 244. In the above proof the restricting condition x + ξ < 1 was
made, although in fact x + ξ varies between 0 and 2. Complement the proof
by obtaining the values of f(t) and g(t) for the interval 1 < t < 2. (Hint: put
x = 1 + x′.) Show that at the point of discontinuity t = 1 the series yield
the arithmetic mean of the two limiting ordinates.
[Answer:
Problem 245. By specifying the values of x to 0 and ξ obtain from (51) and
(52) generalisations of the Leibniz series
[Answer:
In particular:
Problem 246. Consider α as purely imaginary, α = iω, and demonstrate for this
case the validity of the bilinear expansion of the second Green's function (38).
Problem 247. Obtain for the interval x = [0, 1] the most general Hermitian
operator of first order and find its Green's function G(x, ξ).
[Answer:
Boundary condition:
where
where
where p(x) is a monotonously increasing function, prove that the following set of
functions forms a complete Hermitian ortho-normal set in the interval [0, 1]:
Boundary condition:
where
and show that the expansion into the functions (68) is equivalent to the Fourier
series in its complex form.
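The functions (68) are not reproduced in this excerpt; as a stand-in, the ordinary complex exponentials e^(2πikx) on [0, 1] show what a Hermitian ortho-normal set looks like, with orthonormality in the sense of the Hermitian inner product ∫ f ḡ dx:

```python
import cmath

def phi(k, x):
    # stand-in for the set (68): complex Fourier exponentials on [0, 1]
    return cmath.exp(2j * cmath.pi * k * x)

def inner(f, g, n=20000):
    # Hermitian inner product: integral over [0, 1] of f times conj(g)
    h = 1.0 / n
    return h * sum(f((i + 0.5)*h) * g((i + 0.5)*h).conjugate() for i in range(n))

# orthonormality in the Hermitian sense
print(abs(inner(lambda x: phi(2, x), lambda x: phi(2, x))))   # about 1
print(abs(inner(lambda x: phi(2, x), lambda x: phi(3, x))))   # about 0
```

Expanding an arbitrary sectionally continuous function into these functions is exactly the Fourier series in its complex form, which is the equivalence the problem asserts.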
has exactly the same eigenfunctions as our previous problem, while the new
eigenvalues λ′i have changed by the constant amount ε:
with
It is only when we come to the solution v(x) and the limit process involved
in the gradual decrease of ε that the difficulties arise:
For every finite ε the solution is unique and finite. But this solution does
not approach any limit as ε converges to zero, except if all the β′j disappear,
which means the conditions
By this procedure we have done no harm to our problem since the second
equation is completely independent of the first one and can be solved by the
trivial solution
But now we will establish a weak coupling between the two equations by
modifying our system as follows:
that once more the eigenfunctions have remained unchanged while the
eigenvalues have changed by the constant amount ε, exactly as in (4).
Once more the previous eigenvalue λ = 0 has changed to the eigenvalue
λ = ε and the zero eigenvalue can be avoided by making ε sufficiently small.
Hence the previously incomplete operator becomes once more complete and
spans the entire U space and the entire V space. We know from the general
theory that now the right side can be given freely and the solution becomes
unique, no matter how small ε may be chosen. In matrix language we
have changed our original n × m matrix of rank p to an (n + m) ×
(n + m) matrix of rank n + m. The conditions of a "well-determined"
and "well-posed" problem are now fulfilled: the solution is unique and the
right side can be chosen freely.
The right side β(x) can be analysed in terms of the complete ortho-normal
function system ui(x), u′j(x):
where
while the solution u(x), v(x) can be analysed in terms of the complete
ortho-normal function systems ui(x), u′j(x), respectively vi(x), v′k(x):
Then the differential equation (15) establishes the following relation between
the expansion coefficients ai, bi on the one hand and βi, β′j on the other
The second equation shows that none of those eigenfunctions vk(x) appear in
the expansion (19) which are not represented in the operator Dv(x). The
normalisation we have employed before, namely to put the solution completely
into the activated V-space of the operator, is upheld by the perturbed
system (15) which keeps the solution constantly in the normalised position,
without adding components in the non-activated dimensions.
The new system includes the function u(x) on equal footing with the
function v(x). Now the first formula of (2) shows that the solution u(x)
is weakly excited in all the activated dimensions of the U-space and converges
to zero with ε going to zero. This, however, is not the case with respect to
the non-activated dimensions u′j(x). Here the first formula of (21) shows
that the solution increases to infinity with ε going to zero, except if the
compatibility conditions β′j = 0 are satisfied. Once more we approach our
problem from a well-posed and well-determined standpoint which does not
involve any constraints. These constraints have to be added, however, if
we want our solution to approach a definite limit with ε going to zero.
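A finite-dimensional sketch of the ε-method may be helpful. The coupled system used below, εu + Av = β with Aᵀu = εv, is an assumed matrix analogue of the perturbed system (15), not a transcription of it; A plays the role of the operator D and Aᵀ of D̃.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])        # rank 1: the second row is twice the first
beta_ok  = np.array([1.0, 2.0])   # compatible: lies in the range of A
beta_bad = np.array([1.0, 0.0])   # incompatible: has a component outside the range

def solve_eps(A, beta, eps):
    # assumed coupling: eps*u + A v = beta,  A^T u - eps*v = 0
    n, m = A.shape
    M = np.block([[eps*np.eye(n), A],
                  [A.T, -eps*np.eye(m)]])
    sol = np.linalg.solve(M, np.concatenate([beta, np.zeros(m)]))
    return sol[:n], sol[n:]       # the pair (u, v)

u, v = solve_eps(A, beta_ok, 1e-6)
print(v, np.linalg.pinv(A) @ beta_ok)   # v approaches the minimum-norm solution
u_bad, _ = solve_eps(A, beta_bad, 1e-6)
print(np.linalg.norm(u_bad))            # grows like 1/eps: compatibility violated
```

For a compatible right side v tends to the minimum-norm (pseudoinverse) solution as ε → 0, while for an incompatible right side the component of u outside the range of A grows like 1/ε, mirroring the compatibility conditions described in the text.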
These results can also be stated in terms of a Green's function which now
becomes a "Green's vector" because a pair of equations is involved. Since
the second equation of the system (15) has zero on the right side, only the
two components G1(x, ξ)1 and G2(x, ξ)1 are demanded:
SEC. 5.29 THE COMPLETION OF LINEAR OPERATORS 313
The formulae (20) and (21) establish the following bilinear expansions for
the two components of the Green's function:
with the perturbation (15), we need the other two components of the Green's
vector:
The relation
is once more fulfilled. We can, as usual, define the Green's function G2(x, ξ)1
by considering ξ as the active variable and solving the adjoint equation
D̃u(ξ) = δ(x, ξ). But in our case that equation takes the form
and we obtain a new motivation for the modification of the right side which
is needed in the case of a constrained system. The expression (25) for
G2(x, ξ)2 shows that the first term goes to zero while the second term is
proportional to 1/ε and thus εv(ξ) will contribute a finite term. If we write
(27) in the form
Hence we are back at the earlier equation (22.13) of Section 22, which
defined the differential equation of the constrained Green's function. We
see that the correction term which appears on the right side of the equation
can actually be conceived as belonging to the left side, due to the small
modification of the operator by the ε-method which changes the constrained
operator to a free operator and makes its Green's function amenable to the
general definition in terms of the delta function. Hence the special position
of a constrained operator disappears and returns only when we demand
that the solution shall approach a definite limit as ε converges to zero.
We will add one more remark in view of a certain situation which we
shall encounter later. It can happen that the eigenvalue λ = 0 has the
further property that it is a limit point of the eigenvalue spectrum. This
means that λ = 0 is not an isolated eigenvalue of the eigenvalue spectrum
but there exists an infinity of λk-values which come arbitrarily near to zero.
In this case we find an infinity of eigenvalues between 0 and ε, no matter
how small we may choose ε. We are then unable to eliminate the eigenvalue
λ = 0 by the ε-method discussed above.
The difficulty can be avoided, however, by choosing ε as purely imaginary,
that is by replacing ε by −iε. In this case the solution v(x) remains real,
while u(x) becomes purely imaginary. The Green's functions (23) now
become
Although the eigenvalues of the problem (16) have now the complex values
λk − iε, this is in no way damaging, as the expressions (30) demonstrate.
The eigenvalues of the modified problem cannot be smaller in absolute value
than |ε|, and the infinity of eigenvalues which originally crowded around
λ = 0 now crowd around the eigenvalue −iε but cannot interfere with the
existence of the Green's function and its bilinear expansion in the sense of
(30). We are thus able to handle problems—as we shall see later—for which
the ordinary Green's function method loses its significance, on account of
the limit point of the eigenvalue spectrum at λ = 0.
Problem 252. The Green's function (28.65) becomes unbounded for ω = 0, but
at the same time λ = 0 becomes an eigenvalue. Obtain for this case the proper
expression for the constrained Green's function.
[Answer:
BIBLIOGRAPHY
[1] Cf. {1}, pp. 351-96
[2] Cf. {3}, Chapter 3 (pp. 134-94)
[3] Cf. {7}, Part I, pp. 791-895
[4] Fox, C., An Introduction to the Calculus of Variations (Oxford University
Press, 1950)
[5] Kellogg, O. D., Foundations of Potential Theory (Springer, Berlin, 1929)
[6] Lanczos, C., The Variational Principles of Mechanics (University of Toronto
Press, 1949)
CHAPTER 6
COMMUNICATION PROBLEMS
6.1. Introduction
Even before the Green's function received such a prominent position in
the mathematical literature of our days, a parallel development took place
in electrical engineering, by the outstanding discoveries of the English engineer
O. Heaviside (1850-1925). Although his scientific work did not receive
immediate recognition—due to faulty presentation and to some extent also
due to personal feuds—his later influence on the theory of electric networks
was profound. The input-output relation of electric networks can be
conceived as an excellent example of the general theory of the Green's
function and Green's vector and the relation of Heaviside's method to the
standard Green's function method will be our concern in this chapter.
Furthermore, we shall include the general mathematical treatment of the
galvanometer problem, as an interesting example of a mathematically well-
defined problem in differential equations which has immediate significance
in the design of scientific instruments. This has repercussions also in the
fidelity problem of acoustical recording techniques.
It remains zero between a and ξ, and then jumps to the constant value 1
between ξ and b. Exactly as the pulse was a universal function which was
rigidly transported to the point ξ—which made δ(x, ξ) into δ(x − ξ)—the
SEC. 6.2 THE STEP FUNCTION AND RELATED FUNCTIONS 317
same can be said of the new function δ1(x, ξ), which can be written in the form
δ1(x − ξ), where the universal function δ1(t) is defined as follows:
The new building block is now the integral of the previous step function.
This new function δ2(t) is already continuous, although its tangent is
discontinuous at t = 0:
The boundary term of (7) becomes f'(a)(x — a) which yields the following
formula:
Now our function is put together with the help of straight line portions and
the ruggedness has greatly decreased. We will now succeed with a much
Here even the discontinuity of the tangent is eliminated and only the
curvature becomes discontinuous at t = 0. We now get by integrating by
parts
Once more we have a universal function δ3(t) which is shifted from point to
point, multiplied by the proper weight factor and the sum formed. But
now the base function is a parabolic arc which avoids discontinuity of either
function or tangent. The resulting curve is so smooth that a small number
of parabolic arcs can cover rather large portions of the curve. Since the
second derivative of δ3(t) is a constant, our construction amounts to an
approximation of f(x) in which f"(x) in a certain range is replaced by its
average value and the same procedure is repeated from section to section.
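Assuming the natural definitions suggested by the text—δ1 the unit step, δ2 its integral (a ramp), δ3 the integral of the ramp (a half-parabola)—the successive-integration relations between the building blocks can be checked numerically:

```python
def d1(t):
    # unit step: 0 for t < 0, 1 for t >= 0 (discontinuous at t = 0)
    return 1.0 if t >= 0 else 0.0

def d2(t):
    # ramp = integral of the step: continuous, with a kink at t = 0
    return t if t >= 0 else 0.0

def d3(t):
    # half-parabola = integral of the ramp: continuous tangent,
    # discontinuous curvature at t = 0
    return 0.5*t*t if t >= 0 else 0.0

def integrate(f, a, b, n=20000):
    # simple midpoint rule
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5)*h) for i in range(n))

print(integrate(d1, -1.0, 0.5), d2(0.5))   # both about 0.5
print(integrate(d2, -1.0, 0.5), d3(0.5))   # both about 0.125
```

Each integration smooths the building block by one order, which is why the parabolic arcs of δ3 can cover much larger portions of a curve than steps or straight lines.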
Problem 254. Obtain the three parabolic building blocks for the generation of
a function f(x) defined as follows:
[Answer:
Problem 255. Approximate the function f(x) = x3 in the range [0, 2] with the
help of two parabolic arcs in the interval x = [0, 1] and x = [1, 2], chosen in
such a way that the constants of the differential equation (15) shall coincide
with the second derivative of f(x) at the middle of the respective intervals.
[Answer:
and obtain
From this solution we can return to the standard solution in terms of the
Green's function G(x, ξ), if we integrate by parts with respect to β′(ξ):
This shows the following relation between Heaviside's Green's function and
the standard Green's function, which is the pulse response :
which comes about in consequence of the fact that the right side of (6)
vanishes throughout the given range if δ1(x, ξ) moves out into the end
point ξ = b.
Historically the Green's function defined with the help of the unit step
function rather than the unit pulse played an important role in the theoretical
researches of electrical engineering, since Heaviside, the ingenious originator
of the Green's function in electrical engineering, used consistently the unit
step function as input, instead of the unit pulse (i.e. the delta function).
He thus established the formula (9) instead of the formula (4). In the
later years of his life Heaviside became aware of the theoretical superiority
of the pulse response compared with the unit step function response. However,
the use of the unit step function became firmly established in
engineering, although in recent years the pulse response has been gaining
more and more ground in advanced engineering research.
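The connection between the two response functions can be illustrated on a toy system not taken from the text: the first-order lag v′ + v = β with v(0) = 0. Its step response is 1 − e^(−t), and differentiating it recovers the pulse response e^(−t), in the spirit of the relation (9):

```python
import math

# toy system (not from the text): first-order lag v' + v = beta(t), v(0) = 0
def step_response(t):
    # response to the unit step input
    return 1.0 - math.exp(-t)

def pulse_response(t):
    # response to the unit pulse (delta function) input
    return math.exp(-t)

# the pulse response is the time derivative of the step response
h = 1e-6
for t in (0.5, 1.0, 2.0):
    numeric = (step_response(t + h) - step_response(t - h)) / (2*h)
    print(t, numeric, pulse_response(t))
```

This is the practical bridge Heaviside exploited: the step response is what one can measure, and the (theoretically preferable) pulse response follows from it by differentiation.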
From the practical standpoint the engineer's preference for the unit step
function is well understandable. It means that at a certain time moment
t = 0 the constant voltage 1 is applied to a certain network and the output
observed. To imitate the unit pulse (in the sense of the delta function)
with any degree of accuracy is physically much less realisable than to
produce the unit step function and observe its effect on the physical system.
The pulse response is a much more elusive and, strictly speaking, only
theoretically available quantity.
There are situations, involving the motion of mechanical components,
when even the step function response is experimentally unavailable because
even the step function as input function is too unsmooth for practical
operations. Let us consider for example a servo-mechanism installed on an
aeroplane which coordinates the motion of a foot-pedal in the pilot's cockpit
and the induced motion of the rudder at the rear of the aeroplane. The
servo-mechanism involves hydraulic, mechanical, and electrical parts. To
use the step function as input function would mean that the foot-pedal is
pushed out suddenly into its extreme position. This is physically impossible
since it would break the mechanism. Here we have to be satisfied with a
Green's function which is one step still further removed from the traditional
Green's function, by applying the linear input function (2.8). The foot-
pedal is pushed out with uniform speed into its extreme position and the
response observed. This function G2(x, ξ) is the negative integral of the step
function response G1(x, ξ), considering ξ as the variable.
Problem 256. Obtain the solution of (3) in terms of G2(x, ξ) and G3(x, ξ), making
use of the formulae (2.9) and (2.13).
[Answer:
Problem 257. Obtain the relation of G2(x, ξ) and G3(x, ξ) to the standard
Green's function G(x, ξ).
[Answer:
SEC. 6.4 THE INPUT-OUTPUT RELATION OF A GALVANOMETER 323
because now the Green's function shares the property of the base function
δ(x − ξ) of becoming a function of the single variable t = x − ξ only:
The second equation tells us at once that the upper limit of integration
will not be b but x since the integrand vanishes for all £ > x. The lower
limit of the integral is zero since we started our observations with the time
moment x = 0.
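The convolution described here can be sketched numerically. The pulse response formula below is an assumption (the standard damped-oscillation solution of the normalised equation v″ + 2κv′ + v = δ), not quoted from the text; convolving it with a unit step input should reproduce the closed-form step response:

```python
import math

kappa = 0.5                       # an illustrative damping ratio
mu = math.sqrt(1.0 - kappa**2)

def pulse(t):
    # assumed pulse response of the normalised equation v'' + 2*kappa*v' + v = delta
    return math.exp(-kappa*t) * math.sin(mu*t) / mu if t >= 0 else 0.0

def output(x, beta, n=20000):
    # v(x) = integral_0^x G(x - xi) beta(xi) d xi, by the midpoint rule
    h = x / n
    return h * sum(pulse(x - (i + 0.5)*h) * beta((i + 0.5)*h) for i in range(n))

def step_response(t):
    # closed-form response to the unit step input, for comparison
    return 1.0 - math.exp(-kappa*t) * (math.cos(mu*t) + kappa*math.sin(mu*t)/mu)

x = 5.0
print(output(x, lambda xi: 1.0), step_response(x))   # the two agree
```

Because the kernel depends only on x − ξ, the integral is a genuine convolution, which is exactly what the vanishing of the integrand for ξ > x expresses.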
For the sake of formal simplification we will introduce a natural time
scale into the galvanometer problem by normalising the stiffness constant to
1. Furthermore, we will also normalise the output v(x)—the scale reading—
by introducing a proper amplitude factor. In order to continue with our
standard notations we will agree that the original x should be denoted by x̄,
the original v(x) by v̄(x). Then we put
With this transformation the original differential equation (4) (in which
x is replaced by x̄ and v by v̄) appears now in the form
Our problem depends now on one parameter only, namely on the "damping
ratio" κ, for which we want to introduce an auxiliary angle γ, defined by
If we write down the Green's function in the new variables, we obtain the
expression
Problem 258. Obtain the step function response G1(t) of the galvanometer
problem.
[Answer:
Problem 259. Obtain the linear response G2(t) and the parabolic response G3(t)
of the galvanometer problem.
[Answer:
is such that it will emphasise the region around t = 0 while the region of
very large values of t will be practically blotted out. The galvanometer
has a "memory" by retaining the earlier values fi(x — t) before the
instantaneous value β(x) but this memory is of short duration if the damping
ratio is sufficiently large. For very small damping the memory will be so
extended that the focusing power on small values of t is lost. In that case
we cannot hope for any resemblance between v(x) and β(x).
Apart from these general, more qualitative results we do not obtain much
information from our solution (4.10), based on a Green's function which
was defined as the pulse response. We will now make our input function
less extreme by using Heaviside's unit step function as input function.
Then the response appeared—for the normalised form (4.7) of the differential
equation—in the form (4.11). If we plot this function, we obtain a graph
of the following character:
From this graph we learn that the output G1(t), although it does not resemble
the input in the beginning, will eventually reproduce the input β(t) with a
gradually decreasing error. This shows that in our normalisation the
proportionality factor between v(x) and β(x) will be 1. The original v(x)
of the general galvanometer equation (4.4) had to be multiplied by p² in
order to give the new v(x). Hence in the original form of the equation the
proportionality factor of the output v(x) will become 1/p², in order to compare
it with the input β(x).
Still more information can be obtained if we use as input function the
linear function β(t) = t of Problem 259 (cf. 4.12). Here we obtain a graph
of the following character:
SEC. 6.6 FIDELITY DAMPING 327
From this graph we learn that the output—apart from the initial disturbance
—follows the input with a constant time lag of the amount
plus further terms which go to zero with increasing t. We now see that
there is indeed a distinguished value of γ, namely the value
which is only 71% of the critical damping. Now we have fidelity for any
curve of arbitrary parabolic arcs, which follow each other in intervals large
compared with the time 1/p (since there is a quickly damped disturbance
at the intersection of the arcs, due to the excitation of the eigen-vibration
of the galvanometer). It is clear that under such conditions the fidelity of
the galvanometer recording can be greatly increased since the parabolic
arcs, which approximate a function, can be put much further apart than
mere straight line sections which are too rigid for an effective approximation
of a smooth function. Hence we will denote the choice (3) of the damping
constant α as "fidelity damping".
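The constant time lag of the ramp response can be checked by direct simulation. The sketch assumes the normalised galvanometer equation v″ + 2κv′ + v = β(t) with κ = 1/√2 (the form suggested by Section 4, though not restated here) and measures the lag, which should settle at 2κ = √2:

```python
import math

kappa = 1.0 / math.sqrt(2.0)   # "fidelity damping": 71% of critical damping

def rhs(t, v, w, beta):
    # first-order system for v'' + 2*kappa*v' + v = beta(t), with w = v'
    return w, beta(t) - 2.0*kappa*w - v

def simulate(beta, t_end, dt=0.001):
    # classical RK4 integration from rest: v(0) = v'(0) = 0
    v = w = t = 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(t, v, w, beta)
        k2 = rhs(t + dt/2, v + dt/2*k1[0], w + dt/2*k1[1], beta)
        k3 = rhs(t + dt/2, v + dt/2*k2[0], w + dt/2*k2[1], beta)
        k4 = rhs(t + dt, v + dt*k3[0], w + dt*k3[1], beta)
        v += dt/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        w += dt/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
        t += dt
    return v

t_end = 30.0
lag = t_end - simulate(lambda t: t, t_end)   # ramp input beta(t) = t
print(lag, 2*kappa)   # lag settles at 2*kappa = sqrt(2) once the transient decays
```

The particular solution of the ramp-driven equation is v = t − 2κ, so after the eigen-vibration has died out only the constant lag remains, which is the behaviour the graph in the text displays.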
Problem 263. Obtain the parabolic response G3(t) of the galvanometer for
fidelity damping and demonstrate that at t = 0 function and derivatives vanish
up to (and inclusive of) the third derivative.
[Answer:
The time-lag for fidelity damping is 2κ = √2 and thus the differential
equation
We can make our error estimation still more accurate by allowing a certain
time-lag at which the third derivative β‴(x) is to be taken. Our error
estimation can thus be made accurate for any polynomial of order four.
Let us use as input function
SEC. 6.7 THE ERROR OF THE GALVANOMETER RECORDING 329
we obtain
Critical damping:
Problem 264. Given the stiffness constants p = 5 and p = 10. Calculate the
relative errors of the galvanometer response (disregarding the initial disturbance
caused by the excitation of the eigen-vibration), for the input signal
for both fidelity and critical damping. Compare these values with those
predicted by the error formulae (12) and (13).
[Answer:
Problem 265. Apply the method of this section to the error estimation of the
general case in which the damping ratio κ = α/p is arbitrary (but not near
to the critical value
[Answer:
their significance may be totally different and the coupling between the two
functions need not necessarily be established by a differential equation but
can be of a much more general character. Generally we cannot assume that
there is necessarily a close resemblance between β(x) and v(x). We may
have the mathematical problem of restoring the input function β(x) if v(x)
is given and this may lead to the solution of a certain integral equation.
But frequently our problem is to investigate to what extent we can improve
the resemblance of the output v(x) to the input β(x) by putting the proper
mechanism inside the black box.
becomes
where v1(x), v2(x), . . . , vn(x) are the outputs which correspond to β1(x),
β2(x), . . . , βn(x) as inputs.
Now we have seen before (cf. equations (5.4.13-15)) that an arbitrary
continuous function f(x) can be considered as a linear superposition of delta
functions. If now we obtain the response of the black box mechanism to
the delta function as input function, we can obtain the response to an
arbitrary input function β(x), in a similar manner as the solution of a linear
differential equation was obtained with the help of the Green's function
G(x, ξ):
Hence
SEC. 6.8 INPUT-OUTPUT OF LINEAR COMMUNICATION DEVICES 333
It is thus sufficient to observe the output G(x) which follows the unit pulse
input, applied at the time moment x = 0. This function is different from
zero for positive values of x only, while for all negative values G(x) is zero:
that it will not last forever but disappear eventually. It is possible that
strictly speaking G(t) will never become exactly zero but approach zero
asymptotically, as t grows to infinity:
If we know what the response of our device is to the complex input function
(1), we shall immediately have the response to both cos ωx and sin ωx as
input functions, by merely separating the real and imaginary parts of the
response.
Before carrying out the computation we will make a small change in the
formula (8.12). The limits of integration were 0 and x. The lower limit
came about in view of G(t) being zero for all negative values of t. The
upper limit x came about since we have started our observation at the time
moment x = 0. Now the input signal β(x) did not exist before the time
SEC. 6.9 FREQUENCY ANALYSIS 335
moment x = 0, which means that we can define β(x) as zero for all negative
values of x.
But then we need not stop with the integration at t = x but can continue
up to t = ∞:
The condition (2) automatically reduces the integral (3) to the previous
form (8.12). The new form has the advantage that we can now drop the
condition (2) and assume that the input β(x) started at an arbitrary time
moment before or after the time moment x = 0.
This will be important for our present purposes because we shall assume
that the periodic function (1) existed already for a very long—mathematically
infinitely long—time. This will not lead to any difficulties in view of the
practically finite memory time of our device.
We introduce now
The factor of e^(iωx) can be split into a real and imaginary part:
with
Moreover, we may write the complex number (6) in "polar form", with the
"amplitude" ρ(ω) and the "argument" θ(ω):
Now the output follows the input with the time-lag a but reproduces the
input with the proportionality factor ρ0. If these conditions hold for a
sufficiently large range of the frequency ω, the output will represent a
high-fidelity reproduction of the input, apart from the constant time-lag a
which for many purposes is not damaging.
regular and lawful function and we have no right to prescribe it freely, even
in an arbitrarily small interval.
If now we consider the integral (9.6) which defines the transfer function
we see that by the definition of the Laplace transform we obtain the funda-
mental relation
The transfer function F(ω), which determines the frequency response of our
device, is thus obtained as the Laplace transform of the pulse response, taken
along the imaginary axis.
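As a numerical check (assuming the normalised pulse response e^(−κt) sin(μt)/μ with μ = √(1 − κ²), a standard form not restated in this excerpt), the transfer function can be computed as the truncated Laplace transform of the pulse response along the imaginary axis; for fidelity damping the amplitude response should come out as (1 + ω⁴)^(−1/2):

```python
import cmath, math

kappa = 1.0 / math.sqrt(2.0)     # fidelity damping
mu = math.sqrt(1.0 - kappa**2)

def pulse_response(t):
    # assumed pulse response of v'' + 2*kappa*v' + v = delta(t)
    return math.exp(-kappa*t) * math.sin(mu*t) / mu

def transfer(omega, T=60.0, n=60000):
    # F(omega) = integral_0^infinity G(t) e^(-i omega t) dt, truncated at T
    h = T / n
    return h * sum(pulse_response((i + 0.5)*h) * cmath.exp(-1j*omega*(i + 0.5)*h)
                   for i in range(n))

for omega in (0.0, 0.5, 1.0):
    print(omega, abs(transfer(omega)), (1 + omega**4)**-0.5)
```

The truncation at T is harmless because the pulse response has decayed to negligible size by then — the "practically finite memory time" invoked later in the chapter.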
Problem 266. Consider the pulse response of a galvanometer given by (4.9).
Obtain the Laplace transform of this function and show that the singularity of
L(p) lies in the negative half plane, no matter what the value of the damping
ratio κ = cos γ (between 0 and ∞) may be.
Problem 267. Obtain the transfer function (3) and study it from the standpoint
of the fidelity problem. Explain the special role of the value κ = 1/√2 (fidelity
damping).
[Answer:
The amplitude response for the critical value becomes (1 + ω⁴)^(−1/2), the
distortion being of fourth instead of second order in ω.
Problem 268. Show that in the case of fidelity damping the maximum error
of the galvanometer response for any ω between 0 and 1/(2√2) does not surpass
2%, and that the error prediction on the basis of (7.12) in this frequency range
holds with an accuracy of over 97.5%.
Problem 269. Show that from the standpoint of smallest phase distortion the
value κ = √3/2 (γ = π/6 = 30°) represents the most advantageous damping
ratio.
Problem 270. Assuming that ω varies between 0 and 1/(4κ) (κ > ½), show that
the maximum phase distortion of the galvanometer in that frequency range
does not surpass the value
The error thus induced in F(ω) can be estimated by the well-known integral
theorem
where
Problem 271. Find the memory time of a galvanometer with critical damping
if we desire that the error limit (11.5) shall be 3% of the total area under G(t):
[Answer: T = 5.36.]
Problem 272. Consider the Laplace transform L(p) of the function
Consider on the other hand the same transform L1(p) but integrating only up to
t = 4. Show that the analytical behaviour of L1(p) is very different from that
of L(p) by being free of singularities in the entire complex plane, while L(p) has
a pole at the points p = −1 ± i…. And yet, in the entire right half plane,
including the imaginary axis p = iω, the relative error of L1(p) does not exceed
2%.
is presented to the human ear, the ear is not sensitive to the presence of
the phase angle θ. It cannot differentiate between the input function
sin ωx and the input function sin(ωx − θ). From this fact the inference
is drawn that the phase shift induced by the complex transfer function F((a)
is altogether of no importance. Since it makes no difference whether v(x) is
presented to the ear or that superposition of sin tax and cos wx functions
which in their sum are equivalent to v(x), it seems that we can completely
discard the investigation of the phase-shift. The superposition principle
holds, the ear does not respond to the phase of any of the components,
hence it is altogether immaterial what phase angles are present in the output
v(x). According to this reasoning the two functions
and
are equivalent as far as our hearing goes. The ear will receive the same
impression whether v(x) or v₁(x) is presented to it.
This reasoning is in actual fact erroneous. The experimental fact that
our perception is insensitive to the phase angle θ holds only if sustained
musical sounds are involved. It is true that any combination of sine or
cosine functions which are perceived as a continuous musical sound can be
altered freely by adding arbitrary phase angles to every one of the com-
ponents. But we have "noise" phenomena which are not of a periodic
340 COMMUNICATION PROBLEMS CHAP. 6
kind and which are not received by the ear as musical sounds. Such
"noise" sequences can still be resolved into a superposition of steady state
sine and cosine functions which have no beginning and no end. But the
ear perceives these noises as noises and the steady state analysis is no
longer adequate, although it is a mathematical possibility. Now it is no
longer true that the periodic components into which the noise has been
resolved can be altered by arbitrary phase shifts. Such an alteration would
profoundly influence the noise and the perception of the noise. For noise
phenomena which represent a succession of transients and cannot be
perceived as a superposition of musical notes, the phase angle becomes of
supreme importance. A phase shift which is not merely proportional to ω
(giving a mere time-lag) will have no appreciable effect on the strictly
"musical" portions of recorded music and speech but a very strong effect
on the "transient" portions of the recording.
Even in pure music the transient phenomena are by no means of
subordinate importance. The sustained tone of a violin has a beginning
and an end. The ear receives the impression of a sustained tone after the
first cycles of tone generation are over but the transition from one tone to
the other represents a transient phenomenon which has to be considered
separately. The same holds for any other instrument. It is the experience
of many musically trained persons that they recognise the tone of a certain
type of instrument much more by the transitions from tone to tone than by
the sustained notes. The older acoustical investigations focused attention
almost exclusively on the distribution of "overtones" in a sustained note.
Our ear perceives a musical sound under the aspects of
"loudness", "pitch", and "tone quality". The loudness or intensity of
the tone is determined by the amplitude of the air pressure vibrations which
create in the ear the impression of a tone. The "pitch" of the tone is
determined by the frequency of these vibrations. The "tone quality", i.e.
the more or less pleasant or harmonious impression we get of a musical sound,
is determined by the distribution of overtones which are present on account
of the tone-producing mechanism. In strictly periodic tone excitement the
overtones are in the frequency ratios 1:2:3:..., in view of the Fourier
analysis to which a periodic function of time can be submitted. The
presence of high overtones can give the sound a harsh and unpleasant
quality. We must not forget, however, that generally the amplitude of the
high overtones decreases rapidly and that in addition the sensitivity of our
ear to high frequencies decreases rapidly. Hence it is improbable that in
the higher frequency range anything beyond the third or fourth overtone
is of actual musical significance. In the low notes we may perceive the
influence of overtones up to the order 6 or 7. It would be a mistake to
believe, however, that we actually perceive the overtones as separate tones
since then the impression of the tone would be one of a chord rather than
that of a single tone. It is the weakness of the overtones which prevents
the ear from hearing a chord but their existence is nevertheless perceived
and recorded as "tone quality".
SEC. 6.12 STEADY STATE ANALYSIS OF MUSIC AND SPEECH 341
music. This graph can be of a rather irregular shape and the question arises
how to obtain a faithful reproduction of it.
That this reproduction cannot be perfect, is clear from the outset.
Perfect reproduction would mean that a delta function as input is recorded
as a delta function as output, except for a certain constant time-lag. This
is obviously impossible, because of the inertia of the mechanical components
of the instrument. In fact we know that even the much more regular unit
step function cannot be faithfully reproduced since no physical instrument
can make a sudden jump. Nor is such an absolutely faithful reproduction
necessary if we realise that our ear itself is not a perfect recording instrument.
No recording instrument can respond to a definite local value of the input
signal without a certain averaging over the neighbouring values. Our ear
is likewise a recording instrument of the integrating type and thus unable to
perceive an arbitrarily rugged f(t). A certain amount of smoothing must
characterise our acoustical perception, and it is this smoothing quality of
the ear on which we can bank if we want to set up some reasonable standards
in the fidelity analysis of noise.
We recognise from the beginning that the reproduction of a given rather
complicated noise profile of short duration will be a much more exacting
task than the recording of musical sounds of relatively low frequencies.
But the question is how far have we to go in order to reproduce noise with
a fidelity which is in harmony with the high, but nevertheless not arbitrarily
high, capabilities of the human ear.
This seems to a certain degree a physiological question, but we can make
some plausible assumptions towards its solution. We will not go far wrong
if we assume that the ear smooths out the very rugged peaks of an input
function by averaging. This averaging is in all probability very similar
to that method of "local smoothing" that we have studied in the theory
of the Fourier series and that we could employ so effectively towards an
increased convergence of the Fourier series by cutting down the contribution
of the terms of very high frequency. The method of local averaging attaches
to the vibration e^(iωt) the weight factor
theory of the Fourier series, caused by local smoothing (cf. 2.13.6). The
smoothing time τ can thus be established by the relation
or, returning to the general time scale of the galvanometer equation (4.4):
With advancing age the smoothing time increases, partly due to an increase
in the damping constant a, and partly due to a decrease in the stiffness
constant p, considering the gradual relaxation of the elastic properties of
living tissues. This explains why the frequency range to which the ear is
sensitive shrinks with advancing age.
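The attenuation produced by such an averaging is easy to exhibit numerically: replacing a vibration cos ωt by its moving average over a window of width τ multiplies it by the factor sin(ωτ/2)/(ωτ/2). A small sketch (the rectangular window is, of course, an idealised assumption):

```python
import math

def local_average(f, t, tau, n=20000):
    """Moving average of f over [t - tau/2, t + tau/2] (trapezoidal rule)."""
    h = tau / n
    s = 0.5 * (f(t - tau / 2.0) + f(t + tau / 2.0))
    for k in range(1, n):
        s += f(t - tau / 2.0 + k * h)
    return s * h / tau

omega, tau, t = 7.0, 0.8, 0.3
smoothed = local_average(lambda s: math.cos(omega * s), t, tau)
x = omega * tau / 2.0
factor = math.sin(x) / x                 # predicted attenuation factor
assert abs(smoothed - factor * math.cos(omega * t)) < 1e-6
```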
We often treat the phenomenon of being "hard of hearing" as a decline in the
hearing nerve's ability to perceive a certain sound intensity. The purpose of the
hearing aid is to amplify the incoming sound, thus counteracting the
reduction of sound intensity caused by the weakening of the hearing nerve.
A second factor is often omitted, although it is of no smaller importance.
This is the reduced resolution power of the ear in perceiving a given noise
profile (1). As the smoothing time τ increases, the fine kinks of the noise
profile (1) become more and more blurred. Hence it becomes increasingly
difficult to identify certain consonants, with their characteristic noise
profiles. If in a large auditorium the lecturer gets the admonition "louder"
from the back benches, he will not only raise his voice but he will instinctively
talk slower and more distinctly. This makes it possible to recognise certain
noise patterns which would otherwise be lost, due to smoothing. If we
listen to a lecture delivered in a language with which we are not very
familiar, we try to sit in the front rows. This has not merely the effect of
higher tone intensity. It also contributes to the better resolution of noise
patterns which in the back seats are blurred on account of the acoustical
echo-effects of the auditorium. (Cockney English, however, cannot be
resolved by this method.)
From this discussion certain consequences can be drawn concerning the
fidelity analysis of noise patterns. In the literature of high fidelity sound
reproducing instruments we encounter occasional claims of astonishing
magnitude. One reads for example that the "flat top amplitude characteristics"
(i.e. lack of any appreciable amplitude distortion) have been extended
to 100,000 cycles per second. Even if this is technically possible, the question
can be raised whether it is of any practical necessity (the distance travelled
by the needle in the playing of a 12-in. record during the time of
1/100,000 = 10⁻⁵ second is not more than 0.005 mm). Taking into account
the limited resolution power of the human ear, what are the actual fidelity
requirements in the reproduction of noise patterns? Considering the
inevitable smoothing operation of the ear, it is certainly unnecessary to try
to reproduce all the kinks and irregularities of the given noise profile since
SEC. 6.13 TRANSIENT ANALYSIS OF NOISE PHENOMENA 345
the small details are obliterated anyway. In this situation we can to
some extent relax the exact mathematical conditions usually observed in the
generation of functions. In Section 2 we have recognised the unit step
function as a fundamental building block in the generation of f(x). The
formula (2.5) obtained f(x) by an integration over the displaced unit step
function. This integral could be conceived as the limit of the sum (2.6),
reducing the Δxᵢ to smaller and smaller values. In the presence of smoothing,
We have used 12 units for the horizontal part of the curve, in order to reduce
the disturbance at the beginning and the end of the linear section to
practically negligible amounts. Hence we have solved the problem of
faithfully reproducing a noise pattern which is composed of straight line
sections of the duration τ. Since the units used were normalised time units,
the return to the general galvanometer equation (4.4) establishes a stiffness
constant p which stands in the following relation to the smoothing time τ:
This result indicates that we need not extend the fidelity requirement in
amplitude and phase beyond the limit 1/τ.
A similar result is deducible from a still different approach. We have
seen in the treatment of the Fourier series that the infinite Fourier series
could be replaced with a practically small error by a series which terminates
with n terms, if we modify the Fourier coefficients by the sigma factors and
at the same time replace f(x) by f̄(x), obtained by local smoothing. The
smoothing time of the local averaging process was
or, expressing once more everything in terms of cycles per second, the last
term of the finite Fourier series becomes aₙ cos 2πνx + bₙ sin 2πνx while τ
becomes 1/ν. We may equally reverse our argument and ask for the
smoothing time τ which will make it possible to obtain f(x) with sufficient
accuracy by using frequencies which do not go beyond ν. The answer is τ = 1/ν.
Here again we come to the conclusion that it is unnecessary to insist on
the fidelity of amplitude and phase response beyond the upper limit ν = 1/τ.
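The effect of the sigma factors can be illustrated on the classical test case of a square wave, whose truncated Fourier series exhibits the Gibbs overshoot of about 18%; damping the k-th harmonic by sin(kπ/n)/(kπ/n) reduces the overshoot to the order of one percent. A sketch (the square wave is our choice of example, not taken from the text):

```python
import math

def partial_sum(x, n, use_sigma):
    """Truncated Fourier series of the square wave sign(sin x); with
    use_sigma each harmonic k is damped by sin(k*pi/n)/(k*pi/n)."""
    s = 0.0
    for k in range(1, n, 2):             # odd harmonics only
        term = 4.0 / math.pi * math.sin(k * x) / k
        if use_sigma:
            u = k * math.pi / n
            term *= math.sin(u) / u
        s += term
    return s

n = 41
xs = [i * math.pi / 2000.0 for i in range(1, 1000)]   # half period of the wave
overshoot_plain = max(partial_sum(x, n, False) for x in xs) - 1.0
overshoot_sigma = max(partial_sum(x, n, True) for x in xs) - 1.0
assert overshoot_plain > 0.15        # Gibbs overshoot (~18%) persists
assert overshoot_sigma < 0.05        # sigma smoothing nearly removes it
```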
The result of our analysis is that the apparently very stringent fidelity
requirements of a transient recording are in fact not as stringent as we
thought in the first moment. If amplitude and phase distortion can be
avoided up to about 10 or perhaps 15 thousand cycles per second, we can be
pretty sure that we have attained everything that can be expected in the
realm of high-fidelity sound reproduction. The great advances made in
recent years in the field of high-fidelity equipment are not so much due to a
spectacular extension of the amplitude fidelity to much higher frequencies
as to a straightening out of the phase response, which is of vital importance
for the high-fidelity reproduction of noise, although not demanded for the
recording of sustained sounds. The older instruments suffered from strong
phase distortion in the realm of high frequencies.
distortion up to frequencies of 10,000 cycles per second has improved the
reproduction of the transients of music and speech to an admirable degree.
CHAPTER 7
S T U R M - L I O U V I L L E PROBLEMS
7.1. Introduction
There exists an infinite variety of differential equations which we may
want to investigate. Certain differential equations came into focus during
the evolution of mathematical theories owing to their vital importance in
the description of physical phenomena. The "potential equation", which
involves the Laplacian operator Δ, is in the foremost line among these
equations. The separation of the Laplacian operator in various types of
coordinates unearthed a wealth of material which demanded some kind of
universal treatment. This was found during the nineteenth century,
through the discovery of orthogonal expansions which generalised the out-
standing properties of the Fourier series to a much wider class of functions.
The early introduction of the astonishing hypergeometric series by Euler
was one of the pivotal points of the development. Almost all the important
function classes which came into use during the last two centuries are in
some way related to the hypergeometric series. These function classes are
characterised by ordinary differential equations of the second order. The
importance of these function classes was first discovered by two French
mathematicians, J. Ch. F. Sturm (1803-55) and J. Liouville (1809-82).
These special types of differential operators are thus referred to as belonging
SEC. 7.2 DIFFERENTIAL EQUATIONS OF FUNDAMENTAL SIGNIFICANCE 349
where A(x), B(x), C(x) are given functions of x. We will assume that A(x)
does not go through zero in the domain of investigation since that would
lead to a "singular point" of the differential equation at which generally
the function v(x) goes out of bounds. It can happen, however, that A(x)
may become zero on the boundary of our domain.
We will enumerate a few of the particularly well investigated and
significant differential equations of pure and applied analysis. The majority
of the fundamental problems of mathematical physics are in one way or
another related to these special types of ordinary second-order differential
equations.
1. If
where the index p, called the "order of the Bessel function", may be an
integer, or in general any real or even complex number.
3. Mathieu's differential equation
This differential equation embraces many of the others since almost all
functions of mathematical physics are obtainable from the hypergeometric
series by the proper specialisation of the constants α, β, γ, and the proper
transformation of the variable x.
5. The differential equation of the Jacobi polynomials, obtained from the
Gaussian differential equation by identifying α with a negative integer −n:
9. Still another class of polynomials, associated with the range [0, ∞], is
established by the "differential equation of Laguerre" :
is associated with the range [−∞, +∞] and defines the "Hermitian
polynomials" Hₙ(x). These polynomials are limiting cases of the
ultraspherical polynomials (10), by letting γ go to infinity and
correspondingly changing x in the following sense:
Problem 274. By the added substitution x = x\a show that the function
Problem 275. By making the further substitution v(x) = x^(−p)v₁(x) show that the
function
Problem 279. Obtain the general solution of Bessel's differential equation for
the order p = n − ½ (n an arbitrary integer) in terms of elementary functions
as follows:
if we examine the procedure of Chapter 4.6, we shall notice that the correla-
tion of the matrix A to the operator D is not unique. It depends on the
method by which the continuum is atomised. For example the differential
operator was translated into the matrix (4.6.11). In this translation the Δxᵢ
of the atomisation process were considered as constants and put equal to ε.
If these Δxᵢ varied from point to point, the associated matrix would be quite
different. In particular, the originally symmetric matrix would now
become non-symmetric. Now a variation of the Δxᵢ could also be conceived
as keeping them once more as constants, but changing the independent
variable x to some other variable t by the transformation
because then
A uniform change Δtᵢ = ε of the new variable t no longer yields a
corresponding uniform change in x. Since the properties of the associated
matrix A are of fundamental importance in the study of the differential
operator D, we see that a transformation of the independent variable x
into a new variable t according to (1) is not a trivial but in fact an essential
operation. It happens to be of particular importance in the study of
second order operators.
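The dependence of the matrix on the atomisation can be made concrete with the ordinary 3-point difference formula for d²/dx². On a uniform grid the resulting matrix is symmetric; on a nonuniform grid (here the mapping x = t², one possible choice) it is not. A minimal sketch:

```python
def second_difference_matrix(x):
    """3-point finite-difference matrix for d^2/dx^2 on the interior
    points of the (possibly nonuniform) grid x[0..n-1]."""
    n = len(x)
    A = [[0.0] * (n - 2) for _ in range(n - 2)]
    for i in range(1, n - 1):
        hl, hr = x[i] - x[i - 1], x[i + 1] - x[i]
        r = i - 1
        if r > 0:
            A[r][r - 1] = 2.0 / (hl * (hl + hr))
        A[r][r] = -2.0 / (hl * hr)
        if r < n - 3:
            A[r][r + 1] = 2.0 / (hr * (hl + hr))
    return A

def is_symmetric(A, tol=1e-12):
    return all(abs(A[i][j] - A[j][i]) < tol
               for i in range(len(A)) for j in range(len(A)))

uniform = [0.1 * k for k in range(11)]
stretched = [(k / 10.0) ** 2 for k in range(11)]   # x = t^2 mapping
assert is_symmetric(second_difference_matrix(uniform))
assert not is_symmetric(second_difference_matrix(stretched))
```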
We have seen in Chapter 5.26 that the eigenfunctions uᵢ(x), vᵢ(x), which
came about by the solution of the shifted eigenvalue problem (5.26.2), had
the orthogonality property
(With a similar relation for the functions vᵢ.) This modified orthogonality
property can be avoided if we use x instead of t as our independent variable.
But occasionally we have good reasons to operate with the variable t rather
than x. For example the functions uᵢ(x) may become polynomials if
expressed in the variable t, while in the original variable x this property
would be lost.
Under these circumstances it will be of advantage to insist on a complete
freedom in choosing our independent variable x, allowing for an arbitrary
transformation to any new variable t. For this purpose we can survey our
previous results and every time we encounter a dx, replace it by φ′(t)dt.
Since, however, we would like to adhere to our standard notation x for our
independent variable, we will prefer to call the original variable t and the
transformed variable x. Hence we prefer to re-write (4) in the form
We will call the identity (9), which defines the adjoint operator D̃ with
respect to the weight factor w(x), the "weighted Green's identity". We
see that the definition of the adjoint operator is vitally influenced by the
choice of the independent variable. If we transform the independent
variable t to a new variable x, and thus express Dv in terms of x, we obtain
the adjoint operator D̃—likewise expressed in the new variable x—by
defining this operator with the help of the weighted Green's identity (9).
The definition of the Green's function G(x, ξ) is likewise influenced by the
weight factor w(x), because the delta function δ(x, ξ) is not an invariant of
a coordinate transformation. The definition of δ(x − ξ) = δ(t) demanded
that
SEC. 7.3 THE WEIGHTED GREEN'S IDENTITY 355
and the definition of the Green's function must occur on the basis of the
equation (cf. 5.4.12)
or, if we want to define the same function in terms of the original operator
(cf. 5.12.15):
remains unchanged and the construction of the Green's function with the
help of the bilinear expansion (5.27.7) is once more valid:
(We exclude zero eigenvalues since we assume that the given problem is
complete and unconstrained.) However, in the solution with the help of
the Green's function the weight factor w(x) appears again:
Formulate the given problem in the new variable x and solve it with the help
of the weight factor
We can dispose of the weight function w(x) in such a way that the general
SEC. 7.4 SECOND-ORDER OPERATORS IN SELF-ADJOINT FORM 357
operator (2.1) becomes transformed into the self-adjoint form (2). For
this purpose we must define the functions A₁ and w according to the following
conditions:
and thus
Here we have the weight factor which makes the general second order operator
(2.1) self-adjoint.
The boundary term of the weighted Green's identity becomes
Then the four constants of these relations can be chosen freely, except for
the single condition:
and let us go with ε toward zero. Then the quantity p₁q₂ − p₂q₁ is reduced
to p₁ε, while q₁ is freely at our disposal. Hence in the limit, as ε goes to
zero, we obtain the boundary condition
where ν is arbitrary. This condition now takes the place of the second
condition (9). At the same time, as ε goes to zero, p₁ must go to infinity.
This implies that the first condition (9) becomes in the limit:
Hence the boundary conditions (12) and (13)—with arbitrary μ and ν—are
permissible self-adjoint boundary conditions.
Our operator is now self-adjoint, with respect to the weight factor w(x).
Hence the shifted eigenvalue problem (3.16) becomes simplified to
in view of the fact that the functions uᵢ(x) and vᵢ(x) coincide. The resultant
eigensolutions form an infinite set of ortho-normal functions, orthogonal
with respect to the weight factor w(x):
While generally the eigenvalue problem (14) need not have any solutions,
and, even if the solutions exist, the λᵢ will generally be complex numbers,
the situation is quite different with second order (ordinary) differential
operators, provided that the boundary conditions satisfy the condition (10).
Here the eigenvalues are always real, the eigensolutions exist in infinite
number and span the entire action space of the operator. The orthogonality
of the eigenfunctions holds, however, with respect to the weight factor
w(x), defined by (7).
Problem 281. Obtain the weight factor w(x) for Mathieu's differential operator
(2.5) and perform the transformation of the independent variable explicitly.
[Answer:
The boundary condition is nevertheless present through the demand that v(x)
must not grow to infinity more strongly than e^(x/2).
Problem 283. Show that boundary conditions involving the point a alone (or
b alone) cannot be self-adjoint.
SEC. 7.5 TRANSFORMATION OF THE DEPENDENT VARIABLE 359
Problem 284. Find the condition, under which a self-adjoint periodic solution
becomes possible.
[Answer:
We are back at Green's identity without any weight factor. The trans-
formation (2), (3) absorbed the weight factor w(x) and the new operator has
become self-adjoint without any weighting. The solution of the eigenvalue
problem
For the purpose of solving the differential equation, however, these eigen-
functions are equally applicable, and may result in a simpler solution
method.
As an illustrative example we will consider the following differential
equation:
which is a special case of the differential equation (2.20), and thus solvable
in the form
(We shall see in Section 12 that the second fundamental solution of Bessel's
differential equation, viz. Yₖ(x), becomes infinite at x = 0 and is thus
ineligible as eigenfunction.) The boundary condition (11) demands
and this means that √λ has to be identified with any of the zeros of the
Bessel function of the order k (that is those x-values at which Jₖ(x) vanishes).
If these zeros are called xₘ, we obtain for λₘ the selection principle
There is an infinity of such zeros, as we must expect from the fact that the
eigenvalue problem of a self-adjoint differential operator (representing a
symmetric matrix of infinite order) must possess an infinity of solutions.
The weight function w(x) of our eigenvalue problem becomes (cf. 4.7)
The solution of the differential equation (10) can now proceed by the standard
method of expanding in eigenfunctions. First we expand the right side in
our ortho-normal functions:
where
Then we obtain the solution by a similar expansion, except for the factor
λᵢ^(−1):
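The weight-x orthogonality that underlies this expansion can be verified numerically. The sketch below builds J₀ from its power series, locates its first two zeros by bisection, and checks that the integral of x·J₀(α₁x)·J₀(α₂x) over [0, 1] vanishes (the order k = 0 is our choice of example):

```python
import math

def J0(x):
    """Bessel function J_0 by its power series (fine for moderate |x|)."""
    term, s, m = 1.0, 1.0, 0
    while abs(term) > 1e-16:
        m += 1
        term *= -(x * x / 4.0) / (m * m)
        s += term
    return s

def bisect(f, a, b, it=80):
    for _ in range(it):
        m = 0.5 * (a + b)
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return 0.5 * (a + b)

a1 = bisect(J0, 2.0, 3.0)     # first zero, ~2.4048
a2 = bisect(J0, 5.0, 6.0)     # second zero, ~5.5201

def integral(f, n=2000):
    """Simpson rule on [0, 1]."""
    h = 1.0 / n
    s = f(0.0) + f(1.0)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(k * h)
    return s * h / 3.0

# weight-x orthogonality of the eigenfunctions J_0(a_m * x) on [0, 1]
cross = integral(lambda x: x * J0(a1 * x) * J0(a2 * x))
norm = integral(lambda x: x * J0(a1 * x) ** 2)
assert abs(a1 - 2.40483) < 1e-4 and abs(a2 - 5.52008) < 1e-4
assert abs(cross) < 1e-8 and norm > 0.1
```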
The new weight factor is w(x) = x² and we will make our equation
self-adjoint by multiplying through by x²:
and thus
and
The new eigenfunctions oscillate with an even amplitude, which was not
the case with our earlier solutions Jₖ(x). In fact, we recognise in the new
solution of the eigenvalue problem the Fourier functions, if we introduce
instead of x a new variable t by putting
the expansion of the right side into eigenfunctions becomes a regular Fourier
sine analysis of the function β₁(eᵗ).
The freedom of transforming the function v(x) by multiplying it by a
proper factor, plus the freedom of multiplying the given differential equation
by a suitable factor, can thus become of great help in simplifying our task
of solving a given second order differential equation. The eigenfunctions
and eigenvalues are vitally influenced by these transformations, and we may
wonder what eigenfunction system we may adopt as the "proper" system
associated with a given differential operator. Mathematically the answer
is not unique but in all problems of physical significance there is in fact
a unique answer because in these problems—whether they occur in hydro-
dynamics, or elasticity, or atomic physics—the eigensolutions of a given
physical system are determined by the separation of a time dependent
differential operator with respect to the time t, thus reducing the problem to
an eigenvalue problem in the space variables. We will discuss such problems
in great detail in Chapter 8.
Problem 285. Transform Laguerre's differential equation (2.16) into a self-
adjoint form, with the help of the transformation (2), (3).
[Answer:
Problem 288. Making use of the differential equation (2.23) find the normalised
eigenfunctions and eigenvalues of the following differential operator:
This means
It will thus be sufficient to construct the Green's function for the self-adjoint
equation (4).
SEC. 7.6 GREEN'S FUNCTION OF SECOND-ORDER DIFFERENTIAL EQUATION 365
Green's identity now becomes
The solution of the two simultaneous algebraic equations (9) and (10) for
the constants C₁ and C₂ yields:
Since the point ξ can be chosen arbitrarily, no matter how we have fixed
the point x, we obtain the condition
We have obtained this result from the symmetry of the Green's function
G(x, ξ) but we can deduce it more directly from the weighted Green's identity
which is exactly the relation we need for the symmetry of G(x, ξ).
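For the simplest self-adjoint example, −u″ = f on [0, 1] with u(0) = u(1) = 0, the symmetry can be seen explicitly: the Green's function is G(x, ξ) = x(1 − ξ) for x ≤ ξ and ξ(1 − x) for x ≥ ξ. A minimal numerical check (this elementary operator is our own illustration):

```python
def G(x, xi):
    """Green's function of -u'' = f on [0, 1] with u(0) = u(1) = 0."""
    return x * (1.0 - xi) if x <= xi else xi * (1.0 - x)

# symmetry G(x, xi) = G(xi, x)
pts = [0.13, 0.4, 0.77, 0.92]
for x in pts:
    for xi in pts:
        assert abs(G(x, xi) - G(xi, x)) < 1e-15

# and it really inverts the operator: for f = 1 the solution of
# -u'' = 1, u(0) = u(1) = 0 is u(x) = x(1 - x)/2
n = 4000
for x in [0.25, 0.5, 0.8]:
    u = sum(G(x, (k + 0.5) / n) for k in range(n)) / n   # midpoint rule
    assert abs(u - x * (1.0 - x) / 2.0) < 1e-6
```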
An important consequence of this result can be deduced if we consider
the equation
as a differential equation for v₂(x), assuming that v₁(x) is given. We can
integrate this equation by the method of the "variation of constants".
We put
and thus
and
if
Problem 292. Consider Laguerre's differential equation (2.16) which defines the
polynomials Lₙ(x). Show that the second solution of the differential equation
goes for large x to infinity with the strength eˣ.
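For n = 0 this can be checked by reduction of order: with v₁ = 1, the equation x v″ + (1 − x) v′ = 0 gives v₂′ = eˣ/x, so v₂(x) = ∫₁ˣ eᵗ/t dt, which indeed grows like eˣ/x. A numerical sketch (the lower limit 1 is an arbitrary choice):

```python
import math

def v2(X, n=100000):
    """Second Laguerre solution for n = 0: integral of e^t/t from 1 to X,
    computed by Simpson's rule (n must be even)."""
    h = (X - 1.0) / n
    s = math.exp(1.0) / 1.0 + math.exp(X) / X
    for k in range(1, n):
        t = 1.0 + k * h
        s += (4 if k % 2 else 2) * math.exp(t) / t
    return s * h / 3.0

X = 30.0
ratio = v2(X) / (math.exp(X) / X)
assert 1.0 < ratio < 1.1     # v2 grows like e^x/x for large x
```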
Problem 293. The proof of the symmetry of the Green's function, as discussed
in the present section, seems to hold universally while we know that it holds
only under self-adjoint boundary conditions. What part of the proof is
invalidated by the not-self-adjoint nature of the boundary conditions?
[Answer: As far as v₁(ξ) and v₂(ξ) go, they are always solutions of the
homogeneous equation. But the dependence on x need not be of the form
f(x)v(ξ), as we found it on the basis of our self-adjoint boundary conditions.]
7.7. Normalisation of second order problems
The application of a weight factor w(x) to a given second-order differential
equation is a powerful tool in the investigation of the analytical properties
of the solution and is often of great advantage in obtaining an approximate
solution in cases which do not allow an explicit solution in terms of elementary
functions. In the previous sections we have encountered the method of the
weight factor in two different aspects. The one was to multiply the entire
equation by a properly chosen weight factor w(x) (cf. 6.2). If w(x) is chosen
according to (4.7), the left side of the equation is transformed into
but also changed the operator D to D₁ (cf. 5.3). The new differential
equation thus constructed became [cf. (5.7)]:
where
We thus obtain two new functions which we will denote by b(x) and c(x):
SEC. 7.7 NORMALISATION OF SECOND ORDER PROBLEMS 369
Then we apply the transformation (3-5) which is now simplified due to the
fact that A(x) = 1:
The new form of the differential operator has the conspicuous property
that the term with the first derivative is missing. The new differential operator
is of the Sturm-Liouville type (2.3) but with A(x) = 1. It is characterised
by only one function which we will call U(x):
If we know how to solve this differential equation, we have also the solution
of an arbitrary second order equation since the solution of an arbitrary
second order differential equation can be transformed into the form (12)
which is often called the "normal form" of a linear homogeneous differential
equation of second order.
Problem 294. Transform Bessel's differential equation (2.4) into the normal
form (12).
[Answer:
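For the order p = 0 the normal form of Bessel's equation reads u″ + (1 + 1/(4x²))u = 0 with u = √x·J₀(x); this particular instance is easy to verify numerically (it is our own check of Problem 294, not the book's general answer):

```python
import math

def J0(x):
    """Bessel function J_0 by its power series (fine for moderate |x|)."""
    term, s, m = 1.0, 1.0, 0
    while abs(term) > 1e-16:
        m += 1
        term *= -(x * x / 4.0) / (m * m)
        s += term
    return s

def u(x):
    # substitution v = u / sqrt(x) applied to the Bessel solution J_0
    return math.sqrt(x) * J0(x)

# check the normal form u'' + (1 + 1/(4x^2)) u = 0 by central differences
h = 1e-3
for x in [0.5, 1.0, 2.0, 4.0, 8.0]:
    upp = (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)
    residual = upp + (1.0 + 1.0 / (4.0 * x * x)) * u(x)
    assert abs(residual) < 1e-5
```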
Problem 295. Transform Laguerre's differential equation (2.16) into the normal
form.
[Answer:
Problem 296. Transform Hermite's differential equation (2.18) into the normal
form.
[Answer:
If we put
if U is positive and
thus
But then
and
Since both the real and the imaginary part of this solution must be a
solution of our equation (8.2) (assuming that U(x) is real), we obtain the
general solution in the form
This form of the solution leads to the remarkable consequence that the
solution of an arbitrary linear homogeneous differential equation of second
order for which the associated U(x) is positive in a certain interval, may be
conceived as a periodic oscillation with a variable frequency and variable
amplitude. Ordinarily we think of a vibration in the sense of a function of
the form
frequency of this oscillation are necessarily coupled with each other. If the
frequency of the oscillation is a constant, the amplitude is also a constant.
But if the frequency changes, the amplitude must also change according to
a definite law. The amplitude of the vibration is always inversely pro-
portional to the square root of the instantaneous frequency. If we study the
distribution of zeros in the oscillations of the Bessel functions or the Jacobi
polynomials or the Laguerre or Hermite type of polynomials, the law of the
zeros is not independent of the law according to which the maxima of the successive
oscillations change. The law of the amplitudes is uniquely related to the
law of the zeros and vice versa.
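This coupling of amplitude and frequency can be observed numerically. For u″ + U u = 0 with U(x) = x (an Airy-type example of our own choosing), the amplitude of the oscillations should fall off like U^(−1/4):

```python
import math

def U(x):
    return x

def rk4_amplitudes(x0, x1, h=0.005):
    """Integrate u'' + U(x) u = 0 by RK4 and record the WKB envelope
    A = sqrt(u^2 + u'^2 / U) at the target abscissae."""
    u, up, x = 1.0, 0.0, x0
    amps, targets = {}, [10.0, 90.0]
    def f(x, u, up):
        return up, -U(x) * u
    while x < x1:
        k1u, k1p = f(x, u, up)
        k2u, k2p = f(x + h / 2, u + h / 2 * k1u, up + h / 2 * k1p)
        k3u, k3p = f(x + h / 2, u + h / 2 * k2u, up + h / 2 * k2p)
        k4u, k4p = f(x + h, u + h * k3u, up + h * k3p)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        up += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        x += h
        for t in targets:
            if t not in amps and x >= t:
                amps[t] = math.sqrt(u * u + up * up / U(x))
    return amps

amps = rk4_amplitudes(1.0, 95.0)
ratio = amps[90.0] / amps[10.0]
assert abs(ratio - (90.0 / 10.0) ** -0.25) < 0.02   # amplitude ~ U^(-1/4)
```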
This association of a certain vibration of varying amplitude and frequency
with a solution of a second-order differential equation is not unique, however.
The solution (11) contains two free constants of integration, viz. the phase
constant θ and the amplitude constant A. This is all we need for the
general solution of a second-order differential equation. And yet, if we
consider the equations (4) and (5), which determine β(x), we notice that we
get for β(x) a differential equation of second order, thus leaving two further
constants free. This in itself is not so surprising, however, if we realise that
we now have a complex solution of the given second-order differential
equation, with the freedom of prescribing V(XQ) and V'(XQ) at a certain point
x = XQ as two complex values, which in fact means four constants of
integration. But if we take the real part of the solution for itself:
then we see that the freedom of choosing j3(#o) and P'(XQ) freely must lead
to a redundancy because to any given V(XQ), V'(XQ) we can determine the
constants A and 6 and, having done so, the further course of the function
v(x) is uniquely determined, no matter how fi(x) may behave. This means
that the separation of our solution in amplitude and frequency cannot be
unique but may occur in infinitely many ways.
Let us assume, for example, that at x = x₀, v(x) vanishes. This means
that, if the integral under the cosine starts from x = x₀, the phase angle
becomes π/2. Now in this situation the choice of β′(x₀) can have no effect
on the resulting solution, while the choice of β(x₀) can change only a factor
of proportionality. And yet the course of β(x)—and thus the instantaneous
frequency and the separation into amplitude and frequency—is profoundly
influenced by these choices.
Problem 298. Investigate the differential equation
The condition for the successful applicability of the simplified solution (1)
is that U must be sufficiently large. Our solution will certainly fail in the
neighbourhood of U = 0. It so happens, however, that in a large class of
SEC. 7.10 SOLUTION OF A DIFFERENTIAL EQUATION OF SECOND ORDER 375
problems U ascends rather steeply from the value U = 0 and thus the
range in which the KWB approximation fails, is usually limited to a relatively
small neighbourhood of the point at which U(x) vanishes.
In order to estimate the accuracy of the KWB solution, we shall substitute
in Riccati's differential equation
if our realm starts with x = a and we assume that v(x) is at that point
adjusted to the proper value v(a) in amplitude and phase. To carry through
an exact quadrature with the help of (7) as integrand will seldom be possible.
But an approximate estimation is still possible if we realise that the second
term in the numerator of (7) is of second order and can thus be considered
as small. If, in addition, we replace in the denominator √U by its
minimum value between a and x, we obtain the following estimation of the
relative error of the KWB approximation:
376 STURM-LIOUVILLE PROBLEMS CHAP. 7
(We have assumed that (log U)" does not go through zero in our domain,
otherwise we have to sectionalise the error estimation.)
Problem 299. Given the differential equation
which makes an exact error analysis possible. Obtain the solution by the
KWB method and compare it with the exact solution in the realm x = [3, ∞].
Choosing a = ∞, estimate the maximum error of y(x) and v(x) on the basis of
the formulae (7) and (9) and compare them with the actual errors.
[Answer:
Maximum error of y(x) (which occurs at x = 3): η(3) = −0.0123
(formula (7) gives - 0.0107)
Maximum relative error of
(formula (9) gives 0.0237)]
Problem 300. For what choice of U(x) will the KWB approximation become
accurate? Obtain the solution for this case.
[Answer:
consideration. This change of sign does not lead to any singularity but it
does have a profound effect on the general character of the solution since the
solution has a periodic character if U(x) is positive and an exponential
character if U(x) is negative. The question arises how we can continue our
solution from the one side to the other, in view of the changed behaviour of
the function. The KWB approximation is often of inestimable value in
giving a good overall picture of the solution. The accuracy is not excessive
but an error of a few per cent can often be tolerated and the KWB method
has frequently an accuracy of this order of magnitude. The method fails,
however, in the neighbourhood of U(x) = 0 and in this interval a different
approach will be demanded. Frequently the transitory region is of limited
extension because U(x) has a certain steepness in changing from the negative
to the positive domain (or vice versa) and the interval in which U(x)
In order to study the behaviour of this solution for both positive and
negative values of x, a brief outline of the basic analytical properties of the
Bessel functions will be required.
This means:
and thus we see that the entire function F(y; x) satisfies Bessel's differential
equation if we make the following correlation:
Since, furthermore, the constant p appears in (2.4) solely in the form p²,
we obtain the general solution of Bessel's differential equation as follows:
with the analytical nature of Jp(x) and require that along some half-ray of
the complex plane, between r = 0 and ∞, a "cut" is made and we must
not pass from one border to the other. However, this cut is unnecessary
if we stay with x in the right complex half plane:
with
We want to move along a large circle with the radius r = r₀, the angle θ
changing between 0 and π/2. This demands the substitution
is large, in view of the largeness of r₀, and is thus amenable to the KWB
solution. In fact, p² is negligible in comparison to the first term (except if
the order of the Bessel function is very large), which shows that the
asymptotic behaviour of the Bessel functions will be similar for all orders p.
The KWB solution (10.3) becomes in our case (considering that r₀ is a
constant which can be united with the constants A₁ and A₂):
Returning to the original v(θ) (cf. 13.6) and writing our result in terms of
z we obtain
If we know how Jp(x) behaves for large real values of the argument, the
equation (11) will tell us how it behaves for large complex values. Our
problem is thus reduced to the investigation of Jp(x) for large real values
of x.
With the transformation sin φ = t the same integral may be written in the
form
Moreover, since the integrand is an even function, the same integral may
also be written in the complex form:
Now let x be large. Then we will modify the path of integration of the
variable t as follows:
We will first investigate the contribution of the path CD. Let us put
SEC. 7.14 ASYMPTOTIC EXPANSION OF Jp(x) FOR LARGE VALUES OF x 383
and
The result of the integration can be written down as the constant factor
The path AB contributes the same, except that all i have to be changed
to — i. The path BC contributes nothing since here the integrand becomes
arbitrarily small. The final result of our calculation is that the integral (3)
becomes
which holds for negative orders as well and which carries the asymptotic
relation (11) over into the realm of negative p.
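In standard notation the result is the classical asymptotic formula Jp(x) ≈ √(2/(πx)) cos(x − pπ/2 − π/4). A quick numerical comparison (an added sketch; scipy is used merely as a reference implementation, it is of course not part of the text):

```python
# Compare J_p(x) with its leading asymptotic form for large real x.
import numpy as np
from scipy.special import jv

def jp_asym(p, x):
    return np.sqrt(2.0 / (np.pi * x)) * np.cos(x - p * np.pi / 2 - np.pi / 4)

for p in (0, 1, 5, -5):
    print(p, float(jv(p, 50.0)), jp_asym(p, 50.0))
```

The error of the leading term decreases like x^(−3/2) and grows with the order p, consistent with the remark that the asymptotic behaviour is similar for all moderate orders.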
we notice that the e^(−ix) part of Jp(x) can only be obliterated if we choose the
following linear combination of Jp(x) and J₋p(x):
Accordingly in the formula (13.11) the constant A₂ drops out and we obtain
along the imaginary axis z = iy:
Apart from an arbitrary complex constant, the function Kp(z) is the only
linear combination of Bessel functions which decreases exponentially along
the positive imaginary axis.
Problem 304. Show on the basis of (12.10) that the function Kp(z) is real every-
where along the imaginary axis.
Problem 305. Obtain the asymptotic value of the function
Problem 306. Obtain the asymptotic value of Mp(−iy) (cf. 12.14).
[Answer:
We thus see that v(x) is for any choice of the constants A₁ and A₂ an entire
analytical function of the complex variable z, regular throughout the complex
plane. The cuts needed in the study of the Bessel functions disappear com-
pletely in the resulting function (11.4).
We will now choose as fundamental solutions of our differential equation
the following combinations of Bessel functions:
We see that in the positive range of x the functions f(x) and g(x) represent
two oscillations (of variable amplitude and frequency) which have the
constant phase shift of π/2 relative to one another.
We now come to the study of the negative range of x, starting with the
function f(x). As we see from (1) (considering x as a positive quantity):
Then by definition :
and thus
But then, making use of the asymptotic behaviour of Kp(iy) (see 15.4), we
obtain for large values of x:
For large values of x we obtain (on the basis of the asymptotic behaviour
of the function Ip(x) of Problem 307 (cf. 15.10)):
SEC. 7.17 JUMP CONDITIONS FOR TRANSITION "EXPONENTIAL-PERIODIC" 387
Hence
since in the exponents √−U and i√U are identical expressions. But this
argument is in fact wrong, because there is a gulf between the two types of
solutions which cannot be bridged without the proper precautions. We
have to use our function f(x) as a test function for the coefficient A₁ and
g(x) as a test function for the coefficient A₂. The comparison of the formulae
On the other hand, the comparison of the formulae (16.11) and (16.4)—
the latter written in complex form—yields:
and thus the complete relation between the two pairs of constants becomes
which means
The relations (17.14) which hold for the transition from the exponential
to the periodic domain, now become:
and thus
These are the formulae by which the constants of the exponential domain are
determined if the constants of the periodic domain are given, and vice versa.
The transition occurs in the sequence: exponential-periodic. If the sequence
is the reverse, viz. periodic-exponential, we have to utilize the formulae
(18.2) which now give
and
The domain of our solution is the infinite range x = [−∞, +∞] and we
demand that the function shall not go to infinity as we approach the two
end-points x = ±∞.
Now this eigenvalue problem is solvable with the help of the hyper-
geometric series, after the proper transformations. The solution is well
known in terms of the "Hermitian polynomials" Hn(x) (cf. 2.19). But let
us assume that this transformation had escaped us and we would tackle our
problem by the KWB method. We see that if x stays within the limits
±√λ, we have a periodic, outside of those limits an exponential domain.
The KWB method requires the following integration:
We put
or
where k is an arbitrary integer. This means that the periodic solution must
arrive at the critical point U = 0 with a definite phase angle.
The solution appears in the periodic range according to (19.1) in the
following form:
So far only one of the constants, namely θ, has been restricted, but we
still have the constant C at our disposal and thus a solution seems possible
for all λ. We have to realise, however, that x assumes both positive and
negative values and the transition to the exponential domain occurs at both
points x = ±√λ. The second point adds its own condition, except if
v(x) is either an even or an odd function, in which case the two conditions
on the left and on the right collapse into one, on account of the left-right
symmetry of the given differential operator. In fact, this is the only chance
of satisfying both conditions. Now the solution (7) yields in the numerator
The condition that our function shall become even demands the vanishing of
the second term, which means
and thus
Similarly, the condition that our function shall become odd demands the
vanishing of the first term, which means
and thus
This holds in the periodic domain x > p₁. In the exponential domain
x < p₁ we obtain similarly
The substitution
and the complete result, valid in the exponential domain, may be written
in the following form
The general solution has two free constants B₁ and B₂, associated with the
± signs in the exponent.
The point x = 0 is a singular point of our differential equation in which
U(x) becomes infinite. However, our approximation does not fail badly in
this neighbourhood. If we let x go towards zero, we find
We must remember that our function v(x) is not the Bessel function Jp(x)
but Jp(x)√x. Moreover, the Bessel functions Jp(x), respectively J₋p(x),
assume by definition in the vicinity of zero the values
We will now see what happens if we come to the end of the exponential
domain and enter the periodic domain. In order to make the transition,
we must put our solution in the form (17.7). But let the upper branch
with the + sign in the exponent be given in the more general form
Then the form (17.7) demands that we shall write this solution as follows:
which gives
Similarly
Now in our problem the point x₀ in which U(x) vanishes becomes the
point x = p₁. The value of K(x₀) can be taken from the form (3) of our
solution:
Let us first consider the case of Jp(x). Here only B₁ is present and we
obtain
The transition to the periodic range occurs according to the formulae (19.4)
which now gives
which differs from the corresponding quantity in (14.11) only by the fact
that p is replaced by p₁. This involves a very small error, as we have seen
before.
Another change can be noticed in the amplitude constant C. In the
traditional estimation C should have the value
which is in agreement with the correct asymptotic value. Here again the
error is fairly small, e.g. for n = 5 not more than 5.7%. It is frequently
more advisable, however, to make the amplitude factor C correct in the
periodic range and transfer the amplitude error to the exponential domain.
In this case the approximation of Jp(x) will be given as follows:
for x > p₁:
The transition to the periodic range occurs once more on the basis of the
formulae (19.4) which now gives
which is replaceable by
while in actual fact the periodicity factor of J₋p(x) should come out as
Hence we have arrived in the periodic range with the wrong phase.
Let us investigate the value of the constant C. Disregarding the small
difference which exists between p and p₁ and making use of Stirling's formula
(19) we obtain
Now we can take advantage of the reflection theorem of the Gamma function
which gives
Hence
Here again our result is erroneous since the amplitude factor of √x J₋p(x)
at infinity is √(2/π), without the factor sin πp.
SEC. 7.22 BESSEL'S DIFFERENTIAL EQUATION 399
The phenomenon here encountered is of considerable interest. We have
tried to identify a certain solution of Bessel's differential equation by
starting from a point where the solution went to infinity. But the differential
equation has two solutions, the one remaining finite (in fact going to zero
with the power x^p), the other going to infinity. Now the solution which
remains finite at x = 0 allows a unique identification since the condition of
finiteness automatically excludes the second solution. If, however, we try
to identify the second solution by fitting it in the neighbourhood of the
singular point x = 0, we cannot be sure that we have in fact obtained the
right solution since any admixture of the regular solution would remain
undetected. The solution which goes out of bound swamps the regular
solution.
What we have obtained by our approximation is thus not necessarily
J₋p(x) but
Hence the function −Np(x) has a periodicity which agrees with (26).
Moreover, the amplitude factor (29) is explained by the fact that it is not
−Np(x) itself but −(sin πp)Np(x) that has been represented by our
approximation.
If again we agree that the amplitude factor C shall become correct in the
periodic region and only approximately correct in the exponential region,
we obtain the following approximate representation of the Neumann function
Np(x):
for x > p₁:
and thus we have to tabulate the two substitute functions which will become
the factors of C cos θ and C sin θ in the transitory region. These latter
functions shall be called φ₁p(x) and φ₂p(x).
Let us first normalise the constant a of the differential equation (11.1) to
9/4, as we have done in (11.3). We have chosen the two functions (16.2)
as the two fundamental solutions of our differential equation. Moreover,
we have seen by the formula (16.9) that the upper branch of the KWB
approximation will go with f(−x). However, the asymptotic solution
should become
SEC. 7.23 THE SUBSTITUTE FUNCTIONS IN THE TRANSITORY RANGE 401
The comparison with (16.9) shows that the function φ₁e(x) should become
identified with
In the periodic range the formulae (16.3) and (16.4) come in operation
and our final result becomes:
The method of computing these four functions will be given in the following
section.
Generally the differential equation which is valid in the transitory range
will be of the form (11.1) with a constant a which is not 1 but U′(x₀). The
transition to the general case means that our previous x has to be replaced by
Hence in the general case, where the value of x₀ and U′(x₀) is arbitrary, the
substitute functions have to be taken with the argument ∛U′(x₀)·(x − x₀).
Moreover, a constant factor has to be applied in order to bring the
asymptotic representation of these functions in harmony with the KWB
approximation. This factor is |U′(x₀)|^(−1/6).
Let us first assume that U′(x₀) > 0. We then have the transition
2. factor of A₂:
3. factor of C cos θ:
4. factor of C sin θ:
Let us assume, on the other hand, that the transition occurs in the sequence
periodic to exponential. In this case the correlation occurs as follows:
1. factor of A₁:
2. factor of A₂:
3. factor of C cos θ:
4. factor of C sin θ:
As an example let us consider the case of the Bessel functions Jp(x) (with
positive p), studied before. We want to determine the value of Jp(p₁), or
still better the value of the function v(x) = √x Jp(x), at the transition point
x = x₀. Here we have
and thus
If, on the other hand, the Neumann function Np(x) is in question, we have
(cf. (22.33))
exact values of the Bessel functions Jp(x), taken at the point p₁. The
values of p₁ now become:
3.9686, 5.9791, 7.9843, 9.9875, 11.990
Substitution in the formula (16) yields
The Table III of the Appendix gives the numerical values of these four
functions, in intervals of 0.1, for the range x = [0, −3], respectively [0, 3].
Beyond this range no substitution is demanded since the KWB approximation
becomes sufficiently accurate.
* Tables of the modified Hankel functions of order one-third and of their derivatives
(Harvard University Press, Cambridge, Mass., 1945); cf. in particular the case y = 0
on pp. 2 and 3.
SEC. 7.25 INCREASED ACCURACY IN THE TRANSITION DOMAIN 405
Problem 308. Obtain with the help of the tables the values of J₅(4.7), J₅(5),
J₅(5.2) and likewise J₁₀(9.5), J₁₀(10), J₁₀(10.5).
[Answer:
J₅(4.7) = 0.2307    J₅(5) = 0.2661     J₅(5.2) = 0.2977
exact:   (0.2213)           (0.2611)            (0.2865)
J₁₀(9.5) = 0.1695   J₁₀(10) = 0.2086   J₁₀(10.5) = 0.2459
         (0.1650)           (0.2075)            (0.2477)]
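The exact entries in parentheses can be reproduced with any modern Bessel routine (an added sketch; scipy here plays the role of the tables and is of course not part of the text):

```python
# Recompute the "exact" values quoted in Problem 308.
from scipy.special import jv

for p, points in ((5, (4.7, 5.0, 5.2)), (10, (9.5, 10.0, 10.5))):
    print(p, [round(float(jv(p, x)), 4) for x in points])
```
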
Problem 309. The x-value at which φ₁p + φ₂p, respectively φ₁p − φ₂p, first
vanishes is x = 2.3381, resp. 1.1737. Find accordingly the approximate position
of the first zero of Jp(x) and Np(x). In the case of Jp(x) compare the result
with the exact position of the first zero for p = 5, 7, 9, and 10.
[Answer:
The relatively poor agreement indicates the presence of a systematic error which
will be investigated in the next section.]
The higher order terms shall be neglected but not the quadratic term. We
will briefly put
On the other hand, our four functions <p(x) satisfied the differential equation
It will be our aim to bring the two differential equations (3) and (4) in
harmony with each other. For this purpose we establish a relation between
the variables x and £. While before we have assumed that x and £ are
simply proportional to each other, we will now apply a correction term and
put
which gives
Now we make use of the method of taking out a proper factor in order to
obliterate the first order term (cf. 7.9-10). We put
where
The corresponding factor of the differential equation (3) may be written in the
form
Considering (12) the final result becomes that in the transition domain we
have to use the ^-functions in the following manner:
For the sake of increased accuracy the tables (23.11) and (23.12) have to
be modified according to this correction. The correction is in fact quite
effective. Let us obtain for example once more the first zero of Jp(x).
This demands the first zero of the function φ₁p + φ₂p which is at the point
x = 2.3381. Now in the present case
The last term is the correction which has to be added to our previous formulae
(24.5). The corrected values of the zeros obtained in Problem 309 now
become
The factor A(x) of v"(x) vanishes at the two points x = 0 and 1. These
points are thus natural boundary points of the operator which limits the
range of x to [0, 1]. According to the general theory the weight factor
w(x) of our operator becomes (cf. 4.7)
The eigenvalue equation associated with our operator assumes the form
This term has the peculiarity that it vanishes without any imposed conditions
on u and v, due to the vanishing of the first factor. In fact, however, this
implies that v(x) and v'(x) remain finite at the points x = ± 1. Since the
points x = ± 1 are singular points of our differential operator where the
solution goes out of bound if no special precautions are taken, the very
condition that v(x) and v'(x) must remain finite at the two end-points of the
range, represents two homogeneous boundary conditions of our differential
operator which selects its eigenvalues and eigenfunctions. In particular,
the hypergeometric function F(α, β, γ; x) goes to infinity at the point x = 1,
except in the special case that the series terminates automatically after a
finite number of terms. This happens if the parameter α (or equally β but
this does not give anything new since F is completely symmetric in α and
β) is equated to a negative integer −n. Then the eigenvalues λₙ become
(according to (5)):
This weighting has the peculiarity that it puts the emphasis on the
neighbourhood of the origin x = 0 (for b > 2). With increasing b we
obtain a weight factor which is practically exponential since for large b
we obtain practically
Problem 312. Let in (10) b go to infinity like 1/μ, but at the same time transform
x to μx₁. Show that in the limit the weight factor w(x) becomes e^(−x).
which means
Now the operator (1) becomes in the new variable and under the condition
(3):
Then
Moreover,
Hence in the new variable t our differential equation (for large t) becomes
But this is Bessel's differential equation for the order n = 0, taken at the
point 2at (cf. 2.20):
This now means that v(0) must become 1, but then the factor of proportionality
of J₀ must become 1 in view of the fact that J₀(x) starts with the value
Since for small values of θ the tangent becomes practically the angle itself,
we obtain for small θ:
This solution can be linked to the solution (8) which is valid in the domain
of larger θ. Let us first assume the case of an even function. Then the
matching of the cosine factors demands the condition
Let us now assume the case of an odd function. The solution can now be
written in the form
which gives
The case of both even and odd functions is included in the selection rule
where n is an integer, with the understanding that all even n demand the
choice cosine and all odd n the choice sine in the general expression (8).
The determination of the eigenvalues a according to (22) is not exact
but very close. The exact law for a is
The zeros of the Chebyshev polynomials of odd order follow the law
while the Gaussian zeros follow (in close approximation) the law
The difference is that in the Chebyshev case the full circle is divided into
4μ + 2 = 2n equal parts and the points projected down on the diameter.
In the Gaussian case the full circle is divided into 4μ + 3 = 2n + 1 equal
parts and again the points projected down on the diameter.
In the case of the polynomials of even order the zeros of the Chebyshev
polynomials become
Now the half circle is divided into 4μ = 2n, respectively 4μ + 1 = 2n + 1
equal parts and the points are projected down on the diameter, but skipping
all points of even order 2, 4, ... and keeping only the points of the order
1, 3, 5, ..., n − 1.
The asymptotic law of the zeros is remarkably well represented even for
small n. We obtain for example for n = 4, 5, 6, 7, 8 the following
distribution of zeros, as compared with the exact Gaussian zeros:
n = 4                 n = 5
0.3420 (0.3400)       0.5406 (0.5385)
0.8660 (0.8611)       0.9096 (0.9062)
n = 6                 n = 7                 n = 8
0.2393 (0.2386)       0.4067 (0.4058)       0.1837 (0.1834)
0.6631 (0.6612)       0.7431 (0.7415)       0.5265 (0.5255)
0.9350 (0.9325)       0.9510 (0.9491)       0.7980 (0.7967)
                                            0.9618 (0.9603)
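The projection rule behind this table can be written in the closed form xₖ = cos((4k − 1)π/(4n + 2)) for the k-th positive zero in descending order; the sketch below (added here, not part of the text) reproduces both columns with numpy:

```python
# Asymptotic law for the positive Gaussian (Legendre) zeros versus the
# exact nodes from numpy's Gauss-Legendre routine.
import numpy as np

for n in (4, 5, 6, 7, 8):
    k = np.arange(1, n // 2 + 1)
    approx = np.cos((4 * k - 1) * np.pi / (4 * n + 2))
    nodes = np.sort(np.polynomial.legendre.leggauss(n)[0])
    exact_pos = nodes[nodes > 1e-12][::-1]      # positive zeros, descending
    print(n, np.round(approx, 4), np.round(exact_pos, 4))
```
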
In order to test the law of the amplitudes we will substitute x = 0. Then
SEC. 7.28 THE LEGENDRE POLYNOMIALS 417
we obtain for the polynomials of even order n = 2μ, the following starting
values:
The Legendre polynomials have the property that they are derivable from
a "generating function" in the followinsr wav:
From this property of Pn(x) we derive the following exact values of P₂μ(0)
and P′₂μ₊₁(0):
Within this accuracy the value of P₂μ(0) coincides with that given in (32),
except that μ + ¼ is replaced by μ, while in the case of P′₂μ₊₁(0) we find
that 4μ + 3 is replaced by 4μ + 4.
The numerical comparison shows that in the realm of n = 5 to n = 10
we obtain on the basis of the asymptotic formula the following initial
values, respectively initial derivatives (the numbers in parenthesis give the
corresponding exact values):
n = 6             n = 8             n = 10
Pₙ(0) = −0.3130   0.2737            −0.2462
(−0.3125)         (0.2734)          (−0.2461)
n = 5             n = 7             n = 9
P′ₙ(0) = 1.8712   −2.1851           2.4592
(1.8750)          (−2.1875)         (2.4609)
We see that the asymptotic law gives very good results even in the realm
of small n (starting from n = 5).
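The quoted exact values are easy to confirm (an added sketch; scipy's `eval_legendre` stands in for the generating-function computation and is not part of the text):

```python
# P_n(0) for even n, and P'_n(0) for odd n by a central difference.
from scipy.special import eval_legendre

for n in (6, 8, 10):
    print(n, round(float(eval_legendre(n, 0.0)), 4))

h = 1e-6
for n in (5, 7, 9):
    d = (eval_legendre(n, h) - eval_legendre(n, -h)) / (2 * h)
    print(n, round(float(d), 4))
```
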
Moreover, they have the remarkable property that their "norm" becomes 1:
and we see that for small values of x we are in the periodic, for large values
in the exponential domain. The dividing point U(x₀) = 0 is determined by
the root of a quadratic equation. We will simplify our task, however, by
the observation that for a sufficiently large A the first term of U(x) will
quickly become negligible. We fare better if we do not neglect the first
term but combine it with the second term in the form
This gives
The second solution, √x N₀(√λ x), has to be rejected since we know that
x^(−1/2) v(x) must remain finite at x = 0. Moreover, the solution (13) is
already properly normalised in view of the condition (4) which holds for all
Laguerre polynomials. We have thus a situation similar to that encountered
in the study of the Legendre polynomials around the point x = 1.
We now come to the U(x) given by (10) and the KWB approximation
associated with it. We will put
Then
and we have to link the solution (17) to (18) for sufficiently small values of x.
Now small values of x mean small values of φ. But in the realm of
small φ the argument of the cosine function in (17) is replaceable by 2λφ − θ
and this quantity becomes, if we go back to the original variable x on the
basis of (14)—neglecting in this realm the small constant μ/2:
Now the transition point x₀ at which U(x₀) becomes zero, belongs to the
value φ = π/2. Hence the phase angle θ becomes, according to (21.7):
The transition into the exponential domain must be such that the branch
with the positive exponent vanishes since otherwise our solution would go
exponentially to infinity which is prohibited since we know that our solution
must go exponentially to zero as x grows to infinity. This means A₁ = 0
and we see from (19.5) that this condition demands
where n is an arbitrary integer. The comparison of (22) and (23) gives the
selection rule
and this is the exact eigenvalue of Laguerre's differential equation (6). Once
more, as in the solution of Hermite's differential equation, the KWB method
leads to an exact determination of the eigenvalue which is generally not to
be expected, in view of the approximate nature of our integration procedure.
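That the eigenvalue comes out exactly can be checked by substituting the Laguerre polynomial Lₙ(x) into Laguerre's equation, which in its standard form reads x y″ + (1 − x) y′ + n y = 0 (an added sketch; the identification of this standard form with equation (6) is assumed):

```python
# Residual of x y'' + (1 - x) y' + n y for y = L_n, using numpy's
# Laguerre-series utilities; it should vanish up to rounding.
import numpy as np
from numpy.polynomial import laguerre as lg

n = 6
c = [0] * n + [1]                       # coefficients selecting L_n
xs = np.linspace(0.1, 20.0, 50)

y   = lg.lagval(xs, c)
yp  = lg.lagval(xs, lg.lagder(c, 1))
ypp = lg.lagval(xs, lg.lagder(c, 2))

residual = xs * ypp + (1 - xs) * yp + n * y
print(np.max(np.abs(residual)))
```
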
The constant of the negative branch of the exponential domain becomes,
according to (19.5):
Problem 313. Obtain the value of Ψₙ(x) at the transition point x₀.
[Answer:
Then
Let us put
But now we can assume that φ is purely imaginary since we have split away
the amplitude C(x) and what remains is a pure oscillation with constant
amplitude. But then we can put
difference is, however, that the function U(x) is replaced by U₁(x), defined
according to (4). The solution (11) is no longer an approximate but an
exact solution of the given problem. In order to obtain this solution we
must first obtain the function C(x) by solving the non-linear second-order
differential equation
Then also C″(x₁) = 0 and we see that C(x) will change very slowly in this
neighbourhood. We may be able to proceed to the next maximum on the
basis of a local expansion of a few terms and estimate the increase or decrease
of the maximum without actually integrating the amplitude equation (12).
Problem 315. In the discussion of the differential equation (9.15) of the
vibrating spring a solution was obtained which could be interpreted as an
oscillation of variable amplitude C(x) and frequency β(x). Show that the
variable amplitude [β(x)]^(−1/2) satisfies the exact amplitude equation (12).
SEC. 7.31 STURM-LIOUVILLE PROBLEMS 425
Making use of the standard technique discussed in Chapter 5.9 (see particularly
the second equation of the system (5.9.4)), we arrive at the differential
equation
The differential equation (2) is a necessary but not sufficient condition for
the stationary value of the variational integral. A further condition is
required by the vanishing of the boundary term. This establishes the
boundary conditions without which our problem is not fully determined.
Now it is possible that our variational problem is constrained by certain
conditions which have to be observed during the process of variation. For
example we may demand the minimum of the integral
but under the restricting condition that some definite boundary values are
prescribed for v(x):
because v(x) remains fixed at the two end-points during the process of
variation. Then the boundary term (3) vanishes automatically and we get
no further boundary conditions through the variational procedure.
It is equally possible, however, that we have no prescribed boundary
conditions, that is, the variational integral (4) is free of further constraints.
In that case 8v(x) is not constrained on the boundary but can be chosen
freely. Here the vanishing of the boundary term (3) introduces two
"natural boundary conditions" (caused by the variational principle itself,
without outside interference), namely—assuming that A(x) does not vanish
on the boundary:
We may also consider the case that the problem is partly constrained, by
prescribing one boundary condition, for example
Then 8v(x) vanishes on the lower boundary but is free on the upper boundary,
and the variational principle provides the second boundary condition in the
form
with the auxiliary condition (10). This condition can be united, however,
with the Lagrangian by the method of the Lagrangian multiplier (cf.
Chapter 5.9), which we want to denote by p:
This adds to our original problem the two new variables p and w, but w is
purely algebraic and can be eliminated, by the equation
This yields
and
where
(which demands the same conditions for the variations δv(x) and δp(x),
omitting the constants γ₁ and γ₂ on the right side), we see that the boundary
term (19) vanishes if the single condition
We will consider the case of the homogeneous differential equation and thus
put P(x) = 0. Then the Hamiltonian function (16) becomes a homogeneous
algebraic form of the second order in the variables p and v, and thus by
Euler's formula of homogeneous forms:
and also
Hence the variational integral becomes zero for the actual solution. It is
worth remarking that this result is independent of any boundary conditions.
It is a mere consequence of the canonical equations which in themselves do
not guarantee a stationary value of Q.
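The two applications of Euler's formula alluded to above can be sketched as follows; since equation (16) itself is not reproduced here, the symmetrised form of the canonical integrand is an assumption:

```latex
% Euler's formula for a form H(p,v) homogeneous of degree 2:
p\,\frac{\partial H}{\partial p} + v\,\frac{\partial H}{\partial v} = 2H .
% Along a solution of the canonical equations
%   v' = \partial H/\partial p, \qquad p' = -\partial H/\partial v,
% the symmetrised canonical integrand vanishes pointwise:
\tfrac{1}{2}\,(p\,v' - v\,p') - H
  = \tfrac{1}{2}\Bigl(p\,\frac{\partial H}{\partial p}
      + v\,\frac{\partial H}{\partial v}\Bigr) - H = 0 ,
% so that Q vanishes for the actual solution, independently of any
% boundary conditions.
```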
Let us now consider the eigenvalue problem associated with our operator:
(In the boundary conditions (20) we now have to replace γ1 and γ2 by zero.)
This equation can be conceived as consequence of our variational principle
if in our Lagrangian (1) we put β(x) = 0 and replace C(x) by C(x) + λ. But
another interpretation is equally possible. We normalise the solution v(x)
by the condition
This expression is now a positive definite quadratic form of v and v' which
cannot take negative values for any choice of v(x). The same holds then for
the integral Q and—in view of (30)—for the eigenvalues λi. Hence the
eigenvalues associated with a positive definite Lagrangian can only be positive
numbers.
But we can go further and speak of the absolute minimum obtainable for
Q under the constraint (27). This absolute minimum (apart from the
factor 2) will give us the smallest eigenvalue of our eigenvalue spectrum.
This process can be continued. After obtaining the lowest eigenvalue and
the associated eigenfunction v1(x), we now minimise once more the integral
Q with the auxiliary condition (27), but now we add the further constraint
that we move orthogonally to the first eigenfunction v1(x):
The absolute minimum of this new variational problem (which excludes the
previous solution v1(x) due to the constraint (32)) yields the second lowest
eigenvalue λ2 and its eigenfunction v2(x). We can continue this process,
always keeping all the previous constraints, plus one new constraint, namely
the orthogonality to the last eigenfunction obtained. In this fashion we
activate more and more dimensions of the function space and the eigenvalues
enter automatically in the natural arithmetical order.
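The successive minimisation just described is Rayleigh's principle; with the normalising condition (27) written here as ∫v² dx = 1 (a notational assumption) and the factor 2 mentioned above ignored, it reads:

```latex
\lambda_1 = \min_{\int v^2\,dx\,=\,1} Q[v], \qquad
\lambda_2 = \min_{\substack{\int v^2\,dx\,=\,1 \\ \int v\,v_1\,dx\,=\,0}} Q[v],
\qquad \ldots
% each new minimisation keeps all the previous orthogonality constraints
% \int v\,v_i\,dx = 0 and adds one more, activating one further
% dimension of the function space at every step.
```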
We will now give an example which seems to contradict our previous
result by yielding a negative eigenvalue, in spite of a positive definite
Lagrangian. We make the simple choice
The solution
The added term amounts to a mere boundary term but this term is no longer
positive definite, in fact it may become negative and counteract the first
positive term to such an extent that the resulting Q becomes negative.
This is what actually happens in the example (33-34).
Problem 316. Obtain the complete ortho-normal system associated with the
problem (33), (34).
[Answer:
Problem 317. Obtain the eigenfunctions and eigenvalues which belong to the
Lagrangian
[Answer:
Added boundary condition:
where
Problem 318. Show that for any μ > 2 the integral Q can be made arbitrarily
small by a function of the type e^(−ax) where a is very large. Hence the minimum
problem cannot lead to a discrete smallest eigenvalue.
Problem 319. Obtain the eigenfunctions and eigenvalues for the limiting case
μ = 2.
[Answer: The eigenvalues become continuous since only the boundary condition
at v(1) gives a selection principle:
with the constraints (27) and (34). (Although this problem seems to coincide
with Problem 316, this is in fact not the case because now the given Lagrangian
is positive definite and the boundary conditions (34) must be treated as constraints.
A negative eigenvalue is not possible under these conditions.)
[Answer: The method of the Lagrangian multiplier yields the differential
equation
with the following interpretation. A strict minimum is not possible under the
given constraints. The minimum comes arbitrarily near to the solution of the
eigenvalue problem which belongs to the boundary conditions v′(0) = v′(π) = 0,
without actually reaching it.
This example shows that a variational problem does not allow any tampering
with its inherent boundary conditions (which demand the vanishing of the boundary
term (37)). If constraints are prescribed which do not harmonise with the
inherent boundary conditions, the resulting differential equation is put out of
action at the end-points, in order to allow the fulfilment of the inherent boundary
conditions which must be rigidly maintained.]
BIBLIOGRAPHY
[1] Cf. {1}, pp. 82-97, 324-36, 466-510, 522-35
[2] Cf. {12}, Chapters XIV-XVII (pp. 281-385)
[3] Jahnke, E., and F. Emde, Tables of Functions with Formulae and Curves
(Dover, New York, 1943)
[4] McLachlan, N. W., Bessel Functions for Engineers (Clarendon Press, Oxford,
1934)
[5] Magnus, W., and F. Oberhettinger, Formulas and Theorems of the Special
Functions of Mathematical Physics (Chelsea, New York, 1949)
[6] Szego, G., Orthogonal Polynomials (Am. Math. Soc. Colloq. Pub., 23, 1939)
[7] Watson, G. N., A Treatise on the Theory of Bessel Functions (Cambridge
University Press, 1944)
CHAPTER 8
BOUNDARY VALUE PROBLEMS
8.1. Introduction
In all the previous chapters we were primarily concerned with the general
theory of linear differential operators which were characterised by homo-
geneous boundary conditions; that is, certain linear combinations of the
unknown function v(x) and its derivatives on the boundary were prescribed
as zero on the boundary, while, on the other hand, the "right side" of the
differential equation was prescribed as some given function β(x) of the
domain. Historically, another problem received much more elaborate
attention. The given differential equation itself is homogeneous by having
zero on the right side (hence β(x) = 0). On the other hand, some linear
combinations of the unknown function and its partial derivatives are now
prescribed as given values, generally different from zero. Instead of an
inhomogeneous differential equation with homogeneous boundary conditions
we now have a homogeneous differential equation with inhomogeneous
or more generally
represents the "parabolic" type, while the equation of the vibrating string:
or more generally
(called the "wave equation"), belong to the "hyperbolic" type. With the
advent of wave-mechanics the Schrödinger equation
and many allied equations came into the focus of interest. Here the question
of boundary values is often of subordinate importance since the entire space
is the domain of integration. But in atomic scattering problems we encounter
once more the same kind of boundary value problems which occur in optical
and electro-magnetic diffraction phenomena, associated with the Maxwellian
equations.
434 BOUNDARY VALUE PROBLEMS CHAP. 8
where
which yields
SEC. 8.2 INHOMOGENEOUS BOUNDARY CONDITIONS 437
The second term is extended over the boundary surface σ and involves the
boundary values of ui(σ) and its partial derivatives, together with the
boundary values of v0(x), which in fact coincide with the given boundary
values for v(x).
If we separate the first term on the right side of (9) and examine its
contribution to the function, we find the infinite sum
and it seems that we have simply obtained −v0(x) which compensates the
v0(x) we find on the right side of the expression (1). It seems, therefore,
that the whole detour of separating the preliminary function v0(x) is
unnecessary. But in fact the function v0(x) does not belong to the functions
which can be expanded into the eigenfunctions vi(x), since it does not
satisfy the necessary homogeneous boundary conditions. Moreover, the
separation of this term from the sum (9) might make the remaining series
divergent. In spite of the highly arbitrary nature of v0(x) and the inde-
pendence of the final solution (2) of the choice of v0(x), this separation is
nevertheless necessary if we want to make use of the expansion of the
solution into a convergent series of eigenfunctions vi(x).
The method of the Green's function is likewise applicable to our problem
and here we do not have to split away an auxiliary function but can apply
the given inhomogeneous boundary values directly, in spite of the fact that
the Green's function is defined in terms of homogeneous boundary con-
ditions. We have defined the Green's function by the differential equation
now this volume integral is zero, due to the vanishing of β(x). On the
other hand, we now have to make use of the "extended Green's identity"
(4.17.4):
and obtain v(x) in the form of an integral extended over the boundary
surface σ:
Here the functions Fa(u, v) have automatically the property that they
blot out all those boundary values which have not been prescribed. What
remains is a surface integral involving all the "given right sides" of the
boundary conditions. We may denote them by
where the auxiliary functions G1(x, σ), . . . , Gp(x, σ) are formed with the
help of the Green's function G(x, ξ) and its partial derivatives, applied to the
boundary surface σ.
The unique solution thus obtained is characterised by the following
properties. If our operator is incomplete by allowing solutions vi(x) of the
homogeneous equation
we assume that the given boundary conditions fα(σ) are such that the
integral conditions
are automatically fulfilled since otherwise the boundary data are in-
compatible and the given problem is unsolvable.
Problem 321. Assume the existence of non-zero solutions of (18) and the ful-
filment of the required compatibility conditions (19). Now apply the method
of transforming the given inhomogeneous boundary value problem into a homo-
geneous boundary value problem with inhomogeneous differential equation.
Show that now the orthogonality of the right side to the "forbidden" axes ui
is automatically satisfied.
has many exceptional properties, due to its close relation to the celebrated
"Cauchy-Riemann differential equations", which are at the foundation of
the theory of analytical functions. A function of the complex variable
z = x + iy:
has the property that its real and imaginary parts are related to each other
by the two partial differential equations
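The two partial differential equations referred to are the Cauchy-Riemann equations; writing f(z) = u(x, y) + iv(x, y):

```latex
\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad
\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x},
% from which, by cross-differentiation, both the real and the imaginary
% part satisfy the potential equation:
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 .
```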
(with a similar solution for V2). But now we have to demand that our
solution be periodic in θ with the period 2π, since two points which belong
to the angles θ and θ + 2π, in fact coincide, and without the required
periodicity our solution would not be single-valued. This condition restricts
the possible values of the product αβ to k², where k is an arbitrary positive
integer, or zero:
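With αβ restricted to k², the separated solutions take the familiar form; since the displayed equations are not reproduced here, the following is a hedged reconstruction in standard polar notation:

```latex
V_k(r,\theta) = r^{k}\,\bigl(a_k \cos k\theta + b_k \sin k\theta\bigr),
\qquad k = 0, 1, 2, \ldots
% together with the singular companion solutions r^{-k}(\ldots) and
% \log r, which are excluded inside a circle containing the origin.
```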
We now come to the solution of the first set of conditions (9). Here we
obtain for U1 alone the differential equation
while the first of the conditions (9), combined with (13), gives
represents once more the complete solution of the equation (20), if we stay
inside a circle within which the equation holds. Let us normalise the
radius of that circle to r = 1 and prescribe on the periphery of the circle
the boundary values
This can be done in the present case since we have the real part of an infinite
series which is summable by the formula of the geometrical series:
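The summation by the geometric series alluded to is the classical one; with ψ denoting the angle between the point of observation and the boundary point (a notational assumption):

```latex
\tfrac{1}{2} + \sum_{k=1}^{\infty} r^{k} \cos k\psi
  = \operatorname{Re}\Bigl(\tfrac{1}{2}
      + \sum_{k=1}^{\infty} \bigl(re^{i\psi}\bigr)^{k}\Bigr)
  = \frac{1 - r^{2}}{2\,\bigl(1 - 2r\cos\psi + r^{2}\bigr)},
  \qquad r < 1 .
```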
This Green's function is not identical with the full Green's function
G(x, ξ) of the potential equation, but the two functions are closely related
to each other. We have seen in (2.16) that the auxiliary functions
Gα(x, σ) are expressible in terms of the Green's function G(x, ξ) and its
partial derivatives, taken on the boundary surface σ, which in our case of
two dimensions is reduced to a boundary curve. But in our simple problem
of a circle we have no difficulty in constructing even the full Green's function
G(x, ξ) which satisfies the differential equation
hold in spaces of arbitrary dimensions, although the spaces of two, three, and
four dimensions are of primary interest from the applied standpoint:
1. The operator remains invariant with respect to arbitrary translations
and rotations. Hence any solution of the Laplacian equation remains a
whose solution is
5. We will assume that β(r) is different from zero only within a certain
(n-dimensional) sphere of the radius r = ε. Then our particular solution
v1(r) outside of this sphere must be of the form (33) and the constant B will
be determined by the application of the relation (40), integrating over the
inner sphere:
in three dimensions
in four dimensions
for n = 2k:
for n = 2k + 1:
By the definition of the delta function the integral over the right side
becomes 1. The delta function can be assumed to be spherically symmetric
and concentrated in a small sphere—whose centre is at the origin—with
the radius ε which shrinks to zero. Hence the constant B is now 1 and
the validity of our solution (43-46) applies in the limit to any point outside
the point r = 0. We thus obtain for the Green's function G(x, ξ):
in two dimensions:
in three dimensions:
in four dimensions:
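The three displayed results are not legible here; in standard form (with the sign convention ΔG = −δ assumed, and r = |x − ξ|) they read:

```latex
% two dimensions:
G(x,\xi) = \frac{1}{2\pi}\,\log\frac{1}{r}, \qquad
% three dimensions:
G(x,\xi) = \frac{1}{4\pi r}, \qquad
% four dimensions:
G(x,\xi) = \frac{1}{4\pi^{2} r^{2}} .
```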
and we observe that ρ and r0ρ′ become equal on the unit circle. Hence
the linear superposition
Hence we have obtained the Green's function of the "first boundary value
problem of potential theory" (when the values of F(o-) are prescribed on the
boundary), for the case that the boundary is the unit circle or the unit
sphere. The result is for the case of two dimensions:
If—for the case of the circle—we substitute in this formula the expression
(55) (σ corresponds to r = 1), we obtain
Finally, if the point x does not have the coordinates r0, 0 but r, θ, we can
reduce this problem to the previous one by a mere rotation of our reference
system. The final formula becomes, if the integration variable is denoted
by φ:
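The final formula is Poisson's integral; in the notation assumed here, with f(φ) the prescribed boundary values and φ the integration variable:

```latex
v(r,\theta) = \frac{1}{2\pi} \int_{0}^{2\pi}
  \frac{1 - r^{2}}{1 - 2r\cos(\theta - \varphi) + r^{2}}\;
  f(\varphi)\,d\varphi, \qquad r < 1 .
```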
[Answer:
Constraint:
Problem 323. Construct the complete Green's function G(x, ξ) of this problem
(constrained on account of (64)) and show that the solution (63) is identical
with the solution obtained on the basis of the Green's function method.
Demonstrate the symmetry of G(x, ξ).
(Hint: Use again a proper linear combination of the contributions of the point x
and its conjugate x'.)
[Answer :
Definition of G(x, ξ):
Solution:
Problem 324. Obtain special solutions of the potential equation (20) by assuming
that V(r, θ) is the sum of a function of r and a function of θ:
Formulate the result as the real part of a function f(z) of the complex variable
z = re^iθ.
[Answer:
and the following partial differential equation for the function Y(θ, φ):
in view of the fact that S(φ) must become a periodic function of φ. The
associated eigenfunctions are
We start with m = 0. Then the equation (8) becomes, in the new variable
and we will return to the equation (4), in order to obtain the full solution
of V(r, θ, φ):
and we will dispose of u(x) in such manner that the factor of w' shall remain
−2x. For this purpose we have to put
But this is exactly the differential equation (8) for β = m² and α = n(n + 1).
We have thus obtained the following particular solutions of the Laplacian
differential equation:
Problem 325. Show that to any solution V(r, θ, φ) of the Laplacian equation
a second solution can be constructed by putting
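The construction asked for in Problem 325 is presumably the classical inversion (a hedged reconstruction, since the displayed formula is missing):

```latex
V^{*}(r, \theta, \phi) = \frac{1}{r}\,
  V\!\Bigl(\frac{1}{r},\, \theta,\, \phi\Bigr),
% which again satisfies Laplace's equation in three dimensions
% whenever V does (Kelvin's transformation).
```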
Obtain the coefficients cn of the expansion (19) in terms of the given boundary
values.
[Answer:
Answer:
Problem 330. Choose the upper sign in (35) and demonstrate the following
property of this solution. We select the point r = a, 6 = 0 on the positive
z-axis and construct a sphere of the radius b < a around the point (a, 0) as
centre. Then the normal derivative of V(r, θ) along this sphere becomes
SEC. 8.5 THE POTENTIAL EQUATION IN THREE DIMENSIONS 453
Problem 331. The Green's function for the case of a unit sphere was obtained
in (4.56) as far as the "first boundary value problem of potential theory" is
concerned. Solve the same problem for the "second boundary value problem"
("Neumann problem"). Hint: the operation with the conjugate points is not
enough, but we succeed if the result (36) is taken into account.
[Answer:
Definition of Green's function:
Constraint:
(The additional constant has no effect on the integration and may be omitted.)]
Problem 332. Show that the solution thus obtained automatically satisfies the
conditions
provided that the compatibility condition (38) is satisfied. Explain the origin
of the condition a).
[Answer: the vanishing of the coefficient c0 in the expansion (19).]
Problem 333. Consider the problem of minimising the integral
Our differential equation (1) now separates into the ordinary differential
equation
which is solved by
SEC. 8.6 VIBRATION PROBLEMS 455
This means that at the time moment t = 0 the initial displacements and the
initial velocities of the vibrating medium are given. Then we can expand
both f(x) and g(x) into the eigenfunction system vi(x):
But then the comparison with (5) shows that we have in fact obtained the
coefficients ai, bi explicitly:
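Although the expansions (5)-(7) themselves are not reproduced here, the standard form of the result can be sketched (with νi = √λi the natural frequencies, a notational assumption):

```latex
v(x,t) = \sum_i \bigl( a_i \cos \nu_i t + b_i \sin \nu_i t \bigr)\, v_i(x),
\qquad
f(x) = \sum_i f_i\, v_i(x), \quad g(x) = \sum_i g_i\, v_i(x) ;
% comparing v and \partial v/\partial t at t = 0 with f and g gives
a_i = f_i, \qquad b_i = \frac{g_i}{\nu_i} .
```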
Problem 336. Show that the Green's function G(x, t; ξ, τ) of the problem (1),
with f(x) = g(x) = 0, can be constructed as follows:
[Answer:
if ξkm denotes the (infinitely many) zeros of the Bessel function of the order k:
Problem 338. Show that a vibration problem with inhomogeneous but time-
independent boundary conditions can be solved as follows. We first solve the
given inhomogeneous boundary value problem for the differential equation
with
If we denote
and replace the product of two sine functions by the difference of two
cosine functions, we obtain the general term of the expansion (7) in the
following form:
where
The solution of this variational problem is the differential equation (2), with
the boundary conditions (3), caused by the constraints which are imposed
on the string by the forces which prevent it from moving at the two fixed
ends.
The Green's function itself is the displacement of the string, as a function
of x and t, after the hammer blow is over. The expression (7) shows at
once an interesting property of the motion of the string, observed by the
early masters of acoustical research: "an overtone which has a nodal point
at the point where the hammer strikes, cannot be present in the harmonic
spectrum of the vibrating string". Indeed, the sum (7) represents a
harmonic resolution of the motion of any particle of the string, if we consider
x as fixed and describe the motion as a function of time. The overtones
have frequencies which are integer multiples of the fundamental frequency
1/(2l) and any overtone receives the weight zero if the last factor has a nodal
point at the point x = £.
It is this harmonic analysis in time which leads to the notion that the
string performs some kind of "vibration", as the name "vibrating string"
indicates. That such vibrations are possible is clear from the mathematical
SEC. 8.7 THE PROBLEM OF THE VIBRATING STRING 459
form of the eigenfunctions, if they are taken separately. But this does not
mean that under the ordinary conditions of bringing the string into motion,
some kind of vibration will occur. It is a curious fact that our "physical
intuition" can easily mislead us if we try to make predictions without the
aid of the exact mathematical theory. What happens if we strike the
string with a hammer ? How will the disturbance propagate? The answer
frequently given is that some kind of "wave" will propagate along the
string, similar to the waves observed in a pond if a stone is dropped in the
water. Another guess may be made from the way in which an electro-
magnetic disturbance is propagated in space: it spreads out on an
ever-increasing sphere at the velocity of light, giving a short disturbance
at the points swept over by this expanding sphere. The picture derived
from this analogy would be a narrow but sharp "hump" on the wire
which will propagate with constant velocity (in fact with the velocity 1
due to the normalisation of our differential equation), both to the right
and to the left from the point of excitation.
Both guesses are in fact wrong. In order to study the phenomenon in
greater detail, we have to find first of all the significance of the sum (10).
We have encountered the same sum earlier (cf. 2.2.10), and making use of
the result there obtained we get
which implies
Let us now apply the hammer blow first of all at the centre of the string,
that is the point ξ = l/2 and examine the displacement of the string, given
by (11), at the points x = l/2 ± x1. By the principle of symmetry we
know in advance that the disturbance must spread symmetrically to the
right and to the left, and thus the displacement v(l/2 ± x1) which belongs to
these points, must be the same. If we substitute the result (14) in the
general formula (11), we get
time l, when it jumps over to the height −½ and repeats the same cycle
over once more, on the negative side. After the time of 2l the full cycle is
repeated in identical terms.
The remarkable feature of this result is that with the limited kinetic
energy of the initial blow larger and larger portions of the string are excited,
which seems to contradict the conservation law of energy. In actual fact
the expression (13) shows that the accumulated potential energy is zero
because the hill is of constant height and thus dv/dx = 0. The local kinetic
energy is likewise zero because the points of the hill, after rising to the
constant height of ½ (or −½) remain perfectly still, their velocity dropping to
zero. The exchange between potential and kinetic energy takes place
solely at the end of the hill which travels out more and more but repeats the
same phenomenon in identical terms.
If the hammer blow is applied away from the centre ξ = l/2, the only
change is that the reflection of the hill at the two end-points now occurs at
different time moments and thus the receding of the hill starts at one end
at a time when the hill is still moving forward on the other side. The
collapsing and reversal of sign now occurs at the mirror image of the point
of excitation, but again after the time 2l has passed from the time moment
of excitation.
The actual motion of the particles is far from a harmonic oscillation. It
projected down on the X-axis by drawing two straight lines at the angles
of ±45° downward. We arrive at the two points x1 and x2. Then we
take the arithmetic mean of the two values f(x1) and f(x2) as the solution
v(x, t). If f(x) is plotted geometrically, we can obtain the solution by a
purely geometrical construction.
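The geometrical construction just described expresses d'Alembert's solution for an initial shape f(x) and zero initial velocity (with the propagation velocity normalised to 1, as in the text):

```latex
v(x,t) = \tfrac{1}{2}\bigl[f(x_1) + f(x_2)\bigr], \qquad
x_1 = x - t, \quad x_2 = x + t ,
% the two points reached from (x,t) by the straight lines of
% slope \pm 45^\circ drawn down to the X-axis.
```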
If we pluck the string at the centre, the shape of the string at t = 0 will
be an isosceles triangle. How will this triangle move, if we release the
string? Will the entire triangle move up and down as a unit, vibrating in
unison ? This is not the case. The geometrical construction according to
Figure (22) demonstrates that the outer contour of the triangle remains at rest
but a straight line moves down with uniform speed, truncating the triangle
to a quadrangle of diminishing height, until the figure collapses into the
axis OL. Then the same phenomenon is repeated downward in reversed
sequence, building up the triangle gradually, until the mirror image of the
original triangle is restored. We are now at the time moment t = I and the
half-cycle of the entire period is accomplished. The second half-cycle
repeats the same motion with opposite sign.
If the plucking occurs at a point away from the centre, the descending
straight line will not be horizontal. Furthermore, the triangle which
develops below the zero-line is the result of two reflections, about the X
and about the Y axes. In other respects the phenomenon is quite similar
to the previous case.
Problem 339. Obtain the solution of the boundary value problem (2), (3), with
the initial conditions
obtain the general solution of the differential equation (2) in the form
P(x + t) + Q(x - t) which may also be written in the form
b) Show that the boundary conditions (3) demand the following extension of
the functions A(p) and B(p) beyond the original range [0, l]:
c) Obtain the solutions (17) and (21) on the basis of this method, without
any eigenfunction analysis.
(making use of the simplified notation vx for ∂v/∂x, vt for ∂v/∂t). Now our
SEC. 8.8 NATURE OF HYPERBOLIC DIFFERENTIAL OPERATORS 467
with
These latter conditions arise from the fact that we want to employ the
method discussed in Section 2 which transforms the inhomogeneous boundary
value problem (in our case initial value problem) with a homogeneous
differential equation into an inhomogeneous differential equation with
homogeneous boundary conditions. The "right side" fi(x, t) of this in-
homogeneous equation has to be put in the new formulation (7) in the place
of the zero of the third equation, while the right sides of the first and the
second equation remain zero. If we eliminate p1 and p2 from these two
equations and substitute in the third equation, we are back at the single
equation
but not the boundary conditions (9)). We will call the orthogonal eigen-
functions of our problem Ui(x) and Vi(x), each one of these functions
representing in fact a vector, with the three components
with
with the proper boundary and initial conditions. If we once more solve the
time-independent eigenvalue problem
Then
with
where
The significance of the Green's function (14) is the heat flow generated by
a heat source of the intensity 1, applied during an infinitesimal time at
t = 0, and in the infinitesimal neighbourhood of the point x = £. Such a
heat flow would occur even if the body extended on both sides to infinity
SEC. 8.9 THE HEAT FLOW EQUATION 471
We can thus ask for the limiting value of the Green's function if l goes to
infinity. Accordingly, we will place the point ξ at the midpoint of the rod:
ξ = l/2 and put once more (as we have done in the problem of the vibrating
string):
letting l go to infinity. But then the sum on the right side of (15) becomes
more and more an integral; in the limit, as l grows to infinity, we obtain
and we obtain
Thus the Green's function of the heat flow equation in one dimension for
an infinitely extended medium becomes
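Written out, this limiting Green's function is the familiar heat kernel (assuming the equation is normalised to ∂v/∂t = ∂²v/∂x²):

```latex
G(x, t; \xi) = \frac{1}{2\sqrt{\pi t}}\,
  \exp\Bigl(-\frac{(x-\xi)^{2}}{4t}\Bigr), \qquad t > 0 ,
% the spreading of a unit heat pulse applied at x = \xi, t = 0;
% its integral over all x equals 1 for every t > 0.
```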
which hold if the two ends of the rod are insulated against heat losses.
a) Show that any solution of the heat flow equation under these boundary
conditions satisfies the condition
where
with prescribed boundary values, determined by the closed space curve which
terminates the membrane. We will consider the simple case of a plane
circular membrane of unit radius whose frame is kept at the constant
distance a from the horizontal plane. Here the solution
We then approach the limit in which only one point of the membrane is
pinned down. For the sake of simplicity we assume that the peg is centrally
applied.
Our problem is a "minimum problem with constraints" since the potential
SEC. 8.10 MINIMUM PROBLEMS WITH CONSTRAINTS 473
where the right side is proportional to the force density required for the
pinning down of the membrane.
Now we have seen that the only possible solution of (3) under circular
symmetry is
Let us now multiply the inhomogeneous equation (4) by the area element
2πr dr and integrate between 0 and ε. We then obtain
The quantity on the left side is proportional to the total force required for
pinning the membrane down. We see that this force is becoming smaller
and smaller as the radius ε of the peg decreases. At the same time the
solution (7) shows that the indentation caused by the peg becomes more and
more local since for very small ε the solution v(r) becomes practically v = a,
except in the immediate neighbourhood of r = ε. In the limit, as ε recedes to zero, we
obtain the following solution: the membrane is everywhere horizontal but
it is pinned down at r = 0. This means that v(r) assumes everywhere the
constant value v = a, except at r = 0, where v(0) = 0.
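Under circular symmetry the approach to this limit can be followed explicitly; a hedged sketch, with v(ε) = 0 at the peg and v(1) = a at the frame:

```latex
v(r) = a\,\frac{\log (r/\varepsilon)}{\log (1/\varepsilon)},
\qquad \varepsilon \le r \le 1 ;
% the total pinning force is proportional to the flux
%   2\pi r\, v'(r) = \frac{2\pi a}{\log(1/\varepsilon)}
%     \;\longrightarrow\; 0  \quad (\varepsilon \to 0),
% while v(r) \to a at every fixed r > 0.
```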
While this solution exists as a limit, it is not a legitimate solution because
it is a function which cannot be differentiated at r = 0 and hence does not
belong to that class of functions which are demanded in the process of
minimising the integral (1). We thus encounter the peculiar situation that
if we require the minimisation of the integral (1) with the boundary con-
dition v(1) = a and the inside condition
this problem has no solution. We can make the given integral as small
as we wish but not zero, and thus no definite minimum can be found under the
given conditions.
The situation is quite different, however, if the membrane is not pinned
down at a point but along a line. A line has the same dimensionality as the
boundary and a constraint along a line can in fact be considered as a boundary
condition, if we add the line of constraint to the outer boundary. Constraints
of this type are of frequent occurrence in physical problems. We may
consider for example a three-dimensional flow problem of a fluid which is
forced to flow around an obstacle which is given in the form of a surface.
Or we may have a problem in electrostatics in which the potential along an
inner surface is given as zero, since the surface is earthed. Again, in a
diffraction problem which requires the solution of the differential equation
where β(x) is not zero in the domain of the constraint and the symbolic
notation x refers again to an arbitrary point of an n-dimensional manifold.
The solution of our new problem will be once more a function VQ(X) which
satisfies the given inhomogeneous boundary data, together with the homo-
geneous differential equation, but to this solution we now have to add the
solution of the inhomogeneous equation (12), with homogeneous boundary
conditions. This can be done in terms of the Green's function of our problem
and thus the complete solution of our problem can be given as follows:
But now we must make use of the fact that the given constraint exists in a
certain sub-domain of our space, more precisely on a given inner surface
which we want to denote by σ′, in distinction to the boundary surface σ.
Accordingly we have to rewrite the equation (13) as follows:
The quantity β(σ′) is proportional to the density of the surface force which
is required for the fulfilment of the given constraint. But this force is not a
given quantity. What is given is the constraint, which demands that v(x)
becomes zero on the surface a', or more generally that v(x) becomes some
prescribed function on this surface. The force needed for the maintenance
of this condition adjusts itself in such a way that the constraint is satisfied.
Let us express this physical situation in mathematical terms. We will
denote by s′ an arbitrarily selected point of the inner surface σ′. Then
our constraint demands that the following equation shall be satisfied:
Then
point x = £, but to this part we had to add a regular solution of the potential
equation (cf. (4.47-49)), chosen in such a way that on the boundary a
the required homogeneous boundary conditions of the Green's function are
satisfied. This is now quite different in our present problem. The adjoint
equation—which defines the Green's function—is now strongly under-
determined and in fact we obtain no boundary conditions of any kind for the
function G(x, £). Hence we can choose the added function V(x) as any
solution of the potential equation, even as zero. Hence the over-determina-
tion of the problem has the fortunate consequence that the Green's function
can be explicitly given in the form of a simple power of the distance r_xξ
(times a constant), in particular in two dimensions
Now the fact that our data have been properly given has the following
consequence. If we approach with the inside point x any point s on the
boundary, we actually approach the given boundary value F(s). Hence we
can consider as the criterion of properly given boundary values that the
equation (3) remains valid even in the limit, when the point x coincides
with the boundary point s. Then we get the following integral relation,
valid for any point s of the boundary surface a:
This is once more a Fredholm type of integral equation, only the kernel
K(x, ξ) has changed, compared with the previous problem (5).
From the standpoint of obtaining an explicit solution in numerical terms
we may fare better if we avoid the solution of an integral equation whose
kernel goes out of bound at the point ξ = s. The surplus data are also
obtainable by making use of the compatibility conditions which have to be
satisfied by our data. We then have a greater flexibility at our disposal
because the compatibility conditions appear in the form
where u(σ) can be chosen as any function which satisfies the homogeneous
equation
and is free of any singularities inside the given domain. We can once
more choose as our u(σ) the reciprocal distance 1/r_σx, provided that the fixed
point x is chosen as any point outside the boundary surface. By putting the
point x sufficiently near to the surface, yet not directly on the surface, we
* For a more thorough study of the theory of integral equations, cf. [8] and [11] of
the Chapter Bibliography.
SEC. 8.12 THE CONSERVATION LAWS OF MECHANICS 479
avoid the singularity of the kernel and reduce the determination of the
surplus data numerically to the solution of a well-conditioned large-scale
system of ordinary linear equations.
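The compatibility check described above rests on Green's second identity: for a potential function V and any harmonic u that is free of singularities inside the domain, the boundary integral of (u ∂V/∂n − V ∂u/∂n) vanishes. A minimal numerical sketch of this (the harmonic function V = x² − y², the unit-circle boundary, and the exterior source point (2, 0) are my own illustrative choices, not taken from the text):

```python
import numpy as np

# Numerical check of the compatibility condition: for a potential function V
# and any harmonic u free of singularities inside the unit disk, the boundary
# integral of (u dV/dn - V du/dn) vanishes (Green's second identity).
# V = x^2 - y^2 and the outside source point (2, 0) are illustrative choices.
theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
x, y = np.cos(theta), np.sin(theta)

V = x**2 - y**2                  # boundary values of a harmonic function
dV = 2.0 * np.cos(2.0 * theta)   # dV/dn = dV/dr on the unit circle

x0, y0 = 2.0, 0.0                # singular point placed OUTSIDE the boundary
r2 = (x - x0)**2 + (y - y0)**2
u = 0.5 * np.log(r2)             # u = log|z - z0|, harmonic inside the disk
du = ((x - x0) * x + (y - y0) * y) / r2  # normal derivative of u at r = 1

integral = np.sum(u * dV - V * du) * (2.0 * np.pi / theta.size)
print(abs(integral))             # vanishes up to quadrature error
```

Placing the singular point of u near to, but outside, the boundary and varying its position generates the well-conditioned system of linear equations for the surplus data mentioned above.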
Problem 345. Show on the basis of (3) that the potential function V is
everywhere inside the domain τ an analytical function of the rectangular
coordinates (x, y, z) (that is, the partial derivatives of all orders exist), although
the boundary values themselves need not be analytical.
and for this reason we call the matter tensor a "symmetric tensor". Hence
the number of independent components is reduced from n² to n(n + 1)/2,
which means in 2, 3, and 4 dimensions respectively 3, 6, and 10 independent
components. A further fundamental property of the matter tensor is that
its divergence vanishes at all points:
If the matter tensor were not symmetric, these would be all the adjoint
solutions since in that case the operator on the left side of (5) would be
replaced by ∂Φ_i/∂x_i alone. The symmetry of the matter tensor has, however,
another class of solutions in its wake. We choose an arbitrary pair
of subscripts, for example i and k, and put
while all the other Φ_α are equated to zero. These solutions, whose total
number is n(n — l)/2, give us an additional set of boundary conditions,
namely
expresses the fact that a material body can be in equilibrium only if the
resultant of the external forces acting on it is zero.
Let us now turn to the second class of boundary conditions of the type
482 BOUNDARY VALUE PROBLEMS CHAP. 8
which means that a material body can be in equilibrium only if the resultant
moment of the external forces acting on the body is zero. The conditions (12)
and (15) are fundamental in the statics of rigid or any other kind of bodies.
We have obtained them as the compatibility conditions of a partial differential
equation, namely the equation which expresses the divergence-free nature
of the matter tensor. Earlier, in Chapter 4.15, when dealing with an elastic
bar which is free at the two end-points, we found that the differential
equation of the elastic displacement was only solvable if two compatibility
conditions are satisfied: the sum of the forces and the sum of the moments
of the forces had to be zero. At that time we had an ordinary differential
equation of fourth order; now we have a system of three partial differential
equations of first order which leads in a more general setting to similar
compatibility conditions.
We will now leave the realm of statics and enter the realm of dynamics.
Einstein in his celebrated "Theory of Relativity" has shown that space
and time belong inseparably together by forming a single manifold.
Minkowski demonstrated that the separation of the physical world into space
and time is purely accidental. All the equations of mathematical physics
can be written down in a form in which not merely the three space variables
x₁, x₂, x₃ (corresponding to the three rectangular coordinates x, y, z) play an
equivalent role, but these three coordinates are supplemented by the fourth
coordinate x₄, which in physical interpretation corresponds to the product
ict, where c is the velocity of light:
First of all we consider the four conditions (8). In view of the added
terms we have to complement the previous surface integrals by further
integrals which are extended over the entire volume of the masses. We
will introduce the following four quantities, three of which correspond to the
three components of a vector and the fourth to a scalar:
Since we have integrated over the total volume of our domain, these four
quantities are no longer functions of x₁, x₂, x₃, but they are still functions of
x₄. Now the equation (12) appears in the following more general form:
Let us now remember that the first term of (20) was physically interpreted
as the "total force" exerted on the body by the outside forces. Since
Newton's law of motion states that "the time rate of change of the total
Now we define the "centre of energy" or "centre of mass" of our mechanical
system by putting
We can get rid of the last term by subtracting the equation (21), multiplied
by ξ_k, thus obtaining
must be sufficient to restore the missing quantity vt(x, 0). But as a boundary
value problem we have violated the condition that a hyperbolic differential
equation should not be characterised by peripheral data.
3. The values of the potential V(x, y, z) are given on a very flat ellipsoid
σ. By calculation we have obtained V in the neighbourhood of the origin
x = y = z = 0, in the form of an infinite Taylor expansion which, however,
does not converge beyond a certain small radius r = ρ at which the sphere
r = ρ touches the ellipsoid. By an accident we have lost the original
boundary values whose knowledge is very precious to us. We want to
restore the original data from the given Taylor expansion. We know that
the solution exists and is in fact obtainable by the method of analytical
continuation. But considered as a boundary value problem we can say that
V and ∂V/∂ν are given on the inner boundary r = ρ, while no boundary
values are given on the outer boundary σ. These are initial-type boundary
conditions for an elliptic differential equation, in contradiction to the general
rules.
From the standpoint of the general analytical theory we have the right
to ask what motivations are behind these prohibitions. The answer was
given by J. Hadamard who, in his celebrated "Lectures on the Cauchy
Problem" (cf. [5]), introduced the concept of a "well-posed" or "correctly
set" problem ("un problème correctement posé"), by postulating certain
conditions that a properly formulated boundary value problem should
satisfy. The context of his discussions demonstrates that he considers
both under-determined and over-determined problems as not-well-posed.
In the under-determined case the solution is not unique, while in the over-
determined case the given data are not freely choosable but restricted by the
necessary compatibility conditions. Hence Hadamard's "well-posed"
problem represents in the language of algebra the case of an n x n linear
system with non-vanishing determinant which establishes a one-to-one
correspondence between the left side and the right side.
There is, however, a third condition demanded by Hadamard which has
no analogy in the algebraic situation. We will call this the Condition C:
"an arbitrarily small perturbation of the data should not cause a finite change
in the solution". It is this condition to which we have to pay particular
attention when dealing with the general theory of boundary value problems,
in which we abandon the restrictions which go with the special class of
"well-posed" problems.
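Condition C has an instructive finite-dimensional caricature, not in the text itself: in a nearly singular n × n system the same tiny perturbation of the data produces an enormous change of the solution. The matrices and the perturbation in the sketch below are illustrative assumptions of mine:

```python
import numpy as np

# Finite-dimensional caricature of Hadamard's Condition C: the same small
# perturbation of the data b barely moves the solution of a well-conditioned
# system A x = b, but is amplified enormously by a nearly singular one.
# Both matrices and the perturbation size are illustrative choices.
A_good = np.array([[2.0, 1.0],
                   [1.0, 3.0]])           # well-conditioned
A_bad = np.array([[1.0, 1.0],
                  [1.0, 1.0 + 1e-8]])     # nearly singular

b = np.array([1.0, 2.0])
db = 1e-8 * np.array([1.0, -1.0])         # tiny perturbation of the data

for A in (A_good, A_bad):
    dx = np.linalg.solve(A, b + db) - np.linalg.solve(A, b)
    print(np.linalg.cond(A), np.linalg.norm(dx))
```

The well-posed case corresponds to a determinant bounded away from zero; as the determinant approaches zero, the amplification of the data error grows without limit.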
This is a purely symbolic equation which has no direct significance since the
right side represents a necessarily divergent infinite sum. But the signifi-
cance of this sum was that the operation Dv(x) could be obtained with the
help of the following integral operation:
more than compensates for the largeness of λ_i, and even λ_i c_i converges to
zero. The resulting sum
converges and represents the function u(x) which came about as the result
of the operation Dv(x).
The sum (1) finds its natural counterpart in another infinite sum which
represents the eigenfunction decomposition of the inverse operator:
This too is a sum which need not have an immediate significance. What
we mean by it is once more that this sum operates in the sense of a term by
term integration on the function u(ξ), in order to obtain v(x):
We can thus go from the left to the right, starting with v(x) and obtaining
u(x) on the basis of the operation (2), or we can go from the right to the
left, obtaining v(x) on the basis of the operation (6). So far as the analytical
theory of linear differential operators is concerned, both operations are of
equal interest, although usually we consider only the second operation, if our
task is to "solve" the given differential equation (with the given boundary
conditions).
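In the discrete picture the two reciprocal expansions can be checked directly. The sketch below (a symmetric second-difference matrix stands in for D; this choice is mine, not the text's) rebuilds both the operator and its inverse, the discrete analogue of the Green's function, from the eigenvectors:

```python
import numpy as np

# Discrete analogue of the two expansions: a symmetric operator D with
# eigenpairs (lam_i, v_i) is the sum of lam_i v_i v_i^T, while its inverse
# (the discrete "Green's function") is the sum of v_i v_i^T / lam_i.
# The second-difference matrix below is an illustrative stand-in for D.
n = 6
D = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # positive definite

lam, v = np.linalg.eigh(D)    # no zero eigenvalue here, so all axes are kept

D_from_sum = sum(lam[i] * np.outer(v[:, i], v[:, i]) for i in range(n))
G_from_sum = sum(np.outer(v[:, i], v[:, i]) / lam[i] for i in range(n))

print(np.allclose(D_from_sum, D))             # True
print(np.allclose(G_from_sum @ D, np.eye(n))) # True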
The operator D~l is much nearer to an actual value than D itself. In
many problems the infinite sum (5) converges and defines a definite function
of the two points x and £; the " Green's function " of the problem. But even
if the sum (5) did not converge in itself, we could arrive at the Green's
function by a proper limit process.
This general exposition has to be complemented by the remark that we
have omitted from our expansions the eigenvalue A = 0. The significance
of the zero eigenvalue was that certain dimensions of the function space
were not represented in the operator and exactly for this reason the omission
of these axes was justified. However, the zero axes of the C7-space were
not immaterial. We had to check our data concerning their orthogonality
with respect to these axes since otherwise our problem was self-contradictory
and thus unsolvable.
What would happen now to this theory if we applied it to the case of
boundary data which hi the customary sense are injudiciously chosen?
490 BOUNDAEY VALUE PROBLEMS CHAP. 8
cannot converge, exactly as the sum (4) could not converge on account of
the limit point A$ = oo of the regular spectrum.
But here again the divergence of the sum (10) does not mean that the
solution (6) has to go out of bound. The substitution of a permissible
function in (2) had the consequence that the right side of (4) approached a
definite limit which was u(x). Now, if we go backward by starting with
u(x) as the given function, we shall obtain the right side of (5) as a con-
vergent sum because the expansion coefficients
with
In fact, for this part of the solution even a Green's function can be con-
structed and we can put
because the sum (5), extended only over the eigenfunctions Vi(x), even if it
does not converge immediately, will converge after an arbitrarily small
smoothing.
We now come to the parasitic spectrum for which a solution in terms of a
Green's function is not possible. Here the sum
has to remain in the form of a sum and we have to require the convergence
of this sum at all points x of our domain. This implies—since v(x) must be
quadratically integrable—that we should have
But these conditions are much milder than strict orthogonality to the
parasitic spectrum which would make the right sides of (17) and (18) not
finite but zero.
The resulting solution v(x) of our problem is now the sum of the contribu-
tions of the regular and the parasitic spectrum:
SBC. 8.14 THE EIGENVALUE A = 0 AS A LIMIT POINT 493
Problem 346. Consider the problem of the cooling bar (9.9-10), but replacing
the initial condition (9.7) by the end condition
Obtain the compatibility condition of the function F(x), on the basis of the
Fourier expansion (9.11).
[Answer:
where
Transform this equation to polar coordinates (r, 6) and assume that /(I, 9) is
given as a (complex) function of 6:
Given the further information that /(z) is analytical between the circles r = 1
and r = R. Find the compatibility condition to be satisfied by <p(6).
[Answer:
where
Find again the compatibility condition of this problem under the same
assumptions as those of the previous problem. Interpret the result in terms of
the Cauchy-Biemann equations (25).
[Answer:
where
494 BOUNDABY VALUE PEOBLEMS CHAP. 8
The early masters of calculus assumed that the initial values (8.7.26) of
the problem of the vibrating string have to be prescribed as analytical
functions of x. They were led to this assumption by the decomposition of
the motion into eigenfunctions which are all analytical and thus any linear
combination of them is likewise analytical. Later the exact limit theory
of Cauchy revealed the flaw in this argument which comes from the fact that
an infinite series, composed of analytical terms, can in the limit approach a
function which is non-analytical. But exactly the same argument can also
be interpreted in the sense that a non-analytical function can be replaced
by an analytical function with an error which can be made at each point
as small as we wish. Hence we would think that the diiference between
demanding analytical or non-analytical boundary values cannot be of too
great importance. And yet, the decisive difference between the well-posed
and ill-posed type of boundary value problems lies exactly in the question
whether the nature of the given problem allows non-analytical data, or
demands analytical data. An analytical function is characterised by a very
high degree of consistency, inasmuch as the knowledge of the function
along an arbitrarily small arc uniquely determines the course of the function
along the large arc, while a non-analytical function may change its course
capriciously any number of times. But then, if an analytical function has
such a high degree of predicatability, we recognise at once that a boundary
value problem which requires analytical data will be automatically over-
determined to an infinite degree, and we can expect that conditions will
prevail which deviate radically from the "well-posed" type of problems
whose data need not be prescribed with such a high degree of regularity.
In this section we shall show that the parasitic spectrum comes into existence
automatically in problems of this type.
As a starting point we will consider the first unusual boundary value
problem listed in Section 13, the cooling of a bar whose temperature distribu-
tion has been observed at the time moment t = T, while our ami is to find
by calculation what the temperature distribution was at the earlier time
moment t = 0. We realise, of course, that the function v(x, T) = F(x) is by
no means freely at our disposal. But we can assume that F(x) is given to
us as the result of measurements and we have the right to idealise the physical
situation by postulating that our recording instrument provides the course
of F(x) free of errors, to any degree of accuracy we want. Hence the
compatibility of our data is assured in advance. We have given the function
v(x, T) which has developed from an initially given non-analytical but
permissible temperature distribution v(x, 0) = /(#). If we can obtain
v(x, T) from v(x, 0), we must also be able to obtain v(x, 0) from v(x, T).
And in fact the solution (9.11) is reversible. By obtaining the coefficients
C| from the given initial distribution we could obtain v(x, t) at any later time
moment. But if we start with v(x, T), the expansion coefficients of this
function will give us c* multiplied by an exponential function and thus the
SEC. 8.15 VARIATIONAL MOTIVATION OF PAEASITIC SPECTRUM 495
This sum would diverge, of course, if we had started with the wrong data,
but it remains convergent if F(x) has been properly given.
We will now investigate the eigenvalue spectrum associated with our
problem. This problem can be conceived as the solution of a minimum
problem. The shifted eigenvalue problem
This reasoning would not hold in the case where the minimum is zero
because Dv(x) = 0, or jbu(x) = 0 may have non-vanishing solutions, although
the other equation may have no such solution. We assume, however, that
A = 0 is not included in the eigenvalue spectrum.
This condition is satisfied in our cooling problem. The boundary condition
for v(x, t) is
and no regular solution of the heat flow equation exists which would give a
uniformly vanishing solution at t = T, without vanishing identically. The
same can be said of the adjoint equation Du = 0, under the boundary
condition
But the analytical nature of the heat flow equation for any t > 0 allows a
much more sweeping conclusion. Let us assume that F(x, T) is not given
along the entire rod between x = 0 and x = I, but only on a part of the rod.
Then the corresponding boundary condition (10) will now involve only the
range x = [0,1 — e]. Yet even that is enough for the conclusion that the
homogeneous equation has no non-vanishing solution, because an analytical
function must vanish identically if it vanishes on an arbitrarily small arc.
Then our minimum problem requires that we shall minimise the integral
(5) (with the auxiliary condition (6)), under the boundary condition
This condition is less stringent than the previous condition (10) which required
the vanishing of v(x, T) for the complete range of x. This greater freedom
in choosing our function v(x, t) must give us a better minimum than before;
that is, the smallest eigenvalue must decrease. But let us view exactly the
same problem from the standpoint of the adjoint equation. Here the
shrinking of the boundary for v(x, t) increases the boundary value for u(x, t),
because now it is not enough to require that u(x, 0) shall be zero. It has
to be zero also on that portion of the upper boundary on which v(x, T)
remained free. We have thus a more restricted minimum problem which
must lead to an increase of the smallest eigenvalue. And thus we come to
the contradictory conclusion that the same eigenvalue must on the one hand
decrease, on the other hand increase. We have tacitly assumed in our
reasoning that there exists a smallest eigenvalue. The contradiction at
which we have arrived forces us to renounce this assumption, and this can
only mean that the eigenvalue spectrum can become as small as we wish,
because in that case there is no smallest eigenvalue. And thus we have
been able to demonstrate the existence of the parasitic spectrum in the
given cooling problem by a purely logical argument, without any explicit
calculations.
Quite similar is the situation concerning the third of the problems
enumerated in Section 13. Here the potential function was characterised
by giving the function and its normal derivative along a certain portion
SEC. 8.15 VARIATIONAL MOTIVATION OF PARASITIC SPECTRUM 497
under the constraints that the values of v and dvjdv are prescribed on the
boundary surface a. The problem leads to the differential equation
is identical with that obtained for the functions v, u of the previous para-
graph, if we identify p. with A2. Our problem seems "well-posed", and in
fact it is, if the given boundary data extend over the complete boundary.
Then the A-spectrum starts with a definite finite AI and the parasitic
spectrum does not appear. The solution is unique and the data freely
choosable. But let us now assume that once more the same minimum
problem is given, but with boundary data which omit one part of the boundary,
be that part ever so small. At this moment the situation changes com-
pletely. The smallest eigenvalue falls to an infinitesimal quantity; we get
the parasitic spectrum, and the problem becomes unsolvable with boundary
data which are not properly given. This means that our minimum problem
has no solution. We can approach a certain minimum as near as we wish
but a definite minimum cannot be obtained.
Indeed, the analytical solution demands once more the fulfilment of the
differential equation (14), with the boundary conditions
possible. In that case we get zero for the requested minimum. But for
any other choice of boundary data our problem becomes unsolvable. And
yet the solution exists immediately if we add further conditions to our
problem, for example by requiring that v and 8v/dv shall vanish on the
remaining portion S of the boundary surface a.
The "method of least squares" is based on the principle that a function
of some parameters which is everywhere positive must have a minimum
for some values of the parameters. This theorem is true in algebra, where
the number of parameters is finite. It seemed reasonable to assume that
the same theorem will hold in the realm of positive definite differential
operators. Hence the attempt was made to demonstrate the existence of
the solution of the boundary value problems of potential theory on this
basis. This principle is called (although with no historical justification)
"Dirichlet's principle". In the case of the potential equation this principle
is actually applicable, no matter whether the boundary values are prescribed
on the total boundary or only on some parts of it. But our result con-
cerning the "biharmonic equation" (14) shows that this principle can have
no universal validity. It holds in all cases in which the parasitic spectrum
does not exist. But here we have an example of a completely "elliptic"
type of differential equation, with apparently well-chosen peripheral
boundary conditions, which is in fact "ill-posed" in Hadamard's sense.
This peculiarity of our problem is then traceable to the appearance of the
parasitic spectrum which again is closely related to the failure of Dirichlet's
principle.
and put
from which
and thus
The first boundary condition (4) demands A% = —A\ and hence we can
put:
Now the relation (7) does not make a necessarily real. For any A which
is larger than k2, a becomes imaginary. We can show at once that indeed
this is the only possibility. In the former case we see from (9) that u(t) is
a monotonously increasing function of t which cannot vanish for any value
500 BOUNDARY VALUE PROBLEMS CHAP. 8
and exactly with the same reasoning as above we obtain the solution
In this case the possibility of a real a cannot be ruled out. The first
boundary condition (11) requires the condition
and thus
For large k the A'fc decrease rapidly and come arbitrarily near to zero. We
have thus proved the existence of a parasitic spectrum.
The analysis of this solution shows two characteristic features:
1. The parasitic spectrum is a one-dimensional sequence; to every k
(for large k) only one A'jt can be found, while the regular eigenvalue spectrum
is two-dimensional (to every k an infinity of periodic solutions can be found).
2. The division by a very small A'fc makes the data exceedingly vulnerable
to small errors and in principle our data have to be given with infinite
accuracy, in order to solve the given problem. Hadamard's "Condition C"
(see Section 13), is not fulfilled. But a closer examination reveals that the
very small A'^ belong to very high k values. If the time T is sufficiently
small, then the dangerously small A'fc will occur at such large k values that
even the complete omission of the parasitic spectrum will cause a minor
error, provided that the initial temperature distribution is sufficiently
smooth. Under such circumstances we can restore from our data v(x, T)
SEC. 8.16 EXAMPLES FOE THE PARASITIC SPECTRUM 501
the initial temperature distribution v(x, 0), if our data are given with
sufficiently high, but not infinitely high accuracy. The time T of the
backward extrapolation depends on the accuracy of our data, and it is
clear that for large T an excessive (but still not infinite) accuracy is demanded,
if we want to obtain v(x, 0) with a given finite accuracy. An absolute
accuracy of the data would only be required if we do not tolerate any error
in the finding of v(x, 0).
In this example the parasitic spectrum came into existence on account of
the unconventional type of boundary value problem from which we started.
Much more surprising is the appearance of this spectrum in a perfectly
regular and "well-posed" problem, namely the Cauchy-problem (initial
value problem) associated with the vibrating string. The peculiar riddles
which we have encountered in the last part of Section 8, find their resolution
in the unexpected fact that even in this very well-posed problem the
parasitic spectrum cannot be avoided, if we formulate our problem hi that
" canonical form" which operates solely with first derivatives, the derivatives
of higher order being absorbed by the introduction of surplus variables
(cf. Chapter 5.11).
We have formulated the canonical system associated with our problem in
the equations (8.7). The eigenvalue problem becomes (in view of the
self-adjoint character of the differential operator):
The first two horizontal lines can be solved algebraically for pi, p2, qi, qz>
obtaining
We put
and obtain
where «i2 and az2 are determined by the two roots of the equation
These, however, are not the boundary conditions of our canonical problem.
The derivative v'(t) was absorbed by the new variable pz, similarly u'(t) by
<?2. The conditions
We want to find out whether or not these conditions can be met by very
small A-values. In that case A3 becomes negligible on the right side of (27)
and we have to solve the equation
and write (26) in trigonometric rather than exponential form. The first
of the boundary conditions (29) reduces the free constants of our solution
to only three constants:
and thus
while the second condition yields, by the same reasoning that led to (34)
and (35):
This means
(ra = integer). We see that for any choice of the integer m the eigenvalue
X'mk can be made as small as we wish by choosing k sufficiently large. The
existence of a very extended parasitic spectrum is thus demonstrated and
we now understand why the solution of the canonical system (8.7) is less
smooth than the right side J3(x), put in the place of zero in the third equation.
The propagation of singularities along the characteristics—which is in such
strange contradiction to our expectations if we approach the problem from
the standpoint of expanding both right side and solution into their respective
eigenfunctions—can now be traced to the properties of the parasitic spectrum
which emerges unexpectedly in this problem.
504 BOUNDARY VALUE PROBLEMS CHAP. 8
Problem 349. Obtain the parasitic spectrum for the following (non-conventional)
boundary value problem:
[Answer:
Problem 350. The analytical function f(z) (see equation (14.23)) is known to
be analytical in the strip between y = 0 and y = I, Moreover, it is known to be
periodic with the period 2?r:
principles which provide not only the equations of motion but also the
"natural boundary conditions" of the given physical problem. Imposed
boundary conditions are merely circumscribed interventions from outside
which express in simplified language the coupling which in fact exists
between the given system and the outer world. The actual forces which
act on the system, modify the potential energy of the inner forces and the
physical phenomenon is in reality not a modification of the boundary
conditions of the isolated system but a modification of its potential energy.
Hence it is the differential operator which in reality should be modified and
not the boundary conditions which actually remain the previous "natural
boundary conditions".
As a concrete example let us consider the vibrations of a membrane for
which the boundary condition
where the integration is extended over the entire boundary. Although our
condition is now of a global character, whereas before we demanded a condi-
tion that had to hold at every point of the boundary, yet our new condition
can only hold if the integrand becomes zero at every point, and thus we are
back at the original formulation.
But now we will make use of the "Lagrangian multiplier method" that
we have employed so often for the variational treatment of auxiliary con-
ditions. Our original variational integral was given in the form
Let us observe that we would obtain the same Lagrangian if the condition
(2) were replaced by the less extreme condition
506 BOUNDARY VALUE PEOBLEMS CHAP. 8
where e is not zero but small. It is the magnitude of the constant //, which
will decide what the value of the right side of the condition (5) shall become.
With increasing \i the constant e decreases and would become zero if /n
grows to infinity.
The term that we have added to the Lagrangian L = T — V:
which is the same partial differential equation we had before. The added
term comes in evidence only when we establish the natural boundary
condition of our problem, which now becomes
This again shows that the exact condition (1) would come about if /*
became infinite, which is prohibited for physical reasons. The changed
boundary condition (8) instead of (1) would eventually come into evidence
in the vibrational modes of extremely high frequencies.
However, even so, we cannot be satisfied by the expression (6). If the
potential energy of the elastic forces require an integration over the two-
dimensional domain of the coordinates (x, y) we cannot assume that the
boundary forces will be concentrated on a line. Although apparently the
membrane is fixed on the boundary line only, physically there is always a
small but finite band along the boundary on which the external forces act.
Accordingly we have to introduce the potential energy of the forces which
maintain the constraint on the boundary in the form
where the function W(x, y} has the property that it vanishes everywhere
except in a very thin strip of the width e in the immediate vicinity of the
boundary curve s, where W(x, y} assumes a large constant value:
In fact, however, we cannot be sure that the force acting on the boundary
has the same strength at all points. We could have started our discussion
by replacing the boundary condition (1) by the integral condition
SEC. 8.17 PHYSICAL BOUNDARY CONDITIONS 507
where the weight factor p(s) is everywhere positive but not necessarily
constant. Accordingly we cannot claim that the large positive value of
W(x, y) in a thin strip along the boundary will be necessarily a constant
along the boundary curve s. It may be a function of s, depending on the
physical circumstances which prevail on the boundary. For the macro-
scopic situation this function is of no avail, since practically we are entitled
to operate with the strict boundary condition (1). But the method we
have outlined—and which can be applied to every one of the given boundary
conditions, for example to the two conditions v = 0 and 8v]dv = 0 in the
case of a clamped plate, which now entails the addition of two expressions
of the form (9)—has the great advantage that it brings into play the actual
physical mechanism which is hidden behind a mathematical boundary
condition. We have modified the potential energy of our system and thus
the given differential operator with which we have to work. The "imposed
boundary conditions" are now gone. They have been absorbed by the
modification of the differential operator. The actual boundary conditions
follow from the variational problem itself and become the natural boundary
conditions of the given variational integral.
We can now answer the question whether the parasitic spectrum
encountered in our previous discussions might not have been caused by the
imposition of artificial boundary conditions, and might not disappear if
we operate with the actual physical situation in which only natural boundary
conditions occur. The answer is negative: the parasitic spectrum cannot
be removed by the replacement of the imposed boundary conditions with the
potential energy of forces which maintain that condition. Indeed, the
replacement of the given constraint by a potential energy weakens the
constraint. Hence the chances of a miiiimum under the given conditions
have improved and the eigenvalue must be lowered rather than increased.
The parasitic spectrum must thus remain, with a very small alteration toward
smaller λ'ᵢ. And thus we arrive at a strange conclusion. We have seen
that the smallest eigenvalue of the shifted eigenvalue problem can always
be defined as the minimum of a certain positive definite variational integral—
in fact as the absolute minimum of that integral. We should think that at
least under natural boundary conditions a definite minimum must exist.
Now we see that this is not so. In the large class of problems in which the
parasitic spectrum makes its appearance (and that includes not only the
non-conventional type of boundary value problems, but the well-posed
hyperbolic type of problems in which the parasitic spectrum is a natural
occurrence if the problem is formulated in its canonical form), we obtain no
definite minimum, in spite of the regular nature of the given differential
operator, the finiteness of the domain, and the fact that we do not impose
any external boundary conditions on the problem. "Dirichlet's principle"
fails to hold in this large class of problems. The minimum we wanted to
get can only be reached as a limit, since we obtain an infinity of stationary
values which come to zero as near as we wish, without ever attaining the
value zero.
508 BOUNDARY VALUE PROBLEMS CHAP. 8
may have been chosen. The basic problem thus remained the solution of
the inhomogeneous differential equation
Furthermore, the solvability of our problem demanded that the right side
β(x) should be orthogonal to all the zero-axes of the U-space:
where
and the zero axes are omitted. Indeed, if u(x) had projections in the
direction of the uᵢ-axes, we should have to add to the right side the sum
which is only possible if each one of the βᵢ defined by (3) vanishes. Hence
the compatibility of the data can be replaced by the single scalar condition
(4), which makes no reference to the missing axes.
This, however, is generally not enough. In addition to the regular
spectrum whose eigenvalues increase to infinity, we may have a "parasitic
spectrum", whose eigenvalues λ'ᵢ converge to zero. While the given
function β(x) need not be orthogonal to the axes u'ᵢ(x) of the parasitic
spectrum, it is necessary that the projections in the direction of these axes
shall be sufficiently weak to make the following sum convergent:
The operator on the left side has exactly the same eigenfunctions as the
original one but the eigenvalues are shifted by a small amount. This shift
eliminates the zero eigenvalue and also its immediate neighbourhood—that
is the parasitic spectrum. We now have a complete and unconstrained
operator which satisfies all the requirements of a "well-posed" problem: the
solution is unique, the right side β(x) is not subjected to compatibility
conditions and the solution is not infinitely sensitive to small changes of
the data. The Green's function exists and we can find the solution in the
usual fashion with the help of this function. The eigenfunction analysis is
likewise applicable and we need not distinguish between small and large
eigenvalues since none of the eigenvalues becomes smaller in absolute value
than ε.
The solution of the modified problem exists even for data which from the
standpoint of our original problem are improperly given. Moreover, the
solution is unique. We analyse the given β(x) in terms of the eigenfunctions
uᵢ(x):
in this approach and we have arrived at a universal basis for the treatment
of arbitrarily over-determined or under-determined systems. The distinction
between properly and improperly given data comes into appearance only if
we investigate what happens if ε converges to zero. The criterion for
properly given data becomes that the solution vε(x) must approach
a definite limit:
The fact that v(x) is the limit of vε(x) may also be interpreted by saying
that, for an ε which is sufficiently small, the difference between vε(x) and
v(x) can be made at all points x as small as we wish. We thus come to the
following result: "If the data are given adequately, the difference between
an arbitrarily ill-posed and a well-posed problem can be reduced at every
point of the domain to an arbitrarily small amount."
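The ε-shift has a simple finite-dimensional analogue, sketched here in modern Python (the matrix, data, and function names are hypothetical illustrations, not from the text): a rank-deficient operator A is replaced by AᵀA + εI, which always possesses a unique solution, and for compatible data the solutions vε approach a definite limit as ε tends to zero.

```python
import numpy as np

# Finite-dimensional sketch of the epsilon-shift: a singular operator A
# (rank-deficient 3x3 matrix) is replaced by A^T A + eps*I, which always
# has a unique solution; for compatible data beta the solutions v_eps
# approach a definite limit as eps -> 0.
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])   # zero eigenvalue: the "missing axis"
beta = np.array([4.0, 3.0, 0.0])  # compatible: orthogonal to the null axis

def shifted_solve(A, beta, eps):
    n = A.shape[1]
    # the shifted, always-solvable system (A^T A + eps I) v = A^T beta
    return np.linalg.solve(A.T @ A + eps * np.eye(n), A.T @ beta)

v_eps = [shifted_solve(A, beta, e) for e in (1e-2, 1e-4, 1e-6)]
# the solutions converge to the definite limit (2, 3, 0)
```

Incompatible data (a β with a component along the null axis) would instead make the corresponding component of vε blow up like 1/ε, which is exactly the criterion separating properly from improperly given data.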
CHAPTER 9
The characteristic feature of these equations is that on the right side the
function H (the "Hamiltonian function") is only a function of the variables
qᵢ, pᵢ (and possibly the time t), without any derivatives.
If the basic differential equation—or system of such equations—is not
derivable from a variational principle, or if we deal with a mechanical
system in which frictional forces are present (which do not allow a variational
treatment), we shall nevertheless succeed in reducing our problem to a
first order system by the proper introduction of surplus variables. No
matter how complicated our original equations have been and what order
514 NUMERICAL SOLUTION OF TRAJECTORY PROBLEMS CHAP. 9
where the Fᵢ are given functions of the dynamical variables vᵢ and x. The
equations (2) can be conceived as special cases of the general system (3),
considering the complete set of variables q₁, …, qₘ; p₁, …, pₘ as our vᵢ
and thus n = 2m. Moreover, it is a specific feature of the Hamiltonian
system that the right sides Fᵢ can be given in terms of a single scalar
function H, while in the general case (3) such a function cannot be found.
However, while for the general analytical theory of the Hamiltonian
"canonical equations" the existence of H is of greatest importance, for the
numerical treatment this fact is generally of no particular advantage.
What is important is only that we reduce our system to the normal form (3).
Let us assume for example that the given differential equation is a single
equation of the order n. The general form of such an equation is
We now solve this equation for y⁽ⁿ⁾(x) and write it in the explicit form
Then we introduce the derivatives of y(x) as new variables and replace the
original single equation of the order n by the following system of n equations
of the first order—denoting y by v₁:
We have thus succeeded in formulating our problem in the normal form (3).
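The reduction just described can be sketched in modern Python (the book of course predates such code; the helper name and the example equation are illustrative assumptions): the successive derivatives of y become the new variables v₁ = y, v₂ = y′, …, and the single n-th order equation becomes the normal form (3).

```python
import numpy as np

# Sketch: a single n-th order equation y^(n) = f(x, y, y', ..., y^(n-1))
# rewritten as the normal form v_i' = F_i(v_1, ..., v_n, x) by taking the
# successive derivatives of y as new variables (v_1 = y, v_2 = y', ...).
def to_first_order(f, n):
    """Return F(v, x) for the system equivalent to y^(n) = f(x, y, ..., y^(n-1))."""
    def F(v, x):
        dv = np.empty(n)
        dv[:-1] = v[1:]      # v_1' = v_2, v_2' = v_3, ...
        dv[-1] = f(x, *v)    # v_n' = f(x, v_1, ..., v_n)
        return dv
    return F

# Hypothetical example: y'' = -y, i.e. f(x, y, y') = -y.
F = to_first_order(lambda x, y, yp: -y, 2)
# F([1.0, 0.0], 0.0) -> [0.0, -1.0]
```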
9.3. Trajectory problems
For the numerical solution of the system (2.3) it is necessary to start
from a definite initial position of the system. This means that the quantities
v₁, v₂, …, vₙ have to be given at the initial time moment x = 0:
much labour and very limited accuracy. We could change our differential
equation to a difference equation:
While this procedure seems on the surface appealing on account of its great
simplicity, it has the drawback that for an efficient approximation of a
curve with the help of polygons we would have to make the step-size h
excessively small, but then the inevitable numerical rounding errors would
swamp our results.
For this reason our aim must be to render our approximating curve
more flexible by changing it from a straight line to a parabola or a polynomial
of third or even fourth order. In this manner we can increase the step-size
h and succeed with a smaller number of intermediate points.
All methods for the numerical integration of ordinary differential equations
operate with such approximations. Instead of representing the immediate
neighbourhood of our curve as a straight line:
The coefficients of this expansion are in principle obtainable with the help
of the given differential equation, by successive differentiations. But in
practice this method would be too unwieldy and has to be replaced by more
suitable means. The various methods of step-by-step integration endeavour
to obtain the expansion coefficients of the local Taylor series by numerically
appropriate methods which combine high accuracy with ease of computation
and avoidance of an undue accumulation of rounding errors (these are
caused by the limited accuracy of numerical computations and should not
be confounded with the truncation errors, which are of purely analytical
origin). A certain compromise is inevitable since the physical universe
operates with the continuum as an actuality while for our mental faculties
SEC. 9.5 THE METHOD OF UNDETERMINED COEFFICIENTS 517
the continuum is accessible only as a limit which we may approach but never
reach. All our numerical operations are discrete operations which can never
be fully adequate to the nature of the continuum. And thus we must be
reconciled to the fact that every step in the step-by-step procedure is not
more than approximate. The calculation with a limited number of decimal
places involves a definite rounding error in all our computations. But even
if our computations had absolute accuracy, another error is inevitable
because an infinite expansion of the form (6) is replaced by a finite expansion.
We thus speak of a "truncation error" caused by truncating an infinite
series to a finite number of terms. Such a truncation error is inevitable
in every step of our local integration process. No matter how small this
error may be, its effect can accumulate to a large error which is no longer
negligible (as we shall see in Section 13). Hence it is not enough to take
into account the possible damage caused by the accumulation of numerical
rounding errors and devise methods which are "numerically stable", i.e.
free from an accumulation of rounding errors. We must reckon with the
damaging effect of the accumulation of truncation errors which is beyond
our control and which may upset the apparently high local accuracy of our
step-by-step procedure. The only way of counteracting this danger is not
to consider our local procedure as the final answer, but to complement it by
a global process in which considerations in the large come into play, against
the purely local expansions which operate with the truncated Taylor series.
This we shall do in Section 17.
For the time being we will study the possibilities of local integration
from the principal point of view. Our procedure must be based on the
method of interpolation. We have at our disposal a certain portion of the
curve which we can interpolate with sufficient accuracy by a polynomial of
not too high order, provided that we choose the step-size Ax = h small
enough. This polynomial can now be applied for extrapolating to the next
point xjc+i. Then we repeat the procedure by including this new point and
dropping the extreme left point of the previous step. In this fashion there
is always the same length of curve under the search light, while this light
moves slowly forward to newer and newer regions, until the entire range of
x is exhausted.
we can evaluate the derivatives v'ᵢₖ = v'ᵢ(kh). From now on we will omit
the subscript i of vᵢ(x) since the same interpolation and extrapolation
process will be applicable to all our vᵢ(x). Hence v(x) may mean any of
the generalised coordinates vi(x) of our problem.
It is clear that in a local process only a limited number of ordinates can
be used. This number cannot become too large without making the step-
size h excessively small since we want to stay in the immediate vicinity of
the point in question. We will not go beyond three or four or perhaps
five successive ordinates. We will leave this number optional, however, and
assume that we have at our disposal the following 2m data:
Now we multiply all these equations by some undetermined factors α₁, α₂,
…, αₘ as far as the first group is concerned and −β₁, −β₂, …, −βₘ as
far as the second group is concerned. Then we add all these equations.
On the left side we get the sum
This factor we want to make equal to 1 because our aim is to predict v(x)
(and that is yₘ) in terms of all the previous yₘ₋ᵢ:
Our programme cannot be carried out with absolute accuracy but it can be
accomplished with a high degree of accuracy if we succeed in obliterating on
the right side the factors of h, h², and so on. We have 2m coefficients at
our disposal and thus 2m degrees of freedom. One degree of freedom is
absorbed by making the sum (6) equal to 1. The remaining 2m − 1 degrees
of freedom can be used to obliterate on the right side the powers of h, up
to the order h²ᵐ⁻¹. Hence we shall obtain the extrapolated value yₘ with
an error which is proportional to h²ᵐ. An extrapolation will be possible on
the basis of m functional values and m derivatives, which is of the order
2m if we denote the "order of the approximation" according to the power
of h to which the error is proportional. Hence extrapolation on the basis
of 1, 2, 3, and 4 points will be of the order 2, 4, 6, and 8.
The linear system of equations obtained for the determination of the
coefficients αᵢ, βᵢ is given as follows:
with
where
and
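The determining system can also be set up and solved numerically. The following Python sketch (an illustration under stated assumptions, not the book's tabulated coefficients: h is normalised to 1, the known points sit at x = −1, …, −m, and the prediction point is x = 0) imposes exactness for all powers up to x^(2m−1):

```python
import numpy as np

# Method of undetermined coefficients: choose alpha_i, beta_i so that
#     y_m = sum_i alpha_i * y_{m-i} + h * sum_i beta_i * y'_{m-i}
# is exact for every polynomial of degree <= 2m-1, giving an O(h^{2m}) error.
def predictor_coeffs(m):
    rows, rhs = [], []
    for j in range(2 * m):                       # test polynomial p(x) = x^j
        row  = [(-float(i)) ** j for i in range(1, m + 1)]               # alpha part: p(-i)
        row += [(j * (-float(i)) ** (j - 1)) if j > 0 else 0.0
                for i in range(1, m + 1)]                                # beta part: p'(-i)
        rows.append(row)
        rhs.append(1.0 if j == 0 else 0.0)       # value of x^j at the prediction point 0
    sol = np.linalg.solve(np.array(rows), np.array(rhs))
    return sol[:m], sol[m:]                      # the alphas and the betas

alpha, beta = predictor_coeffs(3)
# exactness check on an arbitrary degree-5 polynomial:
p  = np.polynomial.Polynomial([1.0, -2.0, 0.5, 3.0, -1.0, 0.25])
dp = p.deriv()
nodes = [-1.0, -2.0, -3.0]
pred = sum(a * p(t) for a, t in zip(alpha, nodes)) \
     + sum(b * dp(t) for b, t in zip(beta, nodes))
# pred reproduces p(0) = 1 up to rounding error
```

For m = 2 this system reproduces the classical (unstable) two-point predictor with α = (−4, 5), β = (4, 2).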
But then we can take advantage of the flexibility of the scheme (5.5)
and add conditions which will guarantee numerical stability. The danger
does not come from the coefficients βᵢ which are multiplied by the small
factor h and thus can hardly cause any harm from the standpoint of rounding
errors. Hence the m degrees of freedom of the βᵢ are still at our disposal,
thus giving us a chance to reduce the error at least to the order of magnitude
hᵐ⁺¹. As far as the αᵢ go, they have to satisfy the condition
but otherwise they are freely at our disposal. Now it seems reasonable to
make all the αᵢ uniformly small by choosing them all equal:
we have minimised the effect of rounding errors since the best statistical
averaging of random errors is obtainable by taking the arithmetic mean of
the data.
There is still another reason why a small value of the αᵢ is desirable. It
should be our policy to put the centre of gravity of our interpolation
formula on the βᵢ and not on the αᵢ, since the βᵢ are multiplied by the
derivatives rather than the ordinates themselves. But these very derivatives
are determined by the given differential equation and it is clear that the
less we rely on this equation, the more we shall lose in accuracy and vice
versa. Hence we should emphasise the role of the βᵢ as much as possible
at the expense of the αᵢ. This we accomplish by choosing all the αᵢ
uniformly small.
The question of the propagation of a small numerical error during the
iterative algorithm (5.7) is answered as follows. The second term is
negligible, in view of the smallness of the step-size h. We have to investigate
the roots of the algebraic equation
and thus
The assumption |λ| > 1 would make the left side negative, in contradiction
to the inequality (7). Moreover, the assumption |λ| = 1 but λ ≠ 1 would
exclude the equality sign of (7) and is thus likewise eliminated. The only
remaining chance, that λ = 1 is a double root, is disproved by the fact that
F′(λ) cannot be zero at λ = 1. The numerical stability of our process is
thus established.
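The root condition just proved can be verified numerically; a Python sketch (illustrative, not from the text) of the characteristic polynomial F(λ) = λᵐ − (1/m)(λᵐ⁻¹ + … + λ + 1) that governs error propagation for the averaged choice αᵢ = 1/m:

```python
import numpy as np

# Stability check for alpha_i = 1/m: lambda = 1 is always a simple root of
# F(lambda) = lambda^m - (1/m)(lambda^{m-1} + ... + 1); all other roots
# must lie strictly inside the unit circle.
for m in (2, 3, 4, 5):
    coeffs = [1.0] + [-1.0 / m] * m              # F in descending powers
    roots = np.roots(coeffs)
    others = [r for r in roots if abs(r - 1.0) > 1e-6]
    assert np.isclose(max(abs(r) for r in roots), 1.0)  # the root at lambda = 1
    assert all(abs(r) < 1.0 for r in others)            # the rest are damped
```

For m = 2, for instance, the roots are λ = 1 and λ = −1/2, so a small error is damped rather than amplified as the algorithm proceeds.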
The coefficients βⱼ of the formula
can be obtained by solving the algebraic system (5.8) for the βⱼ
(substituting for the αᵢ the constant value (2)). But we can also conceive
our problem as an ordinary Lagrangian interpolation problem for f′(x)
instead of f(x), with the m points of interpolation x = −1, −2, …, −m
(for the sake of convenience we can normalise h to 1). Then f(x) is
obtainable by integration.
Let us consider for example the case m = 2. Then
This gives
and extrapolating to x = 0:
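The m = 2 case can be worked through explicitly. The following Python sketch derives the two β's from the exactness conditions themselves (the values are computed here, not quoted from the book's tables): with α₁ = α₂ = 1/2 and nodes x = −1, −2, exactness for p = x and p = x² fixes β₁ and β₂.

```python
from fractions import Fraction

# m = 2, alpha_1 = alpha_2 = 1/2, nodes x = -1, -2, prediction at x = 0,
# h normalised to 1. Exactness conditions:
#   from p = x:    beta1 + beta2     = 3/2
#   from p = x^2:  2*beta1 + 4*beta2 = 5/2
beta2 = (Fraction(5, 2) - 2 * Fraction(3, 2)) / 2   # eliminate beta1 -> -1/4
beta1 = Fraction(3, 2) - beta2                      # -> 7/4

# quick check on p(x) = x^2 - x + 1 (must be exact for degree <= 2):
p  = lambda t: t * t - t + 1
dp = lambda t: 2 * t - 1
pred = Fraction(1, 2) * (p(-1) + p(-2)) + beta1 * dp(-1) + beta2 * dp(-2)
assert pred == p(0)
```

The error of this averaged two-point formula is of order h³, in agreement with the general estimate hᵐ⁺¹.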
A different principle for the estimation of the local error can be established
by checking up on the accuracy with which we have satisfied the given
differential equation. The polynomial by which we have extrapolated yₘ
can also extrapolate the derivative y′ₘ. Let us call this extrapolated value
ȳ′ₘ. In the absence of errors this ȳ′ₘ would coincide with the y′ₘ obtained
by substituting the yₘ values of all the functions vᵢ(x) into Fᵢ(vₖ, x). In
view of the discrepancy between the two values we can say that we have
not solved the differential equation
The following table contains the formulae for the calculation of the
extrapolated value ȳ′ₘ.
the time estimating also the effect of the truncation error in the solution
or in the differential equation or both. Now it may happen that this
error estimation indicates that the error begins to increase beyond a danger
point which necessitates the choice of a smaller h. We shall then stop and
continue our algorithm with an h which has been reduced to half its previous
value. This can be done without any difficulty if we use our interpolating
polynomial for the evaluation of the functional values half way between the
gridpoints. Then we substitute the values thus obtained into the functions
Fᵢ of our differential equation (5.1), thus obtaining the mid-point values
of the y′ₖ. The new points combined with the old points yield a consecutive
sequence of points of the step-size h/2 and we can continue our previous
algorithm with the new reduced step-size.
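The mid-point evaluation at the heart of this step-halving device can be sketched in Python (a minimal illustration with hypothetical sample data, not the book's tabular scheme): fit the interpolating polynomial through the last few ordinates and read it off half-way between the gridpoints.

```python
import numpy as np

# Step-halving sketch: fit the interpolating cubic through the ordinates
# at x = 0, h, 2h, 3h and evaluate it at the midpoints, giving the extra
# ordinates needed to continue with step-size h/2.
h = 0.2
x = np.arange(4) * h                       # gridpoints 0, h, 2h, 3h
y = np.exp(x)                              # hypothetical smooth ordinates
poly = np.polynomial.Polynomial.fit(x, y, deg=3)   # interpolating cubic
midpoints = x[:-1] + h / 2
y_mid = poly(midpoints)                    # mid-point ordinates y01, y12, y23
```

For this smooth example the interpolated midpoint ordinates agree with the true values to roughly four decimal places, which is ample for restarting the algorithm with step h/2.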
The following table contains in matrix form the evaluation of the mid-point
ordinates, using as α-term the arithmetic mean of all the ordinates yₖ. The
notation y₀₁ refers to the mid-point value of y half way between y₀ and y₁,
similarly y₁₂ to the mid-point value of y half way between y₁ and y₂, and
so on. It is tacitly understood that the arithmetic mean M is added to
the tabular products. For example the line y₂₃, for m = 4, has the following
significance. Obtain the ordinate at an x-value which is half way between
that of y₂ and y₃ by the following calculation:
To obtain the next value vᵢ(h) we have to rely on a local Taylor expansion
around the origin x = 0. However, it would be generally quite cumbersome
to obtain the high order derivatives of the functions Fᵢ(vₖ, x) by successive
differentiations. We can avoid this difficulty by starting our numerical
scheme with a particularly small value of h. For this purpose we transform
our independent variable x into a new variable t by putting
The knowledge of the second derivatives of all the vᵢ(x) would still not be
sufficient, however, to obtain vᵢ(h) with sufficient accuracy. But the
situation is quite different if we abandon the original variable x and consider
from now on the functions vᵢ as functions of the new variable t. Then the
expansion around the origin appears as follows:
(We will agree that we denote differentiation with respect to x in the previous
manner by a dash, while derivatives with respect to t shall be denoted by a
dot.) Hence in the new variable t we obtain
We have thus obtained v(t) at the second point t = h with an error which is
of sixth order in h, without differentiating with respect to x more than
twice. Then, by substituting these vₖ(h) into the functions Fᵢ(vₖ(h); h) we
obtain also
We now come to the third point t = 2h. Here we can make use of the
six data v(0), v̇(0), v̈(0), v⃛(0) and v(h), v̇(h), on the basis of the (unstable)
formula:
(in our case this formula is simplified on account of v̇(0) = v̈(0) = 0). Then
again we evaluate by substitution the quantities v̇ᵢ(2h).
From here we proceed to the fourth point t = 3h on the basis of the
(likewise unstable) formula
Now we are already in possession of four ordinates (and their derivatives)
and we can start with the regular algorithm of Section 8 or 10, if we are
SEC. 9.13 THE ACCUMULATION OF TRUNCATION ERRORS 531
satisfied with m = 4, which leads to an error of the order h⁵. If, however,
we prefer an accuracy of one higher order (m = 5), we can repeat the
process (9) once more, obtaining vᵢ(4h) on the basis of the points h, 2h, 3h.
Then we have arrived at the five points which are demanded for the
step-by-step algorithm with m = 5.
We add four further formulae which may be of occasional interest, two
of them stable, the other two unstable:
Stable
Unstable
and assumes that terms of the order ε² are negligible. This has the great
advantage that we obtain a linear differential equation for the correction
uᵢ(x). If we substitute the expression (1) in (5.1) and make use of the
notation Aᵢₖ (cf. (12.3)) for the partial derivatives of Fᵢ with respect to
the vₖ, we obtain for uᵢ the following differential equation:
(δⱼ(x, ξ) denotes the delta function put in the jth equation, while the other
equations have zero on their right side). We know from the properties of
the Green's function that all uᵢ(x) will vanish up to the point x = ξ, while
beyond that point we have to take a certain linear combination of the
homogeneous solutions which satisfy the condition that all the uᵢ(x) vanish
at x = ξ, except for uⱼ(x) which becomes 1 at the point x = ξ:
Substitution of (6) in the system (5) yields for λ the characteristic equation
From the standpoint of f(x) the given functional values represent the
derivatives f′(xₖ) which appeared in the formula (5.20.11). But then the
difficulty arises that the ordinates f(xₖ) themselves, which enter the first
term of the formula, are not known. We can overcome this difficulty,
however, by the ingenious device of putting the points xₖ in such positions
that their weights automatically vanish. Then our interpolation (or
extrapolation) formula will not contain any other data than those which
are given to us.
Let us assume that we want to obtain the definite integral
Now the vanishing of the factor of f(xₖ) demands the following condition
(cf. (5.20.11)):
SEC. 9.14 THE METHOD OF GAUSSIAN QUADRATURE 535
Then
or
because the addition of the last term does not change anything, since
G(xₖ) = 0. But now the differential operator in front of G(x) has the property
that it obliterates the power xn and thus transforms a polynomial of the
order n into a polynomial of lower order. But a polynomial of an order
less than n cannot vanish at n points without vanishing identically. And
thus the differential equation (11)—which is Legendre's differential equation
—must hold for G(x) not only at the points Xk but everywhere. This
identifies G(x) as the nth Legendre polynomial Pn(x), except for an
irrelevant factor of proportionality:
The zeros of the Gaussian quadrature are thus identified as the zeros of the
nth Legendre polynomial.
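This identification can be confirmed numerically; a Python sketch (a modern illustration, not part of the text) using NumPy's Gauss–Legendre routine, which returns precisely these zeros together with their weights:

```python
import numpy as np

# The zeros of the n-th Legendre polynomial, with their weights, integrate
# every polynomial of degree up to 2n-1 exactly on [-1, 1].
n = 4
nodes, weights = np.polynomial.legendre.leggauss(n)

# the nodes are the roots of P_4:
P4 = np.polynomial.legendre.Legendre.basis(n)
assert np.allclose(P4(nodes), 0.0)

# exactness for an arbitrary degree-7 polynomial:
p = np.polynomial.Polynomial([3.0, -1.0, 2.0, 0.0, 1.0, 0.5, -2.0, 1.0])
exact = p.integ()(1.0) - p.integ()(-1.0)
assert np.isclose(weights @ p(nodes), exact)
```

With only n = 4 function values the rule is exact through degree 2n − 1 = 7, which is the characteristic economy of the Gaussian method.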
In this approach to the problem of Gaussian quadrature we obtain the
weight factors of the Gaussian quadrature formula in a form which differs
from the traditional expression. The traditional weights of the Gaussian
quadrature formula
while the second term of the formula (5.20.11) gives, without any integration,
in view of the formulae (7):
and we obtain
characterises the Gaussian quadrature. But we can save the other outstand-
ing feature of the Gaussian method, namely that it operates with orthogonal
polynomials and thus provides a global approximation which converges
better and better as the degree of the polynomial increases.
We obtain a perfect counterpart of the Gaussian quadrature for the case
of an indefinite integral if we replace the Legendre polynomials by another
outstanding set of polynomials, called "Chebyshev polynomials".* The
operation with these polynomials is equivalent to the transformation
which transforms the range [0, l] of x into the range [0, π] of θ. The
originally given function g(x), if viewed from the variable θ, becomes a
periodic function of θ of the period 2π, which can be expanded in a Fourier
series. Moreover, it is an even function of θ which requires only cosine
functions for its representation. We assume that g(x) is given at n points
which are equidistant in the variable θ but not equidistant in the variable x.
Two distributions are in particular of interest:
and
where
Indeed, the form of the function (6) shows that the sum (5) represents a
Fourier series of sine and cosine terms up to the order n. Moreover, if we
put θ = θₖ, we obtain
and thus we have actually interpolated the given 2n data with the help
of a Fourier series of lowest order which fits these data.
Now we want to integrate the function φ(θ), denoting the indefinite
integral by Φ(θ):
For this purpose we replace the exact φ(θ) by its approximation on the
basis of trigonometric interpolation, that is φₙ(θ). Then the formula (5)
yields
where
The integral in the last line is for large n very nearly equal to Si (nt) where
Si x is the sine integral, defined by (2.3.11). It is preferable to introduce a
slightly different function that we want to denote by K(t):
and put
The subscript k runs from 0 to n in the case of the distribution (2) and
from 1 to n in the case of the distribution (3) (in the first case γ₀ = γₙ = 0;
this means that we lose our two end-data g(0) and g(l) which enter all our
calculations with the weight zero. The loss is not serious, however, if n is
sufficiently large). We must extend our domain of θₖ values toward negative
k, in order to have a full cycle. Hence we define
Although the data (16) have to be weighted for every value of 0 separately,
yet the formula (18) shows that it suffices to give a one-dimensional sequence
of weight factors for every n. Let us assume that we want to obtain G(θ)
at the data points. Then
We may prefer to obtain the integral not at the data points but half way
between these points. This is advisable in order to minimise the errors.
The error oscillations of trigonometric interpolation follow the law
where φ is a constant phase angle (which is zero for the first and π/2 for the
second distribution of data points), while the amplitude A(θ) changes
slowly, compared with the rapid oscillations of the second factor. Hence
the approximate law of the error oscillations of the integral becomes
which is zero half-way between the data points. At these points we can
expect a particularly great accuracy, gaining the factor n compared with the
average amplitude of the error oscillations. For these points the coefficients
wₛ have to be defined as follows:
(the prime attached to sigma shall indicate that the last term is to be taken
with half weight). The symmetry pattern of this weighting is somewhat
different from the previous pattern, due to the exclusion of the subscript
s = 0.
and the amplitudes are particularly small since we are near to the nodal
points of these oscillations, namely half-way between the consecutive
minima and maxima which occur at the points
are put on a parallel strip which gradually moves to the right. The initial
position of these two strips is as follows:
We multiply corresponding elements and form the sum. Let the result be
G₀:
We now move the strip one place to the right, obtaining the following picture:
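The shift–multiply–sum procedure of the movable strip is exactly a discrete correlation; a Python sketch with hypothetical strip contents (not the table values of the text):

```python
import numpy as np

# "Movable strip" technique: one strip carries the fixed data, the other
# the weights; multiply corresponding elements, sum, shift the movable
# strip one place to the right, and repeat.
data    = np.array([0.1, 0.4, 0.9, 1.6, 2.5])   # hypothetical fixed strip
weights = np.array([0.25, 0.5, 0.25])           # hypothetical movable strip

# hand version: G_s = sum over the overlap of data * shifted weights
manual = [float(np.dot(data[s:s + len(weights)], weights))
          for s in range(len(data) - len(weights) + 1)]

# np.correlate performs the identical sequence of shifted inner products
assert np.allclose(manual, np.correlate(data, weights, mode='valid'))
```

Each entry Gₛ is one position of the strip; G₀ here is 0.1·0.25 + 0.4·0.5 + 0.9·0.25 = 0.45.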
First of all we need our data γₖ and for this purpose we have to form the
products (15.16) for n = 12, θₖ = 7.5°, 22.5°, 37.5°, 52.5°, 67.5°, 82.5°.
SEC. 9.16 NUMERICAL ASPECTS OF GLOBAL INTEGRATION 543
Since g(x) is symmetric with respect to the centre point x = l/2 (or θ = 90°),
the second set of data k = 7 to 12 merely repeats the first set in reverse
order. Hence we will list the γₖ only up to k = 6.
k γₖ
1 0.00861632
2 0.02702547
3 0.04890525
4 0.07577005
5 0.10548729
6 0.12760580
If now we carry out the calculations according to the movable strip technique,
with the data γₖ on the fixed strip and the weights (3) on the movable strip,
we obtain the following results. The exact integral is available in our
example:
(evaluated at points which are in the variable 6 half-way between the data
points), and the corresponding values obtained by global integration. The
sum of the ordinates is listed separately, in order to show the effect of the
correction.
The correction, although very small, is highly effective and extends the
accuracy to the seventh decimal place.
We will, however, add another distribution of our data points which
corresponds to (15.2) and which is more suitable in view of the application
to the solution of differential equations. We will now assume that our data
are given at multiples of the angle irfn and that the evaluation of the
integral shall occur at the same points. Then we need the weights which
correspond to the definition (15.20). These weights are contained in the
following table.
s wₛ s wₛ
0 0 7 -0.01012223
1 -0.08891091 8 0.00762260
2 0.04742642 9 -0.00547171
3 -0.03134027 10 0.00354076
4 0.02267382 11 -0.00174001
5 -0.01712979 12 0
6 0.01317377
The values in this table oscillate with much larger amplitudes than those of
the previous table (3) since at present we are at the points of the minima
and maxima of the function (4). Special attention has to be given to the
value w₀ which according to the general rule should be listed as ½ instead of
zero. But again we will take into account the effect of this large constant
separately. It amounts to the following modification of the previous law
of the ordinates. Instead of adding up the γᵢ according to (12), we have to
modify the sum by taking the two limiting ordinates with half weight:
We list the results once more in tabular form, in full analogy to the previous
table (17).
where a had the value 0.2 instead of 1. However, the more or less smooth
character of the function merely changes the number of ordinates needed
for a certain accuracy. In the present example we are in the possession
of an exact error analysis and we can show that the accuracy of the tables
(17), respectively (22), could have been matched with the much less smooth
function (23) (with a = 0.2), but at the expense of a much larger number of
ordinates since the present number n = 12 would have to be raised to
n = 50. Our aim was merely to demonstrate the numerical technique and
for that purpose a simpler example seemed more adequate.
9.17. The method of global correction
In the previous methods of solving trajectory problems by inching
forward step by step on the basis of local Taylor expansions, we did not
succeed in exhibiting an explicit solution in the form of a set of continuous
functions Vi(x). The point of reference was constantly shifting and we could
not arrive at a solution which could be genuinely tested as to whether it satisfied
the given differential equation. Even if we did find that the differential
equation is actually satisfied at every point with a high degree of accuracy,
we would still have to convince ourselves that these small local errors will
not accumulate and possibly cause a large error in the solution. But the
principal objection to the method of shifting centres of expansion is that
we have obtained a set of discrete values vᵢ(xₘ) instead of true functions of
x: vᵢ(x). We usually get round the difficulty in a purely empirical fashion
by reducing the step-size h and making repeated trials, until we come to the
point where the yₘ-values become "stabilised", by approaching more and
more definite limits. But a real "solution" in the sense of testable functions
has not been achieved.
This situation is quite different, however, if we change from the variable
x to the angle variable θ, obtaining once more the solution by the previous
step-by-step process. Although we have once more only a discrete sequence
of ordinates, we can now combine all these ordinates into a continuous and
differentiable function by the method of trigonometric interpolation. Now
we have actually obtained our Vi(6) and can test explicitly to what extent
the given differential equation has been satisfied.
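A minimal sketch of such a trigonometric (cosine) interpolation, assuming equidistant angles θm = mπ/n; the coefficient formula is the standard discrete cosine series with halved end terms, and the function names and the test function are our illustrative choices, not the text's:

```python
import math

def cosine_coeffs(samples):
    """Cosine-series coefficients a_k for data at theta_m = m*pi/n, m = 0..n
    (end samples weighted by 1/2, as in the discrete cosine transform)."""
    n = len(samples) - 1
    a = []
    for k in range(n + 1):
        s = 0.0
        for m in range(n + 1):
            w = 0.5 if m in (0, n) else 1.0
            s += w * samples[m] * math.cos(k * m * math.pi / n)
        a.append(2.0 * s / n)
    return a

def evaluate(a, theta):
    """Evaluate the interpolating cosine polynomial at any theta."""
    n = len(a) - 1
    total = 0.5 * a[0] + 0.5 * a[n] * math.cos(n * theta)
    for k in range(1, n):
        total += a[k] * math.cos(k * theta)
    return total

n = 10
thetas = [m * math.pi / n for m in range(n + 1)]
f = lambda t: math.exp(math.cos(t))          # a smooth test function
a = cosine_coeffs([f(t) for t in thetas])
# The interpolant reproduces the data and is accurate between the nodes.
print(abs(evaluate(a, thetas[3]) - f(thetas[3])) < 1e-12)  # True
print(abs(evaluate(a, 0.37) - f(0.37)) < 1e-6)             # True
```

The essential point of the section is captured here: the discrete ordinates have become a single differentiable function of θ, which can be differentiated or substituted into the differential equation at any point, not only at the nodes.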
It is more satisfactory to start with the derivatives y'm, which we possess
at all the data points. By trigonometric interpolation we can combine these
data into a true function v'(θ) and then, by the technique of integration
discussed in Sections 15 and 16, we also obtain v(θ). In contradistinction
to the previous problem our "data" are now more simply constructed. In
the previous case (15.16) the factor 1/(2 sin θk) appeared because we had to
obtain dy/dθ in terms of the given dy/dx. In the present problem we have
abandoned from the very beginning the operation with the original variable
x and changed over to the variable θ. Hence we possess dy/dθ without any
transformation and our y'm now become simply
function vi(θ) found by integrating the derivative data y'm, and finally the
true function v*i(θ) which actually satisfies the given differential equation
(5.1). The difference
can be explicitly obtained at all the data points, because the step-by-step
process gave us one set of values vi(θm), while the global integration now
yields a second set at the same points. We assume that the difference
ρi(θm) is small. Then we have a good indication that the function
vi(θ) will need only a small correction in order to obtain the true function
v*i(θ). Hence we will put
Then, in view of the smallness of ui and ρi, we can be satisfied with the
solution of the linear perturbation problem
If the second term were absent, we could immediately integrate this equation
by the previous global integration technique. The second term has the
effect that instead of an explicit solution ui(θm) at the data points we obtain
a large-scale linear system of algebraic equations for the determination of
the ui(θm). Since the matrix of this system is nearly triangular, we have
no difficulty in solving our system by the usual successive approximations.
We have then obtained the global correction which has to be added to the
preliminary step-by-step solution, and we have the added advantage that
now we really possess the functions v*i(θ) at all points, instead of a discrete
set of ordinates which exist only at a selected sequence of isolated points.
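The successive-approximation idea for a nearly triangular system can be sketched as follows; the splitting into a triangular part and a small remainder, and all names and the test matrix, are illustrative assumptions of ours:

```python
def solve_nearly_triangular(A, b, sweeps=50):
    """Successive approximation for A x = b when A is nearly lower
    triangular: keep the triangular part, treat the small above-diagonal
    remainder as a perturbation, and iterate."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        new = [0.0] * n
        for i in range(n):
            # move the small above-diagonal terms to the right side
            rhs = b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))
            # forward-substitute through the triangular part
            rhs -= sum(A[i][j] * new[j] for j in range(i))
            new[i] = rhs / A[i][i]
        x = new
    return x

# A small test system whose above-diagonal entries are weak.
A = [[4.0, 0.1, 0.0],
     [1.0, 3.0, 0.1],
     [0.5, 1.0, 2.0]]
b = [4.1, 4.1, 3.5]   # chosen so that the solution is x = (1, 1, 1)
x = solve_nearly_triangular(A, b)
print(all(abs(xi - 1.0) < 1e-10 for xi in x))  # True
```

Because the remainder entries are small, each sweep reduces the error by a large factor, which is why the "usual successive approximations" converge without difficulty here.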
Great difficulties arise, however, if it so happens that the linear algebraic
system associated with the linear differential equation (5) is "badly
conditioned". Then the smallness of the right side cannot guarantee the
smallness of ui(θ). Not only is our perturbation method then in danger
of being put out of action, but the further danger exists that the solution of
the given problem (5.1) is exceedingly sensitive to very small changes of
the initial data. The solution of such problems is very difficult and we may
have to resort to the remedy of sectionalising our range and applying the
step-by-step method in combination with the global correction technique
separately in every section, thus reducing the damaging influence of
explosive error accumulations.
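The danger described here can be seen already in a tiny badly conditioned system; the 2 x 2 example below is ours, chosen only to illustrate how a very small change in the data can produce a large change in the solution:

```python
def solve2(a11, a12, a21, a22, b1, b2):
    """Cramer's rule for a 2 x 2 linear system."""
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

# A nearly singular (badly conditioned) matrix.
x1, y1 = solve2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0001)   # original data
x2, y2 = solve2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0002)   # b2 changed by 1e-4
print(abs(x1 - x2) > 0.5)  # True: a 1e-4 change in the data moves x by about 1
```

When the matrix of the correction system behaves like this, the smallness of the residual on the right side tells us nothing about the smallness of the correction, which is exactly why sectionalising the range becomes necessary.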
Numerical Example. The following numerical example characterises the
intrinsic properties of the local and global integration procedures. We
where
While the local procedure uses parabolas of fourth order with a constantly
shifting origin, the global procedure uses the same trigonometric cosine
polynomial of tenth order. It is this sameness which prevents the gradual
increase of the truncation errors. The trigonometric interpolation is
characterised by periodic rather than exponentially increasing errors.
* Tables of the Bessel Functions Y0(z) and Y1(z) for Complex Arguments
(Computation Laboratory, National Bureau of Standards; Columbia University
Press, New York, 1950).
In the following table ul denotes the successive u(x) values, obtained by
the step-by-step procedure, while ug denotes the values obtained by the
global method. The correct values u(x) (taken from the NBS tables) are
listed under u*. The function u(x) corresponds to the negative real part
of Y0(ix) of the tables (on p. 364). The same notations hold for the function
v(x), which corresponds to the negative imaginary part of Y1(ix).
APPENDIX
TABLE III: The four transition functions from the exponential to the
periodic domain and vice versa; KWB method; cf. Sections 7.23 and 7.24.